Machine learning draws on an array of mathematical tools, from linear algebra to optimisation, so numerical analysis holds a prime spot in the study of ML. Algorithms built on these mathematical concepts alone, without attention to numerical approximation, would be impractical to compute. In fact, numerical methods are at the heart of ML fields such as deep learning (DL).
Here, we will discuss a particular numerical analysis technique called the Nyström method, which has since found its way into ML. Recent research has produced variations of the Nyström method tailored to machine learning applications.
The Origins Of The Nyström Method
Originally developed by E.J. Nyström in 1930, the technique is used for eigenvalue problems arising from integral equations, a setting closely tied to linear algebra. The Nyström method produces numerical solutions by replacing the integral with a weighted sum over sample points (a quadrature rule), which yields an approximation of the integral's value.
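The core idea, replacing an integral with a weighted sum, can be sketched on a one-dimensional integral eigenproblem. The following is a minimal illustration, not historical code: the interval, the Brownian-motion kernel min(y, x), and the trapezoid quadrature rule are all choices made for this example.

```python
import numpy as np

# Classical Nyström discretization of the integral eigenproblem
#   ∫ k(y, x) φ(x) dx = λ φ(y)   on [0, 1],
# replacing the integral with a trapezoid-rule weighted sum.
n = 100
x = np.linspace(0.0, 1.0, n)                     # quadrature nodes
w = np.full(n, 1.0 / (n - 1))                    # trapezoid weights ...
w[0] = w[-1] = 0.5 / (n - 1)                     # ... halved at the endpoints

K = np.minimum.outer(x, x)                       # kernel k(y, x) = min(y, x)
A = K * w[None, :]                               # weighted sum approximating the integral
lam = np.linalg.eigvals(A).real                  # approximate eigenvalues

# This kernel's spectrum is known in closed form: λ_j = 1 / ((j - 1/2)^2 π^2),
# so the discretization can be checked against the first few true values.
lam_true = 1.0 / (((np.arange(1, 4) - 0.5) * np.pi) ** 2)
```

The discrete eigenvalues of the weighted kernel matrix converge to the true eigenvalues of the integral operator as the grid is refined, which is exactly the approximation the method trades on.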
The Nyström method was first brought into ML in 2000 by Christopher K.I. Williams and Matthias Seeger, then at the University of Edinburgh. Their aim was to speed up kernel machines, which are computation-intensive. They did this by approximating the Gram matrix of the kernel: an eigendecomposition over a subset of the data reduces the effective matrix dimensions, which in turn brings down the computational complexity. Notably, the authors focussed on computations in ML algorithms such as Support Vector Machines (SVMs) and Gaussian processes, and cut the computation drastically with little loss of accuracy in the numerical solutions obtained.
The formulation by Williams and Seeger is the usual starting point for ML studies that involve the Nyström method. The general form presented in their work is illustrated below:
“In the context of kernel machines, the covariance kernel is denoted k(x, y), and the corresponding feature space has dimension N, which may be very large. The kernel admits the eigenexpansion

k(x, y) = \sum_{i=1}^{N} \lambda_i \phi_i(x) \phi_i(y),

where N \le \infty, \lambda_1 \ge \lambda_2 \ge \dots \ge 0 are the eigenvalues and \phi_1, \phi_2, \dots are the eigenfunctions of the operator whose kernel is k, so that

\int k(y, x) \phi_i(x) p(x) \, dx = \lambda_i \phi_i(y),

where p(x) is the probability density function of the input vector x. To approximate this eigenvalue equation, the integral is replaced by an empirical average over q sample points x_1, \dots, x_q:

\frac{1}{q} \sum_{k=1}^{q} k(y, x_k) \phi_i(x_k) \simeq \lambda_i \phi_i(y).

Combining this with the eigendecomposition of the Gram matrix on the sample points yields the Nyström approximation

\phi_i(y) \simeq \frac{\sqrt{q}}{\lambda_i^{(q)}} \, \mathbf{k}_y^\top \mathbf{u}_i^{(q)},

where \mathbf{k}_y = (k(y, x_1), \dots, k(y, x_q))^\top, and \lambda_i^{(q)} and \mathbf{u}_i^{(q)} are the eigenvalues and eigenvectors of the Gram matrix.”
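The Nyström extension of eigenfunctions described above can be sketched in a few lines of NumPy. This is an illustrative example, not the authors' code: the RBF kernel, its bandwidth, the sample size q, and taking the first q points as the sample are all assumptions made here.

```python
import numpy as np

# Illustrative sketch of the Nyström extension (not the authors' code).
# Kernel choice (RBF), gamma, and sample size q are assumptions.
def rbf_kernel(A, B, gamma=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))        # full data set
q = 20
samples = X[:q]                      # x_1, ..., x_q drawn from p(x)

# Eigendecomposition of the q x q Gram matrix on the samples
K_qq = rbf_kernel(samples, samples)
lam, U = np.linalg.eigh(K_qq)
lam, U = lam[::-1], U[:, ::-1]       # sort eigenvalues in descending order

# Nyström extension: phi_i(y) ≈ (sqrt(q) / lam_i) * k_y · u_i,
# evaluated for the top r components at every point y in X
r = 10
K_yq = rbf_kernel(X, samples)        # each row is the vector k_y
phi = np.sqrt(q) * (K_yq @ U[:, :r]) / lam[:r]
```

A useful sanity check is that at the sample points themselves the extension reproduces the Gram-matrix eigenvectors (scaled by √q), since there the empirical average is exact.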
A pertinent question is what enticed researchers to pursue the Nyström method for ML. Most ML studies first present the Nyström method in general and then single out a specific factor to be optimised. They cite problems such as the need for efficient numerical approximations and the growth of matrix dimensions in ML algorithms, and concentrate on applying the technique there. Clearly, these numerical aspects were the key problem statement that led researchers to the Nyström method.
The Involvement And Progress Of The Nyström Method In ML Research
The Nyström method has gradually evolved from an alternative technique in numerical analysis into a tool for popular ML areas such as image segmentation, manifold learning, pattern analysis and intelligent machines. ML research in recent years has also produced newer and better variants such as ensemble Nyström, modified Nyström and SS-Nyström, which improve on the standard Nyström method's approximation quality.
In fact, the Nyström method is highly suitable for large-scale ML. It makes kernel methods scalable, and the large linear systems that arise in SVMs can also be solved with it. In addition, methods such as spectral clustering, manifold learning and principal component analysis (PCA), which often require an eigenvalue decomposition, benefit greatly in terms of computation.
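The practical payoff for large-scale kernel methods is a low-rank approximation of the full Gram matrix, K ≈ C W⁺ Cᵀ, built from a small subset of sampled columns so the full n × n matrix never has to be factorized. A minimal sketch follows, assuming an RBF kernel and uniform sampling of m landmark columns (both illustrative choices; the full K is formed here only to measure the error):

```python
import numpy as np

# Sketch of the Nyström low-rank approximation K ≈ C W⁺ Cᵀ.
# RBF kernel and uniform landmark sampling are illustrative choices.
def rbf_kernel(A, B, gamma=0.1):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
K = rbf_kernel(X, X)                       # full 500x500 Gram matrix (for error check only)

m = 50                                     # m << n landmarks cut the cost from O(n^2) kernel
idx = rng.choice(len(X), size=m, replace=False)   # evaluations to O(n*m)
C = K[:, idx]                              # n x m block of sampled columns
W = K[np.ix_(idx, idx)]                    # m x m intersection block

K_approx = C @ np.linalg.pinv(W) @ C.T     # Nyström approximation of K

rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
```

By construction the approximation is exact on the sampled block (W W⁺ W = W), and for kernels with fast-decaying spectra the relative error over the whole matrix stays small even with m far below n.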
Recently, sampling methods have also been combined with the Nyström technique, opening an avenue for better sampling algorithms in practical software systems. Altogether, the Nyström method has progressed vastly in terms of both research and implementation.
The Nyström technique has proved beneficial in machine learning, and its variations and combinations with other methods have shown better results. However, these gains apply only in specific cases, so diligent care should be taken before adopting the variations. What seems theoretically feasible must also be checked for practical achievability in machine learning.