In machine learning classification problems, the final classification is based on a number of factors. These factors are variables known as features. The more features there are, the harder it becomes to visualise the training set and work with it. Often, many of these features are correlated and therefore redundant. This is where dimensionality reduction algorithms come in. Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection and feature extraction.
Feature Selection
In this process, we try to find a subset of the original set of variables, or features, so that a smaller subset can be used to model the problem.
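As a rough illustration of feature selection, the sketch below keeps a subset of the original columns by dropping near-constant features and features that are almost perfectly correlated with one already kept. The helper name `select_features`, the two thresholds, and the synthetic data are assumptions made for this example; this is one simple filter among many possible selection strategies.

```python
import numpy as np

def select_features(X, var_threshold=1e-8, corr_threshold=0.95):
    """Return the indices of the columns to keep (hypothetical helper).

    Drops near-constant columns, then greedily drops any column whose
    absolute correlation with an already-kept column exceeds corr_threshold.
    """
    X = np.asarray(X, dtype=float)
    variances = X.var(axis=0)
    # Columns with (almost) no variance carry no information.
    candidates = [j for j in range(X.shape[1]) if variances[j] > var_threshold]
    kept = []
    for j in candidates:
        redundant = any(
            abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) > corr_threshold
            for k in kept
        )
        if not redundant:
            kept.append(j)
    return kept

# Synthetic data: column 1 is nearly a copy of column 0, column 3 is constant.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=200), b, np.ones(200)])

selected = select_features(X)
X_reduced = X[:, selected]
print(selected, X_reduced.shape)
```

The selected columns are original features, so they keep their meaning; nothing is transformed or combined.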
Feature Extraction
In this process, data in a high-dimensional space is transformed into a lower-dimensional space.
Methods for Dimensionality Reduction
Dimensionality reduction, that is, turning data with a large number of dimensions into data with fewer dimensions that still conveys the information concisely, can be achieved with various methods.
Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a dimensionality reduction technique that can be used to reduce a large set of variables to a small set that still contains most of the information in the large set. In this procedure, a number of correlated variables are transformed into a (smaller) number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.
A principal component analysis can be considered as a rotation of the axes of the original variable coordinate system to new orthogonal axes, called principal axes, such that the new axes coincide with the directions of maximum variation of the original observations.
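The rotation view of PCA can be sketched with NumPy alone: centre the data, take the eigenvectors of the covariance matrix as the new axes, and project onto the leading ones. The function name `pca` and the synthetic three-column data set are assumptions made for this illustration, not part of any particular library.

```python
import numpy as np

def pca(X, n_components):
    """Minimal PCA sketch via eigendecomposition of the covariance matrix."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    components = eigvecs[:, :n_components]     # principal axes (orthonormal columns)
    scores = X_centered @ components           # data expressed in the new axes
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    return scores, components, explained_ratio

# Synthetic data: the second column is strongly correlated with the first.
rng = np.random.default_rng(0)
t = rng.normal(size=300)
X = np.column_stack([
    t,
    0.5 * t + 0.1 * rng.normal(size=300),
    0.1 * rng.normal(size=300),
])

scores, components, explained_ratio = pca(X, n_components=2)
print(explained_ratio)
```

In this setup the projected columns in `scores` are uncorrelated with each other, and the first component carries most of the variance because two of the three input columns move together.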
Linear Discriminant Analysis (LDA)
Linear Discriminant Analysis (LDA) is a technique used for supervised classification problems.
It is also used as a dimensionality reduction technique in the preprocessing step of machine learning and pattern classification applications.
LDA takes class labels into consideration. This form of dimensionality reduction is used in biometrics, chemistry and many other fields. The primary goal of LDA is to project features from a higher-dimensional space onto a lower-dimensional space.
The process starts by calculating the separability between the classes, termed the between-class variance. Next, we determine the distance between the mean and the samples of each class, termed the within-class variance. Finally, a lower-dimensional space is constructed that maximises the between-class variance and minimises the within-class variance.
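The three steps above can be sketched in NumPy as follows. This is a minimal version of Fisher's formulation, assuming the projection is taken from the leading eigenvectors of the inverse within-class scatter times the between-class scatter; the function name `lda_fit` and the two-class synthetic data are made up for the example.

```python
import numpy as np

def lda_fit(X, y, n_components):
    """Return a projection matrix W (n_features x n_components)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    n_features = X.shape[1]
    S_w = np.zeros((n_features, n_features))   # within-class scatter
    S_b = np.zeros((n_features, n_features))   # between-class scatter
    for label in np.unique(y):
        X_c = X[y == label]
        mean_c = X_c.mean(axis=0)
        centered = X_c - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_b += X_c.shape[0] * (diff @ diff.T)
    # Directions that maximise between-class relative to within-class scatter.
    # The pseudo-inverse is used in case S_w is singular.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs[:, order[:n_components]].real

# Two classes separated along the first axis, with large spread on the second.
rng = np.random.default_rng(0)
class0 = rng.normal(loc=[0.0, 0.0], scale=[1.0, 5.0], size=(100, 2))
class1 = rng.normal(loc=[4.0, 0.0], scale=[1.0, 5.0], size=(100, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 100 + [1] * 100)

W = lda_fit(X, y, n_components=1)
projected = X @ W
gap = abs(projected[y == 0].mean() - projected[y == 1].mean())
spread = projected[y == 0].std()
print(W.ravel(), gap, spread)
```

Here the second axis has the largest overall variance, so an unsupervised method such as PCA would favour it. Because LDA uses the labels, the direction it finds is dominated by the first axis, where the classes actually differ.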
Generalised Discriminant Analysis (GDA)
The GDA technique applies the methods of the general linear model to the discriminant function analysis problem. In GDA, the discriminant function analysis problem is recast as a general multivariate linear model, where the dependent variables are coded vectors that indicate the group membership of each case. The remainder of the analysis is then carried out as described in the context of General Regression Models (GRM), with a few additional features:
- Specifying designs for predictor variables and predictor effects.
- Stepwise and optimal-subset analyses.
- Profiling of posterior classification probabilities.
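The idea of coding group membership as vectors and fitting a multivariate linear model can be illustrated with a small NumPy sketch. It covers only the coding and least-squares step, not the stepwise analyses or probability profiling listed above, and the function names, the argmax prediction rule, and the synthetic three-class data are assumptions made for this example.

```python
import numpy as np

def fit_indicator_regression(X, y):
    """Fit a multivariate linear model to one-hot coded class membership."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    # Each row of Y is a coded vector: 1 for the case's own group, 0 elsewhere.
    Y = (y[:, None] == classes[None, :]).astype(float)
    X_design = np.column_stack([np.ones(X.shape[0]), X])  # add intercept
    B, *_ = np.linalg.lstsq(X_design, Y, rcond=None)
    return classes, B

def predict(X, classes, B):
    """Assign each case to the group with the largest fitted value."""
    X = np.asarray(X, dtype=float)
    X_design = np.column_stack([np.ones(X.shape[0]), X])
    return classes[np.argmax(X_design @ B, axis=1)]

# Three well-separated groups arranged in a triangle.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [6.0, 0.0], [3.0, 6.0]])
X = np.vstack([c + 0.5 * rng.normal(size=(50, 2)) for c in centres])
y = np.repeat(np.array([0, 1, 2]), 50)

classes, B = fit_indicator_regression(X, y)
accuracy = np.mean(predict(X, classes, B) == y)
print(B.shape, accuracy)
```

The coefficient matrix `B` has one column per group, which is what makes the model multivariate: all coded membership vectors are regressed on the predictors at once.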
Advantages Of Dimensionality Reduction
Dimensionality reduction has a host of advantages from a machine learning point of view:
- Since the model has fewer degrees of freedom, the likelihood of overfitting is lower. The model will generalise more easily to new data.
- If the user applies feature selection or a linear transformation (such as PCA), the reduction will promote the most relevant variables, which will improve the interpretability of the model.
- Most feature extraction procedures are unsupervised. The user can train an autoencoder or fit a PCA on unlabelled data. This can be really helpful when the user has a lot of unlabelled data and labelling is time-consuming and expensive.
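The last point can be sketched as follows: the projection is learned from a large unlabelled set and then reused on a small labelled set. The data here are synthetic (a 2-dimensional signal embedded in 10 dimensions with a little noise), and all variable names are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a 2-D latent signal mixed into 10 observed features.
mixing = rng.normal(size=(2, 10))
unlabeled = rng.normal(size=(1000, 2)) @ mixing + 0.05 * rng.normal(size=(1000, 10))
labeled = rng.normal(size=(20, 2)) @ mixing + 0.05 * rng.normal(size=(20, 10))

# Fit PCA on the unlabelled data only (via SVD of the centred matrix).
mean = unlabeled.mean(axis=0)
_, _, Vt = np.linalg.svd(unlabeled - mean, full_matrices=False)
components = Vt[:2].T                      # shape (10, 2)

# Reuse the learned projection on the small labelled set.
labeled_reduced = (labeled - mean) @ components

# Check how much information the 2-D representation keeps.
reconstructed = labeled_reduced @ components.T + mean
relative_error = np.linalg.norm(labeled - reconstructed) / np.linalg.norm(labeled)
print(labeled_reduced.shape, relative_error)
```

No labels are used to learn `components`, so the 20 labelled cases are only needed for whatever model is trained on the reduced features afterwards.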