# What is Dimensionality Reduction – Techniques, Methods, Components

## 1. Dimensionality Reduction – Objective

In this **Machine Learning Tutorial**, we will study what dimensionality reduction is. We will also cover the related aspects: the components and methods of dimensionality reduction, Principal Component Analysis, the importance of dimensionality reduction, feature selection, and its advantages and disadvantages.

So, let’s start the Dimensionality Reduction tutorial.

## 2. What is Dimensionality Reduction?

In **machine learning** classification problems, there are often too many factors on which the final classification is based. These factors are variables, known as features. The higher the number of features, the harder it gets to visualize the training set and then work on it. Sometimes, most of these features are correlated, and hence redundant. This is where dimensionality reduction algorithms come into play.

## 3. Motivation

- When we deal with real problems and real data, we often face high-dimensional data whose number of dimensions can run into the millions.

- In its original high-dimensional structure, the data represents itself fully, but sometimes we need to reduce its dimensionality.

- Reducing dimensionality is often associated with visualization, although that is not always the reason for it.

## 4. Components of Dimensionality Reduction

### a. Feature selection

**Filter**

**Wrapper**

**Embedded**

### b. Feature Extraction

## 5. Dimensionality Reduction Methods

- Principal Component Analysis (PCA)

- Linear Discriminant Analysis (LDA)

- Generalized Discriminant Analysis (GDA)

## 6. Principal Component Analysis

- Construct the covariance matrix of the data.

- Compute the eigenvectors of this matrix.
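
The two steps above can be sketched with NumPy. This is a minimal illustration, not a definitive implementation: the function name `pca_project`, the synthetic data, and the final projection step (sorting the eigenvectors by eigenvalue and projecting the centered data onto the top ones) are assumptions added here and are not part of the list above.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto the top principal components (sketch)."""
    # Center each feature so the covariance is measured around the mean.
    X_centered = X - X.mean(axis=0)
    # Step 1: construct the covariance matrix (features x features).
    cov = np.cov(X_centered, rowvar=False)
    # Step 2: compute its eigenvectors. eigh is meant for symmetric
    # matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Assumed extra step: sort by decreasing eigenvalue and project.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]
    return X_centered @ components, eigvals

# Hypothetical data: the second column is almost a multiple of the first.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([
    x,
    2 * x + rng.normal(scale=0.1, size=200),
    rng.normal(size=200),
])
Z, eigvals = pca_project(X, n_components=2)
print(Z.shape)   # three features reduced to two
```

Each eigenvalue equals the variance of the data along its eigenvector, which is why the eigenvectors with the largest eigenvalues are the ones kept.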

## 7. Importance of Dimensionality Reduction

Why is dimensionality reduction important in machine learning predictive modeling?

## 8. What are Dimensionality Reduction Techniques?

**classification** or regression task.

## 9. Common Methods to Perform Dimensionality Reduction

### a. Missing Values

### b. Low Variance

### c. Decision Trees

### d. Random Forest

### e. High Correlation

### f. Backward Feature Elimination

### g. Factor Analysis

- EFA (Exploratory Factor Analysis)

- CFA (Confirmatory Factor Analysis)

### h. Principal Component Analysis (PCA)

## 10. Reduce the Number of Dimensions

- Dimensionality reduction has several advantages from a machine learning point of view.
- Since your model has fewer degrees of freedom, the likelihood of overfitting is lower. The model will generalize more easily to new data.
- If we are using feature selection, the reduction will promote the important variables, which also helps improve the interpretability of your model.
- Most feature extraction techniques are unsupervised. You can train your autoencoder or fit your PCA on unlabeled data. This can be helpful if you have a lot of unlabeled data and labeling is time-consuming and expensive.

## 11. Features Selection in Reduction

An important way to reduce dimensionality is to remove some dimensions and select the variables most suitable for the problem.

Here are some ways to select variables:

- Greedy algorithms, which add and remove variables until some criterion is met.
- Shrinking and penalization methods, which add a cost for having too many variables. For instance, L1 regularization shrinks some variables’ coefficients to exactly zero. Regularization limits the space in which the coefficients can live.
- Selection of models based on criteria that take the number of dimensions into account, such as the adjusted R², AIC or BIC. Contrary to regularization, the model is not trained to optimize these criteria.
- Filtering of variables using correlation, VIF or some “distance measure” between the features.
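
As an illustration of the last option, filtering by correlation, here is a minimal sketch of a greedy filter that keeps a column only if it is not strongly correlated with any column already kept. The function name `drop_correlated`, the 0.9 threshold, and the synthetic data are hypothetical choices made for this example.

```python
import numpy as np

def drop_correlated(X, threshold=0.9):
    """Return the indices of the columns kept by a greedy correlation filter."""
    # Absolute Pearson correlation between every pair of columns.
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        # Keep column j only if it is not strongly correlated
        # with any column that has already been kept.
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

# Hypothetical data: the same length stored in two units, plus noise.
rng = np.random.default_rng(1)
meters = rng.uniform(1.0, 100.0, size=50)
inches = meters * 39.37            # same quantity, different unit
other = rng.normal(size=50)
X = np.column_stack([meters, inches, other])

kept = drop_correlated(X)
print(kept)   # indices of the columns that survive the filter
```

The result depends on column order, because the first column of a correlated pair is the one that survives. A real pipeline might instead keep the column that is more strongly related to the target.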

## 12. Advantages of Dimensionality Reduction

- Dimensionality reduction compresses the data and so reduces the storage space required.
- It reduces computation time. Fewer dimensions mean less computing, and can also allow the use of algorithms that are unfit for a large number of dimensions.
- It removes redundant features and takes care of multicollinearity, which can improve model performance. For example, there is no point in storing a value in two different units (meters and inches).
- Reducing the data to 2D or 3D may allow us to plot it and observe patterns more clearly. For example, 3D data can be converted into 2D by first identifying a 2D plane and then representing the points on two new axes, z1 and z2.
- It also helps remove noise, which can improve the performance of models.

## 13. Disadvantages of Dimensionality Reduction

- It may lead to some amount of data loss.

- PCA tends to find only linear correlations between variables, which is sometimes undesirable.

- PCA fails in cases where mean and covariance are not enough to define datasets.

- We may not know how many principal components to keep. In practice, some rules of thumb are applied.
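
One common rule of thumb for the last point is to keep the smallest number of components whose cumulative share of the variance reaches a chosen target. The sketch below assumes a 95% target, which is an illustrative choice rather than a recommendation from this tutorial. The function name `components_for_variance` and the synthetic data are also hypothetical.

```python
import numpy as np

def components_for_variance(X, target=0.95):
    """Smallest number of principal components reaching the target
    share of total variance, plus the cumulative shares (sketch)."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigvalsh returns ascending eigenvalues, so reverse to descending.
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # Index of the first cumulative share that reaches the target.
    k = int(np.searchsorted(ratio, target)) + 1
    return min(k, len(eigvals)), ratio

# Hypothetical data: two high-variance features and one near-constant one.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3)) * np.array([2.0, 2.0, 0.1])

k, ratio = components_for_variance(X)
print(k, ratio)
```

The target is a trade-off: a higher value keeps more information but also more dimensions.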

So, this was all about the Dimensionality Reduction tutorial. We hope you like our explanation.

## 14. Conclusion

We have studied dimensionality reduction in **machine learning**: motivation, components, methods, Principal Component Analysis, importance, techniques, feature selection, reducing the number of dimensions, and the advantages and disadvantages of dimensionality reduction. Dimensionality reduction is a hot topic in machine learning nowadays. Furthermore, if you have any query, feel free to ask in the comment section.
