
DIMENSIONALITY REDUCTION
INTRODUCTION TO DIMENSIONALITY REDUCTION
 The number of input features, variables, or columns present in a given dataset is known as its dimensionality, and the process of reducing these features is called dimensionality reduction.
 In many cases a dataset contains a huge number of input features, which makes the predictive modeling task more complicated. It is very difficult to visualize or make predictions for a training dataset with a high number of features, and for such cases dimensionality reduction techniques are required.
 A dimensionality reduction technique can be defined as "a way of converting a higher-dimensional dataset into a lower-dimensional dataset while ensuring that it provides similar information." These techniques are widely used in machine learning for obtaining a better-fit predictive model while solving classification and regression problems.
 It is commonly used in fields that deal with high-dimensional
data, such as speech recognition, signal processing,
bioinformatics, etc. It can also be used for data visualization,
noise reduction, cluster analysis, etc.
Common techniques of Dimensionality Reduction
a. Principal Component Analysis
b. Backward Elimination
c. Forward Selection
d. Score comparison
e. Missing Value Ratio
f. Low Variance Filter
g. High Correlation Filter
h. Random Forest
i. Factor Analysis
j. Auto-Encoder
Principal Component Analysis (PCA)
Principal Component Analysis is a statistical process that converts the
observations of correlated features into a set of linearly uncorrelated features
with the help of orthogonal transformation. These new transformed features are
called the Principal Components. It is one of the popular tools that is used for
exploratory data analysis and predictive modeling.
PCA works by considering the variance of each attribute, because an attribute with high variance indicates a good split between the classes, and hence PCA reduces the dimensionality while retaining the directions of highest variance. Some real-world applications of PCA are image processing, movie recommendation systems, and optimizing the power allocation in various communication channels.
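To make this concrete, below is a minimal scikit-learn sketch of PCA. The Iris dataset, the standardization step, and the choice of two components are assumptions made only for illustration.

# Minimal PCA sketch with scikit-learn (dataset and component count are illustrative).
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scale

pca = PCA(n_components=2)                      # keep the first 2 principal components
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                         # (150, 2)
print(pca.explained_variance_ratio_)           # variance captured by each component

The explained_variance_ratio_ attribute shows how much of the original variance each principal component retains, which is the usual basis for deciding how many components to keep.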
Forward Feature Selection
Forward feature selection follows the inverse process of backward elimination. In this technique we don't eliminate features; instead, we look for the best features that produce the highest increase in the performance of the model. The following steps are performed in this technique:
o We start with a single feature only, and progressively add one feature at a time.
o The model is trained on each candidate feature separately.
o The feature with the best performance is selected.
o The process is repeated until adding another feature no longer gives a significant increase in the performance of the model.
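The loop above can be sketched with scikit-learn's SequentialFeatureSelector in forward mode; the logistic-regression estimator, the Iris data, and the stopping point of two features are assumptions made only for illustration.

# Forward feature selection sketch using SequentialFeatureSelector.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=2,      # stop once 2 features have been added
    direction="forward",         # start empty and add the best feature each round
    cv=5,                        # performance is judged by 5-fold cross-validation
)
selector.fit(X, y)
print(selector.get_support())    # boolean mask of the selected features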
BACKWARD FEATURE ELIMINATION

The backward feature elimination technique is mainly used while developing Linear Regression or Logistic Regression models. The following steps are performed in this technique to reduce the dimensionality or to select features:
• First, all the n variables of the given dataset are used to train the model.
• The performance of the model is checked.
• Then we remove one feature at a time and train the model on the remaining n-1 features, doing this n times, computing the performance of the model each time.
• We find the variable whose removal causes the smallest (or no) change in the performance of the model and drop that variable; after that, we are left with n-1 features.
• Repeat the complete process until no more features can be dropped.
In this technique, by selecting the optimum performance of the model and the maximum tolerable error rate, we can define the optimal number of features required for the machine learning algorithm.
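A hedged sketch of the same greedy idea using SequentialFeatureSelector with direction="backward"; the estimator, the Wine dataset, and the target of five remaining features are illustrative assumptions rather than part of the recipe described above.

# Backward feature elimination sketch: start from all features and drop one per round.
from sklearn.datasets import load_wine
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_wine(return_X_y=True)
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=5000),
    n_features_to_select=5,      # keep dropping until 5 features remain
    direction="backward",        # drop the feature whose removal hurts the score least
    cv=5,
)
selector.fit(X, y)
print(X.shape[1], "->", selector.get_support().sum())   # 13 -> 5 features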
Missing Value Ratio
If a dataset has variables with too many missing values, we drop those variables, as they do not carry much useful information. To perform this, we can set a threshold level, and if a variable has a missing-value ratio higher than that threshold, we drop the variable. The lower the threshold, the more variables are dropped and the more aggressive the reduction.
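A small pandas sketch of the missing value ratio filter; the toy DataFrame and the 60% threshold are assumptions chosen only for illustration.

# Drop every column whose fraction of missing values exceeds the threshold.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, np.nan, 4],
    "b": [np.nan, np.nan, np.nan, 1],   # 75% missing
    "c": [5, 6, 7, 8],                  # 0% missing
})

threshold = 0.6                          # maximum tolerated missing-value ratio
missing_ratio = df.isnull().mean()       # fraction of missing values per column
df_reduced = df.loc[:, missing_ratio <= threshold]
print(df_reduced.columns.tolist())       # ['a', 'c']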
High Correlation Filter
High correlation refers to the case when two variables carry approximately the same information. Keeping both can degrade the performance of the model. The correlation between independent numerical variables is measured by the correlation coefficient, and if this value is higher than a threshold value, we can remove one of the variables from the dataset. When deciding which variable of the pair to keep, we can prefer the one that shows a higher correlation with the target variable.
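A sketch of a high correlation filter in pandas; the synthetic data and the 0.9 threshold are assumptions for illustration.

# Drop one variable from every pair whose absolute correlation exceeds the threshold.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
df = pd.DataFrame({
    "x1": x1,
    "x2": 0.95 * x1 + rng.normal(scale=0.1, size=200),  # nearly duplicates x1
    "x3": rng.normal(size=200),
})

corr = df.corr().abs()
# Look only at the upper triangle so each pair is considered once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
df_reduced = df.drop(columns=to_drop)
print(to_drop)                           # ['x2'] for this synthetic data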
Random Forest
Random Forest is a popular and very useful feature selection algorithm in machine learning. The algorithm provides built-in feature importance scores, so we do not need to program them separately. In this technique, we generate a large set of trees against the target variable and use the importance (usage) statistics of each attribute to find a subset of features.
The random forest algorithm takes only numerical variables, so categorical input data has to be converted into numeric data, for example using one-hot encoding.
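A sketch of feature selection via random forest importances; the dataset and the use of the mean importance as a cutoff (SelectFromModel's default) are illustrative assumptions.

# Train a forest, read its built-in importance scores, and keep the strongest features.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

print(forest.feature_importances_[:5])   # built-in importance score per feature

selector = SelectFromModel(forest, prefit=True)   # default threshold: mean importance
X_reduced = selector.transform(X)
print(X.shape, "->", X_reduced.shape)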
Factor Analysis
Factor analysis is a technique in which variables are grouped according to their correlations with other variables: variables within a group can have a high correlation among themselves, but they have a low correlation with variables of other groups. Each group is assumed to be driven by an underlying factor.
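A minimal sketch of factor analysis using scikit-learn's FactorAnalysis; the Iris data and the choice of two factors are assumptions for illustration.

# Fit a two-factor model and inspect the loadings of each variable on the factors.
from sklearn.datasets import load_iris
from sklearn.decomposition import FactorAnalysis

X, _ = load_iris(return_X_y=True)
fa = FactorAnalysis(n_components=2, random_state=0)
X_factors = fa.fit_transform(X)

print(X_factors.shape)          # (150, 2): each sample described by 2 factors
print(fa.components_.shape)     # (2, 4): loading of each original variable on each factor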
Auto-encoders
One of the popular methods of dimensionality reduction is the auto-encoder, a type of artificial neural network (ANN) whose main aim is to copy its inputs to its outputs. The input is compressed into a latent-space representation, and the output is reconstructed from this representation. It has two main parts:
o Encoder: the function of the encoder is to compress the input to form the latent-space representation.
o Decoder: the function of the decoder is to recreate the output from the latent-space representation.
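A hedged sketch of a simple auto-encoder in Keras; the layer sizes, latent dimension, random placeholder data, and training settings are all assumptions made only to show the encoder/decoder structure.

# A dense auto-encoder: the encoder compresses to a 3-dimensional latent space,
# and the decoder reconstructs the 30-dimensional input from it.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

input_dim, latent_dim = 30, 3

inputs = keras.Input(shape=(input_dim,))
encoded = layers.Dense(16, activation="relu")(inputs)          # encoder
latent = layers.Dense(latent_dim, activation="relu")(encoded)  # latent-space representation
decoded = layers.Dense(16, activation="relu")(latent)          # decoder
outputs = layers.Dense(input_dim, activation="linear")(decoded)

autoencoder = keras.Model(inputs, outputs)
encoder = keras.Model(inputs, latent)        # reused later for dimensionality reduction
autoencoder.compile(optimizer="adam", loss="mse")

X = np.random.rand(500, input_dim).astype("float32")       # placeholder data
autoencoder.fit(X, X, epochs=5, batch_size=32, verbose=0)   # learn to copy inputs to outputs

X_reduced = encoder.predict(X, verbose=0)
print(X_reduced.shape)                       # (500, 3)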
KEY ASPECTS OF DIMENSIONALITY REDUCTION
1) The Curse of Dimensionality
2) Main Approaches for Dimensionality Reduction
3) PCA (Principal Component Analysis)
4) Using Scikit-Learn
5) Randomized PCA
6) Kernel PCA.
THE CURSE OF DIMENSIONALITY
• Handling high-dimensional data is very difficult in practice; this difficulty is commonly known as the curse of dimensionality.
• If the dimensionality of the input dataset increases, any machine learning algorithm and model becomes more complex.
• As the number of features increases, the number of samples needed to cover the feature space grows rapidly, and the chance of overfitting also increases.
• A machine learning model trained on high-dimensional data therefore tends to become overfitted and to perform poorly.
• Hence, it is often required to reduce the number of features, which can be done with dimensionality reduction.
2. Approaches of Dimensionality Reduction
There are two ways to apply the dimension reduction technique,
which are given below:
 Feature Selection.
 Feature Extraction.

Feature Selection
Feature selection is the process of selecting the subset of the relevant features
and leaving out the irrelevant features present in a dataset to build a model of
high accuracy. In other words, it is a way of selecting the optimal features from
the input dataset.
Three methods are used for feature selection:
1. Filter Methods
In this method, the dataset is filtered, and a subset that contains only the relevant features is taken. Some common techniques of the filter method are:
o Correlation
o Chi-Square Test
o ANOVA
o Information Gain, etc.
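As an example of a filter method, here is a sketch that keeps the two features with the highest chi-square score; the Iris dataset and k=2 are assumptions for illustration (chi2 requires non-negative features).

# Filter-style selection: rank features by a chi-square test against the target.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)
selector = SelectKBest(score_func=chi2, k=2)
X_new = selector.fit_transform(X, y)

print(selector.scores_)                  # chi-square score of each feature
print(X.shape, "->", X_new.shape)        # (150, 4) -> (150, 2)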
2. Wrapper Methods
The wrapper method has the same goal as the filter method, but it uses a machine-learning model for its evaluation. In this method, some features are fed to the ML model and the performance is evaluated.
The performance decides whether to add those features or remove them in order to increase the accuracy of the model. This method is more accurate than the filter method but more computationally expensive. Some common techniques of wrapper methods are:
o Forward Selection
o Backward Selection
o Bi-directional Elimination
3. Embedded Methods: Embedded methods check the different
training iterations of the machine learning model and evaluate
the importance of each feature. Some common techniques of
Embedded methods are:
o LASSO
o Elastic Net
o Ridge Regression, etc.
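As an example of an embedded method, the sketch below uses LASSO, which drives some coefficients to exactly zero during training, and keeps only the features with non-zero weights; the diabetes dataset and the alpha value are assumptions for illustration.

# Embedded selection: features whose LASSO coefficient is zero are discarded.
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = load_diabetes(return_X_y=True)
lasso = Lasso(alpha=0.5).fit(X, y)

selector = SelectFromModel(lasso, prefit=True)   # keeps features with non-zero coefficients
X_reduced = selector.transform(X)
print(X.shape, "->", X_reduced.shape)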
Feature Extraction:
Feature extraction is the process of transforming the space
containing many dimensions into space with fewer dimensions.
This approach is useful when we want to keep the whole
information but use fewer resources while processing the
information.
Some common feature extraction techniques are:
a. Principal Component Analysis
b. Linear Discriminant Analysis
c. Kernel PCA
d. Quadratic Discriminant Analysis

3. PRINCIPAL COMPONENT ANALYSIS (PCA)


Principal Component Analysis Solved Example

Principal component analysis (PCA) is a statistical procedure that uses an


orthogonal transformation to convert a set of observations of possibly
correlated variables into a set of values of linearly uncorrelated variables
called principal components.

In this article, I will discuss how to find the principal components with a
simple solved numerical example.

Problem definition: Given data in the Table, reduce the dimension from 2 to
1 using the Principal Component Analysis (PCA) algorithm.
Feature     Example 1   Example 2   Example 3   Example 4
X1          4           8           13          7
X2          11          4           5           14
Step 1: Calculate the mean of X1 and X2 as shown below.
Mean of X1 = (4 + 8 + 13 + 7) / 4 = 8
Mean of X2 = (11 + 4 + 5 + 14) / 4 = 8.5

Step 2: Calculation of the covariance matrix.
The covariances are calculated as follows (dividing by N - 1 = 3):
Var(X1)     = ((4-8)^2 + (8-8)^2 + (13-8)^2 + (7-8)^2) / 3 = 42 / 3 = 14
Var(X2)     = ((11-8.5)^2 + (4-8.5)^2 + (5-8.5)^2 + (14-8.5)^2) / 3 = 69 / 3 = 23
Cov(X1, X2) = ((4-8)(11-8.5) + (8-8)(4-8.5) + (13-8)(5-8.5) + (7-8)(14-8.5)) / 3 = -33 / 3 = -11

The covariance matrix is,
S = |  14  -11 |
    | -11   23 |
Step 3: Eigenvalues of the covariance matrix
The characteristic equation of the covariance matrix is,
det(S - λI) = (14 - λ)(23 - λ) - (-11)(-11) = λ^2 - 37λ + 201 = 0
Solving the characteristic equation we get,
λ1 = (37 + √565) / 2 ≈ 30.3849
λ2 = (37 - √565) / 2 ≈ 6.6151
Step 4: Computation of the eigenvectors
To find the first principal component, we need only compute the eigenvector corresponding to the largest eigenvalue. In the present example, the largest eigenvalue is λ1, and so we compute the eigenvector corresponding to λ1.
The eigenvector corresponding to λ = λ1 is a vector U = (u1, u2)^T satisfying the following equation:
(S - λ1 I) U = 0
This is equivalent to the following two equations:
(14 - λ1) u1 - 11 u2 = 0
-11 u1 + (23 - λ1) u2 = 0
Using the theory of systems of linear equations, we note that these equations are not independent and solutions are given by,
u1 / 11 = u2 / (14 - λ1) = t
that is,
u1 = 11t,  u2 = (14 - λ1) t
where t is any real number.
Taking t = 1, we get an eigenvector corresponding to λ1 as
U1 = (11, 14 - λ1)^T ≈ (11, -16.3849)^T
To find a unit eigenvector, we compute the length of U1, which is given by,
||U1|| = √(11^2 + (14 - λ1)^2) ≈ √(121 + 268.47) ≈ 19.7348
Therefore, a unit eigenvector corresponding to λ1 is
e1 = U1 / ||U1|| ≈ (0.5574, -0.8303)^T
By carrying out similar computations, the unit eigenvector e2 corresponding to the eigenvalue λ = λ2 can be shown to be,
e2 ≈ (0.8303, 0.5574)^T

Step 5: Computation of the first principal components
Let Xk = (x1k, x2k)^T be the kth sample in the above table (dataset), and let X̄ = (8, 8.5)^T be the mean vector. The first principal component of this sample is given by (here "T" denotes the transpose of the matrix)
e1^T (Xk - X̄) = 0.5574 (x1k - 8) - 0.8303 (x2k - 8.5)
For example, the first principal component corresponding to the first example is calculated as follows:
0.5574 (4 - 8) - 0.8303 (11 - 8.5) ≈ -4.3052
The results of the calculations are summarised in the table below.

X1                           4          8          13         7
X2                           11         4          5          14
First Principal Component    -4.3052    3.7361     5.6928     -5.1238

Step 6: Geometrical meaning of the first principal components
First, we shift the origin to the "center" (the mean point (8, 8.5)) and then change the directions of the coordinate axes to the directions of the eigenvectors e1 and e2.
(Figure: The coordinate system for principal components)
Next, we drop perpendiculars from the given data points to the e1-axis.
(Figure: Projections of data points on the axis of the first principal component)
The first principal components are the e1-coordinates of the feet of the perpendiculars, that is, the projections on the e1-axis. The projections of the data points on the e1-axis may be taken as approximations of the given data points, and hence we may replace the given data set with these points.
Now, each of these approximations can be unambiguously specified by a single number, namely, the e1-coordinate of the approximation. Thus the two-dimensional data set can be represented approximately by the following one-dimensional data set:
-4.3052, 3.7361, 5.6928, -5.1238
(Figure: Geometrical representation of the one-dimensional approximation to the data set)
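The hand computation above can be checked with a short NumPy sketch; it reproduces the mean, the covariance matrix, the eigenvalues, and the first principal components of the worked example.

# NumPy check of the worked PCA example (rows are the four examples).
import numpy as np

X = np.array([[4, 11], [8, 4], [13, 5], [7, 14]], dtype=float)

mean = X.mean(axis=0)                  # [8.  8.5]
S = np.cov(X, rowvar=False)            # [[ 14. -11.] [-11.  23.]]

eigvals, eigvecs = np.linalg.eigh(S)   # eigh returns eigenvalues in ascending order
e1 = eigvecs[:, -1]                    # eigenvector of the largest eigenvalue
if e1[0] < 0:                          # fix the sign to match the hand computation
    e1 = -e1

pc1 = (X - mean) @ e1                  # first principal component of each example
print(np.round(eigvals[::-1], 4))      # [30.3849  6.6151]
print(np.round(pc1, 4))                # [-4.3052  3.7361  5.6928 -5.1238]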
TYPES OF PRINCIPAL COMPONENT ANALYSIS
THERE ARE FOUR METHODS TO IMPLEMENT PCA; two of them, Randomized PCA and Kernel PCA, are described below.
Randomized PCA
Randomized PCA uses a stochastic algorithm that quickly finds an approximation of the first principal components. It is dramatically faster than a full decomposition when the number of components to keep is much smaller than the number of features, and its cost also scales with the number of data points.
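A minimal sketch of randomized PCA in scikit-learn, where svd_solver="randomized" selects the stochastic approximation; the random placeholder data and the number of components are assumptions for illustration.

# Randomized PCA: approximate the first components without a full SVD.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(1000, 200)            # placeholder high-dimensional data
rnd_pca = PCA(n_components=10, svd_solver="randomized", random_state=42)
X_reduced = rnd_pca.fit_transform(X)
print(X_reduced.shape)                   # (1000, 10)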
Kernel PCA
Kernel PCA applies the kernel trick to PCA: the data is implicitly mapped into a higher-dimensional feature space, where linear PCA is performed. This makes it possible to perform complex non-linear projections for dimensionality reduction.
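A sketch of kernel PCA with scikit-learn's KernelPCA; the two-moons data, the RBF kernel, and the gamma value are assumptions chosen to show a non-linear projection.

# Kernel PCA: linear PCA in an implicit feature space defined by the RBF kernel.
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA

X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=15)
X_kpca = kpca.fit_transform(X)
print(X_kpca.shape)                      # (200, 2)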
Benefits of applying Dimensionality Reduction
Some benefits of applying the dimensionality reduction technique to a given dataset are given below:
o By reducing the dimensions of the features, the space required to store the dataset also gets reduced.
o Less computation and training time is required for reduced dimensions of features.
o Reduced dimensions of the dataset's features help in visualizing the data quickly.
o It removes redundant features (if present) by taking care of multicollinearity.
Disadvantages of Dimensionality Reduction
There are also some disadvantages of applying dimensionality reduction, which are given below:
o Some information may be lost due to dimensionality reduction.
o In the PCA dimensionality reduction technique, the number of principal components to keep is sometimes not known in advance.
