PCA Explained Step-by-Step
Reducing the number of variables of a data set naturally comes at the expense of accuracy, but the trick in dimensionality reduction is to trade a little accuracy for simplicity. Smaller data sets are easier to explore and visualize, and they make analyzing data much easier and faster for machine learning algorithms, since there are no extraneous variables to process.
So to sum up, the idea of PCA is simple — reduce the number of variables of a data
set, while preserving as much information as possible.
Mathematically, standardization can be done by subtracting the mean and dividing by the standard deviation for each value of each variable.
Once the standardization is done, all the variables will be transformed to the same
scale.
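As a minimal sketch of this step (the array name X and the sample values are illustrative assumptions, not taken from the original article), standardization in Python with NumPy looks like this:

    import numpy as np

    # Hypothetical 2-variable data set: one row per observation, one column per variable.
    X = np.array([[2.5, 2.4],
                  [0.5, 0.7],
                  [2.2, 2.9],
                  [1.9, 2.2],
                  [3.1, 3.0]])

    # Standardize: subtract each variable's mean and divide by its standard deviation.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)

    # Every column now has mean 0 and standard deviation 1, i.e. the same scale.
    print(X_std.mean(axis=0), X_std.std(axis=0))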
The covariance matrix is a symmetric matrix whose entries are the covariances of every possible pair of the initial variables. What do these covariances tell us about the correlations between the variables? It’s actually the sign of the covariance that matters: a positive covariance means the two variables increase or decrease together (they are correlated), while a negative covariance means that when one increases the other decreases (they are inversely correlated).
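Continuing the sketch above (X_std is the assumed standardized array from the previous snippet), the covariance matrix and the signs of its entries can be inspected directly:

    # Covariance matrix of the standardized variables (rowvar=False: columns are variables).
    cov_matrix = np.cov(X_std, rowvar=False)
    print(cov_matrix)

    # A positive off-diagonal entry means the two variables tend to increase or decrease
    # together; a negative entry means one tends to increase when the other decreases.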
Principal components are new variables that are constructed as linear combinations
or mixtures of the initial variables. These combinations are done in such a way
that the new variables (i.e., principal components) are uncorrelated and most of
the information within the initial variables is squeezed or compressed into the
first components. So the idea is that 10-dimensional data gives you 10 principal components, but PCA tries to put the maximum possible information in the first component, then the maximum remaining information in the second, and so on, until the explained variance tails off in the way a scree plot shows.
Organizing information in principal components this way allows you to reduce dimensionality without losing much information: you discard the components with low information and treat the remaining components as your new variables.
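To see this concentration of information in practice, scikit-learn's PCA exposes the explained_variance_ratio_ attribute, which holds exactly the values a scree plot displays; a minimal sketch, reusing the assumed X_std array from above:

    from sklearn.decomposition import PCA

    # Fit PCA on the standardized data.
    pca = PCA()
    pca.fit(X_std)

    # Fraction of the total variance carried by each principal component,
    # in decreasing order; these are the scree-plot values.
    print(pca.explained_variance_ratio_)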
An important thing to realize here is that the principal components are less interpretable and don’t have any real-world meaning, since they are constructed as linear combinations of the initial variables.
Geometrically speaking, the first principal component is the axis along which the projected data has the largest possible variance. The second principal component is calculated in the same way, with the condition that it is uncorrelated with (i.e., perpendicular to) the first principal component and that it accounts for the next highest variance. This continues until a total of p principal components have been calculated, equal to the original number of variables.
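In symbols (a standard formulation, not quoted from the original article): for the standardized data X, the first principal component direction w1 is the unit-length vector that maximizes Var(X·w), the second direction w2 is the unit-length vector orthogonal to w1 that maximizes Var(X·w), and so on, up to wp.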
Without further ado, it is eigenvectors and eigenvalues that are behind all the magic explained above: the eigenvectors of the covariance matrix are the directions of the axes with the most variance (the most information), and these are what we call the principal components. The eigenvalues are simply the coefficients attached to the eigenvectors, and they give the amount of variance carried by each principal component.
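A minimal sketch of this step (reusing the assumed cov_matrix from the earlier snippet): NumPy's eigendecomposition gives the principal component directions and the variance each one carries.

    # Eigendecomposition of the covariance matrix; eigh is suited to symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

    # Sort from largest to smallest eigenvalue so that PC1 comes first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Share of the total variance (information) carried by each principal component.
    print(eigenvalues / eigenvalues.sum())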
Example:
Let’s suppose that our data set is 2-dimensional, with two variables x and y, and that the eigenvectors and eigenvalues of the covariance matrix are as follows:
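For illustration only, assume eigenpairs of roughly this form (hypothetical values, chosen so that the 96%/4% split quoted below works out):

    v1 = [0.68, 0.73]   with   λ1 = 1.28
    v2 = [-0.73, 0.68]  with   λ2 = 0.05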
If we rank the eigenvalues in descending order, we get λ1>λ2, which means that the
eigenvector that corresponds to the first principal component (PC1) is v1 and the
one that corresponds to the second component (PC2) is v2.
So, the feature vector is simply a matrix that has as columns the eigenvectors of the components that we decide to keep. This makes it the first step towards dimensionality reduction, because if we choose to keep only k eigenvectors (components) out of the original p, the final data set will have only k dimensions.
Example:
Continuing with the example from the previous step, we can either form a feature vector with both of the eigenvectors v1 and v2, or discard the eigenvector v2, which is the one of lesser significance, and form a feature vector with v1 only.
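In the running Python sketch from earlier (assumed names, not from the original article), these two options correspond to keeping one or both columns of the sorted eigenvector matrix:

    # Keep both eigenvectors: no dimensionality reduction (one column per component).
    W_full = eigenvectors[:, :2]

    # Keep only the first eigenvector v1: the data will be reduced to 1 dimension.
    W_reduced = eigenvectors[:, :1]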
Discarding the eigenvector v2 reduces the dimensionality by 1 and will consequently cause a loss of information in the final data set. But given that v2 carries only 4% of the information, the loss is not important: we still keep the 96% of the information that is carried by v1.
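With the hypothetical eigenvalues used in the illustration above, that split works out as λ1 / (λ1 + λ2) = 1.28 / 1.33 ≈ 0.96 and λ2 / (λ1 + λ2) = 0.05 / 1.33 ≈ 0.04.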
So, as we saw in the example, it’s up to you to choose whether to keep all the components or discard the ones of lesser significance, depending on what you are looking for. If you just want to describe your data in terms of new, uncorrelated variables (the principal components) without seeking to reduce dimensionality, leaving out the less significant components is not necessary.
LAST STEP: RECAST THE DATA ALONG THE PRINCIPAL COMPONENTS AXES
In the previous steps, apart from standardization, you do not make any changes to the data; you just select the principal components and form the feature vector, but the input data set always remains in terms of the original axes (i.e., in terms of the initial variables).
In this step, which is the last one, the aim is to use the feature vector formed from the eigenvectors of the covariance matrix to reorient the data from the original axes to the ones represented by the principal components (hence the name Principal Component Analysis). This can be done by multiplying the transpose of the feature vector by the transpose of the standardized original data set.
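As a final sketch (reusing the assumed X_std and W_reduced from the earlier snippets), this reorientation is a single matrix product; written row-wise, FeatureVectorᵀ × StandardizedDataᵀ is the same operation as StandardizedData × FeatureVector:

    # Project the standardized data onto the kept principal component axes.
    # Equivalent formulations: (W_reduced.T @ X_std.T).T  or simply  X_std @ W_reduced.
    X_pca = X_std @ W_reduced

    # X_pca has one column per kept component (here just PC1): the data recast
    # along the principal component axes.
    print(X_pca)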