Mathematical Approach To PCA
A feature vector x can be written as a linear combination of basis vectors,
x = a_1 u_1 + a_2 u_2 + ... + a_n u_n
where a_1, ..., a_n represent 'n' scalars and u_1, ..., u_n represent the basis vectors. Basis vectors are
orthogonal to each other. Orthogonality of vectors can be thought of as an extension of
the vectors being perpendicular in a 2-D vector space. So our feature vector (data
set) can be transformed into a set of principal components, which behave just like such
basis vectors.
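As a quick illustration of this decomposition, here is a small NumPy sketch; the basis vectors and the sample vector are made up for the example and are not from the text:

import numpy as np

# Two orthonormal basis vectors in 2-D (illustrative values, not from the text)
u1 = np.array([1.0, 1.0]) / np.sqrt(2.0)
u2 = np.array([1.0, -1.0]) / np.sqrt(2.0)

x = np.array([3.0, 1.0])                   # an arbitrary feature vector

# Because u1 and u2 are orthonormal, the scalars a1, a2 are simple dot products
a1, a2 = x @ u1, x @ u2

# Reconstruct x as a1*u1 + a2*u2 and confirm it matches the original vector
print(np.allclose(a1 * u1 + a2 * u2, x))   # True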
Objectives of PCA:
1. The new features are distinct, i.e., the covariance between the new features
(in the case of PCA, the principal components) is 0.
2. The principal components are generated in order of the variability in the
data that they capture. Hence, the first principal component should capture
the maximum variability, the second one should capture the next highest
variability, and so on.
3. The sum of the variances of the new features / the principal components
should be equal to the sum of the variances of the original features. (All three
properties are checked numerically in the sketch after this list.)
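The following sketch (an illustration with randomly generated data, not part of the original text) projects data onto the eigenvectors of its covariance matrix and verifies that the resulting features are uncorrelated, ordered by variance, and preserve the total variance:

import numpy as np

rng = np.random.default_rng(0)
# Correlated toy data (made up for the illustration): 200 samples, 3 features
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.5, 0.1],
                                          [0.0, 1.0, 0.3],
                                          [0.0, 0.0, 0.2]])

Xc = X - X.mean(axis=0)                        # mean-centre each feature
C = np.cov(Xc, rowvar=False)                   # covariance matrix of the original features
vals, vecs = np.linalg.eigh(C)                 # eigen-decomposition
order = np.argsort(vals)[::-1]                 # sort by decreasing eigenvalue
vals, vecs = vals[order], vecs[:, order]

Z = Xc @ vecs                                  # the new features (principal components)
Cz = np.cov(Z, rowvar=False)                   # covariance matrix of the new features

print(np.allclose(Cz - np.diag(np.diag(Cz)), 0.0))   # 1. off-diagonal covariances are ~0
print(np.all(np.diff(np.diag(Cz)) <= 1e-12))          # 2. variances appear in decreasing order
print(np.isclose(np.trace(Cz), np.trace(C)))          # 3. total variance is preserved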
Working of PCA:
PCA works through a process called Eigenvalue Decomposition of the covariance matrix
of a data set. The steps are as follows (a minimal code sketch follows the list):
1. First, calculate the covariance matrix of the data set.
2. Then, calculate the eigenvectors and eigenvalues of the covariance matrix.
3. The eigenvector having the highest eigenvalue represents the direction in
which there is the highest variance, so it identifies the first principal component.
4. The eigenvector having the next highest eigenvalue represents the direction
with the highest remaining variance that is also orthogonal to the first direction,
so it identifies the second principal component.
5. Continuing like this, identify the top 'k' eigenvectors (those having the top 'k'
eigenvalues) to get the 'k' principal components.
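A minimal sketch of these steps in NumPy (the function name and variables are my own, chosen for illustration; it assumes the data matrix has samples in rows and features in columns):

import numpy as np

def top_k_components(X, k):
    """Return the top-k eigenvalues and principal directions of a data matrix X."""
    C = np.cov(X, rowvar=False)              # step 1: covariance matrix of the data set
    eigvals, eigvecs = np.linalg.eigh(C)     # step 2: eigenvalues and eigenvectors
    order = np.argsort(eigvals)[::-1]        # sort by decreasing eigenvalue
    return eigvals[order][:k], eigvecs[:, order][:, :k]

# Example usage on made-up data: keep the top 2 principal components
X = np.random.default_rng(1).normal(size=(100, 5))
vals, vecs = top_k_components(X, k=2)
scores = (X - X.mean(axis=0)) @ vecs         # project mean-centred data onto those directions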
Numerical for PCA:
Consider the following dataset

x1: 2.5  0.5  2.2  1.9  3.1  2.3  2.0  1.0  1.5  1.1
x2: 2.4  0.7  2.9  2.2  3.0  2.7  1.6  1.1  1.6  0.9
Subtracting the mean of each feature (mean of x1 = 1.81, mean of x2 = 1.91) gives the mean-adjusted data

x1 - mean(x1): 0.69  -1.31  0.39  0.09  1.29  0.49   0.19  -0.81  -0.31  -0.71
x2 - mean(x2): 0.49  -1.21  0.99  0.29  1.09  0.79  -0.31  -0.81  -0.31  -1.01
Covariance Matrix c = (X^T X) / (N - 1)
where X is the mean-adjusted Dataset Matrix (in this numerical, it is a 10 x 2 matrix),
X^T is the transpose of X (in this numerical, it is a 2 x 10 matrix), and N is the
number of data points = 10.
So,
c = [0.61656  0.61544]
    [0.61544  0.71656]
{So in order to calculate the covariance matrix, we multiply the transpose of the
mean-adjusted Dataset Matrix by the matrix itself and divide by N - 1}
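The same calculation can be reproduced in NumPy with the dataset above (a verification sketch; the variable names are chosen for illustration):

import numpy as np

x1 = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1])
x2 = np.array([2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9])

X = np.column_stack([x1, x2])        # 10 x 2 dataset matrix
Xc = X - X.mean(axis=0)              # subtract the mean of each feature

N = X.shape[0]                       # N = 10
c = (Xc.T @ Xc) / (N - 1)            # 2 x 2 covariance matrix
print(c)                             # approximately [[0.6166, 0.6154], [0.6154, 0.7166]]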
We get two values for \lambda; they are \lambda_1 = 1.28403 and \lambda_2 = 0.0490834. Now we
have to find the eigenvectors for the eigenvalues \lambda_1 and \lambda_2.
To find the eigenvectors from the eigenvalues, we will use the following
approach:
First, we will find the eigenvector for the eigenvalue \lambda_1 = 1.28403 by using the
equation (c - \lambda_1 I) v = 0.
A. Solving this equation gives an un-normalized eigenvector; its length works out to 1.3602.
B. Now divide the elements of that vector by 1.3602 (the length just found) to normalize it.
So now we have found the unit eigenvector for the eigenvalue \lambda_1; its components are 0.67787 and
0.73518.
Secondly, we will find the eigenvector for the eigenvalue \lambda_2 = 0.0490834 by using
the equation (c - \lambda_2 I) v = 0. {Same approach as in the previous step}
A. Solving this equation gives an un-normalized eigenvector; its length again works out to about 1.3602.
B. Now divide the elements of that vector by 1.3602 to normalize it.
So now we have found the unit eigenvector for the eigenvalue \lambda_2; its components are
-0.735176 and 0.677873 (the opposite signs make this eigenvector orthogonal to the first one).
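Continuing the verification sketch, the eigenvalues and the (already normalized) unit eigenvectors can be read off directly with np.linalg.eigh, which returns the eigenvalues in ascending order; the sign of a returned eigenvector may be flipped, which does not change the direction it represents:

import numpy as np

# Covariance matrix carried over from the snippet above (rounded values)
c = np.array([[0.61656, 0.61544],
              [0.61544, 0.71656]])

eigvals, eigvecs = np.linalg.eigh(c)    # eigenvalues in ascending order
print(eigvals)                          # approximately [0.0491, 1.2840]
print(eigvecs[:, 1])                    # approximately +/-[0.6779, 0.7352], the direction for lambda_1
print(eigvals.sum())                    # approximately 1.333, the total variance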
Sum of eigenvalues \lambda_1 and \lambda_2 = 1.28403 + 0.0490834 ≈ 1.333 = Total Variance
{The majority of the variance (about 96%) comes from \lambda_1}
Step 3: Arrange Eigenvalues
The eigenvector with the highest eigenvalue is the Principal Component of the
dataset. So in this case, the eigenvector of \lambda_1, (0.67787, 0.73518), is the principal component.
{Basically, in order to complete the numerical we only have to solve up to this step, but
if we have to show why we have chosen that particular eigenvector, we have to
follow Steps 4 to 6.}
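As a quick numerical check of this choice (a continuation of the verification sketch, not part of the original numerical; the full justification follows in Steps 4 to 6), projecting the mean-adjusted data onto the eigenvector of \lambda_1 gives a new feature whose variance equals \lambda_1:

import numpy as np

x1 = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1])
x2 = np.array([2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9])
Xc = np.column_stack([x1, x2])
Xc = Xc - Xc.mean(axis=0)              # mean-adjusted data

v1 = np.array([0.67787, 0.73518])      # eigenvector of lambda_1 (first principal direction)
scores = Xc @ v1                       # the data expressed along the first principal component

print(scores.var(ddof=1))              # approximately 1.284, i.e. lambda_1: this single direction
                                       # keeps about 96% of the total variance (1.284 / 1.333)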
Step 4: Form Feature Vector