Module 5.2 Principal Component Analysis - V1
1. The data has 4 predictors and 5 instances (a small number, chosen for demonstration purposes). Representing each data instance therefore requires a 4-dimensional space. Is it possible to represent the same data in a lower-dimensional space (fewer than 4 dimensions) without loss of information?
2. Normalize the data by subtracting the mean of each dimension and dividing by the standard deviation of that dimension: (x - x_mean) / x_std
Normalized data
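The normalization step above can be sketched in NumPy; the 5 x 4 data matrix below is hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical data: 5 instances (rows) x 4 predictors (columns).
X = np.array([[2.5, 2.4, 1.2, 0.5],
              [0.5, 0.7, 0.9, 1.1],
              [2.2, 2.9, 1.0, 0.4],
              [1.9, 2.2, 1.4, 0.6],
              [3.1, 3.0, 1.1, 0.9]])

# Z-score normalization: subtract each column's mean and divide by its
# standard deviation, i.e. (x - x_mean) / x_std.
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
```

After this step every column of X_norm has mean 0 and standard deviation 1, so no predictor dominates the covariance computation merely because of its scale.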
3. Find the covariance matrix of the normalized data matrix:
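A minimal sketch of this step, reusing the hypothetical normalized matrix from above (note that np.cov divides by n - 1 by default, so the diagonal is n/(n-1) rather than exactly 1):

```python
import numpy as np

# Hypothetical data: 5 instances x 4 predictors, then z-score normalized.
X = np.array([[2.5, 2.4, 1.2, 0.5],
              [0.5, 0.7, 0.9, 1.1],
              [2.2, 2.9, 1.0, 0.4],
              [1.9, 2.2, 1.4, 0.6],
              [3.1, 3.0, 1.1, 0.9]])
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

# Covariance matrix: rowvar=False because each column is a variable
# (predictor) and each row an observation.
C = np.cov(X_norm, rowvar=False)   # shape (4, 4), symmetric
```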
4. Get the eigenvalues and eigenvectors of the covariance matrix:
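Since the covariance matrix is symmetric, np.linalg.eigh is the appropriate routine; it returns eigenvalues in ascending order, so they are reordered descending here (the data matrix is again a hypothetical example):

```python
import numpy as np

# Hypothetical data, normalized, and its covariance matrix (as before).
X = np.array([[2.5, 2.4, 1.2, 0.5],
              [0.5, 0.7, 0.9, 1.1],
              [2.2, 2.9, 1.0, 0.4],
              [1.9, 2.2, 1.4, 0.6],
              [3.1, 3.0, 1.1, 0.9]])
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
C = np.cov(X_norm, rowvar=False)

# eigh handles symmetric matrices; reorder so the largest eigenvalue
# (most variance) comes first.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]   # column i is the eigenvector for eigvals[i]
```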
5. Determine the number of eigenvectors to retain based on the explained variance, and transform the data in terms of the retained eigenvectors, which results in dimensionality reduction.
If only the first two eigenvectors are used, around 80% of the variance in the data is retained, and the dimensionality is reduced from 4 to 2 (a 50% reduction). In practice, the number of dimensions retained is usually chosen so that they capture about 95% of the variance in the data.
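The explained-variance criterion described above can be sketched as follows; the data matrix and the 95% threshold are illustrative assumptions:

```python
import numpy as np

# Hypothetical data, normalized; eigendecomposition as in the earlier steps.
X = np.array([[2.5, 2.4, 1.2, 0.5],
              [0.5, 0.7, 0.9, 1.1],
              [2.2, 2.9, 1.0, 0.4],
              [1.9, 2.2, 1.4, 0.6],
              [3.1, 3.0, 1.1, 0.9]])
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
C = np.cov(X_norm, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
eigvals = np.sort(eigvals)[::-1]   # descending

# Fraction of total variance explained by each component, and the
# running total as components are added.
explained = eigvals / eigvals.sum()
cumulative = np.cumsum(explained)

# Smallest number of components whose cumulative explained variance
# reaches the chosen threshold (95% here).
k = int(np.searchsorted(cumulative, 0.95)) + 1
```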
To train a machine learning algorithm, the normalized train and test data are transformed with the 2 retained eigenvectors (considered here) as:
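A sketch of this final transformation, with hypothetical train and test matrices; the key point is that the test data must be normalized with the *training* mean and standard deviation before projection onto the top-2 eigenvectors:

```python
import numpy as np

# Hypothetical train (5 x 4) and test (1 x 4) data.
X_train = np.array([[2.5, 2.4, 1.2, 0.5],
                    [0.5, 0.7, 0.9, 1.1],
                    [2.2, 2.9, 1.0, 0.4],
                    [1.9, 2.2, 1.4, 0.6],
                    [3.1, 3.0, 1.1, 0.9]])
X_test = np.array([[2.0, 2.1, 1.1, 0.7]])

# Normalize both sets with statistics computed on the training data only.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)
Xtr = (X_train - mu) / sigma
Xte = (X_test - mu) / sigma

# Eigendecomposition of the training covariance, keep the top-2
# eigenvectors as the projection matrix W (shape 4 x 2).
C = np.cov(Xtr, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
W = eigvecs[:, order[:2]]

# Project: each instance goes from 4 dimensions down to 2.
Z_train = Xtr @ W
Z_test = Xte @ W
```

Fitting the normalization and the eigenvectors on the training data alone, then applying them unchanged to the test data, avoids leaking test-set information into the transformation.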