
Principal Component Analysis
What is PCA?
• Principal Component Analysis (PCA) is a dimensionality reduction
technique used in data analysis and machine learning. Its main goal is
to simplify a dataset while retaining as much information (variance)
as possible.
Why use PCA?
• Datasets can have many variables (features), which can make analysis
hard.
• Some features may be correlated or redundant.
• PCA transforms the original data into a new set of uncorrelated
variables, called principal components, ordered by the amount of
variance they explain.
Steps
• Standardize the data (if required)
• Compute the covariance matrix
• Calculate the eigenvalues and eigenvectors
  Eigenvectors represent the directions (principal components).
  Eigenvalues show how much variance each principal component captures.
• Sort components by variance
  Keep the top k components that explain the most variance.
• Transform the data (project the observations onto the retained components); see the sketch below.
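A minimal NumPy sketch of these steps, assuming a small numeric matrix X with observations in rows and features in columns (the data and variable names are purely illustrative):

```python
import numpy as np

# Illustrative data: rows = observations, columns = features
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0]])

# 1. Standardize (centre and scale each feature)
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# 2. Covariance matrix of the standardized data
cov = np.cov(X_std, rowvar=False)

# 3. Eigenvalues and eigenvectors (eigh suits symmetric matrices)
eigvals, eigvecs = np.linalg.eigh(cov)

# 4. Sort components by variance, largest eigenvalue first
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 5. Transform: keep the top k components and project the data
k = 1
scores = X_std @ eigvecs[:, :k]

print(eigvals)   # variance captured by each component
print(scores)    # the data in the new (reduced) space
```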


Key outputs of PCA
• Explained Variance
• Principal Components (PCs)
• Loadings (component weights)
• Scores (transformed data)
1. Explained Variance / Eigenvalues
• Each principal component explains a percentage of the total variance
in the data.
• The first PC explains the most variance, the second explains the next
most, and so on.
• Use a scree plot (variance vs. component index) to decide how many PCs to keep (look for the “elbow”).
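As a sketch, scikit-learn reports the explained variance directly, and a scree plot can be drawn from it (the data here are a random placeholder; in practice X would be your own feature matrix):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.random.default_rng(0).normal(size=(100, 6))  # placeholder data

X_std = StandardScaler().fit_transform(X)
pca = PCA().fit(X_std)

print(pca.explained_variance_ratio_)  # share of total variance per PC

# Scree plot: variance explained vs. component index
components = range(1, len(pca.explained_variance_ratio_) + 1)
plt.plot(components, pca.explained_variance_ratio_, marker="o")
plt.xlabel("Component index")
plt.ylabel("Proportion of variance explained")
plt.show()
```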
2. Principal Components (Axes of New Space)
• These are new features created by PCA.
• They’re linear combinations of your original features.
• They are interpreted based on the loadings
3. Loadings (Component Matrix)
• Loadings show how much each original variable contributes to a
principal component
• Large (positive or negative) loadings = strong influence
• Small loadings ≈ negligible effect.
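A sketch of one common way to compute loadings from a fitted scikit-learn PCA (scaling the unit-length components_ by the square root of each component's variance; the data are a random placeholder):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.random.default_rng(1).normal(size=(50, 4))  # placeholder data
X_std = StandardScaler().fit_transform(X)

pca = PCA(n_components=2).fit(X_std)

# Loadings: one row per original variable, one column per PC;
# eigenvector weights scaled by sqrt(eigenvalue)
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
print(loadings)
```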
4. Scores (Transformed Data Points)
• These are your original observations projected into the new PC space.
Plot them (e.g. PC1 vs PC2) to see patterns:
• Clusters = similar data points.
• Outliers = unusual observations.
• Trends = natural groupings or directions of change.
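A minimal sketch of obtaining scores and plotting PC1 against PC2 (random placeholder data; clusters, outliers, or trends would show up in this plot):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.random.default_rng(2).normal(size=(80, 5))  # placeholder data
X_std = StandardScaler().fit_transform(X)

# Scores: the observations projected into PC space
scores = PCA(n_components=2).fit_transform(X_std)

plt.scatter(scores[:, 0], scores[:, 1])
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.show()
```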
Summary
• Explained Variance: how much structure each PC captures; use it to select the number of PCs to retain.
• Loadings: the contribution of each original variable to a PC; use them to understand what the PCs represent.
• Scores: the transformed observations; use them to visualize clustering and patterns.
Eigenvalues
• In linear algebra, for a square matrix A, an eigenvalue λ and a corresponding eigenvector v satisfy the equation Av = λv.
• v is a non-zero vector whose direction does not change when transformed by A.
• λ is a scalar that stretches (or shrinks) the vector.
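A small NumPy check of this definition (the matrix A is just an illustrative example):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # illustrative symmetric matrix

eigvals, eigvecs = np.linalg.eig(A)

# For every eigenpair, A @ v should equal lambda * v
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))  # True
```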
Connecting with PCA
• In PCA:
1. You compute the covariance matrix Σ of your dataset (or sometimes the correlation matrix).
2. Then you solve the eigenvalue equation for Σ: Σvᵢ = λᵢvᵢ
• λᵢ: eigenvalues — how much variance lies in the direction vᵢ.
• vᵢ: eigenvectors — the principal components (the new axes in PCA).
• In short: vᵢ is a principal direction, λᵢ is the amount of variance along it.
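A quick sketch of this correspondence: the eigenvalues of the covariance matrix should match scikit-learn's explained_variance_ (both use the n−1 denominator; the data are a random placeholder standing in for a standardized matrix):

```python
import numpy as np
from sklearn.decomposition import PCA

X_std = np.random.default_rng(3).normal(size=(60, 4))  # placeholder data

# Eigenvalues of the covariance matrix, sorted largest first
eigvals = np.sort(np.linalg.eigvalsh(np.cov(X_std, rowvar=False)))[::-1]

# Per-component variances reported by scikit-learn
pca = PCA().fit(X_std)

print(np.allclose(eigvals, pca.explained_variance_))  # True
```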


Loadings
• Suppose PC1 is a new axis created by PCA.
• The loading of variable A on PC1 is how much variable A "aligns" with that axis.
PC1 = a₁·X₁ + a₂·X₂ + … + aₙ·Xₙ
Here, a₁, a₂, …, aₙ are the loadings of the variables X₁, X₂, …, Xₙ.
• If you standardize the data, the loadings are the correlations between the original variables and the components.
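A sketch demonstrating this on standardized data: the correlation of each original variable with each PC score matches the loadings, up to the n vs. n−1 scaling convention (the data are a correlated random placeholder):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3)) @ np.array([[1.0, 0.5, 0.2],
                                          [0.0, 1.0, 0.3],
                                          [0.0, 0.0, 1.0]])  # correlated placeholder data
X_std = StandardScaler().fit_transform(X)

pca = PCA().fit(X_std)
scores = pca.transform(X_std)
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

# Correlation of each standardized variable with each PC score
corr = np.array([[np.corrcoef(X_std[:, j], scores[:, k])[0, 1]
                  for k in range(scores.shape[1])]
                 for j in range(X_std.shape[1])])

# Equal up to the n vs. n-1 scaling convention, hence the small tolerance
print(np.allclose(corr, loadings, atol=1e-2))  # True
```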


Assumptions
• Normality: variables are multivariate normal.
• Linearity: linear relations between variables (check bivariate scatterplots).
• Factorability
1. Inter-item correlations (correlation adequacy): Bartlett's test of correlation adequacy should be significant (p < 0.05).
   H₀: the variables are orthogonal (not correlated).
2. Sample adequacy: Kaiser-Meyer-Olkin measure (KMO test) > 0.70, with a case-to-item ratio of about 20:1 (e.g., 20 items: about 400 cases; EFA is often done with 5:1).
KMO values
Bartlett’s test of correlation adequacy
• Bartlett's (1951) test of sphericity tests whether a matrix (of correlations) is significantly different from an identity matrix (1s on the diagonal, 0s everywhere else).
• The test computes the probability that the correlation matrix has significant correlations among at least some of the variables in the dataset, a prerequisite for factor analysis to work.
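Both checks are available in the third-party factor_analyzer package; a hedged sketch using its documented helper functions (df here is a placeholder pandas DataFrame; real item data would be used in practice):

```python
import numpy as np
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

# Placeholder DataFrame of numeric items (substitute your own data)
rng = np.random.default_rng(5)
df = pd.DataFrame(rng.normal(size=(300, 6)),
                  columns=[f"item{i}" for i in range(1, 7)])

chi_square, p_value = calculate_bartlett_sphericity(df)  # want p < 0.05
kmo_per_item, kmo_total = calculate_kmo(df)              # want overall KMO > 0.70

print(p_value, kmo_total)
```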
Communalities and Eigenvalues
• Applies to: an eigenvalue describes a component (PC1, PC2, etc.); a communality describes a variable (e.g., Height, Weight).
• Tells you: an eigenvalue is how much total variance a component captures; a communality is how much of a variable's variance is explained by the components.
• Computed from: an eigenvalue is the sum of squared loadings for that component (down its column); a communality is the sum of squared loadings of that variable across the components (along its row).
• Units: eigenvalues are in total-variance units; communalities are a proportion of each variable's variance.
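A small sketch of both quantities computed from a loading matrix (the loadings below are made up for illustration):

```python
import numpy as np

# Illustrative loading matrix: rows = variables, columns = retained components
loadings = np.array([[0.80,  0.10],
                     [0.75, -0.20],
                     [0.15,  0.85],
                     [0.05,  0.70]])

# Eigenvalue of each component: sum of squared loadings down its column
eigenvalues = (loadings ** 2).sum(axis=0)

# Communality of each variable: sum of squared loadings along its row
communalities = (loadings ** 2).sum(axis=1)

print(eigenvalues)     # variance captured by each component
print(communalities)   # proportion of each variable's variance explained
```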
Eigenvalues explained
• We can use the eigenvalues to calculate the percentage of variance accounted for by each of the factors.
• Given that the sum of the eigenvalues equals the total number of variables in the analysis (when the correlation matrix is used), we can calculate the percentage of variance accounted for by dividing each eigenvalue by the total number of variables.
• % of variance explained = (eigenvalue / total number of variables) × 100. For example, with 5 variables, a component with an eigenvalue of 2.5 explains 2.5/5 = 50% of the total variance.
How to get simple structure (set up)
• Rotation: a process by which the solution is made better (smaller residuals) without changing its mathematical properties.
• An orthogonal rotation (method: varimax) holds the factors completely uncorrelated, as in PCA; a sketch follows below.
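If an orthogonal varimax rotation is wanted, the third-party factor_analyzer package provides one; a hedged sketch (the Rotator class and its "varimax" method are as documented in that package; the loading matrix is made up):

```python
import numpy as np
from factor_analyzer.rotator import Rotator

# Illustrative unrotated loading matrix: rows = variables, columns = components
loadings = np.array([[0.70, 0.35],
                     [0.65, 0.40],
                     [0.30, 0.75],
                     [0.25, 0.80]])

rotator = Rotator(method="varimax")               # orthogonal rotation
rotated_loadings = rotator.fit_transform(loadings)
print(rotated_loadings)
```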
How to tell if my setup achieved the simple structure
• Variable loadings exceed the threshold (> 0.40, or 0.30)
• Check the p-value
