
Chapter Five

Introduction: Principal Component Analysis (PCA)


Supervised and unsupervised learning
• Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed.
• The two main types of machine learning are:
 supervised learning and
 unsupervised learning
Supervised learning
• In supervised learning, the machine is trained on a set of labeled data, which means that each input is paired with its desired output.
• The machine then learns to predict the output for new input data.
• Supervised learning is often used for tasks such as:
 classification
 regression
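As a minimal sketch of this idea, assuming scikit-learn is installed, the snippet below trains a classifier on labeled data and predicts labels for unseen inputs; the iris dataset and logistic regression are illustrative choices, not part of the original slides.

```python
# Supervised learning sketch: learn from (input, label) pairs.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)                   # inputs paired with labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)           # an illustrative classifier
model.fit(X_train, y_train)                         # train on labeled data
print(model.predict(X_test[:5]))                    # predict for new inputs
print("accuracy:", model.score(X_test, y_test))     # held-out performance
```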
Unsupervised learning
• In unsupervised learning, the machine is trained on a set of unlabeled data, which means that the input data is not paired with any desired output. The machine then learns to find patterns and relationships in the data.
• Unsupervised learning is often used for tasks such as:
 clustering
 dimensionality reduction
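A minimal sketch of the unsupervised setting, again assuming scikit-learn; the synthetic blob data and the choice of k-means with three clusters are illustrative.

```python
# Unsupervised learning sketch: find structure without labels.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # labels discarded

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(X)          # groupings discovered from X alone
print(clusters[:10])                      # cluster indices, not true labels
```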
Cont..
Understanding dimensionality reduction
• Dimensionality reduction is the process of reducing the number of features (or dimensions) in a dataset while retaining as much information as possible.
• Reasons for dimensionality reduction are:
 to reduce the complexity of a model,
 to improve the performance of a learning algorithm, or
 to make it easier to visualize the data.
Approaches to dimensionality reduction
• There are two main approaches to dimensionality reduction:
 feature selection and
 feature extraction.
Feature Selection:
Feature selection involves selecting a subset of the original features that are most relevant to the problem at hand. The goal is to reduce the dimensionality of the dataset while retaining the most important features.
Cont..
• There are several methods for feature selection (sketched in code below):
 Filter methods: rank the features based on their relevance to the target variable.
 Wrapper methods: use the model's performance as the criterion for selecting features.
 Embedded methods: combine feature selection with the model training process.
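A hedged sketch of all three flavors, assuming scikit-learn; the breast-cancer dataset, k = 10, and the particular estimators are illustrative stand-ins.

```python
# Feature-selection sketch: filter, wrapper, and embedded methods.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif, RFE
from sklearn.linear_model import LogisticRegression, Lasso

X, y = load_breast_cancer(return_X_y=True)

# Filter: rank features by a univariate score against the target.
X_filter = SelectKBest(f_classif, k=10).fit_transform(X, y)

# Wrapper: recursively eliminate features using a model's performance.
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=10)
X_wrapper = rfe.fit_transform(X, y)

# Embedded: an L1-regularized fit drives unhelpful coefficients to zero.
lasso = Lasso(alpha=0.1).fit(X, y)
print((lasso.coef_ != 0).sum(), "features kept by the embedded method")
```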
Cont…
• Feature Extraction:
Feature extraction involves creating new features by combining or transforming the original features. The goal is to create a set of features that captures the essence of the original data in a lower-dimensional space.
• There are several methods for feature extraction (compared in code below):
 principal component analysis (PCA),
 linear discriminant analysis (LDA), and
 t-distributed stochastic neighbor embedding (t-SNE).
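The sketch below contrasts the three methods on one dataset, assuming scikit-learn; the digits dataset and two output dimensions are illustrative.

```python
# Feature-extraction sketch: three ways to build new low-dimensional features.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)       # 64 original pixel features

X_pca = PCA(n_components=2).fit_transform(X)                            # linear, unsupervised
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)  # linear, supervised
X_tsne = TSNE(n_components=2, random_state=0).fit_transform(X)          # nonlinear
print(X_pca.shape, X_lda.shape, X_tsne.shape)                           # each (1797, 2)
```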
Understanding PCA
• In many exploratory studies the number of variables under consideration is too large to handle. One way of reducing the number of variables is to discard the linear combinations that have small variance and study only those with large variance.
• In other words, to examine the relationships among a set of correlated variables, it may be useful to transform the original variables into a new set of variables that are uncorrelated with each other; these new variables are the principal components, and the technique is called principal component analysis.
Cont..
• Principal component analysis, or PCA, is a dimensionality reduction method that is often used to reduce the dimensionality of large data sets by transforming a large set of variables into a smaller one that still contains most of the information in the large set.
• Reducing the number of variables of a data set naturally comes at the expense of accuracy, but the trick in dimensionality reduction is to trade a little accuracy for simplicity.
• Smaller data sets are easier to explore and visualize, and they make analysis much easier and faster for machine learning algorithms, which no longer have extraneous variables to process.
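As a sketch of this accuracy-for-simplicity trade, assuming scikit-learn, PCA can be asked to keep just enough components to retain a chosen fraction of the variance; the digits dataset and the 95% threshold are illustrative.

```python
# Trade a little accuracy for simplicity: keep 95% of the variance.
from sklearn.datasets import load_digits
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)        # 64 features per image
X_std = StandardScaler().fit_transform(X)

pca = PCA(n_components=0.95)               # a fraction means "variance to keep"
X_small = pca.fit_transform(X_std)
print(X.shape[1], "->", X_small.shape[1], "features")
print("variance kept:", pca.explained_variance_ratio_.sum())
```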
Steps for PCA
• Principal component analysis can be broken down into five steps (a from-scratch sketch follows this list):
 Standardize the range of the continuous initial variables.
 Compute the covariance matrix to identify correlations.
 Compute the eigenvectors and eigenvalues of the covariance matrix to identify the principal components.
 Create a feature vector to decide which principal components to keep.
 Recast the data along the principal component axes.
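A minimal from-scratch sketch of the five steps, assuming only NumPy; the random data matrix X and the choice of k = 2 kept components are illustrative.

```python
# Five-step PCA sketch; X is any (n_samples, n_features) array.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                  # illustrative data

# 1) Standardize each feature to mean 0, variance 1.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2) Covariance matrix of the standardized features.
cov = np.cov(X_std, rowvar=False)

# 3) Eigenvectors and eigenvalues of the symmetric covariance matrix.
eigvals, eigvecs = np.linalg.eigh(cov)

# 4) Feature vector: sort by eigenvalue (descending), keep the top k columns.
order = np.argsort(eigvals)[::-1]
k = 2
W = eigvecs[:, order[:k]]

# 5) Recast the data along the principal component axes.
X_pca = X_std @ W
print(X_pca.shape)                             # (100, 2)
```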
Applications of PCA in Machine Learning
 PCA is used to visualize multidimensional data.
 It is used to reduce the number of dimensions in healthcare data.
 PCA can help resize an image.
 It can be used in finance to analyze stock data and forecast returns.
 PCA helps to find patterns in high-dimensional datasets.
Advantages of PCA
Dimensionality reduction: By determining the most crucial features
or components, PCA reduces the dimensionality of the data, which is
one of its primary benefits. This can be helpful when the initial data
contains a lot of variables and is therefore challenging to visualize or
analyze.
Feature Extraction: PCA can also be used to derive new features or
elements from the original data that might be more insightful or
understandable than the original features. This is particularly helpful
when the initial features are correlated or noisy.
Cont..
Data visualization: by projecting the data onto the first few principal components, PCA can be used to visualize high-dimensional data in two or three dimensions (see the sketch after this list).
Noise reduction: by locating the underlying signal or pattern in the data, PCA can also be used to lessen the impact of noise or measurement errors.
Multicollinearity: when two or more variables are strongly correlated, the data exhibit multicollinearity; because principal components are uncorrelated by construction, PCA handles this well.
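A minimal visualization sketch, assuming scikit-learn and matplotlib; the digits dataset is an illustrative choice.

```python
# Visualize 64-dimensional digits in 2-D via the first two principal components.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, y = load_digits(return_X_y=True)
X_2d = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap="tab10", s=8)
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.title("Digits projected onto the first two principal components")
plt.show()
```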
Disadvantages of PCA
Interpretability: Although principal component analysis (PCA) is
effective at reducing the dimensionality of data and spotting patterns,
the resulting principal components are not always simple to
understand or describe in terms of the original features.
Information loss: PCA reduces the dimensionality of the data by keeping only a subset of the most important components; whatever variance the discarded components carried is lost.
Outliers: Because PCA is susceptible to anomalies in the data, the
resulting principal components may be significantly impacted. The
covariance matrix can be distorted by outliers, which can make it
harder to identify the most crucial characteristics.
Cont..
Scaling: PCA assumes that the data are centered and scaled, which can be a drawback in some circumstances.
Computational complexity: for big datasets, it may be costly to compute the eigenvectors and eigenvalues of the covariance matrix.
How Does Principal Component Analysis Work?
1. Normalize the data: standardize the data before performing PCA. This ensures that each feature has mean = 0 and variance = 1.
2. Build the covariance matrix: construct a square matrix to express the correlation between two or more features in a multidimensional dataset.
3. Find the eigenvectors and eigenvalues: calculate the eigenvectors (unit vectors) and eigenvalues of the covariance matrix. An eigenvalue is the scalar by which the matrix stretches its eigenvector (Cv = λv).
4. Sort the eigenvectors from highest to lowest eigenvalue and select the number of principal components (a pipeline sketch follows).
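These four steps can be expressed compactly with scikit-learn, which returns the components already sorted by eigenvalue; the wine dataset and three kept components are illustrative.

```python
# The same pipeline with scikit-learn: scaling plus PCA in one chain.
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

X, _ = load_wine(return_X_y=True)

pipe = make_pipeline(StandardScaler(), PCA(n_components=3))
X_pca = pipe.fit_transform(X)

pca = pipe.named_steps["pca"]
print(pca.explained_variance_)            # eigenvalues, largest first
print(pca.explained_variance_ratio_)      # share of total variance per component
```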
Example and solution
[The original slides work through a numerical PCA example step by step; those slides are images and their content is not recoverable from this extract.]
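As a stand-in for the missing worked example, here is a small hand-computable illustration with NumPy; the two-feature, four-sample data are invented for demonstration and are not the slides' original numbers.

```python
# Illustrative worked example: PCA on a tiny invented dataset.
import numpy as np

X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2]])                 # two correlated features

X_c = X - X.mean(axis=0)                   # step 1: center the data
cov = np.cov(X_c, rowvar=False)            # step 2: covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)     # step 3: eigen-decomposition

order = np.argsort(eigvals)[::-1]          # step 4: sort, keep the top PC
w1 = eigvecs[:, order[0]]

scores = X_c @ w1                          # step 5: project onto PC1
print("eigenvalues:", eigvals[order])
print("PC1 scores:", scores)
```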