
Dimensionality Reduction using Principal Component Analysis


Topics to be covered:
• Introduction to PCA
• Basics of statistics
• PCA algorithm
• Application of PCA in Face Recognition
• Limitations of PCA
Data Reduction Using PCA

Reduce space dimensionality with minimum loss of descriptive information.

Data Reduction: Example of an Ideal Case

The projection reduces the number of dimensions without much loss of information.
Principal Component Analysis
• Find an orthogonal coordinate system such that the greatest variance by any projection of the data comes to lie on the first coordinate (the first principal component).
Basics
• The standard deviation (SD) of a data set is a measure of how spread out the data is.
• Variance is another measure of the spread of data in a data set.
• Covariance says how much two dimensions vary from the mean with respect to each other.
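A minimal sketch of these quantities with NumPy (an assumption; the slides name no library), on a toy one-dimensional sample:

import numpy as np

data = np.array([1.0, 2.0, 4.0, 5.0])

mean = data.mean()                                   # sample mean
var = ((data - mean) ** 2).sum() / (len(data) - 1)   # sample variance
sd = np.sqrt(var)                                    # standard deviation

print(mean, var, sd)                                 # 3.0  3.33...  1.82...
# np.var(data, ddof=1) and np.std(data, ddof=1) give the same results.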
Basics
• Covariance is always measured between two dimensions.
• If the covariance is positive, both dimensions increase together.
• If it is negative, when one dimension increases the other decreases.
• If it is zero, the two dimensions are uncorrelated (no linear relationship).
Basics
• The covariance matrix: if we have a data set with more than two dimensions, there is more than one covariance that can be calculated. The covariance matrix collects the covariances between all pairs of dimensions.
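A short NumPy sketch of the covariance matrix for a three-dimensional data set (the numbers are illustrative only):

import numpy as np

# 5 samples, 3 dimensions (rows = samples, columns = dimensions)
X = np.array([[2.5, 2.4, 0.5],
              [0.5, 0.7, 2.1],
              [2.2, 2.9, 0.3],
              [1.9, 2.2, 1.0],
              [3.1, 3.0, 0.1]])

C = np.cov(X, rowvar=False)   # 3 x 3 covariance matrix
print(C)
# C[i, j] > 0: dimensions i and j increase together
# C[i, j] < 0: one increases while the other decreases
# C[i, j] = 0: the two dimensions are uncorrelated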
Eigenvalues and Eigenvectors
• If A is a linear transformation, a non-null vector x is an eigenvector of A if there is a scalar λ such that Ax = λx.
• The scalar λ is said to be an eigenvalue of A corresponding to the eigenvector x.
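A quick NumPy sketch verifying Ax = λx on a small symmetric matrix (illustrative values):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are eigenvectors

x = eigvecs[:, 0]                     # first eigenvector
lam = eigvals[0]                      # its eigenvalue

print(np.allclose(A @ x, lam * x))    # True: Ax = λx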
Physical Significance
• A transformation matrix acts on certain vectors by changing only their magnitude and leaving their direction unchanged. These vectors are its eigenvectors.
• The matrix acts on an eigenvector by multiplying its magnitude by a factor (positive or negative). This value is the eigenvalue associated with that eigenvector.
Eigenvalues and Eigenvectors
[Figure: the picture is deformed in such a way that its central vertical axis (red vector) has not changed direction, but the diagonal vector (blue) has changed direction. Hence the red vector is an eigenvector of the transformation and the blue vector is not.]

Each eigenvalue represents the total variance along its eigenvector's direction.

Throwing away the least significant eigenvectors means throwing away only the least significant variance information (the smallest eigenvalues)!
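For example (illustrative eigenvalues), the fraction of variance kept after discarding the smallest eigenvalues is the ratio of the retained eigenvalues to their total:

import numpy as np

eigvals = np.array([4.2, 1.5, 0.2, 0.1])   # sorted, largest first
kept = eigvals[:2]                          # keep the 2 most significant
print(kept.sum() / eigvals.sum())           # 0.95: 95% of the variance kept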
Principal Component Analysis
− PCA projects the data along the directions where the data varies the most.
− These directions are determined by the eigenvectors of the covariance matrix corresponding to the largest eigenvalues.
− The magnitude of the eigenvalues corresponds to the variance of the data along the eigenvector directions.
PCA-Steps
• Calculate the mean sample.
• Subtract it from the samples (centering the data).
• Calculate the covariance matrix.
• Find the set of eigenvectors of the covariance matrix.
• Keep the eigenvectors corresponding to the "largest" eigenvalues, also called the "principal components" (a sketch of these steps follows below).
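A minimal NumPy sketch of the steps above (the function name and the rows-as-samples layout are assumptions, not from the slides):

import numpy as np

def pca(X, k):
    """Project X (rows = samples) onto its k principal components."""
    mean = X.mean(axis=0)                 # 1. mean sample
    Xc = X - mean                         # 2. subtract the mean
    C = np.cov(Xc, rowvar=False)          # 3. covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # 4. eigenvectors (C is symmetric)
    order = np.argsort(eigvals)[::-1]     #    sort by decreasing eigenvalue
    W = eigvecs[:, order[:k]]             # 5. k principal components
    return Xc @ W, W, mean                # projected data, basis, mean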
Dimensionality reduction

The goal of PCA is to reduce the dimensionality of the data while retaining as much as possible of the variation present in the original dataset.
Principal Component Analysis (PCA)
• Lower-dimensionality basis
− Approximate vectors by finding a basis in an appropriate lower-dimensional space.

(1) Higher-dimensional space representation: x = a1 v1 + a2 v2 + … + aN vN, where v1, …, vN is a basis of the original N-dimensional space.

(2) Lower-dimensional space representation: x̂ = b1 u1 + b2 u2 + … + bK uK, where u1, …, uK is a basis of a K-dimensional subspace, K < N.
Principal Component Analysis (PCA)
• Information loss
− Dimensionality reduction implies information loss!
− We want to preserve as much information as possible, that is, to minimize the reconstruction error ||x − x̂||.
• How to determine the best lower-dimensional subspace?
− The "best" low-dimensional space is spanned by the best eigenvectors of the covariance matrix of x, i.e. the eigenvectors corresponding to the "largest" eigenvalues, also called the "principal components".
Principal Component Analysis (PCA)
Steps:
− Suppose x1, x2, …, xM are N × 1 vectors; the mean, centering, covariance, and eigen-decomposition steps then proceed as listed earlier.
• What is the error due to dimensionality reduction?
− original vector x can be reconstructed using its principal components:

− It can be shown that the low-dimensional basis based on principal


components minimizes the reconstruction error:

− It can be shown that the error is equal to:

19
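A small numerical check of that identity, sketched with NumPy on synthetic data (all names, the seed, and the sizes are illustrative):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
Xc = X - X.mean(axis=0)                     # centered data

C = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]           # largest eigenvalues first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

K = 2
W = eigvecs[:, :K]
X_hat = (Xc @ W) @ W.T                      # reconstruct from K components

err = ((Xc - X_hat) ** 2).sum(axis=1).mean()
print(err, eigvals[K:].sum())               # nearly identical; they differ only
                                            # by the N vs N-1 normalization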
Face Detection
Samples at different orientations, under different illuminations, and with different expressions.
Face Recognition using PCA
• Acquire an initial set of face images (the training set).
• Calculate the eigenfaces from the training set, keeping only the M eigenfaces corresponding to the highest eigenvalues. These M images define the face space.
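A hedged NumPy sketch of this training step (the function and variable names are illustrative; the slides give no code). It uses the standard eigenfaces shortcut of diagonalizing the small M × M matrix A Aᵀ instead of the huge pixel-space covariance matrix:

import numpy as np

def train_eigenfaces(faces, num_eigenfaces):
    """faces: (num_images, num_pixels) array of flattened face images."""
    mean_face = faces.mean(axis=0)
    A = faces - mean_face                       # centered images, M x N
    # Eigenvectors of the small M x M matrix A A^T map to eigenvectors
    # of the large N x N covariance matrix A^T A (Turk & Pentland).
    eigvals, V = np.linalg.eigh(A @ A.T)
    order = np.argsort(eigvals)[::-1][:num_eigenfaces]
    eigenfaces = (A.T @ V[:, order]).T          # back to pixel space, one per row
    eigenfaces /= np.linalg.norm(eigenfaces, axis=1, keepdims=True)
    return mean_face, eigenfaces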
Recognition
• Calculate a set of weights based on the input image and the M eigenfaces by projecting the input image onto each of the eigenfaces.
• Determine whether the image is a face at all by checking whether it lies close to 'face space'.
• If it is a face, classify the weight pattern as a known or unknown person.
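A sketch of these recognition steps, reusing mean_face and eigenfaces from the training sketch above (the thresholds and names are illustrative assumptions):

import numpy as np

def recognize(image, mean_face, eigenfaces, known_weights, labels,
              face_threshold=1e4, identity_threshold=1e3):
    w = eigenfaces @ (image - mean_face)             # weights: projection
    reconstruction = mean_face + eigenfaces.T @ w    # back-projection
    if np.linalg.norm(image - reconstruction) > face_threshold:
        return None                                  # not close to face space
    distances = np.linalg.norm(known_weights - w, axis=1)
    best = int(np.argmin(distances))
    if distances[best] > identity_threshold:
        return "unknown face"
    return labels[best]                              # known person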
Principal Component Analysis (PCA)
• The linear transformation R^N → R^K that performs the dimensionality reduction is y = Uᵀ(x − x̄), where the columns of U are the K chosen eigenvectors.
• How to choose the principal components? A common criterion is to choose the smallest K that retains a given fraction of the total variance, e.g. (λ1 + … + λK) / (λ1 + … + λN) > 0.9.
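That criterion, sketched in NumPy (the 0.9 threshold is an illustrative choice):

import numpy as np

def choose_k(eigvals, threshold=0.9):
    """eigvals sorted in decreasing order; return the smallest K whose
    leading components retain at least `threshold` of the total variance."""
    ratios = np.cumsum(eigvals) / np.sum(eigvals)
    return int(np.searchsorted(ratios, threshold)) + 1

print(choose_k(np.array([4.2, 1.5, 0.2, 0.1])))   # 2 (keeps 0.95 of the variance)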
Principal Component Analysis (PCA)
• Representing faces in this basis: each face is described by its vector of weights (b1, …, bK), i.e. its projection onto the eigenfaces.
Limitations of PCA
• PCA is a linear method. It fails when the largest variance does not lie along a single vector but along a non-linear path.
• Traditional (static) PCA assumes that the monitored process is stationary. Many industrial processes do not display stationary behavior because the operational conditions change.
• When PCA is used for clustering, its main limitation is that it does not account for class separability, since it makes no use of the class labels of the feature vectors.
Thank You
