Lecture 07: Dimensionality Reduction – PCA

The document discusses dimensionality reduction through principal component analysis (PCA). PCA finds a new feature space of reduced dimensionality m (< original dimensionality d) that adequately describes the original feature space using a set of orthonormal basis vectors. It quantifies the error introduced by reducing dimensions and minimizes this error by choosing the basis vectors to be the eigenvectors of the covariance matrix; the resulting expansion is known as the Karhunen-Loeve expansion. The leading principal components correspond to the eigenvectors with the largest eigenvalues and capture the most variance in the original data.


ECE 471/571 – Lecture 7
Dimensionality Reduction – Principal Component Analysis
ECE471/571, Hairong Qi
Different Approaches – More Detail

Pattern Classification
    Statistical Approach
        Supervised (basic concepts):
            Bayesian decision rule (MPP, LR, Discri.)
            Parametric learning (ML, BL)
            Non-parametric learning (kNN)
            NN (Perceptron, BP)
        Unsupervised (basic concepts):
            Distance
            Agglomerative method
            k-means
            Winner-take-all
            Kohonen maps
    Syntactic Approach

Dimensionality Reduction: Fisher's linear discriminant, K-L transform (PCA)
Performance Evaluation: ROC curve; TP, TN, FN, FP
Stochastic Methods: local optimization (GD), global optimization (SA, GA)
Principal Component Analysis or K-L Transform

How to find a new feature space (m-dimensional) that is adequate to describe the original feature space (d-dimensional), where m < d?

[Figure: 2-D data cloud shown in the original axes x1, x2 and in the rotated principal axes y1, y2]
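As a quick numerical illustration of this picture (not part of the original slides; the data and the 45-degree rotation are assumed for the example), correlated 2-D data can be re-expressed in rotated coordinates (y1, y2) so that most of the variance falls along y1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated 2-D data: x2 is roughly x1 plus a little noise,
# so the cloud is elongated along the 45-degree direction.
x1 = rng.normal(0.0, 1.0, 1000)
x2 = x1 + rng.normal(0.0, 0.3, 1000)
X = np.column_stack([x1, x2])            # shape (n, 2), features x1, x2

# Rotate into a new coordinate system (y1, y2) aligned with the
# elongated direction (a 45-degree rotation, assumed for this example).
theta = np.pi / 4
R = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
Y = X @ R.T                              # coordinates in the new axes

print(np.var(X, axis=0))   # variance spread over both x1 and x2
print(np.var(Y, axis=0))   # almost all variance concentrated in y1
```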
K-L Transform (1)

Describe vector x in terms of a set of basis vectors b_i:

x = \sum_{i=1}^{d} y_i b_i, \qquad y_i = b_i^T x

The basis vectors b_i should be linearly independent and orthonormal, that is,

b_i^T b_j = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}
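A minimal numpy sketch of this expansion, assuming an arbitrary orthonormal basis obtained from a QR factorization (not yet the eigenvector basis); variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
x = rng.normal(size=d)                         # original feature vector, d-dimensional

# Any orthonormal basis b_1, ..., b_d gives an exact expansion;
# here the Q factor of a random matrix serves as an example basis.
B, _ = np.linalg.qr(rng.normal(size=(d, d)))   # columns are the basis vectors b_i

y = B.T @ x                                    # coefficients y_i = b_i^T x
x_reconstructed = B @ y                        # x = sum_i y_i b_i

print(np.allclose(x, x_reconstructed))         # True: the expansion is exact
print(np.allclose(B.T @ B, np.eye(d)))         # True: b_i^T b_j = delta_ij
```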
K-L Transform (2)

Suppose we wish to ignore all but m (m < d) components of y and still represent x, although with some error. We will thus calculate the first m elements of y and replace the others with constants:

x = \sum_{i=1}^{m} y_i b_i + \sum_{i=m+1}^{d} y_i b_i \approx \sum_{i=1}^{m} y_i b_i + \sum_{i=m+1}^{d} \alpha_i b_i

Error:  \Delta x = \sum_{i=m+1}^{d} (y_i - \alpha_i) b_i
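A sketch of this truncation under the same assumed setup: keep the first m coefficients y_i, replace the remaining ones with constants a_i (here their sample means), and form the error vector Δx:

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, n = 4, 2, 1000
X = rng.normal(size=(n, d))                     # n sample vectors

B, _ = np.linalg.qr(rng.normal(size=(d, d)))    # some orthonormal basis (columns b_i)
Y = X @ B                                       # coefficients y_i = b_i^T x for every sample

a = Y[:, m:].mean(axis=0)                       # constants replacing the dropped coefficients
Y_trunc = np.hstack([Y[:, :m], np.tile(a, (n, 1))])
X_approx = Y_trunc @ B.T                        # approximate reconstruction

delta = X - X_approx                            # error  sum_{i>m} (y_i - a_i) b_i
print(delta.shape, np.mean(np.sum(delta**2, axis=1)))   # mean-square error
```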
K-L Transform (3)

Use the mean-square error to quantify the error:

\varepsilon^2(m) = E\left\{ \sum_{i=m+1}^{d} \sum_{j=m+1}^{d} (y_i - \alpha_i)\, b_i^T (y_j - \alpha_j)\, b_j \right\}
              = E\left\{ \sum_{i=m+1}^{d} \sum_{j=m+1}^{d} (y_i - \alpha_i)(y_j - \alpha_j)\, b_i^T b_j \right\}
              = \sum_{i=m+1}^{d} E\left\{ (y_i - \alpha_i)^2 \right\}
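Because the b_i are orthonormal, the cross terms with i ≠ j vanish, which is why the double sum collapses to a single sum of per-component terms. A quick numerical check of that identity under an assumed synthetic setup:

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, n = 4, 2, 100000
X = rng.normal(size=(n, d))
B, _ = np.linalg.qr(rng.normal(size=(d, d)))    # orthonormal basis (columns b_i)
Y = X @ B
a = Y[:, m:].mean(axis=0)

# Left side: mean squared norm of the full error vector delta_x.
delta = (Y[:, m:] - a) @ B[:, m:].T
mse_full = np.mean(np.sum(delta**2, axis=1))

# Right side: sum over the dropped components of E{(y_i - a_i)^2}.
mse_per_component = np.sum(np.mean((Y[:, m:] - a)**2, axis=0))

print(np.isclose(mse_full, mse_per_component))  # True: cross terms vanish
```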
K-L Transform (4)

Find the optimal \alpha_i to minimize \varepsilon^2:

\frac{\partial \varepsilon^2}{\partial \alpha_i} = -2\,(E\{y_i\} - \alpha_i) = 0 \quad\Rightarrow\quad \alpha_i = E\{y_i\}

Therefore, the error is now equal to

\varepsilon^2(m) = \sum_{i=m+1}^{d} E\left\{ (y_i - E\{y_i\})^2 \right\}
              = \sum_{i=m+1}^{d} E\left\{ (b_i^T x - E\{b_i^T x\})(x^T b_i - E\{x^T b_i\}) \right\}
              = \sum_{i=m+1}^{d} b_i^T E\left\{ (x - E\{x\})(x - E\{x\})^T \right\} b_i
              = \sum_{i=m+1}^{d} b_i^T \Sigma_x b_i
              = \sum_{i=m+1}^{d} \lambda_i

(The last step assumes the b_i are the eigenvectors of the covariance matrix \Sigma_x, with corresponding eigenvalues \lambda_i; this choice is justified on the next slide.)
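A numerical check of this result on synthetic data (a sketch; the data and variable names are assumptions): with a_i = E{y_i} and the b_i taken as eigenvectors of the sample covariance matrix, the mean-square error matches the sum of the discarded eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)
d, m, n = 5, 2, 200000

# Synthetic data with a non-trivial covariance structure.
A = rng.normal(size=(d, d))
X = rng.normal(size=(n, d)) @ A

S = np.cov(X, rowvar=False)                 # sample covariance matrix Sigma_x
eigvals, eigvecs = np.linalg.eigh(S)        # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]           # sort descending: lambda_1 >= ... >= lambda_d
eigvals, B = eigvals[order], eigvecs[:, order]

Y = X @ B                                   # y_i = b_i^T x
a = Y.mean(axis=0)                          # optimal constants a_i = E{y_i}

delta = (Y[:, m:] - a[m:]) @ B[:, m:].T     # error from dropping components m+1..d
mse = np.mean(np.sum(delta**2, axis=1))

print(mse, eigvals[m:].sum())               # approximately equal
```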
K-L Transform (5)

The optimal choice of basis vectors is the eigenvectors of \Sigma_x.

The expansion of a random vector in terms of the eigenvectors of the covariance matrix is referred to as the Karhunen-Loeve expansion, or the "K-L expansion".

Without loss of generality, we sort the eigenvectors b_i by their eigenvalues, that is, \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d. Then we refer to b_1, corresponding to \lambda_1, as the "major eigenvector", or "principal component".
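Putting the pieces together, the following is a minimal PCA sketch (synthetic data and function names are assumptions, not from the slides): center the data, form the covariance matrix, keep the eigenvectors with the largest eigenvalues, and project:

```python
import numpy as np

def pca(X, m):
    """Project the rows of X onto the m principal components
    (eigenvectors of the covariance matrix with the largest eigenvalues)."""
    mu = X.mean(axis=0)
    Xc = X - mu                                  # center the data
    S = np.cov(Xc, rowvar=False)                 # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)         # symmetric matrix, ascending order
    order = np.argsort(eigvals)[::-1]            # lambda_1 >= lambda_2 >= ... >= lambda_d
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Y = Xc @ eigvecs[:, :m]                      # new m-dimensional features
    return Y, eigvecs[:, :m], eigvals

# Example usage with synthetic data.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))
Y, components, eigvals = pca(X, m=3)
print(Y.shape)                                   # (500, 3)
print(eigvals[:3] / eigvals.sum())               # variance captured by each component
```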
Summary

Raw data → covariance matrix → eigenvalues → eigenvectors → principal components
How to use the error rate?
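One common way to use this error measure (an interpretation, not stated explicitly on the slide): since the mean-square error of keeping m components equals the sum of the discarded eigenvalues, m can be chosen as the smallest value whose retained eigenvalue fraction reaches a target, e.g. 95%. A sketch with an assumed helper name choose_m:

```python
import numpy as np

def choose_m(eigvals, retained_fraction=0.95):
    """Smallest m such that the top-m eigenvalues capture the desired
    fraction of the total variance (equivalently, the truncation error
    sum_{i>m} lambda_i is below (1 - fraction) of the total)."""
    eigvals = np.sort(eigvals)[::-1]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(cumulative, retained_fraction) + 1)

# Example: the four largest eigenvalues retain 9.6 / 10 = 96% of the variance.
print(choose_m(np.array([5.0, 2.5, 1.5, 0.6, 0.3, 0.1])))   # prints 4
```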
