Machine Learning (CSO851) - Lecture 03

The document discusses Principal Component Analysis (PCA) as a method for dimensionality reduction, emphasizing its role in simplifying high-dimensional datasets by extracting essential features while preserving information. It outlines key aspects of PCA, including its applications in data exploration, clustering, and classification, as well as its advantages and limitations. The document also details the steps involved in PCA, including calculating eigenvalues and eigenvectors to identify principal components that maximize variance in the data.


Dimensionality Reduction with PCA

Machine Learning (CSO851)


Acknowledgement
Duda, Hart et al.
Key Aspects of PCA
• Dimensionality Reduction: PCA helps manage high-dimensional datasets by extracting essential information and discarding less relevant features, simplifying analysis.
• Data Exploration and Visualization: It plays a significant role in data exploration and visualization, aiding in uncovering hidden patterns and insights.
• Linear Transformation: PCA performs a linear transformation of the data, seeking directions of maximum variance.
• Feature Selection: Principal components are ranked by the variance they explain, allowing for effective feature selection.
• Data Compression: PCA can compress data while preserving most of the original information.
Key Aspects of PCA
• Clustering and Classification: It finds applications in clustering and classification tasks by reducing noise and highlighting underlying structure.
• Advantages: PCA offers linearity, computational efficiency, and scalability for large datasets.
• Limitations: It assumes data normality and linearity and may lead to information loss.
• Matrix Requirements: PCA works with symmetric correlation or covariance matrices and requires numeric, standardized data.
• Eigenvalues and Eigenvectors: Eigenvalues represent variance magnitude, and eigenvectors indicate variance direction.
• Number of Components: The number of principal components chosen determines the number of eigenvectors computed.
Curse of Dimensionality

• Increasing the number of features will not always improve classification accuracy.

• In practice, the inclusion of more features might actually lead to worse performance.

• The number of training examples required increases exponentially with the dimensionality d (i.e., k^d, where k is the number of bins per feature).

[Figure: partitioning the feature space with k=3 bins per feature requires 3 bins in 1D, 3^2 bins in 2D, and 3^3 bins in 3D.]
Dimensionality Reduction
• What is the objective?
  − Choose an optimum set of features of lower dimensionality to improve classification accuracy.

• Different methods can be used to reduce dimensionality:
  − Feature extraction
  − Feature selection
Dimensionality Reduction (cont’d)

Feature extraction: finds a set of new features (i.e., through some mapping f()) from the existing features. The mapping f() could be linear or non-linear.

Feature selection: chooses a subset of the original features.

Feature extraction:  x = [x1, x2, ..., xN]^T  →  y = f(x) = [y1, y2, ..., yK]^T,   K << N
Feature selection:   x = [x1, x2, ..., xN]^T  →  y = [xi1, xi2, ..., xiK]^T,       K << N
Feature Extraction
• Linear combinations are particularly attractive because they are simpler to compute and analytically tractable.

• Given x ∈ R^N, find a K x N matrix T such that:

  y = Tx ∈ R^K,   where K << N

  This is a projection from the N-dimensional space to a K-dimensional space:

  x = [x1, x2, ..., xN]^T  →  y = f(x) = Tx = [y1, y2, ..., yK]^T
Feature Extraction (cont’d)

• From a mathematical point of view, finding an optimum mapping y = f(x) is equivalent to optimizing an objective criterion.

• Different methods use different objective criteria, e.g.,

  − Minimize Information Loss: represent the data as accurately as possible in the lower-dimensional space.

  − Maximize Discriminatory Information: enhance the class-discriminatory information in the lower-dimensional space.
Feature Extraction (cont’d)
• Popular linear feature extraction methods:
  − Principal Components Analysis (PCA): Seeks a projection that preserves as much information in the data as possible.
  − Linear Discriminant Analysis (LDA): Seeks a projection that best discriminates the data.

• Many other methods:
  − Making features as independent as possible (Independent Component Analysis or ICA).
  − Retaining interesting directions (Projection Pursuit).
  − Embedding to lower dimensional manifolds (Isomap, Locally Linear Embedding or LLE).
Vector Representation
• A vector x ∈ R^N can be represented by N components:

  x = [x1, x2, ..., xN]^T

• Assuming the standard basis <v1, v2, ..., vN> (i.e., unit vectors in each dimension), xi can be obtained by projecting x along the direction of vi:

  xi = (x^T vi) / (vi^T vi) = x^T vi

• x can be "reconstructed" from its projections as follows:

  x = Σ_{i=1}^{N} xi vi = x1 v1 + x2 v2 + ... + xN vN

• Since the basis vectors are the same for all x ∈ R^N (standard basis), we typically represent x simply by its N components.
Vector Representation (cont’d)

• Example assuming N=2:   x = [x1, x2]^T = [3, 4]^T

• Assuming the standard basis <v1 = i, v2 = j>, xi can be obtained by projecting x along the direction of vi:

  x1 = x^T i = [3 4][1 0]^T = 3
  x2 = x^T j = [3 4][0 1]^T = 4

• x can be "reconstructed" from its projections as follows:

  x = 3i + 4j
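A tiny NumPy sketch of this projection/reconstruction (purely illustrative):

```python
import numpy as np

x = np.array([3.0, 4.0])
i, j = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # standard basis vectors

x1, x2 = x @ i, x @ j          # projections x^T v_i: 3 and 4
x_rec = x1 * i + x2 * j        # reconstruction: x = 3i + 4j
assert np.allclose(x_rec, x)
```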
Principal Component Analysis (PCA)
• The main objective of PCA is to identify patterns that allow the dimensionality of the dataset to be reduced with minimal loss of information.
• PCA projects the feature space onto a smaller subspace that still represents the data well.
• In PCA, we are interested in finding the directions (components) that maximize the variance in the dataset, i.e., the directions along which the data is most spread out.
• Principal components correspond to the eigenvectors of the data's covariance matrix; each eigenvector has an associated eigenvalue.
• Eigenvalues denote the magnitude (variance) associated with each eigenvector direction.
• When the eigenvalues all have similar magnitudes, the data is said to already lie in a "good" subspace.
Principal Component Analysis (PCA)

• If x ∈ R^N, then it can be written as a linear combination of an orthonormal set of N basis vectors <v1, v2, ..., vN> in R^N (e.g., using the standard basis):

  x = Σ_{i=1}^{N} xi vi = x1 v1 + x2 v2 + ... + xN vN,   where vi^T vj = 1 if i = j, 0 otherwise

  and xi = (x^T vi) / (vi^T vi) = x^T vi

• PCA seeks to approximate x in a subspace of R^N using a new set of K << N basis vectors <u1, u2, ..., uK> in R^N:

  x̂ = Σ_{i=1}^{K} yi ui = y1 u1 + y2 u2 + ... + yK uK   (reconstruction)

  where yi = (x^T ui) / (ui^T ui) = x^T ui

  such that ||x − x̂|| is minimized (i.e., minimize information loss).
Principal Component Analysis (PCA)

• The "optimal" set of basis vectors <u1, u2, ..., uK> can be found as follows (we will see why):

  (1) Find the eigenvectors ui of the covariance matrix Σx of the (training) data:

      Σx ui = λi ui

  (2) Choose the K "largest" eigenvectors ui (i.e., those corresponding to the K largest eigenvalues λi).

  <u1, u2, ..., uK> correspond to the "optimal" basis!

  We refer to the "largest" eigenvectors ui as principal components.
PCA - Steps
• Suppose we are given M vectors x1, x2, ..., xM, each of size N x 1
  (N: # of features, M: # of data points)

Step 1: compute the sample mean

  x̄ = (1/M) Σ_{i=1}^{M} xi

Step 2: subtract the sample mean (i.e., center the data at zero)

  Φi = xi − x̄

Step 3: compute the sample covariance matrix Σx

  Σx = (1/M) Σ_{i=1}^{M} (xi − x̄)(xi − x̄)^T = (1/M) Σ_{i=1}^{M} Φi Φi^T = (1/M) A A^T

  where A = [Φ1 Φ2 ... ΦM] is an N x M matrix, i.e., the columns of A are the Φi
PCA - Steps
Step 4: compute the eigenvalues/eigenvectors of Σx

  Σx ui = λi ui,   where we assume λ1 ≥ λ2 ≥ ... ≥ λN

  Note: most software packages return the eigenvalues (and corresponding eigenvectors) in decreasing order; if not, you can explicitly sort them.

Since Σx is symmetric, <u1, u2, ..., uN> form an orthogonal basis in R^N, and we can represent any x ∈ R^N as:

  x − x̄ = Σ_{i=1}^{N} yi ui = y1 u1 + y2 u2 + ... + yN uN

  where yi = ((x − x̄)^T ui) / (ui^T ui) = (x − x̄)^T ui   if ||ui|| = 1

  i.e., this is just a "change" of basis: [x1, ..., xN]^T → [y1, ..., yN]^T

  Note: most software packages normalize ui to unit length to simplify calculations; if not, you can explicitly normalize them.
PCA - Steps
Step 5: dimensionality reduction step – approximate x using only the first K eigenvectors (K << N), i.e., those corresponding to the K largest eigenvalues, where K is a parameter:

  x − x̄ = Σ_{i=1}^{N} yi ui = y1 u1 + y2 u2 + ... + yN uN

  approximate x by x̂ using the first K eigenvectors only:

  x̂ − x̄ = Σ_{i=1}^{K} yi ui = y1 u1 + y2 u2 + ... + yK uK   (reconstruction)

  i.e., x = [x1, ..., xN]^T is represented by [y1, ..., yK]^T

  Note that if K = N, then x̂ = x (i.e., zero reconstruction error).
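The slides describe the procedure abstractly; the following is a minimal NumPy sketch of Steps 1–5 (not the lecture's code; names such as `pca` are illustrative, and `np.linalg.eigh` is used because Σx is symmetric):

```python
import numpy as np

def pca(X, K):
    """PCA via eigen-decomposition of the sample covariance matrix.
    X: M x N data matrix (rows are samples), K: number of components to keep."""
    x_bar = X.mean(axis=0)                 # Step 1: sample mean
    Phi = X - x_bar                        # Step 2: center the data
    Sigma_x = (Phi.T @ Phi) / X.shape[0]   # Step 3: covariance, (1/M) A A^T (N x N)
    lam, U = np.linalg.eigh(Sigma_x)       # Step 4: eigenvalues/eigenvectors (ascending)
    order = np.argsort(lam)[::-1]          # sort in decreasing order of eigenvalue
    lam, U = lam[order], U[:, order]
    U_K = U[:, :K]                         # keep the K "largest" eigenvectors
    Y = Phi @ U_K                          # Step 5: coefficients y_i = (x - x_bar)^T u_i
    return Y, U_K, lam, x_bar

# Reconstruction: x_hat = x_bar + U_K y, i.e. X_hat = x_bar + Y @ U_K.T
```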
What is the Linear Transformation implied by PCA?
• The linear transformation y = Tx which performs the dimensionality reduction in PCA is:

  x̂ − x̄ = Σ_{i=1}^{K} yi ui = y1 u1 + y2 u2 + ... + yK uK = U y

  where U = [u1 u2 ... uK] is an N x K matrix, i.e., the columns of U are the first K eigenvectors of Σx.

  Equivalently, y = U^T (x − x̄), so T = U^T is a K x N matrix, i.e., the rows of T are the first K eigenvectors of Σx.
What is the form of Σy ?
  Σx = (1/M) Σ_{i=1}^{M} (xi − x̄)(xi − x̄)^T = (1/M) Σ_{i=1}^{M} Φi Φi^T

Using diagonalization:   Σx = P Λ P^T
  (the columns of P are the eigenvectors of Σx; the diagonal elements of Λ are the eigenvalues of Σx, i.e., the variances)

Taking U = P (i.e., keeping all N eigenvectors), the projected data is yi = U^T (xi − x̄) = P^T Φi, and its mean ȳ is zero, so:

  Σy = (1/M) Σ_{i=1}^{M} (yi − ȳ)(yi − ȳ)^T = (1/M) Σ_{i=1}^{M} yi yi^T = (1/M) Σ_{i=1}^{M} (P^T Φi)(P^T Φi)^T

     = P^T [ (1/M) Σ_{i=1}^{M} Φi Φi^T ] P = P^T Σx P = P^T (P Λ P^T) P = Λ

PCA de-correlates the data and preserves the original variances:   Σy = Λ
Interpretation of PCA
• PCA chooses the eigenvectors of the covariance matrix corresponding to the largest eigenvalues.
• The eigenvalues correspond to the variance of the data along the eigenvector directions.
• Therefore, PCA projects the data along the directions where the data varies most.
• PCA preserves as much information as possible in the data by preserving as much of its variance as possible.

[Figure: u1 is the direction of maximum variance; u2 is orthogonal to u1.]
Example
• Compute the PCA of the following dataset:

  (1,2), (3,3), (3,5), (5,4), (5,6), (6,5), (8,7), (9,8)

• Compute the sample covariance matrix:

  Σ̂ = (1/n) Σ_{k=1}^{n} (xk − μ̂)(xk − μ̂)^T

• The eigenvalues can be computed by finding the roots of the characteristic polynomial:

  det(Σ̂ − λI) = 0
Example (cont’d)
• The eigenvectors are the solutions of the systems:

  Σx ui = λi ui

  Note: if ui is a solution, then c·ui is also a solution for any c ≠ 0.

  Eigenvectors can be normalized to unit length using:

  v̂i = vi / ||vi||
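As a quick check of this example, a small NumPy sketch (illustrative only; it prints the covariance matrix and its eigenpairs rather than quoting specific numbers):

```python
import numpy as np

# The 8 two-dimensional points from the example
X = np.array([(1,2),(3,3),(3,5),(5,4),(5,6),(6,5),(8,7),(9,8)], dtype=float)

mu = X.mean(axis=0)                    # sample mean
Phi = X - mu                           # centered data
Sigma = (Phi.T @ Phi) / X.shape[0]     # (1/n) sum (x - mu)(x - mu)^T

lam, V = np.linalg.eigh(Sigma)         # eigenvalues (ascending) and unit-length eigenvectors
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

print("covariance:\n", Sigma)
print("eigenvalues:", lam)             # variance along each principal direction
print("eigenvectors (columns):\n", V)  # already normalized to unit length by eigh
```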
How do we choose K ?

• K is typically chosen based on how much information (variance) we want to preserve.

• Choose the smallest K that satisfies the following inequality:

  ( Σ_{i=1}^{K} λi ) / ( Σ_{i=1}^{N} λi ) > T,   where T is a threshold (e.g., 0.9)

• If T = 0.9, for example, we "preserve" 90% of the information (variance) in the data.

• If K = N, then we "preserve" 100% of the information in the data (i.e., it is just a "change" of basis and x̂ = x).
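A small helper sketch of this rule (an illustration, not the lecture's code; it assumes the eigenvalues are already sorted in decreasing order):

```python
import numpy as np

def choose_K(eigenvalues, T=0.9):
    """Smallest K whose leading eigenvalues explain more than a fraction T of the total variance.
    Assumes `eigenvalues` is sorted in decreasing order."""
    lam = np.asarray(eigenvalues, dtype=float)
    ratio = np.cumsum(lam) / lam.sum()        # cumulative explained-variance ratio
    return int(np.searchsorted(ratio, T) + 1)

# Example: K = choose_K(lam, T=0.9), with lam from the PCA steps above
```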
Approximation Error

• The approximation error (or reconstruction error) can be computed by:

  ||x − x̂||,   where x̂ = Σ_{i=1}^{K} yi ui + x̄ = y1 u1 + y2 u2 + ... + yK uK + x̄   (reconstruction)

• It can also be shown that the approximation error can be computed as follows:

  ||x − x̂|| = (1/2) Σ_{i=K+1}^{N} λi
Data Normalization

• The principal components are dependent on the units used to measure the original variables as well as on the range of values they assume.

• Data should always be normalized prior to using PCA.

• A common normalization method is to transform all the data to have zero mean and unit standard deviation:

  (xi − μi) / σi,   where μi and σi are the mean and standard deviation of the i-th feature xi
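A minimal sketch of this z-score normalization applied per feature (the small `eps` guard against zero variance is an added assumption):

```python
import numpy as np

def standardize(X, eps=1e-12):
    """Zero-mean, unit-standard-deviation scaling of each feature (column) of X."""
    mu = X.mean(axis=0)                  # per-feature mean
    sigma = X.std(axis=0)                # per-feature standard deviation
    return (X - mu) / (sigma + eps), mu, sigma

# X_norm, mu, sigma = standardize(X)    # then run PCA on X_norm
```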
Application to Images
• The goal is to represent images in a space of lower
dimensionality using PCA.
− Useful for various applications, e.g., face recognition, image
compression, etc.
• Given M images of size N x N, first represent each image
as a 1D vector (i.e., by stacking the rows together).
− Note that for face recognition, faces must be centered and of the
same size.

Application to Images (cont’d)

• The key challenge is that the covariance matrix Σx is now very large (i.e., N^2 x N^2) – see Step 3:

  Step 3: compute the covariance matrix Σx

  Σx = (1/M) Σ_{i=1}^{M} Φi Φi^T = (1/M) A A^T,   where A = [Φ1 Φ2 ... ΦM] is an N^2 x M matrix

• Σx is now an N^2 x N^2 matrix – it is computationally expensive to compute its eigenvalues/eigenvectors λi, ui:

  (A A^T) ui = λi ui
Application to Images (cont’d)
• We will use a simple "trick" to get around this by relating the eigenvalues/eigenvectors of A A^T to those of A^T A.

• Let us consider the matrix A^T A instead (i.e., an M x M matrix), where A = [Φ1 Φ2 ... ΦM] is N^2 x M:

  − Suppose its eigenvalues/eigenvectors are μi, vi:   (A^T A) vi = μi vi
  − Multiply both sides by A:   A (A^T A) vi = A μi vi,   or   (A A^T)(A vi) = μi (A vi)
  − Comparing with (A A^T) ui = λi ui, we get:   λi = μi and ui = A vi
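A NumPy sketch of this trick (illustrative; it assumes the mean-subtracted image vectors Φi are stacked as the columns of A):

```python
import numpy as np

def eigenfaces(A, K):
    """Top-K eigenvectors of (1/M) A A^T obtained via the smaller M x M matrix A^T A.
    A: N^2 x M matrix whose columns are the mean-subtracted image vectors Phi_i."""
    M = A.shape[1]
    mu, V = np.linalg.eigh(A.T @ A / M)    # eigenpairs of the small M x M matrix
    order = np.argsort(mu)[::-1][:K]       # keep the K largest
    mu, V = mu[order], V[:, order]
    U = A @ V                              # map back: u_i = A v_i (same eigenvalues)
    U /= np.linalg.norm(U, axis=0)         # normalize each u_i to unit length (Step 4b)
    return mu, U                           # lambda_i = mu_i, eigenfaces as columns of U
```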
Application to Images (cont’d)

• But do A A^T and A^T A have the same number of eigenvalues/eigenvectors?
  − A A^T can have up to N^2 eigenvalues/eigenvectors.
  − A^T A can have up to M eigenvalues/eigenvectors.
  − It can be shown that the M eigenvalues/eigenvectors of A^T A correspond to the M largest eigenvalues/eigenvectors of A A^T.

• Steps 3-5 of PCA need to be updated as follows:
Application to Images (cont’d)
Step 3: compute A^T A (i.e., instead of A A^T)

Step 4: compute the eigenvalues/eigenvectors μi, vi of A^T A

Step 4b: compute the eigenvalues/eigenvectors λi, ui of A A^T using λi = μi and ui = A vi, then normalize ui to unit length.

Step 5: dimensionality reduction step – approximate x using only the first K eigenvectors (K < M):

  x̂ − x̄ = Σ_{i=1}^{K} yi ui = y1 u1 + y2 u2 + ... + yK uK

  i.e., each image can be represented by a K-dimensional vector [y1, y2, ..., yK]^T
Example

Dataset

Example (cont’d)
Top eigenvectors u1, ..., uK visualized as images ("eigenfaces"): u1, u2, u3

Mean face: x̄
Example (cont’d)

• How can you visualize the eigenvectors (eigenfaces) u1, u2, u3 as an image?

  − Their values must first be mapped to integer values in the interval [0, 255] (required by the PGM format).
  − Suppose fmin and fmax are the min/max values of a given eigenface (could be negative).
  − If x ∈ [fmin, fmax] is the original value, then the new value y ∈ [0, 255] can be computed as follows:

    y = (int) 255 (x − fmin) / (fmax − fmin)
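A minimal sketch of this rescaling (the conversion to `uint8` for display or PGM output is an added assumption):

```python
import numpy as np

def to_uint8_image(eigenface):
    """Linearly map an eigenface's values from [fmin, fmax] to integers in [0, 255]."""
    fmin, fmax = eigenface.min(), eigenface.max()
    y = 255.0 * (eigenface - fmin) / (fmax - fmin)   # y = 255 (x - fmin) / (fmax - fmin)
    return y.astype(np.uint8)                        # truncate to integers for display/PGM
```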
Application to Images (cont’d)
• Interpretation: represent a face in terms of eigenfaces u1, u2, u3, ...

  x̂ = Σ_{i=1}^{K} yi ui + x̄ = y1 u1 + y2 u2 + ... + yK uK + x̄

  i.e., a face x is represented by its coefficients [y1, y2, ..., yK]^T
Case Study: Eigenfaces for Face Detection/Recognition

− M. Turk, A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.

• Face Recognition
  − The simplest approach is to think of it as a template matching problem.
  − Problems arise when performing recognition in a high-dimensional space.
  − Use dimensionality reduction!
Face Recognition Using Eigenfaces
• Process the image database (i.e., a set of images with labels) – typically referred to as the "training" phase:

  − Compute the PCA space using the image database (i.e., training data).

  − Represent each image i in the database by its vector of K coefficients:

    Ωi = [y1, y2, ..., yK]^T
Face Recognition Using Eigenfaces
Given an unknown face x, follow these steps:

Step 1: Subtract the mean face x̄ (computed from the training data):   Φ = x − x̄

Step 2: Project the unknown face onto the eigenspace:

  Φ̂ = Σ_{i=1}^{K} yi ui,   where yi = Φ^T ui,   giving Ω = [y1, y2, ..., yK]^T

Step 3: Find the closest match Ωi in the training set using:

  er = min_i ||Ω − Ωi||,   using either
       Σ_{j=1}^{K} (yj − yj^(i))^2           (Euclidean distance)
    or Σ_{j=1}^{K} (1/λj)(yj − yj^(i))^2     (Mahalanobis distance)

  The distance er is called the distance in face space (difs).

Step 4: Recognize x as person "k", where k is the ID linked to Ωi.

  Note: for intruder rejection, we need er < Tr, for some threshold Tr.
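A compact sketch of Steps 1–4 (illustrative; it assumes `U`, `x_bar`, the training coefficients `Omega_train`, and `labels` were produced by an eigenfaces training phase as described above):

```python
import numpy as np

def recognize(x, U, x_bar, Omega_train, labels, T_r=None):
    """Eigenface recognition: project x and find the nearest training coefficient vector.
    U: N^2 x K eigenfaces, Omega_train: M x K training coefficients, labels: M person IDs."""
    phi = x - x_bar                                        # Step 1: subtract the mean face
    omega = U.T @ phi                                      # Step 2: coefficients y_i = phi^T u_i
    dists = np.linalg.norm(Omega_train - omega, axis=1)    # Step 3: Euclidean difs
    i = int(np.argmin(dists))
    if T_r is not None and dists[i] >= T_r:
        return None, dists[i]                              # intruder rejection: no match below threshold
    return labels[i], dists[i]                             # Step 4: identity of the closest match
```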
Face detection vs recognition

[Figure: Detection locates faces in an image; Recognition identifies the person (e.g., "Sally").]
Face Detection Using Eigenfaces
Given an unknown image x, follow these steps:

Step 1: Subtract the mean face x̄ (computed from the training data):   Φ = x − x̄

Step 2: Project the unknown image onto the eigenspace:

  Φ̂ = Σ_{i=1}^{K} yi ui,   where yi = Φ^T ui

Step 3: Compute ed = ||Φ − Φ̂||

  The distance ed is called the distance from face space (dffs).

Step 4: If ed < Td, then x is a face.
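A short sketch of the dffs computation (same assumptions about `U` and `x_bar` as in the recognition sketch above):

```python
import numpy as np

def distance_from_face_space(x, U, x_bar):
    """dffs: reconstruction error of x in the eigenface subspace."""
    phi = x - x_bar                          # Step 1: subtract mean face
    phi_hat = U @ (U.T @ phi)                # Step 2: projection onto the eigenspace
    return np.linalg.norm(phi - phi_hat)     # Step 3: e_d = ||phi - phi_hat||

# Step 4 (detection): declare x a face if distance_from_face_space(x, U, x_bar) < T_d
```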
Eigenfaces
[Figure: input images and their reconstructions]

− Reconstructed image looks like a face.
− Reconstructed image looks like a face.
− Reconstructed image looks like a face again!
Reconstruction from partial information
• Robust to partial face occlusion.

[Figure: occluded input images and their reconstructions]
Eigenfaces

• Can be used for face detection, tracking, and recognition!

• Visualize dffs as an image:

  ed = ||Φ − Φ̂||

  Dark: small distance
  Bright: large distance
Limitations
• Background changes cause problems
  − De-emphasize the outside of the face (e.g., by multiplying the input image by a 2D Gaussian window centered on the face).
• Light changes degrade performance
  − Light normalization might help, but this is a challenging issue.
• Performance decreases quickly with changes to face size
  − Scale the input image to multiple sizes.
  − Multi-scale eigenspaces.
• Performance decreases with changes to face orientation (but not as fast as with scale changes)
  − Out-of-plane rotations are more difficult to handle.
  − Multi-orientation eigenspaces.
Limitations (cont’d)
• Not robust to misalignment.

Limitations (cont’d)
• PCA is not always an optimal dimensionality-reduction technique for classification purposes.
Linear Discriminant Analysis (LDA)
• Linear discriminant analysis, also known as normal discriminant analysis (NDA) or discriminant function analysis (DFA), follows a generative model framework.

• This means LDA algorithms model the data distribution for each class and use Bayes' theorem to classify new data points.

• Bayes' theorem calculates conditional probabilities, i.e., the probability of an event given that some other event has occurred.

• LDA algorithms make predictions by using Bayes' theorem to calculate the probability that an input data point belongs to a particular class.
Linear Discriminant Analysis (LDA)
• LDA works by identifying a linear combination of features that separates or characterizes two or more classes of objects or events.
• LDA does this by projecting data with two or more dimensions onto one dimension so that it can be more easily classified.
• The technique is, therefore, sometimes referred to as dimensionality reduction.
• This versatility means that LDA can be used for multi-class classification problems, unlike logistic regression, which is limited to binary classification.
• LDA is thus often applied to enhance the operation of other classification algorithms such as decision trees, random forests, or support vector machines (SVM).
Linear Discriminant Analysis (LDA)

Linear Discriminant Analysis (LDA)

• What is the goal of LDA?
  − Seeks to find directions along which the classes are best separated (i.e., increase discriminatory information).
  − It takes into consideration the scatter (i.e., variance) within classes and between classes.

[Figure: two candidate projection directions – one gives bad class separability, the other gives good separability.]
Linear Discriminant Analysis (LDA) (cont’d)
• Let us assume C classes, with the i-th class containing Mi samples, i = 1, 2, ..., C, and M the total number of samples:

  M = Σ_{i=1}^{C} Mi

• Let μi be the mean of the i-th class, i = 1, 2, ..., C, and μ the mean of the whole dataset:

  μ = (1/C) Σ_{i=1}^{C} μi

• Within-class scatter matrix:

  Sw = Σ_{i=1}^{C} Σ_{j=1}^{Mi} (xj − μi)(xj − μi)^T

• Between-class scatter matrix:

  Sb = Σ_{i=1}^{C} (μi − μ)(μi − μ)^T
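A small NumPy sketch of these scatter matrices (following the slide, the overall mean is taken as the unweighted average of the class means):

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter matrices.
    X: M x N data matrix, y: M class labels."""
    classes = np.unique(y)
    class_means = np.array([X[y == c].mean(axis=0) for c in classes])
    mu = class_means.mean(axis=0)          # overall mean as the average of class means
    N = X.shape[1]
    Sw = np.zeros((N, N))
    Sb = np.zeros((N, N))
    for c, mu_i in zip(classes, class_means):
        D = X[y == c] - mu_i               # deviations from the class mean
        Sw += D.T @ D                      # sum of (x_j - mu_i)(x_j - mu_i)^T
        d = (mu_i - mu).reshape(-1, 1)
        Sb += d @ d.T                      # (mu_i - mu)(mu_i - mu)^T
    return Sw, Sb
```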
Linear Discriminant Analysis (LDA) (cont’d)
• Suppose the desired projection transformation is:

  y = U^T x

• Suppose the scatter matrices of the projected data y are S̃b and S̃w.

• LDA seeks transformations that maximize the between-class scatter and minimize the within-class scatter:

  max |S̃b| / |S̃w|   or   max |U^T Sb U| / |U^T Sw U|
Linear Discriminant Analysis (LDA) (cont’d)

• It can be shown that the columns of the matrix U are the eigenvectors (i.e., the so-called Fisherfaces) corresponding to the largest eigenvalues of the following generalized eigenproblem:

  Sb uk = λk Sw uk

• It can be shown that Sb has rank at most C − 1; therefore, the maximum number of eigenvectors with non-zero eigenvalues is C − 1, that is:

  the maximum dimensionality of the LDA subspace is C − 1

  e.g., when C = 2, we always end up with a single LDA feature, no matter what the original number of features was!
Example

Linear Discriminant Analysis (LDA) (cont’d)

• If Sw is non-singular, we can solve a conventional eigenvalue problem as follows:

  Sb uk = λk Sw uk   ⇒   Sw^(-1) Sb uk = λk uk

• In practice, Sw is singular due to the high dimensionality of the data (e.g., images) and the much lower number of data points (M << N).
Linear Discriminant Analysis (LDA) (cont’d)
• To alleviate this problem, PCA could be applied first:

  1) First, apply PCA to reduce the data dimensionality:

     x = [x1, ..., xN]^T  →(PCA)→  y = [y1, ..., yM]^T

  2) Then, apply LDA to find the most discriminative directions:

     y = [y1, ..., yM]^T  →(LDA)→  z = [z1, ..., zK]^T
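A sketch of this two-stage pipeline; using scikit-learn here is an assumption on my part, since the lecture does not prescribe a library:

```python
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def pca_then_lda(X, y, n_pca, n_lda):
    """Reduce dimensionality with PCA first (so Sw is better conditioned), then apply LDA.
    n_lda must be at most C - 1, where C is the number of classes."""
    pca = PCA(n_components=n_pca)
    X_pca = pca.fit_transform(X)                      # step 1: x -> y (PCA space)
    lda = LinearDiscriminantAnalysis(n_components=n_lda)
    X_lda = lda.fit_transform(X_pca, y)               # step 2: y -> z (discriminative space)
    return X_lda, pca, lda
```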
Case Study I

− D. Swets, J. Weng, "Using Discriminant Eigenfeatures for Image Retrieval", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 831-836, 1996.

• Content-based image retrieval:
  − Application: query-by-example content-based image retrieval
  − Question: how to select a good set of image features?
Case Study I (cont’d)
• Assumptions
  − Well-framed images are required as input for training and query-by-example test probes.
  − Only a small variation in the size, position, and orientation of the objects in the images is allowed.
Case Study I (cont’d)
• Terminology
  − Most Expressive Features (MEF): features obtained using PCA.
  − Most Discriminating Features (MDF): features obtained using LDA.

• Numerical instabilities
  − Computing the eigenvalues/eigenvectors of Sw^(-1) SB uk = λk uk could lead to unstable computations since Sw^(-1) SB is not always symmetric.
  − Check the paper for more details about how to deal with this issue.
Case Study I (cont’d)
• Comparing projection directions between MEF and MDF:
  − PCA eigenvectors show the tendency of PCA to capture major variations in the training set, such as lighting direction.
  − LDA eigenvectors discount those factors unrelated to classification.
Case Study I (cont’d)

• Clustering effect

[Figure: class clusters in PCA space vs. LDA space.]
Case Study I (cont’d)

• Methodology

  1) Represent each training image in terms of MDFs (or MEFs for comparison).

  2) Represent a query image in terms of MDFs (or MEFs for comparison).

  3) Find the k closest neighbors (e.g., using Euclidean distance).
Case Study I (cont’d)
• Experiments and results
  − Face images: a set of face images was used with 2 expressions and 3 lighting conditions.
  − Testing was performed using a disjoint set of images.
Case Study I (cont’d)

Top match (k=1)

Case Study I (cont’d)
− Examples of correct search probes

Case Study I (cont’d)

− Example of a failed search probe

Case Study II

− A. Martinez, A. Kak, "PCA versus LDA", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 228-233, 2001.

• Is LDA always better than PCA?
  − There has been a tendency in the computer vision community to prefer LDA over PCA.
  − This is mainly because LDA deals directly with discrimination between classes, while PCA does not pay attention to the underlying class structure.
Case Study II (cont’d)
AR database

Case Study II (cont’d)

LDA is not always better when the training set is small.

PCA w/o 3: not using the first three principal components, which seem to encode mostly variations due to lighting.
Case Study II (cont’d)

LDA outperforms PCA when the training set is large.

PCA w/o 3: not using the first three principal components, which seem to encode mostly variations due to lighting.
4th Quiz

• When: Monday, April 19th at 2pm


− Closed book/notes
− 10 minutes for answering the questions
− 5 minutes for uploading your answers on Canvas
(system will NOT accept any submissions after 2:15pm)
− See “Course Overview” slides for details
• What: Dimensionality Reduction
