Principal Component Analysis - A Tutorial
Alaa Tharwat
April 2, 2016
Introduction.
Principal Component Analysis (PCA).
Numerical Examples.
Experiments.
Conclusions and Future Work.
PCA applies an orthogonal transformation, i.e. a rotation of the axes.
Figure: Example of the two-dimensional data (x1, x2): PC1 is the direction of the maximum variance (with variance σ1²), and PC2 is orthogonal to it (with variance σ2²).
Introduction: Principal Components (PCs)
In this method, there are two main steps to calculate the PCs of the
PCA space. First, the covariance matrix of the data matrix (X) is
calculated. Second, the eigenvalues and eigenvectors of the
covariance matrix are calculated.
The covariance matrix is used when the number of variables is more
than one.
The covariance matrix is symmetric (i.e. $\Sigma = \Sigma^T$) and always
positive semi-definite.
\Sigma = \begin{bmatrix}
Var(x_1, x_1) & Cov(x_1, x_2) & \dots & Cov(x_1, x_M) \\
Cov(x_2, x_1) & Var(x_2, x_2) & \dots & Cov(x_2, x_M) \\
\vdots & \vdots & \ddots & \vdots \\
Cov(x_M, x_1) & Cov(x_M, x_2) & \dots & Var(x_M, x_M)
\end{bmatrix} \quad (1)
\Sigma V = \lambda V \quad (2)
where V and λ represent the eigenvectors and eigenvalues of the
covariance matrix, respectively.
The eigenvalues are scalar values, while the eigenvectors are non-zero
vectors, which represent the principal components, i.e. each
eigenvector represents one principal component.
The eigenvectors represent the directions of the PCA space, and the
corresponding eigenvalues represent the scaling factor, length,
magnitude, or the robustness of the eigenvectors.
The eigenvector with the highest eigenvalue represents the first
principal component; it points in the direction of the maximum
variance. These steps translate directly into code, as the sketch
below shows.
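The following is a minimal sketch of the covariance matrix method, assuming NumPy and the tutorial's layout of M variables in rows and N samples in columns; the function name pca_covariance is illustrative, not part of any library.

```python
import numpy as np

def pca_covariance(X, k):
    """Return the PCA space W (M x k), sorted eigenvalues, and the mean."""
    mu = X.mean(axis=1, keepdims=True)    # mean vector (M x 1)
    D = X - mu                            # mean-centred data (M x N)
    Sigma = D @ D.T / (X.shape[1] - 1)    # covariance matrix (M x M)
    eigvals, V = np.linalg.eigh(Sigma)    # eigh: Sigma is symmetric
    order = np.argsort(eigvals)[::-1]     # sort eigenvalues, descending
    return V[:, order[:k]], eigvals[order], mu
```

Projecting the data onto the PCA space is then `Y = W.T @ D`, as shown in the figures that follow.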
Figure: Visualized steps to calculate the PCA space using the covariance matrix method: the data matrix X is mean-centred (D = X − μ), the covariance matrix Σ (M×M) is computed, and the k eigenvectors (V1, V2, ..., Vk) with the largest eigenvalues are selected to form the PCA space (M×k).
Figure: Visualized steps to calculate the PCA space using the SVD method: the data matrix X (M×N) is mean-centred using the mean vector μ (M×1), the mean-centred data D (M×N) is transposed and decomposed as Z = L S R^T (L is N×N, R^T is M×M, and the diagonal of S holds the singular values s_i), the eigenvalues follow from λ_i = s_i², the eigenvalues are sorted, and the k eigenvectors with the largest eigenvalues are selected to form the PCA space W (M×k).
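The same space can be obtained without forming the covariance matrix explicitly, following the steps in the figure above. A minimal sketch under the same assumptions as before (NumPy, variables in rows):

```python
import numpy as np

def pca_svd(X, k):
    """PCA space via SVD: lambda_i = s_i**2, as in the figure above."""
    N = X.shape[1]
    mu = X.mean(axis=1, keepdims=True)
    D = X - mu                              # mean-centred data (M x N)
    Z = D.T / np.sqrt(N - 1)                # transposed, scaled data (N x M)
    L, s, Rt = np.linalg.svd(Z, full_matrices=False)
    W = Rt.T[:, :k]                         # right singular vectors (M x k)
    return W, s ** 2, mu                    # eigenvalues are the squared s_i
```

Note that np.linalg.svd returns the singular values already sorted in descending order, so no extra sorting step is needed.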
Introduction: Singular Value Decomposition (SVD) Method
Figure: Projection of the mean-centred data matrix D (columns d1, d2, ..., dN) onto the PCA space W, giving the projected data Y = W^T D (columns y1, y2, ..., yN).
D = \begin{bmatrix}
-1.63 & -1.63 & -0.63 & -2.63 & 2.38 & 1.38 & 2.38 & 0.38 \\
-0.63 & -1.63 & -0.63 & -0.63 & 0.38 & 1.38 & 1.38 & 0.38
\end{bmatrix} \quad (14)
The errors between the original data and the reconstructed data that
were projected onto the first and second eigenvectors are denoted by
E_{v1} and E_{v2}, respectively. The values of E_{v1} and E_{v2} are as follows:

E_{v1} = X - \hat{X}_1 = \begin{bmatrix}
-1.55 & -1.95 & -0.75 & -2.35 & 2.05 & 1.65 & 2.45 & 0.45 \\
-0.77 & -0.97 & -0.37 & -1.17 & 1.02 & 0.82 & 1.22 & 0.22
\end{bmatrix}

E_{v2} = X - \hat{X}_2 = \begin{bmatrix}
-0.07 & 0.33 & 0.12 & -0.27 & 0.32 & -0.28 & -0.08 & -0.08 \\
0.15 & -0.66 & -0.25 & 0.54 & -0.65 & 0.55 & 0.16 & 0.15
\end{bmatrix} \quad (18)
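These errors can be reproduced from Equation (14) alone. The sketch below (NumPy assumed) uses np.linalg.eigh, whose ascending eigenvalue order matches the labelling in this example, where the second eigenvector carries the larger eigenvalue:

```python
import numpy as np

D = np.array([[-1.63, -1.63, -0.63, -2.63, 2.38, 1.38, 2.38, 0.38],
              [-0.63, -1.63, -0.63, -0.63, 0.38, 1.38, 1.38, 0.38]])
eigvals, V = np.linalg.eigh(D @ D.T / (D.shape[1] - 1))

for i in (0, 1):
    v = V[:, [i]]                     # eigenvector v_{i+1} (2 x 1)
    D_hat = v @ (v.T @ D)             # project onto v, then reconstruct
    # X - X_hat equals D - D_hat because the mean vectors cancel
    print(f"E_v{i+1} =\n{np.round(D - D_hat, 2)}")
```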
Figure: A visualized example of the PCA technique: (a) the dotted line
represents the first eigenvector (v1), the solid line represents the second
eigenvector (v2), and the blue and green lines represent the reconstruction
error using PC1 and PC2, respectively; (b) projection of the data onto the
principal components, where the blue and green stars represent the projections
onto the first and second principal components, respectively.
Numerical Examples: First Example (2D-Class Example)
The first three steps of the SVD method and the covariance matrix
method are common. In the fourth step of the SVD method, the
mean-centred data were transposed and scaled as follows:
Z = \frac{1}{\sqrt{N-1}} D^T. The values of Z are as follows:

Z = \begin{bmatrix}
-0.61 & -0.24 \\
-0.61 & -0.61 \\
-0.24 & -0.24 \\
-0.99 & -0.24 \\
0.90 & 0.14 \\
0.52 & 0.52 \\
0.90 & 0.52 \\
0.14 & 0.14
\end{bmatrix} \quad (19)
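A quick numerical check of this step, assuming NumPy: the squared singular values of Z equal the eigenvalues of the covariance matrix of D.

```python
import numpy as np

D = np.array([[-1.63, -1.63, -0.63, -2.63, 2.38, 1.38, 2.38, 0.38],
              [-0.63, -1.63, -0.63, -0.63, 0.38, 1.38, 1.38, 0.38]])
N = D.shape[1]
Z = D.T / np.sqrt(N - 1)                      # the matrix of Equation (19)
s = np.linalg.svd(Z, compute_uv=False)        # singular values of Z
lam = np.linalg.eigvalsh(D @ D.T / (N - 1))   # eigenvalues of the covariance
print(np.round(s ** 2, 2), np.round(lam[::-1], 2))  # the two lists agree
```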
In another numerical example, the PCA space is calculated for the
following data matrix:

X = \begin{bmatrix}
1.00 & 1.00 & 2.00 & 0.00 & 7.00 & 6.00 & 7.00 & 8.00 \\
2.00 & 2.00 & 2.00 & 2.00 & 2.00 & 2.00 & 2.00 & 2.00 \\
5.00 & 6.00 & 5.00 & 9.00 & 1.00 & 2.00 & 1.00 & 4.00 \\
3.00 & 2.00 & 3.00 & 3.00 & 4.00 & 5.00 & 5.00 & 4.00
\end{bmatrix} \quad (21)
The covariance matrix of the given data was calculated and its values
are shown below.
\Sigma = \begin{bmatrix}
10.86 & 0 & -7.57 & 2.86 \\
0 & 0 & 0 & 0 \\
-7.57 & 0 & 7.55 & -2.23 \\
2.86 & 0 & -2.23 & 1.13
\end{bmatrix} \quad (22)
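This covariance matrix is easy to verify, e.g. with NumPy's np.cov, which treats rows as variables by default:

```python
import numpy as np

X = np.array([[1, 1, 2, 0, 7, 6, 7, 8],
              [2, 2, 2, 2, 2, 2, 2, 2],
              [5, 6, 5, 9, 1, 2, 1, 4],
              [3, 2, 3, 3, 4, 5, 5, 4]], dtype=float)
print(np.round(np.cov(X), 2))   # matches Equation (22); the second row and
                                # column are zero because x2 is constant
```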
Figure: The reconstruction error (y-axis) using each of the four eigenvectors (x-axis: index of the eigenvectors).
Table: A comparison between the ORL, Ear, and Yale datasets in terms of accuracy
(%), CPU time (sec), and cumulative variance (%) using different numbers of
eigenvectors (biometric experiment).
Number of     | ORL Dataset                 | Ear Dataset                 | Yale Dataset
Eigenvectors  | Acc.   CPU Time  Cum. Var.  | Acc.   CPU Time  Cum. Var.  | Acc.   CPU Time  Cum. Var.
              | (%)    (sec)     (%)        | (%)    (sec)     (%)        | (%)    (sec)     (%)
1             | 13.33  0.074     18.88      | 15.69  0.027     29.06      | 26.67  0.045     33.93
5             | 80.83  0.097     50.17      | 80.39  0.026     66.10      | 76.00  0.043     72.24
10            | 94.17  0.115     62.79      | 90.20  0.024     83.90      | 81.33  0.042     85.13
15            | 95.00  0.148     69.16      | 94.12  0.028     91.89      | 81.33  0.039     90.18
20            | 95.83  0.165     73.55      | 94.12  0.033     91.89      | 84.00  0.042     93.36
30            | 95.83  0.231     79.15      | 94.12  0.033     98.55      | 85.33  0.061     96.60
40            | 95.83  0.288     82.99      | 94.12  0.046     99.60      | 85.33  0.064     98.22
50            | 95.83  0.345     85.75      | 94.12  0.047    100.00      | 85.33  0.065     99.12
100           | 95.83  0.814     93.08      | 94.12  0.061    100.00      | 85.33  0.091    100.00
Acc. accuracy; Cum. cumulative; Var. variance.
Experiments: Image Compression Experiment
Table: Compression ratio and mean square error of the compressed images using
different percentages of the eigenvectors (image compression experiment).
Percentage of the  | Lena Image                           | Cameraman Image
used Eigenvectors  | MSE     CR         Cum. Var. (%)     | MSE     CR         Cum. Var. (%)
10                 | 5.3100  512:51.2    97.35            | 8.1057  256:25.6    94.56
20                 | 2.9700  512:102.4   99.25            | 4.9550  256:51.2    98.14
30                 | 1.8900  512:153.6   99.72            | 3.3324  256:76.8    99.24
40                 | 1.3000  512:204.8   99.87            | 2.0781  256:102.4   99.73
50                 | 0.9090  512:256     99.94            | 1.1926  256:128     99.91
60                 | 0.6020  512:307.2   99.97            | 0.5588  256:153.6   99.98
70                 | 0.3720  512:358.4   99.99            | 0.1814  256:179.2  100.00
80                 | 0.1935  512:409.6  100.00            | 0.0445  256:204.8  100.00
90                 | 0.0636  512:460.8  100.00            | 0.0096  256:230.4  100.00
100 (All)          | 0.0000  512:512=1  100.00            | 0.0000  256:256=1  100.00
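The experiment can be approximated with a short script. This is a hedged sketch, not the exact code behind the table: it treats the rows of a square grayscale image as variables, keeps a percentage of the eigenvectors, reconstructs the image, and reports the mean square error and compression ratio; NumPy is assumed, and img is any M x M image array.

```python
import numpy as np

def compress_image(img, pct):
    """Keep pct% of the eigenvectors of an M x M grayscale image."""
    M = img.shape[0]
    mu = img.mean(axis=1, keepdims=True)
    D = img - mu                               # mean-centred rows
    eigvals, V = np.linalg.eigh(np.cov(img))   # rows as variables
    k = max(1, round(M * pct / 100))           # e.g. 51 of 512 for pct=10
    W = V[:, np.argsort(eigvals)[::-1][:k]]    # top-k eigenvectors
    recon = W @ (W.T @ D) + mu                 # reconstructed image
    mse = np.mean((img - recon) ** 2)          # mean square error
    return recon, mse, f"{M}:{M * pct / 100:g}"  # ratio, e.g. 512:51.2
```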
Figure: The robustness, i.e. total variance (see Equation (12)), of the first 100
eigenvectors using Lena and Cameraman images.
Figure: Projection of the Iris (Setosa, Versicolour, Virginica), Ionosphere (Bad Radar, Good Radar), and Ovarian (Cancer, Normal) datasets, together with samples of three subjects from a face dataset, onto the first two principal components (PC1 and PC2, top row) and the first three principal components (PC1, PC2, and PC3, bottom row).
Dataset    | 2D: Robustness (in %)   MSE    | 3D: Robustness (in %)   MSE
Iris       | 97.76                    0.12  | 99.48                    0.05
Iono       | 43.62                    0.25  | 51.09                    0.23
Ovarian    | 98.75                    0.04  | 99.11                    0.03
ORL        | 34.05                   24.03  | 41.64                   22.16
Ear64×64   | 41.17                   15.07  | 50.71                   13.73
Yale       | 48.5                    31.86  | 57.86                   28.80
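For completeness, a sketch of the 2D visualization step for the Iris data; scikit-learn and matplotlib are assumed here and are not prescribed by the tutorial.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data.T                              # M x N: variables in rows
D = X - X.mean(axis=1, keepdims=True)        # mean-centred data
eigvals, V = np.linalg.eigh(np.cov(X))
W = V[:, np.argsort(eigvals)[::-1][:2]]      # 2D PCA space
Y = W.T @ D                                  # projected data (2 x N)
for label, name in enumerate(iris.target_names):
    m = iris.target == label
    plt.scatter(Y[0, m], Y[1, m], label=name)
plt.xlabel('First Principal Component (PC1)')
plt.ylabel('Second Principal Component (PC2)')
plt.legend()
plt.show()
```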