Chapter 4: Normalized Principal Components Analysis: Dr. Lassad El Moubarki Tunis Business School
Example of data
PCA with R
Initial data
Example of data
Reduce the number of dimensions of the data by finding the best
visualization planes, obtained by orthogonal projection of the data.
Group together homogeneous individuals and identify exceptional
individuals.
Analyze the relationships between the variables.
Case studies
\[
Z = (z_{ij})_{1 \le i \le n,\ 1 \le j \le p}, \qquad z_{ij} = \frac{x_{ij} - \bar{X}_j}{\sigma_j}
\]
Remark: throughout the rest of the chapter we assume that the data is
normalized.
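The normalization step can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch: the array `X` holds made-up values (5 individuals, 3 variables), and the standard deviation is assumed to be the population one (division by n), which is consistent with uniform weights 1/n.

```python
import numpy as np

# Made-up toy data: 5 individuals (rows), 3 variables (columns).
X = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 55.0, 25.0],
    [180.0, 80.0, 40.0],
    [175.0, 72.0, 35.0],
    [165.0, 60.0, 45.0],
])

mean = X.mean(axis=0)    # one mean per variable (column)
sigma = X.std(axis=0)    # population standard deviation (divides by n)
Z = (X - mean) / sigma   # z_ij = (x_ij - mean_j) / sigma_j

# After normalization every column has mean 0 and standard deviation 1.
print(np.round(Z.mean(axis=0), 10))
print(np.round(Z.std(axis=0), 10))
```

Each variable then contributes equally to the total variance, whatever its original unit.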
Principal axes:
1 The principal axes $\Delta_1, \Delta_2, \ldots, \Delta_k$ are identified by computing the
eigenvalues and eigenvectors of the correlation matrix $R = {}^{t}Z D Z$, where
$D = \frac{1}{n} I_n$ is the weight matrix.
2 Next, we sort the eigenvalues in decreasing order,
$\lambda_1 > \lambda_2 > \ldots > \lambda_p$. We denote by U the $(p \times p)$ matrix whose
columns are the corresponding eigenvectors $u_j$.
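The two steps above can be sketched as follows. The data are made-up toy values, and the names `lam` and `order` are chosen here for illustration. `np.linalg.eigh` is used because R is symmetric; it returns eigenvalues in ascending order, so they are reordered explicitly.

```python
import numpy as np

# Made-up toy data: 5 individuals (rows), 3 variables (columns).
X = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 55.0, 25.0],
    [180.0, 80.0, 40.0],
    [175.0, 72.0, 35.0],
    [165.0, 60.0, 45.0],
])
n, p = X.shape
Z = (X - X.mean(axis=0)) / X.std(axis=0)   # normalized data

D = np.eye(n) / n                          # weight matrix D = (1/n) I_n
R = Z.T @ D @ Z                            # correlation matrix (p x p)

eigvals, eigvecs = np.linalg.eigh(R)       # ascending eigenvalues
order = np.argsort(eigvals)[::-1]          # indices for decreasing order
lam = eigvals[order]                       # lambda_1 >= ... >= lambda_p
U = eigvecs[:, order]                      # eigenvectors u_j as columns

print(lam)
print(lam.sum())                           # trace of R, i.e. p
```

Because the data are normalized, the diagonal of R contains only ones, so the eigenvalues sum to p.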
Principal components
1 The coordinates of each individual on the new axes formed by the
eigenvectors are given by the scalar products:
\[
C = ZU =
\begin{pmatrix}
C_1^1 & \cdots & C_1^\alpha & \cdots & C_1^p \\
\vdots &       & \vdots     &        & \vdots \\
C_n^1 & \cdots & C_n^\alpha & \cdots & C_n^p
\end{pmatrix}
\]
2 Any pair of columns of the matrix U forms a factor map.
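As a hedged illustration of the product C = ZU, the sketch below computes the principal components on made-up toy data and checks that the weighted variance of each component equals the corresponding eigenvalue. Variable names are illustrative only.

```python
import numpy as np

# Made-up toy data: 5 individuals (rows), 3 variables (columns).
X = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 55.0, 25.0],
    [180.0, 80.0, 40.0],
    [175.0, 72.0, 35.0],
    [165.0, 60.0, 45.0],
])
n, p = X.shape
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = Z.T @ Z / n                            # same as tZ D Z with D = (1/n) I_n

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
lam, U = eigvals[order], eigvecs[:, order]

C = Z @ U                                  # row i = coordinates of individual i
first_plane = C[:, :2]                     # coordinates on axes 1 and 2

# Each component is centered and its variance (with weights 1/n) is lambda_alpha.
print(np.round(C.mean(axis=0), 10))
print((C ** 2).mean(axis=0), lam)
```

Plotting the two columns of `first_plane` against each other gives the usual scatterplot of individuals on the first factorial plane.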
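Absolute contribution:
The absolute contribution of a given point i to the inertia of the projection
onto the axis α is:
\[
\mathrm{ACTR}(i, \alpha) = \frac{p_i \, (C_i^\alpha)^2}{\lambda_\alpha}
\]
where $p_i$ is the weight of point i ($p_i = 1/n$ with the weight matrix D).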
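A minimal sketch of this formula, assuming uniform weights p_i = 1/n and made-up toy data. It also checks that, for each axis, the absolute contributions of all points sum to 1.

```python
import numpy as np

# Made-up toy data: 5 individuals (rows), 3 variables (columns).
X = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 55.0, 25.0],
    [180.0, 80.0, 40.0],
    [175.0, 72.0, 35.0],
    [165.0, 60.0, 45.0],
])
n, p = X.shape
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(Z.T @ Z / n)
order = np.argsort(eigvals)[::-1]
lam, U = eigvals[order], eigvecs[:, order]
C = Z @ U

weights = np.full(n, 1.0 / n)                  # p_i = 1/n (assumed uniform)
ACTR = weights[:, None] * C ** 2 / lam         # ACTR[i, alpha]

print(np.round(ACTR, 3))
print(ACTR.sum(axis=0))                        # each column sums to 1
```

A point whose contribution is well above 1/n on an axis plays a large role in the construction of that axis.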
Relative contribution or squared cosine:
The relative contribution of a given point i over the axis α is:
\[
\mathrm{RCTR}(i, \alpha) = \cos^2(z_i, \hat{z}_i^{\alpha}) = \frac{(C_i^\alpha)^2}{\|z_i\|^2}
\]
where $\hat{z}_i^{\alpha}$ is the orthogonal projection of $z_i$ onto the axis α.
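The squared cosines can be sketched in the same way on made-up toy data. Since U is orthogonal, the squared cosines of a point over all p axes sum to 1, which the sketch verifies.

```python
import numpy as np

# Made-up toy data: 5 individuals (rows), 3 variables (columns).
X = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 55.0, 25.0],
    [180.0, 80.0, 40.0],
    [175.0, 72.0, 35.0],
    [165.0, 60.0, 45.0],
])
n, p = X.shape
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(Z.T @ Z / n)
order = np.argsort(eigvals)[::-1]
U = eigvecs[:, order]
C = Z @ U

sq_norms = (Z ** 2).sum(axis=1, keepdims=True)   # ||z_i||^2, one per individual
RCTR = C ** 2 / sq_norms                         # RCTR[i, alpha] = cos^2

quality_plane_12 = RCTR[:, :2].sum(axis=1)       # quality on the first plane

print(np.round(RCTR, 3))
print(RCTR.sum(axis=1))                          # each row sums to 1
```

The sum of the squared cosines on two axes measures how well a point is represented on the corresponding plane: values close to 1 indicate a faithful projection.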
Remarks
The coordinates of the j-th variable point:
\[
Z^j = \begin{pmatrix} z_{1j} \\ z_{2j} \\ \vdots \\ z_{nj} \end{pmatrix}
\]
The eigenvectors $(v_1, v_2, \ldots)$ defining the principal axes of the
second scatterplot are given by the transition formula:
\[
v_\alpha = \frac{1}{\sqrt{\lambda_\alpha}} Z u_\alpha
\]
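The new factor coordinates of each variable point j over the axis α
are given by:
\[
S_j^\alpha = \sqrt{\lambda_\alpha} \, u_\alpha^j
\]
where $u_\alpha^j$ is the j-th component of the eigenvector $u_\alpha$.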
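The sketch below computes the variable coordinates on made-up toy data and checks two properties that follow from the formulas in this chapter: each coordinate equals the correlation between the variable and the corresponding principal component, and the squared coordinates of a variable over all axes sum to 1. The names `S` and `corr` are illustrative.

```python
import numpy as np

# Made-up toy data: 5 individuals (rows), 3 variables (columns).
X = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 55.0, 25.0],
    [180.0, 80.0, 40.0],
    [175.0, 72.0, 35.0],
    [165.0, 60.0, 45.0],
])
n, p = X.shape
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(Z.T @ Z / n)
order = np.argsort(eigvals)[::-1]
lam, U = eigvals[order], eigvecs[:, order]
C = Z @ U

S = U * np.sqrt(lam)                   # S[j, alpha] = sqrt(lambda_alpha) * u_alpha^j

# Correlation between variable j and component alpha:
# Z has unit variance and component alpha has variance lambda_alpha.
corr = (Z.T @ C / n) / np.sqrt(lam)

print(np.round(S, 3))
print((S ** 2).sum(axis=1))            # each row sums to 1
```

Since the squared coordinates sum to 1 over all axes, the projection of a variable point onto any factorial plane lies inside the unit circle.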
Correlation circle
\[
\frac{\lambda_1 + \lambda_2}{p}
\]
L. El Moubarki Sampling November 10, 2020 15 / 23
Inertia and the choice of the number of principal axes
The inertia rate of the first k axes, $(\lambda_1 + \ldots + \lambda_k)/p$, defines the
explanatory power of these k axes (or factors): it represents the share of the
total variance that they account for. However, its interpretation must take
into account the number of variables and the number of individuals. For
example, an inertia rate of 10% for a single axis can be a large value if we
have 100 variables and a low one if we have only 7 variables.
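A small sketch of this reasoning, using hypothetical eigenvalues rather than values from a real data set. The helper `relative_to_average` is introduced here for illustration only: it compares an axis's inertia rate with the average share 1/p that each axis would carry if all eigenvalues were equal.

```python
from itertools import accumulate

# Hypothetical eigenvalues of a correlation matrix with p = 3 variables,
# sorted in decreasing order; in normalized PCA they sum to p.
eigenvalues = [2.1, 0.6, 0.3]
p = len(eigenvalues)

rates = [lam / p for lam in eigenvalues]   # inertia rate of each axis
cumulative = list(accumulate(rates))       # inertia rate of the first k axes

for k, (rate, cum) in enumerate(zip(rates, cumulative), start=1):
    print(f"axis {k}: rate = {rate:.1%}, first {k} axes = {cum:.1%}")


def relative_to_average(rate, n_variables):
    """Ratio between an axis's inertia rate and the average share 1/p."""
    return rate * n_variables


# The 10% example from the text: large with 100 variables, low with 7.
print(relative_to_average(0.10, 100))      # about 10 times the average share
print(relative_to_average(0.10, 7))        # about 0.7 times the average share
```

With 100 variables the average share is 1%, so an axis carrying 10% is ten times more informative than average; with 7 variables the average share is about 14.3%, so the same 10% is below average.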
Rotation
Case Studies