Principal Component Analysis versus Factor Analysis
Zenon Gniazdowski*
Abstract
The article discusses selected problems related to both principal component analysis (PCA) and factor analysis (FA). In particular, both types of analysis were compared. A vector interpretation for both PCA and FA has also been proposed. The problem of determining the number of principal components in PCA and factors in FA was discussed in detail. A new criterion for determining the number of factors and principal components is discussed, which allows most of the variance of each of the analyzed primary variables to be represented. An efficient algorithm for determining the number of factors in FA that complies with this criterion was also proposed. This algorithm was adapted to find the number of principal components in PCA. A modification of the PCA algorithm that uses the new method of determining the number of principal components was also proposed. The obtained results were discussed.
1 Introduction
To be able to talk about factor analysis in the context of principal component analysis, the details of both methods should be compared, starting with the algorithms and ending with the effects of both types of analysis. Only selected problems related to principal component analysis (PCA) and factor analysis (FA) will be discussed in this article. First of all, the common elements of both analyses will be presented, as well as the differences between them. Principal component analysis and factor analysis will be performed for a sample data set. In both analyses, a matrix of correlation coefficients will be used. Additionally, in the case of factor analysis, considerations will be limited to exploratory factor analysis (EFA) using principal components and Varimax rotation. Also, the elements on the diagonal of the correlation matrix will not be reduced by the value of the common variances.
* E-mail: [email protected]
A detailed comparison of the PCA with the FA will allow conclusions to be drawn about
the relationship between both types of analysis. This will allow a broader view of the criteria
for determining the number of principal components or factors in both types of analysis. As a
consequence, it will enable the development of a new efficient algorithm for determining the
number of factors in FA and principal components in PCA.
2 Preliminaries
This section introduces the basic concepts and notations that will be used later in this article. In particular, this applies to some letter symbols, abbreviations, basic statistics, rotation of the coordinate system, criteria for determining the number of factors or principal components, as well as the factor analysis algorithm based on principal components. The principal component analysis algorithm will not be presented here, since it has already been presented in the article [1].
• In this article, both PCA and FA will use a matrix of correlation coefficients. Therefore,
in both types of analysis, it is sufficient to consider standardized primary variables instead
of the original primary variables. The assumption about the standardization of primary
variables is not a limitation here for three reasons:
1. Standardization of random variables does not affect the matrix of correlation coef-
ficients.
2. Both types of analysis work on standardized primary variables. FA identifies the
linear model of standardized primary variables as a function of independent factors
(which are also standardized random variables), while PCA transforms standardized
primary variables into independent principal components.
3. Using the transformation of formula (7), one can find the primary variables X from
the standardized primary variables x.
• The article will examine principal components analysis as well as factor analysis. Certain
algorithms exist in both types of analysis. They are either common or analogous. The
common algorithm is eigenproblem solving for the matrix of correlation coefficients.
However, an analogous algorithm is the algorithm for determining the number of prin-
cipal components in the principal components analysis, as well as the algorithm for de-
termining the number of factors in the factor analysis. An analogous algorithm is also
the rotation of the coordinate system. In principal components analysis, rotation enables
the identification of principal components, and in factor analysis, rotation enables the
identification of optimal factors.
When discussing analogous algorithms, referring to both factors and principal components, the article will use a conglomerate of two words: "factor/component". In the context of principal component analysis, this conglomerate will refer only to the principal components. In the context of factor analysis, it will refer only to factors.
x_i := (X_i − X̄) / s = x_i / s.    (7)

After standardization, the variable x has the mean value x̄ = 0 and the standard deviation s = 1.
Using (2), the formula for the correlation coefficient can be transformed to the form:
R_{X,Y} = Σ_{i=1}^{n} x_i y_i / ( √(Σ_{i=1}^{n} x_i²) · √(Σ_{i=1}^{n} y_i²) ).    (9)
In the numerator of the formula there is the dot product of two vectors x and y, and in the
denominator there is the product of the lengths of these vectors. This means that the correlation
coefficient is identical to the cosine of the angle between the two random vectors x and y [3]:
R_{X,Y} = (x · y) / (‖x‖ · ‖y‖) = cos(x, y).    (10)
Here it should also be added that the standardization of a random variable does not change the
correlation coefficient.
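As an illustration of formulas (9) and (10), the short sketch below (a hedged example in Python; the data and variable names are made up for this illustration and are not part of the original article) computes the correlation coefficient both with a standard estimator and as the cosine of the angle between the centered vectors:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=200)
Y = 0.7 * X + rng.normal(size=200)

# Center the variables; further standardization would rescale numerator and denominator equally.
x = X - X.mean()
y = Y - Y.mean()

# Correlation coefficient as the cosine of the angle between the two vectors, formula (10).
cos_xy = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

# Reference value from the standard estimator.
r_xy = np.corrcoef(X, Y)[0, 1]

print(cos_xy, r_xy)  # the two values agree up to floating-point error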
v' := Rv.    (12)
When the vector v specifies one point in space, it is represented as a row in some matrix. This
means that its transposition is available. Also the resulting vector will be a row in the matrix.
In order to rotate the vector represented in this way, both sides of the formula (12) should be
transposed:
(v')^T = (Rv)^T.    (13)
The result is a formula to rotate the row vector:
(v')^T = v^T R^T = v^T U.    (14)
If instead of single points v^T and (v')^T we consider sets of points in the form of rectangular matrices M and M', in which the vectors v^T and (v')^T will be single rows, then equation (14) becomes the matrix equation:

M' = M · U.    (15)
In this way, the rows of the factor loadings matrix are rotated in factor analysis, and the stan-
dardized primary variables are transformed into principal components in principal components
analysis.
r_ij =
    | 1   ⋯                         0 |
    | ⋮   ⋱                           |
    |          c    ⋯    s            |    ← i-th row
    |          ⋮    ⋱    ⋮            |    (16)
    |         −s    ⋯    c            |    ← j-th row
    |                          ⋱    ⋮ |
    | 0   ⋯                         1 |,

where c and s denote the cosine and sine of the rotation angle in the (i, j) plane, and the remaining elements are those of the identity matrix.
The algorithm for finding the final matrix describing the resultant rotation is as follows:
1. R := I;
2. for i = 1, 2, . . . , n−1 and j = i+1, . . . , n:   R := R · r_ij.
Since the matrix r_ij describing rotation in a given plane is a modified identity matrix with four changed elements, in the second step of the algorithm there is no need to fully multiply the matrices R and r_ij; it is enough to modify the elements of the matrix R in the i-th and j-th rows, as well as in the i-th and j-th columns. First, the elements R'_ii and R'_jj on the diagonal can be found, as well as the off-diagonal elements R'_ij and R'_ji:

R'_ii := c²·r_ii + sc·(r_ij + r_ji) + s²·r_jj;
R'_jj := c²·r_jj − sc·(r_ij + r_ji) + s²·r_ii;
R'_ij := sc·(r_jj − r_ii) + c²·r_ij − s²·r_ji;                    (18)
R'_ji := sc·(r_jj − r_ii) + c²·r_ji − s²·r_ij.
Also, the remaining elements in both rows and both columns should be calculated:

R'_ip := c·R_ip + s·R_jp,
R'_jp := −s·R_ip + c·R_jp,
R'_pi := c·R_pi + s·R_pj,            p ≠ i, p ≠ j.    (19)
R'_pj := −s·R_pi + c·R_pj,
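A compact sketch of this composition (points 1 and 2 of the algorithm), written with the full matrix multiplication rather than the optimized row/column updates of formulas (18) and (19), might look as follows; the way the rotation angles are obtained is outside the scope of this fragment and is treated here as given input:

import numpy as np

def plane_rotation(n, i, j, phi):
    """Elementary rotation r_ij in the (i, j) plane, as in formula (16)."""
    r = np.eye(n)
    c, s = np.cos(phi), np.sin(phi)
    r[i, i], r[i, j] = c, s
    r[j, i], r[j, j] = -s, c
    return r

def resultant_rotation(angles):
    """Compose all elementary rotations into the resultant matrix R (R := I; R := R * r_ij)."""
    n = angles.shape[0]
    R = np.eye(n)
    for i in range(n - 1):
        for j in range(i + 1, n):
            R = R @ plane_rotation(n, i, j, angles[i, j])
    return R

# Example: a resultant rotation of a 3-dimensional coordinate system.
angles = np.zeros((3, 3))
angles[0, 1], angles[0, 2], angles[1, 2] = 0.3, -0.1, 0.7
R = resultant_rotation(angles)
print(np.allclose(R @ R.T, np.eye(3)))  # True: a product of rotations is orthogonal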
• The eigenvalue criterion (Kaiser criterion): only factors/components whose eigenvalue is greater than one are retained. A single standardized variable has a variance of one. Any factor/component with an eigenvalue greater than one accounts for more variance than a single variable. The rationale for using the eigenvalue criterion is that each factor/component should represent or explain at least one primary variable. That is, only factors/components with eigenvalues not less than one should be kept. Since the goal of both PCA and FA is to reduce the total number of factors/components, each factor/component should account for a greater variance than the variance of a single primary variable.
• The criterion of half the number of primary variables: It is assumed that the number of
factors/components should not exceed half the number of all primary variables. If the
identification of principal components is treated as lossy compression, it is important
that this compression significantly reduces the size of the stored set. A file that is half the
size of the original file can be considered sufficiently compressed. On the other hand, the
number of factors/components in the factor/component model, not greater than half of
the primary variables (including potentially possible factors/components), is satisfactory
from the point of view of the simplicity of the model.
It should be emphasized at this point that none of the above criteria should be regarded as
absolute criteria, but rather as subsidiary criteria. First of all, it may happen that particular
criteria may produce different or inconclusive results:
• The scree criterion may not apply, as the two phases clearly separated by a so-called "elbow" may not be visible in the scree plot.
• The percentage criterion of the explained variance may also give unsatisfactory results,
despite the relatively large variance represented. The number of factors/components re-
sulting from this criterion may be too small to adequately represent the primary variables.
• The Kaiser criterion can also falsify the number of factors/components. For example,
for data describing the petals of an iris flower [8], the second eigenvalue for the correla-
tion coefficient matrix is less than one. Nevertheless, only two factors/components can
satisfactorily represent the primary variables [1].
• Also, the criterion of half the number of primary variables may be too strict. For ex-
ample, according to this criterion, with an odd number of variables, the number of fac-
tors/components should not be greater than (n − 1)/2. Meanwhile, it would be better if
this number was (n + 1)/2.
In such an ambiguous situation, the number of factors/components should be decided by ana-
lyzing the full context of the study.
Factor analysis seeks hidden factors that can be interpreted as the causes of the variability of the observed variables. Various approaches to FA are mentioned in the literature. These approaches relate to the types of FA, the algorithms used, as well as rotation methods (e.g. [6, 7, 9, 10, 11]):
• There are two types of factor analysis. The first type is exploratory factor analysis (EFA),
the second is confirmatory factor analysis (CFA). In exploratory factor analysis, neither a
priori relationship between the observed variables and factors nor the number of factors
is assumed. On the other hand, confirmatory factor analysis presupposes some knowledge of the model, which may be confirmed in the course of the analysis. In practice, some
researchers may run EFA first and then use CFA to validate or confirm EFA results. In
turn, EFA is not a necessary condition for the CFA [11].
• In FA, in the context of the algorithms used, there are at least two different approaches:
the principal component approach and the maximum likelihood approach.
• The factors obtained from the principal components need not be final factors. Factors can
be rotated for simpler interpretation. The main division of rotation methods is between
orthogonal and oblique rotation. Different possible rotation criteria lead to different
possible rotation methods. Hence, at least the following methods of factor rotation are
known: Varimax - orthogonal rotation, Quartimax - orthogonal rotation, and Oblimin -
oblique rotation.
Only some aspects of the factor analysis will be presented in this article:
• Exploratory Factor Analysis,
• Method based on principal components,
• Varimax rotation.
Since in factor analysis standardized primary variables are modeled as a function of independent
factors, therefore, without losing generality in the remainder of this article, all mathematical
formulas describing factor analysis will only apply to standardized primary variables.
Having a set of n random variables x = [x1 , . . . , xn ], we can proceed to FA. In the
n−dimensional space defined by these variables, m measurement points are considered. The
data is stored in the form of an m × n matrix x:

x =
    | x_11  ⋯  x_1n |
    |  ⋮    ⋱   ⋮   |
    | x_m1  ⋯  x_mn |.    (20)
The individual columns of the matrix x contain the n successive random variables x_i. Each i-th random variable forms a column random vector x_i (i = 1, 2, . . . , n):

x_i = [x_1i, x_2i, . . . , x_mi]^T.    (21)
In turn, the j-th row of the matrix x represents a single measurement point p_j, containing the j-th elements of the successive random variables x_i (i = 1, 2, . . . , n):

p_j = [x_j1, x_j2, . . . , x_jn].    (22)
3. The matrix U is also obtained, which in its columns contains successive eigenvectors
corresponding to the successive eigenvalues:
U =
    | U_11  ⋯  U_1n |
    |  ⋮    ⋱   ⋮   |
    | U_n1  ⋯  U_nn |.    (25)
4. From the matrix Λ, a diagonal matrix S containing the standard deviations of the potential factors is calculated:

S = √Λ =
    | √λ_1  ⋯   0   |
    |  ⋮    ⋱   ⋮   |
    |  0    ⋯  √λ_n |.    (26)
5. Using the matrices U and S, the square matrix of factor loadings L is obtained:

L = U · S =
    | L_11  ⋯  L_1n |
    |  ⋮    ⋱   ⋮   |
    | L_n1  ⋯  L_nn |.    (27)
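The listed steps can be summarized in a short sketch, assuming that the standardized primary variables are stored column-wise in a NumPy array x (an illustration only, not the author's implementation):

import numpy as np

def factor_loadings(x):
    """Eigenproblem of the correlation matrix and the full loadings matrix L = U * S (24)-(27).

    x : (m, n) array of standardized primary variables (columns are variables)."""
    R = np.corrcoef(x, rowvar=False)           # matrix of correlation coefficients
    eigval, U = np.linalg.eigh(R)              # eigenvalues and eigenvectors of the symmetric R
    order = np.argsort(eigval)[::-1]           # sort eigenvalues in decreasing order
    eigval, U = eigval[order], U[:, order]
    S = np.diag(np.sqrt(np.clip(eigval, 0.0, None)))   # standard deviations of potential factors
    L = U @ S                                  # full matrix of factor loadings
    return eigval, U, L

# Example on synthetic data.
rng = np.random.default_rng(1)
x = rng.normal(size=(100, 5))
eigval, U, L = factor_loadings(x)
# Each row of L reproduces the unit variance of one standardized primary variable.
print(np.allclose((L ** 2).sum(axis=1), 1.0))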
For each primary variable, the value of vi determines the level of variance reproduced by using
k factors in the model (29). The components vi form the vector of common variances V :
V = [v_1, v_2, . . . , v_n]^T.    (32)
The diversity of the elements v_i of the vector V shows that individual variables are modeled with different accuracy by a selected set of factors. A reasonable model should represent most of
the variance of the modeled primary variable. Here, "most" means at least 50%. This level of explaining the variance of the primary variable can be found in [5]. If the condition v_i ≤ 0.5 holds for any i, this indicates that too few factors were used to explain the primary variables. One or more of the criteria described in Section 2.4 may be used to determine the appropriate number of factors.
In this way, the factor model suitable for simulating the influence of factors on the primary
variables, also considering the influence on the primary variables of uncontrolled random dis-
turbances, takes the final form:
| x_1 |   | L_11  ⋯  L_1k |   | f_1 |   | w_1 |
|  ⋮  | = |  ⋮    ⋱   ⋮   | · |  ⋮  | + |  ⋮  | · f_0.    (35)
| x_n |   | L_n1  ⋯  L_nk |   | f_k |   | w_n |
In this model, each primary variable is linearly dependent on at least one common factor fi (i =
1, . . . , k) and on one specific (unique) factor f0 .
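One of the uses of such a model, mentioned later in the article, is the Monte Carlo simulation of primary variables. The sketch below follows equation (35) literally, with a single specific factor f_0 shared by all variables; the loadings are arbitrary illustrative numbers, and in classical FA each variable would usually have its own unique factor:

import numpy as np

rng = np.random.default_rng(2)

# Assumed loadings for n = 3 primary variables and k = 2 common factors (illustrative values).
L = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.6, 0.5]])
# Weights of the specific factor, chosen so that every modeled variance equals one.
w = np.sqrt(np.clip(1.0 - (L ** 2).sum(axis=1), 0.0, None))

m = 10_000
f = rng.normal(size=(m, L.shape[1]))   # independent standardized common factors f_1..f_k
f0 = rng.normal(size=(m, 1))           # specific factor f_0

x = f @ L.T + f0 * w                   # model (35): x = L*f + w*f_0

print(np.cov(x, rowvar=False).round(2))  # diagonal close to 1, as required for standardized variables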
Its transposition R^T will be used to transform the points, i.e. to rotate row vectors:

R^T =
    | cos φ   −sin φ |
    | sin φ    cos φ |.    (37)
After the coordinate system is rotated, in the new coordinate system the point with the coordinates (x_i, y_i) becomes the point with the new coordinates (X_i, Y_i). The coordinate transformation can be described by matrix multiplication:

| X_1  Y_1 |    | x_1  y_1 |
|  ⋮    ⋮  | := |  ⋮    ⋮  | · | cos φ   −sin φ |    (38)
| X_n  Y_n |    | x_n  y_n |   | sin φ    cos φ |.
Hence:

dX_i/dφ = Y_i,    dY_i/dφ = −X_i.    (40)
The maximized objective function has the form:
n²·v_xy = n·Σ (X_i²)² − (Σ X_i²)² + n·Σ (Y_i²)² − (Σ Y_i²)².    (41)
Using the equation (40), the objective function (41) can be differentiated with respect to the
angle ϕ and after differentiation it can be compared to zero:
n·Σ X_i Y_i (X_i² − Y_i²) − Σ X_i Y_i · Σ (X_i² − Y_i²) = 0.    (42)
To solve the problem in the space of variables before rotation (x and y), the formula (39) should
be used. After transformation, the relationship describing the angle of rotation on the OXY
plane is obtained:
4φ = arctan { 2·[ n·Σ (x²−y²)(2xy) − Σ (x²−y²) · Σ (2xy) ] / [ n·Σ [(x²−y²)² − (2xy)²] − ( [Σ (x²−y²)]² − [Σ (2xy)]² ) ] }.    (43)
If we substitute u_i = x_i² − y_i² and v_i = 2·x_i·y_i, then the above expression reduces to a simpler form:

4φ = arctan { 2·[ n·Σ u_i v_i − Σ u_i · Σ v_i ] / [ n·Σ (u_i² − v_i²) − ( (Σ u_i)² − (Σ v_i)² ) ] }.    (44)
In the range of full rotation from −180° to +180°, the functions sin 4φ and cos 4φ reach both negative and positive values (Figure 1). Therefore, the expression arctan(·) is ambiguous. As a result of examining the signs of the first and second derivative of the numerator and the denominator in the expression (44), Kaiser's work [12] presents ranges of the angle φ depending on the signs of the numerator and the denominator of this expression. Table 1 shows the ranges of the 4φ angle values. These ranges are consistent with the ranges of variability of the sin and cos functions presented in Figure 1.
Table 1: The relationship of the solution of the equation (44) with the signs of its numerator and denominator

                                Numerator sign
                                +                 −
    Denominator sign   +        0° to 90°         −90° to 0°
                       −        90° to 180°       −180° to −90°
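The case analysis in Table 1 corresponds to what a two-argument arctangent does. A sketch of computing the angle φ for one pair of loading columns according to formula (44), assuming the columns are given as NumPy arrays (the numbers below are made up for the example), could be:

import numpy as np

def varimax_pair_angle(x, y):
    """Rotation angle phi for one pair of loading columns, formula (44).

    The quadrant of 4*phi is resolved with arctan2, which reproduces Table 1."""
    u = x ** 2 - y ** 2
    v = 2.0 * x * y
    n = len(x)
    num = 2.0 * (n * np.sum(u * v) - np.sum(u) * np.sum(v))
    den = n * np.sum(u ** 2 - v ** 2) - (np.sum(u) ** 2 - np.sum(v) ** 2)
    return 0.25 * np.arctan2(num, den)

# Illustrative pair of columns of a factor loadings matrix.
x = np.array([0.9, 0.8, 0.1, 0.2])
y = np.array([0.1, 0.2, 0.9, 0.8])
phi = varimax_pair_angle(x, y)
print(np.degrees(4 * phi))  # the angle 4*phi falls into the range (-180, 180], as in Table 1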
Table 2: Basic statistics of the analyzed variables

                      X1          X2            X3            X4          X5       X6          X7
                      Sea level   Air           Dew point     Wind        Wind     Visibility  Time of
                      pressure    temperature   temperature   direction   speed                measurement
                      [mbar]      [°C]          [°C]          [°]         [m/s]    [m]
Mean                  1016.378    10.221        5.314         180.805     3.397    18890.569   0.479
Median                1016.2      10            5.6           180         3        20000       0.46
Mode                  1014.3      1.3           11.5          270         3        30000       0.75
Standard deviation    8.400       8.972         7.263         100.781     2.065    9769.928    0.289
Minimum               975.2       -18.6         -20.8         0           0        0           0
Maximum               1045.8      36.7          22.1          360         24       80000       0.96
Basic statistics were estimated for all seven variables. The mean values of the variables, their
medians and modes were adopted as the measures of the location of random variable distribu-
tions. Standard deviations for all variables as well as their minima and maxima were assumed
as measures of dispersion. The results are presented in Table 2. The matrix of correlation co-
efficients (Table 3) and the matrix of determination coefficients (Table 4) were also estimated
for all seven variables. The coefficient of determination (equal to the square of the correlation
coefficient) defines the degree of similarity of random variables measured as a percentage of
their common variance. Table 4 shows the values of the determination coefficients given as a
percentage. Their analysis shows that in most cases the analyzed variables are characterized by
a low level of mutual similarity (mutual correlation). Only air temperature and dew point tem-
perature are strongly correlated. In this case, the common variance measured by the coefficient
of determination is over 76%. The coefficient of determination estimated for temperature and
visibility indicates their common variance at a level slightly greater than 32%. The remaining
determination coefficients do not exceed 10%.
Table 3: The matrix of correlation coefficients

       x1      x2      x3      x4      x5      x6      x7
x1     1      -0.197  -0.257  -0.110  -0.108  -0.032  -0.010
x2    -0.197   1       0.875   0.025  -0.038   0.568   0.100
x3    -0.257   0.875   1       0.031  -0.142   0.313   0.010
x4    -0.110   0.025   0.031   1       0.311   0.050   0.034
x5    -0.108  -0.038  -0.142   0.311   1       0.146   0.044
x6    -0.032   0.568   0.313   0.050   0.146   1       0.122
x7    -0.010   0.100   0.010   0.034   0.044   0.122   1
Table 5: The eigenvalues of the correlation coefficient matrix

Eigenvalue No.    1       2       3       4       5       6       7
Eigenvalue        2.290   1.390   1.058   0.919   0.751   0.518   0.075
Successive eigenvectors corresponding to the eigenvalues in Table 5 form the columns of the
matrix U :
U = [U_1, . . . , U_n] =
    | u_11  ⋯  u_1n |
    |  ⋮    ⋱   ⋮   |
    | u_n1  ⋯  u_nn |.    (46)
For the considered data, the matrix of eigenvectors has the following form:
−0.231 −0.218 0.579 0.579 −0.367 −0.306 0.030
0.633 −0.097 0.028 0.084 −0.063 −0.193 −0.736
0.583 −0.177 −0.187 −0.018 −0.219 −0.379 0.634
U = 0.067 0.625 −0.146 0.089 −0.716 0.251 −0.025. (47)
0.005
0.696 0.057 0.197 0.432 −0.534 0.041
0.438 0.122 0.383 0.350 0.319 0.609 0.228
0.099 0.149 0.677 −0.699 −0.115 −0.081 0.042
                      PC1     PC2     PC3     PC4     PC5     PC6     PC7
Mean                  0.00    0.00    0.00    0.00    0.00    0.00    0.00
Standard deviation    1.513   1.179   1.028   0.959   0.866   0.720   0.275
Variance              2.290   1.390   1.058   0.919   0.751   0.518   0.075
PC = x · R^T.    (48)
As a result of the transformation (48), seven principal components were obtained. These are
uncorrelated random variables, the statistics of which are presented in Table 5. Comparing
the results presented in Table 5 with the results in Table 4, it can be seen that the variances
of individual principal components are equal to the successive eigenvalues estimated for the
correlation coefficient matrix contained in Table 5.
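A minimal sketch of transformation (48) on synthetic standardized data (the eigenvector matrix U plays here the role of R^T for row vectors; this is an illustration, not the computation performed on the weather data):

import numpy as np

rng = np.random.default_rng(3)
raw = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))       # correlated synthetic data
x = (raw - raw.mean(axis=0)) / raw.std(axis=0)                  # standardized primary variables

eigval, U = np.linalg.eigh(np.corrcoef(x, rowvar=False))
order = np.argsort(eigval)[::-1]
eigval, U = eigval[order], U[:, order]

PC = x @ U   # transformation (48): principal components as rotated standardized variables

# The principal components are uncorrelated and their variances equal the eigenvalues.
print(np.allclose(PC.T @ PC / len(PC), np.diag(eigval)))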
Having a set of primary variables and a set of principal components, the correlation coeffi-
cients between primary variables and principal components were estimated (Table 7). Based on
the correlation coefficients, the coefficients of determination between the variables from both
sets were found. Table 8 contains information on which variable is represented by successive principal components, and to what percentage. It can be seen that most of the variances of the variables x2 and x3 are represented by the first principal component, PC1. This component represents
Table 7: The correlation coefficients between the primary variables, and the principal
components
       PC1     PC2     PC3     PC4     PC5     PC6     PC7
x1    -0.349  -0.257   0.595   0.555  -0.318  -0.220   0.008
x2     0.957  -0.114   0.029   0.081  -0.054  -0.139  -0.202
x3     0.882  -0.208  -0.193  -0.017  -0.190  -0.273   0.174
x4     0.101   0.737  -0.150   0.085  -0.620   0.181  -0.007
x5     0.008   0.820   0.058   0.189   0.375  -0.384   0.011
x6     0.663   0.144   0.394   0.335   0.276   0.438   0.063
x7     0.150   0.176   0.696  -0.670  -0.100  -0.058   0.012
over 91% of the variance of the x2 variable and over 77% of the variance of the x3 variable. The second principal component PC2 represents more than half of the variance of the variables x4 and x5. The common variance of these variables with the principal component PC2 exceeds the level of 54% and 67%, respectively. The variables x1, x6 and x7 do not have a principal component that would represent most of their variance. For these variables, more components are needed to represent at least half of their variance. The principal components PC3 and PC4 represent most of the variances of the variables x1 and x7. In turn, the principal components PC1 and PC3 contain most of the variance of the variable x6.
Table 9 also shows the coefficients of determination between primary variables and princi-
pal components, but now not in percent, but in absolute numbers. Additionally, it is enriched
with sums of elements in rows and columns:
• The sum of the determination coefficients in each row is equal to one. This is the variance of the standardized primary variable. The primary variable shares its variance with successive principal components.
• The sum of the determination coefficients in each column is equal to the eigenvalue, i.e.
the variance of the corresponding principal component. The principal component owes
its variance to a certain part of the variance of the primary variables.
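Both properties listed above can be checked numerically. In the sketch below (illustrative synthetic data, not the weather data set), D denotes the matrix of determination coefficients between the standardized primary variables and the principal components, as in Table 9:

import numpy as np

rng = np.random.default_rng(4)
raw = rng.normal(size=(300, 5)) @ rng.normal(size=(5, 5))
x = (raw - raw.mean(axis=0)) / raw.std(axis=0)

eigval, U = np.linalg.eigh(np.corrcoef(x, rowvar=False))
order = np.argsort(eigval)[::-1]
eigval, U = eigval[order], U[:, order]

L = U * np.sqrt(eigval)   # loadings = correlations between primary variables and components
D = L ** 2                # determination coefficients

print(np.allclose(D.sum(axis=1), 1.0))      # every row sums to the unit variance of a variable
print(np.allclose(D.sum(axis=0), eigval))   # every column sums to the corresponding eigenvalue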
Table 9: The determination coefficients between the primary variables and the principal components, with row and column sums

       PC1     PC2     PC3     PC4     PC5     PC6     PC7     Σ
x1     0.122   0.066   0.354   0.308   0.101   0.048   0.000   1
x2     0.917   0.013   0.001   0.007   0.003   0.019   0.041   1
x3     0.778   0.043   0.037   0.000   0.036   0.075   0.030   1
x4     0.010   0.543   0.023   0.007   0.384   0.033   0.000   1
x5     0.000   0.673   0.003   0.036   0.140   0.148   0.000   1
x6     0.440   0.021   0.155   0.112   0.076   0.192   0.004   1
x7     0.022   0.031   0.484   0.449   0.010   0.003   0.000   1
Σ      2.290   1.390   1.058   0.919   0.751   0.518   0.075   7.00
Table 10: The cumulative variances of the primary variables represented by adding
successive factors
      PC1       PC2       PC3       PC4       PC5       PC6       PC7
x1    12.21%    18.81%    54.24%    85.06%    95.15%    99.99%    100%
x2    91.65%    92.96%    93.04%    93.69%    93.99%    95.92%    100%
x3    77.84%    82.18%    85.89%    85.92%    89.52%    96.97%    100%
x4     1.03%    55.31%    57.58%    58.30%    96.74%   100.00%    100%
x5     0.01%    67.28%    67.62%    71.18%    85.21%    99.99%    100%
x6    44.00%    46.08%    61.57%    72.80%    80.41%    99.61%    100%
x7     2.25%     5.33%    53.78%    98.65%    99.65%    99.99%    100%
Table 10 presents the cumulative values of the coefficients of determination in the rows. Cu-
mulative values explain the level of representation of primary variables by successive principal
components. Figure 3 shows a graphical representation of Table 10. In Figure 3 it is possible to
see how many principal components are needed to achieve a satisfactory level of representation
of the variance of individual primary variables. It can be seen that one principal component
PC1 represents more than 70% of the variance of the primary variable x3 and more than 90%
of the variable x2 . On the other hand, two principal components are not sufficient to represent
more than half of the variance of the primary variables x1 and x7 .
Pythagorean theorem can be used. Since each row is a vector representing a single primary
variable, the cosines between successive vectors are identical to the correlation coefficients be-
tween the successive primary variables that these vectors represent.
Table 11: The percentage of variances explained by the successive principal components
[1]. This criterion will be shown in the example presented in Table 12. It was assumed that the primary variables x1, . . . , x7 will be represented by the three principal components PC1, PC2 and PC3. The table shows what level of variance of the variable xi is represented by the principal component PCj. For example, 91.65% of the variance of the primary variable x2 is represented by the principal component PC1. The same principal component represents the primary variables x4, x5 and x7 to a very small extent. The level of representation of these variables by the principal component PC1 amounts to 1.03%, 0.01% and 2.27%, respectively. The row marked as "Average in column" in Table 12 is identical to the column "Percentage of variance explained by each PC" in Table 10. Table 11 refers to the eigenvalues. Eigenvalues are
identical to variances. A comparison of Table 11 and Table 12 shows that the results in Table
11 refer to the mean variance of the primary variables explained by each principal component.
Looking at the last row in the last column of Table 12, it can be seen that the three principal
components contain just over 67% of the mean variance of all primary variables. However, the
variance of single primary variables is represented to a varying degree. The variance of the
primary variables x1 , x4 and x7 is represented in slightly more than half, and the variance of
the primary variable x2 is represented in more than 93%.
The above observations confirm that there is an additional criterion for determining the appropriate number of principal components, related to the degree of reconstruction of the variance of the primary variables. The application of this criterion makes it possible to determine the appropriate number of principal components in such a way that the level of variance representation of each of the primary variables is at least satisfactory, i.e. not lower than the set threshold [1]. There should be enough principal components so that most of the variance of each of the primary variables can be reproduced. Common sense suggests that most means more than half the variance.
Of course, there still remains the technical problem of applying this criterion. The criterion presented here requires identifying all the principal components, then calculating the correlation coefficients and the coefficients of determination between the primary variables and the principal components, then determining which principal components are necessary and rejecting the others. Compared to the previously known criteria, the criterion presented here has greater time and memory complexity.
Table 12: The level of representation of primary variables by the three principal components

                     PC1       PC2       PC3       Σ
x1                  12.21%     6.60%    35.43%    54.24%
x2                  91.65%     1.31%     0.08%    93.04%
x3                  77.84%     4.34%     3.71%    85.89%
x4                   1.03%    54.28%     2.26%    57.58%
x5                   0.01%    67.27%     0.34%    67.62%
x6                  44.00%     2.08%    15.49%    61.57%
x7                   2.25%     3.09%    48.45%    53.78%
Average in column   32.71%    19.85%    15.11%    67.67%
Table 13: The matrix of factor loadings obtained in FA

       F1      F2      F3      F4      F5      F6      F7
x1    -0.349  -0.257   0.595   0.555  -0.318  -0.220   0.008
x2     0.957  -0.114   0.029   0.081  -0.054  -0.139  -0.202
x3     0.882  -0.208  -0.193  -0.017  -0.190  -0.273   0.174
x4     0.101   0.737  -0.150   0.085  -0.620   0.181  -0.007
x5     0.008   0.820   0.058   0.189   0.375  -0.384   0.011
x6     0.663   0.144   0.394   0.335   0.276   0.438   0.063
x7     0.150   0.176   0.696  -0.670  -0.100  -0.058   0.012
       F1       F2       F3       F4       F5       F6       F7
x1    0.1221   0.1881   0.5424   0.8506   0.9515   0.9999   1
x2    0.9165   0.9296   0.9304   0.9369   0.9399   0.9592   1
x3    0.7784   0.8218   0.8589   0.8592   0.8952   0.9697   1
x4    0.0103   0.5531   0.5758   0.5830   0.9674   1.0000   1
x5    0.0001   0.6728   0.6762   0.7118   0.8521   0.9999   1
x6    0.4400   0.4608   0.6157   0.7280   0.8041   0.9961   1
x7    0.0225   0.0533   0.5378   0.9865   0.9965   0.9999   1
matrix of factor loadings obtained in the FA. The identity of Tables 7 and 13 means that each factor loading that connects the i-th primary variable to the j-th factor is equal to the correlation coefficient between the i-th primary variable and the j-th principal component. It means that:

• The factors obtained in the factor analysis can be identified, before their rotation, with the standardized principal components obtained in the principal components analysis.

• The factor loadings connecting the i-th primary variable with successive factors (the i-th row in Table 13) can be interpreted as components of the vector representing this primary variable in the coordinate system made up of eigenvectors. Thanks to this, a geometric description can be used to describe the behavior of primary variables; in particular, the Pythagorean theorem can be used.

• Since in the vector interpretation each single row is a vector representing a single primary variable, therefore, as in PCA, the cosines between successive vectors are identical to the correlation coefficients between those primary variables that these vectors represent.
3.4.1 Artifact

During the analysis of the full matrix of factor loadings L (27), a fact was observed which will be presented here in more detail². For this purpose, attention should be paid to some properties of this full matrix of factor loadings:
• The rows of the full matrix of factor loadings L can be interpreted as vectors that rep-
resent successive standardized primary variables. The sums of the squares of the com-
ponents of the row vectors, representing the squares of the lengths of these vectors, are
equal to the unit variances of the standardized primary variables.
• The columns of the L matrix can be interpreted as vectors that represent the principal
components in PCA. The sums of the squares of the components of the column vectors
representing the squares of the lengths of these vectors are equal to the eigenvalues of the
correlation coefficient matrix, and thus equal to the variances of the principal components
in the PCA.
By multiplying the factor loadings matrix L by the transposition of the eigenvector matrix U T ,
the following symmetric matrix was obtained:
          |  0.987  −0.076  −0.122  −0.050  −0.057   0.004  −0.003 |
          | −0.076   0.803   0.508   0.005  −0.014   0.297   0.050 |
          | −0.122   0.508   0.843   0.018  −0.084   0.095  −0.011 |
L · U^T = | −0.050   0.005   0.018   0.986   0.157   0.018   0.015 |.    (49)
          | −0.057  −0.014  −0.084   0.157   0.979   0.080   0.019 |
          |  0.004   0.297   0.095   0.018   0.080   0.945   0.055 |
          | −0.003   0.050  −0.011   0.015   0.019   0.055   0.997 |
The U T matrix, similarly to the U (25) matrix, is an orthogonal matrix, and therefore describes
a certain rotation of the coordinate system in which the row vectors of the factor loadings matrix
are described. Rotation means changing the basis, or in other words changing the coordinate
system in which the vectors are described. So there is a new basis in which the factor loadings
matrix is symmetrical. This means that not only are the sums of the squares of the row vector
components equal to 1, but also the sums of the squares of the column vector components are
equal to 1. As a result of the performed rotation, the row vectors representing standardized
primary variables did not change. It only happened that the standardized primary variables are
represented by a different set of factors than before the rotation. After rotation, the primary
variables can be described as linear combinations of independent factors, but not factors iden-
tical to the standardized principal components. Now the factors are random variables with unit
variances.
With regard to the artifact described here, questions arise about both its causes and its
potential effects. As for the effects, it is still unknown whether they are important from the
² By the way, it should be mentioned that the article [1] describes an analogous fact that was observed in the context of the analysis of the matrix containing the correlation coefficients between the primary variables and the principal components.
point of view of data analysis. As for the causes, they were not yet known in the article [1]. In the context of FA, it seems that more can now be said about them. An attempt to explain the causes will be undertaken in Section 5, where some detailed results presented in this paper will be discussed.
Table 15: The cumulative variances of the primary variables represented by adding successive factors

                     F1        F2        F3        F4        F5        F6        F7
x1                  12.21%    18.81%    54.24%    85.06%    95.15%    99.99%    100%
x2                  91.65%    92.96%    93.04%    93.69%    93.99%    95.92%    100%
x3                  77.84%    82.18%    85.89%    85.92%    89.52%    96.97%    100%
x4                   1.03%    55.31%    57.58%    58.30%    96.74%   100.00%    100%
x5                   0.01%    67.28%    67.62%    71.18%    85.21%    99.99%    100%
x6                  44.00%    46.08%    61.57%    72.80%    80.41%    99.61%    100%
x7                   2.25%     5.33%    53.78%    98.65%    99.65%    99.99%    100%
Average in column   32.71%    52.56%    67.67%    80.80%    91.52%    98.92%    100%
Table 16: Minimum variance (MinVar) and mean variance (AverVar) reproduced by successive factors

No. of factors    1        2        3        4        5        6        7
EigVal           32.71%   19.85%   15.11%   13.13%   10.72%    7.40%    1.08%
MinVar            0.01%    5.33%   53.78%   58.30%   80.41%   95.92%   100%
AverVar          32.71%   52.56%   67.67%   80.80%   91.52%   98.92%   100%
NrMinVar          5        7        7        4        6        2        6
the reconstruction of the mean variance of all primary variables as well as on the reconstruction
of the variance of individual primary variables can be analyzed.
Based on the content of Table 15, Table 16 was created, in which for successive factors, the
       F1        F2        F3        Communality
x1    -0.3494   -0.2569    0.5952    54.24%
x2     0.9574   -0.1143    0.0286    93.04%
x3     0.8823   -0.2083   -0.1927    85.89%
x4     0.1015    0.7368   -0.1504    57.58%
x5     0.0082    0.8202    0.0583    67.62%
x6     0.6634    0.1442    0.3935    61.57%
x7     0.1499    0.1757    0.6960    53.78%
Table 19: The matrix of factor loadings for a three-factor model after Varimax rotation
       F1        F2        F3        Communality
x1    -0.3314   -0.3410    0.5624    54.24%
x2     0.9634   -0.0250    0.0403    93.04%
x3     0.9009   -0.1056   -0.1901    85.89%
x4     0.0320    0.7536   -0.0831    57.58%
x5    -0.0719    0.8088    0.1300    67.62%
x6     0.6406    0.1708    0.4196    61.57%
x7     0.1223    0.1263    0.7120    53.78%
Table 20: The matrix of common variances for a three-factor model after Varimax rotation

       F1        F2        F3        Communality
x1    10.98%    11.63%    31.63%    54.24%
x2    92.82%     0.06%     0.16%    93.04%
x3    81.16%     1.12%     3.61%    85.89%
x4     0.10%    56.78%     0.69%    57.58%
x5     0.52%    65.41%     1.69%    67.62%
x6    41.04%     2.92%    17.61%    61.57%
x7     1.50%     1.60%    50.69%    53.78%
by the x3 and x4 axes. Before the rotation, the third and fourth coordinates in the first and last row vectors in Table 21 (the two pairs of numbers [0.595, 0.555] and [0.696, −0.670], respectively) had similar absolute values. This manifests itself in the fact that in Fig. 5 the line segments which represent the variables x1 and x7 lie far from the axes of the coordinate system. It can be observed that after the rotation the line segments representing the variables x1 and x7 moved clearly closer to the axes of the coordinate system. This means that there are two different factors that represent most of the variances of the variables x1 and x7. The above observation is consistent with the conclusions presented above after the analysis of Table 24.
primary variables and n principal components. On the other hand, an unjustified increase in
memory complexity would result from the necessity to use all principal components for the cal-
culation of appropriate correlation coefficients, and not only those that are ultimately necessary
to represent most of the variances of the primary variables.
Similarly, subsection 3.4.2 deals with the problem of determining the appropriate number
of factors in factor analysis, due to the need to represent most of the variances of individual
primary variables by an appropriate factor model. The subsection 3.4.2 mentioned here also
suggests that there may be an appropriate algorithm for finding the appropriate number of factors. However, in this case, the proposed algorithm would not require significantly more time or memory.
In subsection 3.4, it was found that both principal component analysis and FA share a
common vector interpretation. As a result, a version of the algorithm for finding the appropriate
number of factors representing most of the variances of primary variables in FA can also be
Table 21: The matrix of factor loadings for a four-factor model

       F1      F2      F3      F4      Communality
x1    -0.349  -0.257   0.595   0.555   0.851
x2     0.957  -0.114   0.029   0.081   0.937
x3     0.882  -0.208  -0.193  -0.017   0.859
x4     0.101   0.737  -0.150   0.085   0.583
x5     0.008   0.820   0.058   0.189   0.712
x6     0.663   0.144   0.394   0.335   0.728
x7     0.150   0.176   0.696  -0.670   0.986
used to find the appropriate number of principal components representing most of the variances
of primary variables in PCA. And since this version of the algorithm does not generate greater
computational complexity, its use in principal component analysis will also not require greater
computational complexity.
A common algorithm for determining the appropriate number of principal components in
PCA, as well as determining the appropriate number of factors in FA, is presented in Table
25. The algorithm refers to some common elements found in both PCA and FA. In particular,
it uses the diagonal matrix of eigenvalues Λ described by the formula (24) and the matrix of
eigenvectors U (25), which is obtained as a result of solving the eigenproblem for the matrix of
correlation coefficients. The algorithm also uses other variables, the interpretation of which is
as follows:
• NoF – a positive integer, obtained as a result of the algorithm's operation, which counts the principal components or factors significant from the point of view of representing most of the variances of individual primary variables.

• ε – a floating point number greater than 0.5, arbitrarily taken as a reference minimum value of the variance of each of the primary variables, which should be represented by principal components or factors. For the purposes of this work, the author assumed the value ε = 0.51 (i.e. 51%) in the calculations.

• C[n] – an n-element non-negative floating point vector that contains the variances of individual primary variables reproduced by the factors/components selected so far. A sketch of one possible reading of this algorithm is given below.
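Since Table 25 itself is not reproduced here, the sketch below shows one possible reading of the described procedure, using the names NoF, ε and C introduced above; the exact bookkeeping in the original table may differ:

import numpy as np

def number_of_factors(Lambda, U, eps=0.51):
    """Smallest number of factors/components NoF for which at least the fraction eps
    of the variance of every primary variable is reproduced.

    Lambda : 1-D array of eigenvalues sorted in decreasing order (diagonal of the matrix (24)).
    U      : matrix of the corresponding eigenvectors in columns, formula (25)."""
    n = len(Lambda)
    C = np.zeros(n)                        # C[i]: variance of the i-th variable explained so far
    NoF = 0
    for j in range(n):                     # add factors/components one by one
        C += Lambda[j] * U[:, j] ** 2      # squared loadings of the j-th factor/component
        NoF = j + 1
        if C.min() >= eps:                 # every primary variable is represented at the level eps
            break
    return NoF, C

# Example with the eigenstructure of the correlation matrix of some synthetic data.
rng = np.random.default_rng(5)
data = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))
Lambda, U = np.linalg.eigh(np.corrcoef(data, rowvar=False))
order = np.argsort(Lambda)[::-1]
NoF, C = number_of_factors(Lambda[order], U[:, order])
print(NoF, C.round(3))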
Table 23: The matrix of factor loadings for a four-factor model after Varimax rotation
       F1      F2      F3      F4      Communality
x1    -0.129  -0.143  -0.030   0.902   85.06%
x2     0.956  -0.032   0.051  -0.136   93.69%
x3     0.853  -0.149  -0.065  -0.324   85.92%
x4     0.017   0.742  -0.031  -0.177   58.30%
x5    -0.040   0.840   0.052   0.047   71.18%
x6     0.733   0.258   0.151   0.319   72.80%
x7     0.038   0.012   0.992  -0.021   98.65%
Table 24: The matrix of common variances for a four-factor model after Varimax rotation

       F1       F2       F3       F4       Communality
x1     1.66%    2.03%    0.09%   81.28%    85.06%
x2    91.48%    0.10%    0.26%    1.86%    93.69%
x3    72.77%    2.22%    0.42%   10.51%    85.92%
x4     0.03%   55.03%    0.09%    3.15%    58.30%
x5     0.16%   70.53%    0.27%    0.22%    71.18%
x6    53.70%    6.67%    2.28%   10.16%    72.80%
x7     0.15%    0.01%   98.44%    0.04%    98.65%
of the variance of some of the primary variables. On the other hand, the criterion presented in
section 4 avoids this deficit. The algorithm using the above criterion allows for a more reliable
way of determining the number of factors (principal components). An example of the use of
this algorithm in FA is presented in section 3.4.2. Here, the discussed algorithm will be used
to modify the PCA in order to enable the determination of the optimal number of principal
components. The modified version of the PCA algorithm is presented in Table 26.
Table 26: Modification of the PCA algorithm, taking into account the new criterion for determining the number of principal components
the variances of the primary variables. Four examples will be shown in this subsection which
will confirm the necessity to apply the criterion presented in section 4.
Table 27: The percentage of variances explained by the successive factors for Dataset No. 2
Table 28: Minimum variance (MinVar) and mean variance (AverVar) reproduced by successive factors (Dataset No. 2)

No. of factors    1       2       3       4       5       6       7       8       9
EigVal           43.5%   21.4%   18.9%   10.1%    3.3%    1.6%    0.7%    0.5%    0.2%
MinVar            0.8%    7.3%   18.8%   87.4%   90.4%   96.9%   98.3%   99.3%   100%
AverVar          43.5%   64.8%   83.7%   93.8%   97.1%   98.7%   99.3%   99.8%   100%
NrMinVar          1       2       3       1       6       4       8       5       2
When analyzing the table, it can be noticed that the choice of three factors will explain slightly over 83% of the variance of the primary variables. The Kaiser criterion also suggests
the choice of three factors. On the other hand, in the scree plot (Fig. 6), the four eigenvalues are above the "elbow" on the slope of the scree. The analysis of Table 27 and Figure 7 shows that the choice of three factors will explain only 18.75% of the variance of the variable x3. Therefore, four factors must be selected that explain more than 87% of the variance of each primary variable.
Table 29: The percentage of variances explained by the successive factors for Dataset No. 3

Table 30: Minimum variance (MinVar) and mean variance (AverVar) reproduced by successive factors (Dataset No. 3)

No. of factors    1       2       3       4       5       6       7       8      ···
EigVal           47.4%   15.9%   14.6%    6.9%    6.0%    3.4%    1.8%    0.6%   ···
MinVar            3.9%    4.0%   12.5%   13.3%   14.3%   33.2%   52.0%   60.1%   ···
AverVar          47.4%   63.2%   77.8%   84.7%   90.7%   94.1%   95.9%   96.5%   ···
NrMinVar         81      81      8       8       8       11      4       11     ···
Table 31: The percentage of variances explained by the successive factors for Dataset No. 4

Table 32: Minimum variance (MinVar) and mean variance (AverVar) reproduced by successive factors (Dataset No. 4)

No. of factors    1         2         3         4         5         6        7
EigVal           25.88%    24.24%    21.16%    14.29%    11.62%    2.70%    0.11%
MinVar            0.00%     0.06%     0.06%    44.43%    90.54%   99.73%   100%
AverVar          25.88%    50.12%    71.28%    85.57%    97.19%   99.89%   100%
NrMinVar          3         1         1         2         7        3        1
contains the first nine lines describing the distribution of variances explained by the successive factors. The last column of the table shows that the choice of four factors will explain more than 84% of the variance of the primary variables. The Kaiser criterion suggests the use of seven factors. On the other hand, in Figure 8, there are three so-called "elbows". This fact does not make the analysis easier. The final results are not unequivocal. On the other hand, analysis of Table 30 and Figure 9 shows that selecting the seven factors will represent most of the variance of each of the primary variables (see MinVar). This clearly suggests that seven factors (components) should be selected for both factor analysis and principal component analysis.
Figure 11: Minimum variance (MinVar) and mean variance (AverVar) reproduced by successive factors (Dataset No. 4)
explained variance (Table 33), suggests three factors (71.3% of variance) or four factors (85.6% of variance), depending on the accepted minimum threshold of explained variance. As shown above, this criterion only informs about the average level of reproduction of the variance of all primary variables. Using this criterion, it is possible that the variance of individual primary variables may not be sufficiently reproduced. And indeed it is so. The analysis of Table 34 and Figure 13 shows that for three factors, at least the variance of the first primary variable x1 will be insufficiently reproduced. For three factors, the level of reproduction of the variance of x1 will be less than 1%. From the point of view of the possibility of reproducing most of the variances of single primary variables, four factors are also not enough. With four factors, the variance of x2 will be reproduced only to 44.4%. Only five factors will reproduce most of the variance of all single primary variables.
Table 33: The percentage of variances explained by the successive factors for Dataset No. 5

Table 34: Minimum variance (MinVar) and mean variance (AverVar) reproduced by successive factors (Dataset No. 5)

No. of factors    1       2       3       4       5       6       7       8       9
EigVal           64.3%   12.6%    9.3%    6.4%    3.6%    1.6%    1.4%    0.5%    0.3%
MinVar           16.6%   64.8%   75.2%   81.0%   92.7%   92.7%   97.8%   98.3%   100%
AverVar          64.3%   76.9%   86.2%   92.6%   96.2%   97.8%   99.2%   99.7%   100%
NrMinVar          6       7       7       3       4       4       2       8       7
Figure 13: Minimum variance (MinVar) and mean variance (AverVar) reproduced by successive factors (Dataset No. 5)
Details of the data can be found on the UCI Machine Learning Repository website [21]. The data table has 536 rows and 10 columns. After the date is rejected, 9 columns remain for analysis.
As in the previous examples, the determination of the appropriate number of factors was
made in the context of the four classical criteria and the criterion developed in this article:
• Figure 12 shows a scree plot. As there is one point in front of the so-called "elbow" on the slope of the scree, the criterion of the scree plot suggests the selection of one factor.
• The Kaiser criterion suggests two factors because two eigenvalues are greater than one.
• The criterion of half the number of primary variables suggests four factors.
• Assuming that the average level of the explained variance should be at least 80%, on the
basis of Table 33 it can be said that the criterion of the explained variance suggests the
selection of three factors.
Each of the criteria suggests a different solution to the problem of determining the number of
factors.
There is still the last criterion, discussed in this article. From Table 34 and Figure 13, it can be seen that one factor would explain less than 17% of the variance of the primary variable x6. Two factors already explain the variance of the primary variable x7 to almost 65%. Therefore, from the point of view of the necessity to reproduce most of the variance of each of the primary variables, two factors are sufficient.
5 Discussion
The article attempts to compare the exploratory factor analysis based on principal components with the principal components analysis using the correlation coefficient matrix. Both types of analysis have a common mathematical core. Among the elements common to both types of
analysis, there are also non-obvious elements. Here an attempt will be made to discuss them. Particular attention will be paid to the common algorithm for determining the number of factors in FA and principal components in PCA.
Result
  PCA: Principal components are representatives of primary variables: a single principal component = a linear combination of the observed variables.
  FA: Factor analysis provides models for primary variables: a single observed variable = a linear combination of factors (components) + error.

Interpretation
  PCA: Principal components have no interpretation.
  FA: Factors are subject to interpretation.

Ambiguity
  PCA: The obtained solution is unambiguous.
  FA: There are many different solutions.

Benefits
  PCA:
  • From the set of principal components, you can remove those components that have the smallest variance. In this case, most of the information contained in the set of primary variables will be represented by a smaller set of principal components. In addition, such a reduction can also be viewed as a lossy compression of the input data.
  • A reduced set of principal components enables more effective data clustering. Clustering in a reduced space is less computationally intensive.
  • In the case of reducing the principal components to two, there is a possibility of effective data visualization, as well as the assessment of the possibility of their clustering.
  • In the least squares method, the use of principal components instead of primary variables prevents errors resulting from the ill-conditioning of the matrix of a system of normal equations.
  FA: Factor analysis models n random variables against fewer hidden factors. This gives the following possibilities:
  • Hidden factors can be interpreted (identified). Their correct interpretation makes it possible to explain the random phenomenon represented by the primary variables, and thus to explain the common causes that influence the observed phenomenon.
  • Primary variables can be clustered due to their similarity to factors.
  • If the factor model is known, and the primary variables are not available, the factor model will enable the Monte Carlo simulation of the primary variables, and then the estimation of their statistical characteristics.
was found that the values of the factor loadings connecting the primary variables with the inde-
pendent factors are equal to the above-mentioned correlation coefficients between the primary
variables and the principal components obtained in the PCA. Consequently, the matrix of corre-
lation coefficients between primary variables and principal components obtained in the principal
components analysis is identical to the matrix of factor loadings obtained in the factor analysis.
This means that the two vector interpretations of the primary variables are not only analogous,
but identical.
The consequence of this vector representation is the possibility of using the Pythagorean theorem to describe the behavior of primary variables. On the other hand, the cosines between the individual vectors representing the primary variables are the same as the correlation coefficients between the corresponding primary variables.
• Percentage criterion of explained variance - the weakness of this criterion is that it re-
lates to the average variance of the primary variables represented by the selected factors.
Depending on the distribution of the obtained eigenvalues, the explained mean variance
of the primary variables may be relatively large, and the reconstructed variance of indi-
vidual primary variables may be negligible (Table 32, Fig. 11).
• Eigenvalue criterion called the Kaiser criterion - the fact that a given factor with an eigen-
value greater than one should have a variance greater than the variance of a single primary
variable does not mean that a factor with a variance of less than one will never represent
most of the variance of some primary variable. On the other hand, if a factor with an
eigenvalue less than one would represent most of the variance of some primary variable,
then that factor should not be rejected. It should also be added that this criterion does
not consider rotation, which can radically change the situation by assigning significant
factor loadings to the non-rejected factor.
• The criterion of half the number of primary variables - in practice, there may be situations
in which the mutual correlations between the variables are low. Then the number of
factors necessary to reproduce the variance of primary variables may be greater than half
the number of primary variables (see subsection 4.2.3).
The results of the analysis carried out lead to the conclusion that all the criteria discussed
above have deficits, and their application does not always lead to the correct determination
of the number of factors/components. Due to these deficits in determining the number of fac-
tors/components, inconsistencies can arise. However, since these criteria are blind to the vari-
ances of single primary variables, their greatest deficit is the inability to reproduce most of the
variances of single primary variables. Both the selected principal components in the principal
components analysis and the selected factors in the factor analysis may not sufficiently repro-
duce the variance of some individual primary variables. In response to the above deficits, a new
criterion for determining the number of factors in factor analysis was analyzed. This criterion
makes it possible to present most of the variances of each of the analyzed primary variables.
To enable the application of this criterion, an efficient algorithm for determining the number of
factors has been proposed.
The answer to the deficits presented above is the criterion which is also discussed in this
article. With regard to this criterion, an algorithm is proposed in section 4 that allows the
number of factors to be determined in the factor analysis in such a way that the factor model
can represent most of the variance of each of the primary variables. On the other hand, it should
be emphasized that:
• The matrix of factor loadings is identical to the matrix of correlation coefficients between
the original variables and the principal components obtained in the principal components
analysis.
• The algorithm for estimating factor loadings has a lower time and memory complexity than the algorithm for estimating the correlation coefficients between primary variables and principal components.
Therefore, the algorithm for determining the number of factors in factor analysis can also be
used to determine the number of principal components in principal component analysis. As
a result, the number of factors/components can be effectively determined so that most of the
variance of each of the primary variables can be represented, not just their mean variance:
• In factor analysis, the algorithm selects a sufficient number of factors so that the factor
model reproduces most of the variance for each of the primary variables.
• In principal components analysis, the algorithm selects enough principal components to represent most of the variance of each of the primary variables³.
5.2.5 Artifact
Subsection 3.4.1 describes the observed phenomenon (similarly to [1]) in which the product of the matrix L by the matrix U^T results in a symmetric matrix:

L · U^T = (L · U^T)^T.    (50)
The article [1] asks questions about the cause of the observed phenomenon, as well as its po-
tential application. Although there is still no answer to the second question, there is an answer
to the first question in the area of factor analysis. Since the matrix of factor loadings in FA
is identical to the matrix of correlation coefficients between primary variables and principal
components in PCA, this answer is also valid in PCA.
³ Principal components analysis can be viewed as lossy compression, where several principal components carry most of the information contained in the primary variables. Common sense says that lossy compression assumes that most of the information for all primary variables can be reconstructed from the compressed dataset. An unsatisfactory reconstruction of any primary variable would not achieve this lossy compression goal.
Formula (27) presents L as the product of the matrix U and the diagonal matrix S. The diagonal matrix S can be expressed as the square of the diagonal matrix D = √S:

S = D · D.    (51)

Hence:

L · U^T = U · D · D · U^T.    (52)

Using the associative law for the matrix product, the right side of the above expression can be grouped. Then the expression (52) takes the form:

L · U^T = (U · D) · (D · U^T).    (53)

Since the transposition of a diagonal matrix is the same matrix, the expression (53) can be expressed as follows:

L · U^T = (U · D) · (U · D)^T.    (54)
The right side of the expression (54) shows the product of a matrix by its transposition, so the product L · U^T is symmetric.
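This argument can also be checked numerically; a minimal sketch on synthetic data:

import numpy as np

rng = np.random.default_rng(6)
R = np.corrcoef(rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5)), rowvar=False)

eigval, U = np.linalg.eigh(R)
S = np.diag(np.sqrt(np.clip(eigval, 0.0, None)))   # S = sqrt(Lambda), formula (26)
L = U @ S                                          # factor loadings, formula (27)

M = L @ U.T
print(np.allclose(M, M.T))   # True: L * U^T is symmetric, in agreement with formula (50)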
One of the intentions of factor analysis is to cluster primary variables according to their similarity
to independent factors. In view of the facts presented above, it is reasonable to conclude that,
regardless of the type of analysis (FA or PCA), clustering of primary variables according to their
similarity to factors/components should be performed only with the use of the factor loadings
matrix (a minimal sketch of such clustering is given at the end of this subsection). It also seems
correct to conclude that this clustering of primary variables should not depend on factor rotation.
Before rotation, the factors are equivalent to the standardized principal components. Rotation
produces factors other than the standardized principal components. It can be assumed that, in the
case of a simple analysis of the factor loadings matrix (without the use of a computer), rotation
will only facilitate clustering: the new factors should allow for easier grouping
of the primary variables. This hypothesis should be tested in further research.
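As an illustration of the remark above, the following sketch clusters primary variables by treating each row of the loadings matrix L as the vector representing that variable. k-means is used here only as an example; the article does not prescribe a particular clustering method, and the function name and parameters are assumptions.

import numpy as np
from sklearn.cluster import KMeans

def cluster_primary_variables(L, n_clusters=2, random_state=0):
    # L: p x k factor loadings matrix; row i is the vector representing variable i.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    return km.fit_predict(np.asarray(L))   # one cluster label per primary variable

# Hypothetical usage with a loadings matrix Lk obtained earlier:
# labels = cluster_primary_variables(Lk, n_clusters=3)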
6 Conclusions
The article discusses selected problems related to both principal component analysis (PCA) and
factor analysis (FA). In particular, both types of analysis were compared. The comparison was
limited to principal components analysis, which uses a matrix of correlation coefficients instead
of a covariance matrix. Factor analysis was limited to exploratory factor analysis, which uses
principal components.
Comparing principal component analysis and factor analysis not only confirms the exis-
tence of many common elements in both types of analysis, but above all reveals three important
facts:
• The matrix of factor loadings is identical to the matrix of correlation coefficients between
primary variables and principal components obtained in principal components analysis.
• The algorithm for estimating the factor loadings has a lower time and memory com-
plexity than the algorithm for estimating the correlation coefficients between primary
variables and principal components.
• There is a vector interpretation of primary variables. In this interpretation, the respec-
tive factor loadings are the components of the vector that represents the given primary
variable.
Therefore, all operations performed on factors/components (determining their number) and on
primary variables (clustering) can be carried out on the factor loadings matrix. Moreover, the
vector interpretation of primary variables leads to useful conclusions and offers real possibilities
for its use:
• The Pythagorean theorem can be used to describe the behavior of primary variables.
• The cosines between the vectors representing the primary variables are identical to the
correlation coefficients between the corresponding primary variables.
• Based on the vector representation of the primary variables, the number of factors/com-
ponents can be determined so that they can represent most of the variances of all the
primary variables. For this purpose, an appropriate algorithm has been proposed.
• The condition for the number of factors/components, which enables the representation of
most of the variance of each of the primary variables, is a necessary and sufficient condi-
tion to determine the optimal number of principal components in principal components
analysis and a necessary condition to determine the optimal number of factors in factor
analysis.
• Reducing the number of factors/components is the same as reducing the size of the space
in which the primary variables are represented.
• Based on the Pythagorean theorem, it is possible to analyze the standard deviations and
variance of individual original variables by analyzing the lengths and squares of the
lengths of the respective vector components.
• Clustering of primary variables according to their mutual similarity, as well as according to
their similarity to factors in factor analysis or to principal components in principal component
analysis, can be performed by clustering the corresponding vectors (points) according to their
mutual similarity and their similarity to the factors/components.
In addition to the practical aspects considered in the article, it is also worth noting an aspect
that probably has no practical significance, but is somewhat surprising. This concerns the
artifact observed for the vector representation of the primary variables in both
PCA and FA. Multiplying the matrix of row vectors representing the primary variables by
the transpose of the matrix of eigenvectors, U^T, yields a symmetric matrix. In this
context, questions arose about the cause of the phenomenon and about the possibility of its use.
The second question remains open; the article provides an algebraic answer to the first.
Acknowledgments
The author expresses his gratitude to Andrzej Ptasznik for making the Weather Data available
for analysis.
References
[1] Z. Gniazdowski, “New Interpretation of Principal Components Analysis,” Zeszyty
Naukowe WWSI, vol. 11, no. 16, pp. 43–65, 2017. [Online]. Available:
https://www.doi.org/10.26348/znwwsi.16.43
[2] P. Francuz and R. Mackiewicz, Liczby nie wiedzą, skąd pochodzą. Przewodnik po
metodologii i statystyce nie tylko dla psychologów. Lublin: Wydawnictwo KUL, 2007.
[3] Z. Gniazdowski, “Geometric interpretation of a correlation,” Zeszyty Naukowe WWSI,
vol. 7, no. 9, pp. 27–35, 2013. [Online]. Available:
https://www.doi.org/10.26348/znwwsi.9.27
[4] J. Legras, Praktyczne metody analizy numerycznej. Wydawnictwa Naukowo-Techniczne,
1974.
[5] D. T. Larose, Data mining methods & models. John Wiley & Sons, 2006.
[6] E. Mooi, M. Sarstedt, and I. Mooi-Reci, “Principal Component and Factor
Analysis,” in Market Research: The Process, Data, and Methods Using Stata.
Singapore: Springer Singapore, 2018, pp. 265–311. [Online]. Available:
https://doi.org/10.1007/978-981-10-5218-7_8
[7] B. Thompson, Exploratory and confirmatory factor analysis: Understanding concepts
and applications. Washington, DC: American Psychological Association, 2004.
[8] R. A. Fisher, “The use of multiple measurements in taxonomic problems,”
Annals of eugenics, vol. 7, no. 2, pp. 179–188, 1936. [Online]. Available:
https://doi.org/10.1111/j.1469-1809.1936.tb02137.x
[9] S. Ertel, Factor analysis: Healing an ailing model. Universitätsverlag Göttingen, 2013.
[10] C. F. Hofacker, Mathematical marketing. New South Network Services, 2007.
[11] A. Phakiti, “Exploratory factor analysis,” in The Palgrave handbook of applied linguistics
research methodology. Springer, 2018, pp. 423–457.
[12] H. F. Kaiser, “The varimax criterion for analytic rotation in factor analysis,”
Psychometrika, vol. 23, no. 3, pp. 187–200, 1958. [Online]. Available:
https://doi.org/10.1007/BF02289233
[13] H. Abdi, “Factor rotations in factor analyses,” in Encyclopedia for Research Methods for
the Social Sciences. Thousand Oaks, CA: Sage, 2003, pp. 792–795.
[14] M. Loève, “Elementary Probability Theory,” in Probability Theory I. New
York, NY: Springer New York, 1977, pp. 1–52. [Online]. Available:
https://doi.org/10.1007/978-1-4684-9464-8_1
[15] R. K. Pace and R. Barry, “Sparse spatial autoregressions,” Statistics & Probability
Letters, vol. 33, no. 3, pp. 291–297, 1997. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S016771529600140X
[16] ——, “Data from: Sparse spatial autoregressions,” 1999. [Online]. Available:
http://lib.stat.cmu.edu/datasets/houses.zip
[17] Z. Gniazdowski and D. Kaliszewski, “On the clustering of correlated random variables,”
Zeszyty Naukowe WWSI, vol. 12, no. 18, pp. 45–114, 2018. [Online]. Available:
https://www.doi.org/10.26348/znwwsi.18.45
[18] J. W. Poelstra, N. Vijay, M. Hoeppner, and J. B. Wolf, “Transcriptomics of colour pattern-
ing and coloration shifts in crows,” Molecular ecology, vol. 24, no. 18, pp. 4617–4628,
2015.
[19] J. W. Poelstra, N. Vijay, M. P. Höppner, and J. B. W. Wolf, “Data from: Transcriptomics
of colour patterning and colouration shifts in crows,” 2015. [Online]. Available:
https://doi.org/10.5061/dryad.hv333