Multiple Factor Analysis Overview
(MFA)
Hervé Abdi¹ & Dominique Valentin
1 Overview
1.1 Origin and goal of the method
Multiple factor analysis (MFA; see Escofier and Pagès, 1990, 1994)
analyzes observations described by several “blocks” or sets of
variables. MFA seeks the common structures present in all or some of
these sets. MFA is performed in two steps. First, a principal
component analysis (PCA) is performed on each data set, which is then
“normalized” by dividing all its elements by the square root of the
first eigenvalue obtained from its PCA. Second, the normalized
data sets are merged to form a single matrix, and a global PCA is
performed on this matrix. The individual data sets are then
projected onto the global analysis to analyze communalities and
discrepancies. MFA is used in very different domains such as sensory
evaluation, economics, ecology, and chemistry.
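The two steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `mfa_eigenvalues` and the random toy blocks are hypothetical, columns are assumed to be already centered, and the eigenvalues of a PCA are taken to be the squared singular values of the data matrix.

```python
import numpy as np

def mfa_eigenvalues(blocks):
    """Two-step MFA sketch: normalize each block, then run a global PCA.

    `blocks` is a list of (I x J_t) arrays describing the same I observations.
    Columns are assumed to be centered already.
    """
    normalized = []
    for X in blocks:
        # Step 1: the first singular value of the block is the square root
        # of the first eigenvalue of that block's PCA.
        first_singular_value = np.linalg.svd(X, compute_uv=False)[0]
        normalized.append(X / first_singular_value)
    # Step 2: merge the normalized blocks and run the global PCA via the SVD.
    Z = np.hstack(normalized)
    delta = np.linalg.svd(Z, compute_uv=False)
    return normalized, delta ** 2

# Hypothetical toy data: 6 observations, blocks with 3, 4, and 3 variables.
rng = np.random.default_rng(0)
blocks = []
for n_vars in (3, 4, 3):
    X = rng.normal(size=(6, n_vars))
    blocks.append(X - X.mean(axis=0))

normalized, eigenvalues = mfa_eigenvalues(blocks)
```

After the normalization, every block has a first singular value of 1, so no single block can dominate the first global component, and the first global eigenvalue cannot exceed the number of blocks.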
¹ In: Neil Salkind (Ed.) (2007). Encyclopedia of Measurement and Statistics.
Thousand Oaks (CA): Sage.
Address correspondence to: Hervé Abdi
Program in Cognition and Neurosciences, MS: Gr.4.1,
The University of Texas at Dallas,
Richardson, TX 75083–0688, USA
E-mail: [email protected] http://www.utd.edu/~herve
H. Abdi & D. Valentin: Multiple Factor Analysis
2 An example
To illustrate MFA, we selected six wines from the same harvest of
Pinot Noir, aged in six different barrels made with one of
two different types of oak. Wines 1, 5, and 6 were aged with the first
type of oak, and wines 2, 3, and 4 with the second. Next, we asked
each of three wine experts to choose from two to five variables to
describe the six wines. For each wine, the expert rated the
intensity of the variables on a 9-point scale. The results are presented
in Table 1 (the same example is used in the entry for STATIS). The
goal of the analysis is twofold: first, we want to obtain a typology
of the wines; second, we want to know whether the experts agree.
3 Notations
The raw data consist of T data sets. Each data set is called a study.
Each study is an I × J[t] rectangular data matrix denoted Y[t], where
I is the number of observations and J[t] the number of variables of
the t-th study. Each data matrix is, in general, preprocessed (e.g.,
centered, normalized), and the preprocessed data matrices
actually used in the analysis are denoted X[t].
For our example, the data consist of T = 3 studies. The data
(from Table 1) were centered by column (i.e., the mean of each
column is zero) and normalized (i.e., for each column, the sum of the
squared elements is equal to 1). So, the starting point of the
analysis consists of three matrices:
\[
\mathbf{X}_{[1]} =
\begin{bmatrix}
-0.57 &  0.58 &  0.76 \\
 0.19 & -0.07 & -0.28 \\
 0.38 & -0.50 & -0.48 \\
 0.57 & -0.50 & -0.28 \\
-0.38 &  0.36 &  0.14 \\
-0.19 &  0.14 &  0.14
\end{bmatrix},
\qquad
\mathbf{X}_{[2]} =
\begin{bmatrix}
-0.50 &  0.35 &  0.57 &  0.54 \\
 0.00 &  0.05 &  0.03 & -0.32 \\
 0.25 & -0.56 & -0.51 & -0.54 \\
 0.75 & -0.56 & -0.51 & -0.32 \\
-0.25 &  0.35 &  0.39 &  0.32 \\
-0.25 &  0.35 &  0.03 &  0.32
\end{bmatrix},
\]
and
\[
\mathbf{X}_{[3]} =
\begin{bmatrix}
-0.03 &  0.31 &  0.57 \\
 0.17 & -0.06 & -0.19 \\
 0.80 & -0.61 & -0.57 \\
-0.24 & -0.43 & -0.38 \\
-0.24 &  0.31 &  0.38 \\
-0.45 &  0.49 &  0.19
\end{bmatrix}.
\tag{1}
\]

Table 1: Raw data for the wine example

                  Expert 1                Expert 2                              Expert 3
Wine   Oak type   Fruity  Woody  Coffee   Red Fruit  Roasted  Vanillin  Woody   Fruity  Butter  Woody
wine1     1         1       6      7          2         5        7        6       3       6       7
wine2     2         5       3      2          4         4        4        2       4       4       3
wine3     2         6       1      1          5         2        1        1       7       1       1
wine4     2         7       1      2          7         2        1        2       2       2       2
wine5     1         2       5      4          3         5        6        5       2       6       6
wine6     1         3       4      4          3         5        4        5       1       7       5
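The preprocessing described above (column centering, then scaling each column to a unit sum of squares) can be checked on the first expert's ratings from Table 1. The sketch below is illustrative only; the helper name `center_and_normalize` is hypothetical, and the result agrees with X[1] up to two-decimal rounding.

```python
import numpy as np

# Expert 1 ratings from Table 1 (rows: wines 1 to 6; three variables).
Y1 = np.array([
    [1, 6, 7],
    [5, 3, 2],
    [6, 1, 1],
    [7, 1, 2],
    [2, 5, 4],
    [3, 4, 4],
], dtype=float)

def center_and_normalize(Y):
    """Center each column, then scale it so its sum of squares equals 1."""
    centered = Y - Y.mean(axis=0)
    return centered / np.sqrt((centered ** 2).sum(axis=0))

X1 = center_and_normalize(Y1)
print(np.round(X1, 2))
```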
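and
\[
\operatorname{diag}\{\boldsymbol{\Delta}\} =
\begin{bmatrix} 1.68 & 0.60 & 0.34 & 0.18 & 0.11 \end{bmatrix}
\]
and
\[
\operatorname{diag}\{\boldsymbol{\Lambda}\} =
\operatorname{diag}\{\boldsymbol{\Delta}^{2}\} =
\begin{bmatrix} 2.83 & 0.36 & 0.11 & 0.03 & 0.01 \end{bmatrix}
\tag{6}
\]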
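As a quick check of Equation 6, the eigenvalues are the squared singular values, and dividing each eigenvalue by their sum gives the proportion of inertia explained by each component. The sketch below uses only the rounded values printed above, so the percentages it produces are approximate; the variable names are illustrative.

```python
import numpy as np

# Rounded values reported in Equation 6.
delta = np.array([1.68, 0.60, 0.34, 0.18, 0.11])
eigenvalues = np.array([2.83, 0.36, 0.11, 0.03, 0.01])

# The eigenvalues equal the squared singular values, up to rounding.
squares_match = np.allclose(delta ** 2, eigenvalues, atol=0.011)

# Percentage of inertia explained by each component.
percent_inertia = 100 * eigenvalues / eigenvalues.sum()
print(squares_match, np.round(percent_inertia, 1))
```

Because the inputs are rounded to two decimals, the first percentage falls between 84% and 85%, which is the range of values quoted for the first component in the figures and tables of this entry.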
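The global factor scores for the wines are obtained as:
\[
\mathbf{F} = \mathbf{M}^{-\frac{1}{2}} \mathbf{U} \boldsymbol{\Delta}
\tag{8}
\]
\[
= \begin{bmatrix}
 2.18 & -0.51 & -0.48 & -0.02 &  0.08 \\
-0.56 & -0.20 &  0.41 &  0.23 &  0.15 \\
-2.32 & -0.83 &  0.01 & -0.16 & -0.07 \\
-1.83 &  0.90 & -0.40 &  0.07 &  0.01 \\
 1.40 &  0.05 &  0.13 &  0.17 & -0.20 \\
 1.13 &  0.58 &  0.34 & -0.29 &  0.03
\end{bmatrix}.
\tag{9}
\]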
5 Partial analyses
The global analysis reveals the common structure of the wine space.
In addition, we want to see how each expert “interprets” this space.
This is achieved by projecting the data set of each expert onto the
global analysis.
Figure 1: Global analysis: plot of the wines on the first two
principal components. First component: λ1 = 2.83, explains 84% of the
inertia; second component: λ2 = 0.36, explains 11% of the inertia.
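This shows that $\mathbf{P} = \mathbf{M}^{-\frac{1}{2}} \mathbf{U} \boldsymbol{\Delta}^{-1}$
is a projection matrix which transforms the matrix $\mathbf{Z}\mathbf{Z}^{T}$
into factor scores. Here, we obtain:
\[
\mathbf{P} = \mathbf{M}^{-\frac{1}{2}} \mathbf{U} \boldsymbol{\Delta}^{-1} =
\begin{bmatrix}
 0.77 & -1.43 & -4.20 & -0.55 &   6.68 \\
-0.20 & -0.55 &  3.56 &  6.90 &  11.71 \\
-0.82 & -2.33 &  0.05 & -4.85 &  -5.53 \\
-0.65 &  2.54 & -3.46 &  2.01 &   0.66 \\
 0.50 &  0.15 &  1.13 &  5.27 & -15.93 \\
 0.40 &  1.61 &  2.91 & -8.78 &   2.43
\end{bmatrix},
\tag{11}
\]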
The projection matrix is then used to project the studies onto
the global space. For example, for the first expert we obtain
\[
\mathbf{F}_{[1]} = T \times \left( \mathbf{Z}_{[1]} \mathbf{Z}_{[1]}^{T} \right) \mathbf{P}
\tag{12}
\]
\[
= \begin{bmatrix}
 2.76 & -1.10 & -2.29 & -0.39 &  0.67 \\
-0.77 &  0.30 &  0.81 &  0.31 & -0.27 \\
-1.99 &  0.81 &  1.48 &  0.08 & -0.39 \\
-1.98 &  0.93 &  0.92 & -0.02 &  0.59 \\
 1.29 & -0.62 & -0.49 &  0.10 & -0.51 \\
 0.69 & -0.30 & -0.43 & -0.07 & -0.08
\end{bmatrix},
\tag{13}
\]
and
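\[
\mathbf{F}_{[3]} =
\begin{bmatrix}
 1.54 &  0.44 &  0.09 &  0.07 & -0.47 \\
-0.61 & -0.76 &  0.06 & -0.17 &  0.19 \\
-2.85 & -3.80 & -0.69 & -0.07 &  0.19 \\
-1.12 &  0.56 & -0.55 &  0.42 &  0.11 \\
 1.43 &  1.27 &  0.26 &  0.03 & -0.22 \\
 1.62 &  2.28 &  0.82 & -0.28 &  0.20
\end{bmatrix}.
\tag{15}
\]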
Figure 2 shows the first two principal components of the global
analysis along with the wine projections for the experts. Note that
the position of each wine in the global analysis is the barycenter
(i.e., centroid) of its positions for the experts. To facilitate the
interpretation, we have drawn lines linking each expert's wine
projection to the global wine position. This picture shows that Expert 3
is at variance with the other experts, in particular for Wines 3 and 6.
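The barycenter property can be verified numerically on the data of Table 1. The sketch below makes several assumptions that are not stated in this section and should be read as such: observations are given equal masses 1/I (so that multiplying by M^(-1/2) amounts to multiplying by the square root of I), each study is normalized by its first singular value, and only the dimensions with a nonzero singular value are kept. All function and variable names are illustrative.

```python
import numpy as np

# Ratings from Table 1, one block per expert (rows: wines 1 to 6).
Y = [
    np.array([[1, 6, 7], [5, 3, 2], [6, 1, 1],
              [7, 1, 2], [2, 5, 4], [3, 4, 4]], dtype=float),
    np.array([[2, 5, 7, 6], [4, 4, 4, 2], [5, 2, 1, 1],
              [7, 2, 1, 2], [3, 5, 6, 5], [3, 5, 4, 5]], dtype=float),
    np.array([[3, 6, 7], [4, 4, 3], [7, 1, 1],
              [2, 2, 2], [2, 6, 6], [1, 7, 5]], dtype=float),
]

def preprocess(block):
    """Center each column and scale it to a unit sum of squares."""
    centered = block - block.mean(axis=0)
    return centered / np.sqrt((centered ** 2).sum(axis=0))

# Normalize each study by its first singular value, then merge.
Z_blocks = []
for block in Y:
    X = preprocess(block)
    Z_blocks.append(X / np.linalg.svd(X, compute_uv=False)[0])
Z = np.hstack(Z_blocks)
n_obs, n_studies = Z.shape[0], len(Z_blocks)

# Global PCA; drop the null dimension created by column centering.
U, delta, _ = np.linalg.svd(Z, full_matrices=False)
keep = delta > 1e-10 * delta[0]
U, delta = U[:, keep], delta[keep]

# Assumption: equal masses 1/I, so M^(-1/2) is sqrt(I) times the identity.
F = np.sqrt(n_obs) * U * delta   # global factor scores (Equation 8)
P = np.sqrt(n_obs) * U / delta   # projection matrix (Equation 11)

# Partial factor scores of each study (Equation 12) and their barycenter.
partial = [n_studies * (Zt @ Zt.T) @ P for Zt in Z_blocks]
barycenter = sum(partial) / n_studies
print(np.allclose(barycenter, F))
```

Under these assumptions the average of the three partial score matrices reproduces the global factor scores. The signs of the columns returned by an SVD routine are arbitrary, so individual columns may be flipped relative to the matrices printed in this entry.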
Figure 2: Projection of the experts onto the global analysis. Experts are represented by their faces. A line
segment links the position of the wine for a given expert to its global position. First component: λ1 = 2.83,
explains 84% of the inertia; second component: λ2 = 0.36, explains 11% of the inertia.
Table 2: Loadings on the principal components of the global analysis of (1) the original variables and (2) the
principal components of the study PCAs. Only the first three dimensions are kept.

                   Expert 1                 Expert 2                              Expert 3
Axis   λ      %    Fruity  Woody  Coffee    Red Fruit  Roasted  Vanillin  Woody   Fruity  Butter  Woody
1      2.83   85   −0.97    0.98   0.92      −0.89       0.96     0.95     0.97   −0.59    0.95    0.99
2      0.36   11    0.23   −0.15  −0.06       0.38      −0.00    −0.20     0.10   −0.80    0.19    0.00
3      0.12    3    0.02   −0.02  −0.37      −0.21       0.28    −0.00    −0.14    0.08    0.24   −0.11
[Figure (caption not recovered): plot of the variables Fruity, Woody, Coffee, Red Fruit, Roasted, Vanillin, and Butter together with the first and second principal components (PC 1, PC 2) of the study PCAs.]
Figure 4: Partial inertia: plot of the experts on the first two
components. Dimension 1: λ1 = 2.83, τ1 = 85%; Dimension 2: λ2 = 0.36,
τ2 = 11%.
obtained as
\[
\sum_{j}^{J_k} \lambda_1 \times q_{j,1}^{2}
= 2.83 \times \left[ (-.34)^2 + (.35)^2 + (.32)^2 \right]
= 2.83 \times .34 = .96 \, .
\tag{16}
\]
Similar computations give the values reported in Table 3. These
values are used to plot the studies as shown in Figure 4. The plot
confirms the originality of Expert 3 and its importance for
Dimension 2.
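The arithmetic of Equation 16 can be reproduced directly: the partial inertia of a study on a dimension is the eigenvalue of that dimension multiplied by the sum of the squared loadings of the study's variables. The short sketch below uses only the rounded numbers printed in Equation 16; the variable names are illustrative.

```python
# Values taken from Equation 16 (rounded to two decimals in the text).
lambda_1 = 2.83
loadings_dim1 = [-0.34, 0.35, 0.32]

# Partial inertia: eigenvalue times the sum of squared loadings.
sum_of_squares = sum(q ** 2 for q in loadings_dim1)
partial_inertia = lambda_1 * sum_of_squares
print(round(sum_of_squares, 2), round(partial_inertia, 2))
```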
References
[1] Abdi, H. (2003). Multivariate analysis. In M. Lewis-Beck, A. Bryman,
& T. Futing (Eds.), Encyclopedia for research methods for
the social sciences. Thousand Oaks: Sage.
[2] Escofier, B. and Pagès, J. (1990). Analyses factorielles simples et
multiples: objectifs, méthodes, interprétation. Dunod, Paris.
[3] Escofier, B. and Pagès, J. (1994). Multiple factor analysis.
Computational Statistics & Data Analysis, 18, 121–140.