Multiple Correspondence Analysis

H. Abdi & D. Valentin
1 Overview
Multiple correspondence analysis (MCA) is an extension of correspondence analysis (CA) which allows one to analyze the pattern of relationships of several categorical dependent variables. As such, it can also be seen as a generalization of principal component analysis when the variables to be analyzed are categorical instead of quantitative. Because MCA has been (re)discovered many times, equivalent methods are known under several different names, such as optimal scaling, optimal or appropriate scoring, dual scaling, homogeneity analysis, scalogram analysis, and quantification method.
Technically, MCA is obtained by using a standard correspondence analysis on an indicator matrix (i.e., a matrix whose entries are 0 or 1). The percentages of explained variance need to be corrected, and the correspondence analysis interpretation of inter-point distances needs to be adapted.
In: Neil Salkind (Ed.) (2007). Encyclopedia of Measurement and Statistics. Thousand Oaks (CA): Sage.
Address correspondence to: Hervé Abdi, Program in Cognition and Neurosciences, MS: Gr.4.1, The University of Texas at Dallas, Richardson, TX 75083-0688, USA. E-mail: [email protected], http://www.utd.edu/~herve
2 When to use it
MCA is used to analyze a set of observations described by a set of nominal variables. Each nominal variable comprises several levels, and each of these levels is coded as a binary variable. For example, gender (F vs. M) is one nominal variable with two levels. The pattern for a male respondent will be 0 1, and 1 0 for a female respondent. The complete data table is composed of binary columns with one and only one column taking the value "1" per nominal variable.

MCA can also accommodate quantitative variables by recoding them as "bins." For example, a score with a range of −5 to +5 could be recoded as a nominal variable with three levels: less than 0, equal to 0, or more than 0. With this schema, a value of 3 will be expressed by the pattern 0 0 1. The coding schema of MCA implies that each row has the same total, which for CA implies that each row has the same mass.
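As an illustration, here is a minimal Python sketch of this coding scheme (the helper names and toy data are ours, not part of the original entry):

```python
import numpy as np

def one_hot(responses, levels):
    """Disjunctive coding: each response becomes a 0/1 row with a
    single 1 in the column of its level."""
    return np.array([[int(r == lev) for lev in levels] for r in responses])

# Gender (F vs. M): one nominal variable, two binary columns.
print(one_hot(["M", "F"], levels=["F", "M"]))
# [[0 1]     male   -> 0 1
#  [1 0]]    female -> 1 0

# A score ranging from -5 to +5, recoded into three bins.
def bin_score(s):
    return "<0" if s < 0 else ("=0" if s == 0 else ">0")

print(one_hot([bin_score(s) for s in [3, 0, -4]], levels=["<0", "=0", ">0"]))
# [[0 0 1]    3 -> "more than 0" -> 0 0 1
#  [0 1 0]    0 -> "equal to 0"  -> 0 1 0
#  [1 0 0]]  -4 -> "less than 0" -> 1 0 0
```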
3 An example
We illustrate the method with an example from wine tasting. Suppose that we want to evaluate the effect of the oak species on barrel-aged red Burgundy wines. First, we aged wine coming from the same harvest of Pinot Noir in six different barrels made with two types of oak. Wines 1, 5, and 6 were aged with the first type of oak, whereas wines 2, 3, and 4 were aged with the second. Next, we asked each of three wine experts to choose from two to five variables to describe the wines. For each wine and for each variable, the expert was asked to rate the intensity. The answer given by the expert was coded either as a binary answer (i.e., fruity vs. non-fruity) or as a ternary answer (i.e., no vanilla, a bit of vanilla, clear smell of vanilla). Each binary answer is represented by 2 binary columns (e.g., the answer "fruity" is represented by the pattern 1 0, and "non-fruity" by 0 1). A ternary answer is represented by 3 binary columns (e.g., the answer "some vanilla" is represented by the pattern 0 1 0). The results are presented in Table 1 (the same data are used to illustrate STATIS and multiple factor analysis; see the respective entries). The goal of the analysis is twofold: first, to obtain a typology of the wines, and second, to discover whether there is an agreement between the scales used by the experts.
4 Notations
There are K nominal variables; each nominal variable has J_k levels, and the sum of the J_k is equal to J. There are I observations. The I × J indicator matrix is denoted X. Performing CA on the indicator matrix provides two sets of factor scores: one for the rows and one for the columns. These factor scores are, in general, scaled such that their variance is equal to their corresponding eigenvalue (some versions of CA compute row factor scores normalized to unity).
The grand total of the table is denoted N, and the first step of the analysis is to compute the probability matrix Z = N^{-1}X. We denote by r the vector of the row totals of Z (i.e., r = Z1, with 1 a conformable vector of 1s), by c the vector of the column totals, and set D_c = diag{c}, D_r = diag{r}. The factor scores are obtained from the following singular value decomposition:
$$
\mathbf{D}_r^{-\frac{1}{2}} \left( \mathbf{Z} - \mathbf{r}\mathbf{c}^{\mathsf{T}} \right) \mathbf{D}_c^{-\frac{1}{2}} = \mathbf{P} \boldsymbol{\Delta} \mathbf{Q}^{\mathsf{T}} \qquad (1)
$$

with $\boldsymbol{\Delta}$ the diagonal matrix of the singular values (the eigenvalues are the squared singular values: $\lambda_\ell = \delta_\ell^2$). The row and the column factor scores are then obtained, respectively, as

$$
\mathbf{F} = \mathbf{D}_r^{-\frac{1}{2}} \mathbf{P} \boldsymbol{\Delta} \qquad (2)
$$

$$
\mathbf{G} = \mathbf{D}_c^{-\frac{1}{2}} \mathbf{Q} \boldsymbol{\Delta} \qquad (3)
$$
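In code, Equations 1 to 3 reduce to a few lines of linear algebra. A minimal Python sketch (our own illustration, not part of the original entry), assuming numpy:

```python
import numpy as np

def mca_factor_scores(X):
    """CA of an indicator matrix X (Equations 1-3): returns row factor
    scores F, column factor scores G, and the eigenvalues."""
    Z = X / X.sum()                          # probability matrix, Z = N^-1 X
    r = Z.sum(axis=1)                        # row masses
    c = Z.sum(axis=0)                        # column masses
    # D_r^(-1/2) (Z - r c^T) D_c^(-1/2) = P Delta Q^T          (Eq. 1)
    S = (Z - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    P, delta, Qt = np.linalg.svd(S, full_matrices=False)
    F = P * delta / np.sqrt(r)[:, None]      # F = D_r^(-1/2) P Delta (Eq. 2)
    G = Qt.T * delta / np.sqrt(c)[:, None]   # G = D_c^(-1/2) Q Delta (Eq. 3)
    return F, G, delta ** 2                  # eigenvalues = squared singular values
```

Applied to the 22 active columns of Table 1, the recovered eigenvalues (`delta ** 2`) should match the λ_I column of Table 2 (.8532, .2000, .1151, .0317).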
Table 1: Data for the barrel-aged red Burgundy wines example. "Oak Type" is an illustrative (supplementary) variable; the wine W? is an unknown wine treated as a supplementary observation. Expert 1 rated fruity (y/n), woody (1/2/3), and coffee (y/n); Expert 2 rated red fruit (y/n), roasted (y/n), vanillin (1/2/3), and woody (y/n); Expert 3 rated fruity (y/n), butter (y/n), and woody (y/n).

            Expert 1                Expert 2                        Expert 3
      Oak   fruity  woody  coffee   red fruit  roasted  vanillin  woody   fruity  butter  woody
Wine  Type  y  n    1 2 3  y  n     y  n       y  n     1 2 3     y  n    y  n    y  n    y  n
W1    1     1  0    0 0 1  0  1     1  0       0  1     0 0 1     0  1    0  1    0  1    0  1
W2    2     0  1    0 1 0  1  0     0  1       1  0     0 1 0     1  0    0  1    1  0    1  0
W3    2     0  1    1 0 0  1  0     0  1       1  0     1 0 0     1  0    0  1    1  0    1  0
W4    2     0  1    1 0 0  1  0     0  1       1  0     1 0 0     1  0    1  0    1  0    1  0
W5    1     1  0    0 0 1  0  1     1  0       0  1     0 0 1     0  1    1  0    0  1    0  1
W6    1     1  0    0 1 0  0  1     1  0       0  1     0 1 0     0  1    1  0    0  1    0  1
W?    ?     0  1    0 1 0  .5 .5    1  0       1  0     0 1 0     .5 .5   1  0    .5 .5   0  1
The squared cosine between row i and factor ℓ, and between column j and factor ℓ, are obtained, respectively, as:

$$
o_{i,\ell} = \frac{f_{i,\ell}^{2}}{d_{r,i}^{2}}
\quad\text{and}\quad
o_{j,\ell} = \frac{g_{j,\ell}^{2}}{d_{c,j}^{2}} \qquad (4)
$$

(with $d_{r,i}^{2}$ and $d_{c,j}^{2}$ being, respectively, the i-th element of $\mathbf{d}_r$ and the j-th element of $\mathbf{d}_c$, the vectors of squared distances from the rows and the columns to their barycenter, i.e., $\mathbf{d}_r = \operatorname{diag}\{\mathbf{F}\mathbf{F}^{\mathsf{T}}\}$ and $\mathbf{d}_c = \operatorname{diag}\{\mathbf{G}\mathbf{G}^{\mathsf{T}}\}$). Squared cosines help locate the factors important for a given observation or variable.
The contribution of row i to factor ℓ and of column j to factor ℓ are obtained, respectively, as:

$$
t_{i,\ell} = \frac{r_{i}\, f_{i,\ell}^{2}}{\lambda_\ell}
\quad\text{and}\quad
t_{j,\ell} = \frac{c_{j}\, g_{j,\ell}^{2}}{\lambda_\ell} \qquad (5)
$$

(with $r_i$ and $c_j$ the masses of row i and column j). Contributions sum to one over the rows (or the columns) for each factor, and help locate the observations or variables important for a given factor.
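Both descriptives are straightforward to compute from the factor scores. A sketch, assuming F holds the scores of all factors for one set of points and masses holds the corresponding masses (r or c):

```python
import numpy as np

def squared_cosines(F):
    """Eq. 4: share of a point's squared distance to the barycenter
    that lies along each factor (F must contain all the factors)."""
    d2 = (F ** 2).sum(axis=1, keepdims=True)  # squared distances to barycenter
    return F ** 2 / d2

def contributions(F, masses, eigenvalues):
    """Eq. 5: share of each factor's inertia due to each point;
    each column of the result sums to 1."""
    masses = np.asarray(masses, dtype=float)
    return masses[:, None] * F ** 2 / eigenvalues
```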
5 Eigenvalue correction

Coding each nominal variable as several binary columns artificially inflates the dimensionality of the data, and so the eigenvalues obtained from the indicator matrix give a pessimistic view of the amount of information extracted. Benzécri (1979) proposed to correct the eigenvalues λ_ℓ, keeping only those larger than 1/K (with K the number of nominal variables):

$$
{}_c\lambda_\ell =
\begin{cases}
\left[ \dfrac{K}{K-1} \left( \lambda_\ell - \dfrac{1}{K} \right) \right]^{2} & \text{if } \lambda_\ell > \dfrac{1}{K} \\[2ex]
0 & \text{if } \lambda_\ell \le \dfrac{1}{K}
\end{cases}
\qquad (7)
$$
Using this formula gives a better estimate of the inertia extracted by each eigenvalue.
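The correction is a single line of arithmetic. A sketch, checked against Table 2 (this example has K = 10 nominal variables):

```python
import numpy as np

def benzecri_correction(eigenvalues, K):
    """Eq. 7: keep only the eigenvalues above 1/K and rescale them."""
    lam = np.asarray(eigenvalues, dtype=float)
    corrected = ((K / (K - 1.0)) * (lam - 1.0 / K)) ** 2
    return np.where(lam > 1.0 / K, corrected, 0.0)

# The indicator-matrix eigenvalues of Table 2, with K = 10 variables:
print(benzecri_correction([.8532, .2000, .1151, .0317], K=10).round(4))
# -> [0.7004 0.0123 0.0003 0.    ]
```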
Traditionally, the percentages of inertia are computed by dividing each eigenvalue by the sum of the eigenvalues, and this approach could be used here also. However, it would give an optimistic estimate of the percentage of inertia. A better estimate of the inertia has been proposed by Greenacre (1993), who suggested instead to evaluate the percentage of inertia relative to the average inertia of the off-diagonal blocks of the Burt matrix. This average inertia, denoted $\bar{\mathcal{I}}$, is computed as:

$$
\bar{\mathcal{I}} = \frac{K}{K-1} \times \left( \sum_\ell \lambda_\ell^{2} - \frac{J-K}{K^{2}} \right) \qquad (8)
$$
Table 2: Eigenvalues, corrected eigenvalues, proportions of explained inertia, and corrected proportions of explained inertia. The eigenvalues of the Burt matrix are equal to the squared eigenvalues of the indicator matrix; the corrected eigenvalues for Benzécri and Greenacre are the same, but the proportions of explained variance differ. Eigenvalues are denoted by λ, proportions of explained inertia by τ (note that the average inertia used to compute Greenacre's correction is equal to Ī = .7358). Subscripts: I = indicator matrix, B = Burt matrix, Z = Benzécri correction, c = Greenacre correction.

Factor    λ_I      τ_I      λ_B      τ_B      λ_Z      τ_Z      λ_c      τ_c
1         .8532    .7110    .7280    .9306    .7004    .9823    .7004    .9519
2         .2000    .1667    .0400    .0511    .0123    .0173    .0123    .0168
3         .1151    .0959    .0133    .0169    .0003    .0004    .0003    .0004
4         .0317    .0264    .0010    .0013    0        0        0        0
Σ        1.2000   1         .7822   1         .7130   1         .7130    .9691
The corrected proportions of explained inertia are then obtained by dividing each corrected eigenvalue by this average inertia rather than by the sum of the corrected eigenvalues:

$$
\tau_c = \frac{{}_c\lambda}{\bar{\mathcal{I}}}
\quad\text{instead of}\quad
\frac{{}_c\lambda}{\sum_\ell {}_c\lambda_\ell}\,. \qquad (9)
$$
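Putting Equations 7 through 9 together (a sketch; this example has K = 10 variables and J = 22 levels):

```python
import numpy as np

def greenacre_proportions(eigenvalues, K, J):
    """Eqs. 7-9: corrected eigenvalues divided by the average
    off-diagonal inertia of the Burt matrix."""
    lam = np.asarray(eigenvalues, dtype=float)
    clam = np.where(lam > 1/K, ((K / (K - 1.0)) * (lam - 1.0/K)) ** 2, 0.0)
    avg_inertia = (K / (K - 1.0)) * ((lam ** 2).sum() - (J - K) / K**2)  # Eq. 8
    return clam / avg_inertia                                            # Eq. 9

# Table 2 again: reproduces tau_c = .9519, .0168, .0004, 0
print(greenacre_proportions([.8532, .2000, .1151, .0317], K=10, J=22).round(4))
```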
6 Interpreting MCA
As with CA, the interpretation in MCA is often based upon proximities between points in a low-dimensional map (i.e., two or three dimensions). As in CA, proximities are meaningful only between points from the same set (i.e., rows with rows, columns with columns). Specifically, when two row points are close to each other, they tend to select the same levels of the nominal variables. For the proximity between variables we need to distinguish two cases. First, the proximity between levels of different nominal variables means that these levels tend to appear together in the observations. Second, because the levels of the same nominal variable cannot occur together, we need a different type of interpretation for this case. Here the proximity between levels means that the groups of observations associated with these two levels are themselves similar.
Table 3: Factor scores for the observations (I-set). The eigenvalues and proportions of explained inertia are corrected using the Benzécri/Greenacre formula. The mystery wine (W?) is a supplementary observation. Only the first two factors are reported.

F   cλ      %c      W1      W2      W3      W4      W5      W6      W?
1   .7004   95     0.86   −0.71   −0.92   −0.86    0.92    0.71    0.03
2   .0123    2     0.08   −0.16    0.08    0.08    0.08   −0.16   −0.16
Figure 1: Multiple correspondence analysis. Projections on the first two dimensions. The eigenvalues (λ) and proportions of explained inertia (τ) have been corrected with the Benzécri/Greenacre formula. (a) The I set: rows (i.e., wines); wine W? is a supplementary element. (b) The J set: columns (i.e., adjectives); Oak 1 and Oak 2 are supplementary elements. (The projected points have been slightly moved to increase readability.) (Projections from Tables 3 and 4.)
Table 4: Factor scores, squared cosines, and contributions for the variables (J-set). The eigenvalues and percentages of inertia have been corrected using the Benzécri/Greenacre formula. In the original table, contributions corresponding to negative scores are set in italics. Oak 1 and 2 are supplementary variables. Columns, from left to right: Expert 1: fruity (y n), woody (1 2 3), coffee (y n); Expert 2: red fruit (y n), roasted (y n), vanillin (1 2 3), woody (y n); Expert 3: fruity (y n), butter (y n), woody (y n); Oak (1 2).

               y   n | 1    2   3  | y   n | y   n | y   n | 1    2   3  | y   n | y   n | y   n | y   n | 1    2
Factor scores
F  cλ     %c
1  .7004  95  .90 −.90 | −.97 .00 .97 | −.90 .90 | .90 −.90 | −.90 .90 | −.97 .00 .97 | −.90 .90 | .28 −.28 | −.90 .90 | −.90 .90 | .90 −.90
2  .0123   2  .00  .00 |  .18 −.35 .18 | .00 .00 | .00  .00 |  .00 .00 |  .18 −.35 .18 |  .00 .00 | .00  .00 |  .00 .00 |  .00 .00 | .00  .00

Squared cosines
1             .81 .81 | .47 .00 .47 | .81 .81 | .81 .81 | .81 .81 | .47 .00 .47 | .81 .81 | .08 .08 | .81 .81 | .81 .81 | 1.00 1.00
2             .00 .00 | .02 .06 .02 | .00 .00 | .00 .00 | .00 .00 | .02 .06 .02 | .00 .00 | .00 .00 | .00 .00 | .00 .00 |  .00  .00

Contributions ×1000
1              58  58 |  44   0  44 |  58  58 |  58  58 |  58  58 |  44   0  44 |  58  58 |   6   6 |  58  58 |  58  58 |   −   −
2               0   0 |  83 333  83 |   0   0 |   0   0 |   0   0 |  83 333  83 |   0   0 |   0   0 |   0   0 |   0   0 |   −   −
7 Alternatives to MCA
Because the interpretation of MCA is more delicate than that of simple CA, several approaches have been suggested to offer the simplicity of interpretation of CA for indicator matrices. One approach is to use a different metric than χ², the most attractive alternative being the Hellinger distance (see the entry on distances and Escofier, 1978; Rao, 1994). Another approach, called joint correspondence analysis, fits only the off-diagonal tables of the Burt matrix (see Greenacre, 1993), and can be interpreted as a factor-analytic model.
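As a pointer, the Hellinger distance between two row profiles can be computed as follows (a minimal sketch; note that some authors include an additional 1/√2 normalization):

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two profiles (rows of a table
    rescaled to sum to 1); unlike the chi-square distance, it does
    not weight the terms by the column masses."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sqrt(((np.sqrt(p) - np.sqrt(q)) ** 2).sum())

print(hellinger([.2, .3, .5], [.4, .4, .2]))
```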
References
[1] Benzécri, J.P. (1979). Sur le calcul des taux d'inertie dans l'analyse d'un questionnaire. Cahiers de l'Analyse des Données, 4, 377–378.

[2] Clausen, S.E. (1998). Applied correspondence analysis. Thousand Oaks (CA): Sage.

[3] Escofier, B. (1978). Analyse factorielle et distances répondant au principe d'équivalence distributionnelle. Revue de Statistique Appliquée, 26, 29–37.

[4] Greenacre, M.J. (1993). Correspondence analysis in practice. London: Academic Press.
Acknowledgments
Many thanks to Szymon Czarnik and Michael Greenacre for pointing out a mistake in a previous version of this paper.