Multivariate Data Analysis - EFA
Lecture 3
Chapter 5
For use with Hair, Black, Babin and Anderson, Multivariate Data Analysis 8e © 2018 Cengage Learning EMEA
What is Exploratory Factor Analysis?
• Latent Traits or Unobservable Characteristics
– For length or weight, distinguishing the underlying property from
its measurement may be unnecessary because the property is almost
perfectly observable
– However, for attitudes, beliefs, perceptions, and other
psychological notions, our measurement instruments are
imperfect
Types of Factor Analysis
Exploratory Factor Analysis (EFA)
• used to discover the factor structure of a construct and
examine its reliability. It is data driven.
Simple illustration
• Holzinger and Swineford (1939)
– Psychological Testing of Children
– Five tests given to 7th- and 8th-grade children (n = 145)
– X1 = Paragraph Comprehension (PARA)
– X2 = Sentence Completion (SENT)
– X3 = Word Meaning (WORD)
– X4 = Addition (ADD)
– X5 = Counting Dots (DOTS)
Simple illustration
• Correlation Matrix
Simple illustration: One-Factor Model
[Figure: path diagram of the one-factor model, showing observed variables X1 to X5 and their specific factors δ1 to δ5]
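Simple illustration: One-Factor Model
• Let $\xi$ = common factor and $\delta_i$ = a specific factor
• One-factor model with five variables
$X_i = \lambda_i \xi + \delta_i, \quad i = 1, 2, 3, 4, 5$
where $\mathrm{cor}(\delta_i, \delta_j) = 0$ for $i \neq j$ and $\mathrm{cor}(\delta_i, \xi) = 0$
• Assume $X$ and $\xi$ are standardized variables. Then
$\mathrm{Var}(X_i) = \mathrm{Var}(\lambda_i \xi + \delta_i) = \lambda_i^2 + \mathrm{Var}(\delta_i) = 1$
$\lambda_i^2$ = communality of $X_i$
= the proportion of the variation in $X_i$ explained by $\xi$
= $1 - \mathrm{Var}(\delta_i) = 1 - \theta_{ii}^2$
where $\theta_{ii}^2 = \mathrm{Var}(\delta_i)$ = variance of the specific factor of $X_i$
• As $\lambda_i^2 \to 1$ ($\theta_{ii}^2 \to 0$), $X_i$ is a nearly perfect measure of $\xi$
• As $\lambda_i^2 \to 0$ ($\theta_{ii}^2 \to 1$), $X_i$ is not explained by $\xi$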
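As a minimal numerical sketch of the one-factor model, the Python snippet below takes purely hypothetical loadings (the values in `lam` are illustrative assumptions, not estimates from the Holzinger and Swineford data) and computes each variable's communality and specific variance. It also builds the correlation matrix the model implies: because the specific factors are mutually uncorrelated, the implied correlation between two different tests is the product of their loadings.

```python
import numpy as np

# Hypothetical loadings for X1..X5 on the single common factor
# (illustrative values only, not estimated from real data)
lam = np.array([0.9, 0.8, 0.7, 0.4, 0.3])

communality = lam ** 2            # share of Var(X_i) explained by the factor
specific_var = 1.0 - communality  # Var(delta_i), the specific variance

# Model-implied correlation matrix: lam_i * lam_j off the diagonal,
# and communality + specific variance = 1 on the diagonal
R_implied = np.outer(lam, lam) + np.diag(specific_var)

for i, (h2, th) in enumerate(zip(communality, specific_var), start=1):
    print(f"X{i}: communality = {h2:.2f}, specific variance = {th:.2f}")
```

With these assumed values, a variable with loading 0.9 has 81% of its variance explained by the common factor, while one with loading 0.3 has only 9%.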
What is Exploratory Factor Analysis?
[Figure: path diagram of the two-factor model, showing common factors ξ1 and ξ2, observed variables X1 to X5, and specific factors δ1 to δ5]
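Simple illustration: Two-Factor Model
• Two-factor model with five variables
$X_i = \lambda_{i1} \xi_1 + \lambda_{i2} \xi_2 + \delta_i, \quad i = 1, 2, 3, 4, 5$
where $\mathrm{cor}(\delta_i, \delta_j) = 0$ for $i \neq j$ and $\mathrm{cor}(\delta_i, \xi_k) = 0$ for $k = 1, 2$
where $\xi_1$ = Verbal Aptitude Factor
$\xi_2$ = Quantitative Aptitude Factor
• Assume $X$ and $\xi$ are standardized variables and that $\xi_1$ and $\xi_2$ are uncorrelated. Then
$\mathrm{Var}(X_i) = \mathrm{Var}(\lambda_{i1} \xi_1 + \lambda_{i2} \xi_2 + \delta_i) = \lambda_{i1}^2 + \lambda_{i2}^2 + \mathrm{Var}(\delta_i)$
$= \lambda_{i1}^2 + \lambda_{i2}^2 + \theta_{ii}^2 = 1$
$\lambda_{i1}^2 + \lambda_{i2}^2$ = communality of $X_i$ = $1 - \theta_{ii}^2$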
• Consider a student with high $\xi_1$ and low $\xi_2$. We expect the student to
perform better on tests requiring verbal ability than on tests requiring
quantitative ability. That is, if performance on paragraph comprehension
(measured by $X_1$) depends mainly on verbal aptitude, we should expect a
value of $\lambda_{11}$ near 1 and a value of $\lambda_{12}$ closer to 0.
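The two-factor variance decomposition can be checked by simulation. The sketch below assumes uncorrelated standard normal common factors and a hypothetical loading matrix `Lam` (the first three rows mimic verbal tests, the last two quantitative tests; none of the numbers come from the original study). It generates data from the model and compares the sample correlation matrix with the one the model implies.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # large sample so sampling noise is small

# Hypothetical loadings: columns are (verbal, quantitative) factors
Lam = np.array([
    [0.8, 0.1],
    [0.8, 0.2],
    [0.7, 0.1],
    [0.1, 0.7],
    [0.2, 0.8],
])

communality = (Lam ** 2).sum(axis=1)  # lambda_i1^2 + lambda_i2^2
theta = 1.0 - communality             # specific variances

# Simulate the model X = Xi Lam^T + Delta
Xi = rng.standard_normal((n, 2))                      # common factors
Delta = rng.standard_normal((n, 5)) * np.sqrt(theta)  # specific factors
X = Xi @ Lam.T + Delta

R_sample = np.corrcoef(X, rowvar=False)
R_model = Lam @ Lam.T + np.diag(theta)

print("communalities:", np.round(communality, 2))
print("max |R_sample - R_model|:", np.abs(R_sample - R_model).max())
```

The largest discrepancy shrinks as `n` grows, which is what the model predicts when its assumptions hold.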
Exploratory Factor Analysis with c Common Factors
In matrix notation,
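$\mathbf{X} = \boldsymbol{\Xi} \boldsymbol{\Lambda}_c^T + \boldsymbol{\Delta}$
where $\boldsymbol{\Xi} = [\xi_1, \xi_2, \cdots, \xi_c]$
$\boldsymbol{\Delta} = [\delta_1, \delta_2, \cdots, \delta_p]$
$\boldsymbol{\Lambda}_c$ = $p \times c$ matrix of coefficients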
Exploratory Factor Analysis with c Common Factors
$\mathbf{R} - \boldsymbol{\Theta} = \boldsymbol{\Lambda}_c \boldsymbol{\Lambda}_c^T$
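The decomposition above can be derived from the matrix form of the model. The sketch below is one way to write it out. It assumes (these conditions are not stated explicitly on the slide) that the common factors are standardized and mutually uncorrelated, that the specific factors are uncorrelated with the common factors, and that $\boldsymbol{\Theta}$ is the diagonal matrix of specific variances $\theta_{ii}^2$.

```latex
\begin{aligned}
\mathbf{R} &= \tfrac{1}{n-1}\,\mathbf{X}^T\mathbf{X}
  = \tfrac{1}{n-1}\,(\boldsymbol{\Xi}\boldsymbol{\Lambda}_c^T + \boldsymbol{\Delta})^T
    (\boldsymbol{\Xi}\boldsymbol{\Lambda}_c^T + \boldsymbol{\Delta}) \\
&= \boldsymbol{\Lambda}_c \Big(\tfrac{1}{n-1}\boldsymbol{\Xi}^T\boldsymbol{\Xi}\Big) \boldsymbol{\Lambda}_c^T
  + \boldsymbol{\Lambda}_c \Big(\tfrac{1}{n-1}\boldsymbol{\Xi}^T\boldsymbol{\Delta}\Big)
  + \Big(\tfrac{1}{n-1}\boldsymbol{\Delta}^T\boldsymbol{\Xi}\Big) \boldsymbol{\Lambda}_c^T
  + \tfrac{1}{n-1}\boldsymbol{\Delta}^T\boldsymbol{\Delta} \\
&= \boldsymbol{\Lambda}_c \mathbf{I}\, \boldsymbol{\Lambda}_c^T + \mathbf{0} + \mathbf{0} + \boldsymbol{\Theta}
  = \boldsymbol{\Lambda}_c \boldsymbol{\Lambda}_c^T + \boldsymbol{\Theta}
\end{aligned}
```

Moving $\boldsymbol{\Theta}$ to the left-hand side gives the stated result: the common factors account for the off-diagonal correlations and for the communalities on the diagonal.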
• PCA revisited:
– Singular Value Decomposition of 𝐗
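$\mathbf{X} = \mathbf{Z}_s \mathbf{D}^{1/2} \mathbf{U}^T$
$\mathbf{R} = E[\mathbf{X}^T \mathbf{X}] = E[\mathbf{U} \mathbf{D}^{1/2} (\mathbf{Z}_s^T \mathbf{Z}_s) \mathbf{D}^{1/2} \mathbf{U}^T] = (\mathbf{U} \mathbf{D}^{1/2})(\mathbf{U} \mathbf{D}^{1/2})^T = \mathbf{F} \mathbf{F}^T$
where the standardized component scores satisfy $E[\mathbf{Z}_s^T \mathbf{Z}_s] = \mathbf{I}$
$\mathbf{R} \approx \mathbf{F}_c \mathbf{F}_c^T$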
For dimension reduction, we try to extract some subset of c components that closely
approximates 𝑹: 𝑭𝑐 is the first c columns of factor loading matrix F whose elements
are interpretable as the correlations between original variables X and c extracted
common factors
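The PCA route to a loading matrix can be sketched numerically. The snippet below uses randomly generated data (a hypothetical stand-in, since the slide's correlation matrix is not reproduced here), computes the loading matrix from the eigendecomposition of the correlation matrix, and checks two claims from the text: that the full loading matrix reproduces R exactly, and that its elements are the correlations between the variables and the standardized component scores. The names `F`, `F_c`, and `Zs` follow the slide's notation; everything else is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, c = 500, 5, 2  # observations, variables, retained components

# Hypothetical correlated data, then standardized column by column
raw = rng.standard_normal((n, p)) @ rng.standard_normal((p, p))
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

R = X.T @ X / (n - 1)  # sample correlation matrix

# Eigendecomposition R = U D U^T, sorted by decreasing eigenvalue
d, U = np.linalg.eigh(R)
order = np.argsort(d)[::-1]
d, U = d[order], U[:, order]

F = U * np.sqrt(d)        # loading matrix F = U D^{1/2}
Zs = X @ U / np.sqrt(d)   # standardized component scores

F_c = F[:, :c]            # keep the first c columns
R_c = F_c @ F_c.T         # rank-c approximation of R
approx_error = np.linalg.norm(R - R_c)  # Frobenius norm

print("F F^T reproduces R:", np.allclose(F @ F.T, R))
print("F equals cor(X, Zs):", np.allclose(X.T @ Zs / (n - 1), F))
print("Frobenius error of rank-c approximation:", approx_error)
```

The approximation error depends only on the discarded eigenvalues, so retaining more components can only reduce it.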
Solution Procedure
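• The correlations between the rotated factors ($\boldsymbol{\Xi} \mathbf{T}$) and the original
variables $\mathbf{X} = \boldsymbol{\Xi} \boldsymbol{\Lambda}_c^T + \boldsymbol{\Delta}$ are
$\boldsymbol{\Lambda}_c^* = \frac{1}{n-1} (\boldsymbol{\Xi} \boldsymbol{\Lambda}_c^T + \boldsymbol{\Delta})^T \boldsymbol{\Xi} \mathbf{T} = \boldsymbol{\Lambda}_c \mathbf{T}$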
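A small numerical check shows why rotation leaves the fit unchanged. Assuming an orthogonal rotation matrix `T` (here an arbitrary 30-degree rotation) and a hypothetical loading matrix `Lam`, the rotated loadings reproduce exactly the same common-factor part of the correlation matrix and the same communalities as the unrotated ones.

```python
import numpy as np

# Hypothetical unrotated loadings for five variables on two factors
Lam = np.array([
    [0.8, 0.1],
    [0.8, 0.2],
    [0.7, 0.1],
    [0.1, 0.7],
    [0.2, 0.8],
])

angle = np.deg2rad(30.0)  # arbitrary rotation angle chosen for illustration
T = np.array([
    [np.cos(angle), -np.sin(angle)],
    [np.sin(angle),  np.cos(angle)],
])

Lam_star = Lam @ T  # rotated loadings

same_fit = np.allclose(Lam_star @ Lam_star.T, Lam @ Lam.T)
same_communality = np.allclose((Lam_star ** 2).sum(axis=1),
                               (Lam ** 2).sum(axis=1))
print(same_fit, same_communality)  # True True
```

Because any orthogonal `T` passes this check, the data alone cannot pick out one set of loadings, so a rotation is chosen for interpretability instead.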
Rotation Indeterminacy
• Types of rotation
1. Orthogonal rotation: varimax or quartimax
2. Non-orthogonal (Oblique) rotation: promax or direct oblimin
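To make the orthogonal case concrete, the sketch below implements a commonly used SVD-based iteration for varimax rotation. It is an illustrative implementation and not necessarily the routine used by any particular statistics package. The helper names `varimax` and `varimax_criterion` and the loading values are assumptions of this example. The starting loadings are built by deliberately rotating a simple structure by 30 degrees, so varimax should move them back toward it.

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-8):
    """Return (rotated loadings, orthogonal rotation matrix T)."""
    p, k = L.shape
    T = np.eye(k)
    obj_old = 0.0
    for _ in range(max_iter):
        B = L @ T  # current rotated loadings
        # Matrix proportional to the gradient of the varimax criterion
        G = L.T @ (B ** 3 - B * (B ** 2).sum(axis=0) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt  # orthogonal polar factor of G
        obj = s.sum()
        if obj_old != 0.0 and obj < obj_old * (1.0 + tol):
            break  # no meaningful improvement, stop iterating
        obj_old = obj
    return L @ T, T

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return np.var(L ** 2, axis=0).sum()

# Hypothetical simple structure, deliberately rotated away by 30 degrees
simple = np.array([
    [0.8, 0.1],
    [0.8, 0.2],
    [0.7, 0.1],
    [0.1, 0.7],
    [0.2, 0.8],
])
a = np.deg2rad(30.0)
unrotated = simple @ np.array([[np.cos(a), -np.sin(a)],
                               [np.sin(a),  np.cos(a)]])

rotated, T = varimax(unrotated)
print("criterion before:", varimax_criterion(unrotated))
print("criterion after: ", varimax_criterion(rotated))
print(np.round(rotated, 2))
```

The rotated columns may come back in a different order or with flipped signs, since those choices do not affect the criterion. Oblique methods such as promax or direct oblimin relax the orthogonality constraint on `T` and are not covered by this sketch.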
Factor Rotation
Factor 1: Healthful
Factor 2: Artificial
Factor 3: Non-Adult
Factor 4: Interesting
RTE Cereal: Factor Scores