Exploratory Data Analysis v3 Part3
Factor Analysis: Part 3
Recall
• Identifying the Right Data
• What is in the data?
Multivariate Analysis
• Used to analyze data that involve multiple variables or observations simultaneously
Customer ID | Age | Income  | Shopping Frequency | Avg. Purchase Amount | Product Categories     | Online Shopping | Loyalty Points
1           | 35  | $60,000 | Weekly             | $100                 | Electronics, Clothing  | Yes             | 1200
2           | 45  | $75,000 | Monthly            | $75                  | Groceries, Electronics | No              | 800
3           | 28  | $40,000 | Rarely             | $200                 | Clothing, Jewelry      | Yes             | 500
4           | 50  | $90,000 | Weekly             | $150                 | Groceries, Electronics | No              | 1500
5           | 22  | $25,000 | Monthly            | $50                  | Clothing, Shoes        | Yes             | 300
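As a minimal sketch of what looking at several variables at once can mean, the snippet below computes pairwise Pearson correlations for three numeric columns of the customer table. The choice of columns and the helper name `pearson` are illustrative assumptions, not part of the slides, and five rows are far too few to draw real conclusions.

```python
from math import sqrt

# Numeric columns copied from the customer table (rows for customers 1 to 5).
age = [35, 45, 28, 50, 22]
income = [60000, 75000, 40000, 90000, 25000]
loyalty_points = [1200, 800, 500, 1500, 300]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


columns = {"Age": age, "Income": income, "Loyalty Points": loyalty_points}

# Correlation for every ordered pair of columns (a small correlation matrix).
corr = {(a, b): pearson(columns[a], columns[b]) for a in columns for b in columns}

for a in columns:
    print(a.ljust(15), "  ".join(f"{corr[(a, b)]:6.3f}" for b in columns))
```

Strongly correlated columns such as Age and Income in this toy table are the kind of pattern that motivates asking whether a smaller set of underlying factors drives the observed variables.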
Multivariate Analysis Techniques
Factor Analysis
• Used to uncover underlying latent factors that influence observed variables.
• Often used in psychology and the social sciences to understand the relationships between variables.
Discriminant Analysis
• Used to differentiate between two or more groups based on a set of predictor variables.
• Often used in classification problems, such as distinguishing between different species based on multiple characteristics.
Factor Analysis
Latent Variables
• Model: X = μ + LF + ε
• Where:
  • Observation Matrix: X
  • Loading Matrix: L
  • Factor Matrix: F
  • Error term matrix: ε
  • Mean Matrix: μ, where the (i,m)th element is simply the mean of the corresponding observed variable
• Assumptions:
  • F and ε are independent
  • E(F) = 0; E: Expectation
  • Cov(F) = I
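The assumptions above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it takes the common orthogonal factor model X = μ + LF + ε, and the loading values, means, unique variances, sample size, and normal distributions are all hypothetical choices made for illustration. It draws factors with zero mean and identity covariance, draws errors independently of the factors, and compares the sample covariance of X with LL' + Ψ.

```python
import random

random.seed(0)

# Hypothetical 3 x 2 loading matrix, mean vector, and unique variances.
L = [[0.9, 0.0], [0.6, 0.5], [0.0, 0.8]]
mu = [50.0, 60.0, 70.0]
psi = [0.19, 0.39, 0.36]
p, m, n = 3, 2, 20000


def simulate_one():
    # F ~ N(0, I): factors with zero mean, unit variance, and no correlation.
    F = [random.gauss(0.0, 1.0) for _ in range(m)]
    # Errors are drawn independently of F, with variance psi[i] for variable i.
    eps = [random.gauss(0.0, psi[i] ** 0.5) for i in range(p)]
    return [mu[i] + sum(L[i][k] * F[k] for k in range(m)) + eps[i] for i in range(p)]


X = [simulate_one() for _ in range(n)]


def sample_cov(i, j):
    """Unbiased sample covariance between simulated variables i and j."""
    mi = sum(row[i] for row in X) / n
    mj = sum(row[j] for row in X) / n
    return sum((row[i] - mi) * (row[j] - mj) for row in X) / (n - 1)


def model_cov(i, j):
    """Entry (i, j) of LL' + Psi, where Psi is diagonal."""
    common = sum(L[i][k] * L[j][k] for k in range(m))
    return common + (psi[i] if i == j else 0.0)


for i in range(p):
    for j in range(i, p):
        print(f"({i + 1},{j + 1}) sample={sample_cov(i, j):.3f} model={model_cov(i, j):.3f}")
```

With a large sample the two columns should be close, which is the practical meaning of Cov(F) = I together with the independence of F and the error term.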
Factor Analysis Example
• Education Assessment
• Observed test scores: Math (X₁), Science (X₂), and English (X₃)
• We suspect that these test scores are influenced by two underlying latent factors: "Academic Ability" (F₁) and "Study Habits" (F₂).
• Factor Loading Matrix
      F1     F2
X1    0.9    0.2
X2    0.8   -0.1
X3    0.7    0.6
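One way to read the loading matrix is through communalities, the sum of squared loadings in each row. The sketch below assumes standardized variables and uncorrelated factors, in which case a communality is the share of a variable's variance explained by the common factors. The variable names `communality` and `factor_ss` are illustrative.

```python
# Factor loading matrix from the example: rows X1..X3, columns F1 and F2.
loadings = {
    "X1 (Math)": [0.9, 0.2],
    "X2 (Science)": [0.8, -0.1],
    "X3 (English)": [0.7, 0.6],
}

# Communality of each variable: sum of its squared loadings across factors.
# Assumes standardized variables and orthogonal (uncorrelated) factors.
communality = {name: sum(l ** 2 for l in row) for name, row in loadings.items()}

# Sum of squared loadings per factor: variance attributed to F1 and to F2.
factor_ss = [sum(row[k] ** 2 for row in loadings.values()) for k in range(2)]

for name, h2 in communality.items():
    print(f"{name}: communality = {h2:.2f}, uniqueness = {1 - h2:.2f}")
print("Sum of squared loadings per factor:", [round(v, 2) for v in factor_ss])
```

Under these assumptions, Math and English each have a communality of about 0.85 and Science about 0.65, and F1 accounts for far more variance (about 1.94) than F2 (about 0.41). English is the only score with a sizeable loading on F2.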
Variance-Covariance Matrix
• The variance-covariance matrix of the observed variables can be expressed as a function of the factor loadings and the unique variances.
• Σ = LL' + Ψ
• Where:
  • Σ is the p x p observed variable covariance matrix.
  • L is the p x m factor loading matrix.
  • L' is the transpose of the factor loading matrix.
  • Ψ is a diagonal matrix of unique variances.

      X1     X2     X3
X1   15.0    3.0    4.0
X2    3.0    9.0    2.0
X3    4.0    2.0    8.0
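The decomposition Σ = LL' + Ψ can be sketched numerically with the example loadings for Math, Science, and English. One loud assumption: the variables are treated as standardized, so each unique variance is taken as 1 minus the communality. The result is therefore on a correlation scale and is not meant to reproduce the 3 x 3 matrix shown above.

```python
# Example loading matrix L (p = 3 variables, m = 2 factors).
L = [[0.9, 0.2], [0.8, -0.1], [0.7, 0.6]]
p, m = len(L), len(L[0])

# LL': the p x p matrix of variance and covariance due to the common factors.
LLt = [[sum(L[i][k] * L[j][k] for k in range(m)) for j in range(p)] for i in range(p)]

# Assumption: standardized variables, so unique variance = 1 - communality.
psi = [1.0 - LLt[i][i] for i in range(p)]

# Sigma = LL' + Psi, where Psi is diagonal and only touches the diagonal.
Sigma = [[LLt[i][j] + (psi[i] if i == j else 0.0) for j in range(p)] for i in range(p)]

print("Unique variances:", [round(v, 2) for v in psi])
for row in Sigma:
    print("  ".join(f"{v:5.2f}" for v in row))
```

The off-diagonal entries come entirely from LL', which is why factor analysis explains the covariances between variables through the shared factors, while Ψ absorbs the variance specific to each variable.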