Multivariate Analysis Notes
Contents

1 Introduction
  1.1 Topics
  1.2 Books to follow
2 Principal Components
  2.1 Variability
  2.2 Construction of Y's
    2.2.1 Lemma
    2.2.2 Proof
  2.3 Components
  2.4 Note
  2.5 A special case
  2.6 How to choose a 'k'?
Lecture 1, Aug 1
1 Introduction
1.1 Topics
• Traditional Topics
  – Multivariate Analysis
  – Multivariate Normal
  – MANOVA
• Non-traditional Topics
1.2 Books to follow
7. S.F. Arnold: Linear Statistical Inference and Multivariate Analysis
2 Principal Components
Suppose we have a random vector $\underset{\sim}{X}_{p \times 1}$ with dispersion matrix $\Sigma_{p \times p}$ (real, symmetric, p.d.). We want to base future analysis on $k \, (\ll p)$ variables.
2.1 Variability
Total variability of a dataset is defined as
\[
\text{Total variability} := \sum_{i=1}^{p} \sigma_{ii} = \sum_{i=1}^{p} \sigma_i^2,
\]
where $\sigma_{ii}$ is the $i$-th diagonal entry of $\Sigma$.
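For instance (numbers invented for illustration, not from the lecture): if
\[
\Sigma = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix},
\qquad \text{then} \qquad
\text{Total variability} = \sigma_{11} + \sigma_{22} = 4 + 3 = 7.
\]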
We would like to reduce the dimension while retaining as much variability as possible.
2.2 Construction of Y's
Write $\underset{\sim}{X} := (X_1, X_2, \ldots, X_p)^T$. We would like to replace this by $Y_1, Y_2, \ldots, Y_k$, $k \ll p$, without losing out much on variability. Each $Y_i$ is taken to be a linear combination of the $X$'s. For the first one, set $Y_1 = l_1^T \underset{\sim}{X}$, so that $\operatorname{Var}(Y_1) = l_1^T \Sigma l_1$; since this can be inflated arbitrarily by rescaling $l_1$, we maximize the normalized ratio
\[
\max_{l_1 \neq 0} \frac{l_1^T \Sigma l_1}{l_1^T l_1}.
\]
2.2.1 Lemma
The ratio above is maximized when $l_1$ is the eigenvector of $\Sigma$ corresponding to the largest eigenvalue.
2.2.2 Proof
Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$ be the eigenvalues of $\Sigma$, with corresponding eigenvectors $e_1, e_2, \ldots, e_p$.
Write the spectral decomposition $\Sigma = P \Lambda P^T$, where $P = (e_1 \; e_2 \; \cdots \; e_p)$ is orthogonal and $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_p)$. Setting $y := P^T l_1$,
\begin{align*}
\frac{l_1^T \Sigma l_1}{l_1^T l_1}
&= \frac{l_1^T P \Lambda P^T l_1}{l_1^T l_1} \\
&= \frac{(P^T l_1)^T \Lambda (P^T l_1)}{(P^T l_1)^T (P^T l_1)} \qquad [\because P \text{ is orthogonal, so } l_1^T l_1 = (P^T l_1)^T (P^T l_1)] \\
&= \frac{y^T \Lambda y}{y^T y} = \frac{\sum_i \lambda_i y_i^2}{\sum_i y_i^2} \le \lambda_1 \quad (\text{and } \ge \lambda_p).
\end{align*}
The upper bound is attained at $y = (1, 0, \ldots, 0)^T$, i.e. at $l_1 = Py = e_1$, which proves the lemma.
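As a quick numerical sanity check of the lemma, here is a minimal sketch in NumPy; the matrix Sigma below is made up for illustration:

    import numpy as np

    # An arbitrary symmetric positive definite matrix standing in for Sigma
    Sigma = np.array([[4.0, 2.0, 0.5],
                      [2.0, 3.0, 1.0],
                      [0.5, 1.0, 2.0]])

    def rayleigh(l, S):
        # The ratio l^T S l / l^T l from the lemma
        return (l @ S @ l) / (l @ l)

    eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigh: eigenvalues in ascending order
    e1 = eigvecs[:, -1]                       # eigenvector of the largest eigenvalue

    print(rayleigh(e1, Sigma), eigvals[-1])   # the two numbers agree
    for _ in range(1000):                     # no random direction does better
        l = np.random.randn(3)
        assert rayleigh(l, Sigma) <= eigvals[-1] + 1e-12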
2.3 Components
$Y_1 = e_1^T \underset{\sim}{X}$ is the 1st principal component, with $\operatorname{Var}(Y_1) = \lambda_1$.
Principal components are defined to be uncorrelated.
For the second component, take $Y_2 = l_2^T \underset{\sim}{X}$. So we would need to find
\[
\max_{l_2 \neq 0} \frac{l_2^T \Sigma l_2}{l_2^T l_2}, \quad \text{subject to } \operatorname{Cov}(Y_1, Y_2) = 0.
\]
Since $\operatorname{Cov}(Y_1, Y_2) = e_1^T \Sigma l_2 = \lambda_1 e_1^T l_2$, the constraint says $l_2 \perp e_1$, and the maximizer is $l_2 = e_2$.
In general, $Y_j = e_j^T \underset{\sim}{X}$, with $\operatorname{Var}(Y_j) = \lambda_j$, for $1 \le j \le p$.
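The whole construction in code, as a sketch (the population $\Sigma$ is the same made-up matrix as before; the data are simulated, whereas in practice one would use a sample covariance matrix):

    import numpy as np

    rng = np.random.default_rng(0)
    Sigma = np.array([[4.0, 2.0, 0.5],
                      [2.0, 3.0, 1.0],
                      [0.5, 1.0, 2.0]])

    # Simulated n x p data with dispersion matrix Sigma
    X = rng.multivariate_normal(np.zeros(3), Sigma, size=100_000)

    eigvals, P = np.linalg.eigh(Sigma)       # ascending order
    eigvals, P = eigvals[::-1], P[:, ::-1]   # reorder so lambda_1 >= ... >= lambda_p

    Y = X @ P             # j-th column holds Y_j = e_j^T X, row by row
    print(np.cov(Y.T))    # approx. diag(lambda_1, ..., lambda_p):
                          # uncorrelated components with Var(Y_j) = lambda_j
    print(eigvals)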
2.4 Note
\[
\sum_{i=1}^{p} \operatorname{Var}(X_i) = \sum_{i=1}^{p} \sigma_{ii} = \operatorname{tr}(\Sigma) = \sum_{i=1}^{p} \lambda_i = \sum_{i=1}^{p} \operatorname{Var}(Y_i),
\]
so the total variability is preserved in passing to the principal components.
If the variables are uncorrelated, i.e. $\Sigma$ is diagonal, then $Y_1, Y_2, \ldots, Y_p$ will just be a permutation of $X_1, X_2, \ldots, X_p$ in decreasing order of variance.
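Continuing the invented $2 \times 2$ example from above:
\[
\Sigma = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}:
\quad \lambda^2 - 7\lambda + 8 = 0
\;\Rightarrow\; \lambda_{1,2} = \frac{7 \pm \sqrt{17}}{2},
\quad \lambda_1 + \lambda_2 = 7 = \sigma_{11} + \sigma_{22}.
\]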
2.5 A special case
Suppose we have bivariate data $(X_1, X_2)^T$. We will try to get its principal components. We know the components will be uncorrelated, so once we get the 1st principal component, the other one will be perpendicular to it.
Suppose the data cloud looks like this:
[Figure: scatter of bivariate data on the $X_1$–$X_2$ axes, with rotated axes $Y_1$ (along the direction of greatest spread) and $Y_2$ (perpendicular to it).]
Then $Y_1$ will be the 1st principal component, as most of the variation lies along that axis. So the components are just a rotation of the rectangular axes. This idea extends to higher dimensions for the multivariate normal.
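A standard worked instance of this picture (not computed in the lecture): for equally variable, positively correlated coordinates,
\[
\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
\;\Rightarrow\;
e_1 = \tfrac{1}{\sqrt{2}}(1, 1)^T, \quad
e_2 = \tfrac{1}{\sqrt{2}}(1, -1)^T, \quad
\lambda_{1,2} = \sigma^2(1 \pm \rho),
\]
so the principal axes are the original axes rotated by exactly $45°$.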
2.6 How to choose a 'k'?
Plot the eigenvalues $\lambda_j$ against $j$ (the scree plot). We stop and choose $k$ where we observe a major change in slope (the "elbow").
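A minimal sketch of this rule (the eigenvalues are hypothetical; the 90% cutoff shown alongside is a common convention, not from the lecture):

    import numpy as np

    eigvals = np.array([5.2, 2.9, 0.4, 0.3, 0.2])   # hypothetical, sorted descending

    # Scree plot data: eigvals against j = 1..p; the slope flattens after j = 2,
    # so the elbow rule picks k = 2 here.
    prop = np.cumsum(eigvals) / eigvals.sum()        # cumulative share of total variability
    k = int(np.argmax(prop >= 0.90)) + 1             # smallest k explaining >= 90%
    print(k, prop)                                   # k = 2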
Lecture 2, Aug 4