
HA IX On PCA

This document contains instructions for a homework assignment on principal component analysis (PCA). It includes 11 questions covering key concepts in PCA such as determining the number of components to retain, validating results, and hypothesis testing related to PCA. Several questions require calculation and interpretation of principal components, loadings, variance explained, and Bartlett's test of sphericity. The homework aims to help students understand and apply the basic theory and steps involved in PCA.

Department of Industrial and Systems Engineering, IIT Kharagpur

Subject: Applied Multivariate Statistical Modeling I (IM60061)


Home Assignment IX on PCA
Prepared by Prof J Maiti of ISE, IIT Kharagpur

1. (i) Why is PCA conducted?
(ii) Explain the basic steps performed in PCA.
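2. Consider a bivariate situation with original variables $x_1$ and $x_2$. Let the two principal
components extracted be $Z_1$ and $Z_2$. Prove that $Z = A^T X$, where
$$X = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad Z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}, \quad A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$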
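3. If the j-th principal component of $X_{p \times 1}$ is $Z_j$ (where $j = 1, 2, \ldots, p$), then show that
$$E(Z_j) = a_j^T \mu \quad \text{and} \quad \mathrm{Var}(Z_j) = a_j^T \Sigma\, a_j,$$
where
$$\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{bmatrix} \quad \text{and} \quad \Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix}.$$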

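4. Show that
(i) $(S - \lambda_j I)\, a_j = 0$
(ii) $S = A \Lambda A^T$, where
$$\Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_p \end{bmatrix}, \quad A = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{j1} & \cdots & a_{p1} \\ a_{12} & a_{22} & \cdots & a_{j2} & \cdots & a_{p2} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{1p} & a_{2p} & \cdots & a_{jp} & \cdots & a_{pp} \end{bmatrix}$$
(iii) $\sum_{j=1}^{p} S_{jj} = \sum_{j=1}^{p} \lambda_j$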

(iv) $r_{jk} = \dfrac{a_{jk}\sqrt{\lambda_j}}{\sqrt{S_{kk}}}$, where $j = 1, 2, \ldots, p$; $k = 1, 2, \ldots, p$; $r_{jk}$ = correlation coefficient
between $Z_j$ and $X_k$; $a_{jk}$ = loading of $X_k$ on $Z_j$.
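(v) The sampling distribution of $\hat{\lambda}_j$ is
$$\hat{\lambda}_j \sim N\!\left(\lambda_j,\ \frac{2\lambda_j^2}{n-1}\right),$$
where $\lambda_j$ = j-th eigenvalue of the population covariance matrix.
(vi) If $\hat{\lambda}_j \sim N\!\left(\lambda_j,\ \dfrac{2\lambda_j^2}{n-1}\right)$, then
$$\frac{\hat{\lambda}_j}{1 + Z_{\alpha/2}\sqrt{\dfrac{2}{n-1}}} \le \lambda_j \le \frac{\hat{\lambda}_j}{1 - Z_{\alpha/2}\sqrt{\dfrac{2}{n-1}}}.$$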
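The large-sample interval in 4(vi) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `eigenvalue_ci` and the example values (an estimated eigenvalue of 10 with n = 51) are hypothetical and do not come from the assignment.

```python
import math
from statistics import NormalDist

def eigenvalue_ci(lam_hat, n, alpha=0.05):
    """Approximate 100(1 - alpha)% CI for a population eigenvalue.

    Uses lam_hat / (1 +/- z * sqrt(2 / (n - 1))), the large-sample result
    stated in Question 4(vi).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 normal quantile
    half = z * math.sqrt(2.0 / (n - 1))
    if half >= 1:
        raise ValueError("n is too small for the large-sample interval")
    return lam_hat / (1 + half), lam_hat / (1 - half)

# Hypothetical example: lam_hat = 10, n = 51
lo, hi = eigenvalue_ci(10.0, 51)
print(round(lo, 3), round(hi, 3))
```

The interval is not symmetric about the estimate because the standard error itself depends on the unknown eigenvalue.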
5. i) What is Bartlett's sphericity test? Explain the steps.
ii) Why is Bartlett's sphericity test conducted in PCA?
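As a rough sketch of the computation behind Question 5, the snippet below evaluates one commonly used form of Bartlett's statistic, $\chi^2 = -\left[(n-1) - \frac{2p+5}{6}\right]\ln|R|$ with $p(p-1)/2$ degrees of freedom. The helper names are hypothetical, and the covariance matrix and sample size are borrowed from Question 11 purely for illustration.

```python
import math

def corr_from_cov(s):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    p = len(s)
    return [[s[i][j] / math.sqrt(s[i][i] * s[j][j]) for j in range(p)]
            for i in range(p)]

def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    det = 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(a[r][i]))
        if abs(a[pivot][i]) < 1e-12:
            return 0.0
        if pivot != i:
            a[i], a[pivot] = a[pivot], a[i]
            det = -det                      # a row swap flips the sign
        det *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return det

def bartlett_sphericity(r, n):
    """Return (chi-square statistic, degrees of freedom) for H0: R = I."""
    p = len(r)
    det = determinant(r)
    if det <= 0:
        raise ValueError("correlation matrix must be positive definite")
    stat = -((n - 1) - (2 * p + 5) / 6.0) * math.log(det)
    return stat, p * (p - 1) // 2

# Illustration with the matrix and sample size given in Question 11
S = [[100.0, 55.0], [55.0, 36.0]]
stat, df = bartlett_sphericity(corr_from_cov(S), n=50)
print(round(stat, 2), df)
```

With 1 degree of freedom the 5% critical value of the chi-square distribution is 3.841, so a statistic far above it would lead to rejecting the hypothesis that the variables are uncorrelated.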
6. i) How do you decide the number of Principal components to be extracted?
ii) Explain the following:
a) Cumulative % of total variation
b) Kaiser's rule
c) Average root criterion
d) The broken stick method
e) Scree plot
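Some of the retention criteria listed in Question 6 are easy to compare numerically. The sketch below is one possible reading of three of them (cumulative percentage, average root, broken stick); the function names are hypothetical, and the eigenvalues are those given in Question 10, sorted in descending order.

```python
def cumulative_percent(eigs):
    """Cumulative % of total variation explained by the leading components."""
    total = sum(eigs)
    out, running = [], 0.0
    for lam in eigs:
        running += lam
        out.append(100.0 * running / total)
    return out

def average_root_count(eigs):
    """Average root criterion: keep components whose eigenvalue exceeds the mean."""
    mean = sum(eigs) / len(eigs)
    return sum(1 for lam in eigs if lam > mean)

def broken_stick(p):
    """Expected share of the k-th longest piece of a stick broken into p pieces."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]

def broken_stick_count(eigs):
    """Keep leading components whose variance share exceeds the broken-stick share."""
    total = sum(eigs)
    count = 0
    for lam, expected in zip(eigs, broken_stick(len(eigs))):
        if lam / total > expected:
            count += 1
        else:
            break
    return count

eigs = [156.20, 47.50, 9.30]   # eigenvalues from Question 10, descending
print([round(c, 2) for c in cumulative_percent(eigs)])
print(average_root_count(eigs), broken_stick_count(eigs))
```

Kaiser's rule is the special case of the average root criterion for a correlation matrix, where the mean eigenvalue equals 1. The criteria need not agree, which is why the question asks for each to be explained separately.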
7. Conduct the following hypothesis test w.r.t. PCA:
$$H_0: \lambda_{m+1} = \lambda_{m+2} = \cdots = \lambda_p$$
$$H_1: \lambda_{m+j} \ne \lambda_{m+k},\ j \ne k,$$
for at least one pair $(j, k)$ from the last $(p - m)$ values.
8. i) How do you validate the results of PCA?
ii) Explain (i) with the help of the following:
a) Use of holdout samples
b) Jackknife validation
c) Bootstrap validation
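To make the idea of bootstrap validation concrete, here is a minimal sketch for the bivariate case. It resamples observations with replacement and looks at how much the largest eigenvalue of the sample covariance matrix varies. The data are synthetic and the function names are hypothetical; nothing here comes from the assignment itself.

```python
import math
import random

def largest_eigenvalue(sample):
    """Largest eigenvalue of the 2x2 sample covariance matrix of (x1, x2) pairs."""
    n = len(sample)
    m1 = sum(x for x, _ in sample) / n
    m2 = sum(y for _, y in sample) / n
    s11 = sum((x - m1) ** 2 for x, _ in sample) / (n - 1)
    s22 = sum((y - m2) ** 2 for _, y in sample) / (n - 1)
    s12 = sum((x - m1) * (y - m2) for x, y in sample) / (n - 1)
    # Closed form for a symmetric 2x2 matrix: mean of diagonal plus radius
    return (s11 + s22) / 2.0 + math.hypot((s11 - s22) / 2.0, s12)

def bootstrap_eigenvalue(sample, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the largest eigenvalue."""
    rng = random.Random(seed)
    n = len(sample)
    reps = sorted(
        largest_eigenvalue([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi, reps

# Synthetic, purely illustrative data: x2 is correlated with x1
gen = random.Random(1)
data = []
for _ in range(40):
    x1 = gen.gauss(0.0, 1.0)
    data.append((x1, 0.6 * x1 + gen.gauss(0.0, 1.0)))

lo, hi, reps = bootstrap_eigenvalue(data)
print(round(largest_eigenvalue(data), 3), round(lo, 3), round(hi, 3))
```

A narrow interval suggests the component is stable under resampling. A jackknife version would instead drop one observation at a time and recompute the same quantity.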

9. Consider the following sample covariance matrix:
$$S = \begin{bmatrix} 100 & 60 \\ 60 & 64 \end{bmatrix}$$

(a) Convert S to R (sample correlation matrix). What conclusion can you draw?
(b) Obtain principal components using S. How many components do you retain and
why?
(c) Obtain principal components using R. How many components do you retain and
why? Compare the result of (c) with that of (b).
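For checking hand calculations in Question 9, the following sketch converts the given S to a correlation and extracts eigenvalues and unit eigenvectors with the closed form for a symmetric 2×2 matrix. The helper `sym2_eig` is a hypothetical name, and the sketch assumes a nonzero off-diagonal element.

```python
import math

def sym2_eig(a, b, d):
    """Eigenvalues (descending) and unit eigenvectors of [[a, b], [b, d]], b != 0."""
    if b == 0:
        raise ValueError("this sketch assumes a nonzero off-diagonal element")
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (mean + radius, mean - radius):
        vx, vy = b, lam - a                 # solves (S - lam I) v = 0
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs

# Covariance matrix from Question 9
s11, s12, s22 = 100.0, 60.0, 64.0
r12 = s12 / math.sqrt(s11 * s22)            # off-diagonal element of R

eig_S = sym2_eig(s11, s12, s22)
eig_R = sym2_eig(1.0, r12, 1.0)

for name, eig in (("S", eig_S), ("R", eig_R)):
    total = sum(lam for lam, _ in eig)
    for lam, vec in eig:
        print(name, round(lam, 3), [round(v, 3) for v in vec],
              round(100 * lam / total, 1))
```

Comparing the percentage columns for S and R shows how standardizing the variables changes the share of variance attributed to each component.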
10. In any academic institution, teaching is considered to be the highest priority amongst all
academic activities. Students' feedback is considered to be the best evaluator of the quality of
learning imparted. All students attending a subject over a semester were requested to give
their responses (scores) on a 100-point scale on three aspects, namely (i) contents of the
subject (X1), (ii) teaching quality (X2) and (iii) tutorials and assignments handled (X3).
36 students responded with a drop rate of 4%. Three matrices S, Λ and V, representing the
covariance matrix of X and the eigenvalues and eigenvectors of S, respectively, are shown
below.
$$S = \begin{bmatrix} 100 & 28 & 48 \\ 28 & 49 & 45 \\ 48 & 45 & 64 \end{bmatrix}, \quad \Lambda = \begin{bmatrix} 9.30 & 0 & 0 \\ 0 & 47.50 & 0 \\ 0 & 0 & 156.20 \end{bmatrix}, \quad V = \begin{bmatrix} 0.16 & 0.70 & 0.70 \\ 0.70 & -0.60 & 0.45 \\ -0.70 & -0.40 & 0.58 \end{bmatrix}$$
i. How many principal components (PCs) are to be retained?
ii. Obtain the principal components (PCs) to be retained.
iii. Compute correlation coefficients between the PCs retained and X.
iv. Comment on the results of (ii) and (iii).
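The correlation between a principal component and an original variable follows the formula in Question 4(iv). The sketch below applies it to the matrices of Question 10, assuming that the columns of V are eigenvectors listed in the same order as the diagonal of Λ. That ordering is an assumption, and the loadings are rounded, so the results are approximate.

```python
import math

# Values copied from Question 10. Column j of V is assumed to be the
# eigenvector belonging to EIGENVALUES[j] (an assumption about ordering).
S_DIAG = [100.0, 49.0, 64.0]             # variances of X1, X2, X3
EIGENVALUES = [9.30, 47.50, 156.20]
V = [
    [0.16, 0.70, 0.70],
    [0.70, -0.60, 0.45],
    [-0.70, -0.40, 0.58],
]

def pc_variable_correlations(j):
    """Correlations of the PC in column j with X1, X2, X3: a_jk*sqrt(lam_j)/sqrt(s_kk)."""
    lam = EIGENVALUES[j]
    return [V[k][j] * math.sqrt(lam) / math.sqrt(S_DIAG[k]) for k in range(3)]

# The component with the largest eigenvalue is the first PC
largest = max(range(3), key=lambda j: EIGENVALUES[j])
r_first_pc = pc_variable_correlations(largest)
print([round(r, 3) for r in r_first_pc])
```

The same function can be called with the index of any other retained component to fill in the rest of the correlation table.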

11. Consider the following sample covariance matrix:
$$S = \begin{bmatrix} 100 & 55 \\ 55 & 36 \end{bmatrix}_{2 \times 2},$$
where the sample size is $n = 50$.
i) Show that one principal component is adequate to explain the variability of the two
original variables $(X_1, X_2)$.
ii) Find the first PC loading vector.
iii) Obtain a 95% CI for the population variance $\lambda_1$ (for PC 1).
iv) Obtain a $100(1 - \alpha)\%$ CR for the loading vector $a_1$ (for PC 1). Here, $\alpha = 0.05$.
v) Conduct Bartlett's sphericity test.
vi) Show that the variance explained by PC 2 is negligible.
[Hint: hypothesis test proposed by Bartlett]

**END**
