MODULE II
LECTURE - 6
Analysis of Variance

The technique of analysis of variance involves breaking down the total variation into orthogonal components. Each orthogonal component represents the variation due to a particular factor contributing to the total variation.

Model

Let $Y_1, Y_2, \ldots, Y_n$ be independently distributed following a normal distribution with mean $E(Y_i) = \sum_{j=1}^{p} \beta_j x_{ij}$ and variance $\sigma^2$. Denoting by $Y = (Y_1, Y_2, \ldots, Y_n)'$ an $n \times 1$ column vector, this assumption can be expressed in the form of a linear regression model

$$Y = X\beta + \varepsilon$$

where $X$ is an $n \times p$ matrix, $\beta$ is a $p \times 1$ vector and $\varepsilon$ is an $n \times 1$ vector of disturbances with

$$E(\varepsilon) = 0, \qquad \text{Cov}(\varepsilon) = \sigma^2 I,$$

and $\varepsilon$ follows a normal distribution. This implies that

$$E(Y) = X\beta, \qquad E\left[(Y - X\beta)(Y - X\beta)'\right] = \sigma^2 I.$$
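To make the setup concrete, the following is a minimal numerical sketch of this model; the dimensions $n$ and $p$, the coefficient values, and the noise level are illustrative assumptions rather than anything fixed by the lecture.

```python
import numpy as np

# Minimal sketch of the model Y = X*beta + eps with simulated data.
# n, p, beta, and sigma below are illustrative assumptions.
rng = np.random.default_rng(0)
n, p = 50, 3
sigma = 2.0

X = rng.normal(size=(n, p))            # n x p design matrix (full column rank)
beta = np.array([1.0, -0.5, 2.0])      # p x 1 coefficient vector
eps = rng.normal(scale=sigma, size=n)  # disturbances: E(eps) = 0, Cov(eps) = sigma^2 I
y = X @ beta + eps                     # so E(Y) = X*beta
```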
Now we consider four different types of tests of hypothesis. In the first two cases, we develop the likelihood ratio test for the null hypothesis related to the analysis of variance. Note that later we will derive the same test on the basis of the least squares principle also. An important idea behind the development of this test is to demonstrate that the test used in the analysis of variance can be derived using the least squares principle as well as the likelihood ratio test.
Case 1: Test of $H_0 : \beta = \beta_0$
Consider the null hypothesis $H_0 : \beta = \beta_0$, where $\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$, $\beta_0 = (\beta_{10}, \beta_{20}, \ldots, \beta_{p0})'$ is specified and $\sigma^2$ is unknown.

Assume that all $\beta_i$'s are estimable, i.e., $\text{rank}(X) = p$ (full column rank). We now develop the likelihood ratio test. The $(p+1)$-dimensional parametric space $\Omega$ is the collection of points

$$\Omega = \{(\beta, \sigma^2) : -\infty < \beta_i < \infty,\ \sigma^2 > 0,\ i = 1, 2, \ldots, p\},$$

and under $H_0$ it reduces to $\omega = \{(\beta_0, \sigma^2) : \sigma^2 > 0\}$.
The likelihood function of $y_1, y_2, \ldots, y_n$ is

$$L(y \mid \beta, \sigma^2) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left[-\frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta)\right].$$
The likelihood function is maximized over $\Omega$ when $\beta$ and $\sigma^2$ are substituted with their maximum likelihood estimators, i.e.,

$$\hat{\beta} = (X'X)^{-1}X'y, \qquad \hat{\sigma}^2 = \frac{1}{n}(y - X\hat{\beta})'(y - X\hat{\beta}).$$

Substituting $\hat{\beta}$ and $\hat{\sigma}^2$ in $L(y \mid \beta, \sigma^2)$ gives

$$\max_{\Omega} L(y \mid \beta, \sigma^2) = \left(\frac{1}{2\pi\hat{\sigma}^2}\right)^{n/2} \exp\left[-\frac{1}{2\hat{\sigma}^2}(y - X\hat{\beta})'(y - X\hat{\beta})\right] = \left[\frac{n}{2\pi (y - X\hat{\beta})'(y - X\hat{\beta})}\right]^{n/2} \exp\left(-\frac{n}{2}\right).$$
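Continuing the simulation sketch above, both maximum likelihood estimators can be computed in a few lines; note the divisor $n$ (not $n - p$) in $\hat{\sigma}^2$.

```python
# ML estimators over the full parametric space (continuing the sketch above).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / n                 # (1/n)(y - X beta_hat)'(y - X beta_hat)
```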
Under $H_0$, the maximum likelihood estimator of $\sigma^2$ is

$$\hat{\hat{\sigma}}^2 = \frac{1}{n}(y - X\beta_0)'(y - X\beta_0).$$
The maximum value of the likelihood function under $H_0$ is

$$\max_{\omega} L(y \mid \beta, \sigma^2) = \left(\frac{1}{2\pi\hat{\hat{\sigma}}^2}\right)^{n/2} \exp\left[-\frac{1}{2\hat{\hat{\sigma}}^2}(y - X\beta_0)'(y - X\beta_0)\right] = \left[\frac{n}{2\pi (y - X\beta_0)'(y - X\beta_0)}\right]^{n/2} \exp\left(-\frac{n}{2}\right).$$
The likelihood ratio test statistic is

$$\lambda = \frac{\max_{\omega} L(y \mid \beta, \sigma^2)}{\max_{\Omega} L(y \mid \beta, \sigma^2)} = \left[\frac{(y - X\hat{\beta})'(y - X\hat{\beta})}{(y - X\beta_0)'(y - X\beta_0)}\right]^{n/2}.$$

Writing $y - X\beta_0 = (y - X\hat{\beta}) + X(\hat{\beta} - \beta_0)$,

$$\lambda = \left[\frac{(y - X\hat{\beta})'(y - X\hat{\beta})}{\left[(y - X\hat{\beta}) + X(\hat{\beta} - \beta_0)\right]'\left[(y - X\hat{\beta}) + X(\hat{\beta} - \beta_0)\right]}\right]^{n/2} = \left[1 + \frac{(\hat{\beta} - \beta_0)'X'X(\hat{\beta} - \beta_0)}{(y - X\hat{\beta})'(y - X\hat{\beta})}\right]^{-n/2} = \left(1 + \frac{q_1}{q_2}\right)^{-n/2}$$

where

$$q_1 = (\hat{\beta} - \beta_0)'X'X(\hat{\beta} - \beta_0) \quad \text{and} \quad q_2 = (y - X\hat{\beta})'(y - X\hat{\beta});$$

the cross-product term vanishes because $X'(y - X\hat{\beta}) = 0$.
Consider

$$\begin{aligned}
\hat{\beta} - \beta_0 &= (X'X)^{-1}X'y - \beta_0 = (X'X)^{-1}X'(y - X\beta_0), \\
q_1 &= (\hat{\beta} - \beta_0)'X'X(\hat{\beta} - \beta_0) \\
&= (y - X\beta_0)'X(X'X)^{-1}X'X(X'X)^{-1}X'(y - X\beta_0) \\
&= (y - X\beta_0)'X(X'X)^{-1}X'(y - X\beta_0),
\end{aligned}$$

and

$$\begin{aligned}
q_2 &= (y - X\hat{\beta})'(y - X\hat{\beta}) \\
&= \left[y - X(X'X)^{-1}X'y\right]'\left[y - X(X'X)^{-1}X'y\right] \\
&= y'\left[I - X(X'X)^{-1}X'\right]y \\
&= \left[(y - X\beta_0) + X\beta_0\right]'\left[I - X(X'X)^{-1}X'\right]\left[(y - X\beta_0) + X\beta_0\right] \\
&= (y - X\beta_0)'\left[I - X(X'X)^{-1}X'\right](y - X\beta_0),
\end{aligned}$$

since $\left[I - X(X'X)^{-1}X'\right]X = 0$.
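Both identities are easy to confirm numerically, continuing the sketch above; the null value $\beta_0 = 0$ is an arbitrary illustrative choice.

```python
# Numerical check of the identities for q1 and q2 (continuing the sketch above).
beta_0 = np.zeros(p)                           # illustrative null value
H = X @ np.linalg.inv(X.T @ X) @ X.T           # projection matrix X(X'X)^{-1}X'
z = y - X @ beta_0

q1 = (beta_hat - beta_0) @ X.T @ X @ (beta_hat - beta_0)
q2 = (y - X @ beta_hat) @ (y - X @ beta_hat)

assert np.isclose(q1, z @ H @ z)               # q1 = z' X(X'X)^{-1}X' z
assert np.isclose(q2, z @ (np.eye(n) - H) @ z) # q2 = z'[I - X(X'X)^{-1}X'] z
```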
In order to find the decision rule for $H_0$ based on $\lambda$, we first need to determine whether $\lambda$ is a monotonically increasing or decreasing function of $q_1/q_2$. We proceed as follows. Let

$$g = \frac{q_1}{q_2}, \quad \text{so that} \quad \lambda = (1 + g)^{-n/2};$$

then

$$\frac{d\lambda}{dg} = -\frac{n}{2}(1 + g)^{-\left(\frac{n}{2} + 1\right)}.$$

So as $g$ increases, $\lambda$ decreases, i.e., $\lambda$ is a monotonically decreasing function of $q_1/q_2$.
The decision rule is to reject $H_0$ if $\lambda \le \lambda_0$, where $\lambda_0$ is a constant to be determined on the basis of the size of the test. Let us simplify this in our context:

$$\lambda \le \lambda_0$$
or
$$(1 + g)^{-n/2} \le \lambda_0$$
or
$$1 + g \ge \lambda_0^{-2/n}$$
or
$$g \ge \lambda_0^{-2/n} - 1$$
or
$$g \ge C, \quad \text{where } C = \lambda_0^{-2/n} - 1.$$

So reject $H_0$ whenever

$$\frac{q_1}{q_2} \ge C.$$
Note that the statistic $q_1/q_2$ can also be obtained by the least squares method, as follows; the least squares methodology will also be discussed in further lectures:

$$\frac{q_1}{q_2} = \frac{\min_{\omega}(y - X\beta)'(y - X\beta) - \min_{\Omega}(y - X\beta)'(y - X\beta)}{\min_{\Omega}(y - X\beta)'(y - X\beta)},$$

since $\min_{\omega}(y - X\beta)'(y - X\beta) = (y - X\beta_0)'(y - X\beta_0) = q_1 + q_2$ and $\min_{\Omega}(y - X\beta)'(y - X\beta) = (y - X\hat{\beta})'(y - X\hat{\beta}) = q_2$.
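Continuing the sketch, this least squares characterization can be checked directly against $q_1$ and $q_2$:

```python
# Least squares view (continuing the sketch): q1 is the increase in the residual
# sum of squares from restricting beta to beta_0; q2 is the unrestricted minimum.
sse_restricted = (y - X @ beta_0) @ (y - X @ beta_0)  # min over omega (beta fixed at beta_0)
sse_full = (y - X @ beta_hat) @ (y - X @ beta_hat)    # min over Omega
assert np.isclose(q1, sse_restricted - sse_full)
assert np.isclose(q2, sse_full)
assert np.isclose(q1 / q2, (sse_restricted - sse_full) / sse_full)
```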
Theorem 9
Let

$$Z = Y - X\beta_0, \qquad Q_1 = Z'X(X'X)^{-1}X'Z, \qquad Q_2 = Z'\left[I - X(X'X)^{-1}X'\right]Z.$$

Then $Q_1$ and $Q_2$ are independently distributed. Further, when $H_0$ is true,

$$\frac{Q_1}{\sigma^2} \sim \chi^2(p) \quad \text{and} \quad \frac{Q_2}{\sigma^2} \sim \chi^2(n - p).$$
Proof: Under $H_0$,

$$E(Z) = X\beta_0 - X\beta_0 = 0, \qquad \text{Var}(Z) = \text{Var}(Y) = \sigma^2 I.$$

Further, $Z$ is a linear function of $Y$, and $Y$ follows a normal distribution, so $Z \sim N(0, \sigma^2 I)$. The matrices $X(X'X)^{-1}X'$ and $\left[I - X(X'X)^{-1}X'\right]$ are idempotent. So

$$\text{tr}\left[X(X'X)^{-1}X'\right] = \text{tr}\left[(X'X)^{-1}X'X\right] = \text{tr}(I_p) = p,$$
$$\text{tr}\left[I - X(X'X)^{-1}X'\right] = \text{tr}(I_n) - \text{tr}\left[X(X'X)^{-1}X'\right] = n - p.$$
Hence

$$\frac{Q_1}{\sigma^2} \sim \chi^2(p) \quad \text{and} \quad \frac{Q_2}{\sigma^2} \sim \chi^2(n - p),$$

where the degrees of freedom $p$ and $(n - p)$ are obtained from the traces of $X(X'X)^{-1}X'$ and $I - X(X'X)^{-1}X'$, respectively. Since

$$\left[I - X(X'X)^{-1}X'\right]X(X'X)^{-1}X' = 0,$$

by Theorem 7 the quadratic forms $Q_1$ and $Q_2$ are independent under $H_0$. Hence the theorem is proved.
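The idempotency, trace, and independence conditions used in the proof can all be confirmed numerically, continuing the sketch:

```python
# Checks on the projection matrices from the proof (continuing the sketch).
M = np.eye(n) - H
assert np.allclose(H @ H, H) and np.allclose(M @ M, M)               # both idempotent
assert np.isclose(np.trace(H), p) and np.isclose(np.trace(M), n - p) # traces p and n - p
assert np.allclose(M @ H, np.zeros((n, n)))                          # independence condition
```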
Since $Q_1$ and $Q_2$ are independently distributed, under $H_0$

$$\frac{Q_1/p}{Q_2/(n - p)}$$

follows a central $F$-distribution, i.e.,

$$\frac{n - p}{p} \cdot \frac{Q_1}{Q_2} \sim F(p, n - p).$$

Hence the constant $C$ in the likelihood ratio test is obtained from $F_{1-\alpha}(p, n - p)$, where $F_{1-\alpha}(n_1, n_2)$ denotes the upper $100\alpha\%$ point of the $F$-distribution with $n_1$ and $n_2$ degrees of freedom; equivalently, the decision rule is to reject $H_0$ whenever $\frac{n - p}{p} \cdot \frac{q_1}{q_2} \ge F_{1-\alpha}(p, n - p)$. The computations of this test of hypothesis can be represented in the form of an analysis of variance table.
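Putting the pieces together, here is a sketch of the resulting $F$-test on the simulated data, with an assumed level $\alpha = 0.05$:

```python
from scipy import stats

# F-test for H0: beta = beta_0 (continuing the sketch; alpha is an assumed level).
alpha = 0.05
F_stat = (n - p) / p * q1 / q2
F_crit = stats.f.ppf(1 - alpha, p, n - p)  # upper 100*alpha% point of F(p, n-p)
print(f"F = {F_stat:.3f}, F_crit = {F_crit:.3f}, reject H0: {F_stat >= F_crit}")
```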
| Source of variation | Degrees of freedom | Sum of squares | Mean squares | F-value |
|---|---|---|---|---|
| Due to $H_0 : \beta = \beta_0$ | $p$ | $q_1$ | $\dfrac{q_1}{p}$ | $\dfrac{n - p}{p} \cdot \dfrac{q_1}{q_2}$ |
| Error | $n - p$ | $q_2$ | $\dfrac{q_2}{n - p}$ | |
| Total | $n$ | $(y - X\beta_0)'(y - X\beta_0)$ | | |

Reject $H_0$ if the $F$-value exceeds $C = F_{1-\alpha}(p, n - p)$.
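For completeness, a sketch that assembles this table from the simulated quantities above:

```python
# Assemble the analysis of variance table from the sketch quantities.
rows = [
    ("Due to H0", p,     q1,      q1 / p),
    ("Error",     n - p, q2,      q2 / (n - p)),
    ("Total",     n,     q1 + q2, None),  # q1 + q2 = (y - X beta_0)'(y - X beta_0)
]
print(f"{'Source':<12}{'df':>4}{'SS':>12}{'MS':>12}")
for source, df, ss, ms in rows:
    ms_txt = f"{ms:12.3f}" if ms is not None else " " * 12
    print(f"{source:<12}{df:>4}{ss:>12.3f}{ms_txt}")
```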