
UNSW MATH2931 Linear Models

Writing Assignment 2
KEVIN GE ([email protected])
RISHIKESAN MARAN ([email protected])
TIMOTHY YU ([email protected])

15 September 2017

Problem 3:
Let y = (y_1, ..., y_n)^T denote a set of responses, and consider the linear model

    y = µ + ε

where µ = (µ, ..., µ)^T and ε is a vector of zero-mean uncorrelated errors with variance σ².
1. This model can be written in general linear model form y = Xβ + ε, where

       X = [1, 1, ..., 1]^T   (n entries)

   and β = µ.
2. • X^T X = [1, 1, ..., 1][1, 1, ..., 1]^T = n
   • (X^T X)^{-1} = 1/n
   • X^T y = [1, 1, ..., 1][y_1, ..., y_n]^T = Σ_{i=1}^n y_i
   • (X^T X)^{-1} X^T y = (1/n) Σ_{i=1}^n y_i = ȳ
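These computations can be sanity-checked numerically. Below is a minimal sketch in Python with numpy, using made-up responses purely for illustration:

    import numpy as np

    # Illustrative responses (any data would do)
    y = np.array([2.0, 4.0, 6.0, 8.0])
    n = len(y)

    # Design matrix for the intercept-only model: a column of n ones
    X = np.ones((n, 1))

    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # (X^T X)^{-1} X^T y

    print(beta_hat[0], y.mean())  # both print 5.0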

3. The least squares estimate of µ is the value µ̂ which minimises the sum of squared
   distances between the observed responses y_i and their expected values E(y_i), i.e. we
   are required to minimise

       S(µ) = Σ_{i=1}^n (y_i − E[y_i])²
            = Σ_{i=1}^n (y_i − E[µ + ε_i])²
            = Σ_{i=1}^n (y_i − µ)²   (since E(ε_i) = 0)

We now minimise S with respect to µ, obtaining

    ∂S/∂µ = −2 Σ_{i=1}^n (y_i − µ).

Setting the partial derivative to zero yields the equation that determines µ̂:

    −2 Σ_{i=1}^n (y_i − µ) = 0
    =⇒ Σ_{i=1}^n (y_i − µ) = 0
    =⇒ nµ = Σ_{i=1}^n y_i
    =⇒ µ̂ = ȳ
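The same conclusion can be checked by minimising S(µ) directly. A small sketch, again with illustrative data, evaluates S over a grid of candidate values:

    import numpy as np

    y = np.array([2.0, 4.0, 6.0, 8.0])

    def S(mu):
        # Sum of squared distances between responses and a candidate mean
        return np.sum((y - mu) ** 2)

    grid = np.linspace(y.min(), y.max(), 1001)
    minimiser = grid[np.argmin([S(m) for m in grid])]

    print(minimiser, y.mean())  # both 5.0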

Problem 6:
Suppose we have the linear model y = Xβ + ε with n × p design matrix X, normal errors
ε ∼ N(0, σ² I_{n×n}), and

    (n − p)σ̂² = y^T (I − X(X^T X)^{-1} X^T) y = y^T A y.

1. We prove that A is symmetric and idempotent. First,

       A^T = (I − X(X^T X)^{-1} X^T)^T
           = I^T − (X(X^T X)^{-1} X^T)^T
           = I − (X^T)^T ((X^T X)^{-1})^T X^T
           = I − X(X^T X)^{-1} X^T = A,   (since X^T X, and hence (X^T X)^{-1}, is symmetric)

   so A is symmetric. Also,

       A² = (I − X(X^T X)^{-1} X^T)²
          = I − 2X(X^T X)^{-1} X^T + X(X^T X)^{-1}(X^T X)(X^T X)^{-1} X^T
          = I − 2X(X^T X)^{-1} X^T + X(X^T X)^{-1} X^T
          = I − X(X^T X)^{-1} X^T = A,

   so A is idempotent.
2. Since A is symmetric and idempotent, rank(A) = tr(A). Also,

       tr(A) = tr(I_{n×n} − X(X^T X)^{-1} X^T)
             = tr(I_{n×n}) − tr(X(X^T X)^{-1} X^T)
             = n − tr(X^T X(X^T X)^{-1})   (by the cyclic property of the trace)
             = n − tr(I_{p×p})
             = n − p.

   Hence rank(A) = tr(A) = n − p.
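Both properties, and the rank-trace identity, are easy to confirm numerically. A minimal sketch, assuming a randomly generated design matrix of full column rank:

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 10, 3
    X = rng.standard_normal((n, p))  # assumed to have full column rank

    # A = I - X (X^T X)^{-1} X^T
    A = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

    print(np.allclose(A, A.T))    # symmetric: True
    print(np.allclose(A @ A, A))  # idempotent: True
    print(np.linalg.matrix_rank(A), np.trace(A))  # 7 and 7.0, i.e. n - p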

3. Write b = (X^T X)^{-1} X^T y for the least squares estimator. We first compute Var(Ay + b):

       Var(Ay + b) = Var((A + (X^T X)^{-1} X^T) y)
                   = (A + (X^T X)^{-1} X^T) Var(y) (A + (X^T X)^{-1} X^T)^T
                   = (A + (X^T X)^{-1} X^T)(σ² I)(A + (X^T X)^{-1} X^T)^T
                   = σ² (A + (X^T X)^{-1} X^T)(A + X(X^T X)^{-1})   (since A^T = A)
                   = σ² (A² + AX(X^T X)^{-1} + (X^T X)^{-1} X^T A + (X^T X)^{-1}(X^T X)(X^T X)^{-1})
                   = σ² (A + AX(X^T X)^{-1} + (X^T X)^{-1} X^T A + (X^T X)^{-1}).

Now,

    AX = (I − X(X^T X)^{-1} X^T)X = X − X(X^T X)^{-1}(X^T X) = X − X = 0

and

    X^T A = X^T(I − X(X^T X)^{-1} X^T) = X^T − (X^T X)(X^T X)^{-1} X^T = X^T − X^T = 0.

Hence, substituting AX = 0 and X^T A = 0 into the expression for Var(Ay + b) above, we obtain

    Var(Ay + b) = σ² (A + 0 · (X^T X)^{-1} + (X^T X)^{-1} · 0 + (X^T X)^{-1})
                = σ² (A + (X^T X)^{-1}).

Next we compute Var(Ay) and Var(b) individually:

    Var(Ay) = A Var(y) A^T = σ² A A^T = σ² A   (using A^T = A and A² = A)

and

    Var(b) = Var((X^T X)^{-1} X^T y)
           = (X^T X)^{-1} X^T Var(y) ((X^T X)^{-1} X^T)^T
           = σ² (X^T X)^{-1}(X^T X)(X^T X)^{-1}
           = σ² (X^T X)^{-1}.

Finally, we substitute the obtained variances into the variance-of-a-sum identity:

    Var(Ay + b) = Var(Ay) + Var(b) + 2 Cov(Ay, b)
    σ² (A + (X^T X)^{-1}) = σ² A + σ² (X^T X)^{-1} + 2 Cov(Ay, b)
    =⇒ 2 Cov(Ay, b) = 0
    =⇒ Cov(Ay, b) = 0.
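The result can also be traced back to AX = 0: directly, Cov(Ay, b) = A Var(y) ((X^T X)^{-1} X^T)^T = σ² AX(X^T X)^{-1}, which vanishes. A quick numerical check with the same illustrative random design as before:

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 10, 3
    X = rng.standard_normal((n, p))
    XtX_inv = np.linalg.inv(X.T @ X)
    A = np.eye(n) - X @ XtX_inv @ X.T

    print(np.allclose(A @ X, 0))  # AX = 0: True
    # Hence Cov(Ay, b) = sigma^2 * A X (X^T X)^{-1} = 0 as well
    print(np.allclose(A @ X @ XtX_inv, 0))  # True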

4. To determine the distribution of

       E = (b − β)^T X^T X (b − β) / σ²,

   we first rewrite Xb:

       Xb = X(X^T X)^{-1} X^T y
          = X(X^T X)^{-1} X^T (Xβ + ε)
          = X(X^T X)^{-1}(X^T X)β + X(X^T X)^{-1} X^T ε
          = Xβ + X(X^T X)^{-1} X^T ε.

   Therefore, noting that E = (X(b − β))^T X(b − β) / σ²,

       E = (Xb − Xβ)^T (Xb − Xβ) / σ²
         = (X(X^T X)^{-1} X^T ε)^T X(X^T X)^{-1} X^T ε / σ²
         = ε^T X(X^T X)^{-1}(X^T X)(X^T X)^{-1} X^T ε / σ²
         = ε^T X(X^T X)^{-1} X^T ε / σ²
         = ε^T (I − A) ε / σ².   (1)
Now, since A is symmetric,

    (I − A)^T = I^T − A^T = I − A,

so I − A is also symmetric. By the spectral theorem we can therefore diagonalise I − A in
equation (1) as I − A = Q^{-1} D Q, where D is a diagonal matrix containing the eigenvalues
of I − A and Q is a real orthogonal matrix (that is, Q^{-1} = Q^T). Therefore,

    E = ε^T Q^T D Q ε / σ²
      = (Qε/σ)^T D (Qε/σ)
      = z^T D z   (where z = Qε/σ)
      = Σ_{1 ≤ i,j ≤ n} z_i [D]_{ij} z_j
      = Σ_{i=1}^n [D]_{ii} z_i²   (since D is a diagonal matrix)

Now we determine the possible values of [D]_{ii}, that is, the eigenvalues of I − A. Since A
is idempotent, I − A is also idempotent, because

    (I − A)² = I² − IA − AI + A² = I − 2A + A = I − A.

This implies that its eigenvalues can only take the values 1 and 0: if (I − A)v = λv for some
v ≠ 0, then λv = (I − A)v = (I − A)²v = λ²v, so λ² = λ. Also, since the sum of the eigenvalues
of I − A is

    tr(I − A) = tr(I) − tr(A) = n − (n − p) = p,

the eigenvalues 1 and 0 must have multiplicities p and n − p respectively. Therefore [D]_{ii} = 1
for p values of i (say i = i_1, i_2, ..., i_p) and 0 otherwise.
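The eigenvalue multiplicities can be confirmed numerically, continuing with the illustrative random design used earlier:

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 10, 3
    X = rng.standard_normal((n, p))
    A = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

    # I - A is symmetric, so eigvalsh applies; expect p ones and n - p zeros
    eigenvalues = np.linalg.eigvalsh(np.eye(n) - A)
    print(np.round(eigenvalues, 10))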

Next we determine the distribution of z_i². It is given that ε ∼ N(0, σ² I_{n×n}), so it follows
that

    z = Qε/σ ∼ N((Q/σ) · 0, (Q/σ)(σ² I_{n×n})(Q/σ)^T) = N(0, Q Q^T) = N(0, I_{n×n})   (since QQ^T = I).

Therefore z_i ∼ N(0, 1), and hence z_i² ∼ χ²_1. Moreover, the components of z are independent,
since z is jointly normal with identity covariance. To conclude,

    E = Σ_{i=1}^n [D]_{ii} z_i² = Σ_{k=1}^p 1 · z_{i_k}² = Σ_{k=1}^p z_{i_k}² ∼ χ²_p.
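A Monte Carlo sketch supports this conclusion. Simulating the model many times (with an arbitrary illustrative β and σ) and comparing the sample mean and variance of E against the χ²_p values p and 2p:

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, sigma = 20, 3, 2.0
    X = rng.standard_normal((n, p))
    beta = np.array([1.0, -0.5, 2.0])  # arbitrary true coefficients
    XtX = X.T @ X
    XtX_inv = np.linalg.inv(XtX)

    E_samples = []
    for _ in range(20000):
        y = X @ beta + sigma * rng.standard_normal(n)
        b = XtX_inv @ X.T @ y  # least squares estimate
        d = b - beta
        E_samples.append(d @ XtX @ d / sigma**2)

    E_samples = np.array(E_samples)
    # chi^2_p has mean p = 3 and variance 2p = 6
    print(E_samples.mean(), E_samples.var())  # approximately 3 and 6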
