
LINEAR REGRESSION MODELS W4315

HOMEWORK 5 ANSWERS
March 9, 2010

Due: 03/04/10
Instructor: Frank Wood
1. (20 points) In order to get a maximum likelihood estimate of the parameters of a
Box-Cox transformed simple linear regression model ($Y_i^{\lambda} = \beta_0 + \beta_1 X_i + \varepsilon_i$), we need to find
the gradient of the likelihood with respect to its parameters (the gradient consists of the
partial derivatives of the likelihood function w.r.t. all of the parameters). Derive the partial
derivatives of the likelihood w.r.t. all parameters, assuming that $\varepsilon_i \sim N(0, \sigma^2)$. (N.B. the
parameters here are $\lambda$, $\beta_0$, $\beta_1$, $\sigma^2$.)
(Extra Credit: Given this collection of partial derivatives (the gradient), how would you then
proceed to arrive at final estimates of all the parameters? Hint: consider how to increase
the likelihood function by making small changes in the parameter settings.)
Answer:
The gradient of a multivariate function is a vector consisting of the partial derivatives w.r.t.
every single variable. So we need to write down the full likelihood first:
$$L = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i^{\lambda} - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right)$$
Then the log-likelihood function is:
$$l = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}$$
Taking derivatives w.r.t. all four parameters, we have the following:
$$\frac{\partial l}{\partial \lambda} = -\frac{1}{\sigma^2}\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i)\, y_i^{\lambda}\ln y_i \qquad (1)$$
$$\frac{\partial l}{\partial \beta_0} = \frac{1}{\sigma^2}\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i) \qquad (2)$$
$$\frac{\partial l}{\partial \beta_1} = \frac{1}{\sigma^2}\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i)\, x_i \qquad (3)$$
$$\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i)^2}{2\sigma^4} \qquad (4)$$
From the above equation array, we have the gradient.
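For the extra credit, one natural way to use the gradient is plain gradient ascent: start from an initial guess and repeatedly nudge each parameter a small step in the direction of its partial derivative until the log-likelihood stops improving. The following is a minimal MATLAB sketch of that idea (not part of the assigned solution); the data vectors x, y, the starting values, and the step size eta are all illustrative assumptions.
Matlab Code:
% Gradient ascent on the Box-Cox log-likelihood (illustrative sketch)
% assumes column vectors x, y with y > 0
lambda=1; b0=0; b1=0; sigma2=var(y);       % rough starting values (assumption)
eta=1e-4;                                  % hand-picked step size (assumption)
for iter=1:10000
    r=y.^lambda-b0-b1*x;                   % residuals of the transformed model
    g_lambda=-(1/sigma2)*sum(r.*(y.^lambda).*log(y));        % equation (1)
    g_b0=(1/sigma2)*sum(r);                                   % equation (2)
    g_b1=(1/sigma2)*sum(r.*x);                                % equation (3)
    g_sigma2=-length(y)/(2*sigma2)+sum(r.^2)/(2*sigma2^2);    % equation (4)
    lambda=lambda+eta*g_lambda;            % move each parameter uphill
    b0=b0+eta*g_b0;
    b1=b1+eta*g_b1;
    sigma2=max(sigma2+eta*g_sigma2,1e-8);  % keep the variance positive
end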

2. (15 points) Derive an extension of the Bonferroni inequality (4.2a), which is given as
$$P(\bar{A}_1 \cap \bar{A}_2) \ge 1 - \alpha - \alpha = 1 - 2\alpha,$$
for the case of three statements, each with statement confidence coefficient $1 - \alpha$.
(This is problem 4.22 in Applied Linear Regression Models (4th edition) by Kutner et al.)
Answer:
Following the thread on page 155 in the textbook, let $A_1, A_2, A_3$ denote the events that the respective confidence statements are incorrect, and suppose $P(A_1) = P(A_2) = P(A_3) = \alpha$. Then
$$P(\bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3) = P(\overline{A_1 \cup A_2 \cup A_3}) = 1 - P(A_1 \cup A_2 \cup A_3)$$
$$= 1 - [P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3)]$$
$$= 1 - 3\alpha + P(A_1 \cap A_2) + P(A_1 \cap A_3) + P(A_2 \cap A_3) - P(A_1 \cap A_2 \cap A_3) \ge 1 - 3\alpha,$$
since $P(A_1 \cap A_2) + P(A_1 \cap A_3) + P(A_2 \cap A_3) \ge P(A_1 \cap A_2 \cap A_3)$.
So we have $P(\bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3) \ge 1 - P(A_1) - P(A_2) - P(A_3) = 1 - 3\alpha$.
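For example, if each of the three interval estimates is constructed with statement confidence coefficient $1 - \alpha = 0.95$, the family confidence coefficient for all three statements holding jointly is at least $1 - 3(0.05) = 0.85$.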
3. (25 points) Refer to Consumer finance Problems 5.5 and 5.13. (This is problem 5.24 in Applied Linear Regression Models (4th edition) by Kutner et al.)
a. Using matrix methods, obtain the following: (1) vector of estimated regression coefficients,
(2) vector of residuals, (3) SSR, (4) SSE, (5) estimated variance-covariance matrix of b, (6)
point estimate of $E\{Y_h\}$ when $X_h = 4$, (7) $s^2\{\mathrm{pred}\}$ when $X_h = 4$
b. From your estimated variance-covariance matrix in part (a5), obtain the following: (1)
$s\{b_0, b_1\}$; (2) $s^2\{b_0\}$; (3) $s\{b_1\}$
c. Find the hat matrix H
d. Find $s^2\{e\}$
Answer:
(a)
$$X = \begin{bmatrix} 1 & 4 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}, \qquad
Y = \begin{bmatrix} 16 \\ 5 \\ 10 \\ 15 \\ 13 \\ 22 \end{bmatrix}, \qquad
X'X = \begin{bmatrix} 6 & 17 \\ 17 & 55 \end{bmatrix}, \qquad
(X'X)^{-1} = \frac{1}{41}\begin{bmatrix} 55 & -17 \\ -17 & 6 \end{bmatrix}$$

$$(X'X)^{-1}X' = \frac{1}{41}\begin{bmatrix} 55 & -17 \\ -17 & 6 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 4 & 1 & 2 & 3 & 3 & 4 \end{bmatrix}
= \frac{1}{41}\begin{bmatrix} -13 & 38 & 21 & 4 & 4 & -13 \\ 7 & -11 & -5 & 1 & 1 & 7 \end{bmatrix}$$

$$H = X(X'X)^{-1}X' = \frac{1}{41}\begin{bmatrix}
15 & -6 & 1 & 8 & 8 & 15 \\
-6 & 27 & 16 & 5 & 5 & -6 \\
1 & 16 & 11 & 6 & 6 & 1 \\
8 & 5 & 6 & 7 & 7 & 8 \\
8 & 5 & 6 & 7 & 7 & 8 \\
15 & -6 & 1 & 8 & 8 & 15 \end{bmatrix}$$

(1): $b = (X'X)^{-1}X'Y = \frac{1}{41}\begin{bmatrix} -13 & 38 & 21 & 4 & 4 & -13 \\ 7 & -11 & -5 & 1 & 1 & 7 \end{bmatrix}\begin{bmatrix} 16 \\ 5 \\ 10 \\ 15 \\ 13 \\ 22 \end{bmatrix} = \frac{1}{41}\begin{bmatrix} 18 \\ 189 \end{bmatrix} = \begin{bmatrix} 0.4390 \\ 4.6098 \end{bmatrix}$

(2): Residuals $= Y - Xb = \begin{bmatrix} 16 \\ 5 \\ 10 \\ 15 \\ 13 \\ 22 \end{bmatrix} - \begin{bmatrix} 1 & 4 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}\begin{bmatrix} 0.4390 \\ 4.6098 \end{bmatrix} = \begin{bmatrix} -2.8780 \\ -0.0488 \\ 0.3415 \\ 0.7317 \\ -1.2683 \\ 3.1220 \end{bmatrix}$

(3): $SSR = Y'[H - \frac{1}{n}J]Y = 145.2073$

(4): $SSE = Y'(I - H)Y = 20.2927$

(5): The estimated variance-covariance matrix of $b$ is $s^2\{b\} = MSE\,(X'X)^{-1} = \begin{bmatrix} 6.8055 & -2.1035 \\ -2.1035 & 0.7424 \end{bmatrix}$, where $MSE = SSE/(n-2) = 5.0732$.

(6): The point estimate of $E\{Y_h\}$ is $X_h'b = \begin{bmatrix} 1 & 4 \end{bmatrix}\begin{bmatrix} 0.4390 \\ 4.6098 \end{bmatrix} = 18.8780$

(7): At $X_h = 4$, $s^2\{\mathrm{pred}\} = MSE\,(1 + X_h'(X'X)^{-1}X_h) = 6.9292$

(b) $s\{b_0, b_1\} = -2.1035$; $s^2\{b_0\} = 6.8055$; $s\{b_1\} = \sqrt{0.7424} = 0.8616$

(c) As calculated in part (a), the hat matrix is
$$H = X(X'X)^{-1}X' = \frac{1}{41}\begin{bmatrix}
15 & -6 & 1 & 8 & 8 & 15 \\
-6 & 27 & 16 & 5 & 5 & -6 \\
1 & 16 & 11 & 6 & 6 & 1 \\
8 & 5 & 6 & 7 & 7 & 8 \\
8 & 5 & 6 & 7 & 7 & 8 \\
15 & -6 & 1 & 8 & 8 & 15 \end{bmatrix}
= \begin{bmatrix}
0.3659 & -0.1463 & 0.0244 & 0.1951 & 0.1951 & 0.3659 \\
-0.1463 & 0.6585 & 0.3902 & 0.1220 & 0.1220 & -0.1463 \\
0.0244 & 0.3902 & 0.2683 & 0.1463 & 0.1463 & 0.0244 \\
0.1951 & 0.1220 & 0.1463 & 0.1707 & 0.1707 & 0.1951 \\
0.1951 & 0.1220 & 0.1463 & 0.1707 & 0.1707 & 0.1951 \\
0.3659 & -0.1463 & 0.0244 & 0.1951 & 0.1951 & 0.3659 \end{bmatrix}$$

(d)
$$s^2\{e\} = MSE\,(I - H) = \begin{bmatrix}
3.2171 & 0.7424 & -0.1237 & -0.9899 & -0.9899 & -1.8560 \\
0.7424 & 1.7323 & -1.9798 & -0.6187 & -0.6187 & 0.7424 \\
-0.1237 & -1.9798 & 3.7121 & -0.7424 & -0.7424 & -0.1237 \\
-0.9899 & -0.6187 & -0.7424 & 4.2070 & -0.8662 & -0.9899 \\
-0.9899 & -0.6187 & -0.7424 & -0.8662 & 4.2070 & -0.9899 \\
-1.8560 & 0.7424 & -0.1237 & -0.9899 & -0.9899 & 3.2171 \end{bmatrix}$$
Matlab Code:
X=[1 4;1 1;1 2;1 3;1 3;1 4]
Y=[16;5;10;15;13;22]
J=ones(6,6)
I=eye(6,6)
[n,m]=size(Y)
Z=inv(X'*X)
H=X*Z*X'
beta=Z*X'*Y
residual=Y-H*Y
SSR=Y'*(H-(1/n)*J)*Y
SSE=Y'*(I-H)*Y
MSE=SSE/(n-2)
cov=MSE*Z
s2e=MSE*(I-H)
Xh=[1;4]
Yhhat=Xh'*beta
s2pred=MSE*(1+Xh'*Z*Xh)
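As a quick cross-check (an optional addition, assuming MATLAB's Statistics Toolbox is available), the coefficient estimates can be compared against the built-in fit:
mdl=fitlm(X(:,2),Y)        % regress Y on the single predictor; intercept added automatically
mdl.Coefficients.Estimate  % should be close to [0.4390; 4.6098]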

4. (25 points) In a small-scale regression study, the following data were obtained. (This is problem 6.27 in Applied Linear Regression Models (4th edition) by Kutner et al.)

i:    1   2   3   4   5   6
Xi1:  7   4  16   3  21   8
Xi2: 33  41   7  49   5  31
Yi:  42  33  75  28  91  55

Assume that the regression model
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i \qquad (5)$$
with independent normal error terms is appropriate. Using matrix methods, obtain (a) b;
(b) e; (c) H; (d) SSR; (e) $s^2\{b\}$; (f) $\hat{Y}_h$ when $X_{h1} = 10$, $X_{h2} = 30$; (g) $s^2\{\hat{Y}_h\}$ when $X_{h1} = 10$,
$X_{h2} = 30$
Answer:

(a) $b = (X'X)^{-1}X'Y = \begin{bmatrix} 33.9321 \\ 2.7848 \\ -0.2644 \end{bmatrix}$

(b) $e = Y - Xb = \begin{bmatrix} -2.6996 \\ -1.2300 \\ -1.6374 \\ -1.3299 \\ -0.0900 \\ 6.9868 \end{bmatrix}$

(c)
$$H = X(X'X)^{-1}X' = \begin{bmatrix}
0.2314 & 0.2517 & 0.2118 & 0.1489 & -0.0548 & 0.2110 \\
0.2517 & 0.3124 & 0.0944 & 0.2663 & -0.1479 & 0.2231 \\
0.2118 & 0.0944 & 0.7044 & -0.3192 & 0.1045 & 0.2041 \\
0.1489 & 0.2663 & -0.3192 & 0.6143 & 0.1414 & 0.1483 \\
-0.0548 & -0.1479 & 0.1045 & 0.1414 & 0.9404 & 0.0163 \\
0.2110 & 0.2231 & 0.2041 & 0.1483 & 0.0163 & 0.1971 \end{bmatrix}$$

(d) $SSR = Y'[H - \frac{1}{n}J]Y = 3009.926$

(e) $s^2\{b\} = MSE\,(X'X)^{-1} = \begin{bmatrix}
715.4711 & -34.1589 & -13.5949 \\
-34.1589 & 1.6617 & 0.6441 \\
-13.5949 & 0.6441 & 0.2625 \end{bmatrix}$

(f) $\hat{Y}_h = X_h'b = \begin{bmatrix} 1 & 10 & 30 \end{bmatrix}\begin{bmatrix} 33.9321 \\ 2.7848 \\ -0.2644 \end{bmatrix} = 53.8471$

(g) At $X_{h1} = 10$ and $X_{h2} = 30$, $s^2\{\hat{Y}_h\} = X_h'\, s^2\{b\}\, X_h = 5.4246$
Matlab Code:
X=[1 7 33;1 4 41;1 16 7;1 3 49;1 21 5; 1 8 31]
Y=[42;33;75;28;91;55]
J=ones(6,6)
I=eye(6,6)
[n,m]=size(Y)
Z=inv(X'*X)
H=X*Z*X'
beta=Z*X'*Y
residual=Y-H*Y
SSR=Y'*(H-(1/n)*J)*Y
SSE=Y'*(I-H)*Y
MSE=SSE/(n-3)
cov=MSE*Z
s2e=MSE*(I-H)
Xh=[1;10;30]
Yhhat=Xh'*beta
s2yhat=Xh'*cov*Xh
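The standard errors of the individual coefficients follow directly from the diagonal of the estimated variance-covariance matrix computed above; a small optional addition:
se_b=sqrt(diag(cov))   % roughly [26.75; 1.29; 0.51], the standard errors of b0, b1, b2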

5. (15 points) Consider the classic regression model in matrix form, i.e.
$$Y = X\beta + \varepsilon,$$
where $X$ is an $n \times p$ design matrix whose first column is an all-1 vector, $\varepsilon \sim N(0, \sigma^2 I)$, and $I$ is
an identity matrix. Prove the following:
a. The residual sum of squares $RSS = e'e$ can be written in matrix form:
$$RSS = y'(I - X(X'X)^{-1}X')y \qquad (6)$$
b. We call the RHS of (6) a sandwich. Prove that the matrix in the middle layer of the sandwich,
$N = I - X(X'X)^{-1}X'$, is an idempotent matrix.
c. Prove that the rank of $N$ defined in part (b) is $n - p$.
N.B. $p$ columns in the design matrix means there are $p - 1$ predictors plus 1 intercept term.
Before handling the problem, make clear the dimensions of all the matrices involved.
Answer:
(a) $SSE = e'e = (y - Xb)'(y - Xb) = (y' - b'X')(y - Xb) = y'y - 2b'X'y + b'X'Xb$.
Substituting $b = (X'X)^{-1}X'y$ into the last term gives $b'X'X(X'X)^{-1}X'y = b'X'y$, so
$$SSE = y'y - b'X'y = y'y - ((X'X)^{-1}X'y)'X'y = y'y - y'X(X'X)^{-1}X'y = y'(I - X(X'X)^{-1}X')y.$$
(b) $N^2 = NN = (I - X(X'X)^{-1}X')(I - X(X'X)^{-1}X') = I - 2X(X'X)^{-1}X' + X(X'X)^{-1}X'X(X'X)^{-1}X' = I - 2X(X'X)^{-1}X' + X(X'X)^{-1}X' = I - X(X'X)^{-1}X' = N$.
Therefore, $N$ is an idempotent matrix.
(c) Since $N$ is a symmetric and idempotent matrix, $\mathrm{rank}(N) = \mathrm{trace}(N)$.
Let $H = X(X'X)^{-1}X'$. Then
$$\mathrm{trace}(N) = \mathrm{trace}(I_{n\times n} - H_{n\times n}) = \mathrm{trace}(I) - \mathrm{trace}(H) = n - \mathrm{trace}(X(X'X)^{-1}X') = n - \mathrm{trace}((X'X)^{-1}_{p\times p} X'_{p\times n} X_{n\times p}) = n - \mathrm{trace}(I_{p\times p}) = n - p.$$
So $\mathrm{rank}(N) = n - p$.
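A small numerical illustration of parts (b) and (c) (an optional addition, reusing the design matrix from problem 3, so n = 6 and p = 2):
Matlab Code:
X=[1 4;1 1;1 2;1 3;1 3;1 4]
N=eye(6)-X*inv(X'*X)*X'
norm(N*N-N)    % essentially 0, confirming N is idempotent
trace(N)       % 4 = n - p
rank(N)        % 4 as well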
