Week 2 - Solutions

1. We show the claim only in the case that X and Y are continuous random variables.
Let f_{X,Y} be the joint density of X and Y. Then

    E(aX + bY) = ∫∫ (ax + by) f_{X,Y}(x, y) dx dy
               = a ∫∫ x f_{X,Y}(x, y) dx dy + b ∫∫ y f_{X,Y}(x, y) dx dy
               = a ∫ x f_X(x) dx + b ∫ y f_Y(y) dy
               = a E(X) + b E(Y)
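
As a quick numerical sanity check, linearity of expectation can be verified by simulation. The following NumPy sketch uses an arbitrary pair of dependent random variables and arbitrary constants a, b:

```python
import numpy as np

# Monte Carlo sketch: E(aX + bY) = a E(X) + b E(Y).
# X and Y need not be independent; here Y deliberately depends on X.
rng = np.random.default_rng(0)
a, b = 2.0, -3.0
x = rng.normal(1.0, 1.0, size=1_000_000)
y = 0.5 * x + rng.normal(2.0, 0.5, size=1_000_000)

lhs = np.mean(a * x + b * y)           # estimate of E(aX + bY)
rhs = a * np.mean(x) + b * np.mean(y)  # a E(X) + b E(Y)
print(lhs, rhs)                        # agree up to Monte Carlo error
```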

2. (a) From the definition of the variance,

    Var(aX) = E((aX − E(aX))²)
            = a² E((X − E(X))²)   by linearity of the expectation
            = a² Var(X)

(b) The first claim is obvious; we show only the second. If X and Y are independent,
then

    E[(X − E(X))(Y − E(Y))] = E[X − E(X)] E[Y − E(Y)] = 0

(c) From the definition of the covariance function, we have

    Cov(X, Y) = E[(X − E(X))(Y − E(Y))]
              = E[(Y − E(Y))(X − E(X))]
              = Cov(Y, X)

To prove linearity in the first entry,

    Cov(aX + bY, Z) = E[((aX + bY) − E(aX + bY))(Z − E(Z))]
                    = a E[(X − E(X))(Z − E(Z))] + b E[(Y − E(Y))(Z − E(Z))]
                    = a Cov(X, Z) + b Cov(Y, Z)

Linearity in the second entry follows from the fact that the covariance function is symmetric.
Finally, for a, b ∈ R and random variables X and Y,

    Var(aX + bY) = Cov(aX + bY, aX + bY)
                 = Cov(aX, aX + bY) + Cov(bY, aX + bY)                     linearity in the first entry
                 = Cov(aX, aX) + Cov(aX, bY) + Cov(bY, aX) + Cov(bY, bY)   linearity in the second entry
                 = Var(aX) + 2 Cov(aX, bY) + Var(bY)
                 = a² Var(X) + 2ab Cov(X, Y) + b² Var(Y)
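
The variance formula above can likewise be checked by simulation; this sketch again uses arbitrarily chosen, correlated X and Y:

```python
import numpy as np

# Sketch check of Var(aX + bY) = a^2 Var(X) + 2ab Cov(X, Y) + b^2 Var(Y).
rng = np.random.default_rng(1)
a, b = 1.5, -2.0
x = rng.normal(size=1_000_000)
y = 0.3 * x + rng.normal(size=1_000_000)   # correlated with X on purpose

lhs = np.var(a * x + b * y)
rhs = a**2 * np.var(x) + 2 * a * b * np.cov(x, y)[0, 1] + b**2 * np.var(y)
print(lhs, rhs)   # agree up to sampling error
```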

3. In the simple linear regression model,

    SSreg = Σ_{i=1}^n (ŷ_i − ȳ)²
          = Σ_{i=1}^n (b0 + b1 xi − (b0 + b1 x̄))²
          = Σ_{i=1}^n b1² (xi − x̄)²
          = b1² Sxx

where in the second equality, we have used the fact that

    ŷ_i = b0 + b1 xi   and   ȳ = b0 + b1 x̄

One can conclude that SSreg = Sxy²/Sxx by using the fact that b1 = Sxy/Sxx.
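
A short sketch on synthetic data (arbitrary intercept, slope, and noise level) illustrating that SSreg, b1² Sxx, and Sxy²/Sxx all coincide numerically:

```python
import numpy as np

# Sketch check of SSreg = b1^2 * Sxx = Sxy^2 / Sxx on toy data.
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=50)
y = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=50)

Sxx = np.sum((x - x.mean())**2)
Sxy = np.sum((x - x.mean()) * y)
b1 = Sxy / Sxx
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x
SSreg = np.sum((y_hat - y.mean())**2)
print(SSreg, b1**2 * Sxx, Sxy**2 / Sxx)   # all three agree
```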

4. We compute directly,

    Sxx = Σ_{i=1}^n (xi − x̄) xi
        = Σ_{i=1}^n (xi − x̄)(xi − x̄ + x̄)
        = Σ_{i=1}^n (xi − x̄)(xi − x̄) + Σ_{i=1}^n (xi − x̄) x̄
        = Σ_{i=1}^n (xi − x̄)(xi − x̄)

where the second term is zero since

    Σ_{i=1}^n (xi − x̄) x̄ = x̄ Σ_{i=1}^n (xi − x̄) = x̄ (Σ_{i=1}^n xi − n x̄) = 0

By a similar argument,

    Sxy = Σ_{i=1}^n (xi − x̄) yi
        = Σ_{i=1}^n (xi − x̄)(yi − ȳ + ȳ)
        = Σ_{i=1}^n (xi − x̄)(yi − ȳ) + ȳ Σ_{i=1}^n (xi − x̄)
        = Σ_{i=1}^n (xi − x̄)(yi − ȳ)

On the other hand,

    Sxy = Σ_{i=1}^n (xi − x̄) yi
        = Σ_{i=1}^n xi yi − x̄ Σ_{i=1}^n yi
        = Σ_{i=1}^n xi yi − (1/n) (Σ_{i=1}^n xi)(Σ_{i=1}^n yi)
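
The equivalent forms of Sxy can be confirmed on arbitrary data with a few lines of NumPy:

```python
import numpy as np

# Sketch check of the equivalent forms of Sxy.
rng = np.random.default_rng(3)
x = rng.normal(size=30)
y = rng.normal(size=30)

s1 = np.sum((x - x.mean()) * y)                  # sum (xi - xbar) yi
s2 = np.sum((x - x.mean()) * (y - y.mean()))     # sum (xi - xbar)(yi - ybar)
s3 = np.sum(x * y) - x.sum() * y.sum() / len(x)  # computational formula
print(s1, s2, s3)   # all three coincide
```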

5. Compute Var(b0).

    Var(b0) = Var(ȳ − b1 x̄)
            = Var(ȳ) + Var(b1 x̄) − 2 Cov(ȳ, b1 x̄)
            = σ²/n + x̄² σ²/Sxx

Where we have made use of the following


    Var(ȳ) = Var((1/n) Σ_{i=1}^n yi)
           = (1/n²) Σ_{i=1}^n Var(yi)   by independence of the yi
           = σ²/n

using the fact that Var(yi) = σ²,

and

    Var(b1 x̄) = x̄² Var(b1)
              = x̄² (1/Sxx²) Σ_{i=1}^n (xi − x̄)² Var(yi)   using the fact that b1 = Sxy/Sxx
              = x̄² σ²/Sxx

To compute the covariance,

    Cov(ȳ, b1 x̄) = (x̄/Sxx) Cov(ȳ, Sxy)
                 = (x̄/(n Sxx)) Σ_{i=1}^n Cov(yi, Sxy)                           by linearity in the first component
                 = (x̄/(n Sxx)) Σ_{i=1}^n Cov(yi, Σ_{j=1}^n (xj − x̄) yj)
                 = (x̄/(n Sxx)) Σ_{i=1}^n Σ_{j=1}^n (xj − x̄) Cov(yi, yj)         by linearity in the second component
                 = (x̄ σ²/(n Sxx)) Σ_{i=1}^n (xi − x̄)                            since Cov(yi, yj) = σ² when j = i and 0 when j ≠ i
                 = 0

6. Compute Cov(b0, b1).

    Cov(b0, b1) = Cov(ȳ − b1 x̄, b1)
                = Cov(ȳ, b1) − Cov(b1 x̄, b1)
                = −x̄ Cov(b1, b1)   since Cov(ȳ, b1) = 0 by the previous question
                = −x̄ σ²/Sxx
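
Both of these results can be checked by simulating repeated samples from a simple linear regression model; the design points, coefficients, and noise level below are arbitrary illustrations:

```python
import numpy as np

# Simulation sketch: Var(b0) ≈ σ²(1/n + x̄²/Sxx) and Cov(b0, b1) ≈ -x̄ σ²/Sxx.
rng = np.random.default_rng(4)
n, sigma = 20, 2.0
x = np.linspace(0, 5, n)              # fixed design (arbitrary choice)
beta0, beta1 = 1.0, 0.5
Sxx = np.sum((x - x.mean())**2)

b0s, b1s = [], []
for _ in range(20_000):
    y = beta0 + beta1 * x + rng.normal(scale=sigma, size=n)
    b1 = np.sum((x - x.mean()) * y) / Sxx
    b0 = y.mean() - b1 * x.mean()
    b0s.append(b0)
    b1s.append(b1)

print(np.var(b0s), sigma**2 * (1/n + x.mean()**2 / Sxx))   # Var(b0)
print(np.cov(b0s, b1s)[0, 1], -x.mean() * sigma**2 / Sxx)  # Cov(b0, b1)
```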

7. Let X ~ N(0, 1) and Y ~ χ²(n) be independent, and compute the density of T := X/√(Y/n).

    P(T ≤ t) = P(X ≤ t √(Y/n))
             = ∫∫_{x ≤ t√(y/n)} f_{X,Y}(x, y) dx dy
             = ∫∫_{x ≤ t√(y/n)} f_X(x) f_Y(y) dx dy
             = ∫_0^∞ f_Y(y) ∫_{−∞}^{t√(y/n)} f_X(x) dx dy
             = ∫_0^∞ f_Y(y) F_X(t √(y/n)) dy

Assuming that we can differentiate under the integral sign, we obtain

    f_T(t) = ∫_0^∞ f_Y(y) √(y/n) f_X(t √(y/n)) dy
           = ∫_0^∞ (1/(2^{n/2} Γ(n/2))) y^{n/2 − 1} e^{−y/2} √(y/n) (1/√(2π)) e^{−t² y/(2n)} dy
           = (1/(√(2πn) 2^{n/2} Γ(n/2))) ∫_0^∞ y^{(n+1)/2 − 1} e^{−(1/2)(t²/n + 1) y} dy

Change variable by setting z = (1/2)(t²/n + 1) y. Using the fact that Γ(1/2) = √π
and the definition of the Gamma function, we obtain

    f_T(t) = (1/(√(2πn) 2^{n/2} Γ(n/2))) (t²/n + 1)^{−(n+1)/2} 2^{(n+1)/2} ∫_0^∞ z^{(n+1)/2 − 1} e^{−z} dz
           = (Γ((n+1)/2)/(√(πn) Γ(n/2))) (t²/n + 1)^{−(n+1)/2}
           = (1/(B(1/2, n/2) √n)) (t²/n + 1)^{−(n+1)/2},

where the last equality follows from the fact that B(1/2, n/2) = Γ(1/2) Γ(n/2)/Γ((n+1)/2).
8. It is enough to notice that x^T A^T A x = ‖Ax‖² ≥ 0 and x^T A A^T x = ‖A^T x‖² ≥ 0.



Week 4 - Solutions

1. From the definition of the covariance function for random vectors,

    Var(CX) = E((CX − E(CX))(CX − E(CX))^T)
            = E((C(X − E(X)))(C(X − E(X)))^T)
            = C E((X − E(X))(X − E(X))^T) C^T
            = C Var(X) C^T.

Recall that b = (X^T X)^{−1} X^T y. Therefore, by using the above formula and the fact
that X^T X is a symmetric matrix,

    Var(b) = Var((X^T X)^{−1} X^T y)
           = (X^T X)^{−1} X^T Var(y) X (X^T X)^{−1}
           = (X^T X)^{−1} X^T (σ² I) X (X^T X)^{−1}
           = σ² (X^T X)^{−1}.
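
A simulation sketch of this result, with an arbitrary two-column design matrix: the empirical covariance of b over repeated samples should be close to σ² (X^T X)^{−1}.

```python
import numpy as np

# Sketch check: the sampling covariance of b = (X^T X)^{-1} X^T y is close to σ² (X^T X)^{-1}.
rng = np.random.default_rng(6)
n, sigma = 30, 1.5
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])   # arbitrary design with intercept
beta = np.array([2.0, -0.7])
XtX_inv = np.linalg.inv(X.T @ X)

bs = []
for _ in range(20_000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    bs.append(XtX_inv @ X.T @ y)

print(np.cov(np.array(bs).T))   # empirical Var(b)
print(sigma**2 * XtX_inv)       # theoretical σ² (X^T X)^{-1}
```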

2. Given an n × n matrix X, tr(X) = Σ_{i=1}^n Xii.

(a) Show tr(cX) = c tr(X).

    tr(cX) = Σ_{i=1}^n cXii = c Σ_{i=1}^n Xii = c tr(X)

(b) Show tr(X + Y) = tr(X) + tr(Y).

    tr(X + Y) = Σ_{i=1}^n [Xii + Yii] = Σ_{i=1}^n Xii + Σ_{i=1}^n Yii = tr(X) + tr(Y)

(c) Show tr(XY) = tr(YX).

    tr(XY) = Σ_{i=1}^n Σ_{j=1}^n Xij Yji = Σ_{j=1}^n Σ_{i=1}^n Yji Xij = tr(YX)
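
These identities are easy to confirm numerically on random matrices, for example:

```python
import numpy as np

# Sketch check of the trace identities on random matrices.
rng = np.random.default_rng(7)
X = rng.normal(size=(4, 4))
Y = rng.normal(size=(4, 4))
c = 2.5

print(np.trace(c * X), c * np.trace(X))            # tr(cX) = c tr(X)
print(np.trace(X + Y), np.trace(X) + np.trace(Y))  # tr(X + Y) = tr(X) + tr(Y)
print(np.trace(X @ Y), np.trace(Y @ X))            # tr(XY) = tr(YX)
```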

3. Prove the formula E(y^T A y) = tr(AV) + μ^T A μ, where V := Var(y) and μ := E(y).

    E(y^T A y) = Σ_{i=1}^n Σ_{j=1}^n E[yi Aij yj]
               = Σ_{i=1}^n Σ_{j=1}^n Aij [Cov(yi, yj) + E(yi) E(yj)]
               = Σ_{i=1}^n Σ_{j=1}^n Aij Vji + Σ_{i=1}^n Σ_{j=1}^n E(yi) Aij E(yj)
               = tr(AV) + μ^T A μ

where in the second equality, we have used the fact that

    Cov(yi, yj) = E(yi yj) − E(yi) E(yj)
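
A Monte Carlo sketch of the quadratic-form identity, using an arbitrary mean vector, covariance matrix, and matrix A:

```python
import numpy as np

# Sketch check of E(y^T A y) = tr(A V) + μ^T A μ.
rng = np.random.default_rng(8)
mu = np.array([1.0, -2.0, 0.5])
L = rng.normal(size=(3, 3))
V = L @ L.T                         # arbitrary positive semi-definite covariance
A = rng.normal(size=(3, 3))

ys = rng.multivariate_normal(mu, V, size=500_000)
lhs = np.mean(np.einsum('ni,ij,nj->n', ys, A, ys))   # Monte Carlo estimate of E(y^T A y)
rhs = np.trace(A @ V) + mu @ A @ mu
print(lhs, rhs)   # agree up to Monte Carlo error
```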

4. To find ŷ, we need to solve the normal equations X^T X b = X^T y for the design matrix X and response vector y given in the problem, and then set ŷ = Xb. Please enjoy yourself by doing row reduction.


5. Suppose v1, . . . , vk is an orthogonal basis of V. Since ŷ is assumed to be in V, then

    ŷ = Σ_{i=1}^k ci vi

for some ci ∈ R. It is then sufficient to find the ci. To do that, we multiply both sides by vj^T and obtain

    vj^T ŷ = Σ_{i=1}^k ci vj^T vi = cj ‖vj‖²,

since vj^T vi = 0 if i ≠ j.
6. To show that c = (c1, . . . , ck), where ci = y^T xi/‖xi‖², solves the normal equations,
it is sufficient to notice that since X = (x1, . . . , xk) is an orthogonal basis for V, then

    X^T X = diag(‖x1‖², . . . , ‖xk‖²)

and by substituting c = (c1, . . . , ck) into the normal equations X^T X c = X^T y, we see that c satisfies them.
7. The projection of y onto V is given by

    ŷ = Σ_{i=1}^k (y^T xi/‖xi‖²) xi
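
A small NumPy sketch of this projection formula, assuming a hand-picked orthogonal pair x1, x2 spanning V:

```python
import numpy as np

# Project y onto V = span{x1, x2} using the formula above; x1 and x2 are orthogonal.
x1 = np.array([1.0, 1.0, 0.0])
x2 = np.array([1.0, -1.0, 0.0])
y = np.array([3.0, 1.0, 4.0])

y_hat = sum((y @ xi) / (xi @ xi) * xi for xi in (x1, x2))
print(y_hat)                          # [3. 1. 0.], the component of y in V
residual = y - y_hat
print(residual @ x1, residual @ x2)   # both ~0: the residual is orthogonal to V
```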

Week 6 - Solutions

1. For vectors x and y, by using the fact that x^T y = y^T x, we have

    ‖x − y‖² + ‖x + y‖² = (x − y)^T (x − y) + (x + y)^T (x + y)
                        = x^T x − 2x^T y + y^T y + x^T x + 2x^T y + y^T y
                        = 2‖x‖² + 2‖y‖²

2. Given the vectors y and x, add and subtract ŷ to y − cx:

    y − cx = y − cx − ŷ + ŷ
           = (y − ŷ) + (ŷ − cx).

The vector y − ŷ is perpendicular to ŷ − cx = (b − c)x. Therefore, from Pythagoras'
theorem,

    ‖y − cx‖² = ‖y − ŷ‖² + ‖ŷ − cx‖²
              ≥ ‖y − ŷ‖²

This shows that the distance from y to ŷ is the shortest among the distances from y to vectors of the form cx for c ∈ R.
3. Expand ‖y − cx‖² = (y − cx)^T (y − cx) to obtain

    ‖y − cx‖² = ‖y‖² − 2c y^T x + c² ‖x‖²,

which is a parabolic equation in c, and from the previous question we know that its minimum value is ‖y − ŷ‖² ≥ 0, so the quadratic is nonnegative for every c. Therefore the discriminant is less than or equal to zero. That is,

    4|y^T x|² − 4‖x‖²‖y‖² ≤ 0,   which gives   |y^T x|² ≤ ‖x‖²‖y‖²

4. (a) Compute the moment generating function of a χ²_k random variable. By making the
substitution (1/2 − t)x = y (valid for t < 1/2),

    M(t) = ∫_0^∞ e^{tx} (1/(2^{k/2} Γ(k/2))) x^{k/2 − 1} e^{−x/2} dx
         = (1/(2^{k/2} Γ(k/2))) ∫_0^∞ x^{k/2 − 1} e^{−(1/2 − t)x} dx
         = (1/(2^{k/2} Γ(k/2))) ∫_0^∞ ((1/2 − t)^{−1} y)^{k/2 − 1} e^{−y} (1/2 − t)^{−1} dy
         = ((1/2 − t)^{−k/2}/(2^{k/2} Γ(k/2))) ∫_0^∞ y^{k/2 − 1} e^{−y} dy
         = (1/2 − t)^{−k/2} 2^{−k/2}
         = (1 − 2t)^{−k/2}

To compute the expectation,

    M′(t)|_{t=0} = d/dt (1 − 2t)^{−k/2} |_{t=0} = k(1 − 2t)^{−k/2 − 1} |_{t=0} = k
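
The MGF can be checked by simulation for values of t below 1/2 (the values of k and t below are arbitrary):

```python
import numpy as np

# Sketch check: E[e^{tX}] for X ~ χ²_k matches (1 - 2t)^{-k/2}.
rng = np.random.default_rng(9)
k = 4
X = rng.chisquare(df=k, size=2_000_000)

for t in (-0.5, 0.1, 0.2):
    mc = np.mean(np.exp(t * X))          # Monte Carlo estimate of the MGF at t
    exact = (1 - 2 * t) ** (-k / 2)
    print(t, round(float(mc), 4), round(exact, 4))
```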

(b) Recall that SSreg = b1² Sxx and b1 ~ N(β1, σ²/Sxx). Therefore, under H0: β1 = 0, we have

    (1/σ²) b1² Sxx ~ χ²_1.

Therefore

    E((1/σ²) SSreg) = E((1/σ²) b1² Sxx) = 1

(c) Computing E(SSreg) in general,

    E(SSreg) = E(b1² Sxx)
             = Sxx E(b1²)
             = (Var(b1) + E(b1)²) Sxx
             = σ² Sxx (1/Sxx) + β1² Sxx
             = σ² + β1² Sxx

We can see that E(SSreg) under the null is smaller, since β1² Sxx ≥ 0.


(d) Recall that (1/σ²) SSres = (n − p) σ̂²/σ² ~ χ²_{n−p}. In the simple linear regression model p = 2.
Therefore

    E(SSres) = σ² (n − 2)

Using the fundamental identity, we have shown that

    E(SStotal) = E(SSreg) + E(SSres)
               = σ² (n − 2) + σ² + β1² Sxx
               = σ² (n − 1) + β1² Sxx

Under the null β1 = 0, E(SStotal) = σ² (n − 1).


5. Properties of e.
(a) In the simple linear regression model,

    E(ei) = E(yi − ŷi) = E(β0 + β1 xi − b0 − b1 xi)

which is equal to zero, since E(b0) = β0 and E(b1) = β1.

Alternatively, to show that E(e) = E(y − Xb) = 0, it is enough to use the fact
that b is an unbiased estimator of β and write

    E(y − Xb) = Xβ − X E(b) = Xβ − Xβ = 0

(b) ŷ = Xb = X(X^T X)^{−1} X^T y = Hy.

(c) The matrix H is an n × n matrix. The matrix H is symmetric, since

    H^T = (X(X^T X)^{−1} X^T)^T = (X^T)^T ((X^T X)^{−1})^T X^T = X(X^T X)^{−1} X^T = H,

where we used that ((X^T X)^{−1})^T = (X^T X)^{−1} because X^T X is symmetric.
(d) Computing the variance of e = (I − H)y,

    Var(e) = Var((I − H)y)
           = (I − H) Var(y) (I − H)^T
           = σ² (I − H)(I − H)^T
           = σ² (I − H)

where the last equality holds since I − H is symmetric and idempotent.



(e) From the above, we can write that

    Var(ei) = σ² (1 − Hii)
    Cov(ei, ej) = −σ² Hij   for i ≠ j
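
A simulation sketch of parts (b) to (e): compute H for an arbitrary design, form the residuals e = (I − H)y over repeated samples, and compare their empirical covariance with σ²(I − H).

```python
import numpy as np

# Sketch check of Var(e) = σ²(I - H) by simulation.
rng = np.random.default_rng(10)
n, sigma = 15, 1.0
X = np.column_stack([np.ones(n), rng.uniform(0, 1, n)])   # arbitrary design
H = X @ np.linalg.inv(X.T @ X) @ X.T                      # hat matrix
beta = np.array([1.0, 2.0])

E = []
for _ in range(20_000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    E.append(y - H @ y)                   # residual vector e = (I - H) y

emp = np.cov(np.array(E).T)               # empirical covariance of the residuals
theo = sigma**2 * (np.eye(n) - H)         # σ²(I - H)
print(np.abs(emp - theo).max())           # small, up to Monte Carlo error
```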
