Stock Watson 3U ExerciseSolutions Chapter5 Instructors
Introduction to Econometrics (3rd Updated Edition)
by James H. Stock and Mark W. Watson

Solutions to End-of-Chapter Exercises: Chapter 5*

(This version August 17, 2014)
*Limited distribution: For Instructors Only. Answers to all odd-numbered questions are provided to students on the textbook website. If you find errors in the solutions, please pass them along to us at [email protected].
5.1 (a) The 95% confidence interval for β1 is {−5.82 ± 1.96 × 2.21}, that is
−10.152 ≤ β1 ≤ −1.4884.
(b) The t-statistic is

$t^{act} = \frac{\hat\beta_1 - 0}{SE(\hat\beta_1)} = \frac{-5.82}{2.21} = -2.6335.$

The p-value is less than 0.01, so we can reject the null hypothesis at the 5% significance level, and also at the 1% significance level.

(c) The t-statistic is

$t^{act} = \frac{\hat\beta_1 - (-5.6)}{SE(\hat\beta_1)} = \frac{-0.22}{2.21} = -0.10.$

The p-value is larger than 0.10, so we cannot reject the null hypothesis at the 10%, 5%, or 1% significance level. Because β1 = −5.6 is not rejected at the 5% significance level, this value is contained in the 95% confidence interval from part (a).
(d) The 99% confidence interval for β0 is {520.4 ± 2.58 × 20.4}, that is,
467.7 ≤ β0 ≤ 573.0.
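These calculations can be verified with a short Python sketch. It uses only the estimates quoted in the solution (β̂1 = −5.82 with SE 2.21; β̂0 = 520.4 with SE 20.4) and the standard normal critical values 1.96 and 2.58:

```python
# Numerical check of the 5.1 calculations, using the estimates quoted in the
# solution: beta1_hat = -5.82, SE = 2.21; beta0_hat = 520.4, SE = 20.4.
beta1_hat, se1 = -5.82, 2.21
beta0_hat, se0 = 520.4, 20.4

# (a) 95% confidence interval for beta1
ci95 = (beta1_hat - 1.96 * se1, beta1_hat + 1.96 * se1)

# (b) t-statistic for H0: beta1 = 0
t_b = (beta1_hat - 0) / se1

# (c) t-statistic for H0: beta1 = -5.6
t_c = (beta1_hat - (-5.6)) / se1

# (d) 99% confidence interval for beta0
ci99 = (beta0_hat - 2.58 * se0, beta0_hat + 2.58 * se0)

print(ci95, t_b, t_c, ci99)
```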
5.2 (b) The hypothesis test for the gender gap is H0 : β1 = 0 vs. H1 : β1 ≠ 0. With the t-statistic

$t^{act} = \frac{\hat\beta_1 - 0}{SE(\hat\beta_1)} = \frac{2.12}{0.36} = 5.89,$

the p-value for the test is 2Φ(−5.89) ≈ 0.0000, so the null hypothesis is rejected at the 5% (and 1%) significance level.

(c) The 95% confidence interval for the gender gap β1 is {2.12 ± 1.96 × 0.36}, that is, 1.41 ≤ β1 ≤ 2.83.

(d) The sample average wage of women is β̂0 = $12.52/hour. The sample average wage of men is β̂0 + β̂1 = $12.52 + $2.12 = $14.64/hour.

(e) The binary variable regression model relating wages to gender can be written as either Wage = β0 + β1Male + ui, or Wage = γ0 + γ1Female + vi. In the first regression equation, Male equals 1 for men and 0 for women; β0 is the population mean of wages for women and β0 + β1 is the population mean of wages for men. In the second regression equation, Female equals 1 for women and 0 for men; γ0 is the population mean of wages for men and γ0 + γ1 is the population mean of wages for women. Comparing the two parameterizations gives
$\gamma_0 = \beta_0 + \beta_1, \qquad \gamma_0 + \gamma_1 = \beta_0,$

so that

$\hat\gamma_0 = \hat\beta_0 + \hat\beta_1 = 14.64, \qquad \hat\gamma_1 = \hat\beta_0 - \hat\gamma_0 = -\hat\beta_1 = -2.12.$

Because the two regressions have the same fitted values, ûi = v̂i. Thus the sum of squared residuals, $SSR = \sum_{i=1}^n \hat u_i^2$, is the same under the two regressions, and the R² and SER are likewise unchanged. In particular,

$\widehat{Wage} = 14.64 - 2.12\,Female, \qquad R^2 = 0.06, \qquad SER = 4.2.$
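The arithmetic above can be checked with a short Python sketch using only the estimates quoted in the solution:

```python
# Arithmetic check of the 5.2 calculations, using only the estimates quoted
# in the solution: beta0_hat = 12.52, beta1_hat = 2.12, SE(beta1_hat) = 0.36.
beta0_hat, beta1_hat, se1 = 12.52, 2.12, 0.36

t_b = beta1_hat / se1                                  # part (b) t-statistic
ci = (beta1_hat - 1.96 * se1, beta1_hat + 1.96 * se1)  # part (c) 95% CI

# Part (e): the Female-dummy parameterization
gamma0_hat = beta0_hat + beta1_hat   # sample mean wage of men
gamma1_hat = -beta1_hat              # gender gap with the sign reversed

print(t_b, ci, gamma0_hat, gamma1_hat)
```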
(c) The increase in wages for college education is β1 × 4. Thus, the counselor’s assertion is that β1 = 10/4 = 2.50. The t-statistic for this null hypothesis is

$t^{act} = \frac{1.93 - 2.50}{0.08} = -7.13,$

which has a p-value of 0.00. Thus, the counselor’s assertion can be rejected at the 1% significance level.
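The t-statistic and its (two-sided, standard normal) p-value can be reproduced with a short sketch; the normal CDF is computed from the error function rather than a statistics library:

```python
import math

# Check of the counselor calculation: t = (1.93 - 2.50) / 0.08 and its
# two-sided p-value from the standard normal distribution.
t = (1.93 - 2.50) / 0.08

def norm_cdf(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_value = 2.0 * norm_cdf(-abs(t))
print(t, p_value)
```

The p-value is on the order of 1e−12, i.e. 0.00 to the precision reported in the solution.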
5.5 (a) The estimated gain from being in a small class is 13.9 points. This is equal to
approximately 1/5 of the standard deviation in test scores, a moderate increase.
5.6. (a) The question asks whether the variability in test scores in large classes is the
same as the variability in small classes. It is hard to say. On the one hand, teachers
in small classes might be able to spend more time bringing all of the students along,
reducing the poor performance of particularly unprepared students. On the other
hand, most of the variability in test scores might be beyond the control of the
teacher.
(c) Yes. If Y and X are independent, then β1 = 0; but this null hypothesis was rejected
at the 5% level in part (a).
5.8. (a) 43.2 ± 2.05 × 10.2 or 43.2 ± 20.91, where 2.05 is the 5% two-sided critical value from the $t_{28}$ distribution.

(c) The one-sided 5% critical value is 1.70; $t^{act}$ is less than this critical value, so the null hypothesis is not rejected at the 5% level.
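The interval in part (a) is simple arithmetic on the quoted numbers, sketched below (2.05 is taken from the solution rather than recomputed from the t distribution):

```python
# 5.8(a) check: 43.2 +/- 2.05 * 10.2, where 2.05 is the two-sided 5% critical
# value from the t(28) distribution quoted in the solution.
estimate, se, crit = 43.2, 10.2, 2.05
half_width = crit * se
ci = (estimate - half_width, estimate + half_width)
print(half_width, ci)
```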
5.9 (a) $\hat\beta = \frac{1}{\bar X}\cdot\frac{1}{n}\,(Y_1 + Y_2 + \cdots + Y_n)$, so that it is a linear function of Y1, Y2, …, Yn. Moreover,

$E(\hat\beta \mid X_1,\ldots,X_n) = \frac{1}{\bar X}\cdot\frac{1}{n}\,E\big((Y_1 + Y_2 + \cdots + Y_n)\mid X_1,\ldots,X_n\big) = \frac{1}{\bar X}\cdot\frac{1}{n}\,\beta_1 (X_1 + \cdots + X_n) = \beta_1,$

so $\hat\beta$ is conditionally unbiased.
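The estimator can be illustrated on hypothetical data (the X and Y values below are made up): Ȳ/X̄ is the same number as the linear combination (1/(nX̄))·ΣYi.

```python
# Illustration of the 5.9 estimator beta_hat = Ybar / Xbar on hypothetical
# data: it equals (1/(n*Xbar)) * sum(Y), a linear function of the Y's.
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.1, 3.9, 6.2, 7.8]          # hypothetical observations
n = len(X)
Xbar = sum(X) / n
Ybar = sum(Y) / n
beta_hat = Ybar / Xbar
# The same number written as a linear combination of Y1,...,Yn:
beta_hat_linear = sum(Y) / (n * Xbar)
print(beta_hat, beta_hat_linear)
```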
5.10. Let n0 denote the number of observations with Xi = 0 and n1 the number of observations with Xi = 1, and note that

$\sum_{i=1}^n X_i = n_1, \qquad \bar X = n_1/n, \qquad \frac{1}{n_1}\sum_{i=1}^n X_i Y_i = \bar Y_1,$

$\sum_{i=1}^n (X_i - \bar X)^2 = \sum_{i=1}^n X_i^2 - n\bar X^2 = n_1 - \frac{n_1^2}{n} = n_1\Big(1 - \frac{n_1}{n}\Big) = \frac{n_1 n_0}{n}, \qquad n_1\bar Y_1 + n_0\bar Y_0 = \sum_{i=1}^n Y_i,$

so

$\hat\beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^n (X_i - \bar X)^2} = \frac{\sum_{i=1}^n X_i (Y_i - \bar Y)}{\sum_{i=1}^n (X_i - \bar X)^2} = \frac{\sum_{i=1}^n X_i Y_i - n_1\bar Y}{n_1 n_0/n} = \frac{n}{n_0}\,(\bar Y_1 - \bar Y) = \frac{n}{n_0}\Big(\bar Y_1 - \frac{n_1}{n}\bar Y_1 - \frac{n_0}{n}\bar Y_0\Big) = \bar Y_1 - \bar Y_0,$

and

$\hat\beta_0 = \bar Y - \hat\beta_1\bar X = \Big(\frac{n_0}{n}\bar Y_0 + \frac{n_1}{n}\bar Y_1\Big) - (\bar Y_1 - \bar Y_0)\,\frac{n_1}{n} = \frac{n_1 + n_0}{n}\,\bar Y_0 = \bar Y_0.$
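The algebra above can be confirmed numerically on hypothetical binary-regressor data (the values below are made up): the OLS slope equals Ȳ1 − Ȳ0 and the intercept equals Ȳ0.

```python
# Numerical check of the 5.10 result on hypothetical data with a binary
# regressor: the OLS slope equals Ybar1 - Ybar0, the intercept equals Ybar0.
X = [0, 0, 0, 1, 1]
Y = [1.0, 2.0, 3.0, 5.0, 7.0]
n = len(X)
Xbar = sum(X) / n
Ybar = sum(Y) / n

# Textbook OLS formulas
num = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y))
den = sum((x - Xbar) ** 2 for x in X)
beta1_hat = num / den
beta0_hat = Ybar - beta1_hat * Xbar

# Group means
Ybar0 = sum(y for x, y in zip(X, Y) if x == 0) / X.count(0)
Ybar1 = sum(y for x, y in zip(X, Y) if x == 1) / X.count(1)
print(beta1_hat, Ybar1 - Ybar0, beta0_hat, Ybar0)
```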
5.11. Using the results from Exercise 5.10, β̂0 = Ȳm and β̂1 = Ȳw − Ȳm.

5.12. (a) From Chapter 3,

$\sigma_{\hat\beta_0}^2 = \frac{\mathrm{var}(H_i u_i)}{n\big[E(H_i^2)\big]^2}, \qquad \text{where } H_i = 1 - \frac{\mu_X}{E(X_i^2)}\,X_i.$
Using the facts that E(ui | Xi) = 0 and var(ui | Xi) = σu² (homoskedasticity), we have

$E(H_i u_i) = E\Big(u_i - \frac{\mu_X}{E(X_i^2)}\,X_i u_i\Big) = E(u_i) - \frac{\mu_X}{E(X_i^2)}\,E\big[X_i\,E(u_i \mid X_i)\big] = 0 - \frac{\mu_X}{E(X_i^2)}\times 0 = 0,$

and

$E\big[(H_i u_i)^2\big] = E\bigg[\Big(u_i - \frac{\mu_X}{E(X_i^2)}\,X_i u_i\Big)^2\bigg] = E\bigg[u_i^2 - 2\,\frac{\mu_X}{E(X_i^2)}\,X_i u_i^2 + \Big(\frac{\mu_X}{E(X_i^2)}\Big)^2 X_i^2 u_i^2\bigg]$

$= E(u_i^2) - 2\,\frac{\mu_X}{E(X_i^2)}\,E\big[X_i\,E(u_i^2 \mid X_i)\big] + \Big(\frac{\mu_X}{E(X_i^2)}\Big)^2 E\big[X_i^2\,E(u_i^2 \mid X_i)\big]$

$= \sigma_u^2 - 2\,\frac{\mu_X}{E(X_i^2)}\,\mu_X\,\sigma_u^2 + \Big(\frac{\mu_X}{E(X_i^2)}\Big)^2 E(X_i^2)\,\sigma_u^2 = \Big(1 - \frac{\mu_X^2}{E(X_i^2)}\Big)\sigma_u^2.$

Because E(Hi ui) = 0, it follows that

$\mathrm{var}(H_i u_i) = E\big[(H_i u_i)^2\big] = \Big(1 - \frac{\mu_X^2}{E(X_i^2)}\Big)\sigma_u^2.$
Also,

$E(H_i^2) = E\bigg[\Big(1 - \frac{\mu_X}{E(X_i^2)}\,X_i\Big)^2\bigg] = E\bigg[1 - 2\,\frac{\mu_X}{E(X_i^2)}\,X_i + \Big(\frac{\mu_X}{E(X_i^2)}\Big)^2 X_i^2\bigg] = 1 - 2\,\frac{\mu_X^2}{E(X_i^2)} + \frac{\mu_X^2}{E(X_i^2)} = 1 - \frac{\mu_X^2}{E(X_i^2)}.$

Thus

$\sigma_{\hat\beta_0}^2 = \frac{\mathrm{var}(H_i u_i)}{n\big[E(H_i^2)\big]^2} = \frac{\big(1 - \mu_X^2/E(X_i^2)\big)\,\sigma_u^2}{n\big(1 - \mu_X^2/E(X_i^2)\big)^2} = \frac{\sigma_u^2}{n\big(1 - \mu_X^2/E(X_i^2)\big)} = \frac{E(X_i^2)\,\sigma_u^2}{n\big[E(X_i^2) - \mu_X^2\big]} = \frac{E(X_i^2)\,\sigma_u^2}{n\,\sigma_X^2}.$
(b) Yes, this follows from the assumptions in KC 4.3 and conditional homoskedasticity.

(c) They would be unchanged for the reasons specified in the answers to those questions.

(d) (a) is unchanged; (b) is no longer true because the errors are not conditionally homoskedastic.
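The variance formula derived in part (a) can be checked by simulation. The sketch below uses assumed (hypothetical) parameter values, not numbers from any exercise: X ~ N(2, 1) and homoskedastic u ~ N(0, 1) with n = 100, for which the formula gives E(X²)σu²/(nσX²) = 5/100 = 0.05; the Monte Carlo variance of β̂0 should land close to that (the match is approximate, since the formula is a large-sample one).

```python
import random

# Monte Carlo sketch of var(beta0_hat) ~ E(X^2) * sigma_u^2 / (n * sigma_X^2)
# under homoskedasticity; all parameter values below are hypothetical.
random.seed(0)
n, reps = 100, 2000
mu_X, sigma_X, sigma_u = 2.0, 1.0, 1.0
beta0, beta1 = 1.0, 0.5

b0_hats = []
for _ in range(reps):
    X = [random.gauss(mu_X, sigma_X) for _ in range(n)]
    Y = [beta0 + beta1 * x + random.gauss(0.0, sigma_u) for x in X]
    Xbar, Ybar = sum(X) / n, sum(Y) / n
    b1 = (sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y))
          / sum((x - Xbar) ** 2 for x in X))
    b0_hats.append(Ybar - b1 * Xbar)

mean_b0 = sum(b0_hats) / reps
mc_var = sum((b - mean_b0) ** 2 for b in b0_hats) / reps

# Theoretical value: E(X^2) = mu_X^2 + sigma_X^2
theory_var = (mu_X ** 2 + sigma_X ** 2) * sigma_u ** 2 / (n * sigma_X ** 2)
print(mc_var, theory_var)
```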
(b) $E(\hat\beta \mid X_1,\ldots,X_n) = \beta + \frac{\sum_{i=1}^n X_i\,E(u_i \mid X_1,\ldots,X_n)}{\sum_{i=1}^n X_i^2} = \beta$, since $E(u_i \mid X_1,\ldots,X_n) = 0$.

(c) $\mathrm{var}(\hat\beta \mid X_1,\ldots,X_n) = \frac{\sum_{i=1}^n X_i^2\,\mathrm{var}(u_i \mid X_1,\ldots,X_n)}{\big[\sum_{i=1}^n X_i^2\big]^2} = \frac{\sigma^2}{\sum_{i=1}^n X_i^2}.$
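Parts (b) and (c) concern the estimator β̂ = ΣXiYi/ΣXi² (as implied by the conditional expectation shown). On hypothetical noise-free data it recovers β exactly, consistent with conditional unbiasedness:

```python
# Check of the algebra above on hypothetical noise-free data: with ui = 0,
# beta_hat = sum(Xi*Yi) / sum(Xi^2) recovers beta exactly.
beta = 1.5
X = [1.0, 2.0, 3.0]
Y = [beta * x for x in X]          # Yi = beta * Xi with ui = 0
beta_hat = sum(x * y for x, y in zip(X, Y)) / sum(x * x for x in X)
print(beta_hat)
```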
5.15. Because the samples are independent, β̂m,1 and β̂w,1 are independent. Thus var(β̂m,1 − β̂w,1) = var(β̂m,1) + var(β̂w,1). var(β̂m,1) is consistently estimated by [SE(β̂m,1)]² and var(β̂w,1) is consistently estimated by [SE(β̂w,1)]², so that var(β̂m,1 − β̂w,1) is consistently estimated by [SE(β̂m,1)]² + [SE(β̂w,1)]², and the result follows by noting that the SE is the square root of the estimated variance.
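The standard-error combination can be sketched with hypothetical numbers (the values 0.36 and 0.30 below are made up for illustration, not taken from any exercise):

```python
import math

# For independent samples, SE(bm - bw) = sqrt(SE(bm)^2 + SE(bw)^2).
se_m, se_w = 0.36, 0.30            # hypothetical standard errors
se_diff = math.sqrt(se_m ** 2 + se_w ** 2)
print(se_diff)
```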