Problem Set 4 With Solutions

Applied Statistics and Econometrics

Uploaded by

Giorgia Fantini

Solution to Problem Set 4

Question 1

(a) $u_i$ represents factors other than nap time that influence the
student's performance on the exam, including amount of time studying,
aptitude for the material, and so forth. Some students will
have studied more than average, others less; some students will
have higher than average aptitude for the subject, others lower,
and so forth.

(b) Because of random assignment, $u$ is independent of $X$. Since $u$
represents deviations from average, $E(u) = 0$. Because $u$ and $X$
are independent, $E(u|X) = E(u) = 0$. The estimated coefficient
is therefore an unbiased estimator of the causal effect of nap time
on test scores.

(c) Assumption #2 is satisfied if this year's class is typical of other
classes, that is, students in this year's class can be viewed as
random draws from the population of students that enroll in the
class. Assumption #3 is satisfied because $0 \le Y_i \le 30$ and $X_i$
can take on only two values (60 and 75).

i.

$\hat{Y}_i = 55 + 0.17 \times 60 = 65.2$

$\hat{Y}_i = 55 + 0.17 \times 75 = 67.75$

$\hat{Y}_i = 55 + 0.17 \times 90 = 70.3$

ii.

$0.17 \times 5 = 0.85$
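The arithmetic in (i) and (ii) can be checked with a short script. This is only a sketch: the function name `predict` is illustrative, and treating X as nap time measured in minutes is an assumption, since the solution does not state the units.

```python
# Fitted line used in the solution: Y_hat = 55 + 0.17 * X.
# Assumption: X is nap time in minutes (the solution does not state units).
INTERCEPT = 55.0
SLOPE = 0.17

def predict(nap_time):
    """Return the fitted test score for a given nap time."""
    return INTERCEPT + SLOPE * nap_time

# Part (i): fitted scores at the three nap times.
predictions = {x: round(predict(x), 2) for x in (60, 75, 90)}

# Part (ii): predicted change in score when nap time rises by 5.
gain_5 = round(SLOPE * 5, 2)

print(predictions)
print(gain_5)
```

Only the slope matters for part (ii), because the intercept cancels when two fitted values are differenced.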

Question 2

We answer both parts together. Suppose $X^{*}$ denotes the new $X$, which is
given by

$$X^{*} = cX$$

where $c$ is a number. For example, in part (a) $c = 100$, and in part (b)
$c = 1/1000$.
The OLS estimator obtained from the regression of $Y$ on $X^{*}$ is

$$\hat{\beta}_1^{*} = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(X_i^{*} - \bar{X}^{*})}{\sum_{i=1}^{n} (X_i^{*} - \bar{X}^{*})^2} = \frac{c \sum_{i=1}^{n} (Y_i - \bar{Y})(X_i - \bar{X})}{c^2 \sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{1}{c} \hat{\beta}_1$$

because $\bar{X}^{*} = c\bar{X}$. Here $\hat{\beta}_1$ is the OLS estimator obtained
from the regression of $Y$ on $X$ (the "original" estimator).

Thus, when $c = 100$, the new OLS estimator is 1/100 times the original
OLS estimator, and when $c = 1/1000$, the new OLS estimator is 1000 times
the original OLS estimator.
The estimator of the intercept is not affected:

$$\hat{\beta}_0^{*} = \bar{Y} - \hat{\beta}_1^{*} \bar{X}^{*} = \bar{Y} - \frac{1}{c} \hat{\beta}_1 \, c\bar{X} = \bar{Y} - \hat{\beta}_1 \bar{X} = \hat{\beta}_0$$
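The rescaling result can also be checked numerically. The sketch below uses made-up data (not from the problem set) and a hand-written helper `ols`, both of which are assumptions for illustration only.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up illustrative data (an assumption, not from the problem set).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols(x, y)
for c in (100, 1 / 1000):
    b0_new, b1_new = ols([c * xi for xi in x], y)
    # The slope ratio should equal 1/c; the intercept difference should be ~0.
    print(c, b1_new / b1, b0_new - b0)
```

Up to floating-point rounding, the printed slope ratio is 1/c and the intercept difference is zero for both values of c.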

Question 3

(a) The estimated gender gap equals $1.78/hour.

(b) The hypothesis test for the gender gap is $H_0: \beta_1 = 0$ vs.
$H_1: \beta_1 \neq 0$, with a t-statistic

$$t^{act} = \frac{\hat{\beta}_1 - 0}{SE(\hat{\beta}_1)} = \frac{1.78}{0.29} = 6.1379$$

$$p\text{-value} = 2\Phi(-|t^{act}|) = 2\Phi(-6.14) = 2 \times 0.0000 = 0.000$$

(to four decimal places). The p-value is less than 0.01, so we can
reject the null hypothesis that there is no gender gap at a 1%
significance level.

(c) The 95% confidence interval for the gender gap is $\{1.78 \pm 1.96 \times 0.29\}$,
that is, $1.21 \le \beta_1 \le 2.35$.
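The t-statistic, p-value and confidence interval in (b) and (c) can be reproduced as follows. This is a sketch under the large-sample normal approximation used in the solution; the helper `normal_cdf` is an illustrative name, built on the standard library's `math.erfc`.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, Phi(z), via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

beta1_hat = 1.78  # estimated gender gap ($/hour)
se_beta1 = 0.29   # standard error of the estimate

# Two-sided test of H0: beta_1 = 0.
t_act = (beta1_hat - 0) / se_beta1
p_value = 2 * normal_cdf(-abs(t_act))

# 95% confidence interval using the 1.96 critical value.
ci_low = beta1_hat - 1.96 * se_beta1
ci_high = beta1_hat + 1.96 * se_beta1

print(round(t_act, 4), round(ci_low, 2), round(ci_high, 2))
print(p_value < 0.01)
```

The p-value is far below 0.01, which is why it appears as 0.000 when rounded to four decimal places.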

(d) The sample average wage of women is $\hat{\beta}_0 = 10.73$. The sample
average wage of men is $\hat{\beta}_0 + \hat{\beta}_1 = 10.73 + 1.78 = 12.51$.

(e) The binary variable regression model relating wages to gender can
be written as either $Wage = \beta_0 + \beta_1 Male + u$, or as
$Wage = \gamma_0 + \gamma_1 Female + v$. In the first regression equation, Male equals
1 for men and 0 for women; $\beta_0$ is the population mean of wages
for women and $\beta_0 + \beta_1$ is the population mean of wages for men.
In the second regression equation, Female equals 1 for women
and 0 for men; $\gamma_0$ is the population mean of wages for men and
$\gamma_0 + \gamma_1$ is the population mean of wages for women. This implies
the following relationship for the coefficients in the two regression
equations:

$$\gamma_0 = \beta_0 + \beta_1$$

$$\gamma_0 + \gamma_1 = \beta_0$$

Given the coefficient estimates $\hat{\beta}_0$ and $\hat{\beta}_1$, we have

$$\hat{\gamma}_0 = \hat{\beta}_0 + \hat{\beta}_1 = 10.73 + 1.78 = 12.51$$

$$\hat{\gamma}_1 = \hat{\beta}_0 - \hat{\gamma}_0 = 10.73 - 12.51 = -1.78$$

Due to the relationship among coefficient estimates, for each
individual observation, the OLS residual is the same under the two
regression equations: $\hat{u}_i = \hat{v}_i$. Thus the sum of squared residuals
is the same under the two regressions. This implies that both $R^2$
and SER are unchanged. In summary, in regressing Wage on
Female, we will get

$$\widehat{Wage} = 12.51 - 1.78 \, Female, \quad R^2 = 0.09, \quad SER = 3.8$$
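The link between the two regressions in (e) can be illustrated numerically. The sketch below uses made-up wages (an assumption, not the data behind the reported estimates) and a hand-written helper `ols`. It checks that the intercept of the Female regression equals the sum of the Male-regression coefficients, that the slope flips sign, and that the residuals coincide.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    return y_bar - slope * x_bar, slope

# Made-up wages for illustration only (not the data behind the estimates).
wage = [9.5, 11.0, 12.0, 10.5, 13.5, 12.5, 14.0, 11.5]
male = [0, 0, 0, 0, 1, 1, 1, 1]
female = [1 - m for m in male]

b0, b1 = ols(male, wage)    # Wage on Male
g0, g1 = ols(female, wage)  # Wage on Female
resid_u = [w - (b0 + b1 * m) for w, m in zip(wage, male)]
resid_v = [w - (g0 + g1 * f) for w, f in zip(wage, female)]

print(abs(g0 - (b0 + b1)) < 1e-9, abs(g1 + b1) < 1e-9)
print(all(abs(u - v) < 1e-9 for u, v in zip(resid_u, resid_v)))

# The same identities applied to the estimates 10.73 and 1.78:
gamma0_hat = round(10.73 + 1.78, 2)
gamma1_hat = round(10.73 - gamma0_hat, 2)
print(gamma0_hat, gamma1_hat)
```

Because the residuals are identical observation by observation, the sum of squared residuals, and with it $R^2$ and SER, cannot differ between the two specifications.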

Question 4
(a) The estimated regression is

$\widehat{Y} = 512.7 + 707.7 \times Height$
      (3379.9)  (50.4)

The 95% confidence interval for the slope coefficient is $707.7 \pm 1.96 \times 50.4$, or
$608.9 \le \beta_1 \le 806.5$. This interval does not include $\beta_1 = 0$, so the estimated slope is
significantly different from 0 at the 5% level. Alternatively, the t-statistic is $707.7/50.4 \approx 14.0$,
which is greater in absolute value than the 5% critical value of 1.96. Finally,
the p-value for the t-statistic is approximately 0.000, which is smaller than 0.05.

(b) For women the estimated regression is

$\widehat{Y} = 12650 + 511.2 \times Height$
      (6299)   (97.6)

The 95% confidence interval for the slope coefficient is $511.2 \pm 1.96 \times 97.6$, or
$319.9 \le \beta_{1,Female} \le 702.5$. This interval does not include $\beta_{1,Female} = 0$, so the estimated
slope is significantly different from 0 at the 5% level.

(c) For men the estimated regression is

$\widehat{Y} = -43130 + 1306.9 \times Height$
      (6925)    (98.9)

The 95% confidence interval for the slope coefficient is $1306.9 \pm 1.96 \times 98.9$, or
$1113.1 \le \beta_{1,Male} \le 1500.6$. This interval does not include $\beta_{1,Male} = 0$, so the estimated
slope is significantly different from 0 at the 5% level.

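(d) The estimate of $\beta_{1,Male} - \beta_{1,Female}$ is $\hat{\beta}_{1,Male} - \hat{\beta}_{1,Female}$. Treating the samples
of men and women as independent, so that the two slope estimates are uncorrelated, the standard error is

$$SE(\hat{\beta}_{1,Male} - \hat{\beta}_{1,Female}) = \sqrt{var(\hat{\beta}_{1,Male}) + var(\hat{\beta}_{1,Female})} = \sqrt{SE(\hat{\beta}_{1,Male})^2 + SE(\hat{\beta}_{1,Female})^2}.$$

Using the estimated regressions in (b) and (c): $\hat{\beta}_{1,Male} - \hat{\beta}_{1,Female} = 1306.9 - 511.2 = 795.7$, and

$$SE(\hat{\beta}_{1,Male} - \hat{\beta}_{1,Female}) = \sqrt{98.9^2 + 97.6^2} = 138.9.$$

The 95% confidence interval for $\beta_{1,Male} - \beta_{1,Female}$ is $795.7 \pm 1.96 \times 138.9$, or
$523.5 \le \beta_{1,Male} - \beta_{1,Female} \le 1{,}067.8$. This interval does not include $\beta_{1,Male} - \beta_{1,Female} = 0$, so
the estimated difference in the slopes is significantly different from 0 at the 5% level.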
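The calculation in (d) can be sketched in a few lines. The variable names are illustrative. The code works from the rounded coefficients and standard errors quoted in (b) and (c), so its interval endpoints may differ from the reported ones in the last digit.

```python
import math

# Slope estimates and standard errors quoted in parts (b) and (c).
slope_male, se_male = 1306.9, 98.9
slope_female, se_female = 511.2, 97.6

diff = slope_male - slope_female

# With independent samples the variances add, so no covariance term appears.
se_diff = math.sqrt(se_male ** 2 + se_female ** 2)

t_stat = diff / se_diff
ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)

print(round(diff, 1), round(se_diff, 1))
print(abs(t_stat) > 1.96, ci[0] > 0)
```

Both checks print True: the t-statistic exceeds 1.96 in absolute value and the lower end of the interval lies above zero, consistent with a difference that is significant at the 5% level.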
