HW Week6
1 Suppose your data $\{(X_{i1}, X_{i2}, Y_i)\}_{i=1}^n$ satisfy the “long regression”
$Y_i = \alpha + \beta_1 X_{i1} + \beta_2 X_{i2} + V_i$,
where $E[V_i \mid X_{i1}, X_{i2}] = 0$, but you instead estimate the “short regression”
$Y_i = \alpha + \beta_1 X_{i1} + U_i$.
(i) Show that the OLS estimator from the short regression is generally inconsistent
for $\beta_1$.
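A small simulation can make the inconsistency concrete before attempting the proof. The sketch below is only an illustration under assumed values (standard normal regressors with correlation `rho = 0.5`, `beta1 = 1`, `beta2 = 2`, intercept 0.5); none of these numbers come from the problem, and the function name is hypothetical.

```python
import numpy as np

def short_regression_slope(n, beta1=1.0, beta2=2.0, rho=0.5, seed=0):
    """OLS slope on X1 when X2 is omitted (simulated data, illustrative values)."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n)
    # X2 is correlated with X1: Cov(X1, X2) = rho, Var(X1) = Var(X2) = 1
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    v = rng.standard_normal(n)  # E[V | X1, X2] = 0 by construction
    y = 0.5 + beta1 * x1 + beta2 * x2 + v
    X_short = np.column_stack([np.ones(n), x1])
    coef, *_ = np.linalg.lstsq(X_short, y, rcond=None)
    return coef[1]

slope = short_regression_slope(200_000)
# Probability limit under this design: beta1 + beta2 * Cov(X1, X2) / Var(X1)
plim_slope = 1.0 + 2.0 * 0.5
print(slope, plim_slope)
```

In this design the short-regression slope settles near `plim_slope` rather than near `beta1`; the gap disappears only if `beta2 = 0` or `rho = 0`.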
(b) $E[W_i U_i] = 0$;
(c) $E[W_i X_i^\top] = P$, a finite $r \times k$ matrix of rank $k$;
(d) $A_n \overset{p}{\to} A$, a finite symmetric PD $r \times r$ matrix.
Let $\hat\beta_n$ be the IV estimator based on the instrument vector $W_i$ and the weighting
matrix $A_n$.
(i) Show that $\hat\beta_n$ exists with probability approaching one as $n \to \infty$ and $\hat\beta_n \overset{p}{\to} \beta$.
(ii) Show that, under the additional assumption that $W_i (Y_i - \beta^\top X_i)$ has a finite
PD variance $D$, $\hat\beta_n$ is asymptotically normal and derive its asymptotic
variance.
(iii) Given a symmetric matrix $\hat\Delta_n$ such that $\hat\Delta_n \overset{p}{\to} D$, derive a consistent
estimator of the asymptotic variance of $\hat\beta_n$.
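As a numerical companion to parts (i) to (iii), the following sketch computes an IV estimate and a plug-in sandwich variance on simulated data. It assumes the estimator has the usual closed form $\hat\beta_n = (\hat P^\top A_n \hat P)^{-1} \hat P^\top A_n (W^\top Y/n)$ with $\hat P = W^\top X/n$, and it takes $\hat\Delta_n$ to be the sample second moment of $W_i \hat U_i$, which is one possible choice rather than the only one. The function name and the data-generating design are hypothetical.

```python
import numpy as np

def iv_with_weighting(Y, X, W, A):
    """IV estimate with weighting matrix A and a plug-in sandwich variance.

    Assumes the closed form (P'AP)^{-1} P'A (W'Y/n) with P = W'X/n.
    """
    n = len(Y)
    P_hat = W.T @ X / n                      # r x k sample analogue of P
    s_wy = W.T @ Y / n                       # r-vector
    M = P_hat.T @ A @ P_hat                  # k x k, invertible when P_hat has rank k
    beta_hat = np.linalg.solve(M, P_hat.T @ A @ s_wy)
    resid = Y - X @ beta_hat
    G = W * resid[:, None]                   # rows are W_i * residual_i
    Delta_hat = G.T @ G / n                  # one candidate estimate of D
    M_inv = np.linalg.inv(M)
    avar_hat = M_inv @ P_hat.T @ A @ Delta_hat @ A @ P_hat @ M_inv
    return beta_hat, avar_hat

# Illustrative design: one endogenous regressor, two excluded instruments
rng = np.random.default_rng(1)
n = 50_000
z = rng.standard_normal((n, 2))
e = rng.standard_normal(n)                   # error, also enters x (endogeneity)
x = z[:, 0] + z[:, 1] + 0.8 * e + rng.standard_normal(n)
Y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(n), x])         # k = 2
W = np.column_stack([np.ones(n), z])         # r = 3
beta_hat, avar_hat = iv_with_weighting(Y, X, W, np.eye(3))
std_err = np.sqrt(np.diag(avar_hat) / n)
print(beta_hat, std_err)
```

Any symmetric PD matrix can be passed as `A`; the identity is used here only to keep the example short.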
(ii) $\mathrm{AV}[\hat\beta_n] - \mathrm{AV}[\tilde\beta_n] \geq 0$, with equality if $A_n - c B_n \overset{p}{\to} 0$ for some $c > 0$.
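5 Suppose $n^{-1/2} W^\top (Y - X\beta) \overset{d}{\to} N_r(0, D)$, let $P = \operatorname{plim} W^\top X/n$, $C = \operatorname{plim} W^\top W/n$,
and consider the 2SLS estimator $\tilde\beta_n = [X^\top W (W^\top W)^{-1} W^\top X]^{-1} X^\top W (W^\top W)^{-1} W^\top Y$.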
(iii) Can you suggest a consistent estimator of the asymptotic variance of $\tilde\beta_n$ when
$D \neq \sigma^2 C$?
6 Consider the structural linear model
$Y_i = \alpha + \beta X_i + U_i, \quad i = 1, \dots, n,$
where $\sigma_{WU}$ and $\rho_{WU}$ respectively denote the covariance and the correlation between
$W_i$ and $U_i$, etc. Discuss this result.
7 Let $J_n$ be the Sargan statistic for testing the null hypothesis that your instruments
are valid.
(i) Show that $J_n = 0$ in the exactly identified case ($r = k$).
(ii) Show that, under the null, $J_n$ is asymptotically distributed as $\chi^2_{r-k}$ provided
$r > k$.
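Part (i) can be checked numerically before proving it. The sketch below assumes the common homoskedastic form of the statistic, $J_n = \hat U^\top W (W^\top W)^{-1} W^\top \hat U / \hat\sigma^2$ with 2SLS residuals $\hat U$ and $\hat\sigma^2 = \hat U^\top \hat U / n$; if the course defines $J_n$ differently, the code would need adjusting. Function names and the simulated design are hypothetical.

```python
import numpy as np

def tsls(Y, X, W):
    """2SLS coefficients: regress Y on the projection of X onto W."""
    X_hat = W @ np.linalg.solve(W.T @ W, W.T @ X)
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ Y)

def sargan_statistic(Y, X, W):
    """Assumed form: u'W (W'W)^{-1} W'u / (u'u / n) with 2SLS residuals u."""
    n = len(Y)
    u = Y - X @ tsls(Y, X, W)
    Wu = W.T @ u
    return (Wu @ np.linalg.solve(W.T @ W, Wu)) / (u @ u / n)

# Illustrative design with valid instruments
rng = np.random.default_rng(2)
n = 5_000
z = rng.standard_normal((n, 2))
e = rng.standard_normal(n)
x = z[:, 0] + z[:, 1] + 0.8 * e + rng.standard_normal(n)
Y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(n), x])               # k = 2
W_exact = np.column_stack([np.ones(n), z[:, 0]])   # r = 2, exactly identified
W_over = np.column_stack([np.ones(n), z])          # r = 3, one overidentifying restriction

J_exact = sargan_statistic(Y, X, W_exact)
J_over = sargan_statistic(Y, X, W_over)
print(J_exact, J_over)
```

With `W_exact` the statistic is zero up to floating-point error because the sample moment conditions hold exactly at the estimate; with `W_over` it is a draw that part (ii) compares with a $\chi^2_1$ distribution.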
9 Suppose you have individual data from a developing country and you want to
test whether nutrition affects productivity using the model
(i) Explain why calories and protein may be endogenous.
(ii) The literature has proposed regional prices of various goods (e.g. grains, meats,
breads, dairy products, etc.) as possible instruments for calories and protein.
Under what circumstances are these instruments valid?
(iii) What happens if these prices reflect quality of food?
(iv) If prices are a valid set of instruments, how many prices would you need to
identify $\alpha_1$ and $\alpha_2$?
(v) How many prices would you need to test the null hypothesis that they are
valid instruments?
(vi) Suppose you have $M \geq 2$ prices. Explain how you would test the null
hypothesis that calories and protein are exogenous.
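One way to approach part (vi), sketched here under strong simplifying assumptions, is a regression-based (control-function) version of the Hausman test: take first-stage residuals of calories and protein on the prices, add them to the structural regression, and test jointly that their coefficients are zero. The code below assumes homoskedastic errors, no additional exogenous regressors, and a simulated design with three prices; the names `exogeneity_wald` and `simulate` and all numerical values are hypothetical.

```python
import numpy as np

def exogeneity_wald(y, X_endog, Z):
    """Wald statistic for zero coefficients on first-stage residuals.

    Z holds a constant plus the instruments. Assumes homoskedastic errors.
    """
    n = len(y)
    # First-stage residuals of each suspect regressor on Z
    V_hat = X_endog - Z @ np.linalg.solve(Z.T @ Z, Z.T @ X_endog)
    R = np.column_stack([np.ones(n), X_endog, V_hat])
    coef = np.linalg.solve(R.T @ R, R.T @ y)
    resid = y - R @ coef
    q = V_hat.shape[1]
    s2 = resid @ resid / (n - R.shape[1])
    cov = s2 * np.linalg.inv(R.T @ R)
    gamma = coef[-q:]                        # coefficients on the residuals
    return gamma @ np.linalg.solve(cov[-q:, -q:], gamma)

def simulate(c, seed, n=5_000):
    """c controls how strongly the error enters calories and protein."""
    rng = np.random.default_rng(seed)
    prices = rng.standard_normal((n, 3))     # standardized prices, M = 3
    e = rng.standard_normal(n)
    calories = prices[:, 0] + prices[:, 1] + c * e + rng.standard_normal(n)
    protein = prices[:, 1] - prices[:, 2] + c * e + rng.standard_normal(n)
    y = 1.0 + 0.5 * calories + 0.3 * protein + e
    Z = np.column_stack([np.ones(n), prices])
    return y, np.column_stack([calories, protein]), Z

stat_endog = exogeneity_wald(*simulate(c=0.8, seed=3))
stat_exog = exogeneity_wald(*simulate(c=0.0, seed=4))
print(stat_endog, stat_exog)
```

Under the null of exogeneity and the assumptions above, the statistic would be compared with a $\chi^2_2$ critical value, since two regressors are being tested.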
10 You are interested in the causal effect of physical activity on health, so you
consider the following empirical model for the health of an individual
$\text{health} = \beta_0 + \beta_1 \text{exercise} + \beta_2 \text{age} + \beta_3 \text{height} + \beta_4 \text{BMI} + \beta_5 \text{work} + \beta_6 \text{married} + U$, (1)
where health is some quantitative measure of the person’s health, exercise is hours
of exercise per week, BMI is the ratio of weight (in kg.) to the square of height (in
meters), work is weekly hours worked, age, height and married are self-explanatory,
and U is an error term. Because females and males may behave differently, you fit
the model separately by gender.
(i) What parameter in (1) is of primary interest to you and how can it be
interpreted?
(ii) Why might you be concerned about exercise being correlated with U ?
(iii) Suppose you can collect data on two other variables, the distance of home from
work (distwork) and the distance of home from the nearest gym (distgym).
Discuss whether these variables are likely to be correlated with U .
(iv) Now assume that distwork and distgym are in fact uncorrelated with U , as are
all regressors in equation (1) with the exception of exercise. Write down the
reduced form for exercise, and state the conditions under which the parameters
of equation (1) are identified.
(v) How can this identification assumption be tested?
(vi) Is it a problem that distwork is only available if a person works? How would
you take this into account?
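For parts (iv) and (v), a possible empirical check of instrument relevance is an F-test that the coefficients on distwork and distgym in the reduced form for exercise are jointly zero. The sketch below is a minimal illustration on simulated data: it assumes homoskedastic errors, uses age as the only included exogenous regressor to keep the example short, and all coefficient values and the function name are hypothetical.

```python
import numpy as np

def excluded_instruments_F(x, exog, instruments):
    """F statistic for the excluded instruments in the reduced form of x."""
    n = len(x)

    def ssr(Z):
        coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
        r = x - Z @ coef
        return r @ r

    full = np.column_stack([exog, instruments])
    q = instruments.shape[1]                 # number of restrictions tested
    ssr_r, ssr_u = ssr(exog), ssr(full)
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - full.shape[1]))

# Illustrative design: exercise falls with both distances
rng = np.random.default_rng(5)
n = 2_000
age = rng.uniform(20, 60, n)
exog = np.column_stack([np.ones(n), age])
dist = rng.exponential(5.0, size=(n, 2))     # columns: distwork, distgym
exercise = (6.0 - 0.02 * age - 0.1 * dist[:, 0] - 0.3 * dist[:, 1]
            + rng.standard_normal(n))

F_relevant = excluded_instruments_F(exercise, exog, dist)
# Pure-noise "instruments" for comparison: they should not predict exercise
F_irrelevant = excluded_instruments_F(exercise, exog, rng.standard_normal((n, 2)))
print(F_relevant, F_irrelevant)
```

A large value for the actual distances alongside a small value for the noise columns is what relevance would look like in this design; the test says nothing about whether the distances are uncorrelated with U.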