Worksheet 5. Regression and Experiments: Problems With Regression: Omitted Variables
Y = β0 + β1 X1 + ··· + βk Xk + u
(a) Show that a population linear regression of Y on (X1, ..., Xk−1), i.e. omitting Xk,
yields a coefficient on X1 of
γ1 = β1 + βk π1
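As a numerical sanity check on the formula above (not a substitute for the derivation), the following minimal sketch simulates the special case k = 2. It assumes that π1 denotes the coefficient on X1 in a regression of the omitted variable on the included regressors; that definition, the parameter values and the helper functions `mean`, `slope` and `residuals` are illustrative assumptions, not taken from the worksheet.

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """Slope from an OLS regression of y on x and a constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals from an OLS regression of y on x and a constant."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]

rng = random.Random(0)
n = 5000
beta0, beta1, beta2 = 1.0, 2.0, -1.5   # hypothetical coefficients
pi1_true = 0.8                          # hypothetical link between X2 and X1

x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.5 + pi1_true * a + rng.gauss(0, 1) for a in x1]
y = [beta0 + beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# Long regression of Y on (X1, X2), computed via the FWL theorem.
b1_long = slope(residuals(x2, x1), residuals(x2, y))
b2_long = slope(residuals(x1, x2), residuals(x1, y))

# Auxiliary regression of the omitted X2 on X1, and short regression of Y on X1.
pi1_hat = slope(x1, x2)
gamma1_hat = slope(x1, y)

print("short-regression coefficient:", gamma1_hat)
print("b1 + b2 * pi1 (sample analogue):", b1_long + b2_long * pi1_hat)
print("beta1 + beta2 * pi1 (population):", beta1 + beta2 * pi1_true)
```

The first two printed numbers agree up to floating-point error, because the decomposition also holds exactly for the sample OLS estimates; the third is the population value that both approach as n grows.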
Y = β0 + β1 X1 + β2 X2 + u (1)
(a) Suppose we attempt to estimate (1) by linear regression. Derive an expression for
the large-sample limit of the OLS estimator of β1 (i.e. compute the coefficient on X1
in a population linear regression of Y on X1 and X2). [Hint: use the FWL theorem.]
(b) Using your answer to part (a), provide a condition under which the OLS estimator is
consistent for β1 (when δ ≠ 0). Interpret this condition.
Y = β0 + β1 X1 + β2 X2 + β3 X3 + u
where OR holds, i.e. Eu = 0 and cov(Xl, u) = 0 for l ∈ {1, 2, 3}. Suppose that you observe
X1 and X2. You do not observe X3, but instead observe a possible proxy variable W.
Provide conditions under which an OLS regression of Y on (X1, X2, W) will consistently
estimate β1 and β2. [Hint: adapt the argument given in Section 7.1.2 of the notes.]
Randomised control trials
4. Consider a study, run in 2005–06 in the US, to evaluate the effect on college student grades
of dorm room internet connections. In a large dorm, half the rooms were randomly wired
for high-speed internet connections (the treatment group), and final course grades collected
for all residents at the end of the academic year (in July 2006). Which of the following
would pose threats to the internal validity of the study, and why?
(a) Midway through the year, all the male athletes moved into a fraternity and dropped
out of the study. (Their final grades were not observed.)
(b) Engineering students assigned to the control group put together a local area network
so that they could share a private wireless internet connection that they paid for
jointly.
(c) The art majors in the treatment group never learned how to access their internet
accounts.
(d) The economics majors in the treatment group provided access to their internet
connection to those in the control group, for a fee.
(e) A major storm in early October 2005 caused damage to the campus network such
that around 20% of all dorm room internet connections failed, and repairs were not
successfully carried out until August 2006.
5. Suppose you have data on an outcome $\{Y_i\}_{i=1}^n$ and a binary treatment dummy $\{D_i\}_{i=1}^n$.
Let $\hat{\beta}_1$ denote the estimate of the coefficient on D in an OLS regression of Y on D, $n_1$
the number of treated observations ($D_i = 1$) and $n_0 = n - n_1$ the number of untreated
observations ($D_i = 0$). Show that
$$\hat{\beta}_1 = \frac{1}{n_1} \sum_{\{i \mid D_i = 1\}} Y_i - \frac{1}{n_0} \sum_{\{i \mid D_i = 0\}} Y_i$$
i.e. that the OLS estimator is equal to the difference in the sample means of the treated
and untreated groups.
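Before attempting the proof, it can help to confirm the claim numerically. The sketch below is illustrative only: the data-generating process, sample size and variable names are assumptions made for the example, and the OLS slope is computed from the usual simple-regression formula (sample covariance over sample variance).

```python
import random

rng = random.Random(1)
n = 200

# Hypothetical data: a binary treatment dummy and an outcome.
d = [1 if rng.random() < 0.4 else 0 for _ in range(n)]
y = [3.0 + 1.5 * di + rng.gauss(0, 1) for di in d]

# OLS slope in a regression of Y on D and a constant.
d_bar = sum(d) / n
y_bar = sum(y) / n
beta1_hat = (
    sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    / sum((di - d_bar) ** 2 for di in d)
)

# Difference in sample means between treated and untreated groups.
treated = [yi for di, yi in zip(d, y) if di == 1]
untreated = [yi for di, yi in zip(d, y) if di == 0]
diff_means = sum(treated) / len(treated) - sum(untreated) / len(untreated)

print("OLS slope:          ", beta1_hat)
print("difference in means:", diff_means)
```

The two printed values coincide up to floating-point error for any sample in which both groups are non-empty, which is what the algebraic argument should establish.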
6. An economist has run the following experiment to estimate the income elasticity of food
consumption, using income transfers. 10,000 households were randomly sampled from the
population of a large city in a developing country, to participate in the study: for each of
these the economist has information on the household head (their age, years of completed
education, and height) and the household itself (household size, and household income
and expenditure on food in the week prior to the study). After collecting this data, the
economist used a random number generator to assign each of the participating households
to either:
with equal proportions of these receiving each of the possible values of the income
transfer).
The economist then recorded each household’s expenditure on food during the week follow-
ing their receipt of the income transfer. The following table gives OLS regression estimates
obtained using the data collected by the economist (all regressions include a constant term,
the estimate of which is not reported):
where F denotes the F statistic for a test of the null that all slope coefficients are zero.
(a) What is the purpose of performing regression (3)? What do you infer from the F
statistic?
(b) In the context of (1) and (2), why do you think the economist has regressed food
consumption on the value of the income transfer, rather than on total household
income (inclusive of the transfer)? (Assume that income in the week that the transfer
was received was also recorded.) Or would you not expect the choice of either to make
a fundamental difference to the estimates?
(c) Do you think the estimated coefficient on the income transfer in (1) can be given a
causal interpretation? Construct a 95% confidence interval for the estimated
coefficient on the income transfer in (1), and interpret it.
(d) What is the purpose of including the additional regressors in (2)? Compare the
estimated coefficient on the income transfer, and its standard error, with that in (1),
and give an intuitive explanation for why they differ.
(e) Explain why height might have been included in (2). Is it possible to give its estimated
coefficient a causal interpretation?
(f) Recall that the economist is interested in the income elasticity of food consumption. A
reviewer suggests re-estimating the regression in (1), but this time with the logarithm
of food consumption as the dependent variable, and the logarithm of the income
transfer on the right-hand side (in place of the level of the income transfer). Do you think this
is a sensible approach? Could you propose an alternative way to estimate the income
elasticity of food consumption?
7. Suppose we are interested in the effect of kindergarten class sizes on outcomes later in
life, in this case on earnings at age 40. We observe a group of individuals who were
randomly assigned to ‘small’ and ‘regularly’ sized classes during kindergarten as part of
an experimental study. Our dataset records the type of class they were assigned to (D = 1
if a small class, 0 otherwise), their earnings at age 40 (Y ), and their total years spent in
education by age 40 (X).
[Hint: in answering the preceding questions, it might be helpful to consider the following
model for the determination of Y and X
Y = β0 + β1 D + β2 X + u
X = δ0 + δ1 D + v
and think about what might be plausibly assumed about D, X, u and v in this setting.]
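One way to explore the hint is by simulation. The sketch below is only an illustration under stated assumptions: D is randomly assigned (independent of u and v), while u and v are allowed to be correlated through the hypothetical specification u = 0.8 v + e. All parameter values and helper functions are invented for the example. Under these assumptions, a regression of Y on D alone should recover β1 + β2 δ1, whereas the coefficient on D in a regression of Y on D and X should converge to β1 − 0.8 δ1 rather than β1.

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """Slope from an OLS regression of y on x and a constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals from an OLS regression of y on x and a constant."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]

rng = random.Random(2)
n = 20000
beta0, beta1, beta2 = 10.0, 1.0, 0.5   # hypothetical earnings equation
delta0, delta1 = 12.0, 0.5             # hypothetical education equation
rho = 0.8                              # assumed dependence of u on v

d = [1 if rng.random() < 0.5 else 0 for _ in range(n)]   # random assignment
v = [rng.gauss(0, 1) for _ in range(n)]
u = [rho * vi + rng.gauss(0, 1) for vi in v]             # u correlated with v
x = [delta0 + delta1 * di + vi for di, vi in zip(d, v)]
y = [beta0 + beta1 * di + beta2 * xi + ui for di, xi, ui in zip(d, x, u)]

# Regression of Y on D only.
short_d = slope(d, y)

# Coefficient on D in a regression of Y on (D, X), computed via FWL.
long_d = slope(residuals(x, d), residuals(x, y))

print("Y on D:          ", short_d, " vs beta1 + beta2*delta1 =", beta1 + beta2 * delta1)
print("Y on D and X (D):", long_d, " vs beta1 - rho*delta1 =", beta1 - rho * delta1)
```

Setting `rho = 0.0` makes X uncorrelated with u, in which case the coefficient on D in the second regression should be close to β1 instead.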