Assignment 5 - Instrumental Variables - Solution
Question 1: Consider a simple model to estimate the effect of personal computer (PC)
ownership on college grade point average for graduating seniors at a large public university:
𝐺𝑃𝐴 = 𝛽0 + 𝛽1 𝑃𝐶 + 𝜇
Where PC is a binary variable indicating PC ownership.
a. Why might PC ownership be correlated with 𝜇?
It is fairly well established that socioeconomic status affects student performance. The error
term 𝜇 contains, among other things, family income, which has a positive effect on GPA and is also very
likely to be correlated with PC ownership.
b. Explain why PC is likely to be related to parents’ annual income. Does this mean
parental income is a good IV for PC? Why or why not?
Families with higher incomes can afford to buy computers for their children. Therefore, family income
certainly satisfies the second requirement for an instrumental variable: it is correlated with the
endogenous explanatory variable [see (15.5) with x = PC and z = faminc]. But as we suggested in part (a),
faminc has a positive effect on GPA, so the first requirement for a good IV, (15.4), fails for faminc, and
parental income is therefore not a good IV for PC. If we had data on faminc we would include it as an
explanatory variable in the equation; if it is the only important omitted variable correlated with PC,
we could then estimate the expanded equation by OLS.
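The answer above cites (15.4) and (15.5) without restating them. Assuming these are the usual two IV requirements (the numbering presumably comes from the course textbook, which is not reproduced here), they can be written as follows, with z the candidate instrument, x the endogenous regressor, and 𝜇 the error term:

```latex
\begin{align*}
\operatorname{Cov}(z, \mu) &= 0    && \text{(instrument exogeneity, cited as (15.4))} \\
\operatorname{Cov}(z, x)   &\neq 0 && \text{(instrument relevance, cited as (15.5))}
\end{align*}
```

With z = faminc and x = PC, relevance holds but exogeneity fails, because faminc is itself part of 𝜇.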
c. Suppose that, four years ago, the university gave grants to buy computers to roughly
one-half of the incoming students, and the students who received grants were
randomly chosen. Carefully explain how you would use this information to construct
an instrumental variable for PC.
This is a natural experiment that affects whether or not some students own computers. Some students
who buy computers when given the grant would not have bought one without it. (Students who did not
receive the grants might still own computers.) Define a dummy variable, grant, equal to one if the
student received a grant, and zero otherwise. Then, if grant was randomly assigned, it is uncorrelated
with 𝜇. In particular, it is uncorrelated with family income and other socioeconomic factors in 𝜇.
Further, grant should be correlated with PC: the probability of owning a PC should be significantly
higher for students receiving grants. Incidentally, if the university gave grant priority to low-income
students, grant would be negatively correlated with 𝜇, and IV would be inconsistent.
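The argument in part (c) can be illustrated with a small simulation. This is only a sketch under invented assumptions: the sample size, the true effect of 0.3, and every coefficient in the data-generating process are hypothetical, and the function names are made up for this example. Family income is placed in the error term, so OLS on PC should be biased upward, while the simple IV estimator Cov(grant, GPA) / Cov(grant, PC) should land near the true effect.

```python
import numpy as np


def simulate_pc_data(n=200_000, beta1=0.3, seed=0):
    """Simulate GPA = 2.5 + beta1*PC + error, with family income hidden in the error.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    faminc = rng.normal(size=n)           # standardized family income (unobserved)
    grant = rng.integers(0, 2, size=n)    # randomly assigned grant dummy (0 or 1)
    # PC ownership is more likely with higher income and with a grant.
    latent = 0.8 * faminc + 1.0 * grant + rng.normal(size=n)
    pc = (latent > 0.5).astype(float)
    # The error term contains family income, so it is correlated with pc.
    error = 0.5 * faminc + rng.normal(scale=0.5, size=n)
    gpa = 2.5 + beta1 * pc + error
    return gpa, pc, grant.astype(float)


def ols_slope(y, x):
    """Simple-regression OLS slope: Cov(x, y) / Var(x)."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]


def iv_slope(y, x, z):
    """Simple IV slope with one instrument: Cov(z, y) / Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


if __name__ == "__main__":
    gpa, pc, grant = simulate_pc_data()
    print("true beta1:", 0.3)
    print("OLS slope: ", round(ols_slope(gpa, pc), 3))
    print("IV slope:  ", round(iv_slope(gpa, pc, grant), 3))
```

If the grant were instead targeted at low-income students, `grant` would depend on `faminc` in the simulation and the IV slope would no longer be centered on the true effect.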
Question 2: The following is a simple model to measure the effect of a school choice program
on standardized test performance:
𝑆𝑐𝑜𝑟𝑒 = 𝛽0 + 𝛽1 𝑐ℎ𝑜𝑖𝑐𝑒+𝛽2 𝑓𝑎𝑚𝑖𝑛𝑐 + 𝜇
Where score is the score on a statewide test, choice is a binary variable indicating whether a
student attended a choice school in the last year, and faminc is family income. The IV for choice
is grant, the dollar amount granted to students to use for tuition at choice schools. The grant
amount differed by family income level, which is why we control for faminc in the equation.
a. Even with faminc in the equation, why might choice be correlated with 𝜇?
Even at a given income level, some students are more motivated and more able than others, and their
families are more supportive (say, in terms of providing transportation) and enthusiastic about
education. Therefore, there is likely to be a self-selection problem: students who would do better
anyway are also more likely to attend a choice school.
b. If within each income class, the grant amounts were assigned randomly, is grant
uncorrelated with 𝜇?
Assuming we have the functional form for faminc correct, the answer is yes. Since 𝜇 does not contain
income, random assignment of grants within income class means that the grant amount is not
correlated with unobservables such as student ability, motivation, and family support.
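A simulation can make the role of faminc concrete. The sketch below rests on hypothetical assumptions throughout: a linear grant schedule in standardized income, unobserved ability that is independent of income, a true choice effect of 5 points, and invented function names. It checks that grant is strongly correlated with faminc, that the part of grant left over after controlling for faminc is essentially uncorrelated with ability, and that 2SLS with faminc as a control recovers the effect while OLS does not.

```python
import numpy as np


def simulate_choice_data(n=200_000, beta1=5.0, seed=1):
    """Hypothetical data: grant falls with income and is random within income class."""
    rng = np.random.default_rng(seed)
    faminc = rng.normal(size=n)      # standardized family income (observed)
    ability = rng.normal(size=n)     # unobserved; ends up in the error term
    # Grant amount: lower for higher incomes, plus a randomly assigned component.
    grant = 2.0 - 1.0 * faminc + rng.normal(size=n)
    # Self-selection: abler students are more likely to attend a choice school.
    latent = 0.7 * grant + 0.3 * faminc + ability + rng.normal(size=n)
    choice = (latent > 1.4).astype(float)
    error = 4.0 * ability + rng.normal(scale=2.0, size=n)
    score = 50.0 + beta1 * choice + 3.0 * faminc + error
    return score, choice, faminc, grant, ability


def ols(y, *regressors):
    """OLS coefficients, with an intercept as the first element."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def residualize(v, control):
    """Residual of v after a linear regression on an intercept and control."""
    b = ols(v, control)
    return v - b[0] - b[1] * control


def two_sls_choice(score, choice, faminc, grant):
    """2SLS coefficient on choice; the first stage uses grant and faminc."""
    pi = ols(choice, grant, faminc)
    choice_hat = pi[0] + pi[1] * grant + pi[2] * faminc
    return ols(score, choice_hat, faminc)[1]


if __name__ == "__main__":
    score, choice, faminc, grant, ability = simulate_choice_data()
    print("corr(grant, faminc):            ", np.corrcoef(grant, faminc)[0, 1])
    print("corr(grant | faminc, ability):  ",
          np.corrcoef(residualize(grant, faminc), ability)[0, 1])
    print("OLS coefficient on choice:      ", ols(score, choice, faminc)[1])
    print("2SLS coefficient on choice:     ", two_sls_choice(score, choice, faminc, grant))
```

The caveat in the answer about functional form matters here: the sketch partials out faminc linearly, which matches the linear grant schedule it assumes. If grants depended on income in a nonlinear way, a linear control would leave some income variation in the instrument.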
Question 3: Suppose that a fad for oats (resulting from the announcement of the health benefits
of oat bran) has made you toy with the idea of becoming a broker in the oat market. Before
spending your money, you decide to build a simple model of supply and demand of the market
for oats:
𝑄𝑑𝑡 = 𝛽0 + 𝛽1 𝑃𝑡 + 𝛽2 𝑌𝐷𝑡 + 𝜀𝑑𝑡
𝑄𝑠𝑡 = 𝛼0 + 𝛼1 𝑃𝑡 + 𝛼2 𝑊𝑡 + 𝜀𝑠𝑡
𝑄𝑑𝑡 = 𝑄𝑠𝑡
Where: 𝑄𝑑𝑡 = the quantity of oats demanded in time period t
𝑄𝑠𝑡 = the quantity of oats supplied in time period t
𝑃𝑡 = the price of oats in time period t
𝑊𝑡 = average oat farmer wages in time period t
𝑌𝐷𝑡 = disposable income in time period t
a. You notice that no left-hand-side variable appears on the right-hand-side of either of your
stochastic simultaneous equations. Does this mean that OLS estimation will encounter no
simultaneity bias? Why or why not?
b. You expect that when 𝑃𝑡 goes up, 𝑄𝑑𝑡 will fall. Does this mean that if you encounter
simultaneity bias in the demand equation, it will be negative instead of the positive bias
we typically associate with OLS estimation of simultaneous equations? Explain your
answer.
c. Carefully outline how you would apply 2SLS to this system. How many equations
(including reduced forms) would you have to estimate? Specify precisely which variables
would be in each equation.
Solution: