Chapter 13 Homework Answer: 13.2 The First Equation Omits The 1981 Year Dummy Variable, Y81, and So Does Not Allow
13.2 The first equation omits the 1981 year dummy variable, y81, and so does not allow
any appreciation in nominal housing prices over the three-year period in the absence of an
incinerator. The interaction term in this case is simply picking up the fact that even
homes that are near the incinerator site have appreciated in value over the three years.
This equation suffers from omitted variable bias.
The second equation omits the dummy variable for being near the incinerator site,
nearinc, which means it does not allow for systematic differences in homes near and far
from the site before the site was built. If, as seems to be the case, the incinerator was
located closer to less valuable homes, then omitting nearinc attributes lower housing
prices too much to the incinerator effect. Again, we have an omitted variable problem.
This is why equation (13.9) (or, even better, the equation that adds a full set of controls)
is preferred.
13.6 (i) Let FL be a binary variable equal to one if a person lives in Florida, and zero
otherwise. Let y90 be a year dummy variable for 1990. Then, from equation (13.10), we
have the linear probability model
arrest = β0 + δ0 y90 + β1 FL + δ1 y90⋅FL + u.
The effect of the law is measured by δ1, which is the change in the probability of drunk
driving arrest due to the new law in Florida. Including y90 allows for aggregate trends in
drunk driving arrests that would affect both states; including FL allows for systematic
differences between Florida and Georgia in either drunk driving behavior or law
enforcement.
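As a minimal sketch of what δ1 measures, assume the model contains only an intercept, y90, FL, and the interaction y90⋅FL. OLS then fits the four state-by-period cell means exactly, so δ1 is the change in Florida minus the change in Georgia. The arrest rates below are hypothetical numbers invented for illustration, and the function name did_coefficients is likewise an assumption.

```python
# Hypothetical fractions of sampled drivers arrested for drunk driving,
# keyed by (state, y90). These values are invented for illustration only.
arrest_rate = {
    ("GA", 0): 0.020,
    ("GA", 1): 0.024,
    ("FL", 0): 0.030,
    ("FL", 1): 0.027,
}


def did_coefficients(rates):
    """Coefficients of arrest = b0 + d0*y90 + b1*FL + d1*y90*FL + u.

    With only these regressors the model is saturated, so the OLS
    estimates are simple functions of the four cell means.
    """
    beta0 = rates[("GA", 0)]                      # Georgia, base period
    delta0 = rates[("GA", 1)] - rates[("GA", 0)]  # common time trend
    beta1 = rates[("FL", 0)] - rates[("GA", 0)]   # pre-law state gap
    # Difference-in-differences: Florida's change minus Georgia's change.
    delta1 = (rates[("FL", 1)] - rates[("FL", 0)]) - delta0
    return beta0, delta0, beta1, delta1


beta0, delta0, beta1, delta1 = did_coefficients(arrest_rate)
print(f"delta1 = {delta1:.3f}")
```

With these made-up rates, arrests rise in Georgia and fall in Florida, so δ1 is negative: the law is associated with a lower arrest probability than the Georgia trend alone would predict.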
(ii) It could be that the populations of drivers in the two states change in different
ways over time. For example, age, race, or gender distributions may have changed. The
levels of education across the two states may have changed. As these factors might affect
whether someone is arrested for drunk driving, it could be important to control for them.
At a minimum, there is the possibility of obtaining a more precise estimator of δ1 by
reducing the error variance. Essentially, any explanatory variable that affects arrest can
be used for this purpose. (See Section 6.3 for discussion.)
(iii) With county-level data, the dependent variable could be frac_arr, the fraction
of licensed drivers arrested for drunk driving (DD). We could then estimate the equation
as a standard linear regression model rather than as the linear probability model (LPM)
in (i). Of course, we cannot include individual characteristics, but we can control for
the general demographics of each county, such as average age, average income, and
education level.
C13.1 (i) The F statistic (with 4 and 1,111 df) is about 1.16 and p-value ≈ .328, which
shows that the living environment variables are jointly insignificant.
(ii) The F statistic (with 3 and 1,111 df) is about 3.01 and p-value ≈ .029, and so
the region dummy variables are jointly significant at the 5% level.
(iii) After obtaining the OLS residuals, û, from estimating the model in Table
13.1, we run the regression of û² on y74, y76, …, y84 using all 1,129 observations. The
null hypothesis of homoskedasticity is H0: γ1 = 0, γ2 = 0, … , γ6 = 0. So we just use the
usual F statistic for joint significance of the year dummies. The R-squared is about .0153
and F ≈ 2.90; with 6 and 1,122 df, the p-value is about .0082. So there is evidence of
heteroskedasticity that is a function of time at the 1% significance level. This suggests
that, at a minimum, we should compute heteroskedasticity-robust standard errors, t
statistics, and F statistics. We could also use weighted least squares (although the form
of heteroskedasticity used here may not be sufficient; it does not depend on educ, age,
and so on).
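The F statistic in part (iii) can be checked from the reported R-squared alone, using the R-squared form of the F test for overall significance. The sketch below uses only the numbers quoted above (R-squared ≈ .0153, 6 year dummies, 1,122 residual df); the function name f_from_r2 is an assumption for illustration.

```python
def f_from_r2(r2, q, df_resid):
    """F statistic for joint significance of all q regressors,
    computed from the regression R-squared."""
    return (r2 / q) / ((1.0 - r2) / df_resid)


# Regression of squared residuals on the six year dummies:
# n = 1,129 observations, so df_resid = 1129 - 6 - 1 = 1122.
f_stat = f_from_r2(r2=0.0153, q=6, df_resid=1129 - 6 - 1)
print(f"F = {f_stat:.2f}")
```

The result is close to the reported F ≈ 2.90; any small gap comes from using the rounded R-squared.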
(iv) Adding y74⋅educ, …, y84⋅educ allows the relationship between fertility and
education to be different in each year; remember, the coefficient on the interaction gets
added to the coefficient on educ to get the slope for the appropriate year. When these
interaction terms are added to the equation, R-squared ≈ .137. The F statistic for joint
significance (with 6 and 1,105 df) is about 1.48 with p-value ≈ .18. Thus, the
interactions are not jointly significant at even the 10% level. This is a bit misleading,
however. An abbreviated equation (which just shows the coefficients on the terms
involving educ) is
Three of the interaction terms, y78 ⋅ educ, y82 ⋅ educ, and y84 ⋅ educ are statistically
significant at the 5% level against a two-sided alternative, with the p-value on the latter
being about .012. The coefficients are large in magnitude as well. The coefficient on
educ – which is for the base year, 1972 – is small and insignificant, suggesting little if
any relationship between fertility and education in the early seventies. The estimates
above are consistent with fertility becoming more linked to education as the years pass.
The F statistic is insignificant because we are testing some insignificant coefficients
along with some significant ones.
C13.3 (i) Other things equal, homes farther from the incinerator should be worth more
once the site is chosen, so we expect δ1 > 0; δ1 is the difference-in-differences effect,
the change in the relationship between price and distance from 1978 to 1981. β1 > 0
means that homes farther from the site were already more expensive before the
incinerator was built; that is, the incinerator was located closer to less valuable homes.
(ii) The estimate δ̂1 = .048 has the expected sign, but it is not statistically significant (t statistic ≈ .59).
(iii) When we add the list of housing characteristics to the regression, the estimate
of δ1, the difference-in-differences effect of the incinerator on home prices, changes
from .048 to .062 (se = .050). So the estimated effect is larger – the elasticity of price
with respect to dist is .062 higher after the incinerator site was chosen – but its t
statistic is only 1.24. The p-value for the one-sided alternative H1: δ1 > 0 is about .108,
which is close to being significant at the 10% level.
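The t statistic and one-sided p-value in part (iii) can be reproduced from the reported estimate and standard error. This is a sketch that assumes the standard normal is an adequate approximation to the t distribution at this sample size; the function name one_sided_p_normal is an assumption for illustration.

```python
import math


def one_sided_p_normal(t_stat):
    """Upper-tail probability P(Z > t_stat) for a standard normal Z.

    Assumption: the normal distribution stands in for the t distribution,
    which is reasonable only when the degrees of freedom are large.
    """
    return 0.5 * math.erfc(t_stat / math.sqrt(2.0))


delta1_hat = 0.062  # reported estimate with housing controls
se_delta1 = 0.050   # reported standard error

t_stat = delta1_hat / se_delta1
p_one_sided = one_sided_p_normal(t_stat)
print(f"t = {t_stat:.2f}, one-sided p = {p_one_sided:.3f}")
```

The approximation lands within rounding of the p-value of about .108 quoted above.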
(iv) β1 is the elasticity of home price with respect to distance in 1978, before the
incinerator was built. Without controls, homes farther from the site were more expensive
(.317), but they also had better home characteristics. After controlling for home
characteristics, distance from the site did not matter (.001).
C13.6 (i) You may use an econometrics software package that directly tests restrictions
such as H0: β1 = β2 after estimating the unrestricted model in (13.22). But, as we have
seen many times, we can simply rewrite the equation to test this using any regression
software. Write the differenced equation as
Δlog(crime) = δ0 + β1 Δclrprc₋₁ + β2 Δclrprc₋₂ + Δu.
Following the hint, we define θ1 = β1 − β2, and then write β1 = θ1 + β2. Plugging this into
the differenced equation and rearranging gives
Δlog(crime) = δ0 + θ1 Δclrprc₋₁ + β2 (Δclrprc₋₁ + Δclrprc₋₂) + Δu.
Estimating this equation by OLS gives θ̂1 = .0091, se(θ̂1) = .0085. The t statistic for H0:
β1 = β2 is .0091/.0085 ≈ 1.07, which is not statistically significant.
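The substitution β1 = θ1 + β2 described above can be illustrated on simulated data: regressing y on x1 and (x1 + x2) puts θ1 = β1 − β2 directly on x1, so ordinary regression output gives its standard error and t statistic. The sketch below uses made-up data and a small hand-written OLS solver (the helper ols and all variable names are assumptions for illustration, not the homework data set).

```python
import random


def ols(X, y):
    """OLS coefficients from the normal equations (X'X)b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Simulated regressors and outcome; the coefficients are arbitrary.
random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [0.1 - 0.4 * a - 1.3 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Unrestricted model: y on (1, x1, x2).
b_unres = ols([[1.0, a, b] for a, b in zip(x1, x2)], y)
# Reparameterized model: y on (1, x1, x1 + x2).
b_repar = ols([[1.0, a, a + b] for a, b in zip(x1, x2)], y)

theta_hat = b_repar[1]  # coefficient on x1 is now beta1 - beta2
print(abs(theta_hat - (b_unres[1] - b_unres[2])) < 1e-8)
```

The two regressions have the same fit; only the interpretation of the coefficient on x1 changes, which is what makes the usual t statistic a test of β1 = β2.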
(ii) Since we did not reject the hypothesis in part (i), we would be justified in using the
simpler model with avgclr. Based on adjusted R-squared, we have a slightly worse fit
with the restriction imposed. But this is a minor consideration. Ideally, we could get
more data to determine whether the fairly different unconstrained estimates of β1 and β2
in equation (13.22) reveal true differences in β1 and β2.