Sample Exercises
SUMMARY
In this chapter we considered regression models with more than one explanatory
variable. The least squares coefficients measure the direct effect of an
explanatory variable on the dependent variable after neutralizing for the
indirect effects that run via the other explanatory variables. These estimated
effects therefore depend on the set of all explanatory variables included in the
model. We paid particular attention to the question of which explanatory
variables should be included in the model. For reasons of efficiency it is better
to exclude variables that have only a marginal effect. The statistical properties
of least squares were derived under a number of assumptions on the data
generating process. Under these assumptions, the F-test can be used to test
for the individual and joint significance of explanatory variables.
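The distinction between direct and total effects can be illustrated numerically. The following is a minimal sketch on simulated data (the variable names, coefficients, and sample size are assumptions chosen for illustration, not taken from the chapter): the coefficient of x2 in the short regression equals its direct effect plus the indirect effect that runs via x3.

```python
import numpy as np

# Simulated data (hypothetical): x3 is correlated with x2.
rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x3 = 0.5 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(size=n)

def ols(X, y):
    """Least squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)
b_full = ols(np.column_stack([const, x2, x3]), y)   # direct effects of x2 and x3
b_short = ols(np.column_stack([const, x2]), y)      # total effect of x2 (x3 omitted)
p = ols(np.column_stack([const, x2]), x3)           # auxiliary regression of x3 on x2

# total effect = direct effect + (auxiliary slope) * (direct effect of x3)
total_from_parts = b_full[1] + p[1] * b_full[2]
print(b_short[1], total_from_parts)
```

The two printed numbers agree up to rounding error, because the relation holds as an algebraic identity in the sample and not only on average.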
FURTHER READING
In our analysis we made intensive use of matrix methods. We give some references
to econometric textbooks that also follow this approach. Chow (1983), Greene
(2000), Johnston and DiNardo (1997), Stewart and Gill (1998), Verbeek (2000),
and Wooldridge (2002) are on an intermediate level; the other books are on an
advanced level. The handbooks edited by Griliches and Intriligator contain
overviews of many topics that are treated in this and the next chapters.
KEYWORDS
auxiliary regressions
ceteris paribus
Chow forecast test
coefficient of determination
covariance matrix
degrees of freedom
direct effect
F-test
Frisch–Waugh
indirect effect
inefficient
joint significance
least squares estimator
linear restrictions
matrix form
minimal variance
multicollinearity
normal equations
omitted variables bias
partial regression
partial regression scatter
prediction interval
predictive performance
projection
significance
significance of the regression
standard error
standard error of the regression
t-test
t-value
total effect
true model
unbiased
uncontrolled
variance inflation factor
Heij / Econometric Methods with Applications in Business and Economics Final Proof 28.2.2004 3:04pm page 180
Exercises
THEORY QUESTIONS
3.1 (E Section 3.1.2)
c. With the same conventions we get $\partial^2 S / \partial b \, \partial b' = Q$ for ...

... $b_R = b - A(Rb - r)$, where $b$ is the unrestricted least squares estimator and $A = (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1}$.
b. Let $e = y - Xb$ and $e_R = y - X b_R$; then show that
$$e_R' e_R = e'e + (Rb - r)' [R (X'X)^{-1} R']^{-1} (Rb - r).$$

... sub-samples. Let $e_1' e_1 = \sum_{i=1}^{n_1} (y_i - \bar{y}_1)^2$ and $e_2' e_2 = \sum_{i=n_1+1}^{n_1+n_2} (y_i - \bar{y}_2)^2$ be the total sum of squares in the first and second sub-sample respectively; then the pooled estimator of the variance is defined by $s_p^2 = (e_1' e_1 + e_2' e_2)/(n_1 + n_2 - 2)$ and the pooled t-test is defined by ...
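The restricted least squares formula and the sum-of-squares identity stated in this exercise can be checked numerically before attempting the proof. The sketch below uses simulated data and a single hypothetical restriction (the second and third coefficients sum to one); the data, the restriction, and all variable names are assumptions made for illustration.

```python
import numpy as np

# Simulated data (hypothetical): constant plus two regressors.
rng = np.random.default_rng(1)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.6, 0.4]) + rng.normal(size=n)

# Hypothetical linear restriction R b = r: second plus third coefficient equals one.
R = np.array([[0.0, 1.0, 1.0]])
r = np.array([1.0])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                      # unrestricted least squares estimator
M = np.linalg.inv(R @ XtX_inv @ R.T)       # [R (X'X)^{-1} R']^{-1}
A = XtX_inv @ R.T @ M
b_R = b - A @ (R @ b - r)                  # restricted estimator

e = y - X @ b
e_R = y - X @ b_R
lhs = e_R @ e_R                            # restricted sum of squared residuals
rhs = e @ e + (R @ b - r) @ M @ (R @ b - r)
print(R @ b_R, lhs, rhs)
```

The restricted estimator satisfies the restriction exactly, and the two sums of squares coincide up to rounding error.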
b. Derive the residual sum of squares over the full sample $i = 1, \ldots, n+1$ under the alternative hypothesis.
c. Derive the F-test for the hypothesis that $\gamma = 0$.
$$\log(Y) = \underset{(0.415)}{0.701} + \underset{(0.091)}{0.756}\,\log(L) + \underset{(0.110)}{0.242}\,\log(K) + e$$

The model is also estimated under two alternative restrictions, the first with equal coefficients for log(L) and log(K) and the second with the sum of the coefficients of log(L) and log(K) equal to one ('constant returns to scale'). For this purpose the following two regressions are performed.

$$\log(Y) = \underset{(0.358)}{0.010} + \underset{(0.026)}{0.524}\,(\log(L) + \log(K)) + e_1$$

$$\log(Y) - \log(K) = \underset{(0.132)}{0.686} + \underset{(0.089)}{0.756}\,(\log(L) - \log(K)) + e_2$$

b. Comment on the differences between the conclusions that could be drawn (without further thinking) from each of these two regressions.
c. Draw a partial regression scatter plot (with regression line) for salary (in logarithms) against gender after correction for the variable education (see Case 3 in Section 3.2.5). Draw also a scatter plot (with regression line) for the original (uncorrected) data on salary (in logarithms) and gender. Discuss how these plots help in clarifying the differences in b.
d. Check the results on regression coefficients and residuals in the result of Frisch–Waugh (3.39) for these data, where X1 refers to the variable x4, and X2 refers to the constant term and the variable x2.
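The kind of check asked for in part d can be sketched as follows. The example uses simulated data rather than the salary data of the exercise, so the numbers are purely illustrative; only the roles of X1 (one variable of interest) and X2 (constant plus one control variable) follow the exercise.

```python
import numpy as np

# Simulated stand-ins (hypothetical) for the variables in the exercise.
rng = np.random.default_rng(2)
n = 200
x2 = rng.normal(size=n)                    # control variable
x4 = 0.7 * x2 + rng.normal(size=n)         # variable of interest, correlated with x2
y = 0.5 + 1.5 * x2 - 0.8 * x4 + rng.normal(size=n)

def ols(X, y):
    """Return least squares coefficients and residuals of y on X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return b, y - X @ b

X1 = x4.reshape(-1, 1)
X2 = np.column_stack([np.ones(n), x2])

# Full regression of y on X1 and X2.
b_full, e_full = ols(np.column_stack([X1, X2]), y)

# Partial regressions: remove the effect of X2 from y and from X1.
_, y_clean = ols(X2, y)
_, x_clean = ols(X2, x4)

# Regress the cleaned y on the cleaned X1 (no constant needed).
b_partial, e_partial = ols(x_clean.reshape(-1, 1), y_clean)
print(b_full[0], b_partial[0])
```

Both the coefficient of the variable of interest and the residuals of the two regressions agree up to rounding error, which is the content of the Frisch–Waugh result.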
... twelve observations in the estimation sample, and a second one of log (Q) against the predicted values for the six observations in the prediction sample. Relate these graphs to your conclusions in d.

3.18 (E Section 3.2.5)   XR318MGC
In this exercise we consider yearly data (from 1970 to 1999) related to motor gasoline consumption in the USA. The data are taken from different sources (see the table). Here 'rp' refers to data in the Economic Report of the President (see w3.access.gpo.gov), 'ecocb' to data of the Census Bureau, and 'ecode' to data of the Department of Energy (see www.economagic.com). The price indices are defined so that the average value over the years 1982–4 is equal to 100. We define the variables y = log (SGAS/PGAS), x2 = log (INC/PALL), x3 = log (PGAS/PALL), x4 = log (PPUB/PALL), x5 = log (PNCAR/PALL), and x6 = log (PUCAR/PALL). We are interested in the price elasticity of gasoline consumption, that is, the marginal relative increase in sold quantity due to a marginal relative price increase.

Variable  Definition                                     Units               Source
SGAS      Retail sales gasoline service stations         10^6 dollars        ecocb
PGAS      Motor gasoline retail price, US city average   cts/gallon          ecode
INC       Nominal personal disposable income             10^9 dollars        rp
PALL      Consumer price index                           (1982–4)/3 = 100    rp
PPUB      Consumer price index of public transport       idem                rp
PNCAR     Consumer price index of new cars               idem                rp
PUCAR     Consumer price index of used cars              idem                rp

a. Estimate this price elasticity by regressing log (SGAS) on a constant and log (PGAS). Comment on the outcome, and explain why this outcome is misleading.
b. Estimate the price elasticity now by regressing y on a constant and log (PGAS). Explain the precise relation with the results in a. Why is this outcome still misleading?
c. Now estimate the price elasticity by regressing y on a constant and the variables x2 and x3. Provide a motivation for this choice of explained and explanatory variables and comment on the outcomes.
d. If y is regressed on a constant and the variable x3 then the estimated elasticity is more negative than in c. Check this result and give an explanation in terms of partial regressions. Use the fact that, in the period 1970–99, real income has mostly gone up and the price of gasoline (as compared with other prices) has mostly gone down.
e. Perform the partial regressions needed to remove the effect of income (x2) on the consumption (y) and on the relative price (x3). Make a partial regression scatter plot of the 'cleaned' variables and check the validity of the result of Frisch–Waugh in this case.
f. Estimate the price elasticity by regressing y on a constant and the variables x2, x3, x4, x5, and x6. Comment on the outcomes and compare them with the ones in c.
g. Transform the four price indices (PALL, PPUB, PNCAR, and PUCAR) so that they all have the value 100 in 1970. Perform the regression of f for the transformed data (taking logarithms again) and compare the outcomes with the ones in f. Which regression statistics remain the same, and which ones have changed? Explain these results.

3.19 (E Sections 3.4.1, 3.4.3)   XR318MGC
We consider the same data on motor gasoline consumption as in Exercise 3.18 and we use the same notation as introduced there. For all tests below, compute sums of squared residuals of appropriate regressions, determine the degrees of freedom of the test statistic, and use a significance level of 5%.
a. Regress y on a constant and the variables x2, x3, x4, x5, and x6. Test for the joint significance of the prices of new and used cars.
b. Regress y on a constant and the four explanatory variables log (PGAS), log (PALL), log (INC), and log (PPUB). Use the results to construct a 95% interval estimate for the price elasticity of gasoline consumption.
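The instruction in Exercise 3.19 to base each test on sums of squared residuals can be sketched as follows for a joint significance test of two regressors. The data are simulated (the gasoline data set is not reproduced here), and the dimensions are chosen only to mimic a sample of 30 yearly observations with a constant and five regressors; all names are assumptions.

```python
import numpy as np

# Simulated data (hypothetical): three maintained regressors and two regressors under test.
rng = np.random.default_rng(3)
n = 30
X_other = rng.normal(size=(n, 3))
X_test = rng.normal(size=(n, 2))
y = 1.0 + X_other @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.5, size=n)

def ssr(X, y):
    """Sum of squared least squares residuals of y on X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

const = np.ones((n, 1))
X_unres = np.hstack([const, X_other, X_test])   # unrestricted model
X_res = np.hstack([const, X_other])             # restricted model: tested coefficients are zero

g = X_test.shape[1]          # number of restrictions
k = X_unres.shape[1]         # number of parameters in the unrestricted model
ssr_u = ssr(X_unres, y)
ssr_r = ssr(X_res, y)

# F-statistic with (g, n - k) degrees of freedom under the null hypothesis.
F = ((ssr_r - ssr_u) / g) / (ssr_u / (n - k))
print(F, (g, n - k))
# Compare F with the 5% critical value of the F(g, n - k) distribution,
# for example scipy.stats.f.ppf(0.95, g, n - k) if SciPy is available.
```

The null hypothesis is rejected at the 5% level when F exceeds that critical value.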
c. Test the null hypothesis that the sum of the coefficients of the four regressors in the model in b (except the constant) is equal to zero. Explain why this restriction is of interest by relating this regression model to the restricted regression in a.
d. Show that the following null hypothesis is not rejected: the sum of the coefficients of log (PALL), log (INC), and log (PPUB) in the model of b is equal to zero. Show that the restricted model has regressors log (PGAS), x2 and x4 (and a constant term), and estimate this model.
e. Use the model of d (with the constant, log (PGAS), x2 and x4 as regressors) to construct a 95% interval estimate for the price elasticity of gasoline consumption. Compare this with the result in b and comment.
f. Search the Internet to find the most recent year with values of the variables SGAS, PGAS, PALL, INC, and PPUB (make sure to use the same units as the ones mentioned in Exercise 3.18). Use the models in b and d to construct 95% forecast intervals of y = log (SGAS/PGAS) for the given most recent values of the regressors.
g. Compare the most recent value of y with the two forecast intervals of part f. For the two models in b and d, perform Chow forecast tests for the most recent value of y.
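A forecast interval and a Chow forecast test for a single new observation, as in parts f and g, can be sketched as follows. The data are simulated and the new observation is hypothetical; the value 2.056 is the tabulated 97.5% quantile of the t-distribution with 26 degrees of freedom, which matches the assumed sample of 30 observations and 4 parameters and would have to be replaced for other sample sizes.

```python
import numpy as np

# Simulated estimation sample (hypothetical): constant plus three regressors.
rng = np.random.default_rng(4)
n, k = 30, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, -0.5, 0.8, 0.3])
y = X @ beta + rng.normal(scale=0.2, size=n)

# One hypothetical new observation to be forecast.
x0 = np.concatenate([[1.0], rng.normal(size=k - 1)])
y0 = x0 @ beta + rng.normal(scale=0.2)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
ssr_est = e @ e
s2 = ssr_est / (n - k)                       # estimated error variance

forecast = x0 @ b
se_forecast = np.sqrt(s2 * (1.0 + x0 @ XtX_inv @ x0))
t_crit = 2.056                               # t quantile (0.975) with n - k = 26 degrees of freedom
interval = (forecast - t_crit * se_forecast, forecast + t_crit * se_forecast)

# Chow forecast test from sums of squared residuals: the regression over all
# n + 1 observations is the restricted model, the estimation sample the unrestricted one.
X_all = np.vstack([X, x0])
y_all = np.append(y, y0)
b_all = np.linalg.lstsq(X_all, y_all, rcond=None)[0]
ssr_all = np.sum((y_all - X_all @ b_all) ** 2)
F_chow = (ssr_all - ssr_est) / (ssr_est / (n - k))   # F(1, n - k) under the null

# Same statistic written as the squared standardized forecast error.
F_direct = (y0 - forecast) ** 2 / se_forecast ** 2
print(interval, F_chow, F_direct)
```

With one new observation the two forms of the test statistic coincide, and the Chow forecast test rejects at the 5% level exactly when the new value of y falls outside the 95% forecast interval.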