Sample Exercises

This chapter discusses multiple linear regression models, that is, models with more than one explanatory variable. It describes how the least squares coefficients measure the direct effect of each explanatory variable on the dependent variable, after controlling for the effects of the other variables. The statistical properties of the least squares estimator are derived under assumptions on the data generating process. The F-test can be used to test the individual and joint significance of explanatory variables. The further reading section lists several econometrics textbooks that cover matrix methods and the topics of this chapter.

Uploaded by

Darío
Copyright
© © All Rights Reserved

Heij / Econometric Methods with Applications in Business and Economics, Final Proof 28.2.2004 3:04pm, page 178

3 Multiple Regression

Summary, further reading, and keywords

SUMMARY
In this chapter we considered regression models with more than one explanatory variable. The least squares coefficients measure the direct effect of an explanatory variable on the dependent variable after neutralizing for the indirect effects that run via the other explanatory variables. These estimated effects therefore depend on the set of all explanatory variables included in the model. We paid particular attention to the question of which explanatory variables should be included in the model. For reasons of efficiency it is better to exclude variables that have only a marginal effect. The statistical properties of least squares were derived under a number of assumptions on the data generating process. Under these assumptions, the F-test can be used to test for the individual and joint significance of explanatory variables.
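
The distinction between direct, indirect, and total effects can be illustrated numerically. The sketch below is not part of the book: it uses simulated data with made-up coefficients and a small hand-written solver (`solve` and `ols` are hypothetical helper names), and it checks that the coefficient of a regressor in the short regression equals its direct effect plus the indirect effect that runs via the omitted variable.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Simulated data (assumed values, for illustration only): x3 depends on x2.
random.seed(0)
n = 200
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [0.8 * a + random.gauss(0, 1) for a in x2]
y = [1 + 2 * a + 3 * b + random.gauss(0, 1) for a, b in zip(x2, x3)]

b_full = ols([[1.0, a, b] for a, b in zip(x2, x3)], y)  # direct effects
b_short = ols([[1.0, a] for a in x2], y)                # total effect of x2
p = ols([[1.0, a] for a in x2], x3)                     # auxiliary regression of x3 on x2

direct = b_full[1]
indirect = p[1] * b_full[2]
print("direct:", direct, "indirect:", indirect, "total:", b_short[1])
```

In this sketch the total effect exceeds the direct effect because x2 and x3 are positively related and x3 has a positive coefficient; with other simulated values the indirect effect could just as well be negative.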

FURTHER READING
In our analysis we made intensive use of matrix methods. We give some references to econometric textbooks that also follow this approach. Chow (1983), Greene (2000), Johnston and DiNardo (1997), Stewart and Gill (1998), Verbeek (2000), and Wooldridge (2002) are on an intermediate level; the other books are on an advanced level. The handbooks edited by Griliches and Intriligator contain overviews of many topics that are treated in this and the next chapters.

Chow, G. G. (1983). Econometrics. Auckland: McGraw-Hill.
Davidson, R., and MacKinnon, J. G. (1993). Estimation and Inference in Econometrics. New York: Oxford University Press.
Gourieroux, C., and Monfort, A. (1995). Statistics and Econometric Models. 2 vols. Cambridge: Cambridge University Press.
Greene, W. H. (2000). Econometric Analysis. New York: Prentice Hall.
Griliches, Z., and Intriligator, M. D. (1983, 1984, 1986). Handbook of Econometrics. 3 vols. Amsterdam: North-Holland.
Johnston, J., and DiNardo, J. (1997). Econometric Methods. New York: McGraw-Hill.
Judge, G. G., Griffiths, W. E., Hill, R. C., Lütkepohl, H., and Lee, T. C. (1985). The Theory and Practice of Econometrics. New York: Wiley.

Malinvaud, E. (1980). Statistical Methods of Econometrics. Amsterdam: North-Holland.
Mittelhammer, R. C., Judge, G. G., and Miller, D. J. (2000). Econometric Foundations. Cambridge: Cambridge University Press.
Ruud, P. A. (2000). An Introduction to Classical Econometric Theory. New York: Oxford University Press.
Stewart, J., and Gill, L. (1998). Econometrics. London: Prentice Hall.
Theil, H. (1971). Principles of Econometrics. New York: Wiley.
Verbeek, M. (2000). A Guide to Modern Econometrics. Chichester: Wiley.
Wooldridge, J. M. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.

KEYWORDS
auxiliary regressions 140
ceteris paribus 140
Chow forecast test 173
coefficient of determination 129
covariance matrix 126
degrees of freedom 129
direct effect 140
F-test 161
Frisch–Waugh 146
indirect effect 140
inefficient 144
joint significance 164
least squares estimator 122
linear restrictions 165
matrix form 120
minimal variance 127
multicollinearity 158
normal equations 121
omitted variables bias 143
partial regression 146
partial regression scatter 148
prediction interval 171
predictive performance 169
projection 123
significance 153
significance of the regression 164
standard error 128
standard error of the regression 128
t-test 153
t-value 153
total effect 140
true model 142
unbiased 126
uncontrolled 140
variance inflation factor 159

Exercises

THEORY QUESTIONS

3.1 (Section 3.1.2)
In this exercise we study the derivatives of (3.6) and prove the result in (3.7). For convenience, we write $X'y = p$ (a $k \times 1$ vector) and $X'X = Q$ (a $k \times k$ matrix), so that we have to minimize the function
$f(b) = y'y - p'b - b'p + b'Qb = y'y - 2b'p + b'Qb$.
Check every detail of the following argument.
a. Let $b$ increase to $b + h$, where we may choose the elements of the $k \times 1$ vector $h$ as small as we like. Then $f(b + h) = f(b) + h'(-2p + (Q' + Q)b) + h'Qh$.
b. This result can be interpreted as a Taylor expansion. If the elements of $h$ are sufficiently small, the last term can be neglected, and the central term is a linear expression containing the $k \times 1$ vector of first order derivatives $\frac{\partial f}{\partial b} = -2p + (Q' + Q)b$. There are $k$ first order derivatives and we follow the convention to arrange them in a column vector.
c. If we apply this to (3.6), this shows that $\frac{\partial S}{\partial b} = -2X'y + 2X'Xb$.

3.2 (Section 3.1.2)
In this exercise we prove the result in (3.10). The vector of first order derivatives in (3.7) contains one term that depends on $b$. For convenience we write it as $Qb$ and we partition the $k \times k$ matrix $Q = 2X'X$ into its columns as $Q = (q_1 \; q_2 \; \ldots \; q_k)$. Verify each step in the following argument.
a. $Qb$ can be written as $Qb = q_1 b_1 + q_2 b_2 + \ldots + q_k b_k$.
b. The derivatives of the elements of $Qb$ with respect to the scalar $b_i$ can be written as a column $q_i$. To write all derivatives for $i = 1, \ldots, k$ in one formula we follow the convention to write them as a 'row of columns', that is, we group them into a matrix, so that $\frac{\partial Qb}{\partial b'} = Q$ (note the prime in the left-hand denominator; this indicates that the separate derivatives are arranged as a row).
c. With the same conventions we get $\frac{\partial^2 S}{\partial b \, \partial b'} = Q$ for the Hessian.
d. Let $X$ be an $n \times k$ matrix with rank $k$; then prove that the $k \times k$ matrix $X'X$ is positive definite.

3.3 (Section 3.1.2)
The following steps show that the least squares estimator $b = (X'X)^{-1}X'y$ minimizes (3.6) without using the first and second order derivatives. In this exercise $b_*$ denotes any $k \times 1$ vector.
a. Let $b_* = (X'X)^{-1}X'y + d$; then show that $y - Xb_* = e - Xd$, where $e$ is a vector of constants that does not depend on the choice of $d$.
b. Show that $S(b_*) = e'e + (Xd)'(Xd)$ and that the minimum of this expression is attained if $Xd = 0$.
c. Derive the condition for uniqueness of this minimum and show that the minimum is then given by $d = 0$.

3.4 (Section 3.1.4)
a. In the model $y = X\beta + \varepsilon$, the normal equations are given by $X'Xb = X'y$, the least squares estimates by $b = (X'X)^{-1}X'y$, and the variance by $\mathrm{var}(b) = \sigma^2 (X'X)^{-1}$. Work these three formulas out for the special case of the simple regression model $y_i = \alpha + \beta x_i + \varepsilon_i$ and prove that these results are respectively equal to the normal equations, the estimates $a$ and $b$, and the variances of $a$ and $b$ obtained in Sections 2.1.2 and 2.2.4.
b. Suppose that the $k$ random variables $y, x_2, x_3, \ldots, x_k$ are jointly normally distributed with mean $\mu$ and (non-singular) covariance matrix $\Sigma$. Let the observations be obtained by a random sample of size $n$ from this distribution $N(\mu, \Sigma)$. Define the random variable $y_c = y \mid \{x_2, \ldots, x_k\}$, that is, $y$ conditional on the values of $x_2, \ldots, x_k$. Show that the $n$ observations $y_c$ satisfy Assumptions 1–7 of Section 3.1.4.

3.5 (Section 3.1.5)
In some software packages the user is asked to specify the variable to be explained and the explanatory variables, while an intercept is added automatically. Now suppose that you wish to compute the least squares estimates $b$ in a regression of the type $y = X\beta + \varepsilon$ where the $n \times k$ matrix $X$ does not contain an 'intercept column' consisting of unit elements. Define
$y_* = \begin{pmatrix} y \\ -y \end{pmatrix}, \quad X_* = \begin{pmatrix} \iota & X \\ \iota & -X \end{pmatrix},$
where the $\iota$ columns, consisting of unit elements only, are added by the computer package and the user specifies the other data.
a. Prove that the least squares estimator obtained by regressing $y_*$ on $X_*$ gives the desired result.
b. Prove that the standard errors of the regression coefficients of this regression must be corrected by a factor $\sqrt{(2n - k - 1)/(n - k)}$.

3.6 (Section 3.1.6)
Suppose we wish to explain a variable $y$ and that the number of possible explanatory variables is so large that it is tempting to take a subset. In such a situation some researchers apply the so-called Theil criterion and maximize the adjusted $R^2$ defined by $\bar{R}^2 = 1 - \frac{n-1}{n-k}(1 - R^2)$, where $n$ is the number of observations and $k$ the number of explanatory variables.
a. Prove that $R^2$ never decreases by including an additional regressor in the model.
b. Prove that the Theil criterion is equivalent with minimizing $s$, the standard error of regression.
c. Prove that the Theil criterion implies that an explanatory variable $x_j$ will be maintained if and only if the F-test statistic for the null hypothesis $\beta_j = 0$ is larger than one.
d. Show that the size (significance level) of such a test is larger than 0.05.

3.7 (Sections 3.1.5, 3.1.6, 3.2.4, 3.4.1, 3.4.3)
Some of the following questions and arguments were mentioned in this chapter.
a. Prove the result stated in Section 3.1.5 that $h_i > 0$ if the $n \times k$ matrix $X$ contains a column of unit elements and $\mathrm{rank}(X) = k$.
b. Prove that $R^2$ (in the model with constant term) is the square of the sample correlation coefficient between $y$ and $\hat{y} = Xb$.
c. If a regression model contains no constant term so that the matrix $X$ contains no column of ones, then show that $1 - (SSR/SST)$ (and hence $R^2$ when it is computed in this way) may be negative.
d. Let $y = X_1\beta_1 + X_2\beta_2 + \varepsilon$ and let $\beta_1$ be estimated by regressing $y$ on $X_1$ alone (the 'omitted variables' case of Section 3.2.3). Show that $\mathrm{var}(b_R) \le \mathrm{var}(b_1)$ in the sense that $\mathrm{var}(b_1) - \mathrm{var}(b_R)$ is a positive semidefinite matrix. When are the two variances equal?
e. Show that the F-test for a single restriction $\beta_j = 0$ is equal to the square of the t-value of $b_j$. Show also that both tests lead to the same conclusion, irrespective of the chosen significance level.
f. Consider the expression (3.49) of the F-test in terms of the random variables $b_2'X_2'M_1X_2b_2$ and $e'e$. Prove that, under the null hypothesis that $\beta_2 = 0$, these two random variables are independently distributed as $\chi^2(g)$ and $\chi^2(n - k)$ respectively by showing that (i) they can be expressed as $\varepsilon'Q_1\varepsilon$ and $\varepsilon'Q_2\varepsilon$, with (ii) $Q_1 = M_1 - M$ and $Q_2 = M$, where $M$ is the M-matrix corresponding to $X$ and $M_1$ is the M-matrix corresponding to $X_1$, so that (iii) $Q_1$ is idempotent with rank $g$ and $Q_2$ is idempotent with rank $(n - k)$, and (iv) $Q_1Q_2 = 0$.
g. In Section 3.4 we considered the prediction of $y_2$ for given values of $X_2$ under the assumptions that $y_1 = X_1\beta + \varepsilon_1$ and $y_2 = X_2\beta + \varepsilon_2$ where $E[\varepsilon_1] = 0$, $E[\varepsilon_2] = 0$, $E[\varepsilon_1\varepsilon_1'] = \sigma^2 I$, $E[\varepsilon_2\varepsilon_2'] = \sigma^2 I$, and $E[\varepsilon_1\varepsilon_2'] = 0$. Prove that under Assumptions 1–6 the predictor $X_2b$ with $b = (X_1'X_1)^{-1}X_1'y_1$ is best linear unbiased. That is, among all predictors of the form $\hat{y}_2 = Ly_1$ (with $L$ a given matrix) with the property that $E[y_2 - \hat{y}_2] = 0$, it minimizes the variance of the forecast error $y_2 - \hat{y}_2$.
h. Using the notation introduced in Section 3.4.3, show that a $(1 - \alpha)$ prediction interval for $y_{2j}$ is given by $X_{2j}'b \pm c\,s\sqrt{d_{jj}}$.

3.8 (Section 3.4.1)
Consider the model $y = X\beta + \varepsilon$ with the null hypothesis that $R\beta = r$ where $R$ is a given $g \times k$ matrix of rank $g$ and $r$ is a given $g \times 1$ vector. Use the following steps to show that the expression (3.54) for the F-test can be written in terms of residual sums of squares as in (3.50).
a. The restricted least squares estimator $b_R$ minimizes the sum of squares $(y - X\hat{\beta})'(y - X\hat{\beta})$ under the restriction that $R\hat{\beta} = r$. Show that

$b_R = b - A(Rb - r)$, where $b$ is the unrestricted least squares estimator and $A = (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}$.
b. Let $e = y - Xb$ and $e_R = y - Xb_R$; then show that
$e_R'e_R = e'e + (Rb - r)'[R(X'X)^{-1}R']^{-1}(Rb - r)$.
c. Show that the F-test in (3.54) can be written as in (3.50).
d. In Section 3.4.2 we tested the null hypothesis that $\beta_4 + \beta_5 = 0$ in the model with $k = 5$ explanatory variables. Describe a method to determine the restricted sum of squared residuals $e_R'e_R$ in this case.

3.9 (Section 3.2.5)
This exercise serves to clarify a remark on standard errors in partial regressions that was made in Example 3.3 (p. 150). We use the notation of Section 3.2.5, in particular the estimated regressions
(1) $y = X_1b_1 + X_2b_2 + e$, and
(2) $M_2y = (M_2X_1)b_* + e_*$
in the result of Frisch–Waugh. Here $X_1$ and $M_2X_1$ are $n \times (k - g)$ matrices and $X_2$ is an $n \times g$ matrix.
a. Prove that $\mathrm{var}(b_1) = \mathrm{var}(b_*) = \sigma^2(X_1'M_2X_1)^{-1}$.
b. Derive expressions for the estimated variance $s^2$ in regression (1) and $s_*^2$ in regression (2), both in terms of $e'e$.
c. Prove that the standard errors of the coefficients $b_1$ in (1) can be obtained by multiplying the standard errors of the coefficients $b_*$ in (2) by the factor $\sqrt{(n - k + g)/(n - k)}$.
d. Check this result by considering the standard errors of the variable education in the second regression in Exhibit 3.7 and the last regression in Exhibit 3.10. (These values are rounded; a more precise result is obtained when higher precision values from a regression package are used.)
e. Derive the relation between the t-values of (1) and (2).

3.10 (Section 3.4.1)
In Section 1.4.2 we mentioned the situation of two independent random samples, one of size $n_1$ from $N(\mu_1, \sigma^2)$ and a second one of size $n_2$ from $N(\mu_2, \sigma^2)$. We want to test the null hypothesis $H_0: \mu_1 = \mu_2$ against the alternative $H_1: \mu_1 \ne \mu_2$. The pooled t-test is based on the difference between the sample means $\bar{y}_1$ and $\bar{y}_2$ of the two sub-samples. Let $e_1'e_1 = \sum_{i=1}^{n_1}(y_i - \bar{y}_1)^2$ and $e_2'e_2 = \sum_{i=n_1+1}^{n_1+n_2}(y_i - \bar{y}_2)^2$ be the total sum of squares in the first and second sub-sample respectively; then the pooled estimator of the variance is defined by $s_p^2 = (e_1'e_1 + e_2'e_2)/(n_1 + n_2 - 2)$ and the pooled t-test is defined by
$t_p = \sqrt{\frac{n_1 n_2}{n_1 + n_2}} \, \frac{\bar{y}_2 - \bar{y}_1}{s_p}$.
a. Formulate the testing problem of $\mu_1 = \mu_2$ against $\mu_1 \ne \mu_2$ in terms of a parameter restriction in a multivariate regression model (with parameters $\mu_1$ and $\mu_2$).
b. Derive the F-test for $H_0: \mu_1 = \mu_2$ in the form (3.50).
c. Prove that $t_p^2$ is equal to the F-test in b and that $t_p$ follows the $t(n_1 + n_2 - 2)$ distribution if the null hypothesis of equal means holds true.
d. In Example 1.12 (p. 62) we considered the FGPA scores of $n_1 = 373$ male students and $n_2 = 236$ female students. Use the results reported in Exhibit 1.6 to perform a test of the null hypothesis of equal means for male and female students against the alternative that female students have on average higher scores than male students.

3.11 (Section 3.4.3)
We consider the Chow forecast test (3.58) for the case $g = 1$ of a single new observation $(x_{n+1}, y_{n+1})$. The $n$ preceding observations are used in the model $y_1 = X_1\beta + \varepsilon_1$ with least squares estimator $b$. We assume that Assumptions 1–4 and 7 are satisfied for the full sample $i = 1, \ldots, n + 1$, and Assumptions 5 and 6 for the estimation sample $i = 1, \ldots, n$, whereas for the $(n + 1)$st observation we write
$y_{n+1} = x_{n+1}'\beta + \gamma + \varepsilon_{n+1}$
with $\gamma$ an unknown scalar parameter. We consider the null hypothesis that $\gamma = 0$ against the alternative that $\gamma \ne 0$.
a. Prove that the least squares estimators of $\beta$ and $\gamma$ over the full sample $i = 1, \ldots, n + 1$, are given by $b$ and $c = y_{n+1} - x_{n+1}'b$. Show that the residual for the $(n + 1)$st observation is equal to zero. Provide an intuitive explanation for this result.

b. Derive the residual sum of squares over the full sample $i = 1, \ldots, n + 1$ under the alternative hypothesis.
c. Derive the F-test for the hypothesis that $\gamma = 0$.
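
The claim in part a of Exercise 3.11 can be checked numerically before it is proved. The following sketch assumes simulated data with one regressor and a constant (all numbers and the helper names `solve` and `ols` are illustrative assumptions): it adds a dummy variable for the last observation and compares the full-sample estimates with those of the estimation sample.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Simulated sample of n + 1 observations; the last one gets an extra shift.
random.seed(2)
n = 20
x = [random.gauss(5, 2) for _ in range(n + 1)]
y = [1 + 0.5 * v + random.gauss(0, 1) for v in x]
y[n] += 3.0  # plays the role of gamma in this made-up example

# Estimation sample: first n observations, regressors (1, x).
b = ols([[1.0, v] for v in x[:n]], y[:n])

# Full sample with a dummy that equals 1 only for observation n + 1.
X_full = [[1.0, v, 1.0 if i == n else 0.0] for i, v in enumerate(x)]
estimates = ols(X_full, y)
b_full, c = estimates[:2], estimates[2]

prediction_error = y[n] - (b[0] + b[1] * x[n])
residual_last = y[n] - (b_full[0] + b_full[1] * x[n] + c)
print("b:", b, "b_full:", b_full)
print("c:", c, "prediction error:", prediction_error)
print("residual of last observation:", residual_last)
```

The dummy coefficient absorbs the whole prediction error of the new observation, so that observation cannot influence the estimate of $\beta$; this is the intuition that the exercise asks for.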

EMPIRICAL AND SIMULATION QUESTIONS


3.12 (Section 3.3.3)
In this simulation exercise we consider five variables ($y$, $z$, $x_1$, $x_2$, and $x_3$) that are generated as follows. Let $n = 100$ and let $\varepsilon_i, \omega_i, \eta_i \sim NID(0, 1)$ be independent random samples from the standard normal distribution, $i = 1, \ldots, n$. Define
$x_{1i} = 5 + \omega_i + 0.3\eta_i$
$x_{2i} = 10 + \omega_i$
$x_{3i} = 5 + \eta_i$
$y_i = x_{1i} + x_{2i} + \varepsilon_i$
$z_i = x_{2i} + x_{3i} + \varepsilon_i$
a. What is the correlation between $x_1$ and $x_3$? And what is the correlation between $x_2$ and $x_3$?
b. Perform the regression of $y$ on a constant, $x_1$ and $x_2$. Compute the regression coefficients and their t-values. Comment on the outcomes.
c. Answer the questions of b for the regression of $z$ on a constant, $x_2$ and $x_3$.
d. Perform also regressions of $y$ on a constant and $x_1$, and of $z$ on a constant and $x_3$. Discuss the differences that arise between these two cases.

3.13 (Section 3.4.1) XM301BWA
In Section 3.4.2 we tested four different hypotheses, that is, (i) $\beta_5 = 0$, (ii) $\beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$, (iii) $\beta_4 = \beta_5 = 0$, and (iv) $\beta_4 + \beta_5 = 0$. As data set we considered the data on all 474 employees (see Exhibit 3.16). Use a significance level of 5 per cent in all tests below.
a. Test these four hypotheses also for the subset of employees working in management (job category 3), using the results in the last two columns in Exhibit 3.16.
b. Now consider the hypothesis (iii) that gender and minority have no effect on salary for employees in management. We mention that of the eighty-four employees in management, seventy are male non-minority, ten are female non-minority, four are male minority, and no one is female minority. Discuss the relevance of this information with respect to the power of the test for hypothesis (iii).
c. Finally consider the subset of employees with custodial jobs (job category 2, where all employees are male). Use the results in Exhibit 3.16 to test the hypothesis that $\beta_5 = 0$. Test also the hypothesis that $\beta_2 = \beta_3 = \beta_5 = 0$.

3.14 (Sections 3.2.2, 3.3.3) XR314STU
In this exercise we consider the data set on student learning of Example 1.1 (p. 12) for 609 students. The dependent variable ($y$) is the FGPA score of a student, and the explanatory variables are $x_1$ (constant term), $x_2$ (SATM score), $x_3$ (SATV score), and $x_4$ (FEM, with $x_4 = 1$ for females and $x_4 = 0$ for males).
a. Compute the $4 \times 4$ correlation matrix for the variables ($y$, $x_2$, $x_3$, $x_4$).
b. Estimate a model for FGPA in terms of SATV by regressing $y$ on $x_1$ and $x_3$. Estimate also a model by regressing $y$ on $x_1$, $x_2$, $x_3$, and $x_4$.
c. Comment on the differences between the two models in b for the effect of SATV on FGPA.
d. Investigate the presence of collinearity between the explanatory variables by computing $R_j^2$ in (3.47) and the square root of the variance inflation factors, $1/\sqrt{1 - R_j^2}$, for $j = 2, 3, 4$.

3.15 (Section 3.4.1) XR315PMI
In this exercise we consider production data for the year 1994 of $n = 26$ US firms in the sector of primary metal industries (SIC33). The data are taken from E. J. Bartelsman and W. Gray, National Bureau of Economic Research, NBER Technical Working Paper 205, 1996. For each firm, values are given of production ($Y$, value added in millions of dollars), labour ($L$, total payroll in millions of dollars), and capital ($K$, real capital stock in millions of 1987 dollars). A log-linear production function is estimated with the following result (standard errors are in parentheses).

$\log(Y) = \underset{(0.415)}{0.701} + \underset{(0.091)}{0.756}\log(L) + \underset{(0.110)}{0.242}\log(K) + e$

The model is also estimated under two alternative restrictions, the first with equal coefficients for $\log(L)$ and $\log(K)$ and the second with the sum of the coefficients of $\log(L)$ and $\log(K)$ equal to one ('constant returns to scale'). For this purpose the following two regressions are performed.

$\log(Y) = \underset{(0.358)}{0.010} + \underset{(0.026)}{0.524}(\log(L) + \log(K)) + e_1$

$\log(Y) - \log(K) = \underset{(0.132)}{0.686} + \underset{(0.089)}{0.756}(\log(L) - \log(K)) + e_2$

The residual sums of squares are respectively $e'e = 1.825544$, $e_1'e_1 = 2.371989$, and $e_2'e_2 = 1.825652$, and the $R^2$ are respectively equal to $R^2 = 0.956888$, $R_1^2 = 0.943984$, and $R_2^2 = 0.751397$. In the following tests use a significance level of 5%.
a. Test for the individual significance of $\log(L)$ and $\log(K)$ in the first regression. Test also for the joint significance of these two variables.
b. Test the restriction of equal coefficients by means of an F-test based on the residual sums of squares.
c. Test this restriction also by means of the $R^2$.
d. Test the restriction of constant returns to scale also in two ways, one with the F-test based on the residual sums of squares and the other with the F-test based on the $R^2$.
e. Explain why the outcomes of b and c are the same but the two outcomes in d are different. Which of the two tests in d is the correct one?

3.16 (Section 3.2.5) XM301BWA
Consider the data on bank wages of the example in Section 3.1.7. To test for the possible effect of gender on wage, someone proposes to estimate the model $y = \beta_1 + \beta_4x_4 + \varepsilon$, where $y$ is the yearly wage (in logarithms) and $x_4$ is the variable gender (with $x_4 = 0$ for females and $x_4 = 1$ for males). As an alternative we consider the model with $x_2$ (education) as an additional explanatory variable.
a. Use the data to perform the two regressions.
b. Comment on the differences between the conclusions that could be drawn (without further thinking) from each of these two regressions.
c. Draw a partial regression scatter plot (with regression line) for salary (in logarithms) against gender after correction for the variable education (see Case 3 in Section 3.2.5). Draw also a scatter plot (with regression line) for the original (uncorrected) data on salary (in logarithms) and gender. Discuss how these plots help in clarifying the differences in b.
d. Check the results on regression coefficients and residuals in the result of Frisch–Waugh (3.39) for these data, where $X_1$ refers to the variable $x_4$, and $X_2$ refers to the constant term and the variable $x_2$.

3.17 (Section 3.4.3) XR317COF
In this exercise we consider data on weekly coffee sales of a certain brand of coffee. These data come from the same marketing experiment as discussed in Example 2.3 (p. 78), but for another brand of coffee and for another selection of weeks. The data provide for $n = 18$ weeks the values of the coffee sales in that week ($Q$, in units), the applied deal rate ($D = 1$ for the usual price, $D = 1.05$ in weeks with 5% price reduction, and $D = 1.15$ in weeks with 15% price reduction), and advertisement ($A = 1$ in weeks with advertisement, $A = 0$ otherwise). We postulate the model
$\log(Q) = \beta_1 + \beta_2\log(D) + \beta_3A + \varepsilon$.
For all tests below use a significance level of 5%.
a. Test whether advertisement has a significant effect on sales, both by a t-test and by an F-test.
b. Test the null hypothesis that $\beta_2 = 1$ against the alternative that $\beta_2 > 1$.
c. Construct 95% interval estimates for the parameters $\beta_2$ and $\beta_3$.
d. Estimate the model using only observations in weeks without advertisement. Test whether this model produces acceptable forecasts for the sales (in logarithms) in the weeks with advertisement. Note: take special care of the fact that the estimated model cannot predict the effect of advertisement.
e. Make two scatter plots, one of the actual values of $\log(Q)$ against the fitted values of d for the

twelve observations in the estimation sample, and a second one of $\log(Q)$ against the predicted values for the six observations in the prediction sample. Relate these graphs to your conclusions in d.

3.18 (Section 3.2.5) XR318MGC
In this exercise we consider yearly data (from 1970 to 1999) related to motor gasoline consumption in the USA. The data are taken from different sources (see the table). Here 'rp' refers to data in the Economic Report of the President (see w3.access.gpo.gov), 'ecocb' to data of the Census Bureau, and 'ecode' to data of the Department of Energy (see www.economagic.com). The price indices are defined so that the average value over the years 1982–4 is equal to 100. We define the variables $y = \log(SGAS/PGAS)$, $x_2 = \log(INC/PALL)$, $x_3 = \log(PGAS/PALL)$, $x_4 = \log(PPUB/PALL)$, $x_5 = \log(PNCAR/PALL)$, and $x_6 = \log(PUCAR/PALL)$. We are interested in the price elasticity of gasoline consumption, that is, the marginal relative increase in sold quantity due to a marginal relative price increase.

Variable  Definition                                     Units               Source
SGAS      Retail sales gasoline service stations         10^6 dollars        ecocb
PGAS      Motor gasoline retail price, US city average   cts/gallon          ecode
INC       Nominal personal disposable income             10^9 dollars        rp
PALL      Consumer price index                           (1982–4)/3 = 100    rp
PPUB      Consumer price index of public transport       idem                rp
PNCAR     Consumer price index of new cars               idem                rp
PUCAR     Consumer price index of used cars              idem                rp

a. Estimate this price elasticity by regressing $\log(SGAS)$ on a constant and $\log(PGAS)$. Comment on the outcome, and explain why this outcome is misleading.
b. Estimate the price elasticity now by regressing $y$ on a constant and $\log(PGAS)$. Explain the precise relation with the results in a. Why is this outcome still misleading?
c. Now estimate the price elasticity by regressing $y$ on a constant and the variables $x_2$ and $x_3$. Provide a motivation for this choice of explained and explanatory variables and comment on the outcomes.
d. If $y$ is regressed on a constant and the variable $x_3$ then the estimated elasticity is more negative than in c. Check this result and give an explanation in terms of partial regressions. Use the fact that, in the period 1970–99, real income has mostly gone up and the price of gasoline (as compared with other prices) has mostly gone down.
e. Perform the partial regressions needed to remove the effect of income ($x_2$) on the consumption ($y$) and on the relative price ($x_3$). Make a partial regression scatter plot of the 'cleaned' variables and check the validity of the result of Frisch–Waugh in this case.
f. Estimate the price elasticity by regressing $y$ on a constant and the variables $x_2$, $x_3$, $x_4$, $x_5$, and $x_6$. Comment on the outcomes and compare them with the ones in c.
g. Transform the four price indices (PALL, PPUB, PNCAR, and PUCAR) so that they all have the value 100 in 1970. Perform the regression of f for the transformed data (taking logarithms again) and compare the outcomes with the ones in f. Which regression statistics remain the same, and which ones have changed? Explain these results.

3.19 (Sections 3.4.1, 3.4.3) XR318MGC
We consider the same data on motor gasoline consumption as in Exercise 3.18 and we use the same notation as introduced there. For all tests below, compute sums of squared residuals of appropriate regressions, determine the degrees of freedom of the test statistic, and use a significance level of 5%.
a. Regress $y$ on a constant and the variables $x_2$, $x_3$, $x_4$, $x_5$, and $x_6$. Test for the joint significance of the prices of new and used cars.
b. Regress $y$ on a constant and the four explanatory variables $\log(PGAS)$, $\log(PALL)$, $\log(INC)$, and $\log(PPUB)$. Use the results to construct a 95% interval estimate for the price elasticity of gasoline consumption.

c. Test the null hypothesis that the sum of the coefficients of the four regressors in the model in b (except the constant) is equal to zero. Explain why this restriction is of interest by relating this regression model to the restricted regression in a.
d. Show that the following null hypothesis is not rejected: the sum of the coefficients of $\log(PALL)$, $\log(INC)$, and $\log(PPUB)$ in the model of b is equal to zero. Show that the restricted model has regressors $\log(PGAS)$, $x_2$ and $x_4$ (and a constant term), and estimate this model.
e. Use the model of d (with the constant, $\log(PGAS)$, $x_2$ and $x_4$ as regressors) to construct a 95% interval estimate for the price elasticity of gasoline consumption. Compare this with the result in b and comment.
f. Search the Internet to find the most recent year with values of the variables SGAS, PGAS, PALL, INC, and PPUB (make sure to use the same units as the ones mentioned in Exercise 3.18). Use the models in b and d to construct 95% forecast intervals of $y = \log(SGAS/PGAS)$ for the given most recent values of the regressors.
g. Compare the most recent value of $y$ with the two forecast intervals of part f. For the two models in b and d, perform Chow forecast tests for the most recent value of $y$.
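
The tests in Exercise 3.19 all follow the same recipe: estimate the unrestricted and the restricted model, collect both sums of squared residuals, and form the F-statistic with $g$ and $n - k$ degrees of freedom. The sketch below shows this recipe on simulated data, not on the gasoline data; the data, the coefficient values, and the helper names `solve` and `ols_ssr` are assumptions made for illustration. It also computes the $R^2$ version of the statistic, which coincides with the residual version here because both regressions have the same dependent variable and a constant term.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols_ssr(X, y):
    """Return the least squares coefficients and the sum of squared residuals."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    b = solve(XtX, Xty)
    resid = [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]
    return b, sum(e * e for e in resid)

# Simulated data: x3 and x4 are irrelevant by construction.
random.seed(1)
n = 30
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
x4 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 0.5 * a + random.gauss(0, 1) for a in x2]

# Unrestricted model (k = 4) and restricted model with beta3 = beta4 = 0 (g = 2).
k, g = 4, 2
_, ssr_u = ols_ssr([[1.0, a, b, c] for a, b, c in zip(x2, x3, x4)], y)
_, ssr_r = ols_ssr([[1.0, a] for a in x2], y)

# F-statistic in terms of residual sums of squares.
F = ((ssr_r - ssr_u) / g) / (ssr_u / (n - k))

# Same statistic in terms of R^2 (valid here: same dependent variable, constant included).
ybar = sum(y) / n
sst = sum((v - ybar) ** 2 for v in y)
r2_u, r2_r = 1 - ssr_u / sst, 1 - ssr_r / sst
F_r2 = ((r2_u - r2_r) / g) / ((1 - r2_u) / (n - k))

print("F (SSR form):", F, "F (R^2 form):", F_r2, "df:", (g, n - k))
```

The resulting value is then compared with the 5% critical value of the $F(g, n - k)$ distribution from a table or a statistics package; that lookup is left out of the sketch.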
