Solutions Chapter 4 PDF
Exercise Solutions
Chapter 4, Exercise Solutions, Principles of Econometrics, 3e 61
EXERCISE 4.1
(a) $$R^2 = 1 - \frac{\sum \hat{e}_i^2}{\sum (y_i - \bar{y})^2} = 1 - \frac{182.85}{631.63} = 0.71051$$
(b) To calculate $R^2$ we need $\sum (y_i - \bar{y})^2$, the total sum of squares SST.
Therefore,
$$R^2 = \frac{SSR}{SST} = \frac{666.72}{788.5155} = 0.8455$$
(c) From
$$R^2 = 1 - \frac{\sum \hat{e}_i^2}{SST} = 1 - \frac{(N-K)\,\hat{\sigma}^2}{SST}$$
we have,
$$\hat{\sigma}^2 = \frac{SST\,(1 - R^2)}{N - K} = \frac{552.36 \times (1 - 0.7911)}{20 - 2} = 6.4104$$
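The three calculations in Exercise 4.1 can be checked with a minimal Python sketch. The function names are illustrative and not part of the text; the inputs are the figures quoted in parts (a) to (c).

```python
def r2_from_sse(sse, sst):
    """R^2 = 1 - SSE/SST, as in part (a)."""
    return 1.0 - sse / sst


def r2_from_ssr(ssr, sst):
    """R^2 = SSR/SST, as in part (b)."""
    return ssr / sst


def sigma2_hat(sst, r2, n, k):
    """Error variance estimate SST(1 - R^2)/(N - K), as in part (c)."""
    return sst * (1.0 - r2) / (n - k)


r2_a = r2_from_sse(182.85, 631.63)
r2_b = r2_from_ssr(666.72, 788.5155)
s2_c = sigma2_hat(552.36, 0.7911, 20, 2)

print(round(r2_a, 5), round(r2_b, 4), round(s2_c, 4))
```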
EXERCISE 4.2
(a) $\hat{y} = 5.83 + 8.69\,x^*$, where $x^* = x/10$
    (se)  (1.23)  (1.17)
(b) $\hat{y}^* = 0.583 + 0.0869\,x$, where $\hat{y}^* = \hat{y}/10$
    (se)  (0.123)  (0.0117)
(c) $\hat{y}^* = 0.583 + 0.869\,x^*$, where $\hat{y}^* = \hat{y}/10$ and $x^* = x/10$
    (se)  (0.123)  (0.117)
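The rescaling rules behind parts (a) to (c) can be sketched in a few lines of Python. The unscaled estimates used here (intercept 5.83, slope 0.869, standard errors 1.23 and 0.117) are an assumption inferred from the three reported equations, and `rescale` is a hypothetical helper name.

```python
def rescale(b1, b2, se1, se2, x_div=1.0, y_div=1.0):
    """Estimates obtained after dividing x by x_div and y by y_div.

    Dividing x by c multiplies the slope (and its se) by c.
    Dividing y by c divides every coefficient (and se) by c.
    """
    return (b1 / y_div,
            b2 * x_div / y_div,
            se1 / y_div,
            se2 * x_div / y_div)


# Assumed unscaled estimates, inferred from parts (a)-(c).
base = (5.83, 0.869, 1.23, 0.117)

part_a = rescale(*base, x_div=10)
part_b = rescale(*base, y_div=10)
part_c = rescale(*base, x_div=10, y_div=10)

for label, result in (("a", part_a), ("b", part_b), ("c", part_c)):
    print(label, [round(v, 4) for v in result])
```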
EXERCISE 4.3
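(a) $\hat{y}_0 = b_1 + b_2 x_0 = 1 + 1 \times 5 = 6$
(b) $$\widehat{\operatorname{var}}(f) = \hat{\sigma}^2\left[1 + \frac{1}{N} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right] = 5.3333\left[1 + \frac{1}{5} + \frac{(5 - 1)^2}{10}\right] = 14.9332$$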
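With $x_0 = \bar{x} = 1$,
$$\widehat{\operatorname{var}}(f) = \hat{\sigma}^2\left[1 + \frac{1}{N} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right] = 5.3333\left[1 + \frac{1}{5} + \frac{(1 - 1)^2}{10}\right] = 6.400$$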
The width in part (e) is smaller than the width in part (c), as expected. Predictions are
more precise when made for x values close to the mean.
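The effect of distance from the mean on forecast precision can be illustrated with a short Python sketch of the forecast variance formula. The function name is illustrative; the inputs are the values used in Exercise 4.3 (estimated error variance 5.3333, N = 5, sample mean of x equal to 1, and sum of squared deviations 10).

```python
import math


def forecast_variance(sigma2_hat, n, x0, x_bar, sxx):
    """sigma2_hat * [1 + 1/N + (x0 - x_bar)^2 / sum((xi - x_bar)^2)]."""
    return sigma2_hat * (1.0 + 1.0 / n + (x0 - x_bar) ** 2 / sxx)


var_far = forecast_variance(5.3333, 5, x0=5, x_bar=1, sxx=10)
var_at_mean = forecast_variance(5.3333, 5, x0=1, x_bar=1, sxx=10)

# A smaller variance gives a smaller standard error and a narrower interval.
print(round(var_far, 4), round(math.sqrt(var_far), 3))
print(round(var_at_mean, 4), round(math.sqrt(var_at_mean), 3))
```

The second call sets the squared-distance term to zero, which is why the interval is narrowest at the sample mean.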
EXERCISE 4.4
(a) When estimating E ( y0 ), we are estimating the average value of y for all observational
units with an x-value of x0 . When predicting y0 , we are predicting the value of y for one
observational unit with an x-value of x0 . The first exercise does not involve the random
error e0 ; the second does.
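$$\begin{aligned}
&= \frac{\sigma^2\left(\sum (x_i - \bar{x})^2 + N\bar{x}^2\right)}{N \sum (x_i - \bar{x})^2} + \frac{\sigma^2\left(x_0^2 - 2 x_0 \bar{x}\right)}{\sum (x_i - \bar{x})^2} \\
&= \sigma^2\left[\frac{1}{N} + \frac{x_0^2 - 2 x_0 \bar{x} + \bar{x}^2}{\sum (x_i - \bar{x})^2}\right] = \sigma^2\left[\frac{1}{N} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right]
\end{aligned}$$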
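The algebra above can be checked numerically. The sketch below assumes the expression being simplified is the variance of $b_1 + b_2 x_0$, built from the standard least squares results for var(b1), var(b2) and cov(b1, b2); the x values, error variance and $x_0$ are arbitrary illustrative numbers, not data from the text.

```python
def var_from_parts(sigma2, xs, x0):
    """var(b1) + x0^2 var(b2) + 2 x0 cov(b1, b2) in simple regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    var_b1 = sigma2 * sum(x * x for x in xs) / (n * sxx)
    var_b2 = sigma2 / sxx
    cov_b1_b2 = -sigma2 * x_bar / sxx
    return var_b1 + x0 ** 2 * var_b2 + 2 * x0 * cov_b1_b2


def var_closed_form(sigma2, xs, x0):
    """sigma^2 * [1/N + (x0 - x_bar)^2 / sum((xi - x_bar)^2)]."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sigma2 * (1.0 / n + (x0 - x_bar) ** 2 / sxx)


xs = [1.0, 2.0, 4.0, 7.0]  # illustrative regressor values
lhs = var_from_parts(2.5, xs, 3.0)
rhs = var_closed_form(2.5, xs, 3.0)
print(lhs, rhs)  # the two expressions agree up to rounding error
```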
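$$\left[E(\hat{y}_0) = \beta_1 + \beta_2 x_0\right] \ne \left[\beta_1 + \beta_2 x_0 + e_0 = y_0\right]$$
We need to include $y_0$ in the expectation so that
$$E(\hat{y}_0 - y_0) = E(\hat{y}_0) - E(y_0) = \beta_1 + \beta_2 x_0 - \left(\beta_1 + \beta_2 x_0 + E(e_0)\right) = 0.$$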
EXERCISE 4.5
(a) If we multiply the x values in the simple linear regression model $y = \beta_1 + \beta_2 x + e$ by 10,
the new model becomes
$$y = \beta_1 + \frac{\beta_2}{10}(x \times 10) + e = \beta_1 + \beta_2^* x^* + e, \quad \text{where } \beta_2^* = \beta_2 / 10 \text{ and } x^* = x \times 10$$
The estimated equation becomes
$$\hat{y} = b_1 + \frac{b_2}{10}(x \times 10)$$
Thus, $\beta_1$ and $b_1$ do not change, and $\beta_2$ and $b_2$ become 10 times smaller than their
original values. Since e does not change, the variance of the error term $\operatorname{var}(e) = \sigma^2$ is
unaffected.
(b) Multiplying all the y values by 10 in the simple linear regression model $y = \beta_1 + \beta_2 x + e$
gives the new model
$$y \times 10 = (\beta_1 \times 10) + (\beta_2 \times 10)\,x + (e \times 10)$$
or
$$y^* = \beta_1^* + \beta_2^* x + e^*$$
where
$$y^* = y \times 10, \quad \beta_1^* = \beta_1 \times 10, \quad \beta_2^* = \beta_2 \times 10, \quad e^* = e \times 10$$
The estimated equation becomes
$$\hat{y}^* = \hat{y} \times 10 = (b_1 \times 10) + (b_2 \times 10)\,x$$
Thus, both $\beta_1$ and $\beta_2$ are affected. They are 10 times larger than their original values.
Similarly, $b_1$ and $b_2$ are 10 times larger than their original values. The variance of the new
error term is
$$\operatorname{var}(e^*) = \operatorname{var}(e \times 10) = 100 \operatorname{var}(e) = 100\,\sigma^2$$
Thus, the variance of the error term is 100 times larger than its original value.
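Both scaling results can be illustrated with a small simulation. This is a sketch on synthetic data (the true coefficients, sample size and seed are arbitrary assumptions), using a hand-written least squares helper named `ols` rather than any particular library.

```python
import random


def ols(x, y):
    """Return (b1, b2, sigma2_hat) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    return b1, b2, sse / (n - 2)


rng = random.Random(0)  # fixed seed so the run is repeatable
x = [rng.uniform(0, 10) for _ in range(50)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

b1, b2, s2 = ols(x, y)
b1_x, b2_x, s2_x = ols([10 * xi for xi in x], y)   # part (a): x times 10
b1_y, b2_y, s2_y = ols(x, [10 * yi for yi in y])   # part (b): y times 10

print("x scaled:", b1_x / b1, b2_x / b2, s2_x / s2)
print("y scaled:", b1_y / b1, b2_y / b2, s2_y / s2)
```

Scaling x leaves the intercept and the error variance estimate unchanged and divides the slope by 10; scaling y multiplies both coefficients by 10 and the error variance estimate by 100.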
EXERCISE 4.6
$$\bar{\hat{y}} = \frac{\sum \hat{y}_i}{N} = \frac{1}{N}\sum (b_1 + x_i b_2) = \frac{1}{N}\left(b_1 N + b_2 \sum x_i\right) = b_1 + b_2 \frac{\sum x_i}{N} = b_1 + b_2 \bar{x}$$
EXERCISE 4.7
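(a) $\hat{y}_0 = b_2 x_0$
(c) $$r_{y\hat{y}}^2 = \frac{\hat{\sigma}_{y\hat{y}}^2}{\hat{\sigma}_y^2\,\hat{\sigma}_{\hat{y}}^2} = \frac{\left[\sum (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})\right]^2}{\sum (y_i - \bar{y})^2 \sum (\hat{y}_i - \bar{\hat{y}})^2} = \frac{(42.549)^2}{65.461 \times 29.333} = 0.943$$
The two alternative goodness of fit measures $R_u^2$ and $r_{y\hat{y}}^2$ are not equal.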
EXERCISE 4.8
Figure xr4.8(a) Fitted line and residuals for the simple linear regression
Figure xr4.8(b) Fitted line and residuals for the linear-log regression
Figure xr4.8(c) Fitted line and residuals for the quadratic regression
(b)
2. $R^2$: The value of $R^2$ for the third equation is the highest, namely 0.5685.
3. The plots of the fitted equations and their residuals: The upper parts of the figures
display the fitted equation while the lower parts display the residuals. Considering the
plots for the fitted equations, the one obtained from the third equation seems to fit the
observations best. In terms of the residuals, the first two equations have concentrations
of positive residuals at each end of the sample. The third equation provides a more
balanced distribution of positive and negative residuals throughout the sample.
The third equation is preferable.
EXERCISE 4.9
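(b) Equation 1: $\dfrac{d\hat{y}_t}{dt} = \hat{\beta}_1 = 0.0161$
Equation 2: $\dfrac{d\hat{y}_t}{dt} = \dfrac{\hat{\alpha}_1}{t} = \dfrac{0.1855}{49} = 0.0038$
Equation 3: $\dfrac{d\hat{y}_t}{dt} = 2\hat{\gamma}_1 t = 2 \times 0.0003547 \times 49 = 0.0348$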
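(c) Evaluating the elasticities at t = 49 and the relevant value for $\hat{y}_0$ gives the following
results.
Equation 1: $\dfrac{d\hat{y}_t}{dt}\,\dfrac{t}{\hat{y}_t} = \hat{\beta}_1 \dfrac{t}{\hat{y}_0} = 0.0161 \times \dfrac{49}{1.467} = 0.538$
Equation 2: $\dfrac{d\hat{y}_t}{dt}\,\dfrac{t}{\hat{y}_t} = \dfrac{\hat{\alpha}_1}{\hat{y}_0} = \dfrac{0.1855}{1.251} = 0.148$
Equation 3: $\dfrac{d\hat{y}_t}{dt}\,\dfrac{t}{\hat{y}_t} = \dfrac{2\hat{\gamma}_1 t^2}{\hat{y}_0} = \dfrac{2 \times 0.0003547 \times 49^2}{1.643} = 1.037$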
(d) The slopes $d\hat{y}_t/dt$ and the elasticities $(d\hat{y}_t/dt)(t/\hat{y}_t)$ give the marginal change in yield and the
percentage change in yield, respectively, that can be expected from technological change
in the next year. The results show that the predicted effect of technological change is very
sensitive to the choice of functional form.
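The sensitivity to functional form can be seen by recomputing the slopes and elasticities side by side. The sketch below uses the coefficient estimates and fitted values quoted in parts (b) and (c); the variable and function names are illustrative.

```python
T = 49  # evaluation point used in the text

# Slopes d(y_hat)/dt for the linear, linear-log and quadratic equations.
slope_linear = 0.0161
slope_linear_log = 0.1855 / T
slope_quadratic = 2 * 0.0003547 * T


def elasticity(slope, t, y_hat):
    """Elasticity = slope * t / fitted value."""
    return slope * t / y_hat


# Fitted values at t = 49 as quoted in part (c).
elas_linear = elasticity(slope_linear, T, 1.467)
elas_linear_log = elasticity(slope_linear_log, T, 1.251)
elas_quadratic = elasticity(slope_quadratic, T, 1.643)

for name, s, e in (("linear", slope_linear, elas_linear),
                   ("linear-log", slope_linear_log, elas_linear_log),
                   ("quadratic", slope_quadratic, elas_quadratic)):
    print(f"{name:10s} slope={s:.4f} elasticity={e:.3f}")
```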
EXERCISE 4.10
For $\beta_2$ we would expect a negative value because as the total expenditure increases the
food share should decrease with higher proportions of expenditure devoted to less
essential items. Both estimations give the expected sign. The standard errors for b1 and b2
from both estimations are relatively small resulting in high values of t ratios and
significant estimates.
(b) For households with 1 child, the average total expenditure is 94.848 and
$$\hat{\varepsilon} = \frac{b_1 + b_2\left[\ln(\overline{TOTEXP}) + 1\right]}{b_1 + b_2 \ln(\overline{TOTEXP})} = \frac{1.0099 - 0.1495\,[\ln(94.848) + 1]}{1.0099 - 0.1495 \ln(94.848)} = 0.5461$$
For households with 2 children, the average total expenditure is 101.168 and
$$\hat{\varepsilon} = \frac{b_1 + b_2\left[\ln(\overline{TOTEXP}) + 1\right]}{b_1 + b_2 \ln(\overline{TOTEXP})} = \frac{0.9535 - 0.12944\,[\ln(101.168) + 1]}{0.9535 - 0.12944 \ln(101.168)} = 0.6363$$
Both of the elasticities are less than one; therefore, food is a necessity.
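The two elasticities can be reproduced with a short Python sketch of the ratio used in part (b). The function name is illustrative, and the slope estimates are entered as negative numbers, consistent with the expected sign discussed for this exercise.

```python
import math


def food_elasticity(b1, b2, totexp):
    """[b1 + b2 (ln x + 1)] / [b1 + b2 ln x], evaluated at x = totexp."""
    log_x = math.log(totexp)
    return (b1 + b2 * (log_x + 1.0)) / (b1 + b2 * log_x)


elas_one_child = food_elasticity(1.0099, -0.1495, 94.848)
elas_two_children = food_elasticity(0.9535, -0.12944, 101.168)

print(round(elas_one_child, 4), round(elas_two_children, 4))
```

Because the slope is negative, the numerator is always smaller than the denominator, so the ratio stays below one whenever the fitted share is positive.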
[Figures xr4.10(a) and (b): WFOOD1 plotted against X1, and RESID plotted against X1]
(c) Figures xr4.10 (a) and (b) display the fitted curve and the residual plot for households with
1 child. The function linear in WFOOD and ln(TOTEXP) seems to be an appropriate one.
However, the observations vary considerably around the fitted line, consistent with the
low $R^2$ value. Also, the absolute magnitude of the residuals appears to decline as
ln(TOTEXP) increases. In Chapter 8 we discover that such behavior suggests the existence
of heteroskedasticity.
Figures xr4.10 (c) and (d) are plots of the fitted equation and the residuals for households
with 2 children. They lead to similar conclusions to those made for the one-child case.
The values of JB for testing $H_0$: the errors are normally distributed are 10.7941 and
6.3794 for households with 1 child and 2 children, respectively. Since both values are
greater than the critical value $\chi^2_{(0.95,\,2)} = 5.991$, we reject $H_0$. The p-values obtained are
0.0045 and 0.0412, respectively, confirming that $H_0$ is rejected. We conclude that for
both cases the errors are not normally distributed.
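The critical value and p-values can be checked without statistical tables, because a chi-square distribution with 2 degrees of freedom has the closed-form upper tail probability exp(-x/2). The sketch below relies on that fact; the function names are illustrative.

```python
import math


def chi2_df2_p_value(stat):
    """Upper tail probability of a chi-square(2) variable: exp(-stat / 2)."""
    return math.exp(-stat / 2.0)


def chi2_df2_critical(alpha):
    """Value exceeded with probability alpha under chi-square(2)."""
    return -2.0 * math.log(alpha)


critical_5pct = chi2_df2_critical(0.05)
p_one_child = chi2_df2_p_value(10.7941)
p_two_children = chi2_df2_p_value(6.3794)

print(round(critical_5pct, 3), round(p_one_child, 4), round(p_two_children, 4))
```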
[Figures xr4.10(c) and (d): WFOOD2 plotted against X2, and RESID plotted against X2]
EXERCISE 4.11
(d) The non-incumbent party will receive 50.1% of the vote if the incumbent party receives
49.9% of the vote. Thus, we want the value of GROWTH for which
$$49.9 = 52.0281 + 0.6631 \times GROWTH$$
Solving for GROWTH yields
$$GROWTH = -3.209$$
Real per capita GDP would have had to decrease by 3.209% in the first three quarters of
the election year for the non-incumbent party to win 50.1% of the vote.
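Solving the fitted vote equation for GROWTH is a one-line rearrangement, sketched below. The function name and default arguments are illustrative; the intercept and slope are the estimates quoted in part (d). The result is negative, in line with the stated decrease in real per capita GDP.

```python
def growth_for_vote(target_vote, intercept=52.0281, slope=0.6631):
    """Invert VOTE = intercept + slope * GROWTH for GROWTH."""
    return (target_vote - intercept) / slope


required_growth = growth_for_vote(49.9)
print(round(required_growth, 3))
```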
EXERCISE 4.12
EXERCISE 4.13
At the mean,
$$\text{elasticity} = \hat{\beta}_2 \times \overline{SQFT} = 0.00059596 \times 1611.9682 = 0.9607$$
This result tells us that, at the mean, a 1% increase in area is associated with an
approximate 1% increase in the price of the house.
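$$R_g^2 = [\operatorname{corr}(y, \hat{y})]^2 = \frac{[\operatorname{cov}(y, \hat{y})]^2}{\operatorname{var}(y)\operatorname{var}(\hat{y})} = \frac{(1.99573 \times 10^9)^2}{(2.78614 \times 10^9)(1.99996 \times 10^9)} = 0.715$$
$$R_g^2 = [\operatorname{corr}(y, \hat{y})]^2 = \frac{[\operatorname{cov}(y, \hat{y})]^2}{\operatorname{var}(y)\operatorname{var}(\hat{y})} = \frac{(1.57631 \times 10^9)^2}{(2.78614 \times 10^9)(1.32604 \times 10^9)} = 0.673$$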
The highest $R^2$ value is that of the log-linear functional form. The linear association
between the data and the fitted line is highest for the log-linear functional form. In this
sense the log-linear model fits the data best.
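The generalized goodness-of-fit measure is the squared correlation between the observed values and the predictions, so it can be recomputed from the covariance and the two variances. The sketch below uses the two sets of figures quoted in this part; `generalized_r2` is an illustrative name.

```python
def generalized_r2(cov_y_yhat, var_y, var_yhat):
    """Squared correlation between y and its prediction."""
    return cov_y_yhat ** 2 / (var_y * var_yhat)


r2_first = generalized_r2(1.99573e9, 2.78614e9, 1.99996e9)
r2_second = generalized_r2(1.57631e9, 2.78614e9, 1.32604e9)

print(round(r2_first, 3), round(r2_second, 3))
```

Since the measure is computed on the scale of y itself, it allows models with different dependent-variable transformations to be compared on a common footing.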
(d)
[Three figures: only axis tick values survive in the source]
(e)
[Three plots of residuals against SQFT]
The residuals appear to increase in magnitude as SQFT increases. This is most evident in
the residuals of the simple linear functional form. Furthermore, the residuals in the area
around 1000 square feet of the simple linear model are all positive indicating that perhaps
the functional form does not fit well in this region.
(g) The standard error of forecast for the simple linear model is
$$\operatorname{se}(f) = \hat{\sigma}\sqrt{1 + \frac{1}{N} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
(h) The simple linear model is not a good choice because the residuals are heavily skewed to
the right and hence far from being normally distributed. It is difficult to choose between
the other two models, the log-linear and log-log models. Their residuals have similar
patterns and they both lead to a plausible elasticity of price with respect to changes in
square feet, namely, a 1% change in square feet leads to a 1% change in price. The
log-linear model is favored on the basis of its higher $R_g^2$ value and its smaller standard
deviation of the error, characteristics that suggest it is the model that best fits the data.
EXERCISE 4.14
(a)
[Histograms of WAGE and ln(WAGE)]
Neither WAGE nor ln(WAGE) appears normally distributed. The distribution for WAGE is
positively skewed and that for ln(WAGE) is too flat at the top. However, ln(WAGE) more
closely resembles a normal distribution. This conclusion is confirmed by the Jarque-Bera
test results which are JB = 2684 (p-value = 0.0000) for WAGE and JB = 17.6 (p-value =
0.0002) for ln(WAGE).
$$\widehat{\ln(WAGE)} = 0.7884 + 0.1038\,EDUC \qquad R^2 = 0.2146$$
(se)  (0.0849)  (0.0063)
The estimated return to education = $b_2 \times 100\% = 10.38\%$.
(c)
[Histograms of the residuals from the linear model and the log-linear model]
The Jarque-Bera test results are JB = 3023 (p-value = 0.0000) for the residuals from the
linear model and JB = 3.48 (p-value = 0.1754) for the residuals from the log-linear model.
Both the histograms and the Jarque-Bera test results suggest the residuals from the log-
linear model are more compatible with normality. In the log-linear model a null hypothesis
of normality is not rejected at a 10% level of significance. In the linear regression model it
is rejected at a 1% level of significance.
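$$R_g^2 = [\operatorname{corr}(y, \hat{y})]^2 = \frac{[\operatorname{cov}(y, \hat{y})]^2}{\operatorname{var}(y)\operatorname{var}(\hat{y})} = \frac{6.87196^2}{38.9815 \times 5.39435} = 0.2246$$
Since $R_g^2 > R^2$, we conclude that the log-linear model fits the data better.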
(e)
[Two plots of residuals against EDUC]
The absolute value of the residuals increases in magnitude as EDUC increases, suggesting
heteroskedasticity which is covered in Chapter 8. It is also apparent, for both models, that
there are only positive residuals in the early range of EDUC. This suggests that there
might be a threshold effect: education has an impact only after a minimum number of
years of education. We also observe the non-normality of the residuals in the linear model;
the positive residuals tend to be greater in absolute magnitude than the negative residuals.
(g) The log-linear function is preferred because it has a higher goodness-of-fit value and its
residuals are consistent with normality. However, when predicting the average wage of
workers with 16 years of education, the linear model had a smaller prediction error.
EXERCISE 4.15
(a), (b)
Summary statistics for WAGE
Sub-sample              Mean     Std Dev   Min    Max     CV
(i) all males           11.525   6.659     2.07   60.19   57.8
(ii) all females         8.869   5.484     2.03   41.32   61.8
(iii) all whites        10.402   6.343     2.03   60.19   61.0
(iv) all blacks          8.259   4.740     3.50   25.26   57.4
(v) white males         11.737   6.716     2.07   60.19   57.2
(vi) white females       9.007   5.606     2.03   41.32   62.2
(vii) black males        9.066   5.439     3.68   25.26   60.0
(viii) black females     7.586   4.003     3.50   18.44   52.8
These results show that, on average, white males have the highest wages and black
females the lowest. The wage of white females is approximately the same as that of black
males. White females have the highest coefficient of variation and black females have the
lowest.
(c)
Regression results
Sub-sample              Constant   EDUC       % return   $R^2$
(i) all males           1.0075     0.0967      9.67      0.2074
    (se)                (0.1144)   (0.0084)
(ii) all females        0.5822     0.1097     10.97      0.2404
    (se)                (0.1181)   (0.0088)
(iii) all whites        0.7822     0.1048     10.48      0.2225
    (se)                (0.0881)   (0.0065)
(iv) all blacks         1.0185     0.0744      7.44      0.1022
    (se)                (0.3108)   (0.0238)
(v) white males         0.9953     0.0987      9.87      0.2173
    (se)                (0.1186)   (0.0087)
(vi) white females      0.6099     0.1085     10.85      0.2429
    (se)                (0.1223)   (0.0091)
(vii) black males       1.3809     0.0535      5.35      0.0679
    (se)                (0.4148)   (0.0321)
(viii) black females    0.2428     0.1275     12.75      0.2143
    (se)                (0.4749)   (0.0360)
The return to education is highest for black females (12.75%) and lowest for black males
(5.35%). It is approximately 10% for all other sub-samples with the exception of all blacks
where it is around 7.5%.
(d) The model does not fit the data equally well for each sub-sample. The best fits are for all
females and white females. Those for all blacks and black males are particularly poor.
We reject $H_0$ if $t > t_c$ or $t < -t_c$, where $t_c = t_{(0.975,\,df)}$. The results are given in the following
table.
There are no sub-samples where the data contradict the assertion that the wage return to an
extra year of education is 10%. Thus, although the estimated return to education is much
lower for all blacks and black males, it is not sufficiently lower to conclude that it
differs from 10%.
EXERCISE 4.15
(a), (b)
Summary statistics for WAGE
Sub-sample              Mean     Std Dev   Min    Max     CV
(i) all males           11.315   6.521     1.05   74.32   57.6
(ii) all females         8.990   5.630     1.28   78.71   62.6
(iii) all whites        10.358   6.275     1.05   78.71   60.6
(iv) all blacks          8.626   5.387     1.57   39.35   62.5
(v) white males         11.491   6.591     1.05   74.32   57.4
(vi) white females       9.105   5.648     1.28   78.71   62.0
(vii) black males        9.307   5.274     2.76   34.07   56.7
(viii) black females     8.129   5.424     1.57   39.35   66.7
These results show that, on average, white males have the highest wages and black
females the lowest. Males have higher average wages than females and whites have higher
average wages than blacks. The highest wage earner is, however, a white female. Black
females have the highest coefficient of variation and black males have the lowest.
(c)
Regression results
Sub-sample              Constant   EDUC       % return   $R^2$
(i) all males           0.9798     0.0982      9.82      0.1954
    (se)                (0.0543)   (0.0040)
(ii) all females        0.4776     0.1173     11.73      0.2479
    (se)                (0.0579)   (0.0043)
(iii) all whites        0.7965     0.1040     10.40      0.2030
    (se)                (0.0428)   (0.0032)
(iv) all blacks         0.6230     0.1066     10.66      0.1800
    (se)                (0.1390)   (0.0106)
(v) white males         0.9859     0.0988      9.88      0.2009
    (se)                (0.0561)   (0.0042)
(vi) white females      0.5142     0.1152     11.52      0.2453
    (se)                (0.0611)   (0.0045)
(vii) black males       1.0641     0.0798      7.98      0.1167
    (se)                (0.2063)   (0.0157)
(viii) black females    0.2147     0.1327     13.27      0.2569
    (se)                (0.1820)   (0.0138)
The return to education is highest for black females (13.27%) and lowest for black males
(7.98%). It is approximately 10% for all other sub-samples with the exception of all
females and white females where it is around 11.5%.
(d) The model does not fit the data equally well for each sub-sample. The best fits are for all
females, white females and black females. That for black males is particularly poor.
We reject $H_0$ if $t > t_c$ or $t < -t_c$, where $t_c = t_{(0.975,\,df)}$. The results are given in the following
table.
The null hypothesis is rejected for females, white females and black females. In these
cases the wage return to an extra year of education is estimated as greater than 10%. In all
other sub-samples, the data do not contradict the assertion that the wage return is 10%.
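The test decisions can be retraced from the regression results for this data set. The sketch below computes t = (b2 - 0.10)/se(b2) for each sub-sample using the EDUC estimates and standard errors from the table above. The degrees of freedom are not reported here, so the critical value 1.96 is an assumption (a large-sample approximation), and the names are illustrative.

```python
# (EDUC coefficient, standard error) for each sub-sample, from the table.
estimates = {
    "all males": (0.0982, 0.0040),
    "all females": (0.1173, 0.0043),
    "all whites": (0.1040, 0.0032),
    "all blacks": (0.1066, 0.0106),
    "white males": (0.0988, 0.0042),
    "white females": (0.1152, 0.0045),
    "black males": (0.0798, 0.0157),
    "black females": (0.1327, 0.0138),
}

T_CRIT = 1.96  # assumed large-sample critical value; exact df not given


def t_stat(b2, se, null_value=0.10):
    """t-statistic for H0: beta2 = null_value."""
    return (b2 - null_value) / se


rejected = [name for name, (b2, se) in estimates.items()
            if abs(t_stat(b2, se)) > T_CRIT]

for name, (b2, se) in estimates.items():
    print(f"{name:14s} t = {t_stat(b2, se):6.2f}")
print("rejected:", rejected)
```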
EXERCISE 4.16
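(b) The vote in Palm Beach for George Bush is 152,846. Therefore, the predicted vote for Pat
Buchanan is:
$$\widehat{BUCHANAN}_0 = 65.503 + 0.003482 \times 152{,}846 = 598$$
$$\operatorname{se}(f) = \hat{\sigma}\sqrt{1 + \frac{1}{N} + \frac{(152{,}846 - 41761.9697)^2}{\sum (x_i - \bar{x})^2}}$$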
The actual vote for Pat Buchanan in Palm Beach was 3407 which is not in the prediction
interval. The model is clearly not a good one for explaining the Palm Beach vote. This
conclusion is confirmed by the scatter diagram in part (c).
(c)
[Scatter diagram of BUCHANAN against BUCHANANHAT]
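The vote in Palm Beach for Al Gore is 268,945. Therefore, the predicted vote for Pat
Buchanan is:
$$\widehat{BUCHANAN}_0 = 109.23 + 0.002544 \times 268{,}945 = 793$$
$$\operatorname{se}(f) = \hat{\sigma}\sqrt{1 + \frac{1}{N} + \frac{(268{,}945 - 39975.55)^2}{\sum (x_i - \bar{x})^2}}$$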
The actual vote for Pat Buchanan in Palm Beach was 3407 which is not in the prediction
interval. The model is clearly not a good one for explaining the Palm Beach vote. This
conclusion is confirmed by the scatter diagram below.
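The two point predictions for Palm Beach can be reproduced directly from the fitted equations. In this sketch the function name is illustrative; the coefficients, vote counts and the actual Buchanan vote are the figures quoted in the text.

```python
def predict(intercept, slope, x0):
    """Point prediction from a fitted simple regression."""
    return intercept + slope * x0


ACTUAL_BUCHANAN = 3407

pred_from_bush = predict(65.503, 0.003482, 152_846)
pred_from_gore = predict(109.23, 0.002544, 268_945)

for label, pred in (("Bush", pred_from_bush), ("Gore", pred_from_gore)):
    ratio = ACTUAL_BUCHANAN / pred
    print(f"{label}: predicted {pred:.0f}, actual/predicted = {ratio:.1f}")
```

In both cases the actual vote is several times larger than the point prediction, which is the gap the prediction intervals fail to cover.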
[Scatter diagram of BUCHANAN against BUCHANANHAT2]
$$\operatorname{se}(f) = \hat{\sigma}\sqrt{1 + \frac{1}{N} + \frac{(0.354827 - 0.554756)^2}{\sum (x_i - \bar{x})^2}}$$
There were 430,762 total votes cast in Palm Beach. Multiplying the confidence interval
endpoints by this figure yields (-3767, 5791). The actual vote for Pat Buchanan in Palm
Beach was 3407 which falls inside this interval.
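The interval check can be sketched as follows. The lower endpoint is taken to be negative, which is an assumption: the minus sign appears to have been lost in the source, and a negative lower bound is the only reading under which 3407 lies inside the interval. Variable names are illustrative.

```python
TOTAL_VOTES = 430_762
ACTUAL_BUCHANAN = 3407

# Interval endpoints in votes; the negative lower endpoint is an assumption.
lower_votes, upper_votes = -3767, 5791

inside = lower_votes <= ACTUAL_BUCHANAN <= upper_votes

# Converting back to vote shares, the scale on which the interval was built.
lower_share = lower_votes / TOTAL_VOTES
upper_share = upper_votes / TOTAL_VOTES
actual_share = ACTUAL_BUCHANAN / TOTAL_VOTES

print(inside, round(lower_share, 4), round(actual_share, 4), round(upper_share, 4))
```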