CH 02 Ans
5TH EDITION
ANSWERS TO ODD-NUMBERED
EXERCISES IN CHAPTER 2
Chapter 2, Exercise Answers, Principles of Econometrics, 5e
EXERCISE 2.1
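(a) \(\sum x_i = 5\), \(\sum y_i = 10\), \(\sum (x_i - \bar{x}) = 0\), \(\sum (x_i - \bar{x})^2 = 10\), \(\sum (y_i - \bar{y}) = 0\), \(\sum (x_i - \bar{x})(y_i - \bar{y}) = 8\)
\(\bar{x} = 1\), \(\bar{y} = 2\)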
(b) \(b_2 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \dfrac{8}{10} = 0.8\)
\(b_1 = \bar{y} - b_2\bar{x} = 2 - 0.8 \times 1 = 1.2\)
(c) \(\sum_{i=1}^{5} x_i^2 = 15\)
\(\sum_{i=1}^{5} x_i y_i = 18\)
\(\sum_{i=1}^{5} x_i^2 - N\bar{x}^2 = 10\)
\(\sum_{i=1}^{5} x_i y_i - N\bar{x}\bar{y} = 8\)
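(d) \(\sum x_i = 5\), \(\sum y_i = 10\), \(\sum \hat{y}_i = 10\), \(\sum \hat{e}_i = 0\), \(\sum \hat{e}_i^2 = 3.6\), \(\sum x_i \hat{e}_i = 0\)
\(s_y^2 = 2.5\)
\(s_x^2 = 2.5\)
\(s_{xy} = 2\)
\(r_{xy} = 0.8\)
\(CV_x = 158.11388\)
\(\text{median}(x) = 1\)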
(e) Figure xr2.1 Observations and fitted line (y and fitted values plotted against x)
(f) See figure above. The fitted line passes through the point of the means, \(\bar{x} = 1\), \(\bar{y} = 2\).
(g) \(\bar{y} = 2\), \(b_1 + b_2\bar{x} = 2\)
(h) \(\bar{\hat{y}} = \sum \hat{y}_i / N = 2\)
(i) \(\hat{\sigma}^2 = 1.2\)
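The arithmetic in parts (b), (c), (d) and (i) can be reproduced from the reported sums alone. The following is a minimal sketch, assuming only the sums, the sample variance of y and the sum of squared residuals given in this exercise (with N = 5); the variable names are illustrative.

```python
import math

# Summary quantities reported in parts (a), (c) and (d) of Exercise 2.1.
N = 5
sum_x, sum_y = 5.0, 10.0
sum_x2, sum_xy = 15.0, 18.0
sse = 3.6      # sum of squared residuals, part (d)
s2_y = 2.5     # sample variance of y, part (d)

x_bar, y_bar = sum_x / N, sum_y / N

# Deviation sums via the computational shortcuts of part (c).
sxx = sum_x2 - N * x_bar**2          # sum of (x_i - x_bar)^2
sxy = sum_xy - N * x_bar * y_bar     # sum of (x_i - x_bar)(y_i - y_bar)

# Least squares estimates, part (b).
b2 = sxy / sxx
b1 = y_bar - b2 * x_bar

# Sample moments, part (d), and the error variance estimate, part (i).
s2_x = sxx / (N - 1)
s_xy = sxy / (N - 1)
r_xy = s_xy / math.sqrt(s2_x * s2_y)
cv_x = 100 * math.sqrt(s2_x) / x_bar
sigma2_hat = sse / (N - 2)

print(b2, b1, s2_x, s_xy, r_xy, round(cv_x, 5), sigma2_hat)
```

The error variance estimate divides by N − 2 rather than N − 1 because two parameters are estimated.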
EXERCISE 2.3
(a) We show the least squares fitted line.
(Figure: y and fitted values plotted against x)
(b) b2 = 2.285714 , b1 = 2
(Figure: y and yhat plotted against x)
(d) The residuals \(\hat{e}_i\), in observation order:
1.71429
−2.57143
2.14286
−2.14286
−0.42857
1.28571
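Least squares residuals should sum to zero and be uncorrelated with x in the sample, which gives a quick check on the values in part (d). The sketch below uses the reported residuals and coefficients; the x values 1 through 6 are an assumption read off the plot axis and are not stated in the answers.

```python
# Residuals reported in part (d) and coefficients reported in part (b).
residuals = [1.71429, -2.57143, 2.14286, -2.14286, -0.42857, 1.28571]
b1, b2 = 2.0, 2.285714

# ASSUMPTION: x = 1, ..., 6, taken from the plot axis, not from the answers.
x = [1, 2, 3, 4, 5, 6]

# Both sums should be zero up to the rounding of the reported residuals.
sum_e = sum(residuals)
sum_xe = sum(xi * ei for xi, ei in zip(x, residuals))

# Observations implied by the fit: fitted value plus residual.
y_implied = [b1 + b2 * xi + ei for xi, ei in zip(x, residuals)]

print(sum_e, sum_xe)
print([round(v, 3) for v in y_implied])
```

If the assumed x values are right, the implied observations come out as whole numbers after rounding, which is a useful sanity check on the transcription of the residuals.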
EXERCISE 2.5
(a) SALES = 4000 + 4 × ADVERT
Figure xr2.5 Regression line (vertical axis: sales)
EXERCISE 2.7
(a) \(\sum \hat{e}_i^2 = 697.82566\)
(b) \(\sum (x_i - \bar{x})^2 = 1553.8833\)
(c) \(b_2 = 1.02896\) suggests that a one percentage point increase in the share of the population with a
bachelor’s degree or more will lead to an increase of $1028.96 in the mean income per
capita.
(d) b1 = 11.519745
EXERCISE 2.9
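(a) \[ E(\hat{\beta}_{2,mean} \mid x) = E\left[ \frac{\bar{y}_2 - \bar{y}_1}{\bar{x}_2 - \bar{x}_1} \,\Big|\, x \right] = \frac{1}{\bar{x}_2 - \bar{x}_1} E[(\bar{y}_2 - \bar{y}_1) \mid x] \]
\[ E[(\bar{y}_2 - \bar{y}_1) \mid x] = E[\bar{y}_2 \mid x] - E[\bar{y}_1 \mid x] \]
\[ E[\bar{y}_2 \mid x] = E\left[ \frac{1}{3}\sum_{i=4}^{6} y_i \,\Big|\, x \right] = \frac{1}{3}\sum_{i=4}^{6} E(y_i \mid x) = \frac{1}{3}\sum_{i=4}^{6} (\beta_1 + \beta_2 x_i) \]
\[ = \frac{1}{3}\left( 3\beta_1 + \beta_2 \sum_{i=4}^{6} x_i \right) = \beta_1 + \beta_2 \frac{1}{3}\sum_{i=4}^{6} x_i = \beta_1 + \beta_2 \bar{x}_2 \]
Similarly \(E[\bar{y}_1 \mid x] = \beta_1 + \beta_2 \bar{x}_1\). Then
\[ E[(\bar{y}_2 - \bar{y}_1) \mid x] = E[\bar{y}_2 \mid x] - E[\bar{y}_1 \mid x] = (\beta_1 + \beta_2 \bar{x}_2) - (\beta_1 + \beta_2 \bar{x}_1) = \beta_2 (\bar{x}_2 - \bar{x}_1) \]
Finally,
\[ E(\hat{\beta}_{2,mean} \mid x) = E\left[ \frac{\bar{y}_2 - \bar{y}_1}{\bar{x}_2 - \bar{x}_1} \,\Big|\, x \right] = \frac{1}{\bar{x}_2 - \bar{x}_1} E[(\bar{y}_2 - \bar{y}_1) \mid x] = \frac{1}{\bar{x}_2 - \bar{x}_1} \beta_2 (\bar{x}_2 - \bar{x}_1) = \beta_2 \]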
(b) \(E(\hat{\beta}_{2,mean}) = E_x\left[ E(\hat{\beta}_{2,mean} \mid x) \right] = E_x(\beta_2) = \beta_2\)
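(c) \[ \operatorname{var}(\hat{\beta}_{2,mean} \mid x) = \left[ \frac{1}{\bar{x}_2 - \bar{x}_1} \right]^2 \operatorname{var}[(\bar{y}_2 - \bar{y}_1) \mid x] = \left[ \frac{1}{\bar{x}_2 - \bar{x}_1} \right]^2 \left\{ \operatorname{var}[\bar{y}_2 \mid x] + \operatorname{var}[\bar{y}_1 \mid x] \right\} \]
\[ \operatorname{var}[\bar{y}_2 \mid x] = \operatorname{var}\left[ \frac{1}{3}\sum_{i=4}^{6} y_i \,\Big|\, x \right] = \frac{1}{9}\sum_{i=4}^{6} \operatorname{var}(y_i \mid x) = \frac{1}{9}(3\sigma^2) = \frac{\sigma^2}{3} \]
\[ \operatorname{var}(\hat{\beta}_{2,mean} \mid x) = \left[ \frac{1}{\bar{x}_2 - \bar{x}_1} \right]^2 \left\{ \operatorname{var}[\bar{y}_2 \mid x] + \operatorname{var}[\bar{y}_1 \mid x] \right\} = \left[ \frac{1}{\bar{x}_2 - \bar{x}_1} \right]^2 \left( \frac{\sigma^2}{3} + \frac{\sigma^2}{3} \right) = \frac{2\sigma^2}{3(\bar{x}_2 - \bar{x}_1)^2} \]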
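We know that \(\operatorname{var}(\hat{\beta}_{2,mean} \mid x)\) is larger than the variance of the least squares estimator because
\[ \hat{\beta}_{2,mean} = \frac{\bar{y}_2 - \bar{y}_1}{\bar{x}_2 - \bar{x}_1} = \frac{1}{\bar{x}_2 - \bar{x}_1}\left( \frac{\sum_{i=4}^{6} y_i}{3} - \frac{\sum_{i=1}^{3} y_i}{3} \right) = \frac{\sum_{i=4}^{6} y_i}{3(\bar{x}_2 - \bar{x}_1)} - \frac{\sum_{i=1}^{3} y_i}{3(\bar{x}_2 - \bar{x}_1)} = \sum_{i=1}^{6} a_i y_i \]
where \(a_1 = a_2 = a_3 = \dfrac{-1}{3(\bar{x}_2 - \bar{x}_1)}\) and \(a_4 = a_5 = a_6 = \dfrac{1}{3(\bar{x}_2 - \bar{x}_1)}\), so \(\hat{\beta}_{2,mean}\) is a linear estimator.
Furthermore \(\hat{\beta}_{2,mean}\) is an unbiased estimator. From the Gauss-Markov theorem we know that the
least squares estimator is the “best” linear unbiased estimator, the one with the smallest variance.
Therefore, we know that \(\operatorname{var}(\hat{\beta}_{2,mean} \mid x)\) is larger than the variance of the least squares estimator.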
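The variance comparison can be checked numerically by writing both estimators as weighted sums of the \(y_i\) and computing \(\sigma^2 \sum (\text{weight}_i)^2\). This is a minimal sketch: the x values and \(\sigma^2 = 1\) are illustrative assumptions, not data from the exercise.

```python
# ASSUMPTIONS: illustrative x values and error variance, not exercise data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sigma2 = 1.0

x1_bar = sum(x[:3]) / 3      # mean of the first three x values
x2_bar = sum(x[3:]) / 3      # mean of the last three x values
x_bar = sum(x) / len(x)

# Weights a_i of the group-means estimator.
a_low = -1 / (3 * (x2_bar - x1_bar))
a_high = 1 / (3 * (x2_bar - x1_bar))
a = [a_low] * 3 + [a_high] * 3

# Least squares weights w_i = (x_i - x_bar) / sum_j (x_j - x_bar)^2.
sxx = sum((xi - x_bar) ** 2 for xi in x)
w = [(xi - x_bar) / sxx for xi in x]

# Conditional variances of the two linear estimators.
var_mean = sigma2 * sum(ai**2 for ai in a)
var_ols = sigma2 * sum(wi**2 for wi in w)
var_formula = 2 * sigma2 / (3 * (x2_bar - x1_bar) ** 2)

# Unbiasedness requires sum a_i = 0 and sum a_i x_i = 1.
sum_a = sum(a)
sum_ax = sum(ai * xi for ai, xi in zip(a, x))

print(sum_a, sum_ax)
print(var_mean, var_formula, var_ols)
```

For these assumed values the weighted-sum variance matches the closed form from part (c) and exceeds the least squares variance, as the Gauss-Markov theorem implies.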
EXERCISE 2.11
(a) We estimate that each additional $100 per month income is associated with an additional 52
cents per person expenditure, on average, on food away from home. If monthly income is
zero, we estimate that a household will spend an average of $13.77 per person on food away
from home.
(b) \(\hat{y} = 24.17\).
(c) \(\hat{\varepsilon} = 0.43\).
(d) In this log-linear relationship, the elasticity is \(\hat{\varepsilon} = 0.007(20) = 0.14\).
(e) For \(x = 20\), \(d\hat{y}/dx = 0.1860\). For \(x = 30\), \(d\hat{y}/dx = 0.1995\). The fitted function is increasing at an increasing
rate. Also, the second derivative, the rate of change of the first derivative, is
\(d^2\hat{y}/dx^2 = \exp(3.14 + 0.007x)(0.007)^2 > 0\). A positive second derivative means that the
slope increases as \(x\) increases.
(f) The number of zeros is 2334 – 2005 = 329. The reason for the reduction in the number of
observations is that the logarithm of zero is undefined and creates a missing data value. The
software throws out the row of data when it encounters a missing value when doing its
calculations.
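The numbers in parts (b) through (e) follow from a few lines of arithmetic. The sketch below is hedged: the linear coefficients 13.77 and 0.52 are read from the interpretation in part (a), the log-linear coefficients 3.14 and 0.007 from part (e), and evaluating the linear model at x = 20 (income in $100 units) is an assumption that happens to reproduce the reported values. Function names are illustrative.

```python
import math

# Linear model coefficients, read from the interpretation in part (a).
LIN_B1, LIN_B2 = 13.77, 0.52
# Log-linear model coefficients, read from part (e): ln(y) = 3.14 + 0.007 x.
LOG_B1, LOG_B2 = 3.14, 0.007

def linear_prediction(x):
    return LIN_B1 + LIN_B2 * x

def linear_elasticity(x):
    # Elasticity in a linear model: slope * x / y_hat.
    return LIN_B2 * x / linear_prediction(x)

def loglinear_slope(x):
    # d y_hat / dx = b2 * exp(b1 + b2 x).
    return LOG_B2 * math.exp(LOG_B1 + LOG_B2 * x)

def loglinear_elasticity(x):
    # Elasticity in a log-linear model: b2 * x.
    return LOG_B2 * x

def loglinear_second_derivative(x):
    # d^2 y_hat / dx^2 = b2^2 * exp(b1 + b2 x), positive for every x.
    return LOG_B2**2 * math.exp(LOG_B1 + LOG_B2 * x)

# ASSUMPTION: parts (b) and (c) are evaluated at x = 20.
y_hat_20 = linear_prediction(20)
elas_lin_20 = linear_elasticity(20)
elas_log_20 = loglinear_elasticity(20)
slope_20, slope_30 = loglinear_slope(20), loglinear_slope(30)

print(round(y_hat_20, 2), round(elas_lin_20, 2), round(elas_log_20, 2))
print(round(slope_20, 4), round(slope_30, 4))
```

The slope at x = 30 exceeds the slope at x = 20, which is what a positive second derivative implies.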
EXERCISE 2.13
(a) We estimate that each additional 1000 FTE students increase real total academic cost per
student by $266, holding all else constant. The intercept suggests if there were no students
the real total academic cost per student would be $14,656.
(b) \(\widehat{ACA} = 22.0907\).
(c) \(\hat{e} = -0.6877\).
(d) ACA = 20.732975 .
EXERCISE 2.15
(a) \[ b_{EZ} = \frac{y_2 - y_1}{x_2 - x_1} = \frac{1}{x_2 - x_1} y_2 - \frac{1}{x_2 - x_1} y_1 = \sum k_i y_i \]
where \(k_1 = \dfrac{-1}{x_2 - x_1}\), \(k_2 = \dfrac{1}{x_2 - x_1}\), and \(k_3 = k_4 = \dots = k_N = 0\)
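(c) \[ \operatorname{var}(b_{EZ} \mid x) = \operatorname{var}\left( \sum k_i y_i \mid x \right) = \sum k_i^2 \operatorname{var}(e_i \mid x) = \sigma^2 \sum k_i^2 = \frac{2\sigma^2}{(x_2 - x_1)^2} \]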
(d) If \(e_i \sim N(0, \sigma^2)\), then \(b_{EZ} \mid x \sim N\left[ \beta_2, \dfrac{2\sigma^2}{(x_2 - x_1)^2} \right]\)
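(e) To convince E.Z. Stuff that var(b2|x) < var(bEZ|x), we need to show that
\[ \frac{2\sigma^2}{(x_2 - x_1)^2} > \frac{\sigma^2}{\sum (x_i - \bar{x})^2} \quad \text{or that} \quad \sum (x_i - \bar{x})^2 > \frac{(x_2 - x_1)^2}{2} \]
Consider
\[ \frac{(x_2 - x_1)^2}{2} = \frac{\left[ (x_2 - \bar{x}) - (x_1 - \bar{x}) \right]^2}{2} = \frac{(x_2 - \bar{x})^2 + (x_1 - \bar{x})^2 - 2(x_2 - \bar{x})(x_1 - \bar{x})}{2} \]
Thus, we need to show that
\[ 2\sum_{i=1}^{N} (x_i - \bar{x})^2 > (x_2 - \bar{x})^2 + (x_1 - \bar{x})^2 - 2(x_2 - \bar{x})(x_1 - \bar{x}) \]
or that
\[ (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + 2(x_2 - \bar{x})(x_1 - \bar{x}) + 2\sum_{i=3}^{N} (x_i - \bar{x})^2 > 0 \]
or that
\[ \left[ (x_1 - \bar{x}) + (x_2 - \bar{x}) \right]^2 + 2\sum_{i=3}^{N} (x_i - \bar{x})^2 > 0. \]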
This last inequality clearly holds. Thus, bEZ is not as good as the least squares estimator.
Rather than prove the result directly, as we have done above, we could also refer Professor
E.Z. Stuff to the Gauss Markov theorem.
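The inequality in part (e) can also be illustrated numerically. The sketch below is not a proof: the x values and \(\sigma^2 = 1\) are hypothetical, chosen only to show the two variances and the last inequality side by side.

```python
# ASSUMPTIONS: illustrative x values and error variance, not exercise data.
x = [2.0, 5.0, 1.0, 4.0, 8.0]
sigma2 = 1.0
N = len(x)
x_bar = sum(x) / N

# Weights of the E.Z. estimator: only the first two observations are used.
k = [-1 / (x[1] - x[0]), 1 / (x[1] - x[0])] + [0.0] * (N - 2)

var_ez = sigma2 * sum(ki**2 for ki in k)
var_ez_formula = 2 * sigma2 / (x[1] - x[0]) ** 2

# Least squares variance: sigma^2 / sum (x_i - x_bar)^2.
sxx = sum((xi - x_bar) ** 2 for xi in x)
var_b2 = sigma2 / sxx

# Left side of the final inequality in part (e).
lhs = ((x[0] - x_bar) + (x[1] - x_bar)) ** 2 + 2 * sum(
    (xi - x_bar) ** 2 for xi in x[2:]
)
# It should equal 2 * sxx - (x2 - x1)^2, the gap the derivation rearranges.
gap = 2 * sxx - (x[1] - x[0]) ** 2

print(var_ez, var_b2, lhs, gap)
```

Because the left side is a sum of squares, it can only be zero in degenerate cases, which is why the least squares variance is the smaller of the two here.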
EXERCISE 2.17
(a) Figure xr2.17a Collegetown: Price and Square Foot (Price, $1000, against Sqft, 100s)
Figure xr2.11(a) Price (in $1,000s) against square feet for houses (in 100s)
(d) Figure xr2.17d Observations and quadratic fitted line (Price, $1000, against Sqft, 100s)
(e) \(\hat{\varepsilon} = 0.882\)
Figure xr2.17 Residuals from linear relation (residuals against Sqft, 100s)
Figure xr2.17 Residuals from quadratic relation (residuals against Sqft, 100s)
In both models, the residual patterns do not appear random. The variation in the residuals
increases as SQFT increases, suggesting that the homoskedasticity assumption may be
violated.
(g) The sum of squared residuals for the linear relationship is 5,262,846.9. The sum of squared residuals
for the quadratic relationship is 4,222,356.3. In this case the quadratic model has the lower
SSE. The lower SSE means that the data values are closer to the fitted line for the quadratic
model than for the linear model.
EXERCISE 2.19
(a) Figure xr2.19a Selling price vs. square feet (Price, $1000, against Sqft, 100s)
meaningful in this example. The reason is that there are no data values with a house size
near zero.
(Figure: Price, $1000, against Sqft, 100s)
(d) Figure xr2.19d Fitted linear and quadratic (Price, $1000, against Sqft, 100s)
The sum of squared residuals for the linear relation is SSE = 1,879,826.9948. For the
quadratic model the sum of squared residuals is SSE = 1,795,092.2112. In this instance, the
sum of squared residuals is smaller for the quadratic model, one indicator of a better fit.
(e) If the quadratic model is in fact “true,” then the results and interpretations we obtain for the
linear relationship are incorrect, and may be misleading.
EXERCISE 2.21
(a) \(\widehat{PRICE} = 152.6144 - 0.9812\,AGE\)
(se) (3.3473) (0.0949)
We estimate that a house that is new, AGE = 0, will have expected price $152,614.40.
We estimate that each additional year of age will reduce expected price by $981.20,
other things held constant. The expected selling price for a 30-year-old house is
$123,177.70.
(b) Figure xr2.21b Observations and linear fitted line (Selling Price against Age)
The data show an inverse relationship between house prices and age. The data on newer
houses is not as close to the fitted regression line as the data for older homes.
(c) \(\widehat{\ln(PRICE)} = 4.9283 - 0.0075\,AGE\)
(se) (0.0205) (0.0006)
We estimate that each additional year of age reduces expected price by about 0.75%,
holding all else constant.
(d) Figure xr2.21d Observations and log-linear fitted line (Selling Price against Age)
The fitted log-linear model is not much different from the fitted linear relationship.
(e) The expected selling price of a house that is 30 years old is $110,370.32.
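The two predictions for a 30-year-old house can be approximated from the reported coefficients. This sketch assumes price is measured in $1000s, as the intercept interpretation in part (a) indicates, and uses the rounded estimates printed above; the function names are illustrative.

```python
import math

# Rounded coefficient estimates reported in parts (a) and (c).
def price_linear(age):
    # Linear model: predicted price in $1000s.
    return 152.6144 - 0.9812 * age

def price_loglinear(age):
    # Log-linear model: exponentiate the predicted ln(price), in $1000s.
    return math.exp(4.9283 - 0.0075 * age)

age = 30
pred_linear = price_linear(age)
pred_loglinear = price_loglinear(age)

# Exact percentage change per year implied by the log-linear slope,
# compared with the usual approximation 100 * b2 = -0.75%.
exact_pct_per_year = 100 * (math.exp(-0.0075) - 1)

print(round(1000 * pred_linear, 2), round(1000 * pred_loglinear, 2))
print(round(exact_pct_per_year, 4))
```

Because the coefficients are rounded to four decimals, these values differ slightly from the reported $123,177.70 and $110,370.32, which were presumably computed from unrounded estimates.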
EXERCISE 2.23
(a) Figure xr2.23a Vote vs Growth (democratic share of presidential vote against Growth)
(Figure: plot against Growth)
(d) The figure below shows a plot of VOTE against INFLATION. It is difficult to see whether there
is a positive or an inverse relationship.
(Figure: VOTE plotted against Inflation)
(f) The actual inflation value in the 2016 election was 1.42%. The predicted vote in favor of
the Democratic candidate (Clinton) was \(\widehat{VOTE} = 49.99\), or 49.99%.
EXERCISE 2.25
(a) Figure xr2-25a Histogram of FOODAWAY (vertical axis: Percent)
The mean of the 1200 observations is 49.27, the 25th, 50th and 75th percentiles are 12.04,
32.56 and 67.60.
(b)
                  N     Mean    Median
ADVANCED = 1    257    73.15    48.15
(c) Figure xr2-25c Histogram of ln(FOODAWAY) (Percent against lfoodaway)
There are 178 fewer values of ln(FOODAWAY) because 178 households reported spending
$0 on food away from home per person, and ln(0) is undefined. It creates a “missing value”
which software cannot use in the regression.
(e) Figure xr2.25e Observations and log-linear fitted line (vertical axis: ln(foodaway))
The OLS residuals do appear randomly distributed with no obvious patterns. There are fewer
observations at higher incomes, so there is more “white space.”
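The loss of observations described in part (c) can be mimicked in a few lines. This is a sketch with hypothetical spending values, not the exercise data; `None` stands in for the missing value that statistical software would create for ln(0).

```python
import math

# HYPOTHETICAL spending values (dollars per person); not the exercise data.
foodaway = [0.0, 12.5, 0.0, 48.0, 7.25, 103.4]

# ln(0) is undefined, so zero spending becomes a missing value (None here).
ln_foodaway = [math.log(v) if v > 0 else None for v in foodaway]

# A regression routine would use only the rows without missing values.
usable = [v for v in ln_foodaway if v is not None]
n_dropped = len(foodaway) - len(usable)

print(n_dropped, len(usable))
```

In the exercise the same mechanism removes the 178 households that reported zero spending.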
EXERCISE 2.27
(a) Figure xr2.27a motel_pct vs. relprice (Motel Occupancy Rate against 100*Relative price)
There seems to be an inverse association between relative price and occupancy rate.
(b) \(\widehat{MOTEL\_PCT} = 166.6560 - 1.2212\,RELPRICE\)
(se) (43.5709) (0.5835)
Based on economic reasoning we anticipate a negative coefficient for RELPRICE. The slope
estimate says that the expected motel occupancy rate falls by 1.22 percentage points given a
one percentage point increase in relative price (one unit of 100 × relative price), other factors held constant.
(c) Figure xr2.27c OLS residuals (OLS residuals against Time)
The residuals are scattered about zero for the first 16 observations but for observations 17-
23 all but one of the residuals is negative. This suggests that the occupancy rate was lower
than predicted by the regression model for these dates.
(d) \(\widehat{MOTEL\_PCT} = 79.3500 - 13.2357\,REPAIR\)
(se) (3.1541) (5.9606)
We estimate that during the non-repair period the expected occupancy rate is 79.35%.
During the repair period, the expected occupancy rate is estimated to fall by 13.24 percentage points, other
things held constant, to 66.11%.
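When the only regressor is a 0/1 indicator, the least squares intercept is the mean of the group coded 0 and the slope is the difference between the two group means, which is why part (d) reads as 79.35% versus 66.11%. The sketch below demonstrates this with hypothetical occupancy data; the variable names and values are illustrative assumptions.

```python
# HYPOTHETICAL occupancy rates (%) and repair indicator; not exercise data.
motel_pct = [82.0, 78.0, 80.0, 76.0, 65.0, 70.0, 63.0]
repair = [0, 0, 0, 0, 1, 1, 1]

n = len(motel_pct)
d_bar = sum(repair) / n
y_bar = sum(motel_pct) / n

# Usual least squares formulas applied to the indicator regressor.
sxy = sum((d - d_bar) * (y - y_bar) for d, y in zip(repair, motel_pct))
sxx = sum((d - d_bar) ** 2 for d in repair)
b2 = sxy / sxx
b1 = y_bar - b2 * d_bar

# Group means for comparison.
no_repair = [y for d, y in zip(repair, motel_pct) if d == 0]
in_repair = [y for d, y in zip(repair, motel_pct) if d == 1]
mean_no_repair = sum(no_repair) / len(no_repair)
mean_repair = sum(in_repair) / len(in_repair)

# The same identity applied to the estimates reported in part (d).
reported_repair_rate = 79.3500 - 13.2357

print(b1, b2, mean_no_repair, mean_repair)
print(round(reported_repair_rate, 2))
```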
EXERCISE 2.29
(a) Figure xr2.29a Histogram of ln(wage) (Density against ln(wage))
The histogram shows the distribution of ln(WAGE) to be almost symmetrical. Note that the
mean and median are similar, which is not the case for skewed distributions. The skewness
coefficient is not quite zero. Similarly, the kurtosis is not quite three, as it should be for a
normal distribution.
(b) The OLS estimates are
\(\widehat{\ln(WAGE)} = 1.5968 + 0.0987\,EDUC\)
(se) (0.0702) (0.0048)
We estimate that each additional year of education predicts a 9.87% higher wage, all else
held constant.
(c) For someone with 12 years of education the predicted value is \(\widehat{WAGE} = 16.1493\) and for
someone with 16 years of education it is \(\widehat{WAGE} = 23.9721\).
(d) For individuals with 12 and 16 years of education, respectively, these values are $1.1850
and $1.5801.
(e) (Figure: plot against years of education)
The log-linear model fits the data better at low levels of education.
(f) For the log-linear model this value is 228,573.5 and for the linear model 220,062.3. Based
on this measure the linear model fits the data better than the log-linear model.