
Chapter 2: Properties of the regression coefficients and hypothesis testing

Overview
Chapter 1 introduced least squares regression analysis, a mathematical
technique for fitting a relationship given suitable data on the variables
involved. It is a fundamental chapter because much of the rest of the
text is devoted to extending the least squares approach to handle more
complex models, for example models with multiple explanatory variables,
nonlinear models, and models with qualitative explanatory variables.
However, the mechanics of fitting regression equations are only part of
the story. We are equally concerned with assessing the performance of our
regression techniques and with developing an understanding of why they
work better in some circumstances than in others. Chapter 2 is the starting
point for this objective and is thus equally fundamental. In particular, it
shows how two of the three main criteria for assessing the performance
of estimators, unbiasedness and efficiency, are applied in the context of
a regression model. The third criterion, consistency, will be considered in
Chapter 8.

Learning outcomes
After working through the corresponding chapter in the text, studying the
corresponding slideshows, and doing the starred exercises in the text and
the additional exercises in this guide, you should be able to explain what is
meant by:
• cross-sectional, time series, and panel data
• unbiasedness of OLS regression estimators
• variance and standard errors of regression coefficients and how they
are determined
• Gauss–Markov theorem and efficiency of OLS regression estimators
• two-sided and one-sided t tests of hypotheses relating to regression
coefficients
• F tests of goodness of fit of a regression equation
in the context of a regression model. The chapter is a long one and you
should take your time over it because it is essential that you develop a
perfect understanding of every detail.

Further material
Derivation of the expression for the variance of the naïve estimator in
Section 2.3.
The variance of the naïve estimator in Section 2.3 and Exercise 2.9 is not
of any great interest in itself but its derivation provides an example of how
one obtains expressions for variances of estimators in general.
In Section 2.3 we considered the naïve estimator of the slope coefficient
derived by joining the first and last observations in a sample and
calculating the slope of that line:

20 Elements of econometrics

$$b_2 = \frac{Y_n - Y_1}{X_n - X_1}.$$

It was demonstrated that the estimator could be decomposed as


$$b_2 = \beta_2 + \frac{u_n - u_1}{X_n - X_1}$$

and hence that E(b2) = β2.


The population variance of a random variable X is defined to be
E([X – μX]2) where μX = E(X). Hence the population variance of b2 is given
by
$$\sigma_{b_2}^2 = E\left(\left[b_2 - \beta_2\right]^2\right) = E\left(\left[\beta_2 + \frac{u_n - u_1}{X_n - X_1} - \beta_2\right]^2\right) = E\left(\left[\frac{u_n - u_1}{X_n - X_1}\right]^2\right).$$
On the assumption that X is nonstochastic, this can be written as
$$\sigma_{b_2}^2 = \left[\frac{1}{X_n - X_1}\right]^2 E\left(\left[u_n - u_1\right]^2\right).$$
Expanding the quadratic, we have


$$\sigma_{b_2}^2 = \left[\frac{1}{X_n - X_1}\right]^2 E\left(u_n^2 + u_1^2 - 2u_nu_1\right) = \left[\frac{1}{X_n - X_1}\right]^2\left[E\left(u_n^2\right) + E\left(u_1^2\right) - 2E\left(u_nu_1\right)\right].$$

Each value of the disturbance term is drawn randomly from a distribution
with mean 0 and population variance σu², so E(un²) and E(u1²) are both
equal to σu². un and u1 are drawn independently from the distribution, so
E(unu1) = E(un)E(u1) = 0. Hence

$$\sigma_{b_2}^2 = \frac{2\sigma_u^2}{\left(X_n - X_1\right)^2} = \frac{\sigma_u^2}{\tfrac{1}{2}\left(X_n - X_1\right)^2}.$$

Define $A = \tfrac{1}{2}\left(X_1 + X_n\right)$, the average of X1 and Xn, and $D = X_n - A = A - X_1$. Then

$$\begin{aligned}
\tfrac{1}{2}\left(X_n - X_1\right)^2 &= \tfrac{1}{2}\left(X_n - A + A - X_1\right)^2\\
&= \tfrac{1}{2}\left[\left(X_n - A\right)^2 + \left(A - X_1\right)^2 + 2\left(X_n - A\right)\left(A - X_1\right)\right]\\
&= \tfrac{1}{2}\left[D^2 + D^2 + 2(D)(D)\right] = 2D^2\\
&= \left(X_n - A\right)^2 + \left(A - X_1\right)^2\\
&= \left(X_n - A\right)^2 + \left(X_1 - A\right)^2\\
&= \left(X_n - \bar X + \bar X - A\right)^2 + \left(X_1 - \bar X + \bar X - A\right)^2\\
&= \left(X_n - \bar X\right)^2 + \left(\bar X - A\right)^2 + 2\left(X_n - \bar X\right)\left(\bar X - A\right)\\
&\quad + \left(X_1 - \bar X\right)^2 + \left(\bar X - A\right)^2 + 2\left(X_1 - \bar X\right)\left(\bar X - A\right)\\
&= \left(X_1 - \bar X\right)^2 + \left(X_n - \bar X\right)^2 + 2\left(\bar X - A\right)^2 + 2\left(X_1 + X_n - 2\bar X\right)\left(\bar X - A\right)\\
&= \left(X_1 - \bar X\right)^2 + \left(X_n - \bar X\right)^2 + 2\left(\bar X - A\right)^2 + 2\left(2A - 2\bar X\right)\left(\bar X - A\right)\\
&= \left(X_1 - \bar X\right)^2 + \left(X_n - \bar X\right)^2 - 2\left(\bar X - A\right)^2 = \left(X_1 - \bar X\right)^2 + \left(X_n - \bar X\right)^2 - 2\left(A - \bar X\right)^2\\
&= \left(X_1 - \bar X\right)^2 + \left(X_n - \bar X\right)^2 - \tfrac{1}{2}\left(X_1 + X_n - 2\bar X\right)^2.
\end{aligned}$$

Hence we obtain the expression in Exercise 2.9. There must be a shorter proof.
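The result can also be checked numerically. Below is a minimal Monte Carlo sketch (the values of X, β1, β2, and σu are made-up assumptions, not taken from the text): the simulated mean and variance of the naive estimator should match β2 and 2σu²/(Xn − X1)².

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # fixed, nonstochastic X (made up)
beta1, beta2, sigma_u = 2.0, 0.5, 1.0      # made-up parameter values

# Draw 200,000 samples at once; each row is one sample of disturbances.
u = rng.normal(0.0, sigma_u, size=(200_000, X.size))
Y = beta1 + beta2 * X + u                  # broadcasting over rows
b2 = (Y[:, -1] - Y[:, 0]) / (X[-1] - X[0]) # naive estimator, sample by sample

print(b2.mean())   # close to beta2 = 0.5, i.e. unbiased
print(b2.var())    # close to 2*sigma_u**2/(X[-1] - X[0])**2 = 0.125
```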

Additional exercises
A2.1
A variable Yi is generated as

Yi = β1 + ui

where β1 is a fixed parameter and ui is a disturbance term that is
independently and identically distributed with expected value 0 and
population variance σu². The least squares estimator of β1 is Ȳ, the
sample mean of Y. However, a researcher believes that Y is a linear
function of another variable X and uses ordinary least squares to fit the
relationship

Ŷ = b1 + b2X,

calculating b1 as Ȳ − b2X̄, where X̄ is the sample mean of X. X may
be assumed to be a nonstochastic variable. Determine whether the
researcher's estimator b1 is biased or unbiased, and, if biased, determine
the direction of the bias.

A2.2
With the model described in Exercise A2.1, standard theory states that the
population variance of the researcher's estimator b1 is

$$\sigma_u^2\left[\frac{1}{n} + \frac{\bar X^2}{\sum\left(X_i - \bar X\right)^2}\right].$$

In general, this is larger than the population variance of Ȳ, which is σu²/n.
Explain the implications of the difference in the variances.

In the special case where X̄ = 0, the variances are the same. Give an
intuitive explanation.

A2.3
Using the output for the regression in Exercise A1.9 in the text, reproduced
below, perform appropriate statistical tests.

. reg CHILDREN SM

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 1, 538) = 63.60
Model | 272.69684 1 272.69684 Prob > F = 0.0000
Residual | 2306.7402 538 4.28762118 R-squared = 0.1057
-------------+------------------------------ Adj R-squared = 0.1041
Total | 2579.43704 539 4.78559747 Root MSE = 2.0707

------------------------------------------------------------------------------
CHILDREN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
SM | -.2525473 .0316673 -7.98 0.000 -.314754 -.1903406
_cons | 7.198478 .3773667 19.08 0.000 6.457186 7.939771
------------------------------------------------------------------------------


A2.4
Using the output for the regression in Exercise A1.1, reproduced below,
perform appropriate statistical tests.

. reg FDHO EXP if FDHO>0

[Stata regression output table garbled in extraction; its numeric values are not recoverable.]

A2.5
Using the output for your regression in Exercise A1.2, perform appropriate
statistical tests.

A2.6
Using the output for the regression in Exercise A1.3, reproduced below,
perform appropriate statistical tests.

. reg WEIGHT02 WEIGHT85

[Stata regression output table garbled in extraction; its numeric values are not recoverable. The digits in the variable names are inferred from the answer to A2.6 below.]


A2.7
Using the output for the regression in Exercise A1.4, reproduced below,
perform appropriate statistical tests.

. reg EARNINGS HEIGHT

[Stata regression output table garbled in extraction; its numeric values are not recoverable.]



A2.8
With the information given in Exercise A1.5, how would the change in the
measurement of GDP affect
• the standard error of the coefficient of GDP
• the F statistic for the equation?

A2.9
With the information given in Exercise A1.6, how would the change in the
measurement of GDP affect
• the standard error of the coefficient of GDP
• the F statistic for the equation?

A2.10
[This is a continuation of Exercise 1.15 in the text.] A sample of data
consists of n observations on two variables, Y and X. The true model is

Yi = β1 + β2Xi + ui

where β1 and β2 are parameters and u is a disturbance term that satisfies
the usual regression model assumptions. In view of the true model,

Ȳ = β1 + β2X̄ + ū

where Ȳ, X̄, and ū are the sample means of Y, X, and u. Subtracting the
second equation from the first, one obtains

Yi* = β2Xi* + ui*

where Yi* = Yi − Ȳ, Xi* = Xi − X̄, and ui* = ui − ū. Note that, by
construction, the sample means of Y*, X*, and u* are all equal to zero.
One researcher fits

Ŷ = b1 + b2X.   (1)

A second researcher fits

Ŷ* = b1* + b2*X*.   (2)

[Note: The second researcher included an intercept in the specification.]
• Comparing regressions (1) and (2), demonstrate that Ŷi* = Ŷi − Ȳ.

• Demonstrate that the residuals in (2) are identical to the residuals in (1).
• Demonstrate that the OLS estimator of the variance of the disturbance
term in (2) is equal to that in (1).
• Explain how the standard error of the slope coefficient in (2) is related
to that in (1).
• Explain how R2 in (2) is related to R2 in (1).
• Explain why, theoretically, the specification (2) of the second researcher
is incorrect and he should have fitted
Ŷ* = b2*X*   (3)

not including a constant in his specification.


• If the second researcher had fitted (3) instead of (2), how would this
have affected his estimator of β2? Would dropping the unnecessary
intercept lead to a gain in efficiency?

A2.11
A variable Y depends on a nonstochastic variable X with the relationship
Y = β1 + β2X + u
where u is a disturbance term that satisfies the regression model
assumptions. Given a sample of n observations, a researcher decides to
estimate β2 using the expression
$$b_2 = \frac{\sum_{i=1}^{n} X_iY_i}{\sum_{i=1}^{n} X_i^2}.$$

(This is the OLS estimator of β2 for the model Y = β2X + u.) It can be
shown that the population variance of this estimator is

$$\frac{\sigma_u^2}{\sum_{i=1}^{n} X_i^2}.$$

• Demonstrate that b2 is in general a biased estimator of β2.


• Discuss whether it is possible to determine the sign of the bias.
• Demonstrate that b2 is unbiased if β1 = 0.
• What can be said in this case about the efficiency of b2, comparing it
with the estimator

$$\frac{\sum_{i=1}^{n}\left(X_i - \bar X\right)\left(Y_i - \bar Y\right)}{\sum_{i=1}^{n}\left(X_i - \bar X\right)^2}\ ?$$

• Demonstrate that b2 is unbiased if X̄ = 0. What can be said in this
case about the efficiency of b2, comparing it with the estimator

$$\frac{\sum_{i=1}^{n}\left(X_i - \bar X\right)\left(Y_i - \bar Y\right)}{\sum_{i=1}^{n}\left(X_i - \bar X\right)^2}\ ?$$

Explain the underlying reason for this conclusion.


• Returning to the general case where β1 ≠ 0 and X̄ ≠ 0, suppose that
there is very little variation in X in the sample. Is it possible that b2
might be a better estimator than the OLS estimator?

A2.12
A variable Yi is generated as
Yi = β1 + β2Xi + ui (1)
where β1 and β2 are fixed parameters and ui is a disturbance term that
satisfies the regression model assumptions. The values of X are fixed
and are as shown in the figure below. Four of them, X1 to X4, are close
together. The fifth, X5, is much larger. The corresponding values that Y
would take, if there were no disturbance term, are given by the circles on
the line. The presence of the disturbance term in the model causes the
actual values of Y in a sample to be different. The solid black circles depict
a typical sample of observations.


[Figure: the line Y = β1 + β2X, with X1 to X4 close together and X5 much larger; circles on the line show the values Y would take with no disturbance, and solid circles a typical sample.]

Discuss the advantages and disadvantages of dropping the observation


corresponding to X5 when regressing Y on X. If you keep the observation in
the sample, will this cause the regression estimates to be biased?

Answers to the starred exercises in the textbook


2.1
Demonstrate that $b_1 = \beta_1 + \sum_{i=1}^{n} c_iu_i$, where $c_i = \frac{1}{n} - a_i\bar X$ and ai is defined
in equation (2.21).
Answer:

$$b_1 = \bar Y - b_2\bar X = \beta_1 + \beta_2\bar X + \bar u - \bar X\left(\beta_2 + \sum_{i=1}^{n} a_iu_i\right) = \beta_1 + \frac{1}{n}\sum_{i=1}^{n} u_i - \bar X\sum_{i=1}^{n} a_iu_i = \beta_1 + \sum_{i=1}^{n} c_iu_i.$$

2.5
An investigator correctly believes that the relationship between two
variables X and Y is given by
Yi = β1 + β2Xi + ui.
Given a sample of observations on Y, X, and a third variable Z (which is
not a determinant of Y), the investigator estimates β2 as

$$\frac{\sum_{i=1}^{n}\left(Z_i - \bar Z\right)\left(Y_i - \bar Y\right)}{\sum_{i=1}^{n}\left(Z_i - \bar Z\right)\left(X_i - \bar X\right)}.$$
Demonstrate that this estimator is unbiased.


Answer: Noting that Yi − Ȳ = β2(Xi − X̄) + (ui − ū),

$$b_2 = \frac{\sum\left(Z_i - \bar Z\right)\left(Y_i - \bar Y\right)}{\sum\left(Z_i - \bar Z\right)\left(X_i - \bar X\right)} = \frac{\beta_2\sum\left(Z_i - \bar Z\right)\left(X_i - \bar X\right) + \sum\left(Z_i - \bar Z\right)\left(u_i - \bar u\right)}{\sum\left(Z_i - \bar Z\right)\left(X_i - \bar X\right)} = \beta_2 + \frac{\sum\left(Z_i - \bar Z\right)\left(u_i - \bar u\right)}{\sum\left(Z_i - \bar Z\right)\left(X_i - \bar X\right)}.$$


Hence

$$E(b_2) = \beta_2 + \frac{\sum\left(Z_i - \bar Z\right)E\left(u_i - \bar u\right)}{\sum\left(Z_i - \bar Z\right)\left(X_i - \bar X\right)} = \beta_2.$$
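The unbiasedness result can be illustrated by simulation. A sketch with made-up values (β1, β2, X, and Z are assumptions; any nonstochastic Z correlated with X will do):

```python
import numpy as np

rng = np.random.default_rng(1)
beta1, beta2 = 1.0, 2.0                    # made-up parameter values
X = np.linspace(1.0, 10.0, 10)             # nonstochastic X (made up)
Z = X ** 2                                 # a third variable correlated with X

u = rng.normal(size=(100_000, X.size))     # 100,000 samples of disturbances
Y = beta1 + beta2 * X + u
num = ((Z - Z.mean()) * (Y - Y.mean(axis=1, keepdims=True))).sum(axis=1)
den = ((Z - Z.mean()) * (X - X.mean())).sum()
b2 = num / den                             # the Z-based estimator, per sample

print(b2.mean())                           # close to beta2 = 2, i.e. unbiased
```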

2.6
Using the decomposition of b1 obtained in Exercise 2.1, derive the
expression for $\sigma_{b_1}^2$ given in equation (2.38).

Answer: $b_1 = \beta_1 + \sum_{i=1}^{n} c_iu_i$, where $c_i = \frac{1}{n} - a_i\bar X$. Hence

$$\sigma_{b_1}^2 = E\left(\left[\sum_{i=1}^{n} c_iu_i\right]^2\right) = \sigma_u^2\sum_{i=1}^{n} c_i^2 = \sigma_u^2\left[\frac{1}{n} - \frac{2\bar X}{n}\sum_{i=1}^{n} a_i + \bar X^2\sum_{i=1}^{n} a_i^2\right].$$

From Box 2.2, $\sum_{i=1}^{n} a_i = 0$ and $\sum_{i=1}^{n} a_i^2 = \dfrac{1}{\sum_{i=1}^{n}\left(X_i - \bar X\right)^2}$. Hence

$$\sigma_{b_1}^2 = \sigma_u^2\left[\frac{1}{n} + \frac{\bar X^2}{\sum_{i=1}^{n}\left(X_i - \bar X\right)^2}\right].$$

2.7
Given the decomposition in Exercise 2.2 of the OLS estimator of β2 in
the model Yi = β2Xi + ui, demonstrate that the variance of the slope
coefficient is given by

$$\sigma_{b_2}^2 = \frac{\sigma_u^2}{\sum_{j=1}^{n} X_j^2}.$$

Answer:

$b_2 = \beta_2 + \sum_{i=1}^{n} d_iu_i$, where $d_i = \dfrac{X_i}{\sum_{j=1}^{n} X_j^2}$, and $E(b_2) = \beta_2$. Hence

$$\sigma_{b_2}^2 = E\left(\left[\sum_{i=1}^{n} d_iu_i\right]^2\right) = \sigma_u^2\sum_{i=1}^{n} d_i^2 = \sigma_u^2\sum_{i=1}^{n}\frac{X_i^2}{\left[\sum_{j=1}^{n} X_j^2\right]^2} = \frac{\sigma_u^2}{\left[\sum_{j=1}^{n} X_j^2\right]^2}\sum_{i=1}^{n} X_i^2 = \frac{\sigma_u^2}{\sum_{j=1}^{n} X_j^2}.$$
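A quick Monte Carlo check of this variance formula, with made-up values of β2, σu, and X:

```python
import numpy as np

rng = np.random.default_rng(2)
beta2, sigma_u = 0.7, 1.5                  # made-up values
X = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])

u = rng.normal(0.0, sigma_u, size=(200_000, X.size))
Y = beta2 * X + u                          # model through the origin
b2 = (X * Y).sum(axis=1) / (X ** 2).sum()  # OLS estimator, sample by sample

theory = sigma_u ** 2 / (X ** 2).sum()     # = 2.25/22.75, about 0.099
print(b2.mean(), b2.var(), theory)         # mean close to 0.7; variances agree
```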

2.10
It can be shown that the variance of the estimator of the slope coefficient
in Exercise 2.4,

$$\frac{\sum_{i=1}^{n}\left(Z_i - \bar Z\right)\left(Y_i - \bar Y\right)}{\sum_{i=1}^{n}\left(Z_i - \bar Z\right)\left(X_i - \bar X\right)},$$

is given by

$$\sigma_{b_2}^2 = \frac{\sigma_u^2}{\sum_{i=1}^{n}\left(X_i - \bar X\right)^2}\times\frac{1}{r_{XZ}^2}$$

where rXZ is the correlation between X and Z. What are the implications for
the efficiency of the estimator?
Answer:
If Z happens to be an exact linear function of X, the population variance
will be the same as that of the OLS estimator. Otherwise 1/r²XZ will be
greater than 1, the variance will be larger, and so the estimator will be less
efficient.

2.13
Suppose that the true relationship between Y and X is Yi = β1 + β2Xi + ui
and that the fitted model is Ŷi = b1 + b2Xi. In Exercise 1.12 it was shown
that if Xi* = μ1 + μ2Xi, and Y is regressed on X*, the slope coefficient
b2* = b2/μ2. How will the standard error of b2* be related to the standard
error of b2?
Answer:
In Exercise 1.22 it was demonstrated that the fitted values of Y would be
the same. This means that the residuals are the same, and hence
su², the estimator of the variance of the disturbance term, is the same. The
standard error of b2* is then given by

$$\mathrm{s.e.}\left(b_2^*\right) = \sqrt{\frac{s_u^2}{\sum\left(X_i^* - \bar X^*\right)^2}} = \sqrt{\frac{s_u^2}{\sum\left(\mu_1 + \mu_2X_i - \mu_1 - \mu_2\bar X\right)^2}} = \sqrt{\frac{s_u^2}{\mu_2^2\sum\left(X_i - \bar X\right)^2}} = \frac{1}{\mu_2}\,\mathrm{s.e.}\left(b_2\right).$$

2.15
A researcher with a sample of 50 individuals with similar education
but differing amounts of training hypothesises that hourly earnings,
EARNINGS, may be related to hours of training, TRAINING, according to
the relationship
EARNINGS = β1 + β2TRAINING + u .
He is prepared to test the null hypothesis H0: β2 = 0 against the alternative
hypothesis H1: β2 ≠ 0 at the 5 per cent and 1 per cent levels. What should
he report
1. if b2 = 0.30, s.e.(b2) = 0.12?
2. if b2 = 0.55, s.e.(b2) = 0.12?
3. if b2 = 0.10, s.e.(b2) = 0.12?
4. if b2 = –0.27, s.e.(b2) = 0.12?
Answer:
There are 48 degrees of freedom, and hence the critical values of t at the
5 per cent, 1 per cent, and 0.1 per cent levels are 2.01, 2.68, and 3.51,
respectively.


1. The t statistic is 2.50. Reject H0 at the 5 per cent level but not at the 1
per cent level.
2. t = 4.58. Reject at the 0.1 per cent level.
3. t = 0.83. Fail to reject at the 5 per cent level.
4. t = –2.25. Reject H0 at the 5 per cent level but not at the 1 per cent
level.
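The critical values and t statistics used above can be reproduced numerically. A sketch assuming SciPy is available:

```python
from scipy.stats import t

df = 48                                    # n - 2 with n = 50
for level in (0.05, 0.01, 0.001):
    crit = t.ppf(1 - level / 2, df)        # two-sided critical value
    print(level, round(crit, 2))           # 2.01, 2.68, 3.51 as quoted above

# Case 1 of the exercise:
t_stat = 0.30 / 0.12
print(round(t_stat, 2))                    # 2.5
```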

2.20
Explain whether it would have been possible to perform one-sided tests
instead of two-sided tests in Exercise 2.15. If you think that one-sided tests
are justified, perform them and state whether the use of a one-sided test
makes any difference.
Answer:
First, there should be a discussion of whether the parameter β2 in
EARNINGS = β1 + β2TRAINING + u
can be assumed not to be negative. The objective of training is to impart
skills. It would be illogical for an individual with greater skills to be paid
less on that account, and so we can argue that we can rule out β2 < 0. We
can then perform a one-sided test. With 48 degrees of freedom, the critical
values of t at the 5 per cent, 1 per cent, and 0.1 per cent levels are 1.68,
2.40, and 3.26, respectively.
1. The t statistic is 2.50. We can now reject H0 at the 1 per cent level (but
not at the 0.1 per cent level).
2. t = 4.58. Not affected by the change. Reject at the 0.1 per cent level.
3. t = 0.83. Not affected by the change. Fail to reject at the 5 per cent
level.
4. t = –2.25. Fail to reject H0 at the 5 per cent level. Here there is a
problem because the coefficient has an unexpected sign and is large
enough to reject H0 at the 5 per cent level with a two-sided test.
In principle we should ignore this and fail to reject H0. Admittedly, the
likelihood of such a large negative t statistic occurring under H0 is very
small, but it would be smaller still under the alternative hypothesis
H1: β2 >0.
However we should consider two further possibilities. One is that the
justification for a one-sided test is incorrect. For example, some jobs
pay relatively low wages because they offer training that is valued by
the employee. Apprenticeships are the classic example. Alternatively,
workers in some low-paid occupations may, for technical reasons, receive a
relatively large amount of training. In either case, the correlation between
training and earnings might be negative instead of positive.
Another possible reason for a coefficient having an unexpected sign is that
the model is misspecified in some way. For example, the coefficient might
be distorted by omitted variable bias, to be discussed in Chapter 6.

2.25
Suppose that the true relationship between Y and X is Yi = β1 + β2Xi + ui
and that the fitted model is Ŷi = b1 + b2Xi. In Exercise 1.12 it was shown
that if Xi* = μ1 + μ2Xi, and Y is regressed on X*, the slope coefficient
b2* = b2/μ2. How will the t statistic for b2* be related to the t statistic for
b2? (See also Exercise 2.13.)


Answer:
From Exercise 2.13, we have s.e.(b2*) = s.e.(b2)/μ2. Since b2* = b2/μ2, it
follows that the t statistic must be the same.
Alternatively, since we saw in Exercise 1.22 that R2 must be the same,
it follows that the F statistic for the equation must be the same. For a
simple regression the F statistic is the square of the t statistic on the slope
coefficient, so the t statistic must be the same.

2.28
Calculate the 95 per cent confidence interval for β2 in the price inflation/
wage inflation example:
p̂ = −1.21 + 0.82w
     (0.05)  (0.10)
What can you conclude from this calculation?
Answer:
With n equal to 20, there are 18 degrees of freedom and the critical value
of t at the 5 per cent level is 2.10. The 95 per cent confidence interval is
therefore
0.82 – 0.10 × 2.10 ≤ β2 ≤ 0.82 + 0.10 × 2.10
that is,
0.61 ≤ β2 ≤ 1.03.
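The same arithmetic as a short sketch (assuming SciPy for the critical value):

```python
from scipy.stats import t

b2, se, n = 0.82, 0.10, 20                 # slope, standard error, sample size
crit = t.ppf(0.975, n - 2)                 # about 2.10 with 18 degrees of freedom
low, high = b2 - crit * se, b2 + crit * se
print(round(low, 2), round(high, 2))       # 0.61 1.03
```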

2.34
Suppose that the true relationship between Y and X is Yi = β1 + β2Xi + ui
and that the fitted model is Ŷi = b1 + b2Xi. Suppose that Xi* = μ1 + μ2Xi, and
Y is regressed on X * . How will the F statistic for this regression be related
to the F statistic for the original regression? (See also Exercises 1.22, 2.13,
and 2.24.)
Answer:
We saw in Exercise 1.22 that R2 would be the same, and it follows that F
must also be the same.

Answers to the additional exercises


Note:
Each of the exercises below relates to a simple regression. Accordingly, the
F test is equivalent to a two-sided t test on the slope coefficient and there
is no point in performing both tests. The F statistic is equal to the square of
the t statistic and, for any significance level, the critical value of F is equal
to the critical value of t. Obviously a one-sided t test, when justified, is
preferable to either in that it has greater power for any given significance
level.
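The equivalence noted here, F = t² in a simple regression, is easy to verify numerically; the data below are made up:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
X = rng.normal(size=n)                     # made-up data
Y = 1.0 + 0.8 * X + rng.normal(size=n)

xd = X - X.mean()
b2 = (xd * (Y - Y.mean())).sum() / (xd ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - b1 - b2 * X                        # residuals
RSS = (e ** 2).sum()
ESS = ((b1 + b2 * X - Y.mean()) ** 2).sum()

s2 = RSS / (n - 2)                         # estimate of the disturbance variance
t_stat = b2 / np.sqrt(s2 / (xd ** 2).sum())
F_stat = ESS / (RSS / (n - 2))
print(t_stat ** 2, F_stat)                 # identical up to rounding error
```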

A2.1
First we need to show that E(b2) = 0.

$$b_2 = \frac{\sum_i\left(X_i - \bar X\right)\left(Y_i - \bar Y\right)}{\sum_i\left(X_i - \bar X\right)^2} = \frac{\sum_i\left(X_i - \bar X\right)\left(\beta_1 + u_i - \beta_1 - \bar u\right)}{\sum_i\left(X_i - \bar X\right)^2} = \frac{\sum_i\left(X_i - \bar X\right)\left(u_i - \bar u\right)}{\sum_i\left(X_i - \bar X\right)^2}.$$

Hence, given that we are told that X is nonstochastic,

$$E(b_2) = E\left[\frac{\sum_i\left(X_i - \bar X\right)\left(u_i - \bar u\right)}{\sum_i\left(X_i - \bar X\right)^2}\right] = \frac{1}{\sum_i\left(X_i - \bar X\right)^2}E\left[\sum_i\left(X_i - \bar X\right)\left(u_i - \bar u\right)\right] = \frac{1}{\sum_i\left(X_i - \bar X\right)^2}\sum_i\left(X_i - \bar X\right)E\left(u_i - \bar u\right) = 0$$

since E(u) = 0. Thus

$$E(b_1) = E\left(\bar Y - b_2\bar X\right) = \beta_1 - \bar X E(b_2) = \beta_1$$

and the estimator is unbiased.
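A simulation sketch of this result, with made-up values of β1, σu, and the irrelevant regressor X: the mean of the researcher's b1 across many samples should be close to β1.

```python
import numpy as np

rng = np.random.default_rng(4)
beta1, sigma_u = 10.0, 2.0                 # made-up values
X = np.linspace(0.0, 9.0, 10)              # irrelevant nonstochastic regressor

u = rng.normal(0.0, sigma_u, size=(100_000, X.size))
Y = beta1 + u                              # Y does not depend on X at all
xd = X - X.mean()
b2 = (xd * (Y - Y.mean(axis=1, keepdims=True))).sum(axis=1) / (xd ** 2).sum()
b1 = Y.mean(axis=1) - b2 * X.mean()        # the researcher's intercept estimate

print(b1.mean())                           # close to beta1 = 10, i.e. unbiased
```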

A2.2
If X̄ = 0, the estimators are identical: Ȳ − b2X̄ reduces to Ȳ, and the term
X̄²/Σ(Xi − X̄)² in the variance expression vanishes, so the two variances
coincide.

A2.3
The t statistic for the coefficient of SM is –7.98, very highly significant. The
t statistic for the intercept is even higher, but it is of no interest. All the
mothers in the sample must have had at least one child (the respondent),
for otherwise they would not be in the sample. The F statistic is 63.60,
very highly significant.

A2.4
The t statistic for the coefficient of EXP is 19.50, very highly significant.
There is little point performing a t test on the intercept, given that it has
no plausible meaning. The F statistic is 380.4, very highly significant.

A2.5
The slope coefficient for every category is significantly different from
zero at a very high significance level, with the exception of local public
transportation. The coefficient for the latter is virtually equal to zero and
the t statistic is only 0.40. Evidently this category is on the verge of being
an inferior good.

A2.6
A straight t test on the coefficient of WEIGHT85 is not very interesting
since we know that those who were relatively heavy in 1985 will also be
relatively heavy in 2002. The t statistic confirms this obvious fact. Of more
interest would be a test investigating whether those relatively heavy in
1985 became even heavier in 2002. We set up the null hypothesis that they
did not, H0: β2 = 1, and see if we can reject it. The t statistic for this test is
t = (b2 − 1)/s.e.(b2) [its numeric value was lost in extraction], and hence
the null hypothesis is not rejected. The constant indicates that the
respondents have tended to put on 23.6 pounds over the interval,
irrespective of their 1985 weight. The null hypothesis that the general
increase is zero is rejected at the 0.1 per cent level.


A2.7
The t statistic for height, 5.42, suggests that the effect of height on
earnings is highly significant, despite the very low R2. In principle the
estimate of an extra 86 cents of hourly earnings for every extra inch of
height could have been a purely random result of the kind that one obtains
with nonsense models. However, the fact that it is apparently highly
significant causes us to look for other explanations, the most likely one
being that suggested in the answer to Exercise A1.4. Of course we would
not attempt to test the negative constant.

A2.8
The standard error of the coefficient of GDP. This is given by

$$\frac{s_u^*}{\sqrt{\sum\left(G_i^* - \bar G^*\right)^2}},$$

where su*, the standard error of the regression, is estimated as
$\sqrt{\sum e_i^{*2}/(n-2)}$. Since RSS is unchanged, su* = su. We saw in
Exercise A1.5 that Gi* − Ḡ* = Gi − Ḡ for all i. Hence the new standard
error is given by

$$\frac{s_u}{\sqrt{\sum\left(G_i - \bar G\right)^2}}$$

and is unchanged.

The F statistic.

$$F = \frac{ESS}{RSS/(n-2)},$$

where ESS = explained sum of squares = $\sum\left(\hat Y_i^* - \bar{\hat Y}^*\right)^2$.
Since ei* = ei, Ŷi* = Ŷi and ESS is unchanged. We saw in Exercise A1.5 that
RSS is unchanged. Hence F is unchanged.

A2.9
The standard error of the coefficient of GDP. This is given by

$$\frac{s_u^*}{\sqrt{\sum\left(G_i^* - \bar G^*\right)^2}},$$

where su*, the standard error of the regression, is estimated as
$\sqrt{\sum e_i^{*2}/(n-2)}$, where n is the number of observations. We saw in
Exercise A1.6 that ei* = ei and so RSS is unchanged. Hence su* = su. Thus
the new standard error is given by

$$\frac{s_u}{\sqrt{\sum\left(2G_i - 2\bar G\right)^2}} = \frac{1}{2}\,\frac{s_u}{\sqrt{\sum\left(G_i - \bar G\right)^2}} = 0.005.$$

The F statistic.

$$F = \frac{ESS}{RSS/(n-2)},$$

where ESS = explained sum of squares = $\sum\left(\hat Y_i^* - \bar{\hat Y}^*\right)^2$.
Since ei* = ei, Ŷi* = Ŷi and ESS is unchanged. Hence F is unchanged.
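The conclusions of A2.8 and A2.9 can be checked numerically. In this sketch (all data made up) the regressor is doubled, as in A2.9: the slope's standard error is halved and the F statistic is unchanged.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 25
G = rng.normal(10.0, 2.0, size=n)          # made-up "GDP" series
Y = 3.0 + 0.4 * G + rng.normal(size=n)

def simple_ols(x, y):
    xd = x - x.mean()
    b2 = (xd * (y - y.mean())).sum() / (xd ** 2).sum()
    e = y - (y.mean() - b2 * x.mean()) - b2 * x   # residuals
    s2 = (e ** 2).sum() / (len(y) - 2)
    se = np.sqrt(s2 / (xd ** 2).sum())            # standard error of the slope
    ESS = ((y - y.mean()) ** 2).sum() - (e ** 2).sum()
    F = ESS / s2
    return se, F

se1, F1 = simple_ols(G, Y)
se2, F2 = simple_ols(2.0 * G, Y)           # GDP rescaled by a factor of two
print(np.isclose(se2, se1 / 2), np.isclose(F1, F2))   # True True
```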

A2.10
One way of demonstrating that Ŷi* = Ŷi − Ȳ:

$$\hat Y_i^* = b_1^* + b_2^*X_i^* = b_2\left(X_i - \bar X\right)$$
$$\hat Y_i - \bar Y = \left(b_1 + b_2X_i\right) - \bar Y = \left(\bar Y - b_2\bar X\right) + b_2X_i - \bar Y = b_2\left(X_i - \bar X\right).$$

Demonstration that the residuals are the same:

$$e_i^* = Y_i^* - \hat Y_i^* = \left(Y_i - \bar Y\right) - \left(\hat Y_i - \bar Y\right) = e_i.$$
Demonstration that the OLS estimator of the variance of the disturbance
term in (2) is equal to that in (1):

$$s_u^{*2} = \frac{\sum e_i^{*2}}{n-2} = \frac{\sum e_i^2}{n-2} = s_u^2.$$

The standard error of the slope coefficient in (2) is equal to that in (1):

$$\hat\sigma_{b_2^*}^2 = \frac{s_u^{*2}}{\sum\left(X_i^* - \bar X^*\right)^2} = \frac{s_u^2}{\sum X_i^{*2}} = \frac{s_u^2}{\sum\left(X_i - \bar X\right)^2} = \hat\sigma_{b_2}^2.$$

Hence the standard errors are the same.


Demonstration that R² in (2) is equal to R² in (1):

$$R^{2*} = \frac{\sum\left(\hat Y_i^* - \bar{\hat Y}^*\right)^2}{\sum\left(Y_i^* - \bar Y^*\right)^2}.$$

Ŷi* = Ŷi − Ȳ and the mean of Ŷ is Ȳ. Hence the mean of Ŷ* is 0, and
Ȳ* = Ȳ − Ȳ = 0. Hence

$$R^{2*} = \frac{\sum\hat Y_i^{*2}}{\sum Y_i^{*2}} = \frac{\sum\left(\hat Y_i - \bar Y\right)^2}{\sum\left(Y_i - \bar Y\right)^2} = R^2.$$

The reason that specification (2) of the second researcher is incorrect is
that the model does not include an intercept.
If the second researcher had fitted (3) instead of (2), this would not in fact
have affected his estimator of β2. Using (3), the researcher should have
estimated β2 as

$$b_2^* = \frac{\sum X_i^*Y_i^*}{\sum X_i^{*2}}.$$

However, Exercise 1.15 demonstrates that, effectively, he has done exactly
this. Hence the estimator will be the same. It follows that dropping the
unnecessary intercept would not have led to a gain in efficiency.
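A numerical check of these results, with made-up data: the demeaned regression reproduces the slope and the residuals of the original one.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 30
X = rng.normal(size=n)                     # made-up data
Y = 2.0 + 1.5 * X + rng.normal(size=n)
Xs, Ys = X - X.mean(), Y - Y.mean()        # the starred (demeaned) variables

def slope(x, y):
    xd = x - x.mean()
    return (xd * (y - y.mean())).sum() / (xd ** 2).sum()

b2 = slope(X, Y)
b1 = Y.mean() - b2 * X.mean()
b2s = slope(Xs, Ys)                        # regression (2); its intercept is zero
e = Y - b1 - b2 * X
es = Ys - b2s * Xs

print(np.allclose(b2, b2s), np.allclose(e, es))   # True True
```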

A2.11

$$b_2 = \frac{\sum_{i=1}^{n} X_iY_i}{\sum_{i=1}^{n} X_i^2} = \frac{\sum_{i=1}^{n} X_i\left(\beta_1 + \beta_2X_i + u_i\right)}{\sum_{i=1}^{n} X_i^2} = \frac{\beta_1\sum_{i=1}^{n} X_i}{\sum_{i=1}^{n} X_i^2} + \beta_2 + \frac{\sum_{i=1}^{n} X_iu_i}{\sum_{i=1}^{n} X_i^2}.$$

Hence

$$E(b_2) = \frac{\beta_1\sum_{i=1}^{n} X_i}{\sum_{i=1}^{n} X_i^2} + \beta_2 + E\left[\frac{\sum_{i=1}^{n} X_iu_i}{\sum_{i=1}^{n} X_i^2}\right] = \frac{\beta_1\sum_{i=1}^{n} X_i}{\sum_{i=1}^{n} X_i^2} + \beta_2 + \frac{\sum_{i=1}^{n} X_iE\left(u_i\right)}{\sum_{i=1}^{n} X_i^2}$$


assuming that X is nonstochastic. Since E(ui) = 0,

$$E(b_2) = \frac{\beta_1\sum_{i=1}^{n} X_i}{\sum_{i=1}^{n} X_i^2} + \beta_2.$$

Thus b2 will in general be a biased estimator. The sign of the bias depends
on the signs of β1 and $\sum_{i=1}^{n} X_i$. We have no information about either of
these.


∑(X )(Y )
n

i −X i −Y
b­2 is more efficient than i =1 unless X = 0 since its

∑(X )
n
2
i −X
i =1 n

σ 2 ∑ (X i −X )(Y i −Y )
u i =1
population variance is , whereas that of is
∑(X )
n n
2

∑ i =1
X i2
i =1
i −X

σ u2 σ u2 .
=
∑(X ) ∑X
n n
2 2 2
i −X i − nX
i =1 i =1

The expression for the variance of b2 has a smaller denominator if X ≠ 0 .


If X̄ = 0, the estimators are equally efficient because the population
variance expressions are identical. The reason for this is that the
estimators are now identical:

$$\frac{\sum_{i=1}^{n}\left(X_i - \bar X\right)\left(Y_i - \bar Y\right)}{\sum_{i=1}^{n}\left(X_i - \bar X\right)^2} = \frac{\sum_{i=1}^{n} X_i\left(Y_i - \bar Y\right)}{\sum_{i=1}^{n} X_i^2} = \frac{\sum_{i=1}^{n} X_iY_i}{\sum_{i=1}^{n} X_i^2} - \frac{\bar Y\sum_{i=1}^{n} X_i}{\sum_{i=1}^{n} X_i^2} = \frac{\sum_{i=1}^{n} X_iY_i}{\sum_{i=1}^{n} X_i^2}$$

since $\sum_{i=1}^{n} X_i = n\bar X = 0$.

If there is little variation in X in the sample, $\sum_{i=1}^{n}\left(X_i - \bar X\right)^2$ may be small
and hence the population variance of

$$\frac{\sum_{i=1}^{n}\left(X_i - \bar X\right)\left(Y_i - \bar Y\right)}{\sum_{i=1}^{n}\left(X_i - \bar X\right)^2}$$

may be large. Thus, using a criterion such as mean square error, b2 may be
preferable if the bias is small.
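The bias formula derived above can be confirmed by simulation (β1, β2, and X are made-up values):

```python
import numpy as np

rng = np.random.default_rng(7)
beta1, beta2 = 3.0, 0.5                    # made-up values, with beta1 != 0
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # Xbar != 0

u = rng.normal(size=(100_000, X.size))
Y = beta1 + beta2 * X + u
b2 = (X * Y).sum(axis=1) / (X ** 2).sum()  # through-the-origin estimator

# E(b2) = beta2 + beta1*sum(X)/sum(X^2), from the derivation above
expected = beta2 + beta1 * X.sum() / (X ** 2).sum()
print(b2.mean(), expected)                 # both close to 0.5 + 3*15/55
```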

A2.12
The inclusion of the fifth observation does not cause the model to be
misspecified or the regression model assumptions to be violated, so
retaining it in the sample will not give rise to biased estimates. There
would be no advantages in dropping it and there would be one major
disadvantage: $\sum_{i=1}^{n}\left(X_i - \bar X\right)^2$ would be greatly reduced and hence the
variances of the coefficients would be increased, adversely affecting the
precision of the estimates.


This said, in practice one would wish to check whether it is sensible to
assume that the model relating Y to X for the other observations really
assume that the model relating Y to X for the other observations really
does apply to the observation corresponding to X5 as well. This question
can be answered only by being familiar with the context and having some
intuitive understanding of the relationship between Y and X.
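The variance point can be made concrete with a small sketch (the X values are made up): dropping the distant observation collapses Σ(Xi − X̄)², the denominator of the variance of b2.

```python
import numpy as np

X_all = np.array([1.0, 1.2, 1.4, 1.6, 8.0])   # X1..X4 close together, X5 distant
X_drop = X_all[:-1]                            # sample with X5 removed

def ssd(x):
    # sum of squared deviations: var(b2) = sigma_u^2 divided by this quantity
    return ((x - x.mean()) ** 2).sum()

print(ssd(X_all), ssd(X_drop))                 # about 36.1 versus 0.2
```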
