Classical Linear Regression Model Assumptions and Diagnostics

Uploaded by

nuttawatv

Chapter 5

Classical linear regression model assumptions

and diagnostics

‘Introductory Econometrics for Finance’ © Chris Brooks 2019 1

Violation of the Assumptions of the CLRM

• Recall that we assumed of the CLRM disturbance terms:

1. $E(u_t) = 0$
2. $\mathrm{Var}(u_t) = \sigma^2 < \infty$
3. $\mathrm{Cov}(u_i, u_j) = 0$ for $i \neq j$
4. The X matrix is non-stochastic or fixed in repeated samples
5. $u_t \sim N(0, \sigma^2)$


Investigating Violations of the Assumptions of the CLRM

• We will now study these assumptions further, and in particular look at:
- How we test for violations
- Causes
- Consequences: in general we could encounter any combination of 3 problems:
  - the coefficient estimates are wrong
  - the associated standard errors are wrong
  - the distribution that we assumed for the test statistics will be inappropriate
- Solutions: either
  - we modify the model so that the assumptions are no longer violated, or
  - we work around the problem by using alternative techniques which are still valid


Statistical Distributions for Diagnostic Tests

• Often, an F- and a $\chi^2$-version of the test are available.

• The F-test version involves estimating a restricted and an unrestricted version of a test regression and comparing the RSS.

• The $\chi^2$-version is sometimes called an “LM” test, and only has one degree of freedom parameter: the number of restrictions being tested, m.

• Asymptotically, the 2 tests are equivalent since the $\chi^2$ is a special case of the F-distribution:

$$\frac{\chi^2(m)}{m} \rightarrow F(m, T-k) \quad \text{as } T-k \rightarrow \infty$$

• For small samples, the F-version is preferable.


Assumption 1: E(ut) = 0

• Assumption that the mean of the disturbances is zero.

• For all diagnostic tests, we cannot observe the disturbances and so we perform the tests on the residuals instead.

• The mean of the residuals will always be zero provided that there is a constant term in the regression.


Assumption 2: $\mathrm{Var}(u_t) = \sigma^2 < \infty$

• We have so far assumed that the variance of the errors is constant, $\sigma^2$ - this is known as homoscedasticity.
• If the errors do not have a constant variance, we say that they are heteroscedastic, e.g. say we estimate a regression and calculate the residuals, $\hat{u}_t$.

[Figure: residuals $\hat{u}_t$ plotted against $x_{2t}$]


Detection of Heteroscedasticity: The GQ Test

• Graphical methods
• Formal tests: There are many of them: we will discuss Goldfeld-Quandt
test and White’s test

The Goldfeld-Quandt (GQ) test is carried out as follows.

1. Split the total sample of length T into two sub-samples of length T1 and T2. The regression model is estimated on each sub-sample and the two residual variances are calculated.
2. The null hypothesis is that the variances of the disturbances are equal,
$$H_0: \sigma_1^2 = \sigma_2^2$$


The GQ Test (Cont’d)

3. The test statistic, denoted GQ, is simply the ratio of the two residual variances, where the larger of the two variances must be placed in the numerator:

$$GQ = \frac{s_1^2}{s_2^2}$$

4. The test statistic is distributed as an $F(T_1-k, T_2-k)$ under the null of homoscedasticity.

5. A problem with the test is that the choice of where to split the sample is usually arbitrary and may crucially affect the outcome of the test.

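The GQ steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated (hypothetical), the model has a single regressor plus a constant (so k = 2), and the helper names `ols_residuals` and `goldfeld_quandt` are invented for this sketch. The critical value would still need to be looked up in F tables.

```python
import random

def ols_residuals(x, y):
    """Fit y = a + b*x by OLS and return the residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]

def goldfeld_quandt(x, y, split, k=2):
    """Return (GQ statistic, (numerator df, denominator df))."""
    resid_1 = ols_residuals(x[:split], y[:split])
    resid_2 = ols_residuals(x[split:], y[split:])
    # Residual variance s^2 = RSS / (T_i - k) in each sub-sample.
    s1 = sum(e * e for e in resid_1) / (len(resid_1) - k)
    s2 = sum(e * e for e in resid_2) / (len(resid_2) - k)
    # The larger variance goes in the numerator, so GQ >= 1.
    if s1 >= s2:
        return s1 / s2, (len(resid_1) - k, len(resid_2) - k)
    return s2 / s1, (len(resid_2) - k, len(resid_1) - k)

# Hypothetical data: the error standard deviation grows with x.
random.seed(42)
x = [float(t) for t in range(1, 41)]
y = [1.0 + 2.0 * xt + random.gauss(0.0, 0.1 * xt) for xt in x]

gq, df = goldfeld_quandt(x, y, split=20)
print(f"GQ = {gq:.3f}, compare with the critical value of F{df}")
```

Moving the `split` argument changes the statistic, which is the arbitrariness noted in step 5.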

Detection of Heteroscedasticity using White’s Test

• White’s general test for heteroscedasticity is one of the best approaches because it makes few assumptions about the form of the heteroscedasticity.
• The test is carried out as follows.
1. Assume that the regression we carried out is:
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + u_t$$
and we want to test $\mathrm{Var}(u_t) = \sigma^2$. We estimate the model, obtaining the residuals, $\hat{u}_t$.

2. Then run the auxiliary regression

$$\hat{u}_t^2 = \alpha_1 + \alpha_2 x_{2t} + \alpha_3 x_{3t} + \alpha_4 x_{2t}^2 + \alpha_5 x_{3t}^2 + \alpha_6 x_{2t} x_{3t} + v_t$$

Performing White’s Test for Heteroscedasticity

3. Obtain $R^2$ from the auxiliary regression and multiply it by the number of observations, T. It can be shown that
$$T R^2 \sim \chi^2(m)$$
where m is the number of regressors in the auxiliary regression excluding the constant term.

4. If the $\chi^2$ test statistic from step 3 is greater than the corresponding value from the statistical table then reject the null hypothesis that the disturbances are homoscedastic.

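As a rough sketch of steps 1 to 4, the following Python computes the $TR^2$ statistic on simulated data. Everything here is an assumption made for illustration: the data are hypothetical, and the helpers `solve` and `ols` are a bare-bones normal-equations solver written for this sketch, not a substitute for a proper econometrics package.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ols(X, y):
    """OLS via the normal equations. Returns (coefficients, residuals, R^2)."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
    y_bar = sum(y) / len(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)
    rss = sum(e * e for e in resid)
    return beta, resid, 1.0 - rss / tss

# Hypothetical data: the error standard deviation is proportional to x2.
random.seed(0)
T = 200
x2 = [random.uniform(1.0, 5.0) for _ in range(T)]
x3 = [random.uniform(0.0, 3.0) for _ in range(T)]
y = [1.0 + 2.0 * a - 1.0 * b + random.gauss(0.0, 0.5 * a) for a, b in zip(x2, x3)]

# Step 1: original regression and its residuals.
_, resid, _ = ols([[1.0, a, b] for a, b in zip(x2, x3)], y)

# Step 2: auxiliary regression of squared residuals on levels,
# squares and the cross-product of the regressors.
Z = [[1.0, a, b, a * a, b * b, a * b] for a, b in zip(x2, x3)]
_, _, r2_aux = ols(Z, [e * e for e in resid])

# Step 3: T * R^2, with m = regressors in the auxiliary regression minus the constant.
m = len(Z[0]) - 1
white_stat = T * r2_aux
print(f"T*R^2 = {white_stat:.2f}; compare with the chi-squared({m}) critical value (11.07 at 5%)")
```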

Consequences of Using OLS in the Presence of
Heteroscedasticity

• OLS estimation still gives unbiased coefficient estimates, but they are
no longer BLUE.

• This implies that if we still use OLS in the presence of heteroscedasticity, our standard errors could be inappropriate and hence any inferences we make could be misleading.

• Whether the standard errors calculated using the usual formulae are too big or too small will depend upon the form of the heteroscedasticity.


How Do we Deal with Heteroscedasticity?

• If the form (i.e. the cause) of the heteroscedasticity is known, then we can use an estimation method which takes this into account (called generalised least squares, GLS).
• A simple illustration of GLS is as follows: Suppose that the error variance is related to another variable $z_t$ by
$$\mathrm{var}(u_t) = \sigma^2 z_t^2$$
• To remove the heteroscedasticity, divide the regression equation by $z_t$:
$$\frac{y_t}{z_t} = \beta_1 \frac{1}{z_t} + \beta_2 \frac{x_{2t}}{z_t} + \beta_3 \frac{x_{3t}}{z_t} + v_t$$
where $v_t = \frac{u_t}{z_t}$ is an error term.
• Now, for known $z_t$,
$$\mathrm{var}(v_t) = \mathrm{var}\left(\frac{u_t}{z_t}\right) = \frac{\mathrm{var}(u_t)}{z_t^2} = \frac{\sigma^2 z_t^2}{z_t^2} = \sigma^2$$


Other Approaches to Dealing
with Heteroscedasticity
• So the disturbances from the new regression equation will be
homoscedastic.

• Other solutions include:

1. Transforming the variables into logs or reducing by some other measure
of “size”.
2. Use White’s heteroscedasticity consistent standard error estimates.
The effect of using White’s correction is that in general the standard errors
for the slope coefficients are increased relative to the usual OLS standard
errors.
This makes us more “conservative” in hypothesis testing, so that we would
need more evidence against the null hypothesis before we would reject it.


Background –
The Concept of a Lagged Value

t          $y_t$    $y_{t-1}$   $\Delta y_t$
1989M09    0.8      -           -
1989M10    1.3      0.8         1.3 - 0.8 = 0.5
1989M11    -0.9     1.3         -0.9 - 1.3 = -2.2
1989M12    0.2      -0.9        0.2 - (-0.9) = 1.1
1990M01    -1.7     0.2         -1.7 - 0.2 = -1.9
1990M02    2.3      -1.7        2.3 - (-1.7) = 4.0
1990M03    0.1      2.3         0.1 - 2.3 = -2.2
1990M04    0.0      0.1         0.0 - 0.1 = -0.1
...        ...      ...         ...

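The lag and first-difference columns of the table can be reproduced with a short Python sketch. The variable names are chosen for this illustration only, and `None` stands in for the missing first entry.

```python
# Values of y_t from the table, 1989M09 to 1990M04.
dates = ["1989M09", "1989M10", "1989M11", "1989M12",
         "1990M01", "1990M02", "1990M03", "1990M04"]
y = [0.8, 1.3, -0.9, 0.2, -1.7, 2.3, 0.1, 0.0]

# Lagged value: shift the series down by one period (first entry undefined).
y_lag = [None] + y[:-1]

# First difference: y_t - y_{t-1}, rounded to one decimal place for display.
dy = [None] + [round(curr - prev, 1) for prev, curr in zip(y[:-1], y[1:])]

for row in zip(dates, y, y_lag, dy):
    print(row)
```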

Autocorrelation

• We assumed of the CLRM’s errors that $\mathrm{Cov}(u_i, u_j) = 0$ for $i \neq j$. This is essentially the same as saying there is no pattern in the errors.

• Obviously we never have the actual u’s, so we use their sample counterpart, the residuals (the $\hat{u}_t$’s).

• If there are patterns in the residuals from a model, we say that they are autocorrelated.

• Some stereotypical patterns we may find in the residuals are given on the next 3 slides.


Positive Autocorrelation

[Figure: two panels, $\hat{u}_t$ against $\hat{u}_{t-1}$ and $\hat{u}_t$ over time]

Positive autocorrelation is indicated by a cyclical residual plot over time.


Negative Autocorrelation

[Figure: two panels, $\hat{u}_t$ against $\hat{u}_{t-1}$ and $\hat{u}_t$ over time]

Negative autocorrelation is indicated by an alternating pattern where the residuals cross the time axis more frequently than if they were distributed randomly.

No pattern in residuals –
No autocorrelation
[Figure: two panels, $\hat{u}_t$ against $\hat{u}_{t-1}$ and $\hat{u}_t$ over time]

No pattern in residuals at all: this is what we would like to see


Detecting Autocorrelation:
The Durbin-Watson Test

The Durbin-Watson (DW) test is a test for first order autocorrelation, i.e. it assumes that the relationship is between an error and the previous one:
$$u_t = \rho u_{t-1} + v_t \qquad (1)$$
where $v_t \sim N(0, \sigma_v^2)$.
• The DW test statistic actually tests
$$H_0: \rho = 0 \quad \text{and} \quad H_1: \rho \neq 0$$
• The test statistic is calculated by
$$DW = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=2}^{T} \hat{u}_t^2}$$


The Durbin-Watson Test:
Critical Values
• We can also write
$$DW \approx 2(1 - \hat{\rho}) \qquad (2)$$
where $\hat{\rho}$ is the estimated correlation coefficient. Since $\hat{\rho}$ is a correlation, it implies that $-1 \le \hat{\rho} \le 1$.
• Substituting these limits for $\hat{\rho}$ into (2) gives $0 \le DW \le 4$.

• If $\hat{\rho} = 0$, DW = 2. So roughly speaking, do not reject the null hypothesis if DW is near 2, i.e. there is little evidence of autocorrelation.

• Unfortunately, DW has 2 critical values, an upper critical value ($d_U$) and a lower critical value ($d_L$), and there is also an intermediate region where we can neither reject nor not reject $H_0$.

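The DW formula and the approximation $DW \approx 2(1 - \hat{\rho})$ can be checked numerically. The sketch below is an illustration only: the "residuals" are a simulated AR(1) series with a hypothetical $\rho = 0.7$, and $\hat{\rho}$ is estimated as the slope from regressing $\hat{u}_t$ on $\hat{u}_{t-1}$ without a constant, which is one common choice and an assumption here.

```python
import random

def durbin_watson(resid):
    """DW = sum_{t=2..T} (u_t - u_{t-1})^2 / sum_{t=2..T} u_t^2."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(resid[t] ** 2 for t in range(1, len(resid)))
    return num / den

def rho_hat(resid):
    """Slope from regressing u_t on u_{t-1} with no constant."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
    return num / den

# Hypothetical positively autocorrelated series: u_t = 0.7 u_{t-1} + v_t.
random.seed(7)
rho = 0.7
u = [0.0]
for _ in range(499):
    u.append(rho * u[-1] + random.gauss(0.0, 1.0))

dw = durbin_watson(u)
r = rho_hat(u)
print(f"DW = {dw:.3f}, 2(1 - rho_hat) = {2 * (1 - r):.3f}")
```

With positive autocorrelation the statistic falls well below 2. Whether it falls below $d_L$ still has to be checked against DW tables.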

The Durbin-Watson Test: Interpreting the Results

Conditions which Must be Fulfilled for DW to be a Valid Test

1. Constant term in regression
2. Regressors are non-stochastic
3. No lags of dependent variable
Another Test for Autocorrelation:
The Breusch-Godfrey Test
• It is a more general test for rth order autocorrelation:
$$u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \rho_3 u_{t-3} + \dots + \rho_r u_{t-r} + v_t, \quad v_t \sim N(0, \sigma_v^2)$$
• The null and alternative hypotheses are:
$H_0: \rho_1 = 0$ and $\rho_2 = 0$ and ... and $\rho_r = 0$
$H_1: \rho_1 \neq 0$ or $\rho_2 \neq 0$ or ... or $\rho_r \neq 0$
• The test is carried out as follows:
1. Estimate the linear regression using OLS and obtain the residuals, $\hat{u}_t$.
2. Regress $\hat{u}_t$ on all of the regressors from stage 1 (the x’s) plus $\hat{u}_{t-1}, \hat{u}_{t-2}, \dots, \hat{u}_{t-r}$. Obtain $R^2$ from this regression.
3. It can be shown that $(T-r)R^2 \sim \chi^2(r)$
• If the test statistic exceeds the critical value from the statistical tables, reject the null hypothesis of no autocorrelation.

Consequences of Ignoring Autocorrelation
if it is Present

• The coefficient estimates derived using OLS are still unbiased, but
they are inefficient, i.e. they are not BLUE, even in large sample sizes.

• Thus, if the standard error estimates are inappropriate, there exists the
possibility that we could make the wrong inferences.

• R2 is likely to be inflated relative to its “correct” value for positively

correlated residuals.


“Remedies” for Autocorrelation

• If the form of the autocorrelation is known, we could use a GLS

procedure – i.e. an approach that allows for autocorrelated residuals
e.g., Cochrane-Orcutt.

• But such procedures that “correct” for autocorrelation require

assumptions about the form of the autocorrelation.

• If these assumptions are invalid, the cure would be more dangerous

than the disease! - see Hendry and Mizon (1978).

• However, it is unlikely to be the case that the form of the

autocorrelation is known, and a more “modern” view is that residual
autocorrelation presents an opportunity to modify the regression.


Dynamic Models

• All of the models we have considered so far have been static, e.g.
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t$$

• But we can easily extend this analysis to the case where the current value of $y_t$ depends on previous values of y or one of the x’s, e.g.
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + \gamma_1 y_{t-1} + \gamma_2 x_{2t-1} + \dots + \gamma_k x_{kt-1} + u_t$$

• We could extend the model even further by adding extra lags, e.g. $x_{2t-2}$, $y_{t-3}$.


Why Might we Want/Need To Include Lags
in a Regression?

• Inertia of the dependent variable

• Over-reactions
• Measuring time series as overlapping moving averages

• However, other problems with the regression could cause the null hypothesis
of no autocorrelation to be rejected:
– Omission of relevant variables, which are themselves autocorrelated.
– If we have committed a “misspecification” error by using an inappropriate
functional form.
– Autocorrelation resulting from unparameterised seasonality.


Models in First Difference Form

• Another way to sometimes deal with the problem of autocorrelation is to switch to a model in first differences.

• Denote the first difference of $y_t$, i.e. $y_t - y_{t-1}$, as $\Delta y_t$; similarly for the x-variables, $\Delta x_{2t} = x_{2t} - x_{2t-1}$ etc.

• The model would now be
$$\Delta y_t = \beta_1 + \beta_2 \Delta x_{2t} + \dots + \beta_k \Delta x_{kt} + u_t$$

• Sometimes the change in y is purported to depend on previous values of y or $x_t$ as well as changes in x:
$$\Delta y_t = \beta_1 + \beta_2 \Delta x_{2t} + \beta_3 x_{2t-1} + \beta_4 y_{t-1} + u_t$$


The Long Run Static Equilibrium Solution

• One interesting property of a dynamic model is its long run or static equilibrium solution.
• “Equilibrium” implies that the variables have reached some steady state and are no longer changing, i.e. if y and x are in equilibrium, we can say
$y_t = y_{t+1} = \dots = y$ and $x_t = x_{t+1} = \dots = x$
Consequently, $\Delta y_t = y_t - y_{t-1} = y - y = 0$ etc.
• So the way to obtain a long run static solution is:
1. Remove all time subscripts from variables
2. Set error terms equal to their expected values, $E(u_t) = 0$
3. Remove first difference terms altogether
4. Gather terms in x together and gather terms in y together.
• These steps can be undertaken in any order.

The Long Run Static Equilibrium Solution:
An Example

If our model is
$$\Delta y_t = \beta_1 + \beta_2 \Delta x_{2t} + \beta_3 x_{2t-1} + \beta_4 y_{t-1} + u_t$$

then the static solution would be given by

$$0 = \beta_1 + \beta_3 x_{2t-1} + \beta_4 y_{t-1}$$

$$\beta_4 y_{t-1} = -\beta_1 - \beta_3 x_{2t-1}$$

$$y = -\frac{\beta_1}{\beta_4} - \frac{\beta_3}{\beta_4} x_2$$

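A small numerical check of this solution may help. The coefficient values below are made up purely for illustration, and `long_run_solution` is a hypothetical helper name; the point is only that the resulting (x2, y) pair sets the right-hand side of the static equation to zero.

```python
def long_run_solution(beta1, beta3, beta4):
    """Return (intercept, slope) of the long-run relation y = intercept + slope * x2."""
    if beta4 == 0:
        raise ValueError("beta4 must be non-zero for a long-run solution to exist")
    return -beta1 / beta4, -beta3 / beta4

# Hypothetical coefficient values, not estimates from any data set.
beta1, beta3, beta4 = 0.5, 0.3, -0.6
intercept, slope = long_run_solution(beta1, beta3, beta4)

# Check: in equilibrium, 0 = beta1 + beta3 * x2 + beta4 * y must hold.
x2 = 2.0
y = intercept + slope * x2
check = beta1 + beta3 * x2 + beta4 * y
print(f"y = {intercept:.4f} + {slope:.4f} * x2, equilibrium check = {check:.2e}")
```

Note that $\beta_2$ does not appear: the first-difference term drops out in equilibrium.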

Problems with Adding Lagged Regressors
to “Cure” Autocorrelation

• Inclusion of lagged values of the dependent variable violates the

assumption that the RHS variables are non-stochastic.

• What does an equation with a large number of lags actually mean?

• Note that if there is still autocorrelation in the residuals of a model

including lags, then the OLS estimators will not even be consistent.


Multicollinearity

• This problem occurs when the explanatory variables are very highly correlated
with each other.

• Perfect multicollinearity
Cannot estimate all the coefficients
- e.g. suppose $x_3 = 2x_2$
and the model is $y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t$

• Problems if Near Multicollinearity is Present but Ignored

- $R^2$ will be high but the individual coefficients will have high standard errors.
- The regression becomes very sensitive to small changes in the specification.
- Thus confidence intervals for the parameters will be very wide, and significance tests might therefore give inappropriate conclusions.


Measuring Multicollinearity

• The easiest way to measure the extent of multicollinearity is simply to look at the matrix of correlations between the individual variables, e.g.

Corr   x2    x3    x4
x2     -     0.2   0.8
x3     0.2   -     0.3
x4     0.8   0.3   -

• But another problem arises if 3 or more variables are linearly related, e.g. $x_{2t} + x_{3t} = x_{4t}$: pairwise correlations need not reveal this.

• Note that high correlation between y and one of the x’s is not multicollinearity.

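The limitation of the correlation matrix can be illustrated with simulated data. In this sketch (hypothetical data, with a hand-written `pearson` helper), x4 is an exact linear combination of x2 and x3, yet no pairwise correlation is close to one.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical regressors: x2 and x3 are independent, x4 = x2 + x3 exactly.
random.seed(3)
n = 500
x2 = [random.gauss(0.0, 1.0) for _ in range(n)]
x3 = [random.gauss(0.0, 1.0) for _ in range(n)]
x4 = [a + b for a, b in zip(x2, x3)]

series = {"x2": x2, "x3": x3, "x4": x4}
names = list(series)
corr = {(i, j): pearson(series[i], series[j]) for i in names for j in names}

# Print the correlation matrix row by row.
for i in names:
    print(i, "  ".join(f"{corr[(i, j)]:6.2f}" for j in names))

max_pairwise = max(abs(corr[(i, j)]) for i in names for j in names if i != j)
print(f"largest pairwise correlation: {max_pairwise:.2f}")
```

Despite the perfect multicollinearity, the largest pairwise correlation is only around 0.7 in this setup.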

Solutions to the Problem of Multicollinearity

• “Traditional” approaches, such as ridge regression or principal

components. But these usually bring more problems than they solve.

• Some econometricians argue that if the model is otherwise OK, just

ignore it

• The easiest ways to “cure” the problems are

- drop one of the collinear variables
- transform the highly correlated variables into a ratio
- go out and collect more data e.g.
- a longer run of data
- switch to a higher frequency


Adopting the Wrong Functional Form

• We have previously assumed that the appropriate functional form is linear.

• This may not always be true.
• We can formally test this using Ramsey’s RESET test, which is a general test for mis-specification of functional form.

• Essentially the method works by adding higher order terms of the fitted values (e.g. $\hat{y}_t^2$, $\hat{y}_t^3$ etc.) into an auxiliary regression:
Regress $\hat{u}_t$ on powers of the fitted values:
$$\hat{u}_t = \beta_0 + \beta_1 \hat{y}_t^2 + \beta_2 \hat{y}_t^3 + \dots + \beta_{p-1} \hat{y}_t^p + v_t$$
Obtain $R^2$ from this regression. The test statistic is given by $TR^2$ and is distributed as a $\chi^2(p-1)$.
• So if the value of the test statistic is greater than a $\chi^2(p-1)$ critical value, then reject the null hypothesis that the functional form was correct.


But what do we do if this is the case?

• The RESET test gives us no guide as to what a better specification might be.

• One possible cause of rejection of the test is if the true model is
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{2t}^2 + \beta_4 x_{2t}^3 + u_t$$
In this case the remedy is obvious.

• Another possibility is to transform the data into logarithms. This will linearise many previously multiplicative models into additive ones:
$$y_t = A x_t^{\beta} e^{u_t} \iff \ln y_t = \alpha + \beta \ln x_t + u_t$$
where $\alpha = \ln A$.


Testing the Normality Assumption

• Why did we need to assume normality for hypothesis testing?

Testing for Departures from Normality

• The Bera-Jarque normality test

• A normal distribution is not skewed and is defined to have a
coefficient of kurtosis of 3.
• The kurtosis of the normal distribution is 3 so its excess kurtosis ($b_2 - 3$) is zero.
• Skewness and kurtosis are the (standardised) third and fourth moments of a distribution.


Normal versus Skewed Distributions

[Figure: two density plots of f(x) against x, a normal distribution and a skewed distribution]


Leptokurtic versus Normal Distribution

[Figure: density of a leptokurtic distribution compared with a normal distribution]


Testing for Normality

• Bera and Jarque formalise this by testing the residuals for normality by testing whether the coefficient of skewness and the coefficient of excess kurtosis are jointly zero.
• It can be proved that the coefficients of skewness and kurtosis can be expressed respectively as:
$$b_1 = \frac{E[u^3]}{(\sigma^2)^{3/2}} \quad \text{and} \quad b_2 = \frac{E[u^4]}{(\sigma^2)^2}$$

• The Bera-Jarque test statistic is given by
$$W = T\left[\frac{b_1^2}{6} + \frac{(b_2 - 3)^2}{24}\right] \sim \chi^2(2)$$
• We estimate $b_1$ and $b_2$ using the residuals from the OLS regression, $\hat{u}$.

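The statistic is straightforward to compute from a vector of residuals. The following Python sketch assumes sample moments that divide by T, and uses two made-up residual series: one symmetric, and the same series with a single extreme value appended. The function name `bera_jarque` is chosen for this illustration.

```python
def bera_jarque(resid):
    """Return (b1, b2, W): skewness, kurtosis and the Bera-Jarque statistic."""
    T = len(resid)
    mean = sum(resid) / T
    dev = [e - mean for e in resid]
    var = sum(d ** 2 for d in dev) / T              # sample analogue of sigma^2
    b1 = (sum(d ** 3 for d in dev) / T) / var ** 1.5
    b2 = (sum(d ** 4 for d in dev) / T) / var ** 2
    W = T * (b1 ** 2 / 6 + (b2 - 3) ** 2 / 24)
    return b1, b2, W

# Hypothetical residuals: a symmetric series, then the same series plus one outlier.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0] * 6
with_outlier = symmetric + [15.0]

b1_sym, b2_sym, w_sym = bera_jarque(symmetric)
b1_out, b2_out, w_out = bera_jarque(with_outlier)

print(f"symmetric:    b1 = {b1_sym:.3f}, b2 = {b2_sym:.3f}, W = {w_sym:.3f}")
print(f"with outlier: b1 = {b1_out:.3f}, b2 = {b2_out:.3f}, W = {w_out:.3f}")
```

W is compared with a $\chi^2(2)$ critical value (5.99 at the 5% level). A single extreme observation is enough to push the statistic far above it in this example.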

What do we do if we find evidence of Non-Normality?

• It is not obvious what we should do!

• Could use a method which does not assume normality, but difficult and
what are its properties?

• Often the case that one or two very extreme residuals cause us to reject the normality assumption.

• An alternative is to use dummy variables.

e.g. say we estimate a monthly model of asset returns from 1980-1990, and we plot the residuals, and find a particularly large outlier for October 1987:


What do we do if we find evidence
of Non-Normality? (cont’d)

[Figure: residuals $\hat{u}_t$ plotted over time, with a large outlier at October 1987]

• Create a new variable:
D87M10t = 1 during October 1987 and zero otherwise.
This effectively knocks out that observation. But we need a theoretical reason for adding dummy variables.


Omission of an Important Variable or
Inclusion of an Irrelevant Variable

Omission of an Important Variable

• Consequence: The estimated coefficients on all the other variables will be
biased and inconsistent unless the excluded variable is uncorrelated with
all the included variables.
• Even if this condition is satisfied, the estimate of the coefficient on the
constant term will be biased.
• The standard errors will also be biased.

Inclusion of an Irrelevant Variable

• Coefficient estimates will still be consistent and unbiased, but the
estimators will be inefficient.


Parameter Stability Tests

• So far, we have estimated regressions such as $y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + u_t$

• We have implicitly assumed that the parameters ($\beta_1$, $\beta_2$ and $\beta_3$) are constant for the entire sample period.

• We can test this implicit assumption using parameter stability tests. The idea is essentially to split the data into sub-periods and then to estimate up to three models, for each of the sub-parts and for all the data and then to “compare” the RSS of the models.

• There are two types of test we can look at:

- Chow test (analysis of variance test)
- Predictive failure tests


The Chow Test

• The steps involved are:

1. Split the data into two sub-periods. Estimate the regression over the whole period and then for the two sub-periods separately (3 regressions). Obtain the RSS for each regression.
2. The restricted regression is now the regression for the whole period while the “unrestricted regression” comes in two parts: for each of the sub-samples.
We can thus form an F-test which is based on the difference between the RSS’s.

The statistic is
$$\text{Test statistic} = \frac{RSS - (RSS_1 + RSS_2)}{RSS_1 + RSS_2} \times \frac{T - 2k}{k}$$


The Chow Test (cont’d)

where:
RSS = RSS for whole sample
RSS1 = RSS for sub-sample 1
RSS2 = RSS for sub-sample 2
T = number of observations
2k = number of regressors in the “unrestricted” regression (since it comes in
two parts)
k = number of regressors in (each part of the) “unrestricted” regression

3. Perform the test. If the value of the test statistic is greater than the critical
value from the F-distribution, which is an F(k, T-2k), then reject the null
hypothesis that the parameters are stable over time.


A Chow Test Example

• Consider the following regression for the CAPM $\beta$ (again) for the returns on Glaxo.

• Say that we are interested in estimating Beta for monthly data from 1981-1992. The model for each sub-period is

• 1981M1 - 1987M10
0.24 + 1.2 $R_{Mt}$     T = 82     RSS1 = 0.03555
• 1987M11 - 1992M12
0.68 + 1.53 $R_{Mt}$    T = 62     RSS2 = 0.00336
• 1981M1 - 1992M12
0.39 + 1.37 $R_{Mt}$    T = 144    RSS = 0.0434


A Chow Test Example - Results

• The null hypothesis is
$$H_0: \alpha_1 = \alpha_2 \text{ and } \beta_1 = \beta_2$$
• The unrestricted model is the model where this restriction is not imposed

$$\text{Test statistic} = \frac{0.0434 - (0.0355 + 0.00336)}{0.0355 + 0.00336} \times \frac{144 - 4}{2} = 7.698$$

Compare with 5% F(2,140) = 3.06

• We reject H0 at the 5% level and say that we reject the restriction that the coefficients are the same in the two periods.

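The Chow statistic is simple arithmetic once the three RSS values are known, as the sketch below shows (the function name `chow_statistic` is invented here). One caveat: plugging in the rounded RSS figures exactly as printed on the slide gives roughly 8.2 rather than the reported 7.698. The gap is presumably due to rounding of the published RSS values, which is an assumption on our part. Either way the statistic comfortably exceeds the 5% critical value of 3.06, so the conclusion is unchanged.

```python
def chow_statistic(rss, rss1, rss2, T, k):
    """Chow test: [(RSS - (RSS1 + RSS2)) / (RSS1 + RSS2)] * (T - 2k) / k."""
    unrestricted_rss = rss1 + rss2
    return (rss - unrestricted_rss) / unrestricted_rss * (T - 2 * k) / k

# RSS values as printed in the test-statistic line of the Glaxo example.
chow = chow_statistic(rss=0.0434, rss1=0.0355, rss2=0.00336, T=144, k=2)
critical_5pct = 3.06  # F(2, 140) at the 5% level, taken from the slide

print(f"Chow statistic = {chow:.3f}")
print("reject H0" if chow > critical_5pct else "do not reject H0")
```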

The Predictive Failure Test

• Problem with the Chow test is that we need to have enough data to do the regression on both sub-samples, i.e. $T_1 \gg k$, $T_2 \gg k$.
• An alternative formulation is the predictive failure test.
• What we do with the predictive failure test is estimate the regression over a “long” sub-period (i.e. most of the data) and then we predict values for the other period and compare the two.
To calculate the test:
- Run the regression for the whole period (the restricted regression) and obtain the RSS
- Run the regression for the “large” sub-period and obtain the RSS (called RSS1). Note we call the number of observations T1 (even though it may come second).

$$\text{Test statistic} = \frac{RSS - RSS_1}{RSS_1} \times \frac{T_1 - k}{T_2}$$
where T2 = number of observations we are attempting to “predict”. The test statistic will follow an $F(T_2, T_1 - k)$.


Backwards versus Forwards Predictive Failure Tests

• There are 2 types of predictive failure tests:

- Forward predictive failure tests, where we keep the last few

observations back for forecast testing, e.g. we have observations for
1970Q1-1994Q4. So estimate the model over 1970Q1-1993Q4 and
forecast 1994Q1-1994Q4.

- Backward predictive failure tests, where we attempt to “back-cast”

the first few observations, e.g. if we have data for 1970Q1-1994Q4,
and we estimate the model over 1971Q1-1994Q4 and backcast
1970Q1-1970Q4.

Predictive Failure Tests – An Example

• We have the following models estimated for the CAPM $\beta$ on Glaxo.
• 1980M1-1991M12
0.39 + 1.37 $R_{Mt}$    T = 144     RSS = 0.0434
• 1980M1-1989M12
0.32 + 1.31 $R_{Mt}$    T1 = 120    RSS1 = 0.0420
Can this regression adequately “forecast” the values for the last two years?

$$\text{Test statistic} = \frac{0.0434 - 0.0420}{0.0420} \times \frac{120 - 2}{24} = 0.164$$
• Compare with F(24,118) = 1.66.
So we do not reject the null hypothesis that the model can adequately predict the last few observations.
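
The arithmetic of this example can be reproduced with a short Python sketch. The function name `predictive_failure_statistic` is hypothetical; the inputs are the figures quoted above.

```python
def predictive_failure_statistic(rss, rss1, t1, t2, k):
    """Predictive failure test: [(RSS - RSS1) / RSS1] * (T1 - k) / T2."""
    return (rss - rss1) / rss1 * (t1 - k) / t2

# Figures from the Glaxo example: 120 observations used for estimation,
# 24 observations held back, k = 2 parameters.
pf_stat = predictive_failure_statistic(rss=0.0434, rss1=0.0420, t1=120, t2=24, k=2)
critical_value = 1.66  # F(24, 118) critical value as quoted in the example

print(f"Predictive failure statistic = {pf_stat:.3f}")
print("reject H0" if pf_stat > critical_value else "do not reject H0")
```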

How do we decide the sub-parts to use?
• As a rule of thumb, we could use all or some of the following:
- Plot the dependent variable over time and split the data according to any obvious structural changes in the series, e.g.

[Figure: value of series ($y_t$) plotted against sample period]

- Split the data according to any known important historical events (e.g. stock market crash, new government elected)

- Use all but the last few observations and do a predictive failure test on those.

Measurement Errors

• If there is measurement error in one or more of the explanatory variables, this will
violate the assumption that the explanatory variables are non-stochastic
• Sometimes this is also known as the errors-in-variables problem
• Measurement errors can occur in a variety of circumstances, e.g.
– Macroeconomic variables are almost always estimated quantities (GDP,
inflation, and so on), as is most information contained in company accounts
– Sometimes we cannot observe or obtain data on a variable we require and so we
need to use a proxy variable – for instance, many models include expected
quantities (e.g., expected inflation) but we cannot typically measure expectations.

Measurement Error in the Explanatory Variable(s)

• Suppose that we wish to estimate a model containing just one explanatory variable, $x_t$:
$$y_t = \beta_1 + \beta_2 x_t + u_t$$
where $u_t$ is a disturbance term
• Suppose further that $x_t$ is measured with error so that instead of observing its true value, we observe a noisy version, $\tilde{x}_t$, that comprises the actual $x_t$ plus some additional noise, $v_t$, that is independent of $x_t$ and $u_t$:
$$\tilde{x}_t = x_t + v_t$$
• Taking the first equation and substituting in for $x_t$ from the second:
$$y_t = \beta_1 + \beta_2 (\tilde{x}_t - v_t) + u_t$$
• We can rewrite this equation by separately expressing the composite error term, $(u_t - \beta_2 v_t)$:
$$y_t = \beta_1 + \beta_2 \tilde{x}_t + (u_t - \beta_2 v_t)$$

Measurement Error in the Explanatory Variable(s)

• It should be clear from this equation and the one for the explanatory variable measured with error that the noisy explanatory variable and the composite error term, $(u_t - \beta_2 v_t)$, are correlated since both depend on $v_t$
• Thus the requirement that the explanatory variables are non-stochastic does not hold
• This causes the parameters to be estimated inconsistently
• The size of the bias in the estimates will be a function of the variance of the noise in $x_t$ as a proportion of the overall disturbance variance
• If $\beta_2$ is positive, the bias will be negative but if $\beta_2$ is negative, the bias will be positive
• So the parameter estimate will always be biased towards zero as a result of the measurement noise.

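A small simulation makes the bias towards zero visible. All values below are hypothetical: the true slope is set to 2, and the true regressor, the disturbance and the measurement noise each have unit variance. Under the classical errors-in-variables result, the slope estimated from the noisy regressor should then settle near $\beta_2 \times \frac{\mathrm{var}(x)}{\mathrm{var}(x) + \mathrm{var}(v)} = 1$.

```python
import random

def ols_slope(x, y):
    """Slope coefficient from a simple OLS regression of y on a constant and x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

random.seed(1)
n = 20000
beta1, beta2 = 1.0, 2.0  # hypothetical true parameters

x_true = [random.gauss(0.0, 1.0) for _ in range(n)]  # true regressor
u = [random.gauss(0.0, 1.0) for _ in range(n)]       # disturbance
v = [random.gauss(0.0, 1.0) for _ in range(n)]       # measurement noise

y = [beta1 + beta2 * xt + ut for xt, ut in zip(x_true, u)]
x_obs = [xt + vt for xt, vt in zip(x_true, v)]        # what we actually observe

slope_true = ols_slope(x_true, y)   # regression on the correctly measured x
slope_noisy = ols_slope(x_obs, y)   # regression on the mismeasured x

print(f"slope using true x:  {slope_true:.3f}")
print(f"slope using noisy x: {slope_noisy:.3f}")
```

Increasing the variance of `v` relative to that of `x_true` pulls the second estimate further towards zero.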
Measurement Error and Tests of the CAPM

• The standard approach to testing the CAPM pioneered by Fama and MacBeth
(1973) comprises two stages
• Since the betas are estimated at the first stage rather than being directly observable,
they will surely contain measurement error
• The effect of this has sometimes been termed attenuation bias.
• Tests of the CAPM showed that the relationship between beta and returns was
smaller than expected, and this is precisely what would happen as a result of
measurement error
• Various approaches to solving this issue have been proposed, the most common of
which is to use portfolio betas in place of individual betas
• An alternative approach (Shanken,1992) is to modify the standard errors in the
second stage regression to adjust directly for the measurement errors.

Measurement Error in the Explained Variable

• Measurement error in the explained variable is much less serious than in the
explanatory variable(s)
• This is one of the motivations for the inclusion of the disturbance term in a
regression model
• When the explained variable is measured with error, the disturbance term will in
effect be a composite of the usual disturbance term and another source of noise from
the measurement error
• Then the parameter estimates will still be consistent and unbiased and the usual
formulae for calculating standard errors will still be appropriate
• The only consequence is that the additional noise means that the standard errors will
be enlarged relative to the situation where there was no measurement error in y.

A Strategy for Building Econometric Models

Our Objective:
• To build a statistically adequate empirical model which
- satisfies the assumptions of the CLRM
- is parsimonious
- has the appropriate theoretical interpretation
- has the right “shape” - i.e.
- all signs on coefficients are “correct”
- all sizes of coefficients are “correct”
- is capable of explaining the results of all competing models

2 Approaches to Building Econometric Models
• There are 2 popular philosophies of building econometric models: the “specific-
to-general” and “general-to-specific” approaches.

• “Specific-to-general” was used almost universally until the mid 1980’s, and
involved starting with the simplest model and gradually adding to it.

• Little, if any, diagnostic testing was undertaken. But this meant that all inferences
were potentially invalid.

• An alternative and more modern approach to model building is the “LSE” or

Hendry “general-to-specific” methodology.

• The advantages of this approach are that it is statistically sensible and also the
theory on which the models are based usually has nothing to say about the lag
structure of a model.

The General-to-Specific Approach

• First step is to form a “large” model with lots of variables on the right hand side
• This is known as a GUM (generalised unrestricted model)
• At this stage, we want to make sure that the model satisfies all of the
assumptions of the CLRM
• If the assumptions are violated, we need to take appropriate actions to remedy
this, e.g.
- taking logs
- adding lags
- dummy variables
• We need to do this before testing hypotheses
• Once we have a model which satisfies the assumptions, it could be very big
with lots of lags & independent variables

The General-to-Specific Approach:
Reparameterising the Model

• The next stage is to reparameterise the model by

- knocking out very insignificant regressors
- some coefficients may be insignificantly different from each other,
so we can combine them.

• At each stage, we need to check the assumptions are still OK.

• Hopefully at this stage, we have a statistically adequate empirical model which
we can use for
- testing underlying financial theories
- forecasting future values of the dependent variable
- formulating policies, etc.
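
The “knocking out” step can be illustrated with a mechanical backward-elimination loop. This is a deliberately simplified sketch under stated assumptions: plain OLS written from scratch, a hypothetical data set, an arbitrary cut-off of |t| < 2, and no re-running of the diagnostic tests or combining of coefficients, both of which the approach described above calls for. The function names (`ols`, `general_to_specific`) are illustrative and not taken from any library.

```python
import math


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(X, y):
    """Return (coefficients, t-ratios) for y = X b + u."""
    T, k = len(X), len(X[0])
    xtx = [[sum(X[t][i] * X[t][j] for t in range(T)) for j in range(k)]
           for i in range(k)]
    xty = [sum(X[t][i] * y[t] for t in range(T)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[t] - sum(X[t][j] * beta[j] for j in range(k)) for t in range(T)]
    s2 = sum(u * u for u in resid) / (T - k)      # estimated error variance
    t_ratios = [beta[i] / math.sqrt(s2 * inv[i][i]) for i in range(k)]
    return beta, t_ratios


def general_to_specific(data, y, names, t_crit=2.0):
    """Drop the least significant regressor, one at a time, until every
    remaining |t| >= t_crit. names[0] (the intercept) is always kept."""
    keep = list(names)
    while True:
        X = [[row[n] for n in keep] for row in data]
        beta, t = ols(X, y)
        candidates = [(abs(t[i]), keep[i]) for i in range(1, len(keep))]
        if not candidates or min(candidates)[0] >= t_crit:
            return dict(zip(keep, beta))
        keep.remove(min(candidates)[1])


# Hypothetical data: y depends on x1 only; x2 is irrelevant by construction.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.1]
data = [{"const": 1.0, "x1": a, "x2": b} for a, b in zip(x1, x2)]
y = [1.0 + 2.0 * a + e for a, e in zip(x1, noise)]

final = general_to_specific(data, y, ["const", "x1", "x2"])
print(sorted(final))
```

In practice each pass would be followed by the diagnostic checks, so a regressor is only removed if the restricted model still satisfies the CLRM assumptions.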

Regression Analysis In Practice - A Further Example:
Determinants of Sovereign Credit Ratings
• Cantor and Packer (1996)
Financial background:
• What are sovereign credit ratings and why are we interested in them?

• Two ratings agencies (Moody’s and Standard and Poor’s) provide credit
ratings for many governments.

• Each possible rating is denoted by a grading:

Moody’s    Standard and Poor’s
Aaa        AAA
…          …
B3         B-

Purposes of the Paper

- to attempt to explain and model how the ratings agencies arrived at
their ratings.

- to use the same factors to explain the spreads of sovereign yields
above a risk-free proxy

- to determine what factors affect how the sovereign yields react to
ratings announcements

Determinants of Sovereign Ratings

• Data
Quantifying the ratings (dependent variable): Aaa/AAA=16, ... , B3/B-=1
• Explanatory variables (units of measurement):
- Per capita income in 1994 (thousands of dollars)
- Average annual GDP growth 1991-1994 (%)
- Average annual inflation 1992-1994 (%)
- Fiscal balance: Average annual government budget surplus as a
proportion of GDP 1992-1994 (%)
- External balance: Average annual current account surplus as a proportion
of GDP 1992-1994 (%)
- External debt: foreign currency debt as a proportion of exports in 1994 (%)
- Dummy for economic development
- Dummy for default history
Income and inflation are transformed to their logarithms.
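
The numerical coding of the ratings can be written down explicitly. The sketch below assumes the two agencies' standard 16-grade ladders from Aaa/AAA down to B3/B- (the slide shows only the endpoints) and assumes that the “average rating” is the simple mean of the two agencies' scores. Natural logarithms are also an assumption, since the slide does not state the base, and the example country values are hypothetical.

```python
import math

# Standard long-term rating ladders, best to worst.
# Assumed here: the slide itself lists only Aaa/AAA and B3/B-.
MOODYS = ["Aaa", "Aa1", "Aa2", "Aa3", "A1", "A2", "A3",
          "Baa1", "Baa2", "Baa3", "Ba1", "Ba2", "Ba3", "B1", "B2", "B3"]
SP = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-",
      "BBB+", "BBB", "BBB-", "BB+", "BB", "BB-", "B+", "B", "B-"]


def score(grade, ladder):
    """Map a grade to 16 (top of the ladder) ... 1 (bottom)."""
    return len(ladder) - ladder.index(grade)


def average_rating(moodys_grade, sp_grade):
    """Assumed definition: simple mean of the two agencies' scores."""
    return (score(moodys_grade, MOODYS) + score(sp_grade, SP)) / 2


# Hypothetical country (illustration only)
y = average_rating("Baa1", "BBB")     # scores 9 and 8
log_income = math.log(12.5)           # per capita income, thousands of dollars
log_inflation = math.log(4.0)         # average annual inflation, %
print(y, round(log_income, 3), round(log_inflation, 3))
```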

The model: Linear and estimated using OLS

                                Dependent variable
Explanatory variable  Expected  Average     Moody’s     S&P         Moody’s / S&P
                      sign      rating      rating      rating      difference
Intercept             ?         1.442       3.408       -0.524      3.932**
                                (0.663)     (1.379)     (-0.223)    (2.521)
Per capita income     +         1.242***    1.027***    1.458***    -0.431***
                                (5.302)     (4.041)     (6.048)     (-2.688)
GDP growth            +         0.151       0.130       0.171**     -0.040
                                (1.935)     (1.545)     (2.132)     (-0.756)
Inflation             -         -0.611***   -0.630***   -0.591***   -0.039
                                (-2.839)    (-2.701)    (-2.671)    (-0.265)
Fiscal balance        +         0.073       0.049       0.097*      -0.048
                                (1.324)     (0.818)     (1.71)      (-1.274)
External balance      +         0.003       0.006       0.001       0.006
                                (0.314)     (0.535)     (0.046)     (0.779)
External debt         -         -0.013***   -0.015***   -0.011***   -0.004***
                                (-5.088)    (-5.365)    (-4.236)    (-2.133)
Development dummy     +         2.776***    2.957***    2.595***    0.362
                                (4.25)      (4.175)     (3.861)     (0.81)
Default dummy         -         -2.042***   -1.63**     -2.622***   1.159***
                                (-3.175)    (-2.097)    (-3.962)    (2.632)
Adjusted R2                     0.924       0.905       0.926       0.836
Notes: t-ratios in parentheses; *, **, and *** indicate significance at the 10%, 5% and 1% levels
respectively. Source: Cantor and Packer (1996). Reprinted with permission from Institutional Investor.

Interpreting the Model

From a statistical perspective

• Virtually no diagnostics
• Adjusted R2 is high
• Look at the residuals: actual rating - fitted rating

From a financial perspective

• Do the coefficients have their expected signs and sizes?
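
One way to judge the sizes of the coefficients is to translate them into rating notches. The sketch below uses the “Average rating” estimates reported in the table and assumes natural logarithms for income and inflation (the slides do not state the base); the function name and the chosen scenarios are illustrative.

```python
import math

# Selected coefficients from the "Average rating" column of the results table
COEFS = {
    "log_income": 1.242,
    "gdp_growth": 0.151,
    "log_inflation": -0.611,
    "external_debt": -0.013,
    "default": -2.042,
}


def rating_change(changes):
    """Change in the fitted rating (in notches) implied by the given
    changes in the regressors, holding everything else fixed."""
    return sum(COEFS[name] * delta for name, delta in changes.items())


# Doubling a logged variable changes its log by ln(2), whatever the units.
doubling_income = rating_change({"log_income": math.log(2)})
doubling_inflation = rating_change({"log_inflation": math.log(2)})
ten_points_more_debt = rating_change({"external_debt": 10})   # +10 percentage points
default_history = rating_change({"default": 1})               # dummy switches 0 -> 1

for label, value in [("income doubles", doubling_income),
                     ("inflation doubles", doubling_inflation),
                     ("external debt +10 points", ten_points_more_debt),
                     ("default history", default_history)]:
    print(f"{label}: {value:+.2f} notches")
```

Under these assumptions, doubling per capita income raises the fitted rating by a little under one notch, while a history of default lowers it by about two notches.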

Do Ratings Add to Publicly Available Information?

• The dependent variable is now
- Log (yield on the sovereign bond - yield on a US Treasury bond)
Do Ratings Add to Publicly Available Information? Results

Dependent Variable: Log (yield spread)

Variable              Expected  (1)         (2)         (3)
                      sign
Intercept             ?         2.105***    0.466       0.074
                                (16.148)    (0.345)     (0.071)
Average rating        -         -0.221***               -0.218***
                                (-19.175)               (-4.276)
Per capita income     -                     -0.144      0.226
                                            (-0.927)    (1.523)
GDP growth            -                     -0.004      0.029
                                            (-0.142)    (1.227)
Inflation             +                     0.108       -0.004
                                            (1.393)     (-0.068)
Fiscal balance        -                     -0.037      -0.02
                                            (-1.557)    (-1.045)
External balance      -                     -0.038      -0.023
                                            (-1.29)     (-1.008)
External debt         +                     0.003***    0.000
                                            (2.651)     (0.095)
Development dummy     -                     -0.723***   -0.38
                                            (-2.059)    (-1.341)
Default dummy         +                     0.612***    0.085
                                            (2.577)     (0.385)
Adjusted R2                     0.919       0.857       0.914
Notes: t-ratios in parentheses; *, **, and *** indicate significance at the 10%, 5% and 1% levels
respectively. Source: Cantor and Packer (1996). Reprinted with permission from Institutional Investor.
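
Because the dependent variable is in logs, the coefficient on the average rating has a percentage interpretation. A minimal sketch, assuming natural logarithms (the slides do not state the base) and using the column (1) estimate of -0.221; the function name is illustrative.

```python
import math


def pct_change_in_spread(coef, delta=1.0):
    """Percentage change in the yield spread implied by a change of
    `delta` in a regressor when the dependent variable is log(spread)."""
    return (math.exp(coef * delta) - 1.0) * 100.0


# One-unit rise in the average rating (one notch better on the 1-16 scale)
one_notch_upgrade = pct_change_in_spread(-0.221)
print(f"{one_notch_upgrade:.1f}%")
```

Under these assumptions, a one-notch higher average rating is associated with a yield spread roughly 20% lower.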
What Determines How the Market Reacts
to Ratings Announcements?

• The sample: every announcement of a ratings change that occurred
between 1987 and 1994 - 79 such announcements spread over 18
countries.

• 39 were actual ratings changes

• 40 were “watchlist / outlook” changes

• The dependent variable: changes in the relative spreads over the US
T-bond over a 2-day period at the time of the announcement.
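
The construction of this dependent variable can be sketched as follows. Two things are assumed here because the slides do not spell them out: that the relative spread is (sovereign yield - Treasury yield) / Treasury yield, and that the change is measured as the difference in the log of that relative spread across the announcement window. The yields and function names are hypothetical.

```python
import math


def relative_spread(sovereign_yield, treasury_yield):
    """Assumed definition: spread expressed as a fraction of the Treasury yield."""
    return (sovereign_yield - treasury_yield) / treasury_yield


def log_spread_change(before, after):
    """Change in the log relative spread across the announcement window.
    `before` and `after` are (sovereign_yield, treasury_yield) pairs."""
    return math.log(relative_spread(*after)) - math.log(relative_spread(*before))


# Hypothetical yields (%) at the start and end of the 2-day window
change = log_spread_change(before=(9.0, 6.0), after=(9.3, 6.0))
print(round(change, 4))
```

A positive value means the sovereign's spread widened over the window, which is the direction one would expect after a negative announcement.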

What Determines How the Market Reacts
to Ratings Announcements? Explanatory variables.

0/1 dummies for
- Whether the announcement was positive
- Whether there was an actual ratings change
- Whether the bond was speculative grade
- Whether there had been another ratings announcement in the previous 60 days
and
- The change in the spread over the previous 60 days
- The ratings gap between the announcing and the other agency

What Determines How the Market Reacts
to Ratings Announcements? Results

Dependent Variable: Log Relative Spread

Independent variable                                 Coefficient  (t-ratio)
Intercept                                            -0.02        (-1.4)
Positive announcements                               0.01         (0.34)
Ratings changes                                      -0.01        (-0.37)
Moody’s announcements                                0.02         (1.51)
Speculative grade                                    0.03**       (2.33)
Change in relative spreads from day -60 to day -1    -0.06        (-1.1)
Rating gap                                           0.03*        (1.7)
Other rating announcements from day -60 to day -1    0.05**       (2.15)
Adjusted R2                                          0.12
Note: * and ** denote significance at the 10% and 5% levels respectively. Source: Cantor and Packer
(1996). Reprinted with permission from Institutional Investor.

Conclusions

• Six factors appear to play a big role in determining sovereign credit
ratings: incomes, GDP growth, inflation, external debt, industrialised
or not, and default history.

• The ratings provide more information on yields than all of the macro
factors put together.

• It is hard to determine which factors influence how the markets will
react to ratings announcements.

Comments on the Paper

• Only 49 observations for the first set of regressions and 35 for the yield
regressions, with up to 10 regressors

• No attempt at reparameterisation

• Little attempt at diagnostic checking

• Where did the factors (explanatory variables) come from?
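
The sample-size concern can be made concrete by counting degrees of freedom. A small sketch, assuming each ratings regression uses all 49 observations with 9 estimated parameters (8 regressors plus an intercept) and the largest yield regression uses 35 observations with 10 parameters; the helper names are illustrative. It also backs out the unadjusted R2 from a reported adjusted R2 using adjusted R2 = 1 - (1 - R2)(T - 1)/(T - k).

```python
def degrees_of_freedom(n_obs, n_params):
    """T - k: observations minus estimated parameters (including the intercept)."""
    return n_obs - n_params


def r2_from_adjusted(adj_r2, n_obs, n_params):
    """Invert adj R2 = 1 - (1 - R2) * (T - 1) / (T - k) to recover R2."""
    return 1.0 - (1.0 - adj_r2) * (n_obs - n_params) / (n_obs - 1)


ratings_df = degrees_of_freedom(49, 9)     # ratings regressions
yields_df = degrees_of_freedom(35, 10)     # yield regression, column (3)
ratings_r2 = r2_from_adjusted(0.924, 49, 9)

print(ratings_df, yields_df, round(ratings_r2, 3))
```

With only 25 degrees of freedom in the largest yield regression, individual coefficients are estimated imprecisely, which is one reason the lack of diagnostic checking matters.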
