
ICMA Centre, University of Reading

Quantitative Methods for Finance


Solutions to Exercise 4: Assumptions of the CLRM

1. The t-ratios for the coefficients in this model are given in the third row after the
standard errors. They are calculated by dividing the individual coefficients by their
standard errors.

$$\hat{y}_t = 0.638 + 0.402\,x_{2t} - 0.891\,x_{3t}, \qquad R^2 = 0.96, \quad \bar{R}^2 = 0.89$$

standard errors:  (0.436)   (0.291)   (0.763)
t-ratios:           1.46      1.38     -1.17
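As a quick check (a minimal sketch in Python; the numbers are simply the coefficients and standard errors quoted above), the t-ratios can be reproduced directly:

```python
import numpy as np

# Coefficient estimates and standard errors quoted in the question
coefs = np.array([0.638, 0.402, -0.891])
std_errs = np.array([0.436, 0.291, 0.763])

# t-ratio = coefficient / standard error
print(np.round(coefs / std_errs, 2))  # [ 1.46  1.38 -1.17]
```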
The problem appears to be that the regression parameters are all individually
insignificant (i.e. not significantly different from zero), although the value of R2 and
its adjusted version are both very high, so that the regression taken as a whole
seems to indicate a good fit. This looks like a classic example of what we term near
multicollinearity. This is where the individual regressors are very closely related, so
that it becomes difficult to disentangle the effect of each individual variable upon
the dependent variable.

The solution to near multicollinearity that is usually suggested is that since the
problem is really one of insufficient information in the sample to determine each of
the coefficients, then one should go out and get more data. In other words, we
should switch to a higher frequency of data for analysis (e.g. weekly instead of
monthly, monthly instead of quarterly etc.). An alternative is also to get more data
by using a longer sample period (i.e. one going further back in time), or to combine
the two independent variables in a ratio (e.g. x2t / x3t ).
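Before adopting any of these remedies, it is worth confirming the diagnosis. Below is a minimal sketch in Python (the data are simulated purely for illustration; the variable names x2 and x3 are hypothetical) of inspecting the correlation between the regressors and their variance inflation factors using statsmodels:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 100

# x3 is deliberately constructed to be almost a linear function of x2,
# so the two regressors are nearly collinear
x2 = rng.normal(size=n)
x3 = 0.95 * x2 + 0.05 * rng.normal(size=n)
y = 1.0 + 0.5 * x2 - 0.3 * x3 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x2, x3]))

# Pairwise correlation between the regressors
print("corr(x2, x3):", np.corrcoef(x2, x3)[0, 1])

# Variance inflation factors for x2 and x3 (columns 1 and 2 of X)
for idx, name in [(1, "x2"), (2, "x3")]:
    print(f"VIF({name}):", variance_inflation_factor(X, idx))
```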

Other, more ad hoc methods for dealing with the possible existence of near
multicollinearity were discussed in the lectures:
- Ignore it: if the model is otherwise adequate, i.e. statistically and in terms of
each coefficient being of a plausible magnitude and having an appropriate sign.
Sometimes, the existence of multicollinearity does not reduce the t-ratios on
variables that would have been significant without the multicollinearity
sufficiently to make them insignificant. It is worth stating that the presence of
near multicollinearity does not affect the BLUE properties of the OLS estimator
– i.e. it will still be consistent, unbiased and efficient since the presence of near
multicollinearity does not violate any of the CLRM assumptions 1-4. However,
in the presence of near multicollinearity, it will be hard to obtain small standard
errors. This will not matter if the aim of the model-building exercise is to
produce forecasts from the estimated model, since the forecasts will be
unaffected by the presence of near multicollinearity so long as this relationship
between the explanatory variables continues to hold over the forecasted sample.
- Drop one of the collinear variables - so that the problem disappears. However,
this may be unacceptable to the researcher if there were strong a priori
theoretical reasons for including both variables in the model. Also, if the

removed variable was relevant in the data generating process for y, an omitted
variable bias would result (see Section 4.12).
- Transform the highly correlated variables into a ratio and include only the ratio,
and not the individual variables, in the regression (see the sketch after this list). Again, this may be
unacceptable if financial theory suggests that changes in the dependent variable
should occur following changes in the individual explanatory variables, and not
a ratio of them.
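As an illustration of the ratio transformation (a minimal, self-contained sketch on simulated data; the variable names are hypothetical and not taken from the exercise):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 100

# Two positive, nearly collinear regressors (illustrative data only)
x2 = rng.uniform(1.0, 10.0, size=n)
x3 = 0.95 * x2 + rng.uniform(0.0, 0.5, size=n)
y = 1.0 + 0.5 * x2 - 0.3 * x3 + rng.normal(size=n)

# Regress y on the ratio x2/x3 instead of on x2 and x3 separately
ratio = x2 / x3
res = sm.OLS(y, sm.add_constant(ratio)).fit()
print(res.params, res.tvalues)
```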

2. (i) The assumption of homoscedasticity is that the variance of the errors is
constant and finite over time. Technically, we write $\mathrm{var}(u_t) = \sigma_u^2 < \infty$.
(ii) The coefficient estimates would still be the “correct” ones (assuming
that the other assumptions for OLS optimality are not violated), but the problem
would be that the standard errors could be wrong. Hence if we were trying to test
hypotheses about the true parameter values, we could end up drawing the wrong
conclusions. In fact, for all of the variables except the constant, the standard errors
would typically be too small, so that we would end up rejecting the null hypothesis
too many times.
(iii) There are a number of ways to proceed in practice, including
- Using heteroscedasticity-robust standard errors, which correct for the problem by
enlarging the standard errors relative to what they would otherwise have been, for the
situation where the error variance is positively related to one of the explanatory
variables (see the sketch after this list).
- Transforming the data into logs, which has the effect of reducing the effect of
large errors relative to small ones.
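A minimal sketch of the first remedy (in Python with statsmodels, on simulated heteroscedastic data; names and numbers are illustrative only), comparing ordinary and White's heteroscedasticity-robust standard errors:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(1.0, 10.0, size=n)

# Error variance grows with x, so the homoscedasticity assumption is violated
u = rng.normal(scale=0.5 * x)
y = 2.0 + 1.5 * x + u

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()
print("Ordinary standard errors:     ", res.bse)

# White's heteroscedasticity-consistent covariance estimator
res_robust = res.get_robustcov_results(cov_type="HC1")
print("Heteroscedasticity-robust SEs:", res_robust.bse)
```

In this setup the robust standard errors are typically larger than the ordinary ones, consistent with the point made in part (ii) above.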

3. (i) This is where there is a relationship between the ith and jth residuals (for i ≠ j),
i.e. $\mathrm{cov}(u_i, u_j) \neq 0$. Recall that one of the assumptions of the CLRM was that such a relationship did not
exist. We want our residuals to be random, and if there is evidence of
autocorrelation in the residuals, then it implies that we could predict the sign of the
next residual and get the right answer more than half the time on average!
(ii) The Durbin Watson test is a test for first order autocorrelation. The test
is calculated as follows. You would run whatever regression you were interested in,
and obtain the residuals. Then calculate the statistic

$$DW = \frac{\sum_{t=2}^{T} \left(\hat{u}_t - \hat{u}_{t-1}\right)^2}{\sum_{t=2}^{T} \hat{u}_t^2}$$
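A minimal sketch in Python (the regression is simulated and illustrative only) of computing the statistic from the OLS residuals, alongside statsmodels' built-in durbin_watson function (which sums the denominator over all T residuals, so it differs only marginally in finite samples):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=(n, 2))

# Build errors with first-order autocorrelation so the DW statistic is informative
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + x @ np.array([0.5, -0.8]) + u

res = sm.OLS(y, sm.add_constant(x)).fit()
uhat = res.resid

# DW statistic exactly as defined above (both sums run from t = 2)
dw_manual = np.sum(np.diff(uhat) ** 2) / np.sum(uhat[1:] ** 2)
print("manual DW      :", dw_manual)
print("statsmodels DW :", durbin_watson(uhat))
```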

You would then need to look up the two critical values from the Durbin Watson
tables, and these depend on the number of observations and the number of regressors
(excluding the constant this time) in the model. The rejection / non-rejection rule is
given by selecting the appropriate region of the usual Durbin Watson diagram: reject
the null hypothesis of no autocorrelation in favour of positive autocorrelation if
DW < d_L; the test is inconclusive if d_L ≤ DW ≤ d_U; do not reject the null if
d_U < DW < 4 − d_U; the test is again inconclusive if 4 − d_U ≤ DW ≤ 4 − d_L; and
reject the null in favour of negative autocorrelation if DW > 4 − d_L.
(iii) We have 60 observations, and the number of regressors excluding the
constant term is 3. The appropriate lower and upper limits are 1.32 and 1.52
respectively, so the Durbin Watson is lower than the lower limit. It is thus clear that
we reject the null hypothesis of no autocorrelation. So it looks like the residuals are
positively autocorrelated.
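A minimal sketch of this decision rule in Python; the DW value below is a hypothetical placeholder (the value from the exercise is not reproduced here), while the critical values are those quoted above:

```python
def dw_decision(dw: float, d_lower: float, d_upper: float) -> str:
    """Map a Durbin-Watson statistic onto the standard rejection regions."""
    if dw < d_lower:
        return "reject H0: evidence of positive autocorrelation"
    if dw <= d_upper:
        return "inconclusive"
    if dw < 4 - d_upper:
        return "do not reject H0: no evidence of autocorrelation"
    if dw <= 4 - d_lower:
        return "inconclusive"
    return "reject H0: evidence of negative autocorrelation"

# Critical values for 60 observations and 3 regressors (excluding the constant);
# dw_value is a hypothetical placeholder, not the value from the exercise
dw_value = 1.10
print(dw_decision(dw_value, d_lower=1.32, d_upper=1.52))
```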
(iv) $\Delta y_t = \beta_1 + \beta_2 \Delta X_{2t} + \beta_3 \Delta X_{3t} + \beta_4 \Delta X_{4t} + u_t$
The problem with a model entirely in first differences is that once we calculate the
long run solution, all the first difference terms drop out (as in the long run we
assume that the values of all variables have converged on their own long run values
so that yt = yt-1 etc.) Thus when we try to calculate the long run solution to this
model, we cannot do it because there isn’t a long run solution to this model!
(v)
$\Delta y_t = \beta_1 + \beta_2 \Delta X_{2t} + \beta_3 \Delta X_{3t} + \beta_4 \Delta X_{4t} + \beta_5 X_{2,t-1} + \beta_6 X_{3,t-1} + \beta_7 X_{4,t-1} + v_t$
The answer is yes, there is no reason why we cannot use Durbin Watson in this
case. You may have said no here because there are lagged values of the regressor
(X) variables in the regression. In fact this would be wrong since there are no lags of
the DEPENDENT variable ( y ) as an explanatory variable in the model. Hence the
Durbin Watson statistic can still be used.

4. $\Delta y_t = \beta_1 + \beta_2 \Delta X_{2t} + \beta_3 \Delta X_{3t} + \beta_4 y_{t-1} + \beta_5 X_{2,t-1} + \beta_6 X_{3,t-1} + \beta_7 X_{3,t-4} + u_t$

The major steps involved in calculating the long run solution are to
- set the disturbance term equal to its expected value of zero
- drop the time subscripts
- remove all difference terms altogether since these will all be zero by the definition
of the long run in this context.

Following these steps, we obtain

$$0 = \beta_1 + \beta_4 y + \beta_5 x_2 + \beta_6 x_3 + \beta_7 x_3$$

We now want to rearrange this to have all the terms in x2 together and so that y is
the subject of the formula:

$$-\beta_4 y = \beta_1 + \beta_5 x_2 + \beta_6 x_3 + \beta_7 x_3$$
$$-\beta_4 y = \beta_1 + \beta_5 x_2 + (\beta_6 + \beta_7) x_3$$
$$y = -\frac{\beta_1}{\beta_4} - \frac{\beta_5}{\beta_4}\, x_2 - \frac{\beta_6 + \beta_7}{\beta_4}\, x_3$$
The last equation above is the long run solution.
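As a numerical illustration (a minimal sketch; the coefficient values are hypothetical, not estimates from the exercise), the long run coefficients follow directly from the final formula:

```python
# Hypothetical coefficient estimates (beta1 ... beta7), for illustration only
beta = {1: 0.20, 2: 0.10, 3: -0.05, 4: -0.40, 5: 0.30, 6: 0.15, 7: 0.05}

# Long run solution: y = -b1/b4 - (b5/b4) x2 - ((b6 + b7)/b4) x3
intercept = -beta[1] / beta[4]
coef_x2 = -beta[5] / beta[4]
coef_x3 = -(beta[6] + beta[7]) / beta[4]

print(f"long run: y = {intercept:.3f} + {coef_x2:.3f} x2 + {coef_x3:.3f} x3")
```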
