ICMA Centre, University of Reading: Quantitative Methods for Finance
1. The t-ratios for the coefficients in this model are given in the third row, after the
standard errors. They are calculated by dividing each coefficient by its standard
error.
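As a minimal sketch of this calculation (the coefficient and standard error values
below are hypothetical, not those from the exercise):

```python
import numpy as np

# Hypothetical coefficient estimates and their standard errors
coefs = np.array([1.10, -0.45, 0.78])
std_errs = np.array([0.55, 0.20, 0.13])

# t-ratio = coefficient / standard error
t_ratios = coefs / std_errs
print(t_ratios)  # [ 2.   -2.25  6.  ]
```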
The solution to near multicollinearity that is usually suggested is that, since the
problem is really one of insufficient information in the sample to determine each of
the coefficients separately, one should go out and get more data. In other words, we
could switch to a higher frequency of data for analysis (e.g. weekly instead of
monthly, monthly instead of quarterly, etc.). Alternatives are to get more data by
using a longer sample period (i.e. one going further back in time), or to combine
the two independent variables in a ratio (e.g. x2t / x3t).
Other, more ad hoc methods for dealing with the possible existence of near
multicollinearity were discussed in the lectures:
- Ignore it, if the model is otherwise adequate, i.e. statistically satisfactory and
with each coefficient being of a plausible magnitude and having an appropriate sign.
Sometimes the existence of multicollinearity does not reduce the t-ratios on
variables that would have been significant without the multicollinearity
sufficiently to make them insignificant. It is worth stating that the presence of
near multicollinearity does not affect the BLUE properties of the OLS estimator
- i.e. it will still be consistent, unbiased and efficient - since near
multicollinearity does not violate any of the CLRM assumptions 1-4. However,
in the presence of near multicollinearity it will be hard to obtain small standard
errors. This will not matter if the aim of the model-building exercise is to
produce forecasts from the estimated model, since the forecasts will be
unaffected by the presence of near multicollinearity so long as the relationship
between the explanatory variables continues to hold over the forecast sample.
- Drop one of the collinear variables, so that the problem disappears. However,
this may be unacceptable to the researcher if there were strong a priori
theoretical reasons for including both variables in the model. Also, if the
removed variable was relevant in the data generating process for y, an omitted
variable bias would result (see Section 4.12).
- Transform the highly correlated variables into a ratio and include only the ratio,
and not the individual variables, in the regression (see the sketch after this list).
Again, this may be unacceptable if financial theory suggests that changes in the
dependent variable should occur following changes in the individual explanatory
variables, and not in a ratio of them.
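As a hedged illustration of how near multicollinearity might be detected, and of the
ratio transformation above, the following sketch uses simulated data (all variable
names and values here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100

# Simulate two highly correlated regressors (hypothetical data)
x2 = rng.normal(10.0, 1.0, T)
x3 = x2 + rng.normal(0.0, 0.05, T)  # x3 is almost a copy of x2

# A pairwise correlation close to 1 signals near multicollinearity
print(np.corrcoef(x2, x3)[0, 1])

# Variance inflation factor for x2: regress x2 on a constant and x3
X = np.column_stack([np.ones(T), x3])
beta, *_ = np.linalg.lstsq(X, x2, rcond=None)
resid = x2 - X @ beta
r2 = 1.0 - np.sum(resid**2) / np.sum((x2 - x2.mean())**2)
print(1.0 / (1.0 - r2))  # VIF; large values (e.g. > 10) flag a problem

# The ratio transformation: include x2 / x3 in place of both variables
ratio = x2 / x3
```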
3. (i) This is where there is a relationship between the ith and jth residuals (for
i ≠ j). Recall that one of the assumptions of the CLRM was that no such relationship
exists. We want our residuals to be random, and if there is evidence of
autocorrelation in the residuals, it implies that we could predict the sign of the
next residual and get the right answer more than half the time on average!
(ii) The Durbin Watson test is a test for first order autocorrelation. The test
is calculated as follows. You would run whatever regression you were interested in,
and obtain the residuals. Then calculate the statistic
$$DW = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=2}^{T} \hat{u}_t^2}$$
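As a minimal sketch, the statistic can be computed from the estimated residuals
(here a hypothetical array u_hat), following the formula above:

```python
import numpy as np

# Hypothetical residuals from some fitted regression
u_hat = np.array([0.3, -0.1, 0.2, -0.4, 0.1, 0.05, -0.2])

# Numerator: sum over t = 2..T of (u_t - u_{t-1})^2
num = np.sum(np.diff(u_hat) ** 2)

# Denominator: sum over t = 2..T of u_t^2, matching the formula above
den = np.sum(u_hat[1:] ** 2)

dw = num / den
print(dw)  # values near 2 suggest no first order autocorrelation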
You would then need to look up the two critical values from the Durbin Watson
tables, and these would depend on the number of observations and the number of
regressors (excluding the constant this time) in the model. The rejection /
non-rejection rule is then given by the standard regions: reject the null hypothesis
of no autocorrelation in favour of positive autocorrelation if DW < dL; the test is
inconclusive if dL ≤ DW ≤ dU; do not reject the null if dU < DW < 4 - dU; the test
is again inconclusive if 4 - dU ≤ DW ≤ 4 - dL; and reject the null in favour of
negative autocorrelation if DW > 4 - dL.
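A small sketch of this decision rule (the function name dw_region and the example
statistic value are hypothetical):

```python
def dw_region(dw: float, d_l: float, d_u: float) -> str:
    """Classify a Durbin Watson statistic into the standard regions."""
    if dw < d_l:
        return "reject H0: positive autocorrelation"
    if dw <= d_u:
        return "inconclusive"
    if dw < 4 - d_u:
        return "do not reject H0"
    if dw <= 4 - d_l:
        return "inconclusive"
    return "reject H0: negative autocorrelation"

# Usage with the critical values from part (iii) and an illustrative statistic
print(dw_region(1.1, 1.32, 1.52))  # reject H0: positive autocorrelation
```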
(iii) We have 60 observations, and the number of regressors excluding the
constant term is 3. The appropriate lower and upper critical values are dL = 1.32
and dU = 1.52 respectively, so the Durbin Watson statistic lies below the lower
limit. It is thus clear that we reject the null hypothesis of no autocorrelation,
and it looks as though the residuals are positively autocorrelated.
(iv) $$\Delta y_t = \beta_1 + \beta_2 \Delta X_{2t} + \beta_3 \Delta X_{3t} + \beta_4 \Delta X_{4t} + u_t$$
The problem with a model entirely in first differences is that, once we calculate
the long run solution, all the first difference terms drop out (as in the long run
we assume that the values of all variables have converged on their own long run
values, so that y_t = y_{t-1}, etc.). Thus when we try to calculate the long run
solution to this model, we cannot do it, because there isn't a long run solution to
this model!
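To see this explicitly, setting every difference term to zero in the long run gives

$$0 = \beta_1 + \beta_2 \cdot 0 + \beta_3 \cdot 0 + \beta_4 \cdot 0 \quad \Rightarrow \quad 0 = \beta_1,$$

which says nothing about the relationship between the levels of y and the X variables.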
(v)
$$\Delta y_t = \beta_1 + \beta_2 \Delta X_{2t} + \beta_3 \Delta X_{3t} + \beta_4 \Delta X_{4t} + \beta_5 X_{2t-1} + \beta_6 X_{3t-1} + \beta_7 X_{4t-1} + v_t$$
The answer is yes: there is no reason why we cannot use the Durbin Watson test in
this case. You may have said no here because there are lagged values of the
regressor (X) variables in the regression. In fact this would be wrong, since there
are no lags of the DEPENDENT variable (y) as an explanatory variable in the model.
Hence the Durbin Watson statistic can still be used.
4. $$y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \beta_4 y_{t-1} + \beta_5 X_{2t-1} + \beta_6 X_{3t-1} + \beta_7 X_{3t-4} + u_t$$
The major steps involved in calculating the long run solution, worked through below, are to
- set the disturbance term equal to its expected value of zero
- drop the time subscripts
- remove all difference terms altogether since these will all be zero by the definition
of the long run in this context.
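Carrying these steps through for the model above (a worked sketch; this particular
model contains no difference terms, so the last step has no effect), we set the
disturbance term to zero and drop the time subscripts:

$$y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 y + \beta_5 X_2 + \beta_6 X_3 + \beta_7 X_3$$

Gathering the terms in y on the left-hand side and solving gives the long run solution:

$$y = \frac{\beta_1}{1 - \beta_4} + \frac{\beta_2 + \beta_5}{1 - \beta_4} X_2 + \frac{\beta_3 + \beta_6 + \beta_7}{1 - \beta_4} X_3$$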