Chapter 4 Multicollinearity
In matrix notation:
$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\mu},$$
where
$$\mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
\mathbf{X} = \begin{pmatrix} 1 & X_{11} & \cdots & X_{1p} \\ 1 & X_{21} & \cdots & X_{2p} \\ \vdots & \vdots &  & \vdots \\ 1 & X_{n1} & \cdots & X_{np} \end{pmatrix}, \quad
\boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}, \quad
\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{pmatrix}.$$
Fitted model (values): $\hat{\mathbf{Y}} = \mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{H}\mathbf{Y}$, where $\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$ and $\mathbf{H} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$.
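A minimal PROC IML sketch of these matrix formulas, assuming a hypothetical data set MYDATA containing a response Y and two regressors X1 and X2 (the names are illustrative only):

proc iml;
   use mydata;                         /* hypothetical data set               */
   read all var {x1 x2} into Z;
   read all var {y}     into Y;
   close mydata;
   n        = nrow(Y);
   X        = j(n, 1, 1) || Z;         /* design matrix: column of ones, then regressors */
   beta_hat = inv(X`*X) * X` * Y;      /* OLS estimator (X'X)^(-1) X'Y        */
   H        = X * inv(X`*X) * X`;      /* hat matrix X(X'X)^(-1)X'            */
   Y_hat    = H * Y;                   /* fitted values                       */
   print beta_hat;
quit;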
It is assumed that the explanatory variables are not perfectly or exactly linearly correlated, or equivalently that the moment matrix 𝑿′𝑿 is non-singular.
Questions we ask:
• What is the nature of multicollinearity?
• Is multicollinearity really a problem?
• What are its practical consequences?
• How do we detect it?
• What are the remedial measures?
Problems of Multicollinearity in OLS
Extreme cases:
(a) Perfect correlations
If explanatory variables 𝑋𝑖 and 𝑋𝑗 are perfectly correlated, i.e. 𝜌𝑋𝑖𝑋𝑗 = 1, the normal equations 𝑿′𝑿𝜷 = 𝑿′𝒚 become indeterminate, i.e. it is impossible to obtain unique estimates of the parameters 𝛽𝑖 since the moment matrix 𝑿′𝑿 is singular (non-invertible), that is, 𝑑𝑒𝑡(𝑿′𝑿) = 0.
For example, if 𝑌𝑖 = 𝛽0 + 𝛽1𝑋1𝑖 + 𝛽2𝑋2𝑖 + 𝜇𝑖 and 𝑋1𝑖 = 𝑋2𝑖, then 𝑌𝑖 = 𝛽0 + (𝛽1 + 𝛽2)𝑋1𝑖 + 𝜇𝑖, so only the sum 𝛽1 + 𝛽2 can be estimated.
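A small PROC IML sketch with toy numbers (purely illustrative) showing that perfect collinearity makes the moment matrix singular:

proc iml;
   x1  = {1, 2, 3, 4, 5};
   x2  = 2 * x1;                 /* X2 is an exact multiple of X1            */
   X   = j(5, 1, 1) || x1 || x2; /* intercept column plus X1 and X2          */
   XtX = X` * X;
   d   = det(XtX);               /* determinant of X'X                       */
   print d;                      /* prints 0: X'X is singular, no unique OLS solution */
quit;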
(b) Perfectly uncorrelated (orthogonal)
If the explanatory variables 𝑋𝑖 and 𝑋𝑗 are orthogonal, i.e. 𝜌𝑋𝑖𝑋𝑗 = 0 for all 𝑖 ≠ 𝑗, then there is no problem in estimating the parameters 𝛽𝑖.
Cases of Multicollinearity
In practice, the extreme cases are rare; instead we usually have a certain degree of intercorrelation or interdependency among the explanatory variables.
(e) use of lagged variables
(f) cointegration
Practical consequences of multicollinearity
When the regressors are highly correlated we often observe the following:
• Although BLUE, the OLS estimators have large variances, making precise estimation difficult.
• The confidence intervals tend to be much wider, leading to more ready acceptance of the null hypothesis that a coefficient is zero.
• The 𝑡 ratios of one or more coefficients tend to be statistically insignificant.
• Although the 𝑡 ratios of one or more coefficients are statistically insignificant, 𝑅², the overall measure of goodness of fit, can be very high: coefficients may have high standard errors and low significance levels despite the fact that they are jointly highly significant and the 𝑅² of the regression is quite high.
• The OLS estimators and their standard errors are sensitive to small changes in the data, i.e. small changes in the data can produce wide swings in the parameter estimates (see the simulation sketch after this list).
• Coefficients may have the wrong signs or implausible magnitudes.
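A small SAS simulation sketch of this sensitivity (the variable names and constants are illustrative, not part of the worked example below): the two regressors are almost perfectly collinear, so the individual estimates swing wildly across replicates even though their sum is stable.

data sim;
   call streaminit(123);
   do rep = 1 to 50;                          /* 50 replicate samples        */
      do i = 1 to 30;
         x2 = rand('normal');
         x3 = x2 + 0.01*rand('normal');       /* x3 almost identical to x2   */
         y  = 1 + x2 + x3 + rand('normal');
         output;
      end;
   end;
run;

proc reg data=sim outest=est noprint;         /* one fit per replicate       */
   by rep;
   model y = x2 x3;
run;

proc means data=est min max std;              /* spread of the coefficient   */
   var x2 x3;                                 /* estimates across replicates */
run;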
To see the identification problem, suppose the model in deviation form is 𝑦𝑖 = 𝛽2𝑥2𝑖 + 𝛽3𝑥3𝑖 + 𝜇𝑖 and that 𝑥3𝑖 = 𝜆𝑥2𝑖 for some constant 𝜆 (perfect collinearity). Then
$$y_i = \hat{\beta}_2 x_{2i} + \hat{\beta}_3 x_{3i} + \hat{\mu}_i = (\hat{\beta}_2 + \lambda\hat{\beta}_3)x_{2i} + \hat{\mu}_i = \hat{\alpha}x_{2i} + \hat{\mu}_i, \qquad \hat{\alpha} = \hat{\beta}_2 + \lambda\hat{\beta}_3.$$
Whatever the values of 𝛼̂ and 𝜆, one cannot obtain a unique solution for the individual regression coefficients: the combination 𝛽̂2 + 𝜆𝛽̂3 is identified, but the individual parameters 𝛽̂2 and 𝛽̂3 are not.
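Writing out the single least-squares normal equation under the same substitution 𝑥3𝑖 = 𝜆𝑥2𝑖 makes the point explicit: only 𝛼̂ is determined,
$$\hat{\alpha} = \frac{\sum_i x_{2i} y_i}{\sum_i x_{2i}^2} = \hat{\beta}_2 + \lambda \hat{\beta}_3,$$
which is one equation in the two unknowns 𝛽̂2 and 𝛽̂3, so infinitely many pairs of individual estimates fit the data equally well.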
Testing for Multicollinearity
• Inspection of regression results
• Variance inflation factor and Tolerance
• Eigenvalues and condition index
Inspection of regression results
• Standard errors of the parameter estimates (high standard errors are suggestive, but are not always due to multicollinearity).
• High partial correlations.
• High 𝑅 2 statistic and low t-statistic values.
• Sensitivity of parameter estimates.
• Inter-correlations between explanatory variables.
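A minimal SAS sketch for the last point, assuming a hypothetical data set MYDATA with regressors X1–X4:

proc corr data=mydata;
   var x1 x2 x3 x4;       /* prints the sample correlation matrix of the regressors */
   title 'Inter-correlations among the explanatory variables';
run;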
Variance inflation factor (VIF) and Tolerance
$$VIF = \frac{1}{1 - r_{23}^2},$$
where 𝑟23 is the coefficient of correlation between 𝑋2 and 𝑋3. More generally, for regressor 𝑋𝑗, 𝑉𝐼𝐹𝑗 = 1/(1 − 𝑅𝑗²), where 𝑅𝑗² is the 𝑅² from the auxiliary regression of 𝑋𝑗 on the remaining regressors.
If 𝑉𝐼𝐹 > 10, the variables are highly collinear.
$$Tolerance = \frac{1}{VIF} = 1 - r_{23}^2.$$
If 𝑇𝑜𝑙𝑒𝑟𝑎𝑛𝑐𝑒 < 0.1, the variables are highly collinear. The closer the tolerance is to zero, the greater the degree of collinearity of that variable with the other regressors; the closer the tolerance is to 1, the greater the evidence that the variable is not collinear with the other regressors.
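As a quick worked illustration with a made-up value: if 𝑟23 = 0.95, then
$$VIF = \frac{1}{1 - 0.95^2} = \frac{1}{0.0975} \approx 10.26, \qquad Tolerance = 1 - 0.95^2 = 0.0975,$$
so both rules of thumb (𝑉𝐼𝐹 > 10 and 𝑇𝑜𝑙𝑒𝑟𝑎𝑛𝑐𝑒 < 0.1) flag the pair as highly collinear.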
SAS code for VIF and tolerance:
proc reg;
   /* VIF, TOL and COLLIN request variance inflation factors, tolerances
      and collinearity diagnostics for each regressor */
   model y = x1 x2 / vif tol collin;
   TITLE 'Test for multicollinearity';
run;
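The COLLIN option in the code above also produces SAS's collinearity diagnostics, namely the eigenvalues of the scaled cross-products matrix 𝑿′𝑿 and the corresponding condition indices; this is how the eigenvalue and condition-index check listed earlier is obtained in practice, with a condition index above about 30 commonly taken to signal severe multicollinearity.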
Frisch confluence analysis
• Run the elementary regressions of 𝑌 on each explanatory variable separately.
• Choose the elementary regression which appears to give the most plausible results on both criteria used, and then gradually add variables and examine their effects on the coefficients, standard errors, 𝑅² statistic, Durbin–Watson statistic, etc.
A new variable is classified as useful, superfluous or detrimental as follows:
• If the new variable improves 𝑅 2 and does not affect the values of the individual
coefficients then it is considered as useful and is retained as an explanatory
variable.
• If the new variable does not improve 𝑅 2 and does not affect the values of the
individual coefficients then it is considered as superfluous and is rejected.
• If the new variable considerably affects the signs or values of the individual coefficients, it is considered detrimental. This is an indicator of multicollinearity: the new variable is important in the model, but owing to the inter-correlation its separate influence cannot be assessed statistically by OLS.
Example:
The following data give expenditure on clothing (𝑌), disposable income (𝑋1), liquid assets (𝑋2), a price index for clothing items (𝑋3) and a general price index (𝑋4).
   Y       X1      X2     X3     X4
  8.4     82.9    17.1    92     94
  9.6     88.0    21.3    93     96
 10.4     99.9    25.1    96     97
 11.4    105.8    29.0    94     97
 12.2    117.7    34.0   100    100
 14.2    131.0    40.0   101    101
 15.8    148.2    44.0   105    104
 17.9    161.8    49.0   112    109
 19.3    174.2    51.0   112    111
 20.8    184.7    53.0   112    111
From the correlation matrix we deduce that the explanatory variables appear to be multicollinear, as shown by the high sample correlation coefficients.
Use Frisch confluence analysis to assess the effects, if any, of multicollinearity in a regression of expenditure on clothing on the economic variables suggested as explanatory variables.
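A sketch of how this could be run in SAS, assuming the data above are read into a data set named CLOTHES (the name and model labels are illustrative); the DW option requests the Durbin–Watson statistic:

data clothes;
   input y x1 x2 x3 x4;
   datalines;
 8.4  82.9 17.1  92  94
 9.6  88.0 21.3  93  96
10.4  99.9 25.1  96  97
11.4 105.8 29.0  94  97
12.2 117.7 34.0 100 100
14.2 131.0 40.0 101 101
15.8 148.2 44.0 105 104
17.9 161.8 49.0 112 109
19.3 174.2 51.0 112 111
20.8 184.7 53.0 112 111
;
run;

proc reg data=clothes;
   /* elementary regressions: Y on each explanatory variable separately */
   e1: model y = x1 / dw;
   e2: model y = x2 / dw;
   e3: model y = x3 / dw;
   e4: model y = x4 / dw;
   /* gradually augmenting the most plausible elementary regression */
   a2: model y = x1 x2 / dw;
   a3: model y = x1 x2 x3 / dw;
   a4: model y = x1 x2 x3 x4 / dw;
run;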
Since the first elementary regression has the highest 𝑅 2 -statistic and the 𝐷𝑊 statistic
is close to 2 (slightly above 2), we conclude that 𝑋1 is the most important explanatory
variable.
The remaining explanatory variables are now brought in one by one, and each time the relevant regression statistics are carefully inspected.
The standard errors are presented below the corresponding estimates.
Regressors          𝛽0        𝛽1        𝛽2        𝛽3       𝛽4      𝑅²      𝐷𝑊
𝑋1                −1.24     0.118                                  0.995    2.6
                  (0.37)   (0.002)
𝑋1, 𝑋2             1.40     0.126    −0.036                       0.996    2.5
                  (4.92)   (0.010)   (0.070)
𝑋1, 𝑋2, 𝑋3         0.94     0.138    −0.034     0.037             0.996    3.1
                  (5.17)   (0.020)   (0.060)   (0.050)
𝑋1, 𝑋2, 𝑋3, 𝑋4   −13.53     0.097    −0.190     0.012    0.33     0.998    3.4
                  (7.50)   (0.030)   (0.090)   (0.050)  (0.15)
Examination of the standard errors, 𝑡 ratios, 𝑅²-statistic and 𝐷𝑊 statistic suggests that the explanatory variables 𝑋2, 𝑋3 and 𝑋4 are superfluous. A parsimonious and still satisfactory model is
𝑌 = 𝛽0 + 𝛽1𝑋1 + 𝜇 = −1.24 + 0.118𝑋1 + 𝜇
It is important to note that several satisfactory models may be adopted.
Remedies for Multicollinearity
• Combining cross-sectional data and time series data
• Dropping a variable(s) and specification bias
• Transformation of variables
• Additional or new data
• Increasing sample size
• Multivariate statistical techniques such as factor analysis and principal component analysis (see the sketch after this list)
• Ridge regression.
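A hedged SAS sketch of the last two remedies, reusing the assumed CLOTHES data set from the example above (the ridge constants and the number of retained components are arbitrary choices):

/* Ridge regression: fit the model over a grid of ridge constants k and write
   the ridge coefficients and VIFs to the OUTEST= data set. */
proc reg data=clothes outest=ridge_est outvif ridge=0 to 0.10 by 0.02;
   model y = x1 x2 x3 x4;
run;
proc print data=ridge_est; run;

/* Principal component analysis: the component scores in OUT= are orthogonal
   and can replace the original, collinear regressors in a second-stage fit. */
proc princomp data=clothes out=pcs;
   var x1 x2 x3 x4;
run;
proc reg data=pcs;
   model y = prin1 prin2;      /* regression on the first two principal components */
run;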