
CHAPTER 4: MULTICOLLINEARITY

Multicollinearity: what happens if the regressors are correlated?


• Strictly, multicollinearity refers to the existence of an exact linear relationship among some or all of the explanatory variables of a regression model.
• More broadly, multicollinearity is a phenomenon in which two or more predictor variables in a multiple regression are highly, though not necessarily perfectly, correlated (highly linearly related).
When considering the model:
𝑌𝑖 = 𝛽0 + 𝛽1 𝑋1𝑖 + 𝛽2 𝑋2𝑖 + ⋯ + 𝛽𝑝 𝑋𝑝𝑖 + 𝜇𝑖 , 𝑖 = 1,2, … , 𝑛

In Matrix notation:
𝒀 = 𝑿𝜷 + 𝝁,
where
𝒀 = (𝑌1, 𝑌2, … , 𝑌𝑛)′,   𝜷 = (𝛽0, 𝛽1, … , 𝛽𝑝)′,   𝝁 = (𝜇1, 𝜇2, … , 𝜇𝑛)′,

and 𝑿 is the 𝑛 × (𝑝 + 1) design matrix

        1   𝑋11   ⋯   𝑋1𝑝
        1   𝑋21   ⋯   𝑋2𝑝
𝑿 =     ⋮    ⋮     ⋱    ⋮
        1   𝑋𝑛1   ⋯   𝑋𝑛𝑝

Fitted model (values): 𝒀̂ = 𝑿𝜷̂ = 𝑯𝒀, where 𝜷̂ = (𝑿′𝑿)⁻¹𝑿′𝒀 is the OLS estimator and 𝑯 = 𝑿(𝑿′𝑿)⁻¹𝑿′ is the hat matrix.

It is assumed that the explanatory variables are not perfectly (exactly) linearly correlated or, equivalently, that the moment matrix 𝑿′𝑿 is non-singular.
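
As a minimal illustration of these formulas, the following PROC IML sketch computes 𝜷̂, 𝑯 and the fitted values for a small design matrix (here, an intercept column plus the first four rows of 𝑌, 𝑋1 and 𝑋2 from the clothing-expenditure example later in this chapter):

proc iml;
   y = {8.4, 9.6, 10.4, 11.4};                  /* first four values of Y              */
   x = {1  82.9 17.1,
        1  88.0 21.3,
        1  99.9 25.1,
        1 105.8 29.0};                           /* intercept column, X1 and X2         */
   beta_hat = inv(x`*x) * x` * y;                /* beta_hat = (X'X)^(-1) X'Y           */
   h        = x * inv(x`*x) * x`;                /* hat matrix H = X(X'X)^(-1)X'        */
   y_hat    = h * y;                             /* fitted values Y_hat = H*Y           */
   print beta_hat, y_hat;
quit;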
Questions we ask:
• What is the nature of multicollinearity?
• Is multicollinearity really a problem?
• What are its practical consequences?
• How do we detect it?
• What are the remedial measures?
Problems of Multicollinearity in OLS
Extreme cases:
(a) Perfect correlations
If explanatory variables 𝑋𝑖 and 𝑋𝑗 are perfectly correlated, i.e. 𝜌𝑋𝑖𝑋𝑗 = 1, the normal equations 𝑿′𝑿𝜷 = 𝑿′𝒚 become indeterminate: it is impossible to obtain the individual parameters 𝛽𝑖 because the moment matrix 𝑿′𝑿 is singular (non-invertible), i.e. 𝑑𝑒𝑡(𝑿′𝑿) = 0.
For example, if 𝑌𝑖 = 𝛽0 + 𝛽1𝑋1𝑖 + 𝛽2𝑋2𝑖 + 𝜇𝑖 and 𝑋1𝑖 = 𝑋2𝑖, then 𝑌𝑖 = 𝛽0 + (𝛽1 + 𝛽2)𝑋1𝑖 + 𝜇𝑖, so only the sum 𝛽1 + 𝛽2 can be estimated.
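
A minimal PROC IML sketch of this singularity, using a hypothetical design matrix whose third column is exactly twice the second:

proc iml;
   x = {1 2  4,
        1 4  8,
        1 6 12,
        1 8 16};          /* third column = 2 * second column           */
   xtx = x` * x;          /* moment matrix X'X                          */
   d   = det(xtx);        /* determinant is (numerically) zero,         */
   print xtx, d;          /* so X'X is singular and cannot be inverted  */
quit;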

(b) Perfectly uncorrelated (orthogonal) regressors
If the explanatory variables 𝑋𝑖 and 𝑋𝑗 are orthogonal, i.e. 𝜌𝑋𝑖𝑋𝑗 = 0 for all 𝑖 ≠ 𝑗, then there is no problem in estimating the parameters 𝛽𝑖.
Cases of Multicollinearity
In practice the extreme cases are rare; we usually have some degree of intercorrelation or interdependence among the explanatory variables.

Figure 4.1: Cases of multicollinearity.


Causes of Multicollinearity
(a) Data collection methods: sampling over a limited range of values taken by the
regressors in the population.
(b) Model specification: adding a polynomial term to a regression model especially
when the range of the predictor is small.
(c) Overdetermined model: the model has more explanatory variables than observations.
(d) Regressors share a common trend.
(e) Use of lagged variables.
(f) Cointegration among the regressors.
Practical consequences of multicollinearity
When the regressors are highly correlated, we often observe the following (a small simulation sketch follows the list):
• Although still BLUE, the OLS estimators have large variances, making precise estimation difficult.
• The confidence intervals tend to be much wider, leading to more ready acceptance of the null hypothesis that a coefficient is zero.
• The 𝑡 ratios of one or more coefficients tend to be statistically insignificant.
• Although the 𝑡 ratios of one or more coefficients are statistically insignificant, 𝑅2, the overall measure of goodness of fit, can be very high. Coefficients may have high standard errors and low significance levels despite the fact that they are jointly highly significant and the 𝑅2 of the regression is quite high.
• The OLS estimators and their standard errors are sensitive to small changes in the data, i.e. small changes in the data can produce wide swings in the parameter estimates.
• Coefficients may have the wrong sign or an implausible magnitude.
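
A rough simulation sketch of these consequences (the data set name, seed and coefficient values below are arbitrary illustrations): two nearly collinear regressors are generated, and PROC REG displays the inflated standard errors and VIFs.

data sim;
   call streaminit(123);                        /* fix the seed for reproducibility */
   do i = 1 to 50;
      x1 = rand("normal");
      x2 = 0.99*x1 + 0.01*rand("normal");       /* x2 is almost a copy of x1        */
      y  = 1 + 2*x1 + 3*x2 + rand("normal");    /* arbitrary true coefficients      */
      output;
   end;
run;

proc reg data=sim;
   model y = x1 x2 / vif;    /* expect very large standard errors and VIFs */
run;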

Estimation in the presence of Perfect Multicollinearity


Consider the model in deviation form (lower-case 𝑦 and 𝑥 denote deviations from their sample means):

𝑦𝑖 = 𝛽̂2 𝑥2𝑖 + 𝛽̂3 𝑥3𝑖 + 𝜇̂𝑖


It can be proved that
𝛽̂2 = [(∑𝑦𝑖𝑥2𝑖)(∑𝑥3𝑖²) − (∑𝑦𝑖𝑥3𝑖)(∑𝑥2𝑖𝑥3𝑖)] / [(∑𝑥2𝑖²)(∑𝑥3𝑖²) − (∑𝑥2𝑖𝑥3𝑖)²]

and

𝛽̂3 = [(∑𝑦𝑖𝑥3𝑖)(∑𝑥2𝑖²) − (∑𝑦𝑖𝑥2𝑖)(∑𝑥2𝑖𝑥3𝑖)] / [(∑𝑥2𝑖²)(∑𝑥3𝑖²) − (∑𝑥2𝑖𝑥3𝑖)²].
Now suppose 𝑥3𝑖 = 𝜆𝑥2𝑖, where 𝜆 ≠ 0, so that 𝑥3 is an exact multiple of 𝑥2.
It can be shown that
𝛽̂2 = [(∑𝑦𝑖𝑥2𝑖)(𝜆²∑𝑥2𝑖²) − (𝜆∑𝑦𝑖𝑥2𝑖)(𝜆∑𝑥2𝑖²)] / [(∑𝑥2𝑖²)(𝜆²∑𝑥2𝑖²) − 𝜆²(∑𝑥2𝑖²)²] = 0/0,

which is an indeterminate expression.

Show that 𝛽̂3 is also indeterminate.

𝑦𝑖 = 𝛽̂2 𝑥2𝑖 + 𝛽̂3 𝑥3𝑖 + 𝜇̂ 𝑖 = (𝛽̂2 + 𝜆𝛽̂3 )𝑥2𝑖 + 𝜇̂ 𝑖 = 𝛼̂𝑥2𝑖 + 𝜇̂ 𝑖 ,

𝛼̂ = (𝛽̂2 + 𝜆𝛽̂3)
Given the estimated value of 𝛼̂ and the value of 𝜆, one cannot obtain a unique solution for the individual regression coefficients; i.e. the sum 𝛽̂2 + 𝜆𝛽̂3 is identified, but the individual parameters 𝛽2 and 𝛽3 are not.
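
A hedged SAS sketch of this situation (hypothetical data, with 𝜆 = 2): when two regressors are exactly proportional, PROC REG is expected to note that the model is not full rank, reflecting the fact that only the combination 𝛽2 + 𝜆𝛽3 is identified.

data perfect;
   do i = 1 to 20;
      x2 = i;
      x3 = 2*x2;                       /* x3 = lambda*x2 with lambda = 2   */
      y  = 5 + 3*x2 + rand("normal");  /* arbitrary data-generating values */
      output;
   end;
run;

proc reg data=perfect;
   model y = x2 x3;   /* individual coefficients are not uniquely estimable */
run;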
Testing for Multicollinearity
• Inspection of regression results
• Variance inflation factor and Tolerance
• Eigenvalues and condition index
Inspection of regression results
• Standard errors of the parameter estimates (high standard errors are suggestive, but not always due to multicollinearity).
• High partial correlations.
• High 𝑅 2 statistic and low t-statistic values.
• Sensitivity of parameter estimates.
• Inter-correlations between explanatory variables.
Variance inflation factor (VIF) and Tolerance
𝑉𝐼𝐹 = 1 / (1 − 𝑟23²),
where 𝑟23 is the coefficient of correlation between 𝑋2 and 𝑋3. More generally, for regressor 𝑋𝑗, 𝑉𝐼𝐹𝑗 = 1 / (1 − 𝑅𝑗²), where 𝑅𝑗² is the 𝑅2 from regressing 𝑋𝑗 on the remaining regressors.
If 𝑉𝐼𝐹 > 10, variables are highly collinear.
𝑇𝑜𝑙𝑒𝑟𝑎𝑛𝑐𝑒 = 1 / 𝑉𝐼𝐹 = 1 − 𝑟23².
If 𝑇𝑜𝑙𝑒𝑟𝑎𝑛𝑐𝑒 < 0.1, the variables are highly collinear. The closer the tolerance is to zero, the greater the degree of collinearity of a variable with the other regressors; the closer the tolerance is to 1, the greater the evidence that the variable is not collinear with the other regressors.
SAS code for VIF and tolerance:
proc reg;
   model y = x1 x2 / vif tol collin;   /* request VIF, tolerance, and collinearity diagnostics */
   title 'Test for multicollinearity';
run;

Frisch’s Confluence Analysis


It helps to determine whether an explanatory variable is useful, superfluous or detrimental.
The procedure is:
• Regress the dependent variable on each of the explanatory variables separately and examine the regression results, and
• Choose the elementary regression that appears to give the most plausible results on both a priori and statistical criteria, then gradually add variables and examine their effects on the individual coefficients and their standard errors, the 𝑅2 statistic, the Durbin-Watson statistic, etc.
A new variable is classified as useful, superfluous or detrimental as follows:
• If the new variable improves 𝑅 2 and does not affect the values of the individual
coefficients then it is considered as useful and is retained as an explanatory
variable.
• If the new variable does not improve 𝑅 2 and does not affect the values of the
individual coefficients then it is considered as superfluous and is rejected.
• If the new variable considerably affects the signs and/or values of the individual coefficients, it is considered detrimental. This is an indicator of multicollinearity: the new variable is important in the model, but because of the intercorrelation its influence cannot be assessed statistically by OLS.
Example:
The following data give expenditure on clothing (𝑌), disposable income (𝑋1), liquid assets (𝑋2), a price index for clothing items (𝑋3) and a general price index (𝑋4).
Columns: expenditure on clothing (𝑌), disposable income (𝑋1), liquid assets (𝑋2), clothing price index (𝑋3), general price index (𝑋4).
8.4 82.9 17.1 92 94
9.6 88.0 21.3 93 96
10.4 99.9 25.1 96 97
11.4 105.8 29.0 94 97
12.2 117.7 34.0 100 100
14.2 131.0 40.0 101 101
15.8 148.2 44.0 105 104
17.9 161.8 49.0 112 109
19.3 174.2 51.0 112 111
20.8 184.7 53.0 112 111

The correlation matrix for the explanatory variables 𝑋1 , 𝑋2 , 𝑋3 and 𝑋4 is

      1.000  0.993  0.980  0.987
𝑅 =   0.993  1.000  0.964  0.973
      0.980  0.964  1.000  0.991
      0.987  0.973  0.991  1.000

From the correlation matrix we deduce that the explanatory variables appear to be multicollinear, as shown by the high sample correlation coefficients.
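
The correlation matrix can be reproduced from the data above; a SAS sketch (data set name chosen arbitrarily):

data clothing;
   input y x1 x2 x3 x4;
   datalines;
 8.4  82.9 17.1  92  94
 9.6  88.0 21.3  93  96
10.4  99.9 25.1  96  97
11.4 105.8 29.0  94  97
12.2 117.7 34.0 100 100
14.2 131.0 40.0 101 101
15.8 148.2 44.0 105 104
17.9 161.8 49.0 112 109
19.3 174.2 51.0 112 111
20.8 184.7 53.0 112 111
;
run;

proc corr data=clothing;
   var x1 x2 x3 x4;     /* sample correlations among the regressors */
run;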

Use Frisch's confluence analysis to assess the effects, if any, of multicollinearity in a regression of clothing expenditure on the economic variables suggested as explanatory variables.

To examine the effects of multicollinearity, we compute the simple linear (elementary) regressions as follows:
Regressor    Intercept (s.e.)     Slope (s.e.)      𝑅2       𝐷𝑊
𝑋1           −1.240 (0.370)       0.118 (0.002)     0.995    2.6
𝑋2           −38.51 (4.200)       0.516 (0.040)     0.951    2.4
𝑋3           2.110 (0.810)        0.327 (0.020)     0.967    0.4
𝑋4           −53.650 (3.630)      0.663 (0.030)     0.977    2.1

Since the first elementary regression has the highest 𝑅 2 -statistic and the 𝐷𝑊 statistic
is close to 2 (slightly above 2), we conclude that 𝑋1 is the most important explanatory
variable.
The remaining explanatory variables are now brought in one by one, each time with a careful inspection of the relevant regression statistics.
The standard errors are given in parentheses next to the corresponding estimates.

Regressors        𝛽0              𝛽1              𝛽2               𝛽3              𝛽4            𝑅2      𝐷𝑊
𝑋1                −1.24 (0.37)    0.118 (0.002)                                                   0.995   2.6
𝑋1, 𝑋2            1.40 (4.92)     0.126 (0.010)   −0.036 (0.070)                                  0.996   2.5
𝑋1, 𝑋2, 𝑋3        0.94 (5.17)     0.138 (0.020)   −0.034 (0.060)   0.037 (0.050)                  0.996   3.1
𝑋1, 𝑋2, 𝑋3, 𝑋4    −13.53 (7.50)   0.097 (0.030)   −0.190 (0.090)   0.012 (0.050)   0.33 (0.15)    0.998   3.4

Examination of the standard errors, 𝑡 ratios, 𝑅2 statistic and the 𝐷𝑊 statistic suggests that the explanatory variables 𝑋2, 𝑋3 and 𝑋4 are superfluous. A parsimonious and still satisfactory model is
𝑌 = 𝛽0 + 𝛽1𝑋1 + 𝜇, estimated as 𝑌̂ = −1.24 + 0.118𝑋1.
It is important to note that several satisfactory models may be adopted.
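
A sketch of how the sequence of regressions used in the confluence analysis can be run in SAS (assuming the clothing data set created earlier); each MODEL statement adds one regressor, and the DW option requests the Durbin-Watson statistic:

proc reg data=clothing;
   M1: model y = x1          / dw;
   M2: model y = x1 x2       / dw;
   M3: model y = x1 x2 x3    / dw;
   M4: model y = x1 x2 x3 x4 / dw;
run;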

Remedies for Multicollinearity
• Combining cross-sectional data and time series data
• Dropping a variable (or variables), at the risk of specification bias.
• Transformation of variables
• Additional or new data
• Increasing sample size
• Multivariate statistical techniques such as factor analysis and principal
component analysis
• Ridge regression (see the sketch below).
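
As a hedged illustration of the last two remedies, using the clothing data and an arbitrarily chosen grid of ridge constants:

proc reg data=clothing outest=ridge_est outvif ridge=0 to 0.05 by 0.005;
   model y = x1 x2 x3 x4;      /* ridge traces of the coefficients and VIFs */
run;

proc print data=ridge_est;     /* inspect how estimates change with the ridge constant */
run;

proc princomp data=clothing out=pc_scores;
   var x1 x2 x3 x4;            /* principal components of the regressors */
run;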
