
MULTICOLLINEARITY

Dr. Indra. S.Si, M.Si


INTRODUCTION

➢ One of the assumptions of the classical linear regression model (CLRM) is that there is no exact linear relationship among the regressors.
➢ If there are one or more such relationships among the regressors, we call it multicollinearity, or collinearity for short.
➢ Perfect collinearity: an exact linear relationship between two or more variables exists.
➢ Imperfect collinearity: the regressors are highly (but not perfectly) collinear.
PERFECT MULTICOLLINEARITY

• Perfect multicollinearity arises when there is an exact linear relationship among the regressors.

• Assume we have the following model:
Y = β1 + β2X2 + β3X3 + e
where the sample values for X2 and X3 are:

X2: 1  2  3  4  5   6
X3: 2  4  6  8  10  12
PERFECT MULTICOLLINEARITY

• We observe that X3 = 2X2.

• Therefore, although there appear to be two explanatory variables, in fact there is only one.
• This is because X2 is an exact linear function of X3, i.e. X2 and X3 are perfectly collinear.
CONSEQUENCES OF PERFECT MULTICOLLINEARITY

• Under perfect multicollinearity, the OLS estimators simply do not exist (proof on board).

Consider $\hat{Y} = a + b_1 X_1 + b_2 X_2$, where $X_2 = k X_1$.

The OLS estimator for $b_1$ is given by:

$$b_1 = \frac{(\sum x_1 y)(\sum x_2^2) - (\sum x_2 y)(\sum x_1 x_2)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2}$$

Substituting $x_2 = k x_1$:

$$b_1 = \frac{k^2(\sum x_1 y)(\sum x_1^2) - k^2(\sum x_1 y)(\sum x_1^2)}{k^2(\sum x_1^2)(\sum x_1^2) - k^2(\sum x_1^2)^2} = \frac{0}{0},$$

which is indeterminate.
• If you try to estimate an equation in EViews and your specification suffers from perfect multicollinearity, EViews will not report results; it will instead give you an error message mentioning multicollinearity. The sketch below reproduces this failure in Python.
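The following is a minimal Python sketch (my own illustration, not part of the original slides) using the sample values above, where X3 = 2X2. The cross-product matrix X'X loses rank, so (X'X)⁻¹, and hence the OLS estimator, does not exist.

```python
# Sketch: perfect collinearity (X3 = 2*X2) makes X'X singular, so the
# OLS estimator (X'X)^{-1} X'y cannot be computed.
import numpy as np

X2 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
X3 = 2 * X2                                      # exact linear dependence
X = np.column_stack([np.ones_like(X2), X2, X3])  # intercept + two regressors

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))  # 2, not 3: the columns are linearly dependent
print(np.linalg.cond(XtX))         # effectively infinite condition number
# (X'X)^{-1} does not exist, matching the 0/0 result above; regression
# software either refuses to estimate (as EViews does) or drops a regressor.
```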
IMPERFECT MULTICOLLINEARITY

• Imperfect multicollinearity (or near multicollinearity) exists when the explanatory variables in an equation are correlated, but this correlation is less than perfect.
• This can be expressed as:
X3 = X2 + v
• where v is a random variable that can be viewed as the 'error' in the exact linear relationship.
CONSEQUENCES OF HIGH COLLINEARITY

➢ If collinearity is not perfect, but high, several consequences ensue:
➢ The OLS estimators are still BLUE, but one or more regression coefficients have large standard errors relative to the values of the coefficients, thereby making the t ratios small.
➢ Even though some regression coefficients are statistically insignificant, the R² value may be very high.
➢ Therefore, one may conclude (misleadingly) that the true values of these coefficients are not different from zero.
➢ Also, the regression coefficients may be very sensitive to small changes in the data, especially if the sample is relatively small. (These symptoms are illustrated in the simulation sketch below.)
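To make these consequences concrete, here is a small simulation sketch (my own, with made-up parameter values), generating X3 = X2 + v with a small noise term v as in the formulation above:

```python
# Sketch: near-perfect collinearity inflates standard errors while R^2 stays high.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 30
X2 = rng.normal(size=n)
X3 = X2 + rng.normal(scale=0.05, size=n)       # X3 = X2 + v, v small
y = 1 + 2 * X2 + 3 * X3 + rng.normal(size=n)   # true coefficients: 1, 2, 3

X = sm.add_constant(np.column_stack([X2, X3]))
res = sm.OLS(y, X).fit()
print(res.rsquared)   # very high overall fit
print(res.bse)        # inflated standard errors on the collinear pair
print(res.tvalues)    # individually small t ratios are typical despite the fit
```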
VARIANCE INFLATION FACTOR

➢ For the following regression model:

Yi = B1 + B2X2i + B3X3i + ui

it can be shown that:

$$\operatorname{var}(b_2) = \frac{\sigma^2}{\sum x_{2i}^2 (1 - r_{23}^2)} = \frac{\sigma^2}{\sum x_{2i}^2}\,\text{VIF}$$

and

$$\operatorname{var}(b_3) = \frac{\sigma^2}{\sum x_{3i}^2 (1 - r_{23}^2)} = \frac{\sigma^2}{\sum x_{3i}^2}\,\text{VIF}$$

where σ² is the variance of the error term ui, and r23 is the coefficient of correlation between X2 and X3.
VARIANCE INFLATION FACTOR

$$\text{VIF} = \frac{1}{1 - r_{23}^2}$$

is the variance-inflating factor.

➢ VIF is a measure of the degree to which the variance of the OLS estimator is inflated because of collinearity. (A numeric sketch follows.)
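As a minimal numeric sketch (illustrative data, not from the slides): with two regressors the auxiliary R² is just r23², so the VIF can be computed directly from the sample correlation.

```python
# Sketch: VIF for a two-regressor model, where the auxiliary R^2 equals r23^2.
import numpy as np

def vif_two_regressors(x2, x3):
    r23 = np.corrcoef(x2, x3)[0, 1]   # sample correlation between X2 and X3
    return 1.0 / (1.0 - r23**2)       # VIF = 1 / (1 - r23^2)

rng = np.random.default_rng(0)
x2 = rng.normal(size=100)
x3 = x2 + rng.normal(scale=0.3, size=100)  # highly (not perfectly) correlated
print(vif_two_regressors(x2, x3))          # well above 1; grows as the noise shrinks
```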
VARIANCE INFLATION FACTOR

R²j      VIFj
0        1
0.5      2
0.8      5
0.9      10
0.95     20
0.975    40
0.99     100
0.995    200
0.999    1000
THE VARIANCE INFLATION FACTOR

• VIF values that exceed 10 are generally viewed as evidence of problematic multicollinearity.
• This corresponds to R²j > 0.9 in the auxiliary regression (the regression of one regressor on the remaining regressors).
• Large standard errors therefore lead to wide confidence intervals.
• Also, the t statistics may be misleadingly small, so genuinely relevant variables can appear insignificant.
DETECTION OF MULTICOLLINEARITY

➢ High R² but few significant t ratios.
➢ High pair-wise correlations among explanatory variables or regressors.
➢ High partial correlation coefficients.
➢ Significant F tests for auxiliary regressions (regressions of each regressor on the remaining regressors).
➢ High Variance Inflation Factor (VIF) – particularly exceeding 10 in value – and low Tolerance Factor (TOL, the inverse of VIF). (A sketch automating two of these checks follows this list.)
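Two of these checks (pair-wise correlations and VIFs) are easy to automate. Below is a sketch using statsmodels; the data and column names are hypothetical stand-ins for your own regressors.

```python
# Sketch: pair-wise correlations and VIFs as multicollinearity diagnostics.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
df = pd.DataFrame({"x2": rng.normal(size=50)})
df["x3"] = df["x2"] + rng.normal(scale=0.1, size=50)  # near-collinear pair

print(df.corr())          # high pair-wise correlations flag potential trouble

X = sm.add_constant(df)   # VIF is computed from the full design matrix
for j, name in enumerate(X.columns):
    if name != "const":   # the constant's VIF is not meaningful
        print(name, variance_inflation_factor(X.values, j))
```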
REMEDIAL MEASURES

➢ What should we do if we detect multicollinearity?
➢ Nothing, for we often have no control over the data.
➢ Redefining the model by excluding variables may attenuate the problem, provided we do not omit relevant variables.
➢ Principal components analysis: construct artificial variables from the regressors such that they are orthogonal to one another.
➢ These principal components become the regressors in the model.
➢ Yet the interpretation of the coefficients on the principal components is not as straightforward. (A sketch of this approach follows.)
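Here is a hedged sketch of the principal-components idea using scikit-learn (all data simulated for illustration): the collinear regressors are replaced by their leading orthogonal component before fitting.

```python
# Sketch: principal-components regression on two near-collinear regressors.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
x2 = rng.normal(size=100)
X = np.column_stack([x2, x2 + rng.normal(scale=0.1, size=100)])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1] + rng.normal(size=100)

# Standardise, then keep the leading orthogonal component as the regressor.
Z = PCA(n_components=1).fit_transform(StandardScaler().fit_transform(X))
model = LinearRegression().fit(Z, y)
print(model.coef_)  # the coefficient is on the component, not on X2 or X3,
                    # which is why interpretation is less straightforward
```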
RESOLVING MULTICOLLINEARITY

The easiest ways to "cure" the problems are:

(a) drop one of the collinear variables;
(b) transform the highly correlated variables into a ratio;
(c) collect more data, e.g. a longer run of data or a switch to a higher frequency.
ILLUSTRATION USING EVIEWS

• This tutorial shows you how to estimate a multiple regression model in EViews and test for multicollinearity.
• The example concerns the predictors of CEO salaries.
• The salaries are measured in thousands of dollars.
• Firms' sales are measured in millions of dollars.
• Market value (or market capitalisation) is in millions of dollars.
• And profits are also measured in millions of dollars.
ILLUSTRATION USING EVIEWS

[A sequence of EViews screenshots illustrating the estimation steps.]
ILLUSTRATION USING EVIEWS

• Testing the Variance Inflation Factor in EViews.

• For example, suppose we have the following regression model:
ln IMPORTt = β0 + β1 ln GDPt + β2 ln CPIt + ut
• The estimation result is provided below:
ILLUSTRATION USING EVIEWS

• To show the VIF test result for the estimation above, click the menu View → Coefficient Diagnostics → Variance Inflation Factors.
ILLUSTRATION USING EVIEWS

[EViews screenshot of the VIF test output.]