Lecture 16

Perfect multicollinearity occurs when two or more explanatory variables are perfectly linearly related. This violates an assumption of the ordinary least squares model and prevents it from estimating regression coefficients. Imperfect multicollinearity exists when explanatory variables are approximately but not perfectly linearly related, causing increased variance and standard errors of coefficients. Multicollinearity can be detected through high correlation between variables, insignificant t-statistics, and high variance inflation factors. Potential remedies include dropping redundant variables, transforming variables, or increasing the sample size.


CHAPTER 8: MULTICOLLINEARITY 

 
Perfect multicollinearity is the violation of Assumption 6 (no explanatory variable is a
perfect linear function of any other explanatory variables).

Perfect (or Exact) Multicollinearity

If two or more independent variables have an exact linear relationship between them then
we have perfect multicollinearity.

Examples: including the same information twice (weight in pounds and weight in
kilograms), not using dummy variables correctly (falling into the dummy
variable trap), etc.

Here is an example of perfect multicollinearity in a model with two explanatory variables (using the weight example above):

Yi = β0 + β1X1i + β2X2i + εi, where X2i = 2.2046 X1i

(X1 is weight in kilograms and X2 is the same weight in pounds.)


Consequence: OLS cannot generate estimates of the regression coefficients (the software returns an error message).

Why? OLS cannot estimate the marginal effect of X1 on Y while holding X2 constant,
because X2 moves exactly when X1 moves!

Solution: Easy - Drop one of the variables!
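This is easy to see numerically. Below is a minimal sketch of the pounds/kilograms example (NumPy, with simulated data; all the numbers are illustrative): with both weight variables included, the design matrix is rank-deficient and OLS has no unique solution; dropping one column restores full rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
weight_lb = rng.normal(150, 20, n)   # weight in pounds (simulated)
weight_kg = weight_lb / 2.2046       # the same information in kilograms
y = 5 + 0.1 * weight_lb + rng.normal(0, 1, n)

# Design matrix with an intercept and BOTH weight variables
X = np.column_stack([np.ones(n), weight_lb, weight_kg])

# X is rank-deficient (3 columns, rank 2), so the normal equations
# X'X b = X'y have no unique solution: OLS cannot estimate b.
rank_bad = np.linalg.matrix_rank(X)

# Remedy: drop one of the redundant columns. Now the rank equals the
# number of columns and OLS produces unique estimates.
X_ok = X[:, :2]
rank_ok = np.linalg.matrix_rank(X_ok)
beta = np.linalg.solve(X_ok.T @ X_ok, X_ok.T @ y)
print(rank_bad, rank_ok, beta.round(2))
```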

Imperfect (or Near) Multicollinearity

When we use the word multicollinearity, we are usually talking about severe imperfect
multicollinearity.

When explanatory variables are approximately (but not exactly) linearly related, we have
imperfect multicollinearity. For example: X2i = α0 + α1X1i + ui, where ui is a stochastic
error term, so the linear relationship holds only approximately.

The Consequences of Multicollinearity

1. Imperfect multicollinearity does not violate Assumption 6. Therefore the Gauss-Markov
Theorem tells us that the OLS estimators are BLUE.

So then why do we care about multicollinearity?

2. The variances and the standard errors of the regression coefficient estimates will
increase. This means lower t-statistics.

3. The overall fit of the regression equation will be largely unaffected by
multicollinearity. This also means that forecasting and prediction will be largely
unaffected.

4. Regression coefficients will be sensitive to the specification. Regression coefficients
can change substantially when variables are added or dropped.
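Consequence 2 can be demonstrated by simulation. In this sketch (simulated data; the parameter values are arbitrary), the same model is estimated twice, once with uncorrelated regressors and once with highly correlated ones, and the estimated standard error of the first slope coefficient increases several-fold.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

def se_of_b1(corr):
    """Fit y = 1 + 2*x1 + 2*x2 + e by OLS; return the standard error of b1."""
    cov = [[1.0, corr], [corr, 1.0]]
    x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    y = 1 + 2 * x[:, 0] + 2 * x[:, 1] + rng.normal(0, 1, n)
    X = np.column_stack([np.ones(n), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    s2 = resid @ resid / (n - X.shape[1])      # estimated error variance
    return float(np.sqrt(s2 * XtX_inv[1, 1]))  # SE of the slope on x1

se_uncorrelated = se_of_b1(0.0)    # regressors unrelated
se_collinear = se_of_b1(0.95)      # severe imperfect multicollinearity
print(round(se_uncorrelated, 3), round(se_collinear, 3))
```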

The Detection of Multicollinearity

High Correlation Coefficients

Pairwise correlations among independent variables might be high (in absolute value).
Rule of thumb: If the correlation > 0.8 then severe multicollinearity may be present.
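As an illustration (simulated data; the variable names and numbers are made up), a correlation matrix flags the problem pair:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.2 * rng.normal(size=n)   # nearly a linear function of x1
x3 = rng.normal(size=n)                    # unrelated to x1 and x2

# Pairwise correlations among the explanatory variables
corr = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)
print(corr.round(2))
# |corr(x1, x2)| is far above the 0.8 rule of thumb;
# the pairs involving x3 are close to zero.
```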

High R² with Low t-Statistic Values

It is possible for individual regression coefficients to be insignificant while the overall
fit of the equation, as measured by R², is high.

High Variance Inflation Factors (VIFs)

A VIF measures the extent to which multicollinearity has increased the variance of an
estimated coefficient. It looks at the extent to which an explanatory variable can be
explained by all the other explanatory variables in the equation.

Suppose our regression equation includes k explanatory variables:

Yi = β0 + β1X1i + β2X2i + … + βkXki + εi

In this equation there are k VIFs, one per slope coefficient:

Step 1: Run an OLS regression of each X variable on all of the other X variables. For
example, for X1:

X1i = α0 + α2X2i + α3X3i + … + αkXki + vi

Step 2: Calculate the VIF for β̂1:

VIF(β̂1) = 1 / (1 − R1²)

where R1² is the R² for the auxiliary regression in Step 1.

Step 3: Assess the degree of multicollinearity by evaluating each VIF(β̂j).

Rule of thumb: If VIF(β̂j) > 5 then severe multicollinearity may be present.
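The steps above can be sketched directly with NumPy (simulated data; variable names and numbers are illustrative):

```python
import numpy as np

def vif(X):
    """Compute the VIF for each column of X (X contains no intercept column)."""
    n, k = X.shape
    ones = np.ones((n, 1))
    out = []
    for j in range(k):
        # Step 1: auxiliary regression of X_j on all the other X variables
        others = np.column_stack([ones, np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ coef
        tss = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        r2 = 1.0 - resid @ resid / tss
        # Step 2: VIF(beta_j) = 1 / (1 - R_j^2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Step 3 on simulated data: x1 and x2 are nearly collinear, x3 is not
rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.3 * rng.normal(size=n)
x3 = rng.normal(size=n)
vif_values = vif(np.column_stack([x1, x2, x3]))
print(vif_values.round(2))   # VIFs for x1 and x2 exceed the rule-of-thumb 5
```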
Remedies for Multicollinearity

No single solution exists that will eliminate multicollinearity. Certain approaches may be
useful:

1. Do Nothing

Live with what you have.

2. Drop a Redundant Variable

If a variable is redundant, it should never have been included in the model in the
first place, so dropping it is really just correcting a specification error. Use
economic theory to guide your choice of which variable to drop.

3. Transform the Multicollinear Variables

Sometimes you can reduce multicollinearity by re-specifying the model, for instance by
creating a combination of the multicollinear variables. As an example, rather than
including the variables GDP and population in the model, include GDP/population (GDP
per capita) instead.
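A small simulation (made-up GDP and population figures) shows why this transformation helps: GDP and population move together, but GDP per capita is nearly uncorrelated with population.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
population = rng.lognormal(mean=15, sigma=1.0, size=n)
gdp = population * rng.lognormal(mean=3, sigma=0.4, size=n)  # GDP scales with population

# Including both GDP and population invites severe multicollinearity
corr_raw = np.corrcoef(np.log(gdp), np.log(population))[0, 1]

# Re-specify: one combined variable (GDP per capita) instead of two collinear ones
gdp_per_capita = gdp / population
corr_transformed = np.corrcoef(np.log(gdp_per_capita), np.log(population))[0, 1]

print(round(corr_raw, 2), round(corr_transformed, 2))
```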

4. Increase the Sample Size

Increasing the sample size improves the precision of an estimator and reduces the
adverse effects of multicollinearity. Usually, though, adding data is not feasible.

Example

How would perfect multicollinearity arise in our previous election example?

We’ve already seen one case, the dummy variable trap (all four party dummies plus the
intercept):

vote sharei = β0 + β1spendingi + β2incumbencyi + β3malei + β4Liberali
              + β5Conservativei + β6BQi + β7NDPi + εi

What else?

Campaign spending is really the sum of five separate expenditures:

 Advertising
 Election surveys
 Office expenses
 Salaries
 Other

What if a researcher were interested in the individual effect of each of these
expenditures?

vote sharei = β0 + β1spendingi + β2advertisingi + β3surveysi + β4officei
              + β5salariesi + β6otheri + β7incumbencyi + β8malei
              + β9Liberali + β10Conservativei + β11BQi + β12NDPi + εi

This model has perfect multicollinearity, because spendingi is exactly the sum of its
five components.

Even if you correct the model, there may still be imperfect multicollinearity between
the components of campaign expenditures.
