Multicollinearity

The document discusses different types and effects of multicollinearity in regression analysis. It defines perfect and imperfect multicollinearity and explains their impacts on coefficient estimates and standard errors. Methods for detecting multicollinearity are also presented, including examining correlation coefficients, variance inflation factors, and auxiliary regressions.

• Perfect and imperfect multicollinearity
• Effects of multicollinearity
• Detecting multicollinearity
• Remedies for multicollinearity
The nature of Multicollinearity
Perfect multicollinearity:
When an exact functional (linear) relationship exists among the independent variables, that is

$\sum_i \lambda_i X_i = 0$,  or  $\lambda_1 X_1 + \lambda_2 X_2 + \lambda_3 X_3 + \dots + \lambda_k X_k = 0$

with not all $\lambda_i$ equal to zero. For example, $\lambda_1 X_1 + \lambda_2 X_2 = 0 \;\Rightarrow\; X_1 = -\dfrac{\lambda_2}{\lambda_1} X_2$.

If multicollinearity is perfect, the regression coefficients of the $X_i$ variables are indeterminate and their standard errors, $se(\hat\beta_i)$, are infinite.
Example: $\hat Y = \hat\beta_0 + \hat\beta_1 X_1 + \hat\beta_2 X_2$

3-variable case (variables in deviation form):

$\hat\beta_1 = \dfrac{(\sum y x_1)(\sum x_2^2) - (\sum y x_2)(\sum x_1 x_2)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2}$

If $x_2 = \lambda x_1$,

$\hat\beta_1 = \dfrac{(\sum y x_1)(\lambda^2 \sum x_1^2) - (\lambda \sum y x_1)(\lambda \sum x_1^2)}{(\sum x_1^2)(\lambda^2 \sum x_1^2) - \lambda^2 (\sum x_1^2)^2} = \dfrac{0}{0}$   Indeterminate

Similarly,

$\hat\beta_2 = \dfrac{(\sum y x_2)(\sum x_1^2) - (\sum y x_1)(\sum x_1 x_2)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2}$

and if $x_2 = \lambda x_1$,

$\hat\beta_2 = \dfrac{(\lambda \sum y x_1)(\sum x_1^2) - (\sum y x_1)(\lambda \sum x_1^2)}{(\sum x_1^2)(\lambda^2 \sum x_1^2) - \lambda^2 (\sum x_1^2)^2} = \dfrac{0}{0}$   Indeterminate
If multicollinearity is imperfect,

$x_2 = \lambda_1 x_1 + \upsilon$,  where $\upsilon$ is a stochastic error
(or $x_2 = \lambda_0 + \lambda_1 x_1 + \upsilon$),

then the regression coefficients, although determinate, possess large standard errors: the coefficients can be estimated, but less accurately.

$\hat\beta_1 = \dfrac{(\sum y x_1)(\lambda_1^2 \sum x_1^2 + \sum \upsilon^2) - (\lambda_1 \sum y x_1 + \sum y\upsilon)(\lambda_1 \sum x_1^2 + \sum x_1\upsilon)}{(\sum x_1^2)(\lambda_1^2 \sum x_1^2 + \sum \upsilon^2) - (\lambda_1 \sum x_1^2 + \sum x_1\upsilon)^2} \;\neq\; \dfrac{0}{0}$

Why? Because $x_1$ and $\upsilon$ are uncorrelated, $\sum x_1 \upsilon = 0$, so the numerator and denominator no longer collapse to zero and the estimator is determinate.
Example: Production function (log form)

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$

 Y     X1    X2     X3        Y : Output
 122   10    50     52        X1: Capital
 170   15    75     75        X2: Labor
 202   18    90     97        X3: Land
 270   24    120    129
 330   30    150    152

Note that X2 = 5X1 in every observation, so the regressors are perfectly collinear.
Example: Perfect multicollinearity

a. Suppose $D_1, D_2, D_3, D_4 = 1$ for spring, summer, autumn and winter, respectively (and 0 otherwise):
   $Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 D_{3i} + \beta_4 D_{4i} + \alpha_1 X_{1i} + \varepsilon_i$
   Since $D_{1i} + D_{2i} + D_{3i} + D_{4i} = 1$ for every observation, the dummies are perfectly collinear with the intercept (the dummy variable trap); a numerical sketch follows this list.
b. $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$
   X1: nominal interest rate; X2: real interest rate; X3: CPI
c. $Y_t = \beta_0 + \beta_1 X_t + \beta_2 \Delta X_t + \beta_3 X_{t-1} + \varepsilon_t$
   where $\Delta X_t = X_t - X_{t-1}$ is called the "first difference".
Imperfect Multicollinearity

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i$

When some independent variables are linearly correlated but the relation is not exact, there is imperfect multicollinearity:

$\lambda_0 + \lambda_1 X_{1i} + \lambda_2 X_{2i} + \dots + \lambda_K X_{Ki} + u_i = 0$

where $u$ is a random error term and $\lambda_k \neq 0$ for some $k$. When will it be a problem?
Consequences of imperfect multicollinearity

1. The estimated coefficients are still BLUE; however, the OLS estimators have large variances and covariances, making estimation less precise.
2. The confidence intervals tend to be much wider, leading to acceptance of the "zero null hypothesis" more readily.
3. The t-statistics of the coefficients tend to be statistically insignificant.
4. The R² can nevertheless be very high.
5. The OLS estimators and their standard errors can be sensitive to small changes in the data.

Consequences 3 to 5 can be detected directly from the regression results.
OLS estimators are still BLUE under imperfect multicollinearity. Why?

Remarks:
• Unbiasedness is a repeated-sampling property, not a statement about the estimate obtained in any given sample.
• Minimum variance does not mean small variance.
• Imperfect multicollinearity is just a sample phenomenon.
Effects of Imperfect Multicollinearity

Unaffected:
a. OLS estimators are still BLUE.
b. The overall fit of the equation.
c. The estimates of the coefficients of non-multicollinear variables.

The variances of the OLS estimators increase with the degree of multicollinearity.
Regression model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$

High correlation between X1 and X2 makes it difficult to isolate the effects of X1 and X2 from each other, which is precisely what the OLS partial regression coefficients are supposed to do.

A closer relation between X1 and X2
⇒ $r_{12}^2$ moves toward 1
⇒ a larger VIF $= 1/(1 - r_{12}^2)$, which approaches infinity
⇒ larger variances $\mathrm{var}(\hat\beta_k)$.
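The link between collinearity and these variances can be made explicit. The standard textbook expression for the slope variance (a known result, quoted here rather than derived in the slides) contains the VIF as a multiplicative factor:

\[
\mathrm{var}(\hat\beta_k)
  = \frac{\sigma^2}{\sum_i (X_{ki} - \bar X_k)^2}\cdot\frac{1}{1 - R_k^2}
  = \frac{\sigma^2}{\sum_i (X_{ki} - \bar X_k)^2}\cdot \mathrm{VIF}_k ,
\]

where $R_k^2$ is the $R^2$ from regressing $X_k$ on the other explanatory variables. As $R_k^2 \to 1$, $\mathrm{var}(\hat\beta_k) \to \infty$, which is exactly the chain of implications listed above.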

a. More likely to get unexpected signs.


signs

  
b. se ˆ k
2
tends to be large

Larger variances tend to increase the


standard errors of estimated coefficients.
coefficients
c. Larger standard errors  Lower t-values
ˆ  *
k k
tk 
 
se ˆ k
d. Larger standard errors
 Wider confidence intervals

 
ˆ k  t df , / 2  se ˆ k

Less precise interval estimates.


Detecting Multicollinearity

IN FACT, there are no formal "tests" for multicollinearity. Some careful considerations:

1. It is a question of degree, not of kind; its mere presence is not the main issue.
2. Since the explanatory variables are assumed to be nonstochastic, multicollinearity is a feature of the sample, not of the population; it is an issue of sample selection.
3. Therefore, we measure the degree of multicollinearity rather than test for its presence.
Detecting Multicollinearity

IN FACT, there are no formal "tests" for multicollinearity, but there are several useful indicators:

1. Few significant t-ratios despite a high R² and collective significance of the variables (F test).
2. High pairwise correlations between the explanatory variables (does not always hold).
3. Examination of partial correlations (sometimes helpful).
4. Estimate auxiliary regressions.
5. Calculate the condition index from the eigenvalues.
6. Estimate variance inflation factors (VIF).
Dependent Variable: Y
Variable            Coefficient   Std. Error   t-Statistic   Prob.
C                   2933.906      8172.257      0.359008     0.7271
X1                  50.53776      69.70081      0.725067     0.4850
X2                  -103.5042     51.14744     -2.023644     0.0705
X3                  6.115795      3.713679      1.646829     0.1306
X4                  -105.9787     151.9428     -0.697491     0.5014
X5                  0.123797      0.122672      1.009174     0.3367
R-squared           0.754505      Akaike info criterion   16.23749
Adjusted R-squared  0.631757      Schwarz criterion       16.52721
S.E. of regression  706.1345      F-statistic              6.146800
Sum squared resid   4986260.      Prob(F-statistic)        0.007426

High R² and a significant F-statistic, yet no individual coefficient is significant at the 5% level: a typical symptom of multicollinearity.
High pairwise correlations between the explanatory variables

CORRELATION MATRIX

X1 X2 X3 X4 X5
X1 1.00000 0.99686 0.99135 0.52583 0.97214
X2 0.99686 1.00000 0.99127 0.54331 0.96524
X3 0.99135 0.99127 1.00000 0.46144 0.97262
X4 0.52583 0.54331 0.46144 1.00000 0.53618
X5 0.97214 0.96524 0.97262 0.53618 1.00000
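Such a correlation matrix is easy to reproduce. Below is a minimal sketch with simulated data and illustrative column names X1 to X5 (not the data behind the table above):

```python
import numpy as np
import pandas as pd

# Simulated data with strong collinearity among hypothetical regressors X1..X5
rng = np.random.default_rng(1)
n = 30
x1 = rng.normal(10, 2, n)
df = pd.DataFrame({
    "X1": x1,
    "X2": 5 * x1 + rng.normal(0, 0.5, n),    # nearly X2 = 5*X1 (imperfect collinearity)
    "X3": 2 * x1 + rng.normal(0, 1.0, n),
    "X4": rng.normal(0, 1, n),               # unrelated regressor
    "X5": x1 + rng.normal(0, 1.0, n),
})

corr = df.corr()
print(corr.round(3))          # off-diagonal values near 1 flag collinear pairs
```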
Auxiliary Regressions

Auxiliary regressions: regress each explanatory variable on the remaining explanatory variables, e.g.

$X_{2i} = a_1 + a_2 X_{3i} + a_3 X_{4i}$
$X_{3i} = b_1 + b_2 X_{2i} + b_3 X_{4i}$
$X_{4i} = c_1 + c_2 X_{2i} + c_3 X_{3i}$

The R² of each auxiliary regression shows how strongly $X_j$ is collinear with the other explanatory variables.

Dependent
Variable      R²          VIF         Prob (F-stat)
X1            0.995877    242.5418    0.0000
X2            0.99766     427.3504    0.0000
X3            0.995625    228.5714    0.0000
X4            0.793711    4.8476      0.00091
X5            0.974219    38.7883     0.0000

$F_i = \dfrac{R^2_{x_i \cdot x_2 x_3 \dots x_k} / (k-2)}{\left(1 - R^2_{x_i \cdot x_2 x_3 \dots x_k}\right) / (n-k+1)}$

If $F_i$ exceeds the critical F value, $X_i$ is collinear with the other X's. Alternatively, following Klein's rule of thumb, multicollinearity is troublesome if an auxiliary R² exceeds the overall R² of the original regression.
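The auxiliary-regression procedure and the comparison behind Klein's rule of thumb can be sketched with statsmodels; the data and variable names here are simulated, not those from the table above:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated collinear regressors (illustrative names X1..X3) and a response Y
rng = np.random.default_rng(2)
n = 40
x1 = rng.normal(size=n)
X = pd.DataFrame({
    "X1": x1,
    "X2": 0.95 * x1 + rng.normal(scale=0.3, size=n),   # strongly related to X1
    "X3": rng.normal(size=n),
})
y = 1 + 2 * X["X1"] - X["X2"] + 0.5 * X["X3"] + rng.normal(size=n)

overall_r2 = sm.OLS(y, sm.add_constant(X)).fit().rsquared

# Regress each X_j on the remaining regressors; compare R^2 to the overall R^2 (Klein's rule)
for col in X.columns:
    others = sm.add_constant(X.drop(columns=col))
    aux = sm.OLS(X[col], others).fit()
    vif = 1.0 / (1.0 - aux.rsquared)
    flag = "troublesome" if aux.rsquared > overall_r2 else "ok"
    print(f"{col}: aux R2={aux.rsquared:.3f}  VIF={vif:.1f}  F p-value={aux.f_pvalue:.4f}  ({flag})")
```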
Detection of Multicollinearity

Example (data set: Table#M1):

$CO_i = \beta_0 + \beta_1 Yd_i + \beta_2 LA_i + \varepsilon_i$

CO: annual consumption expenditure
Yd: annual disposable income
LA: liquid assets

Since LA is highly related to Yd, the results show a high R² and adjusted R² but less significant t-values.
OLS estimates and standard errors can be sensitive to specification changes and to small changes in the data.

Specification changes: add or drop variables.
Small changes: add or drop some observations, or change some data values.
High Simple Correlation Coefficients

$r_{ij} = \dfrac{\sum (X_i - \bar X_i)(X_j - \bar X_j)}{\sqrt{\sum (X_i - \bar X_i)^2 \; \sum (X_j - \bar X_j)^2}}$

Remark: a high $r_{ij}$ for some pair $i, j$ is a sufficient indicator of the existence of multicollinearity, but not a necessary one.
Variance Inflation Factors (VIF) Method

Procedure:
(1) $Y = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \dots + \alpha_k X_k + \varepsilon$
(2) Regress each $X_k$ on the remaining regressors, e.g. $X_1 = \lambda_1 + \lambda_2 X_2 + \lambda_3 X_3 + \dots + \lambda_k X_k + \upsilon$, and obtain $R_k^2$.
(3) $\mathrm{VIF}(\hat\beta_k) = \dfrac{1}{1 - R_k^2}$

Rule of thumb: VIF > 5 suggests multicollinearity.

Notes: (a) Using VIF is not a statistical test. (b) The cut-off point is arbitrary; some use VIF > 10.
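For the VIF calculation itself, statsmodels provides variance_inflation_factor, which computes 1/(1 - Rk²) directly; a minimal sketch with simulated data and illustrative column names:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Simulated regressors (illustrative names); X2 is nearly a multiple of X1
rng = np.random.default_rng(3)
n = 50
x1 = rng.normal(size=n)
X = pd.DataFrame({
    "X1": x1,
    "X2": 3 * x1 + rng.normal(scale=0.2, size=n),
    "X3": rng.normal(size=n),
})

exog = sm.add_constant(X)                    # include the constant when computing VIFs
for i, name in enumerate(exog.columns):
    if name == "const":
        continue
    print(name, round(variance_inflation_factor(exog.values, i), 2))
# VIF > 5 (or > 10, depending on the rule of thumb used) flags a problem regressor
```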
Remedial Measures

1. Drop the Redundant Variable

Use theory to pick which variable to drop. Do not drop a variable that is strongly supported by theory (danger of specification error).
Example: when M1 and M2, which are highly related, are both included, each appears insignificant; dropping one of them removes the redundancy.

Other examples of near-redundant pairs: CPI and WPI; the CD rate and the TB rate; GDP, GNP and GNI.
Check after dropping variables:
• The estimates of the coefficients of the other variables are not affected. (necessary)
• R² does not fall much when some collinear variables are dropped. (necessary)
• More significant t-values, i.e. smaller standard errors. (likely)
2. Redesigning the Regression Model

There is no definite rule for this method.

Example (demand for fish):

$F_t = f(PF_t, PB_t, Yd_t, N_t, P_t) + \varepsilon_t$

Ft  = average pounds of fish consumed per capita
PFt = price index for fish
PBt = price index for beef
Ydt = real per capita disposable income
Nt  = the number of Catholics
Pt  = dummy = 1 after the Pope's 1966 decision, = 0 otherwise

$F_t = \beta_0 + \beta_1 PF_t + \beta_2 PB_t + \beta_3 \ln Yd_t + \beta_4 N_t + \beta_5 P_t + \varepsilon_t$

High correlations among the regressors:
VIF(PF) = 43.4, VIF(lnYd) = 23.3, VIF(PB) = 18.9, VIF(N) = 18.5, VIF(P) = 4.4.
Signs are unexpected and most t-values are insignificant.
Dropping N does not improve the results.

An improvement: use the relative price $RP_t = PF_t / PB_t$:

$F_t = f(RP_t, Yd_t, P_t) + \varepsilon_t$

$F_t = \beta_0 + \beta_1 RP_t + \beta_2 \ln Yd_t + \beta_3 P_t + \varepsilon_t$

This improves the results considerably. Using the lagged term of RP allows for a lag effect in the regression:

$F_t = \beta_0 + \beta_1 RP_{t-1} + \beta_2 \ln Yd_t + \beta_3 P_t + \varepsilon_t$
3. Using A Priori Information

From previous empirical work, e.g.

$Cons_i = \beta_0 + \beta_1 Income_i + \beta_2 Wealth_i + \varepsilon_i$

with the a priori information that $\beta_2 = 0.1$. Then construct a new variable,

$Cons^*_i = Cons_i - 0.1\,Wealth_i$,

and run OLS on $Cons^*_i = \beta_0 + \beta_1 Income_i + \varepsilon_i$.
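A sketch of imposing this prior with statsmodels; the data are simulated, and only the prior value 0.1 comes from the slide:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated data: wealth is strongly related to income (the source of the collinearity)
rng = np.random.default_rng(5)
n = 60
income = rng.normal(100, 10, n)
wealth = 10 * income + rng.normal(0, 5, n)
cons = 5 + 0.8 * income + 0.1 * wealth + rng.normal(0, 2, n)

# Impose the prior beta2 = 0.1 by moving 0.1*wealth to the left-hand side
cons_star = cons - 0.1 * wealth
restricted = sm.OLS(cons_star, sm.add_constant(pd.Series(income, name="Income"))).fit()
print(restricted.params)        # intercept and income coefficient, estimated without wealth
```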
4. Transformation of the Model

Take first differences of the time-series data.

Original regression model:

$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \varepsilon_t$

Transformed model (first differencing):

$\Delta Y_t = \beta_1 \Delta X_{1t} + \beta_2 \Delta X_{2t} + u_t$

where $\Delta Y_t = Y_t - Y_{t-1}$ ($Y_{t-1}$ is a lagged term), $\Delta X_{1t} = X_{1t} - X_{1,t-1}$, and $\Delta X_{2t} = X_{2t} - X_{2,t-1}$.
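A sketch of the differencing step in pandas, using simulated trending series and illustrative column names Y, X1, X2:

```python
import numpy as np
import pandas as pd

# Simulated trending series so that X1 and X2 are highly correlated in levels
rng = np.random.default_rng(6)
t = 40
trend = np.arange(t, dtype=float)
df = pd.DataFrame({
    "X1": trend + rng.normal(scale=0.5, size=t),
    "X2": 2 * trend + rng.normal(scale=0.5, size=t),
})
df["Y"] = 1 + 0.5 * df["X1"] + 0.3 * df["X2"] + rng.normal(size=t)

d = df.diff().dropna()                          # first differences of Y, X1, X2
print(df[["X1", "X2"]].corr().iloc[0, 1])       # near 1 in levels
print(d[["X1", "X2"]].corr().iloc[0, 1])        # much lower after differencing
```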
5. Collect More Data (expand the sample size)

A larger sample size means smaller variances of the estimators.

6. Doing Nothing

Simply point out the multicollinearity problem and report the results as they are.
