
2. MULTICOLLINEARITY AND MISSING OBSERVATIONS

[1] MULTICOLLINEARITY
(1) Perfect Collinearity:
• When the regressors are perfectly linearly related, the X matrix has less than
full column rank (Rank(X) < k).

Example 1:
GDPt = β1 + β2Gt + β3Tt + β4DEFt + εt.
→ DEFt = Gt - Tt for any t.
→ Rank(X) = 3.

Example 2: Dummy variable trap


log(wt) = β1 + β2aget + β3dt1 + β4dt2 + β5dt3 + εt,
where dt1 = 1 iff person t’s education level is below high school graduation
(dt1 = 0 otherwise); dt2 = 1 iff person t is a high school graduate but not a
college graduate (dt2 = 0 otherwise); dt3 = 1 iff person t is a college graduate
(dt3 = 0 otherwise).
→ dt1 + dt2 + dt3 = 1 for every t, which is perfectly collinear with the intercept.

• Consequence of perfect multicollinearity:

• X′X is singular, so (X′X)⁻¹ does not exist and the OLS estimates cannot be computed.
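As an illustration (not from the notes), here is a minimal Python/numpy sketch with hypothetical data that falls into the dummy variable trap of Example 2 and confirms the rank deficiency:

import numpy as np

# Hypothetical data: an intercept, age, and three education dummies that sum to one.
rng = np.random.default_rng(0)
n = 100
educ = rng.integers(0, 3, size=n)           # 0, 1, 2 = three education levels
d1 = (educ == 0).astype(float)
d2 = (educ == 1).astype(float)
d3 = (educ == 2).astype(float)
age = rng.uniform(20, 60, size=n)

X = np.column_stack([np.ones(n), age, d1, d2, d3])   # k = 5 columns

print(np.linalg.matrix_rank(X))             # 4 < 5: X is not of full column rank
print(np.linalg.matrix_rank(X.T @ X))       # X'X is singular, so (X'X)^{-1} does not exist
# np.linalg.inv(X.T @ X) would raise LinAlgError or return numerically meaningless values.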

(2) Near Multicollinearity
• When regressors are highly (not perfectly) correlated.

1) Consequences of near multicollinearity on OLS estimators:


• Under SIC, the OLS estimators are still unbiased, consistent and efficient.
• Under WIC, the OLS estimators are still consistent, asymptotically normal and
efficient.
• Then, what is the problem?
→ Individual coefficient estimates are imprecise (large standard errors) and
therefore unreliable.

2) Symptoms of near multicollinearity:


• Small changes in the sample lead to large changes in the estimates.
• High R², but low t-statistics.
• High Rj²'s, where Rj² is the R² from OLS of xtj on xt1, ..., xt,j-1, xt,j+1, ..., xtk.
• A high value of the condition number [λmax/λmin]^(1/2), where the λ's are the
eigenvalues of X′X. (See Greene.) A sketch for computing these diagnostics
follows this list.
• Estimates may be wildly different from those suggested by theory.
• Think of a regression of consumption on a constant, income and wealth:
income and wealth are so highly correlated that their individual coefficients
are poorly determined.
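The Rj² and condition-number diagnostics above can be computed directly. The following is a minimal Python/numpy sketch (not from the notes); the function name is hypothetical, and X is assumed to be an n×k array whose first column is the intercept:

import numpy as np

def collinearity_diagnostics(X):
    """Rj^2 for each non-intercept regressor and the condition number of X'X."""
    n, k = X.shape
    r2 = []
    for j in range(1, k):                        # skip the intercept column
        xj = X[:, j]
        Xj = np.delete(X, j, axis=1)             # all other regressors
        b, *_ = np.linalg.lstsq(Xj, xj, rcond=None)
        resid = xj - Xj @ b
        sst = np.sum((xj - xj.mean()) ** 2)
        r2.append(1.0 - resid @ resid / sst)     # R^2 from regressing xj on the rest
    eig = np.linalg.eigvalsh(X.T @ X)            # eigenvalues of X'X (symmetric)
    cond = np.sqrt(eig.max() / eig.min())        # [lambda_max / lambda_min]^(1/2)
    return np.array(r2), cond

Rj² values close to one and a large condition number are the warning signs listed above.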

Question: Why does multicollinearity make t-statistics low?

Theorem:
Let xj = the j'th column of X;
Xj* = X with the j'th column deleted;
SSEj = SSE from a regression of xj on Xj*;
SSTj = Σt(xtj − x̄j)².

Then the j'th diagonal element of (X′X)⁻¹ = 1/SSEj = 1/[SSTj(1 − Rj²)].
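A quick numerical check of this identity (a sketch with simulated data, not part of the notes; it assumes X contains an intercept and no perfect collinearity):

import numpy as np

rng = np.random.default_rng(1)
n, j = 200, 2                                    # check the identity for column j = 2
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=n)         # correlated with x1
X = np.column_stack([np.ones(n), x1, x2])

# Left-hand side: j'th diagonal element of (X'X)^{-1}
lhs = np.linalg.inv(X.T @ X)[j, j]

# Right-hand side: 1 / [SST_j (1 - R_j^2)], with R_j^2 from regressing x_j on the rest
xj = X[:, j]
Xstar = np.delete(X, j, axis=1)
b, *_ = np.linalg.lstsq(Xstar, xj, rcond=None)
sse_j = np.sum((xj - Xstar @ b) ** 2)
sst_j = np.sum((xj - xj.mean()) ** 2)
r2_j = 1.0 - sse_j / sst_j
rhs = 1.0 / (sst_j * (1.0 - r2_j))

print(lhs, rhs)                                  # the two numbers agree up to rounding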

Implication:
• var(β̂j) = σ²/[SSTj(1 − Rj²)] → As Rj² ↑, var(β̂j) ↑.
• se(β̂j) = {s²/[SSTj(1 − Rj²)]}^(1/2) → As Rj² ↑, se(β̂j) ↑.
• (t-statistic for Ho: βj = 0) = β̂j/se(β̂j) → As Rj² ↑, |t| ↓.
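For a rough sense of magnitude (hypothetical numbers, not from the notes): holding σ² and SSTj fixed, moving from Rj² = 0 to Rj² = 0.99 multiplies var(β̂j) = σ²/[SSTj(1 − Rj²)] by 1/(1 − 0.99) = 100, so se(β̂j) becomes 10 times larger and |t| 10 times smaller.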

3) Remedies
1. Drop some regressors that are highly correlated with others (questionable:
dropping relevant regressors causes omitted-variable bias).
2. Collect a richer data set.
3. Use alternative estimators.

(3) Alternative Estimators
1) Ridge regression estimator:
β̂r = (X′X + rIk)⁻¹X′y, r > 0.
• Biased, but can have a smaller MSE than OLS.
Cov(β̂r) = σ²[X′X + rIk]⁻¹X′X[X′X + rIk]⁻¹.
• Does this estimator really solve the multicollinearity problem?
• No clear meaning to statistical inferences.
• What is the optimal choice of r?
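A minimal Python sketch of the two formulas above (illustrative only; the function names are hypothetical, and in practice the regressors are usually standardized and the intercept left unpenalized, which is not done here):

import numpy as np

def ridge(X, y, r):
    """Ridge estimator beta_r = (X'X + r I_k)^{-1} X'y for a given penalty r > 0."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + r * np.eye(k), X.T @ y)

def ridge_cov(X, sigma2, r):
    """Cov(beta_r) = sigma^2 (X'X + r I_k)^{-1} X'X (X'X + r I_k)^{-1}."""
    k = X.shape[1]
    A = np.linalg.inv(X.T @ X + r * np.eye(k))
    return sigma2 * A @ (X.T @ X) @ A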

2) Principal Component Estimator.


• Procedure:
• Compute the eigenvalues of X′X and sort them as λ1 ≥ λ2 ≥ ... ≥ λk
(with the corresponding k×1 eigenvectors c such that X′Xc = λc).
• Choose normalized eigenvectors, i.e., cj′cj = 1.
• Choose the L largest λ's (λ1, λ2, ..., λL) and the corresponding c1, ..., cL.
• Define CL = [c1, ..., cL].
• Let Z = XCL, and do OLS on y = Zγ + error: γ̂ = (Z′Z)⁻¹Z′y.
• Principal Component Estimator: β̂pc = CLγ̂.
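A minimal Python sketch of this procedure (illustrative; the function name pc_estimator is hypothetical and no standardization of X is performed, although the estimator is sensitive to scaling, as noted below):

import numpy as np

def pc_estimator(X, y, L):
    """Principal component estimator: keep the L leading eigenvectors of X'X."""
    XtX = X.T @ X
    eigval, eigvec = np.linalg.eigh(XtX)         # eigenvectors are normalized (c'c = 1)
    order = np.argsort(eigval)[::-1]             # sort eigenvalues: largest first
    CL = eigvec[:, order[:L]]                    # C_L = [c_1, ..., c_L]
    Z = X @ CL                                   # principal components
    gamma_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)    # OLS of y on Z
    return CL @ gamma_hat                        # beta_pc = C_L * gamma_hat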

• Facts:

• β̂pc = CLCL′β̂, where β̂ is the OLS estimator (see Greene).

• Biased unless L = k. (In fact, β̂pc = β̂ if L = k.)

• Sensitive to the scale of measurement of the regressors.

• No clear meaning to statistical inferences.

3) Suggestion:
Ridge regression or PC estimators could be useful for prediction,
but they are not very helpful for statistical inference.

(4) An ad hoc alternative (is multicollinearity conquered?)

1) Situation: Consider the following regression model:
yt = β1 + β2xt2 + β3xt3 + β4xt4 + εt.
Suppose that xt3 and xt4 are so highly correlated that the t-tests for
β3 and β4 indicate insignificance of the two parameters.

<Example>
Dependent Variable: LWAGE
Sample: 1 935

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            5.517432      0.124819     44.20360      0.0000
EDUC         0.077987      0.006624     11.77291      0.0000
EXPER        0.016256      0.013540     1.200595      0.2302
EXPER^2      0.000152      0.000567     0.268133      0.7887

R-squared    0.130926      Mean dependent var    6.779004

2) Alternative regression:
Step 1: Regress xt3 on a constant and xt4, and obtain the residuals ut.
Step 2: Regress yt on a constant, xt2, ut and xt4.

<Example>
Dependent Variable: LWAGE
Included observations: 935

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            5.604536      0.101658     55.13115      0.0000
EDUC         0.077987      0.006624     11.77291      0.0000
U            0.016256      0.013540     1.200595      0.2302
EXPER^2      0.000812      0.000138     5.869204      0.0000

R-squared    0.130926      Mean dependent var    6.779004

Observe that EXPER^2 is now significant!
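A minimal Python sketch of the two-step procedure (illustrative; the function name is hypothetical, and the arrays follow the notation of the notes: in the LWAGE example, x2 = EDUC, x3 = EXPER, x4 = EXPER^2):

import numpy as np

def two_step_fit(y, x2, x3, x4):
    """Replace x3 by the residual u from regressing x3 on a constant and x4,
    then run OLS of y on (1, x2, u, x4). Caution: the coefficient on x4 now
    estimates beta4 + delta2*beta3, and the reported standard errors ignore
    the fact that u is itself estimated (see the problems discussed below)."""
    n = len(y)
    ones = np.ones(n)

    # Step 1: regress x3 on a constant and x4, keep the residuals u
    W = np.column_stack([ones, x4])
    d, *_ = np.linalg.lstsq(W, x3, rcond=None)
    u = x3 - W @ d

    # Step 2: regress y on a constant, x2, u and x4
    X = np.column_stack([ones, x2, u, x4])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, u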

3) Logic of this treatment:
• Let xt3 = δ1 + δ2xt4 + ut. Substitute this into the original regression model:
yt = β1 + β2xt2 + β3(δ1 + δ2xt4 + ut) + β4xt4 + εt
   = (β1 + β3δ1) + β2xt2 + β3ut + (β4 + δ2β3)xt4 + εt.
Since ut and xt4 are uncorrelated, this alternative model does not
suffer from multicollinearity.
• Is it really so?

4) Problems with this approach:

• Your estimate of the coefficient on xt4 is an estimate of (β4 + δ2β3), not of β4!
• You cannot use the true ut; instead, you use the estimated ût. Because ût is
estimated, the OLS covariance matrix is no longer of the form σ²(X′X)⁻¹. This is
called the “generated regressor” problem.

5) Conclusion: No magic! No solution!

[2] MISSING OBSERVATIONS


• Suppose that some values of yt, xt1, ..., xtk are missing for some
observations.
• Is there any good way to fill in the missing values?
• There might be, but it may be better not to do so.
• See Greene, Chapter 4.9.2.
