
Chapter VI: Relaxing Assumptions of the Classical Model

I. Heteroscedasticity
The variance of the error term is not constant for all observations: σ²(uᵢ) ≠ σ²(uⱼ) ∀ i ≠ j

The Consequences of Heteroscedasticity


● OLS estimators are still unbiased (unless there are also omitted variables).
● However, OLS estimators are no longer efficient (minimum variance).
● The formulae used to estimate the coefficient standard errors are no longer correct:
- the t-tests will be misleading (if the error variance is positively related to an independent variable, the estimated standard errors are biased downwards and hence the t-values will be inflated; see the simulation sketch below);
- confidence intervals based on these standard errors will be wrong.
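
The downward bias in the conventional standard errors can be illustrated with a small Monte Carlo sketch (purely illustrative: the data-generating process, the sample sizes and the use of Python with numpy/statsmodels are my own assumptions, not part of these notes):

import numpy as np
import statsmodels.api as sm

# Illustrative simulation: the error s.d. rises with X, so the error variance
# is positively related to the regressor.
rng = np.random.default_rng(42)
n, reps = 100, 2000
slopes, reported_se = [], []

for _ in range(reps):
    x = rng.uniform(1, 10, n)
    u = rng.normal(0, 0.5 * x)                 # heteroscedastic errors
    y = 2.0 + 1.5 * x + u
    res = sm.OLS(y, sm.add_constant(x)).fit()
    slopes.append(res.params[1])               # estimated slope
    reported_se.append(res.bse[1])             # conventional OLS standard error

print("empirical s.d. of the slope estimates:", np.std(slopes))
print("average OLS-reported standard error :", np.mean(reported_se))
# The reported standard error understates the true sampling variability,
# so t-values computed from it are inflated.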

Detecting Heteroscedasticity
There are two methods to detect heteroscedasticity:
1. Graphical method: visual inspection of the scatter diagram of the residuals
2. Use one of the more formal tests:
- Goldfeld-Quandt test: suitable for a simple form of heteroscedasticity
- Breusch-Pagan test: for more general forms of heteroscedasticity

1. Graphical Method
If there is no a priori or empirical information about the nature of heteroscedasticity, in practice one can do the regression analysis on the assumption that there is no heteroscedasticity and then do a post-mortem examination of the squared residuals ûᵢ² to see if they exhibit any systematic pattern.

Although the ûᵢ² are not the same thing as the uᵢ², they can be used as proxies, especially if the sample size is sufficiently large.

An examination of the ûᵢ² may reveal patterns such as those shown in figures (a)-(e).

In figures (a)-(e) the ûᵢ² are plotted against Ŷᵢ, the estimated Yᵢ from the regression line, the idea being to find out whether the estimated mean value of Y is systematically related to the squared residual.

In figure (a) we see that there is no systematic pattern between the two variables, suggesting that perhaps no
heteroscedasticity is present in the data.

Figures (b) to (e), however, exhibit definite patterns:

• Figure (c) suggests a linear relationship between ûᵢ² and Ŷᵢ;

• Figures (d) and (e) indicate a quadratic relationship between ûᵢ² and Ŷᵢ.

Using such knowledge, albeit informal, one may transform the data in such a manner that the transformed data do not
exhibit heteroscedasticity.
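
A minimal sketch of this graphical check (simulated data; the variable names and the use of matplotlib/statsmodels are assumptions, not part of the notes): the squared residuals are plotted against the fitted values, as in figures (a)-(e).

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 0.5 * x)     # heteroscedastic by construction

res = sm.OLS(y, sm.add_constant(x)).fit()

# Scatter of squared residuals against fitted values
plt.scatter(res.fittedvalues, res.resid ** 2, s=10)
plt.xlabel("fitted values (estimated Y)")
plt.ylabel("squared residuals")
plt.title("Visual check for heteroscedasticity")
plt.show()

A patternless cloud corresponds to figure (a); a fan or curved shape corresponds to the patterns in figures (b)-(e).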

2. Formal Methods
1. Goldfeld-Quandt test
● Suppose the error variance is proportional to the square of one of the X's (σᵤᵢ = σᵤXᵢ).
● Rank the data according to the culprit variable and conduct an F-test using (RSS₂/df₂)/(RSS₁/df₁) ≈ F(df₂, df₁), where these RSS are based on regressions using the first and last (n − c)/2 observations [c is a central section of data, usually about 25% of n].
● The decision rule is: reject H₀ of homoscedasticity if F_cal > F_table (a sketch of the test follows below).
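
A sketch of the test implemented directly from the steps above (the helper name goldfeld_quandt and the 25% central drop are illustrative assumptions):

import numpy as np
import statsmodels.api as sm
from scipy import stats

def goldfeld_quandt(y, x, drop_frac=0.25):
    # Rank the sample by the "culprit" variable x
    order = np.argsort(x)
    y, x = np.asarray(y)[order], np.asarray(x)[order]
    n = len(y)
    c = int(round(drop_frac * n))       # central section to omit (about 25% of n)
    m = (n - c) // 2                    # size of each outer group

    def rss_df(y_part, x_part):
        res = sm.OLS(y_part, sm.add_constant(x_part)).fit()
        return res.ssr, int(res.df_resid)

    rss1, df1 = rss_df(y[:m], x[:m])    # low-variance group
    rss2, df2 = rss_df(y[-m:], x[-m:])  # high-variance group

    f_stat = (rss2 / df2) / (rss1 / df1)
    p_value = 1.0 - stats.f.cdf(f_stat, df2, df1)
    return f_stat, p_value

# Reject H0 of homoscedasticity if f_stat exceeds the F table value
# (equivalently, if p_value is small).

statsmodels also ships a ready-made version of this test (het_goldfeldquandt in statsmodels.stats.diagnostic) that can be used to cross-check a hand-rolled implementation.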

2. Breusch-Pagan test

● Regress the squared residuals on a number of regressors 𝑧𝑠, which may or may not be present in the initial
regression model that we want to test.
● Equation below shows the general form of the variance function:
ûᵢ² = α₁ + α₂zᵢ₂ + α₃zᵢ₃ + … + αₛzᵢₛ + vᵢ

● The Breusch-Pagan test is based on the statistic:


χ² = N × R² ~ χ²(S − 1)

● The decision rule is: reject H₀ if χ² = N × R² exceeds the χ²(S − 1) critical value (a sketch of the test follows below).
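
A sketch of the N × R² form of the test, built directly from the auxiliary regression above (breusch_pagan_nr2 is a hypothetical helper name; the choice of z variables is up to the user):

import numpy as np
import statsmodels.api as sm
from scipy import stats

def breusch_pagan_nr2(resid, Z):
    # Auxiliary regression of the squared residuals on the z variables
    u2 = np.asarray(resid) ** 2
    Zc = sm.add_constant(np.asarray(Z))
    aux = sm.OLS(u2, Zc).fit()

    n = len(u2)
    s = Zc.shape[1]                     # number of alphas, intercept included
    stat = n * aux.rsquared             # chi-square statistic = N x R-squared
    p_value = 1.0 - stats.chi2.cdf(stat, s - 1)
    return stat, p_value

# Reject H0 of homoscedasticity if stat exceeds the chi-square(S - 1)
# critical value (small p_value).

The het_breuschpagan function in statsmodels.stats.diagnostic implements essentially the same Lagrange-multiplier statistic.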

Remedies of Heteroscedasticity
● Respecification of the model
- Include relevant omitted variable(s)
- Express model in log-linear form or some other appropriate functional form
- Express variables in per capita form
● Where respecification won't solve the problem we can use, for example:
- robust heteroscedasticity-consistent standard errors (due to H. White, Econometrica 1980); or
- the Generalized Least Squares (GLS) estimator (both are sketched below).
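
A sketch of these two remedies with statsmodels, on simulated data whose error variance is proportional to Xᵢ², so that weighting by 1/Xᵢ² is the appropriate GLS transformation (the data and names are illustrative assumptions):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(1, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 0.5 * x)   # Var(u_i) proportional to x_i^2
X = sm.add_constant(x)

# 1) White heteroscedasticity-consistent ("robust") standard errors
robust = sm.OLS(y, X).fit(cov_type="HC0")    # HC0 is White's original estimator
print(robust.bse)                            # robust standard errors

# 2) GLS via weighted least squares: weights are the reciprocal error variances
wls = sm.WLS(y, X, weights=1.0 / x**2).fit()
print(wls.params, wls.bse)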

II. Autocorrelation
What is meant by autocorrelation?
The error terms are not independent from observation to observation:
→ 𝑢𝑡 depends on one or more past values of 𝑢.

What are its consequences?


The least squares estimators are no longer “efficient” (i.e. they don't have the lowest variance).

🡺 More seriously, autocorrelation may be a symptom of model misspecification.


How can you detect the problem?
Plot the residuals against time or against their own lagged values, calculate the Durbin-Watson statistic, or use some other test of autocorrelation such as the Breusch-Godfrey test.

How can you remedy the problem?


Consider possible re-specification of the model: a different functional form, missing variables, lags, etc.

🡺If all else fails you could correct for autocorrelation by using the Cochrane-Orcutt procedure or Autoregressive Least
Squares

The Sources of Autocorrelation


Each of the following types of mis-specification can result in autocorrelated disturbances:
● Omitted (autocorrelated) explanatory variables.
● Incorrect functional form.
● Inappropriate time periods.
● Incorrect dynamic structure.
● Inappropriately "filtered" data (e.g. seasonal adjustment).

How should you deal with a problem of autocorrelation?


Consider possible re-specification of the model:
● a different functional form,
● the inclusion of additional explanatory variables,
● the inclusion of lagged variables (independent and dependent)

🡺If all else fails you can correct for autocorrelation by using the Cochrane-Orcutt procedure or Autoregressive Least
Squares
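
A sketch of such a correction using statsmodels' GLSAR class, an iterative feasible-GLS estimator for AR errors in the spirit of Cochrane-Orcutt (the simulated data, the true ρ of 0.7 and the variable names are assumptions for illustration):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 120
x = np.linspace(0.0, 10.0, n)

# Simulate AR(1) errors: u_t = 0.7 * u_{t-1} + e_t
e = rng.normal(0, 1, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + e[t]

y = 1.0 + 0.5 * x + u
X = sm.add_constant(x)

# Autoregressive least squares: GLSAR with one AR lag in the errors, fitted
# iteratively (estimate rho from the residuals, transform, re-estimate, repeat)
model = sm.GLSAR(y, X, rho=1)
results = model.iterative_fit(maxiter=10)
print("estimated rho:", model.rho)
print("coefficients :", results.params)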

First-Order Autocorrelation
In its simplest form the errors or disturbances in one period are related to those of the previous period by a simple first-order autoregressive process:

𝑢𝑡 = ρ𝑢𝑡−1 + ε𝑡,   −1 < ρ < 1   [ε𝑡 is white noise]


● If ρ > 0 we have positive autocorrelation, with each error arising as a proportion of last period's error plus a random shock (or innovation).
● If ρ < 0 we have negative autocorrelation.
We say that 𝑢 is 𝐴𝑅(1).

Detecting Autocorrelation
Graphical methods
● Time series plot of residuals
● Plot of residuals against lagged residuals
● The residual correlogram

Test statistics
● The Durbin-Watson test: This test is only appropriate against 𝐴𝑅(1) errors i.e.
it tests the hypothesis of ρ = 0 in the process
𝑢𝑡 = ρ𝑢𝑡−1 + ε𝑡
The Durbin-Watson Test
Consider the 𝐴𝑅(1):
𝑢𝑡 = ρ𝑢𝑡−1 + ε𝑡

We want to test,

𝐻0: ρ = 0

𝐻1: ρ > 0 (or ρ < 0 if you are testing for negative autocorrelation)

The test statistic is calculated by:

d = Σ (eₜ − eₜ₋₁)² / Σ eₜ²

where the numerator sum runs from t = 2 to T and the denominator sum runs from t = 1 to T.

D (or DW) can take values between 0 and 4 but would be expected to take a value of 2 under 𝐻0: ρ = 0
It can be shown that d ≈ 2(1 − ρ̂).

So with zero autocorrelation (ρ = 0) we expect 𝑑 = 2


With positive autocorrelation (0 < ρ < 1) we expect 0 < 𝑑 < 2
And with negative autocorrelation (− 1 < ρ < 0) we expect 2 < 𝑑 < 4

⇨ Thus the test of ρ = 0 is a test of d = 2.

Warning: You cannot use this test when there is a lagged dependent variable on the RHS.

Unfortunately the exact distribution of d cannot be computed. However, Durbin and Watson established upper (d_U) and lower (d_L) limits for the distribution.

These tables assume that the u's are normal, homoscedastic and non-autocorrelated and that the regressors are all exogenous. N indicates the sample size and k (excluding the intercept) is the number of regressors.
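
A sketch that computes d directly from the formula above and also runs the Breusch-Godfrey test mentioned earlier; the AR(1) data-generating process is re-created here so the snippet stands alone (the true ρ of 0.7 is an illustrative assumption):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(2)
n = 120
x = np.linspace(0.0, 10.0, n)
e = rng.normal(0, 1, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + e[t]         # positively autocorrelated errors
y = 1.0 + 0.5 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
resid = res.resid

# Durbin-Watson: d = sum((e_t - e_{t-1})^2) / sum(e_t^2)
d = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
print("Durbin-Watson d:", d)             # expect d well below 2 here

# Breusch-Godfrey LM test, valid for higher-order autocorrelation and when
# lagged dependent variables appear on the right-hand side
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(res, nlags=1)
print("Breusch-Godfrey LM statistic:", lm_stat, "p-value:", lm_pvalue)

The Durbin-Watson statistic is also reported by statsmodels in the standard OLS summary output.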
III. Multicollinearity
What does it mean? A high degree of correlation amongst the explanatory variables
What are its consequences? It may be difficult to separate out the effects of the individual regressors. Standard
errors may be overestimated and t-values depressed.

Note: a symptom may be a high R² but low t-values


How can you detect the problem? Examine the correlation matrix of regressors - also carry out auxiliary
regressions amongst the regressors.

Look at the Variance Inflation Factors

Note:
Be careful not to apply t-tests mechanically without checking for multicollinearity
Multicollinearity is a data problem, not a misspecification problem.

Variance Inflation Factor (VIF)


Multicollinearity inflates the variance of an estimator.

Calculate k different VIFs, one for each Xⱼ, by first running an ordinary least squares regression that has Xⱼ as a function of all the other explanatory variables in the first equation.

VIFⱼ = 1 / (1 − Rⱼ²)

where Rⱼ² measures the R² from a regression of Xⱼ on the other X variable(s).

⇨ serious multicollinearity problem if 𝑉𝐼𝐹𝑗 > 10
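
A sketch computing the VIFs via the auxiliary regressions described above, on simulated data with two deliberately collinear regressors (the variable names and the degree of collinearity are illustrative assumptions):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(0, 1, n)
x2 = 0.95 * x1 + rng.normal(0, 0.2, n)   # highly correlated with x1
x3 = rng.normal(0, 1, n)
X = np.column_stack([x1, x2, x3])

print(np.corrcoef(X, rowvar=False))      # correlation matrix of the regressors

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the others
for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    r2_j = sm.OLS(X[:, j], sm.add_constant(others)).fit().rsquared
    print(f"VIF for X{j + 1}: {1.0 / (1.0 - r2_j):.2f}")

# VIFs above about 10 signal a serious multicollinearity problem.

statsmodels provides the same computation as variance_inflation_factor in statsmodels.stats.outliers_influence.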
