
4 Heteroscedasticity


Heteroscedasticity

• Meaning
• Consequences
• Detection
• Remedial Measures

Meaning

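The CLRM: $Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i$

The assumptions related to the error term:

$$E(\varepsilon_i) = 0$$

$$\operatorname{Var}(\varepsilon_i) = E(\varepsilon_i^2) = \sigma^2$$

$$\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = E(\varepsilon_i \varepsilon_j) = 0, \quad i \neq j$$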

What happens when the first assumption is violated?

Now, what about the second one?

When the assumption of homoscedasticity is violated, we have heteroscedasticity. In such a situation,

$$\operatorname{Var}(\varepsilon_i) = \sigma_i^2$$

In other words, the variance of the error term is no longer a constant value and
changes with 'i' (and, in general, with X)
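
To make the contrast concrete, here is a minimal simulation sketch. The sample size, the range of X, and the choice of an error standard deviation proportional to X are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the sketch is repeatable
n = 2000
x = rng.uniform(1.0, 10.0, size=n)      # hypothetical regressor values

# Homoscedastic errors: the same standard deviation for every observation.
eps_homo = rng.normal(0.0, 2.0, size=n)

# Heteroscedastic errors: standard deviation proportional to x,
# so Var(eps_i) = (0.5 * x_i)^2 changes with i.
eps_hetero = rng.normal(0.0, 1.0, size=n) * 0.5 * x

def variance_by_half(x, eps):
    """Sample variance of eps for the lower and upper half of the x values."""
    order = np.argsort(x)
    half = len(x) // 2
    low, high = eps[order[:half]], eps[order[half:]]
    return low.var(ddof=1), high.var(ddof=1)

low_var, high_var = variance_by_half(x, eps_hetero)
print("homoscedastic   (low X, high X):", variance_by_half(x, eps_homo))
print("heteroscedastic (low X, high X):", (low_var, high_var))
```

In the homoscedastic case the two sample variances should be close to each other; in the heteroscedastic case the high-X half should show a clearly larger variance.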

What do you think could be potential reasons that lead to heteroscedasticity?

The reasons vary with the problem at hand:

• Advancements/improvements in data collection methodology
• Error-learning models
• Outliers in data (and skewed distributions)
• Specification error (mainly omission of relevant variables and incorrect functional form)

Heteroscedasticity and cross-sectional (vis-a-vis time-series) data

Consequences

OLS in presence of heteroscedasticity

OLS estimators are BLUE if the assumptions are satisfied. Are they still BLUE under heteroscedasticity?

• OLS estimators are still linear and unbiased
• But they are no longer best, i.e. efficient (contrast this with multicollinearity)

Variance in presence of heteroscedasticity

Variance in the homoscedastic case
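
For the two-variable model above, the standard textbook expressions for the variance of the OLS slope estimator in the two cases are as follows. The deviation notation $x_i = X_i - \bar{X}$ is an assumption introduced here.

Under heteroscedasticity:

$$\operatorname{Var}(\hat{\beta}_2) = \frac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$$

Under homoscedasticity:

$$\operatorname{Var}(\hat{\beta}_2) = \frac{\sigma^2}{\sum x_i^2}$$

The first expression reduces to the second when $\sigma_i^2 = \sigma^2$ for all i.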

The conventional OLS formulas give a biased estimate of the variance of the parameter estimates. Conventionally computed confidence intervals and the t and F tests are therefore unreliable and give misleading inferences.

Peter Kennedy notes that the conventionally estimated variances of the OLS parameter estimates are biased even asymptotically.

There is another linear unbiased estimator with a lower variance than OLS. This estimator is known as Generalized Least Squares (GLS). In cases of heteroscedasticity, it is often GLS and not OLS that is BLUE.

Detection

The idea behind most of the tests is to examine the relationship between the
square of the error term (in fact, residual; why?) and X or estimated Y.

Nature of the problem: Often, the nature of the problem at hand gives a clue regarding the presence of heteroscedasticity. Examples include (i) family-budget studies or the consumption-income relationship, and (ii) analysis at the firm level.

Graphical methods

Plot the squared residual against estimated Y or against X

Formal Methods

1. Park Test: Assumption is that the variance of the error term is some function of
X.

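Park suggested the following form:

$$\sigma_i^2 = \sigma^2 X_i^{\beta} e^{v_i}, \quad \text{or} \quad \ln \sigma_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i$$

The LHS is not known. Use the squared residual as its proxy and run the following regression:

$$\ln \hat{\varepsilon}_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i = \alpha + \beta \ln X_i + v_i$$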

Now, test for the significance of β. If β turns out to be statistically significant, heteroscedasticity is present in the data.

The Park test is a two-stage procedure.

• First stage: run the OLS regression disregarding the heteroscedasticity. Obtain the residuals.
• Second stage: regress ln(squared residuals) on ln(X).
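
The two stages can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions: the helper names `ols` and `park_test` are hypothetical, the data are simulated with an error standard deviation proportional to X, and X must be strictly positive so that ln(X) exists.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and their conventional standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se

def park_test(x, y):
    """Two-stage Park test; returns (slope on ln X, its t-ratio)."""
    # First stage: OLS of y on a constant and x, ignoring heteroscedasticity.
    X1 = np.column_stack([np.ones_like(x), x])
    beta1, _ = ols(X1, y)
    resid = y - X1 @ beta1
    # Second stage: ln(squared residuals) on a constant and ln(x).
    X2 = np.column_stack([np.ones_like(x), np.log(x)])
    beta2, se2 = ols(X2, np.log(resid ** 2))
    return beta2[1], beta2[1] / se2[1]

# Simulated example (assumed design): error s.d. proportional to x.
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 10.0, size=1000)
y = 1.0 + 2.0 * x + rng.normal(size=1000) * x
beta_hat, t_stat = park_test(x, y)
print(f"Park slope = {beta_hat:.2f}, t-ratio = {t_stat:.2f}")
```

A large t-ratio on the ln(X) coefficient points to heteroscedasticity. With an error variance proportional to the square of X, the estimated slope should be in the neighbourhood of 2.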

2. Glejser Test: Similar to the Park Test. In the second stage, regress the absolute
values of residuals on X or its variants
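
A minimal sketch of the Glejser procedure, using X itself in the second stage (one of several possible variants). The function names and the simulated data are assumptions for illustration.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and their conventional standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se

def glejser_test(x, y):
    """Regress |residual| on a constant and x; returns (slope, t-ratio)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, _ = ols(X, y)                      # first stage
    abs_resid = np.abs(y - X @ beta)
    gamma, se = ols(X, abs_resid)            # second stage
    return gamma[1], gamma[1] / se[1]

# Simulated example (assumed design): error s.d. proportional to x.
rng = np.random.default_rng(2)
x = rng.uniform(1.0, 10.0, size=1000)
y = 1.0 + 2.0 * x + rng.normal(size=1000) * x
slope, t_ratio = glejser_test(x, y)
print(f"Glejser slope = {slope:.3f}, t-ratio = {t_ratio:.2f}")
```

Other variants replace x in the second stage with, for example, its square root or its reciprocal.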

Criticism of Park’s and Glejser’s tests by Goldfeld-Quandt

3. Goldfeld-Quandt Test

Assume that the error variance is positively related to X, for example

$$\sigma_i^2 = \sigma^2 X_i^2$$

Steps:

S1: Rank the observations according to the values of X, beginning with the lowest
one.

S2: Omit c central observations, divide the remaining (n-c) observations into two
groups each of (n-c)/2 observations.

S3: Fit separate OLS regressions to the first and the last (n-c)/2 observations. Obtain the residual sums of squares RSS1 and RSS2 respectively. Each RSS has ((n-c)/2) - k df, where k is the number of parameters estimated, including the intercept.

S4: Compute the ratio

$$\lambda = \frac{RSS_2 / df}{RSS_1 / df}$$

If the error terms are assumed to be normally distributed and the assumption of homoscedasticity is valid, then it can be shown that λ follows the F distribution with numerator and denominator df each equal to ((n-c)/2) - k.

Decision rule: if computed λ > critical F, reject the Null of homoscedasticity.

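Problem: the c central observations are omitted to sharpen the distinction between the small-variance group (RSS1) and the large-variance group (RSS2), so the success of the test depends on how c is chosen. On the basis of Monte Carlo experiments, Goldfeld and Quandt suggested c = 8 for n = 30 and c = 16 for n = 60. Others (Judge et al.) note that in practice c = 4 with n = 30 and c = 10 with n = 60 are satisfactory.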

If there is more than one X variable, the ranking can be done according to any one of them.
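
Steps S1 to S4 can be sketched as follows for the two-variable model (k = 2). The function names and the simulated data are assumptions; the computed λ is to be compared with a critical F value taken from tables.

```python
import numpy as np

def rss(x, y):
    """Residual sum of squares from an OLS fit of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def goldfeld_quandt(x, y, c, k=2):
    """Return (lambda, df) for the Goldfeld-Quandt test."""
    order = np.argsort(x)                 # S1: rank by X, lowest first
    x, y = x[order], y[order]
    m = (len(x) - c) // 2                 # S2: size of each outer group
    rss1 = rss(x[:m], y[:m])              # S3: low-X group
    rss2 = rss(x[-m:], y[-m:])            #     high-X group
    df = m - k
    return (rss2 / df) / (rss1 / df), df  # S4: the ratio lambda

# Simulated example (assumed design): n = 60, c = 10, error s.d. proportional to x.
rng = np.random.default_rng(4)
x = rng.uniform(1.0, 10.0, size=60)
y = 1.0 + 2.0 * x + rng.normal(size=60) * x
lam, df = goldfeld_quandt(x, y, c=10)
print(f"lambda = {lam:.2f} with ({df}, {df}) df")
```

If n - c is odd, the integer division drops one extra central observation so that the two groups stay the same size.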

4. White’s General Heteroscedasticity Test

Consider the following three-variable model:

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$$

Steps:

S1: Estimate the residuals from the above equation

S2: Run the following (auxiliary) regression:

$$\hat{\varepsilon}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + \nu_i$$

The squared residuals from the original regression are regressed on the original X variables or regressors, their squared values, and the cross products of the regressors. Obtain the $R^2$ from this (auxiliary) regression.

S3: Under the null hypothesis that there is no heteroscedasticity (viz., all the α coefficients other than the constant are zero), the sample size (n) times the $R^2$ obtained from the auxiliary regression asymptotically follows the chi-square distribution with df equal to the number of regressors (excluding the constant term) in the auxiliary regression [df = 5 in the above example]. That is,

$$n \cdot R^2 \sim \chi^2_{df}$$

Decision rule: If the LHS exceeds the critical chi-square value, we can reject the
null and conclude that there is heteroscedasticity

Note: White’s test does not rely on the normality assumption
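
Steps S1 to S3 can be sketched as below. The function name `white_test` and the simulated data are assumptions; the sketch also assumes continuous regressors, since a dummy variable would make its own square a duplicate column in the auxiliary regression.

```python
import numpy as np

def white_test(X, y):
    """White's general test. X holds the regressors without the constant.
    Returns (n * R^2, df) from the auxiliary regression."""
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e2 = (y - Z @ beta) ** 2                      # S1: squared residuals
    # S2: original regressors, their squares, and their cross products
    cols = [X[:, j] for j in range(p)]
    cols += [X[:, j] ** 2 for j in range(p)]
    cols += [X[:, i] * X[:, j] for i in range(p) for j in range(i + 1, p)]
    A = np.column_stack([np.ones(n)] + cols)
    alpha, *_ = np.linalg.lstsq(A, e2, rcond=None)
    fitted = A @ alpha
    r2 = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return n * r2, A.shape[1] - 1                 # S3: statistic and df

# Simulated example (assumed design): error s.d. proportional to the first regressor.
rng = np.random.default_rng(3)
n = 500
x2 = rng.uniform(1.0, 10.0, size=n)
x3 = rng.uniform(1.0, 10.0, size=n)
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(size=n) * x2
stat, df = white_test(np.column_stack([x2, x3]), y)
print(f"n * R^2 = {stat:.2f} with {df} df")
```

With two regressors the auxiliary regression has 5 regressors, so the statistic is compared with the chi-square critical value for 5 df (11.07 at the 5% level).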

Remedial Measures

Introduction to GLS

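Consider the following:

$$Y_i = \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$$

where $X_{1i} \equiv 1$ and $\operatorname{Var}(\varepsilon_i) = \sigma_i^2$.

Now divide both sides by $\sigma_i$ and run the following transformed regression:

$$Y_i^* = \beta_1^* X_{1i}^* + \beta_2^* X_{2i}^* + \varepsilon_i^*$$

where each starred variable is the original variable divided by $\sigma_i$, e.g. $Y_i^* = Y_i / \sigma_i$ and $\varepsilon_i^* = \varepsilon_i / \sigma_i$.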

What will the variance of the starred error term be?

Will the parameter estimates from the regression on transformed variables be BLUE?

This procedure of transforming the original variables in such a way that the
transformed variables satisfy the assumptions of the classical model and then
applying OLS to them is known as the method of generalized least squares. The
estimators thus obtained are known as GLS estimators.

In GLS we minimize a weighted sum of squared residuals, whereas in OLS we minimize an unweighted (equally weighted) RSS.
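
One way to work through the two questions about the starred error term and BLUE, assuming $\sigma_i$ is known, is the following short derivation:

$$\operatorname{Var}(\varepsilon_i^*) = \operatorname{Var}\left(\frac{\varepsilon_i}{\sigma_i}\right) = \frac{1}{\sigma_i^2} \operatorname{Var}(\varepsilon_i) = \frac{\sigma_i^2}{\sigma_i^2} = 1$$

The transformed error term is therefore homoscedastic, and OLS applied to the transformed variables is BLUE. The weighting can be seen by writing out the two objective functions, with $w_i = 1/\sigma_i^2$ (the symbol $w_i$ is introduced here for illustration):

$$\text{GLS: } \min \sum w_i \left(Y_i - \beta_1^* X_{1i} - \beta_2^* X_{2i}\right)^2 \qquad \text{OLS: } \min \sum \left(Y_i - \beta_1 X_{1i} - \beta_2 X_{2i}\right)^2$$

Observations with a larger error variance receive a smaller weight in GLS, whereas OLS weights every observation equally.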

Coming back to remedies,

• If the error variance is known, one can go ahead with the GLS discussed above.

But in practice this is hardly ever the case. What can be done then?

• White’s Heteroscedasticity-Consistent Variances and Standard Errors (robust):

It is possible to get asymptotically valid statistical inferences about the parameters using White’s method. The standard errors so obtained are widely used in research and are often known as robust standard errors. Note, however, that such robustification may not work well in small samples.
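
A minimal sketch of how such standard errors can be computed, using the basic form of White's estimator (commonly labelled HC0), which replaces the unknown error variances with squared OLS residuals. The function name and the simulated data are assumptions for illustration.

```python
import numpy as np

def ols_with_robust_se(X, y):
    """OLS coefficients with conventional and White (HC0) standard errors.
    X must already include a column of ones for the intercept."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Conventional: s^2 (X'X)^-1, valid only under homoscedasticity.
    se_conv = np.sqrt(resid @ resid / (n - k) * np.diag(XtX_inv))
    # White: (X'X)^-1 X' diag(e_i^2) X (X'X)^-1
    meat = X.T @ (X * resid[:, None] ** 2)
    se_white = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_conv, se_white

# Simulated example (assumed design): error s.d. proportional to x.
rng = np.random.default_rng(5)
x = rng.uniform(1.0, 10.0, size=500)
y = 1.0 + 2.0 * x + rng.normal(size=500) * x
X = np.column_stack([np.ones_like(x), x])
beta, se_conv, se_white = ols_with_robust_se(X, y)
print("coefficients   :", beta)
print("conventional se:", se_conv)
print("robust se      :", se_white)
```

The coefficient estimates are the usual OLS estimates; only the standard errors change.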

Another option is to use a transformed regression.

Transformation: an assumption regarding the heteroscedasticity pattern is required.

Case 1: error variance is proportional to $X^2$
use WLS with weight = 1/X (divide the equation through by X)

Case 2: error variance is proportional to X
use WLS with weight = 1/sqrt(X) (divide the equation through by sqrt(X); requires X > 0)

• Log transformation: often reduces the heteroscedasticity
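
Both cases can be sketched with one small helper that multiplies every term of the equation by the chosen factor and then applies OLS. The function name and the simulated data are assumptions. Note that "weight" here means the factor applied to each observation; in the weighted sum of squares the implied weights are the squares of these factors, i.e. 1/X² in Case 1 and 1/X in Case 2.

```python
import numpy as np

def wls_by_transformation(x, y, factor):
    """Multiply every term of Y = b1 + b2*X + e by `factor`, then run OLS.
    The transformed regressors are factor*1 and factor*X (no extra intercept)."""
    Xs = np.column_stack([factor, factor * x])
    ys = factor * y
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return beta                      # estimates of (b1, b2) in the original equation

rng = np.random.default_rng(6)
x = rng.uniform(1.0, 10.0, size=500)

# Case 1 (assumed design): error variance proportional to X^2, factor = 1/X.
y1 = 1.0 + 2.0 * x + rng.normal(size=500) * x
b_case1 = wls_by_transformation(x, y1, 1.0 / x)

# Case 2 (assumed design): error variance proportional to X, factor = 1/sqrt(X).
y2 = 1.0 + 2.0 * x + rng.normal(size=500) * np.sqrt(x)
b_case2 = wls_by_transformation(x, y2, 1.0 / np.sqrt(x))

print("Case 1 estimates:", b_case1)
print("Case 2 estimates:", b_case2)
```

In Case 1 the roles of the two transformed regressors are swapped: the coefficient on 1/X is the original intercept and the constant term of the transformed equation is the original slope. The helper returns them in the original order.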
