4 Heteroscedasticity

Heteroscedasticity
• Meaning
• Consequences
• Detection
• Remedial Measures
Meaning
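The CLRM: $Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i$
The assumptions related to the error term
$E(\varepsilon_i) = 0$
$\mathrm{Var}(\varepsilon_i) = E(\varepsilon_i^2) = \sigma^2$
$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = E(\varepsilon_i \varepsilon_j) = 0$, $i \neq j$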
What happens when the first assumption is violated?
Now, what about the second one?
When the assumption of homoscedasticity is violated, we have heteroscedasticity.
In such a situation,
$\mathrm{Var}(\varepsilon_i) = \sigma_i^2$
In other words, the variance of the error term is no longer a constant value and
changes with 'i' (and, in general, with X)
What do you think could be potential reasons that lead to heteroscedasticity?
The reasons vary with the problem in hand
• Advancements/ improvements in data collection methodology

• Error-learning models
• Outliers in data (and skewed distribution)
• Specification error (mainly omission of relevant variables, and incorrect
functional form)
Heteroscedasticity and cross-sectional (vis-a-vis time-series) data
Consequences
OLS in presence of heteroscedasticity
The OLS estimators are BLUE if the assumptions are satisfied. Are they still BLUE?
• The OLS estimators are still linear and unbiased
• But they are no longer best, i.e. efficient (contrast this with multicollinearity)
Variance in presence of heteroscedasticity
Variance in the homoscedastic case
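For the two-variable model, the standard textbook expressions for the variance of the OLS slope estimator are sketched below. The deviation notation $x_i = X_i - \bar{X}$ is introduced here for convenience and is an assumption about what the two headings above refer to.

```latex
\operatorname{Var}(\hat{\beta}_2) = \frac{\sum x_i^2 \, \sigma_i^2}{\left(\sum x_i^2\right)^2}
\qquad \text{(heteroscedastic case)}

\operatorname{Var}(\hat{\beta}_2) = \frac{\sigma^2}{\sum x_i^2}
\qquad \text{(homoscedastic case)}
```

Setting $\sigma_i^2 = \sigma^2$ for every i in the first expression gives the second.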
The conventional OLS formula gives a biased estimate of the variance of the
parameter estimates. Conventionally computed confidence intervals and the t and F
tests are therefore unreliable and can give misleading inferences.
Peter Kennedy notes that the conventionally estimated variances of the OLS
parameter estimates are biased even asymptotically.
There is another estimator with a lower variance than OLS, known as Generalized
Least Squares (GLS). In cases of heteroscedasticity, it is often GLS and not OLS
that is BLUE.
Detection
The idea behind most of the tests is to examine the relationship between the
square of the error term (in fact, residual; why?) and X or estimated Y.
Nature of the problem: Often, the nature of the problem at hand gives a clue
regarding presence of heteroscedasticity. For instance, (i) family-budget studies
or consumption-income relationship, (ii) analysis at the firm level.
Graphical methods
Plot the squared residual against estimated Y or against X
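A rough Python sketch of the quantities behind such a plot, using only the standard library. The data are made up for illustration (the error is set to plus or minus 0.5 times X so that its spread grows with X), and the helper name `ols_simple` is hypothetical. A text bar stands in for the plotted point.

```python
def ols_simple(x, y):
    """Two-variable OLS: returns (intercept, slope)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Made-up data: the error is +/- 0.5 * X, so its spread grows with X.
x = [float(i) for i in range(1, 11)]
y = [2.0 + 3.0 * xi + (0.5 * xi if i % 2 == 0 else -0.5 * xi)
     for i, xi in enumerate(x)]

b1, b2 = ols_simple(x, y)
fitted = [b1 + b2 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]
sq_resid = [e * e for e in resid]

# One row per observation; the bar length is proportional to the squared residual.
for xi, fi, s in zip(x, fitted, sq_resid):
    print(f"X={xi:5.1f}  Yhat={fi:6.2f}  e^2={s:7.3f}  " + "#" * int(round(s * 2)))
```

If the bars show a systematic pattern as X (or the fitted Y) increases, that is an informal sign of heteroscedasticity. With a plotting library, the same `x` and `sq_resid` lists would simply be passed to a scatter plot.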
Formal Methods
1. Park Test: Assumption is that the variance of the error term is some function of
X.
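Park suggested the following form
$\sigma_i^2 = \sigma^2 X_i^{\beta} e^{\nu_i}$, or equivalently $\ln \sigma_i^2 = \ln \sigma^2 + \beta \ln X_i + \nu_i$
The LHS is not known. Use the squared residual as its proxy and run the following
regression
$\ln \hat{\varepsilon}_i^2 = \ln \sigma^2 + \beta \ln X_i + \nu_i$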
Now, test for the significance of β. If β turns out to be statistically significant,
heteroscedasticity is present in the data.
The Park test is a two-stage procedure.
• First stage: run the OLS regression disregarding the heteroscedasticity, and
obtain the residuals.
• Second stage: regress the ln(squared residuals) on ln(X)
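The two stages can be sketched in Python roughly as follows. This is an illustration on simulated data, not a definitive implementation: the data-generating process (error standard deviation proportional to X), the seed, and the helper name `ols_simple` are all assumptions made for the example.

```python
import math
import random

def ols_simple(x, y):
    """Two-variable OLS. Returns (intercept, slope, t-statistic of the slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope / se_slope

random.seed(0)  # simulated data, for illustration only
x = [random.uniform(1.0, 10.0) for _ in range(200)]
# Error standard deviation proportional to X, so Var(eps_i) = X_i^2
y = [2.0 + 3.0 * xi + xi * random.gauss(0.0, 1.0) for xi in x]

# First stage: OLS disregarding heteroscedasticity; keep the residuals
b1, b2, _ = ols_simple(x, y)
resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]

# Second stage: regress ln(squared residual) on ln(X) and inspect the slope
ln_e2 = [math.log(e * e) for e in resid]
ln_x = [math.log(xi) for xi in x]
_, park_slope, park_t = ols_simple(ln_x, ln_e2)
print(f"Park slope = {park_slope:.3f}, t = {park_t:.2f}")
```

A large t-statistic on the slope points to heteroscedasticity. In this simulation the error variance is proportional to the square of X, so a slope in the neighbourhood of 2 would be expected.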
2. Glejser Test: Similar to the Park Test. In the second stage, regress the absolute
values of residuals on X or its variants
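A comparable sketch for the Glejser second stage, again on simulated data. The seed, the data-generating process, the choice of sqrt(X) as the "variant", and the names `ols_simple` and `results` are assumptions for illustration.

```python
import math
import random

def ols_simple(x, y):
    """Two-variable OLS. Returns (intercept, slope, t-statistic of the slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope / se_slope

random.seed(2)  # simulated data, for illustration only
x = [random.uniform(1.0, 10.0) for _ in range(200)]
y = [2.0 + 3.0 * xi + xi * random.gauss(0.0, 1.0) for xi in x]

# First stage: ordinary OLS, then take absolute residuals
b1, b2, _ = ols_simple(x, y)
abs_resid = [abs(yi - b1 - b2 * xi) for xi, yi in zip(x, y)]

# Second stage: regress |residual| on X and on one variant, sqrt(X)
results = {}
for label, z in (("X", x), ("sqrt(X)", [math.sqrt(xi) for xi in x])):
    _, slope, t_stat = ols_simple(z, abs_resid)
    results[label] = (slope, t_stat)
    print(f"|e| on {label}: slope = {slope:.3f}, t = {t_stat:.2f}")
```

As with the Park test, a statistically significant slope in the second stage suggests heteroscedasticity.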
Criticism of Park’s and Glejser’s tests by Goldfeld-Quandt
3. Goldfeld-Quandt Test
$\sigma_i^2 = \sigma^2 X_i^2$
Steps:
S1: Rank the observations according to the values of X, beginning with the lowest
one.
S2: Omit c central observations, divide the remaining (n-c) observations into two
groups each of (n-c)/2 observations.
S3: Fit separate OLS regressions to the first and the last (n-c)/2 observations.
Obtain the residual sums of squares RSS1 and RSS2 respectively. Each RSS has
((n-c)/2) - k df, where k is the number of parameters estimated (including the
intercept).
S4: Compute the ratio,
$\lambda = \dfrac{RSS_2 / df}{RSS_1 / df}$
If the error terms are assumed to be normally distributed and the assumption of
homoscedasticity is valid, then it can be shown that λ follows the F distribution
with ((n-c)/2 - k) df in both the numerator and the denominator.
Decision rule: if computed λ > critical F, reject the Null of homoscedasticity.
Problem: the distinction between the small-variance group (RSS1) and the
large-variance group (RSS2) must be made beforehand.
On the basis of Monte Carlo experiments, Goldfeld and Quandt suggested c=8 for n=30
and c=16 for n=60. Others (Judge et al.) note that in practice c=4 with n=30 and
c=10 with n=60 are satisfactory.
If there is more than one X variable, the ranking can be done according to any one
of them.
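Steps S1 to S4 can be sketched in Python as below, for the two-variable case. The function names `rss_simple` and `goldfeld_quandt`, the simulated data, and the seed are assumptions for illustration. The sketch stops at λ and its df; the critical F value still has to be looked up in tables or computed with a statistics library.

```python
import random

def rss_simple(x, y):
    """Residual sum of squares from a two-variable OLS fit."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def goldfeld_quandt(x, y, c, k=2):
    """Return (lambda, df) following S1-S4; k = parameters per regression."""
    pairs = sorted(zip(x, y))             # S1: rank by X, lowest first
    m = (len(pairs) - c) // 2             # S2: size of each outer group
    low, high = pairs[:m], pairs[-m:]     #     (the central observations are dropped)
    rss1 = rss_simple(*zip(*low))         # S3: separate regressions
    rss2 = rss_simple(*zip(*high))
    df = m - k
    return (rss2 / df) / (rss1 / df), df  # S4: the ratio lambda

random.seed(4)  # simulated data, for illustration only
x = [random.uniform(1.0, 10.0) for _ in range(60)]
y = [2.0 + 3.0 * xi + xi * random.gauss(0.0, 1.0) for xi in x]

lam, df = goldfeld_quandt(x, y, c=10)     # c=10 with n=60, as noted above
print(f"lambda = {lam:.2f} with ({df}, {df}) df")
```

If n - c is odd, the integer division drops one extra central observation so that the two groups stay equal in size. That is a design choice of this sketch, not something the notes specify.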
4. White’s General Heteroscedasticity Test
Consider the following three variable model:
$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$
Steps:
S1: Estimate the above equation by OLS and obtain the residuals
S2: Run the following (auxiliary) regression:
$\hat{\varepsilon}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + \nu_i$
The squared residuals from the original regression are regressed on the original X
variables or regressors, their squared values, and the cross products of the
regressors. Obtain the R2 from this (auxiliary) regression.
S3: Under the null hypothesis that there is no heteroscedasticity (viz., all the
slope αs in the auxiliary regression are zero), the sample size (n) times the R2
obtained from the auxiliary regression asymptotically follows the chi-square
distribution with df equal to the number of regressors (excluding the constant
term) in the auxiliary regression [df = 5 in the above example]. That is,
$n \cdot R^2 \sim \chi^2_{df}$
Decision rule: If the LHS exceeds the critical chi-square value, we can reject the
null and conclude that there is heteroscedasticity
Note: White’s test does not rely on the normality assumption
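A standard-library Python sketch of steps S1 to S3 for the three-variable model follows. The helper names `solve` and `ols`, the simulated data, and the seed are assumptions for illustration. The 5% critical value for 5 df (about 11.07) is hard-coded from chi-square tables rather than computed.

```python
import random

def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= f * m[col][j]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def ols(rows, y):
    """OLS via the normal equations; rows[i] lists the regressors (constant
    included) for observation i. Returns (coefficients, residuals, R^2)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [yi - sum(bj * rj for bj, rj in zip(beta, r)) for r, yi in zip(rows, y)]
    ybar = sum(y) / len(y)
    tss = sum((yi - ybar) ** 2 for yi in y)
    return beta, resid, 1.0 - sum(e * e for e in resid) / tss

random.seed(5)  # simulated data, for illustration only
n = 500
x2 = [random.uniform(1.0, 10.0) for _ in range(n)]
x3 = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [1.0 + 2.0 * a + 0.5 * b + a * random.gauss(0.0, 1.0) for a, b in zip(x2, x3)]

# S1: original regression, keep the residuals
_, resid, _ = ols([[1.0, a, b] for a, b in zip(x2, x3)], y)
e2 = [e * e for e in resid]

# S2: auxiliary regression on levels, squares and the cross product
aux_rows = [[1.0, a, b, a * a, b * b, a * b] for a, b in zip(x2, x3)]
_, _, r2 = ols(aux_rows, e2)

# S3: n * R^2 compared with the chi-square critical value (5 df, 5% level)
stat = n * r2
CHI2_5PCT_DF5 = 11.07
print(f"n*R^2 = {stat:.2f}, reject homoscedasticity: {stat > CHI2_5PCT_DF5}")
```

Solving the normal equations directly keeps the sketch dependency-free. For real work a numerically more careful least-squares routine would be the safer choice.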
Remedial Measures
Introduction to GLS
Consider the following
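$Y_i = \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
where $X_{1i} \equiv 1$ and $\mathrm{Var}(\varepsilon_i) = \sigma_i^2$
Now divide both sides by $\sigma_i$ and run the following transformed regression
$Y_i^* = \beta_1^* X_{1i}^* + \beta_2^* X_{2i}^* + \varepsilon_i^*$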
What will the variance of the starred error term be?
Will the parameter estimates from the regression on transformed variables be BLUE?
This procedure of transforming the original variables in such a way that the
transformed variables satisfy the assumptions of the classical model and then
applying OLS to them is known as the method of generalized least squares. The
estimators thus obtained are known as GLS estimators.
GLS minimizes a weighted sum of squared residuals, whereas OLS minimizes an
unweighted (equally weighted) RSS.
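The questions about the starred error term and the weighted criterion can be checked with a short derivation, sketched here under the assumption that $\sigma_i$ is known. The weight symbol $w_i$ is introduced here and does not appear in the notes.

```latex
\operatorname{Var}(\varepsilon_i^*)
  = \operatorname{Var}\!\left(\frac{\varepsilon_i}{\sigma_i}\right)
  = \frac{1}{\sigma_i^2}\operatorname{Var}(\varepsilon_i)
  = \frac{\sigma_i^2}{\sigma_i^2} = 1

\sum_i \left(Y_i^* - \hat{\beta}_1^* X_{1i}^* - \hat{\beta}_2^* X_{2i}^*\right)^2
  = \sum_i w_i \left(Y_i - \hat{\beta}_1^* X_{1i} - \hat{\beta}_2^* X_{2i}\right)^2,
  \qquad w_i = \frac{1}{\sigma_i^2}
```

The transformed error has constant variance, so OLS applied to the transformed variables satisfies the classical assumptions. Observations with a larger $\sigma_i^2$ receive a smaller weight.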
Coming back to remedies,
• If the error variance is known, one can go ahead with the GLS discussed above.
But in practice this is rarely the case. What can be done then?
• White’s Heteroscedasticity-Consistent Variances and Standard Errors (robust):
It is possible to get asymptotically valid statistical inferences about the
parameters using White’s method. The standard errors so obtained are widely
used in research and are often known as robust standard errors. Note,
however, that such robustification may not work well in small samples.
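For the two-variable model, the simplest version of White's estimator (often labelled HC0) replaces each unknown $\sigma_i^2$ in the variance of the slope with the squared OLS residual. The sketch below compares it with the conventional standard error. The simulated data, the seed, and the variable names are assumptions for illustration.

```python
import math
import random

random.seed(1)  # simulated data, for illustration only
n = 200
x = [random.uniform(1.0, 10.0) for _ in range(n)]
# Error standard deviation proportional to X
y = [2.0 + 3.0 * xi + xi * random.gauss(0.0, 1.0) for xi in x]

# OLS estimates for Y = b1 + b2 * X
xbar, ybar = sum(x) / n, sum(y) / n
dx = [xi - xbar for xi in x]
sxx = sum(d * d for d in dx)
b2 = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
b1 = ybar - b2 * xbar
resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]

# Conventional standard error of the slope (assumes constant error variance)
se_ols = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

# White (HC0) standard error: each squared residual stands in for sigma_i^2
se_white = math.sqrt(sum(d * d * e * e for d, e in zip(dx, resid)) / sxx ** 2)

print(f"slope = {b2:.3f}, conventional SE = {se_ols:.4f}, White SE = {se_white:.4f}")
```

The coefficient estimates are unchanged; only the standard errors, and hence the t statistics and confidence intervals, differ between the two.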
Another option is to use transformed regression.
Transformation: assumption regarding the heteroscedasticity pattern is required

• Case 1: error variance is proportional to $X^2$:
use WLS with weight = 1/X (i.e. divide the equation through by X)
• Case 2: error variance is proportional to X:
use WLS with weight = 1/sqrt(X) (i.e. divide the equation through by sqrt(X))
• Log transformation: often reduces the heteroscedasticity
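Case 1 can be sketched in Python as follows, assuming the error variance really is proportional to the square of X. The simulated data, the seed, and the helper name `ols_simple` are illustrative assumptions. Dividing through by X turns the original slope into the intercept of the transformed regression, and the original intercept into its slope.

```python
import random

def ols_simple(x, y):
    """Two-variable OLS: returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope

random.seed(3)  # simulated data, for illustration only
n = 200
x = [random.uniform(1.0, 10.0) for _ in range(n)]
# True model: Y = 2 + 3 X + eps, with Var(eps_i) = X_i^2
y = [2.0 + 3.0 * xi + xi * random.gauss(0.0, 1.0) for xi in x]

# Multiply every term by the weight 1/X_i:
#   Y/X = beta1 * (1/X) + beta2 + eps/X,  where eps/X has constant variance
y_star = [yi / xi for xi, yi in zip(x, y)]
inv_x = [1.0 / xi for xi in x]

# Transformed intercept estimates beta2; transformed slope estimates beta1
b2_wls, b1_wls = ols_simple(inv_x, y_star)
b1_ols, b2_ols = ols_simple(x, y)

print(f"OLS: b1 = {b1_ols:.3f}, b2 = {b2_ols:.3f}")
print(f"WLS: b1 = {b1_wls:.3f}, b2 = {b2_wls:.3f}")
```

Case 2 follows the same pattern with sqrt(X) in place of X, in which case the transformed regression has no constant term and is run through the origin on 1/sqrt(X) and sqrt(X).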