Heteroskedasticity

Introduction
Consider the following questions:

1. What is the nature of heteroscedasticity?

2. What are its consequences?

3. How does one detect it?

4. What are the remedial measures?



The Nature of Heteroscedasticity

• Homoscedasticity means equal (homo) spread (scedasticity), that is, equal variance.

• The conditional variance of Yi (which is equal to that of ui), conditional upon the given Xi, remains the same for all observations:

E(ui²) = σ²,  i = 1, 2, . . . , n

Heteroscedasticity
• When the conditional variances of Yi are not the same.
• Symbolically,

E(ui²) = σi²,  i = 1, 2, . . . , n

where the subscript on σ² indicates that the variance can differ from observation to observation.

Heteroscedasticity
There are several reasons why the variances of ui may vary, some of which are as follows:

1. Following the error-learning models, as people learn, their errors of behavior become smaller over time or more consistent, so σi² is expected to decrease.

2. As incomes grow, people have more discretionary income and hence more scope for choice about the disposition of their income, so σi² is likely to rise with income.

3. As data-collecting techniques improve, σi² is likely to decrease.

Heteroscedasticity
4. Heteroscedasticity can also arise as a result of the presence of outliers.

5. It can arise when the regression model is not correctly specified.

6. Skewness in the distribution of one or more regressors included in the model can also produce it.

7. Heteroscedasticity can also arise because of (1) incorrect data transformation (e.g., ratio or first-difference transformations) and (2) incorrect functional form (e.g., linear versus log–linear models).

OLS Estimation in the Presence of Heteroscedasticity
• The two-variable model: Yi = β1 + β2Xi + ui
• Under homoscedasticity, the variance of the OLS slope estimator is

var(β̂2) = σ² / Σxi²

• Under heteroscedasticity it becomes

var(β̂2) = Σxi²σi² / (Σxi²)²

where xi = Xi − X̄.

OLS Estimation
• Is β̂2 still BLUE when we drop only the homoscedasticity assumption and replace it with the assumption of heteroscedasticity?

• β̂2 is still linear and unbiased.

• The variance of ui, homoscedastic or heteroscedastic, plays no part in the determination of the unbiasedness property.

• β̂2 is also a consistent estimator despite heteroscedasticity; that is, as the sample size increases indefinitely, the estimated β2 converges to its true value.

OLS Estimation
• Given that β̂2 is still linear, unbiased, and consistent, is it "efficient" or "best"?

• That is, does it have minimum variance in the class of unbiased estimators?

• And is that minimum variance given by the previous equation?

• The answer is no to both questions: β̂2 is no longer best, and the minimum variance is not given by the previous equation.

The Method of Generalized Least Squares (GLS)
• Two-variable model: Yi = β1 + β2Xi + ui

• We may write it as

Yi = β1X0i + β2Xi + ui

• where X0i = 1 for each i.

GLS
• Assume that the heteroscedastic variances σi² are known.
• Divide the equation through by σi:

Yi/σi = β1(X0i/σi) + β2(Xi/σi) + (ui/σi)

• The transformed disturbance ui/σi is homoscedastic: var(ui/σi) = var(ui)/σi² = 1.

GLS
• This procedure of transforming the original variables in such a way that the transformed variables satisfy the assumptions of the classical model and then applying OLS to them is known as the method of generalized least squares (GLS).

• In short, GLS is OLS on the transformed variables that satisfy the standard least-squares assumptions.

• The estimators thus obtained are known as GLS estimators, and it is these estimators that are BLUE.

GLS
• With wi = 1/σi², the GLS coefficient and its variance are

β̂2* = [(Σwi)(ΣwiXiYi) − (ΣwiXi)(ΣwiYi)] / [(Σwi)(ΣwiXi²) − (ΣwiXi)²]

var(β̂2*) = Σwi / [(Σwi)(ΣwiXi²) − (ΣwiXi)²]

Difference between OLS and GLS

• In OLS we minimize

Σûi² = Σ(Yi − β̂1 − β̂2Xi)²

which gives equal weight to every observation.

• In GLS we minimize

Σwi(Yi − β̂1*X0i − β̂2*Xi)²,  wi = 1/σi²

• The weight assigned to each observation is thus inversely proportional to its σi, that is, observations coming from a population with larger σi will get relatively smaller weight and those from a population with smaller σi will get proportionately larger weight in minimizing the RSS.

Consequences of Using OLS in the Presence of Heteroscedasticity
OLS Estimation Allowing for Heteroscedasticity

• Using the heteroscedasticity variance formula seen earlier, and assuming the σi² are known, can we establish confidence intervals and test hypotheses with the usual t and F tests?

• The answer generally is no, because var(β̂2*) ≤ var(β̂2): confidence intervals based on the OLS estimator will be unnecessarily wide.

Consequences of Using OLS in the Presence of Heteroscedasticity
OLS Estimation Disregarding Heteroscedasticity

• Suppose we use the usual homoscedastic formula var(β̂2) = σ²/Σxi² while ignoring the heteroskedasticity.

• This variance is a biased estimator of the true variance.

• If we persist in using the usual testing procedures despite heteroscedasticity, whatever conclusions we draw or inferences we make may be very misleading.

Consequences of Using OLS in the Presence of Heteroscedasticity
OLS Estimation Disregarding Heteroscedasticity

• The usual OLS standard errors are either too large (for the intercept) or generally too small (for the slope coefficient) in relation to those obtained by OLS allowing for heteroscedasticity.

• The message is clear: in the presence of heteroscedasticity, use GLS.

Detection of Heteroscedasticity
There are two types of methods:

○ Informal methods

○ Formal methods

Informal Methods
1. Nature of the Problem

• In cross-sectional data involving heterogeneous units, heteroscedasticity may be the rule rather than the exception.

• Thus, in a cross-sectional analysis involving investment expenditure in relation to sales, rate of interest, etc., heteroscedasticity is generally expected if small-, medium-, and large-size firms are sampled together.

Informal Methods
2. Graphical Method
• Plot the squared residuals ûi² against the estimated Ŷi.
• If, as in Figure a, there is no systematic pattern between the two variables, perhaps no heteroscedasticity is present in the data.
• Figures b to e, however, exhibit definite patterns, suggesting heteroscedasticity.
• Instead of plotting ûi² against Ŷi, one may plot them against one of the explanatory variables, especially if plotting ûi² against Ŷi results in the pattern shown in Figure a.

Formal Methods
Park Test
• Park suggests that σi² is some function of the explanatory variable Xi. The functional form suggested is

σi² = σ² Xi^β e^vi,  or  ln σi² = ln σ² + β ln Xi + vi

• where vi is the stochastic disturbance term.

Park Test
• Since σi² is generally not known, Park suggests using ûi² as a proxy and running the following regression:

ln ûi² = ln σ² + β ln Xi + vi = α + β ln Xi + vi

• If β turns out to be statistically significant, it would suggest that heteroscedasticity is present in the data. If it turns out to be insignificant, we may accept the assumption of homoscedasticity.

Park Test
The Park test is thus a two-stage procedure.

• In the first stage, we run the OLS regression disregarding the heteroscedasticity question.

• We obtain ûi² from this regression, and then in the second stage we run the above regression model.

EXAMPLE: Relationship between Compensation and Productivity
• Y = average compensation in thousands of dollars
• X = average productivity in thousands of dollars

The following data are available:

    Y       X
    3396    9355
    3787    8584
    4013    7962
    4104    8275
    4146    8389
    4241    9418
    4388    9795
    4538    10281
    4843    11750

EXAMPLE: Relationship between Compensation and Productivity
• Run the regression of compensation (y) on productivity (x):

Source SS df MS Number of obs = 9
F(1, 7) = 5.44
Model 619377.506 1 619377.506 Prob > F = 0.0523
Residual 796278.05 7 113754.007 R-squared = 0.4375
Adj R-squared = 0.3572
Total 1415655.56 8 176956.944 Root MSE = 337.27

y Coef. Std. Err. t P>|t| [95% Conf. Interval]
x .2329993 .0998528 2.33 0.052 -.0031151 .4691137
_cons 1992.062 936.6123 2.13 0.071 -222.6741 4206.798

• As labor productivity increases by a dollar, labor compensation on the average increases by about 23 cents.

EXAMPLE: Relationship between Compensation and Productivity
• Obtain the residuals from this regression.

• Generate the log of the squared residuals.

• Generate the log of the variable X.

• A sketch of these commands follows below.
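A plausible Stata sketch of these steps (the variable names y, x, ehat, lehat2, and lx are assumptions, not taken from the slides):

    * first-stage OLS regression of compensation on productivity
    regress y x
    * save the residuals
    predict ehat, residuals
    * log of the squared residuals, for the Park test regression
    generate lehat2 = ln(ehat^2)
    * log of the regressor
    generate lx = ln(x)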

EXAMPLE: Relationship between Compensation and Productivity
• Run the Park regression of the log squared residuals on log X:

Source SS df MS Number of obs = 9
F(1, 7) = 0.45
Model .95900152 1 .95900152 Prob > F = 0.5257
Residual 15.0546945 7 2.15067064 R-squared = 0.0599
Adj R-squared = -0.0744
Total 16.013696 8 2.001712 Root MSE = 1.4665

lehat2 Coef. Std. Err. t P>|t| [95% Conf. Interval]
lx -2.802023 4.196131 -0.67 0.526 -12.7243 7.12025
_cons 35.82686 38.3227 0.93 0.381 -54.79192 126.4456

EXAMPLE: Relationship between Compensation and Productivity
• The estimated Park regression is ln ûi² = 35.83 − 2.80 ln Xi, with an insignificant slope (p = 0.526).

• There is no statistically significant relationship between the two variables.

• Following the Park test, we may conclude that there is no heteroscedasticity in the error variance.

Glejser Test
• The Glejser test is similar in spirit to the Park test.
• After obtaining the residuals ûi from the OLS regression, Glejser suggests regressing the absolute values of ûi on the X variable.
• Glejser uses the following functional forms:

|ûi| = β1 + β2Xi + vi
|ûi| = β1 + β2√Xi + vi
|ûi| = β1 + β2(1/Xi) + vi
|ûi| = β1 + β2(1/√Xi) + vi
|ûi| = √(β1 + β2Xi) + vi
|ûi| = √(β1 + β2Xi²) + vi

Example
• We take the previous example and use the absolute values of the residuals.

• Generate the absolute-residuals variable; a command sketch follows below.
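A plausible Stata sketch (aehat is an assumed name; ehat is the residual series saved in the Park-test step):

    * absolute value of the OLS residuals
    generate aehat = abs(ehat)
    * Glejser regression of the absolute residuals on X
    regress aehat x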

Example
• Run the regression model of the absolute residuals on X:

Source SS df MS Number of obs = 9
F(1, 7) = 0.09
Model 4724.55491 1 4724.55491 Prob > F = 0.7718
Residual 363925.371 7 51989.3388 R-squared = 0.0128
Adj R-squared = -0.1282
Total 368649.926 8 46081.2408 Root MSE = 228.01

aehat Coef. Std. Err. t P>|t| [95% Conf. Interval]
x -.0203497 .0675047 -0.30 0.772 -.179973 .1392736
_cons 407.476 633.1895 0.64 0.540 -1089.779 1904.731

Example
• There is no statistically significant relationship between the absolute value of the residuals and the regressor, average productivity (p = 0.772).

• This reinforces the conclusion based on the Park test.

Glejser Test
• Goldfeld and Quandt point out that the error term vi has some problems: its expected value is nonzero, it is serially correlated, and, ironically, it is heteroscedastic.

Spearman's Rank Correlation Test

The formula is

rs = 1 − 6 [Σdi² / (n(n² − 1))]

where di is the difference in the ranks assigned to the same individual on the two characteristics and n is the number of individuals ranked.

Step 1. Fit the regression to the data on Y and X and obtain the residuals ûi.

Step 2. Ignoring the sign of ûi, that is, taking their absolute values |ûi|, rank both |ûi| and Xi (or Ŷi) and compute the rank correlation coefficient.

Spearman's Rank Correlation Test

Step 3. Assuming that the population rank correlation coefficient ρs is zero and n > 8, the significance of the sample rs can be tested by the t test as follows:

t = rs √(n − 2) / √(1 − rs²)

with df = n − 2.

• If the computed t value exceeds the critical t value, we may accept the hypothesis of heteroscedasticity; otherwise we may reject it.
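A plausible Stata sketch of the ranking step, using Stata's spearman command (aehat is the absolute-residual series from the Glejser step):

    * Spearman rank correlation between |u-hat| and X
    spearman aehat x
    * the reported rho is rs; test it with t = rs*sqrt(n-2)/sqrt(1-rs^2), df = n-2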


Goldfeld–Quandt Test
• The assumption is that the heteroscedastic variance, σi², is positively related to one of the explanatory variables in the regression model.

• Consider the usual two-variable model:

Yi = β1 + β2Xi + ui

• Suppose σi² is positively related to Xi as

σi² = σ²Xi²

• where σ² is a constant.

Goldfeld–Quandt Test
Goldfeld and Quandt suggest the following steps:

Step 1. Order or rank the observations according to the values of Xi, beginning with the lowest X value.

Step 2. Omit c central observations, where c is specified a priori, and divide the remaining (n − c) observations into two groups, each of (n − c)/2 observations.

Goldfeld–Quandt Test
Goldfeld and Quandt suggest the following steps:

Step 3. Fit separate OLS regressions to the first (n − c)/2 observations and the last (n − c)/2 observations, and obtain the respective residual sums of squares RSS1 and RSS2: RSS1 representing the RSS from the regression corresponding to the smaller Xi values (the small-variance group) and RSS2 that from the larger Xi values (the large-variance group).

Goldfeld–Quandt Test
• Each of these RSS has

df = (n − c)/2 − k = (n − c − 2k)/2

• where k is the number of parameters to be estimated, including the intercept.

Step 4. Compute the ratio

λ = (RSS2/df) / (RSS1/df)

Goldfeld–Quandt Test
• If we assume the ui are normally distributed, and if the assumption of homoscedasticity is valid, then λ follows the F distribution with numerator and denominator df each of (n − c − 2k)/2.

• If in an application the computed λ (= F) is greater than the critical F at the chosen level of significance, we can reject the hypothesis of homoscedasticity; that is, we can say that heteroscedasticity is very likely.

Goldfeld–Quandt Test
• In case there is more than one X variable in the model, the ranking of observations, the first step in the test, can be done according to any one of them.

EXAMPLE: The Goldfeld–Quandt Test
• Data on consumption expenditure in relation to income for a cross section of 30 families.

• Dropping the middle 4 observations, the OLS regressions based on the first 13 and the last 13 observations and their associated residual sums of squares are as follows:

EXAMPLE
• Regression model based on the first 13 observations:

Source SS df MS Number of obs = 13
F(1, 11) = 87.79
Model 3010.06452 1 3010.06452 Prob > F = 0.0000
Residual 377.166253 11 34.2878412 R-squared = 0.8887
Adj R-squared = 0.8785
Total 3387.23077 12 282.269231 Root MSE = 5.8556

ry Coef. Std. Err. t P>|t| [95% Conf. Interval]
rx .6967742 .074366 9.37 0.000 .5330958 .8604526
_cons 3.409429 8.704924 0.39 0.703 -15.74998 22.56884

EXAMPLE
• Regression model based on the last 13 observations:

Source SS df MS Number of obs = 13
F(1, 11) = 36.42
Model 5088.89274 1 5088.89274 Prob > F = 0.0001
Residual 1536.79957 11 139.709052 R-squared = 0.7681
Adj R-squared = 0.7470
Total 6625.69231 12 552.141026 Root MSE = 11.82

ry Coef. Std. Err. t P>|t| [95% Conf. Interval]
rx .7941373 .1315819 6.04 0.000 .5045274 1.083747
_cons -28.02717 30.64214 -0.91 0.380 -95.47006 39.41573

EXAMPLE
• Compute the ratio λ = (RSS2/11)/(RSS1/11) = (1536.7996/11)/(377.1663/11) ≈ 4.07.

EXAMPLE
• Obtain the critical F value for 11 numerator and 11 denominator df at the 5 percent level; a command sketch follows below.
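A plausible Stata sketch of the λ computation, using the two residual sums of squares reported above:

    scalar lambda = (1536.79957/11) / (377.166253/11)
    * p-value and 5 percent critical value of F with (11, 11) df
    scalar pvalue = Ftail(11, 11, lambda)
    scalar crit   = invFtail(11, 11, 0.05)
    scalar list lambda pvalue crit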

EXAMPLE

lambda = 4.0745946
pvalue = .01408971
crit = 2.8179305

• Since the estimated F (= λ) value exceeds the critical value, we may conclude that there is heteroscedasticity in the error variance.


Breusch–Pagan–Godfrey Test
• Consider the k-variable linear regression model

Yi = β1 + β2X2i + · · · + βkXki + ui

• Assume that the error variance σi² is described as

σi² = f(α1 + α2Z2i + · · · + αmZmi)

• That is, σi² is some function of the nonstochastic Z variables; some or all of the X's can serve as Z's.

Breusch–Pagan–Godfrey Test
• Specifically, assume that

σi² = α1 + α2Z2i + · · · + αmZmi

• that is, σi² is a linear function of the Z's.

Breusch–Pagan–Godfrey Test
• If α2 = α3 = · · · = αm = 0, then σi² = α1, which is a constant.

• Therefore, to test whether σi² is homoscedastic, one can test the hypothesis that α2 = α3 = · · · = αm = 0.

• This is the basic idea behind the Breusch–Pagan–Godfrey test.

• The actual test procedure is as follows.

Breusch–Pagan–Godfrey Test
• Step 1. Estimate the regression by OLS and obtain the residuals û1, û2, . . . , ûn.

• Step 2. Obtain σ̃² = Σûi²/n (the maximum-likelihood estimator of σ²).

• Step 3. Construct variables pi defined as pi = ûi²/σ̃².

Breusch–Pagan–Godfrey Test
• Step 4. Regress the pi thus constructed on the Z's as

pi = α1 + α2Z2i + · · · + αmZmi + vi

• Step 5. Obtain the ESS (explained sum of squares) from this regression and define Θ = ESS/2.

Breusch–Pagan–Godfrey Test
• Assuming the ui are normally distributed, if there is homoscedasticity and if the sample size n increases indefinitely, then

Θ ∼ χ²(m−1) asymptotically

• That is, Θ follows the chi-square distribution with (m − 1) degrees of freedom.

Breusch–Pagan–Godfrey Test
• Therefore, if in an application the computed Θ (= χ²) exceeds the critical χ² value at the chosen level of significance, we can reject the hypothesis of homoscedasticity; otherwise we do not reject it.

EXAMPLE: The Breusch–Pagan–Godfrey (BPG) Test
• We use the previous FULL data set (30 observations).
• Step 1. Regress Y on X:

Source SS df MS Number of obs = 30
F(1, 28) = 496.72
Model 41886.7134 1 41886.7134 Prob > F = 0.0000
Residual 2361.15325 28 84.3269018 R-squared = 0.9466
Adj R-squared = 0.9447
Total 44247.8667 29 1525.78851 Root MSE = 9.183

y Coef. Std. Err. t P>|t| [95% Conf. Interval]
x .6377846 .0286167 22.29 0.000 .579166 .6964031
_cons 9.290307 5.231386 1.78 0.087 -1.4257 20.00632

EXAMPLE: The Breusch–Pagan–Godfrey (BPG) Test
• Step 2.
• Generate the residuals, their squares, the sum of the squared residuals, and the number of observations.
• Compute the sigma-tilde value σ̃² = Σûi²/n; a command sketch follows below.
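A plausible Stata sketch for Step 2 (uhat, uhat2, and sig2 are assumed names):

    * residuals from the Step 1 regression of y on x
    predict uhat, residuals
    generate uhat2 = uhat^2
    * sum of squared residuals and number of observations
    quietly summarize uhat2
    * sigma-tilde squared = sum of squared residuals / n
    scalar sig2 = r(sum) / r(N)
    scalar list sig2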

EXAMPLE
• Step 3. Divide the squared residuals ûi² obtained from the regression by σ̃² = 2361.1533/30 = 78.7051 to construct the variable pi.

EXAMPLE
Step 4. Assuming that the pi are linearly related to Xi (= Zi), we run the regression of p on x:

Source SS df MS Number of obs = 30
F(1, 28) = 5.97
Model 10.4280495 1 10.4280495 Prob > F = 0.0211
Residual 48.9100398 28 1.74678714 R-squared = 0.1757
Adj R-squared = 0.1463
Total 59.3380894 29 2.04614101 Root MSE = 1.3217

p Coef. Std. Err. t P>|t| [95% Conf. Interval]
x .0100632 .0041187 2.44 0.021 .0016265 .0184999
_cons -.7426146 .7529284 -0.99 0.332 -2.284918 .7996892

EXAMPLE
• The estimated relation is p̂i = −0.7426 + 0.0101Xi.

EXAMPLE
Step 5.
• Obtain the model ESS (explained sum of squares) from the Step 4 regression and compute Θ = ESS/2; a command sketch follows below.
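A plausible Stata sketch for Steps 3 to 5 (p is an assumed name; e(mss) is the model sum of squares Stata stores after regress):

    * Step 3: construct p = squared residuals / sigma-tilde squared
    generate p = uhat2/sig2
    * Step 4: regress p on X (= Z)
    regress p x
    * Step 5: theta = ESS/2, with critical values and p-value for 1 df
    scalar theta  = e(mss)/2
    scalar pvalue = chi2tail(1, theta)
    scalar crit_5 = invchi2tail(1, 0.05)
    scalar crit_1 = invchi2tail(1, 0.01)
    scalar list theta pvalue crit_5 crit_1

Stata's built-in estat hettest command, run after the original regression, performs a version of this test directly.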

EXAMPLE

theta = 5.2140103
pvalue = .0224056
crit_5 = 3.8414588
crit_1 = 6.6348966

• For 1 df, the 5 percent critical chi-square value is 3.8414 and the 1 percent critical χ² value is 6.6349.

• Thus, the observed chi-square value of 5.2140 is significant at the 5 percent but not the 1 percent level of significance.

White's General Heteroscedasticity Test

• Does not rely on the normality assumption.

• Consider the three-variable model

Yi = β1 + β2X2i + β3X3i + ui

• The White test proceeds as follows:

• Step 1. Given the data, we estimate this equation and obtain the residuals ûi.

White's General Heteroscedasticity Test

• Step 2. We then run the following (auxiliary) regression:

ûi² = α1 + α2X2i + α3X3i + α4X2i² + α5X3i² + α6X2iX3i + vi

• There is a constant term in this equation even though the original regression may or may not contain it.

White's General Heteroscedasticity Test

• Step 3. Under the null hypothesis that there is no heteroscedasticity, the sample size (n) times the R² obtained from the auxiliary regression asymptotically follows the chi-square distribution with df equal to the number of regressors (excluding the constant term) in the auxiliary regression:

n · R² ∼ χ²(df) asymptotically

White's General Heteroscedasticity Test

• Step 4. If the chi-square value obtained exceeds the critical chi-square value at the chosen level of significance, the conclusion is that there is heteroscedasticity.

White's Heteroscedasticity Test

The White test can be a test of (pure) heteroscedasticity, or specification error, or both.

• If no cross-product terms are present in the White test procedure, then it is a test of pure heteroscedasticity.

• If cross-product terms are present, then it is a test of both heteroscedasticity and specification bias.

EXAMPLE: White's Heteroscedasticity Test

• Use the dataset HPRICE1 from Wooldridge.

• Run the regression of price on lotsize, sqrft, and bdrms:

Source SS df MS Number of obs = 88
F(3, 84) = 57.46
Model 617130.701 3 205710.234 Prob > F = 0.0000
Residual 300723.805 84 3580.0453 R-squared = 0.6724
Adj R-squared = 0.6607
Total 917854.506 87 10550.0518 Root MSE = 59.833

price Coef. Std. Err. t P>|t| [95% Conf. Interval]
lotsize .0020677 .0006421 3.22 0.002 .0007908 .0033446
sqrft .1227782 .0132374 9.28 0.000 .0964541 .1491022
bdrms 13.85252 9.010145 1.54 0.128 -4.065141 31.77018
_cons -21.77031 29.47504 -0.74 0.462 -80.38466 36.84405

EXAMPLE: White's Heteroscedasticity Test

• Compute the residuals and their squared terms, then run the auxiliary regression of the squared residuals on the regressors (a command sketch follows below):

Source SS df MS Number of obs = 88
F(3, 84) = 5.34
Model 701213780 3 233737927 Prob > F = 0.0020
Residual 3.6775e+09 84 43780003.5 R-squared = 0.1601
Adj R-squared = 0.1301
Total 4.3787e+09 87 50330276.7 Root MSE = 6616.6

resid2 Coef. Std. Err. t P>|t| [95% Conf. Interval]
lotsize .2015209 .0710091 2.84 0.006 .0603116 .3427302
sqrft 1.691037 1.46385 1.16 0.251 -1.219989 4.602063
bdrms 1041.76 996.381 1.05 0.299 -939.6526 3023.173
_cons -5522.795 3259.478 -1.69 0.094 -12004.62 959.0348
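A plausible Stata sketch of these steps (resid and resid2 are assumed names):

    * estimate the house-price equation and save the squared residuals
    regress price lotsize sqrft bdrms
    predict resid, residuals
    generate resid2 = resid^2
    * auxiliary regression of the squared residuals on the regressors
    regress resid2 lotsize sqrft bdrms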

EXAMPLE: White's Heteroscedasticity Test

• Form scalars for R² and the number of observations.

• Compute the test statistic nR².

• Find the critical value for 3 df at the 5 percent significance level; a command sketch follows below.
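A plausible Stata sketch, run right after the auxiliary regression (e(N) and e(r2) are the stored sample size and R²):

    scalar nr2    = e(N) * e(r2)
    scalar crit   = invchi2tail(3, 0.05)
    scalar pvalue = chi2tail(3, nr2)
    scalar list nr2 crit pvalue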

EXAMPLE: White's Heteroscedasticity Test

• Compute the p-value and list the values:

nr2 = 14.092386
crit = 7.8147279
pvalue = .00278206

• Thus, we reject the null hypothesis of no heteroskedasticity.

EXAMPLE: White's Heteroscedasticity Test

• Perform the same analysis using the logarithmic transformation of the variables:

Source SS df MS Number of obs = 88
F(3, 84) = 50.42
Model 5.15503875 3 1.71834625 Prob > F = 0.0000
Residual 2.86256415 84 .034078145 R-squared = 0.6430
Adj R-squared = 0.6302
Total 8.0176029 87 .092156355 Root MSE = .1846

lprice Coef. Std. Err. t P>|t| [95% Conf. Interval]
llotsize .1679668 .0382811 4.39 0.000 .0918406 .2440931
lsqrft .7002321 .0928653 7.54 0.000 .5155594 .8849049
bdrms .0369583 .0275313 1.34 0.183 -.0177907 .0917073
_cons -1.297041 .6512836 -1.99 0.050 -2.59219 -.0018916

EXAMPLE: White's Heteroscedasticity Test

• Generate the squared residuals and run the auxiliary regression:

Source SS df MS Number of obs = 88
F(3, 84) = 1.41
Model .022620136 3 .007540045 Prob > F = 0.2451
Residual .448718204 84 .005341883 R-squared = 0.0480
Adj R-squared = 0.0140
Total .471338339 87 .005417682 Root MSE = .07309

resid2 Coef. Std. Err. t P>|t| [95% Conf. Interval]
llotsize -.0070156 .0151563 -0.46 0.645 -.0371556 .0231244
lsqrft -.0627367 .0367674 -1.71 0.092 -.1358526 .0103792
bdrms .0168407 .0109002 1.54 0.126 -.0048357 .038517
_cons .5099937 .2578573 1.98 0.051 -.0027838 1.022771

EXAMPLE: White's Heteroscedasticity Test

• Compute the test statistic, critical value, and p-value as before.

EXAMPLE: White's Heteroscedasticity Test

• Show all the values and make a decision:

nr2 = 4.2232337
crit = 7.8147279
pvalue = .23834602

• As the p-value is high, we do not reject the null hypothesis of homoscedasticity.

• This means the usual OLS standard errors of the log model are valid.


Remedial Measures

When σi² Is Known: The Method of Weighted Least Squares
EXAMPLE: Illustration of the Method of Weighted Least Squares

• Suppose we want to study the relationship between compensation and employment size.


When σi² Is Known: The Method of Weighted Least Squares
The procedure: divide Yi, the intercept term, and Xi through by the known σi and run OLS on the transformed variables without a separate constant term; a command sketch follows below.
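A plausible Stata sketch of the transformation (sigma is an assumed variable holding the known σi; yt, inter, and xt are assumed names matching the output below):

    * divide Y, the intercept term, and X through by sigma
    generate yt    = y / sigma
    generate inter = 1 / sigma
    generate xt    = x / sigma
    * WLS = OLS on the transformed variables, suppressing the constant
    regress yt inter xt, noconstant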

When σi² Is Known: The Method of Weighted Least Squares
• Run the regression on the transformed variables:

Source SS df MS Number of obs = 9
F(2, 7) = 5258.53
Model 174.509886 2 87.2549429 Prob > F = 0.0000
Residual .116151268 7 .016593038 R-squared = 0.9993
Adj R-squared = 0.9991
Total 174.626037 9 19.402893 Root MSE = .12881

yt Coef. Std. Err. t P>|t| [95% Conf. Interval]
inter 3392.686 77.17705 43.96 0.000 3210.192 3575.181
xt 154.3817 16.16221 9.55 0.000 116.1641 192.5992

When σi² Is Known: The Method of Weighted Least Squares
• The results of WLS: the estimated intercept is 3392.686 (se = 77.18) and the estimated slope is 154.3817 (se = 16.16); both are highly significant.
When σi² Is Known
• For comparison, we give the usual or unweighted OLS regression results:

Source SS df MS Number of obs = 9
F(1, 7) = 121.62
Model 1354804.27 1 1354804.27 Prob > F = 0.0000
Residual 77979.7333 7 11139.9619 R-squared = 0.9456
Adj R-squared = 0.9378
Total 1432784 8 179098 Root MSE = 105.55

y Coef. Std. Err. t P>|t| [95% Conf. Interval]
x 150.2667 13.62593 11.03 0.000 118.0465 182.4869
_cons 3400.333 76.6774 44.35 0.000 3219.02 3581.647

When σi² Is Not Known

• White's Heteroscedasticity-Consistent Variances and Standard Errors.

• White's heteroscedasticity-corrected standard errors are also known as robust standard errors; a command sketch follows below.
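In Stata, White's robust standard errors can be requested directly; a minimal sketch, assuming the variables are named y and x:

    * OLS point estimates with heteroscedasticity-robust standard errors
    regress y x, vce(robust)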

Plausible Assumptions about Heteroscedasticity Pattern

• Consider the two-variable regression model:

Yi = β1 + β2Xi + ui

• Consider several assumptions about the pattern of heteroscedasticity.

Plausible Assumptions about Heteroscedasticity Pattern

• Assumption 1: The error variance is proportional to Xi²: E(ui²) = σ²Xi².

• Transform the original model as follows: divide the original model through by Xi:

Yi/Xi = β1(1/Xi) + β2 + ui/Xi

• The transformed disturbance ui/Xi is homoscedastic with variance σ².

Plausible Assumptions about Heteroscedasticity Pattern

• Assumption 2: The error variance is proportional to Xi: E(ui²) = σ²Xi.

• Transform the original model by dividing through by √Xi:

Yi/√Xi = β1(1/√Xi) + β2√Xi + ui/√Xi

Plausible Assumptions about Heteroscedasticity Pattern

• Assumption 3: The error variance is proportional to the square of the mean value of Y: E(ui²) = σ²[E(Yi)]².

• Transform the original model by dividing through by E(Yi), in practice approximated by the fitted Ŷi:

Yi/Ŷi = β1(1/Ŷi) + β2(Xi/Ŷi) + ui/Ŷi

Plausible Assumptions about Heteroscedasticity Pattern

• Assumption 4: A log transformation such as

ln Yi = β1 + β2 ln Xi + ui

very often reduces heteroscedasticity when compared with the regression Yi = β1 + β2Xi + ui.

Concluding Examples
• Data on research and development (R&D) expenditure, sales, and profits for 18 industry groupings in the United States, all figures in millions of dollars.
• Run the regression of R&D expenditure on sales:

Source SS df MS Number of obs = 18
F(1, 16) = 14.67
Model 111675212 1 111675212 Prob > F = 0.0015
Residual 121806834 16 7612927.12 R-squared = 0.4783
Adj R-squared = 0.4457
Total 233482046 17 13734238 Root MSE = 2759.2

rd Coef. Std. Err. t P>|t| [95% Conf. Interval]
sales .0319003 .008329 3.83 0.001 .0142436 .049557
_cons 192.9932 990.9858 0.19 0.848 -1907.803 2293.789

Test for Heteroskedasticity

• Obtain the residuals and their absolute and squared values; a command sketch follows below.
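A plausible Stata sketch (uhat, auhat, and uhat2 are assumed names):

    * residuals from the R&D regression, plus absolute and squared values
    predict uhat, residuals
    generate auhat = abs(uhat)
    generate uhat2 = uhat^2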

Test for Heteroskedasticity

• Park Test (a command sketch follows below)
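A plausible Stata sketch of the Park test for this model (luhat2 and lsales are assumed names):

    * Park test: regress the log squared residuals on log sales
    generate luhat2 = ln(uhat2)
    generate lsales = ln(sales)
    regress luhat2 lsales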

