
RELAXING THE OLS ASSUMPTIONS
Internal and external validity
• A study is internally valid if its statistical inferences about causal effects are
valid for the population and setting being studied
• A study is externally valid if its statistical inferences can be generalized to
other populations and settings
• The population studied is the population of entities—people, companies, school
districts, and so forth—from which the sample was drawn

• The population to which the results are generalized, or the population of interest, is the
population of entities to which the causal inferences from the study are to be applied.

• Example: A policy maker might want to generalize the findings on household income
and consumption in Maharashtra (the population studied) to the population of
households in India (the population of interest)
Internal and external validity
Threats to External Validity
• Differences in populations: Differences between the population studied and the
population of interest can pose a threat to external validity
• This could be because the population studied was chosen in a way that makes it different
from the population of interest, whether due to differences in characteristics, geographical
differences, or because the study is out of date

• Differences in settings: The results based on the population studied may not be
generalizable due to differences in settings arising out of cultural or institutional
differences in the two populations
Internal and external validity
Threats to Internal Validity
• Internal validity has two components:
• The estimator of the causal effect should be unbiased and consistent
• Hypothesis tests should have the desired significance level (the actual rejection
rate of the test under the null hypothesis should equal its desired significance
level), and confidence intervals should have the desired confidence level

• Threats to internal validity originate from the failures of one or more of the
least squares assumptions
Ordinary Least Squares: Assumptions
• Assumption 1: The regression line is linear in parameters, although it may not be
linear in variables
• Assumption 2: The values taken by the regressor 𝑋 may be considered fixed in
repeated sampling
• In many real-world scenarios, the data are collected such that the independent variables
are random, or stochastic, in nature.
• In such cases, we assume that the $X$ variables are independent of the error term:
$$Cov(X_i, u_i) = 0$$
• Assumption 2 implies the absence of a linear association between $X_i$ and $u_i$

Example: When regressing household consumption on income, we assume that omitted
factors such as household wealth, which are subsumed in $u_i$, are uncorrelated with $X_i$
Ordinary Least Squares: Assumptions
• Assumption 3 (Zero Conditional Mean): Given the value of $X_i$, the mean or
expected value of the random disturbance term $u_i$ is zero
$$E(u_i \mid X_i) = 0$$
• Assumption 3 implies Assumption 2 but extends the latter to include non-linear
association between $X_i$ and $u_i$
• Assumption 3 ensures that $(Y_{0i}, Y_{1i}) \perp X_i$
• The presence of some omitted factor (or baseline characteristic) influencing
treatment assignment violates Assumption 3
Example: When regressing household consumption on income, we assume that the
average influence of omitted factors such as household wealth, which are subsumed
in $u_i$, is zero for given values of $X_i$
Ordinary Least Squares: Assumptions
• Assumption 4 (Homoscedasticity): The variance of the error, or disturbance,
term is the same regardless of the value of $X$
$$var(u_i \mid X_i) = E[u_i - E(u_i \mid X_i)]^2 = E(u_i^2 \mid X_i) = \sigma^2$$
Ordinary Least Squares: Assumptions
• Assumption 5 (No Autocorrelation): Given any two $X$ values, $X_i$ and $X_j$ ($i \neq j$),
the correlation between any two $u_i$ and $u_j$ ($i \neq j$) is zero.
• In short, the observations are sampled independently.
$$cov(u_i, u_j \mid X_i, X_j) = 0$$

• Assumption 6 (Normality of $u_i$): The disturbance term $u_i$ is normally distributed.
• The disturbance term $u_i$ represents the combined influence of a large number
of $iid$ omitted variables
• By the Central Limit Theorem, the sum of a large number of $iid$ variables is
approximately normally distributed
Multiple Linear Regression: Assumptions
• Assumption 7 (No Multicollinearity): It is assumed that there is no perfect
collinearity among the independent variables.
• That is, none of the regressors can be written as an exact linear
combination of the remaining regressors in the model.
• No multicollinearity implies that there exists no set of numbers $\lambda_1$ and $\lambda_2$, not
both zero, such that
$$\lambda_1 X_{1i} + \lambda_2 X_{2i} = 0$$

• Assumption 8 (No specification bias): The regression model is correctly specified
Violation of OLS Assumptions: Non-normality of 𝑢𝑖
• The assumption of the normality of $u_i$ is essential for hypothesis testing with regard to
the OLS estimators
• Without the normality assumption on $u_i$, the OLS estimators are not normally
distributed either, and we cannot use the usual $t$ and $F$ tests to test statistical hypotheses in the
usual sense
• In case of non-normality of $u_i$, we rely on inferences based on large samples
If the disturbances $u_i$ are independently and identically distributed with mean zero and constant
variance $\sigma^2$, and if the explanatory variables are constant in repeated sampling, the OLS coefficient
estimates are asymptotically normally distributed with means equal to the corresponding $\beta$'s. Thus the
usual test procedures, the $t$ and $F$ tests, are still valid asymptotically, that is, in large samples, but not in
finite samples.
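This asymptotic result can be illustrated with a small Monte Carlo sketch. The data-generating process, sample size, and skewed error distribution below are illustrative assumptions, not taken from the slides.

```python
# Monte Carlo sketch (simulated data): OLS slope estimates under skewed,
# non-normal errors are still approximately normally distributed when n is large.
import numpy as np

rng = np.random.default_rng(0)
n, reps, beta0, beta1 = 500, 2000, 1.0, 2.0
slopes = np.empty(reps)

for r in range(reps):
    x = rng.uniform(0, 10, n)
    u = rng.exponential(1.0, n) - 1.0            # skewed errors with mean zero
    y = beta0 + beta1 * x + u
    xd = x - x.mean()
    slopes[r] = xd @ (y - y.mean()) / (xd @ xd)  # OLS slope formula

print("mean of slope estimates:", round(slopes.mean(), 3))   # close to 2.0
skew = ((slopes - slopes.mean()) ** 3).mean() / slopes.std() ** 3
print("skewness of estimates  :", round(skew, 3))            # close to 0, i.e. near normal
```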
Violation of OLS Assumptions: Multicollinearity
Multicollinearity: What happens if the regressors are correlated?
• The term multicollinearity originally meant the existence of a perfect or exact linear
relationship among some or all explanatory variables of a regression model
• For a $k$-variable linear regression involving the explanatory variables $X_0, X_1, X_2, \ldots, X_k$, an
exact linear relationship is said to exist if the following condition is satisfied:
$$\lambda_0 X_0 + \lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_k X_k = 0$$
Here $X_0 = 1$ for all observations to allow for the intercept term, and the $\lambda$'s are constants that are not all zero simultaneously
• In modern usage, the term multicollinearity refers to perfect as well as imperfect
interrelationships among the regressors, as defined below:
$$\lambda_0 X_0 + \lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_k X_k + v_i = 0$$
Here $v_i$ is a stochastic error term
Multicollinearity
Sources of Multicollinearity
• Data collection method: Sampling over a limited range of values taken by the
regressors in the model
• Nature of the variables in regression model: Some regressors in a regression model
may be interrelated by nature resulting in the problem of multicollinearity
• Example: Regression of consumption on household income and household size faces the
problem of multicollinearity as household income and household size are likely to be
highly correlated

• Model specification: Adding polynomial terms of a regressor to a model leads to the
problem of multicollinearity
Multicollinearity: Consequences
Estimation in the presence of perfect multicollinearity
• Three-variable regression model:
$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{u}_i$$
$$\hat{\beta}_1 = \frac{\sum y_i x_{1i} \sum x_{2i}^2 - \sum y_i x_{2i} \sum x_{1i} x_{2i}}{\sum x_{1i}^2 \sum x_{2i}^2 - \left(\sum x_{1i} x_{2i}\right)^2}$$
Here $y_i = Y_i - \bar{Y}$; $x_{1i} = X_{1i} - \bar{X}_1$; $x_{2i} = X_{2i} - \bar{X}_2$
• Under perfect multicollinearity: $X_{1i} = \lambda X_{2i} \Rightarrow x_{1i} = \lambda x_{2i}$
$$\hat{\beta}_1 = \frac{\lambda \sum y_i x_{2i} \sum x_{2i}^2 - \sum y_i x_{2i}\, \lambda \sum x_{2i}^2}{\lambda^2 \sum x_{2i}^2 \sum x_{2i}^2 - \lambda^2 \left(\sum x_{2i}^2\right)^2} = \frac{0}{0}, \quad \text{i.e. undefined}$$
• Both $\hat{\beta}_1$ and $\hat{\beta}_2$ are indeterminate
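A minimal numerical sketch of why the estimates are indeterminate, using simulated data in which $X_1 = 2X_2$ exactly: the cross-product matrix $X'X$ is rank-deficient, so the normal equations have no unique solution.

```python
# Sketch (simulated data): under perfect multicollinearity X1 = 2*X2, the matrix X'X
# is singular, so the OLS normal equations have no unique solution.
import numpy as np

rng = np.random.default_rng(1)
n = 100
x2 = rng.normal(size=n)
x1 = 2.0 * x2                                   # exact linear dependence (lambda = 2)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 3.0 * x2 + rng.normal(size=n)

print("column rank of X        :", np.linalg.matrix_rank(X))   # 2, not 3
print("condition number of X'X :", np.linalg.cond(X.T @ X))    # astronomically large
beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print("lstsq rank:", rank, "| one of infinitely many solutions:", beta.round(3))
```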
Multicollinearity: Consequences
Consequences of imperfect multicollinearity
• Even in the presence of near perfect multicollinearity, the OLS estimators retain the
property of Best Linear Unbiased Estimators (BLUE)
Consequences of imperfect multicollinearity
• The OLS estimators have large variances, making precise estimation difficult
$$var(\hat{\beta}_1) = \frac{\sigma^2 \sum x_{2i}^2}{\sum x_{1i}^2 \sum x_{2i}^2 - \left(\sum x_{1i} x_{2i}\right)^2}$$
$$\Rightarrow var(\hat{\beta}_1) = \frac{\sigma^2}{\sum x_{1i}^2 - \dfrac{\left(\sum x_{1i} x_{2i}\right)^2}{\sum x_{2i}^2}}$$
$$\Rightarrow var(\hat{\beta}_1) = \frac{\sigma^2}{\sum x_{1i}^2 - \sum x_{1i}^2 \dfrac{\left(\sum x_{1i} x_{2i}\right)^2}{\sum x_{1i}^2 \sum x_{2i}^2}}$$
$$\Rightarrow var(\hat{\beta}_1) = \frac{\sigma^2}{\sum x_{1i}^2 \left(1 - r_{12}^2\right)} \quad \text{where } r_{12}^2 = \frac{\left(\sum x_{1i} x_{2i}\right)^2}{\sum x_{1i}^2 \sum x_{2i}^2}$$
• $var(\hat{\beta}_1)$ is directly proportional to $r_{12}^2$, the squared correlation between $X_1$ and $X_2$
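A small simulation sketch of the variance formula above; the data-generating process and true coefficients are illustrative assumptions. The sampling spread of $\hat{\beta}_1$ inflates as the correlation between the regressors approaches one.

```python
# Simulation sketch (simulated data): the sampling spread of beta1_hat inflates as the
# correlation between the two regressors rises, as var(beta1_hat) = sigma^2 / [sum x1^2 (1 - r12^2)] implies.
import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 2000

for rho in (0.0, 0.5, 0.9, 0.99):
    slopes = np.empty(reps)
    for r in range(reps):
        x1 = rng.normal(size=n)
        x2 = rho * x1 + np.sqrt(1 - rho ** 2) * rng.normal(size=n)  # corr(x1, x2) = rho
        y = 1 + x1 + x2 + rng.normal(size=n)                        # true slopes = 1
        X = np.column_stack([np.ones(n), x1, x2])
        slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]
    print(f"rho = {rho:4.2f}   sd(beta1_hat) = {slopes.std():.3f}")
```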
Consequences of imperfect multicollinearity
• In a general 𝑘-variable regression model,

$$var(\hat{\beta}_j) = \frac{\sigma^2}{\sum x_j^2} \cdot \frac{1}{1 - R_j^2}$$
$$var(\hat{\beta}_j) = \frac{\sigma^2}{\sum x_j^2} \, VIF_j$$

Here, $\hat{\beta}_j$ is the estimated partial regression coefficient of the regressor $X_j$, and
$R_j^2$ is the $R^2$ of the regression of $X_j$ on the remaining regressors in the model.
Consequences of imperfect multicollinearity
• Inflated variances result in wider confidence intervals, leading to the null hypothesis being
accepted more readily
• In cases of high multicollinearity, the sample data may be compatible with a large set of
hypotheses, increasing the probability of Type II error.

• The $t$-ratio of $\hat{\beta}_j$ tends to be statistically insignificant
$$t = \frac{\text{estimate} - \text{hypothesised value}}{\text{standard error}} \sim t_{m\,\text{d.f.}}$$
• By artificially enlarging the standard errors, high collinearity can result in a smaller $t$-statistic,
thus increasing the chances of accepting the null hypothesis
Consequences of imperfect multicollinearity
• The individual regression coefficients may be statistically insignificant on the basis of
the $t$-test, yet the overall $R^2$ in such situations may be high
The $F$ test would reject $H_0: \beta_1 = \beta_2 = \cdots = \beta_j = \cdots = \beta_k = 0$
Individual $t$ tests would fail to reject $H_0: \beta_j = 0,\; j = 1, 2, \ldots, k$
• The 𝐹 test would convincingly reject the null hypothesis that all the slope coefficients are
simultaneously equal to zero although the 𝛽’s may all be individually insignificant
• High collinearity among the regressors makes it very difficult to identify the partial or
direct effect of an individual regressor on the outcome variable

• The estimates and the standard errors of the OLS estimators become very sensitive
to even the slightest changes in data
Detection of multicollinearity
• High 𝑅2 but very few significant 𝑡 ratios: A classic symptom of multicollinearity is a
situation where the 𝑅2 is high but the individual slope coefficients are not significant
• Auxiliary regressions: Another way of detecting multicollinearity in a 𝑘-variable
regression is to regress each of the independent variables on the remaining regressors
• If the estimated 𝑅2 from the auxiliary regression exceeds 0.90, it can imply that
multicollinearity is an issue
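A minimal sketch of the auxiliary-regression check using statsmodels; the simulated data and variable names (income, wealth, hh_size) are hypothetical.

```python
# Sketch (simulated data, hypothetical variable names): auxiliary regressions for
# detecting multicollinearity: regress each regressor on the others and check R^2.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 300
income = rng.normal(50, 10, n)
wealth = 5 * income + rng.normal(0, 5, n)       # deliberately near-collinear with income
hh_size = rng.poisson(4, n).astype(float)
X = pd.DataFrame({"income": income, "wealth": wealth, "hh_size": hh_size})

for col in X.columns:
    others = sm.add_constant(X.drop(columns=col))
    aux_r2 = sm.OLS(X[col], others).fit().rsquared
    print(f"auxiliary R^2 for {col:8s}: {aux_r2:.3f}")   # > 0.90 flags multicollinearity
```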
Detection of multicollinearity
• The tolerance and the Variance Inflating Factor (VIF) are useful tools for detecting
multicollinearity
• The VIF of a regressor $X_j$ is defined as
$$VIF_j = \frac{1}{1 - R_j^2}$$
$R_j^2$ is the $R^2$ of the regression of $X_j$ on the remaining regressors
• As a rule of thumb, if the VIF of a variable exceeds 10, which will happen if $R_j^2 > 0.90$,
that variable is said to be highly collinear
• The tolerance of a regressor is defined as the inverse of the VIF:
$$TOL_j = \frac{1}{VIF_j} = 1 - R_j^2$$
• The lower the tolerance, the greater the degree of multicollinearity
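A sketch of the VIF and tolerance computation using the variance_inflation_factor helper from statsmodels; the simulated data and variable names are illustrative.

```python
# Sketch (simulated data): VIF and tolerance via statsmodels' variance_inflation_factor.
# Rule of thumb from the slides: VIF > 10 (i.e. R_j^2 > 0.90) indicates high collinearity.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
n = 300
income = rng.normal(50, 10, n)
wealth = 5 * income + rng.normal(0, 5, n)       # deliberately near-collinear
hh_size = rng.poisson(4, n).astype(float)
exog = sm.add_constant(pd.DataFrame({"income": income, "wealth": wealth,
                                     "hh_size": hh_size}))

for j, name in enumerate(exog.columns):
    if name == "const":
        continue
    vif = variance_inflation_factor(exog.values, j)
    print(f"{name:8s}  VIF = {vif:10.2f}  tolerance = {1.0 / vif:.4f}")
```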
Detection of multicollinearity
• The condition number and the condition index are useful tools for detecting
multicollinearity
• The condition number $k$ is defined as
$$k = \frac{\text{Maximum eigenvalue}}{\text{Minimum eigenvalue}}$$
• As a rule of thumb, if the condition number lies between 100 and 1000, there is moderate
to strong multicollinearity; if it exceeds 1000, there is severe multicollinearity
• The condition index of a regressor is defined as the square root of the condition number:
$$\text{Condition Index} = \sqrt{\frac{\text{Maximum eigenvalue}}{\text{Minimum eigenvalue}}} = \sqrt{\text{Condition number}}$$
• The higher the condition index, the greater the degree of multicollinearity
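A sketch of the eigenvalue-based diagnostics on simulated collinear data. Scaling the columns of $X$ to unit length before forming $X'X$ is an assumption of this sketch, made so that the eigenvalues are not driven by the units of measurement.

```python
# Sketch (simulated data): condition number and condition index computed from the
# eigenvalues of the column-scaled X'X matrix.
import numpy as np

rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)      # highly collinear with x1
X = np.column_stack([np.ones(n), x1, x2])

Xs = X / np.linalg.norm(X, axis=0)              # scale columns to unit length
eig = np.linalg.eigvalsh(Xs.T @ Xs)             # eigenvalues of the scaled X'X
k = eig.max() / eig.min()
print("condition number k:", round(k, 1))       # 100-1000: moderate/strong; >1000: severe
print("condition index   :", round(np.sqrt(k), 1))
```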


Multicollinearity: Remedial measures
• Dropping a variable and specification bias: One of the simplest methods to counter
multicollinearity is to drop one of the collinear variables
• Although it can solve the multicollinearity problem, it leads to specification bias in the model
• Transformation of variables: A useful method to correct for multicollinearity is to
transform the variables in the model.
• Examples include specifying values in per capita terms or in logarithms rather than in levels.
• Additional or new data: Simply increasing the sample size may be a potent solution to the
multicollinearity problem.
$$var(\hat{\beta}_j) = \frac{\sigma^2}{\sum x_j^2} \, VIF_j$$
• As the sample size increases, $\sum x_j^2$ is likely to increase, lowering $var(\hat{\beta}_j)$
Multicollinearity: Remedial measures
• Reducing collinearity in polynomial models: One simple way to reduce
multicollinearity in models with quadratic or cubic terms is to express the independent
variables in deviation form, i.e. as deviations from their means (see the sketch after this list).
• Other methods such as factor analysis, principal components methods and ridge
regression can also be used to address the multicollinearity problem
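A minimal sketch, on simulated data, of the centering idea: subtracting the mean before squaring a regressor sharply reduces the correlation between the level and quadratic terms.

```python
# Sketch (simulated data): centering a regressor before squaring it reduces the
# correlation between the level term and the quadratic term.
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(20, 80, 500)                    # hypothetical regressor, e.g. an age-like variable
xc = x - x.mean()                               # deviation form

print("corr(x,  x^2 ):", round(np.corrcoef(x, x ** 2)[0, 1], 3))    # close to 1
print("corr(xc, xc^2):", round(np.corrcoef(xc, xc ** 2)[0, 1], 3))  # close to 0
```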
Violation of OLS Assumptions: Heteroscedasticity
Heteroscedasticity: What happens if the error variance is non-constant?
• The assumption of homoscedasticity states that the error variance conditional on
the values of the regressor is constant
$$var(u_i \mid X_i) = E(u_i^2 \mid X_i) = \sigma^2$$

• Under heteroscedasticity, the variance of the error term varies with the values of
the regressor
$$E(u_i^2 \mid X_i) = \sigma_i^2$$
Violation of OLS Assumptions: Heteroscedasticity
Heteroscedasticity: Sources
• Cross-sectional data involve observations on heterogeneous units with varying
range/scale. Such data invariably result in heteroscedasticity
• Example: Data on households display a varying range of expenditures at different levels
of income. Households with low income exhibit a low range of expenses, whereas higher-income
households exhibit a larger variance of expenses due to their greater discretionary
incomes
• Example: Data on firms display a varying range of profits at different firm sizes.
Smaller firms exhibit a low range of profits, whereas larger firms exhibit a wider variance of
profits due to their higher R&D expenses, which are inherently risky
• Example: The number of typing errors and its variance decrease with the hours of
typing practice, as practice makes perfect
Heteroscedasticity: Sources
• Heteroscedasticity also arises due to the presence of outliers
• Heteroscedasticity may arise due to incorrect model specification
• Omission of a relevant variable results in the effect of this variable being subsumed in the
disturbance term. If this omitted variable is correlated with any other regressor, the
disturbance term will be systematically correlated with the regressor
• Specifying a linear functional form instead of a quadratic form results in the quadratic
term being subsumed in the disturbance leading to correlation between the regressor and
the error term
Heteroscedasticity: Consequences
• The OLS estimate of $\beta_1$ under heteroscedasticity is identical to its estimate under
homoscedasticity
• The OLS estimator $\hat{\beta}_1$ under heteroscedasticity is still a linear, unbiased and consistent
estimator of $\beta_1$
• Under certain conditions, $\hat{\beta}_1$ is still asymptotically normally distributed under
heteroscedasticity
• The interpretation of the goodness-of-fit measures, 𝑅2 and 𝐴𝑑𝑗𝑢𝑠𝑡𝑒𝑑 𝑅2 is also unaffected
by the presence of heteroscedasticity
Heteroscedasticity: Consequences
• The variance of $\hat{\beta}_1$ is different from its corresponding variance under homoscedasticity
Under homoscedasticity: $var(\hat{\beta}_1) = \dfrac{\sigma^2}{\sum x_i^2}$
Under heteroscedasticity: $var(\hat{\beta}_1) = \dfrac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$
• Under heteroscedasticity, $\hat{\beta}_1$ is no longer the best or minimum-variance estimator of $\beta_1$ in
the class of linear unbiased estimators

• It is not possible to rely on conventionally computed confidence intervals and $t$-tests to
test hypotheses about individual regression coefficients
Standard errors of least squares estimators
$$var(\hat{\beta}_1) = E[\hat{\beta}_1 - \beta_1]^2 \quad \because \; \hat{\beta}_1 = \beta_1 + \sum k_i u_i, \;\text{where } k_i = \frac{x_i}{\sum x_i^2}$$
$$\Rightarrow var(\hat{\beta}_1) = E\left(\sum k_i u_i\right)^2 = E\left(k_1 u_1 + k_2 u_2 + \cdots + k_n u_n\right)^2$$
$$\Rightarrow var(\hat{\beta}_1) = E\left(k_1^2 u_1^2 + k_2^2 u_2^2 + \cdots + k_n^2 u_n^2 + 2 k_1 k_2 u_1 u_2 + \cdots + 2 k_{n-1} k_n u_{n-1} u_n\right)$$

• Under the assumption of heteroscedasticity and no autocorrelation,
$$var(\hat{\beta}_1) = \frac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$$
Heteroscedasticity: Detection
Informal methods of detecting heteroscedasticity
• Nature of the problem: In cross-sectional data involving heterogeneous units with a
large range, heteroscedasticity may be the rule rather than the exception
• Graphical method: We estimate the residuals $\hat{u}_i$ after running an OLS model assuming
homoscedasticity
• Plotting $\hat{u}_i^2$ against the fitted values $\hat{Y}_i$ provides evidence on the presence or absence of heteroscedasticity
• In the residual plots, panels (b) to (e) display definite patterns of heteroscedasticity
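A sketch of the graphical method on simulated heteroscedastic data (hypothetical income/consumption values), using matplotlib for the residual plot.

```python
# Sketch (simulated data): the graphical method. Squared OLS residuals plotted against
# fitted values fan out when the error spread grows with the regressor.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 400
income = rng.uniform(10, 100, n)
u = rng.normal(0, 0.2 * income)                 # error sd rises with income
consumption = 5 + 0.6 * income + u

res = sm.OLS(consumption, sm.add_constant(income)).fit()
plt.scatter(res.fittedvalues, res.resid ** 2, s=10)
plt.xlabel("fitted values")
plt.ylabel("squared residuals")
plt.title("Squared residuals vs fitted values")
plt.show()
```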


Heteroscedasticity: Detection
Formal methods of detecting heteroscedasticity: Breusch-Pagan-Godfrey test
• The $(k+1)$-variable linear regression model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + u_i$
• The error variance is assumed to be a function of the $X$'s. Specifically,
$$\sigma_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + \cdots + \alpha_k X_{ki}$$
• Under homoscedasticity: $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0 \;\Rightarrow\; \sigma_i^2 = \alpha_0$ (a constant)
• Estimate $\hat{u}_i^2$ from the model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + u_i$
• Regress $\hat{u}_i^2$ on the $X$'s: $\hat{u}_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + \cdots + \alpha_k X_{ki} + v_i$
• Test $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$ using the $LM$ or $F$ test
• If the computed value of the $LM$ or $F$ statistic exceeds the critical value, we reject $H_0$;
otherwise, we do not reject $H_0$
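A minimal sketch of running the BPG test with the het_breuschpagan function from statsmodels; the simulated data-generating process is an illustrative assumption.

```python
# Sketch (simulated data): Breusch-Pagan-Godfrey test using
# statsmodels.stats.diagnostic.het_breuschpagan.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(7)
n = 400
x1 = rng.uniform(10, 100, n)
x2 = rng.normal(size=n)
y = 1 + 0.5 * x1 + 2 * x2 + rng.normal(0, 0.2 * x1)   # error variance depends on x1

exog = sm.add_constant(np.column_stack([x1, x2]))
res = sm.OLS(y, exog).fit()

lm, lm_pval, f_stat, f_pval = het_breuschpagan(res.resid, exog)
print(f"LM statistic = {lm:8.2f}, p-value = {lm_pval:.4f}")
print(f"F  statistic = {f_stat:8.2f}, p-value = {f_pval:.4f}")
# Small p-values: reject H0 of homoscedasticity.
```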
Heteroscedasticity: Detection
Formal methods of detecting heteroscedasticity: White’s test
• The $k = 3$ variable linear regression model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$
• The error variance is assumed to be a general function of the $X$'s. Specifically,
$$\sigma_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + \alpha_3 X_{1i}^2 + \alpha_4 X_{2i}^2 + \alpha_5 X_{1i} X_{2i}$$
• Under homoscedasticity: $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_5 = 0 \;\Rightarrow\; \sigma_i^2 = \alpha_0$ (a constant)
• Estimate $\hat{u}_i^2$ from the model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$
• Rather than regressing $\hat{u}_i^2$ on the non-linear functions of the $X$'s, we can equivalently regress $\hat{u}_i^2$ on a
non-linear function of the fitted $\hat{Y}$'s:
$$\hat{u}_i^2 = \delta_0 + \delta_1 \hat{Y}_i + \delta_2 \hat{Y}_i^2 + v_i$$
• Test $H_0: \delta_1 = \delta_2 = 0$ using the $LM$ or $F$ test
• If the computed value of the $LM$ or $F$ statistic exceeds the critical value, we reject $H_0$;
otherwise, we do not reject $H_0$
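A sketch of White's general test using the het_white function from statsmodels on the same kind of simulated two-regressor data; het_white internally builds the squares and cross-products of the regressors.

```python
# Sketch (simulated data): White's general heteroscedasticity test using
# statsmodels.stats.diagnostic.het_white.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(7)
n = 400
x1 = rng.uniform(10, 100, n)
x2 = rng.normal(size=n)
y = 1 + 0.5 * x1 + 2 * x2 + rng.normal(0, 0.2 * x1)   # heteroscedastic errors

exog = sm.add_constant(np.column_stack([x1, x2]))
res = sm.OLS(y, exog).fit()

lm, lm_pval, f_stat, f_pval = het_white(res.resid, exog)
print(f"White LM statistic = {lm:.2f}, p-value = {lm_pval:.4f}")
print(f"White F  statistic = {f_stat:.2f}, p-value = {f_pval:.4f}")
```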
Heteroscedasticity: Remedial measures
White’s heteroscedasticity consistent standard errors
• The $k = 2$ variable linear regression model: $Y_i = \beta_0 + \beta_1 X_i + u_i$
• Under heteroscedasticity:
$$var(u_i \mid X_i) = E(u_i^2 \mid X_i) = \sigma_i^2$$
$$var(\hat{\beta}_1) = \frac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$$
• A valid estimator of $var(\hat{\beta}_1)$ under any form of heteroscedasticity:
$$\widehat{var}(\hat{\beta}_1) = \frac{\sum x_i^2 \hat{u}_i^2}{\left(\sum x_i^2\right)^2}$$
• White’s heteroscedasticity consistent standard errors are valid only for large samples
Heteroscedasticity: Remedial measures
Log transformation
• The $k = 2$ variable linear regression model: $Y_i = \beta_0 + \beta_1 X_i + u_i$
• A very useful remedy for heteroscedasticity is a log transformation of the model
• The log transformation compresses the scale on which the variables are measured
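A sketch, on simulated data with a multiplicative error, of how a log-log specification can remove the heteroscedasticity present in the levels regression; the BPG p-values provide the comparison, and the variable names and parameter values are illustrative.

```python
# Sketch (simulated data): a log-log specification compresses the scale of the variables
# and can remove heteroscedasticity generated by a multiplicative error.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(9)
n = 500
income = rng.uniform(10, 100, n)
consumption = 2.0 * income ** 0.8 * np.exp(rng.normal(0, 0.2, n))   # multiplicative error

X_lev = sm.add_constant(income)
X_log = sm.add_constant(np.log(income))
res_lev = sm.OLS(consumption, X_lev).fit()            # levels: heteroscedastic
res_log = sm.OLS(np.log(consumption), X_log).fit()    # log-log: roughly homoscedastic

for name, res, exog in (("levels ", res_lev, X_lev), ("log-log", res_log, X_log)):
    lm, p_val, _, _ = het_breuschpagan(res.resid, exog)
    print(f"{name}  BPG LM p-value = {p_val:.4f}")
```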
THANK YOU
