What Are The Consequences of Heteroscedasticity and Multicollinearity in Regression? What Are The Possible Remedies?

Heteroscedasticity occurs when the variance of residuals is not constant, meaning the variance increases or decreases with the values of the independent variables. This violates an assumption of linear regression and can result in an inefficient and unstable model. Heteroscedasticity can be detected graphically by plotting residuals against fitted values or using statistical tests like the Breusch-Pagan test. If detected, the model may need to be re-specified by transforming or adding new predictors to correct for heteroscedasticity.

Uploaded by Taranum Shohel

What are the consequences of heteroscedasticity and multicollinearity in regression? What are the possible remedies?
Heteroscedasticity: One of the important assumptions of linear regression is that there should be no heteroscedasticity of residuals. In simpler terms, this means that the variance of the residuals should not increase (or decrease) with the fitted values of the response variable.
Why is it important to check for heteroscedasticity? It is customary to check for heteroscedasticity of the residuals once you build a linear regression model. The reason is that we want to check whether the model is unable to explain some pattern in the response variable (Y) that eventually shows up in the residuals. Under heteroscedasticity the OLS coefficient estimates remain unbiased, but they are no longer efficient, and the usual standard errors and p-values become unreliable. This would result in an inefficient and unstable regression model that could yield bizarre predictions later on.
How to detect heteroscedasticity? I am going to illustrate this with an actual regression model based on the cars dataset, which comes built in with R. Let's first build the model using the lm() function.
summary(cars)

##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00

mydata <- cars
modelcars <- lm(speed ~ ., data = mydata)
summary(modelcars)

##
## Call:
## lm(formula = speed ~ ., data = mydata)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7.5293 -2.1550  0.3615  2.4377  6.4179
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  8.28391    0.87438   9.474 1.44e-12 ***
## dist         0.16557    0.01749   9.464 1.49e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.156 on 48 degrees of freedom
## Multiple R-squared:  0.6511, Adjusted R-squared:  0.6438
## F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12

# Graphical method
par(mfrow = c(2, 2))  # initialise 4 charts in 1 panel
plot(modelcars)

The plots we are interested in are the top-left and bottom-left ones. The top-left is the chart of residuals vs fitted values, while the bottom-left one has standardised residuals on the Y axis. If there is absolutely no heteroscedasticity, you should see a completely random, equal distribution of points throughout the range of the X axis and a flat red line.
But in our case, as you can notice from the top-left plot, the red line is slightly curved and the spread of the residuals seems to increase as the fitted Y values increase. So the inference here is that some heteroscedasticity may be present.
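The residuals-vs-fitted panel can also be drawn directly in base R; this is a sketch equivalent to the first panel produced by plot(modelcars):

```r
# Residuals vs fitted values, the first diagnostic panel, drawn by hand
fit <- lm(speed ~ ., data = cars)
plot(fitted(fit), residuals(fit),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs Fitted")
abline(h = 0, lty = 2)                                   # reference line at zero
lines(lowess(fitted(fit), residuals(fit)), col = "red")  # smoothed trend line
```

A curved or tilted red trend line, or a fan-shaped point cloud, is the visual signature of heteroscedasticity.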
Statistical tests: Sometimes you may want an algorithmic approach to check for heteroscedasticity so that you can quantify its presence automatically and make amends. For this purpose, there are a couple of tests that come in handy to establish the presence or absence of heteroscedasticity: the Breusch-Pagan test and the NCV test.
Breusch-Pagan Test
lmtest::bptest(modelcars)
##
## studentized Breusch-Pagan test
##
## data: modelcars
## BP = 0.71522, df = 1, p-value = 0.3977
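As a sketch of what bptest() computes under the hood: the studentized Breusch-Pagan statistic is n times the R-squared from an auxiliary regression of the squared residuals on the predictors, referred to a chi-squared distribution with degrees of freedom equal to the number of predictors:

```r
# Studentized Breusch-Pagan test computed by hand (base R only)
fit <- lm(speed ~ ., data = cars)
e2  <- residuals(fit)^2                 # squared residuals
aux <- lm(e2 ~ dist, data = cars)       # auxiliary regression on the predictor
bp  <- nrow(cars) * summary(aux)$r.squared
p   <- pchisq(bp, df = 1, lower.tail = FALSE)
round(c(BP = bp, p.value = p), 4)       # should match lmtest::bptest(fit)
```

The auxiliary R-squared measures how much of the residuals' squared magnitude the predictors explain; a large value signals that the error variance depends on them.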

The test has a p-value of 0.3977, which is greater than the significance level of 0.05, so we cannot reject the null hypothesis that the variance of the residuals is constant. In other words, the Breusch-Pagan test does not find statistically significant heteroscedasticity here, even though the residual plot hinted at a mild pattern.
How to rectify? When heteroscedasticity is detected, the usual remedies are to re-build the model with new or transformed predictors, transform the response variable (for example, take its logarithm or apply a Box-Cox transformation), fit the model by weighted least squares, or use heteroscedasticity-robust (sandwich) standard errors for inference.
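A minimal sketch of two such remedies in base R; the 1/fitted^2 weights below are an illustrative choice for weighted least squares, not a universal prescription:

```r
fit <- lm(speed ~ ., data = cars)

# Remedy 1: transform the response to stabilise the variance
fit_log <- lm(log(speed) ~ dist, data = cars)

# Remedy 2: weighted least squares, down-weighting high-variance points
w       <- 1 / fitted(fit)^2
fit_wls <- lm(speed ~ dist, data = cars, weights = w)

summary(fit_wls)$coefficients
```

After refitting, the diagnostic plots and the Breusch-Pagan test should be run again to confirm that the residual variance has stabilised.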
