Multicollinearity, Heteroscedasticity, and Autocorrelation


CHAPTER 10: MULTICOLLINEARITY: WHAT HAPPENS IF THE REGRESSORS ARE CORRELATED?

 The Nature of Multicollinearity
Multicollinearity is the existence of a perfect, or exact, linear relationship among some or all explanatory variables of a regression model. Why does the classical linear regression model assume that there is no multicollinearity among the Xs? If multicollinearity is perfect, the regression coefficients of the X variables are indeterminate and their standard errors are infinite. If multicollinearity is less than perfect, the regression coefficients, although determinate, possess large standard errors, which means the coefficients cannot be estimated with great precision or accuracy.

Sources of Multicollinearity
1. The data collection method employed.
2. Constraints on the model or in the population being sampled.
3. Model specification.
4. An overdetermined model.

 Practical Consequences of Multicollinearity
1. Although BLUE, the OLS estimators have large variances and covariances, making precise estimation difficult.
2. Because of consequence 1, the confidence intervals tend to be much wider, leading to the acceptance of the zero null hypothesis.
3. Also because of consequence 1, the t ratios of one or more coefficients tend to be statistically insignificant.
4. Although the t ratios of one or more coefficients are statistically insignificant, R2, the overall measure of goodness of fit, can be very high.
5. The OLS estimators and their standard errors can be sensitive to small changes in the data.

 Large Variances and Covariances of OLS Estimators
var(\hat{\beta}_2) = \sigma^2 / [\sum x_{2i}^2 (1 - r_{23}^2)]
var(\hat{\beta}_3) = \sigma^2 / [\sum x_{3i}^2 (1 - r_{23}^2)]
cov(\hat{\beta}_2, \hat{\beta}_3) = -r_{23}\,\sigma^2 / [(1 - r_{23}^2) \sqrt{\sum x_{2i}^2} \sqrt{\sum x_{3i}^2}]
where r_{23} is the coefficient of correlation between X_2 and X_3; as r_{23} approaches 1, the variances and the covariance blow up.
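To make the variance-inflation effect concrete, the following is a minimal simulation sketch (not from the original notes; the variable names and the use of numpy/statsmodels are my own assumptions). It fits the same two-regressor model at increasing values of r_{23} and prints the estimated standard error of \hat{\beta}_2, which grows as the regressors become more collinear.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200

for r23 in [0.0, 0.5, 0.9, 0.99]:
    # Draw X2, X3 with target correlation r23
    cov = [[1.0, r23], [r23, 1.0]]
    X2, X3 = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    y = 1.0 + 2.0 * X2 + 3.0 * X3 + rng.normal(scale=1.0, size=n)

    X = sm.add_constant(np.column_stack([X2, X3]))
    res = sm.OLS(y, X).fit()
    # bse[1] is the standard error of the coefficient on X2
    print(f"r23 = {r23:4.2f}   se(beta2_hat) = {res.bse[1]:.4f}")
```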

 Detection of Multicollinearity
1. High R2 but few significant t ratios.
2. High pair-wise correlations among regressors.
3. Examination of partial correlations.
4. Auxiliary regressions.
5. Eigenvalues and condition index.

CI = \sqrt{\text{Maximum eigenvalue} / \text{Minimum eigenvalue}}

6. Tolerance and variance inflation factor (VIF); a computational sketch follows the remedial measures below.

 Remedial Measures
- Do nothing.
- Rule-of-thumb procedures:
  1) A priori information.
  2) Combining cross-sectional and time series data.
  3) Dropping a variable(s) and specification bias.
  4) Transformation of variables.
  5) Additional or new data.
  6) Reducing collinearity in polynomial regressions.
  7) Other methods of remedying multicollinearity.
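As a rough illustration of detection steps 5 and 6, here is a minimal sketch (my own example on simulated data, assuming numpy and statsmodels are available) that computes each regressor's VIF and the condition index from the eigenvalues of the scaled X'X matrix.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
n = 100
X2 = rng.normal(size=n)
X3 = 0.95 * X2 + rng.normal(scale=0.3, size=n)   # deliberately collinear with X2
X = sm.add_constant(np.column_stack([X2, X3]))

# Variance inflation factors (VIF > 10 is a common rule-of-thumb warning sign)
for j, name in enumerate(["const", "X2", "X3"]):
    print(name, variance_inflation_factor(X, j))

# Condition index: sqrt(max eigenvalue / min eigenvalue) of the scaled X'X matrix
Xs = X / np.sqrt((X ** 2).sum(axis=0))           # scale columns to unit length
eigvals = np.linalg.eigvalsh(Xs.T @ Xs)
print("condition index:", np.sqrt(eigvals.max() / eigvals.min()))
```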

CHAPTER 11: HETEROSCEDASTICITY: WHAT HAPPENS IF THE ERROR VARIANCE IS NONCONSTANT?


 The Nature of Heteroscedasticity
There are several reasons why the variances of u_i may be variable, some of which are as follows:
1. Following the error-learning models, as people learn, their errors of behavior become smaller over time.
2. As incomes grow, people have more discretionary income and hence more scope of choice about the disposition of their income.
3. As data collecting techniques improve, \sigma_i^2 is likely to decrease.
4. Heteroscedasticity can also arise as a result of the presence of outliers.
5. Another source of heteroscedasticity arises from violating Assumption 9 of the CLRM, namely, that the regression model is correctly specified.
6. Another source of heteroscedasticity is skewness in the distribution of one or more regressors included in the model.
7. Heteroscedasticity can also arise because of (1) incorrect data transformation and (2) incorrect functional form.

 OLS Estimation in the Presence of Heteroscedasticity
Under heteroscedasticity the variance of the OLS slope estimator becomes
var(\hat{\beta}_2) = \sum x_i^2 \sigma_i^2 / (\sum x_i^2)^2
which reduces to the usual homoscedastic formula var(\hat{\beta}_2) = \sigma^2 / \sum x_i^2 only when \sigma_i^2 = \sigma^2 for all i.
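A small Monte Carlo sketch can make the formula above concrete (my own illustration, not from the notes; it assumes numpy and a simple bivariate model with the error standard deviation proportional to X_i). It compares the empirical spread of \hat{\beta}_2 across replications with the correct heteroscedastic formula and with the usual homoscedastic one.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 5000
X = np.linspace(1, 10, n)
x = X - X.mean()                           # deviations from the mean

beta2_hats = np.empty(reps)
for r in range(reps):
    u = rng.normal(scale=0.5 * X)          # error s.d. grows with X: heteroscedastic
    y = 1.0 + 2.0 * X + u
    beta2_hats[r] = (x * (y - y.mean())).sum() / (x ** 2).sum()

# Correct variance: sum(x_i^2 sigma_i^2) / (sum x_i^2)^2
sigma2_i = (0.5 * X) ** 2
correct_var = (x ** 2 * sigma2_i).sum() / (x ** 2).sum() ** 2
# Usual homoscedastic formula with the average sigma^2 plugged in
naive_var = sigma2_i.mean() / (x ** 2).sum()

print("empirical var  :", beta2_hats.var())
print("correct formula:", correct_var)
print("naive formula  :", naive_var)
```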

 The Method of Generalized Least Squares
GLS takes such information (the unequal error variances) into account explicitly and is therefore capable of producing estimators that are BLUE.

 Difference Between OLS and GLS
OLS minimizes: \sum \hat{u}_i^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2
GLS minimizes: \sum w_i \hat{u}_i^2 = \sum w_i (Y_i - \hat{\beta}_1^* - \hat{\beta}_2^* X_i)^2, where w_i = 1/\sigma_i^2
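As a concrete illustration of weighting by 1/\sigma_i^2, here is a minimal weighted-least-squares sketch using statsmodels (my own example on simulated data; in practice \sigma_i^2 is unknown and the weights come from one of the assumptions about the heteroscedasticity pattern discussed later in this chapter).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
X = np.linspace(1, 10, n)
y = 1.0 + 2.0 * X + rng.normal(scale=0.5 * X)   # error s.d. proportional to X

exog = sm.add_constant(X)
ols = sm.OLS(y, exog).fit()
# WLS with weights w_i = 1 / sigma_i^2; here sigma_i is assumed proportional to X_i
wls = sm.WLS(y, exog, weights=1.0 / X ** 2).fit()

print("OLS se(beta2):", ols.bse[1])
print("WLS se(beta2):", wls.bse[1])
```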

 Consequences of Using OLS in the Presence of Heteroscedasticity
 OLS Estimation Disregarding Heteroscedasticity
In short, if we persist in using the usual testing procedures despite heteroscedasticity, whatever conclusions we draw or inferences we make may be very misleading.

 Detection of Heteroscedasticity
1. Informal Methods
   Nature of the Problem. Very often the nature of the problem under consideration suggests whether heteroscedasticity is likely to be encountered.
   Graphical Method. If there is no a priori or empirical information about the nature of heteroscedasticity, in practice one can do the regression analysis on the assumption that there is no heteroscedasticity and then do a postmortem examination of the squared residuals \hat{u}_i^2 to see if they exhibit any systematic pattern.
2. Formal Methods
   Park Test. Park formalizes the graphical method by suggesting that \sigma_i^2 is some function of the explanatory variable X_i.
   Glejser Test. After obtaining the residuals \hat{u}_i from the OLS regression, Glejser suggests regressing the absolute values of \hat{u}_i on the X variable that is thought to be closely associated with \sigma_i^2.
   Spearman's Rank Correlation Test. The Spearman rank correlation coefficient is

r_s = 1 - 6 [\sum d_i^2 / (n(n^2 - 1))]
Step 1: Fit the regression to the data on Y and X and obtain the residuals \hat{u}_i.
Step 2: Ignoring the sign of \hat{u}_i, that is, taking their absolute values |\hat{u}_i|, rank both |\hat{u}_i| and X_i (or \hat{Y}_i) in ascending or descending order and compute the Spearman rank correlation coefficient given above.
Step 3: Assuming that the population rank correlation coefficient \rho_s is zero and n > 8, the significance of the sample r_s can be tested by the t test:
t = r_s \sqrt{n - 2} / \sqrt{1 - r_s^2}
Goldfeld-Quandt Test. This popular method is applicable if one assumes that the heteroscedastic variance \sigma_i^2 is positively related to one of the explanatory variables in the regression model.
Step 1: Order or rank the observations according to the values of X_i, beginning with the lowest X value.
Step 2: Omit c central observations, where c is specified a priori, and divide the remaining (n - c) observations into two groups, each of (n - c)/2 observations.
Step 3: Fit separate OLS regressions to the first (n - c)/2 observations and the last (n - c)/2 observations and obtain the respective residual sums of squares RSS1 and RSS2, RSS1 representing the RSS from the regression corresponding to the smaller X_i values and RSS2 that from the larger X_i values.
Step 4: Compute the ratio \lambda = (RSS2/df) / (RSS1/df). If the u_i are assumed to be normally distributed and if the assumption of homoscedasticity is valid, then \lambda follows the F distribution with numerator and denominator df each equal to (n - c - 2k)/2, where k is the number of parameters to be estimated.
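A minimal sketch of both tests (my own example on simulated data; it assumes scipy and statsmodels, and uses statsmodels' built-in Goldfeld-Quandt helper rather than the manual split described above):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.diagnostic import het_goldfeldquandt

rng = np.random.default_rng(4)
n = 60
X = np.sort(rng.uniform(1, 10, n))
y = 1.0 + 2.0 * X + rng.normal(scale=0.5 * X)   # error variance rises with X
exog = sm.add_constant(X)
resid = sm.OLS(y, exog).fit().resid

# Spearman rank correlation between |residuals| and X, with the t test from the notes
r_s, _ = stats.spearmanr(np.abs(resid), X)
t_stat = r_s * np.sqrt(n - 2) / np.sqrt(1 - r_s ** 2)
print("Spearman r_s:", r_s, " t:", t_stat)

# Goldfeld-Quandt: drop the middle ~20% of observations (data already ordered by X)
f_stat, p_value, _ = het_goldfeldquandt(y, exog, drop=0.2)
print("Goldfeld-Quandt F:", f_stat, " p-value:", p_value)
```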

Breusch-Pagan-Godfrey (BPG) Test
Step 1: Estimate Y_i = \beta_1 + \beta_2 X_{2i} + . . . + \beta_k X_{ki} + u_i by OLS and obtain the residuals \hat{u}_1, \hat{u}_2, . . . , \hat{u}_n.
Step 2: Obtain \tilde{\sigma}^2 = \sum \hat{u}_i^2 / n.
Step 3: Construct variables p_i defined as p_i = \hat{u}_i^2 / \tilde{\sigma}^2, which is simply each squared residual divided by \tilde{\sigma}^2.
Step 4: Regress the p_i thus constructed on the Z's as
p_i = \alpha_1 + \alpha_2 Z_{2i} + . . . + \alpha_m Z_{mi} + v_i
where v_i is the residual term of this regression.
Step 5: Obtain the ESS (explained sum of squares) and define \Theta = (1/2)\,ESS. Under the null hypothesis of homoscedasticity, \Theta asymptotically follows the chi-square distribution with (m - 1) df.

White's General Heteroscedasticity Test
Step 1: Given the data, we estimate Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i and obtain the residuals \hat{u}_i.
Step 2: We then run the following auxiliary regression:
\hat{u}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i
Step 3: Under the null hypothesis that there is no heteroscedasticity, it can be shown that the sample size (n) times the R^2 obtained from the auxiliary regression asymptotically follows the chi-square distribution with df equal to the number of regressors (excluding the constant term) in the auxiliary regression. That is, n \cdot R^2 \sim_{asy} \chi^2_{df}.
Step 4: If the chi-square value obtained exceeds the critical chi-square value at the chosen level of significance, the conclusion is that there is heteroscedasticity.

 Other Tests of Heteroscedasticity
- Koenker-Bassett (KB) Test

 Remedial Measures
- When \sigma_i^2 is known: the method of weighted least squares.
- When \sigma_i^2 is not known: plausible assumptions about the heteroscedasticity pattern.
  Assumption 1: The error variance is proportional to X_i^2: E(u_i^2) = \sigma^2 X_i^2.
  Assumption 2: The error variance is proportional to X_i (the square root transformation): E(u_i^2) = \sigma^2 X_i.
  Assumption 3: The error variance is proportional to the square of the mean value of Y: E(u_i^2) = \sigma^2 [E(Y_i)]^2.
  Assumption 4: A log transformation such as
  ln Y_i = \beta_1 + \beta_2 ln X_i + u_i
  very often reduces heteroscedasticity when compared with the regression Y_i = \beta_1 + \beta_2 X_i + u_i.
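Both tests are available in statsmodels; the following is a minimal sketch on simulated data (my own example; the variable names and the simulated heteroscedasticity pattern are assumptions, and the library's LM statistics may differ in small details, such as studentization, from the textbook \Theta = (1/2) ESS version).

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan, het_white

rng = np.random.default_rng(5)
n = 200
X2 = rng.uniform(1, 10, n)
X3 = rng.uniform(1, 10, n)
y = 1.0 + 2.0 * X2 + 3.0 * X3 + rng.normal(scale=0.5 * X2)   # variance tied to X2

exog = sm.add_constant(np.column_stack([X2, X3]))
resid = sm.OLS(y, exog).fit().resid

# Breusch-Pagan-Godfrey: squared residuals regressed on the Z's (here the X's themselves)
lm, lm_pval, f_stat, f_pval = het_breuschpagan(resid, exog)
print("BPG LM:", lm, " p-value:", lm_pval)

# White: auxiliary regression on X's, their squares, and cross products (handled internally)
lm_w, lm_w_pval, f_w, f_w_pval = het_white(resid, exog)
print("White LM (n*R^2):", lm_w, " p-value:", lm_w_pval)
```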

CHAPTER 12: AUTOCORRELATION: WHAT HAPPENS IF THE ERROR TERMS ARE CORRELATED?
 The Nature of the Problem
1. Autocorrelation: correlation between members of a series of observations ordered in time.
2. Specification Bias: Excluded Variables Case. In empirical analysis, the researcher often starts with a plausible regression model that may not be the most perfect one. After the regression analysis, the researcher does the postmortem to find out whether the results accord with a priori expectations.
3. Cobweb Phenomenon. The supply of many agricultural commodities reflects the so-called cobweb phenomenon, where supply reacts to price with a lag of one time period because supply decisions take time to implement.
4. Lags. A regression such as Consumption_t = \beta_1 + \beta_2 Income_t + \beta_3 Consumption_{t-1} + u_t is known as an autoregression because one of the explanatory variables is the lagged value of the dependent variable.
5. Manipulation of Data. Another source of autocorrelation is the manipulation of data, such as interpolation or extrapolation.

 OLS Estimation in the Presence of Autocorrelation
\rho (rho) is known as the coefficient of autocovariance. The scheme
u_t = \rho u_{t-1} + \varepsilon_t,   -1 < \rho < 1
is known as the Markov first-order autoregressive scheme, or simply a first-order autoregressive scheme, usually denoted AR(1). The name autoregressive is appropriate because it can be interpreted as the regression of u_t on itself lagged one period.
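To illustrate the AR(1) scheme, here is a minimal simulation sketch (my own illustration, assuming numpy; it generates an AR(1) error series and prints its first-order sample autocorrelation, which should be close to the chosen \rho).

```python
import numpy as np

rng = np.random.default_rng(6)
rho, n = 0.7, 1000

# Generate u_t = rho * u_{t-1} + eps_t  (Markov first-order autoregressive scheme)
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]

# First-order sample autocorrelation of the series
r1 = np.corrcoef(u[1:], u[:-1])[0, 1]
print("rho =", rho, " sample first-order autocorrelation =", round(r1, 3))
```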
 The BLUE Estimator in the Presence of Autocorrelation
Under the AR(1) scheme, the BLUE estimator of \beta_2 is given by the GLS (quasi-differencing) formulas:
\hat{\beta}_2^{GLS} = [\sum_{t=2}^{n} (X_t - \rho X_{t-1})(Y_t - \rho Y_{t-1})] / [\sum_{t=2}^{n} (X_t - \rho X_{t-1})^2] + C
var(\hat{\beta}_2^{GLS}) = \sigma^2 / [\sum_{t=2}^{n} (X_t - \rho X_{t-1})^2] + D
where C and D are correction factors that may be disregarded in practice.

 Consequences of Using OLS in the Presence of Autocorrelation
 OLS Estimation Allowing for Autocorrelation
To establish confidence intervals and to test hypotheses, one should use GLS and not OLS, even though the estimators derived from the latter are unbiased and consistent.
 OLS Estimation Disregarding Autocorrelation
1. The residual variance \hat{\sigma}^2 = \sum \hat{u}_i^2 / (n - 2) is likely to underestimate the true \sigma^2.
2. As a result, we are likely to overestimate R^2.
3. Even if \sigma^2 is not underestimated, var(\hat{\beta}_2) may underestimate var(\hat{\beta}_2)_{AR1}, its variance under (first-order) autocorrelation, even though the latter is inefficient compared to var(\hat{\beta}_2)_{GLS}.
4. Therefore, the usual t and F tests of significance are no longer valid, and if applied, are likely to give seriously misleading conclusions about the statistical significance of the estimated regression coefficients.
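A minimal sketch of the quasi-differencing idea behind the GLS formula above (my own illustration, assuming numpy and statsmodels; \rho is estimated crudely from the first-order autocorrelation of the OLS residuals, in the spirit of a one-step Cochrane-Orcutt procedure, and the small correction factors C and D are ignored).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n, rho_true = 200, 0.8
X = np.linspace(1, 20, n)
u = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    u[t] = rho_true * u[t - 1] + eps[t]
y = 1.0 + 2.0 * X + u

# Step 1: OLS, then estimate rho from the residuals
ols = sm.OLS(y, sm.add_constant(X)).fit()
e = ols.resid
rho_hat = np.corrcoef(e[1:], e[:-1])[0, 1]

# Step 2: quasi-difference the data and re-run OLS on the transformed model
y_star = y[1:] - rho_hat * y[:-1]
X_star = X[1:] - rho_hat * X[:-1]
gls = sm.OLS(y_star, sm.add_constant(X_star)).fit()

print("rho_hat:", round(rho_hat, 3))
print("OLS se(beta2):", ols.bse[1], "  quasi-differenced se(beta2):", gls.bse[1])
```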

 Detecting Autocorrelation
I. Graphical Method
II. The Runs Test
Mean: E(R) = 2N_1 N_2 / N + 1
Variance: \sigma_R^2 = 2N_1 N_2 (2N_1 N_2 - N) / [N^2 (N - 1)]
Decision Rule: Do not reject the null hypothesis of randomness with 95% confidence if R, the number of runs, lies in the preceding confidence interval; reject the null hypothesis if the estimated R lies outside these limits.
III. Durbin-Watson d Test
The Durbin-Watson d statistic:
d = \sum_{t=2}^{n} (\hat{u}_t - \hat{u}_{t-1})^2 / \sum_{t=1}^{n} \hat{u}_t^2
Assumptions underlying the d statistic:
1. The regression model includes the intercept term.
2. The explanatory variables, the Xs, are nonstochastic, or fixed in repeated sampling.
3. The disturbances u_t are generated by the first-order autoregressive scheme: u_t = \rho u_{t-1} + \varepsilon_t.
4. The error term u_t is assumed to be normally distributed.
5. The regression model does not include the lagged value(s) of the dependent variable as one of the explanatory variables.
6. There are no missing observations in the data.
As a rule of thumb, if d is found to be about 2 in an application, one may assume that there is no first-order autocorrelation, either positive or negative.
Mechanics of the Durbin-Watson test:
1) Run the OLS regression and obtain the residuals.
2) Compute d.
3) For the given sample size and given number of explanatory variables, find the critical d_L and d_U values.
4) Now follow the decision rules of the Durbin-Watson d test.
IV. A General Test of Autocorrelation: The Breusch-Godfrey (BG) Test
Steps:
1. Estimate Y_t = \beta_1 + \beta_2 X_t + u_t by OLS and obtain the residuals \hat{u}_t.
2. Regress \hat{u}_t on the original X_t and \hat{u}_{t-1}, \hat{u}_{t-2}, . . . , \hat{u}_{t-p}, where the latter are the lagged values of the estimated residuals in step 1.
3. If the sample size is large, Breusch and Godfrey have shown that (n - p) R^2 \sim \chi^2_p.
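Both statistics are available in statsmodels; this is a minimal sketch on a simulated AR(1)-error series (my own example; acorr_breusch_godfrey works from a fitted OLS results object and reports the LM version of the BG statistic).

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(8)
n, rho = 150, 0.6
X = np.linspace(1, 15, n)
u = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]
y = 1.0 + 2.0 * X + u

res = sm.OLS(y, sm.add_constant(X)).fit()

# Durbin-Watson d: values near 2 suggest no first-order autocorrelation
print("Durbin-Watson d:", durbin_watson(res.resid))

# Breusch-Godfrey LM test with p = 2 lagged residuals
lm, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=2)
print("BG LM:", lm, " p-value:", lm_pval)
```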

 What To Do When You Find Autocorrelation: Remedial Measures
1. Try to find out if the autocorrelation is pure autocorrelation and not the result of misspecification of the model.
2. If it is pure autocorrelation, one can use an appropriate transformation of the original model so that in the transformed model we do not have the problem of autocorrelation.
3. In large samples, we can use the Newey-West method to obtain standard errors of OLS estimators that are corrected for autocorrelation.
4. In some situations we can continue to use the OLS method.
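As a sketch of remedy 3, here is a minimal Newey-West (HAC) example using statsmodels (my own illustration on simulated data; the lag truncation maxlags=4 is an arbitrary choice for the example).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n, rho = 200, 0.7
X = np.linspace(1, 20, n)
u = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]
y = 1.0 + 2.0 * X + u

exog = sm.add_constant(X)
ols = sm.OLS(y, exog).fit()                                          # usual OLS standard errors
hac = sm.OLS(y, exog).fit(cov_type="HAC", cov_kwds={"maxlags": 4})   # Newey-West standard errors

print("OLS        se(beta2):", ols.bse[1])
print("Newey-West se(beta2):", hac.bse[1])
```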
