Week 9 - Random Effects Model
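The random effects model starts from the same unobserved effects model used for fixed effects,

$$y_{it} = \beta_0 + \beta_1 x_{it1} + \dots + \beta_k x_{itk} + a_i + u_{it},$$

where $a_i$ represented unobserved heterogeneity that was stable (or fixed) over time.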
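In the random effects model, we now additionally assume that the unobserved effect $a_i$ is
uncorrelated with each explanatory variable. Wooldridge writes this assumption as:

$$\mathrm{Cov}(x_{itj}, a_i) = 0, \quad t = 1, 2, \dots, T; \; j = 1, 2, \dots, k.$$

Under this assumption, $a_i$ can be folded into a composite error term, $v_{it} = a_i + u_{it}$,
so that the model becomes

$$y_{it} = \beta_0 + \beta_1 x_{it1} + \dots + \beta_k x_{itk} + v_{it}.$$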
• Note that the composite error term $v_{it}$ is subscripted with both i and t.
• Because $a_i$ is in the composite error in every time period t, the error terms $v_{it}$ are
serially correlated (i.e., the errors for the same unit are correlated with each other
over time).
• Fortunately for us, the random effects estimators used in statistical packages today
will estimate the degree of serial correlation (or its importance in the model)
and compute estimates accordingly. When the variance of the unobserved effects ($a_i$)
is large relative to the variance of $u_{it}$, the random effects estimates will be similar
to the fixed effects estimates. When the unobserved effects are unimportant (relative to
the variance of $u_{it}$), the random effects estimates will be closer to pooled OLS estimates.
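The behaviour described in the last bullet can be made concrete with the quasi-demeaning weight that the random effects estimator applies, $\theta = 1 - [\sigma_u^2 / (\sigma_u^2 + T\sigma_a^2)]^{1/2}$: each variable has $\theta$ times its unit-specific time average subtracted before running OLS. The sketch below is only an illustration; the function names are made up for this note, and the variance components are assumed to be known rather than estimated as a real package would do.

```python
import math


def re_theta(sigma_a2: float, sigma_u2: float, T: int) -> float:
    """Quasi-demeaning weight: theta = 1 - sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_a^2))."""
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))


def quasi_demean(values, theta):
    """Subtract theta times the unit's time average from each of its observations."""
    mean = sum(values) / len(values)
    return [v - theta * mean for v in values]


def composite_error_corr(sigma_a2: float, sigma_u2: float) -> float:
    """Correlation between v_it and v_is (t != s) implied by v_it = a_i + u_it."""
    return sigma_a2 / (sigma_a2 + sigma_u2)
```

With $\theta = 0$ (no variance in $a_i$) the data are left untouched, which is pooled OLS; as $\theta$ approaches 1 the transformation approaches full demeaning, which is the fixed effects (within) transformation.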
In deciding whether to use a fixed effects or random effects specification, we should consider
whether we want to treat $a_i$ as a parameter to be estimated (as in the fixed effects model) or
as the outcome of a random variable (absorbed into the composite error term of the random
effects model). SAS and Stata will even compute a test (e.g., the Hausman specification test)
for whether there is correlation between $a_i$ and the $x_{it}$, assuming other model assumptions
are met. Fixed effects estimates remain consistent whether or not $a_i$ is correlated with the
explanatory variables, but they may not be the most efficient. When $a_i$ is in fact
uncorrelated with the explanatory variables, random effects is the more efficient estimator
and gives smaller standard errors, so you should run random effects if it is statistically
justifiable to do so.
Dynamic Panel Models
If there is serial correlation in the errors of a model, it is necessary to deal with it,
because the usual standard errors are then no longer valid. One can apply one
or more of the several tests for residual autocorrelation (e.g., the Durbin-Watson test for
first-order autocorrelation in the residuals). The autocorrelation may be panel-specific
or common to all panels, and estimation routines let you specify which type
to assume.
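As a rough illustration of the Durbin-Watson idea, the statistic is the sum of squared first differences of the residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The sketch below assumes the residuals are already available and ordered in time; the function names and the dictionary layout are choices made for this example, and the pooled panel variant is one common way to adapt the statistic, not the output of any particular package.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a single residual series ordered in time."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def panel_durbin_watson(residuals_by_unit):
    """Pooled variant: first differences are taken within each unit only,
    so the last period of one unit is never differenced against another unit."""
    num = 0.0
    den = 0.0
    for series in residuals_by_unit.values():
        num += sum((series[t] - series[t - 1]) ** 2 for t in range(1, len(series)))
        den += sum(e ** 2 for e in series)
    return num / den
```

Critical values for a formal test depend on the sample size and number of regressors, so the statistic alone is best read as a diagnostic.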
There are a number of other problems that plague panel data models. Outliers can bias
regression slopes, particularly if they have high leverage. These outliers can be
downweighted with robust estimation techniques. Heteroskedasticity problems often arise from
cross-sectional differences, although in some cases simply taking group means can remove
this heteroskedasticity.
Heteroskedastic models are usually fitted with estimated or feasible generalized least
squares (EGLS or FGLS). Heteroskedasticity can be assessed with a Breusch-Pagan test.
For the most part, fixed effects models with group-wise heteroskedasticity cannot be
efficiently estimated with OLS. If the sample size is large enough and autocorrelation
plagues the errors, FGLS can be used.
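To show what a Breusch-Pagan style check involves, the sketch below uses the studentized (Koenker) form of the statistic for a single regressor: fit OLS, regress the squared residuals on the regressor, and compute LM = n times the R-squared of that auxiliary regression. With one regressor the statistic is compared with a chi-square distribution with 1 degree of freedom, whose upper tail can be written with the complementary error function. The function names are invented for this example, and a real analysis would use a package routine that handles several regressors.

```python
import math


def ols_line(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def breusch_pagan_lm(x, y):
    """Return (LM, p-value), where LM = n * R^2 from regressing squared residuals on x."""
    n = len(x)
    b0, b1 = ols_line(x, y)
    e2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    g0, g1 = ols_line(x, e2)
    mean_e2 = sum(e2) / n
    sst = sum((v - mean_e2) ** 2 for v in e2)
    if sst == 0.0:
        # Squared residuals do not vary at all: no evidence of heteroskedasticity.
        return 0.0, 1.0
    ssr = sum((v - g0 - g1 * xi) ** 2 for xi, v in zip(x, e2))
    lm = max(n * (1.0 - ssr / sst), 0.0)
    # Upper tail of a chi-square with 1 degree of freedom.
    return lm, math.erfc(math.sqrt(lm / 2.0))
```

A small p-value indicates that the residual variance moves with the regressor, which is the situation in which FGLS or heteroskedasticity-robust standard errors are worth considering.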
Stata and SAS are among the statistical packages that excel in panel data analysis.
Both packages offer fixed and random effects models, the Hausman specification test,
and procedures that can correct for autocorrelation in the models.
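The command for a linear regression on panel data with fixed effects in Stata is xtreg
with the fe option, run after declaring the panel structure with xtset. With placeholder
variable names (id, year, y, x1, x2), it is used like this:

    xtset id year
    xtreg y x1 x2, fe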
The command for a linear regression on panel data with random effects in Stata is xtreg
with the re option, for example (with placeholder variable names y, x1, x2):

    xtreg y x1 x2, re
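To see what the fixed effects option is doing conceptually, the sketch below computes the within estimator by hand for one regressor: each variable is demeaned by unit, and OLS is run on the demeaned data. The data are made up for illustration (true slope 2, with a unit effect of +10 for unit A and -5 for unit B, chosen so that the effect is correlated with x), and the function names are not part of any package.

```python
def slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def demean_by_unit(ids, values):
    """Subtract each unit's own time average from its observations."""
    totals, counts = {}, {}
    for i, v in zip(ids, values):
        totals[i] = totals.get(i, 0.0) + v
        counts[i] = counts.get(i, 0) + 1
    return [v - totals[i] / counts[i] for i, v in zip(ids, values)]


def within_slope(ids, x, y):
    """Fixed effects (within) estimate of the slope for a single regressor."""
    return slope(demean_by_unit(ids, x), demean_by_unit(ids, y))


# Hypothetical toy panel: two units observed for three periods each.
ids = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [12.0, 14.0, 16.0, 3.0, 5.0, 7.0]

print("pooled OLS slope:", slope(x, y))
print("within (FE) slope:", within_slope(ids, x, y))
```

In this toy panel the pooled slope comes out negative because the unit effects are correlated with x, while the within estimate recovers the slope of 2 used to build the data.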
The Hausman test checks a more efficient model against a less efficient but consistent
model to make sure that the more efficient model also gives consistent results.
To run a Hausman test comparing fixed with random effects in Stata, you first
estimate the fixed effects model, save the results with estimates store so that you can
compare them with those of the next model, estimate the random effects model, and then do
the comparison with the hausman command.
The Hausman test tests the null hypothesis that the coefficients estimated by the efficient
random effects estimator are the same as the ones estimated by the consistent fixed
effects estimator. If the test statistic is insignificant (a large p-value), you fail to
reject the null and it is safe to use random effects. If you get a significant p-value,
however, you should use fixed effects.
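The logic of the test can be sketched for the simplest case of a single coefficient. In general the statistic is $H = (b_{FE} - b_{RE})' [V_{FE} - V_{RE}]^{-1} (b_{FE} - b_{RE})$, compared with a chi-square distribution whose degrees of freedom equal the number of coefficients compared. With one coefficient this reduces to a squared difference over a difference in variances, with 1 degree of freedom. The function name and the input numbers in the usage note are hypothetical.

```python
import math


def hausman_one_coef(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value for a single coefficient.

    b_fe, b_re: fixed effects and random effects estimates of the same coefficient.
    var_fe, var_re: their estimated variances (squared standard errors).
    """
    var_diff = var_fe - var_re
    if var_diff <= 0:
        # Under the null, random effects is efficient, so var_fe should exceed var_re.
        raise ValueError("variance difference is not positive; statistic is undefined")
    h = (b_fe - b_re) ** 2 / var_diff
    # Upper tail of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value
```

For example, with made-up inputs `hausman_one_coef(2.0, 1.5, 0.09, 0.05)` the difference between the estimates is large relative to the variance difference, the p-value is small, and fixed effects would be preferred.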
SAS software contains two general-purpose procedures for fitting linear models to panel
data. The GLM procedure fits general linear models involving fixed effects. The more general
MIXED procedure fits mixed linear models containing both fixed and random effects.
The MIXED procedure gives easy access to a variety of mixed models useful in
many common statistical analyses.