Topic 6 - FE, RE and Tests
(a) The Least Squares Dummy Variable
(LSDV) estimator
• We estimate the model in levels (not differences)
$y_{it} = X_{it}\beta + \lambda_t + \alpha_i + \varepsilon_{it}$
• Now we include a dummy variable for each observational
unit, to estimate $\alpha_i$:
– Previously, we included a dummy variable for each time
period, to estimate $\lambda_t$.
– Can also do both simultaneously.
• Now, the model becomes:
$y_{it} = \alpha_2 d2_{it} + \alpha_3 d3_{it} + \dots + \alpha_N dN_{it} + X_{it}\beta + \lambda_t + \varepsilon_{it}$
• with cross-sectional unit 1 as the reference.
• But, may not be numerically feasible to estimate, especially
in short and wide panels (N is large):
– Addition of large number of dummies
• Our Stata example:
– Earnings data for 545 men for 8 years
– We need to add 544 dummy variables to this model!
• This is not practical
• (Household surveys may have more than 30 000
cross-sectional units)
• In addition:
– We are typically interested in β rather than $\alpha_i$
• In practice:
– LSDV method is seldom used, except sometimes in
long and narrow panels (small N)
– e.g. a macro panel model may include a dummy for
each country
• Preferred method: Fixed Effects by time demeaning
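The equivalence between LSDV and time demeaning can be checked on a small simulated panel. The sketch below is only illustrative: the panel size, the true slope of 2.0 and all variable names are assumptions, and it uses one dummy per unit with no intercept, which spans the same columns as an intercept plus N − 1 dummies with unit 1 as the reference.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 6, 5                                   # hypothetical small balanced panel
unit = np.repeat(np.arange(N), T)             # unit index for every row
alpha = rng.normal(size=N)                    # unobserved individual effects
x = rng.normal(size=N * T) + alpha[unit]      # regressor correlated with alpha_i
y = 2.0 * x + alpha[unit] + 0.1 * rng.normal(size=N * T)

# LSDV: regress y on x plus one dummy per unit (no separate intercept)
D = (unit[:, None] == np.arange(N)[None, :]).astype(float)
beta_lsdv = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)[0][0]

# Time demeaning: subtract each unit's own time mean, then OLS without intercept
def demean(v):
    means = np.bincount(unit, weights=v) / T
    return v - means[unit]

xd, yd = demean(x), demean(y)
beta_within = (xd @ yd) / (xd @ xd)

print(beta_lsdv, beta_within)                 # the two slopes coincide
```

The demeaned regression needs no dummy columns at all, which is why it scales to panels where N is in the thousands.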
METHOD B2:
FIXED EFFECTS
CONTINUED
4. Since β is the coefficient on $\ddot{X}_{it} = X_{it} - \bar{X}_i$, there must be
variation in $\ddot{X}_{it}$ in order to identify β:
– We cannot estimate the effect on y of anything that:
• Remains constant over time (e.g. race)
• Changes at a constant rate over time (e.g. age): its demeaned
values are the same for every individual, so it is collinear
with the time dummies
– We can include interactions between such variables and
the time dummies:
• Assess whether returns to such characteristics
change over time.
– FE estimator is robust to the omission of any time-
invariant regressors.
– Being able to estimate a coefficient precisely depends on
the amount of variation in $\ddot{X}_{it}$
• Less variation (i.e. $X_{it}$ seldom deviates from its mean $\bar{X}_i$)
→ wide confidence intervals
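A minimal sketch of why time-invariant and constant-rate variables drop out. The three-person, four-year panel and the helper name `demean_within` are made up for illustration.

```python
def demean_within(values_by_unit):
    """Subtract each unit's own time mean from its observations."""
    out = []
    for series in values_by_unit:
        mean = sum(series) / len(series)
        out.append([v - mean for v in series])
    return out

# Hypothetical panel: one inner list per person, one entry per year
race = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]              # constant over time
age = [[25, 26, 27, 28], [40, 41, 42, 43], [31, 32, 33, 34]]   # +1 every year

race_dd = demean_within(race)
age_dd = demean_within(age)

print(race_dd)  # every entry is 0.0: no variation left to identify a coefficient
print(age_dd)   # identical for every person: indistinguishable from a time effect
```

After demeaning, race carries no information, and age is the same sequence for everyone, so its effect cannot be separated from the time dummies.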
Variation in $\ddot{X}_{it}$?
If Xj is a continuous variable:
• Lack of variation not usually a problem
If Xj is a categorical variable:
• Be careful about potential lack of variation
• Call those who change their Xj value ‘switchers’
– If dataset has no switchers: cannot determine effect of Xj
on y.
– If dataset has few switchers: effect of Xj on y is
estimated based only on those who switch.
• May not be representative of all individuals with
characteristic Xj.
– E.g. are people who change their marital status
representative of all married people?
• Also likely that the effect will not be estimated very
precisely.
5.3 Properties of FE
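• Like FD, the strict exogeneity assumption applies to $\varepsilon_{it}$:
– $E(\varepsilon_{it} \mid X_{i1}, \dots, X_{iT}, \alpha_i) = 0$, t = 1, ..., T
– i.e. the error is unrelated to the regressors in every period,
not only in period t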
Stata example
• Earnings data for 545 men in the US from 1980 – 1987
– (see previous slides for details)
• We estimate a fixed effects model for the log of wages.
• Recall: we cannot include race, education or experience.
• We include: experience squared and dummies for union,
married and the years 1981-1987 (with 1980 as base year).
• Do we observe much switching of union status?
– Use transition matrix in Stata:
. xttrans union
[xttrans output: transition matrix for union; rows = union status in the current wave (0, 1, Total), columns = union status in the next wave (0, 1, Total)]
. xtreg lwage expersq married union d81 d82 d83 d84 d85 d86 d87, fe
F(10,3805) = 83.85
corr(u_i, Xb) = -0.1222 Prob > F = 0.0000
sigma_u .39176195
sigma_e .35099001
rho .55472817 (fraction of variance due to u_i)
F test that all u_i=0: F(544, 3805) = 9.16 Prob > F = 0.0000
Output:
• What Stata calls u_i is the individual-specific fixed effect $\alpha_i$
– technically, it’s all time-invariant variables and $\alpha_i$
• F test for existence of fixed effects:
– H0: $\alpha_1 = \alpha_2 = \dots = \alpha_N = 0$
– Compares FE and OLS to see how much FE can
improve the goodness-of-fit
– Reject H0 (p=0.000), so there is a significant fixed effect
– Therefore, FE is ‘better’ than POLS.
Interpretation:
• Interpret the coefficients as you would in an OLS model:
– On average, belonging to a union significantly increases
wages by $100(e^{0.080} - 1) = 8.33\%$ compared to non-union
members, c.p., after removing individual heterogeneity.
Variable    Pooled OLS   FE
educ         0.091***
black       -0.139***
hisp         0.016
exper        0.067***
expersq     -0.002**    -0.005***
married      0.108***    0.047*
union        0.182***    0.080***
B. FE-type methods:
• FD (Slides pt 4)
• LSDV
• FE time demeaning (Slides pt 5)
How should we estimate β?
If we believe cov($\alpha_i$, $X_{it}$) = 0:
• We don’t need to control for $\alpha_i$ when estimating β
– We can estimate β consistently using a single cross-
section (i.e. no need for panel data).
– But this disregards useful information from other time
periods if we actually have a panel.
Theta and model type:
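$\theta = 1 - \sqrt{\dfrac{\sigma_e^2}{\sigma_e^2 + T\sigma_\alpha^2}}, \qquad 0 \le \theta \le 1 \qquad (6.2)$
where $\sigma_e^2$ and $\sigma_\alpha^2$ are the variances of $e_{it}$ and $\alpha_i$ respectively.
• If $\theta$ = 1, the model is FE
– Complete time-demeaning
• If $\theta$ = 0, the model is pooled OLS
– Occurs if all individual effects are equal ($\sigma_\alpha^2$ = 0).
• If 0 < $\theta$ < 1, the model is RE
– $\theta$ approaches 1 as $\sigma_\alpha^2$ (the variability in $\alpha_i$) grows,
relative to $\sigma_e^2$ (the variability in the time-varying error).
– As T increases, $\theta$ → 1.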
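Equation (6.2) can be evaluated directly. The sketch below plugs in the sigma_u and sigma_e values reported in the Stata RE output for the earnings panel (T = 8); the function name `re_theta` is just an illustrative choice.

```python
import math

def re_theta(sigma_e, sigma_alpha, T):
    """Quasi-demeaning weight from equation (6.2)."""
    return 1 - math.sqrt(sigma_e**2 / (sigma_e**2 + T * sigma_alpha**2))

# sigma_e and sigma_u as reported by xtreg, re for the earnings data
theta = re_theta(sigma_e=0.35099001, sigma_alpha=0.32460315, T=8)
print(round(theta, 3))                         # 0.643

# Limiting cases described in the bullets above
print(re_theta(0.35099001, 0.0, 8))            # 0.0: no individual variation, pooled OLS
print(re_theta(0.35099001, 0.32460315, 10**6)) # close to 1 as T grows, approaching FE
```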
6.4 Properties of RE
• Unbiased, provided $X_{it}$ is independent of all $\alpha_i$ and $e_{it}$
• More efficient than FE under the RE assumptions
• Consistent for T → ∞ (or N → ∞ with T fixed)
. xtreg lwage educ black hisp exper expersq married union d81 d82 d83 d84 d85
> d86 d87, re theta
sigma_u .32460315
sigma_e .35099001
rho .46100216 (fraction of variance due to u_i)
• Output:
– Note that corr(u_i, X) = 0 (assumed)
In the FE model, Stata estimated this correlation.
– The key estimates are: $\theta$ (theta) = 0.643
$\sigma_\alpha$ (sigma_u) = 0.325
$\sigma_e$ (sigma_e) = 0.351
• Interpretation:
Interpret the coefficients as you would in an OLS model:
– Return to an additional year of education is 9.2%, c.p.
– On average, those who are married earn $100(e^{0.064} - 1) = 6.61\%$
more than those who are not married, and union members earn
$100(e^{0.106} - 1) = 11.2\%$ more than non-union members, c.p.
– How do these estimates compare to FE and POLS?
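The percentage effects quoted for the dummy variables come from converting a log-wage coefficient b into 100(e^b − 1). A short sketch of that conversion, using the union and married coefficients quoted in these slides (the function name `pct_effect` is illustrative):

```python
import math

def pct_effect(b):
    """Exact percentage change in y implied by a dummy coefficient b in a log(y) model."""
    return 100 * (math.exp(b) - 1)

for label, b in [("union, FE", 0.080), ("married, RE", 0.064), ("union, RE", 0.106)]:
    print(f"{label}: {pct_effect(b):.2f}%")
# union, FE: 8.33%
# married, RE: 6.61%
# union, RE: 11.18%
```

For small coefficients the result is close to 100·b, which is why the education coefficient is read directly as a percentage return.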
Variable OLS RE FE
• educ, black and hisp: similar effects for pooled OLS and
RE.
• married and union: smaller effects for RE than OLS
– RE eliminates (part of) individual unobserved effect
– Smaller still when we eliminate the entire effect using FE
• For estimates on time-varying variables:
– RE will be closer to FE when $\theta$ is close to 1, and closer
to pooled OLS when $\theta$ is close to 0.
– Here, $\theta$ = 0.643.
6.6 Conclusion
• Random effects uses a transformation to eliminate
autocorrelation
– But does not completely eliminate $\alpha_i$
• RE has a key advantage over FE:
– Can estimate effect of time-invariant factors
• But RE (and POLS) will produce biased and inconsistent
results if cov($\alpha_i$, $X_{it}$) ≠ 0
– R2 within, from $\hat{y}_{it} - \bar{\hat{y}}_i = (X_{it} - \bar{X}_i)\hat{\beta}$, using time variation
within individuals
• Comparisons:
– Can compare POLS and FE using R2
– Typically cannot compare RE and FE using R2:
• Treat the $\alpha_i$’s differently.
– In general: estimates of β are more important than R2.
• FE model: use R2 within to conduct an F-test for the model,
as usual (i.e. H0: Xit are jointly insignificant).
• RE model: only the asymptotic properties of the estimators
are known, so we use a chi-squared test.
• For our two models:
– FE: Number of obs = 4360, Number of groups = 545
F(10,3805) = 83.85
Prob > F = 0.0000
– RE: Number of obs = 4360, Number of groups = 545
Obs per group: min = 8, avg = 8.0, max = 8
Wald chi2(14) = 957.77
Prob > chi2 = 0.0000
– Both sets of explanatory variables are jointly significant.
7.2 Random effects or fixed effects?
• Recall that, with panel data, there are two types of variation:
– within variation (variation from observation to
observation, within a single individual)
– between variation (variation in observations from
individual to individual)
• FE uses the first type only:
– The FE estimate of β is referred to as the within estimator.
• OLS conducted on averages over time of all variables:
– produces the between estimator of β.
• Pooled OLS:
– produces an average of these two estimators, weighted by the
within and between shares of the variation in X rather than
by their precision.
• RE uses both types of variation.
– The random effects estimator is a matrix-weighted
average of the within and between estimators.
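A sketch of the within and between estimators for a single regressor on a simulated balanced panel. The data-generating process and all names are assumptions for illustration. In this one-regressor case the pooled OLS slope equals an average of the two, with weights given by the within and between shares of the variation in x; RE instead chooses the weights using the variance components.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 50, 8                                   # hypothetical balanced panel
unit = np.repeat(np.arange(N), T)
alpha = rng.normal(size=N)
x = rng.normal(size=N * T) + alpha[unit]       # x correlated with alpha_i
y = 1.0 * x + alpha[unit] + rng.normal(size=N * T)

def slope(a, b):
    """OLS slope of b on a, where both are already centred."""
    return (a @ b) / (a @ a)

# Unit means, repeated T times so they line up with the rows
x_bar = (np.bincount(unit, weights=x) / T)[unit]
y_bar = (np.bincount(unit, weights=y) / T)[unit]

b_within = slope(x - x_bar, y - y_bar)                 # variation inside each unit
b_between = slope(x_bar - x.mean(), y_bar - y.mean())  # variation across unit means
b_pooled = slope(x - x.mean(), y - y.mean())           # pooled OLS with intercept

# Share of the total variation in x that is within-unit
w = ((x - x_bar) @ (x - x_bar)) / ((x - x.mean()) @ (x - x.mean()))
print(b_within, b_between, b_pooled, w * b_within + (1 - w) * b_between)
```

Because x is built to be correlated with the individual effect, the between and pooled slopes are pulled away from the true value of 1.0 while the within slope is not.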
There are three implications:
1. This is why RE is more efficient than FE
– it uses both types of variation.
(a) Economic knowledge/intuition
• Consider our sample of employed individuals
• Their education levels don’t change over time:
– Cannot include education in a FE wage model, but
– Can include education in a RE wage model.
• But in RE we assume that education is uncorrelated with $\alpha_i$:
– Recall that $\alpha_i$ includes all unmeasured time-invariant
factors (e.g. ability, family background, etc.)
• If this assumption doesn’t hold:
– Biased and inconsistent estimate of effect of education
• Often, the key reason for using panel data is to allow
(control) for $\alpha_i$ being correlated with the explanatory
variables
– i.e. RE is not justifiable in such cases
(b) Testing for random effects
• Recall that if $\theta$ = 0, we should just estimate the model by
pooled OLS.
– This occurs if all individual effects are equal
– i.e. there is no variation in $\alpha_i$
• We can test this using the Breusch and Pagan (1980)
Lagrange multiplier test:
H0: $\sigma_\alpha^2$ = 0 (use POLS or FE)
H1: $\sigma_\alpha^2$ ≠ 0 (use RE or FE)
• The test statistic follows a chi-squared distribution, with
df = 1.
• This is a post-estimation test:
– i.e. first estimate xtreg model
. xttest0
Estimated results:
Var sd = sqrt(Var)
Test: Var(u) = 0
chi2(1) = 3203.64
Prob > chi2 = 0.0000
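To make the mechanics of the LM test concrete, the sketch below computes the textbook Breusch-Pagan statistic for a balanced panel from pooled OLS residuals, LM = NT / (2(T − 1)) · [Σ_i (Σ_t e_it)² / Σ_i Σ_t e_it² − 1]². The simulated data and function names are assumptions, and this is not necessarily the exact computation that xttest0 performs.

```python
import numpy as np

def bp_lm(resid, unit, T):
    """Breusch-Pagan LM statistic (balanced panel) from pooled OLS residuals."""
    n_units = unit.max() + 1
    sums = np.bincount(unit, weights=resid)        # sum of residuals per unit
    ratio = (sums @ sums) / (resid @ resid)
    return n_units * T / (2 * (T - 1)) * (ratio - 1) ** 2

rng = np.random.default_rng(2)
N, T = 200, 8                                      # hypothetical balanced panel
unit = np.repeat(np.arange(N), T)
x = rng.normal(size=N * T)
X = np.column_stack([np.ones(N * T), x])

def pooled_resid(y):
    """Residuals from pooled OLS of y on a constant and x."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return y - X @ beta

y_no_effect = 1 + 0.5 * x + rng.normal(size=N * T)   # no individual effects
y_effect = y_no_effect + rng.normal(size=N)[unit]    # individual effects added

lm0 = bp_lm(pooled_resid(y_no_effect), unit, T)
lm1 = bp_lm(pooled_resid(y_effect), unit, T)
print(lm0, lm1)   # compare each with the chi2(1) 5% critical value of 3.84
```

When individual effects are present, residuals within a unit share a common component, so the per-unit residual sums are large and the statistic is far above the critical value.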
             ---- Coefficients ----
             (b)      (B)      (b-B)        sqrt(diag(V_b-V_B))
             FE       RE       Difference   S.E.
chi2(10) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 26.36
Prob>chi2 = 0.0033
(V_b-V_B is not positive definite)
• At all significance levels above 0.33%, we reject H0.
• Therefore there is significant correlation between $\alpha_i$ and $X_{it}$
– Model should be estimated by FE.
– Not surprising that unobserved heterogeneity is correlated
with the observed variables in a wage equation.
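The reported p-value can be reproduced from the test statistic alone. For an even number of degrees of freedom the chi-squared upper tail has a closed form, P(χ²_df > x) = e^{−x/2} Σ_{j<df/2} (x/2)^j / j!, which the sketch below uses (the function name `chi2_sf_even` is illustrative).

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail probability P(chi2_df > x), valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half**j / math.factorial(j) for j in range(df // 2))

p = chi2_sf_even(26.36, 10)   # statistic and df from the output above
print(round(p, 4))            # 0.0033
```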
7.3 Selection of a static panel model

                              Is there a Random Effect?
                              B-P LM test (H0: $\sigma_\alpha^2$ = 0)
Is there a Fixed Effect?
F-test (H0: all $\alpha_i$ = 0)      No        Yes
  No                          POLS      -- (RE)
  Yes                         FE        FE or RE: do a Hausman test

Hausman test: Is cov($\alpha_i$, $X_{it}$) = 0?
  Reject H0          → Use FE
  Do not reject H0   → Use RE
Limitations
• When choosing POLS:
– contemporaneous exogeneity must hold
• When choosing FE or RE:
– strict exogeneity must hold
7.5 Conclusion
• Panel data methods are widely used in many research
fields:
– They expand the range of questions we can answer
– Allow for better causal analysis
• We’ve introduced the key methods for static panel
analysis:
– Pooling of multiple cross sections
– Fixed effects estimation
– Random effects estimation