BST281Micro-EconPractice Panel1
Serena Trucchi
Panel data: Introduction, Pooled OLS and Random Effect Serena Trucchi
Roadmap
• Estimation methods
• Pooled OLS;
• Random effects.
Panel data
• Balanced panel: the same time periods are available for all cross
section units.
While the mechanics of the unbalanced case are similar to the
balanced case, a careful treatment of the unbalanced case requires
a formal description of why the panel may be unbalanced, and the
sample selection issues can be somewhat subtle.
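A minimal sketch of the balanced-panel definition above. The data layout (a list of `(unit_id, period)` pairs) and the function name `is_balanced` are assumptions made for illustration, not part of the course material.

```python
from collections import defaultdict


def is_balanced(observations):
    """Return True if every unit is observed in exactly the same time periods.

    `observations` is an iterable of (unit_id, period) pairs (hypothetical layout).
    """
    periods_by_unit = defaultdict(set)
    for unit_id, period in observations:
        periods_by_unit[unit_id].add(period)
    # Union of all periods seen for any unit.
    all_periods = set()
    for periods in periods_by_unit.values():
        all_periods |= periods
    # Balanced: each unit covers the full set of periods.
    return all(periods == all_periods for periods in periods_by_unit.values())


balanced = [(1, 2001), (1, 2002), (2, 2001), (2, 2002)]
unbalanced = [(1, 2001), (1, 2002), (2, 2001)]  # unit 2 is missing 2002
print(is_balanced(balanced))    # True
print(is_balanced(unbalanced))  # False
```

The check only detects that a panel is unbalanced; it says nothing about why periods are missing, which is the harder sample selection question.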
Unobserved heterogeneity I
Unobserved heterogeneity II
• Start with the balanced panel case, and assume random sampling
across i (the cross section dimension), with fixed time periods T.
So we draw {(xit, yit) : t = 1, ..., T; ci}, where ci is the unobserved effect
drawn along with the observed data.
• ci is constant over time and unobservable (e.g. individual ability
or risk aversion; firm’s managerial quality).
• Note that ci is a random variable, and not a parameter to be
estimated.
• The unbalanced case is trickier because we must know why we
are missing some time periods for some units. We consider this
much later under missing data/sample selection issues.
Motivation: Omitted variable problem I
vit = ci + uit
Motivation: Omitted variable problem II
$$\beta_j = \frac{\partial E(y_t \mid x_t, c)}{\partial x_{tj}},$$
so that βj is the partial effect of xtj on E(yt | xt, c); that is, we are
“holding c fixed.”
• The hope is that we can allow c to be correlated with xt.
Motivation: Omitted variable problem III
The framework I
We assume
• a balanced panel and
• all asymptotic analysis – implicit or explicit – is with fixed T and
N → ∞, where N is the size of the cross section.
The framework II
• A general specification is
Assumptions about the Covariates and the Unobserved
Effect
Assumptions about Covariates and Idiosyncratic Errors I
E (uit |xit , ci ) = 0
or
E (yit |xit , ci ) = xit β + ci .
• Ideally, we could proceed with just this assumption.
Assumptions about Covariates and Idiosyncratic Errors II
2. Strict Exogeneity Conditional on the Unobserved Effect
Assumptions about Covariates and Idiosyncratic Errors III
Assumptions about Covariates and Idiosyncratic Errors IV
3. Strict Exogeneity Unconditional on the Unobserved Effect
Estimation and Testing
1 Pooled OLS,
2 Random Effects,
3 Fixed Effects,
4 First Differencing.
Pooled OLS
Pooled OLS I
• Under certain assumptions, the pooled OLS estimator can be
used to obtain a consistent estimator of β in model
$$E(x_{it}' c_i) = 0$$
$$E(x_{it}' u_{it}) = 0, \quad t = 1, \dots, T.$$
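The role of the first orthogonality condition can be illustrated with a small simulation. This is a sketch under assumed settings (one regressor, true slope 1, standard normal draws, hypothetical function names), not an example from the slides. When the unobserved effect is uncorrelated with the regressor, the pooled OLS slope lands close to the true value. When the regressor is built as "noise plus c", the slope drifts toward 1.5 in this particular design.

```python
import random


def pooled_slope(x, y):
    """Pooled OLS slope of y on x (with an intercept), stacking all i and t."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


def simulate(correlated, n_units=2000, n_periods=3, beta=1.0, seed=0):
    """Simulate y_it = beta * x_it + c_i + u_it and return the pooled OLS slope.

    If `correlated` is True, x_it = z_it + c_i, so E(x_it c_i) != 0.
    All distributions and sizes here are illustrative assumptions.
    """
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n_units):
        c = rng.gauss(0.0, 1.0)          # unobserved effect, constant over t
        for _ in range(n_periods):
            z = rng.gauss(0.0, 1.0)
            u = rng.gauss(0.0, 1.0)      # idiosyncratic error
            x_it = z + c if correlated else z
            x.append(x_it)
            y.append(beta * x_it + c + u)
    return pooled_slope(x, y)


print("c uncorrelated with x:", simulate(correlated=False))
print("c correlated with x:  ", simulate(correlated=True))
```

In the correlated design, Var(x) = 2 and Cov(x, c) = 1, so the probability limit of the slope is 1 + 1/2 rather than 1.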
Pooled OLS II
• In Stata:
reg y x1 x2 ... xK, cluster(id)
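To make the clustering idea concrete, here is a rough sketch of a cluster-robust standard error for the slope in a one-regressor pooled OLS. It sums the scores within each unit before squaring, which allows arbitrary correlation of the errors over time within a unit. The function name is hypothetical, and the sketch omits the finite-sample adjustment that Stata applies, so the numbers will not match Stata's output exactly.

```python
import math
from collections import defaultdict


def pooled_ols_cluster(ids, x, y):
    """Return (slope, cluster-robust SE) for y on x with an intercept.

    `ids` gives the cross-section unit of each stacked observation.
    No finite-sample correction is applied (an assumption of this sketch).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    x_dev = [a - x_bar for a in x]
    sxx = sum(d * d for d in x_dev)
    slope = sum(d * (b - y_bar) for d, b in zip(x_dev, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    # Sum the score x_dev * resid within each unit, then square across units.
    score = defaultdict(float)
    for unit, d, e in zip(ids, x_dev, resid):
        score[unit] += d * e
    variance = sum(s * s for s in score.values()) / sxx ** 2
    return slope, math.sqrt(variance)


ids = [1, 1, 2, 2]
x = [-1.0, 1.0, -1.0, 1.0]
y = [0.0, 2.0, 1.0, 1.0]
slope, se = pooled_ols_cluster(ids, x, y)
print(slope, se)
```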
Random Effects
Random Effects Estimation I
ASSUMPTION RE.1:
• Assume xit includes (at least) unity, and probably time dummies
in addition. Then E (ci ) = 0 is without loss of generality.
Random Effects Estimation II
Random Effects Estimation III
• Define
$$\Omega = E(v_i v_i') = \mathrm{Var}(v_i),$$
a T × T matrix.
Random Effects Estimation IV
Random Effects Estimation V
Random Effects Estimation VI
Random Effects Estimation VII
Then
$$\mathrm{Var}(v_{it}) = \sigma_c^2 + \sigma_u^2.$$
• Further, for t ≠ s,
$$\mathrm{Cov}(v_{it}, v_{is}) = \sigma_c^2.$$
Random Effects Estimation VIII
Random Effects Estimation IX
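• The pairwise correlations are $\mathrm{Corr}(v_{it}, v_{is}) = \rho = \sigma_c^2/(\sigma_c^2 + \sigma_u^2)$.
Note that ρ is the fraction of the total variance accounted for by
$c_i$. We can also write Ω as
$$\Omega = \sigma_v^2 \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \rho \\ \rho & \cdots & \rho & 1 \end{pmatrix},$$
where $\sigma_v^2 = \sigma_c^2 + \sigma_u^2$, which shows we only need to estimate ρ to proceed with FGLS.
• Typically, we estimate $\sigma_v^2$ and $\sigma_c^2$, but ρ is useful for summarizing
the importance of $c_i$.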
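The two ways of writing Ω can be checked numerically. The sketch below builds the matrix once from the variance components (σ_c² everywhere, plus σ_u² on the diagonal) and once from σ_v² and ρ. The function names and the numeric values are illustrative assumptions.

```python
def re_omega(sigma_c2, sigma_u2, T):
    """T x T RE covariance matrix: sigma_c2 off the diagonal, sigma_c2 + sigma_u2 on it."""
    return [[sigma_c2 + (sigma_u2 if t == s else 0.0) for s in range(T)]
            for t in range(T)]


def re_omega_from_rho(sigma_v2, rho, T):
    """Same matrix written as sigma_v2 times an equicorrelation matrix."""
    return [[sigma_v2 * (1.0 if t == s else rho) for s in range(T)]
            for t in range(T)]


sigma_c2, sigma_u2, T = 1.0, 3.0, 3
sigma_v2 = sigma_c2 + sigma_u2   # total variance of the composite error
rho = sigma_c2 / sigma_v2        # share of variance due to c_i

omega_a = re_omega(sigma_c2, sigma_u2, T)
omega_b = re_omega_from_rho(sigma_v2, rho, T)
print(rho)                 # 0.25
print(omega_a == omega_b)  # True
```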
Random Effects Estimation X
• Note that the correlation between the composite errors $v_{it}$ and $v_{is}$
does not depend on the difference between t and s:
$$\mathrm{Corr}(v_{it}, v_{is}) = \rho = \sigma_c^2/(\sigma_c^2 + \sigma_u^2) \ge 0, \quad s \ne t.$$
• Therefore, the correlation does not tend to zero as t and s get far apart under
the RE covariance structure.
Random Effects Estimation XI
For now, assume that we have consistent estimators of $\sigma_u^2$ and $\sigma_c^2$.
We can substitute them into the Ω matrix.
In a panel data context, the FGLS estimator that uses the variance
matrix $\hat\Omega$ is what is known as the random effects estimator:
$$\hat\beta_{RE} = \left(\sum_{i=1}^{N} X_i' \hat\Omega^{-1} X_i\right)^{-1} \left(\sum_{i=1}^{N} X_i' \hat\Omega^{-1} y_i\right).$$
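A minimal sketch of this estimator for the special case of one regressor plus an intercept. It takes the variance components as given rather than estimating them, and it uses the closed-form inverse of the RE matrix, $\Omega^{-1} = \sigma_u^{-2}\,[I_T - \{\sigma_c^2/(\sigma_u^2 + T\sigma_c^2)\}\, j_T j_T']$, a standard identity that is not stated in the slides. The function names and the data layout are assumptions.

```python
def omega_inv_product(a, b, sigma_c2, sigma_u2):
    """Compute a' Omega^{-1} b for the RE covariance structure, without forming Omega."""
    T = len(a)
    k = sigma_c2 / (sigma_u2 + T * sigma_c2)
    cross = sum(p * q for p, q in zip(a, b))
    return (cross - k * sum(a) * sum(b)) / sigma_u2


def re_gls(panel, sigma_c2, sigma_u2):
    """GLS with RE weighting; returns (intercept, slope).

    `panel` is a list of (x_i, y_i) pairs, each a list of T values (hypothetical layout).
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for x_i, y_i in panel:
        ones = [1.0] * len(x_i)
        # Accumulate sum_i X_i' Omega^{-1} X_i and sum_i X_i' Omega^{-1} y_i.
        a11 += omega_inv_product(ones, ones, sigma_c2, sigma_u2)
        a12 += omega_inv_product(ones, x_i, sigma_c2, sigma_u2)
        a22 += omega_inv_product(x_i, x_i, sigma_c2, sigma_u2)
        b1 += omega_inv_product(ones, y_i, sigma_c2, sigma_u2)
        b2 += omega_inv_product(x_i, y_i, sigma_c2, sigma_u2)
    # Solve the 2 x 2 system by Cramer's rule.
    det = a11 * a22 - a12 * a12
    intercept = (a22 * b1 - a12 * b2) / det
    slope = (a11 * b2 - a12 * b1) / det
    return intercept, slope


panel = [([0.0, 1.0], [0.0, 2.0]), ([2.0, 3.0], [1.0, 1.0])]
print(re_gls(panel, sigma_c2=0.0, sigma_u2=1.0))  # reduces to pooled OLS
print(re_gls(panel, sigma_c2=1.0, sigma_u2=1.0))  # RE weighting
```

Setting `sigma_c2=0.0` makes Ω proportional to the identity matrix, so the first call reproduces pooled OLS.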
Random Effects Estimation XIII
$$\sigma_c^2 = [T(T-1)/2]^{-1} \sum_{t=1}^{T-1} \sum_{s=t+1}^{T} E(v_{it} v_{is}).$$
$$\tilde\sigma_c^2 = [NT(T-1)/2]^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T-1} \sum_{s=t+1}^{T} v_{it} v_{is}.$$
Random Effects Estimation XIV
$$\hat\sigma_c^2 = [NT(T-1)/2 - K]^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T-1} \sum_{s=t+1}^{T} \check{v}_{it} \check{v}_{is},$$
with T fixed.
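The triple sum translates directly into code. The sketch below assumes the residuals are already available as one list of T values per unit (a hypothetical layout) and that K is the number of estimated coefficients subtracted in the denominator.

```python
def sigma_c2_hat(residuals, K=0):
    """Average of within-unit cross products v_it * v_is over all pairs t < s.

    `residuals` holds one list of T residuals per cross-section unit;
    K is the degrees-of-freedom adjustment from the formula.
    """
    N = len(residuals)
    T = len(residuals[0])
    total = 0.0
    for v_i in residuals:
        for t in range(T - 1):
            for s in range(t + 1, T):
                total += v_i[t] * v_i[s]
    return total / (N * T * (T - 1) / 2 - K)


resid = [[1.0, 2.0, 3.0], [-1.0, 0.0, 1.0]]
print(sigma_c2_hat(resid))       # 10 / 6
print(sigma_c2_hat(resid, K=1))  # 10 / 5 = 2.0
```

Nothing forces this sum to be positive in a given sample, so the function can return a negative value.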
Random Effects Estimation XV
Random Effects Estimation XVI
Fully robust inference is available for RE, and there are good
reasons for using it.
1 Ω may not have the special (and restrictive, especially for large T)
RE structure; that is, E(vi vi′) need not have the RE form. Serial
correlation or changing variances in {uit : t = 1, ..., T} invalidate the
RE structure.
2 The system homoskedasticity requirement, E(vi vi′ | Xi) = E(vi vi′),
might not hold.
Random Effects Estimation XVII
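$$\widehat{\mathrm{Avar}}(\hat\beta_{RE}) = \left(\sum_{i=1}^{N} X_i' \hat\Omega^{-1} X_i\right)^{-1} \left(\sum_{i=1}^{N} X_i' \hat\Omega^{-1} \hat{v}_i \hat{v}_i' \hat\Omega^{-1} X_i\right) \left(\sum_{i=1}^{N} X_i' \hat\Omega^{-1} X_i\right)^{-1}$$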
RE or POLS?
Why use RE instead of pooled OLS?
In Stata, fully robust inference uses the “cluster” option; for the
“usual” variance matrix estimator, drop this option:
xtreg y x1 x2 ... xK, re cluster(id)