Panel Data I
ECONOMETRICS
Outline
Pooled OLS
Up next
The one-way error components model
The error $v_{it}$ contains unobservable individual-specific effects $c_i$ (intellectual ability,
gender etc.) and a remainder disturbance:
$$v_{it} = c_i + u_{it}$$
This is called a one-way error components structure: both $c_i$ and $u_{it}$ are unobservable
random variables.
Interpretation: once $x_{it}$ and $c_i$ are controlled for, $x_{is}$ has no partial effect on $y_{it}$ for all
$s \neq t$.
Adding linearity:
$$E(y_{it} \mid x_{i1}, \dots, x_{iT}, c_i) = x_{it}\beta + c_i.$$
Error form:
$$y_{it} = x_{it}\beta + c_i + u_{it}.$$
Strict exogeneity II
$$E(x_{is}' u_{it}) = 0, \qquad s, t = 1, \dots, T.$$
But it leaves
$$E(x_{it}' c_i)$$
fully unrestricted (which is nice for us).
Large N asymptotics
Panels with N ≈ T or N < T typically require different asymptotics but we will not
cover them in this course.
Random effects or fixed effects?
But as we shall see, there is still an important distinction between the fixed effects
estimator and the random effects estimator:
- Fixed effects estimator: the $c_i$ are allowed to correlate with $x_{it}$.
- Random effects estimator: the $c_i$ need to be uncorrelated with $x_{it}$, $t = 1, \dots, T$.
Questions to be answered by the researcher
Example: program evaluation
Model:
$$\log(wage_{it}) = \theta_t + z_{it}\gamma + \delta_1\, prog_{it} + c_i + u_{it}, \qquad t = 1, 2$$
t = 1: no one has participated → $prog_{it} = 0$ for all i
t = 2: the treatment group has participated, the control group has not
Role of $c_i$: participation may depend on personal characteristics (self-selection,
non-random assignment to one of the groups)
Discussion:
(1) The ci are probably correlated with zit .
(2) $\mathrm{Cov}(u_{it}, x_{it}) = 0$ is probably unproblematic as long as we include all important control
variables (and because we include $c_i$).
(3) What about $\mathrm{Cov}(u_{i1}, x_{i2}) = 0$?
Think of feedback: a negative income shock $u_{i1}$ may induce people to participate in the program,
or program administrators to choose these people to participate.
Then $\mathrm{Cov}(u_{i1}, prog_{i2}) \neq 0$.
Example: models of lagged adjustment
Distributed lag model to study the relationship between patents and current and past
levels of R&D spending:
Discussion:
(1) The $c_i$ are probably correlated with $RD_{it}$ and its lags unless all important factors of
firm heterogeneity are controlled for.
(2) Feedback: a negative patent shock $u_{it}$ may induce the firm to spend more on future
R&D. Then $\mathrm{Cov}(u_{it}, RD_{i,t+\tau}) \neq 0$.
Example: lagged dependent variable
Discussion:
(1) The $c_i$ are by construction correlated with $\log(wage_{i,t-1})$.
(2) “Feedback” shows up by construction: strict exogeneity fails.
Some LS stuff
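Stacking observations $t = 1, \dots, T$ for individual $i$ yields
$$y_i = x_i\beta + v_i = x_i\beta + \iota_T c_i + u_i,$$
where $\iota_T$ is a $T \times 1$ vector of ones. Stacking individuals $i = 1, \dots, N$ yields
$$y = x\beta + v,$$
where
$$
\underset{(NT \times 1)}{y} =
\begin{pmatrix} y_{11} \\ \vdots \\ y_{1T} \\ \vdots \\ y_{N1} \\ \vdots \\ y_{NT} \end{pmatrix},
\quad
\underset{(NT \times K)}{x} =
\begin{pmatrix} x_{11} \\ \vdots \\ x_{1T} \\ \vdots \\ x_{N1} \\ \vdots \\ x_{NT} \end{pmatrix},
\quad
\underset{(NT \times 1)}{v} =
\begin{pmatrix} v_{11} \\ \vdots \\ v_{1T} \\ \vdots \\ v_{N1} \\ \vdots \\ v_{NT} \end{pmatrix}
=
\begin{pmatrix} c_1 \\ \vdots \\ c_1 \\ \vdots \\ c_N \\ \vdots \\ c_N \end{pmatrix}
+
\begin{pmatrix} u_{11} \\ \vdots \\ u_{1T} \\ \vdots \\ u_{N1} \\ \vdots \\ u_{NT} \end{pmatrix}
$$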
Pooled OLS estimator
Variance of the pooled OLS estimator
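Even if the remainder disturbance $u_{it}$ in
$$v_{it} = c_i + u_{it}$$
is white noise with time-invariant variance, the overall disturbances $v_{it}$ and $v_{is}$ are
correlated due to $c_i$:
$$
E(v_{it} v_{js}) = E[(c_i + u_{it})(c_j + u_{js})] =
\begin{cases}
\mathrm{Var}(c_i) + \mathrm{Var}(u_{it}) & \text{if } i = j \text{ and } t = s \\
\mathrm{Var}(c_i) & \text{if } i = j \text{ and } t \neq s \\
0 & \text{if } i \neq j
\end{cases}
$$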
This implies that conventional standard errors computed for β̂POLS are incorrect, no
matter whether they are heteroskedasticity robust or not.
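The covariance structure of $v_{it} = c_i + u_{it}$ can be checked with a small simulation. The following is only a sketch under assumed values (normal draws, $\sigma_c = 1$, $\sigma_u = 0.5$, $T = 2$, and the helper names are all illustrative choices, not part of the course material): the sample variance of $v_{i1}$ should be close to $\sigma_c^2 + \sigma_u^2$ and the sample covariance of $v_{i1}$ and $v_{i2}$ close to $\sigma_c^2$.

```python
import random


def simulate_errors(n_units, sigma_c, sigma_u, seed=0):
    """Draw (v_i1, v_i2) with v_it = c_i + u_it for T = 2 (illustrative setup)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_units):
        c = rng.gauss(0.0, sigma_c)        # individual effect, shared across periods
        v1 = c + rng.gauss(0.0, sigma_u)   # period-1 composite error
        v2 = c + rng.gauss(0.0, sigma_u)   # period-2 composite error
        pairs.append((v1, v2))
    return pairs


def sample_cov(a, b):
    """Sample covariance with the usual n - 1 divisor."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


pairs = simulate_errors(n_units=20000, sigma_c=1.0, sigma_u=0.5)
v1 = [p[0] for p in pairs]
v2 = [p[1] for p in pairs]

var_v1 = sample_cov(v1, v1)    # should be near sigma_c^2 + sigma_u^2 = 1.25
cov_v12 = sample_cov(v1, v2)   # should be near sigma_c^2 = 1.0
print(f"Var(v_i1) ~ {var_v1:.3f}, Cov(v_i1, v_i2) ~ {cov_v12:.3f}")
```

The nonzero covariance across periods of the same individual is exactly what conventional standard errors ignore.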
But there is more
Even without $c_i$, we are still not back in the cross-section case.
To see the problem, let us number the $N \cdot T$ pooled observations as $l = 1, \dots, NT$, as
if they formed a single cross section.
One may be tempted to base the asymptotic distribution of the pooled OLS estimator
on a CLT for the cross section
$$
(NT)^{-1/2} \sum_{l=1}^{NT} x_l' v_l = (NT)^{-1/2} \sum_{i=1}^{N} \sum_{t=1}^{T} x_{it}' v_{it} \xrightarrow{d} N(0, B)
$$
where $B \equiv \mathrm{Cov}(x_l' v_l) = \mathrm{Cov}(x_{it}' v_{it})$. In fact, this is what regression software
does unless you tell it that this is not a single cross section.
Problem: this CLT requires the (here: invalid) assumption that $E(x_l' v_l v_m x_m) \neq 0$ only
if $l = m$, and thus $i = j$ and $t = s$.
For fixed-T panels
The asymptotic distribution of the pooled OLS estimator is based on
$$
N^{-1/2} \sum_{i=1}^{N} x_i' v_i \xrightarrow{d} N(0, B)
$$
where $B \equiv E(x_i' v_i v_i' x_i) = \mathrm{Cov}(x_i' v_i)$.
The pooled OLS estimator is asymptotically normally distributed,
$$
\sqrt{N}\,(\hat\beta_{POLS} - \beta) \xrightarrow{d} N(0, A^{-1} B A^{-1}),
$$
where $A \equiv E[x_i' x_i]$.
We estimate the asymptotic variance by sample counterparts:
$$
\hat{A} = N^{-1} \sum_{i=1}^{N} x_i' x_i \quad \text{and} \quad \hat{B} = N^{-1} \sum_{i=1}^{N} x_i' \hat{v}_i \hat{v}_i' x_i .
$$
Note: the variance estimator leaves the correlation between different time periods of
the same individual fully unrestricted. (May think of GLS.)
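To make the sandwich formula concrete, here is a minimal sketch for a single regressor without intercept ($K = 1$), using only the Python standard library. The data-generating process, the function names, and all parameter values are assumptions made for illustration. The cluster-type estimate uses $\hat{B} = N^{-1}\sum_i (x_i'\hat{v}_i)^2$, while the comparison estimate treats all $NT$ observations as independent (heteroskedasticity-robust only).

```python
import random


def simulate_panel(n_units, n_periods, beta, seed=0):
    """Hypothetical DGP: persistent regressor, c_i independent of x_it."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n_units):
        a = rng.gauss(0.0, 1.0)   # persistent part of the regressor
        c = rng.gauss(0.0, 1.0)   # individual effect (uncorrelated with x here)
        xi = [a + rng.gauss(0.0, 1.0) for _ in range(n_periods)]
        yi = [beta * xt + c + rng.gauss(0.0, 1.0) for xt in xi]
        x.append(xi)
        y.append(yi)
    return x, y


def pooled_ols_with_se(x, y):
    """Pooled OLS slope plus two standard errors for a scalar regressor."""
    n = len(x)
    sxx = sum(xt * xt for xi in x for xt in xi)
    sxy = sum(xt * yt for xi, yi in zip(x, y) for xt, yt in zip(xi, yi))
    beta_hat = sxy / sxx
    a_hat = sxx / n                      # A_hat = N^{-1} sum_i x_i'x_i
    b_cluster, b_naive = 0.0, 0.0
    for xi, yi in zip(x, y):
        scores = [xt * (yt - xt * beta_hat) for xt, yt in zip(xi, yi)]
        b_cluster += sum(scores) ** 2            # (x_i' v_i)^2 keeps cross-period terms
        b_naive += sum(s * s for s in scores)    # drops cross-period terms
    b_cluster /= n
    b_naive /= n
    se_cluster = (b_cluster / a_hat ** 2 / n) ** 0.5
    se_naive = (b_naive / a_hat ** 2 / n) ** 0.5
    return beta_hat, se_cluster, se_naive


x, y = simulate_panel(n_units=2000, n_periods=5, beta=1.0)
beta_hat, se_cluster, se_naive = pooled_ols_with_se(x, y)
print(f"beta_hat={beta_hat:.3f}, cluster SE={se_cluster:.4f}, naive SE={se_naive:.4f}")
```

In this setup both the regressor and the error are positively correlated within individuals, so the estimate that ignores the panel structure should understate the sampling uncertainty.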
Software/Stata hints
What all this means when you estimate a pooled regression using a regression package:
- Applying OLS to your pooled data will produce the correct β̂ but the wrong
  standard errors. Your regression package does not know that there is correlation
  between certain elements (same individual, different time periods) unless you tell
  it.
- Some regression packages have an option to take the correlation within groups of
  observations into account. This is called clustering and is applicable in many
  situations.
- Example: the data set has an identifier id for each unit. Then you should use the Stata
  command
  regress y x1 x2 x3, vce(cluster id)
Strict exogeneity assumption
Pooled OLS is just the start. Let’s take more advantage of the panel structure.
Assumption FE.1:
- $E(u_{it} \mid x_{i1}, \dots, x_{iT}, c_i) = 0$ for all $t = 1, \dots, T$
Discussion:
- This is the strict exogeneity assumption as discussed above.
- However, the correlation between $c_i$ and any $x_{it}$, $t = 1, \dots, T$, is left unrestricted.
  Hence, the "omitted variable problem" used to motivate panel analysis can be
  handled here.
Within transformation
Model:
yit = xit β + ci + uit
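As a reminder of how the within transformation removes $c_i$, the standard steps can be written out as follows. This is a sketch in the notation used on the following slides ($\bar{y}_i$ for time averages, $\ddot{y}_{it}$ for deviations from them); it is added for orientation and is not taken from the original slide.

```latex
% Step 1: average the model over t = 1, ..., T for each individual i
\bar{y}_i = \bar{x}_i \beta + c_i + \bar{u}_i,
\qquad \bar{y}_i \equiv T^{-1} \sum_{t=1}^{T} y_{it}

% Step 2: subtract the averaged equation from the original one; c_i drops out
y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)\beta + (u_{it} - \bar{u}_i)

% Step 3: in "double-dot" notation
\ddot{y}_{it} = \ddot{x}_{it} \beta + \ddot{u}_{it}
```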
The FE or within estimator
Stacking all observations 1, . . . , T of one individual into ÿi , ẍi , and üi yields
This is satisfied if the strict exogeneity assumption FE.1 holds because it implies
$$E(u_{it} \mid x_{i1}, \dots, x_{iT}) = 0$$
and
$$E(\bar{u}_i \mid x_{i1}, \dots, x_{iT}) = T^{-1} \sum_{s=1}^{T} E(u_{is} \mid x_{i1}, \dots, x_{iT}) = 0$$
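A small simulation can illustrate why the within estimator is useful when $c_i$ is correlated with $x_{it}$. The sketch below assumes a hypothetical data-generating process with one regressor, $x_{it} = c_i + e_{it}$, and true $\beta = 1$; the function names and parameter values are illustrative only. Pooled OLS picks up the correlation with $c_i$, whereas OLS on the time-demeaned data does not.

```python
import random


def demean(values):
    """Subtract the individual time average (within transformation)."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def slope(x_rows, y_rows):
    """OLS slope without intercept for a scalar regressor, pooling all rows."""
    sxy = sum(a * b for xr, yr in zip(x_rows, y_rows) for a, b in zip(xr, yr))
    sxx = sum(a * a for xr in x_rows for a in xr)
    return sxy / sxx


rng = random.Random(1)
beta = 1.0
x, y = [], []
for _ in range(2000):                      # N = 2000 individuals
    c = rng.gauss(0.0, 1.0)                # unobserved effect
    xi = [c + rng.gauss(0.0, 1.0) for _ in range(5)]   # x_it correlated with c_i
    yi = [beta * xt + c + rng.gauss(0.0, 1.0) for xt in xi]
    x.append(xi)
    y.append(yi)

beta_pols = slope(x, y)                                     # biased upward in this setup
beta_fe = slope([demean(r) for r in x], [demean(r) for r in y])
print(f"pooled OLS: {beta_pols:.3f}, within (FE): {beta_fe:.3f}")
```

In this design the pooled OLS slope converges to $\beta + \sigma_c^2/(\sigma_c^2 + \sigma_e^2) = 1.5$, while the within estimator stays close to 1.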
Some matrix algebra for the FE model
Here is the matrix transformation (“within transformation”) that turns xi into ẍi :
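$$
Q_T = I_T - J_T, \qquad
J_T = T^{-1} \iota_T \iota_T' = T^{-1}
\begin{pmatrix} 1 & \dots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \dots & 1 \end{pmatrix}.
$$
Hence, $J_T$ is a $T \times T$ matrix of ones divided by $T$.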
More details
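Applying the $J_T$-projection yields time averages:
$$
J_T x_i = T^{-1}
\begin{pmatrix} 1 & \dots & 1 \\ \vdots & & \vdots \\ 1 & \dots & 1 \end{pmatrix}
\begin{pmatrix} x_{i,11} & \dots & x_{i,K1} \\ \vdots & & \vdots \\ x_{i,1T} & \dots & x_{i,KT} \end{pmatrix}
=
\begin{pmatrix} \bar{x}_{i,1} & \dots & \bar{x}_{i,K} \\ \vdots & & \vdots \\ \bar{x}_{i,1} & \dots & \bar{x}_{i,K} \end{pmatrix}
= \iota_T \bar{x}_i
$$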
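The two matrices are easy to verify numerically. The following sketch builds $J_T$ and $Q_T = I_T - J_T$ as nested lists for $T = 3$ and applies them to one made-up column of data; the helper names and the numbers are illustrative assumptions.

```python
def within_matrices(T):
    """Return J_T (T x T matrix with entries 1/T) and Q_T = I_T - J_T."""
    J = [[1.0 / T] * T for _ in range(T)]
    Q = [[(1.0 if r == c else 0.0) - J[r][c] for c in range(T)] for r in range(T)]
    return J, Q


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


J, Q = within_matrices(3)
x = [1.0, 2.0, 6.0]              # one regressor observed in T = 3 periods

time_avg = matvec(J, x)          # every entry equals the time average of x
demeaned = matvec(Q, x)          # deviations from the time average
twice = matvec(Q, demeaned)      # applying Q_T again changes nothing (idempotent)
print(time_avg, demeaned, twice)
```

Applying $Q_T$ twice gives the same result as applying it once, which is the idempotency property that allows the FE estimator to be written in terms of $u_i$ rather than $\ddot{u}_i$.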
This helps us to rewrite the FE estimator as a direct function of the $u_i$. First note
that (as always) we can represent the estimator as
$$
\hat\beta_{FE} - \beta = \left( \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i \right)^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{u}_i
$$
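The step from $\ddot{u}_i$ to $u_i$ rests on $Q_T$ being symmetric and idempotent. A short derivation, written here as a sketch in the notation of these slides with $\ddot{x}_i = Q_T x_i$ and $\ddot{u}_i = Q_T u_i$:

```latex
\ddot{x}_i' \ddot{u}_i
  = (Q_T x_i)' (Q_T u_i)
  = x_i' Q_T' Q_T u_i
  = x_i' Q_T u_i        % Q_T' = Q_T and Q_T Q_T = Q_T
  = (Q_T x_i)' u_i
  = \ddot{x}_i' u_i
```

Hence $\sum_i \ddot{x}_i' \ddot{u}_i = \sum_i \ddot{x}_i' u_i$, and the demeaned disturbance can be replaced by the original one.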
Multiplying the result of the previous page by $\sqrt{N}$ yields
$$
\sqrt{N}\,(\hat\beta_{FE} - \beta) = \left( N^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i \right)^{-1} N^{-1/2} \sum_{i=1}^{N} \ddot{x}_i' u_i
$$
This structure is equivalent to system OLS with regressor matrix ẍi and disturbance
vector ui . Hence,
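$$
\sqrt{N}\,(\hat\beta_{FE} - \beta) \xrightarrow{d} N(0, A^{-1} B A^{-1}),
$$
where
$$
A \equiv E(\ddot{x}_i' \ddot{x}_i) = E(x_i' Q_T x_i)
\quad \text{and} \quad
B \equiv E(\ddot{x}_i' u_i u_i' \ddot{x}_i).
$$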
Estimating the robust variance matrix
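The asymptotic variance $A^{-1} B A^{-1}$ is estimated by sample counterparts,
$$
\hat{A} = N^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i
\quad \text{and} \quad
\hat{B} = N^{-1} \sum_{i=1}^{N} \ddot{x}_i' \hat{u}_i \hat{u}_i' \ddot{x}_i ,
$$
where $\hat{u}_i \equiv \ddot{y}_i - \ddot{x}_i \hat\beta_{FE}$ is the residual vector of the FE estimator.
This is sometimes called the Arellano-White variance estimator for the FE model.
Note that this estimator is robust not only to heteroskedasticity but also to
autocorrelation within individuals, i.e., it allows $u_{it}$ and $u_{is}$ to be correlated for $t \neq s$.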
The classical FE variance estimator
Discussion:
- The first part means that the conditional and the unconditional variance matrix are equal.
  This rules out heteroskedasticity.
- The second part implies constant variances over time and lack of serial correlation.
Under this assumption,
$$B = E(\ddot{x}_i' u_i u_i' \ddot{x}_i) = \sigma_u^2 A.$$
Thus the asymptotic variance matrix simplifies to
$$A^{-1} B A^{-1} = A^{-1} \sigma_u^2 A A^{-1} = \sigma_u^2 A^{-1}.$$
It can be estimated by
$$
\hat{A} \equiv N^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i
$$
and the consistent estimator
$$
\hat\sigma_u^2 = \frac{1}{N(T-1) - K} \sum_{i=1}^{N} \sum_{t=1}^{T} \hat{u}_{it}^2 .
$$
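Since $\ddot{u}_{it} = u_{it} - \bar{u}_i$, with $E(\bar{u}_i^2) = \sigma_u^2/T$ and $E(u_{it}\bar{u}_i) = \sigma_u^2/T$, we have
$$
E(\ddot{u}_{it}^2) = \sigma_u^2 + T^{-1}\sigma_u^2 - 2T^{-1}\sigma_u^2 = (1 - 1/T)\,\sigma_u^2 .
$$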
... because
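Due to $E(\ddot{u}_{it}^2) = \sigma_u^2 (1 - 1/T)$ we have
$$
E\left( \sum_{i=1}^{N} \sum_{t=1}^{T} \ddot{u}_{it}^2 \right)
= \sum_{i=1}^{N} \sum_{t=1}^{T} E(\ddot{u}_{it}^2)
= NT(1 - 1/T)\,\sigma_u^2 = N(T-1)\,\sigma_u^2
$$
and thus
$$
E\left( \frac{1}{N(T-1)} \sum_{i=1}^{N} \sum_{t=1}^{T} \ddot{u}_{it}^2 \right) = \sigma_u^2 .
$$
Accounting for the loss in degrees of freedom, we therefore use the sample equivalent
$$
\hat\sigma_u^2 = \frac{1}{N(T-1) - K} \sum_{i=1}^{N} \sum_{t=1}^{T} \hat{u}_{it}^2 .
$$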
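The role of the divisor $N(T-1) - K$ can be illustrated by simulation. The sketch below assumes a hypothetical design with one regressor ($K = 1$), $T = 3$, and $\sigma_u^2 = 1$; all names and values are illustrative. Dividing the sum of squared FE residuals by $NT - K$, as a regression on the demeaned data would do by default, should give roughly $(1 - 1/T)\sigma_u^2$, while the corrected divisor should give roughly $\sigma_u^2$.

```python
import random


def demean(values):
    """Subtract the individual time average."""
    m = sum(values) / len(values)
    return [v - m for v in values]


rng = random.Random(2)
N, T, K = 3000, 3, 1
beta, sigma_u = 1.0, 1.0

xd, yd = [], []                       # time-demeaned data
for _ in range(N):
    c = rng.gauss(0.0, 1.0)           # individual effect
    xi = [c + rng.gauss(0.0, 1.0) for _ in range(T)]
    yi = [beta * xt + c + rng.gauss(0.0, sigma_u) for xt in xi]
    xd.append(demean(xi))
    yd.append(demean(yi))

# FE (within) estimate for the scalar regressor
sxx = sum(a * a for row in xd for a in row)
sxy = sum(a * b for xr, yr in zip(xd, yd) for a, b in zip(xr, yr))
beta_fe = sxy / sxx

# Sum of squared FE residuals
ssr = sum((b - a * beta_fe) ** 2 for xr, yr in zip(xd, yd) for a, b in zip(xr, yr))

sigma2_fe = ssr / (N * (T - 1) - K)   # divisor from the slides
sigma2_naive = ssr / (N * T - K)      # ignores the N lost degrees of freedom
print(f"corrected: {sigma2_fe:.3f}, naive: {sigma2_naive:.3f}")
```

With $T = 3$ the naive version is too small by a factor of about $2/3$, which would make the classical FE standard errors too small as well.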
Discussion:
- Consistency requires that $w_{it}$ be uncorrelated with the deviation of $v_{it}$ from its
  average; correlation of $w_{it}$ with $\bar{v}_i$ is allowed.
- Assume $w_{it}$ measures program participation. Then program participation can be
  systematically related to the persistent component in the error $v_{it}$. This can be
  helpful in situations where we have to suspect certain kinds of self-selection etc.
- Obviously, variation in $w_{it}$ over time is required for at least some $i$.
Implementation in Stata
Example: Data set has identifier for each individual denoted id and for each time
period denoted year.
You first have to tell Stata that you have panel data:
xtset id year
... and a small remark
Note that Stata’s FE regression results include an intercept even though the within
transformation wipes out any time-invariant regressor.
$$\hat\alpha_{FE} = \bar{y} - \bar{x}\,\hat\beta_{FE}$$
Note, however, that the ci ’s cannot be estimated consistently. (Can you imagine why?)
Example: Effects of job training grants on scrap rates
Example 10.5 taken from Wooldridge’s textbook
Sample: 54 firms reported scrap rates for 1987, 1988, and 1989. Some received a
grant in one of the years 1988 or 1989 to initiate a training program.
Analysis: Regression of log scrap rates on yearly dummies, grant dummy ("grant") and
lagged grant dummy ("grant_1"). (We leave out the union membership dummy as it is
time-invariant and include it later on when we use the RE estimator.)
Stata output – descriptive statistics
Load data: use "jtrain1.dta", clear
Set panel: xtset fcode year
Definitions:
overall = $x_{it}$
between = $\bar{x}_i$
within = $x_{it} - \bar{x}_i + \bar{\bar{x}}$
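The three definitions can be traced on a toy panel. The sketch below uses two made-up units (the labels "A" and "B" and all numbers are hypothetical) and computes the overall, between, and within series together with their sample standard deviations; it illustrates the definitions and is not a reproduction of Stata's output.

```python
import statistics

# Toy balanced panel: two units observed in three periods (made-up numbers)
panel = {"A": [1.0, 2.0, 3.0], "B": [4.0, 6.0, 8.0]}

overall = [v for vals in panel.values() for v in vals]          # x_it
grand_mean = statistics.mean(overall)                           # overall mean
between = [statistics.mean(vals) for vals in panel.values()]    # unit means
within = [v - statistics.mean(vals) + grand_mean                # x_it - mean_i + grand mean
          for vals in panel.values() for v in vals]

sd_overall = statistics.stdev(overall)
sd_between = statistics.stdev(between)
sd_within = statistics.stdev(within)
print(between, within)
print(round(sd_overall, 3), round(sd_between, 3), round(sd_within, 3))
```

Adding the grand mean back in the within series keeps it on the same scale as the original variable, so only the spread differs across the three rows.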
Stata output – FE estimation
Translation
Notes:
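- R-sq within: squared correlation between $(x_{it} - \bar{x}_i)\hat\beta_{FE}$ and $y_{it} - \bar{y}_i$.
- R-sq between: squared correlation between $\bar{x}_i \hat\beta_{FE}$ and $\bar{y}_i$.
- R-sq overall: squared correlation between $x_{it}\hat\beta_{FE}$ and $y_{it}$.
- sigma_u: square root of $\mathrm{Var}(c_i) = \sigma_c^2$.
- sigma_e: square root of $\mathrm{Var}(u_{it}) = \sigma_u^2$.
- rho: share of the total error variance due to $c_i$, i.e. $\mathrm{Var}(c_i)/[\mathrm{Var}(c_i) + \mathrm{Var}(u_{it})] = \sigma_c^2/(\sigma_c^2 + \sigma_u^2)$.
- corr(u_i, Xb): correlation between $\hat{c}_i$ and $\bar{x}_{it}\hat\beta_{FE}$.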
Coming up