Panel Data I

The document outlines the estimation of linear panel data models, focusing on pooled OLS and fixed effects estimation. It discusses the one-way error components model, strict exogeneity assumptions, and the implications for estimating parameters in the presence of unobservable individual effects. Additionally, it highlights the differences between fixed effects and random effects estimators and provides examples of model applications in econometrics.

Panel Econometrics I

ECONOMETRICS
Outline

Estimation of linear panel data models:


Pooled OLS and fixed effects estimation

The one-way error components model

Pooled OLS

The fixed effects estimator

Up next

The one-way error components model

A model to start with


yit = xit β + vit
Random sample of individuals (or households, firms, ...) i = 1, . . . , N for time periods
t = 1, . . . , T .

The error vit contains unobservable individual specific effects ci (intellectual ability,
gender etc.) and a remainder disturbance:

vit = ci + uit

This is called a one-way error components structure: both ci and uit are unobservable
random variables.

Before finding estimators, we have to think about the assumptions.


Strict exogeneity I

Strict exogeneity assumption:

E(yit |xi1 , . . . , xiT , ci ) = E(yit |xit , ci )

Interpretation: once xit and ci are controlled for, xis has no partial effect on yit for all s ≠ t.

(We may explicitly separate variables of interest from controls at times.)

Adding linearity:
E(yit |xi1 , . . . , xiT , ci ) = xit β + ci .

Error form:
yit = xit β + ci + uit .

Strict exogeneity II

Implication of the strict exogeneity assumption for the disturbance:

$$E(u_{it} \mid x_{i1}, \dots, x_{iT}, c_i) = 0, \quad t = 1, \dots, T.$$

This implies in turn

$$E(x_{is}' u_{it}) = 0, \quad s, t = 1, \dots, T.$$

But it leaves $E(x_{it}' c_i)$ fully unrestricted (which is nice for us).
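
The step from the conditional-mean statement to the orthogonality conditions follows from the law of iterated expectations. A short sketch, writing $x_i \equiv (x_{i1}, \dots, x_{iT})$ as shorthand (this shorthand is an assumption of the sketch, not notation from the slide):

```latex
\begin{aligned}
E(x_{is}' u_{it})
  &= E\big[ E(x_{is}' u_{it} \mid x_i, c_i) \big] \\
  &= E\big[ x_{is}' \, E(u_{it} \mid x_i, c_i) \big] \\
  &= E(x_{is}' \cdot 0) = 0, \qquad s, t = 1, \dots, T.
\end{aligned}
```

No analogous argument is available for $E(x_{it}' c_i)$, because nothing is assumed about $E(c_i \mid x_i)$.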

Large N asymptotics

In the following we use large N asymptotics.

This means, to derive an asymptotic distribution we keep T fixed and let N → ∞.

Whether this asymptotic distribution approximates the (unknown) finite-sample distribution well obviously depends on the relation between N and T in the sample. Whenever N ≫ T it should work nicely.

Panels with N ≈ T or N < T typically require different asymptotics but we will not
cover them in this course.

We’ll typically let uit be iid (or so).

Random effects or fixed effects?

In earlier literature, there was a discussion whether ci should be treated as a fixed parameter (fixed effect) or as a random variable (random effect).

Since we typically assume we have a random sample of individuals (or households, firms, etc.), ci should be treated as a random variable.

But as we shall see, there is still an important distinction between the fixed effects estimator and the random effects estimator:
- Fixed effects estimator: the ci are allowed to correlate with xit.
- Random effects estimator: the ci need to be uncorrelated with xit, t = 1, . . . , T.

Questions to be answered by the researcher

(1) Are the ci uncorrelated with xit , for all t = 1, . . . , T ?

(2) Is the strict exogeneity assumption E(uit |xi1 , . . . , xiT , ci ) = 0 reasonable?

This may be difficult at times; see below.

Example: program evaluation
Model:
$$\log(\text{wage}_{it}) = \theta_t + z_{it}\gamma + \delta_1 \text{prog}_{it} + c_i + u_{it}, \quad t = 1, 2$$
t = 1: no one has participated → $\text{prog}_{it} = 0$ for all i
t = 2: the treatment group has participated, the control group has not
Role of $c_i$: participation may depend on personal characteristics (self-selection, non-random assignment to one of the groups)

Discussion:
(1) The $c_i$ are probably correlated with $z_{it}$.
(2) $Cov(u_{it}, x_{it}) = 0$ may be unproblematic as long as we include all important control variables (and because we include $c_i$).
(3) What about $Cov(u_{i1}, x_{i2}) = 0$?
Think of feedback: a negative income shock $u_{i1}$ may induce people to participate in the program, or program administrators to choose these people to participate.
Then $Cov(u_{i1}, \text{prog}_{i2}) \neq 0$.

Example: models of lagged adjustment

Distributed lag model to study the relationship between patents and current and past
levels of R&D spending:

$$\text{patents}_{it} = \theta_t + z_{it}\gamma + \delta_0 RD_{it} + \delta_1 RD_{i,t-1} + \delta_2 RD_{i,t-2} + \cdots + c_i + u_{it}$$

Hence, interest rests on the $\delta_j$'s.

Role of $c_i$: unobserved firm heterogeneity (firm culture, risk attitudes, productivity)

Discussion:
(1) The $c_i$ are probably correlated with $RD_{it}$ and its lags unless all important factors of firm heterogeneity are controlled for.
(2) Feedback: a negative patent shock $u_{it}$ may induce the firm to spend more on future R&D. Then $Cov(u_{it}, RD_{i,t+\tau}) \neq 0$ for some $\tau > 0$.

Example: lagged dependent variable

Model of dynamic wage adjustment:

$$\log(\text{wage}_{it}) = \beta_1 \log(\text{wage}_{i,t-1}) + c_i + u_{it}, \quad t = 1, \dots, T$$

Interest is on the speed of wage adjustment.

Role of $c_i$: unobserved heterogeneity (e.g., individual productivity)

Reasonable assumption: $E(u_{it} \mid \log(\text{wage}_{i,t-1}), \dots, \log(\text{wage}_{i0}), c_i) = 0$.

Discussion:
(1) The $c_i$ are by construction correlated with $\log(\text{wage}_{i,t-1})$.
(2) “Feedback” shows up by construction: $u_{it}$ enters $\log(\text{wage}_{it})$, which is the regressor in period $t + 1$, so strict exogeneity fails.

Some LS stuff
Stacking observations $t = 1, \dots, T$ for individual $i$ yields
$$y_i = x_i \beta + v_i = x_i \beta + \iota_T c_i + u_i,$$
where $\iota_T$ is a $T \times 1$ vector of ones. Stacking individuals $i = 1, \dots, N$ yields
$$y = x\beta + v,$$
where
$$
\underset{(NT \times 1)}{y} =
\begin{pmatrix} y_{11} \\ \vdots \\ y_{1T} \\ \vdots \\ y_{N1} \\ \vdots \\ y_{NT} \end{pmatrix},
\quad
\underset{(NT \times K)}{x} =
\begin{pmatrix} x_{11} \\ \vdots \\ x_{1T} \\ \vdots \\ x_{N1} \\ \vdots \\ x_{NT} \end{pmatrix},
\quad
\underset{(NT \times 1)}{v} =
\begin{pmatrix} v_{11} \\ \vdots \\ v_{1T} \\ \vdots \\ v_{N1} \\ \vdots \\ v_{NT} \end{pmatrix}
=
\begin{pmatrix} c_1 \\ \vdots \\ c_1 \\ \vdots \\ c_N \\ \vdots \\ c_N \end{pmatrix}
+
\begin{pmatrix} u_{11} \\ \vdots \\ u_{1T} \\ \vdots \\ u_{N1} \\ \vdots \\ u_{NT} \end{pmatrix}.
$$

Pooled OLS estimator

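The classical estimator is:

$$\hat{\beta}_{POLS} = (x'x)^{-1} x'y = \left( \sum_{i=1}^{N} x_i' x_i \right)^{-1} \sum_{i=1}^{N} x_i' y_i$$

Condition for consistency:

$$E(x_{it}' v_{it}) = 0,$$

which is satisfied if

$$E(x_{it}' u_{it}) = 0 \quad \text{and} \quad E(x_{it}' c_i) = 0.$$

Hence, the omitted variable problem is avoided only if $c_i$ is not a confounder, i.e., if it is uncorrelated with the regressors.
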
Variance of the pooled OLS estimator

Even if the error component $u_{it}$ of

$$v_{it} = c_i + u_{it}$$

is white noise with time-invariant variance, the overall disturbances $v_{it}$ and $v_{is}$ are correlated due to $c_i$:

$$
E(v_{it} v_{js}) = E[(c_i + u_{it})(c_j + u_{js})] =
\begin{cases}
Var(c_i) + Var(u_{it}) & \text{if } i = j \text{ and } t = s \\
Var(c_i) & \text{if } i = j \text{ and } t \neq s \\
0 & \text{if } i \neq j
\end{cases}
$$

Hence, the standard assumption that the elements are independent is violated.

This implies that conventional standard errors computed for $\hat{\beta}_{POLS}$ are incorrect, no matter whether they are heteroskedasticity-robust or not.
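
Collecting these covariances for one individual gives an equicorrelated error structure. A sketch, assuming zero-mean components with $c_i$ uncorrelated with $u_{it}$, and introducing the shorthand $\sigma_c^2 \equiv Var(c_i)$ and $\sigma_u^2 \equiv Var(u_{it})$:

```latex
E(v_i v_i') = \sigma_c^2 \, \iota_T \iota_T' + \sigma_u^2 I_T
=
\begin{pmatrix}
\sigma_c^2 + \sigma_u^2 & \sigma_c^2 & \cdots & \sigma_c^2 \\
\sigma_c^2 & \sigma_c^2 + \sigma_u^2 & \cdots & \sigma_c^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_c^2 & \sigma_c^2 & \cdots & \sigma_c^2 + \sigma_u^2
\end{pmatrix},
\qquad
Corr(v_{it}, v_{is}) = \frac{\sigma_c^2}{\sigma_c^2 + \sigma_u^2} \quad (t \neq s).
```

The within-individual correlation does not die out as the distance between $t$ and $s$ grows, which is why ignoring it can distort standard errors substantially.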
But there is more

Even without $c_i$, we are still not back in the cross-section case.
To see the problem, let us number the $N \cdot T$ pooled observations as $l = 1, \dots, NT$, as if they formed a single cross section.

One may be tempted to base the asymptotic distribution of the pooled OLS estimator on a CLT for the cross section,

$$(NT)^{-1/2} \sum_{l=1}^{NT} x_l' v_l = (NT)^{-1/2} \sum_{i=1}^{N} \sum_{t=1}^{T} x_{it}' v_{it} \xrightarrow{d} N(0, B),$$

where $B \equiv Cov(x_l' v_l) = Cov(x_{it}' v_{it})$. In fact, this is what regression software would do unless you tell it that this is not a single cross section.
Problem: this CLT requires the (here: invalid) assumption that $E(x_l' v_l v_m x_m) \neq 0$ only if $l = m$, and thus $i = j$ and $t = s$.

For fixed-T panels
The asymptotic distribution of the pooled OLS estimator is based on
$$N^{-1/2} \sum_{i=1}^{N} x_i' v_i \xrightarrow{d} N(0, B),$$
where $B \equiv E(x_i' v_i v_i' x_i) \equiv Cov(x_i' v_i)$.
The pooled OLS estimator is asymptotically normally distributed,
$$\sqrt{N} \left( \hat{\beta}_{POLS} - \beta \right) \xrightarrow{d} N\left(0, A^{-1} B A^{-1}\right),$$
where $A \equiv E[x_i' x_i]$.
We estimate the asymptotic variance by sample counterparts:
$$\hat{A} = N^{-1} \sum_{i=1}^{N} x_i' x_i \quad \text{and} \quad \hat{B} = N^{-1} \sum_{i=1}^{N} x_i' \hat{v}_i \hat{v}_i' x_i.$$
Note: the variance estimator leaves the correlation between different time periods of the same individual fully unrestricted. (May think of GLS.)
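
To make the difference from the single-cross-section formula explicit, $B$ can be expanded period by period. This is a sketch that only uses $x_i' v_i = \sum_{t=1}^{T} x_{it}' v_{it}$:

```latex
\begin{aligned}
B = E(x_i' v_i v_i' x_i)
  &= \sum_{t=1}^{T} \sum_{s=1}^{T} E(x_{it}' v_{it} v_{is} x_{is}) \\
  &= \sum_{t=1}^{T} E(v_{it}^2 \, x_{it}' x_{it})
   + \sum_{t=1}^{T} \sum_{s \neq t} E(x_{it}' v_{it} v_{is} x_{is}).
\end{aligned}
```

A formula that treats the pooled data as one cross section keeps only the first sum; the second sum collects exactly the within-individual cross-period terms that the panel estimator $\hat{B}$ leaves unrestricted.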
Software/Stata hints
What all this means when you estimate a pooled regression using a regression package:

- Applying OLS to your pooled data will produce the correct β̂ but the wrong standard errors. Your regression package does not know that there is correlation between certain elements (same individual, different time periods) unless you tell it.

- Example: the Stata command regress y x1 x2 x3, vce(robust) generates the wrong standard errors.

- Some regression packages have an option to take the correlation within groups of observations into account. This is called clustering and is applicable in many situations.

- Example: the data set has an identifier id for each unit. Then you should use the Stata command regress y x1 x2 x3, vce(cluster id)
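
To see why clustering matters, the two sandwich formulas can be compared on a toy panel. The following is a minimal sketch, not Stata's implementation: it assumes a single regressor without intercept, omits the finite-sample corrections that regression packages typically apply, and uses hypothetical names (`pooled_ols_variances`, `panel`) and made-up data.

```python
def pooled_ols_variances(panel):
    """Pooled OLS with one regressor and no intercept (illustrative only).

    panel: list of individuals, each a list of (x_it, y_it) pairs.
    Returns (beta_hat, var_white, var_cluster), where both variances are
    A^-1 B A^-1 / N with different estimates of B.
    """
    n = len(panel)
    sxx = sum(x * x for ind in panel for x, _ in ind)
    sxy = sum(x * y for ind in panel for x, y in ind)
    beta = sxy / sxx
    a_hat = sxx / n

    # Scores x_it * v_hat_it, grouped by individual.
    scores = [[x * (y - beta * x) for x, y in ind] for ind in panel]

    # Single-cross-section formula: sum of squared scores, cross-period terms dropped.
    b_white = sum(s * s for ind in scores for s in ind) / n
    # Panel formula: square the within-individual sum, keeping cross-period terms.
    b_cluster = sum(sum(ind) ** 2 for ind in scores) / n

    var_white = b_white / a_hat ** 2 / n
    var_cluster = b_cluster / a_hat ** 2 / n
    return beta, var_white, var_cluster


# Hypothetical panel: N = 2 individuals, T = 2 periods, x_it = 1 throughout.
panel = [
    [(1.0, 2.0), (1.0, 2.0)],  # individual 1: both residuals positive
    [(1.0, 0.0), (1.0, 0.0)],  # individual 2: both residuals negative
]
beta_hat, var_white, var_cluster = pooled_ols_variances(panel)
print(beta_hat, var_white, var_cluster)  # 1.0 0.25 0.5
```

In this toy panel the residuals of each individual share the same sign, so the clustered variance is twice the heteroskedasticity-robust one; with T = 1 the two formulas would coincide.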
Strict exogeneity assumption

Pooled OLS is just the start. Let’s take more advantage of the panel structure.

Assumption FE.1:
- E(uit | xi1, . . . , xiT, ci) = 0 for all t = 1, . . . , T

Discussion:
- This is the strict exogeneity assumption as discussed above.
- However, the correlation between ci and any xit, t = 1, . . . , T, is left unrestricted. Hence, the “omitted variable problem” used to motivate panel analysis can be handled here.

Within transformation
Model:
yit = xit β + ci + uit

Time average (bars denote time averages like $\bar{y}_i = T^{-1} \sum_{t=1}^{T} y_{it}$):

ȳi = x̄i β + ci + ūi

Within transformation (subtract individual time averages) wipes out the ci :

yit − ȳi = (xit − x̄i ) β + uit − ūi

Defining ÿit = yit − ȳi etc yields the within-transformed equation

ÿit = ẍit β + üit


Discussion

- The within transformation wipes out all time-invariant regressors. To avoid zero columns in the within-transformed regressor matrix, do not include time-invariant regressors such as an intercept.
- This is the price we have to pay for the weaker assumptions compared to the RE estimator: parameters for time-invariant regressors are not identified.
- The within-transformed equation can be estimated by pooled OLS as discussed below.
- Interpretation of parameters is based on the original structural equation yit = xit β + ci + uit.

The FE or within estimator

The FE or within estimator applies pooled OLS to

ÿit = ẍit β + üit .

Stacking all observations 1, . . . , T of one individual into ÿi , ẍi , and üi yields

ÿi = ẍi β + üi .

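Hence, the estimator is

$$\hat{\beta}_{FE} = \left( \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i \right)^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{y}_i = \left( \sum_{i=1}^{N} \sum_{t=1}^{T} \ddot{x}_{it}' \ddot{x}_{it} \right)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} \ddot{x}_{it}' \ddot{y}_{it}.$$

To guarantee invertibility, we add assumption FE.2: $\text{rank}\, E(\ddot{x}_i' \ddot{x}_i) = K$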
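
The estimator can be illustrated on a small made-up panel. This is a minimal sketch under simplifying assumptions: a single regressor (K = 1), no disturbance $u_{it}$, and a regressor built to be correlated with $c_i$. The function names and the data are hypothetical.

```python
def within_estimator(panel):
    """FE (within) estimator for one regressor.

    panel: list of individuals, each a list of (x_it, y_it) pairs.
    """
    num = 0.0
    den = 0.0
    for ind in panel:
        t_len = len(ind)
        x_bar = sum(x for x, _ in ind) / t_len
        y_bar = sum(y for _, y in ind) / t_len
        for x, y in ind:
            # Time-demeaned values; c_i drops out of y_it - y_bar.
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den


def pooled_ols(panel):
    """Pooled OLS for one regressor without intercept."""
    sxx = sum(x * x for ind in panel for x, _ in ind)
    sxy = sum(x * y for ind in panel for x, y in ind)
    return sxy / sxx


# Hypothetical noise-free panel: y_it = 2 * x_it + c_i with x_it = c_i + t,
# so the regressor is correlated with the individual effect by construction.
beta = 2.0
panel = [
    [(c_i + t, beta * (c_i + t) + c_i) for t in (0.0, 1.0, 2.0)]
    for c_i in (0.0, 1.0, 2.0)
]

beta_fe = within_estimator(panel)
beta_pols = pooled_ols(panel)
print(beta_fe, beta_pols)  # 2.0 2.5
```

The within estimator recovers the slope exactly because demeaning removes $c_i$, while pooled OLS attributes part of $c_i$ to the regressor and overstates the slope.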


Consistency
Is β̂FE unbiased and consistent?

A sufficient condition is strict exogeneity of the transformed regressors:

E(üit |ẍi1 , . . . , ẍiT ) = 0.

Since each ẍit is a function of xi1 , . . . , xiT , this is implied by

E(üit |xi1 , . . . , xiT ) = E(uit |xi1 , . . . , xiT ) − E(ūi |xi1 , . . . , xiT ) = 0

This is satisfied if the strict exogeneity assumption FE.1 holds because it implies

E(uit |xi1 , . . . , xiT ) = 0

and

$$E(\bar{u}_i \mid x_{i1}, \dots, x_{iT}) = T^{-1} \sum_{s=1}^{T} E(u_{is} \mid x_{i1}, \dots, x_{iT}) = 0.$$

Some matrix algebra for the FE model ?

Here is the matrix transformation (“within transformation”) that turns xi into ẍi :

ẍi = QT xi = (IT − JT )xi ,

where the time-demeaning matrix is defined as

QT = IT − JT ,

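$I_T$ is a $T \times T$ identity matrix and $J_T$ is the $T \times T$ projection matrix onto a column of ones:

$$J_T = \iota_T (\iota_T' \iota_T)^{-1} \iota_T' = \iota_T \iota_T' / T = T^{-1} \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix}.$$

Hence, $J_T$ is a $T \times T$ matrix of ones divided by $T$.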

More details ?
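Applying the $J_T$-projection yields time averages:

$$
J_T x_i = T^{-1}
\begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix}
\begin{pmatrix} x_{i,11} & \cdots & x_{i,K1} \\ \vdots & \ddots & \vdots \\ x_{i,1T} & \cdots & x_{i,KT} \end{pmatrix}
=
\begin{pmatrix} \bar{x}_{i,1} & \cdots & \bar{x}_{i,K} \\ \vdots & & \vdots \\ \bar{x}_{i,1} & \cdots & \bar{x}_{i,K} \end{pmatrix}
= \iota_T \bar{x}_i
$$

Hence, the within transformation yields the deviation from time averages:

$$
(I_T - J_T) x_i =
\begin{pmatrix} x_{i,11} & \cdots & x_{i,K1} \\ \vdots & \ddots & \vdots \\ x_{i,1T} & \cdots & x_{i,KT} \end{pmatrix}
-
\begin{pmatrix} \bar{x}_{i,1} & \cdots & \bar{x}_{i,K} \\ \vdots & & \vdots \\ \bar{x}_{i,1} & \cdots & \bar{x}_{i,K} \end{pmatrix}
=
\begin{pmatrix} \ddot{x}_{i,11} & \cdots & \ddot{x}_{i,K1} \\ \vdots & \ddots & \vdots \\ \ddot{x}_{i,1T} & \cdots & \ddot{x}_{i,KT} \end{pmatrix}
= \ddot{x}_i
$$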
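
The matrix form of the within transformation can be checked numerically. The sketch below builds $Q_T = I_T - J_T$ for T = 3 with exact rational arithmetic and applies it to a made-up $T \times K$ block; the helper names (`demeaning_matrix`, `matmul`) and the numbers are hypothetical.

```python
from fractions import Fraction


def demeaning_matrix(T):
    """Return Q_T = I_T - J_T as a list of rows with exact rational entries."""
    return [
        [Fraction(int(r == s)) - Fraction(1, T) for s in range(T)]
        for r in range(T)
    ]


def matmul(a, b):
    """Plain matrix product for lists of rows."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]


T = 3
Q = demeaning_matrix(T)

# Hypothetical regressor block for one individual: T = 3 periods, K = 2 regressors.
x_i = [
    [Fraction(1), Fraction(10)],
    [Fraction(2), Fraction(20)],
    [Fraction(6), Fraction(30)],
]

# Column means are 3 and 20, so Q_T x_i holds the deviations from them.
x_dd = matmul(Q, x_i)
print(x_dd == [[-2, -10], [-1, 0], [3, 10]])       # True
print(matmul(Q, Q) == Q)                            # True: Q_T is idempotent
print(sum(Q[r][r] for r in range(T)) == T - 1)      # True: trace equals T - 1
```

For an idempotent matrix the trace equals the rank, so the last check is consistent with $Q_T$ having rank $T - 1$.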
... which helps ?
Note that $Q_T$ is a symmetric, idempotent $T \times T$ matrix of rank $T - 1$,
$$Q_T Q_T' = Q_T Q_T = Q_T.$$

This helps us to rewrite the FE estimator as a direct function of the $u_i$. First note that (as always) we can represent the estimator as
$$\hat{\beta}_{FE} - \beta = \left( \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i \right)^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{u}_i.$$

Next observe that
$$\ddot{x}_i' \ddot{u}_i = x_i' Q_T' Q_T u_i = x_i' Q_T u_i = \ddot{x}_i' u_i.$$
Substitute this into the above expression:
$$\hat{\beta}_{FE} - \beta = \left( \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i \right)^{-1} \sum_{i=1}^{N} \ddot{x}_i' u_i.$$

Asymptotic distribution of the FE estimator


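Multiplying the result of the previous page by $\sqrt{N}$ yields
$$\sqrt{N} (\hat{\beta}_{FE} - \beta) = \left( N^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i \right)^{-1} N^{-1/2} \sum_{i=1}^{N} \ddot{x}_i' u_i.$$

This structure is equivalent to system OLS with regressor matrix $\ddot{x}_i$ and disturbance vector $u_i$. Hence,
$$\sqrt{N} \left( \hat{\beta}_{FE} - \beta \right) \xrightarrow{d} N\left(0, A^{-1} B A^{-1}\right),$$
where
$$A \equiv E(\ddot{x}_i' \ddot{x}_i) = E(x_i' Q_T x_i) \quad \text{and} \quad B \equiv E(\ddot{x}_i' u_i u_i' \ddot{x}_i).$$
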
Estimating the robust variance matrix

To estimate $A$ and $B$, use sample equivalents:

$$\hat{A} \equiv N^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i \quad \text{and} \quad \hat{B} \equiv N^{-1} \sum_{i=1}^{N} \ddot{x}_i' \hat{u}_i \hat{u}_i' \ddot{x}_i,$$

where $\hat{u}_i \equiv \ddot{y}_i - \ddot{x}_i \hat{\beta}_{FE}$ is the residual vector of the FE estimator.

This is sometimes called the Arellano-White variance estimator for the FE model.

Note that this estimator is not only robust to heteroskedasticity. It is also robust to autocorrelation within individuals, i.e., it allows

$$E(u_{it} u_{is}) \neq 0 \quad \text{for all } s, t = 1, \dots, T.$$

The classical FE variance estimator

The classical FE estimator assumes homoskedasticity and lack of serial correlation.

Assumption FE.3: $E(u_i u_i' \mid x_{i1}, \dots, x_{iT}, c_i) = E(u_i u_i') = \sigma_u^2 I_T$

Discussion:
- The first part means that the conditional and unconditional variance matrices are equal. This rules out heteroskedasticity.
- The second part implies constant variances over time and lack of serial correlation:

$$E(u_{it}^2) = \sigma_u^2 \quad \text{and} \quad E(u_{it} u_{is}) = 0 \text{ for } s \neq t.$$

- If this assumption is correct, the variance estimator simplifies greatly. The simplified estimator is more efficient than the robust estimator presented above but inconsistent if the assumption fails.

... continued
Under FE.3 we obtain by the LIE
$$B = E(\ddot{x}_i' u_i u_i' \ddot{x}_i) = E(\ddot{x}_i' \sigma_u^2 I_T \ddot{x}_i) = \sigma_u^2 E(\ddot{x}_i' \ddot{x}_i) = \sigma_u^2 E(x_i' Q_T x_i) = \sigma_u^2 A.$$
Thus the asymptotic variance matrix simplifies to
$$A^{-1} B A^{-1} = A^{-1} \sigma_u^2 A A^{-1} = \sigma_u^2 A^{-1}.$$
It can be estimated by
$$\hat{A} \equiv N^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i$$
and the consistent estimator
$$\hat{\sigma}_u^2 = \frac{1}{N(T-1) - K} \sum_{i=1}^{N} \sum_{t=1}^{T} \hat{u}_{it}^2.$$


Why division by N (T − 1) − K and not NT − K ?
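Recall that $\hat{u}_{it} \equiv \ddot{y}_{it} - \ddot{x}_{it} \hat{\beta}_{FE}$ is the residual of the FE estimator and thus the sample equivalent of $\ddot{u}_{it}$.
In population, we obtain
$$E(\ddot{u}_{it}^2) = E[(u_{it} - \bar{u}_i)^2] = E(u_{it}^2) + E(\bar{u}_i^2) - 2E(u_{it} \bar{u}_i).$$
Under FE.3,
$$
\begin{aligned}
E(u_{it}^2) &= \sigma_u^2 \\
E(\bar{u}_i^2) &= T^{-2} E\left[ \left( \sum_{t=1}^{T} u_{it} \right)^2 \right] = T^{-2} \sum_{t=1}^{T} E(u_{it}^2) = T^{-2} \sum_{t=1}^{T} \sigma_u^2 = T^{-1} \sigma_u^2 \\
E(u_{it} \bar{u}_i) &= T^{-1} E\left( u_{it} \sum_{s=1}^{T} u_{is} \right) = T^{-1} E(u_{it}^2) = T^{-1} \sigma_u^2
\end{aligned}
$$

Hence,
$$E(\ddot{u}_{it}^2) = \sigma_u^2 + T^{-1} \sigma_u^2 - 2T^{-1} \sigma_u^2 = (1 - 1/T) \sigma_u^2.$$
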
... because
Due to $E(\ddot{u}_{it}^2) = \sigma_u^2 (1 - 1/T)$ we have
$$E\left( \sum_{i=1}^{N} \sum_{t=1}^{T} \ddot{u}_{it}^2 \right) = \sum_{i=1}^{N} \sum_{t=1}^{T} E(\ddot{u}_{it}^2) = NT(1 - 1/T) \sigma_u^2 = N(T-1) \sigma_u^2$$

and thus
$$E\left( \frac{1}{N(T-1)} \sum_{i=1}^{N} \sum_{t=1}^{T} \ddot{u}_{it}^2 \right) = \sigma_u^2.$$

Accounting for the loss in degrees of freedom we therefore use the sample equivalent
$$\hat{\sigma}_u^2 = \frac{1}{N(T-1) - K} \sum_{i=1}^{N} \sum_{t=1}^{T} \hat{u}_{it}^2.$$

Note that a regression package that applies pooled OLS to within-transformed variables will automatically divide by $NT - K$. Therefore, you should use specialized panel commands.
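
The size of the degrees-of-freedom error is easy to quantify. The sketch below compares the two divisors; the function names are hypothetical, and the inputs (N = 54, T = 3, K = 4, and a residual sum of squares of 104) are made-up illustration values rather than results from any data set.

```python
import math


def sigma2_fe(ssr, n, t, k):
    """Variance estimate with the FE degrees of freedom N(T-1) - K."""
    return ssr / (n * (t - 1) - k)


def sigma2_naive(ssr, n, t, k):
    """Variance estimate a pooled OLS routine would report on demeaned data."""
    return ssr / (n * t - k)


def se_correction(n, t, k):
    """Factor by which the naive standard errors must be multiplied."""
    return math.sqrt((n * t - k) / (n * (t - 1) - k))


# Hypothetical inputs: 54 units, 3 periods, 4 regressors, SSR = 104.
n, t, k, ssr = 54, 3, 4, 104.0
print(sigma2_fe(ssr, n, t, k))      # 1.0, since N(T-1) - K = 104
print(sigma2_naive(ssr, n, t, k))   # 104/158, too small
print(se_correction(n, t, k))       # roughly 1.23
```

With a short panel the correction is far from negligible: for T = 3 the naive standard errors are understated by roughly a fifth in this illustration, and the factor approaches 1 only as T grows.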
Efficiency of the FE estimator ?

Use the result
$$\ddot{x}_i' \ddot{y}_i = x_i' Q_T' Q_T y_i = x_i' Q_T y_i = \ddot{x}_i' y_i$$
to rewrite the FE estimator:
$$\hat{\beta}_{FE} = \left( \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i \right)^{-1} \sum_{i=1}^{N} \ddot{x}_i' \ddot{y}_i = \left( \sum_{i=1}^{N} \ddot{x}_i' \ddot{x}_i \right)^{-1} \sum_{i=1}^{N} \ddot{x}_i' y_i.$$

Hence, the FE estimator is equivalent to the SURE/SOLS estimator applied to the model
$$y_i = \ddot{x}_i \beta + u_i.$$
Assumptions FE.1 to FE.3 say that the regressors of this model are strictly exogenous and the (conditional and unconditional) variance matrix of the disturbances is $\sigma_u^2 I_T$.

Under these assumptions Gauss-Markov applies.

FE estimator for policy analysis
Consider the model
$$y_{it} = x_{it} \beta + v_{it} = z_{it} \gamma + \delta w_{it} + v_{it},$$
where $w_{it}$ is a policy variable, $z_{it}$ contains control variables, and $v_{it}$ may contain an individual effect.

The FE estimator of $\delta$ is consistent if

$$E[x_{it}' (v_{it} - \bar{v}_i)] = 0.$$

Discussion:
- Consistency requires that $w_{it}$ be uncorrelated with the deviation of $v_{it}$ from its average; correlation of $w_{it}$ with $\bar{v}_i$ is allowed.
- Suppose $w_{it}$ measures program participation. Then program participation can be systematically related to the persistent component in the error $v_{it}$. This is helpful in situations where we have to suspect certain kinds of self-selection etc.
- Obviously, variation in $w_{it}$ over time is required for at least some $i$.

Implementation in Stata

Example: Data set has identifier for each individual denoted id and for each time
period denoted year.

You first have to tell Stata that you have panel data:
xtset id year

The FE estimator with classical (nonrobust) variance matrix is computed using


xtreg y x1 x2 x3, fe

The FE estimator with robust Arellano-White variance matrix is computed using


xtreg y x1 x2 x3, fe vce(robust)

... and a small remark

Note that Stata’s FE regression results include an intercept even though the within
transformation wipes out any time-invariant regressor.

Stata estimates the intercept as

α̂FE = ȳ − x̄ β̂FE

where ȳ and x̄ are the sample means over all N and T .

The individual effects are estimated as

ĉi = ȳi − α̂FE − x̄i β̂FE .

Note, however, that the ci ’s cannot be estimated consistently. (Can you imagine why?)
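
One way to see the answer, sketched for fixed T: substituting the time-averaged model $\bar{y}_i = \bar{x}_i \beta + c_i + \bar{u}_i$ into the estimate gives

```latex
\hat{c}_i = \bar{y}_i - \hat{\alpha}_{FE} - \bar{x}_i \hat{\beta}_{FE}
          = (c_i - \hat{\alpha}_{FE}) + \bar{u}_i - \bar{x}_i (\hat{\beta}_{FE} - \beta).
```

As $N \to \infty$ the last term vanishes because $\hat{\beta}_{FE}$ is consistent, but $\bar{u}_i$ averages only the $T$ observations of individual $i$. Under FE.3 its variance is $\sigma_u^2 / T$, which does not shrink when $T$ is held fixed, so $\hat{c}_i$ stays noisy no matter how many individuals are added.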

Example: Effects of job training grants on scrap rates
Example 10.5 taken from Wooldridge’s textbook

Question: How do job training grants affect scrap rates?

Sample: 54 firms reported scrap rates for 1987, 1988, and 1989. Some received a
grant in one of the years 1988 or 1989 to initiate a training program.

Analysis: Regression of log scrap rates on yearly dummies, grant dummy (“grant”) and lagged grant dummy (“grant_1”). (We leave out the union membership dummy as it is time-invariant and include it later on when we use the RE estimator.)

Stata output – descriptive statistics
Load data: use "jtrain1.dta", clear
Set panel: xtset fcode year

Definitions:
overall = $x_{it}$
between = $\bar{x}_i$
within = $x_{it} - \bar{x}_i + \bar{\bar{x}}$ (where $\bar{\bar{x}}$ is the grand mean over all $i$ and $t$)

Stata output – FE estimation

Translation

Notes:
- R-sq within: squared correlation between $(x_{it} - \bar{x}_i) \hat{\beta}_{FE}$ and $y_{it} - \bar{y}_i$.
- R-sq between: squared correlation between $\bar{x}_i \hat{\beta}_{FE}$ and $\bar{y}_i$.
- R-sq overall: squared correlation between $x_{it} \hat{\beta}_{FE}$ and $y_{it}$.
- sigma_u: square root of $Var(c_i) = \sigma_c^2$
- sigma_e: square root of $Var(u_{it}) = \sigma_u^2$
- rho: variance share $Var(c_i) / [Var(c_i) + Var(u_{it})] = \sigma_c^2 / (\sigma_c^2 + \sigma_u^2)$
- corr(u_i, Xb): correlation between $\hat{c}_i$ and $\bar{x}_{it} \hat{\beta}_{FE}$.

Coming up

The random effects estimator
