
Chapter 1.

Panel Data

Joan Llull

Quantitative Statistical Methods II Part II

Barcelona School of Economics
Introduction

Chapter 1. Panel Data 2

Panel data
The term panel data refers to data sets with repeated observations over time
(or other dimensions) for a given cross-section of units.

Units can be persons, households, firms, countries, ...

It is different from repeated cross-sections.

Main advantages of panel data:

Permanent unobserved heterogeneity.

Dynamic responses and error components.

Cigarette Consumption
We will use the same example throughout the chapter.

Consider the estimation of the demand for cigarettes:

$$\ln C_{it} = \beta_0 + \beta_1 \ln P_{it} + \beta_2 \ln Y_{it} + \eta_i + v_{it},$$

where:

$\ln C_{it}$ is log consumption of cigarettes,

$\ln P_{it}$ is log price of the cigarettes,

$\ln Y_{it}$ is log income of the individual,

$\eta_i + v_{it}$ is unobserved.

Static Models

General notation
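We consider the following model:

$$y_{it} = x_{it}'\beta + (\eta_i + v_{it}),$$

where $y_{it}$ and $x_{it}$ are observed, and $\eta_i + v_{it}$ is unobserved.

Let $\{y_{it}, x_{it}\}_{i=1,\dots,N}^{t=1,\dots,T}$ be our sample. We define:

$$\boldsymbol{y}_i \equiv \begin{pmatrix} y_{i1} \\ \vdots \\ y_{iT} \end{pmatrix}, \quad X_i = \begin{pmatrix} x_{i1}' \\ \vdots \\ x_{iT}' \end{pmatrix}, \quad \boldsymbol{\eta}_i = \eta_i \iota_T, \quad \text{and} \quad \boldsymbol{v}_i = \begin{pmatrix} v_{i1} \\ \vdots \\ v_{iT} \end{pmatrix},$$

$$\boldsymbol{y} \equiv \begin{pmatrix} \boldsymbol{y}_1 \\ \vdots \\ \boldsymbol{y}_N \end{pmatrix}, \quad X = \begin{pmatrix} X_1 \\ \vdots \\ X_N \end{pmatrix}, \quad \boldsymbol{\eta} = \begin{pmatrix} \boldsymbol{\eta}_1 \\ \vdots \\ \boldsymbol{\eta}_N \end{pmatrix}, \quad \text{and} \quad \boldsymbol{v} = \begin{pmatrix} \boldsymbol{v}_1 \\ \vdots \\ \boldsymbol{v}_N \end{pmatrix},$$

where $\iota_T$ is a size $T$ vector of ones.

Hence, we can use compact notation: $\boldsymbol{y}_i = X_i\beta + (\boldsymbol{\eta}_i + \boldsymbol{v}_i)$, and $\boldsymbol{y} = X\beta + (\boldsymbol{\eta} + \boldsymbol{v})$.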
General assumptions for static models
For static models, we assume:

Fixed effects: $E[x_{it}\eta_i] \neq 0$, or random effects: $E[x_{it}\eta_i] = 0$.

Strict exogeneity: $E[x_{it}v_{is}] = 0 \;\forall s, t$. This assumption rules out effects of past $v_{is}$ on current $x_{it}$ (e.g. $x_{it}$ cannot include lagged dependent variables).

Error components: $E[\eta_i] = E[v_{it}] = E[\eta_i v_{it}] = 0$.

Serially uncorrelated shocks: $E[v_{it}v_{is}] = 0 \;\forall s \neq t$.

Homoskedasticity and i.i.d. errors: $\eta_i \sim iid(0, \sigma_\eta^2)$ and $v_{it} \sim iid(0, \sigma_v^2)$, which does not affect any crucial result, but simplifies some derivations.

Pooled OLS
A simple approach: define $\boldsymbol{u} \equiv \boldsymbol{\eta} + \boldsymbol{v}$ and estimate $\beta$ by OLS:

$$\hat\beta_{OLS} = (X'X)^{-1}X'\boldsymbol{y}.$$

The properties of $\hat\beta_{OLS}$ depend on $E[x_{it}\eta_i]$, as $E[x_{it}v_{it}] = 0$:

If $E[x_{it}\eta_i] = 0 \Rightarrow E[x_j u_j] = 0$ (random effects):

$\hat\beta_{OLS}$ is consistent as $N \to \infty$, or $T \to \infty$, or both.

It is efficient only if $\sigma_\eta^2 = 0$.

If $E[x_{it}\eta_i] \neq 0 \Rightarrow E[x_j u_j] \neq 0$ (fixed effects):

$\hat\beta_{OLS}$ is inconsistent as $N \to \infty$, or $T \to \infty$, or both.

Cross-section results are also inconsistent (but a panel helps in constructing a consistent alternative).
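As a rough illustration of the two cases above, the following sketch simulates a one-regressor panel and compares the pooled OLS slope with and without correlation between $x_{it}$ and $\eta_i$. The data-generating process, the function names (`simulate_panel`, `pooled_ols_slope`) and all parameter values are assumptions made for this example; they are unrelated to the cigarette data.

```python
import numpy as np

def simulate_panel(n_units=2000, n_periods=5, beta=1.0, corr=0.8, seed=0):
    """Hypothetical DGP: y_it = beta * x_it + eta_i + v_it, with x_it = corr * eta_i + noise."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(size=(n_units, 1))                     # individual effect eta_i
    x = corr * eta + rng.normal(size=(n_units, n_periods))  # E[x_it * eta_i] = corr
    v = rng.normal(size=(n_units, n_periods))               # idiosyncratic shock v_it
    y = beta * x + eta + v
    return y, x

def pooled_ols_slope(y, x):
    """OLS slope (with intercept), treating every (i, t) pair as one observation j."""
    xc = x.ravel() - x.mean()
    yc = y.ravel() - y.mean()
    return float(xc @ yc / (xc @ xc))

y_re, x_re = simulate_panel(corr=0.0)   # uncorrelated (random) effects
y_fe, x_fe = simulate_panel(corr=0.8)   # correlated (fixed) effects
b_random = pooled_ols_slope(y_re, x_re)
b_fixed = pooled_ols_slope(y_fe, x_fe)
print(f"uncorrelated effects: {b_random:.3f}")
print(f"correlated effects:   {b_fixed:.3f}")
```

In this design the probability limit of the pooled slope is $\beta + Cov(x_{it}, \eta_i)/Var(x_{it}) = 1 + 0.8/1.64$ when the effects are correlated, so the estimate should sit well above the true value of 1.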
Pooled OLS in Cigarette Demand
In our previous example, we would redefine our regression as:

$$\ln C_j = \beta_0 + \beta_1 \ln P_j + \beta_2 \ln Y_j + u_j,$$

where the subindex $j$ emphasizes that each observation $it$ is treated as one independent observation.

                    OLS
ln Prices (β1)     -0.083 (0.015)
ln Income (β2)     -0.032 (0.006)

Potential problems (omitted factors captured by $\eta_i$):
preferences,
education,
parental smoking behavior, ...
The fixed effects model.
Within groups estimation.

Within groups estimator
Write the model in deviations from individual means, $\tilde y_{it} \equiv y_{it} - \bar y_i$, where $\bar y_i \equiv T^{-1}\sum_{t=1}^{T} y_{it}$:

$$\tilde y_{it} = (x_{it} - \bar x_i)'\beta + (\eta_i - \bar\eta_i) + (v_{it} - \bar v_i) = \tilde x_{it}'\beta + \tilde v_{it}.$$

Given the previous assumptions:

$$E[\tilde x_{it}\tilde v_{it}] = 0.$$

Therefore, OLS on the transformed model:

$$\hat\beta_{WG} = (\tilde X'\tilde X)^{-1}\tilde X'\tilde{\boldsymbol{y}},$$

is a consistent estimator whether $E[x_{it}\eta_i] \neq 0$ or $E[x_{it}\eta_i] = 0$.

Strict exogeneity is a crucial assumption (see next slide).
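The within transformation is short to write down in code. The sketch below is one possible NumPy implementation for a balanced panel stored as arrays of shape (N, T) and (N, T, K); the function name `within_groups`, the array layout and the simulated data are assumptions for illustration only.

```python
import numpy as np

def within_groups(y, X):
    """WG estimator for a balanced panel. y has shape (N, T); X has shape (N, T, K)."""
    y_tilde = y - y.mean(axis=1, keepdims=True)   # y_it - ybar_i
    X_tilde = X - X.mean(axis=1, keepdims=True)   # x_it - xbar_i
    K = X.shape[2]
    Xf = X_tilde.reshape(-1, K)                   # stack observations (i, t) row by row
    yf = y_tilde.ravel()
    return np.linalg.solve(Xf.T @ Xf, Xf.T @ yf)

# Hypothetical simulated panel in which eta_i is correlated with the regressors.
rng = np.random.default_rng(1)
N, T = 2000, 5
beta = np.array([1.0, -0.5])
eta = rng.normal(size=(N, 1, 1))
X = 0.8 * eta + rng.normal(size=(N, T, 2))
v = rng.normal(size=(N, T))
y = X @ beta + eta[:, :, 0] + v

beta_wg = within_groups(y, X)
print(beta_wg)
```

Because $\eta_i$ is removed by the demeaning step, the estimate should be close to the assumed `beta` even though the effects are correlated with `X`.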

The role of strict exogeneity
In the case where $N \to \infty$ and $T$ is fixed, consistency depends on strict exogeneity.

To see it, recall that:

$$\tilde x_{it} = x_{it} - \frac{1}{T}(x_{i1} + \dots + x_{iT}) \quad \text{and} \quad \tilde v_{it} = v_{it} - \frac{1}{T}(v_{i1} + \dots + v_{iT}).$$

Therefore $E[\tilde x_{it}\tilde v_{it}] = 0$ requires $E[x_{it}v_{is}] = 0 \;\forall s, t$ unless $T \to \infty$.

This has motivated the development of dynamic panel data models, to relax this assumption.
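Expanding the product makes the requirement explicit. The step below uses only the definitions of $\tilde x_{it}$ and $\tilde v_{it}$ given above:

$$E[\tilde x_{it}\tilde v_{it}] = E[x_{it}v_{it}] - \frac{1}{T}\sum_{s=1}^{T} E[x_{it}v_{is}] - \frac{1}{T}\sum_{s=1}^{T} E[x_{is}v_{it}] + \frac{1}{T^{2}}\sum_{s=1}^{T}\sum_{r=1}^{T} E[x_{is}v_{ir}].$$

Every cross-period moment $E[x_{is}v_{ir}]$ appears on the right-hand side, so contemporaneous exogeneity alone does not make the expression zero when $T$ is fixed. The cross-period terms enter with weights $1/T$ and $1/T^2$, which is in line with the remark that the requirement matters less as $T$ grows.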

Pros and cons of within groups estimator
Advantage: consistent whether $E[x_{it}\eta_i] \neq 0$ or $E[x_{it}\eta_i] = 0$.

Limitations:

Not efficient:
When $N \to \infty$ but $T$ is fixed, it is less efficient than, e.g., $\hat\beta_{GLS}$ if $E[x_{it}\eta_i] = 0$.

Even if $E[x_{it}\eta_i] \neq 0$, differencing introduces correlation in the errors.

It does not identify coefficients on time-invariant regressors, and

identification is through switchers (units whose regressors change over time).

Within Groups in Cigarette Demand

In our example:

                    OLS               WG
ln Prices (β1)     -0.083 (0.015)    -0.292 (0.023)
ln Income (β2)     -0.032 (0.006)     0.107 (0.019)

Least Squares Dummy Variables
The Within Groups estimator can also be obtained by including a set of $N$ individual dummy variables:

$$y_{it} = x_{it}'\beta + \eta_1 D_{1i} + \dots + \eta_N D_{Ni} + v_{it},$$

where $D_{hi} = 1\{h = i\}$ (e.g. $D_{1i}$ takes the value of 1 for the observations on individual 1 and 0 for all other observations).

OLS estimation of this model gives estimates numerically equivalent to WG (that's why $\hat\beta_{WG}$ is a.k.a. $\hat\beta_{LSDV}$).
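The numerical equivalence can be checked directly on a small simulated panel. The sketch below builds the dummy matrix explicitly and compares the two sets of slope estimates; the panel dimensions and the data-generating process are hypothetical and chosen only to keep the dummy matrix small.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, K = 30, 4, 2                       # small hypothetical panel
eta = rng.normal(size=(N, 1, 1))
X = 0.5 * eta + rng.normal(size=(N, T, K))
y = X @ np.array([1.0, -0.5]) + eta[:, :, 0] + rng.normal(size=(N, T))

# Within groups: OLS on deviations from individual means.
Xt = (X - X.mean(axis=1, keepdims=True)).reshape(-1, K)
yt = (y - y.mean(axis=1, keepdims=True)).ravel()
beta_wg = np.linalg.solve(Xt.T @ Xt, Xt.T @ yt)

# LSDV: OLS of y on X plus N individual dummies D_1, ..., D_N (no common constant).
D = np.kron(np.eye(N), np.ones((T, 1)))  # shape (N*T, N); row (i, t) has a 1 in column i
W = np.hstack([X.reshape(-1, K), D])
coef, *_ = np.linalg.lstsq(W, y.ravel(), rcond=None)
beta_lsdv, eta_hat = coef[:K], coef[K:]  # slopes and the N estimated individual intercepts

print(beta_wg)
print(beta_lsdv)
```

The dummy coefficients in `eta_hat` equal $\bar y_i - \bar x_i'\hat\beta_{WG}$, which is how individual intercepts can be recovered after a WG regression. With large $N$ the explicit dummy matrix becomes costly, which is one practical reason to prefer the demeaning route.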

LSDV in Cigarette Demand
We can generate individual (firm) dummies and estimate by OLS, to check that it delivers the same results:

                            OLS        WG         LSDV
ln Prices (β1)             -0.083     -0.292     -0.292
                           (0.015)    (0.023)    (0.023)
ln Income (β2)             -0.032      0.107      0.107
                           (0.006)    (0.019)    (0.019)
Individual 1 (β0 + η1)                             2.804
                                                  (0.288)
Individual 2 (β0 + η2)                             3.455
                                                  (0.398)
Individual 3 (β0 + η3)                             2.891
                                                  (0.416)
Individual 4 (β0 + η4)                             2.908
                                                  (0.384)
Individual 5 (β0 + η5)                             3.490
                                                  (0.433)
Individual 6 (β0 + η6)                             2.092
                                                  (0.325)
First-Differenced Least Squares
Another transformation we can consider is first differences:

$$\Delta y_{it} = \Delta x_{it}'\beta + \Delta v_{it}, \quad \text{for } i = 1, \dots, N;\; t = 2, \dots, T,$$

where $\Delta y_{it} = y_{it} - y_{it-1}$.

Takes out time-invariant individual effects ($\Delta\eta_i = \eta_i - \eta_i = 0$), so OLS on the differenced model is consistent.

Consistency requires $E[\Delta x_{it}\Delta v_{it}] = 0$, which is implied by, but weaker than, strict exogeneity.

WG is more efficient than FDLS under classical assumptions.

FDLS is more efficient if $v_{it}$ is a random walk ($\Delta v_{it} = \varepsilon_{it} \sim iid(0, \sigma_\varepsilon^2)$).
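A minimal sketch of FDLS next to WG, on simulated data, may help to see how the two transformations relate. The function names, array layout and data-generating process are assumptions for this example. One standard fact, not stated on the slide, is used as a check: with $T = 2$ the two estimators coincide numerically.

```python
import numpy as np

def first_differences(y, X):
    """FDLS: OLS of Delta y_it on Delta x_it (t = 2, ..., T), without a constant."""
    dy = np.diff(y, axis=1).ravel()
    dX = np.diff(X, axis=1).reshape(-1, X.shape[2])
    return np.linalg.solve(dX.T @ dX, dX.T @ dy)

def within_groups(y, X):
    """WG: OLS on deviations from individual means."""
    yt = (y - y.mean(axis=1, keepdims=True)).ravel()
    Xt = (X - X.mean(axis=1, keepdims=True)).reshape(-1, X.shape[2])
    return np.linalg.solve(Xt.T @ Xt, Xt.T @ yt)

def simulate(N, T, seed):
    """Hypothetical DGP with effects correlated with both regressors."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -0.5])
    eta = rng.normal(size=(N, 1, 1))
    X = 0.8 * eta + rng.normal(size=(N, T, 2))
    y = X @ beta + eta[:, :, 0] + rng.normal(size=(N, T))
    return y, X, beta

y2, X2, beta = simulate(N=500, T=2, seed=5)
fd_T2, wg_T2 = first_differences(y2, X2), within_groups(y2, X2)

y5, X5, _ = simulate(N=2000, T=5, seed=6)
fd_T5, wg_T5 = first_differences(y5, X5), within_groups(y5, X5)
print(fd_T5, wg_T5)
```

For $T > 2$ the two estimates differ in finite samples, although both are consistent in this design because the simulated shocks are strictly exogenous.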
FDLS in Cigarette Demand
To get FDLS we generate first differences and estimate by OLS:

                            OLS        WG         LSDV       FDLS
ln Prices (β1)             -0.083     -0.292     -0.292     -0.413
                           (0.015)    (0.023)    (0.023)    (0.035)
ln Income (β2)             -0.032      0.107      0.107      0.178
                           (0.006)    (0.019)    (0.019)    (0.055)
Individual 1 (β0 + η1)                             2.804
                                                  (0.288)
Individual 2 (β0 + η2)                             3.455
                                                  (0.398)
Individual 3 (β0 + η3)                             2.891
                                                  (0.416)
Individual 4 (β0 + η4)                             2.908
                                                  (0.384)
...                                                ...

The random effects model.
Error components.

Uncorrelated effects
Now we assume uncorrelated or random effects: $E[x_{it}\eta_i] = 0$.

In this case, OLS is consistent, but not efficient.

The inefficiency comes from the serial correlation induced by $\eta_i$ (for $s \neq t$):

$$E[u_{it}u_{is}] = E[(\eta_i + v_{it})(\eta_i + v_{is})] = E[\eta_i^2] = \sigma_\eta^2.$$

The variance of the unobservables (under classical assumptions) is:

$$E[u_{it}^2] = E[\eta_i^2] + E[v_{it}^2] = \sigma_\eta^2 + \sigma_v^2.$$

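Error structure
Therefore, the variance-covariance matrix of the unobservables is:

$$E[\boldsymbol{u}_i\boldsymbol{u}_i'] = \begin{pmatrix} \sigma_\eta^2 + \sigma_v^2 & \sigma_\eta^2 & \dots & \sigma_\eta^2 \\ \sigma_\eta^2 & \sigma_\eta^2 + \sigma_v^2 & \dots & \sigma_\eta^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_\eta^2 & \sigma_\eta^2 & \dots & \sigma_\eta^2 + \sigma_v^2 \end{pmatrix} = \Omega_i,$$

whose dimensions are $T \times T$, and $E[\boldsymbol{u}_i\boldsymbol{u}_h'] = 0 \;\forall i \neq h$, or:

$$E[\boldsymbol{u}\boldsymbol{u}'] = \begin{pmatrix} \Omega_1 & 0 & \dots & 0 \\ 0 & \Omega_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \Omega_N \end{pmatrix} = \Omega,$$

which is block-diagonal with dimension $NT \times NT$.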
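The two matrices can be assembled in a few lines. The sketch below is only meant to make the structure concrete; the function names `omega_i` and `omega` and the variance values are hypothetical.

```python
import numpy as np

def omega_i(T, sigma_eta2, sigma_v2):
    """T x T covariance of u_i = eta_i * iota_T + v_i under the classical assumptions."""
    iota = np.ones((T, 1))
    return sigma_v2 * np.eye(T) + sigma_eta2 * (iota @ iota.T)

def omega(N, T, sigma_eta2, sigma_v2):
    """NT x NT block-diagonal covariance of the stacked vector u (same block for every i)."""
    return np.kron(np.eye(N), omega_i(T, sigma_eta2, sigma_v2))

Om_i = omega_i(T=3, sigma_eta2=2.0, sigma_v2=1.0)
Om = omega(N=2, T=3, sigma_eta2=2.0, sigma_v2=1.0)
print(Om_i)
print(Om.shape)
```

The Kronecker product $I_N \otimes \Omega_i$ reproduces the block-diagonal form because, under homoskedasticity, every unit shares the same $\Omega_i$.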

Generalized Least Squares
Under the classical assumptions, the GLS (Balestra-Nerlove) estimator is consistent and efficient if $E[x_{it}\eta_i] = 0$:

$$\hat\beta_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}\boldsymbol{y}.$$

If $E[x_{it}\eta_i] \neq 0$, GLS is inconsistent as $N \to \infty$ and $T$ is fixed.

This estimator is infeasible because we do not know $\sigma_\eta^2$ and $\sigma_v^2$.

Theta-differencing
$\hat\beta_{GLS}$ is equivalent to OLS on the theta-differenced model:

$$y_{it}^* = x_{it}^{*\prime}\beta + u_{it}^*,$$

where:

$$y_{it}^* = y_{it} - (1 - \theta)\bar y_i,$$

and:

$$\theta^2 = \frac{\sigma_v^2}{\sigma_v^2 + T\sigma_\eta^2}.$$

Consistency relies on $E[x_{it}\eta_i] = 0$ (as $\eta_i$ is not eliminated).

Two special cases:

If $\sigma_\eta^2 = 0$ (i.e. no individual effect), OLS is efficient.

If $T \to \infty$, then $\theta \to 0$, and $y_{it}^* \to \tilde y_{it} = y_{it} - \bar y_i$: WG is efficient.
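The equivalence between GLS and OLS on theta-differenced data can be verified numerically. The sketch below treats $\sigma_\eta^2$ and $\sigma_v^2$ as known (so it is the infeasible estimator) and uses a small simulated random-effects panel; all names and values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, K = 50, 4, 2
sigma_eta2, sigma_v2 = 2.0, 1.0          # treated as known here (infeasible GLS)
X = rng.normal(size=(N, T, K))           # regressors uncorrelated with eta_i
eta = np.sqrt(sigma_eta2) * rng.normal(size=(N, 1))
y = X @ np.array([1.0, -0.5]) + eta + np.sqrt(sigma_v2) * rng.normal(size=(N, T))

# GLS with the full NT x NT matrix Omega = I_N kron Omega_i.
Omega_i = sigma_v2 * np.eye(T) + sigma_eta2 * np.ones((T, T))
Omega_inv = np.kron(np.eye(N), np.linalg.inv(Omega_i))
Xs, ys = X.reshape(-1, K), y.ravel()
beta_gls = np.linalg.solve(Xs.T @ Omega_inv @ Xs, Xs.T @ Omega_inv @ ys)

# OLS on theta-differenced data: z*_it = z_it - (1 - theta) * zbar_i.
theta = np.sqrt(sigma_v2 / (sigma_v2 + T * sigma_eta2))
X_star = (X - (1 - theta) * X.mean(axis=1, keepdims=True)).reshape(-1, K)
y_star = (y - (1 - theta) * y.mean(axis=1, keepdims=True)).ravel()
beta_theta = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)

print(beta_gls)
print(beta_theta)
```

The theta-differenced route avoids forming or inverting the $NT \times NT$ matrix, which is why it is the form usually implemented in practice.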

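Feasible GLS
$\hat\beta_{GLS}$ is infeasible because we do not know $\sigma_\eta^2$ and $\sigma_v^2$.

A consistent estimator of $\sigma_v^2$ is provided by the WG residuals:

$$\hat{\tilde v}_{it} \equiv \tilde y_{it} - \tilde x_{it}'\hat\beta_{WG},$$

$$\hat\sigma_v^2 = \frac{\hat{\tilde{\boldsymbol{v}}}'\hat{\tilde{\boldsymbol{v}}}}{N(T-1) - K}.$$

Then, a consistent estimator of $\sigma_\eta^2$ is given by the between-groups (BG) residuals:

$$\bar y_i = \bar x_i'\beta + \bar\eta_i + \bar v_i, \quad i = 1, \dots, N \;\Rightarrow\; \hat\beta_{BG},$$

$$\hat{\bar u}_i \equiv \bar y_i - \bar x_i'\hat\beta_{BG},$$

$$\hat\sigma_{\bar u}^2 = \widehat{\sigma_\eta^2 + \frac{1}{T}\sigma_v^2} = \frac{\hat{\bar{\boldsymbol{u}}}'\hat{\bar{\boldsymbol{u}}}}{N - K} \;\Rightarrow\; \hat\sigma_\eta^2 = \hat\sigma_{\bar u}^2 - \frac{1}{T}\hat\sigma_v^2.$$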
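The two-step recipe for the variance components can be sketched as follows. The function name `variance_components`, the choice to include a constant in the between-groups regression (so that its degrees of freedom are $N$ minus the number of BG regressors including the constant), and the simulated data are all assumptions of this example.

```python
import numpy as np

def variance_components(y, X):
    """Estimate (sigma_v^2, sigma_eta^2) from WG and BG residuals.
    y: (N, T); X: (N, T, K) without a constant column."""
    N, T, K = X.shape
    # Step 1: WG residuals give sigma_v^2.
    Xt = (X - X.mean(axis=1, keepdims=True)).reshape(-1, K)
    yt = (y - y.mean(axis=1, keepdims=True)).ravel()
    b_wg = np.linalg.solve(Xt.T @ Xt, Xt.T @ yt)
    res_wg = yt - Xt @ b_wg
    sigma_v2 = res_wg @ res_wg / (N * (T - 1) - K)
    # Step 2: BG regression on individual means (constant added here by assumption).
    Xb = np.hstack([np.ones((N, 1)), X.mean(axis=1)])
    yb = y.mean(axis=1)
    b_bg, *_ = np.linalg.lstsq(Xb, yb, rcond=None)
    res_bg = yb - Xb @ b_bg
    sigma_ubar2 = res_bg @ res_bg / (N - Xb.shape[1])
    sigma_eta2 = sigma_ubar2 - sigma_v2 / T
    return sigma_v2, sigma_eta2

# Hypothetical random-effects panel with sigma_v^2 = 1 and sigma_eta^2 = 2.
rng = np.random.default_rng(7)
N, T = 3000, 5
X = rng.normal(size=(N, T, 1)) + 0.5 * rng.normal(size=(N, 1, 1))
eta = np.sqrt(2.0) * rng.normal(size=(N, 1))
y = X @ np.array([1.0]) + eta + rng.normal(size=(N, T))

s_v2, s_eta2 = variance_components(y, X)
theta_hat = np.sqrt(s_v2 / (s_v2 + T * s_eta2))
print(s_v2, s_eta2, theta_hat)
```

Plugging `theta_hat` into the theta-differencing transformation gives the feasible estimator. In small samples the difference defining $\hat\sigma_\eta^2$ can turn out negative, a case this sketch does not handle.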
Feasible GLS in Cigarette Demand

In our example, if we now estimate $\hat\beta_{FGLS}$, we get:

                    OLS        WG         FGLS
ln Prices (β1)     -0.083     -0.292     -0.122
                   (0.015)    (0.023)    (0.014)
ln Income (β2)     -0.032      0.107     -0.012
                   (0.006)    (0.019)    (0.004)

Testing for correlated individual effects.

Testing for correlated effects (Hausman test)
$\hat\beta_{WG}$ is consistent regardless of whether $E[x_{it}\eta_i]$ is equal to zero or not.

$\hat\beta_{FGLS}$ is consistent only if $E[x_{it}\eta_i] = 0$.

⇒ we can test whether both estimates are similar!

The Hausman test does exactly this comparison:

$$h = \hat q'[avar(\hat q)]^{-1}\hat q \overset{a}{\sim} \chi^2(K),$$

under the null hypothesis $E[x_{it}\eta_i] = 0$, where:

$$\hat q = \hat\beta_{WG} - \hat\beta_{FGLS},$$

and:

$$avar(\hat q) = avar(\hat\beta_{WG}) - avar(\hat\beta_{FGLS}).$$

Requires classical assumptions (FGLS to be more efficient than WG).
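Given the two coefficient vectors and their estimated variance matrices, the statistic is a one-line quadratic form. The sketch below computes it for made-up inputs; the function name `hausman_statistic` and every number in the example are hypothetical and are not the cigarette-demand estimates.

```python
import numpy as np

def hausman_statistic(b_wg, b_fgls, avar_wg, avar_fgls):
    """Return (h, degrees of freedom) with h = q' [avar(q)]^{-1} q and q = b_wg - b_fgls."""
    q = np.asarray(b_wg, dtype=float) - np.asarray(b_fgls, dtype=float)
    avar_q = np.asarray(avar_wg, dtype=float) - np.asarray(avar_fgls, dtype=float)
    h = float(q @ np.linalg.solve(avar_q, q))
    return h, q.size

# Made-up inputs: two coefficients, diagonal variance matrices.
h, dof = hausman_statistic(
    b_wg=[1.0, 0.5],
    b_fgls=[0.8, 0.5],
    avar_wg=0.05 * np.eye(2),
    avar_fgls=0.01 * np.eye(2),
)
print(h, dof)
```

The value of `h` is then compared with a $\chi^2$ critical value with `dof` degrees of freedom. If FGLS is not actually more efficient than WG in the sample, the difference of variance matrices need not be positive definite, and the statistic computed this way is not reliable.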

Hausman test in Cigarette Demand

Output from software packages often includes the Hausman test. In our example:

                  Statistic    P-value
Hausman test      24.661       0.000

Dynamic Models

Autoregressive models with individual effects

Autoregressive panel data model
We consider the following model:

$$y_{it} = \alpha y_{it-1} + \eta_i + v_{it}, \quad |\alpha| < 1.$$

Other regressors can be included, but the main results are unaffected.

We assume:

Error components: $E[\eta_i] = E[v_{it}] = E[\eta_i v_{it}] = 0$.

Serially uncorrelated shocks: $E[v_{it}v_{is}] = 0 \;\forall s \neq t$.

Predetermined initial condition: $E[y_{i0}v_{it}] = 0 \;\forall t = 1, \dots, T$.

Properties of pooled OLS and WG estimators
Even assuming $E[y_{it-1}v_{it}] = 0$, OLS still delivers:

$$\underset{N \to \infty}{plim}\; \hat\alpha_{OLS} > \alpha,$$

because $E[y_{it-1}\eta_i] = \sigma_\eta^2\,\frac{1 - \alpha^{t-1}}{1 - \alpha} + \alpha^{t-1}E[y_{i0}\eta_i] > 0$.

Doing the within groups transformation we see that:

$$\underset{N \to \infty}{plim}\; \hat\alpha_{WG} < \alpha,$$

because $E[\tilde y_{it-1}\tilde v_{it}] = -A\sigma_v^2 < 0$, where $A$ is a positive function of $\alpha$, $t$ and $T$.

WG bias vanishes as $T \to \infty$ (bias not small even with $T = 15$).

Supposedly consistent estimators with $\hat\alpha \gg \hat\alpha_{OLS}$ or $\hat\alpha \ll \hat\alpha_{WG}$ should be seen with suspicion.
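For reference, the within-groups moment can be worked out from the model assumptions. The derivation below assumes that $y_{i0}, \dots, y_{iT}$ are observed, that both individual means are taken over $t = 1, \dots, T$, and that $E[v_{it}^2] = \sigma_v^2$; its indexing convention may differ from the one used on the slide. Since $E[y_{ir-1}v_{is}] = \alpha^{r-1-s}\sigma_v^2$ for $1 \leq s \leq r - 1$ and zero otherwise:

$$E[\tilde y_{it-1}\tilde v_{it}] = -\frac{\sigma_v^2}{T(1-\alpha)}\left[1 - \alpha^{t-1} - \alpha^{T-t} + \frac{1 - \alpha^{T}}{T(1-\alpha)}\right].$$

As a check, with $T = 2$ and $t = 1$ the bracket equals $(1-\alpha)/2$ and the moment reduces to $-\sigma_v^2/4$, which matches the direct calculation $\frac{1}{4}E[(y_{i0} - y_{i1})(v_{i1} - v_{i2})] = -\frac{1}{4}E[y_{i1}v_{i1}]$.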

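Anderson and Hsiao
Consider the model in first differences:

$$\Delta y_{it} = \alpha\Delta y_{it-1} + \Delta v_{it}.$$

OLS in first differences is inconsistent: $E[\Delta y_{it-1}\Delta v_{it}] = -\sigma_v^2 < 0$.

However, $y_{it-2}$ or $\Delta y_{it-2}$ are valid instruments for $\Delta y_{it-1}$:

Relevance: $E[\Delta y_{it-2}\Delta y_{it-1}] \neq 0$, $E[y_{it-2}\Delta y_{it-1}] \neq 0$.

Orthogonality: $E[\Delta y_{it-2}\Delta v_{it}] = E[y_{it-2}\Delta v_{it}] = 0$.

Anderson and Hsiao (1981) proposed this 2SLS estimator:

$$\hat\alpha_{AH} = \left(\widehat{\Delta\boldsymbol{y}}_{-1}'\,\widehat{\Delta\boldsymbol{y}}_{-1}\right)^{-1}\widehat{\Delta\boldsymbol{y}}_{-1}'\,\Delta\boldsymbol{y},$$

where:

$$\widehat{\Delta\boldsymbol{y}}_{-1} = Z(Z'Z)^{-1}Z'\Delta\boldsymbol{y}_{-1},$$

where $Z$ can be $\boldsymbol{y}_{-2}$ or $\Delta\boldsymbol{y}_{-2}$.

Requires a minimum of three periods ($T = 2$ and $y_{i0}$). Only efficient if $T = 2$.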
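The ordering of the three estimators can be reproduced on simulated data. The sketch below generates a stationary AR(1) panel, then computes pooled OLS, WG, and the Anderson-Hsiao estimator with the level $y_{it-2}$ as the single instrument (in which case 2SLS reduces to a ratio of cross-products). The simulator, its parameter values and the function name `simulate_ar1_panel` are assumptions for this example.

```python
import numpy as np

def simulate_ar1_panel(N=20000, T=6, alpha=0.5, burn_in=50, seed=4):
    """Hypothetical DGP: y_it = alpha * y_it-1 + eta_i + v_it; returns columns y_i0, ..., y_iT."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(size=N)
    y = eta / (1 - alpha)                      # start at the individual long-run mean
    path = []
    for t in range(burn_in + T + 1):
        y = alpha * y + eta + rng.normal(size=N)
        if t >= burn_in:                       # keep the last T + 1 periods
            path.append(y)
    return np.column_stack(path)               # shape (N, T + 1)

alpha_true = 0.5
y = simulate_ar1_panel(alpha=alpha_true)
y_lag, y_cur = y[:, :-1], y[:, 1:]

# Pooled OLS of y_it on y_it-1 (no constant; eta_i has mean zero in this DGP).
alpha_ols = float((y_lag * y_cur).sum() / (y_lag ** 2).sum())

# Within groups: the same regression after removing individual means.
yl = y_lag - y_lag.mean(axis=1, keepdims=True)
yc = y_cur - y_cur.mean(axis=1, keepdims=True)
alpha_wg = float((yl * yc).sum() / (yl ** 2).sum())

# Anderson-Hsiao: instrument Delta y_it-1 with the level y_it-2.
dy = np.diff(y, axis=1)                        # columns Delta y_i1, ..., Delta y_iT
dy_cur, dy_lag, z = dy[:, 1:], dy[:, :-1], y[:, :-2]
alpha_ah = float((z * dy_cur).sum() / (z * dy_lag).sum())

print(f"OLS {alpha_ols:.3f}  WG {alpha_wg:.3f}  AH {alpha_ah:.3f}")
```

In this design OLS should land above the true value, WG below it, and the instrumental-variables estimate close to it, in line with the bounds discussed above.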

AR(1) Cigarette Demand (No Covariates)
In our example, we redefine the model to be an AR(1) process (for now without regressors):

$$\ln C_{it} = \alpha \ln C_{it-1} + \eta_i + v_{it}.$$

The Anderson-Hsiao results (together with OLS and WG) are:

                             OLS        WG         Anderson-Hsiao
Lagged cons. (ln C_it−1)     0.982      0.884      1.395
                            (0.003)    (0.061)    (0.090)

Differenced GMM estimation.

GMM in 3 slides (I): the setup
GMM finds parameter estimates that come as close as possible to satisfying the orthogonality conditions (or moment conditions) in the sample.

E.g., consider $K$ regressors $\boldsymbol{x}_i$ and $L$ instruments $\boldsymbol{z}_i$:

$$u_i = y_i - f(\boldsymbol{x}_i, \beta), \qquad \boldsymbol{z}_i = g(\boldsymbol{x}_i).$$

The model specifies $L$ moment conditions: $E[\boldsymbol{z}_i u_i] = 0$.

Sample analogue:

$$b_N(\beta) = \frac{1}{N}\sum_{i=1}^{N} \boldsymbol{z}_i u_i(\beta).$$

GMM in 3 slides (II): the estimation
Two possible cases:

$L = K$ (just identified): $b_N(\hat\beta_{GMM}) = 0$.

$L > K$ (overidentified): $\hat\beta_{GMM} = \arg\min_\beta\, b_N(\beta)'W_N b_N(\beta)$.

$W_N$ is a positive definite weighting matrix.

Optimal GMM (efficient) uses $\left(\frac{1}{N}\sum_{i=1}^{N} E[\boldsymbol{z}_i u_i u_i' \boldsymbol{z}_i']\right)^{-1}$, the inverse of the variance-covariance matrix, as weighting matrix.

GMM in 3 slides (III): asymptotic properties
$\hat\beta_{GMM}$ is a consistent estimator of $\beta$.

It is asymptotically normal, with the following variance:

$$avar(\hat\beta_{GMM}) = (D'WD)^{-1}D'WS_0WD(D'WD)^{-1},$$

where:

$$D \equiv \underset{N \to \infty}{plim}\,\frac{\partial b_N(\beta)}{\partial\beta'},$$

$$W \equiv \underset{N \to \infty}{plim}\, W_N,$$

$$S_0 \equiv \frac{1}{N}\sum_{i=1}^{N} E[\boldsymbol{z}_i u_i u_i' \boldsymbol{z}_i'].$$

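Moment conditions
Given previous assumptions, several moment conditions:

$$\begin{array}{lll}
\text{Equation} & \text{Instruments} & \text{Orthogonality cond.} \\
\Delta y_{i2} = \alpha\Delta y_{i1} + \Delta v_{i2} & y_{i0} & E[\Delta v_{i2}\, y_{i0}] = 0 \\
\Delta y_{i3} = \alpha\Delta y_{i2} + \Delta v_{i3} & y_{i0}, y_{i1} & E\left[\Delta v_{i3}\begin{pmatrix} y_{i0} \\ y_{i1} \end{pmatrix}\right] = 0 \\
\Delta y_{i4} = \alpha\Delta y_{i3} + \Delta v_{i4} & y_{i0}, y_{i1}, y_{i2} & E\left[\Delta v_{i4}\begin{pmatrix} y_{i0} \\ y_{i1} \\ y_{i2} \end{pmatrix}\right] = 0 \\
\vdots & \vdots & \vdots \\
\Delta y_{iT} = \alpha\Delta y_{iT-1} + \Delta v_{iT} & y_{i0}, y_{i1}, y_{i2}, \dots, y_{iT-2} & E\left[\Delta v_{iT}\begin{pmatrix} y_{i0} \\ y_{i1} \\ y_{i2} \\ \vdots \\ y_{iT-2} \end{pmatrix}\right] = 0
\end{array}$$

We end up with $(T-1)T/2$ moment conditions.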

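Moment conditions in matrix form
We can write these moment conditions as $E[Z_i'\Delta\boldsymbol{v}_i] = 0$, where:

$$Z_i = \begin{pmatrix} y_{i0} & 0 & 0 & 0 & \dots & 0 & 0 & \dots & 0 \\ 0 & y_{i0} & y_{i1} & 0 & \dots & 0 & 0 & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & 0 & \dots & y_{i0} & y_{i1} & \dots & y_{iT-2} \end{pmatrix} \quad \text{and} \quad \Delta\boldsymbol{v}_i = \begin{pmatrix} \Delta v_{i2} \\ \Delta v_{i3} \\ \vdots \\ \Delta v_{iT} \end{pmatrix},$$

and the sample analogue is:

$$b_N(\alpha) = \frac{1}{N}\sum_{i=1}^{N} Z_i'\Delta\boldsymbol{v}_i(\alpha).$$

Hence, the GMM estimator (proposed by Arellano and Bond, 1991) is:

$$\hat\alpha_{GMM} = \arg\min_\alpha \left(\frac{1}{N}\sum_{i=1}^{N}\Delta\boldsymbol{v}_i'(\alpha)Z_i\right) W_N \left(\frac{1}{N}\sum_{i=1}^{N} Z_i'\Delta\boldsymbol{v}_i(\alpha)\right) = (\Delta\boldsymbol{y}_{-1}'ZW_NZ'\Delta\boldsymbol{y}_{-1})^{-1}\Delta\boldsymbol{y}_{-1}'ZW_NZ'\Delta\boldsymbol{y}.$$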

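Optimal weighting matrix
The optimal weighting matrix (efficient GMM) is:

$$W_N = \left(\frac{1}{N}\sum_{i=1}^{N} E[Z_i'\Delta\boldsymbol{v}_i\Delta\boldsymbol{v}_i'Z_i]\right)^{-1}.$$

The sample analogue is obtained in a two-step procedure:

$$W_N = \left(\frac{1}{N}\sum_{i=1}^{N} Z_i'\,\widehat{\Delta\boldsymbol{v}}_i(\hat\alpha)\,\widehat{\Delta\boldsymbol{v}}_i'(\hat\alpha)\,Z_i\right)^{-1}.$$

Windmeijer (2005) proposes a finite sample correction of the variance that accounts for $\alpha$ being estimated.

The most common one-step (and first-step) matrix uses the structure of $E[\Delta\boldsymbol{v}_i\Delta\boldsymbol{v}_i']$:

$$E[\Delta\boldsymbol{v}_i\Delta\boldsymbol{v}_i'] = \sigma_v^2 \begin{pmatrix} 2 & -1 & 0 & \dots & 0 \\ -1 & 2 & -1 & \dots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 2 \end{pmatrix}.$$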
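The instrument matrix, the one-step weighting structure and the closed-form estimator can be put together as in the sketch below. It is a plain NumPy illustration for a balanced panel with no extra regressors; the function names, the simulator and its parameter values are assumptions, and a production implementation would also handle standard errors and unbalanced panels.

```python
import numpy as np

def build_instruments(y_i):
    """Z_i for one unit; y_i holds y_i0, ..., y_iT. The row for period t carries y_i0..y_it-2."""
    T = y_i.shape[0] - 1
    Z = np.zeros((T - 1, T * (T - 1) // 2))
    col = 0
    for r in range(T - 1):                  # r = t - 2, for t = 2, ..., T
        Z[r, col:col + r + 1] = y_i[:r + 1]
        col += r + 1
    return Z

def one_step_structure(T):
    """(T-1) x (T-1) matrix with 2 on the diagonal and -1 on the adjacent diagonals."""
    m = T - 1
    return 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

def one_step_gmm(y):
    """One-step differenced GMM for y_it = alpha * y_it-1 + eta_i + v_it; y is (N, T + 1)."""
    N, T = y.shape[0], y.shape[1] - 1
    H = one_step_structure(T)
    dy = np.diff(y, axis=1)                 # Delta y_i1, ..., Delta y_iT
    L = T * (T - 1) // 2
    ZHZ, Zx, Zy = np.zeros((L, L)), np.zeros(L), np.zeros(L)
    for i in range(N):
        Z = build_instruments(y[i])
        ZHZ += Z.T @ H @ Z
        Zx += Z.T @ dy[i, :-1]              # Z_i' Delta y_{i,-1}
        Zy += Z.T @ dy[i, 1:]               # Z_i' Delta y_i
    W = np.linalg.inv(ZHZ / N)              # one-step weighting matrix
    return float((Zx @ W @ Zy) / (Zx @ W @ Zx))

def simulate_ar1_panel(N, T, alpha, seed, burn_in=50):
    """Hypothetical stationary AR(1) panel with individual effects."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(size=N)
    y = eta / (1 - alpha)
    cols = []
    for t in range(burn_in + T + 1):
        y = alpha * y + eta + rng.normal(size=N)
        if t >= burn_in:
            cols.append(y)
    return np.column_stack(cols)

Z_small = build_instruments(np.array([1.0, 2.0, 3.0, 4.0]))   # T = 3
H_small = one_step_structure(3)
alpha_hat = one_step_gmm(simulate_ar1_panel(N=3000, T=5, alpha=0.5, seed=8))
print(Z_small)
print(alpha_hat)
```

The factor $\sigma_v^2$ is omitted from the weighting matrix because any positive rescaling of $W_N$ leaves the estimator unchanged.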
GMM in Cigarette Demand
GMM results in our example are:

                               Coefficient    Standard Error
Least Squares (OLS)            0.982          (0.003)
Within Groups (WG)             0.884          (0.061)
Anderson-Hsiao                 1.395          (0.090)
One-step GMM                   1.023          (0.104)
Two-step GMM                   0.994          (0.040)
Two-step GMM small sample      0.994          (0.121)

Potential limitations of Arellano-Bond
Weak instruments:

When $\alpha \to 1$, the relevance of the instruments decreases.

Instruments are still valid, but the estimator has poor small sample properties.

Monte Carlo evidence shows that with $\alpha > 0.8$, the estimator behaves poorly unless huge samples are available.

There are alternatives in the literature.

Overfitting:

Too many instruments if $T$ is large relative to $N$.

We might want to restrict the number of instruments used.

It is always good practice to check robustness to different combinations of instruments.

Dynamic linear model
Once we include regressors, the model is:

$$y_{it} = \alpha y_{it-1} + x_{it}'\beta + \eta_i + v_{it}, \quad |\alpha| < 1.$$

We maintain the previous assumptions: error components, serially uncorrelated shocks, and predetermined initial conditions.

Therefore, moment conditions of the kind:

$$E[y_{it-s}\Delta v_{it}] = 0, \quad t = 2, \dots, T,\; s \geq 2,$$

are still valid.
Assumptions on regressors
Different assumptions regarding $x_{it}$ will enable different additional orthogonality conditions:

$x_{it}$ can be correlated or uncorrelated with $\eta_i$.

$x_{it}$ can be endogenous, predetermined, or strictly exogenous with respect to $v_{it}$.

For instance, if assumptions are analogous to those for $y_{it-1}$, we may use $x_{it-1}$ (and longer lags) as instruments:

$$Z_i = \begin{pmatrix} y_{i0} & x_{i0}' & x_{i1}' & \dots & 0 & \dots & 0 & 0' & \dots & 0' \\ \vdots & \vdots & \vdots & & \vdots & & \vdots & \vdots & & \vdots \\ 0 & 0' & 0' & \dots & y_{i0} & \dots & y_{iT-2} & x_{i0}' & \dots & x_{iT-1}' \end{pmatrix}.$$

Dynamic Cigarette Demand with Regressors
In our example, we rewrite the demand equation as follows:

$$\ln C_{it} = \alpha \ln C_{it-1} + \beta_1 \ln P_{it} + \beta_2 \ln Y_{it} + \eta_i + v_{it}.$$

Results are:

                    OLS        WG         GMM
Lagged dep (α)      0.947      0.528      0.495
                   (0.011)    (0.064)    (0.127)
ln Prices (β1)      0.010     -0.501     -0.607
                   (0.006)    (0.098)    (0.143)
ln Income (β2)      0.049      0.369      0.338
                   (0.011)    (0.044)    (0.051)

System GMM estimation

Additional orthogonality conditions

Recall our $(T-1)T/2$ moment conditions:

$$E[y_{it-s}\Delta v_{it}] = 0, \quad t = 2, \dots, T;\; s \geq 2.$$

System GMM (Arellano and Bover, 1995) uses the assumption $E[y_{i0}|\eta_i] = \frac{\eta_i}{1-\alpha}$:

$$E[\Delta y_{it}\eta_i] = 0, \quad \forall t,$$

or, alternatively:

$$E[\Delta y_{iT-s}u_{iT}] = 0, \quad u_{iT} = \eta_i + v_{iT}, \quad s = 1, \dots, T-1.$$

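The System GMM estimator

Analogously to the first-differenced GMM, the estimator is based on the moment conditions $E[(Z_i^*)'\boldsymbol{u}_i^*] = 0$:

$$\hat\alpha_{Sys\text{-}GMM} = \left[(X^*)'Z^*W_N(Z^*)'X^*\right]^{-1}(X^*)'Z^*W_N(Z^*)'\boldsymbol{y}^*,$$

where:

$$Z_i^* = \begin{pmatrix} Z_i & 0 & \dots & 0 \\ 0' & \Delta y_{i1} & \dots & \Delta y_{iT-1} \end{pmatrix}, \quad \boldsymbol{u}_i^* = \begin{pmatrix} \Delta\boldsymbol{v}_i \\ \eta_i + v_{iT} \end{pmatrix}, \quad X_i^* = \begin{pmatrix} \Delta\boldsymbol{y}_{-1,i} \\ y_{iT-1} \end{pmatrix} \quad \text{and} \quad \boldsymbol{y}_i^* = \begin{pmatrix} \Delta\boldsymbol{y}_i \\ y_{iT} \end{pmatrix}.$$

This estimator is more efficient, as it uses additional moment conditions.

It reduces small sample bias, especially when $\alpha \to 1$.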

System-GMM in Cigarette Demand
GMM results in our example are:

                               Coefficient    Standard Error
Least Squares (OLS)            0.982          (0.003)
Within Groups (WG)             0.884          (0.061)
Anderson-Hsiao                 1.395          (0.090)
One-step GMM                   1.023          (0.104)
Two-step GMM                   0.994          (0.040)
Two-step GMM small sample      0.994          (0.121)
One-step System-GMM            0.926          (0.023)
Two-step System-GMM small      0.911          (0.032)

Specification tests

Specification tests
There are several relevant aspects for the validity of the estimation that can be tested formally.

Orthogonality conditions: are the sample moments small enough not to reject that they are zero (overidentifying restrictions)?

Validity of a subset of more restrictive assumptions (difference Sargan test, Hausman test).

Serial correlation in the data: vital for the validity of the instruments (Arellano-Bond test).

Sargan/Hansen overidentifying restrictions test
The null hypothesis is that the orthogonality conditions are satisfied (i.e. moments are equal to zero).

The test can only be implemented if the problem is overidentified (otherwise the sample moments are exactly zero by construction).

The test is:

$$S = N J_N(\hat{\hat\beta}) = N\left[\left(\frac{1}{N}\sum_{i=1}^{N}\hat{\hat{\boldsymbol{u}}}_i'Z_i\right)\left(\frac{1}{N}\sum_{i=1}^{N} Z_i'\hat{\boldsymbol{u}}_i\hat{\boldsymbol{u}}_i'Z_i\right)^{-1}\left(\frac{1}{N}\sum_{i=1}^{N} Z_i'\hat{\hat{\boldsymbol{u}}}_i\right)\right],$$

where $\hat{\boldsymbol{u}}$ are the predicted residuals from the first step and $\hat{\hat{\boldsymbol{u}}}$ are those from the second step, and:

$$S \overset{a}{\sim} \chi^2(L - K).$$
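The statistic is a quadratic form in the averaged second-step moments, weighted by the inverse of the first-step moment covariance. The sketch below computes it from arrays of instruments and residuals; the function name `sargan_statistic`, the array layout and the tiny hand-checkable example are assumptions for illustration.

```python
import numpy as np

def sargan_statistic(Z, u_first, u_second):
    """Z: (N, T, L) instruments; u_first, u_second: (N, T) first- and second-step residuals."""
    N = Z.shape[0]
    m_first = np.einsum("ntl,nt->nl", Z, u_first)   # Z_i' u_hat_i, one row per unit
    V = m_first.T @ m_first / N                     # (1/N) sum_i Z_i' u_hat_i u_hat_i' Z_i
    b = np.einsum("ntl,nt->l", Z, u_second) / N     # (1/N) sum_i Z_i' u_hathat_i
    return float(N * b @ np.linalg.solve(V, b))

# Hand-checkable toy input: N = 2 units, T = 1 period, L = 1 instrument equal to one.
Z_toy = np.ones((2, 1, 1))
u1_toy = np.array([[1.0], [-1.0]])   # first-step residuals
u2_toy = np.array([[2.0], [0.0]])    # second-step residuals
S_toy = sargan_statistic(Z_toy, u1_toy, u2_toy)
print(S_toy)
```

In the toy input the weighting term equals 1 and the averaged moment equals 1, so the statistic is $N \cdot 1 \cdot 1 = 2$. In an application the value would be compared with a $\chi^2(L - K)$ critical value.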

Testing stronger assumptions
Some of the assumptions that we make are stronger than others.

If the problem is overidentified, we can test whether results change if we include or exclude the orthogonality conditions generated by them.

If they are true, they increase efficiency, but if not, the estimator is inconsistent!

Two ways of testing it:

Overidentifying restrictions (the difference in Sargan statistics should be close to zero: $\Delta S = S - S^A \overset{a}{\sim} \chi^2(L - L^A)$).

Differences in coefficients (Hausman test: $\hat q = \hat\delta_{GMM} - \hat\delta_{GMM}^A$).

Direct test for serial correlation
The test was proposed by Arellano and Bond (1991).

Tests for the presence of second order autocorrelation in the first-differenced residuals.

If differences in residuals are second-order correlated, some instruments would not be valid!

The test is:

$$m_2 = \frac{\widehat{\Delta\boldsymbol{v}}_{-2}'\,\widehat{\Delta\boldsymbol{v}}_*}{se} \overset{a}{\sim} N(0, 1),$$

where $\Delta\boldsymbol{v}_{-2}$ is the second lagged residual in differences, and $\Delta\boldsymbol{v}_*$ is the part of the vector of contemporaneous first differences for the periods that overlap with the second lagged vector.

Values close to zero do not reject the hypothesis of no serial correlation.
