Appendix B

Dynamic panel bias

The LSDV estimator is consistent for the static model, whether the effects are fixed or random. By contrast, the LSDV estimator is inconsistent for a dynamic panel data model with individual effects, whether the effects are fixed or random.

Definition 1 (Nickell's bias (1981)) The bias of the LSDV estimator in a dynamic model is generally known as dynamic panel bias or Nickell's bias (1981).

Definition 2 (AR(1) panel data model) Consider the simple AR(1) model

$$y_{it} = \rho\, y_{i,t-1} + \alpha_i + \varepsilon_{it}$$

for $i = 1, \ldots, n$ and $t = 1, \ldots, T$. For simplicity, let us assume that

$$\alpha_i = \alpha + \mu_i$$

to avoid imposing the restriction that $\sum_{i=1}^{n}\mu_i = 0$ or, in the case of random individual effects, $E(\mu_i) = 0$.

Assumptions

1. The autoregressive parameter satisfies $|\rho| < 1$.

2. The initial condition $y_{i0}$ is observable.

3. The error term $\varepsilon_{it}$ is $i.i.d.\left(0, \sigma_\varepsilon^2\right)$; i.e., $E(\varepsilon_{it}) = 0$, $E(\varepsilon_{it}\varepsilon_{js}) = \sigma_\varepsilon^2$ if $j = i$ and $t = s$, and $E(\varepsilon_{it}\varepsilon_{js}) = 0$ otherwise.
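For concreteness, here is a minimal simulation of this data-generating process (a sketch: the helper name simulate_panel, the normal draws for $\alpha_i$, and the burn-in device are illustrative assumptions, not part of the model):

```python
import numpy as np

def simulate_panel(n, T, rho, sigma=1.0, burn=200, seed=0):
    """Simulate y_it = rho*y_{i,t-1} + alpha_i + eps_it for i=1..n, t=0..T.

    A burn-in period approximates a start in the infinite past, so y_i0 is
    (approximately) a draw from the stationary distribution, as the
    derivation below assumes. Returns an (n, T+1) array whose column 0 is
    the observable initial condition y_i0.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n)              # individual effects alpha_i (assumed normal)
    y = np.zeros(n)
    for _ in range(burn):                   # burn-in towards stationarity
        y = rho * y + alpha + sigma * rng.normal(size=n)
    out = np.empty((n, T + 1))
    out[:, 0] = y                           # y_i0
    for t in range(1, T + 1):
        y = rho * y + alpha + sigma * rng.normal(size=n)
        out[:, t] = y
    return out

y = simulate_panel(n=500, T=6, rho=0.5)     # e.g. 500 individuals, T = 6
```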

In this AR(1) panel data model, we will show that

$$\operatorname*{plim}_{n\to\infty}\widehat{\rho}_{LSDV}\neq\rho \qquad \text{(dynamic panel bias)}$$

$$\operatorname*{plim}_{n,T\to\infty}\widehat{\rho}_{LSDV}=\rho$$

The LSDV estimator is defined by

$$\widehat{\alpha}_i=\bar{y}_i-\widehat{\rho}_{LSDV}\,\bar{y}_{i,-1}$$

$$\widehat{\rho}_{LSDV}=\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2\right)^{-1}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)\left(y_{it}-\bar{y}_i\right)$$

where

$$\bar{y}_i=\frac{1}{T}\sum_{t=1}^{T}y_{it},\qquad \bar{y}_{i,-1}=\frac{1}{T}\sum_{t=1}^{T}y_{i,t-1}$$
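As an illustration, the within transformation and the estimator above take only a few lines of numpy (a sketch; any (n, T+1) array of levels with $y_{i0}$ in column 0 will do, for example the output of the simulate_panel helper sketched earlier):

```python
import numpy as np

def lsdv_rho(y):
    """Within (LSDV) estimate of rho from an (n, T+1) array of levels whose
    column 0 holds y_i0: regress the within-demeaned y_it on the
    within-demeaned y_{i,t-1}, i.e. the formula displayed above."""
    y_lag, y_cur = y[:, :-1], y[:, 1:]              # y_{i,t-1} and y_it for t = 1..T
    x = y_lag - y_lag.mean(axis=1, keepdims=True)   # y_{i,t-1} - ybar_{i,-1}
    z = y_cur - y_cur.mean(axis=1, keepdims=True)   # y_it - ybar_i
    return (x * z).sum() / (x ** 2).sum()

# e.g. lsdv_rho(simulate_panel(n=500, T=6, rho=0.5)) comes out noticeably below 0.5
```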

Given that $y_{it}-\bar{y}_i=\rho\left(y_{i,t-1}-\bar{y}_{i,-1}\right)+\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)$, where $\bar{\varepsilon}_i=\frac{1}{T}\sum_{t=1}^{T}\varepsilon_{it}$, the LSDV estimator can be written as

$$\begin{aligned}
\widehat{\rho}_{LSDV}&=\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2\right)^{-1}\left\{\rho\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2+\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)\right\}\\
&=\rho+\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2\right)^{-1}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)
\end{aligned}$$

Definition 3 (bias) The bias of the LSDV estimator is defined by:

$$\widehat{\rho}_{LSDV}-\rho=\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2\right)^{-1}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)$$

The bias of the LSDV estimator can be rewritten as:

$$\widehat{\rho}_{LSDV}-\rho=\frac{\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)/nT}{\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2/nT} \tag{1}$$

Let us consider the numerator of equation (1). Because the $\varepsilon_{it}$ are (1) uncorrelated with the individual effects $\alpha_i$ and (2) independently and identically distributed, we have

$$\begin{aligned}
\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)
&=\underbrace{\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}y_{i,t-1}\varepsilon_{it}}_{N_1}-\underbrace{\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}y_{i,t-1}\bar{\varepsilon}_i}_{N_2}\\
&\quad-\underbrace{\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\bar{y}_{i,-1}\varepsilon_{it}}_{N_3}+\underbrace{\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\bar{y}_{i,-1}\bar{\varepsilon}_i}_{N_4}
\end{aligned} \tag{2}$$

Theorem 4 (Weak law of large numbers, Khinchine) If $\{X_i\}$, for $i = 1, 2, \ldots, m$, is a sequence of i.i.d. random variables with $E(X_i) = \mu < \infty$, then the sample mean converges in probability to $\mu$:

$$\frac{1}{m}\sum_{i=1}^{m}X_i\xrightarrow{p}E(X_i)=\mu \qquad\text{or}\qquad \operatorname*{plim}_{m\to\infty}\frac{1}{m}\sum_{i=1}^{m}X_i=E(X_i)=\mu$$

By application of the WLLN (Khinchine's theorem),

$$N_1=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}y_{i,t-1}\varepsilon_{it}=\frac{1}{T}\sum_{t=1}^{T}E\left(y_{i,t-1}\varepsilon_{it}\right)$$

Since (1) $y_{i,t-1}$ depends only on $\varepsilon_{i,t-1},\varepsilon_{i,t-2},\ldots$ and (2) the $\varepsilon_{it}$ are uncorrelated, we have

$$E\left(y_{i,t-1}\varepsilon_{it}\right)=0$$

and finally

$$N_1=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}y_{i,t-1}\varepsilon_{it}=0$$

For the second term $N_2$, we have:

$$N_2=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}y_{i,t-1}\bar{\varepsilon}_i=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\bar{\varepsilon}_i\sum_{t=1}^{T}y_{i,t-1}=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\bar{\varepsilon}_i\,T\,\bar{y}_{i,-1}=\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i$$

In the same way:

$$N_3=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\bar{y}_{i,-1}\varepsilon_{it}=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\bar{y}_{i,-1}\sum_{t=1}^{T}\varepsilon_{it}=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\bar{y}_{i,-1}\,T\,\bar{\varepsilon}_i=\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i$$

$$N_4=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\bar{y}_{i,-1}\bar{\varepsilon}_i=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}T\,\bar{y}_{i,-1}\bar{\varepsilon}_i=\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i$$

The bias expression in (2) can be rewritten as

$$\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)=0-\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i-\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i+\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i=-\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i$$

If this plim is not null, then the LSDV estimator $\widehat{\rho}_{LSDV}$ is biased when $n$ tends to infinity and $T$ is fixed. Let us examine this plim:

$$\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i$$

We know that

$$\begin{aligned}
y_{it}&=\rho\,y_{i,t-1}+\alpha_i+\varepsilon_{it}\\
&=\rho^2 y_{i,t-2}+(1+\rho)\,\alpha_i+\varepsilon_{it}+\rho\,\varepsilon_{i,t-1}\\
&=\rho^3 y_{i,t-3}+\left(1+\rho+\rho^2\right)\alpha_i+\varepsilon_{it}+\rho\,\varepsilon_{i,t-1}+\rho^2\varepsilon_{i,t-2}\\
&\;\;\vdots\\
&=\rho^t y_{i0}+\frac{1-\rho^t}{1-\rho}\,\alpha_i+\varepsilon_{it}+\rho\,\varepsilon_{i,t-1}+\rho^2\varepsilon_{i,t-2}+\cdots+\rho^{t-1}\varepsilon_{i1}
\end{aligned}$$

For any time $t$, we have:

$$y_{it}=\varepsilon_{it}+\rho\,\varepsilon_{i,t-1}+\rho^2\varepsilon_{i,t-2}+\cdots+\rho^{t-1}\varepsilon_{i1}+\frac{1-\rho^t}{1-\rho}\,\alpha_i+\rho^t y_{i0} \tag{3}$$

For $y_{i,t-1}$, we have:

$$y_{i,t-1}=\varepsilon_{i,t-1}+\rho\,\varepsilon_{i,t-2}+\rho^2\varepsilon_{i,t-3}+\cdots+\rho^{t-2}\varepsilon_{i1}+\frac{1-\rho^{t-1}}{1-\rho}\,\alpha_i+\rho^{t-1}y_{i0} \tag{4}$$
1

Summing $y_{i,t-1}$ over $t$, we get:

$$\sum_{t=1}^{T}y_{i,t-1}=\sum_{t=1}^{T}\sum_{l=0}^{t-2}\rho^l\varepsilon_{i,t-1-l}+\frac{(T-1)-T\rho+\rho^T}{(1-\rho)^2}\,\alpha_i+\frac{1-\rho^T}{1-\rho}\,y_{i0}$$

$$\bar{y}_{i,-1}=\frac{1}{T}\sum_{t=1}^{T}y_{i,t-1}=\frac{1}{T}\left\{\sum_{t=1}^{T}\sum_{l=0}^{t-2}\rho^l\varepsilon_{i,t-1-l}+\frac{(T-1)-T\rho+\rho^T}{(1-\rho)^2}\,\alpha_i+\frac{1-\rho^T}{1-\rho}\,y_{i0}\right\}$$

Finally, the plim is equal to

$$\begin{aligned}
\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i
&=\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\frac{1}{T}\left\{\sum_{t=1}^{T}\sum_{l=0}^{t-2}\rho^l\varepsilon_{i,t-1-l}+\frac{(T-1)-T\rho+\rho^T}{(1-\rho)^2}\,\alpha_i+\frac{1-\rho^T}{1-\rho}\,y_{i0}\right\}\left\{\frac{1}{T}\sum_{s=1}^{T}\varepsilon_{is}\right\}\\
&=\operatorname*{plim}_{n\to\infty}\frac{1}{nT^2}\sum_{i=1}^{n}\sum_{t=1}^{T}\sum_{s=1}^{T}\sum_{l=0}^{t-2}\rho^l\varepsilon_{is}\varepsilon_{i,t-1-l}+\frac{(T-1)-T\rho+\rho^T}{(1-\rho)^2}\operatorname*{plim}_{n\to\infty}\frac{1}{nT^2}\sum_{i=1}^{n}\sum_{s=1}^{T}\alpha_i\varepsilon_{is}\\
&\quad+\frac{1-\rho^T}{1-\rho}\operatorname*{plim}_{n\to\infty}\frac{1}{nT^2}\sum_{i=1}^{n}\sum_{s=1}^{T}y_{i0}\varepsilon_{is}
\end{aligned}$$

Because the $\varepsilon_{it}$ are i.i.d., by a law of large numbers we have:

$$\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i=\frac{1}{T^2}\sum_{t=1}^{T}\sum_{s=1}^{T}\sum_{l=0}^{t-2}\rho^l E\left(\varepsilon_{is}\varepsilon_{i,t-1-l}\right)+\frac{(T-1)-T\rho+\rho^T}{(1-\rho)^2}\,\frac{1}{T^2}\sum_{s=1}^{T}E\left(\alpha_i\varepsilon_{is}\right)+\frac{1-\rho^T}{1-\rho}\,\frac{1}{T^2}\sum_{s=1}^{T}E\left(y_{i0}\varepsilon_{is}\right)=\frac{\sigma_\varepsilon^2}{T^2}\sum_{t=1}^{T}\sum_{l=0}^{t-2}\rho^l$$

as

$$E\left(\varepsilon_{is}\varepsilon_{i,t-1-l}\right)=\begin{cases}\sigma_\varepsilon^2 & \text{if } s=t-1-l\\ 0 & \text{otherwise}\end{cases}$$

and

$$E\left(\alpha_i\varepsilon_{is}\right)=0 \text{ for all } i \text{ and } s, \qquad E\left(y_{i0}\varepsilon_{is}\right)=0 \text{ for } s=1,2,\ldots,T$$

Therefore,

$$\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i=\frac{\sigma_\varepsilon^2}{T^2}\sum_{t=1}^{T}\sum_{l=0}^{t-2}\rho^l=\frac{\sigma_\varepsilon^2}{T^2}\sum_{t=1}^{T}\frac{1-\rho^{t-1}}{1-\rho}=\frac{\sigma_\varepsilon^2}{T^2}\left\{\frac{T-\left(1-\rho^T\right)/(1-\rho)}{1-\rho}\right\}=\frac{\sigma_\varepsilon^2}{T^2}\,\frac{(T-1)-T\rho+\rho^T}{(1-\rho)^2}$$

Theorem 5 If the error (idiosyncratic) terms $\varepsilon_{it}$ are i.i.d. $\left(0,\sigma_\varepsilon^2\right)$, we have:

$$\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)=-\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i=-\frac{\sigma_\varepsilon^2}{T^2}\,\frac{(T-1)-T\rho+\rho^T}{(1-\rho)^2}$$
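The result of Theorem 5 can be checked numerically. The following sketch (all names are illustrative) simulates a large cross-section with a stationary initial condition and compares the sample average of $\bar{y}_{i,-1}\bar{\varepsilon}_i$ with the closed form:

```python
import numpy as np

# Numerical check of Theorem 5: for large n, the cross-sectional average of
# ybar_{i,-1} * epsbar_i approaches sigma^2 ((T-1) - T*rho + rho^T) / (T^2 (1-rho)^2).
rng = np.random.default_rng(1)
n, T, rho, sigma = 200_000, 5, 0.5, 1.0
alpha = rng.normal(size=n)
y = alpha / (1 - rho) + sigma * rng.normal(size=n) / np.sqrt(1 - rho**2)  # stationary y_i0
ybar_lag, ebar = np.zeros(n), np.zeros(n)
for t in range(1, T + 1):
    eps = sigma * rng.normal(size=n)
    ybar_lag += y / T                 # accumulates ybar_{i,-1} (y currently holds y_{i,t-1})
    y = rho * y + alpha + eps
    ebar += eps / T                   # accumulates epsbar_i
theory = sigma**2 * ((T - 1) - T * rho + rho**T) / (T**2 * (1 - rho) ** 2)
print(np.mean(ybar_lag * ebar), theory)   # both approx 0.245
```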

Consider the denominator of $\widehat{\rho}_{LSDV}$:

$$\begin{aligned}
\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2
&=\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)y_{i,t-1}\\
&=\sum_{i=1}^{n}\sum_{t=1}^{T}y_{i,t-1}^2-\sum_{i=1}^{n}\sum_{t=1}^{T}\bar{y}_{i,-1}\,y_{i,t-1}\\
&=\sum_{i=1}^{n}\sum_{t=1}^{T}y_{i,t-1}^2-\sum_{i=1}^{n}\bar{y}_{i,-1}\left(T\,\bar{y}_{i,-1}\right)\\
&=\sum_{i=1}^{n}\sum_{t=1}^{T}y_{i,t-1}^2-T\sum_{i=1}^{n}\bar{y}_{i,-1}^2
\end{aligned}$$

$$\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2=\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}y_{i,t-1}^2-\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}^2=\frac{1}{T}\sum_{t=1}^{T}E\left(y_{i,t-1}^2\right)-E\left(\bar{y}_{i,-1}^2\right) \tag{5}$$

From the result in (4), we have

$$y_{i,t-1}=\sum_{j=0}^{t-2}\rho^j\varepsilon_{i,t-1-j}+\frac{1-\rho^{t-1}}{1-\rho}\,\alpha_i+\rho^{t-1}y_{i0} \tag{6}$$

However, $y_{i0}=\rho\,y_{i,-1}+\alpha_i+\varepsilon_{i0}$, and solving recursively backwards (i.e., assuming the process started in the infinite past, so that $y_{i0}$ is a draw from the stationary distribution):

$$y_{i0}=\frac{\alpha_i}{1-\rho}+\frac{\varepsilon_{i0}}{1-\rho L}=\frac{\alpha_i}{1-\rho}+\sum_{j=0}^{\infty}\rho^j\varepsilon_{i,-j}$$

$$\Rightarrow\; E\left(y_{i0}\right)=\frac{\alpha_i}{1-\rho}$$

$$\Rightarrow\; \operatorname{var}\left(y_{i0}\right)=E\left[y_{i0}-E\left(y_{i0}\right)\right]^2=E\left(\sum_{j=0}^{\infty}\rho^j\varepsilon_{i,-j}\right)^2=\frac{\sigma_\varepsilon^2}{1-\rho^2}$$

$$\Rightarrow\; E\left(y_{i0}^2\right)=\frac{\sigma_\varepsilon^2}{1-\rho^2}+\left[E\left(y_{i0}\right)\right]^2=\frac{\sigma_\varepsilon^2}{1-\rho^2}+\frac{\alpha_i^2}{(1-\rho)^2} \tag{7}$$

$$\Rightarrow\; E\left(\alpha_i y_{i0}\right)=E\left\{\alpha_i\left(\frac{\alpha_i}{1-\rho}+\sum_{j=0}^{\infty}\rho^j\varepsilon_{i,-j}\right)\right\}=\frac{\alpha_i^2}{1-\rho} \tag{8}$$

From the expression in (6), we have:

$$y_{i,t-1}^2=\left(\sum_{j=0}^{t-2}\rho^j\varepsilon_{i,t-1-j}\right)^2+\left(\frac{1-\rho^{t-1}}{1-\rho}\right)^2\alpha_i^2+\rho^{2(t-1)}y_{i0}^2+2\,\rho^{t-1}\,\frac{1-\rho^{t-1}}{1-\rho}\,\alpha_i\,y_{i0}+\text{other cross-product terms}$$

$$E\left(y_{i,t-1}^2\right)=E\left(\sum_{j=0}^{t-2}\rho^j\varepsilon_{i,t-1-j}\right)^2+\left(\frac{1-\rho^{t-1}}{1-\rho}\right)^2\alpha_i^2+\rho^{2(t-1)}E\left(y_{i0}^2\right)+2\,\rho^{t-1}\,\frac{1-\rho^{t-1}}{1-\rho}\,E\left(\alpha_i y_{i0}\right)$$

as the expectations of the other cross-product terms are zero. Using the results in (7) and (8), we have

$$\begin{aligned}
E\left(y_{i,t-1}^2\right)&=\frac{\sigma_\varepsilon^2}{1-\rho^2}\left(1-\rho^{2(t-1)}\right)+\left(\frac{1-\rho^{t-1}}{1-\rho}\right)^2\alpha_i^2+\rho^{2(t-1)}\left(\frac{\sigma_\varepsilon^2}{1-\rho^2}+\frac{\alpha_i^2}{(1-\rho)^2}\right)+2\,\rho^{t-1}\left(1-\rho^{t-1}\right)\frac{\alpha_i^2}{(1-\rho)^2}\\
&=\frac{\sigma_\varepsilon^2}{1-\rho^2}\left\{1-\rho^{2(t-1)}+\rho^{2(t-1)}\right\}+\frac{\alpha_i^2}{(1-\rho)^2}\left\{\left(1-\rho^{t-1}\right)^2+\rho^{2(t-1)}+2\,\rho^{t-1}\left(1-\rho^{t-1}\right)\right\}\\
&=\frac{\sigma_\varepsilon^2}{1-\rho^2}+\frac{\alpha_i^2}{(1-\rho)^2}\left\{1-2\rho^{t-1}+\rho^{2(t-1)}+\rho^{2(t-1)}+2\rho^{t-1}-2\rho^{2(t-1)}\right\}\\
&=\frac{\sigma_\varepsilon^2}{1-\rho^2}+\frac{\alpha_i^2}{(1-\rho)^2} \tag{9}
\end{aligned}$$

The expression in (9) implies that

$$\frac{1}{T}\sum_{t=1}^{T}E\left(y_{i,t-1}^2\right)=\frac{1}{T}\sum_{t=1}^{T}\left\{\frac{\sigma_\varepsilon^2}{1-\rho^2}+\frac{\alpha_i^2}{(1-\rho)^2}\right\}=\frac{\sigma_\varepsilon^2}{1-\rho^2}+\frac{\alpha_i^2}{(1-\rho)^2} \tag{10}$$

However,

$$\begin{aligned}
y_{i1}&=\rho\,y_{i0}+\alpha_i+\varepsilon_{i1}\\
y_{i2}&=\rho\,y_{i1}+\alpha_i+\varepsilon_{i2}=\rho\left(\rho\,y_{i0}+\alpha_i+\varepsilon_{i1}\right)+\alpha_i+\varepsilon_{i2}\\
&=\rho^2 y_{i0}+(1+\rho)\,\alpha_i+\varepsilon_{i2}+\rho\,\varepsilon_{i1}=\rho^2 y_{i0}+\frac{1-\rho^2}{1-\rho}\,\alpha_i+\varepsilon_{i2}+\rho\,\varepsilon_{i1}\\
&\;\;\vdots\\
y_{i,T-1}&=\rho^{T-1}y_{i0}+\frac{1-\rho^{T-1}}{1-\rho}\,\alpha_i+\varepsilon_{i,T-1}+\rho\,\varepsilon_{i,T-2}+\cdots+\rho^{T-2}\varepsilon_{i1}
\end{aligned}$$

Therefore,

$$\sum_{t=1}^{T}y_{i,t-1}=\frac{1-\rho^T}{1-\rho}\,y_{i0}+\sum_{t=1}^{T-1}\frac{1-\rho^t}{1-\rho}\,\alpha_i+\sum_{t=1}^{T-1}\frac{1-\rho^{T-t}}{1-\rho}\,\varepsilon_{it}$$

Hence

$$\bar{y}_{i,-1}=\frac{1}{T}\sum_{t=1}^{T}y_{i,t-1}=\frac{1}{T}\left\{\frac{1-\rho^T}{1-\rho}\,y_{i0}+\sum_{t=1}^{T-1}\frac{1-\rho^t}{1-\rho}\,\alpha_i+\sum_{t=1}^{T-1}\frac{1-\rho^{T-t}}{1-\rho}\,\varepsilon_{it}\right\}
=\frac{1}{T}\left\{\frac{1-\rho^T}{1-\rho}\,y_{i0}+\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)\frac{\alpha_i}{1-\rho}+\sum_{t=1}^{T-1}\frac{1-\rho^{T-t}}{1-\rho}\,\varepsilon_{it}\right\}$$

and

$$\bar{y}_{i,-1}^2=\frac{1}{T^2}\left\{\left(\frac{1-\rho^T}{1-\rho}\right)^2 y_{i0}^2+\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)^2\frac{\alpha_i^2}{(1-\rho)^2}+\left(\sum_{t=1}^{T-1}\frac{1-\rho^{T-t}}{1-\rho}\,\varepsilon_{it}\right)^2+2\,\frac{1-\rho^T}{1-\rho}\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)\frac{\alpha_i\,y_{i0}}{1-\rho}+\text{other cross-product terms}\right\}$$

Given that the other cross-product terms have zero expectation, and using the results in (7) and (8), we obtain

$$\begin{aligned}
E\left(\bar{y}_{i,-1}^2\right)&=\frac{1}{T^2}\left\{\left(\frac{1-\rho^T}{1-\rho}\right)^2\left(\frac{\sigma_\varepsilon^2}{1-\rho^2}+\frac{\alpha_i^2}{(1-\rho)^2}\right)+\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)^2\frac{\alpha_i^2}{(1-\rho)^2}\right.\\
&\qquad\left.+\sum_{t=1}^{T-1}\left(\frac{1-\rho^{T-t}}{1-\rho}\right)^2\sigma_\varepsilon^2+2\,\frac{1-\rho^T}{1-\rho}\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)\frac{\alpha_i^2}{(1-\rho)^2}\right\}\\
&=\frac{\sigma_\varepsilon^2}{T^2}\left\{\frac{1}{1-\rho^2}\left(\frac{1-\rho^T}{1-\rho}\right)^2+\frac{1}{(1-\rho)^2}\sum_{t=1}^{T-1}\left(1-\rho^{T-t}\right)^2\right\}\\
&\qquad+\frac{\alpha_i^2}{T^2(1-\rho)^2}\left\{\left(\frac{1-\rho^T}{1-\rho}\right)^2+\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)^2+2\,\frac{1-\rho^T}{1-\rho}\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)\right\} \tag{11}
\end{aligned}$$

The first term on the right-hand side of (11) simplifies to

$$\begin{aligned}
&\frac{\sigma_\varepsilon^2}{T^2}\left\{\frac{1}{1-\rho^2}\left(\frac{1-\rho^T}{1-\rho}\right)^2+\frac{1}{(1-\rho)^2}\left[(T-1)-\frac{2\rho\left(1-\rho^{T-1}\right)}{1-\rho}+\frac{\rho^2\left(1-\rho^{2(T-1)}\right)}{1-\rho^2}\right]\right\}\\
&\quad=\frac{\sigma_\varepsilon^2}{T^2(1-\rho)^2}\cdot\frac{T-T\rho^2-2\rho+2\rho^{T+1}}{1-\rho^2}
=\frac{\sigma_\varepsilon^2}{T^2(1-\rho)^2}\cdot\frac{T(1-\rho)^2+2\rho\left(T-T\rho-1+\rho^T\right)}{1-\rho^2}\\
&\quad=\frac{\sigma_\varepsilon^2}{1-\rho^2}\left\{\frac{1}{T}+\frac{2\rho}{T^2}\cdot\frac{T-T\rho-1+\rho^T}{(1-\rho)^2}\right\} \tag{12}
\end{aligned}$$

using $\sum_{t=1}^{T-1}\left(1-\rho^{T-t}\right)^2=(T-1)-\dfrac{2\rho\left(1-\rho^{T-1}\right)}{1-\rho}+\dfrac{\rho^2\left(1-\rho^{2(T-1)}\right)}{1-\rho^2}$.

On the other hand, the second term on the right-hand side of (11) simplifies to

$$\begin{aligned}
&\frac{\alpha_i^2}{T^2(1-\rho)^2}\left\{\left(\frac{1-\rho^T}{1-\rho}\right)^2+\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)^2+2\,\frac{1-\rho^T}{1-\rho}\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)\right\}\\
&\quad=\frac{\alpha_i^2}{T^2(1-\rho)^2}\left\{\frac{1-\rho^T}{1-\rho}+(T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right\}^2 \qquad\text{(a perfect square)}\\
&\quad=\frac{\alpha_i^2}{T^2(1-\rho)^2}\left\{(T-1)+\frac{1-\rho^T-\rho+\rho^T}{1-\rho}\right\}^2
=\frac{\alpha_i^2}{T^2(1-\rho)^2}\,T^2=\frac{\alpha_i^2}{(1-\rho)^2} \tag{13}
\end{aligned}$$

From the results in (12) and (13), the expression in (11) becomes

$$E\left(\bar{y}_{i,-1}^2\right)=\frac{\sigma_\varepsilon^2}{1-\rho^2}\left\{\frac{1}{T}+\frac{2\rho}{T^2}\cdot\frac{T-T\rho-1+\rho^T}{(1-\rho)^2}\right\}+\frac{\alpha_i^2}{(1-\rho)^2} \tag{14}$$

The results in (10) and (14) imply that the plim in (5) is given by

$$\begin{aligned}
\operatorname*{plim}_{n\to\infty}\frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2&=\frac{1}{T}\sum_{t=1}^{T}E\left(y_{i,t-1}^2\right)-E\left(\bar{y}_{i,-1}^2\right)\\
&=\frac{\sigma_\varepsilon^2}{1-\rho^2}+\frac{\alpha_i^2}{(1-\rho)^2}-\frac{\sigma_\varepsilon^2}{1-\rho^2}\left\{\frac{1}{T}+\frac{2\rho}{T^2}\cdot\frac{T-T\rho-1+\rho^T}{(1-\rho)^2}\right\}-\frac{\alpha_i^2}{(1-\rho)^2}\\
&=\frac{\sigma_\varepsilon^2}{1-\rho^2}\left\{1-\frac{1}{T}-\frac{2\rho}{T^2}\cdot\frac{T-T\rho-1+\rho^T}{(1-\rho)^2}\right\} \tag{15}
\end{aligned}$$

From Theorem 5 and the result in (15), we have

$$\begin{aligned}
\operatorname*{plim}_{n\to\infty}\left(\widehat{\rho}_{LSDV}-\rho\right)
&=\frac{\operatorname*{plim}_{n\to\infty}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)/nT}{\operatorname*{plim}_{n\to\infty}\sum_{i=1}^{n}\sum_{t=1}^{T}\left(y_{i,t-1}-\bar{y}_{i,-1}\right)^2/nT}\\
&=\frac{-\dfrac{\sigma_\varepsilon^2}{T^2}\,\dfrac{T-1-T\rho+\rho^T}{(1-\rho)^2}}{\dfrac{\sigma_\varepsilon^2}{1-\rho^2}\left\{1-\dfrac{1}{T}-\dfrac{2\rho}{T^2}\,\dfrac{T-T\rho-1+\rho^T}{(1-\rho)^2}\right\}}\\
&=-\frac{(1+\rho)\left(T-1-T\rho+\rho^T\right)}{(1-\rho)\left\{T^2-T-\dfrac{2\rho}{(1-\rho)^2}\left(T-T\rho-1+\rho^T\right)\right\}}
\end{aligned}$$

The above expression implies that the semi-asymptotic bias can be rewritten as:

$$\operatorname*{plim}_{n\to\infty}\left(\widehat{\rho}_{LSDV}-\rho\right)=-\frac{(1+\rho)\left(T-1-T\rho+\rho^T\right)}{(1-\rho)\left\{T^2-T-\dfrac{2\rho}{(1-\rho)^2}\left(T-T\rho-1+\rho^T\right)\right\}}$$

Fact 1 If $T$ also tends to infinity, then the numerator converges to zero and the denominator converges to the nonzero constant $\sigma_\varepsilon^2/\left(1-\rho^2\right)$; hence the LSDV estimators of $\rho$ and $\alpha_i$ are consistent.

Fact 2 If $T$ is fixed, then the denominator is a nonzero constant, and $\widehat{\rho}_{LSDV}$ and $\widehat{\alpha}_i$ are inconsistent estimators, however large $n$ is.

Theorem 6 (Dynamic panel bias) In a dynamic panel AR(1) model with individual effects, the semi-asymptotic bias (as $n\to\infty$ with $T$ fixed) of the LSDV estimator of the autoregressive parameter is equal to:

$$\operatorname*{plim}_{n\to\infty}\left(\widehat{\rho}_{LSDV}-\rho\right)=-\frac{(1+\rho)\left(T-1-T\rho+\rho^T\right)}{(1-\rho)\left\{T^2-T-\dfrac{2\rho}{(1-\rho)^2}\left(T-T\rho-1+\rho^T\right)\right\}}$$
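The closed form of Theorem 6 can be verified by Monte Carlo. The sketch below (all function and variable names are assumptions) simulates a panel with a stationary initial condition, computes the LSDV estimate, and compares the empirical bias with the formula:

```python
import numpy as np

def nickell_bias(rho, T):
    """Closed-form semi-asymptotic bias of Theorem 6 (n -> infinity, T fixed)."""
    h = T - 1 - T * rho + rho**T
    return -(1 + rho) * h / ((1 - rho) * (T**2 - T - 2 * rho * h / (1 - rho) ** 2))

rng = np.random.default_rng(2)
n, T, rho = 20_000, 6, 0.4
alpha = rng.normal(size=n)
y = alpha / (1 - rho) + rng.normal(size=n) / np.sqrt(1 - rho**2)  # stationary y_i0
panel = [y]
for t in range(T):
    y = rho * y + alpha + rng.normal(size=n)
    panel.append(y)
panel = np.column_stack(panel)                       # shape (n, T+1)
x = panel[:, :-1] - panel[:, :-1].mean(axis=1, keepdims=True)   # within-demeaned lag
z = panel[:, 1:] - panel[:, 1:].mean(axis=1, keepdims=True)     # within-demeaned level
rho_lsdv = (x * z).sum() / (x**2).sum()
print(rho_lsdv - rho, nickell_bias(rho, T))          # both approx -0.251
```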

Theorem 7 For an AR(1) model, the dynamic panel bias can be rewritten as:

$$\operatorname*{plim}_{n\to\infty}\left(\widehat{\rho}_{LSDV}-\rho\right)=-\frac{1+\rho}{T-1}\left(1-\frac{1}{T}\,\frac{1-\rho^T}{1-\rho}\right)\left\{1-\frac{2\rho}{(1-\rho)(T-1)}\left(1-\frac{1-\rho^T}{T(1-\rho)}\right)\right\}^{-1}$$
Fact 3 The dynamic bias of $\widehat{\rho}_{LSDV}$ is caused by having to eliminate the individual effects $\alpha_i$ from each observation, which creates a correlation of order $1/T$ between the explanatory variable and the residual of the transformed (within) model:

$$y_{it}-\bar{y}_i=\rho\underbrace{\left(y_{i,t-1}-\bar{y}_{i,-1}\right)}_{\text{depends on past values of }\varepsilon_{it}}+\underbrace{\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)}_{\text{depends on past values of }\varepsilon_{it}}$$

Intuition of the dynamic bias

$$y_{it}-\bar{y}_i=\rho\left(y_{i,t-1}-\bar{y}_{i,-1}\right)+\left(\varepsilon_{it}-\bar{\varepsilon}_i\right)$$

with $\operatorname{cov}\left(\bar{y}_{i,-1},\bar{\varepsilon}_i\right)\neq 0$ since

$$\begin{aligned}
\operatorname{cov}\left(\bar{y}_{i,-1},\bar{\varepsilon}_i\right)
&=\operatorname{cov}\left(\frac{1}{T}\sum_{t=1}^{T}y_{i,t-1},\ \frac{1}{T}\sum_{t=1}^{T}\varepsilon_{it}\right)\\
&=\operatorname{cov}\left(\frac{1}{T}\left\{\frac{1-\rho^T}{1-\rho}\,y_{i0}+\left((T-1)-\frac{\rho\left(1-\rho^{T-1}\right)}{1-\rho}\right)\frac{\alpha_i}{1-\rho}+\sum_{t=1}^{T-1}\frac{1-\rho^{T-t}}{1-\rho}\,\varepsilon_{it}\right\},\ \frac{1}{T}\sum_{t=1}^{T}\varepsilon_{it}\right)\\
&=\frac{\sigma_\varepsilon^2}{T^2}\sum_{t=1}^{T-1}\frac{1-\rho^{T-t}}{1-\rho},\quad\text{as the }\varepsilon_{it}\text{ are }i.i.d.\left(0,\sigma_\varepsilon^2\right)\\
&=\frac{\sigma_\varepsilon^2}{T^2}\,\frac{(T-1)(1-\rho)-\rho\left(1-\rho^{T-1}\right)}{(1-\rho)^2}
=\frac{\sigma_\varepsilon^2}{T^2}\,\frac{T(1-\rho)-\left(1-\rho^T\right)}{(1-\rho)^2}
\end{aligned}$$

Therefore,

$$\operatorname*{plim}_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}\bar{y}_{i,-1}\bar{\varepsilon}_i=\operatorname{cov}\left(\bar{y}_{i,-1},\bar{\varepsilon}_i\right)=\frac{\sigma_\varepsilon^2}{T^2}\,\frac{T(1-\rho)-\left(1-\rho^T\right)}{(1-\rho)^2}$$

Remark 8

$$\operatorname*{plim}_{n\to\infty}\left(\widehat{\rho}_{LSDV}-\rho\right)=-\frac{1+\rho}{T-1}\left(1-\frac{1}{T}\,\frac{1-\rho^T}{1-\rho}\right)\left\{1-\frac{2\rho}{(1-\rho)(T-1)}\left(1-\frac{1-\rho^T}{T(1-\rho)}\right)\right\}^{-1}$$

1. When $T$ is large, the right-hand-side variables become asymptotically uncorrelated.

2. For small $T$, this bias is always negative if $\rho > 0$.

3. The bias does not go to zero as $\rho$ goes to zero.
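A quick numerical illustration of these three remarks (a sketch; nickell_bias is the illustrative helper sketched after Theorem 6, repeated here so the snippet is self-contained):

```python
import numpy as np

def nickell_bias(rho, T):
    # closed form of Theorem 6 / Remark 8
    h = T - 1 - T * rho + rho**T
    return -(1 + rho) * h / ((1 - rho) * (T**2 - T - 2 * rho * h / (1 - rho) ** 2))

for T in (3, 10, 30, 100):
    print(T, round(nickell_bias(0.5, T), 4))  # approx -0.536, -0.162, -0.052, -0.015:
                                              # negative for rho > 0, shrinking roughly as 1/T
print(nickell_bias(1e-12, 10))                # approx -1/T = -0.1: nonzero even as rho -> 0
```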

Estimation of an AR(1) Panel Data Model

To solve the inconsistency problem of the FE estimator of an AR(1) panel data model, we will use a first-difference transformation to eliminate the individual effects $\alpha_i$. This gives

$$y_{it}-y_{i,t-1}=\rho\left(y_{i,t-1}-y_{i,t-2}\right)+\left(\varepsilon_{it}-\varepsilon_{i,t-1}\right) \tag{1}$$

If we estimate (1) by OLS, we do not get a consistent estimator for $\rho$, because $y_{i,t-1}$ and $\varepsilon_{i,t-1}$ are, by definition, correlated, even if $T\to\infty$. However, this transformed model suggests an instrumental variables approach. For example, $y_{i,t-2}$ is correlated with $y_{i,t-1}-y_{i,t-2}$ but not with $\varepsilon_{it}-\varepsilon_{i,t-1}$, unless $\varepsilon_{it}$ exhibits autocorrelation (which we excluded by assumption). This suggests the instrumental variables estimator

$$\widehat{\rho}_{IV}=\frac{\sum_{i=1}^{n}\sum_{t=2}^{T}y_{i,t-2}\left(y_{it}-y_{i,t-1}\right)}{\sum_{i=1}^{n}\sum_{t=2}^{T}y_{i,t-2}\left(y_{i,t-1}-y_{i,t-2}\right)} \tag{2}$$

A necessary condition for consistency of this estimator is that

$$\operatorname*{plim}\frac{1}{n(T-1)}\sum_{i=1}^{n}\sum_{t=2}^{T}y_{i,t-2}\left(\varepsilon_{it}-\varepsilon_{i,t-1}\right)=E\left\{y_{i,t-2}\left(\varepsilon_{it}-\varepsilon_{i,t-1}\right)\right\}=0 \tag{3}$$

for either $T$, or $n$, or both going to infinity. The estimator in (2) is one of the estimators proposed by Anderson and Hsiao (1981). They also proposed an alternative, where $y_{i,t-2}-y_{i,t-3}$ is used as an instrument. This gives

$$\widehat{\rho}_{IV}^{(2)}=\frac{\sum_{i=1}^{n}\sum_{t=3}^{T}\left(y_{i,t-2}-y_{i,t-3}\right)\left(y_{it}-y_{i,t-1}\right)}{\sum_{i=1}^{n}\sum_{t=3}^{T}\left(y_{i,t-2}-y_{i,t-3}\right)\left(y_{i,t-1}-y_{i,t-2}\right)} \tag{4}$$

which is consistent (under regularity conditions) if

$$\operatorname*{plim}\frac{1}{n(T-2)}\sum_{i=1}^{n}\sum_{t=3}^{T}\left(y_{i,t-2}-y_{i,t-3}\right)\left(\varepsilon_{it}-\varepsilon_{i,t-1}\right)=E\left\{\left(y_{i,t-2}-y_{i,t-3}\right)\left(\varepsilon_{it}-\varepsilon_{i,t-1}\right)\right\}=0 \tag{5}$$

Consistency of both of these estimators is guaranteed by the assumption that $\varepsilon_{it}$ has no autocorrelation.

Note that the second instrumental variables estimator requires an additional lag to construct the instrument, so that the effective number of observations used in estimation is reduced (one sample period is 'lost').
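A compact implementation of both Anderson-Hsiao estimators (a sketch under the assumptions above; the function name anderson_hsiao and the array layout are illustrative):

```python
import numpy as np

def anderson_hsiao(y, instrument="level"):
    """Anderson-Hsiao IV estimates of rho, following (2) and (4).

    y is an (n, T+1) array of levels with y_i0 in column 0. With
    instrument="level", y_{i,t-2} instruments (y_{i,t-1} - y_{i,t-2}) for
    t = 2..T; with instrument="diff", (y_{i,t-2} - y_{i,t-3}) is used for
    t = 3..T, costing one extra sample period.
    """
    dy = np.diff(y, axis=1)              # Delta y_it for t = 1..T (columns 0..T-1)
    if instrument == "level":
        z = y[:, :-2]                    # y_{i,t-2} for t = 2..T
        num = (z * dy[:, 1:]).sum()      # sum of z * (y_it - y_{i,t-1})
        den = (z * dy[:, :-1]).sum()     # sum of z * (y_{i,t-1} - y_{i,t-2})
    else:
        z = dy[:, :-2]                   # y_{i,t-2} - y_{i,t-3} for t = 3..T
        num = (z * dy[:, 2:]).sum()
        den = (z * dy[:, 1:-1]).sum()
    return num / den
```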

A method of moments approach can unify the estimators and eliminate the disadvantage of reduced sample sizes. Both IV estimators impose one moment condition in estimation; it is well known that imposing more moment conditions increases the efficiency of the estimators (provided the additional conditions are valid, of course).

Arellano and Bond (1991) suggest that the list of instruments can be extended by exploiting additional moment conditions and letting their number vary with $t$. To do this, they keep $T$ fixed. For example, when $T=4$, we have

$$E\left\{\left(\varepsilon_{i2}-\varepsilon_{i1}\right)y_{i0}\right\}=0$$

as the moment condition for $t=2$. For $t=3$, we have

$$E\left\{\left(\varepsilon_{i3}-\varepsilon_{i2}\right)y_{i1}\right\}=0$$

but it also holds that

$$E\left\{\left(\varepsilon_{i3}-\varepsilon_{i2}\right)y_{i0}\right\}=0$$

For period $t=4$, we have three moment conditions and three valid instruments:

$$E\left\{\left(\varepsilon_{i4}-\varepsilon_{i3}\right)y_{i0}\right\}=0,\qquad E\left\{\left(\varepsilon_{i4}-\varepsilon_{i3}\right)y_{i1}\right\}=0,\qquad E\left\{\left(\varepsilon_{i4}-\varepsilon_{i3}\right)y_{i2}\right\}=0$$

All these moment conditions can be exploited in a GMM framework. To introduce the GMM estimator, define, for general sample size $T$,

$$\Delta\varepsilon_i=\begin{pmatrix}\varepsilon_{i2}-\varepsilon_{i1}\\ \vdots\\ \varepsilon_{iT}-\varepsilon_{i,T-1}\end{pmatrix} \tag{6}$$

as the vector of transformed error terms, and

$$Z_i=\begin{pmatrix}\left[y_{i0}\right]&0&\cdots&0\\ 0&\left[y_{i0},\,y_{i1}\right]&&0\\ \vdots&&\ddots&\vdots\\ 0&0&\cdots&\left[y_{i0},\ldots,y_{i,T-2}\right]\end{pmatrix} \tag{7}$$

as the matrix of instruments. Each row in the matrix $Z_i$ contains the instruments that are valid for a given period. Consequently, the set of all moment conditions can be written concisely as

$$E\left\{Z_i'\,\Delta\varepsilon_i\right\}=0 \tag{8}$$

Note that these are $1+2+3+\cdots+(T-1)=\frac{1}{2}T(T-1)$ moment conditions. To derive the GMM estimator, write this as

$$E\left\{Z_i'\left(\Delta y_i-\rho\,\Delta y_{i,-1}\right)\right\}=0 \tag{9}$$
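For a given individual, the instrument matrix (7) can be built directly from the observed levels; the sketch below uses a dense array for clarity (real implementations typically use sparse blocks, and the helper name is an assumption):

```python
import numpy as np

def instrument_matrix(y_i):
    """Build Z_i of (7) for one individual from the levels (y_i0, ..., y_iT).

    Row t-2 (for t = 2, ..., T) holds the instruments [y_i0, ..., y_{i,t-2}]
    valid for the differenced equation of period t; zeros elsewhere.
    """
    T = len(y_i) - 1
    q = T * (T - 1) // 2                          # total number of moment conditions
    Z = np.zeros((T - 1, q))
    col = 0
    for t in range(2, T + 1):                     # equation for period t
        Z[t - 2, col:col + t - 1] = y_i[:t - 1]   # instruments y_i0, ..., y_{i,t-2}
        col += t - 1
    return Z

Z = instrument_matrix(np.arange(5.0))             # T = 4: Z is 3 x 6, since 1+2+3 = 6
print(Z.shape)
```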

Because the number of moment conditions will typically exceed the number of unknown coefficients, we estimate $\rho$ by minimizing a quadratic expression in terms of the corresponding sample moments, that is,

$$\min_{\rho}\ \left[\frac{1}{n}\sum_{i=1}^{n}Z_i'\left(\Delta y_i-\rho\,\Delta y_{i,-1}\right)\right]'W_n\left[\frac{1}{n}\sum_{i=1}^{n}Z_i'\left(\Delta y_i-\rho\,\Delta y_{i,-1}\right)\right] \tag{10}$$

where $W_n$ is a symmetric positive definite weighting matrix. Differentiating this with respect to $\rho$ and solving for $\rho$ gives

$$\widehat{\rho}_{GMM}=\left(\left(\sum_{i=1}^{n}\Delta y_{i,-1}'\,Z_i\right)W_n\left(\sum_{i=1}^{n}Z_i'\,\Delta y_{i,-1}\right)\right)^{-1}\left(\sum_{i=1}^{n}\Delta y_{i,-1}'\,Z_i\right)W_n\left(\sum_{i=1}^{n}Z_i'\,\Delta y_i\right) \tag{11}$$

The properties of this estimator depend upon the choice of $W_n$, although it is consistent as long as $W_n$ is positive definite, for example, for $W_n=I$, the identity matrix.

The optimal weighting matrix is the one that gives the most efficient estimator, i.e., the one that gives the smallest asymptotic covariance matrix for $\widehat{\rho}_{GMM}$. From the general theory of GMM, we know that the optimal weighting matrix is (asymptotically) proportional to the inverse of the covariance matrix of the sample moments. In this case, this means that the optimal weighting matrix should satisfy

$$\operatorname*{plim}_{n\to\infty}W_n=\left[\operatorname{var}\left(Z_i'\,\Delta\varepsilon_i\right)\right]^{-1}=\left[E\left(Z_i'\,\Delta\varepsilon_i\,\Delta\varepsilon_i'\,Z_i\right)\right]^{-1} \tag{12}$$

In the standard case where no restrictions are imposed upon the covariance matrix of $\Delta\varepsilon_i$, this matrix can be estimated using a first-step consistent estimator of $\rho$ and replacing the expectation operator by a sample average. This gives

$$\widehat{W}_n^{opt}=\left(\frac{1}{n}\sum_{i=1}^{n}Z_i'\,\Delta\widehat{\varepsilon}_i\,\Delta\widehat{\varepsilon}_i'\,Z_i\right)^{-1} \tag{13}$$

where $\Delta\widehat{\varepsilon}_i$ is the residual vector from a first-step consistent estimator, for example one using $W_n=I$.

The general GMM approach does not impose that $\varepsilon_{it}$ is i.i.d. over individuals and time, and the optimal weighting matrix is thus estimated without imposing these restrictions.

Note, however, that the absence of autocorrelation was needed to guarantee the validity of the moment conditions.

Instead of estimating an unrestricted optimal weighting matrix, it is possible (and potentially advisable in small samples) to impose the absence of autocorrelation in $\varepsilon_{it}$, combined with a homoskedasticity assumption. Note that under these restrictions

$$E\left(\Delta\varepsilon_i\,\Delta\varepsilon_i'\right)=\sigma_\varepsilon^2\,G=\sigma_\varepsilon^2\begin{pmatrix}2&-1&0&\cdots&0\\-1&2&-1&&\vdots\\0&-1&2&\ddots&0\\\vdots&&\ddots&\ddots&-1\\0&\cdots&0&-1&2\end{pmatrix} \tag{14}$$

and the optimal weighting matrix can be determined as

$$W_n^{opt}=\left(\frac{1}{n}\sum_{i=1}^{n}Z_i'\,G\,Z_i\right)^{-1} \tag{15}$$

Note that this matrix does not involve unknown parameters, so that the optimal GMM estimator can be computed in one step if the original errors $\varepsilon_{it}$ are assumed to be homoskedastic and to exhibit no autocorrelation.
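Putting (7), (11), and (15) together gives a one-step GMM estimator. The following sketch is illustrative, not production code; instrument_matrix is repeated from the earlier sketch so the snippet is self-contained:

```python
import numpy as np

def instrument_matrix(y_i):
    """Z_i of (7): row t-2 holds [y_i0, ..., y_{i,t-2}] for period t."""
    T = len(y_i) - 1
    q = T * (T - 1) // 2
    Z = np.zeros((T - 1, q))
    col = 0
    for t in range(2, T + 1):
        Z[t - 2, col:col + t - 1] = y_i[:t - 1]
        col += t - 1
    return Z

def gmm_one_step(y):
    """One-step GMM estimate of rho from (11), with the weighting matrix (15)
    that is optimal under homoskedastic, non-autocorrelated errors.
    y is an (n, T+1) array of levels with y_i0 in column 0 (a sketch)."""
    n, T = y.shape[0], y.shape[1] - 1
    dy = np.diff(y, axis=1)                    # Delta y_{i1}, ..., Delta y_{iT}
    dy_dep, dy_lag = dy[:, 1:], dy[:, :-1]     # Delta y_it and Delta y_{i,t-1}, t = 2..T
    G = 2 * np.eye(T - 1) - np.eye(T - 1, k=1) - np.eye(T - 1, k=-1)  # matrix G of (14)
    q = T * (T - 1) // 2
    ZGZ = np.zeros((q, q))
    Zy, Zylag = np.zeros(q), np.zeros(q)
    for i in range(n):                         # accumulate the sample moments
        Zi = instrument_matrix(y[i])
        ZGZ += Zi.T @ G @ Zi
        Zy += Zi.T @ dy_dep[i]
        Zylag += Zi.T @ dy_lag[i]
    W = np.linalg.inv(ZGZ / n)                 # W_n^opt of (15)
    return (Zylag @ W @ Zy) / (Zylag @ W @ Zylag)   # scalar version of (11)
```

A two-step variant would replace W with the unrestricted estimate (13) built from the first-step residuals.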

Under weak regularity conditions, the GMM estimator for $\rho$ is asymptotically normal for $n\to\infty$ and fixed $T$, with its covariance matrix given by

$$\operatorname*{plim}_{n\to\infty}\left(\left(\frac{1}{n}\sum_{i=1}^{n}\Delta y_{i,-1}'\,Z_i\right)\left(\frac{1}{n}\sum_{i=1}^{n}Z_i'\,\Delta\varepsilon_i\,\Delta\varepsilon_i'\,Z_i\right)^{-1}\left(\frac{1}{n}\sum_{i=1}^{n}Z_i'\,\Delta y_{i,-1}\right)\right)^{-1} \tag{16}$$

With i.i.d. errors, the middle term reduces to

$$\left(\sigma_\varepsilon^2\,\frac{1}{n}\sum_{i=1}^{n}Z_i'\,G\,Z_i\right)^{-1}=\sigma_\varepsilon^{-2}\,W_n^{opt}$$

Alvarez and Arellano (2003) showed that, in general, the GMM estimator is also consistent when both $n$ and $T$ tend to infinity, despite the fact that the number of moment conditions tends to infinity with the sample size. For large $T$, however, the GMM estimator will be close to the fixed effects estimator, which is a more attractive alternative.

Dynamic Panel Data Models with Exogenous Variables

If the model also contains exogenous variables, we have

$$y_{it}=x_{it}'\beta+\rho\,y_{i,t-1}+\alpha_i+\varepsilon_{it} \tag{1}$$

which can also be estimated by the generalized instrumental variables or GMM approach. Depending upon the assumptions made about $x_{it}$, different sets of additional instruments can be constructed. If the $x_{it}$ are strictly exogenous, in the sense that they are uncorrelated with all of the $\varepsilon_{is}$ error terms, we also have that

$$E\left\{x_{is}\,\Delta\varepsilon_{it}\right\}=0 \quad\text{for each } s,\ t \tag{2}$$

so that $x_{i1},\ldots,x_{iT}$ can be added to the instrument list for the first-differenced equation in each period. This would make the number of rows in $Z_i'$ quite large. Instead, almost the same level of information may be retained when the first-differenced $x_{it}$'s are used as their own instruments. In this case, we impose the moment conditions

$$E\left\{\Delta x_{it}\,\Delta\varepsilon_{it}\right\}=0 \quad\text{for each } t \tag{3}$$

and the instrument matrix can be written as

$$Z_i=\begin{pmatrix}\left[y_{i0},\ \Delta x_{i2}'\right]&0&\cdots&0\\ 0&\left[y_{i0},\ y_{i1},\ \Delta x_{i3}'\right]&&0\\ \vdots&&\ddots&\vdots\\ 0&0&\cdots&\left[y_{i0},\ldots,y_{i,T-2},\ \Delta x_{iT}'\right]\end{pmatrix}$$

If the $x_{it}$ variables are not strictly exogenous but predetermined, in which case current and lagged $x_{it}$'s are uncorrelated with current error terms, we only have that $E\left\{x_{it}\,\varepsilon_{is}\right\}=0$ for $s\geq t$. In this case, only $x_{i,t-1},\ldots,x_{i1}$ are valid instruments for the first-differenced equation in period $t$. Thus, the moment conditions that can be imposed are

$$E\left\{x_{i,t-j}\,\Delta\varepsilon_{it}\right\}=0 \quad\text{for } j=1,\ldots,t-1 \text{ (for each } t\text{)} \tag{4}$$

In practice, a combination of strictly exogenous and predetermined $x$-variables may occur rather than one of these two extreme cases. The matrix $Z_i$ should then be adjusted accordingly.
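A sketch of how the instrument set changes with the exogeneity assumption (the helper instrument_row and the array shapes are hypothetical, chosen only to mirror the moment conditions (3) and (4)):

```python
import numpy as np

def instrument_row(y_i, x_i, t, strictly_exogenous=True):
    """Instruments for the period-t differenced equation (valid for t >= 2).

    y_i holds levels y_i0, ..., y_iT; x_i has rows x_i1, ..., x_iT (shape (T, k)).
    Strictly exogenous x: own first difference Delta x_it is added, per (3).
    Predetermined x: levels x_i1, ..., x_{i,t-1} are added, per (4).
    Lagged-y instruments y_i0, ..., y_{i,t-2} are included in both cases.
    """
    y_part = y_i[:t - 1]                           # y_i0, ..., y_{i,t-2}
    if strictly_exogenous:
        dx_t = x_i[t - 1] - x_i[t - 2]             # Delta x_it = x_it - x_{i,t-1}
        return np.concatenate([y_part, dx_t])
    return np.concatenate([y_part, x_i[:t - 1].ravel()])  # x_i1, ..., x_{i,t-1}
```

Stacking these rows block-diagonally, exactly as in (7), yields the adjusted $Z_i$.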
