Bond (2007 Lecture) Dynamic PD First Difference GMM
NB. The first observation is $y_{i1}$, so that the first available equation is the one for $t = 2$.
Some authors assume $y_{i0}$ is observed, and thus have T equations in levels.
Two important properties of the lagged dependent variable.
$E[y_{i,t-1}\eta_i] > 0$ since $\eta_i$ is part of the process that generates $y_{i,t-1}$ according
to our specification.
Thus $y_{i,t-1}$ is correlated with the individual effects, and is not strictly
exogenous.
Most of the estimation issues are present in the simpler dynamic model
Useful to establish the properties of pooled OLS and Within Groups estimators.
Assuming $E[y_{i,t-1}v_{it}] = 0$, then $\operatorname{plim}\,\hat\alpha_{OLS} > \alpha$ as a result of the positive correlation between $y_{i,t-1}$ and $\eta_i$.
Consider
\[ \tilde y_{i,t-1} = y_{i,t-1} - \frac{1}{T-1}(y_{i1} + \dots + y_{i,T-1}) \]
and
\[ \tilde v_{it} = v_{it} - \frac{1}{T-1}(v_{i2} + \dots + v_{iT}) \]
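Notice that all correlations of order $\frac{1}{T-1}$ are negative. E.g. $\operatorname{corr}(y_{i,t-1}, -\frac{1}{T-1}v_{i,t-1})$
and $\operatorname{corr}(v_{it}, -\frac{1}{T-1}y_{it})$.
This suggests that $E[\tilde y_{i,t-1}\tilde v_{it}] < 0$ and is of order $\frac{1}{T-1}$
(i.e. $E[\tilde y_{i,t-1}\tilde v_{it}] \to 0$ as $(T-1) \to \infty$).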
These properties can be shown more formally (e.g. Nickell, Econometrica
1981).
suggest the bias of the Within estimator remains non-negligible in cases with
T = 10 or even T = 15.
Also note that the inconsistency does not disappear as $\alpha \to 0$. So, unless T
is large, the Within estimator does not provide reliable evidence on whether
biased upwards, and (in short panels) Within Groups is likely to be biased
downwards.
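The direction of these two biases can be illustrated with a small simulation. The following is a minimal sketch, not taken from the lecture: it assumes the AR(1) model $y_{it} = \alpha y_{i,t-1} + \eta_i + v_{it}$ with standard normal $\eta_i$ and $v_{it}$, $|\alpha| < 1$ and a stationary start. The function names are hypothetical.

```python
import numpy as np

def simulate_ar1_panel(n, t, alpha, rng):
    """Draw an (n, t) panel from y_it = alpha*y_{i,t-1} + eta_i + v_it.

    Assumptions (for illustration only): |alpha| < 1, eta_i and v_it are
    standard normal, and the first observation is drawn from the
    stationary distribution.
    """
    eta = rng.normal(size=n)
    y = eta / (1 - alpha) + rng.normal(size=n) / np.sqrt(1 - alpha**2)
    panel = np.empty((n, t))
    panel[:, 0] = y
    for s in range(1, t):
        y = alpha * y + eta + rng.normal(size=n)
        panel[:, s] = y
    return panel

def pooled_ols(y):
    """Pooled OLS of y_it on y_{i,t-1} in levels (no intercept)."""
    x, z = y[:, :-1].ravel(), y[:, 1:].ravel()
    return (x @ z) / (x @ x)

def within_groups(y):
    """OLS after subtracting individual means from y_it and y_{i,t-1}."""
    x, z = y[:, :-1], y[:, 1:]
    xd = x - x.mean(axis=1, keepdims=True)
    zd = z - z.mean(axis=1, keepdims=True)
    return (xd * zd).sum() / (xd**2).sum()

rng = np.random.default_rng(0)
y = simulate_ar1_panel(n=2000, t=6, alpha=0.5, rng=rng)
alpha_ols, alpha_wg = pooled_ols(y), within_groups(y)
print(f"pooled OLS: {alpha_ols:.3f}  Within Groups: {alpha_wg:.3f}  (true alpha = 0.5)")
```

With a short panel such as this one, the two estimates should fall on either side of the true value, which is why they are often read as rough upper and lower bounds.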
first transform the model to eliminate the individual effects, and then apply
instrumental variables.
the shocks from all time periods into the transformed error term.
equations.
Two-stage least squares (2SLS) estimators of this type were suggested by
E.g.
\[ \hat\alpha_{AH} = (\Delta y_{-1}' Z (Z'Z)^{-1} Z' \Delta y_{-1})^{-1} \Delta y_{-1}' Z (Z'Z)^{-1} Z' \Delta y \]
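When there is a single instrument column, the 2SLS formula above collapses to a ratio of two cross-products. The sketch below illustrates this with $y_{i,t-2}$ as the only instrument for $\Delta y_{i,t-1}$, an estimator usually attributed to Anderson and Hsiao. The data-generating process (standard normal $\eta_i$ and $v_{it}$, stationary start) and the function names are assumptions made for the example.

```python
import numpy as np

def simulate_ar1_panel(n, t, alpha, rng):
    """Illustrative AR(1) panel with individual effects; assumes |alpha| < 1."""
    eta = rng.normal(size=n)
    y = eta / (1 - alpha) + rng.normal(size=n) / np.sqrt(1 - alpha**2)
    panel = np.empty((n, t))
    panel[:, 0] = y
    for s in range(1, t):
        y = alpha * y + eta + rng.normal(size=n)
        panel[:, s] = y
    return panel

def simple_iv_levels(y):
    """IV on the first-differenced equations, instrumenting with y_{i,t-2}."""
    dy = np.diff(y, axis=1)       # columns: dy_i2, ..., dy_iT
    dep = dy[:, 1:].ravel()       # dy_it,      t = 3..T
    lag = dy[:, :-1].ravel()      # dy_{i,t-1}, t = 3..T
    z = y[:, :-2].ravel()         # y_{i,t-2},  t = 3..T
    # With one instrument, (x'Z(Z'Z)^-1 Z'x)^-1 x'Z(Z'Z)^-1 Z'y = (Z'y)/(Z'x).
    return (z @ dep) / (z @ lag)

rng = np.random.default_rng(0)
y = simulate_ar1_panel(n=20_000, t=6, alpha=0.5, rng=rng)
alpha_iv = simple_iv_levels(y)
print(f"simple IV estimate: {alpha_iv:.3f}  (true alpha = 0.5)")
```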
One further time series observation is lost if $\Delta y_{i,t-2}$ rather than $y_{i,t-2}$ is
ing that the vit are serially uncorrelated shocks, provided the initial condi-
With T > 3, further valid instruments become available for the first-differenced
equations in the later time periods. Efficiency can be improved
by exploiting these additional instruments.
The transformed error term $\Delta v_{it}$ also has a known moving average form
of serial correlation, under the maintained assumption that $v_{it}$ is serially
uncorrelated. More generally, $v_{it}$ may be heteroskedastic. These features
can be exploited to improve efficiency when T > 3 (i.e. when the model is
overidentified).
Generalised method of moments (GMM)
Example
\[ y_i - f(x_i, \beta) = u_i, \quad i = 1, \ldots, N, \quad \beta \text{ is } K \times 1 \]
Sample analogue:
\[ b_N(\beta) = \frac{1}{N}\sum_{i=1}^{N} z_i' u_i(\beta) = \frac{1}{N}\sum_{i=1}^{N} z_i'[y_i - f(x_i, \beta)] \]
GMM estimators choose $\hat\beta_{GMM}$ to minimise the distance of $b_N(\beta)$ from
zero, i.e.
\[ \hat\beta_{GMM} = \arg\min_{\beta} J_N(\beta) = b_N(\beta)' W_N b_N(\beta) \]
Note that this gives a family of GMM estimators, based on the same moment conditions
and
\[ D = \operatorname*{plim}_{N \to \infty} \frac{\partial b_N(\beta)}{\partial \beta'}, \qquad W = \operatorname*{plim}_{N \to \infty} W_N \]
\[ S_W = \frac{1}{N}\sum_{i=1}^{N} E(z_i' u_i u_i' z_i) \]
$S_W$ is the average covariance matrix for the moment conditions.
\[ \operatorname{avar}(\hat\beta_{GMM}) = (D'WD)^{-1} \]
is a consistent estimate of $S_W$, obtained using the residuals $\hat u_i = y_i - f(x_i, \hat\beta)$
$z_i = x_i$.
\[ E(\eta_i) = E(v_{it}) = E(\eta_i v_{it}) = 0 \]
\[ E(v_{is} v_{it}) = 0 \quad \text{for } s \neq t \]
Equation: $(y_{iT} - y_{i,T-1}) = \alpha(y_{i,T-1} - y_{i,T-2}) + (v_{iT} - v_{i,T-1})$; instruments: $y_{i1}, y_{i2}, \ldots, y_{i,T-2}$
tions.
$E(y_{i1}\Delta v_{i4}) = 0$ follows similarly.
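\[
Z_i = \begin{pmatrix}
y_{i1} & 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
0 & y_{i1} & y_{i2} & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & y_{i1} & y_{i2} & \cdots & y_{i,T-2}
\end{pmatrix}
\quad \text{and} \quad
\Delta v_i = \begin{pmatrix} \Delta v_{i3} \\ \Delta v_{i4} \\ \vdots \\ \Delta v_{iT} \end{pmatrix}
\]
where $Z_i$ is $(T-2) \times m$ and $\Delta v_i$ is $(T-2) \times 1$.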
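The block structure of $Z_i$ is easier to see when it is built explicitly. The following is a minimal sketch with a hypothetical helper name: row $t$ (for $t = 3, \ldots, T$) holds the levels $y_{i1}, \ldots, y_{i,t-2}$ in its own block of columns, so that $m = 1 + 2 + \dots + (T-2) = (T-2)(T-1)/2$.

```python
import numpy as np

def instrument_matrix(y_i):
    """Build Z_i from the levels y_i = (y_i1, ..., y_iT) of one individual."""
    T = len(y_i)
    m = (T - 2) * (T - 1) // 2          # total number of moment conditions
    Z = np.zeros((T - 2, m))
    col = 0
    for row in range(T - 2):            # row 0 is the equation for t = 3
        k = row + 1                     # instruments y_i1, ..., y_{i,t-2}
        Z[row, col:col + k] = y_i[:k]
        col += k
    return Z

# Toy individual with T = 5 and levels 1, 2, 3, 4, 5.
Z_example = instrument_matrix(np.array([1.0, 2.0, 3.0, 4.0, 5.0]))
print(Z_example)
```

For this toy series the matrix has 3 rows (equations for t = 3, 4, 5) and 6 columns, and the last two levels never appear because they are not valid instruments for any equation.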
Sample analogue
\[ b_N(\alpha) = \frac{1}{N}\sum_{i=1}^{N} Z_i' \Delta v_i(\alpha) \]
For T = 3, we have 1 moment condition $E(y_{i1}\Delta v_{i3}) = 0$ and 1 parameter.
is just identified, the choice of the weight matrix is irrelevant, and the
\[ \hat\alpha_{GMM} = (\Delta y_{-1}' Z W_N Z' \Delta y_{-1})^{-1} \Delta y_{-1}' Z W_N Z' \Delta y \]
and
\[ \hat\alpha_{AH} = (\Delta y_{-1}' Z (Z'Z)^{-1} Z' \Delta y_{-1})^{-1} \Delta y_{-1}' Z (Z'Z)^{-1} Z' \Delta y \]
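For an arbitrary $W_N$
where
\[ \hat V_N = \frac{1}{N}\sum_{i=1}^{N} Z_i' \widehat{\Delta v}_i \widehat{\Delta v}_i' Z_i \]
and $\widehat{\Delta v}_{it} = \Delta y_{it} - \hat\alpha\,\Delta y_{i,t-1}$ are consistent estimates of the first-differenced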
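\[ W_N = \left( \frac{1}{N}\sum_{i=1}^{N} Z_i' \widehat{\Delta v}_i \widehat{\Delta v}_i' Z_i \right)^{-1} \]
giving
\[ \operatorname{avar}(\hat\alpha_{GMM}) = N (\Delta y_{-1}' Z W_N Z' \Delta y_{-1})^{-1} \]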
One step weight matrix
For the special case in which $v_{it} \sim \text{iid}(0, \sigma_v^2)$, we can obtain a one step
are using.
Role here is merely to suggest a good choice for the one step weight matrix.
For the first-differenced equations, this choice is not 2SLS, due to the serial correlation in $\Delta v_{it}$:
\[ E(\Delta v_{it}^2) = 2\sigma_v^2 \]
\[ E(\Delta v_{it}\Delta v_{i,t-1}) = -\sigma_v^2 \]
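These two moments follow directly from $\Delta v_{it} = v_{it} - v_{i,t-1}$ when the $v_{it}$ are serially uncorrelated with common variance $\sigma_v^2$. A short worked derivation, including the implied zero autocovariances at longer lags:

```latex
\begin{align*}
E(\Delta v_{it}^2) &= E(v_{it}^2) - 2E(v_{it}v_{i,t-1}) + E(v_{i,t-1}^2)
                    = \sigma_v^2 - 0 + \sigma_v^2 = 2\sigma_v^2, \\
E(\Delta v_{it}\Delta v_{i,t-1}) &= E[(v_{it} - v_{i,t-1})(v_{i,t-1} - v_{i,t-2})]
                    = -E(v_{i,t-1}^2) = -\sigma_v^2, \\
E(\Delta v_{it}\Delta v_{i,t-s}) &= 0 \quad \text{for } s \geq 2.
\end{align*}
```

So the covariance matrix of the first-differenced errors is proportional to a band matrix with 2 on the main diagonal and $-1$ on the first off-diagonals.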
\[ W_N = \left( \frac{1}{N}\sum_{i=1}^{N} Z_i' H Z_i \right)^{-1} \]
gives a one step GMM estimator for the first-differenced equations that is
tor.
While asymptotic results for the two step estimator only require an initial
\[ W_N = \left( \frac{1}{N}\sum_{i=1}^{N} Z_i' \widehat{\Delta v}_i \widehat{\Delta v}_i' Z_i \right)^{-1} \]
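The one step and two step weight matrices can be put side by side in a short simulation. This is a sketch under stated assumptions, not the lecture's own code: $H$ is taken to be the $(T-2) \times (T-2)$ matrix with 2 on the diagonal and $-1$ on the first off-diagonals (matching the moments of $\Delta v_{it}$ in the iid case), the data come from an AR(1) panel with standard normal $\eta_i$ and $v_{it}$, and all function names are hypothetical.

```python
import numpy as np

def simulate_ar1_panel(n, t, alpha, rng):
    """Illustrative AR(1) panel with individual effects; assumes |alpha| < 1."""
    eta = rng.normal(size=n)
    y = eta / (1 - alpha) + rng.normal(size=n) / np.sqrt(1 - alpha**2)
    panel = np.empty((n, t))
    panel[:, 0] = y
    for s in range(1, t):
        y = alpha * y + eta + rng.normal(size=n)
        panel[:, s] = y
    return panel

def instrument_matrix(y_i):
    """Row for period t holds y_i1, ..., y_{i,t-2} in its own column block."""
    T = len(y_i)
    Z = np.zeros((T - 2, (T - 2) * (T - 1) // 2))
    col = 0
    for row in range(T - 2):
        Z[row, col:col + row + 1] = y_i[:row + 1]
        col += row + 1
    return Z

def diff_gmm(y):
    """Return (one step estimate, two step estimate, two step std. error)."""
    n, T = y.shape
    dy = np.diff(y, axis=1)
    dep, lag = dy[:, 1:], dy[:, :-1]        # dy_it and dy_{i,t-1}, t = 3..T
    Z = [instrument_matrix(y_i) for y_i in y]
    # Assumed form of H: 2 on the diagonal, -1 on the first off-diagonals.
    H = 2 * np.eye(T - 2) - np.eye(T - 2, k=1) - np.eye(T - 2, k=-1)

    Zx = sum(Z_i.T @ x_i for Z_i, x_i in zip(Z, lag))   # Z' dy_{-1}
    Zy = sum(Z_i.T @ d_i for Z_i, d_i in zip(Z, dep))   # Z' dy

    def estimate(W):
        return (Zx @ W @ Zy) / (Zx @ W @ Zx)

    # One step: weight matrix built from H, no estimated parameters needed.
    W1 = np.linalg.inv(sum(Z_i.T @ H @ Z_i for Z_i in Z) / n)
    alpha1 = estimate(W1)

    # Two step: weight matrix built from one step residuals.
    resid = dep - alpha1 * lag
    Zv = np.array([Z_i.T @ r_i for Z_i, r_i in zip(Z, resid)])   # shape (n, m)
    W2 = np.linalg.inv(Zv.T @ Zv / n)
    alpha2 = estimate(W2)
    se2 = np.sqrt(n / (Zx @ W2 @ Zx))       # usual asymptotic standard error
    return alpha1, alpha2, se2

rng = np.random.default_rng(1)
y = simulate_ar1_panel(n=5000, t=6, alpha=0.5, rng=rng)
alpha1, alpha2, se2 = diff_gmm(y)
print(f"one step: {alpha1:.3f}  two step: {alpha2:.3f}  (s.e. {se2:.3f})")
```

The standard error returned here is the usual asymptotic one, so it should be read with the small sample caveat discussed under two step inference in mind.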
possible.
Two step inference
indicates a small sample problem with the usual estimate of the asymptotic
\[ \operatorname{avar}(\hat\alpha_{GMM}) = N (\Delta y_{-1}' Z W_N(\hat\alpha) Z' \Delta y_{-1})^{-1} \]
feasible GMM estimator, that uses the true value rather than the initial
rection that provides more accurate estimates of the variance of (linear) two
and PC-Give/Ox.
as those based on the one step GMM estimator (where no parameters are
(As usual with t and Wald tests) these are quite accurate in cases where
the estimated parameter(s) are not subject to serious finite sample bias problems
provide Monte Carlo evidence, and consider alternative tests of linear re-
where $u_{it} = \eta_i + v_{it}$ is the error term for the untransformed equations in
levels.
See Ahn and Schmidt (Journal of Econometrics, 1995) for further discussion.
Homoskedasticity over time
\[ E(v_{it}^2) = \sigma_i^2 \]
These are simple to implement. They will improve efficiency if the addi-
if not.
Alternative transformations
- does not introduce lagged shocks earlier than $v_{i,t-1}$ into the transformed
error term
‘orthogonal deviations’ transformation
\[ v_{it}^O = \left( \frac{T-t+1}{T-t+2} \right)^{1/2} \left[ v_{i,t-1} - \frac{1}{T-t+1}(v_{it} + v_{i,t+1} + \dots + v_{iT}) \right] \]
This estimates the mean for individual i using future observations on the
One property is that if the $v_{it}$ are iid, then so are the $v_{it}^O$.
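This property can be checked numerically. The sketch below (hypothetical function name, standard normal shocks assumed) applies the orthogonal deviations transformation to iid draws and compares the sample covariance matrix with that of first differences. It also confirms that a time-invariant individual effect is removed.

```python
import numpy as np

def orthogonal_deviations(x):
    """Transform an (n, T) array into (n, T-1) forward orthogonal deviations.

    Output column j is the observation at array position j minus the mean
    of all later observations, rescaled by sqrt(k / (k + 1)) where k is the
    number of later observations.
    """
    n, T = x.shape
    out = np.empty((n, T - 1))
    for j in range(T - 1):
        k = T - 1 - j
        out[:, j] = np.sqrt(k / (k + 1)) * (x[:, j] - x[:, j + 1:].mean(axis=1))
    return out

rng = np.random.default_rng(0)
v = rng.normal(size=(200_000, 5))            # iid shocks, variance 1
eta = rng.normal(size=(200_000, 1))          # individual effects

cov_od = np.cov(orthogonal_deviations(v), rowvar=False)
cov_fd = np.cov(np.diff(v, axis=1), rowvar=False)
print(np.round(cov_od, 2))   # close to the identity matrix
print(np.round(cov_fd, 2))   # close to 2 on the diagonal, -1 beside it
```

First differencing leaves a moving average pattern in the errors, whereas orthogonal deviations leave them uncorrelated with constant variance.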
Hence the asymptotically efficient one step estimator for the iid special
Transformed model
\[ y_{it}^O = \alpha y_{i,t-1}^O + v_{it}^O \quad \text{for } t = 3, \ldots, T \]
2SLS using all available linear moment conditions coincides with the one
These results are useful for thinking about finite sample bias issues.
Overfitting
One source of finite sample bias is the use of ‘too many’ instruments relative
ting’.
For 2SLS estimators, overfitting results in a finite sample bias in the direction
of the corresponding OLS estimator.
The result is also much less useful in the context of models with endogenous
explanatory variables, where the Within estimator is not consistent as $T \to \infty$,
and hence not such a benign estimator to be converging on.
Weak instruments
properties in cases where the instruments, although valid, are only weakly
This is relevant for the first-differenced GMM estimator in the AR(1) model
the correlation between $\Delta y_{i,t-1}$ and the lagged levels $y_{i,t-s}$ for $s \geq 2$ becomes
weaker as $\alpha \to 1$.
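A quick simulation shows how fast the instrument loses strength. The sketch below is illustrative only: it assumes a stationary AR(1) process with standard normal $\eta_i$ and $v_{it}$, uses a hypothetical function name, and reports the sample correlation between $\Delta y_{i,t-1}$ and $y_{i,t-2}$ for several values of $\alpha$.

```python
import numpy as np

def instrument_correlation(alpha, n=200_000, seed=0):
    """Sample corr(dy_{i,t-1}, y_{i,t-2}) under a stationary AR(1) with effects."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(size=n)
    # y_{i,t-2} drawn from the stationary distribution (assumes |alpha| < 1).
    y_lag2 = eta / (1 - alpha) + rng.normal(size=n) / np.sqrt(1 - alpha**2)
    y_lag1 = alpha * y_lag2 + eta + rng.normal(size=n)
    return np.corrcoef(y_lag1 - y_lag2, y_lag2)[0, 1]

alphas = [0.2, 0.5, 0.8, 0.95]
correlations = [instrument_correlation(a) for a in alphas]
for a, c in zip(alphas, correlations):
    print(f"alpha = {a:.2f}: corr = {c:.3f}")
```

Under these assumptions the correlation is negative and shrinks towards zero in absolute value as $\alpha$ rises, which is the weak instruments problem in miniature.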
In the model we have focused on
At $\alpha = 1$ we have
so that
\[ E(\Delta y_{i,t-1} y_{i,t-2}) = E(\eta_i^2) \neq 0 \]
become very imprecise, and subject to serious finite sample biases, for values
of $\alpha$ around 0.8 and above, unless the available samples are huge.
the Within estimator, consistent with …ndings for 2SLS estimators in simple
cases where the weak instruments problem has been studied analytically.
Blundell and Bond (Journal of Econometrics, 1998) provide Monte Carlo
evidence, and develop an extended GMM estimator that is more useful for
We will return to this after briefly considering how the basic GMM estimator
studied so far can be adapted for models with additional explanatory
variables.
Note also that if we combine
\[ y_{it} = \eta_i + \varepsilon_{it} \]
or
In this specification, the process for $y_{it}$ approaches a pure random walk as
Now at $\alpha = 1$
\[ \Delta y_{i,t-1} = v_{i,t-1} \]
$\Delta y_{i,t-1}$ in the case where $\alpha = 1$, and $\alpha$ is not identified using only the
moment conditions
Although for this model we can note that the OLS levels estimator is
consistent when $\alpha = 1$.