How to Do xtabond2

David Roodman
Research Fellow
Center for Global Development
xtabond2 in a nutshell
• First ado version released 11/03; Mata version 11/05.
• Extends built-in xtabond to do system GMM and the Windmeijer correction, with a revamped syntax
• Estimators designed for
– Small-T, large-N panels
– One dependent variable
– Dynamic
– Linear
– Regressors endogenous and predetermined
– Fixed individual effects
– Arbitrary autocorrelation and heteroskedasticity within panels
– General application
Outline of paper
Introduction to linear GMM
Motivation and design of difference and system GMM
xtabond2 syntax
Black box problem
• Canned and sophisticated procedure
• Dangers in hidden sophistication
– Finite-sample behavior ≠ asymptotic behavior
• Users should understand the motivation and limits of the estimator
Linear GMM in one slide
• Instrument vector $z$ such that $\operatorname{E}[z\varepsilon] = 0$
• # instruments > # parameters, so can't have $\operatorname{E}_N[z\varepsilon] \equiv \frac{1}{N} Z'\hat{E} = 0$
• Want to "minimize" $\frac{1}{N} Z'\hat{E}$ in some sense
• In what sense? By a positive-semidefinite quadratic form given by $A$:
$$\left\| \operatorname{E}_N[z\varepsilon] \right\|_A = \left\| \tfrac{1}{N} Z'\hat{E} \right\|_A \equiv N \left( \tfrac{1}{N} Z'\hat{E} \right)' A \left( \tfrac{1}{N} Z'\hat{E} \right) = \tfrac{1}{N} \hat{E}' Z A Z' \hat{E}$$
• Given $A$, minimizing leads to
$$\hat{\beta}_A = (X'ZAZ'X)^{-1} X'ZAZ'Y$$
• Always consistent, but which $A$ is efficient? Answer: $A$ should weight moments $z_i'E$ inversely with their variances and covariances:
$$A_{EGMM} = \operatorname{Var}[Z'E \mid X, Z]^{-1} = \left( Z' \operatorname{Var}[E \mid X, Z]\, Z \right)^{-1} = (Z'\Omega Z)^{-1}$$
• To make feasible, choose arbitrary proxy for $\Omega$, call it $H$. Do GMM (one-step). Use residuals to make robust sandwich estimator of $(Z'\Omega Z)^{-1}$. Rerun. Two-step is feasible, theoretically efficient.
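The closed form for $\hat{\beta}_A$ follows from the first-order condition of the quadratic form. A brief sketch, assuming $A$ is symmetric and $X'ZAZ'X$ is invertible (both assumptions are implicit rather than stated on the slide):

```latex
\begin{aligned}
\hat{E} &= Y - X\hat{\beta}, \\
\frac{\partial}{\partial \hat{\beta}} \, \tfrac{1}{N}\, \hat{E}' Z A Z' \hat{E}
  &= -\tfrac{2}{N}\, X' Z A Z' \,(Y - X\hat{\beta}) = 0 \\
\Rightarrow \quad X'ZAZ'X \,\hat{\beta} &= X'ZAZ'Y
  \quad\Rightarrow\quad
  \hat{\beta}_A = (X'ZAZ'X)^{-1} X'ZAZ'Y .
\end{aligned}
```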
Linear GMM and 2SLS
$$\hat{\beta}_{EGMM} = \left( X'Z (Z'\Omega Z)^{-1} Z'X \right)^{-1} X'Z (Z'\Omega Z)^{-1} Z'Y$$
• If $\Omega = \sigma^2 I$, reduces to
$$\hat{\beta}_{2SLS} = \left( X'Z (Z'Z)^{-1} Z'X \right)^{-1} X'Z (Z'Z)^{-1} Z'Y$$
• If errors i.i.d., efficient GMM is 2SLS
• If not, 2SLS inefficient
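The equivalence can be checked numerically. The sketch below is only an illustration, not part of xtabond2: it assumes a single regressor, two instruments, no intercept, and made-up data, and all function names are hypothetical. It computes GMM with $A = (Z'Z)^{-1}$ and textbook two-stage least squares and compares them.

```python
# Sketch: one regressor x, two instruments (z1, z2), no intercept.
# All data values are made up for illustration.

def inv2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def cross(u, v):
    """Inner product u'v of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def gmm_beta(x, y, z1, z2, A):
    """beta_A = (x'Z A Z'x)^-1 x'Z A Z'y for a single regressor."""
    zx = [cross(z1, x), cross(z2, x)]   # Z'x
    zy = [cross(z1, y), cross(z2, y)]   # Z'y

    def quad(u, v):                     # u' A v
        return sum(u[r] * A[r][c] * v[c] for r in range(2) for c in range(2))

    return quad(zx, zy) / quad(zx, zx)

def tsls_beta(x, y, z1, z2):
    """Textbook 2SLS: regress x on Z, then y on the fitted values."""
    ztz_inv = inv2([[cross(z1, z1), cross(z1, z2)],
                    [cross(z2, z1), cross(z2, z2)]])
    zx = [cross(z1, x), cross(z2, x)]
    # First-stage coefficients (Z'Z)^-1 Z'x
    g = [ztz_inv[0][0] * zx[0] + ztz_inv[0][1] * zx[1],
         ztz_inv[1][0] * zx[0] + ztz_inv[1][1] * zx[1]]
    xhat = [g[0] * a + g[1] * b for a, b in zip(z1, z2)]
    return cross(xhat, y) / cross(xhat, x)

z1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z2 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
x = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]
y = [2.1, 4.2, 6.5, 8.1, 10.4, 11.8]

A_2sls = inv2([[cross(z1, z1), cross(z1, z2)],
               [cross(z2, z1), cross(z2, z2)]])
b_gmm = gmm_beta(x, y, z1, z2, A_2sls)
b_2sls = tsls_beta(x, y, z1, z2)
print(b_gmm, b_2sls)  # the two estimates agree up to rounding error
```

Any other positive-definite choice of `A` would generally give a different estimate on the same data.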
Linear GMM in another slide
(Holtz-Eakin, Newey, and Rosen 1988)
(1) $Y = X\beta + E$
OLS inconsistent: $\operatorname{E}[X'E] \neq 0$
(2) Take Z-moments: $Z'Y = Z'X\beta + Z'E$
OLS consistent ($\operatorname{E}[X'ZZ'E] = 0$) but inefficient ($\operatorname{Var}[Z'E] = Z'\Omega Z$ not scalar)
Left-multiply by $(Z'\Omega Z)^{-1/2}$:
$$(Z'\Omega Z)^{-1/2} Z'Y = (Z'\Omega Z)^{-1/2} Z'X\beta + (Z'\Omega Z)^{-1/2} Z'E$$
Let $X^* = (Z'\Omega Z)^{-1/2} Z'X$, $Y^* = (Z'\Omega Z)^{-1/2} Z'Y$, $E^* = (Z'\Omega Z)^{-1/2} Z'E$
(3) $Y^* = X^*\beta + E^*$
OLS efficient ($\operatorname{Var}[E^*] = I$)
OLS on (3) = GLS on (2) = GMM on (1)
GMM = GLS on Z-moments
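To see why OLS on (3) reproduces efficient GMM on (1), substitute the starred definitions into the OLS formula. A short check, assuming $Z'\Omega Z$ is symmetric positive definite so that its inverse square root exists and is symmetric:

```latex
\begin{aligned}
\hat{\beta}_{OLS(3)} &= (X^{*\prime} X^{*})^{-1} X^{*\prime} Y^{*} \\
 &= \left( X'Z (Z'\Omega Z)^{-1/2} (Z'\Omega Z)^{-1/2} Z'X \right)^{-1}
    X'Z (Z'\Omega Z)^{-1/2} (Z'\Omega Z)^{-1/2} Z'Y \\
 &= \left( X'Z (Z'\Omega Z)^{-1} Z'X \right)^{-1} X'Z (Z'\Omega Z)^{-1} Z'Y
  = \hat{\beta}_{EGMM}, \\[4pt]
\operatorname{Var}[E^{*}] &= (Z'\Omega Z)^{-1/2} \, Z'\Omega Z \, (Z'\Omega Z)^{-1/2} = I .
\end{aligned}
```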
Difference and system GMM
Basic model:
$$y_{it} = \alpha y_{i,t-1} + x_{it}'\beta + \varepsilon_{it}$$
$$\varepsilon_{it} = \mu_i + v_{it}$$
$$\operatorname{E}[\mu_i] = \operatorname{E}[v_{it}] = \operatorname{E}[\mu_i v_{it}] = 0$$

Conceptual starting point: OLS

Problem: Dynamic Panel Bias (Nickell 1981)
• Fixed effects in disturbance term make $y_{i,t-1}$ endogenous
o Example: Indonesia
• A problem of short panels
• Individual dummies (= Within Groups) don't help
o Transformed $y_{i,t-1}$ endogenous, as are deeper lags
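A small simulation can make the bias concrete. This is a sketch under stated assumptions, not a result from the slides: no $x$ regressors, standard normal $\mu_i$ and $v_{it}$, a true coefficient of 0.5, and a fixed seed. All function names are hypothetical.

```python
import random

def simulate_panel(n_units, n_periods, alpha, seed=12345):
    """Simulate y_it = alpha*y_{i,t-1} + mu_i + v_it with N(0,1) mu_i and v_it."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        mu = rng.gauss(0.0, 1.0)
        # Draw the first observation from the stationary distribution given mu
        y = mu / (1.0 - alpha) + rng.gauss(0.0, 1.0) / (1.0 - alpha ** 2) ** 0.5
        series = [y]
        for _ in range(n_periods - 1):
            y = alpha * y + mu + rng.gauss(0.0, 1.0)
            series.append(y)
        panel.append(series)
    return panel

def slope(xs, ys):
    """OLS slope of ys on xs, with an intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def pooled_ols(panel):
    """Regress y_it on y_{i,t-1}, ignoring the fixed effects."""
    xs = [s[t - 1] for s in panel for t in range(1, len(s))]
    ys = [s[t] for s in panel for t in range(1, len(s))]
    return slope(xs, ys)

def within_groups(panel):
    """Same regression after subtracting each individual's means."""
    xs, ys = [], []
    for s in panel:
        lag, cur = s[:-1], s[1:]
        mean_lag = sum(lag) / len(lag)
        mean_cur = sum(cur) / len(cur)
        xs.extend(v - mean_lag for v in lag)
        ys.extend(v - mean_cur for v in cur)
    return slope(xs, ys)

TRUE_ALPHA = 0.5
panel = simulate_panel(n_units=2000, n_periods=6, alpha=TRUE_ALPHA)
ols_estimate = pooled_ols(panel)
wg_estimate = within_groups(panel)
print(f"true alpha = {TRUE_ALPHA}, pooled OLS = {ols_estimate:.3f}, "
      f"Within Groups = {wg_estimate:.3f}")
```

In this design, pooled OLS lands well above the true coefficient and Within Groups well below it, even with 2,000 individuals: with only six periods, a larger N does not remove either bias.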
Partial solution: OLS in differences
$$\Delta y_{it} = \alpha \Delta y_{i,t-1} + \Delta x_{it}'\beta + \Delta v_{it}$$
• Purges fixed effects, doesn't spread endogeneity much
• Transformed $y_{i,t-1}$ still becomes endogenous, since the $y_{i,t-1}$ in $\Delta y_{i,t-1} = y_{i,t-1} - y_{i,t-2}$ correlates with the $v_{i,t-1}$ in $\Delta v_{it} = v_{it} - v_{i,t-1}$
• But deeper lags exogenous if no AR(·), offering instruments
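The size of the remaining endogeneity can be worked out directly. A sketch, under the added assumptions that the $v_{it}$ are serially uncorrelated with common variance $\sigma_v^2$ and that the regressors $x_{it}$ are uncorrelated with $v_{i,t-1}$:

```latex
\begin{aligned}
\operatorname{E}[\Delta y_{i,t-1}\,\Delta v_{it}]
 &= \operatorname{E}\big[(y_{i,t-1} - y_{i,t-2})(v_{it} - v_{i,t-1})\big] \\
 &= -\operatorname{E}[y_{i,t-1} v_{i,t-1}] = -\sigma_v^2 \neq 0, \\[4pt]
\operatorname{E}[y_{i,t-2}\,\Delta v_{it}]
 &= \operatorname{E}[y_{i,t-2} v_{it}] - \operatorname{E}[y_{i,t-2} v_{i,t-1}] = 0 .
\end{aligned}
```

The first line is the endogeneity of the differenced lag; the second is why $y_{i,t-2}$ and deeper lags remain available as instruments.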
Problem: Other endogeneity
• Differencing eliminates endogeneity to the fixed-effects error component. But
o $\Delta y_{i,t-1}$ now endogenous to $\Delta v_{it}$
o Other predetermined variables become endogenous in same way
o Still other variables may be endogenous from the start
• For general application, assume no perfect instruments waiting in the wings
Solution: Instrument with lags (2SLS)
(Anderson and Hsiao 1981)
• Assuming no AR(·) in $v_{it}$, natural instruments for $\Delta y_{i,t-1}$ are $y_{i,t-2}$ and $\Delta y_{i,t-2}$
• Both mathematically related to $\Delta y_{i,t-1}$
• $y_{i,t-2}$ seems preferable: available at $t = 3$ ($\Delta y_{i,t-2}$ not until $t = 4$)
• Again, small $T$ matters
• Do same for other endogenous variables
Problem: Inefficiency
• Deeper lags available as instruments
o But reduce sample in 2SLS
o Problem for short panels
• In differences, errors not i.i.d.
o $\Delta v_{it}$ and $\Delta v_{i,t-1}$ mathematically correlated
o 2SLS not efficient
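The correlation between adjacent differenced errors can be made explicit. A sketch, assuming the $v_{it}$ are i.i.d. with variance $\sigma_v^2$ (an idealization; the estimators themselves allow more general error structures):

```latex
\begin{aligned}
\operatorname{Var}[\Delta v_{it}] &= \operatorname{Var}[v_{it}] + \operatorname{Var}[v_{i,t-1}] = 2\sigma_v^2, \\
\operatorname{Cov}[\Delta v_{it}, \Delta v_{i,t-1}] &= -\operatorname{Var}[v_{i,t-1}] = -\sigma_v^2, \\
\operatorname{Cov}[\Delta v_{it}, \Delta v_{i,t-s}] &= 0 \quad (s \ge 2), \\[4pt]
\operatorname{Var}[\Delta v_i] &= \sigma_v^2
\begin{pmatrix}
 2 & -1 &        &    \\
-1 &  2 & \ddots &    \\
   & \ddots & \ddots & -1 \\
   &        & -1 & 2
\end{pmatrix} .
\end{aligned}
```

Because this matrix is not a scalar multiple of the identity, 2SLS is not the efficient GMM estimator for the differenced equation.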


Solution: GMM & GMM-style instruments
(Holtz-Eakin, Newey, and Rosen 1988)
• Use many lags, replacing missing with zero
• Generate separate instrument for each lag and time period instrumented
IV-style:
$$\begin{pmatrix} . \\ . \\ y_{i1} \\ \vdots \\ y_{i,T-2} \end{pmatrix}$$
GMM-style:
$$\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
y_{i1} & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & y_{i2} & y_{i1} & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & y_{i3} & y_{i2} & y_{i1} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}$$
• Result: Arellano-Bond (1991) difference GMM
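The GMM-style layout can be generated mechanically. The sketch below is an illustration only, not how xtabond2 builds its matrices internally; the function name and the `min_lag` parameter are hypothetical. It builds one individual's block with a separate column for each (period, lag) pair and zeros elsewhere.

```python
def gmm_style_block(y, min_lag=2):
    """Build one individual's GMM-style instrument block.

    y holds y_i1 ... y_iT (so y[0] is period 1). Row t instruments the
    equation for period t with every available lag of depth >= min_lag.
    Each (period, lag) pair gets its own column; all other cells are 0.
    """
    T = len(y)
    # One column per (period t, lag depth) pair that exists in the data
    columns = [(t, lag) for t in range(1, T + 1) for lag in range(min_lag, t)]
    block = []
    for t in range(1, T + 1):
        row = [y[t - lag - 1] if col_t == t else 0 for (col_t, lag) in columns]
        block.append(row)
    return block

# Five periods with made-up values standing in for y_i1 ... y_i5
block = gmm_style_block([10, 20, 30, 40, 50])
for row in block:
    print(row)
```

With five periods this reproduces the first five rows and six columns of the GMM-style matrix on the slide, with 10, 20, 30 in place of $y_{i1}, y_{i2}, y_{i3}$.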
Problem: Autocorrelation
• E.g., if $v_{it}$ are AR(1), then $y_{i,t-2} \sim v_{i,t-2} \sim v_{i,t-1} \sim \Delta v_{it}$ (where $\sim$ means "is correlated with")
• Must assume $y_{i,t-2}$ is invalid instrument in $i,t$
Solution: Restrict to deeper lags
• If we find AR($l$) in $\Delta v_{it}$, use lags $l + 1$ and deeper
Arellano-Bond AR(·) test
• Expect AR(·) in $\varepsilon_{it} = \mu_i + v_{it}$
• To check for AR(1) in $v_{it}$, test for AR(2) in $\Delta e_{it}$
• E.g., compare $e_{it} - e_{i,t-1}$ and $e_{i,t-2} - e_{i,t-3}$ to detect $e_{i,t-1} \sim e_{i,t-2}$
• Test statistic for AR($l$) in differences:
$$\sum_{i,t} \Delta e_{it}\, \Delta e_{i,t-l}$$
• Normal under null of no AR($l$)
• Arellano and Bond calculate its standard deviation
• z test for AR(·)
• More general than other AR(·) tests in Stata
• abar: post-estimation command for regress, ivreg, ivreg2
Problem: Weak instruments
If $y$ is nearly a random walk, $y_{i,t-1}$ is a poor instrument for $\Delta y_{it}$, mathematical relationship notwithstanding
Solution: Instead of purging fixed effects, find instruments orthogonal to them
(Arellano and Bover 1995)
• If $\operatorname{E}[y_{it}\mu_i]$ stationary, then $\operatorname{E}[\Delta y_{it}\mu_i] = 0$
• $\Delta y_{i,t-1}$ uncorrelated with fixed effects, thus with $\varepsilon_{it} = \mu_i + v_{it}$: good instrument in levels (if no AR)
• Make system of difference and levels equations
• Concretely, make a stacked data set, with differences up top, levels below. Treat as single estimation problem
• Instrument differences with levels and vice versa
• "System GMM" (Blundell and Bond 1998)
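Relationship among moments
(Tue Gorgens)
[Diagram: grid of the moments $\operatorname{E}[w_{is}\varepsilon_{it}]$ for $s, t = 1, \dots, 4$, with links labeled D between horizontally adjacent moments and links labeled L between successive rows.]
$\operatorname{E}[w_{i1}\varepsilon_{i1}]$ D $\operatorname{E}[w_{i1}\varepsilon_{i2}]$ D $\operatorname{E}[w_{i1}\varepsilon_{i3}]$ D $\operatorname{E}[w_{i1}\varepsilon_{i4}]$
L L L
$\operatorname{E}[w_{i2}\varepsilon_{i1}]$ $\operatorname{E}[w_{i2}\varepsilon_{i2}]$ D $\operatorname{E}[w_{i2}\varepsilon_{i3}]$ D $\operatorname{E}[w_{i2}\varepsilon_{i4}]$
L L
$\operatorname{E}[w_{i3}\varepsilon_{i1}]$ $\operatorname{E}[w_{i3}\varepsilon_{i2}]$ $\operatorname{E}[w_{i3}\varepsilon_{i3}]$ D $\operatorname{E}[w_{i3}\varepsilon_{i4}]$
L
$\operatorname{E}[w_{i4}\varepsilon_{i1}]$ $\operatorname{E}[w_{i4}\varepsilon_{i2}]$ $\operatorname{E}[w_{i4}\varepsilon_{i3}]$ $\operatorname{E}[w_{i4}\varepsilon_{i4}]$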
Problem: Two-step errors too small
Regression for Arellano-Bond (1991) column (a1), Table 4

Arellano-Bond dynamic panel-data estimation, one-step difference GMM results
------------------------------------------------------------------------------
             |                Robust
             |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
n            |
         L1. |   .6862261   .1445943     4.75   0.000     .4003376    .9721147
         L2. |  -.0853582   .0560155    -1.52   0.130    -.1961109    .0253944
w            |
         --. |  -.6078208   .1782055    -3.41   0.001    -.9601647   -.2554769
         L1. |   .3926237   .1679931     2.34   0.021     .0604714    .7247759
k            |
         --. |   .3568456   .0590203     6.05   0.000      .240152    .4735392

Arellano-Bond dynamic panel-data estimation, two-step difference GMM results
------------------------------------------------------------------------------
             |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
n            |
         L1. |   .6287089   .0904543     6.95   0.000     .4498646    .8075531
         L2. |  -.0651882   .0265009    -2.46   0.015    -.1175852   -.0127912
w            |
         --. |  -.5257597   .0537692    -9.78   0.000    -.6320709   -.4194485
         L1. |   .3112899   .0940116     3.31   0.001     .1254122    .4971675
k            |
         --. |   .2783619   .0449083     6.20   0.000     .1895702    .3671537
Problem, cont’d
• Problem appears to be one of overfitting
o Efficient GMM deemphasizes moments with high variance (high second moments)
o Feasible efficient GMM in small samples may deemphasize outliers (high first moments)
o Spurious precision
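Solution: finite-sample correction
(Windmeijer 2005)
o One-step estimate: $\hat{\beta}_1 = f(Y)$ (conditioning on $X$, $Z$)
o One-step residuals used to construct $\hat{\Omega}$:
$$\hat{\beta}_2 = \left( X'Z (Z'\hat{\Omega} Z)^{-1} Z'X \right)^{-1} X'Z (Z'\hat{\Omega} Z)^{-1} Z'Y \equiv g(Y, \hat{\Omega}) = g(Y, f(Y))$$
o Standard estimate of $\operatorname{Var}[\hat{\beta}_2]$ treats $\hat{\Omega}$ as constant, observed, precise, despite its dependence on random $Y$
o Taylor expansion of $g$ around true $\beta$:
$$\hat{\beta}_2 = g(Y, \hat{\Omega}_{\hat{\beta}_1}) \approx g(Y, \hat{\Omega}_{\beta}) + \left. \frac{\partial g(Y, \hat{\Omega}_{\hat{\beta}})}{\partial \hat{\beta}} \right|_{\hat{\beta} = \beta} (\hat{\beta}_1 - \beta)$$
o "Correction" comes from second term
• $\operatorname{E}[\hat{\beta}_1 - \beta] = 0$, so no effect on $\operatorname{E}[\hat{\beta}_2]$: no coefficient bias
• Affects variance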
Arellano-Bond dynamic panel-data estimation, two-step difference GMM results
------------------------------------------------------------------------------
             |             Corrected
             |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
n            |
         L1. |   .6287089   .1934138     3.25   0.001     .2462954    1.011122
         L2. |  -.0651882   .0450501    -1.45   0.150    -.1542602    .0238838
w            |
         --. |  -.5257597   .1546107    -3.40   0.001    -.8314524   -.2200669
         L1. |   .3112899   .2030006     1.53   0.127    -.0900784    .7126582
k            |
         --. |   .2783619   .0728019     3.82   0.000     .1344196    .4223043
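To gauge how much the correction matters in this example, one can take the ratio of corrected to uncorrected two-step standard errors. The snippet below is simple arithmetic on the figures reported in the output tables of this deck; the dictionary keys are informal labels chosen here, not Stata names.

```python
# Two-step standard errors copied from the output tables in this deck.
uncorrected = {"L1.n": 0.0904543, "L2.n": 0.0265009, "w": 0.0537692,
               "L1.w": 0.0940116, "k": 0.0449083}
corrected = {"L1.n": 0.1934138, "L2.n": 0.0450501, "w": 0.1546107,
             "L1.w": 0.2030006, "k": 0.0728019}

# Ratio of Windmeijer-corrected to uncorrected standard error, per coefficient
ratios = {name: corrected[name] / uncorrected[name] for name in uncorrected}
for name, ratio in ratios.items():
    print(f"{name:5s} corrected / uncorrected = {ratio:.2f}")
```

Every corrected standard error here is at least about 1.6 times its uncorrected counterpart, and for the coefficient on `w` the ratio approaches 2.9.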
Problem: too many instruments
• In difference and system GMM, # instruments ($j$) quadratic in $T$
• Analogy:
– In 2SLS, if $j$ = # of observations, first-stage $R^2$'s = 1.0 and 2SLS = OLS (biased)
– Too many instruments overfit endogenous variables
• And # of cross-moments in $\operatorname{Var}[Z'E \mid X, Z]^{-1}$ to be estimated for efficient GMM is quadratic in $j$, hence quartic in $T$!
• Estimate of $\operatorname{Var}[Z'E \mid X, Z]^{-1}$ degrades
• Hansen test very weak: p values of 1.000 not uncommon
• Little guidance on how many is too many
• xtabond2 warns if $j > N$
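The quadratic growth is easy to tabulate. A minimal sketch, assuming a single GMM-style variable in the difference equation, periods numbered 1 to T, and every lag of depth 2 or more used as an instrument; the function name is hypothetical.

```python
def count_gmm_instruments(T, min_lag=2):
    """Columns generated by one GMM-style variable when every lag of depth
    >= min_lag is used, with one column per (period, lag) pair."""
    return sum(max(0, t - min_lag) for t in range(1, T + 1))

counts = [count_gmm_instruments(T) for T in (5, 10, 20, 40)]
for T, j in zip((5, 10, 20, 40), counts):
    print(T, j)
```

This prints 6, 36, 171, and 741 instruments for T = 5, 10, 20, and 40: each doubling of T roughly quadruples the count, in line with the closed form (T − 2)(T − 1)/2.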
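Solution: consider limiting instruments
• Limit number of lags of variables used as instruments
• Or "collapse" instruments:
$$\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
y_{i1} & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & y_{i2} & y_{i1} & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & y_{i3} & y_{i2} & y_{i1} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\quad\longrightarrow\quad
\begin{pmatrix}
0 & 0 & 0 & \cdots \\
0 & 0 & 0 & \cdots \\
y_{i1} & 0 & 0 & \cdots \\
y_{i2} & y_{i1} & 0 & \cdots \\
y_{i3} & y_{i2} & y_{i1} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}$$
$$\sum_i y_{i,t-2}\,\hat{e}_{it} = 0 \text{ for each } t \ge 3
\quad\longrightarrow\quad
\sum_{i,t} y_{i,t-2}\,\hat{e}_{it} = 0$$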
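The collapsed layout can be sketched the same way as the uncollapsed one. Again this is an illustration only, with a hypothetical function name: each lag depth gets a single column shared by all periods, so the number of columns grows linearly rather than quadratically in T.

```python
def collapsed_block(y, min_lag=2):
    """Collapsed GMM-style block for one individual.

    y holds y_i1 ... y_iT. There is one column per lag depth
    (min_lag, min_lag + 1, ...), and row t holds y_{i,t-lag} where that
    observation exists and 0 otherwise.
    """
    T = len(y)
    lags = list(range(min_lag, T))  # lag depths min_lag .. T-1
    return [[y[t - lag - 1] if t - lag >= 1 else 0 for lag in lags]
            for t in range(1, T + 1)]

# Five periods with made-up values standing in for y_i1 ... y_i5
collapsed = collapsed_block([10, 20, 30, 40, 50])
for row in collapsed:
    print(row)
```

For five periods this yields three columns instead of six, matching the first five rows of the collapsed matrix on the slide.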
xtabond2 syntax
xtabond2 depvar varlist [if exp] [in range]
    [, level(#) twostep robust noconstant small noleveleq
    artests(#) arlevels h(#) nomata
    ivopt [ivopt ...] gmmopt [gmmopt ...]]

(depvar is Y, varlist is X, and the ivopt and gmmopt options build Z.)

where gmmopt ("GMM-style") is
    gmmstyle(varlist [, laglimits(# #) collapse
    equation({diff | level | both}) passthru])
and ivopt (classic) is
    ivstyle(varlist [, equation({diff | level | both})
    passthru mz])
Examples
• Classic one-step difference GMM with no controls except time dummies
xi: xtabond2 y L.y i.t, gmm(y, laglim(2 .)) ///
    iv(i.t) robust noleveleq

• Equivalents:
xi: xtabond2 y L.y i.t, gmm(L.y, laglim(1 .)) ///
    iv(i.t) robust noleveleq
xi: xtabond2 y L.y i.t, gmm(L.y) ///
    iv(i.t) robust noleveleq

• System GMM, two-step, Windmeijer correction, w1 exogenous, w2 predetermined, w3 endogenous:
xi: xtabond2 y L.y w1 w2 w3 i.t, ///
    gmm(L.y w2 L.w3) iv(i.t w1) two robust
Examples, cont’d
If conditions are imposed only on levels, the difference equation is effectively discarded.
Equivalent pairs:
regress n w k
xtabond2 n w k, iv(w k, eq(level)) small

ivreg2 n cap (w = k ys)
xtabond2 n w cap, iv(cap k ys, eq(level))

ivreg2 n cap (w = k ys), cluster(id) gmm
xtabond2 n w cap, iv(cap k ys, eq(level)) two

Or even:
regress n w k
abar, lags(2)
xtabond2 n w k, iv(w k, eq(level)) small arlevel
Run times for bbest (seconds), 700 MHz PC
xtabond2 ado                     57
xtabond2 Mata, favoring space    14
xtabond2 Mata, favoring speed    11
DPD for Ox                        3
