Estimation of Dynamic Panel Data Models with Sample Selection

Anastasia Semykina*
Department of Economics
Florida State University
Tallahassee, FL 32306-2180
[email protected]

Jeffrey M. Wooldridge
Department of Economics
Michigan State University
East Lansing, MI 48824-1038
[email protected]

March 2, 2011

* Correspondence to: Anastasia Semykina, Department of Economics, Florida State University,
Tallahassee, FL 32306-2180, USA. E-mail: [email protected]. Phone: 850-644-4557.
Fax: 850-644-4535.
We thank the editor M. Hashem Pesaran and three anonymous referees for their useful
comments.

Summary

We propose a new method for estimating dynamic panel data models with selection.
The method uses backward substitution for the lagged dependent variable, which leads to
an estimating equation that requires correcting for contemporaneous selection only. The
estimator is valid under relatively weak assumptions about errors and permits avoiding
the weak instruments problem associated with differencing. We also propose a simple test
for selection bias that is based on the addition of a selection term to the first-difference
equation and subsequent testing for significance of this term. The methods are applied
to estimating dynamic earnings equations for women.

Key words: Sample selection, Panel data, Dynamic models, Two-step estimation.

1 Introduction

Recently developed methods for estimating dynamic unobserved effects panel data models
have become widely used in applied economics research. In the present paper, we
contribute to the literature by developing a new estimation method for models in which
the panel is unbalanced because of nonrandom selection.
In the absence of selection, the traditional approach to estimating dynamic panel data
models is to remove the unobserved effect by first-differencing and then use instrumental
variables methods for estimating the differenced equation. This approach was initially
proposed by Anderson and Hsiao (1981) and was later considered within a more efficient
generalized method of moments (GMM) framework by Holtz-Eakin, Newey and Rosen
(1988), Arellano and Bond (1991), Ahn and Schmidt (1995), and others.
Blundell and Bond (1998) raised the problem of weak instruments in the context of
the first-differenced GMM estimation. This problem arises when the series are highly
persistent, which happens in a simple AR(1) model with the autoregressive coefficient
close to unity.1 Blundell and Bond show that imposing restrictions on the initial condition
results in additional linear moments that can help to improve the performance of the GMM
estimator. As an alternative solution, they model the relationship between the unobserved
effect and initial condition through a linear function and suggest using the generalized
least squares estimator on the extended model, where the initial value is included in the
conditioning set.
Several previous studies considered estimation of dynamic panel data models with
selectivity; most of them use differencing to remove the unobserved effect.2 Ziliak and
Kniesner (1998) and Wooldridge (2002) propose a solution to the selection problem that
arises because of nonrandom attrition. Given the nature of attrition as an absorbing state
(if the unit is observed in the current period, it is also observed in the previous period),
Ziliak and Kniesner, and Wooldridge show that accounting for the current period selection
in the differenced equation results in consistent estimation. Under the assumption that
errors in the selection equation are normally distributed, the selection correction term is
the inverse Mills ratio.

1 Binder, Hsiao and Pesaran (2005) show that the same problem arises in panel vector
autoregressive models.
2 Dynamic panel data models with censoring are considered, for example, by Honoré and Hu
(2004), Hu (2002) and Labeaga (1999). See also Bover and Arellano (1997).

Arellano, Bover and Labeaga (1999) consider autoregressive panel data models with
sample selection. They model the conditional expectation of the unobserved effect as a
linear function of the past values of the dependent variable and consider the distribution
of the dependent variable conditional on its past. For each t, the resulting reduced-form
equation is estimated on a sub-sample of data, which includes cross-section units without
missing past values. Arellano, Bover and Labeaga assume normality of the error terms
in both primary and selection equations and use the inverse Mills ratio to account for
the fact that only the sub-samples with observed past values are used. The structural
autoregressive coefficient is then recovered from the reduced-form coefficients using the
restrictions imposed on parameters.
Another solution to the incidental truncation problem in dynamic panel data models
was proposed by Kyriazidou (2001), who suggested taking differences between any two
periods in which the selection index for the given unit is the same or “similar.” Under the
assumption that the vector of errors is independent and identically distributed over time
conditional on the exogenous variables, differencing eliminates both the unobserved effect
and selection effect. For consistency, it is crucial that the assumptions of strict stationarity
and conditional serial independence of the errors hold. Moreover, the estimator converges
at a rate that is slower than the usual square root of the cross-section sample size.
Another semiparametric estimator was proposed by Gayle and Viauroux (2007), who
consider a three-step sieve estimator. In the first step the selection probabilities in each
period are estimated nonparametrically by a kernel estimator. In the second step the
inverse probability function is linearized, the unobserved effect is removed by differencing
and the parameters in the linearized specification of the inverse probability function are
estimated using a sieve minimum distance estimator (a GMM estimator with series used
to approximate unknown functions). In the third step the GMM estimator is used to
estimate the differenced primary equation augmented by the correction term, where the
differenced correction term is again approximated by series estimators.
As mentioned above, most earlier studies use differencing. A benefit of differencing for
unbalanced panels is that it removes additive heterogeneity, and therefore any selection is
allowed to be arbitrarily correlated with the heterogeneity in the levels equation. Unfor-
tunately, if selection depends on the idiosyncratic shocks, consistent estimation requires
either imposing relatively strong assumptions on the properties of error distributions or
necessitates derivation of a complicated selection correction term that accounts for selec-
tion in several consecutive periods. As noted by Blundell and Bond (1998), differencing
may also lead to a weak instruments problem. Furthermore, in the case of incidental trun-
cation – such as labor force participation – units may drop out and appear again in any
period; therefore, the use of first-differencing or otherwise conditioning on observability
of the dependent variable in multiple consecutive periods in dynamic panel data models
with arbitrary selection patterns implies that much of the data is lost.
In this paper we consider an alternative method for estimating dynamic panel data
models with selection, which does not rely on differencing. One of the key assumptions
is that the initial condition is observed for all cross-section units. To account for unob-
served heterogeneity, rather than using differencing we follow Blundell and Bond (1998)
and Chamberlain (1980, 1982, 1984), and model the conditional expectation of the unob-
served effect as a linear function of the exogenous variables and initial condition. Then,
backward substitution for the lagged dependent variable is used to obtain the equation
that contains the lags of the exogenous explanatory variables (which are assumed to be
always observed) and the initial condition, but no lags of the dependent variable. As a
result, selection correction reduces to a contemporaneous selection problem of the type
studied in Wooldridge (1995) with strictly exogenous variables. The ability to focus on
selection period-by-period greatly simplifies the derivation of the correction term while
allowing general serial correlation in the error of the selection equation. The simplest ap-
proach relies on the assumption that the error terms in the selection equation are normally
distributed, but we also briefly discuss the possibility of semiparametric estimation. Once
the correction term is obtained, the augmented equation can be consistently estimated
by nonlinear least squares (NLS) or GMM.
The new estimation methods have several important advantages. Modeling the un-
observed effects allows us to estimate the equation of interest in levels, thereby avoiding
the weak instruments problem often associated with the estimators that use differencing.
In the discussed context the error terms in both primary and selection equations may
be heterogeneously distributed over time, and the error in the selection equation may be
arbitrarily serially dependent. We also discuss how estimation can be modified, so that
the observability of the initial condition is not required, and serial dependence in the error
terms is permitted in both equations. Additionally, the approach proposed here makes
use of all cross-section units observed at least once after the initial period, which helps to
avoid losing data.

2 The Model

Consider a dynamic panel data model with unobserved heterogeneity:

$$
y_{it} = \rho y_{i,t-1} + x_{it}\beta + c_{i1} + u_{it1}, \quad t = 1, \ldots, T, \tag{1}
$$

where xit is a 1×K vector of time-varying variables, β is a K×1 vector of parameters, ρ
is a scalar parameter, ci1 is a time-constant unobserved effect, and uit1 is an idiosyncratic
error. Variables in xit are assumed to be strictly exogenous conditional on the unobserved
effect, but may be correlated with ci1 .
Selection occurs because of the partial observability of the dependent variable, yit .
This is modeled by specifying a selection rule

$$
s_{it} = 1[z_{it}\delta_{2t} + c_{i2} + u_{it2} > 0], \quad t = 1, \ldots, T, \tag{2}
$$

where sit is a selection indicator that equals one if yit is observed and is zero otherwise, ci2
is a time-constant unobserved effect, uit2 is an idiosyncratic error, zit is a 1×L (L > K)
vector of variables that are strictly exogenous conditional on the unobserved effect, and
δ2t is an L×1 vector of parameters. In what follows, it is assumed that zit contains all of
the regressors from the primary equation, but must also contain at least one additional
time-varying variable. Additional variables may be the factors that affect selection but
not the dependent variable in the primary equation. Alternatively, if selection is partly
determined by the lagged values of yit (as in some labor supply models, for example),
vector zit would include lagged values of xit .
Given the selection problem, estimation of equation (1) by differencing is complicated
for several reasons. First, we need to observe the dependent variable and explanatory
variables in the current and previous periods. Because of the lagged dependent variable,
we would only be able to use observations where yit is observed in three consecutive peri-
ods. Moreover, any selection correction term would involve conditioning on observability
in three different periods, making its derivation and estimation difficult.
We can avoid these problems by substituting back for yi,t−1 and expressing yit through
the current and lagged values of the explanatory variables and the initial condition, yi0:

$$
y_{it} = \rho^{t} y_{i0} + \Bigl(\sum_{j=0}^{t-1} \rho^{j} x_{i,t-j}\Bigr)\beta
       + c_{i1}\sum_{j=0}^{t-1} \rho^{j} + \sum_{j=0}^{t-1} \rho^{j} u_{i,t-j,1},
\quad t = 1, \ldots, T. \tag{3}
$$
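
The backward substitution can be checked numerically. The following is a minimal sketch, assuming a scalar regressor and made-up parameter values; the function names are illustrative and do not come from the paper. It iterates equation (1) forward and compares each period with the closed form in equation (3).

```python
def iterate_forward(rho, beta, c, y0, x, u):
    """Iterate equation (1): y_t = rho*y_{t-1} + x_t*beta + c + u_t.

    x[t-1] and u[t-1] hold the period-t values for t = 1..T (scalar regressor).
    Returns the list [y_0, y_1, ..., y_T].
    """
    y = [y0]
    for x_t, u_t in zip(x, u):
        y.append(rho * y[-1] + x_t * beta + c + u_t)
    return y


def closed_form(rho, beta, c, y0, x, u, t):
    """Equation (3): y_t in terms of y_0 and the current and lagged x and u."""
    total = rho ** t * y0
    for j in range(t):
        # the period (t - j) value is stored at index t - j - 1
        total += rho ** j * (x[t - j - 1] * beta + c + u[t - j - 1])
    return total


# Hypothetical values, chosen only for illustration.
rho, beta, c, y0 = 0.8, 1.5, 0.3, 2.0
x = [1.0, -0.5, 2.0, 0.7]
u = [0.1, -0.2, 0.05, 0.0]

path = iterate_forward(rho, beta, c, y0, x, u)
max_gap = max(abs(path[t] - closed_form(rho, beta, c, y0, x, u, t))
              for t in range(1, len(x) + 1))
```

The two representations agree up to floating-point rounding, which is the sense in which (3) contains no lags of the dependent variable other than the initial condition.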

Denote zi ≡ (zi1 , zi2 , . . . , ziT ). Given (3), the estimating equation can be derived
under the following assumption:

ASSUMPTION 2.1

(i) $y_{i0}$ and $z_i$ are always observed, while $y_{it}$, $t = 1, \ldots, T$, are observed only for $s_{it} = 1$.

(ii) $E(u_{it1} \mid x_{it}, y_{i,t-1}, x_{i,t-1}, \ldots, y_{i0}, c_{i1}) = 0$, so that $\mathrm{Cov}(u_{it1}, u_{is1}) = 0$, for all $s \neq t$.

(iii) $E(u_{it1} \mid z_i, y_{i0}, c_{i1}) = 0$, $t = 1, \ldots, T$.

(iv) $c_{i1} = \eta_1 + \sum_{s=1}^{T} \xi_s z_{is} + \gamma_1 y_{i0} + a_{i1}$, $E(a_{i1} \mid z_i, y_{i0}) = 0$.

(v) $c_{i2} = \eta_2 + \sum_{s=1}^{T} \psi_s z_{is} + \gamma_2 y_{i0} + a_{i2}$.

(vi) For $v_{it2} = a_{i2} + u_{it2}$, $v_{it2} \mid z_i, y_{i0} \sim \mathrm{Normal}(0, \sigma_{2t}^{2})$, $t = 1, \ldots, T$.

(vii) For $v_{it1} = \sum_{j=0}^{t-1} \rho^{j}(u_{i,t-j,1} + a_{i1})$, $E(v_{it1} \mid z_i, y_{i0}, v_{it2}) = \varphi_{2t} v_{it2}$, $t = 1, \ldots, T$.

According to part (ii) of Assumption 2.1, the conditional mean in equation (1) is
assumed to be dynamically complete, which is a rather standard assumption in the lit-
erature. This part of the assumption ensures that yi0 is exogenous with respect to the
final error in (3). At the end of this section we discuss an alternative set of assumptions
and the corresponding estimating equation, where the dynamic completeness assumption
is dropped, so that {uit1 } may be serially correlated.
Part (iv) of Assumption 2.1 uses Chamberlain’s (1980, 1982, 1984) device to model the
conditional mean of the unobserved effect, ci1 , as a linear function of exogenous variables
(see also Blundell and Bond, 1998). This approach was used by Wooldridge (2005) in
the context of nonlinear dynamic panel data models with balanced panels. In general,
zit may contain time-constant variables; of course, the leads and lags of such variables
would not be included in the conditional mean of ci1 . A non-zero correlation between
the time-constant variables and ci1 implies that the effect of these variables cannot be
distinguished from that of the unobserved heterogeneity. However, it may still be useful
to include the time-invariant characteristics in zit because controlling for more variables
can help to improve on the precision of the estimator.
Under Assumption 2.1, parts (i)-(iv), the primary equation can be written as

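$$
y_{it} = \rho^{t} y_{i0} + \Bigl(\sum_{j=0}^{t-1} \rho^{j} x_{i,t-j}\Bigr)\beta
       + \frac{1-\rho^{t}}{1-\rho}\Bigl(\eta_1 + \sum_{s=1}^{T} \xi_s z_{is} + \gamma_1 y_{i0}\Bigr) + v_{it1},
$$
$$
E(v_{it1} \mid z_i, y_{i0}) = 0, \quad t = 1, \ldots, T, \tag{4}
$$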

where $v_{it1} = \sum_{j=0}^{t-1} \rho^{j}(u_{i,t-j,1} + a_{i1})$, $t = 1, \ldots, T$, are the new error terms, which will be
serially correlated even though the initial idiosyncratic errors were not.
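
The step from (3) to (4) uses only the finite geometric sum and part (iv) of Assumption 2.1. As a short sketch of the algebra, stated for ρ ≠ 1 (at ρ = 1 the sum of powers simply equals t):

```latex
\begin{aligned}
c_{i1}\sum_{j=0}^{t-1}\rho^{j}
  &= \frac{1-\rho^{t}}{1-\rho}\,c_{i1}
   = \frac{1-\rho^{t}}{1-\rho}\Bigl(\eta_1 + \sum_{s=1}^{T}\xi_s z_{is} + \gamma_1 y_{i0}\Bigr)
     + a_{i1}\sum_{j=0}^{t-1}\rho^{j}, \\
v_{it1}
  &= \sum_{j=0}^{t-1}\rho^{j}u_{i,t-j,1} + a_{i1}\sum_{j=0}^{t-1}\rho^{j}
   = \sum_{j=0}^{t-1}\rho^{j}\,\bigl(u_{i,t-j,1} + a_{i1}\bigr).
\end{aligned}
```

The first line moves the modeled part of the unobserved effect into the regression function, and the second collects what remains into the composite error.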
Equation (4) can be used to estimate the parameters when the panel is balanced.3
Estimating equation (4) by NLS or GMM can serve as an alternative to traditional estima-
tors that combine first differencing with instrumental variables methods. As mentioned in
the introduction, a GMM estimator that uses first-differenced data suffers from the weak
instruments problem when the series are highly persistent. Specifically, for a sequentially
exogenous variable ωit, such as a lagged dependent variable, we can write the data
generating process as ωit = ρωi,t−1 + εit, where Cov(εis, εit) = 0 for s ≠ t. In the extreme case,
where ρ = 1, ∆ωit = εit, so that past values (ωi,t−1, . . . , ωi1) are not correlated with ∆ωit
and hence, cannot be used as instruments. When ρ is close to one, the lagged values are
correlated with ∆ωit, but the correlation is weak, which results in the weak instruments
problem. It is important to note, however, that this problem arises only when the
estimation method is GMM. Binder, Hsiao and Pesaran (2005) proposed a quasi maximum
likelihood estimator that uses differencing to remove unobserved heterogeneity, but does
not suffer from the weak instruments problem. Similarly, Hsiao, Pesaran and Tahmiscioglu
(2002) propose a transformed likelihood approach and show that their maximum
likelihood estimator that uses differenced data performs better than the GMM estimator.

3 We thank the anonymous referee for bringing this fact to our attention. The referee also
noted that an interesting question is whether our approach is less efficient than the
Blundell and Bond (1998) approach. This is difficult to say, as the two approaches make
different assumptions about the initial condition.

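
To put a number on how quickly the instrument weakens, the sketch below evaluates the correlation between the lagged level and the first difference. It adds one assumption that is not in the text above: the AR(1) process is taken to be covariance stationary with |ρ| < 1, so that Var(ω) = σ²/(1 − ρ²). Under that assumption the correlation reduces to −√((1 − ρ)/2), independent of the innovation variance. The function name is illustrative.

```python
import math


def corr_lag_with_difference(rho):
    """Corr(w_{t-1}, w_t - w_{t-1}) for a stationary AR(1) with |rho| < 1.

    Cov(w_{t-1}, dw_t) = (rho - 1) * Var(w_{t-1}) and
    Var(dw_t) = 2 * sigma^2 / (1 + rho), so the ratio simplifies to
    -sqrt((1 - rho) / 2); the innovation variance cancels.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("the stationary formula requires |rho| < 1")
    return -math.sqrt((1.0 - rho) / 2.0)


# The correlation shrinks toward zero as rho approaches one.
weak = {rho: corr_lag_with_difference(rho) for rho in (0.5, 0.9, 0.98)}
```

At ρ = 0.5 the correlation is −0.5, while at ρ = 0.98 it is only about −0.1, which illustrates why lagged levels become poor instruments for the differenced variable in persistent series.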
In equation (4), the weak instruments problem does not arise. Because all variables in
(4) are in levels, all of them are exogenous under Assumption 2.1 parts (ii)-(iv) and hence,
are used as their own instruments. Although the estimator relies on time variation in the
variables, the source of this variation does not matter. Even if ρ = 1, the parameters in
(4) can be consistently estimated by NLS or GMM, as long as Var(εit) ≠ 0. As is true
for all panel data models with large N and fixed T , the autoregressive coefficient can be
identified from the cross-sectional variation in the data.
In the context of an unbalanced panel, under Assumption 2.1, parts (v) and (vi), the
selection equation can be written as

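$$
s_{it} = 1\Bigl[\eta_2 + z_{it}\delta_{2t} + \sum_{s=1}^{T} \psi_s z_{is} + \gamma_2 y_{i0} + v_{it2} > 0\Bigr],
\quad t = 1, \ldots, T, \tag{5}
$$
$$
v_{it2} \mid z_i, y_{i0} \sim \mathrm{Normal}(0, \sigma_{2t}^{2}), \quad t = 1, \ldots, T, \tag{6}
$$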

where Chamberlain's modeling device is used to model the distribution of the time-
constant unobserved effect, ci2 . Note that due to the presence of the unobserved effect,
the composite errors, vit2 = uit2 + ai2 , t = 1, . . . , T , are necessarily serially correlated.
Also, error variances are allowed to vary over time. The normality assumption is not
crucial for estimating the selection equation. As long as vit2 is independent of (zi , yi0 )
and the appropriate regularity conditions hold, parameters in (5) can be consistently
estimated using a semiparametric estimator (see, for example, Ichimura 1993, Klein and
Spady 1991). However, as discussed below, the derivation of the selection correction term
is substantially simplified if Assumption 2.1(vi) holds.
To correct for the selection bias, we consider a two-step estimator and use assumptions
similar to those in the standard selection literature in a cross-sectional context; see, for
example, Wooldridge (2002, Chapter 17). Specifically, from Assumption 2.1(vii) it follows
that

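$$
E[v_{it1} \mid z_i, y_{i0}, s_{it} = 1] = E[E(v_{it1} \mid v_{it2}) \mid z_i, y_{i0}, s_{it} = 1]
  = E[\varphi_{2t} v_{it2} \mid z_i, y_{i0}, s_{it} = 1]
$$
$$
  = h_t(z_i, y_{i0}) \equiv h_{it}, \quad t = 1, \ldots, T, \tag{7}
$$

where $h_{it} \equiv h_t(\eta_2 + z_{it}\delta_{2t} + \sum_{s=1}^{T} \psi_s z_{is} + \gamma_2 y_{i0})$, and $h_t(\cdot)$ is an unknown function.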
From (7), it follows that for sit = 1, equation (4) can be written as

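$$
y_{it} = \rho^{t} y_{i0} + \Bigl(\sum_{j=0}^{t-1} \rho^{j} x_{i,t-j}\Bigr)\beta
       + \frac{1-\rho^{t}}{1-\rho}\Bigl(\eta_1 + \sum_{s=1}^{T} \xi_s z_{is} + \gamma_1 y_{i0}\Bigr) + h_{it} + e_{it1},
$$
$$
E(e_{it1} \mid z_i, y_{i0}, s_{it} = 1) = 0, \quad t = 1, \ldots, T. \tag{8}
$$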

It is possible to estimate equation (8) semiparametrically. A semiparametric estimator
would be appropriate if either the error distribution in the selection equation is not normal,
or E(vit1 |zi , yi0 , vit2 ) is a nonlinear function of vit2 , or both. However, it is also useful to
consider a fully parametric approach that would lead to a simple estimation routine and
would help to avoid computational difficulties typically associated with semiparametric
methods. Therefore, in what follows we focus on the parametric case.
Under Assumption 2.1, parts (vi) and (vii), function ht is given by

$$
h_t(\cdot) = \varphi_{2t}\,\frac{\phi(\cdot)}{\Phi(\cdot)} \equiv \varphi_{2t}\,\lambda(\cdot), \tag{9}
$$

where φ(·) and Φ(·) are the standard normal pdf and cdf, respectively, and λ(·) is the inverse
Mills ratio. Thus, with some abuse of notation we can write the primary equation for the
selected sample as

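$$
y_{it} = \rho^{t} y_{i0} + \Bigl(\sum_{j=0}^{t-1} \rho^{j} x_{i,t-j}\Bigr)\beta
       + \frac{1-\rho^{t}}{1-\rho}\Bigl(\eta_1 + \sum_{s=1}^{T} \xi_s z_{is} + \gamma_1 y_{i0}\Bigr)
       + \varphi_{2t}\lambda_{it2} + e_{it1},
$$
$$
E(e_{it1} \mid z_i, y_{i0}, s_{it} = 1) = 0, \quad t = 1, \ldots, T, \tag{10}
$$

where $\lambda_{it2} \equiv \lambda(\eta_2 + z_{it}\delta_{2t} + \sum_{s=1}^{T} \psi_s z_{is} + \gamma_2 y_{i0})$. Under Assumption 2.1, equation (10) is
the final estimating equation that can be consistently estimated by NLS or GMM.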
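
The correction term itself is simple to compute. The sketch below evaluates the inverse Mills ratio λ(a) = φ(a)/Φ(a) using only the Python standard library; the function names are illustrative, and the naive ratio is assumed to be adequate for moderate index values (for very negative arguments Φ(a) underflows and a more careful implementation would be needed).

```python
import math


def norm_pdf(a):
    """Standard normal density."""
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)


def norm_cdf(a):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-a / math.sqrt(2.0))


def inverse_mills(a):
    """lambda(a) = phi(a) / Phi(a), the correction term in equations (9) and (10)."""
    return norm_pdf(a) / norm_cdf(a)


# lambda is large when selection is unlikely (very negative index)
# and shrinks toward zero when selection is almost certain.
values = [inverse_mills(a) for a in (-2.0, 0.0, 2.0)]
```

At a zero index the ratio equals √(2/π), and it declines monotonically in the index, so units that are very likely to be selected receive almost no correction.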
As an alternative approach, one could treat the initial condition as an unobserved
effect and model its conditional expectation as a linear function of exogenous variables,
as suggested by Chamberlain (1984).4 In this case, the dynamic completeness of the
conditional mean in equation (1) is not needed (and most likely will not hold), so that
the idiosyncratic errors in (1) may be serially correlated. Formally, the set of assumptions
can be summarized as follows:

ASSUMPTION 2.2

(i) $y_{i0}$ is not observed, $z_i$ is always observed, and $y_{it}$, $t = 1, \ldots, T$, are observed only for $s_{it} = 1$.

(ii) $E(u_{it1} \mid z_i) = 0$, $t = 1, \ldots, T$.

(iii) $y_{i0} = \sum_{s=1}^{T} \kappa_s z_{is} + b_i$, $E(b_i \mid z_i) = 0$.

(iv) $c_{i1} = \eta_1 + \sum_{s=1}^{T} \xi_s z_{is} + \gamma_1 y_{i0} + a_{i1} = \eta_1 + \sum_{s=1}^{T} (\xi_s + \gamma_1 \kappa_s) z_{is} + a_{i1} + \gamma_1 b_i$, $E(a_{i1} \mid z_i) = 0$.

(v) $c_{i2} = \eta_2 + \sum_{s=1}^{T} \psi_s z_{is} + \gamma_2 y_{i0} + a_{i2} = \eta_2 + \sum_{s=1}^{T} (\psi_s + \gamma_2 \kappa_s) z_{is} + a_{i2} + \gamma_2 b_i$.

(vi) For $v_{it2} = a_{i2} + \gamma_2 b_i + u_{it2}$, $v_{it2} \mid z_i \sim \mathrm{Normal}(0, \sigma_{2t}^{2})$, $t = 1, \ldots, T$.

(vii) For $v_{it1} = \rho^{t} b_i + \sum_{j=0}^{t-1} \rho^{j}(u_{i,t-j,1} + a_{i1} + \gamma_1 b_i)$, $E(v_{it1} \mid z_i, v_{it2}) = \varphi_{2t} v_{it2}$, $t = 1, \ldots, T$.

4 We thank the anonymous referee for suggesting that we consider this approach.

Under Assumption 2.2, for sit = 1, the primary equation can be written as

$$
y_{it} = \rho^{t}\sum_{s=1}^{T} \kappa_s z_{is} + \Bigl(\sum_{j=0}^{t-1} \rho^{j} x_{i,t-j}\Bigr)\beta
       + \frac{1-\rho^{t}}{1-\rho}\Bigl[\eta_1 + \sum_{s=1}^{T} \tilde{\xi}_s z_{is}\Bigr]
       + \varphi_{2t}\lambda_{it2} + e_{it1},
$$
$$
E(e_{it1} \mid z_i, s_{it} = 1) = 0, \quad t = 1, \ldots, T. \tag{11}
$$

where $\lambda_{it2} \equiv \lambda(\eta_2 + z_{it}\delta_{2t} + \sum_{s=1}^{T} \tilde{\psi}_s z_{is})$, $\tilde{\psi}_s = \psi_s + \gamma_2 \kappa_s$ and $\tilde{\xi}_s = \xi_s + \gamma_1 \kappa_s$. Similarly
to (10), parameters in equation (11) can be consistently estimated by NLS or GMM, as
discussed in the following two sections. Alternatively, one can estimate the reduced-form
equation and then obtain structural coefficients, ρ and β, using nonlinear restrictions on
parameters. In (11), it is possible to test the presence of the observed dynamics. If only
the unobserved dynamics is present, the lags of the exogenous variables would not appear
in equation (11), i.e. ρ would be zero.
Specifying the estimating equation as in (11) has the advantage of allowing serial
correlation in the idiosyncratic errors in equation (1). However, it also requires that the
model contain exogenous time-varying explanatory variables, and it ignores the
dynamics that are due to unobserved factors not included in the model. In what
follows, we focus on the approach in which the initial condition appears in the conditioning
set and the conditional mean in (1) is assumed to be dynamically complete, so that uit1
are serially uncorrelated. We emphasize, however, that equation (11) can also be estimated
using the proposed methods.

3 NLS Estimation

A simple way to obtain a consistent estimator of parameters in equation (10) is to
replace λit2 with its consistent estimator and estimate the parameters in two steps. Under
Assumption 2.1(v) and (vi), equation (5) can be consistently estimated by probit after
the error variance is normalized to equal unity. Since error variances may differ across
time periods, it is most appropriate to estimate the selection equation separately for each
time period. Denote the first-step estimators
$\hat{\pi}_t = (\hat{\eta}_{t2}, \hat{\psi}_{1t}, \ldots, \widehat{\delta_{2t} + \psi_{tt}}, \ldots, \hat{\psi}_{Tt}, \hat{\gamma}_{t2})'$,
$\hat{\pi} = (\hat{\pi}_1', \ldots, \hat{\pi}_T')'$, and the first-step vector of regressors $q_{it} = (1, z_{i1}, \ldots, z_{iT}, y_{i0})$. These
can be used to obtain $\hat{\lambda}_{it2} \equiv \lambda(q_{it}\hat{\pi}_t)$, and then $\hat{\lambda}_{it2}$ can be used instead of $\lambda_{it2}$ in equation
(10).
Denote the 1×[K + LT + T + 3] vector of the parameters θ ≡ (ρ, β, η1 , ξ1 , . . ., ξT , γ1 ,
ϕ21 , . . ., ϕ2T ). Parameters in θ can be consistently estimated by pooled nonlinear least
squares (NLS) on the selected sample.
Define the conditional expectation of yit :

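$$
m_{it}(\theta) \equiv m(z_i, y_{i0}, s_{it} = 1; \theta) = E(y_{it} \mid z_i, y_{i0}, s_{it} = 1), \tag{12}
$$

where

$$
m(z_i, y_{i0}, s_{it} = 1; \theta) = \rho^{t} y_{i0} + \Bigl(\sum_{j=0}^{t-1} \rho^{j} x_{i,t-j}\Bigr)\beta
  + \frac{1-\rho^{t}}{1-\rho}\Bigl(\eta_1 + \sum_{s=1}^{T} \xi_s z_{is} + \gamma_1 y_{i0}\Bigr)
  + \varphi_{2t}\lambda_{it2}. \tag{13}
$$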
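
For readers who want to see the regression function as code, the following sketch evaluates the right-hand side of equation (13) for one unit and one period. It is only an illustration: the function and argument names are hypothetical, plain Python lists stand in for the vectors, and the numbers in the usage line are made up. The factor (1 − ρ^t)/(1 − ρ) is computed as the finite sum of powers of ρ, which is the same quantity for ρ ≠ 1 and avoids a division by zero at ρ = 1.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cond_mean(t, rho, beta, eta1, xi, gamma1, phi2t, y0, x_lags, z_all, mills):
    """Right-hand side of equation (13) for one unit in period t.

    x_lags[j] is the K-vector x_{i,t-j} for j = 0..t-1;
    z_all[s] and xi[s] are the L-vectors z_{is} and xi_s (zero-based s here);
    mills is the (estimated) inverse Mills ratio for this unit and period.
    """
    dynamic = rho ** t * y0 + sum(rho ** j * dot(x_lags[j], beta) for j in range(t))
    # sum_{j<t} rho^j equals (1 - rho^t) / (1 - rho) whenever rho != 1
    geometric = sum(rho ** j for j in range(t))
    heterogeneity = eta1 + sum(dot(xi_s, z_s) for xi_s, z_s in zip(xi, z_all)) + gamma1 * y0
    return dynamic + geometric * heterogeneity + phi2t * mills


# Made-up inputs with K = L = 1 and T = 2, evaluated at t = 2.
m = cond_mean(t=2, rho=0.5, beta=[2.0], eta1=1.0, xi=[[0.5], [0.0]], gamma1=0.25,
              phi2t=2.0, y0=4.0, x_lags=[[1.0], [3.0]], z_all=[[2.0], [7.0]], mills=0.5)
```

With these inputs the dynamic part is 6.0, the heterogeneity part is 1.5 × 3.0 = 4.5, and the selection correction is 1.0, so the function returns 11.5.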

The correction term, λit2 , is not available, but it can be replaced by a consistent estimator
mentioned above. In general, let mit (θ, π̂) be a conditional expectation obtained using the
estimators of the parameters in the selection equation. Then, the pooled NLS estimator
of θ is the solution to the minimization problem

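$$
\min_{\theta}\ \frac{1}{2}\sum_{i=1}^{N}\sum_{t=1}^{T} s_{it}\,[y_{it} - m_{it}(\theta, \hat{\pi})]^{2}, \tag{14}
$$

where one half is used as a multiplier for convenience. The first-order condition for this
problem is

$$
\sum_{i=1}^{N}\sum_{t=1}^{T} -s_{it}\,\nabla_{\theta} m_{it}(\hat{\theta}, \hat{\pi})'\,[y_{it} - m_{it}(\hat{\theta}, \hat{\pi})] = 0, \tag{15}
$$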

which can be solved for θ̂ using iterative procedures. As is standard in panel data
models, for identification it is necessary that T ≥ 2.
In summary, if Assumption 2.1 holds, a consistent estimator of θ can be obtained
from the following two-step procedure:

PROCEDURE 3.1

1. For each t = 1, . . . , T , estimate separate probit models,

sit on 1, zi1 , . . . , ziT , yi0 i = 1, . . . , N

and compute the inverse Mills ratios, λ̂it2 .

2. For sit = 1, estimate equation (10) with λit2 replaced by λ̂it2 by pooled NLS. Esti-
mate the asymptotic variance as described in Appendix A.
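
The first step of the procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the estimation code used in the paper: it treats a single period, uses one scalar regressor in place of the full vector (1, zi1, . . . , ziT, yi0), fits the probit by a hand-written Fisher scoring loop rather than a packaged routine, and runs on simulated data whose true coefficients (0.3 and 0.8) are made up.

```python
import math
import random


def norm_pdf(a):
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)


def norm_cdf(a):
    return 0.5 * math.erfc(-a / math.sqrt(2.0))


def probit_fit(s, q, iterations=25):
    """Fisher scoring for P(s = 1 | q) = Phi(b0 + b1 * q), one scalar regressor."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for s_i, q_i in zip(s, q):
            a = b0 + b1 * q_i
            p = min(max(norm_cdf(a), 1e-10), 1.0 - 1e-10)
            d = norm_pdf(a)
            score = d * (s_i - p) / (p * (1.0 - p))   # probit score contribution
            weight = d * d / (p * (1.0 - p))          # expected information
            g0 += score
            g1 += score * q_i
            h00 += weight
            h01 += weight * q_i
            h11 += weight * q_i * q_i
        det = h00 * h11 - h01 * h01
        # update: b <- b + H^{-1} g for the 2x2 information matrix H
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Simulated selection data for one period (hypothetical design).
random.seed(0)
N = 2000
q = [random.gauss(0.0, 1.0) for _ in range(N)]
s = [1 if 0.3 + 0.8 * q_i + random.gauss(0.0, 1.0) > 0 else 0 for q_i in q]

b0_hat, b1_hat = probit_fit(s, q)
# Inverse Mills ratios for the selected observations only.
mills_hat = [norm_pdf(b0_hat + b1_hat * q_i) / norm_cdf(b0_hat + b1_hat * q_i)
             for q_i, s_i in zip(q, s) if s_i == 1]
```

In an application this loop would be repeated for each t with the full regressor vector, and the resulting λ̂it2 would enter the second-step NLS or GMM estimation as an additional regressor.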

From Procedure 3.1 it is apparent that one needs at least one additional exogenous
variable in the selection equation (L > K). Although the inverse Mills ratio, λ̂it2 , is a
nonlinear function of its argument, it is approximately linear over most of its range,
which may lead to multicollinearity. Thus, it is necessary to have at least one exclusion
restriction in order to make the estimation convincing.
Even though the resulting estimator is consistent, it is not efficient. From equations
(3) and (4) it is seen that the error terms in (10) are serially correlated. Besides, the
errors are going to be heteroskedastic because of selection. A nonlinear analog of the
seemingly unrelated regressions estimator (see Wooldridge 2002, Problem 12.7) cannot be
used in this context because selection is not strictly exogenous in the selection equation.
However, one can improve efficiency by using a GMM estimator, as discussed in the next
section.

4 GMM Estimation

The efficiency of the two-step estimator can be improved by using GMM at the second
step. Equation (10) is linear in regressors, but nonlinear in parameters, which results in
overidentification and permits obtaining a more efficient estimator than pooled NLS.
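To specify a GMM estimator, define a 1×(LT + 3) vector of instruments
$\hat{\omega}_{it} \equiv \omega_{it}(\hat{\pi}_t) \equiv (1, y_{i0}, z_{i1}, \ldots, z_{iT}, \hat{\lambda}_{it2})$, $t = 1, \ldots, T$,
and a T×T(LT + 3) matrix of instruments $\hat{W}_i$,

$$
\hat{W}_i \equiv W_i(\hat{\pi}) \equiv
\begin{pmatrix}
\hat{\omega}_{i1} & 0 & 0 & \cdots & 0 & 0 \\
0 & \hat{\omega}_{i2} & 0 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & \hat{\omega}_{iT}
\end{pmatrix}
\tag{16}
$$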

Here 0 denotes a 1×(LT + 3) vector of zeros.

Define a T ×1 vector ĝi ≡ gi (θ, π̂) ≡ (ĝi1 , . . . , ĝiT )′ , where

ĝit ≡ git (θ, π̂t ) ≡ sit [yit − mit (θ, π̂)], t = 1, . . . , T. (17)

From equation (10) it follows that the following moment conditions are available:

E[Wi (π)′ gi (θ, π)] = 0. (18)

Since the conditional expectation of yit is different in each time period, equation (18)
implies T (LT + 3) moment conditions. Moreover, because mit (θ, π̂) is nonlinear in θ,
these conditions are not redundant and can be used to enhance efficiency.
The GMM estimator of θ is the solution to the minimization problem

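$$
\min_{\theta^{*}}\ \Bigl(\sum_{i=1}^{N} W_i(\hat{\pi})' g_i(\theta^{*}, \hat{\pi})\Bigr)'\,
\hat{\Omega}^{-1}\,\Bigl(\sum_{i=1}^{N} W_i(\hat{\pi})' g_i(\theta^{*}, \hat{\pi})\Bigr), \tag{19}
$$

where $\hat{\Omega}^{-1}$ is a consistent estimator of a T(LT + 3)×T(LT + 3) positive semidefinite
weighting matrix $\Omega^{-1}$. The first-order condition for this problem is given by

$$
\Bigl[\sum_{i=1}^{N} W_i(\hat{\pi})' \nabla_{\theta}\, g_i(\hat{\theta}, \hat{\pi})\Bigr]'\,
\hat{\Omega}^{-1}\,\Bigl[\sum_{i=1}^{N} W_i(\hat{\pi})' g_i(\hat{\theta}, \hat{\pi})\Bigr] = 0. \tag{20}
$$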

Then, θ can be consistently estimated using a procedure similar to Procedure 3.1, where
the GMM estimator is used instead of the pooled NLS estimator.
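Notice that the pooled NLS estimator is identical to a GMM estimator, which exploits
the moment conditions

$$
\sum_{t=1}^{T} E[\nabla_{\theta}\, g_{it}(\theta, \pi)'\, g_{it}(\theta, \pi)] = 0 \tag{21}
$$

and uses the weighting matrix

$$
\Bigl\{\sum_{t=1}^{T} E[\nabla_{\theta}\, g_{it}(\theta, \pi)'\, \nabla_{\theta}\, g_{it}(\theta, \pi)]\Bigr\}^{-1}. \tag{22}
$$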

Thus, in the NLS estimation, the instruments are “stacked” on top of each other, and
each time period receives an equal weight. In contrast, a general GMM estimator that
uses a block-diagonal matrix of instruments, as in equation (16), assigns different weights
to each time period, which can be used to improve efficiency. In the discussion below, we
refer to the solution of the minimization problem (19) as the GMM estimator.
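
Because the instrument matrix in (16) is block diagonal, the product of its transpose with the residual vector is just the period-by-period products of instruments and residuals laid end to end. The sketch below makes this concrete and evaluates the quadratic form in (19) for a given weighting matrix. The function names, the single hypothetical unit, and the identity weighting matrix are all illustrative choices, not part of the paper.

```python
def stacked_moments(omega, g):
    """W_i' g_i for the block-diagonal W_i in (16).

    omega[t] is the instrument row for period t and g[t] = s_it * residual_it,
    so the product is the blocks omega[t] * g[t] laid end to end.
    """
    out = []
    for omega_t, g_t in zip(omega, g):
        out.extend(w * g_t for w in omega_t)
    return out


def gmm_objective(units, weight):
    """Quadratic form in (19): (sum_i W_i' g_i)' A (sum_i W_i' g_i) with A = weight."""
    total = None
    for omega, g in units:
        m = stacked_moments(omega, g)
        total = m if total is None else [a + b for a, b in zip(total, m)]
    size = len(total)
    return sum(total[r] * weight[r][c] * total[c]
               for r in range(size) for c in range(size))


# One hypothetical unit with T = 2 and two instruments per period.
unit = ([[1.0, 2.0], [3.0, 4.0]], [0.5, -1.0])
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

moments = stacked_moments(*unit)
value = gmm_objective([unit], identity)
```

A period in which the unit is not selected has g[t] = 0 and therefore contributes a block of zeros, which is how the unbalanced panel enters the moment vector.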
The proposed GMM estimator will be consistent for any positive definite matrix Ω;
however, a particular form is preferred. Specifically, we formulate an additional
assumption:

ASSUMPTION 4.1

(i) Λ is the asymptotic variance of Wi (π̂)′ gi (θ, π̂).

(ii) Ω = Λ.

(iii) $\hat{\Omega} \xrightarrow{p} \Omega$.

Appendix A provides a formula for Ω̂ that satisfies Assumption 4.1. Following a
standard argument for the relative efficiency of the GMM estimator, the GMM estimator
that employs weighting matrix Ω̂ as specified in Assumption 4.1 is asymptotically more
efficient than pooled NLS and results in a relatively simple expression for the asymptotic
variance of θ̂. Specifically, denote $G \equiv E[W_i(\pi)' \nabla_{\theta}\, g_i(\theta, \pi)]$. If Ω satisfies Assumption 4.1,
then the asymptotic variance of the described GMM estimator is

$$
\mathrm{Avar}(\hat{\theta}) = (G'\Omega^{-1}G)^{-1}/N, \tag{23}
$$

which can be estimated as $(\hat{G}'\hat{\Omega}^{-1}\hat{G})^{-1}/N$, using the formulae provided in Appendix A.
We can now summarize a two-step estimation procedure. Let Assumptions 2.1 and
4.1 hold. Then, an estimator of θ that is asymptotically more efficient than the estimator
discussed in Section 3 can be obtained using the procedure:

PROCEDURE 4.1

1. For each t = 1, . . . , T , estimate separate probit models,

sit on 1, zi1 , . . . , ziT , yi0 i = 1, . . . , N

and compute the inverse Mills ratios, λ̂it2 .

2. In equation (10), replace λit2 with λ̂it2 . For sit = 1, estimate the equation by
GMM that uses moment conditions (18) and the weighting matrix that satisfies
Assumption 4.1. Estimate the asymptotic variance as described in Appendix A.

It is important to note that there are more moment conditions available in addition
to those specified in equation (18). Equation (10) implies that eit1 is uncorrelated with
any function of zi and yi0 . Therefore, any nonlinear functions of the exogenous variables
and the initial condition should be valid instruments and can be used to obtain additional
moment conditions.
The proposed two-step estimator can also be formulated as a joint GMM estimator
of (θ, π). As suggested by Newey and McFadden (1994, Section 6.1), such an estimator
can be obtained by “stacking” the moment conditions from the two steps. The moment
conditions from the second step are given in (18), while the first-order conditions from
the first-step estimation generate the additional moment conditions:

$$
E\Bigl[\{\Phi(q_{it}\pi_t)[1 - \Phi(q_{it}\pi_t)]\}^{-1}\,\phi(q_{it}\pi_t)\,q_{it}'\,[s_{it} - \Phi(q_{it}\pi_t)]\Bigr] = 0,
\quad t = 1, \ldots, T. \tag{24}
$$

The conditions in (18) and (24) can be used to form a vector of moment conditions
for the joint GMM estimation. In that way the additional conditions can be used for
estimating θ, which can help to improve efficiency. However, since the first-step equa-
tions are exactly identified, the efficiency gain may be modest or even not present at all.
Moreover, the two-step GMM estimator appears to be computationally more tractable
than the joint GMM estimator in applications where the number of the first-step moment
conditions is large, for example, due to T being relatively large.
To study the properties of the proposed estimators in finite samples we performed
Monte Carlo experiments.5 In the experiments, among the three estimators that account
for the selection bias (two-step NLS, two-step GMM and joint GMM that uses the moment
conditions for both equations) the two-step NLS estimator has the smallest standard
deviations and root mean square errors (RMSEs) in small samples (N = 200), which is
likely due to the fact that the GMM estimators use estimated weighting matrices, Ω̂, that
cannot be precisely estimated in small samples. However, in large samples (N = 4000)
both GMM estimators are more efficient than the two-step NLS estimator. The joint
GMM estimator tends to have slightly smaller standard deviations and RMSEs than the
two-step GMM estimator, but the differences are minor and virtually disappear when N
is large (N = 4000).

5 Detailed description of the experiments and all results are summarized in the supplement
to the paper, which is available from the authors upon request.

The two-step NLS, two-step GMM and joint GMM estimators also perform reasonably
well when testing simple hypotheses about the parameters. Although for all three estimators
the true null is rejected too often in small samples (with the over-rejection being most
severe for the two-step GMM estimator), the empirical size gets closer to the nominal
size as N grows. Both the two-step GMM and joint GMM estimators outperform the
two-step NLS estimator in terms of the power of the tests.
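
The quantities compared in these experiments (standard deviation, RMSE and the empirical size of a test) can be computed from the replication output as in the sketch below. The function names and the numbers are hypothetical and are not the authors' simulation results.

```python
import math

def monte_carlo_summary(estimates, true_value):
    """Bias, standard deviation and RMSE of an estimator across replications."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - true_value
    sd = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (n - 1))
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return {"bias": bias, "sd": sd, "rmse": rmse}

def empirical_size(p_values, nominal=0.05):
    """Share of replications in which a true null is rejected at the nominal level."""
    return sum(p < nominal for p in p_values) / len(p_values)

# Made-up replication output for an autoregressive coefficient with true value 0.6.
rho_hats = [0.55, 0.62, 0.58, 0.65]
summary = monte_carlo_summary(rho_hats, true_value=0.6)
size = empirical_size([0.01, 0.20, 0.60, 0.04, 0.30])
```

An empirical size above the nominal level corresponds to the over-rejection described above.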

5 Testing for Selection Bias

It is possible to test for selection bias by testing the hypothesis H0 : ϕ2t = 0 in equation
(10). A variety of tests for GMM estimators described in Newey and McFadden (1994,
Section 9) can be used for this purpose. However, such tests require estimation of the
restricted model, the unrestricted model, or both, prior to testing. Since estimation of equation
(10) may be computationally costly due to nonlinearity in the parameters, it is useful to
have a simple alternative.
A simple test can be developed based on the initial linear model (1). To construct
a test, introduce a new selection indicator which identifies observability of yit in three
consecutive periods, and nominally assume that this new indicator follows an index model
with unobserved heterogeneity:

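$$
\begin{aligned}
d_{it} &= 1[s_{it} \cdot s_{i,t-1} \cdot s_{i,t-2} = 1] \\
&= 1[z_{it}\delta_{30t} + z_{i,t-1}\delta_{31t} + z_{i,t-2}\delta_{32t} + c_{i3} + u_{it3} > 0], \qquad t = 3, \ldots, T,
\end{aligned}
\tag{25}
$$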

where ci3 is the unobserved effect and uit3 is the idiosyncratic error. Moreover, (nominally)
assume that uit3 is normally distributed and independent of the explanatory variables and
unobserved effect,
$$ u_{it3} \mid z_i, c_{i3} \sim \text{Normal}(0, 1). \tag{26} $$

Using Chamberlain’s approach and assuming normality, write the unobserved effect as

$$
c_{i3} = \eta_3 + \sum_{s=1}^{T} z_{is}\zeta_s + a_{i3}, \qquad
a_{i3} \mid z_i \sim \text{Normal}(0, \sigma_{3t}), \qquad t = 3, \ldots, T. \tag{27}
$$

Combining (25), (26), and (27) together gives

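$$
d_{it} = 1\Big[\eta_3 + z_{it}\delta_{30t} + z_{i,t-1}\delta_{31t} + z_{i,t-2}\delta_{32t} + \sum_{s=1}^{T} z_{is}\zeta_s + v_{it3} > 0\Big], \qquad
v_{it3} \mid z_i \sim \text{Normal}(0, 1 + \sigma_{3t}), \qquad t = 3, \ldots, T, \tag{28}
$$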

where vit3 ≡ ai3 + uit3 is a new composite error term. With regard to the error terms in
the primary equation, assume

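$$ E(\Delta u_{it1} \mid z_i, v_{it3}) = E(\Delta u_{it1} \mid v_{it3}) = \phi_{3t} v_{it3}, \qquad t = 3, \ldots, T, \tag{29} $$

which, when combined with the normality assumption, gives

$$
\begin{aligned}
E(\Delta u_{it1} \mid z_i, d_{it} = 1) &= \phi_{3t} E(v_{it3} \mid z_i, d_{it} = 1) \\
&= \phi_{3t}\, \lambda\Big(\eta_3 + z_{it}\delta_{30t} + z_{i,t-1}\delta_{31t} + z_{i,t-2}\delta_{32t} + \sum_{s=1}^{T} z_{is}\zeta_s\Big) \\
&\equiv \phi_{3t} \lambda_{it3}, \qquad t = 3, \ldots, T.
\end{aligned}
\tag{30}
$$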

After applying first differencing to equation (1), with some abuse of notation we can
write the differenced primary equation for dit = 1 as

$$
\Delta y_{it} = \rho \Delta y_{i,t-1} + \Delta x_{it}\beta + \phi_{3t}\lambda_{it3} + \epsilon_{it1}, \qquad
E(\epsilon_{it1} \mid z_i, d_{it} = 1) = 0, \qquad t = 1, \ldots, T. \tag{31}
$$

Thus, the unobserved effect is removed by first differencing and ϕ3t λit3 captures the se-
lection effect. Naturally, time-constant variables drop out from the equation. The test is
then performed using the following procedure:

PROCEDURE 5.1

1. For each of t = 3, ..., T, run a probit regression

$$ d_{it} \ \text{on} \ 1, z_{i1}, \ldots, z_{iT}, \qquad i = 1, \ldots, N, $$

and compute the inverse Mills ratios, λ̂it3.

2. For dit = 1, augment the first-differenced primary equation by λ̂it3 and its interac-
tions with time dummies and estimate the augmented equation by pooled two stage
least squares or GMM using yi,t−2 and leads and lags of zit as instruments for ∆yi,t−1
(∆xit , λ̂it3 and the interaction terms should be used as their own instruments). Use
the Wald test to test the hypothesis ϕ31 = . . . = ϕ3T = 0.
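
The indicator dit used in Procedure 5.1 can be built directly from the period-by-period selection indicators. A minimal sketch, assuming each unit's selection history is stored as a list of 0/1 values (the function name is hypothetical):

```python
def three_period_indicator(s_i):
    """d_it = s_it * s_{i,t-1} * s_{i,t-2}, defined from the third period of the list onward."""
    return [s_i[t] * s_i[t - 1] * s_i[t - 2] for t in range(2, len(s_i))]

# One unit observed in every period except the fourth.
s_i = [1, 1, 1, 0, 1, 1, 1]
d_i = three_period_indicator(s_i)   # [1, 0, 0, 0, 1]
```

A single missing period removes three consecutive values of dit, which is why the first-differenced sample is noticeably smaller than the levels sample.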

As an extension to the proposed procedure, it is possible to impose a restriction of
equal variances in the selection equation and estimate equation (28) by pooled probit.
Similarly, one may assume that the effect of selection is the same in all time periods and
omit the interaction terms in the second-step estimation. The test for selection bias in that
case is the usual t-test of the significance of the coefficient on λ̂it3. Note that the usual
variance-covariance matrix should be used for testing; there is no need to adjust for the
first-step estimation, because under the null hypothesis the coefficients on the selection
terms are zero and the first-step sampling error does not affect the limiting distribution of
the test statistic.
If in some period, t − j (for j = 3, . . . , t − 1), yi,t−j is observed for all cross-section
units, then yi,t−j can be used as an additional instrument in the second-step estimation.
Otherwise, if there are missing values for at least some i, then the observable variable is
(si,t−j · yi,t−j ), and this is not a valid instrument, since we did not account for selection in
period t − j when constructing λ̂it3 .
Importantly, the proposed test is valid regardless of whether or not the model in (25)
is correct and whether or not the normality assumption holds. All we need for testing is
a reasonable proxy for the selection effect, and the correct specification of the selection
term is not essential. If a selection problem is present, it should still be picked up
by a non-zero coefficient on the inverse Mills ratio in the differenced equation. As with
the estimators discussed above, having additional variables in zit that are not also in xit
helps to make the test more reliable.
When the hypothesis of no selection bias is not rejected, the pooled two stage least
squares or GMM estimation of the first-differenced equation with ∆xit , yi,t−2 , and leads
and lags of strictly exogenous variables used as instruments will produce consistent es-
timators. More distant lags can be used as additional instruments if observed for all
cross-section units. However, if the null is rejected, Procedure 5.1 will be a valid correction
procedure only if all the assumptions specified in this section are correct. Given that
the model for dit in equation (25) is quite restrictive, Procedure 5.1 is unlikely to perform
well as a correction method. Therefore, the methodology described in the previous two
sections should be used instead.

6 Empirical Application

This section illustrates the proposed methodology with an empirical example by applying
the new methods to the estimation of dynamic earnings equations for females. This
example is appropriate because earnings are largely determined by different historical
factors and tend to be correlated over time.
The data come from the Panel Study of Income Dynamics (PSID), years 1980 to
1992. The sample consists of white females, who were followed over the considered pe-
riod.6 Because when estimating equation (10) it is necessary that the initial condition
is observed, we keep only those females for whom 1980 earnings are available. The final
sample consists of 579 women, or 6,948 observations over the 12-year period (1981-1992).
For this period, the earnings sample is comprised of 5,891 observations. Thus, about 15%
of earnings data are missing due to non-participation.
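
The sample counts quoted above can be cross-checked with a few lines of arithmetic. The variable names are illustrative; the figures are the ones reported in the text.

```python
n_women = 579
n_periods = 1992 - 1981 + 1             # 12 years, 1981-1992
n_person_years = n_women * n_periods    # 6,948 person-year observations
n_earnings_obs = 5891                   # person-years with observed earnings
share_missing = 1 - n_earnings_obs / n_person_years   # roughly 0.15
```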
Because we define the population as women working in 1980, this exercise should
be viewed as an evaluation of the effects of movement in and out of the labor force
on estimated earnings equations. Such a question is of considerable interest in labor
economics.
The dependent variable in the primary equation is the natural logarithm of the average
annual hourly earnings, while the independent variables include age, age squared and
time dummies. We assume that age is strictly exogenous and is not correlated with the
unobserved effect. This assumption implies that the mean ability of women born in different
years is about the same. Our sample is restricted to women who have completed their
education (i.e. years of schooling do not vary over time); hence, the effect of education
is not separable from unobserved heterogeneity. Therefore, we only include education as
part of the unobserved effect. Additionally, to control for unobserved heterogeneity, we
include the number of children in all time periods (i.e. the number of children is assumed
to belong to zit , but not xit ).

6
We consider working-age women (ages 18-65) who were either household heads or “wives,” have
completed their education and are neither self-employed nor agricultural workers. The woman was excluded
from the analysis if her self-reported age exceeded the age constructed using information on the year of
birth by more than two years or self-reported age was smaller than the constructed age by more than
one year, or if the woman reported positive work hours and zero earnings.

The selection rule is for labor force participation. A woman is considered to be a
participant if she reports positive work hours in a given year. When estimating selection
equations, in the probit regressions in each time period we include education, age, age
squared, and the number of children in all time periods, where the number of children
may have a direct effect on the labor force participation. Log of hourly earnings in 1980
is included depending on whether the methodology of Sections 2-4 or the methodology of
Section 5 is used for the analysis.
Before applying the more advanced methods developed in Sections 2 through 4, we first
estimate equation (1) using the simple approach of Section 5. From the total 1980-1992
sample we keep observations for which earnings data are available in three consecutive
periods and use first differencing to remove the unobserved effect. As a result, the sample
size reduces to 5,033 observations; age and education drop out from the equation. Then,
we estimate the first-differenced equation by pooled instrumental variables using the log
of hourly earnings in t − 2 as an instrument for ∆yi,t−1 . We call this estimator the first
difference instrumental variables (FD-IV) estimator.
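
To make the mechanics of the FD-IV estimator concrete, the sketch below computes a just-identified pooled IV estimate of ρ from the moment condition that uses y in t − 2 as the instrument for the lagged difference. It is a simplified illustration, not the authors' code: it omits ∆xit and the time dummies, codes missing earnings as `None`, and uses a noise-free toy panel so that the true ρ = 0.5 is recovered exactly.

```python
def fd_iv_rho(panel):
    """Pooled IV estimate of rho in  dy_t = rho * dy_{t-1} + error,
    with y_{t-2} as the instrument for dy_{t-1}.
    A period is used only if y_t, y_{t-1} and y_{t-2} are all observed."""
    num = 0.0
    den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            if y[t] is None or y[t - 1] is None or y[t - 2] is None:
                continue
            dy_t = y[t] - y[t - 1]
            dy_lag = y[t - 1] - y[t - 2]
            num += y[t - 2] * dy_t      # instrument times differenced outcome
            den += y[t - 2] * dy_lag    # instrument times lagged difference
    return num / den

# Toy panel generated by y_t = c_i + 0.5 * y_{t-1} with no idiosyncratic error;
# the second unit has a missing final period.
panel = [
    [0.0, 1.0, 1.5, 1.75, 1.875],
    [2.0, 3.0, 3.5, 3.75, None],
]
rho_hat = fd_iv_rho(panel)   # 0.5
```

First differencing removes the unit-specific constant c_i, which is why the two units with different intercepts yield the same ρ.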
The estimates for the log earnings equations are reported in Table 1. The first column
of the table contains the estimates from the FD-IV regression without inverse Mills
ratios. The second column contains the test of selection bias in the first-differenced
equation using the results in Section 5. The estimate of ρ is rather similar in the two
columns; it is about 0.15-0.16 and is statistically significant at the 1% level. However, the
test suggests that selection bias may be present. The null of no selection is rejected at
the 8% significance level (p-value 0.074). Thus, one might conclude from the test using the
FD equation that selection into the work force may be systematically related to idiosyncratic
shocks to earnings.
The estimates obtained using the methods discussed in Sections 2-4 are reported in
the remaining three columns of Table 1. Columns (3) and (4) show estimation results
from regressions where the NLS estimator is used at the second step. Column (5) contains
the estimates obtained using Procedure 4.1, which employs GMM at the second step. The
estimates for the augmented log earnings equation are reported in columns (4) and (5).
Based on the Wald tests of the joint significance of the selection terms, the hypothesis
of no selection bias is rejected at the 5% level in both cases. Thus, we again find
evidence of selection bias.
The NLS and GMM estimates of ρ are very similar in all three regressions. The
estimate is about 0.6 and is significant at the 1% level, which provides evidence of state
dependence in earnings offers. This estimate is rather different from the one obtained using
first-differencing. Interestingly, similar results were obtained in Monte Carlo simulations,
where the FD-IV estimator had substantially larger biases than the NLS estimator that
did not account for selection. For all coefficient estimates, standard errors are smaller
when the GMM estimator is used at the second step.
Columns (3)-(5) show an estimated effect of another year of schooling of about 3%,
which is statistically significant at the 1% level. We emphasize, however, that this effect
is not distinguishable from unobserved heterogeneity. Moreover, the coefficient on years
of schooling in these regressions is not a true return to education because education has
an additional effect on earnings through the autoregressive earnings term.
The coefficients on age and age squared reveal the usual inverted U-shaped (concave)
profile, although the corresponding estimates are less precise, particularly in the NLS
regressions.
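
With a positive coefficient on age and a negative coefficient on age squared, the implied contemporaneous age profile rises up to a turning point at −b_age / (2 b_age²). The short calculation below uses the column (5) point estimates from Table 1. It is only a back-of-the-envelope illustration: it ignores sampling error and the additional dynamics that operate through the lagged dependent variable, and the function name is hypothetical.

```python
def turning_point(b_age, b_age_sq):
    """Age at which b_age * age + b_age_sq * age**2 peaks (requires b_age_sq < 0)."""
    return -b_age / (2.0 * b_age_sq)

# Column (5) of Table 1: coefficient 0.009 on age and -0.00013 on age squared.
peak_age_gmm = turning_point(0.009, -0.00013)   # about 34.6
```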
As a robustness check, we re-estimated the earnings equation using the data from
years 1981-1992. The sample was restricted to only include women who reported earnings
in 1981 (583 women).7 The resulting coefficient estimates and standard errors were very
similar to the ones reported in Table 1. The only noticeable change was observed for the
two-step GMM estimates of the coefficients on age and age squared, which became some-
what smaller and statistically insignificant. Based on the results of the joint Wald tests,
the null of no selection bias could not be rejected; however, several selection correction
terms were individually significant. Specifically, in the FD-IV regression the inverse Mills
ratios for years 1984, 1985 and 1991 were significant at the 5% significance level. The
correction term for year 1991 was also significant at the 5% level in the two-step NLS and
two-step GMM regressions. The table with detailed estimation results is available from
the authors upon request.
Returning to the discussion of the estimating equation in Section 2, we note that one
could also estimate the parameters using equation (11). In such a case, identification would
rely on time variation in strictly exogenous variables, age and age squared. Moreover, the
autoregressive coefficient, ρ, would only capture the observed dynamics. In applications
where there are no time-varying strictly exogenous variables in the model (i.e. xit is
empty), the data would not provide a distinction between the observed and unobserved
dynamics.8

7
The cross-section sample size increased because more women were working in 1981 than in 1980.
8
We thank the anonymous referee for suggesting that we include the discussion of this issue.

7 Conclusions

In this paper, new methods for estimating dynamic panel data models with selectivity
are proposed. A distinctive feature of the new estimators is that they do not rely on
differencing when treating the unobserved heterogeneity. This feature makes it possible to
avoid the weak instruments problem, which arises in the context of differencing if the series
are highly persistent or close to a unit root. The proposed correction is relatively simple
because the method requires correcting for selection in the current period only. The errors
in both the selection and primary equations may be heterogeneously distributed. The errors
in the selection equation may also be serially dependent, and a general form of
heteroskedasticity is allowed in the primary equation. Additionally, this paper develops a
simple test for sample selection bias.
The proposed methods are applied to the estimation of dynamic earnings equations
for females using the Panel Study of Income Dynamics data. Evidence of selection
bias is found in both the first-differenced equation and the equation obtained after
back-substitution. The NLS and GMM estimation based on the new methodology produces an
estimate of the autoregressive coefficient of about 0.6, which is rather different from the
estimate obtained from the instrumental variables estimation of the first-differenced equation.
The proposed correction procedure is parametric and assumes normality of the errors
in the selection equation. An important topic for future research is developing a
semiparametric estimator, which would not require parametric assumptions regarding the
error distributions. Such an estimator can be implemented within the framework of this
paper using methods similar to those considered in Semykina and Wooldridge (2010).

Appendix A

This section starts with a derivation of the variance of the GMM estimator. The derivation
of the variance of the pooled NLS estimator follows by analogy. Using the notation from
Section 3, let $\hat{\pi}_t = (\hat{\eta}_{t2}, \hat{\psi}_{1t}, \ldots, \widehat{\delta_{2t} + \psi_{tt}}, \ldots, \hat{\psi}_{Tt}, \hat{\gamma}_{t2})'$,
$\hat{\pi} = (\hat{\pi}_1', \ldots, \hat{\pi}_T')'$, be the first-step estimators, and let
$q_{it} = (1, z_{i1}, \ldots, z_{iT}, y_{i0})$ be the first-step vector of regressors. Also,
denote the vector of the parameters $\theta \equiv (\rho, \beta, \eta_1, \xi_1, \ldots, \xi_T, \gamma_1, \phi_{21}, \ldots, \phi_{2T})$.
Under the standard regularity conditions given, for example, in Wooldridge (2002,
Theorem 14.1), the GMM estimator, θ̂, is consistent when π is known. If π̂ is a consistent
estimator of π, the first stage estimation will not affect consistency of θ̂.
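By definition, if $\hat{\Omega}$ is a consistent estimator of a positive definite matrix $\Omega$, then
$\hat{\Omega} \xrightarrow{p} \Omega$. Also, by consistency of $\hat{\theta}$ and $\hat{\pi}$, and the weak law of large numbers,

$$ N^{-1} \sum_{i=1}^{N} W_i(\hat{\pi})' \nabla_\theta g_i(\hat{\theta}, \hat{\pi}) \xrightarrow{p} G, $$

where $G \equiv E[W_i(\pi)' \nabla_\theta g_i(\theta, \pi)]$ was also defined earlier.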

Recall that the first-order condition for the GMM estimator is as in equation (20). After
normalizing by the number of observations, taking the appropriate probability limits,
and expanding $N^{-1} \sum_{i=1}^{N} W_i(\hat{\pi})' g_i(\hat{\theta}, \hat{\pi})$ around $\theta$, we obtain

$$
G' \Omega^{-1} N^{-1/2} \sum_{i=1}^{N} W_i(\hat{\pi})' g_i(\theta, \hat{\pi})
+ G' \Omega^{-1} N^{-1/2} \sum_{i=1}^{N} W_i(\hat{\pi})' \nabla_\theta g_i(\theta, \hat{\pi}) (\hat{\theta} - \theta) + o_p(1) = 0,
$$

$$
\sqrt{N} (\hat{\theta} - \theta) = -C^{-1} G' \Omega^{-1} N^{-1/2} \sum_{i=1}^{N} W_i(\hat{\pi})' g_i(\theta, \hat{\pi}) + o_p(1), \tag{32}
$$

where $C \equiv G' \Omega^{-1} G$.
Next, we need to account for the first-stage estimation of $\pi$. In equation (32), both the
matrix of instruments and the function $g_i$ depend on $\hat{\pi}$. However, as is known, the use of
generated instruments does not affect the asymptotic variance of the GMM estimator. This
result follows from the conditional moment restrictions in equation (10), which imply that
$E[g_i(\theta, \pi) \mid x_i, y_{i0}, s_{it} = 1] = 0$, so that $g_i(\theta, \pi)$ is uncorrelated with any function of $(x_i, y_{i0})$
conditional on $s_{it} = 1$. Therefore, the mean-value expansion of
$N^{-1/2} \sum_{i=1}^{N} W_i(\hat{\pi})' g_i(\theta, \hat{\pi})$ around $\pi$ gives

$$
N^{-1/2} \sum_{i=1}^{N} W_i(\hat{\pi})' g_i(\theta, \hat{\pi})
= N^{-1/2} \sum_{i=1}^{N} W_i(\pi)' g_i(\theta, \pi) + F \sqrt{N} (\hat{\pi} - \pi) + o_p(1), \tag{33}
$$

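where $F \equiv E[W_i(\pi)' \nabla_\pi g_i(\theta, \pi)]$, and $\nabla_\pi g_i(\theta, \pi)$ is a block-diagonal matrix,

$$
\nabla_\pi g_i(\theta, \pi) =
\begin{pmatrix}
-s_{i1} q_{i1} \phi_{21} \lambda_{i12} (q_{i1}\pi_1 + \lambda_{i12}) & 0 & \cdots & 0 & 0 \\
 & & \ddots & & \\
0 & 0 & \cdots & 0 & -s_{iT} q_{iT} \phi_{2T} \lambda_{iT2} (q_{iT}\pi_T + \lambda_{iT2})
\end{pmatrix}.
\tag{34}
$$

Here we used the fact that the derivative of the inverse Mills ratio is equal to
$-q_{it} \lambda_{it2} (q_{it}\pi_t + \lambda_{it2})$ [see, for example, Wooldridge 2002, p. 522].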
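
The derivative of the inverse Mills ratio with respect to its scalar argument a is −λ(a)[a + λ(a)]; the chain rule with a = q_it π_t then brings in the factor q_it. The sketch below, with hypothetical function names, checks the scalar formula against a central finite difference.

```python
import math

def inverse_mills(a):
    """lambda(a) = pdf(a) / cdf(a) for the standard normal distribution."""
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-a / math.sqrt(2.0))
    return pdf / cdf

def inverse_mills_derivative(a):
    """Analytic derivative: d lambda / d a = -lambda(a) * (a + lambda(a))."""
    lam = inverse_mills(a)
    return -lam * (a + lam)

def central_difference(f, a, h=1e-5):
    """Numerical derivative used only as a cross-check."""
    return (f(a + h) - f(a - h)) / (2.0 * h)

# Largest discrepancy between the analytic and numerical derivatives on a few points.
max_gap = max(
    abs(inverse_mills_derivative(a) - central_difference(inverse_mills, a))
    for a in (-1.0, 0.3, 2.0)
)
```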
Since $\hat{\pi}_t$, $t = 1, \ldots, T$, are maximum likelihood estimators, $\hat{\pi}$ satisfies

$$ \sqrt{N} (\hat{\pi} - \pi) = N^{-1/2} \sum_{i=1}^{N} d_i(\pi) + o_p(1), \tag{35} $$

where $d_i(\pi) \equiv (d_{i1}(\pi_1)', \ldots, d_{iT}(\pi_T)')'$,

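$$
\begin{aligned}
d_{it}(\pi_t) &= A_t^{-1} \{\Phi(q_{it}\pi_t)[1 - \Phi(q_{it}\pi_t)]\}^{-1} \varphi(q_{it}\pi_t)\, q_{it}' \,[s_{it} - \Phi(q_{it}\pi_t)], \\
A_t &\equiv E[-H_{it}(\pi_t)], \\
H_{it}(\pi_t) &= -\{\Phi(q_{it}\pi_t)[1 - \Phi(q_{it}\pi_t)]\}^{-1} [\varphi(q_{it}\pi_t)]^2 q_{it}' q_{it}.
\end{aligned}
\tag{36}
$$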

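Combining equations (32), (33), and (35), we can write

$$
\sqrt{N} (\hat{\theta} - \theta) = -C^{-1} G' \Omega^{-1} N^{-1/2} \sum_{i=1}^{N} [W_i(\pi)' g_i(\theta, \pi) + F d_i(\pi)] + o_p(1), \tag{37}
$$

and by the central limit theorem,

$$
\sqrt{N} (\hat{\theta} - \theta) \xrightarrow{d} \text{Normal}(0,\, C^{-1} G' \Omega^{-1} P \Omega^{-1} G C^{-1}), \tag{38}
$$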

where

$$
P \equiv E[p_i p_i'], \qquad p_i = W_i(\pi)' g_i(\theta, \pi) + F d_i(\pi). \tag{39}
$$

By choosing $\Omega = P$, we obtain

$$
\text{Avar} \sqrt{N} (\hat{\theta} - \theta) = C^{-1} = (G' \Omega^{-1} G)^{-1}, \tag{40}
$$
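
The step from (38) to (40) can be illustrated numerically for the simplest over-identified case of one parameter and two moment conditions, where G is a 2-vector and C is a scalar. The helper functions and the numbers below are hypothetical; the sketch only shows that the sandwich form collapses to 1/C when Ω = P and is larger for another weighting matrix.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [sum(m[r][k] * v[k] for k in range(2)) for r in range(2)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gmm_avar(G, Omega, P):
    """Scalar-parameter sandwich C^{-1} G' Omega^{-1} P Omega^{-1} G C^{-1},
    with C = G' Omega^{-1} G. Omega is assumed symmetric."""
    w = matvec(inv2(Omega), G)      # Omega^{-1} G
    C = dot(G, w)
    return dot(w, matvec(P, w)) / (C * C)

G = [1.0, 1.0]
P = [[2.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

avar_optimal = gmm_avar(G, P, P)          # equals 1 / (G' P^{-1} G)
avar_identity = gmm_avar(G, identity, P)  # larger, since the weighting is not optimal
```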

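and the asymptotic variance of $\hat{\theta}$ is given in equation (23). The variance can be estimated
using the estimators of $\theta$ and $\pi$ instead of the true parameter values and by replacing
matrices $G$, $A$, $F$, and $\Omega$ with their consistent estimators:

$$
\begin{aligned}
\hat{G} &\equiv N^{-1} \sum_{i=1}^{N} W_i(\hat{\pi})' \nabla_\theta g_i(\hat{\theta}, \hat{\pi}), \\
\hat{A}_t &\equiv N^{-1} \sum_{i=1}^{N} [-H_{it}(\hat{\pi}_t)], \\
\hat{F} &\equiv N^{-1} \sum_{i=1}^{N} W_i(\hat{\pi})' \nabla_\pi g_i(\hat{\theta}, \hat{\pi}), \\
\hat{\Omega} &\equiv N^{-1} \sum_{i=1}^{N} [\hat{p}_i \hat{p}_i'], \\
\hat{p}_i &= W_i(\hat{\pi})' g_i(\hat{\theta}, \hat{\pi}) + \hat{F} d_i(\hat{\pi}).
\end{aligned}
\tag{41}
$$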

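The asymptotic variance of the pooled NLS estimator can be derived using the same
logic. First, following the same steps as the ones used to obtain (32) [see also Wooldridge
2002, Section 12.3], we can write

$$
\begin{aligned}
\sqrt{N} (\hat{\theta}_{NLS} - \theta) &= -D^{-1} N^{-1/2} \sum_{i=1}^{N} \sum_{t=1}^{T} \nabla_\theta g_{it}(\theta, \hat{\pi})' g_{it}(\theta, \hat{\pi}) + o_p(1), \\
D &= E\left[ \sum_{t=1}^{T} \nabla_\theta g_{it}(\theta, \hat{\pi})' \nabla_\theta g_{it}(\theta, \hat{\pi}) \right].
\end{aligned}
\tag{42}
$$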

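The mean-value expansion of $N^{-1/2} \sum_{i=1}^{N} \sum_{t=1}^{T} \nabla_\theta g_{it}(\theta, \hat{\pi})' g_{it}(\theta, \hat{\pi})$ around $\pi$ gives

$$
\begin{aligned}
N^{-1/2} \sum_{i=1}^{N} \sum_{t=1}^{T} \nabla_\theta g_{it}(\theta, \hat{\pi})' g_{it}(\theta, \hat{\pi})
&= N^{-1/2} \sum_{i=1}^{N} \sum_{t=1}^{T} \nabla_\theta g_{it}(\theta, \pi)' g_{it}(\theta, \pi) + Q \sqrt{N} (\hat{\pi} - \pi) + o_p(1), \\
Q &= E\left[ \sum_{t=1}^{T} \nabla_\theta g_{it}(\theta, \pi)' \nabla_\pi g_{it}(\theta, \pi) \right] \\
&= E\left[ \nabla_\theta g_{i1}(\theta, \pi)' \nabla_{\pi_1} g_{i1}(\theta, \pi), \ldots, \nabla_\theta g_{iT}(\theta, \pi)' \nabla_{\pi_T} g_{iT}(\theta, \pi) \right].
\end{aligned}
\tag{43}
$$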

Combining (42), (43), and (35) gives

$$
\sqrt{N} (\hat{\theta}_{NLS} - \theta) = -D^{-1} N^{-1/2} \sum_{i=1}^{N} \left[ \sum_{t=1}^{T} \nabla_\theta g_{it}(\theta, \pi)' g_{it}(\theta, \pi) + Q d_i(\pi) \right] + o_p(1), \tag{44}
$$

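where $d_i$ is as defined in equation (36). And by the central limit theorem,

$$
\begin{aligned}
\sqrt{N} (\hat{\theta}_{NLS} - \theta) &\xrightarrow{d} \text{Normal}(0,\, D^{-1} R D^{-1}), \\
R &\equiv E[r_i r_i'], \\
r_i &= \sum_{t=1}^{T} \nabla_\theta g_{it}(\theta, \pi)' g_{it}(\theta, \pi) + Q d_i(\pi).
\end{aligned}
\tag{45}
$$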

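Then, the asymptotic variance of the pooled NLS estimator can be estimated as

$$
\widehat{\text{Avar}}(\hat{\theta}_{NLS}) = \hat{D}^{-1} \hat{R} \hat{D}^{-1} / N, \tag{46}
$$

where

$$
\begin{aligned}
\hat{D} &\equiv N^{-1} \sum_{i=1}^{N} \left[ \sum_{t=1}^{T} \nabla_\theta g_{it}(\hat{\theta}, \hat{\pi})' \nabla_\theta g_{it}(\hat{\theta}, \hat{\pi}) \right], \\
\hat{R} &\equiv N^{-1} \sum_{i=1}^{N} \hat{r}_i \hat{r}_i', \\
\hat{r}_i &\equiv \sum_{t=1}^{T} \nabla_\theta g_{it}(\hat{\theta}, \hat{\pi})' g_{it}(\hat{\theta}, \hat{\pi}) + \hat{Q} d_i(\hat{\pi}), \\
\hat{Q} &\equiv N^{-1} \sum_{i=1}^{N} \left[ \sum_{t=1}^{T} \nabla_\theta g_{it}(\hat{\theta}, \hat{\pi})' \nabla_\pi g_{it}(\hat{\theta}, \hat{\pi}) \right].
\end{aligned}
\tag{47}
$$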

Appendix B

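Expressions for the derivatives in matrix $\nabla_\theta g_i(\theta, \pi) \equiv (\nabla_\theta g_{i1}(\theta, \pi)', \ldots, \nabla_\theta g_{iT}(\theta, \pi)')'$ are
summarized below:

$$
\nabla_\theta g_{it}(\theta, \pi) \equiv \left( \frac{\partial g_{it}}{\partial \rho}, \frac{\partial g_{it}}{\partial \beta}, \frac{\partial g_{it}}{\partial \eta_1}, \frac{\partial g_{it}}{\partial \xi_1}, \ldots, \frac{\partial g_{it}}{\partial \xi_T}, \frac{\partial g_{it}}{\partial \gamma_1}, \frac{\partial g_{it}}{\partial \phi_{21}}, \ldots, \frac{\partial g_{it}}{\partial \phi_{2T}} \right), \qquad t = 1, \ldots, T,
$$

$$
\begin{aligned}
\frac{\partial g_{it}}{\partial \rho} &= -s_{it} \left[ t \rho^{t-1} y_{i0} + \sum_{j=1}^{t-1} j \rho^{j-1} x_{i,t-j} \beta
+ \left( \frac{1 - \rho^t}{(1 - \rho)^2} - \frac{t \rho^{t-1}}{1 - \rho} \right) \left( \eta_1 + \sum_{r=1}^{T} \xi_r z_{ir} + \gamma_1 y_{i0} \right) \right], \\
\frac{\partial g_{it}}{\partial \beta} &= -s_{it} \sum_{j=0}^{t-1} \rho^j x_{i,t-j}, \\
\frac{\partial g_{it}}{\partial \eta_1} &= -s_{it} \frac{1 - \rho^t}{1 - \rho}, \\
\frac{\partial g_{it}}{\partial \xi_r} &= -s_{it} \frac{1 - \rho^t}{1 - \rho} z_{ir}, \qquad r = 1, \ldots, T, \\
\frac{\partial g_{it}}{\partial \gamma_1} &= -s_{it} \frac{1 - \rho^t}{1 - \rho} y_{i0}, \\
\frac{\partial g_{it}}{\partial \phi_{2t}} &= -s_{it} \lambda_{it2}, \\
\frac{\partial g_{it}}{\partial \phi_{2r}} &= 0, \qquad r \neq t.
\end{aligned}
\tag{48}
$$

Also, one can easily obtain $[s_{it} \nabla_\theta m_{it}(\theta, \pi)]$ as $-\nabla_\theta g_{it}(\theta, \pi)$.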
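
The ρ-derivative above relies on differentiating the geometric weight (1 − ρ^t)/(1 − ρ) that multiplies the time-constant terms. The sketch below, with hypothetical function names, evaluates the weight and its derivative and checks them against the expanded sum 1 + ρ + ... + ρ^(t−1).

```python
def geometric_weight(rho, t):
    """(1 - rho^t) / (1 - rho), i.e. 1 + rho + ... + rho^(t-1)."""
    return (1.0 - rho ** t) / (1.0 - rho)

def geometric_weight_drho(rho, t):
    """Derivative with respect to rho, as it appears inside dg_it/drho."""
    return (1.0 - rho ** t) / (1.0 - rho) ** 2 - t * rho ** (t - 1) / (1.0 - rho)

# For t = 3 the weight is 1 + rho + rho^2 and its derivative is 1 + 2 * rho.
weight = geometric_weight(0.6, 3)        # 1 + 0.6 + 0.36 = 1.96
slope = geometric_weight_drho(0.6, 3)    # 1 + 2 * 0.6 = 2.2
```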

References

Ahn SC, Schmidt P. 1995. Efficient estimation of models for dynamic panel data. Journal
of Econometrics 68: 5-27.

Anderson TW, Hsiao C. 1981. Estimation of dynamic models with error components.
Journal of the American Statistical Association 76: 598-606.

Arellano M, Bond SR. 1991. Some tests of specification for panel data: Monte Carlo
evidence and an application to employment equations. Review of Economic Studies
58: 277-297.

Arellano M, Bover O, Labeaga JM. 1999. Autoregressive models with sample selectivity
for panel data. In Analysis of Panels and Limited Dependent Variable Models:
In honour of G. S. Maddala, Hsiao C, Lahiri K, Lee L, and Pesaran MH (eds).
Cambridge University Press.

Binder M, Hsiao C, Pesaran MH. 2005. Estimation and inference in short panel vector
autoregressions with unit roots and cointegration. Econometric Theory 21: 795-837.

Blundell RW, Bond SR. 1998. Initial conditions and moment restrictions in dynamic
panel data models. Journal of Econometrics 87: 115-143.

Bover O, Arellano M. 1997. Estimating dynamic limited dependent variable models from
panel data. Investigaciones Economicas 21: 141-165.

Chamberlain G. 1980. Analysis with qualitative data. Review of Economic Studies 47:
225-238.

Chamberlain G. 1982. Multivariate regression models for panel data. Journal of Econo-
metrics 18: 5-46.

Chamberlain G. 1984. Panel Data. In Handbook of Econometrics, Volume 2, Griliches
Z, Intriligator MD (eds). Amsterdam: North Holland, 1248-1318.

Gayle GL, Viauroux C. 2007. Root-N consistent semiparametric estimators of a dynamic
panel-sample-selection model. Journal of Econometrics 141: 179-212.

Holtz-Eakin D, Newey WK, Rosen HS. 1988. Estimating vector autoregressions with
panel data. Econometrica 56: 1371-1395.

Honore BE, Hu L. 2004. Estimation of cross sectional and panel data censored regression
models with endogeneity. Journal of Econometrics 122: 293-316.

Hsiao C, Pesaran MH, Tahmiscioglu AK. 2002. Maximum likelihood estimation of fixed
effects dynamic panel data models covering short time periods. Journal of Econo-
metrics 109: 107-150.

Hu L. 2002. Estimation of a censored dynamic panel data model. Econometrica 70:
2499-2517.

Ichimura H. 1993. Semiparametric least squares (SLS) and weighted SLS estimation of
single-index models. Journal of Econometrics 58: 71-120.

Klein RL, Spady RH. 1993. An efficient semiparametric estimator for binary response
models. Econometrica 61: 387-421.

Kyriazidou E. 2001. Estimation of dynamic panel data sample selection models. Review
of Economic Studies 68: 543-572.

Labeaga JM. 1999. A Double-hurdle rational addiction model with heterogeneity: Esti-
mating the demand for tobacco. Journal of Econometrics 93: 49-72.

Newey WK, McFadden D. 1994. Large sample estimation and hypothesis testing. In
Handbook of Econometrics, Volume 4, Engle RF, McFadden D (eds). Amsterdam:
North Holland, 2111-2245.

Semykina A, Wooldridge JM. 2010. Estimating panel data models in the presence of
endogeneity and selection. Journal of Econometrics 157: 375-380.

Wooldridge JM. 2002. Econometric Analysis of Cross Section and Panel Data. MIT Press:
Cambridge, MA.

Wooldridge JM. 2005. Simple solutions to the initial conditions problem in dynamic,
nonlinear panel data models with unobserved heterogeneity. Journal of Applied
Econometrics 20: 39-54.

Ziliak JP, Kniesner TJ. 1998. The importance of sample attrition in life cycle labor
supply estimation. Journal of Human Resources 33: 507-530.

Table 1: Estimates for the Dynamic Log(Hourly Earnings) Equation

                        FD-IV,            FD-IV             NLS,              NLS               GMM
                        no λ̂it3          with λ̂it3        no λ̂it2          with λ̂it2        with λ̂it2
                        (1)               (2)               (3)               (4)               (5)

Lagged Log of           0.153***          0.162***          0.576***          0.586***          0.574***
Hourly Earnings         (0.049)           (0.046)           (0.056)           (0.056)           (0.040)

Education                                                   0.033***          0.032***          0.029***
                                                            (0.005)           (0.005)           (0.004)

Age                                                         0.013**           0.012**           0.009**
                                                            (0.006)           (0.007)           (0.004)

Age squared             −0.0002           −0.0001           −0.00018**        −0.00016*         −0.00013***
                        (0.0002)          (0.0002)          (0.00007)         (0.00009)         (0.000046)

Wald Test of Joint                        χ²(11) = 18.35                      χ²(12) = 22.13    χ²(12) = 41.28
Significance of the                       (0.074)                             (0.036)           (0.000)
Inverse Mills Ratios

Time-specific intercept is used in all regressions.

FD-IV means the first difference instrumental variables estimation, where the log of hourly
earnings in t − 2 is used as an instrument for the differenced log of hourly earnings in t − 1.
Standard errors robust to serial correlation and heteroskedasticity are in parentheses under
the coefficient estimates; robust p-values are under the test statistics.
Standard errors in the NLS regression with λ̂it2 are corrected for the first-step estimation.
* = significant at the 10% level
** = significant at the 5% level
*** = significant at the 1% level

No ratings yet
The Analysis of Value Added Tax (Vat) To Increasing Government Revenue in Ethiopia
13 pages
Seada Kedir
No ratings yet
Seada Kedir
101 pages
35
No ratings yet
35
24 pages
32
No ratings yet
32
5 pages
10
No ratings yet
10
3 pages
DDD Analysis
No ratings yet
DDD Analysis
21 pages
2021 RIBF Santoso - The Bright Side of Market Power in Asian Banking - Implications of Bank Capitalization and Financial Freedom
No ratings yet
2021 RIBF Santoso - The Bright Side of Market Power in Asian Banking - Implications of Bank Capitalization and Financial Freedom
11 pages
pdynmc-pres-in-a-nutshell
No ratings yet
pdynmc-pres-in-a-nutshell
22 pages
Adeola Evans 2019 Ict Infrastructure and Tourism Development in Africa
No ratings yet
Adeola Evans 2019 Ict Infrastructure and Tourism Development in Africa
18 pages
Cover Jurnal Pkbrs Nanin
No ratings yet
Cover Jurnal Pkbrs Nanin
9 pages
Does Earning Per Share (Eps) Affected by Debt To Asset Ratio (Dar) and Debt To Equity Ratio (Der) ?
No ratings yet
Does Earning Per Share (Eps) Affected by Debt To Asset Ratio (Dar) and Debt To Equity Ratio (Der) ?
11 pages
Panel Data
100% (2)
Panel Data
5 pages
Chapter 2. Dynamic Panel Data Models
No ratings yet
Chapter 2. Dynamic Panel Data Models
209 pages
Dynamic Panel Data Analysis: Workshop
No ratings yet
Dynamic Panel Data Analysis: Workshop
3 pages
Block 3
No ratings yet
Block 3
36 pages
Lab Introduction To STATA
No ratings yet
Lab Introduction To STATA
27 pages
Wooldridge Session 4
No ratings yet
Wooldridge Session 4
64 pages
econometrics II CH-4 PPT (3)
No ratings yet
econometrics II CH-4 PPT (3)
25 pages
Cuesta 2000
No ratings yet
Cuesta 2000
20 pages
(Ebook) Panel Data Econometrics: Theoretical Contributions and Empirical Applications by Badi H. Baltagi (Eds.) ISBN 9780444521729, 0444521720 pdf download
100% (1)
(Ebook) Panel Data Econometrics: Theoretical Contributions and Empirical Applications by Badi H. Baltagi (Eds.) ISBN 9780444521729, 0444521720 pdf download
54 pages
Regression With Panel Data
No ratings yet
Regression With Panel Data
16 pages
02 Production
No ratings yet
02 Production
72 pages
Chapter-11Panel Data
No ratings yet
Chapter-11Panel Data
13 pages
Panel Data MOdel-5 PDF
No ratings yet
Panel Data MOdel-5 PDF
44 pages
ES1004ebe Lecture12-1
No ratings yet
ES1004ebe Lecture12-1
48 pages
ESG and Corporate Financial Pe
No ratings yet
ESG and Corporate Financial Pe
17 pages
ECONOMETRIC MODELS WITH PANEL DATA APPLICATIONS WITH STATA 1st Edition César Pérez López all chapter instant download
No ratings yet
ECONOMETRIC MODELS WITH PANEL DATA APPLICATIONS WITH STATA 1st Edition César Pérez López all chapter instant download
40 pages
2019 3061 Ajt PDF
No ratings yet
2019 3061 Ajt PDF
34 pages
ECN3322 - Panel Data-1
No ratings yet
ECN3322 - Panel Data-1
56 pages
Materi RCT EBN
No ratings yet
Materi RCT EBN
6 pages
Unit V Full
No ratings yet
Unit V Full
23 pages
Econ140 Spring2016 Section09 Handout Solutions
No ratings yet
Econ140 Spring2016 Section09 Handout Solutions
12 pages
Assessing Determinants of Dividend Policy of The Government-Owned Companies in Indonesia
No ratings yet
Assessing Determinants of Dividend Policy of The Government-Owned Companies in Indonesia
12 pages
Regresi Data Panel
No ratings yet
Regresi Data Panel
10 pages

Estimation_of_dynamic_panel_data_models_2

Uploaded by

Estimation_of_dynamic_panel_data_models_2

Uploaded by

===

Estimation of Dynamic Panel Data Models

* Correspondence to: Anastasia Semykina, Department of Economics, Florida State

Consider a dynamic panel data model with unobserved heterogeneity:

$y_{it} = \rho y_{i,t-1} + x_{it}\beta + c_{i1} + u_{it1}, \quad t = 1, \ldots, T, \qquad (1)$

$s_{it} = 1[z_{it}\delta_{2t} + c_{i2} + u_{it2} > 0], \quad t = 1, \ldots, T, \qquad (2)$
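To make the structure of equations (1) and (2) concrete, the following is a minimal simulation sketch, not the authors' design. It assumes a single regressor that also serves as the only selection covariate (so z_it = x_it), a time-constant selection coefficient, and normal errors; the function name `simulate_panel` and all parameter values are hypothetical.

```python
import random


def simulate_panel(n_units, n_periods, rho=0.5, beta=1.0, delta=1.0, seed=0):
    """Simulate y_it and s_it from a toy version of equations (1)-(2).

    Assumptions (illustrative only): scalar x_it, z_it = x_it, a constant
    selection coefficient, and normal errors with u_it1 correlated with u_it2.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        c1 = rng.gauss(0.0, 1.0)                # unobserved effect in (1)
        c2 = 0.5 * c1 + rng.gauss(0.0, 1.0)     # unobserved effect in (2)
        y_prev = c1 + rng.gauss(0.0, 1.0)       # initial condition y_i0
        rows = []
        for t in range(1, n_periods + 1):
            x = rng.gauss(0.0, 1.0)
            u2 = rng.gauss(0.0, 1.0)
            u1 = 0.5 * u2 + rng.gauss(0.0, 1.0)  # source of selection bias
            y_latent = rho * y_prev + beta * x + c1 + u1
            s = 1 if delta * x + c2 + u2 > 0 else 0
            rows.append({
                "t": t,
                "x": x,
                "s": s,
                "y": y_latent if s == 1 else None,  # y observed only if s = 1
                "y_latent": y_latent,
            })
            y_prev = y_latent
        panel.append(rows)
    return panel


panel = simulate_panel(n_units=5, n_periods=4)
```

Because the outcome is recorded only when s_it = 1, the resulting panel is unbalanced, and the lagged dependent variable is itself missing whenever the previous period was not selected.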

(iii) $E(u_{it1} \mid z_i, y_{i0}, c_{i1}) = 0$, $t = 1, \ldots, T$.

$= h_t(z_i, y_{i0}) \equiv h_{it}, \quad t = 1, \ldots, T, \qquad (7)$

It is possible to estimate equation (8) semiparametrically. A semiparametric estimator

(ii) $E(u_{it1} \mid z_i) = 0$, $t = 1, \ldots, T$.

$\tilde{\psi}_s z_{is}$), $\tilde{\psi}_s = \psi_s + \gamma_2 \kappa_s$ and $\tilde{\xi}_s = \xi_s + \gamma_1 \kappa_s$. Similarly

to (10), parameters in equation (11) can be consistently estimated by NLS or GMM, as

A simple way to obtain a consistent estimator of parameters in equation (10) is to re-

1. For each $t = 1, \ldots, T$, estimate separate probit models,

$s_{it}$ on $1, z_{i1}, \ldots, z_{iT}, y_{i0}$, $i = 1, \ldots, N$,

and compute the inverse Mills ratios, $\hat{\lambda}_{it2}$.
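The inverse Mills ratio in this step is the standard normal density divided by the standard normal distribution function, evaluated at the fitted probit index. A minimal sketch follows, assuming the fitted index for a selected observation has already been computed; the function names are hypothetical.

```python
import math


def normal_pdf(a):
    """Standard normal density."""
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)


def normal_cdf(a):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-a / math.sqrt(2.0))


def inverse_mills_ratio(index):
    """lambda(a) = phi(a) / Phi(a), for an observation with s_it = 1."""
    return normal_pdf(index) / normal_cdf(index)


# The ratio is larger when selection is less likely (smaller index).
ratios = [inverse_mills_ratio(a) for a in (-2.0, 0.0, 2.0)]
```

The ratio is strictly decreasing in the index, so observations that were unlikely to be selected receive the largest correction term.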

Here $0$ denotes a $1 \times (LT + 3)$ vector of zeros.

$E[W_i(\pi)' g_i(\theta, \pi)] = 0. \qquad (18)$

where $\hat{\Omega}^{-1}$ is a consistent estimator of a $T(LT + 3) \times T(LT + 3)$ positive semidefinite

and uses the weighting matrix

(i) $\Lambda$ is the asymptotic variance of $W_i(\hat{\pi})' g_i(\theta, \hat{\pi})$.

Appendix A provides a formula for $\hat{\Omega}$ that satisfies Assumption 4.1. Following a

$\mathrm{Avar}(\hat{\theta}) = (G' \Omega^{-1} G)^{-1} / N, \qquad (23)$
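As a numerical illustration of equation (23), the sketch below evaluates the formula for small matrices using only the standard library. It reads the expression as (G′Ω⁻¹G)⁻¹/N, with G of dimension (number of moments) × (number of parameters), which is what conformability requires; the helper names and the example matrices are hypothetical and are not estimates from the paper.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def gmm_avar(G, Omega, N):
    """Return (G' Omega^{-1} G)^{-1} / N."""
    middle = matmul(matmul(transpose(G), inverse(Omega)), G)
    return [[v / N for v in row] for row in inverse(middle)]


# Toy inputs: two moments, two parameters, N = 100 cross-sectional units.
avar = gmm_avar(G=[[1.0, 0.0], [0.0, 1.0]], Omega=[[2.0, 0.0], [0.0, 4.0]], N=100)
```

Standard errors would then be the square roots of the diagonal elements of the returned matrix.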

1. For each $t = 1, \ldots, T$, estimate separate probit models,

$s_{it}$ on $1, z_{i1}, \ldots, z_{iT}, y_{i0}$, $i = 1, \ldots, N$,

and compute the inverse Mills ratios, $\hat{\lambda}_{it2}$.

5 Testing for Selection Bias

$d_{it} = 1[s_{it} \cdot s_{i,t-1} \cdot s_{i,t-2} = 1]$
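The indicator d_it equals one only when the unit is selected in the current period and in both preceding periods, so it is defined from t = 3 onward. A minimal sketch, assuming the selection indicators for one unit are stored as a list ordered by period; the function name is hypothetical.

```python
def consecutive_selection(s):
    """Return {t: d_it} for t = 3, ..., T, where s[k] holds s_{i,k+1}.

    d_it = 1 only if the unit is observed in periods t, t-1 and t-2.
    """
    return {
        t: int(s[t - 1] == 1 and s[t - 2] == 1 and s[t - 3] == 1)
        for t in range(3, len(s) + 1)
    }


# One unit observed in periods 1-3 and 5-7, but missing in period 4.
d = consecutive_selection([1, 1, 1, 0, 1, 1, 1])
```

In this example a single missing period removes three consecutive values of d_it, which shows how quickly the usable sample shrinks when the equation is differenced.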

Combining (25), (26), and (27) together gives

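$E(\Delta u_{it1} \mid z_i, v_{it3}) = E(\Delta u_{it1} \mid v_{it3}) = \varphi_{3t} v_{it3}, \quad t = 3, \ldots, T, \qquad (29)$

$E(\Delta u_{it1} \mid z_i, d_{it} = 1) = \varphi_{3t} E(v_{it3} \mid z_i, d_{it} = 1)$

$\Delta y_{it} = \rho \Delta y_{i,t-1} + \Delta x_{it}\beta + \varphi_{3t}\lambda_{it3} + \epsilon_{it1},$

$E(\epsilon_{it1} \mid z_i, d_{it} = 1) = 0, \quad t = 1, \ldots, T. \qquad (31)$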

1. For each of $t = 3, \ldots, T$, run a probit regression

$d_{it}$ on $1, z_{i1}, \ldots, z_{iT}$, $i = 1, \ldots, N$,

and compute the inverse Mills ratios, $\hat{\lambda}_{it3}$.
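The first-step probit for a single period can be sketched with Fisher scoring in plain Python. This is an illustrative implementation under simplifying assumptions, not the authors' code: the regressor rows are supplied by the caller with the constant already included, the toy data use one binary regressor, and the names `fit_probit` and `solve` are hypothetical.

```python
import math


def normal_pdf(a):
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)


def normal_cdf(a):
    return 0.5 * math.erfc(-a / math.sqrt(2.0))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_probit(rows, d, iterations=50, tol=1e-10):
    """Probit maximum likelihood by Fisher scoring; rows include the constant."""
    k = len(rows[0])
    pi = [0.0] * k
    for _ in range(iterations):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for w, y in zip(rows, d):
            a = sum(wj * pj for wj, pj in zip(w, pi))
            p = min(max(normal_cdf(a), 1e-12), 1.0 - 1e-12)
            f = normal_pdf(a)
            g = f * (y - p) / (p * (1.0 - p))   # score weight
            h = f * f / (p * (1.0 - p))         # expected information weight
            for j in range(k):
                score[j] += g * w[j]
                for m in range(k):
                    info[j][m] += h * w[j] * w[m]
        step = solve(info, score)
        pi = [pj + sj for pj, sj in zip(pi, step)]
        if max(abs(sj) for sj in step) < tol:
            break
    return pi


# Toy data: constant plus one binary regressor, ten observations per group.
rows = [[1.0, 0.0]] * 10 + [[1.0, 1.0]] * 10
d = [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
pi_hat = fit_probit(rows, d)
# Inverse Mills ratio at the fitted index for an observation with regressor = 1.
index = pi_hat[0] + pi_hat[1]
mills = normal_pdf(index) / normal_cdf(index)
```

In practice one would run this separately for each period and keep the fitted ratios only for observations with d_it = 1.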

where $G \equiv E[W_i(\pi)' \nabla_\theta g_i(\theta, \pi)]$ was also defined earlier.

where $F \equiv E[W_i(\pi)' \nabla_\pi g_i(\theta, \pi)]$, and $\nabla_\pi g_i(\theta, \pi)$ is a block-diagonal matrix,

where $d_i(\pi) \equiv (d_{i1}(\pi_1)', \ldots, d_{iT}(\pi_T)')'$,

dit (πt ) = A−1 −1 ′

$A_t \equiv E[-H_{it}(\pi_t)],$

Combining equations (32), (33), and (35), we can write

$p_i = W_i(\pi)' g_i(\theta, \pi) + F d_i(\pi). \qquad (39)$

$\hat{p}_i = W_i(\hat{\pi})' g_i(\hat{\theta}, \hat{\pi}) + \hat{F} d_i(\hat{\pi}). \qquad (41)$

Combining (42), (43), and (35) gives

where $d_i$ is as defined in equation (36). And by the central limit theorem,

$\widehat{\mathrm{Avar}}(\hat{\theta}_{NLS}) = \hat{D}^{-1} \hat{R} \hat{D}^{-1} / N,$


FD-IV, FD-IV NLS, NLS GMM

Lagged Log of 0.153∗∗∗ 0.162∗∗∗ 0.576∗∗∗ 0.586∗∗∗ 0.574∗∗∗

Education 0.033∗∗∗ 0.032∗∗∗ 0.029∗∗∗

Age 0.013∗∗ 0.012∗∗ 0.009∗∗

Age squared −0.0002 −0.0001 −0.00018∗∗ −0.00016∗ −0.00013∗∗∗

Time-specific intercept is used in all regressions.
