Metodološki zvezki, Vol. 1, No. 1, 2004, 185-204

Estimation of Dynamic Structural Equation Models with Latent Variables
Dario Cziráky1

Abstract
The paper proposes a time series generalisation of the structural equa-
tion model with latent variables (SEM). An instrumental variable estimator
is considered and its asymptotic properties are analysed. Special emphases
are placed on the potential use of the lagged observed variables as instruments
and consistency of such estimation is established under some general assump-
tions about the stochastic properties of the modelled variables. In addition,
an identification procedure suitable both for static and dynamic structural
equation models is described. The methods are illustrated in an empirical
application to dynamic panel estimation of a consumption function using UK
household data.

1 Introduction
Latent variable methods for time series data are notably underdeveloped in comparison with cross-sectional methods. So far the main developments in the literature have focused on simple factor analysis models without causal or structural relationships between latent variables.
Stock and Watson (1989) considered a time series single-factor model of asset
return and Stock and Watson (1999) analysed factor analytic models for forecast-
ing purposes. Lewbel (1991) and Donald (1997) considered factor analytic models
for time series data and proposed a procedure for determining the number of fac-
tors. Similarly Cragg and Donald (1997), Connor and Korajczyk (1993), Stock and
Watson (1998), and Bai and Ng (2002) developed procedures for determining the
number of factors in time series and panel models. An early selection procedure for
pure time series factor models was proposed by Mallows (1973).
Sargent and Sims (1977), Geweke (1977), and Forni et al. (2000) considered estimation of dynamic factor models.2 Chamberlain and Rothschild (1983) analysed approximate factor models allowing for correlation in the idiosyncratic components of the latent errors. Recently, Bai (2003) developed asymptotic inferential theory for a principal components estimator of factor models suitable for large panels. However, time series generalisations of latent variable models that include structural (causal) relationships among latent variables, such as the general structural equation model with latent variables (SEM or LISREL) developed by Jöreskog (1973) and Jöreskog et al. (2000), have not been developed.

1 Department of Statistics, London School of Economics; [email protected]

2 The dynamic factor model is specified as $x_t = \sum_{i=1}^{p} \Lambda_i \xi_{t-i} + e_t$, i.e. the contemporaneous observable indicators are assumed to be caused by both contemporaneous and lagged latent factors.
In this paper we propose a time series generalisation of the structural equation
model with latent variables in the form of a structural autoregressive distributed
lag model with latent variables and propose a general estimation procedure. We
show how instrumental variables methods can be used to estimate dynamic latent
variable models and we analyse the asymptotic properties of these estimators. In
particular, we consider instruments in the form of the lagged observable indicators
and show that these can be used for consistent estimation.
The paper is organised as follows. The second section describes the static structural equation model with latent variables and the third section generalises this model to a dynamic structural equation model. The fourth section describes IV estimation procedures while the fifth section deals with the identification of the model.

2 Static structural equation model


The static structural equation model with latent variables (Jöreskog and Sörbom,
1996) is specified with three matrix equations–the structural equation, the measure-
ment equation for latent exogenous variables, and the measurement equation for
latent endogenous variables

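$$\eta = \alpha_\eta + B\eta + \Gamma\xi + \zeta, \qquad x = \alpha_x + \Lambda_x \xi + \delta, \qquad y = \alpha_y + \Lambda_y \eta + \varepsilon, \tag{2.1}$$

where $\eta$ is an $(m \times 1)$ vector of endogenous latent variables; $\xi$ is a $(g \times 1)$ vector of exogenous latent variables; $B$ and $\Gamma$ are $(m \times m)$ and $(m \times g)$ matrices of structural coefficients, respectively; $\Lambda_x$ and $\Lambda_y$ are $(k \times g)$ and $(l \times m)$ matrices of factor loadings, respectively; and $\alpha_\eta$, $\alpha_x$, and $\alpha_y$ are $(m \times 1)$, $(k \times 1)$, and $(l \times 1)$ vectors of intercepts, respectively.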

3 Dynamic structural equation model (DSEM)


We formulate a dynamic structural equation model with latent variables (DSEM) as a time series generalisation of the static structural equation model with latent variables.3 Specifically, we define a structural autoregressive distributed lag model of the form

$$\eta_t = \alpha_\eta + \sum_{j=0}^{p} B_j \eta_{t-j} + \sum_{j=0}^{q} \Gamma_j \xi_{t-j} + \zeta_t, \tag{3.1}$$

where $\alpha_\eta$, $B_0$, and $\Gamma_0$ are coefficient matrices from the static model (2.1), and $B_1, B_2, \ldots, B_p$, $\Gamma_1, \Gamma_2, \ldots, \Gamma_q$ are the additional $p + q$ matrices that contain coefficients of the lagged endogenous and exogenous latent variables.4 Note that the specification (3.1) is “structural” because contemporaneous endogenous latent variables might be included as regressors (i.e. $B_0 \neq 0$). If we assume time-invariance of the measurement model, the usual specification of the measurement models for $x_t$ and $y_t$ applies, thus the structural part of the model (3.1) can be augmented with the measurement equation for the latent exogenous variables

$$x_t = \alpha_x + \Lambda_x \xi_t + \delta_t \tag{3.2}$$

and for the latent endogenous variables

$$y_t = \alpha_y + \Lambda_y \eta_t + \varepsilon_t \tag{3.3}$$

The matrix equations (3.1)–(3.3) provide a full specification of a general DSEM model directly extending the static structural equation model with latent variables (SEM) to time series. It follows that the static SEM is a special case of the DSEM model.

3 A static version of this model can be easily estimated by software packages such as LISREL 8.54 (see e.g. Cziráky, 2004).
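To make the specification (3.1)–(3.3) concrete, the following Python sketch simulates a small DSEM with $p = q = 1$. It is an illustration under stated assumptions only: the function name `simulate_dsem`, all coefficient values, the AR(1) law of motion for $\xi_t$, the Gaussian errors and the zero measurement intercepts are hypothetical choices that do not come from the paper.

```python
import numpy as np

def simulate_dsem(T, alpha, B0, B1, G0, G1, Lx, Ly, rng, burn_in=200):
    """Simulate (3.1)-(3.3) with p = q = 1 and zero measurement intercepts."""
    m, g = G0.shape
    # (3.1) contains eta_t on both sides; solve for it with (I - B0)^{-1}.
    A = np.linalg.inv(np.eye(m) - B0)
    n = T + burn_in
    xi = np.zeros((n, g))
    eta = np.zeros((n, m))
    for t in range(1, n):
        # Assumed (hypothetical) AR(1) process for the exogenous latent variables.
        xi[t] = 0.5 * xi[t - 1] + rng.normal(size=g)
        zeta = rng.normal(scale=0.5, size=m)
        eta[t] = A @ (alpha + B1 @ eta[t - 1] + G0 @ xi[t] + G1 @ xi[t - 1] + zeta)
    # Measurement equations (3.2) and (3.3) with white-noise errors.
    x = xi @ Lx.T + rng.normal(scale=0.3, size=(n, Lx.shape[0]))
    y = eta @ Ly.T + rng.normal(scale=0.3, size=(n, Ly.shape[0]))
    return x[burn_in:], y[burn_in:], xi[burn_in:], eta[burn_in:]

rng = np.random.default_rng(0)
alpha = np.array([0.1, 0.2])
B0 = np.array([[0.0, 0.3], [0.0, 0.0]])   # contemporaneous effect of eta_2 on eta_1
B1 = np.array([[0.4, 0.0], [0.0, 0.5]])   # first-order lag coefficients
G0 = np.array([[0.6], [0.8]])
G1 = np.array([[0.2], [0.0]])
Lx = np.array([[1.0], [0.7]])             # first loading fixed to one
Ly = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.0], [0.0, 0.8]])
x, y, xi, eta = simulate_dsem(500, alpha, B0, B1, G0, G1, Lx, Ly, rng)
```

Setting `B1` and `G1` to zero matrices reduces the simulated structural equation to the static form in (2.1), which mirrors the remark that the static SEM is a special case.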
However, the DSEM model from (3.1)–(3.3) cannot be directly estimated due to the presence of unobserved latent components. To solve this problem and enable estimation of the model parameters, we rewrite the latent variable specification in terms of the observed variables and latent errors only, following an approach similar to that of Bollen (1996; 2001; 2002). Bollen used such a specification to enable non-parametric estimation of standard (cross-sectional) structural equation models with the aim of achieving greater robustness to misspecification and non-normality.
In this paper we show that a similar approach can be used to re-write the DSEM model in the observed form specification (OFS) and to subsequently estimate all model parameters (except latent error terms) by generalised instrumental variables methods.
The OFS uses the fact that in the measurement model for each latent variable one loading can be fixed to one without loss of generality. Thus, we can re-write the measurement models for $x_t$ and $y_t$ as

$$x_t = \begin{pmatrix} x_{1t} \\ x_{2t} \end{pmatrix} = \begin{pmatrix} 0 \\ \alpha_2^{(x)} \end{pmatrix} + \begin{pmatrix} I \\ \Lambda_2^{(x)} \end{pmatrix} \xi_t + \begin{pmatrix} \delta_{1t} \\ \delta_{2t} \end{pmatrix} \tag{3.4}$$

and

$$y_t = \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} 0 \\ \alpha_2^{(y)} \end{pmatrix} + \begin{pmatrix} I \\ \Lambda_2^{(y)} \end{pmatrix} \eta_t + \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix} \tag{3.5}$$

Note that the observed indicators with unit loadings were placed in the top part of the vectors for $x_t$ and $y_t$ and thus the upper part of the lambda matrix is an identity matrix. Having divided $x_t$ into $x_{1t}$ and $x_{2t}$, note that for $x_{1t}$ it holds that

$$x_{1t} = \xi_t + \delta_{1t} \;\Rightarrow\; \xi_t = x_{1t} - \delta_{1t} \tag{3.6}$$

and, similarly, for $y_{1t}$ we can replace the latent variable with its unit-loading indicators

$$y_{1t} = \eta_t + \varepsilon_{1t} \;\Rightarrow\; \eta_t = y_{1t} - \varepsilon_{1t} \tag{3.7}$$

4 Note that (3.1) does not require specification of lagged latent variables as separate variables; rather each vector containing all modelled and exogenous latent variables is written for each included lag separately, with a separate coefficient matrix. Also note that (3.1) allows different lag lengths for different latent variables (i.e., elements of the $\eta$ and $\xi$ vectors) by appropriate specification of the $B_j$ and $\Gamma_j$ matrices (e.g., zero elements).


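It is now possible to use the relations in (3.6) and (3.7) to re-write the measurement model for $x_t$ as

$$\begin{aligned} x_{2t} &= \alpha_2^{(x)} + \Lambda_2^{(x)} (x_{1t} - \delta_{1t}) + \delta_{2t} \\ &= \alpha_2^{(x)} + \Lambda_2^{(x)} x_{1t} + \left( \delta_{2t} - \Lambda_2^{(x)} \delta_{1t} \right) \end{aligned} \tag{3.8}$$

and for $y_t$ as

$$\begin{aligned} y_{2t} &= \alpha_2^{(y)} + \Lambda_2^{(y)} (y_{1t} - \varepsilon_{1t}) + \varepsilon_{2t} \\ &= \alpha_2^{(y)} + \Lambda_2^{(y)} y_{1t} + \left( \varepsilon_{2t} - \Lambda_2^{(y)} \varepsilon_{1t} \right) \end{aligned} \tag{3.9}$$

Following the same principle it is possible to re-write the structural part of the model using definitions (3.6) and (3.7) as follows

$$y_{1t} - \varepsilon_{1t} = \alpha_\eta + \sum_{j=0}^{p} B_j (y_{1t-j} - \varepsilon_{1t-j}) + \sum_{j=0}^{q} \Gamma_j (x_{1t-j} - \delta_{1t-j}) + \zeta_t. \tag{3.10}$$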

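Separating the observed part of the model from the latent errors we obtain

$$y_{1t} = \alpha_\eta + \sum_{j=0}^{p} B_j y_{1t-j} + \sum_{j=0}^{q} \Gamma_j x_{1t-j} + \left( \zeta_t + \varepsilon_{1t} - \sum_{j=0}^{p} B_j \varepsilon_{1t-j} - \sum_{j=0}^{q} \Gamma_j \delta_{1t-j} \right), \tag{3.11}$$

with the measurement model for the latent endogenous variables

$$y_{2t} = \alpha_2^{(y)} + \Lambda_2^{(y)} y_{1t} + \left( \varepsilon_{2t} - \Lambda_2^{(y)} \varepsilon_{1t} \right), \tag{3.12}$$

and for the latent exogenous variables

$$x_{2t} = \alpha_2^{(x)} + \Lambda_2^{(x)} x_{1t} + \left( \delta_{2t} - \Lambda_2^{(x)} \delta_{1t} \right). \tag{3.13}$$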

Aside from the specific structure of the latent error terms, (3.11)–(3.13) present a classical structural equation system with observed variables. However, the OFS form of the DSEM model differs from the standard econometric simultaneous equation system with respect to the exogeneity status of the OFS variables, which are generally observable indicators of the latent variables.
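The exogeneity issue can be seen numerically in a scalar version of (3.13): the composite error $\delta_{2t} - \Lambda_2^{(x)} \delta_{1t}$ contains $\delta_{1t}$, which is also part of the regressor $x_{1t}$, so least squares is attenuated. The sketch below is a toy illustration under assumed values (an i.i.d. latent variable with unit variance, unit-variance $\delta_{1t}$ and a loading of 0.8); none of these numbers come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200_000
lam = 0.8                                  # hypothetical loading of x2 on xi
xi = rng.normal(size=T)                    # latent variable, variance 1 (assumed i.i.d.)
delta1 = rng.normal(size=T)                # error of the unit-loading indicator, variance 1
delta2 = rng.normal(scale=0.5, size=T)     # error of the second indicator

x1 = xi + delta1                           # (3.6): x1 = xi + delta1
x2 = 0.5 + lam * xi + delta2               # measurement equation for x2
u3 = delta2 - lam * delta1                 # composite OFS error in (3.13)

# The regressor is correlated with the composite error: cov = -lam * var(delta1).
cov_x1_u3 = np.cov(x1, u3)[0, 1]
# OLS slope converges to lam * var(xi) / (var(xi) + var(delta1)) = 0.4, not 0.8.
ols_slope = np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)
```

With these assumed variances the least-squares slope settles near half of the true loading, which is why the OFS regressors cannot be treated as exogenous.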
It can be shown that estimation of the OFS equations might be possible by the use of instrumental variable (IV) methods. Furthermore, it can be shown that IV estimation might be based on model-implied instruments in the form of various lags of the OFS variables.
We propose a limited information generalised IV (GIVE) technique for consistent estimation of the OFS equations by using the model-implied instruments in the form of the lagged indicators of the latent variables.

4 Estimation of the OFS system


4.1 Full-sample specification
Estimation of the OFS equations aims at consistent and, possibly, efficient estimation of the structural and measurement-model parameters. However, the structural (latent) errors cannot be directly estimated. Therefore, ignoring the specific structure of the measurement error terms, let $u_{1t} \equiv \zeta_t + \varepsilon_{1t} - \sum_{j=0}^{p} B_j \varepsilon_{1t-j} - \sum_{j=0}^{q} \Gamma_j \delta_{1t-j}$, $u_{2t} \equiv \varepsilon_{2t} - \Lambda_2^{(y)} \varepsilon_{1t}$, and $u_{3t} \equiv \delta_{2t} - \Lambda_2^{(x)} \delta_{1t}$. The structural OFS equations can then be written as

$$y_{1t} = \alpha_\eta + \sum_{j=0}^{p} B_j y_{1t-j} + \sum_{j=0}^{q} \Gamma_j x_{1t-j} + u_{1t}, \tag{4.1}$$

with the measurement models

$$y_{2t} = \alpha_2^{(y)} + \Lambda_2^{(y)} y_{1t} + u_{2t}, \tag{4.2}$$

and

$$x_{2t} = \alpha_2^{(x)} + \Lambda_2^{(x)} x_{1t} + u_{3t}. \tag{4.3}$$

For notational convenience, we switch to full-sample notation, assuming that $\max(p, q)$ pre-sample observations are available for estimation. Define $y_{kj} \equiv \left( y_0^{(kj)}, y_1^{(kj)}, \ldots, y_T^{(kj)} \right)$, and $x_{2j} \equiv \left( x_0^{(2j)}, x_1^{(2j)}, \ldots, x_T^{(2j)} \right)$, for $k = 1, 2$, where the “$j$” subscript refers to the $j$th equation and there are $m$ individual $y_1$ equations, $n$ individual $y_2$ equations, and $h$ individual $x_2$ equations. Further define $Y_{1j} \equiv (Y_{1jt}, Y_{1jt-k})$, and $X_{1j} \equiv (X_{1jt}, X_{1jt-k})$, where

$$Y_{1jt} \equiv \begin{pmatrix} y_0^{(11)} & y_0^{(12)} & \cdots & y_0^{(1m)} \\ y_1^{(11)} & y_1^{(12)} & \cdots & y_1^{(1m)} \\ y_2^{(11)} & y_2^{(12)} & \cdots & y_2^{(1m)} \\ \vdots & \vdots & \ddots & \vdots \\ y_T^{(11)} & y_T^{(12)} & \cdots & y_T^{(1m)} \end{pmatrix}, \qquad X_{1jt} \equiv \begin{pmatrix} x_0^{(11)} & x_0^{(12)} & \cdots & x_0^{(1g)} \\ x_1^{(11)} & x_1^{(12)} & \cdots & x_1^{(1g)} \\ x_2^{(11)} & x_2^{(12)} & \cdots & x_2^{(1g)} \\ \vdots & \vdots & \ddots & \vdots \\ x_T^{(11)} & x_T^{(12)} & \cdots & x_T^{(1g)} \end{pmatrix},$$

and

$$Y_{1jt-k} \equiv \begin{pmatrix} y_{-1}^{(11)} & y_{-1}^{(12)} & \cdots & y_{-1}^{(1m)} & \cdots & y_{-p}^{(11)} & y_{-p}^{(12)} & \cdots & y_{-p}^{(1m)} \\ y_0^{(11)} & y_0^{(12)} & \cdots & y_0^{(1m)} & \cdots & y_{1-p}^{(11)} & y_{1-p}^{(12)} & \cdots & y_{1-p}^{(1m)} \\ y_1^{(11)} & y_1^{(12)} & \cdots & y_1^{(1m)} & \cdots & y_{2-p}^{(11)} & y_{2-p}^{(12)} & \cdots & y_{2-p}^{(1m)} \\ \vdots & \vdots & & \vdots & & \vdots & \vdots & & \vdots \\ y_{T-1}^{(11)} & y_{T-1}^{(12)} & \cdots & y_{T-1}^{(1m)} & \cdots & y_{T-p}^{(11)} & y_{T-p}^{(12)} & \cdots & y_{T-p}^{(1m)} \end{pmatrix},$$

$$X_{1jt-k} \equiv \begin{pmatrix} x_{-1}^{(11)} & x_{-1}^{(12)} & \cdots & x_{-1}^{(1g)} & \cdots & x_{-q}^{(11)} & x_{-q}^{(12)} & \cdots & x_{-q}^{(1g)} \\ x_0^{(11)} & x_0^{(12)} & \cdots & x_0^{(1g)} & \cdots & x_{1-q}^{(11)} & x_{1-q}^{(12)} & \cdots & x_{1-q}^{(1g)} \\ x_1^{(11)} & x_1^{(12)} & \cdots & x_1^{(1g)} & \cdots & x_{2-q}^{(11)} & x_{2-q}^{(12)} & \cdots & x_{2-q}^{(1g)} \\ \vdots & \vdots & & \vdots & & \vdots & \vdots & & \vdots \\ x_{T-1}^{(11)} & x_{T-1}^{(12)} & \cdots & x_{T-1}^{(1g)} & \cdots & x_{T-q}^{(11)} & x_{T-q}^{(12)} & \cdots & x_{T-q}^{(1g)} \end{pmatrix}.$$

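In addition, we define the following notation for the parameter vectors

$$\lambda_j^{(y)} \equiv \left( \lambda_{yj}^{(21)}, \lambda_{yj}^{(22)}, \ldots, \lambda_{yj}^{(2n)} \right)', \qquad \lambda_j^{(x)} \equiv \left( \lambda_{xj}^{(21)}, \lambda_{xj}^{(22)}, \ldots, \lambda_{xj}^{(2h)} \right)',$$

$$\beta_j \equiv \left( \beta_0^{(11)}, \beta_0^{(12)}, \ldots, \beta_0^{(1m)}, \beta_1^{(11)}, \beta_1^{(12)}, \ldots, \beta_1^{(1m)}, \ldots, \beta_p^{(11)}, \beta_p^{(12)}, \ldots, \beta_p^{(1m)} \right)',$$

and

$$\gamma_j \equiv \left( \gamma_0^{(11)}, \gamma_0^{(12)}, \ldots, \gamma_0^{(1g)}, \gamma_1^{(11)}, \gamma_1^{(12)}, \ldots, \gamma_1^{(1g)}, \ldots, \gamma_q^{(11)}, \gamma_q^{(12)}, \ldots, \gamma_q^{(1g)} \right)'.$$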

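Using the above notation, we can now write (4.1)–(4.3) as

$$y_{1j} = \alpha_{1j}^{(y)} + Y_{1j} \beta_j + X_{1j} \gamma_j + u_{1j}, \tag{4.4}$$

$$y_{2j} = \alpha_{2j}^{(y)} + Y_{1jt} \lambda_j^{(y)} + u_{2j}, \tag{4.5}$$

$$x_{2j} = \alpha_{2j}^{(x)} + X_{1jt} \lambda_j^{(x)} + u_{3j}. \tag{4.6}$$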

Note that the individual OFS equations are specified as

$$y_{1j} = \alpha_{1j}^{(y)} + \sum_{k=1}^{m} \sum_{i=0}^{p} \beta_i^{(1k)} y_{t-i}^{(1k)} + \sum_{k=1}^{g} \sum_{i=0}^{q} \gamma_i^{(1k)} x_{t-i}^{(1k)} + u_{1jt},$$

for the structural part of the model, and as

$$y_{2j} = \alpha_{2j}^{(y)} + \sum_{k=1}^{m} \lambda_{2jk}^{(y)} y_t^{(1k)} + u_{2jt}, \qquad x_{2j} = \alpha_{2j}^{(x)} + \sum_{k=1}^{g} \lambda_{2jk}^{(x)} x_t^{(1k)} + u_{3jt},$$

for the measurement models. This completes the specification of the DSEM model.
It remains to show that the available instruments in the form of lags of the observed variables can enable consistent estimation. The choice of instruments is also discussed in Bollen (1996; 2001); however, he does not discuss this issue in the context of dynamic models. The following discussion takes into account the specific structure of the OFS system and the implications derived from the composition of the latent errors. This (known) composition of the latent error terms and their implied relation with the observed components of the model, as a consequence of the latent structure, presents the major difference between the DSEM OFS equations and classical econometric models. Specifically, it is not possible to simply assume the availability of external instrumental variables that satisfy some general conditions such as being uncorrelated with the errors and correlated with the regressors. Rather, it will be necessary to show under which conditions the lagged modelled variables can serve as valid instruments in the estimation of the OFS equations.

4.2 Consistency conditions and instrumental variables


The standard consistency conditions needed for the validity of instrumental variables (see e.g. Judge et al., 1985, and Davidson and MacKinnon, 1993) can be stated in terms of the data matrix $X$ defined as $X \equiv (\iota, Y_{1j}, X_{1j})$ where $Y_{1j} \equiv (Y_{1jt}, Y_{1jt-k})$ and $X_{1j} \equiv (X_{1jt}, X_{1jt-k})$, as defined above. Let $Z$ be a matrix of valid instruments defined as $Z \equiv (Y_1^*, Y_2^*, X_1^*, X_2^*)$ where $Y_1^* \equiv (Y_{11}^*, Y_{12}^*, \ldots, Y_{1a}^*)$, $Y_2^* \equiv (Y_{21}^*, Y_{22}^*, \ldots, Y_{2b}^*)$, $X_1^* \equiv (X_{11}^*, X_{12}^*, \ldots, X_{1c}^*)$, $X_2^* \equiv (X_{21}^*, X_{22}^*, \ldots, X_{2d}^*)$, and

$$Y_{1k}^* = \begin{pmatrix} y_{-p-k}^{(11)} & y_{-p-k}^{(12)} & \cdots & y_{-p-k}^{(1m)} \\ y_{1-p-k}^{(11)} & y_{1-p-k}^{(12)} & \cdots & y_{1-p-k}^{(1m)} \\ y_{2-p-k}^{(11)} & y_{2-p-k}^{(12)} & \cdots & y_{2-p-k}^{(1m)} \\ \vdots & \vdots & \ddots & \vdots \\ y_{T-p-k}^{(11)} & y_{T-p-k}^{(12)} & \cdots & y_{T-p-k}^{(1m)} \end{pmatrix}, \qquad Y_{2l}^* = \begin{pmatrix} y_{-l}^{(21)} & y_{-l}^{(22)} & \cdots & y_{-l}^{(2n)} \\ y_{-l+1}^{(21)} & y_{-l+1}^{(22)} & \cdots & y_{-l+1}^{(2n)} \\ y_{-l+2}^{(21)} & y_{-l+2}^{(22)} & \cdots & y_{-l+2}^{(2n)} \\ \vdots & \vdots & \ddots & \vdots \\ y_{T-l}^{(21)} & y_{T-l}^{(22)} & \cdots & y_{T-l}^{(2n)} \end{pmatrix},$$

$$X_{1i}^* = \begin{pmatrix} x_{-q-i}^{(11)} & x_{-q-i}^{(12)} & \cdots & x_{-q-i}^{(1g)} \\ x_{1-q-i}^{(11)} & x_{1-q-i}^{(12)} & \cdots & x_{1-q-i}^{(1g)} \\ x_{2-q-i}^{(11)} & x_{2-q-i}^{(12)} & \cdots & x_{2-q-i}^{(1g)} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T-q-i}^{(11)} & x_{T-q-i}^{(12)} & \cdots & x_{T-q-i}^{(1g)} \end{pmatrix}, \qquad X_{2j}^* = \begin{pmatrix} x_{-j}^{(21)} & x_{-j}^{(22)} & \cdots & x_{-j}^{(2h)} \\ x_{-j+1}^{(21)} & x_{-j+1}^{(22)} & \cdots & x_{-j+1}^{(2h)} \\ x_{-j+2}^{(21)} & x_{-j+2}^{(22)} & \cdots & x_{-j+2}^{(2h)} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T-j}^{(21)} & x_{T-j}^{(22)} & \cdots & x_{T-j}^{(2h)} \end{pmatrix},$$

where $k = 1, 2, \ldots, a$; $l = 1, 2, \ldots, b$; $i = 1, 2, \ldots, c$; and $j = 1, 2, \ldots, d$.
We state the general conditions for these instruments in terms of the joint matrices $X$ and $Z$ though, in practice, only subsets of these matrices will be used in estimated models. It is generally necessary that

$$\operatorname{plim} \left( T^{-1} Z'Z \right) = \lim_{T \to \infty} \left( T^{-1} Z'Z \right) = \Sigma_{ZZ},$$

and also that

$$\operatorname{plim} \left( T^{-1} Z'X \right) = \lim_{T \to \infty} \left( T^{-1} Z'X \right) = \Sigma_{ZX},$$

where $\Sigma_{ZZ}$ is a positive definite matrix and $\Sigma_{ZX}$ is a finite matrix of full column rank. These conditions will generally hold for the case of lagged instruments given they satisfy certain stochastic conditions. In addition, we assume homoscedastic residuals, i.e., $E(u_i u_j') = \sigma_{ij} I$ and, in particular, $E(Z'u_i) = 0$.
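The instrument blocks in $Z$ are built from lags deeper than those already used as regressors (lags $p + k$ of $y_1$, for instance). A small Python sketch of how such aligned lag blocks might be assembled is given below; the helper name `lagged_block` and the toy series are hypothetical, and the alignment convention (dropping the first `max_lag` rows) is one possible choice rather than the paper's.

```python
import numpy as np

def lagged_block(v, lags, max_lag):
    """Stack the requested lags of v column-wise, aligned on t = max_lag, ..., T-1.

    Row r of the result corresponds to time t = max_lag + r and holds v[t - lag]
    for each lag in `lags`, so blocks built with the same max_lag share rows.
    """
    v = np.asarray(v, dtype=float)
    if v.ndim == 1:
        v = v[:, None]
    T = v.shape[0]
    return np.hstack([v[max_lag - lag: T - lag] for lag in lags])

# Toy series; regressors use lags 0..p, instruments use the next `a` lags.
y1 = np.arange(10.0)
p, a = 1, 2
max_lag = p + a
Y_reg = lagged_block(y1, range(0, p + 1), max_lag)             # y1_t, y1_{t-1}
Z_y1 = lagged_block(y1, range(p + 1, max_lag + 1), max_lag)    # y1_{t-2}, y1_{t-3}
```

Blocks for $y_2$, $x_1$ and $x_2$ can be produced the same way and concatenated with `np.hstack` to form a candidate $Z$.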

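To assure the consistency of the IV estimator we will need to make the following assumption about the stochastic properties of the observed variables.

Assumption 4.2.1 For stochastic processes $\{y_t\}$ and $\{x_t\}$ suppose that:

A1. $E(y_{ijt}) = \mu_{ij}^{(y)}, \ \forall t$

A2. $E(x_{ijt}) = \mu_{ij}^{(x)}, \ \forall t$

A3. $E\left[ (y_{ij,t-r} - \mu_{ij}^{(y)})(y_{ef,t-w} - \mu_{ef}^{(y)}) \right] = \gamma_{|r-w|}^{(ijef)}, \ \forall t$

A4. $E\left[ (x_{ij,t-r} - \mu_{ij}^{(x)})(x_{ef,t-w} - \mu_{ef}^{(x)}) \right] = \delta_{|r-w|}^{(ijef)}, \ \forall t$

A5. $E\left[ (y_{ij,t-r} - \mu_{ij}^{(y)})(x_{ef,t-w} - \mu_{ef}^{(x)}) \right] = \psi_{|r-w|}^{(ijef)}, \ \forall t$

A6. $\sum_{k=0}^{\infty} \gamma_k^{(\cdot)} < \infty$, $\sum_{k=0}^{\infty} \delta_k^{(\cdot)} < \infty$, $\sum_{k=0}^{\infty} \psi_k^{(\cdot)} < \infty$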

We will also need the following two lemmas.

Lemma 4.2.2 Let $w_t$ be a covariance-stationary process with finite fourth moments and absolutely summable autocovariances. Then the sample mean satisfies

$$T^{-1} \sum_{t=1}^{T} w_t \xrightarrow{m.s.} \mu_w$$

where m.s. denotes convergence in mean square.

Proof. Omitted. See Hamilton (1994: 188), Proposition 7.5.

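Lemma 4.2.3 Let $y_t$ and $x_t$ be stochastic processes satisfying Assumption 4.2.1. Then the following convergence results hold:

(i) $\frac{1}{T} \sum_{t=0}^{T} y_{ij,t-s} \xrightarrow{p} E(y_{ijt}) = \mu_{ij}^{(y)}$

(ii) $\frac{1}{T} \sum_{t=0}^{T} y_{ij,t-s}^2 \xrightarrow{p} E\left( y_{ijt}^2 \right) = \gamma_0^{(ij)} + (\mu_{ij}^{(y)})^2$

(iii) $\frac{1}{T} \sum_{t=0}^{T} y_{ij,t-r} \, y_{ef,t-w} \xrightarrow{p} E(y_{ij,t-r} \, y_{ef,t-w}) = \gamma_{|r-w|}^{(ijef)} + \mu_{ij}^{(y)} \mu_{ef}^{(y)}$

(iv) $\frac{1}{T} \sum_{t=0}^{T} x_{ij,t-s} \xrightarrow{p} E(x_{ijt}) = \mu_{ij}^{(x)}$

(v) $\frac{1}{T} \sum_{t=0}^{T} x_{ij,t-s}^2 \xrightarrow{p} E\left( x_{ijt}^2 \right) = \delta_0^{(ij)} + (\mu_{ij}^{(x)})^2$

(vi) $\frac{1}{T} \sum_{t=0}^{T} x_{ij,t-r} \, x_{ef,t-w} \xrightarrow{p} E(x_{ij,t-r} \, x_{ef,t-w}) = \delta_{|r-w|}^{(ijef)} + \mu_{ij}^{(x)} \mu_{ef}^{(x)}$

(vii) $\frac{1}{T} \sum_{t=0}^{T} y_{ij,t-r} \, x_{ef,t-w} \xrightarrow{p} E(y_{ij,t-r} \, x_{ef,t-w}) = \psi_{|r-w|}^{(ijef)} + \mu_{ij}^{(y)} \mu_{ef}^{(x)}$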

Proof. Omitted. See Cziráky (2003) for details.


The main underlying assumption in Lemma 4.2.2 and Lemma 4.2.3 is that of covariance stationarity for the observable variables. Therefore, to apply these methods to non-stationary variables the data would need to be differenced to achieve stationarity.

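Proposition 4.2.4 Let $X \equiv (\iota, Y_{1j}, X_{1j})$ where $Y_{1j} \equiv (Y_{1jt}, Y_{1jt-k})$ and $X_{1j} \equiv (X_{1jt}, X_{1jt-k})$. Let $Z$ be a matrix of valid instruments defined as $Z \equiv (Y_1^*, Y_2^*, X_1^*, X_2^*)$. Assuming that $E(u_i u_j') = \sigma_{ij} I$, the following results hold:

(i) $\operatorname{plim} \left( \frac{1}{T} Z'Z \right) = \Sigma_{ZZ}$

(ii) $\operatorname{plim} \left( \frac{1}{T} Z'X \right) = \Sigma_{ZX}$

(iii) $E(Z'u_i) = 0$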

Proof. Omitted. See Cziráky (2003) for details.

The above results allow consistent GIVE estimation of the OFS equations using the available, model-implied (lagged) instruments contained in Z, which includes all available eligible instruments that do not come from outside the modelled data. It must be mentioned that nothing precludes the availability of valid instruments that are not merely lags of the modelled variables. However, the nature of structural equation models with latent variables casts doubt on whether such variables will be available. In any case, valid instruments will satisfy the same conditions, but we have shown that eligible instruments might already exist in the data used, in the form of lagged values not already included in the model.

4.3 Consistent generalised instrumental variable estimation of the OFS equations
Formulation and estimation of the OFS equations requires reliance on the specific structure and status of the modelled variables. This structure is determined by the latent-form specification and makes specification of the OFS equations rather complex. In order to derive generalised instrumental variable estimators (GIVE) for the OFS equations, we start from the system of equations given in (4.4), (4.5), and (4.6) and write it by positioning its matrix and vector elements in a way that will facilitate the use of more concise notation, i.e.,

$$\begin{aligned} y_{1j} &= \alpha_{1j}^{(y)} + Y_{1j} \beta_j + X_{1j} \gamma_j + u_{1j} \\ y_{2j} &= \alpha_{2j}^{(y)} + Y_{1jt} \lambda_j^{(y)} + u_{2j} \\ x_{2j} &= \alpha_{2j}^{(x)} + X_{1jt} \lambda_j^{(x)} + u_{3j} \end{aligned} \tag{4.7}$$

We are now able to simplify our notation by stacking all of the right-hand-side variables of each of the three parts of the system (4.7) by making the following definitions: $W_{1j} \equiv (\iota, Y_{1j}, X_{1j})$, $W_{2j} \equiv (\iota, Y_{1jt})$, $W_{3j} \equiv (\iota, X_{1jt})$, $\delta_{1j}^{(y)} \equiv \left( \alpha_{1j}^{(y)}, \beta_j', \gamma_j' \right)'$, $\delta_{2j}^{(y)} \equiv \left( \alpha_{2j}^{(y)\prime}, \lambda_{2j}^{(y)\prime} \right)'$, and $\delta_{2j}^{(x)} \equiv \left( \alpha_{2j}^{(x)\prime}, \lambda_{2j}^{(x)\prime} \right)'$. It is now possible to re-write the system (4.7) in a simpler, more concise notation as

$$\begin{aligned} y_{1j} &= W_{1j} \delta_{1j}^{(y)} + u_{1j} \\ y_{2j} &= W_{2j} \delta_{2j}^{(y)} + u_{2j} \\ x_{2j} &= W_{3j} \delta_{2j}^{(x)} + u_{3j} \end{aligned} \tag{4.8}$$

An appropriate matrix of instruments Z need not contain all available eligible instruments, but it needs to have at least as many of them as there are endogenous variables in each equation. The matrix of instruments Z can differ across different (individual) equations of the system (4.8). For simplicity we assume that Z is correctly specified.
We proceed by defining the GIVE estimator. First, by premultiplying each part of the system by $Z'$ we obtain the matrix equations $Z'y_{1j} = Z'W_{1j} \delta_{1j}^{(y)} + Z'u_{1j}$, $Z'y_{2j} = Z'W_{2j} \delta_{2j}^{(y)} + Z'u_{2j}$, and $Z'x_{2j} = Z'W_{3j} \delta_{2j}^{(x)} + Z'u_{3j}$. We now define the usual GIVE estimators for the coefficient vectors $\hat{\delta}_{1j}^{(y)}$, $\hat{\delta}_{2j}^{(y)}$, and $\hat{\delta}_{2j}^{(x)}$ as

$$\hat{\delta}_{1j}^{(y)} = \left( W_{1j}' Z (Z'Z)^{-1} Z' W_{1j} \right)^{-1} W_{1j}' Z (Z'Z)^{-1} Z' y_{1j}, \tag{4.9}$$

$$\hat{\delta}_{2j}^{(y)} = \left( W_{2j}' Z (Z'Z)^{-1} Z' W_{2j} \right)^{-1} W_{2j}' Z (Z'Z)^{-1} Z' y_{2j}, \tag{4.10}$$

and

$$\hat{\delta}_{2j}^{(x)} = \left( W_{3j}' Z (Z'Z)^{-1} Z' W_{3j} \right)^{-1} W_{3j}' Z (Z'Z)^{-1} Z' x_{2j}. \tag{4.11}$$

It is easy to show that (4.9), (4.10), and (4.11) are consistent estimators of the unknown coefficient vectors $\delta_{1j}^{(y)}$, $\delta_{2j}^{(y)}$, and $\delta_{2j}^{(x)}$. To show this note that

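$$\hat{\delta}_{ij}^{(*)} = \delta_{ij}^{(*)} + \left( W_{ij}' Z (Z'Z)^{-1} Z' W_{ij} \right)^{-1} W_{ij}' Z (Z'Z)^{-1} Z' u_{ij}$$

Taking probability limits we obtain

$$\begin{aligned} \operatorname{plim} \hat{\delta}_{ij}^{(*)} &= \delta_{ij}^{(*)} + \left[ \operatorname{plim} \left( \tfrac{1}{T} W_{ij}' Z \right) \cdot \operatorname{plim} \left( \tfrac{1}{T} Z'Z \right)^{-1} \operatorname{plim} \left( \tfrac{1}{T} Z' W_{ij} \right) \right]^{-1} \\ &\quad \times \operatorname{plim} \left( \tfrac{1}{T} W_{ij}' Z \right) \cdot \operatorname{plim} \left( \tfrac{1}{T} Z'Z \right)^{-1} \operatorname{plim} \left( \tfrac{1}{T} Z' u_{ij} \right) \\ &= \delta_{ij}^{(*)} + \left( \Sigma_{W_{ij}Z} \Sigma_{ZZ}^{-1} \Sigma_{ZW_{ij}} \right)^{-1} \Sigma_{W_{ij}Z} \Sigma_{ZZ}^{-1} \cdot 0 \\ &= \delta_{ij}^{(*)} \end{aligned}$$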
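The estimator in (4.9)–(4.11) and the consistency argument can be checked numerically on a scalar measurement equation of the form (4.3). The sketch below is an illustration under assumptions that are not taken from the paper: the function name `give`, an AR(1) latent variable with coefficient 0.7, Gaussian white-noise measurement errors, a true loading of 0.8 and two lags of $x_{1t}$ as instruments.

```python
import numpy as np

def give(y, W, Z):
    """GIVE estimate (W'Z (Z'Z)^{-1} Z'W)^{-1} W'Z (Z'Z)^{-1} Z'y."""
    # P_Z W: projection of the regressors on the column space of the instruments.
    PW = Z @ np.linalg.solve(Z.T @ Z, Z.T @ W)
    # PW'W = W'P_Z W and PW'y = W'P_Z y because P_Z is symmetric.
    return np.linalg.solve(PW.T @ W, PW.T @ y)

rng = np.random.default_rng(42)
T = 20_000
xi = np.zeros(T)
for t in range(1, T):
    xi[t] = 0.7 * xi[t - 1] + rng.normal()           # assumed AR(1) latent variable
x1 = xi + rng.normal(size=T)                         # unit-loading indicator
x2 = 0.5 + 0.8 * xi + rng.normal(scale=0.5, size=T)  # second indicator, loading 0.8

ones = np.ones(T - 2)
W = np.column_stack([ones, x1[2:]])                  # regressors: constant, x1_t
Z = np.column_stack([ones, x1[1:-1], x1[:-2]])       # instruments: x1_{t-1}, x1_{t-2}

alpha_iv, lam_iv = give(x2[2:], W, Z)
alpha_ols, lam_ols = np.linalg.lstsq(W, x2[2:], rcond=None)[0]
```

In this design the lagged indicators are correlated with $x_{1t}$ through the persistence of the latent variable but uncorrelated with the composite error, so `lam_iv` should lie close to 0.8 while `lam_ols` is attenuated towards roughly 0.53.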
The above result holds for each of the vectors $\hat{\delta}_{1j}^{(y)}$, $\hat{\delta}_{2j}^{(y)}$, and $\hat{\delta}_{2j}^{(x)}$, where superscripts $(y, x)$ were replaced by asterisks, and subscripts $(1, 2)$ by $i$. For computational purposes, the GIVE estimators using the OFS notation defined above can be written in more detail as follows. Firstly, the three sets of coefficient vectors in the structural part of the model are estimated by

$$\begin{pmatrix} \hat{\alpha}_{\eta j} \\ \hat{\beta}_j \\ \hat{\gamma}_j \end{pmatrix} = \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' \iota & \iota' Z (Z'Z)^{-1} Z' Y_{1j} & \iota' Z (Z'Z)^{-1} Z' X_{1j} \\ Y_{1j}' Z (Z'Z)^{-1} Z' \iota & Y_{1j}' Z (Z'Z)^{-1} Z' Y_{1j} & Y_{1j}' Z (Z'Z)^{-1} Z' X_{1j} \\ X_{1j}' Z (Z'Z)^{-1} Z' \iota & X_{1j}' Z (Z'Z)^{-1} Z' Y_{1j} & X_{1j}' Z (Z'Z)^{-1} Z' X_{1j} \end{pmatrix}^{-1} \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' y_{1j} \\ Y_{1j}' Z (Z'Z)^{-1} Z' y_{1j} \\ X_{1j}' Z (Z'Z)^{-1} Z' y_{1j} \end{pmatrix}$$

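Secondly, the GIVE estimators of the measurement model are given by

$$\begin{pmatrix} \hat{\alpha}_{2j}^{(y)} \\ \hat{\lambda}_{2j}^{(y)} \end{pmatrix} = \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' \iota & \iota' Z (Z'Z)^{-1} Z' Y_{1jt} \\ Y_{1jt}' Z (Z'Z)^{-1} Z' \iota & Y_{1jt}' Z (Z'Z)^{-1} Z' Y_{1jt} \end{pmatrix}^{-1} \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' y_{2j} \\ Y_{1jt}' Z (Z'Z)^{-1} Z' y_{2j} \end{pmatrix},$$

and

$$\begin{pmatrix} \hat{\alpha}_{2j}^{(x)} \\ \hat{\lambda}_{2j}^{(x)} \end{pmatrix} = \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' \iota & \iota' Z (Z'Z)^{-1} Z' X_{1jt} \\ X_{1jt}' Z (Z'Z)^{-1} Z' \iota & X_{1jt}' Z (Z'Z)^{-1} Z' X_{1jt} \end{pmatrix}^{-1} \begin{pmatrix} \iota' Z (Z'Z)^{-1} Z' x_{2j} \\ X_{1jt}' Z (Z'Z)^{-1} Z' x_{2j} \end{pmatrix}.$$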

The asymptotic distribution of these estimators does not depend on the assumption that the modelled data is multivariate normal and, thus, GIVE estimators of the DSEM model are asymptotically distribution free. This is an advantage over the maximum likelihood estimator of the static structural equation model, and therefore the GIVE estimator can prove to be more robust both to misspecification of certain parts of the model and to departure from normality.5
The asymptotic distribution of the GIVE estimators is normal and it can be derived by noting that

$$\sqrt{T} \left( \hat{\delta}_{ij}^{(*)} - \delta_{ij}^{(*)} \right) = \left[ \left( \tfrac{1}{T} W_{ij}' Z \right) \left( \tfrac{1}{T} Z'Z \right)^{-1} \left( \tfrac{1}{T} Z' W_{ij} \right) \right]^{-1} \left( \tfrac{1}{T} W_{ij}' Z \right) \left( \tfrac{1}{T} Z'Z \right)^{-1} \tfrac{1}{\sqrt{T}} Z' u_{ij}.$$

If we assume that $T^{-1/2} Z' u_{ij} \xrightarrow{d} N(0, \sigma_{ij} \Sigma_{ZZ})$, we can conclude that the asymptotic distribution of the DSEM coefficient estimates is

$$\sqrt{T} \left( \hat{\delta}_{ij}^{(*)} - \delta_{ij}^{(*)} \right) \xrightarrow{d} N \left( 0, \ \sigma_{ij} \left( \Sigma_{W_{ij}Z} \Sigma_{ZZ}^{-1} \Sigma_{ZW_{ij}} \right)^{-1} \right)$$

The asymptotic covariance matrix $\sigma_{ij} \left( W_{ij}' Z (Z'Z)^{-1} Z' W_{ij} \right)^{-1}$ can be estimated with $\hat{\Sigma}_{\hat{\delta}_{ij}^{(*)}} = \hat{\sigma}_{ij} \left( W_{ij}' Z (Z'Z)^{-1} Z' W_{ij} \right)^{-1}$ where

$$\hat{\sigma}_{ij} = T^{-1} \hat{u}_{ij}' \hat{u}_{ij} = T^{-1} \left( y_{ij} - W_{ij} \hat{\delta}_{ij}^{(*)} \right)' \left( y_{ij} - W_{ij} \hat{\delta}_{ij}^{(*)} \right).$$

5 Misspecification of one OFS equation will not necessarily affect coefficients of other equations since these are estimated separately using a limited information estimator.
The empirical validity of instrumental variables, as opposed to their model-implied eligibility, is testable. The validity of the choice of the instrumental variables can be tested by Sargan’s (1964) $\chi^2$ test. Applied to the OFS equations, the Sargan test can be calculated as

$$\frac{y_{ij}' Z (Z'Z)^{-1} Z' y_{ij} - \hat{\delta}_{ij}^{(*)\prime} \left( W_{ij}' Z (Z'Z)^{-1} Z' W_{ij} \right) \hat{\delta}_{ij}^{(*)}}{T^{-1} \hat{u}_{ij}' \hat{u}_{ij}} \ \underset{app}{\sim} \ \chi^2_{(d)}, \tag{4.12}$$

where $d$ is the number of over-identifying instruments, assumed to be independent of the equation error. It is important to note that selection of the IVs on the basis of the model-implied eligibility without testing for their empirical validity can result in considerable bias in the estimated coefficients. As the choice of instruments affects consistency of GIVE estimates, inappropriate IV selection might result in estimates that will not be robust to misspecification. Therefore, testing for the validity of IVs should be an important part of empirical estimation of DSEM models.
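The covariance estimator and the Sargan statistic can be computed alongside the coefficients. The Python sketch below is a hedged illustration: the function name `give_fit` and the simulated design are hypothetical, the instruments are i.i.d. toy variables rather than lagged indicators, and the statistic is computed as $\hat{u}' Z (Z'Z)^{-1} Z' \hat{u} / (T^{-1} \hat{u}'\hat{u})$, which equals the numerator in (4.12) because $W' Z (Z'Z)^{-1} Z' y = W' Z (Z'Z)^{-1} Z' W \hat{\delta}$ at the GIVE solution.

```python
import numpy as np

def give_fit(y, W, Z):
    """Return GIVE coefficients, standard errors, Sargan statistic and its df."""
    T = len(y)
    # Two-stage form: fitted regressors W_hat = P_Z W, then solve W_hat'W d = W_hat'y.
    W_hat = Z @ np.linalg.lstsq(Z, W, rcond=None)[0]
    coef = np.linalg.solve(W_hat.T @ W, W_hat.T @ y)
    resid = y - W @ coef
    sigma2 = resid @ resid / T
    cov = sigma2 * np.linalg.inv(W_hat.T @ W)        # sigma^2 (W'P_Z W)^{-1}
    # Sargan statistic: projection of the residuals on the instruments.
    u_proj = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
    sargan = (resid @ u_proj) / sigma2
    df = Z.shape[1] - W.shape[1]                     # over-identifying restrictions
    return coef, np.sqrt(np.diag(cov)), sargan, df

rng = np.random.default_rng(7)
T = 2_000
z1, z2, v, e = rng.normal(size=(4, T))
x = z1 + z2 + v                                      # regressor, endogenous through v
u_valid = 0.8 * v + e                                # error uncorrelated with z1, z2
u_invalid = u_valid + z2                             # z2 now enters the error
W = np.column_stack([np.ones(T), x])
Z = np.column_stack([np.ones(T), z1, z2])

coef, se_valid, s_valid, df = give_fit(1.0 + 2.0 * x + u_valid, W, Z)
_, _, s_invalid, _ = give_fit(1.0 + 2.0 * x + u_invalid, W, Z)
_, _, s_exact, df_exact = give_fit(1.0 + 2.0 * x + u_valid, W, Z[:, :2])
```

With valid instruments the statistic behaves like a draw from $\chi^2_{(1)}$, it becomes very large once `z2` enters the error, and it collapses to zero in the exactly identified case where no over-identifying restriction is left to test.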

5 Identification
Identification of static structural equation models with latent variables is generally problematic. An early discussion of this topic can be found in Wiley (1973), but a simple and straightforward procedure still does not exist. On the other hand, identification is well defined and straightforward in classical econometric simultaneous equation systems, and a similar approach can be developed for the OFS equations.
We propose a simple procedure that uses only the coefficient matrices from the latent specification for identifying the OFS estimation equations. The following technique provides sufficient conditions for identification of all equations in the system.

Proposition 5.0.1 Given a DSEM model with the structural equation of the form $\eta_t = \alpha_\eta + \sum_{j=0}^{p} B_j \eta_{t-j} + \sum_{j=0}^{q} \Gamma_j \xi_{t-j} + \zeta_t$ and the measurement model given by $x_t = \alpha_x + \Lambda_x \xi_t + \delta_t$ and $y_t = \alpha_y + \Lambda_y \eta_t + \varepsilon_t$, define

$$K = \begin{pmatrix} (I - B_0') & -\Lambda_2^{(y)\prime} & 0 \\ 0 & I_n & 0 \\ 0 & 0 & I_h \end{pmatrix}, \qquad G = \begin{pmatrix} -\alpha_\eta & -\alpha_2^{(y)} & -\alpha_2^{(x)} \\ -B_1' & 0 & 0 \\ -B_2' & 0 & 0 \\ \vdots & \vdots & \vdots \\ -B_p' & 0 & 0 \\ -\Gamma_0' & 0 & -\Lambda_2^{(x)\prime} \\ -\Gamma_1' & 0 & 0 \\ \vdots & \vdots & \vdots \\ -\Gamma_q' & 0 & 0 \end{pmatrix}.$$

Then, the $j$th equation of the system will be identified iff

$$\operatorname{rank} \left[ R_j \begin{pmatrix} K \\ G \end{pmatrix} \right] \geq m + n + h - 1 \tag{5.1}$$

where $R_j$ is a zero-one selection matrix having ones in places of omitted variables and one row for each omission. Note that if the equality holds the equation is exactly identified, otherwise it is overidentified.

Corollary 5.0.2 A corollary to Proposition 5.0.1 states that unless

$$\operatorname{rank}(R_j) \geq m + n + h - 1 \tag{5.2}$$

the $j$th equation is not identified. The condition (5.2) is necessary for identification, while condition (5.1) is sufficient.

Proof. Omitted. See Cziráky (2003) for details.
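The rank check in (5.1) is mechanical once $K$, $G$ and $R_j$ are written down. The following Python sketch applies it to a deliberately tiny hypothetical model with $m = n = h = g = 1$ and $p = q = 1$; the helper names `selection_matrix` and `is_identified`, the coefficient values, and the row ordering of the stacked matrix are assumptions made for illustration, not part of the paper.

```python
import numpy as np

def selection_matrix(omitted_rows, n_rows):
    """Zero-one matrix R_j with one row per omitted variable."""
    R = np.zeros((len(omitted_rows), n_rows))
    R[np.arange(len(omitted_rows)), omitted_rows] = 1.0
    return R

def is_identified(R_j, K, G, m, n, h):
    """Rank condition (5.1): rank(R_j [K; G]) >= m + n + h - 1."""
    A = np.vstack([K, G])
    return bool(np.linalg.matrix_rank(R_j @ A) >= m + n + h - 1)

m = n = h = 1
# Columns: equations for y1, y2, x2. Rows of K: variables y1, y2, x2.
K = np.array([[1.0, -0.9, 0.0],      # (I - B0'), -Lambda_2^(y)', 0 with B0 = 0
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
# Rows of G: constant, y1_{t-1}, x1_t, x1_{t-1}.
G = np.array([[-0.1, -0.2, -0.3],    # intercepts
              [-0.5, 0.0, 0.0],      # -B1'
              [-0.6, 0.0, -0.7],     # -Gamma0', 0, -Lambda_2^(x)'
              [-0.2, 0.0, 0.0]])     # -Gamma1'
n_rows = K.shape[0] + G.shape[0]

# Stacked row indices: 0 y1, 1 y2, 2 x2, 3 const, 4 y1_{t-1}, 5 x1_t, 6 x1_{t-1}.
R1 = selection_matrix([1, 2], n_rows)            # y1 equation omits y2 and x2
R2 = selection_matrix([2, 4, 5, 6], n_rows)      # y2 equation omits x2 and all x1, lagged y1
R_short = selection_matrix([1], n_rows)          # only one exclusion restriction

ok_1 = is_identified(R1, K, G, m, n, h)
ok_2 = is_identified(R2, K, G, m, n, h)
ok_short = is_identified(R_short, K, G, m, n, h)
```

In this toy system the first two checks return `True` and the last returns `False`, because a single exclusion cannot deliver the required rank of $m + n + h - 1 = 2$; this is the order condition (5.2) at work.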

It is therefore possible to use these rules to check for identification of each in-
dividual equation. The relevance of this approach lies in its ability to check for
identification of the model that is specified in latent form and thus it avoids the
need to derive the OFS equations. In addition, this method is equally applicable for
both static and dynamic structural equation models with latent variables.

6 An empirical application: Estimating a dynamic latent consumption function using UK household data
We apply the methods proposed above by estimating a latent consumption function model that incorporates liquidity effects using micro data from the last 10 waves (years) of the British Household Panel Survey. We merged the 10 waves of the British Household Panel Survey (Taylor et al., 2001) into a panel. Since the available variables on consumption expenditure, types of income, and liquidity constraints indicators vary across waves, we use only those variables that were available across all 10 waves.6

Table 1: Variables and notation.

Symbol     Description
$F_t$      Annual personal food expenditure
$H_t$      Annual personal housing costs
$L_t$      Annual labour income
$NL_t$     Annual non-labour, non-investment income
$I_t$      Annual investment income
$S_t$      Annual personal savings
$B_t$      Cumulative credit repayment problem
$ED_t$     Highest level of academic education
The variables we use in estimation are shown in Table 1. Household data (expenditures) were first spread to the individual level and then combined with the individual-level income data, thus creating all-individual data files for each wave. Finally, wave-specific files were merged into a joint panel for all individuals across all waves in the “long format”, meaning that the first individual in the sample is recorded at each time point, followed by the second individual, etc. In this analysis we do not address the issues of missing values and attrition, but we note that we used data on 3,324 individuals that had no missing values. Thus our panel with NT observations amounted to 33,240 observations.
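The long format has a practical consequence for differencing and lagging: operations must stay within an individual's block of rows and never cross into the next individual. A minimal Python sketch of within-individual first differences for a balanced panel stored this way follows; the function name `first_difference_long` and the toy numbers are hypothetical, and a balanced panel with no missing values is assumed, as in the sample described above.

```python
import numpy as np

def first_difference_long(values, n_individuals, n_waves):
    """First-difference a balanced long-format panel within each individual.

    `values` is ordered as individual 1 at waves 1..T, then individual 2, etc.
    The result has n_individuals * (n_waves - 1) entries in the same ordering.
    """
    panel = np.asarray(values, dtype=float).reshape(n_individuals, n_waves)
    return np.diff(panel, axis=1).ravel()

# Two individuals observed over three waves.
d = first_difference_long([1, 2, 4, 10, 10, 13], n_individuals=2, n_waves=3)
# A series with the dimensions reported for the sample (3,324 individuals, 10 waves).
d_full = first_difference_long(np.zeros(33_240), n_individuals=3_324, n_waves=10)
```

Each individual loses one observation to differencing, so a panel of 33,240 rows yields 29,916 differenced rows before any further lags are taken for instruments.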
We specify our model in the general DSEM form. The model assumes that current consumption, modelled as a latent variable, depends on current (latent) income, previous period consumption and previous period income. Simultaneously, current income depends on the last period income and education, which is assumed to be measured without error for simplicity. Note that we assume that education is perfectly measured by a single indicator $ED_t$. Finally, the (latent) liquidity constraints are directly incorporated into the model and assumed to depend on previous period consumption.7 The structural part of the model describes the relationships among the latent variables and is specified as

$$\begin{pmatrix} LQ_t \\ C_t \\ Y_t \end{pmatrix} = \begin{pmatrix} \alpha_{LQ} \\ \alpha_C \\ \alpha_Y \end{pmatrix} + \begin{pmatrix} 0 & 0 & \beta_{13}^{(0)} \\ \beta_{21}^{(0)} & 0 & \beta_{23}^{(0)} \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} LQ_t \\ C_t \\ Y_t \end{pmatrix} + \begin{pmatrix} \beta_{11}^{(1)} & \beta_{12}^{(1)} & \beta_{13}^{(1)} \\ 0 & \beta_{22}^{(1)} & \beta_{23}^{(1)} \\ 0 & 0 & \beta_{33}^{(1)} \end{pmatrix} \begin{pmatrix} LQ_{t-1} \\ C_{t-1} \\ Y_{t-1} \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ \gamma_{31} \end{pmatrix} (E_t) + \begin{pmatrix} \zeta_{LQ} \\ \zeta_C \\ \zeta_Y \end{pmatrix}$$

6 See Cziráky (2002) for details on data manipulation and computer implementation.

7 Logically, we expect that excessive spending in one year causes a greater degree of liquidity constraints in the following year.

There are three measurement models, for latent consumption, income and liq-
uidity constraints variables. The measurement model is given by
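\[
\begin{pmatrix} S_t \\ F_t \\ L_t \\ B_t \\ H_t \\ NL_t \\ I_t \end{pmatrix}
=
\begin{pmatrix}
\alpha_{S}^{(y)} \\ \alpha_{F}^{(y)} \\ \alpha_{L}^{(y)} \\ \alpha_{B}^{(y)} \\
\alpha_{H}^{(y)} \\ \alpha_{NL}^{(y)} \\ \alpha_{I}^{(y)}
\end{pmatrix}
+
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\lambda_{41}^{(y)} & 0 & 0 \\
0 & \lambda_{52}^{(y)} & 0 \\
0 & 0 & \lambda_{63}^{(y)} \\
0 & 0 & \lambda_{73}^{(y)}
\end{pmatrix}
\begin{pmatrix} LQ_t \\ C_t \\ Y_t \end{pmatrix}
+
\begin{pmatrix}
\varepsilon_{St} \\ \varepsilon_{Ft} \\ \varepsilon_{Lt} \\ \varepsilon_{Bt} \\
\varepsilon_{Ht} \\ \varepsilon_{NLt} \\ \varepsilon_{It}
\end{pmatrix}
\]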

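We can re-write the model in the specific OFS form. The OFS form for the
structural model is thus given by:

\[
\begin{pmatrix} S_t \\ F_t \\ L_t \end{pmatrix}
=
\begin{pmatrix} \alpha_{S} \\ \alpha_{F} \\ \alpha_{L} \end{pmatrix}
+
\begin{pmatrix}
0 & 0 & \beta_{13}^{(0)} \\
\beta_{21}^{(0)} & 0 & \beta_{23}^{(0)} \\
0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} S_t \\ F_t \\ L_t \end{pmatrix}
+
\begin{pmatrix}
\beta_{11}^{(1)} & \beta_{12}^{(1)} & \beta_{13}^{(1)} \\
0 & \beta_{22}^{(1)} & \beta_{23}^{(1)} \\
0 & 0 & \beta_{33}^{(1)}
\end{pmatrix}
\begin{pmatrix} S_{t-1} \\ F_{t-1} \\ L_{t-1} \end{pmatrix}
+
\begin{pmatrix} 0 \\ 0 \\ \gamma_{31} \end{pmatrix} (ED_t)
+
\begin{pmatrix} u_{11t} \\ u_{12t} \\ u_{13t} \end{pmatrix}
\]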

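and the OFS for the measurement model is given by:

\[
\begin{pmatrix} B_t \\ H_t \\ NL_t \\ I_t \end{pmatrix}
=
\begin{pmatrix}
\alpha_{B}^{(y)} \\ \alpha_{H}^{(y)} \\ \alpha_{NL}^{(y)} \\ \alpha_{I}^{(y)}
\end{pmatrix}
+
\begin{pmatrix}
\lambda_{41}^{(y)} & 0 & 0 \\
0 & \lambda_{52}^{(y)} & 0 \\
0 & 0 & \lambda_{63}^{(y)} \\
0 & 0 & \lambda_{73}^{(y)}
\end{pmatrix}
\begin{pmatrix} S_t \\ F_t \\ L_t \end{pmatrix}
+
\begin{pmatrix} u_{21t} \\ u_{22t} \\ u_{23t} \\ u_{24t} \end{pmatrix}
\]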
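To see where a composite OFS error comes from, the income equation can be worked through by hand. The following is a sketch under our own reading of the model: it assumes that education is observed without error (E_t = ED_t) and uses the unit-loading measurement equation L_t = α_L^{(y)} + Y_t + ε_{Lt}. The explicit expressions for the OFS intercept α_L and for u_{13t} are derived here for illustration and are not quoted from the paper.

\begin{align*}
Y_t &= L_t - \alpha_{L}^{(y)} - \varepsilon_{Lt}, \\
L_t - \alpha_{L}^{(y)} - \varepsilon_{Lt}
  &= \alpha_{Y} + \beta_{33}^{(1)} \bigl( L_{t-1} - \alpha_{L}^{(y)} - \varepsilon_{L,t-1} \bigr)
     + \gamma_{31} ED_t + \zeta_{Y}, \\
L_t &= \underbrace{\alpha_{Y} + \bigl( 1 - \beta_{33}^{(1)} \bigr) \alpha_{L}^{(y)}}_{\alpha_{L}}
     + \beta_{33}^{(1)} L_{t-1} + \gamma_{31} ED_t
     + \underbrace{\zeta_{Y} + \varepsilon_{Lt} - \beta_{33}^{(1)} \varepsilon_{L,t-1}}_{u_{13t}} .
\end{align*}

Because u_{13t} contains ε_{L,t−1}, the regressor L_{t−1} is correlated with the error term, so least squares would be inconsistent and instruments are needed. After first-differencing, the error Δu_{13t} additionally involves ε_{L,t−2}.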

We estimate the model using the GIVE technique. Table 2 shows the IV-validity
test results (Sargan, 1964) for individual equations estimated by GIVE methods
using different Z_j matrices. Due to the panel nature of the BHPS data, it was
necessary to estimate differenced equations using appropriately constructed IV
matrices (see Arellano and Bond, 1991). While differences as well as lags can be used
as instruments (given they are selected from the set of eligible instruments for each
equation), we used lagged variables only (see Arellano, 1989 for more details on
problems caused by the use of differences for instruments in simple error-component
models).

Table 2: Validity of instruments tests.

Equation for   Instruments                                                                      χ2      d.f.
ΔS_t           B_{t−3}, H_{t−3}, ED_t                                                           0.687   2
ΔF_t           B_{t−3}, H_{t−3}, ED_t                                                           0.253   1
ΔL_t           B_{t−3}, H_{t−3}, F_{t−3}                                                        1.253   3
ΔB_t           NL_{t−2}, I_{t−2}, ED_t                                                          9.167   2
ΔH_t           B_{t−2}, H_{t−2}, NL_{t−2}, I_{t−2}, S_{t−2}, F_{t−2}, L_{t−2}, ED_t             2.771   7
ΔNL_t          B_{t−2}, H_{t−2}, ED_t                                                           1.379   2
ΔI_t           B_{t−2}, H_{t−2}, ED_t                                                           2.007   2

The first column of Table 2 shows the endogenous variable for which each OFS equation
was estimated. Note that the constant term cannot be estimated in the differenced
model, and since intercepts have no substantive importance here we do not attempt
to recover them.

The second column shows which instruments were used for estimation. The
selection was based on minimisation of Sargan's validity-of-instruments test statistic.
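The mechanics of estimating a differenced equation with lagged levels as instruments, and of computing a Sargan-type statistic, can be illustrated on simulated data. The sketch below is not the paper's DSEM or its GIVE implementation: it assumes a simple AR(1) panel with individual effects, and the function name `give`, the sample sizes and the parameter values are hypothetical. The statistic is computed as u'P_Z u / (u'u/n), which is asymptotically χ2 with (number of instruments minus number of regressors) degrees of freedom under homoskedastic, serially uncorrelated errors; with differenced errors that reference distribution is only approximate.

```python
import numpy as np


def give(y, X, Z):
    """Generalised IV (2SLS) estimate with a Sargan-type statistic."""
    n = len(y)
    # Fitted regressors from the first stage: X_hat = Z (Z'Z)^{-1} Z'X
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    # IV estimate: (X_hat'X)^{-1} X_hat'y
    beta = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    u = y - X @ beta
    # Projection of the IV residuals onto the instrument space
    u_proj = Z @ np.linalg.solve(Z.T @ Z, Z.T @ u)
    sargan = (u @ u_proj) / (u @ u / n)
    df = Z.shape[1] - X.shape[1]
    return beta, sargan, df


# Simulated panel (assumed design): y_it = rho * y_i,t-1 + eta_i + e_it
rng = np.random.default_rng(0)
N, T, burn_in, rho = 2000, 10, 20, 0.5
eta = rng.normal(size=N)
y_full = np.zeros((N, T + burn_in))
for t in range(1, T + burn_in):
    y_full[:, t] = rho * y_full[:, t - 1] + eta + rng.normal(size=N)
y = y_full[:, burn_in:]  # keep the last T periods

# First differences remove eta_i (and any constant term).
dy = (y[:, 3:] - y[:, 2:-1]).ravel()         # y_t     - y_{t-1}
dy_lag = (y[:, 2:-1] - y[:, 1:-2]).ravel()   # y_{t-1} - y_{t-2}

# Instruments: lagged levels y_{t-2} and y_{t-3}.
Z = np.column_stack([y[:, 1:-2].ravel(), y[:, :-3].ravel()])
X = dy_lag[:, None]

beta, sargan, df = give(dy, X, Z)
print(beta[0], sargan, df)
```

Comparing the statistic across candidate instrument sets for the same equation mirrors, in a simplified way, the selection rule described above.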

We report the coefficient estimates in Table 3. Looking at the individual coefficient
estimates, it is possible to conclude that most coefficients are well determined,
with small standard errors. The attempt to model the degree of liquidity constraints
and its influence on the relationship between consumption and income provided little
new insight into this well-researched topic. Namely, the effort to construct and model
a liquidity constraints variable that includes the cumulative credit repayment problem
measure, though conceptually promising, gave poor statistical results: the
coefficient of B_t turned out to be insignificant, so effectively all that we have in the
liquidity constraints measurement model is personal savings, which, however, has a
small (though significant) negative effect on consumption (higher savings, in turn,
result in lower consumption). A significant negative effect of the LQ_{t−1} variable
suggests a cyclical saving pattern, i.e., saving is lower in the current period when it
was unusually high in the previous period and vice versa. The γ31 coefficient, which
multiplies education in the income equation, can be interpreted as the increase in
income for each additional year of education.

Table 3: Coefficient estimates.

Coefficient   Estimate    Standard error
β13^(0)         0.0914      0.0267
β11^(1)         0.3420      0.0059
β12^(1)         0.0254      0.0287
β13^(1)         0.0178      0.0047
β21^(0)         0.0719      0.4430
β23^(0)         0.1553      0.0921
β22^(1)         0.3171      0.0908
β23^(1)         0.0239      0.0121
β33^(1)         0.1690      0.0006
γ31           118.1200     12.3740
λ41^(y)         0.0002      0.0000
λ52^(y)         0.1933      0.2763
λ63^(y)         0.3368      0.0699
λ73^(y)         0.0349      0.0261
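One way to read Table 3 is to form the ratio of each estimate to its standard error. The short sketch below does this for the values as printed in the table. It is our own illustration rather than part of the paper's analysis: the 1.96 cut-off assumes an asymptotically normal estimator and a two-sided 5% test, the dictionary keys are hypothetical labels for the coefficients, and a standard error printed as 0.0000 is treated as rounded, so no ratio is computed for it.

```python
# Estimates and standard errors as printed in Table 3 (signs as printed).
table3 = {
    "beta13(0)": (0.0914, 0.0267),
    "beta11(1)": (0.3420, 0.0059),
    "beta12(1)": (0.0254, 0.0287),
    "beta13(1)": (0.0178, 0.0047),
    "beta21(0)": (0.0719, 0.4430),
    "beta23(0)": (0.1553, 0.0921),
    "beta22(1)": (0.3171, 0.0908),
    "beta23(1)": (0.0239, 0.0121),
    "beta33(1)": (0.1690, 0.0006),
    "gamma31": (118.1200, 12.3740),
    "lambda41(y)": (0.0002, 0.0000),
    "lambda52(y)": (0.1933, 0.2763),
    "lambda63(y)": (0.3368, 0.0699),
    "lambda73(y)": (0.0349, 0.0261),
}

CRITICAL = 1.96  # two-sided 5% normal critical value (asymptotic assumption)

t_ratios = {}
for name, (estimate, std_err) in table3.items():
    # A standard error printed as 0.0000 is rounded; leave the ratio undefined.
    t_ratios[name] = abs(estimate) / std_err if std_err > 0 else None

# Coefficients whose absolute ratio falls below the critical value.
weak = sorted(
    name for name, t in t_ratios.items() if t is not None and t < CRITICAL
)

for name, t in t_ratios.items():
    label = "n/a" if t is None else f"{t:.2f}"
    print(f"{name:12s} |t| = {label}")
print("below 1.96:", weak)
```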

7 Conclusion
In this paper we considered a time series generalisation of the structural equation
model with latent variables and proposed an asymptotically distribution-free
approach to its estimation. We described a limited-information instrumental variable
estimator and analysed the suitability of lagged observable indicators as instruments.
We showed that such lagged variables can be used for consistent estimation under
some general assumptions regarding the stochastic properties of the observed variables.
The main limitation of this research is the stationarity requirement; hence further
extensions should consider non-stationary, possibly cointegrated, variables. Another
direction for further research would be to consider full-information estimation of
the OFS equations, such as maximum likelihood and 3SLS methods. In addition,
diagnostic and fit statistics, including specification and misspecification tests, should
be developed, and the small-sample performance of the available estimators should be
studied further.

References
[1] Arellano, M. (1989): A note on the Anderson-Hsiao estimator for panel data.
Economics Letters, 31, 337–341.

[2] Arellano, M. and Bond, S. (1991): Some tests of specification for panel data:
Monte Carlo evidence and an application to employment equations. Review of
Economic Studies, 58, 277–297.
[3] Bai, J. and Ng, S. (2002): Determining the number of factors in approximate
factor models. Econometrica, 70, 191–221.
[4] Bai, J. (2003): Inferential theory for factor models of large dimensions, Econo-
metrica, 71, 135–171.
[5] Bollen, K.A. (1996): An alternative two stage least squares (2SLS) estimator
for latent variable equations, Psychometrika, 61, 109–121.
[6] Bollen, K.A. (2001): Two-stage least squares and latent variable models: si-
multaneous estimation and robustness to misspecification, In R. Cudeck, S.
Du Toit, and D. Sörbom (Eds.): Structural Equation Modeling: Present and
Future. Chicago: Scientific Software International, 119–138.
[7] Bollen, K.A. (2002): A Note on a Two-Stage Least Squares Estimator for
Higher-Order Factor Analysis, Sociological Methods and Research, 30, 568–579.
[8] Connor, G. and Korajczyk, R. (1993): A test for the number of factors in an
approximate factor model, Journal of Finance, 48, 1263–1291.
[9] Cragg, J. and Donald, S. (1997): Inferring the rank of a matrix. Journal of
Econometrics, 76, 223–250.
[10] Cziráky D. (2002): Estimation of a general structural equation latent vari-
able autoregressive distributed lag model with an application to UK micro-
consumption function, University of Essex, unpublished MA thesis (download-
able from www.policy.hu/cziraky/papers.htm).
[11] Cziráky D. (2003): Estimation of a dynamic structural equation
model with latent variables. International Conference on Methodology
and Statistics; September 14–17, Ljubljana, Slovenia (downloadable from
www.policy.hu/cziraky/papers.htm).
[12] Cziráky, D. (2004): LISREL 8.54: A programme for structural equation mod-
elling with latent variables, Journal of Applied Econometrics, 19, 135–141.
[13] Cziráky, D., Tišma, S., and Pisarović, A. (2003): Determinants of the low SME
loan approval rate in Croatia: A latent variable structural equation approach,
Small Business Economics Journal, forthcoming.
[14] Davidson, R. and MacKinnon, J. (1993): Estimation and Inference in Econo-
metrics. Oxford: Oxford University Press.

[15] Dhrymes, P.J., Friend, I., and Gultekin, N.B. (1984): A critical reexamination
of the empirical evidence on the arbitrage pricing theory, Journal of Finance,
39, 323–346.

[16] Donald, S. (1997): Inference concerning the number of factors in a multivariate
nonparametric relationship. Econometrica, 65, 103–132.

[17] Forni, M., Hallin, M., Lippi, M., and Reichlin, L. (2000): The generalized
dynamic factor model: Identification and estimation. Review of Economics and
Statistics, 82, 540–554.

[18] Forni, M. and Reichlin, L. (1998): Let’s get real: A factor-analytic approach
to disaggregated business cycle dynamics. Review of Economic Studies, 65,
453–473.

[19] Geweke, J. (1977): The dynamic factor analysis of economic time series. In
D.J. Aigner and A.S. Goldberger (Eds.): Latent Variables in Socio Economic
Models. Amsterdam: North Holland.

[20] Jöreskog, K.G. (1973): A general method for estimating a linear structural
equation system. In A.S. Goldberger and O.D. Duncan (Eds.): Structural Equa-
tion Models in the Social Sciences. Chicago: Academic Press, 85–112.

[21] Judge, G.G., Griffiths, W.E., Hill, R.C., Lütkepohl, H., and Lee, T.C. (1985):
The Theory and Practice of Econometrics. 2nd ed. New York: John Wiley.

[22] Lewbel, A. (1991): The rank of demand systems: Theory and nonparametric
estimation, Econometrica, 59, 711–730.

[23] Mallows, C.L. (1973): Some comments on C_p, Technometrics, 15, 661–675.

[24] Sargent, T. and Sims, C. (1977): Business cycle modelling without pretending
to have too much a priori economic theory. In C. Sims (Ed.): New Methods in
Business Cycle Research. Minneapolis: Federal Reserve Bank of Minneapolis.

[25] Stock, J.H. and Watson, M. (1989): New indexes of coincident and leading
economic indicators. In O.J. Blanchard and S. Fischer (Eds.): NBER
Macroeconomics Annual 1989. Cambridge, MA.: M.I.T. Press.

[26] Stock, J.H. and Watson, M. (1998): Diffusion indexes. NBER Working Paper
6702.

[27] Stock, J.H. and Watson, M. (1999): Forecasting inflation. Journal of Monetary
Economics, 44, 293–335.

[28] Taylor, M.F. (Ed.) with Brice, J., Buck, N. and Prentice-Lane, E. (2001): British
Household Panel Survey User Manual, Volume A: Introduction, Technical
Report and Appendices. Colchester, University of Essex.
