Econometric Toolkit For Studying Dynamic Models in Economics and Finance

This document discusses linear time series models, including deterministic and stochastic trends. 1) Deterministic trends follow a polynomial pattern, while in stochastic trends such as a random walk, shocks have permanent, non-decaying effects on all future values. 2) A random walk has a variance that increases with time, so it is non-stationary; its first difference is stationary. 3) The autocorrelation of a random walk decays slowly, which makes a unit-root process difficult to distinguish from a process whose autoregressive coefficient is close to one. Graphically, random walks do not return to a long-run mean value.


Working Paper No. 40

Econometric Toolkit for Studying Dynamic Models in Economics and Finance

by

Peter Wöhrmann and Willi Semmler

University of Bielefeld
Department of Economics
Center for Empirical Macroeconomics
P.O. Box 100 131
33501 Bielefeld, Germany
Econometric Toolkit for Studying Dynamic Models
in Economics and Finance
Peter Wöhrmann and Willi Semmler
January 2001

Abstract
The aim of this toolkit is to give students a practical introduction to linear
econometric models as well as nonlinear econometric techniques that are widely
applied in economics and finance. The particular purpose is to summarize tools
and to collect computer programs that are useful in estimating dynamic relation-
ships in economics and finance. The following topics are covered: (1) linear and
nonlinear time series models and (2) estimation of intertemporal models.
Emphasizing the practical character of this toolkit, the computer programs
of the estimated models are written in Gauss and available upon request.

Contents
1 Basic Definitions
2 Deterministic and Stochastic Trends
3 Trends and "the Linear Model"
4 Co-Integration and Error Correction Models
5 Estimation of Continuous Time Models
6 Threshold Regression Models
7 ARCH/GARCH-Models
8 Neural Networks
9 Estimating a Stochastic Dynamic Using MLE
10 Estimating an Endogenous Growth Model Using GMM
11 Estimating Transversality Conditions of Intertemporal Models: Bubble Tests
References

Part I: Introduction

1 Basic Definitions
1.1 Random Variables, Distributions
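Given a probability space $(\Omega, \mathcal{A}, P)$, a function $X : \Omega \to \mathbb{R}$ is called a continuous
random variable if it is measurable, i.e.

\[ \{\omega \in \Omega : X(\omega) \le a\} \in \mathcal{A} \quad \forall a \in \mathbb{R}. \]

Then the distribution and density functions of $X$ are defined by

\[ F_X : \mathbb{R} \to \mathbb{R}, \quad F_X(x) = P_X((-\infty, x)) = P(\{\omega \in \Omega : X(\omega) < x\}) \quad \forall x \in \mathbb{R} \]

and

\[ p_X : \mathbb{R} \to \mathbb{R}, \quad p_X(x) = F_X'(x) \quad \forall x \in \mathbb{R}, \]

respectively.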
Appropriate measures to characterize a distribution are its moments:

• mean (expected value):

\[ \mu_X = E(X) = \int_{-\infty}^{\infty} x\, p_X(x)\, dx, \]

• variance:

\[ \sigma_X^2 = Var(X) = E((X - \mu_X)^2) = E(X^2) - \mu_X^2 \ge 0, \]

• skewness (third central moment):

\[ E(X - \mu_X)^3, \]

and

• kurtosis (fourth central moment):

\[ E(X - \mu_X)^4. \]

Two random variables $X_1$ and $X_2$ are called independent if their joint and marginal
distribution functions satisfy

\[ F_{X_1, X_2}(x_1, x_2) = F_{X_1}(x_1) F_{X_2}(x_2) \quad \forall x_1, x_2 \in \mathbb{R}. \]

Then the following holds:

\[ E(X_1 X_2) = E(X_1) E(X_2), \]

i.e. they are uncorrelated, and

\[ Var(X_1 + X_2) = Var(X_1) + Var(X_2). \]

Measurable functions of random variables, $Y = f(X)$, are again random variables. In particular, the
sum of two independent Gaussian random variables also follows a Gaussian distribution.

1.2 Convergence of Random Sequences


Often one is interested in the asymptotic behaviour of a sequence $X_1, X_2, \dots, X_n, \dots$ of
random variables, i.e. the existence of a random variable $X$ to which $X_n$ converges
as $n \to \infty$.
Some useful convergence concepts are:

• Convergence with probability one (w.p.1):

\[ P(\{\omega \in \Omega : \lim_{n \to \infty} |X_n(\omega) - X(\omega)| = 0\}) = 1 \]

• Mean-square convergence:

\[ E(X_n^2) < \infty \text{ for } n = 1, 2, \dots, \quad E(X^2) < \infty \quad \text{and} \quad \lim_{n \to \infty} E(|X_n - X|^2) = 0 \]

• Convergence in distribution:

\[ \lim_{n \to \infty} F_{X_n}(x) = F_X(x) \]

at all continuity points of $F_X$.

Law of Large Numbers


Consider the average of a sequence of independent identically distributed (i.i.d.) random
variables $X_1, X_2, \dots$,

\[ A_n = \frac{1}{n} S_n = \frac{1}{n}(X_1 + X_2 + \dots + X_n). \]

For $\mu = E(X_n)$ and $\sigma^2 = Var(X_n)$, $E(A_n) = \mu$ and $Var(A_n) = \sigma^2 / n$ follow from
independence. Since $E(|A_n - \mu|^2) = \sigma^2 / n \to 0$, the Law of Large Numbers holds in the mean-square sense:

\[ A_n \to \mu \quad \text{as } n \to \infty. \]
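The shrinking variance of the sample average can be checked with a small simulation. The following is only an illustrative sketch in Python (the toolkit's own programs are written in Gauss); the function names, the Gaussian distribution and the default parameter values are assumptions made for this example.

```python
import random
import statistics


def sample_mean(n, mu=1.0, sigma=2.0, rng=None):
    """Average A_n of n i.i.d. N(mu, sigma^2) draws (Gaussian draws are an assumption)."""
    if rng is None:
        rng = random.Random(0)
    return statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])


def variance_of_mean(n, reps=2000, mu=1.0, sigma=2.0, seed=0):
    """Monte Carlo estimate of Var(A_n); theory predicts sigma^2 / n."""
    rng = random.Random(seed)
    means = [sample_mean(n, mu, sigma, rng) for _ in range(reps)]
    return statistics.pvariance(means)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The simulated variance should be close to sigma^2 / n = 4 / n.
        print(n, variance_of_mean(n, reps=500), 4.0 / n)
```

With the defaults above, the simulated variance of $A_n$ falls roughly in proportion to $1/n$, which is the mechanism behind the mean-square convergence.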
1.3 Stochastic Processes and Time Series
A family of random variables $X = \{X(t), t \in T\}$ is called a stochastic process if it
describes the evolution of a stochastic system over time, i.e.

\[ X : T \times \Omega \to \mathbb{R} \]

with $T = [0, \infty)$ (continuous time) or $T = \{t_1, t_2, \dots, t_n\}$ (discrete time). Then
$X(\cdot, \omega) : T \to \mathbb{R}$ is a trajectory of the stochastic process. The process is called (weakly) stationary
if

\[ E(X_t) = \mu_X < \infty \quad \text{and} \quad E[(X_t - \mu_X)(X_\tau - \mu_X)] = \sigma_{t\tau} < \infty, \]

where the covariance $\sigma_{t\tau}$ depends only on the distance $|t - \tau|$.

The outcome of a stochastic process is called a time series. The following models of
stochastic processes are commonly used in economics and finance.

Markov Chains (temporal dependence)


A stochastic process $X$ is called a Markov chain if it fulfills the condition

\[ P(X_{n+1} = x_j \mid X_n = x_{i_n}) = P(X_{n+1} = x_j \mid X_1 = x_{i_1}, \dots, X_n = x_{i_n}) \]

for all $x_j, x_{i_1}, \dots, x_{i_n}$ in a given state space $\mathcal{X}$ and all $n = 1, 2, \dots$

Wiener Processes (independence)


A standard Wiener process $W = \{W(t), t \ge 0\}$ is defined to be a continuous
Gaussian process with independent increments and $W(0) = 0$ w.p.1, $E(W(t)) = 0$
and $Var(W(t) - W(s)) = t - s$ for all $0 \le s \le t$. Then $W(t)$ is normally distributed with mean 0 and variance $t$, and $W$ can be
approximated in distribution on any finite time interval by means of a suitably scaled random walk.

Part II: Linear Time Series Models

2 Deterministic and Stochastic Trends


Trends, unlike the cyclical and irregular components, have a permanent effect on a
series. The most familiar trend model is the deterministic (polynomial) trend model

\[ y_t = a_0 + a_1 t + a_2 t^2 + \dots + a_n t^n + \varepsilon_t \tag{1} \]

as opposed to a stochastic trend model such as the random walk model

\[ y_t = y_{t-1} + \varepsilon_t. \tag{2} \]

The stochastic trend was introduced by Nelson and Plosser (1982). The random
walk model is just a special case of an AR(1),

\[ y_t = a_0 + a_1 y_{t-1} + \varepsilon_t \tag{3} \]

with $a_0 = 0$ and $a_1 = 1$.
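The general solution to the random walk model is

\[ y_t = y_0 + \sum_{i=1}^{t} \varepsilon_i \tag{4} \]

which shows that past shocks have a permanent, non-decaying effect on the
series. In other words,

\[ \frac{\partial y_t}{\partial \varepsilon_t} = \frac{\partial y_t}{\partial \varepsilon_{t-1}} = \frac{\partial y_t}{\partial \varepsilon_{t-2}} = \dots = \frac{\partial y_t}{\partial \varepsilon_1} = 1. \tag{5} \]

The mean of the series is constant,

\[ E[y_t] = y_0. \tag{6} \]

However, the variance is time dependent:

\[ var(y_t) = var(\varepsilon_t + \varepsilon_{t-1} + \dots + \varepsilon_1) = t\sigma^2, \tag{7} \]

which indicates that the random walk process is non-stationary. The covariance of $y_t$
and $y_{t-s}$ is also time dependent. Because the shocks are mutually uncorrelated, all cross terms vanish:

\[
\begin{aligned}
E[(y_t - y_0)(y_{t-s} - y_0)] &= E[(\varepsilon_t + \varepsilon_{t-1} + \dots + \varepsilon_1)(\varepsilon_{t-s} + \varepsilon_{t-s-1} + \dots + \varepsilon_1)] \\
&= E[\varepsilon_{t-s}^2 + \varepsilon_{t-s-1}^2 + \dots + \varepsilon_1^2] \\
&= (t - s)\sigma^2
\end{aligned}
\]
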
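The linear growth of the variance in (7) can be illustrated by simulating many random walk paths and computing the cross-sectional variance of $y_t$ at a fixed date. This is a minimal Python sketch, not the authors' Gauss code; the function names and the Gaussian shocks are assumptions for the example.

```python
import random
import statistics


def random_walk(t, sigma=1.0, y0=0.0, rng=None):
    """Return the path [y_0, ..., y_t] of y_t = y_{t-1} + eps_t with N(0, sigma^2) shocks."""
    if rng is None:
        rng = random.Random()
    path = [y0]
    for _ in range(t):
        path.append(path[-1] + rng.gauss(0.0, sigma))
    return path


def cross_section_variance(t, reps=2000, sigma=1.0, seed=0):
    """Variance of y_t across independent simulated paths; theory gives t * sigma^2."""
    rng = random.Random(seed)
    finals = [random_walk(t, sigma, rng=rng)[-1] for _ in range(reps)]
    return statistics.pvariance(finals)


if __name__ == "__main__":
    for t in (10, 40, 160):
        # The ratio variance / t should stay near sigma^2 = 1.
        print(t, cross_section_variance(t) / t)
```

The ratio of the simulated variance to $t$ stays close to $\sigma^2$, while the variance of the first difference is constant, in line with the distinction drawn in the text.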
One should notice that the first difference of the series is stationary:

\[ \Delta y_t = \varepsilon_t. \tag{8} \]

The correlation coefficient $\rho_s$ between $y_t$ and $y_{t-s}$ is:

\[ \rho_s = \frac{(t - s)\sigma^2}{\sqrt{t\sigma^2}\sqrt{(t - s)\sigma^2}} \tag{9} \]

\[ = \frac{(t - s)}{\sqrt{t(t - s)}} \tag{10} \]

\[ = \left(\frac{t - s}{t}\right)^{1/2} \tag{11} \]

Hence, the autocorrelation function (ACF) of a random walk process shows only a
slight tendency to decay. This also implies that the ACF cannot distinguish between a
unit root process ($a_1 = 1$) and processes with $a_1$ close to unity.
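To see how slowly the correlation in (11) decays, it can be compared with the autocorrelation $a_1^s$ of a stationary AR(1) process. The short Python sketch below only evaluates these two formulas; the function names and the example values $t = 100$ and $a_1 = 0.95$ are chosen for illustration.

```python
import math


def rw_autocorr(t, s):
    """Correlation between y_t and y_{t-s} for a random walk, eq. (11)."""
    if not 0 <= s < t:
        raise ValueError("need 0 <= s < t")
    return math.sqrt((t - s) / t)


def ar1_autocorr(a1, s):
    """Autocorrelation at lag s of a stationary AR(1) with |a1| < 1."""
    return a1 ** s


if __name__ == "__main__":
    for s in (1, 5, 10, 20):
        # Random walk observed at t = 100 versus an AR(1) with a1 = 0.95.
        print(s, round(rw_autocorr(100, s), 3), round(ar1_autocorr(0.95, s), 3))
```

At short lags both sequences are close to one, which is why the ACF alone is a weak tool for telling the two processes apart in a finite sample.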
Graphically, random walk processes are distinguished by the fact that they do not
tend to return to a long-run (mean) value. The conditional mean of $y_{t+1}$ is

\[ E_t y_{t+1} = E_t(y_t + \varepsilon_{t+1}) = y_t. \tag{12} \]

Similarly, the conditional mean of $y_{t+s}$ can be obtained from

\[ y_{t+s} = y_t + \sum_{i=1}^{s} \varepsilon_{t+i} \tag{13} \]

so that

\[ E_t y_{t+s} = y_t + E_t\Big(\sum_{i=1}^{s} \varepsilon_{t+i}\Big) \tag{14} \]

\[ = y_t. \tag{15} \]

Hence, the current value $y_t$ is an unbiased predictor of all future values $y_{t+s}$,
$s > 0$.

2.1 Random Walk plus Drift Model


The random walk plus drift model is given by

\[ y_t = y_{t-1} + a_0 + \varepsilon_t \tag{16} \]

whose solution is

\[ y_t = y_0 + a_0 t + \sum_{i=1}^{t} \varepsilon_i \tag{17} \]

with a deterministic trend $a_0 t$ and a stochastic trend $\sum_{i=1}^{t} \varepsilon_i$. Again, notice that
the first difference of the series yields a stationary series:

\[ \Delta y_t = a_0 + \varepsilon_t. \tag{18} \]

The forecast function of the random walk with drift model differs from that of the pure
random walk model in that it is not flat but has a trend:

\[ y_{t+s} = y_0 + a_0(t + s) + \sum_{i=1}^{t+s} \varepsilon_i \tag{19} \]

\[ = y_t + a_0 s + \sum_{i=1}^{s} \varepsilon_{t+i} \tag{20} \]

and

\[ E_t y_{t+s} = y_t + a_0 s. \tag{21} \]

A more general model is obtained by augmenting the random walk plus drift model with
a stationary noise process, $A(L)\eta_t$, which is called the general trend plus irregular
model:

\[ y_t = \mu_0 + a_0 t + \sum_{i=1}^{t} \varepsilon_i + A(L)\eta_t \tag{22} \]

2.2 Removing the Trend


The appropriate way of eliminating the trend depends on the form of the trend, i.e.,
deterministic or stochastic, or both. If the trend is deterministic, then a
detrending procedure is used in which the series is regressed on a function of time and the
residuals from that regression are regarded as the detrended series. However, if the
trend is stochastic, then the appropriate procedure is to difference the series. Note,
however, that differencing the time series will increase the volatility of the series.

2.2.1 Differencing
Take an ARIMA(p, d, q):

\[ A(L) y_t = B(L)\varepsilon_t \tag{23} \]

Suppose that $A(L)$ has a single unit root and $B(L)$ has all roots outside the unit
circle [i.e., the model is ARIMA(p, 1, q)]. Then we can write

\[ A(L) = (1 - L)A^*(L) \tag{24} \]

and

\[ (1 - L)A^*(L) y_t = B(L)\varepsilon_t \tag{25} \]

or

\[ A^*(L)\Delta y_t = B(L)\varepsilon_t. \tag{26} \]

The $\Delta y_t$ series is stationary since all roots of $A^*(L)$ lie outside the unit circle. In general,
the $d$th difference of a process with $d$ unit roots is stationary. An ARIMA(p, d, q)
model has $d$ unit roots; the $d$th difference of such a model is a stationary ARMA(p, q)
process. A series with $d$ unit roots is said to be integrated of order $d$, or simply I(d).

2.2.2 Detrending by Filtering Out the Time Trend


Nonstationary processes with deterministic trends, however, cannot be transformed
into well-behaved ARMA models by differencing. Consider

\[ y_t = a_0 + a_1 t + \varepsilon_t. \tag{27} \]

Differencing this process yields

\[ \Delta y_t = a_1 + \varepsilon_t - \varepsilon_{t-1} \tag{28} \]

which is stationary but noninvertible, because the moving average part has a unit root. The appropriate way is to regress $y_t$ on a
constant and a linear trend and retrieve the residual as the stationary component.
More generally, a time series may have a polynomial time trend

\[ y_t = a_0 + a_1 t + a_2 t^2 + \dots + a_n t^n + \varepsilon_t \]

in which case detrending is accomplished by regressing $y_t$ on a polynomial time
trend. The appropriate degree of the polynomial can be determined by standard
t-tests, F-tests and/or the AIC or SBC.
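The two transformations discussed in this section can be sketched for the linear trend case (27). The Python code below is an illustration only, with hypothetical function names; it fits the constant and the slope by OLS in closed form and returns the residuals as the detrended series, and it also provides the first difference for comparison.

```python
def ols_line(y):
    """OLS of y_t on a constant and t = 1, ..., T; returns (a0_hat, a1_hat)."""
    T = len(y)
    t_bar = (T + 1) / 2.0
    y_bar = sum(y) / T
    sxx = sum((t - t_bar) ** 2 for t in range(1, T + 1))
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y, start=1))
    a1 = sxy / sxx
    a0 = y_bar - a1 * t_bar
    return a0, a1


def detrend(y):
    """Residuals from the regression of y_t on a constant and a linear trend."""
    a0, a1 = ols_line(y)
    return [v - (a0 + a1 * t) for t, v in enumerate(y, start=1)]


def difference(y):
    """First difference: Delta y_t = y_t - y_{t-1}."""
    return [b - a for a, b in zip(y, y[1:])]


if __name__ == "__main__":
    y = [2.0 + 0.5 * t for t in range(1, 11)]  # noise-free linear trend
    print(ols_line(y))        # approximately (2.0, 0.5)
    print(difference(y)[:3])  # approximately [0.5, 0.5, 0.5]
```

For a noise-free trend the regression residuals are zero and the differenced series equals the slope $a_1$; with noise, differencing additionally introduces the moving average term $\varepsilon_t - \varepsilon_{t-1}$ of equation (28).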

2.2.3 Detrending by the Exponential Smoothing and HP-Filters


The HP-filter as proposed by Hodrick and Prescott (1980) separates the growth component
($y^g$ = trend component), which can represent (i) a deterministic or (ii) a stochastic
trend, from the cyclical component $y^c$. We then have

\[ y_t = y_t^g + y_t^c. \]

For a linear filter it holds that

\[ y_t^g = \sum_{j=-\infty}^{\infty} g_j y_{t-j}; \]

then $y_t^c = y_t - y_t^g$. The linear filter, like the HP-filter below, is a symmetric low
frequency filter. The exponential smoothing filter reads

\[ \min_{\{y_t^g\}_{t=1}^{T}} \sum_{t=1}^{T} \left[ (y_t - y_t^g)^2 + \lambda (y_t^g - y_{t-1}^g)^2 \right]. \]

The HP-filter is obtained by

\[ \min_{\{y_t^g\}_{t=1}^{T}} \sum_{t=1}^{T} \left[ (y_t - y_t^g)^2 + \lambda \left[ (y_{t+1}^g - y_t^g) - (y_t^g - y_{t-1}^g) \right]^2 \right] \]

with $\lambda$ as a penalty parameter for changes in the trend (acceleration of growth)
component. In practical applications with quarterly data one sets $\lambda = 1600$ for the HP-filter (see the Gauss
program Hpfilt.g).1 The HP-filter and, more recently, the Band Pass Filter are frequently
employed in macrodynamic modeling and in the RBC literature, see Stock and Watson
(1997). The Band Pass Filter is also available in Gauss.
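The HP minimization problem has a closed-form solution: the first-order conditions give the linear system $(I + \lambda D'D)\, y^g = y$, where $D$ is the second-difference matrix. The sketch below solves this system in plain Python. It is not the Hpfilt.g program referred to above; the function names are hypothetical, and the penalty is summed over the interior dates $t = 2, \dots, T-1$ only, which is an assumption about how the end points are treated.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            if f != 0.0:
                for j in range(k, n + 1):
                    M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def hp_filter(y, lam=1600.0):
    """Return (trend, cycle) from the system (I + lam * D'D) trend = y."""
    T = len(y)
    if T < 3:
        return list(y), [0.0] * T
    A = [[1.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
    w = (1.0, -2.0, 1.0)  # one row of the second-difference matrix D
    for r in range(T - 2):
        for a in range(3):
            for b in range(3):
                A[r + a][r + b] += lam * w[a] * w[b]
    trend = solve(A, list(y))
    cycle = [v - g for v, g in zip(y, trend)]
    return trend, cycle


if __name__ == "__main__":
    series = [0.1 * t + (-1) ** t for t in range(40)]
    trend, cycle = hp_filter(series)
    print(trend[:3], cycle[:3])
```

A series that is exactly linear in $t$ has zero second differences, so the filter returns it unchanged as the trend. The dense solver is chosen for readability; for long series a banded solver would be the more economical choice.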

2.2.4 Detrending by Other Smoothing Techniques


Detrending might also be undertaken by removing piecewise linear (segmented) trends
or trends generated by moving averages, see the computer program by Franke and
Semmler (1993) which is used in Franke and Semmler (1994).

2.2.5 Difference Versus Trend Stationary Models


A difference stationary (DS) model can be transformed into a stationary model by
differencing, and a trend stationary (TS) model can be transformed into a stationary
model by removing the deterministic trend. Just as it is not appropriate to difference a
TS series, it is also inappropriate to detrend a DS series.
1 See also the Band Pass Filter by Baxter and King (1995).

2.3 Testing for Trends and Unit-Roots
Since standard econometric theory depends on the stationarity assumption of the series,
and since the appropriate way of transforming a non-stationary series into a stationary
one depends on the form of the trend, we need a statistical device, i.e. a test procedure,
to determine the nature of the trend.

2.3.1 Unit Root Processes


Consider

\[ y_t = a_1 y_{t-1} + \varepsilon_t \tag{29} \]

and suppose we want to test

\[ H_0 : a_1 = 1. \tag{30} \]

Under the null hypothesis $y_t$ is a random walk (a unit root process)

\[ y_t = y_{t-1} + \varepsilon_t. \tag{31} \]

It has been shown that if one applies OLS to equation (29) to estimate $a_1$ when
the true $a_1 = 1$, one tends to obtain an estimate below unity. In other words, the OLS
estimate of $a_1$ is downward biased when the true value equals unity. Furthermore, under the
null hypothesis the ratio $(\hat{a}_1 - 1)/se(\hat{a}_1)$ does not have the usual t-distribution, so
we cannot use the critical values from the t-table to test the hypothesis that $a_1 = 1$.

Unit Roots in a Regression Model

Consider

\[ y_t = a_0 + a_1 z_t + \varepsilon_t. \tag{32} \]

The assumptions of the classical regression model require that both the $y_t$ and $z_t$
series are stationary and that the errors have a zero mean and finite variance. However,
if the series have unit roots (i.e. are non-stationary) we might have what is called a
spurious regression. In such a case the regression has a high $R^2$ and high t-statistics,
which are misleading because the customary statistical tests do not apply. The problem
arises because the error process in a spurious regression is normally not stationary, i.e. it
contains a stochastic trend.
There are four cases with different consequences:

1. Both $y_t$ and $z_t$ are stationary ⇒ the classical regression model is appropriate.

2. $y_t$ and $z_t$ are integrated of different orders ⇒ meaningless regression.

3. $y_t$ and $z_t$ are integrated of the same order but the error term contains a stochastic
trend ⇒ spurious regression. If they are both I(1), a first difference formulation is
appropriate:

\[ \Delta y_t = a_1 \Delta z_t + \Delta\varepsilon_t \tag{33} \]

4. $y_t$ and $z_t$ are integrated of the same order but the error term is stationary ⇒ the
variables are co-integrated.

2.3.2 Dickey-Fuller Tests


An equivalent way of testing $H_0 : a_1 = 1$ in

\[ y_t = a_1 y_{t-1} + \varepsilon_t \tag{34} \]

is to test $H_0 : \gamma = 0$ in

\[ \Delta y_t = \gamma y_{t-1} + \varepsilon_t \tag{35} \]

where $\gamma = a_1 - 1$. To test this hypothesis one estimates $\gamma$ by applying OLS to the
above equation and forms the t-statistic. The critical values for the test are provided
by Dickey and Fuller and are now part of most econometric software (including
EViews). Those critical values, however, depend on whether an intercept and/or time
trend is included in the regression equation. In other words, there are three basic forms
of regressions through which one can test for unit roots:

\[ \Delta y_t = \gamma y_{t-1} + \varepsilon_t \tag{36} \]

\[ \Delta y_t = a_0 + \gamma y_{t-1} + \varepsilon_t \tag{37} \]

\[ \Delta y_t = a_0 + a_2 t + \gamma y_{t-1} + \varepsilon_t \tag{38} \]

One drawback of the Dickey-Fuller test is that it assumes that the errors are independent
and have a constant variance. If the stochastic process generating the $y_t$
series contains an ARMA component in addition to the stochastic and/or deterministic
trend, the residuals in equations (36), (37) and (38) will suffer from serial correlation and
hence the Dickey-Fuller test will not be appropriate. In such a case one applies the Augmented
Dickey-Fuller test (ADF), which just adds an AR component to the above regressions
and uses the same critical values:
\[ \Delta y_t = \gamma y_{t-1} + \sum_{i=1}^{p} b_i \Delta y_{t-i} + \varepsilon_t \tag{39} \]

\[ \Delta y_t = a_0 + \gamma y_{t-1} + \sum_{i=1}^{p} b_i \Delta y_{t-i} + \varepsilon_t \tag{40} \]

\[ \Delta y_t = a_0 + a_2 t + \gamma y_{t-1} + \sum_{i=1}^{p} b_i \Delta y_{t-i} + \varepsilon_t \tag{41} \]

The above procedure assumes that the null hypothesis is that there is only one unit
root. If the existence of two unit roots is suspected, then the appropriate procedure is to
first test for a unit root in the first difference of the series and then, if this is rejected, to test
for the unit root in the level series.
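The simplest Dickey-Fuller regression, equation (36), can be sketched as follows. The Python code is an illustration with hypothetical function names, not the routine of any particular software package. It returns the OLS estimate of $\gamma$ and its t-ratio; the t-ratio must be compared with Dickey-Fuller critical values (roughly $-1.95$ at the 5% level for this no-intercept case in large samples), not with the usual t-table.

```python
import math
import random


def dickey_fuller_t(y):
    """OLS of Delta y_t on y_{t-1} without intercept, eq. (36).

    Returns (gamma_hat, t_ratio).
    """
    lag = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    sxx = sum(v * v for v in lag)
    gamma = sum(v * d for v, d in zip(lag, dy)) / sxx
    resid = [d - gamma * v for v, d in zip(lag, dy)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)  # residual variance
    return gamma, gamma / math.sqrt(s2 / sxx)


def simulate_ar1(a1, n, seed=0):
    """Simulate y_t = a1 * y_{t-1} + eps_t with standard normal shocks and y_0 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(a1 * y[-1] + rng.gauss(0.0, 1.0))
    return y


if __name__ == "__main__":
    print(dickey_fuller_t(simulate_ar1(0.5, 500)))  # stationary: strongly negative t-ratio
    print(dickey_fuller_t(simulate_ar1(1.0, 500)))  # unit root: gamma_hat close to zero
```

The intercept, trend and lagged-difference terms of equations (37) to (41) would be added as further regressors, which requires a multiple regression routine and the corresponding set of critical values.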

3 Trends and "the Linear Model"


Consider the two-variable VAR(1) model

\[ y_t = a_0 + a_1 y_{t-1} + a_2 x_{t-1} + e_{1t} \]

\[ x_t = b_0 + b_1 x_{t-1} + b_2 y_{t-1} + e_{2t} \]

with stationary time series $x_t$ and $y_t$. Ordinary least squares regression yields consistent
and asymptotically normally distributed estimates of $a$ and $b$ if $X_t = (x_{t-1}, y_{t-1})$ is independent
of the error terms $e_{1t}$ and $e_{2t}$, respectively. In the case of non-stationary time series $x_t$
and $y_t$ with unit roots, the moments of non-stationary processes converge at higher rates
than the moments of stationary processes, so OLS regression turns out to be a
"superconsistent" estimator of $a$ and $b$, but the normal distribution of the parameter estimates2
can only be guaranteed if $x_t$ and $y_t$ are co-integrated, i.e. if their linear combination
defined by the so-called co-integration vector,3

\[ z_t = y_t - \beta x_t, \]

is stationary. Co-integration of two non-stationary time series is thus characterized
by similar trend components in both time series, constituting a long-term dependence
between them. Simple tests of co-integration can be performed using e.g. the Augmented
Dickey-Fuller test of integration (see sect. 4) based on $z_t$.

4 Co-Integration and Error Correction Models


4.1 Derivation and Interpretation
Consider the two-variable VAR(1) model

\[ y_t = a_{10} + a_{11} x_{t-1} + b_{11} y_{t-1} + e_{1t} \]

\[ x_t = a_{20} + a_{21} y_{t-1} + b_{21} x_{t-1} + e_{2t} \]

2 This property of parameter estimates is important to make statistical inferences about them.
3 Note that β refers to the OLS regression of $y_t$ on $x_t$.

Applying techniques of linear algebra, we restate this model in the following form,4

\[ \Delta y_t = a_{10}\Delta x_t - (1 - b_{11})(y_{t-1} - (a_{10} + a_{11})x_{t-1}) + e_{1t} \tag{42} \]

\[ \Delta x_t = \underbrace{a_{20}\Delta y_t}_{[1]} \underbrace{- (1 - b_{21})}_{[2]} \underbrace{(x_{t-1} - (a_{20} + a_{21})y_{t-1})}_{[3]} + e_{2t} \tag{43} \]

called the Error Correction Model. The short-term dynamics [1] and long-term dependencies
[2] and [3] can be analyzed simultaneously. Suppose the error cannot become
infinitely large, i.e. $x_t$ and $y_t$ are co-integrated; then [3] can be interpreted as the deviation
from the long-run equilibrium and [2] as its correction.

4.2 Estimation and Tests


Suppose $x_t$ and $y_t$ are co-integrated non-stationary time series with unit roots (see
sect. 4.2.3). Then the parameters are estimated consistently using OLS regression for each
equation in the following reparameterization,

\[ \Delta y_t = a_{10}\Delta x_t + b_1^1 y_{t-1} + b_2^1 x_{t-1} + e_{1t} \tag{44} \]

\[ \Delta x_t = a_{20}\Delta y_t + b_1^2 x_{t-1} + b_2^2 y_{t-1} + e_{2t} \tag{45} \]

Alternatively one can employ the two-step procedure of Engle and Granger:

1. estimate the co-integration vector [3] of $x_t$ and $y_t$ (see sect. 4.2.3) and

2. substitute the co-integration vector in (42), (43) and estimate $a_0^{\{1,2\}}$ and $1 - b_1^{\{1,2\}}$.

This procedure is justified as the OLS estimates of the parameters of stationary
and non-stationary variables are asymptotically independent.
In the framework of (44), (45) Granger causality5 tests can be designed in the following
way: ∆xt Granger causes ∆yt , if a−2 −2 −1
0 = b2 = 0 and a0 > 0.
For applications of the ECM, see Hansen (1993).

4 Note that $\Delta x_t = x_t - x_{t-1}$ and $\Delta y_t = y_t - y_{t-1}$, respectively.
5 The concept of Granger causality is based on the proposition that the future does not influence
the present.

Part III: Nonlinear Time Series Models
In the last decade, research on nonlinear models has found that economic and financial
time series may be of high complexity (see especially chaos theory and related
literature) which cannot possibly be generated by linear models. In our context,
nonlinear dynamic models are defined as models where the dynamics of variables undergo
regime changes as the variables vary. In general these are amplitude dependent mod-
els. For a survey of numerous nonlinear econometric models and for applications in
economics and finance see the contributions in Semmler (1994).

5 Estimation of Continuous Time Models


Assume, for example, a second order nonlinear stochastic differential equation, such as
the van der Pol equation

\[ \ddot{x} - b(x)\dot{x} + bx = n \tag{46} \]

with

n : white noise
b(x) : $a(1 - x^2)$

where $a$ is a coefficient. It can be estimated as follows. One
can write eqn. (46) as a first order differential equation system in two variables such
as

\[ \dot{z} = f(z_t \mid \theta) + \omega_t \tag{47} \]

where $\theta$ is a parameter set and $\omega_t$ a white noise term. One can then solve the
continuous time dynamic model (47) through the Euler method, the Runge-Kutta method,
the Milstein scheme or local linearizations6. Furthermore, one can directly estimate the
parameters of the proposed model. We will discuss two estimation procedures.

5.1 The Euler Procedure


Employing, for example, the Euler procedure, the discrete-time form of equation system
(47) can be estimated by using actual data sets.
The Euler procedure amounts to estimating the parameter set $\theta$ in (47) using the following
discretization,

\[ z_{t+1} = z_t + h f(z_t \mid \theta) + \omega_t \]

where $h$ is the step size (for examples, see Semmler and Kockesen (1997) and
Chiarella, Semmler and Kockesen (1996)). Although the Euler procedure is a very convenient
method of estimating a continuous time model, it is, as has been pointed
out by Kloeden, Platen and Schurz (1991) and Ozaki (1994), not the most precise
method of turning stochastic differential equations into a discrete time estimable
form. Instability can arise in the difference equation although the
corresponding differential equation is stable. This problem has been addressed in a
series of papers by Ozaki. The Euler difference scheme is, however, still a useful approximation
for estimating an equation system such as (47) in time discrete form by
using Maximum Likelihood. Nonlinear Least Squares may also be a useful estimator,
although it is not necessarily a consistent estimator for the above model. For the use
of the Euler scheme in estimating a stochastic differential equation system such as (47)
by ML, see Kloeden and Platen (1995: 242).
6 For a survey and comparison of the numerical accuracy of the different discretization methods, see
Kloeden, Platen and Schurz (1991) and for the local linearization procedure, see Ozaki (1985, 1994).
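The instability mentioned above can be reproduced with a one-line drift. The sketch below applies the Euler recursion to the stable linear equation $\dot{z} = -z$ with the noise switched off; the function names and the test equation are assumptions chosen for this illustration. For this drift the Euler map is $z_{t+1} = (1 - h) z_t$, which is stable only for $0 < h < 2$.

```python
def euler_path(f, z0, h, steps, noise=None):
    """Iterate z_{t+1} = z_t + h * f(z_t) + w_t; w_t = 0 if no noise sequence is given."""
    z = [z0]
    for k in range(steps):
        w = noise[k] if noise is not None else 0.0
        z.append(z[-1] + h * f(z[-1]) + w)
    return z


def linear_drift(z, theta=-1.0):
    """Drift of the stable test equation dz/dt = theta * z with theta < 0."""
    return theta * z


if __name__ == "__main__":
    print(euler_path(linear_drift, 1.0, 0.1, 50)[-1])  # decays towards zero
    print(euler_path(linear_drift, 1.0, 2.5, 50)[-1])  # explodes although the ODE is stable
```

A noise sequence can be passed through the `noise` argument to obtain a simulated sample path of the discretized stochastic equation.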

5.2 Local Linearization Procedure


In various papers Ozaki (1986, 1987, 1989, 1994) proposes a local linearization procedure
that overcomes shortcomings of methods such as the Euler scheme. He suggests a
local linearization of a nonlinear stochastic differential equation such as (47) by computing
the Jacobian at each point in the state space. The nonlinear differential equation
model (47) can be transformed into a discrete time model through linearization as
follows (see Ozaki, 1994):

\[
\begin{aligned}
z_{t+\Delta t} &= A(z_t) z_t + B(z_t) w_{t+\Delta t} \\
w_{t+\Delta t} &: \text{discrete time white noise} \\
A(z_t) &= \exp(L(z_t)\Delta t) \\
L(z_t) &= \frac{1}{\Delta t} \log\{I + J_t^{-1}(e^{J_t \Delta t} - I)F_t\} \\
J_t &= \left\{ \frac{\partial f(z)}{\partial z} \right\}_{z = z_t} \\
F_t &: \text{derived from } F_t z_t = f(z_t) \\
B(z_t) &: \text{function of the eigenvalues of } L(z_t)
\end{aligned}
\]

This local linearization is consistent since, as $\Delta t \to 0$, the original differential equation
is obtained. The estimation procedure can be undertaken by nonlinear least squares or
maximum likelihood procedures (see Ozaki 1994). The local linearization method of
Ozaki is written in GAUSS. For an example, see Semmler and Kockesen (1997).

5.3 Estimation of a Random Walk
Random walks as stochastic processes are important tools in asset pricing, see Neftci
(1996). A discrete time random walk can be expressed as

\[ p_t = p_{t-1} + \varepsilon_t, \quad \varepsilon_t = \begin{cases} \Delta & \text{with probability } \pi \\ -\Delta & \text{with probability } 1 - \pi \end{cases} \tag{48} \]

where $\varepsilon_t$ is i.i.d. (for details see Campbell, Lo and MacKinlay, 1996, ch. 9). A
Brownian motion with drift, built on a standard Brownian motion (Wiener process) $B_t$ and frequently used in asset pricing,
is

\[ p_t = \mu t + \sigma B_t. \tag{49} \]

Conditional moments are

\[ E[p_t \mid p_{t_0}] = p_{t_0} + \mu(t - t_0) \]

\[ Var[p_t \mid p_{t_0}] = \sigma^2 (t - t_0) \]

\[ Cov[p_{t_1}, p_{t_2}] = Var[p_{t_1}] = \sigma^2 t_1 \quad \text{for } t_1 \le t_2. \]
In differential form we obtain from (49) a stochastic differential equation

\[ dp_t = \mu\, dt + \sigma\, dB_t, \]

where $\mu\, dt$ is the drift and $\sigma\, dB_t$ the diffusion term. Moreover,

\[ dB_t \equiv \lim_{h \to dt} (B_{t+h} - B_t). \]

The general form of the stochastic differential equation (SDE) with amplitude dependent
coefficients is

\[ dp_t = a(p, t, \alpha)\, dt + b(p, t, \beta)\, dB_t, \]

driven by a standard Wiener process, with $\theta = (\alpha, \beta)$ as unknown parameters. For a
process with constant coefficients, ML estimation can be employed to estimate
the parameter set $\theta$.
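Take

\[ d \log P = \underbrace{\left(\mu - \tfrac{1}{2}\sigma^2\right)}_{\alpha} dt + \sigma\, dB \quad \text{and} \]

\[ d \log P = \alpha\, dt + \sigma\, dB, \]

where $\theta = (\alpha, \sigma)$ are constants. In discrete time form we have

\[ \log P_{t+h} = \log P_t + \alpha h + \sigma\, dB_t, \]

where $dB$ is white noise (a Gaussian increment with variance $h$) generated by a random number generator. For examples
of such processes, see Neftci (1996, ch. 11).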
As suggested by Campbell, Lo and MacKinlay (1996, ch. 9) we can employ the log-likelihood
function

\[ L(\alpha, \sigma) = -\frac{n}{2} \log(2\pi\sigma^2 h) - \frac{1}{2\sigma^2 h} \sum_{k=1}^{n} (r_k(h) - \alpha h)^2, \]

with $r_k$ as continuously compounded returns and where $h$ is now the observation
interval of the data. Hence, estimates of $\alpha$, $\sigma$ can be written as follows:

\[ \hat{\sigma}^2 = \frac{1}{nh} \sum_{k=1}^{n} (r_k(h) - \hat{\alpha} h)^2 \quad \text{and} \]

\[ \hat{\alpha} = \frac{1}{nh} \sum_{k=1}^{n} r_k(h). \]

For the estimation of a more general SDE, the remarks in sect. 6.1 may be valid.
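The closed-form estimators $\hat{\alpha}$ and $\hat{\sigma}^2$ of this section translate directly into code. The following Python sketch uses hypothetical function names; the simulated returns assume Gaussian increments with mean $\alpha h$ and variance $\sigma^2 h$, as in the discrete time form above.

```python
import math
import random


def ml_estimates(returns, h):
    """ML estimates (alpha_hat, sigma2_hat) from returns observed at interval h."""
    n = len(returns)
    alpha_hat = sum(returns) / (n * h)
    sigma2_hat = sum((r - alpha_hat * h) ** 2 for r in returns) / (n * h)
    return alpha_hat, sigma2_hat


def simulate_returns(alpha, sigma, h, n, seed=0):
    """Draw n returns r_k(h) = alpha * h + sigma * sqrt(h) * N(0, 1)."""
    rng = random.Random(seed)
    return [alpha * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical daily data: h = 1/252 years, annualized alpha = 0.08, sigma = 0.2.
    r = simulate_returns(0.08, 0.2, 1.0 / 252.0, 5000)
    print(ml_estimates(r, 1.0 / 252.0))
```

In such simulations the variance estimate is typically much tighter than the drift estimate, whose precision depends on the total time span $nh$ rather than on the number of observations.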

6 Threshold Regression Models


6.1 Regime Changes
The above van der Pol equation (46) can be approximated by a discrete time locally
self-exciting but globally bounded system of the following type (see Ozaki 1985):

\[ x_t = (\phi_1 + \pi_1 e^{-x_{t-1}^2}) x_{t-1} + (\phi_2 + \pi_2 e^{-x_{t-1}^2}) x_{t-2} + \varepsilon_t \]

or equivalently by a (piecewise linear or nonlinear) threshold model such as

\[ x_t = \begin{cases} \pi(T_1) x_{t-1} + \varepsilon_t & \text{for } x_{t-1} < T_1 \\ \pi(x_{t-1}) x_{t-1} + \varepsilon_t & \text{for } T_1 \le x_{t-1} < T_2 \\ \pi(T_2) x_{t-1} + \varepsilon_t & \text{for } x_{t-1} \ge T_2 \end{cases} \]

Threshold models have become popular since they offer a rich array of dynamic
behavior and are able to identify models with amplitude-dependent behavior, see
Tong (1990). See also the papers in Semmler (1994), which provide extensive reviews
of the literature on these various frameworks.

6.2 Threshold Models
One particular class of time series models, called Threshold Autoregressive (TAR)
models, is based upon the principle of local approximations to a nonlinear system by
introducing different linear regimes via thresholds. Tong (1990, p. 99) refers to this
principle as the threshold principle in the sense that, as certain variables pass through
thresholds, the dynamic behavior of the system changes. In that sense they are suitable
for analyzing amplitude-dependence and regime changes and/or asymmetries, and are also
amenable to examining nonlinear economic phenomena, particularly within a multivariate
context.
The univariate TAR models are called self-exciting threshold autoregressive and
moving average models (SETARMA). One particular case is the self-exciting threshold
autoregressive (SETAR) model. A SETAR(k, p, d) is given by

\[ Y_t = \alpha_0^{(j)} + \sum_{i=1}^{p} \alpha_i^{(j)} Y_{t-i} + \varepsilon_t^{(j)}, \quad \text{if } r_{j-1} \le Y_{t-d} < r_j, \quad j = 1, 2, \dots, k \tag{50} \]

where $k$ denotes the number of different regimes, $d$ is the delay parameter and
$r_1, \dots, r_{k-1}$ are the threshold parameters, which satisfy $-\infty = r_0 < r_1 < \dots < r_k = \infty$;
$\varepsilon_t^{(j)}$ is white noise with $\varepsilon_t^{(j)}$ independent of $\varepsilon_t^{(l)}$ for $j \ne l$.
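A SETAR process with one autoregressive lag ($p = 1$) can be simulated in a few lines. The Python sketch below is illustrative only: the function names, the Gaussian noise and the example coefficients are assumptions, and regimes are indexed from zero, so regime $j$ in the code corresponds to regime $j + 1$ in equation (50).

```python
import random


def regime_index(value, thresholds):
    """Zero-based regime j with r_j <= value < r_{j+1}, where r_0 = -inf and r_k = +inf."""
    return sum(1 for r in thresholds if value >= r)


def simulate_setar(n, thresholds, intercepts, slopes, d=1, sigma=1.0, seed=0):
    """Simulate a SETAR(k, 1, d) path of length n started from zeros.

    thresholds must be increasing; intercepts and slopes hold one value per regime.
    Returns (path, regimes).
    """
    rng = random.Random(seed)
    y = [0.0] * d
    regimes = []
    for _ in range(n):
        j = regime_index(y[-d], thresholds)  # regime chosen by Y_{t-d}
        regimes.append(j)
        y.append(intercepts[j] + slopes[j] * y[-1] + rng.gauss(0.0, sigma))
    return y[d:], regimes


if __name__ == "__main__":
    # Two regimes split at zero, pushing the series back towards the threshold.
    path, regimes = simulate_setar(10, [0.0], [1.0, -1.0], [0.5, 0.5])
    print(path)
    print(regimes)
```

With the noise switched off (`sigma=0.0`) this example alternates between the two regimes, a simple instance of the regime-dependent dynamics described in the text.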
A more general case is when the model includes some moving average terms. A
SETARMA(k, p, q, d) is given by

\[ Y_t = \alpha_0^{(j)} + \sum_{i=1}^{p} \alpha_i^{(j)} Y_{t-i} + \sum_{i=1}^{q} \beta_i^{(j)} \varepsilon_{t-i} + \varepsilon_t \tag{51} \]

if

\[ r_{j-1} \le Y_{t-d} < r_j, \quad j = 1, 2, \dots, k. \]

These models give rise to phenomena such as limit cycles, catastrophe (jump phenomenon),
asymmetries and corridor stability (state dependent response to shocks), thus
capturing many forms of nonlinear interaction between variables.
Tsay (1989) provides an easy to implement procedure to test for nonlinearities of the
SETAR type and to estimate $k$, $p_1, \dots, p_k$, $d$ and the thresholds $\{r_j\}$. Applications to macroeconomic
data can be found in the papers collected in Semmler (1994).7
7 Potter (1993) found evidence for threshold nonlinearities in post-war U.S. GNP data and estimated
a two regime SETAR model. Using nonlinear impulse response functions he found that asymmetries
exist in GNP dynamics. Tiao & Tsay (1991) used a more refined regime categorization which
distinguishes between upswing and downswing phases and estimated a four-regime TAR model for the
same data. In both models the behavior of real GNP is different in recession and expansion periods.
They both capture the notion of different stability properties in different regimes and asymmetric
movements in the form of rapid turns into recessions (Potter (1994), Cao & Tsay (1992)). Cao &
Tsay (1992) applied the SETAR modeling procedure developed by Tsay (1989) to daily stock return

An open-loop TAR (TARSO) model (Tong, 1990) is characterized by the fact
that the regimes are generated by a different variable $X_{t-d}$,

\[ Y_t = \alpha_0^{(j)} + \sum_{i=1}^{p} \alpha_i^{(j)} Y_{t-i} + \sum_{i=1}^{p} \beta_i^{(j)} X_{t-i} + \varepsilon_t^{(j)} \tag{52} \]

if

\[ r_{j-1} \le X_{t-d} < r_j. \]

Generalizations of this model include an n-equation TARSO model, called TARSC,
and any mixture of SETAR and TARSO equations.
A further variant of this framework is elaborated by Luukkonen et al. (1988a, 1988b),
Granger and Teräsvirta (1993) and Teräsvirta (1994). It is based upon the idea that, in
contrast to SETAR models, although there is amplitude-dependence, the transitions
between regimes may take place smoothly.
A single equation STR-model can be written as

\[ Y_t = \beta' x_t + (\theta' x_t) F(z_t) + \varepsilon_t \tag{53} \]

\[ x_t' = (1, Y_{t-1}, \dots, Y_{t-p}; x_{1t}, \dots, x_{kt}) \]

where $z_t$ is any variable which is thought to be governing the transition between
regimes and $F(z_t)$ some continuous function. One widely used form of the STR model is the
logistic STR (LSTR) model, characterized by the logistic function $F$, i.e.,

\[ F(z_t - c) = [1 + \exp(-\gamma(z_t - c))]^{-1}, \quad \gamma > 0 \tag{54} \]

\[ F(-\infty) = 0, \quad F(+\infty) = 1, \quad F(0) = 1/2. \]

Another subcategory is the exponential STR (ESTR) model, where the function
$F$ is exponential, i.e.,

\[ F(z_t - c) = 1 - \exp[-\gamma(z_t - c)^2], \quad \gamma > 0 \tag{55} \]

\[ F(\pm\infty) = 1, \quad F(0) = 0. \]
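The two transition functions (54) and (55) and the conditional mean of the STR model (53) can be written down directly. The Python sketch below uses hypothetical function names and passes the coefficient vectors $\beta$ and $\theta$ as plain lists.

```python
import math


def logistic_transition(z, c=0.0, gamma=1.0):
    """LSTR transition, eq. (54): rises monotonically from 0 to 1, equals 1/2 at z = c."""
    return 1.0 / (1.0 + math.exp(-gamma * (z - c)))


def exponential_transition(z, c=0.0, gamma=1.0):
    """ESTR transition, eq. (55): equals 0 at z = c and tends to 1 as |z - c| grows."""
    return 1.0 - math.exp(-gamma * (z - c) ** 2)


def str_mean(x, z, beta, theta, transition, c=0.0, gamma=1.0):
    """Conditional mean beta'x + (theta'x) F(z) of the STR model, eq. (53)."""
    F = transition(z, c, gamma)
    return sum((b + t * F) * xi for b, t, xi in zip(beta, theta, x))


if __name__ == "__main__":
    for z in (-2.0, 0.0, 2.0):
        print(z, logistic_transition(z), exponential_transition(z))
```

The logistic function distinguishes between low and high values of $z_t$, whereas the exponential function treats deviations from $c$ symmetrically, which is the basis for choosing between the LSTR and ESTR families. Larger values of $\gamma$ make the transition sharper.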
If the transitions are generated by deviations of the transition variable from its
linear (trend) path rather than from a fixed value $c$, the model takes the form of the
STR-DEVIATION (STR-D) model. One possibility is to use lagged fitted residuals
from the linear part of equation (53) as the transition variable, i.e.,

\[ z_t = Y_{t-d} - \hat{\beta}' x_{t-d} \tag{56} \]
volatilities and found that SETAR models fare better in forecast performance as compared to GARCH
and EGARCH models, especially for large stock returns.

then, depending upon the form of the transition function one could have a logistic
or exponential STR-D model.8 10

6.3 Estimation of STR Models


The empirical methodology to apply an STR model should be composed of the following
steps:

1. Specify a linear model as a starting point: Use model selection criteria and resid-
ual diagnostic tests to specify an appropriate vector autoregression (VAR) model.

2. Test linearity against STR using the linear model specified in the first step as
the null model. If the linearity hypothesis is rejected, determine the transition
variable (or a linear combination of them) from the data.

• A test with power against both LSTR and ESTR involves testing $H_0: \phi_1 = \phi_2 = \phi_3 = 0$ in the following auxiliary regression:
$$Y_t = \beta' x_t + \phi_1' x_t z_{td} + \phi_2' x_t z_{td}^2 + \phi_3' x_t z_{td}^3 + \eta_t$$
The above regression can also be used to select the transition variable $z_t$ by conducting the test for different variables.
• If linearity is rejected for several choices of $z_t$, then select the one with the smallest probability value as the transition variable.

3. Choose between the LSTR and ESTR models. Consider the following sequence
of nested hypotheses:
8
Granger and Teräsvirta (1993), Teräsvirta (1994), and Eitrheim and Teräsvirta (1995) present
a full-fledged testing, specification, estimation and evaluation procedure for STR models and show
that the STR family contains TAR, TARSO, TARSC, and Exponential Autoregressive (EXPAR)9 models
as special cases. Teräsvirta and Anderson (1992), Granger, Teräsvirta and Anderson (1993) and
Teräsvirta (1993) applied this procedure to various economic series.
10
In particular, Teräsvirta and Anderson (1992) tested and modeled STAR specifications for quarterly
industrial production indices of OECD countries for the period between 1960.1 and 1986.4. (STAR
stands for smooth transition autoregressive and is distinguished by being univariate; STAR models
constitute most of the empirical literature that uses the STR methodology.) For the U.S. data an LSTAR
model is estimated which performs better than a linear one and whose dynamic behavior may include
chaos and long and short cycles. They found that recessions are locally explosive with complex roots
and have a period of 8.9 quarters, whereas expansions are stable with complex roots and periods of 61
quarters. Hence, the asymmetric behavior which has been observed for U.S. GNP in previous studies
using threshold autoregressive models is also detected for industrial production series.

$H_{03}: \phi_3 = 0$
$H_{02}: \phi_2 = 0 \mid \phi_3 = 0$
$H_{01}: \phi_1 = 0 \mid \phi_3 = \phi_2 = 0$

If the test of H02 has the smallest probability value, choose the ESTR family; otherwise
choose the LSTR family.

4. Estimation of a specified STR model can be undertaken by nonlinear least squares.

5. Evaluation:

• Check if the parameter estimates are reasonable,


• Diagnostic checks on residuals,
• Evaluate the dynamic properties of the model.

An application of the above methodology to test a nonlinear model using the STR
approach is given in Semmler and Kockesen (1997) for a dynamic economic model and
in Chiarella, Semmler and Kockesen (1996) for a model in finance. A GAUSS program
for a multivariate STR estimation is also available.
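The linearity test of step 2 can be sketched as an F test comparing the linear model with the auxiliary regression. The following Python example is a minimal sketch under stated assumptions: the function names, the simulated LSTAR(1) data and the choice of y_{t-2} as candidate transition variable are hypothetical, and the F form of the test is one common way to carry it out, not necessarily the variant used in the GAUSS program mentioned above.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared OLS residuals of y regressed on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def linearity_f_test(y, x, z):
    """F statistic for H0: phi1 = phi2 = phi3 = 0 in the auxiliary regression.

    y: (T,) dependent variable; x: (T, k) regressors including a constant;
    z: (T,) candidate transition variable.
    """
    zc = z[:, None]
    X0 = x                                             # linear null model
    X1 = np.hstack([x, x * zc, x * zc**2, x * zc**3])  # auxiliary regression
    rank0 = int(np.linalg.matrix_rank(X0))
    rank1 = int(np.linalg.matrix_rank(X1))
    q = rank1 - rank0            # number of restrictions actually tested
    dof = len(y) - rank1         # residual degrees of freedom
    ssr0, ssr1 = ssr(y, X0), ssr(y, X1)
    f_stat = ((ssr0 - ssr1) / q) / (ssr1 / dof)
    return f_stat, q, dof

# Hypothetical data: an LSTAR(1) process whose regime depends on y_{t-2}.
rng = np.random.default_rng(0)
T = 300
y = np.zeros(T)
for t in range(2, T):
    F = 1.0 / (1.0 + np.exp(-5.0 * y[t - 2]))
    y[t] = 0.5 * y[t - 1] - 0.9 * y[t - 1] * F + rng.normal(scale=0.5)

Y = y[2:]
x = np.column_stack([np.ones(T - 2), y[1:-1]])   # constant and first lag
z = y[:-2]                                        # candidate transition variable
f_stat, q, dof = linearity_f_test(Y, x, z)
```

The statistic would be compared with critical values of the F(q, dof) distribution. Repeating the call for several candidates z and keeping the one with the smallest probability value mirrors the selection rule in step 2.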

7 ARCH/GARCH-Models
7.1 Motivation and Derivation
In the financial literature nonlinear models became popular once stylized facts revealed
that financial time series data do not appear to be consistent with the assumption of
a normal distribution. The so-called ARCH/GARCH models allow for time-dependent
variances of a time series.
Suppose the financial time series $y_t$ is conditionally normally distributed with mean $\mu = 0$
and variance $h_t$. An ARCH(p) model is then formalized as follows:

$$y_t = \varepsilon_t h_t^{0.5}, \qquad y_t \mid y_{t-1} \sim N(0, h_t),$$

$$h_t = \alpha_0 + \alpha_1 y_{t-1}^2 + \ldots + \alpha_p y_{t-p}^2, \qquad \alpha_1, \ldots, \alpha_p > 0.$$

Note that the squared y's arise because the conditional expectation of $y_t^2$ equals
the variance of $y$ in $t$. Practical investigations of the ARCH(p)
model provide evidence that p has to be chosen quite large in order to obtain good
approximation results.

This overparameterization may be avoided by employing an extended ARCH model,
amounting to the following GARCH(p, q) model:

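$$y_t = \varepsilon_t h_t^{0.5}, \qquad y_t \mid \psi_{t-1} \sim N(0, h_t),$$

$$h_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i}^2 + \sum_{j=1}^{q} \beta_j h_{t-j},$$

with $p, q > 0$ and $\alpha_i, \beta_j > 0$, $i = 1, \ldots, p$, $j = 1, \ldots, q$, where $\psi_{t-1}$ denotes the information available at $t-1$.
Note that the variance of $y_t$ is finite only if $\sum \alpha_i + \sum \beta_j < 1$.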
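To illustrate the variance recursion, the following minimal Python sketch simulates a GARCH(1,1) process in the standard form of Bollerslev (1986), in which the β coefficient multiplies the lagged conditional variance. The function name, the parameter values and the choice to start the recursion at the unconditional variance α0/(1 − α1 − β1) are assumptions made for the example.

```python
import random

def simulate_garch11(alpha0, alpha1, beta1, n, seed=0):
    """Simulate y_t = eps_t * sqrt(h_t), h_t = alpha0 + alpha1*y_{t-1}^2 + beta1*h_{t-1}."""
    if alpha1 + beta1 >= 1.0:
        raise ValueError("finite variance requires alpha1 + beta1 < 1")
    rng = random.Random(seed)
    h = alpha0 / (1.0 - alpha1 - beta1)   # start at the unconditional variance
    ys, hs = [], []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0) * h ** 0.5
        ys.append(y)
        hs.append(h)
        h = alpha0 + alpha1 * y * y + beta1 * h   # next period's conditional variance
    return ys, hs

# Hypothetical parameter values satisfying alpha1 + beta1 < 1.
alpha0, alpha1, beta1 = 0.1, 0.1, 0.8
ys, hs = simulate_garch11(alpha0, alpha1, beta1, n=1000)
unconditional_variance = alpha0 / (1.0 - alpha1 - beta1)
```

Plotting `hs` against time would show the volatility clustering that motivates the model: large shocks raise the conditional variance for several subsequent periods.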

7.2 Estimation and Tests


Estimation of the structural parameter set $\theta = (\alpha, \beta)$ is usually performed by the
maximum likelihood technique. For normally distributed $y_t$, one obtains the likelihood
function
$$L_t(\theta; y_t) = \frac{1}{\sqrt{h_t}\sqrt{2\pi}} \exp\left(-\frac{y_t^2}{2h_t}\right).$$
Numerically, the likelihood function can be maximized using the gradient $\partial L / \partial \theta$.
In practical applications the optimal lag lengths p and q have to be chosen. This
choice may be guided by testing whether the residuals exhibit further nonlinear
dependencies, using the test of Brock, Dechert and Scheinkman.
Computer programs are available in GAUSS.
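The sample log-likelihood is the sum of the logarithms of the terms L_t. The following minimal Python sketch evaluates it for a GARCH(1,1) specification and compares a few parameter candidates. The return series, the candidate values and the start value for h_t are hypothetical; a real application would pass the function to a numerical optimizer instead of comparing a handful of points.

```python
import math

def garch11_loglik(params, ys):
    """Gaussian log-likelihood: sum of -0.5*log(2*pi*h_t) - y_t^2 / (2*h_t)."""
    alpha0, alpha1, beta1 = params
    h = alpha0 / (1.0 - alpha1 - beta1)   # assumed start value: unconditional variance
    loglik = 0.0
    for y in ys:
        loglik += -0.5 * math.log(2.0 * math.pi * h) - y * y / (2.0 * h)
        h = alpha0 + alpha1 * y * y + beta1 * h   # update the conditional variance
    return loglik

# Hypothetical return series and parameter candidates (alpha0, alpha1, beta1).
ys = [0.3, -1.2, 0.8, 2.1, -0.4, -0.1, 0.6, -1.9]
candidates = [(0.5, 0.1, 0.3), (1.0, 0.2, 0.5), (0.8, 0.3, 0.2)]

# Crude stand-in for a gradient-based maximizer: keep the best candidate.
best = max(candidates, key=lambda p: garch11_loglik(p, ys))
```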

8 Neural Networks
8.1 Motivation and Derivation
To address the problem of nonlinearities, the most widely employed nonlinear time series
models are now nonparametric models such as kernel regression, as well as general
approximation methods based on neural networks. The latter are a well-known class
of models in the econometric literature.
Assume that a financial time series generated by the conditional distribution
$f(y_t \mid y_{t-1}, \ldots, y_{t-m})$ can be represented by the stochastic relationship

$$y_t = g(y_{t-1}, \ldots, y_{t-m}) + \varepsilon_t, \qquad E(\varepsilon_t) = 0, \quad Var(\varepsilon_t) < \infty,$$

where $g$ is a nonlinear but differentiable function. Neural networks are an appropriate
nonlinear econometric method to estimate $g$, owing to the universal approximation
capabilities of a function parameterized in $\theta$, the so-called multi-layer perceptron:

$$\hat{y}_t = \hat{g}(\theta, y_{t-1}, \ldots, y_{t-m}) = \theta^1_0 + \sum_{h=1}^{H} \theta^1_h \Psi(\theta^2, x_t)$$

$$\Psi(\theta^2, x_t) = \Big(1 + \exp\Big(-\theta^2_{h0} - \sum_{i=1}^{n} \theta^2_{hi} x_{it}\Big)\Big)^{-1},$$

where $\Psi$ is a logistic, i.e. sigmoid, function.
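The mapping from inputs to the fitted value can be sketched in a few lines of Python. The function name, the layout of the parameter lists and the numerical values below are illustrative assumptions; the hidden units use the logistic function, as stated in the text.

```python
import math

def mlp_predict(theta1, theta2, x):
    """One-hidden-layer perceptron with logistic hidden units.

    theta1: [theta1_0, theta1_1, ..., theta1_H]  (output bias and output weights)
    theta2: H rows [theta2_h0, theta2_h1, ..., theta2_hn]  (hidden bias and weights)
    x:      n inputs, e.g. the lagged values y_{t-1}, ..., y_{t-m}
    """
    y_hat = theta1[0]
    for h, row in enumerate(theta2, start=1):
        activation = row[0] + sum(w * xi for w, xi in zip(row[1:], x))
        y_hat += theta1[h] / (1.0 + math.exp(-activation))   # logistic hidden unit
    return y_hat

# Hypothetical network with H = 2 hidden units and n = 2 inputs.
# With all hidden weights equal to zero every hidden unit outputs 1/2.
theta1 = [1.0, 2.0, 4.0]
theta2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
prediction = mlp_predict(theta1, theta2, [0.3, -0.7])
```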

8.2 Estimation and Tests


The method of backpropagation updates an initial guess of the
parameter set $\theta_0$ (see footnote 11) iteratively using gradient descent,

$$\theta_i = \theta_{i-1} - \eta \frac{\partial MSE(y, \hat{y})}{\partial \theta}, \qquad i = 1, 2, \ldots,$$

with a small learning parameter $\eta$, so that each parameter change is made in the direction
of the steepest decrease of the error function. Because the procedure may get stuck in a
local minimum of the error surface, several methods, for example simulated annealing
techniques, have been suggested to obtain convergence to the global optimum (see footnote 12).
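The update rule can be sketched for a one-hidden-layer perceptron with logistic hidden units as follows. Everything specific in this Python example is an assumption made for illustration: the toy target y = x², the number of hidden units H = 2, the learning parameter η = 0.05, the number of iterations and the small random start values (in the spirit of footnote 11).

```python
import math
import random

def forward(theta1, theta2, x):
    """Return the network output and the hidden-unit activations."""
    hidden = [1.0 / (1.0 + math.exp(-(row[0] + sum(w * xi for w, xi in zip(row[1:], x)))))
              for row in theta2]
    return theta1[0] + sum(t * s for t, s in zip(theta1[1:], hidden)), hidden

def mse(theta1, theta2, xs, ys):
    return sum((forward(theta1, theta2, x)[0] - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

def gradient_step(theta1, theta2, xs, ys, eta):
    """One gradient descent step theta_i = theta_{i-1} - eta * dMSE/dtheta."""
    g1 = [0.0] * len(theta1)
    g2 = [[0.0] * len(row) for row in theta2]
    for x, y in zip(xs, ys):
        y_hat, hidden = forward(theta1, theta2, x)
        d = 2.0 * (y_hat - y) / len(ys)          # derivative of the MSE w.r.t. y_hat
        g1[0] += d
        for h, s in enumerate(hidden):
            g1[h + 1] += d * s
            back = d * theta1[h + 1] * s * (1.0 - s)   # chain rule through the logistic unit
            g2[h][0] += back
            for i, xi in enumerate(x):
                g2[h][i + 1] += back * xi
    new1 = [t - eta * g for t, g in zip(theta1, g1)]
    new2 = [[t - eta * g for t, g in zip(row, grow)] for row, grow in zip(theta2, g2)]
    return new1, new2

# Toy data: approximate y = x^2 on a grid in [-1, 1].
xs = [[-1.0 + 0.1 * i] for i in range(21)]
ys = [x[0] ** 2 for x in xs]

rng = random.Random(0)
H = 2
theta1 = [rng.uniform(-0.1, 0.1) for _ in range(H + 1)]                  # small random start
theta2 = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(H)]

initial_mse = mse(theta1, theta2, xs, ys)
for _ in range(300):
    theta1, theta2 = gradient_step(theta1, theta2, xs, ys, eta=0.05)
final_mse = mse(theta1, theta2, xs, ys)
```

Plain gradient descent of this kind only finds a local minimum; restarting from several random start values is a simple, if crude, alternative to the simulated annealing techniques mentioned in the text.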
An important task for the model builder is to choose the right complexity $H$ of
the neural network, so as to avoid biased estimators (under-parameterization) as well as
excessive variance in the estimates (over-parameterization).
A straightforward procedure would be the following:

1. fit the data $y_t$, $t = 1, \ldots, T-k$ for $H = 1, 2, \ldots$;

2. choose $H^*$ so as to minimize $MSE_H(y, \hat{y})$;

3. fit the data $y_t$, $t = 1, \ldots, T$ for $H^*$.

More sophisticated techniques for neural network model selection are reviewed in
Hertz, Krogh and Palmer (1993).
Computer programs are available in GAUSS and RATS.

11
How to get an adequate initial guess of the parameters is discussed in Hertz, Krogh and Palmer
(1993). In practice small random numbers are appropriate to avoid the tails of the logistic function.
12
This and further topics of how to guide the learning process are reviewed in Hertz, Krogh and
Palmer (1993).

Part IV: Estimation of Intertemporal Models
Next, we sketch estimation strategies for intertemporal models where the dynamic
model to be estimated is derived from the first order conditions of a dynamic
optimization problem. We discuss discrete time and continuous time examples.

9 Estimating a Stochastic Dynamic Using MLE


9.1 The Model
A discrete time variant of a nonlinear intertemporal model13 usually reads as follows:
$$V(x_0, u_0) = E_{x_0} \sum_t \beta^t F(x_t, u_t)$$

$$\text{s.t.} \quad x_{t+1} = x_t + f(x_t, u_t) + \varepsilon_{t+1} \tag{57}$$

where $x_t$ is a vector of state variables, $u_t$ a vector of control variables, $\beta$ the
discount factor, $\varepsilon_{t+1}$ a vector of random shocks to the economy, $E_{x_0}$ the expectation
conditional on the information available at the initial date, and $F(x_t, u_t)$ the return function. This is the
general form of a real business cycle (RBC) model. For details of the following model,
see Semmler and Gong (1996).
Written in feed back form, in general, the control variables are nonlinear functions
of the state variables.

ut = G(xt ) (58)
In special cases the nonlinear map (58) can explicitly be computed.14 Using a
Lagrangian approach a linearized version of an intertemporal model - linearized in the
state variables, Lagrangian multipliers and control variables - is proposed by Chow
(1993), see also Chow (1997). In general it can be written as

$$x_{t+1} = A x_t + C u_t + b + \varepsilon_{t+1} \tag{59}$$

$$\lambda_{t+1} = H x_t + h \tag{60}$$

$$u_{t+1} = G x_t + g + e_t \tag{61}$$


13
Model variants of the subsequent type are extensively treated in Cooley (1995).
14
Semmler and Sieveking (1996), see also Taylor and Uhlig (1990) for a comparison of different
techniques to compute the control variables in feedback form and to solve those models.

where
$$G = -(K_2 + \beta C'HC)^{-1}(K_{21} + \beta C'HA) \tag{62}$$
Furthermore,
$$g = -(K_2 + \beta C'HC)^{-1}[k_2 + \beta C'(Hb + h)] \tag{63}$$
where
$$h = (K_{12} + \beta A'HC)g + k_1 + \beta A'(Hb + h) \tag{64}$$
and
$$H = K_1 + K_{12}G + \beta A'H(A + CG) \tag{65}$$
λ represents the vector of Lagrangian multipliers. The other matrices and vectors
are to be explained below. The parameters of the model (57) are embedded in the
matrix G and the vector g.
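Equations (62) and (65) determine G and H jointly, so in practice they have to be solved together. One simple possibility, sketched below in Python, is successive substitution starting from H = K1. This iteration scheme, the function name and the scalar toy values are assumptions made for illustration and are not taken from Chow (1993); the constants g and h of (63)-(64) are omitted.

```python
import numpy as np

def solve_G_H(A, C, K1, K2, K12, K21, beta, tol=1e-12, max_iter=10_000):
    """Iterate on (62) and (65) until H stops changing."""
    H = K1.copy()
    for _ in range(max_iter):
        G = -np.linalg.solve(K2 + beta * C.T @ H @ C, K21 + beta * C.T @ H @ A)   # (62)
        H_new = K1 + K12 @ G + beta * A.T @ H @ (A + C @ G)                        # (65)
        converged = np.max(np.abs(H_new - H)) < tol
        H = H_new
        if converged:
            # Recompute G so that it is consistent with the final H.
            G = -np.linalg.solve(K2 + beta * C.T @ H @ C, K21 + beta * C.T @ H @ A)
            return G, H
    raise RuntimeError("iteration on (62) and (65) did not converge")

# Hypothetical scalar example with a concave quadratic return function.
A = np.array([[0.9]])
C = np.array([[1.0]])
K1 = np.array([[-1.0]])
K2 = np.array([[-1.0]])
K12 = np.array([[0.0]])
K21 = np.array([[0.0]])
beta = 0.95

G, H = solve_G_H(A, C, K1, K2, K12, K21, beta)
closed_loop = A + C @ G   # state transition under the feedback rule u = G x
```

In this toy case the closed-loop coefficient A + CG lies inside the unit interval, so the controlled state is stable.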
Next, let us specify the above general intertemporal model as an RBC model (see footnote 15)
whereby the return function takes the form of a utility function with two control
variables, consumption $u_1$ and labor input $u_2$:

$$F(u_{1,t}, u_{2,t}) = \log u_{1,t} + \theta \log(1 - u_{2,t}) \tag{66}$$

The state space is two-dimensional, comprising the capital stock $x_2$ and a
stochastic process $x_1$ describing a sequence of productivity shocks. Employing a
Cobb-Douglas production function we obtain

$$q_t = (x_{2,t})^{1-\alpha}(A_t u_{2,t})^{\alpha}$$

The dynamics of the two state variables are

$$x_{1,t} = \gamma + x_{1,t-1} + \varepsilon_t \tag{67}$$

$$x_{2,t} = (1-\delta)x_{2,t-1} + x_{2,t-1}^{1-\alpha} e^{\alpha x_{1,t-1}} u_{2,t-1}^{\alpha} - u_{1,t-1} \tag{68}$$

where in equ. (67) $x_{1,t} = \log A_t$ follows a random walk with drift $\gamma$, and $\varepsilon_t$ is
a random shock to technology. Equ. (68) represents the evolution of the capital
stock $x_{2,t}$, with $\delta$ being the depreciation rate. The structural parameters involved in
this RBC model are

$$\varphi = (\alpha, \beta, \theta, \delta, \gamma),$$
15
For details, see Chow (1993, 1997)

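which are embedded in the matrices and vectors of Eqs. (62)-(65), which in the
case of an RBC model can be defined as

$$A = \begin{bmatrix} 1 & 0 \\ \alpha q & 1 - \delta + (1-\alpha)q/x_2 \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} 0 & 0 \\ -1 & \alpha q/u_2 \end{bmatrix};$$

$$b = (\gamma, -\alpha q x_1)', \quad \varepsilon = (\varepsilon_t, 0)', \quad K_1 = 0, \quad K_{12} = 0, \quad k_1 = 0, \quad K_{21} = 0;$$

$$k_2 = \big(2u_1^{-1},\; -\theta[(1-u_2)^{-1} - (1-u_2)^{-2}u_2]\big)',$$

$$\text{and} \quad K_2 = \begin{bmatrix} -u_1^{-2} & 0 \\ 0 & \theta(1-u_2)^{-2} \end{bmatrix}$$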

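Given the parameter set $\varphi$, the linearized functions for the control variables,
corresponding to equ. (58), read

$$\begin{bmatrix} 1 & 0 & 0 \\ -G_{11} & 1 & 0 \\ -G_{21} & 0 & 1 \end{bmatrix} \begin{bmatrix} x_{1,t} \\ u_{1,t} \\ u_{2,t} \end{bmatrix} = \begin{bmatrix} \gamma \\ g_1 \\ g_2 \end{bmatrix} + \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_{1,t-1} \\ u_{1,t-1} \\ u_{2,t-1} \end{bmatrix} + \begin{bmatrix} 0 \\ G_{12} \\ G_{22} \end{bmatrix} x_{2,t} + \begin{bmatrix} \varepsilon_t \\ e_{1,t} \\ e_{2,t} \end{bmatrix}$$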

9.2 Estimating the model


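The above model can be written as a standard simultaneous equation system

$$B y_t + \Gamma x_t = e_t \tag{69}$$

where $y_t = (x_{1,t}, u_{1,t}, u_{2,t})'$, $x_t = (x_{1,t-1}, u_{1,t-1}, u_{2,t-1}, x_{2,t}, 1)'$ and

$$B = \begin{bmatrix} 1 & 0 & 0 \\ -G_{11} & 1 & 0 \\ -G_{21} & 0 & 1 \end{bmatrix}, \quad \Gamma = \begin{bmatrix} -1 & 0 & 0 & 0 & -\gamma \\ 0 & 0 & 0 & -G_{12} & -g_1 \\ 0 & 0 & 0 & -G_{22} & -g_2 \end{bmatrix}, \quad e_t = \begin{bmatrix} \varepsilon_t \\ e_{1,t} \\ e_{2,t} \end{bmatrix}$$

For n observations we obtain

$$BY' + \Gamma X' = E' \tag{70}$$

where $Y'$ is $3 \times n$, $X'$ is $5 \times n$ and $E'$ is $3 \times n$. Following Chow (1983: 170-171) and
Chow (1993: 13) we can introduce a concentrated log-likelihood function to iteratively
compute the deep parameters $\varphi$, starting with $\varphi_0 = (\alpha_0, \beta_0, \theta_0, \delta_0, \gamma_0)$. It has the form

$$\log L = \text{const} + n \log |B| - (n/2) \log \left| (1/n)(BY' + \Gamma X')(YB' + X\Gamma') \right|$$

whereby the maximum likelihood estimate of the covariance matrix of $e_t$ is

$$\hat{\Sigma} = (1/n)(BY' + \Gamma X')(YB' + X\Gamma') \tag{71}$$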
Computing the deep parameters iteratively requires three steps:
• start with initial values ϕ0 and compute G, g, H and h of equs. (62)-(65),
• evaluate the log-likelihood function, using (71), for ϕ0,
• apply an optimization algorithm to change ϕ in the direction of maximizing
log L.
The recursive procedure, which employs simulated annealing as the optimization
algorithm, is available in GAUSS. The results of the estimation for U.S. macroeconomic
time series data for 1952-1990 are reported in Semmler and Gong (1996, 1997). Asset
market restrictions can also be considered in intertemporal models of the above type;
see Lettau, Gong and Semmler (1997).
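The second of these steps, evaluating the concentrated log-likelihood for given matrices B and Γ, can be sketched in Python as follows. The function name and the randomly generated data are hypothetical; in an actual application B and Γ would be built from G and g, which in turn depend on the deep parameters, and the function would be called inside the optimization loop.

```python
import numpy as np

def concentrated_loglik(B, Gamma, Y, X):
    """Concentrated log-likelihood up to a constant, and the implied covariance estimate.

    Y is n x 3 (rows y_t'), X is n x 5 (rows x_t'), B is 3 x 3, Gamma is 3 x 5.
    """
    n = Y.shape[0]
    E = Y @ B.T + X @ Gamma.T                 # rows are e_t' = (B y_t + Gamma x_t)'
    Sigma = E.T @ E / n                       # (1/n)(BY' + Gamma X')(YB' + X Gamma')
    _, logabsdet_B = np.linalg.slogdet(B)     # log |det B|
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    return n * logabsdet_B - 0.5 * n * logdet_Sigma, Sigma

# Hypothetical data generated so that B y_t + Gamma x_t = e_t holds exactly.
rng = np.random.default_rng(0)
n = 50
B = np.array([[1.0, 0.0, 0.0],
              [-0.4, 1.0, 0.0],
              [-0.2, 0.0, 1.0]])
Gamma = np.array([[-1.0, 0.0, 0.0, 0.0, -0.01],
                  [0.0, 0.0, 0.0, -0.3, -0.5],
                  [0.0, 0.0, 0.0, -0.1, -0.2]])
X = rng.normal(size=(n, 5))
E_true = rng.normal(scale=0.1, size=(n, 3))
Y = (E_true - X @ Gamma.T) @ np.linalg.inv(B).T   # solves B Y' = E' - Gamma X'

loglik, Sigma_hat = concentrated_loglik(B, Gamma, Y, X)
```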

10 Estimating an Endogenous Growth Model Using GMM
Next we present a continuous time variant of an intertemporal model and sketch an
estimation strategy.

10.1 The model


Consider an endogenous growth model in continuous time in the context of a closed
economy (see footnote 16). The economy is composed of three sectors: a household sector,
a productive sector and a government sector. The household sector is assumed
to optimize, whereas the government sector does not optimize but follows certain
budgetary rules.
The household is supposed to maximize the discounted stream of utilities arising
from consumption subject to its per capita budget constraint, i.e.
$$\max_{C(t)} \int_0^{\infty} e^{-(\rho - n)t} L_0\, u(C(t))\, dt,$$
subject to

$$C(t) + \dot{K}(t) + (\delta_1 + n)K(t) + \dot{B}(t) + nB(t) = (w(t) + r_1(t)K(t) + r_2(t)B(t))(1 - \tau) + T_p(t).$$
16
The analytical treatment of the subsequent model is given in Greiner and Semmler (1996a). The
model is estimated in Greiner, Semmler and Gong (1997).

where C(t) is consumption at time t. The assets accumulated by the household
are physical capital K(t), which depreciates at the rate δ1 and government bonds or
public debt B(t). Tp (t) is the lump-sum transfer payment to the household which the
household takes as given in solving its optimization problem. The term τ is the income
tax rate and w(t), r1 (t), and r2 (t) denote the wage rate, the return to physical capital,
and the return to government bonds respectively. The no-arbitrage condition requires
the after tax equalization of the two rates of returns (see appendix of Greiner, Semmler
and Gong (2000)).
Moreover, ρ is the constant rate of time preference. The labor supply which equals
L0 at t = 0 is assumed to grow at the constant rate n.
As to the utility function we use the CRRA function

$$(C(t)^{1-\sigma} - 1)/(1 - \sigma),$$


where σ is the coefficient of relative risk aversion which is a constant. For σ = 1
the utility function can be replaced by the logarithmic function in C(t).
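The logarithmic case is the limit of the CRRA function as σ approaches 1. A minimal Python sketch, with a hypothetical function name and a small tolerance chosen for the example, makes this concrete:

```python
import math

def crra_utility(c, sigma):
    """CRRA utility (c**(1 - sigma) - 1) / (1 - sigma); logarithmic for sigma = 1."""
    if abs(sigma - 1.0) < 1e-12:
        return math.log(c)
    return (c ** (1.0 - sigma) - 1.0) / (1.0 - sigma)

# Near sigma = 1 the CRRA value is close to log(c).
u_log = crra_utility(2.0, 1.0)
u_near = crra_utility(2.0, 1.000001)
```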
The productive sector is assumed to be represented by a firm which behaves
competitively, exhibiting a per capita production function of the form

$$f(K, G) = K^{\beta} (G/L)^{\alpha},$$
where G is the aggregate stock of public capital, which is subject to congestion. As
to congestion we adopt the modeling proposed in Glomm and Ravikumar (1994) and
assume that the per capita stock of public capital, G/L (denoted simply by G in what follows), affects per capita output.
$\beta$ and $\alpha$, with $\beta, \alpha \in (0, 1)$, denote the shares of private and public per capita capital in the
production function, respectively. Since K denotes per capita capital, the wage rate and
the return to private capital are determined as $w = (1 - \beta)K^{\beta}G^{\alpha}$ and $r_1 = \beta K^{\beta-1}G^{\alpha}$.
The budget constraint of the government in per capita terms is given by

$$\dot{B} = r_2 B + C_p + T_p + \dot{G} - T - nB,$$
where $r_2 B$ is the debt service, $C_p$ stands for public consumption, $T_p$ for transfers,
$\dot{G}$ for public investment and $T$ for the tax income, given by $T = \tau(w + r_1 K + r_2 B)$.
In addition, as to the sustainability of public debt, we posit that the government is
not allowed to play a Ponzi game. We thus state that the usual transversality condition
$$\lim_{t \to \infty} B(t)\, e^{-\int_0^t (r_2(s) - n)\, ds} = 0$$

must hold. In Greiner and Semmler (1996) we show that the above transversality
condition will hold when certain parameter constellations are given.17
17
In Semmler and Sieveking (1996) the problem of sustainability of debt is analytically studied for
a related model and in Greiner and Semmler (1997) the above sustainability condition is estimated
for German time series data. Section 13 discusses details of such a test.

The dynamic behavior of our economy can be analyzed after defining the ratios
c = C/K, b = B/K, and x = G/K. Differentiating c, b, and x with respect to time we
get a new dynamic system which completely describes our model around a BGP. The
dynamic system, where the growth rates of c, b, and x are derived from the first order
condition of the above maximization problem by employing the Hamiltonian approach,
is given by the following equations (76)-(78):

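$$\frac{\dot{c}}{c} = -\frac{\rho + \delta_1}{\sigma} + \frac{(1-\tau)\beta G^{\alpha}}{\sigma K^{1-\beta}} + \frac{C}{K} + (\delta_1 + n) - \frac{G^{\alpha}}{K^{1-\beta}} + \tau(\varphi_2 + \varphi_3(1-\varphi_0)) \cdot \left( \frac{G^{\alpha}}{K^{1-\beta}} + \frac{B}{K}\left( \beta\frac{G^{\alpha}}{K^{1-\beta}} - \frac{\delta_1}{1-\tau} \right) \right)$$

$$\frac{\dot{b}}{b} = (\varphi_0 - 1)(1-\varphi_3)\tau\left( \beta\frac{G^{\alpha}}{K^{1-\beta}} - \frac{\delta_1}{1-\tau} + \frac{K^{\beta}G^{\alpha}}{B} \right) + (1-\varphi_4)\left( \beta\frac{G^{\alpha}}{K^{1-\beta}} - \frac{\delta_1}{1-\tau} \right) + \frac{C}{K} + \delta_1 - \frac{G^{\alpha}}{K^{1-\beta}} + \tau(\varphi_2 + \varphi_3(1-\varphi_0))\left( \frac{G^{\alpha}}{K^{1-\beta}} + \frac{B}{K}\left( \beta\frac{G^{\alpha}}{K^{1-\beta}} - \frac{\delta_1}{1-\tau} \right) \right),$$

$$\frac{\dot{x}}{x} = \varphi_3(1-\varphi_0)\tau\left( \frac{K^{\beta}}{G^{1-\alpha}} + \beta\frac{G^{\alpha}}{K^{1-\beta}}\frac{B}{G} - \frac{B}{G}\frac{\delta_1}{1-\tau} \right) - \delta_2 + \frac{C}{K} + \delta_1 - \frac{G^{\alpha}}{K^{1-\beta}} + \tau(\varphi_2 + \varphi_3(1-\varphi_0))\left( \frac{G^{\alpha}}{K^{1-\beta}} + \frac{B}{K}\left( \beta\frac{G^{\alpha}}{K^{1-\beta}} - \frac{\delta_1}{1-\tau} \right) \right)$$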
Here we have assumed that public consumption and transfer payments to the
household each constitute a certain part of the tax income, i.e. $C_p = \varphi_2 T$ and $T_p = \varphi_1 T$, with $\varphi_1, \varphi_2 < 1$.
Moreover, we define per capita government expenditure for public investment as
$\dot{G} = \varphi_3 \cdot (1 - \varphi_0)T - (\delta_2 + n)G$, with $\varphi_3 \geq 0$ and $\delta_2$ the depreciation rate of public
capital. The fraction $\varphi_0$ depends on the policy.
For β = 1 − α, the above is an autonomous system of differential equations in
the variables c, b, and x. The local dynamics of the model can then be analyzed by
computing analytically or numerically the eigenvalues of the Jacobian matrix. The
above system may also exhibit multiple equilibria (for details, see Greiner, Semmler
and Gong, 1997).

10.2 Estimating the Model


We employ time series data on consumption, public debt and public capital stock to
estimate the above model for the U.S. and German economies from 1952 to 1990. All
variables are defined relative to the private capital stock. We first need to describe our
estimation strategy. Then we turn to the description of the actual estimation.

We employ GMM estimation and the Euler scheme for the discretization of the
continuous time model. The GMM estimation starts with a set of orthogonality conditions,
representing the population moments established by a theoretical model:

E[g(yt , ψ)] = 0 (72)


where $y_t$ is a $p \times 1$ vector of observed variables at date $t$, $\psi$ is a $q \times 1$ vector of
unknown parameters to be estimated, and $g(\cdot)$ is an $r \times 1$ vector-valued function defined on $R^{p+q}$.
Let $T$ denote the sample size. The sample moments of $g(\cdot)$ can be written as
$$g_T(\psi) = \frac{1}{T} \sum_{t=1}^{T} g(y_t, \psi). \tag{73}$$
The idea of the GMM estimator is to choose an estimate of $\psi$ such that the sample
moments $g_T(\psi)$ match the population moments given by (72) as closely as possible. To
achieve this, one needs to define a distance function by which that closeness can be
judged. Hansen (1982) suggested the distance function

$$J_T(\psi) = [g_T(\psi)]' W_T [g_T(\psi)], \tag{74}$$


where $W_T$, called the weighting matrix, is $r \times r$, symmetric and positive definite.
Thus, the GMM estimator is the value of $\psi$, denoted $\hat{\psi}$, that minimizes (74). From
the results established in Hansen (1982), a consistent estimator of the variance-covariance
matrix of $\hat{\psi}$ is given by
$$Var(\hat{\psi}) = \frac{1}{T} (D_T)^{-1} W_T^{-1} (D_T')^{-1}, \tag{75}$$
where $D_T = \partial g_T(\hat{\psi}) / \partial \psi'$.
There is great flexibility in the choice of $W_T$ for constructing a consistent and
asymptotically normal GMM estimator. We can adopt the method of Newey and West
(1987), where it is suggested that
$$W_T^{-1} = \hat{\Omega}_0 + \sum_{j=1}^{m} w(j, m)(\hat{\Omega}_j + \hat{\Omega}_j'), \tag{76}$$
with $w(j, m) \equiv 1 - j/(1 + m)$, $\hat{\Omega}_j \equiv (1/T) \sum_{t=j+1}^{T} g(y_t, \hat{\psi}^*)\, g(y_{t-j}, \hat{\psi}^*)'$, $m$ a suitably
chosen lag length and $\hat{\psi}^*$ a consistent estimator of $\psi$. Thus a two-step estimation is suggested, as in Hansen and Singleton
(1982). First, one chooses a suboptimal weighting matrix to minimize (74) and hence
obtains a consistent estimator $\hat{\psi}^*$. One then uses the consistent estimator obtained in
the first step to calculate the optimal $W_T$, with which (74) is re-minimized.
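The building blocks of the two-step procedure can be sketched in Python as follows. The moment conditions used here, for the mean and variance of a simulated series, are a deliberately simple stand-in for the model's moment conditions; the function names, the lag length m = 4 and the data are assumptions. Because this toy case is just-identified, the first-step estimate is available in closed form and no numerical optimizer is needed.

```python
import numpy as np

def moments(y, psi):
    """T x r matrix whose rows are g(y_t, psi) for psi = (mean, variance)."""
    mu, s2 = psi
    return np.column_stack([y - mu, (y - mu) ** 2 - s2])

def newey_west_weight(g, m):
    """Weighting matrix W_T as the inverse of the Newey-West estimate with Bartlett weights."""
    T = g.shape[0]
    S = g.T @ g / T                                  # Omega_0
    for j in range(1, m + 1):
        omega_j = g[j:].T @ g[:-j] / T               # (1/T) sum_{t=j+1}^T g_t g_{t-j}'
        S = S + (1.0 - j / (1.0 + m)) * (omega_j + omega_j.T)
    return np.linalg.inv(S)

def j_value(g, W):
    """Distance function g_T' W g_T evaluated at the sample moments."""
    g_bar = g.mean(axis=0)
    return float(g_bar @ W @ g_bar)

rng = np.random.default_rng(1)
y = rng.normal(loc=1.0, scale=2.0, size=200)

# Step 1: with any positive definite weighting matrix, the just-identified
# problem is solved by setting the sample moments to zero.
psi_star = (y.mean(), y.var())

# Step 2: compute the weighting matrix at the first-step estimate and re-evaluate.
W_T = newey_west_weight(moments(y, psi_star), m=4)
j_at_estimate = j_value(moments(y, psi_star), W_T)
j_elsewhere = j_value(moments(y, (0.0, 1.0)), W_T)
```

In an over-identified problem the second step would re-minimize the distance function numerically with `W_T` held fixed.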
To define the set of orthogonality conditions (72) for our GMM estimation we can
employ the dynamic system above. We then have three equations defining our set of
orthogonality conditions as follows:

E[c̃ − f1 (c(ψ), x(ψ), b(ψ))] = 0 (77)

E[b̃ − f2 (c(ψ), x(ψ), b(ψ))] = 0 (78)

E[x̃ − f3 (c(ψ), x(ψ), b(ψ))] = 0 (79)


where $f_i$ ($i = 1, 2, 3$) depends on the parameter set $\psi = (\rho, \sigma, \alpha)$. The terms $\tilde{c}$, $\tilde{b}$, $\tilde{x}$
represent the deviation of the actual growth rates from their trend values at time period
t (see footnote 18). The sample moments $g(\cdot)$ as defined in (73) can be computed using (77)-(79). Our
estimation problem with three equations and three parameters $\psi = (\rho, \sigma, \alpha)$ amounts
to numerically finding a parameter set $\psi$ such that the distance as expressed in
(74) is minimized.
A computer algorithm, written in GAUSS, is designed to solve the optimization
problem (74) with simulated annealing in the above-mentioned two steps. The
estimations are undertaken for U.S. and German time series data for 1952-1990. For
details of the results, see Greiner, Semmler and Gong (1997).

11 Estimating Transversality Conditions of Intertemporal Models: Bubble Tests
Recently some effort has been spent on empirically testing whether a transversality
condition, for example as stated in sect. 12.1, can hold. This is equivalent to testing for
a bubble in time series data. We report here an estimator that originates in Flood and
Garber (1980), which has been extended by Hamilton and Flavin (1986) and recently
employed in Greiner and Semmler (1997).19
In the above model of sect. 12.1 the evolution of the state variable, appearing in
the transversality condition, can be written in discrete time form as

Bt = (1 + r)Bt−1 − St (80)
where $S_t$ is a flow variable (government surplus) and $B_t$ a stock variable (government debt).
In models for an open economy B can be viewed as external debt and S as the trade
surplus.
18
In order to avoid detrending through procedures like the Hodrick-Prescott (HP) filter which may
bias the estimation results we here use a formulation in terms of growth rates as suggested by our
model (78)-(80). We, however, also undertook the estimation with HP-detrended data. The results
turned out to be less reasonable than the below reported results using growth rates.
19
Note, that in theory we can compute the transversality condition at the steady state for t → ∞.
In the subsequent section we want to estimate the transversality or non-explosiveness condition for a
finite number of observations.

By recursive forward substitution, equation (80) becomes
$$B_t = \sum_{i=t+1}^{N} \frac{S_i}{(1 + r)^{i-t}} + \frac{(1 + r)^t B_N}{(1 + r)^N} \tag{81}$$
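The forward solution can be checked numerically. The following minimal Python sketch generates a debt path from the recursion (80) and compares B_t with the discounted sum of surpluses plus the discounted terminal debt. The interest rate, the surplus series and the initial debt level are hypothetical numbers chosen for the example.

```python
def debt_path(b0, surpluses, r):
    """Debt path from B_i = (1 + r) * B_{i-1} - S_i, with surpluses[i] = S_i (index 0 unused)."""
    path = [b0]
    for i in range(1, len(surpluses)):
        path.append((1.0 + r) * path[i - 1] - surpluses[i])
    return path

def forward_solution(path, surpluses, r, t, N):
    """Discounted surpluses from t+1 to N plus the discounted terminal debt B_N."""
    pv = sum(surpluses[i] / (1.0 + r) ** (i - t) for i in range(t + 1, N + 1))
    return pv + (1.0 + r) ** t * path[N] / (1.0 + r) ** N

# Hypothetical numbers: 5 percent interest, initial debt of 10.
r = 0.05
S = [0.0, 1.0, -0.5, 2.0, 0.3, 1.2]
B = debt_path(10.0, S, r)

t, N = 1, 5
lhs = B[t]
rhs = forward_solution(B, S, r, t, N)
```

The two sides agree for any path generated by the recursion; the sustainability question is whether the terminal term vanishes as N grows.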
For a more rigorous treatment of forward solutions in expectations models, see
Gourieroux and Montfort (1997, ch. 12). When wealth holders expect
the borrowing behavior of the government or country to be subject to the present value
borrowing constraint
$$B_t = E_t \sum_{i=t+1}^{\infty} \frac{S_i}{(1 + r)^{i-t}} \tag{82}$$
this is equivalent to requiring that the real supply of debt held by the public is
expected to grow no faster than the interest rate:
$$E_t \lim_{N \to \infty} \frac{B_N}{(1 + r)^N} = 0 \tag{83}$$

If the government debt is constrained not to exceed a constant, $A_0$, on the right hand
side of (2), we then have from (82)
$$B_t = E_t \sum_{i=t+1}^{\infty} \frac{S_i}{(1 + r)^{i-t}} + A_0 (1 + r)^t \tag{84}$$
and the empirical question of non-sustainability amounts to whether
the term $A_0 (1 + r)^t$ is positive and significantly different from zero. Following Flood and
Garber (1980) and Hamilton and Flavin (1986), the following nonlinear least squares
test for sustainability can be employed:

$$S_t = b_1 + b_2 S_{t-1} + b_3 S_{t-2} + b_4 S_{t-3} + \varepsilon_{2t} \tag{85}$$

$$B_t = b_5 (1 + r)^t + b_6 + \frac{(b_2 b + b_3 b^2 + b_4 b^3) S_t}{1 - b_2 b - b_3 b^2 - b_4 b^3} + \frac{(b_3 b + b_4 b^2) S_{t-1}}{1 - b_2 b - b_3 b^2 - b_4 b^3} + \frac{(b_4 b) S_{t-2}}{1 - b_2 b - b_3 b^2 - b_4 b^3} + \varepsilon_{1t} \tag{86}$$

Equations (85)-(86) should be jointly estimated by nonlinear least squares. The
term $b = 1/(1 + r)$ is based on a constant interest rate.
Note that for debt to be non-sustainable, or for a bubble to exist,
the parameter $b_5$ should be positive and significantly different from zero. Estimation
results on the sustainability of government debt for the U.S. and Germany are reported
in Greiner and Semmler (1997). Similar bubble or non-sustainability tests have been
undertaken for the price level (Flood and Garber, 1990), foreign debt (Trehan and Walsh,
1991) and stock market dynamics. A computer program to estimate equations (85)-(86)
is available in GAUSS.

References
[1] Baxter, M. and R. King (1995), ”Measuring business cycles approximate band-
pass filters for economic time series.”, NBER working paper 50220.
[2] Bollerslev, T. (1986), ”Generalized Autoregressive Conditional Heteroscedastic-
ity”, Journal of Econometrics 31, pp. 307-327.
[3] Campbell, J.Y., Lo, A.W. and A.C. MacKinlay (1996), ”The Econometrics of
Financial Markets”, Princeton University Press, Princeton, N.J.
[4] Cao and Tsay (1992), ”Nonlinear Time-Series Analysis of Stock Volatilities”,
Journal of Applied Econometrics 7, pp. 165-185.
[5] Chiarella, C., Semmler, W. and L. Kockesen (1996), ”The Specification and
Estimation of a Nonlinear Model of Real and Stock Market interaction”, mimeo,
New School for Social research, New York.
[6] Chow, G. (1983), ”Econometrics”, McGraw-Hill, New York.
[7] Chow, G. (1993), ”Statistical Estimation and Testing of a Real Business Cy-
cle Model”, Econometric Research Program, Research Memorandum, no. 365,
Princeton: Princeton University.
[8] Chow, G. (1997), ”Dynamic Economics”, Oxford University Press, New York et
al.
[9] Cooley, T. (1995), ed., ”Frontiers of Business Cycle Research”, Princeton University
Press, Princeton.
[10] Eitrheim and T. Teräsvirta (1995), ”Testing the Adequacy of Smooth Transition
Autoregressive Models”, unpublished paper.
[11] Engle, R.F. (1982), ”Autoregressive Conditional Heteroscedasticity with Estimates
of the Variance of United Kingdom Inflation”, Econometrica 50, pp. 987-1007.
[12] Franke, R. and W. Semmler (1994), ”The Financial-Real Interaction and Invest-
ment in the Business Cycle: Theory and Empirical Evidence”, in: Nell, E. and
G. Deleplace (eds.), Money in Motion, Macmillan.
[13] Franke, R. and W. Semmler (1993), ”Detrending Programs”, (available on the
DOS-level.).
[14] Glomm, G. and B. Ravikumar (1994), ”Public Investment in Infrastructure in a
Simple Growth Model”, Journal of Economic Dynamics and Control, vol. 18, pp.
1173-87.

[15] Gourieroux, C. and A. Montfort (1997), ”Time Series and Dynamic Models”,
Cambridge University Press, Cambridge et al.

[16] Granger, C.W. and T. Teräsvirta (1993), ”Modelling Nonlinear Economic Rela-
tionships”, Chicago University Press, Chicago.

[17] Granger, C.W., Teräsvirta, T. and H.M. Anderson (1993), ”Modelling Nonlin-
earity over the Business Cycle”, New Research on Business Cycles, Indicators
and Forecasting, J.H. Stock and M.W. Watson (eds.), Chicago University Press,
Chicago.

[18] Greiner, A. and W. Semmler (1996a), ”Endogenous Growth, Government Debt
and Budgetary Regimes”, paper presented at the 52nd Congress, International
Institute of Public Finance, Tel Aviv, Israel, August 1996, mimeo, University of
Augsburg, New School for Social Research.

[19] Greiner, A., Semmler, W. and G. Gong (1997), ”Estimating an Endogenous
Growth Model with Public Capital and Government Borrowing”, University of
Bielefeld, Working Paper 340.

[20] Hamilton, J.D. (1994), ”Time Series Analysis”, Princeton University Press,
Princeton, N.J.

[21] Hansen, G. (1993), ”Quantitative Wirtschaftsforschung”, Verlag Vahlen,
München.

[22] Hansen, L.P. (1982), ”Large Sample Properties of Generalized Methods of Mo-
ments Estimators”, Econometrica 50, pp. 1029-1054.

[23] Hansen, L.P. and K.J. Singleton (1982), ”Generalized Instrument Variables Es-
timation of Nonlinear Expectations Models”, Econometrica 50, pp. 1268-1286.

[24] Hertz, J.A., Krogh, A.S. and R.G. Palmer (1995), ”Introduction to the Theory
of Neural Computation”, Addison Wesley Publishing Company, Redwood City,
C.A. et al.

[25] Hodrick, ? and ? Prescott (1980), ”Post-War U.S. Business Cycles”, Working
paper, Carnegie Mellon University.

[26] Kloeden, P.E. and E. Platen (1995), ”Numerical Solutions of Stochastic Differ-
ential Equations”, Springer Verlag, Heidelberg et al.

[27] Kloeden, P.E., Platen, E. and H. Schurz (1991), ”Numerical Solutions of SNDE
Through Computer Experiments”, Springer Verlag, Heidelberg et al.

[28] Lettau, M., Gong, G. and W. Semmler (1997), ”Parameter Estimation and Mo-
ment Evaluation of a Stochastic Growth Model with Asset Market”.

[29] Luukkonen, R., Saikkonen, P. and T. Teräsvirta (1988a), ”Testing Linearity
Against Smooth Transition Autoregression Models”, Biometrika 75, pp. 491-499.

[30] Luukkonen, R., Saikkonen, P. and T. Teräsvirta (1988b), ”Testing Linearity in
Univariate Time Series Models”, Scandinavian Journal of Statistics 15, pp. 161-
175.

[31] Neftci, S. (1996), ”An Introduction to the Mathematics of Financial Derivatives”,
Academic Press, New York.

[32] Nelson, C. and C. Plosser (1982), ”Trends and Random Walks in a Macroeco-
nomic Time Series”, Journal of Monetary Economics 10, pp.139-162.

[33] Newey, W.K. and K.D. West (1987), ”A Simple, Positive-Definite, Heteroscedas-
ticity and Autocorrelation Consistent Co-variance Matrix”, Econometrica, vol.
55, pp. 703-708.

[34] Ozaki, T. (1985), ”Nonlinear Time Series Models and Dynamical Systems”,
Handbook of Statistics 5, E.J. Hannan,P.R. Krishnaih and M.M. Rao (eds.),
Elsevier Science Publishers B.V., North-Holland.

[35] Ozaki, T. (1986), ”Local Gaussian Modeling of Stochastic Dynamical Systems in
the Analysis of Nonlinear Random Vibrations”, Essays in Time Series and Allied
Processes: Papers in Honour of E.J. Hannan, J. Gani and M.B. Priestley (eds.),
Applied Probability Trust Sheffield.

[36] Ozaki, T. (1987), ”Statistical Identification of Nonlinear Dynamics in
Macroeconomics Using Nonlinear Time Series Models”.

[37] Ozaki, T. (1989), ”Statistical Identification of Nonlinear Random Vibration
Systems”, Journal of Applied Mechanics 56, pp. 186-191.

[38] Ozaki, T. (1994), ”The Local Linearization Filter with Application to Nonlinear
System Identification”, Proceedings of the First U.S./Japan Conference on the
Frontiers of Statistical Modeling: An Informational Approach, H. Bozdogan (ed.),
Kluwer Academic Publishers,Boston, pp. 217-240.

[39] Potter, S.M. (1993), ”A Nonlinear Approach to U.S. GNP”, mimeo, University
of Wisconsin-Madison.

[40] Potter, S.M. (1994), ”Asymmetric Economic Propagation Mechanisms”, Business
Cycles: Theory and Empirical Methods, W. Semmler (ed.), Kluwer Academic
Publishers, Boston.

[41] Semmler, W. (1994), Business Cycles: Theory and Empirical Methods, Kluwer
Academic Publishers, Boston.

[42] Semmler, W. and M. Sieveking (1996), ”Computing Creditworthiness and
Sustainable Debt”, paper prepared for the International Conference on Computing in
Economics and Finance, University of Geneva, June 1996, mimeo, Department
of Mathematics, University of Frankfurt.

[43] Stock, J. and M. Watson (1997), ”Business Cycle Fluctuations in U.S. Macroeco-
nomic Time Series”, forthcoming, Handbook of Macroeconomics, ed. by J. Taylor
and M. Woodford.

[44] Taylor, J. and H. Uhlig (1990), ”Solving nonlinear stochastic Growth Models: A
Comparison of alternative solution of models”, Journal of Business and Economic
Statistics 8, pp. 1-17.

[45] Teräsvirta, T. (1994), ”Specification, Estimation and Evaluation of Smooth
Transition Autoregressive Models”, Journal of the American Statistical Association
89.

[46] Teräsvirta, T. and H.M. Anderson (1992), ”Characterizing Nonlinearities in
Business Cycles Using Smooth Transition Autoregressive Models”, Journal of Applied
Econometrics 7, pp. S119-S136.

[47] Tong, H. (1990), ”Non-Linear Time Series, A Dynamical System Approach”,
Oxford University Press, New York.

[48] Tsay (1989), ”Testing and Modeling Threshold Autoregressive Processes”, Jour-
nal of the American Statistical Association 84, pp. 231-240
