Part I
State space models
1
Introduction to state space time series analysis
James Durbin
Department of Statistics, London School of Economics and Political Science
Abstract
The paper presents a broad general review of the state space approach to
time series analysis. It begins with an introduction to the linear Gaussian
state space model. Applications to problems in practical time series analysis
are considered. The state space approach is briefly compared with the Box–
Jenkins approach. The Kalman filter and smoother and the simulation
smoother are described. Missing observations, forecasting and initialisation
are considered. A representation of a multivariate series as a univariate series
is displayed. The construction and maximisation of the likelihood function
are discussed. An application to real data is presented. The treatment is
extended to non-Gaussian and nonlinear state space models. A simulation
technique based on importance sampling is described for analysing these
models. The use of antithetic variables in the simulation is considered.
Bayesian analysis of the models is developed based on an extension of the
importance sampling technique. Classical and Bayesian methods are applied
to a real time series.
$$y_t = \mu_t + \gamma_t + c_t + r_t + i_t + \varepsilon_t, \qquad (1.1)$$
where $\mu_t$ is the trend, $\gamma_t$ the seasonal, $c_t$ the cycle, $r_t$ the regression component, $i_t$ the intervention effect and $\varepsilon_t$ the irregular.
[Figure 1.1: time series plot of drivers; x-axis 'year' (70–85), y-axis 'number (log)', approximately 7.1–7.9.]
Fig. 1.1. Monthly numbers (logged) of car drivers who were killed or seriously injured in road accidents in Great Britain.
$$
\begin{aligned}
y_t &= \mu_t + \varepsilon_t, & \varepsilon_t &\sim N(0, \sigma_\varepsilon^2), \\
\mu_{t+1} &= \mu_t + \eta_t, & \eta_t &\sim N(0, \sigma_\eta^2),
\end{aligned} \qquad (1.2)
$$
for $t = 1, \ldots, n$, where the $\varepsilon_t$ and $\eta_t$ are all mutually independent and are also independent of $\mu_1$.
The objective of this model is to represent a series with no trend or seasonal whose level $\mu_t$ is allowed to vary over time. The second equation of the model is a random walk; random walks are basic elements in many state space time series models. Although it is simple, the local level model is not an artificial model: it provides the basis for the treatment of important series in practice. It is employed to explain the ideas underlying state space time series analysis in an elementary way in Chapter 2 of the DK book.
The properties of time series that are generated by a local level model are
studied in detail in Harvey (1989).
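As an illustration, the following minimal Python sketch simulates a series from the local level model (1.2); the variance values, sample size and seed are illustrative assumptions rather than values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
sigma_eps2, sigma_eta2 = 1.0, 0.1   # illustrative variances (assumed)

mu = 0.0                            # initial level mu_1 (assumed known here)
y = np.empty(n)
for t in range(n):
    y[t] = mu + rng.normal(0.0, np.sqrt(sigma_eps2))   # y_t = mu_t + eps_t
    mu = mu + rng.normal(0.0, np.sqrt(sigma_eta2))     # mu_{t+1} = mu_t + eta_t
```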
$$
\begin{aligned}
y_t &= \mu_t + \varepsilon_t, & \varepsilon_t &\sim N(0, \sigma_\varepsilon^2), \\
\mu_{t+1} &= \mu_t + \nu_t + \xi_t, & \xi_t &\sim N(0, \sigma_\xi^2), \\
\nu_{t+1} &= \nu_t + \zeta_t, & \zeta_t &\sim N(0, \sigma_\zeta^2).
\end{aligned} \qquad (1.3)
$$
This extends the local level model to the case where there is a trend with slope $\nu_t$, and both level and slope are allowed to vary over time. It is worth noting that when both $\xi_t$ and $\zeta_t$ are zero, the model reduces to the classical linear trend plus noise model, $y_t = \alpha + \beta t + \text{error}$. It is sometimes useful to smooth the trend by putting $\xi_t = 0$ in (1.3). Details of the model and its extensions to the general class of structural time series models are given in the DK book, Section 3.2, and in Harvey (1989).
The matrix form of the local linear trend model is
$$
y_t = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} \mu_t \\ \nu_t \end{pmatrix} + \varepsilon_t,
$$
$$
\begin{pmatrix} \mu_{t+1} \\ \nu_{t+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \mu_t \\ \nu_t \end{pmatrix} + \begin{pmatrix} \xi_t \\ \zeta_t \end{pmatrix}.
$$
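The matrix form can be made concrete in code. The following NumPy sketch states the matrices of the local linear trend model; the variance values are illustrative assumptions, and the names Z, H, T, R, Q anticipate the general notation introduced below.

```python
import numpy as np

# System matrices of the local linear trend model in matrix form.
# The variance values are illustrative assumptions.
sigma_eps2, sigma_xi2, sigma_zeta2 = 1.0, 0.05, 0.01

Z = np.array([[1.0, 0.0]])             # y_t = (1 0)(mu_t, nu_t)' + eps_t
H = np.array([[sigma_eps2]])           # Var(eps_t)
T = np.array([[1.0, 1.0],
              [0.0, 1.0]])             # level absorbs the slope each period
R = np.eye(2)                          # disturbances (xi_t, zeta_t)' enter directly
Q = np.diag([sigma_xi2, sigma_zeta2])  # Var((xi_t, zeta_t)')
```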
By considering this and other special cases in matrix form we are led to the following general linear Gaussian state space model, which provides the basis for much of our further treatment of state space models:
$$
\begin{aligned}
y_t &= Z_t \alpha_t + \varepsilon_t, & \varepsilon_t &\sim N(0, H_t), \\
\alpha_{t+1} &= T_t \alpha_t + R_t \eta_t, & \eta_t &\sim N(0, Q_t),
\end{aligned} \qquad (1.4)
$$
for $t = 1, \ldots, n$, where $\alpha_t$ is the state vector and $\alpha_1 \sim N(a_1, P_1)$.
(i) Structural time series models. These are models of the basic form (1.1)
where the submodels for the components are chosen to be compatible
with the state space form (1.4). The local level model and the local
linear trend model are simple special cases. The models are sometimes
called dynamic linear models.
(ii) ARMA and Box–Jenkins (BJ) ARIMA models. These can be put in
state space form as described in the DK book, Section 3.3. This means
that ARIMA models can be treated as special cases of state space models. I will make a few remarks at this point on the relative merits of
the BJ approach and the state space approach for practical time series
analysis.
(a) BJ fits a model to the data, whereas state space fits the data to the structure of the system which generated the data.
(b) BJ eliminates trend and seasonal by differencing. However, in
many cases these components have intrinsic interest and in state
space they can be estimated directly. While in BJ estimates can be
‘recovered’ from the differenced series by maximising the residual
mean square, this seems an artificial procedure.
(c) The BJ identification procedure need not lead to a unique model;
in some cases several apparently quite different models can appear
to fit the data equally well.
(d) In BJ it is difficult to handle regression effects, missing observations, calendar effects, multivariate observations and changes in coefficients over time; these are all straightforward in state space.
A fuller discussion of the relative merits of state space and BJ is given
in the DK book, Section 3.5. The comparison is strongly in favour of
state space.
(iii) Model (1.4) handles time-varying regression and regression with autocorrelated errors straightforwardly.
(iv) State space models can deal with problems in spline smoothing in discrete and continuous time on a proper modelling basis, in which parameters can be estimated by standard methods, as compared with customary ad hoc methods.
• Kalman filter. This recursively computes $a_{t+1} = E(\alpha_{t+1}|Y_t)$ and $P_{t+1} = \mathrm{Var}(\alpha_{t+1}|Y_t)$ for $t = 1, \ldots, n$. Since the distributions are normal, these quantities specify the distribution of $\alpha_{t+1}$ given the data up to time $t$.
• State smoother. This computes $\hat{\alpha}_t = E(\alpha_t|Y_n)$ and $V_t = \mathrm{Var}(\alpha_t|Y_n)$, and hence the conditional distribution of $\alpha_t$ given all the observations, for $t = 1, \ldots, n$.
• Simulation smoother. An algorithm for generating draws from $p(\alpha_1, \ldots, \alpha_n|Y_n)$.
The Kalman filter computes $a_{t+1}$ and $P_{t+1}$ by the recursion
$$
\begin{aligned}
v_t &= y_t - Z_t a_t, \\
F_t &= Z_t P_t Z_t' + H_t, \\
K_t &= T_t P_t Z_t' F_t^{-1}, \\
L_t &= T_t - K_t Z_t, \\
a_{t+1} &= T_t a_t + K_t v_t, \\
P_{t+1} &= T_t P_t L_t' + R_t Q_t R_t', \qquad t = 1, \ldots, n,
\end{aligned} \qquad (1.5)
$$
with a1 and P1 as the mean vector and the variance matrix of α1 .
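A compact Python sketch of recursion (1.5) follows, assuming time-invariant system matrices for simplicity (the recursion itself allows them to depend on $t$); the function name and array layout are illustrative.

```python
import numpy as np

def kalman_filter(y, Z, H, T, R, Q, a1, P1):
    """One-step-ahead moments a_{t+1}, P_{t+1} via recursion (1.5).

    y is an (n, p) array of observations; Z, H, T, R, Q are taken as
    time-invariant here, an assumption made only to keep the sketch short.
    """
    n, m = y.shape[0], a1.shape[0]
    a, P = a1.copy(), P1.copy()
    a_pred = np.empty((n + 1, m))
    P_pred = np.empty((n + 1, m, m))
    a_pred[0], P_pred[0] = a, P
    for t in range(n):
        v = y[t] - Z @ a                      # innovation v_t
        F = Z @ P @ Z.T + H                   # innovation variance F_t
        K = T @ P @ Z.T @ np.linalg.inv(F)    # gain K_t
        L = T - K @ Z                         # L_t = T_t - K_t Z_t
        a = T @ a + K @ v                     # a_{t+1}
        P = T @ P @ L.T + R @ Q @ R.T         # P_{t+1}
        a_pred[t + 1], P_pred[t + 1] = a, P
    return a_pred, P_pred
```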
For jointly normal random vectors, the conditional variance matrix of one vector given that a second vector is fixed does not depend on the second vector. These observations lead to a straightforward derivation of the following algorithm for drawing random vectors $\tilde{\alpha}_t$ from $p(\alpha|Y_n)$:
Step 1. Obtain random draws $\varepsilon_t^+$ and $\eta_t^+$ from the densities $N(0, H_t)$ and $N(0, Q_t)$ respectively, and use them in model (1.4) to generate simulated values $\alpha_t^+$ and $y_t^+$ for $t = 1, \ldots, n$.
Step 2. Compute $\hat{\alpha}_t = E(\alpha_t|Y_n)$ and $\hat{\alpha}_t^+ = E(\alpha_t|Y_n^+)$, where $Y_n^+ = \{y_1^+, \ldots, y_n^+\}$, by means of standard filtering and smoothing using (1.5) forwards and (1.6) backwards.
Step 3. Take
$$\tilde{\alpha}_t = \hat{\alpha}_t - \hat{\alpha}_t^+ + \alpha_t^+,$$
for $t = 1, \ldots, n$.
This algorithm for generating $\tilde{\alpha}_t$ only requires standard Kalman filtering and state smoothing applied to the constructed series $y_1^+, \ldots, y_n^+$ and is therefore easy to incorporate in new software; special algorithms for simulation smoothing such as the ones developed by Frühwirth-Schnatter (1994c), Carter and Kohn (1994) and de Jong and Shephard (1995) are not required. The algorithm and similar ones for the disturbances are intended to replace those given in Section 4.7 of the DK book.
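The three steps can be sketched in Python as follows. Here `state_smoother` is a hypothetical helper returning $E(\alpha_t|\text{data})$ for $t = 1, \ldots, n$ via (1.5) and (1.6), and the initial draw $\alpha_1^+ \sim N(a_1, P_1)$ is an assumption about initialisation.

```python
import numpy as np

def simulation_smoother(y, Z, H, T, R, Q, a1, P1, state_smoother, rng):
    """Draw alpha~ from p(alpha | Y_n) via Steps 1-3.

    `state_smoother` is a hypothetical helper: it returns the smoothed
    means E(alpha_t | data) computed by (1.5) forwards and (1.6) backwards.
    """
    n, m = y.shape[0], a1.shape[0]
    p, r = H.shape[0], Q.shape[0]
    # Step 1: draw eps+, eta+ and generate alpha+, y+ from model (1.4).
    alpha_plus = np.empty((n, m))
    y_plus = np.empty((n, p))
    alpha = rng.multivariate_normal(a1, P1)        # alpha_1^+ (assumed init)
    for t in range(n):
        alpha_plus[t] = alpha
        y_plus[t] = Z @ alpha + rng.multivariate_normal(np.zeros(p), H)
        alpha = T @ alpha + R @ rng.multivariate_normal(np.zeros(r), Q)
    # Step 2: smooth both the observed and the simulated series.
    alpha_hat = state_smoother(y, Z, H, T, R, Q, a1, P1)
    alpha_hat_plus = state_smoother(y_plus, Z, H, T, R, Q, a1, P1)
    # Step 3: combine to obtain a draw with the correct conditional law.
    return alpha_hat - alpha_hat_plus + alpha_plus
```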
1.2.6 Forecasting
This also is very easy in state space analysis. Suppose we want to forecast $y_{n+1}, \ldots, y_{n+k}$ given $y_1, \ldots, y_n$ and to calculate mean square forecast errors. We merely treat $y_{n+1}, \ldots, y_{n+k}$ as missing observations and proceed using (1.5) as in Section 1.2.5. We use $Z_{n+1} a_{n+1}, \ldots, Z_{n+k} a_{n+k}$ as the forecasts and $V_{n+1}, \ldots, V_{n+k}$ to provide the mean square errors; for details see the DK book, Section 4.9.
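A Python sketch of this device, under the same time-invariant assumption as above: future periods contribute no observations, so the filter reduces to its prediction step. The use of $Z P Z' + H$ as the mean square error of the forecast of $y_{n+j}$ is an interpretive assumption here; the state variance $P_{n+j}$ alone gives the mean square error for the state.

```python
import numpy as np

def forecast(a_next, P_next, Z, H, T, R, Q, k):
    """k-step forecasts, treating y_{n+1},...,y_{n+k} as missing.

    a_next, P_next: mean and variance of alpha_{n+1} from the filter (1.5).
    """
    a, P = a_next.copy(), P_next.copy()
    forecasts, mses = [], []
    for _ in range(k):
        forecasts.append(Z @ a)             # Z_{n+j} a_{n+j}
        mses.append(Z @ P @ Z.T + H)        # forecast MSE of y (assumed form)
        a = T @ a                           # missing observation: v = 0, K = 0
        P = T @ P @ T.T + R @ Q @ R.T       # prediction-only variance update
    return forecasts, mses
```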