An Introduction To Nonlinear Filtering
M.H.A. Davis
Department of Electrical Engineering
Imperial College, London SW7 2BT, England.
Steven I. Marcus
Department of Electrical Engineering
The University of Texas at Austin
Austin, Texas 78712, U.S.A.
ABSTRACT
In this paper we provide an introduction to nonlinear
filtering from two points of view: the innovations approach
and the approach based upon an unnormalized conditional density.
The filtering problem concerns the estimation of an unobserved
stochastic process {x_t} given observations of a related process
{y_t}; the classic problem is to calculate, for each t, the
conditional distribution of x_t given {y_s, 0 ≤ s ≤ t}. First, a
brief review of key results on martingales and Markov and
diffusion processes is presented. Using the innovations approach,
stochastic differential equations for the evolution of conditional
statistics and of the conditional measure of x_t given {y_s, 0 ≤ s ≤ t}
are given; these equations are the analogs for the filtering
problem of the Kolmogorov forward equations. Several examples
are discussed. Finally, a less complicated evolution equation is
derived by considering an "unnormalized" conditional measure.
M. Hazewinkel and J. C. Willems (eds.), Stochastic Systems: The Mathematics of Filtering and Identification
and Applications, 53-75.
Copyright © 1981 by D. Reidel Publishing Company.
I. INTRODUCTION
Filtering problems concern "estimating" something about an
unobserved stochastic process {x_t} given observations of a related
process {y_t}; the classic problem is to calculate, for each t,
the conditional distribution of x_t given {y_s, 0 ≤ s ≤ t}. This was
solved in the context of linear system theory by Kalman and Bucy
[1],[2] in 1960-1961, and the resulting "Kalman filter" has of
course enjoyed immense success in a wide variety of applications.
Attempts were soon made to generalize the results to systems with
nonlinear dynamics. This is an essentially more difficult problem,
being in general infinite-dimensional, but nevertheless equations
describing the evolution of conditional distributions were
obtained by several authors in the mid-sixties; for example, Bucy
[3], Kushner [4], Shiryaev [5], Stratonovich [6], Wonham [7]. In
1969 Zakai [8] obtained these equations in substantially simpler
form using the so-called "reference probability" method (see Wong
[9]).
y_t = ∫_0^t z_s ds + w_t    (1)

x_t = x_0 + ∫_0^t f(x_s) ds + ∫_0^t G(x_s) dβ_s    (4)

H3. E[∫_0^T z_s² ds] < ∞.
H4. {z_t} is independent of {w_t}.
Hypotheses (H1) and (H4) can be weakened, but the calculations
become more involved [8],[12],[13, Chapter 8]. Results similar
to those derived here can also be obtained in the case that {w_t}
is replaced by the sum of a Brownian motion and a counting process
[25]. Hypotheses on the process {x_t} and the relationship between
x_t and w_t will be imposed as they are needed in the sequel.
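As a concrete illustration of the model (1), (4), the signal-observation pair can be simulated with an Euler-Maruyama scheme. The drift f, diffusion coefficient G, and sensor z_t = h(x_t) below are illustrative choices, not taken from the paper:

```python
import numpy as np

# Euler-Maruyama simulation of the signal (4) and observation (1).
# f, G, and z_t = h(x_t) are illustrative choices, not from the paper.
rng = np.random.default_rng(0)
f = lambda x: -x          # signal drift
G = lambda x: 1.0         # signal diffusion coefficient
h = lambda x: x           # sensor: z_t = h(x_t)

T, N = 1.0, 1000
dt = T / N
x = np.empty(N + 1)
y = np.empty(N + 1)
x[0], y[0] = 0.5, 0.0
for k in range(N):
    db = rng.normal(0.0, np.sqrt(dt))   # increment of the signal noise {beta_t}
    dw = rng.normal(0.0, np.sqrt(dt))   # increment of the observation noise {w_t}
    x[k + 1] = x[k] + f(x[k]) * dt + G(x[k]) * db
    y[k + 1] = y[k] + h(x[k]) * dt + dw
```

The simulated path {y_t} then plays the role of the observation record available to the filter.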
Finally, we will need two special cases of Itô's differential
rule. Suppose that (ξ^i_t, F_t), i = 1,2, are semimartingales of the
form

ξ^i_t = ξ^i_0 + a^i_t + m^i_t.    (5)

Then the product formula holds:

ξ^1_t ξ^2_t = ξ^1_0 ξ^2_0 + ∫_0^t ξ^1_s dξ^2_s + ∫_0^t ξ^2_s dξ^1_s + <m^1, m^2>_t.    (6a)
ψ(x_t) = ψ(x_0) + Σ_{i=1}^n ∫_0^t (∂ψ/∂x^i)(x_s) dx^i_s + ½ Σ_{i,j=1}^n ∫_0^t (∂²ψ/∂x^i∂x^j)(x_s) d<x^i, x^j>_s.
and (8) is, in abstract form, the backward equation for the
process. Writing it out in integral form and recalling the
definition of T_t gives the Dynkin formula:

E[φ(x_t) | x_0 = x] = φ(x) + E[∫_0^t Lφ(x_s) ds | x_0 = x].

This implies, using the Markov property again, that the process
M_t defined for φ ∈ D(L) by
M_t = φ(x_t) − φ(x_0) − ∫_0^t Lφ(x_s) ds    (10)

is a martingale. Suppose now that the transition function has a
density,

p(s,x,t,B) = ∫_B p(s,x,t,y) dy,    (11)

satisfying
a) for t − s ≥ δ > 0, p(s,x,t,y) is continuous and bounded in s, t,
and x;
b) the partial derivatives ∂p/∂s, ∂p/∂x^i, ∂²p/∂x^i∂x^j exist.
Then for 0 < s < t, p satisfies the Kolmogorov backward equation
in the variables (s,x). Moreover, p satisfies the forward
equation

∂p/∂t (s,x,t,y) = −Σ_{i=1}^n (∂/∂y^i)[f^i(y) p(s,x,t,y)] + ½ Σ_{i,j=1}^n (∂²/∂y^i∂y^j)[a^ij(y) p(s,x,t,y)]
               =: L* p(s,x,t,y),    (14)

where L* is the formal adjoint of L. Also, the initial condition
is lim_{t↓s} p(s,x,t,y) = δ(y − x).
Outline of Proof: Assume, for simplicity of notation, that
{x_t} is a scalar diffusion (n = 1). From (9) we have, integrating
by parts twice,

∫ p(s,x,t,z) g²(z) (∂²φ/∂z²)(z) dz = ∫ φ(z) (∂²/∂z²)[g²(z) p(s,x,t,z)] dz,

hence

∫ { ∂p/∂t (s,x,t,z) + (∂/∂z)[f(z) p(s,x,t,z)] − ½ (∂²/∂z²)[g²(z) p(s,x,t,z)] } φ(z) dz = 0.

Since the expression in curly brackets is continuous and φ(z) is
an arbitrary twice differentiable function vanishing outside a
finite interval, (14) follows.
We note that if x_0 has distribution P_0, then the density of
x_t is p(t,y) = ∫ p(0,x,t,y) P_0(dx), and p(t,y) also satisfies (14).
Conditions for the existence of a density satisfying the
differentiability hypotheses of Theorems 2 and 3 are given in
[24, pp. 96-99] (see also Pardoux [15]).
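The forward equation (14) can also be integrated numerically. A minimal explicit finite-difference sketch for a scalar diffusion follows; the drift f(y) = −y and g ≡ 1 are illustrative choices, not from the paper:

```python
import numpy as np

# Explicit finite-difference integration of the forward equation (14) for a
# scalar diffusion dx = f(x)dt + g(x)d(beta); f(y) = -y, g = 1 are illustrative.
f = lambda y: -y
g2 = lambda y: np.ones_like(y)           # g^2(y)

y = np.linspace(-5.0, 5.0, 201)
dy = y[1] - y[0]
p = np.exp(-0.5 * (y - 1.0) ** 2) / np.sqrt(2.0 * np.pi)   # initial density
dt = 0.2 * dy ** 2                       # small step, inside the stability limit
for _ in range(500):
    drift = -np.gradient(f(y) * p, dy)                       # -d/dy [f(y) p]
    diffusion = np.gradient(np.gradient(g2(y) * p, dy), dy)  # d^2/dy^2 [g^2 p]
    p = p + dt * (drift + 0.5 * diffusion)

mass = p.sum() * dy   # total probability, approximately conserved
```

Since the density is negligible at the grid boundary, the total mass stays close to 1, consistent with (14) being a conservation law for probability.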
m_t = E[m_0] + ∫_0^t η_s dν_s.    (18)

ε̂_t = ε̂_0 + ∫_0^t α̂_s ds + ∫_0^t [γ_s − ε̂_s ẑ_s] dν_s.    (20)

μ_t := ε̂_t − ε̂_0 − ∫_0^t α̂_s ds

E[ε̂_t − ε̂_s | Y_s] = E[μ_t − μ_s | Y_s].    (22)
π_t(φ) = π_0(φ) + ∫_0^t π_s(Lφ) ds + ∫_0^t [π_s(hφ) − π_s(h)π_s(φ)] dν_s.    (26)
x̂_t = x̂_0 + ∫_0^t π_s(f) ds + ∫_0^t [π_s(hx) − π_s(h)x̂_s] dν_s.    (29)

Hence, π_t(f), π_t(hx), and π_t(h) are all necessary for the
computation of x̂_t, etc. One case in which this calculation is
possible is given in the next example.
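Although not discussed in the paper, the closure problem just described is nowadays often side-stepped numerically by a particle (Monte Carlo) approximation of the conditional measure π_t. A minimal bootstrap sketch, with all model choices (f, h, step sizes) illustrative rather than taken from the text:

```python
import numpy as np

# Bootstrap particle approximation of the conditional measure pi_t.
# All model choices (f, h, step sizes) are illustrative, not from the paper.
rng = np.random.default_rng(1)
f = lambda x: -x
h = lambda x: x

Np, N, dt = 500, 200, 0.01
xtrue = 0.5
particles = rng.normal(0.5, 0.3, Np)
for _ in range(N):
    xtrue += f(xtrue) * dt + rng.normal(0.0, np.sqrt(dt))
    dy = h(xtrue) * dt + rng.normal(0.0, np.sqrt(dt))        # observation increment
    particles += f(particles) * dt + rng.normal(0.0, np.sqrt(dt), Np)
    logw = h(particles) * dy - 0.5 * h(particles) ** 2 * dt  # likelihood weights
    w = np.exp(logw - logw.max())
    w /= w.sum()
    particles = particles[rng.choice(Np, size=Np, p=w)]      # resample

xhat = particles.mean()   # Monte Carlo estimate of pi_t(x) = E[x_t | Y_t]
```

Any moment π_t(φ) is then approximated by the empirical average of φ over the particles, avoiding the infinite system of moment equations.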
x̂_t = x̂_0 + ∫_0^t a x̂_s ds + c ∫_0^t [π_s(x²) − x̂_s²] [dy_s − c x̂_s ds]
    = x̂_0 + ∫_0^t a x̂_s ds + c ∫_0^t P_s (dy_s − c x̂_s ds),    (30)

where P_s := π_s(x²) − x̂_s² denotes the conditional variance.
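Equation (30) becomes implementable once the conditional variance P_t is propagated; in the linear-Gaussian case P_t satisfies the Kalman-Bucy Riccati equation. A discretized sketch, with illustrative constants a, c and unit noise intensities (assumptions, not values from the paper):

```python
import numpy as np

# Euler discretization of the filter (30) together with the Riccati equation
# for the conditional variance P_t (linear-Gaussian case, illustrative constants).
rng = np.random.default_rng(2)
a, c = -1.0, 1.0
dt, N = 0.001, 5000

x, xhat, P = 1.0, 0.0, 1.0
for _ in range(N):
    x += a * x * dt + rng.normal(0.0, np.sqrt(dt))           # signal
    dy = c * x * dt + rng.normal(0.0, np.sqrt(dt))           # observation increment
    xhat += a * xhat * dt + c * P * (dy - c * xhat * dt)     # filter equation (30)
    P += (2.0 * a * P + 1.0 - c ** 2 * P ** 2) * dt          # Riccati equation
```

With these constants P converges to the steady-state value √2 − 1 solving 2aP + 1 − c²P² = 0, so the filter gain cP becomes constant, mirroring the finite-dimensional character of this example.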
d(Λ_t π_t(φ)) = Λ_t π_t(L̄φ) dt + Λ_t π_t(hφ) dy_t,    (40)

where

L̄φ(x) = Lφ(x) − ½ h²(x) φ(x)

(see [9],[22]).
Example 4: Under the assumptions of Example 2, we can
derive a stochastic differential equation for q(t,x) := Λ_t p(t,x);
this is interpreted as an unnormalized conditional density, since
then

p(t,x) = q(t,x) / ∫ q(t,x) dx.
Notice that (41) has a much simpler structure than (27): it does
not involve an integral such as π_t(h), and it is a bilinear
stochastic differential equation with {y_t} as its input. This
structure is utilized by a number of papers in this volume.
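The bilinear structure suggests a simple splitting scheme for the density equation: alternate a forward (Fokker-Planck) prediction step with a multiplicative observation update, normalizing only when p(t,x) is needed. The model choices below (f, h, grids) are illustrative assumptions:

```python
import numpy as np

# Splitting sketch for the bilinear equation satisfied by q(t,x): a
# Fokker-Planck prediction step followed by a multiplicative update driven by
# the observation increment dy. Model choices (f, h, grids) are illustrative.
rng = np.random.default_rng(3)
f = lambda x: -x
h = lambda x: x

x = np.linspace(-5.0, 5.0, 201)
dx = x[1] - x[0]
q = np.exp(-0.5 * x ** 2)              # unnormalized initial density
dt = 0.2 * dx ** 2
xtrue = 1.0
for _ in range(200):
    xtrue += f(xtrue) * dt + rng.normal(0.0, np.sqrt(dt))
    dy = h(xtrue) * dt + rng.normal(0.0, np.sqrt(dt))
    pred = -np.gradient(f(x) * q, dx) + 0.5 * np.gradient(np.gradient(q, dx), dx)
    q = q + dt * pred                                      # prediction (L* step)
    q = q * np.exp(h(x) * dy - 0.5 * h(x) ** 2 * dt)       # observation update
p = q / (q.sum() * dx)                 # normalized conditional density p(t,x)
```

Because {y_t} enters only through the multiplicative factor, each update is linear in q, which is exactly the bilinearity that makes the unnormalized form attractive computationally.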
ACKNOWLEDGMENT
The work of S. I. Marcus was supported in part by the U.S.
National Science Foundation under grant ENG-76-11106.
REFERENCES
1. R. E. Kalman, "A new approach to linear filtering and
prediction problems," J. Basic Eng. ASME, 82, 1960,
pp. 33-45.
2. R. E. Kalman and R. S. Bucy, "New results in linear
filtering and prediction theory," J. Basic Engr. ASME
Series D, 83, 1961, pp. 95-108.
3. R. S. Bucy, "Nonlinear filtering," IEEE Trans. Automatic
Control, AC-10, 1965, p. 198.
4. H. J. Kushner, "On the differential equations satisfied by
conditional probability densities of Markov processes," SIAM
J. Control, 2, 1964, pp. 106-119.
5. A. N. Shiryaev, "Some new results in the theory of controlled
stochastic processes [Russian]," Trans. 4th Prague Conference
on Information Theory, Czech. Academy of Sciences, Prague,
1967.
6. R. L. Stratonovich, Conditional Markov Processes and Their
Application to the Theory of Optimal Control. New York:
Elsevier, 1968.
7. W. M. Wonham, "Some applications of stochastic differential
equations to optimal nonlinear filtering," SIAM J. Control,
2, 1965, pp. 347-369.
8. M. Zakai, "On the optimal filtering of diffusion processes,"
Z. Wahr. Verw. Geb., 11, 1969, pp. 230-243.
9. E. Wong, Stochastic Processes in Information and Dynamical
Systems. New York: McGraw-Hill, 1971.