
AN INTRODUCTION TO NONLINEAR FILTERING

M.H.A. Davis
Department of Electrical Engineering
Imperial College, London SW7 2BT, England.

Steven I. Marcus
Department of Electrical Engineering
The University of Texas at Austin
Austin, Texas 78712, U.S.A.

ABSTRACT
In this paper we provide an introduction to nonlinear
filtering from two points of view: the innovations approach
and the approach based upon an unnormalized conditional density.
The filtering problem concerns the estimation of an unobserved
stochastic process {x_t} given observations of a related process
{y_t}; the classic problem is to calculate, for each t, the
conditional distribution of x_t given {y_s, 0 ≤ s ≤ t}. First, a
brief review of key results on martingales and Markov and
diffusion processes is presented. Using the innovations approach,
stochastic differential equations for the evolution of conditional
statistics and of the conditional measure of x_t given {y_s, 0 ≤ s ≤ t}
are given; these equations are the analogs for the filtering
problem of the Kolmogorov forward equations. Several examples
are discussed. Finally, a less complicated evolution equation is
derived by considering an "unnormalized" conditional measure.

M. Hazewinkel and J. C. Willems (eds.), Stochastic Systems: The Mathematics of Filtering and Identification and Applications, 53-75.
Copyright © 1981 by D. Reidel Publishing Company.

I. INTRODUCTION
Filtering problems concern "estimating" something about an
unobserved stochastic process {x t } given observations of a related
process {Yt}; the classic problem is to calculate, for each t,
the conditional distribution of x_t given {y_s, 0 ≤ s ≤ t}. This was
solved in the context of linear system theory by Kalman and Bucy
[1],[2] in 1960-1961, and the resulting "Kalman filter" has of
course enjoyed immense success in a wide variety of applications.
Attempts were soon made to generalize the results to systems with
nonlinear dynamics. This is an essentially more difficult problem,
being in general infinite-dimensional, but nevertheless equations
describing the evolution of conditional distributions were
obtained by several authors in the mid-sixties; for example, Bucy
[3], Kushner [4], Shiryaev [5], Stratonovich [6], Wonham [7]. In
1969 Zakai [8] obtained these equations in substantially simpler
form using the so-called "reference probability" method (see Wong
[9]).

In 1968 Kailath [10] introduced the "innovations approach" to
linear filtering, and the significance for nonlinear filtering was
immediately appreciated [11], namely that the filtering problem
ought to be formulated in the context of martingale theory. The
definitive treatment from this point of view was given in 1972 by
Fujisaki, Kallianpur and Kunita [12]. Textbook accounts including
all the mathematical background can be found in Liptser and
Shiryaev [13] and Kallianpur [14].
More recent work on nonlinear filtering has concentrated on
the following areas (this list and the references are not intended
to be exhaustive):
(i) Rigorous formulation of the theory of stochastic partial
differential equations (Pardoux [15], Krylov and Rozovskii [16]);
(ii) Introduction of Lie algebraic and differential
geometric methods (Brockett [17]);
(iii) Discovery of finite dimensional nonlinear filters
(Benes [18]);
(iv) Development of "robust" or "pathwise" solutions of the
filtering equations (Davis [19]);
(v) Functional integration and group representation methods
(Mitter [30]).
All of these topics are dealt with in this volume and all of
them use the basic equations of nonlinear filtering theory: the
Fujisaki et al. equation [12] and/or the Zakai equation [8].
These equations can be derived in a quick and self-contained way,
modulo some technical results, the statements of which are
readily appreciated and the details of which can be found in the
references [13],[14]. This is the purpose of the present article.


The general problem can be described as follows. The signal
or state process {x t } is a stochastic process which cannot be
observed directly. Information concerning {x t } is obtained from
the observation process {y_t}, which we will assume is given by

    y_t = ∫_0^t z_s ds + w_t,                                  (1)

where {z_t} is a process "related" to {x_t} (e.g., z_t = h(x_t)) and
{w_t} is a Brownian motion process. The process {y_t} is to be
thought of as noisy nonlinear observations of the signal {x_t}.
The objective is to compute least squares estimates of functions
of the signal x_t given the "past" observations {y_s, 0 ≤ s ≤ t} --
i.e., to compute quantities of the form E[φ(x_t) | y_s, 0 ≤ s ≤ t]. In
addition, it is desired that this computation be done recursively
in terms of a statistic {π_t} which can be updated using only new
observations:

    π_t = Φ(t, s, π_s, {y_u, s ≤ u ≤ t}),                      (2)

and from which estimates can be calculated in a "pointwise" or
"memoryless" fashion:

    E[φ(x_t) | y_s, 0 ≤ s ≤ t] = θ(t, y_t, π_t).               (3)

In general, π_t will be closely related to the conditional
distribution of x_t given {y_s, 0 ≤ s ≤ t}, but in certain special
cases π_t will be computable with a finite set of stochastic
differential equations driven by {y_t} (see [20] for some examples).
In order to obtain specific results, additional structure
will be assumed for the process {x_t}; we will assume throughout
that {x_t} is a semimartingale (see Section II), but more detailed
results will be derived under the assumption that {x_t} is a Markov
process or in particular a vector diffusion process of the form

    x_t = x_0 + ∫_0^t f(x_s) ds + ∫_0^t G(x_s) dβ_s,           (4)

where x_t ∈ R^n and β_t ∈ R^m is a vector of independent Brownian
motion processes. General terminology and precise assumptions
will be presented in Section II. In Section III, Markov processes
of the form (4) will be studied, and Kolmogorov's equations for
the evolution of the unconditional distribution (i.e., without
observations) of the process {x_t} will be presented. The
corresponding equations for the conditional distribution of x_t
given {y_s, 0 ≤ s ≤ t} will be derived in Section IV using the
"innovations approach". Finally, in Section V we derive a less
complex set of equations for an unnormalized conditional
distribution of x_t, in the form given by Zakai [8].
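The signal model (4) and observation model (1) are straightforward to simulate. The following minimal sketch (the scalar drift f(x) = -x, diffusion G ≡ 1, and observation function h(x) = x are illustrative assumptions, not choices made in the paper) generates a sample path of the signal and its noisy observations by the Euler-Maruyama method:

```python
import numpy as np

def simulate(f, g, h, x0, T=1.0, n=1000, seed=0):
    """Euler-Maruyama discretization of the signal (4) and observation (1):
       dx = f(x) dt + g(x) dbeta,   dy = h(x) dt + dw."""
    rng = np.random.default_rng(seed)
    dt = T / n
    x = np.empty(n + 1)
    y = np.empty(n + 1)
    x[0], y[0] = x0, 0.0
    for k in range(n):
        db, dw = rng.normal(0.0, np.sqrt(dt), size=2)
        x[k + 1] = x[k] + f(x[k]) * dt + g(x[k]) * db   # signal step of (4)
        y[k + 1] = y[k] + h(x[k]) * dt + dw             # observation step of (1)
    return x, y

# Illustrative choices: Ornstein-Uhlenbeck signal, linear observation h(x) = x.
x, y = simulate(f=lambda x: -x, g=lambda x: 1.0, h=lambda x: x, x0=0.0)
```

Such simulated pairs (x, y) are what a filtering algorithm is tested against: only y is shown to the filter, and its estimates are compared with the hidden path x.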

II. TERMINOLOGY AND ASSUMPTIONS


In this section we review certain notions concerning
stochastic processes and martingales; for further tutorial
material on martingale integrals and stochastic calculus, the
reader is referred to the tutorial of R. Curtain in this volume
and the paper of Davis [21] (see also [9],[13],[22]-[24]). All
stochastic processes will be defined on a fixed probability space
(Ω,F,P) and a finite time interval [0,T], on which there is
defined an increasing family of σ-fields {F_t, 0 ≤ t ≤ T}. It is
assumed that each process {x_t} is adapted to F_t -- i.e., x_t is F_t-
measurable for all t. The σ-field generated by {x_s, 0 ≤ s ≤ t} is
denoted by X_t = σ{x_s, 0 ≤ s ≤ t}. (x_t,F_t) is a martingale if x_t is
adapted to F_t, E|x_t| < ∞, and E[x_t | F_s] = x_s for t ≥ s. (x_t,F_t) is
a supermartingale if E[x_t | F_s] ≤ x_s and a submartingale if
E[x_t | F_s] ≥ x_s. The process (x_t,F_t) is a semimartingale if it has
a decomposition x_t = x_0 + a_t + m_t, where (m_t,F_t) is a martingale and
{a_t} is a process of bounded variation. Given two square
integrable martingales (m_t,F_t) and (n_t,F_t), one can define the
predictable quadratic covariation (<m,n>_t, F_t) to be the unique
"predictable process of integrable variation" such that
(m_t n_t - <m,n>_t, F_t) is a martingale [29, p.34]. For the purposes of
this paper, however, the only necessary facts concerning <m,n>
are that (a) <m,n>_t = 0 if m_t n_t is a martingale; and (b) if β is a
standard Brownian motion process, then

    <β,β>_t = t   and   < ∫_0^t η¹_s dβ_s , ∫_0^t η²_s dβ_s >_t = ∫_0^t η¹_s η²_s ds.

In this tutorial exposition, the following hypotheses will
be assumed for all nonlinear estimation problems:
H1. {y_t} is a real-valued process;
H2. {w_t} is a standard Brownian motion process;
H3. E[∫_0^T z_s² ds] < ∞;
H4. {z_t} is independent of {w_t}.
Hypotheses (H1) and (H4) can be weakened, but the calculations
become more involved [8],[12],[13, Chapter 8]. Similar results
to those derived here can also be derived in the case that {w_t}
is replaced by the sum of a Brownian motion and a counting process
[25]. Hypotheses on the process {x_t} and the relationship between
x_t and w_t will be imposed as they are needed in the sequel.
Finally, we will need two special cases of Ito's differential
rule. Suppose that (ξ^i_t, F_t), i = 1,2, are semimartingales of the
form

    ξ^i_t = ξ^i_0 + a^i_t + m^i_t,                             (5)

where {m^i_t}, i = 1,2, are square integrable martingales
and {a^i_t} are sample continuous. Then

    ξ¹_t ξ²_t = ξ¹_0 ξ²_0 + ∫_0^t ξ¹_s dξ²_s + ∫_0^t ξ²_s dξ¹_s + <m¹,m²>_t.   (6a)

Also, if ψ is a twice continuously differentiable function of a
process x of the form (4), then

    ψ(x_t) = ψ(x_0) + Σ_{i=1}^n ∫_0^t (∂ψ/∂x^i)(x_s) dx^i_s
             + (1/2) Σ_{i,j=1}^n ∫_0^t a^{ij}(x_s) (∂²ψ/∂x^i∂x^j)(x_s) ds,   (6b)

where A(x) = [a^{ij}(x)] := G(x)G'(x) and x^i denotes the ith component
of x.

III. MARKOV AND DIFFUSION PROCESSES


A very clear account of the material in this section can be
found in Wong's book [9]. A stochastic process {x_t, t ∈ [0,T]} is
a Markov process if for any 0 ≤ s ≤ t ≤ T and any Borel set B of the
state space S,

    P(x_t ∈ B | X_s) = P(x_t ∈ B | x_s).

For any Markov process {x_t}, we can define the transition
probability function

    P(s,x,t,B) := P(x_t ∈ B | x_s = x),

which can easily be shown to satisfy the Chapman-Kolmogorov
equation: for any 0 ≤ s ≤ u ≤ t ≤ T,

    P(s,x,t,B) = ∫_S P(u,y,t,B) P(s,x,u,dy).                   (7)

In addition, all finite dimensional distributions of a Markov
process are determined by its initial distribution and transition
probability function. A Markov process {x_t} is homogeneous if
P(s+u,x,t+u,B) = P(s,x,t,B) for all 0 ≤ s ≤ t ≤ T and 0 ≤ s+u ≤ t+u ≤ T.
For a homogeneous Markov process {x_t} and f ∈ B(S) (i.e., f is
a bounded measurable real-valued function on S), define

    T_t f(x) = E_x[f(x_t)] := ∫_S f(y) P(0,x,t,dy).

The Chapman-Kolmogorov equation then implies that T_t is a
semigroup of operators acting on B(S); i.e., T_{t+s} f(x) = T_t(T_s f)(x)
for t,s ≥ 0. The generator L of T_t (or, of {x_t}) is the operator
acting on a domain D(L) ⊂ B(S) given by

    Lφ = lim_{t↓0} (1/t)(T_t φ - φ),

the limit being uniform in x ∈ S and D(L) consisting of all
functions such that this limit exists. It is immediate from this
and the semigroup property that

    (d/dt) T_t φ = L T_t φ = T_t Lφ,                           (8)

and (8) is, in abstract form, the backward equation for the
process. Writing it out in integral form and recalling the
definition of T_t gives the Dynkin formula:

    E_x[φ(x_t)] - φ(x) = E_x ∫_0^t Lφ(x_s) ds.                 (9)

This implies, using the Markov property again, that the process
M_t defined for φ ∈ D(L) by

    M_t = φ(x_t) - φ(x_0) - ∫_0^t Lφ(x_s) ds                   (10)

is a martingale [26, p.4]. This property can be used as a
definition of L; this is the approach pioneered by Stroock and
Varadhan [26]. Then L is known as the extended generator of {x_t},
since there may be functions φ for which M_t is a martingale but
which are not in D(L) as previously defined.
There is another semigroup of operators associated with {x_t},
namely the operators which transfer the initial distribution of
the process into the distributions at later times t. More
precisely, let M(S) be the set of probability measures on S and
denote

    <φ,μ> = ∫_S φ(x) μ(dx)

for φ ∈ B(S), μ ∈ M(S). Suppose x_0 has distribution π ∈ M(S); then
the distribution of x_t is given by

    U_t π(A) = P[x_t ∈ A] = E(I_A(x_t)) = <T_t I_A, π>.

This shows that U_t is adjoint to T_t in that

    <φ, U_t π> = <T_t φ, π>   (= Eφ(x_t))

for φ ∈ B(S), π ∈ M(S). Thus the generator of U_t is L*, the adjoint
of L, and π_t := U_t π satisfies

    (d/dt) π_t = L* π_t.                                       (11)

This is the forward equation of x_t in that it describes the
evolution of the distribution π_t of x_t. The objective of
filtering theory is to obtain a similar description of the
conditional distribution of x_t given {y_s, s ≤ t}.
In order to get these results in more explicit form we
consider in the remainder of this section a process {x_t}
satisfying a stochastic differential equation of the form (4),
where {β_t} is an R^m-valued standard Brownian motion process
independent of x_0. For simplicity we assume that f and G do not
depend explicitly on t (this is no loss of generality, since the
"process" τ(t) = t can be accommodated by augmenting (4) with the
equation dτ/dt = 1, τ(0) = 0). Under the usual Lipschitz and
growth assumptions which guarantee existence and uniqueness of
(strong) solutions of (4), the following results can be proved
[9],[22]-[24].
Theorem 1: The solution of (4) is a homogeneous Markov
process with infinitesimal generator

    L = Σ_{i=1}^n f^i(x) ∂/∂x^i + (1/2) Σ_{i,j=1}^n a^{ij}(x) ∂²/∂x^i∂x^j,   (12)

where A(x) = [a^{ij}(x)] := G(x)G'(x), and f^i and x^i denote the ith
components of f and x, respectively.
Hence Ito's rule (6b) in this case can be written as

    ψ(x_t) = ψ(x_0) + ∫_0^t Lψ(x_s) ds + ∫_0^t ∇ψ'(x_s) G(x_s) dβ_s,

emphasizing again that M_t (see (10)) is a martingale (here ∇ψ is
the gradient of ψ with respect to x, expressed as a column vector).
It can also be shown [24] that the solution of (4) satisfies the
Feller and strong Markov properties, and is a diffusion process
with drift vector f and diffusion matrix A. If this process has
a smooth density then the abstract equations (8) and (11)
translate into Kolmogorov's backward and forward equations for
the transition density.
Theorem 2 [24, p.104]: Assume that the solution {x_t} of (4)
has a transition density:

    P(s,x,t,B) = ∫_B p(s,x,t,y) dy

satisfying
a) for t-s ≥ δ > 0, p(s,x,t,y) is continuous and bounded in s, t,
and x;
b) the partial derivatives ∂p/∂s, ∂p/∂x^i, ∂²p/∂x^i∂x^j exist.
Then for 0 ≤ s < t, p satisfies the Kolmogorov backward equation

    (∂/∂s) p(s,x,t,y) + L p(s,x,t,y) = 0                       (13)

with lim_{s↑t} p(s,x,t,y) = δ(x-y) and L given by (12); i.e., p is the
fundamental solution of (13).
Outline of Proof: From (7), we have

    p(s+h,x,t,y) - p(s,x,t,y)
      = ∫ p(s,x,s+h,z) [p(s+h,x,t,y) - p(s+h,z,t,y)] dz.

Dividing both sides by h and letting h → 0 yields (13) by using
the definition of L.
More relevant to filtering problems is the Kolmogorov
forward equation.
Theorem 3 [24, p.102]: Assume that {x_t} satisfying (4) has
a transition density p(s,x,t,y), and that the partial derivatives
∂p/∂t, ∂(f^i p)/∂y^i, and ∂²(a^{ij} p)/∂y^i∂y^j exist. Then for
0 ≤ s < t, p satisfies the Kolmogorov forward equation

    (∂p/∂t)(s,x,t,y) = - Σ_{i=1}^n (∂/∂y^i)(f^i(y) p(s,x,t,y))
                       + (1/2) Σ_{i,j=1}^n (∂²/∂y^i∂y^j)(a^{ij}(y) p(s,x,t,y))   (14)
                     := L* p(s,x,t,y),

where L* is the formal adjoint of L. Also, the initial condition
is lim_{t↓s} p(s,x,t,y) = δ(y-x).
Outline of Proof: Assume, for simplicity of notation, that
{x_t} is a scalar diffusion (n=1). From (9), we have

    (∂/∂t) ∫ p(s,x,t,z) φ(z) dz = ∫ p(s,x,t,z) Lφ(z) dz        (15)

for some twice continuously differentiable function φ which vanishes
outside some finite interval. The derivative and integral on the
left-hand side of (15) can be interchanged, and an integration by
parts then yields

    ∫ p(s,x,t,z) f(z) (∂φ/∂z)(z) dz = - ∫ φ(z) (∂/∂z)(f(z) p(s,x,t,z)) dz,

    ∫ p(s,x,t,z) g²(z) (∂²φ/∂z²)(z) dz = ∫ φ(z) (∂²/∂z²)(g²(z) p(s,x,t,z)) dz;

hence

    ∫ { (∂p/∂t)(s,x,t,z) + (∂/∂z)[f(z) p(s,x,t,z)]
        - (1/2)(∂²/∂z²)[g²(z) p(s,x,t,z)] } φ(z) dz = 0.

Since the expression in curly brackets is continuous and φ(z) is
an arbitrary twice differentiable function vanishing outside a
finite interval, (14) follows.
We note that if x_0 has distribution P_0, then the density of
x_t is p(t,y) = ∫ p(0,x,t,y) P_0(dx), and p(t,y) also satisfies (14).
Conditions for the existence of a density satisfying the
differentiability hypotheses of Theorems 2 and 3 are given in
[24, pp.96-99] (see also Pardoux [15]).
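The forward equation (14) can be integrated numerically on a grid. The sketch below is a minimal illustration only: the Ornstein-Uhlenbeck coefficients f(x) = -x, g ≡ 1 and the periodic finite-difference treatment are assumptions made for simplicity. The central-difference form conserves total probability mass exactly, mirroring the fact that L* preserves ∫ p dy:

```python
import numpy as np

def forward_step(p, f, g2, dx, dt):
    """One explicit Euler step of the scalar Kolmogorov forward equation (14):
       dp/dt = -d(f p)/dx + (1/2) d^2(g^2 p)/dx^2   (periodic grid)."""
    flux = f * p
    drift = -(np.roll(flux, -1) - np.roll(flux, 1)) / (2 * dx)
    diff = g2 * p
    diffusion = (np.roll(diff, -1) - 2 * diff + np.roll(diff, 1)) / dx**2
    return p + dt * (drift + 0.5 * diffusion)

# Illustrative coefficients: f(x) = -x, g = 1 (Ornstein-Uhlenbeck).
xs = np.linspace(-5, 5, 201)
dx = xs[1] - xs[0]
p = np.exp(-xs**2)            # unnormalized Gaussian initial density
p /= p.sum() * dx             # normalize so that sum(p) * dx = 1
for _ in range(200):
    p = forward_step(p, -xs, np.ones_like(xs), dx, dt=1e-4)
```

Because the drift and diffusion terms are written as discrete divergences, summing either over the periodic grid gives zero, so the scheme preserves the normalization of p at every step.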

IV. THE INNOVATIONS APPROACH TO NONLINEAR FILTERING


In this section we derive stochastic differential equations
for the evolution of conditional statistics and of the conditional
density for nonlinear filtering problems of the types discussed
in Sections I and II; the equations will be the analogs of (9)
and the Kolmogorov forward equation for the filtering problem.
We will follow the innovations approach, as presented in [12] and
[13]; this approach was originally suggested by Kailath [10] (for
linear filtering) and Frost and Kailath [11].
Assume that the observations have the form (1) and that
(H1)-(H4) hold. Define Y_t := σ{y_s, 0 ≤ s ≤ t}; for any process n_t
we use the notation n̂_t := E[n_t | Y_t]. Now introduce the innovations
process:

    ν_t := y_t - ∫_0^t ẑ_s ds.                                 (16)

The incremental innovations ν_{t+h} - ν_t represent the "new
information" concerning the process {z_t} available from the
observations between t and t+h, in the sense that ν_{t+h} - ν_t is
independent of Y_t. The following properties of the innovations
are crucial.
Lemma 1: The process (ν_t, Y_t) is a standard Brownian motion
process. Furthermore, Y_s and σ{ν_u - ν_t, 0 ≤ s ≤ t < u ≤ T} are
independent.
Proof: From (16) we have for s < t,

    E[ν_t | Y_s] = ν_s + E[∫_s^t (z_u - ẑ_u) du + w_t - w_s | Y_s].   (17)

The second term on the right-hand side of (17) is zero; here we
have used the fact that w_t - w_s is independent of Y_s. Hence
(ν_t, Y_t) is a martingale. Consider now the quadratic variation
of {ν_t}: for t ∈ [0,T] fix an integer n and define

    Q^n_t = Σ_{0 ≤ k < 2^n t} [ν((k+1)/2^n) - ν(k/2^n)]².

The almost sure limit (as n → ∞) of Q^n_t =: Q_t is the quadratic
variation of ν_t. It is easy to see that the quadratic variation
of ∫_0^t (z_u - ẑ_u) du is zero, so that the quadratic variation of ν_t
is the same as that of w_t, or Q_t = t. But by a theorem of Doob
[12, Lemma 2.1], a square integrable martingale with continuous
sample paths and quadratic variation t is a standard Brownian
motion, and the lemma follows.
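The quadratic-variation step in this proof can be checked numerically: the absolutely continuous term in (16) contributes nothing to the sum of squared increments, so the discrete quadratic variation of y_t given by (1) converges to t whatever the bounded signal is. A small sketch (the sinusoidal choice of z is purely illustrative):

```python
import numpy as np

# Numerical check of the quadratic-variation argument in Lemma 1: the
# drift part of (1)/(16) has zero quadratic variation, so the sum of
# squared increments of y_t is close to t for fine partitions.
rng = np.random.default_rng(1)
T, n = 1.0, 2**16
dt = T / n
z = np.sin(np.linspace(0.0, T, n))      # any bounded "signal" z_t
dw = rng.normal(0.0, np.sqrt(dt), n)    # Brownian increments
dy = z * dt + dw                        # increments of the observation (1)
qv = np.sum(dy**2)                      # discrete quadratic variation Q_T
```

Here qv is close to T = 1: the cross term 2 Σ z dt dw and the term Σ z² dt² both vanish as the partition is refined, leaving only Σ (dw)² ≈ T.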
Notice that the very specific conclusions of Lemma 1
regarding the structure of the innovations process are valid
without any restrictions on the distributions of z_t. The next
lemma is related to Kailath's "innovations conjecture". By
definition ν_t is Y_t-measurable and σ{ν_s, 0 ≤ s ≤ t} ⊂ Y_t. The
innovations conjecture is that Y_t ⊂ σ{ν_s, 0 ≤ s ≤ t}, and hence that
the two σ-fields are equal; i.e., the observations and innovations
processes contain the same information. At the time that [12]
was written, the answer to this question was not known under very
general conditions on {z_t}; recently, it has been shown in [27]
that the conjecture is true under the conditions (H1)-(H4). It
is a well-known fact [13, Theorem 5.6] that all martingales of
Brownian motion are stochastic integrals, and the point of a
positive answer to the innovations conjecture is that it enables
any Y_t-martingale to be written as a stochastic integral with
respect to the innovations process {ν_t}. The essential
contribution of Fujisaki, Kallianpur and Kunita [12] was to show
that this representation holds whether or not the innovations
conjecture is valid. Specifically, they showed:
Lemma 2: Every square integrable martingale (m_t, Y_t) with
respect to the observation σ-fields Y_t is sample continuous and
has the representation

    m_t = E[m_0] + ∫_0^t η_s dν_s,                             (18)

where ∫_0^T E[η_s²] ds < ∞ and {η_t} is jointly measurable and adapted
to Y_t. In other words, m_t can be written as a stochastic integral
with respect to the innovations process. (But note that {η_t} is
adapted to Y_t and not necessarily to σ{ν_s, 0 ≤ s ≤ t}.)
In order to obtain a general filtering equation, let us
consider a real-valued F_t-semimartingale ξ_t and derive an equation
satisfied by ξ̂_t. We have in mind semimartingales φ(x_t) where φ
is some smooth real-valued function and {x_t} is the signal process,
but it is just as easy to consider a general semimartingale of the
form

    ξ_t = ξ_0 + ∫_0^t α_s ds + n_t,                            (19)

where (n_t, F_t) is a martingale.
Theorem 4: Assume that {ξ_t} and {y_t} are given by (19) and
(1), respectively, and that <n,w>_t = 0. Then {ξ̂_t} satisfies the
stochastic differential equation

    ξ̂_t = ξ̂_0 + ∫_0^t α̂_s ds + ∫_0^t [(ξ_s z_s)^ - ξ̂_s ẑ_s] dν_s,   (20)

where (ξ_s z_s)^ := E[ξ_s z_s | Y_s].
Proof: First we define

    μ_t := ξ̂_t - ξ̂_0 - ∫_0^t α̂_s ds
and show that (μ_t, Y_t) is a martingale. Now, for s < t,

    E[ξ̂_t - ξ̂_s | Y_s] = E[ξ_t - ξ_s | Y_s]
      = E[∫_s^t α_u du | Y_s] + E[n_t - n_s | Y_s]
      = E[∫_s^t E[α_u | Y_u] du | Y_s] + E[E[n_t - n_s | F_s] | Y_s].   (21)

The last term in (21) is zero, since (n_t, F_t) is a martingale;
thus (21) proves that (μ_t, Y_t) is a martingale. Hence,

    ξ̂_t = ξ̂_0 + ∫_0^t α̂_s ds + ∫_0^t η_s dν_s,                (22)

where the last term in (22) follows from Lemma 2.


It remains only to identify the precise form of η_t, using
Ito's differential rule (6a) and an idea introduced by Wong [28].
From (1) and (19), and since <n,w>_t = 0,

    ξ_t y_t = ξ_0 y_0 + ∫_0^t ξ_s (z_s ds + dw_s) + ∫_0^t y_s (α_s ds + dn_s).   (23)

Also, from (16) and (22),

    ξ̂_t y_t = ξ̂_0 y_0 + ∫_0^t ξ̂_s (ẑ_s ds + dν_s) + ∫_0^t y_s (α̂_s ds + η_s dν_s) + ∫_0^t η_s ds.   (24)

Now it follows immediately from properties of conditional
expectations that for t ≥ s,

    E[ξ_t y_t - ξ̂_t y_t | Y_s] = 0.

Calculating this from (23),(24) we see that

    η_t = (ξ_t z_t)^ - ξ̂_t ẑ_t.                                (25)

Inserting (25) into (22) gives the desired result (20).


Formula (20) is not very useful as it stands (it is not a
recursive equation for ξ̂_t), but we can use it to obtain more
explicit results for filtering of Markov processes.
Theorem 5: Assume that {x_t} is a homogeneous Markov process
with infinitesimal generator L, that {y_t} is given by (1) with
z_t = h(x_t), and that {x_t} and {w_t} are independent. Then for any
φ ∈ D(L), π_t(φ) := E[φ(x_t) | Y_t] satisfies

    π_t(φ) = π_0(φ) + ∫_0^t π_s(Lφ) ds + ∫_0^t [π_s(hφ) - π_s(h) π_s(φ)] dν_s.   (26)

Proof: Notice that (M_t, F_t) (see (10)) is a martingale, so
that ξ_t := φ(x_t) is of the form (19) with α_t := Lφ(x_t), n_t := M^φ_t.
Also, it is shown in [12, Lemma 4.2] that the independence of
{x_t} and {w_t} implies <M^φ, w>_t = 0. The theorem then follows
immediately from Theorem 4.
Remarks: (i) Since {π_t(φ): φ ∈ D(L)} determines a measure-
valued stochastic process π_t, (26) can be regarded as a recursive
(infinite-dimensional) stochastic differential equation for the
conditional measure π_t of x_t given Y_t, and π_t(φ) is a conditional
statistic computed from π_t in a memoryless fashion (see (2)-(3)).
In general, however, it is not possible to derive a finite
dimensional recursive filter, even for the conditional mean x̂_t;
some special cases in which finite dimensional recursive filters
exist are given in Examples 1 and 3 below.
(ii) If w_t in (1) were multiplied by r^{1/2} with r > 0, one would
suspect that as r → ∞ the observations would become infinitely
noisy, thus giving no information about the state; i.e., π_t(φ)
would reduce to the unconditional expectation E[φ(x_t)]. In fact,
in this case the last term in (26) is multiplied by r^{-1}, so (26)
reduces to (9) as r → ∞.
Example 1 [7]: Let {x_t} be a finite state Markov process
taking values in S = {s_1, ..., s_N}. Let p^i_t be the probability that
x_t = s_i, and assume that p_t := [p^1_t, ..., p^N_t]' satisfies

    (d/dt) p_t = A p_t.

(This is the forward equation for {x_t}; cf. (11).) Given the
observations (1), the conditional distribution of x_t given Y_t can
be determined from (26) as follows. Let φ(x) = [φ_1(x), ..., φ_N(x)]',
where

    φ_i(x) = 1,  x = s_i
           = 0,  x ≠ s_i.

Then applying (26) to each φ_i yields the following: let
B = diag(h(s_1), ..., h(s_N)) and let b = [h(s_1), ..., h(s_N)]'. Then
if p̂^i_t = P[x_t = s_i | Y_t] and p̂_t = [p̂^1_t, ..., p̂^N_t]', we have

    p̂_t = p̂_0 + ∫_0^t A p̂_s ds + ∫_0^t [B - (b' p̂_s) I] p̂_s (dy_s - (b' p̂_s) ds).

In this case, the conditional distribution is determined
recursively by N stochastic differential equations.
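A discrete-time sketch of this finite-state filter for an assumed two-state chain (N = 2; the rates and the values h(s_i) are chosen purely for illustration) is given below. The chain and the observation increments of (1) are simulated, and the N equations above are advanced by an Euler step; the re-normalization after each step is a practical stabilization of the discretization, not part of the continuous-time equation:

```python
import numpy as np

rng = np.random.default_rng(2)
dt, n = 1e-3, 2000
A = np.array([[-1.0, 2.0],        # generator with dp/dt = A p
              [1.0, -2.0]])       # (columns sum to zero)
h_vals = np.array([0.0, 1.0])     # assumed observation values h(s_1), h(s_2)
B = np.diag(h_vals)

state = 0                         # simulated true chain state (index)
p = np.array([0.5, 0.5])          # conditional distribution estimate
for _ in range(n):
    # jump of the true chain: leave current state with rate -A[i,i]
    if rng.random() < -A[state, state] * dt:
        state = 1 - state
    # observation increment of (1) with z_t = h(x_t)
    dy = h_vals[state] * dt + rng.normal(0.0, np.sqrt(dt))
    # Euler step of the filter: dp = A p dt + [B - (b'p)I] p (dy - (b'p) dt)
    hbar = h_vals @ p
    p = p + (A @ p) * dt + (B @ p - hbar * p) * (dy - hbar * dt)
    p = np.clip(p, 0.0, None)
    p /= p.sum()                  # re-normalize after the Euler step
```

After the loop, p approximates the conditional distribution of the current state given the simulated observation path.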
Example 2: Assume that {x_t} is a diffusion process given by
(4) with infinitesimal generator (12) and that the conditional
distribution of x_t given Y_t has a density p(t,x). Then under
appropriate differentiability hypotheses [13, Theorem 8.6], one
can do an integration by parts in (26) (precisely as in Theorem 3
above) to obtain the stochastic partial differential equation

    dp(t,x) = L* p(t,x) dt + p(t,x) [h(x) - π_t(h)] dν_t,      (27)

where

    π_t(h) = ∫ h(x) p(t,x) dx.                                 (28)

This is a recursive equation for the computation of p(t,x); it is
not only infinite dimensional but has a complicated structure due
to the presence of the integral in (28). Equation (27) is the
analog of the Kolmogorov forward equation; in fact, (27) reduces
to (14) as the observation noise approaches ∞ (see Remark (ii)).

The conditional mean cannot in general be computed with a
finite dimensional recursive filter, as is seen by letting
φ(x) = x in (26):

    x̂_t = x̂_0 + ∫_0^t π_s(f) ds + ∫_0^t [π_s(hx) - π_s(h) x̂_s] dν_s.   (29)

Hence, π_t(f), π_t(hx), and π_t(h) are all necessary for the
computation of x̂_t, etc. One case in which this calculation is
possible is given in the next example.

Example 3: Let {x_t} be the scalar linear diffusion

    x_t = x_0 + ∫_0^t a x_s ds + b β_t

(i.e., (4) with f(x) = ax and G = b), and let the observations (1)
have z_t = c x_t, where x_0 is Gaussian and independent of {w_t} and
{β_t}. Then (29) yields

    x̂_t = x̂_0 + ∫_0^t a x̂_s ds + c ∫_0^t [π_s(x²) - x̂_s²] [dy_s - c x̂_s ds]
        = x̂_0 + ∫_0^t a x̂_s ds + c ∫_0^t P_s (dy_s - c x̂_s ds),   (30)

where P_t := E[(x_t - x̂_t)² | Y_t] is the conditional error covariance.
However, since {x_t} and {y_t} are jointly Gaussian, P_t is
nonrandom and constitutes a "gain" process which can be
precomputed and stored. P_t satisfies the differential equation
(derived from (26) by noticing that the third central moment of
a Gaussian distribution is zero):

    (d/dt) P_t = 2a P_t + b² - c² P_t².

Since P_t is nonrandom and the differential equation for x̂_t
involves no other conditional statistics, it constitutes a
recursive one-dimensional filter (the Kalman-Bucy filter) for
the computation of the conditional mean.
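The one-dimensional filter (30) together with the Riccati equation for P_t is easy to discretize. The sketch below (parameter values a = -1, b = 0.5, c = 1 are assumed for illustration) simulates the linear model and runs the Euler-discretized Kalman-Bucy filter; note that the Riccati equation is data-independent, so P_t can indeed be precomputed:

```python
import numpy as np

def kalman_bucy(a, b, c, ys, dt, x0=0.0, P0=1.0):
    """Euler discretization of (30) and of dP/dt = 2aP + b^2 - c^2 P^2
       for the linear model dx = a x dt + b dbeta, dy = c x dt + dw."""
    xhat, P = x0, P0
    for dy in ys:
        xhat = xhat + a * xhat * dt + c * P * (dy - c * xhat * dt)
        P = P + (2 * a * P + b**2 - c**2 * P**2) * dt
    return xhat, P

# Simulate the (assumed) linear signal and its observations.
rng = np.random.default_rng(3)
a, b, c, dt, n = -1.0, 0.5, 1.0, 1e-3, 5000
x, ys = 0.0, []
for _ in range(n):
    ys.append(c * x * dt + rng.normal(0.0, np.sqrt(dt)))   # dy increment
    x = x + a * x * dt + b * rng.normal(0.0, np.sqrt(dt))  # signal step

xhat, P = kalman_bucy(a, b, c, ys, dt)
```

With these parameters the Riccati equation has the steady state 2aP + b² - c²P² = 0, i.e. P* = (√5 - 2)/2, which the computed P approaches.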

V. THE UNNORMALIZED EQUATIONS

Throughout this section it will be assumed that {x_t} is a
homogeneous Markov process with infinitesimal generator L, {y_t}
is given by (1) with z_t = h(x_t), and {x_t} and {w_t} are
independent. In this case, the conditional measure π_t satisfies
the equation (26), but it is often more convenient to work with
a less complicated equation which is obtained by considering an
"unnormalized" version of π_t. The unnormalized equations are
derived in [9, Chapter 6] and [8]; the use of measure
transformations will follow these references, but we will use a
shorter derivation of the unnormalized equations, via (26) and
Ito's rule.
The first step is to define a new measure P_0 on the
measurable space (Ω,F) by

    P_0(A) = ∫_A Λ_T^{-1} dP

for all A ∈ F, where

    Λ_T^{-1} = exp( -∫_0^T h(x_s) dy_s + (1/2) ∫_0^T h²(x_s) ds )

is the Radon-Nikodym derivative of P_0 with respect to P.
Lemma 3 [9, p.232]: P_0 has the following properties:
(a) P_0 is a probability measure -- i.e., P_0(Ω) = 1;
(b) Under P_0, {y_t} is a standard Brownian motion;
(c) Under P_0, {x_t} and {y_t} are independent;
(d) {x_t} has the same distributions under P_0 as under P;
(e) P is absolutely continuous with respect to P_0 with Radon-
Nikodym derivative

    dP/dP_0 = Λ_T = exp( ∫_0^T h(x_s) dy_s - (1/2) ∫_0^T h²(x_s) ds ).

It can also be shown [13, Section 6.2] that

    Λ_t = exp( ∫_0^t h(x_s) dy_s - (1/2) ∫_0^t h²(x_s) ds )

is a martingale with respect to F_t and P_0, so that

    Λ_t = E_0[ dP/dP_0 | F_t ],

where E_0 is the expectation with respect to P_0. It can be shown
[9, p.234] that

    π_t(φ) = E[φ(x_t) | Y_t] = E_0[Λ_t φ(x_t) | Y_t] / E_0[Λ_t | Y_t] =: σ_t(φ) / σ_t(1).   (31)

Hence conditional statistics of x_t given Y_t, in terms of the
original measure P, can be calculated in terms of conditional
statistics under the measure P_0. We now proceed to derive a
recursive equation for the measure σ_t; an approach to solving
(31) by a path integration of the numerator and denominator is
pursued in some other papers in this volume.
Since σ_t(φ) = σ_t(1) · π_t(φ) and we have the equation (26)
for π_t(φ), an equation for σ_t(φ) is derived by finding a
stochastic differential equation for σ_t(1) := E_0[Λ_t | Y_t] and
applying Ito's rule.
Lemma 4: E_0[Λ_t | Y_t] is given by the formula

    E_0[Λ_t | Y_t] = exp( ∫_0^t π_s(h) dy_s - (1/2) ∫_0^t π_s²(h) ds ).   (32)

Proof: By Ito's rule, Λ_t satisfies

    Λ_t = 1 + ∫_0^t Λ_s h(x_s) dy_s.                           (33)

It follows as in the proof of Theorem 4 that Λ̂_t := E_0[Λ_t | Y_t]
is a martingale with respect to Y_t. Since {y_t} is a Brownian
motion under P_0, there must exist a Y_t-adapted process {η_t} such
that [13, Theorem 5.6]

    Λ̂_t = 1 + ∫_0^t η_s dy_s.                                  (34)

We identify η_t by the same technique as in Theorem 4: from (33)
and Ito's rule,

    Λ_t y_t = ∫_0^t Λ_s dy_s + ∫_0^t y_s Λ_s h(x_s) dy_s + ∫_0^t Λ_s h(x_s) ds.   (35)

From (34) and Ito's rule,

    Λ̂_t y_t = ∫_0^t Λ̂_s dy_s + ∫_0^t y_s η_s dy_s + ∫_0^t η_s ds.   (36)

Now E_0[Λ_t y_t - Λ̂_t y_t | Y_s] = 0 for t ≥ s, and calculating this from
(35) and (36) yields

    η_t = E_0[Λ_t h(x_t) | Y_t].

But from (31),

    E_0[Λ_t h(x_t) | Y_t] = Λ̂_t π_t(h),                        (37)

so (34) becomes

    Λ̂_t = 1 + ∫_0^t Λ̂_s π_s(h) dy_s.                           (38)

However, this has the unique solution (32), and the lemma is
proved.



Theorem 6: For any φ ∈ D(L), σ_t(φ) satisfies

    σ_t(φ) = σ_0(φ) + ∫_0^t σ_s(Lφ) ds + ∫_0^t σ_s(hφ) dy_s.   (39)

Proof: By Ito's rule, we have from (26) and (38):

    d(Λ̂_t π_t(φ)) = Λ̂_t [π_t(Lφ) dt + (π_t(hφ) - π_t(h) π_t(φ))(dy_t - π_t(h) dt)]
                    + π_t(φ) [Λ̂_t π_t(h) dy_t] + [π_t(hφ) - π_t(h) π_t(φ)] Λ̂_t π_t(h) dt
                  = Λ̂_t π_t(Lφ) dt + Λ̂_t π_t(hφ) dy_t,

which gives (39) since σ_t(φ) = Λ̂_t π_t(φ).


The remarks following Theorem 5 are also applicable here.
In addition, we note that the Stratonovich version of (39), which
is utilized in a number of papers in this volume, is:

    σ_t(φ) = σ_0(φ) + ∫_0^t σ_s(L̄φ) ds + ∫_0^t σ_s(hφ) ∘ dy_s,   (40)

where

    L̄φ(x) = Lφ(x) - (1/2) h²(x) φ(x)

and ∘ denotes a Stratonovich (symmetric) stochastic integral
[9],[22].
Example 4: Under the assumptions of Example 2, we can
derive a stochastic differential equation for q(t,x) := Λ̂_t p(t,x);
this is interpreted as an unnormalized conditional density, since
then

    p(t,x) = q(t,x) / ∫ q(t,x) dx.

As in Example 2, an integration by parts in (39) yields the
stochastic partial differential equation:

    dq(t,x) = L* q(t,x) dt + h(x) q(t,x) dy_t.                 (41)

Notice that (41) has a much simpler structure than (27): it does
not involve an integral such as π_t(h), and it is a bilinear
stochastic differential equation with {y_t} as its input. This
structure is utilized by a number of papers in this volume.
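The bilinear structure of (41) lends itself to a simple operator-splitting discretization on a grid: a forward-equation step for L*q, then a multiplicative update q ← q·exp(h dy - ½h² dt) for the observation term (the exact one-step solution of dq = h q dy for frozen h). The coefficients below (f(x) = -x, g ≡ 1, h(x) = x) are illustrative assumptions, not choices made in the paper:

```python
import numpy as np

rng = np.random.default_rng(4)
xs = np.linspace(-5, 5, 201)
dx, dt, n = xs[1] - xs[0], 1e-4, 200
f, g2, h = -xs, np.ones_like(xs), xs     # assumed model coefficients

q = np.exp(-(xs - 1.0)**2)               # unnormalized initial density
x_true = 1.0                             # simulated true signal state
for _ in range(n):
    # L* step of (41): central differences on a periodic grid (cf. (14))
    flux = f * q
    q = q + dt * (-(np.roll(flux, -1) - np.roll(flux, 1)) / (2 * dx)
                  + 0.5 * (np.roll(g2 * q, -1) - 2 * g2 * q
                           + np.roll(g2 * q, 1)) / dx**2)
    # observation increment dy = h(x_true) dt + dw, then the h q dy term
    dy = x_true * dt + rng.normal(0.0, np.sqrt(dt))
    q *= np.exp(h * dy - 0.5 * h**2 * dt)
    # advance the true signal (same assumed dynamics)
    x_true = x_true - x_true * dt + rng.normal(0.0, np.sqrt(dt))

p = q / (q.sum() * dx)                   # normalize as in Example 4
mean_est = (xs * p).sum() * dx           # conditional-mean estimate
```

Because (41) never requires π_t(h), each step of this scheme is local in x apart from the final normalization, which is performed only when a conditional statistic is actually needed.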

ACKNOWLEDGMENT
The work of S. I. Marcus was supported in part by the U.S.
National Science Foundation under grant ENG-76-11106.

REFERENCES
1. R. E. Kalman, "A new approach to linear filtering and
prediction problems," J. Basic Eng. ASME, 82, 1960,
pp. 33-45.
2. R. E. Kalman and R. S. Bucy, "New results in linear
filtering and prediction theory," J. Basic Engr. ASME
Series D, 83, 1961, pp. 95-108.
3. R. S. Bucy, "Nonlinear filtering," IEEE Trans. Automatic
Control, AC-10, 1965, p. 198.
4. H. J. Kushner, "On the differential equations satisfied by
conditional probability densities of Markov processes," SIAM
J. Control, 2, 1964, pp. 106-119.
5. A. N. Shiryaev, "Some new results in the theory of controlled
stochastic processes [Russian]," Trans. 4th Prague Conference
on Information Theory, Czech. Academy of Sciences, Prague,
1967.
6. R. L. Stratonovich, Conditional Markov Processes and Their
Application to the Theory of Optimal Control. New York:
Elsevier, 1968.
7. W. M. Wonham, "Some applications of stochastic differential
equations to optimal nonlinear filtering," SIAM J. Control,
2, 1965, pp. 347-369.
8. M. Zakai, "On the optimal filtering of diffusion processes,"
Z. Wahr. Verw. Geb., 11, 1969, pp. 230-243.
9. E. Wong, Stochastic Processes in Information and Dynamical
Systems. New York: McGraw-Hill, 1971.
74 M. H. A. DAVIS AND S. I. MARCUS

10. T. Kailath, "An innovations approach to least-squares
estimation -- Part I: Linear filtering in additive white
noise," IEEE Trans. Automatic Control, AC-13, 1968,
pp. 646-655.
11. P. A. Frost and T. Kailath, "An innovations approach to
least-squares estimation III," IEEE Trans. Automatic Control,
AC-16, 1971, pp. 217-226.
12. M. Fujisaki, G. Kallianpur, and H. Kunita, "Stochastic
differential equations for the nonlinear filtering problem,"
Osaka J. Math., 1, 1972, pp. 19-40.
13. R. S. Liptser and A. N. Shiryaev, Statistics of Random
Processes I. New York: Springer-Verlag, 1977.
14. G. Kallianpur, Stochastic Filtering Theory. Berlin-
Heidelberg-New York: Springer-Verlag, 1980.
15. E. Pardoux, "Stochastic partial differential equations and
filtering of diffusion processes," Stochastics, 2, 1979,
pp. 127-168 [see also Pardoux' article in this volume].
16. N. V. Krylov and B. L. Rozovskii, "On the conditional
distribution of diffusion processes [Russian]," Izvestia
Akad. Nauk SSSR, Math Series 42, 1978, pp. 356-378.
17. R. W. Brockett, this volume.
18. V. E. Benes, "Exact finite dimensional filters for certain
diffusions with nonlinear drift," Stochastics, to appear.
19. M. H. A. Davis, this volume.
20. J. H. Van Schuppen, "Stochastic filtering theory: A
discussion of concepts, methods, and results," in
Stochastic Control Theory and Stochastic Differential
Systems, M. Kohlmann and W. Vogel, eds. New York:
Springer-Verlag, 1979.
21. M. H. A. Davis, "Martingale integrals and stochastic
calculus," in Communication Systems and Random Process
Theory, J. K. Skwirzynski, ed. Leiden: Noordhoff, 1978.
22. L. Arnold, Stochastic Differential Equations. New York:
Wiley, 1974.
23. A. Friedman, Stochastic Differential Equations and
Applications, Vol. 1. New York: Academic Press, 1975.
AN INTRODUCTION TO NONLINEAR FILTERING 75

24. I. I. Gihman and A. V. Skorohod, Stochastic Differential
Equations. New York: Springer-Verlag, 1972.
25. I. Gertner, "An alternative approach to nonlinear filtering,"
Stochastic Processes and their Applications, 7, 1978,
pp. 231-246.
26. D. W. Stroock and S. R. S. Varadhan, Multidimensional
Diffusion Processes. New York: Springer-Verlag, 1979.
27. D. Allinger and S. K. Mitter, "New results on the innovations
problem for nonlinear filtering," Stochastics, to appear.
28. E. Wong, "Recent progress in stochastic process -- a survey,"
IEEE Trans. Inform. Theory, IT-19, 1973, pp. 262-275.
29. J. Jacod, Calcul Stochastique et Problèmes de Martingales.
Berlin-Heidelberg-New York: Springer-Verlag, 1979.
30. S. K. Mitter, this volume.
