
176 UNCERTAINTY IN ARTIFICIAL INTELLIGENCE PROCEEDINGS 2000

Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks

Arnaud Doucet†   Nando de Freitas‡   Kevin Murphy‡   Stuart Russell‡

† Engineering Dept., Cambridge University       ‡ Computer Science Dept., UC Berkeley
[email protected]       {jfgf,murphyk,russell}@cs.berkeley.edu

Abstract

Particle filters (PFs) are powerful sampling-based inference/learning algorithms for dynamic Bayesian networks (DBNs). They allow us to treat, in a principled way, any type of probability distribution, nonlinearity and non-stationarity. They have appeared in several fields under such names as "condensation", "sequential Monte Carlo" and "survival of the fittest". In this paper, we show how we can exploit the structure of the DBN to increase the efficiency of particle filtering, using a technique known as Rao-Blackwellisation. Essentially, this samples some of the variables and marginalizes out the rest exactly, using the Kalman filter, HMM filter, junction tree algorithm, or any other finite-dimensional optimal filter. We show that Rao-Blackwellised particle filters (RBPFs) lead to more accurate estimates than standard PFs. We demonstrate RBPFs on two problems, namely non-stationary online regression with radial basis function networks and robot localization and map building. We also discuss other potential application areas and provide references to some finite-dimensional optimal filters.

1 INTRODUCTION

State estimation (online inference) in state-space models is widely used in a variety of computer science and engineering applications. However, the two most famous algorithms for this problem, the Kalman filter and the HMM filter, are only applicable to linear-Gaussian models and models with finite state spaces, respectively. Even when the state space is finite, it can be so large that the HMM or junction tree algorithms become too computationally expensive. This is typically the case for large discrete dynamic Bayesian networks (DBNs) (Dean and Kanazawa 1989): inference requires, at each time step, space and time that are exponential in the number of hidden nodes.

To handle these problems, sequential Monte Carlo methods, also known as particle filters (PFs), have been introduced (Handschin and Mayne 1969, Akashi and Kumamoto 1977). In the mid 1990s, several PF algorithms were proposed independently under the names of Monte Carlo filters (Kitagawa 1996), sequential importance sampling (SIS) with resampling (SIR) (Doucet 1998), bootstrap filters (Gordon, Salmond and Smith 1993), condensation trackers (Isard and Blake 1996), dynamic mixture models (West 1993), survival of the fittest (Kanazawa, Koller and Russell 1995), etc. One of the major innovations during the 1990s was the inclusion of a resampling step to avoid the degeneracy problems inherent in the earlier algorithms (Gordon et al. 1993). In the late nineties, several statistical improvements for PFs were proposed, some important theoretical properties were established, and these algorithms were applied and tested in many domains: see (Doucet, de Freitas and Gordon 2000) for an up-to-date survey of the field.

One of the major drawbacks of PFs is that sampling in high-dimensional spaces can be inefficient. In some cases, however, the model has "tractable substructure", which can be analytically marginalized out, conditional on certain other nodes being imputed, c.f. cutset conditioning in static Bayes nets (Pearl 1988). The analytical marginalization can be carried out using standard algorithms, such as the Kalman filter, the HMM filter, the junction tree algorithm for general DBNs (Cowell, Dawid, Lauritzen and Spiegelhalter 1999), or any other finite-dimensional optimal filter. The advantage of this strategy is that it can drastically reduce the size of the space over which we need to sample.

Marginalizing out some of the variables is an example of the technique called Rao-Blackwellisation, because it is related to the Rao-Blackwell formula: see (Casella and Robert 1996) for a general discussion. Rao-Blackwellised particle filters (RBPFs) have been applied in specific contexts such as mixtures of Gaussians (Akashi and Kumamoto 1977, Doucet 1998, Doucet, Godsill and Andrieu 2000), fixed parameter estimation (Kong, Liu and Wong 1994), HMMs (Doucet 1998, Doucet, Godsill and Andrieu 2000) and Dirichlet process models (MacEachern, Clyde and Liu 1999). In this paper, we develop the general theory of RBPFs, and apply it to several novel types of DBNs. We omit the proofs of the theorems for lack of space: please refer to the technical report (Doucet, Gordon and Krishnamurthy 1999).

2 PROBLEM FORMULATION

Let us consider the following general state-space model/DBN with hidden variables z_t and observed variables y_t. We assume that z_t is a Markov process with initial distribution p(z_0) and transition equation p(z_t | z_{t-1}). The observations y_{1:t} ≜ {y_1, y_2, ..., y_t} are assumed to be conditionally independent given the process z_t, with marginal distribution p(y_t | z_t). Given these observations, the inference of any subset or property of the states z_{0:t} ≜ {z_0, z_1, ..., z_t} relies on the joint posterior distribution p(z_{0:t} | y_{1:t}). Our objective is, therefore, to estimate this distribution, or some of its characteristics, such as the filtering density p(z_t | y_{1:t}) or the minimum mean square error (MMSE) estimate E[z_t | y_{1:t}]. The posterior satisfies the following recursion:

    p(z_{0:t} | y_{1:t}) = p(y_t | z_t) p(z_t | z_{t-1}) p(z_{0:t-1} | y_{1:t-1}) / p(y_t | y_{1:t-1})    (1)

If one attempts to solve this problem analytically, one obtains integrals that are not tractable. One therefore has to resort to some form of numerical approximation scheme. In this paper, we focus on sampling-based methods. Advantages and disadvantages of other approaches are discussed at length in (de Freitas 1999).

The above description assumes that there is no structure within the hidden variables. But suppose we can divide the hidden variables z_t into two groups, r_t and x_t, such that p(z_t | z_{t-1}) = p(x_t | r_{t-1:t}, x_{t-1}) p(r_t | r_{t-1}) and, conditional on r_{0:t}, the conditional posterior distribution p(x_{0:t} | y_{1:t}, r_{0:t}) is analytically tractable.¹ Then we can easily marginalize out x_{0:t} from the posterior, and only need to focus on estimating p(r_{0:t} | y_{1:t}), which lies in a space of reduced dimension. Formally, we are making use of the following decomposition of the posterior, which follows from the chain rule:

    p(r_{0:t}, x_{0:t} | y_{1:t}) = p(x_{0:t} | y_{1:t}, r_{0:t}) p(r_{0:t} | y_{1:t})

¹ The problem of how to automatically identify which variables should be sampled, and which can be handled analytically, is one we are currently working on. We anticipate that algorithms similar to cutset conditioning (Becker, Bar-Yehuda and Geiger 1999) might prove useful.

The marginal posterior distribution p(r_{0:t} | y_{1:t}) satisfies the alternative recursion

    p(r_{0:t} | y_{1:t}) = p(y_t | y_{1:t-1}, r_{0:t}) p(r_t | r_{t-1}) p(r_{0:t-1} | y_{1:t-1}) / p(y_t | y_{1:t-1})    (2)

If eq. (1) does not admit a closed-form expression, then eq. (2) does not admit one either, and sampling-based methods are also required. But since the dimension of p(r_{0:t} | y_{1:t}) is smaller than that of p(r_{0:t}, x_{0:t} | y_{1:t}), we should expect to obtain better results.
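To make the decomposition concrete, the following sketch (not from the paper; the toy binary switching model and all its numbers are invented for illustration) unrolls eq. (2) on a tiny example: conditional on a switch path r_{0:t}, the substructure x_t is a 2-state HMM, so p(y_t | y_{1:t-1}, r_{0:t}) can be computed exactly by the HMM forward recursion, and p(r_{0:t} | y_{1:t}) is obtained by enumerating all r-paths.

```python
import itertools
import numpy as np

# Hypothetical toy model: r_t and x_t are binary; conditional on the switch
# path r_{0:t}, x_t is a 2-state HMM (the "tractable substructure").
T_r = np.array([[0.9, 0.1], [0.2, 0.8]])          # p(r_t | r_{t-1})
T_x = [np.array([[0.7, 0.3], [0.3, 0.7]]),        # p(x_t | x_{t-1}, r_t = 0)
       np.array([[0.1, 0.9], [0.9, 0.1]])]        # p(x_t | x_{t-1}, r_t = 1)
O = np.array([[0.8, 0.2], [0.2, 0.8]])            # p(y_t | x_t)
p_r0 = np.array([0.5, 0.5])                       # p(r_0)
p_x0 = np.array([0.5, 0.5])                       # p(x_0)

def path_likelihood(r_path, ys):
    """Exact p(y_{1:t} | r_{0:t}) via the HMM forward recursion over x."""
    alpha = p_x0
    like = 1.0
    for r, y in zip(r_path[1:], ys):
        alpha = alpha @ T_x[r]                    # predict x_t given r_t
        alpha = alpha * O[:, y]                   # weight by p(y_t | x_t)
        c = alpha.sum()                           # c = p(y_t | y_{1:t-1}, r_{0:t})
        like *= c
        alpha = alpha / c
    return like

def path_prior(r_path):
    pr = p_r0[r_path[0]]
    for a, b in zip(r_path, r_path[1:]):
        pr *= T_r[a, b]
    return pr

ys = [0, 1, 1]                                    # an arbitrary observation record
posts = {p: path_prior(p) * path_likelihood(p, ys)
         for p in itertools.product([0, 1], repeat=len(ys) + 1)}
Z = sum(posts.values())
posterior = {p: v / Z for p, v in posts.items()}  # p(r_{0:t} | y_{1:t}), eq. (2) unrolled
```

The enumeration is only feasible for tiny t; the point is that the recursion runs entirely in r-space, with x marginalized exactly at each step.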
In the following section, we review the importance sampling (IS) method, which is the core of PF, and quantify the improvement one can expect by marginalizing out x_{0:t}, i.e. by using the so-called Rao-Blackwellised estimate. Subsequently, in Section 4, we describe a general RBPF algorithm and detail the implementation issues.

3 IMPORTANCE SAMPLING AND RAO-BLACKWELLISATION

If we were able to sample N i.i.d. random samples (particles), {(r_{0:t}^(i), x_{0:t}^(i)); i = 1, ..., N}, according to p(r_{0:t}, x_{0:t} | y_{1:t}), then an empirical estimate of this distribution would be given by

    P_N(dr_{0:t}, dx_{0:t} | y_{1:t}) = (1/N) Σ_{i=1}^N δ_{(r_{0:t}^(i), x_{0:t}^(i))}(dr_{0:t} dx_{0:t})

where δ_{(r_{0:t}^(i), x_{0:t}^(i))}(dr_{0:t} dx_{0:t}) denotes the Dirac delta function located at (r_{0:t}^(i), x_{0:t}^(i)). As a corollary, an estimate of the filtering distribution p(r_t, x_t | y_{1:t}) is P_N(dr_t, dx_t | y_{1:t}) = (1/N) Σ_{i=1}^N δ_{(r_t^(i), x_t^(i))}(dr_t dx_t). Hence one can easily estimate the expected value I(f_t) of any function f_t of the hidden variables w.r.t. this distribution, using

    I_N(f_t) = ∫ f_t(r_{0:t}, x_{0:t}) P_N(dr_{0:t}, dx_{0:t} | y_{1:t}) = (1/N) Σ_{i=1}^N f_t(r_{0:t}^(i), x_{0:t}^(i))

This estimate is unbiased and, by the strong law of large numbers (SLLN), I_N(f_t) converges almost surely (a.s.) towards I(f_t) as N → +∞. If σ_{f_t}² ≜ var_{p(r_{0:t}, x_{0:t} | y_{1:t})}[f_t(r_{0:t}, x_{0:t})] < +∞, then a central limit theorem (CLT) holds:

    √N [I_N(f_t) − I(f_t)]  ⇒  N(0, σ_{f_t}²)   as N → +∞

where "⇒" denotes convergence in distribution. Typically, it is impossible to sample efficiently from the "target" posterior distribution p(r_{0:t}, x_{0:t} | y_{1:t}) at any time t, so we focus on alternative methods.
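A minimal numerical illustration of the empirical estimate, and a preview of the gain from marginalization (the toy joint distribution below is invented, not from the paper): we estimate E[x] both from raw samples of (r, x) and by replacing each x-sample with its exact conditional mean E[x | r].

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy joint p(r, x): r ~ Bernoulli(0.3), x | r ~ N(r, 1). Then E[x] = 0.3.
N = 100_000
r = rng.random(N) < 0.3
x = rng.normal(loc=r.astype(float), scale=1.0)

# Empirical (Monte Carlo) estimate I_N(f) = (1/N) sum f(r_i, x_i) for f(r, x) = x.
I_N = x.mean()
I_true = 0.3

# Rao-Blackwellised version: use the exact conditional mean E[x | r] = r,
# so only r needs to be sampled.
I_RB = r.astype(float).mean()
```

Both estimates converge to 0.3, but the per-sample variance drops from var(x) = 1.21 to var(E[x | r]) = 0.21 in this toy model, which is exactly the effect quantified in the propositions below.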
One way to estimate p(r_{0:t}, x_{0:t} | y_{1:t}) and I(f_t) consists of using the well-known importance sampling method (Bernardo and Smith 1994). This method is based on the following observation. Let us introduce an arbitrary importance distribution q(r_{0:t}, x_{0:t} | y_{1:t}), from which it is easy to get samples, and such that p(r_{0:t}, x_{0:t} | y_{1:t}) > 0 implies q(r_{0:t}, x_{0:t} | y_{1:t}) > 0. Then

    I(f_t) = E_{q(r_{0:t}, x_{0:t} | y_{1:t})}[f_t(r_{0:t}, x_{0:t}) w(r_{0:t}, x_{0:t})] / E_{q(r_{0:t}, x_{0:t} | y_{1:t})}[w(r_{0:t}, x_{0:t})]

where the importance weight is equal to

    w(r_{0:t}, x_{0:t}) = p(r_{0:t}, x_{0:t} | y_{1:t}) / q(r_{0:t}, x_{0:t} | y_{1:t})

Given N i.i.d. samples {(r_{0:t}^(i), x_{0:t}^(i))} distributed according to q(r_{0:t}, x_{0:t} | y_{1:t}), a Monte Carlo estimate of I(f_t) is given by

    Î_N(f_t) = Σ_{i=1}^N f_t(r_{0:t}^(i), x_{0:t}^(i)) w(r_{0:t}^(i), x_{0:t}^(i)) / Σ_{i=1}^N w(r_{0:t}^(i), x_{0:t}^(i)) = Σ_{i=1}^N w̃_t^(i) f_t(r_{0:t}^(i), x_{0:t}^(i))

where the normalized importance weights w̃_t^(i) are equal to

    w̃_t^(i) = w(r_{0:t}^(i), x_{0:t}^(i)) / Σ_{j=1}^N w(r_{0:t}^(j), x_{0:t}^(j))

This method is equivalent to the following point mass approximation of p(r_{0:t}, x_{0:t} | y_{1:t}):

    P_N(dr_{0:t}, dx_{0:t} | y_{1:t}) = Σ_{i=1}^N w̃_t^(i) δ_{(r_{0:t}^(i), x_{0:t}^(i))}(dr_{0:t} dx_{0:t})

For "perfect" simulation, that is q(r_{0:t}, x_{0:t} | y_{1:t}) = p(r_{0:t}, x_{0:t} | y_{1:t}), we would have w̃_t^(i) = N^{-1} for any i. In practice, we will try to select the importance distribution as close as possible to the target distribution in a given sense. For N finite, Î_N(f_t) is biased (since it is a ratio of estimates), but according to the SLLN, Î_N(f_t) converges asymptotically a.s. towards I(f_t). Under additional assumptions, a CLT also holds.
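The self-normalized estimate Î_N(f_t) can be sketched as follows (a static toy problem with a Gaussian target and a wider Gaussian proposal, both invented for illustration; note the support condition p > 0 ⇒ q > 0 is satisfied by the heavier-tailed proposal):

```python
import numpy as np

rng = np.random.default_rng(1)

# Target p(z) = N(0, 1); importance distribution q(z) = N(0, 2^2).
# Densities are used only up to a common normalizing constant, which the
# self-normalized estimate cancels out.
N = 50_000
z = rng.normal(0.0, 2.0, size=N)                 # samples from q
log_w = -0.5 * z**2 + 0.5 * (z / 2.0)**2         # log p(z) - log q(z), up to a constant
w = np.exp(log_w - log_w.max())                  # unnormalized weights, stabilized
w_tilde = w / w.sum()                            # normalized importance weights

# Self-normalized IS estimate of I(f) = E_p[f(z)] for f(z) = z^2 (true value 1).
I_hat = np.sum(w_tilde * z**2)
```

Subtracting the maximum log-weight before exponentiating is a standard numerical precaution; it leaves w̃ unchanged.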
Now consider the case where one can marginalize out x_{0:t} analytically; then we can propose an alternative estimate for I(f_t) with a reduced variance. As p(r_{0:t}, x_{0:t} | y_{1:t}) = p(r_{0:t} | y_{1:t}) p(x_{0:t} | y_{1:t}, r_{0:t}), where p(x_{0:t} | y_{1:t}, r_{0:t}) is a distribution that can be computed exactly, an approximation of p(r_{0:t} | y_{1:t}) straightforwardly yields an approximation of p(r_{0:t}, x_{0:t} | y_{1:t}). Moreover, if E_{p(x_{0:t} | y_{1:t}, r_{0:t})}[f_t(r_{0:t}, x_{0:t})] can be evaluated in a closed-form expression, then the following alternative importance sampling estimate of I(f_t) can be used:

    Ī_N(f_t) = Σ_{i=1}^N E_{p(x_{0:t} | y_{1:t}, r_{0:t}^(i))}[f_t(r_{0:t}^(i), x_{0:t})] w(r_{0:t}^(i)) / Σ_{i=1}^N w(r_{0:t}^(i))

where

    w(r_{0:t}) = p(r_{0:t} | y_{1:t}) / q(r_{0:t} | y_{1:t}),   q(r_{0:t} | y_{1:t}) = ∫ q(r_{0:t}, x_{0:t} | y_{1:t}) dx_{0:t}

Intuitively, to reach a given precision, Ī_N(f_t) will require a smaller number N of samples than Î_N(f_t), as we only need to sample from a lower-dimensional distribution. This is proven in the following propositions. Writing Î_N(f_t) = Â_N(f_t)/B̂_N(f_t) and Ī_N(f_t) = Ā_N(f_t)/B̄_N(f_t) for the corresponding numerators and denominators, we have:

Proposition 1 The variances of the importance weights, the numerators and the denominators satisfy, for any N,

    var_{q(r_{0:t} | y_{1:t})}(w(r_{0:t})) ≤ var_{q(r_{0:t}, x_{0:t} | y_{1:t})}(w(r_{0:t}, x_{0:t}))
    var_{q(r_{0:t} | y_{1:t})}(Ā_N(f_t)) ≤ var_{q(r_{0:t}, x_{0:t} | y_{1:t})}(Â_N(f_t))
    var_{q(r_{0:t} | y_{1:t})}(B̄_N(f_t)) ≤ var_{q(r_{0:t}, x_{0:t} | y_{1:t})}(B̂_N(f_t))

A sufficient condition for Î_N(f_t) to satisfy a CLT is that var_{p(r_{0:t}, x_{0:t} | y_{1:t})}[f_t(r_{0:t}, x_{0:t})] < +∞ and w(r_{0:t}, x_{0:t}) < +∞ for any (r_{0:t}, x_{0:t}) (Bernardo and Smith 1994). This trivially implies that Ī_N(f_t) also satisfies a CLT. More precisely, we get the following result.

Proposition 2 Under the assumptions given above, Î_N(f_t) and Ī_N(f_t) satisfy a CLT:

    √N [Î_N(f_t) − I(f_t)]  ⇒  N(0, σ₁²)   as N → +∞
    √N [Ī_N(f_t) − I(f_t)]  ⇒  N(0, σ₂²)   as N → +∞

where σ₁² ≥ σ₂², σ₁² and σ₂² being given by

    σ₁² = E_{q(r_{0:t}, x_{0:t} | y_{1:t})}[((f_t(r_{0:t}, x_{0:t}) − I(f_t)) w(r_{0:t}, x_{0:t}))²]
    σ₂² = E_{q(r_{0:t} | y_{1:t})}[((E_{p(x_{0:t} | y_{1:t}, r_{0:t})}[f_t(r_{0:t}, x_{0:t})] − I(f_t)) w(r_{0:t}))²]

The Rao-Blackwellised estimate Ī_N(f_t) is usually computationally more expensive to compute than Î_N(f_t), so it is of interest to know when, for a fixed computational complexity, one can expect to achieve variance reduction. One
has

    σ₁² − σ₂² = E_{q(r_{0:t} | y_{1:t})}[var_{q(x_{0:t} | y_{1:t}, r_{0:t})}[(f_t(r_{0:t}, x_{0:t}) − I(f_t)) w(r_{0:t}, x_{0:t})]]

so that, in accordance with intuition, it will generally be worth performing Rao-Blackwellisation when the average conditional variance of the variables x_{0:t} is high.

4 RAO-BLACKWELLISED PARTICLE FILTERS

Given N particles (samples) {(r_{0:t-1}^(i), x_{0:t-1}^(i))} at time t − 1, approximately distributed according to the distribution p(r_{0:t-1}^(i), x_{0:t-1}^(i) | y_{1:t-1}), RBPFs allow us to compute N particles (r_{0:t}^(i), x_{0:t}^(i)) approximately distributed according to the posterior p(r_{0:t}^(i), x_{0:t}^(i) | y_{1:t}) at time t. This is accomplished with the algorithm shown below, the details of which will now be explained.

Generic RBPF

1. Sequential importance sampling step

   • For i = 1, ..., N, sample:
         r̂_t^(i) ~ q(r_t | r_{0:t-1}^(i), y_{1:t})
     and set:
         r̂_{0:t}^(i) ≜ (r̂_t^(i), r_{0:t-1}^(i))

   • For i = 1, ..., N, evaluate the importance weights up to a normalizing constant:
         w_t^(i) = p(r̂_{0:t}^(i) | y_{1:t}) / [ q(r̂_t^(i) | r_{0:t-1}^(i), y_{1:t}) p(r_{0:t-1}^(i) | y_{1:t-1}) ]

   • For i = 1, ..., N, normalize the importance weights:
         w̃_t^(i) = w_t^(i) / Σ_{j=1}^N w_t^(j)

2. Selection step

   • Multiply/suppress samples (r̂_{0:t}^(i)) with high/low importance weights w̃_t^(i), respectively, to obtain N random samples (r_{0:t}^(i)) approximately distributed according to p(r_{0:t}^(i) | y_{1:t}).

3. MCMC step

   • Apply a Markov transition kernel with invariant distribution given by p(r_{0:t}^(i) | y_{1:t}) to obtain (r_{0:t}^(i)).
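As a concrete illustration of the generic RBPF, here is a minimal Python sketch for a hypothetical 1-D switching random-walk model (all parameters invented, and the optional MCMC step omitted): the regime r_t is sampled from its prior, each particle carries the Kalman mean/variance of x_t, which is updated exactly (the Rao-Blackwellised part), and multinomial resampling implements the selection step.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical conditionally linear-Gaussian model:
#   r_t in {0, 1}: Markov switching process (sampled by the PF),
#   x_t = x_{t-1} + v_t,  v_t ~ N(0, Q[r_t])  (marginalized by a Kalman filter),
#   y_t = x_t + w_t,      w_t ~ N(0, R).
T_r = np.array([[0.95, 0.05], [0.05, 0.95]])
Q = np.array([0.01, 1.0])          # low-noise vs high-noise regime
R = 0.5

def simulate(T=100):
    r, x, ys, rs = 0, 0.0, [], []
    for _ in range(T):
        r = rng.choice(2, p=T_r[r])
        x = x + rng.normal(0.0, np.sqrt(Q[r]))
        ys.append(x + rng.normal(0.0, np.sqrt(R)))
        rs.append(r)
    return np.array(rs), np.array(ys)

def rbpf(ys, N=500):
    """Generic RBPF: sample r_t from the prior, Kalman-update (m, P) per particle."""
    r = np.zeros(N, dtype=int)
    m = np.zeros(N)                # Kalman mean of x_t, one per particle
    P = np.ones(N)                 # Kalman variance of x_t, one per particle
    x_est = []
    for y in ys:
        # 1. SIS step: sample r_t ~ p(r_t | r_{t-1}); the weight is then the
        #    predictive likelihood p(y_t | y_{1:t-1}, r_{0:t}), Gaussian here.
        r = np.array([rng.choice(2, p=T_r[ri]) for ri in r])
        P_pred = P + Q[r]                       # Kalman predict (random walk in x)
        S = P_pred + R                          # innovation variance
        w = np.exp(-0.5 * (y - m) ** 2 / S) / np.sqrt(2 * np.pi * S)
        w_tilde = w / w.sum()
        # Exact (Rao-Blackwellised) measurement update of each Kalman filter.
        K = P_pred / S
        m = m + K * (y - m)
        P = (1.0 - K) * P_pred
        x_est.append(np.sum(w_tilde * m))       # MMSE estimate of x_t
        # 2. Selection step: multinomial resampling.
        idx = rng.choice(N, size=N, p=w_tilde)
        r, m, P = r[idx], m[idx], P[idx]
    return np.array(x_est)

rs_true, ys = simulate()
x_est = rbpf(ys)
```

Each particle samples only the low-dimensional r_t; the continuous state never has to be sampled, which is the source of the variance reduction established in Section 3.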

4.1 IMPLEMENTATION ISSUES

4.1.1 Sequential importance sampling

If we restrict ourselves to importance functions of the following form

    q(r_{0:t} | y_{1:t}) = q(r_0) ∏_{k=1}^t q(r_k | y_{1:k}, r_{0:k-1})    (3)

we can obtain recursive formulas to evaluate w(r_{0:t}) = w(r_{0:t-1}) w_t, and hence the normalized weights. The "incremental weight" w_t is given by

    w_t ∝ p(y_t | y_{1:t-1}, r_{0:t}) p(r_t | r_{t-1}) / q(r_t | y_{1:t}, r_{0:t-1})

w̃_t denotes the normalized version of w_t, i.e. w̃_t^(i) = w_t^(i) / Σ_{j=1}^N w_t^(j). Hence we can perform importance sampling online.

Choice of the Importance Distribution

There are infinitely many possible choices for q(r_{0:t} | y_{1:t}), the only condition being that its support must include that of p(r_{0:t} | y_{1:t}). The simplest choice is to sample from the prior, p(r_t | r_{t-1}), in which case the importance weight is equal to the likelihood, p(y_t | y_{1:t-1}, r_{0:t}). This is the most widely used distribution, since it is simple to compute, but it can be inefficient, since it ignores the most recent evidence, y_t. Intuitively, many of our samples may end up in a region of the space that has low likelihood, and hence receive low weight; these particles are effectively wasted. We can show that the "optimal" proposal distribution, in the sense of minimizing the variance of the importance weights, takes the most recent evidence into account:

Proposition 3 The distribution that minimizes the variance of the importance weights conditional upon r_{0:t-1} and y_{1:t} is

    p(r_t | r_{0:t-1}, y_{1:t}) = p(y_t | y_{1:t-1}, r_{0:t}) p(r_t | r_{t-1}) / p(y_t | y_{1:t-1}, r_{0:t-1})

and the associated importance weight w_t is

    w_t = p(y_t | y_{1:t-1}, r_{0:t-1}) = ∫ p(y_t | y_{1:t-1}, r_{0:t}) p(r_t | r_{t-1}) dr_t

Unfortunately, computing the optimal importance sampling distribution is often too expensive. Several deterministic approximations to the optimal distribution have been proposed, see for example (de Freitas 1999, Doucet 1998).

Degeneracy of SIS

The following proposition shows that, for importance functions of the form (3), the variance of w(r_{0:t}) can only increase (stochastically) over time. The proof of this proposition is an extension of a Kong-Liu-Wong theorem (Kong
et al. 1994, p. 285) to the case of an importance function of the form (3).

Proposition 4 The unconditional variance (i.e. with the observations y_{1:t} being interpreted as random variables) of the importance weights w(r_{0:t}) increases over time.

In practice, the degeneracy caused by the variance increase can be observed by monitoring the importance weights. Typically, what we observe is that, after a few iterations, one of the normalized importance weights tends to 1, while the remaining weights tend to zero.

4.1.2 Selection step

To avoid the degeneracy of the sequential importance sampling simulation method, a selection (resampling) stage may be used to eliminate samples with low importance ratios and to multiply samples with high importance ratios. A selection scheme associates to each particle r̂_{0:t}^(i) a number of offspring, say N_i ∈ ℕ, such that Σ_{i=1}^N N_i = N. Several selection schemes have been proposed in the literature. These schemes satisfy E[N_i] = N w̃_t^(i), but their performance varies in terms of the variance of the particles, var[N_i]. Recent theoretical results in (Crisan, Del Moral and Lyons 1999) indicate that the restriction E[N_i] = N w̃_t^(i) is unnecessary to obtain convergence results (Doucet et al. 1999). Examples of these selection schemes include multinomial sampling (Doucet 1998, Gordon et al. 1993, Pitt and Shephard 1999), residual resampling (Kitagawa 1996, Liu and Chen 1998) and stratified sampling (Kitagawa 1996). Their computational complexity is O(N).
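The three selection schemes just mentioned can be sketched as follows (a minimal implementation, with an arbitrary weight vector chosen for illustration); all three return N offspring indices, and residual and stratified resampling reduce var[N_i] relative to plain multinomial sampling:

```python
import numpy as np

rng = np.random.default_rng(3)

def multinomial_resample(w, rng):
    """Draw N offspring indices i.i.d. from the weight distribution."""
    N = len(w)
    return rng.choice(N, size=N, p=w)

def stratified_resample(w, rng):
    """One uniform draw per stratum [i/N, (i+1)/N); lower variance of N_i."""
    N = len(w)
    u = (np.arange(N) + rng.random(N)) / N
    cs = np.cumsum(w)
    cs[-1] = 1.0                      # guard against floating-point round-off
    return np.searchsorted(cs, u)

def residual_resample(w, rng):
    """Keep floor(N * w_i) copies deterministically, then resample the rest."""
    N = len(w)
    counts = np.floor(N * w).astype(int)
    idx = np.repeat(np.arange(N), counts)
    n_rest = N - counts.sum()
    if n_rest > 0:
        resid = N * w - counts
        resid = resid / resid.sum()
        idx = np.concatenate([idx, rng.choice(N, size=n_rest, p=resid)])
    return idx

w = np.array([0.5, 0.3, 0.1, 0.05, 0.05])   # example normalized weights
idx_multi = multinomial_resample(w, rng)
idx_strat = stratified_resample(w, rng)
idx_resid = residual_resample(w, rng)
```

All three run in O(N), matching the complexity quoted above.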
4.1.3 MCMC step

After the selection scheme at time t, we obtain N particles distributed marginally approximately according to p(r_{0:t} | y_{1:t}). As discussed earlier, the discrete nature of the approximation can lead to a skewed importance weights distribution. That is, many particles have no offspring (N_i = 0), whereas others have a large number of offspring, the extreme case being N_i = N for a particular value i. In this case, there is a severe reduction in the diversity of the samples. A strategy for improving the results involves introducing MCMC steps of invariant distribution p(r_{0:t} | y_{1:t}) on each particle (Andrieu, de Freitas and Doucet 1999b, Gilks and Berzuini 1998, MacEachern et al. 1999). The basic idea is that, by applying a Markov transition kernel, the total variation of the current distribution with respect to the invariant distribution can only decrease. Note, however, that we do not require this kernel to be ergodic.

4.2 CONVERGENCE RESULTS

Let B(ℝⁿ) be the space of bounded, Borel measurable functions on ℝⁿ. We denote ||f|| ≜ sup_{x ∈ ℝⁿ} |f(x)|. The following theorem is a straightforward consequence of Theorem 1 in (Crisan and Doucet 2000), which is an extension of previous results in (Crisan et al. 1999).

Theorem 5 If the importance weights w_t are upper bounded and if one uses one of the selection schemes described previously, then, for all t ≥ 0, there exists c_t independent of N such that, for any f_t ∈ B((ℝ^{n_z})^{t+1}),

    E[ ( ∫ f_t(z_{0:t}) [ P_N(dz_{0:t} | y_{1:t}) − p(z_{0:t} | y_{1:t}) dz_{0:t} ] )² ] ≤ c_t ||f_t||² / N

where the expectation is taken w.r.t. the randomness introduced by the PF algorithm. This result shows that, under very loose assumptions, convergence of this general particle filtering method is ensured, and that the convergence rate of the method is independent of the dimension of the state-space. However, c_t usually increases exponentially with time. If additional assumptions on the dynamic system under study are made (e.g. discrete state spaces), it is possible to get uniform convergence results (c_t = c for any t) for the filtering distribution p(x_t | y_{1:t}). We do not pursue this here.

5 EXAMPLES

We now illustrate the theory by briefly describing two applications we have worked on.

5.1 ON-LINE REGRESSION AND MODEL SELECTION WITH NEURAL NETWORKS

Consider a function approximation scheme consisting of a mixture of k radial basis functions (RBFs) and a linear regression term. The number of basis functions, k_t, their centers, μ_t, the coefficients (weights of the RBF centers plus regression terms), θ_t, and the variance of the Gaussian noise on the output, σ_t², can all vary with time, so we treat them as latent random variables: see Figure 1. For details, see (Andrieu, de Freitas and Doucet 1999a).

In (Andrieu et al. 1999a), we show that it is possible to simulate μ_t, k_t and σ_t with a particle filter and to compute the coefficients θ_t analytically using Kalman filters. This is possible because the output of the neural network is linear in θ_t, and hence the system is a conditionally linear Gaussian state-space model (CLGSSM), that is, it is a linear Gaussian state-space model conditional upon the location of the bases and the hyper-parameters. This leads to an efficient RBPF that can be combined with a reversible jump MCMC algorithm (Green 1995) to select the number
of basis functions online. For example, we generated some data from a mixture of 2 RBFs for t = 1, ..., 500, and then from a single RBF for t = 501, ..., 1000; the method was able to track this change, as shown in Figure 2. Further experiments on real data sets are described in (Andrieu et al. 1999a).

Figure 1: DBN representation of the RBF model. The hyper-parameters have been omitted for clarity.

Figure 2: The top plot shows the one-step-ahead output predictions [—] and the true outputs [···] for the RBF model. The middle and bottom plots show the true values and estimates of the model order and noise variance respectively.

5.2 ROBOT LOCALIZATION AND MAP BUILDING

Consider a robot that can move on a discrete, two-dimensional grid. Suppose the goal is to learn a map of the environment, which, for simplicity, we can think of as a matrix which stores the color of each grid cell, which can be either black or white. The difficulty is that the color sensors are not perfect (they may accidentally flip bits), nor are the motors (the robot may fail to move in the desired direction with some probability, due e.g. to wheel slippage). Consequently, it is easy for the robot to get lost. And when the robot is lost, it does not know what part of the matrix to update. So we are faced with a chicken-and-egg situation: the robot needs to know where it is to learn the map, but needs to know the map to figure out where it is.

The problem of concurrent localization and map learning for mobile robots has been widely studied. In (Murphy 2000), we adopt a Bayesian approach, in which we maintain a belief state over both the location of the robot, L_t ∈ {1, ..., N_L}, and the color of each grid cell, M_t(i) ∈ {1, ..., N_C}, i = 1, ..., N_L, where N_L is the number of cells, and N_C is the number of colors. The DBN we are using is shown in Figure 3. The state space has size O(N_C^{N_L}). Note that we can easily handle changing environments, since the map is represented as a random variable, unlike the more common approach, which treats the map as a fixed parameter.

Figure 3: A factorial HMM with 3 hidden chains. M_t(i) represents the color of grid cell i at time t, L_t represents the robot's location, and Y_t the current observation.

The observation model is Y_t = f(M_t(L_t)), where f(·) is a function that flips its binary argument with some fixed probability. In other words, the robot gets to see the color of the cell it is currently at, corrupted by noise: Y_t is a noisy multiplexer with L_t acting as a "gate" node. Note that this conditional independence is not obvious from the graph structure in Figure 3(a), which suggests that all the nodes in each slice should be correlated by virtue of sharing a common observed child, as in a factorial HMM (Ghahramani and Jordan 1997). The extra independence information is encoded in Y_t's distribution, c.f. (Boutilier, Friedman, Goldszmidt and Koller 1996).

The basic idea of the algorithm is to sample L_{1:t} with a PF, and marginalize out the M_t(i) nodes exactly, which can be done efficiently since they are conditionally independent given L_{1:t}:

    P(M_t(1), ..., M_t(N_L) | y_{1:t}, L_{1:t}) = ∏_{i=1}^{N_L} P(M_t(i) | y_{1:t}, L_{1:t})
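Conditional on one sampled trajectory L_{1:t}, the per-cell posterior update is a scalar Bayes rule, applied only to the visited cell; a minimal sketch with invented parameters (binary colors, flip probability eps, and a hypothetical trajectory/observation record):

```python
import numpy as np

N_L, eps = 8, 0.1          # number of cells; sensor flip probability (illustrative)

def update_map(prob_black, L_t, y_t):
    """Bayes update of p(M_t(i) = black | y_{1:t}, L_{1:t}) for the visited cell."""
    prior = prob_black[L_t]
    like_black = (1 - eps) if y_t == 1 else eps      # y_t = 1 means "saw black"
    like_white = eps if y_t == 1 else (1 - eps)
    post = like_black * prior / (like_black * prior + like_white * (1 - prior))
    prob_black = prob_black.copy()
    prob_black[L_t] = post                           # other cells are untouched
    return prob_black

prob_black = np.full(N_L, 0.5)                       # uniform prior over colors
trajectory = [0, 1, 2, 2, 3]                         # a hypothetical sampled L_{1:t}
observations = [1, 0, 1, 1, 0]
for L_t, y_t in zip(trajectory, observations):
    prob_black = update_map(prob_black, L_t, y_t)
```

In the full algorithm each particle would carry its own vector of N_L such cell posteriors, updated along its own sampled trajectory, so the per-step cost is O(1) per particle rather than exponential in N_L.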
Some results on a simple one-dimensional grid world are
shown in Figure 4. We compared exact Bayesian inference with the RBPF method, and with the fully-factorised version of the Boyen-Koller (BK) algorithm (Boyen and Koller 1998), which represents the belief state as a product of marginals:

    P(L_t, M_t(1), ..., M_t(N_L) | y_{1:t}) = P(L_t | y_{1:t}) ∏_{i=1}^{N_L} P(M_t(i) | y_{1:t})

Figure 4: Estimated position as the robot moves from cell 1 to 8 and back. The robot "gets stuck" in cell 4 for two steps in a row on the outgoing leg of the journey (hence the double diagonal), but the robot does not realize this until it reaches the end of the "corridor" at step 9, where it is able to relocalise. (a) Exact inference. (b) RBPF with 50 particles. (c) Fully-factorised BK.

We see that the RBPF results are very similar to the exact results, even with only 50 particles, but that BK gets confused because it ignores correlations between the map cells. We have obtained good results learning a 10 x 10 map (so the state space has size O(2^100)) using only 100 particles (the observation model in the 2D case is that the robot observes the colors of all the cells in a 3 x 3 neighborhood centered on its current location). For a more detailed discussion of these results, please see (Murphy 2000).

5.3 CONCLUSIONS AND EXTENSIONS

RBPFs have been applied to many problems, mostly in the framework of conditionally linear Gaussian state-space models and conditionally finite state-space HMMs. That is, they have been applied to models that, conditionally upon a set of variables (imputed by the PF algorithm), admit a closed-form filtering distribution (Kalman filter in the continuous case and HMM filter in the discrete case). One can also make use of the special structure of the dynamic model under study to perform the calculations efficiently using the junction tree algorithm. For example, if one had evolving trees, one could sample the root nodes with the PF and compute the leaves using the junction tree algorithm. This would result in a substantial computational gain, as one only has to sample the root nodes and apply the junction tree to lower-dimensional sub-networks.

Although the previously mentioned models are the most famous ones, there exist numerous other dynamic systems admitting finite-dimensional filters. That is, the filtering distribution can be estimated in closed form at any time t using a fixed number of sufficient statistics. These include:

• Dynamic models for counting observations (Smith and Miller 1986).

• Dynamic models with a time-varying unknown covariance matrix for the dynamic noise (West and Harrison 1996, Uhlig 1997).

• Classes of the exponential family state-space models (Vidoni 1999).

This list is by no means exhaustive. It shows, however, that RBPFs apply to a very wide class of dynamic models. Consequently, they have a big role to play in computer vision (where mixtures of Gaussians arise commonly), robotics, speech and dynamic factor analysis.

References

Akashi, H. and Kumamoto, H. (1977). Random sampling approach to state estimation in switching environments, Automatica 13: 429-434.

Andrieu, C., de Freitas, J. F. G. and Doucet, A. (1999a). Sequential Bayesian estimation and model selection applied to neural networks, Technical Report CUED/F-INFENG/TR 341, Cambridge University Engineering Department.

Andrieu, C., de Freitas, J. F. G. and Doucet, A. (1999b). Sequential MCMC for Bayesian model selection, IEEE Higher Order Statistics Workshop, Caesarea, Israel, pp. 130-134.

Becker, A., Bar-Yehuda, R. and Geiger, D. (1999). Random algorithms for the loop cutset problem.

Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory, Wiley Series in Applied Probability and Statistics.

Boutilier, C., Friedman, N., Goldszmidt, M. and Koller, D. (1996). Context-specific independence in Bayesian networks, Proc. Conf. Uncertainty in AI.

Boyen, X. and Koller, D. (1998). Tractable inference for complex stochastic processes, Proc. Conf. Uncertainty in AI.

Casella, G. and Robert, C. P. (1996). Rao-Blackwellisation of sampling schemes, Biometrika 83(1): 81-94.

Cowell, R. G., Dawid, A. P., Lauritzen, S. L. and Spiegelhalter, D. J. (1999). Probabilistic Networks and Expert Systems, Springer-Verlag, New York.
Crisan, D. and Doucet, A. (2000). Convergence of generalized particle filters, Technical Report CUED/F-INFENG/TR 381, Cambridge University Engineering Department.

Crisan, D., Del Moral, P. and Lyons, T. (1999). Discrete filtering using branching and interacting particle systems, Markov Processes and Related Fields 5(3): 293-318.

de Freitas, J. F. G. (1999). Bayesian Methods for Neural Networks, PhD thesis, Department of Engineering, Cambridge University, Cambridge, UK.

Dean, T. and Kanazawa, K. (1989). A model for reasoning about persistence and causation, Artificial Intelligence 93(1-2): 1-27.

Doucet, A. (1998). On sequential simulation-based methods for Bayesian filtering, Technical Report CUED/F-INFENG/TR 310, Department of Engineering, Cambridge University.

Doucet, A., de Freitas, J. F. G. and Gordon, N. J. (2000). Sequential Monte Carlo Methods in Practice, Springer-Verlag.

Doucet, A., Godsill, S. and Andrieu, C. (2000). On sequential Monte Carlo sampling methods for Bayesian filtering, Statistics and Computing 10(3): 197-208.

Doucet, A., Gordon, N. J. and Krishnamurthy, V. (1999). Particle filters for state estimation of jump Markov linear systems, Technical Report CUED/F-INFENG/TR 359, Cambridge University Engineering Department.

Ghahramani, Z. and Jordan, M. (1997). Factorial hidden Markov models, Machine Learning 29: 245-273.

Gilks, W. R. and Berzuini, C. (1998). Monte Carlo inference for dynamic Bayesian models, Unpublished. Medical Research Council, Cambridge, UK.

Gordon, N. J., Salmond, D. J. and Smith, A. F. M. (1993). Novel approach to nonlinear/non-Gaussian Bayesian state estimation, IEE Proceedings-F 140(2): 107-113.

Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika 82: 711-732.

Handschin, J. E. and Mayne, D. Q. (1969). Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering, International Journal of Control 9(5): 547-559.

Isard, M. and Blake, A. (1996). Contour tracking by stochastic propagation of conditional density, European Conference on Computer Vision, Cambridge, UK, pp. 343-356.

Kanazawa, K., Koller, D. and Russell, S. (1995). Stochastic simulation algorithms for dynamic probabilistic networks, Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, pp. 346-351.

Kitagawa, G. (1996). Monte Carlo filter and smoother for non-Gaussian nonlinear state space models, Journal of Computational and Graphical Statistics 5: 1-25.

Kong, A., Liu, J. S. and Wong, W. H. (1994). Sequential imputations and Bayesian missing data problems, Journal of the American Statistical Association 89(425): 278-288.

Liu, J. S. and Chen, R. (1998). Sequential Monte Carlo methods for dynamic systems, Journal of the American Statistical Association 93: 1032-1044.

MacEachern, S. N., Clyde, M. and Liu, J. S. (1999). Sequential importance sampling for nonparametric Bayes models: the next generation, Canadian Journal of Statistics 27: 251-267.

Murphy, K. P. (2000). Bayesian map learning in dynamic environments, in S. Solla, T. Leen and K.-R. Muller (eds), Advances in Neural Information Processing Systems 12, MIT Press, pp. 1015-1021.

Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann.

Pitt, M. K. and Shephard, N. (1999). Filtering via simulation: Auxiliary particle filters, Journal of the American Statistical Association 94(446): 590-599.

Smith, R. L. and Miller, J. E. (1986). Predictive records, Journal of the Royal Statistical Society B 36: 79-88.

Uhlig, H. (1997). Bayesian vector-autoregressions with stochastic volatility, Econometrica.

Vidoni, P. (1999). Exponential family state space models based on a conjugate latent process, Journal of the Royal Statistical Society B 61: 213-221.

West, M. (1993). Mixture models, Monte Carlo, Bayesian updating and dynamic models, Computing Science and Statistics 24: 325-333.

West, M. and Harrison, J. (1996). Bayesian Forecasting and Dynamic Linear Models, Springer-Verlag.