An introduction to particle filters
David Salmond and Neil Gordon
Sept 2005
1 Introduction
Aims The aim of this tutorial is to introduce particle filters to those with a back-
ground in “classical” recursive estimation based on variants of the Kalman filter.
We describe the principles behind the basic particle filter algorithm and provide
a detailed worked example. We also show that the basic algorithm is a special
case of a more general particle filter that greatly extends the filter design options.
The paper concludes with a discussion of computational issues and application areas.
If these models can be expressed in a probabilistic form, a Bayesian approach may be
adopted.
The dynamics model describes how the state vector evolves with time and is
assumed to be of the form

x_k = f_{k-1}(x_{k-1}, v_{k-1}).    (1)
Here x_k is the state vector to be estimated, k denotes the time step and f_{k-1} is a
known, possibly non-linear, function. v_{k-1} is a white noise sequence, usually referred
to as the process, system or driving noise. The pdf of v_{k-1} is assumed known.
Note that (1) defines a first order Markov process, and an equivalent probabilistic
description of the state evolution is p (xk |xk−1 ) , which is sometimes called the tran-
sition density. For the special case when f is linear and v is Gaussian, the transition
density p (xk |xk−1 ) is also Gaussian.
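For example, for the familiar linear-Gaussian special case of (1) (stated here for
illustration), x_k = F_{k-1} x_{k-1} + v_{k-1} with v_{k-1} ~ N(0, Q_{k-1}), the transition density is simply

p(x_k | x_{k-1}) = N(x_k; F_{k-1} x_{k-1}, Q_{k-1}).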
The measurement equation relates the received measurements to the state vector:

z_k = h_k(x_k, w_k),    (2)

where h_k is a known, possibly non-linear, function and w_k is a white measurement
noise sequence whose pdf is assumed known. The probabilistic description of the
measurement model is the density p(z_k | x_k). Writing Z_k = {z_1, ..., z_k} for the set of
measurements received up to time step k, the prior pdf of the state at time step k is
obtained from the posterior pdf at time step k − 1 via the dynamics model:

p(x_k | Z_{k-1}) = ∫ p(x_k | x_{k-1}) p(x_{k-1} | Z_{k-1}) dx_{k-1}.    (3)

This is known as the Chapman-Kolmogorov equation.
The prior pdf may be updated to incorporate the new measurements zk to give
the required posterior pdf at time step k > 0:

p(x_k | Z_k) = p(z_k | x_k) p(x_k | Z_{k-1}) / p(z_k | Z_{k-1}).    (4)

This is Bayes rule, where the normalising denominator is given by p(z_k | Z_{k-1}) =
∫ p(z_k | x_k) p(x_k | Z_{k-1}) dx_k. The measurement model regarded as a function of x_k
with z_k given is the measurement likelihood. The relations (3) and (4) define the
formal Bayesian recursive filter with initial condition given by the specified prior pdf
p(x_0 | Z_0) = p(x_0) (where Z_0 is interpreted as the empty set). If (3) is substituted
into (4), the prediction and update may be written concisely as a single expression.
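For completeness, carrying out this substitution gives the single-expression form of
the recursion:

p(x_k | Z_k) = [ p(z_k | x_k) / p(z_k | Z_{k-1}) ] ∫ p(x_k | x_{k-1}) p(x_{k-1} | Z_{k-1}) dx_{k-1}.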
The relations (3) and (4) define a very general but formal (or conceptual) solution
to the recursive estimation problem. Only in special cases can an exact, closed form
algorithm be obtained from this general result. (In other words, only in special cases
can the posterior density be exactly characterized by a sufficient statistic of fixed
and finite dimension.) By far the most important of these special cases is the linear-
Gaussian (L-G) model: if p (x0 ), p (xk |xk−1 ) and p (zk |xk ) are all Gaussian, then
the posterior density remains Gaussian [4] and (3) and (4) reduce to the standard
Kalman filter (which recursively specifies the mean and covariance of the posterior
Gaussian). Furthermore, for non-linear / non-Gaussian problems, the first recourse
is usually to attempt to force the problem into an L-G framework by linearisation.
This leads to the extended Kalman filter (EKF) and its many variants. For mildly
non-linear problems, this is often a successful strategy and many real systems operate
entirely satisfactorily using EKFs. However, with increasingly severe departures
from the L-G situation, this type of approximation becomes stressed to the point of
filter divergence (exhibited by estimation errors substantially larger than indicated
by the filter’s internal covariance). For such grossly non-linear problems, the particle
filter may be an attractive option.
Suppose that a set of N random samples from the posterior pdf p(x_{k-1} | Z_{k-1})
(k > 0) is available. We denote these samples or particles by {x_{k-1}^{i*}}_{i=1}^N.
The prediction phase of the basic algorithm consists of passing each of these
samples from time step k − 1 through the system model (1) to generate a set of prior
samples at time step k. These prior samples are written {x_k^i}_{i=1}^N, where

x_k^i = f_{k-1}(x_{k-1}^{i*}, v_{k-1}^i)

and v_{k-1}^i is an (independent) sample drawn from the pdf of the system noise. This
straightforward and intuitively reasonable procedure produces a set of samples or
particles from the prior pdf p(x_k | Z_{k-1}).
Note that the measurement likelihood effectively indicates those regions of the
state space that are plausible “explanations” of the observed measurement value.
Where the value of the likelihood function is high, these state values are well sup-
ported by the measurement, and where the likelihood is low, these state values are
unlikely. (And where the likelihood is zero, these state values are incompatible with
the measurement model - i.e. they cannot exist.) So the update procedure effec-
tively weights each prior sample of the state vector by its plausibility with respect to
the latest measurement. The re-sampling operation is therefore biased towards the
more plausible prior samples, and the more heavily weighted samples may well be
chosen repeatedly (see discussion of sample impoverishment below). The algorithm
is shown schematically in fig ?? and some Matlab code for an example application
is given in Section 3.
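To make this structure concrete, the following is a minimal Matlab sketch of one
complete cycle of the basic filter. The function handles sys_model (the stochastic
dynamics model (1)) and meas_likelihood (the measurement likelihood p(z_k | x_k)) are
hypothetical stand-ins for the problem-specific models, and simple multinomial
resampling is used for brevity (a lower-variance systematic scheme is described in
the resampling discussion below).

% One predict-weight-resample cycle of the basic particle filter (sketch).
% x_post is a d-by-N array of posterior particles from time step k-1;
% z is the measurement received at time step k.
x_prior = sys_model(x_post);               % prediction: pass particles through (1) with fresh noise
w = meas_likelihood(z, x_prior);           % update: weight each prior sample by p(z_k | x_k^i)
w = w/sum(w);                              % normalise weights
c = cumsum(w);                             % cumulative distribution over particle labels
u = rand(1, size(x_prior,2));              % one uniform draw per output sample
ind = arrayfun(@(ui) find(c >= ui, 1), u); % resample labels with replacement
x_post = x_prior(:, ind);                  % resampled set represents the posterior at step k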
2.4 Empirical distributions
The sample sets described above may also be viewed as empirical distributions for
the required state pdfs, i.e. the prior:
p(x_k | Z_{k-1}) ≈ (1/N) Σ_{i=1}^N δ(x_k − x_k^i).    (5)
This representation also facilitates a simple justification of the update phase of the
basic filter using the “plug-in principle” [8]. Substituting the approximate form of
the prior (5) into Bayes rule (4), we obtain:

p(x_k | Z_k) ≈ Σ_{i=1}^N w_k^i δ(x_k − x_k^i),  where  w_k^i = p(z_k | x_k^i) / Σ_{j=1}^N p(z_k | x_k^j),

i.e. a weighted empirical distribution in which each prior sample is weighted by its
normalised likelihood. Resampling with replacement according to these weights then
produces an equally weighted posterior set {x_k^{i*}}_{i=1}^N.
One popular scheme is systematic resampling: a comb of N equally spaced levels at
intervals of 1/N is defined and the complete comb is translated by an offset chosen
randomly from a uniform distribution over [0, 1/N]. The comb is then compared
with the cumulative sum of the normalised weights w_k^i, as illustrated in fig 1 for
N = 7. For this example, the resampled set would consist of labels 2, 3, 3, 5, 6, 6
and 7 of the original set. This scheme has the advantage of only requiring the
generation of a single random sample, irrespective of the number of particles, and it
minimises the Monte Carlo variation - see Section 2.7 below. This method is used in
the example of Section 3 and so Matlab code for it is given in the example listing.
[Figure 1: systematic resampling example with N = 7 samples - the comb of N
equally spaced levels, translated by a single random offset, is compared against the
cumulative sum of weights (sample number on the horizontal axis).]
One common remedy for the resulting sample impoverishment is to add a small
amount of random jitter to each resampled particle. This rather ad hoc procedure
can be formalized as regularization - where a kernel is placed over each particle to
effectively provide a continuous mixture approximation to the discrete (empirical)
distribution (akin to kernel density estimation). Optimal
kernels for regularization are discussed in [11]. Another scheme for maintaining
diversity is to perform a Monte Carlo move - see [12].
A convenient measure of degeneracy is the effective sample size [13], defined by

N̂_eff = 1 / Σ_{j=1}^N (w_k^j)²,

which varies between 1 and N. A value close to 1 indicates that almost all the
probability mass is assigned to one particle and there is only one useful sample in
the set - i.e. severe degeneracy. Conversely, if the weights are uniformly spread
amongst the particles, the effective sample size approaches N. It is often suggested
that the resampling process should only be performed if N̂_eff falls below some
threshold (chosen empirically). If resampling is not carried out, the particle weights
from the previous time step are updated via the likelihood: w̃_k^i = w_{k-1}^i p(z_k | x_k^i),
and then normalised. In this case the required posterior pdf of the state is given by
the random measure {x_k^i, w_k^i}_{i=1}^N, and these particles are passed through the
system model in the prediction phase to generate the x_{k+1}^i for the next measurement
update (so the prior distribution at k + 1 would be approximated by
p(x_{k+1} | Z_k) ≈ Σ_{i=1}^N w_k^i δ(x_{k+1} − x_{k+1}^i)).
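In Matlab, this conditional resampling logic might be sketched as follows; Nthresh is
the empirically chosen threshold, and meas_likelihood and resample are hypothetical
helpers standing for the operations already described:

w = w .* meas_likelihood(z, x);  % update the previous weights via the likelihood
w = w/sum(w);                    % normalise
Neff = 1/sum(w.^2);              % effective sample size
if Neff < Nthresh                % resample only when degeneracy is severe
    [x, w] = resample(x, w);     % e.g. the systematic scheme of fig 1; weights reset to 1/N
end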
Point estimates of the state required for display and guidance purposes can be easily
estimated. For example, the posterior mean and covariance are given by

x̂_k = E[x_k | Z_k] ≈ Σ_{i=1}^N w_k^i x_k^i  or  (1/N) Σ_{i=1}^N x_k^{i*},

cov(x_k) = E[(x_k − x̂_k)(x_k − x̂_k)^T | Z_k] = ∫ (x_k − x̂_k)(x_k − x̂_k)^T p(x_k | Z_k) dx_k
    ≈ Σ_{i=1}^N w_k^i (x_k^i − x̂_k)(x_k^i − x̂_k)^T  or  (1/N) Σ_{i=1}^N (x_k^{i*} − x̂_k)(x_k^{i*} − x̂_k)^T,

where the first approximation applies to a weighted particle set and the second to
equally weighted (resampled) particles.
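With the particles stored column-wise, as in the Section 3 listing, these estimates
reduce to a few lines of Matlab (a sketch; x is a d × N array of particles and w a
1 × N row of normalised weights):

N = size(x,2);
xhat = x*w';                                 % weighted posterior mean (d-by-1)
dx = x - repmat(xhat, 1, N);                 % deviations from the mean
P = (dx .* repmat(w, size(x,1), 1)) * dx';   % weighted covariance: sum_i w_i dx_i dx_i'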
However, the mean and covariance may be a poor summary of the posterior, par-
ticularly if it is multimodal or skewed. A scatter plot of the samples, a histogram
or a kernel density estimate [14] are more informative for a 1 or 2-D state vector
(or for marginals of the full state vector). Another useful descriptor is the highest
probability density (HPD) region. The (1 − α) HPD region is the set of values of
the state vector which contain 1 − α of the total probability mass, such that the
pdfs of all points within the region are greater than or equal to the pdfs of all those
outside the region - i.e. if H is the (1 − α) HPD region, then ∫_H p(x) dx = 1 − α and
p(x′) ≥ p(x′′) for all x′ ∈ H and x′′ ∉ H. The HPD region is usually only considered
for scalars and it may be difficult to find for multimodal pdfs. A simpler option is
to find the percentile points on scalar marginals of the distribution. For example,
the (1 − α)·100 percentile point is given (roughly) by finding the largest Nα samples
and choosing the smallest of these.
In many cases, the requirement is to find some particular function of the posterior,
and the sample representation is often ideal for this. For example, for threat analysis,
one may be interested in the probability that a target is within some particular
region - this can be estimated by counting the number of particles falling within
that region. Also, for decision and control problems, an estimate of the expected
value of any form of cost or utility function C(x_k) is simply given by

E[C(x_k) | Z_k] = ∫ C(x_k) p(x_k | Z_k) dx_k ≈ Σ_{i=1}^N w_k^i C(x_k^i)  or  (1/N) Σ_{i=1}^N C(x_k^{i*}).

This is the starting point for Monte Carlo approaches to the difficult problem of
stochastic control - especially with non-quadratic cost functions [15, 16].
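Both of these estimates are one-line computations over the particle set; in the
following sketch, in_region and cost are hypothetical problem-specific functions and
x_post holds the equally weighted resampled particles column-wise:

p_region = mean( in_region(x_post) );  % fraction of particles inside the region of interest
Ecost    = mean( cost(x_post) );       % (1/N) * sum_i C(x_k^i*)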
2.9 Discussion
Convenience The basic particle filter is a very simple algorithm and it is quite
straightforward to obtain good results for many highly non-linear recursive estima-
tion problems. So problems that would be difficult to handle using an extended
Kalman filter, state space gridding or a Gaussian mixture approach are quite ac-
cessible to the “novice” via a blind application of the basic algorithm. Although
this is hugely liberating, it is something of a mixed blessing: there is a danger that
such challenging cases are not treated with proper respect and that subtleties and
implications of the problem are not appreciated [17].
3 Example
To demonstrate the operation of the particle filter we present an application to a
pendulum estimation problem. A weightless rigid rod of length L is freely pivoted
at one end and carries a mass at its other end. The rod makes an angle θ with the
horizontal and its instantaneous angular acceleration is given by

θ̈ = (1/L)(−g + v) cos θ,

where g is the acceleration due to gravity. Discretizing this model over a fixed time
step ∆t gives the dynamics equations

θ_k = θ_{k-1} + θ̇_{k-1} ∆t + (∆t²/2L)(−g + v_{k-1}) cos θ_{k-1}  (mod 2π),
θ̇_k = θ̇_{k-1} + (∆t/L)(−g + v_{k-1}) cos θ_{k-1},    (6)

where θ has been restricted to the range [0, 2π), ∆t is the fixed time step and the
acceleration disturbance v_k is a zero mean, white, Gaussian random sequence of
variance q. So this example has a two element state vector x_k = (θ_k, θ̇_k)^T for k > 0.
Measurements are obtained from the length of the rod projected onto a vertical axis,
i.e. L|sin θ_k|. These measurements are quantized at intervals of δ, but are otherwise
error-free, so

z_k = Q_δ(L|sin θ_k|),

where the quantization operator Q_δ(x) = (n − 1)δ for the integer n such that
(n − 1)δ < x ≤ nδ. Thus the likelihood of the state vector is

p(z_k | x_k) = 1 if z_k < L|sin θ_k| ≤ z_k + δ, and 0 otherwise.    (7)
[Figure 2: pendulum geometry - a rod of length L, freely pivoted at one end and
subject to acceleration g + v, with measured (quantized) projection L|sin θ|.]
The basic version of the particle filter has been applied to this example. Here
each particle is a two element vector (θ, θ̇). As already indicated, the prediction
phase of the filter consists of passing each particle through the dynamics model (6).
A Matlab code for this example is shown below. In this code, the posterior particles
{x^{i*}}_{i=1}^N are contained in the 2 × N array x_post, where the two rows correspond
to θ and θ̇, and each column is an individual particle. Similarly, the prior particles
{x^i}_{i=1}^N are contained in the 2 × N array x_prior, and nsamples is the number of
particles N. The un-normalised weights for each particle are stored in the N element
array likelihood, while the normalised weights and their cumulative sum are held
in weight. It is easy to recognise the dynamics equations (6) in the prediction phase
and the likelihood (7) in the update phase.
Hopefully, this code listing will clarify the specification of the filter given in
Section 2.3. Note that the complete filter can be expressed in a few lines of Matlab:
the basic algorithm is (embarrassingly) simple. Furthermore, there are no “hidden
extras”: the code does not call any sophisticated numerical algorithms (numerical
integration packages, eigenvector solvers, etc) or symbolic manipulation packages -
except perhaps for the random number generator and the Matlab array handling
routines.
%**************************************************************************
% Generate initial samples for k=0:
x_post(1,:) = 2*pi*rand(1,nsamples);
x_post(2,:) = theta_dot_init + sig_vel_init*randn(1,nsamples);
for k=1:nsteps
% PREDICT
F1 = dt*dt/(2*pend_len); F2=dt/pend_len;
drive1 = randn(1,nsamples); % random samples for system noise
accn_in = (-gee+drive1*sig_a).*cos(x_post(1,:));
x_prior(1,:) = mod( x_post(1,:) + dt*x_post(2,:) + F1*accn_in , 2*pi );
x_prior(2,:) = x_post(2,:) + F2*accn_in;
% UPDATE
% EVALUATE WEIGHTS resulting from meas(k):
project = pend_len.*abs(sin(x_prior(1,:))); % rod projection for each sample
likelihood = zeros(nsamples,1);
likelihood( find( project>=meas(k) & project<meas(k)+delta ) )=1;
weight = likelihood/sum(likelihood); % normalise weights
weight = cumsum(weight); % form cumulative distribution
% RESAMPLE with replacement via the systematic scheme of fig 1
% (these closing lines are a reconstruction; the listing is truncated here):
comb = ( rand + (0:nsamples-1) )/nsamples; % comb of N levels with a single random offset
ind = zeros(1,nsamples); j = 1;
for i=1:nsamples
while weight(j) < comb(i), j = j+1; end % advance to the weight bin containing comb(i)
ind(i) = j;
end
x_post = x_prior(:,ind); % resampled posterior particles for this time step
end
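For completeness, a possible parameter set-up and truth simulation consistent with
the values quoted below is sketched here; this driver is our own illustration and is
not part of the original listing:

nsamples = 1000; nsteps = 200;                % 10 s of data at dt = 0.05 s
dt = 0.05; pend_len = 3; gee = 10; sig_a = 7; % L = 3 m, g = 10 m/s^2, std of v = 7 m/s^2
delta = pend_len/2;                           % quantization interval
theta_dot_init = 2.4; sig_vel_init = 0.4;     % initial velocity prior N(2.4, 0.4^2)
% Simulate the true pendulum states and quantized measurements:
th = 0.3; th_dot = 2; meas = zeros(1,nsteps);
for k=1:nsteps
a = (-gee + sig_a*randn)*cos(th);             % (-g + v) cos(theta)
th = mod( th + dt*th_dot + (dt*dt/(2*pend_len))*a , 2*pi );
th_dot = th_dot + (dt/pend_len)*a;
meas(k) = delta*floor( pend_len*abs(sin(th))/delta ); % quantizer (floor form of Q_delta)
end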
This program has been applied to the data set shown in fig 3. Here the quanti-
zation interval is δ = L/2, so the only information available from the measurements
is whether the projected rod length is greater or less than L/2 (i.e. one bit of infor-
mation). The other parameters of this simulation are θ_0 = 0.3 rads, θ̇_0 = 2 rads/s,
∆t = 0.05 s, L = 3 m, g = 10 m/s² and the standard deviation of the driving noise
v is 7 m/s². In the 10 second period shown, the pendulum changes its direction of
rotation twice (after about 1 s and 8 s), between which it makes a complete rotation.
The initial conditions supplied to the particle filter are that the angular velocity θ̇_0
is drawn from a Gaussian distribution with a mean of 2.4 rads/s and a standard
deviation of 0.4 rads/s. As already stated, the initial angle θ_0 is uniformly distributed
over [0, 2π) rads. The initial particle set is drawn from these distributions as shown
in the above listing.
[Figure 3: simulated pendulum data over 10 s - angular position (radians), angular
velocity (radians per sec), and the true rod projection with the quantized
measurements, all against time.]
The result of running the filter with N = 1000 particles is shown in fig 4. This
figure shows the evolution of the posterior pdf of the angle θ obtained directly from
the posterior particles. The pdf for each of the 200 time steps is a simple histogram
of the posterior angle particles. The evolving distribution consists of streams or
paths of modes that cross and pass through regions of bifurcation. For this case
of δ = L/2, the measurements switch between 0 and L/2 whenever |sin θ| = 0.5,
i.e. when θ = π/6, 5π/6, 7π/6 or 11π/6. As is evident from fig 4, at these transition
points, the pdf modes sharpen. Occasionally a path is terminated if it is incompat-
ible with a measurement transition (for example, for θ = 11π/6 at about k = 15).
The actual angle of the pendulum is shown as a string of dots.
[Figure 4: evolution of the posterior pdf of the angle θ for N = 1000 particles - a
histogram of the angle particles (0 to 2π) for each of the 200 frames.]
The corresponding result for N = 50000 particles, shown in fig 5, is smoother
but is otherwise very similar to the 1000 particle result. The N = 1000 case took
about 5 ms per time step to run on a ??MHz Xeon processor - a quite acceptable
rate for the quality of the result. For N = 50000, the time taken increased almost
linearly to 260 ms per time step. Note that, apart from the obvious time penalty, it
is trivial to improve filter performance towards the exact posterior pdf by increasing
the number of particles.
Discussion This filtering example was chosen to demonstrate the particle filter
because it is simple to specify and would be difficult to tackle using an extended
Kalman filter (or an Unscented Kalman filter). It is also a low dimensional case and
so is an easy example for a particle filter. With some effort, it would be possible to
develop a multiple hypothesis Kalman filter to capture the multi-modal nature of the
posterior pdf and to include the 2π wrap around in angle. Also it might be possible
to represent the quantization function as a form of Gaussian mixture. However,
this would all be quite awkward and definitely approximate (and would probably be
more computationally expensive). The particle approach avoids all such difficulties
in this example. Also, the traditional summary descriptors of recursive estimation -
the mean and covariance - would be quite inappropriate for this example where the
posterior pdf is often multi-modal and sometimes unimodal but highly skewed.
[Figure 5: evolution of the posterior pdf of the angle θ, as fig 4, but for N = 50000
particles.]
4 A more general particle filter

Recall that in the basic filter, the weights are evaluated at support points {x_k^i}_{i=1}^N
drawn from the prior pdf p(x_k | Z_{k-1}). Furthermore, these samples {x_k^i}_{i=1}^N are
obtained from the posterior samples {x_{k-1}^{i*}}_{i=1}^N of the previous time step by
passing them through the dynamics model. In other words, each support point x_k^i is
a sample of the transition pdf p(x_k | x_{k-1}^{i*}) conditional on x_{k-1}^{i*}. However, it
is not necessary to generate the {x_k^i}_{i=1}^N in this way; they may be obtained from
any pdf (known as an importance or proposal density) whose support includes that of
the required posterior p(x_k | Z_k). In particular, the importance pdf may depend on
z_k, the value of the measurement at time step k. This more general approach
considerably broadens the scope for filter design.
The more general formulation is a two stage process similar to the basic filter of
Section 2.3, but these stages do not correspond directly to prediction and update
phases. As before, we assume that N random samples {x_{k-1}^{i*}}_{i=1}^N of the
posterior pdf p(x_{k-1} | Z_{k-1}) at time step k − 1 are available.

Sampling: For each particle x_{k-1}^{i*}, draw a sample x_k^i from an importance
density q(x_k | x_{k-1}^{i*}, z_k).
Weight evaluation: The unnormalised weight corresponding to sample x_k^i is
given by:

w̃_k^i = p(z_k | x_k^i) p(x_k^i | x_{k-1}^{i*}) / q(x_k^i | x_{k-1}^{i*}, z_k).    (8)

As before, the weights are normalised w_k^i = w̃_k^i / Σ_{j=1}^N w̃_k^j and the empirical
pdf of the posterior is given by p(x_k | Z_k) ≈ Σ_{i=1}^N w_k^i δ(x_k − x_k^i). Resampling
with replacement according to the normalised weights produces a set of samples
{x_k^{i*}}_{i=1}^N of the posterior pdf p(x_k | Z_k). Note that if the importance density
is chosen to be the transition pdf, i.e. q(x_k | x_{k-1}^{i*}, z_k) = p(x_k | x_{k-1}^{i*}),
equation (8) reverts to the basic particle filter. The general form of the weight
equation (8) is essentially a modification of the basic form to compensate for the
different importance density.
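As an illustration, the sampling and weighting stages might be sketched in Matlab as
follows; proposal_sample, proposal_pdf, transition_pdf and meas_likelihood are
hypothetical function handles for the designer's chosen q, the transition density and
the measurement likelihood:

% One sampling-and-weighting step of the general particle filter (sketch).
x_new = zeros(size(x_post)); w = zeros(1,nsamples);
for i=1:nsamples
xs = proposal_sample( x_post(:,i), z );            % draw x_k^i from q( . | x_{k-1}^{i*}, z_k)
w(i) = meas_likelihood(z, xs) * transition_pdf(xs, x_post(:,i)) ...
       / proposal_pdf(xs, x_post(:,i), z);         % unnormalised weight, equation (8)
x_new(:,i) = xs;
end
w = w/sum(w);   % normalise; then resample with replacement as before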
The advantage of this formulation is that the filter designer can choose any im-
portance density q(x_k | x_{k-1}, z_k) provided its support includes that of p(x_k | Z_k).
If this condition is met, as N → ∞, the resulting sample set {x_k^{i*}}_{i=1}^N will be
distributed as p(x_k | Z_k). This flexibility allows one to place samples where they
are needed to provide a good representation of the posterior - i.e. in areas of high
probability density rather than in sparse regions. In particular, since the importance density
may depend on the value of the received measurement zk , if the measurement is
very accurate (or if it strongly localizes the state vector in some sense), the impor-
tance samples can be placed in the locality defined by zk [18]. This is especially
important if the “overlap” between the prior and the likelihood is low - adjusting
the importance density could avoid wasting a high percentage of the particles (i.e.
impoverishment). There is considerable scope for ingenuity in designing the im-
portance density and a number of particle filter versions have been suggested for
particular choices of this density. An optimal importance density may be defined
as one that minimises the variance of the importance weights. For the special case
of non-linear dynamics with additive Gaussian noise, a closed form expression for
the optimal importance density can be obtained [10]. In general such an analytical
solution is not possible, but sub-optimal results based on local linearisation (via an
EKF or unscented Kalman filter) may be employed [1].
As in the basic version of the filter, it is not necessary to carry out resampling
at every time step. If resampling is omitted, the particle weights from the previous
time step are updated according to:

w̃_k^i = w_{k-1}^i p(z_k | x_k^i) p(x_k^i | x_{k-1}^i) / q(x_k^i | x_{k-1}^i, z_k).    (9)
This general result is known as Sequential Importance Sampling and it is most easily
derived by (formally) considering the full time history or trajectory of each particle
and marginalizing out past time steps [1, 2, 9, 10, 18]. This result is also the starting
point for most expositions on particle filter theory (although, unusually, in this
paper the development has been from specific to general).
5 Computational issues
5.1 Computational cost for the basic filter
The computational cost of the basic particle filter (with systematic resampling) is
almost proportional to the number N of particles employed, both in terms of op-
eration count and memory requirements. The computational effort associated with
each particle clearly depends directly on the complexity of the system dynamics and
the measurement process. For example, problems involving measurement associa-
tion uncertainty may require a substantial measurement likelihood calculation (i.e.
a summation over hypotheses). For such cases there is a strong motivation to find
efficient ways of evaluating the likelihood - including approximate gating and the
use of likelihood ratios (see examples in Chapters 11 and 12 of [1]).
A notable advantage of the particle filter is that the available computational
resources can be fully exploited by simply adjusting the number of particles - so it is
easy to take advantage of the ever increasing capability of cheap computers. Similarly,
if the measurement data rate is variable, the filter can match the number of particles
to the available time interval to optimize performance. (However, if the number of
particles falls below a critical level, the filter performance may degrade to a point
from which it cannot recover.) Also note that the filter is amenable to parallelization
- until a resampling event occurs, all particle operations are independent. (Some
recent developments on a parallel particle filter in the context of multiple target
tracking are reported in [23].)
The required sample size depends strongly on the design of the particle filter
and the problem being addressed (dimension of state vector, volume of support,
etc). For certain problems, especially high dimensional ones, an enormous, infea-
sible, number of samples is required to obtain satisfactory results with the basic
filter. To obtain a practical algorithm in these circumstances, the designer has to
be inventive. The theory outlined in Section 4 provides a rigorous framework for
exploring options, and, with a careful choice of proposal distribution and/or ex-
ploiting Rao-Blackwellization, it may be possible to design a filter that gives quite
satisfactory performance with a modest number of particles (a few hundred or even
tens in some cases). However, the basic algorithm has the advantage of simplicity,
so that the operation count for each particle may be much lower than for a more
subtle filter. Practical particle filter design is therefore a compromise between these
approaches with the aim of minimizing the overall computational load. Also note
that heuristic tricks may well be helpful.
The usual way of determining when enough samples are being deployed is via
trial and error: the sample size is increased until the observed error in the parameter
of interest (from a set of representative simulation examples) falls to a steady level.
If the required sample size is too large for the available processing resources, one may
have to settle for sub-optimal filter performance or attempt to improve the design
of the filter. This empirical approach is not entirely satisfactory, and more work in
this area is required to obtain, at least, guidelines that are of use to practising (as
opposed to academic) engineers.
Finally, note that filter initialization is often the most challenging aspect of a
recursive estimation problem. In particular, if the prior information (i.e. before
measurements are received) is vague, so that the initial uncertainty spans a large
volume of state space, the direct (obvious) approach of populating the prior pdf with
particles may be very wasteful. Semi-batch schemes using the first few measurement
frames may be useful.
6 Applications
Particle filters have been employed in a wide range of domains: essentially, wherever
there is a requirement to estimate the state of a stochastic evolving system using
uncertain measurement data. Below, we briefly indicate some of the more successful
or popular applications (with a bias towards tracking problems).
Tracking and navigation with a bounded support Particle filters are ideal
for problems where the state space has a restricted or bounded support. Examples
include targets moving on a road network (the Ground Moving Target Indicator
(GMTI) problem - see [25] and chapter 10 of [1]), inside a building [26] or in restricted
waters [27, 28]. Hard edges and boundaries, which cannot be easily accommodated
by Kalman-type filters, do not pose any difficulty for the particle approach. Essen-
tially, the bounded support is simply flooded with particles.
Tracking with non-standard sensors The classical non-linear tracking test case
is the bearings-only problem with passive sensors (acoustic, electro-optical or elec-
tronic support measures (ESM)), and particle filters have certainly been applied to
examples of this type (see [5, 28] and chapter 6 of [1]). However, particle tracking
filters have also been successfully implemented with range-doppler sensors that pro-
vide measurements of only observer-target range and range rate (see chapter 7 of
[1]). An interesting application to a network of binary sensors (i.e. each sensor pro-
vides one bit of information) is reported in [29]. Also particle filtering of raw sensor
outputs (such as pixel grey levels) has been examined by a number of workers in
the context of track-before-detect (see chapter 11 of [1] and [30, 31]).
Multiple object tracking and association uncertainty The obvious way of
approaching multiple target tracking problems is to concatenate the state vectors
of individual targets and attempt to estimate the combined state. This approach is
appropriate if the targets’ dynamics are interdependent (for example, formation or
group dynamics - see chapter 12 of [1]) or if there is measurement association uncer-
tainty (or unresolved targets) due to object proximity [32]. Particle filters have been
successfully applied in these cases for small numbers of objects, although the eval-
uation of the likelihood function (for every particle) can be expensive as it involves
summing over feasible assignment hypotheses. An alternative more efficient route
suggested in [33] is to employ a Probabilistic Multiple Hypothesis Tracker (PMHT)
likelihood which effectively imposes independence between object-measurement as-
signments. This approach may also be viewed as a superposition of Poisson target
models (possibly including extended objects) [34]. Particle filtering is also the im-
plementation mechanism for the finite-set statistics Probability Hypothesis Density
(PHD) filter [35, 36, 37].
Computer vision and robotics Particle filtering was introduced to the com-
puter vision community as the CONDENSATION algorithm [7]. In this application,
the state vector includes shape descriptors as well as dynamics parameters. This has
been a successful domain for particle filters and there is now a substantial literature
especially in IEEE Computer Society Conferences on Computer Vision and Pattern
Recognition (CVPR) and IEEE International Conferences on Pattern Recognition
(ICPR). Applications include tracking of facial features (especially using active con-
tours or “snakes”), gait recognition and people tracking (some recent publications
include [38, 39, 40, 41] ). Particle filters are also well represented in the robotics
literature: they have been successfully applied to localization, mapping and fault
diagnosis problems [26, 42, 43, 44].
Econometrics Progress in this field has tended to parallel, but remain largely
independent of, engineering developments. However, in the case of particle filtering
and Monte Carlo methods, there has been perhaps more “cross-over” than usual.
Econometric applications include stochastic volatility modelling for stock indices
and commodity prices [45, 46, 47, 48].
Data assimilation In the meteorological and oceanographic communities, the use of
particle filters for data assimilation has been considered [49].
7 Concluding remarks
Over the past few years, particle filters have become a popular topic. There have
been a large number of papers (arguably too many) demonstrating new applications
and algorithm developments. This popularity may be due to the simplicity and gen-
erality of the basic algorithm - it is easy to get started. Furthermore, the particle
filter is not another variant of the EKF: it does not stem from linear-Gaussian or
least-squares theory. It also appeals to both the “hands on engineer” (there is plenty
of scope for algorithm tweaking) and to the more theoretical community (with sub-
stantial challenges to develop performance bounds and guidelines for finite sample
sizes). Undoubtedly a key enabler for this activity has been the massive increase in
the capability of cheap computers - as Daum [17] has recently pointed out “comput-
ers are now eight orders of magnitude faster (per unit cost) compared with 1960,
when Kalman published his famous paper” .
The basic or naive version of the particle filter may be regarded as
a black box algorithm with a single tuning parameter - the number of samples. This
filter is very effective for many low dimensional problems, and, perhaps fortuitously,
reasonable results have been obtained for state vectors with about ten elements
without resorting to an enormous number of particles. For more challenging high
dimensional problems, a more subtle approach (exploiting Rao-Blackwellization and
carefully chosen proposal distributions) is generally beneficial - there is a design
trade-off between many simple or fewer smart particles. This (problem dependent)
compromise would benefit from further study.
To date, most particle filter applications have been in simulation studies or off-
line with recorded data. However, particle filters are beginning to appear as on-line
elements of real systems - mainly in navigation and robotics applications. The tech-
nology (and necessary processing capability) is now sufficiently mature to support
the leap to such real-time system implementation - we expect to see a significant
increase here in coming years.
References
[1] B. Ristic, S. Arulampalam, and N. Gordon, Beyond the Kalman filter: particle
filters for tracking applications. Artech House, 2004.
[2] A. Doucet, N. de Freitas, and N. J. Gordon, eds., Sequential Monte Carlo
Methods in Practice. New York: Springer, 2001.
[3] IEEE, “Special issue on Monte Carlo methods for statistical signal processing,”
IEEE Trans. Signal Processing, vol. 50, February 2002.
[6] G. Kitagawa, “Monte Carlo filter and smoother for non-Gaussian non-linear
state space models,” Journal of Computational and Graphical Statistics, vol. 5,
no. 1, pp. 1–25, 1996.
[9] J. S. Liu and R. Chen, “Sequential Monte Carlo methods for dynamical sys-
tems,” Journal of the American Statistical Association, vol. 93, pp. 1032–1044,
1998.
[10] A. Doucet, S. Godsill, and C. Andrieu, “On sequential Monte Carlo sam-
pling methods for Bayesian filtering,” Statistics and Computing, vol. 10, no. 3,
pp. 197–208, 2000.
[12] W. R. Gilks and C. Berzuini, “Following a moving target – Monte Carlo infer-
ence for dynamic Bayesian models,” Journal of the Royal Statistical Society, B,
vol. 63, pp. 127–146, 2001.
[14] B. Silverman, Density estimation for statistics and applied data analysis. Chap-
man and Hall, 1986.
[15] C. Andrieu, A. Doucet, S. Singh, and V. Tadic, “Particle methods for change
detection, system identification, and control,” Proceedings of the IEEE, vol. 92,
pp. 423–438, March 2004.
[16] D. Salmond, N. Everett, and N. Gordon, “Target tracking and guidance using
particles,” American Control Conference, Arlington, Virginia, pp. 4387–4392,
June 2001.
[17] F. Daum, “Nonlinear filters: beyond the Kalman filter,” IEEE A and E Systems
Magazine - Part 2: Tutorials II, vol. 28, pp. 57–69, August 2005.
[21] A. Doucet, N. Gordon, and V. Krishnamurthy, “Particle filters for state estima-
tion of jump Markov linear systems,” IEEE Trans. Signal Processing, vol. 49,
pp. 613–624, March 2001.
[24] F. Daum and J. Huang, “Curse of dimensionality and particle filters,” in Pro-
ceedings of IEEE Aerospace Conference, (Big Sky), March 2003.
[26] D. Fox, S. Thrun, W. Burgard, and F. Dellaert, “Particle filters for mobile
robot localization,” in Sequential Monte Carlo Methods in Practice (A. Doucet,
N. de Freitas, and N. J. Gordon, eds.), New York: Springer, 2001.
[30] Y. Boers and J. Driessen, “Multitarget particle filter track before detect ap-
plication,” IEE Proceedings on Radar, Sonar and Navigation, vol. 151, no. 6,
pp. 351–357, 2004.
[32] Z. Khan, T. Balch, and F. Dellaert, “Multitarget tracking with split and merged
measurements,” in CVPR 2005: Proceedings of the 2005 Computer Society
Conference on Computer Vision and Pattern Recognition, 2005.
[33] C. Hue, J. L. Cadre, and P. Pérez, “Sequential Monte Carlo methods for mul-
tiple target tracking and data fusion,” IEEE Trans. Signal Processing, vol. 50,
pp. 309–325, February 2002.
[34] K. Gilholm, S. Godsill, S. Maskell, and D. Salmond, “Poisson models for ex-
tended target and group tracking,” SPIE: Signal and Data Processing of Small
Targets, vol. 5913, August 2005.
[35] B.-N. Vo, S. Singh, and A. Doucet, “Random finite sets and sequential Monte
Carlo methods in multi-target tracking,” in Proceedings of International Radar
Conference, pp. 486–491, September 2003.
[36] R. Mahler, “Statistics 101 for multisensor, multitarget data fusion,” IEEE A
and E Systems Magazine - Part 2: Tutorials, vol. 19, pp. 53–64, January 2004.
[37] R. Mahler, “Random sets: unification and computation for information fusion
- a retrospective assessment,” in Fusion 2004: Proceedings of the 7th interna-
tional conference on Information Fusion, pp. 1–20, 2004.
[38] R. Green and L. Guan, “Quantifying and recognizing human movement pat-
terns from monocular images: Parts I and II,” IEEE Transactions on Circuits
and Systems for Video Technology, vol. 14, no. 2, pp. 179–198, 2004.
[39] L. Wang, H. Ning, T. Tan, and W. Hu, “Fusion of static and dynamic body
biometrics for gait recognition,” IEEE Transactions on Circuits and Systems
for Video Technology, vol. 14, no. 2, pp. 149–158, 2004.
[40] J. Tu, Z. Zhang, Zeng, and T. Huang, “Face localization via hierarchical CON-
DENSATION with Fisher boosting feature selection,” in CVPR 2004: Pro-
ceedings of the 2004 Computer Society Conference on Computer Vision and
Pattern Recognition, pp. II–719–II–724, 2004.
[42] S. Thrun, “Particle filters in robotics,” in Proceedings of the 17th Annual Con-
ference on Uncertainty in AI (UAI), 2002.
[44] S. Sutharsan, A. Kirubarajan, and A. Sinha, “Adapting the sample size in par-
ticle filters through KLD-sampling,” International journal of robotics research,
vol. 22, pp. 985–1004, October 2003.
[45] G. Kitagawa and S. Sato, “Monte Carlo smoothing and self-organising state-
space model,” in Sequential Monte Carlo Methods in Practice (A. Doucet,
N. de Freitas, and N. J. Gordon, eds.), New York: Springer, 2001.
[46] M. Pitt and N. Shephard, “Auxiliary variable based particle filters,” in Se-
quential Monte Carlo Methods in Practice (A. Doucet, N. de Freitas, and N. J.
Gordon, eds.), New York: Springer, 2001.
[47] J. Stroud, N. Polson, and P. Mueller, “Practical filtering for stochastic volatil-
ity models,” in State Space and Unobserved Component Models (A. Harvey,
S. Koopman, and Shephard, eds.), Cambridge University Press, 2004.
[48] P. Fearnhead, “Using random Quasi-Monte-Carlo within particle filters with
application to financial time series,” Journal of Computational and Graphical
Statistics, To appear.
[49] P. J. van Leeuwen, “Nonlinear ensemble data assimilation for the ocean,” in
Seminar on recent developments in data assimilation for atmosphere and ocean,
(Shinfield Park, Reading, UK), pp. 265–286, European Centre for Medium-
Range Weather Forecasts, September 2003.