
Lecture 4: The Particle Filter

Lecturer: Raghu Krishnapuram


Distinguished Member of Technical Staff
Robert Bosch Centre for Cyber-Physical Systems
[email protected]

* The Particle Filter

* Represent the posterior bel(x_t) by a set of random state samples drawn from the
  posterior.
* Since the representation is non-parametric, it is more broadly applicable.
* The samples of the posterior distribution are called particles:

  X_t := { x_t^[1], x_t^[2], ..., x_t^[M] }

* In general, M needs to be quite large for a good representation (e.g., M = 1,000).
* M can also be a function of t or of other quantities.
* Ideally, the likelihood of a state hypothesis x_t being included in X_t should be
  proportional to its Bayes filter posterior, i.e., x_t^[m] ∼ p(x_t | z_{1:t}, u_{1:t}).

* Algorithm Particle_filter(X_{t-1}, u_t, z_t)
  1.  X̄_t = X_t = ∅
  2.  for m = 1 to M do
  3.      sample x_t^[m] ∼ p(x_t | u_t, x_{t-1}^[m])   – sample from the motion model; X̄_t represents the predicted belief
  4.      w_t^[m] = p(z_t | x_t^[m])                   – compute the importance factor
  5.      X̄_t = X̄_t + ⟨x_t^[m], w_t^[m]⟩
  6.  endfor
  7.  for m = 1 to M do
  8.      draw i with probability ∝ w_t^[i]            – importance sampling (resampling with replacement)
  9.      add x_t^[i] to X_t
  10. endfor
  11. return X_t
* Importance sampling (the resampling step) changes the particle distribution from the
  predicted belief to the posterior bel(x_t) = η p(z_t | x_t) × (predicted belief).

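The following is a minimal NumPy sketch of one step of the algorithm above. The callables sample_motion and measurement_prob are hypothetical placeholders (not part of the lecture) for the motion model p(x_t | u_t, x_{t-1}) and the measurement likelihood p(z_t | x_t).

import numpy as np

def particle_filter_step(particles, u, z, sample_motion, measurement_prob, rng=None):
    """One step of the basic particle filter sketched above.

    particles        : array of shape (M, state_dim), the set X_{t-1}
    u, z             : current control and measurement
    sample_motion    : callable (x, u, rng) -> x', draws from p(x_t | u_t, x_{t-1})
    measurement_prob : callable (z, x) -> p(z_t | x_t)
    """
    rng = rng or np.random.default_rng()
    M = len(particles)
    # Lines 3-5: propagate each particle through the motion model and weight it
    predicted = np.array([sample_motion(x, u, rng) for x in particles])
    weights = np.array([measurement_prob(z, x) for x in predicted])
    weights = weights / weights.sum()        # normalize the importance factors
    # Lines 7-10: resample with replacement, proportional to the weights
    idx = rng.choice(M, size=M, p=weights)
    return predicted[idx]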
* Importance sampling

* Used when we need to work with a probability density function f, but we have
  access only to samples from a different pdf g. For example, suppose we need to
  estimate the probability that x ∈ A:

  E_f[I(x ∈ A)] = ∫ f(x) I(x ∈ A) dx
                = ∫ (f(x)/g(x)) g(x) I(x ∈ A) dx        (define w(x) := f(x)/g(x))
                = E_g[w(x) I(x ∈ A)]

* f is the target distribution, and g is the proposal distribution, chosen such that
  g(x) > 0 whenever f(x) > 0.
* Importance sampling (contd...)

* When we have samples (particles) from g, we can compensate for the difference
by weighting them according to:

  w^[m] = f(x^[m]) / g(x^[m]),   and

  ( Σ_{m=1}^{M} w^[m] )^{-1}  Σ_{m=1}^{M} w^[m] I(x^[m] ∈ A)  →  ∫_A f(x) dx

* In most cases, estimates with weighted particles converge to the desired E_f at the
  rate of O(1/√M).

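The weighted estimator above is easy to check numerically. The following sketch uses an illustrative setup that is not from the lecture (standard normal target f, wider normal proposal g) to estimate P(x > 2):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
M = 100_000

f = norm(loc=0.0, scale=1.0)     # target density f
g = norm(loc=0.0, scale=3.0)     # proposal density g (covers the support of f)

x = g.rvs(size=M, random_state=rng)           # particles drawn from g
w = f.pdf(x) / g.pdf(x)                       # importance weights w(x) = f(x)/g(x)
indicator = (x > 2.0)                         # I(x in A) with A = {x : x > 2}

estimate = np.sum(w * indicator) / np.sum(w)  # self-normalized weighted estimate
print(estimate, 1.0 - f.cdf(2.0))             # compare with the exact probability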
* Importance sampling in particle filtering

* The resampled particle set will contain many duplicates, since the sampling is done
  with replacement and M is fixed.
* It has a tendency to “collapse” all samples to a single value (survival of the
  fittest).
* An alternative version does not resample, but iteratively updates a weight for each
  sample, starting with weight = 1 (a sketch of this variant follows below):

  w_t^[m] = p(z_t | x_t^[m]) w_{t-1}^[m]

* In this approach, many particles will continue to exist in regions of low probability,
and hence the representation is not adaptive.

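This variant is often called sequential importance sampling. A minimal sketch of one step, reusing the hypothetical sample_motion and measurement_prob callables from the earlier sketch:

import numpy as np

def sis_step(particles, weights, u, z, sample_motion, measurement_prob, rng=None):
    """Sequential importance sampling step: no resampling, weights accumulate.

    Weights start at 1 for every particle and are multiplied by the measurement
    likelihood at every step: w_t^[m] = p(z_t | x_t^[m]) * w_{t-1}^[m].
    """
    rng = rng or np.random.default_rng()
    predicted = np.array([sample_motion(x, u, rng) for x in particles])
    likelihoods = np.array([measurement_prob(z, x) for x in predicted])
    return predicted, weights * likelihoods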
* Illustration of the “particle” representation used by particle filters

Samples of X are passed through the nonlinear function, resulting in the samples of Y.
* Illustration of importance factors in particle filters

We need to approximate the target density f. However, we can only generate samples
from g.
* Illustration of importance factors in particle filters (contd...)

A sample of f can be obtained by attaching a weight f (x)/g(x) to each sample x.

* Mathematical derivation of the PF

* Particle filters can be thought of as maintaining samples of state sequences, with the
  belief defined accordingly:

  x_{0:t}^[m] = x_0^[m], x_1^[m], ..., x_t^[m]
  bel(x_{0:t}) = p(x_{0:t} | u_{1:t}, z_{1:t})

* This gives us (using Bayes' rule and the Markov assumptions)

  p(x_{0:t} | z_{1:t}, u_{1:t}) = η p(z_t | x_{0:t}, z_{1:t-1}, u_{1:t}) p(x_{0:t} | z_{1:t-1}, u_{1:t})
                                = η p(z_t | x_t) p(x_{0:t} | z_{1:t-1}, u_{1:t})
                                = η p(z_t | x_t) p(x_t | x_{0:t-1}, z_{1:t-1}, u_{1:t}) p(x_{0:t-1} | z_{1:t-1}, u_{1:t})
                                = η p(z_t | x_t) p(x_t | x_{t-1}, u_t) p(x_{0:t-1} | z_{1:t-1}, u_{1:t-1})

* Mathematical derivation of the PF (contd...)

* The first particle set is sampled from the prior p(x_0). The particle set at time t − 1 is
  distributed according to bel(x_{0:t-1}).
* The sample x_t^[m], which replaces the m-th particle in the set, is generated from the
  proposal distribution

  p(x_t | x_{t-1}, u_t) bel(x_{0:t-1}) = p(x_t | x_{t-1}, u_t) p(x_{0:t-1} | z_{1:t-1}, u_{1:t-1}),   with

  w_t^[m] = target distribution / proposal distribution
          = [ η p(z_t | x_t) p(x_t | x_{t-1}, u_t) p(x_{0:t-1} | z_{1:t-1}, u_{1:t-1}) ]
            / [ p(x_t | x_{t-1}, u_t) p(x_{0:t-1} | z_{1:t-1}, u_{1:t-1}) ]
          = η p(z_t | x_t)

* Therefore, it follows that by using importance weights w_t^[m], we obtain state
  samples x_t^[m] distributed according to bel(x_t).
* Practical considerations of particle filters

* Often we need continuous representations of density functions.


* This can be achieved with a Gaussian approximation (only unimodal), a mixture of
  Gaussians (higher computational cost), histograms (high space complexity),
  density trees (expensive lookups), or kernel densities (smoother, but with complexity
  linear in the number of particles or kernels). A sketch of two of these options
  follows below.
* The method chosen depends on the application, the dimensionality of the state space,
  the number of hypotheses to be supported, etc.

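As a rough sketch of two of the options listed above (a single Gaussian fit and a kernel density estimate), assuming a weighted particle set stored as NumPy arrays of shape (M, state_dim) and (M,):

import numpy as np
from scipy.stats import gaussian_kde

def gaussian_approximation(particles, weights):
    """Fit a single (unimodal) Gaussian to a weighted particle set of shape (M, state_dim)."""
    w = weights / weights.sum()
    mean = np.average(particles, axis=0, weights=w)
    centered = particles - mean
    cov = (w[:, None] * centered).T @ centered   # weighted sample covariance
    return mean, cov

def kernel_density(particles, weights):
    """Smoother kernel density estimate; evaluation cost grows with the number of particles."""
    # gaussian_kde expects the data with shape (state_dim, M)
    return gaussian_kde(particles.T, weights=weights / weights.sum())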
* Different ways of extracting densities from particles

Gaussian approximation, kernel estimate, density and sample set approximation, and
histogram approximation.
* Sampling variance

When the sample size is small, the variation between different sample sets is generally
large. This sampling variance can be reduced by using larger sample sets.
* Resampling and variance reduction

* The resampling step in the particle filtering algorithm will not reproduce all the
particles in general.
* The particles with larger probabilities or weights tend to repeat and the ones with
smaller probabilities or weights may not appear in the next round.
* In theory, after repeated re-sampling, the diversity vanishes, and M copies of a
single particle will survive!
* The variance of the particle set as an estimator will increase with time.
* One possible approach to counter this effect is to reduce the frequency of
  resampling, and simply update the importance weights as follows:

  w_t^[m] = { 1                               if resampling took place
            { p(z_t | x_t^[m]) w_{t-1}^[m]    if no resampling took place
* Resampling and variance reduction (contd...)

* Monitor the variance of the importance weights and decide when to resample
  (a sketch of such a criterion follows this list).
* Another option is low variance sampling.
* This approach covers the weighted samples systematically by cycling through them
  with a single random offset.
* It guarantees that if all particles have the same importance factor, all of them
  are reproduced.
* The complexity of the low variance sampler is O(M).
* There are many alternative approaches, such as stratified sampling, which groups
  particles into subsets and ensures adequate representation from each subset.
* Sampling bias occurs because of the normalization of the weights, i.e., one degree
  of freedom is lost.
* Particle deprivation happens because of the variance in random sampling. It can
  generally be overcome with a larger set of particles.
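One common way to make the first point concrete is to track the effective sample size of the normalized weights and resample only when it drops below a threshold. The factor 0.5 below is an illustrative choice, not prescribed by the lecture:

import numpy as np

def effective_sample_size(weights):
    """ESS = 1 / sum(w^2) for normalized weights; a small ESS means high weight variance."""
    w = weights / weights.sum()
    return 1.0 / np.sum(w ** 2)

def should_resample(weights, threshold_fraction=0.5):
    """Resample only when the effective sample size falls below a fraction of M."""
    return effective_sample_size(weights) < threshold_fraction * len(weights)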
* Algorithm Low_variance_sampler(X_t, W_t)
  1.  X̄_t = ∅
  2.  r = rand(0; M^{-1})
  3.  c = w_t^[1]
  4.  i = 1
  5.  for m = 1 to M do
  6.      U = r + (m − 1) · M^{-1}
  7.      while (U > c)
  8.          i = i + 1
  9.          c = c + w_t^[i]
  10.     endwhile
  11.     add x_t^[i] to X̄_t
  12. endfor
  13. return X̄_t

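For reference, a direct NumPy transcription of the pseudocode above; it is a sketch that takes the particle array and its weights as NumPy arrays:

import numpy as np

def low_variance_sampler(particles, weights, rng=None):
    """Systematic (low variance) resampling, following the pseudocode above."""
    rng = rng or np.random.default_rng()
    M = len(particles)
    w = weights / weights.sum()        # make sure the weights are normalized
    resampled = []
    r = rng.uniform(0.0, 1.0 / M)      # single random offset r in [0, 1/M)
    c = w[0]                           # running cumulative weight
    i = 0
    for m in range(M):
        U = r + m / M                  # evenly spaced pointers into the weight CDF
        while U > c:
            i += 1
            c += w[i]
        resampled.append(particles[i])
    return np.array(resampled)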
