

Temporal Parallelization of Bayesian Smoothers


Simo Särkkä, Senior Member, IEEE, Ángel F. García-Fernández

Abstract— This paper presents algorithms for temporal parallelization of Bayesian smoothers. We define the elements and the operators to pose these problems as the solutions to all-prefix-sums operations for which efficient parallel scan-algorithms are available. We present the temporal parallelization of the general Bayesian filtering and smoothing equations and specialize them to linear/Gaussian models. The advantage of the proposed algorithms is that they reduce the linear complexity of standard smoothing algorithms with respect to time to logarithmic.

Index Terms— Bayesian smoothing, Kalman filtering and smoothing, parallel computing, parallel scan, prefix sums

S. Särkkä is with the Department of Electrical Engineering and Automation, Aalto University, 02150 Espoo, Finland (email: [email protected]).
Ángel F. García-Fernández is with the Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool L69 3GJ, United Kingdom (email: [email protected]).

I. INTRODUCTION

Parallel computing is rapidly transforming from a scientists' computational tool into a general-purpose computational paradigm. The availability of affordable massively-parallel graphics processing units (GPUs) as well as widely-available parallel grid and cloud computing systems [1]–[3] drives this transformation by bringing parallel computing technology to everyday use. This creates a demand for parallel algorithms that can harness the full power of the parallel computing hardware.

Stochastic state-space models allow for modeling of the time-behaviour and uncertainties of dynamic systems, and they have long been used in various tracking, automation, communications, and imaging applications [4]–[8]. More recently, they have also been used as representations of prior information in machine learning settings (see, e.g., [9]). In all of these applications, the main problem can be mathematically formulated as a state-estimation problem on the stochastic model, where we estimate the unknown phenomenon from a set of noisy measurement data. Given the mathematical problem, the remaining task is to design efficient computational methods for solving the inference problems on large data sets such that they utilize the computational hardware as effectively as possible.

Bayesian filtering and smoothing methods [6] provide the classical [10] solutions to state-estimation problems which are computationally optimal in the sense that their computational complexities are linear with respect to the number of data points. Although these solutions are optimal for single central processing unit (CPU) systems, due to the sequential nature of the algorithms, their complexity remains linear also in parallel multi-CPU systems. However, genuine parallel algorithms are often capable of performing operations in a logarithmic number of steps by massive parallelization of the operations. More precisely, their span-complexity [3], that is, the number of computational steps as measured by a wall-clock, is often logarithmic with respect to the number of data points. However, their total number of operations, the work-complexity, is still linear as all the data points need to be processed.

Despite the long history of state-estimation methods, the existing parallelization methods have concentrated on parallelizing the subproblems arising in Bayesian filtering and smoothing methods, but there is a lack of algorithms that are specifically designed for solving state-estimation problems in parallel architectures. There are, however, some existing approaches for parallelizing Kalman-type filters as well as particle filters. One approach, studied in [11] and [12], is to parallelize the corresponding batch formulation, which leads to sub-linear computational methods, because the matrix computations can be parallelized. If the state-space of the Kalman filter is large, it is then possible to speed up the matrix computations via parallelization [13], [14]. Particle filters can also be parallelized over the particles [15], [16], the bottleneck being the resampling step. In some specific cases, such as in multiple target tracking [17], it is possible to develop parallelized algorithms by using the structure of the specific problem.

The contribution of this article is to propose a novel general algorithmic framework for parallel computation of Bayesian smoothing solutions for state-space models. We also present algorithms for parallelizing the Bayesian filtering solutions, but our focus is on smoothing, because the parallel computation is done off-line in the sense that all the data needs to be available during the parallel computations and it cannot arrive sequentially. Our approach to parallelization differs from the aforementioned approaches in the aspect that we replace the whole Bayesian filtering and smoothing formalism with another, parallelizable formalism. We replace the Bayesian filtering and smoothing equations [4], [6] with another set of equations that can be combined with the so-called scan or prefix-sums algorithm [2], [18]–[20], which is one of the fundamental algorithmic frameworks in parallel computing. The advantage of this is that it allows for reduction of the linear O(n) complexity of batch filtering and smoothing algorithms to logarithmic O(log n) span-complexity in the number n of data points. Based on the novel formulation we develop parallel algorithms for computing the filtering and smoothing solutions to linear Gaussian systems with logarithmic span-complexity. As this parallelization is done in the temporal direction, the individual steps of the resulting algorithm could further be parallelized in the same way as Kalman filters and particle filters have previously been parallelized [13]–[17].

The organization of the article is the following. In Section II we review the classical Bayesian filtering and smoothing methodology as well as the parallel scan algorithm for computing prefix sums. In Section III we present the general framework for parallelizing Bayesian filtering and smoothing methods. Section IV is concerned with specializing the general framework to linear Gaussian systems. A numerical example with a linear/Gaussian system is given in Section V, and finally Section VI concludes the article along with a discussion on various aspects of the methodology.

II. BACKGROUND

A. Bayesian filtering and smoothing

Bayesian filtering and smoothing methods [6] are algorithms for statistical inference in probabilistic state-space models of the form

  x_k ~ p(x_k | x_{k-1}),
  y_k ~ p(y_k | x_k),        (1)

with x_0 ~ p(x_0). Above, the state x_k ∈ R^{n_x} at time step k evolves as a Markov process with transition density p(x_k | x_{k-1}). State x_k is observed by the measurement y_k ∈ R^{n_y} whose density is p(y_k | x_k).
The objective of Bayesian filtering is to compute the posterior density p(x_k | y_{1:k}) of the state x_k given the measurements y_{1:k} = (y_1, ..., y_k) up to time step k. Given the measurements up to a time step n, the objective of smoothing is to compute the density p(x_k | y_{1:n}) of the state x_k for k < n.

The key insight of Bayesian filters and smoothers is that the computation of the required densities can be done in a linear O(n) number of computational steps by using recursive (forward) filtering and (backward) smoothing algorithms. This is significant, because a naive computation of the posterior distribution would typically take at least O(n^3) computational steps.

The Bayesian filter is a sequential algorithm, which iterates the following prediction and update steps:

  p(x_k | y_{1:k-1}) = ∫ p(x_k | x_{k-1}) p(x_{k-1} | y_{1:k-1}) dx_{k-1},        (2)

  p(x_k | y_{1:k}) = p(y_k | x_k) p(x_k | y_{1:k-1}) / ∫ p(y_k | x_k) p(x_k | y_{1:k-1}) dx_k.        (3)

Given the filtering outputs for k = 1, ..., n, the Bayesian forward-backward smoother consists of the following backward iteration for k = n − 1, ..., 1:

  p(x_k | y_{1:n}) = p(x_k | y_{1:k}) ∫ [ p(x_{k+1} | x_k) p(x_{k+1} | y_{1:n}) / p(x_{k+1} | y_{1:k}) ] dx_{k+1}.        (4)

When applied to a batch of data of size n, the computational complexity of the filter and smoother is O(n) as they perform n sequential steps in the forward and backward directions when looping over the data. The Kalman filter and Rauch–Tung–Striebel (RTS) smoother [21], [22] are the solutions to these recursions when the transition densities are linear and Gaussian. The filtering and smoothing equations can also be analogously solved in closed form for discrete-state models [7]. In this paper, we show how to parallelize the previous recursions using the parallel scan-algorithm, which is reviewed next.
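As a concrete reference point for the sequential recursions (2)–(4), consider the discrete-state case mentioned above, where the integrals reduce to finite sums. The following is a minimal NumPy sketch of the forward and backward passes; it is an illustration rather than code from the paper, and all names (discrete_filter_smoother, trans, emit_lik) are invented for this example.

```python
import numpy as np

def discrete_filter_smoother(prior, trans, emit_lik):
    """Sequential Bayesian filter (2)-(3) and smoother (4) for a finite-state model.

    prior    : (S,) initial distribution p(x_0)
    trans    : (S, S) transition matrix, trans[i, j] = p(x_k = j | x_{k-1} = i)
    emit_lik : (n, S) likelihoods, emit_lik[k, j] = p(y_{k+1} | x_{k+1} = j)
    """
    n, S = emit_lik.shape
    filt = np.zeros((n, S))
    f = prior
    for k in range(n):                      # forward pass, n sequential steps
        pred = f @ trans                    # prediction step (2)
        upd = emit_lik[k] * pred
        f = upd / upd.sum()                 # update step (3)
        filt[k] = f
    smth = np.zeros((n, S))
    smth[-1] = filt[-1]
    for k in range(n - 2, -1, -1):          # backward pass (4)
        pred = filt[k] @ trans              # p(x_{k+1} | y_{1:k})
        smth[k] = filt[k] * (trans @ (smth[k + 1] / pred))
    return filt, smth
```

Both loops above are inherently sequential, which is exactly the O(n) behaviour that the rest of the paper replaces with an O(log n)-span parallel scan.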
B. Parallel scan-algorithm

The parallel-scan algorithm [20] is a general parallel computing framework that can be used to convert sequential O(n) algorithms with a certain associative property into O(log n) parallel algorithms. The algorithm was originally developed for computing prefix sums [18], where it uses the associative property of summation. The algorithm has since been generalized to arbitrary associative binary operators, and it is used as the basis of a multitude of parallel algorithms including sorting, linear programming, and graph algorithms. These kinds of algorithms are especially useful in GPU-based computing systems, and they are likely to be fundamental algorithms in many future parallel computing systems.

The problem that the parallel-scan algorithm [20] solves is the all-prefix-sums operation, which is defined next.

Definition 1: Given a sequence of elements (a_1, a_2, ..., a_n), where a_i belongs to a certain set, along with an associative binary operator ⊗ on this set, the all-prefix-sums operation computes the sequence

  (a_1, a_1 ⊗ a_2, ..., a_1 ⊗ ··· ⊗ a_n).        (5)

For example, if we have n = 4, a_i = i, and ⊗ denotes ordinary summation, the all-prefix-sums are (1, 3, 6, 10). If ⊗ denotes subtraction, the all-prefix-sums are (1, −1, −4, −8). It should be noted that the operator is not necessarily commutative, so we use a product symbol, as matrix products are not commutative, instead of a summation symbol.

The all-prefix-sums operation can be computed sequentially by processing one element after the other. However, this direct sequential iteration inherently takes O(n) time. We can now see the analogy of the iteration to the Bayesian filter discussed in the previous section – both of the algorithms have linear O(n) complexity, because they need to loop over all the elements in the forward direction. A similar argument applies to the Bayesian smoothing pass.

Fortunately, the all-prefix-sums operation can be computed in parallel in O(log n) span-time by using the up-sweep and down-sweep algorithms [20] shown in Fig. 1. These algorithms correspond to up and down traversals in a binary tree which are used for computing partial (generalized) sums of the elements. A final pass is then used to construct the final result. The algorithms can be used for computing the all-prefix-sums (5) for an arbitrary associative operator ⊗.

  // Save the input:
  for i ← 1 to n do                              ▷ Compute in parallel
      b_i ← a_i
  end for
  // Up sweep:
  for d ← 0 to log2 n − 1 do
      for i ← 0 to n − 1 by 2^{d+1} do           ▷ Compute in parallel
          j ← i + 2^d
          k ← i + 2^{d+1}
          a_k ← a_j ⊗ a_k
      end for
  end for
  // Down sweep:
  for d ← log2 n − 1 to 0 do
      for i ← 0 to n − 1 by 2^{d+1} do           ▷ Compute in parallel
          j ← i + 2^d
          k ← i + 2^{d+1}
          t ← a_j
          a_j ← a_k
          a_k ← a_k ⊗ t
      end for
  end for
  // Final pass:
  for i ← 1 to n do                              ▷ Compute in parallel
      a_i ← a_i ⊗ b_i
  end for

Fig. 1. Parallel scan algorithm for in-place transformation of the sequence (a_i) into its all-prefix-sums in O(log n) span-complexity. Note that the algorithm in this form assumes that n is a power of 2, but it can easily be generalized to an arbitrary n.
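For readers who prefer runnable code, the following is a minimal Python sketch of a Blelloch-style scan in the spirit of Fig. 1; it is an illustration rather than the paper's exact pseudocode. It assumes n is a power of two and that an identity element of ⊗ is available (0 for ordinary sums); the identity is used to reset the last element before the down sweep, as in Blelloch's exclusive-scan formulation, and the saved inputs then turn the exclusive result into the inclusive all-prefix-sums (5). The inner loops are written sequentially here, but each one is independent over i and is what a parallel implementation would distribute, giving the O(log n) span.

```python
import math

def blelloch_scan(a, op, identity):
    """In-place inclusive scan of list a with associative operator op.

    a        : list of elements, len(a) must be a power of two
    op       : associative binary operator op(x, y); need not be commutative
    identity : identity element of op (e.g. 0 for addition)
    """
    n = len(a)
    b = list(a)                                   # save the input
    # Up sweep: build partial "sums" along a binary tree.
    for d in range(int(math.log2(n))):
        for i in range(0, n, 2 ** (d + 1)):       # parallel over i
            j = i + 2 ** d - 1
            k = i + 2 ** (d + 1) - 1
            a[k] = op(a[j], a[k])
    # Down sweep: propagate prefixes back down the tree (exclusive scan).
    a[n - 1] = identity
    for d in reversed(range(int(math.log2(n)))):
        for i in range(0, n, 2 ** (d + 1)):       # parallel over i
            j = i + 2 ** d - 1
            k = i + 2 ** (d + 1) - 1
            t = a[j]
            a[j] = a[k]
            a[k] = op(a[k], t)
    # Final pass: combine with the saved inputs to obtain the inclusive scan.
    for i in range(n):                            # parallel over i
        a[i] = op(a[i], b[i])
    return a

# Example from Definition 1: a_i = i, op = +, result (1, 3, 6, 10).
print(blelloch_scan([1, 2, 3, 4], lambda x, y: x + y, 0))
```

With the elements and operators defined in Sections III–IV substituted for op, the same structure yields the parallel filters and smoothers discussed in the remainder of the paper.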
III. PARALLEL BAYESIAN FILTERING AND SMOOTHING

In this section, we explain how to define the elements and the binary operators to be able to perform Bayesian filtering and smoothing using parallel scan algorithms.

A. Bayesian filtering

In order to perform parallel Bayesian filtering, we need to find the suitable element a_k and the binary associative operator ⊗. As we will see in this section, an element a consists of a pair (f, g) ∈ F, where F is

  F = { (f, g) : ∫ f(y | z) dy = 1 },        (6)

and f : R^{n_x} × R^{n_x} → [0, ∞) represents a conditional density, and g : R^{n_x} → [0, ∞) represents a likelihood.

Definition 2: Given two elements (f_i, g_i) ∈ F and (f_j, g_j) ∈ F, the binary associative operator ⊗ for Bayesian filtering is

  (f_i, g_i) ⊗ (f_j, g_j) = (f_ij, g_ij),
where

  f_ij(x | z) = ∫ g_j(y) f_j(x | y) f_i(y | z) dy / ∫ g_j(y) f_i(y | z) dy,
  g_ij(z) = g_i(z) ∫ g_j(y) f_i(y | z) dy.

The proof that ⊗ has the associative property is given in Appendix I.

Theorem 3: Given the element a_k = (f_k, g_k) ∈ F, where

  f_k(x_k | x_{k-1}) = p(x_k | y_k, x_{k-1}),
  g_k(x_{k-1}) = p(y_k | x_{k-1}),

p(x_1 | y_1, x_0) = p(x_1 | y_1), and p(y_1 | x_0) = p(y_1), the k-th prefix sum is

  a_1 ⊗ a_2 ⊗ ··· ⊗ a_k = ( p(x_k | y_{1:k}), p(y_{1:k}) ).

Theorem 3 is proved in Appendix I. Theorem 3 implies that we can parallelise the computation of all filtering distributions p(x_k | y_{1:k}) and the marginal likelihoods p(y_{1:k}), of which the latter ones can be used for parameter estimation [6].

Remark 4: If we only know p(y_k | x_{k-1}) up to a proportionality constant, which means that g_k(x_{k-1}) ∝ p(y_k | x_{k-1}), we can still recover the filtering density p(x_k | y_{1:k}) by the above operations. However, we will not be able to recover the marginal likelihoods p(y_{1:k}). We can nevertheless recover p(y_{1:k}) by an additional parallel pass, as will be explained in Section III-C.

B. Bayesian smoothing

The Bayesian smoothing pass requires that the filtering densities have been obtained beforehand. In smoothing, we consider a different type of element a and binary operator ⊗ than those used in filtering. As we will see in this section, an element a is a function a : R^{n_x} × R^{n_x} → [0, ∞) that belongs to the set

  S = { a : ∫ a(x | z) dx = 1 }.

Definition 5: Given two elements a_i ∈ S and a_j ∈ S, the binary associative operator ⊗ for Bayesian smoothing is

  a_i ⊗ a_j = a_ij,

where

  a_ij(x | z) = ∫ a_i(x | y) a_j(y | z) dy.

The proof that ⊗ has the associative property is included in Appendix II.

Theorem 6: Given the element a_k = p(x_k | y_{1:k}, x_{k+1}) ∈ S, with a_n = p(x_n | y_{1:n}), we have that

  a_k ⊗ a_{k+1} ⊗ ··· ⊗ a_n = p(x_k | y_{1:n}).

Theorem 6 is proved in Appendix II. Theorem 6 implies that we can compute all smoothing distributions in parallel form. However, it should be noted that we should apply the parallel scan algorithm with the elements in reverse order, that is, with elements b_k = a_{n−k+1}, so that the prefix-sums b_1 ⊗ ··· ⊗ b_k recover the smoothing densities.

C. Additional aspects

We proceed to discuss additional aspects of the previous formulation of filtering and smoothing. In Section III-A, it was indicated that the marginal likelihood p(y_{1:n}) is directly available from the parallel scan algorithm if g_k(x_{k-1}) = p(y_k | x_{k-1}). However, sometimes we only know p(y_k | x_{k-1}) up to a proportionality constant so that g_k(x_{k-1}) ∝ p(y_k | x_{k-1}), as will happen in Section IV. Although in this case the parallel scan Bayesian filtering algorithm provides us with the filtering densities but not the marginal likelihood p(y_{1:n}), we can still recover the marginal likelihoods as follows. We first run the parallel filtering algorithm to recover all filtering distributions p(x_k | y_{1:k}) for k = 1 to n, and then we perform the following decomposition for p(y_{1:n}):

  p(y_{1:n}) = ∏_{k=1}^{n} p(y_k | y_{1:k-1}),

where

  p(y_k | y_{1:k-1}) = ∫ p(y_k | x_k) p(x_k | y_{1:k-1}) dx_k.

Each factor p(y_k | y_{1:k-1}) can be computed in parallel using the predictive density p(x_k | y_{1:k-1}) and the likelihood p(y_k | x_k). We can then recover all p(y_{1:k}) by O(log n) parallel recursive pairwise multiplications of the adjacent terms.
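The recursive pairwise multiplications described above are themselves an all-prefix-sums operation (a cumulative product, or a cumulative sum in the log domain), so the same scan machinery can be reused for this additional pass. A small illustrative sketch, assuming the per-step log-likelihoods log p(y_k | y_{1:k-1}) have already been computed from the predictive densities; the function name is made up for the example:

```python
import numpy as np

def cumulative_log_marginals(step_loglik):
    """Recover log p(y_{1:k}) for all k from log p(y_k | y_{1:k-1}).

    step_loglik : (n,) array of per-step log-likelihoods.
    np.cumsum stands in for the O(log n)-span parallel scan of Fig. 1
    with ordinary addition as the associative operator.
    """
    return np.cumsum(step_loglik)
```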
It is also possible to perform the parallelization at block level instead of at the individual element level. When using the parallel scan algorithm, we do not need to assign each single-measurement element to a single computational node, but instead we can perform initial computations in blocks such that a single node processes a block of measurements before combining the results with other blocks. The results of the blocks can then be used as the elements in the parallel-scan algorithm. This kind of procedure corresponds to selecting the elements for the scan algorithm to consist of blocks of length l:

  a_k = ( p(x_{lk} | y_{l(k-1)+1:kl}, x_{l(k-1)}), p(y_{l(k-1)+1:kl} | x_{l(k-1)}) )

in filtering and

  a_k = p(x_{lk} | y_{1:l(k+1)-1}, x_{l(k+1)})        (7)

in smoothing, instead of the corresponding terms with l = 1. A practical advantage of this is that we can more easily distribute the computations to a limited number of computational nodes while still getting the optimal speedup from parallelization.

IV. PARALLEL LINEAR/GAUSSIAN FILTER AND SMOOTHER

The parallel linear/Gaussian filter and smoother are obtained by particularising the element a and binary operator ⊗ for Bayesian filtering and smoothing explained in the previous section to linear/Gaussian systems. The sequential versions of these algorithms correspond to the Kalman filter and the RTS smoother.

We consider the linear/Gaussian state-space model

  x_k = F_{k-1} x_{k-1} + u_{k-1} + q_{k-1},
  y_k = H_k x_k + d_k + r_k,

where F_{k-1} ∈ R^{n_x × n_x} and H_k ∈ R^{n_y × n_x} are known matrices, u_{k-1} ∈ R^{n_x} and d_k ∈ R^{n_y} are known vectors, and q_{k-1} and r_k are zero-mean, independent Gaussian noises with covariance matrices Q_{k-1} ∈ R^{n_x × n_x} and R_k ∈ R^{n_y × n_y}. The initial distribution is given as x_0 ~ N(m_0, P_0). With this model, we have that

  p(x_k | x_{k-1}) = N(x_k; F_{k-1} x_{k-1} + u_{k-1}, Q_{k-1}),        (8)
  p(y_k | x_k) = N(y_k; H_k x_k + d_k, R_k).        (9)

In this section, we use the notation N_I(·; η, J) to denote a Gaussian density parameterised in information form, so that η is the information vector and J is the information matrix. If a Gaussian distribution has mean x and covariance matrix P, its parameterisation in information form is η = P^{-1} x and J = P^{-1}. This parametrization corresponds to the so-called information form of the Kalman filter [23]. We also use I_{n_x} to denote an identity matrix of size n_x.
A. Linear/Gaussian filtering

We first describe the representation of an element a_k ∈ F for filtering in linear and Gaussian systems by the following lemma.

Lemma 7: For linear/Gaussian systems, the element a_k ∈ F for filtering becomes

  f_k(x_k | x_{k-1}) = p(x_k | y_k, x_{k-1}) = N(x_k; A_k x_{k-1} + b_k, C_k),
  g_k(x_{k-1}) = p(y_k | x_{k-1}) ∝ N_I(x_{k-1}; η_k, J_k),

where the parameters of the first term are given for k > 1 as

  A_k = (I_{n_x} − K_k H_k) F_{k-1},
  b_k = u_{k-1} + K_k (y_k − H_k u_{k-1} − d_k),
  C_k = (I_{n_x} − K_k H_k) Q_{k-1},        (10)
  K_k = Q_{k-1} H_k^T S_k^{-1},
  S_k = H_k Q_{k-1} H_k^T + R_k,

and for k = 1 as

  m_1^- = F_0 m_0 + u_0,
  P_1^- = F_0 P_0 F_0^T + Q_0,
  S_1 = H_1 P_1^- H_1^T + R_1,
  K_1 = P_1^- H_1^T S_1^{-1},        (11)
  A_1 = 0,
  b_1 = m_1^- + K_1 [y_1 − H_1 m_1^- − d_1],
  C_1 = P_1^- − K_1 S_1 K_1^T.

The parameters of the second term are given as

  η_k = F_{k-1}^T H_k^T S_k^{-1} (y_k − H_k u_{k-1} − d_k),
  J_k = F_{k-1}^T H_k^T S_k^{-1} H_k F_{k-1},        (12)

for k = 1, ..., n.

In Lemma 7, the densities p(x_k | y_k, x_{k-1}) and p(y_k | x_{k-1}) are obtained by applying the Kalman filter update, with measurement y_k distributed according to (9), to the density p(x_k | x_{k-1}) in (8) and matching the terms. For the first step we have applied the Kalman filter prediction and update steps starting from x_0 ~ N(m_0, P_0) and matched the terms.

Therefore, an element a_k can be parameterised by (A_k, b_k, C_k, η_k, J_k), which can be computed for each element in parallel. Also, it is relevant to notice that if the system parameters (F_k, u_k, Q_k, H_k, d_k, R_k) do not depend on the time step k, the only parameters of a_k that depend on k are b_k and η_k, as they depend on the measurement y_k.
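The following NumPy sketch shows how the parameters of a generic element (Eqs. (10) and (12)) could be assembled from the model matrices; the first element follows Eq. (11) instead and is omitted here. This is an illustrative sketch, not code from the paper, and the function and argument names are made up. Since each element depends only on its own time step, all elements can be built in parallel.

```python
import numpy as np

def filtering_element(F, Q, H, R, u, d, y):
    """Parameters (A_k, b_k, C_k, eta_k, J_k) of a filtering element for k > 1.

    F, Q : transition matrix F_{k-1} and process noise covariance Q_{k-1}
    H, R : measurement matrix H_k and measurement noise covariance R_k
    u, d : known inputs u_{k-1} and d_k
    y    : measurement y_k
    """
    nx = F.shape[0]
    S = H @ Q @ H.T + R                    # S_k = H_k Q_{k-1} H_k^T + R_k
    K = Q @ H.T @ np.linalg.inv(S)         # K_k = Q_{k-1} H_k^T S_k^{-1}
    A = (np.eye(nx) - K @ H) @ F           # A_k, Eq. (10)
    b = u + K @ (y - H @ u - d)            # b_k, Eq. (10)
    C = (np.eye(nx) - K @ H) @ Q           # C_k, Eq. (10)
    HS = H.T @ np.linalg.inv(S)
    eta = F.T @ HS @ (y - H @ u - d)       # eta_k, Eq. (12)
    J = F.T @ HS @ H @ F                   # J_k, Eq. (12)
    return A, b, C, eta, J
```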
Lemma 8: Given two elements (f_i, g_i) ∈ F and (f_j, g_j) ∈ F, with parameterisations

  f_i(y | z) = N(y; A_i z + b_i, C_i),
  g_i(z) ∝ N_I(z; η_i, J_i),
  f_j(y | z) = N(y; A_j z + b_j, C_j),
  g_j(z) ∝ N_I(z; η_j, J_j),

the binary operator ⊗ for filtering becomes

  (f_i, g_i) ⊗ (f_j, g_j) = (f_ij, g_ij),

where

  f_ij(x | z) = N(x; A_ij z + b_ij, C_ij),        (13)
  g_ij(z) ∝ N_I(z; η_ij, J_ij),        (14)

with

  A_ij = A_j (I_{n_x} + C_i J_j)^{-1} A_i,
  b_ij = A_j (I_{n_x} + C_i J_j)^{-1} (b_i + C_i η_j) + b_j,
  C_ij = A_j (I_{n_x} + C_i J_j)^{-1} C_i A_j^T + C_j,
  η_ij = A_i^T (I_{n_x} + J_j C_i)^{-1} (η_j − J_j b_i) + η_i,
  J_ij = A_i^T (I_{n_x} + J_j C_i)^{-1} J_j A_i + J_i.

The proof is provided in Appendix III.
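A direct NumPy transcription of the combination rule of Lemma 8 might look as follows; again this is an illustrative sketch with made-up names, not the authors' implementation. With these tuples as the elements a_k and this function as ⊗, the all-prefix-sums (5) computed by the parallel scan give all filtering solutions at once (Theorem 3); since A_1 = 0, the k-th prefix sum has A = 0 and its (b, C) are then the Kalman filter mean and covariance at step k.

```python
import numpy as np

def combine_filtering(elem_i, elem_j):
    """Binary operator (f_i, g_i) ⊗ (f_j, g_j) of Lemma 8 on parameter tuples
    (A, b, C, eta, J); element i precedes element j in time."""
    Ai, bi, Ci, etai, Ji = elem_i
    Aj, bj, Cj, etaj, Jj = elem_j
    nx = Ai.shape[0]
    M = np.linalg.inv(np.eye(nx) + Ci @ Jj)        # (I + C_i J_j)^{-1}
    N = np.linalg.inv(np.eye(nx) + Jj @ Ci)        # (I + J_j C_i)^{-1}
    A = Aj @ M @ Ai
    b = Aj @ M @ (bi + Ci @ etaj) + bj
    C = Aj @ M @ Ci @ Aj.T + Cj
    eta = Ai.T @ N @ (etaj - Jj @ bi) + etai
    J = Ai.T @ N @ Jj @ Ai + Ji
    return A, b, C, eta, J
```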
B. Linear/Gaussian smoothing

We first describe the representation of an element a_k ∈ S for smoothing in linear and Gaussian systems by the following lemma.

Lemma 9: For linear/Gaussian systems, the element a_k ∈ S for smoothing becomes

  a_k(x_k | x_{k+1}) = p(x_k | y_{1:k}, x_{k+1}) = N(x_k; E_k x_{k+1} + g_k, L_k),

where for k < n

  E_k = P_k F_k^T (F_k P_k F_k^T + Q_k)^{-1},
  g_k = x_k − E_k (F_k x_k + u_k),
  L_k = P_k − E_k F_k P_k,

and for k = n we have

  E_n = 0,
  g_n = x_n,
  L_n = P_n.

Above, x_k and P_k are the filtering mean and covariance matrix at time step k, such that p(x_k | y_{1:k}) = N(x_k; x_k, P_k).

Lemma 9 is obtained by performing a Kalman filter update on the density p(x_k | y_{1:k}) with an observation x_{k+1}, whose distribution is given by (8). The element a_k for smoothing with linear/Gaussian systems can thus be parameterised as a_k = (E_k, g_k, L_k).

Lemma 10: Given two elements a_i ∈ S and a_j ∈ S with parameterisations

  a_i(y | z) = N(y; E_i z + g_i, L_i),
  a_j(y | z) = N(y; E_j z + g_j, L_j),

the binary operator ⊗ for smoothing becomes

  a_i ⊗ a_j = a_ij,

where

  a_ij(x | z) = ∫ a_i(x | y) a_j(y | z) dy
              = ∫ N(x; E_i y + g_i, L_i) N(y; E_j z + g_j, L_j) dy
              = N(x; E_ij z + g_ij, L_ij),

and

  E_ij = E_i E_j,
  g_ij = E_i g_j + g_i,
  L_ij = E_i L_j E_i^T + L_i.
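The smoothing element of Lemma 9 and the combination rule of Lemma 10 are equally compact in code. The sketch below is illustrative only (names invented); by Theorem 6 the products a_k ⊗ ··· ⊗ a_n give p(x_k | y_{1:n}), and, as noted in Section III-B, the scan is in practice run over the reversed sequence b_k = a_{n−k+1} so that the resulting (g, L) are the smoothing mean and covariance (E becomes zero once the last element is included).

```python
import numpy as np

def smoothing_element(F, Q, u, xf, Pf):
    """Parameters (E_k, g_k, L_k) of a smoothing element for k < n (Lemma 9);
    the last element is simply (0, x_n, P_n).

    F, Q   : transition matrix F_k and process noise covariance Q_k
    u      : input u_k
    xf, Pf : filtering mean and covariance at step k
    """
    E = Pf @ F.T @ np.linalg.inv(F @ Pf @ F.T + Q)
    g = xf - E @ (F @ xf + u)
    L = Pf - E @ F @ Pf
    return E, g, L

def combine_smoothing(elem_i, elem_j):
    """Binary operator a_i ⊗ a_j of Lemma 10 on parameter triples (E, g, L)."""
    Ei, gi, Li = elem_i
    Ej, gj, Lj = elem_j
    return Ei @ Ej, Ei @ gj + gi, Ei @ Lj @ Ei.T + Li
```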

V. NUMERICAL EXPERIMENT

In order to illustrate the benefit of parallelization we consider a simple tracking model (see, e.g., [5], [6]) with the state x = [u v u̇ v̇]^T, where (u, v) is the 2D position and (u̇, v̇) is the 2D velocity of the tracked object. From noisy measurements of the position (u, v), we aim to solve the smoothing problem in order to determine the whole trajectory of the target.

The model has the form

  x_k = F x_{k-1} + q_{k-1},
  y_k = H x_k + r_k,        (15)

where q_k ~ N(0, Q), r_k ~ N(0, R), and

  F = [ 1 0 ∆t 0
        0 1 0  ∆t
        0 0 1  0
        0 0 0  1 ],

  Q = q [ ∆t^3/3  0       ∆t^2/2  0
          0       ∆t^3/3  0       ∆t^2/2
          ∆t^2/2  0       ∆t      0
          0       ∆t^2/2  0       ∆t ],        (16)

along with

  H = [ 1 0 0 0
        0 1 0 0 ],

  R = [ σ^2  0
        0    σ^2 ].        (17)

In our simulations we used the parameters σ = 0.5, ∆t = 0.1, q = 1, and started the trajectory from a random Gaussian initial condition with mean m_0 = [0 0 1 −1]^T and covariance P_0 = I_4.
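For reference, the model matrices (16)–(17) with the stated parameter values can be written down directly; this is a sketch of the model setup only, not the authors' simulation code.

```python
import numpy as np

dt, q, sigma = 0.1, 1.0, 0.5

F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)

Q = q * np.array([[dt**3 / 3, 0,         dt**2 / 2, 0        ],
                  [0,         dt**3 / 3, 0,         dt**2 / 2],
                  [dt**2 / 2, 0,         dt,        0        ],
                  [0,         dt**2 / 2, 0,         dt       ]])

H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)

R = sigma**2 * np.eye(2)

m0 = np.array([0.0, 0.0, 1.0, -1.0])     # initial mean
P0 = np.eye(4)                           # initial covariance
```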
Fig. 2 shows a typical trajectory and measurements from the model defined by Eqs. (15) and (16) along with the Kalman filter and RTS smoother solutions. As the parallel algorithms produce exactly the same filter and smoothing solutions as the classic sequential algorithms, this result also illustrates the typical result produced by the proposed algorithms.

Fig. 2. Simulated trajectory from the linear tracking model in Eqs. (15) and (16) along with the Kalman filter (KF) and RTS smoother results.

We now aim to evaluate the required number of floating point operations (flops) for generating the smoothing solution for this model. In order to do that, we run the sequential filter and smoothing methods (KF and RTS) as well as the proposed parallel algorithms (PKF and PRTS) over simulated data sets of different sizes and evaluate their span and work flops. The span flops here refers to the minimum number of floating point steps when the parallelizable operations in the algorithm are done in parallel – this corresponds to the actual execution time required to do the computations in a parallel computer. The work flops refers to the total number of operations that the parallel computer needs to perform – it measures the total energy required for the computations or, equivalently, the time required by the algorithm in a single-core computer. As the classic sequential KF and RTS algorithms are not parallelizable, their span and work flops are equal. The flops have been computed by estimating how many flops each of the matrix operations takes (multiplication, summation, LU-factorization) and incrementing the flops counter after every operation in the code.

Fig. 3 shows the flops required by the sequential KF along with the span flops and work flops required by the parallel Kalman filter algorithm. As expected, with small data set sizes the number of span flops required by the parallel KF is larger than that of the sequential KF, but already starting from a time step count of around 20, the span flops is lower for the parallel KF. The logarithmic growth of the span flops in the parallel algorithm can be clearly seen, while the number of flops for the sequential KF grows linearly. However, the work flops required by the parallel KF is approximately 8 times the flops of the sequential KF. This means that although the execution time for the parallel algorithms is smaller than for the sequential algorithms, they need to perform more floating point operations in total.

Fig. 3. The flops and the span and work flops for the sequential Kalman filter (KF) and the parallel Kalman filter (PKF).

The flops required by the sequential RTS smoother along with the span flops and work flops required by the parallel RTS smoother are shown in Fig. 4. In this case, the parallel algorithm reaches the sequential algorithm speed already with a data set of size less than 10. Furthermore, the total number of floating point operations required by the parallel algorithm is approximately 4 times the operations required by the sequential algorithm. The ratios of these total (work) operations for both the filter and smoother are shown in Fig. 5.

Fig. 4. The flops and the span and work flops for the (sequential) RTS smoother and the parallel RTS (PRTS) smoother.

VI. CONCLUSION AND DISCUSSION
In this article we have proposed a novel general algorithmic framework for parallel computation of batch Bayesian filtering and smoothing solutions for state-space models. The framework is based on formulating the computations in terms of associative operations between suitably defined elements such that the all-prefix-sums operation computed by a parallel-scan algorithm exactly produces the Bayesian filtering and smoothing solutions. The advantage of the framework is that the parallelization allows for performing the computations in O(log n) span complexity, where n is the number of data points, while sequential filtering and smoothing algorithms have an O(n) complexity. Parallel versions of Kalman filters and Rauch–Tung–Striebel smoothers were derived as special cases of the framework. The computational advantages of the framework were illustrated in a numerical simulation.

Fig. 5. Ratio of work flops for the parallel and sequential Kalman filter and the parallel and sequential RTS smoother.

A disadvantage of the proposed methodology is that although the wall-clock time of execution is significantly reduced, the total amount of computations (and hence required energy) is larger than with conventional sequential algorithms. Although the total amount of computations is only increased by a constant factor, in some systems, such as small-scale mobile systems, even if parallelization would be possible, it can be beneficial to use the classic algorithms. However, the speedup gain of the proposed approach is beneficial in applications such as data-assimilation based weather forecasting [24] and other spatio-temporal systems appearing, for example, in tomographic reconstruction [8] or machine learning [25], where the computations take a significant amount of time. In these systems, it is possible to dedicate the required amount of extra computational resources to gain the significant speedup provided by parallelization.

Although we have restricted our consideration to specific types of parallel-scan algorithms, it is also possible to use other kinds of algorithms for computing the prefix sums corresponding to the Bayesian filtering and smoothing solutions. We could also select algorithms for given computer or network architectures, for minimizing the communication between the nodes, or for minimizing the energy consumption [26], [27]. The present formulation of the computations in terms of local associative operations is likely to have other applications beyond parallelization. For example, in decentralized systems, it is advantageous to be able to first perform operations locally and then combine them to produce the full state-estimation solution.

The proposed framework is also valid for discrete state spaces as well as for other state spaces, provided that we consider the elements with the appropriate domain and replace the Lebesgue integrals by integrals with respect to the corresponding reference measure, e.g., the counting measure in the case of discrete-state models.

The framework could be extended to non-linear and non-Gaussian models by replacing the exact Kalman filters and smoothers with iterated extended Kalman filters and smoothers [28], [29] or their sigma-point/numerical-integration versions such as posterior linearization filters and smoothers [30]–[32]. Possible future work also includes developing particle filter and smoother methods (see, e.g., [6]) for the present framework along with various other Bayesian filter and smoother approximations proposed in the literature.

APPENDIX I

In this appendix, we prove the required results for Bayesian filtering: the associative property of the operator in Definition 2 and Theorem 3.

A. Associative property

In order to prove the associative property of ⊗ for filtering, we need to prove that, for three elements (f_i, g_i), (f_j, g_j), (f_k, g_k) ∈ F, the following relation holds:

  ((f_i, g_i) ⊗ (f_j, g_j)) ⊗ (f_k, g_k) = (f_i, g_i) ⊗ ((f_j, g_j) ⊗ (f_k, g_k)).        (18)

We proceed to perform the calculations on both sides of the equation to check that they yield the same result.

1) Left-hand side: We use Definition 2 in the left-hand side of (18) and obtain

  (f_ij, g_ij) ⊗ (f_k, g_k) = (f_ijk, g_ijk),

where

  f_ijk(x | z) = ∫∫ g_k(y) f_k(x | y) g_j(y') f_j(y | y') f_i(y' | z) dy' dy
                 / ∫∫ g_k(y) g_j(y') f_j(y | y') f_i(y' | z) dy' dy,        (19)

and

  g_ijk(z) = g_ij(z) ∫ g_k(y) f_ij(y | z) dy
           = [ g_i(z) ∫ g_j(y) f_i(y | z) dy ]
             × ∫ g_k(y) [ ∫ g_j(y') f_j(y | y') f_i(y' | z) dy' / ∫ g_j(y') f_i(y' | z) dy' ] dy
           = g_i(z) ∫∫ g_k(y) g_j(y') f_j(y | y') f_i(y' | z) dy' dy.        (20)

2) Right-hand side: We first apply the operator ⊗ to the elements with indices j and k in the right-hand side of (18), see Definition 2,

  (f_j, g_j) ⊗ (f_k, g_k) = (f_jk, g_jk),

where

  f_jk(x | z) = ∫ g_k(y) f_k(x | y) f_j(y | z) dy / ∫ g_k(y) f_j(y | z) dy,
  g_jk(z) = g_j(z) ∫ g_k(y) f_j(y | z) dy.

Then, the right-hand side of (18) becomes

  (f_i, g_i) ⊗ (f_jk, g_jk) = (f'_ijk, g'_ijk),

where

  f'_ijk(x | z) = ∫ g_jk(y) f_jk(x | y) f_i(y | z) dy / ∫ g_jk(y) f_i(y | z) dy
                = ∫ g_j(y) [ ∫ g_k(y') f_k(x | y') f_j(y' | y) dy' ] f_i(y | z) dy
                  / ∫ g_j(y) [ ∫ g_k(y') f_j(y' | y) dy' ] f_i(y | z) dy
                = ∫∫ g_j(y) g_k(y') f_k(x | y') f_j(y' | y) f_i(y | z) dy' dy
                  / ∫∫ g_j(y) g_k(y') f_j(y' | y) f_i(y | z) dy' dy,        (21)
and

  g'_ijk(z) = g_i(z) ∫ g_jk(y) f_i(y | z) dy
            = g_i(z) ∫ g_j(y) [ ∫ g_k(y') f_j(y' | y) dy' ] f_i(y | z) dy
            = g_i(z) ∫∫ g_j(y) g_k(y') f_j(y' | y) f_i(y | z) dy' dy.        (22)

Comparing (19)–(22), we see that f_ijk(x | z) = f'_ijk(x | z) and g_ijk(z) = g'_ijk(z), which proves the associative property of ⊗ in Definition 2.

B. Proof of Theorem 3

In this appendix, we prove Theorem 3. We first prove by induction that

  a_{k-l} ⊗ ··· ⊗ a_{k-1} ⊗ a_k = ( p(x_k | y_{k-l:k}, x_{k-l-1}), p(y_{k-l:k} | x_{k-l-1}) ),        (23)

for l < k + 1. Relation (23) holds for l = 0 by the definition of a_k. Then, assuming that

  a_{k-l+1} ⊗ ··· ⊗ a_{k-1} ⊗ a_k = ( p(x_k | y_{k-l+1:k}, x_{k-l}), p(y_{k-l+1:k} | x_{k-l}) )        (24)

holds, we need to prove that (23) holds.

We calculate the first element of a_{k-l} ⊗ b_{k-l+1}, denoted by f_ab, where b_{k-l+1} = a_{k-l+1} ⊗ ··· ⊗ a_{k-1} ⊗ a_k. We have

  f_ab(x_k | x_{k-l})
    = ∫ p(y_{k-l+1:k}, x_k | x_{k-l}) p(x_{k-l} | y_{k-l}, x_{k-l-1}) dx_{k-l} / p(y_{k-l+1:k} | y_{k-l}, x_{k-l-1})
    = p(y_{k-l+1:k}, x_k | y_{k-l}, x_{k-l-1}) / p(y_{k-l+1:k} | y_{k-l}, x_{k-l-1})
    = p(x_k | y_{k-l:k}, x_{k-l-1}).

Function f_ab corresponds to the first element of (23), as required. We further get

  g_ab(x_{k-l-1}) = p(y_{k-l} | x_{k-l-1}) ∫ p(y_{k-l+1:k} | x_{k-l}) p(x_{k-l} | y_{k-l}, x_{k-l-1}) dx_{k-l}
                  = p(y_{k-l} | x_{k-l-1}) p(y_{k-l+1:k} | y_{k-l}, x_{k-l-1})
                  = p(y_{k-l:k} | x_{k-l-1}).

Function g_ab corresponds to the second element of (23), as required.

Substituting l = k − 2 into (23), we obtain

  a_2 ⊗ ··· ⊗ a_k = ( p(x_k | y_{2:k}, x_1), p(y_{2:k} | x_1) ).        (25)

We now calculate the first element of a_1 ⊗ [a_2 ⊗ ··· ⊗ a_k], denoted as f_1k, where a_1 is given in Theorem 3:

  f_1k(x_k | x_0)
    = ∫ p(y_{2:k} | x_1) p(x_k | y_{2:k}, x_1) p(x_1 | y_1) dx_1 / ∫ p(y_{2:k} | x_1) p(x_1 | y_1) dx_1
    = ∫ p(y_{2:k}, x_k | x_1, y_1) p(x_1 | y_1) dx_1 / p(y_{2:k} | y_1)
    = p(y_{2:k}, x_k | y_1) / p(y_{2:k} | y_1)
    = p(x_k | y_{1:k}).        (26)

The second element of a_1 ⊗ [a_2 ⊗ ··· ⊗ a_k], denoted as g_1k, is

  g_1k(x_0) = p(y_1) ∫ p(y_{2:k} | x_1) p(x_1 | y_1) dx_1
            = p(y_{1:k}).        (27)

Results (26) and (27) finish the proof of Theorem 3.

APPENDIX II

In this appendix, we prove the required results for Bayesian smoothing: the associative property of the operator in Definition 5 and Theorem 6.

A. Associative property

In order to prove the associative property of ⊗ for smoothing, we need to prove that, for three elements a_i, a_j, a_k ∈ S, the following relation holds:

  (a_i ⊗ a_j) ⊗ a_k = a_i ⊗ (a_j ⊗ a_k).        (28)

We proceed to perform the calculations on both sides of the equation to check that they yield the same result.

1) Left-hand side: We apply the operator in Definition 5 on the left-hand side of (28) to obtain

  a_ij ⊗ a_k = a_ijk,

where

  a_ijk(x | z) = ∫ a_ij(x | y) a_k(y | z) dy
               = ∫∫ a_i(x | y') a_j(y' | y) a_k(y | z) dy dy'.        (29)

2) Right-hand side: We first calculate a_j ⊗ a_k using the operator in Definition 5, which gives

  a_jk(x | z) = ∫ a_j(x | y) a_k(y | z) dy.

Then, calculating the right-hand side of (28), we have

  a_i ⊗ a_jk = a'_ijk,

where

  a'_ijk(x | z) = ∫ a_i(x | y) a_jk(y | z) dy
                = ∫∫ a_i(x | y) a_j(y | y') a_k(y' | z) dy' dy.        (30)

We can see that a_ijk in (29) is equal to a'_ijk in (30), which proves the associative property of the operator in Definition 5.
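The associativity just proved can also be checked numerically in the linear/Gaussian parameterisation of Lemma 10. The following small sketch (illustrative only, with invented names) verifies (28) for random parameter triples (E, g, L):

```python
import numpy as np

def combine(a, b):
    # Lemma 10: combination of smoothing elements in (E, g, L) form.
    E1, g1, L1 = a
    E2, g2, L2 = b
    return E1 @ E2, E1 @ g2 + g1, E1 @ L2 @ E1.T + L1

rng = np.random.default_rng(0)

def random_elem(nx=3):
    E = rng.standard_normal((nx, nx))
    g = rng.standard_normal(nx)
    A = rng.standard_normal((nx, nx))
    return E, g, A @ A.T                    # L symmetric positive semi-definite

ai, aj, ak = random_elem(), random_elem(), random_elem()
left = combine(combine(ai, aj), ak)         # (a_i ⊗ a_j) ⊗ a_k
right = combine(ai, combine(aj, ak))        # a_i ⊗ (a_j ⊗ a_k)
assert all(np.allclose(l, r) for l, r in zip(left, right))
```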
B. Proof of Theorem 6

In this appendix, we prove Theorem 6. We first prove by induction that

  a_k ⊗ ··· ⊗ a_{k+l} = p(x_k | y_{1:k+l}, x_{k+l+1}),        (31)

for l < n − k. Relation (31) holds for l = 0 by the definition of a_k. Then, assuming that

  a_k ⊗ ··· ⊗ a_{k+l-1} = p(x_k | y_{1:k+l-1}, x_{k+l})        (32)

holds, we need to prove that (31) holds.
We use a_{k+l} in Theorem 6 to calculate

  [a_k ⊗ ··· ⊗ a_{k+l-1}] ⊗ a_{k+l}
    = ∫ p(x_k | y_{1:k+l-1}, x_{k+l}) p(x_{k+l} | y_{1:k+l}, x_{k+l+1}) dx_{k+l}
    = ∫ p(x_k | y_{1:k+l}, x_{k+l}, x_{k+l+1}) p(x_{k+l} | y_{1:k+l}, x_{k+l+1}) dx_{k+l}
    = ∫ p(x_k, x_{k+l} | y_{1:k+l}, x_{k+l+1}) dx_{k+l}
    = p(x_k | y_{1:k+l}, x_{k+l+1}).

This proves (31).

If l = n − k − 1 and a_n is as in Theorem 6, we have

  [a_k ⊗ ··· ⊗ a_{n-1}] ⊗ a_n
    = ∫ p(x_k | y_{1:n-1}, x_n) p(x_n | y_{1:n}) dx_n
    = ∫ p(x_k | y_{1:n}, x_n) p(x_n | y_{1:n}) dx_n
    = p(x_k | y_{1:n}).

This result finishes the proof of Theorem 6.

APPENDIX III

In this appendix, we prove Lemma 8. We have the following easily verifiable identities:

  N_I(y; η, J) N(y; m, C) ∝ N(y; [J + C^{-1}]^{-1} [η + C^{-1} m], [J + C^{-1}]^{-1})

and

  N_I(y; η, J) N_I(y; η', J') ∝ N_I(y; η + η', J + J').

We also have

  ∫ N_I(y; η, J) N(y; Az + b, C) dy ∝ N_I(z; A^T [I + JC]^{-1} (η − Jb), A^T [I + JC]^{-1} J A).

By using Definition 2 for f_ij and g_ij together with the parameterizations in Lemma 8, elementary computations lead to (13) and (14).
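These identities can also be sanity-checked numerically. The scalar sketch below (illustrative only, with arbitrary parameter values) verifies the third identity by checking that the ratio of the two sides is constant in z, i.e., that the integral on the left is indeed proportional to the stated information-form Gaussian in z:

```python
import numpy as np

# Scalar check: as a function of z,
#   \int N_I(y; eta, J) N(y; A z + b, C) dy
# should be proportional to N_I(z; A (1 + J C)^{-1} (eta - J b), A (1 + J C)^{-1} J A).
A, b, C, eta, J = 0.7, -0.3, 0.5, 0.4, 1.2        # arbitrary scalar parameters
y = np.linspace(-30.0, 30.0, 200001)
dy = y[1] - y[0]

def lhs(z):
    infoform = np.exp(eta * y - 0.5 * J * y**2)                     # unnormalised N_I(y; eta, J)
    gauss = np.exp(-0.5 * (y - (A * z + b))**2 / C) / np.sqrt(2 * np.pi * C)
    return np.sum(infoform * gauss) * dy                            # grid approximation of the integral

eta_z = A * (eta - J * b) / (1 + J * C)
J_z = A * J * A / (1 + J * C)

def rhs(z):
    return np.exp(eta_z * z - 0.5 * J_z * z**2)                     # unnormalised N_I(z; eta_z, J_z)

zs = np.array([-1.0, 0.0, 0.5, 2.0])
ratios = np.array([lhs(z) / rhs(z) for z in zs])
assert np.allclose(ratios, ratios[0], rtol=1e-5)                    # constant ratio => proportionality
```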
ACKNOWLEDGMENT

The authors would like to thank the Academy of Finland for financial support.

REFERENCES

[1] T. Rauber and G. Rünger, Parallel Programming: For Multicore and Cluster Systems, 2nd ed. Springer, 2013.
[2] S. Cook, CUDA Programming: A Developer's Guide to Parallel Computing with GPUs. Morgan Kaufmann, 2013.
[3] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms, 3rd ed. MIT Press, 2009.
[4] A. H. Jazwinski, Stochastic Processes and Filtering Theory. Academic Press, New York, 1970.
[5] Y. Bar-Shalom, X.-R. Li, and T. Kirubarajan, Estimation with Applications to Tracking and Navigation. Wiley, New York, 2001.
[6] S. Särkkä, Bayesian Filtering and Smoothing. Cambridge University Press, 2013.
[7] O. Cappé, E. Moulines, and T. Rydén, Inference in Hidden Markov Models, ser. Springer Series in Statistics. New York, NY: Springer-Verlag, 2005.
[8] J. Kaipio and E. Somersalo, Statistical and Computational Inverse Problems. Springer, 2005.
[9] S. Särkkä, M. A. Álvarez, and N. D. Lawrence, "Gaussian process latent force models for learning and stochastic control of physical systems," IEEE Transactions on Automatic Control, 2019.
[10] Y. C. Ho and R. C. K. Lee, "A Bayesian approach to problems in stochastic estimation and control," IEEE Transactions on Automatic Control, vol. 9, no. 4, pp. 333–339, 1964.
[11] T. D. Barfoot, C. H. Tong, and S. Särkkä, "Batch continuous-time trajectory estimation as exactly sparse Gaussian process regression," in Proceedings of Robotics: Science and Systems (RSS), 2014.
[12] A. Grigorievskiy, N. Lawrence, and S. Särkkä, "Parallelizable sparse inverse formulation Gaussian processes (SpInGP)," in Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2017.
[13] P. M. Lyster, S. E. Cohn, R. Ménard, L. P. Chang, S. J. Lin, and R. G. Olsen, "Parallel implementation of a Kalman filter for constituent data assimilation," Monthly Weather Review, vol. 125, no. 7, pp. 1674–1686, 1997.
[14] G. Evensen, "The ensemble Kalman filter: Theoretical formulation and practical implementation," Ocean Dynamics, vol. 53, no. 4, pp. 343–367, 2003.
[15] A. Lee, C. Yau, M. B. Giles, A. Doucet, and C. C. Holmes, "On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods," Journal of Computational and Graphical Statistics, vol. 19, no. 4, pp. 769–789, 2010.
[16] O. Rosen and A. Medvedev, "Efficient parallel implementation of state estimation algorithms on multicore platforms," IEEE Transactions on Control Systems Technology, vol. 21, no. 1, pp. 107–120, 2013.
[17] M. E. Liggins, C.-Y. Chong, I. Kadar, M. G. Alford, V. Vannicola, and S. Thomopoulos, "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–107, 1997.
[18] R. E. Ladner and M. J. Fischer, "Parallel prefix computation," Journal of the ACM, vol. 27, no. 4, pp. 831–838, 1980.
[19] G. E. Blelloch, "Scans as primitive parallel operations," IEEE Transactions on Computers, vol. 38, no. 11, pp. 1526–1538, 1989.
[20] G. E. Blelloch, "Prefix sums and their applications," School of Computer Science, Carnegie Mellon University, Tech. Rep. CMU-CS-90-190, 1990.
[21] R. E. Kalman, "A new approach to linear filtering and prediction problems," Transactions of the ASME, Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[22] H. E. Rauch, F. Tung, and C. T. Striebel, "Maximum likelihood estimates of linear dynamic systems," AIAA Journal, vol. 3, no. 8, pp. 1445–1450, 1965.
[23] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Prentice-Hall, 1979.
[24] N. Cressie and C. K. Wikle, Statistics for Spatio-Temporal Data. John Wiley & Sons, 2011.
[25] S. Särkkä, A. Solin, and J. Hartikainen, "Spatiotemporal learning via infinite-dimensional Bayesian filtering and smoothing," IEEE Signal Processing Magazine, vol. 30, no. 4, pp. 51–61, 2013.
[26] A. Grama, V. Kumar, A. Gupta, and G. Karypis, Introduction to Parallel Computing, 2nd ed. Pearson Education, 2003.
[27] P. Sanders and J. L. Träff, "Parallel prefix (scan) algorithms for MPI," in Recent Advances in Parallel Virtual Machine and Message Passing Interface (EuroPVM/MPI 2006), Lecture Notes in Computer Science, vol. 4192, B. Mohr, J. Träff, J. Worringen, and J. Dongarra, Eds. Springer, 2006.
[28] B. M. Bell and F. W. Cathey, "The iterated Kalman filter update as a Gauss–Newton method," IEEE Transactions on Automatic Control, vol. 38, no. 2, pp. 294–297, 1993.
[29] B. M. Bell, "The iterated Kalman smoother as a Gauss–Newton method," SIAM Journal on Optimization, vol. 4, no. 3, pp. 626–636, 1994.
[30] A. F. García-Fernández, L. Svensson, M. R. Morelande, and S. Särkkä, "Posterior linearisation filter: principles and implementation using sigma points," IEEE Transactions on Signal Processing, vol. 63, no. 20, pp. 5561–5573, 2015.
[31] A. F. García-Fernández, L. Svensson, and S. Särkkä, "Iterated posterior linearization smoother," IEEE Transactions on Automatic Control, vol. 62, no. 4, pp. 2056–2063, 2017.
[32] F. Tronarp, A. F. García-Fernández, and S. Särkkä, "Iterative filtering and smoothing in non-linear and non-Gaussian systems using conditional moments," IEEE Signal Processing Letters, vol. 25, no. 3, pp. 408–412, 2018.
