Notes on the Kalman Filter: KF, EKF, ESKF, IEKF, IESKF
May 2, 2024
Contents
1 Recursive Bayes Filter
1.1 Derivation of Recursive Bayes Filter
1.2 Gaussian Belief Case
6.1.4 Difference (MAP-based derivation)
6.2 Prediction step
6.3 Correction step
6.4 Summary
9 Wrap-up
9.1 Kalman Filter (KF)
9.2 Extended Kalman Filter (EKF)
9.3 Error-state Kalman Filter (ESKF)
9.4 Iterated Extended Kalman Filter (IEKF)
9.5 Iterated Error-state Kalman Filter (IESKF)
10 Reference
11 Revision log
bel(·) is expressed and developed according to Bayes' rule, so this filter is also called the Bayes filter. Using the Markov assumption together with Bayes' rule, a recursive form can be derived, and this is called the recursive Bayes filter.
- η = 1/p(zt |z1:t−1 , u1:t ) : A normalization factor that keeps the result a valid probability distribution (its integral is 1)
- p(zt | xt ) : Observation model
- ∫ p(xt | xt−1 , ut ) dxt−1 : Motion model
The Recursive Bayes filter is thus called a recursive filter because it calculates the current
step’s bel(xt ) from the previous step’s bel(xt−1 ).
1.1 Derivation of Recursive Bayes Filter
The Recursive Bayes Filter is derived by applying Bayes' rule, the Markov assumption, and marginalization, step by step as follows.
Step 1:
bel(xt ) = p(xt |z1:t , u1:t )
(4)
= η · p(zt |xt , z1:t−1 , u1:t ) · p(xt |z1:t−1 , u1:t )
Next, apply the Markov assumption that the current measurement zt depends only on the current state xt .
Step 5:
bel(xt ) = p(xt |z1:t , u1:t )
         = η · p(zt |xt , z1:t−1 , u1:t ) · p(xt |z1:t−1 , u1:t )
         = η · p(zt |xt ) · p(xt |z1:t−1 , u1:t )
         = η · p(zt |xt ) · ∫ p(xt |xt−1 , z1:t−1 , u1:t ) · p(xt−1 |z1:t−1 , u1:t ) dxt−1
         = η · p(zt |xt ) · ∫ p(xt |xt−1 , ut ) · p(xt−1 |z1:t−1 , u1:t ) dxt−1
         = η · p(zt |xt ) · ∫ p(xt |xt−1 , ut ) · p(xt−1 |z1:t−1 , u1:t−1 ) dxt−1        (8)
Finally, substitute the definition of the previous belief: by definition, p(xt−1 |z1:t−1 , u1:t−1 ) = bel(xt−1 ).
Step 6:
bel(xt ) = p(xt |z1:t , u1:t )
         = η · p(zt |xt , z1:t−1 , u1:t ) · p(xt |z1:t−1 , u1:t )
         = η · p(zt |xt ) · p(xt |z1:t−1 , u1:t )
         = η · p(zt |xt ) · ∫ p(xt |xt−1 , z1:t−1 , u1:t ) · p(xt−1 |z1:t−1 , u1:t ) dxt−1
         = η · p(zt |xt ) · ∫ p(xt |xt−1 , ut ) · p(xt−1 |z1:t−1 , u1:t ) dxt−1
         = η · p(zt |xt ) · ∫ p(xt |xt−1 , ut ) · p(xt−1 |z1:t−1 , u1:t−1 ) dxt−1
         = η · p(zt |xt ) · ∫ p(xt |xt−1 , ut ) · bel(xt−1 ) dxt−1        (9)
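To make the recursion concrete, the following is a minimal sketch of a discrete (histogram) Bayes filter in Python. The 1-D grid, the motion kernel, and the measurement likelihood are illustrative assumptions and not part of the notes above; the two lines marked prediction and correction correspond directly to the integral and the η · p(zt |xt ) factor in (9).

import numpy as np

def bayes_filter_step(bel_prev, motion_model, obs_likelihood, u, z):
    """One recursion of the Bayes filter on a discrete 1-D grid.

    bel_prev[j]            : bel(x_{t-1} = j)
    motion_model[u][i, j]  : p(x_t = i | x_{t-1} = j, u_t = u)
    obs_likelihood[z, i]   : p(z_t = z | x_t = i)
    """
    # Prediction: bel_bar(x_t) = sum_j p(x_t | x_{t-1}=j, u_t) * bel(x_{t-1}=j)
    bel_bar = motion_model[u] @ bel_prev
    # Correction: bel(x_t) = eta * p(z_t | x_t) * bel_bar(x_t)
    bel = obs_likelihood[z] * bel_bar
    return bel / bel.sum()   # eta normalizes the result to sum to 1

# Toy example: 5 grid cells, control "move right by one cell" with 80% success
n = 5
move_right = 0.2 * np.eye(n) + 0.8 * np.roll(np.eye(n), 1, axis=0)
motion_model = {"right": move_right}
# Sensor reports the true cell with 70% probability, one of the other cells otherwise
obs_likelihood = np.full((n, n), 0.075)
np.fill_diagonal(obs_likelihood, 0.7)

bel = np.full(n, 1.0 / n)   # uniform prior bel(x_0)
bel = bayes_filter_step(bel, motion_model, obs_likelihood, "right", z=2)
print(bel)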
Mean and covariance are also denoted as (x̂, P) or (µ̂, Σ), representing the same concept with
different notations.
– x̂t|t−1 : Mean at step t given the correction value at step t − 1. Some literature also denotes this as x⁻t .
– P̂t|t−1 : Covariance at step t given the correction value at step t − 1. Some literature also denotes this as P⁻t .
At time t, the robot's position is denoted by xt , the measurements from the robot's sensor by zt , and the robot's control input by ut . Using these, we can define the motion model and the observation model. In the Kalman filter, both models are required to be linear.
Assuming all random variables follow Gaussian distributions, p(xt | xt−1 , ut ), p(zt | xt ) can be
represented as follows.
Next, the predicted belief b̄el(xt ) and the corrected belief bel(xt ) that need to be computed through the Kalman filter can be represented as follows.
b̄el(xt ) = ∫ p(xt | xt−1 , ut ) bel(xt−1 ) dxt−1 ∼ N (x̂t|t−1 , Pt|t−1 )
bel(xt ) = η · p(zt | xt ) b̄el(xt ) ∼ N (x̂t|t , Pt|t )        (15)
As seen in (15), the Kalman filter first computes the predicted belief b̄el(xt ) from the previous step's value and the motion model in the prediction phase, then obtains the corrected belief bel(xt ) from the measurement and the observation model in the correction phase. Substituting (13) and (14) into (15) yields the means and covariances of the prediction and correction steps, (x̂t|t−1 , Pt|t−1 ) and (x̂t|t , Pt|t ), respectively. For the detailed derivation, refer to Section 3.2.4 "Mathematical Derivation of the KF" in [13].
The initial belief bel(x0 ) is given as bel(x0 ) ∼ N (x̂0 , P0 ). The prediction step then computes the predicted mean and covariance as follows.
x̂t|t−1 = Ft x̂t−1|t−1 + Bt ut
(17)
Pt|t−1 = Ft Pt−1|t−1 F⊺t + Qt
2.3 Summary
The Kalman Filter can be expressed as a function as follows.
KalmanFilter(x̂t−1|t−1 , Pt−1|t−1 , ut , zt ){
(Prediction Step)
x̂t|t−1 = Ft x̂t−1|t−1 + Bt ut
Pt|t−1 = Ft Pt−1|t−1 F⊺t + Qt
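As a runnable counterpart to the function above, a minimal NumPy sketch of one full KF cycle is given below. The constant-velocity model, the numeric values, and the variable names are illustrative assumptions; the correction equations used are the standard KF ones, Kt = Pt|t−1 H⊺t (Ht Pt|t−1 H⊺t + Rt )−1 , x̂t|t = x̂t|t−1 + Kt (zt − Ht x̂t|t−1 ), Pt|t = (I − Kt Ht )Pt|t−1 .

import numpy as np

def kalman_filter(x_prev, P_prev, u, z, F, B, H, Q, R):
    """One prediction + correction cycle of the linear Kalman filter."""
    # Prediction step
    x_pred = F @ x_prev + B @ u                      # x_{t|t-1} = F x_{t-1|t-1} + B u_t
    P_pred = F @ P_prev @ F.T + Q                    # P_{t|t-1} = F P F^T + Q
    # Correction step (standard KF equations)
    S = H @ P_pred @ H.T + R                         # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)              # Kalman gain
    x_post = x_pred + K @ (z - H @ x_pred)           # x_{t|t}
    P_post = (np.eye(len(x_pred)) - K @ H) @ P_pred  # P_{t|t}
    return x_post, P_post

# Illustrative 1-D constant-velocity example: state = [position, velocity]
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])      # acceleration input
H = np.array([[1.0, 0.0]])               # only the position is measured
Q = 1e-3 * np.eye(2)
R = np.array([[0.05]])

x, P = np.zeros(2), 1e-2 * np.eye(2)     # bel(x_0): x_0 = 0, small P_0
x, P = kalman_filter(x, P, u=np.array([0.2]), z=np.array([0.03]), F=F, B=B, H=H, Q=Q, R=R)
print(x, P)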
Kalman filters assume that both the motion model and the observation model are linear to
estimate the state. However, since most phenomena in the real world are modeled non-linearly,
applying the Kalman filter as previously defined would not work properly. The extended Kalman
filter (EKF) was proposed to use the Kalman filter even in non-linear motion and observation
models. The EKF uses a first-order Taylor approximation to approximate the non-linear models with linear ones before applying the Kalman filter. The motion model and observation model of the EKF are as follows.
Motion Model: xt = f (xt−1 , ut , wt )
(20)
Observation Model: zt = h(xt , vt )
- xt : state variable of the model
- ut : input of the model
- zt : measurement of the model
- f (·): non-linear motion model function
- h(·): non-linear observation model function
- wt ∼ N (0, Qt ): noise in the motion model. Qt represents the covariance matrix of wt
- vt ∼ N (0, Rt ): noise in the observation model. Rt represents the covariance matrix of vt
In the above formula, f (·) means a non-linear motion model and h(·) means a non-linear ob-
servation model. When the first-order Taylor approximation is performed on f (·), h(·), it becomes
as follows.
xt ≈ f (x̂t−1|t−1 , ut , 0) + Ft (xt−1 − x̂t−1|t−1 ) + Fw wt
(21)
zt ≈ h(x̂t|t−1 , 0) + Ht (xt − x̂t|t−1 ) + Hv vt
At this time, Ft represents the Jacobian matrix of the motion model evaluated at x̂t−1|t−1 , and Ht represents the Jacobian matrix of the observation model evaluated at x̂t|t−1 . Fw and Hv represent the Jacobian matrices with respect to the noise, evaluated at wt = 0 and vt = 0, respectively.
Ft = ∂f/∂xt−1 |xt−1 =x̂t−1|t−1 ,   Fw = ∂f/∂wt |xt−1 =x̂t−1|t−1 , wt =0        (22)
Ht = ∂h/∂xt |xt =x̂t|t−1 ,   Hv = ∂h/∂vt |xt =x̂t|t−1 , vt =0        (23)
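In practice these Jacobians are usually derived analytically, but a quick way to prototype or sanity-check them is finite differences. The helper below is a small sketch under that assumption; the example motion model f is made up purely for illustration.

import numpy as np

def numerical_jacobian(func, x0, eps=1e-6):
    """Finite-difference Jacobian of func evaluated at x0 (column-wise perturbation)."""
    y0 = func(x0)
    J = np.zeros((y0.size, x0.size))
    for j in range(x0.size):
        dx = np.zeros_like(x0)
        dx[j] = eps
        J[:, j] = (func(x0 + dx) - y0) / eps
    return J

# Example: F_t = df/dx evaluated at the previous corrected mean, with w_t = 0
def f(x, u=np.array([0.1]), w=np.zeros(2), dt=0.1):
    # made-up nonlinear motion model: state [position, heading]
    return np.array([x[0] + dt * np.cos(x[1]) + w[0], x[1] + dt * u[0] + w[1]])

x_hat = np.array([1.0, 0.3])
F_t = numerical_jacobian(lambda x: f(x), x_hat)                  # Jacobian w.r.t. the state
F_w = numerical_jacobian(lambda w: f(x_hat, w=w), np.zeros(2))   # Jacobian w.r.t. the noise
print(F_t, F_w, sep="\n")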
Expanding (21) gives the following.
zt = Ht xt + h(x̂t|t−1 , 0) − Ht x̂t|t−1 + Hv vt
(25)
= Ht xt + z̃t + ṽt
- z̃t = h(x̂t|t−1 , 0) − Ht x̂t|t−1
- ṽt = Hv vt ∼ N (0, Hv Rt H⊺v )
Assuming all random variables follow a Gaussian distribution, p(xt |xt−1 , ut ), p(zt |xt ) can be
expressed as follows.
p(xt |xt−1 , ut ) ∼ N (Ft xt−1 + ũt , Fw Qt F⊺w )
    = 1/√det(2πFw Qt F⊺w ) · exp( −½ (xt − Ft xt−1 − ũt )⊺ (Fw Qt F⊺w )−1 (xt − Ft xt−1 − ũt ) )        (26)

p(zt |xt ) ∼ N (Ht xt + z̃t , Hv Rt H⊺v )
    = 1/√det(2πHv Rt H⊺v ) · exp( −½ (zt − Ht xt − z̃t )⊺ (Hv Rt H⊺v )−1 (zt − Ht xt − z̃t ) )        (27)
Fw Qt F⊺w represents the noise in the linearized motion model, and Hv Rt H⊺v represents the
noise in the linearized observation model. Next, the predicted belief b̄el(xt ) and the corrected belief bel(xt ) to be found through the Kalman filter can be expressed as follows.
b̄el(xt ) = ∫ p(xt |xt−1 , ut ) bel(xt−1 ) dxt−1 ∼ N (x̂t|t−1 , Pt|t−1 )
bel(xt ) = η · p(zt |xt ) b̄el(xt ) ∼ N (x̂t|t , Pt|t )        (28)
The EKF operates in the same manner as the KF: the prediction step first obtains the predicted belief b̄el(xt ) from the previous step's values and the motion model, and the correction step then obtains the corrected belief bel(xt ) from the measurement and the observation model. Substituting p(xt |xt−1 , ut ) and p(zt |xt ) into the above equation yields the means and covariances of the prediction and correction steps, (x̂t|t−1 , Pt|t−1 ) and (x̂t|t , Pt|t ), respectively. Refer to Section 3.2.4 "Mathematical Derivation of the KF" in [13] for the detailed derivation.
The initial value bel(x0 ) is given as follows.
bel(x0 ) ∼ N (x̂0 , P0 ) (29)
- x̂0 : Typically set to 0
- P0 : Typically set to a small value (<1e-2)
3.1 Prediction step
Prediction involves calculating the predicted belief b̄el(xt ). The covariance matrix is obtained using the linearized Jacobian matrix Ft .
x̂t|t−1 = f (x̂t−1|t−1 , ut , 0)
(30)
Pt|t−1 = Ft Pt−1|t−1 F⊺t + Fw Qt F⊺w
3.3 Summary
The Extended Kalman Filter can be expressed as a function as follows.
ExtendedKalmanFilter(x̂t−1|t−1 , Pt−1|t−1 , ut , zt ){
(Prediction Step)
x̂t|t−1 = ft (x̂t−1|t−1 , ut , 0)
Pt|t−1 = Ft Pt−1|t−1 F⊺t + Fw Qt F⊺w
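Putting the pieces together, a minimal EKF sketch could look as follows. The callbacks f, h, F_fn, H_fn, Fw_fn, Hv_fn are placeholders supplied by the caller, and the correction equations are the standard EKF ones rather than a transcription of this summary; the structure otherwise mirrors the linear KF, with only the mean propagation and the Jacobian evaluation points differing.

import numpy as np

def ekf_step(x_prev, P_prev, u, z, f, h, F_fn, H_fn, Fw_fn, Hv_fn, Q, R):
    """One EKF prediction + correction cycle; f/h are the nonlinear models,
    F_fn/H_fn/Fw_fn/Hv_fn return the Jacobians evaluated at the given point."""
    # Prediction: propagate the mean through f and the covariance through the Jacobians
    x_pred = f(x_prev, u)
    F, Fw = F_fn(x_prev, u), Fw_fn(x_prev, u)
    P_pred = F @ P_prev @ F.T + Fw @ Q @ Fw.T
    # Correction: standard EKF update, linearized at the predicted mean
    H, Hv = H_fn(x_pred), Hv_fn(x_pred)
    S = H @ P_pred @ H.T + Hv @ R @ Hv.T
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_post = x_pred + K @ (z - h(x_pred))
    P_post = (np.eye(x_pred.size) - K @ H) @ P_pred
    return x_post, P_post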
– δx̂t|t−1 : The mean at step t given the correction value at step t − 1. Some literature also denotes this as δx⁻t .
– P̂t|t−1 : The covariance at step t given the correction value at step t − 1. Some literature also denotes this as P⁻t .
The error-state Kalman filter (ESKF) is a Kalman filter that estimates the mean and covariance of the error state variable δxt , unlike the EKF, which estimates the mean and covariance of the conventional state variable xt . Since it estimates the error state rather than the state variable directly, it is also called an indirect Kalman filter, and it is also known as the error-state extended Kalman filter (ES-EKF). In the ESKF, the conventional state variable is called the true state variable, and it is represented as the sum of a nominal state and an error state.
If we interpret the above formula, it means that the actual (true) state variable xt we want to
estimate can be represented as a sum of the general (or nominal) state x̂t without errors and the
error state δxt arising from model and sensor noise. Here, the nominal state has (relatively) large
values and is nonlinear. In contrast, the error state has small values near zero and is close to linear. While the conventional EKF linearizes the nonlinear true (nominal + error) state variable for filtering, which is slower and accumulates linearization error over time, the ESKF linearizes and filters only the error state, making it faster and more accurate.
The advantages of ESKF over conventional EKF are as follows (Madyastha et al., 2011):
• The representation of the orientation error state involves minimal parameters. That is, it has
the minimum number of parameters as degrees of freedom, so phenomena like singularities
due to over-parameterization do not occur.
• The error state system only operates near the origin, making it easy to linearize. Therefore,
parameter singularities like gimbal lock do not occur, and it is always possible to perform
linearization.
• Since error states typically have small values, terms of second order and higher can be
ignored. This helps perform Jacobian operations quickly and easily. Some Jacobians are
also used as constants.
However, while the prediction step of the ESKF is fast, the correction step, which handles the nominal state with its generally large and nonlinear values, is comparatively slow.
The motion model and observation model of the ESKF are as follows.
Error-state Motion Model: xt + δxt = f (xt−1 , δxt−1 , ut , wt )
(34)
Error-state Observation Model: zt = h(xt , δxt , vt )
- f (·): Nonlinear motion model function
- h(·): Nonlinear observation model function
- wt ∼ N (0, Qt ): Noise of the error state model. Qt denotes the covariance matrix of wt
- vt ∼ N (0, Rt ): Noise of the observation model. Rt denotes the covariance matrix of vt
When first-order Taylor approximations are performed for f (·), h(·), the following expansions
result. This development refers to [5].
Note that both Jacobians, Ft , Ht , are for the error state δxt , not the true state
xtrue,t . This is the most significant difference from EKF in the Jacobian part.
Ft = ∂f/∂δxt−1 |δxt−1 =δx̂t−1|t−1 ,   Fw = ∂f/∂wt |δxt−1 =δx̂t−1|t−1 , wt =0        (36)
Ht = ∂h/∂δxt |δxt =δx̂t|t−1 ,   Hv = ∂h/∂vt |δxt =δx̂t|t−1 , vt =0        (37)
Since δx̂t−1|t−1 is always reset to 0 at the previous correction step, 0 is substituted into the corresponding terms.
xt + δxt = Ft δxt−1 + f (xt−1|t−1 , 0, ut , 0) + Fw wt (40)
The nominal state variable xt is identical to f (xt−1 , 0, ut , 0) by definition, so they cancel each
other out.
δxt = Ft δxt−1 + Fw wt
= Ft δxt−1 + w̃t (41)
= 0 + w̃t
- z̃t = h(xt|t−1 , 0, 0)
- ṽt = Hv vt ∼ N (0, Hv Rt H⊺v )
Assuming all random variables follow a Gaussian distribution, p(δxt | δxt−1 , ut ), p(zt | δxt )
can be represented as follows.
The ESKF operates in the same way as the EKF: the prediction step first derives the predicted belief b̄el(δxt ) from the previous step's values and the motion model, and the correction step then derives the corrected belief bel(δxt ) from the measurement and the observation model. Substituting p(δxt |δxt−1 , ut ) and p(zt |δxt ) into the above formula yields the means and covariances of the prediction and correction steps, (δx̂t|t−1 , Pt|t−1 ) and (δx̂t|t , Pt|t ), respectively. The detailed derivation can be found in Section 3.2.4 "Mathematical Derivation of the KF" of [13].
The initial value bel(δx0 ) is given as follows.
Note that Ft is the Jacobian for the error state. Since the error state variable δx̂ is reset to 0 at every correction step, multiplying it by the linear Jacobian Ft still yields 0, so δx̂t|t−1 always remains 0 in the prediction step. Only the nominal state x̂t|t−1 and the error-state covariance Pt|t−1 are updated there.
4.2 Correction step
Correction is the process of deriving bel(δxt ). The Kalman gain and covariance matrix are derived
using the linearized Jacobian matrix Ht .
Note that in the above formula, Ht is the Jacobian for the observation model of
the error state δxt , not the true state xtrue,t . The symbols Pt|t−1 , Kt are only similar
to those in EKF, but the actual values are different. Thus, while the overall formulas are
the same as in EKF, the matrices F, H, P, K denote values for the error state δxt , which is a key
difference.
4.2.1 Reset
Once the nominal state is updated, the next step is to reset the error state to 0. The reset is needed so that the error is expressed relative to the new nominal state, and it also updates the covariance of the error state Pt|t .
If the reset function is denoted as g(·), it can be represented as follows. For more details, see chapter 6 of Quaternion kinematics for the error-state Kalman filter [2].
δx ← g(δx) = δx − δx̂t|t        (49)
The reset process in ESKF is as follows.
δx̂t|t ← 0
(50)
Pt|t ← GPt|t G⊺
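As an illustration, the sketch below builds G for an error state stacked as [δp, δv, δθ]. The stacking order and the dimensions are assumptions made for this example; only the orientation block differs from identity, using the Gθ ≈ I − [½ δθ̂]× form given in chapter 6 of [2].

import numpy as np

def skew(v):
    """Skew-symmetric matrix [v]x such that [v]x a = v x a."""
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0]])

def reset_jacobian(delta_theta_hat):
    """Reset Jacobian G for an error state [dp(3), dv(3), dtheta(3)].
    Position/velocity blocks are identity; the orientation block is
    I - [0.5 * delta_theta_hat]x (see [2], ch. 6)."""
    G = np.eye(9)
    G[6:9, 6:9] = np.eye(3) - skew(0.5 * delta_theta_hat)
    return G

# After injecting the observed error into the nominal state:
# delta_x_hat <- 0,  P <- G P G^T
P = 0.01 * np.eye(9)
G = reset_jacobian(np.array([0.01, -0.02, 0.005]))
P = G @ P @ G.T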
4.3 Summary
ErrorStateKalmanFilter(x̂t−1|t−1 , δx̂t−1|t−1 , Pt−1|t−1 , ut , zt ){
(Prediction Step)
δx̂t|t−1 = Ft δx̂t−1|t−1 = 0 ← Always 0
x̂t|t−1 = f (x̂t−1|t−1 , 0, ut , 0)
Pt|t−1 = Ft Pt−1|t−1 F⊺t + Fw Qt F⊺w
(Correction Step)
Kt = Pt|t−1 H⊺t (Ht Pt|t−1 H⊺t + Hv Rt H⊺v )−1
(52)
δx̂t|t = Kt (zt − ht (x̂t|t−1 , 0, 0))
x̂t|t = x̂t|t−1 + δx̂t|t
Pt|t = (I − Kt Ht )Pt|t−1
reset δx̂t|t
δx̂t|t ← 0
Pt|t ← GPt|t G⊺
return δx̂t|t , Pt|t
}
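A compact Python rendering of the function above might look like the following. The callbacks are caller-supplied placeholders, and injecting the error by plain vector addition is a simplifying assumption that only holds for purely vector-valued states; on manifolds a boxplus-style composition would be used instead.

import numpy as np

def eskf_step(x_nom, P, u, z, f, h, F_fn, Fw_fn, H_fn, Hv_fn, Q, R, reset_G_fn):
    """One ESKF cycle: nominal propagation, error-covariance prediction,
    correction of the error state, injection into the nominal state, reset."""
    # Prediction: delta_x stays 0, only the nominal state and P are propagated
    F, Fw = F_fn(x_nom, u), Fw_fn(x_nom, u)   # Jacobians w.r.t. the error state
    x_nom = f(x_nom, 0, u, 0)                 # nominal propagation
    P = F @ P @ F.T + Fw @ Q @ Fw.T
    # Correction: estimate the error state from the measurement
    H, Hv = H_fn(x_nom), Hv_fn(x_nom)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + Hv @ R @ Hv.T)
    delta_x = K @ (z - h(x_nom, 0, 0))
    x_nom = x_nom + delta_x                   # injection (vector states assumed)
    P = (np.eye(P.shape[0]) - K @ H) @ P
    # Reset: delta_x <- 0 and P <- G P G^T
    G = reset_G_fn(delta_x)
    P = G @ P @ G.T
    return x_nom, P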
The Iterated Extended Kalman Filter (IEKF) is an algorithm that repeatedly performs the correction step of the Extended Kalman Filter (EKF). Since the EKF linearizes non-linear functions to estimate the state, errors inevitably occur during linearization. The IEKF reduces these linearization errors by repeating the linearization and correction as long as the update δx̂t|t,j after a correction pass is still large, i.e., until it becomes sufficiently small.
In this context, δx̂ is interpreted as the update change, not an error state variable. That is,
x̂t|t,j+1 ← x̂t|t,j +δx̂t|t,j is used solely to update the j-th posterior value to the j+1 posterior value.
In other words, it is not the target of state estimation.
5.1.2 Commonality 2
Next, the linearization process is also exactly the same as EKF’s equations (24), (25).
zt = Ht xt + h(x̂t|t−1 , 0) − Ht x̂t|t−1 + Hv vt
(55)
= Ht xt + z̃t + ṽt
5.1.3 Commonality 3
IEKF, like EKF, uses the matrices F, H, K true to the state variables xtrue .
5.1.4 Difference 1
EKF: the corrected value is obtained from the predicted value in a single pass.
IEKF: the corrected value is used as the new linearization point and the correction step is repeated iteratively.
- j: j-th iteration
5.1.5 Difference 2
Expanding equation (55) to create the innovation term rt gives the following.
5.2 Prediction step
Prediction involves the process of determining the predicted belief b̄el(xt ). The linearized Jacobian matrix Ft is used when computing the covariance matrix. This is identical to the prediction step in EKF.
x̂t|t−1 = f (x̂t−1|t−1 , ut , 0)
(60)
Pt|t−1 = Ft Pt−1|t−1 F⊺t + Fw Qt F⊺w
set ϵ
start j-th loop
Kt,j = Pt|t−1 H⊺t,j (Ht,j Pt|t−1 H⊺t,j + Hv,j Rt H⊺v,j )−1
δx̂t|t,j = Kt,j (zt − h(x̂t|t,j , 0) − Ht,j (x̂t|t−1 − x̂t|t,j ))
(61)
x̂t|t,j+1 = x̂t|t,j + δx̂t|t,j
iterate until ∥δx̂t|t,j ∥ < ϵ.
end loop
Pt|t,n = (I − Kt,n Ht,n )Pt|t−1
5.4 Summary
IteratedExtendedKalmanFilter(x̂t−1|t−1 , Pt−1|t−1 , ut , zt ){
(Prediction Step)
x̂t|t−1 = ft (x̂t−1|t−1 , ut , 0)
Pt|t−1 = Ft Pt−1|t−1 F⊺t + Fw Qt F⊺w
(Correction Step)
set ϵ
start j-th loop
(62)
Kt,j = Pt|t−1 H⊺t,j (Ht,j Pt|t−1 H⊺t,j + Hv,j Rt H⊺v,j )−1
δx̂t|t,j = Kt,j (zt − h(x̂t|t,j , 0) − Ht,j (x̂t|t−1 − x̂t|t,j ))
x̂t|t,j+1 = x̂t|t,j + δx̂t|t,j
iterate until ∥δx̂t|t,j ∥ < ϵ.
end loop
Pt|t,n = (I − Kt,n Ht,n )Pt|t−1
return x̂t|t , Pt|t
}
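A minimal sketch of the iterated correction in (62) is shown below. The convergence threshold, the maximum iteration count, and the choice of NumPy are assumptions; the loop re-evaluates the Jacobians at the latest estimate on every pass, which is the defining difference from the plain EKF.

import numpy as np

def iekf_correction(x_pred, P_pred, z, h, H_fn, Hv_fn, R, eps=1e-6, max_iter=10):
    """Iterated EKF correction: relinearize h at the latest estimate each pass."""
    x_j = x_pred.copy()
    for _ in range(max_iter):
        H, Hv = H_fn(x_j), Hv_fn(x_j)                  # Jacobians at the j-th estimate
        S = H @ P_pred @ H.T + Hv @ R @ Hv.T
        K = P_pred @ H.T @ np.linalg.inv(S)
        delta = K @ (z - h(x_j) - H @ (x_pred - x_j))  # update step from (61)/(62)
        x_j = x_j + delta
        if np.linalg.norm(delta) < eps:                # iterate until ||delta|| < eps
            break
    P_post = (np.eye(x_pred.size) - K @ H) @ P_pred    # covariance with the final Jacobian
    return x_j, P_post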
While IEKF estimates the true state variable xtrue iteratively in the correction step, IESKF
is an algorithm that iteratively estimates the error state variable δxt . The relationship between
these state variables is as follows.
The iterative update process in IESKF involves understanding xtrue,t as the post-update state,
nominal xt as the pre-update state, and the error state variable δxt as the update amount.
6.1.2 Commonality 2
Both Jacobians Ft , Ht of IESKF refer to the Jacobians for the error state δxt , similar to ESKF.
Ft = ∂f/∂δxt−1 |δxt−1 =δx̂t−1|t−1 ,   Fw = ∂f/∂wt |δxt−1 =δx̂t−1|t−1 , wt =0        (65)
Ht = ∂h/∂δxt |δxt =δx̂t|t−1 ,   Hv = ∂h/∂vt |δxt =δx̂t|t−1 , vt =0        (66)
6.1.3 Commonality 3
The linearized error state δxt can also be derived identically to the ESKF.
δxt = Ft δxt−1 + Fw wt
= Ft δxt−1 + w̃t (68)
= 0 + w̃t
6.1.4 Difference (MAP-based derivation)
IESKF uses maximum a posteriori (MAP) for deriving the correction step. For MAP estima-
tion, the Gauss-Newton optimization method is utilized. For more details, refer to section 7. This
derivation process is based on references [6], [7], [8], [9], [10], and [11], among which [6] provides
the most detailed explanation of the IESKF derivation process.
If EKF is derived using the MAP approach, the final equation to optimize is as follows:
arg min_{xtrue,t} ∥zt − h(xtrue,t , 0)∥²_{Hv R−1 H⊺v} + ∥xtrue,t − x̂t|t−1 ∥²_{P−1t|t−1}        (71)
When expressed in terms of the nominal state xt and error state δxt , it becomes the following:
arg min_{δx̂t|t} ∥zt − h(xt , δxt , 0)∥²_{Hv R−1 H⊺v} + ∥δx̂t|t−1 ∥²_{P−1t|t−1}        (72)
Tip
The first part of (72) should be linearized similarly to (42).
If the third and fourth lines of (77) are rearranged and simplified, the following formula is
obtained:
δx̂t|t ∼ N ( −J−1t (x̂t|t − x̂t|t−1 ), J−1t Pt|t−1 J−⊺t )        (79)
Having calculated (76) and (77), the MAP estimation problem becomes as follows:
IESKF iteratively estimates the error state variable δx̂t|t until it converges to a value less than
a specific threshold ϵ. The expression for the j-th iteration is as follows:
δx̂t|t,j ∼ N ( −J−1t,j (x̂t|t,j − x̂t|t−1 ), J−1t,j Pt|t−1 J−⊺t,j )        (81)
Refer to Section 8 for the derivation process of the update equation from the above equation.
6.3 Correction step
Correction is the process of deriving the corrected belief bel(δxt ). The Kalman gain and covariance matrix are calculated using the linearized Jacobian matrix Ht . The correction step of the IESKF is performed iteratively until the update amount δx̂t|t,j becomes sufficiently small. Differentiating the derived formula (82) and setting it to zero gives the following:
set ϵ
start j-th loop
St,j = Ht,j J−1t,j Pt|t−1 J−⊺t,j H⊺t,j + Hv,j Rt H⊺v,j
Kt,j = J−1t,j Pt|t−1 J−⊺t,j H⊺t,j S−1t,j
δx̂t|t,j = Kt,j (zt − h(x̂t|t,j , 0, 0) + Ht,j J−1t,j (x̂t|t,j − x̂t|t−1 )) − J−1t,j (x̂t|t,j − x̂t|t−1 )        (84)
6.4 Summary
IESKF can be expressed as a function as follows.
(Correction Step)
set ϵ
start j-th loop
St,j = Ht,j J−1t,j Pt|t−1 J−⊺t,j H⊺t,j + Hv,j Rt H⊺v,j
Kt,j = J−1t,j Pt|t−1 J−⊺t,j H⊺t,j S−1t,j
δx̂t|t,j = Kt,j (zt − h(x̂t|t,j , 0, 0) + Ht,j J−1t,j (x̂t|t,j − x̂t|t−1 )) − J−1t,j (x̂t|t,j − x̂t|t−1 )        (85)
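The correction loop of (85) can be transcribed almost term for term, as in the sketch below. The callbacks J_fn, H_fn, Hv_fn and the use of plain vector addition for the state update are assumptions for this example, and the final covariance line is a common choice rather than something stated in the fragment above.

import numpy as np

def ieskf_correction(x_pred, P_pred, z, h, H_fn, Hv_fn, J_fn, R, eps=1e-6, max_iter=10):
    """Iterated ESKF correction following (85); Jacobians are w.r.t. the error state."""
    x_j = x_pred.copy()
    for _ in range(max_iter):
        H, Hv, J = H_fn(x_j), Hv_fn(x_j), J_fn(x_j)
        J_inv = np.linalg.inv(J)
        P_j = J_inv @ P_pred @ J_inv.T                  # J^-1 P_{t|t-1} J^-T
        S = H @ P_j @ H.T + Hv @ R @ Hv.T               # S_{t,j}
        K = P_j @ H.T @ np.linalg.inv(S)                # K_{t,j}
        c = x_j - x_pred                                # x_{t|t,j} - x_{t|t-1}
        delta = K @ (z - h(x_j) + H @ (J_inv @ c)) - J_inv @ c
        x_j = x_j + delta                               # injection (vector states assumed)
        if np.linalg.norm(delta) < eps:
            break
    P_post = (np.eye(x_pred.size) - K @ H) @ P_j        # assumed final covariance form
    return x_j, P_post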
7 MAP, GN, and EKF relationship
7.1 Traditional EKF derivation
Let’s consider the observation model function for the EKF given as follows. For the convenience
of development, the observation noise vt is positioned outside.
Removing the negative sign turns the maximization problem into a minimization problem, and
it can be organized into the following optimization formula.
x̂t|t ∝ arg min_{xt} exp( (zt − h(xt ))⊺ R−1t (zt − h(xt )) + (xt − x̂t|t−1 )⊺ P−1t|t−1 (xt − x̂t|t−1 ) )        (90)
Expanding the formula within (91) and defining it as a cost function Ct results in the following.
Note that switching the order of xt − x̂t|t−1 does not affect the overall value.
Ct = (zt − h(xt ))⊺ R−1t (zt − h(xt )) + (x̂t|t−1 − xt )⊺ P−1t|t−1 (x̂t|t−1 − xt )        (92)
This can be expressed in matrix form as follows.
Ct = [ x̂t|t−1 − xt ; zt − h(xt ) ]⊺ [ P−1t|t−1 , 0 ; 0 , R−1t ] [ x̂t|t−1 − xt ; zt − h(xt ) ]        (93)
(here the semicolon denotes vertical stacking of blocks)
yt = g(xt ) + et ∼ N (g(xt ), Pe )        (94)
- yt = [ x̂t|t−1 ; zt ]
- g(xt ) = [ xt ; h(xt ) ]
- et ∼ N (0, Pe )
- Pe = [ Pt|t−1 , 0 ; 0 , Rt ]
The nonlinear function g(xt ) can be linearized as follows.
Jt = ∂g/∂xt |xt =x̂t|t−1
   = ∂/∂xt [ xt ; h(xt ) ] |xt =x̂t|t−1
   ≈ ∂/∂xt [ xt ; h(x̂t|t−1 ) + Ht (xt − x̂t|t−1 ) ] |xt =x̂t|t−1
   = [ I ; Ht ]        (96)
Therefore, the likelihood for yt can be developed as follows.
x̂t|t = arg max_{xt} p(yt |xt )
rt = yt − g(x̂t|t−1 )
= Jt δx̂t|t−1 + e (100)
∼ N (0, Pe )
By solving the normal equations of Gauss-Newton, the solution is as follows.
(J⊺t P−1e Jt ) δx̂t|t−1 = J⊺t P−1e rt        (101)
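Numerically one would not form the inverse of the approximate Hessian explicitly; the normal equations (101) are solved directly instead. The sketch below illustrates this; the matrices and dimensions are placeholder assumptions.

import numpy as np

# Normal equations of Gauss-Newton:  (J^T Pe^-1 J) dx = J^T Pe^-1 r
def gauss_newton_step(J, Pe, r):
    Pe_inv = np.linalg.inv(Pe)
    A = J.T @ Pe_inv @ J          # approximate Hessian
    b = J.T @ Pe_inv @ r
    return np.linalg.solve(A, b)  # dx = A^-1 b, solved without forming A^-1

# Illustrative shapes: stacked residual r = [x_pred - x; z - h(x)], J = [I; H]
n, m = 2, 1
H = np.array([[1.0, 0.0]])
J = np.vstack([np.eye(n), H])
Pe = np.block([[0.01 * np.eye(n), np.zeros((n, m))],
               [np.zeros((m, n)), 0.05 * np.eye(m)]])
r = np.array([0.0, 0.0, 0.03])
print(gauss_newton_step(J, Pe, r))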
Pt|t = E[(xt − x̂t|t−1 )(xt − x̂t|t−1 )⊺ ]
     = E[δx̂t|t−1 δx̂⊺t|t−1 ]
     = E[ (J⊺t P−1e Jt )−1 J⊺t P−1e rt r⊺t P−⊺e Jt (J⊺t P−1e Jt )−⊺ ]
     = (J⊺t P−1e Jt )−1 J⊺t P−1e E[rt r⊺t ] P−⊺e Jt (J⊺t P−1e Jt )−⊺        ← E[rt r⊺t ] = Pe
     = (J⊺t P−1e Jt )−1
     = ( [ I ; Ht ]⊺ [ Pt|t−1 , 0 ; 0 , Rt ]−1 [ I ; Ht ] )−1
     = ( P−1t|t−1 + H⊺t R−1t Ht )−1
     = Pt|t−1 − Pt|t−1 H⊺t (Ht Pt|t−1 H⊺t + Rt )−1 Ht Pt|t−1        ← matrix inversion lemma
     = (I − Kt Ht )Pt|t−1        (103)
As can be seen from the fifth line of the above equation, the inverse of the approximate Hessian
matrix H̃−1 obtained through GN and the EKF’s posterior covariance Pt|t are the same.
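This equivalence is easy to verify numerically. The short check below compares the information-form expression with the (I − Kt Ht )Pt|t−1 form on random matrices; the dimensions and the random seed are arbitrary assumptions.

import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
P_pred = (lambda A: A @ A.T + np.eye(n))(rng.normal(size=(n, n)))   # SPD prior covariance
R = (lambda A: A @ A.T + np.eye(m))(rng.normal(size=(m, m)))        # SPD measurement noise
H = rng.normal(size=(m, n))

# Left-hand side: inverse of the GN approximate Hessian (information form)
info_form = np.linalg.inv(np.linalg.inv(P_pred) + H.T @ np.linalg.inv(R) @ H)
# Right-hand side: EKF posterior covariance (I - K H) P_pred
K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
ekf_form = (np.eye(n) - K @ H) @ P_pred

print(np.allclose(info_form, ekf_form))   # True, as the derivation in (103) shows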
This is identical to the solution of the EKF. That is, the EKF corresponds to a single GN iteration, and the IEKF performs the same operations as GN.
(107)
The symbols in the above equation are simplified as follows:
x = arg min_x ∥z − Hx∥²R̄ + ∥c + Jx∥²P̄
- z = zt − h(x̂t|t−1 , 0, 0)
- c = x̂t|t,j − x̂t|t−1
- R̄ = Hv R−1 H⊺v
- x = δx̂
- P̄ = J−1 PJ−⊺
Tip
Substituting this equation into (109) yields:
= Kz + KHJ−1 c − J−1 c
(112)
- K = J−1 PJ−⊺ H⊺ S−1
Restoring the symbols substituted in the last line of the equation yields the final update
equation (84) as follows:
∴ δx̂t|t,j = Kt,j (zt − h(x̂t|t,j , 0, 0) + Ht,j J−1t,j (x̂t|t,j − x̂t|t−1 )) − J−1t,j (x̂t|t,j − x̂t|t−1 )        (113)
- z = zt − h(x̂t|t−1 , 0, 0)
- c = x̂t|t,j − x̂t|t−1
- R̄ = Hv R−1 H⊺v
- x = δx̂
9 Wrap-up
The KF, EKF, ESKF, IEKF, and IESKF discussed so far can be summarized as follows.
9.2 Extended Kalman Filter (EKF)
9.4 Iterated Extended Kalman Filter (IEKF)
9.5 Iterated Error-state Kalman Filter (IESKF)
10 Reference
[1] (Wiki) Kalman Filter
[2] (Paper) Sola, Joan. "Quaternion kinematics for the error-state Kalman filter." arXiv preprint
arXiv:1711.02508 (2017).
[3] (Youtube) Robot Mapping Course - Freiburg Univ
[4] (Blog) [SLAM] Kalman filter and EKF (Extended Kalman Filter) - Jinyong Jeong
[5] (Blog) Error-State Kalman Filter understanding and formula derivation - CSDN
[6] (Paper) He, Dongjiao, Wei Xu, and Fu Zhang. "Kalman filters on differentiable manifolds."
arXiv preprint arXiv:2102.03804 (2021).
[7] (Book) SLAM in Autonomous Driving book (SAD book)
[8] (Paper) Xu, Wei, and Fu Zhang. "FAST-LIO: A fast, robust LiDAR-inertial odometry package by tightly-coupled iterated Kalman filter." IEEE Robotics and Automation Letters 6.2 (2021): 3317-3324.
[9] (Paper) Huai, Jianzhu, and Xiang Gao. "A Quick Guide for the Iterated Extended Kalman
Filter on Manifolds." arXiv preprint arXiv:2307.09237 (2023).
[10] (Paper) Bloesch, Michael, et al. "Iterated extended Kalman filter based visual-inertial odom-
etry using direct photometric feedback." The International Journal of Robotics Research
36.10 (2017): 1053-1072.
[11] (Paper) Skoglund, Martin A., Gustaf Hendeby, and Daniel Axehill. "Extended Kalman filter
modifications based on an optimization view point." 2015 18th International Conference on
Information Fusion (Fusion). IEEE, 2015.
[12] (Blog) From MAP, MLE, OLS, GN to IEKF, EKF
[13] (Book) Thrun, Sebastian. "Probabilistic robotics." Communications of the ACM 45.3 (2002):
52-57.
11 Revision log
• 1st: 2020-06-23
• 2nd: 2020-06-24
• 3rd: 2020-06-26
• 4th: 2023-01-21
• 5th: 2023-01-31
• 6th: 2023-02-02
• 7th: 2023-02-04
• 8th: 2024-02-08
• 9th: 2024-02-09
• 10th: 2024-05-02