Error State - Kalman Filter
Notation
We will use proper Euler angles, denoted α, β, γ, for rotations. We are only
interested in 2D rotations, so we will use the z-x′-z′′ representation, in which α
represents the yaw (the specific representation does not matter as long as the first
rotation happens about the z axis). The steering angle will be denoted by δ.
Explanation
The Kalman Filter is used to keep track of certain variables and to fuse information
coming from sensors such as an Inertial Measurement Unit (IMU), wheel encoders, or
any other sensor. It is very common in robotics because it fuses the information
according to how certain the measurements are. Therefore, we can have several
sources of information, some more reliable than others, and a KF takes that into
account to keep track of the variables we are interested in.
State
The state sₜ we are interested in tracking is composed of the x and y coordinates, the
heading of the vehicle (the yaw θ), the current velocity v, and the steering angle δ. The
tracked orientation consists only of the yaw θ: we are only modelling a 2D world,
so we do not care about the roll β or the pitch γ. Finally, we add the
steering angle δ, which is important for predicting the movement of the car. Therefore
the state at timestep t is
$$
s_t = \begin{bmatrix} x \\ y \\ \theta \\ v \\ \delta \end{bmatrix}
$$
A KF can be divided into two steps: the predict step and the update step. In the predict
step, we use the tracked information to predict where the object will move in the next
step. In the update step, we update the belief we have about the variables using the
external measurements coming from the sensors.
Sensor
Keep in mind that a KF can handle any number of sensors; for now we are going to use
the localization measurement coming from a GPS + pseudo-gyro.
This measurement contains the global coordinates (x, y), which keeps the system from
drifting. Navigating without such global measurements is also called dead reckoning. Dead
reckoning, or using a Kalman Filter without a global measurement, is prone to
cumulative errors, which means the state will slowly diverge from the true value.
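To see why this happens, here is a tiny illustrative simulation (not from the article; the noise level is made up): integrating a noisy velocity reading without any global correction makes the position error grow over time.

```python
import numpy as np

# Dead reckoning in 1D: integrate a noisy velocity with no global correction.
rng = np.random.default_rng(0)
dt, v_true = 0.1, 2.0
x_true, x_dead_reckoned = 0.0, 0.0
for _ in range(1000):
    x_true += v_true * dt
    # every velocity reading carries a small error that is integrated forever
    x_dead_reckoned += (v_true + rng.normal(0.0, 0.05)) * dt
print(abs(x_true - x_dead_reckoned))  # the accumulated drift after 100 s
```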
Prediction Step
We will track the state as a multivariate Gaussian distribution with mean μ and
covariance P. μₜ will be the expected value of the state given the information
available (i.e. the mean of sₜ), and the covariance matrix P expresses how certain we
are about our estimate. We will use the previous mean μₜ₋₁ and u to predict μₜ.
Here u is a control column vector with any extra information we can use, for example
the steering angle if we have access to the steering of the car, or the acceleration if
we have access to it. u can be a vector of any size.
We will eventually model everything using matrices, but for now we will use scalars. The
new value of the state at time t will be
$$
\begin{aligned}
x_t &= x_{t-1} + v\,\Delta t \cos\theta \\
y_t &= y_{t-1} + v\,\Delta t \sin\theta \\
\theta_t &= \theta_{t-1} \\
v_t &= v_{t-1} \\
\delta_t &= \delta_{t-1}
\end{aligned}
$$
Here we are making simplifying assumptions about the world. First, the velocity v
and the steering angle δ of the next step will be the same as before, which is a weak
assumption. The strong assumption is that the heading or yaw of the car θ stays the
same. Notice we are not using the steering angle yet, but we still track it; it will be useful later.
We could incorporate the kinematic model here to make the prediction more robust,
but that would add non-linearities (and so far this is a linear KF). For now, let's work
with a simple environment and later on we can make things more interesting.
This prediction can be re-formulated in matrix form as

$$
\mu_t = F \mu_{t-1} + Bu
$$

where u is a zero vector for now and B is a linear transformation from u into the same
space as the state s. F would be (F has to be linear so far; in the EKF we will expand
that to include non-linearities)
$$
F = \begin{bmatrix}
1 & 0 & 0 & \Delta t \cos\theta & 0 \\
0 & 1 & 0 & \Delta t \sin\theta & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
$$
This will result in the same equations but using matrix notation. Remember that we
are modelling s as a multivariate Gaussian distribution and we are keeping track of
the mean μ of the state s and the covariance P. Using the equations above we
update the mean of the state; now we have to update the covariance of the state.
Every time we predict, we make small errors which add noise and result in a slightly
less accurate prediction. The covariance P has to reflect this reduction in certainty.
The way this works with Gaussian distributions is that the distribution gets slightly
flatter (i.e. the covariance "increases").
In a single-variable Gaussian distribution y ∼ N(μ, σ²), a linear transformation ay
scales the variance to a²σ². The matrix equivalent for our state is

$$
P_t' = F P_{t-1} F^T
$$

Now we have to take into account that we are adding Bu, where u is the control vector,
a Gaussian variable with covariance Q. The good thing about Gaussians is that the
covariance of a sum of independent Gaussians is the sum of the covariances. Taking
this into account, we have

$$
P_t = F P_{t-1} F^T + B Q B^T
$$

And with this, we have finished predicting the state and updating its covariance.
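To make the prediction step concrete, here is a minimal NumPy sketch (this is not code from the article; the state ordering, timestep, and noise values are assumptions for illustration, and Q here stands in for the BQBᵀ term).

```python
import numpy as np

def predict(mu, P, Q, dt):
    """Linear KF prediction for the state [x, y, theta, v, delta].

    mu: (5,) state mean, P: (5, 5) state covariance,
    Q:  (5, 5) process noise added per step (standing in for B Q B^T),
    dt: timestep in seconds.
    """
    theta = mu[2]
    # F evaluated with the current heading, as in the matrix above
    F = np.array([
        [1, 0, 0, dt * np.cos(theta), 0],
        [0, 1, 0, dt * np.sin(theta), 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1],
    ])
    mu_pred = F @ mu              # mu_t = F mu_{t-1} (u = 0, so Bu vanishes)
    P_pred = F @ P @ F.T + Q      # uncertainty grows with every prediction
    return mu_pred, P_pred

# Example: start at the origin, heading along x at 2 m/s
mu0 = np.array([0.0, 0.0, 0.0, 2.0, 0.0])
mu1, P1 = predict(mu0, np.eye(5) * 0.1, np.eye(5) * 0.01, dt=0.1)
print(mu1)  # x advances by v * dt = 0.2 m
```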
Update step
In the update step, we receive a measurement z coming from a sensor. We use the
sensor information to correct/update the belief we have about the state. The
measurement is a random variable with covariance R. This is where things get
interesting: we now have two Gaussian variables, the state best estimate μₜ and the
measurement z, and we need to fuse them. Let's first look at the one-dimensional case
with two Gaussians

$$
p(x_1) = \frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}
\qquad
p(x_2) = \frac{1}{\sqrt{2\pi\sigma_2^2}} e^{-\frac{(x-\mu_2)^2}{2\sigma_2^2}}
$$

To fuse both sources of information we multiply the two distributions

$$
p(x_1)\,p(x_2) = \frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}
\cdot \frac{1}{\sqrt{2\pi\sigma_2^2}} e^{-\frac{(x-\mu_2)^2}{2\sigma_2^2}}
$$
We also know a very useful property of Gaussians: the product of Gaussians is
also a Gaussian distribution. Therefore, to know the result of fusing both Gaussians
we have to write the equation above in a Gaussian form.
$$
\begin{aligned}
&= \frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}}
\frac{1}{\sqrt{2\pi\sigma_2^2}} e^{-\frac{(x-\mu_2)^2}{2\sigma_2^2}} \\
&= \frac{1}{2\pi\sigma_1\sigma_2}
e^{-\left(\frac{(x-\mu_1)^2}{2\sigma_1^2} + \frac{(x-\mu_2)^2}{2\sigma_2^2}\right)}
\end{aligned}
$$
Because we know the result will be a Gaussian distribution, we do not care about the
constant factors (e.g. 2πσ²); in fact, we only care about the exponent, which will end up
having the form

$$
\frac{(x - \text{something})^2}{2 \cdot \text{something else}}
$$

where "something" will be the new mean and "something else" will be the new
variance after the multiplication. Therefore we will ignore all the other terms and focus
on the exponent.
$$
\begin{aligned}
\frac{(x-\mu_1)^2}{2\sigma_1^2} + \frac{(x-\mu_2)^2}{2\sigma_2^2}
&= \frac{\sigma_2^2 (x-\mu_1)^2 + \sigma_1^2 (x-\mu_2)^2}{2\sigma_1^2\sigma_2^2} \\
&= \frac{\sigma_2^2 x^2 - 2\sigma_2^2\mu_1 x + \sigma_2^2\mu_1^2
+ \sigma_1^2 x^2 - 2\sigma_1^2\mu_2 x + \sigma_1^2\mu_2^2}{2\sigma_1^2\sigma_2^2} \\
&= \frac{x^2(\sigma_2^2+\sigma_1^2) - 2x(\sigma_2^2\mu_1 + \sigma_1^2\mu_2)}{2\sigma_1^2\sigma_2^2}
+ \frac{\sigma_2^2\mu_1^2 + \sigma_1^2\mu_2^2}{2\sigma_1^2\sigma_2^2} \\
&= \frac{\sigma_2^2+\sigma_1^2}{2\sigma_1^2\sigma_2^2}
\left(x^2 - 2x\,\frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_2^2+\sigma_1^2}\right)
+ \frac{\sigma_2^2\mu_1^2 + \sigma_1^2\mu_2^2}{2\sigma_1^2\sigma_2^2}
\end{aligned}
$$
The term on the right can be ignored because it is constant and comes out of the
exponent. The term in parentheses resembles a perfect square trinomial lacking
the last squared term.
$$
\begin{aligned}
&= \frac{\sigma_2^2+\sigma_1^2}{2\sigma_1^2\sigma_2^2}
\left(x^2 - 2x\,\frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_2^2+\sigma_1^2}\right) \\
&= \frac{\sigma_2^2+\sigma_1^2}{2\sigma_1^2\sigma_2^2}
\left(x^2 - 2x\,\frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_2^2+\sigma_1^2}
+ \left(\frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_2^2+\sigma_1^2}\right)^2
- \left(\frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_2^2+\sigma_1^2}\right)^2\right) \\
&= \frac{\sigma_2^2+\sigma_1^2}{2\sigma_1^2\sigma_2^2}
\left(\left(x - \frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_2^2+\sigma_1^2}\right)^2
- \left(\frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_2^2+\sigma_1^2}\right)^2\right)
\end{aligned}
$$
Ignoring the second term because it is also a constant, the final result of the
exponent value is
$$
\frac{\sigma_2^2+\sigma_1^2}{2\sigma_1^2\sigma_2^2}
\left(x - \frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_1^2+\sigma_2^2}\right)^2
= \frac{\left(x - \frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_1^2+\sigma_2^2}\right)^2}
{\frac{2\sigma_1^2\sigma_2^2}{\sigma_2^2+\sigma_1^2}}
$$
In fact, this final form does resemble a Gaussian distribution: the new mean will be the
term subtracted from x inside the parentheses, and the new variance will be the
denominator divided by 2. To simplify things further along the way, we will rewrite it as
$$
\begin{aligned}
\mu_{new} &= \frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_1^2+\sigma_2^2} \\
&= \mu_1 + \frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2}{\sigma_1^2+\sigma_2^2} - \mu_1 \\
&= \mu_1 + \frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2 - \mu_1(\sigma_1^2+\sigma_2^2)}{\sigma_1^2+\sigma_2^2} \\
&= \mu_1 + \frac{\sigma_2^2\mu_1 + \sigma_1^2\mu_2 - \mu_1\sigma_1^2 - \sigma_2^2\mu_1}{\sigma_1^2+\sigma_2^2} \\
&= \mu_1 + \frac{\sigma_1^2(\mu_2 - \mu_1)}{\sigma_1^2+\sigma_2^2} \\
&= \mu_1 + K(\mu_2 - \mu_1)
\end{aligned}
$$

where K = σ₁² / (σ₁² + σ₂²). For the variance we have
$$
\begin{aligned}
\sigma_{new}^2 &= \frac{\sigma_1^2\sigma_2^2}{\sigma_2^2+\sigma_1^2} \\
&= \sigma_1^2 + \frac{\sigma_1^2\sigma_2^2}{\sigma_2^2+\sigma_1^2} - \sigma_1^2 \\
&= \sigma_1^2 + \frac{\sigma_1^2\sigma_2^2 - \sigma_1^2(\sigma_2^2+\sigma_1^2)}{\sigma_2^2+\sigma_1^2} \\
&= \sigma_1^2 + \frac{\sigma_1^2\sigma_2^2 - \sigma_1^2\sigma_2^2 - \sigma_1^4}{\sigma_2^2+\sigma_1^2} \\
&= \sigma_1^2 - \frac{\sigma_1^4}{\sigma_2^2+\sigma_1^2} \\
&= \sigma_1^2 - K\sigma_1^2
\end{aligned}
$$
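As a quick sanity check of the scalar result, here is a small sketch (illustrative numbers, not from the article) that fuses two 1D Gaussian estimates using the formulas just derived.

```python
def fuse_1d(mu1, var1, mu2, var2):
    """Fuse two scalar Gaussian estimates.

    mu_new = mu1 + K * (mu2 - mu1) and var_new = (1 - K) * var1,
    with K = var1 / (var1 + var2).
    """
    K = var1 / (var1 + var2)
    return mu1 + K * (mu2 - mu1), (1 - K) * var1

# Prediction says 10.0 m with variance 4.0; the sensor says 12.0 m with variance 1.0
mu, var = fuse_1d(10.0, 4.0, 12.0, 1.0)
print(mu, var)  # ~11.6, 0.8 -- pulled towards the more certain estimate
```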
Now we need to transform this into matrix notation and swap in the correct
variables. μ and z are not in the same vector space, so to map the state into the
measurement space we use the matrix H. Writing the fusion above in the measurement
space gives

$$
\begin{aligned}
K' &= H P_{t-1} H^T \left(H P_{t-1} H^T + R\right)^{-1} \\
H P_t H^T &= H P_{t-1} H^T - K' H P_{t-1} H^T \\
H P_t H^T &= H P_{t-1} H^T - H K H P_{t-1} H^T
\end{aligned}
$$

where K′ = HK. Cancelling the leading H and the trailing Hᵀ from each term gives the
final update equations.
And that is it! These are all the equations for a Linear Kalman Filter.
Prediction step:

$$
\begin{aligned}
\mu_t &= F \mu_{t-1} + Bu \\
P_t &= F P_{t-1} F^T + B Q B^T
\end{aligned}
$$

Update step:

$$
\begin{aligned}
K &= P H^T \left(H P H^T + R\right)^{-1} \\
\mu_t &= \mu_{t-1} + K\left(z - H \mu_{t-1}\right) \\
P_t &= \left(I - KH\right) P_{t-1}
\end{aligned}
$$
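A matching sketch of the update step (again illustrative, not the article's own code): the GPS-style H observing only x and y and the noise level R are assumptions.

```python
import numpy as np

def update(mu, P, z, H, R):
    """Linear KF update step.

    mu: (n,) predicted state mean, P: (n, n) predicted covariance,
    z:  (m,) measurement, H: (m, n) state-to-measurement matrix,
    R:  (m, m) measurement noise covariance.
    """
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    mu_new = mu + K @ (z - H @ mu)          # correct the mean with the residual
    P_new = (np.eye(len(mu)) - K @ H) @ P   # certainty increases after the update
    return mu_new, P_new

# A GPS-like sensor observing only (x, y) out of [x, y, theta, v, delta]
H = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0, 0.0]])
R = np.eye(2) * 0.5
```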
Extended Kalman Filter
To make the prediction more robust we can now bring in the kinematic bicycle model,
which is non-linear:

$$
\begin{aligned}
\dot{x} &= v \cos(\theta + \beta) \\
\dot{y} &= v \sin(\theta + \beta) \\
\dot{\theta} &= \frac{v \cos(\beta) \tan(\delta)}{L} \\
\beta &= \tan^{-1}\left(\frac{l_r \tan\delta}{L}\right)
\end{aligned}
$$
where θ is the heading of the vehicle (yaw), β is the slip angle at the centre of gravity,
L is the length of the vehicle, and l_r is the distance from the rearmost part of the
vehicle to the centre of gravity. Discretising these equations with timestep Δt gives
$$
\begin{aligned}
x_t &= x_{t-1} + \Delta t \cdot v \cos(\theta + \beta) \\
y_t &= y_{t-1} + \Delta t \cdot v \sin(\theta + \beta) \\
\theta_t &= \theta_{t-1} + \Delta t \cdot \frac{v \cos(\beta) \tan(\delta)}{L} \\
\beta_t &= \tan^{-1}\left(\frac{l_r \tan\delta_{t-1}}{L}\right) \\
v_t &= v_{t-1} \\
\delta_t &= \delta_{t-1}
\end{aligned}
$$
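As a sketch of how this discrete model could look in code (an illustrative assumption: it keeps the original 5-dimensional state and computes β internally, and the vehicle lengths are example values).

```python
import numpy as np

def bicycle_step(state, dt, L=2.7, lr=1.35):
    """One discrete step of the kinematic bicycle model.

    state = [x, y, theta, v, delta]; L is the vehicle length and lr the
    distance from the rear to the centre of gravity (example values).
    """
    x, y, theta, v, delta = state
    beta = np.arctan(lr * np.tan(delta) / L)  # slip angle at the centre of gravity
    return np.array([
        x + dt * v * np.cos(theta + beta),
        y + dt * v * np.sin(theta + beta),
        theta + dt * v * np.cos(beta) * np.tan(delta) / L,
        v,       # velocity assumed constant between steps
        delta,   # steering assumed constant between steps
    ])
```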
If you define that system of equations as f(x, y, θ, v, δ) ∈ ℝ⁶, then we can model the
whole system using f and its Jacobian F = ∂fⱼ/∂xᵢ. We can also use the same trick for
the transformation from the state space s into the measurement vector space z.
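The Jacobian can be written out by hand, but for a quick sketch a finite-difference approximation is enough; this snippet assumes the hypothetical bicycle_step function from the previous sketch.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the Jacobian of f at x with central differences."""
    fx = f(x)
    J = np.zeros((len(fx), len(x)))
    for i in range(len(x)):
        dx = np.zeros(len(x))
        dx[i] = eps
        J[:, i] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

# F for the EKF prediction, linearised around the current state estimate
state = np.array([0.0, 0.0, 0.1, 2.0, 0.05])
F = numerical_jacobian(lambda s: bicycle_step(s, dt=0.1), state)
```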
We can also add non-linearities in the measurement. Where before we used the matrix H,
now we can use a function h(⋅) and define H as the Jacobian H = ∂hᵢ/∂xᵢ. The final
Extended Kalman Filter is
Prediction step:

$$
\begin{aligned}
\mu_t &= f(\mu_{t-1}) + Bu \\
P_t &= F P_{t-1} F^T + B Q B^T
\end{aligned}
$$

Update step:

$$
\begin{aligned}
K &= P_{t-1} H^T \left(H P_{t-1} H^T + R\right)^{-1} \\
\mu_t &= \mu_{t-1} + K\left(z - h(\mu_{t-1})\right) \\
P_t &= \left(I - KH\right) P_{t-1}
\end{aligned}
$$
[Figure: left image taken from "Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video".]
Therefore, modelling the error of the state (i.e. the error-state) is more likely to be
modelled correctly by a linear model. In other words, we can avoid some of the noise
that comes from trying to model highly non-linear behaviour by modelling the
error-state instead. Let's define the error-state as eₜ = μₜ − μₜ₋₁. We can approximate
f(μₜ₋₁) using the Taylor series expansion, keeping only the first derivative, so
f(μₜ₋₁) ≈ μₜ₋₁ + F eₜ₋₁. Replacing this and rearranging the equations, we end up with
the final equations for the Error State - Extended Kalman Filter (ES-EKF).
Prediction step:

$$
\begin{aligned}
s_t &= f(s_{t-1}, u) \\
P_t &= F P_{t-1} F^T + B Q B^T
\end{aligned}
$$

Update step:

$$
\begin{aligned}
K &= P H^T \left(H P H^T + R\right)^{-1} \\
e_t &= K\left(z - h(\mu_{t-1})\right) \\
s_t &= s_{t-1} + e_t \\
P_t &= \left(I - KH\right) P_{t-1}
\end{aligned}
$$
Keep in mind that now we are tracking the error-state and the covariance of the error,
so we need to predict the state sₜ and then correct it using the error-state during the
update step; otherwise, we could simply estimate the state directly using f(⋅) as in
the prediction step.
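Putting the pieces together, here is a minimal sketch of one ES-EKF iteration following the equations and the remark above (the predicted state is corrected by the error-state). It reuses the hypothetical bicycle_step and numerical_jacobian helpers from the earlier sketches, and the measurement model and noise values are illustrative assumptions.

```python
import numpy as np

def es_ekf_step(s, P, z, dt, Q, R):
    """One predict + update iteration of the error-state EKF sketched above."""
    # Predict: propagate the state through the non-linear model
    f = lambda state: bicycle_step(state, dt)
    s_pred = f(s)
    F = numerical_jacobian(f, s)               # linearisation for the covariance
    P_pred = F @ P @ F.T + Q

    # Update: GPS-style measurement of (x, y)
    H = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0, 0.0]])
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    e = K @ (z - H @ s_pred)                   # error-state from the residual
    s_new = s_pred + e                         # inject the error into the predicted state
    P_new = (np.eye(len(s)) - K @ H) @ P_pred
    return s_new, P_new
```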
(if you see I have made a mistake, don’t hesitate to tell me).
Comments

Sharief Saleh (3 months ago): There is virtually no difference between the EKF and ES-EKF implementations that you have presented... Both estimate the state using a non-linear function. Then both transition the covariance matrix P with the same linearized F matrix. After that, both compute the same error state "K(z-h(x))". Then you added that error to the navigation state. In the EKF implementation you did that step in the same line, while in the ES-EKF you did it in two separate lines. For me, both are the same... Would you please show how they are different?

YoK (in reply to Sharief Saleh, a month ago): I think that the F matrix is actually not the same. For the traditional EKF, F is computed by the first-order partial derivative with respect to the states themselves. But for the ES-EKF, F is computed with respect to the "error state". As a result, propagation of the covariance becomes more accurate and stable when using the error-state form.

Eric Huang (2 years ago): Great article, but could you explain more why the error-state EKF defines the error-state as e = μ(t) − μ(t−1)? Some articles define it as nominal state − true state.

Mike Woodcock (in reply to Eric Huang, 12 days ago): The error state is already subtracting some non-linearity thanks to subtracting the prediction, so in several cases the ES has been shown to lower the degree of non-linear behaviour.