Kalman Filter
Henzeh Lee†
Korea Advanced Institute of Science and Technology, Daejeon, Korea, 305-701.
March 2007
1 Mathematical Background
1.1 Linear System
Every linear system that is causal¹ and relaxed² at t0 can be described by

    y(t) = ∫_{t0}^{t} g(t, τ) u(τ) dτ    (1)
where g is called the impulse response, and such an integral is called a convolution. Let ŷ(s) be the Laplace transform of y(t), that is

    ŷ(s) = ∫_{t0}^{∞} y(t) e^{−st} dt    (2)

or, applying the transform to the convolution in Eq.(1),

    ŷ(s) = ĝ(s) û(s)

where

    ĝ(s) = ∫_{0}^{∞} g(t) e^{−st} dt    (5)

is called the transfer function. Thus, the transfer function is the Laplace transform of the impulse response.
∗ By Hyunjae Lee (since 2007, named Henzeh K. Lee) - 1st Ed., March 2007.
† Post-Doctoral Researcher, Department of Aerospace Engineering, hjlee@fdcl.kaist.ac.kr
1 If a system is causal, the output will not appear before the input is applied.
2 A system is said to be relaxed at t0 if its initial state at t0 is zero.
Matrix Exponential
The most important function of A is the exponential function e^{At}. Because the Taylor series

    e^{at} = 1 + at + a²t²/2! + ⋯ + aⁿtⁿ/n! + ⋯    (6)

converges for all finite a and t, we have

    e^{At} = I + At + A²t²/2! + ⋯ + Aⁿtⁿ/n! + ⋯ = Σ_{k=0}^{∞} (tᵏ/k!) Aᵏ    (7)
    e⁰ = I    (8)
    e^{A(t1+t2)} = e^{At1} e^{At2}    (9)
    (e^{At})⁻¹ = e^{−At}    (10)
    d/dt e^{At} = A e^{At} = e^{At} A    (11)
Using Eq.(7), these properties can be easily proven (homework).
Taking the Laplace transform of Eq.(7) yields

    L(e^{At}) = (1/s) Σ_{k=0}^{∞} (A/s)ᵏ    (12)

Summing the series, we can obtain

    (1/s) Σ_{k=0}^{∞} (A/s)ᵏ = (1/s) (I − A/s)⁻¹ = (sI − A)⁻¹    (14)
Therefore, we can compute the matrix exponential using the Laplace transformation:

    e^{At} = L⁻¹[(sI − A)⁻¹]    (15)
The approximation of the matrix exponential by a truncated power-series expansion as in Eq.(7) is not a recommended method in general, although it is useful when the eigenvalues of At are small enough for the truncation to be accurate. As an alternative, the "scaling and squaring method" combined with a Padé approximation [2] is the recommended method when the result is hard to obtain by the Laplace transformation method in Eq.(15).
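As an illustration of Eq.(7), the sketch below (plain NumPy; the matrix is chosen for illustration, not taken from the text) sums the truncated power series and checks it against a case where the series terminates exactly:

```python
import numpy as np

def expm_taylor(A, t, terms=25):
    """Approximate e^{At} by the truncated power series of Eq.(7)."""
    n = A.shape[0]
    result = np.eye(n)
    term = np.eye(n)
    for k in range(1, terms):
        term = term @ (A * t) / k      # accumulates A^k t^k / k!
        result = result + term
    return result

# For the nilpotent matrix A below, A^2 = 0, so the series terminates:
# e^{At} = I + At exactly, and truncation introduces no error.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
t = 2.0
print(np.allclose(expm_taylor(A, t), np.eye(2) + A * t))  # True
```

For general matrices, a library routine such as `scipy.linalg.expm`, which implements the scaling-and-squaring/Padé approach cited above, is preferable to the raw series.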
To obtain the solution of the LTI state equation ẋ(t) = Ax(t) + Bu(t), multiplying e^{−At} on both sides yields

    e^{−At} ẋ(t) − e^{−At} A x(t) = e^{−At} B u(t)

which implies

    d/dt [e^{−At} x(t)] = e^{−At} B u(t)    (19)

Integrating from 0 to t gives

    e^{−At} x(t) − e⁰ x(0) = ∫_{0}^{t} e^{−Aτ} B u(τ) dτ    (20)
Note that the fundamental solution matrix transforms any initial state x(0) to the corresponding state x(t). Let us assume that Φ is not singular; then the product Φ⁻¹(t)x(t) = x(0) and Φ(τ)Φ⁻¹(t)x(t) = x(τ). The matrix product Φ(τ)Φ⁻¹(t) is the state transition matrix from time t to time τ.
    x_{k+1} = x_k + α1 k1 + ⋯ + αn kn    (44)

where ∆t is the integration time interval, and the coefficients a_{ji}, c_i, and α_i can be chosen by matching the deterministic case. A feasible choice⁴ of the covariance matrix for the noise generation⁵ in stochastic numerical simulations, as a function of the integration method, is given by

    Q_s = β Q / ∆t    (47)

where

    β = 1 / Σ_{i=1}^{n} α_i²

In the case of the fourth-order Runge-Kutta algorithm, the scaling constant is β = 3.6, due to the constants α1 = 1/6, α2 = α3 = 2/6, and α4 = 1/6, which match the deterministic case.
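A quick check of the scaling constant, a pure-Python sketch of Eq.(47) with the RK4 weights stated above (the scalar noise intensity Q and step ∆t are illustrative values):

```python
# Scaling constant beta = 1 / sum(alpha_i^2) for the noise covariance
# Q_s = beta * Q / dt of Eq.(47), using the classical RK4 weights.
alphas = [1 / 6, 2 / 6, 2 / 6, 1 / 6]
beta = 1.0 / sum(a * a for a in alphas)

dt = 0.01
Q = 1.0               # illustrative scalar process-noise intensity
Qs = beta * Q / dt    # covariance to use when drawing noise each RK4 step
print(beta)           # 3.6 up to floating-point rounding
```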
Using the two equations above, the deviation of the state from its mean is given by

    x(t) − E⟨x(t)⟩ = Φ(t, t0)(x(t0) − E⟨x0⟩) + ∫_{t0}^{t} Φ(t, τ) G(τ) w(τ) dτ    (53)

Therefore, using a stochastic property, that is, the uncorrelated process⁷, the covariance matrix P(t) defined in the previous section can be easily obtained as

    P(t) = Φ(t, t0) E⟨[x(t0) − E⟨x(t0)⟩][x(t0) − E⟨x(t0)⟩]^T⟩ Φ^T(t, t0)    (the expectation is P(t0))
         + Φ(t, t0) ∫_{t0}^{t} E⟨[x(t0) − E⟨x(t0)⟩] w^T(τ)⟩ G^T(τ) Φ^T(t, τ) dτ    (= 0)
         + ∫_{t0}^{t} Φ(t, τ) G(τ) E⟨w(τ)[x(t0) − E⟨x(t0)⟩]^T⟩ dτ Φ^T(t, t0)    (= 0)
         + ∫_{t0}^{t} ∫_{t0}^{t} Φ(t, τ) G(τ) E⟨w(τ) w^T(σ)⟩ G^T(σ) Φ^T(t, σ) dτ dσ    (E⟨w(τ)w^T(σ)⟩ = Qδ(τ − σ))
         = Φ(t, t0) P(t0) Φ^T(t, t0) + ∫_{t0}^{t} Φ(t, τ) G(τ) Q(τ) G^T(τ) Φ^T(t, τ) dτ    (54)
[Figure: Position (m) versus Time (sec), 0-400 s, illustrating the growth of the error covariance P(t) = ∫_{t0}^{t} G(τ) Q(τ) G^T(τ) dτ.]
To describe the state propagation between samples, let t0 = k∆t, t = (k + 1)∆t for an integer k.
Defining the sampled state vector as xk = x(k∆t), we can write
    x_{k+1} = e^{A∆t} x_k + ∫_{k∆t}^{(k+1)∆t} e^{A[(k+1)∆t−τ]} B u(τ) dτ
             + ∫_{k∆t}^{(k+1)∆t} e^{A[(k+1)∆t−τ]} G w(τ) dτ    (60)
Assume that the control input u(t) is a constant value over the integration interval. Let us denote
the control input as uk = u(k∆t). The third term is a smoothed version of the continuous white
process noise weighted by the state transition matrix and G. Here we can define
    w_k = ∫_{k∆t}^{(k+1)∆t} e^{A[(k+1)∆t−τ]} G w(τ) dτ    (61)
By the assumption and definition, Eq.(60) becomes
    x_{k+1} = e^{A∆t} x_k + [∫_{k∆t}^{(k+1)∆t} e^{A[(k+1)∆t−τ]} B dτ] u_k + w_k    (62)
This is the discrete version of Eq.(59), which we can write as [3]
xk+1 = Ak xk + Bk uk + wk (63)
where

    A_k = e^{A∆t},    B_k = ∫_{0}^{∆t} e^{Aτ} B dτ
To find the covariance Q_k of the new noise sequence w_k in terms of Q, let us define an expectation⁸ described as

    Q_k = E⟨w_k w_k^T⟩
        = ∫_{k∆t}^{(k+1)∆t} ∫_{k∆t}^{(k+1)∆t} e^{A[(k+1)∆t−τ]} G E⟨w(τ) w^T(σ)⟩ G^T e^{A^T[(k+1)∆t−σ]} dτ dσ    (64)
Note that the random noise is uncorrelated, that is, E⟨w(τ)w^T(σ)⟩ = Qδ(τ − σ). The covariance is then given by [3]

    Q_k = ∫_{0}^{∆t} e^{Aτ} G Q G^T e^{A^T τ} dτ    (65)
It is worthwhile to write down the first few terms of the infinite series expansions such that

    A_k = I + A∆t + (1/2) A²∆t² + ⋯    (66)
    B_k = B∆t + (1/2) AB∆t² + ⋯    (67)
    Q_k = GQG^T ∆t + (1/2)(A GQG^T + GQG^T A^T) ∆t² + ⋯    (68)
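The series expansions (66)-(68) can be checked numerically. The sketch below (NumPy; the double-integrator matrices and unit noise intensity are assumed for illustration) compares the series Q_k with the exact integral of Eq.(65), which has a closed form for this nilpotent A:

```python
import numpy as np

def discretize_series(A, B, G, Q, dt):
    """Second-order truncations of Eqs.(66)-(68)."""
    n = A.shape[0]
    Ak = np.eye(n) + A * dt + 0.5 * (A @ A) * dt**2        # Eq.(66)
    Bk = B * dt + 0.5 * (A @ B) * dt**2                    # Eq.(67)
    GQG = G @ Q @ G.T
    Qk = GQG * dt + 0.5 * (A @ GQG + GQG @ A.T) * dt**2    # Eq.(68)
    return Ak, Bk, Qk

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (illustrative)
B = np.array([[0.0], [1.0]])
G = np.eye(2)
Q = np.diag([0.0, 1.0])
dt = 0.1
Ak, Bk, Qk = discretize_series(A, B, G, Q, dt)

# Exact Qk from Eq.(65): with e^{A tau} = [[1, tau], [0, 1]] the integrand is
# [[tau^2, tau], [tau, 1]], so Qk = [[dt^3/3, dt^2/2], [dt^2/2, dt]].
Qk_exact = np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
print(np.max(np.abs(Qk - Qk_exact)))  # dt^3/3, the neglected O(dt^3) term
```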
For a linear time-varying system, the discrete system and noise covariance matrices are given analogously by A_k = Φ(t_{k+1}, t_k) and, from Eq.(54), Q_k = ∫_{t_k}^{t_{k+1}} Φ(t_{k+1}, τ) G(τ) Q(τ) G^T(τ) Φ^T(t_{k+1}, τ) dτ.
1.3 Problems
Let us consider a stochastic system given by
    ẋ = Ax + Bu + w,    E⟨w⟩ = 0,    E⟨w(t1) w(t2)^T⟩ = Qδ(t1 − t2)
where

    A = [[0, −1], [1, −2]],    B = [[0], [1]],    Q = [[0, 0], [0, 1]]
1. Find the transition matrix using the power series and the Laplace transformation method, respectively.
2. Find the discrete version of the system (A_k, B_k) and the covariance Q_k using the power series and the transition matrix, respectively.
3. Plot x(t) and P(t) with x(0) = [5, 5]^T, P(0) = 50I, and no input. For this simulation, use ∆t = 0.01 and ∆t = 0.1, respectively. (You must use the Itô integration method for this stochastic simulation [5].)
4. Using Matlab, find the transition matrix of

    [[0, a3, −a2, 1], [−a3, 0, a1, 0], [a2, −a1, 0, 0], [0, 0, 0, 0]]
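For Problem 1, the two methods can be cross-checked numerically. In the sketch below (NumPy; the time value is arbitrary), the closed form follows from the Laplace route: (sI − A)⁻¹ has a double pole at s = −1 and (A + I)² = 0, so e^{At} = e^{−t}(I + (A + I)t):

```python
import math
import numpy as np

def expm_taylor(A, t, terms=60):
    """Truncated power series of Eq.(7)."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ (A * t) / k
        out = out + term
    return out

A = np.array([[0.0, -1.0], [1.0, -2.0]])   # from the problem statement
t = 1.5
# Laplace route: A has a repeated eigenvalue at -1 and (A + I)^2 = 0,
# giving the closed form below.
Phi_laplace = math.exp(-t) * (np.eye(2) + (A + np.eye(2)) * t)
Phi_series = expm_taylor(A, t)
print(np.allclose(Phi_series, Phi_laplace))  # True
```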
2 Kalman Filter
2.1 Orthogonality Principle
The inner product of two vectors can be defined as the correlation of the two variables such that

    ⟨x, y⟩ ≜ E⟨x^T y⟩

Then x̂ ∈ Y minimizes ⟨x − α, x − α⟩ over α ∈ Y if and only if

    ⟨x − x̂, α⟩ = 0    (78)

for all α ∈ Y.
Proof) Let us consider the sufficiency. By the definition, the cost function is given by

    ⟨x − α, x − α⟩ = ⟨x − x̂ + x̂ − α, x − x̂ + x̂ − α⟩
                   = ⟨x − x̂, x − x̂⟩ + 2⟨x − x̂, x̂ − α⟩ + ⟨x̂ − α, x̂ − α⟩    (79)

Since x̂ − α ∈ Y, the middle term vanishes by Eq.(78), so the cost is smallest when α = x̂.
Next, to prove the necessity condition, we can apply proof by contradiction. Suppose that x̂ is the minimum and there exists α1 ∈ Y such that

    ⟨x − x̂, α1⟩ = β ≠ 0    (82)

For a scalar λ,
    x̂_{k+1} = [((k + 1) − 1)/(k + 1)] x̂_k + [1/(k + 1)] z_{k+1}
             = x̂_k + [1/(k + 1)] (z_{k+1} − x̂_k)    (95)
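Eq.(95) is the recursive form of the sample mean: each measurement corrects the running estimate through a gain 1/(k + 1) on the innovation. A pure-Python sketch (the measurement values are illustrative):

```python
def recursive_mean(measurements):
    """Recursive sample mean, Eq.(95): xhat <- xhat + (z - xhat)/(k + 1)."""
    xhat = 0.0
    for k, z in enumerate(measurements):
        xhat = xhat + (z - xhat) / (k + 1)
    return xhat

zs = [2.0, 4.0, 6.0, 8.0]
print(recursive_mean(zs))   # 5.0
print(sum(zs) / len(zs))    # 5.0, the batch mean -- the two agree
```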
The optimal linear estimate is equivalent to the general optimal estimator if the variables x and z are jointly Gaussian. Therefore, it suffices to seek an updated estimate x̂⁺_k based on the observation z_k; that is, the estimate is a linear function of the a priori estimate and the measurement:

    x̂⁺_k = K̄_k x̂⁻_k + K_k z_k    (96)

where x̂⁻_k = Φ_{k−1} x̂⁺_{k−1} is the a priori estimate of x_k and x̂⁺_k is the a posteriori value of the estimate.
Optimization Problem
The matrices K_k, K̄_k are unknown, and we must find their optimal values. The orthogonality principle in the previous section can be written in the form

    E⟨[x_k − x̂⁺_k][z_i]^T⟩ = 0,  i = 1, 2, ⋯, k − 1    (97)
    E⟨[x_k − x̂⁺_k][z_k]^T⟩ = 0    (98)
Using the discrete-time plant model and measurement model in Eqs.(86),(87), we can obtain the following relation:

Figure 3: Geometric interpretation of the optimal state estimation based on measurement vectors.
Using the properties of the noise process¹¹, we can rewrite the above equation:
We also know that Eqs.(97) and (98) hold at the previous step, that is,

    0 = E⟨[x_{k−1} − x̂⁺_{k−1}][z_i]^T⟩,  i = 1, 2, ⋯, k − 1
      = Φ_{k−1} E⟨[x_{k−1} − x̂⁺_{k−1}][z_i]^T⟩
      = E⟨[Φ_{k−1} x_{k−1} − Φ_{k−1} x̂⁺_{k−1}][z_i]^T⟩ = E⟨(x_k − x̂⁻_k) z_i^T⟩    (101)

and¹²

    E⟨v_k z_i^T⟩ = 0,  i = 1, 2, ⋯, k − 1    (102)
Clearly, the orthogonality condition in Eq.(97) can be satisfied for any given x_k if

    K̄_k = I − K_k H_k    (106)

so that

    x̂⁺_k = (I − K_k H_k) x̂⁻_k + K_k z_k
         = x̂⁻_k + K_k (z_k − H_k x̂⁻_k)    (107)
The choice of K_k is such that Eq.(98) is satisfied. Let us define some useful errors:

    x̃⁺_k ≡ x̂⁺_k − x_k    (108)
    x̃⁻_k ≡ x̂⁻_k − x_k    (109)
    z̃_k ≡ ẑ⁻_k − z_k = H_k x̂⁻_k − z_k    (110)
Vectors x̃⁺_k and x̃⁻_k are the estimation errors after and before updates, respectively. The error x̃⁻_k depends on x_k, which depends on z_k. Therefore, from Eq.(98), it is obvious that

    E⟨(x_k − x̂⁺_k)(ẑ⁻_k)^T⟩ = 0    (111)
12 E⟨v_k z_i^T⟩ = E⟨v_k (H_i x_i + v_i)^T⟩ = E⟨v_k v_i^T⟩ + E⟨v_k x_i^T⟩ H_i^T = 0, since both expectations vanish for i < k.
    E⟨(x_k − x̂⁺_k) z̃_k^T⟩
      = E⟨[−(x̂⁻_k − x_k) + K_k H_k (x̂⁻_k − x_k) − K_k v_k][H_k (x̂⁻_k − x_k) − v_k]^T⟩
      = E⟨[−x̃⁻_k + K_k H_k x̃⁻_k − K_k v_k][H_k x̃⁻_k − v_k]^T⟩
      = E⟨−x̃⁻_k (x̃⁻_k)^T H_k^T + K_k H_k x̃⁻_k (x̃⁻_k)^T H_k^T + K_k v_k v_k^T⟩
      = −E⟨x̃⁻_k (x̃⁻_k)^T⟩ H_k^T + K_k H_k E⟨x̃⁻_k (x̃⁻_k)^T⟩ H_k^T + K_k E⟨v_k v_k^T⟩    (114)

where E⟨v_k v_k^T⟩ = R_k. Setting Eq.(114) to zero, as Eq.(98) requires, and writing P⁻_k = E⟨x̃⁻_k (x̃⁻_k)^T⟩ yields the Kalman gain

    K_k = P⁻_k H_k^T (H_k P⁻_k H_k^T + R_k)⁻¹    (118)

Substituting the update Eq.(107) into the definition of P⁺_k gives the covariance update

    P⁺_k = (I − K_k H_k) P⁻_k (I − K_k H_k)^T + K_k R_k K_k^T    (120)
The above form is the so-called "Joseph form" of the covariance update equation, derived by P. D. Joseph. There are many other forms for K_k and P⁺_k that might not be as useful for robust computation: the covariance matrix must remain symmetric, but symmetry can be lost during numerical calculation. The Joseph form is one of the robust algorithms in this respect.
An alternative is derived by substituting for K_k from Eq.(118) such that

    P⁺_k = P⁻_k − K_k H_k P⁻_k − P⁻_k H_k^T K_k^T + K_k H_k P⁻_k H_k^T K_k^T + K_k R_k K_k^T
         = (I − K_k H_k) P⁻_k − P⁻_k H_k^T K_k^T + K_k (H_k P⁻_k H_k^T + R_k) K_k^T
         = (I − K_k H_k) P⁻_k    (121)

where the last two terms cancel because K_k (H_k P⁻_k H_k^T + R_k) = P⁻_k H_k^T by Eq.(118).
This form is the one most often used in computation. The a priori error covariance matrix is derived by using the state error equation given by

    x̃⁻_k = x̂⁻_k − x_k = (Φ_{k−1} x̂⁺_{k−1}) − (Φ_{k−1} x_{k−1} + w_{k−1})
         = Φ_{k−1} (x̂⁺_{k−1} − x_{k−1}) − w_{k−1}
         = Φ_{k−1} x̃⁺_{k−1} − w_{k−1}    (122)

By definition of the a priori error covariance matrix, we can obtain a compact form:

    P⁻_k = Φ_{k−1} P⁺_{k−1} Φ_{k−1}^T + Q_{k−1}

which gives the a priori value of the covariance matrix of estimation uncertainty as a function of the previous a posteriori value.
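A numerical illustration of the two update forms, Eqs.(120) and (121) (NumPy; the covariance, measurement model, and noise values are assumed for illustration only):

```python
import numpy as np

def joseph_update(P, H, R, K):
    """Covariance measurement update in Joseph form, Eq.(120)."""
    IKH = np.eye(P.shape[0]) - K @ H
    return IKH @ P @ IKH.T + K @ R @ K.T

def short_update(P, H, R, K):
    """Short-form covariance update, Eq.(121)."""
    return (np.eye(P.shape[0]) - K @ H) @ P

P = np.array([[2.0, 0.5], [0.5, 1.0]])         # a priori covariance (assumed)
H = np.array([[1.0, 0.0]])                     # measure the first state
R = np.array([[0.25]])
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # optimal gain, Eq.(118)

# With the optimal gain the two forms agree ...
print(np.allclose(joseph_update(P, H, R, K), short_update(P, H, R, K)))  # True
# ... but the Joseph form stays symmetric even for a perturbed gain.
PJ = joseph_update(P, H, R, K + 1e-3)
print(np.allclose(PJ, PJ.T))  # True
```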
Summary
The equations derived in the previous section are summarized. In this formulation of the filter equations, G has been combined with the plant covariance. The basic steps of the computational procedure for the discrete-time Kalman filter are as follows:
1. Compute P⁻_k using P⁺_{k−1}, Φ_{k−1}, and Q_{k−1}
2. Compute K_k using P⁻_k, H_k, and R_k
3. Compute P⁺_k using K_k and P⁻_k
4. Compute successive values of x̂⁺_k recursively using the computed values of K_k, the given initial estimate, and the input data z_k
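The four steps can be sketched for a scalar system (pure Python; the random-walk model Φ = 1, H = 1 and the noise levels are assumed for illustration, not taken from the text):

```python
def kalman_filter(zs, Phi, H, Q, R, xhat0, P0):
    """Scalar discrete Kalman filter following steps 1-4 above."""
    xhat, P = xhat0, P0
    estimates = []
    for z in zs:
        # 1. A priori covariance: P-_k = Phi P+_{k-1} Phi + Q
        P_minus = Phi * P * Phi + Q
        xhat_minus = Phi * xhat
        # 2. Gain: K_k = P-_k H (H P-_k H + R)^{-1}
        K = P_minus * H / (H * P_minus * H + R)
        # 3. A posteriori covariance: P+_k = (1 - K H) P-_k
        P = (1.0 - K * H) * P_minus
        # 4. State update: xhat+_k = xhat-_k + K (z_k - H xhat-_k)
        xhat = xhat_minus + K * (z - H * xhat_minus)
        estimates.append(xhat)
    return estimates, P

zs = [1.2, 0.8, 1.1, 0.9, 1.0]   # noisy measurements of a constant near 1
est, P_final = kalman_filter(zs, Phi=1.0, H=1.0, Q=0.01, R=0.5,
                             xhat0=0.0, P0=10.0)
print(est[-1], P_final)   # estimate near 1, covariance far below P0
```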
    E⟨v_k v_j^T⟩ = R_k ∆(k − j)    (131)
Since ∆(k) has a value of one at k = 0, in the discrete case the covariance is equal to R_k, a finite matrix. On the other hand, the covariance of the continuous noise is given by

    R δ(t)    (132)

For continuous white noise, the covariance is unbounded since δ(t) is unbounded at t = 0. To make R_k and R consistent, the following relationship holds:

    R_k = R / ∆t    (133)

where ∆t is the time interval [t_{k−1}, t_k]. This is the relation required between R and R_k so that v_k and v(t) have the same spectral densities.
As shown in the previous section, the following relationships hold:

    Φ_k = e^{A∆t} = I + A∆t + O(∆t²)    (134)

where O(∆t²) consists of terms with powers of ∆t greater than or equal to two, and from Eq.(68) the process noise is approximately given by

    Q_k ≈ GQG^T ∆t    (135)

By substituting the above equations, the discrete covariance matrices and gain can be written as

    P⁻_k = (I + A∆t) P⁺_{k−1} (I + A∆t)^T + GQG^T ∆t    (136)
    K_k = P⁻_k H^T (H P⁻_k H^T + R/∆t)⁻¹    (137)
    P⁺_k = (I − K_k H) P⁻_k    (138)
Let us first examine the behavior of the Kalman gain as ∆t approaches zero. From Eq.(137), we have

    (1/∆t) K_k = P⁻_k H^T (H P⁻_k H^T ∆t + R)⁻¹    (139)

so that

    lim_{∆t→0} (1/∆t) K_k = P⁻_k H^T R⁻¹    (140)

where R is positive definite, so the right-hand side of the above equation is finite. This implies that

    lim_{∆t→0} K_k = 0    (141)
The discrete Kalman gain tends to zero as the sampling interval goes to zero. Considering the covariance matrix in Eq.(136), we have

    P⁻_k = P⁺_{k−1} + (A P⁺_{k−1} + P⁺_{k−1} A^T + GQG^T) ∆t + O(∆t²)    (142)

where O(∆t²) denotes terms of order ∆t². Substituting Eq.(138) into this equation yields
    P⁻_k = (I − K_{k−1} H) P⁻_{k−1}
           + [A (I − K_{k−1} H) P_{k−1} + (I − K_{k−1} H) P_{k−1} A^T + GQG^T] ∆t + O(∆t²)
         = P⁻_{k−1} − K_{k−1} H P⁻_{k−1}
           + [A P_{k−1} + P_{k−1} A^T + GQG^T − A K_{k−1} H P_{k−1} − K_{k−1} H P_{k−1} A^T] ∆t
           + O(∆t²)    (143)

so that

    (P⁻_k − P⁻_{k−1})/∆t = −(K_{k−1}/∆t) H P⁻_{k−1}
           + A P_{k−1} + P_{k−1} A^T + GQG^T − A K_{k−1} H P_{k−1} − K_{k−1} H P_{k−1} A^T
           + O(∆t)    (144)
Letting ∆t tend to zero and using Eqs.(140),(141), the continuous-time covariance matrix differential equation is derived as

    lim_{∆t→0} (P⁻_k − P⁻_{k−1})/∆t = A P + P A^T + GQG^T − P H^T R⁻¹ H P

that is,

    Ṗ(t) = A P(t) + P(t) A^T + G Q G^T − P(t) H^T R⁻¹ H P(t)    (145)

where lim_{∆t→0} P_{k−1} = P(t); the gain term follows from Eq.(140), and the terms A K_{k−1} H P_{k−1}, K_{k−1} H P_{k−1} A^T, and O(∆t) vanish by Eq.(141). This is the continuous-time Riccati equation for propagation of the error covariance.
In a similar manner, the state vector update equation can be derived such that

    x̂⁺_k = (I + A∆t) x̂⁺_{k−1} + K_k [z_k − H (I + A∆t) x̂⁺_{k−1}]
         = x̂⁺_{k−1} + A∆t x̂⁺_{k−1} + K_k (z_k − H x̂⁺_{k−1} − H A x̂⁺_{k−1} ∆t)    (146)

Dividing by ∆t and letting ∆t → 0 with Eq.(140) gives

    ẋ̂(t) = A(t) x̂(t) + P(t) H^T(t) R⁻¹(t) [z(t) − H(t) x̂(t)]    (148)
Summary
The equations derived in the previous section are summarized.
State update:

    ẋ̂ = A x̂ + P H^T R⁻¹ (z − H x̂)

Error covariance update:

    Ṗ = A P + P A^T + GQG^T − P H^T R⁻¹ H P
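These two equations can be integrated directly. A sketch (NumPy, forward Euler) propagating the Riccati equation for the system of the Problems section; the measurement model H = [1 0] and the noise level R = 0.1 are assumed here only for illustration:

```python
import numpy as np

A = np.array([[0.0, -1.0], [1.0, -2.0]])
G = np.eye(2)
Q = np.diag([0.0, 1.0])
H = np.array([[1.0, 0.0]])
R_inv = np.linalg.inv(np.array([[0.1]]))   # R = 0.1 (assumed)

def riccati_rhs(P):
    """Pdot = A P + P A^T + G Q G^T - P H^T R^{-1} H P."""
    return A @ P + P @ A.T + G @ Q @ G.T - P @ H.T @ R_inv @ H @ P

P = 50.0 * np.eye(2)     # P(0) = 50 I, as in the Problems section
dt = 0.001
for _ in range(20000):   # forward Euler to t = 20 s
    P = P + riccati_rhs(P) * dt

residual = np.max(np.abs(riccati_rhs(P)))
print(residual)          # near zero: P has reached the steady state
```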
References
[1] Chen, C.-T., Linear System Theory and Design, 3rd Ed., Oxford University Press, 1999.
[2] Grewal, M. S., and Andrews, A. P., Kalman Filtering: Theory and Practice Using MATLAB, 2nd Ed., John Wiley & Sons, 2001.
[3] Lewis, F. L., Optimal Estimation with an Introduction to Stochastic Control Theory, John Wiley & Sons, 1986.
[4] Arnold, L., Stochastic Differential Equations: Theory and Applications, John Wiley & Sons, 1974.
[5] Tahk, M.-J., Lecture Note: Navigation and Guidance, MAE664, http://fdcl.kaist.ac.kr, 2006.
[6] Maybeck, P. S., Stochastic Models, Estimation, and Control, Vol. 1, Academic Press, 1979.
[7] Kasdin, N. J., "Runge-Kutta Algorithm for the Numerical Integration of Stochastic Differential Equations," Journal of Guidance, Control, and Dynamics, Vol. 18, No. 1, pp. 114-120, 1995.
[8] Pittelkau, M. E., "Kalman Filtering for Spacecraft System Alignment Calibration," Journal of Guidance, Control, and Dynamics, Vol. 24, No. 6, pp. 1187-1195, 2001.