Quick Guide to Kalman Filter∗

Henzeh Lee†
Korea Advanced Institute of Science and Technology, Daejeon, Korea, 305-701.
March 2007

1 Mathematical Background
1.1 Linear System
Every linear system that is causal¹ and relaxed² at t_0 can be described by

y(t) = \int_{t_0}^{t} g(t, \tau)\, u(\tau)\, d\tau    (1)

where g is called the impulse response, and such an integration is called convolution. Let \hat{y}(s) be the Laplace transform of y(t), that is,

\hat{y}(s) = \int_{t_0}^{\infty} y(t)\, e^{-st}\, dt    (2)

Manipulating the above equation, we can obtain a simple form [1]:

\hat{y}(s) = \left( \int_{t_0}^{\infty} g(v)\, e^{-sv}\, dv \right) \left( \int_{t_0}^{\infty} u(\tau)\, e^{-s\tau}\, d\tau \right)    (3)

or

\hat{y}(s) = \hat{g}(s)\, \hat{u}(s)    (4)

where

\hat{g}(s) = \int_{t_0}^{\infty} g(t)\, e^{-st}\, dt    (5)

is called the transfer function. Thus, the transfer function is the Laplace transform of the impulse
response.
∗ by Hyunjae Lee (since 2007, named Henzeh K. Lee) - 1st Ed., March 2007.
† Post-Doctoral Researcher, Department of Aerospace Engineering, hjlee@fdcl.kaist.ac.kr
¹ If a system is causal, the output will not appear before an input is applied.
² A system is said to be relaxed at t_0 if its initial state at t_0 is zero.


Matrix Exponential
The most important function of A is the exponential function e^{At}. Because the Taylor series

e^{at} = 1 + at + \frac{a^2 t^2}{2!} + \cdots + \frac{a^n t^n}{n!} + \cdots    (6)

converges for all finite a and t, we have

e^{At} = I + At + \frac{A^2 t^2}{2!} + \cdots + \frac{A^n t^n}{n!} + \cdots = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^k    (7)

The matrix exponential has the following properties:

e^{0} = I    (8)
e^{A(t_1 + t_2)} = e^{At_1}\, e^{At_2}    (9)
\left( e^{At} \right)^{-1} = e^{-At}    (10)
\frac{d}{dt} e^{At} = A e^{At} = e^{At} A    (11)

Using Eq.(7), these properties can be easily proven (homework).
Taking the Laplace transform of Eq.(7) yields

\mathcal{L}\left( e^{At} \right) = \frac{1}{s} \sum_{k=0}^{\infty} \left( \frac{A}{s} \right)^k    (12)

From the well-known infinite series given by

\sum_{k=0}^{\infty} \left( \frac{a}{s} \right)^k = 1 + \frac{a}{s} + \left( \frac{a}{s} \right)^2 + \cdots = \left( 1 - \frac{a}{s} \right)^{-1}    (13)

we can obtain

\frac{1}{s} \sum_{k=0}^{\infty} \left( \frac{A}{s} \right)^k = \frac{1}{s} \left( I - \frac{A}{s} \right)^{-1} = (sI - A)^{-1}    (14)

Therefore, we can compute the matrix exponential using the Laplace transformation described by

e^{At} = \mathcal{L}^{-1} \left[ (sI - A)^{-1} \right]    (15)

Approximating the matrix exponential by a truncated power-series expansion of Eq.(7) is not a recommended method in general, although it can be useful when the characteristic values of At are small enough that the series converges quickly. As an alternative, "the scaling and squaring method" combined with a Padé approximation [2] is recommended when the result is hard to obtain by the Laplace transformation method in Eq.(15).
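
As an illustration, the following minimal Python sketch (NumPy and SciPy assumed; the helper name expm_series is ours) compares the truncated power series of Eq.(7) with scipy.linalg.expm, which implements the scaling-and-squaring method with a Padé approximation:

import numpy as np
from scipy.linalg import expm

def expm_series(A, t, n_terms=20):
    # Truncated power series of Eq.(7): e^{At} ~ sum_{k=0}^{n-1} (At)^k / k!
    At = A * t
    term = np.eye(A.shape[0])          # k = 0 term
    total = term.copy()
    for k in range(1, n_terms):
        term = term @ At / k           # builds (At)^k / k! recursively
        total = total + term
    return total

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
print(expm_series(A, 0.5))             # series approximation
print(expm(A * 0.5))                   # scaling-and-squaring with Pade

For well-conditioned A and small t the two agree to many digits; for stiff A the truncated series degrades, which is the caveat noted above.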

Solution of LTI Equations

Consider the linear time-invariant (LTI) state-space system

\dot{x} = A x(t) + B u(t)    (16)
y = C x(t) + D u(t)    (17)

To obtain the solution of the LTI system, multiplying both sides of Eq.(16) by e^{-At} yields

e^{-At} \dot{x} - e^{-At} A x(t) = e^{-At} B u(t)    (18)

which implies

\frac{d}{dt} \left( e^{-At} x(t) \right) = e^{-At} B u(t)    (19)

Integrating from 0 to t gives

e^{-At} x(t) - e^{0} x(0) = \int_0^t e^{-A\tau} B u(\tau)\, d\tau    (20)

Using the properties in Eqs.(8)-(10), the above equation can be written as

x(t) = e^{At} x(0) + \int_0^t e^{A(t - \tau)} B u(\tau)\, d\tau    (21)

Inserting this solution into the output equation y of the LTI system yields

y(t) = C e^{At} x(0) + C \int_0^t e^{A(t - \tau)} B u(\tau)\, d\tau + D u(t)    (22)

State Transition Matrix

Obtaining a closed-form solution of the linear time-varying (LTV) state equation is in general difficult. Thus, different approaches are required to solve LTV systems. A matrix function Φ is called a fundamental solution of Eq.(16) without the input vector on the interval t ∈ [0, T] if \dot{\Phi}(t) = A \Phi(t) and \Phi(0) = I. Note that the vector x(t) = \Phi(t) x(0) satisfies

\dot{x} = \frac{d}{dt} \left[ \Phi(t) x(0) \right] = \dot{\Phi}(t) x(0) = \left[ A \Phi(t) \right] x(0) = A \left[ \Phi(t) x(0) \right] = A x(t)    (23)

Note that the fundamental solution matrix transforms any initial state x(0) to the corresponding state x(t). Let us assume that Φ is nonsingular; then \Phi^{-1}(t) x(t) = x(0) and \Phi(\tau) \Phi^{-1}(t) x(t) = x(\tau). The matrix product

\Phi(\tau, t) = \Phi(\tau)\, \Phi^{-1}(t)    (24)

transforms a solution at time t to the corresponding solution at time τ.



Other useful properties of Φ include the following:

\Phi(t, t) = \Phi(0) = I    (25)
\Phi^{-1}(\tau, t) = \Phi(t, \tau)    (26)
\Phi(\tau, \sigma)\, \Phi(\sigma, t) = \Phi(\tau, t)    (27)
\frac{\partial}{\partial \tau} \Phi(\tau, t) = A(\tau)\, \Phi(\tau, t)    (28)
\frac{\partial}{\partial t} \Phi(\tau, t) = -\Phi(\tau, t)\, A(t)    (29)

where \Phi(t, 0) = \Phi(t) for simplification.
[Figure 1: Nature of the state transition matrix. Φ(t) maps x(0) to x(t), Φ(τ) maps x(0) to x(τ), and Φ(τ, t) = Φ(τ)Φ^{-1}(t) maps x(t) to x(τ).]

Solution of LTV Equations

The solution of the LTV or LTI system based on the transition matrix is given by

x(t) = \Phi(t, t_0) x(t_0) + \int_{t_0}^{t} \Phi(t, \tau) B(\tau) u(\tau)\, d\tau    (30)

For constant A, the transition matrix is therefore described as

\Phi(t, \tau) = e^{A(t - \tau)}    (31)

Other than the preceding special cases, computing state transition matrices is in general difficult. We show that Eq.(30) satisfies the initial condition and the state equation in Eq.(16) with time-varying matrices.
Proof) At t = t_0, the solution reduces to x(t_0), so it satisfies the initial condition. Using Eq.(28) and Leibniz's rule³, we can obtain

\frac{d}{dt} x(t) = \frac{\partial}{\partial t} \Phi(t, t_0) x(t_0) + \frac{\partial}{\partial t} \int_{t_0}^{t} \Phi(t, \tau) B(\tau) u(\tau)\, d\tau
= A(t) \Phi(t, t_0) x(t_0) + \int_{t_0}^{t} \frac{\partial}{\partial t} \Phi(t, \tau) B(\tau) u(\tau)\, d\tau + \underbrace{\Phi(t, t)}_{= I} B(t) u(t)
= A(t) \left[ \Phi(t, t_0) x(t_0) + \int_{t_0}^{t} \Phi(t, \tau) B(\tau) u(\tau)\, d\tau \right] + B(t) u(t)
= A(t) x(t) + B(t) u(t)    (32)

³ \frac{\partial}{\partial t} \int_{t_0}^{t} f(t, \tau)\, d\tau = \int_{t_0}^{t} \frac{\partial}{\partial t} f(t, \tau)\, d\tau + f(t, \tau)\big|_{\tau = t}

In conclusion, Eq.(30) is the solution of the LTV or LTI system.

1.2 Random Variable and Random Process


Mean, Correlation, and Covariance
Let x be a vector random process; then its mean is given by

E\langle x \rangle = \int_{-\infty}^{\infty} x\, p(x)\, dx    (33)

The correlation of the vector-valued process is defined by

E\langle x(t_1) x^T(t_2) \rangle = \begin{bmatrix} E\langle x_1(t_1) x_1(t_2) \rangle & \cdots & E\langle x_1(t_1) x_n(t_2) \rangle \\ \vdots & \ddots & \vdots \\ E\langle x_n(t_1) x_1(t_2) \rangle & \cdots & E\langle x_n(t_1) x_n(t_2) \rangle \end{bmatrix}    (34)

where

E\langle xy \rangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y\, p(x, y)\, dx\, dy    (35)

The covariance of the vector is also defined by

Q(t_1, t_2) = E\langle [x(t_1) - E\langle x(t_1) \rangle][x(t_2) - E\langle x(t_2) \rangle]^T \rangle = E\langle x(t_1) x^T(t_2) \rangle - E\langle x(t_1) \rangle E\langle x^T(t_2) \rangle    (36)

When the process has zero mean, its correlation and covariance are equal.

Orthogonal Process and White Noise

Two random processes x and y are called uncorrelated if their cross-covariance matrix is identically zero for all t_1 and t_2:

E\langle [x(t_1) - E\langle x(t_1) \rangle][y(t_2) - E\langle y(t_2) \rangle]^T \rangle = 0    (37)

The processes are called orthogonal if their correlation matrix is identically zero:

E\langle x(t_1) y^T(t_2) \rangle = 0    (38)

The random process x is called uncorrelated (white) if

E\langle [x(t_1) - E\langle x(t_1) \rangle][x(t_2) - E\langle x(t_2) \rangle]^T \rangle = Q(t_1, t_2)\, \delta(t_1 - t_2)    (39)

where δ(t) is the Dirac delta function defined by

\int_a^b \delta(t)\, dt = \begin{cases} 1 & \text{if } a \le 0 \le b \\ 0 & \text{otherwise} \end{cases}    (40)

Similarly, a random sequence x_k is called uncorrelated if

E\langle [x_k - E\langle x_k \rangle][x_j - E\langle x_j \rangle]^T \rangle = Q_k(k, j)\, \Delta(k - j)    (41)

where Δ(·) is the Kronecker delta function defined by

\Delta(k) = \begin{cases} 1 & \text{if } k = 0 \\ 0 & \text{otherwise} \end{cases}    (42)

Stochastic Differential Equations for Random Process

Differential equations involving random processes are called stochastic differential equations. The problem here is that random processes are not integrable functions in the conventional calculus. Resolving this problem requires foundational modifications of the calculus to obtain many of the results presented. The Riemann integral of the ordinary calculus must be modified to what is called the Itô calculus [4][5][7].
Let us consider a stochastic differential equation given by

\dot{x} = F(t, x) + G(t, x)\, w    (43)

The stochastic version of the Runge-Kutta algorithm is described as

x_{k+1} = x_k + \alpha_1 k_1 + \cdots + \alpha_n k_n    (44)

and

k_1 = \Delta t\, F(t_k, x_k) + \Delta t\, G(t_k, x_k)\, w_1    (45)

k_j = \Delta t\, F\!\left( t_k + c_j \Delta t,\; x_k + \sum_{i=1}^{j-1} a_{ji} k_i \right) + \Delta t\, G\!\left( t_k + c_j \Delta t,\; x_k + \sum_{i=1}^{j-1} a_{ji} k_i \right) w_j    (46)

where Δt is the integration time interval, and the coefficients a_{ji}, c_i, and α_i can be chosen by matching the deterministic case. A feasible choice⁴ of the covariance matrix for the noise generation⁵ in stochastic numerical simulations, as a function of the integration method, is given by

Q_s = \beta\, \frac{Q}{\Delta t}    (47)

where

\beta = \frac{1}{\sum_{i=1}^{n} \alpha_i^2}

In the case of the classical 4th-order Runge-Kutta algorithm, the scaling constant is β = 3.6, due to the constants α_1 = 1/6, α_2 = α_3 = 2/6, and α_4 = 1/6, which match the deterministic case.
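
The following is a hedged Python sketch of Eqs.(44)-(47) (NumPy assumed; the classical RK4 tableau and the scalar random-walk test are our illustrative choices, not prescribed by the text):

import numpy as np

def stochastic_rk4_step(F, G, t, x, dt, Q, rng):
    # One step of Eqs.(44)-(46) with classical RK4 coefficients;
    # each stage draws an independent noise sample w_j ~ N(0, Qs),
    # where Qs = beta*Q/dt and beta = 1/sum(alpha_i^2) = 3.6 (Eq.(47)).
    Qs = 3.6 * Q / dt
    w = rng.normal(0.0, np.sqrt(Qs), size=4)
    k1 = dt * F(t, x) + dt * G(t, x) * w[0]
    k2 = dt * F(t + dt/2, x + k1/2) + dt * G(t + dt/2, x + k1/2) * w[1]
    k3 = dt * F(t + dt/2, x + k2/2) + dt * G(t + dt/2, x + k2/2) * w[2]
    k4 = dt * F(t + dt,   x + k3  ) + dt * G(t + dt,   x + k3  ) * w[3]
    return x + (k1 + 2*k2 + 2*k3 + k4) / 6.0

# sanity check on the random walk xdot = w: Var[x(T)] should approach Q*T
rng = np.random.default_rng(0)
dt, Q, T = 0.01, 1.0, 1.0
ends = []
for _ in range(2000):
    x = 0.0
    for k in range(int(T / dt)):
        x = stochastic_rk4_step(lambda t, x: 0.0, lambda t, x: 1.0,
                                k * dt, x, dt, Q, rng)
    ends.append(x)
print(np.var(ends))   # ~ 1.0 = Q*T

For the random walk \dot{x} = w, the sample variance of x(T) should approach QT, which is exactly what the β = 3.6 scaling is designed to preserve.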

Covariance Propagation Equations

Consider a stochastic differential equation for a random process

\dot{x} = A(t) x(t) + G(t) w(t)    (48)
E\langle w(t) \rangle = 0    (49)
E\langle w(t_1) w^T(t_2) \rangle = Q(t_1, t_2)\, \delta(t_2 - t_1)    (50)

⁴ The covariance Q_s differs from the discrete covariance Q_k.
⁵ Using Matlab, you can generate random Gaussian noise with zero mean for stochastic numerical simulations by a simple command such as w = sqrt(Qs) * normrnd(0, 1), where Q_s ∈ R in this case.

The solution of this equation with x(t_0) and Φ(t) is given by

x(t) = \Phi(t, t_0) x(t_0) + \int_{t_0}^{t} \Phi(t, \tau) G(\tau) w(\tau)\, d\tau    (51)

Taking the expectation⁶ yields

E\langle x(t) \rangle = \Phi(t, t_0) E\langle x(t_0) \rangle + \int_{t_0}^{t} \Phi(t, \tau) G(\tau) \underbrace{E\langle w(\tau) \rangle}_{= 0}\, d\tau    (52)

Subtracting the two equations above, the difference between the two values is given by

x(t) - E\langle x(t) \rangle = \Phi(t, t_0) \left( x(t_0) - E\langle x(t_0) \rangle \right) + \int_{t_0}^{t} \Phi(t, \tau) G(\tau) w(\tau)\, d\tau    (53)

Therefore, using a stochastic property, namely that the process is uncorrelated⁷, the covariance matrix P(t) defined in the previous section can be easily obtained as

P(t) = \Phi(t, t_0) \underbrace{E\langle [x(t_0) - E\langle x(t_0) \rangle][x(t_0) - E\langle x(t_0) \rangle]^T \rangle}_{= P(t_0)} \Phi^T(t, t_0)
+ \int_{t_0}^{t} \Phi(t, t_0) \underbrace{E\langle [x(t_0) - E\langle x(t_0) \rangle]\, w^T(\tau) \rangle}_{= 0} G^T(\tau) \Phi^T(t, \tau)\, d\tau
+ \int_{t_0}^{t} \Phi(t, \tau) G(\tau) \underbrace{E\langle w(\tau)\, [x(t_0) - E\langle x(t_0) \rangle]^T \rangle}_{= 0} \Phi^T(t, t_0)\, d\tau
+ \int_{t_0}^{t} \int_{t_0}^{t} \Phi(t, \tau) G(\tau) \underbrace{E\langle w(\tau) w^T(\sigma) \rangle}_{= Q \delta(\tau - \sigma)} G^T(\sigma) \Phi^T(t, \sigma)\, d\tau\, d\sigma
= \Phi(t, t_0) P(t_0) \Phi^T(t, t_0) + \int_{t_0}^{t} \Phi(t, \tau) G(\tau) Q(\tau) G^T(\tau) \Phi^T(t, \tau)\, d\tau    (54)

Using the property of the transition matrix, \frac{d}{dt} \Phi(t, \tau) = A(t) \Phi(t, \tau), and Leibniz's rule, the differential form of the covariance can be described as

\dot{P} = A(t) P(t) + P(t) A^T(t) + G(t) Q(t) G^T(t)    (55)
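
A minimal sketch of propagating Eq.(55) numerically (Python with SciPy assumed; the two-state A, Q, and P(0) are borrowed from the exercise in Section 1.3 below):

import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, -1.0],
              [1.0, -2.0]])
G = np.eye(2)
Q = np.array([[0.0, 0.0],
              [0.0, 1.0]])
P0 = 50.0 * np.eye(2)

def pdot(t, p):
    # Right-hand side of Eq.(55): Pdot = A P + P A^T + G Q G^T
    P = p.reshape(2, 2)
    return (A @ P + P @ A.T + G @ Q @ G.T).ravel()

sol = solve_ivp(pdot, (0.0, 10.0), P0.ravel(), rtol=1e-8)
print(sol.y[:, -1].reshape(2, 2))   # near-steady covariance for this stable A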

Discrete Time System for Random Process

Recall a stochastic differential equation for an LTI random process given by

\dot{x} = A x(t) + B u(t) + G w(t)    (56)
E\langle w(t) \rangle = 0    (57)
E\langle w(t_1) w^T(t_2) \rangle = Q(t_1, t_2)\, \delta(t_2 - t_1)    (58)

⁶ For two random variables w_1, w_2 and a deterministic variable a, the expectation satisfies E\langle w_1 + a w_2 \rangle = E\langle w_1 \rangle + a E\langle w_2 \rangle.
⁷ E\langle x(t_0) w^T(\tau) \rangle = 0 for \tau > t_0.

[Figure 2: An illustrative covariance propagation of stochastic differential equations (\dot{x} = G w); position (m) versus time (sec), with P(t) = \int_{t_0}^{t} G(\tau) Q(\tau) G^T(\tau)\, d\tau.]

To obtain a discrete system, we begin with the solution of the equation

x(t) = e^{A(t - t_0)} x(t_0) + \int_{t_0}^{t} e^{A(t - \tau)} B u(\tau)\, d\tau + \int_{t_0}^{t} e^{A(t - \tau)} G w(\tau)\, d\tau    (59)

To describe the state propagation between samples, let t_0 = k\Delta t and t = (k+1)\Delta t for an integer k. Defining the sampled state vector as x_k = x(k\Delta t), we can write

x_{k+1} = e^{A \Delta t} x_k + \int_{k\Delta t}^{(k+1)\Delta t} e^{A[(k+1)\Delta t - \tau]} B u(\tau)\, d\tau + \int_{k\Delta t}^{(k+1)\Delta t} e^{A[(k+1)\Delta t - \tau]} G w(\tau)\, d\tau    (60)

Assume that the control input u(t) is constant over the integration interval, and denote it by u_k = u(k\Delta t). The third term is a smoothed version of the continuous white process noise, weighted by the state transition matrix and G. Here we can define

w_k = \int_{k\Delta t}^{(k+1)\Delta t} e^{A[(k+1)\Delta t - \tau]} G w(\tau)\, d\tau    (61)

By this assumption and definition, Eq.(60) becomes

x_{k+1} = e^{A \Delta t} x_k + \left( \int_{k\Delta t}^{(k+1)\Delta t} e^{A[(k+1)\Delta t - \tau]} B\, d\tau \right) u_k + w_k    (62)

This is the discrete version of Eq.(59), which we can write as [3]

x_{k+1} = A_k x_k + B_k u_k + w_k    (63)

where

A_k = e^{A \Delta t}, \qquad B_k = \int_0^{\Delta t} e^{A\tau} B\, d\tau

To find the covariance Q_k of the new noise sequence w_k in terms of Q, let us define an expectation⁸ described as

Q_k = E\langle w_k w_k^T \rangle = \int_{k\Delta t}^{(k+1)\Delta t} \int_{k\Delta t}^{(k+1)\Delta t} e^{A[(k+1)\Delta t - \tau]} G\, E\langle w(\tau) w^T(\sigma) \rangle\, G^T e^{A^T[(k+1)\Delta t - \sigma]}\, d\tau\, d\sigma    (64)

Note that the random noise is uncorrelated, that is, E\langle w(\tau) w^T(\sigma) \rangle = Q\, \delta(\tau - \sigma). The covariance is then given by [3]

Q_k = \int_0^{\Delta t} e^{A\tau} G Q G^T e^{A^T \tau}\, d\tau    (65)

It is worthwhile to write down the first few terms of the infinite series expansions such that

A_k = I + A \Delta t + \frac{1}{2} A^2 \Delta t^2 + \cdots    (66)
B_k = B \Delta t + \frac{1}{2} A B \Delta t^2 + \cdots    (67)
Q_k = G Q G^T \Delta t + \frac{1}{2} \left( A G Q G^T + G Q G^T A^T \right) \Delta t^2 + \cdots    (68)

The discrete linear time-varying system and noise covariance matrices are given by

A_k = \Phi((k+1)\Delta t, k\Delta t)    (69)
B_k = \int_{k\Delta t}^{(k+1)\Delta t} \Phi((k+1)\Delta t, \tau) B(\tau)\, d\tau    (70)
Q_k = \int_{k\Delta t}^{(k+1)\Delta t} \Phi((k+1)\Delta t, \tau) G(\tau) Q(\tau) G^T(\tau) \Phi^T((k+1)\Delta t, \tau)\, d\tau    (71)
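
In practice, A_k, B_k, and Q_k are usually computed with matrix exponentials rather than truncated series. The sketch below (Python, SciPy assumed) uses the standard block-matrix exponential constructions, commonly attributed to Van Loan and not derived in this note; it is one feasible implementation, not the text's prescribed method:

import numpy as np
from scipy.linalg import expm

def discretize(A, B, G, Q, dt):
    # Ad and Bd from one augmented exponential:
    # expm([[A, B],[0, 0]]*dt) = [[Ad, Bd],[0, I]]
    n, m = A.shape[0], B.shape[1]
    M = np.zeros((n + m, n + m))
    M[:n, :n], M[:n, n:] = A, B
    EM = expm(M * dt)
    Ad, Bd = EM[:n, :n], EM[:n, n:]
    # Qd by Van Loan's construction: V = [[-A, GQG^T],[0, A^T]]*dt,
    # expm(V) = [[., B12],[0, Ad^T]] and Qd = Ad @ B12 (cf. Eq.(65))
    V = np.zeros((2 * n, 2 * n))
    V[:n, :n], V[:n, n:], V[n:, n:] = -A, G @ Q @ G.T, A.T
    EV = expm(V * dt)
    Qd = Ad @ EV[:n, n:]
    return Ad, Bd, Qd

# usage with the system of the Problems section below
A = np.array([[0.0, -1.0], [1.0, -2.0]])
B = np.array([[0.0], [1.0]])
G = np.eye(2)
Q = np.array([[0.0, 0.0], [0.0, 1.0]])
Ad, Bd, Qd = discretize(A, B, G, Q, 0.01)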

Covariance Propagation in Discrete Time

Recall that the equations for the state and its expectation can be written as

x_k = \Phi_{k-1} x_{k-1} + w_{k-1}    (72)

E\langle x_k \rangle = \Phi_{k-1} E\langle x_{k-1} \rangle + E\langle w_{k-1} \rangle = \Phi_{k-1} E\langle x_{k-1} \rangle    (73)

⁸ For a random variable w and a deterministic variable a, the expectation satisfies E\langle a w \rangle = a E\langle w \rangle.

Therefore, the covariance matrix in discrete time can be easily obtained as

P_k = E\langle [x_k - E\langle x_k \rangle][x_k - E\langle x_k \rangle]^T \rangle
= \Phi_{k-1} \underbrace{E\langle [x_{k-1} - E\langle x_{k-1} \rangle][x_{k-1} - E\langle x_{k-1} \rangle]^T \rangle}_{= P_{k-1}} \Phi_{k-1}^T
+ \Phi_{k-1} \underbrace{E\langle [x_{k-1} - E\langle x_{k-1} \rangle]\, w_{k-1}^T \rangle}_{= 0}
+ \underbrace{E\langle w_{k-1}\, [x_{k-1} - E\langle x_{k-1} \rangle]^T \rangle}_{= 0} \Phi_{k-1}^T + \underbrace{E\langle w_{k-1} w_{k-1}^T \rangle}_{= Q_{k-1}}
= \Phi_{k-1} P_{k-1} \Phi_{k-1}^T + Q_{k-1}    (74)

where \Phi_k = \Phi((k+1)\Delta t, k\Delta t).

1.3 Problems
Let us consider a stochastic system given by

\dot{x} = A x + B u + w
E\langle w \rangle = 0
E\langle w(t_1) w(t_2)^T \rangle = Q\, \delta(t_1 - t_2)

where

A = \begin{bmatrix} 0 & -1 \\ 1 & -2 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad Q = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}

1. Find the transition matrix using the power series and the Laplace transformation method, respectively.
2. Find the discrete version of the system (A_k, B_k) and the covariance Q_k using the power series and the transition matrix, respectively.
3. Plot x(t) and P(t) with x(0) = [5, 5]^T, P(0) = 50I, and no input. For this simulation, you can use \Delta t = 0.01 and \Delta t = 0.1, respectively. (You must use the Itô integration method for this stochastic simulation [5].)
Using Matlab (or the Python sketch below), find the transition matrix of

\begin{bmatrix} 0 & a_3 & -a_2 & 1 \\ -a_3 & 0 & a_1 & 0 \\ a_2 & -a_1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
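
A Python alternative sketch to Matlab (SymPy and SciPy assumed; all variable names are ours): the symbolic resolvent (sI - A)^{-1} gives Φ(t) through Eq.(15), and expm provides a numerical check for sample values of a_1, a_2, a_3. The symbolic simplification may be slow:

import numpy as np
import sympy as sp
from scipy.linalg import expm

a1, a2, a3, s = sp.symbols('a1 a2 a3 s')
A = sp.Matrix([[0,   a3, -a2, 1],
               [-a3,  0,  a1, 0],
               [a2, -a1,   0, 0],
               [0,    0,   0, 0]])
# Symbolic resolvent (sI - A)^(-1); Phi(t) is its inverse Laplace transform, Eq.(15)
resolvent = sp.simplify((s * sp.eye(4) - A).inv())

# Numeric transition matrix over one sample interval via expm, for trial rates
An = np.array(A.subs({a1: 0.1, a2: -0.2, a3: 0.3}), dtype=float)
print(expm(An * 0.01))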

2 Kalman Filter
2.1 Orthogonality Principle
The inner product of two vectors can be defined as the correlation of the two variables such that

\langle x, y \rangle = E\langle x^T y \rangle    (75)

and

\langle x, x \rangle = E\langle x^T x \rangle = \|x\|^2    (76)

where \|\cdot\| denotes the two-norm.

Orthogonal Projection Lemma, OPL [5] Let X be a normed space⁹ in which each x ∈ X has a finite second moment, and let Y be a closed subspace of X. Then there exists a unique x̂ ∈ Y such that

\min_{\alpha \in Y} \|x - \alpha\|^2 = \|x - \hat{x}\|^2    (77)

if and only if

\langle x - \hat{x}, \alpha \rangle = 0    (78)

for all α ∈ Y.
Proof) Let us consider the sufficiency. By definition, the cost function is given by

\langle x - \alpha, x - \alpha \rangle = \langle x - \hat{x} + \hat{x} - \alpha,\; x - \hat{x} + \hat{x} - \alpha \rangle = \langle x - \hat{x}, x - \hat{x} \rangle + 2 \langle x - \hat{x}, \hat{x} - \alpha \rangle + \langle \hat{x} - \alpha, \hat{x} - \alpha \rangle    (79)

If \langle x - \hat{x}, \hat{x} \rangle = 0 and \langle x - \hat{x}, \alpha \rangle = 0 for \hat{x}, \alpha \in Y, then

\langle x - \alpha, x - \alpha \rangle = \langle x - \hat{x}, x - \hat{x} \rangle + \langle \hat{x} - \alpha, \hat{x} - \alpha \rangle    (80)

The above equation implies

\|x - \alpha\|^2 = \|x - \hat{x}\|^2 + \|\hat{x} - \alpha\|^2 \ge \|x - \hat{x}\|^2, \quad \forall \alpha \in Y    (81)

Next, to prove the necessity condition, we apply contradiction. Suppose that x̂ is the minimum and there exists α_1 ∈ Y such that

\langle x - \hat{x}, \alpha_1 \rangle = \beta \ne 0    (82)

For a scalar λ,

\langle (x - \hat{x}) + \lambda \alpha_1,\; (x - \hat{x}) + \lambda \alpha_1 \rangle = \|x - \hat{x}\|^2 + 2 \beta \lambda + \lambda^2 \|\alpha_1\|^2    (83)

Assume that the scalar constant is chosen as

\lambda = -\frac{\beta}{\|\alpha_1\|^2}    (84)

Then the updated cost is given by

\|x - \hat{x} + \lambda \alpha_1\|^2 = \|x - \hat{x}\|^2 - \frac{\beta^2}{\|\alpha_1\|^2} < \|x - \hat{x}\|^2    (85)

This contradicts the hypothesis that x̂ is the minimum. Therefore, if x̂ is the minimum, then β must be zero.

⁹ The normed space is taken to be a Hilbert space.

2.2 Discrete-Time Kalman Filter

Estimator in Linear Form
Suppose that stochastic systems can be represented by the following types of plant and measurement models:

Discrete Time Models
x_k = \Phi_{k-1} x_{k-1} + w_{k-1}    (86)
z_k = H_k x_k + v_k    (87)
E\langle w_k \rangle = 0    (88)
E\langle w_k w_j^T \rangle = Q_k(k, j)\, \Delta(k - j)    (89)
E\langle v_k \rangle = 0    (90)
E\langle v_k v_j^T \rangle = R_k(k, j)\, \Delta(k - j)    (91)

The measurement and plant noise are assumed to be zero-mean Gaussian processes. The noise sequences w and v are assumed to be uncorrelated¹⁰.
A comprehensive example for recursive filters: Consider the problem of estimating a scalar constant x based on k noise-corrupted measurements z_i = x + v_i (i = 1, 2, \cdots, k), where v_i represents the measurement noise, assumed to be a white sequence. An unbiased, minimum-variance estimate \hat{x}_k is given by

\hat{x}_k = \frac{1}{k} \sum_{i=1}^{k} z_i    (92)

When an additional measurement becomes available, the new estimate is described as

\hat{x}_{k+1} = \frac{1}{k+1} \sum_{i=1}^{k+1} z_i    (93)

This expression can be modified as

\hat{x}_{k+1} = \frac{1}{k+1} \sum_{i=1}^{k} z_i + \frac{1}{k+1} z_{k+1} = \frac{k}{k+1} \left( \frac{1}{k} \sum_{i=1}^{k} z_i \right) + \frac{1}{k+1} z_{k+1} = \frac{k}{k+1} \hat{x}_k + \frac{1}{k+1} z_{k+1}    (94)

¹⁰ E\langle v_k w_j^T \rangle = 0

Note that Eq.(94) can be written in the alternative form:

\hat{x}_{k+1} = \frac{(k+1) - 1}{k+1} \hat{x}_k + \frac{1}{k+1} z_{k+1} = \hat{x}_k + \frac{1}{k+1} \left( z_{k+1} - \hat{x}_k \right)    (95)
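
A quick numerical confirmation (Python, NumPy assumed; the constant and noise level are arbitrary) that the recursive form of Eq.(95) reproduces the batch average of Eq.(92):

import numpy as np

rng = np.random.default_rng(1)
z = 3.0 + rng.normal(0.0, 0.5, size=1000)   # measurements of a constant x = 3

x_hat = 0.0
for k, zk in enumerate(z, start=1):
    x_hat += (zk - x_hat) / k               # Eq.(95): estimate + gain * innovation

print(x_hat, z.mean())                      # identical up to round-off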
The optimal linear estimate is equivalent to the general optimal estimator if the variables x and z are jointly Gaussian. Therefore, it suffices to seek an updated estimate \hat{x}_k^+ that is a linear function of the a priori estimate and the measurement z_k:

\hat{x}_k^+ = \bar{K}_k \hat{x}_k^- + K_k z_k    (96)

where \hat{x}_k^- = \Phi_{k-1} \hat{x}_{k-1}^+ is the a priori estimate of x_k and \hat{x}_k^+ is the a posteriori value of the estimate.

Optimization Problem
The matrices K_k and \bar{K}_k are unknown, and we must find their optimal values. The orthogonality principle of the previous section can be written in the form

E\langle [x_k - \hat{x}_k^+]\, z_i^T \rangle = 0, \quad i = 1, 2, \cdots, k-1    (97)
E\langle [x_k - \hat{x}_k^+]\, z_k^T \rangle = 0    (98)

Using the discrete-time plant and measurement models in Eqs.(86),(87), we can obtain the following relation:

E\langle [\Phi_{k-1} x_{k-1} + w_{k-1} - \bar{K}_k \hat{x}_k^- - K_k (H_k x_k + v_k)]\, z_i^T \rangle = 0, \quad i = 1, 2, \cdots, k-1    (99)

[Figure 3: Geometric interpretation of the optimal state estimation based on measurement vectors: the error x_k - \hat{x}_k^+ is orthogonal to the subspace spanned by the measurements.]

Using the properties of the noise process¹¹, we can rewrite the above equation:

E\langle [\Phi_{k-1} x_{k-1} - \bar{K}_k \hat{x}_k^- - K_k H_k x_k - K_k v_k]\, z_i^T \rangle = 0, \quad i = 1, 2, \cdots, k-1    (100)

¹¹ E\langle w_k \rangle = 0

We also know that Eqs.(97) and (98) hold at the previous step, that is,

0 = E\langle [x_{k-1} - \hat{x}_{k-1}^+]\, z_i^T \rangle, \quad i = 1, 2, \cdots, k-1
= \Phi_{k-1} E\langle [x_{k-1} - \hat{x}_{k-1}^+]\, z_i^T \rangle
= E\langle [\Phi_{k-1} x_{k-1} - \Phi_{k-1} \hat{x}_{k-1}^+]\, z_i^T \rangle = E\langle (x_k - \hat{x}_k^-)\, z_i^T \rangle    (101)

and¹²

E\langle v_k z_i^T \rangle = 0, \quad i = 1, 2, \cdots, k-1    (102)

Then Eq.(100) can be reduced to the form

\underbrace{E\langle \Phi_{k-1} x_{k-1} z_i^T \rangle}_{E\langle x_k z_i^T \rangle} - \bar{K}_k E\langle \hat{x}_k^- z_i^T \rangle - K_k H_k E\langle x_k z_i^T \rangle - K_k \underbrace{E\langle v_k z_i^T \rangle}_{= 0} = 0    (103)

Rearranging the above equation (adding and subtracting \bar{K}_k E\langle x_k z_i^T \rangle) yields

E\langle x_k z_i^T \rangle - K_k H_k E\langle x_k z_i^T \rangle - \bar{K}_k E\langle \hat{x}_k^- z_i^T \rangle = 0
E\langle x_k z_i^T \rangle - K_k H_k E\langle x_k z_i^T \rangle - \bar{K}_k E\langle x_k z_i^T \rangle + \bar{K}_k E\langle x_k z_i^T \rangle - \bar{K}_k E\langle \hat{x}_k^- z_i^T \rangle = 0    (104)

By the property in Eq.(101), the above equation becomes

\left( I - K_k H_k - \bar{K}_k \right) E\langle x_k z_i^T \rangle + \bar{K}_k \underbrace{E\langle (x_k - \hat{x}_k^-)\, z_i^T \rangle}_{= 0} = 0    (105)

Clearly, the orthogonality condition in Eq.(97) can be satisfied for any given x_k if

\bar{K}_k = I - K_k H_k    (106)

By inserting Eq.(106) into Eq.(96), we can obtain a recursive form:

\hat{x}_k^+ = (I - K_k H_k) \hat{x}_k^- + K_k z_k = \hat{x}_k^- + K_k \left( z_k - H_k \hat{x}_k^- \right)    (107)

The choice of K_k is such that Eq.(98) is satisfied. Let us define some useful errors:

\tilde{x}_k^+ \equiv \hat{x}_k^+ - x_k    (108)
\tilde{x}_k^- \equiv \hat{x}_k^- - x_k    (109)
\tilde{z}_k \equiv \hat{z}_k - z_k = H_k \hat{x}_k^- - z_k    (110)

Vectors \tilde{x}_k^+ and \tilde{x}_k^- are the estimation errors after and before the update, respectively. The predicted measurement \hat{z}_k depends on \hat{x}_k^-, which in turn depends on the previous measurements z_i. Therefore, from Eq.(98), it follows that

E\langle (x_k - \hat{x}_k^+)\, \hat{z}_k^T \rangle = 0    (111)

¹² E\langle v_k z_i^T \rangle = E\langle v_k v_i^T \rangle + E\langle v_k x_i^T \rangle H_i^T = 0 for i < k, since the measurement noise is white and uncorrelated with the state.

Subtracting Eq.(98) from the above equation gives

E\langle (x_k - \hat{x}_k^+)\, \tilde{z}_k^T \rangle = 0    (112)

Substituting for \hat{x}_k^+ and \tilde{z}_k yields

E\langle [x_k - \bar{K}_k \hat{x}_k^- - K_k z_k][H_k \hat{x}_k^- - z_k]^T \rangle = 0    (113)

Substituting for \bar{K}_k from Eq.(106), for z_k, and for \tilde{x}_k^-, and using the fact that E\langle \tilde{x}_k^- v_k^T \rangle = 0, the above equation can be rewritten as

0 = E\langle [x_k - \hat{x}_k^- + K_k H_k \hat{x}_k^- - K_k H_k x_k - K_k v_k][H_k \hat{x}_k^- - H_k x_k - v_k]^T \rangle
= E\langle [-(\hat{x}_k^- - x_k) + K_k H_k (\hat{x}_k^- - x_k) - K_k v_k][H_k (\hat{x}_k^- - x_k) - v_k]^T \rangle
= E\langle [-\tilde{x}_k^- + K_k H_k \tilde{x}_k^- - K_k v_k][H_k \tilde{x}_k^- - v_k]^T \rangle
= E\langle -\tilde{x}_k^- \tilde{x}_k^{-T} H_k^T + K_k H_k \tilde{x}_k^- \tilde{x}_k^{-T} H_k^T + K_k v_k v_k^T \rangle
= -E\langle \tilde{x}_k^- \tilde{x}_k^{-T} \rangle H_k^T + K_k H_k E\langle \tilde{x}_k^- \tilde{x}_k^{-T} \rangle H_k^T + K_k \underbrace{E\langle v_k v_k^T \rangle}_{R_k}    (114)

By definition, the a priori and a posteriori covariances are

P_k^- = E\langle \tilde{x}_k^- \tilde{x}_k^{-T} \rangle    (115)
P_k^+ = E\langle \tilde{x}_k^+ \tilde{x}_k^{+T} \rangle    (116)

Therefore, Eq.(114) becomes

-P_k^- H_k^T + K_k H_k P_k^- H_k^T + K_k R_k = 0    (117)

and the gain can be expressed as

K_k = P_k^- H_k^T \left[ H_k P_k^- H_k^T + R_k \right]^{-1}    (118)
One can derive a similar formula for the a posteriori covariance matrix. First, using Eq.(107), the error state vector for the a posteriori estimate is given by

\tilde{x}_k^+ = \hat{x}_k^+ - x_k = \hat{x}_k^- + K_k (H_k x_k + v_k - H_k \hat{x}_k^-) - x_k
= (\hat{x}_k^- - x_k) - K_k H_k (\hat{x}_k^- - x_k) + K_k v_k
= (I - K_k H_k)(\hat{x}_k^- - x_k) + K_k v_k
= (I - K_k H_k) \tilde{x}_k^- + K_k v_k    (119)

Inserting the above equation into the definition of the a posteriori covariance matrix in Eq.(116) yields

P_k^+ = E\langle (I - K_k H_k) \tilde{x}_k^- \tilde{x}_k^{-T} (I - K_k H_k)^T \rangle + E\langle (I - K_k H_k) \tilde{x}_k^- v_k^T K_k^T \rangle + E\langle K_k v_k \tilde{x}_k^{-T} (I - K_k H_k)^T \rangle + E\langle K_k v_k v_k^T K_k^T \rangle
= (I - K_k H_k) \underbrace{E\langle \tilde{x}_k^- \tilde{x}_k^{-T} \rangle}_{P_k^-} (I - K_k H_k)^T + (I - K_k H_k) \underbrace{E\langle \tilde{x}_k^- v_k^T \rangle}_{= 0} K_k^T + K_k \underbrace{E\langle v_k \tilde{x}_k^{-T} \rangle}_{= 0} (I - K_k H_k)^T + K_k \underbrace{E\langle v_k v_k^T \rangle}_{R_k} K_k^T
= (I - K_k H_k) P_k^- (I - K_k H_k)^T + K_k R_k K_k^T    (120)

The above form is the so-called "Joseph form" of the covariance update equation, derived by P. D. Joseph. There are many other forms for K_k and P_k^+ that may not be as useful for robust computation. The covariance matrix must be symmetric, but asymmetry can be introduced during numerical calculation; the Joseph form is one of the robust algorithms in this respect.
An alternative is derived by substituting for K_k from Eq.(118) such that

P_k^+ = P_k^- - K_k H_k P_k^- - P_k^- H_k^T K_k^T + K_k H_k P_k^- H_k^T K_k^T + K_k R_k K_k^T
= (I - K_k H_k) P_k^- \underbrace{- P_k^- H_k^T K_k^T + K_k (H_k P_k^- H_k^T + R_k) K_k^T}_{= 0 \;(\because\; \text{Eq.(118)})}
= (I - K_k H_k) P_k^-    (121)

This form is the one most often used in computation. The a priori error covariance matrix is derived by using the state error equation given by

\tilde{x}_k^- = \hat{x}_k^- - x_k = (\Phi_{k-1} \hat{x}_{k-1}^+) - (\Phi_{k-1} x_{k-1} + w_{k-1})
= \Phi_{k-1} (\hat{x}_{k-1}^+ - x_{k-1}) - w_{k-1}
= \Phi_{k-1} \tilde{x}_{k-1}^+ - w_{k-1}    (122)

By definition of the a priori error covariance matrix, we can obtain a compact form:

P_k^- = \Phi_{k-1} E\langle \tilde{x}_{k-1}^+ \tilde{x}_{k-1}^{+T} \rangle \Phi_{k-1}^T + E\langle w_{k-1} w_{k-1}^T \rangle
= \Phi_{k-1} P_{k-1}^+ \Phi_{k-1}^T + Q_{k-1}    (123)

which gives the a priori value of the covariance matrix of estimation uncertainty as a function of the previous a posteriori value.
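
To illustrate the numerical-robustness remark about the Joseph form, here is a small Python sketch (NumPy assumed; the numbers are illustrative) comparing Eq.(120) with the short form of Eq.(121):

import numpy as np

def joseph_update(P, H, R, K):
    # Joseph form, Eq.(120): symmetric and positive semidefinite by construction
    IKH = np.eye(P.shape[0]) - K @ H
    return IKH @ P @ IKH.T + K @ R @ K.T

def short_update(P, H, K):
    # Short form, Eq.(121): cheaper, but round-off can break symmetry
    return (np.eye(P.shape[0]) - K @ H) @ P

P = np.array([[2.0, 0.3],
              [0.3, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Eq.(118)

Pj = joseph_update(P, H, R, K)
Ps = short_update(P, H, K)
print(np.abs(Pj - Pj.T).max(), np.abs(Ps - Ps.T).max())

Over a long filtering run, asymmetry from the short form can accumulate, which is why the Joseph form is preferred when robustness matters more than the extra multiplications.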

Summary
The equations derived in the previous section are summarized here. In this formulation of the filter equations, G has been combined with the plant covariance. The basic steps of the computational procedure for the discrete-time Kalman filter are as follows:
1. Compute P_k^- using P_{k-1}^+, \Phi_{k-1}, and Q_{k-1}.
2. Compute K_k using P_k^-, H_k, and R_k.
3. Compute P_k^+ using K_k and P_k^-.
4. Compute successive values of \hat{x}_k^+ recursively using the computed values of K_k, the given initial estimate, and the input data z_k.

Table 1: Discrete-Time Kalman Filter Equations

System dynamic model:            x_k = \Phi_{k-1} x_{k-1} + w_{k-1},  w_k \sim N(0, Q_k)
Measurement model:               z_k = H_k x_k + v_k,  v_k \sim N(0, R_k)
Initial conditions:              E\langle x_0 \rangle = \hat{x}_0,  E\langle \tilde{x}_0 \tilde{x}_0^T \rangle = P_0
Independence assumption:         E\langle v_k w_j^T \rangle = 0 \;\forall\; k, j;  R_k > 0
State time update:               \hat{x}_k^- = \Phi_{k-1} \hat{x}_{k-1}^+
Error covariance extrapolation:  P_k^- = \Phi_{k-1} P_{k-1}^+ \Phi_{k-1}^T + Q_{k-1}
Kalman gain matrix:              K_k = P_k^- H_k^T (H_k P_k^- H_k^T + R_k)^{-1}
Error covariance update:         P_k^+ = (I - K_k H_k) P_k^-
State measurement update:        \hat{x}_k^+ = \hat{x}_k^- + K_k (z_k - H_k \hat{x}_k^-)
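
The steps of Table 1 translate directly into code. A minimal end-to-end Python sketch (NumPy assumed; the time-invariant matrices and the constant-velocity tracking example are our illustrative choices, not from the text):

import numpy as np

def kalman_filter(x0, P0, Phi, H, Q, R, zs):
    # Discrete-time Kalman filter, following the steps in Table 1
    x, P = x0, P0
    history = []
    for z in zs:
        x = Phi @ x                                    # state time update
        P = Phi @ P @ Phi.T + Q                        # covariance extrapolation
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain, Eq.(118)
        x = x + K @ (z - H @ x)                        # measurement update, Eq.(107)
        P = (np.eye(len(x)) - K @ H) @ P               # covariance update, Eq.(121)
        history.append(x.copy())
    return np.array(history), P

# constant-velocity example: estimate position and velocity from noisy position
dt = 0.1
Phi = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 1e-4 * np.eye(2)
R = np.array([[0.25]])

rng = np.random.default_rng(2)
truth = np.array([0.0, 1.0])
zs = []
for _ in range(100):
    truth = Phi @ truth
    zs.append(H @ truth + rng.normal(0.0, 0.5, size=1))

xs, P = kalman_filter(np.zeros(2), 10.0 * np.eye(2), Phi, H, Q, R, zs)
print(xs[-1], truth)   # final estimate approaches the true state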

2.3 Continuous-Time Kalman Filter

A rigorous derivation of the discrete Kalman filter can be achieved by using the orthogonality principle. This section provides a formal (but not rigorous) derivation of the continuous-time Kalman estimator, sometimes called the Kalman-Bucy filter. Less emphasis is placed on the continuous-time Kalman filter, especially in view of the main objective of obtaining efficient and practical estimators. First, the continuous-time random process is given by

\dot{x}(t) = A(t) x(t) + G(t) w(t)    (124)
z(t) = H(t) x(t) + v(t)    (125)
E\langle w(t) \rangle = 0    (126)
E\langle w(t) w(\sigma)^T \rangle = Q(t, \sigma)\, \delta(t - \sigma)    (127)
E\langle v(t) \rangle = 0    (128)
E\langle v(t) v(\sigma)^T \rangle = R(t, \sigma)\, \delta(t - \sigma)    (129)
E\langle w(t) v(\sigma)^T \rangle = 0    (130)

where the covariance matrices Q, R are positive definite.


Before proceeding, we need the relationship between R_k and R. We can write the covariance of v_k in terms of its spectral density R as

R = R_k\, \Delta(k)    (131)

Since Δ(k) has a value of one at k = 0, in the discrete case the covariance is equal to R_k, a finite matrix. On the other hand, the covariance of the continuous-time noise is given by

R\, \delta(t)    (132)

For continuous white noise, the covariance is unbounded since δ(t) is unbounded at t = 0. For consistency between R_k and R, the following relationship holds:

R_k = \frac{R}{\Delta t}    (133)

where Δt is the time interval between t_{k-1} and t_k. It is evident that this is the relation required between R and R_k so that v_k and v(t) have the same spectral densities.
As shown in the previous section, the following relationships hold:

\Phi_k = I + A_{k-1} \Delta t + O(\Delta t^2)    (134)

where O(\Delta t^2) consists of terms with powers of Δt greater than or equal to two, and from Eq.(68) the process noise is approximately given by

Q_k \approx G Q G^T \Delta t    (135)

By substituting the above equations, the discrete covariance matrices and gain can be modified as

P_k^- = (I + A \Delta t) P_{k-1}^+ (I + A \Delta t)^T + G Q G^T \Delta t    (136)
K_k = P_k^- H^T \left( H P_k^- H^T + \frac{R}{\Delta t} \right)^{-1}    (137)
P_k^+ = (I - K_k H) P_k^-    (138)

Let us first examine the behavior of the Kalman gain as Δt approaches zero. From Eq.(137), we have

K_k = \Delta t\, P_k^- H^T \left( H P_k^- H^T \Delta t + R \right)^{-1}    (139)

so that

\lim_{\Delta t \to 0} \frac{1}{\Delta t} K_k = P_k^- H^T R^{-1}    (140)

where R is positive definite, so the right-hand side of the above equation is finite. This implies that

\lim_{\Delta t \to 0} K_k = 0    (141)

The discrete Kalman gain approaches zero as the sampling time goes to zero. Now consider the covariance matrix in Eq.(136); we have

P_k^- = P_{k-1} + \left( A P_{k-1} + P_{k-1} A^T + G Q G^T \right) \Delta t + O(\Delta t^2)    (142)



where O(\Delta t^2) denotes terms of order \Delta t^2. Substituting Eq.(138) into this equation yields

P_k^- = (I - K_{k-1} H) P_{k-1}^-
+ \left[ A (I - K_{k-1} H) P_{k-1}^- + (I - K_{k-1} H) P_{k-1}^- A^T + G Q G^T \right] \Delta t + O(\Delta t^2)
= P_{k-1}^- - K_{k-1} H P_{k-1}^-
+ \left[ A P_{k-1}^- + P_{k-1}^- A^T + G Q G^T - A K_{k-1} H P_{k-1}^- - K_{k-1} H P_{k-1}^- A^T \right] \Delta t + O(\Delta t^2)    (143)

Then dividing by Δt yields

\frac{P_k^- - P_{k-1}^-}{\Delta t} = -\frac{K_{k-1}}{\Delta t} H P_{k-1}^- + \left[ A P_{k-1}^- + P_{k-1}^- A^T + G Q G^T - A K_{k-1} H P_{k-1}^- - K_{k-1} H P_{k-1}^- A^T \right] + O(\Delta t)    (144)

Now let Δt tend to zero. Using Eqs.(140),(141), the continuous-time covariance matrix differential equation is derived as

\lim_{\Delta t \to 0} \frac{P_k^- - P_{k-1}^-}{\Delta t} = \lim_{\Delta t \to 0} \left[ A P_{k-1}^- + P_{k-1}^- A^T + G Q G^T \right] - \lim_{\Delta t \to 0} \frac{K_{k-1}}{\Delta t} H P_{k-1}^- + \lim_{\Delta t \to 0} \underbrace{\left[ -A K_{k-1} H P_{k-1}^- - K_{k-1} H P_{k-1}^- A^T + O(\Delta t) \right]}_{\to 0}

\dot{P}(t) = A(t) P(t) + P(t) A^T(t) + G(t) Q(t) G^T(t) - P(t) H^T(t) R^{-1}(t) H(t) P(t)    (145)

where \lim_{\Delta t \to 0} P_{k-1}^- = P(t). This is the continuous-time Riccati equation for the propagation of the error covariance.
In a similar manner, the state vector update equation can be derived such that

\hat{x}_k^+ = (I + A \Delta t) \hat{x}_{k-1}^+ + K_k \left[ z_k - H (I + A \Delta t) \hat{x}_{k-1}^+ \right]
= \hat{x}_{k-1}^+ + A \Delta t\, \hat{x}_{k-1}^+ + K_k \left( z_k - H \hat{x}_{k-1}^+ - H A \hat{x}_{k-1}^+ \Delta t \right)    (146)

Dividing by Δt and letting Δt → 0 yields

\lim_{\Delta t \to 0} \frac{\hat{x}_k^+ - \hat{x}_{k-1}^+}{\Delta t} = \lim_{\Delta t \to 0} A \hat{x}_{k-1}^+ + \lim_{\Delta t \to 0} \frac{K_k}{\Delta t} \Big( z_k - H \hat{x}_{k-1}^+ - \underbrace{H A \hat{x}_{k-1}^+ \Delta t}_{\to 0} \Big)    (147)

where \lim_{\Delta t \to 0} \hat{x}_{k-1}^+ = \hat{x}(t). We then obtain the limiting solution

\dot{\hat{x}}(t) = A(t) \hat{x}(t) + P(t) H^T(t) R^{-1}(t) \left[ z(t) - H(t) \hat{x}(t) \right]    (148)

This equation is the estimate update form in continuous time.



Summary
The equations derived in the previous section are summarized here.

Table 2: Kalman-Bucy Filter Equations

System dynamic model:       \dot{x}(t) = A(t) x(t) + G(t) w(t),  w(t) \sim N(0, Q)
Measurement model:          z(t) = H(t) x(t) + v(t),  v(t) \sim N(0, R)
Initial conditions:         E\langle x(0) \rangle = \hat{x}(0),  E\langle \tilde{x}(0) \tilde{x}(0)^T \rangle = P(0)
Independence assumption:    E\langle w(t) v(\sigma)^T \rangle = 0,  R > 0
State update:               \dot{\hat{x}} = A \hat{x} + P H^T R^{-1} (z - H \hat{x})
Error covariance update:    \dot{P} = A P + P A^T + G Q G^T - P H^T R^{-1} H P
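
As a closing sketch, the two update equations of Table 2 can be integrated jointly as one ODE (Python with SciPy assumed; the system matrices and the stand-in measurement signal z_of_t are illustrative assumptions):

import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-1.0, -0.5]])
G = np.eye(2)
H = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)
R = np.array([[0.05]])
Rinv = np.linalg.inv(R)

def z_of_t(t):
    # Stand-in measurement signal; in practice this comes from the sensor
    return np.array([np.sin(t)])

def filter_ode(t, y):
    xhat, P = y[:2], y[2:].reshape(2, 2)
    K = P @ H.T @ Rinv                                  # P H^T R^-1
    dx = A @ xhat + K @ (z_of_t(t) - H @ xhat)          # state update (Table 2)
    dP = A @ P + P @ A.T + G @ Q @ G.T - K @ R @ K.T    # Riccati, Eq.(145)
    return np.concatenate([dx, dP.ravel()])

y0 = np.concatenate([np.zeros(2), (10.0 * np.eye(2)).ravel()])
sol = solve_ivp(filter_ode, (0.0, 20.0), y0, rtol=1e-6)
P_final = sol.y[2:, -1].reshape(2, 2)   # approaches the algebraic Riccati solution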

References
[1] Chen, C.-T., Linear System Theory and Design, 3rd Ed., Oxford University Press, 1999.

[2] Grewal, M. S., and Andrews, A. P., Kalman Filtering: Theory and Practice Using MATLAB, 2nd Ed., John Wiley & Sons, Inc., 2001.

[3] Lewis, F. L., Optimal Estimation with an Introduction to Stochastic Control Theory, John Wiley & Sons, Inc., 1986.

[4] Arnold, L., Stochastic Differential Equations: Theory and Applications, John Wiley & Sons, Inc., 1974.

[5] Tahk, M.-J., Lecture Note: Navigation and Guidance, MAE664, http://fdcl.kaist.ac.kr, 2006.

[6] Maybeck, P. S., Stochastic Models, Estimation, and Control, Vol. 1, Academic Press, Inc., 1979.

[7] Kasdin, N., "Runge-Kutta Algorithm for the Numerical Integration of Stochastic Differential Equations," Journal of Guidance, Control, and Dynamics, Vol. 18, No. 1, pp. 114-120, 1995.

[8] Pitelkau, M. E., "Kalman Filtering for Spacecraft System Alignment Calibration," Journal of Guidance, Control, and Dynamics, Vol. 24, No. 6, pp. 1187-1195, 2001.
