
Least Squares and Kalman Filtering

Questions: Email me, namrata@ece.gatech.edu



Recall: Weighted Least Squares

• y = Hx + e

• Minimize
  J(x) ≜ (y − Hx)^T W (y − Hx) = ||y − Hx||^2_W        (1)

Solution:

  x̂ = (H^T W H)^{-1} H^T W y        (2)

• Given that E[e] = 0 and E[e e^T] = V,


Min. variance unbiased linear estimator of x: choose W = V^{-1} in (2)
Min. variance of a vector: the variance in any direction is minimized
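As an illustration, here is a minimal Python/numpy sketch of the weighted least squares solution (2); the matrices H, W, y and the problem sizes are made up for the example.

    import numpy as np

    def weighted_ls(H, W, y):
        # x_hat = (H^T W H)^{-1} H^T W y, computed via a linear solve
        return np.linalg.solve(H.T @ W @ H, H.T @ W @ y)

    # Made-up example: 50 noisy equations in 3 unknowns
    rng = np.random.default_rng(0)
    H = rng.standard_normal((50, 3))
    x_true = np.array([1.0, -2.0, 0.5])
    noise_var = 0.1 * np.ones(50)                      # V = diag(noise_var)
    y = H @ x_true + rng.standard_normal(50) * np.sqrt(noise_var)
    W = np.diag(1.0 / noise_var)                       # W = V^{-1} gives the min. variance estimator
    x_hat = weighted_ls(H, W, y)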



Recall: Proof

• Given x̂ = Ly, find L, s.t. E[Ly] = E[LHx] = E[x], so LH = I


• Let L_0 = (H^T V^{-1} H)^{-1} H^T V^{-1}
• Error variance E[(x − x̂)(x − x̂)^T]:

  E[(x − x̂)(x − x̂)^T] = E[(x − LHx − Le)(x − LHx − Le)^T]
                       = E[L e e^T L^T] = L V L^T        (using LH = I)

  Write L = (L − L_0) + L_0. Here LH = I and L_0 H = I, so (L − L_0) H = 0.

  L V L^T = L_0 V L_0^T + (L − L_0) V (L − L_0)^T + 2 L_0 V (L − L_0)^T
          = L_0 V L_0^T + (L − L_0) V (L − L_0)^T + 2 (H^T V^{-1} H)^{-1} H^T (L − L_0)^T
          = L_0 V L_0^T + (L − L_0) V (L − L_0)^T        (since H^T (L − L_0)^T = [(L − L_0) H]^T = 0)
          ≥ L_0 V L_0^T

Thus L_0 is the optimal (minimum variance) linear unbiased estimator. (Note: ≥ here denotes the positive semidefinite ordering for matrices.)
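A quick numerical sanity check of this claim (a sketch with made-up dimensions, not part of the original proof): for any other unbiased linear estimator L = L_0 + D with DH = 0, the difference L V L^T − L_0 V L_0^T should be positive semidefinite.

    import numpy as np

    rng = np.random.default_rng(1)
    n, p = 20, 3
    H = rng.standard_normal((n, p))
    V = np.diag(rng.uniform(0.5, 2.0, n))              # noise covariance
    Vinv = np.linalg.inv(V)

    L0 = np.linalg.solve(H.T @ Vinv @ H, H.T @ Vinv)   # L_0 = (H^T V^{-1} H)^{-1} H^T V^{-1}

    # Another unbiased estimator: L = L_0 + D with D H = 0
    D = rng.standard_normal((p, n))
    D = D @ (np.eye(n) - H @ np.linalg.solve(H.T @ H, H.T))   # force D H = 0
    L = L0 + D

    diff = L @ V @ L.T - L0 @ V @ L0.T
    print(np.linalg.eigvalsh(diff).min() >= -1e-10)    # True: difference is positive semidefinite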



Regularized Least Squares

• Minimize

  J(x) = (x − x_0)^T Π_0^{-1} (x − x_0) + (y − Hx)^T W (y − Hx)        (3)

  Define x' ≜ x − x_0 and y' ≜ y − Hx_0. Then

  J(x) = x'^T Π_0^{-1} x' + (y' − Hx')^T W (y' − Hx')
       = z^T M z

  where

  z ≜ [0; y'] − [I; H] x'        (stacked columns)
  M ≜ diag(Π_0^{-1}, W)          (block diagonal)



   
• Solution: use the weighted least squares formula (2) with ỹ = [0; y'], H̃ = [I; H], W̃ = M.

  Get:

  x̂ = x_0 + (Π_0^{-1} + H^T W H)^{-1} H^T W (y − Hx_0)

• Advantages: improves the condition number of H^T H, and incorporates prior knowledge about the distance of x from x_0
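A minimal numpy sketch of the regularized solution above; Π_0, W, x_0 and the sizes here are illustrative assumptions.

    import numpy as np

    def regularized_ls(H, W, y, x0, Pi0):
        # x_hat = x_0 + (Pi_0^{-1} + H^T W H)^{-1} H^T W (y - H x_0)
        A = np.linalg.inv(Pi0) + H.T @ W @ H
        b = H.T @ W @ (y - H @ x0)
        return x0 + np.linalg.solve(A, b)

    # Made-up example
    rng = np.random.default_rng(2)
    H = rng.standard_normal((10, 4))
    y = rng.standard_normal(10)
    W = np.eye(10)
    x0 = np.zeros(4)              # prior guess
    Pi0 = 10.0 * np.eye(4)        # prior covariance (large values = weak prior)
    x_hat = regularized_ls(H, W, y, x0, Pi0)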



Recursive Least Squares

• When the number of equations is much larger than the number of variables, batch least squares is costly:
  – storage of all past data
  – inverting big matrices
  – data arriving sequentially
• Use a recursive algorithm instead
At step i − 1, we have x̂_{i−1}: the minimizer of

  (x − x_0)^T Π_0^{-1} (x − x_0) + ||H_{i−1} x − Y_{i−1}||^2_{W_{i−1}},        Y_{i−1} = [y_1, ..., y_{i−1}]^T

Find x̂_i: the minimizer of

  (x − x_0)^T Π_0^{-1} (x − x_0) + ||H_i x − Y_i||^2_{W_i}

where H_i = [H_{i−1}; h_i] (h_i is a row vector) and Y_i = [y_1, ..., y_i]^T (a column vector).



For simplicity of notation, assume x_0 = 0 and W_i = I.

  H_i^T H_i = H_{i−1}^T H_{i−1} + h_i^T h_i

  x̂_i = (Π_0^{-1} + H_i^T H_i)^{-1} H_i^T Y_i
      = (Π_0^{-1} + H_{i−1}^T H_{i−1} + h_i^T h_i)^{-1} (H_{i−1}^T Y_{i−1} + h_i^T y_i)

Define

  P_i = (Π_0^{-1} + H_i^T H_i)^{-1},        P_0 = Π_0

so that P_i^{-1} = P_{i−1}^{-1} + h_i^T h_i.

Use the matrix inversion identity

  (A + BCD)^{-1} = A^{-1} − A^{-1} B (C^{-1} + D A^{-1} B)^{-1} D A^{-1}

with A = P_{i−1}^{-1}, B = h_i^T, C = 1, D = h_i to get

  P_i = P_{i−1} − (P_{i−1} h_i^T h_i P_{i−1}) / (1 + h_i P_{i−1} h_i^T)
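A quick numerical check of this rank-one covariance update (a sketch with made-up numbers):

    import numpy as np

    rng = np.random.default_rng(5)
    n = 4
    A = rng.standard_normal((n, n))
    P_prev = A @ A.T + np.eye(n)                       # an arbitrary positive definite P_{i-1}
    h = rng.standard_normal(n)                         # row vector h_i

    lhs = np.linalg.inv(np.linalg.inv(P_prev) + np.outer(h, h))
    rhs = P_prev - np.outer(P_prev @ h, h @ P_prev) / (1.0 + h @ P_prev @ h)
    print(np.allclose(lhs, rhs))                       # True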



  x̂_0 = 0

  x̂_i = P_i H_i^T Y_i
      = [P_{i−1} − (P_{i−1} h_i^T h_i P_{i−1}) / (1 + h_i P_{i−1} h_i^T)] [H_{i−1}^T Y_{i−1} + h_i^T y_i]
      = P_{i−1} H_{i−1}^T Y_{i−1} − [P_{i−1} h_i^T / (1 + h_i P_{i−1} h_i^T)] h_i P_{i−1} H_{i−1}^T Y_{i−1}
        + P_{i−1} h_i^T (1 − h_i P_{i−1} h_i^T / (1 + h_i P_{i−1} h_i^T)) y_i
      = x̂_{i−1} + [P_{i−1} h_i^T / (1 + h_i P_{i−1} h_i^T)] (y_i − h_i x̂_{i−1})

If W_i ≠ I, this modifies to (replace y_i by w_i^{1/2} y_i and h_i by w_i^{1/2} h_i):

  x̂_i = x̂_{i−1} + P_{i−1} h_i^T (w_i^{-1} + h_i P_{i−1} h_i^T)^{-1} (y_i − h_i x̂_{i−1})
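The recursion translates directly into code. Below is a minimal sketch of one RLS step for a scalar measurement (x_0 = 0, W_i = I; the streaming data are made up).

    import numpy as np

    def rls_update(x_hat, P, h, y):
        # One RLS step: h is a row vector of shape (n,), y is a scalar
        Ph = P @ h                                     # P_{i-1} h_i^T
        gain = Ph / (1.0 + h @ Ph)                     # P_{i-1} h_i^T / (1 + h_i P_{i-1} h_i^T)
        x_new = x_hat + gain * (y - h @ x_hat)         # innovation update
        P_new = P - np.outer(gain, Ph)                 # P_i = P_{i-1} - P h^T h P / (1 + h P h^T)
        return x_new, P_new

    # Made-up streaming data
    rng = np.random.default_rng(3)
    n = 3
    x_true = rng.standard_normal(n)
    x_hat, P = np.zeros(n), 100.0 * np.eye(n)          # x_hat_0 = 0, P_0 = Pi_0
    for i in range(200):
        h = rng.standard_normal(n)
        y = h @ x_true + 0.1 * rng.standard_normal()
        x_hat, P = rls_update(x_hat, P, h, y)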



Here we considered y_i to be a scalar and h_i to be a row vector. In general, y_i can be a k-dimensional vector and h_i a matrix with k rows.

RLS with Forgetting factor


Weight older data less: minimize

  J(x) = Σ_{j=1}^{i} β(i, j) (y_j − h_j x)^2

Exponential forgetting: β(i, j) = λ^{i−j}, λ < 1
Moving average: β(i, j) = 1 if |i − j| ≤ ∆, and β(i, j) = 0 otherwise
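For exponential forgetting, the usual implementation divides P by λ in the covariance update. A sketch (the value of λ is an assumption; with λ = 1 this reduces to the plain RLS step above):

    import numpy as np

    def rls_forgetting_update(x_hat, P, h, y, lam=0.98):
        # Exponentially weighted RLS step with forgetting factor lam < 1
        Ph = P @ h
        gain = Ph / (lam + h @ Ph)
        x_new = x_hat + gain * (y - h @ x_hat)
        P_new = (P - np.outer(gain, Ph)) / lam         # dividing by lam inflates P, downweighting old data
        return x_new, P_new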



Connection with Kalman Filtering

The above is also the Kalman filter estimate of the state for the following
system model:

  x_i = x_{i−1}
  y_i = h_i x_i + v_i,        v_i ~ N(0, R_i),    R_i = w_i^{-1}        (4)



Kalman Filter

RLS was for static data: the signal x is estimated better and better as more and more data comes in, e.g., estimating the mean intensity of an object from a video sequence. RLS with a forgetting factor assumes a slowly time-varying x.

Kalman filter: the signal is time varying, and we know (statistically) the dynamical model followed by the signal, e.g., tracking a moving object:

  x_0 ~ N(0, Π_0)
  x_i = F_i x_{i−1} + v_{x,i},        v_{x,i} ~ N(0, Q_i)

The observation model is as before:

  y_i = h_i x_i + v_i,        v_i ~ N(0, R_i)



Goal: get the best (minimum mean square error) estimate of x_i from Y_i.

Cost: J(x̂_i) = E[||x_i − x̂_i||^2 | Y_i]

Minimizer: the conditional mean, x̂_i = E[x_i | Y_i]

Since the posterior p(x_i | Y_i) is Gaussian, this is also the MAP estimate, i.e., x̂_i also maximizes p(x_i | Y_i).



Kalman filtering algorithm

At i = 0, x̂_0 = 0, P_0 = Π_0.

For any i, assume that we know x̂_{i−1} = E[x_{i−1} | Y_{i−1}] and P_{i−1} = Var(x_{i−1} | Y_{i−1}). Then

  E[x_i | Y_{i−1}] = F_i x̂_{i−1} ≜ x̂_{i|i−1}
  Var(x_i | Y_{i−1}) = F_i P_{i−1} F_i^T + Q_i ≜ P_{i|i−1}        (5)

This is the prediction step.



Filtering or correction step: given Y_{i−1}, x_i and y_i are jointly Gaussian, with

  x_i | Y_{i−1} ~ N(x̂_{i|i−1}, P_{i|i−1})
  y_i | x_i, Y_{i−1} = y_i | x_i ~ N(h_i x_i, R_i)

Using the formula for the conditional distribution of Z_1 | Z_2 when Z_1 and Z_2 are jointly Gaussian,

  E[x_i | Y_i] = x̂_{i|i−1} + P_{i|i−1} h_i^T (R_i + h_i P_{i|i−1} h_i^T)^{-1} (y_i − h_i x̂_{i|i−1})
  Var(x_i | Y_i) = P_{i|i−1} − P_{i|i−1} h_i^T (R_i + h_i P_{i|i−1} h_i^T)^{-1} h_i P_{i|i−1}

Set x̂_i = E[x_i | Y_i] and P_i = Var(x_i | Y_i).



Summarizing the algorithm

  x̂_{i|i−1} = F_i x̂_{i−1}
  P_{i|i−1} = F_i P_{i−1} F_i^T + Q_i
  x̂_i = x̂_{i|i−1} + P_{i|i−1} h_i^T (R_i + h_i P_{i|i−1} h_i^T)^{-1} (y_i − h_i x̂_{i|i−1})
  P_i = P_{i|i−1} − P_{i|i−1} h_i^T (R_i + h_i P_{i|i−1} h_i^T)^{-1} h_i P_{i|i−1}

For F_i = I, Q_i = 0 and R_i = w_i^{-1}, this reduces to the RLS algorithm.
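A minimal numpy sketch of these four recursions for a scalar measurement y_i; the function and argument names below are my own, not from the slides.

    import numpy as np

    def kalman_step(x_hat, P, y, F, Q, h, R):
        # Prediction
        x_pred = F @ x_hat
        P_pred = F @ P @ F.T + Q
        # Correction (scalar measurement y, row vector h)
        S = R + h @ P_pred @ h                         # innovation variance R_i + h_i P_{i|i-1} h_i^T
        K = P_pred @ h / S                             # gain P_{i|i-1} h_i^T / S
        x_new = x_pred + K * (y - h @ x_pred)
        P_new = P_pred - np.outer(K, h @ P_pred)
        return x_new, P_new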



Example Applications

• RLS:
  – Adaptive noise cancellation: given a noisy signal d_n assumed to satisfy d_n = u_n^T w + v_n, get the best estimate of the weight vector w. Here y_n = d_n, h_n = u_n^T, x = w.
  – Channel equalization using a training sequence.
  – Object intensity estimation: x = intensity, y_i = vector of intensities of the object region in frame i, h_i = 1_m (a column vector of m ones).

• Kalman filter: track a moving object (estimate its location and velocity at each time); a constant-velocity sketch is given below.
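As an illustration of the tracking application, here is a sketch of a 1-D constant-velocity tracker built on the kalman_step function above; the model matrices, noise levels and data are assumptions for the example, not values from the slides.

    import numpy as np
    # uses kalman_step from the previous sketch

    dt = 1.0
    F = np.array([[1.0, dt], [0.0, 1.0]])              # state = [position, velocity]
    Q = 0.01 * np.eye(2)                               # process noise covariance
    h = np.array([1.0, 0.0])                           # observe position only
    R = 0.5                                            # measurement noise variance

    rng = np.random.default_rng(4)
    x_true = np.array([0.0, 1.0])
    x_hat, P = np.zeros(2), np.eye(2)                  # x_hat_0 = 0, P_0 = Pi_0
    for i in range(100):
        x_true = F @ x_true + rng.multivariate_normal(np.zeros(2), Q)
        y = h @ x_true + np.sqrt(R) * rng.standard_normal()
        x_hat, P = kalman_step(x_hat, P, y, F, Q, h, R)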



Suggested Reading

• Chapters 2, 3 & 9 of Linear Estimation, by Kailath, Sayed, Hassibi

• Chapters 4 & 5 of An Introduction to Signal Detection and Estimation, by Vincent Poor

