Module-3.1 Static, Linear Inverse Problem - Nov-06
S. Lakshmivarahan
School of Computer Science
University of Oklahoma
Norman, Ok – 73069, USA
PROBLEM STATEMENT – A STRAIGHT-LINE PROBLEM
• A particle is moving in a straight line:
• Constant velocity, V – Not known
• Initial position, Z0 – Not known
• Observations of position Zi at time ti for 1 ≤ i ≤ m are available
TIME      t1  t2  …  ti  …  tm
POSITION  Z1  Z2  …  Zi  …  Zm
• Problem: Given the pairs (ti, Zi), 1 ≤ i ≤ m, estimate the unknowns Z0 and V
BUILD A LINEAR MODEL
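• To enable estimation of the unknowns, we need to build a relation, called the model,
between the knowns and the unknowns
• From basic physics relating time and motion:
Zi = Z0 + Vti (1)
must hold for each 1 ≤ i ≤ m
• In matrix-vector notation, (1) becomes
$$
Z = \begin{bmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_i \\ \vdots \\ Z_m \end{bmatrix}
  = \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_i \\ \vdots & \vdots \\ 1 & t_m \end{bmatrix}
    \begin{bmatrix} Z_0 \\ V \end{bmatrix}
  = Hx \qquad (2)
$$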
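The construction of H from the observation times can be sketched in a few lines of Python. This is only an illustration: the function names `build_model` and `predict`, the times, and the values of Z0 and V are assumptions chosen for the example, not part of the lecture.

```python
def build_model(times):
    """Return H with rows [1, t_i] for the model Z_i = Z0 + V * t_i."""
    return [[1.0, float(t)] for t in times]


def predict(H, x):
    """Return the matrix-vector product Hx as a list."""
    return [sum(h * xj for h, xj in zip(row, x)) for row in H]


times = [1, 2, 3]            # hypothetical observation times t_1..t_m
H = build_model(times)       # m x 2 matrix, one row per observation
x_true = [0.5, 2.0]          # hypothetical unknowns (Z0, V)
Z = predict(H, x_true)       # noise-free positions Z = Hx

print(H)
print(Z)
```

With noise-free data as above, Z lies exactly in the span of the columns of H; real observations generally do not.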
[Figure: H maps x in the model space R^n to Z = Hx in the observation space R^m; H^T maps ξ in the observation space back to ɳ = H^T ξ in the model space.]
ON SOLVING Z = Hx
• When m = n and H is non-singular, then
$x = H^{-1}Z$ (4)
• When m ≠ n, H is a rectangular matrix and the standard notion of
non-singularity does not apply
• Two cases arise:
m > n: overdetermined case, generally inconsistent
m < n: underdetermined case, infinitely many solutions
OVERDETERMINED CASE: m > n
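• m = 3 and n = 2, with
$$
H = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}
$$
• Z = Hx has the solution $x = (-1, 1)^T$ when $Z = (0, 1, 2)^T$, which lies in the span of the columns of H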
INCONSISTENT CASE: m > n
• Recall that the columns of H are defined by the mathematical model, but
the column Z of observations comes from real-world measurements
• Generally, observations have noise embedded in them and models
are only approximations to reality
• Hence, more often than not, Z does not belong to SPAN(H), the span of the columns of H
• In such cases Z = Hx has no solution, in the sense that there is no
vector x that satisfies the equation Z = Hx
ANOTHER LOOK AT INCONSISTENT CASE: m > n
• m = 3, n = 2, with
$$
H = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}
$$
• $Z = (2, 3.5, 4.2)^T$
• Z = Hx => x1 + x2 = 2, x1 + 2x2 = 3.5, x1 + 3x2 = 4.2
• Verify that $x_1 = \tfrac{1}{2}$ and $x_2 = \tfrac{3}{2}$ is the solution of the first two equations, but this
does not satisfy the third
• Verify that the solution of any two of these three equations does not
satisfy the remaining equation
• In this sense there is no solution to Z = Hx when m > n
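The pairwise check suggested above can be sketched as follows. The helper `solve_2x2` (Cramer's rule) and the dictionary `mismatches` are names introduced here for illustration only; H and Z are the values from this slide.

```python
from itertools import combinations


def solve_2x2(a, b):
    """Solve the 2x2 system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x1 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x2 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return x1, x2


H = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Z = [2.0, 3.5, 4.2]

mismatches = {}
for i, j in combinations(range(3), 2):
    k = 3 - i - j  # index of the equation that was left out
    x1, x2 = solve_2x2([H[i], H[j]], [Z[i], Z[j]])
    # How far the left-out equation is from being satisfied
    mismatches[k] = Z[k] - (H[k][0] * x1 + H[k][1] * x2)
    print(f"equations {i + 1},{j + 1}: x = ({x1:.3f}, {x2:.3f}), "
          f"mismatch in equation {k + 1} = {mismatches[k]:.3f}")
```

Every pair leaves a nonzero mismatch in the remaining equation, which is the sense in which the system is inconsistent.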
UNDERDETERMINED CASE: m < n
• m = 2, n = 3, with
$$
H = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 4 & 5 \end{bmatrix}
$$
• Z = Hx becomes
Z1 = x1 + 2x2 + 3x3
Z2 = x1 + 4x2 + 5x3
• Rewrite:
x1 + 2x2 = Z1 – 3x3
x1 + 4x2 = Z2 – 5x3
• For each x3 ∈ R, there is a pair $(x_1(x_3), x_2(x_3))^T$ that solves this pair of equations
• Z = Hx therefore has infinitely many solutions $(x_1(x_3), x_2(x_3), x_3)^T$
• Hence, there is no uniqueness when m < n
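For this particular H the one-parameter family can be written out explicitly. The following worked step is a sketch obtained by subtracting the first rewritten equation from the second and back-substituting:

```latex
\begin{aligned}
2x_2 &= (Z_2 - Z_1) - 2x_3
  &&\Rightarrow\quad x_2(x_3) = \tfrac{1}{2}(Z_2 - Z_1) - x_3, \\
x_1 &= Z_1 - 3x_3 - 2x_2
  &&\Rightarrow\quad x_1(x_3) = 2Z_1 - Z_2 - x_3 .
\end{aligned}
```

Each choice of x3 gives a different exact solution of Z = Hx.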
SUMMARY – LINEAR INVERSE PROBLEM
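• Z = Hx and H is of full rank
• Comparing m and n:
m = n: unique solution $x = H^{-1}Z$
m > n: overdetermined, in general no solution
m < n: underdetermined, infinitely many solutions
• Thus, we need to generalize the concept of solution for the two extreme cases
when m > n and m < n
• This generalized solution is called the least squares solution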
UNWEIGHTED LEAST SQUARES SOLUTION: m > n
• Define $r(x) = Z - Hx \in \mathbb{R}^m$, the residual vector
• Recall that when m > n, there is in general no $x \in \mathbb{R}^n$ for which r(x) = 0
• As a compromise, we seek $x \in \mathbb{R}^n$ for which the vector r(x) has
minimum length
• To this end, define
$$
f(x) = \|r(x)\|_2^2 = r^T(x)\, r(x) = \sum_{i=1}^{m} r_i^2(x)
$$
which is the square of the norm of the residual
• $r_i(x) = Z_i - H_{i*}x$, where $H_{i*}$ is the ith row of H, is the ith component of the residual vector
• Hence, f(x) is the sum of the squares of the components of the residual
vector
• The vector $x \in \mathbb{R}^n$ that minimizes f(x) is called the least squares
solution
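LEAST SQUARES METHOD: m > n
• Expanding f(x):
$$
\begin{aligned}
f(x) = r^T(x)\, r(x) &= (Z - Hx)^T (Z - Hx) \\
 &= (Z^T - (Hx)^T)(Z - Hx) \\
 &= (Z^T - x^T H^T)(Z - Hx) \\
 &= Z^T Z - Z^T H x - x^T H^T Z + x^T (H^T H) x \qquad (5)
\end{aligned}
$$
• Since $Z^T H x$ is a scalar, it equals its own transpose:
$$
Z^T H x = (Z^T H x)^T = x^T H^T Z \qquad (6)
$$
• Therefore,
$$
f(x) = Z^T Z - 2 Z^T H x + x^T (H^T H) x \qquad (7)
$$
• Find x that minimizes f(x) in (7)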
$H^TH$ IS SPD WHEN H IS OF FULL RANK
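• Since $H^TH = (H^TH)^T$, $H^TH$ is symmetric
• Consider
$$
x^T (H^T H) x = (x^T H^T)(Hx) = (Hx)^T (Hx) = \|Hx\|_2^2 \qquad (8)
$$
• Since m > n and H is of full rank, Rank(H) = n and the columns of H are linearly
independent
• That is, Hx = 0 exactly when x = 0, and Hx ≠ 0 otherwise
• Hence
$$
x^T (H^T H) x > 0 \ \text{for } x \neq 0, \qquad x^T (H^T H) x = 0 \ \text{only when } x = 0 \qquad (9)
$$
• $H^TH$ is therefore symmetric and positive definite (SPD)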
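As a concrete check, take the 3 × 2 matrix H with rows (1, 1), (1, 2), (1, 3) used in the overdetermined example. The following worked computation is a sketch of (8) and (9) for that specific H:

```latex
H^T H = \begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix},
\qquad
x^T (H^T H) x = 3x_1^2 + 12 x_1 x_2 + 14 x_2^2
             = (x_1 + x_2)^2 + (x_1 + 2x_2)^2 + (x_1 + 3x_2)^2 .
```

The right-hand side is a sum of squares, and it vanishes only if x1 + x2 = 0 and x1 + 2x2 = 0, that is, only if x = 0. Equivalently, the leading principal minors 3 and 3·14 − 6·6 = 6 are both positive.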
GRADIENT AND HESSIAN OF f(x)
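• Refer to f(x) in (7)
• $\nabla_x (Z^TZ) = 0$, $\nabla_x^2 (Z^TZ) = 0$
• $\nabla_x (2Z^THx) = 2\nabla_x (a^Tx)$ with $a = H^TZ$, so $\nabla_x (2Z^THx) = 2a = 2H^TZ$
• $\nabla_x^2 (2Z^THx) = 0$
• $\nabla_x (x^T(H^TH)x) = 2(H^TH)x$
• $\nabla_x^2 (x^T(H^TH)x) = 2H^TH$, which is SPD
• Combining:
$$
\text{Gradient of } f: \quad \nabla_x f(x) = -2H^TZ + 2(H^TH)x \qquad (10)
$$
$$
\text{Hessian of } f: \quad \nabla_x^2 f(x) = 2(H^TH) \qquad (11)
$$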
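The gradient formula (10) can be checked numerically against central differences. The sketch below uses the H and Z of the inconsistent example; the test point `x` and the function names are arbitrary choices made here for illustration.

```python
def residual(H, Z, x):
    """r(x) = Z - Hx."""
    return [z - sum(h * xj for h, xj in zip(row, x)) for row, z in zip(H, Z)]


def f(H, Z, x):
    """f(x) = r(x)^T r(x), the squared norm of the residual."""
    return sum(ri * ri for ri in residual(H, Z, x))


def grad_f(H, Z, x):
    """Gradient (10): -2 H^T Z + 2 (H^T H) x, computed as -2 H^T r(x)."""
    r = residual(H, Z, x)
    return [-2.0 * sum(row[j] * ri for row, ri in zip(H, r))
            for j in range(len(x))]


def numerical_grad(H, Z, x, step=1e-5):
    """Central-difference approximation of the gradient of f."""
    g = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += step
        xm[j] -= step
        g.append((f(H, Z, xp) - f(H, Z, xm)) / (2.0 * step))
    return g


H = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Z = [2.0, 3.5, 4.2]
x = [0.3, -0.7]  # arbitrary test point

analytic = grad_f(H, Z, x)
numeric = numerical_grad(H, Z, x)
print(analytic)
print(numeric)
```

Because f is quadratic, the central difference agrees with (10) up to rounding error.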
UNCONSTRAINED MINIMIZATION OF f(x) – NORMAL EQUATION
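• Setting $\nabla_x f(x) = -2H^TZ + 2(H^TH)x = 0$ gives the normal equations $(H^TH)x = H^TZ$
• Since $H^TH$ is SPD, it is non-singular and the least squares solution is $x_{LS} = (H^TH)^{-1}H^TZ$
• Since the Hessian $\nabla_x^2 f(x) = 2(H^TH)$ is SPD, f(x) is a strictly convex function and
hence the minimum is unique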
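A minimal sketch of solving the normal equations (HᵀH)x = HᵀZ, which follow from setting the gradient (10) to zero, for the two-parameter straight-line model. It uses the H and Z from the inconsistent example; the helper names `normal_equations` and `solve_2x2` are assumptions made here. Forming HᵀH explicitly is fine at this size, though for larger or ill-conditioned problems a QR-based solver is usually preferred.

```python
def normal_equations(H, Z):
    """Return A = H^T H and b = H^T Z."""
    n = len(H[0])
    A = [[sum(row[i] * row[j] for row in H) for j in range(n)]
         for i in range(n)]
    b = [sum(row[i] * z for row, z in zip(H, Z)) for i in range(n)]
    return A, b


def solve_2x2(A, b):
    """Solve a 2x2 system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]


H = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Z = [2.0, 3.5, 4.2]

A, b = normal_equations(H, Z)
x_ls = solve_2x2(A, b)

# Residual at the least squares solution and its projection onto the columns of H
r = [z - (row[0] * x_ls[0] + row[1] * x_ls[1]) for row, z in zip(H, Z)]
Ht_r = [sum(row[j] * ri for row, ri in zip(H, r)) for j in range(2)]

print(x_ls)   # approximately (1.0333, 1.1)
print(r)
print(Ht_r)   # approximately (0, 0)
```

The residual is not zero, but it is orthogonal to both columns of H, which is exactly what the normal equations state.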
MINIMUM RESIDUAL
• The minimum residual is $r(x_{LS}) = Z - Hx_{LS}$
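Setting the gradient (10) to zero gives x_LS = (HᵀH)⁻¹HᵀZ when H is of full rank. Substituting this into the residual yields the following sketch of its structure:

```latex
\begin{aligned}
r(x_{LS}) &= Z - H (H^T H)^{-1} H^T Z = \bigl(I - H (H^T H)^{-1} H^T\bigr) Z, \\
H^T r(x_{LS}) &= H^T Z - (H^T H)(H^T H)^{-1} H^T Z = 0 .
\end{aligned}
```

The minimum residual is therefore orthogonal to every column of H. It is zero only when Z already lies in the span of the columns of H.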
19
CONTOURS OF f(x) – GRAPHICAL METHOD
• Using the data in slide (19) we can get
$$
f(Z_0, V) = Z^TZ - 2Z^THx + x^T(H^TH)x
          = Z_0^2 + 3Z_0V + 3.5V^2 - 9Z_0 - 25V + 23
$$
• The contours of f(Z0, V), plotted using MATLAB, are given below
• The minimum is $Z_0^* = 1.5$, $V^* = 0.5$
WEIGHTED LEAST SQUARES: m > n
• Let $W \in \mathbb{R}^{m \times m}$ be an SPD matrix
• The weighted sum of squared errors:
$$
f_W(x) = (Z - Hx)^T W (Z - Hx)
$$
• W could be a diagonal matrix with different weights along the
diagonal, or a general SPD matrix
• Verify that the normal equations in this case are
$$
(H^TWH)x = H^TWZ
$$
• The weighted least squares solution is:
$$
x_{LS} = (H^TWH)^{-1}H^TWZ \qquad (17)
$$
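A minimal sketch of the weighted solution (17) for a diagonal W. The weights `w` are hypothetical and chosen only to show how trusting the third observation more pulls the fit toward it; H and Z are the values from the inconsistent example, and the helper names are assumptions made here.

```python
def weighted_normal_equations(H, Z, w):
    """Return A = H^T W H and b = H^T W Z for W = diag(w)."""
    n = len(H[0])
    A = [[sum(wi * row[i] * row[j] for row, wi in zip(H, w))
          for j in range(n)] for i in range(n)]
    b = [sum(wi * row[i] * z for row, z, wi in zip(H, Z, w))
         for i in range(n)]
    return A, b


def solve_2x2(A, b):
    """Solve a 2x2 system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]


H = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Z = [2.0, 3.5, 4.2]
w = [1.0, 1.0, 4.0]  # hypothetical weights: third observation trusted most

A, b = weighted_normal_equations(H, Z, w)
x_w = solve_2x2(A, b)

# Weighted residual check: H^T W r should vanish at the solution
r = [z - (row[0] * x_w[0] + row[1] * x_w[1]) for row, z in zip(H, Z)]
Ht_W_r = [sum(wi * row[j] * ri for row, ri, wi in zip(H, r, w))
          for j in range(2)]

print(x_w)
print(r)
print(Ht_W_r)
```

With W equal to the identity (all weights 1), the same code reduces to the unweighted normal equations.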
UNDERDETERMINED CASE: m < n
• Recall: There are infinitely many solutions
• r(x) = 0 for infinitely many $x \in \mathbb{R}^n$
• Unlike when m > n, in this case $f(x) = \|r(x)\|_2^2 = 0$ for every one of these solutions, so minimizing f(x) does not single out one of them
• Need a new approach
• To get a unique solution, formulate it as a constrained minimization
problem using the standard Lagrange multiplier method for
equality constrained problems (Module 5)
LAGRANGIAN FORMULATION: m < n
• Problem statement: Find $x \in \mathbb{R}^n$ such that $\|x\|_2^2$ is a minimum subject to the
constraint Z = Hx
• Let $\lambda \in \mathbb{R}^m$ and define the Lagrangian
$$
L(x, \lambda) = \|x\|_2^2 + \lambda^T (Z - Hx) \qquad (18)
$$
• The constrained minimization is now solved by finding the stationary point of
L(x, λ) with respect to $x \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}^m$, treated as an unconstrained problem
LAGRANGIAN METHOD: m < n
• The necessary conditions for the minimum are:
$$
\nabla_x L(x, \lambda) = 0, \qquad \nabla_\lambda L(x, \lambda) = 0
$$
• By solving these two equations in the two unknowns x, λ, we get the
optimal x and λ
• For L in (18):
$$
\nabla_x L(x, \lambda) = 2x - H^T\lambda = 0, \qquad
\nabla_\lambda L(x, \lambda) = Z - Hx = 0 \qquad (19)
$$
LEAST SQUARES SOLUTION: m < n
• Solving (19):
$$
x = \tfrac{1}{2} H^T \lambda \qquad (20)
$$
$$
Z = Hx = \tfrac{1}{2} H H^T \lambda \qquad (21)
$$
• From (21):
$$
\lambda = 2 (H H^T)^{-1} Z \qquad (22)
$$
• Using (22) in (20):
$$
x_{LS} = H^T (H H^T)^{-1} Z \qquad (23)
$$
• If H is of full rank, Rank(H) = m, then it can be verified that $HH^T$ is SPD
• $x_{LS}$ is computed in two steps:
• Solve the normal equations $(HH^T)y = Z$ for $y = (HH^T)^{-1}Z$
• Set $x_{LS} = H^Ty$
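The two-step computation can be sketched for the 2 × 3 matrix H of the underdetermined example. The observation vector `Z` below is hypothetical (the slides do not give one for this case), and the helper names are assumptions made here.

```python
def solve_2x2(A, b):
    """Solve a 2x2 system A y = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]


def min_norm_solution(H, Z):
    """Return x = H^T (H H^T)^{-1} Z for a full-rank H with m = 2 rows."""
    # Step 1: form H H^T (2 x 2) and solve (H H^T) y = Z
    HHt = [[sum(a * b for a, b in zip(ri, rj)) for rj in H] for ri in H]
    y = solve_2x2(HHt, Z)
    # Step 2: x = H^T y
    return [sum(H[i][j] * y[i] for i in range(len(H)))
            for j in range(len(H[0]))]


H = [[1.0, 2.0, 3.0], [1.0, 4.0, 5.0]]
Z = [6.0, 10.0]  # hypothetical observations

x_ls = min_norm_solution(H, Z)
Hx = [sum(h * xj for h, xj in zip(row, x_ls)) for row in H]

# Another exact solution of Z = Hx (obtained by setting x3 = 0), for comparison
x_other = [2.0, 2.0, 0.0]
norm_sq = lambda v: sum(vi * vi for vi in v)

print(x_ls)                              # approximately (2/3, 2/3, 4/3)
print(Hx)                                # reproduces Z
print(norm_sq(x_ls), norm_sq(x_other))   # the first is smaller
```

Both vectors satisfy Z = Hx exactly; the Lagrangian solution is the one with the smallest norm.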
RESIDUAL AT XLS
• The residual vanishes at the solution:
$$
r(x_{LS}) = Z - Hx_{LS} = Z - HH^T(HH^T)^{-1}Z = Z - Z = 0
$$
EXERCISES
6.1) Let x1 + x2 = 1, x1 + 2x2 = 3.5, x1 + 3x2 = 4.2
Solve any two and verify that this solution is not consistent with the
third equation
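6.2) Solve
$$
\begin{bmatrix} 1 & \bar{t} \\ \bar{t} & \overline{t^2} \end{bmatrix}
\begin{bmatrix} Z_0 \\ V \end{bmatrix}
= \begin{bmatrix} \bar{Z} \\ \overline{Zt} \end{bmatrix}
$$
and verify that the solution is given by
$$
V^* = \frac{\overline{Zt} - \bar{t}\,\bar{Z}}{\overline{t^2} - (\bar{t})^2},
\qquad Z_0^* = \bar{Z} - \bar{t}\,V^*
$$
where an overbar denotes the average over the m observations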
6.3) Using MATLAB, plot the contours of
$$
f(Z_0, V) = Z_0^2 + 3Z_0V + 3.5V^2 - 9Z_0 - 25V + 23
$$
Find the minimizer $(Z_0^*, V^*)$ graphically
6.4) Find the minimizer of
$$
f_W(x) = (Z - Hx)^T W (Z - Hx)
$$
and verify that
$$
x_{LS} = (H^TWH)^{-1}H^TWZ
$$
6.5) The generalized inverse of H is
$$
H^+ = \begin{cases} (H^TH)^{-1}H^T & \text{if } m > n \\ H^T(HH^T)^{-1} & \text{if } m < n \end{cases}
$$
when H is of full rank
Verify that $H^+$ satisfies the Moore-Penrose conditions (Module 3):
a) $HH^+H = H$
b) $H^+HH^+ = H^+$
c) $(HH^+)^T = HH^+$
d) $(H^+H)^T = H^+H$
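A numerical spot check of the four conditions for the overdetermined case can be sketched as follows. It uses the 3 × 2 matrix H from the earlier example; the helper names are assumptions made here, and a numerical check on one matrix does not replace the algebraic verification asked for in the exercise.

```python
def matmul(A, B):
    """Matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def inverse_2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def close(A, B, tol=1e-9):
    """True if A and B agree entrywise within tol."""
    return all(abs(a - b) <= tol
               for ra, rb in zip(A, B) for a, b in zip(ra, rb))


H = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Ht = transpose(H)

# m > n case: H+ = (H^T H)^{-1} H^T
H_plus = matmul(inverse_2x2(matmul(Ht, H)), Ht)

HHp = matmul(H, H_plus)    # 3 x 3
HpH = matmul(H_plus, H)    # 2 x 2

checks = {
    "a": close(matmul(HHp, H), H),
    "b": close(matmul(HpH, H_plus), H_plus),
    "c": close(transpose(HHp), HHp),
    "d": close(transpose(HpH), HpH),
}
print(checks)
```

For a full-rank H with m > n, H⁺H is the n × n identity, which makes conditions a), b), and d) immediate; condition c) follows from the symmetry of H(HᵀH)⁻¹Hᵀ.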
REFERENCES
• J. Lewis, S. Lakshmivarahan, S. Dhall (2006), Dynamic Data Assimilation: A Least Squares Approach, Cambridge University Press, Chapter 5