
Module-3.1 Static, Linear Inverse Problem - Nov-06

The document describes building a linear model to relate known observations to unknown parameters in order to estimate the unknowns from the observations. It discusses solving the linear inverse problem for different cases when the number of observations is greater than, equal to, or less than the number of unknowns. The least squares method is introduced as a way to find an approximate solution when there are more observations than unknowns by minimizing the residual between the observations and model predictions.


Module – 3.

STATIC, DETERMINISTIC LINEAR INVERSE
(WELL-POSED) PROBLEM

S. Lakshmivarahan
School of Computer Science
University of Oklahoma
Norman, OK 73069, USA
[email protected]

PROBLEM STATEMENT – A STRAIGHT-LINE PROBLEM
• A particle is moving in a straight line:
• Constant velocity, V – not known
• Initial position, Z0 – not known
• Observations of position Zi at time ti for 1 ≤ i ≤ m are available

TIME      t1  t2  …  ti  …  tm
POSITION  Z1  Z2  …  Zi  …  Zm

• Problem: Given the pair (ti, Zi), 1 ≤ i ≤ m, estimate the unknowns Z0, V

BUILD A LINEAR MODEL
• To enable estimation of the unknowns, we need to build a relation – called the model – between the knowns and the unknowns
• From basic Physics relating time and motion:
Zi = Z0 + Vti (1)
must hold for each 1 ≤ i ≤ m
• In matrix-vector notation, (1) becomes
Z = (Z1, Z2, …, Zi, …, Zm)^T = Hx,
where H = [1 t1; 1 t2; … ; 1 ti; … ; 1 tm] and x = (Z0, V)^T (2)

• Or Z = Hx, with Z ∈ R^m, H ∈ R^{m×2}, x ∈ R^2 (3)


• Equation (3) is a linear model
• Given (Z, H), finding x is the linear inverse problem
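As a quick sketch, the model-building step can be written in numpy; the observation times and the true parameters below are illustrative assumptions, not values from the text:

```python
import numpy as np

# Illustrative (assumed) observation times and true parameters
t = np.array([0.0, 1.0, 2.0, 3.0])
Z0_true, V_true = 1.5, 0.5            # initial position and constant velocity

# Model matrix H of equation (2): row i is [1, t_i]
H = np.column_stack([np.ones_like(t), t])
x = np.array([Z0_true, V_true])

# Forward model (3): Z_i = Z0 + V * t_i
Z = H @ x
print(Z.tolist())                     # [1.5, 2.0, 2.5, 3.0]
```

Solving the inverse problem means recovering x = (Z0, V) from Z and H.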
A GENERALIZATION – LINEAR MODEL
• Let Z ∈ Rm be the observation vector
• Rm is called the observation space
• Let x ∈ Rn be the unknown vector
• Rn is called the model space
• H ∈ R^{m×n} defines the relation between the model space and the observation space

• H maps the model space into the observation space: for x ∈ R^n, Z = Hx ∈ R^m
• The transpose H^T maps the observation space back into the model space: for ξ ∈ R^m, η = H^T ξ ∈ R^n
ON SOLVING Z = Hx
• When m = n and H is non-singular, then
x = H^(-1)Z (4)
• When m ≠ n, H is a rectangular matrix and the standard notion of non-singularity does not apply
• Two cases arise:
m > n – overdetermined case – inconsistent, in general
m < n – underdetermined case – infinitely many solutions

OVERDETERMINED CASE: m > n
• m = 3 and n = 2, H = [1 1; 1 2; 1 3]
• The columns of H are linearly independent
• SPAN(H) = the 2-D plane spanned by these two columns, a subspace of R^3
• Let Z = (0, 1, 2)^T. Since Z = (-1)·(1, 1, 1)^T + (1)·(1, 2, 3)^T, Z ∈ SPAN(H)
• Z = Hx has the solution x = (-1, 1)^T
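This consistent case can be checked numerically; a sketch in numpy using the matrix and observation vector above:

```python
import numpy as np

# Overdetermined but consistent system: Z lies in SPAN(H)
H = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Z = np.array([0.0, 1.0, 2.0])

# Least squares recovers the exact solution with zero residual
x, *_ = np.linalg.lstsq(H, Z, rcond=None)
print(np.round(x, 6).tolist())   # [-1.0, 1.0]
print(np.allclose(H @ x, Z))     # True: Z is in the column span of H
```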
INCONSISTENT CASE: m > n
• Recall that the columns of H are defined by the mathematical model, but the observation vector Z comes from real-world measurements
• Generally, observations have noise embedded in them, and models are only approximations to reality
• Hence, more often than not, Z does not belong to SPAN(H)
• In such cases Z = Hx has no solution, in the sense that there is no vector x that satisfies Z = Hx

ANOTHER LOOK AT INCONSISTENT CASE: m > n
• m = 3, n = 2, H = [1 1; 1 2; 1 3]
• Z = (2, 3.5, 4.2)^T
• Z = Hx => x1 + x2 = 2, x1 + 2x2 = 3.5, x1 + 3x2 = 4.2
• Verify that x1 = 1/2 and x2 = 3/2 solves the first two equations but does not satisfy the third: 1/2 + 3·(3/2) = 5 ≠ 4.2
• Verify that the solution of any two of these three equations does not satisfy the remaining equation
• In this sense there is no solution to Z = Hx when m > n
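The pairwise check described above can be automated; a sketch that solves each pair of equations exactly and tests the left-out equation:

```python
import numpy as np
from itertools import combinations

H = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Z = np.array([2.0, 3.5, 4.2])

# Solve every 2x2 subsystem exactly, then test the remaining equation
for i, j in combinations(range(3), 2):
    x = np.linalg.solve(H[[i, j]], Z[[i, j]])
    k = ({0, 1, 2} - {i, j}).pop()
    print((i, j), np.round(x, 6).tolist(),
          np.isclose(H[k] @ x, Z[k]))   # the last entry is always False
```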
UNDERDETERMINED CASE: m < n
• m = 2, n = 3, H = [1 2 3; 1 4 5]
• Z = Hx becomes
Z1 = x1 + 2x2 + 3x3
Z2 = x1 + 4x2 + 5x3
• Rewrite:
x1 + 2x2 = Z1 – 3x3
x1 + 4x2 = Z2 – 5x3
• For each x3 ∈ R, there is a pair (x1(x3), x2(x3)) that solves this 2×2 system
• Z = Hx has infinitely many solutions (x1(x3), x2(x3), x3)^T
• Hence, there is no uniqueness when m < n
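A numeric sketch of this one-parameter family of solutions; the observation vector Z = (1, 2) is an assumption for illustration:

```python
import numpy as np

H = np.array([[1.0, 2.0, 3.0],
              [1.0, 4.0, 5.0]])
Z = np.array([1.0, 2.0])        # hypothetical observations

# Fix x3 freely, then solve the remaining 2x2 system for (x1, x2)
A = H[:, :2]                    # [[1, 2], [1, 4]], non-singular
for x3 in (0.0, 1.0, -2.5):
    x12 = np.linalg.solve(A, Z - H[:, 2] * x3)
    x = np.append(x12, x3)
    print(x.tolist(), np.allclose(H @ x, Z))   # every choice solves Z = Hx
```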
SUMMARY – LINEAR INVERSE PROBLEM
• Z = Hx and H is of full rank:
m > n: overdetermined – Rank(H) = n – inconsistent system, no solution
m = n: Rank(H) = n – unique solution x = H^(-1)Z
m < n: underdetermined – Rank(H) = m – infinitely many solutions, no uniqueness
• Thus, we need to generalize the concept of solution for the two extreme cases m > n and m < n
• This generalized solution is called the least squares solution
UNWEIGHTED LEAST SQUARES SOLUTION: m > n
• Define r(x) = Z – Hx ∈ R^m – the residual vector
• Recall that when m > n, there is, in general, no x ∈ R^n for which r(x) = 0
• As a compromise, we seek the x ∈ R^n for which the vector r(x) has minimum length
• To this end, define f(x) = ||r(x)||₂² = r^T(x) r(x) = Σᵢ₌₁^m rᵢ²(x), the square of the norm of the residual
• rᵢ(x) = Zᵢ – Hᵢ*x, where Hᵢ* is the ith row of H, is the ith component of the residual vector
• Hence, f(x) is the sum of the squares of the components of the residual vector
• The vector x ∈ R^n that minimizes f(x) is called the least squares solution
LEAST SQUARES METHOD: m > n
• f(x) = rT(x)r(x) = (Z – Hx)T(Z – Hx)
= (ZT – (Hx)T)(Z – Hx)
= (ZT – xTHT)(Z – Hx)
= ZTZ – ZTHx – xTHTZ + xT(HTH)x (5)
• ZTHx being a scalar: ZTHx = (ZTHx)T
= xTHTZ (6)
• Therefore, f(x) = ZTZ – 2ZTHx + xT(HTH)x (7)
• Find x that minimizes f(x) in (7)

H^T H IS SPD WHEN H IS OF FULL RANK
• Since HTH = (HTH)T, HTH is symmetric
• Consider x^T(H^T H)x = (x^T H^T)(Hx) = (Hx)^T(Hx) = ||Hx||₂² (8)
• Since m > n, Rank(H) = n and the columns of H are linearly
independent
• That is, Hx = 0 exactly when x = 0, and Hx ≠ 0 otherwise
• Hence x^T(H^T H)x > 0 for x ≠ 0, and = 0 only when x = 0 (9)
• Therefore (H^T H) is symmetric positive definite (SPD)
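A numeric spot-check of this property, using the 3×2 matrix from the earlier examples:

```python
import numpy as np

# Full-rank H with m > n: H^T H should be symmetric positive definite
H = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
G = H.T @ H

print(np.allclose(G, G.T))                       # True: symmetric
print(bool(np.all(np.linalg.eigvalsh(G) > 0)))   # True: eigenvalues > 0
np.linalg.cholesky(G)                            # succeeds only for SPD matrices
```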
GRADIENT AND HESSIAN OF f(x)
• Refer to f(x) in (7)
• ∇ₓ(Z^T Z) = 0, ∇ₓ²(Z^T Z) = 0
• ∇ₓ(2Z^T Hx) = 2∇ₓ(a^T x) with a = H^T Z, so ∇ₓ(2Z^T Hx) = 2a = 2H^T Z
• ∇ₓ²(2Z^T Hx) = 0
• ∇ₓ(x^T(H^T H)x) = 2(H^T H)x
• ∇ₓ²(x^T(H^T H)x) = 2H^T H – SPD
• Combining:
Gradient of f: ∇ₓ f(x) = -2H^T Z + 2(H^T H)x -> (10)
Hessian of f: ∇ₓ² f(x) = 2(H^T H) -> (11)
UNCONSTRAINED MINIMIZATION OF f(x) – NORMAL EQUATIONS
• Set ∇ₓ f(x) = -2H^T Z + 2(H^T H)x = 0
• The least squares solution is the solution of the normal equations, a linear, symmetric, positive definite system: (H^T H)x = H^T Z -> (12)
• Or xLS = (H^T H)^(-1) H^T Z = H⁺Z -> (13)
H⁺ = (H^T H)^(-1) H^T – the generalized inverse of H -> (14)
• Since the Hessian ∇ₓ² f(x) = 2(H^T H) is SPD, f(x) is a convex function and hence the minimum is unique
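A sketch of solving the normal equations for the inconsistent 3×2 example given earlier; solving the SPD system is preferred to forming (H^T H)^(-1) explicitly:

```python
import numpy as np

H = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Z = np.array([2.0, 3.5, 4.2])

# Normal equations (H^T H) x = H^T Z, solved without an explicit inverse
x_ls = np.linalg.solve(H.T @ H, H.T @ Z)

# np.linalg.lstsq minimizes the same ||Z - Hx||^2 and must agree
x_ref, *_ = np.linalg.lstsq(H, Z, rcond=None)
print(np.allclose(x_ls, x_ref))   # True
```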
MINIMUM RESIDUAL
• The minimum residual: r(xLS) = Z – HxLS
• Using (13): r(xLS) = [I - H(H^T H)^(-1) H^T]Z ≠ 0 -> (15)
• Herein lies the difference between the classical solution, where r(x) = 0, and the least squares solution, where r(xLS) ≠ 0, for the overdetermined case
• Verify that f(xLS) = ||r(xLS)||₂² = Z^T[I - H(H^T H)^(-1) H^T]Z -> (16)
which is the minimum value of the sum of squared errors (SSE)
AN ILLUSTRATION – THE STRAIGHT-LINE PROBLEM
• H = [1 t1; 1 t2; … ; 1 tm]
• H^T H = [ m          Σᵢ₌₁^m tᵢ ;
            Σᵢ₌₁^m tᵢ  Σᵢ₌₁^m tᵢ² ]
• H^T Z = [ Σᵢ₌₁^m Zᵢ ; Σᵢ₌₁^m Zᵢtᵢ ]
ILLUSTRATION CONTINUED
• Normal equations: (H^T H)x = H^T Z
[ m          Σᵢ₌₁^m tᵢ  ] [Z0]   [ Σᵢ₌₁^m Zᵢ   ]
[ Σᵢ₌₁^m tᵢ  Σᵢ₌₁^m tᵢ² ] [V ] = [ Σᵢ₌₁^m Zᵢtᵢ ]
• Dividing by m gives
[ 1   t̄  ] [Z0]   [ Z̄  ]
[ t̄   t²̄ ] [V ] = [ Zt̄ ]
where t̄ = (1/m)Σᵢ₌₁^m tᵢ, t²̄ = (1/m)Σᵢ₌₁^m tᵢ², Z̄ = (1/m)Σᵢ₌₁^m Zᵢ, Zt̄ = (1/m)Σᵢ₌₁^m Zᵢtᵢ
• Solution: V* = (Zt̄ – t̄ Z̄)/(t²̄ – (t̄)²), Z0* = Z̄ – t̄ V*
• SSE = f(Z0*, V*) = Σᵢ₌₁^m [Zᵢ – (Z0* + V* tᵢ)]² is the minimum value of the sum of squared errors
• RMS error = [SSE/m]^(1/2) = [f(Z0*, V*)/m]^(1/2) is a measure of the quality of the linear fit
NUMERICAL EXAMPLE – ALGEBRAIC METHOD
• m = 4, n = 2, H = [1 0; 1 1; 1 2; 1 3], Z = (1.0, 3.0, 2.0, 3.0)^T
• t̄ = 1.5, t²̄ = 3.5, Z̄ = 2.25, Zt̄ = 4
• Normal equations: [1 1.5; 1.5 3.5][Z0; V] = [2.25; 4]
• Solution: V* = 0.5, Z0* = 1.5
• Fitted/assimilated model: Zᵢ = 1.5 + 0.5tᵢ
• SSE = 1.5, RMS error = 0.6124
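The algebraic result can be reproduced numerically; the times t = (0, 1, 2, 3) are assumed here, since they reproduce the stated means t̄ = 1.5, t²̄ = 3.5, Z̄ = 2.25, Zt̄ = 4:

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0])    # assumed observation times
Z = np.array([1.0, 3.0, 2.0, 3.0])
H = np.column_stack([np.ones_like(t), t])

x_ls, *_ = np.linalg.lstsq(H, Z, rcond=None)
Z0_star, V_star = x_ls
print(round(Z0_star, 6), round(V_star, 6))   # 1.5 0.5

sse = float(np.sum((Z - H @ x_ls) ** 2))
rms = (sse / len(Z)) ** 0.5
print(round(sse, 6), round(rms, 4))          # 1.5 0.6124
```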

CONTOURS OF f(x) – GRAPHICAL METHOD
• Using the data from the numerical example above, we get
f(Z0, V) = Z^T Z – 2Z^T Hx + x^T(H^T H)x
= 4Z0² + 12Z0V + 14V² – 18Z0 – 32V + 23
• The contours of f(Z0, V) can be plotted using MATLAB
• The minimum is Z0* = 1.5, V* = 0.5
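A sketch of the grid evaluation behind such a contour plot (numpy; matplotlib's contour() would render f); the times t = (0, 1, 2, 3) are assumed, as in the numerical example:

```python
import numpy as np

# Evaluate f(Z0, V) = ||Z - Hx||^2 on a grid and locate its minimum
t = np.array([0.0, 1.0, 2.0, 3.0])    # assumed observation times
Z = np.array([1.0, 3.0, 2.0, 3.0])

Z0_grid, V_grid = np.meshgrid(np.linspace(0, 3, 301), np.linspace(-1, 2, 301))
# Residuals Z_i - (Z0 + V t_i), broadcast over the whole grid
resid = Z[:, None, None] - (Z0_grid[None, :, :] + t[:, None, None] * V_grid[None, :, :])
f = np.sum(resid ** 2, axis=0)

i, j = np.unravel_index(np.argmin(f), f.shape)
print(round(float(Z0_grid[i, j]), 6), round(float(V_grid[i, j]), 6))   # 1.5 0.5
```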

WEIGHTED LEAST SQUARES: m > n
• Let W ∈ R^{m×m} be an SPD matrix
• The weighted sum of squared errors:
f_W(x) = (Z – Hx)^T W (Z – Hx)
• W could be a diagonal matrix with different weights along the diagonal, or a general SPD matrix
• Verify that the normal equations in this case are
(H^T WH)x = H^T WZ
• The weighted least squares solution is:
xLS = (H^T WH)^(-1) H^T WZ -> (17)
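A sketch with a hypothetical diagonal weight matrix, reusing the straight-line data; the weights are an assumption for illustration:

```python
import numpy as np

# Weighted least squares: a larger weight means the corresponding
# observation is trusted more in the fit
H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Z = np.array([1.0, 3.0, 2.0, 3.0])
W = np.diag([1.0, 10.0, 1.0, 1.0])   # trust the second observation more

# Weighted normal equations: (H^T W H) x = H^T W Z
x_w = np.linalg.solve(H.T @ W @ H, H.T @ W @ Z)
x_ols = np.linalg.solve(H.T @ H, H.T @ Z)

# Up-weighting observation 2 shrinks its residual relative to ordinary LS
print(abs(Z[1] - H[1] @ x_w) < abs(Z[1] - H[1] @ x_ols))   # True
```

With W = I the weighted solution reduces to the ordinary least squares solution.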

UNDERDETERMINED CASE: m < n
• Recall: there are infinitely many solutions
• r(x) = 0 for infinitely many x ∈ R^n
• Unlike when m > n, in this case f(x) = ||r(x)||₂² = 0 at every solution
• We need a new approach
• To get a unique solution, formulate it as a constrained minimization problem using the standard Lagrangian multiplier method for equality-constrained problems (Module 5)

LAGRANGIAN FORMULATION: m < n
• Problem statement: find x ∈ R^n such that ||x||² is a minimum subject to the constraint Z = Hx
• Let λ ∈ R^m and define the Lagrangian
L(x, λ) = ||x||² + λ^T(Z – Hx) -> (18)
• The above constrained minimization is now solved by finding the stationary point of L(x, λ) with respect to x ∈ R^n and λ ∈ R^m as an unconstrained problem

LAGRANGIAN METHOD: m < n
• The necessary conditions for the minimum are:
∇ₓ L(x, λ) = 0
∇_λ L(x, λ) = 0
• Solving these two equations in the two unknowns x and λ gives the optimal x and λ
• For L in (18):
∇ₓ L(x, λ) = 2x – H^T λ = 0
∇_λ L(x, λ) = Z – Hx = 0    -> (19)

LEAST SQUARES SOLUTION: m < n
• Solving (19): x = (1/2)H^T λ -> (20)
Z = Hx = (1/2)HH^T λ -> (21)
• From (21): λ = 2(HH^T)^(-1)Z -> (22)
• Using (22) in (20):
xLS = H^T(HH^T)^(-1)Z -> (23)
• If H is of full rank, Rank(H) = m, then it can be verified that (HH^T) is SPD
• xLS is computed in two steps:
• Solve the normal equations (HH^T)y = Z to find y = (HH^T)^(-1)Z
• xLS = H^T y
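The two-step procedure can be sketched in numpy; the observations Z = (1, 2) are an assumption for illustration:

```python
import numpy as np

# Minimum-norm solution of an underdetermined system (m = 2 < n = 3)
H = np.array([[1.0, 2.0, 3.0],
              [1.0, 4.0, 5.0]])
Z = np.array([1.0, 2.0])           # hypothetical observations

y = np.linalg.solve(H @ H.T, Z)    # step 1: solve (H H^T) y = Z
x_ls = H.T @ y                     # step 2: x_LS = H^T y

print(np.allclose(H @ x_ls, Z))    # True: the residual is exactly zero
# Among all solutions, x_LS has the smallest 2-norm; compare with the
# particular solution obtained by fixing x3 = 1
x_other = np.append(np.linalg.solve(H[:, :2], Z - H[:, 2] * 1.0), 1.0)
print(np.linalg.norm(x_ls) <= np.linalg.norm(x_other))   # True
```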
RESIDUAL AT xLS
• r(xLS) = Z – HxLS = Z – HH^T(HH^T)^(-1)Z = Z – Z = 0
• This is to be expected, since we started from the infinitely many solutions for which r(x) = 0

EXERCISES
6.1) Let x1 + x2 = 1, x1 + 2x2 = 3.5, x1 + 3x2 = 4.2
Solve any two equations and verify that this solution is not consistent with the third equation
6.2) Solve [1 t̄; t̄ t²̄][Z0; V] = [Z̄; Zt̄]
and verify that the solution is given by V* = (Zt̄ – t̄ Z̄)/(t²̄ – (t̄)²), Z0* = Z̄ – t̄ V*
6.3) Using MATLAB, plot the contours of
f(Z0, V) = 4Z0² + 12Z0V + 14V² – 18Z0 – 32V + 23
and find the minimizer (Z0*, V*) graphically
EXERCISES
6.4) Find the minimizer of
f_W(x) = (Z – Hx)^T W (Z – Hx)
and verify that
xLS = (H^T WH)^(-1) H^T WZ
6.5) The generalized inverse of H is
H⁺ = (H^T H)^(-1) H^T if m > n
   = H^T(HH^T)^(-1) if m < n
when H is of full rank
Verify that H⁺ satisfies the Moore–Penrose conditions (Module – 3):
a) HH⁺H = H
b) H⁺HH⁺ = H⁺
c) (HH⁺)^T = HH⁺
d) (H⁺H)^T = H⁺H
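A numeric spot-check of the four conditions for the overdetermined form of the generalized inverse (a sketch, not a proof of exercise 6.5):

```python
import numpy as np

# H+ = (H^T H)^{-1} H^T for a full-rank H with m > n
H = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
H_plus = np.linalg.solve(H.T @ H, H.T)      # avoids an explicit inverse

print(np.allclose(H @ H_plus @ H, H))                # a) H H+ H = H
print(np.allclose(H_plus @ H @ H_plus, H_plus))      # b) H+ H H+ = H+
print(np.allclose((H @ H_plus).T, H @ H_plus))       # c) H H+ symmetric
print(np.allclose((H_plus @ H).T, H_plus @ H))       # d) H+ H symmetric
print(np.allclose(H_plus, np.linalg.pinv(H)))        # agrees with numpy's pinv
```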
REFERENCES
• J. Lewis, S. Lakshmivarahan, and S. Dhall (2006), Dynamic Data Assimilation: A Least Squares Approach, Cambridge University Press, Chapter 5
