Lecture 4

The document discusses singular value decomposition (SVD) as a method for analyzing and solving linear discrete ill-posed problems. SVD decomposes the matrix G relating the data and model spaces into three matrices: U, S, and V. U and V are orthogonal matrices that span the data and model spaces, respectively. S is a diagonal matrix containing the singular values of G. For ill-posed problems, some singular values will be zero, creating null spaces in the data and model. The SVD provides insight into resolving the data and uniqueness of solutions to the inverse problem.
Proof: Minimum Length solution

Constrained minimization

\min_m \; L(m) = m^T m \quad \text{subject to} \quad d = Gm

Lagrange multipliers lead to unconstrained minimization of

\phi(m, \lambda) = m^T m + \lambda^T (d - Gm) = \sum_{j=1}^{M} m_j^2 + \sum_{i=1}^{N} \lambda_i \Big( d_i - \sum_{j=1}^{M} G_{i,j} m_j \Big)

Setting the derivatives to zero:

\frac{\partial \phi}{\partial \lambda} = d - Gm = 0, \qquad \frac{\partial \phi}{\partial \lambda_i} = d_i - \sum_{j=1}^{M} G_{i,j} m_j

\frac{\partial \phi}{\partial m} = 2m - G^T \lambda = 0, \qquad \frac{\partial \phi}{\partial m_j} = 2 m_j - \sum_{i=1}^{N} \lambda_i G_{i,j}

\Rightarrow m = \tfrac{1}{2} G^T \lambda

\Rightarrow d = Gm = \tfrac{1}{2} G G^T \lambda

\Rightarrow \lambda = 2 (G G^T)^{-1} d

\Rightarrow m_{ML} = G^T (G G^T)^{-1} d
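A minimal numerical check (not from the lecture; the 2 x 4 matrix G and data d below are hypothetical placeholders): the formula m = G^T (G G^T)^{-1} d fits the data exactly and agrees with numpy's minimum-norm solution.

```python
import numpy as np

# Minimal sketch: hypothetical under-determined system (N=2 data, M=4 unknowns).
rng = np.random.default_rng(0)
G = rng.standard_normal((2, 4))
d = rng.standard_normal(2)

m_ml = G.T @ np.linalg.solve(G @ G.T, d)      # minimum length solution G^T (G G^T)^-1 d
m_np = np.linalg.lstsq(G, d, rcond=None)[0]   # numpy's minimum-norm solution

print(np.allclose(G @ m_ml, d))   # True: the data are fit exactly
print(np.allclose(m_ml, m_np))    # True: same minimum-norm model
```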
Minimum Length and least squares solutions

m_{ML} = G^T (G G^T)^{-1} d \qquad\qquad m_{LS} = (G^T G)^{-1} G^T d

Writing a generic estimate as m^{est} = G^{-g} d for some generalized inverse G^{-g}:

Data resolution matrix:   d^{pre} = D d^{obs}, \quad D = G G^{-g}
Model resolution matrix:  m^{est} = R m^{true}, \quad R = G^{-g} G

Minimum length:   D = G G^T (G G^T)^{-1} = I \qquad  R = G^T (G G^T)^{-1} G \neq I
Least squares:    D = G (G^T G)^{-1} G^T \neq I \qquad  R = (G^T G)^{-1} G^T G = I

There is a symmetry between the least squares and minimum length solutions.
Least squares solves the completely over-determined problem and has perfect
model resolution, while minimum length solves the completely under-determined
problem and has perfect data resolution. For mixed-determined problems all
solutions lie between these two extremes.
Singular value decomposition

SVD is a method of analyzing and solving linear discrete ill-posed problems.

At its heart is the Lanczos decomposition of the matrix G


d = Gm

G = U S V^T

G = [u_1, u_2, \ldots, u_N] \; S \; [v_1, v_2, \ldots, v_M]^T

where G is N \times M, U is N \times N, S is N \times M and V^T is M \times M.

U is an N x N orthonormal matrix whose columns span the data space

V is an M x M orthonormal matrix whose columns span the model space

S is an N x M diagonal matrix with non-negative elements → the singular values

U U^T = U^T U = I_N

V V^T = V^T V = I_M

Ill-posed problems arise when some of the singular values are zero

Lanczos (1977). For a discussion see Ch. 4 of Aster et al. (2004).
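A minimal numpy sketch of the decomposition (the 3 x 4 matrix below is a hypothetical example, not the lecture's kernel): np.linalg.svd returns U, the singular values, and V^T, from which the N x M matrix S can be rebuilt.

```python
import numpy as np

# Hypothetical 3x4 example matrix (not the lecture's kernel).
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 1., 0., 0.]])

U, s, Vt = np.linalg.svd(G)        # full SVD: U is 3x3, Vt is 4x4, s holds singular values
S = np.zeros(G.shape)
S[:len(s), :len(s)] = np.diag(s)   # embed the singular values in the N x M matrix S

print(np.allclose(G, U @ S @ Vt))          # True: G = U S V^T
print(np.allclose(U @ U.T, np.eye(3)))     # True: U U^T = I_N
print(np.allclose(Vt.T @ Vt, np.eye(4)))   # True: V V^T = I_M
```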
Singular value decomposition

Given G, how do we calculate the matrices U, V and S ?

G = U S V^T, \qquad U = [u_1 | u_2 | \ldots | u_N], \qquad V = [v_1 | v_2 | \ldots | v_M]

It can be shown that the columns of U are the eigenvectors of the matrix G G^T

G G^T u_i = s_i^2 u_i \qquad \text{(try to prove this!)}

It can be shown that the columns of V are the eigenvectors of the matrix G^T G

G^T G v_i = s_i^2 v_i \qquad \text{(try to prove this!)}

The eigenvalues, s_i^2, are the squares of the diagonal elements of the N x M matrix S.
If N > M:

S = \begin{bmatrix}
s_1 & 0 & \cdots & 0 \\
0 & s_2 & & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & s_M \\
0 & 0 & \cdots & 0 \\
\vdots & & & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix}

If M > N:

S = \begin{bmatrix}
s_1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
0 & s_2 & & 0 & 0 & \cdots & 0 \\
\vdots & & \ddots & & \vdots & & \vdots \\
0 & 0 & \cdots & s_N & 0 & \cdots & 0
\end{bmatrix}
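A quick numerical check of these eigenvector relations, using the same hypothetical 3 x 4 matrix as above:

```python
import numpy as np

# Same hypothetical 3x4 matrix as above.
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 1., 0., 0.]])

U, s, Vt = np.linalg.svd(G)
for i in range(len(s)):
    u_i, v_i = U[:, i], Vt[i, :]
    print(np.allclose(G @ G.T @ u_i, s[i]**2 * u_i))   # True: G G^T u_i = s_i^2 u_i
    print(np.allclose(G.T @ G @ v_i, s[i]**2 * v_i))   # True: G^T G v_i = s_i^2 v_i
```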
Singular value decomposition

Suppose the first p singular values are non-zero. Then the N x M (non-square) matrix S can be written in partitioned form

S = \begin{bmatrix} S_p & 0 \\ 0 & 0 \end{bmatrix}

By convention we order the singular values s_1 \ge s_2 \ge \cdots \ge s_p.

S_p = \begin{bmatrix}
s_1 & 0 & \cdots & 0 \\
0 & s_2 & & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & s_p
\end{bmatrix}
\qquad
U = [u_1 | u_2 | \ldots | u_N], \qquad V = [v_1 | v_2 | \ldots | v_M]

where the submatrix S_p is a p x p diagonal matrix containing the non-zero singular values, and p_{max} = \min(N, M).
Singular value decomposition
If only the first p singular values are nonzero we write
" #
Sp 0
G = [Up | Uo] [Vp | Vo]T
0 0
U_p represents the first p columns of U
U_o represents the last N-p columns of U → a data null space is created

V_p represents the first p columns of V
V_o represents the last M-p columns of V → a model null space is created
Properties

U_p^T U_o = 0 \qquad U_o^T U_p = 0 \qquad V_p^T V_o = 0 \qquad V_o^T V_p = 0

U_p^T U_p = I \qquad U_o^T U_o = I \qquad V_p^T V_p = I \qquad V_o^T V_o = I

Since the columns of U_o and V_o are multiplied by zeros, we get the compact form for G

G = U_p S_p V_p^T
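A minimal sketch of the partitioned/compact form in numpy (the rank-deficient 3 x 4 matrix below is a hypothetical example): split the SVD into (U_p, S_p, V_p) and (U_o, V_o) and verify G = U_p S_p V_p^T.

```python
import numpy as np

# Hypothetical rank-deficient example: the third row equals row 1 + row 2.
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 1., 1., 1.]])

U, s, Vt = np.linalg.svd(G)
p = int(np.sum(s > 1e-10 * s[0]))       # number of non-zero singular values

Up, Uo = U[:, :p], U[:, p:]             # range and null-space parts of U
Sp = np.diag(s[:p])
Vp, Vo = Vt[:p, :].T, Vt[p:, :].T       # range and null-space parts of V

print(p)                                 # 2
print(np.allclose(G, Up @ Sp @ Vp.T))    # True: compact form reproduces G
print(np.allclose(Up.T @ Uo, 0))         # True: Up^T Uo = 0
```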
Model null space

Consider a vector made up of a linear combination of the columns of Vo

m_v = \sum_{i=p+1}^{M} \lambda_i v_i

The model m_v lies in the space spanned by the columns of V_o.

G m_v = \sum_{i=p+1}^{M} \lambda_i \, U_p S_p V_p^T v_i = 0

So any model of this type has no effect on the data. It lies in the
model null space!
Where have we seen this before ?

Consequence: if any solution exists to the inverse problem, then an infinite number of solutions exist.

Assume the model m_{ls} fits the data: G m_{ls} = d^{obs}. Then

G(m_{ls} + m_v) = G m_{ls} + G m_v = d^{obs} + 0

This is the uniqueness question of Backus and Gilbert.

The data cannot constrain models in the model null space.
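A small numerical illustration (same hypothetical rank-2 matrix as above): any combination of the columns of V_o maps to zero data, so it can be added to a fitting model without changing the predictions.

```python
import numpy as np

# Same hypothetical rank-2 matrix as above.
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 1., 1., 1.]])

U, s, Vt = np.linalg.svd(G)
p = int(np.sum(s > 1e-10 * s[0]))
Vo = Vt[p:, :].T                          # basis for the model null space (M x (M-p))

m_v = Vo @ np.array([2.0, -1.5])          # an arbitrary null-space model
print(np.allclose(G @ m_v, 0))            # True: no effect on the data

d = np.array([1., 2., 3.])                # consistent data (row3 = row1 + row2)
m_fit = np.linalg.lstsq(G, d, rcond=None)[0]
print(np.allclose(G @ m_fit, G @ (m_fit + m_v)))   # True: identical predictions
```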
Example: tomography

Idealized tomographic experiment

\delta d = G \, \delta m

G = \begin{bmatrix}
G_{1,1} & G_{1,2} & G_{1,3} & G_{1,4} \\
\vdots & \vdots & \vdots & \vdots \\
\vdots & \vdots & \vdots & \vdots
\end{bmatrix}
What are the entries of G ?

Example: tomography

Using rays 1-4: \quad \delta d = G \, \delta m


G = \begin{bmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
0 & \sqrt{2} & \sqrt{2} & 0 \\
\sqrt{2} & 0 & 0 & \sqrt{2}
\end{bmatrix}

G^T G = \begin{bmatrix}
3 & 0 & 1 & 2 \\
0 & 3 & 2 & 1 \\
1 & 2 & 3 & 0 \\
2 & 1 & 0 & 3
\end{bmatrix}

This has eigenvalues 6, 4, 2, 0.

V_p = \begin{bmatrix}
0.5 & -0.5 & -0.5 \\
0.5 & 0.5 & 0.5 \\
0.5 & 0.5 & -0.5 \\
0.5 & -0.5 & 0.5
\end{bmatrix}
\qquad
V_o = \begin{bmatrix}
0.5 \\
0.5 \\
-0.5 \\
-0.5
\end{bmatrix}
\qquad
G v_o = 0

What type of change does the null space vector correspond to ?

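The numbers above can be checked directly; a minimal numpy sketch using the lecture's 4 x 4 kernel:

```python
import numpy as np

# The 4x4 tomography kernel from the lecture.
r2 = np.sqrt(2.0)
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., r2, r2, 0.],
              [r2, 0., 0., r2]])

evals, evecs = np.linalg.eigh(G.T @ G)    # symmetric eigendecomposition
print(np.round(evals, 10))                # ~[0. 2. 4. 6.]

v_o = np.array([0.5, 0.5, -0.5, -0.5])
print(np.allclose(G @ v_o, 0))            # True: v_o lies in the model null space
```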
Worked example: Eigenvectors
s_1^2 = 6 \qquad s_2^2 = 4

V_p = \begin{bmatrix}
0.5 & -0.5 & -0.5 \\
0.5 & 0.5 & 0.5 \\
0.5 & 0.5 & -0.5 \\
0.5 & -0.5 & 0.5
\end{bmatrix}

s_3^2 = 2 \qquad s_4^2 = 0

V_o = \begin{bmatrix}
0.5 \\
0.5 \\
-0.5 \\
-0.5
\end{bmatrix}
Data null space

Consider a data vector with at least one component in Uo

d^{obs} = d_o + \lambda_i u_i \qquad (i > p)

For any model space vector m we have

d^{pre} = G m = U_p S_p V_p^T m = U_p a, \qquad \text{where } a = S_p V_p^T m

For the model to fit the data we must have d^{obs} = d^{pre}:

d_o + \lambda_i u_i = \sum_{j=1}^{p} a_j u_j

Since u_i (i > p) is orthogonal to u_1, \ldots, u_p, no choice of a can satisfy this.

Where have we seen this before?

So data of this type cannot be fit by any model. The data has a
component in the data null space!

Consequence: no model exists that can fit the data exactly.

This is the existence question of Backus and Gilbert.

All this depends on the structure of the kernel matrix G!
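A minimal sketch of the existence problem (the over-determined 3 x 2 matrix below is a hypothetical example): data with a component along U_o cannot be fit exactly; the generalized inverse only reproduces the part of the data in the span of U_p.

```python
import numpy as np

# Hypothetical over-determined system: N=3 data, M=2 unknowns, so Uo exists.
G = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])

U, s, Vt = np.linalg.svd(G)
p = int(np.sum(s > 1e-10 * s[0]))          # p = 2
Up, Uo = U[:, :p], U[:, p:]

d = Up @ np.array([1.0, 2.0]) + 0.5 * Uo[:, 0]   # data with a null-space component

m = np.linalg.pinv(G) @ d                  # generalized-inverse model
print(np.allclose(G @ m, d))               # False: the data are not fit exactly
print(np.allclose(G @ m, Up @ Up.T @ d))   # True: only the Up part is reproduced
```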
Data and model null spaces

Moore-Penrose inverse and data null space

d = Gm \qquad G = U_p S_p V_p^T

The Moore-Penrose pseudo or generalized inverse of G is written

G^\dagger = V_p S_p^{-1} U_p^T

It is the unique matrix that satisfies four special properties:

G G^\dagger G = G \qquad\qquad (G G^\dagger)^T = G G^\dagger

G^\dagger G G^\dagger = G^\dagger \qquad\qquad (G^\dagger G)^T = G^\dagger G

Even when G has zero singular values, the Moore-Penrose inverse always exists
and has desirable properties.

m^\dagger = G^\dagger d^{obs} = V_p S_p^{-1} U_p^T d^{obs}
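A minimal numerical check (hypothetical rank-2 matrix): building G^\dagger = V_p S_p^{-1} U_p^T from the truncated SVD reproduces numpy's np.linalg.pinv and satisfies the four Moore-Penrose conditions.

```python
import numpy as np

# Hypothetical rank-2 matrix (third row = row 1 + row 2).
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 1., 1., 1.]])

U, s, Vt = np.linalg.svd(G)
p = int(np.sum(s > 1e-10 * s[0]))
Gdag = Vt[:p, :].T @ np.diag(1.0 / s[:p]) @ U[:, :p].T   # Vp Sp^-1 Up^T

print(np.allclose(Gdag, np.linalg.pinv(G)))    # True: matches numpy's pseudo-inverse
print(np.allclose(G @ Gdag @ G, G))            # True
print(np.allclose(Gdag @ G @ Gdag, Gdag))      # True
print(np.allclose((G @ Gdag).T, G @ Gdag))     # True
print(np.allclose((Gdag @ G).T, Gdag @ G))     # True
```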
Properties of the Moore-Penrose inverse

U_p and V_p always exist and there are FOUR possible situations to consider

Case 1:
No model or data null space
Uo and Vo do not exist

Case 2:
Only a model null space exists
Uo does not exist, Vo exists

Case 3:
Only a data null space exists
Uo exists, Vo does not exist

Case 4:
Both data and model null spaces exist
Uo and Vo both exist
Properties of the generalized inverse: case 1

There are FOUR possible situations

Both model and data spaces have 'trivial' null spaces.


U_p = U and V_p = V are square and orthogonal:

U_p^T = U_p^{-1} \qquad V_p^T = V_p^{-1}

Moore-Penrose inverse becomes

G^\dagger = V_p S_p^{-1} U_p^T = (U_p S_p V_p^T)^{-1} = G^{-1}

G m^\dagger = G G^\dagger d = d

The solution is unique and the data are fit exactly.
Properties of the generalized inverse: case 2

Data null space is trivial, model null space is non-trivial.


U_p = U is square; V_p is non-square.

The solution is non-unique.

U_p^T = U_p^{-1} \qquad V_p^T V_p = I_p

The data are fit exactly:

G m^\dagger = G G^\dagger d = d
But which of the infinite number of solutions do we get ?

m^\dagger = V_p S_p^{-1} U_p^T d
          = (V_p S_p U_p^T)(U_p S_p^{-2} U_p^T) d
          = G^T (U_p S_p^{-2} U_p^T) d
          = G^T (G G^T)^{-1} d

This is the minimum length solution!
Properties of the generalized inverse: case 3

Data null space is non-trivial, model null space is trivial.


V_p^T = V_p^{-1} and U_p is non-square.

What happens when a data vector d_o lies only in the space spanned by U_o?

d_o = \sum_{i=p+1}^{N} \beta_i u_i

m^\dagger = G^\dagger d_o = \sum_{i=p+1}^{N} \beta_i \, V_p S_p^{-1} U_p^T u_i = 0

Models cannot satisfy data in the data null space.

The data are not fit exactly:

G m^\dagger = G G^\dagger d
            = (U_p S_p V_p^T)(V_p S_p^{-1} U_p^T) d
            = U_p U_p^T d
Properties of the generalized inverse: case 3

Data null space is non-trivial, model null space is trivial.


V_p^T = V_p^{-1} and U_p is non-square.

The solution is unique

We get a least squares solution

m = (G^T G)^{-1} G^T d
  = (V_p S_p U_p^T U_p S_p V_p^T)^{-1} V_p S_p U_p^T d
  = V_p S_p^{-1} U_p^T d
  = G^\dagger d

We get the least squares solution, which minimizes the prediction error!
Properties of the generalized inverse: case 4

Both the data and model null spaces are non-trivial:

p < N \quad \text{and} \quad p < M

Both case 2 and case 3 arguments apply

It minimizes the model length (case 2)

L(m^\dagger) = m^{\dagger T} m^\dagger
It minimizes the data prediction error (case 3)

\phi(m^\dagger) = (d - G m^\dagger)^T (d - G m^\dagger)

The generalized inverse combines the best features of both solutions.


In our linearized tomographic problem it gives the best fit to the data
while also minimizing the length of the solution.

Covariance and Resolution of the pseudo inverse

How does data noise propagate into the model ?

What is the model covariance matrix for the generalized inverse ?

C_M = G^\dagger C_d (G^\dagger)^T, \qquad G^\dagger = V_p S_p^{-1} U_p^T

For the case C_d = \sigma^2 I:

C_M = \sigma^2 G^\dagger (G^\dagger)^T = \sigma^2 V_p S_p^{-2} V_p^T

Recall that S_p is a diagonal matrix of ordered singular values

S_p = \mathrm{diag}[s_1, s_2, \ldots, s_p]

\Rightarrow C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2}

As the number of singular values, p, increases, the variance of
the model parameters increases!
Covariance and Resolution of the pseudo inverse

How is the estimated model related to the true model ?

Model resolution matrix: m^\dagger = R \, m^{true}

R = G^\dagger G = V_p S_p^{-1} U_p^T U_p S_p V_p^T = V_p V_p^T

As p increases the model null space decreases:

p \to M: \quad V_p^T \to V_p^{-1}, \quad R \to I

As the number of singular values, p, increases, the resolution of
the model parameters increases!

We see the trade-off between variance and resolution.
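A minimal sketch of the trade-off for the lecture's tomography kernel (σ² = 1 is an assumed placeholder): as p grows, trace(R) grows toward M while the total model variance trace(C_M) also grows.

```python
import numpy as np

# The lecture's 4x4 tomography kernel; sigma^2 = 1 is an assumed placeholder.
r2 = np.sqrt(2.0)
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., r2, r2, 0.],
              [r2, 0., 0., r2]])
sigma2 = 1.0

U, s, Vt = np.linalg.svd(G)
for p in (1, 2, 3):
    Vp, sp = Vt[:p, :].T, s[:p]
    R = Vp @ Vp.T                                   # resolution improves with p
    CM = sigma2 * Vp @ np.diag(1.0 / sp**2) @ Vp.T  # variance grows with p
    print(p, np.trace(R), np.trace(CM))
```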
Worked example: tomography

Using rays 1-4: \quad \delta d = G \, \delta m

G = \begin{bmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
0 & \sqrt{2} & \sqrt{2} & 0 \\
\sqrt{2} & 0 & 0 & \sqrt{2}
\end{bmatrix}

G^T G = \begin{bmatrix}
3 & 0 & 1 & 2 \\
0 & 3 & 2 & 1 \\
1 & 2 & 3 & 0 \\
2 & 1 & 0 & 3
\end{bmatrix}

This has eigenvalues 0, 2, 4, 6.

V_p = \begin{bmatrix}
0.5 & -0.5 & -0.5 \\
0.5 & 0.5 & 0.5 \\
0.5 & 0.5 & -0.5 \\
0.5 & -0.5 & 0.5
\end{bmatrix}
\qquad
V_o = \begin{bmatrix}
0.5 \\
0.5 \\
-0.5 \\
-0.5
\end{bmatrix}
\qquad
G v_o = 0

s_1^2 = 6 \quad s_2^2 = 4 \quad s_3^2 = 2 \quad s_4^2 = 0
Worked example: Eigenvectors
s_1^2 = 6 \qquad s_2^2 = 4

V_p = \begin{bmatrix}
0.5 & -0.5 & -0.5 \\
0.5 & 0.5 & 0.5 \\
0.5 & 0.5 & -0.5 \\
0.5 & -0.5 & 0.5
\end{bmatrix}

s_3^2 = 2

V_o = \begin{bmatrix}
0.5 \\
0.5 \\
-0.5 \\
-0.5
\end{bmatrix}
Worked example: tomography

Using all non-zero singular values s_1, s_2 and s_3, the resolution matrix becomes

\delta m^\dagger = R \, \delta m^{true} = V_p V_p^T \, \delta m^{true}

V_p = \begin{bmatrix}
0.5 & -0.5 & -0.5 \\
0.5 & 0.5 & 0.5 \\
0.5 & 0.5 & -0.5 \\
0.5 & -0.5 & 0.5
\end{bmatrix}

R = \begin{bmatrix}
0.75 & -0.25 & 0.25 & 0.25 \\
-0.25 & 0.75 & 0.25 & 0.25 \\
0.25 & 0.25 & 0.75 & -0.25 \\
0.25 & 0.25 & -0.25 & 0.75
\end{bmatrix}

(Figure: input model and recovered model.)
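A minimal numpy check of this resolution matrix, applied to a hypothetical spike model to show the smearing:

```python
import numpy as np

# Resolution matrix for p = 3 retained singular values.
Vp = 0.5 * np.array([[ 1., -1., -1.],
                     [ 1.,  1.,  1.],
                     [ 1.,  1., -1.],
                     [ 1., -1.,  1.]])

R = Vp @ Vp.T
print(R)                                  # 0.75 on the diagonal, +/-0.25 off-diagonal

m_true = np.array([1., 0., 0., 0.])       # hypothetical spike model
print(R @ m_true)                         # the recovered model smears the spike
```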
Worked example: tomography

Using singular values s_1, s_2 and s_3, the model covariance becomes

C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2}

with s_3^2 = 2, \; s_2^2 = 4, \; s_1^2 = 6:

C_M = \frac{\sigma^2}{4} \left\{
\frac{1}{2} \begin{bmatrix} 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{bmatrix}
+ \frac{1}{4} \begin{bmatrix} 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}
+ \frac{1}{6} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\right\}

C_M = \frac{\sigma^2}{48} \begin{bmatrix}
11 & -7 & 5 & -1 \\
-7 & 11 & -1 & 5 \\
5 & -1 & 11 & -7 \\
-1 & 5 & -7 & 11
\end{bmatrix}
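A minimal numerical check of this covariance matrix (σ² = 1 assumed), comparing the truncated-SVD expression against the closed form above:

```python
import numpy as np

# Model covariance for p = 3, with sigma^2 = 1 assumed.
r2 = np.sqrt(2.0)
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., r2, r2, 0.],
              [r2, 0., 0., r2]])

U, s, Vt = np.linalg.svd(G)
Vp, sp = Vt[:3, :].T, s[:3]                       # keep the three non-zero singular values
CM = Vp @ np.diag(1.0 / sp**2) @ Vp.T

target = np.array([[11., -7.,  5., -1.],
                   [-7., 11., -1.,  5.],
                   [ 5., -1., 11., -7.],
                   [-1.,  5., -7., 11.]]) / 48.0
print(np.allclose(CM, target))                    # True
```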
Worked example: tomography
Repeat using only the largest singular value, s_1^2 = 6:

V_p = \begin{bmatrix}
0.5 \\
0.5 \\
0.5 \\
0.5
\end{bmatrix}

Model resolution matrix:

R = V_p V_p^T = \frac{1}{4} \begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1
\end{bmatrix}

(Figure: input and recovered models.)

Model covariance matrix:

C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2}
    = \frac{\sigma^2}{24} \begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1
\end{bmatrix}
Recap: Singular value decomposition

There may exist a model null space → models that cannot be constrained by the data.

There may exist a data null space → data that cannot be fit by any model.

The general linear discrete inverse problem may be simultaneously under- and over-determined (mixed-determined).

Singular value decomposition is a framework for dealing with ill-posed problems.

The pseudo-inverse is constructed using SVD and provides a unique model with desirable properties:
it fits the data in a least squares sense
it gives a minimum length model (no component in the null space)

Model resolution and covariance can be traded off by choosing the number of singular values used in the reconstruction.
