Lecture 4
Constrained minimization
$$\min_m L(m) = m^T m \quad \text{subject to} \quad d = Gm$$
Introducing Lagrange multipliers leads to the unconstrained minimization of
$$\varphi(m, \lambda) = m^T m + \lambda^T (d - Gm) = \sum_{j=1}^{M} m_j^2 + \sum_{i=1}^{N} \lambda_i \Big( d_i - \sum_{j=1}^{M} G_{i,j} m_j \Big)$$
$$\frac{\partial \varphi}{\partial \lambda} = d - Gm = 0, \qquad \frac{\partial \varphi}{\partial \lambda_i} = d_i - \sum_{j=1}^{M} G_{i,j} m_j$$

$$\frac{\partial \varphi}{\partial m} = 2m - G^T \lambda = 0, \qquad \frac{\partial \varphi}{\partial m_j} = 2 m_j - \sum_{i=1}^{N} \lambda_i G_{i,j}$$
$$\Rightarrow m = \tfrac{1}{2} G^T \lambda$$

$$\Rightarrow d = Gm = \tfrac{1}{2} G G^T \lambda$$

$$\Rightarrow \lambda = 2 (G G^T)^{-1} d$$

$$\Rightarrow m = G^T (G G^T)^{-1} d$$
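A minimal numerical check of this result; the matrix G and data d below are made-up numbers for illustration. The formula satisfies the constraint exactly and returns the shortest of all solutions.

```python
# Check of the minimum length solution m = G^T (G G^T)^{-1} d.
# G and d are illustrative: 2 equations, 4 unknowns.
import numpy as np

G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])      # under-determined: N = 2 < M = 4
d = np.array([1.0, 2.0])

m = G.T @ np.linalg.solve(G @ G.T, d)     # m = G^T (G G^T)^{-1} d

print(G @ m - d)                          # ~0: the constraint d = Gm holds exactly
print(np.linalg.norm(m))                  # minimal ||m|| among all solutions of Gm = d
```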
Minimum length and least squares solutions
$$m^{est} = G^{-g} d$$
Data resolution matrix
$$d^{pre} = D d^{obs}, \qquad D = G G^{-g}$$
Minimum length: $G^{-g} = G^T (G G^T)^{-1}$ \qquad Least squares: $G^{-g} = (G^T G)^{-1} G^T$
There is a symmetry between the least squares and minimum length solutions. Least squares solves the completely over-determined problem and has perfect model resolution, while minimum length solves the completely under-determined problem and has perfect data resolution. For mixed-determined problems all solutions lie between these two extremes.
Singular value decomposition
$$G = U S V^T, \qquad U U^T = U^T U = I_N, \qquad V V^T = V^T V = I_M$$
Ill-posed problems arise when some of the singular values are zero
Lanczos (1977). For a discussion see Ch. 4 of Aster et al. (2004).
Singular value decomposition
$$G = U S V^T$$

$$U = [u_1 | u_2 | \dots | u_N], \qquad V = [v_1 | v_2 | \dots | v_M]$$
It can be shown that the columns of $U$ are the eigenvectors of the matrix $GG^T$:

$$G G^T u_i = s_i^2 u_i \qquad \text{(try and prove this!)}$$

It can be shown that the columns of $V$ are the eigenvectors of the matrix $G^T G$:

$$G^T G v_i = s_i^2 v_i \qquad \text{(try and prove this!)}$$
The eigenvalues, $s_i^2$, are the squares of the elements on the diagonal of the $N \times M$ matrix $S$.
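Both "try and prove this" statements are easy to check numerically; a sketch in numpy using an arbitrary random matrix as a stand-in for $G$:

```python
# Numerical version of the two eigenvector statements: the columns of U
# (resp. V) are eigenvectors of G G^T (resp. G^T G) with eigenvalues s_i^2.
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((5, 3))        # N = 5, M = 3

U, s, Vt = np.linalg.svd(G)            # G = U S V^T
for i, si in enumerate(s):
    print(np.allclose(G @ G.T @ U[:, i], si**2 * U[:, i]))   # G G^T u_i = s_i^2 u_i
    print(np.allclose(G.T @ G @ Vt[i], si**2 * Vt[i]))       # G^T G v_i = s_i^2 v_i
```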
If $N > M$:

$$S = \begin{bmatrix} s_1 & 0 & \cdots & 0 \\ 0 & s_2 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & s_M \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}$$

If $M > N$:

$$S = \begin{bmatrix} s_1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ 0 & s_2 & & 0 & 0 & \cdots & 0 \\ \vdots & & \ddots & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & s_N & 0 & \cdots & 0 \end{bmatrix}$$
Singular value decomposition
Suppose the first $p$ singular values are non-zero. Then the $N \times M$ (non-square) matrix $S$ can be written in partitioned form

$$S = \begin{bmatrix} S_p & 0 \\ 0 & 0 \end{bmatrix}, \qquad S_p = \begin{bmatrix} s_1 & 0 & \cdots & 0 \\ 0 & s_2 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & s_p \end{bmatrix} \;\; (p \times p)$$

By convention we order the singular values

$$s_1 \ge s_2 \ge \cdots \ge s_p$$

With $U = [u_1 | u_2 | \dots | u_N]$ and $V = [v_1 | v_2 | \dots | v_M]$, let $U_p$ and $V_p$ hold the first $p$ columns of $U$ and $V$, and $U_0$ and $V_0$ the remaining columns. Then

$$G = U_p S_p V_p^T$$
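A sketch of this truncation in numpy; the small rank-deficient matrix below is chosen purely for illustration. Only the $p$ non-zero singular values are kept, yet $G$ is recovered exactly from $U_p S_p V_p^T$.

```python
# Partitioned form G = U_p S_p V_p^T, keeping only the p non-zero
# singular values (thresholded numerically).
import numpy as np

G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0, 1.0]])    # rank-deficient by construction (row3 = row1 + row2)

U, s, Vt = np.linalg.svd(G, full_matrices=True)
p = np.sum(s > 1e-10)                   # number of non-zero singular values

Up, Sp, Vpt = U[:, :p], np.diag(s[:p]), Vt[:p]
print(p)                                # 2
print(np.allclose(G, Up @ Sp @ Vpt))    # True: G is exactly U_p S_p V_p^T
```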
Model null space
$$m_v = \sum_{i=p+1}^{M} \lambda_i v_i$$
The model $m_v$ lies in the space spanned by the columns of $V_0$.
$$G m_v = \sum_{i=p+1}^{M} \lambda_i \, U_p S_p V_p^T v_i = 0$$
So any model of this type has no effect on the data. It lies in the model null space!
Where have we seen this before ?
$$G(m + m_v) = Gm + G m_v = d^{obs} + 0$$
The data cannot constrain models in the model null space.
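A numerical illustration, reusing the rank-deficient example matrix from the sketch above: any combination of the columns of $V_0$ maps to zero data, so adding it to a model changes nothing the data can see. The coefficients $\lambda$ are arbitrary.

```python
# Model null space in action: the columns of V_0 span the null space of G,
# so adding any combination of them leaves the predicted data unchanged.
import numpy as np

G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0, 1.0]])

U, s, Vt = np.linalg.svd(G)
p = np.sum(s > 1e-10)
V0 = Vt[p:].T                               # columns span the model null space

m = np.array([1.0, 2.0, 3.0, 4.0])
m_v = V0 @ np.array([5.0, 7.0])             # arbitrary null-space model, lambda = (5, 7)
print(np.allclose(G @ m_v, 0))              # True: G m_v = 0
print(np.allclose(G @ (m + m_v), G @ m))    # True: same data, different model
```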
Example: tomography
$$\delta d = G \, \delta m$$
$$G = \begin{bmatrix} G_{1,1} & G_{1,2} & G_{1,3} & G_{1,4} \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}$$
Example: tomography
$$G^T G = \begin{bmatrix} 3 & 0 & 1 & 2 \\ 0 & 3 & 2 & 1 \\ 1 & 2 & 3 & 0 \\ 2 & 1 & 0 & 3 \end{bmatrix}$$
Worked example: Eigenvectors
Eigenvalues: $s_1^2 = 6$, $s_2^2 = 4$, $s_3^2 = 2$, $s_4^2 = 0$.

$$V_p = \begin{bmatrix} 0.5 & -0.5 & -0.5 \\ 0.5 & 0.5 & 0.5 \\ 0.5 & 0.5 & -0.5 \\ 0.5 & -0.5 & 0.5 \end{bmatrix}, \qquad V_0 = \begin{bmatrix} 0.5 \\ 0.5 \\ -0.5 \\ -0.5 \end{bmatrix}$$
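These numbers can be checked directly; note that numpy's `eigh` returns eigenvalues in ascending order, and eigenvectors only up to sign.

```python
# Checking the quoted eigen-decomposition of G^T G.
import numpy as np

GtG = np.array([[3.0, 0.0, 1.0, 2.0],
                [0.0, 3.0, 2.0, 1.0],
                [1.0, 2.0, 3.0, 0.0],
                [2.0, 1.0, 0.0, 3.0]])

w, V = np.linalg.eigh(GtG)
print(np.round(w, 10))        # [0. 2. 4. 6.]: the s^2 values, one of them zero
print(np.round(V, 3))         # columns: eigenvectors (up to sign), entries +/- 0.5
```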
Data null space
$$d^{pre} = Gm = U_p S_p V_p^T m = U_p a, \qquad a = S_p V_p^T m$$
For the model to fit the data we must have $d^{obs} = d^{pre}$. Splitting $d^{obs}$ into a part $d_0$ in the space spanned by the columns of $U_p$ and a part in the space spanned by $U_0$, this requires

$$d_0 + \sum_{i=p+1}^{N} \lambda_i u_i = \sum_{j=1}^{p} a_j u_j$$

Where have we seen this before?
So data of this type cannot be fit by any model. The data have a component in the data null space!
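A numerical illustration with a small made-up $G$ that has 3 data but rank 2: the least squares misfit is exactly the component of $d^{obs}$ along $u_0$, which no model can remove.

```python
# Data null space in action: any component of d^obs along U_0 cannot be fit.
import numpy as np

G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])                   # N = 3, rank 2: a data null space exists

U, s, Vt = np.linalg.svd(G)
u0 = U[:, 2]                                 # spans the data null space

d = np.array([1.0, 2.0, 3.0]) + 0.5 * u0    # d^obs with a null-space component
m, *_ = np.linalg.lstsq(G, d, rcond=None)   # best-fitting model

print(np.allclose(d - G @ m, (u0 @ d) * u0))  # True: the misfit lies entirely in U_0
```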
Moore-Penrose inverse and data null space
$$d = Gm, \qquad G = U_p S_p V_p^T$$

$$G^\dagger = V_p S_p^{-1} U_p^T$$
Even when $G$ has zero singular values the Moore-Penrose inverse always exists and has desirable properties. $U_p$ and $V_p$ always exist, and there are FOUR possible situations to consider:
Case 1: No model or data null space. $U_0$ and $V_0$ do not exist.
Case 2: Only a model null space exists. $U_0$ does not exist; $V_0$ exists.
Case 3: Only a data null space exists. $U_0$ exists; $V_0$ does not exist.
Case 4: Both data and model null spaces exist. $U_0$ and $V_0$ exist.
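The pseudo-inverse is easy to build directly from the truncated factors; a sketch in numpy, compared against numpy's own `pinv`. The example matrix is the rank-deficient one used in the earlier sketches.

```python
# The Moore-Penrose inverse G† = V_p S_p^{-1} U_p^T, built by hand.
import numpy as np

G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0, 1.0]])

U, s, Vt = np.linalg.svd(G)
p = np.sum(s > 1e-10)
G_dag = Vt[:p].T @ np.diag(1.0 / s[:p]) @ U[:, :p].T

print(np.allclose(G_dag, np.linalg.pinv(G)))   # True: same operator
```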
Properties of the generalized inverse: case 1
$$G^\dagger = V_p S_p^{-1} U_p^T = (U_p S_p V_p^T)^{-1} = G^{-1}$$

$$G m^\dagger = G G^\dagger d = d$$
The solution is unique and the data are fit exactly.
Properties of the generalized inverse: case 2
$$V_p^T V_p = I_p$$

The data are fit exactly:

$$G m^\dagger = G G^\dagger d = d$$

But which of the infinite number of solutions do we get?

$$m^\dagger = V_p S_p^{-1} U_p^T d = (V_p S_p U_p^T)(U_p S_p^{-2} U_p^T) d = G^T (U_p S_p^{-2} U_p^T) d = G^T (G G^T)^{-1} d$$

We get the minimum length solution!
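Numerically, for a purely under-determined example (made-up numbers), the pseudo-inverse solution and the minimum length formula agree:

```python
# Case 2: full row-rank G, only a model null space exists.
import numpy as np

G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])     # N = 2, rank 2, M = 4: only V_0 exists
d = np.array([1.0, 2.0])

m_dag = np.linalg.pinv(G) @ d
m_ml  = G.T @ np.linalg.solve(G @ G.T, d)

print(np.allclose(m_dag, m_ml))          # True: G† d = G^T (G G^T)^{-1} d
print(np.allclose(G @ m_dag, d))         # True: the data are fit exactly
```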
Properties of the generalized inverse: case 3
$$G m^\dagger = G G^\dagger d = (U_p S_p V_p^T)(V_p S_p^{-1} U_p^T) d = U_p U_p^T d$$

The data are fit only within the space spanned by the columns of $U_p$.
Properties of the generalized inverse: case 3
$$m = (G^T G)^{-1} G^T d = (V_p S_p U_p^T U_p S_p V_p^T)^{-1} V_p S_p U_p^T d = V_p S_p^{-1} U_p^T d = G^\dagger d$$
We get the least squares solution which minimizes the prediction error!
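The same check for case 3, with a purely over-determined made-up example:

```python
# Case 3: full column-rank G, only a data null space exists.
import numpy as np

G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])               # N = 3, rank 2, M = 2: only U_0 exists
d = np.array([1.0, 2.0, 4.0])            # inconsistent data

m_dag = np.linalg.pinv(G) @ d
m_ls  = np.linalg.solve(G.T @ G, G.T @ d)

print(np.allclose(m_dag, m_ls))          # True: G† d = (G^T G)^{-1} G^T d
```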
Properties of the generalized inverse: case 4
$$L(m^\dagger) = m^{\dagger T} m^\dagger$$

The pseudo-inverse solution minimizes both the solution length $L(m^\dagger)$ and the data prediction error.
Covariance and resolution of the pseudo-inverse
$$C_M = G^\dagger C_d (G^\dagger)^T, \qquad G^\dagger = V_p S_p^{-1} U_p^T$$

For the case $C_d = \sigma^2 I$:

$$C_M = \sigma^2 G^\dagger (G^\dagger)^T = \sigma^2 V_p S_p^{-2} V_p^T$$
Recall that $S_p$ is a diagonal matrix of ordered singular values.
Worked example: tomography
Using rays 1–4:

$$\delta d = G \, \delta m$$
$$G = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & \sqrt{2} & \sqrt{2} & 0 \\ \sqrt{2} & 0 & 0 & \sqrt{2} \end{bmatrix}$$
$$G^T G = \begin{bmatrix} 3 & 0 & 1 & 2 \\ 0 & 3 & 2 & 1 \\ 1 & 2 & 3 & 0 \\ 2 & 1 & 0 & 3 \end{bmatrix}$$
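As a check, building this $G$ in numpy reproduces the $G^T G$ above:

```python
# The worked tomography example: G for the four rays, and its G^T G.
import numpy as np

r2 = np.sqrt(2.0)
G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0,  r2,  r2, 0.0],
              [ r2, 0.0, 0.0,  r2]])

print(G.T @ G)   # [[3 0 1 2], [0 3 2 1], [1 2 3 0], [2 1 0 3]]
```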
Worked example: Eigenvectors
Eigenvalues: $s_1^2 = 6$, $s_2^2 = 4$, $s_3^2 = 2$ (and $s_4^2 = 0$).

$$V_p = \begin{bmatrix} 0.5 & -0.5 & -0.5 \\ 0.5 & 0.5 & 0.5 \\ 0.5 & 0.5 & -0.5 \\ 0.5 & -0.5 & 0.5 \end{bmatrix}, \qquad V_0 = \begin{bmatrix} 0.5 \\ 0.5 \\ -0.5 \\ -0.5 \end{bmatrix}$$
Worked example: tomography
$$\Rightarrow C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2}$$
With $s_1^2 = 6$, $s_2^2 = 4$, $s_3^2 = 2$:

$$C_M = \frac{\sigma^2}{4} \left\{ \frac{1}{6} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} + \frac{1}{4} \begin{bmatrix} 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} + \frac{1}{2} \begin{bmatrix} 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{bmatrix} \right\}$$

$$C_M = \frac{\sigma^2}{48} \begin{bmatrix} 11 & -7 & 5 & -1 \\ -7 & 11 & -1 & 5 \\ 5 & -1 & 11 & -7 \\ -1 & 5 & -7 & 11 \end{bmatrix}$$
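The whole calculation can be reproduced in a few lines of numpy (taking $\sigma^2 = 1$):

```python
# Model covariance: C_M = sigma^2 * sum_i v_i v_i^T / s_i^2 over the
# three non-zero singular values, giving (sigma^2/48) * [[11, -7, ...], ...].
import numpy as np

r2 = np.sqrt(2.0)
G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0,  r2,  r2, 0.0],
              [ r2, 0.0, 0.0,  r2]])

U, s, Vt = np.linalg.svd(G)
p = np.sum(s > 1e-10)                    # p = 3

CM = sum(np.outer(Vt[i], Vt[i]) / s[i]**2 for i in range(p))   # sigma^2 = 1
print(np.round(48 * CM))   # [[11 -7 5 -1], [-7 11 -1 5], [5 -1 11 -7], [-1 5 -7 11]]
```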
Worked example: tomography
Repeat using only one singular value, $s_1^2 = 6$:

$$V_p = \begin{bmatrix} 0.5 \\ 0.5 \\ 0.5 \\ 0.5 \end{bmatrix}$$

Model resolution matrix:

$$R = V_p V_p^T = \frac{1}{4} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$$

[Figure: input model vs. output (recovered) model]

Model covariance matrix:

$$C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2} = \frac{\sigma^2}{24} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$$
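The truncated ($p = 1$) resolution and covariance matrices follow directly from the same decomposition:

```python
# Keeping only the largest singular value (p = 1): resolution and covariance
# both collapse to the uniform averaging matrix quoted above.
import numpy as np

r2 = np.sqrt(2.0)
G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0,  r2,  r2, 0.0],
              [ r2, 0.0, 0.0,  r2]])

U, s, Vt = np.linalg.svd(G)
v1 = Vt[0]                               # eigenvector for s_1^2 = 6

R  = np.outer(v1, v1)                    # V_p V_p^T = (1/4) * ones
CM = np.outer(v1, v1) / s[0]**2          # (sigma^2 / 24) * ones, for sigma = 1
print(np.round(4 * R), np.round(24 * CM), sep="\n")
```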
Recap: Singular value decomposition
There may exist a model null space: models that cannot be constrained by the data.
There may exist a data null space: data that cannot be fit by any model.