Lecture 9, Unit 2 (Code: 455)
Vector Space
• Vector space: a set that is closed under finite vector addition and
scalar multiplication (also called a linear space).
• If the scalars are restricted to be real numbers, then V is called a real
vector space.
• If the scalars are allowed to be complex numbers, then V is called a
complex vector space.
• Here we assume that all scalars are real and that we are dealing with real
vector spaces.
SVD
A = U W V^T
where:
• U is an m-by-m matrix whose columns are the orthonormal eigenvectors of A A^T
• V^T is the transpose of an n-by-n matrix whose columns are the orthonormal eigenvectors of A^T A
• W is an m-by-n diagonal matrix of the singular values, which are the square
roots of the eigenvalues of A^T A
A = U diag(w_1, …, w_n) V^T
(U is m-by-m, W is m-by-n, V is n-by-n)
Example (3-by-2 matrix), A = U W V^T with:
U = [  0      2/√6   1/√3 ]
    [ 1/√2   −1/√6   1/√3 ]
    [ 1/√2    1/√6  −1/√3 ]
W = [ 1  0  ]
    [ 0  √3 ]
    [ 0  0  ]
V^T = [ 1/√2   1/√2 ]
      [ 1/√2  −1/√2 ]
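As a quick numerical check, the factorization above multiplies out to A = [[1, −1], [0, 1], [1, 0]]; NumPy's `svd` recovers the same singular values (it reports them in descending order, and singular-vector signs may differ):

```python
import numpy as np

# The matrix implied by the example factorization above.
A = np.array([[1.0, -1.0],
              [0.0,  1.0],
              [1.0,  0.0]])

# Full SVD: U is 3x3, s holds the singular values, Vt is 2x2.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Singular values in descending order: sqrt(3) and 1.
print(s)

# Rebuild A from the factors to confirm A = U W V^T.
W = np.zeros((3, 2))
W[:2, :2] = np.diag(s)
print(np.allclose(U @ W @ Vt, A))  # True
```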
A_k = Σ_{i=1}^{k} σ_i u_i v_i^T
(column notation: a sum of k rank-1 matrices)
Approximation error
• How good (bad) is this approximation?
• It is the best possible among all rank-k matrices, measured by the Frobenius
norm of the error:
min_{X: rank(X) = k} ||A − X||_F = ||A − A_k||_F = √(σ_{k+1}² + … + σ_r²)
(in the spectral 2-norm, the error is simply σ_{k+1})
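This Eckart-Young property can be checked numerically; the matrix size and rank below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Best rank-k approximation: keep the k largest singular triplets.
A_k = U[:, :k] * s[:k] @ Vt[:k, :]

# Frobenius error equals sqrt of the sum of the discarded sigma_i^2.
err = np.linalg.norm(A - A_k, 'fro')
print(np.isclose(err, np.sqrt(np.sum(s[k:] ** 2))))  # True
```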
[Figure: rows of A plotted as vectors in a Euclidean feature space.
The 1st (right) singular vector gives the direction of maximal variance;
the 2nd (right) singular vector is orthogonal to it. σ_1 measures how much
of the data variance is explained by the first singular vector.]
More on U and V
V: eigenvectors of A^T A
A^T A = (U Σ V^T)^T (U Σ V^T) = V Σ^T U^T U Σ V^T = V Σ² V^T
Similarly, U: eigenvectors of A A^T
A A^T = (U Σ V^T)(U Σ V^T)^T = U Σ V^T V Σ^T U^T = U Σ² U^T
Find v_i first, then use A v_i = σ_i u_i to find u_i.
This is the key to solving the SVD.
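A small numerical check of these relations (random test matrix; the eigenvectors may differ from V's columns by sign and ordering, so we compare eigenvalues and the relation A v_i = σ_i u_i):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Eigen-decomposition of A^T A: eigenvalues are sigma_i^2.
evals, evecs = np.linalg.eigh(A.T @ A)              # ascending order
print(np.allclose(np.sort(evals), np.sort(s ** 2)))  # True

# Each right singular vector satisfies A v_i = sigma_i u_i.
print(np.allclose(A @ Vt.T, U * s))                  # True
```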
SVD: A = U Σ V^T
• The singular values are the diagonal entries of the Σ matrix and are
arranged in descending order
• The singular values are always real, non-negative numbers
• If A is a real matrix, U and V are also real
Example (2x2, full rank)
A = [ 2  2 ]
    [ −1 1 ]
A^T A = [ 5 3 ]  →  v_1 = (1/√2)[1, 1]^T,  v_2 = (1/√2)[−1, 1]^T
        [ 3 5 ]
A v_1 = [2√2, 0]^T = σ_1 u_1  →  σ_1 = 2√2,  u_1 = [1, 0]^T
A v_2 = [0, √2]^T  = σ_2 u_2  →  σ_2 = √2,   u_2 = [0, 1]^T
A = U Σ V^T = [ 1 0 ] [ 2√2  0  ] [  1/√2  1/√2 ]
              [ 0 1 ] [ 0    √2 ] [ −1/√2  1/√2 ]
STEPS:
1. Find the eigenvectors of A^T A; normalize to get an orthonormal basis.
2. Compute A v_i to get σ_i.
   If σ_i ≠ 0, get u_i = A v_i / σ_i.
   Else find u_i from N(A^T).
SVD Theory
A V = U Σ  →  A v_j = σ_j u_j,  j = 1, 2, …, r
• If σ_j = 0, A v_j = 0 → v_j is in N(A)
  • The corresponding u_j is in N(A^T)  [u_j^T A = σ_j v_j^T = 0]
• Else, v_j is in C(A^T)
  • The corresponding u_j is in C(A)
• The number of nonzero σ_j = rank(A)
Example 2 (2x2, rank deficient)
A = [ 2 2 ],  r = 1,  C(A^T) basis: [1, 1]^T;  v_1 = (1/√2)[1, 1]^T
    [ 1 1 ]
(v_1 can also be obtained from the eigenvectors of A^T A.)
A v_1 = σ_1 u_1:
A v_1 = [2√2, √2]^T  →  σ_1 = √10,  u_1 = (1/√5)[2, 1]^T
v_2 ⊥ v_1 → v_2 ∈ N(A) → A v_2 = 0 → v_2 = (1/√2)[1, −1]^T
u_2 ⊥ u_1 → u_2 ∈ N(A^T) → A^T u_2 = 0 → u_2^T A = 0 → u_2 = (1/√5)[1, −2]^T
A = U Σ V^T:
[ 2 2 ] = (1/√5)[ 2  1  ] [ √10 0 ] (1/√2)[ 1  1  ]
[ 1 1 ]        [ 1  −2 ] [ 0   0 ]       [ 1  −1 ]
Example (cont.)
With U = (1/√5)[2 1; 1 −2], Σ = diag(√10, 0), V^T = (1/√2)[1 1; 1 −1]:
A = U Σ V^T = [u_1 u_2] [ σ_1 0 ] [ v_1^T ]
                        [ 0   0 ] [ v_2^T ]
            = [u_1 u_2] [ σ_1 v_1^T ] = σ_1 u_1 v_1^T
                        [ 0         ]
Only the single nonzero singular value (r = 1) contributes.
(A A^T)_{m×m} = (U Σ V^T)(U Σ V^T)^T = U (Σ Σ^T)_{m×m} U^T
Extend to A_{m×n} (cont.)
A [v_1 … v_r  v_{r+1} … v_n] = [u_1 … u_r  u_{r+1} … u_m] Σ,
where Σ has σ_1, …, σ_r on its diagonal and zeros elsewhere. Hence
A = σ_1 u_1 v_1^T + … + σ_r u_r v_r^T
a summation of r rank-one matrices! The bases of N(A) and N(A^T) do not
contribute to A; they are useful only for nullspace solutions.
Example (2x3)
A = [ 1 1 0 ],  r = 2,  A_{2×3} V_{3×3} = U_{2×2} Σ_{2×3}
    [ 0 1 1 ]
V: eigenvectors of A^T A = [ 1 1 0 ]
                           [ 1 2 1 ]
                           [ 0 1 1 ]
λ_1 = 3,  v_1 = (1/√6)[1, 2, 1]^T
λ_2 = 1,  v_2 = (1/√2)[1, 0, −1]^T
λ_3 = 0,  v_3 = (1/√3)[1, −1, 1]^T   (nullspace)
A v_1 = (3/√6)[1, 1]^T = √3 · (1/√2)[1, 1]^T  →  σ_1 = √3,  u_1 = (1/√2)[1, 1]^T
A v_2 = (1/√2)[1, −1]^T                        →  σ_2 = 1,   u_2 = (1/√2)[1, −1]^T
[ 1 1 0 ] = (1/√2)[ 1  1  ] [ √3 0 0 ] [ 1/√6   2/√6   1/√6  ]
[ 0 1 1 ]        [ 1  −1 ] [ 0  1 0 ] [ 1/√2   0     −1/√2  ]
                                      [ 1/√3  −1/√3   1/√3  ]
v_1, v_2 span C(A^T); v_3 spans N(A); u_1, u_2 span C(A).
Example (3x3, symmetric):
A = [ 1 2 1 ],   A A^T = A^T A = [ 6  10  6  ]
    [ 2 3 2 ]                    [ 10 17  10 ]
    [ 1 2 1 ]                    [ 6  10  6  ]
Eigenvalues of A^T A (= A A^T): 28.86, 0.14, 0
Number of non-zero singular values = rank of A (= 2)
Eigenvectors of A^T A (= A A^T):
u_1 = v_1 = [0.454, 0.766, 0.454]^T
u_2 = v_2 = [0.542, −0.643, 0.542]^T
u_3 = v_3 = [−0.707, 0, 0.707]^T
Expansion of A:
A = Σ_{i=1}^{2} σ_i u_i v_i^T,  with σ_i the square roots of the eigenvalues
Summary
• SVD chooses the right bases for the 4 fundamental subspaces
• AV = UΣ
  • v_1 … v_r: orthonormal basis in R^n for C(A^T)
  • v_{r+1} … v_n: for N(A)
  • u_1 … u_r: orthonormal basis in R^m for C(A)
  • u_{r+1} … u_m: for N(A^T)
• These bases are not only orthogonal, but also satisfy A v_i = σ_i u_i
• High points of linear algebra:
  • dimension, rank, orthogonality, basis, diagonalization, …
SVD Applications
• Using the SVD in computation, rather than A itself, has the advantage of being
more robust to numerical error
• Many applications:
  • Inverse of a matrix A
  • Condition number of a matrix
  • Image compression
  • Solving Ax = b in all cases (unique, many, or no solutions; least-squares solutions)
  • Rank determination, matrix approximation, …
• The SVD is usually found by iterative methods (see Numerical Recipes, Chap. 2)
Matrix Inverse
A is nonsingular iff σ_i ≠ 0 for all i:
A = U Σ V^T  →  A^{-1} = V Σ^{-1} U^T,  where Σ^{-1} = diag(1/σ_1, 1/σ_2, …, 1/σ_n)
If A is singular or ill-conditioned:
A^{-1} = (U Σ V^T)^{-1} ≈ V Σ_0^{-1} U^T,
where Σ_0^{-1} has diagonal entries 1/σ_i if σ_i ≥ t, and 0 otherwise
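A minimal sketch of this truncated inverse, assuming a user-chosen threshold t:

```python
import numpy as np

def svd_inverse(A, t=1e-10):
    """Invert A via its SVD, zeroing reciprocals of singular
    values below the threshold t (truncated inverse)."""
    U, s, Vt = np.linalg.svd(A)
    # Guard against division by zero before taking reciprocals.
    s_inv = np.where(s >= t, 1.0 / np.where(s == 0, 1.0, s), 0.0)
    return Vt.T @ np.diag(s_inv) @ U.T

A = np.array([[2.0, 2.0],
              [-1.0, 1.0]])  # nonsingular, so this is the exact inverse
print(np.allclose(svd_inverse(A) @ A, np.eye(2)))  # True
```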
SVD and Ax = b (m×n)
• Check for the existence of a solution:
U Σ V^T x = b  →  Σ V^T x = U^T b  →  Σ z = d,  where z = V^T x and d = U^T b
• If σ_i = 0 but d_i ≠ 0, a solution does not exist
Ax = b (inconsistent)
A = [ 2 2 ],  b = [ 8 ]
    [ 1 1 ]      [ 3 ]
A = U Σ V^T = (1/√5)[ 2  1  ] [ √10 0 ] (1/√2)[ 1  1  ]
                    [ 1  −2 ] [ 0   0 ]       [ 1  −1 ]
Σ z = U^T b:
[ √10 0 ] [ z_1 ] = (1/√5)[ 2  1  ] [ 8 ] = (1/√5)[ 19 ]
[ 0   0 ] [ z_2 ]         [ 1  −2 ] [ 3 ]         [ 2  ]
σ_2 = 0 but d_2 = 2/√5 ≠ 0  →  No solution!
Ax = b (underdetermined)
A = [ 2 2 ],  b = [ 8 ]
    [ 1 1 ]      [ 4 ]
A = U Σ V^T = (1/√5)[ 2  1  ] [ √10 0 ] (1/√2)[ 1  1  ]
                    [ 1  −2 ] [ 0   0 ]       [ 1  −1 ]
Σ z = U^T b:
[ √10 0 ] [ z_1 ] = (1/√5)[ 2  1  ] [ 8 ] = (1/√5)[ 20 ] = [ 4√5 ]
[ 0   0 ] [ z_2 ]         [ 1  −2 ] [ 4 ]         [ 0  ]   [ 0   ]
z = [ 2√2 ]  →  x_particular = V z = (1/√2)[ 1  1  ] [ 2√2 ] = [ 2 ]
    [ 0   ]                                [ 1  −1 ] [ 0   ]   [ 2 ]
x_complete = x_particular + x_null = [ 2 ] + c · (1/√2)[  1 ]
                                     [ 2 ]             [ −1 ]
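The minimum-norm particular solution [2, 2]^T can be recovered with the pseudo inverse:

```python
import numpy as np

A = np.array([[2.0, 2.0],
              [1.0, 1.0]])
b = np.array([8.0, 4.0])

# The pseudo-inverse picks the minimum-norm particular solution.
x = np.linalg.pinv(A) @ b
print(x)                      # [2. 2.]
print(np.allclose(A @ x, b))  # True

# Adding any multiple of the nullspace vector keeps Ax = b.
n = np.array([1.0, -1.0]) / np.sqrt(2)
print(np.allclose(A @ (x + 3.0 * n), b))  # True
```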
Pseudo Inverse
(Sec. 7.4, p. 395)
• The role of A:
  • takes a vector v_i from the row space to σ_i u_i in the column space
• The role of A^{-1} (if it exists):
  • does the opposite: takes a vector u_i from the column space back to the row space
A v_i = σ_i u_i
v_i = σ_i A^{-1} u_i
A^{-1} u_i = (1/σ_i) v_i
Pseudo Inverse (cont.)
• While A^{-1} may not exist, a matrix that takes u_i back to v_i/σ_i does exist.
It is denoted A^+, the pseudo inverse
• A^+ has dimension n by m
A^+ u_i = (1/σ_i) v_i  for i ≤ r,  and  A^+ u_i = 0  for i > r
A^+ = V_{n×n} Σ^+_{n×m} U^T_{m×m}
    = [v_1 … v_r … v_n] diag(σ_1^{-1}, …, σ_r^{-1}, 0, …, 0) [u_1 … u_r … u_m]^T
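A sketch of building A^+ = V Σ^+ U^T directly from the SVD, compared against NumPy's built-in `pinv` (the tolerance `tol` is an illustrative choice):

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """Build A^+ = V Sigma^+ U^T from the reduced SVD,
    inverting only the nonzero singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_plus = np.where(s > tol, 1.0 / np.where(s > tol, s, 1.0), 0.0)
    return Vt.T @ np.diag(s_plus) @ U.T

A = np.array([[2.0, 2.0],
              [1.0, 1.0]])   # rank 1
print(np.allclose(pinv_via_svd(A), np.linalg.pinv(A)))  # True
```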
Pseudo Inverse and Ax = b
Ax = b  →  x^+ = A^+ b = V Σ^+ U^T b     (a panacea for Ax = b)
• Overdetermined case: this finds the solution that minimizes the residual
r = ||Ax − b||, i.e. the least-squares solution
Compare with the normal equations: A^T A x̂ = A^T b
Ex: full rank
A = [ 2  2 ],  b = [ 0  ]
    [ −1 1 ]      [ −2 ]
A = U Σ V^T = [ 1 0 ] [ 2√2  0  ] [  1/√2  1/√2 ]
              [ 0 1 ] [ 0    √2 ] [ −1/√2  1/√2 ]
Ax = b  →  U Σ V^T x = b  →  x = V diag(1/σ_i) U^T b
x = (1/√2)[ 1 −1 ] [ 1/(2√2)  0    ] [ 0  ] = [  1 ]
          [ 1  1 ] [ 0        1/√2 ] [ −2 ]   [ −1 ]
Ex: over-determined
A = [ 1 0 ],  b = [ 1  ],  r = 2,  A_{3×2} V_{2×2} = U_{3×3} Σ_{3×2}
    [ 1 1 ]      [ 2  ]
    [ 0 1 ]      [ −1 ]
A = [ 1/√6    1/√2   1/√3 ] [ √3 0 ] (1/√2)[ 1  1  ]
    [ 2/√6    0     −1/√3 ] [ 0  1 ]       [ 1  −1 ]
    [ 1/√6   −1/√2   1/√3 ] [ 0  0 ]
A = U Σ V^T  →  A^+ = V Σ^+ U^T
Will show this need not be computed explicitly…
Over-determined (cont.)
Via the pseudo inverse:
x = V Σ^+ U^T b
U^T b = [ 4/√6, √2, −2/√3 ]^T
Σ^+ U^T b = [ 4/(3√2), √2 ]^T     (the u_3 component is discarded)
x = (1/√2)[ 1  1  ] [ 4/(3√2) ] = [  5/3 ]
          [ 1  −1 ] [ √2      ]   [ −1/3 ]
Compare with the normal equations A^T A x̂ = A^T b:
[ 2 1 ] [ x_1 ] = [ 3 ]  →  [ x_1 ] = (1/3)[  5 ]
[ 1 2 ] [ x_2 ]   [ 1 ]     [ x_2 ]        [ −1 ]
Same result!!
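Numerically, the pseudo-inverse solution and the normal-equations solution agree, as the slide shows:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
b = np.array([1.0, 2.0, -1.0])

# Pseudo-inverse (least-squares) solution ...
x_pinv = np.linalg.pinv(A) @ b
# ... matches the normal-equations solution A^T A x = A^T b.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

print(np.allclose(x_pinv, x_ne))          # True
print(np.allclose(x_pinv, [5/3, -1/3]))   # True
```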
Ex: general case, no solution
A = [ 2 2 ],  b = [ 8 ]
    [ 1 1 ]      [ 3 ]
2x + 2y = 8 and x + y = 3 are inconsistent: no exact solution.
A = U Σ V^T = (1/√5)[ 2  1  ] [ √10 0 ] (1/√2)[ 1  1  ]
                    [ 1  −2 ] [ 0   0 ]       [ 1  −1 ]
x = V Σ^+ U^T b:
U^T b = (1/√5)[ 19 ],   Σ^+ U^T b = [ 19/(5√2) ]
              [ 2  ]                [ 0        ]
x = (1/√2)[ 1  1  ] [ 19/(5√2) ] = [ 19/10 ]
          [ 1  −1 ] [ 0        ]   [ 19/10 ]
The pseudo inverse gives the least-squares solution x = y = 19/10.
Matrix Approximation
A_i = U Σ_i V^T
Σ_i: the rank-i version of Σ (obtained by setting the last m − i σ's to zero)
A_i: the best rank-i approximation to A in the sense of Euclidean distance
Image Compression
• As described in the text, p. 352
• For grey-scale images: m·n bytes
After the SVD, take the most significant r terms:
A ≈ σ_1 u_1 v_1^T + σ_2 u_2 v_2^T + … + σ_r u_r v_r^T
• Only need to store r(m + n + 1) numbers
[Figure: original 64×64 image and reconstructions with r = 1, 3, 5, 10, 16
(no perceivable difference afterwards)]
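A compression sketch on a synthetic 64×64 "image" (the image and the choice r = 10 are illustrative; real images need larger r for visual fidelity):

```python
import numpy as np

# A synthetic 64x64 "image" standing in for real pixel data.
n = 64
xx, yy = np.meshgrid(np.arange(n), np.arange(n))
img = np.sin(xx / 5.0) * np.cos(yy / 7.0) + 0.1 * xx / n

U, s, Vt = np.linalg.svd(img)

r = 10
# Rank-r reconstruction: sum of the r leading sigma_i u_i v_i^T terms.
approx = U[:, :r] * s[:r] @ Vt[:r, :]

stored = r * (2 * n + 1)          # r(m + n + 1) numbers, here m = n = 64
rel_err = np.linalg.norm(img - approx) / np.linalg.norm(img)
print(stored, n * n)              # 1290 vs. 4096 pixels
print(rel_err < 0.05)             # True
```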
Norms: L0, L2, L-infinity, Lp
• L0 "norm": ||x||_0 counts the total number of nonzero elements of a vector.
• L-infinity norm: ||x||_∞ = max_i |x_i|; for the vector x = [−6, 4, 2],
the L-infinity norm is 6.
Basic concepts
• We will use lower-case letters for vectors; the elements are referred to by x_i.
• Vector dot (inner) product: x^T y = Σ_i x_i y_i
Example, for x = [2, 5, 3]:
||x||_2 = √(4 + 25 + 9) ≈ 6.1644
||x||_∞ = 5
||x||_p = (2^p + 5^p + 3^p)^{1/p}
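These norms can be evaluated with NumPy (note that ord=0 gives the nonzero count, which is not a true norm):

```python
import numpy as np

x = np.array([2.0, 5.0, 3.0])

print(np.linalg.norm(x, 2))        # sqrt(4 + 25 + 9) ~ 6.1644
print(np.linalg.norm(x, np.inf))   # 5.0, the max |x_i|
print(np.linalg.norm(x, 0))        # 3.0, the number of nonzeros

p = 3
print(np.sum(np.abs(x) ** p) ** (1 / p))  # general p-norm
```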
Vector Norms
Example: the unit ball S = {x ∈ R^n : ||x|| ≤ 1} has a different shape for each norm:
{x ∈ R^n : ||x||_1 ≤ 1},  {x ∈ R^n : ||x||_2 ≤ 1},  {x ∈ R^n : ||x||_∞ ≤ 1},
and in general {x ∈ R^n : ||x||_p ≤ 1}
Matrix Norm Induced by Vector Norm
DEF: the matrix norm of A (induced by the vector norm) is defined to be
||A||_{m,n} = sup_{x ∈ R^n, x ≠ 0} ||Ax||_m / ||x||_n = sup_{||x||_n = 1} ||Ax||_m
Example: A = [ 1 2 ]
             [ 0 2 ]
The unit vector x that is amplified most by A (in the 1-norm) is [0, 1]^T;
the amplification factor is 4, so ||A||_1 = 4.
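NumPy agrees: the induced 1-norm of A is the maximum absolute column sum, attained at e_2 = [0, 1]^T:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 2.0]])

# Induced 1-norm = maximum absolute column sum.
print(np.linalg.norm(A, 1))        # 4.0

# Achieved by the unit vector e2 = [0, 1]^T:
e2 = np.array([0.0, 1.0])
print(np.linalg.norm(A @ e2, 1))   # 4.0
```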
Special matrices
diagonal:           upper-triangular:
[ a 0 0 ]           [ a b c ]
[ 0 b 0 ]           [ 0 d e ]
[ 0 0 c ]           [ 0 0 f ]
lower-triangular:   tri-diagonal:
[ a 0 0 ]           [ a b 0 0 ]
[ b c 0 ]           [ c d e 0 ]
[ d e f ]           [ 0 f g h ]
                    [ 0 0 i j ]
I (identity matrix):
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]
Basic concepts
Transpose: you can think of it as
• "flipping" the rows and columns, OR
• "reflecting" the vector/matrix across the main diagonal
e.g.  [ a ]^T = ( a b )
      [ b ]
[ a b ]^T = [ a c ]
[ c d ]     [ b d ]
Matrix Norm Induced by Vector Norm
DEF: if the matrix A is a square matrix,
||A||_n = sup_{||x||_n = 1} ||Ax||_n
Example: A = [ 1 2 ]
             [ 0 2 ]
The unit vector x that is amplified most by A (in the 1-norm) is [0, 1]^T;
the amplification factor is 4, so ||A||_1 = 4.
Cauchy-Schwarz inequality:
|x^T y| ≤ ||x||_2 ||y||_2
Hölder inequality: for 1 ≤ p, q ≤ ∞ with 1/p + 1/q = 1,
|x^T y| ≤ ||x||_p ||y||_q
Image Transforms
• A transform is basically a mathematical tool that allows us to move
from one domain to another (e.g. from the time domain to the frequency
domain).
• The reason to migrate from one domain to another is to perform the task
at hand in an easier manner. Image transforms are useful for fast
computation of convolution and correlation.
• Transforms change the representation of a signal by projecting it onto
a set of basis functions.
• The transforms do not change the information content present in the
signal.
NEED FOR TRANSFORM
Complex numbers in polar form:
a + bi = R e^{iθ}
e^{iθ} = cos(θ) + i sin(θ)
R = √(a² + b²),  θ = tan^{-1}(b/a)
R_1 e^{iθ_1} · R_2 e^{iθ_2} = R_1 R_2 e^{i(θ_1 + θ_2)}
Fourier Spectrum
Fourier: F(u) = R(u) + i I(u)
Fourier spectrum: |F(u)| = √(R²(u) + I²(u))
Polar form: F(u) = |F(u)| exp(iφ(u))
Discrete Fourier Transform
F(u) = (1/N) Σ_{x=0}^{N−1} f(x) e^{−2πiux/N}
F(0) = (1/N) Σ_{x=0}^{N−1} f(x) = f̄   (the average value)
Inverse transform:
f(x) = Σ_{u=0}^{N−1} F(u) e^{2πiux/N}
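A direct implementation of this DFT convention (the 1/N factor on the forward transform), checked against NumPy's FFT, which omits that factor:

```python
import numpy as np

def dft(f):
    """DFT with the 1/N factor on the forward transform, as above."""
    N = len(f)
    x = np.arange(N)
    E = np.exp(-2j * np.pi * np.outer(x, x) / N)
    return (E @ f) / N

f = np.array([1.0, 2.0, 4.0, 3.0])
F = dft(f)

# F(0) is the average of the samples.
print(np.isclose(F[0], f.mean()))          # True
# Matches NumPy's FFT up to the 1/N scaling.
print(np.allclose(F, np.fft.fft(f) / 4))   # True
```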
Example
A = (1/√2)[ 1  1 ],   A^{-1} = (1/√2)[ 1 −1 ] = A^T
          [ −1 1 ]                   [ 1  1 ]
For a real matrix A, it is unitary if A^{-1} = A^T (i.e. A is orthogonal).
Inverse of Unitary Transform
For a unitary transform y = A x, the inverse is
x = A^{-1} y = A^{*T} y
In components:
[ x_1 ]   [ a*_{11} … a*_{N1} ] [ y_1 ]
[  ⋮  ] = [   ⋮          ⋮   ] [  ⋮  ]
[ x_N ]   [ a*_{1N} … a*_{NN} ] [ y_N ]
x = Σ_{i=1}^{N} y_i b_i,   b_i = [a*_{1i}, …, a*_{Ni}]^T
The b_i are the basis vectors corresponding to the inverse transform.
1-D Unitary Transform
• A linear invertible transform
• Treat the 1-D sequence { x(0), x(1), …, x(N−1) } as a vector x
• y = A x, with A invertible
• A rotation: the angles between vectors are preserved
Example (Hadamard matrix):
A = (1/2)[ 1  1  1  1 ],   x = [ 100 ]
         [ 1 −1  1 −1 ]       [ 98  ]
         [ 1  1 −1 −1 ]       [ 98  ]
         [ 1 −1 −1  1 ]       [ 100 ]
y = A x = [ 198, 0, 0, 2 ]^T
The first coefficient (198) is significant; the rest are insignificant.
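Reproducing the Hadamard example numerically:

```python
import numpy as np

# 4-point Hadamard transform from the example above.
A = 0.5 * np.array([[1,  1,  1,  1],
                    [1, -1,  1, -1],
                    [1,  1, -1, -1],
                    [1, -1, -1,  1]], dtype=float)
x = np.array([100.0, 98.0, 98.0, 100.0])

y = A @ x
print(y)   # energy compacted into y[0]

# A is unitary (orthogonal), so the transform is inverted by A^T.
print(np.allclose(A @ A.T, np.eye(4)))  # True
print(np.allclose(A.T @ y, x))          # True
```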
Energy Conservation
If A is unitary and y = A x, then ||y||² = ||x||².
Proof:
||y||² = Σ_{i=1}^{N} |y_i|² = y^{*T} y = (Ax)^{*T} (Ax)
       = x^{*T} (A^{*T} A) x = x^{*T} x = Σ_{i=1}^{N} |x_i|² = ||x||²
Numerical Example
A = (1/√2)[ 1  1 ],  x = [ 3 ]
          [ −1 1 ]      [ 4 ]
y = A x = (1/√2)[ 1  1 ] [ 3 ] = (1/√2)[ 7 ]
               [ −1 1 ] [ 4 ]          [ 1 ]
Check:
||x||² = 3² + 4² = 25,   ||y||² = (7² + 1²)/2 = 25
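The check above in code:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [-1.0, 1.0]]) / np.sqrt(2)
x = np.array([3.0, 4.0])
y = A @ x

# ||y||^2 == ||x||^2 for a unitary A.
print(np.isclose(np.dot(y, y), np.dot(x, x)))  # True
```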
Properties of 1-D Unitary Transform y = A x
• Energy conservation: ||y||² = ||x||²
• Interpretation: the angles between vectors are preserved
Question: what unitary transform gives the best energy compaction and decorrelation?
1-D DFT
• { z(n) } ↔ { Z(k) },  n, k = 0, 1, …, N−1
Inverse transform:
z(n) = (1/√N) Σ_{k=0}^{N−1} Z(k) W_N^{−nk},   W_N = exp{−j2π/N}
(W_N is the complex conjugate of a primitive N-th root of unity)
• Vector form and interpretation of the inverse transform:
z = Σ_k Z(k) a_k
a_k = [1, W_N^{−k}, W_N^{−2k}, …, W_N^{−(N−1)k}]^T / √N
• The a_k are the basis vectors, with
a_k^H = a_k^{*T} = [1, W_N^{k}, W_N^{2k}, …, W_N^{(N−1)k}] / √N
In matrix form, the forward transform is Z = A z, where the rows of A are
a_0^{*T}, a_1^{*T}, …, a_{N−1}^{*T}, i.e. A_{kn} = e^{−j2πkn/N} / √N:
Z(k) = (1/√N) Σ_{n=0}^{N−1} z(n) e^{−j2πkn/N}
and the inverse transform is z = [a_0 a_1 … a_{N−1}] Z, whose matrix has
entries e^{+j2πkn/N} / √N:
z(n) = (1/√N) Σ_{k=0}^{N−1} Z(k) e^{+j2πkn/N}
Matrix/Vector Form of 1-D DFT (cont'd)
• { z(n) } ↔ { Z(k) }:
Z(k) = (1/√N) Σ_{n=0}^{N−1} z(n) W_N^{nk}
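A sketch of the unitary DFT matrix (1/√N normalization), verifying that it is unitary and conserves energy:

```python
import numpy as np

def dft_matrix(N):
    """Unitary DFT matrix: entries W_N^{nk} / sqrt(N),
    with W_N = exp(-j*2*pi/N)."""
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

N = 8
F = dft_matrix(N)

# Unitary: F^H F = I, so energy is conserved.
print(np.allclose(F.conj().T @ F, np.eye(N)))   # True

z = np.random.default_rng(2).standard_normal(N)
Z = F @ z
print(np.isclose(np.vdot(Z, Z).real, np.vdot(z, z)))  # True
```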
Examples of candidate 2×2 transform matrices A_1, …, A_5; e.g. the rotation matrix
A_5 = [ cos θ   sin θ ]
      [ −sin θ  cos θ ]
is unitary: its rows are orthonormal for any θ.
Discrete Cosine Transform (DCT)
• Transform matrix C:
  • c(k, n) = α(0)                       for k = 0
  • c(k, n) = α(k) cos[(2n+1)kπ / 2N]    for k > 0
  (with α(0) = √(1/N) and α(k) = √(2/N) for k > 0)
• C is real and orthogonal:
  • rows of C form an orthonormal basis
  • C is not symmetric!
• DCT is NOT the real part of the unitary DFT!
  • it is related to the DFT of a symmetrically extended signal
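The DCT matrix above can be built directly; the code below constructs it and confirms that it is orthogonal but not symmetric (N = 8 is an illustrative size):

```python
import numpy as np

def dct_matrix(N):
    """DCT-II transform matrix with the normalization above."""
    C = np.zeros((N, N))
    n = np.arange(N)
    C[0, :] = np.sqrt(1.0 / N)
    for k in range(1, N):
        C[k, :] = np.sqrt(2.0 / N) * np.cos((2 * n + 1) * k * np.pi / (2 * N))
    return C

C = dct_matrix(8)
# Rows are orthonormal (C is orthogonal) ...
print(np.allclose(C @ C.T, np.eye(8)))   # True
# ... but C is not symmetric.
print(np.allclose(C, C.T))               # False
```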
Periodicity Implied by DFT and DCT
(From Ken Lam's DCT talk 2001, HK Polytech)
[Figure: a length-8 signal z(n), n = 0…7, and its coefficients Z(k), k = 0…7.
The DFT implicitly repeats the signal periodically, while the DCT implicitly
extends it symmetrically, avoiding artificial discontinuities at the block
boundaries.]
2-D Separable Transform
Row transform:    Y_2 = X A^T = (A X^T)^T   (apply the right matrix multiplication first)
Column transform: Y = A Y_2
Conclusion:
• A 2-D separable transform can be decomposed into two sequential 1-D transforms
• The ordering of the 1-D transforms does not matter
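Because matrix multiplication is associative, the two orderings give the same 2-D transform; a quick check with a random orthogonal A (an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.linalg.qr(rng.standard_normal((4, 4)))[0]  # any orthogonal matrix
X = rng.standard_normal((4, 4))

# Rows first, then columns ...
Y_rows_first = A @ (X @ A.T)
# ... or columns first, then rows: same result.
Y_cols_first = (A @ X) @ A.T

print(np.allclose(Y_rows_first, Y_cols_first))   # True
print(np.allclose(Y_rows_first, A @ X @ A.T))    # True
```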
Energy Compaction Property of 2-D Unitary Transform
• Example (Hadamard):
A = (1/2)[ 1  1  1  1 ]
         [ 1 −1  1 −1 ]
         [ 1  1 −1 −1 ]
         [ 1 −1 −1  1 ]
X = [ 100 100 98 99  ],   Y = A X A^T = [ 391.5   0     5.5   1    ]
    [ 100 100 94 94  ]                  [ 2.5    −2    −4.5   2    ]
    [ 98  97  96 100 ]                  [ 1      −0.5   2    −0.5  ]
    [ 100 99  97 94  ]                  [ 2       1.5   0    −1.5  ]
A coefficient is called significant if its magnitude is above a pre-selected
threshold. With th = 64, all coefficients except Y_11 = 391.5 are insignificant.
Energy Conservation Property of 2-D Unitary Transform
||X||² = Σ_{i=1}^{N} Σ_{j=1}^{N} |x_ij|²   (2-norm of a matrix X)
If A is unitary and Y = A X A^T, then ||Y|| = ||X||.
Example:
A = [ 1/√2   1/√2 ],  X = [ 1 2 ],  Y = A X A^T = [ 5  −1 ]
    [ 1/√2  −1/√2 ]      [ 3 4 ]                 [ −2  0 ]
||X||² = 1² + 2² + 3² + 4² = 30 = 5² + 2² + 1² + 0² = ||Y||²
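The example above in code:

```python
import numpy as np

A = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2)
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])

Y = A @ X @ A.T
print(Y)  # [[5, -1], [-2, 0]] up to floating point

# Frobenius norms agree: ||Y|| == ||X||.
print(np.isclose(np.linalg.norm(Y), np.linalg.norm(X)))  # True
```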