Lecture 1
Daniel Kressner
Chair for Numerical Algorithms and HPC
Institute of Mathematics, EPFL
[email protected]
Organizational aspects
From http://www.niemanlab.org:

"... his [Aleksandr Kogan's] message went on to confirm that his approach was indeed similar to SVD or other matrix factorization methods, like in the Netflix Prize competition, and the Kosinski-Stillwell-Graepel Facebook model. Dimensionality reduction of Facebook data was the core of his model."
Rank and basic properties
For a field F, let A ∈ F^{m×n}. Then

    rank(A) := dim(range(A)).
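Numerically, the rank is typically determined from the singular values; a minimal NumPy sketch (the example matrix is made up for illustration):

    import numpy as np

    # A 3 x 4 matrix built from two outer products, hence rank 2.
    A = np.outer([1., 2., 3.], [1., 0., 1., 0.]) \
      + np.outer([0., 1., 1.], [0., 1., 0., 1.])
    print(np.linalg.matrix_rank(A))  # 2 = dim(range(A))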
Rank and matrix factorizations
Let B = {b1, . . . , br} ⊂ R^m with r = rank(A) be a basis of range(A).
Then each of the columns of A = [a1, a2, . . . , an] can be expressed as a linear combination of B:

    ai = b1 ci1 + b2 ci2 + · · · + br cir = [b1, . . . , br] [ci1, . . . , cir]^T,
Rank and matrix factorizations
Lemma. A matrix A ∈ R^{m×n} of rank r admits a factorization of the form

    A = BC^T,  B ∈ R^{m×r},  C ∈ R^{n×r}.

                 A      BC^T
    #entries     mn     mr + nr
• Generically (and in most applications), A has full rank, that is, rank(A) = min{m, n}.
• Aim instead at approximating A by a low-rank matrix.
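As a sanity check of the entry counts, a small NumPy sketch (variable names are mine) that recovers a factorization A = BC^T of a rank-r matrix from its thin SVD:

    import numpy as np

    m, n, r = 1000, 800, 20
    rng = np.random.default_rng(0)
    A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r

    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    B = U[:, :r] * s[:r]   # m x r
    C = Vt[:r, :].T        # n x r
    print(np.allclose(A, B @ C.T))  # True, up to roundoff
    print(m * n, m * r + n * r)     # 800000 entries vs. 36000 entries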
Questions addressed in lecture series
Literature for Lecture 1
1. Fundamental tools
• SVD
• Relation to eigenvalues
• Norms
• Best low-rank approximation
The singular value decomposition
Theorem (SVD). Let A ∈ R^{m×n} with m ≥ n. Then there are orthogonal matrices U ∈ R^{m×m} and V ∈ R^{n×n} such that

    A = UΣV^T,  with  Σ = [ diag(σ1, . . . , σn) ]  ∈ R^{m×n}
                          [          0          ]

and σ1 ≥ σ2 ≥ · · · ≥ σn ≥ 0.
• σ1, . . . , σn are called singular values
• u1, . . . , un are called left singular vectors
• v1, . . . , vn are called right singular vectors
• Avi = σi ui, A^T ui = σi vi for i = 1, . . . , n.
• Singular values are always uniquely defined by A.
• Singular vectors are never unique. If σ1 > σ2 > · · · > σn > 0, then they are unique up to ui ← ±ui, vi ← ±vi.
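The relations Avi = σi ui and A^T ui = σi vi are easy to verify numerically; a short NumPy sketch on a random matrix:

    import numpy as np

    m, n = 7, 4
    A = np.random.default_rng(1).standard_normal((m, n))
    U, s, Vt = np.linalg.svd(A)  # full SVD: U is m x m, Vt is n x n

    for i in range(n):
        assert np.allclose(A @ Vt[i], s[i] * U[:, i])    # A v_i = sigma_i u_i
        assert np.allclose(A.T @ U[:, i], s[i] * Vt[i])  # A^T u_i = sigma_i v_i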
SVD: Sketch of proof
Induction over n; the case n = 1 is trivial.
For general n, let v1 solve max{‖Av‖2 : ‖v‖2 = 1} =: ‖A‖2. Set σ1 := ‖A‖2 and u1 := Av1/σ1 (if σ1 = 0, then A = 0 and the result is trivial). By definition,

    Av1 = σ1 u1.

After completion to orthogonal matrices U1 = [u1, U⊥] ∈ R^{m×m} and V1 = [v1, V⊥] ∈ R^{n×n}:

    U1^T A V1 = [ σ1  w^T ]
                [ 0   A1  ]

for some w ∈ R^{n−1} and A1 ∈ R^{(m−1)×(n−1)}. Maximality of σ1 forces w = 0; applying the induction hypothesis to A1 completes the proof.
SVD: Computation (for small dense matrices)
Computation of SVD proceeds in two steps:
1. Reduction to bidiagonal form: By applying n Householder reflectors from the left and n − 1 Householder reflectors from the right, compute orthogonal matrices U1, V1 such that

       U1^T A V1 = B = [ B1 ]
                       [ 0  ],

   that is, B1 ∈ R^{n×n} is an upper bidiagonal matrix.
2. Reduction to diagonal form: Use divide-and-conquer to compute orthogonal matrices U2, V2 such that Σ = U2^T B1 V2 is diagonal. Set U = U1 U2 and V = V1 V2.
Step 1 is usually the most expensive. Remarks on Step 1:
• If m is significantly larger than n, say, m ≥ 3n/2, first computing a QR decomposition of A reduces the cost.
• Most modern implementations reduce A successively via banded form to bidiagonal form.²
² Bischof, C. H.; Lang, B.; Sun, X. A framework for symmetric band reduction. ACM Trans. Math. Software 26 (2000), no. 4, 581–601.
Economy-size (thin) SVD: A = UΣV^T with U ∈ R^{m×n} having orthonormal columns, orthogonal V ∈ R^{n×n}, Σ = diag(σ1, . . . , σn), and σ1 ≥ σ2 ≥ · · · ≥ σn ≥ 0.
Computed by MATLAB's [U,S,V] = svd(A,'econ').
Complexity:

                            memory         operations
    singular values only    O(mn)          O(mn^2)
    economy-size SVD        O(mn)          O(mn^2)
    (full) SVD              O(m^2 + mn)    O(m^2 n + mn^2)
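In NumPy (rather than MATLAB), the economy-size and singular-values-only variants correspond to the following options; a brief sketch:

    import numpy as np

    A = np.random.default_rng(2).standard_normal((1000, 50))  # m >> n
    U, s, Vt = np.linalg.svd(A, full_matrices=False)  # economy size
    print(U.shape, s.shape, Vt.shape)                 # (1000, 50) (50,) (50, 50)
    s_only = np.linalg.svd(A, compute_uv=False)       # singular values only
    print(np.allclose(s, s_only))                     # True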
SVD: Computation (for small dense matrices)
Beware of roundoff error when interpreting singular value plots.
Example: semilogy(svd(hilb(100)))
[Figure: semilogy plot of svd(hilb(100)); vertical axis from 10^0 down to 10^-20, horizontal axis 0 to 100. The computed singular values stagnate near roundoff level instead of decaying further.]
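A rough Python analogue of the MATLAB one-liner, assuming SciPy and Matplotlib are available:

    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.linalg import hilbert

    # Computed singular values of the 100 x 100 Hilbert matrix stagnate
    # near roundoff level instead of decaying to zero.
    plt.semilogy(np.linalg.svd(hilbert(100), compute_uv=False))
    plt.show()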
Singular/eigenvalue relations: symmetric matrices
For symmetric A ∈ R^{n×n} with spectral decomposition

    A = U diag(λ1, λ2, . . . , λn) U^T

and orthogonal U, the singular values of A are |λ1|, . . . , |λn| (reordered decreasingly).
Singular/eigenvalue relations: general matrices
Consider the SVD A = UΣV^T of A ∈ R^{m×n} with m ≥ n. We then have:
1. Spectral decomposition of Gramian

       A^T A = V Σ^T Σ V^T = V diag(σ1^2, . . . , σn^2) V^T

   ⇒ A^T A has eigenvalues σ1^2, . . . , σn^2;
   right singular vectors of A are eigenvectors of A^T A.
2. Spectral decomposition of Gramian

       AA^T = U Σ Σ^T U^T = U diag(σ1^2, . . . , σn^2, 0, . . . , 0) U^T

   ⇒ AA^T has eigenvalues σ1^2, . . . , σn^2 and, additionally, m − n zero eigenvalues;
   the first n left singular vectors of A are eigenvectors of AA^T.
3. Decomposition of Golub-Kahan matrix

       𝒜 := [ 0    A ]  =  [ U  0 ] [ 0    Σ ] [ U  0 ]^T
            [ A^T  0 ]     [ 0  V ] [ Σ^T  0 ] [ 0  V ]
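Relations 1 and 2 can be checked in a few lines of NumPy (eigenvalues sorted decreasingly for comparison):

    import numpy as np

    A = np.random.default_rng(3).standard_normal((6, 4))  # m = 6, n = 4
    s = np.linalg.svd(A, compute_uv=False)

    # A^T A has eigenvalues sigma_i^2 ...
    w = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
    print(np.allclose(w, s**2))  # True
    # ... and A A^T additionally has m - n = 2 zero eigenvalues.
    w2 = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]
    print(np.allclose(w2, np.concatenate([s**2, np.zeros(2)])))  # True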
Norms: Spectral and Frobenius norm
Euclidean geometry on matrices
Let B ∈ R^{n×n} have eigenvalues λ1, . . . , λn ∈ C. Then

    trace(B) := b11 + · · · + bnn = λ1 + · · · + λn.

In turn,

    ‖A‖F^2 = trace(A^T A) = trace(AA^T) = Σ_{i,j} aij^2.
The vector s(A) = (σ1, σ2, . . .) of singular values defines the Schatten-p norm

    ‖A‖(p) := ‖s(A)‖p.

Lemma. Let p, q ∈ [1, ∞] such that p^{-1} + q^{-1} = 1. Then the dual norm satisfies

    ‖A‖(p)^D = ‖A‖(q).
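Given the definition above, ‖A‖(p) is obtained by applying a vector p-norm to the singular values; a quick NumPy sketch:

    import numpy as np

    A = np.random.default_rng(4).standard_normal((5, 3))
    s = np.linalg.svd(A, compute_uv=False)
    # p = 1: nuclear norm, p = 2: Frobenius norm, p = inf: spectral norm.
    for p in [1, 2, np.inf]:
        print(p, np.linalg.norm(s, p))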
Given the SVD A = UΣV^T, set Uk := [u1, . . . , uk], Vk := [v1, . . . , vk], Σk := diag(σ1, . . . , σk). Then

    Tk(A) := Uk Σk Vk^T

has rank at most k. For any unitarily invariant norm ‖·‖:

    ‖A − Tk(A)‖ = ‖diag(0, . . . , 0, σk+1, . . . , σn)‖.
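A minimal NumPy sketch of the truncation operator (function name is mine); the spectral-norm error equals σk+1, matching the formula above:

    import numpy as np

    def truncated_svd(A, k):
        """Rank-k truncation T_k(A) = U_k Sigma_k V_k^T."""
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        return (U[:, :k] * s[:k]) @ Vt[:k]

    A = np.random.default_rng(5).standard_normal((8, 6))
    s = np.linalg.svd(A, compute_uv=False)
    k = 2
    print(np.isclose(np.linalg.norm(A - truncated_svd(A, k), 2), s[k]))  # True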
Best low-rank approximation
Theorem (Schmidt-Mirsky). Let A ∈ R^{m×n}. Then, for any unitarily invariant norm ‖·‖,

    ‖A − Tk(A)‖ ≤ ‖A − B‖  for all B ∈ R^{m×n} with rank(B) ≤ k,

that is, Tk(A) is a best rank-k approximation of A.
Approximating the range of a matrix
Aim at finding a matrix Q ∈ R^{m×k} with orthonormal columns such that range(Q) ≈ range(A). The error is

    ‖(I − QQ^T)A‖ = ‖A − QQ^T A‖.

Since rank(QQ^T A) ≤ k, Schmidt-Mirsky implies

    ‖A − QQ^T A‖ ≥ ‖A − Tk(A)‖,

with equality for Q = Uk (note that Uk Uk^T A = Tk(A)). Hence Q = Uk is optimal.
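Checking the optimality of Q = Uk numerically in the spectral norm (sketch; a random orthonormal Q serves as comparison):

    import numpy as np

    A = np.random.default_rng(6).standard_normal((50, 30))
    k = 5
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Q = U[:, :k]                                    # optimal choice Q = U_k
    print(np.isclose(np.linalg.norm(A - Q @ (Q.T @ A), 2), s[k]))  # sigma_{k+1}

    # Any other orthonormal Q cannot do better (Schmidt-Mirsky).
    Qr, _ = np.linalg.qr(np.random.default_rng(7).standard_normal((50, k)))
    print(np.linalg.norm(A - Qr @ (Qr.T @ A), 2) >= s[k])  # True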
Approximating the range of a matrix
Variation:

    max{‖Q^T A‖F : Q^T Q = Ik}.

Since ‖Q^T A‖F^2 = trace(Q^T AA^T Q) = ⟨AA^T, QQ^T⟩, this is equivalent to

    max{|⟨AA^T, QQ^T⟩| : Q^T Q = Ik}.

By von Neumann's trace inequality and the correspondence between eigenvectors of AA^T and left singular vectors of A, the optimal Q is given by Uk.