
8. The Singular Value Decomposition

CMDA 3606; Mark Embree
Version of 19 February 2017

The singular value decomposition (SVD) is among the most important and widely applicable matrix factorizations. It provides a natural way to untangle a matrix into its four fundamental subspaces, and reveals the relative importance of each direction within those subspaces. Thus the singular value decomposition is a vital tool for analyzing data, and it provides a slick way to understand (and prove) many fundamental results in matrix theory. It is the perfect tool for solving least squares problems, and provides the best way to approximate a matrix with one of lower rank. These notes construct the SVD in various forms, then describe a few of its most compelling applications.

8.1 Helpful facts about symmetric matrices

To derive the singular value decomposition of a general (rectangular) matrix A ∈ C^{m×n}, we shall rely on several special properties of the square, symmetric matrix A^*A. For this reason we first recall some fundamental results from the theory of symmetric matrices.

We shall use the term “symmetric” to mean a matrix where A^* = A. Often such matrices are instead called “Hermitian” or “self-adjoint,” and the term “symmetric” is reserved for matrices with A^T = A. If the entries of A are all real, there is no distinction.

Theorem 8.1 (Spectral Theorem) Suppose H ∈ C^{n×n} is symmetric. Then there exist n (not necessarily distinct) eigenvalues λ_1, . . . , λ_n and corresponding unit-length eigenvectors v_1, . . . , v_n such that

    H v_j = λ_j v_j.

The eigenvectors form an orthonormal basis for C^n:

    C^n = span{v_1, . . . , v_n},

and v_j^* v_k = 0 when j ≠ k, and v_j^* v_j = ‖v_j‖^2 = 1.

For example, when

    H = [ 3  −1 ; −1  3 ],

we have λ_1 = 4 and λ_2 = 2, with

    v_1 = [ √2/2 ; −√2/2 ],   v_2 = [ √2/2 ; √2/2 ].

Note that these eigenvectors are unit vectors, and they are orthogonal.

As a consequence of the Spectral Theorem, we can write any symmetric matrix H ∈ C^{n×n} in the form

    H = ∑_{j=1}^{n} λ_j v_j v_j^*.    (8.1)

This equation expresses H as the sum of the special rank-1 matrices λ_j v_j v_j^*. The singular value decomposition will provide a similar way to tease apart a rectangular matrix.

Theorem 8.2 All eigenvalues of a symmetric matrix are real.

Proof. Let (λ_j, v_j) be an arbitrary eigenpair of the symmetric matrix H, so that H v_j = λ_j v_j. Without loss of generality, we can assume that v_j is scaled so that ‖v_j‖ = 1, i.e., v_j^* v_j = ‖v_j‖^2 = 1. Thus

    λ_j = λ_j (v_j^* v_j) = v_j^* (λ_j v_j) = v_j^* (H v_j).

Since H is symmetric, H = H^*, and so

    v_j^* (H v_j) = v_j^* H^* v_j = (H v_j)^* v_j = (λ_j v_j)^* v_j = λ̄_j v_j^* v_j = λ̄_j.

Thus λ_j = λ̄_j, which is only possible if λ_j is real.

Definition 8.1 A symmetric matrix H ∈ C^{n×n} is positive definite provided x^* H x > 0 for all nonzero x ∈ C^n; if x^* H x ≥ 0 for all x ∈ C^n, we say H is positive semidefinite.

For the example above,

    x^* H x = [ x_1 ; x_2 ]^* [ 3  −1 ; −1  3 ] [ x_1 ; x_2 ] = 3x_1^2 − 2x_1 x_2 + 3x_2^2 = 2(x_1 − x_2)^2 + (x_1 + x_2)^2.

This last expression, a sum of squares, is clearly positive for all nonzero x, so H is positive definite.

Theorem 8.3 All eigenvalues of a symmetric positive definite matrix are positive; all eigenvalues of a symmetric positive semidefinite matrix are nonnegative.

Proof. Let (λ_j, v_j) denote an eigenpair of the symmetric positive definite matrix H ∈ C^{n×n} with ‖v_j‖^2 = v_j^* v_j = 1. Since H is symmetric, λ_j must be real. We conclude that

    λ_j = λ_j v_j^* v_j = v_j^* (λ_j v_j) = v_j^* H v_j,

which must be positive since H is positive definite and v_j ≠ 0. The proof for positive semidefinite matrices is the same, except we can only conclude that λ_j = v_j^* H v_j ≥ 0.

Can you prove the converse of this theorem? (A symmetric matrix with positive eigenvalues is positive definite.) Hint: use the Spectral Theorem. With this result, we can check if H is positive definite by just looking at its eigenvalues, rather than working out a formula for x^* H x, as done above.
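These facts are easy to explore numerically. The following MATLAB lines are a sketch (not part of the original notes) using the 2 × 2 example above; they check that the eigenvalues are real and positive and that the eigenvectors are orthonormal:

    % Numerical check of Theorems 8.1-8.3 for the example H above.
    H = [3 -1; -1 3];
    [V, D] = eig(H);           % columns of V are eigenvectors, D holds the eigenvalues
    disp(diag(D)')             % real, positive eigenvalues (2 and 4)
    disp(V'*V)                 % approximately the identity: orthonormal eigenvectors
    disp(norm(H - V*D*V'))     % H equals the sum of the rank-1 pieces lambda_j v_j v_j'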

8.2 Derivation of the singular value decomposition: Full rank case

We seek to derive the singular value decomposition of a general rectangular matrix. To simplify our initial derivation, we shall assume that A ∈ C^{m×n} with m ≥ n, and that rank(A) is as large as possible, i.e.,

    rank(A) = n.
First, form A^*A, which is an n × n matrix. Notice that A^*A is always symmetric, since

    (A^*A)^* = A^*(A^*)^* = A^*A.

Furthermore, this matrix is positive definite: notice that

    x^* A^*A x = (Ax)^*(Ax) = ‖Ax‖^2 ≥ 0.

Since rank(A) = n, notice that

    dim(N(A)) = n − rank(A) = 0.

Since the null space of A is trivial, Ax ≠ 0 whenever x ≠ 0, so

    x^* A^*A x = ‖Ax‖^2 > 0

for all nonzero x. Hence A^*A is positive definite.

Keep this in mind: if rank(A) < n, then dim(N(A)) > 0, so there exist x ≠ 0 for which x^* A^*A x = ‖Ax‖^2 = 0. Hence A^*A will only be positive semidefinite in this case.

We are now ready to construct our first version of the singular value decomposition. We shall construct the pieces one at a time, then assemble them into the desired decomposition.

Step 1. Compute the eigenvalues and eigenvectors of A^*A.

As a consequence of results about symmetric matrices presented above, we can find n eigenpairs {(λ_j, v_j)}_{j=1}^{n} of H = A^*A with unit eigenvectors (v_j^* v_j = ‖v_j‖^2 = 1) that are orthogonal to one another (v_j^* v_k = 0 when j ≠ k). We are free to pick any convenient indexing for these eigenpairs; we shall label them so that the eigenvalues are decreasing in size, λ_1 ≥ λ_2 ≥ · · · ≥ λ_n > 0. (Since A^*A is positive definite, all its eigenvalues are positive.) It is helpful to emphasize that v_1, . . . , v_n ∈ C^n. Even if A is a square matrix, be sure to compute the eigenvalues and eigenvectors of A^*A.

Step 2. Define σ_j = ‖Av_j‖ = √λ_j,  j = 1, . . . , n.

Note that σ_j^2 = ‖Av_j‖^2 = v_j^* A^*A v_j = λ_j. Since the eigenvalues λ_1, . . . , λ_n are decreasing in size, so too are the σ_j values:

    σ_1 ≥ σ_2 ≥ · · · ≥ σ_n > 0.

Step 3. Define u_j = Av_j / σ_j for j = 1, . . . , n.

(The assumption that rank(A) = n helped us out here, by ensuring that σ_j > 0 for all j: hence we can safely divide by σ_j in the definition of u_j.)

Notice that u_1, . . . , u_n ∈ C^m. Because σ_j = ‖Av_j‖, we ensure that

    ‖u_j‖ = ‖ (1/σ_j) Av_j ‖ = ‖Av_j‖ / σ_j = 1.

Furthermore, these u_j vectors are orthogonal. To see this, write

    u_j^* u_k = (1/(σ_j σ_k)) (Av_j)^*(Av_k) = (1/(σ_j σ_k)) v_j^* A^*A v_k.

Since v_k is an eigenvector of A^*A corresponding to eigenvalue λ_k,

    (1/(σ_j σ_k)) v_j^* A^*A v_k = (1/(σ_j σ_k)) v_j^* (λ_k v_k) = (λ_k/(σ_j σ_k)) v_j^* v_k.

Since the eigenvectors of the symmetric matrix A^*A are orthogonal, v_j^* v_k = 0 when j ≠ k, so the u_j vectors inherit the orthogonality of the v_j vectors:

    u_j^* u_k = 0,   j ≠ k.

Step 4. Put the pieces together.

For all j = 1, . . . , n,

    Av_j = σ_j u_j,

regardless of whether σ_j = 0 or not. We can stack these n vector equations as columns of a single matrix equation,

    [ Av_1  Av_2  · · ·  Av_n ] = [ σ_1 u_1  σ_2 u_2  · · ·  σ_n u_n ].

Note that both matrices in this equation can be factored into the product of simpler matrices:

    A [ v_1  v_2  · · ·  v_n ] = [ u_1  u_2  · · ·  u_n ] diag(σ_1, σ_2, . . . , σ_n).

Denote these matrices as

    A V = Û Σ̂,    (8.2)

where A ∈ C^{m×n}, V ∈ C^{n×n}, Û ∈ C^{m×n}, and Σ̂ ∈ C^{n×n}.

We now have all the ingredients for various forms of the singular value decomposition. Since the eigenvectors v_j of the symmetric matrix A^*A are orthonormal, the square matrix V has orthonormal columns. This means that

    V^* V = I,

since the (j, k) entry of V^*V is simply v_j^* v_k. Since V is square, the equation V^*V = I implies that V^* = V^{−1}. (The inverse of a square matrix is unique: since V^* does what the inverse of V is supposed to do, i.e., V^*V = I, it must be the unique matrix V^{−1}.) Thus, in addition to V^*V = I, we also have

    V V^* = V V^{−1} = I.

Thus multiplying both sides of equation (8.2) on the right by V^* gives

    A = Û Σ̂ V^*.    (8.3)

This factorization is the reduced (or skinny) singular value decomposition of A. It can be obtained via the MATLAB command

[Uhat, Sighat, V] = svd(A,0).
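The construction of Steps 1–4 can also be carried out directly in MATLAB. Here is a minimal sketch (not from the notes), using an arbitrary full-rank test matrix of our own choosing; individual columns of V and Û may differ from those returned by svd by a sign, but the product recovers A either way:

    A = randn(5,3);                    % any A with m >= n and rank(A) = n will do
    [V, D] = eig(A'*A);                % Step 1: eigenpairs of A'*A
    [lambda, idx] = sort(diag(D), 'descend');
    V = V(:, idx);                     % order the eigenvalues from largest to smallest
    sigma = sqrt(lambda);              % Step 2: singular values
    Uhat = A*V*diag(1./sigma);         % Step 3: u_j = A*v_j / sigma_j
    Sighat = diag(sigma);              % Step 4: assemble the pieces
    disp(norm(A - Uhat*Sighat*V'))     % essentially zero: A = Uhat*Sighat*V'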

What can be said of the matrix Û ∈ C^{m×n}? Recall that its columns, the vectors u_1, . . . , u_n, are orthonormal. However, in contrast to V, we cannot conclude that Û Û^* = I when m > n. Why not? Because when m > n, there must exist some nonzero z ∈ C^m such that z ⊥ u_1, . . . , u_n, which implies Û^* z = 0. Hence Û Û^* z = 0, so we cannot have Û Û^* = I. (However, Û Û^* ∈ C^{m×m} is a projector onto the n-dimensional subspace span{u_1, . . . , u_n} of C^m.)
We wish to augment the matrix Û with m − n additional column vectors, to give a full set of m orthonormal vectors in C^m. Here is the recipe to find these extra vectors: for j = n + 1, . . . , m, pick

    u_j ⊥ span{u_1, . . . , u_{j−1}}

with u_j^* u_j = 1. Then define

    U = [ u_1  · · ·  u_n  u_{n+1}  · · ·  u_m ] ∈ C^{m×m}.    (8.4)

We have constructed u_1, . . . , u_m to be orthonormal vectors, so

    U^* U = I.

However, since U ∈ C^{m×m}, this orthogonality also implies U^{−1} = U^*.


Now we are ready to replace the rectangular matrix U b ∈ C m×n in
×
the reduced SVD (8.3) with the square matrix U ∈ C m . To do so,
m

b ∈ C n×n by some Σ ∈ C m×n in such a way


we also need to replace Σ
Σ =
that
b
U U Σ

b
U b = UΣ.

The simplest approach is to obtain Σ by appending zeros to the end


of Σ,
b thus ensuring there is no contribution when the new entries of
U multiply against the new entries of Σ:
Σ = Σ
" #
Σ
b b
Σ= ∈ C m×n .
b
(8.5) U
b U
b U
e
0 0

Finally, we arrive at the main result, the full singular value decomposi-
tion, for the case where rank(A) = n.

Theorem 8.4 (Singular value decomposition, provisional version)
Suppose A ∈ C^{m×n} has rank(A) = n, with m ≥ n. Then we can write

A = UΣV∗ ,

where the columns of U ∈ C m×m and V ∈ C n×n are orthonormal,

U∗ U = I ∈ C m×m , V∗ V = I ∈ C n×n ,

and Σ ∈ C m×n is zero everywhere except for entries on the main diagonal,
where the ( j, j) entry is σj , for j = 1, . . . , n and

σ1 ≥ σ2 ≥ · · · ≥ σn > 0.

The full SVD is obtained via the MATLAB command

[U,S,V] = svd(A).

Definition 8.2 Let A = UΣV^* be a full singular value decomposition. The diagonal entries of Σ, denoted σ_1 ≥ σ_2 ≥ · · · ≥ σ_n, are called the singular values of A. The columns u_1, . . . , u_m of U are the left singular vectors; the columns v_1, . . . , v_n of V are the right singular vectors.

8.3 The dyadic form of the SVD

We are now prepared to develop an analogue of the formula (8.1) for rectangular matrices. Consider the reduced SVD,

    A = Û Σ̂ V^*,

and multiply out Û Σ̂ to obtain

    Û Σ̂ = [ u_1  u_2  · · ·  u_n ] diag(σ_1, σ_2, . . . , σ_n) = [ σ_1 u_1  σ_2 u_2  · · ·  σ_n u_n ].

Now notice that you can write A = (Û Σ̂) V^* as

    A = [ σ_1 u_1  σ_2 u_2  · · ·  σ_n u_n ] [ v_1^* ; v_2^* ; · · · ; v_n^* ] = ∑_{j=1}^{n} σ_j u_j v_j^*,

which parallels the form (8.1) we had for symmetric matrices:

    A = ∑_{j=1}^{n} σ_j u_j v_j^*.    (8.6)

This expression is called the dyadic form of the SVD. Because we have
ordered σ1 ≥ σ2 ≥ · · · ≥ σn , the leading terms in this sum dominate
the others. This fact plays a crucial role in applications where we
want to approximate a matrix with its leading low-rank part.

8.4 A small example

Consider the matrix

    A = [ 1  1 ; 0  0 ; √2  −√2 ],

for which A^*A is the symmetric matrix used as an example earlier in these notes:

    A^*A = [ 3  −1 ; −1  3 ].

This matrix has rank(A) = 2 = n, so we can apply the analysis described above.

Step 1. Compute the eigenvalues and eigenvectors of A^*A.

We have already seen that, for this matrix, λ_1 = 4 and λ_2 = 2, with

    v_1 = [ √2/2 ; −√2/2 ],   v_2 = [ √2/2 ; √2/2 ],

with λ_1 ≥ λ_2, the required order. The vectors v_1 and v_2 will be the right singular vectors of A.
Step 2. Define σ_j = ‖Av_j‖ = √λ_j,  j = 1, . . . , n.

In this case, we compute

    σ_1 = √λ_1 = 2,   σ_2 = √λ_2 = √2.

Alternatively, we could have computed the singular values from

    Av_1 = [ 1  1 ; 0  0 ; √2  −√2 ] [ √2/2 ; −√2/2 ] = [ 0 ; 0 ; 2 ],
    Av_2 = [ 1  1 ; 0  0 ; √2  −√2 ] [ √2/2 ; √2/2 ] = [ √2 ; 0 ; 0 ],

with σ_1 = ‖Av_1‖ = 2 and σ_2 = ‖Av_2‖ = √2.

Step 3. Define u_j = Av_j / σ_j,  j = 1, . . . , n.

We use the vectors Av_1 and Av_2 computed at the last step:

    u_1 = (1/σ_1) Av_1 = (1/2) [ 0 ; 0 ; 2 ] = [ 0 ; 0 ; 1 ],   u_2 = (1/σ_2) Av_2 = (1/√2) [ √2 ; 0 ; 0 ] = [ 1 ; 0 ; 0 ].

Step 4. Put the pieces together.

We immediately have the reduced SVD A = Û Σ̂ V^*:

    [ 1  1 ; 0  0 ; √2  −√2 ] = [ 0  1 ; 0  0 ; 1  0 ] [ 2  0 ; 0  √2 ] [ √2/2  −√2/2 ; √2/2  √2/2 ].

To get the full SVD, we need a unit vector u_3 that is orthogonal to u_1 and u_2. In this case, such a vector is easy to spot:

    u_3 = [ 0 ; 1 ; 0 ].

Thus we can write the full SVD A = U Σ V^*:

    [ 1  1 ; 0  0 ; √2  −√2 ] = [ 0  1  0 ; 0  0  1 ; 1  0  0 ] [ 2  0 ; 0  √2 ; 0  0 ] [ √2/2  −√2/2 ; √2/2  √2/2 ].

Finally, we write the dyadic form of the SVD, A = ∑_{j=1}^{2} σ_j u_j v_j^*:

    [ 1  1 ; 0  0 ; √2  −√2 ] = 2 [ 0 ; 0 ; 1 ] [ √2/2  −√2/2 ] + √2 [ 1 ; 0 ; 0 ] [ √2/2  √2/2 ]
                              = [ 0  0 ; 0  0 ; √2  −√2 ] + [ 1  1 ; 0  0 ; 0  0 ].
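A quick MATLAB check of this example (a sketch; svd may flip the signs of individual singular vectors, which leaves the product U*S*V' unchanged):

    A = [1 1; 0 0; sqrt(2) -sqrt(2)];
    [U, S, V] = svd(A);
    disp(diag(S)')            % singular values 2 and sqrt(2)
    disp(norm(A - U*S*V'))    % essentially zero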

8.5 Derivation of the singular value decomposition: Rank deficient case

Having computed the singular value decomposition of a matrix A ∈ C^{m×n} with rank(A) = n, we must now consider the adjustments necessary when rank(A) = r < n, still with m ≥ n. Recall that the dimension of the null space of A is given by

    dim(N(A)) = n − rank(A) = n − r.

How do the null spaces of A and A∗ A compare?

Lemma 8.1 For any matrix A ∈ C^{m×n}, N(A^*A) = N(A).

Proof. First we show that N(A) is contained in N(A^*A). If x ∈ N(A), then Ax = 0. Premultiplying by A^* gives A^*Ax = 0, so x ∈ N(A^*A). Now we show that N(A^*A) is contained in N(A). If x ∈ N(A^*A), then A^*Ax = 0. Premultiplying by x^* gives

    0 = x^* A^*A x = (Ax)^*(Ax) = ‖Ax‖^2.

Since ‖Ax‖ = 0, we conclude that Ax = 0, and so x ∈ N(A). Since the spaces N(A) and N(A^*A) each contain the other, we conclude that N(A) = N(A^*A).
Now we can make a crucial insight: the dimension of N(A) tells us how many zero eigenvalues A^*A has. In particular, suppose x_1, . . . , x_{n−r} is a basis for N(A). Then Ax_j = 0 implies

    A^*A x_j = 0 = 0 · x_j,   j = 1, . . . , n − r,

and so λ = 0 is an eigenvalue of A^*A of multiplicity n − r.

(Can you construct a 2 × 2 matrix A whose only eigenvalue is zero, but dim(N(A)) = 1? What is the multiplicity of the zero eigenvalue of A^*A?)



How do these zero eigenvalues of A^*A affect the singular value decomposition? To begin, perform Steps 1 and 2 of the SVD procedure just as before.

Step 1. Compute the eigenvalues and eigenvectors of A^*A.

Since we order the eigenvalues of A^*A so that λ_1 ≥ · · · ≥ λ_n ≥ 0, and we have just seen that zero is an eigenvalue of A^*A of multiplicity n − r, we must have

    λ_1 ≥ λ_2 ≥ · · · ≥ λ_r > 0,   λ_{r+1} = · · · = λ_n = 0.

The corresponding orthonormal eigenvectors are v_1, . . . , v_n, with the last n − r of these vectors in N(A^*A) = N(A), i.e., Av_j = 0.

Step 2. Define σ_j = ‖Av_j‖ = √λ_j,  j = 1, . . . , n.

This step proceeds without any alterations, though now we have

    σ_1 ≥ σ_2 ≥ · · · ≥ σ_r > 0,   σ_{r+1} = · · · = σ_n = 0.

The third step of the SVD construction needs alteration, since we can only define the left singular vectors via u_j = Av_j / σ_j when σ_j > 0, that is, for j = 1, . . . , r. Any choice for the remaining vectors u_{r+1}, . . . , u_n will trivially satisfy the equation Av_j = σ_j u_j, since Av_j = 0 and σ_j = 0 for j = r + 1, . . . , n. Since we are building Û ∈ C^{m×n} (and eventually U ∈ C^{m×m}) to have orthonormal columns, we will simply build out u_{r+1}, . . . , u_n so that all the vectors u_1, . . . , u_n are orthonormal.

Step 3a. Define u_j = Av_j / σ_j for j = 1, . . . , r.

Step 3b. Construct orthonormal vectors u_{r+1}, . . . , u_n.

For each j = r + 1, . . . , n, construct a unit vector u_j such that u_j ⊥ span{u_1, . . . , u_{j−1}}. If r = 0 (which implies the trivial case A = 0), just set u_1 to be any unit vector. This procedure is exactly the same as used above to construct the vectors u_{n+1}, . . . , u_m that extend the reduced SVD with Û ∈ C^{m×n} to the full SVD with U ∈ C^{m×m}.

Step 4. Put the pieces together.

This step proceeds exactly as before. Now we define

    Û = [ u_1  · · ·  u_r  u_{r+1}  · · ·  u_n ] ∈ C^{m×n},

    Σ̂ = diag(σ_1, . . . , σ_r, σ_{r+1}, . . . , σ_n) = diag(σ_1, . . . , σ_r, 0, . . . , 0) ∈ C^{n×n},

and

    V = [ v_1  · · ·  v_r  v_{r+1}  · · ·  v_n ] ∈ C^{n×n}.

Notice that V is still a square matrix with orthonormal columns, so V^*V = I and V^{−1} = V^*. Since Av_j = σ_j u_j holds for j = 1, . . . , n, we again have the reduced singular value decomposition

    A = Û Σ̂ V^*.

As before, Û ∈ C^{m×n} can be enlarged to give U ∈ C^{m×m} by supplying extra orthogonal unit vectors that complete a basis for C^m:

    u_j ⊥ span{u_1, . . . , u_{j−1}},   ‖u_j‖ = 1,   j = n + 1, . . . , m.

Constructing U ∈ C^{m×m} as in (8.4) and Σ ∈ C^{m×n} as in (8.5), we have the full singular value decomposition

    A = U Σ V^*.

The dyadic decomposition could still be written as

    A = ∑_{j=1}^{n} σ_j u_j v_j^*,

but we get more insight if we crop the trivial terms from this sum. Since σ_{r+1} = · · · = σ_n = 0, we can truncate the decomposition to its first r terms:

    A = ∑_{j=1}^{r} σ_j u_j v_j^*.

We will see that this form of A is especially useful for understanding the four fundamental subspaces.
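To see the rank-deficient case in action, here is a MATLAB sketch with a made-up rank-1 matrix (chosen only for illustration): the trailing singular value is zero up to rounding, and the single leading dyad recovers A.

    A = [1 2; 2 4; 3 6];                    % each column is a multiple of [1;2;3], so r = 1
    disp(rank(A))                           % 1
    disp(svd(A)')                           % one positive singular value, then (numerically) zero
    [U, S, V] = svd(A);
    disp(norm(A - S(1,1)*U(:,1)*V(:,1)'))   % the leading dyad alone recovers A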

8.6 The connection to AA∗

Our derivation of the SVD relied heavily on an eigenvalue decomposition of A^*A. How does the SVD relate to AA^*? Consider forming

    AA^* = (UΣV^*)(UΣV^*)^* = UΣV^*VΣ^*U^* = UΣΣ^*U^*.    (8.7)

Notice that ΣΣ^* is a diagonal m × m matrix:

    ΣΣ^* = [ Σ̂ ; 0 ] [ Σ̂  0 ] = [ Σ̂^2  0 ; 0  0 ],

where we have used the fact that Σ̂ is a diagonal matrix. Indeed,

    Σ̂^2 = diag(σ_1^2, . . . , σ_n^2) = diag(λ_1, . . . , λ_n),

where the λ_j values still denote the eigenvalues of A^*A. Thus equation (8.7) becomes

    AA^* = U [ Λ  0 ; 0  0 ] U^*,

which is a diagonalization of AA^*. Postmultiplying this equation by U, we have

    (AA^*) U = U [ Λ  0 ; 0  0 ];

the first n columns of this equation give

    AA^* u_j = λ_j u_j,   j = 1, . . . , n,

while the last m − n columns give

    AA^* u_j = 0 u_j,   j = n + 1, . . . , m.

Thus the columns u_1, . . . , u_n are eigenvectors of AA^*. Notice then that AA^* and A^*A have the same eigenvalues, except that AA^* has m − n extra zero eigenvalues.

This suggests a different way to compute the U matrix: form AA^* and compute all its eigenvectors, giving u_1, . . . , u_m all at once. Thus we avoid the need for a special procedure to construct unit vectors orthogonal to u_1, . . . , u_r.
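A quick numerical look at the shared eigenvalues (a sketch with an arbitrary test matrix of our own choosing):

    A = randn(5,3);                        % m = 5, n = 3
    disp(sort(eig(A'*A), 'descend')')      % n eigenvalues of A'*A
    disp(sort(eig(A*A'), 'descend')')      % the same values, plus m - n = 2 extra zeros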
8.7 Modification for the case of m < n

How does the singular value decomposition change if A has more columns than rows, n > m? The answer is easy: write the SVD of A^* (which has more rows than columns) using the procedure above, then take the conjugate-transpose of each term in the SVD. If this makes good sense, skip ahead to the next section. If you prefer the gory details, read on.

We will formally adapt the steps described above to handle the case n > m. Let r = rank(A) ≤ m.

Step 1. Compute the eigenvalues and eigenvectors of AA^*.

Label the eigenvalues of AA^* ∈ C^{m×m} as

    λ_1 ≥ λ_2 ≥ · · · ≥ λ_m

and the corresponding orthonormal eigenvectors as

    u_1, u_2, . . . , u_m.

Step 2. Define σ_j = ‖A^* u_j‖ = √λ_j,  j = 1, . . . , m.

Step 3a. Define v_j = A^* u_j / σ_j for j = 1, . . . , r.

Step 3b. Construct orthonormal vectors v_{r+1}, . . . , v_m.

Notice that these vectors only arise in the rank-deficient case, when r < m. (Steps 3a and 3b construct a matrix V̂ ∈ C^{n×m} with orthonormal columns.)

Step 3c. Construct orthonormal vectors v_{m+1}, . . . , v_n.

Following the same procedure as Step 3b, we construct the extra vectors needed to obtain a full orthonormal basis for C^n.

Step 4. Put the pieces together.

First, defining

    Û = [ u_1  · · ·  u_m ] ∈ C^{m×m},   V̂ = [ v_1  · · ·  v_m ] ∈ C^{n×m},

with diagonal matrix

    Σ̂ = diag(σ_1, . . . , σ_m) ∈ C^{m×m},

we have the reduced SVD

    A = Û Σ̂ V̂^*.

To obtain the full SVD, we extend V̂ to obtain

    V = [ v_1  · · ·  v_m  v_{m+1}  · · ·  v_n ] ∈ C^{n×n},

and similarly extend Σ̂,

    Σ = [ Σ̂  0 ] ∈ C^{m×n},

where we have now added extra zero columns, in contrast to the extra zero rows added in the m > n case in (8.5). We thus arrive at the full SVD,

    A = U Σ V^*.
8.8 General statement of the singular value decomposition

We now can state the singular value decomposition in its fullest


generality.
Theorem 8.5 (Singular value decomposition) Suppose A ∈ C^{m×n} has rank(A) = r. Then we can write

    A = U Σ V^*,

where the columns of U ∈ C^{m×m} and V ∈ C^{n×n} are orthonormal,

    U^* U = I ∈ C^{m×m},   V^* V = I ∈ C^{n×n},

and Σ ∈ C^{m×n} is zero everywhere except for entries on the main diagonal, where the (j, j) entry is σ_j, for j = 1, . . . , min{m, n}, and

    σ_1 ≥ σ_2 ≥ · · · ≥ σ_r > σ_{r+1} = · · · = σ_{min{m,n}} = 0.

(Of course, when r = 0 all the singular values are zero; when r = min{m, n}, all the singular values are positive.)

Denoting the columns of U and V as u_1, . . . , u_m and v_1, . . . , v_n, we can write

    A = ∑_{j=1}^{r} σ_j u_j v_j^*.    (8.8)

8.9 Connection to the four fundamental subspaces

Having labored to develop the singular value decomposition in its complete generality, we are ready to reap its many rewards. We begin by establishing the connection between the singular vectors and the ‘four fundamental subspaces,’ i.e., the column space

    R(A) = {Ax : x ∈ C^n} ⊆ C^m,

the row space

    R(A^*) = {A^*y : y ∈ C^m} ⊆ C^n,

the null space

    N(A) = {x ∈ C^n : Ax = 0} ⊆ C^n,

and the left null space

    N(A^*) = {y ∈ C^m : A^*y = 0} ⊆ C^m.

We shall explore these spaces using the dyadic form of the SVD (8.8). To characterize the column space, apply A to a generic vector x ∈ C^n:

    Ax = ( ∑_{j=1}^{r} σ_j u_j v_j^* ) x = ∑_{j=1}^{r} σ_j u_j (v_j^* x) = ∑_{j=1}^{r} σ_j (v_j^* x) u_j,    (8.9)

where in the last step we have switched the order of the scalar v_j^* x and the vector u_j. We see that Ax is a weighted sum of the vectors u_1, . . . , u_r. Since this must hold for all x ∈ C^n, we conclude that

    R(A) ⊆ span{u_1, . . . , u_r}.

Can we conclude the converse? We know that R(A) is a subspace, so if we can show that each of the vectors u_1, . . . , u_r is in R(A), then we will know that

    span{u_1, . . . , u_r} ⊆ R(A).    (8.10)

To show that u_k ∈ R(A), we must find some x such that Ax = u_k. Inspect equation (8.9). We can make Ax = u_k if all the coefficients σ_j v_j^* x are zero when j ≠ k, and σ_k v_k^* x = 1. Can you see how to use orthogonality of the right singular vectors v_1, . . . , v_r to achieve this? Setting

    x = (1/σ_k) v_k,

we have Ax = u_k. Thus u_k ∈ R(A), and we can conclude that (8.10) holds. Since R(A) and span{u_1, . . . , u_r} contain one another, we conclude that

    R(A) = span{u_1, . . . , u_r}.
We can characterize the row space in exactly the same way, using the dyadic form

    A^* = ( ∑_{j=1}^{r} σ_j u_j v_j^* )^* = ∑_{j=1}^{r} ( σ_j u_j v_j^* )^* = ∑_{j=1}^{r} σ_j v_j u_j^*.

Adapting the argument we have just made leads to

    R(A^*) = span{v_1, . . . , v_r}.

Equation (8.9) for Ax is also the key that unlocks the null space N(A). For what x ∈ C^n does Ax = 0? Let us consider

    ‖Ax‖^2 = (Ax)^*(Ax) = ( ∑_{j=1}^{r} σ_j (v_j^* x) u_j )^* ( ∑_{k=1}^{r} σ_k (v_k^* x) u_k )
           = ( ∑_{j=1}^{r} σ_j (x^* v_j) u_j^* ) ( ∑_{k=1}^{r} σ_k (v_k^* x) u_k )
           = ∑_{j=1}^{r} ∑_{k=1}^{r} σ_j (x^* v_j) σ_k (v_k^* x) (u_j^* u_k).

(We quietly used (v_j^* x)^* = x^* v_j here. If v_j^* x is real, (v_j^* x)^* = x^* v_j = v_j^* x; if v_j^* x is complex, we must be more careful: (v_j^* x)^* = x^* v_j is the complex conjugate of v_j^* x.)

Since the left singular vectors are orthogonal, u_j^* u_k = 0 for j ≠ k, this double sum collapses: only the terms with j = k make a nontrivial contribution:

    ‖Ax‖^2 = ∑_{j=1}^{r} σ_j (x^* v_j) σ_j (v_j^* x) (u_j^* u_j) = ∑_{j=1}^{r} σ_j^2 |v_j^* x|^2,    (8.11)

since u_j^* u_j = 1 and (x^* v_j)(v_j^* x) = |v_j^* x|^2. (If z is complex, then z^* z = z̄ z = |z|^2.)

Since σ_j > 0 for j = 1, . . . , r, the right-hand side of (8.11) is a sum of nonnegative numbers. To have ‖Ax‖ = 0, all the coefficients in this sum must be zero. The only way for that to happen is for

    v_j^* x = 0,   j = 1, . . . , r,

i.e., Ax = 0 if and only if x is orthogonal to v_1, . . . , v_r. We already have a characterization of such vectors from the singular value decomposition:

    x ∈ span{v_{r+1}, . . . , v_n}.

Thus we conclude

    N(A) = span{v_{r+1}, . . . , v_n}.

(If r = n, then this span is vacuous, and we just have N(A) = {0}.)

To compute N(A^*), we can repeat the same argument based on ‖A^*y‖^2 to obtain

    N(A^*) = span{u_{r+1}, . . . , u_m}.

Putting these results together, we arrive at a beautiful elaboration of the Fundamental Theorem of Linear Algebra.¹

¹ Gilbert Strang. The fundamental theorem of linear algebra. Amer. Math. Monthly, 100:848–855, 1993.

Theorem 8.6 (Fundamental Theorem of Linear Algebra, SVD Version) Suppose A ∈ C^{m×n} has rank(A) = r, with left singular vectors {u_1, . . . , u_m} and right singular vectors {v_1, . . . , v_n}. Then

R(A) = span{u1 , . . . , ur }
N(A∗ ) = span{ur+1 , . . . , um }

R(A∗ ) = span{v1 , . . . , vr }
N(A) = span{vr+1 , . . . , vn },

which implies

R(A) ⊕ N(A∗ ) = span{u1 , . . . , um } = C m

R(A∗ ) ⊕ N(A) = span{v1 , . . . , vn } = C n ,

and
R(A) ⊥ N(A∗ ), R(A∗ ) ⊥ N(A).
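The theorem is easy to test numerically. Here is a sketch in MATLAB, using the made-up rank-1 matrix from before: the trailing singular vectors are annihilated by A and A^*, and the trailing right singular vectors span the null space.

    A = [1 2; 2 4; 3 6];                  % rank r = 1, m = 3, n = 2
    [U, S, V] = svd(A);
    r = rank(A);
    disp(norm(A*V(:, r+1:end)))           % v_{r+1},...,v_n lie in N(A)
    disp(norm(A'*U(:, r+1:end)))          % u_{r+1},...,u_m lie in N(A*)
    P = null(A)*null(A)';                 % projector onto N(A)
    Q = V(:, r+1:end)*V(:, r+1:end)';     % projector onto span{v_{r+1},...,v_n}
    disp(norm(P - Q))                     % the two subspaces coincide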

8.10 Matrix norms

How ‘large’ is a matrix? We do not mean dimension, but how large, in aggregate, are its entries? One can imagine a multitude of ways to measure the entries; perhaps most natural is to sum the squares of the entries, then take the square root. This idea is useful, but we prefer a more subtle alternative that is of more universal utility throughout mathematics: we shall gauge the size of A ∈ C^{m×n} by the maximum amount it can stretch a vector x ∈ C^n. That is, we will measure ‖A‖ by the largest that ‖Ax‖ can be. Of course, we can inflate ‖Ax‖ as much as we like simply by making ‖x‖ larger, which we avoid by imposing a normalization: ‖x‖ = 1. We arrive at the definition

    ‖A‖ = max_{‖x‖=1} ‖Ax‖.

To study ‖Ax‖, we could appeal to the formula (8.11); however, we will take a slightly different approach. First, suppose that Q is some matrix with orthonormal columns, so that Q^*Q = I. Then

    ‖Qx‖^2 = (Qx)^*(Qx) = x^* Q^*Q x = x^* x = ‖x‖^2,

so premultiplying by Q does not alter the norm of x. Now substitute the full SVD A = UΣV^* for A:

    ‖Ax‖ = ‖UΣV^*x‖ = ‖ΣV^*x‖,

where we have used the orthonormality of the columns of U. Now define a new variable y = V^*x (which means Vy = x), and notice that ‖x‖ = ‖V^*x‖ = ‖y‖, since V is a square matrix with orthonormal columns (and hence orthonormal rows). (The fact that V is square and has orthonormal columns implies that both V^*V = I and VV^* = I. This means that ‖V^*x‖^2 = x^*VV^*x = x^*x = ‖x‖^2.) Now we can compute the matrix norm:

    ‖A‖ = max_{‖x‖=1} ‖Ax‖ = max_{‖x‖=1} ‖ΣV^*x‖ = max_{‖Vy‖=1} ‖Σy‖ = max_{‖y‖=1} ‖Σy‖.

So the norm of A is the same as the norm of Σ. We now must figure out how to pick the unit vector y to maximize ‖Σy‖. This is easy: we want to optimize

    ‖Σy‖^2 = σ_1^2 |y_1|^2 + · · · + σ_r^2 |y_r|^2

subject to 1 = ‖y‖^2 ≥ |y_1|^2 + · · · + |y_r|^2. Since σ_1 ≥ · · · ≥ σ_r,

    ‖Σy‖^2 = σ_1^2 |y_1|^2 + · · · + σ_r^2 |y_r|^2 ≤ σ_1^2 ( |y_1|^2 + · · · + |y_r|^2 ) ≤ σ_1^2 ‖y‖^2 = σ_1^2,

resulting in the upper bound

    ‖Σ‖ = max_{‖y‖=1} ‖Σy‖ ≤ σ_1.    (8.12)

(Alternatively, you could compute ‖Σ‖ by maximizing f(y) = ‖Σy‖ subject to ‖y‖ = 1 using the Lagrange multiplier technique from vector calculus.)

Will any unit vector y attain this upper bound? That is, can we find such a vector so that ‖Σy‖ = σ_1? Sure: just take y = [1, 0, · · · , 0]^*, the first column of the identity matrix. For this special vector,

    ‖Σy‖^2 = σ_1^2 |y_1|^2 + · · · + σ_r^2 |y_r|^2 = σ_1^2.


Since ‖Σy‖ can be no larger than σ_1 for any unit vector y, and since ‖Σy‖ = σ_1 for at least one choice of y, we conclude

    ‖Σ‖ = max_{‖y‖=1} ‖Σy‖ = σ_1,

and hence the norm of a matrix is its largest singular value:

    ‖A‖ = σ_1.

Consider the matrix

    A = [ 1/2  1 ; −1/2  1 ] = (1/√2) [ 1  1 ; 1  −1 ] [ √2  0 ; 0  √2/2 ] [ 0  1 ; 1  0 ]^*.

We see from this SVD that ‖A‖ = σ_1 = √2. For this example the vector Ax has the form

    Ax = σ_1 (v_1^* x) u_1 + σ_2 (v_2^* x) u_2 = √2 x_2 u_1 + (√2/2) x_1 u_2,

so Ax is a blend of some expansion in the u_1 direction and some contraction in the u_2 direction. We maximize the size of Ax by picking an x for which Ax is maximally rich in u_1, i.e., x = v_1.
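A quick MATLAB check of this fact for the 2 × 2 example above (a sketch): the matrix 2-norm returned by norm agrees with the largest singular value.

    A = [1/2 1; -1/2 1];
    disp(norm(A))          % the matrix 2-norm of A
    disp(max(svd(A)))      % equals sigma_1 = sqrt(2)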

[Margin figures: every unit vector x in C^2 lies on the unit circle ‖x‖ = 1; the vectors x = v_1 and x = v_2 are highlighted. The image of this circle under A traces out an ellipse: x = v_1 is mapped to Ax = σ_1 u_1, the most A stretches any unit vector, while x = v_2 is mapped to Ax = σ_2 u_2, which gives the smallest value of ‖Ax‖. Plots like this can be traced out with MATLAB’s eigshow command.]

8.11 Low-rank approximation

Perhaps the most important property of the singular value decomposition is its ability to immediately deliver optimal low-rank approximations to a matrix. The dyadic form

    A = ∑_{j=1}^{r} σ_j u_j v_j^*

writes the rank-r matrix A as the sum of the r rank-1 matrices σ_j u_j v_j^*. Since σ_1 ≥ σ_2 ≥ · · · ≥ σ_r > 0, we might hope that the partial sum

    ∑_{j=1}^{k} σ_j u_j v_j^*

will give a good approximation to A for some value of k that is much smaller than r (mathematicians write k ≪ r for emphasis). This is
especially true in situations where A models some low-rank phenomenon, but some noise (such as random sampling errors, when the entries of A are measured from some physical process) causes A to have much larger rank. If the noise is small relative to the “true” data in A, we expect A to have a number of very small singular values that we might wish to neglect as we work with A. We will see examples of this kind of behavior in the next chapter.
For square diagonalizable matrices, the eigenvalue decompositions we wrote down in Chapter 6 also express A as the sum of rank-1 matrices,

    A = ∑_{j=1}^{n} λ_j w_j ŵ_j^*

(here we write w_j and ŵ_j for the right and left eigenvectors, to distinguish them from the singular vectors), but there are three key distinctions that make the singular value decomposition a better tool for developing low-rank approximations to A.

1. The SVD holds for all matrices, while the eigenvalue decomposition only holds for square matrices.

2. The singular values are nonnegative real numbers whose ordering

    σ_1 ≥ σ_2 ≥ · · · ≥ σ_r > 0

gives a natural way to understand how much the rank-1 matrices σ_j u_j v_j^* contribute to A. In contrast, the eigenvalues will generally be complex numbers, and thus do not have the same natural order.

3. The eigenvectors are not generally orthogonal, and this can skew the rank-1 matrices λ_j w_j ŵ_j^* away from giving good approximations to A. In particular, we can find that ‖w_j ŵ_j^*‖ ≫ 1, whereas the matrices u_j v_j^* from the SVD always satisfy ‖u_j v_j^*‖ = 1.

This last point is subtle, so let us investigate it with an example. Consider

    A = [ 2  100 ; 0  1 ]

with eigenvalues λ_1 = 2 and λ_2 = 1 and eigenvalue decomposition

    A = W Λ W^{−1} = [ 1  1 ; 0  −1/100 ] [ 2  0 ; 0  1 ] [ 1  100 ; 0  −100 ]
      = λ_1 w_1 ŵ_1^* + λ_2 w_2 ŵ_2^*
      = 2 [ 1 ; 0 ] [ 1  100 ] + 1 [ 1 ; −1/100 ] [ 0  −100 ]
      = 2 [ 1  100 ; 0  0 ] + 1 [ 0  −100 ; 0  1 ].

Let us inspect individually the two rank-1 matrices that appear in the eigendecomposition:

    λ_1 w_1 ŵ_1^* = [ 2  200 ; 0  0 ],   λ_2 w_2 ŵ_2^* = [ 0  −100 ; 0  1 ].

Neither matrix individually gives a good approximation to A:

    A − λ_1 w_1 ŵ_1^* = [ 0  −100 ; 0  1 ],   A − λ_2 w_2 ŵ_2^* = [ 2  200 ; 0  0 ].

Both rank-1 “approximations” to A leave large errors! Contrast this situation with the rank-1 approximation σ_1 u_1 v_1^* given by the SVD for this A. To five decimal digits, we have

    A = U Σ V^* = [ 0.99995  −0.01000 ; 0.01000  0.99995 ] [ 100.025  0 ; 0  0.020 ] [ 0.01999  0.99980 ; −0.99980  0.01999 ]
      = σ_1 u_1 v_1^* + σ_2 u_2 v_2^*
      = 100.025 [ 0.99995 ; 0.01000 ] [ 0.01999  0.99980 ] + 0.020 [ −0.01000 ; 0.99995 ] [ −0.99980  0.01999 ]
      = 100.025 [ 0.01999  0.99975 ; 0.00020  0.00999 ] + 0.020 [ 0.00999  −0.00020 ; −0.99975  0.01999 ].

Like the eigendecomposition, the SVD breaks A into two rank-1 pieces:

    σ_1 u_1 v_1^* = [ 1.99980  100.00000 ; 0.01999  0.99960 ],   σ_2 u_2 v_2^* = [ 0.00020  0.00000 ; −0.01999  0.00040 ].

The first of these, the dominant term in the SVD, gives an excellent approximation to A:

    A − σ_1 u_1 v_1^* = [ 0.00020  0.00000 ; −0.01999  0.00040 ].

The key factor making this approximation so good is that σ_1 ≫ σ_2. What is more remarkable is that the dominant part of the singular value decomposition is actually the best low-rank approximation for all matrices.
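The numbers above are easy to reproduce in MATLAB (a sketch; the digits displayed in the text were rounded):

    A = [2 100; 0 1];
    [U, S, V] = svd(A);
    disp(diag(S)')                    % sigma_1 ~ 100.025, sigma_2 ~ 0.020
    A1 = S(1,1)*U(:,1)*V(:,1)';       % the rank-1 truncated SVD
    disp(norm(A - A1))                % a tiny error, equal to sigma_2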

Definition 8.3 Let A = ∑_{j=1}^{r} σ_j u_j v_j^* be a rank-r matrix, written in terms of its singular value decomposition. Then for any k ≤ r, the truncated singular value decomposition of rank k is the partial sum

    A_k = ∑_{j=1}^{k} σ_j u_j v_j^*.

Theorem 8.7 (Schmidt–Mirsky–Eckart–Young) Let A ∈ C^{m×n}. Then for all k ≤ rank(A), the truncated singular value decomposition

    A_k = ∑_{j=1}^{k} σ_j u_j v_j^*

is a best rank-k approximation to A, in the sense that

    ‖A − A_k‖ = min_{rank(X) ≤ k} ‖A − X‖ = σ_{k+1}.
It is easy to see that this A_k gives the approximation error σ_{k+1}, since

    A − A_k = ∑_{j=1}^{r} σ_j u_j v_j^* − ∑_{j=1}^{k} σ_j u_j v_j^* = ∑_{j=k+1}^{r} σ_j u_j v_j^*,

and this last expression is an SVD for the error in the approximation A − A_k. As described in Section 8.10, the norm of a matrix equals its largest singular value, so

    ‖A − A_k‖ = ‖ ∑_{j=k+1}^{r} σ_j u_j v_j^* ‖ = σ_{k+1}.

To complete the proof, one needs to show that no other rank-k matrix can come closer to A than A_k. This pretty argument is a bit too intricate for this course, but we include it here for those who are interested.

Let X ∈ C^{m×n} be any rank-k matrix. The Fundamental Theorem of Linear Algebra gives C^n = R(X^*) ⊕ N(X). Since rank(X^*) = rank(X) = k, notice that dim(N(X)) = n − k. From the singular value decomposition of A extract v_1, . . . , v_{k+1}, a basis for some (k+1)-dimensional subspace of C^n. Since N(X) ⊆ C^n has dimension n − k, it must be that the intersection

    N(X) ∩ span{v_1, . . . , v_{k+1}}

has dimension at least one. (Otherwise, N(X) ⊕ span{v_1, . . . , v_{k+1}} would be an (n+1)-dimensional subspace of C^n: impossible!) Let z be some unit vector in that intersection: ‖z‖ = 1 and z ∈ N(X) ∩ span{v_1, . . . , v_{k+1}}. Expand z = γ_1 v_1 + · · · + γ_{k+1} v_{k+1}, so that ‖z‖ = 1 implies

    1 = z^* z = ( ∑_{j=1}^{k+1} γ_j v_j )^* ( ∑_{j=1}^{k+1} γ_j v_j ) = ∑_{j=1}^{k+1} |γ_j|^2.

Since z ∈ N(X), we have

    ‖A − X‖ ≥ ‖(A − X)z‖ = ‖Az‖,

and then, since v_j^* z = γ_j for j ≤ k + 1 and v_j^* z = 0 for j > k + 1,

    ‖Az‖ = ‖ ∑_{j=1}^{r} σ_j u_j (v_j^* z) ‖ = ‖ ∑_{j=1}^{k+1} σ_j γ_j u_j ‖.

Since σ_{k+1} ≤ σ_k ≤ · · · ≤ σ_1 and the u_j vectors are orthogonal,

    ‖ ∑_{j=1}^{k+1} σ_j γ_j u_j ‖ ≥ σ_{k+1} ‖ ∑_{j=1}^{k+1} γ_j u_j ‖.

But notice that

    ‖ ∑_{j=1}^{k+1} γ_j u_j ‖^2 = ∑_{j=1}^{k+1} |γ_j|^2 = 1,

where the last equality was derived above from the fact that ‖z‖^2 = 1. In conclusion, for any rank-k matrix X,

    ‖A − X‖ ≥ σ_{k+1} ‖ ∑_{j=1}^{k+1} γ_j u_j ‖ = σ_{k+1}.

(This proof is adapted from §3.2.3 of Demmel’s text: James W. Demmel. Applied Numerical Linear Algebra. SIAM, Philadelphia, 1997.)

8.11.1 Compressing images with low-rank approximations

Image compression provides the most visually appealing application of the low-rank matrix factorization ideas we have just described. An image can be represented as a matrix. For example, typical grayscale images consist of a rectangular array of pixels, m in the vertical direction, n in the horizontal direction. The color of each of those pixels is denoted by a single number, an integer between 0 (black) and 255 (white). (This gives 2^8 = 256 different shades of gray for each pixel. Color images are represented by three such matrices: one for red, one for green, and one for blue. Thus each pixel in a typical color image takes one of (2^8)^3 = 2^24 = 16,777,216 shades.)

MATLAB has many built-in routines for processing images. The imread command reads in image files. For example, if you want to load the file snapshot.jpg into MATLAB, you would use the command

    A = double(imread('snapshot.jpg'));

If your file contains a grayscale image, A will now contain the m × n matrix containing the gray colors of your image. If you have a color image, then A will be an m × n × 3 matrix, and you will need to extract the color levels in an extra step:

    Ared = A(:,:,1); Agreen = A(:,:,2); Ablue = A(:,:,3);

The double command converts the entries of the image into floating point numbers. (To conserve memory, MATLAB's default is to save the entries of an image as integers, but MATLAB's linear algebra routines like svd will only work with floating point matrices.) Finally, to visualize an image in MATLAB, use

    imagesc(A)

and, if the image is grayscale, follow this with

    colormap(gray)

The imagesc command is a useful tool for visualizing any matrix of data; it does not require that the entries in A be integers. (However, for color images stored in m × n × 3 floating point matrices, you need to use imagesc(uint8(A)) to convert A back to positive integer values.)

Images are ripe for data compression: often they contain broad regions of similar colors, and in many areas of the image adjacent rows (or columns) will look quite similar. If the image stored in A can be represented well by a rank-k matrix, then one can approximate A by storing only the leading k singular values and vectors. To build this approximation

    A_k = ∑_{j=1}^{k} σ_j u_j v_j^*,

one need only store k(1 + m + n) values. When k(1 + m + n) ≪ mn, there will be a significant savings in storage, thus giving an effective compression of A.

Let us look at an example to see how effective this image compression can be. For convenience we shall use an image built into MATLAB,

    load gatlin, A = X;
    imagesc(A), colormap(gray)
which shows some of the key developers of the numerical linear algebra algorithms we have studied this semester, gathered in Gatlinburg, Tennessee, for an important early conference in the field.

[Figure 8.1: the original uncompressed image (rank = 480): the founders of numerical linear algebra at an early Gatlinburg Symposium. From left to right: Jim Wilkinson, Wallace Givens, George Forsythe, Alston Householder, Peter Henrici, and Friedrich Bauer.]

The image is of size 480 × 640, so rank(A) ≤ 480. We shall compress this image with truncated singular value decompositions. Figures 8.2 and 8.3 show compressions of A for dimensions ranging from k = 200 down to k = 1. For k = 200 and 100, the compression A_k provides an excellent proxy for the full image A. For k = 50, 25 and 10, the quality degrades a bit, but even for k = 10 you can still tell that the image shows six men in suits standing on a patterned floor. For k ≤ 5 we lose much of the quality, but isn’t it remarkable how much structure is still apparent even when k = 5?

[Figure 8.2: Compressions of the Gatlinburg image in Figure 8.1 using truncated SVDs A_k = ∑_{j=1}^{k} σ_j u_j v_j^*, with panels of rank k = 200, 100, 50, and 25. Each of these images can be stored with less memory than the original full image.]

[Figure 8.3: Continuation of Figure 8.2, showing compressions of the Gatlinburg image via truncated SVDs of rank 10, 5, 2, and 1. The rank-10 image might be useful as a “thumbnail” sketch of the image (e.g., an icon on a computer desktop), but the other images are compressed beyond the point of being useful.]

The last image is interesting as a visualization of a rank-1 matrix: each row is a multiple of all the other rows, and each column is a multiple of all the other columns.

We gain an understanding of the quality of this compression by looking at the singular values of A, shown in Figure 8.4. The first singular value σ_1 is about an order of magnitude larger than the rest, and the singular values decay quite rapidly. (Notice the logarithmic vertical axis.) We have σ_1 ≈ 15,462, while σ_50 ≈ 204.48. When we truncate the singular value decomposition at k = 50, the neglected terms in the singular value decomposition do not make a major contribution to the image.
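Images like those in Figures 8.2 and 8.3, and the singular value plot in Figure 8.4, can be produced with a few lines of MATLAB; here is one possible sketch (not necessarily the code used to make the figures):

    load gatlin, A = X;                      % the 480 x 640 image shown in Figure 8.1
    [U, S, V] = svd(A);
    k = 50;                                  % try k = 200, 100, 50, 25, 10, 5, 2, 1
    Ak = U(:,1:k)*S(1:k,1:k)*V(:,1:k)';      % truncated SVD of rank k
    figure, imagesc(Ak), colormap(gray)
    title(sprintf('truncated SVD, rank k = %d', k))
    figure, semilogy(diag(S), '.')           % the singular values, as in Figure 8.4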

[Figure 8.4: Singular values of the 480 × 640 Gatlinburg image matrix. The first few singular values are much larger than the rest, suggesting the potential for accurate low-rank approximation (compression).]

student experiments

8.1. The two images shown in Figure 8.5 show characters generated in the MinionPro italic font, defined in the image files minionamp.jpg and minion11.jpg, each of which leads to a 200 × 200 matrix. Which image do you think will better lend itself to low-rank approximation? Compute the singular values and truncated SVD approximations for a variety of ranks k. Do your results agree with your intuition?
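One possible starting point for this experiment (a sketch; it assumes the two image files are in your working directory):

    A = double(imread('minionamp.jpg'));
    if ndims(A) == 3, A = A(:,:,1); end   % keep a single channel if the file is stored in color
    s = svd(A);
    semilogy(s, '.')                      % how quickly do the singular values decay?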

[Figure 8.5: Two images showing characters in the italic MinionPro font. (The first is an ampersand, which one can clearly see here derives from the Latin word et, meaning “and.”)]

8.12 Special Topic: Reducing dimensions with POD

8.13 Afterword

The singular value decomposition was developed in its initial form by Eugenio Beltrami (1873) and, independently, by Camille Jordan (1874).²

² G. W. Stewart. On the early history of the singular value decomposition. SIAM Review, 35:551–566, 1993.
