EECS 16B Designing Information Devices and Systems II


Spring 2021 UC Berkeley Homework 11
This homework is due on Friday, April 9, 2021, at 11:00PM. Self-grades and
HW Resubmission are due on Tuesday, April 13, 2021, at 11:00PM.

1. Reading Lecture Notes


Staying up to date with lectures is an important part of the learning process in this course. Here are links to
the notes that you need to read for this week: Note 12

(a) Consider two vectors $\vec{x} \in \mathbb{R}^m$ and $\vec{y} \in \mathbb{R}^n$. What is the dimension of the matrix $\vec{x}\vec{y}^\top$, and what is its rank?
Solution: The matrix $\vec{x}\vec{y}^\top$ has dimension $m \times n$. Let $\vec{y} = \begin{bmatrix} y_1 & y_2 & \dots & y_n \end{bmatrix}^\top$; then
$$\vec{x}\vec{y}^\top = \begin{bmatrix} y_1\vec{x} & y_2\vec{x} & \dots & y_n\vec{x} \end{bmatrix}.$$
We can see that each column of $\vec{x}\vec{y}^\top$ is a multiple of $\vec{x}$, thus it has rank 1.
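As a quick numerical sanity check (not part of the original solution), the sketch below uses numpy with arbitrarily chosen sizes to confirm the dimension and rank claims:

```python
import numpy as np

# Sizes m and n are arbitrary choices for illustration.
m, n = 4, 6
x = np.random.randn(m)
y = np.random.randn(n)

outer = np.outer(x, y)                   # the matrix x y^T
print(outer.shape)                       # (4, 6): dimension m x n
print(np.linalg.matrix_rank(outer))      # 1 (with probability 1 for random x, y)
```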
(b) Consider a matrix $A \in \mathbb{R}^{m \times n}$ with rank $r$. Suppose its SVD is $A = U\Sigma V^\top$ where $U \in \mathbb{R}^{m \times m}$, $\Sigma \in \mathbb{R}^{m \times n}$, and $V \in \mathbb{R}^{n \times n}$. Can you write $A$ in terms of the singular values of $A$ and outer products of the columns of $U$ and $V$?
Solution: We have $A = \sum_{i=1}^{r} \sigma_i \vec{u}_i \vec{v}_i^\top$ where $\vec{u}_i$ and $\vec{v}_i$ are the columns of $U$ and $V$. Note that we only sum to $r$ since $A$ has rank $r$ and hence it has $r$ non-zero singular values $\sigma_1, \dots, \sigma_r$. This is the outer product form of the SVD.
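A short numpy sketch (my addition, using an arbitrary random test matrix) illustrates the outer-product form: summing $\sigma_i \vec{u}_i \vec{v}_i^\top$ over the non-zero singular values reconstructs $A$.

```python
import numpy as np

A = np.random.randn(4, 6)            # arbitrary example matrix
U, s, Vt = np.linalg.svd(A)          # s holds the singular values, Vt = V^T

r = np.sum(s > 1e-12)                # numerical rank: number of non-zero singular values
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))
print(np.allclose(A, A_rebuilt))     # True: A equals the sum of rank-1 outer products
```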


2. Proofs
In this problem we will review some of the important proofs we saw in lecture as practice. Let's define $S = A^\top A$, where $A$ is an arbitrary $m \times n$ matrix. $V = \begin{bmatrix} \vec{v}_1 & \cdots & \vec{v}_n \end{bmatrix}$ is the matrix of normalized eigenvectors of $S$ with eigenvalues $\lambda_1, \cdots, \lambda_n$.

(a) Let $\vec{v}_1, \cdots, \vec{v}_r$ be the normalized eigenvectors that correspond to non-zero eigenvalues (i.e. $\|\vec{v}_i\| = 1$ and $\lambda_i \neq 0$ for $i = 1, \cdots, r$). Show that $\|A\vec{v}_i\|^2 = \lambda_i$.
Solution: We know that $\vec{v}_i$ is an eigenvector of $S$, so we can start with:
\begin{align}
S\vec{v}_i &= \lambda_i \vec{v}_i \\
\vec{v}_i^\top S\vec{v}_i &= \lambda_i \vec{v}_i^\top \vec{v}_i \\
\vec{v}_i^\top S\vec{v}_i &= \lambda_i \|\vec{v}_i\|^2.
\end{align}
In the second step, we multiplied both sides by $\vec{v}_i^\top$. Since $\|\vec{v}_i\|^2 = 1$, the right-hand side simplifies to $\lambda_i$. Substituting $S = A^\top A$:
\begin{align}
\vec{v}_i^\top A^\top A\vec{v}_i &= \lambda_i \\
(A\vec{v}_i)^\top A\vec{v}_i &= \lambda_i \\
\|A\vec{v}_i\|^2 &= \lambda_i.
\end{align}

(b) Following the assumptions in part (a), show that $A\vec{v}_i$ is orthogonal to $A\vec{v}_j$ for $i \neq j$.
Solution: To show that two vectors are orthogonal, we take their inner product and show that it is zero:
$$(A\vec{v}_i)^\top A\vec{v}_j = \vec{v}_i^\top A^\top A\vec{v}_j = \vec{v}_i^\top S\vec{v}_j.$$
We know that $S\vec{v}_j = \lambda_j \vec{v}_j$. Therefore,
$$\vec{v}_i^\top S\vec{v}_j = \lambda_j \underbrace{\vec{v}_i^\top \vec{v}_j}_{=0 \text{ (since } \vec{v}_i \perp \vec{v}_j)} = 0.$$
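Both claims from parts (a) and (b) can be spot-checked numerically; the sketch below (my addition, using an arbitrary random $A$) verifies $\|A\vec{v}_i\|^2 = \lambda_i$ and the pairwise orthogonality of the vectors $A\vec{v}_i$.

```python
import numpy as np

A = np.random.randn(3, 5)                 # arbitrary test matrix
S = A.T @ A
lam, V = np.linalg.eigh(S)                # eigenvalues/eigenvectors of the symmetric matrix S

# Part (a): ||A v_i||^2 equals the corresponding eigenvalue (clamped at 0 for round-off).
for i in range(V.shape[1]):
    assert np.isclose(np.linalg.norm(A @ V[:, i])**2, max(lam[i], 0.0), atol=1e-9)

# Part (b): A v_i and A v_j are orthogonal for i != j, so the Gram matrix is diagonal.
G = (A @ V).T @ (A @ V)
print(np.allclose(G - np.diag(np.diag(G)), 0, atol=1e-9))   # True: off-diagonal entries ~0
```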

(c) Show that if $V \in \mathbb{R}^{n \times n}$ is an orthonormal square matrix, then $\|\vec{x}\|^2 = \|V\vec{x}\|^2$ for all $\vec{x} \in \mathbb{R}^n$. Hint: Write the norm as an inner product instead of trying to do this elementwise.
Solution:
$$\|V\vec{x}\|^2 = \langle V\vec{x}, V\vec{x} \rangle = \vec{x}^\top V^\top V\vec{x} = \vec{x}^\top I_{n \times n}\vec{x} = \|\vec{x}\|^2.$$
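A one-off numerical check of this norm-preservation fact (my addition; the orthonormal $V$ here is generated from a QR factorization of a random matrix):

```python
import numpy as np

# Build an orthonormal matrix V via QR of a random square matrix.
V, _ = np.linalg.qr(np.random.randn(5, 5))
x = np.random.randn(5)
print(np.isclose(np.linalg.norm(x), np.linalg.norm(V @ x)))   # True
```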


3. SVD

(a) Consider the matrix
$$A = \begin{bmatrix} -1 & 1 & 5 \\ 3 & 1 & -1 \\ 2 & -1 & 4 \end{bmatrix}.$$
Observe that the columns of matrix $A$ are mutually orthogonal with norms $\sqrt{14}$, $\sqrt{3}$, $\sqrt{42}$.
Verify numerically that the columns $\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}$ and $\begin{bmatrix} 5 \\ -1 \\ 4 \end{bmatrix}$ are orthogonal to each other.
Solution: Taking the inner product of the two vectors, we have
$$\left\langle \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}, \begin{bmatrix} 5 \\ -1 \\ 4 \end{bmatrix} \right\rangle = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}^\top \begin{bmatrix} 5 \\ -1 \\ 4 \end{bmatrix} = 5 - 1 - 4 = 0.$$
So the two columns are orthogonal to each other.
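Since the problem explicitly invites a numerical verification, here is a minimal numpy check (my addition) of this inner product, and of the fact that all three columns of $A$ are mutually orthogonal:

```python
import numpy as np

A = np.array([[-1.0,  1.0,  5.0],
              [ 3.0,  1.0, -1.0],
              [ 2.0, -1.0,  4.0]])

print(A[:, 1] @ A[:, 2])        # 0.0: the second and third columns are orthogonal
print(np.round(A.T @ A, 10))    # diag(14, 3, 42): all columns are mutually orthogonal
```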


(b) Write $A = BD$, where $B$ is an orthonormal matrix and $D$ is a diagonal matrix. What is $B$? What is $D$?
Solution: We compute the norm of each column and divide each column by its norm to obtain matrix $B$. Matrix $D$ is formed by placing the norms on the diagonal.
$$B = \begin{bmatrix} -\frac{1}{\sqrt{14}} & \frac{1}{\sqrt{3}} & \frac{5}{\sqrt{42}} \\ \frac{3}{\sqrt{14}} & \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{42}} \\ \frac{2}{\sqrt{14}} & -\frac{1}{\sqrt{3}} & \frac{4}{\sqrt{42}} \end{bmatrix}, \qquad D = \begin{bmatrix} \sqrt{14} & 0 & 0 \\ 0 & \sqrt{3} & 0 \\ 0 & 0 & \sqrt{42} \end{bmatrix}.$$

(c) Write out a singular value decomposition of $A = U\Sigma V^\top$ using the previous part. Note the ordering of the singular values in $\Sigma$ should be from largest to smallest.
(HINT: From the previous part, $A = BDI_{3 \times 3}$. Now, re-order to have the singular values in decreasing order.)
Solution:
$$A = BD = BDI = \begin{bmatrix} -\frac{1}{\sqrt{14}} & \frac{1}{\sqrt{3}} & \frac{5}{\sqrt{42}} \\ \frac{3}{\sqrt{14}} & \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{42}} \\ \frac{2}{\sqrt{14}} & -\frac{1}{\sqrt{3}} & \frac{4}{\sqrt{42}} \end{bmatrix} \begin{bmatrix} \sqrt{14} & 0 & 0 \\ 0 & \sqrt{3} & 0 \\ 0 & 0 & \sqrt{42} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
Reordering the singular values and the corresponding left and right singular vectors, we have the SVD:
$$A = \begin{bmatrix} \frac{5}{\sqrt{42}} & -\frac{1}{\sqrt{14}} & \frac{1}{\sqrt{3}} \\ -\frac{1}{\sqrt{42}} & \frac{3}{\sqrt{14}} & \frac{1}{\sqrt{3}} \\ \frac{4}{\sqrt{42}} & \frac{2}{\sqrt{14}} & -\frac{1}{\sqrt{3}} \end{bmatrix} \begin{bmatrix} \sqrt{42} & 0 & 0 \\ 0 & \sqrt{14} & 0 \\ 0 & 0 & \sqrt{3} \end{bmatrix} \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.$$
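The reordered factorization can be checked numerically; this sketch (my addition) multiplies the three factors back together and compares against $A$:

```python
import numpy as np

s14, s3, s42 = np.sqrt(14), np.sqrt(3), np.sqrt(42)
U = np.array([[ 5/s42, -1/s14,  1/s3],
              [-1/s42,  3/s14,  1/s3],
              [ 4/s42,  2/s14, -1/s3]])
Sigma = np.diag([s42, s14, s3])
Vt = np.array([[0, 0, 1],
               [1, 0, 0],
               [0, 1, 0]])

A = np.array([[-1, 1, 5], [3, 1, -1], [2, -1, 4]])
print(np.allclose(U @ Sigma @ Vt, A))    # True: the factors reconstruct A
print(np.allclose(U.T @ U, np.eye(3)))   # True: U has orthonormal columns
```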


4. The Moore-Penrose pseudoinverse for “wide” matrices


Say we have a set of linear equations given by $A\vec{x} = \vec{y}$. If $A$ is invertible, we know that the solution for $\vec{x}$ is $\vec{x} = A^{-1}\vec{y}$. However, what if $A$ is not a square matrix? In 16A, you saw how this problem could be approached for tall "standing up" matrices $A$, where it really wasn't possible to find a solution that exactly matches all the measurements, using linear least-squares. The linear least-squares solution gives us a reasonable answer that asks for the "best" match in terms of reducing the norm of the error vector.
This problem deals with the other case, when the matrix $A$ is wide, with more columns than rows. In this case, there are generally going to be lots of possible solutions, so which should we choose? Why? We will walk you through the Moore-Penrose pseudoinverse that generalizes the idea of the matrix inverse and is derived from the singular value decomposition.
This approach to finding solutions complements the OMP approach that you learned in 16A and that we used earlier in 16B in the context of outlier removal during system identification.

(a) Say you have the matrix
$$A = \begin{bmatrix} 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix}.$$
To find the Moore-Penrose pseudoinverse we start by calculating the SVD of $A$. That is to say, calculate $U, \Sigma, V$ such that
$$A = U\Sigma V^\top$$
where $U$ and $V$ are orthonormal matrices.
Here we will give you that the decomposition of $A$ is:
$$A = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 0 & \sqrt{2} & 0 \end{bmatrix} \begin{bmatrix} 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}$$
where:
$$U = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} 2 & 0 & 0 \\ 0 & \sqrt{2} & 0 \end{bmatrix}, \qquad V^\top = \begin{bmatrix} 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}.$$

It is a good idea to be able to calculate the SVD yourself as you may be asked to solve similar questions
on your own in the exam. For this subpart, verify that this decomposition of A is correct. You may use
a computer to do the matrix multiplication if you want, but it is better to verify by hand.
Solution: You may give yourself full credit for this subpart if you showed the multiplication of the matrices out to get $A$, or stated that you did this using a computer. Though you did not have to do any work for this subpart, the following solution walks you through how to solve for the SVD:
$$A = U\Sigma V^\top$$


First, compute
$$AA^\top = \begin{bmatrix} 3 & -1 \\ -1 & 3 \end{bmatrix},$$
which has characteristic polynomial $\lambda^2 - 6\lambda + 8 = 0$, producing eigenvalues 4 and 2. Solving $AA^\top \vec{u} = \lambda_i \vec{u}$ produces eigenvectors $[\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}]^\top$ and $[\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}]^\top$ associated with eigenvalues 4 and 2, respectively. The singular values are the square roots of the eigenvalues of $AA^\top$, so
$$\Sigma = \begin{bmatrix} 2 & 0 & 0 \\ 0 & \sqrt{2} & 0 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}.$$
We can then solve for the $\vec{v}$ vectors using $A^\top \vec{u}_i = \sigma_i \vec{v}_i$, producing $\vec{v}_1 = [0, -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}]^\top$ and $\vec{v}_2 = [1, 0, 0]^\top$. The last $\vec{v}$ must be a unit vector orthogonal to the other two, so we can pick $[0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}]^\top$.
The SVD is:
$$A = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 0 & \sqrt{2} & 0 \end{bmatrix} \begin{bmatrix} 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}.$$
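The same check can be scripted; this sketch (my addition) verifies both the factorization and the claim that the squared singular values are the eigenvalues of $AA^\top$:

```python
import numpy as np

r2 = np.sqrt(2)
A = np.array([[1.0, -1.0,  1.0],
              [1.0,  1.0, -1.0]])
U = np.array([[ 1/r2, 1/r2],
              [-1/r2, 1/r2]])
Sigma = np.array([[2.0, 0.0, 0.0],
                  [0.0,  r2, 0.0]])
Vt = np.array([[0.0, -1/r2, 1/r2],
               [1.0,  0.0,  0.0],
               [0.0,  1/r2, 1/r2]])

print(np.allclose(U @ Sigma @ Vt, A))         # True: the decomposition is correct
print(np.sort(np.linalg.eigvalsh(A @ A.T)))   # [2. 4.]: squares of the singular values
```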

(b) Let us now think about what the SVD does. Consider a rank $m$ matrix $A \in \mathbb{R}^{m \times n}$, with $n > m$. Let $U \in \mathbb{R}^{m \times m}$, $\Sigma \in \mathbb{R}^{m \times n}$, $V \in \mathbb{R}^{n \times n}$. Let us look at matrix $A$ acting on some vector $\vec{x}$ to give the result $\vec{y}$. We have
$$A\vec{x} = U\Sigma V^\top \vec{x} = \vec{y}.$$
We can think of $V^\top \vec{x}$ as a rotation of the vector $\vec{x}$, then $\Sigma$ as a scaling, and $U$ as another rotation (multiplication by an orthonormal matrix does not change the norm of a vector; try to verify this for yourself). We will try to "reverse" these operations one at a time and then put them together to construct the Moore-Penrose pseudoinverse.
If $U$ "rotates" the vector $\Sigma V^\top \vec{x}$, what operator can we derive that will undo the rotation?
Solution: By orthonormality, we know that $U^\top U = UU^\top = I$. Therefore, $U^\top$ undoes the rotation.
(c) Derive a matrix $\widetilde{\Sigma}$ that will "unscale", or undo the effect of $\Sigma$ where it is possible to undo. Recall that $\Sigma$ has the same dimensions as $A$.
Hint: Consider
$$\widetilde{\Sigma} = \begin{bmatrix} \frac{1}{\sigma_0} & 0 & \dots & 0 \\ 0 & \frac{1}{\sigma_1} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \frac{1}{\sigma_{m-1}} \\ 0 & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 0 \end{bmatrix}.$$
Then compute $\widetilde{\Sigma}\Sigma$.


Solution: If you observe the equation
$$\Sigma\vec{x} = U^\top \vec{y} = \tilde{\vec{y}}, \tag{1}$$
you can see that $\sigma_i x_i = \tilde{y}_i$ for $i = 0, \dots, m-1$, which means that to obtain $x_i$ from $\tilde{y}_i$, we need to multiply $\tilde{y}_i$ by $\frac{1}{\sigma_i}$. For any $i > m-1$, the information in $x_i$ is lost by multiplying with 0. If the corresponding $\tilde{y}_i \neq 0$, there is no way of solving this equation. No solution exists, and we have to accept an approximate solution. If the corresponding $\tilde{y}_i = 0$, then any $x_i$ would still work. Either way, it is reasonable to just say $x_i$ is 0 in the case that $\sigma_i = 0$. That's why we can legitimately pad 0s in the bottom of $\widetilde{\Sigma}$ given below:
$$\text{If } \Sigma = \begin{bmatrix} \sigma_0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ 0 & \sigma_1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma_{m-1} & 0 & \cdots & 0 \end{bmatrix} \text{ then } \widetilde{\Sigma} = \begin{bmatrix} \frac{1}{\sigma_0} & 0 & \cdots & 0 \\ 0 & \frac{1}{\sigma_1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{\sigma_{m-1}} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}.$$
To recover $\vec{x}$, let's try to transform $\tilde{\vec{y}}$ with $\widetilde{\Sigma}$ (writing $\Sigma_1 = \operatorname{diag}(\sigma_0, \dots, \sigma_{m-1})$ for the invertible block):
\begin{align}
\widetilde{\Sigma}\tilde{\vec{y}} = \widetilde{\Sigma}\Sigma\vec{x}
&= \begin{bmatrix} \Sigma_1^{-1} \\ 0_{(n-m)\times m} \end{bmatrix} \begin{bmatrix} \Sigma_1 & 0_{m \times (n-m)} \end{bmatrix} \vec{x} \\
&= \begin{bmatrix} I_{m \times m} & 0_{m \times (n-m)} \\ 0_{(n-m) \times m} & 0_{(n-m) \times (n-m)} \end{bmatrix} \vec{x} \\
&= \begin{bmatrix} x_1 \\ \vdots \\ x_m \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
\end{align}
Therefore, we are able to recover the first $m$ entries of $\vec{x}$ with $\widetilde{\Sigma}$. The remaining entries of $\vec{x}$ are lost when multiplied by the zeros in $\Sigma$, and thus it is not possible to recover them.
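To see the block structure $\widetilde{\Sigma}\Sigma = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}$ concretely, a short sketch (my addition, reusing the $2 \times 3$ $\Sigma$ from part (a)) builds $\widetilde{\Sigma}$ by inverting the non-zero singular values:

```python
import numpy as np

Sigma = np.array([[2.0, 0.0, 0.0],
                  [0.0, np.sqrt(2), 0.0]])   # the m x n Sigma from part (a)

# Sigma_tilde: reciprocal of each non-zero singular value, zeros elsewhere, shape n x m.
Sigma_tilde = np.zeros(Sigma.shape[::-1])
for i in range(min(Sigma.shape)):
    if Sigma[i, i] > 1e-12:
        Sigma_tilde[i, i] = 1.0 / Sigma[i, i]

print(Sigma_tilde @ Sigma)   # [[1,0,0],[0,1,0],[0,0,0]]: identity on the recoverable coordinates
```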


(d) Derive an operator that would "unrotate" by $V^\top$.
Solution: By orthonormality, we know that $V^\top V = VV^\top = I$. Therefore, $V$ undoes the rotation.
(e) Try to use this idea of "unrotating" and "unscaling" to write out an "inverse", denoted as $A^\dagger$. That is to say,
$$\hat{\vec{x}} = A^\dagger \vec{y}$$
where $\hat{\vec{x}}$ is the recovered $\vec{x}$. The reason why the word inverse is in quotes (or why this is called a pseudo-inverse) is that we're ignoring the "divisions" by zero, and $\hat{\vec{x}}$ isn't exactly equal to $\vec{x}$.
Solution: You may give yourself full credit for this subpart if you write out $A^\dagger = V\widetilde{\Sigma}U^\top$.
To see how we get this, recall that $\vec{y} = A\vec{x} = U\Sigma V^\top \vec{x}$. We get $\vec{y}$ by first rotating $\vec{x}$ by $V^\top$, then scaling the resulting vector by $\Sigma$, and finally rotating by $U$. To undo the operations, we should first "unrotate" $\vec{y}$ by $U^\top$, then "unscale" it by $\widetilde{\Sigma}$, and finally "unrotate" it by $V$. Therefore, we have
$$\hat{\vec{x}} = V\widetilde{\Sigma}U^\top \vec{y},$$
which leads to the definition of $A^\dagger = V\widetilde{\Sigma}U^\top$ as the pseudo-inverse of $A$. Of course, nothing can possibly be done for the information that was destroyed by the nullspace of $A$; there is no way to recover any component of the true $\vec{x}$ that was in the nullspace of $A$. However, we can get back everything else.
\begin{align}
\vec{y} = A\vec{x} &= U\Sigma V^\top \vec{x} \\
U^\top \vec{y} &= U^\top U\Sigma V^\top \vec{x} && \text{Unrotating by } U \\
U^\top \vec{y} &= I_{m \times m}\Sigma V^\top \vec{x} \\
U^\top \vec{y} &= \Sigma V^\top \vec{x} \\
\widetilde{\Sigma}U^\top \vec{y} &= \widetilde{\Sigma}\Sigma V^\top \vec{x} && \text{Unscaling by } \widetilde{\Sigma} \\
V\widetilde{\Sigma}U^\top \vec{y} &= V\widetilde{\Sigma}\Sigma V^\top \vec{x} && \text{Unrotating by } V
\end{align}
It's worth noting that $\hat{\vec{x}} = V\widetilde{\Sigma}U^\top \vec{y} = V\widetilde{\Sigma}\Sigma V^\top \vec{x}$ has no component in the null space of $A$. To see why this is the case, recall that $\{\vec{v}_{m+1}, \dots, \vec{v}_n\}$ is a basis of the null space of $A$, and $V^\top \vec{x}$ gives the coordinates of $\vec{x}$ in the $V$ basis.
From previous parts, we know that
$$\widetilde{\Sigma}\Sigma V^\top \vec{x} = \begin{bmatrix} (V^\top \vec{x})_1 \\ \vdots \\ (V^\top \vec{x})_m \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
where $(V^\top \vec{x})_i$ is the $i$th entry of $V^\top \vec{x}$. This gives the coordinates of $\vec{x}$ in the $V$ basis with the components of $\vec{x}$ in the null space of $A$ zeroed out.
Finally, we transform this vector back to the standard basis by $V$, and this gives $\hat{\vec{x}} = V(\widetilde{\Sigma}\Sigma V^\top \vec{x})$. Therefore, $\hat{\vec{x}}$ has no component in the null space of $A$.
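A minimal numpy sketch (my addition) assembles $A^\dagger = V\widetilde{\Sigma}U^\top$ for the matrix from part (a) and confirms it agrees with numpy's built-in pseudoinverse:

```python
import numpy as np

A = np.array([[1.0, -1.0,  1.0],
              [1.0,  1.0, -1.0]])
U, s, Vt = np.linalg.svd(A)

# Build Sigma_tilde (n x m) with 1/sigma_i on the diagonal for the non-zero singular values.
Sigma_tilde = np.zeros((A.shape[1], A.shape[0]))
Sigma_tilde[:len(s), :len(s)] = np.diag(1.0 / s)

A_dagger = Vt.T @ Sigma_tilde @ U.T
print(np.allclose(A_dagger, np.linalg.pinv(A)))   # True: matches numpy's pseudoinverse
```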
(f) Use $A^\dagger$ to solve for a vector $\vec{x}$ in the following system of equations:
$$\begin{bmatrix} 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix} \vec{x} = \begin{bmatrix} 2 \\ 4 \end{bmatrix}$$


Solution: From the above, we have the solution given by:
\begin{align}
\vec{x} = A^\dagger \vec{y} = V\widetilde{\Sigma}U^\top \vec{y}
&= \begin{bmatrix} 0 & 1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{\sqrt{2}} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 2 \\ 4 \end{bmatrix} \\
&= \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{4} & \frac{1}{4} \\ \frac{1}{4} & -\frac{1}{4} \end{bmatrix} \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 3 \\ \frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}.
\end{align}
Therefore, a reasonable solution to the system of equations is:
$$\vec{x} = \begin{bmatrix} 3 \\ \frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}$$
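The same answer pops out of numpy's pseudoinverse; this check (my addition) also confirms that the recovered $\vec{x}$ satisfies the original equations exactly:

```python
import numpy as np

A = np.array([[1.0, -1.0,  1.0],
              [1.0,  1.0, -1.0]])
y = np.array([2.0, 4.0])

x_hat = np.linalg.pinv(A) @ y
print(x_hat)                       # [ 3.   0.5 -0.5]
print(np.allclose(A @ x_hat, y))   # True: the wide system is solved exactly
```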

(g) Now we will see why this matrix, $A^\dagger = V\widetilde{\Sigma}U^\top$, is a useful proxy for the matrix inverse in such circumstances. Show that the solution given by the Moore-Penrose pseudoinverse satisfies the minimality property that if $\hat{\vec{x}} = A^\dagger \vec{y}$ is the pseudo-inverse solution to $A\vec{x} = \vec{y}$, then $\hat{\vec{x}}$ has no component in the nullspace of $A$.
Hint: To show this, recall that for any vector $\vec{x}$, the vector $V^{-1}\vec{x}$ represents the coordinates of $\vec{x}$ in the basis of the columns of $V$. Compute $V^{-1}\hat{\vec{x}}$ and show that the last $(n-m)$ rows are all zero.
This minimality property is useful in many applications. You saw a control application in lecture. This is also used all the time in machine learning, where it is connected to the concept behind what is called ridge regression or weight shrinkage.
Solution:
Since $\hat{\vec{x}}$ is the pseudoinverse solution, we know that
$$\hat{\vec{x}} = V\widetilde{\Sigma}U^\top \vec{y}.$$
Let us write down what $\hat{\vec{x}}$ is with respect to the orthonormal basis formed by the columns of $V$. Since $A$ has rank $m$, there are $m$ non-zero singular values. The following expression comes from expanding the matrix multiplication.
\begin{align}
V^\top \hat{\vec{x}} &= V^\top V\widetilde{\Sigma}U^\top \vec{y} \\
&= \widetilde{\Sigma}U^\top \vec{y} \\
&= \begin{bmatrix} \Sigma_1^{-1} \\ 0_{(n-m)\times m} \end{bmatrix} \begin{bmatrix} \vec{u}_1^\top \\ \vdots \\ \vec{u}_m^\top \end{bmatrix} \vec{y} \\
&= \begin{bmatrix} \Sigma_1^{-1} U^\top \vec{y} \\ \vec{0}_{(n-m)\times 1} \end{bmatrix}
\end{align}


Note that $V^\top \vec{x}$ gives the coordinates of $\vec{x}$ in the $V$ basis, and recall that $\{\vec{v}_{m+1}, \dots, \vec{v}_n\}$ is a basis of the null space of $A$. Since the last $n-m$ entries of $V^\top \hat{\vec{x}}$ are all zeros, this means that $\hat{\vec{x}}$ has no component in the direction of $\vec{v}_{m+1}, \dots, \vec{v}_n$, i.e. $\hat{\vec{x}}$ has no component in the null space of $A$.
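Because any exact solution differs from $\hat{\vec{x}}$ only by a nullspace vector, this property also makes $\hat{\vec{x}}$ the minimum-norm solution (the document returns to this point below). The sketch here (my addition) compares the pseudoinverse solution against another valid solution obtained by adding a nullspace vector:

```python
import numpy as np

A = np.array([[1.0, -1.0,  1.0],
              [1.0,  1.0, -1.0]])
y = np.array([2.0, 4.0])

x_hat = np.linalg.pinv(A) @ y
null_vec = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)   # basis vector of the nullspace of A
x_other = x_hat + 5.0 * null_vec                    # another exact solution

print(np.allclose(A @ x_other, y))                       # True: still solves the system
print(np.linalg.norm(x_hat) < np.linalg.norm(x_other))   # True: x_hat has smaller norm
print(np.isclose(x_hat @ null_vec, 0.0))                 # True: x_hat has no nullspace component
```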
(h) Consider a generic wide matrix $A$. We know that $A$ can be written as $A = U\Sigma V^\top$ where $U$ and $V$ are each the appropriate size and have orthonormal columns, while $\Sigma$ is the appropriate size and is a diagonal matrix (all off-diagonal entries are zero). Further assume that the rows of $A$ are linearly independent. Prove that $A^\dagger = A^\top(AA^\top)^{-1}$.
(HINT: Just substitute $U\Sigma V^\top$ for $A$ in the expression above and simplify using the properties you know about $U, \Sigma, V$. Remember the transpose of a product of matrices is the product of their transposes in reverse order: $(CD)^\top = D^\top C^\top$.)
Solution:
We just substitute in to see what happens:
\begin{align}
A^\top (AA^\top)^{-1} &= (U\Sigma V^\top)^\top \left(U\Sigma V^\top (U\Sigma V^\top)^\top\right)^{-1} \\
&= V\Sigma^\top U^\top \left(U\Sigma V^\top V\Sigma^\top U^\top\right)^{-1} \\
&= V\Sigma^\top U^\top \left(U(\Sigma\Sigma^\top)U^\top\right)^{-1} \\
&= V\Sigma^\top U^\top U(\Sigma\Sigma^\top)^{-1}U^\top \\
&= V\Sigma^\top (\Sigma\Sigma^\top)^{-1} U^\top.
\end{align}
At this point, we are almost done in reaching $A^\dagger = V\widetilde{\Sigma}U^\top$. We have the leading $V$ and the ending $U^\top$. All that we need to do is multiply out the diagonal matrices in the middle.


 
Since $\Sigma \in \mathbb{R}^{m \times n}$ carries the singular values on its diagonal, $\Sigma\Sigma^\top = \operatorname{diag}(\sigma_0^2, \sigma_1^2, \dots, \sigma_{m-1}^2)$, so
$$\Sigma^\top(\Sigma\Sigma^\top)^{-1} = \begin{bmatrix} \sigma_0 & 0 & \cdots & 0 \\ 0 & \sigma_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{m-1} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{\sigma_0^2} & 0 & \cdots & 0 \\ 0 & \frac{1}{\sigma_1^2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{\sigma_{m-1}^2} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sigma_0} & 0 & \cdots & 0 \\ 0 & \frac{1}{\sigma_1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{\sigma_{m-1}} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} = \widetilde{\Sigma}.$$
This concludes the proof.
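For a concrete wide matrix with linearly independent rows, this identity is easy to confirm numerically (my addition, reusing the matrix from part (a)):

```python
import numpy as np

A = np.array([[1.0, -1.0,  1.0],
              [1.0,  1.0, -1.0]])   # wide, with linearly independent rows

A_dagger = A.T @ np.linalg.inv(A @ A.T)
print(np.allclose(A_dagger, np.linalg.pinv(A)))   # True: A^T (A A^T)^{-1} is the pseudoinverse
```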


In case you were wondering, the alternative form A> (AA> )−1 comes from first noticing that the
columns of A> are all orthogonal to the nullspace of A by definition, and so using them as the basis
for the subspace in which we want to find the solution. The (AA> ) are the columns of where each of
these basis elements ends up through A. Inverting this tells us how to get to where we want.
Anyway, it is interesting to step back and see that at this point, between 16A and 16B, you now
know two different ways to solve problems in which there are fewer linear equations than you have
unknowns. You learned OMP in 16A which proceeded in a greedy fashion and basically tried to
minimize the number of variables that it set to anything other than zero. And now you have learned
the Moore-Penrose Psuedoinverse that finds the solution that minimizes the Euclidean norm.
Both of these, as well as least-squares, are ways to manifest the philosophical principle of Occam’s
Razor algorithmically for learning. Occam’s Razor says “Numquam ponenda est pluralitas sine ne-
cessitate” (translation: don’t posit more than you need.) But there are two different ways to measure
“more” — counting and weighing. Least squares (when we have more equations than variables) is on


the path of weighing: the norm of the error is minimized. OMP iterates that to also follow the path of counting, where the number of nonzero variables corresponds to the things that are counted. The Moore-Penrose pseudoinverse is fully on the path of weighing.
Both of these paths grow into major themes in machine learning generally, and both play a very important role in modern machine learning in particular. This is because in many contemporary approaches to machine learning, we try to learn models that have more parameters than we have data points.


5. Frobenius Norm
In this problem we will investigate the basic properties of the Frobenius norm.
Similar to how the norm of a vector $\vec{x} \in \mathbb{R}^N$ is defined as $\|\vec{x}\| = \sqrt{\sum_{i=1}^N x_i^2}$, the Frobenius norm of a matrix $A \in \mathbb{R}^{N \times N}$ is defined as
$$\|A\|_F = \sqrt{\sum_{i=1}^N \sum_{j=1}^N |A_{ij}|^2}.$$
$A_{ij}$ is the entry in the $i$th row and the $j$th column. This is basically the norm that comes from treating a matrix like a big vector filled with numbers.

(a) With the above definitions, show that for a $2 \times 2$ matrix $A$:
$$\|A\|_F = \sqrt{\operatorname{Tr}\{A^\top A\}}.$$
Think about how this generalizes to $n \times n$ matrices. Note: The trace of a matrix is the sum of its diagonal entries. For example, let $A \in \mathbb{R}^{N \times N}$; then
$$\operatorname{Tr}\{A\} = \sum_{i=1}^N A_{ii}.$$
Solution: This proof is for the general case of $n \times n$ matrices. You should give yourself full credit if you did this calculation only on the $2 \times 2$ case.
\begin{align}
\operatorname{Tr}\{A^\top A\} &= \sum_{i=1}^N (A^\top A)_{ii} \tag{11} \\
&= \sum_{i=1}^N \left( \sum_{j=1}^N (A^\top)_{ij} A_{ji} \right) \tag{12} \\
&= \sum_{i=1}^N \left( \sum_{j=1}^N A_{ji} A_{ji} \right) \tag{13} \\
&= \sum_{i=1}^N \sum_{j=1}^N A_{ji}^2 \tag{14} \\
&= \|A\|_F^2 \tag{15}
\end{align}
In the above solution, step (11) writes out the trace definition, step (12) expands the matrix multiplication on the diagonal indices (i.e. index $(i,i)$ is the inner product of row $i$ and column $i$), step (13) applies the definition of the matrix transpose, and the last two steps collect the result into the definition of the Frobenius norm.
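A two-line numerical confirmation of this identity (my addition, with a random matrix):

```python
import numpy as np

A = np.random.randn(4, 4)
# Frobenius norm equals the square root of the trace of A^T A.
print(np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.trace(A.T @ A))))   # True
```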
(b) Show that if $U$ and $V$ are square orthonormal matrices, then
$$\|UA\|_F = \|AV\|_F = \|A\|_F.$$


(HINT: Use the trace interpretation from part (a).)
Solution: The direct path is just to compute using the trace formula:
$$\|UA\|_F = \sqrt{\operatorname{Tr}\{(UA)^\top(UA)\}} = \sqrt{\operatorname{Tr}\{A^\top U^\top UA\}} = \sqrt{\operatorname{Tr}\{A^\top A\}} = \|A\|_F.$$
Another path is to note that the squared Frobenius norm of a matrix is the sum of the squared Euclidean norms of its columns. The matrix multiplication $UA$ acts on each column of $A$ independently. None of those norms change since $U$ is orthonormal, and so the Frobenius norm also doesn't change.
To show the second equality, note that $\|A^\top\|_F = \|A\|_F$, because we are just summing over the same numbers in a different order. Hence:
$$\|AV\|_F = \|(AV)^\top\|_F = \|V^\top A^\top\|_F.$$
But the transpose of an orthonormal matrix is also orthonormal, so this case reduces to the previous one.
(c) Use the SVD to show that $\|A\|_F = \sqrt{\sum_{i=1}^N \sigma_i^2}$, where $\sigma_1, \cdots, \sigma_N$ are the singular values of $A$.
(HINT: The previous part might be quite useful.)
Solution:
$$\|A\|_F = \|U\Sigma V^\top\|_F = \|\Sigma V^\top\|_F = \|\Sigma\|_F = \sqrt{\operatorname{Tr}\{\Sigma^\top \Sigma\}} = \sqrt{\sum_{i=1}^N \sigma_i^2}.$$
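The same kind of check works here (my addition): numpy's singular values should have a squared sum equal to the squared Frobenius norm.

```python
import numpy as np

A = np.random.randn(5, 5)
s = np.linalg.svd(A, compute_uv=False)   # singular values only
print(np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2))))   # True
```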


6. SVD from the other side

In lecture, we thought about the SVD for a wide matrix $M$ with $n$ rows and $m > n$ columns by looking at the big $m \times m$ symmetric matrix $M^\top M$ and its eigenbasis. This question is about seeing what happens when we look at the small $n \times n$ symmetric matrix $Q = MM^\top$ and its orthonormal eigenbasis $U = [\vec{u}_1, \vec{u}_2, \cdots, \vec{u}_n]$ instead. Suppose we have sorted the eigenvalues so that the real eigenvalues $\tilde{\lambda}_i$ are sorted in descending order, where $\tilde{\lambda}_i \vec{u}_i = Q\vec{u}_i$.

(a) Show that $\tilde{\lambda}_i \geq 0$.
(HINT: You want to involve $\vec{u}_i^\top \vec{u}_i$ somehow.)
Solution: We know that $\vec{u}_i$ is an eigenvector of $Q$, so we can start with:
\begin{align}
\tilde{\lambda}_i \vec{u}_i &= Q\vec{u}_i \\
\tilde{\lambda}_i \vec{u}_i^\top \vec{u}_i &= \vec{u}_i^\top Q\vec{u}_i \\
\tilde{\lambda}_i \|\vec{u}_i\|^2 &= \vec{u}_i^\top Q\vec{u}_i \\
\tilde{\lambda}_i \|\vec{u}_i\|^2 &= \vec{u}_i^\top M M^\top \vec{u}_i \\
\tilde{\lambda}_i \|\vec{u}_i\|^2 &= (M^\top \vec{u}_i)^\top (M^\top \vec{u}_i) \\
\tilde{\lambda}_i \|\vec{u}_i\|^2 &= \|M^\top \vec{u}_i\|^2
\end{align}
The right-hand side is a squared norm, so it is non-negative, and $\|\vec{u}_i\|^2 = 1 > 0$. This means that $\tilde{\lambda}_i$ must also be non-negative.
(b) Suppose that we define $\vec{w}_i = \frac{M^\top \vec{u}_i}{\sqrt{\tilde{\lambda}_i}}$ for all $i$ for which $\tilde{\lambda}_i > 0$. Suppose that there are $\ell$ such eigenvalues. Show that $W = [\vec{w}_1, \vec{w}_2, \cdots, \vec{w}_\ell]$ has orthonormal columns.
Solution: To show orthonormality, we compute the inner product $\vec{w}_i^\top \vec{w}_j$ for all $i, j \in \{1, 2, \dots, \ell\}$ and demonstrate that $\vec{w}_i^\top \vec{w}_j = 1$ if $i = j$ and $\vec{w}_i^\top \vec{w}_j = 0$ if $i \neq j$:
$$\vec{w}_i^\top \vec{w}_j = \left(\frac{M^\top \vec{u}_i}{\sqrt{\tilde{\lambda}_i}}\right)^{\!\top} \left(\frac{M^\top \vec{u}_j}{\sqrt{\tilde{\lambda}_j}}\right) = \frac{\vec{u}_i^\top M M^\top \vec{u}_j}{\sqrt{\tilde{\lambda}_i}\sqrt{\tilde{\lambda}_j}}.$$
If $i = j$, we saw in the previous part that $\tilde{\lambda}_i \|\vec{u}_i\|^2 = \vec{u}_i^\top M M^\top \vec{u}_i$, so
$$\vec{w}_i^\top \vec{w}_i = \frac{\tilde{\lambda}_i \|\vec{u}_i\|^2}{\sqrt{\tilde{\lambda}_i}\sqrt{\tilde{\lambda}_i}}.$$
Since we know $U$ is orthonormal, $\|\vec{u}_i\| = 1$:
$$\vec{w}_i^\top \vec{w}_i = \frac{\tilde{\lambda}_i}{\tilde{\lambda}_i} = 1.$$
If $i \neq j$, we know that $\vec{u}_j$ is an eigenvector of $MM^\top$, so the original equation becomes:
$$\vec{w}_i^\top \vec{w}_j = \frac{\tilde{\lambda}_j \vec{u}_i^\top \vec{u}_j}{\sqrt{\tilde{\lambda}_i}\sqrt{\tilde{\lambda}_j}}.$$
Since $U$ is orthonormal, $\vec{u}_i^\top \vec{u}_j = 0$ and so $\vec{w}_i^\top \vec{w}_j = 0$.
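A short sketch (my addition) builds $W$ from the eigendecomposition of $Q = MM^\top$ for a random wide $M$ and checks that its columns are orthonormal:

```python
import numpy as np

M = np.random.randn(3, 6)                 # wide matrix: n = 3 rows, m = 6 columns
Q = M @ M.T
lam, U = np.linalg.eigh(Q)                # eigenvalues (ascending) and orthonormal eigenvectors

# Keep only directions with non-zero eigenvalue, scaling by 1/sqrt(lambda_i).
cols = [M.T @ U[:, i] / np.sqrt(lam[i]) for i in range(len(lam)) if lam[i] > 1e-12]
W = np.column_stack(cols)
print(np.allclose(W.T @ W, np.eye(W.shape[1])))   # True: the columns of W are orthonormal
```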


7. Homework Process and Study Group


Citing sources and collaborators is an important part of life, including being a student!
We also want to understand what resources you find helpful and how much time homework is taking, so we
can change things in the future if possible.

(a) What sources (if any) did you use as you worked through the homework?
(b) If you worked with someone on this homework, who did you work with?
List names and student IDs. (In case of homework party, you can also just describe the group.)

Contributors:

• Kourosh Hakhamaneshi.

• Elena Jia.

• Siddharth Iyer.

• Justin Yim.

• Anant Sahai.

• Gaoyue Zhou.

• Aditya Arun.
