
CS168: The Modern Algorithmic Toolbox

Lecture #9: The Singular Value Decomposition (SVD)


and Low-Rank Matrix Approximations
Tim Roughgarden & Gregory Valiant∗
April 26, 2021

1 What Are The Missing Entries?


Here’s a quiz for you: consider the following 5 × 3 matrix, with 7 entries shown and 8 entries
missing:  
\[
\begin{bmatrix}
 7 & ?  & ? \\
 ? & 8  & ? \\
 ? & 12 & 6 \\
 ? & ?  & 2 \\
21 & 6  & ?
\end{bmatrix} .
\]
What are the missing entries?
This matrix completion problem seems a bit unfair, no? After all, each of the unknown
entries could be anything, and there’s no way to know what they are. But what if I told you
the additional hint that the complete matrix has “nice structure?” This could mean many
things, but for the example let’s use an extreme assumption: that all rows are multiples of
each other.
Now, it is possible to recover all of the missing entries! For example, if the third row is a
multiple of the second one, then each entry in the second row must be 2/3 times the corresponding
entry in the third row (because of the "12" and "8" in the middle column). Thus we can
conclude that the third entry of the second row must be a "4." Similarly, the middle entry
of the fourth row is a "4," the last entry of the final row is a "3," and so on. Here's the
completed matrix:

\[
\begin{bmatrix}
 7 & 2  & 1 \\
28 & 8  & 4 \\
42 & 12 & 6 \\
14 & 4  & 2 \\
21 & 6  & 3
\end{bmatrix}
\tag{1}
\]

∗ © 2016–2021, Tim Roughgarden and Gregory Valiant. Not to be sold, published, or distributed without
the authors' consent.
The point of the example is that when you know something about the “structure” of a
partially known matrix, then sometimes it’s possible to recover all of the “lost” information.
The assumption that all rows are multiples of each other is pretty extreme — what would
“matrix structure” mean more generally? One natural and useful definition, explained in
the next section, is that of having low rank.
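As a quick sanity check, here is a minimal numpy sketch (our own illustration; the variable names are hypothetical) confirming that the completed matrix in (1) has the claimed structure. The outer-product view used here is formalized in the next section.

    import numpy as np

    # Every row of the completed matrix (1) is a multiple of the base row [7, 2, 1].
    row_multipliers = np.array([1, 4, 6, 2, 3])
    base_row = np.array([7, 2, 1])
    A = np.outer(row_multipliers, base_row)

    print(A)                            # matches the completed matrix in (1)
    print(np.linalg.matrix_rank(A))     # 1: all rows are multiples of each other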

2 Matrix Rank
You have probably seen the notion of matrix rank in previous courses, but let’s take a
moment to page back in the relevant concepts.

Rank-0 Matrices. There is only one rank-zero matrix of a given size, namely the all-zero
matrix.

Rank-1 Matrices. A rank-one matrix is precisely a non-zero matrix of the type assumed
in the example above — all rows are (not necessarily integral) multiples of each other. In
the example in (1), all columns are also multiples of each other; this is not an accident.
An equivalent definition of a rank-1 m × n matrix is as the outer product uv> of an
m-vector u and an n-vector v:¹

\[
A = uv^\top =
\begin{bmatrix} u_1 v^\top \\ u_2 v^\top \\ \vdots \\ u_m v^\top \end{bmatrix}
=
\begin{bmatrix} v_1 u & v_2 u & \cdots & v_n u \end{bmatrix} .
\]

Note that each row is a multiple of v> , and each column is a multiple of u.

Rank-2 Matrices. A rank-two matrix is just a superposition (i.e., sum) of two rank-1
matrices:

\[
A = uv^\top + wz^\top =
\begin{bmatrix} u_1 v^\top + w_1 z^\top \\ u_2 v^\top + w_2 z^\top \\ \vdots \\ u_m v^\top + w_m z^\top \end{bmatrix}
=
\begin{bmatrix} u & w \end{bmatrix}
\cdot
\begin{bmatrix} v^\top \\ z^\top \end{bmatrix} .
\tag{2}
\]
¹ Contrast this with the inner (a.k.a. dot) product \(u^\top v = \sum_{i=1}^{n} u_i v_i\), which is only defined for two vectors
of the same dimension and results in a scalar.

Figure 1: Any matrix A of rank k can be decomposed into a long and skinny (m × k) matrix Y
times a short and long (k × n) matrix Z>.

It’s worth spending some time checking and internalizing the equalities in (2).
OK, not quite: to be precise, a rank-2 matrix is one that can be written as the sum of two rank-1
matrices and is not itself a rank-0 or rank-1 matrix.

Rank-k Matrices. The general definition of matrix rank should now be clear: a matrix
A has rank k if it can be written as the sum of k rank-one matrices, and cannot be written
as the sum of k − 1 or fewer rank-one matrices.
Rephrased in terms of matrix multiplication, an equivalent definition is that A can be written
as, or “factored into,” the product of a long and skinny (m × k) matrix Y and a short and
long (k × n) matrix Z> (Figure 1). (And that A cannot be likewise factored into the product
of m × (k − 1) and (k − 1) × n matrices.)
There are many equivalent definitions of the rank of a matrix A. The following two
conditions are equivalent to each other and to the definition above (any one of the three
conditions implies the other two):

1. The largest linearly independent subset of columns of A has size k. That is, all n
columns of A arise as linear combinations of only k of them.

2. The largest linearly independent subset of rows of A has size k. That is, all m rows of
A arise as linear combinations of only k of them.

It is clear that the factorization definition implies both conditions: if A = YZ> , then all of
A's columns are linear combinations of the k columns of Y, and all of A's rows are linear
combinations of the k rows of Z> . See your linear algebra course for proofs of the other
implications (they are not difficult).
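For instance, here is a minimal numpy sketch (our own illustration) of the factorization in Figure 1: multiplying a random m × k matrix Y by a random k × n matrix Z> produces a matrix of rank k (with probability 1 over the random choice).

    import numpy as np

    rng = np.random.default_rng(0)
    m, n, k = 50, 30, 3

    Y = rng.standard_normal((m, k))      # long and skinny: m x k
    Z = rng.standard_normal((n, k))      # Z^T is short and long: k x n
    A = Y @ Z.T                          # an m x n matrix of rank (at most) k

    print(np.linalg.matrix_rank(A))      # 3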

3 Low-Rank Matrix Approximations: Motivation
The primary goal of this lecture is to identify the “best” way to approximate a given matrix A
with a rank-k matrix, for a target rank k. Such a matrix is called a low-rank approximation.
Why might you want to do this?

1. Compression. A low-rank approximation provides a (lossy) compressed version of the
matrix. The original matrix A is described by mn numbers, while describing Y and
Z> requires only k(m + n) numbers. When k is small relative to m and n, replacing
the product of m and n by their sum is a big win. For example, when A represents
an image (with entries = pixel intensities), m and n are typically in the 100s. In other
applications, m and n might well be in the tens of thousands or more. With images, a
modest value of k (say 100 or 150) is usually enough to achieve approximations that
look a lot like the original image.
Thus low-rank approximations are a matrix analog to notions of dimensionality reduc-
tion for vectors (Lectures #4 and #7).

2. De-noising. If A is a noisy version of some “ground truth” signal that is approximately
low-rank, then passing to a low-rank approximation of the raw data A might throw out
lots of noise and little signal, resulting in a matrix that is actually more informative
than the original.

3. Matrix completion. Low-rank approximations offer a first-cut approach to the matrix
completion problem introduced in Section 1. (We’ll see more advanced approaches
in Week 9.) Given a matrix A with missing entries, the first step is to obtain a full
matrix Â by filling in the missing entries with “default” values. Exactly what these
default values should be requires trial and error, and the success of the method is often
highly dependent on this choice. Natural things to try for default values include: 0,
the average of the known entries in the same column, the average of the known entries
in the same row, and the average of the known entries of the matrix. The second step
is to compute the best rank-k approximation to Â. This approach works reasonably
well when the unknown matrix is close to a rank-k matrix and there are not too many
missing entries. (A short code sketch of this recipe appears right after this list.)
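Here is a minimal sketch of the first-cut recipe from item 3 (our own illustration; the function name is hypothetical). It assumes missing entries are marked with np.nan (so the input is a float array), fills them with per-column averages of the known entries, and then truncates to rank k using the SVD-based approximation developed in Section 5.

    import numpy as np

    def complete_matrix(A_partial, k):
        # Step 1: fill each missing entry (np.nan) with the mean of the known
        # entries in its column ("default" values; other choices are possible).
        # Assumes every column has at least one known entry.
        col_means = np.nanmean(A_partial, axis=0)
        A_hat = np.where(np.isnan(A_partial), col_means, A_partial)

        # Step 2: best rank-k approximation of A_hat via the truncated SVD
        # (the construction developed in Section 5).
        U, s, Vt = np.linalg.svd(A_hat, full_matrices=False)
        return (U[:, :k] * s[:k]) @ Vt[:k, :]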

Our high-level plan for computing a rank-k approximation of a matrix A is: (i) express
A as a list of its ingredients, ordered by “importance;” (ii) keep only the k most important
ingredients. The non-trivial step (i) is made easy by the singular value decomposition, a
general matrix operation discussed in the next section.

Figure 2: The singular value decomposition (SVD). Each singular value in S has an associated
left singular vector in U, and right singular vector in V.

4 The Singular Value Decomposition (SVD)


4.1 Definitions
We’ll start with the formal definitions, and then discuss interpretations, applications, and
connections to concepts in previous lectures. A singular value decomposition (SVD) of an
m × n matrix A expresses the matrix as the product of three “simple” matrices:

A = USV> , (3)

where:

1. U is an m × m orthogonal matrix;²

2. V is an n × n orthogonal matrix;

3. S is an m × n diagonal matrix with nonnegative entries, and with the diagonal entries
sorted from high to low (as one goes “northwest” to “southeast”).³

Note that in contrast to the decomposition discussed in Lecture #8 (A = QDQ> when
A has the form X> X), the orthogonal matrices U and V are not the same — since A need
not be square, U and V need not even have the same dimensions.⁴
The columns of U are called the left singular vectors of A (these are m-vectors). The
columns of V (that is, the rows of V> ) are the right singular vectors of A (these are n-
vectors). The entries of S are the singular values of A. Thus with each singular vector (left
or right) there is an associated singular value. The “first” or “top” singular vector refers to
the one associated with the largest singular value, and so on. See Figure 2.

² Recall from last lecture that a matrix is orthogonal if its columns (or equivalently, its rows) are orthonormal
vectors, meaning they all have norm 1 and the inner product of any distinct pair of them is 0.

³ When we say that a (not necessarily square) matrix is diagonal, we mean what you’d think: only the
entries of the form (i, i) are allowed to be non-zero.

⁴ Even small numerical examples are tedious to do in detail — the orthogonality constraint on singular
vectors ensures that most of the numbers are messy. The easiest way to get a feel for what SVDs look like
is to feed a few small matrices into the SVD subroutine supported by your favorite environment (Matlab,
python’s numpy library, etc.).
To better see how the SVD expresses A as a “list of its ingredients,” check that the
factorization A = USV> is equivalent to the expression
\[
A = \sum_{i=1}^{\min\{m,n\}} s_i \cdot u_i v_i^\top ,
\tag{4}
\]

where si is the ith singular value and ui , vi are the corresponding left and right singular
vectors. That is, the SVD expresses A as a nonnegative linear combination of min{m, n}
rank-1 matrices, with the singular values providing the multipliers and the outer products
of the left and right singular vectors providing the rank-1 matrices.
Every matrix A has an SVD. The proof is not deep, but is better covered in a linear
algebra course than here. Geometrically, thinking of an m × n matrix as a mapping from
Rn to Rm , this fact is kind of amazing: every matrix A, no matter how weird, is only
performing a rotation in the domain (multiplication by V> ), followed by scaling plus adding
or deleting dimensions (multiplication by S) as needed, followed by a rotation in the range
(multiplication by U). Along the lines of last lecture’s discussion, the SVD is “more or
less unique.” The singular values of a matrix are unique. When a singular value appears
multiple times, the subspaces spanned by the corresponding left and right singular vectors
are uniquely defined, but arbitrary orthonormal bases can be chosen for each.⁵
There are pretty good algorithms for computing the SVD of a matrix; details are covered
in any numerical analysis course. It is unlikely that you will ever need to implement one
of these yourself. For example, in Matlab, you literally just write [U,S,V] = svd(A) to
compute the SVD of A. The running time of the algorithm is the smaller of O(m²n) and
O(n²m), and the standard implementations of it have been heavily optimized. A typical
laptop should have no trouble computing the SVD of a 5000 × 5000 dense matrix, but might
take a bit of time for a 10000 × 10000 matrix. As we remark at the end of these notes, if
you just want to compute the largest k singular values, and their associated singular vectors,
this can be computed significantly faster, in time roughly O(kmn). (This last comment is
quite relevant to this week’s miniproject.)
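For concreteness, here is the analogous call in numpy (a minimal sketch; note that np.linalg.svd returns the singular values as a 1-D array rather than as the diagonal matrix S).

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 3))

    # Full SVD: U is 5 x 5, Vt is 3 x 3, and s holds the min(5, 3) = 3 singular
    # values, sorted from largest to smallest.
    U, s, Vt = np.linalg.svd(A, full_matrices=True)

    # Rebuild the 5 x 3 diagonal matrix S and confirm that A = U S V^T.
    S = np.zeros(A.shape)
    np.fill_diagonal(S, s)
    print(np.allclose(A, U @ S @ Vt))    # True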

5 Low-Rank Approximations from the SVD


If we want to best approximate a matrix A by a rank-k matrix, how should we do it? If only
we had a representation of the data matrix A as a sum of several ingredients, with these
ingredients ordered by “importance,” then we could just keep the k “most important” ones.
But wait, the SVD gives us exactly such a representation! Recalling that the SVD expresses
a matrix A as a sum of rank-1 matrices (weighted by the corresponding singular values), a
natural idea is to keep only the first k terms on the right-hand side of (4). That is, for A as
⁵ Also, one can always multiply the ith left and right singular vectors by -1 to get another SVD.
Figure 3: Low rank approximation via SVD. Recall that S is non-zero only on its diagonal,
and the diagonal entries of S are sorted from high to low. Our low rank approximation is
Ak = Uk Sk Vk> .

in (4) and a target rank k, the proposed rank-k approximation is


\[
\hat{A} = \sum_{i=1}^{k} s_i \cdot u_i v_i^\top ,
\tag{5}
\]

where as usual we assume that the singular values have been sorted (s1 ≥ s2 ≥ · · · ≥ smin{m,n} ≥
0), and ui and vi denote the ith left and right singular vectors. As the sum of k rank-1
matrices, Â clearly has rank (at most) k.
Here is an equivalent way to think about the proposed rank-k approximation (see also
Figure 3).
1. Compute the SVD A = USV> , where U is an m × m orthogonal matrix, S is a
nonnegative m × n diagonal matrix with diagonal entries sorted from high to low, and
V> is an n × n orthogonal matrix.

2. Keep only the top k right singular vectors: set Vk> equal to the first k rows of V> (a
k × n matrix).

3. Keep only the top k left singular vectors: set Uk equal to the first k columns of U (an
m × k matrix).

4. Keep only the top k singular values: set Sk equal to the first k rows and columns of S
(a k × k matrix), corresponding to the k largest singular values of A.

5. The rank-k approximation is then

Ak = Uk Sk Vk> . (6)

Storing the matrices on the right-hand side of (6) takes O(k(m + n)) space, in contrast
to the O(mn) space required to store the original matrix A. This is a big win when k is
relatively small and m and n are relatively large (as in many applications).
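The five steps above translate directly into a few lines of numpy (a minimal sketch; the function name is ours). Passing full_matrices=False returns the “economy” SVD, which already discards the singular vectors beyond the first min{m, n}.

    import numpy as np

    def rank_k_approximation(A, k):
        # Steps 1-4: compute the SVD and keep the top k singular values/vectors.
        U, s, Vt = np.linalg.svd(A, full_matrices=False)   # s sorted high to low
        Uk = U[:, :k]          # m x k
        Sk = np.diag(s[:k])    # k x k
        Vtk = Vt[:k, :]        # k x n
        # Step 5: the rank-k approximation A_k = U_k S_k V_k^T.
        return Uk @ Sk @ Vtk

    # Example: a noisy rank-5 matrix is captured well by its rank-5 approximation.
    rng = np.random.default_rng(2)
    A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 80))
    A_noisy = A + 0.01 * rng.standard_normal(A.shape)
    print(np.linalg.norm(A - rank_k_approximation(A_noisy, 5)))   # small compared to np.linalg.norm(A)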

It is natural to interpret (6) as approximating the raw data A in terms of k “concepts”
(e.g., “math,” “music,” and “sports”), where the singular values of Sk express the sig-
nal strengths of these concepts, the rows of V> and columns of U express the “canonical
row/column” associated with each concept (e.g., a customer that likes only music products,
or a product liked only by music customers), and the rows of U (respectively, columns of
V> ) approximately express each row (respectively, column) of A as a linear combination
(scaled by Sk ) of the “canonical rows” (respectively, canonical columns).
Conceptually, this method of producing a low-rank approximation is as clean as could
be imagined: we re-represent A using the SVD, which provides a list of A’s “ingredients,”
ordered by “importance,” and we retain only the k most important ingredients. But is the
result of this elegant computation any good?
The next fact justifies this approach: this low-rank approximation is optimal in a natural
sense. The guarantee is in terms of the “Frobenius norm” of a matrix M, which just means
applying the ℓ2 norm to the matrix as if it were a vector: \(\|M\|_F = \sqrt{\sum_{i,j} m_{ij}^2}\).

Fact 5.1 For every m × n matrix A, rank target k ≥ 1, and rank-k m × n matrix B,

\[
\|A - A_k\|_F \le \|A - B\|_F ,
\]

where Ak is the rank-k approximation (6) derived from the SVD of A.


We won’t prove Fact 5.1 formally, but see Section 8 for a plausibility argument based on the
properties we’ve already established about the closely related PCA method.

Remark 5.2 (How to Choose k) When producing a low-rank matrix approximation, we’ve
been taking as a parameter the target rank k. But how should k be chosen? In a perfect
world, the singular values of A give strong guidance: if the top few such values are big and
the rest are small, then the obvious solution is to take k equal to the number of big values.
In a less perfect world, one takes k as small as possible subject to obtaining a useful approx-
imation — of course what “useful” means depends on the application. Rules of thumb often
take the form: choose k such that the sum of the top k singular values is at least c times
as big as the sum of the other singular values, where c is a domain-dependent constant (like
10, say).
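As one concrete (hypothetical) instantiation of this rule of thumb in numpy, with the function name and default c our own:

    import numpy as np

    def choose_k(A, c=10.0):
        # Singular values only, sorted from high to low.
        s = np.linalg.svd(A, compute_uv=False)
        top = np.cumsum(s)             # top[k-1] = sum of the k largest singular values
        rest = s.sum() - top           # rest[k-1] = sum of the remaining singular values
        # Smallest k whose top-k sum is at least c times the sum of the rest.
        return int(np.argmax(top >= c * rest)) + 1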

Remark 5.3 (Lossy Compression via Truncated Decompositions) Using the SVD to
produce low-rank matrix approximations is another example of a useful paradigm for lossy
compression. The first step of the paradigm is to re-express the raw data exactly as a de-
composition into several terms (as in (3)). The second step is to throw away all but the
“most important” terms, yielding an approximation of the original data. This paradigm
works well when you can find a representation of the data such that most of the interesting
information is concentrated in just a few components of the decomposition. The appropriate
representation will depend on the data set — though there are a few rules of thumb, as we’ll
discuss — and of course, messy enough data sets might not admit any nice representations
at all.

6 PCA Reduces to SVD
There is an interesting relationship between the SVD and the decompositions we discussed
last week. Recall in the last lecture we used the fact that X> X, as a symmetric n×n matrix,
can be written as X> X = QDQ> , where Q is an n × n orthogonal matrix and D is an n × n
diagonal matrix. (Here X is the data matrix, with each of the m rows representing a data
point in Rn .) Consider the SVD X = USV> and what its existence means for X> X:
\[
X^\top X = (USV^\top)^\top (USV^\top) = V S^\top \underbrace{U^\top U}_{=I} S V^\top = V D V^\top ,
\tag{7}
\]

where D is a diagonal matrix with diagonal entries equal to the squares of the diagonal
entries of S (if m < n then the remaining n − m diagonal entries of D are 0).
Recall from last lecture that if you decompose X> X as QDQ> , then the rows of Q>
are the eigenvectors of X> X. The computation in (7) therefore shows that the rows of V>
are the eigenvectors of X> X. Thus, the right singular vectors of X are the same as the
eigenvectors of X> X. Similarly, the eigenvalues of X> X are the squares of the singular
values of X.
Thus principal components analysis (PCA) reduces to computing the SVD of X (without
forming X> X). Recall that the output of PCA, given a target k, is simply the top k eigen-
vectors of the covariance matrix X> X. The SVD USV> of X hands you these eigenvectors
on a silver platter — they are simply the first k rows of V> . This is an alternative to the
Power Iteration method discussed last lecture. So which method for computing eigenvectors
is better? There is no clear answer; in many cases, either should work fine, and if perfor-
mance is critical you’ll want to experiment with both. Certainly the Power Iteration method,
which finds the eigenvectors of X> X one-by-one, looks like a good idea when you only want
the top few eigenvectors (as in our data visualization use cases). If you want many or all of
them, then the SVD — which gives you all of the eigenvectors, whether you want them or
not — is probably the first thing to try.
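A quick numerical check of this correspondence (a minimal sketch of our own; np.linalg.eigh returns eigenvalues in ascending order, the SVD sorts singular values in descending order, and eigenvectors are only determined up to sign):

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.standard_normal((200, 6))           # m = 200 data points in R^6

    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)  # ascending eigenvalues

    # Eigenvalues of X^T X are the squares of the singular values of X.
    print(np.allclose(np.sort(s**2), eigvals))                   # True

    # Right singular vectors of X are eigenvectors of X^T X (up to sign/order).
    print(np.allclose(np.abs(eigvecs[:, ::-1]), np.abs(Vt.T)))   # True (generically)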
Now that we understand the close connection between the SVD and the PCA method,
let’s return to Fact 5.1, which states that the SVD-based rank-k approximation is optimal
(with respect to the Frobenius norm). Intuitively, this fact holds because: (i) minimizing the
Frobenius norm ‖A − B‖F is equivalent to minimizing the average (over i) of the squared
Euclidean distances between the ith rows of A and B; (ii) the SVD uses the same vectors to
approximate the rows of A as PCA (the top eigenvectors of A> A/right singular vectors of A);
and (iii) PCA, by definition, chooses its k vectors to minimize the average squared Euclidean
distance between the rows of A and the k-dimensional subspace of linear combinations of
these vectors. The contribution of a row of A − Ak to the Frobenius norm corresponds
exactly to one of these squared Euclidean distances.

7 More on PCA vs. SVD


PCA and SVD are closely related, and in data analysis circles you should be ready for the
terms to be used almost interchangeably. There are differences, however. First, PCA refers

to a data analysis approach — a choice of how to define the “best” way to approximate a
bunch of data points as linear combinations of a small set of vectors. Meanwhile, the SVD
is a general operation defined on all matrices. For example, it doesn’t really make sense to
talk about “applying PCA” to a matrix A unless the rows of A have clear semantics —
typically, as data points x1 , . . . , xm in Rn . By contrast, the SVD (3) is well defined for every
matrix A, whatever the semantics for A. In the particular case where A is a matrix where
the rows represent data points, the SVD can be interpreted as performing the calculations
required by PCA. (The SVD is also useful for many other computational tasks.)
We can also make more of an “apples vs. apples” comparison in the following way. Let’s
define the “PCA operation” as taking an m × n data matrix X as input, and possibly a
parameter k, and outputting all (or the top k) eigenvectors of the covariance matrix X> X.
The “SVD operation” takes as input an m × n matrix X and outputs U, S, and V> , where
the rows of V> are the eigenvectors of X> X. Thus the SVD gives strictly more information
than required by PCA, namely the matrix U.
Is the additional information U provided by the SVD useful? In applications where you
want to understand the column structure of X, in addition to the row structure, the answer
is “yes.” To see this, let’s review some interpretations of the SVD (3). On the one hand,
the decomposition expresses every row of X as a linear combination of the rows of V> ,
with the rows of US providing the coefficients of these linear combinations. That is, we can
interpret the rows of X in terms of the rows of V> , which is useful when the rows of V>
have interesting semantics. Analogously, the decomposition in (3) expresses the columns of
X as linear combinations of the columns of U, with the coefficients given by the columns
of SV> . So when the columns of U are interpretable, the decomposition gives us a way to
understand the columns of X.
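In numpy terms (a minimal sketch of our own illustrating these two readings): the rows of US give the coefficients that express each row of X in terms of the rows of V>, and the columns of SV> give the coefficients that express each column of X in terms of the columns of U.

    import numpy as np

    rng = np.random.default_rng(4)
    X = rng.standard_normal((8, 5))
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    row_coeffs = U * s             # the matrix U S; its ith row holds the coefficients for row i of X
    print(np.allclose(X, row_coeffs @ Vt))    # True: rows of X are combinations of rows of V^T

    col_coeffs = s[:, None] * Vt   # the matrix S V^T; its jth column holds the coefficients for column j of X
    print(np.allclose(X, U @ col_coeffs))     # True: columns of X are combinations of columns of U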
In some applications, we really only care about understanding the rows of X, and the
extra information U provided by the SVD over PCA is irrelevant. In other applications,
both the rows and the columns of X are interesting in their own right. For example:

1. Suppose rows of X are indexed by customers, and the columns by products, with the
matrix entries indicating who likes what. We are interested in understanding the rows,
and in the best-case scenario, the right singular vectors (rows of V> ) are interpretable
as “customer types” or “canonical customers” and the SVD expresses each customer as
a mixture of customer types. For example, perhaps each student’s purchasing history
can be understood simply as a mixture of a “CS customer,” a “music customer,” and a
“clothing customer.” In the ideal case, the left singular vectors (columns of U) can be
interpreted as “product types,” where the “types” are the same as for customers, and
the SVD expresses each product as a mixture of product types (the extent to which a
product appeals to a “CS customer,” a “music customer,” etc.).

2. Suppose the matrix represents data about drug interactions, with the rows of X indexed
by proteins or pathways, and the columns by chemicals or drugs. We’re interested in
understanding both proteins and drugs in their own right, as mixtures of a small set
of “basic types.”

In the above two examples, what we really care about is the relationships between two groups
of objects (customers and products, or proteins and drugs); the labeling of one group
as the “rows” of a matrix and the other as the “columns” is arbitrary. In such cases, you
should immediately think of the SVD as a potential tool for better understanding the data.
When the columns of X are not interesting in their own right, PCA already provides the
relevant information.

8 PCA-Based Low-Rank Approximations (Optional)


The techniques developed for PCA can also be used to produce low-rank matrix approxima-
tions, as follows.
1. Preprocess the given matrix A so that the rows sum to the all-zero vector and, option-
ally, normalize each column (like last week).

2. Form the covariance matrix A> A.

3. In the notation of Figure 1, take the k rows of Z> to be the top k principal components
of A — the k eigenvectors v1 , v2 , . . . , vk of A> A that have the largest eigenvalues.

4. For i = 1, 2, . . . , m, the ith row of the matrix Y is defined as the projections (⟨xi , v1⟩, . . . , ⟨xi , vk⟩)
of xi onto the vectors v1 , . . . , vk . This is the best approximation, in terms of Euclidean
distance from xi , of xi as a linear combination of v1 , . . . , vk .⁶
The above four steps certainly produce a matrix

\[
Y \cdot Z^\top
\tag{8}
\]

that has rank at most k. How does it compare to the SVD-based low-rank approximation?
In fact, it is exactly the same!⁷

Fact 8.1 The matrix YZ> defined in (8) and the matrix Ak defined in (6) are identical.
We won’t prove Fact 8.1, but pause to note its plausibility. We defined Z> to be the top
k principal components of A — the first k eigenvectors of the covariance matrix A> A.
As noted in Section 6, the right singular vectors of A (i.e., the rows of V> ) are also the
eigenvectors of A> A. Thus, the matrices Z> and Vk> are identical, both equal to the top
k eigenvectors of A> A/top k right singular vectors of A. Given this, it is not surprising
that the two definitions of Ak are the same: both the matrix Y in (8) and the matrix Uk Sk
in (6) are intuitively defining the linear combinations of the rows of Z> and Vk> that give
the best approximation to A. In the PCA-based solution in Section 8, this is explicitly how
Y is defined; the SVD encodes the same linear combinations in the form Uk Sk .
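A quick numerical spot-check of Fact 8.1 (a minimal sketch of our own; no preprocessing is applied in either construction, per footnote 7 below):

    import numpy as np

    rng = np.random.default_rng(5)
    A = rng.standard_normal((40, 10))
    k = 3

    # SVD-based rank-k approximation, as in (6).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k_svd = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    # PCA-based construction, as in (8): top k eigenvectors of A^T A, then project the rows.
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)   # ascending order
    Z = eigvecs[:, ::-1][:, :k]                  # n x k: top k principal components as columns
    Y = A @ Z                                    # m x k: ith row is (<a_i, v_1>, ..., <a_i, v_k>)
    A_k_pca = Y @ Z.T

    print(np.allclose(A_k_svd, A_k_pca))         # True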
⁶ For example, with k = 2, these values (⟨xi , w1⟩, ⟨xi , w2⟩) are the values that we plotted in the “map of
Europe” example in Lecture #7.

⁷ We’re assuming that identical preprocessing of A, if any, is done in both cases.
