Singular Value Decomposition
Program: M.C.A.
Course Code: MCAS9220
Course Name: Data Science Fundamentals
Singular Value Decomposition (SVD)
Linear Algebra Background
Eigenvalues & Eigenvectors
Eigenvectors (for a square $m \times m$ matrix $S$): a (right) eigenvector is a nonzero vector $v$ with $Sv = \lambda v$, where the scalar $\lambda$ is the corresponding eigenvalue.
Example:
$S = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ has eigenvalues 3, 2, 0 with corresponding eigenvectors
$v_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$, $v_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, $v_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$.
For a vector $x = 2v_1 + 4v_2 + 6v_3$:
$Sx = S(2v_1 + 4v_2 + 6v_3) = 2Sv_1 + 4Sv_2 + 6Sv_3 = 2\lambda_1 v_1 + 4\lambda_2 v_2 + 6\lambda_3 v_3$.
Example: let $S = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. Then
$|S - \lambda I| = (2-\lambda)^2 - 1 = 0$.
The eigenvalues are 1 and 3 (nonnegative, real). Plugging these values in and solving for the eigenvectors gives
$v_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ and $v_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$;
the eigenvectors are orthogonal (and real).
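The worked example above is easy to check numerically. A minimal sketch with NumPy (assuming only that NumPy is available):

```python
import numpy as np

# The symmetric matrix from the example above.
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is NumPy's routine for symmetric matrices: it returns real
# eigenvalues in ascending order and orthonormal eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(S)

print(eigenvalues)                    # ascending: 1.0, 3.0
# The columns are the eigenvectors; orthonormality means V^T V = I.
print(eigenvectors.T @ eigenvectors)
```

Because $S$ is symmetric, `eigh` is both faster and more accurate here than the general-purpose `eig`.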
Eigen/diagonal Decomposition
Let $S$ be a square $m \times m$ matrix with $m$ linearly independent eigenvectors (a "non-defective" matrix).
Theorem: there exists an eigen decomposition $S = U \Lambda U^{-1}$, where the columns of $U$ are the eigenvectors of $S$ and $\Lambda$ is a diagonal matrix whose diagonal entries are the eigenvalues of $S$ (unique for distinct eigenvalues).
Diagonal decomposition - example
Recall $S = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$; $\lambda_1 = 1$, $\lambda_2 = 3$.
The eigenvectors $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ form $U = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$.
Inverting, we have $U^{-1} = \begin{pmatrix} 1/2 & -1/2 \\ 1/2 & 1/2 \end{pmatrix}$ (recall $U U^{-1} = I$).
Then
$S = U \Lambda U^{-1} = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1/2 & -1/2 \\ 1/2 & 1/2 \end{pmatrix}$.
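One way to sanity-check the factorization is to multiply the three factors back together; a small NumPy sketch of the decomposition above:

```python
import numpy as np

# Eigenvectors (1, -1) and (1, 1) as the columns of U,
# eigenvalues 1 and 3 on the diagonal of Lambda.
U = np.array([[1.0, 1.0],
              [-1.0, 1.0]])
Lam = np.diag([1.0, 3.0])
U_inv = np.linalg.inv(U)   # equals [[1/2, -1/2], [1/2, 1/2]]

# Multiplying the factors recovers S.
S = U @ Lam @ U_inv
print(S)                   # [[2. 1.] [1. 2.]]
```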
Example continued
Let's divide $U$ (and multiply $U^{-1}$) by $\sqrt{2}$. Then
$S = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}$.
The first factor is $Q$ and the last is $Q^{-1} = Q^T$: since the columns of $Q$ are orthonormal, $Q$ is orthogonal.
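Normalizing the columns turns $U$ into an orthogonal matrix, so the inverse is just the transpose and no explicit inversion is needed; sketched in NumPy:

```python
import numpy as np

# U with both columns scaled by 1/sqrt(2): an orthogonal matrix Q.
Q = np.array([[1.0, 1.0],
              [-1.0, 1.0]]) / np.sqrt(2)
Lam = np.diag([1.0, 3.0])

# Orthogonality: Q^T Q = I, hence Q^-1 = Q^T.
print(np.allclose(Q.T @ Q, np.eye(2)))   # True
S = Q @ Lam @ Q.T
print(S)                                 # [[2. 1.] [1. 2.]]
```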
Exercise: examine the symmetric eigen decomposition, if any, of
$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $\begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$, $\begin{pmatrix} 2 & 2 \\ 2 & 4 \end{pmatrix}$.
Time out!
I came to this class to learn about text retrieval
and mining, not have my linear algebra past
dredged up again …
But if you want to dredge, Strang’s Applied
Mathematics is a good place to start.
What do these matrices have to do with text?
Recall $m \times n$ term-document matrices ...
But everything so far needs square matrices - so ...
Singular Value Decomposition
For an $m \times n$ matrix $A$ of rank $r$ there exists a factorization (Singular Value Decomposition = SVD) as follows:
$A = U \Sigma V^T$
where $U$ is $m \times m$, $\Sigma$ is $m \times n$, and $V$ is $n \times n$.
For example, with $m = 3$, $n = 2$:
$\begin{pmatrix} 1 & -1 \\ 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 2/\sqrt{6} & 1/\sqrt{3} \\ 1/\sqrt{2} & -1/\sqrt{6} & 1/\sqrt{3} \\ 1/\sqrt{2} & 1/\sqrt{6} & -1/\sqrt{3} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{3} \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}$
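The same factorization can be computed with `np.linalg.svd`. Note that NumPy returns the singular values in descending order, so $\sqrt{3}$ comes first here (the columns of $U$ and rows of $V^T$ are permuted accordingly, and signs may differ):

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

# U is 3x3, s holds the singular values (descending), Vt is 2x2.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(s)                       # sqrt(3) and 1

# Rebuild the 3x2 Sigma and confirm A = U Sigma V^T.
Sigma = np.zeros(A.shape)
Sigma[:2, :2] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, A))   # True
```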
Low-rank approximation: let $A_k = \arg\min_{X:\,\operatorname{rank}(X)=k} \|A - X\|_F$, where $\|\cdot\|_F$ is the Frobenius norm. The solution is obtained from the SVD by zeroing out all but the $k$ largest singular values, and the resulting error is
$\min_{X:\,\operatorname{rank}(X)=k} \|A - X\|_F = \|A - A_k\|_F = \sqrt{\sigma_{k+1}^2 + \cdots + \sigma_r^2}$
(in the spectral norm the error is exactly $\sigma_{k+1}$).
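The truncation recipe is a few lines of NumPy; a sketch on a random test matrix (the helper name is my own):

```python
import numpy as np

def rank_k_approx(A, k):
    """Best rank-k approximation to A in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular values/vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))
s = np.linalg.svd(A, compute_uv=False)

k = 2
A_k = rank_k_approx(A, k)
# Frobenius error = sqrt(sigma_{k+1}^2 + ... + sigma_r^2);
# spectral-norm error = sigma_{k+1}.
print(np.isclose(np.linalg.norm(A - A_k, 'fro'),
                 np.sqrt(np.sum(s[k:] ** 2))))       # True
print(np.isclose(np.linalg.norm(A - A_k, 2), s[k]))  # True
```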
Retrieval precision at varying numbers of LSI dimensions:
Dimensions   Precision
250          0.367
300          0.371
346          0.374
Failure modes
Negated phrases: TREC topics sometimes negate certain query terms/phrases - automatic conversion of topics to Boolean queries.
As usual, the freetext/vector space syntax of LSI queries precludes (say) "Find any doc having to do with the following 5 companies".
See Dumais for more.
But why is this clustering?
We’ve talked about docs, queries, retrieval and
precision here.
What does this have to do with clustering?
Intuition: Dimension reduction through LSI brings
together “related” axes in the vector space.
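This intuition can be made concrete on a toy term-document matrix (all terms and documents below are made up for illustration). "ship" and "boat" never co-occur in a document, yet after projecting into a 2-dimensional LSI space the two documents that use them become nearly identical, because both load on the shared "ocean" axis:

```python
import numpy as np

# Toy 5-terms x 4-docs matrix. d1 uses "ship", d2 uses "boat";
# they share only the context word "ocean". d4 is about trees.
#        d1   d2   d3   d4
A = np.array([
    [1.0, 0.0, 1.0, 0.0],   # ship
    [0.0, 1.0, 1.0, 0.0],   # boat
    [1.0, 1.0, 1.0, 0.0],   # ocean
    [0.0, 0.0, 0.0, 1.0],   # tree
    [0.0, 0.0, 0.0, 1.0],   # wood
])

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# LSI: keep k = 2 dimensions; doc i becomes row i of (Sigma_k V_k^T)^T.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
docs_k = (np.diag(s[:k]) @ Vt[:k]).T

print(cosine(A[:, 0], A[:, 1]))       # original space: 0.5
print(cosine(docs_k[0], docs_k[1]))   # LSI space: close to 1.0
```

Dimension reduction has merged the "ship" and "boat" axes into one "sea" direction, pulling the two documents together.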
Intuition from block matrices
[Figure: an m (terms) x n (documents) matrix in block-diagonal form - Block 1, Block 2, ..., Block k of non-zero entries along the diagonal, 0's off the blocks.]
Intuition from block matrices
[Figure: the same block-diagonal m-terms x n-documents matrix, Blocks 1 through k on the diagonal, 0's elsewhere.]
Vocabulary partitioned into k topics (clusters); each doc discusses only one topic.
Intuition from block matrices
[Figure: the block matrix again, now with a few non-zero entries outside the diagonal blocks. Block 1 is labeled with the terms wiper, tire, V6; the rows for car (1 0) and automobile (0 1) show two synonyms landing in different blocks.]
Likely there's a good rank-k approximation to this matrix.
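A quick numerical check of this claim, on a toy matrix of my own construction: build a nearly block-diagonal term-document matrix from two rank-one "topics" plus a couple of stray off-block entries, and look at its singular values.

```python
import numpy as np

# Two topics, each a rank-one block: term weights (outer) doc weights.
t1, d1 = np.array([3.0, 2.0, 2.0]), np.array([1.0, 2.0, 1.0, 2.0])
t2, d2 = np.array([2.0, 3.0, 1.0]), np.array([2.0, 1.0, 2.0, 1.0])

A = np.zeros((6, 8))
A[:3, :4] = np.outer(t1, d1)    # Block 1
A[3:, 4:] = np.outer(t2, d2)    # Block 2
A[0, 5] = 1.0                   # a few stray off-block entries
A[4, 1] = 1.0

s = np.linalg.svd(A, compute_uv=False)
print(s)   # the first two singular values dominate

# Relative Frobenius error of the best rank-2 approximation:
rel_err = np.sqrt(np.sum(s[2:] ** 2)) / np.linalg.norm(A, 'fro')
print(rel_err)   # small: rank 2 captures almost everything
```

With exactly block-diagonal data the matrix would have rank 2; the stray entries only perturb the trailing singular values, so a rank-2 approximation remains good.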
Simplistic picture
[Figure: three clusters of documents in the reduced space, labeled Topic 1, Topic 2, Topic 3.]
Some wild extrapolation
The “dimensionality” of a corpus is the number of
distinct topics represented in it.
More mathematical wild extrapolation:
if A has a rank k approximation of low Frobenius
error, then there are no more than k distinct topics
in the corpus.
LSI has many other applications
In many settings in pattern recognition and
retrieval, we have a feature-object matrix.
For text, the terms are features and the docs are
objects.
Could be opinions and users … more in 276B.
This matrix may be redundant in dimensionality.
Can work with low-rank approximation.
If entries are missing (e.g., users’ opinions), can
recover if dimensionality is low.
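As a sketch of that last point (entirely synthetic data; the rank, seed, and iteration count are choices of mine, not from the source): repeatedly truncate to low rank and re-impose the observed entries, a simple "hard impute" loop.

```python
import numpy as np

# Synthetic rank-2 "users x items" opinion matrix.
rng = np.random.default_rng(0)
truth = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))

# Hide six entries, as if those opinions were never given.
missing = np.zeros_like(truth, dtype=bool)
missing[[0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5]] = True

X = np.where(missing, 0.0, truth)        # start with zeros in the gaps
for _ in range(500):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    low = (U[:, :2] * s[:2]) @ Vt[:2]    # project to rank 2
    X = np.where(missing, low, truth)    # keep observed entries fixed

base = np.linalg.norm(truth[missing])        # error of the zero-fill
err = np.linalg.norm((X - truth)[missing])   # error after imputing
print(base, err)   # the imputed values are much closer to the truth
```

Because the underlying matrix really is rank 2 and only a few entries are hidden, alternating between the low-rank projection and the observed data recovers the missing opinions well.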
Powerful general analytical technique
Close, principled analog to clustering methods.
Resources
http://www.cs.utk.edu/~berry/lsi++/
http://lsi.argreenhouse.com/lsi/LSIpapers.html
Dumais (1993). LSI meets TREC: A status report.
Dumais (1994). Latent Semantic Indexing (LSI) and TREC-2.
Dumais (1995). Using LSI for information filtering: TREC-3 experiments.
M. Berry, S. Dumais and G. O'Brien (1995). Using linear algebra for intelligent information retrieval. SIAM Review, 37(4):573-595.