
14-Learning Emb

Standard dimensionality reduction methods like singular value decomposition (SVD) decompose a document-term matrix A of size m x n into three matrices: U of size m x r, S of size r x r, and V of size n x r. U and V contain the left and right singular vectors. S contains the singular values, representing the strength of each concept. Projecting A onto U and V produces low-dimensional embeddings of the documents and terms in an r-dimensional space, where similarities can be computed. For example, a document's embedding would be a vector of its dot products with each of the first r right singular vectors V.


Standard dimensionality reduction methods

- Singular value decomposition (SVD):

      A     =    U    x    S    x   V^T
    (m x n)    (m x r)  (r x r)   (r x n)

- A: input data matrix, m x n (e.g., m documents, n terms)
  (r: rank of the matrix A; often r < min(m, n))
- U: left singular vectors, m x r matrix (m documents, r concepts)
- S: singular values, r x r diagonal matrix (strength of each 'concept')
- V: right singular vectors, n x r matrix (n terms, r concepts)

2/17/22 Jure Leskovec & Mina Ghashami, Stanford CS246: Mining Massive Datasets 11
- U, V: column-orthonormal
  - U^T U = I; V^T V = I (I: identity matrix)
  - Columns are orthogonal unit vectors, hence they define an r-dimensional subspace
    - U defines an r-dim subspace of R^m
    - V defines an r-dim subspace of R^n

- Projecting A onto V and U produces embeddings:
  - Since A = U S V^T, then AV = US are the row (document) embeddings
  - Since A = U S V^T, then U^T A = S V^T are the column (term) embeddings
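Both projection identities hold exactly for the thin SVD, which the following sketch confirms (again on an illustrative random matrix):

```python
import numpy as np

# Check the two embedding identities: AV = US and U^T A = S V^T.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
S = np.diag(s)

row_emb = A @ Vt.T     # m x r document (row) embeddings
col_emb = U.T @ A      # r x n term (column) embeddings

print(np.allclose(row_emb, U @ S))    # True: AV = US
print(np.allclose(col_emb, S @ Vt))   # True: U^T A = S V^T
```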

Ex: compute document & word embeddings

Step 1: given a corpus of documents, convert it to bag-of-words (BOW) vectors to get a document-term matrix
- Use term frequencies (tf), or normalize using tf-idf

              data  science  spark  Stanford  learning
  document 1    10       15      3         0        10
  document 2     0        9      2         8         2
  document 3     1        2     20         0         4
  document 4    14       11      1        32         2
  document 5     5        1      7        12         5
  document 6     6        3      5         1         1
  document 7     2        3      5         2         7
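The table above translates directly into a document-term matrix; a sketch in numpy:

```python
import numpy as np

# The term-frequency table as a document-term matrix:
# rows = documents 1..7, columns = data, science, spark, Stanford, learning.
terms = ["data", "science", "spark", "Stanford", "learning"]
A = np.array([
    [10, 15,  3,  0, 10],   # document 1
    [ 0,  9,  2,  8,  2],   # document 2
    [ 1,  2, 20,  0,  4],   # document 3
    [14, 11,  1, 32,  2],   # document 4
    [ 5,  1,  7, 12,  5],   # document 5
    [ 6,  3,  5,  1,  1],   # document 6
    [ 2,  3,  5,  2,  7],   # document 7
])
print(A.shape)   # (7, 5)
```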

Step 2: apply SVD on the document-term matrix and pick a value r ≤ rank(A).
Here, we set r = 3. For the 7 x 5 matrix A above, A ≈ U x S x V^T with

  U =  [-0.30  0.41 -0.79]
       [-0.25  0.03 -0.12]
       [-0.14  0.74  0.50]
       [-0.83 -0.40  0.12]
       [-0.33  0.11  0.31]
       [-0.13  0.20 -0.04]
       [-0.14  0.27 -0.05]

  S =  [42.7   0     0  ]
       [ 0    23.8   0  ]
       [ 0     0    16.7]

  V^T = [-0.41 -0.40 -0.20 -0.77 -0.20]
        [ 0.06  0.21  0.78 -0.45  0.37]
        [-0.26 -0.63  0.55  0.40 -0.28]
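Step 2 can be sketched in numpy as below. Note that numpy may flip the signs of individual singular-vector columns relative to the slide; the resulting embeddings are equivalent up to those signs.

```python
import numpy as np

# Step 2: SVD of the example document-term matrix, truncated to r = 3.
A = np.array([
    [10, 15,  3,  0, 10],
    [ 0,  9,  2,  8,  2],
    [ 1,  2, 20,  0,  4],
    [14, 11,  1, 32,  2],
    [ 5,  1,  7, 12,  5],
    [ 6,  3,  5,  1,  1],
    [ 2,  3,  5,  2,  7],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = 3
Ur, sr, Vtr = U[:, :r], s[:r], Vt[:r, :]
print(np.round(sr, 1))   # top singular values, roughly [42.7, 23.8, 16.7]
```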
Step 3: compute the embedding of each document as

  emb_doc = [<doc, v1>, <doc, v2>, <doc, v3>]

where v1, v2, v3 are the columns of V. For document 1:

  <doc, v1> = <[10, 15, 3, 0, 10], v1> = -12.7
  <doc, v2> = <[10, 15, 3, 0, 10], v2> = 9.79
  <doc, v3> = <[10, 15, 3, 0, 10], v3> = -13.9

  emb1 = [-12.7, 9.79, -13.9]
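All document embeddings can be computed at once as A @ V_r, which by the identity A V_r = U_r S_r requires no work beyond the SVD itself. A sketch (component signs may differ from the slide's emb1, as noted above):

```python
import numpy as np

# Step 3: embed every document via A @ V_r.
A = np.array([
    [10, 15,  3,  0, 10],
    [ 0,  9,  2,  8,  2],
    [ 1,  2, 20,  0,  4],
    [14, 11,  1, 32,  2],
    [ 5,  1,  7, 12,  5],
    [ 6,  3,  5,  1,  1],
    [ 2,  3,  5,  2,  7],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = 3
Vr = Vt[:r, :].T          # n x r: columns are v1, v2, v3
emb = A @ Vr              # 7 x 3: one embedding per document

# Sanity check: A V_r = U_r S_r holds exactly.
print(np.allclose(emb, U[:, :r] * s[:r]))   # True
print(np.round(emb[0], 2))                  # embedding of document 1
```

Cosine or dot-product similarities between documents can then be computed directly on the rows of `emb`, as described in the introduction.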
