Overview

Spectral clustering involves three main steps: 1) constructing a graph from the pairwise similarities of the points, 2) computing the eigenvectors of the graph Laplacian, and 3) clustering the points in the new eigenvector space (e.g. with k-means). This allows clustering of non-convex shapes, unlike k-means applied directly to the data. Key parameters include the number of clusters k and the similarity measure used to construct the graph.

Spectral Clustering

Eyal David
Image Processing Seminar
May 2008
Lecture Outline
• Motivation
• Graph overview and construction
• Demo
• Spectral Clustering
• Demo
• Cool implementations
[Slide 3: material reproduced from "A Tutorial on Spectral Clustering" by Arik Azran]


Spectral Clustering Example – 2 Spirals

• The dataset exhibits complex cluster shapes.
• k-means performs very poorly in this space, due to its bias toward dense, spherical clusters.

[Figure: the two-spirals dataset in the original 2-D space]

• In the embedded space given by the two leading eigenvectors, the clusters are trivial to separate.

[Figure: the same points in the embedded space, where the two clusters form well-separated groups]

Spectral Clustering - Derek Greene


Lecture Outline
• Motivation
• Graph overview and construction
• Graph demo
• Spectral Clustering
• Spectral Clustering demo
• Cool implementations
[Slides 6-17: graph overview and construction, reproduced from Matthias Hein and Ulrike von Luxburg, August 2007]


Demo
(Live example)
Lecture Outline
• Motivation
• Graph overview and construction
• Demo
• Spectral Clustering
• Demo
• Cool implementations
[Slides 20-28: spectral clustering material, reproduced from Matthias Hein and Ulrike von Luxburg, August 2007]


Eigenvectors & Eigenvalues

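A brief reminder of the definition this section builds on (a standard fact, stated here for completeness rather than taken from the original slides): a nonzero vector $v$ is an eigenvector of a square matrix $A$, with eigenvalue $\lambda$, when

$$A v = \lambda v$$

Spectral clustering works with the eigenpairs of matrices derived from the graph's affinities, such as the normalized matrix $L = D^{-1/2} W D^{-1/2}$ used in the algorithm below.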
[Slides 30-31: reproduced from Matthias Hein and Ulrike von Luxburg, August 2007]


Demo
(Live example)
Spectral Clustering Algorithm
Ng, Jordan, and Weiss

• Motivation
  • Given a set of points $S = \{s_1, \ldots, s_n\} \subset \mathbb{R}^l$
  • We would like to cluster them into $k$ subsets

Slides from "Spectral Clustering" by Rebecca Nugent and Larissa Stanberry, based on Ng et al., "On Spectral Clustering: Analysis and an Algorithm"
Algorithm
• Form the affinity matrix $W \in \mathbb{R}^{n \times n}$
  • Define $W_{ij} = e^{-\|s_i - s_j\|^2 / 2\sigma^2}$ for $i \neq j$, and $W_{ii} = 0$
  • The scaling parameter $\sigma$ is chosen by the user
• Define $D$ as the diagonal matrix whose $(i,i)$ element is the sum of $W$'s row $i$

Slides from "Spectral Clustering" by Rebecca Nugent and Larissa Stanberry, based on Ng et al., "On Spectral Clustering: Analysis and an Algorithm"
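To make the affinity step concrete, here is a minimal NumPy sketch (not from the original slides; the function name and the vectorized distance computation are my own choices):

```python
import numpy as np

def affinity_matrix(S, sigma):
    """Gaussian affinity W_ij = exp(-||s_i - s_j||^2 / 2 sigma^2), with W_ii = 0."""
    # Pairwise squared Euclidean distances via broadcasting: an (n, n) matrix.
    sq_dists = np.sum((S[:, None, :] - S[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # the algorithm sets W_ii = 0
    return W
```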


Algorithm
• Form the matrix $L = D^{-1/2} W D^{-1/2}$
• Find $x_1, x_2, \ldots, x_k$, the $k$ largest eigenvectors of $L$
  • These form the columns of the new matrix $X$
  • Note: we have reduced the dimension from $n \times n$ to $n \times k$

Slides from "Spectral Clustering" by Rebecca Nugent and Larissa Stanberry, based on Ng et al., "On Spectral Clustering: Analysis and an Algorithm"
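Continuing the sketch (assuming the affinity_matrix helper above and a graph with no isolated points, so every degree is positive):

```python
def top_k_eigenvectors(W, k):
    """Form L = D^{-1/2} W D^{-1/2} and return its k largest eigenvectors as columns."""
    d = W.sum(axis=1)                                   # degrees: row sums of W
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # D^{-1/2} W D^{-1/2}
    # L is symmetric, so eigh applies; eigenvalues come back in ascending order.
    _, eigvecs = np.linalg.eigh(L)
    X = eigvecs[:, -k:]                                 # the k largest eigenvectors, n x k
    return X
```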
Algorithm
• Form the matrix $Y$
  • Renormalize each of $X$'s rows to have unit length:
  • $Y_{ij} = X_{ij} \big/ \left(\sum_j X_{ij}^2\right)^{1/2}$, so $Y \in \mathbb{R}^{n \times k}$
• Treat each row of $Y$ as a point in $\mathbb{R}^k$
• Cluster them into $k$ clusters via k-means

Slides from "Spectral Clustering" by Rebecca Nugent and Larissa Stanberry, based on Ng et al., "On Spectral Clustering: Analysis and an Algorithm"
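Putting the pieces together (a sketch under the same assumptions; scikit-learn's KMeans is used for the final step purely for brevity):

```python
from sklearn.cluster import KMeans

def spectral_clustering(S, k, sigma):
    """Ng-Jordan-Weiss pipeline built from the helpers above."""
    W = affinity_matrix(S, sigma)
    X = top_k_eigenvectors(W, k)
    # Renormalize rows of X to unit length: Y_ij = X_ij / sqrt(sum_j X_ij^2).
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Row i of Y is the embedded point for s_i; cluster the rows with k-means.
    return KMeans(n_clusters=k, n_init=10).fit_predict(Y)
```

Point $s_i$ then receives the cluster label of row $i$ of $Y$, exactly as the final assignment step on the next slide states.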
Algorithm
• Final cluster assignment
  • Assign point $s_i$ to cluster $j$ iff row $i$ of $Y$ was assigned to cluster $j$

Slides from "Spectral Clustering" by Rebecca Nugent and Larissa Stanberry, based on Ng et al., "On Spectral Clustering: Analysis and an Algorithm"
Why?
• If we eventually use k-means, why not just apply k-means to the original data?
• This method allows us to cluster non-convex regions.

Slides from "Spectral Clustering" by Rebecca Nugent and Larissa Stanberry, based on Ng et al., "On Spectral Clustering: Analysis and an Algorithm"
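As a hedged end-to-end illustration of that point, the sketch above can be run on a two-spirals dataset like the one from the motivation slide (the data generation and $\sigma = 0.5$ are illustrative assumptions, not values from the talk):

```python
# Two interleaved spirals: r = t, with the second spiral rotated 180 degrees.
t = np.linspace(0.5, 3 * np.pi, 200)
spiral_a = np.c_[t * np.cos(t), t * np.sin(t)]
S = np.vstack([spiral_a, -spiral_a])

labels = spectral_clustering(S, k=2, sigma=0.5)
# k-means on the raw coordinates tends to split the plane in half;
# the spectral embedding should instead give one label per spiral.
```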
Some Examples

[Slides 39-47: example clustering results (images)]
User's Prerogative
• Affinity matrix construction
• Choice of scaling factor $\sigma$
  • Realistically, search over $\sigma^2$ and pick the value that gives the tightest clusters
• Choice of $k$, the number of clusters
• Choice of clustering method

Slides from "Spectral Clustering" by Rebecca Nugent and Larissa Stanberry, based on Ng et al., "On Spectral Clustering: Analysis and an Algorithm"
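One plausible reading of that search, as a sketch (reusing the helpers above and taking "tightest" to mean lowest k-means inertia in the embedded space, which is an assumption on my part):

```python
def pick_sigma(S, k, candidates):
    """Return the sigma in candidates whose embedding gives the tightest clusters."""
    best_sigma, best_inertia = None, np.inf
    for sigma in candidates:
        X = top_k_eigenvectors(affinity_matrix(S, sigma), k)
        Y = X / np.linalg.norm(X, axis=1, keepdims=True)
        inertia = KMeans(n_clusters=k, n_init=10).fit(Y).inertia_  # within-cluster SSE
        if inertia < best_inertia:
            best_sigma, best_inertia = sigma, inertia
    return best_sigma
```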
How to select k?
• Eigengap: the difference between two consecutive eigenvalues.
• The most stable clustering is generally given by the value of $k$ that maximises the eigengap
$$\Delta_k = |\lambda_k - \lambda_{k-1}|$$
• For the Cisi/Medline data, $\max_k \Delta_k = |\lambda_2 - \lambda_1|$, so choose $k = 2$.

[Figure: the 20 largest eigenvalues of the Cisi/Medline data (y-axis: eigenvalue, x-axis: k), with $\lambda_1$ and $\lambda_2$ marked; the gap between them dominates.]

Spectral Clustering - Derek Greene
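A hedged sketch of that heuristic in code (applied here to the affinity matrix from the earlier snippets; the convention $\Delta_k = |\lambda_k - \lambda_{k-1}|$ follows the slide above):

```python
def select_k_by_eigengap(W, k_max=20):
    """Choose k >= 2 maximising the gap between consecutive top eigenvalues of W."""
    eigvals = np.linalg.eigvalsh(W)[::-1]    # descending: lambda_1 >= lambda_2 >= ...
    gaps = np.abs(np.diff(eigvals[:k_max]))  # gaps[i] = |lambda_(i+2) - lambda_(i+1)|
    # gaps[0] = |lambda_2 - lambda_1| corresponds to k = 2, hence the offset.
    return int(np.argmax(gaps)) + 2
```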


Recap – The bottom line

[Slide 50: recap figure reproduced from Matthias Hein and Ulrike von Luxburg, August 2007]


Summary
• Spectral clustering can help us with hard clustering problems.
• The technique is simple to understand.
• The solution comes from solving a simple linear-algebra problem (an eigendecomposition), which is not hard to implement.
• Great care should be taken in choosing the "starting conditions": the affinity construction, the scaling parameter $\sigma$, and the number of clusters $k$.
The End
