Lect 11 DM

DISCLAIMER

In preparation of these slides, material has been taken from different online sources in the form of books, websites, research papers, presentations, etc. However, the author does not intend to take any benefit of these in her/his own name. This lecture (audio, video, slides, etc.) is prepared and delivered only for educational purposes and is not intended to infringe upon copyrighted material. Sources have been acknowledged where applicable. The views expressed are the presenter's alone and do not necessarily represent the actual author(s) or the institution.
Data Mining

Clustering

Clustering Approaches
1. Partitioning Methods
2. Hierarchical Methods
3. Density-Based Methods
Hierarchical Clustering
• Two main types of hierarchical clustering
  – Agglomerative:
    • Start with the points as individual clusters
    • At each step, merge the closest pair of clusters until only one cluster (or k clusters) is left
  – Divisive:
    • Start with one, all-inclusive cluster
    • At each step, split a cluster until each cluster contains a single point (or there are k clusters)
• Traditional hierarchical algorithms use a similarity or distance matrix
  – Merge or split one cluster at a time
  – Image segmentation mostly uses simultaneous merge/split
Hierarchical clustering
• Agglomerative (Bottom-up)
  – Compute all pair-wise pattern-pattern similarity coefficients
  – Place each of the n patterns into a class of its own
  – Merge the two most similar clusters into one
    • Replace the two clusters with the new cluster
    • Re-compute inter-cluster similarity scores w.r.t. the new cluster
  – Repeat the above step until there are k clusters left (k can be 1); see the sketch below
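The merge loop above can be written down directly. A minimal, illustrative sketch in Python/NumPy (the function name `agglomerative` and the use of single-link distance are assumptions made for this example, not part of the slides):

```python
import numpy as np

def agglomerative(points, k=1):
    """Naive bottom-up clustering: start with singleton clusters and
    repeatedly merge the closest pair until only k clusters remain."""
    clusters = [[i] for i in range(len(points))]          # each pattern in its own class
    dist = lambda a, b: np.linalg.norm(points[a] - points[b])

    while len(clusters) > k:
        # find the two most similar (closest) clusters, single-link style
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]            # replace the pair with the merged cluster
        del clusters[j]
    return clusters

pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9], [9.0, 0.0]])
print(agglomerative(pts, k=2))   # -> [[0, 1], [2, 3, 4]]
```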
Hierarchical clustering
• Agglomerative (Bottom-up)
[Figure: a sequence of slides (iterations 1 through 5, then the final k clusters) shows the closest pair of clusters being merged at each step.]
Hierarchical clustering
• Divisive (Top-down)
  – Start at the top with all patterns in one cluster
  – The cluster is split using a flat clustering algorithm
  – This procedure is applied recursively until each pattern is in its own singleton cluster (a rough sketch follows)
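The slides do not fix a particular flat clustering algorithm for the split step; a common choice is 2-means (bisecting). A rough, illustrative sketch assuming scikit-learn is available (the function name `divisive` is made up for this example):

```python
import numpy as np
from sklearn.cluster import KMeans

def divisive(points, indices=None):
    """Top-down clustering: recursively split each cluster with 2-means
    until every pattern ends up in its own singleton cluster."""
    if indices is None:
        indices = np.arange(len(points))
    if len(indices) <= 1:                                  # singleton reached: stop splitting
        return [indices.tolist()]
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(points[indices])
    left, right = indices[labels == 0], indices[labels == 1]
    print("split", indices.tolist(), "->", left.tolist(), "+", right.tolist())
    return divisive(points, left) + divisive(points, right)

pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
print(divisive(pts))   # every point ends up in its own singleton cluster
```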
Hierarchical Clustering: The Algorithm

• Hierarchical clustering takes as input a set of points
• It creates a tree in which the points are leaves and the internal nodes reveal the similarity structure of the points.
  – The tree is often called a "dendrogram."
• The method is summarized below (a SciPy sketch follows):

  Place all points into their own clusters
  While there is more than one cluster, do
    Merge the closest pair of clusters

• The behavior of the algorithm depends on how "closest pair of clusters" is defined
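In practice this loop is usually not hand-coded; SciPy's `scipy.cluster.hierarchy` module builds the same tree directly. A small sketch, with made-up coordinates for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# six made-up 2-D points, labelled A-F
pts = np.array([[0, 0], [1, 0], [4, 0], [5, 0], [9, 3], [10, 3]], dtype=float)

# method='single' uses the closest-pair (single-link) definition of cluster distance
Z = linkage(pts, method='single', metric='euclidean')

dendrogram(Z, labels=list('ABCDEF'))   # leaves are the points, internal nodes are merges
plt.ylabel('merge distance')
plt.show()
```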
Hierarchical Clustering: Example
This example illustrates single-link clustering in Euclidean space on six points (A-F).
[Figure: a scatter of the points A-F and the corresponding single-link dendrogram over the leaves A B C D E F.]
Hierarchical Clustering
• Produces a set of nested clusters organized as a hierarchical tree
• Can be visualized as a dendrogram
  – A tree-like diagram that records the sequences of merges or splits
[Figure: nested clusters over points 1-6 and the corresponding dendrogram; the vertical axis (0 to 0.2) gives the distance at which each merge occurs.]
Strengths of Hierarchical Clustering

• Do not have to assume any particular number of clusters
  – Any desired number of clusters can be obtained by 'cutting' the dendrogram at the proper level (see the sketch below)
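Cutting the dendrogram at a level is one call with SciPy's `fcluster`; a brief sketch, reusing the linkage matrix `Z` built in the earlier snippet:

```python
from scipy.cluster.hierarchy import fcluster

# cut the tree wherever it yields exactly 3 clusters
labels_k3 = fcluster(Z, t=3, criterion='maxclust')

# or cut at a fixed merge distance (2.5 here is an arbitrary choice)
labels_dist = fcluster(Z, t=2.5, criterion='distance')

print(labels_k3, labels_dist)          # one cluster id per original point
```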
Hierarchical Clustering: Merging Clusters

Single Link: Distance between two clusters is the distance between the closest pair of points, one from each cluster. Also called "neighbor joining."

Average Link: Distance between two clusters is the average distance over all pairs of points, one from each cluster. (Taking the distance between the cluster centroids instead is centroid linkage.)

Complete Link: Distance between two clusters is the distance between the farthest pair of points.
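These definitions can be checked directly on two small, made-up clusters; a minimal NumPy/SciPy sketch:

```python
import numpy as np
from scipy.spatial.distance import cdist

X = np.array([[0.0, 0.0], [1.0, 0.0]])   # cluster 1
Y = np.array([[4.0, 0.0], [6.0, 0.0]])   # cluster 2
D = cdist(X, Y)                          # all cross-cluster pairwise distances

single   = D.min()                       # closest pair of points
complete = D.max()                       # farthest pair of points
average  = D.mean()                      # average over all cross-cluster pairs
centroid = np.linalg.norm(X.mean(axis=0) - Y.mean(axis=0))   # centroid linkage

print(single, complete, average, centroid)   # 3.0 6.0 4.5 4.5
```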
How to Define Inter-Cluster Similarity
[Figure: two groups of points p1-p5 with their proximity matrix; the question is how to define the similarity between the two clusters.]
• MIN
• MAX
• Group Average
• Distance Between Centroids
• Other methods driven by an objective function
  – Ward's Method uses squared error
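All of these inter-cluster definitions, including Ward's squared-error criterion, are available as the `method` argument of SciPy's `linkage`; a brief sketch on random, made-up data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

pts = np.random.default_rng(0).normal(size=(20, 2))   # made-up data

# each 'method' corresponds to one of the inter-cluster similarity definitions above
for method in ('single', 'complete', 'average', 'centroid', 'ward'):
    Z = linkage(pts, method=method)
    print(method, 'distance of the final merge:', round(Z[-1, 2], 3))
```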
An example
Let us consider a gene measured in a set of 5 experiments: A, B, C, D, and E. The values measured in the 5 experiments are:
A = 100, B = 200, C = 500, D = 900, E = 1100

We will construct the hierarchical clustering of these values using Euclidean distance, centroid linkage, and an agglomerative approach.
An example
SOLUTION (a short script replaying these merges follows):
• The closest two values are 100 and 200 => the centroid of these two values is 150.
• Now we are clustering the values: 150, 500, 900, 1100.
• The closest two values are 900 and 1100 => the centroid of these two values is 1000.
• The remaining values to be joined are: 150, 500, 1000.
• The closest two values are 150 and 500 => the centroid of these two values is 325.
• Finally, the two resulting subtrees are joined in the root of the tree.
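The merge order can be verified by replaying it in code. A small sketch that follows the slide's convention of taking the new centroid as the average of the two merged centroids:

```python
values = [100.0, 200.0, 500.0, 900.0, 1100.0]   # A, B, C, D, E

while len(values) > 1:
    values.sort()
    # in one dimension the closest pair of centroids is always an adjacent pair
    gaps = [(values[i + 1] - values[i], i) for i in range(len(values) - 1)]
    _, i = min(gaps)
    merged = (values[i] + values[i + 1]) / 2    # centroid of the two merged clusters
    print(f"merge {values[i]} and {values[i + 1]} -> centroid {merged}")
    values[i:i + 2] = [merged]

# merge 100.0 and 200.0 -> centroid 150.0
# merge 900.0 and 1100.0 -> centroid 1000.0
# merge 150.0 and 500.0 -> centroid 325.0
# merge 325.0 and 1000.0 -> centroid 662.5
```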
An example:
Two dendrograms showing the hierarchical clustering of the expression values of a single gene measured in 5 experiments.
[Figure: two dendrograms over the leaves A = 100, B = 200, C = 500, D = 900, and E = 1100, drawn with different leaf orderings.]

The dendrograms are identical: both diagrams show that:
• A is most similar to B
• C is most similar to the group (A, B)
• D is most similar to E
In the left dendrogram A and E are plotted far from each other; in the right dendrogram A and E are immediate neighbors.

THE PROXIMITY IN A HIERARCHICAL CLUSTERING DOES NOT NECESSARILY CORRESPOND TO SIMILARITY
Example: Single Link Method
[Figure: a sequence of slides works through single-link clustering step by step on an example dataset, ending with the resulting dendrogram.]

Example: Complete Link Method
[Figure: the same example clustered with complete linkage, with the resulting dendrogram.]

Example: Group Average Method
[Figure: the same example clustered with group-average linkage.]
Acknowledgements
 Introduction to Machine Learning, Alpaydin
 "Pattern Classification" by Duda et al., John Wiley & Sons.
 Read GMM from "Automated Detection of Exudates in Colored Retinal Images for Diagnosis of Diabetic Retinopathy", Applied Optics, Vol. 51, No. 20, 4858-4866, 2012.

Material in these slides has been taken from the following resources:
 Biomisa.org