Clustering Analysis
[Figure: two clusters of points p1–p5 with the question "Similarity?" and an empty proximity matrix — how should inter-cluster similarity be defined?]

! MIN
! MAX
! Group Average
! Distance Between Centroids
! Other methods driven by an objective function
– Ward's Method uses squared error
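These linkage definitions can be computed directly from a proximity matrix. A minimal Python sketch on a hypothetical 5-point distance matrix (the values and the helper names `d_min`, `d_max`, `d_avg` are illustrative, not from the slides):

```python
from itertools import product

# Hypothetical symmetric distance matrix over points p1..p5 (illustrative values).
prox = [
    [0.00, 0.24, 0.22, 0.37, 0.34],
    [0.24, 0.00, 0.15, 0.20, 0.14],
    [0.22, 0.15, 0.00, 0.15, 0.28],
    [0.37, 0.20, 0.15, 0.00, 0.29],
    [0.34, 0.14, 0.28, 0.29, 0.00],
]

def pairwise(ci, cj):
    """All cross-cluster distances between clusters ci and cj (point index lists)."""
    return [prox[a][b] for a, b in product(ci, cj)]

def d_min(ci, cj):   # single link: distance of the closest pair
    return min(pairwise(ci, cj))

def d_max(ci, cj):   # complete link: distance of the farthest pair
    return max(pairwise(ci, cj))

def d_avg(ci, cj):   # group average: mean over all cross-cluster pairs
    d = pairwise(ci, cj)
    return sum(d) / len(d)

# Example: cluster {p1, p2} vs cluster {p3, p4, p5} (0-based indices)
ci, cj = [0, 1], [2, 3, 4]
print(d_min(ci, cj), d_max(ci, cj), d_avg(ci, cj))
```

Distance between centroids would need the point coordinates themselves, which a proximity matrix alone does not provide.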
! Another way to view the processing of the hierarchical algorithm is that we create links between elements in order of increasing distance
" MIN (Single Link) merges two clusters as soon as a single pair of elements, one from each cluster, is linked
" MAX (Complete Linkage) merges two clusters only when all pairs of elements between them have been linked
Single-link (MIN) example on six points, with this distance matrix:

     1    2    3    4    5    6
1    0   .24  .22  .37  .34  .23
2   .24   0   .15  .20  .14  .25
3   .22  .15   0   .15  .28  .11
4   .37  .20  .15   0   .29  .22
5   .34  .14  .28  .29   0   .39
6   .23  .25  .11  .22  .39   0

[Figure: nested single-link clusters and the corresponding dendrogram; leaves ordered 3, 6, 4, 1, 2, 5]
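The link-by-link view above can be sketched as a naive agglomerative loop over the six-point distance matrix: passing `min` as the linkage gives single link, `max` gives complete linkage (the `agglomerate` helper is my own illustration, not from the slides):

```python
# Naive agglomerative clustering on the 6x6 distance matrix above.
# A sketch only: cubic overall, fine for tiny examples.
D = {  # distances keyed by point pair, points labeled 1..6
    (1, 2): .24, (1, 3): .22, (1, 4): .37, (1, 5): .34, (1, 6): .23,
    (2, 3): .15, (2, 4): .20, (2, 5): .14, (2, 6): .25,
    (3, 4): .15, (3, 5): .28, (3, 6): .11,
    (4, 5): .29, (4, 6): .22,
    (5, 6): .39,
}

def dist(a, b):
    return D[(a, b)] if (a, b) in D else D[(b, a)]

def agglomerate(linkage):
    """Repeatedly merge the closest pair of clusters under the chosen linkage."""
    clusters = [{i} for i in range(1, 7)]
    merges = []
    while len(clusters) > 1:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(dist(a, b)
                                   for a in clusters[ij[0]] for b in clusters[ij[1]]),
        )
        merges.append((sorted(clusters[i]), sorted(clusters[j])))
        clusters[i] |= clusters[j]
        del clusters[j]
    return merges

print(agglomerate(min)[:2])  # single link: {3,6} merge at .11, then {2,5} at .14
```

The first merge is the same for both linkages (all clusters are still singletons), but later merges diverge; note that ties (e.g. the two .15 entries) may be broken differently than in the slides' dendrogram.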
[Figure: original points and the resulting two clusters]
! Proximity of two clusters is the average of pairwise proximity between points in the two clusters:

proximity(Cluster_i, Cluster_j) = \frac{\sum_{p_i \in Cluster_i,\ p_j \in Cluster_j} proximity(p_i, p_j)}{|Cluster_i| \times |Cluster_j|}
[Figure: group-average clustering of the six points and its dendrogram; leaves ordered 3, 6, 4, 1, 2, 5]
! Compromise between Single and Complete Link
! Strengths
" Less susceptible to noise and outliers
! Limitations
" Biased towards globular clusters
! Similarity of two clusters is based on the increase in squared error (SSE) when the two clusters are merged
" Similar to group average if the distance between points is the squared distance
! Less susceptible to noise and outliers
! Biased towards globular clusters
! Hierarchical analogue of K-means
" Can be used to initialize K-means
[Figure: hierarchical clusterings of the same six points compared under MIN, MAX, Group Average, and Ward's Method]
! O(N²) space, since the algorithm stores the proximity matrix
" N is the number of points