Clustering Analysis

The document discusses various clustering methods, particularly hierarchical clustering techniques such as Ward's Method, which utilizes squared error for merging clusters. It highlights the strengths and limitations of different linkage methods, including Single Link, Complete Linkage, and Group Average, as well as their computational complexities. Additionally, it addresses issues like sensitivity to noise and outliers, and the challenges of handling clusters of varying sizes and shapes.


[Figure: Proximity Matrix over points p1, p2, p3, p4, p5, ..., with the question "Similarity?" posed between two clusters]

! MIN
! MAX
! Group Average
! Distance Between Centroids
! Other methods driven by an objective function
  – Ward's Method uses squared error
! Another way to view the processing of the hierarchical algorithm is that we create links between elements in order of increasing distance
  " The MIN (Single Link) approach will merge two clusters as soon as a single pair of elements, one from each cluster, is linked
  " The MAX (Complete Linkage) approach will merge two clusters only when all pairs of elements between them have been linked
MIN (Single Link) example, using the distance matrix below:

      1    2    3    4    5    6
 1   0   .24  .22  .37  .34  .23
 2  .24   0   .15  .20  .14  .25
 3  .22  .15   0   .15  .28  .11
 4  .37  .20  .15   0   .29  .22
 5  .34  .14  .28  .29   0   .39
 6  .23  .25  .11  .22  .39   0

[Figure: Nested Clusters and the corresponding Dendrogram; single-link merges occur at heights up to about 0.2, with leaf order 3, 6, 2, 5, 4, 1]
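A minimal sketch of the two merge rules on the matrix above, assuming SciPy is installed (variable names and the two-cluster cut are illustrative, not from the slides):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# The symmetric distance matrix from the example above (points 1..6).
D = np.array([
    [0.00, 0.24, 0.22, 0.37, 0.34, 0.23],
    [0.24, 0.00, 0.15, 0.20, 0.14, 0.25],
    [0.22, 0.15, 0.00, 0.15, 0.28, 0.11],
    [0.37, 0.20, 0.15, 0.00, 0.29, 0.22],
    [0.34, 0.14, 0.28, 0.29, 0.00, 0.39],
    [0.23, 0.25, 0.11, 0.22, 0.39, 0.00],
])

condensed = squareform(D)                      # linkage() expects condensed form
Z_min = linkage(condensed, method='single')    # MIN: one linked pair suffices
Z_max = linkage(condensed, method='complete')  # MAX: all pairs must be linked

print(fcluster(Z_min, t=2, criterion='maxclust'))   # two-cluster cut, MIN
print(fcluster(Z_max, t=2, criterion='maxclust'))   # two-cluster cut, MAX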
Strength of MIN:
[Figure: Original Points → Two Clusters]
• Can handle non-elliptical shapes

Limitation of MIN:
[Figure: Original Points → Two Clusters]
• Sensitive to noise and outliers


MAX (Complete Linkage) example, using the same distance matrix:

[Figure: Nested Clusters and the corresponding Dendrogram; complete-link merges occur at heights up to about 0.4, with leaf order 3, 6, 4, 1, 2, 5]

Strength of MAX:
[Figure: Original Points → Two Clusters]
• Less susceptible to noise and outliers

Limitations of MAX:
[Figure: Original Points → Two Clusters]
• Tends to break large clusters
• Biased towards globular clusters
! Proximity of two clusters is the average of pairwise proximity between points in the two clusters:

\[
\text{proximity}(\text{Cluster}_i, \text{Cluster}_j) =
\frac{\displaystyle\sum_{p_i \in \text{Cluster}_i,\; p_j \in \text{Cluster}_j} \text{proximity}(p_i, p_j)}
{|\text{Cluster}_i| \times |\text{Cluster}_j|}
\]

! Need to use average connectivity for scalability, since total proximity favors large clusters
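As a quick check, a minimal sketch of this formula (the helper name and indexing are illustrative, not from the slides):

import numpy as np

def group_average_proximity(D, cluster_i, cluster_j):
    """Mean of D[p, q] over all p in cluster_i and q in cluster_j.

    D is a full square proximity matrix; clusters are lists of row indices.
    """
    cross = D[np.ix_(cluster_i, cluster_j)]    # all cross-cluster pairs
    return cross.sum() / (len(cluster_i) * len(cluster_j))

# With the matrix D from the sketch above (0-indexed), clusters {3, 6}
# and {2, 5} give (.15 + .28 + .25 + .39) / 4 = 0.2675.
print(group_average_proximity(D, [2, 5], [1, 4]))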

Group Average example, using the same distance matrix:

[Figure: Nested Clusters and the corresponding Dendrogram; group-average merges occur at heights up to about 0.25, with leaf order 3, 6, 4, 1, 2, 5]
! Compromise between Single and Complete Link

! Strengths
  " Less susceptible to noise and outliers

! Limitations
  " Biased towards globular clusters
! Similarity of two clusters is based on the increase in squared error (SSE) when the two clusters are merged
  " Similar to group average if the distance between points is the squared distance
! Less susceptible to noise and outliers
! Biased towards globular clusters
! Hierarchical analogue of K-means
  " Can be used to initialize K-means
[Figure: Hierarchical clustering comparison on the same data set, four panels: MIN, MAX, Group Average, Ward's Method]
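A minimal sketch reproducing this four-way comparison, assuming SciPy (the random points and the four-cluster cut are illustrative placeholders):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.random((12, 2))                        # placeholder 2-D points

for method in ('single', 'complete', 'average', 'ward'):
    Z = linkage(X, method=method)              # 'ward' needs raw coordinates
    print(method, fcluster(Z, t=4, criterion='maxclust'))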
! O(N²) space, since it uses the proximity matrix
  " N is the number of points

! O(N³) time in many cases
  " There are N steps, and at each step the N² proximity matrix must be updated and searched
  " Complexity can be reduced to O(N² log N) time for some approaches
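A minimal sketch of the naive loop behind these bounds, with a single-link update rule (names are illustrative; other linkages change only the update):

import numpy as np

def naive_single_link(D):
    """N-1 merge steps; each scans the O(N^2) matrix for the closest pair."""
    D = D.astype(float).copy()
    np.fill_diagonal(D, np.inf)
    clusters = [[i] for i in range(D.shape[0])]
    merges = []
    while len(clusters) > 1:
        i, j = np.unravel_index(np.argmin(D), D.shape)   # O(N^2) search
        if i > j:
            i, j = j, i
        merges.append((clusters[i], clusters[j], D[i, j]))
        # Single-link update: distance to the merged cluster is the minimum.
        D[i, :] = np.minimum(D[i, :], D[j, :])
        D[:, i] = D[i, :]
        D[i, i] = np.inf
        D = np.delete(np.delete(D, j, axis=0), j, axis=1)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges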
! Computational complexity in time and space
! Once a decision is made to combine two clusters, it cannot be undone
! No objective function is directly minimized
! Different schemes have problems with one or more of the following:
  " Sensitivity to noise and outliers
  " Difficulty handling different-sized clusters and convex shapes
  " Breaking large clusters
