
Hierarchical Clustering

Ke Chen

COMP24111 Machine Learning

Outline
Introduction
Cluster Distance Measures
Agglomerative Algorithm
Example and Demo
Relevant Issues
Summary

Introduction
Hierarchical Clustering Approach

A typical cluster analysis approach that partitions the data set sequentially:

- Construct nested partitions layer by layer by grouping objects into a
  tree of clusters (without the need to know the number of clusters in
  advance).
- Use a (generalised) distance matrix as the clustering criterion.

Agglomerative vs. Divisive

Two sequential clustering strategies for constructing a tree of clusters:

- Agglomerative: a bottom-up strategy. Initially each data object is in
  its own (atomic) cluster; these atomic clusters are then merged into
  larger and larger clusters.
- Divisive: a top-down strategy. Initially all objects are in one single
  cluster; the cluster is then subdivided into smaller and smaller
  clusters.

Introduction
Illustrative Example

Agglomerative and divisive clustering on the data set {a, b, c, d, e}:

[Figure: agglomerative clustering runs from Step 0 to Step 4, merging
a and b into ab, d and e into de, then c and de into cde, and finally
ab and cde into abcde. Divisive clustering runs the same tree in
reverse, from Step 4 back to Step 0, splitting abcde until each object
is its own cluster. The cluster distance measure governs which clusters
are merged, and a termination condition decides when to stop.]

Cluster Distance Measures

Single link (min): the smallest distance between an element in one
cluster and an element in the other, i.e.,

  d(Ci, Cj) = min{d(xip, xjq)}

Complete link (max): the largest distance between an element in one
cluster and an element in the other, i.e.,

  d(Ci, Cj) = max{d(xip, xjq)}

Average: the average distance between elements in one cluster and
elements in the other, i.e.,

  d(Ci, Cj) = avg{d(xip, xjq)}

In all three cases the distance of a cluster to itself is zero:
d(C, C) = 0.
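As a concrete illustration, here is a minimal Python sketch of the three
measures for clusters of objects described by a single continuous
feature (the function names are illustrative, not from the slides):

```python
from itertools import product

def pairwise_distances(c1, c2):
    """All |x - y| distances between elements of two 1-D clusters."""
    return [abs(x - y) for x, y in product(c1, c2)]

def single_link(c1, c2):
    """Smallest element-to-element distance between the clusters."""
    return min(pairwise_distances(c1, c2))

def complete_link(c1, c2):
    """Largest element-to-element distance between the clusters."""
    return max(pairwise_distances(c1, c2))

def average_link(c1, c2):
    """Average element-to-element distance between the clusters."""
    d = pairwise_distances(c1, c2)
    return sum(d) / len(d)
```

The worked example on the next slide can be checked against this sketch.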

Cluster Distance Measures

Example: Given a data set of five objects characterised by a single
continuous feature, assume that there are two clusters: C1: {a, b} and
C2: {c, d, e}.

[Figure: the five objects a, b, c, d, e placed along the feature axis.]

1. Calculate the distance matrix.
2. Calculate the three cluster distances between C1 and C2.

Distance matrix:

      a   b   c   d   e
  a   0   1   3   4   5
  b   1   0   2   3   4
  c   3   2   0   1   2
  d   4   3   1   0   1
  e   5   4   2   1   0

Single link:
  dist(C1, C2) = min{d(a,c), d(a,d), d(a,e), d(b,c), d(b,d), d(b,e)}
               = min{3, 4, 5, 2, 3, 4} = 2

Complete link:
  dist(C1, C2) = max{d(a,c), d(a,d), d(a,e), d(b,c), d(b,d), d(b,e)}
               = max{3, 4, 5, 2, 3, 4} = 5

Average:
  dist(C1, C2) = [d(a,c) + d(a,d) + d(a,e) + d(b,c) + d(b,d) + d(b,e)] / 6
               = (3 + 4 + 5 + 2 + 3 + 4) / 6 = 21/6 = 3.5
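The sketch above reproduces these results, assuming feature values
a = 1, b = 2, c = 4, d = 5, e = 6 (hypothetical positions, chosen to be
consistent with the distance matrix):

```python
C1, C2 = [1, 2], [4, 5, 6]    # assumed values for {a, b} and {c, d, e}
print(single_link(C1, C2))    # 2
print(complete_link(C1, C2))  # 5
print(average_link(C1, C2))   # 3.5
```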

Agglomerative Algorithm

The agglomerative algorithm is carried out in three steps:

1) Convert all object features into a distance matrix.
2) Set each object as a cluster (thus if we have N objects, we will
   have N clusters at the beginning).
3) Repeat until the number of clusters is one (or a known number of
   clusters): merge the two closest clusters, then update the distance
   matrix.
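As a rough illustration of these steps, here is a minimal, naive Python
sketch (single link, one-dimensional data; names and structure are
illustrative, not the course's reference implementation):

```python
def agglomerative(points, k=1):
    """Naive single-link agglomerative clustering of 1-D points.

    Repeatedly merges the two closest clusters until k clusters remain,
    returning the final clusters and the merge history.
    """
    clusters = [[p] for p in points]      # step 2: one cluster per object
    history = []
    while len(clusters) > k:              # step 3: repeat until k clusters
        # Find the pair of clusters with the smallest single-link distance.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(x - y) for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]          # merge the closest pair
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)   # distances are recomputed on the fly here,
                                  # standing in for the distance-matrix update
    return clusters, history
```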

Example

Problem: clustering analysis with the agglomerative algorithm.

[Figure: a data matrix of six objects (A-F) and the corresponding
distance matrix computed with the Euclidean distance.]
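A sketch of step 1 in Python, assuming hypothetical 2-D feature vectors
for A-F (not the slide's actual data):

```python
import numpy as np

# Hypothetical feature matrix for objects A-F (two features each).
X = np.array([[0.40, 0.53], [0.22, 0.38], [0.35, 0.32],
              [0.26, 0.19], [0.08, 0.41], [0.45, 0.30]])

# Pairwise Euclidean distance matrix via broadcasting.
D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
print(np.round(D, 2))
```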

Example
Merge two closest clusters (iteration 1)


Example
Update distance matrix (iteration 1)


Example
Merge two closest clusters (iteration 2)


Example
Update distance matrix (iteration 2)


Example
Merge two closest clusters/update distance matrix
(iteration 3)


Example
Merge two closest clusters/update distance matrix
(iteration 4)


Example
Final result (meeting termination condition)


Example

Dendrogram tree representation

[Figure: dendrogram of the six objects, with the objects along the
horizontal axis and the lifetime (merge distance), from 0 up to about 5,
on the vertical axis.]

1. In the beginning we have 6 clusters: A, B, C, D, E and F.
2. We merge clusters D and F into cluster (D, F) at distance 0.50.
3. We merge clusters A and B into cluster (A, B) at distance 0.71.
4. We merge clusters E and (D, F) into ((D, F), E) at distance 1.00.
5. We merge clusters ((D, F), E) and C into (((D, F), E), C) at
   distance 1.41.
6. We merge clusters (((D, F), E), C) and (A, B) into
   ((((D, F), E), C), (A, B)) at distance 2.50.
7. The last cluster contains all the objects, which concludes the
   computation.
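For comparison, SciPy's hierarchical clustering utilities produce this
kind of dendrogram directly; a minimal sketch, reusing the hypothetical
A-F coordinates from the earlier sketch:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# The same hypothetical coordinates for A-F as in the earlier sketch.
X = np.array([[0.40, 0.53], [0.22, 0.38], [0.35, 0.32],
              [0.26, 0.19], [0.08, 0.41], [0.45, 0.30]])

Z = linkage(X, method='single')        # single-link agglomerative merges
dendrogram(Z, labels=list('ABCDEF'))   # tree of the five merge steps
plt.ylabel('lifetime (merge distance)')
plt.show()
```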

Example
Dendrogram tree representation: clustering USA states

[Figure: a dendrogram clustering the USA states.]

Exercise

Given a data set of five objects characterised by a single continuous
feature:

[Figure: the five objects a, b, c, d, e placed along the feature axis.]

Apply the agglomerative algorithm with single-link, complete-link and
average cluster distance measures to produce three dendrogram trees,
respectively.

Demo

Agglomerative Demo


Relevant Issues

How to determine the number of clusters:

- If the number of clusters is known, the termination condition is
  given.
- The K-cluster lifetime is the range of threshold values on the
  dendrogram tree that leads to the identification of exactly K
  clusters.
- Heuristic rule: cut the dendrogram tree at the K with the maximum
  K-cluster lifetime (see the sketch below).
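As a sketch of this heuristic applied to the earlier dendrogram (merge
distances 0.50, 0.71, 1.00, 1.41 and 2.50), assuming the K-cluster
lifetime can be read off as the gap between consecutive merge distances:

```python
# Merge distances from the six-object dendrogram example (five merges).
merges = [0.50, 0.71, 1.00, 1.41, 2.50]

# Cutting the tree between merge i-1 and merge i leaves (len(merges)+1-i)
# clusters, so the K-cluster lifetime is the gap between those merges.
lifetimes = {}
for i in range(1, len(merges)):
    k = len(merges) + 1 - i                   # clusters in that range
    lifetimes[k] = round(merges[i] - merges[i - 1], 2)

print(lifetimes)                          # {5: 0.21, 4: 0.29, 3: 0.41, 2: 1.09}
print(max(lifetimes, key=lifetimes.get))  # 2: widest gap, from 1.41 to 2.50
```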

Summary

The hierarchical algorithm is a sequential clustering algorithm:

- It uses a distance matrix to construct a tree of clusters
  (dendrogram).
- It gives a hierarchical representation without the need to know the
  number of clusters in advance (a termination condition can be set
  when the number of clusters is known).

Major weaknesses of agglomerative clustering methods:

- They can never undo what was done previously.
- They are sensitive to the cluster distance measure and to
  noise/outliers.
- They are less efficient: O(n^2 log n), where n is the total number
  of objects.

There are several variants that overcome these weaknesses:

- BIRCH: scalable to large data sets
- ROCK: clustering categorical data
- CHAMELEON: hierarchical clustering using dynamic modelling

Online tutorial: the hierarchical clustering functions in Matlab
https://www.youtube.com/watch?v=aYzjenNNOcc
