Hierarchical Clustering Algorithm

The document discusses hierarchical clustering algorithms, which group data into a tree of clusters, starting with each data point as an individual cluster. It outlines two main types: Agglomerative Clustering, a bottom-up approach that merges clusters based on similarity, and Divisive Clustering, a top-down method that separates clusters. It also highlights the advantages and disadvantages of hierarchical clustering, including its ability to handle non-convex clusters and its high computational cost.

HIERARCHICAL CLUSTERING ALGORITHM
ROSHINI SELVAKUMAR
2021503041
INTRODUCTION
A hierarchical clustering method works by grouping data into a tree of clusters.
Hierarchical clustering begins by treating every data point as a separate cluster.
Then, it repeatedly executes the following steps:
1. Identify the two clusters that are closest together, and
2. Merge the two most similar clusters.
These steps are repeated until all the clusters are merged together.
The result of hierarchical clustering is a tree-like structure, called a dendrogram,
which illustrates the hierarchical relationships among the clusters.
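In practice, the merge loop and the dendrogram are rarely coded by hand; a library such as SciPy provides both. The snippet below is a minimal sketch, assuming SciPy is installed; the five one-dimensional sample points are invented for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Five illustrative 1-D data points; each row is one point.
X = np.array([[0.0], [0.5], [4.0], [4.4], [9.0]])

# linkage() performs the merge loop described above; each of the
# n - 1 rows of Z records one merge: (cluster i, cluster j, distance, size).
Z = linkage(X, method="single")

# Cut the tree so that at most two clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

`Z` is exactly the structure that `scipy.cluster.hierarchy.dendrogram(Z)` draws as the tree described above.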
TYPES OF HIERARCHICAL CLUSTERING
There are two main types of hierarchical clustering:
 Agglomerative Clustering
 Divisive Clustering

Agglomerative Clustering

Initially, consider every data point as an individual cluster; at every step, merge
the nearest pair of clusters (it is a bottom-up method).
At first, every data point is considered an individual entity or cluster.
At every iteration, clusters merge with other clusters until a single cluster is
formed.
AGGLOMERATIVE CLUSTERING
The algorithm for agglomerative hierarchical clustering is:
1. Calculate the similarity of each cluster with all the other clusters (the proximity matrix):
 Calculate the distance between each pair of data points using a distance function, such as
Euclidean distance.
 Fill the matrix with these distances. The proximity matrix is a square matrix of
dimensions n × n, where n is the number of data points.
2. Consider every data point as an individual cluster.
3. Merge the clusters that are most similar, i.e. closest to each other.
4. Recalculate the proximity matrix for the merged clusters.
5. Repeat steps 3 and 4 until only a single cluster remains.
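The numbered steps above can be sketched in plain Python. This is a minimal single-linkage sketch, not an optimized implementation: it rescans all pairwise distances on every merge, which is a source of the high computational cost noted later; the sample points are invented for illustration.

```python
from itertools import combinations

def euclidean(p, q):
    """Distance function used to build the proximity matrix."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def cluster_distance(points, c1, c2):
    """Single linkage: distance between the closest members of two clusters."""
    return min(euclidean(points[a], points[b]) for a in c1 for b in c2)

def agglomerative(points, k=1):
    # Step 2: every data point starts as its own cluster.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Steps 1 and 3: scan the proximity matrix for the closest pair.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: cluster_distance(points, clusters[ij[0]], clusters[ij[1]]))
        # Steps 4-5: merge the pair, then repeat until k clusters remain.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters

pts = [(0.0,), (0.5,), (4.0,), (4.4,), (9.0,)]
print(agglomerative(pts, k=2))  # → [[0, 1, 2, 3], [4]]
```

Running the loop to `k=1` produces the full merge history that a dendrogram would display.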
AGGLOMERATIVE CLUSTERING -
EXAMPLE
• Step-1: Consider each letter as a single cluster and calculate the distance of each cluster from all the
other clusters.

• Step-2: Comparable clusters are merged to form a single cluster. Say cluster (B) and cluster (C) are
very similar to each other, so we merge them; similarly for clusters (D) and (E). We are left with the
clusters [(A), (BC), (DE), (F)].

• Step-3: We recalculate the proximity matrix according to the algorithm and merge the two nearest
clusters, (DE) and (F), to form the new clusters [(A), (BC), (DEF)].

• Step-4: Repeating the same process, the clusters (DEF) and (BC) are comparable and are merged,
leaving the clusters [(A), (BCDEF)].

• Step-5: At last, the two remaining clusters are merged into a single cluster [(ABCDEF)].
DIVISIVE CLUSTERING
Divisive hierarchical clustering is precisely the opposite
of agglomerative hierarchical clustering.
It is a top-down method.
In divisive hierarchical clustering, we start with all of
the data points in a single cluster, and in every iteration we
separate out the data points that are not comparable with the
rest of their cluster.
In the end, we are left with N clusters.
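The top-down idea can be sketched the same way. The splitting rule below (bisect the widest cluster using its two farthest members as seeds) is one possible choice made for illustration; real divisive methods such as DIANA use more refined criteria, and the sample points are invented.

```python
from itertools import combinations

def dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def divisive(points, k):
    """Top-down: start from one all-points cluster, split until k remain (k <= len(points))."""
    clusters = [list(range(len(points)))]
    while len(clusters) < k:
        # Split the cluster whose members are least comparable (largest diameter).
        widest = max((c for c in clusters if len(c) > 1),
                     key=lambda c: max(dist(points[a], points[b])
                                       for a, b in combinations(c, 2)))
        # Use its two farthest members as seeds; assign every point to the nearer seed.
        s1, s2 = max(combinations(widest, 2),
                     key=lambda ab: dist(points[ab[0]], points[ab[1]]))
        left = [i for i in widest if dist(points[i], points[s1]) <= dist(points[i], points[s2])]
        right = [i for i in widest if i not in left]
        clusters.remove(widest)
        clusters.extend([left, right])
    return clusters

pts = [(0.0,), (0.5,), (4.0,), (4.4,), (9.0,)]
print(divisive(pts, k=2))  # → [[0, 1, 2, 3], [4]]
```

Note that each split is greedy and never revisited, which is why the choice of splitting criterion matters so much in practice.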
ADVANTAGES AND DISADVANTAGES

ADVANTAGES:
• The ability to handle non-convex clusters and clusters of different sizes and densities.
• The ability to handle missing data and noisy data.
• The ability to reveal the hierarchical structure of the data, which can be useful for understanding the relationships among the clusters.

DISADVANTAGES:
• The need for a criterion to stop the clustering process and determine the final number of clusters.
• The computational cost and memory requirements of the method can be high, especially for large datasets.
• The results can be sensitive to the initial conditions, linkage criterion, and distance metric used.
