Before diving into agglomerative algorithms, we need the basic concepts behind clustering techniques. So, first, let us look at the concept of clustering in machine learning:
Clustering is a broad set of techniques for finding subgroups, or clusters, in a dataset based on the characteristics of its objects, such that objects within a group are similar to each other but different from the objects in other groups. The guiding principle of clustering is that data inside a cluster should be very similar to each other but very different from data outside the cluster. The main families of clustering techniques are Partitioning Methods, Hierarchical Methods and Density-Based Methods.
| Method | Characteristics |
| --- | --- |
| Partitioning Method | Uses a mean/medoid to represent the cluster centre; adopts a distance-based approach to refine clusters; finds mutually exclusive clusters of spherical or nearly spherical shape; effective for datasets of small to medium size |
| Hierarchical Method | Creates a tree-like structure through decomposition; uses the distance between the nearest/farthest points in neighbouring clusters for refinement; errors cannot be corrected at subsequent levels |
| Density-Based Method | Useful for identifying arbitrarily shaped clusters; may filter out outliers |
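To make the contrast concrete, here is a small illustrative sketch of one representative of each family in scikit-learn. The dataset and all parameter values are arbitrary choices for demonstration, not prescribed by this article:

```python
# Illustrative sketch: one representative of each clustering family in scikit-learn.
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=42)

part_labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)        # partitioning
hier_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)  # hierarchical
dens_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)         # density-based (-1 = outlier)
```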
Partitioning methods include two main techniques: k-means and k-medoids (the Partitioning Around Medoids algorithm). To get to the agglomerative methods, however, we first have to discuss the hierarchical methods.
- Hierarchical Methods: Data is grouped into a tree-like structure. There are two main clustering algorithms in this family:
- A. Divisive Clustering: It uses a top-down strategy. The starting point is one large cluster containing all objects, which is then split recursively into smaller and smaller clusters. It terminates when a user-defined condition is achieved or when each final cluster contains only one object.
- B. Agglomerative Clustering: It uses a bottom-up approach. It starts with each object forming its own cluster and then iteratively merges the most similar clusters into larger ones (a minimal code sketch of this loop follows the list). It terminates either
- when a clustering condition imposed by the user is achieved, or
- when all clusters merge into a single cluster
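Here is a minimal sketch of that bottom-up loop in plain Python. The function and parameter names are our own, not from any library, and a real implementation would also record merge heights for the dendrogram:

```python
import math

def cluster_distance(c1, c2, linkage="single"):
    # All pairwise distances between members of the two clusters
    dists = [math.dist(p, q) for p in c1 for q in c2]
    if linkage == "single":
        return min(dists)               # closest members
    if linkage == "complete":
        return max(dists)               # farthest members
    return sum(dists) / len(dists)      # average linkage: mean over all pairs

def agglomerative(points, linkage="single", n_clusters=1):
    clusters = [[p] for p in points]    # start: each object is its own cluster
    while len(clusters) > n_clusters:   # stop at the user-defined cluster count
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: cluster_distance(
            clusters[ij[0]], clusters[ij[1]], linkage))
        clusters[i] += clusters.pop(j)  # merge the two closest clusters
    return clusters
```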
A dendrogram, which is a tree-like structure, is used to represent hierarchical clustering. Leaf nodes represent the individual objects, internal nodes represent clusters, and the root represents the single cluster containing every object. A representation of a dendrogram is shown in this figure:
Dendrogram for Clustering
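In practice, a dendrogram like this can be produced with SciPy's hierarchical-clustering utilities. A minimal sketch follows; the sample points here are invented purely to illustrate the plot:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage

# Invented 2-D points purely to illustrate the plot
X = np.array([[0.1, 0.2], [0.12, 0.25], [0.8, 0.9], [0.85, 0.88], [0.5, 0.1]])

Z = linkage(X, method='single')   # agglomerative merge history
dendrogram(Z, labels=[f"P{i}" for i in range(1, len(X) + 1)])
plt.ylabel("Merge distance")
plt.show()
```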
Now we will look into the variants of Agglomerative methods:
1. Agglomerative Algorithm: Single Link
Single-nearest distance, or single linkage, is the agglomerative method that uses the distance between the closest members of the two clusters. We will now solve a problem to understand it better:
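In symbols, the single-link distance between two clusters A and B is the smallest distance over all cross-cluster pairs:
d_{single}(A, B) = \min_{a \in A,\, b \in B} d(a, b)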
Question. Find the clusters using a single link technique. Use Euclidean distance and draw the dendrogram.
| Sample No. | X | Y |
| --- | --- | --- |
| P1 | 0.40 | 0.53 |
| P2 | 0.22 | 0.38 |
| P3 | 0.35 | 0.32 |
| P4 | 0.26 | 0.19 |
| P5 | 0.08 | 0.41 |
| P6 | 0.45 | 0.30 |
Solution:
Step 1: Compute the distance matrix using d[(x,y),(a,b)] = \sqrt{(x-a)^2 + (y-b)^2}
That is, we find the Euclidean distance between every pair of points. Say we first find the distance between P1 and P2:
d(P1,P2) = \sqrt{(0.4-0.22)^2 + (0.53-0.38)^2} = \sqrt{(0.18)^2 + (0.15)^2} = \sqrt{0.0324+0.0225} = 0.2343 \approx 0.23
So far, the distance matrix looks like this:
\begin{pmatrix} & P1 & P2 & P3 & P4 & P5 & P6 \\ P1 & 0 \\ P2 & 0.23 & 0 \\ P3 & & & 0 \\ P4 & & & & 0 \\ P5 & & & & & 0 \\ P6 & & & & & & 0 \\ \end{pmatrix}
Similarly, we find the Euclidean distance for every pair of points. Note that the distance matrix is symmetric about its diagonal: the distance above the diagonal is the same as the one below it, e.g. d(P2,P5) = d(P5,P2). So we only need to fill in the lower triangle of the matrix.
d(P1,P3) = \sqrt{(0.4-0.35)^2 + (0.53-0.32)^2} = \sqrt{(0.05)^2 + (0.21)^2} = \sqrt{0.0025+0.0441} = 0.2159 \approx 0.22
d(P1,P4) = \sqrt{(0.4-0.26)^2 + (0.53-0.19)^2} = \sqrt{(0.14)^2 + (0.34)^2} = \sqrt{0.0196+0.1156} = 0.3677 \approx 0.37
d(P1,P5) = \sqrt{(0.4-0.08)^2 + (0.53-0.41)^2} = \sqrt{(0.32)^2 + (0.12)^2} = \sqrt{0.1024+0.0144} = 0.3418 \approx 0.34
d(P1,P6) = \sqrt{(0.4-0.45)^2 + (0.53-0.30)^2} = \sqrt{(-0.05)^2 + (0.23)^2} = \sqrt{0.0025+0.0529} = 0.2354 \approx 0.24
d(P2,P3) = \sqrt{(0.22-0.35)^2 + (0.38-0.32)^2} = \sqrt{(-0.13)^2 + (0.06)^2} = \sqrt{0.0169+0.0036} = 0.1432 \approx 0.14
d(P2,P4) = \sqrt{(0.22-0.26)^2 + (0.38-0.19)^2} = \sqrt{(-0.04)^2 + (0.19)^2} = \sqrt{0.0016+0.0361} = 0.1942 \approx 0.19
d(P2,P5) = \sqrt{(0.22-0.08)^2 + (0.38-0.41)^2} = \sqrt{(0.14)^2 + (-0.03)^2} = \sqrt{0.0196+0.0009} = 0.1432 \approx 0.14
d(P2,P6) = \sqrt{(0.22-0.45)^2 + (0.38-0.30)^2} = \sqrt{(-0.23)^2 + (0.08)^2} = \sqrt{0.0529+0.0064} = 0.2435 \approx 0.24
d(P3,P4) = \sqrt{(0.35-0.26)^2 + (0.32-0.19)^2} = \sqrt{(0.09)^2 + (0.13)^2} = \sqrt{0.0081+0.0169} = 0.1581 \approx 0.16
d(P3,P5) = \sqrt{(0.35-0.08)^2 + (0.32-0.41)^2} = \sqrt{(0.27)^2 + (-0.09)^2} = \sqrt{0.0729+0.0081} = 0.2846 \approx 0.28
d(P3,P6) = \sqrt{(0.35-0.45)^2 + (0.32-0.30)^2} = \sqrt{(-0.1)^2 + (0.02)^2} = \sqrt{0.01+0.0004} = 0.1020 \approx 0.10
d(P4,P5) = \sqrt{(0.26-0.08)^2 + (0.19-0.41)^2} = \sqrt{(0.18)^2 + (-0.22)^2} = \sqrt{0.0324+0.0484} = 0.2843 \approx 0.28
d(P4,P6) = \sqrt{(0.26-0.45)^2 + (0.19-0.30)^2} = \sqrt{(-0.19)^2 + (-0.11)^2} = \sqrt{0.0361+0.0121} = 0.2195 \approx 0.22
d(P5,P6) = \sqrt{(0.08-0.45)^2 + (0.41-0.30)^2} = \sqrt{(-0.37)^2 + (0.11)^2} = \sqrt{0.1369+0.0121} = 0.3860 \approx 0.39
Therefore, the completed distance matrix will be:
\begin{pmatrix} & P1 & P2 & P3 & P4 & P5 & P6 \\ P1 & 0 \\ P2 & 0.23 & 0 \\ P3 & 0.22 & 0.14 & 0 \\ P4 & 0.37 & 0.19 & 0.16 & 0 \\ P5 & 0.34 & 0.14 & 0.28 & 0.28 & 0 \\ P6 & 0.24 & 0.24 & 0.10 & 0.22 & 0.39 & 0 \\ \end{pmatrix}
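These entries are easy to check programmatically; here is a short NumPy sketch using the coordinates from the table above:

```python
import numpy as np

# P1..P6 from the table
pts = np.array([[0.40, 0.53], [0.22, 0.38], [0.35, 0.32],
                [0.26, 0.19], [0.08, 0.41], [0.45, 0.30]])

# All pairwise Euclidean distances via broadcasting
diff = pts[:, None, :] - pts[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))
print(np.round(D, 2))   # should reproduce the matrix above (e.g. D[2, 3] ≈ 0.16)
```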
Step 2: Find the minimum element in the distance matrix and merge the two closest clusters. Here the minimum value is 0.10, so we combine P3 and P6 (0.10 appears in row P6, column P3). We form a cluster of the corresponding elements and update the distance matrix. Under single linkage, the distance from the new cluster to every other point is the minimum of the two old distances:
d((P3,P6), P1) = min(d(P3,P1), d(P6,P1)) = min(0.22, 0.24) = 0.22
d((P3,P6), P2) = min(d(P3,P2), d(P6,P2)) = min(0.14, 0.24) = 0.14
d((P3,P6), P4) = min(d(P3,P4), d(P6,P4)) = min(0.16, 0.22) = 0.16
d((P3,P6), P5) = min(d(P3,P5), d(P6,P5)) = min(0.28, 0.39) = 0.28
Now we will update the Distance Matrix:
\begin{pmatrix} & P1 & P2 & P3,P6 & P4 & P5 \\ P1 & 0 \\ P2 & 0.23 & 0 \\ P3,P6 & 0.22 & 0.14 & 0 \\ P4 & 0.37 & 0.19 & 0.16 & 0 \\ P5 & 0.34 & 0.14 & 0.28 & 0.28 & 0 \end{pmatrix}
Now we repeat the same process: find the minimum element in the distance matrix and merge the two closest clusters. The minimum value is now 0.14, which occurs both for d((P3,P6), P2) and for d(P2,P5); breaking the tie arbitrarily, we combine P2 and P5. We form the cluster (P2,P5) and update the distance matrix:
d((P2,P5), P1) = min(d(P2,P1), d(P5,P1)) = min(0.23, 0.34) = 0.23
d((P2,P5), (P3,P6)) = min(d(P2,(P3,P6)), d(P5,(P3,P6))) = min(0.14, 0.28) = 0.14
d((P2,P5), P4) = min(d(P2,P4), d(P5,P4)) = min(0.19, 0.28) = 0.19
Now we will update the Distance Matrix:
\begin{pmatrix} & P1 & P2,P5 & P3,P6 & P4 \\ P1 & 0 \\ P2,P5 & 0.23 & 0 \\ P3,P6 & 0.22 & 0.14 & 0 \\ P4 & 0.37 & 0.19 & 0.16 & 0 \end{pmatrix}
Again repeating the same process: the minimum value is 0.14, so we combine (P2,P5) and (P3,P6). To update the distance matrix:
d((P2,P5,P3,P6), P1) = min(d((P2,P5),P1), d((P3,P6),P1)) = min(0.23, 0.22) = 0.22
d((P2,P5,P3,P6), P4) = min(d((P2,P5),P4), d((P3,P6),P4)) = min(0.19, 0.16) = 0.16
The updated distance matrix will be:
\begin{pmatrix} & P1 & P2,P5,P3,P6 & P4 \\ P1 & 0 \\ P2,P5,P3,P6 & 0.22 & 0 \\ P4 & 0.37 & 0.16 & 0 \\ \end{pmatrix}
Again repeating the same process: the minimum value is 0.16, so we combine (P2,P5,P3,P6) and P4. To update the distance matrix:
d((P2,P5,P3,P6,P4), P1) = min(d((P2,P5,P3,P6),P1), d(P4,P1)) = min(0.22, 0.37) = 0.22
The updated distance matrix will be:
\begin{pmatrix} & P1 & P2,P5,P3,P6,P4 \\ P1 & 0 \\ P2,P5,P3,P6,P4 & 0.22 & 0 \end{pmatrix}
So we have finally reached the solution: P1 joins at a distance of 0.22 and all six points form a single cluster. The dendrogram for this question therefore shows merges at heights 0.10 (P3 with P6), 0.14 (P2 with P5), 0.14 ((P2,P5) with (P3,P6)), 0.16 (P4 joins) and 0.22 (P1 joins).
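The whole merge sequence can be reproduced with SciPy; a sketch, where each row of Z records one merge as [cluster_i, cluster_j, merge_distance, new_cluster_size]:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

pts = np.array([[0.40, 0.53], [0.22, 0.38], [0.35, 0.32],
                [0.26, 0.19], [0.08, 0.41], [0.45, 0.30]])

Z = linkage(pts, method='single')
print(np.round(Z, 2))   # merge heights ≈ 0.10, 0.14, 0.14, 0.16, 0.22, as derived above
```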

2. Agglomerative Algorithm: Complete Link
In this algorithm, complete-farthest distance, or complete linkage, is the agglomerative method that uses the distance between the members of two clusters that are farthest apart.
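In symbols, the complete-link distance between two clusters A and B is the largest distance over all cross-cluster pairs:
d_{complete}(A, B) = \max_{a \in A,\, b \in B} d(a, b)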
Question. For the given set of points, identify the clusters using complete link agglomerative clustering.
| Sample No. | X | Y |
| --- | --- | --- |
| P1 | 1 | 1 |
| P2 | 1.5 | 1.5 |
| P3 | 5 | 5 |
| P4 | 3 | 4 |
| P5 | 4 | 4 |
| P6 | 3 | 3.5 |
Solution.
Step 1: Compute the distance matrix using d[(x,y),(a,b)] = \sqrt{(x-a)^2 + (y-b)^2}. We find the Euclidean distance between every pair of points, exactly as in the previous question (the formula for calculating the distance is the same as above).
The resulting distance matrix is:
\begin{pmatrix} & P1 & P2 & P3 & P4 & P5 & P6 \\ P1 & 0 \\ P2 & 0.71 & 0 \\ P3 & 5.66 & 4.95 & 0 \\ P4 & 3.6 & 2.92 & 2.24 & 0 \\ P5 & 4.24 & 3.53 & 1.41 & 1.0 & 0 \\ P6 & 3.20 & 2.5 & 2.5 & 0.5 & 1.12 & 0 \\ \end{pmatrix}
Step 2: Find the minimum element in the distance matrix and merge the two closest clusters. The minimum value is 0.5, so we combine P4 and P6. Under complete linkage, the distance from the new cluster to every other point is the maximum of the two old distances:
d((P4,P6), P1) = max(d(P4,P1), d(P6,P1)) = max(3.6, 3.2) = 3.6
d((P4,P6), P2) = max(d(P4,P2), d(P6,P2)) = max(2.92, 2.5) = 2.92
d((P4,P6), P3) = max(d(P4,P3), d(P6,P3)) = max(2.24, 2.5) = 2.5
d((P4,P6), P5) = max(d(P4,P5), d(P6,P5)) = max(1.0, 1.12) = 1.12
Updated distance matrix is:
\begin{pmatrix} & P1 & P2 & P3 & P4,P6 & P5 \\ P1 & 0 \\ P2 & 0.71 & 0 \\ P3 & 5.66 & 4.95 & 0 \\ P4,P6 & 3.6 & 2.92 & 2.5 & 0 \\ P5 & 4.24 & 3.53 & 1.41 & 1.12 & 0 \\ \end{pmatrix}
Again, find the minimum element in the distance matrix and merge the two closest clusters. The minimum value is 0.71, so we combine P1 and P2. To update the distance matrix:
d((P1,P2), P3) = max(d(P1,P3), d(P2,P3)) = max(5.66, 4.95) = 5.66
d((P1,P2), (P4,P6)) = max(d(P1,(P4,P6)), d(P2,(P4,P6))) = max(3.6, 2.92) = 3.6
d((P1,P2), P5) = max(d(P1,P5), d(P2,P5)) = max(4.24, 3.53) = 4.24
Updated distance matrix is:
\begin{pmatrix} & P1,P2 & P3 & P4,P6 & P5 \\ P1,P2 & 0 \\ P3 & 5.66 & 0 \\ P4,P6 & 3.6 & 2.5 & 0 \\ P5 & 4.24 & 1.41 & 1.12 & 0 \\ \end{pmatrix}
Again, find the minimum element in the distance matrix and merge the two closest clusters. The minimum value is 1.12, so we combine (P4,P6) and P5. To update the distance matrix:
d((P4,P6,P5), (P1,P2)) = max(d((P4,P6),(P1,P2)), d(P5,(P1,P2))) = max(3.6, 4.24) = 4.24
d((P4,P6,P5), P3) = max(d((P4,P6),P3), d(P5,P3)) = max(2.5, 1.41) = 2.5
Updated distance matrix is:
\begin{pmatrix} & P1,P2 & P3 & P4,P6,P5 \\ P1,P2 & 0 \\ P3 & 5.66 & 0 \\ P4,P6,P5 & 4.24 & 2.5 & 0 \\ \end{pmatrix}
Again, find the minimum element in the distance matrix and merge the two closest clusters. The minimum value is 2.5, so we combine (P4,P6,P5) and P3. To update the distance matrix:
d((P4,P6,P5,P3), (P1,P2)) = max(d((P4,P6,P5),(P1,P2)), d(P3,(P1,P2))) = max(4.24, 5.66) = 5.66
Updated distance matrix is:
\begin{pmatrix} & P1,P2 & P4,P6,P5,P3 \\ P1,P2 & 0 \\ P4,P6,P5,P3 & 5.66 & 0\\ \end{pmatrix}
So we have finally reached the solution: (P1,P2) joins the rest at a distance of 5.66, leaving a single cluster. The dendrogram for this question therefore shows merges at heights 0.5 (P4 with P6), 0.71 (P1 with P2), 1.12 (P5 joins), 2.5 (P3 joins) and 5.66 ((P1,P2) joins).
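As before, the sequence can be checked with SciPy; a sketch:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

pts = np.array([[1, 1], [1.5, 1.5], [5, 5], [3, 4], [4, 4], [3, 3.5]])

Z = linkage(pts, method='complete')
print(np.round(Z, 2))   # merge heights ≈ 0.5, 0.71, 1.12, 2.5, 5.66, matching the steps above
```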

3. Agglomerative Algorithm: Average Link
Average distance, or average linkage, is the agglomerative method that involves looking at the distances between all pairs of points in the two clusters and averaging these distances. This is also called the Unweighted Pair Group Method with Arithmetic Mean (UPGMA).
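In symbols, the UPGMA distance between two clusters A and B averages over all cross-cluster pairs:
d_{average}(A, B) = \frac{1}{|A|\,|B|} \sum_{a \in A} \sum_{b \in B} d(a, b)
One caveat: the worked solution below follows the common shortcut of averaging the two stored cluster distances at each merge. That stepwise variant is known as weighted average linkage (WPGMA); it coincides with UPGMA while the merging clusters are singletons, but the two can differ once clusters grow.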
Question. For the points given in the previous question, identify the clusters using average link agglomerative clustering.
Solution:
We first need the distance matrix; since we have picked the same set of points, it is the same as above:
\begin{pmatrix} & P1 & P2 & P3 & P4 & P5 & P6 \\ P1 & 0 \\ P2 & 0.71 & 0 \\ P3 & 5.66 & 4.95 & 0 \\ P4 & 3.6 & 2.92 & 2.24 & 0 \\ P5 & 4.24 & 3.53 & 1.41 & 1.0 & 0 \\ P6 & 3.20 & 2.5 & 2.5 & 0.5 & 1.12 & 0 \\ \end{pmatrix}
Find the minimum element in the distance matrix and merge the two closest clusters. The minimum value is 0.5, so we combine P4 and P6. Under this (weighted) average linkage, the distance from the new cluster to every other point is the average of the two old distances:
d((P4,P6), P1) = average(d(P4,P1), d(P6,P1)) = average(3.6, 3.20) = 3.4
d((P4,P6), P2) = average(d(P4,P2), d(P6,P2)) = average(2.92, 2.5) = 2.71
d((P4,P6), P3) = average(d(P4,P3), d(P6,P3)) = average(2.24, 2.5) = 2.37
d((P4,P6), P5) = average(d(P4,P5), d(P6,P5)) = average(1.0, 1.12) = 1.06
Updated distance matrix is:
\begin{pmatrix} & P1 & P2 & P3 & P4,P6 & P5 \\ P1 & 0 \\ P2 & 0.71 & 0 \\ P3 & 5.66 & 4.95 & 0 \\ P4,P6 & 3.4 & 2.71 & 2.37 & 0 \\ P5 & 4.24 & 3.53 & 1.41 & 1.06 & 0 \\ \end{pmatrix}
Again find the minimum element in the distance matrix and merge the two closest clusters. The minimum value is 0.71, so we combine P1 and P2. To update the distance matrix:
d((P1,P2), P3) = average(d(P1,P3), d(P2,P3)) = average(5.66, 4.95) = 5.31
d((P1,P2), (P4,P6)) = average(d(P1,(P4,P6)), d(P2,(P4,P6))) = average(3.4, 2.71) = 3.06
d((P1,P2), P5) = average(d(P1,P5), d(P2,P5)) = average(4.24, 3.53) = 3.89
Updated distance matrix is:
\begin{pmatrix} & P1,P2 & P3 & P4,P6 & P5 \\ P1,P2 & 0 \\ P3 & 5.31 & 0 \\ P4,P6 & 3.06 & 2.37 & 0 \\ P5 & 3.89 & 1.41 & 1.06 & 0 \\ \end{pmatrix}
Again find the minimum element in the distance matrix and merge the two closest clusters. The minimum value is 1.06, so we combine (P4,P6) and P5. To update the distance matrix:
d((P4,P6,P5), (P1,P2)) = average(d((P4,P6),(P1,P2)), d(P5,(P1,P2))) = average(3.06, 3.89) = 3.48
d((P4,P6,P5), P3) = average(d((P4,P6),P3), d(P5,P3)) = average(2.37, 1.41) = 1.89
Updated distance matrix is:
\begin{pmatrix} & P1,P2 & P3 & P4,P6,P5 \\ P1,P2 & 0 \\ P3 & 5.31 & 0 \\ P4,P6,P5 & 3.48 & 1.89 & 0 \\ \end{pmatrix}
Again find the minimum element in the distance matrix and merge the two closest clusters. The minimum value is 1.89, so we combine (P4,P6,P5) and P3. To update the distance matrix:
d((P4,P6,P5,P3), (P1,P2)) = average(d((P4,P6,P5),(P1,P2)), d(P3,(P1,P2))) = average(3.48, 5.31) = 4.40
Updated distance matrix is:
\begin{pmatrix} & P1,P2 & P4,P6,P5,P3 \\ P1,P2 & 0 \\ P4,P6,P5,P3 & 4.40 & 0 \\ \end{pmatrix}
So we have reached the solution: (P1,P2) joins the rest at a distance of 4.40, leaving a single cluster. The final dendrogram shows merges at heights 0.5 (P4 with P6), 0.71 (P1 with P2), 1.06 (P5 joins), 1.89 (P3 joins) and 4.40 ((P1,P2) joins).
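Because this solution averages the two previous cluster distances at each step, the matching SciPy method is 'weighted' (WPGMA) rather than 'average' (UPGMA); a sketch:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

pts = np.array([[1, 1], [1.5, 1.5], [5, 5], [3, 4], [4, 4], [3, 3.5]])

Z = linkage(pts, method='weighted')   # stepwise average (WPGMA), as in the worked solution
print(np.round(Z, 2))                 # merge heights ≈ 0.5, 0.71, 1.06, 1.89, 4.39
                                      # (4.39 vs the worked 4.40 is just intermediate rounding)
```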

Hence, we have studied all three variants of the agglomerative algorithm: single link, complete link and average link.