DWM Exp8 127 133 137

The experiment implements agglomerative hierarchical clustering on a dataset. It calculates the Euclidean distance matrix between data points to quantify their dissimilarity based on attributes like age, income, and spending score. This distance matrix is then used as input for the hierarchical clustering algorithm. A dendrogram plot is produced to visualize the hierarchy of clusters formed at different distance thresholds. The experiment demonstrates how hierarchical clustering can reveal natural groupings within unlabeled data.


Experiment 8

Aim: Implementation of any one Hierarchical Clustering method

Theory: Hierarchical clustering is an unsupervised machine learning algorithm used to group unlabeled data points into clusters; it is also known as hierarchical cluster analysis (HCA).
In this algorithm, the hierarchy of clusters is built in the form of a tree, and this tree-shaped structure is known as a dendrogram. The results of K-Means clustering and hierarchical clustering can sometimes look similar, but the two methods work differently: hierarchical clustering does not require the number of clusters to be specified in advance, as the K-Means algorithm does. The hierarchical clustering technique has two approaches:

1. Agglomerative: a bottom-up approach in which the algorithm starts by treating every data point as its own cluster and repeatedly merges the closest pair of clusters until only one cluster remains.
2. Divisive: the reverse of the agglomerative approach; it works top-down, starting with all data points in a single cluster and recursively splitting it into smaller clusters.

Agglomerative Hierarchical Clustering

The agglomerative hierarchical clustering algorithm is the most common example of HCA. It follows the bottom-up approach: each data point is treated as its own cluster at the start, and the closest pair of clusters is merged at every step. This continues until all points have been merged into a single cluster that contains the entire dataset.

This hierarchy of clusters is represented in the form of a dendrogram.
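As an illustration of this merge process, the short sketch below (not part of the original experiment; the toy points are assumed purely for demonstration) uses SciPy's linkage routine, whose output table records each bottom-up merge step.

# Minimal sketch of the agglomeration process using SciPy (assumed toy data)
import numpy as np
from scipy.cluster.hierarchy import linkage

points = np.array([[1.0, 1.0],
                   [1.2, 1.1],
                   [5.0, 5.0],
                   [5.1, 5.2],
                   [9.0, 1.0]])

# Each row of Z describes one merge: [cluster_i, cluster_j, distance, new_size].
# Indices 0-4 are the original points; indices 5, 6, ... are clusters created
# by earlier merges, so reading Z top to bottom gives the merge order.
Z = linkage(points, method='single')
print(Z)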

Measuring the distance between two clusters

As we have seen, the distance between two clusters is crucial for hierarchical clustering. There are various ways to calculate this distance, and the chosen rule determines how clusters are merged at each step. These measures are called linkage methods. Some of the popular linkage methods are given below, followed by a short code sketch comparing them:

Single Linkage: the shortest distance between the closest points of the two clusters.

Complete Linkage: the distance between the two farthest points of two different clusters. It is a popular linkage method because it forms tighter clusters than single linkage.

Average Linkage: the distances between every pair of points, one from each cluster, are summed and divided by the number of pairs to give the average distance between the two clusters. It is also one of the most popular linkage methods.

Centroid Linkage: the distance between the centroids (mean points) of the two clusters.
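The sketch below (assumed toy data, not taken from the experiment) shows how each of these linkage methods can be selected in SciPy and how the choice changes the height at which the final merge happens.

import numpy as np
from scipy.cluster.hierarchy import linkage

points = np.random.RandomState(0).rand(8, 2)  # eight random 2-D points

for method in ['single', 'complete', 'average', 'centroid']:
    Z = linkage(points, method=method)
    # The last row of the linkage matrix is the final merge joining the two
    # remaining clusters; its third column is the distance of that merge.
    print(f"{method:>9} linkage: final merge distance = {Z[-1, 2]:.3f}")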

CODE:

import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Load the customer dataset and preview the first five rows
dataset = pd.read_csv('exp8.csv')
dataset.head()

OUTPUT:

import pandas as pd
from scipy.spatial.distance import pdist, squareform

# Sample of five customers with the three attributes used for clustering
data = pd.DataFrame({
    'Age': [19, 21, 20, 23, 31],
    'Annual Income(k$)': [15, 15, 16, 16, 17],
    'Spending Score(1-100)': [39, 81, 6, 77, 40]
})

# Pairwise Euclidean distances in condensed form, then as a square symmetric matrix
distance_matrix = pdist(data, metric='euclidean')
distance_matrix_square = squareform(distance_matrix)
print(distance_matrix_square)

OUTPUT:
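The condensed distance matrix computed above can be fed directly into the hierarchical clustering routine. The lines below are a hedged follow-on sketch: the average linkage and the cut-off value t=50 are assumptions chosen only for illustration, not values from the experiment.

from scipy.cluster.hierarchy import linkage, fcluster

# 'distance_matrix' is the condensed output of pdist from the previous step
Z = linkage(distance_matrix, method='average')

# Cut the hierarchy at an assumed distance threshold to obtain flat cluster
# labels, one label per customer in the sample DataFrame
labels = fcluster(Z, t=50, criterion='distance')
print(labels)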

import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd
import scipy.cluster.hierarchy as shc

dataset = pd.read_csv('exp8.csv')

# Select two feature columns by position (annual income and spending score)
x = dataset.iloc[:, [3, 4]].values

# Build the agglomerative hierarchy with Ward linkage and plot the dendrogram
dendro = shc.dendrogram(shc.linkage(x, method="ward"))
mtp.title("Dendrogram Plot")
mtp.ylabel("Euclidean Distances")
mtp.xlabel("Customers")
mtp.show()

OUTPUT:
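To turn the dendrogram above into concrete cluster labels without fixing the number of clusters in advance, scikit-learn's AgglomerativeClustering can cut the Ward hierarchy at a distance threshold. The sketch below is an assumption-laden illustration: the threshold value of 200 is arbitrary, and it reuses the feature matrix x from the code above.

from sklearn.cluster import AgglomerativeClustering

# Cut the Ward hierarchy at an assumed distance threshold instead of fixing
# the number of clusters up front (threshold chosen only for illustration)
hc = AgglomerativeClustering(n_clusters=None, distance_threshold=200,
                             linkage='ward')
y_pred = hc.fit_predict(x)              # x = income/spending-score matrix above
print("Clusters found:", hc.n_clusters_)
print(y_pred)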
Conclusion: In this experiment, we calculated the Euclidean distance matrix for a subset
of data points from the given dataset. This distance matrix quantifies the dissimilarity
between data points based on their 'Age', 'Annual Income(k$)', and 'Spending Score(1-100)'
attributes, and it provides the foundation for agglomerative hierarchical clustering. The
dendrogram produced with Ward linkage visualizes how the points merge into clusters at
increasing distance thresholds, revealing natural groupings within the data. The distance
matrix is a crucial input for clustering algorithms and allows us to identify similarities
and differences among data points for further analysis and decision-making.
