Hierarchical Clustering

The document discusses hierarchical clustering, an unsupervised machine learning technique that groups similar data points into clusters through an iterative process of merging clusters. It begins by treating each data point as an individual cluster, then repeatedly merges the closest pairs of clusters until only one cluster remains, represented through a tree diagram called a dendrogram. The document covers the advantages and disadvantages of the approach.

Uploaded by

H.D.Praveena

4/22/24, 10:45 PM Hierarchical Clustering in Data Mining - GeeksforGeeks

Hierarchical Clustering in Data Mining
Last Updated : 12 Dec, 2023
A hierarchical clustering method works by grouping data into a tree of clusters. Hierarchical clustering begins by treating every data point as a separate cluster. Then, it repeatedly executes the following steps:

1. Identify the two clusters that are closest together, and
2. Merge the two most similar clusters.

These steps continue until all the clusters are merged together.

In hierarchical clustering, the aim is to produce a hierarchical series of nested clusters. A diagram called a dendrogram (a tree-like diagram that records the sequence of merges or splits) graphically represents this hierarchy. It is an inverted tree that describes the order in which data points are merged (bottom-up view) or clusters are broken up (top-down view).

What is Hierarchical Clustering?
Hierarchical clustering is a method of cluster analysis in data mining that creates a hierarchical representation of the clusters in a dataset. The method starts by treating each data point as a separate cluster and then iteratively combines the closest clusters until a stopping criterion is reached. The result of hierarchical clustering is a tree-like structure, called a dendrogram, which illustrates the hierarchical relationships among the clusters.

Hierarchical clustering has several advantages over other clustering methods

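The ability to handle non-convex clusters and clusters of different sizes and densities, given a suitable linkage criterion (single linkage, for example, can follow elongated shapes).
Some tolerance of missing and noisy data, provided the distance measure is chosen to cope with them.
The ability to reveal the hierarchical structure of the data, which can be useful for understanding the relationships among the clusters.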

Drawbacks of Hierarchical Clustering



The need for a criterion to stop the clustering process and determine the final number of clusters.
The computational cost and memory requirements of the method can be high, especially for large datasets.
The results can be sensitive to the linkage criterion and distance metric used, and a merge made early on cannot be undone later.
In summary, hierarchical clustering is a method of data mining that groups similar data points into clusters by creating a hierarchical structure of the clusters. This method can handle different types of data and reveal the relationships among the clusters. However, it can have a high computational cost, and its results can be sensitive to the choice of linkage criterion and distance metric.
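
The sensitivity to the linkage criterion can be seen on a very small example. The sketch below is illustrative only: the four 1-D values are made up, and the helper names `cluster_distance` and `first_merges` are hypothetical, not part of any library. It compares single linkage (distance between the closest members) with complete linkage (distance between the farthest members).

```python
from itertools import combinations

def cluster_distance(a, b, linkage):
    """Distance between two clusters of 1-D points under a linkage rule."""
    pairwise = [abs(x - y) for x in a for y in b]
    # Single linkage uses the closest pair, complete linkage the farthest pair.
    return min(pairwise) if linkage == "single" else max(pairwise)

def first_merges(points, linkage, steps):
    """Run `steps` agglomerative merges and return the clusters formed."""
    clusters = [(p,) for p in points]
    merges = []
    for _ in range(steps):
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], linkage),
        )
        clusters.remove(a)
        clusters.remove(b)
        merged = tuple(sorted(a + b))
        clusters.append(merged)
        merges.append(merged)
    return merges

points = [0, 2, 5, 9]  # made-up values for illustration
print(first_merges(points, "single", 2))    # [(0, 2), (0, 2, 5)]
print(first_merges(points, "complete", 2))  # [(0, 2), (5, 9)]
```

Both runs merge 0 and 2 first, but the second merge differs: single linkage attaches 5 to the existing cluster, while complete linkage pairs 5 with 9. The same data therefore yields two different hierarchies.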

Types of Hierarchical Clustering

There are two types of hierarchical clustering:

1. Agglomerative Clustering
2. Divisive Clustering

1. Agglomerative Clustering

Agglomerative clustering is a bottom-up method. At first, every data point is considered an individual cluster. At every iteration, the nearest pair of clusters is merged, until one cluster is formed.
The algorithm for Agglomerative Hierarchical Clustering is:
1. Consider every data point as an individual cluster.
2. Calculate the similarity of each cluster with all the other clusters (calculate the proximity matrix).
3. Merge the clusters that are most similar or closest to each other.
4. Recalculate the proximity matrix for the new set of clusters.
5. Repeat Steps 3 and 4 until only a single cluster remains.
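
The steps above can be sketched in plain Python. This is a minimal, unoptimized illustration rather than a reference implementation: it assumes Euclidean distance and single linkage (the article does not fix either choice), and the function name `agglomerative` and the sample points are made up for the example.

```python
import math

def agglomerative(points):
    """Naive single-linkage agglomerative clustering.

    Returns the merge history as (members_a, members_b, distance) tuples,
    where members are indices into `points`.
    """
    # Step 1: every data point starts as its own cluster.
    clusters = {i: [i] for i in range(len(points))}
    # Step 2: proximity matrix between all pairs of clusters.
    proximity = {
        (i, j): math.dist(points[i], points[j])
        for i in clusters for j in clusters if i < j
    }
    history = []
    next_id = len(points)
    # Step 5: repeat steps 3 and 4 until a single cluster remains.
    while len(clusters) > 1:
        # Step 3: merge the two closest clusters.
        (a, b), d = min(proximity.items(), key=lambda item: item[1])
        history.append((clusters[a], clusters[b], d))
        merged = clusters.pop(a) + clusters.pop(b)
        # Step 4: recalculate the proximity of the new cluster to the others.
        new_row = {}
        for c in clusters:
            d_ac = proximity[(min(a, c), max(a, c))]
            d_bc = proximity[(min(b, c), max(b, c))]
            new_row[(c, next_id)] = min(d_ac, d_bc)  # single linkage (assumed)
        proximity = {k: v for k, v in proximity.items()
                     if a not in k and b not in k}
        proximity.update(new_row)
        clusters[next_id] = merged
        next_id += 1
    return history

# Made-up 2-D points for illustration.
sample = [(0, 0), (0, 1), (4, 0), (4, 2), (9, 0)]
for left, right, dist in agglomerative(sample):
    print(left, "+", right, "at distance", dist)
```

Each pass scans the whole proximity table, which is the source of the high computational cost on large datasets.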

Let’s see the graphical representation of this algorithm using a dendrogram.

Note: This is just a demonstration of how the algorithm works. No calculation has been performed below; all the proximities among the clusters are assumed.

Let’s say we have six data points A, B, C, D, E, and F.


Agglomerative Hierarchical clustering

Step-1: Consider each letter as a single cluster and calculate the distance of each cluster from all the other clusters.
Step-2: Similar clusters are merged together to form a single cluster. Let’s say cluster (B) and cluster (C) are very similar to each other, so we merge them in the second step, and similarly clusters (D) and (E). We now have the clusters [(A), (BC), (DE), (F)].
Step-3: We recalculate the proximity according to the algorithm and merge the two nearest clusters, (DE) and (F), to form the new clusters [(A), (BC), (DEF)].
Step-4: Repeating the same process, the clusters (DEF) and (BC) are the closest and are merged together to form a new cluster. We’re now left with the clusters [(A), (BCDEF)].
Step-5: At last, the two remaining clusters are merged together to form a single cluster [(ABCDEF)].
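
The walkthrough can be replayed in code. Since the article assumes the proximities rather than computing them, the 1-D positions below are hypothetical values chosen only so that the merge order matches the steps above; single linkage is also an assumption. Note that the sketch merges one pair per iteration, so Step-2 of the walkthrough corresponds to two consecutive states here.

```python
from itertools import combinations

# Hypothetical positions, picked so the merge order matches the walkthrough.
positions = {"A": 0, "B": 50, "C": 55, "D": 80, "E": 86, "F": 100}

def single_link(a, b):
    """Smallest distance between any letter of cluster a and any of cluster b."""
    return min(abs(positions[x] - positions[y]) for x in a for y in b)

def walkthrough():
    """Return the list of clusters after every merge, starting from singletons."""
    clusters = sorted(positions)  # each cluster is a string of letters
    states = [list(clusters)]
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: single_link(*p))
        clusters = sorted([c for c in clusters if c not in (a, b)] + [a + b])
        states.append(list(clusters))
    return states

for state in walkthrough():
    print(state)
```

With these positions the states pass through ['A', 'BC', 'DE', 'F'], ['A', 'BC', 'DEF'] and ['A', 'BCDEF'] before ending at ['ABCDEF'].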

2. Divisive Hierarchical clustering

Divisive Hierarchical clustering is precisely the opposite of Agglomerative Hierarchical clustering. In Divisive Hierarchical clustering, we start with all of the data points in a single cluster, and in every iteration we split off the data points that are not similar to the rest of their cluster. In the end, we are left with N clusters, one per data point.
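
The article does not say how a cluster should be split, so the following is only one possible sketch of the top-down idea. It assumes distinct points and Euclidean distance, always splits the cluster with the largest diameter, and divides it around its two farthest-apart points. The names `diameter`, `split` and `divisive` are hypothetical, and this is not a standard named algorithm.

```python
import math
from itertools import combinations

def diameter(cluster):
    """Largest pairwise distance inside a cluster (0 for a single point)."""
    return max((math.dist(p, q) for p, q in combinations(cluster, 2)), default=0.0)

def split(cluster):
    """Split a cluster in two around its two farthest-apart points."""
    seed_a, seed_b = max(combinations(cluster, 2), key=lambda pq: math.dist(*pq))
    left = [p for p in cluster if math.dist(p, seed_a) <= math.dist(p, seed_b)]
    right = [p for p in cluster if math.dist(p, seed_a) > math.dist(p, seed_b)]
    return left, right

def divisive(points):
    """Return every clustering, from one big cluster down to N singletons."""
    clusters = [list(points)]
    levels = [[list(c) for c in clusters]]
    while any(len(c) > 1 for c in clusters):
        widest = max(clusters, key=diameter)  # the least cohesive cluster
        clusters.remove(widest)
        clusters.extend(split(widest))
        levels.append([list(c) for c in clusters])
    return levels

# Made-up points for illustration (must be distinct).
for level in divisive([(0, 0), (2, 0), (8, 0), (9, 0), (12, 0)]):
    print(level)
```

The first level holds all five points in one cluster and the last level holds five single-point clusters, mirroring the agglomerative process in reverse.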


Divisive Hierarchical clustering

Also Check:

Hierarchical Clustering in Machine Learning
Agglomerative Methods in Machine Learning
Difference between K means and Hierarchical Clustering
Implementing Agglomerative Clustering using Sklearn

Frequently Asked Questions (FAQs)

1. What are two types of Hierarchical Clustering?

There are two types of hierarchical clustering: agglomerative and divisive clustering. Agglomerative clustering treats the data points as separate clusters and merges them based on similarity until a single cluster is formed. Divisive clustering treats the data points as a single cluster and then recursively divides it into smaller clusters based on dissimilarity until individual data points become separate clusters.

2. What is the difference between K means and Hierarchical Clustering?

The key difference between k means and hierarchical clustering is that k means divides the data into a predetermined number of clusters (k), whereas hierarchical clustering forms a hierarchy of clusters without requiring the number of clusters beforehand.
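
To illustrate the hierarchical side of this comparison, the sketch below builds the hierarchy once and then reads off a partition for any number of clusters, with no k supplied in advance. It assumes single linkage on made-up 1-D values, and `hierarchy_levels` is a hypothetical helper name; k means itself is not implemented here.

```python
from itertools import combinations

def hierarchy_levels(values):
    """Single-linkage agglomeration on 1-D values.

    Returns {k: partition with k clusters} for every k from N down to 1.
    """
    clusters = [(v,) for v in sorted(values)]
    levels = {len(clusters): list(clusters)}
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(abs(x - y) for x in pair[0] for y in pair[1]),
        )
        clusters = sorted([c for c in clusters if c not in (a, b)] + [a + b])
        levels[len(clusters)] = list(clusters)
    return levels

levels = hierarchy_levels([1, 2, 6, 8, 15])  # made-up values
print(levels[3])  # [(1, 2), (6, 8), (15,)]
print(levels[2])  # [(1, 2, 6, 8), (15,)]
```

Choosing a different number of clusters is just a different lookup in `levels`, whereas k means would have to be run again for each value of k.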
pcp21599

Article Tags : data mining , DBMS


https://www.geeksforgeeks.org/hierarchical-clustering-in-data-mining/