Clustering
When you first encounter an unsupervised learning problem, it can be confusing: you aren't looking for a specific prediction but rather for structure in the data. The process of finding groups of similar entities within a dataset is known as clustering, or cluster analysis, and it is one of the most popular techniques in data science.
Entities within a group are more similar to each other than to entities in other groups. In this article, I will take you through the types of clustering, different
clustering algorithms, and a comparison between two of the most commonly used
methods of clustering in machine learning.
Clustering is the task of dividing unlabeled data points into clusters such that similar data points fall in the same cluster while dissimilar points fall in different clusters. In simple words, the aim of the clustering process is to segregate groups with similar traits and assign them into clusters.
Let’s understand this with an example. Suppose you are the head of a rental store and wish
to understand the preferences of your customers to scale up your business. Is it possible
for you to look at the details of each customer and devise a unique business strategy for
each one of them? Definitely not. But what you can do is cluster all of your customers into, say, 10 groups based on their purchasing habits and use a separate strategy for customers in each of these 10 groups. And this is what we call clustering.
Now that we understand what clustering is, let's take a look at its different types.
1. Hard Clustering: Each data point either fully belongs to a cluster or does not. For instance, in the example above, every customer is assigned to exactly one of the ten groups.
2. Soft Clustering: Rather than assigning each data point to a single cluster, the algorithm assigns a probability, or likelihood, of the point belonging to each cluster. In the same scenario, each customer receives a probability of belonging to each of the ten retail store clusters.
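The difference between the two types can be illustrated in a few lines. The sketch below is a minimal example, assuming scikit-learn is available (the article does not name a library): k-means produces hard labels, while a Gaussian mixture model produces per-cluster probabilities. The customer spending values are made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Toy 1-D "spending" values for six hypothetical customers (illustrative only)
X = np.array([[1.0], [1.2], [0.8], [8.0], [8.3], [7.9]])

# Hard clustering: each customer gets exactly one cluster label
hard_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Soft clustering: each customer gets a probability per cluster
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
soft_probs = gmm.predict_proba(X)  # shape (6, 2); each row sums to 1
```

Note that the soft probabilities can be turned into hard labels by taking the most likely cluster per row, but the reverse is not possible: hard labels discard the uncertainty information.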
Since the task of clustering is subjective, there are many means of achieving it. Every methodology follows a different set of rules for defining 'similarity' among data points. In fact, more than 100 clustering algorithms are known, but only a few are widely used. Let's look at them in detail:
1. Connectivity Models
As the name suggests, these models are based on the notion that the data points closer in
data space exhibit more similarity to each other than the data points lying farther away.
These models can follow two approaches. In the first (agglomerative) approach, every data point starts in its own cluster, and the closest clusters are merged as the allowed distance increases. In the second (divisive) approach, all data points start in a single cluster, which is then split into smaller clusters. The choice of distance function is also subjective. These
models are very easy to interpret but lack scalability for handling big datasets. Examples of
these models are the hierarchical clustering algorithms and their variants.
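The agglomerative approach can be sketched in a few lines. This is a minimal example assuming SciPy is available (an assumption; the article does not name a library): `linkage` builds the merge tree bottom-up, and `fcluster` cuts it into a chosen number of flat clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Two well-separated pairs of points (toy data for illustration)
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])

# Agglomerative approach: start with each point as its own cluster,
# then repeatedly merge the closest clusters
Z = linkage(X, method="average")  # the linkage/distance choice is subjective

# Cut the merge tree to obtain two flat clusters
labels = fcluster(Z, t=2, criterion="maxclust")
```

Because the full merge tree is built, hierarchical clustering lets you inspect groupings at every distance level, but that same tree construction is what makes it expensive on large datasets.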
2. Centroid Models
These clustering algorithms derive similarity from the proximity of a data point to the centroid, or cluster center. The popular k-means algorithm falls into this category. These models require the number of clusters to be specified beforehand, which demands some prior knowledge of the dataset. They run iteratively and converge to local optima.
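A minimal k-means sketch, again assuming scikit-learn (the data is invented for illustration). Note that the number of clusters, k, is passed in up front, and `n_init` reruns the algorithm from several random starts to reduce the risk of a poor local optimum.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two visually obvious groups of points (toy data)
X = np.array([[1.0, 1.0], [1.5, 1.0], [8.0, 8.0],
              [8.0, 8.5], [1.0, 1.2], [7.5, 8.0]])

# k must be chosen beforehand; n_init restarts mitigate local optima
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

labels = km.labels_              # hard cluster assignment per point
centroids = km.cluster_centers_  # one center per cluster, shape (2, 2)
```

Each iteration alternates between assigning points to their nearest centroid and recomputing each centroid as the mean of its assigned points, which is why the result is a local rather than global optimum.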
3. Distribution Models
These clustering models are based on the notion of how probable it is that all data points in a cluster belong to the same distribution (for example, a Gaussian, i.e. normal, distribution). These models often suffer from overfitting. A popular example is the expectation-maximization (EM) algorithm, which fits a mixture of multivariate normal distributions.
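A minimal sketch of the EM-based approach, assuming scikit-learn's `GaussianMixture` (which runs EM internally); the two-blob dataset is generated for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Sample 50 points from each of two well-separated Gaussians (toy data)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(5.0, 0.5, (50, 2))])

# EM alternates between estimating point-to-component responsibilities
# (E-step) and re-fitting each Gaussian's mean/covariance (M-step)
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)
```

Because each cluster is a full probability distribution, the model can also score how likely a new point is under the fitted mixture, which is one reason it is prone to overfitting when the number of components is too high.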
4. Density Models
These models search the data space for regions where data points are densely packed. They isolate the different dense regions and assign the points within each region to the same cluster. Popular examples of density models are DBSCAN and OPTICS. These models are particularly useful for identifying clusters of arbitrary shape and detecting outliers, as they can separate points located in sparse regions of the data space from points that belong to dense regions.
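The outlier behaviour can be seen in a short DBSCAN sketch, assuming scikit-learn; the two dense blobs and the single isolated point are invented for illustration. DBSCAN labels points in sparse regions as noise (label -1) rather than forcing them into a cluster.

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],   # dense region A
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],   # dense region B
              [10.0, 10.0]])                         # isolated point

# eps and min_samples define what counts as a "dense" neighbourhood
db = DBSCAN(eps=0.5, min_samples=2).fit(X)
labels = db.labels_  # noise points receive the label -1
```

Note that, unlike k-means, DBSCAN does not need the number of clusters up front; it is implied by the density parameters.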
Applications of Clustering
Clustering has a large number of applications spread across various domains.
Some of the most popular applications of clustering are recommendation engines, market
segmentation, social network analysis, search result grouping, medical imaging, image
segmentation, and anomaly detection.
Key Takeaways
1. Clustering helps to identify patterns in data and is useful for exploratory data analysis, customer segmentation, anomaly detection, pattern recognition, and image segmentation.
2. It is a powerful tool for understanding data and can help to reveal insights that may not be apparent through other methods of analysis.
3. The choice of clustering algorithm and the number of clusters to use depend on the nature of the data and the specific problem at hand.