Clustering: Prof. Ankur Sinha

This document discusses clustering, an unsupervised machine learning technique that groups unlabeled data points into clusters based on similarity. It lists applications of clustering in marketing, urban planning, and other areas, and introduces measures for comparing data points, such as Euclidean distance. A worked example clusters 10 customers described by age and service usage. The document then gives an overview of hierarchical and k-means clustering, explaining k-means as iteratively assigning points to centroids and updating the centroids.

Uploaded by

Vibhuti Batra

Clustering

Prof. Ankur Sinha

Indian Institute of Management Ahmedabad
Gujarat, India
Clustering
• Grouping a set of data objects into different groups based on similarity
• An example of unsupervised learning
• Each data object (for example, a customer, location, or product) can be represented as a vector of its attributes
Examples
• Used in a variety of areas
– Marketing
– Urban planning
– Customer segmentation
– Product segmentation
– Seismology
Similarity Measure
• If two objects i and j are represented by vectors x_i and x_j
– How do you measure similarity between the two objects? Common distance measures (a smaller distance means more similar):
• Euclidean distance
• Manhattan distance
• Mahalanobis distance
– The similarity measure can be chosen based on the application
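The three distance measures named above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the slides: the function names, the sample vectors, and the choice to pass the inverse covariance matrix in directly (rather than estimate it from data) are all assumptions made for the example.

```python
import math

def euclidean(x, y):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """City-block distance: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def mahalanobis(x, y, inv_cov):
    """Distance weighted by an inverse covariance matrix (given as a list of rows)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    # Quadratic form d^T * inv_cov * d
    q = sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(q)

# Hypothetical example vectors (not taken from the slides)
x_i, x_j = (3, 4), (6, 8)
identity = [[1, 0], [0, 1]]

print(euclidean(x_i, x_j))              # 5.0
print(manhattan(x_i, x_j))              # 7
print(mahalanobis(x_i, x_j, identity))  # 5.0
```

With the identity matrix as the inverse covariance, the Mahalanobis distance reduces to the Euclidean distance. A non-identity matrix rescales the attributes and accounts for correlation between them, which is one reason the choice of measure depends on the application.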
Similarity Measure
• Consider 10 customers with two attributes
– Attribute 1: Recent usage of services
– Attribute 2: Customer age
• Objective: Cluster the data into two classes and design two marketing campaigns for the two customer segments
[Figure: scatter plot of the 10 customers. x-axis: Usage of Service (×10 minutes), 0 to 10; y-axis: Customer Age (×10 years), 0 to 10]
Similarity Measure
• Consider 10 customers with two attributes
– Attribute 1: Usage of services
– Attribute 2: Customer age

[Figure: the same scatter plot (both axes 0 to 10) with the 10 customers grouped into two clusters; points as labelled in the figure]
Cluster 1: (3,4), (2,6), (4,5), (4,7), (3,8)
Cluster 2: (6,2), (7,2), (7,4), (8,4), (8,5)
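As a quick check on the two clusters labelled in the figure, the sketch below computes each cluster's centroid (the coordinate-wise mean) and confirms that every point lies closer to its own centroid than to the other one. Reading each labelled pair as (usage, age) in the axis units is an assumption, as are the helper names; Euclidean distance is used here as one possible similarity measure.

```python
import math

# Points as labelled in the figure; interpreting each pair as
# (usage, age) in the axis units is an assumption.
cluster_1 = [(3, 4), (2, 6), (4, 5), (4, 7), (3, 8)]
cluster_2 = [(6, 2), (7, 2), (7, 4), (8, 4), (8, 5)]

def centroid(points):
    """Coordinate-wise mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def nearer_to_own(points, own, other):
    """True if every point is closer (Euclidean) to `own` than to `other`."""
    return all(math.dist(p, own) < math.dist(p, other) for p in points)

c1, c2 = centroid(cluster_1), centroid(cluster_2)
print(c1, c2)                               # (3.2, 6.0) (7.2, 3.4)
print(nearer_to_own(cluster_1, c1, c2))     # True
print(nearer_to_own(cluster_2, c2, c1))     # True
```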
Clustering approaches
• Hierarchical clustering
– Agglomerative
– Divisive
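[Figure: hierarchical clustering of five objects a, b, c, d, e. Agglomerative (AGNES) runs from Step 0 to Step 4, merging the single objects into the groups ab, de, cde and finally abcde. Divisive (DIANA) runs the same sequence in reverse, from Step 0 to Step 4, splitting abcde back into single objects.]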
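The agglomerative (bottom-up) direction can be sketched as follows. The slides do not say how the distance between two clusters is measured, so single linkage (distance between the closest pair of points) with Euclidean distance is an assumption here, as are the function names. The data are the ten customer points from the earlier example.

```python
import math
from itertools import combinations

def single_link(a, b):
    """Single-linkage distance: the closest pair of points across two clusters."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerative(points, n_clusters):
    """Start with one cluster per point and repeatedly merge the two
    closest clusters until only n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters (i < j) with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters

customers = [(3, 4), (2, 6), (4, 5), (4, 7), (3, 8),
             (6, 2), (7, 2), (7, 4), (8, 4), (8, 5)]

for cluster in agglomerative(customers, 2):
    print(sorted(cluster))
```

Stopping at two clusters corresponds to cutting the merge hierarchy one step before everything is joined into a single group. A divisive method would reach a partition from the opposite direction, starting from one all-inclusive cluster and splitting it.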
Clustering approaches
• K-means Clustering
– Select initial centroids randomly
– Assign each object to its nearest centroid according to the similarity measure
– Recompute each centroid as the mean of the objects assigned to it
– Repeat the assignment and update steps until the assignments no longer change
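The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the lecturer's implementation: Euclidean distance is assumed as the similarity measure, initial centroids are drawn from the data points, and the `init`, `seed`, and `max_iter` parameters are additions for reproducibility and safety.

```python
import math
import random

def kmeans(points, k, init=None, seed=0, max_iter=100):
    """Basic k-means. Returns (centroids, assignment), where assignment[i]
    is the index of the centroid that points[i] belongs to."""
    rng = random.Random(seed)
    # Step 1: initial centroids, random data points unless `init` is given.
    centroids = list(init) if init is not None else rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 4: stop when the assignments no longer change.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment

customers = [(3, 4), (2, 6), (4, 5), (4, 7), (3, 8),
             (6, 2), (7, 2), (7, 4), (8, 4), (8, 5)]

# Fixed starting centroids so the run is reproducible; omit `init`
# to start from randomly chosen data points instead.
centroids, labels = kmeans(customers, k=2, init=[(3, 4), (6, 2)])
print(centroids)  # [(3.2, 6.0), (7.2, 3.4)]
print(labels)     # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

With random initialization, k-means can settle on different partitions for different starting points, so in practice it is often run several times and the best result kept.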
K-Means Clustering
[Figure: six panels, in order: (1) start with centroids randomly placed; (2) assign points to the centroids; (3) update centroids; (4) assign points to the new centroids; (5) update centroids; (6) assign points to the new centroids]
• Continue until there is no change in the structure of the clusters
