ML-Unit III - K-Means Clustering

Machine Learning

Dr. Sunil Saumya

IIIT Dharwad
K-means clustering Algo.
K-means clustering: Intro
● K-means clustering is an unsupervised iterative clustering technique.
● It partitions the given data set into k predefined distinct clusters.
● A cluster is defined as a collection of data points exhibiting certain
similarities.
K-means clustering: Intro
● It partitions the data set such that each data point belongs to the cluster with
the nearest mean.
● Data points belonging to the same cluster have a high degree of similarity.
● Data points belonging to different clusters have a high degree of dissimilarity.
K-means clustering: Algorithm
The K-means clustering algorithm involves the following steps:
● Step-01: Choose the number of clusters K.
● Step-02: Randomly select K data points as the initial cluster centers. Ideally,
select centers that are as far apart from each other as possible.
● Step-03: Calculate the distance between each data point and each cluster
center. The distance may be calculated using either a given distance
function or the Euclidean distance formula.
● Step-04: Assign each data point to the cluster whose center is nearest to
that data point.
K-means clustering: Algorithm Contd..
The K-means clustering algorithm involves the following steps (continued):
● Step-05: Re-compute the centers of the newly formed clusters. The center of a
cluster is computed by taking the mean of all the data points contained in that
cluster.
● Step-06: Repeat Step-03 to Step-05 until any of the following stopping
criteria is met:
○ The centers of the newly formed clusters do not change
○ Data points remain in the same clusters
○ The maximum number of iterations is reached
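The six steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`k_means`, `euclidean`, `mean_point`), the `seed` parameter and the rule of keeping the old center when a cluster becomes empty are assumptions made for this sketch, and the initial centers are chosen purely at random rather than "as far apart as possible".

```python
import random

def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, k, distance=euclidean, max_iter=100, seed=0):
    # Step-02: pick K data points as initial centers (plain random choice;
    # the "far apart" heuristic is NOT implemented in this sketch).
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):  # stopping criterion: maximum number of iterations
        # Step-03 and Step-04: assign each point to its nearest center.
        new_labels = [min(range(k), key=lambda j: distance(p, centers[j]))
                      for p in points]
        # Stopping criterion: every point stayed in the same cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step-05: recompute each center as the mean of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = mean_point(members)
    return centers, labels

centers, labels = k_means([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
print(centers, labels)
```

A different distance function can be passed through the `distance` argument, which is how the "given distance function" case in Step-03 would be handled.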
K-means clustering: Exercise
Cluster the following eight points (with (x, y) representing locations) into three
clusters:
A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9)
Take the initial cluster centers to be A1(2, 10), A4(5, 8) and A7(1, 2).
The distance function between two points a = (x1, y1) and b = (x2, y2) is defined
as the Manhattan distance:
ρ(a, b) = |x2 - x1| + |y2 - y1|
Use the K-means algorithm to find the three cluster centers after the second
iteration.
K-means clustering: Exercise
The given points can be plotted as:
[Figure: scatter plot of the eight points A1 to A8; the initial cluster centers are A1(2, 10), A4(5, 8) and A7(1, 2).]
K-means clustering: Exercise
Solution: Iteration 1

A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9)

Initial cluster centers are: C1(2, 10), C2(5, 8) and C3(1, 2).

We calculate the distance of each point from each of the three cluster centers.

Distance between A1(2, 10) and C1(2, 10): ρ(A1, C1) = |x2 - x1| + |y2 - y1| = |2 - 2| + |10 - 10| = 0

Distance between A1(2, 10) and C2(5, 8): ρ(A1, C2) = |x2 - x1| + |y2 - y1| = |5 - 2| + |8 - 10| = 3 + 2 = 5

Distance between A1(2, 10) and C3(1, 2): ρ(A1, C3) = |x2 - x1| + |y2 - y1| = |1 - 2| + |2 - 10| = 1 + 8 = 9

The distance between A1(2, 10) and C1(2, 10) is the smallest, so A1 is assigned to cluster C1.

In a similar manner, we calculate the distance of the other points from each of the three cluster centers.
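The remaining distances can be checked with a short script. The sketch below uses the points, centers and distance function from the exercise; the names `manhattan`, `assign` and `table` are chosen only for this illustration.

```python
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}
centers = {"C1": (2, 10), "C2": (5, 8), "C3": (1, 2)}

def manhattan(a, b):
    """Distance from the exercise: |x2 - x1| + |y2 - y1|."""
    return abs(b[0] - a[0]) + abs(b[1] - a[1])

def assign(points, centers):
    """Map each point name to (distances to every center, nearest center)."""
    table = {}
    for name, p in points.items():
        dists = {c: manhattan(p, q) for c, q in centers.items()}
        nearest = min(dists, key=dists.get)
        table[name] = (dists, nearest)
    return table

table = assign(points, centers)
for name, (dists, nearest) in table.items():
    print(name, dists, "->", nearest)
```

For A1 this prints the distances 0, 5 and 9 computed by hand above, and the nearest center C1.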
K-means clustering: Exercise
Solution: Iteration 1

New clusters formed are:

Cluster-01: A1(2, 10)

Cluster-02: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A8(4, 9)

Cluster-03: A2(2, 5), A7(1, 2)
K-means clustering: Exercise
Solution: Iteration 1
Now we re-compute the cluster centers. The new center of a cluster is computed by taking the mean of all the points contained in that cluster.

For Cluster-01: A1(2, 10)
We have only one point, A1(2, 10), in Cluster-01, so the cluster center remains the same: (2, 10).

For Cluster-02: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A8(4, 9)
Center of Cluster-02 = ((8 + 5 + 7 + 6 + 4)/5, (4 + 8 + 5 + 4 + 9)/5) = (6, 6)

For Cluster-03: A2(2, 5), A7(1, 2)
Center of Cluster-03 = ((2 + 1)/2, (5 + 2)/2) = (1.5, 3.5)

This completes Iteration 1.
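The center update for Iteration 1 amounts to a coordinate-wise mean per cluster. A small sketch, with the cluster memberships copied from the slide and a helper name (`centroid`) chosen for illustration:

```python
# Cluster memberships after the assignment step of Iteration 1.
clusters = {
    "C1": [(2, 10)],
    "C2": [(8, 4), (5, 8), (7, 5), (6, 4), (4, 9)],
    "C3": [(2, 5), (1, 2)],
}

def centroid(members):
    """Mean of the x coordinates and mean of the y coordinates."""
    n = len(members)
    return (sum(x for x, _ in members) / n, sum(y for _, y in members) / n)

new_centers = {name: centroid(members) for name, members in clusters.items()}
print(new_centers)
# {'C1': (2.0, 10.0), 'C2': (6.0, 6.0), 'C3': (1.5, 3.5)}
```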
K-means clustering: Exercise
Solution: Iteration 2

Iteration 2 starts from the clusters and centers obtained in Iteration 1:

Cluster-01: A1(2, 10)
Center = C1(2, 10)

Cluster-02: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A8(4, 9)
Center = C2(6, 6)

Cluster-03: A2(2, 5), A7(1, 2)
Center = C3(1.5, 3.5)
K-means clustering: Exercise
Solution: Iteration 2

Re-assigning each point to its nearest center, using the same distance function, gives the new clusters (A8 moves from Cluster-02 to Cluster-01):

Cluster-01: A1(2, 10), A8(4, 9)

Cluster-02: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4)

Cluster-03: A2(2, 5), A7(1, 2)
K-means clustering: Exercise
Solution: Iteration 2

Now we re-compute the cluster centers. The new center of a cluster is computed by taking the mean of all the points contained in that cluster.

For Cluster-01: A1(2, 10), A8(4, 9)
Center of Cluster-01 = ((2 + 4)/2, (10 + 9)/2) = (3, 9.5)

For Cluster-02: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4)
Center of Cluster-02 = ((8 + 5 + 7 + 6)/4, (4 + 8 + 5 + 4)/4) = (6.5, 5.25)

For Cluster-03: A2(2, 5), A7(1, 2)
Center of Cluster-03 = ((2 + 1)/2, (5 + 2)/2) = (1.5, 3.5)

This completes Iteration 2.
After the second iteration, the centers of the three clusters are:
C1(3, 9.5), C2(6.5, 5.25), C3(1.5, 3.5)
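The whole exercise can be replayed end to end as a check on the hand calculation. The sketch below runs two assign-and-update rounds with the exercise's distance function and initial centers; the function names (`manhattan`, `centroid`, `iterate`) are illustrative, and the code assumes no cluster becomes empty, which holds for this data.

```python
points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]

def manhattan(a, b):
    """Distance from the exercise: |x2 - x1| + |y2 - y1|."""
    return abs(b[0] - a[0]) + abs(b[1] - a[1])

def centroid(members):
    """Coordinate-wise mean of a non-empty list of 2-D points."""
    n = len(members)
    return (sum(x for x, _ in members) / n, sum(y for _, y in members) / n)

def iterate(points, centers):
    """One K-means round: assign points to nearest center, then update centers."""
    clusters = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda i: manhattan(p, centers[i]))
        clusters[j].append(p)
    # Assumes every cluster received at least one point (true for this data).
    return [centroid(members) for members in clusters], clusters

centers = [(2, 10), (5, 8), (1, 2)]  # initial centers A1, A4, A7
for _ in range(2):
    centers, clusters = iterate(points, centers)

print(centers)
# [(3.0, 9.5), (6.5, 5.25), (1.5, 3.5)]
```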

K-means clustering: Algo
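Decide the number of clusters → Initialize centroids → Assign clusters → Move centroids → Finish
(The "Assign clusters" and "Move centroids" steps repeat until a stopping criterion is met.)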
K-means Clustering: Elbow method
● How do we decide the number of clusters?
○ The elbow method is a graphical method for finding the optimal
‘K’ in K-means clustering.
○ It works by computing the WCSS (Within-Cluster Sum of Squares), i.e. the sum
of the squared distances between the points in a cluster and the cluster
centroid, added up over all clusters.
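Written as a formula, the WCSS described above takes the following form. The symbols are notation chosen for this note and do not appear on the slides: C_j is the j-th cluster, μ_j its centroid, and x_i a data point.

```latex
\mathrm{WCSS}(K) \;=\; \sum_{j=1}^{K} \; \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^{2},
\qquad
\mu_j \;=\; \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i
```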
K-means Clustering: Elbow method

[Figure: WCSS plotted against the number of clusters K.]

WCSS1 > WCSS2 > WCSS3 > ..... > WCSSn, where WCSSk is the WCSS obtained with k clusters.

● When we see an elbow shape in the graph, we pick the K-value at which the
elbow appears. We call this point the Elbow point.
● Beyond the Elbow point, increasing the value of ‘K’ does not lead to a
significant reduction in WCSS.
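The values behind an elbow plot can be produced by running K-means for several values of K and recording the WCSS each time. The sketch below is one way to do that under stated assumptions: it reuses the eight points from the exercise, uses squared Euclidean distance, and initializes the centers with the first K points so that the run is deterministic. That initialization and all function names are choices made for this illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(members):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(members)
    return tuple(sum(c) / n for c in zip(*members))

def k_means(points, k, max_iter=100):
    centers = list(points[:k])  # assumption: first k points as initial centers
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = centroid(members)
    return centers, labels

def wcss(points, centers, labels):
    """Sum of squared distances from each point to its cluster center."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))

points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]
wcss_by_k = {}
for k in range(1, len(points) + 1):
    centers, labels = k_means(points, k)
    wcss_by_k[k] = wcss(points, centers, labels)

for k, value in wcss_by_k.items():
    print(k, round(value, 3))
```

Plotting `wcss_by_k` against K gives the elbow curve. With a single run per K, the sequence is not guaranteed to decrease at every step, so in practice several initializations per K are often tried and the best WCSS kept.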
K-means Clustering: Silhouette score
● In the majority of real-world datasets, it is hard to identify the
right ‘K’ using the elbow method, because the curve shows no clear elbow.
[Figure: example of an elbow plot without a clear elbow point.]
K-means Clustering: Silhouette score
● The Silhouette score is a useful method for choosing the value of K when
the Elbow method does not show a clear Elbow point.

● The Silhouette score ranges from -1 to +1.

○ +1: Points are perfectly assigned to their clusters and the clusters are easily
distinguishable.
○ 0: Clusters are overlapping.
○ -1: Points are wrongly assigned to clusters.
K-means Clustering: Silhouette score
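● Silhouette Score = (b - a)/max(a, b), computed for each point, where
○ a = average intra-cluster distance, i.e. the average distance between the
point and the other points in its own cluster.
○ b = average inter-cluster distance, i.e. the average distance between the
point and the points of the nearest other cluster.
● The Silhouette score of a clustering is the mean of the per-point scores.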
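The formula can be made concrete with a small pure-Python sketch. It computes (b - a)/max(a, b) for every point and averages the results. The function names are illustrative, Euclidean distance is an assumption, and scoring a point in a single-member cluster as 0 is a common convention adopted here rather than something stated on the slides.

```python
def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def silhouette_score(points, labels, distance=euclidean):
    """Mean of (b - a) / max(a, b) over all points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention (assumed here) for singleton clusters
            continue
        # a: mean distance to the OTHER points of the same cluster
        # (the point's distance to itself is 0, so divide by len(own) - 1).
        a = sum(distance(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the points of the nearest other cluster.
        b = min(sum(distance(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # two well-separated pairs
bad = silhouette_score(points, [0, 1, 0, 1])   # each cluster straddles both pairs
print(round(good, 3), round(bad, 3))
```

To choose K, one would compute this score for the clustering obtained at each candidate K and prefer the K with the highest score.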
