Clustering and K-Means Algorithm

This lecture covers the concepts of supervised and unsupervised learning, focusing on clustering and the K-Means algorithm. It explains the steps involved in the K-Means algorithm, including initialization, membership calculation, and centroid updating, as well as methods for choosing the optimal number of clusters (K) using the Elbow method. Additionally, applications of K-Means, such as image color quantization and handling outliers, are discussed.

Machine Learning

Clustering and k-Means Algorithm


Lecture – 7

Instructor: Qamar Askari


Headlines
• Supervised vs. Unsupervised Learning
• Clustering
• K-Means algorithm
• Random initialization
• Choosing K – Elbow Method
• Implementation of K-Means Algorithm in Python
Supervised learning

Training set: {(x^1, y^1), (x^2, y^2), …, (x^N, y^N)}; every training example x comes with a target label y.

Unsupervised learning

Training set: {x^1, x^2, …, x^N}; no labels are given, so the algorithm must discover structure (such as clusters) on its own.
Clustering
• It is the task of identifying subgroups in the data such that data points
in the same subgroup (cluster) are very similar while data points in
different clusters are very different.

Examples: market segmentation, social network analysis.


K-means algorithm
An example on board
K-Means Algorithm
• 1. Initialize:
  Choose K random data points from the data set to represent the initial centers of the K partitions.

• 2. Calculate the group memberships:

  b_i^t =
  \begin{cases}
  1 & \text{if } \lVert x^t - m_i \rVert = \min_j \lVert x^t - m_j \rVert \\
  0 & \text{otherwise}
  \end{cases}
• 3. Update the centroids:

  m_i = \frac{\sum_t b_i^t \, x^t}{\sum_t b_i^t}
• 4. Repeat steps 2 and 3 until the centroids converge to stable values.
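
A minimal NumPy sketch of these four steps (the function name k_means and its parameters are illustrative, not part of the lecture):

import numpy as np

def k_means(X, K, max_iters=100, seed=0):
    """Basic K-means on an (n_samples, n_features) array X."""
    rng = np.random.default_rng(seed)

    # Step 1: initialize the centroids with K random data points.
    centroids = X[rng.choice(len(X), size=K, replace=False)]

    for _ in range(max_iters):
        # Step 2: memberships -- assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Step 3: move each centroid to the mean of its assigned points
        # (keep the old centroid if a cluster ends up empty).
        new_centroids = np.array([
            X[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
            for i in range(K)
        ])

        # Step 4: repeat until the centroids stop moving.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return centroids, labels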
An empirical study
[Figure: initial centres chosen for the empirical study]

Ref: K. Javed et al., "The behavior of K-Means: An empirical study", International Conference on Electrical Engineering, 2008
Application – Image color quantization
[Figure: example photograph used for color quantization]

• Look at the picture above: it does not contain all 256³ possible colors.
• Suppose we want to represent it using even fewer colors.
• K-means has an application for exactly this, called color quantization.
[Figure: the original image (37,859 shades of color) alongside K-means quantized versions with K = 2 and K = 10]

[Figure: K-means quantized versions of the image with K = 15 and K = 20]
[Figure: an image quantized with k = 2, k = 3, and k = 10. Reference: Bishop, 2006]
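
A sketch of this application using scikit-learn's KMeans and Pillow (the file names and K = 10 are placeholders, not from the lecture):

import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

# Load the image and flatten it into an (n_pixels, 3) array of RGB values.
img = np.asarray(Image.open("photo.jpg").convert("RGB"), dtype=np.float64)
pixels = img.reshape(-1, 3)

# Cluster the pixel colors into K representative colors.
K = 10
kmeans = KMeans(n_clusters=K, n_init=10, random_state=0).fit(pixels)

# Replace every pixel with the centroid of its cluster and save the result.
quantized = kmeans.cluster_centers_[kmeans.labels_].reshape(img.shape)
Image.fromarray(quantized.astype(np.uint8)).save("photo_k10.jpg")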


K-means for non-separated clusters

T-shirt sizing
[Figure: a weight-vs-height scatter plot of customers with no clearly separated clusters]
K-Means and Globular/Non-Globular Structures

[Figure: a globular (roughly spherical) structure contrasted with a non-globular structure]

K-Means on a Non-Globular Structure

[Figure: the partition K-means produces on a non-globular structure]
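
As an illustrative sketch (the two-moons dataset and parameters are my own choice, not from the lecture), K-means splits a non-globular shape by distance to the centroids rather than by its true shape:

from sklearn.cluster import KMeans
from sklearn.datasets import make_moons

# Two interleaving half-circles: a classic non-globular structure.
X, true_labels = make_moons(n_samples=300, noise=0.05, random_state=0)

# K-means assumes roughly spherical clusters, so with K=2 it tends to cut
# each moon in half instead of recovering the two crescent shapes.
pred_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)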
K-Means is sensitive to outliers
Handling Outliers
1. Remove/ignore the outlier(s). Outliers can be identified by their distance from the assigned centroid, as sketched in the code after this list.
2. Make an extra cluster for the outlier(s).
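
A small sketch of approach 1 (the function name and the 3-standard-deviation threshold are illustrative assumptions):

import numpy as np
from sklearn.cluster import KMeans

def find_outliers(X, K=3, num_std=3.0):
    """Flag points whose distance to their own centroid is unusually large."""
    kmeans = KMeans(n_clusters=K, n_init=10, random_state=0).fit(X)
    # Distance of every point to the centroid of the cluster it was assigned to.
    dists = np.linalg.norm(X - kmeans.cluster_centers_[kmeans.labels_], axis=1)
    # Mark points lying more than num_std standard deviations above the mean distance.
    return dists > dists.mean() + num_std * dists.std()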
Random initialization
• Should have fewer clusters than training examples (K smaller than the number of data points).
• Randomly pick K training examples.
• Set the centroids m_1, …, m_K equal to these K examples.

Local optima
• Depending on the random initialization, K-means can converge to different local optima of the cost function, some of which correspond to poor clusterings.
Random initialization (multiple restarts)

For i = 1 to 100 {
    Randomly initialize K-means.
    Run K-means to convergence; record the resulting memberships and centroids.
    Compute the cost function (distortion / WCSS).
}

Pick the clustering that gave the lowest cost.
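
A runnable sketch of this loop, reusing the k_means function sketched earlier; note that scikit-learn's KMeans performs the same multi-restart strategy via its n_init parameter:

import numpy as np

def best_of_restarts(X, K, n_restarts=100):
    """Run K-means many times and keep the run with the lowest WCSS."""
    best_wcss, best_centroids, best_labels = np.inf, None, None
    for i in range(n_restarts):
        centroids, labels = k_means(X, K, seed=i)  # k_means as sketched earlier
        # Distortion / WCSS: sum of squared distances to the assigned centroids.
        wcss = np.sum((X - centroids[labels]) ** 2)
        if wcss < best_wcss:
            best_wcss, best_centroids, best_labels = wcss, centroids, labels
    return best_centroids, best_labels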


What is the right value of K?
Choosing the value of K
Elbow method: run K-means for a range of K values, plot the cost function (WCSS) against the number of clusters, and pick the K at the "elbow", beyond which the cost decreases only slowly.

[Figure: two example plots of the cost function vs. the number of clusters (1 to 8)]
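
A sketch of the elbow method with scikit-learn, where inertia_ is the library's name for the WCSS (the range of 1 to 8 clusters mirrors the plots above):

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

def plot_elbow(X, max_k=8):
    """Plot WCSS against K and look for the 'elbow'."""
    ks = range(1, max_k + 1)
    wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in ks]
    plt.plot(ks, wcss, marker="o")
    plt.xlabel("K (no. of clusters)")
    plt.ylabel("Cost function (WCSS)")
    plt.show()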


Choosing the value of K
Sometimes, you’re running K-means to get clusters to use for some
later/downstream purpose. Evaluate K-means based on a metric for
how well it performs for that later purpose.

E.g. T-shirt sizing
[Figure: two weight-vs-height scatter plots of customers, clustered with different values of K to define T-shirt sizes]
Implementation of K-Means Algorithm

Discussion from Google Colab
