Kmeans Practice

This document provides instructions for practicing k-means clustering on customer data using scikit-learn in Python. It covers loading and visualizing the data, running k-means with k=3, finding the optimal k with the elbow method by computing the distortion for k from 1 to 15, training a final k-means model with the best k, and predicting clusters for a separate test set.

Uploaded by

luchi lovo


MSc AIBT : Machine Learning with Python

Practice 2 – kmeans
The dataset is available on Moodle (or by email).

1) Data Visualization
a) Load the database (Customers_practice.csv).

b) Print the first 10 rows of the dataset (with the head function). Determine the number of
examples and the number of features of the problem.

c) Display a scatter plot of the data. You should obtain the following expected result :
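
As a rough starting point for part 1, the steps above could be sketched as follows. The real Customers_practice.csv and its column names are not shown in this sheet, so the snippet builds a small synthetic stand-in; the column names "income" and "transactions" are assumptions borrowed from the example in part 3.

```python
import io

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for Customers_practice.csv. The real file and its column
# names are not shown here, so "income" and "transactions" are hypothetical.
rng = np.random.default_rng(0)
centers = np.array([[20, 80], [50, 50], [80, 20], [80, 80]])
points = np.vstack([c + rng.normal(scale=5.0, size=(50, 2)) for c in centers])
csv_text = pd.DataFrame(points, columns=["income", "transactions"]).to_csv(index=False)

# With the real data this would be: data = pd.read_csv("Customers_practice.csv")
data = pd.read_csv(io.StringIO(csv_text))

# b) First 10 rows, then the number of examples (rows) and features (columns).
print(data.head(10))
n_examples, n_features = data.shape
print(f"{n_examples} examples, {n_features} features")

# c) Scatter plot of the two features.
fig, ax = plt.subplots()
ax.scatter(data["income"], data["transactions"], s=15)
ax.set_xlabel("income")
ax.set_ylabel("transactions")
ax.set_title("Customers (synthetic stand-in)")
```

Call plt.show() at the end when running this interactively. With the real file, replace the column names with the ones printed by head().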

2) K-means algorithm
Sklearn documentation available here: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html

a) Run the k-means algorithm with k=3 (the n_clusters parameter) and random_state=0. Use the
fit() function on your dataset. Because there is no target column, you can use all of the
data to train your model.


b) Once the model is trained, you can access the labels that k-means assigned to the data
through the labels_ attribute (see the documentation for a usage example).
Display the distinct classes assigned by k-means (use np.unique()).
c) You can access the cluster centroids through the cluster_centers_ attribute
(see the documentation for a usage example). Print them.
d) Draw the scatter plot again, this time coloring each point according to the label
assigned by k-means. You should obtain the following plot:

e) Explain why k=3 does not seem to be the appropriate number of clusters.
f) Find a way to plot the centroids on the same plot. Be practical and create a function
that plots everything.
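
One possible shape for items a) to f) is sketched below. It runs on synthetic 2-D data from make_blobs rather than the customer file, the helper name plot_clusters is hypothetical, and n_init=10 is an added assumption (the sheet does not mention it) to keep the result stable across scikit-learn versions.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D stand-in for the customer data (assumption: two numeric features).
X, _ = make_blobs(n_samples=200, centers=5, cluster_std=1.0, random_state=0)

# a) Fit k-means with k=3 on all of the data (there is no target column).
# n_init is set explicitly because its default differs between scikit-learn versions.
kmeans = KMeans(n_clusters=3, random_state=0, n_init=10)
kmeans.fit(X)

# b) Labels assigned to each sample, and the distinct classes.
labels = kmeans.labels_
print(np.unique(labels))

# c) Coordinates of the centroids, one row per cluster.
print(kmeans.cluster_centers_)


def plot_clusters(X, model, ax=None):
    """Hypothetical helper for d) and f): points colored by label, plus centroids."""
    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(X[:, 0], X[:, 1], c=model.labels_, cmap="viridis", s=15)
    centers = model.cluster_centers_
    ax.scatter(centers[:, 0], centers[:, 1], c="red", marker="X", s=150,
               label="centroids")
    ax.legend()
    return ax


ax = plot_clusters(X, kmeans)
```

Because the helper takes any fitted model, it can be reused unchanged once a better value of k has been chosen.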

3) Find the optimal value of k

Find in the documentation the attribute that gives the SSD (sum of squared distances) value
of the trained k-means model.
a) Using the whole dataset, write a script to:
- Find the optimal value of k using the elbow method (use the range [1, 16[, i.e. k = 1 to 15).


- Use the following parameters in the KMeans initialization: random_state=42 and init='k-means++'.
- Draw the elbow method plot (you should obtain the following plot)

- Conclude on the best value of k.
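
The elbow loop described above might look like the sketch below. It uses synthetic data from make_blobs in place of the customer file, reads the sum of squared distances from the fitted model's inertia_ attribute, and adds n_init=10 as an assumption not stated in the sheet.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the customer data (assumption: two numeric features).
X, _ = make_blobs(n_samples=300, centers=5, cluster_std=1.0, random_state=42)

k_values = range(1, 16)  # [1, 16[ means k = 1, ..., 15
ssd = []
for k in k_values:
    model = KMeans(n_clusters=k, init="k-means++", random_state=42, n_init=10)
    model.fit(X)
    # inertia_ is the sum of squared distances of samples to their closest centroid.
    ssd.append(model.inertia_)

# Elbow plot: SSD as a function of k.
fig, ax = plt.subplots()
ax.plot(list(k_values), ssd, marker="o")
ax.set_xlabel("k (number of clusters)")
ax.set_ylabel("SSD (inertia)")
ax.set_title("Elbow method")
```

The best k is read off the plot as the point where the curve stops dropping sharply; the SSD keeps decreasing as k grows, so its minimum is not a useful criterion.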

b) Train a k-means model with the best value of k obtained before :

- random_state=42 and init='k-means++'
- Draw the scatterplot associated
- Observe and describe the resulting clusters in terms of the axes (e.g. cluster 1 contains the
customers with a low income but a high number of transactions)
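
A minimal sketch of this final step follows, again on synthetic data. The value best_k = 5 is only a placeholder for whatever the elbow plot suggests on the real data, and the column names "income" and "transactions" are hypothetical. Per-cluster means are one simple way to support the description asked for above.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in; the column names are hypothetical.
X, _ = make_blobs(n_samples=300, centers=5, cluster_std=1.0, random_state=42)
data = pd.DataFrame(X, columns=["income", "transactions"])

best_k = 5  # placeholder: use the value read from your own elbow plot
final_model = KMeans(n_clusters=best_k, init="k-means++", random_state=42, n_init=10)
data["cluster"] = final_model.fit_predict(data[["income", "transactions"]])

# Scatter plot colored by cluster.
fig, ax = plt.subplots()
ax.scatter(data["income"], data["transactions"], c=data["cluster"], cmap="viridis", s=15)
ax.set_xlabel("income")
ax.set_ylabel("transactions")

# Mean of each feature and number of customers per cluster, to help describe
# each cluster in terms of the axes.
summary = data.groupby("cluster")[["income", "transactions"]].mean()
summary["size"] = data["cluster"].value_counts().sort_index()
print(summary)
```

A cluster whose mean income is low and whose mean number of transactions is high would match the example description given in the exercise.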

5) More
Load the test samples (Customers_practice_test.csv).

a) Use your k-means model trained with the optimal value of k (found in part 3) to predict
the clusters of the test samples just loaded.
b) Print the predictions
c) Plot the decision boundaries (here is an example with k=3)
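
One common way to draw the decision boundaries is to predict the cluster of every point on a dense grid and fill the regions with contourf. The sketch below assumes that approach; it uses synthetic data, a held-out slice as a stand-in for Customers_practice_test.csv, and k=5 as a placeholder value.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in: the last 60 points play the role of the test file.
X_all, _ = make_blobs(n_samples=360, centers=5, cluster_std=1.0, random_state=42)
X_train, X_test = X_all[:300], X_all[300:]

# k=5 is a placeholder for the optimal k found with the elbow method.
model = KMeans(n_clusters=5, init="k-means++", random_state=42, n_init=10).fit(X_train)

# a), b) Assign each test sample to its nearest centroid and print the result.
predictions = model.predict(X_test)
print(predictions)

# c) Decision boundaries: predict the cluster of every point on a dense grid.
x_min, x_max = X_all[:, 0].min() - 1, X_all[:, 0].max() + 1
y_min, y_max = X_all[:, 1].min() - 1, X_all[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300), np.linspace(y_min, y_max, 300))
grid = np.c_[xx.ravel(), yy.ravel()]
zz = model.predict(grid).reshape(xx.shape)

fig, ax = plt.subplots()
ax.contourf(xx, yy, zz, alpha=0.3, cmap="viridis")
ax.scatter(X_test[:, 0], X_test[:, 1], c=predictions, cmap="viridis",
           edgecolor="k", s=25)
centers = model.cluster_centers_
ax.scatter(centers[:, 0], centers[:, 1], c="red", marker="X", s=150)
ax.set_title("k-means decision regions (synthetic stand-in)")
```

The regions are straight-edged because k-means assigns each point to its nearest centroid. A finer grid gives smoother edges at the cost of more predictions.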
