Assignment 6 ML

The document discusses the K-Means clustering algorithm. It defines K-Means clustering as an unsupervised learning technique that groups unlabeled data points into K number of clusters, where each data point belongs to the cluster with the nearest mean. The document outlines the steps of the K-Means algorithm, which iteratively assigns data points to centroids and updates the centroids until cluster membership stabilizes. It also provides a diagram illustrating how K-Means clustering works.

Uploaded by

Mansi Todmal


Vidya Pratishthan’s Kamalnayan Bajaj Institute of Engineering and Technology, Baramati
Department of Computer Engineering
Assignment No: 6
Roll Number: 2241072
Name of Student: Todmal Mansi
Subject: Machine Learning
Class: BE Computer

Title: Implement K-Means clustering / hierarchical clustering on the sales_data_sample.csv dataset.

Determine the number of clusters using the elbow method.
Dataset link: https://www.kaggle.com/datasets/kyanyoga/sample-sales-data
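
The elbow method named in the title can be sketched as follows: fit K-Means for several values of K, record the within-cluster sum of squares (WCSS) for each, and pick the K where the curve stops dropping sharply. This is only a minimal pure-Python sketch on made-up 2-D points; the function names (sq_dist, kmeans_wcss, elbow_curve), the toy data, and the use of a few random restarts are all assumptions for illustration, not part of the assignment. With the real dataset, the points would instead be rows of numeric columns chosen from sales_data_sample.csv.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_wcss(points, k, seed, max_iter=100):
    """Run one K-Means fit and return its within-cluster sum of squares."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k distinct data points

    def assign():
        return [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]

    labels = assign()
    for _ in range(max_iter):
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        new_labels = assign()
        if new_labels == labels:  # no reassignment: converged
            break
        labels = new_labels
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

def elbow_curve(points, k_values, n_restarts=5):
    """Best (lowest) WCSS found for each k over a few random restarts."""
    return {k: min(kmeans_wcss(points, k, seed) for seed in range(n_restarts))
            for k in k_values}

# Hypothetical toy data: three tight groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 0.0), (10.0, 1.0), (11.0, 0.0),
          (5.0, 10.0), (5.0, 11.0), (6.0, 10.0)]

curve = elbow_curve(points, range(1, 6))
for k, cost in curve.items():
    print(k, round(cost, 2))
```

On data like this, the WCSS would be expected to fall steeply up to K=3 and flatten afterwards; the bend in the curve is the "elbow". In practice the curve is usually plotted and the elbow judged by eye.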

K-Means Clustering Algorithm

K-Means clustering is an unsupervised learning algorithm used to solve clustering problems in machine
learning and data science. This topic covers what the K-Means clustering algorithm is and how the
algorithm works, along with a Python implementation of K-Means clustering.

What is the K-Means Algorithm?

K-Means clustering is an unsupervised learning algorithm that groups an unlabeled dataset into
different clusters. Here K is the number of clusters to be created, chosen in advance: if K=2 there
will be two clusters, if K=3 there will be three clusters, and so on.

It is an iterative algorithm that divides the unlabeled dataset into K different clusters in such a way that each
data point belongs to only one group, whose members have similar properties.

It allows us to cluster the data into different groups and is a convenient way to discover the categories
present in an unlabeled dataset on its own, without the need for any labeled training data.

It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of this
algorithm is to minimize the sum of squared distances between the data points and the centroids of their clusters.
The algorithm takes the unlabeled dataset as input, divides the dataset into K clusters, and repeats
the process until the clusters stop changing. The value of K must be chosen before the algorithm runs.
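
The quantity being minimized is commonly taken to be the within-cluster sum of squares (WCSS): the squared Euclidean distance from every point to the centroid of its assigned cluster, summed over all points. A minimal sketch of that computation, assuming points and centroids are tuples of numbers and labels holds each point's cluster index (the function name wcss and the toy data are illustrative assumptions):

```python
def wcss(points, centroids, labels):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total

# Hypothetical toy data: two tight pairs of points in 2-D.
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (9.0, 11.0)]

# Two clusters, each centroid placed at the mean of its pair.
good = wcss(points, [(1.0, 2.0), (9.0, 10.0)], [0, 0, 1, 1])

# One cluster with its centroid at the overall mean.
poor = wcss(points, [(5.0, 6.0)], [0, 0, 0, 0])

print(good, poor)  # 4.0 132.0
```

The two-cluster grouping has a far lower cost than the single cluster, which is the sense in which K-Means prefers compact clusters.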

The k-means clustering algorithm mainly performs two tasks:

o Determines the best positions for the K center points (centroids) through an iterative process.
o Assigns each data point to its closest centroid. The data points nearest to a particular centroid
form a cluster.

Hence each cluster contains data points that share some commonalities, and it is separated from the other clusters.

[Figure: working of the K-Means clustering algorithm]

How does the K-Means Algorithm Work?

The working of the K-Means algorithm is explained in the steps below:

Step-1: Select the number K to decide the number of clusters.

Step-2: Select K random points as the initial centroids. (They need not be points from the input dataset.)

Step-3: Assign each data point to its closest centroid; this forms the K predefined clusters.

Step-4: Calculate the variance and place a new centroid at the mean of each cluster.

Step-5: Repeat the third step, that is, reassign each data point to the new closest centroid.

Step-6: If any reassignment occurs, then go to step-4 else go to FINISH.
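
The steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a production implementation: the helper names (squared_distance, nearest, mean, k_means), the seed and max_iter parameters, the toy data, and the choice to draw the initial centroids from the data points themselves are all assumptions made for the example.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def k_means(points, k, seed=0, max_iter=100):
    # Step 1: k is supplied by the caller.
    rng = random.Random(seed)
    # Step 2: pick k distinct data points as initial centroids (one common choice).
    centroids = rng.sample(points, k)
    # Step 3: assign every point to its closest centroid.
    labels = [nearest(p, centroids) for p in points]
    for _ in range(max_iter):
        # Step 4: move each centroid to the mean of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean(members)
        # Step 5: reassign every point to the new closest centroid.
        new_labels = [nearest(p, centroids) for p in points]
        # Step 6: stop when no point changed cluster, otherwise repeat.
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels

# Hypothetical toy data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centroids, labels = k_means(points, k=2)
print(centroids)
print(labels)
```

The max_iter cap is a safety net; on small, well-separated data like this the loop normally stops after a few passes because Step 6 finds no reassignment.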

