Lab 11 - HT

The document outlines a task to segment customers based on Age, Annual Income, and Spending Score using K-Means Clustering. The optimal number of clusters (K) identified is 3, which provides the highest silhouette score, indicating well-defined customer segments. The analysis suggests that this segmentation can enhance targeted marketing strategies for different customer groups.

Uploaded by

Lehza Jafri

Lab 11 - Home tasks:

Home Task 1:
A company wants to segment its customers based on their Age, Annual Income, and Spending Score. The goal is to group the customers into distinct segments to improve targeted marketing strategies. Your task is to apply K-Means Clustering to segment the customers and evaluate the clustering quality using the Silhouette Score. What is the optimal value of K in your case?

CODE:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Step 1: Create a sample dataset

data = {
    'Age': [25, 34, 22, 45, 31, 41, 38, 29, 50, 35,
            21, 27, 43, 36, 33, 40, 48, 28, 24, 26],
    'Annual Income (k$)': [15, 40, 22, 80, 60, 75, 50, 35, 85, 58,
                           20, 30, 70, 55, 45, 68, 90, 38, 25, 29],
    'Spending Score': [39, 81, 6, 77, 40, 76, 50, 60, 85, 49,
                       10, 50, 73, 52, 39, 65, 80, 62, 14, 48]
}

# Convert to DataFrame
df = pd.DataFrame(data)

# Step 2: Convert to array for clustering

X = df.values

# Step 3: Try different values of K and compute silhouette scores

silhouette_scores = []
K_range = range(2, 11)

print("Silhouette Scores for different K values:\n")

for k in K_range:
    # Fit K-Means for this K and score the resulting labels
    kmeans = KMeans(n_clusters=k, random_state=0)
    labels = kmeans.fit_predict(X)
    score = silhouette_score(X, labels)
    silhouette_scores.append(score)
    print(f"K = {k} -> Silhouette Score = {score:.4f}")

# Step 4: Plot Silhouette Score vs K
plt.figure(figsize=(8, 5))
plt.plot(K_range, silhouette_scores, marker='o', linestyle='-', color='blue')
plt.title("Silhouette Score vs Number of Clusters (K)")
plt.xlabel("Number of Clusters (K)")
plt.ylabel("Silhouette Score")
plt.grid(True)
plt.tight_layout()
plt.show()

# Step 5: Fit KMeans with optimal K (highest score)

optimal_k = K_range[silhouette_scores.index(max(silhouette_scores))]
kmeans_final = KMeans(n_clusters=optimal_k, random_state=0)
labels_final = kmeans_final.fit_predict(X)

# Step 6: Add cluster labels to DataFrame

df['Cluster'] = labels_final

# Step 7: Print final DataFrame with clusters

print("\nCustomer Segmentation Result (with Cluster Labels):\n")
print(df)

OUTPUT:
Discussion & Analysis of results
Why K=3?

• Among the tested values (K = 2 to 10), K=3 gave the highest silhouette score.
• This means the customers are best grouped into three distinct segments with minimal overlap.
• Each cluster contains customers with similar:
    o Age
    o Income
    o Spending Score
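
One way to check that each cluster really holds similar customers is to average the features per cluster. The snippet below is a minimal sketch under stated assumptions: the four-row frame and its Cluster labels are made up for illustration (they are not the output of the lab run), and profile_clusters is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical toy frame: the labels are invented for illustration
# and are NOT the result of the K-Means run in this lab.
toy = pd.DataFrame({
    "Age": [22, 24, 45, 50],
    "Annual Income (k$)": [20, 24, 80, 90],
    "Spending Score": [10, 14, 77, 85],
    "Cluster": [0, 0, 1, 1],
})

def profile_clusters(frame, label_col="Cluster"):
    """Return per-cluster feature means plus a customer count."""
    grouped = frame.groupby(label_col)
    profile = grouped.mean()
    profile["Customers"] = grouped.size()
    return profile

profile = profile_clusters(toy)
print(profile)
```

The same call on the lab's df (after the Cluster column has been added) would give one row per segment, which makes it easier to attach names to the clusters.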

Business Insight

• Helps the company create targeted marketing strategies for:
    o Budget shoppers
    o Premium customers
    o Average spenders

Clustering Evaluation

• Silhouette Score ranges from -1 to 1:
    o Closer to 1 → well-defined clusters
    o Near 0 → overlapping clusters
    o Negative → points are likely assigned to the wrong cluster
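
To make the range above concrete: for one point, the silhouette value is s = (b - a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is its mean distance to the points of the nearest other cluster. The sketch below computes this by hand on four made-up points (an assumption for illustration, not the lab data) and compares it with scikit-learn's silhouette_samples. The function name silhouette_by_hand is hypothetical.

```python
import numpy as np
from sklearn.metrics import silhouette_samples

def silhouette_by_hand(X, labels):
    """Per-point silhouette values using Euclidean distance."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distance matrix
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # a singleton cluster gets a score of 0
        # a: mean distance to the other members of the same cluster
        a = dist[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to any other cluster
        b = min(dist[i, labels == other].mean()
                for other in np.unique(labels) if other != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores

# Made-up points forming two tight, well-separated groups
X_toy = [[0, 0], [0, 1], [5, 5], [5, 6]]
labels_toy = [0, 0, 1, 1]

by_hand = silhouette_by_hand(X_toy, labels_toy)
print(by_hand)
print(silhouette_samples(np.array(X_toy, dtype=float), labels_toy))
```

silhouette_score, as used in the lab code, is the mean of these per-point values.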

Limitations

• Sensitive to outliers
• K-Means assumes spherical clusters
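
The outlier sensitivity follows from how a centroid is defined: it is the mean of the points assigned to the cluster, so a single extreme value can pull it far from the bulk of the group. A small sketch with hypothetical income values (not taken from the lab data) shows the effect, with the median included only as a contrast:

```python
import numpy as np

# Hypothetical incomes (k$) of one cluster, for illustration only
incomes = np.array([20.0, 22.0, 25.0, 29.0])
with_outlier = np.append(incomes, 500.0)  # one extreme customer

centroid_clean = incomes.mean()           # centroid without the outlier
centroid_outlier = with_outlier.mean()    # centroid dragged by the outlier
median_outlier = np.median(with_outlier)  # median barely moves

print(centroid_clean, centroid_outlier, median_outlier)
```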

Conclusion
• K-Means successfully segments customers using Age, Annual Income, and Spending Score.
• Optimal K = 3 for this sample.
• Silhouette Score is a useful metric for evaluating cluster quality.
