08.03.2022, 16:42 A Beginner’s Guide to Customer Segmentation with Python | by Sigli Mumuni | Medium

Sigli Mumuni

Jan 6 · 7 min read

A Beginner’s Guide to Customer Segmentation with Python
A step-by-step introduction to clustering analysis

Photo by Olivier Le Moal on Shutterstock

Customer segmentation is the process of splitting your customer base into different
groups based on common characteristics. These characteristics are usually
demographic, like age, sex, and income, but psychographic or behavioral
characteristics like personality, interests, and habits are often considered as well.
Customer segmentation allows a business to deliver more targeted and effective
marketing that appeals to the different segments identified.

While customer segmentation has been around for as long as marketing itself, recent
advances in machine learning have made the process easier and more accurate. We can
easily implement customer segmentation using clustering analysis, a type of
unsupervised machine learning technique that places subjects in different groups (or
clusters) based on how closely associated they are with each other.

In this tutorial, we will implement customer segmentation using the K-means
clustering algorithm from the scikit-learn library in Python. We will be using the mall
customers dataset, which contains information about customers of an undisclosed
mall. It consists of 200 rows and five attributes:

Customer ID

Gender

Age

Annual Income (in thousands of dollars)

Spending Score (based on customer behavior and spending patterns).

You can download the dataset from the Kaggle website or my GitHub repository if you
want to follow along. All the relevant code used in this tutorial is also available in my
GitHub repository.

Importing the libraries

For this tutorial, we will be using NumPy, Pandas, Matplotlib, Seaborn, mpl_toolkits
(which ships with Matplotlib), and Scikit-learn. If you don’t have these installed
already, you can install them from a notebook with the !pip install command.
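Before installing anything, it can help to see which packages are actually missing. The snippet below is a minimal sketch using only the standard library; the helper name missing_packages is made up for this illustration and is not part of the tutorial's code.

```python
import importlib.util

def missing_packages(import_names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]

# Import names used in this tutorial (scikit-learn is imported as "sklearn")
needed = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn"]
print(missing_packages(needed))
```

An empty list means everything is already available. Any name that is printed still needs to be installed (note that "sklearn" is installed under the package name scikit-learn).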

# Import the relevant libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from mpl_toolkits.mplot3d import Axes3D
from sklearn.cluster import KMeans

Loading the data

Next, we’ll need to load the dataset into pandas using the read_csv() function. Here, I
have provided the URL of the dataset in my GitHub repository as the argument. If you
have downloaded the file to your computer, then be sure to enter the file path instead.
# Load the dataset
df = pd.read_csv("https://raw.githubusercontent.com/siglimumuni/Datasets/master/Mall_Customers.c

# View the first 5 rows
df.head()

Preview of the mall customers dataset

We can get a glimpse of the dataset by using the head() method to display the first 5
rows of data. We can also use the info() method to get a quick breakdown of the
structure of the dataset including the number of rows and columns and data types of
all the columns as well as information on missing values.

# Check the structure of the dataset
df.info()


Structure of the mall customers dataset

We have a total of 200 rows of data and 5 columns: 4 integer columns and 1 string
(object) column. The dataset contains no null values.

Before we perform our cluster analysis, we will conduct an exploratory data analysis to
better understand the characteristics of the dataset and familiarize ourselves with the
relationships between the different variables.

Exploring the data

We can start exploring the data by using the describe() method. This gives us quick
summary statistics on all numeric variables, like the mean, median, max, and min
values. We can pass the result to the round() function to limit the output to two
decimal places. Before proceeding, we drop the CustomerID column as it is irrelevant
for this analysis. The Gender column is left out automatically because describe() only
summarizes numeric columns by default.

# Check the summary statistics for the numeric columns
round(df.drop(columns="CustomerID").describe(), 2)
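To make the describe() output easier to read, here is a rough sketch of how a few of its rows are computed, written in plain Python for a single column. The summarise helper and the sample ages are invented for illustration. The standard deviation uses the sample formula, which is what pandas reports, and the median corresponds to the 50% row.

```python
import statistics

def summarise(values):
    """Return a few of the statistics that describe() reports, rounded to 2 places."""
    return {
        "count": len(values),
        "mean": round(statistics.mean(values), 2),
        "std": round(statistics.stdev(values), 2),  # sample standard deviation
        "min": min(values),
        "median": statistics.median(values),        # shown as the 50% row in pandas
        "max": max(values),
    }

ages = [19, 21, 20, 23, 31, 22, 35]  # hypothetical values, not taken from the dataset
print(summarise(ages))
```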


The mean age is 38.85 and the mean annual income is around $60,000. We can
explore these variables in more depth by visualizing their distributions with
histograms. We can create multiple plots side by side by calling plt.subplots()
and then drawing on each subplot in turn with seaborn’s histplot() function.

# Create a subplot object with one row and three columns
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=[16, 4], sharey=True)

# Plot three histograms: Age, Annual Income and Spending Score
for i, col in enumerate(["Age", "Annual Income (k$)", "Spending Score (1-100)"]):
    sns.histplot(df[col], bins=20, ax=axes[i]).set(ylabel=" ")

There’s a wide range of different ages represented with most customers belonging to
the 20–40 year range. Also, the majority of customers are in the 60 to 80 thousand
dollars annual income bracket while most customers’ spending score is between 40 and
60.

Next, let’s check the proportion of males and females in the dataset. We can use the
countplot() method in seaborn to create a bar chart.

# Check the proportion of males and females in the dataset
ax = sns.countplot(data=df, x="Gender")
total = len(df)

# Annotate bars with percentage values
for p in ax.patches:
    percentage = f'{100 * p.get_height() / total:.1f}%\n'
    x = p.get_x() + p.get_width() / 2
    y = p.get_height()
    ax.annotate(percentage, (x, y), ha='center', va='center')

ax.set(title="Proportion of Males and Females", xlabel="")
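The percentages drawn on the bars are simply each category's count divided by the total. A minimal sketch of that arithmetic in plain Python follows; the proportions helper and the sample list are assumptions made for illustration. With pandas, df["Gender"].value_counts(normalize=True) gives the same shares as fractions.

```python
from collections import Counter

def proportions(labels):
    """Map each distinct label to its share of the total, as a percentage."""
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

genders = ["Female", "Male", "Female", "Female", "Male"]  # hypothetical sample
print(proportions(genders))
```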

We have more female representation in the dataset than male. Finally, we can explore
the relationships between the different variables in the dataset. One great way to do
this is to use the corr() method to show the correlation between the different variables.

# Check the correlation between the different variables
corr_matrix = df[["Age", "Annual Income (k$)", "Spending Score (1-100)"]].corr()

# Visualize the correlation using seaborn
sns.heatmap(corr_matrix, cmap="coolwarm", annot=True)


There doesn’t seem to be any correlation between the different variables except for Age
and Spending Score, which share a weak negative correlation.
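By default, corr() computes the Pearson correlation coefficient, which ranges from -1 (perfect negative relationship) to 1 (perfect positive relationship). The sketch below shows the calculation for one pair of variables in plain Python; the pearson helper and the sample numbers are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

ages = [20, 30, 40, 50]      # hypothetical values
scores = [80, 60, 40, 20]    # scores fall steadily as age rises
print(pearson(ages, scores))
```

Real data is rarely this tidy, which is why the heatmap shows values much closer to zero.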

This concludes our exploratory data analysis. We can now move on to our main task.

Segmenting the data

In this section, we will focus on building a K-means model to segment our customers,
starting with two variables: Annual Income and Spending Score.

One of the key arguments we need to specify in a K-means clustering model is the
number of clusters. The optimal number of clusters varies from dataset to dataset.
Fortunately, there is a tried and tested way to arrive at this number, known as the
elbow method.

We begin by plotting the variation that remains within the clusters (called the Within
Cluster Sum of Squared Errors, or WCSS) as a function of the number of clusters, and
then pick out the value at the elbow of the curve as the number of clusters to use.
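To see what WCSS measures, the sketch below computes it by hand: for every point, take the squared distance to the centre of the cluster it belongs to, then add everything up. The wcss_by_hand helper and the toy points are assumptions for illustration. In scikit-learn, a fitted KMeans model exposes this quantity as inertia_.

```python
import math

def wcss_by_hand(points, labels, centres):
    """Sum of squared distances from each point to the centre of its cluster."""
    return sum(math.dist(p, centres[lab]) ** 2 for p, lab in zip(points, labels))

# Toy data: two tight clusters with their centres
points = [(1, 1), (1, 3), (9, 9), (9, 11)]
labels = [0, 0, 1, 1]
centres = [(1, 2), (9, 10)]
print(wcss_by_hand(points, labels, centres))
```

The tighter the clusters, the smaller this number, which is why it keeps falling as more clusters are added.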

# Create a subset of the dataframe with only Annual Income and Spending Score
X = df[["Annual Income (k$)", "Spending Score (1-100)"]]

# Determine the variation in the data
wcss = []
for i in range(1, 11):
    km = KMeans(n_clusters=i)
    km.fit(X)
    wcss.append(km.inertia_)

# Plot the elbow curve
plt.figure(figsize=(12, 6))
plt.plot(range(1, 11), wcss, linewidth=2, color="blue", marker="8")
plt.xlabel("Number of Clusters (K)")
plt.xticks(np.arange(1, 11, 1))
plt.title("The Elbow Method")
plt.ylabel("WCSS")
plt.show()


Our challenge now is to determine the optimal value of K from the elbow diagram. The
trick is to identify the value at which the WCSS suddenly stops decreasing significantly
compared to previous decreases. In our case, we notice that the drop after 5 is
relatively minimal so we choose 5 as our optimal value. With this information, we can
now build our model.
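Reading the elbow off the chart is a judgment call. One rough way to double-check it is to look for the point where the curve bends most sharply, that is, where the second difference of the WCSS values is largest. This is only a heuristic and not part of the original workflow; the elbow_k helper and the WCSS numbers below are made up for illustration.

```python
def elbow_k(wcss_values, first_k=1):
    """Pick K at the sharpest bend: the largest second difference of the curve."""
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(wcss_values) - 1):
        bend = wcss_values[i - 1] - 2 * wcss_values[i] + wcss_values[i + 1]
        if bend > best_bend:
            best_k, best_bend = first_k + i, bend
    return best_k

# Made-up WCSS values for K = 1..8, shaped like a typical elbow curve
example = [1000, 700, 450, 250, 100, 80, 65, 55]
print(elbow_k(example))
```

It is still worth looking at the plot, since noisy curves can have more than one plausible bend.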

# Build the model with 5 clusters specified
kmeans_model = KMeans(n_clusters=5)

# Fit the input data to the model
kmeans_model.fit(X)

# Segment the input data by assigning labels
y = kmeans_model.predict(X)

# Create a new column in the original dataset for the labels
df["label"] = y

# The dataframe with clustering complete
df.head()
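For readers curious about what fit() and predict() do under the hood, here is a heavily simplified sketch of the K-means idea in plain Python: assign every point to its nearest centre, move each centre to the mean of its points, and repeat. It is an illustration only. The kmeans function, its seeding with the first k points, and the toy data are assumptions, and scikit-learn's implementation is more sophisticated (for example in how it chooses the starting centres).

```python
import math

def kmeans(points, k, iterations=10):
    """Tiny K-means (Lloyd's algorithm), seeded with the first k points."""
    centres = [tuple(p) for p in points[:k]]

    def assign(centres):
        # Assignment step: give every point the index of its nearest centre
        return [min(range(k), key=lambda j: math.dist(p, centres[j])) for p in points]

    for _ in range(iterations):
        labels = assign(centres)
        # Update step: move each centre to the mean of the points assigned to it
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assign(centres), centres

# Hypothetical (income, spending score) pairs, not rows from the dataset
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
labels, centres = kmeans(points, k=2)
print(labels)
```

In scikit-learn terms, finding the centres corresponds to fit(), and the final assignment of each point to its nearest centre corresponds to predict().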


And there is our updated dataframe with a label column specifying which segment a
given client belongs to. We can visualize the different segments using a scatterplot.

# Create a scatterplot to show the different clusters
plt.figure(figsize=(12, 7))
sns.scatterplot(data = df, x = 'Annual Income (k$)',y = 'Spending Score (1-100)',hue="label",pal
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.title('Spending Score (1-100) vs Annual Income (k$)')
plt.show()


Now we’re able to see the clusters more clearly. Clients in Cluster 0 (Blue) have the
lowest incomes and lowest spending scores, while clients in Cluster 2 (Green) have the
highest incomes and highest spending scores.
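One caveat when following along: K-means starts from randomly chosen centres, so the numbers it assigns to clusters can differ from run to run unless a fixed random_state is passed to KMeans. "Cluster 0" on your machine may therefore not be the cluster described here. A possible workaround, sketched below in plain Python, is to renumber the clusters by their centres, for example from lowest to highest income. The relabel_by_centre helper and the sample centres are hypothetical.

```python
def relabel_by_centre(labels, centres, axis=0):
    """Renumber clusters so that 0 is the cluster with the smallest centre along `axis`."""
    order = sorted(range(len(centres)), key=lambda j: centres[j][axis])
    new_id = {old: new for new, old in enumerate(order)}
    new_labels = [new_id[lab] for lab in labels]
    new_centres = [centres[old] for old in order]
    return new_labels, new_centres

labels = [2, 0, 1, 2]
centres = [(80, 20), (50, 50), (20, 80)]  # hypothetical (income, score) centres
print(relabel_by_centre(labels, centres))
```

After relabelling, cluster numbers carry the same meaning on every run, which makes written descriptions of the segments easier to compare.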

Adding another variable

So far, we have segmented our client base on two criteria: annual income and
spending score. In this section, we will perform another segmentation using three
variables. In addition to annual income and spending score, we will include the
age of the customers.

As we did previously, we will begin by calculating the values of WCSS, but this time
with the Age column included.

# Create a subset of the dataframe with only Age, Annual Income and Spending Score
X2 = df[["Age", "Annual Income (k$)", "Spending Score (1-100)"]]

# Determine the variation in the data
wcss = []
for i in range(1, 11):
    km = KMeans(n_clusters=i)
    km.fit(X2)
    wcss.append(km.inertia_)

# Plot the elbow curve
plt.figure(figsize=(12, 6))
plt.plot(range(1, 11), wcss, linewidth=2, color="blue", marker="8")
plt.xlabel("Number of Clusters (K)")
plt.xticks(np.arange(1, 11, 1))
plt.title("The Elbow Method")
plt.ylabel("WCSS")
plt.show()


Again, we can select 5 as our optimal value of K. Let’s go ahead and build our second
model.

# Build the model with 5 clusters specified
kmeans_model3D = KMeans(n_clusters=5)

# Fit the input data to the model
kmeans_model3D.fit(X2)

# Segment the input data by assigning labels
y2 = kmeans_model3D.predict(X2)

# Update the "label" column in the original dataset with the new values
df["label"] = y2

# The dataframe with clustering complete
df.head()


To see the individual clusters more clearly, we will need to visualize them using a
scatterplot, except this time we will need to create a 3D plot since we are dealing with 3
dimensions or variables.

# Create a 3D scatter plot
fig = plt.figure(figsize=(20, 10))
ax = fig.add_subplot(111, projection='3d')

ax.scatter(df["Age"][df["label"] == 0], df["Annual Income (k$)"][df["label"] == 0], df["Spendin
ax.scatter(df["Age"][df["label"] == 1], df["Annual Income (k$)"][df["label"] == 1], df["Spendin
ax.scatter(df["Age"][df["label"] == 2], df["Annual Income (k$)"][df["label"] == 2], df["Spendin
ax.scatter(df["Age"][df["label"] == 3], df["Annual Income (k$)"][df["label"] == 3], df["Spendin
ax.scatter(df["Age"][df["label"] == 4], df["Annual Income (k$)"][df["label"] == 4], df["Spendin
ax.view_init(35, 185)

plt.xlabel("Age")
plt.ylabel("Annual Income (k$)")
ax.set_zlabel('Spending Score (1-100)')
plt.show()


We can also get a good idea of how the different segments differ by calculating the
average values of the three variables for each segment as well as a count of the number
of clients in each segment. This can be done with the groupby() method.

# Check the count and mean values of all three variables for the different segments
round(df.groupby(by="label")\
    .agg({"CustomerID":"count","Age":"mean","Annual Income (k$)":"mean","Spending Score (1-1
    .reset_index()\
    .rename(columns={"label":"Segment","CustomerID":"No.of Clients"}))
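Conceptually, the groupby() step splits the rows by label, then counts and averages within each group. The sketch below mirrors that logic in plain Python on a few invented rows; the segment_summary helper and the shortened column names are assumptions for illustration.

```python
from collections import defaultdict

def segment_summary(rows, key, fields):
    """Count the rows in each group and average the requested fields."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    summary = {}
    for group, members in sorted(groups.items()):
        summary[group] = {"count": len(members)}
        for field in fields:
            summary[group][field] = round(sum(m[field] for m in members) / len(members), 2)
    return summary

# Hypothetical rows, not taken from the dataset
rows = [
    {"label": 0, "Age": 20, "Income": 25},
    {"label": 0, "Age": 30, "Income": 35},
    {"label": 1, "Age": 50, "Income": 80},
]
print(segment_summary(rows, "label", ["Age", "Income"]))
```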

The results provide a lot of interesting insights. For example, clients in Segment 0 are
the youngest, with a low income but high spending score, while clients in Segment 4
are the oldest, with a low income and low spending score. Segment 2 has the largest
number of clients, with moderate incomes and moderate spending scores. We can
summarise the different segments as follows:

Segment 0: young, with low incomes and high spending

Segment 1: middle-aged, with high incomes and high spending

Segment 2: older, with moderate incomes and moderate spending

Segment 3: middle-aged, with high incomes and low spending

Segment 4: older, with low incomes and low spending

Using this information, we can go a step further by creating different personas for the
different segments. Then based on their unique characteristics, we can apply the
appropriate growth strategies which may include loyalty, referral, upselling, and
incentive programs among several others.

And with that, we come to the end of this tutorial. I hope that you learned something
new. If you have any questions or comments, please be sure to leave a note in the
comments section. Thank you very much for reading and all the best in your data
journey.
