
UNSUPERVISED MACHINE LEARNING
(CUSTOMER SEGMENTATION)
ONLINE RETAIL
INTRODUCTION
1. The main goal is to identify the most profitable customers and those who have churned, in order to prevent further customer loss by redefining company policies.
2. CLUSTER ANALYSIS: statistically segment customers into groups using the features described below.

Data Description

Attribute     Data Type   Description
InvoiceNo     Nominal     6-digit unique number assigned to each transaction; if it starts with the letter 'C', it indicates a cancellation
StockCode     Nominal     5-digit unique number assigned to each distinct product
Description   Nominal     Product (item) name
Quantity      Numeric     Quantity of each product (item) per transaction
InvoiceDate   Numeric     Date and time when each transaction was generated
UnitPrice     Numeric     Product price per unit, in sterling
CustomerID    Nominal     5-digit unique number assigned to each customer
Country       Nominal     Name of the country where each customer resides
IMPORTING AND INSPECTING DATASET
Dataset name: Online Retail
Number of observations: 541908 (shape = 541908 x 8)
dtypes: datetime64 (1), float64 (2), int64 (1), object (4); 1 + 2 + 1 + 4 = 8 columns

Data Cleaning

Checking missing data:
1. CustomerID: 135080 missing values (25%)
2. Description: 1454 missing values (0.27%)
These rows carry no usable information and can be dropped.

Checking duplicates:
5268 duplicated data points were found and dropped.

Number of observations left: 401604 (shape = 401604 x 8); a pandas sketch of these steps follows.
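A minimal pandas sketch of the cleaning steps above (the file path and the use of read_excel are assumptions; the column names follow the data description):

```python
import pandas as pd

# Load the raw data (exact path/format is an assumption).
df = pd.read_excel("Online Retail.xlsx")

print(df.isnull().sum())               # CustomerID and Description contain missing values
df = df.dropna(subset=["CustomerID"])  # drop the ~25% of rows with no CustomerID
df = df.drop_duplicates()              # drop the duplicated data points
print(df.shape)                        # observations remaining after cleaning
```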
FEATURE ENGINEERING

1. Extracted Year, Date, and Month from the InvoiceDate column.
2. Added feature 'TotalAmount' by multiplying the Quantity and UnitPrice columns (in sterling).
3. Added feature 'TimeType' based on the hour of the transaction, labelling it Morning, Afternoon, or Evening.
4. Dropped rows whose InvoiceNo starts with 'C', which represents a cancellation (see the sketch below).
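A sketch of these feature-engineering steps (column names follow the data description; the Morning/Afternoon/Evening hour boundaries are assumptions, since the deck does not state them):

```python
# Extract calendar parts from the invoice timestamp.
df["Year"] = df["InvoiceDate"].dt.year
df["Month"] = df["InvoiceDate"].dt.month
df["Date"] = df["InvoiceDate"].dt.day

# Revenue per transaction line, in sterling.
df["TotalAmount"] = df["Quantity"] * df["UnitPrice"]

# Label the time of day (cut-off hours are assumptions).
def time_type(hour):
    if hour < 12:
        return "Morning"
    elif hour < 17:
        return "Afternoon"
    return "Evening"

df["TimeType"] = df["InvoiceDate"].dt.hour.map(time_type)

# Drop cancellations: invoice numbers starting with 'C'.
df = df[~df["InvoiceNo"].astype(str).str.startswith("C")]
```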


MOST FREQUENT VALUES

Observations/Hypotheses
1. Most customers are from the United Kingdom. A considerable number of customers are also from Germany, France, EIRE, and Spain.
2. There are no orders placed on Saturdays; it appears to be a non-working day for the retailer.
3. Most customers purchased gifts in November, October, December, and September. Fewer customers purchased gifts in April, January, and February.
4. Most customers purchased items in the afternoon, a moderate number in the morning, and the fewest in the evening.
5. WHITE HANGING HEART T-LIGHT HOLDER, REGENCY CAKESTAND 3 TIER, and JUMBO BAG RED RETRO SPOT are the most ordered products.
LEAST FREQUENT VALUES

Observations/Hypotheses
1. Saudi Arabia, Bahrain, the Czech Republic, Brazil, and Lithuania have the fewest customers.

2. GREEN WIT METAL BAG CHARM, WHITE WITH METAL BAG CHARM, BLUE/NAT SELL NECLACE W PENDENT, PINK EASTER ENS FLOWER, and PAPER CRAFT LITTLE BIRDIE are among the least-sold products.
COUNTRY WISE ORDERS
COUNTRY WISE CUSTOMERS
COUNTRY WISE PURCHASE QUANTITY
PRODUCT WISE PURCHASE QUANTITY
PRODUCT WISE REVENUE
PRODUCT WISE CUSTOMERS
CUSTOMER WISE CANCELLATIONS
COUNTRY WISE CANCELLATIONS
VISUALIZING DISTRIBUTIONS

1. Visualizing the distributions of the Quantity, UnitPrice, and TotalAmount columns.

2. Each shows a positively skewed distribution: most values cluster on the left side while the right tail is longer, which means mean > median > mode.

3. For a symmetric distribution, mean = median = mode.


LOG TRANSFORMATION

1. After applying a log transformation, the distribution plots look considerably less skewed.

2. We use a log transformation when the original continuous data does not follow a bell curve; log-transforming makes the data as close to "normal" as possible, so that analyses that assume normality give more valid results.
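A hedged sketch of the transformation (the deck does not say which log variant was used; log1p is chosen here because it is defined at zero):

```python
import numpy as np

# Log-transform the skewed columns; log1p computes log(1 + x).
for col in ["Quantity", "UnitPrice", "TotalAmount"]:
    df[col + "_log"] = np.log1p(df[col])
```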
RFM ANALYSIS

Recency: how recently the customer visited.
Frequency: how frequently the customer visits.
Monetary: money spent by the customer.
RFM MODELLING

RFM TABLE
Customer Name   Recency   Frequency   Monetary
Anthony         326       15          7183
Rahul           2         182         4310
Syed            75        31          1765
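An RFM table like the one above is typically built by aggregating per customer; a sketch (the snapshot-date convention is an assumption):

```python
# Reference date: one day after the last transaction in the data.
snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)

rfm = df.groupby("CustomerID").agg(
    Recency=("InvoiceDate", lambda d: (snapshot - d.max()).days),  # days since last visit
    Frequency=("InvoiceNo", "nunique"),                            # number of distinct invoices
    Monetary=("TotalAmount", "sum"),                               # total spend in sterling
)
```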

CONCLUSIONS:

Anthony: visited 326 days ago (approx. 1 year), visited 15 times, and spent around 7183 sterling: a lost potential customer.

Rahul: visited 2 days ago, visited 182 times, and spent around 4310 sterling: a recently visited potential customer.

Syed: visited 75 days ago (2.5 months), visited 31 times, and spent around 1765 sterling: an about-to-lose average customer.
RFM MODELLING

1. Earlier, the distributions of the Recency, Frequency, and Monetary columns were positively skewed; after applying a log transformation, the distributions appear symmetrical and approximately normal.

2. The transformed features are therefore more suitable for visualizing the clusters.
RFM CORRELATION HEATMAP
1. We can see that Recency is highly correlated with the RFM value.

2. Frequency and Monetary are moderately correlated with the RFM value.

SCALING FOR CLUSTERING ANALYSIS

1. Log transformation of the features Recency, Frequency, and Monetary.
2. Standard scaling of the X variables to mean 0 and standard deviation 1.

These steps are followed by the clustering analysis and modelling (see the sketch below).
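A minimal sketch of this scaling step, assuming the RFM table built earlier:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Log-transform, then standardize each feature to mean 0 and std 1.
X = np.log1p(rfm[["Recency", "Frequency", "Monetary"]])
X_scaled = StandardScaler().fit_transform(X)
```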
PIPELINE

1. EXTRACTING DATA: Online Retail dataset, 541908 observations (shape = 541908 x 8).
2. DATA CLEANING: checking missing data (CustomerID: 135080 values, i.e. 25%; Description: 1454 values) and duplicates (5268 duplicated data points); 401604 data points left.
3. DATA VISUALIZATION
4. RFM ANALYSIS: condition for the best customers: Recency must be LOW, while Frequency and Monetary must be HIGH.
5. MODELLING: Binning (RFM score), Binning (RFM combination), K-Means, Hierarchical Clustering, DBSCAN.
6. CUSTOMER SEGMENTATION
7. CONCLUSION
BINNING RFM SCORES
(Cluster plot axes: Recency, Frequency, Monetary)

GROUP 1: LOST POOR CUSTOMERS
GROUP 2: AVERAGE CUSTOMERS
GROUP 3: GOOD CUSTOMERS
GROUP 4: BEST CUSTOMERS


QUANTILE CUT
(Cluster plot axes: Recency, Frequency, Monetary)

GROUP 1: LOST POOR CUSTOMERS
GROUP 2: LOSING LOYAL CUSTOMERS
GROUP 3: GOOD CUSTOMERS
GROUP 4: BEST CUSTOMERS
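A quantile-cut sketch using pd.qcut to score each RFM dimension into four quartile bins (the label ordering and the rank trick for tied frequencies are assumptions; lower recency scores higher):

```python
# Score each dimension 1-4; reverse the labels for Recency since lower is better.
rfm["R_score"] = pd.qcut(rfm["Recency"], 4, labels=[4, 3, 2, 1])
rfm["F_score"] = pd.qcut(rfm["Frequency"].rank(method="first"), 4, labels=[1, 2, 3, 4])
rfm["M_score"] = pd.qcut(rfm["Monetary"], 4, labels=[1, 2, 3, 4])

# Combined score used for binning customers into groups.
rfm["RFM_score"] = rfm[["R_score", "F_score", "M_score"]].astype(int).sum(axis=1)
```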


K-MEANS CLUSTERING

1. From the elbow curve, 5 appears to lie at the elbow and can be taken as the number of clusters; n_clusters=4 or 6 could also be considered.

2. If we use the maximum silhouette score as the criterion for selecting the optimal number of clusters, then n_clusters=2 would be chosen.

3. Looking at both graphs together, 4 appears to be a good choice: it has a decent silhouette score and lies near the elbow of the elbow curve (a computation sketch follows).
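A sketch of how the elbow curve and silhouette scores are computed (standard scikit-learn calls; the range of k is an assumption):

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

inertias, silhouettes = [], []
for k in range(2, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled)
    inertias.append(km.inertia_)                                 # for the elbow curve
    silhouettes.append(silhouette_score(X_scaled, km.labels_))   # for the silhouette plot
```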
K-MEANS | 2 CLUSTERS
(Cluster plot axes: Recency, Frequency, Monetary)

GROUP 0: BEST CUSTOMERS
GROUP 1: LOST POOR CUSTOMERS
K-MEANS | 5 CLUSTERS
(Cluster plot axes: Recency, Frequency, Monetary)

GROUP 0: LOST POOR CUSTOMERS
GROUP 1: BEST CUSTOMERS
GROUP 2: RECENTLY VISITED AVERAGE CUSTOMERS
GROUP 3: LOSING LOYAL CUSTOMERS
GROUP 4: AVERAGE CUSTOMERS


K-MEANS | 4 CLUSTERS
(Cluster plot axes: Recency, Frequency, Monetary)

GROUP 0: LOSING LOYAL CUSTOMERS
GROUP 1: BEST CUSTOMERS
GROUP 2: LOST POOR CUSTOMERS
GROUP 3: RECENTLY VISITED AVERAGE CUSTOMERS
HIERARCHICAL CLUSTERING
In K-means clustering, the number of clusters must be predetermined, and the algorithm tends to create clusters of similar size. To address these two challenges, we can opt for the hierarchical clustering algorithm, because it does not require a predefined number of clusters. Hierarchical clustering is based on two techniques:

a. Agglomerative: a bottom-up approach, in which the algorithm starts by taking each data point as a single cluster and merges clusters until only one is left.

b. Divisive: the reverse of the agglomerative algorithm, i.e. a top-down approach.

We defined the optimal number of clusters based on the dendrogram shown here; a sketch of building it follows.
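A sketch of building that dendrogram (Ward linkage is a common default; the deck does not state which linkage method was used):

```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage

Z = linkage(X_scaled, method="ward")       # agglomerative, bottom-up merging
dendrogram(Z, truncate_mode="level", p=5)  # truncated for readability
plt.show()
```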
HIERARCHICAL | 2 CLUSTERS
(Cluster plot axes: Recency, Frequency, Monetary)

GROUP 0: AVERAGE CUSTOMERS
GROUP 1: BEST CUSTOMERS
HIERARCHICAL | 3 CLUSTERS
(Cluster plot axes: Recency, Frequency, Monetary)

GROUP 0: BEST CUSTOMERS
GROUP 1: LOSING LOYAL CUSTOMERS
GROUP 2: LOST POOR CUSTOMERS
DBSCAN
(Cluster plot axes: Recency, Frequency, Monetary)

GROUP -1: AVERAGE CUSTOMERS (DBSCAN labels noise points as -1)
GROUP 0: LOST POOR CUSTOMERS
GROUP 1: GOOD CUSTOMERS
GROUP 2: LOSING LOYAL CUSTOMERS
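A DBSCAN sketch on the scaled RFM features (eps and min_samples are assumptions; the deck does not report the values used):

```python
from sklearn.cluster import DBSCAN

# Density-based clustering; points in no dense region get label -1 (Group -1 above).
db = DBSCAN(eps=0.5, min_samples=10).fit(X_scaled)
rfm["DBSCAN_label"] = db.labels_
```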


SUMMARY

▪ We started with simple binning- and quantile-based segmentation models first and then moved to more complex models, because a simple implementation gives a first glance at the data and shows where and how to exploit it better.
▪ We then moved to K-means clustering and visualized the results with different numbers of clusters. Since there is no assurance that K-means will reach the globally best solution, we also tried hierarchical clustering and DBSCAN.
▪ We created several useful clusters of customers, using different metrics and methods to categorize customers by their behavioral attributes and thereby define their value, loyalty, profitability, etc. for the business. Though clearly separated clusters are not visible in the plots, the clusters obtained are fairly valid and useful according to the algorithms and the statistics extracted from the data.
▪ The choice of segments depends on how the business plans to use the results and on the level of granularity it wants in the clusters. Keeping these points in view, we clustered the major segments based on our understanding, per the different criteria shown in the summary dataframe.
FINAL CONCLUSION
CUSTOMER SEGMENTS OBTAINED FROM CLUSTERING ANALYSIS
