
Lab 8-DA

Lab 8 focuses on clustering analysis using the 'Online Retail.xlsx' dataset. It involves calculating RFM values for customers based on their transaction history, identifying customer segments using the elbow method, and applying both K-means and Agglomerative clustering algorithms. The lab also includes visualizing the clusters and comparing the results from both clustering methods.

Uploaded by

batmanflyinsky

Lab 8 – SECTION B BATCH 2 Date: 06th October 2023

Exer 1: Clustering
Download the data set “Online Retail.xlsx” from
https://archive.ics.uci.edu/ml/datasets/online+retail

a. Read the metadata and write a summary of it.
b. Select only the transactions that occurred between 01/04/2011 and
09/12/2011 and create a dataset from them.
c. Calculate the RFM values for each customer (by customer id). RFM
represents:
   1. R (Recency): the number of months since the customer's most recent
      purchase from the online store, counted back from December 2011. A
      purchase in December 2011 gives Recency 0, a purchase in November 2011
      gives Recency 1, and so on.
   2. F (Frequency): the number of invoices for the customer between
      01/04/2011 and 09/12/2011.
   3. M (Monetary Value): the total spend by the customer between
      01/04/2011 and 09/12/2011.
d. Use the elbow method to identify how many customer segments exist, using
the RFM values for each customer.
e. Create the customer segments with the K-means algorithm, using the number
of clusters suggested by the elbow method.
   from sklearn.cluster import KMeans
f. Plot the clusters in a scatter plot and mark each segment differently using
lmplot.
g. Print the cluster centers of each customer segment and explain them
intuitively.
h. Create the customer segments with the Agglomerative algorithm, using the
number of clusters suggested by the elbow method.
   from sklearn.cluster import AgglomerativeClustering
i. Visualize the clusters using a dendrogram.
j. Compare the clusters obtained using KMeans vs. Agglomerative clustering.
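
The RFM and clustering steps above could be sketched roughly as follows. This is a minimal sketch, not the expected lab solution, and it rests on several assumptions: the dates are read as dd/mm/yyyy (1 April 2011 to 9 December 2011), the column names (InvoiceNo, InvoiceDate, Quantity, UnitPrice, CustomerID) are the ones in the UCI file, the features are standardised before clustering (the exercise does not ask for this), and the adjusted Rand index is used as one possible way to compare the two labelings. The helper names compute_rfm, elbow_inertias and cluster_both are hypothetical, and the small demo table is made up for illustration.

```python
import pandas as pd
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Window from the exercise, read as dd/mm/yyyy (assumption).
START, END = pd.Timestamp("2011-04-01"), pd.Timestamp("2011-12-09")


def compute_rfm(df):
    """One row per CustomerID with Recency (months), Frequency, Monetary."""
    df = df.dropna(subset=["CustomerID"])
    # END is treated as inclusive for the whole day.
    in_window = (df["InvoiceDate"] >= START) & (
        df["InvoiceDate"] < END + pd.Timedelta(days=1)
    )
    df = df.loc[in_window].copy()
    df["Amount"] = df["Quantity"] * df["UnitPrice"]
    # Calendar months back from December 2011: Dec -> 0, Nov -> 1, ...
    dates = df["InvoiceDate"].dt
    df["MonthsAgo"] = (END.year - dates.year) * 12 + (END.month - dates.month)
    return df.groupby("CustomerID").agg(
        Recency=("MonthsAgo", "min"),        # most recent purchase
        Frequency=("InvoiceNo", "nunique"),  # distinct invoices
        Monetary=("Amount", "sum"),          # total spend
    )


def elbow_inertias(X, k_values):
    """Within-cluster sum of squares for each candidate k (plot to find the elbow)."""
    return [
        KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
        for k in k_values
    ]


def cluster_both(rfm, k):
    """Fit K-means and Agglomerative clustering on standardised RFM values."""
    cols = ["Recency", "Frequency", "Monetary"]
    scaler = StandardScaler()
    X = scaler.fit_transform(rfm[cols])
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    agglo = AgglomerativeClustering(n_clusters=k, linkage="ward").fit(X)
    # Cluster centers mapped back to the original RFM units for interpretation.
    centers = pd.DataFrame(
        scaler.inverse_transform(kmeans.cluster_centers_), columns=cols
    )
    ari = adjusted_rand_score(kmeans.labels_, agglo.labels_)
    return kmeans.labels_, agglo.labels_, centers, ari


# Made-up transactions, only to show the RFM calculation.
demo = pd.DataFrame({
    "InvoiceNo": ["1", "1", "2", "3", "4", "5"],
    "CustomerID": [10, 10, 10, 20, 20, 30],
    "InvoiceDate": pd.to_datetime([
        "2011-12-01", "2011-12-01", "2011-10-15",
        "2011-11-20", "2011-03-15", "2011-04-01",
    ]),
    "Quantity": [2, 1, 3, 5, 1, 4],
    "UnitPrice": [1.5, 4.0, 2.0, 1.0, 9.0, 2.5],
})
rfm = compute_rfm(demo)
print(rfm)

# With the real file (not run here):
# rfm = compute_rfm(pd.read_excel("Online Retail.xlsx"))
# print(elbow_inertias(StandardScaler().fit_transform(rfm), range(1, 11)))
```

The sketch leaves cancelled invoices (InvoiceNo starting with "C", negative Quantity) in the data; whether to drop them is a modelling choice the exercise does not specify. For the plotting steps, seaborn's lmplot with hue set to the cluster label and fit_reg=False gives a coloured scatter plot, and scipy.cluster.hierarchy.linkage together with dendrogram can draw the tree for the agglomerative result.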
