Lab 8 focuses on clustering analysis using the 'Online Retail.xlsx' dataset. It involves calculating RFM values for customers based on their transaction history, identifying customer segments using the elbow method, and applying both K-means and Agglomerative clustering algorithms. The lab also includes visualizing the clusters and comparing the results from both clustering methods.
Lab 8-DA
Lab 8 – SECTION B BATCH 2 Date: 06th October 2023
Exercise 1: Clustering
Download the data set "Online Retail.xlsx" from https://archive.ics.uci.edu/ml/datasets/online+retail
a. Read the metadata and write a summary of it.
b. Select only the transactions that occurred between 01/04/2011 and 09/12/2011 and create a dataset from them.
c. Calculate the RFM values for each customer (by customer id). RFM represents:
   R (Recency): the number of months since the customer's most recent purchase from the online store. If the most recent purchase was made in December 2011, Recency is 0; if it was made in November 2011, Recency is 1; and so on.
   F (Frequency): the number of invoices for the customer between 01/04/2011 and 09/12/2011.
   M (Monetary Value): the total spend by the customer between 01/04/2011 and 09/12/2011.
d. Use the elbow method on the RFM values of each customer to identify how many customer segments exist.
e. Create the customer segments with the K-means algorithm, using the number of clusters suggested by the elbow method (from sklearn.cluster import KMeans).
f. Plot the clusters in a scatter plot using lmplot, marking each segment differently.
g. Print the cluster centers of each customer segment and explain them intuitively.
h. Create the customer segments with the Agglomerative algorithm, using the number of clusters suggested by the elbow method (from sklearn.cluster import AgglomerativeClustering).
i. Visualize the clusters using a dendrogram.
j. Compare the clusters obtained using KMeans with those obtained using Agglomerative clustering.
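The RFM definitions above can be sketched in pandas as follows. This is a minimal sketch and not a model solution. It assumes the dates are written day first (1 April 2011 to 9 December 2011), that the data uses the column names InvoiceNo, CustomerID, InvoiceDate, Quantity and UnitPrice, and that spend per row is Quantity * UnitPrice. The function name compute_rfm and the small sample frame are hypothetical and exist only for illustration.

```python
import pandas as pd


def compute_rfm(df, start="2011-04-01", end="2011-12-09"):
    """Return one row per customer with Recency, Frequency and Monetary.

    Hypothetical helper: column names and the day-first reading of the
    date window are assumptions, not part of the lab sheet.
    """
    start_ts = pd.Timestamp(start)
    end_ts = pd.Timestamp(end)

    # Keep rows inside the window (both days inclusive) that have a customer id.
    in_window = (df["InvoiceDate"] >= start_ts) & (
        df["InvoiceDate"] < end_ts + pd.Timedelta(days=1)
    )
    data = df.loc[in_window & df["CustomerID"].notna()].copy()

    # Assumed definition of spend for a single transaction line.
    data["Amount"] = data["Quantity"] * data["UnitPrice"]

    # year * 12 + month gives an index whose differences are calendar months.
    data["MonthIndex"] = (
        data["InvoiceDate"].dt.year * 12 + data["InvoiceDate"].dt.month
    )
    ref_month = end_ts.year * 12 + end_ts.month

    rfm = data.groupby("CustomerID").agg(
        LastMonth=("MonthIndex", "max"),      # month of most recent purchase
        Frequency=("InvoiceNo", "nunique"),   # distinct invoices
        Monetary=("Amount", "sum"),           # total spend
    )
    # December 2011 -> 0, November 2011 -> 1, and so on.
    rfm["Recency"] = ref_month - rfm["LastMonth"]
    return rfm[["Recency", "Frequency", "Monetary"]]


# Tiny made-up sample, only to show the shape of the result.
sample = pd.DataFrame(
    {
        "InvoiceNo": ["A1", "A1", "A2", "B1", "C1"],
        "CustomerID": [1.0, 1.0, 1.0, 2.0, 3.0],
        "InvoiceDate": pd.to_datetime(
            ["2011-11-05", "2011-11-05", "2011-12-01", "2011-09-20", "2011-03-15"]
        ),
        "Quantity": [2, 1, 3, 5, 1],
        "UnitPrice": [1.5, 4.0, 2.0, 1.0, 9.0],
    }
)
rfm = compute_rfm(sample)
print(rfm)
```

With the real file, the input frame would typically come from pd.read_excel("Online Retail.xlsx"). How to treat rows with negative quantities is a choice the sketch does not make; it simply sums whatever falls inside the window. The resulting three columns are the features passed to the elbow method and to the two clustering algorithms in the later steps.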