03 Segmenting Stores Using Clustering - SAC
Clustering
Authors:
Nitin Kalé, University of Southern California
Nancy Jones, San Diego State University
Revised:
Liz Simmons, July 2022
OBJECTIVE
The objective of this exercise is to segment retail stores based on various attributes to help with sales
promotions.
ACTIVITIES
• Import and prepare data.
• Apply Smart Grouping cluster analysis.
• Merge data.
• Create data visualizations.
• Analyze and interpret output from models.
SOFTWARE PREREQUISITES
• SAP Analytics Cloud
• Microsoft Excel
DATA SET
Data file titled Stores.csv
Scenario
The Country Manager of a retail chain (which has 150 stores) is finalizing plans for three sales
promotion strategies. Data pertaining to the stores such as store location, sales turnover, store
size, staff, and profit margin are stored in a CSV file. The manager wants to segment the 150
stores into three different groups based on sales turnover, profit margin, store size, and staff
size so specific strategies can be applied to each store segment. You will use clustering of retail
stores data to assist the manager in developing promotion strategies.
Cluster Analysis
Given a dataset, organizing it into meaningful groups is a basic and useful approach to data
mining and data analysis. Clustering assigns samples to groups using a measure of
association so that data points within a group are similar and data points in different groups
are dissimilar. Data points are multidimensional; that is, they consist of several variables.
Humans cannot practically visualize datasets with more than three dimensions.
The inputs to a clustering exercise are a dataset and the number of clusters. The result of the
analysis is a set of clusters. K-means clustering is a method of finding clusters and their
centers (R) given a chosen number of clusters (K). It is often used for market
segmentation. The goal is to make the inter-cluster difference (distance) high and the
intra-cluster difference (distance) low.
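To make the K-means idea concrete, the following is a minimal sketch in plain Python. It is not how SAC implements Smart Grouping. The points are made up for illustration and do not come from Stores.csv, and the helper names (farthest_first, kmeans) are hypothetical. The sketch alternates two steps until the centers stop moving: assign each point to its nearest center, then move each center to the mean of its assigned points.

```python
import math


def farthest_first(points, k):
    """Pick k starting centers deterministically (an assumption of this sketch):
    each new center is the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iterations=100):
    """Minimal Lloyd's algorithm; returns (centers, labels)."""
    centers = farthest_first(points, k)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: label each point with the index of its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of the points assigned to it.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:  # nothing moved, so the assignment is stable
            break
        centers = new_centers
    return centers, labels


# Made-up 2D points forming three tight groups (illustrative only).
offsets = [(-0.2, 0.1), (0.0, -0.2), (0.2, 0.0), (0.1, 0.2)]
anchors = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0)]
points = [(ax + dx, ay + dy) for ax, ay in anchors for dx, dy in offsets]

centers, labels = kmeans(points, k=3)
print(centers)
print(labels)
```

With real store data, the variables sit on very different scales (for example, sales turnover versus staff size), so in practice they are usually rescaled before distances are computed. The sketch skips that step for brevity.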
a. Insert Chart.
b. Select Bubble Chart from the Correlation charts.
c. Configure the Chart Structure as follows:
(1) + Add Sales Turnover to the X-Axis.
(2) + Add Staff Size to the Y-Axis.
(3) + Add Profit Margin to Size.
(4) + Add Store to the Dimensions.
(5) + Add a Tooltip Measure as shown in Figure 1. You can find Add Tooltip after
clicking the three-dots icon next to Chart Structure.
(6) Tooltip Measures will now show as a Chart Structure option. + Add Store Size
to Tooltip Measures.
(7) You will now see a Bubble chart of the first three measures by Store.
Figure 2: A Bubble Chart of Store Data
Figure 3: Configure Smart Grouping
5. The clusters in the default monochromatic color scheme tend to blend together, so you
may want to change the Color palette. You should now see three distinct groups
(clusters) in your chart. You can filter on the clusters by clicking the cluster number you
wish to examine.
Question 1: Add your name to the title of the clustered Bubble chart and
submit a screenshot of the chart.
(1) Notice that SAC Smart Grouping will continue to break down the filtered data set
into even smaller clusters. You can ignore these new groups.
(2) Select Export from the chart dropdown list.
Figure 5: Export the Clustered Data
(3) Name the .csv file “Cluster_1”. The data from Cluster 1 will be downloaded to
your computer.
b. Repeat these steps for Clusters 2 and 3 and name the files “Cluster_2”
and “Cluster_3” respectively.
NOTE: Be sure to remove the chart filter (click the X to the right of 1 Filter in the
header) and replace it with the next cluster number before downloading the data.
You should have three downloaded .csv files.
c. Now you will prepare the cluster data for integration with the Stores data model in
SAC.
(1) The first step is to clean up the header information so it is only one row. Open the
Cluster_1.csv file.
(i) Move the content of cells B1:D1 to cells B2:D2.
(ii) Delete row 1.
(2) Next, add a column called “Cluster”.
(i) Add the cluster number to all the rows of data.
(3) Save the .csv file.
(4) You can see the results of your cleanup in the before and after figures.
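If you prefer to script the cleanup instead of editing each file in Excel, the steps above can be sketched in Python. This is a minimal sketch under stated assumptions: the exported file is assumed to have the measure names in row 1 (cells B1:D1) and the dimension name in cell A2. The sample values are made up, and the function name clean_cluster_export is hypothetical. Check your own export before relying on this layout.

```python
import csv
import io


def clean_cluster_export(text, cluster_number):
    """Merge the two header rows into one and add a Cluster column."""
    rows = list(csv.reader(io.StringIO(text)))
    top, second, data = rows[0], rows[1], rows[2:]
    # Same effect as moving B1:D1 down to B2:D2 and deleting row 1:
    # keep the row-2 label where one exists, otherwise use the row-1 label.
    header = [b if b else a for a, b in zip(top, second)]
    header.append("Cluster")
    # Add the cluster number to every data row.
    return [header] + [row + [str(cluster_number)] for row in data]


# Made-up sample that mimics the assumed export layout (not real store data).
sample = (
    ",Sales Turnover,Staff Size,Profit Margin\n"
    "Store,,,\n"
    "S001,120,8,0.21\n"
    "S002,95,6,0.18\n"
)

cleaned = clean_cluster_export(sample, cluster_number=1)
for row in cleaned:
    print(row)
```

To write the result back to disk, you could pass the cleaned rows to csv.writer on a file opened with newline="".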
(5) On the Save dropdown, select Open With Basic Data Preparation. This
will allow you to append the files for clusters 2 and 3.
Figure 11: Append a File
(9) Finish.
(10) Repeat the append for Cluster_3.csv.
(i) Now look at the data in the Clusters data set and you should find stores in all
three clusters and 150 rows.
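The same check can be sketched outside SAC. The example below appends three small, made-up cluster tables and verifies the row count, the set of cluster numbers, and that no store appears twice. The function name append_tables and all values are hypothetical. With the real files, you would expect 150 data rows rather than the 6 used here.

```python
def append_tables(tables):
    """Stack tables that share one header row; the header is kept once."""
    header = tables[0][0]
    combined = [header]
    for table in tables:
        if table[0] != header:
            raise ValueError("All files must have identical headers before appending")
        combined.extend(table[1:])
    return combined


header = ["Store", "Sales Turnover", "Staff Size", "Profit Margin", "Cluster"]
# Made-up rows standing in for the cleaned Cluster_1, Cluster_2, and Cluster_3 files.
cluster_1 = [header, ["S001", "120", "8", "0.21", "1"], ["S002", "95", "6", "0.18", "1"]]
cluster_2 = [header, ["S003", "310", "15", "0.12", "2"], ["S004", "290", "14", "0.11", "2"]]
cluster_3 = [header, ["S005", "40", "3", "0.35", "3"], ["S006", "55", "4", "0.33", "3"]]

combined = append_tables([cluster_1, cluster_2, cluster_3])
data_rows = combined[1:]

row_count = len(data_rows)
clusters_found = sorted({row[-1] for row in data_rows})
stores = [row[0] for row in data_rows]
has_duplicate_stores = len(stores) != len(set(stores))

print(row_count, clusters_found, has_duplicate_stores)
```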
f. Save.
2. To visualize the Stores and Clusters data, go to the Story view.
a. Add a new page with either a Canvas or a Responsive page.
b. Add a chart.
c. Add a Calculated Measure for Count of Stores as shown below:
g. Leave the chart as a Column chart. When you add variables to the chart, you can now
choose which data set to use. The data sets appear in a dropdown
when you add a Measure or Dimension. SAC calls this a blended data chart.
(1) Add Count of Stores from the Store data set to Measures.
(2) Add Cluster from the Clusters data set to Dimensions.
3. Create visualizations to answer the following questions:
Question 4: How do Average Profit Margin, Average Sales Turnover, and
Average Staff Size compare among the clusters?
Support your answers with a screenshot.
Hint: You will need to create Calculated Measures to determine
Averages.
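As a cross-check on what the Calculated Measures compute, a count and an average per cluster can be sketched in Python. The rows below are made up and the function name summarize_by_cluster is hypothetical. Profit margin is written as a whole-number percentage here only to keep the arithmetic easy to follow.

```python
from collections import defaultdict

# Made-up rows: (store, cluster, sales_turnover, staff_size, profit_margin_pct)
rows = [
    ("S001", 1, 100, 5, 20),
    ("S002", 1, 120, 7, 30),
    ("S003", 2, 300, 12, 10),
    ("S004", 2, 340, 16, 14),
    ("S005", 3, 50, 2, 40),
]


def summarize_by_cluster(rows):
    """Return the store count and the average of each measure per cluster."""
    groups = defaultdict(list)
    for _store, cluster, *measures in rows:
        groups[cluster].append(measures)
    summary = {}
    for cluster, members in sorted(groups.items()):
        n = len(members)
        # zip(*members) regroups the values by measure so each can be averaged.
        avg_sales, avg_staff, avg_margin = (sum(values) / n for values in zip(*members))
        summary[cluster] = {
            "count": n,
            "avg_sales": avg_sales,
            "avg_staff": avg_staff,
            "avg_margin": avg_margin,
        }
    return summary


summary = summarize_by_cluster(rows)
for cluster, stats in summary.items():
    print(cluster, stats)
```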