Marketing & Retail Analytics - Report - Part A
By Ashish Agrawal
• Agenda -
• The agenda of this project is to find the underlying buying patterns of the customers of an
automobile parts manufacturer, based on the past 3 years of the company's transaction
data, and to recommend customized marketing strategies for different segments of
customers.
• We have received 3 years of data from the automobile parts manufacturer, consisting of 2747
entries with 20 variables that cover product and customer details.
Content
Problem Statement.
Data Summary.
Exploratory Analysis and Inferences.
• Univariate analysis.
• Bivariate analysis.
• Multivariate analysis.
• Time series & Trends in Sales.
Customer Segmentation using RFM analysis.
KNIME Workflow image.
Output table head For RFM Analysis.
Inferences from RFM Analysis and identified segments.
Recommendation
Problem Statement:
An automobile parts manufacturing company has collected data of transactions for 3 years. They do not have
any in-house data science team, thus they have hired you as their consultant. Your job is to use your magical
data science skills to provide them with suitable insights about their data and their customers.
• Data Summary –
• The data comes from an automobile parts manufacturing company, which has provided 3 years of
transaction data.
• The data has 2747 rows and 20 columns: 1 datetime64, 2 float64, 5 int64, and 12 object columns.
There are no missing values in the data set.
• The data broadly reflects the purchasing behavior of customers across different categories. The company
manufactures automobile parts and has different product lines such as classic cars, motorcycles, planes, trains,
ships, trucks and buses, and vintage cars.
• Each transaction entry is recorded under an order number, and for each order number the data holds the required
information: customer identity details and product details such as price, quantity, product code, and sales.
• We noticed that one order number can have many entries, each with a different product code.
• A Manufacturer's Suggested Retail Price (MSRP) is set for each product code, but we found that the price of each
item does not match it and is inconsistent with the MSRP.
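The last two observations can be checked with a short script. The following is a minimal sketch on invented toy rows, not the company's data; the field names (order number, product line, price each, MSRP) are assumptions based on the variables named in this report.

```python
from collections import defaultdict

# Toy rows invented for illustration:
# (order_number, product_line, price_each, msrp)
rows = [
    (10100, "Classic Cars", 90.0, 100.0),
    (10100, "Vintage Cars", 110.0, 100.0),
    (10101, "Classic Cars", 80.0, 100.0),
    (10101, "Trains", 60.0, 50.0),
    (10101, "Vintage Cars", 105.0, 100.0),
]

# Number of line items recorded under each order number.
lines_per_order = defaultdict(int)
for order_number, _, _, _ in rows:
    lines_per_order[order_number] += 1

# Average ratio of price to MSRP per product line.
# A ratio below 1 means the line sold under MSRP on average.
ratios = defaultdict(list)
for _, product_line, price_each, msrp in rows:
    ratios[product_line].append(price_each / msrp)
avg_ratio = {line: sum(r) / len(r) for line, r in ratios.items()}

for line, ratio in sorted(avg_ratio.items()):
    note = "below MSRP" if ratio < 1 else "at or above MSRP"
    print(f"{line}: {ratio:.2f} ({note})")
```

On the real data, the same ratio per product line would show which lines are discounted and which are sold at a premium.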
Exploratory Analysis and Inferences.
Univariate analysis.
We plotted boxplots of the sales and quantity ordered variables. Outliers are clearly present
in both.
We also plotted a histogram of the sales variable.
For categorical variables such as product line, we did univariate analysis using bar plots.
Sales of classic car products are the highest, followed by vintage car products.
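The outliers seen on a boxplot can also be listed numerically. Below is a minimal sketch using the conventional 1.5 x IQR whisker rule; the sales values are hypothetical, and the quartile method of the standard library may differ slightly from the one a plotting tool uses.

```python
import statistics

# Hypothetical sales values, for illustration only.
sales = [1200.0, 1500.0, 1800.0, 2100.0, 2400.0, 2700.0, 3000.0, 14000.0]

# Quartile cut points (Q1, median, Q3).
q1, _, q3 = statistics.quantiles(sales, n=4)
iqr = q3 - q1

# Points beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [s for s in sales if s < lower or s > upper]
print(outliers)
```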
Bivariate analysis.
Multivariate analysis.
Using the MSRP, Price Each, status, sales and product line variables, we did multivariate analysis. For this we used
horizontal bar, tree map, stacked bar and scatter plots.
Sales are high for classic cars, and the company has even sold them below MSRP, so it may have given larger
discounts to these customers. The reverse holds for vintage cars, which the company has sold above MSRP.
Ships, vintage cars and trains have been sold above MSRP. In the given data, almost all transactions have
been shipped.
Time series & Trends in Sales.
Yearly, quarterly, monthly and weekly time series and their trends are shown. Sales in the last quarter are high
compared with the other quarters, which indicates seasonality.
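The quarterly comparison can be sketched as a simple aggregation. The dates and amounts below are invented for illustration and are not taken from the data set.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (order_date, sales) pairs, for illustration only.
transactions = [
    (date(2004, 2, 10), 3000.0),
    (date(2004, 5, 7), 2500.0),
    (date(2004, 8, 20), 2800.0),
    (date(2004, 10, 16), 5200.0),
    (date(2004, 11, 5), 6100.0),
]

# Sum sales per (year, quarter).
quarterly = defaultdict(float)
for order_date, sales in transactions:
    quarter = (order_date.month - 1) // 3 + 1
    quarterly[(order_date.year, quarter)] += sales

# Quarter with the highest total sales.
best = max(quarterly, key=quarterly.get)
for key in sorted(quarterly):
    print(key, quarterly[key])
```

Repeating the same grouping by year, month or week gives the other time series views.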
Summary of the inferences
For categorical variables such as product line, we did univariate analysis using bar plots.
Using boxplots of sales against product line and deal size, we did bivariate analysis.
Using the MSRP, Price Each, status, sales and product line variables, we did multivariate analysis.
From the univariate, bivariate and multivariate analysis, demand is highest for classic cars,
followed by vintage cars, and lowest for trains.
Sales are high in the last quarter of the year, which shows seasonality.
Demand for classic cars is so high that the company has also sold these products below MSRP, giving
customers a good discount. For vintage cars, however, it has also sold above MSRP.
• Which tool was used?
-> As per your suggestion, we ignored the column "Days Since last order" and created a new column named Recency
as [today's date - order date].
In the data, the same order number is repeated for different product codes, so we take the count of each order
number as the frequency.
The SALES column gives the sales amount for each transaction. We aggregated SALES by sum to create a new
column named Monetary.
We then created three bins each for Recency, Frequency and Monetary using the percentile range (0, 0.25, 0.75, 1.0).
Based on these 3 bins, we considered 3 segments: High, Medium and Low.
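The steps above can be sketched outside KNIME as follows. This is an illustrative sketch, not the workflow itself: the rows, customer names and reference date are invented, frequency is read as the count of order-number entries per customer, and treating a small recency as "High" is an assumption about how the bins were oriented.

```python
from collections import defaultdict
from datetime import date
import statistics

# Hypothetical line items: (customer, order_number, order_date, sales).
rows = [
    ("A", 1, date(2005, 3, 1), 4000.0),
    ("A", 2, date(2005, 5, 20), 3000.0),
    ("A", 2, date(2005, 5, 20), 2000.0),
    ("B", 3, date(2004, 12, 1), 1000.0),
    ("B", 4, date(2005, 4, 1), 1500.0),
    ("C", 5, date(2004, 6, 1), 800.0),
    ("D", 6, date(2005, 1, 15), 1200.0),
    ("D", 6, date(2005, 1, 15), 1800.0),
    ("E", 7, date(2003, 11, 1), 500.0),
]
today = date(2005, 6, 1)  # assumed reference date for recency

last_order = {}
frequency = defaultdict(int)   # count of order-number entries per customer
monetary = defaultdict(float)  # sum of sales per customer
for customer, _, order_date, sales in rows:
    last_order[customer] = max(last_order.get(customer, order_date), order_date)
    frequency[customer] += 1
    monetary[customer] += sales

# customer -> (recency in days, frequency, monetary)
rfm = {c: ((today - last_order[c]).days, frequency[c], monetary[c])
       for c in last_order}

def cut_points(values):
    """Return the 25th and 75th percentile of the values."""
    q25, _, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return q25, q75

def label(value, q25, q75, higher_is_better=True):
    """Map a value to Low / Medium / High using the two cut points."""
    if value <= q25:
        return "Low" if higher_is_better else "High"
    if value <= q75:
        return "Medium"
    return "High" if higher_is_better else "Low"

r_cut = cut_points([v[0] for v in rfm.values()])
f_cut = cut_points([v[1] for v in rfm.values()])
m_cut = cut_points([v[2] for v in rfm.values()])

# Recency is reversed: fewer days since the last order is better.
segments = {
    c: (label(r, *r_cut, higher_is_better=False),
        label(f, *f_cut),
        label(m, *m_cut))
    for c, (r, f, m) in rfm.items()
}
for customer in sorted(segments):
    print(customer, rfm[customer], segments[customer])
```

Counting distinct order numbers per customer instead of entries is an equally plausible definition of frequency and would only change the loop that fills `frequency`.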
KNIME Workflow image
Output table head For RFM Analysis.
RFM summary Metric
On the basis of recency, frequency and monetary value, we have grouped our top customers. We have given the most
significance to the recency parameter, as these customers have recently purchased our products. The
RFM model also gives the most importance to recency.
Active – Topmost priority; resources and budget should be spent to retain them
At-Risk – Second priority after active customers; effort should go into retaining them
Inactive – Least priority; no budget or resources should be spent
Top customers
On the basis of recency, frequency and monetary value, we have grouped our top customers. We have given the most
significance to the recency parameter, as these customers have recently purchased our products. The RFM model also
gives the most importance to recency, so we have kept it as our first parameter for selecting top customers. These
customers should be kept on priority, and an active marketing strategy should be formulated to retain them and to
increase their spend through promotional activities and discounts.
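One way to make "recency first" concrete is to rank customers with recency as the leading sort key. The sketch below is an assumption about how such a ranking could be done, not the grouping used in the KNIME workflow; the RFM values are hypothetical.

```python
# Hypothetical RFM table: customer -> (recency_days, frequency, monetary).
rfm = {
    "A": (12, 3, 9000.0),
    "B": (61, 2, 2500.0),
    "C": (365, 1, 800.0),
    "D": (12, 2, 4000.0),
}

# Sort by recency first (fewest days), then by higher frequency,
# then by higher monetary value to break ties.
ranked = sorted(rfm, key=lambda c: (rfm[c][0], -rfm[c][1], -rfm[c][2]))
top_customers = ranked[:2]
print(top_customers)
```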
Active potential customers
These customers should also be kept on the radar, and some resources and budget should be spent to retain them.
They are among the most recent customers, so there is a good opportunity to increase their spend on products and
move them into the Gold customer category through promotional activities and discounts.
Least priority customers
On the basis of recency, frequency and monetary value, we have grouped our least priority customers. Recency again
carries the most significance, in line with the RFM model.
These customers should be given the least priority, and no resources or budget should be spent on marketing to win
them back, since all three RFM parameters are at their lowest.