BOP Assignment
Problem Statement:
Blackberry is a major retailer, running multiple stores. They have a huge loyalty program with
approximately 250,000 members.
They have recently launched a new range of private label consumer durables and are looking to run a
marketing campaign covering all the customers who are part of the loyalty program. A mini campaign
was run on a random sample of approximately 22,000 customers who had signed up to the loyalty
program, and the purchase outcome (or otherwise) was recorded. The following information is available
in the sample dataset:
ID - Customer identification
Prosp - Prosperity grade on a scale of 1 to 30
Age - Age, in years
Code - House location code
Gender - Male/Female/Unknown
Region - Geographic region
TV - Television region
Class - Loyalty status
AmtSpent - Amount spent by customer
Time - Length of relationship
Buy - Purchase outcome (Yes/No)
Further it is known that the lifetime value of a new customer is $15,000 while the cost per capita of the
campaign is $4,420.
As it does not make sense to target the whole population of 220,000, we have to determine the best
subset to target based on the model built on the sample (top 10%/20%/30%, and so on).
To do so, the focus is to decile the sample by probability of purchase and use the performance in each
decile (% purchasers, % non-purchasers) to determine what percentage of the overall population should
be targeted to maximize profits, by applying the formula below:
Profit = (15,000 × # of customers in the top n deciles × cumulative % purch) - (4,420 × # of customers
in the top n deciles)
With the objective of maximizing profits, we are required to come up with an analytics-based targeting
plan, given the campaign economics stated above.
Approach:
6. Logistic Regression with the new Cut-off value
7. Hosmer and Lemeshow Test to determine the ideal Sample Set
Data Cleanup
If we look at the dataset, it covers the mini campaign run by Blackberry on a random sample of 22223
customers who signed up for the loyalty program, with the purchase outcome recorded for each. It holds
data about their customers and about some prospects who did not turn out to be buyers at the end of the
marketing campaign.
The information recorded for each customer covers prosperity grade, age, house location code, gender,
geographic region, TV region, loyalty status, amount spent, length of relationship, and the buying
decision, which tells us whether or not the customer purchased.
Code, Gender, Region, and TV Region are categorical variables.
As evident in the dataset, a number of data points in each category were found to be missing. Referring
to the Data Preparation instruction in the case, we replaced the missing values of numeric variables with
the mean calculated from the remaining data, and those of categorical variables with the mode. This
exercise was carried out with the help of pivot tables; an equivalent sketch in code is shown below.
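A minimal sketch of the same cleanup in Python with pandas (the file name is hypothetical and the column names follow the data dictionary above):

```python
# Mean/mode imputation, mirroring the pivot-table approach used in the report.
import pandas as pd

df = pd.read_csv("blackberry_sample.csv")  # hypothetical file name

numeric_cols = ["Prosp", "Age", "AmtSpent", "Time"]
categorical_cols = ["Code", "Gender", "Region", "TV", "Class"]

# Numeric variables: fill missing values with the mean of the remaining data.
for col in numeric_cols:
    df[col] = df[col].fillna(df[col].mean())

# Categorical variables: fill missing values with the mode (most frequent level).
for col in categorical_cols:
    df[col] = df[col].fillna(df[col].mode()[0])
```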
Logistic Regression
With the modified data, we started to actually build the model and ran the logistic regression.
The buying decision of the customer (Buy) is our dependent variable, and Prosp, Code, Gender, Region,
TV, Class, AmtSpent, Time, and Age are the covariates.
Code, Gender, Region, and TV are identified as categorical variables based on the responses in the data;
since they are alphanumeric, SPSS automatically codes them as categorical.
Next, we save the predicted probabilities to see how the model is panning out.
The default cut-off in SPSS is always 0.5: if a particular row has a predicted probability greater than or
equal to 0.5, it is classified as a positive outcome, in this case somebody who is potentially going to buy.
If the probability is less than 0.5, it is classified as a zero, i.e. somebody who is predicted not to buy. A
code sketch of this run is shown below.
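A minimal sketch of the same run in Python with scikit-learn, assuming the cleaned df from the imputation step; one-hot coding stands in for SPSS's automatic categorical coding, and Buy is assumed to hold "Yes"/"No":

```python
# Fit the logistic regression (Buy as dependent variable) and classify with the
# default 0.5 cut-off, mirroring the SPSS run described above.
import pandas as pd
from sklearn.linear_model import LogisticRegression

covariates = ["Prosp", "Age", "AmtSpent", "Time",
              "Code", "Gender", "Region", "TV", "Class"]
X = pd.get_dummies(df[covariates])    # dummy-code the categorical covariates
y = (df["Buy"] == "Yes").astype(int)  # 1 = buyer, 0 = non-buyer

model = LogisticRegression(max_iter=1000).fit(X, y)

# Save the predicted probabilities, then apply the default 0.5 cut-off.
df["pred_prob"] = model.predict_proba(X)[:, 1]
df["pred_buy"] = (df["pred_prob"] >= 0.5).astype(int)
```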
Block 0
In the output, the first block essentially tells us how many cases are there for the analysis and whether
there are any missing cases.
The next block tells us how the dependent variable has been coded internally.
The dependent variable, in this case, is 1 or 0, and internally it has also been coded as 1 or 0.
Next comes the categorical variable coding. Code, Region, TV, and Gender are categorical variables,
and we have kept the reference category last. Hence the last level of each variable is coded as 0 and the
other levels are coded with respect to it. For example, in Gender, "Unknown" is coded as 0 and Male
and Female are coded with respect to it; a small illustration follows below. The beginning block is
essential and describes the null model, that is, the model with no predictors. The accuracy we got for it
is 75.2 percent with the default cut-off value of 0.5.
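A sketch of the "reference category last" coding using Gender (assuming the df from the earlier steps), where "Unknown" is the last level:

```python
# Dummy-code Gender with the last level ("Unknown") as the reference: rows with
# Gender == "Unknown" become the all-zeros baseline, as in the SPSS output.
import pandas as pd

gender = pd.Categorical(df["Gender"], categories=["Female", "Male", "Unknown"])
dummies = pd.get_dummies(gender).drop(columns=["Unknown"])
# A row with Gender == "Unknown" is now coded (Female=0, Male=0).
```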
Block 1
The model summary in this block gives us the equivalents of the R square, which explain the proportion
of variation. The Cox and Snell R square (analogous to R square) comes out as 0.201 and the Nagelkerke
R square (analogous to adjusted R square) as 0.298.
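Both values follow from the standard Cox and Snell and Nagelkerke definitions, which need only the log-likelihoods of the null model (LL0) and the fitted model (LL1) plus the sample size n; a minimal sketch:

```python
# Standard pseudo R-square definitions, computed from model log-likelihoods.
import math

def pseudo_r2(ll0, ll1, n):
    # Cox & Snell: 1 - (L0 / L1)^(2/n), written with log-likelihoods.
    cox_snell = 1 - math.exp(2 * (ll0 - ll1) / n)
    # Nagelkerke rescales Cox & Snell so its maximum attainable value is 1.
    nagelkerke = cox_snell / (1 - math.exp(2 * ll0 / n))
    return cox_snell, nagelkerke
```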
The classification table tells us the actual number of buyers against the predicted number of buyers, and
the actual number of non-buyers against the predicted number of non-buyers. We also get the sensitivity
and specificity from this table. Right now it is telling us that our model has an accuracy of 80.3 percent
(with the default cut-off of 0.5).
ROC Curve:
We use the ROC curve to determine an optimal cut-off and refine our results. The saved probabilities
are the input for building the ROC curve; the Data View of SPSS can be checked to confirm that the
probabilities have been written back. The predicted probability reflects whether the new range of private
label consumer durables recently launched by Blackberry will be bought or not.
The ROC curve helps us understand how to balance the costs of false positives and false negatives.
Three parameters matter in understanding accuracy.
Overall accuracy is always the percentage of 1s plus the percentage of 0s correctly predicted: the number
of 1s correctly predicted plus the number of 0s correctly predicted, divided by the total number of
responses. But there are two other metrics that are of concern.
One is sensitivity and the other is specificity. The reason is that the overall accuracy is not always what
matters most. There are times when the percentage of 1s correctly predicted is much more important
than the overall accuracy, and similarly times when the percentage of 0s correctly predicted is more
important than the overall accuracy; the sketch below shows how all three are computed.
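A sketch of how all three metrics fall out of the classification table, assuming the y and df["pred_buy"] from the earlier steps:

```python
# Derive accuracy, sensitivity, and specificity from the confusion matrix.
from sklearn.metrics import confusion_matrix

tn, fp, fn, tp = confusion_matrix(y, df["pred_buy"]).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)  # 1s and 0s correctly predicted, overall
sensitivity = tp / (tp + fn)                # share of actual buyers caught
specificity = tn / (tn + fp)                # share of actual non-buyers caught
print(f"accuracy={accuracy:.3f}, sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```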
Output
In the output, the first block tells us the number of positives and the number of negatives, along with the
ROC curve itself.
The next block tells us the area under the curve, which shows how good the model is.
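The curve and the area under it can be reproduced from the saved probabilities; a sketch assuming y and df["pred_prob"] from the earlier steps:

```python
# Build the ROC curve from the saved probabilities and compute the AUC.
from sklearn.metrics import roc_curve, roc_auc_score

fpr, tpr, thresholds = roc_curve(y, df["pred_prob"])
auc = roc_auc_score(y, df["pred_prob"])
print(f"AUC = {auc:.3f}")  # closer to 1.0 means a better-discriminating model
```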
New Cut-off Value based on ROC Curve:
To refine our results, we took a new cut-off value from the ROC coordinates table, at the point where
the sensitivity and specificity curves intersect. At this intersection both are approximately 0.714, which
corresponds to a cut-off probability of p = 0.246. We then rebuilt the logistic regression and, instead of
0.5, kept 0.246 as the cut-off, and ran the model again.
We see that the overall accuracy dropped to 71.6%. However, since we want the number of 1s correctly
predicted to be as high as possible, and that count increased to 3,917, the new cut-off helps us target
more people who are likely to buy than before. This is how we balance using the ROC curve: we get a
better trade-off and capture more of the possible 1s. Anybody whose predicted probability is greater
than this number is considered a possible buyer and will be targeted by the marketing campaign. This is
how we balance the costs of false positives and false negatives.
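One way to locate such a cut-off programmatically is to scan the ROC coordinates for the point where sensitivity and specificity are closest to equal; this is a sketch of the intersection reading (assuming the fpr, tpr, and thresholds arrays from the ROC step), not the exact SPSS procedure:

```python
# Pick the threshold where sensitivity and specificity intersect.
import numpy as np

sensitivity = tpr          # sensitivity at each candidate threshold
specificity = 1 - fpr      # specificity at each candidate threshold
best = np.argmin(np.abs(sensitivity - specificity))
cutoff = thresholds[best]  # the report finds p = 0.246 for this data

# Reclassify with the new cut-off instead of the default 0.5.
df["pred_buy"] = (df["pred_prob"] >= cutoff).astype(int)
```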
Decile Calculation:
In the rebuilt model for decile calculation, we use the Hosmer and Lemeshow Test.
It works in such a way that, after the algorithm scores the data and creates the predicted probabilities,
the data is sorted in descending order of predicted probability and cut into 10 deciles (10%, 20%, 30%,
..., 100%). We then have, for each decile, an observed value and an expected value for both the zeros
and the ones.
We take the deciled data to Excel, where we find the buyer percentage and the segment size percentage,
and finally calculate the profitability; a code sketch of the deciling is shown below.
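A sketch of the same deciling in pandas (assuming the df["pred_prob"] saved from the rebuilt model):

```python
# Sort by predicted probability (descending), cut into 10 equal deciles, and
# compute buyer counts and the cumulative buyer percentage per decile.
import numpy as np

ranked = df.sort_values("pred_prob", ascending=False).reset_index(drop=True)
ranked["decile"] = np.arange(len(ranked)) * 10 // len(ranked) + 1  # 1 = top 10%

summary = ranked.groupby("decile").agg(
    customers=("Buy", "size"),
    buyers=("Buy", lambda s: (s == "Yes").sum()),
)
summary["cum_buyer_pct"] = summary["buyers"].cumsum() / summary["customers"].cumsum()
print(summary)
```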
From our observations, we see that the 30% segment size had close to 51% buyers in the sample. This is
also explained by the logic of maxima: as the segment grows, there is a point of maximum profit, after
which the profitability starts to fall.
Profit Calculation:
It is known that the lifetime value of a new customer is $15,000 while the cost per capita of the campaign
is $4,420.
Buyer % in the most profitable segment (top 30%) = 50.94%
So the total profit will be:
Profit = (15,000 × # of customers in top 3 deciles × cumulative % purch) - (4,420 × # of customers in
top 3 deciles)
= 15,000 × (22223 × 30.01%) × 50.94% - 4,420 × (22223 × 30.01%)
= 15,000 × 6,669 × 50.94% - 4,420 × 6,669
≈ $21,476,640
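The arithmetic can be checked in a few lines; the inputs are the figures quoted above (small differences from $21,476,640 come from rounding the buyer percentage):

```python
# Profit for targeting the top 3 deciles, using the report's figures.
n_sample = 22223
segment = round(n_sample * 0.3001)  # customers in the top 3 deciles, ~6,669
buyer_pct = 0.5094                  # cumulative buyer % in that segment

profit = 15_000 * segment * buyer_pct - 4_420 * segment
print(f"profit ~ ${profit:,.0f}")   # ~ $21.5 million
```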
Conclusion:
In order to maximize profits, the best subset to target is the top 30% of the population, ranked by the
purchase probabilities from the model built on the sample.