Col Solare Case Study 2

A logistic regression model was created to predict the probability of customers buying wine. Customers were assigned to deciles based on their predicted probabilities, with decile 1 having the highest predicted probabilities. Lift analysis showed that targeting customers in the top deciles resulted in much higher response rates compared to random selection, demonstrating the model's ability to effectively target high-potential customers.

Uploaded by

perestotnik


1. A logistic regression model was created in R after first transforming the “customer type” variable
into a 0/1 dummy variable called restaurant. The following command was used:
> ColSolare$restaurant <- ifelse(ColSolare$customer_type == "restaurant",
1, 0)

The following command was used to create the logit model:

> ColSolarelogit <- glm(buyer ~ last_purch + dollars + restaurant +
customer_sqft + cab_franc + cab_sauvignon + malbec + merlot + red_blend +
syrah, family = binomial(link='logit'), data = ColSolare)
> summary (ColSolarelogit)
> ColSolare$purch_prob <- predict.glm(ColSolarelogit, ColSolare, type =
"response")

The following output was obtained in R:

Deviance Residuals:
Min 1Q Median 3Q Max
-2.2897 -0.4260 -0.2897 -0.1862 3.3694

Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)   -1.778e+00  1.066e-01 -16.679  < 2e-16 ***
last_purch    -9.215e-02  3.112e-03 -29.613  < 2e-16 ***
dollars        3.401e-04  3.942e-04   0.863  0.38828
restaurant    -8.181e-01  4.027e-02 -20.317  < 2e-16 ***
customer_sqft  2.787e-05  3.110e-05   0.896  0.37021
cab_franc     -3.521e-01  2.856e-01  -1.233  0.21767
cab_sauvignon  3.335e-01  2.846e-01   1.172  0.24134
malbec        -7.616e-01  2.860e-01  -2.663  0.00775 **
merlot        -4.148e-01  2.845e-01  -1.458  0.14479
red_blend      8.094e-01  2.849e-01   2.840  0.00451 **
syrah          2.253e-03  2.853e-01   0.008  0.99370
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 23817 on 39999 degrees of freedom

Residual deviance: 19373 on 39989 degrees of freedom
AIC: 19395

Number of Fisher Scoring iterations: 6

A new variable, purch_prob, was created using the command:

> ColSolare$purch_prob <- predict.glm(ColSolarelogit, ColSolare, type =
"response")

2. The odds ratios were calculated in R using the commands:

> oddsratio_ColSolarelogit <- exp(ColSolarelogit$coef)
> oddsratio_ColSolarelogit

The following output was obtained:

  (Intercept)    last_purch       dollars    restaurant customer_sqft     cab_franc
    0.1689488     0.9119639     1.0003402     0.4412598     1.0000279     0.7032023
cab_sauvignon        malbec        merlot     red_blend         syrah
    1.3958055     0.4669234     0.6604796     2.2464588     1.0022553

The odds ratios can be interpreted for marketing managers as follows:

The odds of buying the 2019 Red Blend after receiving the free sample are multiplied by a factor of 0.91
(a decrease of about 8.8%) for each additional month elapsed since the customer’s most recent
purchase from Col Solare, if all else remains constant.

The odds of buying the 2019 Red Blend after receiving the free sample are multiplied by a factor of 0.44
(a decrease of about 56%) if the customer’s establishment is a restaurant, if all else remains constant.
In other words, the odds of buying for a customer that is not a restaurant (such as a bar) are about
2.27 times the odds for a restaurant (1/0.44).

The odds of buying the 2019 Red Blend after receiving the free sample are not meaningfully affected by
the total dollars that the customer has spent with Col Solare, if all else remains constant. The odds
ratio is approximately 1 per dollar and the coefficient is not statistically significant (p = 0.39).

The odds of buying the 2019 Red Blend after receiving the free sample are not meaningfully affected by
the size of the customer’s establishment, if all else remains constant. The odds ratio is approximately 1
per unit of customer_sqft and the coefficient is not statistically significant (p = 0.37).
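
As a rough illustration of the arithmetic behind these interpretations, the sketch below converts coefficients into odds ratios and percent changes in the odds. The coefficient values are copied by hand from the output above, so the results match the reported odds ratios only up to rounding; the variable names are chosen for this example.

```r
# Coefficients copied from the logistic regression output above (rounded as printed).
coefs <- c(last_purch    = -9.215e-02,
           restaurant    = -8.181e-01,
           dollars       =  3.401e-04,
           customer_sqft =  2.787e-05)

# Odds ratio = exp(coefficient).
odds_ratio <- exp(coefs)

# Percent change in the odds for a one-unit increase in the predictor.
pct_change <- (odds_ratio - 1) * 100

print(round(cbind(odds_ratio, pct_change), 4))
```

A negative percent change corresponds to an odds ratio below 1, which is why a "factor of 0.91" is a decrease rather than an increase.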

3. Each customer was assigned to a decile based on the purchase probability stored in the
variable purch_prob, with decile 1 holding the highest probabilities. The following command,
which uses ntile() from the dplyr package, was used:
> ColSolare$predict <- 11 - ntile(ColSolare$purch_prob, 10)

The following output was obtained:

          0%          10%          20%          30%          40%          50%
0.0005558238 0.0119596568 0.0200447655 0.0283770059 0.0380511811 0.0511387163
         60%          70%          80%          90%         100%
0.0659792076 0.0905440840 0.1256687440 0.1978702502 0.9793837331
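
To make the decile assignment concrete, here is a minimal base-R sketch on made-up probabilities. It is a stand-in for the `11 - ntile(...)` command, not a reproduction of it: the toy vector `p` is hypothetical, and the rank-based formula is only assumed to agree with ntile() when there are no ties and the number of customers is a multiple of 10.

```r
# Hypothetical predicted probabilities for 20 customers (not from the case data),
# deliberately placed out of order.
p <- (1:20) / 100
p <- p[c(seq(1, 20, 2), seq(2, 20, 2))]

# Rank customers from highest to lowest probability, then cut the ranks
# into 10 equal-sized groups so that decile 1 holds the highest probabilities.
decile <- ceiling(rank(-p, ties.method = "first") * 10 / length(p))

print(table(decile))      # two customers per decile
print(p[decile == 1])     # the two highest probabilities
```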

4. The bar chart is shown below.

> ggplot(ColSolare) + geom_bar(aes(x = predict, y = buyer), stat =
"summary", fun = "mean")
5. The following report was generated:
> ColSolare %>% group_by(ColSolare$predict) %>% summarize(count =
length(red_blend), buyers = sum(buyer), responserate =
sum(buyer)/sum(count))

# A tibble: 10 x 4
`ColSolare$predict` count buyers responserate
<dbl> <int> <dbl> <dbl>
1 1 4000 1458 0.364
2 2 4000 598 0.150
3 3 4000 401 0.100
4 4 4000 331 0.0828
5 5 4000 226 0.0565
6 6 4000 184 0.046
7 7 4000 118 0.0295
8 8 4000 89 0.0222
9 9 4000 80 0.02
10 10 4000 32 0.008

6. A second logistic regression was fitted with merlot as the only predictor, using the command:
> ColSolaremerlot <- glm(buyer ~ merlot, family = binomial(link='logit'),
data = ColSolare)
> summary (ColSolaremerlot)

The following output was obtained in R:

Call:
glm(formula = buyer ~ merlot, family = binomial(link = "logit"),
data = ColSolare)

Deviance Residuals:
Min 1Q Median 3Q Max
-0.5447 -0.4306 -0.4162 -0.4162 2.2314

Coefficients:
            Estimate Std. Error  z value Pr(>|z|)
(Intercept) -2.40289    0.02251 -106.766  < 2e-16 ***
merlot       0.07122    0.01496    4.761 1.93e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)

Null deviance: 23817 on 39999 degrees of freedom

Residual deviance: 23795 on 39998 degrees of freedom
AIC: 23799

Number of Fisher Scoring iterations: 5

> exp(ColSolaremerlot$coef)

(Intercept) merlot
0.09045611 1.07381299

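Intuitively, the odds ratio for merlot (1.07) differs from the one in the earlier logistic regression
(0.66) because the second model does not control for the other predictors, such as purchases of the
other wines. Merlot purchases are presumably correlated with those omitted variables, so the merlot
coefficient in the single-predictor model absorbs part of their effect and is biased as a result.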

7. The following table shows the lift and cumulative lift for each decile:

Decile  No of Cust  Cumulative Cust  Cumulative %  No of Buyers  Cumulative Buyers  Response Rate  Lift  Cumulative Resp Rate  Cumulative Lift  No Model
1       4000        4000             10%           1458          1458               36.45%         4.15  36.45%                4.15             1
2       4000        8000             20%           598           2056               14.95%         1.70  25.70%                2.92             1
3       4000        12000            30%           401           2457               10.03%         1.14  20.48%                2.33             1
4       4000        16000            40%           331           2788               8.28%          0.94  17.43%                1.98             1
5       4000        20000            50%           226           3014               5.65%          0.64  15.07%                1.71             1
6       4000        24000            60%           184           3198               4.60%          0.52  13.33%                1.52             1
7       4000        28000            70%           118           3316               2.95%          0.34  11.84%                1.35             1
8       4000        32000            80%           89            3405               2.23%          0.25  10.64%                1.21             1
9       4000        36000            90%           80            3485               2.00%          0.23  9.68%                 1.10             1
10      4000        40000            100%          32            3517               0.80%          0.09  8.79%                 1.00             1
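
The lift columns can be recomputed from the buyer counts per decile. The sketch below assumes the usual definition (a decile's response rate divided by the overall response rate of 3517/40000) and uses the counts from the report in question 5; the variable names are chosen for this example.

```r
# Buyers and customers per decile, taken from the decile report above.
buyers <- c(1458, 598, 401, 331, 226, 184, 118, 89, 80, 32)
cust   <- rep(4000, 10)

# Overall response rate across all customers (the "no model" baseline).
overall_rate <- sum(buyers) / sum(cust)

# Per-decile response rate and lift relative to the baseline.
resp_rate <- buyers / cust
lift      <- resp_rate / overall_rate

# Cumulative response rate and cumulative lift through each decile.
cum_resp_rate <- cumsum(buyers) / cumsum(cust)
cum_lift      <- cum_resp_rate / overall_rate

print(data.frame(decile = 1:10,
                 resp_rate = round(resp_rate, 4),
                 lift = round(lift, 2),
                 cum_resp_rate = round(cum_resp_rate, 4),
                 cum_lift = round(cum_lift, 2)))
```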

8. The chart was created in Excel and is shown below.

9. The table is shown below.

Decile  No of Cust  Cumulative Cust  Cumulative %  No of Buyers  Cumulative Buyers  Gains   Cumulative Gains  No Model
0       0           0                0             0             0                  0       0                 0
1       4000        4000             10%           1458          1458               41.46%  41.46%            10%
2       4000        8000             20%           598           2056               17.00%  58.46%            20%
3       4000        12000            30%           401           2457               11.40%  69.86%            30%
4       4000        16000            40%           331           2788               9.41%   79.27%            40%
5       4000        20000            50%           226           3014               6.43%   85.70%            50%
6       4000        24000            60%           184           3198               5.23%   90.93%            60%
7       4000        28000            70%           118           3316               3.36%   94.28%            70%
8       4000        32000            80%           89            3405               2.53%   96.82%            80%
9       4000        36000            90%           80            3485               2.27%   99.09%            90%
10      4000        40000            100%          32            3517               0.91%   100.00%           100%
Total   40000                                      3517
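
For reference, the gains columns follow directly from the buyer counts: each decile's gain is its share of all 3517 buyers, and the "no model" column is the share of customers contacted. A minimal sketch, with variable names chosen for this example:

```r
# Buyers per decile, taken from the table above.
buyers <- c(1458, 598, 401, 331, 226, 184, 118, 89, 80, 32)

# Gains: share of all buyers captured in each decile.
gains <- buyers / sum(buyers)

# Cumulative gains: share of all buyers captured through each decile.
cum_gains <- cumsum(gains)

# Baseline without a model: contacting x% of customers captures x% of buyers.
no_model <- (1:10) / 10

print(round(100 * cbind(gains, cum_gains, no_model), 2))
```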

10. The chart is shown below.

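11. The following R command was used to calculate how high a customer’s predicted probability of
response needs to be for that customer to be a profitable target. It divides the cost of 30 (20 + 10)
per targeted customer by the margin of 560 (720 - 40 - 120) per buyer: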
> breakeven_rate <- ((20+10) / (720 - 40 - 120))*100
> breakeven_rate
[1] 5.357143

Therefore, the predicted probability of response should be at least 5.36%.
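
One way to see why this ratio is the breakeven point is to write the expected profit per targeted customer as probability times margin minus cost, and check that it is zero at the breakeven rate. In the sketch below, the labels `cost_per_target` and `margin_per_buyer` are my reading of the numbers in the command above, not terms taken from the case.

```r
# Assumed reading of the figures in the breakeven command above.
cost_per_target  <- 20 + 10          # cost incurred for every targeted customer
margin_per_buyer <- 720 - 40 - 120   # margin earned only when the customer buys

# Breakeven response probability: cost divided by margin.
breakeven <- cost_per_target / margin_per_buyer

# Expected profit from targeting one customer with purchase probability p.
expected_profit <- function(p) p * margin_per_buyer - cost_per_target

print(breakeven)                   # about 0.0536
print(expected_profit(breakeven))  # zero, up to floating-point error
print(expected_profit(0.10))       # positive above the breakeven rate
```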

12. Number of buyers was calculated using the following R commands:

> ColSolare$targetcust <- ifelse(ColSolare$purch_prob > 0.0536,1,0)
> mean(ColSolare$targetcust)
[1] 0.4843
> .4843 * 120000
[1] 58116
> mean(subset(ColSolare, targetcust==1)$purch_prob)
[1] 0.1542016
> 0.1542016 * 58116
[1] 8961.58
> mean(subset(ColSolare, targetcust==1)$purch_prob)
[1] 0.1542016
> (720-40-120) * 8961.58 - 30 * 58116
[1] 3275005

a. I would expect 8,962 buyers.

b. The expected response rate would be 15.42%.

c. Expected profit would be $3,275,005.

d. Expected return on marketing expenses would be 3275005/(30*58116) = 1.87843, or 188%.
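
The four answers can be reproduced in one place from the three inputs printed in the console transcript (the targeted share, the rollout size of 120000, and the average predicted probability among targeted customers). This is only a restatement of the arithmetic above with named variables; the names are chosen for this example.

```r
# Inputs copied from the console output above.
target_share <- 0.4843      # share of customers above the breakeven probability
population   <- 120000      # number of customers in the rollout
avg_prob     <- 0.1542016   # mean predicted probability among targeted customers

cost_per_target  <- 30      # 20 + 10
margin_per_buyer <- 560     # 720 - 40 - 120

targeted <- target_share * population          # customers contacted
buyers   <- avg_prob * targeted                # a. expected buyers
cost     <- cost_per_target * targeted         # total marketing expense
profit   <- margin_per_buyer * buyers - cost   # c. expected profit
romi     <- profit / cost                      # d. return on marketing expenses

print(c(targeted = targeted, buyers = buyers, profit = profit, romi = romi))
```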

13. Yes, Col Solare needs to run a new campaign next year.

The reason is that customer tastes and preferences may shift by next year, and customers may not
respond to merlot the way they responded to the red blend. Therefore, the marketing department may
not be able to use the response to the red blend to estimate how many customers will buy the
company’s merlot.

14. A linear regression model would be appropriate in this case because dollars is a continuous
response variable. The model was created as follows:
> CustSpend <- lm(dollars ~ first_purch + malbec + merlot + syrah, data =
ColSolare)
> summary(CustSpend)

Call:
lm(formula = dollars ~ first_purch + malbec + merlot + syrah,
data = ColSolare)

Residuals:
Min 1Q Median 3Q Max
-3381.2 -527.9 -35.7 373.3 4695.0

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  93.2988     6.8775   13.57   <2e-16 ***
first_purch  38.7869     0.3032  127.91   <2e-16 ***
malbec      673.4377     5.8776  114.58   <2e-16 ***
merlot      685.6872     4.3502  157.62   <2e-16 ***
syrah       658.7488     7.1909   91.61   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 801.8 on 39995 degrees of freedom

Multiple R-squared: 0.829, Adjusted R-squared: 0.8289
F-statistic: 4.846e+04 on 4 and 39995 DF, p-value: < 2.2e-16

Each of the predictor variables is significant and the model explains 82.9% of the variation in
spending (R-squared of 0.829). The coefficients can be interpreted as follows:

Spending increases by $38.79 for each additional month since first purchase if all else remains
constant. Spending increases by $673.44 for each additional case of malbec purchased if all else
remains constant. Spending increases by $685.69 for each additional case of merlot purchased if all
else remains constant. Spending increases by $658.75 for each additional case of syrah purchased if
all else remains constant.
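
As a worked example of how these coefficients combine, the sketch below computes predicted spending for one customer. The customer profile (12 months since first purchase, 1 case of malbec, 2 cases of merlot, no syrah) is hypothetical and chosen only for illustration; the coefficients are copied from the output above.

```r
# Coefficients copied from the linear regression output above.
coefs <- c(intercept   = 93.2988,
           first_purch = 38.7869,
           malbec      = 673.4377,
           merlot      = 685.6872,
           syrah       = 658.7488)

# Hypothetical customer: the leading 1 multiplies the intercept.
customer <- c(intercept = 1, first_purch = 12, malbec = 1, merlot = 2, syrah = 0)

# Predicted spending is the sum of coefficient times predictor value.
predicted_dollars <- sum(coefs * customer)

print(predicted_dollars)
```

With the fitted model object available, predict(CustSpend, newdata = ...) would give the same kind of result without copying coefficients by hand.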
