
MNGT 379 - Business Analytics

Module 7 Predictive Data Mining Homework

IMPORTANT: As you complete these problems, save your completed JMP files for each Problem so
that you may submit them with your Solution Word document.

Problem 1 – Salmons Stores

Salmons Stores operates a national chain of women’s apparel stores. Five thousand copies of an
expensive four-color sales catalog have been printed, and each catalog includes a coupon that provides a
$50 discount on purchases of $200 or more. Salmons would like to send the catalogs only to customers
who have the highest probability of using the coupon. The file “Module 7 SalmonsStores Data.xlsx” contains
data from an earlier promotional campaign. For each of 1,000 Salmons customers, three variables are
tracked: last year’s total spending at Salmons, whether they have a Salmons store credit card, and
whether they used the promotional coupon they were sent. Use the data in “Module 7 SalmonsStores
Data.xlsx” to complete the steps below.

Follow the instructions below to partition the data into Training, Validation, and Test Sets, and perform
a Logistic Regression on the data.

Step 1: Let’s start by reviewing our data to ensure it’s configured with the correct data types. After the data is imported, right-click the column header for Customer and choose the first menu item, Column Info, then change the Modeling Type to Nominal. Repeat these steps to change Card and Coupon to Nominal as well.

Step 2: Now, we need to partition the data into Training, Validation, and Test sets. Under Analyze > Predictive Modeling, select Make Validation Column. In this dialog box, you don’t need to select anything; just click OK. In the next window, change the Training Set value to 0.40, the Validation Set value to 0.40, and the Test Set value to 0.20. In the Options, change New Column Name to “SetName” without the quotes, and change the Random Seed to 1. Make sure that your dialog box matches the screenshot to the right before you continue. Once you are satisfied, click Go.
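
If you are curious what the Make Validation Column tool is doing conceptually, here is a minimal Python sketch of a 40%/40%/20% random partition with a fixed seed. This is only an illustration, not part of the JMP workflow; the file name and column layout are assumed from the prompt, and JMP assigns rows to exact proportions rather than probabilistically.

```python
import numpy as np
import pandas as pd

# Load the promotional-campaign data (file name assumed from the prompt).
df = pd.read_excel("Module 7 SalmonsStores Data.xlsx")

# Randomly assign each row to Training (40%), Validation (40%), or Test (20%).
# The fixed seed plays the same role as Random Seed = 1 in the JMP dialog.
rng = np.random.default_rng(seed=1)
df["SetName"] = rng.choice(
    ["Training", "Validation", "Test"], size=len(df), p=[0.40, 0.40, 0.20]
)

# Check that the realized proportions are close to the requested rates.
print(df["SetName"].value_counts(normalize=True))
```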

Step 3: Now we can build the Logistic Regression model. Under the Analyze tab on the menu bar, select Fit Model. Select the Coupon column as the Y variable either by clicking and dragging or by highlighting Coupon and then clicking the Y button. In the upper-right of this dialog box, the Personality should automatically switch to Nominal Logistic; if it does not, go back to Step 1 and re-check your data types. Next, add SetName to the Validation box and add Spending and Card to the Construct Model Effects. Again, make sure that your dialog box matches the screenshot below before you continue. Once you are satisfied, click Run.
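
Conceptually, Nominal Logistic is fitting an ordinary logistic regression of Coupon on Spending and Card. A hedged scikit-learn sketch of the same idea, continuing the hypothetical DataFrame from the sketch above, might look like the following (scikit-learn applies regularization by default, so its coefficients will differ slightly from JMP’s Parameter Estimates):

```python
from sklearn.linear_model import LogisticRegression

# Fit on the Training rows only; Spending and Card predict Coupon use.
train = df[df["SetName"] == "Training"]
valid = df[df["SetName"] == "Validation"]

model = LogisticRegression().fit(train[["Spending", "Card"]], train["Coupon"])

# Misclassification rate on the validation set (compare with JMP's Confusion Matrix).
val_error = 1 - model.score(valid[["Spending", "Card"]], valid["Coupon"])
print(f"Validation misclassification rate: {val_error:.4f}")
```
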
Step 4: First, minimize all the report subsections except for Parameter Estimates. Then, click the red
triangle next to Nominal Logistic Fit for Coupon, and choose Lift Curve, then open the red triangle
again and choose Confusion Matrix (both are near the middle). Take screenshots of the Parameter
Estimates table, the three Lift Curves, and the Confusion Matrix, and paste them into the document
under the appropriate headers below.

Parameter Estimates

Lift Curve on Training Data

Lift Curve on Validation Data

Lift Curve on Test Data

Step 5: Now we can begin working to understand the report. Interpret the output by completing the
sentence, “The smallest classification error on the validation set results from the model… ” in the space
below, rounding parameter values to four decimal places:
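
For reference, the general form of the fitted model looks like the display below; the betas are placeholders to be filled in with your Parameter Estimates, and you should check the note beneath the Parameter Estimates table to see which level of Coupon the log odds refer to.

$$P(\text{Coupon used}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 \cdot \text{Spending} + \beta_2 \cdot \text{Card})}}$$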

Step 6: Recall that a lift value of 1 indicates that the decile is equally likely to correctly predict observations (customers in this case) compared to choosing randomly, while a value of 1.35 indicates that the decile is 35% more likely to predict customers correctly. Now, with this in mind, consider the Lift Curves we added in Step 4; at what decile should we expect our model to be around twice as good at predicting which customers use a Coupon? Enter your answer in the space below.
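
As a purely hypothetical worked example of how lift is read: if 10% of all customers in the data used the coupon, but 20% of the customers in a given decile (ranked by the model’s predicted probability) used it, then

$$\text{lift for the decile} = \frac{\text{coupon-use rate in the decile}}{\text{overall coupon-use rate}} = \frac{0.20}{0.10} = 2.0,$$

meaning that decile is twice as good as random selection. You are looking for where the Lift Curve is near that value.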

Step 7: Again consider the Lift Curves, and compare the Lift Curve on Training Data to the other two
Lift Curves. Does this suggest that the Regression Equation you defined in Step 5 has good predictive
power, or is there evidence of model overfitting? Justify your answer in the space below.

Problem 2 – BlueOrRed

Suppose that campaign organizers for both the Republican and Democratic parties are interested in
identifying individual undecided voters who would consider voting for their party in an upcoming
election. The file “Module 7 BlueOrRed Data.xlsx” contains data on a sample of voters with tracked
variables, including whether or not they are undecided regarding their candidate preference, age,
whether they own a home, gender, marital status, household size, income, years of education, and
whether they attend church.

Follow the instructions below to partition the data into Training, Validation, and Test Sets, and perform
K Nearest Neighbors on the data.

Step 1: As we did on Problem 1, we’ll need to change the data types for many of our variables. Use the right-click menu > Column Info to change Undecided, HomeOwner, Female, Married, and Church to Nominal (you can hold control or command to select multiple columns at once), and Education to Ordinal.

Step 2: Again like Problem 1, partition the data into Training, Validation, and Test sets using the Make Validation Column tool found under Analyze > Predictive Modeling (if you have a column selected from Step 1, click into the data and push Escape). In the Specify Rates area, set your partition percentages as 0.50, 0.30, and 0.20 respectively, then under Options set the New Column Name to SetName and the Random Seed to 5. Compare your dialog box to the example on the right before continuing. Click Go.

Step 3: Open the K Nearest Neighbors tool found under Analyze > Predictive Modeling (not the one we used in Module 4 under Clustering). Use SetName as the “Validation” Variable and Undecided as the “Y, Response” Variable, and add Age, HomeOwner, Female, HouseholdSize, Income, Education, and Church (i.e. all the remaining variables except for Married) to the list of “X, Factor” Variables. Finally, set the value of Set Random Seed to 10. Click OK.
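
Conceptually, the tool classifies each observation by the majority vote of its k closest training points and reports the misclassification rate for each candidate k. The Python sketch below illustrates the idea only; it is not the exact JMP algorithm or output, the file and column names are assumed from the prompt, and in practice you would typically standardize the X variables before computing distances (omitted here for brevity).

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

df = pd.read_excel("Module 7 BlueOrRed Data.xlsx")  # file name from the prompt

# 50/30/20 random partition with a fixed seed, analogous to Step 2.
rng = np.random.default_rng(seed=5)
df["SetName"] = rng.choice(["Training", "Validation", "Test"],
                           size=len(df), p=[0.50, 0.30, 0.20])

features = ["Age", "HomeOwner", "Female", "HouseholdSize",
            "Income", "Education", "Church"]  # everything except Married
train = df[df["SetName"] == "Training"]
valid = df[df["SetName"] == "Validation"]

# Try several neighborhood sizes and report validation misclassification rates.
for k in range(1, 11):
    knn = KNeighborsClassifier(n_neighbors=k).fit(train[features], train["Undecided"])
    error = 1 - knn.score(valid[features], valid["Undecided"])
    print(f"k = {k:2d}  validation misclassification rate = {error:.3f}")
```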

Step 4: Take screenshots of the Model Selection Chart, Training Table, Validation Table, and Test Table
and include them under the appropriate headings below.

Model Selection Chart

Training Table

Validation Table

Test Table

Step 5: Consider the four screenshots you just took; these report the misclassification percentage for
each value of k, meaning for each number of neighbors used in the k-Nearest Neighbors Classification
procedure. Based on these screenshots, which value for k has the smallest misclassification rate, and
consequently, what is the optimal number of neighbors to use in our analysis? Justify your response in
the space below.

Step 6: Consider the Misclassification Rates reported on the Training, Validation, and Test tables; does the error rate reported on the Training table seem to be optimistic (better than the true performance of the model), conservative (worse than the true performance of the model), or somewhere in the middle? Justify your response in the space below.

Problem 3 – CreditScore

A consumer advocacy agency, Equitable Ernest, is interested in providing a service in which an individual can estimate their own credit score (a continuous measure used by banks, insurance companies, and other businesses when granting loans, quoting premiums, and issuing credit). The file “Module 7 CreditScore Data.xlsx” contains data on an individual’s credit score and other variables. Follow the instructions below to partition the data into Training, Validation, and Test Sets, and create a Classification Tree for the data.

Step 1: Let’s once again start by checking the data types of our variables. Most of them are fine as Continuous, but we need to convert HomeOwner to Nominal. Do that, and move on to Step 2.

Step 2: Once more, partition the data into Training, Validation, and Test sets with a 40%/40%/20% split, name the New Column SetName, and use seed 3. Confirm the settings with the screenshot to the right before continuing.

Step 3: Create a Decision Tree using the Partition tool found under Analyze > Predictive Modeling. Select SetName as the “Validation” Variable, CreditScore as the “Y, Response” Variable, and all the other variables as the “X, Factor” Variables.

Step 4: Click OK to create the initial Partition, then click Go to create the rest of the decision tree. WARNING: The Go button will not disappear after you click it; do not click it again or you will create additional partitions, and you will need to start again. Then compare the RASE values (RASE stands for Root Average Squared Error, but you can ignore the details, it’s just a measure of error) from each Set in the table; do these suggest that the model has been overfit? Justify your response in the space below.
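
If it helps to see the idea in code form, here is a hedged Python sketch of fitting a regression tree and computing RASE (the square root of the average squared prediction error) on each set. JMP’s Partition platform grows and validates the tree differently, so treat this only as an illustration of what the RASE numbers measure; the file name, column names, and the stopping rule below are assumptions, not the JMP settings.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

df = pd.read_excel("Module 7 CreditScore Data.xlsx")  # file name from the prompt

# 40/40/20 random partition with a fixed seed, analogous to Step 2.
rng = np.random.default_rng(seed=3)
df["SetName"] = rng.choice(["Training", "Validation", "Test"],
                           size=len(df), p=[0.40, 0.40, 0.20])

X_cols = [c for c in df.columns if c not in ("CreditScore", "SetName")]
train = df[df["SetName"] == "Training"]

# min_samples_leaf is an arbitrary stopping rule for this illustration only.
tree = DecisionTreeRegressor(min_samples_leaf=20, random_state=0)
tree.fit(train[X_cols], train["CreditScore"])

# RASE = sqrt(mean((actual - predicted)^2)); compare it across the three sets.
for name in ("Training", "Validation", "Test"):
    subset = df[df["SetName"] == name]
    pred = tree.predict(subset[X_cols])
    rase = np.sqrt(np.mean((subset["CreditScore"] - pred) ** 2))
    print(f"{name:10s} RASE = {rase:.2f}")
```

A Training RASE that is much smaller than the Validation and Test RASE is the classic sign of overfitting; values that are similar across the three sets suggest the model generalizes reasonably well.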

Step 5: Now, click the red triangle menu and choose Display Options > Show Tree. Don’t be alarmed,
but it won’t fit on your screen very well. Take a screenshot of the top node labeled “All Rows” and paste
it into the space below (again, just the one box that says All Rows at the top). IMPORTANT: If you
click Split, Prune, or Go, you’ll have to Redo the Analysis.

Step 6: There’s a lot to dive into here, but we’ll keep it simple and accept the Decision Tree that JMP
has given us. This might look very intimidating, but this Decision Tree is just like the one we went over
in the Classification Trees lecture video – it just looks a little different! Let me provide a quick
recap/description of how the tree is formatted in JMP Pro:

Each node is giving us the criteria for entry into the node (the bold label text), along with some
key metrics: the number of people in our Training Sample that meet the criteria to sort into this
node (which means they meet all the criteria from the nodes higher in the Tree as well), the
Mean credit score of those qualifying individuals (which doubles as the predicted value at that
node), and the Standard Deviation of credit scores within those qualifying individuals.

So, with that in mind, use the Decision Tree to predict the credit score of an individual who has had 5
credit bureau inquiries, has used 10% of her available credit, has $14,500 of total available credit, has no
collection reports or missed payments, is a homeowner, has an average credit age of 6.5 years (i.e.
CreditAge=6.5), and has worked continuously for the past 5 years (i.e. TimeOnJob=5). Enter your
estimate for the credit score, i.e. the Mean of the final node reached by the individual described above,
into the space below, rounding your answer to two decimal places.

Hint/Reminder: the process for this was described in the Classification Trees video provided on D2L; if
you’re stuck, review that video, and if you’re still stuck, send me (or the Tutoring Office) a quick email
so we can find a time to meet. This is a lot easier to explain “live”.

Final Submission Instructions

Once you have completed all three problems above, submit your completed version of this file to the
Assignment on D2L along with your completed JMP files. As always, let me know if you have any
questions!
