CIS4930 Big Data Analytics Homework 5: Direct Mailing To Airline Customers

This document provides instructions for homework assignment 5 on direct mailing and customer purchase prediction. Students are asked to import a dataset on 2000 customer purchases, split the data into training and validation sets, run logistic regression and neural network models on the training set, generate confusion matrices on the validation set for each model, and compare model performances to determine which would be more successful for predicting customer purchases from direct mailing.

Uploaded by

brody
CIS4930 Big Data Analytics

Homework 5: Direct Mailing to Airline Customers


Due by 11/11/2021 (Thursday), 11:59 pm

What to Turn In:


1) An MS Word file that contains your answers to the questions. You can use screenshots
to report the required results if it is convenient for you.
2) An R script that contains your code.

Note: Please do NOT submit any other file format. Failure to submit the files in the correct
file format will result in a loss of homework credit!

Problems
Tayko Software is a software catalog firm that sells games and educational software. It started
out as a software manufacturer and then added third-party titles to its offerings. It recently
revised its collection of items in a new catalog, which it mailed out to its customers. This mailing
yielded 2000 purchases. Based on these data, Tayko wants to devise a model for predicting
whether a new customer will make a purchase when they receive the new catalog. The file
CIS4930_CSV_Tayko.csv contains information on 2000 purchases. The table below describes
the variables.
Variable      Description
Id            Unique ID for each individual
Freq          Number of transactions in last year
Last_update   Number of days since last update to customer record
Web           Whether customer purchased by web order at least once
Gender        Male/female
Address_res   Whether the customer account address is a residential address
Address_US    Whether the customer account address is a US address
Purchase      Whether customer has purchased Tayko product after receiving the new catalog

1. Import the dataset. Remove the variable Id from the imported data because we will not use it
as a predictor.

2. Split the data into training and validation sets using a 6:4 ratio.

3. Run a logistic regression on the training data (you need to choose the correct target
variable). Then, generate a confusion matrix of the model using the validation data.
Report the confusion matrix.

4. Run a neural net model on the training data, using a single hidden layer with 5 nodes.
Generate a confusion matrix of the model using the validation data. Report the confusion
matrix.

5. Compare the two model performances. Based on the result, which model will yield more
success in terms of customer purchases?
(Hint: If Tayko selects 100 new customers classified as “purchase” and sends them the new
catalog, how many of them will purchase a Tayko product according to each model?)
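
The workflow in problems 1 to 5 can be sketched in R roughly as follows. This is only an illustrative outline under several assumptions, not a reference solution: the data frame is synthetic stand-in data with made-up values (the real file would be read with read.csv), the 0.5 classification cutoff, the min-max rescaling, and the helper names confusion() and precision() are choices made here for illustration, and the nnet package is one possible way to fit a single-hidden-layer network (the assignment does not name a package). The hint in problem 5 corresponds to precision: of the customers a model classifies as "purchase", the share who actually purchased.

```r
# Synthetic stand-in data (ASSUMPTION): NOT the real CIS4930_CSV_Tayko.csv.
# With the real file, start from: tayko <- read.csv("CIS4930_CSV_Tayko.csv")
set.seed(1)
n <- 2000
tayko <- data.frame(
  Id          = seq_len(n),
  Freq        = rpois(n, 2),
  Last_update = sample(1:4000, n, replace = TRUE),
  Web         = rbinom(n, 1, 0.4),
  Gender      = factor(sample(c("male", "female"), n, replace = TRUE)),
  Address_res = rbinom(n, 1, 0.2),
  Address_US  = rbinom(n, 1, 0.8)
)
# Made-up relationship so the models have something to learn.
tayko$Purchase <- rbinom(n, 1, plogis(-1 + 0.8 * tayko$Freq + 0.5 * tayko$Web))

# Problem 1: drop the ID column, which is not a predictor.
tayko$Id <- NULL

# Problem 2: 6:4 split into training and validation sets.
train_idx <- sample(n, size = round(0.6 * n))
train <- tayko[train_idx, ]
valid <- tayko[-train_idx, ]

# Rescale count-like predictors to [0, 1] using training-set ranges
# (an illustrative choice; neural nets tend to behave better on scaled inputs).
for (col in c("Freq", "Last_update")) {
  lo <- min(train[[col]])
  hi <- max(train[[col]])
  train[[col]] <- (train[[col]] - lo) / (hi - lo)
  valid[[col]] <- (valid[[col]] - lo) / (hi - lo)
}

# Hypothetical helpers: confusion matrix and precision for the "1" class.
confusion <- function(pred, actual) {
  table(Predicted = factor(pred, levels = c(0, 1)),
        Actual    = factor(actual, levels = c(0, 1)))
}
precision <- function(cm) cm["1", "1"] / sum(cm["1", ])

# Problem 3: logistic regression with Purchase as the target.
logit_fit  <- glm(Purchase ~ ., data = train, family = binomial)
logit_prob <- predict(logit_fit, newdata = valid, type = "response")
logit_cm   <- confusion(as.integer(logit_prob > 0.5), valid$Purchase)
print(logit_cm)

# Problem 4: one hidden layer with 5 nodes, if the nnet package is available.
if (requireNamespace("nnet", quietly = TRUE)) {
  train_nn <- transform(train, Purchase = factor(Purchase))
  nn_fit   <- nnet::nnet(Purchase ~ ., data = train_nn, size = 5,
                         maxit = 200, trace = FALSE)
  nn_pred  <- predict(nn_fit, newdata = valid, type = "class")
  nn_cm    <- confusion(nn_pred, valid$Purchase)
  print(nn_cm)
  cat("Neural net: expected buyers per 100 mailed =",
      100 * precision(nn_cm), "\n")
}

# Problem 5 hint: expected buyers among 100 customers classified as "purchase".
cat("Logistic: expected buyers per 100 mailed =",
    100 * precision(logit_cm), "\n")
```

Because the data here are simulated, the printed matrices and counts say nothing about the real Tayko results; only the structure of the comparison carries over.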
