CIS4930 Big Data Analytics Homework 5: Direct Mailing To Airline Customers
Note: Please do NOT submit any other file format. Failing to submit the files in the correct
file format will cause the loss of homework grades!
Problems
Tayko Software is a software catalog firm that sells games and educational software. It started
out as a software manufacturer and then added third-party titles to its offerings. It recently
revised its collection of items in a new catalog, which it mailed out to its customers. This mailing
yielded 2000 purchases. Based on these data, Tayko wants to devise a model for predicting
whether a new customer will make a purchase when they receive the new catalog. The file
CIS4930_CSV_Tayko.csv contains information on 2000 purchases. The table below describes
the variables.
Variable      Description
Id            Unique ID for each individual
Freq          Number of transactions in the last year
Last_update   Number of days since the last update to the customer record
Web           Whether the customer purchased by web order at least once
Gender        Male/female
Address_res   Whether the customer account address is a residential address
Address_US    Whether the customer account address is a US address
Purchase      Whether the customer purchased a Tayko product after receiving the new catalog
1. Import the dataset. Remove the variable Id from the imported data because we will not use it
as a predictor.
2. Split the data into training and validation sets using a 6:4 ratio.
3. Run a logistic regression on the training data (You need to choose the correct target
variable). Then, generate a confusion matrix of the model using the validation data.
Report the confusion matrix.
4. Run a neural net model on the training data, using a single hidden layer with 5 nodes.
Generate a confusion matrix of the model using the validation data. Report the confusion
matrix.
5. Compare the performance of the two models. Based on the results, which model will yield
more success in terms of customer purchases?
(Hint: If Tayko selects 100 new customers classified as “purchase” and sends them new
catalogs, how many of them will purchase a Tayko product according to each model?)
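
The bookkeeping behind steps 2, 3/4, and 5 can be sketched in plain Python, as shown below. This is only an illustration under stated assumptions: the helper names (split_60_40, confusion_matrix, expected_purchases) are hypothetical, the labels are made up and do not come from CIS4930_CSV_Tayko.csv, and the model fitting itself is left to whatever tool the course uses. The last function follows the hint: among customers classified as "purchase", the share who actually buy is TP / (TP + FP), so the expected number of buyers out of 100 mailed catalogs is 100 times that share.

```python
import random
from collections import Counter


def split_60_40(rows, seed=0):
    """Shuffle a copy of rows and split it 60% training / 40% validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = (len(shuffled) * 6) // 10        # integer arithmetic for the 6:4 ratio
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) pairs for binary labels (1 = purchase, 0 = no purchase)."""
    counts = Counter(zip(actual, predicted))
    return {
        "tp": counts[(1, 1)],  # predicted purchase, did purchase
        "fp": counts[(0, 1)],  # predicted purchase, did not purchase
        "fn": counts[(1, 0)],  # predicted no purchase, did purchase
        "tn": counts[(0, 0)],  # predicted no purchase, did not purchase
    }


def expected_purchases(cm, n_mailed=100):
    """Estimate buyers among n_mailed customers classified as 'purchase'."""
    predicted_positive = cm["tp"] + cm["fp"]
    if predicted_positive == 0:
        return 0.0  # the model never predicts 'purchase'
    return n_mailed * cm["tp"] / predicted_positive


# Hypothetical row indices standing in for the 2000 records.
train, valid = split_60_40(range(2000))

# Hypothetical validation labels, made up purely for illustration.
actual = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
predicted = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

cm = confusion_matrix(actual, predicted)
print(len(train), len(valid))
print(cm)
print(expected_purchases(cm))
```

Running the same three steps on each model's validation predictions gives two numbers that can be compared directly for step 5.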