Assignment5

The assignment focuses on analyzing employee attrition data from a tech company, aiming to understand and reduce employee turnover rates. Students are required to build and compare various predictive models, including classification trees and ensemble methods, using the outcome variable 'left_company'. The final deliverable is an RMarkdown file that assesses model performance and identifies the best model, which should then be converted into a Word document.

MBA 739 – Advanced Analytics

Week 5 – Decision Trees and Ensembles Assignment

Addressing Employee Attrition


For this assignment we will look at the employee attrition data. Ideally, companies would like to keep
attrition rates (the proportion of employees leaving a company for other opportunities) as low as
possible due to the variable costs and business disruptions that come with replacing productive
employees on short notice.
In the recent past, before Covid struck, the company saw an increase in the rate of employee
departures. Because it is a tech company and certain skills are rare, employee retention has been a
challenge. This has taken a toll on its operations and its ability to deliver quality products.
Information about the dataset is given below.

Employee Attrition Data


The following data consists of 1,470 employee records for a U.S.-based product company. Each row
in this data frame represents one employee's attributes across the variables listed in the table below.

Variable             Definition                                                            Data Type

left_company         Did the employee leave the company? (Yes/No)                          Factor
department           Department within the company                                         Categorical
job_level            Job Level (Associate - Vice President)                                Categorical
salary               Employee yearly salary (US Dollars)                                   Numeric
weekly_hours         Self-reported average weekly hours spent on the job (company survey)  Numeric
business_travel      Level of required business travel                                     Categorical
yrs_at_company       Tenure at the company (years)                                         Numeric
yrs_since_promotion  Years since last promotion                                            Numeric
previous_companies   Number of previous companies for which the employee has worked        Numeric
job_satisfaction     Self-reported job satisfaction (company survey)                       Factor
performance_rating   Most recent annual performance rating                                 Factor
marital_status       Marital status (Single, Married, or Divorced)                         Categorical
miles_from_home      Distance from employee address to office location                     Numeric

When you import the data CSV file, the imported data types may not match the data types given above.
Convert each variable to the appropriate data type before doing any analysis. Also, generate dummy
variables as required.
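
The conversion step can be sketched as follows. This is only an illustration of the idea, written in Python with the standard library; the assignment itself is to be completed in R. The column names come from the table above, but the sample values, the helper names load_and_convert and add_dummies, and the choice to drop the first level as the reference category are all assumptions.

```python
import csv
import io

# Made-up sample rows for illustration only. The column names and the
# category levels come from the variable table; the salary values do not
# come from the real dataset.
SAMPLE_CSV = """left_company,salary,marital_status
Yes,52000.5,Single
No,88000,Married
No,61000,Divorced
"""

NUMERIC_COLUMNS = ["salary"]
CATEGORICAL_COLUMNS = ["marital_status"]


def load_and_convert(text):
    """Read CSV text and convert each column from string to its intended type."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        for col in NUMERIC_COLUMNS:
            row[col] = float(row[col])
        # Encode the Yes/No outcome as 1/0.
        row["left_company"] = 1 if row["left_company"] == "Yes" else 0
        rows.append(row)
    return rows


def add_dummies(rows, column):
    """Replace a categorical column with 0/1 dummies, dropping the first level."""
    levels = sorted({row[column] for row in rows})
    converted = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for level in levels[1:]:  # the first sorted level is the reference category
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        converted.append(new_row)
    return converted


employees = load_and_convert(SAMPLE_CSV)
for col in CATEGORICAL_COLUMNS:
    employees = add_dummies(employees, col)

for employee in employees:
    print(employee)
```

Dropping one level per categorical variable is a common convention, not a requirement stated in the assignment; keep all levels if your modelling function expects them.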

Questions
Using the employee attrition data, build the following models and compare their overall performance
using the confusion matrix and accuracy. The objective is to predict the outcome variable
left_company.
• Fully grown classification tree
• Pruned classification tree
• Random Forest
• Boosting
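
As a reminder of how the two comparison measures relate, the sketch below tallies a confusion matrix and derives accuracy from it. It is written in Python with the standard library purely to show the arithmetic; the assignment itself is to be completed in R. The labels, the model names model_a and model_b, and the helper functions are hypothetical and are not results from the attrition data.

```python
from collections import Counter


def confusion_matrix(actual, predicted, positive="Yes", negative="No"):
    """Count true/false positives and negatives for a binary outcome."""
    counts = Counter(zip(actual, predicted))
    return {
        "TP": counts[(positive, positive)],  # left, predicted to leave
        "FN": counts[(positive, negative)],  # left, predicted to stay
        "FP": counts[(negative, positive)],  # stayed, predicted to leave
        "TN": counts[(negative, negative)],  # stayed, predicted to stay
    }


def accuracy(cm):
    """Accuracy = correct predictions / all predictions."""
    return (cm["TP"] + cm["TN"]) / sum(cm.values())


# Hypothetical left_company labels and predictions, for illustration only.
actual = ["Yes", "No", "No", "Yes", "No", "No", "Yes", "No"]
predictions = {
    "model_a": ["Yes", "No", "No", "No", "No", "Yes", "Yes", "No"],
    "model_b": ["No", "No", "No", "No", "No", "No", "Yes", "No"],
}

results = {name: confusion_matrix(actual, preds) for name, preds in predictions.items()}

for name, cm in results.items():
    print(name, cm, "accuracy =", accuracy(cm))
```

In this made-up example both models reach the same accuracy (6 of 8 correct), yet model_b misses two of the three employees who left. That is why the confusion matrix is worth reporting alongside accuracy.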

In an RMarkdown file, build the models, provide your assessment of them, and choose the best
model. Once done, knit the file into a Word document and upload your answers.
