Employee Turnover Problem Statement

Portobello Tech aims to predict employee turnover by analyzing various work-related factors such as satisfaction levels, project involvement, and tenure. The project involves performing data quality checks, understanding turnover factors, clustering employees, handling class imbalance, and evaluating multiple machine learning models. Ultimately, the goal is to identify the best model and suggest retention strategies for employees based on their predicted turnover risk.

Uploaded by

Zoran zoran


Machine Learning

Course-End Project Problem Statement

Employee Turnover Analytics

Project Statement:

Portobello Tech is an app innovator that has devised an intelligent way of
predicting employee turnover within the company. It periodically evaluates
employees' work details, including the number of projects they worked on,
average monthly working hours, time spent in the company, promotions in the
last five years, and salary level.

Data from prior evaluations shows the employees’ satisfaction in the workplace.
The data could be used to identify patterns in work style and their interest in
continuing to work for the company.

The HR Department owns the data and uses it to predict employee turnover.
Employee turnover refers to the total number of workers who leave a company
over time.

As the ML Developer assigned to the HR Department, you have been asked to
create ML programs to:
1. Perform data quality checks by checking for missing values, if any.
2. Understand, through exploratory data analysis (EDA), which factors
contributed most to employee turnover.
3. Perform clustering of employees who left based on their satisfaction and
evaluation.
4. Handle the class imbalance in the "left" column using the SMOTE technique.
5. Perform k-fold cross-validation model training and evaluate performance.
6. Identify the best model and justify the evaluation metrics used.
7. Suggest various retention strategies for targeted employees.

Data will be modified from:

https://www.kaggle.com/liujiaqi/hr-comma-sepcsv

Column Name              Description

satisfaction_level       Satisfaction level at the job of an employee
last_evaluation          Rating between 0 and 1, received by an employee at
                         the employee's last evaluation
number_project           The number of projects an employee is involved in
average_montly_hours     Average number of hours in a month spent by an
                         employee at the office
time_spend_company       Number of years spent in the company
Work_accident            0 - no accident during employee stay,
                         1 - accident during employee stay
left                     0 indicates an employee stays with the company and
                         1 indicates an employee left the company
promotion_last_5years    Number of promotions during the employee's stay
Department               Department to which an employee belongs
salary                   Salary in USD

Perform the following steps:

1. Perform data quality checks by checking for missing values, if any.
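
A minimal sketch of the missing-value check in step 1, using pandas. The helper name missing_value_report and the toy frame are illustrative assumptions; in the project, the frame would be loaded from the HR data file instead.

```python
import numpy as np
import pandas as pd


def missing_value_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column."""
    counts = df.isna().sum()
    return pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })


# Toy frame standing in for the real HR data (values are made up).
toy = pd.DataFrame({
    "satisfaction_level": [0.38, np.nan, 0.11, 0.72],
    "last_evaluation": [0.53, 0.86, 0.88, 0.87],
    "left": [1, 1, 1, 0],
})

report = missing_value_report(toy)
print(report)
```

If any column reports missing values, the choice between dropping rows and imputing depends on how many rows are affected.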

2. Understand, through exploratory data analysis (EDA), which factors contributed most to employee turnover.

2.1. Draw a heatmap of the correlation matrix between all numerical
features or columns in the data.
2.2. Draw the distribution plot of:
■ Employee Satisfaction (use column satisfaction_level)
■ Employee Evaluation (use column last_evaluation)
■ Employee Average Monthly Hours (use column
average_montly_hours)
2.3. Draw the bar plot of the employee project count of both employees
who left and stayed in the organization (use column number_project
and hue column left), and give your inferences from the plot.
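
The plots in step 2 could be sketched as follows, assuming seaborn and matplotlib are available. The function names and the randomly generated toy frame are assumptions for illustration only; the column names come from the table above.

```python
import numpy as np
import pandas as pd


def correlation_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Correlation between all numeric columns (non-numeric ones are skipped)."""
    return df.select_dtypes(include="number").corr()


def project_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Employees per project count, split by the left flag (0 = stayed, 1 = left)."""
    return pd.crosstab(df["number_project"], df["left"])


def plot_eda(df: pd.DataFrame) -> None:
    """Draw the heatmap (2.1), distribution plots (2.2), and bar plot (2.3)."""
    # Imported here so the numeric helpers above work without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(correlation_matrix(df), annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
    ax.set_title("Correlation matrix of numeric features")

    for column in ["satisfaction_level", "last_evaluation", "average_montly_hours"]:
        fig, ax = plt.subplots()
        sns.histplot(df[column], kde=True, ax=ax)
        ax.set_title(f"Distribution of {column}")

    fig, ax = plt.subplots()
    sns.countplot(data=df, x="number_project", hue="left", ax=ax)
    ax.set_title("Project count by turnover status")
    plt.show()


# Toy data standing in for the real HR data (random values, for illustration).
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    "satisfaction_level": rng.uniform(0, 1, n),
    "last_evaluation": rng.uniform(0.3, 1, n),
    "number_project": rng.integers(2, 8, n),
    "average_montly_hours": rng.integers(96, 311, n),
    "left": rng.integers(0, 2, n),
    "salary": np.resize(["low", "medium", "high"], n),
})

corr = correlation_matrix(toy)
counts = project_counts(toy)
# plot_eda(toy)  # uncomment to render the figures
```

The crosstab returned by project_counts holds the same numbers the bar plot shows, which can help when writing up the inferences.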

3. Perform clustering of employees who left based on their satisfaction and evaluation.

3.1. Choose columns satisfaction_level, last_evaluation, and left.
3.2. Do K-means clustering of employees who left the company into 3
clusters.
3.3. Based on the satisfaction and evaluation factors, give your thoughts
on the employee clusters.
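
A minimal sketch of step 3 with scikit-learn's KMeans. The function name cluster_leavers, the seed, and the synthetic points are assumptions; only the column names and the choice of 3 clusters come from the steps above.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

FEATURES = ["satisfaction_level", "last_evaluation"]


def cluster_leavers(df: pd.DataFrame, n_clusters: int = 3, seed: int = 123):
    """Cluster employees with left == 1 on satisfaction and evaluation."""
    leavers = df.loc[df["left"] == 1, FEATURES].copy()
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(leavers[FEATURES])
    leavers["cluster"] = labels
    centers = pd.DataFrame(model.cluster_centers_, columns=FEATURES)
    return leavers, centers


# Synthetic data: three made-up groups of leavers plus some stayers.
rng = np.random.default_rng(123)
group_means = [(0.10, 0.90), (0.40, 0.50), (0.80, 0.90)]
leaver_points = np.vstack([rng.normal(m, 0.03, size=(30, 2)) for m in group_means])
stayer_points = rng.uniform(0.3, 1.0, size=(50, 2))
toy = pd.DataFrame(np.vstack([leaver_points, stayer_points]), columns=FEATURES).clip(0, 1)
toy["left"] = [1] * 90 + [0] * 50

leavers, centers = cluster_leavers(toy)
print(centers.round(2))
```

Reading the cluster centers alongside a scatter plot of the two features is one way to describe each group in step 3.3.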

4. Handle the class imbalance in the "left" column using the SMOTE technique.

4.1. Pre-process the data by converting categorical columns to numerical
columns by:
■ Separating categorical variables and numeric variables
■ Applying get_dummies() to the categorical variables
■ Combining categorical variables and numeric variables
4.2. Do the stratified split of the dataset to train and test in the ratio 80:20
with random_state=123.
4.3. Upsample the train dataset using the SMOTE technique from the
imblearn module.
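
Steps 4.1 to 4.3 might look like the sketch below. The function names and the toy frame are assumptions. SMOTE is imported from imblearn.over_sampling, which requires the imbalanced-learn package, and is applied to the training split only.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Separate categorical and numeric columns, one-hot encode, recombine."""
    categorical = df.select_dtypes(exclude="number")
    numeric = df.select_dtypes(include="number")
    dummies = pd.get_dummies(categorical, columns=list(categorical.columns), dtype=int)
    return pd.concat([numeric, dummies], axis=1)


def stratified_split(features: pd.DataFrame, target: str = "left"):
    """80:20 stratified split with random_state=123, as stated in step 4.2."""
    X = features.drop(columns=target)
    y = features[target]
    return train_test_split(X, y, test_size=0.2, stratify=y, random_state=123)


def upsample_train(X_train, y_train, seed: int = 123):
    """Balance the training classes with SMOTE (needs imbalanced-learn)."""
    from imblearn.over_sampling import SMOTE

    return SMOTE(random_state=seed).fit_resample(X_train, y_train)


# Toy frame standing in for the HR data: 25 leavers out of 100 (made up).
n = 100
rng = np.random.default_rng(123)
toy = pd.DataFrame({
    "satisfaction_level": rng.uniform(0, 1, n),
    "number_project": rng.integers(2, 8, n),
    "Department": np.resize(["sales", "technical", "support", "IT"], n),
    "salary": np.resize(["low", "medium", "high"], n),
    "left": [1] * 25 + [0] * 75,
})

X_train, X_test, y_train, y_test = stratified_split(preprocess(toy))
# X_res, y_res = upsample_train(X_train, y_train)  # run where imblearn is installed
```

The test split is left untouched so that it still reflects the original class proportions when models are evaluated.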

5. Perform 5-fold cross-validation model training and evaluate performance.

5.1. Train a logistic regression model, apply a 5-fold CV, and plot the
classification report.
5.2. Train a Random Forest Classifier model, apply the 5-fold CV, and plot
the classification report.
5.3. Train a Gradient Boosting Classifier model, apply the 5-fold CV, and
plot the classification report.
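
One possible shape for step 5 is sketched below. The synthetic data from make_classification is an assumption standing in for the upsampled training set, and the hyperparameters are illustrative defaults rather than values given in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in for the training data (not the HR dataset).
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.75, 0.25], random_state=123
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=123),
    "Gradient Boosting": GradientBoostingClassifier(random_state=123),
}

# Five stratified folds; each sample is predicted by a model that did not see it.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=123)

reports = {}
for name, model in models.items():
    y_pred = cross_val_predict(model, X, y, cv=cv)
    reports[name] = classification_report(y, y_pred, output_dict=True)
    print(name)
    print(classification_report(y, y_pred))
```

Here cross_val_predict collects out-of-fold predictions, so a single classification report per model summarizes all five folds.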

6. Identify the best model and justify the evaluation metrics used.
6.1. Find the ROC/AUC for each model and plot the ROC curve.
6.2. Find the confusion matrix for each of the models.
6.3. Explain which metric needs to be used from the confusion matrix:
Recall or Precision?
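
The quantities in step 6 can be computed as in this sketch for a single model. The synthetic data and the choice of logistic regression are assumptions for illustration; the same calls apply to each fitted model.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the HR data (not the real dataset).
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.75, 0.25], random_state=123
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=123
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]  # probability of class 1 (left)

# Points of the ROC curve and the area under it.
fpr, tpr, thresholds = roc_curve(y_test, proba)
auc = roc_auc_score(y_test, proba)

# Confusion matrix at the default 0.5 threshold.
tn, fp, fn, tp = confusion_matrix(y_test, model.predict(X_test)).ravel()
recall = tp / (tp + fn)  # share of actual leavers that were caught
precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of flagged who left

print(f"AUC={auc:.3f} recall={recall:.3f} precision={precision:.3f}")
# To draw the curve: plt.plot(fpr, tpr) with matplotlib.pyplot imported as plt.
```

Recall penalizes missed leavers (false negatives), while precision penalizes false alarms (false positives), which is the trade-off step 6.3 asks about.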

7. Suggest various retention strategies for targeted employees.

7.1. Using the best model, predict the probability of employee turnover
in the test data.
7.2. Based on the probability score range below, categorize the
employees into four zones and suggest your thoughts on the
retention strategies for each zone.
■ Safe Zone (Green) (Score < 20%)
■ Low-Risk Zone (Yellow) (20% < Score < 60%)
■ Medium-Risk Zone (Orange) (60% < Score < 90%)
■ High-Risk Zone (Red) (Score > 90%).
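
The zone assignment in step 7.2 could be sketched with pandas.cut. The ranges above do not say where scores of exactly 20%, 60%, or 90% belong, so this sketch assumes a boundary score falls into the higher-risk zone. The function name assign_zone and the sample scores are hypothetical.

```python
import numpy as np
import pandas as pd

ZONE_LABELS = [
    "Safe Zone (Green)",
    "Low-Risk Zone (Yellow)",
    "Medium-Risk Zone (Orange)",
    "High-Risk Zone (Red)",
]


def assign_zone(probabilities) -> pd.Series:
    """Map turnover probabilities in [0, 1] to the four risk zones.

    Assumption: intervals are closed on the left, so 0.20, 0.60, and 0.90
    are placed in the higher-risk zone.
    """
    bins = [0.0, 0.2, 0.6, 0.9, np.inf]
    scores = pd.Series(probabilities, dtype=float)
    return pd.cut(scores, bins=bins, labels=ZONE_LABELS, right=False)


# Made-up scores; in the project these would come from predict_proba on the test data.
scores = [0.05, 0.20, 0.45, 0.75, 0.90, 1.0]
zones = assign_zone(scores)
print(pd.DataFrame({"score": scores, "zone": zones}))
```

Counting employees per zone with zones.value_counts() gives a quick view of how many people each retention strategy would target.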
