
Understanding and Predicting Employee Turnover

A case study in prescriptive analytics

We'll be covering:
• Descriptive Analytics - What happened?
• Predictive Analytics - What might happen?
• Prescriptive Analytics - What should we do?

Objective:
• To understand what factors contributed most to employee turnover.
• To perform clustering of employees who left, based on their satisfaction and evaluation scores.
• To create a model that predicts the likelihood that a given employee will leave the company.
• To create or improve retention strategies targeted at at-risk employees.

Reading the Data

# Import the necessary modules for data manipulation and visual representation
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as matplot
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline

df = pd.read_csv('HR_comma_sep.csv')

# Examine the dataset
df.head()

   satisfaction_level  last_evaluation  number_project  average_montly_hours  \
0                0.38             0.53               2                   157
1                0.80             0.86               5                   262
2                0.11             0.88               7                   272
3                0.72             0.87               5                   223
4                0.37             0.52               2                   159

   time_spend_company  Work_accident  left  promotion_last_5years  sales  salary
0                   3              0     1                      0  sales     low
1                   6              0     1                      0  sales  medium
2                   4              0     1                      0  sales  medium
3                   5              0     1                      0  sales     low
4                   3              0     1                      0  sales     low

# Rename Columns
# Renaming certain columns for better readability
df = df.rename(columns={'satisfaction_level': 'satisfaction',
'last_evaluation': 'evaluation',
'number_project': 'projectCount',
'average_montly_hours': 'averageMonthlyHours',
'time_spend_company': 'yearsAtCompany',
'Work_accident': 'workAccident',
'promotion_last_5years': 'promotion',
'sales' : 'department',
'left' : 'turnover'
})

df.shape

(14999, 10)

df.head()

   satisfaction  evaluation  projectCount  averageMonthlyHours  yearsAtCompany  \
0          0.38        0.53             2                  157               3
1          0.80        0.86             5                  262               6
2          0.11        0.88             7                  272               4
3          0.72        0.87             5                  223               5
4          0.37        0.52             2                  159               3

   workAccident  turnover  promotion department  salary
0             0         1          0      sales     low
1             0         1          0      sales  medium
2             0         1          0      sales  medium
3             0         1          0      sales     low
4             0         1          0      sales     low

round(df.turnover.value_counts(1), 2)

0 0.76
1 0.24
Name: turnover, dtype: float64

plt.figure(figsize=(12,8))
turnover = df.turnover.value_counts()
sns.barplot(y=turnover.values, x=turnover.index, alpha=0.6)
plt.title('Distribution of Employee Turnover')
plt.xlabel('Employee Turnover', fontsize=16)
plt.ylabel('Count', fontsize=16);
1. Perform a data quality check by looking for missing values, if any

# Can you check to see if there are any missing values in our data set
df.isnull().any()

satisfaction False
evaluation False
projectCount False
averageMonthlyHours False
yearsAtCompany False
workAccident False
turnover False
promotion False
department False
salary False
dtype: bool

# Check the type of our features. Are there any data inconsistencies?
df.dtypes

satisfaction float64
evaluation float64
projectCount int64
averageMonthlyHours int64
yearsAtCompany int64
workAccident int64
turnover int64
promotion int64
department object
salary object
dtype: object

2. Understand what factors contributed most to employee turnover through EDA

# Display the statistical overview of the employees
round(df.describe(), 2)

       satisfaction  evaluation  projectCount  averageMonthlyHours  yearsAtCompany  workAccident  turnover  promotion
count      14999.00    14999.00      14999.00             14999.00        14999.00      14999.00  14999.00   14999.00
mean           0.61        0.72          3.80               201.05            3.50          0.14      0.24       0.02
std            0.25        0.17          1.23                49.94            1.46          0.35      0.43       0.14
min            0.09        0.36          2.00                96.00            2.00          0.00      0.00       0.00
25%            0.44        0.56          3.00               156.00            3.00          0.00      0.00       0.00
50%            0.64        0.72          4.00               200.00            3.00          0.00      0.00       0.00
75%            0.82        0.87          5.00               245.00            4.00          0.00      0.00       0.00
max            1.00        1.00          7.00               310.00           10.00          1.00      1.00       1.00

# Display the mean summary of employees (turnover vs. non-turnover).
# What do you notice between the groups?
turnover_Summary = df.groupby('turnover')

round(turnover_Summary.mean(), 2)

          satisfaction  evaluation  projectCount  averageMonthlyHours  yearsAtCompany  workAccident  promotion
turnover
0                 0.67        0.72          3.79               199.06            3.38          0.18       0.03
1                 0.44        0.72          3.86               207.42            3.88          0.05       0.01

round(turnover_Summary.std(), 2)

          satisfaction  evaluation  projectCount  averageMonthlyHours  yearsAtCompany  workAccident  promotion
turnover
0                 0.22        0.16          0.98                45.68            1.56          0.38       0.16
1                 0.26        0.20          1.82                61.20            0.98          0.21       0.07

Correlation Matrix

# Create a correlation matrix. What features correlate the most with
# turnover? What other correlations did you find?
corr = df.corr()
corr

                     satisfaction  evaluation  projectCount  averageMonthlyHours  yearsAtCompany  workAccident  turnover  promotion
satisfaction             1.000000    0.105021     -0.142970            -0.020048       -0.100866      0.058697 -0.388375   0.025605
evaluation               0.105021    1.000000      0.349333             0.339742        0.131591     -0.007104  0.006567  -0.008684
projectCount            -0.142970    0.349333      1.000000             0.417211        0.196786     -0.004741  0.023787  -0.006064
averageMonthlyHours     -0.020048    0.339742      0.417211             1.000000        0.127755     -0.010143  0.071287  -0.003544
yearsAtCompany          -0.100866    0.131591      0.196786             0.127755        1.000000      0.002120  0.144822   0.067433
workAccident             0.058697   -0.007104     -0.004741            -0.010143        0.002120      1.000000 -0.154622   0.039245
turnover                -0.388375    0.006567      0.023787             0.071287        0.144822     -0.154622  1.000000  -0.061788
promotion                0.025605   -0.008684     -0.006064            -0.003544        0.067433      0.039245 -0.061788   1.000000

plt.figure(figsize=(15,10))
sns.heatmap(corr, xticklabels=corr.columns.values,
yticklabels=corr.columns.values, annot=True)
plt.title('Heatmap of Correlation Matrix');
Distribution of Satisfaction, Evaluation, and Monthly Hours

# Plot the distributions of Employee Satisfaction, Evaluation, and
# Average Monthly Hours. What story can you tell?

# Set up the matplotlib figure
f, axes = plt.subplots(ncols=3, figsize=(16, 8))

# Graph Employee Satisfaction
sns.distplot(df.satisfaction, kde=False, color="g",
             ax=axes[0]).set_title('Employee Satisfaction Distribution')
axes[0].set_ylabel('Employee Count');

# Graph Employee Evaluation
sns.distplot(df.evaluation, kde=False, color="r",
             ax=axes[1]).set_title('Employee Evaluation Distribution')
axes[1].set_ylabel('Employee Count');

# Graph Employee Average Monthly Hours
sns.distplot(df.averageMonthlyHours, kde=False, color="b",
             ax=axes[2]).set_title('Employee Average Monthly Hours Distribution')
axes[2].set_ylabel('Employee Count');
Bar plot of employee project count
plt.figure(figsize=(20,8))
ax = sns.barplot(x="projectCount", y="projectCount", hue="turnover",
data=df, estimator=lambda x: len(x) / len(df) * 100)
ax.set(ylabel="Percent");

• More than half of the employees with 2, 6, or 7 projects left the company.
• The majority of employees who did not leave the company had 3, 4, or 5 projects.
• All of the employees with 7 projects left the company.
• The employee turnover rate rises as the project count increases, as the quick check below confirms.
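As a quick check of these observations, here is a minimal sketch (an addition, assuming the renamed df from above is in scope; not part of the original notebook) that tabulates the turnover rate per project count:

# Turnover rate by project count (sanity check of the bullet points above)
project_turnover = df.groupby('projectCount')['turnover'].mean().round(2)
print(project_turnover)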

3. Perform clustering of Employees who left based on their satisfaction and evaluation
# Import KMeans Model
from sklearn.cluster import KMeans

# Graph and create 3 clusters of Employee Turnover
kmeans = KMeans(n_clusters=3, random_state=2)
kmeans.fit(df[df.turnover==1][["satisfaction","evaluation"]])

kmeans_colors = ['green' if c == 0 else 'blue' if c == 2 else 'red'
                 for c in kmeans.labels_]

fig = plt.figure(figsize=(10, 6))
plt.scatter(x="satisfaction", y="evaluation", data=df[df.turnover==1],
            alpha=0.25, color=kmeans_colors)
plt.xlabel("Satisfaction")
plt.ylabel("Evaluation")
plt.scatter(x=kmeans.cluster_centers_[:,0], y=kmeans.cluster_centers_[:,1],
            color="black", marker="X", s=100)
plt.title("Clusters of Employee Turnover")

plt.show();

Cluster 1 (Blue): Hard-working and Sad Employees
Cluster 2 (Red): Bad and Sad Employees
Cluster 3 (Green): Hard-working and Happy Employees

There are 3 distinct clusters of employees who left the company.

Cluster 1 (Hard-working and Sad Employees): Satisfaction was below 0.2 and evaluations were greater than 0.75, which could be a good indication that these employees were good workers but felt horrible at their job.
• Question: What could be the reason for feeling so horrible when you are highly evaluated? Could it be working too hard? Could this cluster represent employees who are "overworked"?
Cluster 2 (Bad and Sad Employees): Satisfaction was between about 0.35 and 0.45 and evaluations were below about 0.58. These could be seen as employees who were evaluated poorly and felt bad at work.
• Question: Could this cluster represent employees who "under-performed"?
Cluster 3 (Hard-working and Happy Employees): Satisfaction was between 0.7 and 1.0 and evaluations were greater than 0.8, which could mean that employees in this cluster were "ideal": they loved their work and were evaluated highly for their performance.
• Question: Could this cluster mean that employees left because they found another job opportunity?
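To double-check the boundaries described above, here is a rough sketch (an addition, assuming kmeans and df from the clustering step are still in scope; note that KMeans label numbering is arbitrary) that summarizes satisfaction and evaluation per fitted cluster:

# Summarize each fitted cluster (sanity check; cluster label order is arbitrary)
left_df = df[df.turnover == 1][["satisfaction", "evaluation"]].copy()
left_df["cluster"] = kmeans.labels_
print(left_df.groupby("cluster").agg(["mean", "min", "max"]).round(2))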

Pre-processing

4. Handle the class imbalance of the turnover (left) target using the SMOTE technique.

4a. Pre-process the data by converting categorical columns to numerical

• Separate categorical variables and numeric variables.
• Apply get_dummies() to the categorical variables.
• Combine categorical variables and numeric variables.
cat_var = ['department', 'salary', 'turnover', 'promotion']
num_var = ['satisfaction', 'evaluation', 'projectCount', 'averageMonthlyHours',
           'yearsAtCompany', 'workAccident']
categorical_df = pd.get_dummies(df[cat_var], drop_first=True, dummy_na=True)
numerical_df = df[num_var]

new_df = pd.concat([categorical_df, numerical_df], axis=1)
new_df.head()

   turnover  promotion  department_RandD  department_accounting  department_hr  \
0         1          0                 0                      0              0
1         1          0                 0                      0              0
2         1          0                 0                      0              0
3         1          0                 0                      0              0
4         1          0                 0                      0              0

   department_management  department_marketing  department_product_mng  \
0                      0                     0                        0
1                      0                     0                        0
2                      0                     0                        0
3                      0                     0                        0
4                      0                     0                        0

   department_sales  department_support  ...  department_nan  salary_low  \
0                 1                   0  ...               0           1
1                 1                   0  ...               0           0
2                 1                   0  ...               0           0
3                 1                   0  ...               0           1
4                 1                   0  ...               0           1

   salary_medium  salary_nan  satisfaction  evaluation  projectCount  \
0              0           0          0.38        0.53             2
1              1           0          0.80        0.86             5
2              1           0          0.11        0.88             7
3              0           0          0.72        0.87             5
4              0           0          0.37        0.52             2

   averageMonthlyHours  yearsAtCompany  workAccident
0                  157               3             0
1                  262               6             0
2                  272               4             0
3                  223               5             0
4                  159               3             0

[5 rows x 21 columns]

new_df.shape
(14999, 21)

4b. Split Train/Test Set

Let's split our data into a train and test set. We'll fit our models on the train set and hold out the test set for the final evaluation.
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, classification_report,
                             precision_score, recall_score,
                             confusion_matrix, precision_recall_curve)

# Create the X and y set
X = new_df.iloc[:, 1:]
y = new_df.iloc[:, 0]

# Define train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20,
                                                    random_state=123, stratify=y)

print(X_train.shape)
print(X_test.shape)

(11999, 20)
(3000, 20)

4c. Class Imbalance

Employee Turnover Rate: 24%

round(df.turnover.value_counts(1), 2)

0 0.76
1 0.24
Name: turnover, dtype: float64

4d. Upsample the train dataset using the SMOTE technique

from sklearn.utils import resample
from imblearn.over_sampling import SMOTE

# Upsample using SMOTE
sm = SMOTE(random_state=12, sampling_strategy=1.0)
x_train_sm, y_train_sm = sm.fit_resample(X_train, y_train)
print("Original shape:", X_train.shape, y_train.shape)
print("SMOTE sample shape:", x_train_sm.shape, y_train_sm.shape)

Original shape: (11999, 20) (11999,)
SMOTE sample shape: (18284, 20) (18284,)

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

Perform 5-fold cross-validation model training and evaluate performance

We're training 3 models:
1. Logistic Regression Classifier
2. Random Forest Classifier
3. Gradient Boosting Classifier

Logistic Regression Classifier


from sklearn.metrics import roc_auc_score
from sklearn.metrics import accuracy_score

lr = LogisticRegression()
lr = lr.fit(x_train_sm, y_train_sm)
lr

LogisticRegression()

Apply 5-Fold Cross Validation on Logistic Regression


lr_result = cross_val_score(lr, x_train_sm, y_train_sm, cv=5,
scoring='f1')
lr_result.mean()

0.7877848483907197

Logistic Regression Classifier AUC (0.77)


print ("\n\n ---Logistic Regression Model---")
lr_auc = roc_auc_score(y_test, lr.predict(X_test))

print ("Logistic Regression AUC = %2.2f" % lr_auc)

lr2 = lr.fit(x_train_sm, y_train_sm)


print(classification_report(y_test, lr.predict(X_test)))

 ---Logistic Regression Model---
Logistic Regression AUC = 0.77

              precision    recall  f1-score   support

           0       0.92      0.75      0.83      2286
           1       0.50      0.79      0.61       714

    accuracy                           0.76      3000
   macro avg       0.71      0.77      0.72      3000
weighted avg       0.82      0.76      0.78      3000

Random Forest Classifier


from sklearn.ensemble import RandomForestClassifier

# Random Forest Model
rf = RandomForestClassifier()
rf = rf.fit(x_train_sm, y_train_sm)
rf

RandomForestClassifier()

Apply 5-Fold Cross Validation on Random Forest


rf_result = cross_val_score(rf, x_train_sm, y_train_sm, cv=5,
scoring='f1')
rf_result.mean()

0.9805593654003257

Random Forest Classifier AUC (0.98)


from sklearn.metrics import roc_auc_score

print("\n\n ---Random Forest Model---")
rf_roc_auc = roc_auc_score(y_test, rf.predict(X_test))
print("Random Forest AUC = %2.2f" % rf_roc_auc)
print(classification_report(y_test, rf.predict(X_test)))

 ---Random Forest Model---
Random Forest AUC = 0.98

              precision    recall  f1-score   support

           0       0.99      0.99      0.99      2286
           1       0.97      0.98      0.97       714

    accuracy                           0.99      3000
   macro avg       0.98      0.98      0.98      3000
weighted avg       0.99      0.99      0.99      3000

Gradient Boosting Classifier


from sklearn.ensemble import GradientBoostingClassifier

gbc = GradientBoostingClassifier()
gbc = gbc.fit(x_train_sm,y_train_sm)
gbc

GradientBoostingClassifier()

Apply 5-Fold Cross Validation on Gradient Boosting Classifier


gbc_result = cross_val_score(gbc, x_train_sm, y_train_sm, cv=5,
scoring='f1')
gbc_result.mean()

0.9580470716647115

Gradient Boosting Classifier AUC (0.96)


from sklearn.metrics import roc_auc_score

print("\n\n ---Gradient Boosting Model---")
gbc_auc = roc_auc_score(y_test, gbc.predict(X_test))
print("Gradient Boosting Classifier AUC = %2.2f" % gbc_auc)
print(classification_report(y_test, gbc.predict(X_test)))

 ---Gradient Boosting Model---
Gradient Boosting Classifier AUC = 0.96

              precision    recall  f1-score   support

           0       0.98      0.97      0.98      2286
           1       0.91      0.94      0.92       714

    accuracy                           0.96      3000
   macro avg       0.95      0.96      0.95      3000
weighted avg       0.96      0.96      0.96      3000

Model Evaluation (F1 Score & ROC/AUC)

ROC Graph

# Create ROC Graph
from sklearn.metrics import roc_curve
from sklearn.metrics import confusion_matrix

fpr, tpr, thresholds = roc_curve(y_test, lr.predict_proba(X_test)[:,1])
rf_fpr, rf_tpr, rf_thresholds = roc_curve(y_test, rf.predict_proba(X_test)[:,1])
gbc_fpr, gbc_tpr, gbc_thresholds = roc_curve(y_test, gbc.predict_proba(X_test)[:,1])

plt.figure(figsize=(15,12))

# Plot Logistic Regression ROC
plt.plot(fpr, tpr, label='Logistic Regression (area = %0.2f)' % lr_auc)

# Plot Random Forest ROC
plt.plot(rf_fpr, rf_tpr, label='Random Forest Classifier (area = %0.2f)' % rf_roc_auc)

# Plot Gradient Boosting ROC
plt.plot(gbc_fpr, gbc_tpr, label='Gradient Boosting Classifier (area = %0.2f)' % gbc_auc)

# Plot Base Rate ROC
plt.plot([0,1], [0,1], label='Base Rate')

plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Graph')
plt.legend(loc="lower right")
plt.show();
# Confusion Matrix for Logistic Regression
confusion_matrix(y_test, lr.predict(X_test))

array([[1720, 566],
[ 153, 561]])

# Confusion Matrix for GBC
confusion_matrix(y_test, gbc.predict(X_test))

array([[2220, 66],
[ 43, 671]])

# Confusion Matrix for Random Forest
confusion_matrix(y_test, rf.predict(X_test))

array([[2261, 25],
[ 16, 698]])

Recall or Precision?

It depends on how much cost/weight you want to put on your two types of errors: (1) false positives or (2) false negatives.
We want our machine learning model to capture as much of the minority class (the turnover group) as possible. Our objective is to catch all of the highly probable turnover employees, even at the risk of flagging some low-risk non-turnover employees.
• Consider the employee turnover domain: an employee is given treatment by Human Resources because HR thinks the employee will leave the company within a month, but the employee actually does not. This is a false positive. This mistake could be expensive, inconvenient, and time-consuming for both Human Resources and the employee, but it is still a good investment in relational growth.

• Compare this with the opposite error, where Human Resources does not give treatment/incentives to an employee and they do leave. This is a false negative. This type of error is more detrimental because the company loses an employee, which could lead to significant setbacks and the additional cost of rehiring.

• Depending on these errors, different costs are weighed based on the type of employee being treated. For example, if it's a high-salary employee, would we need a costlier form of treatment? What if it's a low-salary employee? The cost of each error is different and should be weighed accordingly.

Optimize Recall When:
• You want to limit false negatives
• You will accept more false positives as a result
• The cost of a false negative outweighs that of a false positive (FN > FP)
• Example: predicting cancer patients or credit card fraud

Optimize Precision When:
• You want to limit false positives
• The cost of a false positive outweighs that of a false negative (FP > FN)
• Example: spam vs. not spam

One practical way to act on this trade-off is to move the decision threshold on the predicted probabilities, as sketched below.
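Here is a minimal threshold sketch (an addition, assuming rf, X_test, and y_test from above are in scope): lowering the cut-off below the default 0.5 trades precision for recall on the turnover class.

# Trade precision for recall by lowering the decision threshold
from sklearn.metrics import precision_score, recall_score

probs = rf.predict_proba(X_test)[:, 1]
for threshold in [0.5, 0.3, 0.2]:
    preds = (probs >= threshold).astype(int)
    print("threshold=%.1f  precision=%.2f  recall=%.2f"
          % (threshold, precision_score(y_test, preds), recall_score(y_test, preds)))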

7. Suggest various retention strategies on targeted employees.

7a. Using the best model, predict the probability of employee turnover in the test data

Retention Plan

# Ranking turnover probability for employees
rf.predict_proba(X_test)[175:185]

array([[0.96, 0.04],
[1. , 0. ],
[1. , 0. ],
[0. , 1. ],
[0. , 1. ],
[0.84, 0.16],
[0.81, 0.19],
[0.01, 0.99],
[0.18, 0.82],
[0.99, 0.01]])

list(rf.predict_proba(X_test)[175:185, 1])

[0.04, 0.0, 0.0, 1.0, 1.0, 0.16, 0.19, 0.99, 0.82, 0.01]

list(rf.predict_proba(X_test)[175:185, 1] > 0.2)

[False, False, False, True, True, False, False, True, True, False]

Since this model is being used for people, we should refrain from solely relying on the output of our model. Instead, we can use its probability output and design our own system to treat each employee accordingly.
1. Safe Zone (Green) – Employees within this zone are considered safe.
2. Low Risk Zone (Yellow) – Employees within this zone should be watched as potential turnover. This is more of a long-term track.
3. Medium Risk Zone (Orange) – Employees within this zone are at risk of turnover. Action should be taken and monitored accordingly.
4. High Risk Zone (Red) – Employees within this zone are considered to have the highest chance of turnover. Action should be taken immediately.
Safe Zone (Green)
• No action required
Low Risk Zone (Yellow)
• Action to be taken on a long-term basis
• Apply group interventions
• HR to track demographic data for these individuals to see if the risk profiles are changing or if the equation needs to be altered
Medium Risk Zone (Orange)
• Action to be taken on a medium-term basis
• HR to keep a close watch on behavioral status for a change from "Medium" to "High" risk. HR to analyze demographic data to identify high-risk supervisors and point them out to the BU heads
• Apply group interventions
High Risk Zone (Red)
• Action to be taken on an immediate basis
• HR to send the list to the concerned managers for immediate action
• HR to validate the risks for consistency with the identified clusters
• Managers to have one-to-one conversations with the identified employees

A rough way to map model probabilities onto these zones is sketched below.
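As an illustration (the zone cut-offs below are assumptions made for this sketch, not values from the original analysis), the predicted probabilities can be binned into the four retention zones:

# Bin predicted turnover probabilities into retention zones (illustrative cut-offs)
probs = rf.predict_proba(X_test)[:, 1]
zones = pd.cut(probs,
               bins=[0.0, 0.2, 0.4, 0.6, 1.0],
               labels=["Safe (Green)", "Low Risk (Yellow)",
                       "Medium Risk (Orange)", "High Risk (Red)"],
               include_lowest=True)
print(pd.Series(zones).value_counts())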

Conclusion
What to Optimize
Binary Classification: Turnover vs. Non-Turnover
Instance Scoring: Likelihood of an employee responding to an offer/incentive that would keep them from leaving.
Need for Application: Save employees from leaving.
In our employee retention problem, rather than simply predicting whether an employee will leave the company within a certain time frame, we would much rather have an estimate of the probability that he/she will leave the company. We would rank employees by their probability of leaving, then allocate a limited incentive budget to the highest-probability instances.

Solution 1:
• We can rank employees by their probability of leaving, then allocate a limited incentive budget to the highest-probability instances (see the sketch after this list).
• OR, we can allocate our incentive budget to the instances with the highest expected loss, for which we'll need the probability of turnover.
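A minimal ranking sketch (an addition; the budget size below is a hypothetical number, not from the original analysis):

# Rank test-set employees by predicted turnover probability and target the top N
incentive_budget = 50  # hypothetical number of employees the budget can cover

ranked = (pd.DataFrame({"turnover_probability": rf.predict_proba(X_test)[:, 1]},
                       index=X_test.index)
          .sort_values("turnover_probability", ascending=False))
targeted = ranked.head(incentive_budget)
print(targeted.head())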
Solution 2:
• Develop learning programs for managers, then use analytics to gauge their performance and measure progress.
• Be a good coach; empower the team and do not micromanage.
• Express interest in team members' success.
• Have a clear vision/strategy for the team.
• Help the team with career development.

Selection Bias

• One thing to note about this dataset is the turnover feature. We don't know whether the employees who left were interns, contractors, full-time, or part-time. These are important variables to take into consideration when applying a machine learning algorithm to the data.

• Another thing to note is the bias in the evaluation feature. Evaluation is heavily subjective and can vary tremendously depending on who the evaluator is. If the employee knows the evaluator, then he/she will probably receive a higher score.
