BDA Finaltest

This report analyzes employee attrition using the IBM HR Analytics dataset, focusing on key factors contributing to attrition and strategies to reduce it. The methodology includes descriptive, exploratory, predictive, and prescriptive analyses, employing techniques like regression and classification. The findings highlight significant correlations between various employee attributes and attrition rates, emphasizing the importance of HR analytics in optimizing workforce management.

VIETNAM NATIONAL UNIVERSITY, HANOI

VNU – INTERNATIONAL SCHOOL

⁎⁎⁎

REPORT
Introduction to Business Data Analytics
Topic: Employee attrition analysis

Member’s name Student ID No


Nguyễn Việt Anh 22070679
Đỗ Thị Cẩm Ly 22070858
Dương Thị Vân Trang 22070850
Trần Mạnh Tuấn 22070638
Trần Tuấn Tú 22070628

Lecturer: Trần Đức Quỳnh


Class: INS1053 – INS105301

Hanoi, May 13, 2023


TABLE OF CONTENTS

INTRODUCTION
1. Reasons for selecting the topic
2. Objective
3. About the dataset
METHODOLOGY
CHAPTER 1: DESCRIPTIVE ANALYSIS
1. Data preprocessing
2. Descriptive Statistics for Nominal Variables
3. Descriptive Statistics for Quantitative Variables
CHAPTER 2: EXPLORATORY DATA ANALYSIS
1. Correlation Analysis between Variables
2. The Most Influential Factors on Employee Attrition
3. Factors Driving Employee Attrition: Visualizing and Analyzing the Impact
3.1. Visualizing Factors Influencing Employee Attrition
3.2. Analyzing Individual Factors that Contribute to Employee Attrition
CHAPTER 3: PREDICTIVE ANALYSIS
Logistic Regression
CHAPTER 4: PRESCRIPTIVE ANALYSIS
CONCLUSION
APPENDIX
INTRODUCTION

1. Reasons for selecting the topic

As the business environment becomes increasingly competitive,
organizations are seeking to optimize their human resources (HR)
management strategies to attract and retain talented employees. HR
Analytics has emerged as a powerful tool for optimizing HR management
practices. With data analysis tools, HR managers can make informed
decisions and respond quickly to changes in the labor market.

Employee attrition, the loss of employees from an organization, is a
critical challenge that businesses face today. The cost of recruiting and
training new employees can be high, and disruptions in production or
services caused by employee turnover can harm business operations.
Therefore, it is crucial for organizations to understand the causes and
consequences of employee attrition and to develop effective strategies to
reduce it.

2. Objective
The primary objective of this report is to provide a comprehensive
overview of HR Analytics and its application in Employee Attrition
analysis. Specifically, this report aims to answer two questions:
• What are the key factors contributing to employee attrition?
• How can employee attrition be reduced, based on the dataset?

3. About the dataset

The “IBM HR Analytics Employee Attrition and Performance” dataset is
provided by IBM and contains information on the attrition and performance
of employees in a simulated company. The dataset includes 1470 employee
records with 35 attributes, such as age, gender, department, education,
monthly income, and job role. These attributes cover factors that may
affect an employee's decision to leave the company, including age, salary,
job role, and job satisfaction.

The dataset is a valuable resource for researchers and HR managers
studying the factors that affect employee attrition and performance,
helping organizations develop effective strategies to retain employees and
optimize business operations.
Data source: https://www.kaggle.com/code/hadeerismail/ibm-hr-analytics-employee-attrition-performance#Data-visualization

METHODOLOGY

• Regression: Find out the main factors affecting attrition rate

Regression analysis is a statistical technique used to model the
relationship between one or more independent variables and a dependent
variable. In this case, we want to identify the main factors that affect
attrition in a company. Because attrition is a binary outcome (an employee
either leaves or stays), logistic regression can be used to model the
relationship between the independent variables and the probability of
attrition. The results can be evaluated using techniques such as
coefficient significance testing, goodness-of-fit measures (adjusted
R-squared for a linear model, or a pseudo R-squared for a logistic model),
and residual analysis to identify the main factors contributing to
attrition.

• Classification: Analyze in detail the factors that influence attrition

Classification analysis is used to separate employees who have left from
those who have not and to analyze the factors that influence this
decision. A predictive model can be built using classification algorithms
such as logistic regression. The model can then be used to predict the
attrition status of new employees based on their characteristics, and the
factors with the strongest influence can be identified using feature
importance analysis. To analyze each influencing factor in detail,
attrition rates can be calculated for different subgroups of employees.
Data visualization techniques such as box plots, scatter plots, and heat
maps can also be used to explore the relationship between each factor and
attrition.
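
The subgroup attrition rates mentioned above can be sketched in a few lines. This is only an illustration on made-up records, not the report's own code; the helper name `attrition_rate_by` and the sample rows are assumptions introduced here.

```python
from collections import defaultdict


def attrition_rate_by(records, key):
    """Return {subgroup: share of employees with Attrition == 'Yes'}."""
    totals, leavers = defaultdict(int), defaultdict(int)
    for row in records:
        totals[row[key]] += 1
        leavers[row[key]] += row["Attrition"] == "Yes"
    return {group: leavers[group] / totals[group] for group in totals}


# Made-up records for illustration only (not taken from the dataset).
sample = [
    {"OverTime": "Yes", "Attrition": "Yes"},
    {"OverTime": "Yes", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "Yes"},
    {"OverTime": "No", "Attrition": "No"},
]
rates = attrition_rate_by(sample, "OverTime")
print(rates)
```

The same helper can be called with any categorical column (for example "Department" or "BusinessTravel") to compare subgroups.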

CHAPTER 1: DESCRIPTIVE ANALYSIS

1. Data preprocessing:

To preprocess the "Employee Attrition" dataset, we performed four key
steps.
First, we checked for missing values and found none.
Second, we checked for duplicate records and dropped any that were found,
so that the dataset contains only unique records and duplicated rows do
not bias the analysis.
Third, we removed the redundant columns "EmployeeCount",
"EmployeeNumber", "Over18", and "StandardHours", since they provide no
useful information for our analytical or predictive models.
Finally, we converted the categorical variables "BusinessTravel",
"Department", "EducationField", "Gender", "JobRole", "MaritalStatus",
and "OverTime" into numerical values using one-hot encoding. This was
necessary because most machine learning algorithms require all input
variables to be numerical.
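
The report does not reproduce its preprocessing code, so the following is only a minimal, dependency-free sketch of the four steps described above, run on three made-up records. The helper name `preprocess` and the sample rows are illustrative assumptions, not the authors' implementation.

```python
# Columns the report drops because they carry no useful information.
REDUNDANT = {"EmployeeCount", "EmployeeNumber", "Over18", "StandardHours"}


def preprocess(records):
    """Apply the four steps to a list of dict records (hypothetical helper)."""
    # Step 1: count missing values (None or empty string).
    n_missing = sum(v in (None, "") for row in records for v in row.values())

    # Step 2: drop exact duplicate rows, keeping the first occurrence.
    seen, unique = set(), []
    for row in records:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # Step 3: remove the redundant columns.
    trimmed = [{k: v for k, v in row.items() if k not in REDUNDANT} for row in unique]

    # Step 4: one-hot encode every string-valued column.
    categories = {}
    for row in trimmed:
        for k, v in row.items():
            if isinstance(v, str):
                categories.setdefault(k, set()).add(v)
    encoded = []
    for row in trimmed:
        out = {k: v for k, v in row.items() if k not in categories}
        for col, values in categories.items():
            for value in sorted(values):
                out[f"{col}_{value}"] = int(row[col] == value)
        encoded.append(out)
    return n_missing, encoded


# Made-up sample rows (the third duplicates the first).
sample = [
    {"Age": 41, "OverTime": "Yes", "Gender": "Female", "StandardHours": 80},
    {"Age": 49, "OverTime": "No", "Gender": "Male", "StandardHours": 80},
    {"Age": 41, "OverTime": "Yes", "Gender": "Female", "StandardHours": 80},
]
n_missing, encoded = preprocess(sample)
print(n_missing, encoded[0])
```

Each categorical column becomes one 0/1 column per category (for example `OverTime_Yes` and `OverTime_No`), which is the naming that appears later in the report as `overtime_yes`.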

2. Descriptive Statistics for Nominal Variables

Number of employees: 1470

Variable          Category                     Count
Attrition         No                            1233
                  Yes                            237
Business Travel   Travel_Rarely                 1043
                  Travel_Frequently              277
                  Non-Travel                     150
Department        Research & Development         961
                  Sales                          446
                  Human Resources                 63
Education Field   Life Sciences                  606
                  Medical                        464
                  Marketing                      159
                  Technical Degree               132
                  Other                           82
                  Human Resources                 27
Gender            Male                           882
                  Female                         588
JobRole           Sales Executive                326
                  Research Scientist             292
                  Laboratory Technician          259
                  Manufacturing Director         145
                  Healthcare Representative      131
                  Manager                        102
                  Sales Representative            83
                  Research Director               80
                  Human Resources                 52
Marital Status    Married                        673
                  Single                         470
                  Divorced                       327
OverTime          No                            1054
                  Yes                            416

Observations:

• Attrition: 237 employees (16.1% of the total) have left the company
• Business Travel: Most employees (71%) are in the "Travel_Rarely"
category
• Department: There are three departments, and most employees belong to
the Research & Development department
• Education Field: There are six academic fields, and most employees
come from the Life Sciences and Medical fields
• Gender: About 60% of the employees are male and 40% are female
• JobRole: There are nine job roles, and the most common are Sales
Executive and Research Scientist
• Marital Status: About 46% of the employees are married, 32% are
single, and 22% are divorced
• OverTime: About 28% of the employees work overtime

• Attrition rate:

Observations: Based on the pie chart of attrition rate, the majority of
employees in the dataset (83.9%) did not leave the company. However, a
notable proportion (16.1%) did leave, which indicates that attrition is a
concern for the organization.

• Number of employees in each department:

Observations: The Research & Development department has the largest
number of employees, accounting for 65.4% of all employees in the
company. The Sales and Human Resources departments follow with 30.3% and
4.3%, respectively.
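
Percentages like those in the observations above can be derived directly from the counts in the frequency table. The short check below uses only counts that appear in that table; the dictionary labels are just illustrative names.

```python
TOTAL = 1470  # total number of employees reported in the table

# Counts copied from the frequency table for nominal variables.
counts = {
    "Attrition = Yes": 237,
    "BusinessTravel = Travel_Rarely": 1043,
    "Department = Research & Development": 961,
    "Department = Sales": 446,
    "Department = Human Resources": 63,
    "Gender = Male": 882,
    "MaritalStatus = Married": 673,
    "MaritalStatus = Single": 470,
    "MaritalStatus = Divorced": 327,
    "OverTime = Yes": 416,
}

# Share of all employees, in percent, rounded to one decimal place.
shares = {name: round(100 * n / TOTAL, 1) for name, n in counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```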
3. Descriptive Statistics for Quantitative Variables

Variable                          Mean         Std        Min        Max

Personal Profile Factors
  Age                          36.9239      9.1353     18.000     60.000
  Education                     2.9129      1.0241      1.000      5.000
  Job Level                     2.0639      1.1069      1.000      5.000
  Monthly Income             6502.9312   4707.9567   1009.000  19999.000
  Num Companies Worked          2.6931      2.4980      0.000      9.000
  Relationship Satisfaction     2.7122      1.0812      1.000      4.000
  Years At Company              7.0081      6.1265      0.000     40.000
  Years In Current Role         4.2292      3.6231      0.000     18.000
  Years Since Last Promotion    2.1877      3.2224      0.000     15.000
  Years With Current Manager    4.1231      3.5681      0.000     17.000
  Performance Rating            3.1537      0.3608      3.000      4.000
  Total Working Years          11.2795      7.7807      0.000     40.000
  Job Involvement               2.7299      0.7115      1.000      4.000
  Work-Life Balance             2.7612      0.7064      1.000      4.000

Working Environment Factors
  Daily Rate                  802.4857    403.5091    102.000   1499.000
  Distance From Home            9.1925      8.1068      1.000     29.000
  Environment Satisfaction      2.7217      1.0930      1.000      4.000
  Hourly Rate                  65.8911     20.3294     30.000    100.000
  Job Satisfaction              2.7285      1.1028      1.000      4.000
  Percent Salary Hike          15.2095      3.6599     11.000     25.000
  Standard Hours               80.0000      0.0000     80.000     80.000
  Stock Option Level            0.7938      0.8520      0.000      3.000
  Training Times Last Year      2.7993      1.2892      0.000      6.000
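
The four summary columns in the table above (mean, standard deviation, minimum, maximum) can be computed per variable as in the sketch below. The helper name `describe` and the five sample ages are made up for illustration, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the report does not state which convention it used.

```python
import statistics


def describe(values):
    """Mean, sample standard deviation, minimum and maximum of one column."""
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample std (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }


ages = [41, 49, 37, 33, 27]  # made-up values, not taken from the dataset
summary = describe(ages)
print(summary)
```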

Observations:

For Personal Profile Factors:

• The average age of employees in this dataset is 36.92 years, with a
standard deviation of 9.14 years. The youngest employee in the
dataset is 18 years old, while the oldest is 60.
• On average, employees have worked for 2.69 companies, with a
standard deviation of 2.50.
• Education levels vary among employees, with an average education
level of 2.91 (on a scale of 1 to 5), indicating that a typical
employee has completed some college or a bachelor's degree.
• The average monthly income of employees is $6502.93, with a
standard deviation of $4707.96. This indicates a wide range of
salaries among employees, from entry-level positions to high-paying
management roles.
• Most employees have been with the company for less than 10 years,
with an average tenure of 7.01 years.
Overall, the personal profile factors group shows a diverse and varied
workforce in terms of age, education, experience, and income.

For Working Environment Factors:

• The average daily rate of pay for employees is $802.49, with a
standard deviation of $403.51.
• The average distance from home for employees is 9.19 miles, with a
standard deviation of 8.11 miles.
• Employees report moderately high satisfaction with their work
environment, with an average satisfaction score of 2.72 (on a
scale of 1 to 4).
• The average hourly rate of pay for employees is $65.89, with a
standard deviation of $20.33.
• Employees report moderately high job satisfaction, with an average
satisfaction score of 2.73 (on a scale of 1 to 4).
• The average percent salary hike for employees is 15.21%, with a
standard deviation of 3.66%.
Overall, the working environment factors group suggests that employees
receive regular salary increases and rate their jobs and work environment
moderately positively.

CHAPTER 2: EXPLORATORY DATA ANALYSIS

1. Correlation Analysis between Variables

In this section, we examine which features are correlated with each
other, that is, whether there is an association between two variables.
The values in a correlation matrix range from -1 to 1: values closer to 1
indicate a stronger positive correlation between the corresponding
variables, values closer to -1 indicate a stronger negative correlation,
and values near 0 indicate little or no linear association.

Observations:
• Age and Total Working Years have a strong positive correlation,
with a correlation value of 0.68. This indicates that older
employees tend to have more years of experience.
• Job Level and Monthly Income have an even stronger positive
correlation, with a correlation value of 0.95. This suggests that
higher-level employees tend to have higher incomes.
• On the other hand, variables such as Education, Job Involvement,
and Performance Rating have relatively weak correlations with
Attrition (correlation values of -0.03, -0.13, and -0.002,
respectively), suggesting that they may not have a significant impact
on employee turnover.
• Age is correlated with several features, including
NumCompaniesWorked, MonthlyIncome, JobLevel, Education, and other
features related to seniority.
• Attrition is negatively correlated with the following features:
YearsWithCurrManager, YearsInCurrentRole, YearsAtCompany,
TotalWorkingYears, StockOptionLevel, MonthlyIncome, JobLevel,
JobInvolvement, EnvironmentSatisfaction, and Age. Attrition is
positively correlated with OverTime.
• Job satisfaction seems to have almost no correlation with any of
the other features.
• Performance Rating is highly correlated with PercentSalaryHike, i.e.
high performers earn better raises.

From these observations, we conclude that, to predict Attrition, we
should focus on the variables with higher correlations with Attrition,
such as Age, TotalWorkingYears, JobLevel, MonthlyIncome, and
JobSatisfaction.

2. The Most Influential Factors on Employee Attrition

WHICH FACTOR HAS THE MOST INFLUENCE ON ATTRITION ACCORDING TO THE
HEATMAP?
General description of steps: The level of correlation between the
variables and Attrition is determined by the absolute value of the Pearson
correlation coefficient. This coefficient always falls between -1 and 1, and
indicates the magnitude and direction of the correlation. A coefficient close
to 0 indicates a weak or no correlation, while a coefficient close to 1 (or -1)
indicates a strong positive (or negative) correlation between the variables
and Attrition.
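
The ranking step described above can be sketched as follows: compute the Pearson coefficient of each feature with Attrition, then sort by absolute value. The data below is made up for illustration, and coding Attrition as 1 = Yes, 0 = No is an assumption about how the report encoded the variable.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up data: Attrition coded 1 = Yes, 0 = No.
attrition = [1, 0, 1, 0, 0, 1]
features = {
    "OverTime": [1, 0, 1, 0, 0, 0],
    "JobLevel": [1, 3, 1, 2, 4, 2],
    "HourlyRate": [60, 55, 70, 65, 60, 50],
}

# Rank features by the absolute value of their correlation with Attrition.
ranking = sorted(
    ((name, pearson(col, attrition)) for name, col in features.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: {r:+.2f}")
```

Sorting on the absolute value keeps strongly negative features (here the made-up JobLevel column) at the top alongside strongly positive ones, while the sign still shows the direction of the relationship.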

Observations: According to the bar chart, OverTime, TotalWorkingYears,
and JobLevel are the features most strongly correlated with Attrition;
that is, they have the highest absolute correlation values with Attrition
among all features.

CAN WE USE OTHER METHODS TO FIND WHAT FACTORS HAVE THE MOST SIGNIFICANT
EFFECT ON ATTRITION?
General description of steps:

To identify the factors affecting employee attrition with a multivariate
method, we use the Random Forest Classifier algorithm to estimate the
importance of each independent variable with respect to the dependent
variable Attrition. After calculating the importance of each variable, we
sort the variables in descending order of importance and plot a chart of
the importance levels.
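
A minimal sketch of this importance ranking is shown below, assuming scikit-learn's `RandomForestClassifier` and its `feature_importances_` attribute. The data is synthetic (it is not the IBM dataset), and the hyperparameters `n_estimators=200` and `max_depth=3` are illustrative choices, not the settings used in the report.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in features (NOT the IBM data).
overtime = rng.integers(0, 2, size=n)            # drives the label below
monthly_income = rng.normal(6500, 4700, size=n)  # pure noise in this sketch
age = rng.integers(18, 61, size=n)               # pure noise in this sketch

# Synthetic label: leaving is far more likely when overtime == 1.
attrition = (rng.random(n) < np.where(overtime == 1, 0.9, 0.05)).astype(int)

X = np.column_stack([overtime, monthly_income, age])
names = ["OverTime_Yes", "MonthlyIncome", "Age"]

# Shallow trees keep the noise features from absorbing importance.
model = RandomForestClassifier(n_estimators=200, max_depth=3, random_state=0)
model.fit(X, attrition)

# Sort features by importance, highest first, as described in the text.
ranked = sorted(zip(names, model.feature_importances_), key=lambda p: p[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances sum to 1 and tend to favor features with many distinct values, so on real data it can be worth cross-checking the ranking with another measure.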

Observations: In the "IBM HR Analytics Employee Attrition &
Performance" dataset, the variables monthly income, age, and
overtime_yes have a significant impact on attrition, as they relate to
the financial needs and health of employees.
Firstly, monthly income is an important factor in assessing employee
satisfaction with their job and income. If monthly income is not enough to
meet employees' needs or does not correspond to their abilities and
experience, they may feel undervalued and lack motivation to continue
working at the company. This can lead to limitations in their personal
development and career, which can ultimately result in them leaving their
job.

Secondly, the age of employees also has a significant impact on attrition.
Younger employees may want to explore new opportunities to challenge
themselves and advance their careers, while older employees may want to
reduce work pressure and seek more secure employment. Middle-aged
employees may feel caught between these two choices and make the
decision to leave their job.

Finally, overtime is a factor that can cause stress and affect the health of
employees. If employees have to work overtime regularly, this can lead to
sleep disorders, stress, fatigue, increased risk of illness, and reduced work
performance. When employees feel unfairly treated in terms of working
hours, they may feel undervalued and decide to quit their jobs to seek better
opportunities.

3. Factors Driving Employee Attrition: Visualizing and Analyzing the
Impact
3.1. Visualizing Factors Influencing Employee Attrition
Observation: The above plots compare the distribution of key variables for
employees who left the company (attrition_yes) and those who stayed
(attrition_no). The density of each variable for both groups is shown, with
red representing employees who left and blue representing those who
stayed.
Based on the distribution plots, we can observe the following
differences between the two groups:
• Age: Employees who left (Yes) tend to be younger than those who
stayed (No).
• Monthly Income: Employees who left (Yes) tend to have lower monthly
incomes than those who stayed (No).
• Total Working Years: Employees who left (Yes) tend to have fewer
total working years than those who stayed (No).
• Distance From Home: Employees who left (Yes) tend to live farther
from work than those who stayed (No).
• Job Level: Employees who left (Yes) tend to have lower job levels
than those who stayed (No).
• Job Satisfaction: Both high and low levels of job satisfaction may
correlate with a higher likelihood of leaving the company.
From these observations, we can draw a clear conclusion: younger
employees with lower income, less work experience, a longer commute, and
a lower job level tend to leave the company.

3.2. Analyzing Individual Factors that Contribute to Employee Attrition

AGE

• The distribution of Attrition by Age

Observation: The count plot above shows how attrition varies with age.
Attrition peaks among employees aged 29 and 31. After that, it gradually
decreases as age increases and remains relatively stable after the age
of 40.
This may suggest that younger employees are more likely to leave the
company for various reasons such as low income or the desire for new
career opportunities. Older employees, on the other hand, may have more
job stability and a higher level of experience, making them less likely to
leave the company.

Why does the highest attrition occur among employees around 28 to 33
years old?

Observation:
• The first plot shows that some young employees (from 25 to 35 years
old) have relatively low monthly incomes and tend to leave the
company. This may indicate that these young employees are
dissatisfied with their salaries or are not paid in accordance with
their abilities.
• The second plot indicates that the attrition rate increases among
employees around 28-33 years old and decreases as they get older.
This may suggest that these young employees often have less work
experience, so they tend to look for new opportunities to develop
their careers or feel limited in their current jobs.
In summary, the data shows that the highest attrition rate is among young
people aged 28-33, for various reasons such as low income or the desire
for new career opportunities.

How do age and business travel affect attrition rates?

Observations:
A stacked bar chart shows the attrition rates for different age groups
based on the frequency of business travel. The x-axis shows the age
groups, and the y-axis shows the attrition rate. The bars are stacked by
frequency of business travel, with the colors representing the three
categories: no travel, rare travel, and frequent travel.

From the chart, we can see that the attrition rate tends to decrease as
age increases. We can also compare the age groups to see which have a
higher or lower attrition rate within each business travel category.
• In the age groups between 18 and 50, those who do not travel for
business have a higher attrition rate. They may feel unfairly treated
due to a lack of opportunities to explore and learn from new
environments.
• Among those aged 50 and above, those who travel frequently tend to
have a higher attrition rate. Frequent business travel requires a lot
of energy, and with age employees may have less physical capacity and
experience more fatigue after business trips.

Total working years

Why do employees with fewer years of work experience tend to have a
higher attrition rate?

Observations:
• Individuals with higher monthly incomes tend to have more years of
work experience.
• Individuals with lower monthly incomes tend to have fewer years of
work experience.
• Most individuals who have left the company have less work
experience and lower monthly incomes than those who are still
employed.
These results suggest that individuals with more experience tend to hold
higher-paying positions. Additionally, the fact that most individuals
who left the company had less work experience than those who remained
suggests that long-term contributions by employees can have a
significant impact on their monthly income.

OverTime

• Attrition rate by overtime

Observations:
The pie chart shows attrition by overtime status, with one slice for
employees who work overtime (Yes) and one for those who do not (No).

It can be observed that attrition among employees who work overtime
(Yes) is significantly higher than among employees who do not work
overtime (No): the overtime group accounts for approximately 84.2%,
while the non-overtime group accounts for only about 15.8%. This
indicates that working overtime is associated with lower job stability
and psychological well-being, leading to a higher likelihood of quitting
compared to those who do not work overtime.

How do Overtime and Distance from Home affect the Attrition rate?

Observations: Employees who work overtime tend to have a higher
attrition rate compared to those who do not. Additionally, the chart shows
that factors such as overtime and proximity to the workplace not only have
individual effects on the employee's decision to quit, but also interact with
each other to influence the decision. For example, employees who work
overtime and have a long commute tend to have a significantly higher
attrition rate compared to others.

Job Level
Why do employees at lower job levels tend to leave the company?
• Average monthly income by years at company and job level
Observations:
This chart illustrates the change in average monthly income based on years
of experience at the company and job level. Job levels 1 through 5 are
displayed on the x-axis, while years of experience are displayed on the y-
axis.

Based on the chart, we can observe that the average monthly income
increases with job level and years of experience. Individuals who have
been with the company for approximately 7 to 10 years and work at higher
job levels (Job Level 4 and 5) have higher incomes compared to those who
have been with the company for different time periods and work at lower
job levels. From this, it can be seen that individuals with lower job levels
will have lower salaries, which may lead to thoughts of attrition.

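Department

• Attrition by department
Observations: The data indicates that the Research & Development
department accounts for the largest share of resignations at 56.1%,
followed by Sales at 38.8% and Human Resources at 5.1%. These figures
sum to 100%, so they describe how leavers are distributed across
departments rather than the attrition rate within each department.
Because Research & Development employs 65.4% of all staff and Sales
30.3%, Sales is over-represented among leavers relative to its size,
while Research & Development is under-represented. To improve employee
retention, it is recommended that the Sales and Research & Development
departments re-evaluate their organizational strategies and conduct
psychological surveys among their employees.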
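
The three department percentages above sum to 100%, which suggests they are each department's share of all leavers rather than a within-department rate. Assuming that reading (it is an inference, not stated in the report), approximate within-department attrition rates can be recovered from the headcounts and the total of 237 leavers given in Chapter 1:

```python
TOTAL_LEAVERS = 237  # Attrition = Yes, from the frequency table in Chapter 1

# department: (headcount from Chapter 1, share of all leavers quoted above)
departments = {
    "Research & Development": (961, 0.561),
    "Sales": (446, 0.388),
    "Human Resources": (63, 0.051),
}

rates = {}
for name, (headcount, share_of_leavers) in departments.items():
    leavers = round(share_of_leavers * TOTAL_LEAVERS)  # approximate count
    rates[name] = round(100 * leavers / headcount, 1)  # within-department rate, %
    print(f"{name}: about {leavers} leavers, {rates[name]}% of the department")
```

Under this assumption, the department with the most leavers in absolute terms is not the one with the highest rate, so both views are worth checking before prioritizing retention efforts.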

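CHAPTER 3: PREDICTIVE ANALYSIS

Logistic Regression

To identify the factors that affect attrition, we analyzed the training
data with the Regression tool in Excel's Analysis ToolPak. Note that this
tool fits a linear regression, so the equations reported below are linear
models of Attrition.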

The multiple linear regression equation is a statistical model used to
explain the relationship between a dependent variable and independent
variables. The equation takes the form:

Y = β₀ + β₁X₁ + β₂X₂ + ... + βᵣXᵣ

Where:
- Y is the value of the dependent variable
- X₁, X₂, ..., Xᵣ are the values of the independent variables
- β₀ is the intercept of the regression equation, representing the
value of Y when all independent variables are equal to 0
- β₁, β₂, ..., βᵣ are regression coefficients, which represent the degree
of influence of each independent variable on the dependent variable

For Personal Profile Factors

According to the summary output from Excel, inputs whose P-value is
smaller than 0.05 are statistically significant and are therefore
candidates for the most influential factors. On this basis, we found that
‘NumCompaniesWorked’, ‘YearsInCurrentRole’ and ‘JobInvolvement’ are the
most influential factors. We also chose to study ‘Age’,
‘YearsSinceLastPromotion’ and ‘YearsWithCurrManager’. Furthermore, the
Residual Output from Excel shows that the accuracy is 87%.

The linear regression equation for personal profile factors provided by
the model is

Attrition = 0.7189 - 0.0044*Age + 0.0177*NumCompaniesWorked
            - 0.0176*YearsInCurrentRole + 0.0118*YearsSinceLastPromotion
            - 0.0104*YearsWithCurrManager - 0.0556*JobInvolvement
            - 0.0297*WorkLifeBalance + 4.62E-04*MonthlyIncome

For Working Environment Factors


According to the summary output from Excel, inputs whose P-value is
smaller than 0.05 are statistically significant and are therefore
candidates for the most influential factors. On this basis, we found that
‘OverTime’, ‘EnvironmentSatisfaction’, ‘JobSatisfaction’ and
‘StockOptionLevel’ are the most influential factors. We also chose to
study ‘Department’ and ‘DistanceFromHome’. Furthermore, the Residual
Output from Excel shows that the accuracy is 83%.

The linear regression equation for working environment factors
provided by the model is

Attrition = 0.2590 + 0.2346*OverTime + 0.0468*Department
            + 0.0030*DistanceFromHome - 0.0358*EnvironmentSatisfaction
            - 0.0401*JobSatisfaction - 0.0641*StockOptionLevel
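
To show how such an equation is read, the sketch below evaluates the working environment equation for one hypothetical employee, with and without overtime. The coefficients are copied from the equation above; the employee's values, the coding of OverTime as 1 = Yes / 0 = No, and the integer coding of Department are assumptions, since the report does not state its encoding.

```python
# Coefficients copied from the working environment equation above.
INTERCEPT = 0.2590
COEFFICIENTS = {
    "OverTime": 0.2346,
    "Department": 0.0468,
    "DistanceFromHome": 0.0030,
    "EnvironmentSatisfaction": -0.0358,
    "JobSatisfaction": -0.0401,
    "StockOptionLevel": -0.0641,
}


def predicted_attrition(employee):
    """Evaluate the fitted linear equation for one employee (dict of inputs)."""
    return INTERCEPT + sum(COEFFICIENTS[k] * employee[k] for k in COEFFICIENTS)


# Hypothetical employee; OverTime coded 1 = Yes / 0 = No and Department
# coded as an integer are assumptions about the report's encoding.
base = {
    "OverTime": 0,
    "Department": 1,
    "DistanceFromHome": 10,
    "EnvironmentSatisfaction": 3,
    "JobSatisfaction": 3,
    "StockOptionLevel": 1,
}
with_overtime = {**base, "OverTime": 1}

score_base = predicted_attrition(base)
score_overtime = predicted_attrition(with_overtime)
print(round(score_base, 4), round(score_overtime, 4))
```

With everything else held fixed, switching OverTime from 0 to 1 raises the predicted value by exactly the OverTime coefficient, 0.2346, which is by far the largest single effect in this equation.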

CHAPTER 4: PRESCRIPTIVE ANALYSIS

• Monthly Income: Employees who earn more are less inclined to quit
their jobs. Therefore, it is important to learn about local industry
benchmarks in order to assess whether the company is paying
competitive wages.
• Overtime: Employees who put in extra hours are more likely to quit
their jobs. Therefore, projects should be properly scoped up front,
with enough support and personnel to minimize the need for overtime.
• YearsWithCurrManager: A significant portion of departing employees
leave about six months after starting with their current managers.
Using the Line Manager details for each employee, one can identify
which managers have had the most employees resign over the past
year. Several metrics can then be used to decide whether action is
needed for a Line Manager:
  - Number of employees reporting to managers with high turnover
  rates: This may be a sign that the organizational structure needs
  to be reviewed in order to increase efficiency.
  - Number of years a Line Manager has held a specific position: This
  may indicate that the employees need management training or should
  be assigned a mentor (ideally an Executive) in the organization.
  - Resignation patterns of employees: This may reveal recurrent
  patterns in employees leaving, in which case action may be taken
  accordingly.
• Age: Workers in the comparatively young 25–35 age group are more
likely to quit. The company's long-term goals should therefore be
clearly stated and should include young employees, and incentives
such as clear paths to progression should be offered.
• DistanceFromHome: Employees who live farther from their workplace
have a higher likelihood of quitting. Therefore, efforts should be
made to support clusters of employees commuting from the same area,
through company transportation or a transportation allowance.
Screening candidates based on where they live is not advised, because
it would be seen as discrimination as long as employees arrive at
work on time each day.
• TotalWorkingYears: More experienced employees are less likely to
leave. It is advisable to identify employees with 5-8 years of
experience as potentially having a higher chance of leaving.
• YearsAtCompany: Employees who have been with the company for a long
time are less likely to leave. Employees reaching their two-year
anniversary should be noted as possibly having a higher probability
of quitting.

CONCLUSION

The aim of this report was to analyze the factors contributing to employee
attrition and performance in the company and provide recommendations for
improving employee retention and productivity.

Based on the analysis of the "IBM HR Analytics Employee Attrition &
Performance" dataset, we have conducted a study and evaluated the factors
that affect the employee attrition rate in the company. The results show
that both personal profile factors and working environment factors have a
significant impact on the attrition rate.
Specifically, for personal profile factors, the JobInvolvement variable is
identified as the most vital factor affecting the attrition rate.
For working environment factors, the OverTime variable is identified as the
most vital factor affecting the attrition rate. Additionally, the Department
and DistanceFromHome variables also have a significant impact on this
index.

While our analysis provides valuable insights into the factors influencing
employee attrition and performance, our approach has some limitations.
One limitation is that the dataset only includes information from one
company, which may limit the generalizability of our findings.
Additionally, our analysis was based on a correlational approach, which
cannot infer causality.
Overall, our report highlights the importance of addressing key factors
such as age, monthly income, job satisfaction, and distance from home to
improve employee retention and performance. We hope that our
recommendations will be useful for the company in developing strategies
to retain and motivate its workforce.

APPENDIX
CODE
