Employee Attrition Prediction Using Machine Learning
of the International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME 2023)
19-21 July 2023, Tenerife, Canary Islands, Spain
Sanidhya Barara
Student, Class XI
Modern School, Barakhamba Road
New Delhi, India
[email protected]

Umang Soni
Assistant Professor
Netaji Subhash University of Technology
New Delhi, India
[email protected]
Abstract: Data mining plays an important role in the internal Human Resource management processes of any firm [1]. These processes include prevention and prediction of employee attrition. This paper presents a machine learning based approach to attrition prediction for individual employees, by training different machine learning models on attrition data. In this paper, machine learning models like the logistic regression model, the linear regression model, the decision tree model, the random forest model, K nearest neighbors model, radius nearest neighbors model, the naïve Bayes classifier model and Bayesian ridge model are trained on data obtained from Kaggle, an online community platform for data scientists and machine learning enthusiasts. The resulting models then predict whether an employee with particular attributes will attrit, and if so, within how many years they can be expected to do so. Prior to training the models, the data is cleaned (outlier removal, feature removal), scaled, and categorical variables are converted to numerical ones. Then, predictions are carried out and feature importances are found using random forest and decision tree models. The utilities of these models are then compared against each other based on accuracy, precision, recall, f-beta score, kappa score and three self made metrics. The results show that, surprisingly, the nearest neighbor models outperformed all others by a large margin (possibly due to data scaling being a part of preprocessing), and the logistic regression model was unable to predict attrition very satisfactorily. The results showed that satisfaction_level, average_monthly_hours and last_evaluation_rating are the most important features when predicting attrition or years. The research also shows that it is viable to use traditional ML models to predict the time in which an employee will attrit, and using the methodology defined in this research on a real dataset will provide useful information to the corporation applying it.

Keywords: Artificial Intelligence, Machine Learning, Employee Attrition

I. INTRODUCTION

The human resource is a key factor in the success of any firm. As such, large amounts of time and effort are dedicated by the firm towards the effective management of this resource [2]. An important aspect of this management is the employee hiring process. Hiring must keep up with the personnel demand of the company to ensure maximum productivity [3]. In order to ensure that this demand is met however, firms must also take into account the attrition of employees presently working for the firm.

Since attrition is at the most basic level a human decision, it is difficult to predict with absolute certainty. This results in it being one of the greatest issues a company can face, often leading to a non-ideal workforce [4]. However, the analysis of sizeable datasets of employee attrition data has the potential to reveal certain dominating trends in employee attritions [5, 6, 7, 8]. The modus operandi of such an approach is to process attrition data and give weights or importance values to all of the various attributes (i.e. all of the characteristics) of any particular employee’s employment which correspond to the correlation between the value of that attribute and the employee’s likeliness to attrit [4, 6, 8-12]. The weights thus obtained can then be used as a foundation for making informed decisions relating to the attributes that most impact employee attrition.

The principles of machine learning provide us with a variety of ways in which to analyze the dataset of attrition data. Different models in machine learning correspond to different ideologies for processing this data. Various such models are of interest to us in this paper. These are the logistic regression model, the linear regression model, the decision tree model, the random forest model, K nearest neighbors model, radius nearest neighbors model, the naïve Bayes classifier model and Bayesian ridge model. All of these models, with the exception of radius nearest neighbors, are those that have been used in prior studies on the same topic and have been noticed to be staple in other prediction-based studies as well. This is because they have been seen to perform generally well at predicting any given target variables. The inclusion of radius nearest neighbors (RNN) model in this paper is due to data scaling being a step in our preprocessing, which may provide an opportunity for the RNN model to provide good results.

These models may be trained on a given employee attrition dataset to predict not only whether an employee with some given values for the attributes will attrit or not, but also the exact duration for which the employee will stay with the firm in case they do attrit and thus, the number of years before the firm needs to look for a replacement. This may be achieved by means of a straightforward analysis of the correlation between the attributes of the employees in the dataset and the duration for which those who attrit stayed with the firm before doing so.

Furthermore, various metrics may be employed to better understand the effectiveness of these models in predicting employee attrition and give it an objective measure. For the purposes of this paper, the dataset has been subjected to an 85-15 train-test split and graded based on eight different metrics. These metrics are the accuracy, the precision, the f-beta score,
Authorized licensed use limited to: Zhejiang University. Downloaded on November 13,2024 at 13:13:22 UTC from IEEE Xplore. Restrictions apply.
the recall and the Cohen kappa score (all five of which are standard, reliable metrics generally applied together in multiple studies [13, 14, 15]) and three self-designed metrics, namely FP year difference, TP percentage mean and TP percentage median, of the machine learning models in question.

II. DATASET

The dataset used in this research paper was obtained from Kaggle, an online community platform for data scientists. The dataset is an employee attrition dataset, containing information about employees and the number of years they have already spent at the company, along with the status of their attrition.

The dataset has 2 predivided train and test sets, in an approximately 85-15 train-test split, with the training set containing ~25500 data points, and the test set containing ~4500 data points. The features contained in the dataset are “ID”, “satisfaction_level”, “last_evaluation_rating”, “projects_worked_on”, “average_montly_hours”, “time_spend_company”, “Work_accident”, “Over18”, “standard_hours”, “promotion_last_5years”, “Department”, “salary”, and “Attrition”. Of these, “Over18” and “standard_hours” contained only one unique value, “1” and “9” respectively, and “ID” contained a unique numerical value for each data point.

Due to the nature of the field of research, data on a company’s employees is required. Such data is highly private and is protected by most companies. As such, it is near impossible to obtain this data from a company, or via extensive surveying of employees of several similar companies. Thus, a dataset from Kaggle, on which the methods applied in our research have not been previously applied, was used for the purposes of our research.

III. RESEARCH METHODOLOGY

The methodology for conducting the research was as follows:

Step 1: Preprocessing

Step 2: Feature importances are isolated while predicting attrition, and it is found that years worked with company (the second factor to be predicted) is a significantly important one.

Step 3: “time_spend_company” is made a target variable.

Step 4: A classification model is trained on the 9 predictor variables and is asked to predict which employees in the test set will attrit.

Step 5: For the predicted attritions, the regression model is used to predict the number of years of employment after which the attrition will occur using the 8 predictor variables except “time_spend_company”, which is now the target variable.

Step 6: Each model’s scoring metrics are calculated and analyzed.

Elaborating on Step 1, preprocessing the data involves the following steps:

Data Cleanup: Rows which contain null values, or rows in which any feature has an erroneous outlier value, are removed from the dataset entirely. Features that carry no predictive information, such as “Over18”, “standard_hours”, and “ID”, are also removed from the dataset.

Categorical variables: Categorical variables are dealt with using one hot encoding for logistic and linear regression, KNN Models, RNN Models and Hybrid Models [7, 16]. In addition, label encoding is used for Decision Tree Models, Random Forest Models and Naïve Bayes models [17].

Data Scaling: The continuous numerical variables are scaled using the StandardScaler from Python’s scikit-learn library in order to achieve better predictions [18].

IV. SCORING MECHANISMS

Eight scoring mechanisms were used to check the effectiveness of the model in predicting attrition. Of these eight, the following six are mechanisms in which a higher value indicates a better model:

A. Accuracy Score

The accuracy score of a machine learning model is the simplest metric to evaluate its performance. To calculate the accuracy score of the model, the actual attritions and the model’s predicted attritions are required. Once both these arguments are received, the fraction of correct predictions over the total predictions is calculated, and this provides us with the accuracy score of the model. Generally, an accuracy score of 0.9+ is regarded as excellent, a score of 0.7+ as good, and anything below is regarded as poor. The formula for accuracy is as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

• TN: Number of True Negatives
• TP: Number of True Positives
• FN: Number of False Negatives
• FP: Number of False Positives

However, in our case, the accuracy score may not be the best possible scoring metric, as the model’s False Positives will decrease the accuracy even when they may not indicate any error in the model. For example, the model may predict that a person will attrit 2 years after the number of years recorded for them in the dataset, but the accuracy score of the model will still be reduced, since the accuracy simply sees that the attrition column says “No” in the dataset while our model predicted “Yes”.

B. Precision Score

The precision score of a machine learning model is another easily understood metric for performance evaluation. Again, the actual attritions and the model’s predicted attritions are required. However, the precision score of a model represents the fraction of predicted positives (in our research, predicted attritions) which were correct. A precision of 0.85+ is excellent, that of 0.7+ is good, and anything below 0.7 is poor. Precision can be calculated by the formula:
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$

• TP: Number of True Positives
• FP: Number of False Positives

C. Recall Score

The precision and recall scores are used together as they are complementary. While the precision tells us the fraction of predicted attritions that were correct, the recall tells us the fraction of actual attritions our model managed to catch/predict. While the same benchmarks for evaluation as precision do apply to recall in most cases, for the purposes of our research we may set the benchmark at 0.9+ for excellent, 0.75+ for good, and consider anything below 0.75 to be poor. The formula for recall is:

$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$

• TP: Number of True Positives
• FN: Number of False Negatives

D. F-Beta Score (Beta=1, 2)

The f-beta score is a metric based on the weighted harmonic mean of precision and recall, i.e., it is a way to analyze precision and recall together and give one single aggregate quality value. While the f1 score is the most used due to its interpretability, the f2 score is utilized in our research to give more importance to recall (as in [19]). The f-beta score of a model may be calculated by setting the value of beta as required (in our case, 2) in the formula below:

$$F_\beta = (1 + \beta^2) \cdot \frac{\text{precision} \cdot \text{recall}}{(\beta^2 \cdot \text{precision}) + \text{recall}} \tag{4}$$

E. Cohen Kappa Score

The Cohen kappa score of an ML model informs us about the reliability of the model, accounting for the probability of the model predicting attrition correctly by random chance [20]. The kappa score reveals the agreement between the actual and the predicted attritions using the formula:

$$\kappa = \frac{P_{\text{correct}} - P_{\text{chance}}}{1 - P_{\text{chance}}} \tag{5}$$

[...] model has made an erroneous prediction. For this scoring mechanism, magnitude is not of importance, only the sign is.

G. TP Percentage Mean

It is the mean of the absolute values of the percentage differences between the predicted years and actual years of employment before attrition in the correctly predicted attritions. A lower value of the mean represents a lower average percentage difference in predictions of years, meaning a better model.

H. TP Percentage Median

It is the median of the absolute values of the percentage differences between the predicted years and actual years of employment before attrition in the correctly predicted attritions. A lower value of the median represents a lower percentage difference in most predictions of years, meaning a better model.

V. MACHINE LEARNING MODELS

A. Logistic Regression

The Logistic Regression Model (proposed in 1958 [21]) provides the likelihood that Attrition (the target variable) is class 1 (i.e., the employee will attrit) from a linear sum of the predictor variables, in the formula:

$$z = \omega_0 + \omega_1 X_1 + \omega_2 X_2 + \dots + \omega_p X_p \tag{6}$$

$$P(E = 1) = \frac{e^{z}}{e^{z} + 1} \tag{7}$$

Where X1, X2, ..., Xp are the different variables, the omegas are the respective coefficients, and P(E = 1) is the probability that the data point E is of class 1. Also, in order to optimize our task, we utilize the log-likelihood loss function:

$$-\frac{1}{m} \sum_{i=1}^{m} \left( y_i \log(P_i) + (1 - y_i) \log(1 - P_i) \right) \tag{8}$$

Where m is the number of samples in the training data, yi is the label of the i-th sample, and Pi is the prediction value of the i-th sample.

Applying this model to our dataset yields results as follows. The model assigns weights to the various attributes in the dataset as can be seen in Fig. 1.
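Equations (6) to (8) of the logistic regression model can be illustrated with a minimal Python sketch. The function names, the coefficients and the two toy employees below are hypothetical and chosen only for illustration; they are not the weights learned in this study.

```python
import math

def linear_score(weights, bias, features):
    """Equation (6): z = w0 + w1*x1 + ... + wp*xp (bias plays the role of w0)."""
    return bias + sum(w * x for w, x in zip(weights, features))

def attrition_probability(z):
    """Equation (7): P(E = 1) = e^z / (e^z + 1), written as 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(labels, probabilities, eps=1e-12):
    """Equation (8): negative mean log-likelihood over the m samples."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clamp so that log(0) never occurs
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)

# Hypothetical coefficients for two already-scaled features (illustration only).
weights = [-2.0, 0.5]
bias = 0.1
employees = [[1.2, 0.3], [-0.8, 1.0]]
labels = [0, 1]

probs = [attrition_probability(linear_score(weights, bias, x)) for x in employees]
print(probs, log_loss(labels, probs))
```

A probability above 0.5 would typically be read as a predicted attrition, although the decision threshold is itself a modelling choice.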
Therefore, according to the Logistic Regression Model, satisfaction_level, salary_high and work_accident are the most important attributes for an employee to stay, and salary_low, department_hr and salary_medium are most important for an employee to attrit.

Fig. 2. Logistic Regression Confusion Matrix

Owing to the low value of the Kappa score, we can say that this model is not reliable.

B. Linear Regression

The Linear Regression model predicts the years of employment of the person before attrition using the traditional least squares method, i.e., finding the line of best fit through the data such that the sum of the squared residuals of the points is minimized. It can be said to be solving a problem of the form

$$\min_{\omega} \lVert X\omega - y \rVert_2^2 \tag{9}$$

It can be represented mathematically in the same linear equation form as the logistic regression model (See [22] for more).

The following results are gained from the Linear Regression model:

Therefore, linear regression predicts that salary and department are most important while predicting years of employment before attrition.

As can be seen by the scoring metrics, the Linear Regression model is acceptable at predicting years of employment before attrition.

[...]

Where:

• a represents a specific attribute or class label
• Entropy(S) is the entropy of dataset, S
• |SV|/|S| represents the proportion of the values in SV to the number of values in dataset, S
• Entropy(SV) is the entropy of dataset, SV

Entropy is calculated by:

$$\text{Entropy}(S) = -\sum_{c \in C} p(c) \log_2 p(c) \tag{11}$$

The feature with greater information gain is more important. Based on the feature importances, it splits the data at nodes, using basic True/False structures to arrive at a decision for attrition and number of years.

Applying this model to our dataset yields results as follows. The model can also be used to find out feature importances [4, 24] as in Fig. 4 and Fig. 5.
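The entropy of equation (11) and the information gain built from the terms listed above can be sketched in a few lines of Python. This is a sketch under two assumptions: that the information gain takes its standard form, Gain(S, a) = Entropy(S) minus the sum over values v of (|SV|/|S|) · Entropy(SV), and that the logarithm is base 2. The toy rows and the helper names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum over classes c of p(c) * log2 p(c)."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Assumed standard form: Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)."""
    n = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        # Group the labels by the value v that this row takes for the attribute.
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical toy data: attrition follows salary exactly, not department.
rows = [
    {"salary": "low", "department": "hr"},
    {"salary": "low", "department": "sales"},
    {"salary": "high", "department": "hr"},
    {"salary": "high", "department": "sales"},
]
labels = [1, 1, 0, 0]

gain_salary = information_gain(rows, labels, "salary")
gain_department = information_gain(rows, labels, "department")
print(gain_salary, gain_department)
```

In this toy example a tree would split on salary first, since it has the greater information gain.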
Applying this model to our dataset yields results as follows.
The model can also be used to find out feature importances
as in Fig. 7 and Fig. 8.
As is clear from the scoring metrics and the confusion matrix, Random Forest models are excellent at predicting both attritions and years.

E. Nearest Neighbors Models

Nearest neighbors is an ML model which simply remembers the data given by the user and then, based on the points that are close by, classifies/predicts the target values. Two types of nearest neighbors models have been utilized in this research, the K nearest neighbors model (KNN), and the radius nearest neighbors model (RNN).

The KNN model simply finds the K (integer value specified for our research purposes to be 1) nearest points and (a) classifies an employee to attrit or not attrit based on a majority vote and (b) predicts the number of years by averaging the 2 nearest points (See [18] and [26] for more).

As seen from Fig. 11, RNN lags behind KNN while predicting attrition and years. However, it still provides excellent results for attrition and acceptable results for years.

F. Naïve Bayes Classifier

The naïve Bayes classifier, as the name indicates, works only for classification and as such can only be used to predict attrition. It assumes independence among the variables, i.e., the value in one of the variables does not depend on the values in any of the rest of the variables. This is a bad assumption to make in our scenario, since, for example, salary and satisfaction index may be inter-dependent. However, as the model deals well with large datasets, it has been included in the scope of the research (See [26] and [30] for more). It works off of the Bayes theorem:

$$P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)} \tag{12}$$

Under the independence assumption, for a data point X = (x_1, ..., x_n) this gives, up to the normalizing constant P(X):

$$P(c \mid X) \propto P(x_1 \mid c) \times \dots \times P(x_n \mid c) \times P(c) \tag{13}$$

Above,

• P(c|x) is the posterior probability of target class c if given the predictor attribute x
• P(c) is the prior probability of the target.
• P(x|c) is the probability of predictor given class.
• P(x) is the prior probability of the attribute.
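The Bayes rule above can be made concrete with a small counting-based sketch for categorical features. It is only an illustration under stated assumptions: the function names and the six toy employees are hypothetical, no smoothing is applied, and the probabilities are estimated as plain relative frequencies rather than by the library implementation used in the study.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[c][j][value] = number of class-c rows whose j-th feature is value
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for row, c in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[c][j][value] += 1
    return class_counts, feature_counts

def posterior(row, class_counts, feature_counts):
    """Return P(c | X) for every class c, normalized so the values sum to 1."""
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior P(c)
        for j, value in enumerate(row):
            score *= feature_counts[c][j][value] / n_c  # likelihood P(x_j | c)
        scores[c] = score
    evidence = sum(scores.values())  # plays the role of P(X)
    return {c: s / evidence for c, s in scores.items()}

# Hypothetical toy data: (salary, satisfaction) with attrition label 1 or 0.
rows = [("low", "unhappy"), ("low", "happy"), ("high", "happy"),
        ("high", "happy"), ("low", "unhappy"), ("high", "unhappy")]
labels = [1, 0, 0, 0, 1, 0]

class_counts, feature_counts = train_naive_bayes(rows, labels)
post = posterior(("low", "unhappy"), class_counts, feature_counts)
print(post)
```

Because the per-feature likelihoods are simply multiplied, any dependence between salary and satisfaction is ignored, which is exactly the limitation noted in the text.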
H. Hybrid Model

Looking at the results for the models, the KNN and RNN models were best at predicting attrition and years. Thus, we used all models except Random Forest, KNN, and RNN together to form a hybrid model (See [31]), using 2 independent ensembles with 5 of each type of model (15 models total) in order to predict attritions and then years.

TABLE II. TABLE SHOWING THE SCORING METRICS FOR EACH MODEL WHEN PREDICTING YEARS

Models     FP Year Difference   TP Percentage Mean   TP Percentage Median
Log Reg    NA                   NA                   NA
Lin Reg    -1.05                0.055                0.028
DT         0.213                0.053                0.0
RF         0.275                0.046                0.005
KNN        0.0                  0.0                  0.0
RNN        -0.055               0.178                0.209
NB         NA                   NA                   NA
BRR        0.017                0.046                0.018
Hybrid     -0.096               0.035                0.012
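The TP percentage mean and TP percentage median reported for the year predictions follow the definitions given in Section IV, and could be computed along the following lines. This is a sketch that assumes the percentage difference is expressed as the fraction |predicted − actual| / actual and that attrition is coded as 1; the function name and the five toy employees are hypothetical.

```python
from statistics import mean, median

def tp_percentage_errors(actual_attrition, predicted_attrition,
                         actual_years, predicted_years):
    """Absolute fractional year errors, restricted to true positives."""
    errors = []
    for a, p, years_true, years_pred in zip(actual_attrition, predicted_attrition,
                                            actual_years, predicted_years):
        if a == 1 and p == 1:  # correctly predicted attrition
            errors.append(abs(years_pred - years_true) / years_true)
    return errors

# Hypothetical toy predictions for five employees.
actual_attrition = [1, 1, 0, 1, 1]
predicted_attrition = [1, 1, 1, 0, 1]
actual_years = [4, 5, 3, 2, 10]
predicted_years = [5, 5, 4, 2, 9]

errors = tp_percentage_errors(actual_attrition, predicted_attrition,
                              actual_years, predicted_years)
tp_percentage_mean = mean(errors)
tp_percentage_median = median(errors)
print(tp_percentage_mean, tp_percentage_median)
```

Only the three employees who both attrited and were predicted to attrit contribute here; the false positive and the false negative are ignored by these two metrics.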
[...] attrition status of employees can, with slight modification and utilisation of regressors instead of classifiers, also be used to predict the years in which an employee will attrit, providing further invaluable information to the employers.

As compared to similar studies carried out in previous years, the paper reveals a contrast in the best performing model. While the random forest model, or sometimes the logistic regression model, is generally expected to perform best, with the indicated preprocessing of the data, the nearest neighbour models can also be utilized to their full potential. While the result that satisfaction level is an important predictor is in line with some other studies [34, 35], it is a generally uncommon one as can be seen in [36, 37, 38, 39]. In comparison to previous papers, a new target variable of “time_spend_company” (years the employee spends with the company before attriting) has been introduced.

While the study does propose a new prediction methodology and target variable, the research is put at a disadvantage due to a lack of extensive real world data on the topic. However, the results so obtained do show that the additional, extremely useful variable of years before attrition can also be predicted using traditional ML models (meaning that using the proposed methods on a real dataset can yield information crucial to a corporation), building upon prior studies on the topic of prediction in attrition.

In comparison to previous works, this paper presents a comparative analysis of some of the most used machine learning models at predicting attritions, complete with preprocessing methodologies, and also checks if these models can be used to predict the years of employment before attrition. The feature selection is done by hand due to the limited number of features: most are kept, and some which only include 1 unique value or which are wholly irrelevant to the research are removed. Random Forest and Decision Tree are used to find feature importances. Similar scoring metrics to the new ones introduced in this paper may be developed for future research. For future research into attrition, machine learning on a real dataset consisting of a vast variety of variables is suggested, including some which are based on more psychological factors of attrition and some others like the employees’ compensation ratio inside their salary bracket and inside their line of work in their country. When carrying out the research on the larger dataset with a greater variety of features, the feature selection can also be improved by the m-max-out method, and the results can be improved by generating smaller bootstrap datasets from the parent dataset, as described in [37].

REFERENCES

[1] J. Ranjan, D. P. Goyal, and S. I. Ahson. "Data mining techniques for better decisions in human resource management systems." International Journal of Business Information Systems 3.5 (2008)
[2] B.L. Das, and Mukulesh Baruah. "Employee retention: A review of literature." Journal of business and management 14.2 (2013)
[3] D. A. B. A. Alao, and A. B. Adeyemo. "Analyzing employee attrition using decision tree algorithms." Computing, Information Systems, Development Informatics and Allied Research Journal 4.1 (2013)
[4] D. Alao, A.B. Adeyemo: Analyzing employee attrition using decision tree algorithms. Comput. Inf. Syst. Dev. Inform. Allied Res. J. 4 (2013)
[5] V. Nagadevara, V. Srinivasan, R. Valk: Establishing a link between employee turnover and withdrawal behaviours: application of data mining techniques. Res. Pract. Hum. Resour. Manag. 16, 81–97 (2008)
[6] K. Suceendran, R. Saravanan, D.S. Divya Ananthram, R.K. Kumar, K. Sarukesi: Applying classifier algorithms to organizational memory to build an attrition predictor model
[7] R. Punnoose, P. Ajit: Prediction of employee turnover in organizations using machine learning algorithms. Int. J. Adv. Res. Artif. Intell. 5, 22–26 (2016)
[8] E. Sikaroudi, A. Mohammad, R. Ghousi, A. Sikaroudi: A data mining approach to employee turnover prediction (case study: Arak automotive parts manufacturing). J. Ind. Syst. Eng. 8, 106–121 (2015)
[9] Q.A. Al-Radaideh, E. Al Nagi: Using data mining techniques to build a classification model for predicting employees performance. Int. J. Adv. Comput. Sci. Appl. 3, 144–151 (2012)
[10] H.Y. Chang: Employee turnover: a novel prediction solution with effective feature selection. WSEAS Trans. Inf. Sci. Appl. 6, 417–426 (2009)
[11] C.F. Chien, L.F. Chen: Data mining to improve personnel selection and enhance human capital: a case study in high-technology industry. Expert Syst. Appl. 34, 280–290 (2008)
[12] R.S. Sexton, S. McMurtrey, J.O. Michalopoulos, A.M. Smith: Employee turnover: a neural network solution. Comput. Oper. Res. 32, 2635–2651 (2005)
[13] K. Ranveer, S. B. Dwivedi, and S. Gaur. "A comparative study of machine learning and Fuzzy-AHP technique to groundwater potential mapping in the data-scarce region." Computers & Geosciences 155 (2021)
[14] Rustam, Furqan, et al. "Wireless capsule endoscopy bleeding images classification using CNN based model." IEEE Access 9 (2021)
[15] Sors, Arnaud, et al. "A convolutional neural network for sleep stage scoring from raw single-channel EEG." Biomedical Signal Processing and Control 42 (2018)
[16] A. Quinn, J.R. Rycraft, D. Schoech: Building a model to predict caseworker and supervisor turnover using a neural network and logistic regression. J. Technol. Hum. Serv. 19, 65–85 (2002)
[17] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
[18] J. Friedman, T. Hastie, R. Tibshirani: The elements of statistical learning. Springer, New York (2001)
[19] B. Prasetiyo, M. A. Muslim, and N. Baroroh. "Evaluation performance recall and F2 score of credit card fraud detection unbalanced dataset using SMOTE oversampling technique." Journal of Physics: Conference Series. Vol. 1918. No. 4. IOP Publishing, 2021
[20] Sun, Shuyan. "Meta-analysis of Cohen’s kappa." Health Services and Outcomes Research Methodology 11 (2011)
[21] H. Zhang: The optimality of naive Bayes. AA, 1, 3
[22] Su, Xiaogang, Xin Yan, and Chih‐Ling Tsai. "Linear regression." Wiley Interdisciplinary Reviews: Computational Statistics 4.3 (2012)
[23] J.N. Morgan, J.A. Sonquist: Problems in the analysis of survey data, and a proposal. J. Am. Stat. Assoc. 58, 415–434 (1963)
[24] H. Jantan, A.R. Hamdan, Z.A. Othman: Human talent prediction in HRM using C4.5 classification algorithm. Int. J. Comput. Sci. Eng. 2, 2526–2534 (2010)
[25] L. Breiman: Random forests. Mach. Learn. 45, 5–32 (2001) Cox, D.R.: The regression analysis of binary sequences. J. Roy. Stat. Soc. B. Met., 215–242 (1958)
[26] K.P. Murphy: Machine learning: a probabilistic perspective. MIT press, Cambridge (2012)
[27] Scikit-Learn User Manual. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.RadiusNeighborsClassifier.html#sklearn.neighbors.RadiusNeighborsClassifier
[28] Scikit-Learn User Manual. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.RadiusNeighborsRegressor.html#sklearn.neighbors.RadiusNeighborsRegressor
[29] Gursoy, Mehmet Emre, et al. "Differentially private nearest neighbor classification." Data Mining and Knowledge Discovery 31 (2017)
[30] A. Géron: Hands-on machine learning with Scikit-Learn and TensorFlow: concepts, tools, and techniques to build intelligent systems. O’Reilly Media (2017)
[31] Yi Tan, and P.P. Shenoy. "A bias-variance based heuristic for constructing a hybrid logistic regression-naïve Bayes model for classification." International Journal of Approximate Reasoning 117 (2020)
[32] McDonald, C. Gary "Ridge regression." Wiley Interdisciplinary Reviews: Computational Statistics 1.1 (2009)
[33] Price, Bertram. "Ridge regression: Application to nonexperimental data." Psychological Bulletin 84.4 (1977)
[34] Dr. R. S. Kamath | Dr. S. S. Jamsandekar | Dr. P.G. Naik "Machine Learning Approach for Employee Attrition Analysis" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Special Issue | Fostering Innovation, Integration and Inclusion Through Interdisciplinary Practices in Management, March 2019, pp.62-67,
[35] .R. Srivastava, and P. Eachempati. "Intelligent employee retention system for attrition rate analysis and churn prediction: An ensemble machine learning and multi-criteria decision-making approach." Journal of Global Information Management (JGIM) 29, no. 6 (2021): 1-29.
[36] Yahia, Nesrine Ben, Jihen Hlel, and Ricardo Colomo-Palacios. "From big data to deep data to support people analytics for employee attrition prediction." IEEE Access 9 (2021): 60447-60458.
[37] S. Najafi-Zangeneh; N. Shams-Gharneh; A. Arjomandi-Nezhad; S. Hashemkhani Zolfani: An Improved Machine Learning-Based Employees Attrition Prediction Framework with Emphasis on Feature Selection. Mathematics 2021, 9, 1226.
[38] F. Fallucchi, C. Marco, R. Giuliano, and E.W.D. Luca. "Predicting employee attrition using machine learning techniques." Computers 9, no. 4 (2020): 86.
[39] N. El-Rayes, F. Ming, Michael Smith, and S. M. Taylor. "Predicting employee attrition using tree-based models." International Journal of Organizational Analysis (2020)