Absenteeism at Work Project Report
Machine Learning I
Prediction of Absenteeism
Project Report
Submitted by-Team 2
Saisanthosh Mamidala
Shuyu Sui
Srilakshmi Peesa
Competition in the current marketplace is stiff, and the productivity of its workforce helps an organization stay ahead of its competitors. Because employee productivity strongly affects a business's output, it is important for an organization's Human Resources department to understand what influences productivity. Absenteeism is one behavior that disrupts the regular workflow. It is defined as the habitual non-presence of an employee at his or her job (Will Kenton, 2019), and organizations have to address it. This paper explores different machine learning techniques to understand which factors contribute to absenteeism and to help organizations update or redesign their employee satisfaction metrics.
Introduction
This analysis intends to understand the relationships between different employee behaviors and
absenteeism. Understanding these relationships helps define which factors cause absenteeism and
how an organization can address them.
There are many ways to explore the data and understand these relationships, but for our analysis we are
leveraging supervised machine learning algorithms. Specifically, we are using four algorithms: Decision
Tree, Random Forest, Naïve Bayes Classifier, and Support Vector Machine. Each of these algorithms has
its advantages and disadvantages, so we analyze the outputs from all of them and suggest the algorithm
that best fits this data. For model validation, we use a combination of accuracy scores, confusion
matrices, ROC curves, and AUC.
Related Work
Supervised Machine Learning
This analysis applies multiple supervised machine learning algorithms. It cleans the data by
removing all features that are not directly related to the target, which risks discarding features that have
an indirect effect on the target. The analysis uses Decision Tree, Random Forest, Naïve Bayes Classifier,
and Support Vector Machine algorithms and concludes, based purely on the confusion matrix, that
Random Forest gives the best results (Ojo Olawale, 2019, Aug 27).
Unsupervised Machine Learning
This analysis uses a hierarchical clustering technique to see which features contribute to
absenteeism. It concludes that the features profiles and age groups are the biggest contributors to this
behavior (Parker Oakes, 2019, Feb 24).
Data Preprocessing
The dataset used for this analysis is the well-known Absenteeism at Work dataset, created from
employee records of a courier company in Brazil between July 2007 and July 2010. The dataset has 666
observations and 20 columns: one target variable and 19 feature variables.
Figure 1
Data Types of all the variables in the dataset
From the above metadata, we observed that two features, Age and Workload Average/day, are
stored as strings but should be numeric. In addition, Workload Average/day had a blank space in its name,
and its values contained thousands-separator commas. We therefore stripped the blank spaces from the
column name, removed the commas from the values, and converted both columns to numeric. The
conversion introduced null values, which were dropped from the analysis. We also performed null-value
and valid-value checks wherever applicable and dropped all observations with invalid values. In total, six
observations were dropped from the initial data.
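These cleanup steps can be sketched with pandas. The small frame below is purely illustrative: the column names and values mimic the dataset's problematic columns but are not taken from it.

```python
import pandas as pd

# Hypothetical frame mimicking the two problematic columns: 'Age'
# stored as strings and 'Workload Average/day ' with a trailing blank
# in its name and comma-separated values.
df = pd.DataFrame({
    "Age": ["33", "50", "38", "bad"],
    "Workload Average/day ": ["239,554", "239,554", "205,917", "222,196"],
})

# Strip blank spaces from column names.
df.columns = df.columns.str.strip()

# Remove the thousands-separator commas, then convert to numeric.
df["Workload Average/day"] = pd.to_numeric(
    df["Workload Average/day"].str.replace(",", "", regex=False))

# Coerce Age to numeric; unparseable entries become NaN.
df["Age"] = pd.to_numeric(df["Age"], errors="coerce")

# Drop the observations whose conversion produced nulls.
df = df.dropna()
print(len(df))  # the row with the invalid Age is gone
```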
We also checked for multicollinearity in the data. We plotted the correlations between all
variables and dropped features with an absolute correlation coefficient above 0.8. Body Mass Index was
highly correlated with other features, so it was dropped.
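The correlation-based drop might look like the following sketch; the toy frame and its near-duplicate "Body mass index" column are invented for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Toy frame standing in for the dataset: 'Body mass index' is built
# to be almost perfectly correlated with 'Weight'.
weight = rng.normal(70, 10, 200)
df = pd.DataFrame({
    "Weight": weight,
    "Body mass index": weight / 3.0 + rng.normal(0, 0.1, 200),
    "Age": rng.integers(20, 60, 200).astype(float),
})

# Absolute pairwise correlations; keep only the upper triangle so
# each pair is inspected once.
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop any feature correlated above 0.8 with an earlier feature.
to_drop = [c for c in upper.columns if (upper[c] > 0.8).any()]
df = df.drop(columns=to_drop)
print(to_drop)
```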
Figure 2
Heat Map Showing Correlation Coefficient Values Between Features
The above heat map visualizes the correlation coefficients between features; the lighter the tile,
the higher the correlation coefficient. As figure 2 shows, dropping Body Mass Index removes the
multicollinearity. We also converted the target variable, Absenteeism time in hours, into a categorical
variable for our initial analysis, using the following thresholds.
Table 1
Target Value Threshold by Groups
Group Number Threshold
0 Number of hours=0
1 0 < Number of hours <= 6
2 Number of hours > 6
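The thresholds in Table 1 map directly onto pandas binning; the hour values below are hypothetical.

```python
import pandas as pd

# Hypothetical absenteeism hours for a handful of observations.
hours = pd.Series([0, 2, 6, 8, 40, 0, 3])

# Bin into the three groups from Table 1:
# 0 -> exactly zero hours, 1 -> (0, 6], 2 -> more than 6.
groups = pd.cut(hours, bins=[-1, 0, 6, float("inf")], labels=[0, 1, 2])
print(groups.tolist())  # [0, 1, 1, 2, 2, 0, 1]
```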
Technical Approach
Feature selection
Feature selection is the process of choosing the most useful features for training the model. This is
done to reduce dimensionality when many features do not contribute enough to the overall variance.
In our project, we used a correlation plot to eliminate a highly correlated variable, a random forest to find
feature importances, and Principal Component Analysis for dimensionality reduction.
a. Principal Component Analysis (PCA)
We use PCA primarily for dimensionality reduction: it extracts the most important components
from the larger set of variables in the dataset. We normalized the data before performing PCA.
After removing the BMI variable, the data has 18 input features, and 92% of the variance is explained by
14 components. So, we selected 14 components.
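One way to reproduce this component selection with scikit-learn is sketched below. The random stand-in matrix is illustrative, so the number of retained components will differ from the 14 found on the real data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(12345)
X = rng.normal(size=(660, 18))  # stand-in for the 18 input features

# Standardize first: PCA is sensitive to feature scales.
X_std = StandardScaler().fit_transform(X)

# Keep the smallest number of components explaining >= 92% of
# the variance.
pca = PCA(n_components=0.92, svd_solver="full")
X_pca = pca.fit_transform(X_std)
print(X_pca.shape[1], pca.explained_variance_ratio_.sum())
```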
b. Data Splitting:
We set the randomization seed to 12345 and split the data into training and test sets in an 80:20 ratio.
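A minimal sketch of the split and the four-model comparison, using synthetic stand-in data (make_classification) since the original dataset is not reproduced here; the resulting scores will not match the tables.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class data standing in for the absenteeism features.
X, y = make_classification(n_samples=660, n_features=18,
                           n_informative=8, n_classes=3,
                           random_state=12345)

# 80:20 split with the seed fixed at 12345, as in the report.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=12345)

models = {
    "Random Forest": RandomForestClassifier(random_state=12345),
    "Decision Tree": DecisionTreeClassifier(random_state=12345),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.2%}")
```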
Table 2
Model Accuracy Scores After Performing PCA
Techniques Accuracy Score
Random Forest Classifier 75.00%
Decision Tree Classifier 74.24%
Naïve Bayes Classifier 71.96%
Support Vector Machine 67.42%
The accuracy scores appear moderately good. The Random Forest Classifier leads, closely
followed by the Decision Tree Classifier and the Naïve Bayes Classifier; the Support Vector Machine
performs the worst of the four tested classification models. The four models were then rebuilt without
applying PCA, i.e., without eliminating any features from the dataset (except Body Mass Index, which
was dropped earlier due to high correlation). The new models show a significant difference in the
accuracy scores.
Table 3
Model Accuracy Scores Without PCA
Techniques Accuracy Score
Random Forest Classifier 84.84%
Decision Tree Classifier 81.06%
Naïve Bayes Classifier 78.78%
Support Vector Machine 74.24%
Without Principal Component Analysis, the accuracy scores of the Random Forest Classifier and
the Decision Tree Classifier increase remarkably. Although PCA served its purpose of reducing the
dimensionality of the data, the models trained on the PCA-transformed data failed to capture the
underlying pattern and returned relatively lower accuracy scores. The Random Forest Classifier has the
best accuracy score, so we tuned its parameters with a grid search over n_estimators: [6, 100, 30] and
max_depth: [5, 7, 10]. Even with the best parameters found, {'max_depth': 7, 'n_estimators': 30}, the
accuracy score did not improve significantly. An interesting observation is that the SVM model classifies
every observation as either Group 1 or Group 2 and never predicts Group 0. This can be identified from
the confusion matrix below.
Figure 3
Confusion Matrix for SVM
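The grid search over the parameters mentioned above might be sketched as follows, again on synthetic stand-in data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data, split 80:20 with seed 12345.
X, y = make_classification(n_samples=660, n_features=18,
                           n_informative=8, n_classes=3,
                           random_state=12345)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=12345)

# Grid over the same candidate values tuned in the report.
param_grid = {"n_estimators": [6, 100, 30], "max_depth": [5, 7, 10]}
search = GridSearchCV(RandomForestClassifier(random_state=12345),
                      param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))
```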
Table 4
Model Accuracy Scores on the Test Data
Techniques Accuracy Score
Random Forest Classifier 71.62%
Decision Tree Classifier 71.62%
Naïve Bayes Classifier 60.81%
Support Vector Machine 60.60%
The Random Forest Classifier and the Decision Tree Classifier are tied for the highest accuracy
score at 71.62%; the Naïve Bayes Classifier stands third at about 60.81%. The Support Vector Machine
classifies every observation as Group 1, ignoring Group 0 and Group 2 entirely, which means that the
SVM is not a recommended model for this problem. This can be verified from the confusion matrix in the
figure below.
Figure 4
Confusion Matrix and Accuracy Score for SVM
An accuracy score above 70% is good considering that the problem falls within the Human
Resources domain. Human Resources deals directly with human behavior, so very high accuracy scores
cannot be expected. Moreover, we are dealing with limited variables to predict employee behavior:
several variables generally considered key indicators of an employee's performance and absenteeism,
such as morale, job satisfaction, relationship with the manager, and workplace ambiance, are not present
in the dataset. Thus, the Random Forest Classifier and Decision Tree Classifier, which both have an
accuracy score of 71.62%, are selected for further evaluation.
Model Evaluation
We utilized several methods to evaluate model performance. In addition to the accuracy score, we
leverage the receiver operating characteristic (ROC) curve with the area under the curve (AUC) and the
confusion matrix. We are particularly interested in comparing the random forest model with the decision
tree model: in the previous section, we observed that their accuracy scores are almost the same, so we
decided to explore more evaluation methods to choose the better model.
ROC Curve
The receiver operating characteristic curve plots the true positive rate on the Y-axis against the
false positive rate on the X-axis. In our case, there are three classes in total, and we plot a ROC curve for
each class. Plotting these two rates gives a clear picture of the sensitivity of the model.
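For a three-class problem, the per-class curves can be computed one-vs-rest; below is a sketch on synthetic stand-in data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic stand-in data, split 80:20 with seed 12345.
X, y = make_classification(n_samples=660, n_features=18,
                           n_informative=8, n_classes=3,
                           random_state=12345)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=12345)

model = RandomForestClassifier(random_state=12345).fit(X_train, y_train)
proba = model.predict_proba(X_test)

# One-vs-rest: binarize the labels, then compute a ROC curve and
# its AUC for each class.
y_bin = label_binarize(y_test, classes=[0, 1, 2])
aucs = {}
for k in range(3):
    fpr, tpr, _ = roc_curve(y_bin[:, k], proba[:, k])
    aucs[k] = auc(fpr, tpr)
    print(f"AUC for class {k}: {aucs[k]:.2f}")
```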
Plotted ROC
From the ROC curve plots below, we can see that, in terms of ROC and AUC, the random forest
performs better than the decision tree model: the average AUC for the random forest is 0.92, while the
average AUC for the decision tree is 0.84. We can also observe from the decision tree's ROC plot that the
curves for Absenteeism time between zero and six and Absenteeism time more than six are farther from
the ideal value of 1. Since an area under the curve of one is the best achievable result, the Random Forest
model performs better than the Decision Tree model in terms of ROC and AUC.
Combining the confusion matrix with the area under the curve, we can see that although we did
not correctly predict all data points where Absenteeism time equals zero, the AUC for that class is one.
This is because the ROC curve only considers false positive and true positive rates, and even in the
Random Forest model we have zero false-positive predictions for that class; therefore we get a perfect
AUC score for Absenteeism time equals zero.
Figure 5
ROC with Random Forest

Figure 6
ROC with Decision Tree
AUC for Absenteeism time equals zero: 1.0
AUC for Absenteeism time between zero and six: 0.76
AUC for Absenteeism time more than six: 0.76
Confusion Matrix
In the confusion matrix, we list the prediction outcomes for each class, so we can see the
prediction result per class. From the confusion matrix we can also calculate precision and recall scores
and gain more insight into how to fine-tune the model.
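A sketch of computing the confusion matrix plus per-class precision and recall, again on synthetic stand-in data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, split 80:20 with seed 12345.
X, y = make_classification(n_samples=660, n_features=18,
                           n_informative=8, n_classes=3,
                           random_state=12345)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=12345)

model = RandomForestClassifier(random_state=12345).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred))  # per-class precision/recall
```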
As the two confusion matrices below show, the Decision Tree performs better at predicting
Absenteeism time equals zero, while the Random Forest model performs better at predicting Absenteeism
time between zero and six and Absenteeism time more than six.
Table 5
Confusion Matrix for Random Forest

                                    Predicted Classes
Actual Classes                      h = 0   0 < h <= 6   h > 6
Absenteeism time between 0 and 6      0        22          6
Absenteeism time more than 6          0         2         27

Table 6
Confusion Matrix for Decision Tree

                                    Predicted Classes
Actual Classes                      h = 0   0 < h <= 6   h > 6
Absenteeism time between 0 and 6      0        27         11
Absenteeism time more than 6          0        10         19

(h = Absenteeism time in hours.)
Conclusion
In conclusion, after considering the ROC curves and area under the curve as well as the confusion
matrices, we find that the Random Forest model performs better than the Decision Tree model: it gives a
better AUC score and a lower false-prediction rate on the test dataset.
Table 7
RMSE for Different Models
Model Name RMSE
Lasso Regression 16.72
Ridge Regression 16.71
Random Forest 16.73
Xgboost 17.38
References
1) L. Breiman. Random forests. Machine Learning, 45(1):5–32, Oct. 2001.
2) Chen, T., & Guestrin, C. (2016). XGBoost. Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining. doi:10.1145/2939672.2939785
3) A Survey on Decision Tree Algorithms of Classification in Data Mining. (2016). International
Journal of Science and Research (IJSR), 5(4), 2094-2097. doi:10.21275/v5i4.nov162954
4) Will Kenton (2019, Jun 4). Absenteeism. Retrieved from
https://www.investopedia.com/terms/a/absenteeism.asp
5) Andrea Martiniano, Ricardo Pinto Ferreira, and Renato Jose Sassi. Absenteeism at work.
Retrieved from https://archive.ics.uci.edu/ml/datasets/Absenteeism+at+work
6) Ojo Olawale (2019, Aug 27). Exploration of Absenteeism with Machine Learning. Retrieved
from https://medium.com/@ojoolawalejulius2016/exploration-of-
absenteeism-with-machine-learning-1f01a8f9357e
7) Parker Oakes (2019, Feb 24). Using Machine Learning to Discover Employee Absenteeism
Reasons. Retrieved from https://rpubs.com/alanoakes/EmployeeAbsenteeism