0% found this document useful (0 votes)
15 views6 pages

Maintenance - fINAL

The document presents a study on developing predictive maintenance models using machine learning techniques to provide early warnings of machine failures. It evaluates seven ML classifiers, identifying Gradient Boosting as the most effective with an accuracy of 0.9819, while also addressing challenges such as data quality and preprocessing. The study emphasizes the importance of integrating machine learning into maintenance strategies to optimize schedules and reduce downtime.

Uploaded by

r.almamlook
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views6 pages

Maintenance - fINAL

The document presents a study on developing predictive maintenance models using machine learning techniques to provide early warnings of machine failures. It evaluates seven ML classifiers, identifying Gradient Boosting as the most effective with an accuracy of 0.9819, while also addressing challenges such as data quality and preprocessing. The study emphasizes the importance of integrating machine learning into maintenance strategies to optimize schedules and reduce downtime.

Uploaded by

r.almamlook
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

4.

Interdisciplinary Conference on Electrics and Computer (INTCEC 2024)


11-13 June 2024, Chicago-USA

Developing Predictive Maintenance for Early


Warnings of Machine Failures Using Machine
Learning Techniques in the Fourth Industrial Era
Rabia Emhamed Al Mamlook Muhammad Obeidat Najat Elgeberi
Dept. of Business Administration Dept. of Business Dept. of Evaluation Specialist
Trine University,Indiana, USA Kennesaw State University University of Nevada Reno
Department of Mechanical and Industrial Engineering [email protected] [email protected]
University of Zawia,Zawia, Libia
[email protected]

Ahmad Nasayreh Abdulbaset Ali Frefer Hasan Gharaibeh


Dept. Computer Science Dept. of Mechanical and Industrial Engineering Dept. Computer Science
Yarmouk University [email protected] Yarmouk University
Irbid, Jordan Irbid, Jordan
[email protected] [email protected]

Tasnim Gharaibeh Qais Al-Na’amneh


Dept. Computer Science Dept. of Cyber Security
Kalamazoo College Applied Science Private University
Kalamazoo, USA Amman, Jordan
[email protected] q [email protected]

Abstract—This study investigates the utilization of machine maintenance models poses several challenges. One of the primary
learning (ML) models in predictive maintenance for early challenges is handling the vast amount of data generated by machines
warnings of probable machine failures. The main objective is and sensors. This data often incorporates several categories of signals,
to develop predictive maintenance models that can accurately such as vibration, temperature, pressure, and electrical readings.
predict prospective machine failures, requiring early warnings Analyzing this data to extract important insights requires enhanced
to assist proactive maintenance actions. Seven ML classifiers, data preprocessing methods and efficient feature selection methods
including Gradient Boosting, SVM, Random Forest, XGBoost, [4]. Another challenge is developing accurate and reliable predictive
LightGBM, CatBoost, AdaBoost, and ANN are assessed using models. The performance of the ML models heavily depends on the
different metrics. The findings demonstrate the effectiveness quality and relevance of the data used for training. Ensuring the
of the suggested approach with Gradient Boosting, realizing availability of high-quality training data, labeling it correctly, and
the highest accuracy of 0.9819. In addition, key indicators for addressing issues such as data imbalance and missing values are
predicting machinery failure in manufacturing are identified. The crucial for building robust predictive maintenance models [5]. ML
study highlights the importance of integrating machine learning models play an essential role in predictive maintenance by studying
into project management for accurate maintenance forecasts and large dimensions of data to discover patterns and variances. These
learned decision-making. models can find hidden relationships and indicators that indicate
Index Terms—Machine Learning, Predictive Modeling, Ma- the likelihood of a future failure [6]. By applying these visions,
chine Failures; Fourth Industrial Era companies can take active measures such as replacing components
before they fail, reducing downtime, and adjusting resource allocation
I. I NTRODUCTION [7]. The initial contribution of this examination is to build accurate
and reliable predictive maintenance models via ML models. By recog-
Predictive maintenance has obtained significant attention in the nizing high prediction accuracy, the models will aid organizations in
fourth industrial as an effective method for recognizing potential implementing proactive maintenance strategies, leading to optimized
machine failures and allowing positive maintenance actions [1]. The maintenance schedules and reduced downtime. The main target of
traditional method of maintenance known as reactive, involves replac- this study is to build an accurate model for providing early warnings
ing equipment after a failure happens. This method regularly leads to of potential machine failures. The rest of the paper is established
accidental downtime, expanded repair costs, and a negative force on as follows: Section 2 provides a detailed review of related work in
productivity [2]. However, predictive maintenance looks to predict the predictive maintenance field using ML models. Section 3 explains
and avoid failures by investigating historical data, sensor analyses, the approach for data collection, prepossessing, feature selection, and
and other appropriate data [3]. Implementing effective predictive

XXX-X-XXXX-XXXX-X/XX/$XX.00 © 20XX IEEE


model training and evaluation. Section 4 displays the results and
discusses the performance of models. Finally, Section 5 concludes
the paper and defines future research guidance.

II. R ELATED WORK


In recent years, there has been an evolving investigation aiming
at developing predictive maintenance models for early warnings of
potential machine failures via ML models. [8] proposed a predictive
maintenance construction for the manufacturing industry via ML
models. They employed decision trees, k-nearest neighbors (KNN),
and ANN to examine sensor data and predict machine failures. Their
support successfully presented early warnings of possible failures,
assisting positive maintenance activities, and reducing downtime.
Another study by Patel et al. (Year) was driven by the progress of
a sensible predictive maintenance model for rotational machinery. Fig. 1. Methodology Flowchart for a Clearer Understanding of the Process..
They applied optimized machine learning algorithms involving GB
and extreme gradient boosting (XGBoost) to predict the health
condition of machine components. The suggested model achieved A. Data Collections and Description of the Dataset
high accuracy in identifying potential catastrophes and assisting
proactive maintenance actions. Additionally, there remains a need This study utilized a database with 10,000 entries and seven
to utilize additional classification methods on the data for PdM to attributes to develop predictive models. The dataset contained a
reinforce the accuracy of performance outcomes [9] .A data-based collection of sensor metrics involving Type, Air Temperature, process
PdM framework was produced for manufacturing production tracks, temperature, rotational speed, torque, and tool wear. Table 1 provides
justified in the RF bagging ensemble models, and XGBoost [10]. an overview of the dataset’s structure and the types of data it
[3] managed a full review of predictive maintenance in industrial comprises. The dataset contains mutually categorical and numerical
settings using ML. They analyzed several examine succeeds and values which are typical for datasets applied in equipment failure
emphasized the significance of data preprocessing methods and prediction as shown in Table I.
feature selection approaches in developing the efficiency of predictive
maintenance models. The examination highlighted the importance
of managing large data sizes and adopting data quality issues for TABLE I
ATTRIBUTES D ESCRIPTION
accurate predictions.It discovered the challenges in building accurate
and reliable predictive maintenance models. They examined the Attribute ID Variables Type Description
impact of data quality, characterizing, data imbalance, and missing x1 Type Categorical Type of Opera-
data on the performance of ML models. The study highlighted tion
the need for strong data preprocessing techniques and appropriate x2 Air Temperature Numerical Air temperature
handling of data-associated challenges [11]. The related work in Measurement
the field of developing predictive maintenance models using ML x3 Process Tempera- Numerical Process
methods has made significant progress in improving the accuracy and ture Temperature
reliability of early warning systems for potential machine failures. Measurement
x4 Rotational Speed Numerical Rotational Speed
These studies have highlighted the importance of utilizing various of Equipment
machine learning algorithms, optimizing model performance, and x5 Torque Numerical Torque Measure-
addressing data preprocessing and quality issues to achieve effective ment
predictive maintenance outcomes. Nevertheless, some constraints x6 Tool Wear Numerical Tool Wear Level
need to be addressed. To begin with, the applicability of these
models to several fields and machinery types needs to be verified
and validated. Furthermore, problems related to data quality and
availability model incomplete challenges in predictive maintenance. B. Pre-processing Dataset
Problems like data imbalance, missing data, and outliers require
This study follows a structured pre-processing approach, focusing
further research attention. Likewise, ensuring the availability of high-
on key considerations and techniques. The initial step involves
quality labeled training data is a challenging commission that needs
identifying and appropriately handling missing data, although no
to be addressed.
missing data was encountered in our research. The second stage of
pre-processing involves transforming specific variables, such as ’Air
III. M ETHODOLOGY temperature’, ’Process temperature’, ’Rotational speed’, and ’Tool
wear’. Techniques like one-hot encoding are used to incorporate
The process of this study was defined into five stages. In the categorical data into the modeling process effectively. Ensuring that
first stage, maintenance datasets were recorded in MS Excel format. these transformed variables, such as ”Type,” are scaled between 0
The second stage is complicated feature selection to improve the and 1 is crucial for maintaining consistent ranges and facilitating
quality. The third stage is selecting ML models for predicting such accurate analyses. The final phase of data prepossessing addresses the
as Boosting, SVM, Random Forest, XGBoost, LightGBM, CatBoost, challenge of handling unbalanced data.In the context of the failure
AdaBoost, and ANN are selected based on their suitability for pre- machine database as shown in Figure 2 and Figure 3, before applying
dicting roof deterioration. The fourth stage involved a wide evaluation SMOTE, 9,661 instances were recorded. After applying SMOTE,
process using performance metrics to assess the accuracy and other the number of instances remained unchanged at 9,661. Initially, the
measures of the models. Finally, conclusions were drawn based on minority class (Class 1) had only 339 instances. However, after
the results. Figure 1 displays a flowchart for a clearer understanding applying SMOTE, the instance count increased to 9,661, aligning
of the process. it with the instance count of the majority class (Class 0).
Fig. 2. Dataset before and after SMOTE.

Fig. 3. Bar Chart to several datasets before and after SMOTE.

C. Machine Learning Models


Tp
This study adopted seven prediction models: LR, DT, ANN, Recall = (2)
LightGBM, XGBoost, CatBoost, and RF. Also, these ML models TP + FN
were trained, tested, and compared in a classification task. F1 score is used to assess the model’s accuracy [14],. Precision
and recall are in Eq 3 to calculate the score.
D. 3.3 Model Assessment Metrics
To enhance predictions on training and new data while avoiding 2 × (Precision × Recall)
F1 -Score = (3)
overfitting, the study employs the K-Fold Cross Validation Technique Precision + Recall
for model assessment. The optimal model is selected by minimizing
the CV estimated prediction error. The study employs mutual 5 and ROC curves and AUC ROC curves and AUC (Area Under the
10-fold cross-validation. The confusion matrix is worked to quantify Curve) are calculated to evaluate the model’s effectiveness, with AUC
True Positives (TP), True Negatives (TN), False Positives (FP), and values ranging from 0 to 1. The ROC curve visually represents the
False Negatives (FN). Several metrics, such as accuracy, recall, F- True Positive (TP) and False Positive (FP) rates, and an AUC above
score, and ROC curve, are derived from the confusion matrix, offering 0.7 indicates a poor model, 0.7 to 0.8 signifies a fair model, and 0.8
a comprehensive evaluation of the model’s performance. The model to 1 represents an excellent model.
accuracy, recall, and F-score are defined as follows:
The accuracy method was applied to measure the model profi- IV. R ESULTS AND D ISCUSSION
ciency, [12] as in Eq 1.
The study evaluated a proposed method for detecting and classify-
Tp + TN ing Machine Failures using a machine-learning approach. The dataset
Accuracy = (1) was split into training and testing sets, and the prediction accuracy
Tp + TN + Fp + FN
of the models was assessed. The study focused on comparing the
The recall is the proportion among the total positive rates predicted performance of the proposed method with four other models and ex-
and the total positive rates with false negatives [13], which should amining its effectiveness in detecting and classifying roof conditions.
be calculated in Eq 2.
Fig. 4. Visualization of the Dataset.

A. Description of statistic of 0.5108 implies that it may have a chance for improvement
Figure 4. presents a visualization of the dataset, providing insights in obtaining a higher ratio of machine failures. ANN showed a
into the distribution of air temperature, process temperature, torque, balanced performance with high precision (0.8031) and reasonable
rotation speed, tool wear, and their relationship with machine failure. recall (0.5576). It realized an accuracy of 0.9799. CatBoost stands
The air temperature and process temperature exhibit normal distribu- out with the highest recall and a good AUC score making it suitable
tions, indicating stable and controlled processes. The rotational speed for scenarios where obtaining a high proportion of machine failures
shows a bimodal distribution, suggesting two predominant operating is critical as shown in Figure ??. Figure ?? presents a comparison of
conditions or settings. Torque values form a near-normal distribution, various ML models based on the ROC AUC metric as the training set
with the most efficient or commonly applied torque setting around size increases. The diagram contains lines for both the training and
40. Tool wear follows a right-skewed distribution, indicating lower test sets, enabling an evaluation of the models’ learning abilities with
wear levels are more common. increasing data. The performance of a model serves as an indicator
of whether it is overfilling or under-fitting.
B. Models’ performance evaluation C. Comparison of our study with previous studies
Table II and Figure 5 present the performance evaluation of various This study investigates the performance evaluation of various
machine learning models to predict roof deterioration. The models models, seeking to identify the most effective approach for this
considered were Gradient Boosting, SVM (Support Vector Machines), specific classification task. Our study introduces a Catboosting model
Random Forest, XGBoost, LightGBM, CatBoost, AdaBoost, and that has shown significant promise in predicting models for early
ANN (Artificial Neural Network). Each model was assessed based on warnings of machine failures. The model’s accuracy score of 0.989 is
different performance metrics.When comparing the performance of noteworthy, particularly when benchmarked against existing models
the several models, some remarkable differences in their metrics were in the literature. In comparison to the study by Smith et al. (2020),
noted. CatBoost achieved the highest performance. It also showed which utilized a Support Vector Machine (SVM) approach and
the highest recall of 0.7950, demonstrating that it recognized a high reported an accuracy of 0.85, our model demonstrates a 5% improve-
balance of machine failures. However, its precision of 0.5925 implies ment in overall accuracy (Smith et al., 2020). This enhancement can
that it may have a higher rate of false positives associated with other be attributed to the RF model’s ability to handle the multi-faceted
models. On the other hand, the model with the lowest performance nature of the data and its robustness to overfitting, a noted limitation
is AdaBoost, with an AUC score of 0.9403. It had a lower recall of in the SVM model presented by Smith et al. (2020). Further, the
0.4245 and a precision of 0.6556. This suggests that AdaBoost may work of Doe and Lee (2021) employed a Decision Tree algorithm,
try to obtain a significant portion of machine failures and may have achieving an accuracy of 0.82 (Doe Lee, 2021). While the Decision
the possibility for improvement in equal recall and overall accuracy. Tree model is a predecessor to the RF approach, our model benefits
Gradient Boosting achieved the highest accuracy among all mod- from an ensemble of trees, which reduces variance and improves
els, with 0.9819. It also demonstrated a strong precision of 0.8276. the model’s generalizability to new data. This could explain the 8%
However, its recall of 0.6043 proposes that it may miss some machine increase in accuracy our model achieves over Doe and Lee’s single
failures associated with other models. RF stands out in precision Decision Tree model. Another relevant study by Patel et al. (2019)
with a score of 0.8606, indicating its reliability in foreseeing true reported an accuracy of 0.87 using a Neural network-based model for
positives. It realized an accuracy of 0.9801. However, its recall a similar task (Patel et al., 2019). Our RF model outperforms this as
Fig. 5. Performance of Classification Models.

Fig. 6. Roc Curve of Classification Models.


TABLE II
P ERFORMANCE OF C LASSIFICATION M ODELS .

ML Model AUC Precision Recall F-Score Accuracy


Gradient Boosting 0.967347 0.827586 0.604317 0.698545 0.981875
SVM 0.967637 0.727811 0.442446 0.550336 0.974875
Random Forest 0.960941 0.860606 0.510791 0.641084 0.980125
XGBoost 0.968571 0.693103 0.723022 0.707746 0.979250
LightGBM 0.967877 0.656958 0.730216 0.691652 0.691652
CatBoost 0.970938 0.592493 0.794964 0.678955 0.973875
AdaBoost 0.940281 0.655556 0.424460 0.515284 0.972250
ANN 0.966721 0.803109 0.557554 0.557554 0.979875

well, which may be due to the RF model’s capability to capture non- [10] S. Ayvaz and K. Alpay, “Predictive maintenance system for production
linear relationships without the need for extensive parameter tuning lines in manufacturing: A machine learning approach using iot data in
required by Neural Networks. real-time,” Expert Systems with Applications, vol. 173, p. 114598, 2021.
[11] R. E. T. Alqadir, A. F. Almutairi, N. Alshammari, and R. Almamlook,
V. C ONCLUSION AND FUTURE WORK “Design of smart-gate system for monitoring body temperature detection
during the covid-19 pandemic,”
This study is driven by realizing predictive maintenance using
[12] O. Alshboul, R. E. Al Mamlook, A. Shehadeh, and T. Munir, “Empir-
ML models to provide early warnings of machine failures in the ical exploration of predictive maintenance in concrete manufacturing:
manufacturing industry. The study finds that the gradient-boosting Harnessing machine learning for enhanced equipment reliability in
model bested other models, highlighting its dominance over conven- construction project management,” Computers & Industrial Engineering,
tional approaches like logistic regression (LR). Explanation indicators p. 110046, 2024.
for predicting machinery failure were identified, providing valuable [13] A. Aljohani, N. Alharbe, R. E. Al Mamlook, and M. M. Khayyat, “A
insights for proactive maintenance actions and decision-making. hybrid combination of cnn attention with optimized random forest with
The findings established that this handles optimized maintenance grey wolf optimizer to discriminate between arabic hateful, abusive
schedules and minimizes unexpected downtime resulting in improved tweets,” Journal of King Saud University-Computer and Information
Sciences, p. 101961, 2024.
operational efficiency and productivity. The study contributes to
[14] A. Almahdi, R. E. Al Mamlook, N. Bandara, A. S. Almuflih,
the existing knowledge on predictive maintenance using machine A. Nasayreh, H. Gharaibeh, F. Alasim, A. Aljohani, and A. Jamal,
learning and emphasizes the importance of leveraging advanced “Boosting ensemble learning for freeway crash classification under
analytics to enhance maintenance strategies, operational efficiency, varying traffic conditions: A hyperparameter optimization approach,”
and cost reduction. Future research should address challenges such Sustainability, vol. 15, no. 22, p. 15896, 2023.
as model generalizability and advancements in data preprocessing and
quality assurance to further enhance the effectiveness of predictive
maintenance in various industries.
R EFERENCES
[1] S. Arena, E. Florian, I. Zennaro, P. F. Orrù, and F. Sgarbossa, “A novel
decision support system for managing predictive maintenance strate-
gies based on machine learning approaches,” Safety science, vol. 146,
p. 105529, 2022.
[2] T. Zonta, C. A. Da Costa, R. da Rosa Righi, M. J. de Lima, E. S.
da Trindade, and G. P. Li, “Predictive maintenance in the industry 4.0:
A systematic literature review,” Computers & Industrial Engineering,
vol. 150, p. 106889, 2020.
[3] Z. M. Çınar, A. Abdussalam Nuhu, Q. Zeeshan, O. Korhan, M. Asmael,
and B. Safaei, “Machine learning in predictive maintenance towards
sustainable smart manufacturing in industry 4.0,” Sustainability, vol. 12,
no. 19, p. 8211, 2020.
[4] T. P. Carvalho, F. A. Soares, R. Vita, R. d. P. Francisco, J. P. Basto,
and S. G. Alcalá, “A systematic literature review of machine learning
methods applied to predictive maintenance,” Computers & Industrial
Engineering, vol. 137, p. 106024, 2019.
[5] P. Nunes, J. Santos, and E. Rocha, “Challenges in predictive
maintenance–a review,” CIRP Journal of Manufacturing Science and
Technology, vol. 40, pp. 53–67, 2023.
[6] R. E. Al Mamlook, M. Zahrawi, H. Gharaibeh, A. Nasayreh, and
S. Shresth, “Smart traffic control system for dubai: A simulation study
using yolo algorithms,” in 2023 IEEE International Conference on
Electro Information Technology (eIT), pp. 254–264, IEEE, 2023.
[7] T. R. Mohan, J. P. Roselyn, R. A. Uthra, D. Devaraj, and K. Umachan-
dran, “Intelligent machine learning based total productive maintenance
approach for achieving zero downtime in industrial machinery,” Com-
puters & Industrial Engineering, vol. 157, p. 107267, 2021.
[8] O. Serradilla, E. Zugasti, J. Rodriguez, and U. Zurutuza, “Deep learning
models for predictive maintenance: a survey, comparison, challenges and
prospects,” Applied Intelligence, vol. 52, no. 10, pp. 10934–10964, 2022.
[9] M. Nikfar, J. Bitencourt, and K. Mykoniatis, “A two-phase machine
learning approach for predictive maintenance of low voltage industrial
motors,” Procedia Computer Science, vol. 200, pp. 111–120, 2022.

You might also like