
Predictive Maintenance

DePaul University, Chicago, Illinois, USA

Group #11: Adithya Harsha, Joao Vitor Lira de Carvalho Firmino, Beemnet Desta, Tejas
About our Dataset
● The dataset we selected is the Predictive Maintenance Dataset, sourced from Kaggle, which contains
10,000 data points and 14 features.
● These features capture various operational conditions, such as air temperature, process temperature,
rotational speed, torque, and tool wear, along with a machine failure label.
● Machine failures are categorized into five types: tool wear failure, heat dissipation failure, power failure,
overstrain failure, and random failures.

Goal:
The goal of this project is to predict machine failures using operational data. By doing so, we hope to improve maintenance
schedules, reduce downtime, and avoid premature replacements.

Use:
This prediction model will assist industries in implementing predictive maintenance strategies, ensuring that machines are serviced
at the most appropriate time. This reduces operational expenses, improves machine reliability, and avoids unexpected
breakdowns. Ultimately, it will result in increased production and cost savings for companies that rely on heavy machinery.
Preprocessing and Cleaning

● There were no null/missing values.
● Checked for empty strings in the 'Product ID' and 'Type' columns, and found none.
● Detected outliers in 'Rotational Speed [rpm]' (418 outliers) and 'Torque [Nm]' (69 outliers) using the IQR method.
● Applied log transformation to 'Rotational Speed [rpm]' and 'Torque [Nm]' to reduce the impact of extreme values and normalize the distribution.
● Converted the categorical 'Type' feature into numerical binary features using One-Hot Encoding.
● Standardized numerical features, including 'Air temperature [K]', 'Process temperature [K]', 'Rotational speed [rpm]', 'Torque [Nm]', and 'Tool wear [min]', using StandardScaler to ensure equal contribution during modeling.
● Altogether, we now have transformed and scaled features ready for machine learning models (a sketch of these steps follows below).
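A minimal sketch of these preprocessing steps in pandas/scikit-learn, assuming the Kaggle CSV has been loaded with the column names above (the file name predictive_maintenance.csv is hypothetical):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical file name; the Kaggle dataset ships as a single CSV.
df = pd.read_csv("predictive_maintenance.csv")

# Log-transform the two heavy-tailed columns (log1p is safe at zero).
for col in ["Rotational speed [rpm]", "Torque [Nm]"]:
    df[col] = np.log1p(df[col])

# One-hot encode the categorical 'Type' feature (L/M/H quality classes).
df = pd.get_dummies(df, columns=["Type"], prefix="Type")

# Standardize the numeric features so each has mean 0 and std 1.
num_cols = ["Air temperature [K]", "Process temperature [K]",
            "Rotational speed [rpm]", "Torque [Nm]", "Tool wear [min]"]
df[num_cols] = StandardScaler().fit_transform(df[num_cols])
```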
Outlier Detection & Handling

● Outlier Detection:
○ We applied the Interquartile Range (IQR) method to identify outliers in the numerical columns (see the IQR sketch below):
○ Rotational Speed [rpm]: 418 outliers detected.
○ Torque [Nm]: 69 outliers detected.
○ No outliers found in 'Air temperature [K]', 'Process temperature [K]', or 'Tool wear [min]'.
● Handling Outliers:
○ To reduce the impact of extreme values, we applied log transformation to the 'Rotational Speed [rpm]' and 'Torque [Nm]' columns:
○ Rotational Speed [rpm]: After transformation, the data is concentrated in a tighter range, with fewer extreme outliers.
○ Torque [Nm]: The transformed data is more compressed, with smaller impact from outliers on the lower end.

Feature Scaling

● The features in the dataset have different ranges, which can affect the performance of machine learning models. To ensure all features contribute equally, we applied standardization. After considering several options, we decided on StandardScaler, which standardizes features to Mean = 0 and Standard Deviation = 1.
● We made sure that scaling was applied only to the numeric features and not to the categorical ones.
● Features Scaled:
○ 'Air temperature [K]'
○ 'Process temperature [K]'
○ 'Rotational speed [rpm]'
○ 'Torque [Nm]'
○ 'Tool wear [min]'
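A minimal sketch of the IQR rule behind the outlier counts above, run on the raw columns before transformation; the conventional 1.5 × IQR fences are assumed, since the slides do not state the multiplier:

```python
def count_iqr_outliers(s):
    # Values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] are flagged as outliers.
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return int(((s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)).sum())

for col in ["Rotational speed [rpm]", "Torque [Nm]"]:
    print(f"{col}: {count_iqr_outliers(df[col])} outliers")
```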
Exploratory Data Analysis (EDA)

● Product quality distribution (Pie Chart): Within this dataset, 60% of the manufactured products are of low quality, 30% are of medium quality, and 10% are of high quality.

● Machine failure by product quality (Bar Chart): Almost no high-quality products are associated with machine failure; on the other hand, almost all failures occur during the manufacturing of low-quality products (see the sketch below).
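For reference, a short sketch of how these proportions can be computed on the raw (pre-encoding) DataFrame; 'Machine failure' is the dataset's original failure flag:

```python
# Share of each quality class ('Type' is L/M/H in the raw data).
print(df["Type"].value_counts(normalize=True))

# Failure rate within each quality class (the bar chart's comparison).
print(df.groupby("Type")["Machine failure"].mean())
```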
Correlation Matrix:

○ Air Temperature [K] & Process Temperature [K] (0.88 correlation): These variables are linked to the machine's thermal dynamics. Higher air temperatures make cooling harder, leading to increased process temperatures.

○ Rotational Speed [rpm] & Torque [Nm] (-0.94 correlation): Typically, as rotational speed increases, torque decreases, indicating an inverse relationship crucial for maintaining consistent power output.

○ Different Failure Types (HDF, PWF, OSF): Failure modes like HDF, PWF, and OSF are interconnected. For instance, poor heat dissipation (HDF) can cause electrical issues, leading to power failures (PWF). A sketch of the correlation computation follows below.
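A minimal sketch of computing these pairwise correlations; the printed values should roughly match those reported above:

```python
# Pairwise Pearson correlations over the numeric columns.
corr = df.select_dtypes("number").corr()
print(corr.loc["Air temperature [K]", "Process temperature [K]"])  # reported as 0.88
print(corr.loc["Rotational speed [rpm]", "Torque [Nm]"])           # reported as -0.94
```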
Histograms of features:
○ Air Temperature Histogram: The distribution is bimodal, with concentrations around 298 K and 301 K.

○ Process Temperature Histogram: Values range from 306 K to 314 K, with most clustered between 309 K and 312
K. A prominent peak between 310 K and 311 K suggests most temperatures are centered here, though higher
temperatures occur less often.

○ Rotational Speed Histogram: Right-skewed with a peak around 1500 rpm, indicating most data points cluster at
lower speeds, with fewer high-speed instances.

○ Torque Histogram: Displays a normal distribution, with a mean torque of approximately 40 Nm.

○ Tool Wear Histogram: Tool wear is spread nearly uniformly from 0 to 200 minutes, with a sharp decline beyond 200 minutes.
Feature Engineering

● New Features Created:
○ Mechanical_Power:
■ Captures the machine's output using the formula: Mechanical_Power = Rotational Speed × Torque.
■ Key Insight: Strong inverse correlation between Rotational Speed and Torque (-0.94).
○ Temp_Diff:
■ Highlights potential overheating issues with the formula: Temp_Diff = Process Temperature − Air Temperature.
■ Key Insight: Strong positive correlation between Air and Process Temperatures (0.88).
● Log Transformation Applied after feature creation (see the sketch below).
○ Creating Mechanical_Power first and transforming afterwards keeps the feature well defined; transforming the inputs first could make the product negative or zero, distorting the data.
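A minimal sketch of the feature construction, assuming it runs on the untransformed columns as described, with the log transformation applied afterwards:

```python
import numpy as np

# Engineered features, built from the untransformed columns.
df["Mechanical_Power"] = df["Rotational speed [rpm]"] * df["Torque [Nm]"]
df["Temp_Diff"] = df["Process temperature [K]"] - df["Air temperature [K]"]

# Log transformation applied afterwards, so the product is formed from the
# original positive values; log1p also tolerates zeros.
df["Mechanical_Power"] = np.log1p(df["Mechanical_Power"])
```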
Model Preparation and Training
Scaling Removed for Tree-Based Models:

● Random Forest, Gradient Boosting, and XGBoost don't require scaling, simplifying preprocessing.
● Tree-based models split on feature thresholds based on rank order, so the magnitude of values (e.g., speed or temperature)

Column Renaming:

● Renamed columns to remove special characters such as '[', ']', and spaces. This was necessary for XGBoost, which raises errors during training when column names contain special characters (see the renaming sketch after this slide).

Train-Test Split:

● Used a train-test split of 80% training data and 20% test data with a random_state=42 for reproducibility.
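A minimal sketch of the column renaming, assuming a simple regex over the original bracketed names (the exact renaming rule is not shown in the slides):

```python
import re

# XGBoost rejects '[', ']' and '<' in feature names, so replace brackets
# and spaces with underscores (e.g. 'Torque [Nm]' -> 'Torque_Nm').
df.columns = [re.sub(r"[\[\]<> ]+", "_", c).strip("_") for c in df.columns]
```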
Scaling Selected Features

● The goal was to standardize features to improve model performance and convergence.
● Columns Scaled: Air_temperature_K, Process_temperature_K, Rotational_speed_rpm, Torque_Nm, Tool_wear_min.
● We used StandardScaler to apply the scaling.
● This ensures features are on a similar scale, reducing bias toward larger values.
● Post-scaling, mean values are close to 0 and standard deviations are close to 1, indicating successful standardization (see the sanity check below).
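A quick sanity check of the scaling result, assuming the renamed column names above:

```python
# Post-scaling, each column should show mean ~0 and std ~1.
cols = ["Air_temperature_K", "Process_temperature_K",
        "Rotational_speed_rpm", "Torque_Nm", "Tool_wear_min"]
print(df[cols].agg(["mean", "std"]).round(3))
```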
Train/Test Split
● The purpose is to separate the data for model evaluation and to avoid overfitting.
● Target variable (Any_Failure) indicates any type of machine failure.
● Feature selection:
○ Dropped columns unrelated to model training (Machine_failure and target column).
○ Ensures only predictive features are included in X.

● Outcome:
○ The split ensures the model is trained on the larger portion of the data while retaining enough for reliable testing (a sketch of the target construction and split follows below).
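A minimal sketch of the target construction and split; deriving Any_Failure as the logical OR of the five failure-type flags (TWF, HDF, PWF, OSF, RNF) is our assumption, since the slides do not show its definition:

```python
from sklearn.model_selection import train_test_split

# Assumed construction: a failure of any of the five types.
df["Any_Failure"] = df[["TWF", "HDF", "PWF", "OSF", "RNF"]].max(axis=1)

# Keep only predictive features: drop identifiers, the raw failure flag,
# the target, and (to avoid leakage) the individual failure flags.
X = df.drop(columns=["UDI", "Product_ID", "Machine_failure", "Any_Failure",
                     "TWF", "HDF", "PWF", "OSF", "RNF"])
y = df["Any_Failure"]

# 80/20 split with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
```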
Model Selection
● Tree-based models were selected due to their ability to handle complex interactions between features:
○ Random Forest;
○ Gradient Boosting;
○ XGBoost.

● Additionally, other models were applied to evaluate performance across different types of algorithms (instantiation sketched below):
○ AdaBoost;
○ Logistic Regression.
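A minimal sketch instantiating the five models with default hyperparameters (the slides do not specify tuned settings):

```python
from sklearn.ensemble import (RandomForestClassifier,
                              GradientBoostingClassifier, AdaBoostClassifier)
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

# Fixed seeds for reproducibility; all other settings left at defaults.
models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "XGBoost": XGBClassifier(random_state=42, eval_metric="logloss"),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
```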
Model Training, Prediction, and Evaluation

● We trained the models on the SMOTE-balanced data using only the selected features.

● This step ensured that the training data was balanced and tailored to the important features, enhancing the model’s
ability to accurately detect instances of the minority class.

● After training, predictions were made on the test set, and each model's performance was evaluated using accuracy, precision, recall, and F1-score (see the evaluation sketch below).
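A minimal sketch of this loop, assuming imbalanced-learn's SMOTE and the models dict from the sketch above; oversampling is applied only to the training split, leaving the test set untouched:

```python
from imblearn.over_sampling import SMOTE
from sklearn.metrics import classification_report

# Oversample the minority (failure) class on the training split only.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

for name, model in models.items():
    model.fit(X_res, y_res)
    y_pred = model.predict(X_test)
    print(name)
    print(classification_report(y_test, y_pred, digits=2))
```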
Model Training, Prediction, and Evaluation

Random Forest:
● Final Accuracy: 0.965
● Precision: 0.47
● Recall: 0.68
● F1: 0.56
Model Training, Prediction, and Evaluation

Gradient Boosting Model:
● Final Accuracy: 0.938
● Precision: 0.32
● Recall: 0.78
● F1: 0.45
Model Training, Prediction, and Evaluation

XGBoost Model:
● Final Accuracy: 0.978
● Precision: 0.64
● Recall: 0.72
● F1: 0.68
Model Training, Prediction, and Evaluation

Threshold Tuning on the XGBoost Model: Threshold tuning was employed to further balance recall and precision, increasing sensitivity to true failure cases (a sketch follows below).

● Final Accuracy: 0.967
● Precision: 0.49
● Recall: 0.74
● F1: 0.59
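A minimal sketch of threshold tuning on the fitted XGBoost model; the 0.35 cutoff is illustrative, since the slides do not state the tuned threshold:

```python
from sklearn.metrics import classification_report

# Probability of the failure class for each test instance.
proba = models["XGBoost"].predict_proba(X_test)[:, 1]

# A cutoff below the default 0.5 raises recall at the cost of precision,
# matching the trade-off reported above; 0.35 here is illustrative.
y_pred = (proba >= 0.35).astype(int)
print(classification_report(y_test, y_pred, digits=2))
```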
Model Training, Prediction, and Evaluation

AdaBoost Model:
● Final Accuracy: 0.904
● Precision: 0.22
● Recall: 0.75
● F1: 0.34
Model Training, Prediction, and Evaluation

Logistic Regression Model:
● Final Accuracy: 0.705
● Precision: 0.07
● Recall: 0.63
● F1: 0.12
Conclusion

In this predictive maintenance project, multiple models were evaluated to predict failures. In
conclusion, several key points can be highlighted:
● Top Performing Model:
○ XGBoost achieved the highest performance with an accuracy of 0.978.
○ Threshold tuning further improved recall to 0.74, optimizing sensitivity to true failures.

● Model Comparisons:
○ Random Forest also performed well but had slightly lower precision and recall.
○ Gradient Boosting and AdaBoost provided acceptable recall but lacked precision balance.
○ Logistic Regression showed the lowest metrics and was not suited for this task.
Conclusion

● Main Insights:
○ XGBoost, especially with threshold tuning, is the most effective model for this dataset.
○ A balanced approach between recall and precision was essential for reliable failure prediction.

● Practical Implications:
○ The model's reliability can enhance maintenance scheduling and reduce unexpected downtimes.
Thank You
