
LSTM 1

This document discusses using LSTM networks to predict engine condition based on large-scale sensor data. It describes how LSTM networks can be trained on Spark, a distributed data processing framework, to handle large volumes of sequential sensor data from aircraft engines. The proposed model is tested on NASA engine degradation data to predict current engine life condition and provide failure alerts. Big data technologies like Spark and deep learning models like LSTM are presented as effective approaches for predictive maintenance applications.


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/317391149

Using LSTM networks to predict engine condition on large scale data processing
framework

Conference Paper · April 2017


DOI: 10.1109/ICEEE2.2017.7935834


2 authors:

Olgun Aydın Seren Guldamlasioglu


Gdansk University of Technology Bilkent University

All content following this page was uploaded by Seren Guldamlasioglu on 24 April 2018.



Using LSTM Networks to Predict Engine Condition
on Large Scale Data Processing Framework

Olgun Aydin, Data Scientist
Big Data Research and Development Group
STM Defense Technologies Engineering and Trade Inc.
Ankara, Turkey

Seren Guldamlasioglu, Data Scientist
Big Data Research and Development Group
STM Defense Technologies Engineering and Trade Inc.
Ankara, Turkey

Abstract — As Internet of Things technology develops rapidly, companies are able to observe the health of engine components and constructed systems by collecting signals from sensors. Based on the output of IoT sensors, companies can build systems that predict the condition of components. In practice, components must be maintained or replaced before they reach the end of their life while performing their assigned task. Predicting the life condition of a component is crucial for industries that intend to grow in a fast-paced technological environment. Recent studies on predictive maintenance help industries raise an alert before components break down. By predicting component failures, companies can sustain their operations efficiently while reducing maintenance cost by repairing components in advance. Since maintenance directly affects production capacity and service quality, optimized maintenance is the key factor for organizations to increase revenue and stay competitive in a developing industrialized world. With the aid of a well-designed prediction system for understanding the current situation of an engine, components can be taken out of active service before a malfunction occurs. With the help of inspection, effective maintenance extends component life, improves equipment availability and keeps components in proper condition while reducing costs. Real-time data collected from sensors is a great source for modeling component deterioration. Markov chain models, survival analysis, optimization algorithms and several machine learning approaches have been implemented to model predictive maintenance. In this paper, Long Short Term Memory (LSTM) networks are used to predict the current situation of an engine. The LSTM model deals with sequential input data. The LSTM networks were trained on a high-performance, large-scale data processing engine. Since a huge amount of data flows into the predictive model, Apache Spark, which offers a distributed clustering environment, has been used. The output of the LSTM network decides the current life condition of components and raises alerts for components before the end of their life. The proposed model was also trained and tested on open source data from an engine degradation simulation provided by the Prognostics CoE at NASA Ames.

Keywords- ANN, Predictive Maintenance, LSTM, Apache Spark, Big Data

I. INTRODUCTION

System reliability is one of the most critical points for engineering operations. Failure of some parts of a system can affect the whole operation. Turbine engines, power supplies and batteries are typical components whose failure can cause the operation to fail.

To avoid breakdowns, some or all parts of the system should be well maintained. In common maintenance strategies, a part of a system is repaired when a failure is observed [1].

To predict the current condition of system units, condition based maintenance (CBM) has been proposed. According to Jardine et al., CBM recommends actions based on the information collected from the system. The main aims of CBM are avoiding unnecessary maintenance actions and recommending maintenance actions if an anomaly is detected [2]. Estimating remaining useful lifetime (RUL) with high accuracy is crucial for developing an effective CBM strategy. RUL can be predicted by collecting signals with sensors located on the related units of the system.

In 2014 Amit et al. developed artificial neural networks (ANN) to predict RUL under unknown initial wear [3]. Amit et al. proposed an ANN based approach for more accurate RUL prediction of high speed milling cutters. The proposed model was constructed on time based statistical features. Sateesh et al. proposed a novel approach for RUL estimation called Meta-cognitive Regression Neural Network (McRNN) for function approximation. McRNN employs an Extended Kalman Filter (EKF) to find optimal network parameters during training [4].

Porotsky developed a new model to control parameter optimization based on a cross validation procedure as a solution to the question posed in the IEEE PHM 2012 Conference Challenge Competition, and was awarded “Winner from Industry” for this solution [5]. In 2013 Rodney et al. proposed a method for estimating RUL based on time-frequency feature extraction. The method extracts measures that quantify the complexity of time-frequency surfaces [6]. Felix et al. developed a data-driven algorithm to predict RUL [7].

In the past few years, large-scale data analysis has been the center of attention, as data volumes in both industry and research continue to grow faster than the processing speed of individual machines. Google’s MapReduce model and Hadoop pioneered an ecosystem for parallel data analysis on large clusters, such as Apache’s Hive and Pig engines for SQL
processing [8]. The main advantage of Apache Spark is the resilient distributed dataset (RDD): users can explicitly cache an RDD and reuse it in multiple MapReduce-like parallel operations [9].

Apache Spark, a fast engine for large-scale in-memory data processing, is an open source cluster computing framework. Apache Spark reads data items distributed over a cluster of machines and has been built for streaming, machine learning and graph processing.

Instead of traditional machine learning based models, this paper focuses on new generation algorithms running on an emerging technology platform to predict the life condition of components. Deep learning algorithms are preferred as a bleeding-edge approach applicable to several areas. This research focuses on LSTM networks, a type of Recurrent Neural Network, and on running the network on Apache Spark [10]. LSTM has been applied to predict the current condition of an engine using the large-scale data processing engine Spark. The hidden layers of an LSTM memorize long data sequences using the current input, the previous input and the network memory state. An LSTM has four gate layers (forget gate layer, input gate layer, candidate gate layer and output gate layer) to preserve information [11]. In this study, the Python library Keras, with the TensorFlow backend, and the Python library Elephas, which makes Keras available on Apache Spark, were used.

This study makes it possible to predict the current condition of an engine from a very large volume of sensor data. An LSTM network running on the distributed open source framework Apache Spark is proposed for deciding on the engine life cycle.

II. PROBLEM DESCRIPTION

Accurate estimation of engine condition from sensor data has many benefits for the prognosis of an engine’s current condition. In this paper, the engine degradation simulation data (C-MAPSS) provided by the Prognostics CoE at NASA Ames has been used. The data set includes different combinations of operational conditions and fault modes in multivariate time series format [12]. The data consists of three operational settings and 21 sensor measurements. The sensors collect data related to temperature, engine pressure, fuel and coolant bleed. Details can be found in Table I.

TABLE I. DESCRIPTION OF DATASET

Operational Settings
Settings No   Description
1             Altitude
2             Mach number
3             Throttle resolver angle

Sensor Measurements
Sensor No     Description
1             Total temperature at fan inlet (°R)
2             Total temperature at LPC outlet (°R)
3             Total temperature at HPC outlet (°R)
4             Total temperature at LPT outlet (°R)
5             Pressure at fan inlet (psia)
6             Total pressure in bypass-duct (psia)
7             Total pressure at HPC outlet (psia)
8             Physical fan speed (rpm)
9             Physical core speed (rpm)
10            Engine pressure ratio (P50/P2)
11            Engine pressure ratio (P50/P2)
12            Ratio of fuel flow to Ps30 (pps/psi)
13            Corrected fan speed (rpm)
14            Corrected core speed (rpm)
15            Bypass Ratio
16            Burner fuel-air ratio
17            Bleed Enthalpy
18            Demanded fan speed (rpm)
19            Demanded corrected fan speed (rpm)
20            HPT coolant bleed (lbm/s)
21            LPT coolant bleed (lbm/s)

Before building a model, the percentage residual life of the engine, Rl, was calculated as described in [4] using the following equation.

Rl = (Time to failure – Current Age) / Time to failure
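The residual life computation and the class assignment can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the example engine (failure at cycle 200) are hypothetical, while the class boundaries are the ones the paper lists in Table II.

```python
def residual_life(current_age: float, time_to_failure: float) -> float:
    """Rl = (time to failure - current age) / time to failure."""
    if time_to_failure <= 0:
        raise ValueError("time_to_failure must be positive")
    return (time_to_failure - current_age) / time_to_failure


def condition(rl: float) -> str:
    """Map an Rl value in [0, 1] to the class ranges given in Table II."""
    if not 0.0 <= rl <= 1.0:
        raise ValueError("Rl must lie in [0, 1]")
    if rl > 0.85:
        return "Healthy"
    if rl > 0.7:
        return "Caution"
    if rl > 0.5:
        return "Repair"
    return "Fail"


# Hypothetical engine that fails at cycle 200 (illustrative numbers only).
timeline = []
for age in (0, 20, 50, 80, 150, 200):
    rl = residual_life(age, 200)
    timeline.append((age, rl, condition(rl)))
```

Each entry of `timeline` pairs a cycle count with its Rl value and class label, which is the form of target a four-class classifier would be trained on.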


When Rl is equal to one, the remaining life of the engine is 100%. When Rl is equal to zero, the remaining life of the engine is 0% and the engine has failed. Rl values were divided into four classes indicating the life condition of the components. Each class corresponds to one of the following life cycle conditions: Healthy, Caution, Repair, Fail, as shown in Table II. The Healthy state indicates that the engine is at the beginning of its lifetime. The Caution state indicates an engine that has already been in operation and needs attention. An action has to be taken before the component falls into the Repair state. Fail is the undesired state that causes the breakdown.

TABLE II. DESCRIPTION OF CONDITIONS

Rl Range       Condition
(0.85 - 1]     Healthy
(0.7 - 0.85]   Caution
(0.5 - 0.7]    Repair
[0 - 0.5]      Fail

Our solution aims to raise an alert before the component falls into the repair condition. With this alert, the component can be maintained or replaced before the system breaks down. The solution also aims to observe the life condition of each component at any time. This study approaches the problem from a different viewpoint by offering a predictive maintenance solution based on the distributed clustering environment provided by Apache Spark.

III. LONG SHORT TERM MEMORY (LSTM)

Recent studies on predictive maintenance have mostly applied Hidden Semi Markov models to predict the remaining useful life and reliability of components. Hidden Semi Markov models are commonly expressed in terms of failure rates, defined as the frequency of breakdowns of a component per hour. Hidden Semi Markov models find the probabilities of failure transition rates [13]. As an alternative to Hidden Semi Markov models, Artificial Neural Networks (ANN) are preferred for training predictive maintenance models. Features feed the input layer, and a feed-forward neural network topology is mostly used. The network giving the minimum validation error is selected to represent the optimum outcome. A log-sigmoid transfer function is applied when constructing the ANN model, and the output is normalized between 0 and 1 [14].

This study focuses on Long Short Term Memory networks, a specialized implementation of Recurrent Neural Networks (RNN). LSTM was introduced by the German researchers Sepp Hochreiter and Jürgen Schmidhuber in the mid-1990s to enable a model to learn long-term dependencies. LSTM was proposed to address the vanishing gradient problem [15]. Like standard RNNs, LSTMs have a chain of repeating neural network modules. Repeating modules in standard neural networks have simple structures such as tanh and sigmoid layers; however, LSTMs have different repeating modules compared with RNNs and other types of neural networks. Instead of a single neural network layer, LSTMs have four interacting special layers. These four layers consist of the tanh and sigmoid layers shown in Figure 1. Each layer carries an entire vector from the output of one layer to the inputs of the next layer.

Fig. 1 LSTM Modules including four layers

LSTMs are able to add or remove information as it passes through gates, which are a way to decide how much information to carry between the layers. Figure 2 depicts the pointwise multiplication operation that carries the information between the cells.

Fig. 2 LSTM Module Including Add Operation

If all information is carried through a gate, the sigmoid function takes the value 1. The first step of an LSTM is therefore deciding how much information to carry between the states. This decision is made by the forget gate layer, which is also a sigmoid layer. If this layer takes the value 0, all information is forgotten. After the forget gate layer, the next step is to decide which new information will be added to the next cell state. The input gate layer decides which information will be updated, while a tanh layer creates the new values to be added to the state. In addition, it must be decided how much the states will be updated. The updated values and the newly added candidates are then combined into the state. After this process, the old state $S_{t-1}$ is updated to the new state $S_t$, as shown in Figure 3. The old state is multiplied by $f_t$ in order to forget the things that were decided to be forgotten. To decide the amount of information to be gained, $i_t * \tilde{S}_t$ is added to the model, as shown in Figure 3. $\tilde{S}_t$ represents a tanh layer that creates a vector of new candidate values. While moving through the next states, the information decided to be forgotten is dropped from the gained information [16].
Fig. 3 Updating the new state
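As a rough illustration of the update described in this section, the following sketch performs one LSTM time step in plain Python. It assumes the standard LSTM formulation (forget, input, candidate and output layers acting on the concatenation of the previous output and the current input); the parameter layout and the names `lstm_step`, `affine` and `params` are hypothetical and are not taken from the paper or from Keras.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vec):
    """Return W @ vec + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, s_prev, params):
    """One LSTM time step.

    params maps each layer name to a (W, b) pair; this layout is an
    assumption made for the sketch. Returns (h_t, s_t).
    """
    z = h_prev + x_t  # concatenate previous output and current input
    f_t = [sigmoid(a) for a in affine(*params["forget"], z)]       # what to keep
    i_t = [sigmoid(a) for a in affine(*params["input"], z)]        # what to update
    cand = [math.tanh(a) for a in affine(*params["candidate"], z)]  # new values
    o_t = [sigmoid(a) for a in affine(*params["output"], z)]       # what to emit
    # New state: forget part of the old state, then add the gated candidates.
    s_t = [f * s + i * c for f, s, i, c in zip(f_t, s_prev, i_t, cand)]
    h_t = [o * math.tanh(s) for o, s in zip(o_t, s_t)]
    return h_t, s_t


def zero_params(hidden, inputs):
    """All-zero weights and biases, convenient for checking the arithmetic."""
    layer = lambda: ([[0.0] * (hidden + inputs) for _ in range(hidden)],
                     [0.0] * hidden)
    return {name: layer() for name in ("forget", "input", "candidate", "output")}
```

With all-zero parameters every sigmoid gate equals 0.5 and the candidate vector is zero, so the new state is exactly half of the old one. This makes it easy to check by hand that the forget gate scales the old state while the input gate scales the candidate values.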

IV. THE PROPOSED MODEL


To build an efficient prediction model that learns long-term dependencies, LSTM, one of the new generation of deep learning algorithms, has been applied to develop a predictive maintenance model that aims to predict the remaining life of the engine. As feature selection, according to calculations, Total temperature at fan inlet (°R), Pressure at fan inlet (psia), Demanded fan speed (rpm) and Demanded corrected fan speed (rpm) were dropped because they had zero variance. Total temperature at HPC outlet (°R), Total temperature at LPT outlet (°R), Total pressure in bypass-duct (psia), Total pressure at HPC outlet (psia), Ratio of fuel flow to Ps30 (pps/psi), Bleed Enthalpy and HPT coolant bleed (lbm/s) were dropped because they were highly correlated with other sensors.

Finally, Total temperature at LPC outlet (°R), Physical fan speed (rpm), Physical core speed (rpm), Engine pressure ratio (P50/P2), Engine pressure ratio (P50/P2), Corrected fan speed (rpm), Corrected core speed (rpm), Bypass Ratio, Burner fuel-air ratio and LPT coolant bleed (lbm/s) were used as the sensor input variables.

The layer architecture shown in Figure 4 has been implemented in this study. The architecture covers 15 inputs (operation settings 1, operation settings 2, operation settings 3, unit number, total temperature at LPC outlet (°R), physical fan speed (rpm), physical core speed (rpm), engine pressure ratio (P50/P2), engine pressure ratio (P50/P2), corrected fan speed (rpm), corrected core speed (rpm), bypass ratio, burner fuel-air ratio, LPT coolant bleed (lbm/s)) feeding into the network. The proposed model produces four output class labels: healthy, caution, repair, fail.

Fig. 4 Proposed LSTM Network Layers

The Keras library for Python has been used on the Apache Spark distributed clustering environment to build the network. Keras is a high-level neural networks library, written in Python and capable of running on top of either TensorFlow or Theano [17]. To make the whole process distributed, another Python library called Elephas has been used. Elephas is an extension of Keras that allows distributed deep learning models to be run at scale with Spark.

V. RESULTS AND DISCUSSION

Experimental results show that the current engine condition can be predicted, so that necessary precautions can be taken when the condition is predicted as repair.

The LSTM network model was built with the Keras and Elephas libraries in a distributed environment powered by Apache Spark. In this study, the number of training epochs was set to 200 and the batch size to 30. Categorical cross entropy was selected as the loss function, and Adadelta was used to optimize it. As shown in Figure 5, the training accuracy of the RUL prediction reached 85%.

Fig. 5 Training Process

VI. CONCLUSION

This paper describes an LSTM implemented on Apache Spark, which offers a large-scale distributed data processing environment, in order to predict the current life condition of an engine. Previous studies that aimed to predict maintenance conditions were mostly based on Hidden Markov Models or traditional Artificial Neural Networks. Instead of these common models, an LSTM running in a distributed environment is a bleeding-edge technology for predicting the current engine condition. The accuracy of the proposed model points to the reliability of the architecture, which might allow industries to get an alert before a breakdown occurs. Companies might therefore have a chance to reduce maintenance cost while increasing revenue and service quality.
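The sensor screening described in Section IV (dropping zero-variance channels, then dropping channels highly correlated with ones already kept) could be sketched as follows. The paper does not state how the correlation was measured or which cut-off was used, so the Pearson coefficient, the 0.95 threshold, the greedy keep-first order and all sample readings below are assumptions made purely for illustration.

```python
import math
from statistics import mean, pvariance


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - ma) ** 2 for x in a) *
                     sum((y - mb) ** 2 for y in b))
    return cov / norm


def screen_features(columns, corr_threshold=0.95):
    """Split sensor columns into kept names and dropped names with reasons.

    columns: dict mapping sensor name -> list of readings.
    corr_threshold is an assumed cut-off, not a value from the paper.
    """
    kept, dropped = [], {}
    for name, values in columns.items():
        if pvariance(values) == 0:
            dropped[name] = "zero variance"
            continue
        match = next((k for k in kept
                      if abs(pearson(columns[k], values)) >= corr_threshold),
                     None)
        if match is not None:
            dropped[name] = f"correlated with {match}"
        else:
            kept.append(name)
    return kept, dropped


# Made-up readings for four channels (not taken from the C-MAPSS files).
sample = {
    "fan_inlet_temp": [518.67, 518.67, 518.67, 518.67, 518.67],
    "lpc_outlet_temp": [641.8, 642.1, 642.4, 642.3, 642.9],
    "hpc_outlet_temp": [1589.0, 1589.9, 1590.8, 1590.5, 1592.3],
    "bypass_ratio": [8.42, 8.40, 8.44, 8.39, 8.41],
}
kept, dropped = screen_features(sample)
```

Because the screening is greedy, the result depends on column order: the first channel of a correlated group is kept and later ones are dropped.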
ACKNOWLEDGMENT

Thanks to STM Defense Technologies Engineering and Trade Inc. for supporting us to maintain this study. STM provides system engineering, technical support, project management, technology transfer and logistics support services for TAF (Turkish Armed Forces) and SSM (Undersecretariat for Defense Industries).

REFERENCES

[1] Q. Zhou, J. Son, S. Zhou, X. Mao, M. Salman, “Remaining useful life prediction of individual units subject to hard failure”, IIE Transactions, vol. 46, Jun. 2014, pp. 1017-1030, doi: 10.1080/0740817X.2013.876126
[2] A. Jardine, D. Lin, D. Banjevic, “A review on machinery diagnostics and prognostics implementing condition-based maintenance”, Mechanical Systems and Signal Processing, vol. 20, Oct. 2006, pp. 1483-1510, doi: 10.1016/j.ymssp.2005.09.012
[3] A. Jain, P. Kundu, B. K. Lad, “Prediction of remaining useful life of an aircraft engine under unknown initial wear minutes”, 5th International & 26th All India Manufacturing Technology, Design and Research Conference (AIMTDR 2014), Dec. 2014, pp. 494:1-494:5, doi: 10.1007/978-81-322-2352-8
[4] G. Sateesh Babu, Xiao-Li Li, S. Suresh, “Meta-cognitive Regression Neural Network for function approximation: Application to Remaining Useful Life estimation”, Neural Networks (IJCNN), 2016 International Joint Conference on, IEEE, Jul. 2016, pp. 4803-4810, doi: 10.1109/IJCNN.2016.7727831
[5] S. Porotsky, “Remaining useful life estimation for systems with non-trendability behaviour”, Prognostics and Health Management (PHM), IEEE Conference on, Jun. 2012, pp. 1-6, doi: 10.1109/ICPHM.2012.6299544
[6] Heimes, O. Felix, “Recurrent neural networks for remaining useful life estimation”, Prognostics and Health Management, PHM 2008, International Conference on, IEEE, Oct. 2008, pp. 1-6, doi: 10.1109/PHM.2008.4711422
[7] M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. Mccauley, M. Franklin, S. Shenker, I. Stoica, “Fast and interactive analytics over Hadoop data with Spark”, login, vol. 37, Aug. 2012, pp. 45-51
[8] M. Zaharia, M. Chowdhury, M. Franklin, S. Shenker, I. Stoica, “Spark: cluster computing with working sets”, HotCloud, vol. 10, pp. 10-10.
[9] L. Liao, H. Ahn, “Combining Deep Learning and Survival Analysis for Asset”, 2016
[10] E. Nardo, “Distributed implementation of a LSTM on Spark and Tensorflow”, 2016
[11] A. Saxena, K. Goebel, “Turbofan Engine Degradation Simulation Data Set”, NASA Ames Prognostics Data Repository (http://ti.arc.nasa.gov/project/prognostic-data-repository), NASA Ames Research Center, Moffett Field, CA, 2008
[12] M. Abbas, O. H. I. Mohammad, N. A. Omer, “Development of predictive markov-chain condition based tractor failure analysis algorithm”, Research Journal of Agriculture and Biological Sciences, vol. 7, pp. 52-67, 2011
[13] A. K. Mahamad, S. Saon, T. Hiyama, “Predicting remaining useful life of rotating machinery based artificial neural network”, Computers & Mathematics with Applications, vol. 60, pp. 1078-1087, 2015
[14] K. Greff, R. Srivastava, J. Koutnik, B. R. Steunebrink, J. Schmidhuber, “LSTM: A search space odyssey”, arXiv preprint arXiv:1503.04069, March 2015.
[15] D. Hristovski, B. Peterlin, J. Mitchell, M. H. Susanne, “Using literature-based discovery to identify disease candidate genes”, International journal of medical informatics, vol. 74, pp. 289-298, March 2005.
[16] Keras: Deep Learning library for Theano and TensorFlow, https://keras.io/
[17] Elephas: Distributed Deep Learning with Keras & Spark, https://github.com/maxpumperla/elephas

AUTHORS’ BACKGROUND

Your Name              Title*           Research Field                              Personal website
Olgun Aydin            Data Scientist   Big Data, Machine Learning, Deep Learning   http://olgunaydin.com
Seren Guldamlasioglu   Data Scientist   Big Data, Machine Learning, Deep Learning   http://serensweekly.blogspot.com.tr

*This form helps us to understand your paper better, the form itself will not be published.

*Title can be chosen from: master student, Phd candidate, assistant professor, lecture, senior lecture, associate professor,
full professor
