Sensors 21 00418 v2
Article
A Remaining Useful Life Prognosis of Turbofan Engine Using
Temporal and Spatial Feature Fusion
Cheng Peng 1,2 , Yufeng Chen 1 , Qing Chen 1 , Zhaohui Tang 2, * , Lingling Li 1 and Weihua Gui 2
1 School of Computer, Hunan University of Technology, Zhuzhou 412007, China; [email protected] (C.P.);
[email protected] (Y.C.); [email protected] (Q.C.); [email protected] (L.L.)
2 School of Automation, Central South University, Changsha 410083, China; [email protected]
* Correspondence: [email protected]
Abstract: The prognosis of the remaining useful life (RUL) of a turbofan engine provides an important
basis for predictive maintenance and remanufacturing, and plays a major role in reducing failure
rates and maintenance costs. The main problem of traditional methods based on a single neural
network of shallow machine learning is that RUL prognosis relies on a single kind of feature
extraction, so the prediction accuracy is generally not high. Therefore, a method for predicting RUL
based on the combination of one-dimensional convolutional neural networks with a full convolutional
layer (1-FCLCNN) and long short-term memory (LSTM) is proposed. In this method, LSTM and 1-FCLCNN
are adopted to extract the temporal and spatial features, respectively, of the FD001 and FD003
datasets generated by turbofan engines. The fusion of these two kinds of features forms the input of
the next convolutional neural network (CNN) to obtain the target RUL. Compared with currently
popular RUL prediction models, the results show that the proposed model achieves higher prediction
accuracy. The final evaluation indexes also show the effectiveness and superiority of the model.
Keywords: remaining useful life (RUL); long short-term memory (LSTM); one-dimensional convo-
lutional neural networks with full convolutional layer (1-FCLCNN); temporal and spatial features;
turbofan engine
Citation: Peng, C.; Chen, Y.; Chen, Q.; Tang, Z.; Li, L.; Gui, W. A Remaining Useful Life Prognosis
of Turbofan Engine Using Temporal and Spatial Feature Fusion. Sensors 2021, 21, 418.
https://doi.org/10.3390/s21020418

Received: 14 December 2020; Accepted: 6 January 2021; Published: 8 January 2021

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

The turbofan engine is a highly complex and precise piece of thermal machinery, the "heart" of the
aircraft. About 60% of the total faults of an aircraft are related to the turbofan engine [1]. RUL
prediction of the turbofan engine provides an important basis for predictive maintenance and
pre-maintenance. In recent years, because of the rapid development of machine learning and deep
learning, the intelligence and work efficiency of turbofan engines have been greatly improved.
However, as large-scale precision equipment, its operation cannot be separated from the
comprehensive influence of internal factors such as physical and electrical characteristics and
external factors such as temperature and humidity [2]. The performance degradation process also
shows temporal and spatial characteristics [3], which provide necessary data support, and bring
challenges, for the RUL prediction of the turbofan engine. At the same time, the data generated by
the operation of a turbofan engine are nonlinear [4], time-varying [5], large-scale, and
high-dimensional [6], which hinders effective feature extraction and prevents the non-linear
relationship between the extracted features and the RUL from being mapped; these are the key
problems to be solved urgently.

Many models have been developed to predict the RUL of turbofan engines. Ahmadzadeh et al. [7]
divided the prediction methods into four categories: experimental, physics-based, data-driven, and
hybrid methods. The experimental type relies on prior knowledge and historical data, but the
operating conditions and operating environment of the equipment are uncertain, which leads to large
prediction errors, and it cannot be promoted in complex scenarios. The physical model uses the
physical and electrical
We found that the data set of the turbofan engine is composed of multiple time series; the data in
different data sets contain different noise levels, so it is necessary to normalize the original
data, which will eliminate the influence of noise and realize data centralization to enhance the
generalization ability of the model. At the same time, it is difficult to capture multi-fault-mode
and multi-dimensional feature data in different operating environments. It is also necessary to use
multi-scene and multi-time-point data to extract effective features to improve prediction accuracy;
traditional methods cannot extract temporal and spatial features simultaneously and effectively
fuse them. In addition, a single neural network model can hardly extract enough effective
information in the face of multiple working conditions and multiple types of features.

The main contributions of this paper include: (1) LSTM is used to extract the temporal
characteristics of the data sequence and learn how to model the sequence according to the target
RUL to provide accurate results. (2) A one-dimensional full-convolutional-layer neural network is
adopted to extract spatial features, and through dimensionality reduction processing, the
parameters and computational complexity of the training process are greatly reduced. (3) The
spatiotemporal features extracted by the two models are fused and used as the input of the
one-dimensional convolutional neural network for secondary processing. Comparing this method with
other mainstream RUL prediction methods, the score and error control of the method proposed in this
article are better than others, which proves the feasibility and effectiveness of this method.

The rest of this article is arranged as follows: Part 2 is the basic theory, mainly introducing the
model structure of the neural networks and the evaluation indicators. The third part is the focus
of this article, mainly including the proposed model structure, algorithm, training process, and
implementation flow. The fourth part is the experiment and result analysis, and the last part is
the summary and prospect.
2. Basic Theory

2.1. Convolutional Neural Network

The convolutional neural network has been widely used in image recognition and complex data
processing [24]. It consists of the input layer, convolutional layer, pooling layer, fully
connected layer, and output layer. Its basic components are shown in Figure 1.
Figure 1. Traditional convolutional neural network.
N_{n+1} = (N_n + 2p − f) / t + 1    (2)
In the formula, i, j are pixel indexes, d is the bias in the calculation process, W is the
weight matrix, and An , An+1 represent the input and output of the n+1 layer, Nn+1 is the
size of A, M is the number of convolution channels, t is the step size, and p and f are
the padding and convolution kernel size. An activation function is usually applied after the
convolution kernel in order to make the model fit the training data better, accelerate convergence,
and avoid the gradient-vanishing problem; in this article, ReLU is selected as the activation
function, as follows:

ReLU(x) = max(0, x)    (3)

where x is the input value of the upper neural network. The convolutional layer performs
feature extraction, and the obtained information is used as the input of the pooling layer.
The pooling layer can further filter the information, which not only reduces the dimension
of the feature, but also prevents the risk of overfitting. Pool layer generally has average
pool and maximum pool. The expression for the pooling layer is as follows [26]:
Z_m^{n+1}(i, j) = [ ∑_{x=1}^{f} ∑_{y=1}^{f} Z_m^{n}(t ∗ i + x, t ∗ j + y)^s ]^{1/s}    (4)
where the step size t and pixel indexes (i, j) are the same as for the convolutional layer, and s
is a specified parameter. When s = 1, the expression is average pooling, and when s → ∞, the
expression is maximum pooling. m is the number of channels in the feature map, Z is the output of
the pooling layer, and the value of s determines whether the output is average or maximum pooling.
The other variables have the same meaning as in the convolution.
After feature extraction and dimensionality reduction by the convolutional layer and pooling layer,
the fully connected layer maps these features to the labeled sample space. After flattening, the
fully connected layer transforms the feature matrix into a one-dimensional feature vector. The
fully connected layer also contains the weight matrix and parameters.
The expression of the full connection layer is as follows:
Y = σ (W I + b ) (5)
where I is the input of the fully connected layer, Y is the output, W is the weight matrix,
and b is the bias. σ() is a general term for activation function, which can be softmax,
sigmoid, and ReLU, and the common ones are the multi-class softmax function and the
two-class sigmoid function.
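A minimal sketch of Formula (5) with σ taken as the two-class sigmoid (the helper names are ours,
for illustration only):

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def fully_connected(I, W, b):
    # Formula (5): Y = sigma(W I + b), with sigma taken as sigmoid here.
    # W is a list of weight rows; I and b are plain lists.
    return [sigmoid(sum(w * x for w, x in zip(row, I)) + bi)
            for row, bi in zip(W, b)]
```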
Figure 2. The structure of the full convolution model.
f_t = σ(W_f · [a_{t−1}, x_t] + d_f)    (6)

where f_t is the forget gate, which means that some features of C_{t−1} are used in the calculation
of C_t. The value range of the elements in f_t is [0, 1], the activation function σ is generally
sigmoid, W_f is the weight matrix of the forget gate, d_f is the bias, and ⊗ is the gate mechanism,
which represents element-wise multiplication.
(2) Input gate and memory unit update

u_t = σ(W_u · [a_{t−1}, x_t] + d_u)    (7)
at = ot ◦ tanh(Ct ) (11)
Among them, at is derived from the output gate ot and the cell state Ct , and the
average value of do is initialized to 1.
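For concreteness, one LSTM step can be sketched in plain Python. This is an illustration, not the
authors' implementation: the candidate state and output gate (Formulas (8)-(10), missing from this
excerpt) are filled in with the standard tanh/sigmoid LSTM forms, and the cell state is kept scalar
for brevity:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(a_prev, x_t, c_prev, p):
    # One LSTM time step following Formulas (6), (7), and (11); p holds
    # weight lists over the concatenated input [a_{t-1}, x_t] and biases.
    z = a_prev + x_t                            # concatenation [a_{t-1}, x_t]
    dot = lambda w: sum(wi * zi for wi, zi in zip(w, z))
    f_t = sigmoid(dot(p["Wf"]) + p["df"])       # forget gate, Formula (6)
    u_t = sigmoid(dot(p["Wu"]) + p["du"])       # input gate, Formula (7)
    c_hat = math.tanh(dot(p["Wc"]) + p["dc"])   # candidate state (assumed form)
    o_t = sigmoid(dot(p["Wo"]) + p["do"])       # output gate (assumed form)
    c_t = f_t * c_prev + u_t * c_hat            # memory unit update
    a_t = o_t * math.tanh(c_t)                  # Formula (11)
    return a_t, c_t
```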
The evaluation indexes adopted in this paper are the root mean square error (RMSE) and the scoring
function (SF), as shown in Figure 4.

Figure 4. The curve of root mean square error (RMSE) and scoring function (SF).
RMSE: it is used to measure the deviation between the observed value and the actual value, and is a
common evaluation index for error prediction; RMSE imposes the same penalty on early prediction and
late prediction. The calculation formula is as follows:

RMSE = sqrt( (1/X_n) ∑_{i=1}^{X_n} Y_i^2 )    (12)

where X_n is the total number of test samples of the turbofan engine; Y_i refers to the difference
between the predicted value pv of the RUL of the i-th turbofan engine and the actual value av of
the RUL.

SF: SF is structurally an asymmetric function, which is different from RMSE. It is more inclined to
early prediction (in this stage, the RUL prediction value pv is less than the actual value av)
rather than late prediction, to avoid serious consequences due to delayed prediction. The lower the
values of RMSE and the SF score, the better the effect of the model. The calculation formula is as
follows:

score = ∑_{i=1}^{X_n} (e^{−Y_i/13} − 1),  Y_i < 0;    score = ∑_{i=1}^{X_n} (e^{Y_i/10} − 1),  Y_i ≥ 0    (13)

In the formula, score represents the score. When RMSE, score, and Y_i are as small as possible, the
effect of the model will be better.
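The two evaluation indexes translate directly into code; Y_i is the predicted RUL minus the actual
RUL, so late predictions (Y_i ≥ 0) incur the heavier e^{Y_i/10} penalty. A minimal sketch
(function names are ours):

```python
import math

def rmse(errors):
    # Formula (12): errors are Y_i = predicted RUL - actual RUL.
    return math.sqrt(sum(y * y for y in errors) / len(errors))

def score(errors):
    # Formula (13): asymmetric scoring function; late predictions
    # (Y_i >= 0) are penalized more heavily than early ones.
    return sum(math.exp(-y / 13.0) - 1.0 if y < 0 else
               math.exp(y / 10.0) - 1.0 for y in errors)
```

Note the asymmetry: an error of +10 cycles scores worse than an error of −10 cycles, while RMSE
treats both the same.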
3. 1-FCLCNN-LSTM Prediction Model

3.1. Overall Framework

The life prediction model 1-FCLCNN-LSTM proposed in this paper adopts the idea of classification
and parallel processing; the 1-FCLCNN and the LSTM network extract spatio-temporal features
separately, and the two types of features are fused and then input to the one-dimensional
convolutional neural network and fully connected layer. Specifically, first, by preprocessing the
C-MAPSS data set, the data are standardized and divided into two input parts: INP1 and INP2. These
two parts are input to the 1-FCLCNN and the LSTM neural network. Among them, the 1-FCLCNN is used
to extract the spatial features of the data set. At the same time, the LSTM is adopted to extract
the time-series features of the data set. After the feature extraction, Algorithm 1 is applied to
fuse the two types of features, which are then input into the one-dimensional convolutional neural
network. Finally, the data pass through the pooling layer and fully connected layer, are dealt with
by Algorithm 2, and the predicted RUL result is obtained. The overall framework of the model is
shown in Figure 5.
Figure 5. The overall framework of the model.
The feature fusion method, which only splices the feature information without changing the content,
preserves the integrity of the feature information and can be used for multi-feature information
fusion. Specifically, in the feature dimension splicing, the number of channels (features)
increases after the splicing, but the information under each feature does not increase, and each
channel of the splicing maps the corresponding convolution kernel; the expression is as follows:
A = { Ai |i = 1, 2, 3, · · · , channel } (14)
B = { Bi |i = 1, 2, 3, · · · , channel } (15)
D_{single} = ∑_{i=1}^{channel} A_i ∗ K_i + ∑_{i=1}^{channel} B_i ∗ K_{i+channel}    (16)
In the formula, the two input channels are A and B respectively, the single output
channel to be spliced is Dsingle , ∗ means convolution, and K is the convolution kernel.
The algorithms for spatio-temporal feature fusion and RUL prediction in this paper are as follows.
Algorithm 1 Spatio-temporal Feature Fusion Algorithm
Input: INP1, INP2
Output: Spatio-temporal fusion feature
1. Conduct regular processing.
2. Keep the chronological order of the entire sequence, and use one-dimensional convolutional
layer to extract local spatial features to obtain spatial information.
3. Extract local spatial extreme values by one-dimensional pooling layer to obtain a
multi-dimensional spatiotemporal feature map
4. Learn the characteristics of the data sequence over time through LSTM.
5. Splice the above two features to get the final spatiotemporal fusion feature.
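Formula (16) and step 5 of Algorithm 1 amount to concatenating the two feature maps along the
channel axis and letting the first group of kernels act on A's channels and the second group on
B's. A pure-Python sketch under simplified assumptions (1-D features, a single output channel;
function names are ours):

```python
def conv1d_valid(x, k):
    # 1-D "valid" cross-correlation of sequence x with kernel k.
    n = len(k)
    return [sum(x[i + j] * k[j] for j in range(n))
            for i in range(len(x) - n + 1)]

def fused_output(A, B, K):
    # Formula (16): one output channel of a convolution applied to the
    # channel-wise splice of feature maps A and B. The first len(A)
    # kernels in K act on A's channels, the remaining ones on B's.
    c = len(A)
    out = None
    for i in range(c):
        ya = conv1d_valid(A[i], K[i])
        yb = conv1d_valid(B[i], K[i + c])
        contrib = [a + b for a, b in zip(ya, yb)]
        out = contrib if out is None else [o + v for o, v in zip(out, contrib)]
    return out
```

With identity (length-1, weight-1) kernels, the fused output is simply the element-wise sum of the
two spliced channels, which makes the per-channel structure of Formula (16) easy to see.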
Figure 6. The detailed architecture of the 1-FCLCNN path.
X*_{m,n} = (X_{m,n} − X_n^{min}) / (X_n^{max} − X_n^{min}),  ∀m, n    (17)

In the formula, X*_{m,n} represents the value of the m-th data point of the n-th feature after
normalization processing, X_{m,n} represents the initial data before processing, and X_n^{max},
X_n^{min} are the maximum and minimum values of the n-th feature, respectively.
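Formula (17) can be sketched as follows; the handling of constant features is our own choice, since
the paper does not specify it:

```python
def min_max_normalize(columns):
    # Formula (17): per-feature min-max scaling to [0, 1].
    # columns[n] holds all values of feature n across data points.
    out = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant feature is mapped to 0 to avoid division by zero
        # (an assumption of this sketch, not stated in the paper).
        out.append([(v - lo) / span if span else 0.0 for v in col])
    return out
```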
In the model training section, the purpose of training is to minimize the cost function
and the loss, and to obtain the best parameters. The cost function as RMSE is defined by
the model (Formula (12)). In the meantime, Adam algorithm [30] and Early stopping [31]
are adopted to optimize the training process. The Early stopping can not only verify the
effect of the model in the training process, but also avoid overfitting. During the training,
the normalized data are segmented by a sliding window. The input data INP1 and INP2 take the form
of two-dimensional tensors of size ssw × nf and are processed separately by two parallel paths: a
1D convolutional path and an LSTM network. In order to keep the gradient large and reduce the
gradient-vanishing problem [32], a normalization operation is used after each max pooling layer.
This normalization operation normalizes the activation values of each layer of the neural network
to maintain the same distribution. In the meantime, a larger gradient means faster convergence and
training.
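The sliding-window segmentation described above can be sketched as follows (the window length ssw
and the step are parameters; the helper name is ours):

```python
def sliding_windows(series, ssw, step=1):
    # Segment a run-to-failure series (a list of per-cycle feature
    # vectors of length nf) into overlapping windows of length ssw,
    # producing the ssw x nf two-dimensional inputs described above.
    return [series[i:i + ssw]
            for i in range(0, len(series) - ssw + 1, step)]
```

A series of 4 cycles with ssw = 2 and step 1 yields 3 overlapping windows; each window would be
paired with the RUL label at its last cycle.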
The 1-FCLCNN-LSTM training algorithm is as follows:
Figure 7. The flow chart of the proposed method.
According to the needs of the experiment, this paper adopts the data sets FD001 and FD003 for model
verification, and the specific description of the data sets is shown in Table 3:
Table 3. Description of the FD001 and FD003 data subsets.

Data Set   Training Set   Test Set   Operating Conditions   Fault Mode   Number of Sensors   Type of Operating Parameters
FD001      100            100        1                      1            21                  3
FD003      100            100        1                      2            21                  3
In this table, the training set in the data set includes the data of the entire engine life
cycle, while the data trajectory of the test set terminates at some point before failure. FD001
and FD003 were simulated under the same (sea level) condition. FD001 was only tested in
the case of HPC degradation, and FD003 was simulated in two fault modes: HPC and fan
degradation. The number of sensors and the type of operation parameters are consistent
for the four data subsets (FD001-FD004). The data subsets FD001 and FD003 contain
actual RUL values, so that the effect of the model can be seen according to the comparison
between the actual value and the predicted value. The result of the experiment is to predict
the number of remaining running cycles before the failure of the test set, namely RUL.
(c) DROPOUT and RMSE.

Figure 11. Experimental results of different parameters of FD003.
After model training and comparative analysis of the experimental results, the parameter settings
of the FD001 and FD003 data subsets with the best model performance are finally obtained, as shown
in Table 4.
Table 4. Model parameter settings for FD001 and FD003 data subsets.

Parameter    FD001   FD003
epoch        60      60
batch size   256     512
dropout      0.2     0.2
4.4. Experimental Results and Comparison

In this section, we mainly introduce the prediction results of this model and the comparative
analysis with recent popular research methods. With the same data input and the same pretreatment
process, the prediction results of the traditional convolutional neural network are compared with
the 1-FCLCNN-LSTM model proposed in this paper. The traditional convolutional neural network
consists of two convolutional layers, two pooling layers, and a fully connected layer. For the
FD001 and FD003 data subsets, this paper compares the training effect of the convolutional neural
network and the FCLCNN-LSTM model under the same data set and engine. The training effects of
engines with the FD001 and FD003 data subsets on the two models are shown in Figures 12 and 13.
The training diagrams of the two models in a single data subset can be read as follows: RUL began
to decrease with the increase of time step, and finally failed. From the process of RUL reduction,
it can be observed that the prediction accuracy increases over time: the closer the predicted
values are to the actual values, the smaller the RUL and the closer the engine is to the potential
fault. In this paper, RMSE is used to express the training effect of the FD001 and FD003 training
sets, as shown in Formula (12). The comparison results are shown in Table 5.
(a) Training diagram of CNN model (b) Training diagram of 1-FCLCNN-LSTM model
Figure 12. Training diagram of the same FD001 engine under two models.
(a) Training diagram of CNN model. (b) Training diagram of 1-FCLCNN-LSTM model.
Figure 13. Training diagram of the same FD003 engine under two models.
Table 5. RMSE training values of FD001 and FD003 under the two models.

Model           FD001   FD003
CNN             8.25    14.00
1-FCLCNN-LSTM   4.87    7.56
From Table 5 and the training diagrams of the two models on different data sets, it can be
concluded that the 1-FCLCNN-LSTM proposed in this paper performs better in the training process
than the traditional single CNN neural network. Among them, the RMSE of the 1-FCLCNN-LSTM model on
the FD001 training set was 41% lower than that of the CNN model, and the RMSE of the 1-FCLCNN-LSTM
model on the FD003 training set was 46% lower than that of the CNN model. FD003 has two fault modes
while FD001 has only one, which indicates that the multi-neural-network model has certain
advantages in dealing with complex fault problems.

The test sets of FD001 and FD003 were input into the trained CNN and 1-FCLCNN-LSTM models to obtain
the prediction results, which are shown in Figures 14 and 15, respectively.
(a) Prediction results of CNN model. (b) Prediction results of 1-FCLCNN-LSTM model.
Figure 15. RUL prediction results of FD003 with different models.
In this paper, the RMSE is used to express the effects of the FD001 and FD003 test sets, as shown
in Formula (12). See Table 6 for details.
Table 6. RMSE predicted by FD001 and FD003 under the two models.

Model          FD001   FD003
CNN            17.22   15.50
FCLCNN-LSTM    11.17   9.99
It can be seen from Table 6 that the training effect of a model directly affects its performance on
the test set. As shown in the table above, the RMSE of the 1-FCLCNN-LSTM model on the FD001 test
set is 35% lower than that of the CNN model, and the RMSE of the 1-FCLCNN-LSTM model on FD003 is
35.5% lower than that of the CNN model.
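The relative reductions quoted above and in Section 4.4 follow directly from the RMSE values in
Tables 5 and 6, and can be reproduced with a one-line helper (ours, for illustration):

```python
def reduction_pct(baseline, improved):
    # Relative RMSE reduction of the proposed model vs. the CNN baseline,
    # e.g. the FD001 test set: (17.22 - 11.17) / 17.22 * 100 ~ 35%.
    return 100.0 * (baseline - improved) / baseline
```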
In order to measure the prediction performance of the model more comprehensively,
this paper selects the latest advanced RUL prediction method, and compares the deviations
of various methods under the same data set. The evaluation indicators are RMSE and the
score function, both of which are as low as possible. The comparison results of FD001 data
set are shown in Table 7, and the comparison results of FD003 data set are shown in Table 8.
The comparison results with multiple models show that the model proposed in
this paper has the lowest score and RMSE values on both FD001 and FD003 data sets.
The RMSE of 1-FCLCNN-LSTM model on FD001 was 11.4–36.6% lower than that of RF,
DCNN, D-LSTM, and other traditional methods, and the RMSE of 1-FCLCNN-LSTM
model on FD003 was 37.5–78% lower than that of GB, SVM, LSTMBS, and other traditional
methods. The above results are attributed to the multi-neural network structure and
parallel processing of feature information in this model, which can effectively extract
RUL information. Compared with the current popular multi-model Autoencoder-BLSTM,
VAE-D2GAN, HDNN, and other methods, the RMSE of FD001 was decreased by 4–18%,
and the RMSE of FD003 was decreased by 18–37.5% compared with that on HDNN,
DCNN, Rulclipper, and other methods. The above results are attributed to the same multi-
model structure and multi-network structure, the 1-FCLCNN-LSTM model has advantages
in feature processing in the 1-FCLCNN path, and the fused data are processed by the 1D
full-convolutional layer to obtain more accurate prediction results. The score of 1-FCLCNN-
LSTM model in FD001 was 5% lower than the optimal LSTMBS in the previous model. The
score of 1-FCLCNN-LSTM model in FD003 was 17.6% lower than the optimal DNN in the
previous model. This indicates that the prediction accuracy of this model on the C-MAPSS data set
is improved, and no expert knowledge or physical knowledge is required, which can support the
predictive maintenance of turbofan engines as a research direction of mechanical equipment health
management.
5. Conclusions
This paper has presented a method for RUL prognosis based on spatiotemporal feature fusion, modeled
with the 1-FCLCNN and an LSTM network. From the current data sets, which are issued from location
and state sensors, the proposed method extracts spatiotemporal features and estimates the trend of
the future remaining useful life of the turbofan engine. In
addition, for different data sets and prognosis horizon, it is shown that the RUL prognosis
error is superior to other methods. Future research will improve the use of the approach for
online applications. The main challenge is to incrementally update the prognosis results.
The use of the approach for other real applications and large machinery systems will also
be considered.
Author Contributions: Conceptualization, C.P. and Z.T.; methodology, W.G. and C.P.; software, Y.C.;
validation, Y.C., Q.C.; formal analysis, C.P. and L.L.; investigation, C.P.; resources, W.G.; writing—
original draft preparation, C.P. and Y.C.; writing—review and editing, C.P.; visualization, Y.C.;
supervision, C.P.; project administration, C.P.; funding acquisition, Z.T. and W.G. All authors have
read and agreed to the published version of the manuscript.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61871432 and 61771492) and the Natural Science Foundation of Hunan Province (Nos. 2020JJ4275, 2019JJ6008, and 2019JJ60054).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Wei, J.; Bai, P.; Qin, D.T.; Lim, T.C.; Yang, P.W.; Zhang, H. Study on vibration characteristics of fan shaft of geared turbofan engine
with sudden imbalance caused by blade off. J. Vib. Acoust. 2018, 140, 1–14. [CrossRef]
2. Tuzcu, H.; Şöhret, Y.; Caliskan, H. Energy, environment and enviroeconomic analyses and assessments of the turbofan engine used
in aviation industry. Environ. Prog. Sustain. Energy 2020, 3, e13547. [CrossRef]
3. You, Y.Q.; Sun, J.B.; Ge, B.; Zhao, D.; Jiang, J. A data-driven M2 approach for evidential network structure learning. Knowl. Based
Syst. 2020, 187, 104800–104810. [CrossRef]
4. De Oliveira da Costa, P.R.; Akcay, A.; Zhang, Y.Q.; Kaymak, U. Attention and long short-term memory network for remaining
useful lifetime predictions of turbofan engine degradation. Int. J. Progn. Health Manag. 2020, 10, 34.
5. Ghorbani, S.; Salahshoor, K. Estimating remaining useful life of turbofan engine using data-level fusion and feature-level fusion.
J. Fail. Anal. Prev. 2020, 20, 323–332. [CrossRef]
6. Sun, H.; Guo, Y.Q.; Zhao, W.L. Fault detection for aircraft turbofan engine using a modified moving window KPCA. IEEE Access
2020, 8, 166541–166552. [CrossRef]
7. Ahmadzadeh, F.; Lundberg, J. Remaining useful life estimation: Review. Int. J. Syst. Assur. Eng. Manag. 2014, 5, 461–474.
[CrossRef]
8. Kok, C.; Jahmunah, V.; Shu, L.O.; Acharya, U.R. Automated prediction of sepsis using temporal convolutional network. Comput.
Biol. Med. 2020, 127, 103957. [CrossRef]
9. Cheong, K.H.; Poeschmann, S.; Lai, J.; Koh, J.M. Practical automated video analytics for crowd monitoring and counting. IEEE
Access 2019, 7, 183252–183261. [CrossRef]
10. Saravanakumar, R.; Krishnaraj, N.; Venkatraman, S.; Sivakumar, B.; Prasanna, S.; Shankar, K. Hierarchical symbolic analysis and
particle swarm optimization based fault diagnosis model for rotating machineries with deep neural networks. Measurement 2021,
171, 108771. [CrossRef]
11. Du, X.L.; Chen, Z.G.; Zhang, N.; Xu, X. Bearing fault diagnosis based on synchrosqueezed S-transform and deep learning.
Modul. Mach. Tool Autom. Process. Technol. 2019, 5, 90–93, 97.
12. Peng, C.; Tang, Z.H.; Gui, W.H.; He, J. A bidirectional weighted boundary distance algorithm for time series similarity computation
based on optimized sliding window size. J. Ind. Manag. Optim. 2019, 13, 1–16. [CrossRef]
13. Peng, C.; Tang, Z.H.; Gui, W.H.; Chen, Q.; Zhang, L.X.; Yuan, X.P.; Deng, X.J. Review of key technologies and progress in
industrial equipment health management. IEEE Access 2020, 8, 151764–151776. [CrossRef]
14. Yang, Y.K.; Fan, W.B.; Peng, D.X. Driving behavior recognition based on one-dimensional convolutional neural network and
noise reduction autoencoder. Comput. Appl. Softw. 2020, 37, 171–176.
15. Peng, C.; Chen, Q.; Zhou, X.H.; Tang, Z.H. Wind turbine blades icing failure prognosis based on balanced data and improved
entropy. Int. J. Sens. Netw. 2020, 34, 126–135. [CrossRef]
16. Peng, C.; Liu, M.; Yuan, X.P.; Zhang, L.X. A new method for abnormal behavior propagation in networked software. J. Internet
Technol. 2018, 19, 489–497.
17. Zhang, J.D.; Zou, Y.S.; Deng, J.L.; Zhang, X.L. Bearing remaining life prediction based on full convolutional layer neural networks.
China Mech. Eng. 2019, 30, 2231–2235.
18. Yang, B.Y.; Liu, R.N.; Zio, E. Remaining useful life prediction based on a double-convolutional neural network architecture. IEEE
Trans. Ind. Electron. 2019, 66, 9521–9530. [CrossRef]
19. Hsu, H.Y.; Srivastava, G.; Wu, H.T.; Chen, M.Y. Remaining useful life prediction based on state assessment using edge computing
on deep learning. Comput. Commun. 2020, 160, 91–100. [CrossRef]
20. Li, X.; Ding, Q.; Sun, J.Q. Remaining useful life estimation in prognostics using deep convolutional neural networks. Reliab. Eng.
Syst. Saf. 2018, 172, 1–11. [CrossRef]
21. Zhang, J.J.; Wang, P.; Yuan, R.Q.; Gao, R.X. Long short-term memory for machine remaining life prediction. J. Manuf. Syst. 2018,
48, 78–86. [CrossRef]
22. Kong, Z.; Cui, Y.; Xia, Z.; Lv, H. Convolution and long short-term memory hybrid deep neural networks for remaining useful life
prognostics. Appl. Sci. 2019, 9, 4156. [CrossRef]
23. Song, Y.; Xia, T.B.; Zheng, Y.; Zhuo, P.C.; Pan, E.S. Residual life prediction of turbofan Engines based on Autoencoder-BLSTM.
Comput. Integr. Manuf. Syst. 2019, 25, 1611–1619.
24. Yan, C.M.; Wang, W. Development and application of a convolutional neural network model. Comput. Sci. Explor. 2020, 18, 1–22.
25. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; pp. 326–366.
26. Estrach, J.B.; Szlam, A.; LeCun, Y. Signal recovery from pooling representations. In Proceedings of the 31st International
Conference on Machine Learning (ICML), Beijing, China, 21–26 June 2014; pp. 307–315.
27. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach.
Intell. 2017, 39, 640–651.
28. Zhang, Y.; Kong, W.; Dong, Z.Y.; Jia, Y.; Hill, D.J.; Xu, Y. Short-term residential load forecasting based on LSTM recurrent neural
network. IEEE Trans. Smart Grid 2019, 12, 312–325.
29. Al-Dulaimi, A.; Zabihi, S.; Asif, A.; Mohammadi, A. A multimodal and hybrid deep neural network model for Remaining Useful
Life estimation. Comput. Ind. 2019, 108, 186–196. [CrossRef]
30. Kingma, D.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning
Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
31. Famouri, M.; Azimifar, Z.; Taheri, M. Fast linear SVM validation based on early stopping in iterative learning. Int. J. Pattern
Recognit. Artif. Intell. 2015, 29, 1551013. [CrossRef]
32. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv 2015,
arXiv:1502.03167.
33. Ramasso, E.; Gouriveau, R. Prognostics in switching systems: Evidential Markovian classification of real-time neuro-fuzzy
predictions. In Proceedings of the Prognostics and Health Management Conference IEEE PHM, Portland, OR, USA, 10–16
October 2010.
34. Frederick, D.; de Castro, J.; Litt, J. User’s Guide for the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS);
NASA/ARL: Hanover, MD, USA, 2007.
35. Saxena, A.; Goebel, K.; Simon, D.; Eklund, N. Damage propagation modeling for aircraft engine run-to-failure simulation.
In Proceedings of the 1st International Conference on Prognostics and Health Management (PHM08), Denver, CO, USA, 6–9
October 2008.
36. Peel, L. Data driven prognostics using a Kalman filter ensemble of neural network models. In Proceedings of the 2008 International
Conference on Prognostics and Health Management IEEE, Denver, CO, USA, 6–9 October 2008.
37. Li, N.; Lei, Y.; Gebraeel, N.; Wang, Z.; Cai, X.; Xu, P.; Wang, B. Multi-sensor data-driven remaining useful life prediction of
semi-observable systems. IEEE Trans. Ind. Electron. 2020, 1. [CrossRef]
38. Gou, B.; Xu, Y.; Feng, X. State-of-health estimation and remaining-useful-life prediction for lithium-ion battery using a hybrid
data-driven method. IEEE Trans. Veh. Technol. 2020, 69, 10854–10867. [CrossRef]
39. Hsu, C.; Jiang, J. Remaining useful life estimation using long short-term memory deep learning. In Proceedings of the 2018 IEEE
International Conference on Applied System Innovation (ICASI), Tokyo, Japan, 13–17 April 2018; pp. 58–61.
40. Xu, S.; Hou, G.S. Prediction of remaining service life of turbofan engine based on VAE-D2GAN. Comput. Integr. Manuf. Syst. 2020,
23, 1–17.
41. Khelif, R.; Chebel-Morello, B.; Malinowski, S.; Laajili, E.; Fnaiech, F.; Zerhouni, N. Direct remaining useful life estimation based
on support vector regression. IEEE Trans. Ind. Electron. 2016, 64, 2276–2285. [CrossRef]
42. Liao, Y.; Zhang, L.; Liu, C. Uncertainty Prediction of Remaining Useful Life Using Long Short-Term Memory Network Based
on Bootstrap Method. In Proceedings of the IEEE International Conference on Prognostics and Health Management (ICPHM),
Seattle, WA, USA, 11–13 June 2018; pp. 1–8.
43. Zheng, S.; Ristovski, K.; Farahat, A.; Gupta, C. Long short-term memory network for remaining useful life estimation. In
Proceedings of the 2017 IEEE International Conference on Prognostics and Health Management (ICPHM), Dallas, TX, USA, 19–21
June 2017; pp. 88–95.
44. Zhang, C.; Lim, P.; Qin, A.; Tan, K. Multiobjective deep belief networks ensemble for remaining useful life estimation in
prognostics. IEEE Trans. Neural Netw. Learn Syst. 2017, 28, 2306–2318. [CrossRef] [PubMed]
45. Yu, W.N.; Kim, I.Y.; Mechefske, C. Remaining useful life estimation using a bidirectional recurrent neural network based
autoencoder scheme. Mech. Syst. Signal Process. 2019, 129, 764–780. [CrossRef]
46. Babu, G.S.; Zhao, P.; Li, X. Deep convolutional neural network based regression approach for estimation of remaining useful life.
In Proceedings of the International Conference on Database Systems for Advanced Applications, Dallas, TX, USA, 16–19 April
2016; pp. 214–228.