
2019 IEEE Symposium Series on Computational Intelligence (SSCI)

December 6-9 2019, Xiamen, China

Remaining Useful Life Prediction of Machining Tools by 1D-CNN LSTM Network
Jiahe Niu, Chongdang Liu, Linxuan Zhang, Yuan Liao
Department of Automation
Tsinghua University
Beijing 100084, P. R. China
Email:{njh17; liucd16; lxzhang}@mails.tsinghua.edu.cn; [email protected]

Abstract—In the field of machining, machining tool life (degree of wear) is a key factor affecting the quality of the machined workpiece. Over-protection strategies may increase production costs and cause unnecessary machining tool downtime. Therefore, if the remaining useful life (RUL) of the machining tool can be accurately predicted, the work schedule can be effectively optimized and the machining tool procurement cost reduced. In this paper, we propose a system schema that integrates programmable logic controller (PLC) signals with sensor signals for online RUL prediction of machining tools. The preprocessed sensor signals are segmented, and we propose the ensemble discrete wavelets transform (EDWT) to remove noise from the three-dimensional vibration signals and obtain time-frequency information. Statistical features are then extracted based on time-domain and frequency-domain analysis. Further, we use Spearman’s coefficient, autocorrelation and monotonicity indicators for feature selection to reduce the feature dimension. Finally, we use a 1D-CNN LSTM network architecture for RUL prediction of machining tools. The evaluation results show that our system schema is feasible for the industrial field and performs better than other common methods.

Keywords—machining tools; remaining useful life; 1D-CNN; LSTM.

I. INTRODUCTION
The failure of machining tools may result in an increase in surface roughness and a decrease in the dimensional accuracy of the workpiece. More seriously, workpieces may be scrapped or the computer numerical control (CNC) machine may be damaged. Therefore, predicting the remaining useful life (RUL) of the machining tool is a practical problem to be solved in the factory.

In recent years, studies on RUL prediction of machining tools can be divided into two types: model-based methods and data-driven methods. Model-based methods mainly use domain knowledge and a physical model of the system or component to perform calculations, so that the failure behavior of the machining tools can be quantitatively characterized. However, model-based methods are often difficult to apply because of the uncertainty of model parameters and the complexity of the failure mechanisms of machining tools in the cutting process [1]. For this reason, data-driven methods are receiving more and more attention.

Data-driven methods can be divided into statistics-based methods, traditional machine learning methods, and deep learning methods. Common statistics-based methods include Wiener processes [2], gamma processes [3], Markov models [4], etc. Common methods based on traditional machine learning mainly include support vector regression (SVR) [5], artificial neural networks (ANN) [6], extreme learning machines (ELM) and neuro-fuzzy systems [7]. However, given the large amount of data in the actual machining process, traditional machine learning algorithms sometimes struggle to extract the hidden information that characterizes the degradation process of the tool. In this respect, deep learning methods tend to perform better: they have powerful adaptive learning and anti-noise ability and can automatically extract deep features, which makes them more versatile than traditional machine learning methods. Common RUL prediction methods based on deep learning include recurrent neural networks (RNN) [8], Long Short-Term Memory networks (LSTM) [9], convolutional neural networks (CNN) [10], and deep belief networks (DBN) [11], etc.

Among the commonly used deep learning models, CNN holds a very important position in the field of image recognition. Owing to its capacity to extract features automatically, it is now also used in fault diagnosis and process monitoring [13]. Liang [14] used a one-dimensional convolutional neural network (1D-CNN) to extract deep features of high-speed train fault signals, achieving a classification accuracy of 96.4%. Ince et al. [10] also used a 1D-CNN on real motor data for real-time motor condition monitoring. LSTM, in turn, can effectively mine the hidden degradation trend in time series; Zheng et al. [9] proposed an LSTM approach for RUL estimation. We can therefore combine CNN's high-dimensional feature extraction capacity with LSTM’s advantage on time-series problems: after the CNN extracts features, we feed them into the LSTM for training, and some improvements in accuracy and speed can be achieved.

Based on the working condition information and sensor data collected by the programmable logic controller (PLC) and external sensors, this paper constructs a machining tool wear state evaluation and life prediction model to diagnose the wear state of the machining tool. First, data preprocessing is performed on the training dataset and the testing dataset, including denoising, outlier culling, and data structure defragmentation. The cleaned data is then decomposed by the ensemble discrete wavelets transform (EDWT) to obtain a time-frequency signal. Then, a series of statistical features such as mean, median, variance and energy are extracted from the time-frequency signal. After the statistical features are acquired, feature selection is performed using Spearman’s coefficient, autocorrelation and monotonicity indicators to obtain a feature set with better performance. Finally, we construct a 1D-CNN LSTM network to predict the remaining life of the machining tools.

978-1-7281-2485-8/19/$31.00 ©2019 IEEE

Fig. 1. Discrete wavelet transform.

II. FEATURE ENGINEERING
A. Sensor signal preprocessing based on EDWT
Discrete Wavelet Transform (DWT) [19] is a signal analysis algorithm with time-domain and frequency-domain analysis capabilities, obtained by discretizing the continuous wavelet transform. It is often used for feature mining of mechanical vibration signals. Each stage of the discrete wavelet transform involves two filters, as shown in “Fig. 1”: a low-pass filter $g[n]$ and a high-pass filter $h[n]$. During decomposition, the signal passes through $g[n]$ and $h[n]$ respectively and is then downsampled by a factor of 2 to obtain the coefficients $cA$ and $cD$ [15].
The meanings of the symbols are as follows:
$x[n]$: Input signal.
$g[n]$: Low-pass filter that filters out the high-frequency portion of the input signal and outputs the low-frequency portion.
$h[n]$: High-pass filter that, in contrast to the low-pass filter, filters out the low-frequency part and outputs the high-frequency part.
$\downarrow Q$: Downsampling filter. If $x[n]$ is used as input, the output is $y[n] = x[Qn]$. Here $Q$ is 2.

Due to the influence of the actual processing environment and the defects of the sensor devices, the obtained tool data may contain a variety of noises. With reference to the design of ensemble empirical mode decomposition (EEMD) [16], this paper proposes an improved DWT method, EDWT. During each iteration, different white noise is added to the original vibration signal, and DWT decomposition is then performed on the generated data. Finally, we calculate the average of all iteration results. The specific steps of signal preprocessing based on EDWT are shown in “Fig. 2”.

Fig. 2. Ensemble discrete wavelets transform.

B. Feature extraction
In this paper, classical features and trigonometric features are extracted from the time-frequency domain signal [1][15]. The extracted features can be divided into time-domain features and frequency-domain features. Mean value (MV), variance value (VV), median value (MDV), mean square error (MSE), square mean root (SMR), root mean square (RMS), maximum absolute value (MA), energy value and margin factor (MF) are features in the time domain. Root mean square frequency (RMSF) and variance frequency (VF) are features in the frequency domain. The trigonometric features include the standard deviation of the inverse hyperbolic sine (asinh) and the standard deviation of the inverse tangent (atan). The equations of the different features are shown in “Table I”.

C. Feature selection
For RUL prediction of machining tools, the number of extracted features is sizeable, and some of them are irrelevant or redundant. Feature selection is therefore very important for improving the prediction results. This paper selects features using the following three indicators, which are widely used in the field of prognostics and health management [1][15]:
The first indicator is Spearman’s coefficient [18].

$$R_s(x_i) = \frac{\left| \sum_{j=1}^{N} (t_j - \bar{t}) \left( x_i(t_j) - \bar{x}_i \right) \right|}{\sqrt{\sum_{j=1}^{N} (t_j - \bar{t})^2 \sum_{j=1}^{N} \left( x_i(t_j) - \bar{x}_i \right)^2}}$$

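Returning to the EDWT procedure of Section II-A (add white noise, decompose with DWT, average over all iterations), the following is a minimal sketch under stated assumptions rather than the authors' implementation. The paper does not name the wavelet, the number of iterations or the noise level, so the Haar filters, `n_iter`, `noise_std` and `seed` below are illustrative choices, and only one decomposition level is shown.

```python
import math
import random


def haar_dwt(signal):
    """One DWT level with Haar filters (assumed wavelet): filter, then keep every 2nd sample."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [s * (signal[i] + signal[i + 1]) for i in pairs]  # low-pass branch -> cA
    detail = [s * (signal[i] - signal[i + 1]) for i in pairs]  # high-pass branch -> cD
    return approx, detail


def edwt(signal, n_iter=50, noise_std=0.1, seed=0):
    """Average the DWT coefficients of several noise-perturbed copies of the signal."""
    rng = random.Random(seed)
    half = len(signal) // 2
    sum_a = [0.0] * half
    sum_d = [0.0] * half
    for _ in range(n_iter):
        # A fresh white-noise realization is added in every iteration.
        noisy = [x + rng.gauss(0.0, noise_std) for x in signal]
        c_a, c_d = haar_dwt(noisy)
        for k in range(half):
            sum_a[k] += c_a[k]
            sum_d[k] += c_d[k]
    return [a / n_iter for a in sum_a], [d / n_iter for d in sum_d]
```

With `noise_std=0.0` the function reduces to a plain single-level DWT, which gives a simple way to compare the two preprocessing variants on the same segment.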
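The first indicator of Section II-C (the coefficient the paper calls Spearman's) can be sketched as follows. The function name and arguments are hypothetical. The sketch follows the equation as printed, which correlates the raw feature values with time; a rank-based Spearman coefficient would first replace both sequences by their ranks.

```python
import math


def trend_correlation(times, values):
    """Absolute correlation between time stamps t_j and one feature series x_i(t_j)."""
    n = len(times)
    t_mean = sum(times) / n
    x_mean = sum(values) / n
    num = abs(sum((t - t_mean) * (x - x_mean) for t, x in zip(times, values)))
    den = math.sqrt(
        sum((t - t_mean) ** 2 for t in times)
        * sum((x - x_mean) ** 2 for x in values)
    )
    # A constant feature carries no trend information; score it as 0.
    return num / den if den > 0 else 0.0
```

A feature that rises or falls steadily over the tool life scores close to 1, while a feature that only oscillates scores much lower, which is why the indicator is useful for ranking degradation features.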
TABLE I. STATISTICS FEATURES TABLE
Feature Name    Equation
MV              $x_1 = \sum_{i=1}^{N} y(i) / N$
VV              $x_2 = \sum_{i=1}^{N} (y(i) - x_1)^2 / N$
MDV             $x_3 = \mathrm{median}(y(i))$
MSE             $x_4 = \sqrt{\sum_{i=1}^{N} (y(i) - x_1)^2 / N}$
SMR             $x_5 = \left( \sum_{i=1}^{N} \sqrt{|y(i)|} / N \right)^2$
RMS             $x_6 = \sqrt{\sum_{i=1}^{N} y(i)^2 / N}$
MA              $x_7 = \max |y(i)|$
Energy          $x_8 = \sum_{i=1}^{N} y(i)^2$
Kurtosis        $x_9 = \sum_{i=1}^{N} (y(i) - x_1)^4 / ((N - 1) \sigma^4)$
Entropy         $x_{10} = -\sum_{k} p(y_k) \log_b p(y_k)$
CF              $x_{11} = x_7 / x_6$
MF              $x_{12} = x_7 / x_5$
RMSF            $x_{13} = \sqrt{\sum_{i=2}^{N} \dot{y}^2(i) / \left( 4\pi^2 \sum_{i=1}^{N} y^2(i) \right)}$
VF              $x_{14} = \sum_{i=2}^{N} \dot{y}^2(i) / \left( 4\pi^2 \sum_{i=1}^{N} y^2(i) \right) - \left( \sum_{i=2}^{N} \dot{y}(i) y(i) / \left( 2\pi \sum_{i=1}^{N} y^2(i) \right) \right)^2$
SD of asinh     $x_{15} = \sigma(\log[y_i + (y_i^2 + 1)^{1/2}])$
SD of atan      $x_{16} = \sigma((i/2) \log((i + y_j) / (i - y_j)))$

The second indicator is Autocorrelation [1].

$$R_A(x_i) = \sum_{j=1}^{N} \left( x_i(t_j) - x_i(t_{j-1}) \right)$$

The third indicator is Monotonicity [1].

$$R_M(x_i) = \frac{1}{N-1} \left| \sum_{j=1}^{N} \delta\left( x_i(t_{j+1}) - x_i(t_j) \right) - \sum_{j=1}^{N} \delta\left( x_i(t_j) - x_i(t_{j+1}) \right) \right|$$

In these equations, $x_i$ is the $i$-th extracted statistical feature. Finally, this paper selects the top 30 attributes of each indicator and combines the results of the three indicators into a collection of high-quality features.

III. 1D-CNN LSTM NETWORK FOR RUL ESTIMATION
A. 1D-Convolutional and pooling
The data collected during the actual machining of the tool can be represented as a two-dimensional matrix with a time axis and a sensor variable axis. For time-series problems, a one-dimensional convolutional neural network (1D-CNN) is more suitable than a common convolutional neural network. One characteristic of the 1D-CNN is that, for time-series data, the receptive field moves only in the direction of time, so the local inter-variable correlation can be extracted [13]. “Fig. 3” shows the architecture of the 1D-CNN. Each convolutional layer consists of several convolutional units whose parameters are optimized by the backpropagation algorithm.

Fig. 3. One-dimensional convolutional neural network.

The process of the convolution layer can be represented as [14]:

$$x_j^l = f\left( \sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l \right)$$

where $x_i^{l-1}$ is the input, $k_{ij}^l$ are the kernel weights, $b_j^l$ is the bias, $f(\cdot)$ is the activation function, and $x_j^l$ is the output of the $j$-th kernel in the $l$-th convolutional layer.

The feature map obtained by the convolution layer is then passed through a max pooling operation, which effectively reduces the amount of data and increases the calculation speed. The max pooling function can be represented as:

$$\hat{x}_j^l = f\left( \beta_j^l \, \mathrm{down}(x_j^l) + c_j^l \right)$$

where $x_j^l$ is the input from the convolution layer, $\beta_j^l$ is the weight matrix, $\mathrm{down}(\cdot)$ is the downsampling function, $c_j^l$ is the bias, and $\hat{x}_j^l$ is the output of the $j$-th kernel in the $l$-th pooling layer.
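As a rough illustration of the convolution and pooling steps above, the sketch below applies one kernel to one input channel and then max-pools the result. It is a simplification under stated assumptions, not the paper's network: ReLU is assumed for the activation $f$, the pooling uses $\beta = 1$ and $c = 0$, and the averaging kernel and toy input are made up. The sizes (window 200, kernel 32, pool 40) are the values quoted in Section III-C.

```python
def conv1d_valid(x, kernel, bias=0.0):
    """'Valid' 1D convolution along time (cross-correlation form) followed by ReLU."""
    k = len(kernel)
    out = []
    for start in range(len(x) - k + 1):
        s = sum(x[start + j] * kernel[j] for j in range(k)) + bias
        out.append(max(0.0, s))  # ReLU stands in for the activation f (assumption)
    return out


def max_pool1d(x, pool):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + pool]) for i in range(0, len(x) - pool + 1, pool)]


window = [float(i % 7) for i in range(200)]           # toy time window of length 200
feature_map = conv1d_valid(window, [1.0 / 32] * 32)   # kernel size 32
pooled = max_pool1d(feature_map, 40)                  # pooling size 40
print(len(feature_map), len(pooled))  # 169 4
```

The printed lengths match the time dimension of the output shapes (None, 169, 32) and (None, 4, 32) reported for the convolution and pooling layers, since 200 - 32 + 1 = 169 and 169 // 40 = 4.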

B. LSTM
Long short term memory network (LSTM) [20] is a special type of recurrent neural network (RNN) structure. In a traditional RNN, each unit is a simple chain structure that processes the input sequence $\{x_1, x_2, \ldots, x_T\}$ sequentially to construct a corresponding sequence of hidden states $\{h_1, h_2, \ldots, h_T\}$. In LSTM, a memory cell $c_t$ is introduced in addition to the hidden state $h_t$ at timestep $t$ [17].

Fig. 5. Long short term memory network cell.

As shown in “Fig. 5”, $x_t$, $h_t$, and $c_t$ represent the input, output, and status information of $cell_t$. They are computed via three gate functions: the forget gate, the input gate, and the output gate. The forget gate function $f_t$ discards information from the state of the previous memory cell $cell_{t-1}$, the input gate controls the information that participates in the calculation of the memory cell $cell_t$, and the output gate determines the output information of the memory cell $cell_t$. Owing to the gate structure, the LSTM has the ability to “memorize” and thus exhibits better performance in processing some time-series data. We use $f$, $i$ and $o$ to represent the forget gate function, the input gate function and the output gate function respectively. The subscripts of the parameters $W$ and $b$ indicate which gate they correspond to. With this notation, an LSTM is formally defined as:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i)$$

$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f)$$

$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o)$$

$$g_t = \tanh(W_{xg} x_t + W_{hg} h_{t-1} + b_g)$$

$$c_t = f_t \odot c_{t-1} + i_t \odot g_t$$

$$h_t = o_t \odot \tanh(c_t)$$

where $\sigma(\cdot)$ is the sigmoid function and $\odot$ is element-wise multiplication.

C. 1D-convolutional LSTM Network architecture
The architecture of the 1D-CNN LSTM network consists of 6 layers, as shown in “Fig. 4”: 1 convolutional layer, 1 max-pooling layer, 2 LSTM layers, 1 fully connected layer and 1 activation layer. First, after feature extraction, feature selection, and time window division, the signals are passed into the one-dimensional convolution layer. The number of filters and the kernel size of the one-dimensional convolution layer are both 32. The size of the time window taken in the data preprocessing process is 200, so after the convolution layer the output shape is (None, 169, 32). The second layer is the one-dimensional max pooling layer. The pooling size of this layer is 40, and its output shape is (None, 4, 32). Layers 3 and 4 are LSTM layers with 64 neurons and 16 neurons, and the output shape after these two layers is (None, 16). Layer 5 is a fully connected layer with dropout that consists of 1 neuron. The main purpose of the dropout is to reduce over-fitting, and a sigmoid activation function is used at the end of the fully connected layer for RUL prediction.

Fig. 4. 1D-convolutional LSTM Network architecture.

IV. EXPERIMENT
The proposed RUL prediction system is tested using CNC machining tool data collected from an actual machining process, including data from the PLC and the vibration sensor. We compare the results of the 1D-CNN, LSTM and 1D-CNN LSTM networks after DWT transformation and after EDWT transformation.

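As a brief aside on the gate equations of Section III-B, the following sketch evaluates one LSTM timestep for the scalar case (one input feature, one hidden unit), so that every weight is a plain float. The dictionary `p` and its key names are hypothetical conveniences that mirror the subscripts in the equations; a real layer would use weight matrices and vectors.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; p maps names such as 'W_xi' or 'b_f' to floats."""
    i_t = sigmoid(p["W_xi"] * x_t + p["W_hi"] * h_prev + p["b_i"])    # input gate
    f_t = sigmoid(p["W_xf"] * x_t + p["W_hf"] * h_prev + p["b_f"])    # forget gate
    o_t = sigmoid(p["W_xo"] * x_t + p["W_ho"] * h_prev + p["b_o"])    # output gate
    g_t = math.tanh(p["W_xg"] * x_t + p["W_hg"] * h_prev + p["b_g"])  # candidate state
    c_t = f_t * c_prev + i_t * g_t   # keep part of the old state, add new information
    h_t = o_t * math.tanh(c_t)       # exposed hidden state
    return h_t, c_t
```

Calling `lstm_step` in a loop over a feature sequence, carrying `h_t` and `c_t` forward, reproduces the sequential processing described for the recurrent layers.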
A. Data set description
The dataset comes from the 2nd Industrial Big Data Innovation Competition organized by the China Academy of Information and Communications Technology. Using the PLC and an external sensor in a cyber-physical system, the working condition information and sensor data during the machining process are collected to achieve online monitoring of tool wear and remaining useful life prediction. The PLC data is the complete processing history data, including recording time, spindle load, X-axis vibration, Y-axis vibration, Z-axis vibration and other information. Since the amount of raw signal data from the sensor is extremely large, one minute of data is taken every 5 minutes to form one csv file. The sampling frequency of the PLC signal is 33 Hz, and the sampling frequency of the vibration sensor is 25600 Hz.

The data set contains a total of 7 sets of CNC machining data from the actual CNC machining process, each covering the period from the start of a new tool until the end of the tool life. Three of them are training sets, and the remaining four are testing sets.

The installation location of the vibration sensor is shown in “Fig. 6”, where the angle of view is in the direction of the front of the machine. The fields included in the PLC and sensor data are shown in “Table II”.

Fig. 6. The installation location of the vibration sensor.

TABLE II. FEATURE FIELD DESCRIPTION
Data Sources    Feature         Description
PLC data        time            Recording time
PLC data        spindle_load    Spindle load
PLC data        x               X-axis mechanical coordinates
PLC data        y               Y-axis mechanical coordinates
PLC data        z               Z-axis mechanical coordinates
PLC data        csv_no          Corresponding sensor_file
Sensor data     vibration_1     X-axis vibration signal
Sensor data     vibration_2     Y-axis vibration signal
Sensor data     vibration_3     Z-axis vibration signal
Sensor data     current         First phase current signal

B. Data preprocessing
The sampling frequency of the PLC signal is 33 Hz, and the sampling frequency of the vibration sensor is 25.6 kHz. Therefore, it is necessary to first align them in the time dimension. This article chooses to align them evenly. In addition, some abnormal values exist in the original data. For example, the current contains abnormal maximum and minimum values (with absolute values even greater than $10^{10}$), and these values may affect the extraction of statistics. This article therefore discards any row containing such outliers, where values whose absolute value exceeds $10^2$ are treated as outliers.

From the data provided by the PLC, the spindle load change of the tool during the actual machining process can be obtained. As shown in “Fig. 7”, the load intermittently stays at a low value. The data in these intervals probably does not correspond to the normal processing state, so it needs to be eliminated. The solution adopted in this paper is to calculate the absolute value of the difference of the spindle load variable in the time dimension, as shown in “Fig. 8”, and then add this value to the spindle load value. If the sum is less than the threshold value 5, the tool is considered not to be working in the normal processing state at that time. “Fig. 9” shows the result of culling the data by this threshold; the data of the abnormal working state is substantially eliminated.

Fig. 7. Spindle load variable.

Fig. 8. Difference of the spindle load variable.

Fig. 9. Spindle load in normal processing state.

Fig. 10. The label of remaining life ratio.

The processed data is then evenly divided into 600 parts (each corresponding to about 0.1 seconds), and the time-frequency features are extracted after performing the EDWT transformation on each data segment. Finally, the size of the
feature matrix extracted for each tool is 600*k*n, where k is the number of .csv files and n is the number of extracted features.

Since the data in each csv file is a 1-minute fragment taken every 5 minutes, when the training set is labeled with RUL, as shown in “Fig. 10”, the expected value over the 5-minute interval is used as the label of the entire csv file. For example, the RUL value corresponding to the last csv file of each tool in the training set should be 2.5 minutes. The complete life cycles of the three tools in the training set are 240 min, 240 min and 180 min, so even if they have the same remaining life of 100 min, their tool wear states are not consistent. Therefore, the concept of remaining life ratio is introduced here. We normalize the remaining useful life of each tool, and rescale it afterwards to obtain the true remaining useful life of the tool.

Fig. 11. Results of the first machining tool in the test data set using different methods.

C. Results
The remaining life ratios of the machining tool at the beginning and at the end of the prediction can be obtained from the 1D-CNN LSTM network. The remaining life of the test set can then be calculated by the following formula:

$$RUL = \frac{5 \cdot R_e \cdot (n_{csv} - 1)}{R_b - R_e}$$

where $R_e$ is the remaining life ratio at the predicted end time, $R_b$ is the remaining life ratio at the predicted begin time, and $n_{csv}$ is the number of csv files.

The scoring function given by the organizer is:

$$A_i = \begin{cases} \exp\left( -\ln(0.5) \cdot \frac{Er_i}{5} \right), & \text{if } Er_i \le 0 \\ \exp\left( +\ln(0.5) \cdot \frac{Er_i}{20} \right), & \text{if } Er_i > 0 \end{cases}$$

$$Er_i = true_i - pred_i$$

where $i$ indexes the $i$-th tool of the test, $true_i$ is the true value for the $i$-th tool, and $pred_i$ is the predicted value for the $i$-th tool. The curve of the score function is shown in “Fig. 12”.

Fig. 12. Score function.

For the comparative experiments, we compare the results of the 1D-CNN, LSTM and 1D-CNN LSTM networks after DWT transformation with those of the same three networks after EDWT transformation. The model parameters of the CNN and LSTM are searched manually to get better results. Since the scoring function is sensitive to the prediction results and the network output has some volatility, we need multiple experimental results to evaluate the predictive ability of the different methods. This paper conducts training and testing 20 times for each scheme. “Fig. 11” shows the mean of the 20 predictions of each method for the first machining tool in the test dataset. The x-axis represents the line number of the test data, the y-axis represents the predicted RUL in minutes, and the black line is the true value.

We use two evaluation indicators in our experiment: the score from the organizer and the mean square error (MSE). For each indicator, we calculate the mean and standard deviation of the results over the 20 experiments. The results are shown in detail
in “Table III” as “MSE mean”, “Score mean”, etc. When the score is used as the evaluation indicator, the 1D-CNN LSTM model using EDWT has the highest average score, and its standard deviation is very close to the lowest value, indicating that EDWT can effectively improve the accuracy while ensuring the stability of the prediction. When the mean square error is used as the evaluation indicator, the 1D-CNN LSTM model using EDWT outperforms all other methods. Regardless of which feature extraction method is used, the network structure using 1D-CNN LSTM performs better than the 1D-CNN model or the LSTM model alone. These results are shown in “Fig. 13” as box plots.

TABLE III. 20 TIMES TRAINING-TESTING RESULTS
Preprocess    Model       Score mean    Score std    MSE mean    MSE std
EDWT          CNN-LSTM
EDWT          CNN
EDWT          LSTM
DWT           CNN-LSTM
DWT           CNN
DWT           LSTM

Fig. 13. Boxplot of different methods.

V. CONCLUSIONS
In this paper, we propose a system schema that integrates PLC signals with sensor signals for online RUL prediction of machining tools. The preprocessed sensor signals are segmented, and we use EDWT to remove noise from the three-dimensional vibration signals. Statistical features are then extracted based on time-domain and frequency-domain analysis. Further, we use Spearman’s coefficient, autocorrelation and monotonicity indicators for feature selection to reduce the feature dimension. Finally, we propose a 1D-CNN LSTM network architecture for RUL prediction of machining tools. The evaluation results show that our system schema is feasible for the industrial field and performs better than other methods.

In the future, the proposed method can incorporate more analysis of the PLC signals, so that better results may be obtained by analyzing the working conditions. At the same time, the practicality of the system can be improved by automatic parameter optimization.

ACKNOWLEDGMENT
The authors would like to thank the China Academy of Information and Communications Technology, Foxconn, and CyberInsight for providing the raw data of CNC cutting tools. Moreover, the assistance provided by Dr. Zhao was greatly appreciated.

REFERENCES
[1] J. Wu, Y. Su, Y. Cheng, X. Shao, C. Deng, and C. Liu, “Multi-sensor information fusion for remaining useful life prediction of machining tools by adaptive network based fuzzy inference system,” Applied Soft Computing, vol. 68, pp. 13-23, 2018.
[2] Q. Zhai and Z. Ye, “RUL Prediction of Deteriorating Products Using an Adaptive Wiener Process Model,” IEEE Transactions on Industrial Informatics, vol. 13, no. 6, pp. 2911-2921, 2017.
[3] Q. Wei and D. Xu, “Remaining useful life estimation based on gamma process considered with measurement error,” in 2014 10th International Conference on Reliability, Maintainability and Safety (ICRMS), 2014, pp. 645-649.
[4] T. T. Le, C. Berenguer, and F. Chatelain, “Multi-branch Hidden semi-Markov modeling for RUL prognosis,” in 2015 Annual Reliability and Maintainability Symposium (RAMS), 2015, pp. 1-6.
[5] Y. Guo, “MKLS-SVR based remaining useful life prediction for avionics,” in 2015 12th IEEE International Conference on Electronic Measurement & Instruments (ICEMI), 2015, pp. 223-227.
[6] P. Lall, S. Deshpande, and L. Nguyen, “ANN based RUL assessment for copper-aluminum wirebonds subjected to harsh environments,” in 2016 IEEE International Conference on Prognostics and Health Management (ICPHM), 2016, pp. 1-10.
[7] X. Li, “Remaining Useful Life Prediction of Bearings Using Fuzzy Multimodal Extreme Learning Regression,” in 2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), 2017, pp. 499-503.
[8] Ü. Şentürk, I. Yücedağ, and K. Polat, “Repetitive neural network (RNN) based blood pressure estimation using PPG and ECG signals,” in 2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), 2018, pp. 1-4.
[9] S. Zheng, K. Ristovski, A. Farahat, and C. Gupta, “Long Short-Term Memory Network for Remaining Useful Life estimation,” in 2017 IEEE International Conference on Prognostics and Health Management (ICPHM), 2017, pp. 88-95.
[10] T. Ince, S. Kiranyaz, L. Eren, M. Askar, and M. Gabbouj, “Real-Time Motor Fault Detection by 1-D Convolutional Neural Networks,” IEEE Transactions on Industrial Electronics, vol. 63, no. 11, pp. 7067-7075, 2016.
[11] D. A. Tobon-Mejia, K. Medjaher, and N. Zerhouni, “CNC machine tool's wear diagnostic and prognostic by using dynamic Bayesian networks,” Mechanical Systems and Signal Processing, vol. 28, pp. 167-182, 2012.
[12] F. Pacheco, M. Cerrada, R.-V. Sanchez, D. Cabrera, C. Li, and J. V. de Oliveira, “Attribute clustering using rough set theory for feature selection in fault severity classification of rotating machinery,” Expert Systems with Applications, vol. 71, pp. 69-86, 2017.
[13] K. B. Lee, S. Cheon, and C. O. Kim, “A Convolutional Neural Network for Fault Classification and Diagnosis in Semiconductor Manufacturing Processes,” IEEE Transactions on Semiconductor Manufacturing, vol. 30, no. 2, pp. 135-142, 2017.
[14] K. Liang, “1D Convolutional Neural Networks For Fault Diagnosis of High-speed Train Bogie,” in 2018 IEEE 23rd International Conference on Digital Signal Processing, 2018, pp. 1-5.
[15] K. Javed, R. Gouriveau, N. Zerhouni, and P. Nectoux, “Enabling Health Monitoring Approach Based on Vibration Data for Accurate Prognostics,” IEEE Transactions on Industrial Electronics, vol. 62, no. 1, pp. 647-656, 2015.
[16] X. Chen, G. Cai, H. Cao, and W. Xin, “Condition assessment for automatic tool changer based on sparsity-enabled signal decomposition method,” Mechatronics, vol. 31, pp. 50-59, 2015.
[17] Z. Li, H. Di, F. Tian, W. Chen, Q. Tao, L. Wang, and T. Liu, “Towards Binary-Valued Gates for Robust LSTM Training,” arXiv preprint arXiv, 2018.
[18] X. Liu, P. Song, C. Yang, C. Hao, and W. Peng, “Prognostics and Health Management of Bearings Based on Logarithmic Linear Recursive Least-Squares and Recursive Maximum Likelihood Estimation,” IEEE Transactions on Industrial Electronics, vol. 65, no. 2, pp. 1549-1558, 2018.
[19] M. J. Shensa, “The discrete wavelet transform: wedding the a trous and Mallat algorithms,” IEEE Transactions on Signal Processing, vol. 40, no. 10, pp. 2464-2482, 1992.
[20] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
