A Transfer Convolutional Neural Network For Fault Diagnosis Based On Resnet-50
https://doi.org/10.1007/s00521-019-04097-w
ORIGINAL ARTICLE
Received: 11 October 2018 / Accepted: 12 February 2019 / Published online: 26 February 2019
Springer-Verlag London Ltd., part of Springer Nature 2019
Abstract
With the rapid development of smart manufacturing, data-driven fault diagnosis has attracted increasing attention. As one
of the most popular methods applied in fault diagnosis, deep learning (DL) has achieved remarkable results. However,
because the volume of labeled samples in fault diagnosis is small, DL models for fault diagnosis are shallow compared
with convolutional neural networks in other areas (including ImageNet), which limits their final prediction accuracies.
In this research, a new TCNN(ResNet-50) with a depth of 51 layers is proposed for fault diagnosis. By combining with
transfer learning, TCNN(ResNet-50) applies ResNet-50 trained on ImageNet as the feature extractor for fault diagnosis.
Firstly, a signal-to-image method is developed to convert time-domain fault signals to the RGB image format, which is
the input datatype of ResNet-50. Then, a new structure of TCNN(ResNet-50) is proposed. Finally, the proposed
TCNN(ResNet-50) is tested on three datasets: the bearing damage dataset provided by the KAT datacenter, the motor
bearing dataset provided by Case Western Reserve University (CWRU) and a self-priming centrifugal pump dataset. It
achieved state-of-the-art results. The prediction accuracies of TCNN(ResNet-50) are as high as 98.95% ± 0.0074,
99.99% ± 0 and 99.20% ± 0, which demonstrates that TCNN(ResNet-50) outperforms other DL models and traditional
methods.
6112 Neural Computing and Applications (2020) 32:6111–6124
and these features have great impacts on the final results of machine learning methods [12].

Deep learning (DL) has emerged as a new area in the machine learning field to overcome the above drawback. It can learn the representation features of raw data automatically [13]. DL methods, such as deep belief network (DBN), sparse autoencoder (SAE) and convolutional neural network (CNN), have been applied widely in the fault diagnosis field [14]. Since DL methods can reduce the effects of the handcrafted features designed by feature extraction processes, they show great potential for fault diagnosis.

However, because the volume of labeled samples in fault diagnosis is small, DL models for fault diagnosis rarely exceed 5 hidden layers [14], which limits their final prediction accuracies. Compared with the benchmark CNN models for ImageNet with hundreds of layers, the structures of DL methods for fault diagnosis are very shallow. What is more, it is hard to train deep CNN models without a large, well-organized training dataset such as ImageNet, which has ten million annotated images. To deal with this challenge, some researchers trained a deep CNN model on ImageNet and then, by combining with the transfer learning technique [16], applied the trained CNN model as the feature extractor on small datasets in other domains [15]. These approaches have achieved good results.

Motivated by this, this research proposes a new Transfer CNN (TCNN) using ResNet-50 as the feature extractor, named TCNN(ResNet-50), for fault diagnosis. The proposed TCNN(ResNet-50) has a depth of 51 layers. Since ResNet-50 has a very good performance on image classification [17], it can extract high-quality features of images on ImageNet. Our hypothesis is that the feature extraction layers of ResNet-50 will also perform well on fault diagnosis. With deeper network layers and better feature extraction layers, the proposed TCNN(ResNet-50) would improve the final prediction accuracies on fault diagnosis. The proposed TCNN(ResNet-50) is tested on three well-known datasets and achieves significant results compared with other DL models and traditional methods. What is more, the TCNN structure is also built on other well-known benchmark CNNs, including VGG-16, VGG-19 and Inception-V3. The results show that TCNN(ResNet-50) outperforms these three TCNN variants.

The rest of this paper is organized as follows. Section 2 discusses the related works. Section 3 presents the methodologies of the proposed TCNN(ResNet-50) model. Section 4 presents the case studies. The conclusion and future research are presented in Section 5.

2 Related work

The related work covers deep learning-based data-driven fault diagnosis, and feature transferring by using CNN networks.

2.1 Deep learning-based data-driven fault diagnosis

With the rapid development of smart manufacturing, data-driven fault diagnosis has become a hot research topic [11, 18].

Deep learning has been widely applied in the fault diagnosis field. Wen et al. [19] investigated a sparse autoencoder (SAE)-based deep transfer learning method for fault diagnosis and achieved an accuracy as high as 99.82%. Cho et al. [20] studied the fault detection of induction motors using recurrent neural networks and dynamic Bayesian modeling. They conducted real-time experiments with three motors, estimated the probability distributions for the motors' states and observed random residues online. Sun et al. [21] investigated a stacked sparse autoencoder for intelligent fault diagnosis, and the results showed that it could generate more accurate fault classifications than the commonly used deep neural network. Verma et al. [22] compared the performance of SAE with softmax regression, a fast classifier based on Mahalanobis distance and SVM in fault diagnosis of air compressors. Jia et al. [23] proposed a normalized SAE with local connection network (NSAE-LCN) for intelligent fault diagnosis of machines, and NSAE-LCN outperformed the commonly used diagnosis network. Qi et al. [24] studied the stacked SAE-based fault diagnosis method with ensemble empirical mode decomposition, and the results demonstrated that it could extract more discriminative high-level features and had a better performance. Gan et al. [25] applied a DBN-based hierarchical diagnosis network (HDN) to the fault diagnosis of rolling element bearings. The results showed that HDN was highly reliable for precise multi-stage diagnosis. Han et al. [26] studied a deep transfer network (DTN) for fault diagnosis, and achieved many state-of-the-art transfer results in terms of diverse operating conditions, fault severities and fault types.

As one of the most successful DL methods, CNN methods have also been widely studied for fault diagnosis [27–29]. Wang et al. [30] investigated an adaptive deep CNN model. The main parameters were determined by particle swarm optimization, and the results validated that the proposed method was more effective and robust than other intelligent methods. Guo et al. [31] investigated a hierarchical adaptive CNN by adding an adaptive learning rate and a momentum component to the weight updating process. Lu et al. [32] studied a hierarchical CNN to the
intelligent fault diagnosis of rolling bearing and the results delineated the effectiveness of the CNN model for fault classification of rolling bearings. Xie et al. [33] investigated the CNN with empirical mode decomposition for fault diagnosis. Experiments with vibration data of 52 different categories under different machine conditions were conducted, and the results indicated the proposed method was more accurate and reliable than previous approaches. Xia et al. [34] studied the CNN with multiple sensors for fault diagnosis. Compared with traditional approaches using manual feature extraction, the results showed the CNN-based method achieved superior diagnosis performance.

However, as stated by Zhao et al. [14], the DL models developed for fault diagnosis are relatively shallow, which limits their final prediction accuracies. In this research, a new TCNN(ResNet-50) model is developed, which is 51 layers deep and can improve the final prediction accuracy for fault diagnosis.

2.2 Feature transferring by using CNN network

classification, and achieved the state-of-the-art performance. Janssens et al. [38] investigated transfer learning to detect various machine conditions using thermal infrared imaging data and achieved 95% and 91.67% accuracy for the respective use cases. Shao et al. [39] proposed a new deep transfer learning method that uses VGG16 and freezes the first three convolution blocks as the feature extractor. The proposed method was tested on induction motor, gearbox and bearing datasets, and achieved state-of-the-art results.

Applications of the feature transferring method in fault diagnosis are still few. In this research, the TCNN(ResNet-50) is designed by using ResNet-50 trained on ImageNet as the feature extractor for fault diagnosis. Our hypothesis is that the feature extraction layers of ResNet-50 will also perform well in the fault diagnosis field. The TCNN structure is also built on VGG-16, VGG-19 and Inception-V3, and compared with TCNN(ResNet-50). The trained ResNet-50, VGG-16, VGG-19 and Inception-V3 networks can be found on the Keras Applications website: https://keras.io/applications/.
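As an illustration of how such pre-trained networks might be loaded as fixed feature extractors, the following is a minimal sketch using the `tf.keras.applications` constructors. The helper name `load_feature_extractor`, the dictionary `BACKBONES`, the 224 × 224 × 3 input shape and the use of global average pooling are assumptions made for this example, not details taken from the paper's implementation.

```python
import tensorflow as tf

# Constructors published in Keras Applications for the four networks named above.
BACKBONES = {
    "ResNet50": tf.keras.applications.ResNet50,
    "VGG16": tf.keras.applications.VGG16,
    "VGG19": tf.keras.applications.VGG19,
    "InceptionV3": tf.keras.applications.InceptionV3,
}


def load_feature_extractor(name="ResNet50", weights="imagenet"):
    """Return a frozen backbone that maps RGB images to a feature vector.

    Hypothetical helper: weights="imagenet" downloads the pre-trained weights,
    weights=None builds the same architecture with random weights.
    """
    base = BACKBONES[name](
        include_top=False,          # drop the ImageNet classification head
        weights=weights,
        input_shape=(224, 224, 3),  # assumed input size
        pooling="avg",              # assumed: global average pooling to a vector
    )
    base.trainable = False          # keep the transferred layers fixed
    return base
```

With these settings the ResNet-50 backbone yields a 2048-dimensional feature vector per image. Each Keras Applications model also ships its own `preprocess_input` function; whether and how the original work applied it is not stated here.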
new fault signal-to-image method is developed based on the signal-to-gray method proposed by Chong [40].

In the signal-to-gray method, the collected time-domain signals are segmented to generate the signal samples for different fault types. Let N denote the total number of samples and m × m denote the size of the gray image. L_i(a), i = 1, …, N, a = 1, …, m², denotes the strength value of the signal samples. GP(j,k), j = 1, …, m, k = 1, …, m, denotes the pixel strength of the gray image. Then, the signal-to-gray method can be formulated by Eq. (1).

$$GP(j,k) = \frac{L((j-1)\times m + k) - \mathrm{Min}(L)}{\mathrm{Max}(L) - \mathrm{Min}(L)} \times 255 \tag{1}$$

Different from a 2D gray image, an RGB image is in 3D matrix format. Let RGBPixel(j,k,p), p = 1, 2, 3, denote the RGB 3D matrix. The third dimension holds the red (p = 1), green (p = 2) and blue (p = 3) channels.

The fault signal-to-image method is shown in Fig. 1. After segmenting to generate the fault signal samples, the conversion method can be formulated by Eqs. (2)–(4). Equation (2) converts the time-domain fault signal samples L to a base matrix BM. Then, BM is normalized by the maximum and minimum values over all samples, as shown in Eq. (3). Finally, RGBPixel is obtained by Eq. (4). The red, green and blue elements of RGBPixel are identical, and their pixel values are scaled to 0–255.

$$BM_i(j,k) = L_i((j-1)\times m + k) \tag{2}$$

$$NM_i(j,k) = \frac{BM_i(j,k) - \mathrm{Min}_{i,j,k}(BM_i(j,k))}{\mathrm{Max}_{i,j,k}(BM_i(j,k)) - \mathrm{Min}_{i,j,k}(BM_i(j,k))} \tag{3}$$

$$RGBPixel_i(j,k,p) = NM_i(j,k) \times 255, \quad p = 1, 2, 3 \tag{4}$$

This conversion method contains only one hyperparameter, which is the image size m. Moreover, m will be analyzed and determined by cross-validation, which reduces the effect of experts' bias on fault diagnosis as much as possible.

3.2 TCNN(ResNet-50) structures

Because the volume of labeled samples in fault diagnosis is relatively small compared with the ten million annotated images in ImageNet, it is hard to train very deep CNN models for fault diagnosis, which limits the prediction accuracies of CNN models for fault diagnosis. However, by combining with the transfer learning technique [16], deep CNN models trained on ImageNet can also perform well on small datasets in other domains [15], including the fault diagnosis field. In this research, we transfer the ResNet-50 trained on ImageNet to the fault diagnosis field. ResNet-50 can easily gain accuracy from greatly increased depth. Since ResNet-50 has a very good performance on image classification and can extract high-quality features of images, our hypothesis is that the feature extraction layers of ResNet-50 will also perform well in the fault diagnosis field.

3.2.1 Residual building block

The residual building block (RBB) is the most vital element in ResNet-50. RBB is based on the idea of skipping blocks of convolutional layers by using shortcut connections. These shortcuts are useful for optimizing trainable parameters in error backpropagation to avoid the vanishing/exploding gradients problem, which helps to construct a deeper CNN structure and improve the final performance for fault diagnosis.

RBB consists of several convolutional layers (Conv), batch normalizations (BN), the ReLU activation function and one shortcut. There are two different RBB structures, denoted by RBB-1 and RBB-2 in this research, as shown in Fig. 2. Both RBB-1 and RBB-2 have three Conv and BN layers. The shortcut in RBB-1 is the identity x, as shown in Fig. 2a. Let F denote the nonlinear function for the convolutional path in RBB-1; the output of RBB-1 can then be formulated as in Eq. (5). Figure 2b presents the structure of RBB-2, whose shortcut contains one additional Conv layer and one BN layer. Let H denote the shortcut path; the output of RBB-2 can then be formulated as in Eq. (6).

$$y = F(x) + x \tag{5}$$

$$y = F(x) + H(x) \tag{6}$$

Several RBB-1 and RBB-2 blocks are stacked after the first convolutional layer in ResNet-50. The ResNet-50 model is published in [17], and it is applied in this research.

3.2.2 Transfer learning using ResNet-50

The structure of the proposed TCNN(ResNet-50) is presented in Fig. 3. We transfer the first 49 layers of ResNet-50 (there are 1 + 16 × 3 = 49 Conv layers in Fig. 3). Then, a fully connected layer (FC) and the softmax classifier are added to ResNet-50 to fit the class labels of the fault diagnosis dataset. It should be noted that the depth of TCNN(ResNet-50) is 51 layers. With deeper network layers and better feature extraction layers, the proposed TCNN(ResNet-50) would improve its final prediction accuracy on fault diagnosis.

TCNN(ResNet-50) applies ResNet-50 to extract the features of the converted images generated by the signal-to-image method. Then, these features are used to train the fault classifier. The output size at the 49th layer of ResNet-50 is 2048. Let $F_{ResNet}$ denote the nonlinear function of ResNet-50, $y\_f_{i,j}$ denote the feature extracted from
$$y\_f_{i,j} = F_{ResNet}(RGBPixel_i). \tag{7}$$

3.3 Performance evaluation using cross-validation

Cross-validation (CV) is a popular technique to obtain a reliable performance evaluation of a fault classifier [43]. K-fold CV is the most popular of the CV techniques. It divides the whole data into K subsamples of approximately equal cardinality, N/K samples each. Each subsample successively plays the role of the validation dataset, while the remaining K − 1 subsamples are used for training the fault classifier. Rauber et al. [43] applied tenfold cross-validation together with feature selection to optimize the fault diagnosis system. Han et al. [44] applied fivefold cross-validation with ANN and SVM for intelligent diagnosis of rotating machinery. Zhu et al. [45] applied fivefold cross-validation on roller bearing fault diagnosis.

In this research, 10 times tenfold CV is applied to achieve a reliable performance evaluation of TCNN(ResNet-50) on fault diagnosis. Suppose $Y_v$ and $\hat{Y}_v$ denote the actual and predicted labels on the validation dataset, and $N_v$ is the number of samples in the validation dataset. The accuracy of the fault classifier ($Acc_m$) is given by Eq. (8). The accuracy of CV ($Acc_{cv}$) is the mean over the 10 runs of tenfold CV, as shown in Eq. (9).

$$Acc_m = \frac{1}{N_v}\left(\sum_{i=1}^{N_v} \mathbf{1}\{Y_v == \hat{Y}_v\}\right) \tag{8}$$

$$Acc_{cv} = \frac{1}{10 \times 10}\sum_{i=1}^{10}\sum_{j=1}^{10} Acc_m(i,j) \tag{9}$$

After finishing the CV process, the obtained fault classifier is evaluated on another, separate testing dataset. Denote $Y_t$ and $\hat{Y}_t$ as the actual and predicted labels on the testing dataset, and $N_t$ as the number of samples in the testing dataset. The final prediction accuracy (Acc) of the fault classifier is given by Eq. (10).

$$Acc = \frac{1}{N_t}\left(\sum_{i=1}^{N_t} \mathbf{1}\{Y_t == \hat{Y}_t\}\right). \tag{10}$$

In this step, the time-domain fault signals $L_i$ will be converted to RGB images $RGBPixel_i$ as shown in Eqs. (2)–(4). Because the input size of ResNet-50 is 224 × 224 and the size of the converted images is m × m, a resize operator is applied to the converted RGB images.

3.4.2 Initialization structure of TCNN(ResNet-50)

Initialize the structure of ResNet-50 and restore the pre-trained weights into ResNet-50. Then, add the FC and softmax classifier layers. In this research, the number of hidden neurons in the FC layer is 128, and that of the softmax layer is determined by the number of label classes in the fault diagnosis dataset.

3.4.3 Training TCNN(ResNet-50)

Take $RGBPixel_i$ as the input to obtain its feature $y\_f_{i,j}$, and train the newly added FC and softmax classifier layers for fault classification. During the training process, the dropout technique and L2 regularization are applied.

4 Case studies with results

In this section, three case studies are conducted to evaluate the performance of TCNN(ResNet-50). The proposed models are implemented in Python 3.5 with Keras using TensorFlow as the backend, and run on Ubuntu with a Titan XP GPU.

4.1 Case 1: KAT bearing dataset

4.1.1 Data description

In this subsection, the proposed TCNN(ResNet-50) is applied to the condition monitoring of bearing damage dataset provided by the KAT datacenter at Paderborn University [46]. The hardware of this experiment is shown in [46], and there are 15 datasets and they can be
Table 3 Cross-validation results Acc_cv of different m and TCNN variants in Case 1 (%)

          TCNN(ResNet-50)   TCNN(VGG-16)     TCNN(VGG-19)     TCNN(Inception-V3)
m = 32    95.30 ± 0.0368    90.40 ± 0.1827   88.27 ± 0.1499   87.78 ± 0.0372
m = 40    97.20 ± 0.0097    94.78 ± 0.0424   93.07 ± 0.1301   90.76 ± 0.0424
m = 48    98.63 ± 0.0109    96.32 ± 0.0452   94.82 ± 0.0964   91.08 ± 0.0952
m = 56    98.66 ± 0.0143    97.11 ± 0.0947   94.28 ± 0.1178   92.23 ± 0.1919
m = 64    98.97 ± 0.0121    97.27 ± 0.0437   95.92 ± 0.0967   93.68 ± 0.1345
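
Each row of the table above corresponds to one image size m, that is, to signal segments of m² points converted to m × m RGB images by Eqs. (2)–(4). The following NumPy sketch shows one way this conversion could be written. The function names `segment` and `signals_to_rgb` are hypothetical, the non-overlapping segmentation is an assumption, and the later resize to 224 × 224 is left out because the resize method is not specified here.

```python
import numpy as np


def segment(signal, m):
    """Cut a 1-D signal into non-overlapping samples of m*m points.

    Assumption: segments do not overlap and a trailing remainder is dropped.
    """
    signal = np.asarray(signal, dtype=np.float64)
    n = signal.size // (m * m)
    return signal[: n * m * m].reshape(n, m * m)


def signals_to_rgb(samples, m):
    """Convert N samples of length m*m into an array of shape (N, m, m, 3)."""
    samples = np.asarray(samples, dtype=np.float64)
    if samples.ndim != 2 or samples.shape[1] != m * m:
        raise ValueError("samples must have shape (N, m*m)")
    # Eq. (2): row-major reshape, so BM_i(j, k) = L_i((j - 1) * m + k).
    bm = samples.reshape(-1, m, m)
    # Eq. (3): normalize with the minimum and maximum over all samples.
    lo, hi = bm.min(), bm.max()
    if hi == lo:
        raise ValueError("a constant signal cannot be normalized")
    nm = (bm - lo) / (hi - lo)
    # Eq. (4): scale to 0-255 and copy the same values into R, G and B.
    gray = nm * 255.0
    return np.repeat(gray[..., np.newaxis], 3, axis=-1)


rng = np.random.default_rng(0)
raw = rng.standard_normal(3 * 64 * 64)       # synthetic stand-in for a vibration signal
images = signals_to_rgb(segment(raw, 64), 64)
print(images.shape)                          # (3, 64, 64, 3)
```

Because the normalization in Eq. (3) uses the extremes of the whole sample set rather than of each image, relative amplitude differences between samples are preserved in the pixel values.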
Table 4 Acc results of TCNN variants in Case 1 (%)

Methods               Max     Min     Mean    SD
TCNN(ResNet-50)       98.96   98.94   98.95   0.0074
TCNN(VGG-16)          97.38   97.26   97.32   0.0410
TCNN(VGG-19)          96.07   95.69   95.90   0.1064
TCNN(Inception-V3)    93.97   93.50   93.77   0.1366

Fig. 5 Effect of m on Acc_cv under different TCNN variants in Case 1

Fig. 6 The convergence curve of four TCNN variants in Case 1

Table 4 presents the Acc results of the four TCNN variants, with m set to 64. The maximum (max), minimum (min), mean and standard deviation (std) are given. The max, min, mean and std of TCNN(ResNet-50) are 98.96%, 98.94%, 98.95% and 0.0074, and TCNN(ResNet-50) outperforms TCNN(VGG-16), TCNN(VGG-19) and TCNN(Inception-V3) in all comparison terms. The std of TCNN(ResNet-50) is small, showing that the prediction of TCNN(ResNet-50) is stable. Figure 6 presents the convergence curves of the four TCNN variants. From the results, it can be seen that TCNN(ResNet-50) has the fastest convergence speed among these four TCNN variants.

4.1.3 Results and comparisons with other methods

The proposed TCNN(ResNet-50) is compared with other methods provided in [46]. They are classification and regression trees (CART), random forests (RF), boosted trees (BT), neural networks (NN), support vector machines with parameters optimally tuned using particle swarm optimization (SVM-PSO), extreme learning machine (ELM), k-nearest neighbors (KNN) and their ensemble algorithm using majority voting (Ensemble). The results of these compared methods are taken directly from the literature, and their parameter settings can be found in [46]. The comparison results are presented in Table 5.

The results show that TCNN(ResNet-50) has achieved good results, and its prediction accuracy is 98.95%. The results of CART, RF, BT, NN, SVM-PSO, ELM, KNN and Ensemble are 98.3%, 98.3%, 83.3%, 44.2%, 75.8%, 60.8%, 62.5% and 98.3%, respectively. These results validate the performance of TCNN(ResNet-50).

Table 5

Methods            Prediction accuracy (%)
TCNN(ResNet-50)    98.95
CART               98.3
RF                 98.3
BT                 83.3
NN                 44.2
SVM-PSO            75.8
ELM                60.8
KNN                62.5
Ensemble           98.3

4.2 Case 2: CWRU motor bearing dataset
Motor load (hp)    Motor speed (rpm)
0                  1797
1                  1772
2                  1750
3                  1730
Table 7 Cross-validation results Acc_cv of different m and TCNN variants in Case 2 (%)

           TCNN(ResNet-50)   TCNN(VGG-16)     TCNN(VGG-19)     TCNN(Inception-V3)
m = 32     99.40 ± 0.0193    98.28 ± 0.0307   98.16 ± 0.0638   97.64 ± 0.0574
m = 48     99.75 ± 0.0136    99.49 ± 0.0184   99.17 ± 0.0428   99.20 ± 0.0359
m = 64     99.95 ± 0.0078    99.85 ± 0.0319   99.67 ± 0.0171   99.68 ± 0.0223
m = 80     99.95 ± 0.0095    99.96 ± 0.0122   99.91 ± 0.0165   99.96 ± 0.0115
m = 96     100 ± 0           99.99 ± 0.0066   99.92 ± 0.0201   99.93 ± 0.0098
m = 112    100 ± 0           99.99 ± 0.0055   99.97 ± 0.0059   99.96 ± 0.0113
m = 128    100 ± 0           99.98 ± 0.0043   99.98 ± 0.0110   99.98 ± 0.0065
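
Each Acc_cv entry in the table above is a mean over 10 runs of tenfold CV, following Eqs. (8) and (9). A minimal NumPy sketch of that aggregation is given below. The function names and the `fit_predict` callback are hypothetical, the folds are drawn by plain random shuffling (whether the original experiments stratify by class is not stated), and the sketch returns the raw score matrix rather than guessing how the reported ± values were computed.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Eq. (8): fraction of samples whose predicted label equals the actual label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


def kfold_indices(n_samples, k, rng):
    """Shuffle the sample indices and split them into k folds of near-equal size."""
    return np.array_split(rng.permutation(n_samples), k)


def repeated_kfold_accuracy(x, y, fit_predict, k=10, repeats=10, seed=0):
    """Eq. (9): mean of Acc_m over `repeats` runs of k-fold CV.

    fit_predict(x_train, y_train, x_val) is a user-supplied callable that
    trains a classifier and returns predicted labels for x_val.
    Returns (Acc_cv, scores) where scores[i, j] is Acc_m for run i, fold j.
    """
    x, y = np.asarray(x), np.asarray(y)
    rng = np.random.default_rng(seed)
    scores = np.empty((repeats, k))
    for i in range(repeats):
        folds = kfold_indices(len(y), k, rng)
        for j, val_idx in enumerate(folds):
            # The remaining k - 1 folds form the training set.
            train_idx = np.concatenate([f for t, f in enumerate(folds) if t != j])
            y_hat = fit_predict(x[train_idx], y[train_idx], x[val_idx])
            scores[i, j] = accuracy(y[val_idx], y_hat)
    return float(scores.mean()), scores
```

In the setting described in this paper, `fit_predict` would wrap the training of the FC and softmax layers on the extracted features; any classifier with the same call shape can be substituted for experimentation.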
TCNN(VGG-19) and TCNN(Inception-V3). Figure 9 presents the convergence curves of the TCNN variants. From the results, it can be seen that all four TCNN variants have similar convergence speeds, and TCNN(ResNet-50) has a slight advantage over the other TCNN variants.

4.2.3 Results and comparisons with other methods

The proposed TCNN(ResNet-50) is compared with other well-known DL-based methods and other CNN-based methods. They are normalized SAE (NSAE-LCN) from Jia et al. [23], stacked sparse autoencoder (SSAE) from Qi et al. [24], sparse filter from Lei et al. [48], deep belief network (DBN) from Gan et al. [25], deep CNN from Wang et al. [30], adaptive deep convolution neural network (ADCNN) from Guo et al. [31], hierarchical convolutional network (CNN-1) from Lu et al. [32], convolutional neural network and empirical mode decomposition (CNN-2) from Xie et al. [33] and CNN using multiple sensors (CNN-3) from Xia et al. [34]. The comparison results of TCNN(ResNet-50) are presented in Tables 9 and 10.

The proposed TCNN(ResNet-50) obtains a good result. The mean prediction accuracy is as high as 99.99%. From Table 9, it can be seen that the mean prediction results of NSAE-LCN, SSAE, Sparse filter and DBN are 99.92%,

Table 9 Comparisons with other DL methods in Case 2 (%)

Methods          Mean accuracy
TCNN             99.99
NSAE-LCN         99.92
SSAE             99.85
Sparse filter    99.66
DBN              99.03

4.3 Case 3: Self-priming centrifugal pump dataset

4.3.1 Data description

In this case study, the proposed TCNN models are applied to the self-priming centrifugal pump dataset from Lu et al. [42]. The data acquisition system of the self-priming centrifugal pump is shown in Fig. 10. The acceleration sensor is installed above the motor housing, and the vibration signals are collected for further analysis. In this case study, the rotation speed is 2900 rpm, and the sampling frequency of the vibration signals is 10239 Hz. The fault conditions are bearing roller wearing (BR), inner race wearing (IR), outer race wearing (OR) and impeller wearing (IW). Together with the normal condition (NO), there are five health conditions in this dataset.

4.3.2 Parameter analysis with tenfold CV

The effect of m on the final prediction accuracy of the TCNN variants is studied in this subsection. The value of m is set to 32, 48, 64, 80 and 96. All TCNN variants are evaluated with 10 times tenfold CV, and the comparison results are presented in Tables 11 and 12 and Figs. 11 and 12. The parameters of TCNN(ResNet-50), TCNN(VGG-16), TCNN(VGG-19) and TCNN(Inception-V3) are the same as in Case 1 and Case 2.

From Table 11, TCNN(ResNet-50) obtains a better mean and standard deviation of Acc_cv than the other three TCNN variants under the same value of m. The best m is 80 for all TCNN variants, and finally, TCNN(ResNet-50) achieves as high as 100% ± 0. As shown in Fig. 11, it can
be seen that TCNN(ResNet-50) outperforms the other three TCNN variants. TCNN(VGG-16) has a similar performance to TCNN(VGG-19), and both are superior to TCNN(Inception-V3).

In Table 12, m is set to 80. The Acc values of TCNN(VGG-16), TCNN(VGG-19) and TCNN(Inception-V3) are 99.05% ± 0.0144, 98.97% ± 0.0124 and 98.54% ± 0.0422, respectively, while that of TCNN(ResNet-50) is 99.20% ± 0. TCNN(ResNet-50) thus outperforms TCNN(VGG-16), TCNN(VGG-19) and TCNN(Inception-V3) in terms of Acc. Figure 12 presents the convergence curves of the TCNN variants. From the results, it can be seen that all four TCNN variants have similar convergence speeds, and TCNN(ResNet-50) has a slight advantage over the other TCNN variants.

Table 12 Acc results of TCNN variants in Case 3 (%)

Methods               Max     Min     Mean    SD
TCNN(ResNet-50)       99.20   99.20   99.20   0
TCNN(VGG-16)          99.07   99.03   99.05   0.0144
TCNN(VGG-19)          98.99   98.95   98.97   0.0124
TCNN(Inception-V3)    98.59   98.46   98.54   0.0422
Table 11 Cross-validation results Acc_cv of different m and TCNN variants in Case 3 (%)

          TCNN(ResNet-50)   TCNN(VGG-16)     TCNN(VGG-19)     TCNN(Inception-V3)
m = 32    98.57 ± 0.0574    93.96 ± 0.1423   94.68 ± 0.1174   90.87 ± 0.1839
m = 48    99.70 ± 0.0291    97.12 ± 0.0780   97.62 ± 0.0861   94.68 ± 0.0767
m = 64    100 ± 0           99.63 ± 0.0277   99.78 ± 0.0129   98.51 ± 0.0769
m = 80    100 ± 0           99.85 ± 0.0145   99.79 ± 0.0226   99.62 ± 0.0414
m = 96    99.98 ± 0.0106    99.81 ± 0.0230   99.51 ± 0.0381   99.32 ± 0.0375
4. Wen L, Li XY, Gao L (2019) A new two-level hierarchical diagnosis network based on convolutional neural network. IEEE T Instrum Meas. https://doi.org/10.1109/TIM.2019.2896370
5. Dai X, Gao Z (2013) From model, signal to knowledge: a data-driven perspective of fault detection and diagnosis. IEEE Trans Ind Inf 9(4):2226–2238
6. Li XY, Gao L, Pan QK, Wan L, Chao KM (2018) An effective hybrid genetic algorithm and variable neighborhood search for integrated process planning and scheduling in a packaging machine workshop. IEEE Trans Syst Man Cybern Syst. https://doi.org/10.1109/TSMC.2018.2881686
7. Yang C, Song P, Liu X (2019) Failure prognostics of heavy vehicle hydro-pneumatic spring based on novel degradation feature and support vector regression. Neural Comput Appl 31(1):139–156
8. Ertunc HM, Ocak H, Aliustaoglu C (2013) ANN- and ANFIS-based multi-staged decision algorithm for the detection and diagnosis of bearing faults. Neural Comput Appl 22(1):435–446
9. Kiakojoori S, Khorasani K (2016) Dynamic neural networks for gas turbine engine degradation prediction, health monitoring and prognosis. Neural Comput Appl 27(8):2157–2192
10. Seera M, Lim CP, Ishak D, Singh H (2013) Application of the fuzzy min–max neural network to fault detection and diagnosis of induction motors. Neural Comput Appl 23(1):191–200
11. Liu RN, Yang BY, Zio E, Chen XF (2018) Artificial intelligence for fault diagnosis of rotating machinery: a review. Mech Syst Signal Process 108:33–47
12. Wang JJ, Ma YL, Zhang LB, Gao RX, Wu DZ (2018) Deep learning for smart manufacturing: methods and applications. J Manuf Syst 48(Part C):144–156
13. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
14. Zhao R, Yan RQ, Chen ZH, Mao KZ, Wang P, Gao RX (2019) Deep learning and its applications to machine health monitoring. Mech Syst Signal Process 115:213–237
15. Donahue J, Jia YQ, Vinyals O, Hoffman J, Zhang N, Tzeng E, Darrell T (2014) Decaf: a deep convolutional activation feature for generic visual recognition. In: International conference on machine learning, pp 647–655
16. Yosinski J, Clune J, Bengio Y, Lipson H (2014) How transferable are features in deep neural networks? In: Advances in neural information processing systems, pp 3320–3328
17. He KM, Zhang XY, Ren SQ, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
18. Li XY, Lu C, Gao L, Xiao SQ, Wen L (2018) An effective multi-objective algorithm for energy efficient scheduling in a real-life welding shop. IEEE Trans Ind Inf 14(12):5400–5409
19. Wen L, Gao L, Li XY (2017) A new deep transfer learning based on sparse auto-encoder for fault diagnosis. IEEE Trans Syst Man Cybern Syst. https://doi.org/10.1109/tsmc.2017.2754287
20. Cho HC, Knowles J, Fadali MS, Lee KS (2010) Fault detection and isolation of induction motors using recurrent neural networks and dynamic Bayesian modeling. IEEE Trans Control Syst Technol 18(2):430–437
21. Sun C, Ma M, Zhao Z, Chen X (2018) Sparse deep stacking network for fault diagnosis of motor. IEEE Trans Ind Inf 14(7):3261–3270
22. Verma NK, Gupta VK, Sharma M, Sevakula RK (2013) Intelligent condition based monitoring of rotating machines using sparse auto-encoders. In: IEEE conference on prognostics and health management (PHM), Gaithersburg, MD, pp 1–7
23. Jia F, Lei Y, Guo L, Lin J, Xing S (2018) A neural network constructed by deep learning technique and its application to intelligent fault diagnosis of machines. Neurocomputing 272(10):619–628
24. Qi Y, Shen C, Wang D, Shi J, Jiang X, Zhu Z (2017) Stacked sparse autoencoder-based deep network for fault diagnosis of rotating machinery. IEEE Access 5:15066–15079
25. Gan M, Wang C (2016) Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings. Mech Syst Signal Process 72:92–104
26. Han T, Liu C, Yang WG, Jiang DX (2018) Deep transfer network with joint distribution adaptation: a new intelligent fault diagnosis framework for industry application. arXiv preprint arXiv:1804.07265
27. Zhang HJ, Cao X, Ho J, Chow T (2017) Object-level video advertising: an optimization framework. IEEE Trans Ind Inf 13(2):520–531
28. Zhang HJ, Ji YZ, Huang W, Liu LL (2018) Sitcom-star-based clothing retrieval for video advertising: a deep learning framework. Neural Comput Appl. https://doi.org/10.1007/s00521-018-3579-x
29. Wen L, Li XY, Gao L, Zhang YY (2018) A new convolutional neural network based data-driven fault diagnosis method. IEEE Trans Ind Electron 65(7):5990–5998
30. Wang F, Jiang HK, Shao HD, Duan WJ, Wu SP (2017) An adaptive deep convolutional neural network for rolling bearing fault diagnosis. Meas Sci Technol 28(9):095005
31. Guo X, Chen L, Shen C (2016) Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis. Measurement 93:490–502
32. Lu C, Wang ZY, Zhou B (2017) Intelligent fault diagnosis of rolling bearing using hierarchical convolutional network based health state classification. Adv Eng Inf 32:139–151
33. Xie Y, Zhang T (2017) Fault diagnosis for rotating machinery based on convolutional neural network and empirical mode decomposition. Shock Vib. Article ID 3084197
34. Xia M, Li T, Xu L, Liu L, de Silva CW (2018) Fault diagnosis for rotating machinery using multiple sensors and convolutional neural networks. IEEE/ASME Trans Mechatron 23(1):101–110
35. Ren R, Hung T, Tan KC (2018) A generic deep-learning-based approach for automated surface inspection. IEEE Trans Cybern 48(3):929–940
36. Wehrmann J, Simoes GS, Barros RC, Cavalcante VF (2018) Adult content detection in videos with convolutional and recurrent neural networks. Neurocomputing 272:432–438
37. Shin HC, Roth HR, Gao MC, Lu L, Xu ZY, Nogues I, Yao JH, Mollura D, Summers RM (2016) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 35(5):1285–1298
38. Janssens O, Van de Walle R, Loccufier M, Van Hoecke S (2018) Deep learning for infrared thermal image based machine health monitoring. IEEE/ASME Trans Mechatron 23(1):151–159
39. Shao S, McAleer S, Yan R, Baldi P (2018) Highly-accurate machine fault diagnosis using deep transfer learning. IEEE Trans Ind Inf 1:1. https://doi.org/10.1109/tii.2018.2864759
40. Chong UP (2011) Signal model-based fault detection and diagnosis for induction motors using features of vibration signal in two-dimension domain. Stroj Vestn J Mech Eng 57(9):655–666
41. Kang M, Kim JM (2014) Reliable fault diagnosis of multiple induction motor defects using a 2-d representation of Shannon wavelets. IEEE Trans Magn 50(10):1–13
42. Lu C, Wang Y, Ragulskis M, Cheng Y (2016) Fault diagnosis for rotating machinery: a method based on image processing. PLoS ONE 11(10):e0164111
43. Rauber TW, Assis Boldt F, Varejão FM (2015) Heterogeneous feature models and feature selection applied to bearing fault diagnosis. IEEE Trans Ind Electron 62(1):637–646
44. Han T, Jiang D, Zhao Q, Wang L, Yin K (2017) Comparison of random forest, artificial neural networks and support vector machine for intelligent diagnosis of rotating machinery. In: Transactions of the institute of measurement and control, pp 1–13
45. Zhu K, Song X, Xue D (2014) A roller bearing fault diagnosis method based on hierarchical entropy and support vector machine with particle swarm optimization algorithm. Measurement 47:669–675
46. Lessmeier C, Kimotho JK, Zimmer D, Sextro W (2016) Condition monitoring of bearing damage in electromechanical drive systems by using motor current signals of electric motors: a benchmark data set for data-driven classification. In: Proceedings of the European conference of the prognostics and health management society, pp 05–08
47. Smith WA, Randall RB (2015) Rolling element bearing diagnostics using the Case Western Reserve University data: a benchmark study. Mech Syst Signal Process 64:100–131
48. Lei Y, Jia F, Lin J, Xing S, Ding SX (2016) An intelligent fault diagnosis method using unsupervised feature learning towards mechanical big data. IEEE Trans Ind Electron 63(5):3137–3147

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.