1. Introduction
Rolling bearings are widely used in industrial manufacturing. Ensuring their safe and stable operation is a core requirement of the manufacturing process, and their health condition has a significant impact on system dependability, productivity, and facility lifetime [1,2,3]. In recent years, intelligent manufacturing has become a major development trend in the manufacturing industry, and model-based mechanical fault diagnosis technology has developed rapidly. A large number of methods and techniques have been proposed [4,5,6].
Owing to its robust feature learning ability, deep learning has become a research hotspot and provides new ideas for fault diagnosis of mechanical equipment [7,8,9,10]. Training a supervised deep learning model to convergence requires large quantities of labeled data, and the number of samples in each category should be balanced so that the model learns the features of every category equally well and achieves high classification accuracy. However, in practical applications, fault data are severely imbalanced and differ in distribution, so deep networks are trained incompletely and cannot fully fit the distribution of the training samples, which ultimately leads to poor classification accuracy. Consequently, it is of great significance to establish a stable and valid diagnosis method for unbalanced samples.
To improve diagnosis performance under unbalanced samples, many scholars have studied this topic and obtained remarkable results. Duan et al. developed a multi-classification fault diagnosis strategy based on support vector data description to improve diagnostic accuracy [11]. Zhang et al. designed a new classification method for unbalance faults in permanent magnet synchronous motors based on a discrete wavelet transform [12]. Nevertheless, classification accuracy cannot be significantly improved just by improving the classification method; the root of the problem can only be addressed by generating additional synthetic data from the original data. In 2014, Goodfellow et al. proposed a new data enhancement method called the generative adversarial network (GAN), which can supplement a sample space with insufficient data by synthesizing samples from a limited number of real samples [13]. GANs are widely used in fields including signal processing, pattern recognition, and national security [14,15,16]. Meanwhile, owing to the excellent data expansion capability of GANs, many variants with different structures have been derived [17,18].
However, continuous optimization of the GAN model structure has not completely resolved the problems of difficult convergence and unstable training. In 2017, Gulrajani et al. proposed a new generative adversarial approach called the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) [19]. It randomly interpolates between real and generated samples and penalizes the discriminator gradient at the interpolated points, so that the transition region between the real and generated samples satisfies the Lipschitz constraint. Further research showed that WGAN-GP can overcome the drawbacks of the aforementioned methods and performs well in the field of fault identification [20,21,22,23,24].
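The interpolation-based penalty can be illustrated with a small sketch. It is not the implementation used in this study: the critic is an arbitrary Python callable, the gradient is estimated by central finite differences instead of automatic differentiation, and the names `numerical_gradient`, `gradient_penalty`, and `lam` are hypothetical. The default `lam=10.0` mirrors the gradient penalty factor selected in Section 4.2.

```python
import math
import random


def numerical_gradient(critic, x, h=1e-5):
    """Central-difference estimate of the gradient of critic at point x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grad.append((critic(up) - critic(down)) / (2.0 * h))
    return grad


def gradient_penalty(critic, real_batch, fake_batch, lam=10.0, rng=None):
    """Return lam * mean((||grad D(x_hat)||_2 - 1)^2) over the batch.

    x_hat is a random interpolation between a real and a generated sample.
    Assumption: finite differences stand in for autograd in this toy setting.
    """
    rng = rng or random.Random(0)
    total = 0.0
    for real, fake in zip(real_batch, fake_batch):
        eps = rng.random()  # interpolation coefficient drawn from U[0, 1)
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]
        grad = numerical_gradient(critic, x_hat)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        total += (grad_norm - 1.0) ** 2
    return lam * total / len(real_batch)
```

A critic whose gradient norm is exactly 1 everywhere incurs no penalty, while a steeper critic is penalized quadratically, which is how the term pushes the critic toward the 1-Lipschitz constraint.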
Owing to the diversity of rotating machine systems and the complexity of sensing data, classical “weak” machine learning methods based on manual feature selection struggle to provide accurate classification results. Data-driven methods have received increasing attention from researchers because they process mechanical signals quickly and efficiently, deliver reliable fault detection results, and do not rely on a large amount of a priori expert knowledge [25,26]. Deep belief networks (DBNs) [27], recurrent neural networks (RNNs) [28], autoencoders [29], convolutional neural networks (CNNs) [30], and numerous other neural networks have been applied in fault diagnosis.
In recent years, CNNs have been widely used in fault diagnosis. CNNs can extract deeper fault features and significantly reduce the number of parameters while automatically and accurately capturing the information implied in vibration signals in different states [31,32]. Janssens et al. introduced CNNs to the field of fault diagnosis and designed a CNN-based feature learning model for condition monitoring [33]. Zhang et al. used the raw time signal directly as the input of a one-dimensional CNN to achieve fault classification [34]. Peng et al. proposed a residual learning-based one-dimensional CNN that operates on the original vibration signal for bearing fault diagnosis under variable operating conditions [35]. At the same time, some researchers have approached fault identification from the perspective of image processing to eliminate the influence of manual features, which provides a new idea for fault diagnosis. Li et al. proposed a method for highly sensitive deep feature extraction and pattern recognition using the short-time Fourier transform (STFT) and a CNN [36]. Ding et al. used a deep ConvNet to automatically learn multiscale features from wavelet packet energy (WPE) images and applied them to bearing fault diagnosis [37]. Wen et al. proposed a LeNet-5-based CNN for fault diagnosis [38]. Although the above CNN and image processing-based methods perform very well in fault state identification, they extract spatial and channel information from local receptive fields without considering the weights of the feature maps, which generates redundant features to some extent and increases the computational cost while reducing the nonlinear fitting ability of the model to the fault features.
Recently, attention mechanisms have drawn considerable interest in computer science because they selectively reinforce useful information and suppress superfluous feature information to obtain better network performance [39,40]. The squeeze-and-excitation (SE) attention mechanism adaptively recalibrates channel-wise feature responses by explicitly modeling the interdependencies between channels, bringing significant performance gains at minimal additional computational cost. Hu et al. proposed the self-attentive convolutional neural network (SECNN) by adding a novel architectural unit, the squeeze-and-excitation block [41]. Roy et al. demonstrated increased segmentation accuracy by efficiently integrating SE blocks into three state-of-the-art fully convolutional networks (F-CNNs) on three challenging benchmark datasets [42]. Feng et al. proposed semi-supervised meta-learning with a squeeze-and-excitation attention network (SSMN) and demonstrated the usability and validity of the method on three bearing datasets [43]. Compared with plain CNNs and numerous CNN variants, SECNN can improve the model’s resistance to imbalanced data and its nonlinear fitting ability to fault features, while the number of parameters and the computation in the SECNN structure remain relatively small.
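The channel recalibration performed by an SE block can be sketched in a few lines. The following is a minimal, dependency-free illustration rather than the SECNN used in this study: feature maps are nested lists shaped [C][H][W], biases are omitted, and the function name `se_block` and the weight names `w1` and `w2` are hypothetical.

```python
import math


def se_block(feature_maps, w1, w2):
    """Apply squeeze-and-excitation to feature maps shaped [C][H][W].

    w1 has shape [C // r][C] and w2 has shape [C][C // r], where r is the
    reduction ratio. Biases are left out to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C channels and apply a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: reweight every channel by its gate, a value in (0, 1).
    return [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]
```

The bottleneck of size C // r is what keeps the added parameter count small; with all-zero weights every gate equals 0.5, so the block simply halves its input.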
To address the limited number of rolling bearing fault samples and the unbalanced distribution of fault categories, and to realize efficient and high-precision fault diagnosis, an intelligent fault diagnosis method based on grayscale image transformation, WGAN-GP, and SECNN is proposed. First, the collected original vibration signals are converted into grayscale images to obtain 2D image samples that the model can process easily and that visualize the different bearing states. Then, adversarial training is performed using WGAN-GP to generate new samples with a distribution similar to that of the original samples. Finally, the expanded sample data are input to a deep feature extraction model based on squeeze-and-excitation, which automatically learns the grayscale image features of different fault states, selectively enhances useful feature maps, and reduces redundant features on the convolution channels before outputting the recognition results. The experimental results show that the method has good robustness and generalization ability and excellent recognition performance under fault class imbalance.
The advantages and innovations of the method proposed in this study are summarized as follows:
One-dimensional original vibration signals are converted into two-dimensional grayscale images by grayscale image conversion, to fully exploit deeper feature information and better utilize the image generation capability of WGAN-GP;
A data-driven approach based on WGAN-GP is used to generate samples for the under-represented bearing fault classes. Compared with GAN and WGAN, WGAN-GP avoids the unstable training and mode collapse that the JS divergence causes in GAN, as well as the problems caused by weight clipping in WGAN, where the network degenerates into a simple function mapping and suffers from vanishing or exploding gradients. WGAN-GP forces the discriminator to satisfy the 1-Lipschitz continuity constraint by adding a gradient penalty term, which results in faster convergence and better quality of generated samples;
The attention mechanism is introduced into bearing fault diagnosis, and the self-attentive convolutional neural network (SECNN) is constructed, which can automatically extract information related to deep fault features and further improve the anti-interference ability and classification accuracy of the model for unbalanced data;
The method performs well in domain adaptation and achieves satisfactory diagnostic performance even when the working environment changes or the environmental noise is strong.
The organizational framework of this paper is as follows. Section 2 introduces the essential theoretical background of CNNs, GANs, and signal-to-image conversion methods. In Section 3, the proposed intelligent fault diagnosis framework is described in detail. In Section 4, the effectiveness and superiority of the method are verified by experiments, and the experimental results are compared with those of other deep learning models. In Section 5, conclusions and future work are summarized.
4. Experimental Validation
In this section, to evaluate the performance of the constructed fault diagnosis framework and the validity of the proposed algorithm, we compared it experimentally with popular CNNs and analyzed its robustness and generalization capability in imbalanced bearing fault diagnosis using measured vibration signals of rolling bearings. The hardware environment is a 2.7 GHz CPU, 8 GB of main memory, and an NVIDIA GeForce GTX 1060 3 GB GPU; the programming environment is Python 3.8.3.
4.1. Dataset Description
The case data are rolling bearing benchmark data acquired from the Case Western Reserve University (CWRU) Bearing Data Center. The CWRU test rig is shown in Figure 9. The rolling bearing under test is a 6205-2RS JEM SKF deep groove ball bearing, and its detailed parameters are listed in Table 1.
The test motor was operated at 1730 r/min, and the bearing health and fault data at the drive end were sampled at a frequency of 12 kHz. The CWRU dataset contains four status categories: normal (N), outer race fault (OF), inner race fault (IF), and ball fault (BF). Each fault condition has three fault sizes: 0.007 in. (0.1778 mm), 0.014 in. (0.3556 mm), and 0.021 in. (0.5334 mm). Therefore, a total of 10 operating states (one normal state and nine fault states) were set up for this experiment, and the specific classification is shown in Table 2.
First, the time-domain signal collected by the acceleration sensor was divided into multiple segments for sample generation. Considering computational performance and to prevent memory overflow, the length M of the segments was set to 64, and the segments were then converted into grayscale images with pixel values ranging from 0 to 255 and a size of 64 × 64. To confirm the diagnostic precision of the proposed method, we selected the same proportion of data from the nine rolling bearing fault datasets described in Table 2 for experiments.
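A possible form of this signal-to-image conversion is sketched below. It rests on assumptions that the text here does not confirm: each image is built from M × M consecutive samples, the samples are min-max normalized to 0–255, the image is filled row by row, and segments do not overlap. The function names `signal_to_gray_image` and `split_into_segments` are hypothetical; the exact mapping is the one defined in Section 3.

```python
def signal_to_gray_image(segment, m):
    """Map m * m consecutive samples to an m x m image with values in 0..255."""
    if len(segment) != m * m:
        raise ValueError("segment must contain exactly m * m samples")
    lo, hi = min(segment), max(segment)
    span = hi - lo or 1.0  # avoid division by zero for a constant segment
    pixels = [round((value - lo) / span * 255) for value in segment]
    # Fill the image row by row (assumed layout).
    return [pixels[row * m:(row + 1) * m] for row in range(m)]


def split_into_segments(signal, m):
    """Cut a long signal into non-overlapping segments of m * m samples."""
    size = m * m
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]
```

With M = 64, each image under these assumptions would consume 4096 samples, so a 12 kHz recording yields roughly three images per second of signal.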
The division of the datasets and the number of samples in each sub-dataset are shown in Table 3. Dataset A is the raw dataset, B is the training dataset formed by randomly selecting 60% of the original dataset A, C is the test dataset formed by the randomly selected 40%, D is the dataset generated by WGAN-GP, and dataset E is the enhanced dataset formed by combining B and D. During training, 15% of dataset A was used as a validation set to check the precision of the proposed method and adjust the hyperparameters.
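The relationship between datasets A to E can be sketched as follows. This is only an illustration: the helper name `split_dataset`, the fixed seed, and the absence of per-class stratification are assumptions, and the 15% validation subset is not modeled.

```python
import random


def split_dataset(samples, train_fraction=0.6, seed=0):
    """Randomly split samples into a training part and a test part."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(samples)))
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test


# Hypothetical usage: A is the raw dataset, D the WGAN-GP output.
dataset_a = [f"sample_{i}" for i in range(100)]
dataset_b, dataset_c = split_dataset(dataset_a)           # 60% / 40%
dataset_d = [f"generated_{i}" for i in range(20)]         # placeholder samples
dataset_e = dataset_b + dataset_d                         # enhanced training set
```

Keeping dataset C out of both B and D is what makes the reported test accuracy a measure of performance on real, unseen samples.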
4.2. Enhancement Data and Accuracy
In this section, we first evaluated the effectiveness of WGAN-GP in generating and extending data to address the severe data imbalance and distribution discrepancies in fault diagnosis with limited data. To maximize the effectiveness of WGAN-GP data generation, we determined the value of the gradient penalty factor λ through comparison experiments and used it in all subsequent experiments. To minimize the influence of chance, each experiment was repeated ten times, and the average of the ten results was taken as the accuracy of the model. As shown in Table 4, high accuracy is obtained when the gradient penalty factor λ is set to 10.
Second, to compare the sample generation quality of GAN, WGAN, and WGAN-GP precisely, we used the Fréchet distance (F) as the metric. The results are shown in Table 5; they indicate that WGAN-GP has a stronger sample generation ability and produces samples with a higher similarity to the real samples.
The loss curves of the WGAN-GP model are shown in Figure 10 and Figure 11, where values are recorded once every 5000 iterations, for a total of 20 loss values. Over 100,000 iterations, the loss values of all three GAN models oscillated strongly in the early phase and became more stable in the middle and later phases. The WGAN-GP model is clearly more stable than GAN and WGAN in the middle and late stages, and its loss values keep converging toward zero.
When generating sample data, the WGAN-GP model was trained until the generator and discriminator approached a Nash equilibrium. The L2 regularization penalty in the discriminator was set to 1 × 10⁻⁵, and the Adam optimizer was used for both the generator and the discriminator.
To improve the diagnostic performance and the nonlinear fitting ability of the SECNN model under unbalanced samples, we set up nine experimental groups with different numbers of convolutional kernels and different activation functions in each convolutional layer. Table 6 shows that the best identification precision is achieved when the numbers of convolutional kernels in the convolutional layers are 16, 32, and 64, respectively, and the activation function is Leaky ReLU.
We also analyzed the effects of batch size and learning rate on fault diagnosis accuracy. Figure 12 shows that the highest identification precision is achieved when the batch size and learning rate are set to 128 and 0.001, respectively. The dimensionality reduction ratio r of the SE module was set to 8. These settings were used in all subsequent experiments. The specific architecture of SECNN is shown in Figure 13.
We also defined the algorithm efficiency factor λ to maximize the diagnostic performance of the model; its calculation formula is given in Equation (21). We performed five sets of comparison experiments on the number of training iterations of the selected model, and the results are compared in Table 7. These experiments show that setting the number of iterations to 100,000 gives more satisfactory results.
4.3. Diagnosis Accuracy Comparisons
In this section, to further verify the validity of the proposed rolling bearing diagnosis strategy, we compared the diagnostic performance of different algorithms in comparative experiments. Dataset C, consisting of 40% of the samples randomly selected from the original dataset, was fed into the other deep learning models.
To minimize the influence of chance on the experimental results, we repeated each experiment ten times with the same dataset. The proposed method was compared with the algorithmic models in References [38,47,48,49,50,51,52,53]. As can be seen from Table 8, the average accuracy of all models for the unbalanced dataset exceeds 70%, but the diagnostic results vary considerably between models on the same dataset.
The comparison results show that the original CNN model has the lowest identification precision, 72.40%. The diagnosis accuracy is improved for SECNN with the addition of the self-attention mechanism, which indicates that the self-attention module plays a prominent role in suppressing noise weights and enhancing the weights of fault features. Both GAN-SECNN and WGAN-GP + SECNN are fault diagnosis methods based on generative adversarial networks, and the classification accuracy of WGAN-GP + SECNN is 100%, which is higher than that of GAN-SECNN. Its diagnostic accuracy is also greatly improved compared with that of SECNN alone, which indicates that generative adversarial networks cope well with unbalanced data and significantly reduce the reliance on raw data while maintaining diagnostic accuracy, giving the approach a clear advantage over other mainstream fault diagnosis methods.
Second, we used dataset B as the training set and dataset C as the testing set for the proposed model. A confusion matrix was used to show more directly how accurately the proposed model identifies the various fault states of rolling bearings. Figure 14 shows the confusion matrix of the results. The experimental results show that the model converges quickly and achieves high diagnostic accuracy under data imbalance.
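For readers unfamiliar with the construction, a confusion matrix such as the one in Figure 14 can be computed as in the following minimal sketch. The function names are hypothetical and the labels are assumed to be integer class indices (0 to 9 for the ten operating states); the numbers in the usage note are made up for illustration and are not results from this study.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Count samples per (true class, predicted class) pair."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix


def accuracy(matrix):
    """Overall accuracy: the diagonal sum divided by the total sample count."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0
```

Rows correspond to true states and columns to predicted states, so off-diagonal entries show which fault states are confused with one another. For example, `confusion_matrix([0, 0, 1, 2], [0, 1, 1, 2], 3)` places one sample in row 0, column 1.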
To visualize the feature extraction capability of the WGAN-GP + SECNN model, t-SNE was used to map extracted high-dimensional features to a two-dimensional space, as shown in
Figure 15.
From Figure 15a, we can observe that when the original features in the test set are reduced to two dimensions by t-SNE, the various fault states overlap, making it almost impossible to distinguish the boundaries between the categories. As the number of iterations increases, the points of the same category gradually cluster, but it is still difficult to distinguish all the categories, as shown in Figure 15b–d. Finally, sample points with the same color are clustered together, and each fault boundary under the ten working conditions can be distinguished, as shown in Figure 15e. The feature visualization results show that the WGAN-GP + SECNN model separates the fault states accurately.
4.4. Generalization and Robustness Comparisons
In practical rolling bearing fault diagnosis, the working conditions often change, resulting in large distribution differences between the training data and the test data, which degrades fault diagnosis performance. To confirm the generalization ability and robustness of the method, fault diagnosis experiments were conducted for rolling bearings under different working conditions.
In this part of the experiments, each dataset is a multi-speed mixed dataset. The training and testing samples in dataset A1 are composed of the same data from loads of 0–3 hp; the training and testing samples in dataset B1 are composed of different data, from loads of 0–2 hp and the load of 3 hp, respectively; the training and testing samples in dataset C1 are composed of different data, from loads of 0–1 hp and the load of 2 hp, respectively; and in dataset D1, the crack size was added as a further variable. The detailed dataset distribution is shown in Table 9. The generalization ability and robustness of the proposed model were evaluated under the same parameter settings as in the previous experiments.
Figure 16 and Table 10 show the accuracy curves of the training process and the final classification accuracy of the proposed model on datasets A1–D1. To minimize the influence of chance, we repeated each experiment ten times and took the average of the ten results as the accuracy of the model. Figure 16 and Table 10 show that the model still achieves excellent diagnostic performance under different working conditions. The test accuracies of the model on datasets A1–D1 are 99.97%, 99.78%, 99.82%, and 99.69%, respectively. Thus, the two-dimensional grayscale images can still fully represent the different bearing states even under different operating conditions. The results also show that the model has not only high fault diagnosis accuracy but also good robustness for bearing fault diagnosis.
The vibration signals collected from mechanical bearings under complex working conditions contain high-power noise, which easily drowns early fault information in strong background noise and makes accurate fault detection difficult. Therefore, to verify the noise robustness of the proposed method, signals with different signal-to-noise ratios were formed by adding additive white Gaussian noise (AWGN) with different standard deviations to the original vibration signals.
The signal-to-noise ratio is usually expressed in decibels as shown in Equation (22).
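The noise injection step can be sketched as follows. The code assumes the standard decibel definition SNR_dB = 10·log10(P_signal / P_noise), with power taken as the mean squared amplitude; whether Equation (22) uses exactly this form is an assumption, and the function names `signal_power`, `add_awgn`, and `measured_snr_db` are hypothetical.

```python
import math
import random


def signal_power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in signal) / len(signal)


def add_awgn(signal, snr_db, rng=None):
    """Add zero-mean Gaussian noise so that the result has the target SNR in dB."""
    rng = rng or random.Random(0)
    # Solve SNR_dB = 10 * log10(P_signal / P_noise) for the noise power.
    noise_power = signal_power(signal) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)  # standard deviation of the added noise
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of a noisy signal relative to its clean version."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(noise))
```

A negative SNR in decibels means the noise power exceeds the signal power, which is the regime in which early fault signatures are most easily masked.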
Figure 17 compares the diagnosis results of different algorithms under different noise environments. To avoid the effect of random factors on the experimental results, each test was repeated ten times. Figure 17 shows that the diagnostic performance of all methods gradually degrades as the noise power increases, but the proposed method still achieves an accuracy of 98.264% in the strong noise environment. The reason is that by converting the original one-dimensional vibration signal into two-dimensional grayscale images as the input samples for model training, the proposed method can thoroughly mine sensitive features from the complex original signal while effectively suppressing noise interference.
Meanwhile, traditional machine learning (ML) methods such as SVM and KNN show poor diagnostic performance because of their limited capacity to suppress noise and irrelevant interference. Therefore, the proposed method has stronger robustness and superior diagnostic performance under strong ambient noise.
To address the data imbalance encountered in fault diagnosis, which leads to incomplete training of the deep network and the inability to completely fit the training sample distribution, ten imbalanced datasets with different imbalance ratios were constructed to further assess the stability of the proposed method’s diagnostic performance.
The sample distributions of the ten imbalanced datasets are shown in Table 11. In the ten datasets, the ratios of normal samples to each type of fault sample in the training dataset were set to 500:500, 500:450, 500:400, 500:350, 500:300, 500:250, 500:200, 500:150, 500:100, and 500:50, respectively, while the number of samples in the test dataset was set to 200.
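The ten training configurations can be generated programmatically, as in the sketch below. The subsampling strategy (uniform random sampling without replacement from a per-class pool) is an assumption, and the names `make_imbalanced_set`, `imbalance_ratio`, and the label "N" for the normal class are illustrative.

```python
import random

NORMAL_COUNT = 500
FAULT_COUNTS = list(range(500, 0, -50))  # 500, 450, ..., 50 per fault class


def make_imbalanced_set(samples_by_class, fault_count, normal_label="N", seed=0):
    """Keep NORMAL_COUNT normal samples and fault_count samples per fault class."""
    rng = random.Random(seed)
    subset = {}
    for label, samples in samples_by_class.items():
        keep = NORMAL_COUNT if label == normal_label else fault_count
        # Sample without replacement; never request more than the pool holds.
        subset[label] = rng.sample(samples, min(keep, len(samples)))
    return subset


def imbalance_ratio(fault_count):
    """Ratio of normal samples to samples of a single fault class."""
    return NORMAL_COUNT / fault_count
```

The last configuration keeps 50 samples per fault class against 500 normal samples, which corresponds to the 10:1 imbalance ratio discussed in the results.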
To further verify the validity of the method under unbalanced data, we also fed the ten imbalanced datasets into the comparison models, as shown in Figure 18 and Table 12. The fault diagnosis accuracy of the proposed method under the first data distribution is 99.9%, and the accuracies of the other six methods are 99.1%, 98.9%, 98.6%, 97.8%, 94.1%, and 93.9%, respectively.
When the training sample size of each fault category is reduced to half of the normal sample size, the fault diagnosis accuracy of the proposed method is 99.2%, much higher than that of the other six methods. The diagnostic performance of every method decreases significantly as the imbalance ratio increases. When the imbalance ratio reaches 10:1, the proposed method still shows good diagnostic performance. Therefore, although the fault identification accuracy of the proposed method tends to decrease as the imbalance intensifies, the method maintains high identification accuracy and diagnostic stability.
5. Conclusions and Future Work
In this research, an intelligent fault diagnosis method based on WGAN-GP and SECNN is proposed for rolling bearing fault diagnosis under severe imbalance and distribution discrepancy of fault data. The method addresses data imbalance under strong noise operating conditions. The constructed model uses a signal-to-image conversion technique to convert the one-dimensional raw vibration signals into two-dimensional grayscale images, so that the noise in the data is transformed into grayscale, luminance, and other image information that is irrelevant to the classification results, and the strengths of neural networks in two-dimensional image classification are fully exploited. WGAN-GP was used to generate new data to overcome the distribution differences caused by data imbalance. Meanwhile, the attention mechanism was introduced, and a self-attentive convolutional neural network offline model was constructed to perform in-depth feature learning on the collected vibration signals, which can automatically and selectively enhance useful feature maps and reduce redundant features on the convolutional channels.
The validity and superiority of the method were verified by analyzing the benchmark data from CWRU and comparing the method with other mainstream deep learning models. The experimental results show that the method not only attains a diagnostic accuracy of more than 99.6% even under data imbalance and strong noise, but also has good generalization and robustness. The main limitation of the proposed method lies in GAN-based sample generation: in this study, the GANs generated additional image samples similar to the original samples but did not generate genuinely new image samples. However, many compound faults occur in practical rolling bearing fault diagnosis, and training samples cannot be obtained for all compound fault modes. In future work, we will further develop the signal-to-image transformation technique, investigate the sample generation capability of GANs in greater depth, and design a more suitable network.