Mest D 24 00142
Full Title: An improved MF1-FedAvg based Federated Learning method with MSRANet for
machinery fault diagnosis
Keywords: Federated learning; Fault diagnosis; Rolling bearings; Multiscale residual attention
network
Abstract: Current fault detection methods for rolling bearings suffer from insufficient data
features, limiting the generalization capability of models. Typically, conventional
approaches train the model with a significant amount of labeled data to improve
reliability. However, centralized training poses potential risks of data privacy leakage.
In response to this issue, we propose a federated learning-based fault diagnosis
model. In this method, fault diagnosis models for different clients are collaboratively
trained by multiple entities with distinct fault characteristics, eliminating the need for
third-party aggregation and thereby reducing the risk of data leakage. Specifically, we
design a multi-scale residual neural network with the ability to perform direct feature
extraction from fault data. This proposed network integrates attention units for various
scales, emphasizing key features of bearing faults and enhancing the fault recognition
capability of local models. Moreover, to address the inherent problem in traditional
federated learning frameworks—disparities in client contributions, leading to
suboptimal model quality and prolonged training times—this research introduces an
innovative weighted strategy based on multi-class F1 scores. This strategy assigns
higher weight to high-quality local clients, thereby enhancing both model quality and
training speed. Experiments were conducted on two authentic bearing datasets, and
the results demonstrate that the proposed method can achieve an average reduction of
approximately 15% in training iteration times compared to the federated averaging
algorithm, coupled with an average enhancement of about 5% in fault diagnosis
accuracy. The experimental results indicate that the proposed method exhibits
outstanding accuracy and robustness.
0000 Journal of Mechanical Science and Technology 00 (0) (2020)
G. Bell et al. / Journal of Mechanical Science and Technology 23 (2009) 1261~1269
2. Proposed method
2.1 Problem definition
In this study, the clients are denoted as $N_i = \{n_i\}$, $i = 1, 2, \ldots$, and each client acquires its own supervised dataset $D_i = \{d_i\}$, $i = 1, 2, \ldots$. Each client aims to construct a fault diagnosis model by integrating all the available data. Conventionally, the global model $M_{all}$ is trained by pooling the data of all $D_i$. However, this approach is unsuitable for the present scenario, where the server cannot access the clients' data.
To address this issue, we aim to develop a global fault diagnosis model $M_{fed}$ that applies to all clients $N_i$. To achieve this, the locally trained models of the different clients are communicated to the server, which aggregates them to construct $M_{fed}$. It is assumed that $M_{fed}$ has a validation set, and the validation results are used to adjust the model aggregation. By constructing a central fault diagnosis model, individual clients can learn from the fault information of other participants while ensuring data privacy.
Algorithm 1 FedAvg.
Function ClientUpdate($i$, $w$)  // Run on local clients
    while iter < max_iter do
        // Receive the initial model $w_{init}$ from the server
        for each local epoch $e$ from 1 to $E$ do
            for each batch $b \in B$ do
                $w \leftarrow w - \alpha \nabla \ell(w; b)$
            end for
        end for
        return $w$ to the server
    end while
Function ServerAggregation  // Run on the server node
    Initialize $w_{init}$
    for each round $t = 1, 2, \ldots$ do
        $m \leftarrow \max(C \cdot K, 1)$
        $S_t \leftarrow$ (random set of $m$ clients)
        for each participant $i \in S_t$ in parallel do
            $w_{t+1}^{i} \leftarrow$ ClientUpdate($i$, $w_t$)
        end for
        $w_{t+1} \leftarrow \sum_{i=1}^{K} \frac{n_i}{n} w_{t+1}^{i}$
    end for

The federated averaging algorithm, used in the model aggregation stage, requires a consistent index for weighting the individual clients. However, the original algorithm does not consider the data disparities between clients and relies solely on the volume of data each client owns. To improve the traditional FedAvg, accuracy has been incorporated into the model aggregation weights, but the performance of federated learning did not improve significantly [22]. When a large number of clients holding different amounts and types of data take part in federated learning, the algorithm's performance is affected by the number and types of faults held by each client. Thus, to improve the weighting strategy of the conventional federated averaging algorithm, factors such as classification accuracy, recall, and class scarcity should be considered. The F1-score is frequently used to balance precision and recall in binary classification and is defined as the harmonic mean of precision and recall. However, fault data in industrial practice are often multiclass, which makes the binary F1-score difficult to apply directly.
In this paper, a novel approach is proposed to tackle these challenges. Considering that real industrial fault data frequently entail multiclass classification, this paper integrates the Macro-F1 (MF1) metric into the model's weight aggregation strategy. Based on Eqs. (2)-(4), the Macro-F1 score is incorporated into the weights of model aggregation.

$$\mathrm{Precision}_{macro} = \frac{\sum_{i=1}^{n} \mathrm{Precision}_i}{n} \qquad (2)$$

$$\mathrm{Recall}_{macro} = \frac{\sum_{i=1}^{n} \mathrm{Recall}_i}{n} \qquad (3)$$

$$F1^{i}_{macro} = 2 \times \frac{\mathrm{Precision}_{macro} \times \mathrm{Recall}_{macro}}{\mathrm{Precision}_{macro} + \mathrm{Recall}_{macro}} \qquad (4)$$

In Eqs. (2) and (3), $n$ is the number of fault classes and the summation index $i$ runs over the classes, whereas the superscript $i$ in Eq. (4) denotes the client.
In this study, the Macro-F1 score of each client is calculated from the precision and recall averaged over all categories. However, the precision and recall of certain fault classes may be lower because of smaller sample sizes, which can occur when these classes are scarce and only a few clients possess them. As a result, the Macro-F1 scores of the clients holding these classes may also be lower and, compared with the traditional federated averaging algorithm, the weights of these clients decrease faster. Conversely, high Macro-F1 scores were consistently observed during the federated learning iterations for fault types with sufficient data.
Based on Eq. (5), the weights of model aggregation are affected by the Macro-F1 scores. An improved federated averaging algorithm, MF1-FedAvg, is proposed based on this weighting strategy.

Algorithm 2 MF1-FedAvg.
Function ClientUpdate($i$, $w$)  // Run on local clients
    while iter < max_iter do
        // Receive the initial model $w_{init}$ from the server
        for each local epoch $e$ from 1 to $E$ do
            for each batch $b \in B$ do
                $w \leftarrow w - \alpha \nabla \ell(w; b)$
            end for
        end for
        Calculate the MF1 score for each client
        return MF1, $w$ to the server
    end while
Function ServerAggregation  // Run on the server node
    Initialize $w_{init}$
    for each round $t = 1, 2, \ldots$ do
        $m \leftarrow \max(C \cdot K, 1)$
        $S_t \leftarrow$ (random set of $m$ clients)
        for each participant $i \in S_t$ in parallel do
            $w_{t+1}^{i} \leftarrow$ ClientUpdate($i$, $w_t$)
        end for
        $w_{t+1} \leftarrow \frac{\sum_{i=1}^{k} n_i \, MF1_i \, w_{t+1}^{i}}{\sum_{i=1}^{k} n_i \, MF1_i}$
    end for
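As an illustration of Eqs. (2)-(4), the following minimal Python sketch computes a client's Macro-F1 score from its true and predicted labels. The function name `macro_f1` and the convention that a class with no predictions or no samples scores zero are assumptions made for this example, not details given in the paper. Note that this definition takes the F1 of the macro-averaged precision and recall, which can differ from averaging the per-class F1 scores.

```python
import numpy as np


def macro_f1(y_true, y_pred, num_classes):
    """Macro-F1 following Eqs. (2)-(4): F1 of macro-averaged precision and recall."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precisions, recalls = [], []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        # Assumption: an empty denominator contributes a score of 0.
        precisions.append(tp / (tp + fp) if tp + fp > 0 else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn > 0 else 0.0)
    p_macro = float(np.mean(precisions))  # Eq. (2)
    r_macro = float(np.mean(recalls))     # Eq. (3)
    if p_macro + r_macro == 0:
        return 0.0
    return 2 * p_macro * r_macro / (p_macro + r_macro)  # Eq. (4)
```

A client whose scarce classes are poorly recognized obtains lower per-class precision and recall for those classes, which lowers both macro averages and therefore its Macro-F1 score.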
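The server-side aggregation step can be sketched as follows, contrasting the sample-count weights of FedAvg with the Macro-F1-scaled weights of MF1-FedAvg. This is a minimal sketch under stated assumptions: the helper names `aggregation_weights` and `aggregate` are hypothetical, and model parameters are represented as dictionaries of NumPy arrays rather than as an actual MSRANet.

```python
import numpy as np


def aggregation_weights(sample_counts, mf1_scores=None):
    """Normalized client weights.

    Without mf1_scores this gives the FedAvg weights n_i / n; with them,
    each n_i is scaled by the client's Macro-F1 score before normalizing.
    """
    raw = np.asarray(sample_counts, dtype=float)
    if mf1_scores is not None:
        raw = raw * np.asarray(mf1_scores, dtype=float)
    total = raw.sum()
    if total <= 0:
        raise ValueError("at least one client must have a positive weight")
    return raw / total


def aggregate(client_params, weights):
    """Weighted average of per-client parameter dicts {name: ndarray}."""
    names = client_params[0].keys()
    return {
        name: sum(w * params[name] for w, params in zip(weights, client_params))
        for name in names
    }


# Two hypothetical clients with equal data volume but different Macro-F1 scores.
clients = [{"w": np.array([1.0, 1.0])}, {"w": np.array([3.0, 5.0])}]
fedavg_w = aggregation_weights([100, 100])
mf1_w = aggregation_weights([100, 100], mf1_scores=[0.9, 0.3])
global_params = aggregate(clients, mf1_w)
```

With equal sample counts, FedAvg weights both clients equally, whereas the Macro-F1-scaled weights shift the aggregate toward the client with the higher score.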
$$w_{t+1} = \frac{\sum_{i=1}^{k} n_i \, F1^{i}_{macro} \, w_{t+1}^{i}}{\sum_{i=1}^{k} n_i \, F1^{i}_{macro}} \qquad (5)$$

Moreover, lower model quality increases the number of model aggregation iterations, which consumes more model transmission time. The proposed aggregation weighting strategy assigns more weight to high-quality clients. Consequently, the quality of the federated server model improves and the number of iterations needed for convergence decreases, which in turn reduces model transmission time and accelerates the algorithm's convergence.
The proposed federated learning scheme for bearing fault diagnosis is depicted in Fig. 4. The procedure commences with the initialization of the model weights, which are then distributed to each participating client. Subsequently, each client performs local training with the MSRANet and forwards the trained model and its Macro-F1 score to the server. The server assigns a weight to each client based on its Macro-F1 score to obtain a new aggregated model. The updated model is then broadcast to the clients, and this iterative process continues until the maximum number of iterations is reached. Notably, the integrated test data, composed of the individual clients' data, are used for model testing to ensure data confidentiality.
Fig. 4. Overall flow chart of the proposed federated learning scheme for the bearing fault diagnosis scenario.
3. Experimental study
3.1 Data descriptions
3.1.1 Dataset1: Case Western Reserve University bearing dataset
The Electrical Engineering Laboratory of Case Western Reserve University [23] has released an open dataset known as the "rolling bearing dataset," which is available to the research community for experimentation and analysis. The dataset was generated using the experimental platform shown in Fig. 5, and faults were introduced using electric discharge machining (EDM). The sampling frequency was 12 kHz, and the data cover four mechanical health states: healthy, outer ring failure, inner ring failure, and ball failure. Each failure state was further classified into three degrees of severity, i.e., mild, moderate, and severe, based on damage diameters of 0.007 inches, 0.014 inches, and 0.021 inches, respectively. Together with the healthy state, this yields ten machine operating conditions to be diagnosed. The data distribution for the different failure types is presented in Table 1, where the bearing failure types are represented by 1-9 and the normal type is represented by 0. The dataset comprises 10,000 data points, with 1000 data points for each type. The dataset has been divided into 8000 training samples and 2000 test samples to ensure the experiment's credibility.
Fig. 5. Rolling bearing failure experimental device.
3.1.2 Dataset2: Jiangnan University bearing dataset
To further assess the efficacy of the proposed approach, the bearing dataset from Jiangnan University [24] was employed as Case 2 for additional verification. This dataset was procured from the fault diagnosis test bench of the centrifugal fan system with rolling bearings at Jiangnan University.
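For the CWRU split described above (ten classes with 1000 samples each, divided into 8000 training and 2000 test samples), one possible way to obtain such a partition is a per-class 80/20 split. The paper does not state how the split was drawn, so the stratification, the random seed, and the function name `stratified_split` below are assumptions for illustration only.

```python
import numpy as np


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so every class keeps the same train/test ratio."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        # Shuffle the indices of class c, then cut off the test portion.
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_test = int(round(test_fraction * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)


# Placeholder labels: class 0 is normal, classes 1-9 are the fault types.
labels = np.repeat(np.arange(10), 1000)
train_idx, test_idx = stratified_split(labels)
```

Splitting per class keeps 800 training and 200 test samples for every condition, so no fault type is underrepresented in the test set.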
Fig. 10. The confusion matrix of the federated MSRANet models under two
real datasets.
Fig. 11. The comparison of testing accuracy between two datasets under FedAvg and the proposed method.
Fig. 12. Comparison of accuracy trends during iterative training on two datasets.
Fig. 13. Effect of sample size on model performance for different cases of two data sets.
[...] based on the client's MF1 score, outperformed the conventional FedAvg algorithm in diagnostic accuracy. This finding suggests that modifying the weighting strategy of the FedAvg algorithm to allocate a larger weight to clients with high-quality data can effectively enhance fault diagnosis accuracy within the FedAvg framework. The proposed method attained a reasonably high diagnostic accuracy owing to the relatively clean nature of the CWRU dataset.
Next, the performance of the proposed method was evaluated using the Jiangnan University dataset, which contains noisier signals than the CWRU dataset. Fig. 11(b) shows the experimental results, indicating that the MF1-FedAvg algorithm still outperforms FedAvg in diagnostic accuracy despite the noise. However, the achieved accuracy is lower than that on the CWRU dataset, which is expected given the higher noise level of the Jiangnan University dataset.
As shown in Fig. 12, the MF1-FedAvg algorithm requires significantly fewer iterations to converge than the traditional FedAvg algorithm on the two real datasets examined. Specifically, the number of iterations is reduced by approximately 15% relative to traditional FedAvg. This reduction can be attributed to the modified weighting strategy. The MF1-FedAvg algorithm employs a weighting strategy based on the multiclass F1 score, which assigns higher weights to clients with higher-quality data during each round of federated learning. Consequently, the model obtained from a single aggregation is of higher quality, which reduces the number of iterations required to reach convergence.
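A reduction in iterations of this kind can be quantified by counting the rounds each method needs to reach a target accuracy and taking the relative difference. The sketch below uses made-up accuracy curves purely to show the arithmetic; the values are not results from the paper, and the function names are hypothetical.

```python
def rounds_to_target(accuracy_per_round, target):
    """Return the first round (1-based) whose accuracy reaches the target."""
    for round_idx, acc in enumerate(accuracy_per_round, start=1):
        if acc >= target:
            return round_idx
    return None  # the target was never reached


def relative_reduction(baseline_rounds, improved_rounds):
    """Fraction of rounds saved relative to the baseline."""
    return (baseline_rounds - improved_rounds) / baseline_rounds


# Illustrative (synthetic) test accuracies per communication round.
fedavg_curve = [0.60, 0.70, 0.80, 0.85, 0.90, 0.93, 0.95]
mf1_curve = [0.65, 0.78, 0.88, 0.91, 0.94, 0.95, 0.96]

r_fedavg = rounds_to_target(fedavg_curve, target=0.90)
r_mf1 = rounds_to_target(mf1_curve, target=0.90)
saving = relative_reduction(r_fedavg, r_mf1)
```

For instance, a baseline that needs 40 rounds and an improved method that needs 34 rounds correspond to a reduction of (40 - 34) / 40 = 15%.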
Fig. 15. MSRANet model visualization results of dataset 2.
The fault diagnosis confusion matrix and t-SNE visualization results for dataset 2 are shown in Fig. 15. The experimental results in Fig. 15 show that the MSRANet model retains good fault classification and diagnosis ability on dataset 2, which indicates that the proposed MSRANet model generalizes well. Therefore, the model has reference significance for the fault diagnosis of rolling bearings.

4. Conclusions
In this paper, we introduce a novel federated learning algorithm, denoted as MF1-FedAvg, built upon the established Federated Averaging (FedAvg) algorithm. The proposed approach aims to mitigate issues associated with low-quality client data in traditional federated learning methodologies. This is achieved through the integration of a weighting strategy that incorporates multi-class F1 scores.

Acknowledgments
The research work is supported by the National Natural Science Foundation of China, grant number 62001262, and the Natural Science Foundation of Shandong Province, grant number ZR2020QF008.

Nomenclature
$M_{all}$ : Global fault diagnosis model
$M_{fed}$ : Global federated learning fault diagnosis model
C : Channel
MLP : Multilayer perceptron
r : Reduction rate
F : Feature map
b : Batch size
$\alpha$ : Learning rate
$S_t$ : Randomly chosen subset of clients
E : Epoch
$w$ : Model parameters

References
[1] H. T. Shi, L. Guo, S. Tan, X. T. Bai and J. Sun, Rolling Bearing Initial Fault Detection Using Long Short-Term Memory Recurrent Network, IEEE Access, 7 (2019) 171559-171569.
[2] J. Li, X. Li and D. He, A Directed Acyclic Graph Network Combined With CNN and LSTM for Remaining Useful Life Prediction, IEEE Access, 7 (2019) 75464-75475.
[3] Q. Liu and C. Huang, A Fault Diagnosis Method Based on Transfer Convolutional Neural Networks, IEEE Access, 7 (2019) 171423-171430.
[4] K. Bonawitz, F. Salehi, J. Konečný, B. Mcmahan and M. Gruteser, Federated Learning with Autotuned Communication-Efficient Secure Aggregation, 2019 53rd Asilomar Conference on Signals, Systems, and Computers, (2019).
[5] T. Han, D. Jiang, Q. Zhao, L. Wang and K. Yin, Comparison of random forest, artificial neural networks and support vector machine for intelligent diagnosis of rotating machinery, Transactions of the Institute of Measurement and Control, 40 (8) (2017) 2681-2693.
[6] T. Han, D. Jiang, Y. Sun, N. Wang and Y. Yang, Intelligent Fault Diagnosis Method for Rotating Machinery via Dictionary Learning and Sparse Representation-Based Classification, Measurement, 118 (2018) 181-193.
[7] C. Liu, D. Jiang and W. Yang, Global geometric similarity scheme for feature selection in fault diagnosis, Expert Systems with Applications, 41 (8) (2014) 3585-3595.
[8] W. Zhang, X. Li, X. Jia, H. Ma and X. Li, Machinery fault diagnosis with imbalanced data using deep generative adversarial networks, Measurement, 152 (2019) 107377.
[9] X. Li, W. Zhang, H. Ma, Z. Luo and X. Li, Data alignments in machinery remaining useful life prediction using deep adversarial neural networks, Knowledge-Based Systems, 197 (2020) 105843.
[10] F. Jia, Y. Lei, N. Lu and S. Xing, Deep normalized convolutional neural network for imbalanced fault classification of machinery and its understanding via visualization, Mechanical Systems and Signal Processing, 110 (2018) 349-367.
[11] H. Liu, J. Zhou, Y. Zheng, W. Jiang and Y. Zhang, Fault diagnosis of rolling bearings with recurrent neural network-based autoencoders, ISA Transactions, 77 (2018) 167-178.
[12] M. Zhao, S. Zhong, X. Fu, B. Tang and M. Pecht, Deep Residual Shrinkage Networks for Fault Diagnosis, IEEE Transactions on Industrial Informatics, 16 (7) (2020) 4681-4690.
[13] Z. Zhang, J. Liu, L. Huang, et al., A bearing fault diagnosis method based on semi-supervised and transfer learning, Journal of Beijing University of Aeronautics and Astronautics, 45 (11) (2019) 2291-2300.
[14] H. B. Mcmahan, E. Moore, D. Ramage, S. Hampson and B. Arcas, Communication-Efficient Learning of Deep Networks from Decentralized Data, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, (2017).
[15] Z. Li, Z. Li, Y. Li, J. Tao, Q. Mao and X. Zhang, An Intelligent Diagnosis Method for Machine Fault Based on Federated Learning, Applied Sciences, 11 (24) (2021) 12117.
[16] W. Zhang, X. Li, H. Ma, Z. Luo and X. Li, Federated learning for machinery fault diagnosis with dynamic validation and self-supervision, Knowledge-Based Systems, 213 (1) (2021) 106679.
[17] Q. Wang, Q. Li, K. Wang, H. Wang and P. Zeng, Efficient federated learning for fault diagnosis in industrial cloud-edge computing, Computing, 103 (10) (2021) 2319-2337.
[18] D. Geng, H. He, X. Lan and C. Liu, Bearing fault diagnosis based on improved federated learning algorithm, Computing, 104 (1) (2022) 1-19.
[19] S. H. Gao, M. M. Cheng, K. Zhao, X. Y. Zhang, M. H. Yang and P. Torr, Res2Net: A New Multi-Scale Backbone Architecture, IEEE Transactions on Pattern Analysis and Machine Intelligence, 43 (2) (2021) 652-662.
[20] J. Hu, L. Shen and G. Sun, Squeeze-and-excitation networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2018) 7132-7141.
[21] H. B. Mcmahan, E. Moore, D. Ramage, S. Hampson and B. Arcas, Communication-Efficient Learning of Deep Networks from Decentralized Data, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, (2017).
[22] K. Bonawitz, F. Salehi, J. Konečný, B. Mcmahan and M. Gruteser, Federated Learning with Autotuned Communication-Efficient Secure Aggregation, 2019 53rd Asilomar Conference on Signals, Systems, and Computers, (2019).
[23] W. A. Smith and R. B. Randall, Rolling element bearing diagnostics using the Case Western Reserve University data: A benchmark study, Mechanical Systems and Signal Processing, 64-65 (2015) 100-131.
[24] K. Li, School of Mechanical Engineering, Jiangnan University, (2019). http://madnet.org.
[25] K. He, X. Zhang, S. Ren and J. Sun, Deep residual learning for image recognition, IEEE, (2016).
[26] L. Van der Maaten and G. Hinton, Visualizing data using t-SNE, Journal of Machine Learning Research, 9 (11) (2008).

Author information
Xiuyan Liu received the Ph.D. degree in computer application technology from the Ocean University of China, Qingdao, China, in 2017. She is currently an Associate Professor with the School of Information and Control Engineering, Qingdao University of Technology, Qingdao. Her current research interests include deep learning, mechanical fault diagnosis, and advanced signal processing.
Chunqiu Pang is a Master's student at the School of Information and Control Engineering, Qingdao University of Technology, Qingdao, China. His current research interests include deep learning and federated learning, and their applications in bearing fault diagnosis.
Date: 2024/1/21