Mechanism-Informed Neural Network: An Interpretable Method for Gearbox Impulsive Fault Feature Extraction
This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2024.3503634
This work was supported in part by the National Natural Science Foundation of China under Grants U23A20620, 52075182, 52275111, 52475102 & 52205101. (Corresponding author: Weihua Li, Guolin He)
Yuan Zheng and Chen Zheng are with the School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, China (e-mail: [email protected]; [email protected]).
Weihua Li and Guolin He are with the School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, China, and also with Pazhou Lab, Guangzhou 511442, China (e-mail: [email protected]; [email protected]).
Zhuyun Chen is with State Key Laboratory of Precision Electronic Manufacturing Technology and Equipment, Guangdong University of Technology, Guangzhou 510006, China (e-mail: [email protected]).
Copyright (c) 2024 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected].
GEARBOX is competent in the power transmission of mechanical equipment because of its high efficiency and stable transmission ratio [1], [2]. It is widely applied as the fundamental components of the equipment, such as aero-engine and wind power plant. Operating under harsh conditions, impact-type fault tends to happen on gear and roll-
can be trained with the unsupervised learning strategy. Diverse algorithms can be combined with it, which empowers AE in discriminant feature extraction [14]. Qu et al. [15] constructed a deep sparse AE (DSAE) combining sparse coding to extract features. Yu et al. [3] proposed a one-dimensional residual convolutional AE (1DRCAE) using convolutional layers and skip connection to achieve fault feature extraction. An et al. [16] proposed a mode-decoupling AE to extract fault-related features for machinery fault diagnosis under unknown working conditions. Yuan et al. [17] combine variational AE and optimal transport distance to learn common fault-related features for cross-machine fault diagnosis. However, these methods are still confronted with some typical challenges:
1) Interpretability. The rationale of feature extraction in these networks does not possess physical meaning, which results in dubious output and unreliable performance [18]–[20]. The misdiagnosis and missed diagnosis from these uncontrollable decisions will raise severe consequences. Hence, the black-box attribution becomes a pivotal problem that curbs the networks from further application in industrial scenarios.
Authorized licensed use limited to: The University of Hong Kong Libraries. Downloaded on January 12,2025 at 12:57:24 UTC from IEEE Xplore. Restrictions apply.
© 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See https://www.ieee.org/publications/rights/index.html for more information.
This article has been accepted for publication in IEEE Internet of Things Journal. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2024.3503634
2) Robustness. These networks ignore the existence of harmonic interference of gearboxes. The impulsive features in their outputs are overwhelmed in the harmonics. The periodicity of the IFF is hard to observe. Hence, it is difficult for the networks to extract significant IFF from impact gearbox signals.
The interpretability issue has drawn much attention in research. Relevant studies can be categorized as post hoc analysis and ad hoc analysis [21]. The former can provide proof and credibility for the network decision [22]–[25]. Tang et al. [23] proposed a signal-transformer for fault diagnosis and interpreted the network by attention-map visualization. However, the correctness of the interpretation cannot be rigorously tested and is misleading [26]. To remedy the drawback, ad hoc analysis advocates interpreting the network structure and parameters with human knowledge to harvest an algorithm-transparent model [27]. Resorting to signal-processing techniques has been commonplace in ad hoc analysis [21], [28]–[36]. Wu et al. [29] constructed a multiplication-convolution sparse network to realize gearbox fault feature extraction. Dong et al. [31] developed an interpretable multiscale lifting wavelet contrast network to diagnose faulty gearboxes. Liu et al. [30] integrated wavelet scattering into deep network design to extract the transient features in different working conditions.
However, the above-mentioned interpretable networks still contain limitations as follows:
1) Feature extraction of these networks is disconnected from faults of a specific type, which cannot provide evidence convincing enough to interpret their performance. A well-recognized fact is that mechanical faults of different types generate responses with different features, which are determined by the dynamics characteristics of the mechanical system [37]; and a feature extraction approach established based on fault mechanism will come with solid reliability. However, the signal-processing techniques that the reported methods embed do not have a tight connection with the mechanism of a specific fault type. For example, in the algorithm unrolling network based on sparse coding [21], [34], the dictionaries are interpreted via visualization but fail to be interpreted by fault vibration characteristics, which weakens the output reliability.
2) The above networks rely on supervised learning strategy while realizing feature extraction, which limits their applications. Such strategy demands a vast amount of labeled data [35], [36], which contradicts the industrial situation where the acquired data are generally unlabeled.
Considering the mentioned challenges and limitations, this paper proposes a mechanism-informed neural network (MINN) to achieve reliable gearbox IFF extraction under strong interference background via unsupervised learning. Firstly, the standard AE is modified based on the logic of sparse representation to provide MINN with interpretability. Secondly, a mechanism-informed dictionary is designed with guidance of impulsive fault mechanism. It is embedded to MINN to separate harmonic interferences from the IFF. Finally, a two-stage IFF extraction framework of MINN is established, where the network parameters are optimized with the proposed joint optimization algorithm for robust IFF extraction.
The main contributions of this paper are listed as follows:
1) A novel interpretable MINN is devised for unsupervised IFF extraction. In MINN, the decoder weight matrix serves as the dictionary and the hidden layer output becomes the sparse vector, providing an interpretable approach to combining AE with sparse representation. The structure allows MINN to be trained without label information.
2) MINN incorporates fault mechanism prior, thereby endowing the physical interpretability for IFF extraction. The network parameters possess physical meanings of damping ratio and natural frequency, which gives the IFF extraction of MINN solid reliability.
3) A two-stage IFF extraction framework of MINN with a joint optimization scheme is proposed. The optimization scheme integrates stochastic gradient descent (SGD) with particle swarm optimization (PSO) to accurately recognize the mechanism-informed dictionary.
4) Comparative studies in simulation and experiment are conducted, which have verified the effectiveness of the proposed method. Interpretability analysis is further carried out from the view of impulsive fault mechanism to verify the reliability of MINN.
The remainder of this paper is arranged as follows. In Section II, the theoretical background will be described. The proposed MINN with the IFF extraction is detailed in Section III. Section IV and Section V verify the proposed method using simulation and experiment signals, respectively. Finally, Section VI summarizes the work.
II. THEORETICAL BACKGROUND
A. Impulsive Fault Mechanism
Fig. 1. Simulated impact response of gearbox. (a) Time domain. (b) Amplitude spectrum.
A normal fixed-axis gear pair generates a vibrational response that contains harmonics of gear meshing frequency. Pitting, cracking, broken tooth or other local faults will present on the gear. During the engagement of the faulty tooth, the meshing stiffness fluctuates in large amplitude. That prompts the gear pair to generate an impulsive response. It features in a sinusoidal wave with exponential decay in time domain, as shown in Fig. 1(a). The impulsive response emerges periodically, because the faulty tooth engages once within a rotation of the faulty gear. The impact of each engagement excites the natural modes to different amplitudes. The natural modes are characterized by resonant peaks in the amplitude spectrum of
the IFF, as shown in Fig. 1(b) [2], [37].
Therefore, the fault vibration response x(t) is modeled as [38]
$$x(t) = f(t) + m(t),$$
$$f(t) = \sum_{r=1}^{R} \sum_{s=1}^{S} A_{r,s} \exp\left(-\frac{\zeta_r \omega_{d,r}}{\sqrt{1-\zeta_r^2}}\,(t - \tau_s - sT_n)\right) \sin\big(\omega_{d,r}(t - \tau_s - sT_n)\big) \tag{1}$$
where f(t) and m(t) represent the IFF and the meshing response excited by the meshing force, respectively; ζ and ωd are damping ratio and natural frequency; Ar,s is the amplitude of the rth natural mode response in the sth impact moment; R and S are the total number of natural modes and impulses, respectively; the impact moment is $\tau_s + sT_n$, where $T_n = 1/f_t$ is the impact period and ft is the IFF frequency, equal to the rotation frequency of the shaft on which the impact fault gear is located.
The periodicity of the IFFs is crucial to fault diagnosis, whereas the IFF is submerged in harmonic interference. Feature extraction methods are to solve the problem.
B. Sparse Representation
Sparse representation is effective and interpretable in extracting fault features from signals. Specifically, elements in sparse vector y indicate the contribution of atoms $a \in \mathbb{R}^N$ in the dictionary D to the reconstructed signal, where N is the length of the vectors and $\mathbb{R}$ denotes the real number set. The method calls for the least number of atoms with the reconstruction error minimized, described as the optimization problem [10]
$$\min \|y\|_0 \quad \text{s.t.} \quad \|x - Dy\|_2^2 \le \delta \tag{2}$$
where $\|\cdot\|_0$ is the l0-norm of y that represents the vector sparsity; $\|\cdot\|_2$ is the l2-norm; δ is the tolerance of the approximation error.
By combining with the fault mechanism, He et al. [38] proposed an analytical dictionary whose waveform and atom expression match the fault feature to decompose the impulsive response. However, the dictionary is fixed and cannot be adaptively selected for IFF reconstruction.
C. AE
A standard AE includes the encoder and the decoder which are both composed of fully-connected layer(s) [15]. The encoder extracts the feature vector $y \in \mathbb{R}^N$ from the input signal $x \in \mathbb{R}^N$ and the decoder reconstructs the signal $\hat{x} \in \mathbb{R}^N$ with y. The process is expressed as
$$y = h(xW_e + b_e), \tag{3}$$
$$\hat{x} = yW_d + b_d \tag{4}$$
where h(·) is the activation function; matrixes We, Wd and vectors be, bd are the trainable parameters, denoting the weight and bias of the encoder and the decoder, respectively. Mean square error (MSE) is the loss function L to measure the reconstruction accuracy, which is written as
$$L = \frac{1}{N}\|x - \hat{x}\|_2^2 \tag{5}$$
However, feature extraction in AE owns little interpretability, let alone from perspective of impulsive fault mechanism.
III. THE PROPOSED MINN WITH IFF EXTRACTION
The proposed MINN with IFF extraction includes two stages: pretraining of the extractor and fine-tuning for the refiner with the joint optimization algorithm, respectively, as illustrated in Fig. 2.
A. Stage 1: Pretraining of Extractor
1) Data preprocessing and Structure of Extractor: First, the fault mechanism manifests that the impulsive responses present by the impact period Tn. Hence, slicing the signal into S samples with length Tn ensures only one impulse in each sample. Second, S mechanism-informed auto-encoders (MIAEs) are ensembled as the extractor of MINN. The numbers of the samples and MIAEs are the same, such that each MIAE simultaneously extracts the IFF parameters $\{\omega_d, \zeta, \tau\}$ of the impulses via the joint optimization. The details are mentioned below.
2) Architecture of MIAE: MIAE combines the sparse representation logic with general AE. Fig. 3 illustrates how MIAE
Fig. 2. The MINN with IFF extraction and the architecture of MIAE with joint optimization scheme.
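The standard AE forward pass in (3)-(5) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the tanh activation, the toy length N = 8, the random weight scale, and the helper names `matvec_row`, `ae_forward` and `mse_loss` are all assumptions introduced here.

```python
import math
import random

def matvec_row(x, W):
    """Row vector times matrix: x (length n) times W (n x m) gives a length-m list."""
    n, m = len(W), len(W[0])
    return [sum(x[i] * W[i][j] for i in range(n)) for j in range(m)]

def ae_forward(x, W_e, b_e, W_d, b_d, h=math.tanh):
    """Standard AE: y = h(x W_e + b_e) as in (3), x_hat = y W_d + b_d as in (4)."""
    y = [h(v + b) for v, b in zip(matvec_row(x, W_e), b_e)]
    x_hat = [v + b for v, b in zip(matvec_row(y, W_d), b_d)]
    return y, x_hat

def mse_loss(x, x_hat):
    """Reconstruction loss L = (1/N) * ||x - x_hat||_2^2 as in (5)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

# Toy example (all values hypothetical): a short sinusoid and small random weights.
rng = random.Random(0)
N = 8
x = [math.sin(0.7 * n) for n in range(N)]
W_e = [[rng.gauss(0.0, 0.1) for _ in range(N)] for _ in range(N)]
W_d = [[rng.gauss(0.0, 0.1) for _ in range(N)] for _ in range(N)]
b_e = [0.0] * N
b_d = [0.0] * N

y, x_hat = ae_forward(x, W_e, b_e, W_d, b_d)
loss = mse_loss(x, x_hat)
```

In this picture, the MIAE described in the text would correspond to fixing `b_d` at zero and reading `W_d` as the dictionary and `y` as the sparse vector.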
Fig. 3. The establishment of MIAE. Sparse representation is first embedded into general AE by replacing Wd with dictionary D and interpreting y as sparse vector; then, based on fault mechanism, the impulsive responses ar,s(t) of IFF are used as the IFF atoms of D. Their optimal shapes are determined by the optimized physical parameters, based on the atom expression (7).
First, the decoder weight matrix Wd plays the role of dictionary D; the decoder bias vector is removed, i.e., $b_d = 0$.
where f and fr,s are vectors of the IFF and the modal responses, respectively; amplitudes Ar,s are collated as an amplitude vector $A = (A_{1,1}, A_{1,2}, \ldots, A_{R,S})$. Comparing (8) with (6), A and the matrix of modal responses correspond to the sparse vector y and the dictionary D, respectively. Therefore, ar,s(t) is assigned as the atom, constructing the IFF dictionary
$$D = \{a_{r,s}(t;\ \omega_{d,r}, \zeta_r, \tau_s)\}_{r=1,\,s=1}^{R,\,S}.$$
With the mechanism informed, MIAE gains the fault-related interpretation: the MIAE reconstructs the IFF from the input, by assigning the modal responses of different orders and different impact moments with the sparse vector. However, optimization of the IFF parameters differs from that in general network training and hence needs improvement.
4) Joint Optimization: A joint optimizing algorithm for the parameters of MIAE is formulated, as the process shown in Fig. 2.
Firstly, the encoder parameters $p_{enc} = \{W_e, b_e\}$ and the modal parameters ωd and ζ are updated using SGD, expressed as:
$$p_{enc}^{k+1} = p_{enc}^{k} - \upsilon_{enc}\,\frac{\partial L}{\partial p_{enc}^{k}},$$
$$\omega_{d,r}^{k+1} = \omega_{d,r}^{k} - \upsilon_{\omega}\,\frac{\partial L}{\partial a_{r,s}^{k}}\,\frac{\partial a_{r,s}^{k}}{\partial \omega_{d,r}^{k}}, \tag{9}$$
$$\zeta_{r}^{k+1} = \zeta_{r}^{k} - \upsilon_{\zeta}\,\frac{\partial L}{\partial a_{r,s}^{k}}\,\frac{\partial a_{r,s}^{k}}{\partial \zeta_{r}^{k}}$$
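The chain-rule update in (9) can be illustrated for a single atom. The sketch below is not the authors' code. It assumes that the atom has the damped-sinusoid form of one modal response in (1) with unit amplitude, since the atom expression (7) is not reproduced in this excerpt. It also fixes the amplitude and impact moment, and uses one plain gradient step instead of the Adam optimizer mentioned later. The function names and the toy target signal are hypothetical.

```python
import math

def atom_and_grads(t, omega_d, zeta, tau):
    """Return a(t), da/d(omega_d), da/d(zeta) for one damped-sinusoid atom (assumed form)."""
    u = t - tau
    if u < 0.0:                        # the atom is zero before the impact moment
        return 0.0, 0.0, 0.0
    c = zeta / math.sqrt(1.0 - zeta ** 2)
    env = math.exp(-c * omega_d * u)   # exponential decay envelope
    s, co = math.sin(omega_d * u), math.cos(omega_d * u)
    a = env * s
    da_domega = env * u * (co - c * s)
    dc_dzeta = (1.0 - zeta ** 2) ** -1.5
    da_dzeta = -omega_d * u * dc_dzeta * env * s
    return a, da_domega, da_dzeta

def loss_and_grads(x, ts, amp, omega_d, zeta, tau):
    """MSE between x and amp * atom, with dL/d(omega_d) and dL/d(zeta) via the chain rule."""
    n = len(x)
    loss = g_omega = g_zeta = 0.0
    for xn, t in zip(x, ts):
        a, da_dw, da_dz = atom_and_grads(t, omega_d, zeta, tau)
        err = xn - amp * a
        loss += err * err / n
        dL_da = -2.0 * amp * err / n   # dL/da at this time sample
        g_omega += dL_da * da_dw
        g_zeta += dL_da * da_dz
    return loss, g_omega, g_zeta

# Toy target: one noise-free impulse (hypothetical values).
fs = 20480.0
ts = [k / fs for k in range(256)]
true_w, true_z, tau, amp = 2.0 * math.pi * 1600.0, 0.05, 0.002, 1.0
x = [amp * atom_and_grads(t, true_w, true_z, tau)[0] for t in ts]

omega_d, zeta = 2.0 * math.pi * 1590.0, 0.03   # perturbed initial guess
lr_omega, lr_zeta = 50.0, 1e-3                 # learning rates quoted in Section IV-B
loss_before, g_w, g_z = loss_and_grads(x, ts, amp, omega_d, zeta, tau)
omega_d -= lr_omega * g_w                      # one update of the form in (9)
zeta -= lr_zeta * g_z
loss_after, _, _ = loss_and_grads(x, ts, amp, omega_d, zeta, tau)
```

The very different scales of the two gradients in this toy setting are one way to see why the frequency and damping-ratio learning rates are chosen several orders of magnitude apart.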
After the optimization, the atom parameters of the extractor are obtained as the input of the second stage.
B. Stage 2: Fine-tuning for Refiner
Although the MIAEs of the extractor possess interpretability of IFF extraction, their performance is yet vulnerable to harmonic interference. The second stage of MINN is hence established, including dictionary integration and refiner fine-tuning, as shown in Fig. 2.
The optimized IFF parameter triplets from the extractor are collected for dictionary integration. First, it declares that an atom of the rth modal order is qualified if its IFF parameters satisfy the conditions:
$$|\omega_{d,rq} - \bar{\omega}_{d,r}| \le \delta_{\omega_d}, \quad \zeta_{rq} \ge \delta_{\zeta} \tag{13}$$
where $\delta_{\omega_d}$ and $\delta_{\zeta}$ are the tolerances of ωd,rq and ζrq in the qth sample; $\bar{\omega}_{d,r}$ is the sample average of ωd,rq. The conditions in (13) improve the robustness of MINN: 1) For the frequencies, the average of the rth feature parameters approximates the actual value. The frequencies of the qualified atoms will be less influenced by noise; 2) For the damping ratios, MINN tends to match the harmonics and optimize the damping ratios of the IFF atoms as 0. Setting the bottom line of the damping
IV. SIMULATION ANALYSIS
A. Simulated Impact Fault Signal
The vibration signal of the impact fault gear in a gearbox is generated with (1), with its amplitude spectrum displayed in Fig. 1.
As seen in Fig. 1(b), three natural modes are considered. The natural frequencies and the damping ratios of the three orders are 1600Hz, 2100Hz, 2900Hz and 0.05, 0.04, 0.02, respectively. Their amplitudes are originally set to be 20, 15 and 10 m/s², respectively. Then, each amplitude is perturbed by a random fluctuation between -0.2 and 0.2 times its own value. The signal length T is 0.3s with sampling frequency $f_s = 20480$Hz. The fault feature frequency, which is the rotating frequency of the input shaft fn here, is set to 30Hz. The impact period is $T_n = 1/f_n = 0.033$s. Random fluctuations of $(0.1 \sim 0.5)T_n$ are added to each impact period, so that the sth impact moment is $[s + (0.1 \sim 0.5)]T_n$.
The meshing frequency of the gear is set at 500Hz with three frequency multiples considered. The third multiple of meshing frequency is 1500Hz, adjacent to the natural frequency of the second order $\omega_{d,2} = 1600$Hz. It aims to analyze the influence of the harmonic interference. The additive Gaussian noise is injected into the raw vibration signal.
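The simulated signal described above can be sketched as follows. The modal frequencies, damping ratios, nominal amplitudes, sampling settings and meshing frequency are taken from the text. The meshing amplitude `mesh_amp`, the noise level `noise_std`, the use of a uniform distribution for the random fluctuations, and the indexing of impulses from s = 0 are assumptions made here only so that the sketch runs. The function name `simulate` is hypothetical.

```python
import math
import random

FS = 20480.0                             # sampling frequency, Hz
T = 0.3                                  # signal length, s
F_N = 30.0                               # fault feature (shaft rotating) frequency, Hz
T_N = 1.0 / F_N                          # impact period, s
NAT_FREQS_HZ = [1600.0, 2100.0, 2900.0]  # natural frequencies of the three modes
DAMPING = [0.05, 0.04, 0.02]             # damping ratios of the three modes
AMPS = [20.0, 15.0, 10.0]                # nominal modal amplitudes, m/s^2
MESH_HZ = 500.0                          # meshing frequency; three multiples are used

def simulate(seed=0, mesh_amp=5.0, noise_std=1.0):
    """Return time axis, noisy signal x, IFF f, meshing response m, and impact moments."""
    rng = random.Random(seed)
    n = round(FS * T)                    # 6144 samples
    n_imp = round(T * F_N)               # 9 impulses
    t = [k / FS for k in range(n)]
    # Impact moments [s + (0.1 ~ 0.5)] * T_n, with s counted from 0 (assumption).
    moments = [(s + rng.uniform(0.1, 0.5)) * T_N for s in range(n_imp)]
    f = [0.0] * n
    for tau in moments:
        k0 = math.ceil(tau * FS)         # first sample at or after the impact
        for fr, z, a0 in zip(NAT_FREQS_HZ, DAMPING, AMPS):
            amp = a0 * (1.0 + rng.uniform(-0.2, 0.2))  # +/-20% amplitude fluctuation
            w = 2.0 * math.pi * fr
            decay = z * w / math.sqrt(1.0 - z * z)
            for k in range(k0, n):
                u = t[k] - tau
                f[k] += amp * math.exp(-decay * u) * math.sin(w * u)
    # Meshing response: first three multiples of the meshing frequency.
    m = [sum(mesh_amp * math.sin(2.0 * math.pi * MESH_HZ * h * tk) for h in (1, 2, 3))
         for tk in t]
    # Additive Gaussian noise on top of IFF and meshing response.
    x = [fk + mk + rng.gauss(0.0, noise_std) for fk, mk in zip(f, m)]
    return t, x, f, m, moments

t, x, f, m, moments = simulate(seed=1)
```

Returning `f` and `m` separately from `x` keeps the ground-truth IFF and harmonics available for later evaluation of an extraction result.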
B. Method Preparations
In the extractor of MINN, the signal is sliced to $T f_n = 9$ samples with a length of Tn. In the extractor, the MIAE number S is hence 9. The atom number R of each MIAE is 3, identical to the number of the natural modal orders. The trainable parameters of the encoder are randomly initialized. In the decoder, the range of the natural frequencies is initialized at ±500Hz around the true values. The initial damping ratios are in the range [0.01, 0.05], according to the analyses on gear system modeling [37] and IFF extraction [9]. The impact moment is initialized in the range of 0-Tn with uniform distribution. In the refiner, the selection tolerances of ωd and ζ are ±100Hz and 0.005, respectively.
Five methods are selected for comparison, including DSAE [15], 1DRCAE [3], DNSD [10], nested iterative soft thresholding algorithm network (NISTA-Net) [34] and EK-SVD [11]. Note that the algorithm unrolling structure of NISTA-Net with its fault feature reconstruction process is involved in comparison. Kullback-Leibler divergence [14] is used in DSAE.
Adam optimizer is used for network training. As listed in Table I, υζ of the extractor is set to be $10^{-3}$ in case of divergence referring to (10). Considering that the magnitude order of ωd reaches $10^{5}$, the υω should be at least $10^{3}$ larger than υζ according to (10) and (11). Hence, set υω as 50 for the extractor. To ensure precision, learning rates of the refiner are set to be 10 times less than ones of the extractor. For PSO setups, as displayed in Table I, the inertia weight is set to 0.8 within the range recommended by [39]; population is set to 50 and iteration is 30 to balance the extraction performance and the computation cost. Determination of the other parameters can be referenced in [39]. The extractor and the refiner will be iterated over 200 and 100 epochs, respectively. The compared networks are trained 300 epochs with learning rates identical to υenc of the extractor. The determination of the parameters in EK-SVD is identical to [11].
TABLE I
OPTIMIZATION PARAMETER SETTINGS
MINN can accurately extract the IFF with strong robustness to harmonic interference and noise.
Fig. 4. IFF extraction performance of MINN under -8dB noise. (a) Time domain. (b) Amplitude spectrum. (c) Local view of the IFF in the last impact moment.
The extracted IFFs of DSAE are not significant in Fig. 4, whereas the ones of 1DRCAE, DNSD and NISTA-Net can all be observed merely in Fig. 5(a)-Fig. 7(a), respectively. As shown in Fig. 8, EK-SVD is adept at denoising but fails to extract IFF and only recovers the harmonics, due to the high amplitudes of the simulated harmonics and the limitation of dictionary learning methods. As displayed in Fig. 5(b)-Fig. 7(b), 1DRCAE, DNSD and NISTA-Net catch intact IFFs with less noise. The results indicate that the DNSD, 1DRCAE and NISTA-Net effectively extract the IFF. However, the compared methods still extract the harmonic interference, as the reconstructed three meshing components shown in Fig. 4(b)-Fig. 7(b), making their IFFs less significant than that of MINN.
Fig. 6. IFF extraction performance of 1DRCAE under -8dB noise. (b) Amplitude spectrum.
D. Quantitative Evaluation upon Robustness
To fully investigate the robustness of the proposed method, noise with signal-to-noise ratios (SNRs) varying from -8dB to 4dB is added. The performance of eliminating the interferences, including noise and meshing components, should be considered. A metric, the signal-to-interference-component ratio (SICR), is accordingly conceived for robustness evaluation, calculated by
$$\mathrm{SICR} = 20 \lg \frac{\|f\|_2}{\|\hat{x} - f\|_2}\ \mathrm{dB} \tag{14}$$
where lg denotes the logarithm in base 10. The denominator represents the residual interference in the extracted IFF. SICR indicates the relative intensity between the IFF and the residual. The higher SICR the network yields, the better elimination performance it indicates. Each SNR test is repeated 10 times to reduce randomness. The averaged SICRs are illustrated in Fig. 9.
Fig. 11. MINN performance under different (a) learning rates of frequency or damping ratio, and (b) harmonic levels.
To evaluate the harmonic level, a metric, the signal-to-harmonic ratio (SHR), is set, calculated by
$$\mathrm{SHR} = 20 \lg \frac{\|m\|_2}{\|x\|_2}\ \mathrm{dB} \tag{15}$$
where m represents the harmonic interferences. High SHR means the strong intensity of harmonics in a signal. As shown in Fig. 11(b), when the SHR decreases from 0dB to -21dB, the SICR of MINN ranges in [4.2, 1.8]dB. The result and Fig. 10 confirm the robustness of MINN to the interference.
However, MINN is sensitive to the initialization range of ωd. In Fig. 12(a), when the initialized range of ωd rises from ±10Hz to ±1000Hz around the optimum, the SICR declines. With rough initialization, the reconstruction precision of MINN is affected. Further, the precision of the trained frequency is inspected by using relative error:
$$\varepsilon = \frac{|\omega_d - \tilde{\omega}_d|}{\tilde{\omega}_d} \times 100\% \tag{16}$$
where $\tilde{\omega}_d$ represents the optimal frequency. Recognition results of ωd are displayed in Fig. 12(b). The error with its standard deviation rises drastically. Based on (6)-(8), the reconstructed IFFs disperse and obtain lower SICR.
Fig. 13. IFF extraction performance of MINN-Ext under -8dB noise. (b) Amplitude spectrum.
To verify the benefit of PSO, the impact moments become the learnable parameters of MINN and are trained using SGD. The extracted IFF is shown in Fig. 11(a). Compared with Fig. 3(a), MINN-SGD fails to search out the accurate impact moments and they are trapped in the local optimal positions. The result manifests that well-optimized impact moments are crucial for IFF extraction because the IFF is sparse in time domain and without a well-optimized impact moment, the atom may only learn from the interference instead of the IFF. As is shown in Fig. 11(b), the output cannot cover the modes, turning out to extract the third order of meshing frequency of 1500Hz.
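The evaluation metrics of this section, SICR in (14), SHR in (15) and the relative frequency error in (16), are simple norm ratios and can be sketched as below. The helper names and the toy vectors are hypothetical. The frequency pairs in the last lines use the trained values of 1579.3Hz, 2097.7Hz and 2890.4Hz reported for the simulated signal, against the simulated natural frequencies of 1600Hz, 2100Hz and 2900Hz.

```python
import math

def l2_norm(v):
    """Euclidean norm of a sequence of numbers."""
    return math.sqrt(sum(a * a for a in v))

def sicr_db(f, x_hat):
    """SICR = 20*lg(||f||_2 / ||x_hat - f||_2): true IFF f versus the residual in x_hat."""
    residual = [a - b for a, b in zip(x_hat, f)]
    return 20.0 * math.log10(l2_norm(f) / l2_norm(residual))

def shr_db(m, x):
    """SHR = 20*lg(||m||_2 / ||x||_2): harmonic interference m relative to the signal x."""
    return 20.0 * math.log10(l2_norm(m) / l2_norm(x))

def relative_error_percent(trained, optimal):
    """Relative error of a trained frequency with respect to the optimal one, in percent."""
    return abs(trained - optimal) / optimal * 100.0

# Toy check: a residual ten times smaller than the IFF gives about 20 dB.
sicr = sicr_db([3.0, 4.0], [3.3, 4.4])

# Trained versus simulated natural frequencies, in Hz.
errors = [round(relative_error_percent(w, w0), 2)
          for w, w0 in [(1579.3, 1600.0), (2097.7, 2100.0), (2890.4, 2900.0)]]
# errors == [1.29, 0.11, 0.33]
```

Note that `sicr_db` is undefined for a perfect reconstruction, where the residual norm is zero; a practical implementation would need to guard that case.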
TABLE III
TRAINED MODAL PARAMETERS OF MINN WITH SIMULATED SIGNAL UNDER DIFFERENT NOISE INTENSITIES.
3rd modes are 1579.3Hz, 2097.7Hz and 2890.4Hz with errors of 1.29%, 0.11% and 0.33%, respectively. The optimization results prompt that the mode response of each order is covered [seen in Fig. 3(b)] and the waveform of the impulse is well matched [seen in Fig. 3(c)].
Therefore, the extraction performance becomes interpretable towards the impact fault owing to the impulsive fault mechanism-informed architecture.
V. EXPERIMENTAL VERIFICATION
A. Data Acquisition and Experimental Settings
A single-stage fixed-axis gearbox for the test is shown in Fig. 12(a). The teeth numbers of the input and output gears are 24 and 56, respectively. One tooth on the output gear is cut out to simulate the broken tooth fault, as shown in Fig. 12(b). The rotation speed of the input shaft is constant at 800rpm. The IFF frequency ft is 5.71Hz. The vibrational acceleration signal is acquired from the acceleration sensor with the sampling frequency $f_s = 12.8$kHz and time length $T = 1.56$s. To conduct a fair comparison and ensure the network convergence [34], the signal x undergoes the z-score normalization, expressed as $\tilde{x} = (x - \mu_x)/\sigma_x$, where μx and σx are the average and standard deviation of x, respectively. The normalized signal $\tilde{x}$ is separated as $T f_t \approx 9$ samples to feed $Q = 9$ MIAEs.
Fig. 15. Test rig and the fixed-axis gearbox with impact fault gear.
As has been manifested in Section IV-D, MINN is sensitive to frequency initialization. Well determining the frequency bands is imperative for initialization. However, selecting resonant peaks by hand is time-consuming and the accuracy cannot be ensured. Fast Kurtogram applies adaptive access to locating frequency bands of impact fault responses with little computational burden. Bands of large kurtosis are informative for IFF extraction and fault diagnosis [40].
The Kurtogram of x is shown in Fig. 13. At level 6, three frequency bands are in high spectral kurtosis. The bands [1100, 1200]Hz, [1400, 1500]Hz and [1700, 1800]Hz contain the mainly excited modal responses. In bands higher than 3000Hz, the modes are not considered because the impact-related energy here is in a thin distribution. Therefore, R is 3 and natural frequencies are initialized in the mentioned 3 ranges, respectively. Other initial parameters of MINN, such as learning rates, training epochs and PSO setups are the same as those in Section IV-B.
Fig. 16. Fast kurtogram of the experiment signal.
B. Discriminant Feature Recognition
The extracted IFFs are displayed in Fig. 14. Embedded with the IFF analytical dictionary, MINN grabs the impulsive feature of each period of the output shaft and stands out in suppressing harmonic interferences. In accordance with the simulation situation in Section IV-C: the compared methods catch the harmonic interferences while the periodicity of the impulses is not significant. For example, NISTA-Net and EK-SVD cannot recognize the smaller transient features but reconstruct the harmonics instead, such as the impulse at 0.40s.
The outputs are processed by Hilbert envelope demodulation and the envelope spectra are illustrated in Fig. 15. For DSAE, 1DRCAE and NISTA-Net, the 1st order of the fault feature frequency ft is clear, while other peaks are not located at its multiple orders. The spectra of DNSD and EK-SVD show that there are only several peaks located at the multiple orders, such as the 4th and 5th orders. However, MINN recovers most of the orders and some are even reinforced, such as the 6th and 7th orders, indicating that interferences like noise are separated from the IFF. The fault feature frequency ft with its multiples is an effective and interpretable indicator of the location of the impact fault [38]. Therefore, the recognition of ft or the periodicity ensures that the IFF is discriminant for impact fault diagnosis, which manifests the effectiveness of MINN.
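The experimental settings in Section V-A can be cross-checked with a few lines of arithmetic. This is a sketch under stated assumptions: the fault frequency is taken as the rotation frequency of the output shaft (the shaft carrying the faulty gear), the number of samples is obtained by rounding the product of T and ft, and the z-score uses the population standard deviation. The paper does not state the last two choices. The function names are hypothetical.

```python
import statistics

def fault_frequency_hz(input_rpm, z_in, z_out):
    """Rotation frequency of the output shaft for a single-stage gear pair."""
    return input_rpm / 60.0 * z_in / z_out

def z_score(x):
    """Normalize x to zero mean and unit (population) standard deviation."""
    mu = statistics.fmean(x)
    sigma = statistics.pstdev(x)
    return [(v - mu) / sigma for v in x]

f_t = fault_frequency_hz(800.0, 24, 56)   # 800 rpm input, 24/56 teeth: about 5.71 Hz
n_samples = round(1.56 * f_t)             # T * f_t is about 8.9, rounded to 9 samples

x_norm = z_score([1.0, 2.0, 3.0, 4.0])    # toy signal, hypothetical values
```

The computed `f_t` and `n_samples` agree with the 5.71Hz fault frequency and the 9 MIAEs stated in the text.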
Fig. 18. The envelope spectra of the IFFs extracted by MINN and the compared methods.
TABLE V
RECOGNIZED IMPACT MOMENTS OF THE VIBRATION SIGNAL.
Number 1 2 3 4 5 6 7 8 9
Moments τ (s) 0.0471 0.2212 0.3989 0.5733 0.7471 0.9238 1.0998 1.276 1.4488
C. Computation Time Analysis
The computation time of optimizing different methods in simulation and experiment is carried out, as listed in Table IV. DSAE and 1DRCAE, the two classical unsupervised methods, cost the least optimization time because of their simpler structure. In contrast, EK-SVD and DNSD are time-consuming because the former spends a vast amount of time updating the
VI. CONCLUSION
In this paper, a mechanism-informed network called MINN is established. MINN is composed of the physically interpretable MIAEs in which the decoder weight matrix becomes the IFF dictionary that infuses the prior of impulsive fault mechanism. The forward process is hence to generate the sparse vector to perform sparse representation of the input. The interpretable atom parameters are updated by the proposed joint optimization without label information. Then, the MINN with IFF extraction is established. Comparative studies are conducted in simulation and experiment, showing that
MINN obtains higher SICR and better IFF extraction performance than the compared methods under harmonic interference and noise of different SNRs. The ablation study verifies that the proposed approach enhances harmonic robustness and that the joint optimization scheme is necessary for MINN to locate the impact moment. Inspection of the recognized IFF parameters explains this performance and provides MINN with model interpretability for gearbox IFF extraction.

MINN has limitations. Firstly, each training sample should contain only one impulse, which limits the method's ability to extract the IFF under varying speed conditions. Secondly, MINN is not fully interpretable: how the sparse vector is solved in the encoder is not yet algorithmically transparent. Future work will focus on improving the architecture to handle these problems.

REFERENCES
[1] K. Feng, J. C. Ji, Q. Ni, and M. Beer, “A review of vibration-based gear wear monitoring and prediction techniques,” Mech. Syst. Signal Process., vol. 182, p. 109605, Jan. 2023.
[2] F. Jiang, K. Ding, G. He, Y. Sun, and L. Wang, “Vibration fault features of planetary gear train with cracks under time-varying flexible transfer functions,” Mech. Mach. Theory, vol. 158, p. 104237, Apr. 2021.
[3] J. Yu and X. Zhou, “One-Dimensional Residual Convolutional Autoencoder Based Feature Learning for Gearbox Fault Diagnosis,” IEEE Trans. Ind. Inform., vol. 16, no. 10, pp. 6347–6358, Oct. 2020.
[4] J. Wang, H. Shao, Y. Peng, and B. Liu, “PSparseFormer: Enhancing Fault Feature Extraction Based on Parallel Sparse Self-Attention and Multiscale Broadcast Feedforward Block,” IEEE Internet Things J., vol. 11, no. 13, pp. 22982–22991, Jul. 2024.
[5] X. Xu, X. Huang, H. Bian, J. Wu, C. Liang, and F. Cong, “Total process of fault diagnosis for wind turbine gearbox, from the perspective of combination with feature extraction and machine learning: A review,” Energy AI, vol. 15, p. 100318, Jan. 2024.
[6] J. Zuo, Y. Miao, B. Zhang, and J. Lin, “Cyclostationary Feature Mode Decomposition and Its Application in Fault Diagnosis of Planetary Gearboxes via Built-In Information,” IEEE Sens. J., vol. 24, no. 2, pp. 1129–1139, Jan. 2024.
[7] W. Huang, J. Wang, G. Du, S. Wu, and Z. Zhu, “Balance Sparse Decomposition Method with Nonconvex Regularization for Gearbox Fault Diagnosis,” Chin. J. Mech. Eng., vol. 37, no. 1, p. 107, Sep. 2024.
[8] Z. Feng, Y. Zhou, M. J. Zuo, F. Chu, and X. Chen, “Atomic decomposition and sparse representation for complex signal analysis in machinery fault diagnosis: A review with examples,” Measurement, vol. 103, pp. 106–132, Jun. 2017.
[9] F. Jiang, K. Ding, G. He, and C. Du, “Sparse dictionary design based on edited cepstrum and its application in rolling bearing fault diagnosis,” J. Sound Vib., vol. 490, p. 115704, Jan. 2021.
[10] X. Zhou et al., “A hybrid denoising model using deep learning and sparse representation with application in bearing weak fault diagnosis,” Measurement, vol. 189, p. 110633, Feb. 2022.
[11] J. Li, Z. Wang, Q. Li, and J. Zhang, “An enhanced K-SVD denoising algorithm based on adaptive soft-threshold shrinkage for fault detection of wind turbine rolling bearing,” ISA Trans., vol. 142, pp. 454–464, Nov. 2023.
[12] Z. Chen et al., “A Multi-Source Weighted Deep Transfer Network for Open-Set Fault Diagnosis of Rotary Machinery,” IEEE Trans. Cybern., vol. 53, no. 3, pp. 1982–1993, Mar. 2023.
[13] Y. Xiao, H. Shao, J. Lin, Z. Huo, and B. Liu, “BCE-FL: A Secure and Privacy-Preserving Federated Learning System for Device Fault Diagnosis Under Non-IID Condition in IIoT,” IEEE Internet Things J., vol. 11, no. 8, pp. 14241–14252, Apr. 2024.
[14] Z. Yang, B. Xu, W. Luo, and F. Chen, “Autoencoder-based representation learning and its application in intelligent fault diagnosis: A review,” Measurement, vol. 189, p. 110460, Feb. 2022.
[15] Y. Qu, M. He, J. Deutsch, and D. He, “Detection of Pitting in Gears Using a Deep Sparse Autoencoder,” Appl. Sci., vol. 7, no. 5, p. 515, May 2017.
[16] Z. An, X. Jiang, and J. Liu, “Mode-Decoupling Auto-Encoder for Machinery Fault Diagnosis Under Unknown Working Conditions,” IEEE Trans. Ind. Inform., pp. 1–14, 2024.
[17] S.-Z. Yuan, Z.-H. Liu, H.-L. Wei, L. Chen, M.-Y. Lv, and X.-H. Li, “A Variational Auto-Encoder-Based Multisource Deep Domain Adaptation Model Using Optimal Transport for Cross-Machine Fault Diagnosis of Rotating Machinery,” IEEE Trans. Instrum. Meas., vol. 73, pp. 1–11, 2024.
[18] L. C. Brito, G. A. Susto, J. N. Brito, and M. A. V. Duarte, “Fault Diagnosis using eXplainable AI: A transfer learning-based approach for rotating machinery exploiting augmented synthetic data,” Expert Syst. Appl., vol. 232, p. 120860, Dec. 2023.
[19] G. Chen, P. Wei, H. Jiang, and M. Liu, “Formal Language Generation for Fault Diagnosis With Spectral Logic via Adversarial Training,” IEEE Trans. Ind. Inform., vol. 18, no. 1, pp. 119–129, Jan. 2022.
[20] Z. Chen et al., “Generalized open-set domain adaptation in mechanical fault diagnosis using multiple metric weighting learning network,” Adv. Eng. Inform., vol. 57, p. 102033, Aug. 2023.
[21] Z. Zhao et al., “Model-driven deep unrolling: Towards interpretable deep learning against noise attacks for intelligent fault diagnosis,” ISA Trans., vol. 129, pp. 644–662, Oct. 2022.
[22] Z. Chen et al., “Explainable Deep Ensemble Model for Bearing Fault Diagnosis Under Variable Conditions,” IEEE Sens. J., vol. 23, no. 15, pp. 17737–17750, Aug. 2023.
[23] J. Tang, G. Zheng, C. Wei, W. Huang, and X. Ding, “Signal-Transformer: A Robust and Interpretable Method for Rotating Machinery Intelligent Fault Diagnosis Under Variable Operating Conditions,” IEEE Trans. Instrum. Meas., vol. 71, pp. 1–11, 2022.
[24] M. S. Kim, J. P. Yun, and P. Park, “Deep Learning-Based Explainable Fault Diagnosis Model With an Individually Grouped 1-D Convolution for Three-Axis Vibration Signals,” IEEE Trans. Ind. Inform., vol. 18, no. 12, pp. 8807–8817, Dec. 2022.
[25] H. Wang, Z. Liu, D. Peng, and Y. Qin, “Understanding and Learning Discriminant Features based on Multiattention 1DCNN for Wheelset Bearing Fault Diagnosis,” IEEE Trans. Ind. Inform., vol. 16, no. 9, pp. 5735–5745, Sep. 2020.
[26] W. Cheng et al., “AFARN: Domain Adaptation for Intelligent Cross-Domain Bearing Fault Diagnosis in Nuclear Circulating Water Pump,” IEEE Trans. Ind. Inform., vol. 19, no. 3, pp. 3229–3239, Mar. 2023.
[27] D. Wang, Y. Chen, C. Shen, J. Zhong, Z. Peng, and C. Li, “Fully interpretable neural network for locating resonance frequency bands for machine condition monitoring,” Mech. Syst. Signal Process., vol. 168, p. 108673, Apr. 2022.
[28] H. Wang, Z. Liu, D. Peng, and M. J. Zuo, “Interpretable convolutional neural network with multilayer wavelet for Noise-Robust Machinery fault diagnosis,” Mech. Syst. Signal Process., vol. 195, p. 110314, Jul. 2023.
[29] Q. Wu, X. Ding, L. Zhao, R. Liu, Q. He, and Y. Shao, “An Interpretable Multiplication-Convolution Sparse Network for Equipment Intelligent Diagnosis in Antialiasing and Regularization Constraint,” IEEE Trans. Instrum. Meas., vol. 72, pp. 1–12, 2023.
[30] C. Liu, T. Han, G. Zhang, H. Sun, and X. Shi, “Scattering moment matching-based interpretable domain adaptation for transfer diagnostic tasks,” Neurocomputing, vol. 594, p. 127699, Aug. 2024.
[31] Y. Dong, H. Jiang, X. Wang, M. Mu, and W. Jiang, “An interpretable multiscale lifting wavelet contrast network for planetary gearbox fault diagnosis with small samples,” Reliab. Eng. Syst. Saf., vol. 251, p. 110404, Nov. 2024.
[32] M. Miao and J. Yu, “Sparse-Representation-Network-Based Feature Learning of Vibration Signal for Machinery Fault Diagnosis,” IEEE Trans. Ind. Inform., vol. 19, no. 5, pp. 6706–6716, May 2023.
[33] B. An, S. Wang, F. Qin, Z. Zhao, R. Yan, and X. Chen, “Adversarial Algorithm Unrolling Network for Interpretable Mechanical Anomaly Detection,” IEEE Trans. Neural Netw. Learn. Syst., pp. 1–14, 2023.
[34] B. An, S. Wang, Z. Zhao, F. Qin, R. Yan, and X. Chen, “Interpretable Neural Network via Algorithm Unrolling for Mechanical Fault Diagnosis,” IEEE Trans. Instrum. Meas., vol. 71, pp. 1–11, 2022.
[35] T. Li et al., “WaveletKernelNet: An Interpretable Deep Neural Network for Industrial Intelligent Diagnosis,” IEEE Trans. Syst. Man Cybern. Syst., vol. 52, no. 4, pp. 2302–2312, Apr. 2022.
[36] Z. Shang, Z. Zhao, and R. Yan, “Denoising Fault-Aware Wavelet Network: A Signal Processing Informed Neural Network for Fault Diagnosis,” Chin. J. Mech. Eng., vol. 36, no. 1, p. 9, Jan. 2023.
[37] L. Xu, K. Ding, G. He, Y. Li, and Z. Chen, “Resonance modulation vibration mechanism of equally-spaced planetary gearbox with a localized fault on sun gear,” Mech. Syst. Signal Process., vol. 166, p. 108450, Mar. 2022.
[38] G. He, J. Li, K. Ding, and Z. Zhang, “Feature extraction of gear and bearing compound faults based on vibration signal sparse decomposition,” Appl. Acoust., vol. 189, p. 108604, Feb. 2022.
[39] T. M. Shami, A. A. El-Saleh, M. Alswaitti, Q. Al-Tashi, M. A. Summakieh, and S. Mirjalili, “Particle Swarm Optimization: A Comprehensive Survey,” IEEE Access, vol. 10, pp. 10031–10061, 2022.
[40] B. Deng, G. Yu, T. Lin, and M. Sun, “Fast Cmspogram: An effective new tool for periodic pulse detection,” Mech. Syst. Signal Process., vol. 209, p. 111094, Mar. 2024.

Chen Zheng received his B.S. and M.S. degrees in mechanical engineering from the South China University of Technology, Guangzhou, China, in 2019 and 202, respectively.
His research interests include vibration signal processing and data-based methods for mechanical fault diagnosis.