sensors

Article
KAN-HyperMP: An Enhanced Fault Diagnosis Model for
Rolling Bearings in Noisy Environments
Jun Wang 1 , Zhilin Dong 2, * and Shuang Zhang 3

1 Department of Ocean Engineering, Yantai Institute of Science and Technology, Yantai 265600, China;
[email protected]
2 School of Engineering, Zhejiang Normal University, Jinhua 321004, China
3 School of Computer Science and Technology, Anhui University, Hefei 230601, China; [email protected]
* Correspondence: [email protected]

Abstract: Rolling bearings often produce non-stationary signals that are easily obscured by noise,
particularly in high-noise environments, making fault detection a challenging task. To address
this challenge, a novel fault diagnosis approach based on the Kolmogorov–Arnold Network-based
Hypergraph Message Passing (KAN-HyperMP) model is proposed. The KAN-HyperMP model is
composed of three key components: a neighbor feature aggregation block, a feature fusion block, and
a KANLinear block. Firstly, the neighbor feature aggregation block leverages hypergraph theory to
integrate information from more distant neighbors, aiding in the reduction of noise impact, even when
nearby neighbors are severely affected. Subsequently, the feature fusion block combines the features
of these higher-order neighbors with the target node’s own features, enabling the model to capture
the complete structure of the hypergraph. Finally, the smoothness properties of B-spline functions
within the Kolmogorov–Arnold Network (KAN) are employed to extract critical diagnostic features
from noisy signals. The proposed model is trained and evaluated on the Southeast University (SEU)
and Jiangnan University (JNU) Datasets, achieving accuracy rates of 99.70% and 99.10%, respectively,
demonstrating its effectiveness in fault diagnosis under both noise-free and noisy conditions.

Keywords: fault diagnosis; hypergraph; Kolmogorov–Arnold Network; KAN-HyperMP

Citation: Wang, J.; Dong, Z.; Zhang, S. KAN-HyperMP: An Enhanced Fault Diagnosis Model for Rolling
Bearings in Noisy Environments. Sensors 2024, 24, 6448. https://doi.org/10.3390/s24196448
Academic Editor: Yi Qin
Received: 28 August 2024
Revised: 27 September 2024
Accepted: 4 October 2024
Published: 5 October 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
In the modern industrial sector, the widespread adoption of “smart manufacturing” and
advancements in high-end manufacturing technologies have underscored the importance of
enhancing mechanical equipment health management to achieve system intelligence. Rolling
bearings, which are essential components of many transmission systems, typically operate under
high loads and speeds. Any malfunctions can drastically reduce the efficiency of mechanical
devices, potentially leading to significant economic losses and safety incidents. Therefore,
developing efficient bearing fault diagnosis technologies is crucial, not only for reducing
economic costs, but also for preventing accidents [1].
With the rapid advancement of artificial intelligence technologies, data-driven fault diagnosis
has emerged as a research hotspot, focusing primarily on machine learning and deep learning
methods [2]. Traditional mechanical fault signal processing techniques include analyses in the
time domain, frequency domain, and time-frequency domain. These methods are typically integrated
with machine learning technologies such as multilayer perceptrons (MLPs), support vector machines
(SVMs), and Bayesian estimation, and are well-suited for diagnosing data with distinct features
and straightforward patterns. In contrast, deep learning, an advanced algorithmic approach, offers
robust capabilities for automatic feature extraction. It can process large volumes of data and
reduce reliance on expert knowledge, significantly improving the efficiency and accuracy of fault
diagnosis. Notable deep learning techniques include convolutional neural networks (CNNs) [3–6],
autoencoders (AEs) [7], generative adversarial networks (GANs) [8], and adversarial deep learning
(ADL) [9]. The adoption of these technologies not only introduces a new perspective on mechanical
fault diagnosis but also fosters the advancement of the entire industrial system towards greater
efficiency and safety.
Graph theory models exhibit unique advantages in comprehensively describing fault
characteristic information. To effectively handle graph data, Graph Neural Networks
(GNNs) have emerged as a burgeoning field. Specifically designed for graph signal pro-
cessing, GNNs enable the precise definition of values and connections between nodes,
capturing and analyzing information from a global perspective. Recently, GNN technology
has been applied to fault diagnosis by researchers, in order to deepen their understanding
and address fault issues more effectively. GNNs enhance data extraction and inference
by aggregating information from neighbors at various depths. These networks have
been successfully applied in multiple domains, including physical models [10], chemical
structures [11], social networks [12], natural language processing [13], and image classifi-
cation [14]. For example, Li et al. [15] utilized GNNs to model and analyze graph data,
proposing three graph construction methods, exploring seven types of graph convolution
networks (GCNs), and four different graph pooling methods. They further developed an
intelligent fault diagnosis and predictive diagnosis framework based on GNNs. Addition-
ally, Zhao et al. [16] introduced a semi-supervised graph convolutional deep belief network
and applied it to electromechanical system fault diagnosis, which achieved significant
diagnostic results, even with limited labeled samples. These studies, which converted
vibration signals into graph data and utilized GNNs for fault diagnosis, demonstrate the
feasibility and advantages of GNNs in this field.
Graph-based models are becoming a prominent trend in rolling bearing fault diag-
nosis because they effectively capture the relationships between sample data. However,
traditional graph models are limited by their focus on learning pairwise correlations be-
tween adjacent samples, as each edge connects only two nodes, making them inadequate
for capturing the more complex higher-order relationships that are crucial in practical
applications [17]. For instance, during the monitoring of bearing degradation, consecutive
samples are not only interrelated but also collectively reflect the component’s gradual
deterioration. To illustrate the intricate relationships among multiple samples in fault
diagnosis, some researchers have turned to hypergraph structures to represent equipment
monitoring data. Hypergraphs connect multiple nodes through hyperedges, enabling a
more comprehensive depiction of complex relationships among multisample data. Conse-
quently, hypergraphs are used to represent intricate higher-order relationships between
vertices and model complex networks and systems with high-order interactions. Zhang
et al. [18] transformed motion current signals into a hypergraph structure and developed a
Hypergraph GCN (HGCN) to learn the higher-order relationships between nodes for fault
classification. Similarly, Shi et al. [19] transformed vibration signal samples into a hyper-
graph and mined the high-order structural information between samples using HGCN
layers. Yan et al. [20] structured the sample data into multiple hypergraph structures to
better learn the high-order data hidden among the samples. Additionally, Feng et al. [21]
introduced the Hypergraph Neural Network (HGNN), a model that naturally extends the
spectral method of GCN to hypergraphs, and designed corresponding hypergraph convolu-
tion operations. Meanwhile, Yadati et al. [22] developed the HyperGCN model, addressing
semi-supervised classification problems on hypergraphs. These advancements have pro-
moted the application of hypergraph models in fields such as computer vision [23,24],
recommendation systems [25,26], and spatiotemporal forecasting [27,28], achieving signif-
icant success. Notably, in the analysis of bearing monitoring data, utilizing hypergraph
methods to explore high-order relationships between samples offers a new perspective and
methodology for rolling bearing fault diagnosis.
To effectively capture higher-order relationships, Wang et al. [29] introduced T-spectral
convolution, a technique specifically designed for handling complex data structures, with
a particular strength in representing hypergraphs as tensors. By leveraging the multidi-
mensional characteristics of tensors, this method effectively captures complex inter-node
relationships, thereby enhancing the understanding and management of patterns within
multidimensional datasets. T-spectral convolution not only captures higher-order relation-
ships but also clearly articulates the multidimensional relationships of data through its
intuitive tensor representation, making the intrinsic structure and connectivity more appar-
ent. Additionally, it offers significant flexibility for analyzing complex systems involving
various types of interactions. However, T-spectral convolution faces several challenges
in practical applications. Constructing and computing large tensors demands substan-
tial computational resources, especially when dealing with large-scale data, potentially
leading to reduced processing efficiency. Moreover, as data scales increase, the scalability
of T-spectral convolution may become limited, restricting its potential applications on
large-scale datasets.
To address the limitations of T-spectral convolution in handling higher-order relationships,
the KAN-HyperMP model is introduced in this paper. KAN-HyperMP utilizes Mth-order hyperedges
within the hypergraph to capture interactions between target nodes and their neighbors, thereby
enhancing the model’s learning capabilities and prediction accuracy. The model has been validated
on two rolling bearing datasets, demonstrating high fault detection precision even under strong
noise interference. The main contributions of this paper are as follows:
1. An innovative algorithmic framework, KAN-HyperMP, is introduced, specifically de-
signed to manage complex graph structures and high-order data interactions, proving
highly effective in applications such as graph-based rolling bearing fault diagnosis;
2. A neighbor feature aggregation block is designed to utilize hypergraph structures,
enabling the effective management of complex node interactions by defining and
capturing high-order relationships within the hypergraph;
3. A feature fusion block is introduced, integrating node-specific features with those of
their neighbors to provide a comprehensive view of local graph structures, thereby
significantly enhancing prediction accuracy;
4. A KANLinear block, based on the Kolmogorov–Arnold theorem and employing B-spline
functions as activation functions, is introduced to effectively suppress noise, enhancing
the model’s robustness and generalization capabilities in noisy environments.
The rest of this paper is organized as follows. Section 2 introduces the proposed model.
Section 3 presents the experiments and analyzes the effectiveness of the proposed method.
Section 4 summarizes the paper and outlines avenues for future work.

2. Proposed Model
In the task of rolling bearing fault diagnosis, fault samples are unstructured, making
it challenging to construct a hypergraph that can represent the hidden structure within
sample data and across different samples. To address this issue, a hypergraph construction
method capable of capturing the data structure among fault samples is proposed, and a
corresponding neural network is developed based on the constructed hypergraph for fault
identification.

2.1. Hypergraph Construction


Compared to traditional graph structures, hypergraphs are unique in their ability
to connect multiple nodes through hyperedges, facilitating the modeling of higher-order
relationships. A hypergraph G = (V , E ) is defined, where V = {v1 , v2 , . . . , v N } represents
a set of N nodes (or vertices), and E = {e1 , e2 , . . . , eK } comprises K hyperedges. Each
hyperedge ek can be defined as follows:

ek = {vi | vi ∈ V and vi is part of hyperedge ek }, k = 1, 2, . . . , K (1)


In a hypergraph G , the maximum edge cardinality m.c.e(G) indicates the maximum
number of nodes contained in any hyperedge, mathematically defined as M:

$$M = \max_{e_k \in \mathcal{E}} |e_k| \tag{2}$$

Hypergraphs depict the connectivity between nodes through an incidence matrix
$H \in \mathbb{R}^{|\mathcal{V}| \times |\mathcal{E}|}$. In this matrix, each element $H(v, e)$ is defined as follows:

$$H(v, e) = \begin{cases} 1, & \text{if } v \in e \\ 0, & \text{if } v \notin e \end{cases} \tag{3}$$

This implies that when a node v in the hypergraph is associated with a hyperedge e,
the corresponding element in the matrix is 1; otherwise, it is 0.
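As a minimal sketch of Equations (2) and (3), the following Python snippet builds an incidence matrix from a list of hyperedges and reads off the maximum edge cardinality M. The helper names and the three-node toy hypergraph are illustrative assumptions, not part of the paper.

```python
from typing import List, Set


def incidence_matrix(num_nodes: int, hyperedges: List[Set[int]]) -> List[List[int]]:
    """Return H with H[v][k] = 1 if node v belongs to hyperedge k, else 0 (Equation (3))."""
    return [[1 if v in e else 0 for e in hyperedges] for v in range(num_nodes)]


def max_edge_cardinality(hyperedges: List[Set[int]]) -> int:
    """Return M, the largest number of nodes in any hyperedge (Equation (2))."""
    return max(len(e) for e in hyperedges)


# Hypothetical toy hypergraph with 0-based node indices:
# e1 = {v1, v2} and e2 = {v1, v2, v3}.
edges = [{0, 1}, {0, 1, 2}]
H = incidence_matrix(3, edges)
M = max_edge_cardinality(edges)
print(H)  # [[1, 1], [1, 1], [0, 1]]
print(M)  # 3
```

Rows index nodes and columns index hyperedges, so the column sums give the hyperedge cardinalities.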
The above metrics reflect the fundamental structural features of the hypergraph, crucial
for the analysis and processing of datasets based on hypergraphs. The quality of hypergraph
construction significantly impacts model training and the accuracy of fault diagnosis, as all
HGNNs use the hypergraph, specifically the incidence matrix H, to capture information
between nodes (samples). Therefore, constructing a hypergraph is a critical step in using
HGNNs for fault diagnosis tasks. However, commonly used datasets in fault diagnosis,
such as SEU, JNU, CWRU, etc., do not provide explicit hypergraph structural information,
as there are no clear connections among the samples in these datasets. Consequently, it
becomes necessary to manually design a hypergraph structure that can accurately reflect
the relationships between different sample signals within these datasets.
From the initial vibration signals, $X = \{X_1, X_2, X_3, \ldots, X_n\}$, we resample each sample
signal feature using a predefined set of sampling frequencies $R = \{r_1, r_2, r_3, \ldots, r_m\}$. Here,
$r_1$ serves as the base sampling frequency, with subsequent frequencies defined as $r_2 = \frac{1}{2} r_1$,
$r_3 = \frac{1}{4} r_1$, and so on, until $r_m = \frac{1}{2^{m-1}} r_1$. The resampled results for all original samples are
generated by this method:

$$
\begin{cases}
X^{r_1} = \{X_1^{r_1}, X_2^{r_1}, \ldots, X_n^{r_1}\} \\
X^{r_2} = \{X_1^{r_2}, X_2^{r_2}, \ldots, X_n^{r_2}\} \\
\quad\vdots \\
X^{r_m} = \{X_1^{r_m}, X_2^{r_m}, \ldots, X_n^{r_m}\}
\end{cases} \tag{4}
$$

To more precisely capture signal characteristics at different time points, we apply
sliding window resampling to the signals $X^{r_1}, X^{r_2}, \ldots, X^{r_m}$. This approach enables the
extraction of local features from the continuous signals, providing the model with coherent
and comprehensive temporal feature data.
The newly acquired signal feature data then undergo Min–Max Normalization to
ensure the numerical stability of the model calculations and to mitigate errors due to
large or small numerical ranges. After normalization, a Fast Fourier Transform (FFT) is
performed to convert the signals into the frequency domain. The processed results are as
follows:

$$
\begin{cases}
X^{r_1}_{\mathrm{norm},f} = \{X^{r_1}_{1,\mathrm{norm},f}, X^{r_1}_{2,\mathrm{norm},f}, \ldots, X^{r_1}_{n,\mathrm{norm},f}\} \\
X^{r_2}_{\mathrm{norm},f} = \{X^{r_2}_{1,\mathrm{norm},f}, X^{r_2}_{2,\mathrm{norm},f}, \ldots, X^{r_2}_{n,\mathrm{norm},f}\} \\
\quad\vdots \\
X^{r_m}_{\mathrm{norm},f} = \{X^{r_m}_{1,\mathrm{norm},f}, X^{r_m}_{2,\mathrm{norm},f}, \ldots, X^{r_m}_{n,\mathrm{norm},f}\}
\end{cases} \tag{5}
$$
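The preprocessing chain described above (multi-rate resampling, sliding windows, Min–Max normalization, and a frequency transform) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: resampling is done by simple decimation without an anti-aliasing filter, a naive DFT stands in for the FFT, and the window width, step, and number of rates are hypothetical values that the paper does not specify.

```python
import cmath
import math
from typing import Dict, List


def downsample(signal: List[float], factor: int) -> List[float]:
    """Keep every `factor`-th point, i.e. sample at r1 / factor (assumption: no filtering)."""
    return signal[::factor]


def sliding_windows(signal: List[float], width: int, step: int) -> List[List[float]]:
    """Cut a signal into windows of `width` points, advancing by `step` points."""
    return [signal[i:i + width] for i in range(0, len(signal) - width + 1, step)]


def min_max(window: List[float]) -> List[float]:
    """Scale a window to [0, 1]; a constant window maps to all zeros."""
    lo, hi = min(window), max(window)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in window]


def spectrum(window: List[float]) -> List[float]:
    """One-sided DFT magnitudes (naive O(n^2) form standing in for an FFT)."""
    n = len(window)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(window)))
            for k in range(n // 2)]


def preprocess(signal: List[float], num_rates: int = 3,
               width: int = 8, step: int = 4) -> Dict[int, List[List[float]]]:
    """Map rate index j (for r_j = r1 / 2^(j-1)) to the list of window spectra."""
    out = {}
    for j in range(num_rates):
        resampled = downsample(signal, 2 ** j)
        out[j + 1] = [spectrum(min_max(w)) for w in sliding_windows(resampled, width, step)]
    return out


# Hypothetical 64-point test signal.
features = preprocess([math.sin(0.3 * t) for t in range(64)])
print({rate: len(windows) for rate, windows in features.items()})  # {1: 15, 2: 7, 3: 3}
```

Lower sampling rates yield fewer windows for the same window width, which is why the three rates produce feature sets of different sizes here.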

To facilitate the model’s ability to capture inherent connections between samples,
features from samples with identical resampling frequencies and the same fault type are
concatenated. For instance, if $X_1$ and $X_i$ are both classified as having an inner ring fault,
their features are concatenated to form $X_{1,i}^{r_1}, X_{1,i}^{r_2}, \ldots, X_{1,i}^{r_m}$, as shown in Equation (6).

$$
\begin{cases}
X_{1,i}^{r_1} = X^{r_1}_{1,\mathrm{norm},f} \,\|\, X^{r_1}_{i,\mathrm{norm},f} \\
X_{1,i}^{r_2} = X^{r_2}_{1,\mathrm{norm},f} \,\|\, X^{r_2}_{i,\mathrm{norm},f} \\
\quad\vdots \\
X_{1,i}^{r_m} = X^{r_m}_{1,\mathrm{norm},f} \,\|\, X^{r_m}_{i,\mathrm{norm},f}
\end{cases} \tag{6}
$$

Finally, to ensure each sample is accurately classified, the samples obtained through
the above process are vertically stacked, forming a feature matrix $X \in \mathbb{R}^{N \times D}$. The entire
process is illustrated in Figure 1.

Figure 1. The construction process of Feature Matrix.

Additionally, to construct the hypergraph, it is crucial to establish connections between
nodes and define hyperedges. The K-Nearest Neighbors (KNN) algorithm is used to
calculate the Euclidean distances between sample features, forming the incidence matrix
$H \in \mathbb{R}^{N \times M}$, as shown in Figure 2.

Figure 2. The construction process of hypergraph.
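A minimal sketch of KNN-based hyperedge construction is given below. It assumes that each node generates one hyperedge containing itself and its K − 1 nearest neighbors under Euclidean distance, so that every hyperedge has K nodes; the paper does not spell out this detail, and the function name and toy features are hypothetical. The result is an N × K table of node indices with one row per hyperedge; whether this matches the layout of the paper's N × M matrix H is also an assumption.

```python
import math
from typing import List


def knn_hyperedges(features: List[List[float]], k: int) -> List[List[int]]:
    """Build one hyperedge per node: the node plus its k-1 nearest neighbors (assumption)."""
    edges = []
    for i, xi in enumerate(features):
        # Euclidean distance from node i to every other node, nearest first.
        others = sorted((math.dist(xi, xj), j) for j, xj in enumerate(features) if j != i)
        edges.append(sorted([i] + [j for _, j in others[:k - 1]]))
    return edges


# Hypothetical 1-D features forming two well-separated groups.
toy = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.3]]
for node, edge in enumerate(knn_hyperedges(toy, k=3)):
    print(node, edge)
```

With K = 3 every hyperedge contains three nodes, which is consistent with the setting M = 3 reported in the experiments.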

2.2. T-Spectral Convolution


In hypergraphs, a hyperedge that connects multiple nodes can collectively repre-
sent higher-order relationships, such as the collective behaviors or attributes of a node
group. This multi-node relationship is a core feature of hypergraphs and is crucial for
understanding interactions within complex systems. Traditional matrix-based methods,
such as incidence matrices, often fail to adequately represent higher-order relationships by
reducing the hypergraph’s multiway connections to pairwise interactions, leading to a loss
of crucial multiway interaction information originally present in the data.

Building on this analysis, research has introduced the hypergraph T-spectral Convolu-
tion [29], which leverages tensor representations and t-product decompositions to enable
the direct manipulation of hypergraph data in higher dimensions. This approach allows
models to handle higher-order relationships more naturally, overcoming the limitations
of traditional methods that reduce high-order hypergraphs to two-dimensional matrices.
The t-product, a powerful tool for complex algebraic operations, preserves the multidimen-
sional structure of the data, thereby capturing the deep structures and patterns within the
hypergraph. The formula is expressed as follows:

$$Z_s = A_s^{\mathrm{norm}} * X_s * W_s \tag{7}$$

Here, $A_s^{\mathrm{norm}}$ is the normalized adjacency tensor, and $X_s \in \mathbb{R}^{N \times D \times N^{(M-2)}}$ represents
the CNI signal tensor, defined as follows:
Given a feature (or signal) matrix $X \in \mathbb{R}^{N \times D}$, where $N$ is the number of nodes in the
hypergraph and $D$ is the feature dimension for each node, the interaction of all nodes along
the $d$-th dimension ($d = 1, \ldots, D$) is given by

$$\mathrm{CNI}([x]_d) = \underbrace{[x]_d \circ [x]_d \circ \cdots \circ [x]_d}_{(M-1)\ \text{times}} \in \mathbb{R}^{N \times 1 \times N^{(M-2)}} \tag{8}$$

where $\circ$ denotes the outer product (also known as the basic tensor product), and $[x]_d \in \mathbb{R}^{N}$
represents the $d$-th dimensional feature vector of the nodes.
While T-spectral convolution offers numerous theoretical advantages, such as the
ability to process high-order neighbor information, it also faces significant drawbacks,
including high computational complexity and substantial memory requirements. For
instance, in Equation (7), $X_s \in \mathbb{R}^{N \times D \times N^{(M-2)}}$ describes a high-dimensional tensor. While
constructing such a tensor is feasible for small hypergraphs, it becomes impractical for
larger hypergraphs, such as those in this paper, due to computational limitations.
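To make the memory argument concrete, the sketch below counts the entries of the N × D × N^(M−2) signal tensor and builds the CNI tensor of Equation (8) for the special case M = 3, where each feature dimension contributes one outer product of the node vector with itself. The function names and the sizes used in the example are hypothetical and are not taken from the paper's datasets.

```python
from typing import List


def cni_tensor_elements(n: int, d: int, m: int) -> int:
    """Number of entries in the N x D x N^(M-2) signal tensor."""
    return n * d * n ** (m - 2)


def cni_pairwise(x: List[List[float]]) -> List[List[List[float]]]:
    """CNI for M = 3: out[i][d][j] = x[i][d] * x[j][d], an N x D x N tensor."""
    n, dim = len(x), len(x[0])
    return [[[x[i][d] * x[j][d] for j in range(n)] for d in range(dim)] for i in range(n)]


# Tiny example: 2 nodes with 2 features each.
print(cni_pairwise([[1, 2], [3, 4]]))  # [[[1, 3], [4, 8]], [[3, 9], [8, 16]]]

# Hypothetical sizes: 10,000 nodes, 1024 features, M = 3, 4 bytes per float32 entry.
elements = cni_tensor_elements(10_000, 1024, 3)
print(f"{elements * 4 / 1e9:.1f} GB")  # 409.6 GB
```

Because the entry count grows as N^(M−1), even M = 3 is costly at this scale, and each further increase in M multiplies the size by N.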

2.3. Proposed Model


To efficiently capture higher-order neighbor features, similar to hypergraph T-spectral
convolution, while minimizing the computational complexity of high-dimensional tensors,
a novel model called KAN-HyperMP is introduced in this paper. KAN-HyperMP is mainly
divided into three parts: a neighbor feature aggregation block, a feature fusion block, and
a KANLinear block. The overall model is depicted in Figure 3.
Figure 4 illustrates the flowchart for the neighbor feature aggregation block process
when provided with a hypergraph structure. This block first checks if the number of nodes
in a hyperedge equals M. If not, the hyperedge is expanded. Once all hyperedges satisfy
this condition, both the Mth-order neighborhood hyperedge set and the Mth-order
neighborhood of a node are calculated. Subsequently, the node’s high-order neighbor features
are acquired through a concatenation operation. Finally, the feature fusion block processes
these to generate the final feature vector representation. Sections 2.3.1 and 2.3.2 provide
detailed explanations of the neighbor feature aggregation block and the feature
fusion block, respectively.

Figure 3. The architecture overview of our KAN-HyperMP. The raw signal is processed into the
final signal feature matrix X using techniques such as resampling and sliding window sampling.
An incidence matrix H is then constructed using the KNN algorithm, establishing a hypergraph
structure. Based on the hypergraph, the neighbor feature aggregation block extracts information from
high-order neighbor nodes. This information is then integrated with the node’s own information
through the feature fusion block. Finally, feature extraction is completed using the KANLinear block,
facilitating fault diagnosis.

Figure 4. Flowchart of the proposed neighbor feature aggregation block.

2.3.1. Neighbor Feature Aggregation Block


The design of the neighbor feature aggregation block is based on hypergraph theory,
utilizing high-order neighborhood relationships to expand the adjacency information in
traditional graph structures. This method aims to effectively extract and integrate features
from adjacent nodes within the hypergraph, thereby capturing the complex interactions
and relationships between nodes. By processing more complex data structures and un-
derstanding deeper dependencies among nodes, the model’s predictive capabilities and
learning efficiency are significantly enhanced by this block.
Before the working principles of this block are described, two fundamental concepts
in hypergraphs are introduced: the Mth-order neighborhood hyperedge set and the
Mth-order neighborhood. These concepts provide a crucial theoretical foundation for
understanding how the block processes data.
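1. Mth-order neighborhood hyperedge set
In defining the Mth-order hyperedges within a hypergraph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, scenarios are
differentiated based on the number of nodes each hyperedge contains:

$$e^{M} = \begin{cases} \{e\}, & \text{if } |e| = M, \\ \{\mathrm{ext}^{M}(e) \mid |\mathrm{ext}^{M}(e)| = M\}, & \text{if } |e| < M \end{cases} \tag{9}$$

Here, $\mathrm{ext}^{M}(e)$ denotes an expansion of hyperedge $e$ to $M$ nodes. Based on this, an
Mth-order neighborhood hyperedge set can be defined for each node $v$ as follows:

$$\mathcal{E}^{M}(v) = \{e^{M} \mid e \in \mathcal{E},\ v \in e\} \tag{10}$$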

2. Mth-order neighborhood of a node
Building on Equations (9) and (10), the Mth-order neighborhood of a node can be
defined as follows:

$$\mathcal{N}^{M}(v) = \{\mathrm{sort}(e^{M} \setminus \{v\}) \mid e^{M} \in \mathcal{E}^{M}(v)\} \tag{11}$$

where $e^{M} \setminus \{v\}$ denotes the removal of the target node $v$ from the set $e^{M}$ and the sort
function refers to the ordering of the remaining nodes. This structured definition
of neighborhoods offers an effective method for processing and analyzing hypergraph
data, significantly enhancing the model’s comprehension of complex node
relationships.
For instance, consider the simple hypergraph shown in Figure 5a. From Equation (2),
M = 3. Following the definitions above, hyperedge $e_1$ is first expanded to obtain
$\mathrm{ext}^{3}(e_1)$, as shown in Figure 5b. The 3rd-order neighborhood hyperedge set for node $v_1$
is then determined, as shown in Equation (12).

(a) (b)
Figure 5. Construct the 3rd-order neighborhood hyperedge set for node v1 . (a) Hypergraph structure.
(b) Expand hyperedge.

$$\mathcal{E}^{3}(v_1) = \{\mathrm{ext}^{3}(e_1), \{e_2\}\} = \{\{(v_1, v_2, v_1), (v_1, v_2, v_2)\}, \{(v_1, v_2, v_3)\}\} \tag{12}$$

Subsequently, the final 3rd-order neighborhood for $v_1$ can be represented as follows:

$$\mathcal{N}^{3}(v_1) = \{\{\mathrm{sort}(v_2, v_1), \mathrm{sort}(v_2, v_2)\}, \{\mathrm{sort}(v_2, v_3)\}\} \tag{13}$$

Using this method, neighboring nodes within different hyperedges for other nodes
can also be identified. For instance, the 3rd-order neighborhoods for $v_2$ and $v_3$ are as
follows:

$$\mathcal{N}^{3}(v_2) = \{\{\mathrm{sort}(v_1, v_1), \mathrm{sort}(v_1, v_2)\}, \{\mathrm{sort}(v_1, v_3)\}\} \tag{14}$$

$$\mathcal{N}^{3}(v_3) = \{\{\mathrm{sort}(v_1, v_2)\}\} \tag{15}$$
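The worked example in Equations (12)–(15) can be reproduced with a short sketch. It assumes, based only on that example, that expanding a hyperedge to M nodes means enumerating every size-M multiset that contains each of its nodes at least once, and that the neighborhood is obtained by dropping a single copy of the target node. The function names are hypothetical.

```python
from itertools import combinations_with_replacement
from typing import List, Set, Tuple


def expand(edge: Set[int], m: int) -> List[Tuple[int, ...]]:
    """ext^M(e): size-M multisets over e that contain every node of e (assumed reading)."""
    nodes = sorted(edge)
    return [c for c in combinations_with_replacement(nodes, m) if set(c) == set(nodes)]


def neighborhood(v: int, hyperedges: List[Set[int]], m: int) -> List[List[Tuple[int, ...]]]:
    """N^M(v): one group per hyperedge containing v, with one copy of v removed per tuple."""
    result = []
    for edge in hyperedges:
        if v not in edge:
            continue
        group = []
        for multiset in expand(edge, m):
            rest = list(multiset)
            rest.remove(v)  # drops a single occurrence of the target node
            group.append(tuple(sorted(rest)))
        result.append(group)
    return result


# Hypergraph of Figure 5a as described in the text: e1 = {v1, v2}, e2 = {v1, v2, v3}.
edges = [{1, 2}, {1, 2, 3}]
print(neighborhood(1, edges, 3))  # [[(1, 2), (2, 2)], [(2, 3)]]
print(neighborhood(2, edges, 3))  # [[(1, 1), (1, 2)], [(1, 3)]]
print(neighborhood(3, edges, 3))  # [[(1, 2)]]
```

A hyperedge that already has M nodes expands to itself only, which agrees with the first case of Equation (9).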


This structured approach allows us to easily determine high-order neighbors for
each target node within different hyperedges, facilitating the aggregation of features
through a specified algorithm to enhance cross-node interactions. After the above
concepts have been introduced, a detailed explanation of how the neighbor feature
aggregation block performs neighbor feature aggregation will now be provided. The
core of the neighbor feature aggregation block is mainly divided into the following
two steps:
Step 1: High-order neighbors features.
Consider node $v_1$, whose 3rd-order neighborhood is defined as

$$\mathcal{N}^{3}(v_1) = \{\{\mathrm{sort}(v_2, v_1), \mathrm{sort}(v_2, v_2)\}, \{\mathrm{sort}(v_2, v_3)\}\} \tag{16}$$

The neighborhood features for node $v_1$ are

$$F_{\mathcal{N}^{3}_{v_1}} = P_1 \cdot (x_{v_1} \odot x_{v_2}) + P_2 \cdot (x_{v_2} \odot x_{v_2}) + P_3 \cdot (x_{v_2} \odot x_{v_3}) \tag{17}$$

where $x_{v_1}$, $x_{v_2}$, and $x_{v_3}$ are the feature vectors of nodes $v_1$, $v_2$, and $v_3$, respectively, and $P_1$,
$P_2$, $P_3$ correspond to the combinatorial counts from $\mathrm{sort}(\cdot)$. The $\odot$ operation denotes the
Hadamard (element-wise) product along the feature dimension.
Step 2: Hyperedge weights.
Notably, hyperedge $e_1$ includes two nodes, while $e_2$ includes three. To capture the
variation among hyperedges during feature aggregation, a weight for each hyperedge ($W_e$)
is introduced, calculated as follows:

$$W_e = \frac{|e|}{\alpha} \tag{18}$$

where $\alpha = \sum_{i=0}^{|e|} (-1)^{i} \binom{|e|}{i} (|e| - i)^{M}$.
Therefore, the final 3rd-order neighborhood feature for node $v_1$ is as follows:

$$F_{\mathcal{N}^{3}_{v_1}} = W_{e_1} \cdot P_1 \cdot (x_{v_1} \odot x_{v_2}) + W_{e_1} \cdot P_2 \cdot (x_{v_2} \odot x_{v_2}) + W_{e_2} \cdot P_3 \cdot (x_{v_2} \odot x_{v_3}) \tag{19}$$

Repeating this process for all target nodes enables us to obtain neighbor features that
can be extended to the Mth-order, resulting in the final Mth-order neighbor features, as
defined in Equation (20).

$$F_{\mathcal{N}^{M}(v)} = F_{\mathcal{N}^{M}_{v_1}} \,\|\, F_{\mathcal{N}^{M}_{v_2}} \,\|\, F_{\mathcal{N}^{M}_{v_3}} \,\|\, \cdots \,\|\, F_{\mathcal{N}^{M}_{v_N}} \tag{20}$$

where $F_{\mathcal{N}^{M}(v)} \in \mathbb{R}^{N \times D}$ and $\|$ represents the concatenation operation.
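The two aggregation steps can be illustrated for node $v_1$ of the running example. The sketch below evaluates the hyperedge weight of Equation (18) and the weighted sum of Equation (19). The interpretation of the counts P as the number of distinct orderings of each neighbor tuple is an assumption, since the text only calls them combinatorial counts from sort(·); the function names and feature values are likewise hypothetical.

```python
from collections import Counter
from math import comb, factorial
from typing import Dict, List, Tuple


def edge_weight(size: int, m: int) -> float:
    """W_e = |e| / alpha with alpha = sum_i (-1)^i C(|e|, i) (|e| - i)^M (Equation (18))."""
    alpha = sum((-1) ** i * comb(size, i) * (size - i) ** m for i in range(size + 1))
    return size / alpha


def orderings(neighbors: Tuple[int, ...]) -> int:
    """Assumed meaning of P: number of distinct orderings of a neighbor tuple."""
    count = factorial(len(neighbors))
    for repeats in Counter(neighbors).values():
        count //= factorial(repeats)
    return count


def hadamard(vectors: List[List[float]]) -> List[float]:
    """Element-wise product of several feature vectors."""
    out = list(vectors[0])
    for vec in vectors[1:]:
        out = [a * b for a, b in zip(out, vec)]
    return out


def aggregate(groups: List[Tuple[int, List[Tuple[int, ...]]]],
              x: Dict[int, List[float]], m: int) -> List[float]:
    """Weighted sum over all neighbor tuples; groups holds (|e|, tuples) per hyperedge."""
    total = [0.0] * len(next(iter(x.values())))
    for edge_size, tuples in groups:
        w = edge_weight(edge_size, m)
        for nb in tuples:
            term = hadamard([x[u] for u in nb])
            total = [t + w * orderings(nb) * v for t, v in zip(total, term)]
    return total


# Hypothetical 2-D features for v1, v2, v3 and the neighborhood of v1 from Equation (16).
x = {1: [1.0, 2.0], 2: [3.0, 4.0], 3: [5.0, 6.0]}
groups_v1 = [(2, [(1, 2), (2, 2)]), (3, [(2, 3)])]
print(edge_weight(2, 3), edge_weight(3, 3))
print(aggregate(groups_v1, x, m=3))
```

For M = 3 the sum in Equation (18) gives α = 6 for both hyperedges, so the two-node hyperedge receives weight 1/3 and the three-node hyperedge receives weight 1/2.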

2.3.2. Feature Fusion Block


By integrating node-specific features with those of their neighbors, the model transi-
tions from a “micro” to a “macro” perspective. This shift enhances the understanding of
each node’s role and impact within its neighborhood, helping to capture a more compre-
hensive view of the local graph structure. Additionally, integrating these features facilitates
effective fusion through the feature fusion block, defined by the following formula:
$$F_{v,\mathcal{N}^{M}(v)} = \sigma\left[\mathrm{MLP}\left(\mathrm{COMBINE}\left(F_v, F_{\mathcal{N}^{M}(v)}\right)\right)\right] \tag{21}$$

where $F_v \in \mathbb{R}^{N \times D}$ represents the node’s own feature vector, and $\sigma$ denotes the activation
function, with ReLU being the choice in this study. The function COMBINE is defined as
follows:

$$\mathrm{COMBINE}(F_v, F_{\mathcal{N}^{M}(v)}) = \left[F_v \,\|\, F_{\mathcal{N}^{M}(v)}\right] \tag{22}$$

This method involves concatenating features along dimension D, preserving all origi-
nal feature information from the participating nodes and ensuring that both the node’s and
its neighbors’ features are clearly represented in the final feature matrix.
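A minimal sketch of Equations (21) and (22) follows. It concatenates each node's own features with its neighbor features along the feature dimension, applies one dense layer as a stand-in for the MLP, and then ReLU. The single-layer MLP, the weights, and the function names are illustrative assumptions; the paper does not specify the MLP depth here.

```python
from typing import List

Matrix = List[List[float]]


def combine(f_v: Matrix, f_n: Matrix) -> Matrix:
    """COMBINE: concatenate own and neighbor features of each node along dimension D."""
    return [own + nb for own, nb in zip(f_v, f_n)]


def linear(rows: Matrix, weight: Matrix, bias: List[float]) -> Matrix:
    """One dense layer (assumed single-layer MLP): each output is w . row + b."""
    return [[sum(w * x for w, x in zip(w_row, row)) + b for w_row, b in zip(weight, bias)]
            for row in rows]


def relu(rows: Matrix) -> Matrix:
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in rows]


def fuse(f_v: Matrix, f_n: Matrix, weight: Matrix, bias: List[float]) -> Matrix:
    """sigma[MLP(COMBINE(F_v, F_N))] with sigma = ReLU."""
    return relu(linear(combine(f_v, f_n), weight, bias))


# One node with D = 2 own features and D = 2 neighbor features (hypothetical values).
f_v = [[1.0, 2.0]]
f_n = [[3.0, 4.0]]
weight = [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 1.0, 0.0]]  # maps 2D = 4 inputs to 2 outputs
print(fuse(f_v, f_n, weight, bias=[0.0, 0.0]))  # [[0.0, 5.0]]
```

Concatenation doubles the input width of the first dense layer to 2D, which is the price of keeping the node's own features and its neighbor features separate rather than summing them.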

2.3.3. KANLinear Block


Drawing inspiration from the Kolmogorov–Arnold theorem, ref. [30] introduces the KAN,
which differs from traditional neural networks in where its activation functions are placed.
Traditional neural networks apply activation functions at each node, whereas KANs place
them on the edges. Additionally, KANs use B-spline functions as activation functions
because of their strong approximation capabilities, which enhance the network’s ability
to learn and model complex data relationships.
The functional form of the KAN is defined as follows:

$$\phi(x) = w_b\, b(x) + w_s\, \mathrm{spline}(x) \tag{23}$$

where $b(x)$ serves as the basis function, given by $b(x) = \mathrm{silu}(x) = \frac{x}{1 + e^{-x}}$; the spline
function $\mathrm{spline}(x)$ is parameterized as a linear combination of B-splines:

$$\mathrm{spline}(x) = \sum_{i} c_i B_i(x) \tag{24}$$

Owing to the smooth nature of B-spline activation functions, which possess significant
noise suppression characteristics, these functions effectively dampen random fluctuations
in input data, thereby enhancing the network’s stability and predictive accuracy in noisy
environments. In the experimental section, KAN is replaced with a traditional Multilayer
Perceptron (HyperMP-MLP), and a comparative analysis is conducted with the KAN’s
results, further affirming the method’s effectiveness. The overall architecture of the KAN-
HyperMP model is shown in Figure 6.
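The edge activation of Equations (23) and (24) can be sketched with the standard Cox–de Boor recursion for B-spline basis functions. The uniform knot grid, cubic degree, coefficients, and weights below are hypothetical choices for illustration; they are not the settings of the KANLinear block used in the paper.

```python
import math
from typing import List


def bspline_basis(i: int, k: int, t: List[float], x: float) -> float:
    """Cox-de Boor recursion: value at x of the i-th B-spline of degree k on knots t."""
    if k == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = right = 0.0
    if t[i + k] != t[i]:
        left = (x - t[i]) / (t[i + k] - t[i]) * bspline_basis(i, k - 1, t, x)
    if t[i + k + 1] != t[i + 1]:
        right = (t[i + k + 1] - x) / (t[i + k + 1] - t[i + 1]) * bspline_basis(i + 1, k - 1, t, x)
    return left + right


def silu(x: float) -> float:
    """b(x) = x / (1 + exp(-x))."""
    return x / (1.0 + math.exp(-x))


def kan_edge(x: float, w_b: float, w_s: float, coeffs: List[float],
             knots: List[float], degree: int = 3) -> float:
    """phi(x) = w_b * silu(x) + w_s * sum_i c_i * B_i(x)."""
    spline = sum(c * bspline_basis(i, degree, knots, x) for i, c in enumerate(coeffs))
    return w_b * silu(x) + w_s * spline


# Hypothetical uniform grid: 11 knots give 11 - 3 - 1 = 7 cubic basis functions.
knots = [float(v) for v in range(-5, 6)]
coeffs = [0.0, 0.2, -0.1, 0.4, 0.3, -0.2, 0.0]
for value in (-1.0, 0.0, 0.5, 1.5):
    print(value, kan_edge(value, w_b=1.0, w_s=1.0, coeffs=coeffs, knots=knots))
```

Each cubic basis function is non-zero on only four knot intervals, so changing one coefficient reshapes the activation locally. In a trained KAN the coefficients and the two weights would be learned.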

Figure 6. The architecture of KAN-HyperMP.

3. Experiments Description
In this section, the effectiveness of the constructed model is validated using two
open-source bearing fault diagnosis datasets: SEU and JNU. Experiments are conducted on a
server equipped with an Intel(R) Xeon(R) CPU and an NVIDIA L4 GPU. The network
framework is implemented in a PyTorch 2.3.1 and CUDA 12.1 environment. KAN-HyperMP has
a hidden dimension of 256, a combined neighbor feature aggregation block and feature
fusion block, and a single KANLinear block for final feature extraction. For constructing
the incidence matrix with the KNN algorithm, the number of nearest neighbors (K) is set to
3, which accordingly sets the model’s M value to 3. The model training employs a negative
log-likelihood loss function and is optimized using the Adam algorithm with a learning
rate of $1 \times 10^{-3}$ and a weight decay rate of $5 \times 10^{-6}$. In order to evaluate the model’s
performance, the datasets are split into training, validation, and test sets with a ratio of
60%, 20%, and 20%, respectively.
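The 60/20/20 split can be sketched as a shuffled partition of sample indices. The shuffling, the seed, and the function name are assumptions made for illustration; the paper does not state how samples are assigned to each subset.

```python
import random
from typing import List, Tuple


def split_indices(n: int, ratios: Tuple[float, float, float] = (0.6, 0.2, 0.2),
                  seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices 0..n-1 and cut them into train, validation, and test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed keeps the split reproducible
    n_train = round(ratios[0] * n)
    n_val = round(ratios[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 600 200 200
```

The test part takes whatever remains after the first two cuts, so the three parts always cover every sample exactly once even when n is not divisible by five.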

3.1. Datasets Description


Figure 7 shows the JNU testbed, which is composed of a signal recorder, an
accelerometer, and an amplifier. The JNU Dataset is primarily used to validate the generalization
performance and superiority of the proposed diagnostic model. The bearing vibration
signals in this dataset are sampled at a frequency of 50 kHz. The dataset is specifically
designed to focus on single fault types, excluding the diagnosis of composite faults. It
provides comprehensive documentation of four distinct bearing health states: Normal (N),
Inner Race Fault (IB), Outer Race Fault (OB), and Rolling Element Fault (TB), i.e., one
healthy state and three fault types.

Figure 7. The JNU testbed [31].

The SEU Bearing Dataset, obtained from the Dynamic Drive Simulator (DDS), is
tailored specifically for bearing fault diagnosis and learning tasks. The bearing signals
in this dataset are sampled at a frequency of 5120 Hz. Data are gathered under two
operational settings: 20 Hz-0 V and 30 Hz-2 V, encompassing normal and various faulted
conditions. These conditions are categorized into five distinct types: Normal, Ball (defects
on the rolling element), Inner (defects on the inner race), Outer (defects on the outer race),
and Combination (concurrent defects on both the inner and outer races). This dataset is
instrumental for basic bearing fault diagnostics, facilitating transfer learning across different
loading conditions and enabling the analysis of complex combined inner and outer race
faults. It effectively addresses the diverse requirements of fault diagnostics and predictive
maintenance. The SEU testbed is depicted in Figure 8.

Figure 8. The SEU testbed [31].

3.2. Baseline Models


The proposed model is compared with the following six baseline models:
1. GCN [32]: A well-established graph convolutional network for learning on graph-
structured data. Here it analyzes vibration data from bearings to detect potential
fault patterns;
2. GAT [33]: GAT uses a graph structure and an attention mechanism to represent
and analyze relationships among bearing monitoring samples when diagnosing
faults in rolling bearings;
3. HGNN [21]: HGNN employs a clique expansion to generalize convolutions to hyper-
graphs, using the hypergraph Laplacian and Chebyshev polynomials to learn complex
relationships in bearing data;
4. CNN [34]: This model, consisting of one-dimensional convolutional layers and max-
pooling layers, learns patterns and features directly from sensor data, including
vibration signals, for rolling bearing fault diagnosis;
5. LSTM [35]: This model uses stacked LSTM units for time series prediction. By
analyzing bearing time series data, it identifies fault progression trends;
6. HyperGCN [22]: HyperGCN is a refined clique expansion method that enhances the
hypergraph Laplacian with weighted mediators between vertices, enabling efficient
analysis of complex sensor data for fault detection and early diagnosis.

3.3. Experiment Results and Discussion


3.3.1. Demonstration and Analysis without Noise
To reduce the impact of randomness on the experimental results, every model in this
study is run five times under noise-free conditions. Table 1 reports the average accuracy
and F1-score over the five runs on both datasets. The proposed model achieves
classification accuracies of 99.70% on the SEU test set and 99.10% on the JNU test set, the
highest among all compared models. The margin over the strongest baseline is 0.71
percentage points on SEU (HGNN, 98.99%) and 0.08 percentage points on JNU (CNN, 99.02%).
To illustrate the feature extraction ability of the KAN-HyperMP model, Principal
Component Analysis (PCA) is used to project the high-dimensional embedding vectors
learned by the model onto two dimensions, which shows how the samples of the different
categories are distributed. The results of the third experiment are used for this
two-dimensional PCA visualization, as illustrated in Figure 9a,b. In the PCA space, the
model’s outputs form distinct clusters with clear separation between categories,
indicating that the model differentiates well between the sample types in a noise-free
environment.
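
The projection step can be sketched as follows. This is a minimal illustration using NumPy's singular value decomposition, not the authors' code. The function name `pca_2d` is hypothetical, and the 16-dimensional two-class data are synthetic stand-ins for the learned embeddings, which are not available here.

```python
import numpy as np

def pca_2d(embeddings):
    """Project row-wise embeddings onto their first two principal components."""
    x = np.asarray(embeddings, dtype=float)
    x_centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(x_centered, full_matrices=False)
    coords = x_centered @ vt[:2].T
    # Fraction of the total variance captured by each of the two components.
    explained = singular_values[:2] ** 2 / np.sum(singular_values ** 2)
    return coords, explained

rng = np.random.default_rng(0)
# Synthetic stand-in for learned embeddings: two classes in 16 dimensions.
class_a = rng.normal(loc=0.0, scale=0.3, size=(50, 16))
class_b = rng.normal(loc=2.0, scale=0.3, size=(50, 16))
coords, explained = pca_2d(np.vstack([class_a, class_b]))
print(coords.shape, explained.round(3))
```

Plotting `coords` with one color per class would give a scatter plot of the kind shown in Figure 9. Well-separated classes appear as distinct clusters along the first component.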

Table 1. Model comparison using the SEU and JNU Datasets (without noise).

Model         SEU Accuracy   SEU F1-Score   JNU Accuracy   JNU F1-Score
CNN           98.60%         98.60%         99.02%         99.02%
LSTM          98.84%         98.80%         95.57%         94.80%
GCN           98.57%         98.53%         93.60%         93.70%
GAT           92.30%         92.70%         83.80%         83.80%
HGNN          98.99%         98.98%         90.40%         90.30%
HyperGCN      98.94%         98.93%         93.40%         93.20%
KAN-HyperMP   99.70%         99.70%         99.10%         99.10%

Figure 9. A 2D PCA visualization of rolling bearing fault diagnosis on the SEU and JNU Datasets.
(a) SEU Dataset. (b) JNU Dataset.

3.3.2. Demonstration and Analysis under Strong Noise


In the operational environment of mechanical equipment, noise is inevitable. This study
therefore adds Gaussian white noise to the original monitoring data to simulate real-world
conditions and to assess each model’s noise resistance. To explore the limits of this
resistance, noise is added at seven signal-to-noise ratios (SNRs) from −6 dB to 6 dB in
steps of 2 dB, producing vibration signals under each SNR condition. To reduce the effect
of randomness, each model is tested five times under each SNR scenario. Table 2 and
Figure 10a report the average classification accuracies of all models on the SEU Dataset,
and Table 3 and Figure 10b report those on the JNU Dataset.
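
The noise injection can be sketched as follows, assuming the usual definition SNR(dB) = 10 log10(P_signal / P_noise), where P denotes mean power. The paper does not give its exact procedure, so the function names and the 50 Hz test sine are illustrative assumptions. Only the 5120 Hz sampling rate and the seven SNR levels come from the text.

```python
import math
import random

def add_awgn(signal, snr_db, seed=None):
    """Return the signal plus white Gaussian noise scaled to the requested SNR (dB)."""
    rng = random.Random(seed)
    signal_power = sum(s * s for s in signal) / len(signal)
    # Invert SNR(dB) = 10 * log10(P_signal / P_noise) to get the noise power.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical 50 Hz sine sampled at 5120 Hz for 4 s, standing in for a vibration signal.
clean = [math.sin(2 * math.pi * 50 * t / 5120) for t in range(20480)]
for snr in (-6, -4, -2, 0, 2, 4, 6):
    noisy = add_awgn(clean, snr, seed=0)
    print(snr, round(measured_snr_db(clean, noisy), 2))
```

At −6 dB the noise power is about four times the signal power, which is why the text describes the signal features as almost completely masked.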
Notably, at SNR = −6 dB, where noise almost completely masks the original signal
features, the accuracy of the HGNN model drops to 54.70% and 47.80% on the SEU and JNU
Datasets, respectively, which is unsatisfactory. The other graph neural network models
(GCN, GAT, and HyperGCN) reach accuracies between roughly 56% and 63% at SNR = −6 dB,
indicating that they are largely ineffective in this scenario. This may occur because
noise in the initial node features introduces anomalous edges into the graph structure or
propagates through the network via connections in the adjacency or incidence matrices.
Because the nodes are closely connected, incorrect or irrelevant information can quickly
spread to multiple nodes, affecting learning and inference over the entire graph.

Table 2. Rolling bearing fault diagnosis on the SEU Dataset at seven noise levels.

Model         −6 dB    −4 dB    −2 dB    0 dB     2 dB     4 dB     6 dB
CNN           75.02%   79.40%   86.00%   93.20%   74.53%   72.69%   95.51%
LSTM          59.42%   57.05%   65.51%   70.05%   76.13%   83.06%   86.80%
GCN           60.10%   70.50%   76.00%   82.00%   82.70%   85.70%   87.30%
GAT           56.60%   66.60%   73.40%   72.90%   79.80%   81.60%   84.10%
HGNN          54.70%   64.80%   75.10%   78.10%   86.60%   80.00%   82.80%
HyperGCN      63.10%   67.00%   75.30%   81.90%   83.50%   89.00%   87.00%
KAN-HyperMP   81.56%   86.37%   88.50%   90.47%   92.28%   93.69%   95.60%

Table 3. Rolling bearing fault diagnosis on the JNU Dataset at seven noise levels.

Model         −6 dB    −4 dB    −2 dB    0 dB     2 dB     4 dB     6 dB
CNN           76.60%   85.71%   87.39%   94.00%   97.00%   98.40%   98.10%
LSTM          51.23%   64.23%   75.43%   81.36%   87.83%   81.12%   93.51%
GCN           56.20%   69.20%   74.10%   79.80%   85.80%   87.20%   88.50%
GAT           56.00%   65.60%   72.40%   77.90%   80.30%   80.40%   82.90%
HGNN          47.80%   62.10%   65.50%   70.90%   83.40%   80.20%   78.20%
HyperGCN      63.00%   66.00%   71.40%   71.90%   83.40%   85.27%   88.60%
KAN-HyperMP   87.04%   91.76%   94.57%   96.54%   98.08%   98.64%   99.12%

Figure 10. Rolling bearing fault diagnosis accuracies of compared methods at seven noise levels.
(a) Experimental results on the SEU Dataset. (b) Experimental results on the JNU Dataset.

In contrast, traditional CNNs, through the local connectivity of their convolutional
layers, capture local characteristics. Even if part of the input is affected by noise,
the unaffected regions can still provide useful information. As a result, even at
SNR = −6 dB, CNNs maintain approximately 75% accuracy on the SEU and JNU Datasets (75.02%
and 76.60%, respectively). LSTMs, designed for processing sequential data, rely on
capturing long-term dependencies within time series. Noise can introduce errors in the
early stages of the sequence, which may then be transmitted and accumulated through the
recurrent connections in the LSTM units, leading to incorrect learning of long-term
dependencies. Consequently, LSTMs perform poorly at SNR = −6 dB, with accuracies of
59.42% and 51.23% on the SEU and JNU Datasets, respectively.
The KAN-HyperMP model aggregates information from higher-order neighbors, which gives
it a broader perspective. It relies not only on direct neighbors but also gathers features
from a larger range of nodes. This extended view helps capture more complex and deeper
graph structural patterns. For instance, even if a node’s immediate neighbors are heavily
affected by noise, introducing more distant neighbor nodes can dilute the noise’s impact
with more effective information. Additionally, the model’s final part incorporates a
B-spline-based KANLinear layer, which, owing to its smoothness and local support, can
suppress input noise effectively. This helps to preserve essential information at each
network layer while filtering out unnecessary noise. As shown in Figure 11 and in
Tables 2 and 3, KAN-HyperMP achieves 81.56% and 87.04% accuracy on the SEU and JNU
Datasets, respectively, at SNR = −6 dB.
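
The local-support property mentioned above can be illustrated with the standard Cox-de Boor recursion for B-spline basis functions. This is a generic sketch rather than the KANLinear implementation used in the paper. The cubic degree, the uniform knot vector, and the evaluation point are assumptions chosen for illustration.

```python
def bspline_basis(i, k, t, knots):
    """Value at t of the i-th B-spline basis function of degree k (Cox-de Boor)."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    if left_den != 0:
        left = (t - knots[i]) / left_den * bspline_basis(i, k - 1, t, knots)
    right = 0.0
    if right_den != 0:
        right = ((knots[i + k + 1] - t) / right_den
                 * bspline_basis(i + 1, k - 1, t, knots))
    return left + right

degree = 3
knots = [float(j) for j in range(-3, 9)]  # uniform knot vector (assumed)
n_basis = len(knots) - degree - 1
t = 2.5  # any point in the valid domain [0, 5)
values = [bspline_basis(i, degree, t, knots) for i in range(n_basis)]
nonzero = [i for i, v in enumerate(values) if v > 0.0]
print(nonzero, round(sum(values), 12))
```

At any point in the valid domain only `degree + 1` basis functions are nonzero and they sum to one, so a change in one spline coefficient affects the output only on a bounded interval. The sketch shows this structural property only. It does not by itself demonstrate the noise suppression reported in the experiments.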

Figure 11. Rolling bearing fault diagnosis accuracies of KAN-HyperMP at seven noise levels.

We also used confusion matrices to visualize the results of the third experiment on the
two datasets for SNRs from −6 dB to 0 dB. As illustrated in Figure 12a–h, the model
performs noticeably better on the JNU Dataset than on the SEU Dataset at every noise level
shown, and the gap persists as the noise level increases. Within the SEU Dataset, the
classification errors predominantly involve samples labeled 1 and 2.
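
For reference, a confusion matrix and the two reported metrics can be computed as in the sketch below. The label vectors are made up for illustration and are not taken from the experiments. The macro-averaged F1-score is an assumption, since the paper does not state which averaging it uses.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def accuracy(matrix):
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def macro_f1(matrix):
    """Unweighted mean of the per-class F1-scores (assumed averaging mode)."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        predicted = sum(matrix[r][c] for r in range(n))  # column sum
        actual = sum(matrix[c])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / n

# Hypothetical labels for a five-class problem; classes 1 and 2 are confused.
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 3, 4]
y_pred = [0, 0, 1, 2, 1, 2, 1, 2, 3, 4]
cm = confusion_matrix(y_true, y_pred, n_classes=5)
print(cm, accuracy(cm), round(macro_f1(cm), 4))
```

Off-diagonal entries such as `cm[1][2]` and `cm[2][1]` are what reveal that errors concentrate on a specific pair of classes.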
In summary, even under extreme noise conditions, KAN-HyperMP maintained higher
accuracy compared to other models, highlighting its robustness and precision.

Figure 12. Confusion matrices of the proposed method. Results on the SEU Dataset:
(a) SNR = 0 dB, accuracy 92.08%; (c) SNR = −2 dB, accuracy 88.59%; (e) SNR = −4 dB,
accuracy 84.84%; (g) SNR = −6 dB, accuracy 81.94%. Results on the JNU Dataset:
(b) SNR = 0 dB, accuracy 97.49%; (d) SNR = −2 dB, accuracy 95.86%; (f) SNR = −4 dB,
accuracy 92.69%; (h) SNR = −6 dB, accuracy 87.35%.

3.3.3. Ablation Experiments


To investigate the contributions of the neighbor feature aggregation block, the feature
fusion block, and the KANLinear block, ablation experiments are conducted on both datasets.
The variants are as follows:
1. KAN-HyperMP-w/o HP: The Hadamard product is removed from the neighbor feature
aggregation block, which eliminates the capability for cross-node interaction.
2. KAN-HyperMP-w/o FFB: The feature fusion block is omitted, so node features are
merely added to high-order neighbor features without further integration.
3. HyperMP-MLP: The KANLinear block is replaced with a traditional MLP.
As shown in Figure 13a,b, once the Hadamard product is removed, the neighbor feature
aggregation block loses its ability to facilitate cross-node interaction. The Hadamard
product performs element-wise multiplication on the feature vectors of adjacent nodes.
This operation not only merges features between nodes but also strengthens the non-linear
relationships among them, capturing more complex dependencies. Without it, the block can
only combine feature vectors in a basic manner and lacks the interactions needed.
Consequently, fault diagnosis accuracy decreases, as noise interference in the data becomes
more problematic without an effective feature interaction mechanism.
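
The difference between a Hadamard-product aggregation and a plain sum can be seen in a toy example. The sketch below is not the paper's aggregation block, whose exact form is defined in the method section. The function names and feature values are illustrative assumptions.

```python
def hadamard(u, v):
    """Element-wise (Hadamard) product of two equal-length vectors."""
    return [a * b for a, b in zip(u, v)]

def aggregate_product(neighbor_features):
    """Multiply the feature vectors of all neighbors element by element."""
    result = [1.0] * len(neighbor_features[0])
    for feat in neighbor_features:
        result = hadamard(result, feat)
    return result

def aggregate_sum(neighbor_features):
    """Add the feature vectors of all neighbors element by element."""
    return [sum(column) for column in zip(*neighbor_features)]

# Hypothetical two-dimensional features of two neighboring nodes.
features = [[1.0, 2.0], [3.0, 4.0]]
print(aggregate_product(features))
print(aggregate_sum(features))
```

Each entry of the product depends jointly on all neighbors, which is the cross-node interaction the ablation removes. Each entry of the sum is a separate additive contribution per neighbor.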

Figure 13. Fault-diagnosis accuracy of each block in the ablation experiments. (a) Experimental
results on the SEU Dataset. (b) Experimental results on the JNU Dataset.

The feature fusion block enhances the model by merging node features with those
of neighboring nodes, providing deeper insights into node interactions and introducing
non-linear processing. This helps to capture the graph’s structure and node relationships
from a broader, more “macro” perspective. However, without the feature fusion block, the
model merely adds node features to high-order neighbor features in a simplistic manner,
diminishing its ability to distinguish between noise and useful signals.
When the KANLinear block is replaced with a traditional MLP for feature extraction,
the model loses the noise suppression and smoothing capabilities of the B-spline function.
Such a change complicates the distinction between useful signals and noise in high-noise
environments, leading to a gradual degradation in performance as the noise levels increase.
In conclusion, the analysis demonstrates the effectiveness of the three components within
the overall model.

3.3.4. Hyperparameters Discussion


The hyperparameter study is conducted at SNR = 6 dB and evaluates the number of layers
(neighbor feature aggregation block and feature fusion block), the hidden dimension of
KAN-HyperMP, and the maximum edge cardinality, M. Tuning in a noisy environment helps
identify hyperparameters that balance noise suppression against overfitting, so that the
model retains its fault diagnosis capability on new data. The corresponding experimental
results are depicted in Figure 14.
As shown in Figure 14a, model accuracy gradually decreases as the number of layers
increases, with the optimal number being 1. At this stage, the fault diagnosis accuracy
for the SEU and JNU datasets reaches 95.60% and 99.12%, respectively, though increasing
the layers to 4 reduces accuracy to 88.54% and 87.56%. While adding layers is expected to
deepen the model’s capacity to capture complex data features, in some hypergraph neural
network architectures, aggregating information from more neighbors with each additional
layer may dilute useful information, making node feature representations more similar and
reducing the distinction between nodes, particularly when processing graph data.
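
The tendency of repeated neighbor aggregation to make node features more similar, often called over-smoothing, can be reproduced on a toy example. The sketch below uses plain mean aggregation on a five-node path graph with self-loops. It is not the paper's hypergraph message passing, and the graph and feature values are assumptions chosen for illustration.

```python
def mean_aggregate(features, neighbors):
    """One round of mean aggregation over each node's neighborhood (self included)."""
    return [sum(features[j] for j in neighbors[i]) / len(neighbors[i])
            for i in range(len(features))]

# Path graph 0-1-2-3-4; each neighborhood list includes the node itself.
neighbors = [[0, 1], [0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4]]
features = [0.0, 1.0, 2.0, 3.0, 4.0]  # hypothetical scalar node features

spreads = [max(features) - min(features)]
for _ in range(4):  # four aggregation layers
    features = mean_aggregate(features, neighbors)
    spreads.append(max(features) - min(features))
print([round(s, 3) for s in spreads])
```

Because every update is an average, the gap between the largest and smallest feature value can never grow, and in this example it shrinks with each layer. This matches the loss of distinction between nodes described above.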
Additionally, as depicted in Figure 14b, the model achieves its highest accuracy when
M is set to 3. However, as M increases to 9, the accuracy decreases to 92.42% and 93.21%
on the SEU and JNU datasets, respectively. This decline in performance with larger M
values can be attributed to nodes aggregating features from more distant neighbors, which
may have weaker relevance to the current node, thus introducing more noise into the data.
Particularly in noisy environments, this information from distant neighbors may not only
be unhelpful but could actually disrupt the correct interpretation of the current node’s state.
Relative to the first two hyperparameters, variations in the hidden dimensions exert a less
pronounced impact on accuracy. However, it is observed that the model attains its highest
accuracy levels on the SEU and JNU datasets when the hidden dimensions are set to 256,
as illustrated in Figure 14c.

Figure 14. Parameter analysis on the classification performance of the proposed method. (a) The
number of layers. (b) The maximum edge cardinality. (c) The hidden dimension.

4. Conclusions
In this paper, an innovative rolling bearing fault diagnosis method called KAN-
HyperMP is developed. This method utilizes hypergraph theory to effectively identify
and aggregate high-order neighbor node features. By applying B-spline functions within
KAN, the smoothness of data processing is enhanced, thereby improving the accuracy
of fault diagnosis and the stability of the model in noisy environments. Experimental
results demonstrate that KAN-HyperMP exhibits exceptional fault detection capabilities
and robustness, even under conditions of high noise, effectively addressing the challenges
of complex fault diagnosis.
Although the proposed model performs well under extreme noise conditions, its accuracy
can be improved further. Future research will therefore focus on enhancing the model’s
robustness. We plan to incorporate advanced noise filtering techniques and data
augmentation strategies to improve performance in complex environments, and to explore
multimodal data fusion to enrich the sources of information for fault diagnosis. These
enhancements are expected to improve the model’s accuracy and applicability and to better
meet the demands of industrial applications.

Author Contributions: Methodology, J.W.; software, J.W.; validation, Z.D. and S.Z.; writing—original
draft preparation, J.W.; writing—review and editing Z.D. and S.Z. All authors have read and agreed
to the published version of the manuscript.
Funding: The authors are grateful for the support from the General Project of the Zhejiang Provincial
Department of Education (Application No. Y202455248) and the Zhejiang Provincial Youth Fund
(Application No. QN25E050040).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The SEU and JNU Datasets provided in this study can be found in the
following repository: https://github.com/Tan-Qiyu/Mechanical_Fault_Diagnosis_Dataset (accessed
on 1 October 2024).
Conflicts of Interest: The authors declare no conflicts of interest.

References
1. Huo, C.; Jiang, Q.; Shen, Y.; Lin, X.; Zhu, Q.; Zhang, Q. A class-level matching unsupervised transfer learning network for rolling
bearing fault diagnosis under various working conditions. Appl. Soft Comput. 2023, 146, 110739.
2. Dong, Z.; Zhao, D.; Cui, L. Rotating machinery fault classification based on one-dimensional residual network with attention
mechanism and bidirectional gated recurrent unit. Meas. Sci. Technol. 2024, 35, 086001.
3. Wang, M.; Wang, W.; Zhang, X.; Iu, H.H.C. A new fault diagnosis of rolling bearing based on Markov transition field and CNN.
Entropy 2022, 24, 751.
4. Zhang, W.; Li, C.; Peng, G.; Chen, Y.; Zhang, Z. A deep convolutional neural network with new training methods for bearing
fault diagnosis under noisy environment and different working load. Mech. Syst. Signal Process. 2018, 100, 439–453.
5. Cui, L.; Dong, Z.; Xu, H. Triplet attention-enhanced residual tree-inspired decision network: A hierarchical fault diagnosis model
for unbalanced bearing datasets. Adv. Eng. Inform. 2024, 59, 102322.
6. Dong, Z.; Zhao, D.; Cui, L. An intelligent bearing fault diagnosis framework: one-dimensional improved self-attention-enhanced
CNN and empirical wavelet transform. Nonlinear Dyn. 2024, 112, 6439–6459.
7. Yang, S.; Kong, X.; Wang, Q.; Li, Z.; Cheng, H.; Xu, K. Deep multiple auto-encoder with attention mechanism network: A
dynamic domain adaptation method for rotary machine fault diagnosis under different working conditions. Knowl.-Based Syst.
2022, 249, 108639.
8. Wang, Z.; Wang, J.; Wang, Y. An intelligent diagnosis scheme based on generative adversarial learning deep neural networks and
its application to planetary gearbox fault pattern recognition. Neurocomputing 2018, 310, 213–222.
9. Yao, J.; Chang, Z.; Han, T.; Tian, J. Semi-supervised adversarial deep learning for capacity estimation of battery energy storage
systems. Energy 2024, 294, 130882.
10. Han, J.; Huang, W.; Ma, H.; Li, J.; Tenenbaum, J.; Gan, C. Learning physical dynamics with subequivariant graph neural networks.
Adv. Neural Inf. Process. Syst. 2022, 35, 26256–26268.
11. Zhang, S.; Jin, Y.; Liu, T.; Wang, Q.; Zhang, Z.; Zhao, S.; Shan, B. SS-GNN: A simple-structured graph neural network for affinity
prediction. ACS Omega 2023, 8, 22496–22507.
12. Li, X.; Sun, L.; Ling, M.; Peng, Y. A survey of graph neural network based recommendation in social networks. Neurocomputing
2023, 549, 126441.
13. Wu, L.; Chen, Y.; Ji, H.; Liu, B. Deep learning on graphs for natural language processing. In Proceedings of the 44th International
ACM SIGIR Conference on Research and Development in Information Retrieval, Online, 11–15 July 2021; pp. 2651–2653.
14. Zhou, X.; Zhang, Y.; Wei, Q. Few-shot fine-grained image classification via GNN. Sensors 2022, 22, 7640.
15. Li, T.; Zhou, Z.; Li, S.; Sun, C.; Yan, R.; Chen, X. The emerging graph neural networks for intelligent fault diagnostics and
prognostics: A guideline and a benchmark study. Mech. Syst. Signal Process. 2022, 168, 108653.
16. Li, T.; Zhao, Z.; Sun, C.; Yan, R.; Chen, X. Multireceptive field graph convolutional networks for machine fault diagnosis. IEEE
Trans. Ind. Electron. 2020, 68, 12739–12749.
17. Li, C.; Mo, L.; Yan, R. Rolling bearing fault diagnosis based on horizontal visibility graph and graph neural networks. In
Proceedings of the 2020 International Conference on Sensing, Measurement & Data Analytics in the Era of Artificial Intelligence
(ICSMD), Xi’an, China, 15–17 October 2020; pp. 275–279.
18. Zhang, K.; Li, H.; Cao, S.; Yang, C.; Sun, F.; Wang, Z. Motor current signal analysis using hypergraph neural networks for fault
diagnosis of electromechanical system. Measurement 2022, 201, 111697.
19. Shi, M.; Ding, C.; Wang, R.; Song, Q.; Shen, C.; Huang, W.; Zhu, Z. Deep hypergraph autoencoder embedding: An efficient
intelligent approach for rotating machinery fault diagnosis. Knowl.-Based Syst. 2023, 260, 110172.
20. Yan, X.; Liu, Y.; Zhang, C.A. Multiresolution hypergraph neural network for intelligent fault diagnosis. IEEE Trans. Instrum.
Meas. 2022, 71, 1–10.
21. Feng, Y.; You, H.; Zhang, Z.; Ji, R.; Gao, Y. Hypergraph neural networks. In Proceedings of the AAAI Conference on Artificial
Intelligence, Honolulu, HI, USA, 27 January 2019–1 February 2019; Volume 33, pp. 3558–3565.
22. Yadati, N.; Nimishakavi, M.; Yadav, P.; Nitin, V.; Louis, A.; Talukdar, P. HyperGCN: A new method for training graph convolutional
networks on hypergraphs. Adv. Neural Inf. Process. Syst. 2019, 32, 1509–1520.
23. Ma, Z.; Jiang, Z.; Zhang, H. Hyperspectral image classification using feature fusion hypergraph convolution neural network.
IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–14.
24. Sellami, A.; Farah, M.; Dalla Mura, M. SHCNet: A semi-supervised hypergraph convolutional networks based on relevant feature
selection for hyperspectral image classification. Pattern Recognit. Lett. 2023, 165, 98–106.
25. Gharahighehi, A.; Vens, C.; Pliakos, K. Fair multi-stakeholder news recommender system with hypergraph ranking. Inf. Process.
Manag. 2021, 58, 102663.
26. Sun, Y.; Zhu, D.; Du, H.; Tian, Z. Motifs-based recommender system via hypergraph convolution and contrastive learning.
Neurocomputing 2022, 512, 323–338.
27. Sun, Y.; Jiang, X.; Hu, Y.; Duan, F.; Guo, K.; Wang, B.; Gao, J.; Yin, B. Dual dynamic spatial-temporal graph convolution network
for traffic prediction. IEEE Trans. Intell. Transp. Syst. 2022, 23, 23680–23693.
28. Wu, J.; He, D.; Jin, Z.; Li, X.; Li, Q.; Xiang, W. Learning spatial–temporal pairwise and high-order relationships for short-term
passenger flow prediction in urban rail transit. Expert Syst. Appl. 2024, 245, 123091.
29. Wang, F.; Pena-Pena, K.; Qian, W.; Arce, G.R. T-HyperGNNs: Hypergraph neural networks via tensor representations. IEEE
Trans. Neural Netw. Learn. Syst. 2024.
30. Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. KAN: Kolmogorov-Arnold networks.
arXiv 2024, arXiv:2404.19756.
31. Li, C.; Mo, L.; Yan, R. Fault diagnosis of rolling bearing based on WHVG and GCN. IEEE Trans. Instrum. Meas. 2021, 70, 1–11.
32. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
33. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
34. Wen, L.; Li, X.; Gao, L.; Zhang, Y. A new convolutional neural network-based data-driven fault diagnosis method. IEEE Trans.
Ind. Electron. 2017, 65, 5990–5998.
35. Staudemeyer, R.C.; Morris, E.R. Understanding LSTM—A tutorial into long short-term memory recurrent neural networks. arXiv
2019, arXiv:1909.09586.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
