Sensors 24 04415
Article
Graph Feature Refinement and Fusion in Transformer for
Structural Damage Detection
Tianjie Hu 1,2 , Kejian Ma 1,2 and Jianchun Xiao 1,2, *
Abstract: Structural damage detection is important for maintaining structural health.
Currently, data-driven deep learning approaches have emerged as a highly promising research
field. However, little progress has been made in studying the relationship between the global and
local information of structural response data. In this paper, we present an innovative Convolutional
Enhancement and Graph Features Fusion in Transformer (CGsformer) network for structural
damage detection. The proposed CGsformer network introduces an innovative approach for hierar-
chical learning from global to local information to extract acceleration response signal features for
structural damage representation. The key advantage of this network is the integration of a graph
convolutional network in the learning process, which enables the construction of a graph structure
for global features. By incorporating node learning, the graph convolutional network filters out
noise in the global features, thereby facilitating the extraction of more effective local features. In
verification on experimental data from a four-story steel frame model and simulated data from the
IASC-ASCE benchmark structure, the CGsformer network achieved damage identification
accuracies of 92.44% and 96.71%, respectively, surpassing existing deep learning-based damage
detection methods. Notably, the model demonstrates good robustness under noisy conditions.
Keywords: structural damage detection; deep learning; CGsformer; graph convolutional network;
global and local features; noise robustness
Citation: Hu, T.; Ma, K.; Xiao, J. Graph Feature Refinement and Fusion in Transformer for
Structural Damage Detection. Sensors 2024, 24, 4415. https://doi.org/10.3390/s24134415
Academic Editors: Simon X. Yang, Jingzhou Xin, Hong Zhang and Yan Jiang
Received: 16 May 2024
Revised: 30 June 2024
Accepted: 4 July 2024
Published: 8 July 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
As the foundational support system of buildings, the structure directly affects
building safety. Throughout the entire service life of buildings, structural damage is
inevitable due to the coupling of adverse factors such as environmental corrosion, material
aging, fatigue damage, sudden disasters, and the long-term effects of abnormal loads [1–3].
With the extension of service time, the load-bearing capacity and resistance to natural
disasters of the structure decrease, thus leading to unforeseeable safety challenges for
buildings [4–6]. Therefore, the early diagnosis and localization of structural damage play a
vital role in the timely maintenance of structures [7]. Structural damage detection [8–10],
as the core technology of Structural Health Monitoring (SHM), is essential for grasping the
structural working status and evaluating structural safety [11]. Consequently, structural
damage detection methods have emerged as a prominent research focus in the field of
SHM [12,13]. Structural damage detection is categorized into local-based [14,15] and
global-based [16,17] methods. Local-based methods are mainly used to inspect small regular
structures or local parts of a structure, and they have difficulty assessing damage in large
and complex structures. To overcome the limitations of local-based methods, many
global-based structural detection methods have been developed. Among them, vibration-based
structural damage detection methods have garnered widespread attention due to their
complexity of the model. Therefore, how to effectively combine hybrid models to extract
the local and global features of input data has become a challenging problem.
Although the aforementioned networks are able to extract local and global infor-
mation from signals, they have difficulty in mining the intrinsic correlations of the data.
Consequently, some researchers [39–42] have utilized irregular graphs to reflect the inter-
connections between structural damage data. Dang et al. [39] used the Graph Convolutional
Network (GCN) to attain the intrinsic spatial coherence of sensor positions directly from
structural vibration response data. Zhan et al. [41] exploited the GCN to construct graph
structures in the wavelet domain for multisensor signals, and they trained the network with
node classification as the goal for structural damage detection tasks. Wang et al. [40] pro-
posed a waveguide-based dual GCN method for damage detection. The method employed
Short-Time Fourier Transform (STFT) to obtain the spatial–temporal feature representation
of the original waveguide signals. A local graph network was built using features obtained
from samples, and then local graphs were formed by grouping nodes with similar types of
damage, thus enabling damage detection on the structure. The successful applications of
GCN-based methods show that the methods can effectively understand and represent the
intrinsic correlations between original data and features.
There are still challenges in designing a deep learning method that accurately iden-
tifies structural damage. Firstly, both the CNN-based and RNN-based approaches have
limitations in their abilities to model long-term dependencies and capture global temporal
dependencies. Although RNNs are designed to deal with long-term dependencies, they
may still encounter the problem of vanishing or exploding gradients when processing
longer sequences. Secondly, hybrid models usually use parallel networks or
learn from local to global, and learning from global to local has received little
investigation. Thirdly, in addition to the signal components related
to the structural self-oscillation characteristics, the measured acceleration response signal
inevitably contains noise. The direct input of such noisy data into deep learning models
can affect the generalization ability and damage detection performance. Finally, research
on structural damage detection methods based on graph convolution is limited.
According to the above analysis, we present a novel Convolutional Enhancement
and Graph Features Fusion in Transformer Network (CGsformer) for detecting structural
damage information. The proposed CGsformer network makes two contributions. On the
one hand, the proposed network uses a hierarchical learning method from global to lo-
cal to extract the characteristics of the acceleration response signal. Based on previous
research work, global information can identify the periodic relationship of the acceleration
vibration response signals, while local information can help to define and analyze the
subtle differences in the acceleration response signals of short adjacent moments before
and after damage occurs. Thus, a multihead self-attention module is used to acquire the global
information of the signals, and a convolution module is applied to represent the local
information. On the other hand, a graph convolution module is embedded to enhance
robustness against noise contamination. The graph convolution network constructs a
graph structure for global features and filters out noise in global features through node
learning, thus prompting the convolution module to extract more effective local features.
On this basis, extensive verification tests were conducted using the numerical
model of the International Association for Structural Control (IASC)–American Society
of Civil Engineers (ASCE) benchmark structure [43] and a four-story steel frame structure
experiment. The study compares the proposed CGsformer with deep learning-based
models such as CNN, LSTM, and Transformer, especially with limited datasets and
under noise contamination.
2. Methodology
2.1. The Equation of Motion
Damage can cause detectable changes in the structural dynamic response. Therefore,
it is usually possible to determine the structural health state by analyzing the structural
response acceleration signals before and after damage. Without loss of generality, consider
a linear multi-degree-of-freedom structure, whose motion equation is as follows [44]:
$$M\ddot{u}(t) + C\dot{u}(t) + Ku(t) = F(t) \tag{1}$$
where M, C, and K represent the structural mass, damping, and stiffness matrices with the
dimension of Z × Z, respectively; u(t), u̇(t), and ü(t) denote the displacement, velocity,
and acceleration with the dimension of Z × 1, respectively, where the overdot denotes
differentiation with respect to time. In addition, F(t) represents the external loads with the
dimension of Z × 1.
It is generally believed that the damage will not lead to the change in the mass of the
structure, but it will cause a decrease in the stiffness. Hence, the changes in stiffness of the
structure before and after damage can be defined as ∆K:
$$\Delta K = K^{u} - K^{d} \tag{2}$$
This can be further expanded as the linear superposition of element stiffness matrices
as follows:
$$\Delta K = \sum_{i=1}^{n} a_i K_i^{u} \tag{3}$$
where $K_i^{u}$ defines the stiffness matrix of the ith element in the structure, and $a_i$ is the damage
coefficient, which varies from 0 to 1. For example, $a_i = 0.95$ means that the ith element has
lost 5% of its stiffness.
The changes in stiffness can be reflected in the structural dynamic response. For ex-
ample, damage can lead to a decrease in the structural vibration frequency and changes in
the vibration modes. Therefore, collecting and analyzing structural responses can help us
understand the health states of the structure. In practical applications, acceleration sensors,
installed at different locations on the structure, are commonly used to capture these changes.
The acceleration response refers to the measurement of the acceleration experienced by a
structure under various loading conditions or external forces. These acceleration response
data are usually collected by acceleration sensors.
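As a rough illustration of how a stiffness reduction shows up in the dynamic response, the following sketch builds a hypothetical four-story shear-building model and compares its undamped natural frequencies before and after one story loses stiffness. The lumped masses, story stiffnesses, and the 20% loss are illustrative assumptions and are not taken from the benchmark structure or the experiment.

```python
import numpy as np

def shear_building_stiffness(story_k):
    """Assemble the stiffness matrix of a shear building from story stiffnesses."""
    n = len(story_k)
    K = np.zeros((n, n))
    for i, k in enumerate(story_k):
        K[i, i] += k                  # story i supports floor i
        if i > 0:                     # and connects it to floor i-1
            K[i - 1, i - 1] += k
            K[i - 1, i] -= k
            K[i, i - 1] -= k
    return K

def natural_frequencies_hz(M, K):
    """Undamped natural frequencies; assumes a diagonal (lumped) mass matrix."""
    m_inv_sqrt = 1.0 / np.sqrt(np.diag(M))
    A = K * np.outer(m_inv_sqrt, m_inv_sqrt)   # M^{-1/2} K M^{-1/2}, symmetric
    omega_sq = np.linalg.eigvalsh(A)           # ascending eigenvalues
    return np.sqrt(omega_sq) / (2.0 * np.pi)

# Hypothetical values chosen only for illustration.
mass = 1000.0       # kg per floor
k_story = 2.0e6     # N/m per story
M = np.diag([mass] * 4)

K_u = shear_building_stiffness([k_story] * 4)                    # undamaged
K_d = shear_building_stiffness([0.8 * k_story] + [k_story] * 3)  # 20% loss in story 1

delta_K = K_u - K_d                 # stiffness change, cf. Equation (2)
f_u = natural_frequencies_hz(M, K_u)
f_d = natural_frequencies_hz(M, K_d)
```

Comparing `f_u` and `f_d` mode by mode shows the frequency drop that vibration-based methods try to detect from measured responses.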
This paper proposes a novel CGsformer network for structural damage detection.
The proposed CGsformer network is not only able to extract global and local features in
the acceleration response signal, but it also embeds the graph structure to better extract
local features from global features. As far as we know, the proposed CGsformer unifies
global features, local features, and graph structures for structural damage detection for the
first time. At the same time, our experiments have proven that the proposed CGsformer
has better classification performance and noise robustness.
$$
\begin{aligned}
Q_X &= W_q X \\
K_X &= W_k X \\
V_X &= W_v X
\end{aligned} \tag{4}
$$
where ⊕ represents the concatenation operation, and Atten(·) denotes the self-attention
operation.
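The projections in Equation (4) and the concatenation of heads can be sketched in NumPy as follows. This is a minimal illustration that assumes the standard scaled dot-product form of self-attention, a row-wise layout (X of shape n × d), randomly initialized weights, and an output projection `W_o`; these choices are assumptions and may differ from the authors' configuration.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multihead_self_attention(X, W_q, W_k, W_v, W_o, num_heads):
    """Scaled dot-product self-attention over a sequence X of shape (n, d)."""
    n, d = X.shape
    d_head = d // num_heads
    Q, K, V = X @ W_q, X @ W_k, X @ W_v        # projections, cf. Equation (4)
    heads = []
    for h in range(num_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        scores = Q[:, s] @ K[:, s].T / np.sqrt(d_head)   # (n, n) pairwise similarity
        weights = softmax(scores, axis=-1)               # each row sums to 1
        heads.append(weights @ V[:, s])                  # (n, d_head)
    # Concatenate the heads, then mix them with the (assumed) output projection.
    return np.concatenate(heads, axis=-1) @ W_o

rng = np.random.default_rng(0)
n, d, num_heads = 16, 8, 2                 # illustrative sizes only
X = rng.standard_normal((n, d))
W_q, W_k, W_v, W_o = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
attn_out = multihead_self_attention(X, W_q, W_k, W_v, W_o, num_heads)
```

Because every position attends to every other position, each output row can draw on the whole sequence, which is what gives the module its global view of the signal.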
where $G_{nodes}$ indicates the set of nodes, and $G_{edges}$ represents the set of edges. The Graph
Neural Network (GNN) is a deep learning model designed for processing graph-structured
data and is widely used in fields such as social network analysis, molecular structure
modeling, and recommendation systems. The GCN is an important variant of GNNs, which
is able to construct a graph structure from the input sequence or features and learn the
adjacency relationships between nodes to enhance the understanding and representation
of global information.
We assume that the input sequence data features are $X \in \mathbb{R}^{n \times d}$. To construct a graph
for X, we first need to create nodes from the sequence and then define edges based on these
created nodes. Each element in the sequence is considered as a node in the graph network.
After node construction, the creation of edges is done through a self-attention mechanism
on the nodes. When we have completed the construction of the nodes and edges, a graph
is formed that contains rich and reliable connections between relevant nodes. Specifically,
an input X can be represented by a GCN layer as follows:
where $X^{(l)}$ and $W^{(l)}$ are the input and learnable matrix for the lth layer, respectively,
and $M_{Lap}$ is the Laplacian matrix used to represent the topological structure of the graph,
which is denoted as
$$M_{Lap} = D^{-\frac{1}{2}} \hat{A} D^{-\frac{1}{2}} \tag{8}$$
where $D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$ is the degree matrix used to describe the number of
edges $d_n$ corresponding to node n, and $\hat{A}$ is the adjacency matrix used to represent the
relationships between nodes, which is defined as
$$\hat{A} = M_{re}\left(\frac{(W_d X)^{T} (W_n X)}{d}\right) \in \mathbb{R}^{n \times n} \tag{9}$$
where $W_d$ and $W_n$ denote the learnable matrices, d represents the feature dimension of the
whole sequence, T denotes the transpose operation, and $M_{re}(\cdot)$ represents the ReLU activation
function, which filters out negative links between nodes. Negative links imply that there is
no necessary direct connection between the two nodes. Overall, the GCN enhances the
correlation between local features and suppresses noise through the graph structure, thus
achieving better global modeling of features.
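The graph construction in Equations (8) and (9) can be sketched as follows. This is an illustrative NumPy version under several assumptions: features are stored row-wise (X of shape n × d, so the similarity is computed as (X W_d)(X W_n)^T), node degrees are taken as row sums of the weighted adjacency matrix, a small `eps` guards against isolated nodes, and the layer applies a ReLU to M_Lap X W. The layer form and these details are assumptions and may differ from the authors' implementation.

```python
import numpy as np

def attention_adjacency(X, W_d, W_n):
    """ReLU-filtered similarity between nodes, cf. Equation (9); X has shape (n, d)."""
    d = X.shape[1]
    scores = (X @ W_d) @ (X @ W_n).T / d      # (n, n) pairwise scores
    return np.maximum(scores, 0.0)            # ReLU removes negative links

def normalized_laplacian(A_hat, eps=1e-8):
    """D^{-1/2} A_hat D^{-1/2}, cf. Equation (8), with D built from row sums."""
    deg = A_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg + eps)     # eps guards isolated nodes (assumption)
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(X, W, W_d, W_n):
    """One graph convolution, ReLU(M_lap X W); the outer ReLU is an assumed choice."""
    M_lap = normalized_laplacian(attention_adjacency(X, W_d, W_n))
    return np.maximum(M_lap @ X @ W, 0.0)

rng = np.random.default_rng(0)
n, d = 6, 4                                   # illustrative sizes only
X = rng.standard_normal((n, d))
W_d, W_n, W = (rng.standard_normal((d, d)) for _ in range(3))
A_hat = attention_adjacency(X, W_d, W_n)
X_next = gcn_layer(X, W, W_d, W_n)
```

Entries of `A_hat` that are zero correspond to node pairs whose link was filtered out, so those pairs do not exchange information in the layer.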
Figure 4. The overall structure of the CGsformer. The proposed CGsformer is mainly composed of
four parts: feedforward module, self-attention mechanism module, convolution module, and graph
network module.
The shallow features of the acceleration response signal are obtained through convolu-
tional subsampling and linear layers, and they are input into the CGsformer block to extract
deep global and local features. In the CGsformer block, the feedforward module, as shown
in Figure 5, is first used to achieve independent mapping of each position in the sequence.
The feedforward module consists of two linear projections and an intermediate nonlinear
activation function, where the first linear layer extends the feature dimensionality of the
data by four times, and the other linear layer projects it to the original model dimension.
The feedforward module also applies the Swish activation function, dropout, and layer
normalization. The Swish function is used to nonlinearly transform the input signal, and
dropout randomly discards neurons to prevent the network from overfitting. The layer
normalization operation speeds up network training and convergence. In addition, the
entire module follows the prenormalized residual unit.
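The feedforward module described above can be sketched in NumPy as follows. This is a minimal illustration, assuming layer normalization without learnable scale and shift, inverted dropout, small random weights, and the half-step residual connection that appears in Equation (10); the sizes are illustrative.

```python
import numpy as np

def swish(x):
    return x / (1.0 + np.exp(-x))      # x * sigmoid(x)

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def dropout(x, rate, rng, training):
    if not training or rate == 0.0:
        return x                       # dropout is disabled at inference time
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)     # inverted dropout keeps the expected value

def feedforward(x, W1, b1, W2, b2, rate=0.1, rng=None, training=False):
    h = layer_norm(x)                  # pre-normalization
    h = swish(h @ W1 + b1)             # expand d -> 4d, then nonlinearity
    h = dropout(h, rate, rng, training)
    h = h @ W2 + b2                    # project 4d -> d
    return dropout(h, rate, rng, training)

rng = np.random.default_rng(0)
n, d = 16, 8                           # illustrative sizes only
x = rng.standard_normal((n, d))
W1, b1 = rng.standard_normal((d, 4 * d)) * 0.1, np.zeros(4 * d)
W2, b2 = rng.standard_normal((4 * d, d)) * 0.1, np.zeros(d)
x_tilde = x + 0.5 * feedforward(x, W1, b1, W2, b2)   # half-step residual
```

Normalizing before the linear layers, rather than after the residual sum, is what "prenormalized" refers to: the residual path carries the unnormalized input straight through.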
Then, the multiheaded self-attention module realizes the long-term dependency mod-
eling of the structural acceleration response data so that the weights of each position can
be dynamically adjusted according to the contextual information of different positions.
The graph convolution module captures the connection relationship between different
nodes to better understand the local and global dependencies in the structural acceleration
response data from the perspective of graph structure. The convolution module completes
the further capture of local features of the structural acceleration response data.
According to the above description, the process of the CGsformer model is unfolded
as follows:
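$$
\begin{aligned}
\tilde{X} &= X + \tfrac{1}{2} M_{ff}(X) \\
\hat{X} &= \tilde{X} + M_{ac}(\tilde{X}) \\
\hat{X} &= M_{gcn}(\hat{X}) \\
\bar{X} &= \hat{X} + M_{conv}(\hat{X}) \\
Y &= M_{norm}\left(\bar{X} + \tfrac{1}{2} M_{ff}(\bar{X})\right)
\end{aligned} \tag{10}
$$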
where $M_{ff}(\cdot)$ defines the feedforward module, $M_{ac}(\cdot)$ is the multiheaded self-attention
module, $M_{gcn}(\cdot)$ is the graph convolution module, $M_{conv}(\cdot)$ is the convolution module,
$M_{norm}(\cdot)$ is the layer normalization operation, and $Y \in \mathbb{R}^{n \times d}$ denotes the prediction result
of the damage patterns after CGsformer output.
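The order of operations in Equation (10) can be illustrated with the sketch below. Only the data flow is shown: the five submodules are replaced by hypothetical stand-ins (random linear maps followed by tanh), and layer normalization omits learnable parameters, so the numbers produced carry no meaning beyond checking shapes.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def cgsformer_block(X, ff_in, attention, gcn, conv, ff_out):
    """Data flow of one block, following Equation (10) line by line."""
    X_tilde = X + 0.5 * ff_in(X)              # half-step feedforward residual
    X_hat = X_tilde + attention(X_tilde)      # global features via self-attention
    X_hat = gcn(X_hat)                        # graph refinement of global features
    X_bar = X_hat + conv(X_hat)               # local features via convolution
    return layer_norm(X_bar + 0.5 * ff_out(X_bar))

rng = np.random.default_rng(0)
n, d = 16, 8                                  # illustrative sizes only
X = rng.standard_normal((n, d))

def make_stub():
    """Hypothetical stand-in for a real submodule: a random linear map plus tanh."""
    W = rng.standard_normal((d, d)) / np.sqrt(d)
    return lambda x: np.tanh(x @ W)

Y = cgsformer_block(X, make_stub(), make_stub(), make_stub(), make_stub(), make_stub())
```

Note that the graph convolution step replaces its input rather than adding to it, so the convolution module only sees the graph-refined features.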
2.5. Prediction
Finally, we pass the features Y obtained from the CGsformer model through two Fully
Connected (FC) layers to obtain the final damage category. The process is as follows:
where $W_1$ and $W_2$ represent the weights of the FC layers, $b_1$ and $b_2$ represent the biases of the FC
layers, and $d_{out}$ represents the dimension of the output classes.
3. Verification by Simulation
In this section, we validated the damage detection performance of the CGsformer
model through the phase I IASC-ASCE SHM numerical benchmark structure [43].
The model defines a total of six damage patterns. Figure 7 illustrates three of these
damage patterns: D.P.1, D.P.2, and D.P.4. This paper focuses on these damage patterns,
and the original acceleration vibration response data were generated using the 12-degree-
of-freedom finite element model in the MATLAB program from the IASC-ASCE SHM
Research Group. The acceleration response data for these four damage modes (D.P.0,
D.P.1, D.P.2, and D.P.4) were collected from four sensors on each floor, as shown in Table 1.
The noise levels used were 0%, 20%, and 50%. The data were preprocessed to produce
training, validation, and testing data for the classifiers. Each of the 24 datasets
contains a total of 6248 samples, with 4000 samples used for training, 1000 samples
used for validation, and 1248 samples used for testing. The preprocessing procedure is
described in detail below.
Figure 7. From left to right, they correspond to damage modes D.P.1, D.P.2, and D.P.4.
Table 1. Normal pattern and three cases of damage patterns as defined in the IASC-ASCE SHM
simulated benchmark structure.
$$S(f) = \frac{N_0}{2} \tag{12}$$
where $N_0$ is the intensity of the noise power spectral density.
Data preprocessing: First, the response data collected on two acceleration sensors in
the same direction on each floor were fused to obtain the fused translation data of each
floor from the x and y directions [49] as follows:
$$
\begin{cases}
acc_{x,d} = 0.5 \times (acc_{1,d} + acc_{3,d}) \\
acc_{y,d} = 0.5 \times (acc_{2,d} + acc_{4,d})
\end{cases} \tag{13}
$$
where $acc_{1,d}$, $acc_{2,d}$, $acc_{3,d}$, and $acc_{4,d}$ denote the acceleration time history response data
collected by the four sensors in the IASC-ASCE SHM benchmark structural model for
floor d, as depicted in Figure 8, respectively, and $acc_{x,d}$ and $acc_{y,d}$ denote the translational
acceleration in the x and y directions of the d floor, all of which are 1D temporal data.
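Equation (13) amounts to averaging the two sensors that measure the same direction on a floor. A minimal NumPy sketch is given below; the function name and the synthetic signals are hypothetical and serve only to show the operation.

```python
import numpy as np

def fuse_floor_sensors(acc1, acc2, acc3, acc4):
    """Fuse four sensor records of one floor into x and y translations, cf. Equation (13)."""
    acc_x = 0.5 * (np.asarray(acc1, dtype=float) + np.asarray(acc3, dtype=float))
    acc_y = 0.5 * (np.asarray(acc2, dtype=float) + np.asarray(acc4, dtype=float))
    return acc_x, acc_y

# Synthetic 1D records standing in for measured acceleration histories.
t = np.linspace(0.0, 1.0, 500)
acc1 = np.sin(2 * np.pi * 5 * t)
acc3 = np.sin(2 * np.pi * 5 * t) + 0.2      # same direction as acc1, with an offset
acc2 = np.cos(2 * np.pi * 3 * t)
acc4 = np.cos(2 * np.pi * 3 * t) - 0.2      # same direction as acc2, with an offset
acc_x, acc_y = fuse_floor_sensors(acc1, acc2, acc3, acc4)
```

Averaging the paired sensors keeps the translational component common to both records while cancelling equal and opposite contributions, such as those from torsion of the floor.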
Next, the sampling points of the sensor were calculated. Assume that the sampling rate
of the sensors is $f_s$, and the sampling time is $t_s$ when collecting the original acceleration time
history response data of the four sensors in floor d. Correspondingly, the number of
sampling points $S_j$ for one sensor in sampling time $t_s$ is $S_j = f_s \times t_s$. The sampling points $S_j$ are
divided into m nonoverlapping data segments with fixed lengths of 128. The translational
acceleration datasets after segmentation in floor d are represented
as $acc'_{x,d}$ and $acc'_{y,d}$. Then, $acc'_{x,d}$ and $acc'_{y,d}$ are normalized and shuffled, and the
processed data are defined as $acc''_{x,d}$ and $acc''_{y,d}$. According to the above method, the data
from each sensor on every floor under other damage modes are processed, thus resulting in
a dataset for each direction of every floor under each working condition. Assuming that the
damage detection task contains a total of p damage patterns, the translational acceleration
dataset D for all floors in the x and y directions can be expressed as
The translational acceleration dataset D is then divided into columns to obtain eight
data subsets. Each subset represents the acceleration response data of a certain translational
direction in a certain floor and contains p damage patterns.
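The segmentation, normalization, and shuffling steps described above can be sketched as follows. The sampling rate, duration, and test signal are hypothetical, the incomplete tail segment is dropped, and z-score normalization per segment is an assumed choice, since the normalization method is not specified here.

```python
import numpy as np

def segment_signal(signal, segment_length=128):
    """Split a 1D record into nonoverlapping segments of fixed length."""
    signal = np.asarray(signal, dtype=float)
    m = signal.size // segment_length           # incomplete tail is dropped (assumption)
    return signal[: m * segment_length].reshape(m, segment_length)

def normalize_segments(segments, eps=1e-8):
    """Z-score each segment independently (assumed normalization scheme)."""
    mean = segments.mean(axis=1, keepdims=True)
    std = segments.std(axis=1, keepdims=True)
    return (segments - mean) / (std + eps)

def shuffle_segments(segments, labels, seed=0):
    """Shuffle segments and labels with the same permutation."""
    order = np.random.default_rng(seed).permutation(len(segments))
    return segments[order], labels[order]

fs, ts = 1000, 10                               # hypothetical sampling rate (Hz) and time (s)
num_points = fs * ts                            # S_j = f_s * t_s
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0.0, 40.0 * np.pi, num_points))
signal = signal + 0.1 * rng.standard_normal(num_points)

segments = normalize_segments(segment_signal(signal))
labels = np.zeros(len(segments), dtype=int)     # e.g. all segments from one damage pattern
segments, labels = shuffle_segments(segments, labels)
```

With these illustrative values, 10,000 sampling points yield 78 full segments of length 128, and the remaining 16 points are discarded.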
In the following, the proposed CGsformer is introduced using the first-floor x-direction
acceleration time history response data, i.e., $D_{x,1,p} = [acc''_{x,1,1} \; acc''_{x,1,2} \; \ldots \; acc''_{x,1,p}]$, containing p
damage patterns. More generally, we define $D_{x,1,p}$ as $X \in \mathbb{R}^{n \times d}$.
Setting                        Value
Encoder Layers                 4
Encoder Dim                    512
Attention Heads                2
Conv Kernel Size               19
Multihead Attention Dropout    0.4
CGsformer Dropout              0.1
Figure 9. Acceleration time response curve of D.P.1 damage pattern, which was collected from sensor
NO.1 (acc 1) on the first floor at 0 noise level.
Figure 10. Acceleration time response curve of D.P.1 damage pattern, which was collected from
sensor NO.3 (acc 3) on the first floor at 0 noise level.
To measure the classification performance of the different models, the accuracy (ACC)
and F1 score ($F_{score}$) were adopted as indicators. They are defined as follows:
$$ACC = \frac{X_{TP} + X_{TN}}{X_{TP} + X_{TN} + X_{FP} + X_{FN}} \tag{15}$$
$$F_{score} = \frac{2 X_{pre} X_{rec}}{X_{pre} + X_{rec}} \tag{16}$$
where $X_{TP}$, $X_{TN}$, $X_{FP}$, and $X_{FN}$ represent the true positive, true negative, false positive,
and false negative values of data samples, respectively; $X_{pre}$ and $X_{rec}$ are the precision and
recall rates, respectively, which are denoted as follows:
$$X_{pre} = \frac{X_{TP}}{X_{TP} + X_{FP}} \tag{17}$$
$$X_{rec} = \frac{X_{TP}}{X_{TP} + X_{FN}} \tag{18}$$
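Equations (15) to (18) translate directly into code. The sketch below uses plain Python and hypothetical confusion counts; it does not handle empty denominators, and for a multiclass problem the counts would typically be taken per class in a one-vs-rest fashion, which is an assumption about usage rather than something stated here.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (15): share of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Equation (17): share of predicted positives that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Equation (18): share of actual positives that are found."""
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    """Equation (16): harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Hypothetical confusion counts, for illustration only.
tp, tn, fp, fn = 90, 85, 10, 15
acc = accuracy(tp, tn, fp, fn)
f1 = f1_score(tp, fp, fn)
```

The F1 score can equivalently be written as 2·TP / (2·TP + FP + FN), which is a convenient cross-check for an implementation.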
Table 3 presents the classification performance of the different models on the dataset.
The proposed CGsformer achieved the best performance. The single CNN and LSTM
models only considered the local information or global information of the input signals,
thus resulting in relatively poor results. Compared to the CNN, the multihead CNN
could learn features of different scales, and its accuracy was 3.45% higher than the CNN,
thus reaching 92.39%. However, the multihead CNN still failed to capture the long-term
information of the signals. The CNN-LSTM model took into account both local information
and long-term dependencies to better capture the characteristics of the structural damage,
with an accuracy of 92.55%. Compared to the CNN-LSTM model, the Transformer model
captured more robust global dependencies through the self-attention mechanism and
achieved a gain of 1.52%. The Conformer model employed the advantages of the CNN in
capturing local features and the Transformer in capturing global features, with an accuracy
of 95.27%. The proposed CGsformer model embedded graph structures into both global
and local features. This not only effectively filtered out noise interference from global
features, but it also helped the convolution module better understand global information,
thereby extracting more effective local features. In Table 3, the identification accuracy of the
proposed CGsformer is shown at 96.71%, thus achieving the best classification performance.
where $\Delta ACC = ACC_a - ACC_b$ represents the difference in accuracy between the proposed
model ($ACC_a$) and another model ($ACC_b$), and CI is the 95% confidence interval, which is
denoted as
$$CI = ACC \pm z \cdot SE, \quad SE = \sqrt{\frac{ACC(1 - ACC)}{n_t}} \tag{19}$$
where z was set to 1.96 at the 95% CI, $n_t$ is the size of the test set, and $\Delta CI$ is
formulated as follows:
$$\Delta CI = \Delta ACC \pm z \cdot \sqrt{SE_a^2 + SE_b^2} \tag{20}$$
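Equations (19) and (20) can be evaluated with a few lines of plain Python. The accuracies 96.71% (CGsformer) and 95.27% (Conformer) come from the text; using the test-set size of 1248 samples as n_t is an assumption about how the intervals were computed, so the resulting numbers are indicative only.

```python
import math

def accuracy_ci(acc, n_test, z=1.96):
    """Normal-approximation confidence interval for an accuracy, cf. Equation (19)."""
    se = math.sqrt(acc * (1.0 - acc) / n_test)
    return acc - z * se, acc + z * se, se

def accuracy_difference_ci(acc_a, acc_b, n_test, z=1.96):
    """Confidence interval for the accuracy difference, cf. Equation (20)."""
    _, _, se_a = accuracy_ci(acc_a, n_test, z)
    _, _, se_b = accuracy_ci(acc_b, n_test, z)
    delta = acc_a - acc_b
    half_width = z * math.sqrt(se_a ** 2 + se_b ** 2)
    return delta - half_width, delta + half_width

n_test = 1248                                   # assumed value of n_t
ci_lo, ci_hi, se_cgs = accuracy_ci(0.9671, n_test)
diff_lo, diff_hi = accuracy_difference_ci(0.9671, 0.9527, n_test)
```

An interval for the difference that contains zero means the gap between two models is not statistically significant at the chosen level, whereas an interval entirely above zero supports a significant improvement.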
Table 4. Statistical significance comparison of the proposed model with other models.
Table 4 shows that the proposed CGsformer model had the
most accurate predictive performance. Moreover, the CI of the proposed model was
the narrowest, which indicates that the proposed model is more robust. Meanwhile,
the improvement of the proposed model over the other models (except for Conformer)
was statistically significant. In conclusion, the proposed model is superior to the other models.
Figure 11. Illustration of ablation experiments with GCN placed at different positions in the
CGsformer block.
The accuracy performance outcomes are presented in Table 5, and it can be found that
the results of the three combinations were better than that of Conformer. Among the three
combinations, the proposed CGsformer model achieved the best performance, followed
by Attention Before and the relatively poor Convolution After. Based on these observa-
tions, the following conclusions can be drawn: (1) All three models outperformed the
Conformer model: this means that graph structure learning can help the model select more
discriminative features. (2) Placing the GCN module before the multihead self-attention
module failed to capture the higher-order information of the current sequence features
and only relied on shallow features for sequence representation learning, thus resulting in
poorer performance. Placing the GCN module after the convolution module aggregated
the final decision features, but this operation failed to fully utilize the semantic information
represented by attention and lost effectiveness. (3) The proposed CGsformer model
achieved the best performance, with an accuracy of 96.71%. Graph convolution learning of
features after the self-attention module can further propagate global features to adjacent
nodes, which helps to better understand and express node features by combining local
adjacency information.
accuracy of the four-floor, eight-classifier model only decreased by 1.81%. This indicates
that the proposed CGsformer-based damage detection model has strong noise resistance.
(2) The CGsformer model exhibited stronger robustness as the noise increased. Taking the
first floor, x direction, as an example, even with an additional 20% noise data, the detection
accuracy reached 96.07%. When the noise levels were 0%, 20%, and 50%, the average
accuracy values of the Conformer model were 96.18%, 95.25%, and 94.58%, respectively.
Comparing the results of the CGsformer model in Table 6, the CGsformer model achieved
better results, with an improvement of 1.07%, 1.49%, and 0.86% for the noise levels at 0%,
20%, and 50%, respectively. (3) Despite the suboptimal performance compared to low noise
levels, the model still achieved results comparable to the multihead CNN model (92.39%)
when dealing with a noise level of 50%. This indicates that the CGsformer model has
further learned the correlation between local and global features, thus exhibiting stronger
generalization even when half of the data is noisy.
Figure 12. Diagram of the best-performing confusion matrix for the three noise levels.
4. Experimental Verification
In this section, the performance of the proposed CGsformer model is further com-
pared with other models in a four-story, single-span, steel frame structure to verify the
effectiveness of the proposed CGsformer model in different structures through testing.
During the long-term service of the structure, corrosion of the structural metal surface
can lead to a decrease in the net cross-sectional area of the components, thereby reducing
the load-bearing capacity, which can seriously affect structural safety. Therefore,
the focus of this study was to accurately identify changes in the net cross-sectional area
of structural components. In Figure 14, the structure is considered to be in a healthy state,
with a net cross-sectional diameter of 16 mm for all columns. The experiment replaced
the original 16 mm column at the southeast corner of one or more floors in the structural
model with columns with a cross-sectional diameter of 14 mm or 12 mm. The purpose was
to simulate different degrees of corrosion damage. Table 7 summarizes the six damage
scenarios simulated in this paper.
Figure 15. The acceleration response curves of the D.P.0 damage pattern in the south side on the
first floor.
Figure 16. The acceleration response curves of the D.P.0 damage pattern in the north side on the
first floor.
y direction of the first floor and the x direction of the second floor, where CGsformer
reached accuracies of 93.91% and 92.97%, respectively. These outcomes are attributed to
the model’s hierarchical learning approach. The CGsformer network employs a multihead
self-attention mechanism to capture the global information of signals, which helps to
identify the periodic relationships of acceleration response signals. Meanwhile, it utilizes
a convolution module to precisely capture the slight differences in acceleration response
signals at short and adjacent moments. Furthermore, the graph convolution module
embedded in the CGsformer enhances the model’s robustness against noise pollution. It
does this by constructing a graph structure for global features and filtering out noise in
global features through node learning, thereby enabling the convolution module to extract
more effective local features. These innovative aspects ensure the efficacy and reliability
of CGsformer in structural damage identification, with good generalization performance
across various types of damages.
Table 8. Identification accuracy results of the four-story, single-span steel frame structure from
different models.
5. Conclusions
This paper presents an innovative deep learning model, called CGsformer, for de-
tecting structural damages. The proposed CGsformer effectively extracts the global and
local features of signals by employing a hierarchical learning approach from global to
local. Additionally, the GCN is embedded after the multihead self-attention module for
further propagating global features to adjacent nodes, which helps to better understand
and express node features by incorporating local adjacency information. The proposed
damage detection method based on the CGsformer was verified using simulation data from
the IASC-ASCE benchmark structure and experimental data from a four-story, single-span,
steel frame structure. Some valuable conclusions can be drawn from the validated results:
• The proposed damage detection model has demonstrated its feasibility in test setups
with the IASC-ASCE simulated benchmark structure and a four-story, single-span,
steel frame structure, thus achieving damage identification accuracies of 96.71% and
92.44%, respectively. These results not only validate the effectiveness of the CGsformer
in identifying structural damage but also provide valuable insights for future research.
• The proposed CGsformer model exhibited high accuracy and robustness in limited
datasets and noise-contaminated conditions. In the example of the IASC-ASCE bench-
mark structure, despite the noise level increasing from 0% to 50%, the detection
accuracy only decreased by 1.81%. This means that the CGsformer can more effec-
tively extract features from the acceleration response signal, thus showcasing strong
noise resistance.
Although the proposed method achieved promising performance through its learning
manner from global to local, some limitations should also be noted. From an application
perspective, the proposed method has not yet been tested in practical engineering.
To this end, we plan to collaborate with industry partners to implement and validate our
methods in real-world structural health monitoring scenarios. From a technical perspec-
tive, although the Transformer can achieve parallel computing compared to the RNN, it
also increases the number of model parameters. Thus, we will consider compressing or
pruning the model in future work. Moreover, obtaining acceleration response data
for structures with damage is a challenge. Thus, future research will
combine transfer learning with structural numerical simulation models for structural
damage detection.
Author Contributions: Conceptualization, T.H., K.M. and J.X.; methodology, T.H., K.M. and J.X.;
software, T.H. and J.X.; validation, T.H. and J.X.; formal analysis, T.H. and J.X.; investigation, T.H. and
J.X.; resources, K.M. and J.X.; data curation, T.H. and J.X.; writing—original draft preparation, T.H.
and J.X.; writing—review and editing, T.H., K.M. and J.X.; visualization, T.H. and J.X.; supervision,
K.M. and J.X. All authors have read and agreed to the published version of the manuscript.
Funding: The research is supported by the National Natural Science Foundation of China (No.
50978064/Z091015) and the Natural Science Foundation of Guizhou Province of China (No. 2017[1036]).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The code for generating the datasets used in the numerical verification
is available on the datacenterhub website at https://www.dropbox.com/sh/zpkqy5w371mnzam/
AAA-Omuvwx72tjv5NhnhnPuMa?e=1&dl=0 (accessed on 7 July 2024). The datasets used in the
experimental verification are available upon request from the corresponding author. The datasets are
not available to the public, as they are the preliminary results of an ongoing research project carried
out in collaboration. Furthermore, this information will be used in future technological developments
and will be subject to intellectual property protection.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Fan, W.; Qiao, P. Vibration-based damage identification methods: A review and comparative study. Struct. Health Monit. 2011,
10, 83–111.
2. Aabid, A.; Parveez, B.; Raheman, M.A.; Ibrahim, Y.E.; Anjum, A.; Hrairi, M.; Parveen, N.; Mohammed Zayan, J. A review of
piezoelectric material-based structural control and health monitoring techniques for engineering structures: Challenges and
opportunities. Actuators 2021, 10, 101.
3. Xiao, F.; Hulsey, J.L.; Balasubramanian, R. Fiber optic health monitoring and temperature behavior of bridge in cold region.
Struct. Control Health Monit. 2017, 24, e2020.
4. Gkoumas, K.; dos Santos, F.; Pekar, F. Research in bridge maintenance, safety and management: An overview and outlook for
europe. In Bridge Maintenance, Safety, Management, Life-Cycle Sustainability and Innovations; CRC Press: Boca Raton, FL, USA, 2021;
pp. 1755–1761.
5. Malla, P.; Khedmatgozar Dolati, S.S.; Ortiz, J.D.; Mehrabi, A.B.; Nanni, A.; Ding, J. Damage detection in frp-reinforced concrete
elements. Materials 2024, 17, 1171.
6. Tang, Q.; Xin, J.; Jiang, Y.; Zhang, H.; Zhou, J. Dynamic Response Recovery of Damaged Structures Using Residual Learning
Enhanced Fully Convolutional Network. Int. J. Struct. Stab. Dyn. 2024, 2550008.
7. Lucà, F.; Manzoni, S.; Cerutti, F.; Cigada, A. A damage detection approach for axially loaded beam-like structures based on
gaussian mixture model. Sensors 2022, 22, 8336.
8. Azimi, M.; Eslamlou, A.D.; Pekcan, G. Data-driven structural health monitoring and damage detection through deep learning:
State-of-the-art review. Sensors 2020, 20, 2778.
9. Dao, P.B.; Staszewski, W.J. Lamb wave based structural damage detection using stationarity tests. Materials 2021, 14, 6823.
10. Xiao, F.; Sun, H.; Mao, Y.; Chen, G.S. Damage identification of large-scale space truss structures based on stiffness separation
method. Structures 2023, 53, 109–118.
11. Tong, K.; Zhang, H.; Zhao, R.; Zhou, J.; Ying, H. Investigation of smfl monitoring technique for evaluating the load-bearing
capacity of rc bridges. Eng. Struct. 2023, 293, 116667.
12. Angeletti, F.; Iannelli, P.; Gasbarri, P.; Panella, M.; Rosato, A. A study on structural health monitoring of a large space antenna via
distributed sensors and deep learning. Sensors 2022, 23, 368.
13. Altabey, W.A.; Wu, Z.; Noori, M.; Fathnejat, H. Structural health monitoring of composite pipelines utilizing fiber optic sensors
and an ai-based algorithm—A comprehensive numerical study. Sensors 2023, 23, 3887.
14. Kot, P.; Muradov, M.; Gkantou, M.; Kamaris, G.S.; Hashim, K.; Yeboah, D. Recent advancements in non-destructive testing
techniques for structural health monitoring. Appl. Sci. 2021, 11, 2750.
15. Hassani, S.; Dackermann, U. A systematic review of advanced sensor technologies for non-destructive testing and structural
health monitoring. Sensors 2023, 23, 2204.
Sensors 2024, 24, 4415 21 of 22
16. Hou, R.; Xia, Y. Review on the new development of vibration-based damage identification for civil engineering structures:
2010–2019. J. Sound Vib. 2021, 491, 115741.
17. Xiao, F.; Hulsey, J.L.; Chen, G.S.; Xiang, Y. Optimal static strain sensor placement for truss bridges. Int. J. Distrib. Sens. Netw. 2017,
13, 5.
18. Avci, O.; Abdeljaber, O.; Kiranyaz, S.; Hussein, M.; Gabbouj, M.; Inman, D.J. A review of vibration-based damage detection in
civil structures: From traditional methods to machine learning and deep learning applications. Mech. Syst. Signal Process. 2021,
147, 107077.
19. Nick, H.; Aziminejad, A.; Hosseini, M.H.; Laknejadi, K. Damage identification in steel girder bridges using modal strain
energy-based damage index method and artificial neural network. Eng. Fail. Anal. 2021, 119, 105010.
20. Fallahian, M.; Ahmadi, E.; Khoshnoudian, F. A structural damage detection algorithm based on discrete wavelet transform and
ensemble pattern recognition models. J. Civ. Struct. Health Monit. 2022, 12, 323–338.
21. Indhu, R.; Sundar, G.R.; Parveen, H.S. A review of machine learning algorithms for vibration-based shm and vision-based shm.
In Proceedings of the 2022 Second International Conference on Artificial Intelligence and Smart Energy (ICAIS), Coimbatore,
India, 23–25 February 2022; pp. 418–422.
22. Zhang, J.; Sato, T.; Iai, S. Support vector regression for on-line health monitoring of large-scale structures. Struct. Saf. 2006,
28, 392–406.
23. Hua, X.; Ni, Y.; Ko, J.; Wong, K. Modeling of temperature–frequency correlation using combined principal component analysis
and support vector regression technique. J. Comput. Civ. Eng. 2007, 21, 122–135.
24. Salkhordeh, M.; Mirtaheri, M.; Soroushian, S. A decision-tree-based algorithm for identifying the extent of structural damage in
braced-frame buildings. Struct. Control Health Monit. 2021, 28, e2825.
25. Wang, Y.; Su, F.; Guo, Y.; Yang, H.; Ye, Z.; Wang, L. Predicting the microbiologically induced concrete corrosion in sewer based on
xgboost algorithm. Case Stud. Constr. Mater. 2022, 17, e01649.
26. Lingxin, Z.; Junkai, S.; Baijie, Z. A review of the research and application of deep learning-based computer vision in structural
damage detection. Earthq. Eng. Eng. Vib. 2022, 21, 1–21.
27. Eltouny, K.; Gomaa, M.; Liang, X. Unsupervised learning methods for data-driven vibration-based structural health monitoring:
A review. Sensors 2023, 23, 3290.
28. Abdeljaber, O.; Avci, O.; Kiranyaz, S.; Gabbouj, M.; Inman, D.J. Real-time vibration-based structural damage detection using
one-dimensional convolutional neural networks. J. Sound Vib. 2017, 388, 154–170.
29. Zhang, Y.; Miyamori, Y.; Mikami, S.; Saito, T. Vibration-based structural state identification by a 1-dimensional convolutional
neural network. Comput.-Aided Civ. Infrastruct. Eng. 2019, 34, 822–839.
30. Khodabandehlou, H.; Pekcan, G.; Fadali, M.S. Vibration-based structural condition assessment using convolution neural networks.
Struct. Control Health Monit. 2019, 26, e2308.
31. Tang, Z.; Chen, Z.; Bao, Y.; Li, H. Convolutional neural network-based data anomaly detection method using multiple information
for structural health monitoring. Struct. Control Health Monit. 2019, 26, e2296.
32. Mantawy, I.M.; Mantawy, M.O. Convolutional neural network based structural health monitoring for rocking bridge system by
encoding time-series into images. Struct. Control Health Monit. 2022, 29, e2897.
33. Lin, Z.; Liu, Y.; Zhou, L. Damage detection in a benchmark structure using long short-term memory networks. In Proceedings of
the 2019 Chinese Automation Congress (CAC), Hangzhou, China, 22–24 November 2019; pp. 2300–2305.
34. Sony, S.; Gamage, S.; Sadhu, A.; Samarabandu, J. Vibration-based multiclass damage detection and localization using long
short-term memory networks. Structures 2022, 35, 436–451.
35. Zou, J.; Yang, J.; Wang, G.; Tang, Y.; Yu, C. Bridge structural damage identification based on parallel cnn-gru. IOP Conf. Ser. Earth
Environ. Sci. 2021, 626, 012017.
36. Yang, J.; Yang, F.; Zhou, Y.; Wang, D.; Li, R.; Wang, G.; Chen, W. A data-driven structural damage detection framework based on
parallel convolutional neural network and bidirectional gated recurrent unit. Inf. Sci. 2021, 566, 103–117.
37. Bao, X.; Fan, T.; Shi, C.; Yang, G. Deep learning methods for damage detection of jacket-type offshore platforms. Process Saf.
Environ. Prot. 2021, 154, 249–261.
38. Fu, L.; Tang, Q.; Gao, P.; Xin, J.; Zhou, J. Damage identification of long-span bridges using the hybrid of convolutional neural
network and long short-term memory network. Algorithms 2021, 14, 180.
39. Dang, V.-H.; Vu, T.-C.; Nguyen, B.-D.; Nguyen, Q.-H.; Nguyen, T.-D. Structural damage detection framework based on graph
convolutional network directly using vibration data. Structures 2022, 38, 40–51.
40. Wang, S.; Luo, Z.; Shen, P.; Zhang, H.; Ni, Z. Graph-in-graph convolutional network for ultrasonic guided wave-based damage
detection and localization. IEEE Trans. Instrum. Meas. 2022, 71, 2502011.
41. Zhan, P.; Qin, X.; Zhang, Q.; Sun, Y. A novel structural damage detection method via multi-sensor spatial-temporal graph-based
features and deep graph convolutional network. IEEE Trans. Instrum. Meas. 2023, 72, 2504814.
42. Liang, Z.; Li, D.; Ren, W. Structural damage identification method based on recursive graph for automatic feature extraction. In
Proceedings of the 29th National Conference on Structural Engineering (Volume II), Wuhan, China, 16–18 October 2020.
43. Johnson, E.A.; Lam, H.-F.; Katafygiotis, L.S.; Beck, J.L. Phase i iasc-asce structural health monitoring benchmark problem using
simulated data. J. Eng. Mech. 2004, 130, 3–15.
44. Hwang, H.; Kim, C. Damage detection in structures using a few frequency response measurements. J. Sound Vib. 2004, 270, 1–14.
Sensors 2024, 24, 4415 22 of 22
45. Yessoufou, F.; Zhu, J. Classification and regression-based convolutional neural network and long short-term memory configuration
for bridge damage identification using long-term monitoring vibration data. Struct. Health Monit. 2023, 22, 14759217231161811.
46. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you
need. In Proceedings of the Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA,
4–9 December 2017.
47. Dai, Z.; Yang, Z.; Yang, Y.; Carbonell, J.; Le, Q.V.; Salakhutdinov, R. Transformer-xl: Attentive language models beyond a
fixed-length context. arXiv 2019, arXiv:1901.02860.
48. Zhang, Z.; Cui, P.; Zhu, W. Deep Learning on Graphs: A Survey. IEEE Trans. Knowl. Data Eng. 2020, 34, 249–270.
49. Chen, F. Improvement of Kalman Filter and Kalman Estimator in the Application of Structural Damage Detection. Ph.D. Thesis,
Xiamen University, Xiamen, China, 2014.
50. Junior, R.F.R.; dos Santos Areias, I.A.; Campos, M.M.; Teixeira, C.E.; da Silva, L.E.B.; Gomes, G.F. Fault detection and diagnosis in
electric motors using 1d convolutional neural networks with multi-channel vibration signals. Measurement 2022, 190, 110759.
51. Gulati, A.; Qin, J.; Chiu, C.-C.; Parmar, N.; Zhang, Y.; Yu, J.; Han, W.; Wang, S.; Zhang, Z.; Wu, Y.; et al. Conformer: Convolution-
augmented transformer for speech recognition. arXiv 2020, arXiv:2005.08100.