sensors

Article
Graph Feature Refinement and Fusion in Transformer for
Structural Damage Detection
Tianjie Hu 1,2 , Kejian Ma 1,2 and Jianchun Xiao 1,2, *

1 Research Center of Space Structures, Guizhou University, Guiyang 550025, China;


[email protected] (T.H.); [email protected] (K.M.)
2 Key Laboratory of Structural Engineering of Guizhou Province, Guiyang 550025, China
* Correspondence: [email protected]

Abstract: Structural damage detection is important for maintaining structural health.
Data-driven deep learning approaches have emerged as a highly promising research
direction. However, little progress has been made in studying the relationship between the global and
local information of structural response data. In this paper, we present a Convolutional
Enhancement and Graph Features Fusion in Transformer (CGsformer) network for structural
damage detection. The proposed CGsformer network learns hierarchically from global to local
information to extract acceleration response signal features for
structural damage representation. Its key advantage is the integration of a graph
convolutional network in the learning process, which enables the construction of a graph structure
for global features. By incorporating node learning, the graph convolutional network filters out
noise in the global features, thereby facilitating the extraction of more effective local features. In
verification on experimental data from a four-story steel frame model and on simulated data from the
IASC-ASCE benchmark structure, the CGsformer network achieved damage identification
accuracies of 92.44% and 96.71%, respectively, surpassing existing damage
detection methods based on deep learning. Notably, the model demonstrates good robustness under
noisy conditions.

Keywords: structural damage detection; deep learning; CGsformer; graph convolutional network;
global and local features; noise robustness

Citation: Hu, T.; Ma, K.; Xiao, J. Graph Feature Refinement and Fusion in Transformer for
Structural Damage Detection. Sensors 2024, 24, 4415. https://doi.org/10.3390/s24134415

Academic Editors: Simon X. Yang, Jingzhou Xin, Hong Zhang and Yan Jiang
Received: 16 May 2024
Revised: 30 June 2024
Accepted: 4 July 2024
Published: 8 July 2024

Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).

Sensors 2024, 24, 4415. https://doi.org/10.3390/s24134415 https://www.mdpi.com/journal/sensors

1. Introduction
As the foundational support system of buildings, the structure directly affects
building safety. Throughout the entire service life of buildings, structural damage is
inevitable due to the coupling of adverse factors such as environmental corrosion, material
aging, fatigue damage, sudden disasters, and the long-term effects of abnormal loads [1–3].
With the extension of service time, the load-bearing capacity and resistance to natural
disasters of the structure decrease, thus leading to unforeseeable safety challenges for
buildings [4–6]. Therefore, the early diagnosis and localization of structural damage play a
vital role in the timely maintenance of structures [7]. Structural damage detection [8–10],
as the core technology of Structural Health Monitoring (SHM), is essential for grasping the
structural working status and evaluating structural safety [11]. Consequently, structural
damage detection methods have emerged as a prominent research focus in the field of
SHM [12,13]. Structural damage detection is categorized into local-based [14,15] and
global-based [16,17] methods. Local-based methods are mainly used to inspect small, regular
structures or local parts of a structure, and they have difficulty assessing damage in large
and complex structures. To overcome the limitations of local-based methods, many
global-based structural detection methods have been developed. Among them, vibration-based
structural damage detection methods have garnered widespread attention due to their
flexibility, efficiency, and wide application. Traditional vibration-based structural damage
detection methods use natural frequency, modal shape, and other modal measurements to
detect structural damage. Although these traditional methods exhibit excellent performance
in specific application scenarios, they still have obvious limitations when confronted with
complex recognition tasks.
With the rapid advancement of Machine Learning (ML) technology, ML provides new
ideas for vibration-based structural damage detection methods and has been extensively
applied [18–21]. Compared to traditional vibration-based methods, ML-based structural
damage detection methods are able to automatically learn and identify damage characteris-
tics from large amounts of collected data. These methods not only have the advantages
of high efficiency, accuracy, and adaptability, but they also do not need to rely on the
subjective experience of experts. Zhang et al. [22] put forward a SHM method based on
incremental Support Vector Regression (SVR) and substructure strategy, which realized the
health monitoring of large and complex structures. Hua et al. [23] proposed a method to
compress features using Principal Component Analysis (PCA), which improved the classi-
fication accuracy of their SVR and reduced the computational cost. Salkhordeh et al. [24]
designed a decision tree classifier based on the Bayesian optimization algorithm to select
effective features for structural damage classification. Wang et al. [25] proposed a concrete
corrosion prediction model, which applied eXtreme Gradient Boosting (XG-Boost) based
on the Bayesian optimization algorithm to extract corrosion-related features, and they used
a Random Forest (RF) model for prediction.
ML-based structural damage detection methods have achieved promising performance, but their
key drawback is their reliance on predefined hand-crafted features. These hand-crafted
features often contain elements irrelevant to the task, thus resulting in reduced model
performance. Deep Learning (DL), as the most representative technology of machine
learning, has demonstrated powerful capabilities in various data analysis fields. Many
DL-related methods have been proposed [26,27] for vibration-based structural damage
identification. These methods typically take acceleration response signals as input and
then extract features through a predefined network, thus achieving better classification
performance. In 2017, Abdeljaber et al. [28] proposed a structural damage detection
algorithm based on nonparametric vibration and a Convolutional Neural Network (CNN),
which extracted sensitive features from the original acceleration response signal to achieve
vibration-based damage detection and location. Zhang et al. [29] trained a one-dimensional
(1D) CNN to detect local structural stiffness and mass changes. In addition to a 1D
CNN, Khodabandehlou et al. [30] introduced the 2D CNN, and the method employed
multiple acceleration response signals collected by multiple sensors as the input of the
2D CNN. Tang et al. [31] divided an original time series and used a 2D CNN to extract
the time–frequency image of the time series to obtain local features for structural anomaly
detection. Mantawy et al. [32] encoded a time series into images and trained a 2D CNN to
extract local information to describe structural health and damage status. Although the
CNN has shown a powerful ability to represent local information of input signals for
structural damage detection tasks, it has difficulty describing the long-term dependence of
time series. To capture the global information of the data, Lin et al. [33] introduced the Long
Short-Term Memory (LSTM) network for damage detection, which used the acceleration
response signal as input to learn the signals’ long-term properties. Sony et al. [34] proposed
an LSTM network to classify and locate structural damage by using structural acceleration
response data. However, the local feature extraction capabilities of temporal modeling
networks, such as the Recurrent Neural Network (RNN) and its variants, are limited. To this
end, subsequent work [35–37] has designed hybrid models, such as the CNN-RNN,
which combine the local feature extraction capability of the CNN with the RNN's ability to
capture long-term dependencies in time series data. Typically, these
hybrid models integrate local and global features through parallel architectures [35,36] or
learning strategies from local to global [38]. However, the former ignores the dependence
between local and global features, and the latter leads to information loss and increases the
complexity of the model. Therefore, how to effectively combine hybrid models to extract
the local and global features of input data has become a challenging problem.
Although the aforementioned networks are able to extract local and global infor-
mation from signals, they have difficulty in mining the intrinsic correlations of the data.
Consequently, some researchers [39–42] have utilized irregular graphs to reflect the inter-
connections between structural damage data. Dang et al. [39] used the Graph Convolutional
Network (GCN) to attain the intrinsic spatial coherence of sensor positions directly from
structural vibration response data. Zhan et al. [41] exploited the GCN to construct graph
structures in the wavelet domain for multisensor signals, and they trained the network with
node classification as the goal for structural damage detection tasks. Wang et al. [40] pro-
posed a waveguide-based dual GCN method for damage detection. The method employed
Short-Time Fourier Transform (STFT) to obtain the spatial–temporal feature representation
of the original waveguide signals. A local graph network was built using features obtained
from samples, and then local graphs were formed by grouping nodes with similar types of
damage, thus enabling damage detection on the structure. The successful applications of
GCN-based methods show that the methods can effectively understand and represent the
intrinsic correlations between original data and features.
There are still challenges in designing a deep learning method that accurately iden-
tifies structural damage. Firstly, both the CNN-based and RNN-based approaches have
limitations in their abilities to model long-term dependencies and capture global temporal
dependencies. Although RNNs are designed to deal with long-term dependencies, they
may still encounter the problem of vanishing or exploding gradients when processing
longer sequences. Secondly, hybrid models usually use parallel networks or
learn from local to global, whereas learning from global to local has rarely been
investigated. Thirdly, in addition to the signal components related
to the structural self-oscillation characteristics, the measured acceleration response signal
inevitably contains noise. Feeding such noisy data directly into deep learning models
can degrade their generalization ability and damage detection performance. Finally, research
on structural damage detection methods based on graph convolution is limited.
According to the above analysis, we present a novel Convolutional Enhancement
and Graph Features Fusion in Transformer (CGsformer) network for detecting structural
damage. The proposed CGsformer network makes two contributions. On the
one hand, it uses a hierarchical learning method from global to local
to extract the characteristics of the acceleration response signal. Based on previous
research, global information can identify the periodic relationships in the acceleration
response signals, while local information helps to define and analyze the
subtle differences between the acceleration response signals at short adjacent moments before
and after damage occurs. Thus, multihead self-attention is used to acquire the global
information of the signals, and a convolution module is applied to represent the local
information. On the other hand, a graph convolution module is embedded to enhance
robustness against noise contamination. The graph convolution network constructs a
graph structure for global features and filters out noise in the global features through node
learning, thus prompting the convolution module to extract more effective local features.
On this basis, extensive verification has been conducted using the numerical
model of the International Association for Structural Control (IASC)–American Society
of Civil Engineers (ASCE) benchmark structure [43] and a four-story steel frame structure
experiment. The study compares the proposed CGsformer with deep learning-based
models such as CNN, LSTM, and Transformer, especially in limited-data and
noise-contaminated scenarios.

2. Methodology
2.1. The Equation of Motion
Damage can cause detectable changes in the structural dynamic response. Therefore,
it is usually possible to determine the structural health state by analyzing the structural
acceleration response signals before and after damage. Without loss of generality, consider
a linear multi-degree-of-freedom structure, whose equation of motion is as follows [44]:

$$M\ddot{u}(t) + C\dot{u}(t) + Ku(t) = F(t) \tag{1}$$

where $M$, $C$, and $K$ represent the structural mass, damping, and stiffness matrices with the
dimension of $Z \times Z$, respectively; $u(t)$, $\dot{u}(t)$, and $\ddot{u}(t)$ denote the displacement, velocity,
and acceleration with the dimension of $Z \times 1$, respectively; and the overdots denote derivatives
with respect to time. In addition, $F(t)$ represents the external loads with the dimension of $Z \times 1$.
It is generally believed that damage does not change the mass of the
structure but decreases its stiffness. Hence, the change in stiffness of the
structure before and after damage can be defined as $\Delta K$:

$$\Delta K = K^{u} - K^{d} \tag{2}$$

where $K^{u}$ and $K^{d}$ are the stiffness matrices of the undamaged and damaged structure,
respectively. This can be further expanded as the linear superposition of element stiffness matrices
as follows:

$$\Delta K = \sum_{i=1}^{n} a_i K_i^{u} \tag{3}$$

where $K_i^{u}$ defines the stiffness matrix of the $i$th element in the undamaged structure, and $a_i$ is the damage
coefficient, which varies from 0 to 1. For example, $a_i = 0.05$ means that the $i$th element has
lost 5% of its stiffness.
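To make Equations (2) and (3) concrete, the following minimal sketch applies them to a hypothetical four-story shear building. The story stiffnesses, floor masses, and helper names are illustrative assumptions and are not taken from the benchmark, and $a_i$ is read here as the fraction of stiffness lost in element $i$. The sketch checks that subtracting $\Delta K$ lowers the natural frequencies, which is the effect the damage detection task relies on.

```python
import numpy as np

def shear_building_stiffness(k):
    """Assemble the stiffness matrix of a shear building from story stiffnesses."""
    Z = len(k)
    K = np.zeros((Z, Z))
    for i, ki in enumerate(k):
        K[i, i] += ki              # story i acts on floor i
        if i > 0:                  # and couples it to the floor below
            K[i - 1, i - 1] += ki
            K[i - 1, i] -= ki
            K[i, i - 1] -= ki
    return K

def element_matrices(k):
    """Stiffness contribution K_i^u of each story taken on its own."""
    mats = []
    for i in range(len(k)):
        only_i = np.zeros(len(k))
        only_i[i] = k[i]
        mats.append(shear_building_stiffness(only_i))
    return mats

def natural_frequencies_hz(masses, K):
    """Undamped natural frequencies for a diagonal (lumped) mass matrix."""
    m_inv_sqrt = 1.0 / np.sqrt(masses)
    A = K * np.outer(m_inv_sqrt, m_inv_sqrt)   # M^{-1/2} K M^{-1/2}, symmetric
    omega_sq = np.linalg.eigvalsh(A)           # ascending, real
    return np.sqrt(omega_sq) / (2.0 * np.pi)

# Hypothetical values, chosen only for illustration.
k = np.array([2.0e6, 2.0e6, 2.0e6, 2.0e6])         # story stiffness, N/m
masses = np.array([800.0, 600.0, 600.0, 500.0])    # floor mass, kg
a = np.array([0.05, 0.0, 0.0, 0.0])                # assumed: 5% loss in story 1

K_u = shear_building_stiffness(k)
delta_K = sum(a_i * K_i for a_i, K_i in zip(a, element_matrices(k)))  # Eq. (3)
K_d = K_u - delta_K                                                    # Eq. (2)

f_u = natural_frequencies_hz(masses, K_u)
f_d = natural_frequencies_hz(masses, K_d)
```

Comparing `f_u` and `f_d` shows how a small local stiffness loss shifts the modal frequencies of the whole structure.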
The changes in stiffness can be reflected in the structural dynamic response. For ex-
ample, damage can lead to a decrease in the structural vibration frequency and changes in
the vibration modes. Therefore, collecting and analyzing structural responses can help us
understand the health states of the structure. In practical applications, acceleration sensors,
installed at different locations on the structure, are commonly used to capture these changes.
The acceleration response refers to the measurement of the acceleration experienced by a
structure under various loading conditions or external forces. These acceleration response
data are usually collected by acceleration sensors.
This paper proposes a novel CGsformer network for structural damage detection.
The proposed CGsformer network is not only able to extract global and local features in
the acceleration response signal, but it also embeds the graph structure to better extract
local features from global features. As far as we know, the proposed CGsformer unifies
global features, local features, and graph structures for structural damage detection for the
first time. At the same time, our experiments have proven that the proposed CGsformer
has better classification performance and noise robustness.

2.2. Global and Local Feature Extraction


Research [35,36,45] has shown that the CNN-LSTM model has the ability to capture
local and global features of signals, and its classification accuracy on structural damage
detection tasks can surpass single CNN or RNN models. Inspired by previous studies,
the multiheaded self-attention module and convolutional module were used to extract
global and local features from structural damage detection signals.
Multihead self-attention module: This module comprises layer normalization,
multihead self-attention, and dropout. Among them, multiheaded self-attention was
proposed in [46,47]. To better understand multihead self-attention, we first describe self-
attention. The self-attention mechanism aims to describe contextual information and
capture the global characteristics of signals. The process of the self-attention mechanism
is illustrated in Figure 1. Assume that matrix $X \in \mathbb{R}^{M \times N}$ passes through linear matrices
$W^{q} \in \mathbb{R}^{M_q \times M}$, $W^{k} \in \mathbb{R}^{M_k \times M}$, and $W^{v} \in \mathbb{R}^{M_v \times M}$ to obtain query matrix $Q_X \in \mathbb{R}^{M_q \times N}$, key
matrix $K_X \in \mathbb{R}^{M_k \times N}$, and value matrix $V_X \in \mathbb{R}^{M_v \times N}$ as follows:

$$Q_X = W^{q}X, \quad K_X = W^{k}X, \quad V_X = W^{v}X \tag{4}$$

Figure 1. Self-attention mechanism.

Subsequently, the attention score matrix $X_{score}$ is obtained by multiplying $Q_X$ and
$K_X$, and passing it through the softmax function yields the attention coefficient matrix $X_{coff}$.
Finally, $X_{coff}$ is multiplied by $V_X$ to acquire the attention output. Similarly, the multiheaded
self-attention mechanism uses multiple linear matrices $W_i^{q}$, $W_i^{k}$, and $W_i^{v}$ to obtain the
matrices $Q_i$, $K_i$, and $V_i$, as shown in Figure 2. The attention outputs of all heads are concatenated
into the multiheaded attention coefficients, which are obtained as follows:

$$M_{ac}(X) = \mathrm{Atten}\left(W_1^{q}X, W_1^{k}X, W_1^{v}X\right) \oplus \cdots \oplus \mathrm{Atten}\left(W_i^{q}X, W_i^{k}X, W_i^{v}X\right) \tag{5}$$

where $\oplus$ represents the concatenation operation, and $\mathrm{Atten}(\cdot)$ denotes the self-attention
operation.
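Equations (4) and (5) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the columns of `X` are treated as sequence positions (matching $X \in \mathbb{R}^{M \times N}$), the query and key projections are assumed to share the same output size, and the scaling of the scores by the square root of that size is borrowed from the standard Transformer formulation rather than stated in the text. All function names are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head self-attention for X of shape (M, N): M features, N positions."""
    Q, K, V = W_q @ X, W_k @ X, W_v @ X        # Eq. (4): (M_q, N), (M_k, N), (M_v, N)
    scores = Q.T @ K / np.sqrt(Q.shape[0])     # (N, N); the scaling is an assumption
    coeff = softmax(scores, axis=-1)           # each row sums to 1
    return V @ coeff.T                         # (M_v, N): weighted sums of value columns

def multihead_self_attention(X, heads):
    """Concatenate the head outputs along the feature axis, as in Eq. (5)."""
    return np.concatenate([self_attention(X, *w) for w in heads], axis=0)
```

Here `heads` is a list of `(W_q, W_k, W_v)` tuples, one per head. With two heads of output size 4 each, an input of shape `(8, 5)` yields an output of shape `(8, 5)`.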

Figure 2. Multiheaded self-attention mechanism.



Convolution module: The convolution module starts with a pointwise convolution
with a 1 × 1 kernel, which doubles the number of channels. Next, the Gated Linear Unit (GLU)
activation function is utilized to control the number of output features. A depthwise
convolution is then applied, followed by batch normalization
to stabilize the internal distribution of the network.
Finally, a Swish activation function and a second pointwise convolution
complete the processing of the whole module. The convolution module is depicted
in Figure 3.
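The two activation functions named above can be written compactly. The sketch below is a minimal NumPy illustration using the standard definitions of GLU (split the channels in half and gate one half with the sigmoid of the other) and Swish ($x \cdot \sigma(x)$). The channel-first layout and the function names are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def glu(x, axis=0):
    """Gated Linear Unit: halves the channel count along `axis`."""
    a, b = np.split(x, 2, axis=axis)   # requires an even number of channels
    return a * sigmoid(b)

def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x * sigmoid(x)
```

Because the GLU halves the channel count, it undoes the doubling introduced by the first pointwise convolution, so the module's channel count is unchanged at that point.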

Figure 3. The convolution module.

2.3. Graph Convolution Network


As a data structure, a graph can effectively describe the association between two nodes,
which is represented as follows [48]:

$$G = \left(G_{nodes}, G_{edges}\right) \tag{6}$$

where $G_{nodes}$ indicates the set of nodes, and $G_{edges}$ represents the set of edges. The Graph
Neural Network (GNN) is a deep learning model designed for processing graph-structured
data and is widely used in fields such as social network analysis, molecular structure
modeling, and recommendation systems. The GCN is an important variant of GNNs, which
is able to construct a graph structure from the input sequence or features and learn the
adjacency relationships between nodes to enhance the understanding and representation
of global information.
We assume that the input sequence data features are $X \in \mathbb{R}^{n \times d}$. To construct a graph
for $X$, we first create nodes from the sequence and then define edges based on these
nodes. Each element in the sequence is considered as a node in the graph network.
After node construction, the edges are created through a self-attention mechanism
on the nodes. Once the nodes and edges have been constructed, a graph
is formed that contains rich and reliable connections between relevant nodes. Specifically,
an input $X$ can be represented by a GCN layer as follows:

$$X^{(l+1)} = M_{Lap} X^{(l)} W^{(l)} \tag{7}$$

where $X^{(l)}$ and $W^{(l)}$ are the input and learnable matrix for the $l$th layer, respectively,
and $M_{Lap}$ is the Laplacian matrix used to represent the topological structure of the graph,
which is denoted as

$$M_{Lap} = D^{-\frac{1}{2}} \hat{A} D^{-\frac{1}{2}} \tag{8}$$

where $D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$ is the degree matrix, in which $d_n$ is the number of
edges of node $n$, and $\hat{A}$ is the adjacency matrix used to represent the
relationships between nodes, which is defined as

$$\hat{A} = M_{re}\left(\frac{(W_d X)^{T}(W_n X)}{d}\right) \in \mathbb{R}^{n \times n} \tag{9}$$

where $W_d$ and $W_n$ denote the learnable matrices, $d$ represents the feature dimension of the
whole sequence, $T$ denotes the transpose operation, and $M_{re}(\cdot)$ represents the ReLU activation
function, which filters out negative links between
nodes. A negative link implies that there is no necessary direct connection between the two
nodes. Overall, the GCN enhances the correlation between local features and suppresses
noise through the graph structure, thus achieving better global modeling of features.
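A single GCN layer following Equations (7) to (9) might be sketched as below. This is a hedged illustration rather than the authors' code: the projections are applied as `X @ W_d` and `X @ W_n` so that the product has the stated $n \times n$ shape, the degree of a node is taken as the row sum of the weighted adjacency matrix, and `eps` is a hypothetical guard against division by zero for isolated nodes.

```python
import numpy as np

def gcn_layer(X, W, W_d, W_n, eps=1e-8):
    """One graph convolution step on node features X of shape (n, d)."""
    n, d = X.shape
    # Eq. (9): similarity between projected nodes; ReLU removes negative links.
    A_hat = np.maximum((X @ W_d) @ (X @ W_n).T / d, 0.0)          # (n, n)
    # Degree of each node, assumed here to be the sum of its edge weights.
    degree = A_hat.sum(axis=1)
    inv_sqrt_deg = 1.0 / np.sqrt(degree + eps)
    # Eq. (8): symmetric normalization D^{-1/2} A_hat D^{-1/2}.
    M_lap = inv_sqrt_deg[:, None] * A_hat * inv_sqrt_deg[None, :]
    # Eq. (7): propagate features over the graph, then apply the layer weights.
    return M_lap @ X @ W, A_hat
```

The function also returns `A_hat` so that the learned links can be inspected; a row of zeros indicates a node that the ReLU has disconnected from all others.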

2.4. Extracting Robust Features via CGsformer


Traditional hybrid models usually adopt a parallel architecture or learn from local to
global. However, the former assumes that global information and local
information are independent, while the latter leads to the loss of some global information and
increases the complexity of the model. To this end, we propose the CGsformer network,
which learns from global to local.
Figure 4 illustrates the proposed CGsformer network. The innovation and advantage
of the proposed CGsformer network lies in the hierarchical learning from global to local to
extract acceleration response signal features for representing structural damage. Meanwhile,
embedding GCN into the learning process of global to local features aims to build a graph
structure for global features. By learning node features of global information to filter out
noise and retain important global features, the convolution module learns more effective
local features.

Figure 4. The overall structure of the CGsformer. The proposed CGsformer is mainly composed of
four parts: feedforward module, self-attention mechanism module, convolution module, and graph
network module.

The shallow features of the acceleration response signal are obtained through convolu-
tional subsampling and linear layers, and they are input into the CGsformer block to extract
deep global and local features. In the CGsformer block, the feedforward module, as shown
in Figure 5, is first used to achieve independent mapping of each position in the sequence.
The feedforward module consists of two linear projections and an intermediate nonlinear
activation function, where the first linear layer extends the feature dimensionality of the
data by four times, and the other linear layer projects it to the original model dimension.
The feedforward module also uses the Swish activation function, dropout, and layer
normalization. The Swish function is used to nonlinearly
transform the input signal, and dropout randomly discards neurons to prevent the
network from overfitting. The layer normalization operation speeds up network training and
convergence. In addition, the entire module follows the pre-normalization residual unit.
Then, the multiheaded self-attention module realizes the long-term dependency mod-
eling of the structural acceleration response data so that the weights of each position can
be dynamically adjusted according to the contextual information of different positions.
The graph convolution module captures the connection relationship between different
nodes to better understand the local and global dependencies in the structural acceleration
response data from the perspective of graph structure. The convolution module completes
the further capture of local features of the structural acceleration response data.

Figure 5. The feedforward module.

According to the above description, the process of the CGsformer model is unfolded
as follows:

$$
\begin{aligned}
\tilde{X} &= X + \tfrac{1}{2} M_{ff}(X) \\
\hat{X} &= \tilde{X} + M_{ac}(\tilde{X}) \\
\hat{X} &= M_{gcn}(\hat{X}) \\
\bar{X} &= \hat{X} + M_{conv}(\hat{X}) \\
Y &= M_{norm}\left(\bar{X} + \tfrac{1}{2} M_{ff}(\bar{X})\right)
\end{aligned}
\tag{10}
$$

where $M_{ff}(\cdot)$ defines the feedforward module, $M_{ac}(\cdot)$ is the multiheaded self-attention
module, $M_{gcn}(\cdot)$ is the graph convolution module, $M_{conv}(\cdot)$ is the convolution module,
$M_{norm}(\cdot)$ is the layer normalization operation, and $Y \in \mathbb{R}^{n \times d}$ denotes the prediction result
of the damage patterns after CGsformer output.
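The data flow of Equation (10) can be expressed as a short composition of callables. The sketch below only wires the modules together in the stated order; the modules themselves are passed in as functions, and using two separate feedforward callables (`ff_in`, `ff_out`) is an assumption, since the equation does not say whether the two feedforward modules share weights. The `layer_norm` helper omits learnable scale and shift for brevity.

```python
import numpy as np

def layer_norm(X, eps=1e-5):
    """Layer normalization over the feature axis (no learnable scale or shift)."""
    mean = X.mean(axis=-1, keepdims=True)
    var = X.var(axis=-1, keepdims=True)
    return (X - mean) / np.sqrt(var + eps)

def cgsformer_block(X, ff_in, attention, gcn, conv, ff_out, norm=layer_norm):
    """Global-to-local block: feedforward, attention, graph, convolution, feedforward."""
    X_tilde = X + 0.5 * ff_in(X)              # half-step feedforward residual
    X_hat = X_tilde + attention(X_tilde)      # global features via self-attention
    X_hat = gcn(X_hat)                        # graph refinement of global features
    X_bar = X_hat + conv(X_hat)               # local features via convolution
    return norm(X_bar + 0.5 * ff_out(X_bar))  # second half-step and normalization
```

Note that the graph convolution step has no residual connection in Equation (10), so its output fully replaces the attention features before the convolution module sees them.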

2.5. Prediction
Finally, we pass the features Y obtained from the CGsformer model through two Fully
Connected (FC) layers to obtain the final damage category. The process is as follows:

$$M_{pre} = W_2\left(M_{re}(W_1 Y + b_1)\right) + b_2 \in \mathbb{R}^{d_{out}} \tag{11}$$

where $W_1$ and $W_2$ represent weights of the FC layers, $b_1$ and $b_2$ represent biases of the FC
layers, and $d_{out}$ represents the dimension of the output classes.
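As a rough illustration of Equation (11), the two FC layers can be written as follows. How the feature matrix $Y \in \mathbb{R}^{n \times d}$ is reduced to a vector before the first FC layer is not stated, so flattening is assumed here; the hidden size and function names are likewise hypothetical.

```python
import numpy as np

def predict_logits(Y, W1, b1, W2, b2):
    """Two fully connected layers with a ReLU in between, as in Eq. (11)."""
    y = Y.reshape(-1)                        # assumption: flatten (n, d) features
    hidden = np.maximum(W1 @ y + b1, 0.0)    # M_re is the ReLU activation
    return W2 @ hidden + b2                  # one score per damage class

def class_probabilities(logits):
    """Softmax over the class scores, as used with a cross-entropy loss."""
    e = np.exp(logits - logits.max())
    return e / e.sum()
```

The predicted damage pattern is then the index of the largest entry of the output vector.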

3. Verification by Simulation
In this section, we validated the damage detection performance of the CGsformer
model through the phase I IASC-ASCE SHM numerical benchmark structure [43].

3.1. Test Setup and Data Preparation


IASC-ASCE SHM benchmark dataset: The numerical model of the IASC-ASCE
SHM benchmark has been jointly established by the IASC and the ASCE [43], which
provides a standardized and unified benchmarking platform for comparing and evaluating
various structural health monitoring methods. As shown in Figure 6, the IASC-ASCE SHM
benchmark structure is a 1/3-scale steel frame model with four stories. The model has a story
height of 0.9 m and a total height of 3.6 m. The plan size of the model is 2.5 m × 2.5 m,
with each bay spanning 1.25 m. Each story consists of nine steel columns and eight diagonal
braces. Each floor slab is composed of four uniformly distributed mass plates. The first
floor has four 800 kg plates. The second and third floors each have four 600 kg plates.
The fourth floor has three 400 kg plates and one 550 kg plate, which are arranged to give
the structure an asymmetric mass distribution. The structure is excited by an external
excitation acting in the diagonal direction at the top. For more detailed information on the
IASC-ASCE SHM simulated benchmark structure, please consult reference [43].

Figure 6. IASC-ASCE SHM Benchmark structure model.

The model defines a total of six damage patterns. Figure 7 illustrates three of
them: D.P.1, D.P.2, and D.P.4. This paper focuses on these three damage patterns together
with the undamaged pattern D.P.0, as listed in Table 1,
and the original acceleration response data were generated using the
12-degree-of-freedom finite element model in the MATLAB program from the IASC-ASCE SHM
Research Group. The acceleration response data for these four patterns (D.P.0,
D.P.1, D.P.2, and D.P.4) were collected from four sensors on each floor.
The noise levels used were 0%, 20%, and 50%. The data were preprocessed
to obtain training, validation, and testing data for the classifiers. Each of
the 24 datasets contains a total of 6248 samples, with 4000 samples used for training, 1000
samples for validation, and 1248 samples for testing. The preprocessing procedure is
described in detail below.
Figure 7. Damage patterns D.P.1, D.P.2, and D.P.4 (from left to right).

Table 1. Normal pattern and three damage patterns as defined in the IASC-ASCE SHM
simulated benchmark structure.

Damage Pattern    Pattern Description
D.P.0             No damage.
D.P.1             All braces on the first floor have no stiffness.
D.P.2             All braces on the first and third floors have no stiffness.
D.P.4             One brace on the first floor and one brace on the third floor have no stiffness.

Using Gaussian white noise to simulate environmental excitation is a common method.
Gaussian white noise is a random process with equal and independent power components
at all frequencies, so its power spectral density is constant across all frequencies:

$$S(f) = \frac{N_0}{2} \tag{12}$$

where $N_0$ is the intensity of the noise power spectral density.
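The flat spectrum in Equation (12) can be checked numerically. The following sketch draws Gaussian white noise, compares the average periodogram power in the lower and upper halves of the frequency range, and adds noise to a clean signal at a chosen level. Defining the noise level as a fraction of the clean signal's RMS value is an assumption made for this example, and all function names are hypothetical.

```python
import numpy as np

def gaussian_white_noise(n_samples, sigma, rng):
    """Zero-mean, independent Gaussian samples with standard deviation sigma."""
    return rng.normal(0.0, sigma, size=n_samples)

def band_power_ratio(signal):
    """Mean periodogram power in the lower half-band divided by the upper half-band."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    half = power.size // 2
    return power[1:half].mean() / power[half:].mean()   # skip the DC bin

def rms(signal):
    """Root-mean-square value of a signal."""
    return np.sqrt(np.mean(signal ** 2))

def add_measurement_noise(signal, noise_level, rng):
    """Add white noise whose RMS is `noise_level` times the RMS of `signal` (assumed)."""
    return signal + gaussian_white_noise(signal.size, noise_level * rms(signal), rng)
```

For white noise, `band_power_ratio` should be close to 1, whereas a band-limited signal would give a ratio far from 1.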
Data preprocessing: First, the response data collected by the two acceleration sensors in
the same direction on each floor were fused to obtain the translational data of each
floor in the x and y directions [49] as follows:

$$
\begin{cases}
acc_{x,d} = 0.5 \times (acc_{1,d} + acc_{3,d}) \\
acc_{y,d} = 0.5 \times (acc_{2,d} + acc_{4,d})
\end{cases}
\tag{13}
$$

where $acc_{1,d}$, $acc_{2,d}$, $acc_{3,d}$, and $acc_{4,d}$ denote the acceleration time history response data
collected by the four sensors on floor $d$ of the IASC-ASCE SHM benchmark structural model,
as depicted in Figure 8, and $acc_{x,d}$ and $acc_{y,d}$ denote the translational
acceleration of floor $d$ in the x and y directions, all of which are 1D temporal data.
Next, the number of sampling points was calculated. Assume that the sampling rate
of the sensors is $f_s$ and the sampling time is $t_s$ when collecting the original acceleration time
history response data of the four sensors on floor $d$. Correspondingly, the number of
sampling points $S_j$ for one sensor in sampling time $t_s$ is $S_j = f_s \times t_s$. The $S_j$ sampling points are
divided into $m$ nonoverlapping data segments with a fixed length of 128. The segmented translational
acceleration datasets of floor $d$ are represented
as $acc'_{x,d}$ and $acc'_{y,d}$. Then, $acc'_{x,d}$ and $acc'_{y,d}$ are normalized and shuffled, and the
processed data are defined as $acc''_{x,d}$ and $acc''_{y,d}$. The data
from each sensor on every floor under the other damage modes are processed in the same way, thus resulting in
a dataset for each direction of every floor under each working condition. Assuming that the
damage detection task contains a total of $p$ damage patterns, the translational acceleration
dataset $D$ for all floors in the x and y directions can be expressed as

$$
D = \begin{bmatrix}
acc''_{x,1,1} & acc''_{y,1,1} & \cdots & acc''_{x,d,1} & acc''_{y,d,1} & \cdots & acc''_{x,4,1} & acc''_{y,4,1} \\
acc''_{x,1,2} & acc''_{y,1,2} & \cdots & acc''_{x,d,2} & acc''_{y,d,2} & \cdots & acc''_{x,4,2} & acc''_{y,4,2} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\
acc''_{x,1,p} & acc''_{y,1,p} & \cdots & acc''_{x,d,p} & acc''_{y,d,p} & \cdots & acc''_{x,4,p} & acc''_{y,4,p}
\end{bmatrix}
\tag{14}
$$

The translational acceleration dataset $D$ is then divided by columns to obtain eight
data subsets. Each subset represents the acceleration response data of a certain translational
direction on a certain floor and contains $p$ damage patterns.
In what follows, the CGsformer is described using the first-floor x-direction acceleration time history
response data, i.e., $D_{x,1,p} = [acc''_{x,1,1} \; acc''_{x,1,2} \; \ldots \; acc''_{x,1,p}]$, which contains $p$
damage patterns. More generally, we define $D_{x,1,p}$ as $X \in \mathbb{R}^{n \times d}$.
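The preprocessing steps described above (sensor fusion, segmentation into windows of 128 points, normalization, and shuffling) can be sketched as follows. This is a minimal illustration under assumptions: the normalization is taken to be a z-score over the whole subset, trailing samples that do not fill a complete segment are dropped, and the function names are hypothetical.

```python
import numpy as np

SEGMENT_LENGTH = 128  # fixed segment length stated in the text

def fuse_floor_sensors(acc1, acc2, acc3, acc4):
    """Eq. (13): average the two sensors that share a direction on one floor."""
    acc_x = 0.5 * (acc1 + acc3)
    acc_y = 0.5 * (acc2 + acc4)
    return acc_x, acc_y

def segment(signal, length=SEGMENT_LENGTH):
    """Split a 1D signal into m nonoverlapping segments; the remainder is dropped."""
    m = signal.size // length
    return signal[: m * length].reshape(m, length)

def normalize(segments, eps=1e-12):
    """Z-score normalization over the whole subset (an assumed choice)."""
    return (segments - segments.mean()) / (segments.std() + eps)

def build_subset(signals_by_pattern, rng):
    """Build one shuffled subset, e.g. first floor, x direction, all damage patterns."""
    features, labels = [], []
    for label, signal in enumerate(signals_by_pattern):
        segs = segment(signal)
        features.append(segs)
        labels.append(np.full(len(segs), label))
    X = normalize(np.concatenate(features))
    y = np.concatenate(labels)
    order = rng.permutation(len(X))    # shuffle samples and labels together
    return X[order], y[order]
```

Each row of the returned `X` is one 128-point sample, and `y` holds the index of the damage pattern it was cut from.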

Figure 8. Distribution of IASC-ASCE SHM Benchmark model measurement points.

Implementation details: The hyperparameters of the CGsformer are set as shown
in Table 2. All hyperparameters were determined by performing ablation experiments on
the first-floor x-direction data of the damaged structure, and the same hyperparameter
settings were used for all experiments. The model was trained with the Adam optimizer
at a learning rate of 0.001, and the loss was computed using the
cross-entropy function. Training and testing were performed on a machine with a Tesla
A100 GPU using a batch size of 32.

Table 2. The hyperparameter settings adopted in the CGsformer.

Setting Value
Encoder Layers 4
Encoder Dim 512
Attention Heads 2
Conv Kernel Size 19
Multihead Attention Dropout 0.4
CGsformer Dropout 0.1
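
For readers who want to reproduce the setup, the values in Table 2 and the training settings stated in the implementation details can be gathered in a single configuration object. This is only a sketch: the class name, the field names, and the `head_dim` helper (which assumes the usual convention that the encoder dimension is split evenly across attention heads) are illustrative and are not taken from the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CGsformerConfig:
    # Architecture settings reported in Table 2.
    encoder_layers: int = 4
    encoder_dim: int = 512
    attention_heads: int = 2
    conv_kernel_size: int = 19
    attention_dropout: float = 0.4
    model_dropout: float = 0.1
    # Training settings reported in the implementation details.
    optimizer: str = "adam"
    learning_rate: float = 1e-3
    loss: str = "cross_entropy"
    batch_size: int = 32

    @property
    def head_dim(self) -> int:
        """Per-head width, assuming an even split of encoder_dim (an assumption)."""
        if self.encoder_dim % self.attention_heads != 0:
            raise ValueError("encoder_dim must be divisible by attention_heads")
        return self.encoder_dim // self.attention_heads


config = CGsformerConfig()
```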

3.2. Comparison with Other Models


To verify the accuracy of the CGsformer relative to other models, the proposed structural damage detection method was compared with other methods on the IASC-ASCE SHM benchmark structural dataset, using the first-floor x direction data at a noise level of 0%. The related acceleration response curves are illustrated in Figures 9 and 10. The comparison models are as follows:
• CNN [28]: In this experiment, the network was built from one-dimensional (1D) convolution operations with two convolutional kernels of size five.
• LSTM [33]: In this experiment, the network was a bidirectional LSTM with two hidden layers and a dimension of 128.
• CNN-LSTM [37]: The spatial features were first extracted using a 1D CNN with a
convolutional kernel size of 15, and then these features were input into a two-layer
LSTM with a hidden layer dimension of 256 for temporal modeling.
• Multihead CNN [50]: Multihead CNN learns different-scale or different-type features
by introducing multiple parallel convolutional branches. Each branch can focus on
different spatial or frequency domain information, and their results are fused to more
comprehensively describe structural damage information.
• Transformer [46]: In this experiment, four Transformer blocks were used with eight
heads in the multiheaded attention mechanism, and the dimension was set to 512.
• Conformer [51]: Conformer combines the advantages of CNN and self-attention
mechanisms, effectively handles long input sequences, and possesses strong modeling
and contextual understanding capabilities. The experimental hyperparameter settings
for Conformer were consistent with the CGsformer, as illustrated in Table 2.

Figure 9. Acceleration time response curve of the D.P.1 damage pattern, collected from sensor No. 1 (acc 1) on the first floor at a noise level of 0%.

Figure 10. Acceleration time response curve of the D.P.1 damage pattern, collected from sensor No. 3 (acc 3) on the first floor at a noise level of 0%.

To measure the classification performance of the different models, the accuracy (ACC) and the F1 score (F_score) were adopted as indicators. They are defined as follows:

$$ACC = \frac{X_{TP} + X_{TN}}{X_{TP} + X_{TN} + X_{FP} + X_{FN}} \tag{15}$$

$$F_{score} = \frac{2 X_{pre} X_{rec}}{X_{pre} + X_{rec}} \tag{16}$$

where X_TP, X_TN, X_FP, and X_FN represent the numbers of true positive, true negative, false positive, and false negative data samples, respectively; X_pre and X_rec are the precision and recall rates, respectively, which are defined as follows:

$$X_{pre} = \frac{X_{TP}}{X_{TP} + X_{FP}} \tag{17}$$

$$X_{rec} = \frac{X_{TP}}{X_{TP} + X_{FN}} \tag{18}$$
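
Equations (15)–(18) can be evaluated directly from the four confusion counts. The sketch below is illustrative only: the function name and the example counts are hypothetical, and it treats a single positive/negative split, since this section does not state how the metrics are averaged across the damage patterns.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute ACC, precision, recall, and F-score from confusion counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0                   # Eq. (15)
    pre = tp / (tp + fp) if (tp + fp) else 0.0                  # Eq. (17)
    rec = tp / (tp + fn) if (tp + fn) else 0.0                  # Eq. (18)
    f_score = 2 * pre * rec / (pre + rec) if (pre + rec) else 0.0  # Eq. (16)
    return {"acc": acc, "precision": pre, "recall": rec, "f_score": f_score}


# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, tn=85, fp=10, fn=15)
```

Zero denominators are mapped to 0.0 here as a design choice, so that a class with no predicted or no actual positives does not raise an exception.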
Table 3 presents the classification performance of the different models on the dataset. The proposed CGsformer achieved the best performance. The single CNN and LSTM models considered only the local or only the global information of the input signals, respectively, thus resulting in relatively poor results. Compared to the CNN, the multihead CNN could learn features of different scales, and its accuracy was 3.29 percentage points higher, reaching 92.39%. However, the multihead CNN still failed to capture the long-term information of the signals. The CNN-LSTM model took into account both local information and long-term dependencies to better capture the characteristics of the structural damage, with an accuracy of 92.55%. Compared to the CNN-LSTM model, the Transformer model captured more robust global dependencies through the self-attention mechanism and achieved a gain of 1.52 percentage points. The Conformer model combined the advantages of the CNN in capturing local features and the Transformer in capturing global features, with an accuracy of 95.27%. The proposed CGsformer model embedded graph structures into both global and local features. This not only effectively filtered out noise interference from global features, but it also helped the convolution module better understand global information, thereby extracting more effective local features. As shown in Table 3, the proposed CGsformer reached an identification accuracy of 96.71%, the best classification performance among the compared models.

Table 3. Comparison results between CGsformer and other models.

Method ACC Fscore


CNN 0.8910 0.8903
LSTM 0.8822 0.8826
CNN-LSTM 0.9255 0.9258
Multihead CNN 0.9239 0.9238
Transformer 0.9407 0.9406
Conformer 0.9527 0.9528
CGsformer 0.9671 0.9672

To test whether the accuracy of the proposed CGsformer model differs significantly from that of the other models, 95% confidence intervals (CIs) were used. Table 4 presents the accuracy differences (∆ACC), the 95% CIs of the accuracy of each model, and the confidence intervals for the accuracy differences between the proposed model and the other models (∆CI), where ∆ACC = ACC_a − ACC_b is the difference in accuracy between the proposed model (ACC_a) and another model (ACC_b). The 95% CI is given by

$$CI = ACC \pm z \cdot SE, \quad SE = \sqrt{\frac{ACC\,(1 - ACC)}{n_t}} \tag{19}$$

where z was set to 1.96 for the 95% CI and n_t is the number of test samples. ∆CI is formulated as follows:

$$\Delta CI = \Delta ACC \pm z \cdot \sqrt{SE_a^2 + SE_b^2} \tag{20}$$
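
As a worked illustration of Equations (19) and (20), the following sketch computes the interval for an accuracy difference. The function names are hypothetical, and the test-set size of 1248 is an assumption taken from the data split reported in Section 3.4.

```python
import math

Z_95 = 1.96  # z value used for the 95% confidence level


def standard_error(acc: float, n_test: int) -> float:
    """Standard error of an accuracy estimate, as in Eq. (19)."""
    return math.sqrt(acc * (1.0 - acc) / n_test)


def accuracy_ci(acc: float, n_test: int, z: float = Z_95) -> tuple:
    """Normal-approximation confidence interval of Eq. (19)."""
    half_width = z * standard_error(acc, n_test)
    return acc - half_width, acc + half_width


def difference_ci(acc_a: float, acc_b: float, n_test: int, z: float = Z_95) -> tuple:
    """Confidence interval of the accuracy difference, as in Eq. (20)."""
    delta = acc_a - acc_b
    half_width = z * math.sqrt(
        standard_error(acc_a, n_test) ** 2 + standard_error(acc_b, n_test) ** 2
    )
    return delta - half_width, delta + half_width


# CGsformer (0.9671) versus Conformer (0.9527), assuming 1248 test samples.
low, high = difference_ci(0.9671, 0.9527, 1248)
significant = low > 0 or high < 0  # the interval must exclude zero
```

With these inputs the interval is approximately (−0.0010, 0.0298). Because it contains zero, the difference is not significant at the 95% level under this test.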

Table 4. Statistical significance comparison of the proposed model with other models.

Model ACC ∆ACC CI ∆CI


CNN 0.8910 0.0761 [0.8724, 0.9078] [0.0562, 0.0960]
LSTM 0.8822 0.0849 [0.8630, 0.8996] [0.0645, 0.1053]
CNN-LSTM 0.9255 0.0416 [0.9104, 0.9402] [0.0240, 0.0592]
Multihead CNN 0.9239 0.0432 [0.9086, 0.9387] [0.0255, 0.0609]
Transformer 0.9407 0.0264 [0.9261, 0.9532] [0.0100, 0.0428]
Conformer 0.9527 0.0144 [0.9394, 0.9638] [−0.0010, 0.0298]
CGsformer 0.9671 - [0.9557, 0.9763] -

It can be seen from Table 4 that the proposed CGsformer model had the highest accuracy. Moreover, the CI of the proposed model was the narrowest, which suggests that the proposed model is more robust. The accuracy differences between the proposed model and the other models were statistically significant at the 95% level, except for Conformer, whose ∆CI includes zero. Overall, the proposed model achieved higher accuracy than all the other models, although its margin over Conformer was not statistically significant.

3.3. Ablation Study


In this subsection, the performance of the GCN module at different positions in the model is compared, and further conclusions are drawn. As shown in Figure 11, three combinations were set: the graph convolution module was placed before the multihead self-attention module, after the convolution module, or between the two (the proposed CGsformer model). They are named Attention Before, Convolution After, and CGsformer, respectively.

Figure 11. Illustration of ablation experiments with GCN placed at different positions in the
CGsformer block.

The accuracy results are presented in Table 5. All three combinations performed better than Conformer. Among them, the proposed CGsformer model achieved the best performance, followed by Attention Before and then Convolution After. Based on these observations, the following conclusions can be drawn: (1) All three models outperformed the Conformer model, which suggests that graph structure learning can help the model select more discriminative features. (2) Placing the GCN module before the multihead self-attention module failed to capture the higher-order information of the current sequence features and relied only on shallow features for sequence representation learning, thus resulting in poorer performance. Placing the GCN module after the convolution module aggregated the final decision features, but this operation failed to fully utilize the semantic information represented by attention and was therefore less effective. (3) The proposed CGsformer model achieved the best performance, with an accuracy of 96.71%. Graph convolution learning of features after the self-attention module can further propagate global features to adjacent nodes, which helps to better understand and express node features by combining local adjacency information.
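
To make the idea of propagating features to adjacent nodes concrete, the sketch below implements one generic graph-convolution step with the commonly used symmetric normalization, ReLU(D^(-1/2) (A + I) D^(-1/2) H W). It is not the authors' GCN module: the adjacency matrix, the feature values, and the identity weight matrix are hypothetical, and how CGsformer actually builds its graph from the attention output is not shown here.

```python
import numpy as np


def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return D^(-1/2) (A + I) D^(-1/2) for a square, non-negative adjacency matrix."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # degrees are >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(features: np.ndarray, adj: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One propagation step: each node mixes its features with those of its neighbours."""
    propagated = normalized_adjacency(adj) @ features @ weight
    return np.maximum(propagated, 0.0)   # ReLU


# Toy graph: three nodes in a chain (0-1-2) with 2-dimensional features.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])
weight = np.eye(2)  # identity weights keep the example easy to follow
out = gcn_layer(features, adj, weight)
```

After one step, every row of `out` is a weighted blend of the node's own features and those of its neighbours, which is the smoothing effect the ablation discussion relies on.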

Table 5. The effectiveness of GCN at various positions in the CGsformer.

Method ACC Fscore


Conformer 0.9527 0.9528
CGsformer 0.9671 0.9672
Attention Before 0.9631 0.9632
Convolution After 0.9623 0.9623

3.4. Comparative Analysis on the Four-Story Numerical Model with 24 Classifiers


As mentioned earlier, the acceleration response data from all measurement points of the IASC-ASCE SHM benchmark structure collected under four damage patterns were used to validate the proposed CGsformer-based damage detection method. The acceleration sensors at all measurement points on each floor had a sampling frequency of f_s = 250 Hz for all damage patterns, and the vibration response was measured over a duration of 800 s. Therefore, the number of sampling points S_j for each sensor within the sampling time was 200,000.
A total of 24 CGsformer classifiers (3 noise levels × 4 floors × 2 translational directions) were trained using the acceleration response data collected from all acceleration sensors of the IASC-ASCE SHM benchmark structure under three different noise levels (0%, 20%, and 50%). The collected data were processed following the steps outlined in Section 3.1. As a result, the dataset of acceleration time history response data for each direction of every floor under each noise level contains 6248 samples (4 damage patterns × 1562 data segments). Among these, 4000 samples were used for training the classifiers, 1000 samples for validation, and 1248 samples for testing the trained classifiers.
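
The bookkeeping behind these numbers can be checked with a few lines of arithmetic. The sketch below only restates the quantities given in this subsection; the variable names are illustrative.

```python
SAMPLING_RATE_HZ = 250       # f_s
DURATION_S = 800             # measurement duration per record
N_DAMAGE_PATTERNS = 4
SEGMENTS_PER_PATTERN = 1562
SPLIT = {"train": 4000, "validation": 1000, "test": 1248}

# Sampling points recorded by one sensor: f_s x duration.
points_per_sensor = SAMPLING_RATE_HZ * DURATION_S

# Samples per (floor, direction, noise level) dataset.
samples_per_dataset = N_DAMAGE_PATTERNS * SEGMENTS_PER_PATTERN

# One classifier per noise level, floor, and translational direction.
n_classifiers = 3 * 4 * 2

# The three subsets must account for every sample exactly once.
if sum(SPLIT.values()) != samples_per_dataset:
    raise ValueError("train/validation/test sizes do not add up to the dataset size")
```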
The ACC results are reported in Table 6, and several observations can be made: (1) Accuracy decreased as the noise level increased. The highest average accuracy of 97.25% was achieved at a noise level of 0% (the best classifier was that for the third floor, x direction, with an accuracy of 98.24%). At a noise level of 20%, the average accuracy reached 96.74% (the best classifier was that for the fourth floor, y direction, with an accuracy of 98.32%). At a noise level of 50%, the average accuracy was 95.44% (the best classifier was again that for the fourth floor, y direction, with an accuracy of 96.15%). The confusion matrices of the best classifier at each of the three noise levels are shown in Figure 12. As can be seen from Figure 12, misclassifications primarily occurred between D.P.0 and D.P.4 at all three noise levels, with the error rate increasing as the noise level rose. This result is attributed to the training samples being insufficiently diverse and to the model's limited sensitivity to the highly similar, overlapping features in the acceleration response signals of these two patterns. Although increasing noise reduced accuracy, comparing the damage detection results at the 0% and 50% noise levels shows that the average detection accuracy of the eight classifiers (four floors, two directions) decreased by only 1.81 percentage points. This indicates that the proposed CGsformer-based damage detection model has strong noise resistance. (2) The CGsformer model remained more robust than the Conformer model as the noise increased. Taking the first floor, x direction, as an example, even with 20% added noise, the detection accuracy reached 96.07%. At noise levels of 0%, 20%, and 50%, the average accuracy values of the Conformer model were 96.18%, 95.25%, and 94.58%, respectively. Compared with these values, the CGsformer results in Table 6 were better by 1.07, 1.49, and 0.86 percentage points at the 0%, 20%, and 50% noise levels, respectively. (3) Although its performance at a noise level of 50% was lower than at low noise levels, the model still achieved results comparable to those of the multihead CNN model at a noise level of 0% (92.39%, Table 3). This indicates that the CGsformer model has further learned the correlation between local and global features, thus exhibiting stronger generalization even under heavy noise.
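
The percentage-point comparisons quoted in this paragraph follow directly from the reported average accuracies. The short sketch below reproduces that arithmetic; it takes the averages as stated in the text and does not recompute them from the individual classifiers.

```python
NOISE_LEVELS = ("0%", "20%", "50%")

# Average accuracies (%) as reported in the text for each noise level.
cgsformer_avg = {"0%": 97.25, "20%": 96.74, "50%": 95.44}
conformer_avg = {"0%": 96.18, "20%": 95.25, "50%": 94.58}

# Gain of CGsformer over Conformer, in percentage points.
gain_over_conformer = {
    level: round(cgsformer_avg[level] - conformer_avg[level], 2)
    for level in NOISE_LEVELS
}

# Loss of CGsformer accuracy between the clean and the noisiest setting.
drop_0_to_50 = round(cgsformer_avg["0%"] - cgsformer_avg["50%"], 2)
```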

Table 6. Experimental results of 24 CGsformer classifiers (3 noise levels × 4 floors × 2 translational directions) using the acceleration response data collected from all acceleration sensors of the IASC-ASCE SHM Benchmark structure under 0%, 20%, and 50% noise levels.

Direction/Floor Noise 0% Noise 20% Noise 50%


First Floor, x direction 0.9671 0.9607 0.9279
First Floor, y direction 0.9688 0.9712 0.9375
Second Floor, x direction 0.9776 0.9704 0.9583
Second Floor, y direction 0.9816 0.9752 0.9391
Third Floor, x direction 0.9824 0.9631 0.9383
Third Floor, y direction 0.9671 0.9535 0.9191
Fourth Floor, x direction 0.9575 0.9607 0.9543
Fourth Floor, y direction 0.9776 0.9832 0.9615
Average 0.9725 0.9674 0.9544

Figure 12. Diagram of the best-performing confusion matrix for the three noise levels.

4. Experimental Verification
In this section, the performance of the proposed CGsformer model is further com-
pared with other models in a four-story, single-span, steel frame structure to verify the
effectiveness of the proposed CGsformer model in different structures through testing.

4.1. Experiment Description


As depicted in Figure 13, the experimental structure is a four-story, single-span, steel
frame structure. The plan dimension of the structure is 260 mm × 320 mm, with a total
height of 672 mm. All floor slabs have been constructed from 16 mm thick steel plates and
are supported by four solid round steel columns. Each column has a height of 152 mm and
a cross-section diameter of 16 mm. The structural members are made from grade Q235
steel, which has a nominal yield stress of 235 MPa.

Figure 13. The four-story steel frame structure experimental model.

During the long-term service of a structure, corrosion of the metal surface can reduce the net cross-sectional area of the components and thereby their load-bearing capacity, which can seriously affect structural safety. Therefore, the focus of this study was to accurately identify changes in the net cross-sectional area of structural components. The structure is considered to be in a healthy state when all columns have a net cross-sectional diameter of 16 mm. To simulate different degrees of corrosion damage, the original 16 mm column at the southeast corner of one or more floors was replaced with a column with a cross-sectional diameter of 14 mm or 12 mm (Figure 14). Table 7 summarizes the six damage scenarios simulated in this paper.
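
To give a sense of how severe each replacement is, the loss of net cross-sectional area can be computed from the column diameters, assuming solid circular sections as described for the test structure. The function names below are illustrative.

```python
import math


def column_area_mm2(diameter_mm: float) -> float:
    """Cross-sectional area of a solid round column, A = pi * d^2 / 4."""
    return math.pi * diameter_mm ** 2 / 4.0


def area_reduction_percent(original_mm: float, reduced_mm: float) -> float:
    """Percentage loss of net cross-sectional area after replacing a column."""
    return 100.0 * (1.0 - column_area_mm2(reduced_mm) / column_area_mm2(original_mm))


reduction_14 = area_reduction_percent(16, 14)  # 16 mm column replaced by 14 mm
reduction_12 = area_reduction_percent(16, 12)  # 16 mm column replaced by 12 mm
```

Under this assumption, the 14 mm and 12 mm columns correspond to area losses of roughly 23.4% and 43.75%, respectively.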

Figure 14. Three types of replacement columns.



Table 7. Damage patterns as defined in the experimental structure model.

Damage Case Description
D.P.0 Without damage (the columns at the southeast corner of floors 1–4 all have a diameter of 16 mm).
D.P.1 Replaced the column on the first floor with a 14 mm diameter column.
D.P.2 Replaced the column on the second floor with a 14 mm diameter column.
D.P.3 Replaced the column on the third floor with a 14 mm diameter column.
D.P.4 Replaced the column on the fourth floor with a 14 mm diameter column.
D.P.5 Replaced the columns on the first and second floors with 14 mm and 12 mm diameter columns, respectively.

As shown in Figure 13, a controllable exciter (Donghua DH40100) with a stinger was used to apply a zero-mean white Gaussian noise excitation to the first floor of the structure along the southeast–northwest diagonal. Four Donghua 1A401E acceleration sensors were installed at the midspan positions of the four sides of each floor slab to measure the structural responses. A Donghua DH8303 data acquisition system was used to collect the acceleration data. The acquisition rate was set at 250 Hz for all damage patterns, and each acquisition lasted 800 s, yielding 200,000 data points. Figures 15 and 16 present the y direction acceleration response signals for the first floor in the undamaged state.

Figure 15. The acceleration response curves of the D.P.0 damage pattern on the south side of the first floor.

Figure 16. The acceleration response curves of the D.P.0 damage pattern on the north side of the first floor.

4.2. Comparative Analysis of Models in the Experimental Structure


To assess the effectiveness and generalization capabilities of CGsformer, the acceleration response data for each direction on each floor were processed according to the data preprocessing method described in Section 3.1. The resulting dataset for each direction contained 9372 samples (6 damage patterns × 1562 data segments). Of these, 7500 samples were allocated for training the classifiers, and the remaining 1872 were set aside for testing. Finally, the preprocessed acceleration response time history data (4 floors × 2 translational directions) were used to train eight CGsformer classifiers.
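
A train/test partition of this size can be sketched as a shuffled index split. This is an assumption about the procedure rather than the authors' implementation: the function name and the fixed random seed are hypothetical, and the paper does not state how the samples were assigned to each subset.

```python
import random


def split_indices(n_samples: int, n_train: int, seed: int = 0):
    """Shuffle sample indices and split them into training and test sets."""
    if not 0 <= n_train <= n_samples:
        raise ValueError("n_train must lie between 0 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # reproducible shuffle
    return indices[:n_train], indices[n_train:]


# Sizes reported for each (floor, direction) dataset of the experimental structure.
train_idx, test_idx = split_indices(n_samples=9372, n_train=7500)
```

Splitting by index rather than copying the data keeps the acceleration segments and their labels aligned when the same indices are applied to both.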
In Table 8, the performance of the proposed CGsformer model is compared with that of four other models, i.e., CNN, LSTM, Transformer, and Conformer. The CGsformer model achieved the highest average accuracy, 92.44%, outperforming the Conformer (91.71%), Transformer (87.60%), LSTM (83.92%), and CNN (78.43%) models. Noteworthy are the accuracies observed in the y direction of the first floor and the x direction of the second floor, where CGsformer reached 93.91% and 92.97%, respectively. These outcomes are attributed to the model's hierarchical learning approach. The CGsformer network employs a multihead self-attention mechanism to capture the global information of signals, which helps to identify the periodic relationships of acceleration response signals. Meanwhile, it utilizes a convolution module to precisely capture the slight differences in acceleration response signals at short and adjacent moments. Furthermore, the graph convolution module embedded in the CGsformer enhances the model's robustness against noise pollution. It does this by constructing a graph structure for global features and filtering out noise in global features through node learning, thereby enabling the convolution module to extract more effective local features. These aspects support the efficacy and reliability of CGsformer in structural damage identification, with good generalization performance across various types of damage.

Table 8. Identification accuracy (%) of different models for the four-story, single-span steel frame structure.

Floor/Direction CNN LSTM Transformer Conformer CGsformer


First Floor, x direction 80.98 84.45 89.52 91.18 93.16
First Floor, y direction 77.78 82.10 87.01 93.37 93.91
Second Floor, x direction 76.01 86.59 86.43 91.93 92.97
Second Floor, y direction 82.05 84.93 87.23 93.48 93.56
Third Floor, x direction 81.20 83.60 84.61 88.67 90.03
Third Floor, y direction 76.33 86.37 85.95 91.72 92.25
Fourth Floor, x direction 74.47 79.22 88.19 91.93 91.83
Fourth Floor, y direction 78.63 84.13 91.82 91.40 92.20
Average 78.43 83.92 87.60 91.71 92.44

5. Conclusions
This paper presents an innovative deep learning model, called CGsformer, for detecting structural damage. The proposed CGsformer effectively extracts the global and
local features of signals by employing a hierarchical learning approach from global to
local. Additionally, the GCN is embedded after the multihead self-attention module for
further propagating global features to adjacent nodes, which helps to better understand
and express node features by incorporating local adjacency information. The proposed
damage detection method based on the CGsformer was verified using simulation data from
the IASC-ASCE benchmark structure and experimental data from a four-story, single-span,
steel frame structure. Some valuable conclusions can be drawn from the validated results:
• The proposed damage detection model has demonstrated its feasibility in test setups
with the IASC-ASCE simulated benchmark structure and a four-story, single-span,
steel frame structure, thus achieving damage identification accuracies of 96.71% and
92.44%, respectively. These results not only validate the effectiveness of the CGsformer
in identifying structural damage but also provide valuable insights for future research.
• The proposed CGsformer model exhibited high accuracy and robustness in limited
datasets and noise-contaminated conditions. In the example of the IASC-ASCE bench-
mark structure, despite the noise level increasing from 0% to 50%, the detection
accuracy only decreased by 1.81%. This means that the CGsformer can more effec-
tively extract features from the acceleration response signal, thus showcasing strong
noise resistance.
Although the proposed method achieved promising performance through its hierarchical learning from global to local, some limitations should also be noted. From an application perspective, the proposed method has not yet been tested in practical engineering. To this end, we plan to collaborate with industry partners to implement and validate our methods in real-world structural health monitoring scenarios. From a technical perspective, although the Transformer can achieve parallel computing compared to the RNN, it also increases the number of model parameters. Thus, we will consider compressing or pruning the model in future work. Moreover, obtaining acceleration response data for structures with damage is a challenge. Thus, future research will combine transfer learning with structural numerical simulation models for structural damage detection.

Author Contributions: Conceptualization, T.H., K.M. and J.X.; methodology, T.H., K.M. and J.X.;
software, T.H. and J.X.; validation, T.H. and J.X.; formal analysis, T.H. and J.X.; investigation, T.H. and
J.X.; resources, K.M. and J.X.; data curation, T.H. and J.X.; writing—original draft preparation, T.H.
and J.X.; writing—review and editing, T.H., K.M. and J.X.; visualization, T.H. and J.X.; supervision,
K.M. and J.X. All authors have read and agreed to the published version of the manuscript.
Funding: The research is supported by the National Natural Science Foundation of China (No.
50978064/Z091015) and the Natural Science Foundation of Guizhou Province of China (No. 2017[1036]).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The code for generating the datasets used in the numerical verification
is available on the datacenterhub website at https://www.dropbox.com/sh/zpkqy5w371mnzam/
AAA-Omuvwx72tjv5NhnhnPuMa?e=1&dl=0 (accessed on 7 July 2024). The datasets used in the
experimental verification are available upon request from the corresponding author. The datasets are
not available to the public, as they are the preliminary results of an ongoing research project carried
out in collaboration. Furthermore, this information will be used in future technological developments
and will be subject to intellectual property protection.
Conflicts of Interest: The authors declare no conflicts of interest.

References
1. Fan, W.; Qiao, P. Vibration-based damage identification methods: A review and comparative study. Struct. Health Monit. 2011,
10, 83–111.
2. Aabid, A.; Parveez, B.; Raheman, M.A.; Ibrahim, Y.E.; Anjum, A.; Hrairi, M.; Parveen, N.; Mohammed Zayan, J. A review of
piezoelectric material-based structural control and health monitoring techniques for engineering structures: Challenges and
opportunities. Actuators 2021, 10, 101.
3. Xiao, F.; Hulsey, J.L.; Balasubramanian, R. Fiber optic health monitoring and temperature behavior of bridge in cold region.
Struct. Control Health Monit. 2017, 24, e2020.
4. Gkoumas, K.; dos Santos, F.; Pekar, F. Research in bridge maintenance, safety and management: An overview and outlook for
europe. In Bridge Maintenance, Safety, Management, Life-Cycle Sustainability and Innovations; CRC Press: Boca Raton, FL, USA, 2021;
pp. 1755–1761.
5. Malla, P.; Khedmatgozar Dolati, S.S.; Ortiz, J.D.; Mehrabi, A.B.; Nanni, A.; Ding, J. Damage detection in frp-reinforced concrete
elements. Materials 2024, 17, 1171.
6. Tang, Q.; Xin, J.; Jiang, Y.; Zhang, H.; Zhou, J. Dynamic Response Recovery of Damaged Structures Using Residual Learning
Enhanced Fully Convolutional Network. Int. J. Struct. Stab. Dyn. 2024, 2550008.
7. Lucà, F.; Manzoni, S.; Cerutti, F.; Cigada, A. A damage detection approach for axially loaded beam-like structures based on
gaussian mixture model. Sensors 2022, 22, 8336.
8. Azimi, M.; Eslamlou, A.D.; Pekcan, G. Data-driven structural health monitoring and damage detection through deep learning:
State-of-the-art review. Sensors 2020, 20, 2778.
9. Dao, P.B.; Staszewski, W.J. Lamb wave based structural damage detection using stationarity tests. Materials 2021, 14, 6823.
10. Xiao, F.; Sun, H.; Mao, Y.; Chen, G.S. Damage identification of large-scale space truss structures based on stiffness separation
method. Structures 2023, 53, 109–118.
11. Tong, K.; Zhang, H.; Zhao, R.; Zhou, J.; Ying, H. Investigation of smfl monitoring technique for evaluating the load-bearing
capacity of rc bridges. Eng. Struct. 2023, 293, 116667.
12. Angeletti, F.; Iannelli, P.; Gasbarri, P.; Panella, M.; Rosato, A. A study on structural health monitoring of a large space antenna via
distributed sensors and deep learning. Sensors 2022, 23, 368.
13. Altabey, W.A.; Wu, Z.; Noori, M.; Fathnejat, H. Structural health monitoring of composite pipelines utilizing fiber optic sensors
and an ai-based algorithm—A comprehensive numerical study. Sensors 2023, 23, 3887.
14. Kot, P.; Muradov, M.; Gkantou, M.; Kamaris, G.S.; Hashim, K.; Yeboah, D. Recent advancements in non-destructive testing
techniques for structural health monitoring. Appl. Sci. 2021, 11, 2750.
15. Hassani, S.; Dackermann, U. A systematic review of advanced sensor technologies for non-destructive testing and structural
health monitoring. Sensors 2023, 23, 2204.
16. Hou, R.; Xia, Y. Review on the new development of vibration-based damage identification for civil engineering structures:
2010–2019. J. Sound Vib. 2021, 491, 115741.
17. Xiao, F.; Hulsey, J.L.; Chen, G.S.; Xiang, Y. Optimal static strain sensor placement for truss bridges. Int. J. Distrib. Sens. Netw. 2017,
13, 5.
18. Avci, O.; Abdeljaber, O.; Kiranyaz, S.; Hussein, M.; Gabbouj, M.; Inman, D.J. A review of vibration-based damage detection in
civil structures: From traditional methods to machine learning and deep learning applications. Mech. Syst. Signal Process. 2021,
147, 107077.
19. Nick, H.; Aziminejad, A.; Hosseini, M.H.; Laknejadi, K. Damage identification in steel girder bridges using modal strain
energy-based damage index method and artificial neural network. Eng. Fail. Anal. 2021, 119, 105010.
20. Fallahian, M.; Ahmadi, E.; Khoshnoudian, F. A structural damage detection algorithm based on discrete wavelet transform and
ensemble pattern recognition models. J. Civ. Struct. Health Monit. 2022, 12, 323–338.
21. Indhu, R.; Sundar, G.R.; Parveen, H.S. A review of machine learning algorithms for vibration-based shm and vision-based shm.
In Proceedings of the 2022 Second International Conference on Artificial Intelligence and Smart Energy (ICAIS), Coimbatore,
India, 23–25 February 2022; pp. 418–422.
22. Zhang, J.; Sato, T.; Iai, S. Support vector regression for on-line health monitoring of large-scale structures. Struct. Saf. 2006,
28, 392–406.
23. Hua, X.; Ni, Y.; Ko, J.; Wong, K. Modeling of temperature–frequency correlation using combined principal component analysis
and support vector regression technique. J. Comput. Civ. Eng. 2007, 21, 122–135.
24. Salkhordeh, M.; Mirtaheri, M.; Soroushian, S. A decision-tree-based algorithm for identifying the extent of structural damage in
braced-frame buildings. Struct. Control Health Monit. 2021, 28, e2825.
25. Wang, Y.; Su, F.; Guo, Y.; Yang, H.; Ye, Z.; Wang, L. Predicting the microbiologically induced concrete corrosion in sewer based on
xgboost algorithm. Case Stud. Constr. Mater. 2022, 17, e01649.
26. Lingxin, Z.; Junkai, S.; Baijie, Z. A review of the research and application of deep learning-based computer vision in structural
damage detection. Earthq. Eng. Eng. Vib. 2022, 21, 1–21.
27. Eltouny, K.; Gomaa, M.; Liang, X. Unsupervised learning methods for data-driven vibration-based structural health monitoring:
A review. Sensors 2023, 23, 3290.
28. Abdeljaber, O.; Avci, O.; Kiranyaz, S.; Gabbouj, M.; Inman, D.J. Real-time vibration-based structural damage detection using
one-dimensional convolutional neural networks. J. Sound Vib. 2017, 388, 154–170.
29. Zhang, Y.; Miyamori, Y.; Mikami, S.; Saito, T. Vibration-based structural state identification by a 1-dimensional convolutional
neural network. Comput.-Aided Civ. Infrastruct. Eng. 2019, 34, 822–839.
30. Khodabandehlou, H.; Pekcan, G.; Fadali, M.S. Vibration-based structural condition assessment using convolution neural networks.
Struct. Control Health Monit. 2019, 26, e2308.
31. Tang, Z.; Chen, Z.; Bao, Y.; Li, H. Convolutional neural network-based data anomaly detection method using multiple information
for structural health monitoring. Struct. Control Health Monit. 2019, 26, e2296.
32. Mantawy, I.M.; Mantawy, M.O. Convolutional neural network based structural health monitoring for rocking bridge system by
encoding time-series into images. Struct. Control Health Monit. 2022, 29, e2897.
33. Lin, Z.; Liu, Y.; Zhou, L. Damage detection in a benchmark structure using long short-term memory networks. In Proceedings of
the 2019 Chinese Automation Congress (CAC), Hangzhou, China, 22–24 November 2019; pp. 2300–2305.
34. Sony, S.; Gamage, S.; Sadhu, A.; Samarabandu, J. Vibration-based multiclass damage detection and localization using long
short-term memory networks. Structures 2022, 35, 436–451.
35. Zou, J.; Yang, J.; Wang, G.; Tang, Y.; Yu, C. Bridge structural damage identification based on parallel cnn-gru. IOP Conf. Ser. Earth
Environ. Sci. 2021, 626, 012017.
36. Yang, J.; Yang, F.; Zhou, Y.; Wang, D.; Li, R.; Wang, G.; Chen, W. A data-driven structural damage detection framework based on
parallel convolutional neural network and bidirectional gated recurrent unit. Inf. Sci. 2021, 566, 103–117.
37. Bao, X.; Fan, T.; Shi, C.; Yang, G. Deep learning methods for damage detection of jacket-type offshore platforms. Process Saf.
Environ. Prot. 2021, 154, 249–261.
38. Fu, L.; Tang, Q.; Gao, P.; Xin, J.; Zhou, J. Damage identification of long-span bridges using the hybrid of convolutional neural
network and long short-term memory network. Algorithms 2021, 14, 180.
39. Dang, V.-H.; Vu, T.-C.; Nguyen, B.-D.; Nguyen, Q.-H.; Nguyen, T.-D. Structural damage detection framework based on graph
convolutional network directly using vibration data. Structures 2022, 38, 40–51.
40. Wang, S.; Luo, Z.; Shen, P.; Zhang, H.; Ni, Z. Graph-in-graph convolutional network for ultrasonic guided wave-based damage
detection and localization. IEEE Trans. Instrum. Meas. 2022, 71, 2502011.
41. Zhan, P.; Qin, X.; Zhang, Q.; Sun, Y. A novel structural damage detection method via multi-sensor spatial-temporal graph-based
features and deep graph convolutional network. IEEE Trans. Instrum. Meas. 2023, 72, 2504814.
42. Liang, Z.; Li, D.; Ren, W. Structural damage identification method based on recursive graph for automatic feature extraction. In
Proceedings of the 29th National Conference on Structural Engineering (Volume II), Wuhan, China, 16–18 October 2020.
43. Johnson, E.A.; Lam, H.-F.; Katafygiotis, L.S.; Beck, J.L. Phase i iasc-asce structural health monitoring benchmark problem using
simulated data. J. Eng. Mech. 2004, 130, 3–15.
44. Hwang, H.; Kim, C. Damage detection in structures using a few frequency response measurements. J. Sound Vib. 2004, 270, 1–14.
45. Yessoufou, F.; Zhu, J. Classification and regression-based convolutional neural network and long short-term memory configuration
for bridge damage identification using long-term monitoring vibration data. Struct. Health Monit. 2023, 22, 14759217231161811.
46. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you
need. In Proceedings of the Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA,
4–9 December 2017.
47. Dai, Z.; Yang, Z.; Yang, Y.; Carbonell, J.; Le, Q.V.; Salakhutdinov, R. Transformer-XL: Attentive language models beyond a
fixed-length context. arXiv 2019, arXiv:1901.02860.
48. Zhang, Z.; Cui, P.; Zhu, W. Deep Learning on Graphs: A Survey. IEEE Trans. Knowl. Data Eng. 2020, 34, 249–270.
49. Chen, F. Improvement of Kalman Filter and Kalman Estimator in the Application of Structural Damage Detection. Ph.D. Thesis,
Xiamen University, Xiamen, China, 2014.
50. Junior, R.F.R.; dos Santos Areias, I.A.; Campos, M.M.; Teixeira, C.E.; da Silva, L.E.B.; Gomes, G.F. Fault detection and diagnosis in
electric motors using 1D convolutional neural networks with multi-channel vibration signals. Measurement 2022, 190, 110759.
51. Gulati, A.; Qin, J.; Chiu, C.-C.; Parmar, N.; Zhang, Y.; Yu, J.; Han, W.; Wang, S.; Zhang, Z.; Wu, Y.; et al. Conformer: Convolution-
augmented transformer for speech recognition. arXiv 2020, arXiv:2005.08100.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
