Structural Damage Identification Based On Autoencoder Neural Networks and Deep Learning


Engineering Structures 172 (2018) 13–28

Contents lists available at ScienceDirect

Engineering Structures
journal homepage: www.elsevier.com/locate/engstruct

Structural damage identification based on autoencoder neural networks and deep learning

Chathurdara Sri Nadith Pathirage a, Jun Li b, Ling Li a,*, Hong Hao b,c, Wanquan Liu a, Pinghe Ni b,d

a School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University, Kent Street, Bentley, WA 6102, Australia
b Centre for Infrastructural Monitoring and Protection, School of Civil and Mechanical Engineering, Curtin University, Kent Street, Bentley, WA 6102, Australia
c School of Civil Engineering, Guangzhou University, Guangzhou 510006, China
d Department of Civil and Environmental Engineering, Hong Kong Polytechnic University, Hung Hom, Hong Kong

ARTICLE INFO

Keywords:
Autoencoders
Deep learning
Deep neural networks
Structural damage identification
Pre-training

ABSTRACT

Artificial neural networks are computational approaches based on machine learning to learn and make predictions based on data, and have been applied successfully in diverse applications including structural health monitoring in civil engineering. It is difficult to optimize the weights in neural networks that have multiple hidden layers due to the vanishing gradient issue. This paper proposes an autoencoder based framework for structural damage identification, which can support deep neural networks and be utilized to obtain optimal solutions for pattern recognition problems of highly non-linear nature, such as learning a mapping between the vibration characteristics and structural damage. Two main components are defined in the proposed framework, namely, dimensionality reduction and relationship learning. The first component is to reduce the dimensionality of the original input vector while preserving the necessary information, and the second component is to perform the relationship learning between the features with the reduced dimensionality and the stiffness reduction parameters of the structure. Vibration characteristics, such as natural frequencies and mode shapes, are used as the input and the structural damage is considered as the output vector. A pre-training scheme is performed to train the hidden layers in the autoencoders layer by layer, and fine tuning is conducted to optimize the whole network. Numerical and experimental investigations on steel frame structures are conducted to demonstrate the accuracy and efficiency of the proposed framework, compared with the traditional ANN methods.

1. Introduction

Civil infrastructure, including bridges and buildings, is crucial for a society to function well. Such structures may deteriorate progressively and accumulate damage during their service life due to fatigue, overloading and extreme events, such as strong earthquakes and cyclones. Structural Health Monitoring (SHM) provides practical means to assess and predict the structural performance under operational conditions. It is usually referred to as the measurement of the critical responses of a structure to track and evaluate the symptoms of operational incidents, anomalies, and deterioration that may affect the serviceability and safety [1]. Numerous efforts have been devoted to developing vibration based structural damage identification methods by using vibration characteristics of structures [2]. These methods are based on the fact that changes in the structural physical parameters, such as stiffness and mass, will alter the structural vibration characteristics as well, i.e. natural frequencies and mode shapes. Structural damage identification based on changes in vibration characteristics of structures can be formulated as a pattern-recognition problem.

One of the most significant challenges associated with the vibration based methods is that they are susceptible to uncertainties in the damage identification process, such as finite element modelling errors, noise in the measured vibration data and environmental effects. Artificial intelligence techniques, such as Artificial Neural Networks (ANN) [3], Genetic Algorithms (GA) [4] and Swarm Intelligence methods [5,6], are computational approaches based on machine learning to learn and make predictions based on data, and have been applied successfully in diverse applications including SHM in civil engineering. Yun et al. [7] estimated the structural joint damage from modal data via an ANN model. Noise injection learning with a realistic noise level for each input component was found to be effective in better understanding the noise effect in this work. Later, the mode shape differences or the mode shape ratios before and after damage were used as the input to the neural networks to reduce the effect of the modelling errors in the baseline finite element model [8]. Measured frequency response functions (FRF) were analysed by using Principal Component


* Corresponding author.
E-mail address: [email protected] (L. Li).

https://doi.org/10.1016/j.engstruct.2018.05.109
Received 20 January 2018; Received in revised form 22 April 2018; Accepted 30 May 2018
Available online 18 June 2018
0141-0296/ © 2018 Elsevier Ltd. All rights reserved.
C.S.N. Pathirage et al. Engineering Structures 172 (2018) 13–28

Analysis (PCA) for data reduction, and the compressed FRFs represented by the most significant components were then used as the input to ANN for structural damage detection [9]. Ni et al. [10] investigated the construction of appropriate input vectors to neural networks for hierarchical identification of structural damage location and extent from measured modal properties. The neural network is first trained to locate the damage, and then re-trained to evaluate the damage extent with several natural frequencies and modal shapes. Yeung and Smith [11] generated the vibration feature vectors from the response spectra of a bridge under moving traffic as the input to neural networks for examination. It was shown that the sensitivity of the neural networks may be adjusted so that a satisfactory rate of damage detection is achieved even in the presence of noisy signals. Bakhary et al. [12] proposed a statistical approach to account for the effect of uncertainties in developing an ANN model. Li et al. [13] used pattern changes in frequency response functions and ANN to identify structural damage. Later, Bandara et al. [14] used PCA to reduce the dimension of the measured FRF data and transformed it into new damage indices. ANN was then employed for the damage localization and quantification. Dackermann et al. [15] utilized cepstrum based operational modal analysis and ANN for damage identification of civil engineering structures. The damages in the joints of a multi-storey structure can be identified effectively.

In general, neural networks are particularly applicable to problems where a significant amount of information is available, but an explicit algorithm for processing it is difficult to specify. The weights associated with the mapping functions that make the neural networks exhibit desired behavior are obtained from training on a large amount of data. Back propagation based on the gradient descent method is one of the most traditional training algorithms, which has been found to be effective provided that: (1) initial weights are close enough to a good solution; (2) computers are fast enough; and (3) data sets are big enough. However, it is difficult to optimize the weights in networks that have multiple hidden layers due to the vanishing gradient issue and convergence to local minima [16]. This problem has been a bottleneck for ANN with shallow architecture models. For network models with a deep structure, the major difficulty has been to optimize the weights of the hidden layers that are close to the input layer.

Hinton and Salakhutdinov [17] introduced the concept of deep learning to reduce the dimensionality of data and tackle the above three limitations. Deep neural networks have attracted wide-spread attention, mainly since they outperform alternative machine learning methods such as support vector machines and kernel machines in numerous important applications. The original applications mainly focused on face detection, object recognition, speech recognition and detection, and natural language processing [18,19]. Recently it has been developed for fault detection and diagnosis in mechanical engineering [20,21]. A study on using 1-D Convolutional Neural Networks for detecting structural damage was conducted in 2017 [22]. It should be noted that sensors have to be placed on all the joints in a space frame structure to detect the damage in that work. Cha et al. [23] proposed a vision-based method using a deep architecture of convolutional neural networks for detecting concrete surface cracks without calculating the defect features. Nadith et al. [24] explored using the Autoencoders model to perform a feasibility study on pattern recognition for structural health monitoring with numerical simulations only. No system uncertainties and measurement noises have been considered.

It has been demonstrated that deep learning based methods are favorable in optimizing networks with multiple hidden layers [25]. Autoencoders are unsupervised training models. The aim of an autoencoder is to learn a representation for a set of data, usually for the purpose of dimensionality reduction. A deep autoencoder is utilized for effective feature learning through hierarchical non-linear mappings via the multiple hidden layers of the model [26]. The training of autoencoders is usually performed in two stages: pre-training and fine-tuning. The pre-training is usually performed layer by layer, and multiple simple autoencoders are used to initialize the layer weights so that they are close enough to a good solution. The fine-tuning is performed to optimize the multiple layers of the whole network together with respect to the final objective function. Autoencoders have been used in the “deep architecture” approaches [17,27–30] with unsupervised learning algorithms.

This paper proposes an autoencoder based framework for structural damage identification, which can be utilized to learn optimal solutions for pattern recognition problems of highly non-linear nature, such as learning a mapping between the vibration characteristics and structural damages. The proposed framework consists of two main components, namely, dimensionality reduction and relationship learning. The first component is to reduce the dimensionality of the original input vector while preserving the necessary information, and the second component is to perform the relationship learning between the features with the reduced dimensionality and the stiffness reduction parameters of the structure. Vibration characteristics, such as natural frequencies and mode shapes, are used as the input and the structural damages are considered as the output vector. A pre-training scheme is performed to train the hidden layers in the autoencoders layer by layer, and fine tuning is conducted to optimize the whole network. Numerical studies and experimental validations on steel frame structures are conducted to demonstrate the accuracy and efficiency of the proposed framework, compared with the traditional ANN methods.

2. Autoencoder based framework for structural health monitoring

An autoencoder based framework, which can support deep neural networks, is proposed for structural health monitoring. A typical autoencoder model will be briefly described in Section 2.1, and the proposed Autoencoder based framework will be presented in Section 2.2. The proposed framework will be applied for structural damage identification, which is a pattern recognition problem based on the fact that changes in structural physical material properties, i.e. stiffness, will alter the structural vibration characteristics, i.e. natural frequencies and mode shapes. In this study, natural frequencies and mode shapes serve as the input to the proposed framework and the output will be the elemental stiffness reduction parameters representing structural health conditions. Training methods for the proposed framework will also be described in Section 2.2.

2.1. Autoencoder

A traditional autoencoder [26] consists of two core segments: encoder and decoder with a single hidden layer.

Encoder: The deterministic mapping $f(x)$, which transforms a d-dimensional input vector $x \in \mathbb{R}^{d}$ into an r-dimensional hidden representation $h \in \mathbb{R}^{r}$, is called an encoder. Its typical form is an affine mapping followed by a nonlinear transformation, which can be expressed as follows

$$ h = f(x) = \Phi(Wx + b) \tag{1} $$

where $W \in \mathbb{R}^{r \times d}$ denotes the mapping weight matrix of the encoder, $b \in \mathbb{R}^{r}$ is the bias vector and $\Phi$ is the activation function, which is usually a squashing non-linear function and could be a sigmoid function or hyperbolic tangent function: $\Phi(x) = \mathrm{sigmoid}(x) = 1/(1 + e^{-x})$ or $\Phi(x) = \tanh(x) = (e^{x} - e^{-x})/(e^{x} + e^{-x})$. A non-squashing linear function, such as $\Phi(x) = \mathrm{purelin}(x) = x$, can also be used to output real values that do not fall into a specific range, where “purelin” is a linear transfer function.

Decoder: The mapping $g(h)$, which transforms the hidden representation $h$ (obtained in the step described above) back into a reconstructed vector $z \in \mathbb{R}^{d}$ in the input space, is called a decoder. The typical form of a decoder is also an affine mapping optionally followed by a squashing nonlinearity


$$ z = g(h) = \Phi(\hat{W}h + \hat{b}) \tag{2} $$

where $\hat{W} \in \mathbb{R}^{d \times r}$ is the weight matrix of the decoder, $\hat{b} \in \mathbb{R}^{d}$ is the bias vector and $\Phi$ is the activation function described above.

To optimize the parameters $W$, $b$, $\hat{W}$, $\hat{b}$, usually the mean squared error is employed as the cost function as follows

$$ [W^{*}, b^{*}, \hat{W}^{*}, \hat{b}^{*}] = \arg\min_{W, b, \hat{W}, \hat{b}} \frac{1}{m} \sum_{i=1}^{m} \left( \frac{1}{2} \left\| g(f(x^{(i)})) - x^{(i)} \right\|^{2} \right) \tag{3} $$

where $m$ is the number of samples, $x^{(i)}$ is the ith input, and the mappings $f(\cdot)$ and $g(\cdot)$ are the encoder and decoder functions respectively. Because of the nonlinearity of the activation function, the minimization in Eq. (3) is difficult to solve directly, thus the gradient descent algorithm is commonly employed. A typical autoencoder neural network in Eq. (3) tries to reconstruct the input. However, if a response distinct from the input is used as the output of the decoder $g(\cdot)$, it can be considered as a kind of non-linear regression technique.

2.2. The proposed framework

Autoencoders can be used for various tasks, such as effective feature learning, dimensionality reduction and nonlinear regression [27,28]. These functions are explored in the proposed framework for structural health monitoring to learn a compressed feature representation and form a nonlinear regression for accurate and robust structural damage detection.

The recent demonstrations of the potential of deep learning algorithms were achieved despite the serious challenge of training models with many layers of adaptive parameters. In general, in all instances of deep learning, the objective function is a highly non-convex function of the parameters, with the potential for many distinct local minima in the model parameter space. Hence the optimization algorithm may not be guaranteed to arrive at even a local minimum in a reasonable amount of time, but it often finds a very low value of the cost function quickly enough to be useful, provided that a decent initialization is given for the weights. The principal difficulty is that not all of these minima provide equivalent generalization errors; which minimum is reached depends on the weight initialization method used for deep architectures. The standard training schemes (based on random initialization) tend to place the parameters in regions of the parameter space that generalize poorly, as was frequently observed empirically but rarely reported [25].

Hence the concept of layer wise pre-training of the network is introduced to find weights that are close to the optimal ones. In this paper a set of simple autoencoders is used to perform this task. One of the claims of this paper is that powerful unsupervised and semi-supervised (or self-taught) learning is a crucial component in building successful learning algorithms for deep architectures aimed at approaching optimal solutions. If gradients of a criterion defined at the output layer become less useful as they are propagated backwards to lower layers, it is reasonable to believe that an unsupervised learning criterion defined at the level of a single layer could be used to move its parameters in a favourable direction. It would be reasonable to expect this if the single-layer learning algorithm discovered a representation that captures statistical regularities of the layer’s input [27]. Also, layer wise pre-training could be a way to naturally decompose the problem into sub-problems associated with different levels of abstraction. It is known that unsupervised learning algorithms can extract salient information about the input distribution. This information can be captured in a distributed representation, i.e., a set of features which encode the salient factors of variation in the input. A one-layer unsupervised learning algorithm could extract such salient features, but because of the limited capacity of that layer, the features extracted on the first level of the architecture can be seen as low-level features. It is conceivable that learning a second layer based on the same principle, but taking as input the features learned with the first layer, could extract slightly higher-level features. In this way, one could imagine that higher-level abstractions that characterize the input could emerge. In the latter stage, layers are pre-trained to perform the mapping from the learned salient features to the output. Note how in this process all learning could remain local to each layer, therefore side-stepping the issue of gradient diffusion that might be hurting gradient-based learning of deep neural networks when we try to optimize a single global criterion.

The objective of the proposed framework is to learn the relationship between the structural vibration characteristics, i.e. natural frequencies and mode shapes, and the physical properties of structures, such as stiffness. Therefore the input to the framework is the modal information, such as frequencies and mode shapes, while the elemental stiffness reduction parameters of structures are the output vector. The input feature vector, including possibly many orders of natural frequencies and mode shapes, is usually high dimensional. Learning a relationship directly from a high dimensional input will very likely be less accurate than using compressed features, since the high dimensional input feature may contain unnecessary information due to the redundancy in the data, as well as uncertainties such as measurement noise and finite element modelling errors. Therefore, structural damage identification in this study is performed in two main components in the proposed framework, as shown in Fig. 1. The first component is to reduce the dimensionality of the original input vector while preserving the necessary information, and the second component is to perform the relationship learning between the compressed features with reduced dimensionality and the structural stiffness reduction parameters. Each component is defined with a specific objective optimization function, which will be described in the following sections.

The proposed framework is shown in Fig. 1. As mentioned above, there are two components in this framework, namely the dimensionality reduction and the relationship learning. In the dimensionality reduction component, an autoencoder based model with a deep architecture and nonlinear activation units is proposed to perform the nonlinear dimensionality reduction. A lower dimensional feature vector learned from this process is obtained to represent the given high dimensional data. It is worth noting that a pre-training scheme is conducted for training the first component. The quality of the dimensionality reduction process is evaluated by using the mean squared error (MSE) and the regression value (R-value) on the reconstruction accuracy of the original input feature. In the relationship learning component, a simple autoencoder model with a single hidden layer and nonlinear activation units is utilized to perform this regression task. MSE and R-value are also used to evaluate the quality of the predictions

Fig. 1. Framework of Autoencoder based deep neural networks.


on structural stiffness parameters.

These two components are employed sequentially, and fine-tuning of the whole network is conducted to perform the joint optimization on the final objective function for learning the relationship from the original input vector to the final output. In this manner, the proposed framework is capable of retaining only the required information to establish the relationship between the learned compressed features and the final output in the form of stiffness reduction parameters. The detailed formulations of these two components are presented in the following sections.

Fig. 2. Framework of the proposed Autoencoder based model.

2.2.1. Dimensionality reduction

An autoencoder model with a deep neural network architecture is trained for the dimensionality reduction, where the 1st hidden layer is defined to perform the feature fusion of both the frequencies and mode shapes from the structure while the subsequent 2nd to kth hidden layers further compress the features, as shown in Fig. 2. One can visualize this model as the encoding architecture of a typical deep autoencoder [29], but not strictly the generic deep autoencoder model with the decoding structure.

$c^{r}$ represents the combined high dimensional input vector, including $n$ structural natural frequencies and the corresponding $n \times t$ mode shape values

$$ c^{r} = \left[ q_{1}^{r}, q_{2}^{r}, \ldots, q_{i}^{r}, m_{1}^{q_{1}^{r}}, m_{2}^{q_{1}^{r}}, \ldots, m_{t}^{q_{i}^{r}}, \ldots, m_{j}^{q_{i}^{r}} \right]^{T} \tag{4} $$

where $q_{i}^{r}$ is the ith (i = 1…n) natural frequency included in the rth sample; $m_{j}^{q_{i}^{r}}$ denotes the jth (j = 1…t) mode shape value corresponding to the ith frequency and $t$ is the number of measurement points for describing a mode shape. A layer-wise pre-training scheme [29] is performed for all the layers of this dimensionality reduction component with the following cost function

$$ J_{cost}^{p}(W, b) = J_{MSE}^{p}(W, b) + \lambda J_{weight}^{p}(W, b) \tag{5} $$

with

$$ J_{MSE}^{p}(W, b) = \sum_{r=1}^{N} \left\| h_{p-1}^{r} - g_{p}\left(f_{p}\left(h_{p-1}^{r}\right)\right) \right\|_{2}^{2} \tag{6} $$

$$ J_{weight}^{p}(W) = \frac{1}{2} \sum_{l=p-1}^{p} \sum_{i=1}^{s_{l}} \sum_{j=1}^{s_{l+1}} \left( w_{ji}^{(l)} \right)^{2} \tag{7} $$

where p = {1…k} with k being the number of layers in the dimensionality reduction component, $N$ is the number of data samples involved in the training, and $g_{p}$ and $f_{p}$ are the decoder and encoder functions of the p-th layer, respectively. $h_{p-1}^{r}$ is the lower dimensional representation that is established in the (p−1)th layer for the rth sample, where $h_{0}^{r} = c^{r}$. $w_{ji}^{(l)}$ represents the weighting coefficient in the weighting matrix $W^{(l)}$, and $s_{l}$ denotes the number of neural units in the l-th layer.

Encoder function $f_{p}$ is set to be “tanh” (hyperbolic tangent) since the value 0 is contained in its activation region, and it thus supports a sparse representation of the input when the activation of a hidden unit becomes 0. Decoder function $g_{p}$ is set to be “purelin” since it needs to reconstruct the real values of the input. The factor “1/2” in Equation (7) is used to eliminate “2” when taking the gradient of the mean square error. This is to have a clean derivative of the cost function. Having a multiplication factor on the cost will not change the optimal solution reached via the optimizing method. Furthermore, the L2-weight decay term denoted in Eq. (7) is added to the cost function as shown in Eq. (5) to avoid over-fitting in the overall training process. The motivation behind L2 (or L1) regularization on network weights in all layers is that by restricting the weights, and so constraining the network, it is less likely to overfit. Also, this mechanism of regularization helps to constrain the number of hidden nodes in a layer during the pre-training by pushing the weights to zero (thereby making the inputs to a node close to zero, and making the neuron’s response less significant). In addition, L2-weight decay constrains the hidden nodes of a layer [29], thus allowing the model to utilize hidden layers with the same number of hidden nodes as its input. A typical autoencoder without L2-weight decay would learn the identity mapping from its input to output. The burden of choosing a suitable number of hidden nodes for each layer is handled to a certain extent with the introduction of this constraint. The optimal parameter value for λ in Eq. (5) is chosen by utilizing a validation dataset [31]. The compressed representation feature learned in the kth layer, $h_{k}^{r}$, is then fed to a nonlinear relationship learning component that will be described in the following section.

2.2.2. Relationship learning

The relationship learning component, as shown in Fig. 2, is defined to perform the regression task utilizing the low dimensional feature learned at the kth layer, which is a better feature representation than the original input, to predict the structural stiffness reduction parameters as the final output. A simple autoencoder model with only one hidden layer and a hyperbolic tangent activation function is defined to perform this task. The cost function for this model is defined as

$$ J_{cost}^{k+1}(W, b) = J_{MSE}^{k+1}(W, b) + \lambda J_{weight}^{k+1}(W, b) \tag{8} $$

with

$$ J_{MSE}^{k+1}(W, b) = \sum_{r=1}^{N} \left\| o^{r} - g_{k+1}\left(f_{k+1}\left(h_{k}^{r}\right)\right) \right\|_{2}^{2} \tag{9} $$

where $g_{k+1}$ and $f_{k+1}$ are respectively the decoder and the encoder functions of the (k + 1)th layer, $h_{k}^{r}$ is the low dimensional representation obtained at the kth layer (also the last layer) of the dimensionality reduction component for the rth sample, and $o^{r}$ is the labeled output vector, namely the pre-defined stiffness reduction parameters of the rth sample. The output of this relationship learning component is the predicted structural stiffness reduction parameters.

2.2.3. Fine-tuning

Once the optimal mapping weight coefficients and bias parameters of all the hidden layers are obtained with the pre-training scheme, the whole network is fine-tuned to optimize all the layers as a whole with the following cost function

$$ J_{cost}^{F}(W, b) = J_{MSE}^{F}(W, b) + \lambda J_{weight}^{F}(W, b) \tag{10} $$

with

$$ J_{MSE}^{F}(W, b) = \sum_{r=1}^{N} \left\| o^{r} - p(c^{r}) \right\|_{2}^{2} \tag{11} $$

where $p(c^{r}) = g_{k+1}(f_{k}(f_{k-1}(f_{k-2}(\ldots(c^{r})))))$ is the predicted output vector through the activations of all the layers in both the dimensionality reduction and relationship learning components. The layer-wise pre-training and fine-tuning of the whole network are performed to


[Figure: (a) photograph of the steel frame model; (b) frame dimensions: beam length 0.5 m, storey height 0.3 m, total height 2.1 m, column cross-section 49.98 mm × 4.85 mm, beam cross-section 49.89 mm × 8.92 mm]
Fig. 3. Laboratory model and dimensions of the steel frame structure: (a) Steel frame model; (b) Dimensions.

improve the training efficiency and achieve a better accuracy of the proposed framework.

3. Numerical studies

In this section, the numerical model, data generation and pre-processing, and performance evaluation of the proposed framework will be presented. The accuracy and efficiency of using the proposed framework for structural damage identification will be evaluated with simulation data generated from a numerical finite element model. Both the uncertainties in the finite element modelling and the measurement noise effect in the data will be considered.

3.1. Numerical model

A seven-storey steel frame structure is fabricated in the laboratory and the dimensions of the frame are shown in Fig. 3. The column of the frame has a total height of 2.1 m with 0.3 m for each storey. The length of the beam is 0.5 m. The cross-sections of the column and beam elements are measured as 49.98 mm × 4.85 mm and 49.89 mm × 8.92 mm, respectively. The measured mass densities of the column and beam elements are 7850 kg/m³ and 7734.2 kg/m³, respectively. The initial Young’s modulus is taken as 210 GPa for all members. The connections between column and beam elements are continuously welded at the top and bottom of the beam section. Two pairs of mass blocks, with approximately 4 kg weight each, are fixed at the quarter and three-quarter length of the beam in each storey to simulate the mass from the floor of a building structure. The bottoms of the two columns of the frame are welded onto a thick and solid steel plate which is fixed to the ground.

Fig. 4 shows the finite element model of the whole frame structure. It consists of 65 nodes and 70 planar frame elements. The weights of the steel blocks are added at the corresponding nodes of the finite element model as concentrated masses. Each node has three DOFs (two translational displacements x, y and a rotational displacement θ), and the system has 195 DOFs in total. The translational and rotational restraints at the supports, which are Nodes 1 and 65, are represented initially by a large stiffness of $3 \times 10^{9}$ N/m and $3 \times 10^{9}$ N·m/rad, respectively. The initial finite element model updating has been conducted to minimize the discrepancies between the analytical finite element model and the experimental model in the laboratory. The detailed model updating process can be found in [32]. This updated finite element model is taken as the baseline model for generating the training, validation and testing data.

3.2. Data generation and pre-processing

Modal analysis is performed by using the baseline model to generate the input and output data to train the proposed framework. The first seven frequencies and the corresponding mode shapes at 14 beam-column joints are obtained. The elemental stiffness parameters are normalized to the range between 0 and 1, where 1 denotes the intact state and 0 denotes the completely damaged state. For example, if the stiffness parameter of a specific element is equal to 0.9, it means a 10% stiffness reduction is introduced in this element. 12,400 data samples are generated from the baseline model, including both single and multiple damage cases. In single element damage cases, the stiffness parameter for each element varies from 1, 0.99, 0.98, … to 0.7 while the rest of the elements are intact. 30 data sets are generated for such scenarios when a local damage is introduced in a specific element. With 70 elements in the finite element model, 2100 single damage cases are simulated. In multiple element damage cases, the stiffness parameters of two or three randomly selected elements out of 70 elements are changed with stiffness reductions randomly defined between 0 and 30%, while the other elements are undamaged. 10,300 multiple damage cases with different damaged elements and patterns are simulated in total. The first seven frequencies and the corresponding mode shapes at 14 beam-column joints are taken as the input, and the pre-defined elemental stiffness reduction parameters are considered as the labelled output. These input and output data are used for the training and


validation of the proposed framework.
To investigate the effectiveness and robustness of the proposed framework for structural damage identification, measurement noise and uncertainty in the finite element modelling are included in the datasets. The following scenarios are defined in the numerical studies:

(1) Scenario 1: No measurement noise and uncertainty. Neither noise in the vibration characteristics nor uncertainty in the finite element modelling is considered;
(2) Scenario 2: Measurement noise effect. White noise is added to the input vectors, specifically 1% noise in the frequencies and 5% in the mode shapes, considering that structural frequencies are usually measured more accurately than mode shapes [33];
(3) Scenario 3: Uncertainty effect. 1% uncertainty is included in the elemental stiffness parameters to simulate the finite element modelling errors;
(4) Scenario 4: Both the measurement noise and uncertainty effect defined in Scenarios 2 and 3 are considered.

Since frequencies and mode shapes of the input feature c r are measured in different scales, they are normalized separately to the range from −1 to +1. This range is chosen to match the active range of the hyperbolic tangent function. Considering that structural damage is usually observed in a small number of elements, a sparse output vector is defined by using 0 for the intact state and 1 for the fully damaged state. The output is also scaled to the range from −1 to +1 to suit the operating range of the linear activation function used in the final output layer. The performance evaluation of the proposed framework based on the pre-processed datasets is described in the following section.

Fig. 4. Finite element model of the steel frame structure.

3.3. Performance evaluation of the proposed framework

Four different scenarios, as described in Section 3.2, are considered in the performance evaluation of the proposed framework against the traditional ANN. It should be noted that the same datasets are used when comparing the performance of the proposed approach and ANN for structural damage identification. For this numerical study, 2 hidden layers with 100 neurons each are used in the dimensionality reduction component, and one hidden layer with 80 neurons is used in the relationship learning component. Thus a deep neural network with 3 hidden layers in total is used in this numerical study. The number of entries in the original input vector is 7 frequencies plus 14 × 7 mode shape values, that is, 105 in total. 70 elemental stiffness parameters are included in the final output vector. The selection of the number of hidden layers and neurons is based on the complexity of the target problem. A deeper neural network would be used for a more complex problem. It is not necessary to use the same number of nodes in each layer, since it is a question of mapping the problem complexity to the model complexity. An overly complex model, with higher complexity than the data, will not necessarily over-fit the data, provided that enough data and strong regularization techniques are in place. Also, as explained above, pre-training helps to regularize the network and place the weights close to optimal regions at the same time. The usual guideline is that, when a set of data is given, data augmentation methods can be used to expand the dataset in a meaningful way. It is then advisable to start from the simplest model and increase the complexity of the model gradually. It is also important to apply regularization techniques while increasing the number of nodes and layers, to prevent the model from over-fitting the data. Hyperbolic tangent and linear functions are employed as the encoder and decoder functions of the autoencoders, respectively, in the pre-training stage. After the pre-training, hyperbolic tangent activation functions are used for all the hidden layers, while the last layer uses a linear function to predict the elemental stiffness reductions accurately.

Table 1
Performance evaluation results for Scenario 1 in the numerical study.

Methods                 MSE       R-Value   Optimization method   Training time (Hours)
ANN                     6.2e−04   0.652     SGD                   5
ANN                     4.1e−04   0.824     SCG                   1
The proposed approach   2.5e−04   0.921     SCG                   1.5

It is noted that the selection of the optimal class of ANN models for a given set of training data has been studied [34,35]. In this paper, in order to perform a fair comparison between the proposed approach and the ANN methods, 3 hidden layers with the same number of neurons in each layer are defined for the ANN model. Hence, in contrast to the proposed approach, no specific layer-wise pre-training is performed on the ANN model. Two commonly used optimization methods, namely Stochastic Gradient Descent (SGD) and Scaled Conjugate Gradient (SCG), are used for training the ANN model. Nearly all deep learning systems are powered by one very important algorithm: stochastic gradient descent (SGD), which is an extension of the gradient descent algorithm [36]. A recurring problem in machine learning is that large training sets are necessary for good generalization, but large training sets are computationally more expensive. The cost function used by a


Fig. 5. Damage identification results of a single damage case from ANN and the proposed approach for Scenario 1.

Fig. 6. Damage identification results of a multiple damage case from ANN and the proposed approach for Scenario 1.

Fig. 7. Damage identification results of another multiple damage case from ANN and the proposed approach for Scenario 1.
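The mini-batch gradient estimate discussed in this section can be illustrated with a minimal sketch. It is not the training code used in the paper: the one-parameter least-squares model, the synthetic data, and the names `minibatch_gradient` and `sgd_fit` are assumptions chosen only to show how an update computed on a few samples stands in for the gradient over the full training set.

```python
import random


def minibatch_gradient(w, batch):
    """Average gradient of the per-example loss 0.5 * (w * x - y) ** 2 over a mini batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def sgd_fit(data, lr=0.1, batch_size=8, steps=500, seed=0):
    """Fit y = w * x by repeatedly following a mini-batch gradient estimate downhill."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        # Draw a small batch uniformly from the training set (without replacement).
        batch = rng.sample(data, batch_size)
        w -= lr * minibatch_gradient(w, batch)
    return w


# Hypothetical noise-free training set generated with a true slope of 2.5.
xs = [i / 100.0 for i in range(-100, 101)]
data = [(x, 2.5 * x) for x in xs]
w_hat = sgd_fit(data)
```

Each update touches only 8 of the 201 samples, yet the estimate approaches the true slope because the mini-batch gradient is an unbiased estimate of the full gradient.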

machine learning algorithm often decomposes as a sum over training examples of some per-example loss function. As the training set size grows to billions of samples, the time to take a single gradient step becomes considerably long. The insight of SGD is that the gradient is an expectation and it may be approximately estimated using a small set of samples. Specifically, on each step of the algorithm, a mini batch of samples can be used, which is typically chosen to be a relatively small number of examples, ranging from 1 to a few hundred drawn uniformly


from the training set. Hence it is plausible to fit a training set with billions of samples using updates computed on only a hundred samples. The estimate of the gradient is formed using examples from the mini batch, and the SGD algorithm then follows the estimated gradient downhill. Since SGD is a first order algorithm, it may suffer in efficiency and accuracy. In contrast, Scaled Conjugate Gradient (SCG) is a supervised learning algorithm that can also be used for training the network, based on conjugate directions [37]. Note that two vectors u and v are said to be conjugate with respect to a matrix A if $u^{T} A v = 0$, and each step is taken in a direction conjugate to all previous steps. This direction is found from the residual and the direction of the previous step. The SCG algorithm finds the search direction and the step size by using information from a second order Taylor expansion of the error function. It is fully automated, includes no critical user-dependent parameters, and avoids a line search per iteration by using a Levenberg-Marquardt approach to scale the step size. Hence it yields a significant speed-up in training and reaches the optimum in a shorter time than the SGD algorithm.
70%, 15% and 15% of the samples, randomly selected from the generated datasets, are used for training, validation and testing, respectively. MSE and R-Value are used to assess the quality of the damage predictions of the networks. All the numerical computations are conducted on a desktop computer with an Intel i7 processor, 16 GB RAM and an NVIDIA GTX 1080 Ti graphics card, using the GPU for parallel computing.

3.3.1. Scenario 1: No measurement noise and uncertainties

In this scenario, the datasets with neither measurement noise nor uncertainty effect are used. The performance of ANN and the proposed framework is compared by examining the MSE values and R-Values on the test datasets.
The performance evaluation results are shown in Table 1. It can be observed that ANN with SGD performs worse than the other two methods while consuming more time for training, which shows the inefficiency of using a first order method for training. In contrast, ANN with SCG, which uses second order information for training the network, consumes less time and performs better than ANN with SGD. A further problem with a backpropagation based training scheme such as SGD is that, if more hidden layers are involved, it might become hard to achieve a satisfactory accuracy [17]. The R-Values obtained from ANN with SGD and SCG are 0.652 and 0.824, respectively. The proposed approach shows a significant improvement in the regression compared with the ANN methods, with a smaller MSE value (2.5e−04) and a better R-Value (0.921). A little more training time is required for the proposed approach than for ANN with SCG because pre-training is not used in the ANN model. To further demonstrate the quality of the damage identification, several single damage and multiple damage identification results are presented below. Since ANN with SCG generally performs better than ANN with SGD in terms of both accuracy and training efficiency, the proposed approach is only compared with ANN with SCG in these comparisons.
The damage identification results of a single damage case from ANN with SCG and the proposed approach are shown in Fig. 5. It can be seen that the proposed approach provides more accurate damage identification than ANN. The damage location is well identified, and the identified stiffness reduction at the damaged element is very close to the actual value. The identified stiffness reduction values at the undamaged elements are very close to zero. The proposed approach is also evaluated against ANN (SCG) with multiple structural damage cases, and the identification results for two types of multiple damage cases are shown in Figs. 6 and 7. It can be seen clearly that the proposed approach works very well in multiple damage cases too. Damage locations are accurately detected and the identified stiffness reductions are very close to the actual values, with very small false identifications. In contrast, ANN does not work very well in identifying multiple damages. Significant errors appear in the predicted damage extents, since there are no well-defined layer-wise objectives for the ANN model.

Table 2
Performance evaluation results for Scenario 2 in the numerical study.

Methods                 MSE       R-Value   Optimization method   Training time (Hours)
ANN                     6.7e−04   0.578     SGD                   5
ANN                     4.9e−04   0.711     SCG                   1.5
The proposed approach   3.7e−04   0.794     SCG                   2

3.3.2. Scenario 2: Measurement noise effect

In this scenario, measurement noise is added into the data, with 1% random noise in the frequencies and 5% in the mode shapes. The accuracies of ANN and the proposed approach for damage identification in this scenario are investigated. The noisy data are used for the training, validation and testing of the neural networks. The performance evaluation of the proposed approach against the ANN model is shown in Table 2. A lower MSE value and a significantly higher R-Value are observed from the proposed approach than from the ANN models. This demonstrates the robustness and effectiveness of the proposed approach when the noise effect is included in the measurements. The damage identification results for two multiple damage cases are shown in Figs. 8 and 9. It can be seen that ANN with SCG may fail to identify the damages effectively, while the proposed approach can reliably and

Fig. 8. Damage identification results of a multiple damage case from ANN and the proposed approach for Scenario 2.


Fig. 9. Damage identification results of another multiple damage case from ANN and the proposed approach for Scenario 2.

Fig. 10. Damage identification results of a minor damage case from ANN and the proposed approach for Scenario 2 with a higher noise level.
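The two metrics reported in the tables of this section can be computed as in the sketch below. It assumes that the R-Value is the Pearson correlation coefficient between the predicted and target stiffness reductions, which the text does not state explicitly; the function names and the small example vectors are hypothetical.

```python
import math


def mse(pred, target):
    """Mean squared error between predicted and target stiffness reductions."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def r_value(pred, target):
    """Pearson correlation coefficient between predictions and targets (assumed R-Value)."""
    n = len(target)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_p * var_t)


# Hypothetical six-element example: true damage of 20% and 10% in two elements.
target = [0.0, 0.0, 0.2, 0.0, 0.1, 0.0]
pred = [0.01, 0.0, 0.18, 0.02, 0.09, 0.0]

example_mse = mse(pred, target)
example_r = r_value(pred, target)
```

A prediction with small false identifications at the undamaged elements, as in this example, gives a small MSE and an R-Value close to 1.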

accurately identify both the locations and magnitudes of structural damages when measurement noise is included in the data. The proposed approach outperforms ANN owing to the use of dimensionality reduction and relationship learning in the framework. A lower MSE value and a higher R-Value are observed from the proposed approach than from the ANN models. The damage identification results of two specific multiple damage cases are shown in Figs. 8 and 9. The identification results show that ANN with SCG may fail to identify the damage effectively, whereas the proposed approach can still reliably identify both the locations and magnitudes of the pre-defined multiple structural damages when noise is included in the data.
To investigate the effect of noise levels on the identification results, a higher noise level, namely 2% in the frequencies and 10% in the mode shapes, is further considered. The identification results of a minor damage case with 5% stiffness reduction in a single element are shown in Fig. 10. With this significant noise effect, ANN fails to identify the introduced damage, whereas the proposed approach still provides satisfactory identification results, even though the pre-set damage severity is as small as 5%.

3.3.3. Scenario 3: Uncertainty effect

Uncertainties inevitably exist in the process of structural damage identification, e.g., in the material properties of the finite element model, and affect the performance of damage identification algorithms. In this scenario, 1% uncertainty is included in the elemental stiffness parameters to simulate the finite element modelling errors. It should be noted that the error in the model class selection is not considered in this study. Datasets are generated with these random modelling errors included in the finite element analysis, and are used for the training, validation and testing of the neural networks. The performance evaluation results are shown in Table 3. The proposed approach outperforms ANN, as indicated by both the MSE and R-Values. The results from both ANN models are affected significantly by the uncertainty effect, as reflected by the corresponding R-Values, which indicate a lower accuracy of the output prediction.

Table 3
Performance evaluation results for Scenario 3 in the numerical study.

Methods                 MSE       R-Value   Optimization method   Training time (Hours)
ANN                     6.5e−04   0.613     SGD                   5
ANN                     4.8e−04   0.729     SCG                   2
The proposed approach   2.9e−04   0.83      SCG                   2.5

Damage identification results of a single and a multiple damage


Fig. 11. Damage identification results of a single damage case from ANN and the proposed approach for Scenario 3.

Fig. 12. Damage identification results of a multiple damage case from ANN and the proposed approach for Scenario 3.

cases randomly selected from the testing datasets are shown in Figs. 11 and 12, respectively. For the single damage case, the damage location is well identified by the proposed approach, while a significant false positive identification is observed in the result from ANN (SCG), as shown in Fig. 11. The performance of ANN (SCG) is clearly affected by the uncertainty effect, while the proposed approach is robust to it. For the multiple damage case shown in Fig. 12, the proposed approach detects the damage locations accurately, and the identified stiffness reductions are also close to the actual values, with very minor false identifications due to the uncertainties. However, ANN (SCG) is not able to produce good detection results for either the locations or the severities of the damages, and a significant false identification is observed in its results. These identification results demonstrate the accuracy and robustness of the proposed approach for structural damage identification under the uncertainty effect.

3.3.4. Scenario 4: Both measurement noise and uncertainty effect

Both the measurement noise and uncertainty effect defined in Scenarios 2 and 3 are considered in this scenario. It is very challenging to achieve effective and reliable structural damage identification when significant measurement noise and uncertainty effect are involved, since they generally affect the damage detection results greatly. The performance evaluation results for this scenario are shown in Table 4. To demonstrate the effectiveness of the proposed approach, damage identification results from a single and a multiple damage case in the testing datasets are shown in Figs. 13 and 14, respectively.

Table 4
Performance evaluation results for Scenario 4 in the numerical study.

Methods                 MSE       R-Value   Optimization method   Training time (Hours)
ANN                     7.3e−04   0.536     SGD                   5
ANN                     5.2e−04   0.693     SCG                   1.5
The proposed approach   3.6e−04   0.732     SCG                   2

As observed in Table 4, the proposed approach once again outperforms the ANN models when both the measurement noise and uncertainty effect are present, as evidenced by a higher R-Value and a lower MSE value. The L2-weight decay constraint applied to the cost function ensures that the network has less room to over-fit the training data. It should be noted that a case with minor damage, i.e. 5%, is selected and shown in Fig. 13. It can be observed that ANN completely fails to identify the single structural damage, while the proposed approach successfully identifies both the location and the severity of the damage. With the multiple damage case as shown in Fig. 14, the proposed


Fig. 13. Damage identification results of a minor damage case from ANN and the proposed approach for Scenario 4.

Fig. 14. Damage identification results of a multiple damage case from ANN and the proposed approach for Scenario 4.
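The perturbations behind Scenarios 2 to 4 can be sketched as follows. The text specifies only the noise and uncertainty levels, so the multiplicative zero-mean Gaussian form used here, the helper `add_relative_noise`, and all numerical values are assumptions for illustration rather than the authors' procedure. In the study the perturbed stiffness parameters would be passed to the finite element modal analysis, which is not shown.

```python
import random


def add_relative_noise(values, level, rng):
    """Perturb each value by zero-mean Gaussian noise with standard deviation level * |value|."""
    return [v * (1.0 + level * rng.gauss(0.0, 1.0)) for v in values]


rng = random.Random(42)

frequencies = [4.6, 13.7, 22.6]      # hypothetical natural frequencies (Hz)
mode_shape = [0.2, 0.5, 0.8, 1.0]    # hypothetical mode shape ordinates
stiffness = [1.0, 0.8, 1.0, 0.9]     # stiffness parameters, 1 = intact

# Scenario 3 ingredient: 1% modelling uncertainty on the stiffness parameters.
uncertain_stiffness = add_relative_noise(stiffness, 0.01, rng)

# Scenario 2 ingredient: 1% noise on frequencies and 5% on mode shapes.
noisy_frequencies = add_relative_noise(frequencies, 0.01, rng)
noisy_mode_shape = add_relative_noise(mode_shape, 0.05, rng)
```

Scenario 4 corresponds to applying both kinds of perturbation to the same sample, and the higher noise case of Scenario 2 to calling the helper with levels of 0.02 and 0.10.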

approach also gives much more accurate stiffness reduction predictions than ANN (SCG) in terms of both the locations and severities, with ANN (SCG) producing several false positives.
Damage identification results from the four scenarios demonstrate clearly the accuracy and robustness of the proposed approach in structural damage identification, compared with the traditional ANN (SCG), even when the measurement noise and uncertainty effect are considered.

4. Experimental verifications

Experimental verifications of the proposed approach for damage identification in a laboratory steel frame model are presented in this section. The experimental setup, network design and training, and damage identification results are presented in detail.

4.1. Experimental model and initial model updating

An eight-storey shear-type steel frame model is fabricated in the laboratory for experimental validation of the proposed approach. Fig. 15 shows the testing steel frame model in the laboratory. The height and width of the frame structure are 2000 mm and 600 mm, respectively. Thick steel bars with dimensions of 100 mm × 25 mm are used as the floors of the frame model, and two flat bars of the same cross section, with a width of 50 mm and a thickness of 5 mm, are used as columns. The beams and columns are welded to form rigid beam-column joints. The bottoms of the two columns are welded onto a thick and solid steel plate, which is fixed to a strong floor. The initial elastic modulus of the steel is estimated as 200 GPa, and the mass density as 7850 kg/m³.
Dynamic tests are conducted to identify the vibration characteristics of the testing frame model. A modal hammer with a rubber tip is used to apply the excitation on the model. Accelerometers are installed at all the floors to measure horizontal acceleration responses under the hammer impact. The sampling rate is set as 1024 Hz, and the cut-off frequency range for the band-pass filter is defined from 1 Hz to 100 Hz for all tests. An initial shear-type finite element model with 8 lumped masses is built based on the dimensions and material properties of the frame. Vibration testing data from the experimental model under the healthy state are used to perform an initial model updating to minimize the difference between the measured and analytical vibration characteristics, i.e. frequencies and mode shapes. A first-order sensitivity based method is employed for the updating [38,39]. Environmental noise and uncertainties are inevitable in such settings. The detailed experimental test setup and model updating procedure can be found in Ref. [40]. The measured and analytical natural frequencies of the experimental model before and after model updating are listed in


Table 5. The maximum error in the frequencies after updating is only 0.28%, at the eighth mode, indicating a very good agreement. The measured and analytical mode shapes of the model are shown in Fig. 16. The mode shapes after model updating match very well with the measured mode shapes from the vibration tests. This well updated finite element model serves as the baseline model in the following studies for generating the training data and validating the performance of the proposed framework in structural damage identification. The following sections present the data generation process based on the baseline finite element model for network training and validation, the architecture design of the autoencoder based framework and ANN, and the investigation of using the vibration characteristics from the damaged laboratory model for damage identification with the proposed approach and ANN. Results from ANN and the proposed approach are compared to demonstrate the performance for a reliable structural damage identification with experimental testing measurements.

Fig. 15. A steel frame model in the laboratory.

Table 5
Measured and analytical natural frequencies of the experimental model before and after updating.

Mode   Measured (Hz)   Before updating                 After updating
                       Analytical (Hz)   Error (%)     Analytical (Hz)   Error (%)
1      4.645           4.810             3.55          4.636             0.19
2      13.705          14.267            4.10          13.714            0.06
3      22.554          23.238            3.03          22.558            0.02
4      30.695          31.418            2.36          30.776            0.26
5      38.241          38.528            0.75          38.225            0.04
6      44.434          44.325            0.25          44.422            0.03
7      48.826          48.614            0.43          48.712            0.23
8      52.306          51.246            2.03          52.161            0.28

4.2. Training data generation

Modal analysis is performed using the baseline model to generate the input and output data to train the networks. The first eight frequencies and the corresponding mode shapes from the eight floors are obtained based on the pre-defined structural stiffness parameters. Similar to Section 3.2, the elemental stiffness parameters are normalized to the range between 0 and 1, where 1 denotes the intact state and 0 denotes the completely damaged state. 25,440 datasets are generated from the baseline model, covering all possibilities for single element and two element damage cases. In single element damage cases, the stiffness parameter for each element varies from 1, 0.99, 0.98, …, to 0.7 while all other elements are kept undamaged. 30 data sets are generated for the scenario when a local damage is introduced in a specific element. With 8 elements in the finite element model, 240 single element damage cases are defined. In multiple element damage cases, the stiffness parameters of two elements vary from 1, 0.99, 0.98, …, to 0.7 while the other elements are kept undamaged. 25,200 multiple element damage cases are defined, which corresponds to the 28 possible element pairs with 30 damage levels for each of the two elements (28 × 30 × 30). Measurement noise is additionally added into the data, with 1% random noise in the frequencies and 5% in the mode shapes, in order to make the model robust to noisy measurements. Adding noise to the training data can improve the robustness and accuracy of the proposed approach when dealing with real testing data. These datasets are processed with the same pre-processing procedure as described in Section 3.2, and then used for training and validation.

4.3. Network structure

A relatively simple autoencoder model is defined here, considering the complexity of the target problem and the number of unknown parameters to be identified. One hidden layer (k = 1) with 36 neurons is designed in the dimensionality reduction component, and a hidden layer with 16 neurons is used in the relationship learning component. The input vector contains 8 frequencies and 8 × 8 mode shape values, that is, 72 values in total. 8 stiffness reduction parameters are involved in the final output vector. For the pre-training, the hyperbolic tangent function is used as the encoder function and the linear function is used as the decoder function in the autoencoder. The hyperbolic tangent function is used as the activation function for all the layers. To have a fair comparison, the same number of hidden layers and neurons are used to form an ANN model, and the same training datasets are used.

4.4. Damage identification results

Damages are introduced by reducing the column cross sections of specific floors of the steel frame model. The flexural stiffness of each floor is proportional to the moment of inertia $bh^3/12$ of the column, where b and h are the width and thickness of the column, respectively. The equivalent stiffness reduction can be obtained from the decrease of the moment of inertia. However, it should be noted that only the stiffness reduction is considered and the mass change is ignored, since the structural damage is mainly related to the stiffness reduction. Two damage cases, namely Case 1 and Case 2, are introduced in the structure. Only a single damage is defined in Case 1, with 20% reduction of the equivalent stiffness of the 2nd floor. Case 2 has multiple damages. Besides the damage in Case 1, another damage is introduced with 10% stiffness reduction in the 7th floor. The introduced damages in the 2nd and 7th floors are shown in Fig. 17. Experimental vibration tests are conducted with the damaged model to identify the structural vibration characteristics, i.e. frequencies and mode shapes, of the above two damage cases.
After training and validating the designed networks, frequencies and mode shapes from the above two damage cases with the additionally added noise are used as the testing input to predict the structural damages and investigate the performance and robustness of using the


Fig. 16. Mode shapes before and after updating.
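The size of the training set for the experimental study can be checked by enumerating the damage cases directly. The sketch below assumes that the 30 damage levels per element are 0.99, 0.98, …, 0.70 (the intact value 1 excluded) and that every pair of the 8 elements is combined with every pair of levels; the generator names are hypothetical and the modal analysis that turns each stiffness vector into frequencies and mode shapes is not shown.

```python
from itertools import combinations

N_ELEMENTS = 8
# Assumed damage levels: stiffness parameters 0.99, 0.98, ..., 0.70 (30 values).
LEVELS = [round(1.0 - 0.01 * k, 2) for k in range(1, 31)]


def single_damage_cases():
    """Yield stiffness vectors with exactly one damaged element."""
    for element in range(N_ELEMENTS):
        for level in LEVELS:
            params = [1.0] * N_ELEMENTS
            params[element] = level
            yield params


def double_damage_cases():
    """Yield stiffness vectors with two damaged elements, over all pairs and levels."""
    for first, second in combinations(range(N_ELEMENTS), 2):
        for level_first in LEVELS:
            for level_second in LEVELS:
                params = [1.0] * N_ELEMENTS
                params[first] = level_first
                params[second] = level_second
                yield params


n_single = sum(1 for _ in single_damage_cases())
n_double = sum(1 for _ in double_damage_cases())
n_total = n_single + n_double
```

Under these assumptions the enumeration reproduces the counts quoted in Section 4.2: 240 single element cases, 25,200 two element cases and 25,440 cases in total.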

proposed approach with real testing measurements for structural damage identification. The performance evaluation results for these two test cases using ANN and the proposed approach are shown in Table 6. It can be observed that the MSE value from the proposed approach is significantly smaller than those from the ANN methods. Besides, the regression from the proposed approach is also improved, as represented by the R-Value. ANN with the SGD training method requires much more training time (3 h), while the same training time (1 h) is required for both the proposed approach and ANN with SCG. Figs. 18 and 19 show the identified structural damages of damage Case 1 and Case 2. Compared with the true introduced damages and the results from the ANN methods, the identified stiffness reductions using the proposed approach are very close to the exact values, with fewer false identifications and smaller false values. This indicates that the proposed approach can well identify the pre-set structural damages in the laboratory model with experimental testing data that include environmental noise and uncertainties.

5. Conclusion

An autoencoder based deep learning framework for structural damage identification is proposed in this paper. It can well perform the pattern recognition between the modal information, such as frequencies and mode shapes, and structural stiffness parameters. Two main components, that is, dimensionality reduction and relationship learning, are included in the proposed framework. The dimensionality reduction


Fig. 17. Introduced damages of the frame model: (a) Introduced damage at the 2nd floor; (b) Introduced damage at the 7th floor.
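The relation between a cross-section cut and the equivalent stiffness reduction described in Section 4.4 can be written out as a short sketch. The column dimensions (50 mm × 5 mm) come from Section 4.1, but the cut geometry is not given in the text, so the width reductions used below are hypothetical values chosen to reproduce the 20% and 10% reductions of the two damage cases.

```python
def moment_of_inertia(b, h):
    """Second moment of area b * h**3 / 12 of a rectangular section (width b, thickness h)."""
    return b * h ** 3 / 12.0


def equivalent_stiffness_reduction(b0, h0, b1, h1):
    """Relative loss of flexural stiffness when the section changes from (b0, h0) to (b1, h1)."""
    return 1.0 - moment_of_inertia(b1, h1) / moment_of_inertia(b0, h0)


# Intact column flat bar: width 50 mm, thickness 5 mm.
# Hypothetical cuts: width reduced to 40 mm and to 45 mm, thickness unchanged.
reduction_case_20 = equivalent_stiffness_reduction(50.0, 5.0, 40.0, 5.0)
reduction_case_10 = equivalent_stiffness_reduction(50.0, 5.0, 45.0, 5.0)
```

Because the moment of inertia is linear in the width and cubic in the thickness, a width cut maps one-to-one onto the stiffness reduction, whereas the same percentage cut in thickness would reduce the stiffness far more.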

Table 6
Performance evaluation results in the experimental study.

Methods                 MSE       R-Value   Optimization method   Training time (Hours)
ANN                     2.1e−04   0.897     SGD                   3
ANN                     9.3e−05   0.989     SCG                   1
The proposed approach   9.1e−06   0.996     SCG                   1

component utilizes an autoencoder model to compress the original input vector to obtain a robust low dimensional feature vector that preserves the necessary information through multiple hidden layers. This not only effectively removes the redundancy in the data but also keeps the most useful information to serve as the input to the relationship learning component. A regression model is defined in the relationship learning component to map the compressed feature vector to the output stiffness reduction parameters. The dimensionality reduction process of the proposed framework can be applied and extended to a deeper network architecture for a more complex problem. L2-weight decay is utilized to enhance the overall training process by limiting the over-fitting tendency while training deep architectures. A layer-wise pre-training is performed to optimize the weights of the individual layers, and the whole network is fine-tuned using a joint optimization towards the final objective function. Numerical and experimental validations on steel frame structures are conducted, and the results demonstrate the improved accuracy and efficiency of the proposed framework compared with the traditional ANN methods. More accurate structural damage identification results can be obtained with regard to both the locations and severities of the damages, even when both the measurement noise and uncertainty effect are considered. The proposed framework is capable of handling a large amount of training data. The layer-wise pre-training and fine-tuning are employed to improve the training efficiency and accuracy. It can also be used for more complex problems with a complicated network structure, for example, high dimensional input data, multiple hidden layers and a large number of

Fig. 18. Damage identification results of Case 1 from ANN and the proposed approach.


Fig. 19. Damage identification results of Case 2 from ANN and the proposed approach.

output parameters. The proposed framework will be extended to utilize other structural vibration characteristics, e.g., flexibility and frequency response function, etc., as the input, in order to increase the sensitivity of the network and improve the performance of structural health monitoring and damage identification for detecting minor damages under various uncertainties and noise effect.

Acknowledgments

The work described in this paper was supported by an Australian Research Council project.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.engstruct.2018.05.109.

References

[1] Brownjohn JMW. Structural health monitoring of civil infrastructure. Philosoph Transact Royal Soc London A: Mathemat, Phys Eng Sci 2007;365(1851):589–622.
[2] Li J, Hao H. A review of recent research advances on structural health monitoring in Western Australia. Struct Monitor Maintenance 2016;3(1):33–49.
[3] Padil KH, Bakhary N, Hao H. The use of a non-probabilistic artificial neural network to consider uncertainties in vibration-based-damage detection. Mech Syst Sig Process 2017;83:194–209.
[4] Hao H, Xia Y. Vibration-based damage detection of structures by genetic algorithm. J Comput Civil Eng ASCE 2002;16(3):222–9.
[5] Ding ZH, Huang M, Lu ZR. Structural damage detection using artificial bee colony algorithm with hybrid search strategy. Swarm Evol Comput 2016;28:1–13.
[6] Ding ZH, Yao RZ, Huang JL, Huang M, Lu ZR. Structural damage detection based on residual force vector and imperialist competitive algorithm. Struct Eng Mech 2017;62(6):709–17.
[7] Yun CB, Yi JH, Bahng EY. Joint damage assessment of framed structures using a neural networks technique. Eng Struct 2001;23(5):425–35.
[8] Lee JJ, Lee JW, Yi JH, Yun CB, Jung HY. Neural networks-based damage detection for bridges considering errors in baseline finite element models. J Sound Vib 2005;280(3):555–78.
[9] Zang C, Imregun M. Structural damage detection using artificial neural networks and measured FRF data reduced via principal component projection. J Sound Vib 2001;242(5):813–27.
[10] Ni YQ, Wang BS, Ko JM. Constructing input vectors to neural networks for structural damage identification. Smart Mater Struct 2002;11(6):825.
[11] Yeung WT, Smith JW. Damage detection in bridges using neural networks for pattern recognition of vibration signatures. Eng Struct 2005;27(5):685–98.
[12] Bakhary N, Hao H, Deeks AJ. Damage detection using artificial neural network with consideration of uncertainties. Eng Struct 2007;29(11):2806–15.
[13] Li J, Dackermann U, Xu YL, Samali B. Damage identification in civil engineering structures utilizing PCA-compressed residual frequency response functions and neural network ensembles. Struct Cont Health Monitor 2011;18(2):207–26.
[14] Bandara RP, Chan TH, Thambiratnam DP. Frequency response function based damage identification using principal component analysis and pattern recognition technique. Eng Struct 2014;66:116–28.
[15] Dackermann U, Smith WA, Randall RB. Damage identification based on response-only measurements using cepstrum analysis and artificial neural networks. Struct Health Monitor 2014;13(4):430–44.
[16] Hochreiter S, Bengio Y, Frasconi P, Schmidhuber J. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In: Kremer SC, Kolen JF, Eds. A Field Guide to Dynamical Recurrent Neural Networks. Wiley-IEEE Press. 2001:1–15.
[17] Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science 2006;313:504–7.
[18] Arel I, Rose DC, Karnowski TP. Deep machine learning-a new frontier in artificial intelligence research [research frontier]. Computat Intelligence Magazine IEEE 2010;5(4):13–8.
[19] Schmidhuber J. Deep learning in neural networks: an overview. Neural Net 2015;61:85–117.
[20] Jia F, Lei Y, Lin J, Zhou X, Lu N. Deep neural networks: a promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data. Mech Syst Sig Process 2016;72:303–15.
[21] Gan M, Wang C. Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings. Mech Syst Sig Process 2016;72:92–104.
[22] Abdeljaber O, Avci O, Kiranyaz S, Gabbouj M, Inman DJ. Real-time vibration-based structural damage detection using one-dimensional convolutional neural networks. J Sound Vib 2017;388:154–70.
[23] Cha YJ, Choi W, Büyüköztürk O. Deep learning-based crack damage detection using convolutional neural networks. Comput-Aided Civ Infrastruct Eng 2017;32(5):361–78.
[24] Pathirage CSN, Li J, Li L, Hao H, Liu W. Deep autoencoder models for pattern recognition in civil structural health monitoring. World Congress on Engineering Asset Management (WCEAM 2016). Jiuzhaigou, Sichuan, China 2016.
[25] Bengio Y, Lecun Y. Scaling learning algorithms towards AI. In: Bottou L, Chapelle O, DeCoste D, Weston J, Eds. Large-scale kernel machines. MIT Press. 2007.
[26] Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA. Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 2010;11:3371–408.
[27] Kan M, Shan S, Chang H, Chen X. Stacked progressive auto-encoders (SPAE) for face recognition across poses. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014:1883–1890.
[28] Pathirage CSN, Li L, Liu W, Zhang M. Stacked face de-noising auto encoders for expression-robust face recognition. In: 2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA), IEEE, 2015:1–8.
[29] Bengio Y. Learning deep architectures for AI. Foundations Trends® Mach Learn 2009;2(1):1–127.
[30] Bengio Y, Lamblin P, Popovici D, Larochelle H. Greedy layer-wise training of deep networks. NIPS'06 Proceedings of the 19th International Conference on Neural Information Processing Systems. 2006:153–160.
[31] Bengio Y. Practical recommendations for gradient-based training of deep architectures. In: Neural Networks: Tricks of the Trade, Springer, 2012:437–478.
[32] Li J, Hao H. Substructure damage identification based on wavelet-domain response reconstruction. Struct Health Monitor 2014;13(4):389–405.
[33] Xia Y, Hao H, Brownjohn JMW, Xia PQ. Damage identification of structures with uncertain frequency and mode shape data. Earthquake Eng Struct Dyn 2002;31(5):1053–66.
[34] Yuen K-V, Lam H-F. On the complexity of artificial neural networks for smart structures monitoring. Eng Struct 2006;28(7):977–84.
[35] Lam H-F, Yuen K-V, Beck JL. Structural health monitoring via measured Ritz vectors utilizing artificial neural networks. Comput-Aided Civ Infrastruct Eng 2006;21(4):232–41.
[36] Bottou L. Stochastic gradient learning in neural networks. Proceedings Neuro-Nîmes 1991;91:1–12.


[37] Møller MF. A scaled conjugate gradient algorithm for fast supervised learning. Neural Net 1993;6(4):525–33.
[38] Friswell M, Mottershead JE. Finite element model updating in structural dynamics. Springer Science & Business Media, 1995.
[39] Lu ZR, Wang L. An enhanced response sensitivity approach for structural damage identification: convergence and performance. Int J Numer Meth Eng 2017;111:1231–51.
[40] Ni P, Xia Y, Li J, Hao H. Improved decentralized structural identification with output-only measurements. Measurement 2018;122:597–610.
