

Convolutional Neural Network Approach for Robust Structural Damage Detection and Localization


Nur Sila Gulgec, S.M.ASCE 1; Martin Takáč 2; and Shamim N. Pakzad, A.M.ASCE 3

Abstract: Damage diagnosis has been a challenging inverse problem in structural health monitoring. The main difficulty is characterizing the unknown relation between the measurements and damage patterns (i.e., damage indicator selection). Such damage indicators would ideally be able to identify the existence, location, and severity of damage. Therefore, this procedure requires complex data processing algorithms and dense sensor arrays, which brings computational intensity with it. To address this limitation, this paper introduces the convolutional neural network (CNN), which is one of the major breakthroughs in image recognition, to the damage detection and localization problem. The CNN technique has the ability to discover abstract features and complex classifier boundaries that are able to distinguish various attributes of the problem. In this paper, a CNN topology was designed to classify simulated damaged and healthy cases and localize the damage when it exists. The performance of the proposed technique was evaluated through the finite-element simulations of undamaged and damaged structural connections. The network was trained on strain distributions produced by various loads with several different crack scenarios. Completely new damage setups were introduced to the model during the testing process. Based on the findings of the proposed study, the damage diagnosis and localization were achieved with high accuracy, robustness, and computational efficiency. DOI: 10.1061/(ASCE)CP.1943-5487.0000820. © 2019 American Society of Civil Engineers.

Downloaded from ascelibrary.org by Iowa State University on 02/03/19. Copyright ASCE. For personal use only; all rights reserved.

1 Graduate Student, Dept. of Civil and Environmental Engineering, Lehigh Univ., 117 ATLSS Dr., Imbt Labs, Bethlehem, PA 18015 (corresponding author). Email: [email protected]
2 Assistant Professor, Dept. of Civil and Environmental Engineering, Lehigh Univ., 117 ATLSS Dr., Imbt Labs, Bethlehem, PA 18015.
3 Associate Professor, Dept. of Industrial and Systems Engineering, Lehigh Univ., 200 West Packer Ave., Harold S. Mohler Laboratory, Bethlehem, PA 18015.
Note. This manuscript was submitted on March 11, 2018; approved on September 13, 2018; published online on January 30, 2019. Discussion period open until June 30, 2019; separate discussions must be submitted for individual papers. This paper is part of the Journal of Computing in Civil Engineering, © ASCE, ISSN 0887-3801.

Introduction

Structural systems are subjected to damage and deterioration during their service life due to environmental and operational factors. Providing timely damage evaluation becomes important to ensure lifetime safety of these structures (Fang et al. 2005). For this reason, significant research has been conducted in structural health monitoring (SHM), which is a process of diagnosing the deficiencies affecting the performance of structures (Farrar and Worden 2007). Data-driven SHM processes need large quantities of data containing detailed condition information over an extended period of time (Shahidi et al. 2016). As the temporal and spatial resolution of monitoring data is drastically increased by advances in sensing technology and with the adaptation of new data collection techniques, SHM applications reach the thresholds of big data (Gulgec et al. 2017a, 2016).

Traditional damage identification methods mostly adopt time series or frequency analysis, in conjunction with pattern classification techniques (Gul and Catbas 2009). Many studies focus on extracting patterns from observations and making decisions based on the obtained patterns (Sohn and Farrar 2001; Nair et al. 2006; Yao and Pakzad 2012; Fujimaki et al. 2005). The pattern recognition technique consists of two processes, feature selection and feature classification, which require manual effort and expert knowledge.

Such methods are often efficient in identifying structural damage of a particular type that is closely tied to a mechanical model of the behavior of the structural systems and components, which constrains these methods in two aspects: (1) the methods are limited in their scope, depending on the feature that they use for damage identification; and (2) they are often overwhelmed by big data when damage features are computationally complex. For example, Yao et al. (2016) presented a damage identification method using cross-correlation of strain data in a steel gusset plate, which demonstrates the wealth of information in data from dense sensing systems, but at the same time shows the difficulty in dealing with large data sets and the limitation of the identification methods based on selected features.

The main challenge of the damage identification originates from defining the unknown relation between the measurements and damage patterns. In order to solve such poorly defined problems, biologically inspired soft-computing techniques have gained traction (Mehrjoo et al. 2008). The most widely used soft-computing method, called neural networks, was proposed in the 1940s (Flood and Kartam 1994) and is designed such that it can learn from data without a need of feature design process. Since then, it has been practiced in many disciplines including SHM to diagnose damage from the measurement data or its features (Shi and Yu 2012). These studies employed several different inputs to feed the neural network such as modal analysis of vibration response (Zapico et al. 2003; Hadzima-Nyarko et al. 2011; Lee et al. 2005); statistical parameters of vibration (Shu et al. 2013) and strain data (Alavi et al. 2016); frequency response functions (FRFs) (Fang et al. 2005); and wavelet transform coefficients of the acceleration data (Shi and Yu 2012). Nevertheless, most of the prior work still used damage indicators as inputs to the neural networks via preprocessing instead of learning directly from data.

Although neural network applications are promising, they showed that more complex network architectures are needed to achieve their full potential (Flood 2008). This idea became practical

© ASCE 04019005-1 J. Comput. Civ. Eng.

J. Comput. Civ. Eng., 2019, 33(3): 04019005


with the improvements in computing power and the introduction of large representative training data sets (Gu et al. 2015). Exploiting the opportunities hidden in big data, deep neural networks (DNNs) (or deep learning) started to gain popularity and soon became the state-of-the-art technique for image, speech, and video recognition. Yet, there are only a few studies using breakthrough deep learning techniques in the SHM field. Abdeljaber et al. (2017) used a one-dimensional convolutional neural network (CNN) to extract damage features from raw acceleration data and Cha et al. (2017) used raw images taken from the structure to perform deep learning–based detection of visible cracks only.

The previous studies adopted trial-and-error search for tuning the hyperparameters in neural network architecture and did not consider the noise sensitivity of the measurement data and robustness of the network architecture, which may cause the fundamental problem of overfitting [i.e., a neural network can fit even random noise when the network is not designed carefully (Zhang et al. 2016)]. This paper addresses these limitations by proposing an optimized two-dimensional CNN-based approach to detect and localize cracks in a noise-tolerant way. The approach feeds the network by using raw strain field measurements, which are a direct indicator of stress, fatigue, and failure and can be obtained by an optic-based technique called digital image correlation (DIC) (Pan et al. 2009).

In Gulgec et al. (2017b), damage diagnosis was performed by using CNN fed through the strain distributions of a structural connection. In this paper, this idea is expanded by performing a CNN-based methodology for both damage identification and localization with comprehensive noise sensitivity analysis. The proposed methodology shares the front-end layers of a deep convolutional network for both identification and localization tasks. Then, customized back-end layers are constructed that are specialized for both tasks. Automatically extracted features in the front-end layers are meaningful for both tasks, hence sharing these layers eliminates the need for two completely separate networks. This reduces the total training time and computational resources.

This methodology learns sophisticated damage features and complex classifier boundaries without extracting hand-designed damage features as is done in traditional methods. The network architecture accomplishes accurate damage diagnosis even from the unseen damage scenarios since the network is trained with a variety of loading cases, damage scenarios, and measurement noise levels. Additionally, the paper presents a comprehensive sensitivity analysis to better understand the behavior of CNN architecture subjected to uncertainties and calibrate it to achieve robust results. Finally, this approach makes real-time damage identification possible, thanks to (1) front-end layer sharing, (2) CNN’s shared parameterization, and (3) parallel architecture of graphics processing units (GPUs).

The rest of the paper is organized as follows. First, a review of relevant studies and a brief overview of CNNs is provided in the “Introduction” and “Background on Deep Learning”; then the proposed methodology is described in “Proposed Methodology.” The performance and robustness of the proposed approach are evaluated by numerical validation in “Numerical Validation” and “Results and Discussion.” Conclusions and future directions are given in the “Conclusion.”

Background on Deep Learning

Deep Neural Networks

Machine learning (ML) has gradually evolved from pattern recognition and learning theory in artificial intelligence (Alpaydin 2014). In 1959, Arthur Samuel (1959) defined machine learning as a “field of study that gives computers the ability to learn without being explicitly programmed” (Simon 2013). ML algorithms are designed such that they can learn from data. During this learning process, they build a model that is then used to make data-driven predictions or decisions.

DNNs are a subfield of machine learning that are conceptually motivated by the human brain. DNNs aim to build a model using a deep graph formed in multiple linear layers followed by nonlinear transformations (LeCun et al. 2015). Fig. 1 shows an example four-layer DNN that consists of an input layer, two hidden layers, and an output layer. The architecture operates on the input instance $x = (x_1, \ldots, x_p)^T$ to get the output of the network. In Fig. 1, each circle represents a neuron and an arrow illustrates a connection from the output of one neuron to the input of another neuron. Each arrow has an associated weight parameter that indicates the significance of the respective inputs to the output. The output of the neuron in a hidden layer can be determined by the weighted sum of the inputs activated by a nonlinear mapping (e.g., sigmoid, tanh, or others).

Fig. 1. DNN with two hidden layers.

For a given input $x \in \mathbb{R}^p$, ML algorithms try to build a prediction function $\theta(x; w)$ parametrized by weights, $w$. The simplest case of this function can be considered as linear, i.e., $\theta(x; w) = x^T w$. After the family of prediction functions is set, a loss function is selected to measure the error between a prediction and the true value. The most elementary loss function can be denoted as $l(\theta(x; w), y) = \|\theta(x; w) - y\|^2$, where $y \in \mathbb{R}^c$ is the true observed value (i.e., label) of the input query $x$. Softmax loss entropy and cross-entropy functions can be the other common examples of loss functions (Bishop 2006).

The learning problem seeks the best possible instance of the prediction function from the selected family; in other words, it boils down to finding the best possible values of the weights $w$ to minimize the loss function. Mathematically speaking, the optimization problem can be defined as follows (Shalev-Shwartz and Ben-David 2014):

$$\min_{w} \; \mathbb{E}_{(X,Y)}\left[ l(\theta(x; w), y) \right] \tag{1}$$

where the expectation is taken over the true distribution of inputs and labels $(X, Y)$. Nevertheless, the exact knowledge about the true distribution is almost never available in practice. The common practice is to sample $n$ data points $\{(x_i, y_i)\}_{i=1}^{n}$ (frequently called training data) from the unknown distribution, and minimize the empirical loss instead

$$\min_{w} \; \frac{1}{n} \sum_{i=1}^{n} l(\theta(x_i; w), y_i) \tag{2}$$
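As a minimal sketch of Eq. (2), assuming the linear predictor $\theta(x; w) = x^T w$ and the squared $l_2$ loss named above, the empirical loss can be evaluated and minimized with NumPy. The synthetic data, the helper names `predict` and `empirical_loss`, and the closed-form least-squares solve are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def predict(X, w):
    """Linear prediction theta(x; w) = x^T w for every row of X."""
    return X @ w

def empirical_loss(X, y, w):
    """Mean squared l2 loss over the n training points, as in Eq. (2)."""
    residual = predict(X, w) - y
    return float(np.mean(residual ** 2))

# Hypothetical training data: n = 200 points in R^3 with scalar labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)

# For a linear model the minimizer of Eq. (2) has a closed form (least squares).
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

loss_zero = empirical_loss(X, y, np.zeros(3))  # loss at an uninformed starting point
loss_hat = empirical_loss(X, y, w_hat)         # loss at the fitted weights
```

For deep networks no closed form exists, which is why iterative methods such as stochastic gradient descent are used to reduce the same empirical loss.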



Convolutional Neural Networks

CNNs are one of the most widely used types of deep neural networks. The framework of CNN was first proposed by LeCun et al. (1998) to classify handwritten digits. CNN became a breakthrough in visual and speech recognition in the last few years with the introduction of a highly parallel programmable unit called GPUs and large-scale hierarchical image database (Deng et al. 2009). CNN architectures kept evolving (Krizhevsky et al. 2012; Simonyan and Zisserman 2014; Zeiler and Fergus 2014) through the years and the performance improved significantly as the networks become more complex and deeper (Szegedy et al. 2015; He et al. 2015). The reason behind such achievement was the ability to keep temporal features of the input and reduce memory requirements by using fewer parameters (LeCun and Bengio 1995).

Convolutional neural networks are composed of three architectural frameworks: local receptive fields, shared weights, and spatial subsampling (LeCun et al. 1998). Passing the same set of units all over the input allows extracting multiple feature maps. In this case, the feature map shifts the same amount the input shifts. This is called local receptive fields, which makes CNN robust to the translation and distortion of the input. Furthermore, the weights and biases are shared through the feature maps. This characteristic reduces the learned parameters as well as the memory demands. Finally, spatial subsampling helps reduce the resolution of the feature maps and prevent the sensitivity of the outputs under shifts and rotations.

CNNs receive the input as three-dimensional (3D) volumes (width, height, depth). As an example from image recognition, the depth of a colored image (i.e., having red-green-blue color channels) is three, whereas the depth of a gray image is one. These 3D input volumes feed the CNN architecture, which can be constructed by using three types of layers:
1. Convolutional (CONV) layer parameters are learnable filters in which each filter (weights or kernels) has spatially small width and height shared in the full depth of the input. While sliding these weights, the CONV layer computes the dot product between these filters and the small region of the input in any position. Then the weighted sum of the input and weights is activated by the nonlinear functions to form feature maps. This operation is called convolution. The size of the feature map is associated with a variety of hyperparameters such as the number of kernels, kernel size, number of strides, and zero padding. The nonlinear activation maps are generated based on the number of kernels used. The number of strides determines the number of instances skipped in each position, whereas zero padding controls the number of zeros added to the borders of the input. Fig. 2 shows a convolution operation for an input size ($I$) of 7 × 11 × 2. In the figure, a kernel size ($K$) of 2 × 2 × 2 passes through the input with the stride ($S$) of 3 and no zero padding ($P$). The output size is found by the equation $(I - K + 2P)/S + 1$.
2. Pooling (POOL) layer performs a downsampling operation in the feature maps using maximum, average, or sum operations. The pooling layer gets the input and resizes it to reduce the number of parameters and control the overfitting. Similarly, the output size is controlled by different hyperparameters such as pool size and the number of strides.
3. Fully connected (FC) layers operate on the stacked convolutional or pooling layer outputs and compute the weighted sum of inputs with a nonlinear mapping as described in the overview of DNNs.

Proposed Methodology

Overview

This section gives a general map of the proposed technique. As shown in Fig. 3, the methodology consists of training and testing phases. The training phase operates on raw strain fields from structures. After normalizing each strain field by its absolute maximum, the search mechanism finds a good set of hyperparameters that improves the performance of the network architecture. Then the selected architecture is trained to minimize the error between predictions and true labels.

The training phase consists of two tasks: detection and localization. The detection task determines the existence of damage where it is treated as a classification problem (i.e., 0 for undamaged and 1 for damaged). The localization task treats the case as a regression problem where the goal is accurate estimation of the boundaries of the damaged area. In the proposed methodology, both of the tasks use shared layers in the early stages of the deep learning pipeline. These layers are specialized to extract local features that are common for both localization and detection. Then these early layers are fed into task-specific layers. Shared front-end layers avoid having two separate networks, provide more efficient learning, and have shorter training time and lower computation cost.

Fig. 2. Example of convolution operation.
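The output-size relation $(I - K + 2P)/S + 1$ quoted for the CONV layer can be checked with a short helper. This is a sketch only: the function name `conv_output_size` is hypothetical, and the use of floor division when the stride does not divide evenly (as happens along the 7-cell axis of the Fig. 2 setting) is an assumption, since the text does not say how that case is handled.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one spatial axis: (I - K + 2P) // S + 1."""
    if kernel_size > input_size + 2 * padding:
        raise ValueError("kernel does not fit inside the padded input")
    # Assumption: positions where the kernel would overhang the border are dropped.
    return (input_size - kernel_size + 2 * padding) // stride + 1

# Fig. 2 setting: 7 x 11 input, 2 x 2 kernel, stride 3, no zero padding.
fig2_output = (conv_output_size(7, 2, stride=3), conv_output_size(11, 2, stride=3))

# A 3 x 3 kernel with stride 1 and no padding trims one cell from each border.
trimmed = (conv_output_size(28, 3), conv_output_size(56, 3))
```

Under the floor-division assumption, the Fig. 2 setting yields a 2 × 4 map per kernel, and a 28 × 56 input passed through a 3 × 3 kernel yields 26 × 54.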




Fig. 3. Overview of the proposed methodology.
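The preprocessing step mentioned in the overview, normalizing each strain field by its absolute maximum, might look as follows. This is a sketch under stated assumptions: the function name `normalize_by_abs_max`, the handling of an all-zero field, and the randomly generated 28 × 56 array are illustrative and not taken from the paper.

```python
import numpy as np

def normalize_by_abs_max(strain_field):
    """Scale a strain field so that its largest magnitude becomes 1."""
    field = np.asarray(strain_field, dtype=float)
    peak = np.max(np.abs(field))
    if peak == 0.0:
        # Assumption: an all-zero field is returned unchanged to avoid 0/0.
        return field.copy()
    return field / peak

# Hypothetical strain field; the values are made up for illustration.
rng = np.random.default_rng(1)
raw = 1e-3 * rng.normal(size=(28, 56))
scaled = normalize_by_abs_max(raw)
```

Dividing by the absolute maximum keeps the sign of every strain value (tension versus compression) while mapping all samples to the range [-1, 1], regardless of the load magnitude that produced them.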

The trained model parameters are stored to be used in the testing phase. In this phase, raw strain fields are fed into the CNN architecture to predict the labels for detection and localization tasks.

Hyperparameter Selection

The CNN architectures can be built in various ways by using the sequence of CONV, POOL, and FC layers. The performance of the neural networks critically depends on identifying a good set of hyperparameters (Pei et al. 2004). In this study, these hyperparameters include learning rate, the number of CONV and FC layers, the number of kernels, kernel and pool sizes, and the number of hidden layer sizes. In order to find the structure with good configuration, the hyperparameter search mechanism was implemented for both damage detection and localization tasks (Li et al. 2016).

Different networks were constructed with randomly selected hyperparameters. The 10% of the networks that had the worst validation score were removed after the first run, and the remaining networks were run for another set of an epoch. The runs were repeated until the best 10 networks remained in the pool. After the best network was selected for the damage identification part, the output of the last convolutional layer was stored and used as an input for the hyperparameter search for the localization task. The search for this task was performed on FC layers only.

Training

The training process was comprised of two phases: feedforward and back-propagation (Rojas 2013). The feedforward process evaluated the prediction function for given input instances. Then the back-propagation step adjusted the weights in proportion to their contributions to the total error (Rumelhart et al. 1988) by using a stochastic gradient descent (SGD) algorithm (Robbins and Monro 1951). After the gradients were calculated with SGD, the detection and localization parameters were updated with the learning rates $\eta_{det}$ and $\eta_{loc}$, respectively. Overfitting was prevented by monitoring the validation data set performance in every complete forward and backward pass (epoch). When architecture performance was improved sufficiently on the validation data set, the training process was stopped.

Weight Initialization

The first step of training was initializing the weights to control input instances in a reasonable range along the layers. This study adopted Xavier initialization for the tanh function (Glorot and Bengio 2010). Weight initialization of the $i$th layer was set to have a uniform distribution in the interval $\left[-\sqrt{6/(n_{i-1} + n_i)},\ \sqrt{6/(n_{i-1} + n_i)}\right]$, where $n_{i-1}$ and $n_i$ are the number of units in the $(i-1)$th and $i$th layers.

Prediction Functions

The feedforward step evaluated different prediction functions for detection and localization tasks. This study employed the softmax classifier (Bishop 2006) to predict the label of the detection output ($y_{pred}$), which is either healthy or damaged. The class $i$ of the input $x$ was estimated by selecting the maximum probability of the softmax function defined as follows:

$$[\mathrm{softmax}(\theta(x; w))]_i = \frac{e^{[\theta(x; w)]_i}}{\sum_j e^{[\theta(x; w)]_j}} \tag{3}$$

$$y_{pred} = \operatorname{argmax}_i \left( [\mathrm{softmax}(\theta(x; w))]_i \right) \tag{4}$$

The localization task aimed to predict the location of the crack, which is defined by a bounding box vector $z_{pred}$. For this reason, this task used a regressor instead of a classifier. The bounding box $i$ of the input $x$ was estimated by the following function:

$$[z_{pred}]_i = \sum_j [\theta(x; w)]_j \tag{5}$$

Loss Functions

The proposed model adopted two separate loss functions for the detection and localization tasks. The diagnosis part employed the negative log-likelihood function, where optimal architecture parameters $\theta$ were learned by maximizing the likelihood of the data set. On the other hand, the localization task calculated the loss between the predicted and true bounding box with the $l_2$ loss function



$$L = \sum_{i=1}^{N} \frac{\left([z_{pred}]_i - z_i\right)^2}{N} \tag{6}$$

where $z$ = true bounding box (with $z_{pred}$ its prediction); and $N$ = batch size (i.e., number of training samples in one feedforward pass). During the regressor training, the bounds of the boxes were updated based on the conditions of Eqs. (7)–(10). If these criteria were satisfied with the predefined threshold values, the localization was marked as correct localization

$$|\min(a_1, a_2) - \min(\hat{a}_1, \hat{a}_2)| \le thr_a \tag{7}$$

$$|\min(b_1, b_2) - \min(\hat{b}_1, \hat{b}_2)| \le thr_b \tag{8}$$

$$|\max(\hat{a}_1, \hat{a}_2) - \max(a_1, a_2)| \le thr_a \tag{9}$$

$$|\max(\hat{b}_1, \hat{b}_2) - \max(b_1, b_2)| \le thr_b \tag{10}$$

where $(\hat{a}_1, \hat{a}_2, \hat{b}_1, \hat{b}_2)$ = predicted box coordinates; $(a_1, a_2, b_1, b_2)$ = true box coordinates; and $thr$ is the user-defined threshold.

Numerical Validation

Data Preparation

The damage identification process requires a large training set of correctly classified damage states (Elkordy et al. 1993). In this study, well-known damage states of a structural connection were simulated by using Abaqus shell elements. The modeled connection consisted of two C8 × 11.5 channels welded to a steel plate with dimensions of 71 × 36 × 0.6 cm (28 × 14 × 1/4 in.) as visualized in Fig. 4. Each channel member was 51 cm long and had 20-cm overlap with the main gusset plate. The finite-element model adopted the mesh size of 1.3 cm (0.5 in.). The behavior of the steel was introduced as elastic–perfectly plastic material with a yield strength of 250 MPa. As can be seen from Fig. 4, stress gradients occurred at the crack tips as well as the central part of the plate when it becomes a plastic region. Strain distribution in the direction of loading is represented by 28 × 56 × 1 tensors and was used to feed the CNN architecture after being normalized by its absolute maximum value.

Training, validation, and test data sets were formed by modeling different loading cases, damage scenarios, and noise levels. The load was selected from uniformly distributed load ∼U[−445 kN (compression), 534 kN (tension)] and applied to the end of the channel members. The damage in the gusset plate was simulated as 2.5-cm-long cracks, which are the smallest crack size given the mesh size. The crack locations were chosen at the beginning of each run with a specified load level. The coordinates of the cracks changing between the two corners of the middle part of the plate [lower left corner point A with coordinates (21.6, 2.5) to upper right corner point B with coordinates (45.9, 33.0)] are shown in Fig. 5. In order to assess the approach with completely unseen damaged samples, none of the coordinates of the training set was used in the testing samples.


Fig. 4. Setup of the (a) healthy, and (b) single-damaged gusset plate; and (c) material behavior, and (d) inelastic behavior of the plate.
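Returning to the training steps described under "Weight Initialization" and "Prediction Functions," the uniform Xavier interval and the softmax prediction of Eqs. (3)–(4) can be illustrated with NumPy. This is a sketch, not the authors' Theano code: the function names, the layer sizes, and the example scores are assumptions chosen for illustration, and subtracting the row maximum before exponentiating is a standard numerical-stability step that does not change the result of Eq. (3).

```python
import numpy as np

def xavier_uniform(n_in, n_out, rng):
    """Sample an (n_in, n_out) weight matrix from U[-b, b], b = sqrt(6 / (n_in + n_out))."""
    bound = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-bound, bound, size=(n_in, n_out))

def softmax(scores):
    """Eq. (3): exponentiate and normalize each row of class scores."""
    shifted = scores - np.max(scores, axis=-1, keepdims=True)  # stability shift
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores, axis=-1, keepdims=True)

def predict_label(scores):
    """Eq. (4): index of the most probable class."""
    return np.argmax(softmax(scores), axis=-1)

rng = np.random.default_rng(2)
W = xavier_uniform(128, 64, rng)  # hypothetical layer sizes

# Hypothetical class scores for three samples (column 0 = healthy, 1 = damaged).
scores = np.array([[2.0, -1.0], [0.1, 0.3], [-4.0, 4.0]])
probs = softmax(scores)
labels = predict_label(scores)
```

Because the softmax is monotonic in the scores, the argmax of the probabilities equals the argmax of the raw scores; the probabilities matter mainly for the negative log-likelihood loss used during training.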



Fig. 5. Damage locations for training and test data sets.

The uncertainty in the measurement process was simulated as an additive Gaussian noise $\sim N(0, \sigma^2)$, where $\sigma$ is the standard deviation of the measurement noise. Different noise levels (i.e., the ratio between the standard deviation of measurement noise to actual strain values) were generated to compute the influence of the noise on CNN architecture performance.

The crack coordinates of the single-damaged samples were also collected for the localization task. Crack location was stored as bounding box $(a_1, b_1, a_2, b_2)$, where $b_1$ and $b_2$ indicate the coordinates of the tips of the crack. While defining $a_1$ and $a_2$, 1.3 cm was subtracted and added to the x-coordinate of the crack to reduce the rounding error in the direction of loading; for example, if a crack is located between (21.6, 2.5) and (21.6, 5.0), the bounding box is defined as [20.3, 2.5, 22.9, 5.0]. For the healthy samples, bounding box was set to [0, 0, 0, 0].

While preparing damaged samples, 72 different crack locations and 3,000 loading scenarios were used. None of the coordinates of training sets was used in the testing samples (i.e., 36 locations for training, 36 locations for testing as shown in Fig. 5). Healthy samples were modeled with 6,000 loading scenarios. As a result, a total of 6,000 healthy and 6,000 damaged samples were generated. Then four different noise levels (2%, 5%, 10%, and 15%) were added to noise-free samples to produce a total 30,000 healthy and 30,000 damaged samples. This data set is called Data Set 1 and distributed to training, validation, and testing samples.

Hyperparameters

A total of 50 networks were constructed with randomly selected hyperparameters in the detection task. The hyperparameter range for the detection task had the following characteristics: learning rate [2 to $2^{-8}$]; the number of CONV and FC layers [1, 2, or 3]; the number of kernels [2 to $2^{7}$]; kernel size [(3 × 3) or (5 × 5)] with stride of 1 and without zero padding; maximum pool size [(1 × 1) with stride of 1] or [(2 × 2) with stride of 2]; and randomly selected hidden layer sizes.

The last convolutional layer of the best architecture in the detection task was stored as an input for the hyperparameter search for the localization task. The search for localization task was performed on a total of 70 networks with FC layers only. The networks for the localization task were built with hyperparameters using learning rate [$2^{-6}$ to $2^{-18}$]; the number of FC layers [1, 2, or 3]; and randomly selected hidden layer sizes. Activation function tanh() was adopted for the activation of the layers for both detection and localization tasks.

Training and Proposed Architecture

Training was implemented by using a Python library called Theano to optimize the mathematical expressions consisting of multidimensional arrays (Theano Development Team 2016). Higher performance was achieved by using NVIDIA (Holmdel, New Jersey) Tesla K80 GPUs, which enabled parallelism for data-intensive calculations.

In this study, a minibatch SGD algorithm with a batch size of $N = 64$ was implemented. Identical thresholds were used for $thr_a$ and $thr_b$ described through Eqs. (7)–(10). In order to discover the effect of the size of the search area on the localization accuracy, the sides of the bounding box were increased in length by different threshold values. The threshold values 1.3, 2.6, and 5.1 cm were selected to have an increase in length by scale factors of 2, 3, and 5, respectively. The scale coefficients were selected randomly but not to exceed a quarter of the area of the central part of the plate. Thresholds are illustrated in Fig. 6 with values of thr = 1.3 cm, thr = 2.5 cm, and thr = 5.1 cm.

Fig. 6. Threshold values adopted for the localization task.

Fig. 7 shows the proposed architecture as a result of the hyperparameter search mechanism. The network consisted of three convolutional layers followed by two separate fully connected layers for detection and localization tasks. The detection part classified 28 × 56 × 1 inputs as healthy or damaged, whereas the localization part predicted the bounding box of the crack area. The convolutional layers received the input layer and passed them through a filter size of (3 × 3). As a result of these CONV layers, the network formed 8, 16, and 32 feature maps. The max-pooling operation was implemented right after the first and second convolution layer. A max-pool size of (2 × 2) with a stride of 2 was




Fig. 7. Proposed CNN architecture.

used for POOL layers. The feature maps of the last convolutional layer were stacked together in an array and given as an input to the fully connected layers with a hidden layer size of [836, 767] for the detection task and [2058, 881, 534] for the localization task. The learning rate of $\eta_{det} = 0.0451$ and $\eta_{loc} = 0.0026$ were used for the detection and localization parts, respectively.

As mentioned previously, CNNs have the ability to keep spatial features of inputs. In order to visualize this ability, the activated feature maps after POOL-1, POOL-2, and CONV-3 layers of a correctly identified damaged sample are shown in Fig. 7. The activations were normalized to have the scale between 0 and 1, where white represents 0 and black represents 1. The figure shows that the damage location (i.e., right top corner) is still visible during Stages 1 and 2. After the CONV-3 layer (Stage 3), the features become abstract where it is almost impossible to design it by hand.

Results and Discussion

The performance and sensitivity analysis of the proposed methodology is evaluated in this section. The accuracy and robustness of the CNN architecture are discussed for both detection and localization tasks.

Detection Task

This section presents the performance and sensitivity analysis of the detection task. In order to measure the effect of noise, two additional data sets were prepared with both consisting of 6,000 undamaged and 6,000 damaged samples. Data Set 2 was formed by only noise-free samples and Data Set 3 was selected from a subset of Data Set 1. Hyperparameter search was performed for these two data sets for fair comparison. Trained models were then tested with samples including a variation of different noise levels (0% or noise-free, 2%, 4%, 6%, 8%, 10%, 12%, 14%, and 16%) 100 times. Proposed network topologies for two additional training processes are listed as follows:
• Training of Data Set 2: The network is trained with Data Set 2, which consists of only noise-free samples. The proposed network for the second case is composed of two CONV layers followed by POOL layers, and two FC layers. The CONV layer adopts the filter size of (3 × 3) with kernel numbers of 2 and 4. Max-pool size of (2 × 2) with a stride of 2 is used for POOL layers. The last POOL layer is connected to the two FC layers size of [373, 223]. The learning rate is chosen as $\eta_{det} = 0.0158$.
• Training of Data Set 3: The selected network for Data Set 3 includes two CONV layers with the filter size of (3 × 3) with kernel numbers of 8 and 32. Similar to the first case, a max-pool size of (2 × 2) with a stride of 2 is used for POOL layers. The network has the two FC layers size of [2,477, 804] after shared layers. The learning rate of $\eta_{det} = 0.069$ is adopted.

The sensitivity analysis of three training cases is visualized in Fig. 8. Fig. 8(b) presents the testing performance of Data Set 2, which had the worst testing performance among the three cases. Although the testing error was 1.19% for lower noise levels, it reached around 12% under the noise level of 16%. It is noticeable



Fig. 8. Sensitivity analysis of detection task trained with (a) Data Set 1; (b) Data Set 2; and (c) Data Set 3.

that the error rate exponentially increases with the increase in the noise levels.

As can be observed from Fig. 8(c), the testing accuracy increased significantly compared with the architecture trained with noise-free samples. The performance of the trained architecture stayed stable with the increase in noise level. Consequently, the introduction of different noise levels during the training process helped the network to learn damage features under uncertainty.

Fig. 8(a) illustrates the best testing performance from the given training cases. According to the figure, the proposed architecture identified the previously unseen damages with 0.21% error on noise-free samples. This error rate indicates that the CNNs are capable of learning the damage features almost perfectly even with the smallest crack size if enough training cases are provided. Furthermore, the test error does not change significantly even under 16% noise, which shows that the proposed methodology is robust for various levels of noise.

As discussed previously, deep learning–based approaches can be effective in identifying structural damage in more than one particular scenario, unlike traditional methods. They have a capability of generalization when designed carefully. In order to evaluate this characteristic, the performance of the proposed method was assessed with a larger crack size. A total of 3,000 samples with a crack size of 5.1 cm were tested for the detection task. As shown in Fig. 9, although samples with a crack size of 5.1 cm were not included in the training data set, the testing accuracy was almost perfect. The filters used in the architecture managed to highlight the cracked region.

Fig. 9. Sensitivity analysis of the detection task for the crack size 5.1 cm.

In summary, the introduction of uncertainty in measurement noise avoids overfitting, which leads to better testing and generalization performance. This fact emphasizes that training data set selection is vital in designing CNN architectures. Another point worth mentioning is that adding more samples to the training data set increases the accuracy and robustness.

Localization Task

This section discusses the main findings of the localization task. The localization part of the network was trained with Data Set 1 including both noise-free and noisy samples, which results in better detection accuracy. In order to eliminate the error coming from the detection task, the localization task was run with both healthy and damaged samples. The CNN architecture was tested under different noise levels and different threshold values.

Fig. 10 displays the percent localization error under different noise levels as well as different user-defined threshold values such as thr = 1.3 cm, thr = 2.5 cm, and thr = 5.1 cm. According to Fig. 10(a), the proposed architecture localizes the crack with 96.8% accuracy when the noise level is zero and the threshold value is 1.3 cm. This accuracy demonstrates that the proposed CNN architecture successfully localizes the damages. The testing performance does not change significantly across noise levels, which indicates the robustness of the method (i.e., testing accuracy is 95.3% when the network is tested with 16% noisy samples).

Fig. 11 shows an example of correct classification by using the threshold value of 1.3 cm. When the crack location is searched in a larger area by increasing the threshold, the error rate is reduced even further. The error rate was almost 1% under all levels of noise for both threshold values 2.5 and 5.1 cm as shown in Figs. 10(b and c).

Computational Performance

The computational performance of the case study was evaluated on an Intel (Hillsboro, Oregon) Xeon CPU E5-2620 v3 and NVIDIA Tesla K80 GPUs. The times required for the training and testing phases for a single strain field and for a batch of 64 strain fields are summarized in Table 1. In the training phase, one forward and backward pass was considered. The computation times for shared layers and detection task, shared layers and localization task, and only localization task are compared in Table 1.

As illustrated in Table 1, the testing time for all tasks was less than 20 ms on both hardware platforms. A video stream input with 25 frames per second would give a 40-ms time budget to complete testing for a single sample, which can be considered as a real-time requirement. Therefore, the proposed methodology achieves the real-time



Fig. 10. Sensitivity analysis of localization task for thresholds: (a) thr = 1.3 cm; (b) thr = 2.5 cm; and (c) thr = 5.1 cm.

requirement for the testing phase. In addition, as mentioned previously, the proposed architecture reduces the computational needs by exploiting the shared feature extractor for the detection and localization tasks. The computation time for only the localization task was almost half of the computation time for the shared layers and localization task, which demonstrates the efficiency of the methodology. As a result, the proposed approach achieves shorter training time and lower computation cost.

Fig. 11. Example of correct bounding box estimation.

Related Work

There have been several strain-based studies on damage detection and localization for platelike structures in the literature. Finding the damage index is one of the widely used techniques in crack identification where the existence of cracks is defined by comparing healthy and damaged states, and the potential damage location is estimated by constructing a threshold value (Li 2010). Some examples of these strain-based damage indicators include the modal strain energy index where the index is a ratio of summations of fractional energies of elements before and after damage (Cornwell et al. 1999), curvature mode shape index (Yam et al. 2002), cross-correlation of strain data (Yao et al. 2016), strain frequency response function (Swamidas and Chen 1995), spectral strain energy (Bayissa and Haritos 2007), and the strain measured by a sensor against the measurement obtained by neighbor sensors (Laflamme et al. 2016). Another study (Choi et al. 2005) adopted the changes in modal compliance distribution and validated the approach on a 60 × 40 cm plate including 48 elements. This study defined the percentage of false positive error as the ratio of the number of false positive predictions over the number of healthy elements in the plate. The percentage of false positives was stated to be 6.4% for the noise-free case and 8.5% for the 3% noise-to-signal (N/S) ratio where the crack size was 5.2 cm. The percentages of false positives for a crack size of 13 cm were 4.3%, 6.5%, and 8.7% for 0%, 1%, and 3% N/S ratio, respectively. There were also several approaches in which damage was characterized by the probabilistic behavior. As an example, the research presented in Hasni et al. (2017) extracted the probability density function of strain time histories of a gusset plate and identified cracks by using the support vector machine (SVM) classifier. The best performance of the classifier is noted as 82% where the performance is the number of correctly classified data points divided by the total number of data points on the girder.

In addition, several approaches adopt strain-based damage indicators as inputs to artificial neural networks (ANNs). In Katsikeros and Labeas (2009), discrete Fourier transformation and principal component analysis were used to generate damage features, which are then utilized to feed the ANN structure. The study was evaluated on a simulated lap-joint structure and validated by

Table 1. Computation performance of the proposed methodology

                                          K80 GPUs                        CPU
                                          Batch of samples  One sample    Batch of samples  One sample
Phase     Task                            time (ms)         time (ms)     time (ms)         time (ms)
Training  Shared layers + detection       6                 4             30                5
          Shared layers + localization    9                 5             45                14
          Localization                    4                 2             20                12
Testing   Shared layers + detection       3                 2             8                 2
          Shared layers + localization    4                 2             13                6
          Localization                    2                 1             7                 5
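
The real-time argument that accompanies Table 1 reduces to simple arithmetic, sketched here. The one-sample testing times are copied from Table 1 and the 25 frames-per-second stream is the example used in the text; the variable names are hypothetical.

```python
FRAME_RATE_FPS = 25
budget_ms = 1000.0 / FRAME_RATE_FPS  # time available to process one frame

# One-sample testing times (ms) from Table 1, keyed by (hardware, task).
test_time_ms = {
    ("GPU", "shared layers + detection"): 2,
    ("GPU", "shared layers + localization"): 2,
    ("GPU", "localization"): 1,
    ("CPU", "shared layers + detection"): 2,
    ("CPU", "shared layers + localization"): 6,
    ("CPU", "localization"): 5,
}

# A configuration meets the real-time requirement if it finishes within the budget.
within_budget = {key: t < budget_ms for key, t in test_time_ms.items()}
headroom_ms = {key: budget_ms - t for key, t in test_time_ms.items()}
slowest = max(test_time_ms, key=test_time_ms.get)

for key in test_time_ms:
    print(key, "ok" if within_budget[key] else "too slow", headroom_ms[key])
```

With these numbers every configuration fits within the per-frame budget, including the slowest one (CPU, shared layers plus localization).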



using mean-square error of target and predicted crack parameters. Another study (Sbarufatti et al. 2013) normalized each sensor of a confined region with respect to the average value measured by all the sensors within the same region to obtain the damage index. The associated damage index map was validated by using the simulation of a 60 × 50 cm panel with rivets. The detection accuracies were obtained as the average of the output values from 50 ANNs. For the noise-free case, detection accuracies were reported as 92.5% and 95% for 6- and 8-cm cracks, respectively. The accuracies dropped with the increase in noise level: for example, detection accuracies for the 12% additive Gaussian noise case were 25% for a 6-cm crack and 90% for an 8-cm crack.

The majority of the described damage identification approaches are effective in detecting a particular type and number of crack scenarios. Nevertheless, there are several limitations in these techniques, as mentioned previously. First, traditional approaches need measurements from both the baseline and unknown state of structures. Second, such approaches consist of damage feature design and threshold selection processes that require manual effort and human expertise. Finally, they aim to reduce the number of measurements due to the difficulty in dealing with large data sets.

Conclusion

The major challenge of damage diagnosis is characterizing the unknown relation between the measurements and damage patterns. To address this limitation, this paper introduced the CNN, which is one of the major breakthroughs in image recognition, to the damage detection and localization problem. The CNN technique has the ability to discover abstract features and complex classifier boundaries that are able to distinguish various features of the problem. In our study, the abstract feature maps were discovered with the CNN technique to classify damaged and healthy cases modeled through analytic simulations. The computational needs of the methodology were decreased by exploiting CNN’s shared parameterization and GPU’s massively parallel architecture.

The proposed CNN architecture can process the available data and adjust itself to the control variables such as measurement noise. As a result, this study accomplished high accuracy, robustness, and computational efficiency, which hold great potential for addressing the real-time damage diagnosis and localization challenge. Also, the performance of deep neural networks highly depends on the training data set. The selection of the training data set requires the representation of as many cases as possible to predict test cases accurately. The results show that the CNN architecture performs with higher accuracy and robustness when the training data set is formed with both noise-free and noisy data. Consequently, training data sets should be designed considering the existence of uncertainties.

In order to discover more about the abilities of convolutional neural networks, further research is needed. This work should aim to (1) perform damage diagnosis with more complicated loading scenarios and larger structures, (2) determine the severity of the damage for multiple damage cases, and (3) test the designed network on a real experimental setup.

Acknowledgments

Research funding is partially provided by the National Science Foundation through Grant No. CMMI-1351537 by the Hazard Mitigation and Structural Engineering program, and by a grant from the Commonwealth of Pennsylvania, Department of Community and Economic Development, through the Pennsylvania Infrastructure Technology Alliance (PITA). Martin Takáč was supported by National Science Foundation Grants CCF-1618717 and CMMI-1663256.

References

Abdeljaber, O., O. Avci, S. Kiranyaz, M. Gabbouj, and D. J. Inman. 2017. “Real-time vibration-based structural damage detection using one-dimensional convolutional neural networks.” J. Sound Vib. 388: 154–170. https://doi.org/10.1016/j.jsv.2016.10.043.
Alavi, A. H., H. Hasni, N. Lajnef, K. Chatti, and F. Faridazar. 2016. “An intelligent structural damage detection approach based on self-powered wireless sensor data.” Autom. Constr. 62: 24–44. https://doi.org/10.1016/j.autcon.2015.10.001.
Alpaydin, E. 2014. Introduction to machine learning. Cambridge, MA: MIT Press.
Bayissa, W., and N. Haritos. 2007. “Structural damage identification in plates using spectral strain energy analysis.” J. Sound Vib. 307 (1–2): 226–249. https://doi.org/10.1016/j.jsv.2007.06.062.
Bishop, C. M. 2006. Pattern recognition and machine learning: Information science and statistics, 209. New York: Springer.
Cha, Y.-J., W. Choi, and O. Büyüköztürk. 2017. “Deep learning-based crack damage detection using convolutional neural networks.” Comput.-Aided Civ. Infrastruct. Eng. 32 (5): 361–378. https://doi.org/10.1111/mice.12263.
Choi, S., S. Park, S. Yoon, and N. Stubbs. 2005. “Nondestructive damage identification in plate structures using changes in modal compliance.” NDT & E Int. 38 (7): 529–540. https://doi.org/10.1016/j.ndteint.2005.01.007.
Cornwell, P., S. W. Doebling, and C. R. Farrar. 1999. “Application of the strain energy damage detection method to plate-like structures.” J. Sound Vib. 224 (2): 359–374. https://doi.org/10.1006/jsvi.1999.2163.
Deng, J., W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. 2009. “Imagenet: A large-scale hierarchical image database.” In Proc., IEEE Conf. on Computer Vision and Pattern Recognition, 248–255. New York: IEEE.
Elkordy, M., K. Chang, and G. Lee. 1993. “Neural networks trained by analytically simulated damage states.” J. Comput. Civ. Eng. 7 (2): 130–145. https://doi.org/10.1061/(ASCE)0887-3801(1993)7:2(130).
Fang, X., H. Luo, and J. Tang. 2005. “Structural damage detection using neural network with learning rate improvement.” Comput. Struct. 83 (25): 2150–2161. https://doi.org/10.1016/j.compstruc.2005.02.029.
Farrar, C. R., and K. Worden. 2007. “An introduction to structural health monitoring.” Philos. Trans. R. Soc. London Ser. A 365 (1851): 303–315. https://doi.org/10.1098/rsta.2006.1928.
Flood, I. 2008. “Towards the next generation of artificial neural networks for civil engineering.” Adv. Eng. Inf. 22 (1): 4–14. https://doi.org/10.1016/j.aei.2007.07.001.
Flood, I., and N. Kartam. 1994. “Neural networks in civil engineering. II: Systems and application.” J. Comput. Civ. Eng. 8 (2): 149–162. https://doi.org/10.1061/(ASCE)0887-3801(1994)8:2(149).
Fujimaki, R., T. Yairi, and K. Machida. 2005. “An approach to spacecraft anomaly detection problem using kernel feature space.” In Proc., 11th ACM SIGKDD Int. Conf. on Knowledge Discovery in Data Mining, 401–410. New York: Association for Computing Machinery.
Glorot, X., and Y. Bengio. 2010. “Understanding the difficulty of training deep feedforward neural networks.” In Vol. 9 of Proc., 13th Int. Conf. on Artificial Intelligence and Statistics, 249–256. Sardinia: Proceedings of Machine Learning Research.
Gu, J., Z. Wang, J. Kuen, L. Ma, A. Shahroudy, B. Shuai, T. Liu, X. Wang, and G. Wang. 2015. “Recent advances in convolutional neural networks.” Preprint, submitted December 22, 2015. http://arXiv.org/abs/1512.07108.
Gul, M., and F. N. Catbas. 2009. “Statistical pattern recognition for structural health monitoring using time series modeling: Theory and experimental verifications.” Mech. Syst. Sig. Process. 23 (7): 2192–2204. https://doi.org/10.1016/j.ymssp.2009.02.013.
Gulgec, N. S., G. S. Shahidi, T. J. Matarazzo, and S. N. Pakzad. 2017a. “Current challenges with bigdata analytics in structural health



monitoring.” Vol. 7 of Structural health monitoring and damage detection, 79–84. New York: Springer.
Gulgec, N. S., S. G. Shahidi, and S. N. Pakzad. 2016. “A comparative study of compressive sensing approaches for a structural damage diagnosis.” In Proc., Geotechnical and Structural Engineering Congress 2016, 1910–1919. Reston, VA: ASCE.
Gulgec, N. S., M. Takáč, and S. N. Pakzad. 2017b. “Structural damage detection using convolutional neural networks.” Vol. 3 of Model validation and uncertainty quantification, 331–337. New York: Springer.
Hadzima-Nyarko, M., E. K. Nyarko, and D. Morić. 2011. “A neural network based modelling and sensitivity analysis of damage ratio coefficient.” Expert Syst. Appl. 38 (10): 13405–13413. https://doi.org/10.1016/j.eswa.2011.04.169.
Hasni, H., A. H. Alavi, P. Jiao, and N. Lajnef. 2017. “Detection of fatigue cracking in steel bridge girders: A support vector machine approach.” Arch. Civ. Mech. Eng. 17 (3): 609–622. https://doi.org/10.1016/j.acme.2016.11.005.
He, K., X. Zhang, S. Ren, and J. Sun. 2015. “Deep residual learning for image recognition.” Preprint, submitted December 10, 2015. http://arXiv.org/abs/1512.03385.
Katsikeros, C. E., and G. Labeas. 2009. “Development and validation of a strain-based structural health monitoring system.” Mech. Syst. Sig. Process. 23 (2): 372–383. https://doi.org/10.1016/j.ymssp.2008.03.006.
Krizhevsky, A., I. Sutskever, and G. E. Hinton. 2012. “ImageNet classification with deep convolutional neural networks.” In Advances in neural information processing systems, 1097–1105. Neural Information Processing Systems.
Laflamme, S., L. Cao, E. Chatzi, and F. Ubertini. 2016. “Damage detection and localization from dense network of strain sensors.” Shock Vib. 2016: 2562949. https://doi.org/10.1155/2016/2562949.
LeCun, Y., and Y. Bengio. 1995. Convolutional networks for images, speech, and time-series. Cambridge, MA: MIT Press.
LeCun, Y., Y. Bengio, and G. Hinton. 2015. “Deep learning.” Nature 521 (7553): 436–444. https://doi.org/10.1038/nature14539.
LeCun, Y., L. Bottou, Y. Bengio, and P. Haffner. 1998. “Gradient-based learning applied to document recognition.” Proc. IEEE 86 (11): 2278–2324. https://doi.org/10.1109/5.726791.
Lee, J. J., J. W. Lee, J. H. Yi, C. B. Yun, and H. Y. Jung. 2005. “Neural networks-based damage detection for bridges considering errors in baseline finite element models.” J. Sound Vib. 280 (3): 555–578. https://doi.org/10.1016/j.jsv.2004.01.003.
Li, L., K. G. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar. 2017. “Hyperband: A novel bandit-based approach to hyperparameter optimization.” J. Mach. Learn. Res. 18 (1): 6765–6816.
Li, Y. 2010. “Hypersensitivity of strain-based indicators for structural damage identification: A review.” Mech. Syst. Sig. Process. 24 (3): 653–664. https://doi.org/10.1016/j.ymssp.2009.11.002.
Mehrjoo, M., N. Khaji, H. Moharrami, and A. Bahreininejad. 2008. “Damage detection of truss bridge joints using artificial neural networks.” Expert Syst. Appl. 35 (3): 1122–1131. https://doi.org/10.1016/j.eswa.2007.08.008.
Nair, K. K., A. S. Kiremidjian, and K. H. Law. 2006. “Time series-based damage detection and localization algorithm with application to the ASCE benchmark structure.” J. Sound Vib. 291 (1): 349–368. https://doi.org/10.1016/j.jsv.2005.06.016.
Pan, B., K. Qian, H. Xie, and A. Asundi. 2009. “Two-dimensional digital image correlation for in-plane displacement and strain measurement: A review.” Meas. Sci. Technol. 20 (6): 062001. https://doi.org/10.1088/0957-0233/20/6/062001.
Pei, J.-S., A. Smyth, and E. Kosmatopoulos. 2004. “Analysis and modification of Volterra/Wiener neural networks for the adaptive identification of non-linear hysteretic dynamic systems.” J. Sound Vib. 275 (3–5): 693–718. https://doi.org/10.1016/j.jsv.2003.06.005.
Robbins, H., and S. Monro. 1951. “A stochastic approximation method.” In The annals of mathematical statistics, 400–407. New York: Springer.
Rojas, R. 2013. Neural networks: A systematic introduction. New York: Springer.
Rumelhart, D. E., G. E. Hinton, and R. J. Williams. 1988. “Learning representations by back-propagating errors.” Cognit. Model. 5 (3): 1.
Samuel, A. L. 1959. “Some studies in machine learning using the game of checkers.” IBM J. Res. Dev. 3 (3): 210–229. http://doi.org/10.1147/rd.33.0210.
Sbarufatti, C., A. Manes, and M. Giglio. 2013. “Performance optimization of a diagnostic system based upon a simulated strain field for fatigue damage characterization.” Mech. Syst. Sig. Process. 40 (2): 667–690. https://doi.org/10.1016/j.ymssp.2013.06.003.
Shahidi, S. G., N. S. Gulgec, and S. N. Pakzad. 2016. “Compressive sensing strategies for multiple damage detection and localization.” Vol. 2 of Dynamics of civil structures, 17–22. New York: Springer.
Shalev-Shwartz, S., and S. Ben-David. 2014. Understanding machine learning: From theory to algorithms. Cambridge, UK: Cambridge University Press.
Shi, A., and X.-H. Yu. 2012. “Structural damage detection using artificial neural networks and wavelet transform.” In Proc., IEEE Int. Conf. on Computational Intelligence for Measurement Systems and Applications, 7–11. New York: IEEE.
Shu, J., Z. Zhang, I. Gonzalez, and R. Karoumi. 2013. “The application of a damage detection method using artificial neural network and train-induced vibrations on a simplified railway bridge model.” Eng. Struct. 52: 408–421. https://doi.org/10.1016/j.engstruct.2013.02.031.
Simon, P. 2013. Too big to ignore: The business case for big data. Vol. 72. New York: Wiley.
Simonyan, K., and A. Zisserman. 2014. “Very deep convolutional networks for large-scale image recognition.” Preprint, submitted September 4, 2015. http://arXiv.org/abs/1409.1556.
Sohn, H., and C. R. Farrar. 2001. “Damage diagnosis using time series analysis of vibration signals.” Smart Mater. Struct. 10 (3): 446–451. https://doi.org/10.1088/0964-1726/10/3/304.
Swamidas, A., and Y. Chen. 1995. “Monitoring crack growth through change of modal parameters.” J. Sound Vib. 186 (2): 325–343. https://doi.org/10.1006/jsvi.1995.0452.
Szegedy, C., W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. 2015. “Going deeper with convolutions.” In Proc., IEEE Conf. on Computer Vision and Pattern Recognition, 1–9. New York: IEEE.
Theano Development Team. 2016. “Theano: A Python framework for fast computation of mathematical expressions.” Preprint, submitted May 9, 2015. http://arXiv.org/abs/1605.02688.
Yam, L., Y. Li, and W. Wong. 2002. “Sensitivity studies of parameters for damage detection of plate-like structures using static and dynamic approaches.” Eng. Struct. 24 (11): 1465–1475. https://doi.org/10.1016/S0141-0296(02)00094-9.
Yao, R., and S. N. Pakzad. 2012. “Autoregressive statistical pattern recognition algorithms for damage detection in civil structures.” Mech. Syst. Sig. Process. 31 (Aug): 355–368. https://doi.org/10.1016/j.ymssp.2012.02.014.
Yao, R., S. N. Pakzad, and P. Venkitasubramaniam. 2016. “Compressive sensing based structural damage detection and localization using theoretical and metaheuristic statistics.” Struct. Control Health Monit. 24 (4): e1881. https://doi.org/10.1002/stc.1881.
Zapico, J., M. Gonzalez, and K. Worden. 2003. “Damage assessment using neural networks.” Mech. Syst. Sig. Process. 17 (1): 119–125. https://doi.org/10.1006/mssp.2002.1547.
Zeiler, M. D., and R. Fergus. 2014. “Visualizing and understanding convolutional networks.” In Proc., European Conf. on Computer Vision, 818–833. New York: Springer.
Zhang, C., S. Bengio, M. Hardt, B. Recht, and O. Vinyals. 2016. “Understanding deep learning requires rethinking generalization.” Preprint, submitted November 10, 2015. http://arXiv.org/abs/1611.03530.
