Abstract: Damage diagnosis has been a challenging inverse problem in structural health monitoring. The main difficulty is characterizing the unknown relation between the measurements and damage patterns (i.e., damage indicator selection). Such damage indicators would ideally be able to identify the existence, location, and severity of damage. This procedure therefore requires complex data processing algorithms and dense sensor arrays, which makes it computationally intensive. To address this limitation, this paper introduces the convolutional neural network (CNN), one of the major breakthroughs in image recognition, to the damage detection and localization problem. The CNN technique has the ability to discover abstract features and complex classifier boundaries that can distinguish various attributes of the problem. In this paper, a CNN topology was designed to classify simulated damaged and healthy cases and to localize the damage when it exists. The performance of the proposed technique was evaluated through finite-element simulations of undamaged and damaged structural connections. The model was trained on strain distributions resulting from various loads with several different crack scenarios, and completely new damage setups were introduced to the model during the testing process. Based on the findings of the proposed study, damage diagnosis and localization were achieved with high accuracy, robustness, and computational efficiency. DOI: 10.1061/(ASCE)CP.1943-5487.0000820. © 2019 American Society of Civil Engineers.
ory requirements by using fewer parameters (LeCun and Bengio 1995).

Convolutional neural networks are built on three architectural ideas: local receptive fields, shared weights, and spatial subsampling (LeCun et al. 1998). Passing the same set of units over the entire input allows multiple feature maps to be extracted; in this case, a feature map shifts by the same amount the input shifts. This property, called local receptive fields, makes CNNs robust to translation and distortion of the input. Furthermore, the weights and biases are shared across the feature maps, which reduces the number of learned parameters as well as the memory demands. Finally, spatial subsampling reduces the resolution of the feature maps and prevents the outputs from being sensitive to shifts and rotations.

CNNs receive the input as three-dimensional (3D) volumes (width, height, depth). As an example from image recognition, the depth of a colored image (i.e., having red-green-blue color channels) is three, whereas the depth of a grayscale image is one. These 3D input volumes feed the CNN architecture, which can be constructed by using three types of layers:
1. Convolutional (CONV) layer parameters are learnable filters in which each filter (weights or kernels) has a spatially small width and height shared through the full depth of the input. While sliding these weights, the CONV layer computes the dot product between these filters and the small region of the input at each position. Then the weighted sum of the input and weights is activated by nonlinear functions to form feature maps. This operation is called convolution. The size of the feature map depends on a variety of hyperparameters such as the number of kernels, kernel size, number of strides, and zero padding. The nonlinear activation maps are generated based on the number of
3. Fully connected (FC) layers operate on the stacked convolutional or pooling layer outputs and compute the weighted sum of inputs with a nonlinear mapping as described in the overview of DNNs.

Proposed Methodology

Overview

This section gives a general map of the proposed technique. As shown in Fig. 3, the methodology consists of training and testing phases. The training phase operates on raw strain fields from structures. After normalizing each strain field by its absolute maximum, the search mechanism finds a good set of hyperparameters that improves the performance of the network architecture. Then the selected architecture is trained to minimize the error between predictions and true labels.

The training phase consists of two tasks: detection and localization. The detection task determines the existence of damage and is treated as a classification problem (i.e., 0 for undamaged and 1 for damaged). The localization task treats the case as a regression problem where the goal is accurate estimation of the boundaries of the damaged area. In the proposed methodology, both tasks use shared layers in the early stages of the deep learning pipeline. These layers are specialized to extract local features that are common to both localization and detection. These early layers are then fed into task-specific layers. Shared front-end layers avoid having two separate networks, provide more efficient learning, and have shorter training time and lower computation cost.
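As a concrete illustration of the convolution in item 1, the following is a minimal NumPy sketch, not the paper's implementation: the 28 × 56 input size and 3 × 3 kernels match the architecture reported later, while the random kernel values and the tanh activation are assumptions made for the example.

```python
import numpy as np

def conv_layer(x, kernels, stride=1, pad=0):
    """Slide each kernel over the 2D input x, take the dot product at every
    position, and pass the result through a tanh nonlinearity to form one
    feature map per kernel (illustrative sketch, not an optimized CONV layer)."""
    if pad:
        x = np.pad(x, pad)
    (H, W), (K, f, _) = x.shape, kernels.shape
    out_h = (H - f) // stride + 1          # feature-map height
    out_w = (W - f) // stride + 1          # feature-map width
    out = np.empty((K, out_h, out_w))
    for k in range(K):
        for i in range(out_h):
            for j in range(out_w):
                patch = x[i * stride:i * stride + f, j * stride:j * stride + f]
                out[k, i, j] = np.sum(patch * kernels[k])  # dot product
    return np.tanh(out)                    # nonlinear activation

# A 28 x 56 strain field and eight 3 x 3 kernels, stride 1, no zero padding:
# each output side follows (n - f + 2*pad)/stride + 1, giving 26 x 54 maps here.
maps = conv_layer(np.random.rand(28, 56), np.random.randn(8, 3, 3))
assert maps.shape == (8, 26, 54)
```

The final assertion makes the hyperparameter dependence explicit: the number of kernels sets the depth of the output volume, while kernel size, stride, and zero padding set the width and height of each feature map.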
The trained model parameters are stored to be used in the testing phase. In this phase, raw strain fields are fed into the CNN architecture to predict the labels for the detection and localization tasks.

Hyperparameter Selection

CNN architectures can be built in various ways by using sequences of CONV, POOL, and FC layers. The performance of neural networks critically depends on identifying a good set of hyperparameters (Pei et al. 2004). In this study, these hyperparameters include the learning rate, the number of CONV and FC layers, the number of kernels, the kernel and pool sizes, and the hidden layer sizes. In order to find a structure with a good configuration, a hyperparameter search mechanism was implemented for both the damage detection and localization tasks (Li et al. 2016).

Different networks were constructed with randomly selected hyperparameters. The 10% of the networks with the worst validation score were removed after the first run, and the remaining networks were run for another epoch. The runs were repeated until the best 10 networks remained in the pool. After the best network was selected for the damage identification part, the output of its last convolutional layer was stored and used as an input for the hyperparameter search for the localization task. The search for this task was performed on FC layers only.

Training

The training process comprised two phases: feedforward and back-propagation (Rojas 2013). The feedforward process evaluated the prediction function for given input instances. Then the back-propagation step adjusted the weights in proportion to their contributions to the total error (Rumelhart et al. 1988) by using a stochastic gradient descent (SGD) algorithm (Robbins and Monro 1951). After the gradients were calculated with SGD, the detection and localization parameters were updated with the learning rates η_det and η_loc, respectively. Overfitting was prevented by monitoring the validation data set performance in every complete forward and backward pass (epoch). When architecture performance had improved sufficiently on the validation data set, the training process was stopped.

Weight Initialization

The first step of training was initializing the weights to keep input instances in a reasonable range along the layers. This study adopted Xavier initialization for the tanh function (Glorot and Bengio 2010). The weights of the ith layer were initialized from a uniform distribution on the interval [−√(6/(n_{i−1} + n_i)), √(6/(n_{i−1} + n_i))], where n_{i−1} and n_i are the number of units in the (i − 1)th and ith layers.

Prediction Functions

The feedforward step evaluated different prediction functions for the detection and localization tasks. This study employed the softmax classifier (Bishop 2006) to predict the label of the detection output (y_pred), which is either healthy or damaged. The class i of the input x was estimated by selecting the maximum probability of the softmax function defined as follows:

[softmax(θ(x, w))]_i = e^{[θ(x,w)]_i} / Σ_j e^{[θ(x,w)]_j}    (3)

y_pred = argmax_i ([softmax(θ(x, w))]_i)    (4)

The localization task aimed to predict the location of the crack, which is defined by a bounding box vector z_pred. For this reason, this task used a regressor instead of a classifier. The ith element of the bounding box of the input x was estimated by the following function:

[z_pred]_i = Σ_j [θ(x, w)]_j    (5)

Loss Functions

The proposed model adopted two separate loss functions for the detection and localization tasks. The diagnosis part employed the negative log-likelihood function, where the optimal architecture parameters θ were learned by maximizing the likelihood of the data set. On the other hand, the localization task calculated the loss between the predicted and true bounding boxes with the l2 loss function
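These prediction and loss functions can be sketched compactly. The softmax of Eq. (3), the argmax of Eq. (4), and the negative log-likelihood follow the text directly; the exact l2 expression [Eq. (6)] is not shown in this excerpt, so a squared Euclidean distance over the box coordinates is assumed here.

```python
import numpy as np

def softmax(theta):
    """Eq. (3): probabilities over the network outputs theta = θ(x, w)."""
    e = np.exp(theta - np.max(theta))      # shifted for numerical stability
    return e / np.sum(e)

def predict_class(theta):
    """Eq. (4): y_pred is the index of the maximum softmax probability."""
    return int(np.argmax(softmax(theta)))

def detection_loss(theta, label):
    """Negative log-likelihood of the true class (detection task)."""
    return -np.log(softmax(theta)[label])

def localization_loss(z_pred, z_true):
    """l2 loss between predicted and true bounding boxes; the squared
    Euclidean form is an assumption, since Eq. (6) is not in this excerpt."""
    d = np.asarray(z_pred, dtype=float) - np.asarray(z_true, dtype=float)
    return float(np.dot(d, d))

# A two-class example in which the damaged class (1) receives the higher score.
theta = np.array([0.5, 2.0])
assert predict_class(theta) == 1
```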
|max(â1, â2) − max(a1, a2)| ≤ thr_a    (9)

|max(b̂1, b̂2) − max(b1, b2)| ≤ thr_b    (10)

where (â1, â2, b̂1, b̂2) = predicted box coordinates; (a1, a2, b1, b2) = true box coordinates; and thr is the user-defined threshold.

Numerical Validation

Data Preparation

The damage identification process requires a large training set of correctly classified damage states (Elkordy et al. 1993). In this

Training, validation, and test data sets were formed by modeling different loading cases, damage scenarios, and noise levels. The load was selected from a uniformly distributed load ∼U[−445 kN (compression), 534 kN (tension)] and applied to the end of the channel members. The damage in the gusset plate was simulated as 2.5-cm-long cracks, which is the smallest crack size given the mesh size. The crack locations were chosen at the beginning of each run with a specified load level. The coordinates of the cracks, varying between the two corners of the middle part of the plate [lower left corner point A with coordinates (21.6, 2.5) to upper right corner point B with coordinates (45.9, 33.0)], are shown in Fig. 5. In order to assess the approach with completely unseen damaged samples, none of the coordinates of the training set was used in the testing samples.
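The sampling just described can be sketched as follows. The function names and the use of Python's random module are illustrative stand-ins, not the paper's finite-element toolchain; only the numeric ranges come from the text.

```python
import random

rng = random.Random(0)

def sample_load():
    """Draw a load from U[-445 kN (compression), 534 kN (tension)]."""
    return rng.uniform(-445.0, 534.0)

def sample_crack_location():
    """Draw a crack coordinate from the middle part of the plate, bounded by
    corner A (21.6, 2.5) and corner B (45.9, 33.0); coordinates in cm."""
    return (rng.uniform(21.6, 45.9), rng.uniform(2.5, 33.0))

load = sample_load()
x, y = sample_crack_location()
assert -445.0 <= load <= 534.0
assert 21.6 <= x <= 45.9 and 2.5 <= y <= 33.0
```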
Fig. 4. Setup of the (a) healthy, and (b) single-damaged gusset plate; and (c) material behavior, and (d) inelastic behavior of the plate.
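The bounding-box accuracy criterion of Eqs. (9) and (10) above can be written as a small check. Eqs. (7) and (8) precede this excerpt; this sketch assumes they bound the min edges symmetrically, which is an assumption rather than a quotation of the paper.

```python
def localization_hit(pred_box, true_box, thr_a, thr_b):
    """Return True when every edge of the predicted box lies within the
    user-defined threshold of the corresponding true-box edge. Boxes are
    given in the equations' ordering (a1, a2, b1, b2)."""
    a1h, a2h, b1h, b2h = pred_box
    a1, a2, b1, b2 = true_box
    return (abs(min(a1h, a2h) - min(a1, a2)) <= thr_a      # assumed Eq. (7)
            and abs(min(b1h, b2h) - min(b1, b2)) <= thr_b  # assumed Eq. (8)
            and abs(max(a1h, a2h) - max(a1, a2)) <= thr_a  # Eq. (9)
            and abs(max(b1h, b2h) - max(b1, b2)) <= thr_b) # Eq. (10)

# A perfect prediction passes at thr = 1.3 cm; shifting the box by 1.7 cm fails.
box = (20.3, 22.9, 2.5, 5.0)
assert localization_hit(box, box, 1.3, 1.3)
assert not localization_hit((22.0, 24.6, 2.5, 5.0), box, 1.3, 1.3)
```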
The uncertainty in the measurement process was simulated as an additive Gaussian noise ∼N(0, σ²), where σ is the standard deviation of the measurement noise. Different noise levels (i.e., the ratio between the standard deviation of the measurement noise and the actual strain values) were generated to compute the influence of the noise on CNN architecture performance.

The crack coordinates of the single-damaged samples were also collected for the localization task. The crack location was stored as a bounding box (a1, b1, a2, b2), where b1 and b2 indicate the coordinates of the tips of the crack. While defining a1 and a2, 1.3 cm was subtracted from and added to the x-coordinate of the crack to reduce the rounding error in the direction of loading; for example, if a crack is located between (21.6, 2.5) and (21.6, 5.0), the bounding box is defined as [20.3, 2.5, 22.9, 5.0]. For the healthy samples, the bounding box was set to [0, 0, 0, 0].

While preparing damaged samples, 72 different crack locations and 3,000 loading scenarios were used. None of the coordinates of the training sets was used in the testing samples (i.e., 36 locations for training, 36 locations for testing, as shown in Fig. 5). Healthy samples were modeled with 6,000 loading scenarios. As a result, a total of 6,000 healthy and 6,000 damaged samples were generated. Then four different noise levels (2%, 5%, 10%, and 15%) were added to the noise-free samples to produce a total of 30,000 healthy and 30,000 damaged samples. This data set is called Data Set 1 and was distributed to training, validation, and testing samples.

Hyperparameters

A total of 50 networks were constructed with randomly selected hyperparameters in the detection task. The hyperparameter range for the detection task had the following characteristics: learning rate [2 to 2^−8]; the number of CONV and FC layers [1, 2, or 3]; the number of kernels [2 to 2^7]; kernel size [(3 × 3) or (5 × 5)] with a stride of 1 and without zero padding; maximum pool size [(1 × 1) with a stride of 1] or [(2 × 2) with a stride of 2]; and randomly selected hidden layer sizes.

The last convolutional layer of the best architecture in the detection task was stored as an input for the hyperparameter search for the localization task. The search for the localization task was performed on a total of 70 networks with FC layers only. The networks for the localization task were built with hyperparameters using learning rate [2^−6 to 2^−18]; the number of FC layers [1, 2, or 3]; and randomly selected hidden layer sizes. The activation function tanh() was adopted for the activation of the layers for both the detection and localization tasks.

Training and Proposed Architecture

Training was implemented by using a Python library called Theano to optimize the mathematical expressions consisting of multidimensional arrays (Theano Development Team 2016). Higher performance was achieved by using NVIDIA (Holmdel, New Jersey) Tesla K80 GPUs, which enabled parallelism for data-intensive calculations.

In this study, a minibatch SGD algorithm with a batch size of N = 64 was implemented. Identical thresholds were used for thr_a and thr_b described through Eqs. (7)–(10). In order to discover the effect of the size of the search area on the localization accuracy, the sides of the bounding box were increased in length by different threshold values. The threshold values 1.3, 2.6, and 5.1 cm were selected to have an increase in length by scale factors of 2, 3, and 5, respectively. The scale coefficients were selected randomly but not to exceed a quarter of the area of the central part of the plate. Thresholds are illustrated in Fig. 6 with values of thr = 1.3 cm, thr = 2.5 cm, and thr = 5.1 cm.

Fig. 6. Threshold values adopted for the localization task.

Fig. 7 shows the proposed architecture as a result of the hyperparameter search mechanism. The network consisted of three convolutional layers followed by two separate fully connected layers for the detection and localization tasks. The detection part classified 28 × 56 × 1 inputs as healthy or damaged, whereas the localization part predicted the bounding box of the crack area. The convolutional layers received the input layer and passed it through a filter size of (3 × 3). As a result of these CONV layers, the network formed 8, 16, and 32 feature maps. The max-pooling operation was implemented right after the first and second convolution layers. A max-pool size of (2 × 2) with a stride of 2 was
used for the POOL layers. The feature maps of the last convolutional layer were stacked together in an array and given as an input to the fully connected layers with hidden layer sizes of [836, 767] for the detection task and [2058, 881, 534] for the localization task. The learning rates η_det = 0.0451 and η_loc = 0.0026 were used for the detection and localization parts, respectively.

As mentioned previously, CNNs have the ability to keep the spatial features of inputs. In order to visualize this ability, the activated feature maps after the POOL-1, POOL-2, and CONV-3 layers of a correctly identified damaged sample are shown in Fig. 7. The activations were normalized to a scale between 0 and 1, where white represents 0 and black represents 1. The figure shows that the damage location (i.e., top right corner) is still visible during Stages 1 and 2. After the CONV-3 layer (Stage 3), the features become so abstract that it would be almost impossible to design them by hand.

Results and Discussion

The performance and sensitivity analysis of the proposed methodology is evaluated in this section. The accuracy and robustness of the CNN architecture are discussed for both the detection and localization tasks.

Detection Task

This section presents the performance and sensitivity analysis of the detection task. In order to measure the effect of noise, two additional data sets were prepared, both consisting of 6,000 undamaged and 6,000 damaged samples. Data Set 2 was formed by only noise-free samples, and Data Set 3 was selected from a subset of Data Set 1. A hyperparameter search was performed for these two data sets for a fair comparison. Trained models were then tested 100 times with samples including a variation of different noise levels (0% or noise-free, 2%, 4%, 6%, 8%, 10%, 12%, 14%, and 16%). The proposed network topologies for the two additional training processes are as follows:
• Training of Data Set 2: The network is trained with Data Set 2, which consists of only noise-free samples. The proposed network for the second case is composed of two CONV layers followed by POOL layers, and two FC layers. The CONV layers adopt a filter size of (3 × 3) with kernel numbers of 2 and 4. A max-pool size of (2 × 2) with a stride of 2 is used for the POOL layers. The last POOL layer is connected to the two FC layers with sizes of [373, 223]. The learning rate is chosen as η_det = 0.0158.
• Training of Data Set 3: The selected network for Data Set 3 includes two CONV layers with a filter size of (3 × 3) and kernel numbers of 8 and 32. Similar to the first case, a max-pool size of (2 × 2) with a stride of 2 is used for the POOL layers. The network has two FC layers with sizes of [2477, 804] after the shared layers. A learning rate of η_det = 0.069 is adopted.

The sensitivity analysis of the three training cases is visualized in Fig. 8. Fig. 8(b) presents the testing performance of Data Set 2, which had the worst testing performance among the three cases. Although the testing error was 1.19% for lower noise levels, it reached around 12% under the noise level of 16%. It is noticeable that the error rate increases exponentially with the increase in the noise level.

As can be observed from Fig. 8(c), the testing accuracy increased significantly compared with the architecture trained with noise-free samples. The performance of the trained architecture stayed stable with the increase in noise level. Consequently, the introduction of different noise levels during the training process helped the network to learn damage features under uncertainty.

Fig. 8(a) illustrates the best testing performance among the given training cases. According to the figure, the proposed architecture identified the previously unseen damages with 0.21% error on noise-free samples. This error rate indicates that CNNs are capable of learning the damage features almost perfectly, even with the smallest crack size, if enough training cases are provided. Furthermore, the test error does not change significantly even under 16% noise, which shows that the proposed methodology is robust for various levels of noise.

As discussed previously, deep learning–based approaches can be effective in identifying structural damage beyond a particular scenario, unlike traditional methods. They have a capability for generalization when designed carefully. In order to evaluate this characteristic, the performance of the proposed method was assessed with a larger crack size. A total of 3,000 samples with a crack size of 5.1 cm were tested for the detection task. As shown in Fig. 9, although samples with a crack size of 5.1 cm were not included in the training data set, the testing accuracy was almost perfect. The filters used in the architecture managed to highlight the cracked region.

In summary, the introduction of uncertainty in measurement noise avoids overfitting, which leads to better testing and generalization performance. This fact emphasizes that training data set selection is vital in designing CNN architectures. Another point worth mentioning is that adding more samples to the training data set increases the accuracy and robustness.

Localization Task

This section discusses the main findings of the localization task. The localization part of the network was trained with Data Set 1, including both noise-free and noisy samples, which results in better detection accuracy. In order to eliminate the error coming from the detection task, the localization task was run with both healthy and damaged samples. The CNN architecture was tested under different noise levels and different threshold values.

Fig. 10 displays the percent localization error under different noise levels as well as different user-defined threshold values such as thr = 1.3 cm, thr = 2.5 cm, and thr = 5.1 cm. According to Fig. 10(a), the proposed architecture localizes the crack with 96.8% accuracy when the noise level is zero and the threshold value is 1.3 cm. This error rate demonstrates that the proposed CNN architecture successfully localizes the damage. The testing performance under different noise levels does not change significantly, which indicates the robustness of the method (i.e., the testing accuracy is 95.3% when the network is tested with 16% noisy samples).

Fig. 11 shows an example of correct classification by using the threshold value of 1.3 cm. When the crack location is searched in a larger area by increasing the threshold, the error rate is reduced even further. The error rate was almost 1% under all levels of noise for both threshold values 2.5 and 5.1 cm, as shown in Figs. 10(b and c).
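The robustness tests above inject the additive Gaussian noise model introduced in the data-preparation section. A minimal sketch follows; the paper defines the noise level as the ratio of the noise standard deviation to the actual strain values, so the use of the per-field root-mean-square strain as the reference magnitude here is an assumption of this sketch.

```python
import numpy as np

def add_measurement_noise(strain, level, rng):
    """Add zero-mean Gaussian noise ~ N(0, sigma^2) to a strain field, with
    sigma = level * RMS(strain); `level` is the noise ratio, e.g. 0.16 for
    the 16% case. Scaling sigma by the RMS strain is an assumption."""
    sigma = level * float(np.sqrt(np.mean(np.square(strain))))
    return strain + rng.normal(0.0, sigma, size=strain.shape)

rng = np.random.default_rng(0)
field = rng.random((28, 56))                 # a stand-in 28 x 56 strain field
noisy = add_measurement_noise(field, 0.16, rng)
assert noisy.shape == field.shape
```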
Fig. 9. Sensitivity analysis of the detection task for the crack size 5.1 cm.

Computational Performance

The computational performance of the case study was evaluated on an Intel (Hillsboro, Oregon) Xeon CPU E5-2620 v3 and NVIDIA Tesla K80 GPUs. The times required for the training and testing phases for a single strain field and a batch of 64 strain fields are summarized in Table 1. In the training phase, one forward and backward pass was considered. The computation times for the shared layers and detection task, the shared layers and localization task, and the localization task only are compared in Table 1.

As illustrated in Table 1, the testing time for all tasks was less than 20 ms on both hardware platforms. A video stream input with 25 frames per second would give a 40-ms time budget to complete testing for a single sample, which can be considered a real-time requirement. Therefore, the proposed methodology achieves the real-time
Arch. Civ. Mech. Eng. 17 (3): 609–622. https://doi.org/10.1016/j.acme.2016.11.005.
He, K., X. Zhang, S. Ren, and J. Sun. 2015. "Deep residual learning for image recognition." Preprint, submitted December 10, 2015. http://arXiv.org/abs/1512.03385.
Katsikeros, C. E., and G. Labeas. 2009. "Development and validation of a strain-based structural health monitoring system." Mech. Syst. Sig. Process. 23 (2): 372–383. https://doi.org/10.1016/j.ymssp.2008.03.006.
Krizhevsky, A., I. Sutskever, and G. E. Hinton. 2012. "ImageNet classification with deep convolutional neural networks." In Advances in neural information processing systems, 1097–1105. Neural Information Processing Systems.
Laflamme, S., L. Cao, E. Chatzi, and F. Ubertini. 2016. "Damage detection and localization from dense network of strain sensors." Shock Vib. 2016: 2562949. https://doi.org/10.1155/2016/2562949.
LeCun, Y., and Y. Bengio. 1995. Convolutional networks for images, speech, and time-series. Cambridge, MA: MIT Press.
LeCun, Y., Y. Bengio, and G. Hinton. 2015. "Deep learning." Nature 521 (7553): 436–444. https://doi.org/10.1038/nature14539.
LeCun, Y., L. Bottou, Y. Bengio, and P. Haffner. 1998. "Gradient-based learning applied to document recognition." Proc. IEEE 86 (11): 2278–2324. https://doi.org/10.1109/5.726791.
Lee, J. J., J. W. Lee, J. H. Yi, C. B. Yun, and H. Y. Jung. 2005. "Neural networks-based damage detection for bridges considering errors in baseline finite element models." J. Sound Vib. 280 (3): 555–578. https://doi.org/10.1016/j.jsv.2004.01.003.
Li, L., K. G. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar. 2017. "Hyperband: A novel bandit-based approach to hyperparameter optimization." J. Mach. Learn. Res. 18 (1): 6765–6816.
Li, Y. 2010. "Hypersensitivity of strain-based indicators for structural damage identification: A review." Mech. Syst. Sig. Process. 24 (3): 653–664. https://doi.org/10.1016/j.ymssp.2009.11.002.
Mehrjoo, M., N. Khaji, H. Moharrami, and A. Bahreininejad. 2008. "Damage detection of truss bridge joints using artificial neural networks." Expert Syst. Appl. 35 (3): 1122–1131. https://doi.org/10.1016/j.eswa.2007.08.008.
Nair, K. K., A. S. Kiremidjian, and K. H. Law. 2006. "Time series-based damage detection and localization algorithm with application to the ASCE benchmark structure." J. Sound Vib. 291 (1): 349–368. https://doi.org/10.1016/j.jsv.2005.06.016.
Pan, B., K. Qian, H. Xie, and A. Asundi. 2009. "Two-dimensional digital image correlation for in-plane displacement and strain measurement: A review." Meas. Sci. Technol. 20 (6): 062001. https://doi.org/10.1088/0957-0233/20/6/062001.
Pei, J.-S., A. Smyth, and E. Kosmatopoulos. 2004. "Analysis and modification of Volterra/Wiener neural networks for the adaptive identification of non-linear hysteretic dynamic systems." J. Sound Vib. 275 (3–5): 693–718. https://doi.org/10.1016/j.jsv.2003.06.005.
Robbins, H., and S. Monro. 1951. "A stochastic approximation method." In The annals of mathematical statistics, 400–407. New York: Springer.
learning: From theory to algorithms. Cambridge, UK: Cambridge University Press.
Shi, A., and X.-H. Yu. 2012. "Structural damage detection using artificial neural networks and wavelet transform." In Proc., IEEE Int. Conf. on Computational Intelligence for Measurement Systems and Applications, 7–11. New York: IEEE.
Shu, J., Z. Zhang, I. Gonzalez, and R. Karoumi. 2013. "The application of a damage detection method using artificial neural network and train-induced vibrations on a simplified railway bridge model." Eng. Struct. 52: 408–421. https://doi.org/10.1016/j.engstruct.2013.02.031.
Simon, P. 2013. Too big to ignore: The business case for big data. Vol. 72. New York: Wiley.
Simonyan, K., and A. Zisserman. 2014. "Very deep convolutional networks for large-scale image recognition." Preprint, submitted September 4, 2014. http://arXiv.org/abs/1409.1556.
Sohn, H., and C. R. Farrar. 2001. "Damage diagnosis using time series analysis of vibration signals." Smart Mater. Struct. 10 (3): 446–451. https://doi.org/10.1088/0964-1726/10/3/304.
Swamidas, A., and Y. Chen. 1995. "Monitoring crack growth through change of modal parameters." J. Sound Vib. 186 (2): 325–343. https://doi.org/10.1006/jsvi.1995.0452.
Szegedy, C., W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. 2015. "Going deeper with convolutions." In Proc., IEEE Conf. on Computer Vision and Pattern Recognition, 1–9. New York: IEEE.
Theano Development Team. 2016. "Theano: A Python framework for fast computation of mathematical expressions." Preprint, submitted May 9, 2016. http://arXiv.org/abs/1605.02688.
Yam, L., Y. Li, and W. Wong. 2002. "Sensitivity studies of parameters for damage detection of plate-like structures using static and dynamic approaches." Eng. Struct. 24 (11): 1465–1475. https://doi.org/10.1016/S0141-0296(02)00094-9.
Yao, R., and S. N. Pakzad. 2012. "Autoregressive statistical pattern recognition algorithms for damage detection in civil structures." Mech. Syst. Sig. Process. 31 (Aug): 355–368. https://doi.org/10.1016/j.ymssp.2012.02.014.
Yao, R., S. N. Pakzad, and P. Venkitasubramaniam. 2016. "Compressive sensing based structural damage detection and localization using theoretical and metaheuristic statistics." Struct. Control Health Monit. 24 (4): e1881. https://doi.org/10.1002/stc.1881.
Zapico, J., M. Gonzalez, and K. Worden. 2003. "Damage assessment using neural networks." Mech. Syst. Sig. Process. 17 (1): 119–125. https://doi.org/10.1006/mssp.2002.1547.
Zeiler, M. D., and R. Fergus. 2014. "Visualizing and understanding convolutional networks." In Proc., European Conf. on Computer Vision, 818–833. New York: Springer.
Zhang, C., S. Bengio, M. Hardt, B. Recht, and O. Vinyals. 2016. "Understanding deep learning requires rethinking generalization." Preprint, submitted November 10, 2016. http://arXiv.org/abs/1611.03530.