Shallow and Deep Artificial Neural Networks for Structural Reliability Analysis

Wellison José de Santana Gomes
Department of Civil Engineering,
Center for Optimization and Reliability in Engineering (CORE),
Federal University of Santa Catarina,
Rua João Pio Duarte, 205, Córrego Grande,
Florianópolis, SC 88037-000, Brazil
e-mail: [email protected]

Surrogate models are efficient tools which have been successfully applied in structural reliability analysis, as an attempt to keep the computational costs acceptable. Among the surrogate models available in the literature, artificial neural networks (ANNs) have been attracting research interest for many years. However, the ANNs used in structural reliability analysis are usually the shallow ones, based on an architecture consisting of neurons organized in three layers, the so-called input, hidden, and output layers. On the other hand, with the advent of deep learning, ANNs with one input, one output, and several hidden layers, known as deep neural networks, have been increasingly applied in engineering and other areas. Considering that many recent publications have shown advantages of deep over shallow ANNs, the present paper aims at comparing these types of neural networks in the context of structural reliability. By applying shallow and deep ANNs in the solution of four benchmark structural reliability problems from the literature, employing Monte Carlo simulation (MCS) and adaptive experimental designs (EDs), it is shown that, although good results are obtained for both types of ANNs, deep ANNs usually outperform the shallow ones. [DOI: 10.1115/1.4047636]

Keywords: structural reliability, metamodels, surrogate models, artificial neural networks, deep neural networks

1 Introduction

Reliability analysis of real structural engineering systems is still a computationally demanding task. Although in some cases failure probabilities can be estimated at acceptable computational costs by using approximate methods such as the first- and second-order reliability methods (FORM and SORM), many times more demanding approaches, such as Monte Carlo simulation (MCS) and other sampling-based methods, are the only feasible alternatives. In these cases, surrogate models, also known as metamodels, have been widely employed as an attempt to keep the computational effort acceptable.

The basic idea of surrogate modeling for reliability analysis purposes is usually to replace the true, time-consuming limit state function by an approximation. In the literature, many different surrogate models have been applied to structural reliability analysis, for example: the response surface method [1], kriging [2], polynomial chaos expansions [3], and artificial neural networks (ANNs) [4,5]. The present paper focuses on ANNs.

A large number of applications of ANNs in the field of structural reliability is also available in the literature, as can be seen in the review paper by Chojaczyk et al. [6] and in many other references [7-9]. However, the vast majority of them, if not all, employ only the so-called shallow neural networks, which are those with just one hidden layer. The potential of deep neural networks, those with two or more hidden layers, in structural reliability is still to be explored, although these ANNs have been attracting a lot of research interest in many areas over the last years. In the context of structural engineering, a few papers with applications of deep ANNs may already be found in the literature [10,11].

As pointed out by Schmidhuber [12], it is not clear in the literature at which problem depth shallow learning ends and deep learning begins. An attempt to define shallow and deep ANNs is presented in Ref. [13], where it is said that deep architectures are composed of multiple levels of nonlinear operations. In the present paper, however, a simpler definition is adopted: shallow networks are those with just a single hidden layer. This definition is presented, for example, in Ref. [14].

Considering that many recent publications have shown advantages of deep ANNs over shallow ones [15,16], this paper presents a comparison between them in the context of structural reliability. To do so, a previously proposed adaptive ANN procedure [5], which aimed at shallow networks, is simplified, extended to the case of deep ones, and employed in the solution of four benchmark structural reliability problems.

It is noteworthy that most of the surrogate models found in the literature, including shallow neural networks, suffer from what is usually known as the curse of dimensionality [17-19]. This basically means that the surrogates rapidly lose their efficiency as the number of dimensions of the problem increases. However, recent developments in the area of deep learning have been leading to theoretical guarantees that deep neural networks can avoid the curse of dimensionality for some types of problems [20,21]. Dimensionality issues are not directly investigated herein, but this is another reason to consider the application of deep ANNs in the context of structural reliability, especially because it is common to find structural reliability problems with high dimensionality.

The fact that different layers of deep ANNs may have different roles, or in other words that different layer types with different goals may be employed [22,23], could also lead to advantages of these ANNs over the shallow ones. In the case of system reliability, for example, the first hidden layer could try to separate the different failure modes in such a way that each group of neurons of the next layers would be responsible for approximating one specific limit state function.

The remainder of this paper is organized as follows. In Sec. 2, some basic concepts related to structural reliability and Monte Carlo simulation are presented. Section 2 also presents a brief discussion about why the computational cost may become prohibitive and points out some alternatives to overcome this. Section 3 describes the artificial neural networks considered herein, as well as the adaptive procedure employed for the shallow and deep ANNs. Section 4 presents results obtained for the numerical examples and some discussions about these results. Finally, some concluding remarks are drawn in Sec. 5.

Manuscript received November 28, 2019; final manuscript received June 9, 2020; published online July 17, 2020. Assoc. Editor: Gilberto Francisco Martha de Souza.

2 Structural Reliability

Let X be a vector of random variables, which represents all random or uncertain parameters of a structural system, and let x be a vector of realizations of these random variables. The boundary between desirable and undesirable structural responses is defined by limit state functions, g(X), in such a way that the failure and safe domains, Ωf and Ωs, respectively, are given by

$$\Omega_f = \{\,\mathbf{x} \mid g(\mathbf{x}) \le 0\,\}, \qquad \Omega_s = \{\,\mathbf{x} \mid g(\mathbf{x}) > 0\,\} \tag{1}$$

Each limit state describes one possible failure mode of the structure. The probability of undesirable structural responses for each failure mode, usually known as the probability of failure, is defined as

$$P_f = P[\mathbf{X} \in \Omega_f] = \int_{\Omega_f} f_{\mathbf{X}}(\mathbf{x})\, d\mathbf{x} \tag{2}$$

where fX(x) is the joint probability density function of vector X. Equation (2) may also be employed to compute failure probabilities of structural systems; in this case, Ωf must be defined as a combination of all limit state functions involved.

The multidimensional integral in Eq. (2) may be solved by means of structural reliability methods such as FORM, SORM, and MCS. These methods are described, for example, in Refs. [24] and [25].

When simple MCS is employed, failure probabilities are estimated via Eq. (3). In this case, nMC samples of X are randomly generated according to the joint distribution, fX(x), and a so-called indicator function, I[x], which is equal to one if x belongs to the failure domain and zero otherwise, is considered. Application of Eq. (3) requires one limit state function evaluation per sample, and large numbers of samples are necessary when dealing with small failure probabilities. As engineering structures usually present very small failure probabilities, the computational burden easily becomes prohibitive:

$$P_f = E[I[\mathbf{X}]] \cong \frac{1}{n_{MC}} \sum_{i=1}^{n_{MC}} I[\mathbf{x}_i] \tag{3}$$
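As a concrete illustration of Eq. (3), the following Python sketch (an illustration added in this rewrite; the paper's own computations rely on MATLAB) estimates Pf for a toy linear limit state by crude MCS:

```python
import numpy as np

def crude_mcs(g, sample_X, n_mc=10**6, seed=0):
    """Estimate Pf = E[I[X]] via Eq. (3): the fraction of samples with g(x) <= 0."""
    rng = np.random.default_rng(seed)
    x = sample_X(rng, n_mc)                  # n_mc samples of the random vector X
    indicator = g(x) <= 0.0                  # I[x] = 1 in the failure domain, 0 otherwise
    pf = indicator.mean()
    cov = np.sqrt((1.0 - pf) / (pf * n_mc))  # coefficient of variation of the estimator
    return pf, cov

# Toy example: g(x) = 3 - x1 - x2 with independent standard normal variables,
# so the exact Pf is Phi(-3/sqrt(2)), about 1.7e-2.
pf, cov = crude_mcs(
    g=lambda x: 3.0 - x[:, 0] - x[:, 1],
    sample_X=lambda rng, n: rng.standard_normal((n, 2)),
)
print(f"Pf ~ {pf:.3e} (c.o.v. ~ {cov:.1%})")
```

The c.o.v. line makes the cost problem explicit: for a fixed relative accuracy, the required number of samples grows roughly as 1/Pf.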
In the literature, many methods have been proposed to reduce the number of samples required by MCS to achieve a given accuracy. These methods include, but are not limited to, importance sampling [26], asymptotic sampling [27], and subset simulation [28].

Another approach which has been drawing a lot of attention from researchers over the last years is the one based on surrogate models [2,9,29-31]. In this case, a common approach consists of replacing as many evaluations of the time-consuming limit state function as possible by evaluations of a sufficiently accurate surrogate model, which presents smaller computational costs. Most of the time, the true model is evaluated on a number of points, which constitute the so-called experimental design (ED), and the surrogate is constructed by using this information. The fact that the choice of these points has a significant impact on the accuracy of the metamodel has paved the way for the development of a number of adaptive strategies for EDs, such as the one employed in the present paper. In these strategies, points are included in the ED in an iterative manner, trying to cover the most important regions of the domain. Identification of these regions takes into account probability densities as well as the accuracy of the limit state function approximation [3,29,31,32].

3 Artificial Neural Networks and Adaptive Designs

3.1 Artificial Neural Networks. Artificial neural networks were introduced by McCulloch and Pitts [33] based on a simplified analogy to the nervous system and have significantly evolved ever since. Most of the recent developments on ANNs are associated with the area known as deep learning.

In ANNs, information is processed by small processing units, corresponding to the neurons, mathematically represented by simple functions which are usually called activation functions. The processing units communicate with each other by means of weighted connections corresponding to the synapses of the brain [18]. Different networks can be constructed by choosing different numbers of neuron layers, the type and number of neurons in each layer, and the type of connection between neurons.

The most widely used network type for approximation problems, which is adopted herein, is the multilayer perceptron (MLP, see Ref. [18]). MLP networks are built with one input layer with one neuron for each input parameter, one output layer with one neuron for each output parameter, and an arbitrary number of hidden layers, nhidden, with arbitrary numbers of (hidden) neurons, nneurons. In the present paper, following some references from the literature (for example, Ref. [14]), the ANN is classified as shallow if it has just one hidden layer and as deep otherwise.

In feedforward ANNs, the neurons of one layer are connected with each neuron of the previous layer, but information only flows in the forward direction, from the input toward the output layer.

The type of neuron in each layer is defined by the chosen activation function. Linear and sigmoid functions are usual choices, although the literature is filled with many different types of activation functions. In this paper, linear activation functions are used for the input and output layers. For the hidden layers, two different functions are tested for both types of ANNs: the tangent-sigmoid (tansig), very common in the context of shallow ANNs, and the rectified linear unit (ReLU), which is a common choice for deep ANNs.

For a given configuration and a given dataset, the so-called training of the network consists of adjusting its parameters in such a way that its performance is improved. In other words, during the training of the network its parameters are modified so that the differences between known outputs and outputs provided by the ANN (the error) are reduced. Each iteration of the training process is called an epoch and, if a better approximation is required for some regions of the output space, the error to be reduced may be weighted by multiplying it component-wise by a vector of weights, eW.

In the present paper, the Levenberg-Marquardt training method [34] is employed, which is a common choice for shallow networks, and the mean-squared error is used as the performance function. Although, for deep ANNs, training algorithms such as the adaptive moment estimation method [35] have shown promising results, they usually aim at problems with large amounts of data, which is hardly the case for structural reliability problems. Thus, the Levenberg-Marquardt training method seems to be a good choice also for the deep networks in the context of the present paper. In fact, the Levenberg-Marquardt method led to better results than the adaptive moment estimation method when both were briefly compared considering the problems studied herein. However, a better tuning of the hyperparameters of the adaptive moment estimation method, in the context of reliability problems, could still be pursued in future studies.

The MATLAB neural network toolbox [36] is employed herein. Further details about ANNs can be obtained, for example, in Ref. [18].

3.2 Adaptive Artificial Neural Networks for Structural Reliability Analysis. The adaptive ANN procedure applied in this paper for both shallow and deep networks is similar to the one proposed for shallow networks in Ref. [5].

When using surrogate models for limit state function approximation in reliability analysis, an experimental design is employed to construct the approximation. The ED consists of nED points $\{\mathbf{x}_{ED}^{(1)}, \mathbf{x}_{ED}^{(2)}, \ldots, \mathbf{x}_{ED}^{(n_{ED})}\}$, with $\mathbf{x}_{ED}^{(i)} \in \mathbb{R}^n$, and the respective function evaluation values, $y_{ED}^{(i)} = g(\mathbf{x}_{ED}^{(i)}) \in \mathbb{R}$. After that, if Monte Carlo simulation is employed, Eq. (3) is solved by using the metamodel to evaluate the indicator function for each one of the nMC samples. The true, supposedly time-consuming, limit state function needs to be evaluated only nED times.

One efficient way to construct the metamodel is by making use of active learning and adaptive EDs, allowing the surrogate to be refined and adapted during the analysis. The adaptive procedure used herein (see Ref. [5]) is based on the paper by Echard et al. [29], which addresses kriging, and on Ref. [32], which deals with polynomial chaos expansions. In this procedure, many surrogates are used simultaneously, both to compute the failure probabilities and to determine which points should be added to the ED in order to improve the failure probability estimates.

The algorithm consists of seven stages (a code sketch of the complete loop is given below):

(1) Generation of a Monte Carlo population: a population of nMC samples of X is randomly generated according to fX(x).
(2) Definition of the initial experimental design: the initial ED comprises nED points selected from the population and includes the respective limit state function evaluations.
(3) Initialization of the ANNs: a total of B ANNs are generated and trained considering the initial ED.
(4) Training of the ANNs: the ANNs are trained again, using the current ED.
(5) Prediction by the ANNs and estimation of the probability of failure: ANN predictions are obtained for the entire population. Then, a failure probability estimate is obtained for each surrogate, b, by dividing the number of points with a negative or null ANN prediction by nMC (Eq. (4)). Remember that the points for which the limit state function results negative or null are those which correspond to failures:

$$P_f^{(b)} = \frac{n_{y_{ANN} \le 0}}{n_{MC}}, \qquad b = 1, 2, \ldots, B \tag{4}$$

(6) Evaluation of the convergence criterion: a convergence criterion based on $P_f^{(b)}$ is evaluated. If the criterion is met, the algorithm ends and the Pf to be returned is the average of the probabilities given by Eq. (4); otherwise, the algorithm continues.
(7) Identification of the points to be included in the ED: a suitable learning function is evaluated on the population. One or more points belonging to the population are chosen, according to their learning function values, and added to the ED. The algorithm returns to step 4.

Some details related to the algorithm are presented in the following.
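Before those details, the overall loop can be summarized in code. The sketch below is a minimal Python stand-in for the paper's MATLAB implementation: scikit-learn's MLPRegressor replaces the Levenberg-Marquardt-trained MLPs, the initial ED is drawn randomly instead of by the farthest-apart rule of Sec. 3.2.1, and the enrichment uses the surrogate vote count directly instead of the clustered UFBR scheme of Sec. 3.2.5. All names and simplifications are this sketch's assumptions, not the authors' code.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def adaptive_ann_pf(g, population, n_ed=50, n_add=3, B=10, tol=0.005, seed=0):
    rng = np.random.default_rng(seed)
    # Stages 1-2: Monte Carlo population (passed in) and initial ED.
    ed_idx = rng.choice(len(population), n_ed, replace=False)
    x_ed, y_ed = population[ed_idx], g(population[ed_idx])
    # Stage 3: B surrogates with different hidden-layer configurations.
    anns = [MLPRegressor(hidden_layer_sizes=(b % 5 + 1,) * (b % 2 + 1),
                         max_iter=2000, random_state=b) for b in range(B)]
    while True:
        # Stage 4: (re)train every surrogate on the current ED.
        for ann in anns:
            ann.fit(x_ed, y_ed)
        # Stage 5: one Pf estimate per surrogate, Eq. (4).
        preds = np.array([ann.predict(population) for ann in anns])
        pf_b = (preds <= 0.0).mean(axis=1)
        pf_hat = pf_b.mean()
        # Stage 6: stability-based convergence criterion, Eq. (6).
        if (pf_b.max() - pf_b.min()) <= tol * pf_hat:
            return pf_hat, len(x_ed)
        # Stage 7: enrich the ED with the most ambiguously classified points
        # (simplified stand-in for the U_FBR / k-means scheme of Sec. 3.2.5).
        disagreement = np.abs((preds > 0).sum(axis=0) - (preds <= 0).sum(axis=0))
        new_idx = np.argsort(disagreement)[:n_add]
        x_ed = np.vstack([x_ed, population[new_idx]])
        y_ed = np.concatenate([y_ed, g(population[new_idx])])
```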
3.2.1 Monte Carlo Population and the Initial Experimental Design. The uncertainty associated with the structural reliability problem is addressed by the generation of a Monte Carlo population, as briefly described in the first stage of the algorithm. This population is used whenever estimates of failure probabilities are necessary. Also, the points which comprise the initial ED are chosen from the population.

To improve the space-filling properties of the initial ED, selection is performed by means of a deterministic algorithm which tries to find a subset of nED farthest-apart samples of the population, considering sums of Euclidean distances. To do so, first the sample closest to the mean of the population is selected. After that, an iterative process is adopted to select the remaining nED - 1 points: in each iteration, the sample farthest from those already included in the ED is chosen and added to the ED.

Note that the deterministic selection of a farthest-apart subset from the population largely removes randomness from the initial ED, facilitating the construction of the surrogate model.

3.2.2 Initialization of the Artificial Neural Networks. The total of B ANNs is divided into B/2 groups, each group with a different number of hidden neurons. A minimum value for nhidden is assigned to the first group, related to one neuron per hidden layer. Each group has one hidden neuron more than the previous one, and the neurons are distributed as equally as possible among the hidden layers, with priority given to the first layers. For ANNs with two hidden layers, for example, the numbers of hidden neurons for the first three groups would be [1 1], [2 1], and [2 2], respectively.

Initialization of backpropagation networks is usually performed by the Nguyen-Widrow method [37], which involves a certain degree of randomness. It is common to initialize the network many times, to try to avoid getting stuck in local minima. For this reason, for each group, a total of ten networks are initialized and trained for a maximum of 100 epochs. The ANN presenting the best performance is chosen. After that, all ANNs of the group are initially defined as copies of the respective chosen neural network.

At this stage, and during the entire process, 80% of the ED is used for training and 20% for validation, in an attempt to avoid overfitting. If the validation performance does not improve during ten consecutive iterations, the training is stopped. As the training and validation datasets are randomly chosen from the ED at the beginning of each training step, all surrogates turn out different after some training is performed. Note that, in an attempt to keep the dataset small, and also considering that the ANNs are updated in an iterative manner, no data are used herein for testing of the ANNs.
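Two of the deterministic ingredients just described are easy to make precise in code. The following NumPy sketch (again an illustration under this rewrite's assumptions, with invented function names) implements the farthest-apart selection of Sec. 3.2.1 and the layer-wise neuron allocation of Sec. 3.2.2:

```python
import numpy as np

def select_initial_ed(population, n_ed):
    """Deterministic farthest-apart subset (Sec. 3.2.1): start from the sample
    closest to the population mean, then greedily add the sample whose sum of
    Euclidean distances to the already selected points is largest."""
    mean = population.mean(axis=0)
    selected = [int(np.linalg.norm(population - mean, axis=1).argmin())]
    dist_sum = np.zeros(len(population))
    for _ in range(n_ed - 1):
        dist_sum += np.linalg.norm(population - population[selected[-1]], axis=1)
        dist_sum[selected] = -np.inf  # never re-select a chosen sample
        selected.append(int(dist_sum.argmax()))
    return np.asarray(selected)

def hidden_layout(group, n_layers):
    """Neuron allocation of Sec. 3.2.2: group g uses n_layers + g - 1 hidden
    neurons in total, spread as evenly as possible with priority to the first
    layers; e.g., for two hidden layers, groups 1-3 give [1 1], [2 1], [2 2]."""
    total = n_layers + group - 1
    base, extra = divmod(total, n_layers)
    return [base + 1 if i < extra else base for i in range(n_layers)]

print(hidden_layout(2, 2))  # [2, 1]
```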
 
3.2.3 Training of the Artificial Neural Networks. Training of the ANNs must consider the need to adapt the numbers of hidden neurons, as well as the fact that an ANN may get trapped in local minima. For this reason, a scheme consisting of two steps is used.

In the replacement step, which is applied from the second iteration onward, the nREP ANNs with error performance greater than a certain limit are replaced one by one by the nREP best ANNs. This decreases the diversity of the ANNs but accelerates convergence. The limit is taken as the minimum of all error performances plus 1.5 times the standard deviation of the performances. Also, nREP is always taken as greater than or equal to one and smaller than or equal to B/2, so that at least one and at most half of the ANNs are replaced per iteration.

The training step, on the other hand, consists of three substeps. First, the ANN is trained and its error performance is computed. If there is no improvement, random perturbations of up to ±10% are applied to the weights and biases of the ANN, and it is trained again. If there is still no improvement, random perturbations of up to ±1% are applied, and the ANN is trained one last time.

At this stage, the maximum number of training epochs is given by Eq. (5), where nEDini is the initial size of the ED and nepochsADD = 5 is the number of epochs to be added per point included in the ED. The number of epochs is increased as the dataset grows:

$$n_{epochs} = n_{epochsADD} \, (n_{ED} - n_{EDini}) + n_{EDini} \tag{5}$$

3.2.4 Convergence Criteria. The convergence criterion chosen was developed by Schöbi et al. [30], based on the stability of the estimated failure probability, $\hat{P}_f$, at the current iteration. It is given by

$$\frac{\max_b P_f^{(b)} - \min_b P_f^{(b)}}{\hat{P}_f} \le \epsilon_{\hat{P}_f}, \qquad b = 1, 2, \ldots, B \tag{6}$$

where the tolerance, $\epsilon_{\hat{P}_f}$, is taken as 0.5%.

3.2.5 Learning Function and Enrichment of the Experimental Design. The learning function adopted herein is the one proposed by Marelli and Sudret [32], related to the misclassification probability of the population samples and based on the fraction of failed bootstrap replicates:

$$U_{FBR}(\mathbf{x}^{(i)}) = \frac{\left| B_s(\mathbf{x}^{(i)}) - B_f(\mathbf{x}^{(i)}) \right|}{B} \tag{7}$$

where $B_s(\mathbf{x}^{(i)})$ and $B_f(\mathbf{x}^{(i)})$ are the numbers of surrogates which identify the sample $\mathbf{x}^{(i)}$ as being in the safe and in the failure regions, respectively. If $U_{FBR}(\mathbf{x}^{(i)}) = 1$, the classification of the sample is the same for all surrogates; if $U_{FBR}(\mathbf{x}^{(i)})$ is close to zero, the classifications of $\mathbf{x}^{(i)}$ differ among the surrogates and this point should be added to the ED.

In order to add nADD points to the ED at each iteration, the population is clustered into nADD different regions by using the k-means clustering method [38]. Each time the enrichment of the ED takes place, UFBR is evaluated on the entire population and one point of each cluster is selected, among those presenting the smallest values of UFBR.
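Assuming the B surrogate predictions over the population are collected in an array, the enrichment step could be sketched as follows, with scikit-learn's KMeans standing in for the k-means implementation; the function name enrich_ed is hypothetical:

```python
import numpy as np
from sklearn.cluster import KMeans

def enrich_ed(population, preds, n_add, rng_state=0):
    """Pick n_add new ED points: cluster the population into n_add regions and,
    in each cluster, take the point with the smallest U_FBR value (Eq. (7))."""
    B = preds.shape[0]
    B_f = (preds <= 0.0).sum(axis=0)   # surrogates voting "failure"
    B_s = B - B_f                      # surrogates voting "safe"
    u_fbr = np.abs(B_s - B_f) / B      # Eq. (7): 1 means full agreement
    labels = KMeans(n_clusters=n_add, n_init=10,
                    random_state=rng_state).fit_predict(population)
    chosen = [np.flatnonzero(labels == c)[u_fbr[labels == c].argmin()]
              for c in range(n_add)]
    return np.asarray(chosen)

# Usage: preds has shape (B, n_MC); population has shape (n_MC, n).
# new_idx = enrich_ed(population, preds, n_add=3)
```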

3.2.6 Error Weights and Scaling/Transformation of the Data. The data presented to the ANNs are scaled using the mapminmax MATLAB function before any training or evaluation takes place. Scaling of the input data considers the maximum and minimum values of each random variable, computed from the population; for the output data, an interval defined by the maximum absolute value of the limit state function, $[-y_{max},\, y_{max}]$, is considered, where $y_{max} = \max\!\left(\mathrm{abs}\left(\{y_{ED}^{(1)}, y_{ED}^{(2)}, \ldots, y_{ED}^{(n_{ED})}\}\right)\right)$.

Another important aspect related to the application of ANNs to reliability analysis concerns the fact that the most important points of the ED are those for which the limit state function value is closest to zero, since they are also the most difficult points to classify as belonging to the failure domain or to the safe domain. For this reason, error weights, given by $e_W^{(i)} = \min\!\left(1/\mathrm{abs}(y_{ED}^{(i)}),\, 10^5\right)$, are applied in the computations of error performances during the entire process.
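Both operations are straightforward to reproduce outside MATLAB. The NumPy sketch below mirrors the min-max scaling to [-1, 1] (the default output range of mapminmax) and the error weights defined above; it is an illustration of this rewrite, not the authors' code:

```python
import numpy as np

def scale_inputs(x, x_min, x_max):
    """Min-max scaling of each input to [-1, 1], the default range of MATLAB's
    mapminmax; x_min/x_max are taken over the Monte Carlo population."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def scale_outputs(y_ed):
    """Scale outputs with the symmetric interval [-y_max, y_max], so that the
    limit state g = 0 is mapped exactly to 0."""
    y_max = np.abs(y_ed).max()
    return y_ed / y_max, y_max

def error_weights(y_ed, cap=1e5):
    """e_W(i) = min(1/|y_ED(i)|, 1e5): points near the limit state g = 0 get
    the largest weights in the weighted error (an exact zero hits the cap)."""
    return np.minimum(1.0 / np.abs(y_ed), cap)

y_ed = np.array([-2.0, -0.1, 0.05, 1.5])
print(error_weights(y_ed))  # [0.5, 10.0, 20.0, 0.667]
```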

4 Numerical Examples

In this section, shallow and deep neural networks are applied to solve four benchmark reliability problems. Results are obtained by considering tangent-sigmoid versus rectified linear unit hidden layers, with up to five hidden layers. In each case, results are obtained for five runs of the algorithm, using five different seeds for the random number generator, and presented in terms of averages, maxima, and minima. In each run, the same seed is considered for both shallow and deep ANNs, so that they use the same chain of random numbers. For comparison purposes, results obtained by simple MCS are also shown.

For all examples, the size of the initial ED and the number of samples to be added to the ED at each enrichment step are kept constant, with nED = 50 and nADD = 3. Also, a fixed number of surrogates, B = 50, is adopted.

Comparison of the computational effort is performed by using the number of calls to the limit state function, nCLS, required by each method. Although the computational costs for training the ANNs are significant for these examples, the use of surrogates aims at problems where each evaluation of the limit state function is extremely time-consuming in comparison to the construction of the surrogate. This is not the case for the problems considered herein, which were chosen so that reference results could be computed by MCS. Even so, the computational times required to solve the problems are also presented. In all cases, each run was performed using single-thread computation on an Intel Core i7-860 CPU at 2.80 GHz.

4.1 Example 1: Series System With Four Branches. This example consists of a series system with four branches, originally proposed in Ref. [39] but also studied by Echard et al. [29], Marelli and Sudret [32], and other authors. Two standard normal distributed random variables, X1 and X2, are considered, and the limit state function reads

$$g(x_1, x_2) = \min \begin{Bmatrix} 3 + 0.1(x_1 - x_2)^2 - \dfrac{x_1 + x_2}{\sqrt{2}} \\[4pt] 3 + 0.1(x_1 - x_2)^2 + \dfrac{x_1 + x_2}{\sqrt{2}} \\[4pt] (x_1 - x_2) + \dfrac{6}{\sqrt{2}} \\[4pt] (x_2 - x_1) + \dfrac{6}{\sqrt{2}} \end{Bmatrix} \tag{8}$$

Results for this example are shown in Table 1 and Fig. 1. Note that the number of neurons given in Table 1, for example, is not an integer since it refers to the average over five runs.

Table 1 Example 1: results (averages over five runs)

Method        nhidden  Pf              nCLS     nneurons  Time (min)
ANN (Tansig)  1        4.452 x 10^-3   125      17.4      27.7
              2        4.456 x 10^-3   102      20.2      22.7
              3        4.458 x 10^-3   97       20.1      22.3
              4        4.457 x 10^-3   97       23.5      25.4
              5        4.457 x 10^-3   103      24.0      30.3
ANN (ReLU)    1        4.457 x 10^-3   145      20.6      30.6
              2        4.464 x 10^-3   124      23.5      29.8
              3        4.460 x 10^-3   132      22.3      33.2
              4        4.460 x 10^-3   116      25.3      29.9
              5        4.459 x 10^-3   121      25.7      32.3
MCS           -        4.458 x 10^-3   5 x 10^6 -         <1

Fig. 1 Difference between failure probabilities obtained by ANNs and by MCS (Example 1)

4.2 Example 2: Dynamic Response of a Nonlinear Oscillator. This example consists of a nonlinear undamped single degree-of-freedom system (Fig. 2), studied, for example, in Ref. [29]. The limit state function is defined by

$$g(c_1, c_2, m, r, t_1, F_1) = 3r - \left| \frac{2F_1}{m\,\omega_0^2} \sin\!\left(\frac{\omega_0 t_1}{2}\right) \right| \tag{9}$$

where $\omega_0 = \sqrt{(c_1 + c_2)/m}$. The parameters of the six random variables are shown in Table 2, where P.D.F. stands for probability density function.

Fig. 2 Nonlinear oscillator: system definition and applied load

Table 2 Example 2: random variables

Variable  P.D.F.  Mean  Standard deviation
m         Normal  1.0   0.05
c1        Normal  1.0   0.10
c2        Normal  0.1   0.01
r         Normal  0.5   0.05
F1        Normal  1.0   0.20
t1        Normal  1.0   0.20

Results for this example are presented in Table 3 and Fig. 3.

Table 3 Example 2: results (averages over five runs)

Method        nhidden  Pf              nCLS     nneurons  Time (min)
ANN (Tansig)  1        2.863 x 10^-2   110      9.7       2.3
              2        2.863 x 10^-2   107      13.0      2.5
              3        2.862 x 10^-2   102      14.0      2.7
              4        2.861 x 10^-2   100      16.0      2.9
              5        2.864 x 10^-2   94       19.8      3.2
ANN (ReLU)    1        2.869 x 10^-2   197      11.4      5.2
              2        2.868 x 10^-2   223      16.8      8.1
              3        2.865 x 10^-2   208      19.6      8.3
              4        2.875 x 10^-2   202      20.6      8.1
              5        2.862 x 10^-2   203      23.0      8.8
MCS           -        2.864 x 10^-2   7 x 10^4 -         <1

Fig. 3 Difference between failure probabilities obtained by ANNs and by MCS (Example 2)
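For reference, the limit state functions of Eqs. (8) and (9) translate directly into vectorized Python; the functions below are illustrations of this rewrite and can be passed to a crude-MCS routine such as the one sketched in Sec. 2:

```python
import numpy as np

def g_series_system(x):
    """Eq. (8): series system with four branches; x has shape (n, 2)."""
    x1, x2 = x[:, 0], x[:, 1]
    b1 = 3.0 + 0.1 * (x1 - x2) ** 2 - (x1 + x2) / np.sqrt(2.0)
    b2 = 3.0 + 0.1 * (x1 - x2) ** 2 + (x1 + x2) / np.sqrt(2.0)
    b3 = (x1 - x2) + 6.0 / np.sqrt(2.0)
    b4 = (x2 - x1) + 6.0 / np.sqrt(2.0)
    return np.minimum.reduce([b1, b2, b3, b4])

def g_oscillator(x):
    """Eq. (9): nonlinear oscillator; columns of x are c1, c2, m, r, t1, F1."""
    c1, c2, m, r, t1, F1 = x.T
    w0 = np.sqrt((c1 + c2) / m)
    return 3.0 * r - np.abs(2.0 * F1 / (m * w0 ** 2) * np.sin(w0 * t1 / 2.0))

# Example: Pf of the four-branch system with standard normal X1, X2.
rng = np.random.default_rng(0)
samples = rng.standard_normal((10**6, 2))
print((g_series_system(samples) <= 0).mean())  # about 4.45e-3, cf. Table 1
```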

4.3 Example 3: High-Dimensional Problem. This example was proposed by Rackwitz [40] and also studied by Echard et al. [29]. The limit state function is given by Eq. (10). Note that, for this limit state function, the number of variables can be easily changed without significantly modifying the level of the failure probability:

$$g(x_1, x_2, \ldots, x_n) = n + 3\sigma\sqrt{n} - \sum_{i=1}^{n} x_i \tag{10}$$

The random variables are taken as lognormally distributed, with unit means and standard deviation σ = 0.2. The results are obtained herein for n = 40, and shown in Table 4 and Fig. 4.
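Thanks to the closed-form limit state and simple marginals, this example is compact to set up. The sketch below (an illustration of this rewrite) converts the prescribed mean and standard deviation to the parameters of the underlying normal distribution and estimates the reference Pf by crude MCS:

```python
import numpy as np

def g_high_dim(x, sigma=0.2):
    """Eq. (10) with n taken from the sample dimension; x has shape (n_mc, n)."""
    n = x.shape[1]
    return n + 3.0 * sigma * np.sqrt(n) - x.sum(axis=1)

# Lognormal marginals with unit mean and standard deviation 0.2: convert the
# (mean, std) pair to the parameters of the underlying normal distribution.
m, s = 1.0, 0.2
sig_ln = np.sqrt(np.log(1.0 + (s / m) ** 2))
mu_ln = np.log(m) - 0.5 * sig_ln ** 2

rng = np.random.default_rng(0)
x = rng.lognormal(mu_ln, sig_ln, size=(10**6, 40))
print((g_high_dim(x) <= 0).mean())  # about 2.0e-3 for n = 40, cf. Table 4
```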

Table 4 Example 3: results (averages over five runs)

Method        nhidden  Pf              nCLS     nneurons  Time (min)
ANN (Tansig)  1        2.016 x 10^-3   121      6.2       12.5
              2        2.017 x 10^-3   154      15.8      16.4
              3        2.016 x 10^-3   123      15.0      11.9
              4        2.015 x 10^-3   102      17.3      10.5
              5        2.016 x 10^-3   88       15.2      8.9
ANN (ReLU)    1        2.016 x 10^-3   73       4.2       5.3
              2        2.016 x 10^-3   73       6.1       4.6
              3        2.016 x 10^-3   74       10.2      4.5
              4        2.016 x 10^-3   74       12.8      4.4
              5        2.016 x 10^-3   75       18.0      4.7
MCS           -        2.016 x 10^-3   3 x 10^5 -         <1

Fig. 4 Difference between failure probabilities obtained by ANNs and by MCS (Example 3)

4.4 Example 4: Two-Dimensional Truss Structure. The last example consists of a finite-element model of a 23-bar truss structure, subject to random loads P1, P2, ..., P6 (Fig. 5). This example was presented by Lee and Kwak [41] and studied in the literature, for example, in Refs. [3], [30], and [32].

Fig. 5 A 23-bar truss structure

Ten independent random variables are considered, as summarized in Table 5. The vector of random variables is given by X = {E1, E2, A1, A2, P1, P2, P3, P4, P5, P6}. It is assumed that all horizontal members have perfectly correlated Young's moduli, E1, and cross-sectional areas, A1; the same is assumed for the diagonal members, whose Young's moduli and cross-sectional areas are represented by E2 and A2, respectively.
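Only the probabilistic input of this example is easy to reproduce without the finite-element model. The sketch below is a NumPy illustration of this rewrite (standard moment-matching formulas, not the authors' code) that draws the mixed lognormal/Gumbel population summarized in Table 5, shown next; coupling it with a truss FE solver for vmax would complete the analysis.

```python
import numpy as np

def sample_truss_population(n_mc, seed=0):
    """Draw the ten random variables of Table 5 (columns: E1, E2, A1, A2, P1..P6)."""
    rng = np.random.default_rng(seed)

    def lognormal(mean, std, size):
        # Convert (mean, std) of the lognormal variable to the parameters
        # of the underlying normal distribution (moment matching).
        sig2 = np.log(1.0 + (std / mean) ** 2)
        return rng.lognormal(np.log(mean) - 0.5 * sig2, np.sqrt(sig2), size)

    def gumbel(mean, std, size):
        # Moment matching for the Gumbel (max) distribution:
        # scale = std*sqrt(6)/pi, location = mean - Euler_gamma*scale.
        scale = std * np.sqrt(6.0) / np.pi
        return rng.gumbel(mean - np.euler_gamma * scale, scale, size)

    E = lognormal(2.1e11, 2.1e10, (n_mc, 2))   # E1, E2 (Pa)
    A1 = lognormal(2.0e-3, 2.0e-4, (n_mc, 1))  # A1 (m^2)
    A2 = lognormal(1.0e-3, 1.0e-4, (n_mc, 1))  # A2 (m^2)
    P = gumbel(5.0e4, 7.5e3, (n_mc, 6))        # P1..P6 (N)
    return np.hstack([E, A1, A2, P])

x = sample_truss_population(10**5)
print(x.shape)  # (100000, 10)
```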
Table 5 Example 4: random variables

Variable       P.D.F.     Mean          Standard deviation
E1, E2 (Pa)    Lognormal  2.1 x 10^11   2.1 x 10^10
A1 (m^2)       Lognormal  2.0 x 10^-3   2.0 x 10^-4
A2 (m^2)       Lognormal  1.0 x 10^-3   1.0 x 10^-4
P1, ..., P6 (N)  Gumbel   5.0 x 10^4    7.5 x 10^3

The limit state adopted is related to the allowable vertical displacement at midspan, and the structural displacements are obtained by means of linear elastic analyses. The limit state equation is given by

$$g(\mathbf{x}) = v_{Adm} - |v_{max}| \tag{11}$$

where vmax is the vertical displacement at midspan and vAdm is the admissible maximal deflection.

Results for two values of vAdm, 0.10 m and 0.12 m, respectively, are presented in Table 6 and Fig. 6.

Table 6 Example 4: results (averages over five runs)

vAdm (m)  Method        nhidden  Pf              nCLS     nneurons  Time (min)
0.10      ANN (Tansig)  1        4.307 x 10^-2   126      7.6       26.0
                        2        4.308 x 10^-2   123      14.5      27.0
                        3        4.307 x 10^-2   144      13.6      29.4
                        4        4.307 x 10^-2   141      16.2      30.1
                        5        4.308 x 10^-2   127      16.8      29.2
          ANN (ReLU)    1        4.317 x 10^-2   199      11.4      34.9
                        2        4.306 x 10^-2   235      17.6      42.3
                        3        4.300 x 10^-2   222      19.6      40.9
                        4        4.301 x 10^-2   200      21.4      38.5
                        5        4.268 x 10^-2   181      22.0      36.3
          MCS           -        4.309 x 10^-2   1 x 10^6 -         12.5
0.12      ANN (Tansig)  1        1.490 x 10^-3   154      8.2       29.0
                        2        1.480 x 10^-3   175      11.2      32.8
                        3        1.486 x 10^-3   181      13.4      35.5
                        4        1.487 x 10^-3   171      15.8      34.8
                        5        1.487 x 10^-3   167      18.8      35.7
          ANN (ReLU)    1        1.488 x 10^-3   283      8.8       43.7
                        2        1.480 x 10^-3   263      16.0      44.5
                        3        1.481 x 10^-3   293      14.6      49.4
                        4        1.486 x 10^-3   251      16.4      43.4
                        5        1.480 x 10^-3   262      24.6      49.9
          MCS           -        1.490 x 10^-3   1 x 10^6 -         13.1

Fig. 6 Difference between failure probabilities obtained by ANNs and by MCS (Example 4): (a) vAdm = 0.10 m and (b) vAdm = 0.12 m

4.5 Discussions. In most cases, the results obtained by the ANNs were very good, with differences from the reference values usually far below 1%. The nCLS required by the ANNs were comparable to, and in some cases smaller than, the ones required by other surrogate models using similar adaptive approaches, as can be seen in the literature [29,30,32]. Also, as expected, the ANNs required far fewer calls to the limit state function than crude MCS.

A comparison of the ANNs in terms of activation functions shows that only in the third example, where the limit state function is linear in the random variables, did the ReLU ANNs lead to better results. Note that the limit state functions for the first and second problems are clearly nonlinear in the random variables, as seen in Eqs. (8) and (9). Although in the last example vmax is computed via linear elastic structural analyses, the limit state function is also nonlinear in the random variables. By comparing failure probabilities computed by the first-order reliability method and by Monte Carlo simulation, as done in Ref. [30], it is possible to verify the degree of nonlinearity of the resulting limit state equation in the region which contributes the most to the failure probability. For vAdm = 0.10 m, for example, the failure probabilities obtained by FORM and MCS, where the latter was performed considering one million samples, were 2.81 x 10^-2 and 4.29 x 10^-2, respectively [30].

Finally, in the large majority of the cases, the best results were achieved by ANNs with more than one hidden layer. In most cases, the deep ANNs required fewer calls to the limit state function and achieved better accuracy than the shallow ones, although more neurons were necessary. In terms of computational times, the deep ANNs tend to compensate for the additional effort related to more neurons with faster convergence. As a result, the shallow ANNs were less demanding most of the time, as expected, but overall the computational times were similar. It is also possible to note that the computational times required to solve the examples considered herein by using ANNs were higher than those required to solve the same problems by MCS. This emphasizes the fact that the use of surrogate models is only justifiable when the true model is associated with a sufficiently high computational effort. Nevertheless, the adaptive procedure employed herein requires a number of ANNs to select the points to be added to the experimental design. If this procedure were replaced by one which alleviates this necessity, the computational time of the adaptive ANN procedure could be dramatically reduced.

5 Conclusions

In this paper, shallow and deep artificial neural networks were applied as metamodels in structural reliability analysis, using adaptive experimental designs. Their performances were compared by considering four benchmark problems from the literature. The results indicated that, although the shallow ANNs are the ones commonly employed in the solution of this kind of problem, the deep ANNs usually outperform them. Nevertheless, the performances of the shallow ANNs were also acceptable in comparison with other metamodels from the literature. Since the present paper compared only two different activation functions for the neurons on the hidden layers, used a simple scheme to determine the number of hidden neurons, and focused on MLP feedforward networks, other activation functions, methods to adapt the number of hidden neurons, and neural network architectures should be studied in future works.

The deep ANNs, in particular, have a great potential to be further explored, since their complexity allows for more refined adaptations to the problem at hand, although it may be difficult to perform such adaptations. The fact that different layers may have different roles, for example, could possibly lead to better results with even fewer evaluations of the limit state function, or at least to a more flexible neural network which can be more easily adapted to changes in the limit state function, for example, during a parametric analysis or structural optimization.

Funding Data

- Brazilian National Council for Scientific and Technological Development (CNPq) (Grant No. 302489/2017-7; Funder ID: 10.13039/501100003593).

References

[1] Soares, R. C., Mohamed, A., Venturini, W. S., and Lemaire, M., 2002, "Reliability Analysis of Non-Linear Reinforced Concrete Frames Using the Response Surface Method," Reliab. Eng. Syst. Saf., 75(1), pp. 1-16.
[2] Dubourg, V., Sudret, B., and Deheeger, F., 2013, "Metamodel-Based Importance Sampling for Structural Reliability Analysis," Probabilist. Eng. Mech., 33, pp. 47-57.
[3] Blatman, G., and Sudret, B., 2010, "An Adaptive Algorithm to Build Up Sparse Polynomial Chaos Expansions for Stochastic Finite Element Analysis," Probabilist. Eng. Mech., 25(2), pp. 183-197.
[4] Gomes, W. J. S., and Beck, A. T., 2013, "Global Structural Optimization Considering Expected Consequences of Failure and Using ANN Surrogates," Comput. Struct., 126, pp. 56-68.
[5] Gomes, W., 2019, "Structural Reliability Analysis Using Adaptive Artificial Neural Networks," ASME J. Risk Uncertainty Part B, 5(4), p. 041004.
[6] Chojaczyk, A. A., Teixeira, A. P., Neves, L. C., Cardoso, J. B., and Soares, C. G., 2015, "Review and Application of Artificial Neural Networks Models in Reliability Analysis of Steel Structures," Struct. Saf., 52, pp. 78-89.
[7] Gomes, H. M., and Awruch, A. M., 2004, "Comparison of Response Surface and Neural Network With Other Methods for Structural Reliability Analysis," Struct. Saf., 26(1), pp. 49-67.
[8] Bucher, C., and Most, T., 2008, "A Comparison of Approximate Response Functions in Structural Reliability Analysis," Probabilist. Eng. Mech., 23(2-3), pp. 154-163.
[9] Kroetz, H. M., Tessari, R. K., and Beck, A. T., 2017, "Performance of Global Metamodeling Techniques in Solution of Structural Reliability Problems," Adv. Eng. Software, 114, pp. 394-404.
[10] Kim, T., Kwon, O.-S., and Song, J., 2019, "Response Prediction of Nonlinear Hysteretic Systems by Deep Neural Networks," Neural Networks, 111, pp. 1-10.
[11] Kulkarni, P. A., Dhoble, A. S., and Padole, P. M., 2019, "Deep Neural Network-Based Wind Speed Forecasting and Fatigue Analysis of a Large Composite Wind Turbine Blade," Proc. Inst. Mech. Eng. C, 233(8), pp. 2794-2812.
[12] Schmidhuber, J., 2015, "Deep Learning in Neural Networks: An Overview," Neural Networks, 61, pp. 85-117.
[13] Bengio, Y., 2009, "Learning Deep Architectures for AI," Foundations Trends Mach. Learn., 2(1), pp. 1-127.
[14] Nielsen, M. A., 2015, Neural Networks and Deep Learning, Determination Press.
[15] Mhaskar, H., Liao, Q., and Poggio, T., 2017, "When and Why Are Deep Networks Better Than Shallow Ones?," Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), pp. 2343-2349.
[16] Poggio, T., Mhaskar, H., Rosasco, L., Miranda, B., and Liao, Q., 2017, "Why and When Can Deep - but Not Shallow - Networks Avoid the Curse of Dimensionality: A Review," Int. J. Autom. Comput., 14(5), pp. 503-519.
[17] Friedman, J. H., 1994, "An Overview of Prediction Learning and Function Approximation," From Statistics to Neural Networks: Theory and Pattern Recognition Applications, V. Cherkassky, J. H. Friedman, and H. Wechsler, eds., Springer-Verlag, New York.
[18] Haykin, S., 2009, Neural Networks and Learning Machines, 3rd ed., Prentice Hall, Upper Saddle River, NJ.
[19] Lataniotis, C., Marelli, S., and Sudret, B., 2018, Extending Classical Surrogate Modelling to Ultrahigh Dimensional Problems Through Supervised Dimensionality Reduction: A Data-Driven Approach, Research Report, ETH Zurich, Zurich, Switzerland.
[20] Poggio, T., and Liao, Q., 2018, "Theory I: Deep Networks and the Curse of Dimensionality," Bull. Pol. Acad. Tech. Sci., 66(6), pp. 761-773.
[21] Hutzenthaler, M., Jentzen, A., Kruse, T., and Nguyen, T. A., 2020, "A Proof That Rectified Deep Neural Networks Overcome the Curse of Dimensionality in the Numerical Approximation of Semilinear Heat Equations," SN Partial Differ. Equ. Appl., 1, Paper No. 10.
[22] Montavon, G., Samek, W., and Müller, K.-R., 2018, "Methods for Interpreting and Understanding Deep Neural Networks," Digital Signal Process., 73, pp. 1-15.
[23] Gopalakrishnan, K., Gholami, H., Vidyadharan, A., Choudhary, A., and Agrawal, A., 2018, "Crack Damage Detection in Unmanned Aerial Vehicle Images of Civil Infrastructure Using Pre-Trained Deep Learning Model," Int. J. Traffic Transp. Eng., 8(1), pp. 1-14.
[24] Ditlevsen, O., and Madsen, H. O., 2007, Structural Reliability Methods, Technical University of Denmark, Kongens Lyngby, Copenhagen, Denmark.
[25] Melchers, R. E., and Beck, A. T., 2018, Structural Reliability Analysis and Prediction, 3rd ed., Wiley, New York.
[26] Engelund, S., and Rackwitz, R., 1993, "A Benchmark Study on Importance Sampling Techniques in Structural Reliability," Struct. Saf., 12(4), pp. 255-276.
[27] Maes, M. A., Breitung, K., and Dupuis, D. J., 1993, "Asymptotic Importance Sampling," Struct. Saf., 12(3), pp. 167-186.
[28] Au, S.-K., and Beck, J. L., 2001, "Estimation of Small Failure Probabilities in High Dimensions by Subset Simulation," Probabilist. Eng. Mech., 16(4), pp. 263-277.
[29] Echard, B., Gayton, N., and Lemaire, M., 2011, "AK-MCS: An Active Learning Reliability Method Combining Kriging and Monte Carlo Simulation," Struct. Saf., 33(2), pp. 145-154.
[30] Schöbi, R., Sudret, B., and Marelli, S., 2017, "Rare Event Estimation Using Polynomial-Chaos Kriging," ASME J. Risk Uncertain. Eng. Syst. A Civ. Eng., 3(2), p. D4016002.
[31] Marelli, S., and Sudret, B., 2018, "An Active-Learning Algorithm That Combines Sparse Polynomial Chaos Expansions and Bootstrap for Structural Reliability Analysis," Struct. Saf., 75, pp. 67-74.
[32] Marelli, S., and Sudret, B., 2016, "Bootstrap-Polynomial Chaos Expansions and Adaptive Designs for Reliability Analysis," Proceedings of the Sixth Asian-Pacific Symposium on Structural Reliability and Its Applications (APSSRA6), Shanghai, China, May 28-30, pp. 217-224.
[33] McCulloch, W., and Pitts, W., 1943, "A Logical Calculus of the Ideas Immanent in Nervous Activity," Bull. Math. Biophys., 5(4), pp. 115-133.
[34] Hagan, M. T., and Menhaj, M. B., 1994, "Training Feedforward Networks With the Marquardt Algorithm," IEEE Trans. Neural Networks, 5(6), pp. 989-993.
[35] Kingma, D. P., and Ba, J. L., 2015, "ADAM: A Method for Stochastic Optimization," Proceedings of the Third International Conference on Learning Representations, San Diego, CA, May 7-9.
[36] Beale, M. H., Hagan, M. T., and Demuth, H. B., 2011, Neural Network Toolbox: User's Guide, Mathworks, Natick, MA, p. 404.
[37] Nguyen, D., and Widrow, B., 1990, "Improving the Learning Speed of 2-Layer Neural Networks by Choosing Initial Values of the Adaptive Weights," Proceedings of the International Joint Conference on Neural Networks, San Diego, CA, June 17-21.
[38] Haykin, S., 1996, Adaptive Filter Theory, 3rd ed., Prentice Hall, Upper Saddle River, NJ.
[39] Waarts, P.-H., 2000, "Structural Reliability Using Finite Element Methods: An Appraisal of DARS—Directional Adaptive Response Surface Sampling," Ph.D. thesis, Technical University of Delft, Delft, The Netherlands.
[40] Rackwitz, R., 2001, "Reliability Analysis — A Review and Some Perspective," Struct. Saf., 23(4), pp. 365-395.
[41] Lee, S. H., and Kwak, B. M., 2006, "Response Surface Augmented Moment Method for Efficient Reliability Analysis," Struct. Saf., 28(3), pp. 261-272.