Ocean Engineering: Pin Zhang, Zhen-Yu Yin, Yuanyuan Zheng, Fu-Ping Gao

This study develops a long short-term memory (LSTM) surrogate modeling approach to predict the mechanical responses and failure envelope of caisson foundations in marine soils. The LSTM model is trained using data from smoothed particle hydrodynamics numerical simulations. It more flexibly learns the failure mechanism directly from the raw data, with higher computational efficiency and accuracy than traditional macro-element methods. The LSTM surrogate model shows potential for application in engineering to investigate caisson foundation responses under various conditions with minimal computational resources.

Ocean Engineering 204 (2020) 107263

Contents lists available at ScienceDirect

Ocean Engineering
journal homepage: www.elsevier.com/locate/oceaneng

A LSTM surrogate modelling approach for caisson foundations


Pin Zhang a,b, Zhen-Yu Yin a,*, Yuanyuan Zheng c,d, Fu-Ping Gao e,f
a Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China
b Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), 1119, Haibin Rd., Nansha District, Guangzhou, China
c School of Civil Engineering, Sun Yat-Sen University, Guangzhou 510275, China
d Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), China
e Key Laboratory for Mechanics in Fluid Solid Coupling Systems, Institute of Mechanics, Chinese Academy of Sciences, Beijing, 100190, China
f School of Engineering Science, University of Chinese Academy of Sciences, Beijing, 100049, China

ARTICLE INFO

Keywords:
Caisson foundation
Failure envelope
Smoothed particle hydrodynamics
Long short-term memory

ABSTRACT

This study proposes a hybrid surrogate modelling approach that integrates the deep learning algorithm long short-term memory (LSTM) to identify the mechanical responses of caisson foundations in marine soils. The LSTM based surrogate model is first trained on limited results generated from strongly validated SPH-SIMSAND based numerical simulations; thereafter it is applied to predict the mechanical responses of soil-structure interaction and the failure envelope of unknown caisson foundations with various specifications as testing. The results indicate that the LSTM based model is more flexible than the macro-element method, because it can directly learn the failure mechanism of a caisson foundation from the raw data, while guaranteeing high computational efficiency and accuracy in comparison with physical and numerical modelling. The LSTM based surrogate model shows great potential for application in engineering practice.

* Corresponding author: Z.-Y. Yin.
https://doi.org/10.1016/j.oceaneng.2020.107263
Received 30 December 2019; Received in revised form 14 March 2020; Accepted 16 March 2020
Available online 1 April 2020
0029-8018/© 2020 Elsevier Ltd. All rights reserved.

1. Introduction

Since the suction caisson was first used as the foundation of an offshore wind turbine (OWT) at Frederikshavn, Denmark in 2002 (Ibsen and Brincker, 2004), suction caisson foundations have drawn great attention, with increasing application as OWT foundations (Gelagoti et al., 2018; Skau et al., 2018b). The suction caisson is installed by pumping out the water trapped within the caisson compartment after it has touched the seabed. This process does not rely on any specialist equipment, so the suction caisson is commonly acknowledged as a cost-effective and eco-friendly foundation type (Jin et al., 2019c; Zhu et al., 2018).

Numerous research works have been conducted to investigate the responses of caisson foundations to the coupling between the vertical force, the horizontal force and the bending moment through, e.g., in-situ testing (Houlsby et al., 2006), physical modelling (Byrne and Houlsby, 2001; Cassidy et al., 2002; Ibsen et al., 2014), numerical modelling (Skau et al., 2018a; Zafeirakos and Gerolymos, 2016) and analytical solutions (Li et al., 2015; Montrasio and Nova, 1997; Nova and Montrasio, 1991). A simple physical model can hardly reproduce the in-situ operating conditions of a caisson foundation or reveal its failure mechanism; meanwhile the instrumentation tends to be cumbersome and expensive. Numerical methods have thus been extensively employed to simulate the responses of caisson foundations (Jin et al., 2018; Jin et al., 2019b; Liu et al., 2014), but nonlinear and elaborate finite element modelling is time-consuming, requires considerable skill (Jin et al., 2019c), and is suitable only for a specific case. An analytical solution known as the macro-element, derived from experimental or numerical results, has been proposed to explore the failure envelope and has been applied in engineering design due to its simplicity. In this method, the soil and the foundation structure are considered as a single macro-element, so the computation is faster and simpler than finite element analysis. However, for caisson foundations of different sizes, additional experimental tests and numerical simulations are needed to calibrate the macro-element model, which is time-consuming and expensive. The macro-element is also limited by its fixed formulation and thus cannot fully replicate experimental or numerical results.

An alternative method is to develop a surrogate model that is constructed from limited experimental or numerical results and that can directly learn the failure mechanism of caisson foundations from the raw experimental or numerical data. This surrogate model can then be applied to predict the mechanical responses and the failure envelope of a caisson foundation under various conditions such as different foundation size, aspect ratio, soil-structure contact surface
area, thereby designing the optimum specification of a caisson foundation. Recently, the application of machine learning (ML) algorithms in geotechnical engineering has proliferated, e.g. soil parameter identification (Zhang et al., 2020a, 2020b; Zhou et al., 2016), development of constitutive models (Zhang et al., 2019a, 2020c), evaluation of soil liquefaction (Atangana Njock et al., 2020), tunneling (Chen et al., 2019a, 2019b; Elbaz et al., 2019a, 2019b; Zhang et al., 2019; Zhang et al., 2020) and landslides (Huang et al., 2017; Yang et al., 2019), because the strong nonlinear mapping ability of such algorithms provides a novel methodology to tackle sophisticated problems involving the interaction of multiple parameters (Sarir et al., 2019; Zhang et al., 2019a; Zhang, 2019). Most recently, Liu et al. (2019) proposed a deep neural network based failure criterion to describe the behavior of woven composites, and the predicted failure envelope matched well with the measured results. Reuter et al. (2018) compared the performance of three commonly used ML algorithms, i.e., artificial neural network, support vector machine and support vector regression (SVR) in modelling concrete failure surfaces, and found that the prediction error of SVR is lowest. The training of an ML-based model is flexible and simple as long as the raw data are provided. Meanwhile, once a surrogate model is well trained, the simulation of a new case can be completed within several seconds, which provides an effective method to investigate the responses of the studied object under various conditions. Nevertheless, to the best knowledge of the authors, ML based models have not yet been developed and directly used to capture the failure envelope or mechanical responses of a caisson foundation. The long short-term memory (LSTM) network was proposed to predict sequential datasets and overcome the gradient vanishing and exploding problems (Hochreiter and Schmidhuber, 1997b), which means that it can account for the history of loading force or deformation. Moreover, LSTM can directly learn the failure mechanism of caisson foundations from the raw data. Therefore, an LSTM based surrogate model deserves to be developed to investigate the responses of a caisson foundation.

This study aims to develop an ML surrogate modelling approach to identify the failure envelope and mechanical responses of a caisson foundation in sand. The database is generated from an advanced numerical modelling method combining smoothed particle hydrodynamics with the SIMSAND model (SPH-SIMSAND), which has been strongly validated against laboratory tests, physical model tests and a field test of caisson foundations. An LSTM based surrogate model is first trained on limited results generated from SPH-SIMSAND; thereafter it is applied to model the mechanical behavior and the failure envelope of unknown caisson foundations with various specifications. The simulation of each new case using the surrogate model can be completed within several seconds. Therefore, the computational cost of determining the optimum design of a caisson foundation specification can be remarkably reduced.

Fig. 1. Schematic view of ANN and RNN.

Fig. 2. Memory cell of LSTM.

2. Deep learning based methodology

2.1. Long short-term memory neural network

The framework of a typical forward neural network is presented in Fig. 1. It can be observed that the datasets flow from the input layer to the output layer and the error is back propagated to modify the weights and biases for minimizing the discrepancy between predicted outputs
and measured results (Rumelhart et al., 1986). In other words, the prediction of the current output parameters is not affected by previous information, nor does it affect the prediction of the output parameters at the next step. Given an input matrix $x = [x_1, x_2, \ldots, x_n]$, the outputs of the hidden and output layers can be obtained by:

$$h = f(Ux + b_1) \tag{1}$$

$$o = g(Vh + b_2) \tag{2}$$

where $U$, $V$ = matrices connecting the input and hidden layers, and the hidden and output layers, respectively; $b_1$, $b_2$ = bias vectors in the input and hidden layers, respectively; $f$, $g$ = activation functions in the hidden and output layers, respectively.

The main departure of the RNN is that a cyclic connection topology is adopted, as presented in Fig. 1. The predicted output at the current step depends on the current values of the input parameters and on the information transferred from the former hidden layer, which can be obtained by:

$$h_t = f(Ux + Wh_{t-1} + b_1) \tag{3}$$

where $W$ = matrix connecting hidden layers at adjacent steps.

The history information is stored and applied to predict the next status. This history-dependent characteristic makes RNNs applicable to problems with sequential datasets, such as language transformation, speech recognition, and the prediction of load–deformation responses (Wang and Sun, 2018; Zhu et al., 1998). However, training RNNs has proved to be problematic because the back-propagated gradients either grow or shrink at each time step, resulting in exploding or vanishing gradients (LeCun et al., 2015); that is, the learning efficiency of the hidden layers in the front of the architecture is poorer than that of the later hidden layers.

Fig. 3. Schematic view of hybrid SPH-SIMSAND and LSTM surrogate modelling process.

Table 1
Specifications of caisson foundations in the numerical model. Each entry is a specification (L, D).
Training set: (1, 2), (1, 2.9), (1.5, 2), (1.5, 2.39), (2, 1), (2, 2), (2, 4), (2, 5.8), (2.96, 1.5), (3, 4), (3.72, 1.24), (4, 2), (4, 4), (4, 8), (4, 11.6), (4.75, 1), (6, 2.97), (6, 8), (6, 9.55), (8, 2.33), (8, 4), (8, 8), (8, 16), (10, 1.91), (10, 20), (12, 5.94), (12, 16), (15, 20), (16, 8), (16, 16), (20, 20), (20, 10)
Testing set: (1, 2.83), (1.5, 2.31), (2, 5.65), (3.56, 1.5), (4, 11.32), (4.15, 1.39), (6, 3.26), (6, 9.24)

To overcome exploding and vanishing gradients in conventional RNNs, a memory cell is added in the architecture of LSTM in place of the neurons used in conventional RNNs. Such a memory cell can store information over extended time intervals and handle long-time-lag tasks (Hochreiter and Schmidhuber, 1997a) by using a novel entity termed a "gate", as presented in Fig. 2. Three gates, i.e., the forget, input and output gates, are included in the memory cell to control the flow of information and the state of the cell. The forget gate decides which information is discarded from the memory cell, the input gate decides which information is stored in the memory cell, and the output gate decides the ultimate output values. The outputs of the forget and input gates at the $t$th step are obtained by:

$$f_t^j = \sigma\left([U_f x_t]_j + [W_f h_{t-1}]_j + [b_f]_j\right) \tag{4}$$

$$i_t^j = \sigma\left([U_i x_t]_j + [W_i h_{t-1}]_j + [b_i]_j\right) \tag{5}$$

where $\sigma$ = sigmoid function. In the forget gate, $\sigma = 1$ and $0$ represent that all information is maintained or discarded, respectively. In the input gate, $\sigma = 1$ and $0$ represent that all information is selected or discarded, respectively.

Based on the forget and input information, the memory cell state at the current $t$th step is updated by:

$$\tilde{c}_t^j = \tanh\left([U_c x_t]_j + [W_c h_{t-1}]_j + [b_c]_j\right) \tag{6}$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{7}$$

where $\tanh$ is the activation function; $\odot$ = elementwise product; $f_t \odot c_{t-1}$ represents the information retained from the previous cell state after forgetting; $i_t \odot \tilde{c}_t$ represents the newly selected information. Updating the memory cell state in an additive form helps avoid vanishing and exploding gradients. Thereafter, the output of the hidden layer at the $t$th step is obtained by:

$$o_t^j = \sigma\left([U_o x_t]_j + [W_o h_{t-1}]_j + [b_o]_j\right) \tag{8}$$

$$h_t = o_t \odot \tanh(c_t) \tag{9}$$

A multiplicative input gate unit is employed to protect the memory contents stored at the current step from perturbation by irrelevant inputs, and a multiplicative output gate unit is employed to protect other
units from perturbation by currently irrelevant memory contents stored at the current step (Hochreiter and Schmidhuber, 1997a). It should be noted that numerous LSTM variants have since been proposed, such as the gated recurrent unit (GRU) (Cho et al., 2014). The LSTM, whose numerous weights and biases are beneficial for high-dimensional problems, is selected to explore its feasibility in capturing the responses of caisson foundations.

Fig. 4. Results of SPH-SIMSAND numerical modelling: (a) u–H; (b) θ–M; (c) H–M/D; (d) failure envelope.

2.2. Proposed hybrid surrogate model

The development of a hybrid surrogate model is divided into two phases: offline and online modelling. The objective of the offline modelling is to bridge the numerical modelling platform SPH-SIMSAND (Yin et al., 2018b) and the deep learning algorithm LSTM. Hence, an LSTM based surrogate model that can entirely replace the numerical modelling to reduce computational cost is further developed. This phase starts from the calibration of the numerical model, including the parameters of SPH and of the SIMSAND constitutive model for a given soil type (Jin et al., 2016, 2017, 2019a; Yin et al., 2013, 2016, 2018a); thereafter several cases with different specifications of caisson foundations are computed to create a synthetic database. It should be noted that the loading paths are consistent among all cases. Herein, 80% of the datasets are used to train the LSTM based surrogate model, and the remaining 20% are used to test the model. The online modelling aims to utilize the surrogate model to predict the mechanical responses and the failure envelope of a caisson foundation with an arbitrary specification under various loading paths. Meanwhile SPH-SIMSAND is also used to simulate the same case to validate the accuracy of the surrogate model.

The framework of training an LSTM based model is presented in Fig. 3. The input parameters consist of the length $L$ and diameter $D$ of a caisson foundation, the horizontal displacement $u_t$ and rotational angle $\theta_t$ at the current step, and the sequence of history values of horizontal force $H_{t-1}$ and bending moment $M_{t-1}$. The output parameters are the horizontal force $H_t$ and bending moment $M_t$. The training performance of the LSTM based model is evaluated by the mean square error (MSE) values on both training and test sets; meanwhile the 10-fold cross-validation method is applied to enhance model robustness, so the loss function can be obtained by:

$$\mathrm{MSE} = \frac{1}{10r}\sum_{i=1}^{10}\sum_{j=1}^{r}\left(y_i^m - y_i^p\right)^2 \tag{10}$$

where $y_i^m$ = measured result at the $i$th point; $y_i^p$ = predicted result at the $i$th point; $r$ = total number of datasets in one cross-validation set.

2.3. Evaluation indicators

Two commonly used indicators, the "Mean Absolute Percentage Error (MAPE)" and the "Nash–Sutcliffe model Efficiency coefficient (NSE)", are employed to evaluate the performance of the LSTM based model. MAPE is an unbiased measure of the average prediction error of the model, and NSE can assess the accuracy and precision of the model (Nash and Sutcliffe, 1970). The two measures are given by

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i^m - y_i^p}{y_i^m}\right| \tag{11}$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(y_i^p - y_i^m\right)^2}{\sum_{i=1}^{n}\left(y_i^m - \bar{y}^m\right)^2} \tag{12}$$

where $n$ = total number of datasets; $\bar{y}^m$ = mean of the measured results; $\mu$ = mean value of $y_i^p / y_i^m$; $\delta$ = standard deviation of $y_i^p / y_i^m$. The combination of MAPE and NSE enables a comprehensive evaluation of the model performance. A low value of MAPE and a high value of NSE indicate an excellent model performance.
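The two indicators of Section 2.3 can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `mape` and `nse` and the sample values are hypothetical, MAPE is returned as a fraction rather than a percentage, and NSE is written in its standard form with the mean of the measured values in the denominator.

```python
def mape(measured, predicted):
    """Mean absolute percentage error, Eq. (11), returned as a fraction.

    Undefined when any measured value is zero.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs((m - p) / m) for m, p in zip(measured, predicted)) / len(measured)


def nse(measured, predicted):
    """Nash-Sutcliffe efficiency, Eq. (12); 1.0 means a perfect match."""
    mean_m = sum(measured) / len(measured)
    residual = sum((p - m) ** 2 for m, p in zip(measured, predicted))
    spread = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - residual / spread


# Hypothetical measured and predicted horizontal forces (not from the paper).
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 33.0, 38.0]
print(f"MAPE = {mape(measured, predicted):.4f}")
print(f"NSE  = {nse(measured, predicted):.4f}")
```

Because MAPE divides by the measured value, points near zero load can dominate it even when NSE stays close to 1.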

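Returning to the memory cell of Section 2.1, one update step following Eqs. (4)–(9) can be sketched as below. This is an illustrative sketch under stated assumptions, not the Keras implementation used in the study: the names `lstm_step`, `affine` and the `params` layout (one `(U, W, b)` triple per gate) are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(U, x, W, h_prev, b):
    """Return [U x]_j + [W h_prev]_j + [b]_j for every hidden unit j."""
    return [
        sum(u * xi for u, xi in zip(U[j], x))
        + sum(w * hi for w, hi in zip(W[j], h_prev))
        + b[j]
        for j in range(len(b))
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One memory-cell update; params maps 'f', 'i', 'c', 'o' to (U, W, b)."""

    def gate(name, activation):
        U, W, b = params[name]
        return [activation(z) for z in affine(U, x, W, h_prev, b)]

    f = gate("f", sigmoid)          # forget gate, Eq. (4)
    i = gate("i", sigmoid)          # input gate, Eq. (5)
    c_tilde = gate("c", math.tanh)  # candidate cell state, Eq. (6)
    # Additive cell-state update, Eq. (7).
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, c_tilde)]
    o = gate("o", sigmoid)          # output gate, Eq. (8)
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]  # hidden output, Eq. (9)
    return h, c


# Hypothetical cell with two inputs and one hidden unit, all parameters zero,
# so every gate equals 0.5 and the candidate state is 0.
params = {name: ([[0.0, 0.0]], [[0.0]], [0.0]) for name in "fico"}
h, c = lstm_step([1.0, 2.0], [0.0], [2.0], params)
print(h, c)
```

With all gates at 0.5 the previous cell state is simply halved, which makes the additive structure of Eq. (7) easy to see.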
3. Database design

3.1. Data source

This study directly utilizes the numerical results of SPH-SIMSAND obtained by Jin et al. (2019b). In that work, the parameters of the SIMSAND model were first calibrated using triaxial tests on Baskarp sand. A cone penetration test, reduced-scale model tests and a full-scale field test of a caisson foundation were then simulated to validate SPH-SIMSAND. The excellent agreement between the numerical and experimental results indicates the reliability of the SPH-SIMSAND modelling method for investigating the responses of a caisson foundation. Therefore, the results from this numerical modelling are used to establish the database in this study.

Because of the lightness of a caisson foundation, the horizontal and overturning moment bearing capacities are important for the design. Jin et al. (2019b) thus investigated the failure envelope of caisson foundations with various specifications in the H-M plane. Numerical radial displacement tests, in which the ratio between the applied displacements or the combined rotation-displacement increments is kept constant, are adopted as the main loading control over the course of numerical modelling (Gottardi et al., 1999). This study aims to identify the mechanical responses and failure envelope of caisson foundations with various specifications, so a total of 40 numerical models with 176000 datasets are constructed; the detailed specifications of the studied caisson foundations are presented in Table 1. Herein, 32 numerical modelling results with 140800 datasets are used to train the LSTM based model, and the remaining 8 numerical modelling results with 35200 datasets are used to test the performance of the LSTM based model. All datasets are ultimately stored in a comma-separated values (CSV) file for fast importing into Python.

Fig. 5. Data smoothing: (a) u–H; (b) θ–M.

Fig. 6. Loss values yielded by LSTM model with dropout layer on the: (a) training set; (b) testing set.

Table 2
Main hyper-parameters during training LSTM.
Hyper-parameter    Description                                    Value
Nh                 Number of hidden layers                        3
Nn                 Number of nodes in the hidden layer            80; 80; 50
Activation         Activation function to use                     tanh; ReLU; ReLU
Dropout            Fraction of the units to drop                  0
Optimizer          Algorithm for optimizing weights and biases    Adam
η                  Learning rate in the optimizer                 0.0002–0.002; period = 20
Batch_size         Number of training samples                     200
Epoch              Number of iterations during training           200

As an example, a typical simulation result of a caisson foundation with an outer diameter (D) of 2 m and a skirt length (L) of 2 m is presented in Fig. 4. It can be observed that a total of 22 loading paths, including a pure rotation path and a pure horizontal displacement path, are considered in each case, where the horizontal displacement increases
from 0 to 0.8 m with an interval of 0.04 and the rotation remains in the range of 0 to 0.2 rad (see Fig. 4(a) and (b)). Each loading path includes 200 data points. To determine the bearing capacity, the loading paths are plotted in the H-M plane, and the ultimate bearing capacity is determined by the inflexion of the loading paths, i.e., the failure loci, as presented in Fig. 4(c). It should be noted that the ultimate bearing capacity is hard to reach for some loading paths. To unify the basis of determination, the ultimate bearing capacities of such cases are represented by the ends of the loading paths (Jin et al., 2019c), based on which the final failure envelope can be obtained by connecting the failure loci, as presented in Fig. 4(d).

3.2. Data preprocessing

It can be seen that the simulated loading paths of SPH-SIMSAND are noisy, as presented in Fig. 4. A large variance of sequential data has an adverse impact on the training process and the model performance (Xu et al., 2019). This study thus introduces a sliding window approach to smooth the data before the datasets are applied to train the LSTM based model. The smoothed value $\bar{x}_n$ can be obtained by:

$$\bar{x}_n = \frac{1}{t}\sum_{i=n-t+1}^{n} x_i \tag{13}$$

where $t$ = window size. The average value of the datasets within a window is assigned as the new value of the studied parameter. The datasets within a window consist of the current and the former ($t$–1) values. It should be noted that the first ($t$–1) and the last ($t$–1) points cannot form a complete window, so the values of such points remain unchanged. A larger window size generates a smoother sequential curve, but the curve is much more likely to deviate from the original one. Considering that the variance of the u–H and θ–M relationships presented in Fig. 4 is small, the window size is set as two in this study to maintain both the reliability and the smoothness of the datasets. The smoothed relationships of u–H and θ–M are presented in Fig. 5. It can be observed that the magnitude and trend of the sequential curves are roughly identical to the original results presented in Fig. 4, while the smoothness is improved dramatically.

The different scales of the input and output parameters also affect model performance. After smoothing, all datasets are therefore normalized into the same scale [–1, 1] to eliminate the scale effect using Eq. (14).

$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}\left(\bar{x}_{\max} - \bar{x}_{\min}\right) + \bar{x}_{\min} \tag{14}$$

where $x_{\max}$ and $x_{\min}$ = measured maximum and minimum of the parameter $x$; $\bar{x}_{\max}$ and $\bar{x}_{\min}$ = 1 and –1, respectively. The ultimate database can be downloaded at the Appendix.

Fig. 7. Loss values yielded by LSTM model with different batch size on the: (a) training set; (b) testing set.

Table 3
Indicator values for the training and testing sets.
Parameter    Training set          Testing set
             MAPE      NSE         MAPE      NSE
H            18.36%    0.99        25.07%    0.93
M            36.57%    0.99        42.15%    0.92

4. Offline training of a hybrid surrogate model

4.1. Determination of hyper-parameters

Training the LSTM based model requires numerous hyper-parameters to be determined in advance. The main hyper-parameters over the course of training are summarized in Table 2. Herein, the grid search method is used to search for the optimum architecture of LSTM, including the number of hidden layers, the number of nodes and the activation function in each layer. Considering that a dropout layer can mitigate overfitting, the performance of LSTM based models with various dropout rates (0, 0.2, 0.4, 0.6, 0.8) is investigated. The adaptive moment estimation (Adam) optimizer is utilized in this study due to its superiority (Ruder, 2016). The learning rate of Adam controls the update step size of the weights, and its default value is 0.001. This parameter needs to be finely tuned for a complex problem with numerous saddle points (Smith, 2017). Considering the highly non-linear responses of a caisson foundation, the cyclical learning rate proposed by Smith (2017) is used to finely optimize the weights and biases of LSTM. Batch size represents the number of samples fed at each training step. Because each loading path consists of 200 datasets, as mentioned in Section 3.1, the batch size is set as 200n (n = 1, 2, …) to ensure that the datasets from an entire loading path can be used simultaneously to train the model. This study investigates the performance of LSTM based models with batch sizes of 200, 400, 800, 1000 and 4400 (the total number of datasets in a case is 4400). The number of epochs needs to be sufficiently large to ensure that the loss value converges to a constant value. Herein, the orthogonal initializer is used to generate an initial random orthogonal weight matrix and the zeros initializer is used to generate an initial zero bias vector.

Regarding the implementation, Keras, a high-level deep learning library based on the Python programming language, is leveraged to design the architecture of the LSTM based model. Tensorflow, as the backend engine, implements the operations in Keras such as tensor calculus. The matrix construction and computation are achieved by using
the NumPy library. The data mining and analysis toolbox Pandas is employed to import the CSV dataset file.

Fig. 8. Predicted loading using LSTM based surrogate model for the training set L = 4, D = 8, in comparison with numerical results: (a) u–H; (b) θ–M; (c) H–M/D.

The results of the grid search indicate that the LSTM model with three hidden layers produces the lowest loss value. The number of nodes in each layer is 80, 80 and 50, respectively, and the corresponding activation functions are tanh, ReLU and ReLU, respectively. Therefore, the numbers of weights and biases are 104820 and 842, respectively. Over the course of training, the learning rate first increases from 0.0002 to 0.002 within 10 epochs and thereafter decreases from 0.002 to 0.0002 within 10 epochs, so each period includes 20 epochs. Such a strategy helps the optimization process to escape from local optima and saddle points. The loss value remains roughly steady and converges to a constant value within 200 epochs, so the maximum number of epochs is set as 200.

4.2. Underfitting and overfitting examination

The examination of underfitting and overfitting is a key step to guarantee the reliability of the LSTM based model. Learning curves of both loss values on the training and testing sets have been successfully
used to evaluate the underfitting and overfitting problems (Hassan et al., 2020), because they reflect how the behavior of a neural network improves with an increasing number of training samples or increasing network complexity (Murata et al., 1993). Large loss values on both training and testing sets indicate that the LSTM based model suffers from underfitting. A low loss value on the training set combined with a large loss value on the testing set indicates that the LSTM based model suffers from overfitting. This study thus uses learning curves to examine potential underfitting and overfitting issues.

Fig. 9. Predicted loading using LSTM based surrogate model for the testing set L = 6, D = 9.24, in comparison with numerical results: (a) u–H; (b) θ–M; (c) H–M/D.

Numerous research works have demonstrated that dropout family methods give significant advantages over other regularization methods such as L1 and L2 penalties (Moradi et al., 2019; Srivastava et al., 2014). Therefore, a dropout method is used to avoid potential overfitting of LSTM in this study. The effect of dropout rates on the prediction performance of LSTM based models can be observed in Fig. 6. Comparing the models with dropout rates of 0, 0.2, 0.4, 0.6 and 0.8, it is clear that the loss values on both training and testing sets increase with increasing dropout rate. Meanwhile, increasing the dropout rate causes the loss values to fluctuate. The limited benefit of the dropout layer indicates that the LSTM based model can suppress the overfitting problem well over the course of training and provide accurate predictions. Therefore, the dropout rate is set as 0 in this study.
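The two preprocessing steps of Section 3.2, the sliding window average of Eq. (13) and the rescaling to [–1, 1] of Eq. (14), can be sketched as follows. This is a minimal sketch, not the authors' code: the function names `smooth` and `normalize` and the sample readings are hypothetical, and a trailing window is assumed, so only the first t–1 points lack a complete window and are left unchanged.

```python
def smooth(values, t):
    """Trailing moving average with window size t, Eq. (13).

    Points without a complete window (the first t - 1) keep their raw value.
    """
    if t < 1:
        raise ValueError("window size must be at least 1")
    out = list(values)
    for n in range(t - 1, len(values)):
        out[n] = sum(values[n - t + 1 : n + 1]) / t
    return out


def normalize(values, lo=-1.0, hi=1.0):
    """Linear rescaling of values to the range [lo, hi], Eq. (14)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot normalize a constant sequence")
    return [(v - x_min) / (x_max - x_min) * (hi - lo) + lo for v in values]


# Hypothetical noisy horizontal-force readings (not from the paper).
raw = [0.0, 2.0, 1.0, 3.0, 5.0]
smoothed = smooth(raw, t=2)   # window size of two, as chosen in the study
scaled = normalize(smoothed)
print(smoothed)
print(scaled)
```

In practice the minimum and maximum would be taken per parameter over the training data, so that the same scaling can be reapplied to the testing set and inverted on the predictions.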

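Two of the hyper-parameter choices in Section 4.1 can be checked with simple arithmetic. The sketch below is illustrative only: `cyclical_lr` assumes a linear (triangular) ramp between the reported bounds, and `count_parameters` assumes six input features (L, D, u_t, θ_t, H_{t-1}, M_{t-1}) and a two-unit dense output layer (H_t, M_t). The dense layer is an assumption not stated in the text, but with it the counts match the reported 104820 weights and 842 biases.

```python
def cyclical_lr(epoch, lr_min=0.0002, lr_max=0.002, period=20):
    """Triangular learning rate: rises for period/2 epochs, then falls back."""
    half = period / 2
    position = epoch % period
    distance = position if position <= half else period - position
    return lr_min + (lr_max - lr_min) * distance / half


def lstm_params(n_in, n_units):
    """(weights, biases) of one LSTM layer.

    Each unit has four blocks (forget, input, candidate, output), and each
    block sees n_in inputs plus n_units recurrent connections.
    """
    return 4 * n_units * (n_in + n_units), 4 * n_units


def count_parameters(n_features, lstm_units, n_outputs):
    """Total (weights, biases) of stacked LSTM layers plus a dense output."""
    weights = biases = 0
    n_in = n_features
    for n_units in lstm_units:
        w, b = lstm_params(n_in, n_units)
        weights, biases = weights + w, biases + b
        n_in = n_units
    # Assumed dense layer mapping the last hidden state to the outputs.
    return weights + n_in * n_outputs, biases + n_outputs


print([round(cyclical_lr(e), 5) for e in (0, 5, 10, 15, 20)])
print(count_parameters(6, [80, 80, 50], 2))
```

A schedule function of this shape could be passed to a per-epoch learning-rate callback, although the text does not say how the schedule was wired into training.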
Fig. 10. Comparison between loading paths of testing sets yielded by SPH-SIMSAND and LSTM.

Fig. 11. Notation convention of failure envelope.

Fig. 12. Comparison between failure envelopes of testing sets yielded by SPH-SIMSAND and LSTM.

Fig. 7 presents the evolution of loss values on both training and testing sets produced by LSTM based models with various batch sizes. The loss value is plotted on a logarithmic scale to highlight the differences in model performance, because the loss values are small. The loss value roughly stabilizes as the number of epochs reaches 200, and it clearly increases with increasing batch size. The increasing number of training samples per batch brings about difficulties in optimizing the weights and biases to reduce the loss value, but the model trained with a larger batch size possesses better generalization ability, so the prediction error on both training and testing sets decreases continuously throughout the training process. The model trained with a small batch size tends to shrink the difference between most measured and predicted results, so the loss value is small, but such a model suffers from poorer generalization ability. Consequently, an LSTM model with optimized weights and biases that presents an excellent prediction performance on the cross-validation sets may produce a large error on both training and testing sets, which is why Fig. 7 shows the loss value fluctuating as the batch size decreases. Considering that the model trained with a batch size of 200 produces the lowest loss value and that the variation is acceptable, the batch size of 200 is applied to train the LSTM based model in this
study. Meanwhile the loss value decreases continuously for both training and test sets, and the convergence values are roughly identical. These factors indicate that the constructed LSTM model can overcome the underfitting and overfitting problems well.

Table 4
Values of parameters used in failure envelope.
               SPH-SIMSAND                        LSTM based model
L      D       a (×10^4)   b (×10^4)   ϕ (°)      a (×10^4)   b (×10^4)   ϕ (°)
1      2.83    0.50        0.17        26.30      0.52        0.18        28.96
1.5    2.31    0.54        0.15        37.27      0.57        0.18        39.06
2      5.65    1.39        0.31        19.95      1.59        0.29        21.86
3.56   1.5     1.49        0.15        62.98      1.54        0.17        63.31
4      11.32   3.88        0.66        19.48      3.84        0.69        17.74
4.15   1.39    2.06        0.18        67.5       2.23        0.23        67.75
6      3.26    3.47        0.99        52.84      3.63        0.89        55.51
6      9.24    5.20        1.29        31.69      5.20        1.51        32.39

4.3. Evaluation of surrogate model performance

All optimum values of the hyper-parameters are determined as described in the former two sections. Table 2 summarizes these values, and the model is trained based on this set of parameters. The indicator

[…]

is also used to simulate the same cases for comparison. Fig. 9 presents the predicted loading paths of a caisson foundation with L = 6 m and D = 9.24 m. It can be observed that the LSTM based model performs excellently in reproducing the u–H relationship, but the prediction errors of the initial θ–M and H–M/D relationships are large, which is attributed to the loss function, i.e., the MSE value. This indicator focuses on eliminating the discrepancy of large output values, whereas small output values carry less weight, so the trained LSTM based model shows a larger prediction error in the initial part of the loading paths. An in-depth study of loss function selection, to balance the prediction of large and small values and further improve the model generalization ability, is important but beyond the scope of this paper; it will be conducted in a future dedicated work. Overall, the prediction performance of the LSTM based model on the mechanical responses of unknown caisson foundations is reliable.

Fig. 10 presents the predicted H–M/D relationships of the remaining seven testing sets. The loading paths generated from the LSTM based model show good agreement with the numerical results. On the testing set, the NSE values are high (0.93 for H and 0.92 for M), with MAPE values of 25.07% and 42.15%, as presented in Table 3. The simulations using the LSTM based model are completed without using any internal variables to capture the responses of caisson foundations. Such a model is thus ready to be used to predict the failure envelope of caisson foundations with various specifications on a given soil type in engineering practice.
values for describing the prediction performance of the model are pre­
sented in Table 3. For the training set, MAPE values are low on both
horizontal force and moment predictions, meanwhile NSE values are 5.2. Prediction of failure envelope in the H–M plane
roughly identical to 1. The LSTM based model shows an excellent per­
formance in capture loading paths of caisson foundations. As presented in Fig. 4(d), the failure envelope in the H–M plane has
For brevity, the predicted loading path of one training set with L ¼ 4 an elliptical shape. Following Villalobos et al., 2009, Jin et al., 2019c
m and D ¼ 8 m is presented as a typical example to illustrate the training proposed an ellipse formulation with only three parameters a, b and ϕ to
performance of the LSTM based model, as shown in Fig. 8. Such results describe the failure envelope of a caisson foundation in the H–M plane.
are obtained within several seconds. Remarkably, the LSTM based Fig. 11 illustrates the notation convention of failure envelope, in which a
model is capable of replicating the u–H, θ–M and H–M/D relationships and b are the major and minor axis of the ellipse, respectively, and ϕ is
with negligible error. The results presented in Fig. 8(a)–(b) and (e)–(f) the rotation of the ellipse. The formulation can be obtained by:
indicate that the softening behavior can be captured by the LSTM based A1 X 2 þ A2 XY þ A3 Y 2 þ A4 ¼ 0 (15)
model. The excellent repeatability provides a basis for the LSTM based
surrogate model to replace numerical modelling for investigating the 8
>
> A1 ¼ a2 ðsin ϕÞ2�þ b2 ðcos ϕÞ2
mechanical responses of caisson foundations with lower computational <
A2 ¼ 2 b2 a2 sin ϕ cos ϕ
cost. (16)
> A3
>
: ¼ a2 ðcos ϕÞ2 þ b2 ðsin ϕÞ2
A4 ¼ a2 b2
5. Online prediction using LSTM surrogate model
where X and Y denote the horizontal force H � 104 and normalized
5.1. Loading paths prediction moment M/D � 104 in this study.
The failure loci of eight testing cases obtained from the numerical
To test the reliability of the LSTM based surrogate model to guar­ modelling and the LSTM based model is plotted together in Fig. 12. The
antee its application in engineering practice, the responses of additional predicted points are close to the numerical results. Using Eqs. 15 and 16
eight caisson foundations are investigated using the LSTM based model to fit these failure loci, it can be observed that the fitted failure envelope
developed in the former section, meanwhile the SPH-SIMSAND platform based on failure loci obtained from the LSTM based models exhibit good
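As a rough illustration of Eqs. (15) and (16) and of the error indicators used to compare the fitted envelope parameters, the following Python sketch builds the ellipse coefficients, checks that a point on the rotated ellipse satisfies Eq. (15), and evaluates MAPE and NSE on the Table 4 values. It is not the authors' code. The function names, the rotation convention (ϕ measured counterclockwise from the X axis to the major axis), the signs A2 = 2(b² − a²) sin ϕ cos ϕ and A4 = −a²b², and the use of the standard MAPE and NSE definitions are all assumptions made for this example.

```python
import math

# Table 4 of this paper: (a, b, phi in degrees) for the eight testing cases.
# a and b are given in the table's x10^4 units.
SPH = [(0.50, 0.17, 26.30), (0.54, 0.15, 37.27), (1.39, 0.31, 19.95),
       (1.49, 0.15, 62.98), (3.88, 0.66, 19.48), (2.06, 0.18, 67.50),
       (3.47, 0.99, 52.84), (5.20, 1.29, 31.69)]
LSTM = [(0.52, 0.18, 28.96), (0.57, 0.18, 39.06), (1.59, 0.29, 21.86),
        (1.54, 0.17, 63.31), (3.84, 0.69, 17.74), (2.23, 0.23, 67.75),
        (3.63, 0.89, 55.51), (5.20, 1.51, 32.39)]


def ellipse_coefficients(a, b, phi_deg):
    """Return (A1, A2, A3, A4) of Eq. (15), computed as in Eq. (16)."""
    s = math.sin(math.radians(phi_deg))
    c = math.cos(math.radians(phi_deg))
    return (a**2 * s**2 + b**2 * c**2,
            2.0 * (b**2 - a**2) * s * c,
            a**2 * c**2 + b**2 * s**2,
            -(a**2) * b**2)


def envelope_residual(x, y, a, b, phi_deg):
    """Left-hand side of Eq. (15): zero on the envelope, negative inside."""
    a1, a2, a3, a4 = ellipse_coefficients(a, b, phi_deg)
    return a1 * x * x + a2 * x * y + a3 * y * y + a4


def point_on_envelope(t, a, b, phi_deg):
    """Point at parameter t on an ellipse whose major axis is rotated by phi
    (assumed counterclockwise from the X axis)."""
    s = math.sin(math.radians(phi_deg))
    c = math.cos(math.radians(phi_deg))
    xp, yp = a * math.cos(t), b * math.sin(t)  # coordinates along the ellipse axes
    return xp * c - yp * s, xp * s + yp * c


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((p - o) / o) for o, p in zip(actual, predicted)) / len(actual)


def nse(actual, predicted):
    """Nash-Sutcliffe efficiency: 1 minus residual over total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((o - p) ** 2 for o, p in zip(actual, predicted))
    ss_tot = sum((o - mean) ** 2 for o in actual)
    return 1.0 - ss_res / ss_tot


# Compare the LSTM parameters with the SPH-SIMSAND ones, column by column.
for i, name in enumerate(("a", "b", "phi")):
    actual = [row[i] for row in SPH]
    predicted = [row[i] for row in LSTM]
    print(f"{name}: MAPE = {mape(actual, predicted):.2f}%, NSE = {nse(actual, predicted):.3f}")
```

Running the script prints one MAPE and NSE pair for each of a, b and ϕ. Under the assumed convention, `envelope_residual` can also serve as a quick check of whether a load state (X, Y) lies inside (negative value) or outside (positive value) a fitted envelope.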

The corresponding a, b and ϕ values generated by the numerical modelling and
the LSTM based model are presented in Table 4. Scatter plots of the predicted
and actual a, b and ϕ values are presented in Fig. 13 with the MAPE and NSE
values. All points are close to the line with a slope of 1. The MAPE values of
a and ϕ are only around 5%, and the larger MAPE value (13.14%) in predicting b
is attributed to the smaller value of b. All NSE values are larger than 0.95.
These results clearly demonstrate the capacity of the LSTM based model to
predict the failure envelope of unknown caisson foundations with various
specifications. It can reproduce the mechanical responses of caisson
foundations with low computational cost and high accuracy, which is a
significant improvement over numerical and analytical methods.

Fig. 13. Values of parameters used in failure envelope: (a) major axis a; (b) minor axis b; (c) rotation ϕ.

6. Conclusions

This study presents the development of a hybrid surrogate model that
integrates the numerical modelling technique SPH-SIMSAND and the deep learning
algorithm long short-term memory (LSTM) to identify the mechanical responses
and failure envelope of a caisson foundation in sand. The LSTM based surrogate
model was first trained on limited results generated from the numerical
modelling, and was thereafter applied to simulate the mechanical behavior and
the failure envelope of unknown caisson foundations with various
specifications. The underfitting and overfitting problems that commonly exist
during the development of machine learning based models have been well
tackled, which ensures the robustness and generalization ability of the LSTM
based model.

The predictive ability on sequential data allowed the LSTM based model to
accurately reproduce the mechanical responses of a caisson foundation,
including the relationships between horizontal displacement and force,
rotation and moment, and horizontal force and moment. Such a surrogate model
has the capacity to memorize and interpret history-dependent events without
using additional parameters. This means that the LSTM based model is more
flexible than the macro-element method, because it can directly learn the
failure mechanism of a caisson foundation from the raw data. The failure
envelopes of caisson foundations can be rapidly obtained using the LSTM based
surrogate model, and they agree well with the actual results. Therefore, the
LSTM based model also guarantees high computational efficiency and accuracy in
comparison with physical and numerical modelling.

Overall, the LSTM based surrogate model shows great potential for application
in engineering practice. Engineers can first obtain several responses of
caisson foundations using experiments or numerical modelling, then build an
LSTM based model with such datasets, and further use the LSTM based model to
obtain the responses of caisson foundations under different conditions. An
optimum design of a caisson foundation can thereby be obtained with lower
experimental or computational costs.

The proposed method of developing the LSTM based model can be extended to more
conditions (e.g., different soil properties, different vertical forces applied
to the caisson foundation) if a database is available. Future work will focus
on the application of the method and the model using experimental
observations.

Appendix

The datasets and the optimum LSTM based model used in this study can be freely
downloaded at the following link. There are three documents entitled
“databasesmooth.csv”, “Caisson_Mech_Response.h5” and “validation.py”. The data
are stored in “databasesmooth.csv”, and “Caisson_Mech_Response.h5” is the
optimum LSTM based surrogate model. The main code is “validation.py”.
Researchers and engineers can directly run this code to replicate the results
presented in this study and apply it in engineering practice.
https://www.researchgate.net/publication/338983602_LSTM_based_model_for_predicting_caisson_foundations_responses.

Declaration of competing interest

We declare that we have no known competing financial interests or personal
relationships that could have appeared to influence the work reported in this
manuscript “A LSTM Surrogate Modelling Approach for Caisson Foundations”.

CRediT authorship contribution statement

Pin Zhang: Conceptualization, Methodology, Software, Validation, Formal
analysis, Investigation, Data curation, Writing - original draft,
Visualization. Zhen-Yu Yin: Resources, Conceptualization, Methodology,
Writing - review & editing, Supervision, Project administration, Funding
acquisition. Yuanyuan Zheng: Methodology, Writing - review & editing.
Fu-Ping Gao: Methodology, Writing - review & editing.

Acknowledgements

This research was financially supported by the Research Grants Council (RGC)
of Hong Kong Special Administrative Region Government (HKSARG) of China (Grant
No: PolyU R5037-18F) and the Key Special Project for Introduced Talents Team
of Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou)
(No: GML2019ZD0503).

References

Atangana Njock, P.G., Shen, S.-L., Zhou, A., Lyu, H.-M., 2020. Evaluation of soil liquefaction using AI technology incorporating a coupled ENN/t-SNE model. Soil Dynam. Earthq. Eng. 130, 105988.
Byrne, B., Houlsby, G.T., 2001. Observations of footing behaviour on loose carbonate sands. Geotechnique 51 (5), 463–466.
Cassidy, M., Byrne, B., Houlsby, G.T., 2002. Modelling the behaviour of circular footings under combined loading on loose carbonate sand. Geotechnique 52 (10), 705–712.
Chen, R.P., Zhang, P., Kang, X., Zhong, Z.Q., Liu, Y., Wu, H.N., 2019a. Prediction of maximum surface settlement caused by EPB shield tunneling with ANN methods. Soils Found. 59 (2), 284–295.
Chen, R.P., Zhang, P., Wu, H.N., Wang, Z.T., Zhong, Z.Q., 2019b. Prediction of shield tunneling-induced ground settlement using machine learning techniques. Front. Struct. Civ. Eng. 13 (6), 1363–1378.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning Phrase Representations Using RNN Encoder–Decoder for Statistical Machine Translation. arXiv:1406.1078.
Elbaz, K., Shen, S.-L., Zhou, A., Yuan, D.-J., Xu, Y.-S., 2019a. Optimization of EPB shield performance with adaptive neuro-fuzzy inference system and genetic algorithm. Appl. Sci. 9 (4).
Elbaz, K., Shen, S.L., Zhou, A.N., Yin, Z.Y., Lyu, H.M., 2019b. Prediction of disc cutter life during shield tunnelling with AI via incorporation of genetic algorithm into GMDH-type neural network. Engineering (in press).
Gelagoti, F., Georgiou, I., Kourkoulis, R., Gazetas, G., 2018. Nonlinear lateral stiffness and bearing capacity of suction caissons for offshore wind-turbines. Ocean Eng. 170, 445–465.
Gottardi, G., Houlsby, G.T., Butterfield, R., 1999. Plastic response of circular footings on sand under general planar loading. Geotechnique 49 (4), 453–469.
Hassan, M.M., Gumaei, A., Alsanad, A., Alrubaian, M., Fortino, G., 2020. A hybrid deep learning model for efficient intrusion detection in big data environment. Inf. Sci. 513, 386–396.
Hochreiter, S., Schmidhuber, J., 1997a. Long short-term memory. Neural Comput. 9, 1735–1780.
Hochreiter, S., Schmidhuber, J., 1997b. Long short-term memory. Neural Comput. 9 (8), 1735–1780.
Houlsby, G.T., Kelly, R.B., Huxtable, J., Byrne, B.W., 2006. Field trials of suction caissons in sand for offshore wind turbine. Geotechnique 56 (1), 3–10.
Huang, F., Huang, J., Jiang, S., Zhou, C., 2017. Landslide displacement prediction based on multivariate chaotic model and extreme learning machine. Eng. Geol. 218, 173–186.
Ibsen, L.B., Barari, A., Larsen, K.A., 2014. Adaptive plasticity model for bucket foundations. J. Eng. Mech. 140 (2), 361–373.
Ibsen, L.B., Brincker, R., 2004. Design of a New Foundation for Offshore Wind Turbines. 22nd International Modal Analysis Conference, Detroit, MI, USA.
Jin, Y.-F., Wu, Z.-X., Yin, Z.-Y., Shen, J.S., 2017. Estimation of critical state-related formula in advanced constitutive modeling of granular material. Acta Geotechnica 12 (6), 1329–1351.
Jin, Y.-F., Yin, Z.-Y., Shen, S.-L., Hicher, P.-Y., 2016. Selection of sand models and identification of parameters using an enhanced genetic algorithm. Int. J. Numer. Anal. Methods GeoMech. 40 (8), 1219–1240.
Jin, Y.-F., Yin, Z.-Y., Zhou, W.-H., Horpibulsuk, S., 2019a. Identifying parameters of advanced soil models using an enhanced transitional Markov chain Monte Carlo method. Acta Geotechnica 14 (6), 1925–1947.
Jin, Z., Yin, Z.Y., Kotronis, P., Jin, Y.F., 2018. Numerical investigation on evolving failure of caisson foundation in sand using the combined Lagrangian-SPH method. Marine Georesources & Geotechnology 37 (1), 23–35.
Jin, Z., Yin, Z.-Y., Kotronis, P., Li, Z., 2019b. Advanced numerical modelling of caisson foundations in sand to investigate the failure envelope in the H-M-V space. Ocean Eng. 190 (15), 106394.
Jin, Z., Yin, Z.-Y., Kotronis, P., Li, Z., Tamagnini, C., 2019c. A hypoplastic macroelement model for a caisson foundation in sand under monotonic and cyclic loadings. Mar. Struct. 66, 16–26.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521 (7553), 436–444.
Li, Z., Kotronis, P., Escoffier, S., Tamagnini, C., 2015. A hypoplastic macroelement for single vertical piles in sand subject to three-dimensional loading conditions. Acta Geotechnica 11 (2), 373–390.
Liu, X., Gasco, F., Goodsella, J., Yua, W.B., 2019. Initial failure strength prediction of woven composites using a new yarn failure criterion constructed by deep learning. Compos. Struct. 230, 111505.
Liu, M., Yang, M., Wang, H., 2014. Bearing behavior of wide-shallow bucket foundation for offshore wind turbines in drained silty sand. Ocean Engineering 82, 169–179.
Montrasio, L., Nova, R., 1997. Settlements of shallow foundations on sand: geometrical effects. Geotechnique 47 (1), 49–60.
Moradi, R., Berangi, R., Minaei, B., 2019. A survey of regularization strategies for deep models. Artif. Intell. Rev. https://doi.org/10.1007/s10462-019-09784-7.
Murata, N., Yoshizawa, S., Amari, S., 1993. Learning curves, model selection and complexity of neural networks. In: Hanson, S.J., Cowan, J.D., Giles, C.L. (Eds.), Advances in Neural Information Processing Systems, 5. Morgan Kaufmann, San Mateo, CA, pp. 607–614.
Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models part I - a discussion of principles. J. Hydrol. 10 (3), 282–290.
Nova, R., Montrasio, L., 1991. Settlements of shallow foundations on sand. Geotechnique 41 (2), 243–256.
Reuter, U., Sultan, A., Reischl, D.S., 2018. A comparative study of machine learning approaches for modeling concrete failure surfaces. Adv. Eng. Software 116, 67–79.
Ruder, S., 2016. An Overview of Gradient Descent Optimization Algorithms. arXiv preprint, arXiv:1609.04747.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-propagating errors. Nature 323 (9), 533–536.
Sarir, P., Shen, S.-L., Wang, Z.-F., Chen, J., Horpibulsuk, S., Pham, B.T., 2019. Optimum model for bearing capacity of concrete-steel columns with AI technology via incorporating the algorithms of IWO and ABC. Eng. Comput. 1–11.
Skau, K.S., Chen, Y., Jostad, H.P., 2018a. A numerical study of capacity and stiffness of circular skirted foundations in clay subjected to combined static and cyclic general loading. Geotechnique 68 (3), 205–220.
Skau, K.S., Grimstad, G., Page, A.M., Eiksund, G.R., Jostad, H.P., 2018b. A macro-element for integrated time domain analyses representing bucket foundations for offshore wind turbines. Mar. Struct. 59, 158–178.
Smith, L.N., 2017. Cyclical learning rates for training neural networks. In: IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, California.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R., 2014. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958.
Villalobos, F.A., Byrne, B.W., Houlsby, G.T., 2009. An experimental study of the drained capacity of suction caisson foundations under monotonic loading for offshore applications. Soils Found. 49 (3), 477–488.
Wang, K., Sun, W., 2018. A multiscale multi-permeability poroplasticity model linked by recursive homogenizations and deep learning. Comput. Methods Appl. Mech. Eng. 334, 337–380.
Xu, P., Du, R., Zhang, Z., 2019. Predicting pipeline leakage in petrochemical system through GAN and LSTM. Knowl. Base Syst. 175, 50–61.
Yang, B., Yin, K., Lacasse, S., Liu, Z., 2019. Time series analysis and long short-term memory neural network to predict landslide displacement. Landslides 16 (4), 677–694.
Yin, Z.-Y., Huang, H.-W., Hicher, P.-Y., 2016. Elastoplastic modeling of sand–silt mixtures. Soils Found. 56 (3), 520–532.
Yin, Z.-Y., Jin, Y.-F., Shen, J.S., Hicher, P.-Y., 2018a. Optimization techniques for identifying soil parameters in geotechnical engineering: comparative study and enhancement. Int. J. Numer. Anal. Methods GeoMech. 42 (1), 70–94.
Yin, Z.-Y., Jin, Z., Kotronis, P., Wu, Z.-X., 2018b. Novel SPH SIMSAND–based approach for modeling of granular collapse. Int. J. GeoMech. 18 (11).
Yin, Z.-Y., Xu, Q., Hicher, P.-Y., 2013. A simple critical-state-based double-yield-surface model for clay behavior under complex loading. Acta Geotechnica 8 (5), 509–523.
Zafeirakos, A., Gerolymos, N., 2016. Bearing strength surface for bridge caisson foundations in frictional soil under combined loading. Acta Geotechnica 11 (5), 1189–1208.
Zhang, N., Shen, S.-L., Zhou, A., Xu, Y.-S., 2019a. Investigation on performance of neural networks using quadratic relative error cost function. IEEE Access 7, 106642–106652.
Zhang, P., 2019. A novel feature selection method based on global sensitivity analysis with application in machine learning-based prediction model. Appl. Soft Comput. 85, 105859.
Zhang, P., Chen, R.P., Wu, H.N., 2019b. Real-time analysis and regulation of EPB shield steering using Random Forest. Autom. ConStruct. 106, 102860.
Zhang, P., Wu, H.N., Chen, R.P., Chan, T.H.T., 2020. Hybrid meta-heuristic and machine learning algorithms for tunneling-induced settlement prediction: A comparative study. Tunnelling and Underground Space Technology 99, 103383.
Zhang, P., Yin, Z.-Y., Jin, Y.-F., Chan, T.H.T., 2020a. A novel hybrid surrogate intelligent model for creep index prediction based on particle swarm optimization and random forest. Eng. Geol. 265, 105328.
Zhang, P., Yin, Z.Y., Jin, Y.F., Chan, T., 2020b. Intelligent modelling of clay compressibility using hybrid meta-heuristic and machine learning algorithms. Geosci. Front. (in press).
Zhang, P., Yin, Z.Y., Jin, Y.F., Ye, G.L., 2020c. An AI-based model for describing cyclic characteristics of granular materials. Int. J. Numer. Anal. Methods GeoMech. 1–21.
Zhou, W.-H., Garg, A., Garg, A., 2016. Study of the volumetric water content based on density, suction and initial water content. Measurement 94, 531–537.
Zhu, F., Bienen, B., O’Loughlin, C., Morgan, N., Cassidy, M.J., 2018. The response of suction caissons to multidirectional lateral cyclic loading in sand over clay. Ocean Eng. 170, 43–54.
Zhu, J.-H., Zaman, M.M., Anderson, S.A., 1998. Modeling of soil behavior with a recurrent neural network. Can. Geotech. J. 35, 858–872.
