Ocean Engineering: Pin Zhang, Zhen-Yu Yin, Yuanyuan Zheng, Fu-Ping Gao
Ocean Engineering
journal homepage: www.elsevier.com/locate/oceaneng
A R T I C L E I N F O
Keywords: Caisson foundation; Failure envelope; Smoothed particle hydrodynamics; Long short-term memory

A B S T R A C T
This study proposes a hybrid surrogate modelling approach that integrates the deep learning algorithm long short-term memory (LSTM) to identify the mechanical responses of caisson foundations in marine soils. The LSTM based surrogate model is first trained on a limited set of results generated from well-validated SPH-SIMSAND based numerical simulations, and is thereafter applied to predict the mechanical responses of soil-structure interaction and the failure envelope of unseen caisson foundations with various specifications as testing. The results indicate that the LSTM based model is more flexible than the macro-element method, because it can learn the failure mechanism of the caisson foundation directly from the raw data while maintaining high computational efficiency and accuracy in comparison with physical and numerical modelling. The LSTM based surrogate model shows great potential for application in engineering practice.
* Corresponding author.
E-mail addresses: [email protected], [email protected] (Z.-Y. Yin).
https://doi.org/10.1016/j.oceaneng.2020.107263
Received 30 December 2019; Received in revised form 14 March 2020; Accepted 16 March 2020
Available online 1 April 2020
0029-8018/© 2020 Elsevier Ltd. All rights reserved.
P. Zhang et al. Ocean Engineering 204 (2020) 107263
Fig. 3. Schematic view of hybrid SPH-SIMSAND and LSTM surrogate modelling process.
Table 1
Specifications of caisson foundations in the numerical model.
Specification (L, D)
Training set: (1, 2), (1, 2.9), (1.5, 2), (1.5, 2.39), (2, 1), (2, 2), (2, 4), (2, 5.8), (2.96, 1.5), (3, 4), (3.72, 1.24), (4, 2), (4, 4), (4, 8), (4, 11.6), (4.75, 1), (6, 2.97), (6, 8), (6, 9.55), (8, 2.33), (8, 4), (8, 8), (8, 16), (10, 1.91), (10, 20), (12, 5.94), (12, 16), (15, 20), (16, 8), (16, 16), (20, 20), (20, 10)
Testing set: (1, 2.83), (1.5, 2.31), (2, 5.65), (3.56, 1.5), (4, 11.32), (4.15, 1.39), (6, 3.26), (6, 9.24)

and measured results (Rumelhart et al., 1986). In other words, the prediction of the current output parameters is not affected by the previous information, nor does it affect the prediction of the output parameters at the next step. Given an input matrix $x = [x_1, x_2, \ldots, x_n]$, the outputs of the hidden and output layers can be obtained by:

\[ h = f(Ux + b_1) \tag{1} \]

\[ o = g(Vh + b_2) \tag{2} \]

where $U$, $V$ = matrices connecting the input and hidden layers, and the hidden and output layers, respectively; $b_1$, $b_2$ = bias vectors of the hidden and output layers, respectively; $f$, $g$ = activation functions in the hidden and output layers, respectively.
The main departure of the RNN is that a cyclic connection topology is adopted, as presented in Fig. 1. The predicted output at the current step depends on the current values of the input parameters and on the information transferred from the former hidden layer, which can be obtained by:

\[ h_t = f(Ux_t + Wh_{t-1} + b_1) \tag{3} \]

where $W$ = matrix connecting the hidden layers at adjacent steps.
The history information is stored and applied to predict the next state. This history-dependent characteristic makes RNNs applicable to problems with sequential datasets, such as language translation, speech recognition, and the prediction of load–deformation responses (Wang and Sun, 2018; Zhu et al., 1998). However, training RNNs has proved to be problematic because the back-propagated gradients either grow or shrink at each time step, resulting in exploding or vanishing gradients (LeCun et al., 2015); that is, the learning efficiency of the hidden layers at the front of the architecture is poorer than that of the later hidden layers.
To overcome exploding and vanishing gradients in conventional RNNs, a memory cell is added to the architecture of the LSTM in place of the neurons used in conventional RNNs. Such a memory cell can store information over extended time intervals and handle long-time-lag tasks (Hochreiter and Schmidhuber, 1997a) by using a novel entity termed a “gate”, as presented in Fig. 2. Three gates, i.e., the forget, input and output gates, are included in the memory cell to control the flow of information and the state of the cell. The forget gate decides which information is discarded from the memory cell, the input gate decides which information is stored in the memory cell, and the output gate decides the ultimate output values. The outputs of the forget and input gates at the tth step are obtained by:

\[ f_t^j = \sigma\left( [U_f x_t]_j + [W_f h_{t-1}]_j + [b_f]_j \right) \tag{4} \]

\[ i_t^j = \sigma\left( [U_i x_t]_j + [W_i h_{t-1}]_j + [b_i]_j \right) \tag{5} \]

where $\sigma$ = sigmoid function. In the forget gate, $\sigma$ = 1 and 0 indicate that all information is retained or discarded, respectively. In the input gate, $\sigma$ = 1 and 0 indicate that all information is selected or discarded, respectively.
Based on the forget and input information, the memory cell state at the tth step is updated by:

\[ \tilde{c}_t^j = \tanh\left( [U_c x_t]_j + [W_c h_{t-1}]_j + [b_c]_j \right) \tag{6} \]

\[ c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{7} \]

where tanh is the activation function; $\odot$ = elementwise product; $f_t \odot c_{t-1}$ represents the part of the previous cell state that is retained after forgetting; $i_t \odot \tilde{c}_t$ represents the newly selected information. Updating the memory cell state in an additive form helps avoid vanishing and exploding gradients. Thereafter, the output of the hidden layer at the tth step is obtained by:

\[ o_t^j = \sigma\left( [U_o x_t]_j + [W_o h_{t-1}]_j + [b_o]_j \right) \tag{8} \]

\[ h_t = o_t \odot \tanh(c_t) \tag{9} \]

A multiplicative input gate unit is employed to protect the memory contents stored at the current step from perturbation by irrelevant inputs, and a multiplicative output gate unit is employed to protect other
Fig. 4. Results of SPH-SIMSAND numerical modelling: (a) u–H; (b) θ–M; (c) H–M/D; (d) failure envelope.
units from perturbation by currently irrelevant memory contents stored at the current step (Hochreiter and Schmidhuber, 1997a). It should be noted that numerous LSTM variants have since been proposed, such as the gated recurrent unit (GRU) (Cho et al., 2014). The LSTM, whose numerous weights and biases are beneficial for predicting high-dimensional problems, is selected to explore its feasibility in capturing the responses of caisson foundations.

$H_t$ and bending moment $M_t$. The training performance of the LSTM based model is evaluated by the mean square error (MSE) values on both the training and testing sets. Meanwhile, the 10-fold cross-validation method is applied to enhance model robustness, so that the loss function can be obtained by:

\[ \mathrm{MSE} = \frac{1}{10r} \sum_{i=1}^{10} \sum_{j=1}^{r} \left( y_i^m - y_i^p \right)^2 \tag{10} \]
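As an illustration only, the gate equations (4)–(9) and the cross-validated loss of Eq. (10) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the implementation used in the study: the helper names (`lstm_step`, `init_params`, `cv_mse`), the random initialisation, the layer sizes, and the reading of r as the number of samples per fold are all hypothetical choices made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, seed=0):
    """Hypothetical initialisation: small random U, W and zero b for each gate."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in "fico":  # forget, input, candidate cell, output
        p[f"U_{gate}"] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        p[f"W_{gate}"] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p[f"b_{gate}"] = np.zeros(n_hidden)
    return p

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step written term by term after Eqs. (4)-(9)."""
    f_t = sigmoid(p["U_f"] @ x_t + p["W_f"] @ h_prev + p["b_f"])      # Eq. (4)
    i_t = sigmoid(p["U_i"] @ x_t + p["W_i"] @ h_prev + p["b_i"])      # Eq. (5)
    c_tilde = np.tanh(p["U_c"] @ x_t + p["W_c"] @ h_prev + p["b_c"])  # Eq. (6)
    c_t = f_t * c_prev + i_t * c_tilde                                # Eq. (7)
    o_t = sigmoid(p["U_o"] @ x_t + p["W_o"] @ h_prev + p["b_o"])      # Eq. (8)
    h_t = o_t * np.tanh(c_t)                                          # Eq. (9)
    return h_t, c_t

def cv_mse(folds):
    """Eq. (10)-style loss: squared error averaged over folds and over the
    samples in each fold (assumes every fold holds the same number r)."""
    return float(np.mean([np.mean((y_m - y_p) ** 2) for y_m, y_p in folds]))

# Run a short dummy sequence through the cell (sizes are arbitrary).
params = init_params(n_in=3, n_hidden=4)
h, c = np.zeros(4), np.zeros(4)
for x_t in np.ones((5, 3)):
    h, c = lstm_step(x_t, h, c, params)

loss = cv_mse([(np.array([1.0, 2.0]), np.array([1.0, 4.0]))])
```

Because the cell state in Eq. (7) is updated by addition rather than by repeated matrix multiplication, the hidden output stays bounded by the output gate and the tanh of the cell state.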
Fig. 6. Loss values yielded by LSTM model with dropout layer on the: (a)
training set; (b) testing set.
Fig. 5. Data smoothing: (a) u–H; (b) θ–M.
\[ \bar{x}_n = \frac{1}{t} \sum_{i=n-t+1}^{n} x_i \tag{13} \]
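A trailing moving average of the kind expressed by Eq. (13) can be sketched as follows. This is a hedged illustration rather than the authors' preprocessing code: the function name `trailing_moving_average` is hypothetical, and the treatment of the first t − 1 points (averaging over however many samples are available so far) is an assumption, since the surviving text does not say how the start of the series is handled.

```python
import numpy as np

def trailing_moving_average(x, t):
    """Mean of the t most recent samples ending at index n, as in Eq. (13).
    For n < t - 1 the window is shortened to the samples available (assumption)."""
    x = np.asarray(x, dtype=float)
    csum = np.cumsum(np.insert(x, 0, 0.0))  # csum[k] = sum of x[:k]
    out = np.empty_like(x)
    for n in range(len(x)):
        start = max(0, n - t + 1)
        out[n] = (csum[n + 1] - csum[start]) / (n + 1 - start)
    return out

smoothed = trailing_moving_average([1, 2, 3, 4, 5], t=3)
```

A larger window t removes more of the numerical oscillation but also delays and flattens sharp changes in the curve, so the window length is a trade-off.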
Fig. 8. Predicted loading using LSTM based surrogate model for the training set L = 4, D = 8, in comparison with numerical results: (a) u–H; (b) θ–M; (c) H–M/D.
NumPy library. The data mining and analysis toolbox Pandas is employed to import the CSV dataset files.
The results of the grid search indicate that the LSTM model with three hidden layers produces the lowest loss value. The numbers of nodes in the three layers are 80, 80 and 50, and the corresponding activation functions are tanh, ReLU and ReLU, respectively. Therefore, the numbers of weights and biases are 104820 and 842, respectively. Over the course of training, the learning rate first increases from 0.0002 to 0.002 within 10 epochs and thereafter decreases from 0.002 to 0.0002 within 10 epochs, so that each period includes 20 epochs. Such a strategy helps the optimization process escape from local optima and saddle points. The loss value remains roughly steady and converges to a constant value within 200 epochs; the maximum number of epochs is therefore set as 200.

4.2. Underfitting and overfitting examination

The examination of underfitting and overfitting is a key step to guarantee the reliability of the LSTM based model. Learning curves of the loss values on both the training and testing sets have been successfully
Fig. 9. Predicted loading using LSTM based surrogate model for the testing set L = 6, D = 9.24, in comparison with numerical results: (a) u–H; (b) θ–M; (c) H–M/D.
used to evaluate the underfitting and overfitting problems (Hassan et al., 2020), because they reflect how well the behavior of a neural network improves with an increasing number of training samples or increasing network complexity (Murata et al., 1993). Large loss values on both the training and testing sets indicate that the LSTM based model suffers from underfitting. A low loss value on the training set together with a large loss value on the testing set indicates that the LSTM based model suffers from overfitting. This study thus uses learning curves to examine the potential underfitting and overfitting issues.
Numerous research works have demonstrated that the dropout family of methods gives significant advantages over other regularization methods such as L1 and L2 penalties (Moradi et al., 2019; Srivastava et al., 2014). Therefore, a dropout method is used to avoid potential overfitting of the LSTM in this study. The effect of the dropout rate on the prediction performance of the LSTM based models can be observed in Fig. 6. Comparing the models with dropout rates of 0, 0.2, 0.4, 0.6 and 0.8, it is clear that the loss values on both the training and testing sets increase with increasing dropout rate. Meanwhile, increasing the dropout rate causes larger variation in the loss values. The minor effectiveness of the dropout layer indicates that the LSTM based model can suppress the overfitting problem well over the course of training and provide accurate predictions. Therefore, the dropout rate is set as 0 in this study.
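The network size and the cyclical learning-rate schedule reported above (three LSTM layers with 80, 80 and 50 nodes; a learning rate cycling between 0.0002 and 0.002 with a 20-epoch period) can be checked with a short sketch. Several things here are assumptions rather than statements from the paper: the standard LSTM parameter formula with four weight blocks per layer, a final dense layer, an input size of 6 and an output size of 2. These sizes are not given in the surviving text; they are chosen because they reproduce the reported 104820 weights and 842 biases. The function names are hypothetical.

```python
def lstm_param_counts(n_in, hidden_sizes, n_out):
    """Count weights and biases of stacked LSTM layers plus a dense output
    layer, using the standard four-block LSTM parameterisation (assumption)."""
    weights = biases = 0
    prev = n_in
    for h in hidden_sizes:
        weights += 4 * (prev * h + h * h)  # input and recurrent matrices, 4 blocks
        biases += 4 * h
        prev = h
    weights += prev * n_out                # dense output layer
    biases += n_out
    return weights, biases

def triangular_lr(epoch, lr_min=0.0002, lr_max=0.002, half_period=10):
    """Triangular cyclical learning rate: linear rise over half_period epochs,
    then linear fall over the next half_period epochs."""
    pos = epoch % (2 * half_period)
    frac = pos / half_period if pos <= half_period else 2.0 - pos / half_period
    return lr_min + (lr_max - lr_min) * frac

# n_in=6 and n_out=2 are assumed values, not taken from the paper.
counts = lstm_param_counts(n_in=6, hidden_sizes=[80, 80, 50], n_out=2)
schedule = [triangular_lr(e) for e in range(40)]  # two full cycles
```

With these assumed sizes, `counts` matches the totals quoted in the text, which suggests, without confirming, that the model takes six input features and predicts two outputs.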
Fig. 10. Comparison between loading paths of testing sets yielded by SPH-SIMSAND and LSTM.
Fig. 12. Comparison between failure envelopes of testing sets yielded by SPH-SIMSAND and LSTM.
study. Meanwhile, the loss value decreases continuously for both the training and testing sets, and the convergence values are roughly identical. These factors indicate that the constructed LSTM model overcomes the underfitting and overfitting problems well.

4.3. Evaluation of surrogate model performance

All optimum values of the hyper-parameters are determined as described in the former two sections. Table 2 summarizes these values, and the model is trained based on this set of parameters. The indicator values describing the prediction performance of the model are presented in Table 3. For the training set, the MAPE values are low for both the horizontal force and moment predictions, while the NSE values are close to 1. The LSTM based model shows excellent performance in capturing the loading paths of caisson foundations.
For brevity, the predicted loading path of one training set with L = 4 m and D = 8 m is presented as a typical example to illustrate the training performance of the LSTM based model, as shown in Fig. 8. Such results are obtained within several seconds. Remarkably, the LSTM based model is capable of replicating the u–H, θ–M and H–M/D relationships with negligible error. The results presented in Fig. 8(a)–(b) and (e)–(f) indicate that the softening behavior can be captured by the LSTM based model. The excellent repeatability provides a basis for the LSTM based surrogate model to replace numerical modelling for investigating the mechanical responses of caisson foundations at lower computational cost.

5. Online prediction using LSTM surrogate model

5.1. Loading paths prediction

To test the reliability of the LSTM based surrogate model and guarantee its application in engineering practice, the responses of eight additional caisson foundations are investigated using the LSTM based model developed in the former section, while the SPH-SIMSAND platform is also used to simulate the same cases for comparison. Fig. 9 presents the predicted loading paths of a caisson foundation with L = 6 m and D = 9.24 m. It can be observed that the LSTM based model performs excellently in reproducing the u–H relationship, but the prediction errors of the initial θ–M and H–M/D relationships are large, which is attributed to the loss function, i.e., the MSE value. This indicator focuses on eliminating the discrepancy of large output values, whereas small output values are less important, so the trained LSTM based model shows a larger error in predicting the initial loading paths. An in-depth study of loss function selection, to achieve a trade-off between predicting large and small values and to further improve the model generalization ability, is important but out of the scope of this paper. These studies will be conducted in a future dedicated work. Overall, the prediction performance of the LSTM based model on the mechanical responses of unknown caisson foundations is reliable.
Fig. 10 presents the predicted H–M/D relationships of the remaining seven testing sets. The loading paths generated from the LSTM based model show good agreement with the numerical results. Small MAPE and high NSE values are generated on the testing set, as presented in Table 3. The simulations using the LSTM based model are completed without using any internal variables to capture the responses of caisson foundations. Such a model is thus ready to be used to predict the failure envelope of caisson foundations with various specifications on a given soil type in engineering practice.

5.2. Prediction of failure envelope in the H–M plane

As presented in Fig. 4(d), the failure envelope in the H–M plane has an elliptical shape. Following Villalobos et al. (2009), Jin et al. (2019c) proposed an ellipse formulation with only three parameters a, b and ϕ to describe the failure envelope of a caisson foundation in the H–M plane. Fig. 11 illustrates the notation convention of the failure envelope, in which a and b are the major and minor axes of the ellipse, respectively, and ϕ is the rotation of the ellipse. The formulation can be obtained by:

\[ A_1 X^2 + A_2 XY + A_3 Y^2 + A_4 = 0 \tag{15} \]

\[ \begin{cases} A_1 = a^2 (\sin\phi)^2 + b^2 (\cos\phi)^2 \\ A_2 = 2\left(b^2 - a^2\right) \sin\phi \cos\phi \\ A_3 = a^2 (\cos\phi)^2 + b^2 (\sin\phi)^2 \\ A_4 = -a^2 b^2 \end{cases} \tag{16} \]

where X and Y denote the horizontal force H × 10^4 and the normalized moment M/D × 10^4 in this study.
The failure loci of the eight testing cases obtained from the numerical modelling and the LSTM based model are plotted together in Fig. 12. The predicted points are close to the numerical results. Using Eqs. 15 and 16 to fit these failure loci, it can be observed that the fitted failure envelope based on failure loci obtained from the LSTM based models exhibit good

Table 4
Values of parameters used in failure envelope.
              SPH-SIMSAND                       LSTM based model
L     D       a (×10^4)  b (×10^4)  ϕ (°)      a (×10^4)  b (×10^4)  ϕ (°)
1     2.83    0.50       0.17       26.30      0.52       0.18       28.96
1.5   2.31    0.54       0.15       37.27      0.57       0.18       39.06
2     5.65    1.39       0.31       19.95      1.59       0.29       21.86
3.56  1.5     1.49       0.15       62.98      1.54       0.17       63.31
4     11.32   3.88       0.66       19.48      3.84       0.69       17.74
4.15  1.39    2.06       0.18       67.5       2.23       0.23       67.75
6     3.26    3.47       0.99       52.84      3.63       0.89       55.51
6     9.24    5.20       1.29       31.69      5.20       1.51       32.39
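The ellipse formulation of Eqs. 15 and 16 can be evaluated directly from the three parameters a, b and ϕ. The sketch below is illustrative only and rests on stated assumptions: Eq. 16 is read with A2 = 2(b² − a²) sin ϕ cos ϕ and A4 = −a²b² (the signs that make Eq. 15 a rotated ellipse), a and b are treated as semi-axis lengths, and ϕ is taken in degrees as in Table 4. The function names are hypothetical.

```python
import math

def ellipse_coefficients(a, b, phi_deg):
    """Coefficients A1..A4 of Eq. 16 for semi-axes a, b and rotation phi (degrees)."""
    phi = math.radians(phi_deg)
    s, c = math.sin(phi), math.cos(phi)
    A1 = a**2 * s**2 + b**2 * c**2
    A2 = 2.0 * (b**2 - a**2) * s * c
    A3 = a**2 * c**2 + b**2 * s**2
    A4 = -(a**2) * (b**2)
    return A1, A2, A3, A4

def envelope_residual(X, Y, coeffs):
    """Left-hand side of Eq. 15: zero on the envelope, negative inside it."""
    A1, A2, A3, A4 = coeffs
    return A1 * X**2 + A2 * X * Y + A3 * Y**2 + A4

# Table 4, SPH-SIMSAND row for L = 6, D = 9.24: a = 5.20, b = 1.29, phi = 31.69 deg
a, b, phi_deg = 5.20, 1.29, 31.69
coeffs = ellipse_coefficients(a, b, phi_deg)

# The end point of the major semi-axis should lie on the envelope.
phi = math.radians(phi_deg)
tip = (a * math.cos(phi), a * math.sin(phi))
tip_residual = envelope_residual(tip[0], tip[1], coeffs)
origin_residual = envelope_residual(0.0, 0.0, coeffs)
```

Evaluating the residual for a load combination (X, Y) gives a quick check of whether it falls inside (negative) or outside (positive) the fitted envelope, in the scaled units used for X and Y.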
Fig. 13. Values of parameters used in failure envelope: (a) major axis a; (b) minor axis b; (c) rotation ϕ.
Jin, Y.-F., Yin, Z.-Y., Shen, S.-L., Hicher, P.-Y., 2016. Selection of sand models and identification of parameters using an enhanced genetic algorithm. Int. J. Numer. Anal. Methods GeoMech. 40 (8), 1219–1240.
Jin, Y.-F., Yin, Z.-Y., Zhou, W.-H., Horpibulsuk, S., 2019a. Identifying parameters of advanced soil models using an enhanced transitional Markov chain Monte Carlo method. Acta Geotechnica 14 (6), 1925–1947.
Jin, Z., Yin, Z.Y., Kotronis, P., Jin, Y.F., 2018. Numerical investigation on evolving failure of caisson foundation in sand using the combined Lagrangian-SPH method. Marine Georesources & Geotechnology 37 (1), 23–35.
Jin, Z., Yin, Z.-Y., Kotronis, P., Li, Z., 2019b. Advanced numerical modelling of caisson foundations in sand to investigate the failure envelope in the H-M-V space. Ocean Eng. 190 (15), 106394.
Jin, Z., Yin, Z.-Y., Kotronis, P., Li, Z., Tamagnini, C., 2019c. A hypoplastic macroelement model for a caisson foundation in sand under monotonic and cyclic loadings. Mar. Struct. 66, 16–26.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521 (7553), 436–444.
Li, Z., Kotronis, P., Escoffier, S., Tamagnini, C., 2015. A hypoplastic macroelement for single vertical piles in sand subject to three-dimensional loading conditions. Acta Geotechnica 11 (2), 373–390.
Liu, X., Gasco, F., Goodsella, J., Yua, W.B., 2019. Initial failure strength prediction of woven composites using a new yarn failure criterion constructed by deep learning. Compos. Struct. 230, 111505.
Liu, M., Yang, M., Wang, H., 2014. Bearing behavior of wide-shallow bucket foundation for offshore wind turbines in drained silty sand. Ocean Engineering 82, 169–179.
Montrasio, L., Nova, R., 1997. Settlements of shallow foundations on sand geometrical effects. Geotechnique 47 (1), 49–60.
Moradi, R., Berangi, R., Minaei, B., 2019. A survey of regularization strategies for deep models. Artif. Intell. Rev. https://doi.org/10.1007/s10462-019-09784-7.
Murata, N., Yoshizawa, S., Amari, S., 1993. Learning curves, model selection and complexity of neural networks. In: Hanson, S.J., Cowan, J.D., Giles, C.L. (Eds.), Advances in Neural Information Processing Systems, 5. Morgan Kaufmann, San Mateo, CA, pp. 607–614, 1993.
Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models part I - a discussion of principles. J. Hydrol. 10 (3), 282–290.
Nova, R., Montrasio, L., 1991. Settlements of shallow foundations on sand. Geotechnique 41 (2), 243–256.
Reuter, U., Sultan, A., Reischl, D.S., 2018. A comparative study of machine learning approaches for modeling concrete failure surfaces. Adv. Eng. Software 116, 67–79.
Ruder, S., 2016. An Overview of Gradient Descent Optimization arXiv preprint, arXiv:1609.04747v04742.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-propagating errors. Nature 323 (9), 533–536.
Sarir, P., Shen, S.-L., Wang, Z.-F., Chen, J., Horpibulsuk, S., Pham, B.T., 2019. Optimum model for bearing capacity of concrete-steel columns with AI technology via incorporating the algorithms of IWO and ABC. Eng. Comput. 1–11.
Skau, K.S., Chen, Y., Jostad, H.P., 2018a. A numerical study of capacity and stiffness of circular skirted foundations in clay subjected to combined static and cyclic general loading. Geotechnique 68 (3), 205–220.
Skau, K.S., Grimstad, G., Page, A.M., Eiksund, G.R., Jostad, H.P., 2018b. A macro-element for integrated time domain analyses representing bucket foundations for offshore wind turbines. Mar. Struct. 59, 158–178.
Smith, L.N., 2017. Cyclical learning rates for training neural networks. In: IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, California.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R., 2014. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958.
Villalobos, F.A., Byrne, B.W., Houlsby, G.T., 2009. An experimental study of the drained capacity of suction caisson foundations under monotonic loading for offshore applications. Soils Found. 49 (3), 477–488.
Wang, K., Sun, W., 2018. A multiscale multi-permeability poroplasticity model linked by recursive homogenizations and deep learning. Comput. Methods Appl. Mech. Eng. 334, 337–380.
Xu, P., Du, R., Zhang, Z., 2019. Predicting pipeline leakage in petrochemical system through GAN and LSTM. Knowl. Base Syst. 175, 50–61.
Yang, B., Yin, K., Lacasse, S., Liu, Z., 2019. Time series analysis and long short-term memory neural network to predict landslide displacement. Landslides 16 (4), 677–694.
Yin, Z.-Y., Huang, H.-W., Hicher, P.-Y., 2016. Elastoplastic modeling of sand–silt mixtures. Soils Found. 56 (3), 520–532.
Yin, Z.-Y., Jin, Y.-F., Shen, J.S., Hicher, P.-Y., 2018a. Optimization techniques for identifying soil parameters in geotechnical engineering: comparative study and enhancement. Int. J. Numer. Anal. Methods GeoMech. 42 (1), 70–94.
Yin, Z.-Y., Jin, Z., Kotronis, P., Wu, Z.-X., 2018b. Novel SPH SIMSAND–based approach for modeling of granular collapse. Int. J. GeoMech. 18 (11).
Yin, Z.-Y., Xu, Q., Hicher, P.-Y., 2013. A simple critical-state-based double-yield-surface model for clay behavior under complex loading. Acta Geotechnica 8 (5), 509–523.
Zafeirakos, A., Gerolymos, N., 2016. Bearing strength surface for bridge caisson foundations in frictional soil under combined loading. Acta Geotechnica 11 (5), 1189–1208.
Zhang, N., Shen, S.-L., Zhou, A., Xu, Y.-S., 2019a. Investigation on performance of neural networks using quadratic relative error cost function. IEEE Access 7, 106642–106652.
Zhang, P., 2019. A novel feature selection method based on global sensitivity analysis with application in machine learning-based prediction model. Appl. Soft Comput. 85, 105859.
Zhang, P., Chen, R.P., Wu, H.N., 2019b. Real-time analysis and regulation of EPB shield steering using Random Forest. Autom. ConStruct. 106, 102860.
Zhang, P., Wu, H.N., Chen, R.P., Chan, T.H.T., 2020. Hybrid meta-heuristic and machine learning algorithms for tunneling-induced settlement prediction: A comparative study. Tunnelling and Underground Space Technology 99, 103383.
Zhang, P., Yin, Z.-Y., Jin, Y.-F., Chan, T.H.T., 2020a. A novel hybrid surrogate intelligent model for creep index prediction based on particle swarm optimization and random forest. Eng. Geol. 265, 105328.
Zhang, P., Yin, Z.Y., Jin, Y.F., Chan, T., 2020b. Intelligent modelling of clay compressibility using hybrid meta-heuristic and machine learning algorithms. Geosci. Front. (in press).
Zhang, P., Yin, Z.Y., Jin, Y.F., Ye, G.L., 2020c. An AI-based model for describing cyclic characteristics of granular materials. Int. J. Numer. Anal. Methods GeoMech. 1–21.
Zhou, W.-H., Garg, A., Garg, A., 2016. Study of the volumetric water content based on density, suction and initial water content. Measurement 94, 531–537.
Zhu, F., Bienen, B., O’Loughlin, C., Morgan, N., Cassidy, M.J., 2018. The response of suction caissons to multidirectional lateral cyclic loading in sand over clay. Ocean Eng. 170, 43–54.
Zhu, J.-H., Zaman, M.M., Anderson, S.A., 1998. Modeling of soil behavior with a recurrent neural network. Can. Geotech. J. 35, 858–872.