Application of Deep Learning Algorithms in Geotech
https://doi.org/10.1007/s10462-021-09967-1
Wengang Zhang1,2,3 · Hongrui Li1,2,3 · Yongqin Li1,2,3 · Hanlong Liu1,2,3 · Yumin Chen4 ·
Xuanming Ding1,2,3
Abstract
With the advent of the big data era, deep learning (DL) has become an essential research subject
in the field of artificial intelligence (AI). Compared with traditional machine learning (ML)
methods, DL algorithms offer powerful feature learning and representation capabilities, which has
attracted researchers worldwide and from different fields to an increasingly wide range of
applications. In geotechnical engineering, DL has been widely adopted across various research
topics, so a comprehensive review summarizing its applications is desirable. This study therefore
presents the state of practice of DL in geotechnical engineering and depicts the statistical trend
of the published papers. Four major algorithms, namely the feedforward neural network (FNN),
recurrent neural network (RNN), convolutional neural network (CNN) and generative adversarial
network (GAN), are elaborated along with their geotechnical applications. In addition, a thorough
summary is compiled that covers the published literature, the corresponding reference cases, the
adopted DL algorithms and the related geotechnical topics. Finally, the challenges and perspectives
of the future development of DL in geotechnical engineering are presented and discussed.
* Wengang Zhang
[email protected]
1 Key Laboratory of New Technology for Construction of Cities in Mountain Area, Chongqing University, Chongqing, China
2 National Joint Engineering Research Center of Geohazards Prevention in the Reservoir Areas, Chongqing University, Chongqing 400045, China
3 School of Civil Engineering, Chongqing University, Chongqing, China
4 College of Civil and Transportation Engineering, Hohai University, Nanjing, China
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
5634 W. Zhang et al.
1 Introduction
Due to the inherent complexity of geotechnical materials, researchers tend to replace tedious
theoretical solutions with soft computing methods to solve various geotechnical design problems
and assessment issues. Geotechnical problems are characterized by great uncertainties and involve
various factors that cannot be directly determined by engineers, which has led to the rapid
popularity of machine learning (ML) methods (Goh and Zhang 2014; Wang et al. 2020; Zhang et al.
2017b). ML algorithms are capable of capturing the potential correlations among information
without any prior assumptions (Goh et al. 2018; van Natijne et al. 2020; Zhang et al. 2015,
2019b). With the improvement of computing efficiency, explorations of artificial intelligence (AI)
and deep learning (DL) are in full swing (Da’u and Salim 2020; Nguyen et al. 2019; Zhang et al.
2019c). Specifically, AI, ML and DL have an inclusion relationship, as shown in Fig. 1. AI is a
science, like biology or mathematics, that studies ways to build intelligent programs capable of
creatively solving problems, an ability long regarded as a human prerogative. ML is a subset of AI
that gives systems the ability to learn and improve automatically from experience without being
explicitly programmed. DL, or deep neural learning, is a particular kind of ML that achieves great
power and flexibility by learning to represent the world as a nested hierarchy of concepts,
without manual feature extraction. The performance of traditional ML algorithms, such as Support
Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Multivariate Adaptive Regression
Splines (MARS) and Naïve Bayes (NB), tends to stagnate to a certain extent as the amount of data
increases (Goh et al. 2017; Zhang and Goh 2013; Zhang et al. 2020d). By contrast, for DL, which
belongs to the broader family of ML methods based on artificial neural networks, prediction
accuracy gradually increases as the dataset expands, provided the data are free of noise. DL
therefore provides efficient tools to process such data and extract useful information for making
reliable decisions in geo-engineering (Shrestha and Mahmood 2019).
In the first decade of the twenty-first century, researchers mainly discussed shallow artificial
neural networks (ANN) combined with optimization algorithms for certain aspects, and the
complexity of the networks was not extended much. This changed with the 2012 ImageNet
competition, in which Hinton and other scholars applied a DL method named AlexNet (a variant of
the convolutional neural network (CNN)) that greatly improved the predictive accuracy, giving rise
to the wide spread of DL algorithms in numerous fields and disciplines. A timeline of DL
development is shown in Fig. 2, from which it can be seen that DL has made great progress in
recent decades. As one of the cutting-edge algorithms in AI, DL excels at defining complex
nonlinear relationships between features in different domains, such as health and medicine
(Lisboa 2002), business and management (Wong et al. 1997), natural language processing (NLP)
(Mikolov et al. 2010), image processing (Ayyıldız and Çetinkaya 2017; Egmont-Petersen et al. 2002;
He et al. 2015), geosciences and remote sensing (Lary et al. 2016), mathematics (Gao et al. 2019a,
b, 2018a, b), civil engineering (Gandomi and Alavi 2012a, b; Lazarevska et al. 2014; Zhang et al.
2020c, 2014), early warnings related to geotechnical problems (Chou and Thedja 2016), and risk
assessment (Adams and Kanaroglou 2016; Dong et al. 2017).
Fig. 1 Venn diagram representing the relationships between AI, ML and DL. (Adapted from Goodfellow
et al. 2016)
In practice, DL studies in geotechnical engineering account for a relatively small share of these
extensive applications because of the uncertain and varying behavior of rock and soil. To the best
knowledge of the authors, several researchers have gradually explored new decision-making tools
for the application of neural networks in the geotechnical field (Chua and Goh 2005; Li et al.
2012; Uncuoglu et al. 2008). There exist many detailed reviews of ANN research, discussing AI
technology in the future (Jiao and Alavi 2020), the application of ANN in the general civil
engineering field (Huang et al. 2019; Kapliński et al. 2016), the meta-analysis of ANN application
in geotechnical engineering (Moayedi et al. 2020; Zhang et al. 2019d) and the research on ANN from
shallow to deep models (Saikia et al. 2020), etc. Nevertheless, some have pointed out that
although the volume of monitoring data in the geotechnical field keeps growing as development
expands, it is still far from sufficient to fully apply DL when compared with some other fields
(Phoon 2020; Zhong et al. 2020). The latest research in the geotechnical field shows that DL is a
very promising means of solving geotechnical problems that involve many uncertain factors. As the
fundamental tool of DL, ANN has been widely used in various geotechnical aspects, such as
underground openings (Lee and Sterling 1992), braced excavation (Goh et al. 1995; Zhang et al.
2020b), earth retaining structures (Calabrese and Lai 2013; Moayedi et al. 2011), slope stability
(Chakraborty and Goswami 2017), modeling tunnel boring machine (TBM) performance (Benardos and
Kaliampakos 2004), liquefaction (Baziar and Jafarian 2007), pile integrity testing (Watson et al.
1995), soil swelling (Erzin 2007; Garg et al. 2014), predicting geotechnical parameters (Asadi
et al. 2011a, b, c; Lee et al. 2003), pile bearing capacity (Moayedi and Armaghani 2018; Moayedi
and Rezaei 2019; Mosallanezhad and Moayedi 2017), kinematic soil pile interaction response
parameters (Ahmad et al. 2007), shallow foundation and pile foundation (Shahin 2015, 2016), as
well as corrosion monitoring (Mabbutt et al. 2012). Building on the findings of these ANN
applications, new methods have been proposed to further improve performance. With the rapid
development of DL, its prospective applications in geotechnics and geoengineering should be broad
and profound.
However, the investigations conducted so far were limited to specific topics and did not
emphasize the new challenges and prospects of DL in this field. A demand still exists for a
comprehensive investigation of the published DL studies dedicated to geotechnical engineering.
Accordingly, the aim of this paper is to provide a structured and comprehensive review of the
literature that uses DL in geotechnical engineering and to identify drawbacks in the current
application process. The main contribution of this article is to present a detailed and updated
state of the art concerning the application of DL algorithms to complex geotechnical tasks, and to
provide an overview of the potential of these advanced algorithms and how they can be explored in
further applications. To better illustrate the rationale and primary coverage of this paper, the
research flowchart is presented in Fig. 3. Following the literature survey process, Sect. 2 counts
the papers published on this topic over the past 20 years and briefly depicts their distribution
across journals and the publication trend. Four classes of DL methods, namely feedforward neural
network (FNN), recurrent neural network (RNN), convolutional neural network (CNN) and generative
adversarial network (GAN), have been found to be the most used to date in geotechnical
engineering. Section 3 presents the methodology and DL model architectures, followed by their
practical applications in Sect. 4. Finally, Sect. 5 discusses the challenges and future prospects
in relation to this issue, and concluding remarks are presented in Sect. 6.
From the early 1900s up to the date of writing of this paper, more than fifty thousand research
articles in the field of geotechnical engineering have been indexed in the Web of Science (WOS).
Nevertheless, when the search scope is narrowed down to the application of DL in geotechnical
engineering, only 158 publications remain, with a very limited number of source titles. Figure 4
arranges the 158 articles by year of publication. DL application in geotechnical engineering was
extremely limited in the first decade of the twenty-first century, while the number of papers has
increased sharply in the last decade, reaching forty-three publications in 2019. This rapid growth
seems likely to continue, since many papers from the current year, 2020, are still being processed
owing to the retrieval rules.
Fig. 4 Annual distribution of published journal papers focusing on the DL application in
geotechnical engineering (Source: Web of Science; literature search last updated in November 2020)
To serve readers’ research interest in DL in the geotechnical field, the journals that published
more than one paper on the corresponding keywords, together with the percentage of selected
articles among the top fifteen journal sources, are drawn as a pie chart in Fig. 5. Meanwhile, the
distribution of journal sources by number of papers and the specific application papers are
tabulated in Table 1 (data from Web of Science). Note that book chapters, editor notes, and master
and doctoral dissertations were not included. As illustrated in the diagram, IEEE ACCESS is the
leading journal, having published over 28% of the papers focusing on the use of DL in geotechnical
engineering, followed by TUNNELLING AND UNDERGROUND SPACE TECHNOLOGY. Three of the top fifteen
journals (IEEE ACCESS, SENSORS, and ADVANCES IN CIVIL ENGINEERING) are open-access journals.
According to the data retrieved from the Web of Science, several DL algorithms are being used to
solve geotechnics-related problems. Figure 6 illustrates the main DL methods applied in this
field. In summary, FNN, RNN, CNN, and GAN are the most popular ones for solving complex
geotechnical problems. Their applications include soil parameter inference (Mollahasani et al.
2011), pile bearing capacity (Singh and Walia 2017), TBM performance (Ninić et al. 2017), stratum
thickness prediction (Zhou et al. 2019a), landslide susceptibility (Xiao et al. 2018), rock
physical parameter evaluation (Karimpouli and Tahmasebi 2019), etc. Moreover, GAN, a newly rising
unsupervised learning method, offers bright prospects for rock image identification and for
reconstructing the structure of porous media.
In addition, some other deep neural networks, such as the FL-SegNet (focal loss SegNet) network
(Dong et al. 2019), the auto-encoder algorithm (Canchumuni et al. 2017; Liu and Wu 2016), transfer
learning (Lu et al. 2020), deep belief networks (Chen et al. 2012; Hinton et al. 2006; Li et al.
2019a; Ye et al. 2019) and Restricted Boltzmann Machines (Canchumuni et al. 2019), have also been
investigated to extract valuable geological parameters and information. However, these other DL
methods are rarely applied and are unlikely to develop rapidly in a short time (Ball et al. 2017).
Thus, in the following sections, the principles and applications of the four main DL algorithms
mentioned above are explained in turn. Overall, the results indicate that combining DL techniques
with available databases is a promising research direction with potential for engineering
applications. Furthermore, DL is effective in handling image resolution, an aspect in which it is
also superior to shallow ANN techniques.
3 DL methodology
In this section, the theory and structure of four DL methods, namely FNN, RNN (including its
improved version, LSTM), CNN and GAN, are described in detail, and each step of the process is
explained with schematic diagrams.
ANN, as indicated by its name, mimics the biological learning process of the human brain, aiming
to build a highly intelligent system of that kind (Barrow 1996; Gurney 1997). The biological
neurons in the human brain correspond to the highly interconnected processing elements in the ANN
structure, which are also called neurons (Fukushima and Miyake 1982). Through complex multilayered
integration, the neurons form a network, named an Artificial Neural Network.
Table 1 Most contributed journals of DL in geotechnical engineering and corresponding papers (Data from Web of Science)
Number | Journal source | N | % | Publications
1 | IEEE ACCESS | 16 | 28.07 | Chen et al. (2019); Dong et al. (2019); Du et al. (2020); Gao et al. (2020b, c); Li et al. (2018, 2019b, 2020b, c); Luo et al. (2020); Qu et al. (2020); Shim et al. (2020); Song et al. (2019); Xie et al. (2019); Yang et al. (2020); Zhao et al. (2019a)
2 | TUNNELLING AND UNDERGROUND SPACE TECHNOLOGY | 6 | 10.53 | Cao et al. (2018); Huang et al. (2018a, b); Ninić et al. (2017); Wu et al. (2019); Zhao et al. (2020)
3 | JOURNAL OF COMPUTING IN CIVIL ENGINEERING | 5 | 8.77 | Jan et al. (2002); Zhang et al. (2017d, 2018a); Zhou et al. (2017, 2019c)
4 | JOURNAL OF PERFORMANCE OF CONSTRUCTED FACILITIES | 4 | 7.02 | Chen et al. (2015); Nelson et al. (2017); Tan and Lu (2017a, b)
5 | AUTOMATION IN CONSTRUCTION | 3 | 5.26 | Gao et al. (2019c); Li and Gong (2019b); Zhou et al. (2019b)
6 | COMPUTERS AND GEOSCIENCES | 3 | 5.26 | Han et al. (2019a); Wang et al. (2019); Xu and Niu (2018)
7 | APPLIED SCIENCES-BASEL | 3 | 5.26 | Han et al. (2019b); Zhang et al. (2018b); Zhou et al. (2019a)
8 | SENSORS | 3 | 5.26 | Cui et al. (2017); Mao et al. (2019); Zhang et al. (2019e)
Feedforward neural network (FNN) is the basic and simplest artificial neural network model. The
most common architecture of this computing paradigm is the multilayer perceptron, which consists
of at least three layers: an input layer, a hidden layer and an output layer. FNN has the capacity
to discover complex patterns and solve different statistical issues (Huang et al. 2018a; Qi et al.
2018; Qi and Tang 2018). By constructing a sequence of interrelated neurons in FNN, the
correlation between input features and output labels can be built. During training on a dataset,
the neurons are interconnected in multiple layers, and the weights assigned to the individual
connections are adjusted iteratively. Figure 7 displays a fundamental diagram of a multilayer
feedforward neural network. During the forward pass, specific values and functions are calculated
in each layer. The intermediate variable Z is the weighted sum of the inputs, and y is the output
obtained by applying the nonlinear activation function f of each layer to Z. W refers to the
weight between two units in adjacent layers, indicated by subscript letters, and b stands for the
bias value of the unit.
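The forward pass described above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions: the layer sizes, the sigmoid activation, the random initial weights and the function name `forward` are hypothetical choices made here and are not taken from the reviewed studies.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, used here as an assumed choice for f."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases, activation=sigmoid):
    """Propagate x through the layers and return every layer output."""
    outputs = [x]
    y = x
    for W, b in zip(weights, biases):
        z = W @ y + b        # intermediate variable Z: weighted sum plus bias
        y = activation(z)    # layer output y = f(Z)
        outputs.append(y)
    return outputs

# Hypothetical network: 3 input features, 4 hidden units, 1 output unit.
rng = np.random.default_rng(0)
sizes = [3, 4, 1]
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

outs = forward(np.array([0.2, -0.1, 0.5]), weights, biases)
prediction = outs[-1]        # output of the last layer
```

Each entry of `outs` holds the activations of one layer, which is also the information a training algorithm needs to reuse later.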
The training algorithm most widely used for FNN is the back-propagation algorithm, which consists
of a forward propagation phase and a backward propagation phase. The main purpose of the first
phase is to transmit the input information to the output layer through several hidden layers,
whose weights are initially assigned at random. When forward propagation is complete, some
divergence between the resulting prediction and the desired output unavoidably exists, which makes
it necessary to adjust the current model. The second phase brings the predicted output closer to
the desired one by updating the weights of the neurons in each layer. Gradient descent is
performed during back propagation: the derivative of the model error with respect to each weight
is calculated and propagated backwards, and the weights are updated to minimize the error.
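The two phases can be illustrated with a small, self-contained training loop. It is a sketch under assumptions rather than the procedure of any cited study: the synthetic dataset, the tanh hidden layer, the linear output, the mean squared error and the learning rate are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(64, 2))      # hypothetical input features
t = np.sin(X[:, :1] + X[:, 1:])               # hypothetical targets, shape (64, 1)

# Random initial weights, as in the forward propagation phase.
W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

lr = 0.05                                      # assumed learning rate
losses = []
for epoch in range(500):
    # Forward propagation: input -> hidden layer -> output.
    h = np.tanh(X @ W1 + b1)
    y = h @ W2 + b2
    err = y - t
    losses.append(float(np.mean(err ** 2)))    # mean squared error

    # Backward propagation: chain rule from the output back to the input.
    dy = 2.0 * err / X.shape[0]                # dLoss/dy
    dW2 = h.T @ dy
    db2 = dy.sum(axis=0)
    dh = dy @ W2.T
    dz1 = dh * (1.0 - h ** 2)                  # derivative of tanh
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)

    # Gradient descent update of every weight and bias.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```

Plotting `losses` against the epoch index is a simple way to check that the error shrinks as the weights are updated.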
When the number of hidden layers increases to more than one, the FNN model becomes a deep model.
Deep FNN models are characterized by learning multiple levels of representation; in other words,
they are capable of precisely modeling complex data interactions in practical engineering
problems. The learning method used to train such a model is called DL (Bengio 2009). For complex
issues such as time series, computer vision and speech recognition, increasing the number of
hidden layers may provide effective modeling capabilities (Bengio et al. 2007). However, the
learning process of deep neural networks may lead to overfitting and performance degradation.
In a traditional ANN, nodes in different layers (input layer, hidden layer, and output layer) are
connected to each other, while the nodes within each layer are independent. In RNN, by contrast,
adjacent nodes in a hidden layer are also connected to each other. Each node of the hidden layer
receives information from two sources: the hidden layer at the previous time point and the input
layer at the current time point, which gives RNN the ability to memorize information from
historical time points. The network then applies the information retained in memory to the output
calculation of the current neuron while continuously taking in new data as input. In short, RNN
can describe real-time dynamic behaviors using time-sequential data. A sketch of the network is
given in Fig. 8, where the simplified structure on the left can be unfolded into the detailed one
on the right. A complete network contains input, hidden and output layers (Williams and Zipser
1989), and the hidden layer at each time step must be connected to the one at the next time step.
For a given input sequence $x = (x_1, x_2, \ldots, x_T)$, the hidden state vector sequence is
$h = (h_1, h_2, \ldots, h_T)$ and the output sequence is $y = (y_1, y_2, \ldots, y_T)$. The output
sequence y can be obtained by iterating the following equations from time t = 1 to T (Fan et al.
2014; Yang et al. 2019). Let $x_t$, $h_t$ and $y_t$ represent the input, hidden, and output
vectors at sampling instant t, respectively. The hidden and output vectors at sampling time t can
be calculated as
$$h_t = f(U x_t + W h_{t-1} + b) \tag{1}$$
$$y_t = \alpha(V h_t + c) \tag{2}$$
where U stands for the weight matrix between the input and hidden vectors; W is the weight matrix
between different time steps of the hidden vectors; V represents the weight matrix that connects
the hidden layer to the output vectors; b and c are the bias vectors of the hidden and output
layers, respectively; and f and $\alpha$ are activation functions, such as the sigmoid function.
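Equations (1) and (2) translate directly into a loop over time steps. The sketch below is illustrative only: the dimensions, the zero initial hidden state, the choice of tanh for f and sigmoid for the output activation, and the function name `rnn_forward` are assumptions made for this example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def rnn_forward(x_seq, U, W, V, b, c, f=np.tanh, alpha=sigmoid):
    """Iterate Eq. (1) and Eq. (2) from t = 1 to T."""
    h = np.zeros(W.shape[0])             # h_0, assumed to be zero
    hs, ys = [], []
    for x_t in x_seq:
        h = f(U @ x_t + W @ h + b)       # Eq. (1): new hidden state
        y = alpha(V @ h + c)             # Eq. (2): output at time t
        hs.append(h)
        ys.append(y)
    return np.array(hs), np.array(ys)

# Hypothetical sizes: T = 5 time steps, 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(1)
T, n_x, n_h, n_y = 5, 3, 4, 2
U = rng.normal(scale=0.5, size=(n_h, n_x))
W = rng.normal(scale=0.5, size=(n_h, n_h))
V = rng.normal(scale=0.5, size=(n_y, n_h))
b, c = np.zeros(n_h), np.zeros(n_y)

hs, ys = rnn_forward(rng.normal(size=(T, n_x)), U, W, V, b, c)
```

Because the same U, W and V are reused at every step, the hidden state carries information from earlier inputs forward in time.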
Owing to its unique structure, RNN is characterized by a superior ability in time series
prediction and can theoretically handle arbitrarily long sequences. However, its weakness lies in
processing long-distance information: its learning ability is weakened by the gradient vanishing
or exploding problem, which makes it difficult to capture long-term temporal dependencies (Bengio
et al. 1994; Pascanu et al. 2013). To remedy this problem of conventional RNN, Hochreiter and
Schmidhuber (1997) proposed the long short-term memory network (LSTM), an upgrade of the original
standard RNN. The detailed structure of an LSTM unit is shown in Fig. 9. An LSTM unit includes
three gate controllers, known as the input, forget, and output gates. For every piece of
information passing through the unit, the gates decide whether it is to be remembered or
forgotten. To avoid the gradient vanishing problem, the LSTM network implements temporal memory by
switching those gates (Yuan et al. 2019). For the basic LSTM unit, the external inputs are the
previous cell state $c_{t-1}$, the previous hidden state $h_{t-1}$, and the current input vector
$x_t$. From Fig. 9, the three current gates are generated as Eq. (3) to Eq. (5).
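$$i_t = \sigma(W_{ix} x_t + W_{ih} h_{t-1} + b_i) \tag{3}$$
$$f_t = \sigma(W_{fx} x_t + W_{fh} h_{t-1} + b_f) \tag{4}$$
$$o_t = \sigma(W_{ox} x_t + W_{oh} h_{t-1} + b_o) \tag{5}$$
Inside the LSTM, the memory cell and the current hidden state in the memory block are calculated
as Eq. (6) to Eq. (7), where $\odot$ denotes the element-wise product.
$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{cx} x_t + W_{ch} h_{t-1} + b_c) \tag{6}$$
$$h_t = o_t \odot \tanh(c_t) \tag{7}$$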
where $i_t$, $f_t$, $o_t$ and $c_t$ are the values of the input gate, forget gate, output gate and
memory cell in the memory unit; $b_i$, $b_f$, $b_o$ and $b_c$ are their corresponding bias values;
$W_x$ represents the weights between input nodes and hidden nodes; $W_h$ represents the weights
between hidden nodes and the memory cell; $W_c$ represents the weights that connect the memory
cell to output nodes; and $\sigma$ and tanh represent the sigmoid and hyperbolic tangent
activation functions, respectively. The output $y_t$ can then be obtained by Eq. (2). The output
sequence $y = (y_1, y_2, \ldots, y_T)$ is updated by iterating Eq. (2) to Eq. (9) from time t = 1
to T (Yang et al. 2019).
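A single LSTM time step can be sketched as follows. The code implements a standard LSTM cell without peephole connections and assumes that the forget and output gates take the same form as the input gate in Eq. (3); the parameter names in the dictionary, the dimensions and the zero initial states are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step: three gates, then the cell and hidden state updates."""
    i = sigmoid(p["W_ix"] @ x_t + p["W_ih"] @ h_prev + p["b_i"])   # input gate
    f = sigmoid(p["W_fx"] @ x_t + p["W_fh"] @ h_prev + p["b_f"])   # forget gate
    o = sigmoid(p["W_ox"] @ x_t + p["W_oh"] @ h_prev + p["b_o"])   # output gate
    g = np.tanh(p["W_cx"] @ x_t + p["W_ch"] @ h_prev + p["b_c"])   # candidate memory
    c = f * c_prev + i * g          # keep part of the old memory, add new content
    h = o * np.tanh(c)              # expose part of the memory as hidden state
    return h, c

# Hypothetical sizes: 3 input features and 4 memory cells.
rng = np.random.default_rng(2)
n_x, n_h = 3, 4
params = {}
for gate in "ifoc":
    params[f"W_{gate}x"] = rng.normal(scale=0.5, size=(n_h, n_x))
    params[f"W_{gate}h"] = rng.normal(scale=0.5, size=(n_h, n_h))
    params[f"b_{gate}"] = np.zeros(n_h)

h, c = np.zeros(n_h), np.zeros(n_h)          # assumed zero initial states
for x_t in rng.normal(size=(6, n_x)):        # a sequence of 6 time steps
    h, c = lstm_step(x_t, h, c, params)
```

The additive update of `c` is what lets information persist over many steps when the forget gate stays close to one.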
CNN shows significant superiority over other DL algorithms when dealing with image classification
and recognition issues, and researchers worldwide routinely employ it for image recognition and
computer vision tasks (Karimpouli and Tahmasebi 2019; Krizhevsky et al. 2017; Schmidhuber 2015).
The first CNN model, called LeNet-5, was proposed by LeCun et al. and was successfully used to
identify handwritten digits at the end of the last century (LeCun et al. 1998). Because
computational efficiency was quite limited back then, CNNs could only process small databases and
their potential was not adequately exploited. With the development of technology in the past 20
years, the emergence of big data (Zhang et al. 2016), and the improvements in training algorithms
(Dong et al. 2020), the feasibility of training deep CNNs to handle more complex recognition tasks
has been greatly enhanced (Khan et al. 2020). Microsoft has deployed a number of CNN-based optical
character recognition and handwriting recognition systems (Simard et al. 2003). CNNs have also
been employed for object detection in natural images (Vaillant et al. 1994; Zhao et al. 2019b),
including face recognition (Lawrence et al. 1997), medical diagnosis (Shen et al. 2017) and image
understanding (Cireşan et al. 2012).
The structure of CNN is based on a large number of convolutional layers, which are obtained by
convolving the image with many small kernels. These kernels work as feature identifiers that
detect different features of the input data, which are usually images. However, using them is not
straightforward and requires activation functions and pooling. After feature extraction, the
feature maps are connected to the output layer using a fully connected neural network. Figure 10
gives a comprehensive representation of the CNN structure for a rock image, which contains the
input layer, feature maps and a fully connected network. Feature map extraction is the most
cumbersome and complicated part and mainly includes the following steps (Karimpouli and Tahmasebi
2019):
1. Convolutional layer The input image or the last feature map is convolved by a randomly
generated kernel (or filter) of size (height, width, channel). The feature map $F_k$ is
calculated as Eq. (10):
$$F_k = \sum_{l} (W_{kl} * X_l) + b_k \tag{10}$$
where $W_{kl}$ is the sub-kernel of the lth channel, $X_l$ is the lth input channel, ∗ represents
the convolution operator, and $b_k$ is a bias term. Since the convolutional layer is associated
with L input channels, X contains M × M × L values and each kernel $W_k$ contains N × N × L
weights. Accordingly, the number of weights in a convolution block composed of K feature maps is
equal to K × N × N × L, plus K bias terms.
Fig. 10 Schematic illustration of the CNN (After Karimpouli and Tahmasebi 2019)
2. Padding In order to obtain a convolutional layer of the same size as the input matrix,
zero-padding cells, i.e. rows and columns of zero values, are added around the input matrix.
3. ReLU (Rectified Linear Unit) layer The ReLU layer is an activation function that changes
negative values to zero. In other words, it applies the following function, Eq. (11), to all
cells:
$$F_k = \max\left(0, \sum_{l} (W_{kl} * X_l) + b_k\right) \tag{11}$$
Training with ReLU is faster than with the sigmoid or tanh activation function (Krizhevsky et al.
2017). It induces sparsity in the hidden units and nonlinearity in the system while avoiding the
gradient vanishing problem.
4. Maximum pooling The pooling layer, also known as down-sampling, combines all values in the
pooling window into one value in the next layer. For example, maximum pooling takes the maximum
value in each pooling window, whereas average pooling takes the average value. It imposes a
certain degree of spatial invariance on the network and reduces the computational cost of
processing global information (Garcia-Garcia et al. 2017).
5. Stride Convolution and max pooling are performed with an offset of "n" pixels, which is
called the stride. A stride larger than one pixel leads to a smaller output image. It controls the
size of the feature maps in the convolution and max pooling layers.
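The five steps above can be combined into a short, loop-based NumPy sketch. It is meant only to make the shapes and operations concrete: the function names `conv2d` and `max_pool`, the image size and the kernel sizes are hypothetical, and, as in most DL libraries, the "convolution" is computed as a cross-correlation, i.e. without flipping the kernel.

```python
import numpy as np

def conv2d(X, W, b, stride=1, pad=0):
    """X: (M, M, L) input, W: (K, N, N, L) kernels, b: (K,) biases."""
    Xp = np.pad(X, ((pad, pad), (pad, pad), (0, 0)))    # zero padding
    K, N = W.shape[0], W.shape[1]
    out = (Xp.shape[0] - N) // stride + 1                # output height/width
    F = np.zeros((out, out, K))
    for k in range(K):
        for r in range(out):
            for c in range(out):
                patch = Xp[r * stride:r * stride + N,
                           c * stride:c * stride + N, :]
                F[r, c, k] = np.sum(patch * W[k]) + b[k]  # sum over l, plus bias
    return F

def max_pool(F, size=2, stride=2):
    """Keep the maximum of every size x size window in each feature map."""
    out = (F.shape[0] - size) // stride + 1
    P = np.zeros((out, out, F.shape[2]))
    for r in range(out):
        for c in range(out):
            window = F[r * stride:r * stride + size,
                       c * stride:c * stride + size, :]
            P[r, c, :] = window.max(axis=(0, 1))
    return P

# Hypothetical example: 8 x 8 image with L = 3 channels, K = 4 kernels of size N = 3.
rng = np.random.default_rng(3)
X = rng.normal(size=(8, 8, 3))
W = rng.normal(size=(4, 3, 3, 3))
b = np.zeros(4)

F = conv2d(X, W, b, stride=1, pad=1)   # padding of 1 keeps the 8 x 8 size
A = np.maximum(0.0, F)                 # ReLU: negative values become zero
P = max_pool(A, size=2, stride=2)      # down-sampling to 4 x 4
n_weights = W.size                     # K * N * N * L kernel weights
```

With a stride of 1 and a padding of 1, the 3 × 3 kernels preserve the 8 × 8 spatial size, and the 2 × 2 pooling with stride 2 then halves it.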
Fig. 11 Schematic diagram of the GAN (Adapted from Azevedo et al. 2020)
Similar to the conventional multilayer perceptron (MLP) neural network, a fully connected network
(FCN) is used to connect the feature maps or patterns obtained in the previous layers to the known
outputs.
GAN was proposed by Goodfellow et al. and has become one of the most popular DL algorithms among
generative models (Goodfellow et al. 2014). Although it is based on neural networks, GAN is
characterized by training two independent networks simultaneously, namely the generator and the
discriminator; Fig. 11 shows its key logic. The generator produces new fake samples according to
the features of the real data, making it difficult for the discriminator to distinguish the fake
samples from the real data. The discriminator, on the other hand, attempts to identify whether the
data come from the generator. The two networks are trained in an adversarial way, and this process
allows both to keep improving towards their respective goals.
Figure 12 is a conceptual description of the simultaneous training process of the two models in
GAN (Goodfellow et al. 2014). The black dots indicate the distribution of the training set and the
green line is the distribution of the samples generated by the generator G. The blue dashed line
is the discriminative distribution D, which distinguishes between the data of the black and green
lines. A sample is generated from the domain z and mapped to the domain x according to x = G(z).
At first, in Fig. 12a, the distribution from G differs from the true one, so D can separate the
generated samples from the true values. As G and D are trained repeatedly, the samples from G and
the training set become similar until D cannot distinguish them, and the probability D(x) equals
0.5.
The loss function V(D, G) in GAN is defined as Eq. (12), which G is trained to minimize and D is
trained to maximize.
$$\min_G \max_D V(D, G) = \mathbb{E}_x[\log D(x)] + \mathbb{E}_z[\log(1 - D(G(z)))] \tag{12}$$
When data in the domain x are entered, the best result is that the discriminator D outputs 1. In
addition, for data generated from the domain z (called fake data), D(G(z)) should be 0. The
generator G, on the other hand, is trained to make the fake data be treated as data in x.
Therefore, it is optimized so that the discriminator output D(G(z)) approaches 1.
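The behavior of the value function can be checked numerically. The sketch below estimates V(D, G) from discriminator outputs alone; the function name `value_function`, the clipping constant and the sample values are hypothetical and serve only to illustrate the two limiting cases discussed above.

```python
import numpy as np

def value_function(d_real, d_fake, eps=1e-12):
    """Sample estimate of V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))].

    d_real holds D(x) for real samples, d_fake holds D(G(z)) for fake samples.
    """
    d_real = np.clip(d_real, eps, 1.0 - eps)    # avoid log(0)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

# A confident discriminator: D(x) close to 1 and D(G(z)) close to 0.
v_strong_d = value_function(np.full(10, 0.99), np.full(10, 0.01))

# Equilibrium: D cannot tell real from fake and outputs 0.5 everywhere.
v_equilibrium = value_function(np.full(10, 0.5), np.full(10, 0.5))
```

The first case gives a value close to zero, the largest the discriminator can reach, while the equilibrium case gives 2 log 0.5 = −log 4, which is the value the generator drives the game towards.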
Since GAN was first proposed in 2014, several variants have appeared, such as cGAN (conditional
GAN) (Mirza and Osindero 2014), WGAN (Wasserstein GAN) (Arjovsky et al. 2017), BEGAN (boundary
equilibrium GAN) (Marzouk et al. 2019), LSGAN (least squares GAN) (Mao et al. 2017), DCGAN (deep
convolutional GAN) (Radford et al. 2015), and so on. These variants make the learning model more
stable and accurate than the original GAN.
This section presents the findings obtained from processing the reviewed literature. The specific
applications are summarized and charted according to the DL methods adopted in the papers. Note
that the summary also covers applications of hybrid or optimized DL models. Additionally, the
corresponding applications of each method are tabulated in detail.
Fig. 13 The specific topics of the four DL methods’ usage in geotechnical engineering
detailed list of the research fields. It should be emphasized that only a few studies deal solely
with expanding the hidden layers of neural networks. Hence, the references on multilayer neural
network applications are listed selectively in this article. Systematic reviews of ANN have been
provided by numerous researchers in the area of geotechnical engineering (Fatehnia and Amirinia
2018; Moayedi et al. 2020; Shahin 2015, 2016; Shahin et al. 2001).
4.2 FNN applications
FNN is widely used, and many studies have explored its model structure. In earlier years,
researchers mostly discussed shallow neural networks trained on small amounts of data.
For deep neural networks, however, determining the network architecture is one of the
most important and difficult tasks in ANN model development (Maier and Dandy 2000).
It requires selecting the optimum number of hidden layers and the number of nodes in
each of them, yet there is no unified theory for determining an optimal ANN architecture.
The numbers of nodes in the input and output layers are fixed by the features and labels
chosen for the model.
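To make the architecture choice concrete, the sketch below counts the trainable parameters of a fully connected FNN for a given list of layer sizes. It is only an illustration: the helper name `count_parameters` and the layer sizes (5 input features, 1 output label) are hypothetical and not taken from any of the cited studies.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected feedforward network.

    layer_sizes: [n_inputs, hidden_1, ..., hidden_k, n_outputs]
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total

# Hypothetical problem: 5 input features and 1 output label (both fixed by the data).
shallow = [5, 32, 1]        # one hidden layer with 32 nodes
deep = [5, 16, 16, 16, 1]   # three hidden layers with 16 nodes each

print(count_parameters(shallow))  # 225
print(count_parameters(deep))     # 657
```

Only the hidden part of `layer_sizes` is a free design choice, which is the part for which no unified selection theory exists.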
Hornik et al. (1989) proved that a standard back-propagation network with just one hidden
layer can approximate any measurable function to any desired degree of accuracy, provided
that sufficient hidden units are available. However, practical research shows (He et al.
2016) that in many cases DNN predictions are more accurate than those obtained by shallow
networks. Moreover, this existence theorem does not suggest any rule for choosing a
proper hidden layer size. In geotechnical engineering, Bagińska and Srokosz (2019)
compared shallow and deep neural networks for predicting the ultimate bearing capacity
of shallow foundations. Their results show that DNNs have a significant advantage over shallow
Table 2 Literature survey on DL models used in geotechnical engineering
| DL method | Geotechnical topic | Reference case | Optimization algorithm / contrast model | Reference |
|---|---|---|---|---|
| FNN | Sand liquefaction triggering assessment | Previously published cyclic tests | / | Baziar and Jafarian (2007) |
| | Undrained cohesion intercept (c) of soil | Basic geotechnical characterization and triaxial compression tests | / | Mollahasani et al. (2011) |
| | Expansive soils swelling pressures prediction | Laboratory test apparatus | Optimization algorithm: Adaptive Neuro-Fuzzy Inference Systems (ANFIS) | Ikizler et al. (2014) |
| | Subgrade resilient modulus | Subgrade soils locally available in Georgia | / | Kim et al. (2014) |
| | Ultimate bearing capacity of shallow foundation | Datasets from different laboratory experiments | / | Bagińska and Srokosz (2019) |
| | Pile settlement prediction | A total of 1013 cases from 76 individual pile load tests | / | Nejad et al. (2009) |
| | Bearing capacity of bored piles | Generated by finite element simulation and field pile loading test | Four nature-inspired optimization algorithms: PSO, FFA, CS and BF | Singh and Walia (2017) |
| | Uplift capacity of suction foundations | Experimental data | Contrast model: finite element analysis (FEM) | Rahman et al. (2001) |
| | Landslide susceptibility mapping | Shallow landslides located at the Ha Long area in Vietnam | / | Nhu et al. (2020) |
| | Landslide susceptibility assessment | A geospatial database for the Kom Tum province in Vietnam | Contrast models: SVM, C4.5 DT, RF, Multi-Layer Perceptron Neural Network (MLPNN) | Bui et al. (2020) |
| | Piezometric water level | The Iron Gate 2 Dam | Contrast model: Multiple Linear Regression (MLR) | Ranković et al. (2014) |
| | Roadheader performance in tunneling project | Tabas coal project in Iran | / | Salsani et al. (2014) |
| | Tunnel geological types prediction | An urban subway construction project | Contrast models: XGBoost, CatBoost, RF, DT, SVM, K-nearest neighbor (KNN), Bayesian linear regression (BLR) | Zhao et al. (2019a) |
| RNN | Stratum type and the stratum thickness model establishment | A city in eastern China | / | Zhou et al. (2019a) |
| | Real time simulation and monitoring-based predictions | The Wehrhahn-line (WHL) metro project in Düsseldorf, Germany | / | Ninić et al. (2017) |
| | Slurry pressure balance in shield tunneling | Wangjing railway tunnel located between the South Road of Caochangdi and the south of | / | Li and Gong (2019a) |
| | Water level forecast | Dongting Lake in China | Contrast model: SVM | Liang et al. (2018) |
| | Periodic displacement | The Baishuihe landslide and Bazimen landslide in the Three Gorges Reservoir Area (TGRA) | Contrast model: SVM | Yang et al. (2019) |
| | Landslide displacement prediction with risk-averse adaptation | Baishuihe landslide, situated on the south bank of the Yangtze River | Contrast models: SVM, double exponential smoothing method | Xing et al. (2020) |
| | Periodic displacement | Baijiabao landslide in Zigui county, Hubei province, China | / | Xu and Niu (2018) |
| | Periodic displacement | Laowuji Landslide in Guizhou province, China | / | Xie et al. (2019) |
| | Landslide susceptibility assessment | Landslide along the China-Nepal Highway | Contrast models: DT, SVM, BPNN | Xiao et al. (2018) |
Table 2 (continued)
| DL method | Geotechnical topic | Reference case | Optimization algorithm / contrast model | Reference |
|---|---|---|---|---|
| | | located in the western part of Jiangxi Province, China | | |
| | Landslide Intelligent Detection | Landslide images from different regions | Optimization algorithm: improved region growing algorithm (RSG_R) | Yu et al. (2017) |
| | Landslide recognition | Images taken before and after landslide events in Shenzhen, China | / | Ding et al. (2016) |
| | Slope Failure Detection | The area located along the river Ganga, in the Himalayas of northern India | / | Ghorbanzadeh et al. (2019b) |
| | Automated Landslides Detection | Landslides in Shenzhen, Zhouqu County and Beichuan County in China | / | Chen et al. (2018b) |
| | Landslide mapping | The southern part of the Rasuwa district in Nepal along a highway | Contrast models: SVM, ANN, RF | Ghorbanzadeh et al. (2019a) |
| | Identification of unsafe behaviors on construction sites | Video recordings of a person climbing and dismounting from a ladder | Contrast models: LSTM | Ding et al. (2018) |
| | Construction objects detection | A construction equipment dataset | / | Kim et al. (2018) |
| | Identifying rock types in the field | 2290 images of six rock types | Contrast models: SVM, AlexNet, VGGNet-16, GoogLeNet Inception v3 | Ran et al. (2019) |
| | Rock-Mineral Microscopic Images Intelligent Identification | 481 mineral images | Contrast models: LR, SVM, RF, KNN, MLP, and Gaussian Naive Bayes (GNB) | Zhang et al. (2019e) |
| | Automatic intelligent classification and detection of tunnel lining defects | 4139 annotated images including crack, leakage, and scratch marks of shield tunnel lining | Contrast models: AlexNet, GoogleNet, VGG | Xue and Li (2018) |
| | Image instance segmentation for moisture marks of shield tunnel lining | 5031 annotated images containing moisture marks of shield tunnel lining | / | Zhao et al. (2020) |
| | Crack and leakage detection in metro shield tunnel | Lining images of metro shield tunnel (188,704 crack images and 110,466 leakage images) | / | Huang et al. (2018a) |
| | Interpretation model for Ground Penetrating Radar point data in tunnel | A freeway tunnel in Guangxi | Optimization algorithm: WD (Wigner distribution) | He et al. (2017) |
| | Estimation of the P- and S-wave velocities from images of rock | Berea sandstone images | / | Karimpouli and Tahmasebi (2019) |
Table 2 (continued)
| DL method | Geotechnical topic | Reference case | Optimization algorithm / contrast model | Reference |
|---|---|---|---|---|
| | crack segmentation task and generate crack images | grounds | | |
| | 2D-to-3D reconstruction of the structure of porous media | Three kinds of images with Bead pack, Berea sandstone and Ketton limestone | / | Valsecchi et al. (2020) |
| | Reconstruct high-resolution three-dimensional images of porous media at different scales | Spherical bead pack, Berea sandstone and oolitic Ketton limestone images | / | Mosser et al. (2017) |
| | Reconstruct microstructures of porous media based on grayscale image representations of volumetric porous media | An oolitic Ketton limestone micro-CT unsegmented gray-level dataset | / | Mosser et al. (2018) |
networks even though the experimental dataset used for preparing the models is small.
Nejad et al. (2009) used FNN models with single and multiple hidden layers to predict
pile settlement from standard penetration test (SPT) data, based on approximately 1000
data sets. Karlik et al. (1998) identified an optimal ANN model with 3 hidden layers for
the vibration of a beam-mass system, which performed better than a single hidden layer model.
FNN has recently been developed, extended and applied by many researchers, who have
successfully introduced this tool into the field of geotechnical engineering. For
instance, the approach has been studied several times in soil property assessment.
Kim et al. (2014) described an FNN model for estimating the subgrade resilient modulus,
which has a substantial effect on pavement design, from the physical properties and
stress state of the subgrade soil. Mollahasani et al. (2011) derived a multilayer
perceptron FNN to estimate the undrained cohesion intercept (c) of soil from an
experimental database built upon a series of unconsolidated-undrained triaxial tests.
The results indicate that the developed model can effectively estimate the c values of
soil samples and provides significantly better prediction performance than the
regression model. Baziar and Jafarian (2007) used a relatively large dataset to
establish an FNN model for the correlation between the initial soil parameters and the
strain energy required to trigger liquefaction in sands and silty sands. In addition,
data recorded during real earthquakes and available centrifuge test data were used to
validate the proposed ANN-based liquefaction energy model. The results clearly
demonstrate the capability of the proposed model and the strain energy concept to
assess the liquefaction resistance of soils.
The algorithm has also resolved various other practical issues. Rahman et al. (2001)
developed an FNN model to predict the uplift capacity of suction foundations for the
anchorage of large compliant offshore structures and compared it with finite element
based predictions. Their work demonstrated that, as more data become available, the
model can be improved to make more accurate capacity predictions for a wider range of
load and site conditions. Bui et al. (2020) compared the prediction performance of a
deep FNN with conventional ML models, such as the C4.5 decision tree, support vector
machine and random forest, in landslide susceptibility assessment. Meanwhile, Nhu et al.
(2020) investigated the capability of a deep FNN for landslide susceptibility mapping
using the Python DL library Keras. Ranković et al. (2014) developed an FNN model to
predict the piezometric water level in dams, trained with an improved resilient
propagation algorithm to minimize the error between the network predictions and the
desired outputs. Salsani et al. (2014) used an FNN to model, with a high correlation,
the relationship between roadheader performance and the parameters influencing
tunneling operations under the given geological conditions.
However, although FNNs perform well in most cases, the traditional neural network still
has notable limitations, such as overfitting and complicated parameter selection.
Combined with other soft computing technologies, FNNs can achieve better modeling
capabilities (Cui et al. 2009; Elbeltagi et al. 2005). Many optimization algorithms are
inspired by phenomena that occur in nature. Particle swarm optimization (PSO), genetic
algorithm (GA), firefly algorithm (FFA), cuckoo search (CS), bacterial foraging (BF),
artificial bee colony (ABC), ant colony optimization (ACO), gravitational search
algorithm (GSA), etc. are popularly used in the area of
geotechnical engineering. Gordan et al. (2016) combined PSO and ANN to predict the
stability of slopes under seismic loading. Liu et al. (2012) proposed a genetic algorithm
for determining the load-carrying capacity of composite foundations. Singh and Walia
(2017) proposed four ANN models enhanced by the PSO, FFA, CS and BF optimization
algorithms to determine the unit skin friction and unit end bearing capacity for the
design of bored pile foundations.
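As an illustration of how such a nature-inspired optimizer works, the following is a minimal particle swarm optimization sketch in plain Python. It is not the procedure of any cited study: the function `pso_minimize`, its default coefficients and the quadratic `mock_validation_error` (standing in for a network's validation error as a function of two tuning parameters) are all assumptions made for this example.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box given by `bounds` = [(low, high), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes inertia, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

def mock_validation_error(params):
    """Hypothetical smooth error surface with its minimum at (3, -1)."""
    x, y = params
    return (x - 3.0) ** 2 + (y + 1.0) ** 2

best_params, best_error = pso_minimize(mock_validation_error, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_params, best_error)
```

In a hybrid PSO-ANN model the objective would instead train or evaluate the network for each candidate parameter set, which is far more expensive than this toy surface.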
Typical FNN application references are listed in Table 2; the applications of the other
three algorithms are listed there separately as well.
4.3 RNN applications
RNNs are a time-aware class of neural networks that operate on time series of variables
and retain some memory of previous states. Thanks to this memory, previous conditions
can be weighted and, when applicable, incorporated into the current state. This kind of
DL method is proficient in natural language processing tasks, such as translation,
speech generation from text and text classification (Van Houdt et al. 2020).
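The idea of carrying a memory of previous states can be sketched with a basic Elman-style recurrence, h_t = tanh(W_x x_t + W_h h_{t-1} + b). The code below is a minimal illustration in plain Python; the weights and input sequences are arbitrary values chosen for the example, not trained parameters.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent step: h_t = tanh(W_x x_t + W_h h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        s += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(s))
    return h_t

def rnn_forward(sequence, W_x, W_h, b):
    """Run the recurrence over a whole sequence, starting from a zero state."""
    h = [0.0] * len(b)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return states

# Illustrative weights: 1 input feature, 2 hidden units.
W_x = [[0.5], [-0.3]]
W_h = [[0.1, 0.2], [0.0, 0.4]]
b = [0.0, 0.0]

# Both sequences end with the same input (0.0) but differ in the first step.
states_a = rnn_forward([[1.0], [0.0]], W_x, W_h, b)
states_b = rnn_forward([[0.0], [0.0]], W_x, W_h, b)

print(states_a[-1])  # non-zero: the earlier input is still remembered
print(states_b[-1])  # [0.0, 0.0]
```

Although the last input is identical in both runs, the final hidden states differ, which is the memory effect described above.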
In the geotechnical field, this algorithm has gradually been applied to time-series
monitoring problems in tunnel construction. Ninić et al. (2017) proposed an RNN method
combined with a process-oriented 3D simulation model for real-time simulation and
monitoring-based predictions during the construction of machine-driven tunnels, to
support decisions concerning the steering of tunnel boring machines (TBMs). Li and Gong
(2019a) presented a diagonal RNN and an evolved particle swarm optimization (EPSO)
algorithm as a model predictive control (MPC) system for the slurry pressure balance
during construction, which effectively regulates the slurry circulation and air pressure
holding systems according to geological conditions. The simulation results demonstrated
that the approach can accurately track the desired water-earth pressure and
significantly enhance the robustness of the slurry supporting system in tunneling; the
novel EPSO also achieved higher convergence speed and precision than the classic
algorithms used for comparison. Besides, Zhou et al. (2019a) proposed an RNN to
establish a sequence model of stratum type and stratum thickness. The DL-based sequence
model can describe the real stratum situation and is a complementary tool to the
traditional 3D geological model. The prediction ability of the model is improved to a
certain extent by including expert-driven learning, which provides a novel approach for
sequence simulation and prediction in 3D geological modeling.
Since a plain RNN struggles to capture long-term temporal dependencies, an advanced
implementation, the Long Short-Term Memory (LSTM) network, is more suitable for
prediction problems with a long time span. It is widely used in the prediction of
landslide deformation, groundwater level and tunneling deformation. A landslide is a
dynamic process characterized by several features. The conventional static method
ignores the dynamic nature of landslide evolution and cannot consider the influence of
time, which restricts the improvement of prediction accuracy. Current research has
begun to focus on the dynamic LSTM model to predict landslides. For step-wise landslide
displacement, the accumulated displacement can be separated into a trend term and a
periodic term. Generally, the trend term is determined by the conditions of the slope
body and reflects the long-term trend of the landslide; it can be predicted by the
traditional static method. The periodic term is controlled by external factors, such as
rainfall and reservoir water level changes, and can be predicted efficiently using LSTM.
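The separation into trend and periodic terms can be illustrated with a simple moving average. This is a sketch under stated assumptions: the source does not specify a decomposition method here, so the trailing moving average, the window length and the displacement values below are illustrative choices only.

```python
def moving_average_trend(displacement, window):
    """Trailing moving average used as the trend term.

    For the first window - 1 points the average is taken over the data seen so far.
    """
    trend = []
    for t in range(len(displacement)):
        start = max(0, t - window + 1)
        chunk = displacement[start:t + 1]
        trend.append(sum(chunk) / len(chunk))
    return trend

def decompose(displacement, window=12):
    """Split accumulated displacement into trend and periodic (residual) terms."""
    trend = moving_average_trend(displacement, window)
    periodic = [d - tr for d, tr in zip(displacement, trend)]
    return trend, periodic

# Hypothetical accumulated displacement readings (mm), one per monitoring period.
readings = [10.0, 12.0, 15.0, 15.5, 16.0, 20.0]
trend, periodic = decompose(readings, window=3)

print(trend)
print(periodic)
```

The two returned series sum back to the original readings, so each can be modelled separately, for example the trend by a static method and the periodic term by LSTM with rainfall and reservoir level as inputs.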
More recently, Yang et al. (2019) proposed LSTM as a dynamic model for predicting the
displacement of the Baishuihe and Bazimen landslides in the Three Gorges Reservoir (TGR)
Area. An LSTM model was used to capture the relationship between the periodic
displacement and the reservoir water level and rainfall. The application demonstrates
that the LSTM model represents the measured displacements well and gives a more reliable
prediction of landslide displacement than the static support vector machine (SVM) model.
Likewise, Xing et al. (2020) investigated a novel prediction model of landslide
displacement with risk-averse adaptation. In this methodology, the double exponential
smoothing method is used to predict the trend term of the landslide displacement, while
a hybrid model of LSTM and support vector regression (SVR) is developed to predict the
periodic term. Applied to the Baishuihe landslide, the proposed approach maintains a
high prediction accuracy and reduces the underestimation rate. Similarly, to investigate
landslide evolution, Xu and Niu (2018) applied an LSTM model to predict the total
periodic component of the Baijiabao landslide in Zigui county, Hubei province. The
predicted results indicate that, to some extent, the dynamic model (LSTM) achieves more
accurate results than the static models (i.e., SVR and BP). LSTM even performs better
than the Elman network, which is also a dynamic method. Furthermore, Xie et al. (2019)
adopted the LSTM method to investigate the dynamic failure mode of the Laowuji
landslide. The displacement of the Laowuji landslide contains a trend component and a
periodic component. The periodic component is predicted by LSTM, and the model input
includes multiple factors: geological conditions, rainfall intensity and human
activities. The measured and predicted data show good consistency. Compared with a
traditional mechanical model, namely the Empirical Mode Decomposition (EMD), the LSTM
model is more powerful in predicting landslide displacement triggered by multiple
factors, and the idea offers a promising way to develop landslide warning systems that
work more efficiently and precisely on site. Besides, Xiao et al. (2018) employed four
data-driven algorithms, namely decision tree (DT), support vector machine (SVM),
back-propagation neural network (BPNN) and LSTM, to evaluate landslide susceptibility
along the China-Nepal highway. LSTM outperformed the other three models owing to its
capability to learn time series with long temporal dependencies. This indicates that the
dynamic change of geological and geographic parameters is an important indicator of
landslide susceptibility.
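The double exponential smoothing mentioned above for the trend term can be sketched as follows. This uses Brown's formulation with a single smoothing factor; whether Xing et al. (2020) used this exact variant is not stated in the text, so the formulation, the value of `alpha` and the synthetic series are assumptions for illustration.

```python
def double_exponential_smoothing(series, alpha, horizon=1):
    """Brown's double exponential smoothing forecast `horizon` steps ahead."""
    s1 = s2 = series[0]  # first- and second-order smoothed values
    for x in series[1:]:
        s1 = alpha * x + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
    level = 2 * s1 - s2                        # estimate of the current value
    slope = alpha / (1 - alpha) * (s1 - s2)    # estimate of the per-step trend
    return level + horizon * slope

# Synthetic, perfectly linear trend term: x_t = 2 t + 1 for t = 0..199.
trend_series = [2.0 * t + 1.0 for t in range(200)]
forecast = double_exponential_smoothing(trend_series, alpha=0.5)

print(forecast)  # approaches 401, the next value of the linear trend
```

On a steadily increasing trend the method recovers both level and slope, which is why it suits the slowly varying trend term while the fluctuating periodic term is left to the LSTM-based model.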
In addition, the LSTM model can be used to predict water level changes over time.
Supreetha et al. (2020) developed a groundwater level forecasting model using a hybrid
of long short-term memory and the lion algorithm (LSTM-LA), based on the groundwater
level and rainfall dataset from an observation well in India. The prediction accuracy of
the hybrid LSTM-LA model was better than that of the FNN and the standalone LSTM models.
Liang et al. (2018) established an LSTM model to study the water level variation of
Dongting Lake and its relationship with the upstream Three Gorges Dam (TGD). The test
shows that the LSTM model has better accuracy than the support vector machine (SVM)
model. Furthermore, the model was adjusted to simulate the situation in which the TGD
does not exist, in order to explore the dam’s impact.
Considering the uncertain geological conditions, evaluating the performance of the tunnel
boring machine is an essential task during the tunneling process. Consequently,
Li et al. (2020a) employed LSTM to predict the total thrust and the cutterhead torque
during the stable period of a boring cycle, using a dataset of 120 GB in total. This
real-time prediction outperforms the classical theoretical model, in which only a single
value can be obtained from the rock properties. Gao et al. (2020a) verified the
performance of LSTM in TBM penetration rate prediction. The machine parameters, rock
mass parameters and geotechnical survey data from the water conveyance tunnel of the
Hangzhou Second Water Source project were collected to form a dataset. Compared with an
RNN-based model and the traditional time-series prediction model, autoregressive
integrated moving average with explanatory variables (ARIMAX), the proposed LSTM model
performs better overall. Moreover, in the period of rapidly increasing penetration rate,
the error of the LSTM-based prediction curve is significantly smaller than those of the
other two models. Gao et al. (2019c) adopted three kinds of RNN, namely the traditional
RNN, LSTM and gated recurrent unit (GRU) networks, for the real-time prediction of TBM
operating parameters from in-situ TBM operating data. Comparative experiments against
several classical regression models, such as support vector regression (SVR), random
forest (RF) and Lasso, show that the proposed RNN-based predictors outperform the
regression models in most cases. The feasibility of RNN for the real-time prediction of
TBM operating parameters indicates that it can support the analysis and forecasting of
time-continuous in-situ data collected from various construction equipment.
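Before a recurrent model can forecast such operating parameters, the continuous record is usually cut into fixed-length input windows with a target value ahead of each window. The sketch below shows one common way to do this; the helper `make_windows`, the window length and the thrust readings are hypothetical and do not come from the cited datasets.

```python
def make_windows(series, n_lags, horizon=1):
    """Turn a time series into (input window, target) pairs for supervised learning.

    Each input holds `n_lags` consecutive readings; the target is the reading
    `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        inputs.append(series[start:start + n_lags])
        targets.append(series[start + n_lags + horizon - 1])
    return inputs, targets

# Hypothetical thrust readings sampled at a fixed interval.
thrust = [10, 11, 13, 12, 14, 15]
X, y = make_windows(thrust, n_lags=3)

print(X)  # [[10, 11, 13], [11, 13, 12], [13, 12, 14]]
print(y)  # [12, 14, 15]
```

Each row of `X` would be fed to the RNN, LSTM or GRU step by step, and the matching entry of `y` is the value the network learns to predict.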
Hybrid models built on LSTM, itself essentially based on RNN, can achieve time-series
dynamic prediction more flexibly, for instance the quick determination of the attitude
and position of a shield machine in tunneling while considering the uncertainty of the
geological strata. Zhou et al. (2019b) presented a hybrid model, namely WCNN-LSTM, for
dynamic multi-step-ahead prediction using a DL-based forecasting framework that
integrates wavelet transform (WT), CNN and LSTM. The prediction framework was tested
with data collected from the Mixshield operated in the river-crossing tunnel project of
Yangtze Sanyang Road, Wuhan, China. To verify the validity and demonstrate the
prediction accuracy of the proposed method, three widely used predictive models, namely
ARIMA, LSTM and WLSTM, were introduced for comparison with the WCNN-LSTM model. The
results reveal that the proposed model outperforms the other three models in predictive
accuracy and provides decision support for adjusting the attitude and position in shield
tunneling. In addition, Qi and Fourie (2018) proposed an LSTM model as a substitute for
numerical modelling to estimate the rheological parameters of a rock tunnel and used the
firefly algorithm (FA) to search for the optimum hyper-parameters. The performance of
the resulting DeepLSTM-FA was verified against a tunnel response computed with the FLAC
2D finite difference program. Furthermore, an engineering case was used to validate the
accuracy of the rheological parameters. The results demonstrated that DeepLSTM-FA can
provide real-time stress and stability analysis for engineering projects. Moreover, LSTM
can also be used for other monitoring issues, such as soft sensors for a debutanizer
column and a penicillin fermentation process (Yuan et al. 2019).
In geotechnical applications of RNN, the improved LSTM algorithm is the most widely used,
as Table 2 clearly shows.
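Because LSTM dominates these applications, a minimal sketch of a single LSTM step may help show how its gates preserve information over long time spans. The code uses the standard gate equations for one hidden unit with a scalar input; the parameter dictionary and its values are hand-picked for illustration and are not trained weights from any cited model.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a single-unit LSTM cell with scalar input.

    params maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))      # how much old cell state to keep
    i = sigmoid(pre_activation("input"))       # how much new content to write
    o = sigmoid(pre_activation("output"))      # how much cell state to expose
    g = math.tanh(pre_activation("candidate")) # proposed new content
    c_t = f * c_prev + i * g
    h_t = o * math.tanh(c_t)
    return h_t, c_t

# Hand-set gates: forget gate almost fully open, input gate almost closed.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0  # start with some information stored in the cell state
for _ in range(50):
    h, c = lstm_step(0.0, h, c, params)

print(c)  # still close to 1.0 after 50 steps
```

With the forget gate near 1, the cell state decays only marginally over 50 steps, whereas the hidden state of a plain tanh recurrence would have to be regenerated at every step.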
4.4 CNN applications
Owing to increasing computational power and the high demand for considering more
complexity in computational methods, image-based problems have recently been addressed
with AI, particularly DL. CNN is a DL method that can automatically learn the features
required for image classification from training image data, thus improving
classification accuracy and efficiency without relying on manual feature selection.
Recently, many researchers have used CNN for different applications, such as image
classification (Krizhevsky et al. 2017), face detection (Li et al. 2015), face image
synthesis (Abdolahnejad and Liu 2020), semantic segmentation (Garcia-Garcia et al.
2017), traffic signal recognition (Lv et al. 2014; Sermanet and LeCun 2011), text mining
(Shi et al. 2018), speech recognition (Zhang et al. 2017c), human behavior recognition
(Han et al. 2018; Sargano et al. 2017), 3D pose estimation (Mahendran et al. 2017),
plant disease identification (Liu et al. 2018; Lu et al. 2017), medical image analysis
and applications (Havaei et al. 2017; Litjens et al. 2017; Wallach et al. 2015) and
saliency detection (Lee et al. 2017). Following the rapid development of DL methods in
computer vision and medical science, some attempts have been made in the geosciences,
especially in rock physics. For example, DL methods and CNNs have been applied to the
classification of rock type (Cheng and Guo 2017; Ferreira and Giraldi 2017), borehole
imaging for lithology detection (Zhang et al. 2017a), permeability prediction
(Srisutthiyakorn 2016), landslide detection (Ding et al. 2016; Ghorbanzadeh et al.
2019a, b; Lei et al. 2019; Lv et al. 2020), reconstruction and analysis of rock porous
media (Alqahtani et al. 2018; Laloy et al. 2017; Mosser et al. 2017) and rock image
segmentation (Karimpouli and Tahmasebi 2019).
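The feature extraction that a CNN performs rests on sliding small kernels over the image. The sketch below implements a valid-mode 2D convolution (strictly, cross-correlation, as used in CNN layers) followed by ReLU in plain Python. The toy image and the hand-set edge kernel are assumptions for illustration; in a trained CNN the kernel values are learned from data.

```python
def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation of a single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

def relu(feature_map):
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in feature_map]

# Toy 4x4 grayscale image: dark on the left, bright on the right.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
# Hand-set kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1.0, 1.0],
          [-1.0, 1.0]]

features = relu(conv2d_valid(image, kernel))
print(features)  # [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
```

The feature map is non-zero only where the edge lies, which is the kind of local pattern that stacked convolutional layers combine into higher-level features such as rock textures or crack shapes.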
The automatic identification of rock types in the field would aid geological surveying,
and the success of CNNs in image recognition has led geologists to investigate their
application to rock type identification. Wang et al. (2019) proposed a novel network,
the three-dimensional super-resolution CNN, for Computed Tomography (CT) imaging of rock
samples. Ran et al. (2019) proposed an accurate approach for identifying rock types in
the field based on image analysis using a deep CNN. The proposed deep CNN model was
trained and tested using 24,315 sample rock image patches and achieved an overall
accuracy of 97.96%; it effectively identified rock types from images captured in the
field. Similarly, Wei et al. (2019) proposed a CNN for characterizing rock facies with
feature engineering and data padding strategies. They tested the feasibility of the new
algorithm using a verifiable well logging dataset from the Panoma gas field in southwest
Kansas. The results show that CNN has application potential in automatic rock facies
characterization with high accuracy and efficiency. In addition, CNN can infer rock
physical properties from rock images. For example, Karimpouli and Tahmasebi (2019)
estimated P-wave and S-wave velocities from images of rock media to evaluate the
physical parameters of the rock. He et al. (2019) proposed a CNN to continuously
estimate the field strength parameters of rock, which achieves higher accuracy than the
Mohr–Coulomb criterion and shows superior performance for UCS estimation of various rock
types. Han et al. (2019a, b) proposed new deep convolutional networks based on the
Inception-ResNet-v2 and Inception-v3 models for rock strength measurement based on
spectrograms. The frequency spectrum collected by tapping the rock with a geological
hammer is the input variable of the DL model. The classification accuracy of the model
can reach 93%, thus overcoming the subjectivity of human judgment.
As an alternative and effective approach, hybrid CNN models also offer geotechnical
engineers the tools required to make faster and better decisions, improve the quality of
their work and reduce risks. Numerous scholars have applied this method in geological
disaster assessment. Yu et al. (2017) proposed an algorithm based on a deep CNN and an
improved region growing algorithm (RSG_R) for intelligent landslide detection. Fang et
al. (2020) developed three hybrid methods, CNN-SVM, CNN-RF and CNN-LR, for landslide
susceptibility mapping, which effectively improved the performance of the classifiers.
For tunnel defect classification and detection, Huang et al. (2018a) presented an image
recognition algorithm using a fully convolutional neural network to perform semantic
segmentation of cracks and leakage defects in metro shield tunnels. With respect to
tunnel construction, He et al. (2017) proposed an interpretation model for ground
penetrating radar point data based on the Wigner distribution (WD) and a deep CNN for
tunnel geological prediction. Micro-electro-mechanical systems (MEMS) sensors are an
effective and promising means of ground vibration monitoring for the early warning of
geological disasters. Accordingly, Kang et al. (2019) developed a novel ground vibration
monitoring scheme for MEMS-sensed data based on a deep CNN. Experimental results on both
data sets demonstrate that the proposed scheme significantly outperforms the other
comparable schemes.
For image identification, Zhang et al. (2018b) established a hybrid model for geological
structures based on transfer learning with the deep convolutional network Inception-v3
and compared it with four other models, namely K-nearest neighbors (KNN), artificial
neural network (ANN), extreme gradient boosting (XGBoost) and CNN. The accuracy of the
ML methods was poor because it is hard to extract accurate image features from a pixel
vector or histogram, and the traditional single CNN model overfitted strongly. Transfer
learning based on a DL model was an effective method for classifying geological
structure images. A transfer learning model for mineral microscopic images was also
established on the Inception-v3 architecture, considering that the task of identifying
microscopic mineral images in the lab is tedious and time-consuming (Zhang et al. 2019e).
CNN can even be used for on-site construction safety management. Ding et al. (2018)
developed a hybrid DL model that integrates CNN and LSTM to automatically identify
unsafe behaviors of workers on construction sites. The CNN model is applied to each
frame to capture the spatial features of the video, and the LSTM network is used to
learn the temporal information from the sequence of frames. Kim et al. (2018) proposed a
construction equipment detection model based on a deep convolutional neural network,
which is helpful for construction site management. This model is trained with a small
amount of construction equipment data through transfer learning.
Some applications of CNN in geoscience and geoengineering are listed in detail in
Table 2. This method has attracted more attention in recent years owing to improvements
in computing power and storage capacity.
4.5 GAN applications
GAN was proposed in 2014 and thus has a relatively short development history. It is a
radically novel approach that opens new opportunities in image synthesis (Mosser
et al. 2017). Recently, researchers have investigated the use of GAN for image
reconstruction of porous media. In studying the microstructure and macroscopic
properties of rock, micro-computed tomography is widely used to obtain high-resolution
digital images. However, using this technique to extract a large number of
three-dimensional images of the pore space is often experimentally infeasible. To
address this problem, GAN can be applied to reconstruct the solid-void structure of
porous media for stochastic image reconstruction.
As shown by Mosser et al. (2018), GAN can reconstruct the evaluated image dataset quickly
and accurately, and the synthetic images generated by the GAN model closely match the
key statistical and physical parameters of these porous media. Likewise, Valsecchi et al.
(2020) established a GAN-based model for two-dimensional to three-dimensional
reconstruction of the structure of porous media using three rock image databases, named
Beadpack, Berea and Ketton. The experiments showed that three-dimensional reconstruction
can be performed successfully from sets of two-dimensional images, a considerable
advantage in applicability over costly microcomputed tomography scans. In addition,
according to Azevedo et al. (2020), the GAN method is well suited to generating geological
models of discrete and continuous properties for stochastic subsurface model reconstruction.
Moreover, a hybrid generative-model-based compressive sensing approach can recover the
overall shape and thickness of cracks owing to its ability to learn low-dimensional
representations of different classes of images (Huang et al. 2020).
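The statistical agreement between real and synthetic images mentioned above can be illustrated
with a minimal sketch. It assumes, for illustration only, that the compared statistics are the
porosity and a row-wise two-point probability of binary (pore/solid) images; the cited studies
may rely on other or additional descriptors. The function names and the toy 4 x 4 images are
hypothetical and not taken from any cited work.

```python
from typing import List

Image = List[List[int]]  # 1 = pore (void), 0 = solid


def porosity(img: Image) -> float:
    """Fraction of pixels that belong to the pore phase."""
    total = sum(len(row) for row in img)
    return sum(sum(row) for row in img) / total


def two_point_probability(img: Image, lag: int) -> float:
    """Probability that two pixels `lag` apart along a row are both pore."""
    hits = 0
    pairs = 0
    for row in img:
        for j in range(len(row) - lag):
            hits += row[j] * row[j + lag]
            pairs += 1
    return hits / pairs


def max_s2_gap(real: Image, synthetic: Image, max_lag: int) -> float:
    """Largest absolute difference in two-point probability over lags 0..max_lag."""
    return max(
        abs(two_point_probability(real, r) - two_point_probability(synthetic, r))
        for r in range(max_lag + 1)
    )


# Toy images (assumed data): the synthetic image is a row permutation of the real one,
# so its row-wise statistics are identical even though the pixels differ.
real = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
]
synthetic = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
print(porosity(real), porosity(synthetic))
print(max_s2_gap(real, synthetic, max_lag=2))
```

A small gap between the two curves indicates that a synthetic image reproduces the spatial
statistics of the reference even when the two images differ pixel by pixel.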
Due to the inherent limitations of microcomputed tomography, the trade-off between the
field of view and the resolution of rock images has long been a research hotspot in the
field of computer vision. SRGAN (Super-Resolution Generative Adversarial Network) is
a successful application of GAN to image super-resolution (SR) (Ledig et al. 2017).
SRGAN introduces a new loss function that effectively addresses the loss of high-frequency
details in the restored image and yields visually pleasing results. Chen et al. (2020)
proposed a cycle-consistent generative adversarial network (CycleGAN)-based SR approach
for real-world rock microcomputed tomography (MCT) images, which can model the mapping
between rock MCT images at different resolutions. The high-resolution images reconstructed
by SR CycleGAN show good agreement with the targets in terms of both visual quality and
statistical parameters, which greatly improves the quality of rock images and overcomes the
limitations of imaging systems on field of view and resolution. Similarly, Janssens et al.
(2020) presented a conditional generative adversarial network (cGAN) method to handle the
SR problem in fluid flow characteristics assessment while still generating visually more
appealing results. It could therefore be useful as a pre-processing step in the study of
geological materials.
Furthermore, GAN is mostly used to explore reservoir properties in combination with
existing geological information. Owing to its architecture, the model generates new models
that do not exist among the initial models; cluster analysis is then performed on the
regenerated models, which are filtered to obtain the corrected model, thereby reducing the
prediction uncertainty as far as possible (Kang and Choe 2020). Oliveira et al. (2019)
evaluated the performance of cGANs as an interpolation tool for improving seismic data
resolution on a public poststack seismic data set and compared the results with traditional
cubic interpolation. The results show that cGANs outperform traditional algorithms and
that the texture descriptor was able to better capture image similarities, producing results
more coherent with the visual perception.
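To make the interpolation baseline in such comparisons concrete, the following minimal sketch
doubles the sampling rate of a one-dimensional trace with Catmull-Rom cubic interpolation and
scores the result with the peak signal-to-noise ratio (PSNR). Both choices are assumptions made
for illustration: the cited study may use a different cubic scheme and different quality
measures, and the sine-wave trace is hypothetical data, not a seismic record.

```python
import math
from typing import List, Sequence


def upsample2_cubic(trace: Sequence[float]) -> List[float]:
    """Double the sampling rate of a 1D trace with Catmull-Rom cubic midpoints."""
    n = len(trace)
    out: List[float] = []
    for i in range(n - 1):
        p0 = trace[max(i - 1, 0)]      # clamp the neighbour at the left edge
        p1, p2 = trace[i], trace[i + 1]
        p3 = trace[min(i + 2, n - 1)]  # clamp the neighbour at the right edge
        out.append(p1)
        # Catmull-Rom spline evaluated halfway between p1 and p2
        out.append((-p0 + 9 * p1 + 9 * p2 - p3) / 16)
    out.append(trace[-1])
    return out


def psnr(reference: Sequence[float], estimate: Sequence[float], peak: float) -> float:
    """Peak signal-to-noise ratio in decibels (higher means closer to the reference)."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical high-resolution trace, decimated by a factor of 2 and then restored.
high_res = [math.sin(0.3 * k) for k in range(41)]
low_res = high_res[::2]
restored = upsample2_cubic(low_res)
print(len(restored) == len(high_res), round(psnr(high_res, restored, peak=1.0), 1))
```

A learned model would be judged against this kind of baseline by applying the same metric to
its output for the same decimated input.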
Given the advantages of GAN in autonomous learning and random sample generation, as well
as the diversity of the required training samples, it has been very successful in generating
realistic images in many fields, such as natural image synthesis
(Brock et al. 2018; Karras et al. 2017, 2019; Salimans et al. 2016), face synthesis (Bao
et al. 2018), image style transfer (Chen et al. 2018a; Choi et al. 2018; Ma et al. 2018b;
Taigman et al. 2016), image segmentation (Zhang et al. 2018c), facial expression
recognition (Peng and Wang 2018) and so on. Nonetheless, lack of data is still one of the
important factors restricting the development of DL. In particular, geotechnical applications
of GAN remain limited because data are difficult to access, as seen in Table 2; nevertheless,
this method is already a key technology in unsupervised learning and will be one of the
important development directions of AI in the future.
5 Discussion
5.1 Critical analysis
The four main DL algorithms and the literature reviewed are compiled in Table 2. The
summary is organized by the data sources of each case study, the adopted methods and the
application aspects in geotechnical engineering. It is apparent from this short review that
the different DL methods are motivated by the limitations of traditional theoretical
approaches in solving particular problems in geotechnical engineering. Even though several
DL methods can be applied to the same problem, their results still differ because of their
different architectures. Thus, no single model can be presented as the most suitable for all
geotechnical problems, because model selection depends on the underlying objectives, the
scientific goals, and model limitations. Notably, DL models are still more appealing than
other ML methods owing to their calculation accuracy and computational efficiency.
However, due to the short development time of DL theories, deficiencies remain in the
four DL algorithms. Firstly, for the multilayer neural network, simply increasing the
number of network layers does not significantly improve the accuracy of the results.
Secondly, the CNN algorithm is more proficient in handling supervised problems, which
use labelled data to train the learnable model (Yang et al. 2017). However, the performance
of CNN strongly depends on the availability of large training datasets, optimized network
architectures, and faster graphics processing units (Guirado et al. 2017). Although
automated approaches based on CNN have brought a series of improvements in image
classification and object detection, applications of CNN in geotechnical engineering are
still limited. Thirdly, although the RNN algorithm is apt at processing large amounts of
time-related data, missing data certainly holds it back. Furthermore, while the GAN model
is particularly strong in image processing, training-based methods such as the GAN-based
approaches discussed here have a high initial computational cost and long run times due to
the required training phase. Therefore, they place higher demands on the computing
configuration.
Although the computing configuration does restrict calculation efficiency, from the
perspective of the application of DL algorithms, DL still has broad prospects in geotechnical
engineering applications, such as landslide susceptibility assessment, slope deformation
prediction and tunnel defect detection. Undoubtedly, these methods can extract increasingly
useful information from the raw data through each hidden layer, which makes full use
of effective monitoring information and exerts a crucial influence on geological disaster
prevention (Shaheen et al. 2016). However, several problems in data processing cannot be
ignored. Firstly, although large amounts of monitoring data are available for large-scale
geotechnical projects, the amount of data obtained from medium- or small-scale projects is
very small, which makes it difficult to process and analyze the geotechnical problems.
Moreover, a data sharing platform for major geological disasters has not yet been
established, which limits the expansion and application of new data mining methods.
Secondly, the output quality depends not only on the model but also on the quality of the
acquired data. However, site data acquisition is often limited by the geological
environment and monitoring instruments. Besides, missing values or outliers in the data
may affect the robustness of the model during data mining. In addition, a precise and fully
automated process is still unachievable at present. Hence, a mechanism for remote
intelligent early warning of earth disasters, covering equipment, consultation, and
cooperative prediction, urgently needs to be established. Furthermore, considering the
spatial variability and uncertainty of geological conditions, it is difficult to ensure the
generalization of the model during data analysis. In summary, the exploration of AI
applications, big data technology, and DL methods is in the ascending stage of development.
Scholars still need to overcome various difficulties in the process of continuous
exploration and advancement.
5.2 Future perspectives
This section focuses on the latest techniques used in geotechnical engineering in recent
years and offers remarks on approaches that appear promising for the coming years but still
require further improvement. It is undeniable that DL methods can be adopted as a
complementary measure to conventional theory. They may be used as a quick check on
experts' solutions or even as an alternative approach (Thirugnanam et al. 2020).
Interestingly, DL-based studies have already demonstrated better pre-judgement accuracy
than other approaches.
However, although DL has been applied in the civil engineering field, the research is
rarely applied in practical engineering (Lin et al. 2017). The main reason is that, in
the last two decades, the majority of geotechnical engineering studies have been
dedicated to conventional ML approaches such as RF, SVM and DT. In contrast, applications
of DL methods to geotechnical problems such as tunnel construction and landslide
displacement prediction are still in their infancy (Jiao and Alavi 2020). Hence, there are
tremendous opportunities for exploiting current algorithms and architectures and for
further exploring optimization methods to solve more complex problems. It should be noted
that DL training is currently constrained by overfitting and training time and is highly
susceptible to getting stuck in local minima; overcoming these challenges should become the
research focus to accelerate breakthroughs (Shrestha and Mahmood 2019).
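One widely used safeguard against the overfitting and long training times noted above is early
stopping, which is not discussed in the reviewed studies and is introduced here only as an
illustration. The minimal sketch below assumes that a validation loss is recorded after every
epoch; the function name, the `patience` and `min_delta` parameters and the loss values are
hypothetical.

```python
from typing import Sequence, Tuple


def early_stopping_epoch(
    val_losses: Sequence[float], patience: int = 3, min_delta: float = 0.0
) -> Tuple[int, int]:
    """Return (best_epoch, stop_epoch) for a recorded validation-loss curve.

    Training stops once the loss has not improved by more than `min_delta`
    for `patience` consecutive epochs; the weights of `best_epoch` are kept.
    """
    best_loss = float("inf")
    best_epoch = -1
    stop_epoch = len(val_losses) - 1  # default: training ran to the end
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            stop_epoch = epoch
            break
    return best_epoch, stop_epoch


# Assumed validation losses: improvement until epoch 2, then the model starts to overfit.
losses = [1.00, 0.80, 0.70, 0.72, 0.75, 0.90, 0.95]
print(early_stopping_epoch(losses, patience=3))
```

In this example the loss stops improving after epoch 2, so training would halt at epoch 5 and
the model saved at epoch 2 would be retained.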
Furthermore, the performance of these approaches is strongly dependent on their network
architecture, the sample patches selected for input, and graphics processing unit
speed. This means that increasingly high-resolution images typically require increasing
memory storage and computational load; thus an accurate model requires a balance between
satisfactory performance and practicable computational time. In other words, when the
resulting image quality is considered equal, one possible differentiator of these methods
is computational run time. Therefore, how to optimize the algorithm architecture to
calculate faster is also a research direction for scholars.
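The growth of memory demand with image resolution can be made concrete with simple arithmetic.
The sketch below assumes a cubic three-dimensional image stored as 32-bit floating-point values
with one channel; these storage choices and the function name are assumptions for illustration,
and the estimate ignores the additional memory needed for network activations and gradients.

```python
def volume_memory_gib(side: int, channels: int = 1, bytes_per_value: int = 4) -> float:
    """Memory needed to hold one cubic volume of `side`^3 voxels, in GiB."""
    return side ** 3 * channels * bytes_per_value / 2 ** 30


for side in (128, 256, 512, 1024):
    print(side, volume_memory_gib(side))
```

Doubling the side length of the volume multiplies the storage by eight, which is one reason why
the balance between resolution and practicable run time becomes harder to maintain in three
dimensions.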
In future development, in addition to continuously optimizing the algorithm architecture,
how to effectively acquire high-quality data is also crucially important. The era of big
data in particular provides more opportunities and challenges for DL. Given the 4Vs that
characterize big data (volume, velocity, veracity, variety), data will be priceless if they
can be properly managed and adequately analyzed (Kapliński et al. 2016). Conversely,
DL methods may also be limited by the availability of data sources. If the data source is
limited or of substandard quality, adopting or reusing the proposed method may not be
feasible for other research. That is to say, the durability and accuracy of geological
monitoring are of significance to the prediction systems, and the performance of the
DL-based prediction model crucially depends on the monitoring data. As a consequence, many
new monitoring devices must be adopted to collect data locally, such as GPS, wireless
devices, sensors, and streaming communication generated by machine-to-machine interactions
in engineering (Huang et al. 2019). Then, taking advantage of the rapid development of
internet-based platforms, DL models can readily be deployed on such platforms to enable
an advanced real-time geological monitoring system, and cloud computing techniques
are expected to transmit and process geological data efficiently.
In addition, in terms of data source processing, especially for the analysis of rock and
soil materials, whose spatial variability is strong, it is necessary to adopt appropriate
data preprocessing techniques, such as noise removal, outlier removal, spurious
attribute removal, and proper correlation between features of different data sources (Saikia
et al. 2020). At the same time, the application of DL rests mainly on the assumption that
the created training data set correctly represents the relationship between the attribute to
be predicted and the input data. It should be stressed that it is important to choose a
training set that represents the population well.
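Two of the preprocessing steps listed above, outlier removal and the handling of missing values,
can be sketched for a single monitoring series as follows. The sketch assumes a modified z-score
based on the median absolute deviation for outlier detection and linear interpolation for gap
filling; these are common choices rather than the methods of the cited studies, and the function
names and displacement readings are hypothetical.

```python
from statistics import median
from typing import List, Optional, Sequence

Series = Sequence[Optional[float]]  # None marks a missing reading


def mask_outliers(series: Series, threshold: float = 3.5) -> List[Optional[float]]:
    """Replace readings whose modified z-score exceeds `threshold` with None."""
    valid = [v for v in series if v is not None]
    med = median(valid)
    mad = median(abs(v - med) for v in valid)  # median absolute deviation
    if mad == 0:
        return list(series)  # no spread to judge against; leave data unchanged
    cleaned: List[Optional[float]] = []
    for v in series:
        is_outlier = v is not None and 0.6745 * abs(v - med) / mad > threshold
        cleaned.append(None if is_outlier else v)
    return cleaned


def fill_gaps(series: Series) -> List[float]:
    """Fill None entries by linear interpolation between the nearest known readings."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series contains no valid readings")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:       # gap at the start: copy the first known reading
            filled[i] = series[right]
        elif right is None:    # gap at the end: copy the last known reading
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


# Assumed displacement readings (mm) with one missing value and one spurious spike.
displacement = [1.0, 1.2, None, 1.6, 50.0, 2.0, 2.2]
cleaned = mask_outliers(displacement)
print(fill_gaps(cleaned))
```

Masking outliers before interpolation prevents a spurious spike from contaminating the values
used to fill neighbouring gaps.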
Besides, much previous research has focused on deploying supervised learning methods.
Unsupervised learning methods, however, may play a key role in geotechnical engineering
in the future, because the main learning mechanisms of humans and animals are
unsupervised: they discover the world through observation rather than being informed about
all objects (Gentine et al. 2018). Although supervised learning can synthesize all features
to predict label values, unsupervised learning can mine potential connections more deeply
and extract features automatically. For example, the DL method GAN can learn a function
(generator) able to generate data in various forms (images, speech, languages,
etc.) from a large amount of unlabeled data, providing new ideas for image-based
prediction; it has great potential even though it is currently applied less. From a
developmental perspective, in the process of reviewing articles we found that only a handful
of researchers use the GAN method in soil-related studies, whereas more use it in rock image
identification studies. This shows the room for expansion toward multi-dimensional image
reconstruction of soil mass in geotechnical engineering.
Finally, DL offers a potential solution for fields that must handle massive data, by
combining the advent of cluster computing environments and more powerful personal
computers. DL-enhanced prediction models have also shown promising performance
in analyzing huge and complicated geological data and extracting meaningful findings.
We believe that major progress in big data analysis platforms is likely to be
achieved by combining various DL methods with complex reasoning.
6 Concluding remarks
Acknowledgements The authors are grateful for the financial support from the National Key R&D
Program of China (Project No. 2019YFC1509605), the Program of Distinguished Young Scholars,
Natural Science Foundation of Chongqing, China (cstc2020jcyj-jq0087) and the Chongqing
Construction Science and Technology Plan Project (No. 2019-0045).
References
Abdolahnejad M, Liu PX (2020) Deep learning for face image synthesis and semantic manipulations: a
review and future perspectives. Artificial Intelligence Review, 1–34
Adams MD, Kanaroglou PS (2016) Mapping real-time air pollution health risk for environmental manage-
ment: combining mobile and stationary air pollution monitoring with neural network models. J Envi-
ron Manage 168:133–141
Ahmad I, El Naggar MH, Khan AN (2007) Artificial neural network application to estimate kinematic soil
pile interaction response parameters. Soil Dynam Earthquake Eng 27(9):892–905
Alqahtani N, Armstrong RT, Mostaghimi P (2018) Deep learning convolutional neural networks to predict
porous media properties. In: SPE Asia Pacific oil and gas conference and exhibition. Society of Petro-
leum Engineers
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein gan. arXiv preprint https://arxiv.org/abs/1701.07875
Asadi A, Moayedi H, Huat BB, Boroujeni FZ, Parsaie A, Sojoudi S (2011a) Prediction of zeta potential for
tropical peat in the presence of different cations using artificial neural networks. Int J Electrochem Sci
6(4):1146–1158
Asadi A, Moayedi H, Huat BB, Parsaie A, Taha MR (2011b) Artificial neural networks approach for electro-
chemical resistivity of highly organic soil. Int J Electrochem Sci 6(4):1135–1145
Asadi A, Shariatmadari N, Moayedi H, Huat BB (2011c) Effect of MSW leachate on soil consistency under
influence of electrochemical forces induced by soil particles. Int J Electrochem Sci 6(7):2344–2351
Ayyıldız M, Çetinkaya K (2017) Predictive modeling of geometric shapes of different objects using
image processing and an artificial neural network. Proc Inst Mech Eng, Part E: J Proc Mech Eng
231(6):1206–1216
Azevedo L, Paneiro G, Santos A, Soares A (2020) Generative adversarial network as a stochastic subsurface
model reconstruction. Comput Geosci 24(4):1673–1692
Bagińska M, Srokosz PE (2019) The optimal ANN Model for predicting bearing capacity of shallow foun-
dations trained on scarce data. KSCE J Civil Eng 23(1):130–137
Ball JE, Anderson DT, Chan CS (2017) Comprehensive survey of deep learning in remote sensing: theories,
tools, and challenges for the community. J Appl Remote Sens 11(4):042609
Bang S, Park S, Kim H, Kim H (2019) Encoder-decoder network for pixel-level road crack detection in
black-box images. Comput-Aided Civil Infrastruct Eng 34(8):713–727
Bao J, Chen D, Wen F, Li H, Hua G (2018) Towards open-set identity preserving face synthesis. In: Pro-
ceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 6713–6722
Barrow H (1996) Connectionism and neural networks. In: Artificial Intelligence, pp 135–155, Academic
Press
Baziar M, Jafarian Y (2007) Assessment of liquefaction triggering using strain energy concept and ANN
model: capacity energy. Soil Dynam Earthquake Eng 27(12):1056–1072
Benardos A, Kaliampakos D (2004) Modelling TBM performance with artificial neural networks. Tunn
Undergr Space Technol 19(6):597–605
Bengio Y (2009) Learning deep architectures for AI. Foundations and trends® in Machine Learning
2(1):1–127
Bengio Y, Simard P, Frasconi P (1994) Learning long-term dependencies with gradient descent is difficult.
IEEE Trans Neural Netw 5(2):157–166
Bengio Y, Lamblin P, Popovici D, Larochelle H (2007) Greedy layer-wise training of deep networks. Adv
Neural Inf Process Syst 19:153–160
Brock A, Donahue J, Simonyan K (2018) Large scale gan training for high fidelity natural image synthesis.
arXiv preprint https://arxiv.org/abs/1809.11096
Bui DT, Tsangaratos P, Nguyen V-T, Van Liem N, Trinh PT (2020) Comparing the prediction performance
of a Deep Learning Neural Network model with conventional machine learning models in landslide
susceptibility assessment. CATENA 188:104426
Calabrese A, Lai CG (2013) Fragility functions of blockwork wharves using artificial neural networks. Soil
Dynam Earthquake Eng 52:88–102
Canchumuni SA, Emerick AA, Pacheco MA (2017) Integration of ensemble data assimilation and deep
learning for history matching facies models. In: OTC Brasil. Offshore Technology Conference
Canchumuni SW, Emerick AA, Pacheco MAC (2019) History matching geological facies models based on
ensemble smoother and deep generative models. J Petrol Sci Eng 177:941–958
Cao C, Shi C, Lei M, Yang W, Liu J (2018) Squeezing failure of tunnels: A case study. Tunn Undergr Space
Technol 77:188–203
Chakraborty A, Goswami D (2017) Prediction of slope stability using multiple linear regression (MLR) and
artificial neural network (ANN). Arab J Geosci 10(17):385
Chen J, Jin Q, Chao J (2012) Design of deep belief networks for short-term prediction of drought index
using data in the Huaihe river basin. Mathematical Problems in Engineering 2012
Chen RP, Li ZC, Chen YM, Ou CY, Hu Q, Rao M (2015) Failure Investigation at a Collapsed Deep Excava-
tion in Very Sensitive Organic Soft Clay. J Perform Constr Facil 29(3):04014078
Chen Z, Zhang Y, Ouyang C, Zhang F, Ma J (2018) Automated landslides detection for mountain cities
using multi-temporal remote sensing imagery. Sensors 18(3):821
Chen Y, Lai Y-K, Liu Y-J (2018a) Cartoongan: Generative adversarial networks for photo cartooni-
zation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp
9465–9474
Chen H, Lin H, Yao M (2019) Improving the efficiency of encoder-decoder architecture for pixel-level
crack detection. IEEE Access 7:186657–186670
Chen H, He X, Teng Q, Sheriff RE, Feng J, Xiong S (2020) Super-resolution of real-world rock micro-
computed tomography images using cycle-consistent generative adversarial networks. Physical
Review E 101(2):023305
Cheng G, Guo W (2017) Rock images classification by using deep convolution neural network. J Phys:
Conference Series, IOP Publish 887(1):012089
Choi Y, Choi M, Kim M, Ha J-W, Kim S, Choo J (2018) Stargan: Unified generative adversarial net-
works for multi-domain image-to-image translation. In: Proceedings of the IEEE conference on
computer vision and pattern recognition, pp 8789–8797
Chou J-S, Thedja JPP (2016) Metaheuristic optimization within machine learning-based classification
system for early warnings related to geotechnical problems. Automat Construct 68:65–80
Chua CG, Goh ATC (2005) Estimating wall deflections in deep excavations using Bayesian neural net-
works. Tunn Undergr Space Technol 20(4):400–409
Cireşan D, Meier U, Masci J, Schmidhuber J (2012) Multi-column deep neural network for traffic sign
classification. Neural Netw 32:333–338
Cruz M, Santos JM, Cruz N (2015) Using neural networks and support vector regression to relate mar-
chetti dilatometer test parameters and maximum shear modulus. Appl Intell 42(1):135–146
Cui Y, Ju S-G, Han F, Gu T-Y (2009) An improved approach combining random PSO with BP for feed-
forward neural networks. In: International Conference on Artificial Intelligence and Computa-
tional Intelligence, pp 361–368
Cui D-M, Yan W, Wang X-Q, Lu L-M (2017) Towards intelligent interpretation of low strain pile integ-
rity testing results using machine learning techniques. Sensors 17(11):2443
Da’u A, Salim N (2020) Recommendation system based on deep learning methods: a systematic review
and new directions. Artif Intell Rev 53:2709–2748
Ding A, Zhang Q, Zhou X, Dai B (2016) Automatic recognition of landslide based on CNN and texture
change detection. In: 2016 31st Youth Academic Annual Conference of Chinese Association of
Automation (YAC), pp 444–448
Ding L, Fang W, Luo H, Love PE, Zhong B, Ouyang X (2018) A deep hybrid learning model to detect
unsafe behavior: Integrating convolution neural networks and long short-term memory. Automat
construct 86:118–124
Dong C, Dong X, Gehman J, Lefsrud L (2017) Using BP neural networks to prioritize risk management
approaches for China’s unconventional shale gas industry. Sustainability 9(6):979
Dong Y, Wang J, Wang Z, Zhang X, Gao Y, Sui Q, Jiang P (2019) A Deep-learning-based multiple
defect detection method for tunnel lining damages. IEEE Access 7:182643–182657
Dong X, Yu Z, Cao W, Shi Y, Ma Q (2020) A survey on ensemble learning. Front Comput Sci 14:241–258
Du S, Wang R, Wei C, Wang Y, Zhou Y, Wang J, Song H (2020) The connectivity evaluation among
wells in reservoir utilizing machine learning methods. IEEE Access 8:47209–47219
Egmont-Petersen M, de Ridder D, Handels H (2002) Image processing with neural networks—a review.
Pattern Recogn 35(10):2279–2301
Elbeltagi E, Hegazy T, Grierson D (2005) Comparison among five evolutionary-based optimization
algorithms. Adv Eng Inform 19(1):43–53
Erzin Y (2007) Artificial neural networks approach for swell pressure versus soil suction behaviour. Can
Geotech J 44(10):1215–1223
Fan Y, Qian Y, Xie F-L, Soong FK (2014) TTS synthesis with bidirectional LSTM based recurrent neural
networks. In: Fifteenth Annual Conference of the International Speech Communication Association
Fang Z, Wang Y, Peng L, Hong H (2020) Integration of convolutional neural network and conventional
machine learning classifiers for landslide susceptibility mapping. Computers & Geosciences:104470
Fatehnia M, Amirinia G (2018) A review of genetic programming and artificial neural network applica-
tions in pile foundations. Int J Geo-Eng 9(1):2
Ferreira A, Giraldi G (2017) Convolutional Neural Network approaches to granite tiles classification.
Expert Syst Appl 84:1–11
Fukushima K, Miyake S (1982) Neocognitron: a new algorithm for pattern recognition tolerant of defor-
mations and shifts in position. Pattern Recogn 15(6):455–469
Gandomi AH, Alavi AH (2012a) A new multi-gene genetic programming approach to non-linear sys-
tem modeling. Part II: geotechnical and earthquake engineering problems. Neural Computing and
Applications 21(1):189–201
Gandomi AH, Alavi AH (2012b) A new multi-gene genetic programming approach to nonlinear system
modeling. Part I: materials and structural engineering problems. Neural Computing and Applications
21(1):171–187
Gao W, Guirao JL, Basavanagoud B, Wu J (2018) Partial multi-dividing ontology learning algorithm. Inf
Sci 467:35–58
Gao W, Wu H, Siddiqui MK, Baig AQ (2018) Study of biological networks using graph theory. Saudi J biol
sci 25(6):1212–1219
Gao W, Guirao JLG, Abdel-Aty M, Xi W (2019) An independent set degree condition for fractional critical
deleted graphs. Discrete Contin Dynam Syst-S 12(4 & 5):877
Gao X, Shi M, Song X, Zhang C, Zhang H (2019) Recurrent neural networks for real-time prediction of
TBM operating parameters. Automation in Construction 98:225–235
Gao W, Dimitrov D, Abdo H (2019a) Tight independent set neighborhood union condition for fractional
critical deleted graphs and ID deleted graphs. Discrete & Continuous Dynamical Systems-Series S 12
Gao B, Wang R, Lin C, Guo X, Liu B, Zhang W (2020) TBM penetration rate prediction based on the long
short-term memory neural network. Underground Space. https://doi.org/10.1016/j.undsp.2020.01.003
Gao M-Y, Zhang N, Shen S-L, Zhou A (2020) Real-time dynamic earth-pressure regulation model for
shield tunneling by integrating GRU deep learning method with GA optimization. IEEE Access
8:64310–64323
Gao W, Lu X, Peng Y, Wu L (2020) A Deep learning approach replacing the finite difference method for
in situ stress prediction. IEEE Access 8:44063–44074
Garcia-Garcia A, Orts-Escolano S, Oprea S, Villena-Martinez V, Garcia-Rodriguez J (2017) A review
on deep learning techniques applied to semantic segmentation. arXiv preprint https://arxiv.org/
abs/1704.06857
Garg A, Garg A, Tai K, Barontini S, Stokes A (2014) A computational intelligence-based genetic program-
ming approach for the simulation of soil water retention curves. Transp Porous Media 103(3):497–513
Gentine P, Pritchard M, Rasp S, Reinaudi G, Yacalis G (2018) Could machine learning break the convection
parameterization deadlock? Geophys Res Lett 45(11):5742–5751
Ghorbanzadeh O, Meena SR, Blaschke T, Aryal J (2019) UAV-based slope failure detection using deep-
learning convolutional neural networks. Remote Sens 11(17):2046
Ghorbanzadeh O, Blaschke T, Gholamnia K, Meena SR, Tiede D, Aryal J (2019) Evaluation of different
machine learning methods and deep-learning convolutional neural networks for landslide detection.
Remote Sens 11(2):196
Goh ATC, Zhang W (2014) An improvement to MLR model for predicting liquefaction-induced lateral
spread using multivariate adaptive regression splines. Eng Geol 170:1–10
Goh ATC, Wong K, Broms B (1995) Estimation of lateral wall movements in braced excavations using neu-
ral networks. Can Geotech J 32(6):1059–1064
Goh ATC, Zhang Y, Zhang R, Zhang W, Xiao Y (2017) Evaluating stability of underground entry-type
excavations using multivariate adaptive regression splines and logistic regression. Tunn Undergr
Space Technol 70:148–154
Goh ATC, Zhang W, Zhang Y, Xiao Y, Xiang Y (2018) Determination of earth pressure balance tunnel-
related maximum surface settlement: a multivariate adaptive regression splines approach. Bull Eng
Geol Env 77(2):489–500
Goodfellow IJ et al (2014) Generative adversarial nets. In: Advances in neural information processing
systems, pp 2672–2680
Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge
Gordan B, Armaghani DJ, Hajihassani M, Monjezi M (2016) Prediction of seismic slope stability through
combination of particle swarm optimization and neural network. Eng Comput 32(1):85–97
Guirado E, Tabik S, Alcaraz-Segura D, Cabello J, Herrera F (2017) Deep-learning convolutional neural
networks for scattered shrub detection with google earth imagery. arXiv preprint https://arxiv.org/
abs/1706.00917
Gurney K (1997) An introduction to neural networks. CRC press
Han S, Ren F, Wu C, Chen Y, Du Q, Ye X (2018) Using the tensorflow deep neural network to classify
mainland china visitor behaviours in hong kong from check-in data. ISPRS Int J Geo-Inf 7(4):158
Han S, Li H, Li M, Luo X (2019) Measuring rock surface strength based on spectrograms with deep convo-
lutional networks. Comput Geosci 133:104312
Han S, Li H, Li M, Rose T (2019) A Deep Learning Based Method for the Non-Destructive Measuring of
Rock Strength through Hammering Sound. Appl Sci-Basel 9(17):3484
Hashash YMA, Levasseur S, Osouli A, Finno R, Malecot Y (2010) Comparison of two inverse analysis
techniques for learning deep excavation response. Comput Geotech 37(3):323–333
Havaei M et al (2017) Brain tumor segmentation with deep neural networks. Med Image Anal 35:18–31
He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: Surpassing human-level performance on
imagenet classification. In: Proceedings of the IEEE international conference on computer vision, pp
1026–1034
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the
IEEE conference on computer vision and pattern recognition, pp 770–778
He Y-y, Li B-q, Guo Y-s, Wang T-n, Zhu Y (2017) An interpretation model of GPR point data in tunnel
geological prediction. In: Eighth International Conference on Graphic and Image Processing (ICGIP
2016). International Society for Optics and Photonics
He M, Zhang Z, Ren J, Huan J, Li G, Chen Y, Li N (2019) Deep convolutional neural network for fast deter-
mination of the rock strength parameters using drilling data. Int J Rock Mech Min Sci 123:104084
Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput
18(7):1527–1554
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators.
Neural netw 2(5):359–366
Huang H-w, Li Q-t, Zhang D-m (2018) Deep learning based image recognition for crack and leakage defects
of metro shield tunnel. Tunn Undergr Space Technol 77:166–176
Huang L, Li J, Hao H, Li X (2018b) Micro-seismic event detection and location in underground mines by
using Convolutional Neural Networks (CNN) and deep learning. Tunn Undergr Space Technol
81:265–276
Huang Y, Li J, Fu J (2019) Review on Application of Artificial Intelligence in Civil Engineering. CMES-
Comput Model Eng Sci 121(3):845–875
Huang Y, Zhang H, Li H, Wu S (2020) Recovering compressed images for automatic crack segmentation
using generative models. arXiv preprint https://arxiv.org/abs/2003.03028
Ikizler SB, Vekli M, Dogan E, Aytekin M, Kocabas F (2014) Prediction of swelling pressures of expansive
soils using soft computing methods. Neural Comput Appl 24(2):473–485
Imamverdiyev Y, Sukhostat L (2019) Lithological facies classification using deep convolutional neural net-
work. J Petrol Sci Eng 174:216–228
Jan JC, Hung SL, Chi SY, Chern JC (2002) Neural network forecast model in deep excavation. J Comput
Civil Eng 16(1):59–65
Janssens N, Huysmans M, Swennen R (2020) Computed tomography 3D super-resolution with
generative adversarial neural networks: implications on unsaturated and two-phase fluid flow. Materials
13(6):1397
Jiao P, Alavi AH (2020) Artificial intelligence in seismology: advent, performance and future trends. Geosci
Front 11(3):739–744
Kang B, Choe J (2020) Uncertainty quantification of channel reservoirs assisted by cluster analysis and deep
convolutional generative adversarial networks. J Petrol Sci Eng 187:106742
Kang J-M, Kim I-M, Lee S, Ryu D-W, Kwon J (2019) A deep CNN-based ground vibration monitoring
scheme for MEMS sensed data. IEEE Geosci Remote Sens Lett 17(2):347–351
Kapliński O, Košeleva N, Ropaitė G (2016) Big Data in civil engineering: a state-of-the-art survey. Eng
Struct Technol 8(4):165–175
Karimpouli S, Tahmasebi P (2019) Image-based velocity estimation of rock using Convolutional Neural
Networks. Neural Netw 111:89–97
Karlik B, Özkaya E, Aydin S, Pakdemirli M (1998) Vibrations of a beam-mass systems using artificial neural
networks. Comput Struct 69(3):339–347
Karras T, Aila T, Laine S, Lehtinen J (2017) Progressive growing of gans for improved quality, stability, and
variation. arXiv preprint https://arxiv.org/abs/1710.10196
Karras T, Laine S, Aila T (2019) A style-based generator architecture for generative adversarial networks.
In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4401–4410
Khan A, Sohail A, Zahoora U, Qureshi AS (2020) A survey of the recent architectures of deep convolutional
neural networks. Artif Intell Rev 53(8):5455–5516
Kim S-H, Yang J, Jeong J-H (2014) Prediction of subgrade resilient modulus using artificial neural network.
KSCE J Civ Eng 18(5):1372–1379
Kim H, Kim H, Hong YW, Byun H (2018) Detecting construction equipment using a region-based fully
convolutional network and transfer learning. J Comput Civ Eng 32(2):04017082
Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural net-
works. Commun ACM 60(6):84–90
Laloy E, Hérault R, Lee J, Jacques D, Linde N (2017) Inversion using a new low-dimensional representation
of complex binary geological media based on a deep neural network. Adv Water Resour 110:387–405
Lary DJ, Alavi AH, Gandomi AH, Walker AL (2016) Machine learning in geosciences and remote sens-
ing. Geosci Front 7(1):3–10
Lawrence S, Giles CL, Tsoi AC, Back AD (1997) Face recognition: A convolutional neural-network
approach. IEEE Trans Neural Netw 8(1):98–113
Lazarevska M, Knezevic M, Cvetkovska M, Trombeva-Gavriloska A (2014) Application of artificial
neural networks in civil engineering. Tehnički vjesnik 21(6):1353–1359
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recogni-
tion. Proc IEEE 86(11):2278–2324
Ledig C et al. (2017) Photo-realistic single image super-resolution using a generative adversarial net-
work. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp
4681–4690
Lee C, Sterling R (1992) Identifying probable failure modes for underground openings using a neu-
ral network. In: International journal of rock mechanics and mining sciences & geomechanics
abstracts (Vol. 29, No. 1, pp. 49–67)
Lee SJ, Lee SR, Kim YS (2003) An approach to estimate unsaturated shear strength using artificial neu-
ral network and hyperbolic formulation. Comput Geotech 30(6):489–503
Lee G, Tai Y-W, Kim J (2017) ELD-net: An efficient deep learning architecture for accurate saliency
detection. IEEE Trans Pattern Anal Mach Intell 40(7):1599–1610
Lei T, Zhang Y, Lv Z, Li S, Liu S, Nandi AK (2019) Landslide inventory mapping from bitemporal
images using deep convolutional neural networks. IEEE Geosci Remote Sens Lett 16(6):982–986
Li X, Gong G (2019) Predictive control of slurry pressure balance in shield tunneling using diagonal
recurrent neural network and evolved particle swarm optimization. Autom Construct 107:102928
Li Y, Chen G, Tang C, Zhou G, Zheng L (2012) Rainfall and earthquake-induced landslide susceptibility
assessment using GIS and Artificial Neural Network. Nat Hazards Earth Syst Sci 12(8):2719–2729
Li H, Lin Z, Shen X, Brandt J, Hua G (2015) A convolutional neural network cascade for face detection.
In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5325–5334
Li C, Wang Y, Zhang X, Gao H, Yang Y, Wang J (2019) Deep belief network for spectral–spatial clas-
sification of hyperspectral remote sensor data. Sensors 19(1):204
Li J, Chen H, Zhou T, Li X (2019) Tailings Pond Risk Prediction Using Long Short-Term Memory Net-
works. IEEE Access 7:182527–182537
Li Y, Bao T, Gong J, Shu X, Zhang K (2020) The prediction of dam displacement time series using STL,
extra-trees, and stacked LSTM Neural network. IEEE Access 8:94440–94452
Li J, Zhao F, Wang X, Cao F, Han X (2020) The underground explosion point measurement method
based on high-precision location of energy focus. IEEE Access 8:165989–166002
Li J, Li P, Guo D, Li X, Chen Z (2020) Advanced prediction of tunnel boring machine performance
based on big data. Geosci Front 12(1):331–338
Liang C, Li H, Lei M, Du Q (2018) Dongting lake water level forecast and its relationship with the three
gorges dam based on a long short-term memory network. Water 10(10):1389
Lin Y, Zhou K, Li J (2018) Application of cloud model in rock burst prediction and performance com-
parison with three machine learnings algorithms. IEEE Access 6:30958–30968
Lisboa PJ (2002) A review of evidence of health benefit from artificial neural networks in medical
intervention. Neural Netw 15(1):11–39
Litjens G et al (2017) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88
Liu Y, Wu L (2016) Geological disaster recognition on optical remote sensing images using deep learn-
ing. Procedia Comput Sci 91:566–575
Liu X, Cheng G, Wang B, Lin S (2012) Optimum design of pile foundation by automatic grouping
genetic algorithms. ISRN Civil Engineering 2012
Liu B, Zhang Y, He D, Li Y (2018) Identification of apple leaf diseases based on deep convolutional
neural networks. Symmetry 10(1):11
Lu Y, Yi S, Zeng N, Liu Y, Zhang Y (2017) Identification of rice diseases using deep convolutional neu-
ral networks. Neurocomputing 267:378–384
Lu H, Ma L, Fu X, Liu C, Wang Z, Tang M, Li N (2020) Landslides information extraction using object-ori-
ented image analysis paradigm based on deep learning and transfer learning. Remote Sens 12(5):752
Luo C-L, Sha H, Ling C-L, Li J-Y (2020) Intelligent Detection for tunnel shotcrete spray using deep
Learning and LiDAR. IEEE Access 8:1755–1766
Lv Y, Duan Y, Kang W, Li Z, Wang F-Y (2014) Traffic flow prediction with big data: a deep learning
approach. IEEE Trans Intell Transp Syst 16(2):865–873
Lv Z, Liu T, Kong X, Shi C, Benediktsson JA (2020) Landslide Inventory Mapping with Bitemporal
Aerial Remote Sensing Images Based on the Dual-path Full Convolutional Network. IEEE Journal
of Selected Topics in Applied Earth Observations and Remote Sensing
Ma J, Tang H, Liu X, Wen T, Zhang J, Tan Q, Fan Z (2018) Probabilistic forecasting of landslide
displacement accounting for epistemic uncertainty: a case study in the Three Gorges Reservoir area,
China. Landslides 15(6):1145–1153
Ma S, Fu J, Wen Chen C, Mei T (2018b) Da-gan: Instance-level image translation by deep attention genera-
tive adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, pp 5657–5666
Mabbutt S, Picton P, Shaw P, Black S Review of Artificial Neural Networks (ANN) applied to corrosion
monitoring. In: Journal of Physics: Conference Series, IOP Publishing, Vol. 364, No. 1, p. 012114
Mahendran S, Ali H, Vidal R (2017) 3d pose regression using convolutional neural networks. In: Proceed-
ings of the IEEE International Conference on Computer Vision Workshops, pp 2174–2182
Maier H, Dandy G (2000) Application of artificial neural networks to forecasting of surface water quality
variables: issues, applications and challenges. In: Artificial neural networks in hydrology. Springer,
pp 287–309
Mao X, Li Q, Xie H, Lau RY, Wang Z, Paul Smolley S (2017) Least squares generative adversarial net-
works. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2794–2802
Mao Y, Zhang J, Qi H, Wang L (2019) DNN-MVL: DNN-Multi-view-learning-based recover block missing
data in a dam safety monitoring system. Sensors 19(13):2895
Marzouk A, Barros P, Eppe M, Wermter S (2019) The Conditional Boundary Equilibrium Generative
Adversarial Network and its Application to Facial Attributes. In: 2019 International Joint Conference
on Neural Networks (IJCNN). IEEE, pp 1–7
Mikolov T, Karafiát M, Burget L, Černocký J, Khudanpur S (2010) Recurrent neural network based lan-
guage model. In: Eleventh annual conference of the international speech communication association,
pp 1045–1048
Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv preprint
https://arxiv.org/abs/1411.1784
Moayedi H, Armaghani DJ (2018) Optimizing an ANN model with ICA for estimating bearing capacity of
driven pile in cohesionless soil. Eng Comput 34(2):347–356
Moayedi H, Rezaei A (2019) An artificial neural network approach for under-reamed piles subjected to
uplift forces in dry sand. Neural Comput Appl 31(2):327–336
Moayedi H, Huat BB, Moayedi F, Asadi A, Parsaie A (2011) Effect of sodium silicate on unconfined
compressive strength of soft clay. Electron J Geotech Eng 16:289–295
Moayedi H, Mosallanezhad M, Rashid ASA, Jusoh WAW, Muazu MA (2020) A systematic review and
meta-analysis of artificial neural network application in geotechnical engineering: theory and applica-
tions. Neural Comput Appl 32:495–518
Mollahasani A, Alavi AH, Gandomi AH, Rashed A (2011) Nonlinear neural-based modeling of soil cohe-
sion intercept. KSCE J Civ Eng 15(5):831–840
Mosallanezhad M, Moayedi H (2017) Developing hybrid artificial neural network model for predicting
uplift resistance of screw piles. Arab J Geosci 10(22):479
Mosser L, Dubrule O, Blunt MJ (2017) Reconstruction of three-dimensional porous media using generative
adversarial neural networks. Phys Rev E 96(4):043309
Mosser L, Dubrule O, Blunt MJ (2018) Stochastic reconstruction of an oolitic limestone by generative
adversarial networks. Transp Porous Media 125(1):81–103
Naghadehi MZ, Thewes M, Lavasan AA (2019) Face stability analysis of mechanized shield tunneling: An
objective systems approach to the problem. Eng Geol 262:105307
Najjar YM, Huang C (2007) Simulating the stress-strain behavior of Georgia kaolin via recurrent neuronet
approach. Comput Geotech 34(5):346–361
Nassr A, Esmaeili-Falak M, Katebi H, Javadi A (2018) A new approach to modeling the behavior of frozen
soils. Eng Geol 246:82–90
Nejad FP, Jaksa MB, Kakhi M, McCabe BA (2009) Prediction of pile settlement using artificial neural net-
works based on standard penetration test data. Comput Geotech 36(7):1125–1133
Nelson EJ, Chao KC, Nelson JD, Overton DD (2017) Lessons Learned from Foundation and Slab Failures
on Expansive Soils. J Perform Construct Facil 31(3):D4016007
Nguyen G et al (2019) Machine Learning and Deep Learning frameworks and libraries for large-scale data
mining: a survey. Artif Intell Rev 52(1):77–124
Nhu V-H et al (2020) Effectiveness assessment of keras based deep learning with different robust optimiza-
tion algorithms for shallow landslide susceptibility mapping at tropical area. CATENA 188:104458
Ninić J, Freitag S, Meschke G (2017) A hybrid finite element and surrogate modelling approach for simula-
tion and monitoring supported TBM steering. Tunn Undergr Space Technol 63:12–28
Oliveira DA, Ferreira RS, Silva R, Brazil EV (2019) Improving seismic data resolution with deep generative
networks. IEEE Geosci Remote Sens Lett 16(12):1929–1933
Pascanu R, Mikolov T, Bengio Y (2013) On the difficulty of training recurrent neural networks. In:
International conference on machine learning, pp 1310–1318
Peng G, Wang S (2018) Weakly supervised facial action unit recognition through adversarial training. In:
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2188–2196
Phoon K-K (2020) The story of statistics in geotechnical engineering. Georisk: Assessment and Manage-
ment of Risk for Engineered Systems and Geohazards 14(1):3–25
Protopapadakis E, Voulodimos A, Doulamis A, Doulamis N, Stathaki T (2019) Automatic crack detec-
tion for tunnel inspection using deep learning and heuristic image post-processing. Appl Intell
49(7):2793–2806
Qi C, Fourie A (2018) A real-time back-analysis technique to infer rheological parameters from field
monitoring. Rock Mech Rock Eng 51(10):3029–3043
Qi C, Tang X (2018) Slope stability prediction using integrated metaheuristic and machine learning
approaches: a comparative study. Comput Ind Eng 118:112–122
Qi C, Fourie A, Chen Q (2018) Neural network and particle swarm optimization for predicting the
unconfined compressive strength of cemented paste backfill. Constr Build Mater 159:473–478
Qin X, Cui S, Liu L, Wang P, Wang M, Xin J (2018) Prediction of Mechanical strength based on deep
learning using the scanning electron image of microscopic cemented paste backfill. Adv Civ Eng.
https://doi.org/10.1155/2018/6245728
Qin X, Liu L, Wang P, Wang M, Xin J (2018) Microscopic Parameter extraction and corresponding
strength prediction of cemented paste backfill at different curing times. Adv Civ Eng.
https://doi.org/10.1155/2018/2837571
Qu Z, Mei J, Liu L, Zhou D-Y (2020) Crack detection of concrete pavement with cross-entropy loss
function and improved VGG16 network model. IEEE Access 8:54564–54573
Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional
generative adversarial networks. arXiv preprint https://arxiv.org/abs/1511.06434
Rahman M, Wang J, Deng W, Carter J (2001) A neural network model for the uplift capacity of suction
caissons. Comput Geotech 28(4):269–287
Ran X, Xue L, Zhang Y, Liu Z, Sang X, He J (2019) Rock classification from field image patches ana-
lyzed using a deep convolutional neural network. Mathematics 7(8):755
Ranković V, Novaković A, Grujović N, Divac D, Milivojević N (2014) Predicting piezometric water
level in dams via artificial neural networks. Neural Comput Appl 24(5):1115–1121
Saikia P, Baruah RD, Singh SK, Chaudhuri PK (2020) Artificial Neural Networks in the domain of res-
ervoir characterization: a review from shallow to deep models. Comput Geosci 135:104357
Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X (2016) Improved techniques for
training gans. Adv Neural Inf Process Syst 29:2234–2242
Salsani A, Daneshian J, Shariati S, Yazdani-Chamzini A, Taheri M (2014) Predicting roadheader perfor-
mance by using artificial neural network. Neural Comput Appl 24:1823–1831
Sargano AB, Angelov P, Habib Z (2017) A comprehensive review on handcrafted and learning-based
action representation approaches for human activity recognition. Appl Sci 7(1):110
Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural Netw 61:85–117
Sermanet P, LeCun Y (2011) Traffic sign recognition with multi-scale convolutional networks. In: The
2011 International Joint Conference on Neural Networks. IEEE, pp 2809–2813
Shaheen F, Verma B, Asafuddoula M (2016) Impact of automatic feature extraction in deep learning
architecture. In: 2016 International conference on digital image computing: techniques and appli-
cations (DICTA). IEEE, pp 1–8
Shahin MA (2015) A review of artificial intelligence applications in shallow foundations. Int J Geotech
Eng 9(1):49–60
Shahin MA (2016) State-of-the-art review of some artificial intelligence applications in pile foundations.
Geosci Front 7(1):33–44
Shahin MA, Jaksa MB, Maier HR (2001) Artificial neural network applications in geotechnical
engineering. Aust Geomech 36(1):49–62
Shen D, Wu G, Suk H-I (2017) Deep learning in medical image analysis. Annu Rev Biomed Eng
19:221–248
Shi L, Jianping C, Jie X (2018) Prospecting information extraction by text mining based on convolutional
neural networks–a case study of the Lala copper deposit, China. IEEE Access 6:52286–52297
Shim S, Kim J, Cho G-C, Lee S-W (2020) Multiscale and adversarial learning-based semi-super-
vised semantic segmentation approach for crack detection in concrete structures. IEEE Access
8:170939–170950
Shrestha A, Mahmood A (2019) Review of deep learning algorithms and architectures. IEEE Access
7:53040–53065
Simard PY, Steinkraus D, Platt JC (2003) Best practices for convolutional neural networks applied to visual
document analysis. In: ICDAR, Vol. 3, No. 2003
Singh G, Walia B (2017) Performance evaluation of nature-inspired algorithms for the design of bored pile
foundation by artificial neural networks. Neural Comput Appl 28(1):289–298
Song Q et al (2019) Real-time tunnel crack analysis system via deep learning. IEEE Access 7:64186–64197
Srisutthiyakorn N (2016) Deep-learning methods for predicting permeability from 2D/3D
binary-segmented images. In: SEG technical program expanded abstracts 2016. Society of Exploration
Geophysicists, pp 3042–3046
Supreetha B, Shenoy N, Nayak P (2020) Lion Algorithm-Optimized Long Short-Term Memory Network
for Groundwater Level Forecasting in Udupi District, India. Appl Comput Intell Soft Comput.
https://doi.org/10.1155/2020/8685724
Taigman Y, Polyak A, Wolf L (2016) Unsupervised cross-domain image generation. arXiv preprint
https://arxiv.org/abs/1611.02200
Tan Y, Lu Y (2017) Why Excavation of a Small Air Shaft Caused Excessively Large Displacements: Foren-
sic Investigation. J Perform Construct Facil 31(2):04016083
Tan Y, Lu Y (2017) Forensic diagnosis of a leaking accident during excavation. J Perform Construct Facil
31(5):04017061
Thirugnanam H, Ramesh MV, Rangan VP (2020) Enhancing the reliability of landslide early warning sys-
tems by machine learning. Landslides 17(9):2231–2246
Uncuoglu E, Laman M, Saglamer A, Kara HB (2008) Prediction of lateral effective stresses in sand using
artificial neural network. Soils Found 48(2):141–153
Vaillant R, Monrocq C, Le Cun Y (1994) Original approach for the localisation of objects in images. IEE
Proc Vis Image Signal Process 141(4):245–250
Valsecchi A, Damas S, Tubilleja C, Arechalde J (2020) Stochastic reconstruction of 3D porous media from
2D images using generative adversarial networks. Neurocomputing 399:227–236
Van Houdt G, Mosquera C, Nápoles G (2020) A review on the long short-term memory model. Artif Intell
Rev 53:5929–5955
van Natijne AL, Lindenbergh RC, Bogaard TA (2020) Machine learning: new potential for local and
regional deep-seated landslide nowcasting. Sensors 20(5):1425
Wallach I, Dzamba M, Heifets A (2015) AtomNet: a deep convolutional neural network for bioactivity pre-
diction in structure-based drug discovery. arXiv preprint https://arxiv.org/abs/1510.02855
Wang Y, Teng Q, He X, Feng J, Zhang T (2019) CT-image of rock samples super resolution using 3D con-
volutional neural network. Comput Geosci 133:104314
Wang L, Wu C, Gu X, Liu H, Mei G, Zhang W (2020) Probabilistic stability analysis of earth dam slope
under transient seepage using multivariate adaptive regression splines. Bulletin of Engineering Geol-
ogy and the Environment:1–13. https://doi.org/10.1007/s10064-020-01730-0
Watson J, Wan F, Sibbald A (1995) The use of artificial neural networks in pile integrity testing. CIVIL-
COMP95 developments in neural networks and evolutionary computing for civil and structural
engineering:7–13
Wei Z, Hu H, Zhou H-w, Lau A (2019) Characterizing rock facies using machine learning algorithm based
on a convolutional neural network and data padding strategy. Pure Appl Geophys 176(8):3593–3605
Williams RJ, Zipser D (1989) A learning algorithm for continually running fully recurrent neural networks.
Neural Comput 1(2):270–280
Wong BK, Bodnovich TA, Selvi Y (1997) Neural network applications in business: A review and analysis of
the literature (1988–1995). Decis Support Syst 19(4):301–320
Wu Y, Hao Y, Tao J, Teng Y, Dong X (2019) Non-destructive testing on anchorage quality of hollow
grouted rock bolt for application in tunneling, lessons learned from their uses in coal mines. Tunn
Undergr Space Technol 93:103094
Xiao L, Zhang Y, Peng G (2018) Landslide susceptibility assessment using integrated deep learning algo-
rithm along the China-Nepal highway. Sensors 18(12):4436
Xie P, Zhou A, Chai B (2019) The Application of Long Short-Term Memory (LSTM) Method on
Displacement Prediction of Multifactor-Induced Landslides. IEEE Access 7:54305–54311
Xing Y, Yue J, Chen C, Qin Y, Hu J (2020) A hybrid prediction model of landslide displacement with
risk-averse adaptation. Comput Geosci:104527
Xu S, Niu R (2018) Displacement prediction of Baijiabao landslide based on empirical mode decomposition
and long short-term memory neural network in Three Gorges area, China. Comput Geosci 111:87–96
Xue Y, Li Y (2018) A fast detection method via region-based fully convolutional neural networks for shield
tunnel lining defects. Comput-Aided Civ Infrastruct Eng 33(8):638–654
Xue D, Wang J, Zhao Y, Zhou H (2018) Quantitative determination of mining-induced discontinuous stress
drop in coal. Int J Rock Mech Min Sci 111:1–11
Yang HL, Lunga D, Yuan J (2017) Toward country scale building detection with convolutional neural net-
work using aerial images. In: 2017 IEEE International Geoscience and Remote Sensing Symposium
(IGARSS). IEEE, pp 870–873
Yang B, Yin K, Lacasse S, Liu Z (2019) Time series analysis and long short-term memory neural network
to predict landslide displacement. Landslides 16(4):677–694
Yang D, Gu C, Zhu Y, Dai B, Zhang K, Zhang Z, Li B (2020) A Concrete Dam Deformation Prediction
Method Based on LSTM With Attention Mechanism. IEEE Access 8:185177–185186
Ye C et al (2019) Landslide Detection of Hyperspectral Remote Sensing Data Based on Deep Learning
With Constrains. IEEE J Select Topics Appl Earth Observat Remote Sens 12(12):5047–5060
Yu H, Ma Y, Wang L, Zhai Y, Wang X (2017) A landslide intelligent detection method based on CNN and
RSG_R. In: 2017 IEEE International Conference on Mechatronics and Automation (ICMA). IEEE,
pp 40–44
Yuan X, Li L, Wang Y (2019) Nonlinear dynamic soft sensor modeling with supervised long short-term
memory network. IEEE Trans Industr Inf 16(5):3168–3176
Lin Y-z, Nie Z-h, Ma H-w (2017) Structural damage detection with automatic feature-extraction through deep
learning. Comput-Aided Civ Infrastruct Eng 32(12):1025–1046
Zhang W, Goh ATC (2013) Multivariate adaptive regression splines for analysis of geotechnical engineering
systems. Comput Geotech 48:82–95
Zhang W, Goh ATC (2016) Multivariate adaptive regression splines and neural network models for predic-
tion of pile drivability. Geosci Front 7(1):45–52
Zhang Z, Liu Z, Zheng L, Zhang Y (2014) Development of an adaptive relevance vector machine approach
for slope stability inference. Neural Comput Appl 25:2025–2035
Zhang W, Goh ATC, Zhang Y, Chen Y, Xiao Y (2015) Assessment of soil liquefaction based on capacity
energy concept and multivariate adaptive regression splines. Eng Geol 188:29–37
Zhang Y, Ding L, Love PED (2017) Planning of deep foundation construction technical specifica-
tions using improved case-based reasoning with weighted k-nearest neighbors. J Comput Civ Eng
31(5):04017029
Zhang Y, Chan W, Jaitly N (2017) Very deep convolutional networks for end-to-end speech recognition.
In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE,
pp 4845–4849
Zhang W, Zhang Y, Goh ATC (2017) Multivariate adaptive regression splines for inverse analysis of soil
and wall properties in braced excavation. Tunn Undergr Space Technol 64:24–33
Zhang P, Sun J, Jiang Y, Gao J (2017a) Deep learning method for lithology identification from borehole
images. In: 79th EAGE Conference and Exhibition 2017, European Association of Geoscientists &
Engineers, Vol. 2017, No. 1, pp. 1–5
Zhang A et al (2018) Deep Learning-Based Fully Automated Pavement Crack Detection on 3D Asphalt Sur-
faces with an Improved CrackNet. J Comput Civ Eng 32(5):04018041
Zhang Y, Wang G, Li M, Han S (2018) Automated classification analysis of geological structures based on
images data and deep learning model. Appl Sci-Basel 8(12):2493
Zhang Z, Yang L, Zheng Y (2018c) Translating and segmenting multimodal medical volumes with cycle-
and shape-consistency generative adversarial network. In: Proceedings of the IEEE conference on
computer vision and pattern Recognition, pp 9242–9251
Zhang W, Zhang R, Wu C, Goh ATC, Lacasse S, Liu Z, Liu H (2019) State-of-the-art review of soft com-
puting applications in underground excavations. Geosci Front 11(4):1095–1106
Zhang W, Zhang R, Wang W, Zhang F, Goh ATC (2019) A Multivariate Adaptive Regression Splines model
for determining horizontal wall deflection envelope for braced excavations in clays. Tunn Undergr
Space Technol 84:461–471
Zhang Y, Li M, Han S, Ren Q, Shi J (2019) Intelligent identification for rock-mineral microscopic images
using ensemble machine learning algorithms. Sensors 19(18):3914
Zhang T-F, Tilke P, Dupont E, Zhu L-C, Liang L, Bailey W (2019) Generating geologically realistic 3D
reservoir facies models using deep learning of sedimentary architecture with generative adversarial
networks. Petroleum Science 16(3):541–549
Zhang W, Wu C, Li Y, Wang L, Samui P (2019b) Assessment of pile drivability using random forest regres-
sion and multivariate adaptive regression splines. Georisk-Assessment and Management of Risk for
Engineered Systems and Geohazards:1–14. https://doi.org/10.1080/17499518.2019.1674340
Zhang R, Wu C, Goh ATC, Böhlke T, Zhang W (2020) Estimation of diaphragm wall deflections for deep
braced excavation in anisotropic clays using ensemble learning. Geosci Front 12(1):365–373
Zhang P, Wu H-N, Chen R-P, Dai T, Meng F-Y, Wang H-B (2020) A critical evaluation of machine learning
and deep learning in shield-ground interaction prediction. Tunn Undergr Space Technol 106:103593
Zhang W, Wu C, Zhong H, Li Y, Wang L (2020) Prediction of undrained shear strength using extreme gra-
dient boosting and random forest based on Bayesian optimization. Geosci Front 12(1):469–477
Zhang W, Li H, Wu C, Li Y, Liu Z, Liu H (2020) Soft computing approach for prediction of surface
settlement induced by earth pressure balance shield tunneling. Underground Space.
https://doi.org/10.1016/j.undsp.2019.12.003
Zhao Z-Q, Zheng P, Xu S-t, Wu X (2019) Object detection with deep learning: a review. IEEE Trans Neural
Netw Learn Syst 30(11):3212–3232
Zhao J, Shi M, Hu G, Song X, Zhang C, Tao D, Wu W (2019) A data-driven framework for tunnel geologi-
cal-type prediction based on TBM operating data. IEEE Access 7:66703–66713
Zhao S, Zhang DM, Huang HW (2020) Deep learning–based image instance segmentation for moisture
marks of shield tunnel lining. Tunn Undergr Space Technol 95:103156
Zhong C et al (2020) Landslide mapping with remote sensing: challenges and opportunities. Int J Remote
Sens 41(4):1555–1581
Zhou Y, Su W, Ding L, Luo H, Love PED (2017) Predicting safety risks in deep foundation pits in subway
infrastructure projects: support vector machine approach. J Comput Civ Eng 31(5):04017052
Zhou C, Ouyang J, Ming W, Zhang G, Du Z, Liu Z (2019) A stratigraphic prediction method based on
machine learning. Appl Sci-Basel 9(17):3553
Zhou Y, Li S, Zhou C, Luo H (2019) Intelligent approach based on random forest for safety risk prediction
of deep foundation pit in subway stations. J Comput Civ Eng 33(1):05018004
Zhou C, Xu H, Ding L, Wei L, Zhou Y (2019) Dynamic prediction for attitude and position in shield tun-
neling: a deep learning method. Autom Construct 105:102840