
Automation in Construction 141 (2022) 104386


Data-driven multi-output prediction for TBM performance during tunnel excavation: An attention-based graph convolutional network approach

Yue Pan a, Xianlei Fu b, Limao Zhang c,*

a Shanghai Key Laboratory for Digital Maintenance of Buildings and Infrastructure, Department of Civil Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
b School of Civil and Environmental Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore
c School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, 1037 Luoyu Road, Hongshan District, Wuhan, Hubei 430074, China

A R T I C L E  I N F O

Keywords:
TBM performance prediction
Graph convolutional networks
Attention mechanism
Multi-output regression
Tunnel excavation

A B S T R A C T

A deep learning-based multi-output prediction model is developed to better understand and more accurately estimate tunnel boring machine (TBM) performance in each segment ring during deep excavation under complex underground environments. The novelty lies in the development of a new deep learning approach named att-GCN, which feasibly integrates graph convolutional networks (GCN) and the scaled dot-product attention mechanism to improve model performance and interpretability. The proposed att-GCN model is shown to significantly enhance prediction performance and to effectively capture the influence between monitoring points. As a case study, the proposed method is validated in a Singapore Mass Rapid Transit (MRT) construction project, where seven features associated with the TBM machine are used as input for att-GCN training and testing. Experimental results reveal that the att-GCN model exhibits a powerful capability in simultaneously predicting two targets, namely the penetration rate (y1) and energy consumption (y2), reaching mean absolute percentage error (MAPE) values of 15.475% and 15.173%, respectively. In terms of prediction accuracy, att-GCN is superior to several state-of-the-art algorithms, including the deep neural network (DNN), random forest (RF), and support vector regression (SVR). Moreover, an online-learning version of att-GCN is designed. When the objective values gradually become known and are fed into att-GCN during the tunneling procedure, the model yields even better performance, with MAPE of 8.504% (y1) and 7.934% (y2). Accordingly, the real-time estimation of TBM performance based on time-varying monitoring data provides valuable evidence for realizing intelligent control of TBM tunneling, which can ultimately improve construction efficiency and reliability.

* Corresponding author.
E-mail addresses: [email protected] (Y. Pan), [email protected] (X. Fu), [email protected] (L. Zhang).
https://doi.org/10.1016/j.autcon.2022.104386
Received 29 January 2022; Received in revised form 12 May 2022; Accepted 23 May 2022
Available online 4 June 2022
0926-5805/© 2022 Elsevier B.V. All rights reserved.

1. Introduction

To take the place of the traditional drilling and blasting (D&B) methods, advanced equipment named the tunnel boring machine (TBM) has become increasingly popular worldwide for realizing mechanized tunnel excavation through different types of geological conditions [1]. Especially in China, large-scale metro construction has been booming in recent years to relieve traffic pressure and facilitate rapid urbanization [2]. As a preferred tool, TBM has been applied in most metro construction projects in China, accounting for around a quarter of the world market share [3]. The key part of a TBM is its cutter head, which constantly rotates and thrusts into the rock surface; a chamber and a set of hydraulic jacks located behind the cutter head push the TBM forward. That is to say, TBM-based tunneling can be understood as an interaction between the ground and the machine, where the rock-breaking procedure is highly associated with the properties of the soil and rock strata [4]. Remarkably, one of the superiorities of TBM lies in its capability of limiting the disturbance to the surrounding underground environment and reducing the construction cost. TBM is also outstanding in its fast advance speed, less rock damage, great safety, low cost, and easy, continuous operation [5,6]. All these advantages contribute to ensuring the high efficiency and reliability of tunnel construction.

The wide application of TBM in tunneling has made TBM performance a crucial issue. It is known that a TBM exhibiting great performance can work stably and reliably, leading to the smooth development of tunneling projects [7]. In contrast, poor TBM performance will adversely influence the construction stability and
quality, which can even damage the machine, bring additional cost, and cause project delay [8]. Therefore, it is important to effectively model and evaluate the TBM performance in time, which can provide unique opportunities for data-driven decision-making for different purposes, such as operation parameter adjustment, cost estimation, project arrangement, risk mitigation, and others. However, practical difficulties in automatically measuring and understanding TBM performance remain due to the high degree of uncertainty and complexity in the geological conditions and construction process. In the field of project management, expert knowledge and judgment still act as a common estimation method, which inevitably suffers from subjectivity, randomness, and inflexibility [9]. As an alternative solution, deep learning algorithms can be adopted in tunneling engineering, which have proven to be more generalizable and reliable in dealing with uncertainty and multi-dimensional data [10]. Deep learning is prominent in continuously learning from large volumes of historical data and keeps improving in result accuracy without human intervention, which can eventually access hidden insights into TBM performance. Another great strength of deep learning to be noted is that it can carry out feature engineering on its own to facilitate a more efficient learning process. Therefore, the main research concern is to develop deep learning-based prediction of TBM performance, which is desirable for improving TBM adaptability towards better guidance of shield tunnel operation.

In particular, collected data in large volumes have been highlighted as the prerequisite for the successful implementation of deep learning. A fact in the TBM-based excavation procedure is that a large amount of in-situ monitoring data can be accumulated. These data are rich and time-varying in nature, and are well worth exploring. To maximize the value of these abundant data, an essential step is to determine the input and output variables for preparing a high-quality dataset that can be directly deployed for model training. Since TBM operation should be dynamically adjusted according to the geotechnical properties of the mixed ground, shield tunneling actually acts as a reflection of ground-machine interaction [4]. Therefore, two kinds of features, regarding the TBM operational parameters and the geological conditions, both considerably influence TBM performance, and can reasonably be taken as inputs for further investigation. Besides, high construction efficiency and low construction cost have turned out to be the points of focus in most engineering projects [11]. In this regard, the TBM penetration rate and energy consumption are selected as outputs herein. Most existing studies only concentrate on one of these outputs. Herein, the two targeted objectives are taken into account at the same time for measuring TBM performance, leading to a multi-output regression task. On the one hand, a fast penetration rate helps to cut down the project time, which can in turn lower the construction cost [12]. On the other hand, the amount of energy consumed for excavating a unit volume of the rock mass can serve as an overall indicator of TBM work efficiency, and the heavy power load also has an effect on the construction cost [13]. Based upon the well-prepared dataset, exhaustive efforts are put into the establishment of a deep learning-based model that outputs two objectives in each prediction. Such an intelligent model is able to automatically handle multiple influential features and then simultaneously make accurate predictions of the two objectives, which can potentially reduce the burden on the on-site workers and project managers during shield tunneling.

In short, this research proposes a novel deep learning model named att-GCN to automate a detailed prediction process, which can be more practical for quantitatively understanding the field excavation performance of TBM in metro construction. Through exploring the mass of on-site data, the subsequent TBM penetration rate and energy consumption can be precisely predicted at a certain interval of net stroke in each segment ring, resulting in a small disparity between the predicted results and the measured data. Its innovations can be summarized in three critical aspects, including the proper combination of GCN and the scaled dot-product attention mechanism for greater prediction performance, the accurate measurement of relationships between pairs of segment rings for better model interpretability, and the online learning process for updating predictions at each step. There are four key research questions to be solved: (1) How to determine useful features under the consideration of the ground-machine interaction of TBM for training a high-quality deep learning model; (2) How to design a robust deep learning-based model to fully learn the time-continuous data pertaining to the TBM machine, resulting in accurate predictions of two objectives simultaneously and dynamically; (3) How to capture and measure the essential interactions between each segment ring tunneling cycle for better comprehending the nature of complex shield tunneling and generating data-driven control guidance; and (4) How to set up comparative experiments to demonstrate the superiority of the developed model in learning continuous streams of data to make predictions in a sequential manner.

The remainder of this paper is organized as follows. Section 2 reviews the existing research about TBM performance prediction. Section 3 introduces the hybrid deep learning model that integrates GCN and the attention mechanism. Section 4 applies the developed att-GCN-based approach to an actual construction project for the Singapore metro line, aiming to validate its effectiveness and practicability. Section 5 reveals the role of the attention mechanism and discusses the impact of the measured objective value in raising prediction accuracy. Section 6 draws the conclusions and future works.

2. Literature review

At present, the construction industry is undergoing digital transformation at a fast speed, where an immense interest is to extensively apply artificial intelligence (AI) techniques to boost the automation, reliability, and productivity of construction project management [14]. Among various AI-enabled approaches, machine learning is a typical and important branch for training models that can imitate human behavior, such as perception, reasoning, and others, for problem-solving. That is to say, a machine learning model can automatically learn from a large amount of complex data and discover hidden knowledge, aiming to adaptively make smart predictions and decisions [15]. Due to the popularity and effectiveness of machine learning, it has gradually replaced some classical methods, like experimental methods, theoretical methods, and statistical methods, for the purpose of TBM performance prediction in the field of tunnel excavation. To be more specific, the experimental method largely relies on field tunneling tests under complex geological conditions. Its limitation lies in the high cost and the large amount of time required to design and conduct several experiments [10]. Under the given test conditions and some physical assumptions, the obtained results may be suitable only for the specific case and be somewhat conservative [16]. As for the theoretical method, the theory that the analysis is based on could either simplify the problem or incorporate too many assumptions. It could possibly be unsuitable for the actual situation and negatively influence the result reliability [17]. Statistical methods mainly depend on mathematical rules, but they are not always robust to nonlinear and complex systems [18]. In addition, their prediction ability will also be weakened by outliers and extreme values [19]. That is to say, these classical methods have shortcomings in practical applications. They are also time-consuming and cumbersome to implement. Compared to these traditional approaches, the utilization of machine learning is outstanding in realizing a more accurate, flexible, and robust data-driven prediction of TBM excavation performance without any assumption. For one thing, the increasing amount of TBM-related monitoring data can provide a big data environment for machine learning model training [20]. For another, machine learning, owing to its strong learning abilities, can be an ideal solution to effectively build the complicated relationships among multidimensional data, which is beneficial for facilitating intelligent prediction during TBM excavation at a considerably low computational cost [21].


Remarkably, prediction models based on machine learning can conduct less expensive and more powerful processing of an immense amount of data, aiming to create the non-linear input-output mapping of the interaction under no assumption [21]. With the help of machine learning algorithms, people without much professional knowledge and engineering experience can also make a fast and reliable estimation of TBM performance [18]. Subsequently, data-driven decisions can be easily informed to properly adjust the machine specifications and arrange project schedules, which are crucial in reducing risk occurrence when constructing deep-buried long tunnels [22]. According to recent research, two models, namely random forest (RF) and support vector machine (SVM), have turned out to be the most widely adopted for predicting the field excavation performance of TBM. Since these intelligent approaches can provide deeper insights into TBM operation, they could be helpful in ensuring the safe operation of complex tunneling projects. For example, Tao et al. [23] performed RF to predict the TBM penetration rate under hard rock conditions, which showed great tolerance to noise and outliers. Sun et al. [24] implemented RF to learn both operational and geological data to predict the dynamic load of TBM, resulting in reliable stress and structural estimations for the design and analysis of TBM. Zhou et al. [25] relied on a hybrid SVM model to predict the TBM advance rate with high accuracy, aiming to minimize the risk of high capital costs and scheduling. Zhou et al. [3] applied SVM to predict the energy consumption of cutter head drives for power planning and control in an urban shield tunneling project. Although most of the relevant studies can return acceptable results, they are prone to suffer from the limitations of shallow learning mechanisms and have difficulties in harvesting more complicated patterns.

Additionally, deep learning, which is based on the use of deep neural networks, is a subset of machine learning that is also popular in tunneling engineering. The distinguishing advantage of deep learning over classical machine learning algorithms is its ability to automatically conduct feature engineering on its own, enabling fast learning to discover meaningful representations from a large amount of raw data and deliver high-quality prediction results [26]. In this regard, deep learning approaches have captured more attention in exploring time-series data describing TBM mechanical information along with responses from the front rock face, since they have great potential to better capture the dynamic characteristics of the data and return promising prediction and optimization of TBM performance [27]. For example, Armaghani et al. [28] applied the artificial neural network (ANN) along with particle swarm optimization (PSO) to increase the prediction accuracy of the TBM penetration rate. Gao et al. [11] utilized recurrent neural networks (RNN) to predict TBM operating parameters based on real-world operating data for TBM adaptable adjustment, and also compared them with RF and SVM to demonstrate their superiority. Zhou et al. [29] developed a hybrid deep learning algorithm combining convolutional neural networks (CNNs) and long short-term memory (LSTM) for predicting the shield's attitude and position, aiming to provide decision support to reduce snakelike motion. Feng et al. [30] used a deep belief network to predict the field penetration index for quantifying TBM performance, where the predicted values followed the trend of the true values well. It can be seen that these existing studies primarily depend on basic deep learning algorithms, such as ANN, RNN, and LSTM, which can consider the time-dependent nature of the defined problem. However, no research has yet employed a very powerful new neural network architecture named the graph convolutional network (GCN), which can operate on graph structures to more effectively model the complicated TBM tunneling process and make accurate predictions.

To be more specific, GCN was developed as a special kind of convolutional neural network that can encode structured information in the model and directly operate on graphs. It is known that GCN is outstanding in concurrently capturing temporal and spatial dependencies, allowing more insights to be gained compared with data analysis in isolation [31]. The application of GCN in the engineering domain is still at an initial stage. Zhao et al. [32] deployed GCN to capture the topological structure of road networks, leading to accurate and real-time traffic prediction. Since tunnel construction can also be regarded as a complex system under nonlinearity and uncertainty [33], GCN can help to understand the dynamic behavior of TBM during deep excavation from a novel view of complex networks, where a node represents a net stroke of a certain ring. Moreover, another thing to be noted is the attention mechanism, which has a great capacity for dealing with sequential data and can easily pay more attention to the most relevant parts of the input for decision making [34]. Velickovic et al. [35] made the first attempt to leverage a self-attention layer in GCN to build a graph attention network (GAN), which shows two distinct advantages. One is that GAN can outperform the popular RNN and LSTM, which are difficult to parallelize. The other is that GAN is capable of capturing and measuring the influence of previous conditions on the current performance, which cannot be realized in RNN and LSTM. However, there are still some limitations in the developed GAN. Firstly, the application scope of this network is limited to solving the node-classification task in graph-structured data. Secondly, it adopts additive attention, but such a basic version of attention is not efficient enough to model long-range dependencies. Thirdly, it lacks the popular encoder-decoder architecture that can train a single end-to-end model directly on sequential data. As a possible solution, a more advanced attention named the scaled dot-product attention can therefore be taken into account to address these above-mentioned limitations, and it can potentially be faster and more space-efficient in practice than additive attention.

In summary, for the purpose of filling the research gap, this research develops a novel hybrid deep learning model called att-GCN and encapsulates it into an encoder-decoder architecture, which can be regarded as an upgrade of the existing GAN. As expected, the developed att-GCN model is easy to implement and understand. For one thing, it is bound to realize higher computational efficiency and prediction accuracy in the non-linear and complex sequence-to-sequence prediction problem of the information-intensive tunneling project. For another, it is believed to automatically offer a quantitative overview of the relationships between TBM performance and its influential factors regarding the TBM machine. Moreover, an additional attempt is made to facilitate the att-GCN training process in an online manner, so that a dynamic prediction process can be carried out to adaptively update the model with the incoming stream of observations [36]. Eventually, the insights gained from the predictive results can lead to some optimal strategies, which may be of assistance in guiding the improvement of tunneling efficiency and cost for smoothly moving the excavation process forward.

3. Methodology

3.1. Graph convolutional network (GCN)

Since many things in practice can be represented in the form of networks, GCN has been developed as the backbone for direct operation on graphs [37,38]. The theoretical foundation behind GCN is that each node can send its information to its neighbors and receive feedback for understanding and updating its status. Fig. 1 provides the basic architecture of GCN, which takes a graph as input and applies convolution over the graph. The main difference between GCN and the classical feedforward neural network is that GCN operates on nodes, and thus GCN demonstrates a significant strength in capturing connections between nodes in the network [39]. That is to say, each node can aggregate its neighbors' vectors and then pass the result through a dense neural network layer with a non-linear function, as given by Eq. (1). Therefore, nodes in each layer have different representations. For example, nodes at the 0th layer are the same as the node features, while the representation of nodes at the lth layer depends on the previous (l−1)th layer. For simplicity, a layer-wise propagation rule is considered in GCN, and thus the output of the hidden layer can be calculated by Eq. (2).


Fig. 1. Architecture of GCN.

However, there are two shortcomings in Eq. (2). One is that the node itself is not included in the summation of the feature vectors of its neighboring nodes. The other is that the adjacency matrix A is not normalized, which makes it unable to maintain the scale of the output feature vectors. As a possible solution, the propagation rule from [38] is adopted, as defined in Eq. (3). For one thing, since the output of a node in the hidden layer should be derived from the node itself and its neighbors, the diagonal elements of A can be converted to 1 by the formula \hat{A} = A + I. For another, the multiplication (\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2}) H^{(l)} W^{(l)} can address the second shortcoming.

H^{(l+1)} = f(H^{(l)}, A)   (1)

f(H^{(l)}, A) = \sigma(A H^{(l)} W^{(l)})   (2)

where H^{(l)} is the matrix of activations in the lth neural network layer, A is an adjacency matrix, L is the number of layers, \sigma is the non-linear activation function (i.e., ReLU), and W^{(l)} is the weight matrix of the lth neural network layer.

f(H^{(l)}, A) = \sigma(\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2} H^{(l)} W^{(l)})   (3)

where \hat{D} is the diagonal node degree matrix of \hat{A}, \hat{D}_{ii} = \sum_j \hat{A}_{ij}, \hat{A} = A + I is the adjacency matrix with self-connections, and I is the identity matrix.

3.2. Scaled dot-product attention mechanism

The self-attention mechanism aims to better deal with long sequences without the use of recurrence or convolution. There are three important steps involved in the implementation of self-attention [40]. The first step is to compute dot products using Eq. (4), resulting in attention weights that measure the similarity between the current input and the other inputs. The second step is to normalize the calculated weights using the softmax function in Eq. (5), which makes the values comparable across all nodes. The third step is to calculate the self-attention value as the weighted sum of the normalized weights and the corresponding inputs, as formulated in Eq. (6).

e_{ij} = x_i^T x_j   (4)

a_{ij} = \exp(e_{ij}) / \sum_{j=0}^{T} \exp(e_{ij}) = \mathrm{softmax}(e_{ij}),  j = 1, 2, …, T   (5)

A_i = \sum_{j=0}^{T} a_{ij} x_j   (6)

where x_i is the current input sequence, and x_j denotes all other inputs, j ∈ {1, 2, …, T}.

A problem to be noted is that the basic version of self-attention introduced above does not contain any learnable parameters, which is not convenient for model updating. In this regard, three trainable weight matrices named query, key, and value are introduced, which are multiplied with the input sequence embeddings as defined in Eqs. (7)–(9). Based upon this, a scaled dot-product attention is developed, as visualized in Fig. 2. For each query, the model learns which key-value input it should attend to. The process in Fig. 2 is calculated in parallel across all inputs x ∈ R^{T×d_e}. As a result, the attention matrix can be derived from Eq. (10), where the factor 1/\sqrt{d_k} guarantees that the dot products between the query and the key will not grow too large.

q_i = W^q x_i   (7)

k_i = W^k x_i   (8)

v_i = W^v x_i   (9)

where x_i has dimension 1 × d_e (d_e is the embedding size), W^q has dimension d_e × d_q, W^k has dimension d_e × d_k, and W^v has dimension d_e × d_v.

A(Q, K, V) = \mathrm{softmax}(Q K^T / \sqrt{d_k}) V   (10)

where T is the input sequence size, Q ∈ R^{T×d_q}, K ∈ R^{T×d_k}, and V ∈ R^{T×d_v}.

3.3. Attention-based GCN (att-GCN) model

Motivated by the superiority of GCN and the scaled dot-product attention, we intend to develop a hybrid model named att-GCN (attention-based GCN) based on an encoder-decoder module. Its key principle is to appropriately integrate the scaled dot-product attention into GCN to replace the spatial graph convolutions, which brings the benefits of high computational efficiency, importance measurement for neighbors, and long-range dependency capturing. Under the proper combination of GCN and attention, the att-GCN model is well designed to be suitable for the defined multi-output regression task, and is expected to achieve great improvement in prediction performance and interpretability. For one thing, GCN is responsible for capturing the spatial relationships within the TBM-related data. For another, the scaled dot-product attention aims to draw global dependencies between input and output.
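To make Eqs. (1)–(10) concrete, the following NumPy sketch implements the normalized GCN propagation rule and the scaled dot-product attention as stand-alone functions. It is an illustrative reading of the formulas above, not the authors' released code; the toy graph, random weights, and function names are assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gcn_layer(H, A, W):
    """Normalized GCN propagation of Eq. (3): relu(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                      # self-connections, A_hat = A + I
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return relu(d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W)

def scaled_dot_product_attention(X, Wq, Wk, Wv):
    """Scaled dot-product attention of Eqs. (7)-(10) over a sequence X of shape (T, d_e)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                    # Eqs. (7)-(9)
    weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]))   # Eq. (10), softmax over keys per query
    return weights @ V, weights

# Toy usage: 5 nodes (e.g., net strokes of neighboring rings), 7 features each.
rng = np.random.default_rng(0)
A = (rng.random((5, 5)) > 0.5).astype(float)            # hypothetical adjacency matrix
X = rng.random((5, 7))
H1 = gcn_layer(X, A, rng.random((7, 16)))               # one graph-convolution step
out, att = scaled_dot_product_attention(X, *(rng.random((7, 8)) for _ in range(3)))
```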


Fig. 2. Calculation of the scaled dot-product mechanism.

More specifically, a problem under the consideration of graph structure can be defined as a graph g with n nodes, and the nodes are represented by a set of features x_i. To generate embeddings of the input nodes, an attention-based encoder is designed, consisting of layers with a skip-connection and batch normalization (BN), including a scaled dot-product attention layer and a feedforward layer. Its architecture is shown in Fig. 3, where the attention layer processes information between input nodes and the node-wise fully connected feedforward layer. Accordingly, the node embedding at the lth layer can be derived from the (l−1)th layer, as calculated by Eqs. (11) and (12). As a result, the encoder calculates the average value of the final node embeddings in Eq. (13), which is regarded as the graph embedding. Afterwards, the output representing high-level features from the encoder module can be fed into a directly-connected decoder module, aiming to subsequently output the node prediction at each timestep.

\hat{h}_i = \mathrm{BN}^l(h_i^{(l-1)} + A_i^l(h_1^{(l-1)}, …, h_n^{(l-1)}))   (11)

h_i^{(l)} = \mathrm{BN}^l(\hat{h}_i + \mathrm{FF}^l(\hat{h}_i))   (12)

h_g^{(N)} = (1/N) \sum_{i=1}^{N} h_i^{(N)}   (13)

where l is the layer, l ∈ {1, 2, …, N}, N is the number of layers, A is the scaled dot-product attention sublayer, and FF is the node-wise feedforward sublayer.

Fig. 3. Architecture of the encoder module of the att-GCN model. (Note: ATT is the abbreviation of the scaled dot-product attention sublayer, and FF is the abbreviation of the node-wise feedforward sublayer.)
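A minimal sketch of one encoder layer following Eqs. (11)–(13) is given below, with a simple feature-wise normalization standing in for BN and the value projection kept at the embedding size so the skip-connection is well defined; all names and the NumPy realization are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(H, Wq, Wk, Wv):
    # scaled dot-product attention sublayer (value projection keeps the embedding size)
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    return softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V

def batch_norm(H, eps=1e-5):
    # simple feature-wise normalization standing in for the BN sublayer
    return (H - H.mean(axis=0)) / np.sqrt(H.var(axis=0) + eps)

def feed_forward(H, W1, b1, W2, b2):
    # node-wise feedforward sublayer FF
    return np.maximum(H @ W1 + b1, 0.0) @ W2 + b2

def encoder_layer(H, Wq, Wk, Wv, W1, b1, W2, b2):
    """One encoder layer: skip-connection + BN around attention (Eq. (11)),
    then skip-connection + BN around the feedforward sublayer (Eq. (12))."""
    H_hat = batch_norm(H + attention(H, Wq, Wk, Wv))                 # Eq. (11)
    return batch_norm(H_hat + feed_forward(H_hat, W1, b1, W2, b2))   # Eq. (12)

# Toy usage: 6 nodes with embedding size 16, stacked over 3 layers.
rng = np.random.default_rng(1)
d, n = 16, 6
H = rng.random((n, d))
for _ in range(3):
    H = encoder_layer(H, rng.random((d, d)), rng.random((d, d)), rng.random((d, d)),
                      rng.random((d, 64)), np.zeros(64), rng.random((64, d)), np.zeros(d))
h_graph = H.mean(axis=0)   # Eq. (13): average node embeddings into the graph embedding
```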


3.4. Model evaluation indicators

To mathematically assess the prediction performance of the established att-GCN model in the multi-output regression task and make a comparison with other models, the following evaluation indicators, given in Eqs. (14)–(18), are employed: root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), R2, and relative error. These five metrics are popular for evaluating regression models in data science [41]. That is to say, the error between the predicted results and the measured data is calculated in different ways, which helps to judge the model quality by measuring how well the predictions match up against the ground truth. RMSE takes the form of the standard deviation of the residuals; in particular, it squares the residual before calculating the mean value. MAE takes the mean of the absolute value of the residual to provide the average magnitude of the error. MAPE can be regarded as the percentage equivalent of MAE. R2 measures the proportion of the variation in the dependent variable that can be explained by the model inputs. Besides, due to the utilization of the absolute value, MAE and MAPE are more robust to outliers. Relative error divides the absolute value of the residual by the real value. Typically, the smaller the value of the four error indicators (and the larger R2), the better the accuracy of the model.

\mathrm{RMSE} = \sqrt{(1/n) \sum_{i=1}^{n} (\hat{y}_i - y_i)^2}   (14)

\mathrm{MAE} = (1/n) \sum_{i=1}^{n} |\hat{y}_i - y_i|   (15)

\mathrm{MAPE} = (1/n) \sum_{i=1}^{n} |(\hat{y}_i - y_i)/y_i| \times 100\%   (16)

R^2 = 1 - \sum_{i=1}^{n} (\hat{y}_i - y_i)^2 / \sum_{i=1}^{n} (y_i - \bar{y})^2   (17)

\mathrm{Relative\ error} = |(\hat{y}_i - y_i)/y_i|   (18)

where \hat{y}_i is the predicted value, y_i is the measured value, \bar{y} is the mean of the measured values, and n is the total number of observations.
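For reference, the five indicators of Eqs. (14)–(18) can be computed as in the NumPy sketch below; the helper name is an assumption, and the denominator of R2 is taken around the mean of the measured values, the usual convention.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Eqs. (14)-(18) for a single output."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_pred - y_true
    rmse = np.sqrt(np.mean(residual ** 2))                                    # Eq. (14)
    mae = np.mean(np.abs(residual))                                           # Eq. (15)
    mape = np.mean(np.abs(residual / y_true)) * 100.0                         # Eq. (16)
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2)  # Eq. (17)
    relative_error = np.abs(residual / y_true)                                # Eq. (18), per record
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2,
            "mean_relative_error": relative_error.mean()}

print(regression_metrics([20.0, 25.0, 30.0], [22.0, 24.0, 27.0]))
```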
4. Case study

To validate the effectiveness of the proposed deep learning-based approach, it is applied in a case study on a practical construction project of the Singapore Mass Rapid Transit (MRT). There are three major phases in the case study to explore the deep excavation behavior of TBM, including dataset preparation, deep learning implementation, and result analysis and discussion. The whole process is exhibited in Fig. 4.

4.1. Background

The case study focuses on the construction of a metro tunnel section in Singapore's Thomson-East Coast Line tunnel project. This targeted section is a twin-line tunnel running through the subsurface from the T308 Marine Terrace Station to the T309 Siglap Station, with a total length of 1480 m. An earth pressure balance (EPB) TBM with a cutter head diameter of 6.67 m is adopted to excavate soil and advance the tunnel through different types of soil conditions.

[Fig. 4 summarizes the three-phase workflow: Phase 1, dataset preparation (variable determination of mechanical parameters x1–x5 and geological parameters x6–x7, with objectives y1 penetration rate and y2 energy consumption; data preprocessing by missing-value filling and normalization of the independent variables; feature selection by grey relational grade; hyperparameter tuning by grid search; dataset split into a training set of rings No. 1–9, 11–19, …, 321–329 and a test set of rings No. 10, 20, …, 330); Phase 2, deep learning implementation (development of the att-GCN model combining a graph convolutional network with scaled dot-product attention); Phase 3, result analysis and discussion (simultaneous, high-accuracy prediction of y1 and y2; comparison of att-GCN, DNN, RF, and SVR by RMSE, MAE, MAPE, and relative error; the role of the attention mechanism in capturing the influence of previous observations; and real-time prediction by dynamically updating the input data).]

Fig. 4. Framework of the att-GCN-enabled case study.


The construction route is separated into several concrete segment rings that are 6.8 m in inner diameter and 1.4 m in width.

For the real-time monitoring of the tunnel construction project, a data acquisition system has been developed, which can automatically collect data concerning the TBM machine at an interval of 20 mm of net stroke. In this case study, we focus on data from the first 330 circular segment rings (ring No. 1 to No. 330) to prepare a dataset. For each ring, the recorded data start from the 60 mm net stroke and end at the 1260 mm net stroke. Although a large number of parameters can be accumulated in the tunnel project, many of them are not easily accessible. For simplicity, seven mechanical parameters that are easier to obtain and more influential on the TBM performance are taken as the input features. The goal of this research is to take full advantage of these rich collected data to develop a deep learning model as a dynamic forecasting tool, aiming to reliably predict the TBM performance in terms of penetration rate and energy consumption under complex and uncertain underground environments. In summary, the prepared dataset contains 61 × 330 lines in total, and each line is made up of seven features and two objectives. The layout plan of the project, the TBM employed for the tunnel construction, and the photo of the TBM's data acquisition system are shown in Fig. 5(a)–(c), respectively.

4.2. Data description

A detailed description of the seven features and two targeted objectives is outlined in Table 1. Seven features in numerical format describing TBM machine parameters are incorporated in the prepared dataset, whose values are time-varying during the ring advance. All these features can potentially capture the technical factors of the TBM itself and the machine-environment interaction in the deep excavation. To be more specific, data regarding the net stroke per ring (x1), thrust force (x2), cutter head (CHD) torque (x3), CHD rotational speed (x4), and screw rotational speed (x5) are gathered, which are the specifications of TBM that reveal the corresponding operation performance over time. In particular, operators need to continuously adjust some mechanical parameters to make them adaptable to the changing surrounding conditions along the tunnel alignment, aiming to keep face stability for ensuring excavation safety. Moreover, the average chamber pressure (x6) and the average soil pressure (x7), which indirectly reflect the actual geological conditions of the tunneling ground, also need to be taken into account; they are likewise attributed to the TBM machine parameters. In particular, the chamber pressure is the pressure applied to balance the soil pressure, which is collected by sensors installed in the excavation chamber. The soil pressure (also known as earth pressure) measures the pressure exerted by the underground soil.

Based upon the collection of sensor data and field data, a high-quality deep learning model can be developed to produce prediction sequences of two major objectives, namely the penetration rate of TBM (y1) and the energy consumption of the cutter head (y2). It is worth noting that the two defined objectives quantify the TBM operation performance from the view of construction progress and cost. The importance of these objectives is briefly presented below. Firstly, the penetration rate is calculated as the ratio of the excavation distance to the amount of time spent in tunneling, which can also be understood as the excavation speed [42]. For one thing, it plays an important role in evaluating shield tunneling efficiency. For another, it can reflect the operating state and adaptability of TBM to the current surrounding environment in real time. Secondly, the cutter head will consume more than 60% of the total power capacity in the shield tunnel excavation due to its complicated interaction with the soil [3]. Thus, the energy consumption herein measures the amount of energy required by the TBM cutter head. Estimating the energy consumption in advance helps to better plan the power and operation of TBM, which is a critical consideration in handling the project cost. The linear relationships between the independent and non-independent variables have been measured by the Pearson correlation coefficient, as shown in Fig. 6. Obviously, the absolute values of the Pearson coefficients are all less than 0.75, indicating that there are no highly correlated data in the prepared dataset.

Since these two objectives serve as important indicators of TBM performance, their precise predictions are believed to provide strong evidence to guide the safety insurance, efficiency improvement, and cost control of the tunneling project.

Fig. 5. Project profile: (a) The layout plan of Singapore's Thomson-East Coast Line, (b) The TBM employed for the project, and (c) The photo of the TBM's data acquisition system.


Table 1
Summary of the prepared dataset for TBM performance prediction.

Item                    Variable  Name                      Unit   Min       Max          Mean        Std         Median
TBM machine parameter   x1        Net stroke per ring       mm     60.000    1261.000     660.212     352.145     660.000
                        x2        Thrust force              kN     0.000     32,185.000   20,299.953  3940.467    21,290.000
                        x3        CHD torque                kN*m   0.000     3863.000     1338.198    820.704     899.000
                        x4        CHD RPM                   rpm    0.000     2.260        1.400       0.291       1.280
                        x5        Screw RPM                 rpm    0.000     8.900        2.823       1.714       2.800
                        x6        Average chamber pressure  bar    0.175     4.482        3.240       0.315       3.270
                        x7        Average soil pressure     bar    0.370     5.400        2.797       0.546       2.800
Objective               y1        Penetration rate          mm/r   1.000     90.000       20.722      9.716       20.000
                        y2        Energy consumption        kJ     2060.756  302,042.700  16,192.835  18,022.229  7321.721

In general, it is desirable to achieve a faster penetration rate and a smaller energy consumption, which brings the advantages of shortening the time limit of a project and lowering the cost of construction [43]. Besides, two data preprocessing techniques are conducted to convert the raw data into a cleaned and understandable format, which is crucial for the success of deep learning algorithms. One is to fill each missing value with the previously recorded value. The other is to normalize the independent variables into the range [0, 1] using the min-max scaling technique in Eq. (19), so that gradient descent for training the neural network can converge faster. After data preprocessing, the grey relational grade (GRG) defined in Eq. (20) is calculated based on the normalized data for feature selection, which can further validate the high quality of the dataset. It is known that a feature with a higher GRG value is more favorable due to the higher degree of influence exerted by the comparability sequence on the reference sequence [44]. As shown in Fig. 7, the GRG values between the selected features and the targeted objectives are all larger than 0.5, indicating that there are strong relationships between the prepared features (x1–x7) and the shield performance (y1, y2). That is to say, it is reliable to feed these seven features into the deep learning model for shield performance prediction. In particular, features x6 and x7, which indirectly reflect the geological condition, tend to exert more influence on the penetration rate (y1), while features x1–x5, which are directly related to machine operation, are more important to the TBM energy consumption (y2).

x_{scale} = (x - \min(x)) / (\max(x) - \min(x))   (19)

\gamma(x_r(j), x_i(j)) = [\min_i \min_j |x_r(j) - x_i(j)| + \xi \max_i \max_j |x_r(j) - x_i(j)|] / [|x_r(j) - x_i(j)| + \xi \max_i \max_j |x_r(j) - x_i(j)|]   (20)

where x_r is the reference sequence, x_i is the comparability sequence, and \xi is the resolving coefficient.

Fig. 6. Pearson correlation analysis.
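As an illustration of the preprocessing chain described above (missing-value filling with the previous record, min-max scaling of Eq. (19), the Pearson check, and the grey relational grade of Eq. (20)), a possible sketch is shown below; the column names and the resolving coefficient ξ = 0.5 are assumptions, since the paper does not state them in this excerpt, and the GRG is written in its usual absolute-difference form.

```python
import numpy as np
import pandas as pd

def min_max_scale(x):
    """Eq. (19): scale a variable into the range [0, 1]."""
    return (x - x.min()) / (x.max() - x.min())

def grey_relational_grades(reference, comparisons, xi=0.5):
    """Eq. (20): grey relational coefficients between the reference sequence
    (an objective) and every comparability sequence (a feature), averaged over
    records to give one GRG per feature. xi = 0.5 is a common default."""
    deltas = np.abs(comparisons - reference[:, None])   # |x_r(j) - x_i(j)|, shape (records, features)
    g_min, g_max = deltas.min(), deltas.max()           # extrema over all sequences and records
    coeffs = (g_min + xi * g_max) / (deltas + xi * g_max)
    return coeffs.mean(axis=0)

# Hypothetical usage on a dataframe with columns x1..x7, y1, y2 (names assumed).
cols = [f"x{i}" for i in range(1, 8)] + ["y1", "y2"]
df = pd.DataFrame(np.random.rand(200, 9), columns=cols)
df = df.ffill()                                         # fill missing values with the previous record
scaled = df.apply(min_max_scale)
print(scaled.corr(method="pearson").round(2))           # Pearson check (paper reports |r| < 0.75)
grg_y1 = grey_relational_grades(scaled["y1"].to_numpy(),
                                scaled[[f"x{i}" for i in range(1, 8)]].to_numpy())
```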

Fig. 7. GRG values between features and objectives for (a) y1 and (b) y2. (GRG for y1: x5 = 0.917, x7 = 0.912, x6 = 0.898, x2 = 0.876, x4 = 0.864, x3 = 0.853, x1 = 0.599; GRG for y2: x4 = 0.958, x3 = 0.949, x2 = 0.944, x6 = 0.934, x7 = 0.922, x5 = 0.873, x1 = 0.635.)

4.3. Model training and testing

Since the hyperparameters are prone to affect the performance of the deep learning model, the numbers of hidden layers and hidden neurons are taken into account in this case for pursuing higher prediction accuracy. As an explanation of these two hyperparameters, hidden neurons are placed in the hidden layers, and the hidden layers reside between the input layer and the output layer. More hidden layers and neurons are bound to increase the burden of network training.


For obtaining the optimal values of the hyperparameters, the tuning technique called grid search acts as an exhaustive search over specific parameter values to find the best point that maximizes the model performance. Herein, the number of hidden layers is set as {2, 3, 4} and the number of hidden neurons is set as {64, 128, 256}. These two hyperparameters are optimized together in Fig. 8 under a learning rate of 0.001. It is observed that when the number of hidden layers is 3 and the number of hidden neurons is 256, the normalized MSE reaches its smallest value of 0.306. Besides, two additional experiments under learning rates of 0.01 and 0.0001, along with 3 hidden layers and 256 hidden neurons, are carried out, reaching normalized MSE values of 0.313 and 0.321, respectively. It is known that a small learning rate may take a long time to reach the minimal point, while a large learning rate is likely to show divergent behavior and arrive at a sub-optimal solution.

Fig. 8. The optimization result of hidden neurons and hidden layers.

For evaluating the performance of the att-GCN model, a train-test split procedure is necessary. Since observations about the shield tunneling could be dependent, it can be assumed that the prepared dataset is made up of time-series data. That is to say, a random split is not allowed in this case. As a reliable solution, the creation of the training and testing sets is presented below. The whole dataset is directly separated into 33 segments with no shuffle, and each segment contains 10 rings. The first nine rings in each segment are used for training and the last ring is taken out for testing. For example, rings No. 1–10 are included in the first segment: data about rings No. 1–9 are assigned to the training set, while data about ring No. 10 are packed into the testing set. Consequently, the testing set embraces data from 33 rings (No. 10, 20, 30, …, 320, 330), while the data from the remaining rings belong to the training set. Since each ring has 61 records that are collected every 20 mm of net stroke, there are in total 18,117 (61 × 297) and 2013 (61 × 33) lines of data for training and testing, respectively. Based upon this data split, the proposed att-GCN model with 3 hidden layers and 256 hidden neurons is trained at a learning rate of 0.001 on the training set, and is then applied to the testing set for evaluation of the model performance. The model training process aims to minimize the loss expressed by the normalized MSE. Besides, since the Adam optimization algorithm is easy to implement and shows high computational efficiency, it is employed to iteratively update the weights and biases of the neural network in place of classical stochastic gradient descent (SGD). As a result, Fig. 9 displays the training loss and testing loss over 1000 epochs. It can be seen that the training loss and testing loss decrease rapidly in the first 100 epochs, and then gradually arrive at a plateau with small fluctuations. The training loss and testing loss share a similar decreasing pattern, converging to stable normalized MSE values of around 0.003 and 0.006, respectively, after 800 epochs and remaining there afterwards. The small difference between the training loss and testing loss indicates that no overfitting occurs, which further verifies the effectiveness of the established att-GCN model. The model training and testing process is implemented using the Python programming language in this case study.

Fig. 9. The training loss and testing loss of the att-GCN model.
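A compact sketch of the ring-based split and the grid search described above is given below, assuming the records are held in a pandas dataframe with a "ring" column; the helper names and the build_and_evaluate callback are hypothetical rather than the authors' code, and the final assertion simply checks the 61 × 297 / 61 × 33 split sizes quoted in the text.

```python
import numpy as np
import pandas as pd
from itertools import product

def ring_based_split(df):
    """Split with no shuffling: within every block of 10 consecutive rings,
    the first nine rings go to training and the 10th (No. 10, 20, ..., 330) to testing."""
    test_rings = set(range(10, 331, 10))
    train = df[~df["ring"].isin(test_rings)]
    test = df[df["ring"].isin(test_rings)]
    return train, test

def grid_search(train, build_and_evaluate):
    """Exhaustive search over the grid used in the paper: hidden layers {2, 3, 4}
    and hidden neurons {64, 128, 256} at a fixed learning rate of 0.001.
    `build_and_evaluate` is a user-supplied callback returning the normalized MSE."""
    best = None
    for layers, neurons in product([2, 3, 4], [64, 128, 256]):
        loss = build_and_evaluate(train, hidden_layers=layers,
                                  hidden_neurons=neurons, learning_rate=0.001)
        if best is None or loss < best[0]:
            best = (loss, layers, neurons)
    return best

# Toy check of the split sizes: 61 records per ring for 330 rings.
df = pd.DataFrame({"ring": np.repeat(np.arange(1, 331), 61)})
train, test = ring_based_split(df)
assert len(train) == 61 * 297 and len(test) == 61 * 33
```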
Table 2
Evaluation of the att-GCN prediction performance.

Metric   Training set (y1)   Training set (y2)   Testing set (y1)   Testing set (y2)
RMSE     3.227               3774.289            4.178              4313.695
MAE      2.324               2011.910            2.967              2422.821
MAPE     14.804%             12.577%             15.475%            15.173%
R2       86.28%              95.01%              84.72%             91.43%

4.4. Result analysis

The novel forecasting model named att-GCN is suitable for dealing with a multi-output regression task and time-series prediction. It has demonstrated a strong capability of modeling the non-linear relationship between the seven features and the two objectives, aiming to provide an accurate prediction of the operational TBM performance. To highlight the advantage of att-GCN, its performance is also compared with three popular machine learning algorithms. The prediction results are analyzed as follows.

(1) The developed att-GCN model, under a proper combination of GCN and scaled dot-product attention, can be suggested as a reliable predictor of the TBM performance during the tunneling process. The overall prediction performance of the developed model on the training set and testing set is measured by four evaluation indicators in Table 2. For both y1 and y2, the prediction accuracy on the training set is slightly higher than on the testing set, verifying the effectiveness of the developed model in the multi-output regression task. Besides, it can be seen that the MAPE of att-GCN for y1 (14.804% in training and 15.475% in testing) and y2 (12.577% in training and 15.173% in testing) is similar. This reveals that the att-GCN model supports outputting two objectives simultaneously under satisfactory accuracy. Moreover, since the focus is usually on the testing set performance, Figs. 10 and 11 show the predicted penetration rate (y1) and energy consumption (y2) of TBM on the testing set along with the corresponding measured data. Observably, the discrepancy between the red line representing the measured value and the blue line representing the predicted value in Figs. 10(a) and 11(a) is relatively small. The data points in Figs. 10(b) and 11(b) are close to the line y = x.


Fig. 10. Prediction of the TBM penetration rate (y1) for the 33 rings in the testing set: (a) Data samples; (b) Scatter plot.

Fig. 11. Prediction of the TBM energy consumption (y2) for the 33 rings in the testing set: (a) Data samples; (b) Scatter plot.

Accordingly, there is a great level of agreement between the measured data and the predicted values, thereby indicating that the developed att-GCN model is able to well catch the evolution of TBM performance under the complex underground environment.

Fig. 12. Relative error of the 33 rings in the testing set: (a) y1 (average = 0.155); (b) y2 (average = 0.152).

(2) The mean value of the relative error for predicting the TBM penetration rate and the energy consumption is 0.155 and 0.152, respectively, which is small enough to meet engineering requirements. To be more specific, the relative error is calculated by dividing the absolute error by the measured value, which is another measurement of precision. Fig. 12 presents the average relative error of the 33 rings in the testing set, which further validates that the results from the att-GCN-based predictor are acceptable in practice. Since each ring owns 61 prediction points at a net stroke interval of 20 mm, the blue point stands for the average relative error of a certain ring and the blue band denotes the 95% confidence interval of the relative error. It is clear that ring No. 100 suffers from a relatively large error. That is because this ring owns an extreme value that is hard to predict accurately. In contrast, the error between rings No. 250–300 is small, since the penetration rate and energy consumption of these rings fluctuate only slightly. Figs. 13 and 14 take the prediction results of a single ring, No. 270, as an example. It can be seen from Figs. 13(a) and 14(a) that the att-GCN model is able to offer an accurate approximation to the observations regarding the two objectives, resulting in an average relative error (standard deviation) of 0.057 (0.054) and 0.071 (0.062) for y1 and y2, respectively. The number of outliers generated by att-GCN is comparatively small, being 8 for y1 and 4 for y2.

Fig. 13. Prediction results of y1 for ring No. 270: (a) Measured value and predicted value along the net stroke; (b) Relative error along the net stroke; (c) Boxplot of relative error.

Fig. 14. Prediction results of y2 for ring No. 270: (a) Measured value and predicted value along the net stroke; (b) Relative error along the net stroke; (c) Boxplot of relative error.

These outliers of relative error tend to appear in places where the measured value experiences abrupt changes, such as at the net strokes of 320 mm, 580 mm, and 860 mm. Besides, the low standard deviation indicates that the relative error is clustered around the mean, and thus the reliability of the att-GCN-based predictor is sufficiently stable without too large an error fluctuation.

(3) The att-GCN model outperforms other popular machine learning algorithms, including the deep neural network (DNN), random forest (RF), and support vector regression (SVR). More specifically, DNN is a kind of artificial neural network containing several layers between the input layer and the output layer. RF is an ensemble of decision trees under an extension of bootstrap aggregation (bagging). SVR attempts to find a hyperplane as the best-fit line within a threshold of values. These three machine learning algorithms have commonly been selected as baselines for a comparative assessment of prediction performance [45,46]. The comparison results are listed in Table 3, where all four candidate models have been fine-tuned by grid search. Overall, the att-GCN model always reaches the smallest RMSE, MAE, and MAPE and the largest R2 in accurately predicting the two objectives, as highlighted in bold type. To be more specific, att-GCN significantly raises the prediction accuracy, bringing a decrease of 0.252 in RMSE, 0.207 in MAE, and 0.964% in MAPE, and an increase of 2.31% in R2, compared with the second-best model, DNN, in forecasting the penetration rate. As for the prediction of energy consumption, att-GCN improves on the three intelligent models by at least 145.200, 91.459, 3.44%, and 0.80% in terms of RMSE, MAE, MAPE, and R2, respectively. Also, the relative error of the four models is calculated and visualized by the distributions in Fig. 15. Table 4 outlines the statistical characteristics of the relative error distributions. It is clear that the att-GCN model returns the lowest average relative error with the smallest standard deviation. The average relative error of att-GCN presents an improvement over the other three models, being remarkably reduced by 0.061 (y1) and 0.133 (y2) in contrast to SVR. Therefore, it can be concluded that the att-GCN model serves as a more suitable choice in this case, while SVR is less applicable, probably due to its poor nonlinear mapping ability.

(4) From the view of every single ring, it is also validated that the att-GCN model is superior to the other three machine learning algorithms. A graphic comparison of the prediction results from the four candidates on each ring is provided in Fig. 16. The deeper the rectangle, the smaller the mean of the relative error. Clearly, att-GCN always returns a lower error in all 33 testing rings, showing its promising capacity in forecasting the TBM penetration rate and energy consumption during the ongoing tunneling process.

Table 3
Performance comparison of four intelligent models. The values in bold represent the prediction performance of the proposed att-GCN model, which outperforms the other popular algorithms.

           y1                                                y2
Method     RMSE        MAE         MAPE      R2       RMSE        MAE         MAPE      R2
att-GCN    4.178       2.967       15.48%    84.72%   4313.695    2422.821    15.17%    91.43%
DNN        4.430271    3.173759    16.44%    82.41%   4626.509    2875.799    21.41%    90.09%
RF         4.206947    2.999275    18.13%    83.65%   4458.895    2514.28     18.61%    90.63%
SVR        4.730726    3.685932    21.62%    77.40%   4772.052    3230.928    28.49%    88.85%


The developed att-GCN also performs much better than the other candidate algorithms. As shown in Figs. 17 and 18, the curve of the att-GCN-based prediction is nearest to the measured values and the data points of att-GCN are concentrated near y = x. In other words, both the penetration rate and energy consumption of ring No. 320 can be more accurately estimated by att-GCN, whose mean relative error is 0.067 and 0.071, respectively.

Fig. 15. Distribution of relative error for (a) y1 and (b) y2 from the four intelligent models.

Table 4
Statistical characteristics of relative error from the four intelligent models.

     Statistic   att-GCN          DNN              RF               SVR              p-value
y1   Average     0.155            0.164            0.181            0.216            0.000**
     Std         0.142            0.159            0.377            0.255
     IQR         [0.054, 0.216]   [0.056, 0.228]   [0.050, 0.222]   [0.075, 0.265]
     Skewness    2.160            3.638            22.518           4.407
     Kurtosis    7.881            29.049           684.617          30.905
y2   Average     0.152            0.214            0.186            0.285            0.000**
     Std         0.149            0.249            0.197            0.286
     IQR         [0.051, 0.201]   [0.073, 0.282]   [0.066, 0.241]   [0.102, 0.388]
     Skewness    2.467            5.907            4.384            3.855
     Kurtosis    11.973           62.129           33.005           29.730
Note: **p < 0.01, related-samples Friedman's two-way analysis of variance by ranks; the null hypothesis that the distribution of relative error is the same across the four models is rejected.

5. Discussions

The discussion section consists of two aspects. For one thing, the attention mechanism is an important part of the proposed att-GCN model, which deserves further exploration. For another, a potential solution to raise the prediction accuracy is to make the most use of the actual values of the two objectives and feed them into the att-GCN model. Such an online learning manner helps to make the dynamic predictions adapt to the evolving excavation process. In the end, a better fit can be generated that follows the real trend of TBM performance more closely. The prediction results under the consideration of the objective values are also compared among the four candidate machine learning models, aiming to further demonstrate the superiority of the att-GCN model.

(1) The in-degree value is a critical parameter of the attention mechanism, whose goal is to control the number of previous observations that are taken into account in the prediction task. That is to say, the in-degree value provides interesting insights into the model prediction behavior, contributing to improving the explainability of deep learning. For example, when the in-degree value is set as 4, the quantum of influence from the four previous segment rings connected to the targeted ring is estimated. In consequence, the most influential part of the neighboring rings to focus on can be reasonably identified under the utilization of an attention-based module. Since the historical data will inevitably make an impact on the current conditions, it is meaningful to set different in-degree values to explore their effect on the prediction accuracy of the att-GCN model. To conduct a series of comparison experiments to find the optimal value, we set the in-degree value to {1, 2, 3, 4, 5}. As a result, Fig. 19 shows the att-GCN performance under different in-degree values, proving that the model forecasting ability is
degree value of 4, the sum of MAPE for the two objectives ach­
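To make the role of the in-degree concrete, the following sketch (in PyTorch, with hypothetical tensor names and dimensions) shows how scaled dot-product attention can be computed between a focused observation and its k previous observations; the softmax weights play the role of the influence values discussed above. It is an illustrative reimplementation under stated assumptions, not the authors' exact att-GCN code.

import torch
import torch.nn.functional as F

def previous_observation_attention(h_target, h_previous):
    # h_target:   (d,)   feature vector of the focused ring / net stroke (query)
    # h_previous: (k, d) features of the k previous observations, k = in-degree (keys and values)
    d = h_target.shape[-1]
    scores = h_previous @ h_target / d ** 0.5   # scaled dot-product scores, shape (k,)
    weights = F.softmax(scores, dim=0)          # influence of each previous observation, sums to 1
    context = weights @ h_previous              # attended aggregation of the previous features, shape (d,)
    return context, weights

# Hypothetical usage with in-degree k = 4 and feature dimension d = 16
h_t = torch.randn(16)
h_prev = torch.randn(4, 16)
ctx, w = previous_observation_attention(h_t, h_prev)
print(w)  # four weights analogous to the influence values reported in Fig. 20 (c)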
Table 4
Statistical characteristics of relative error from the four intelligent models.

Objective | Statistic | att-GCN        | DNN            | RF             | SVR            | p-value
y1        | Average   | 0.155          | 0.164          | 0.181          | 0.216          | 0.000**
y1        | Std       | 0.142          | 0.159          | 0.377          | 0.255          |
y1        | IQR       | [0.054, 0.216] | [0.056, 0.228] | [0.050, 0.222] | [0.075, 0.265] |
y1        | Skewness  | 2.160          | 3.638          | 22.518         | 4.407          |
y1        | Kurtosis  | 7.881          | 29.049         | 684.617        | 30.905         |
y2        | Average   | 0.152          | 0.214          | 0.186          | 0.285          | 0.000**
y2        | Std       | 0.149          | 0.249          | 0.197          | 0.286          |
y2        | IQR       | [0.051, 0.201] | [0.073, 0.282] | [0.066, 0.241] | [0.102, 0.388] |
y2        | Skewness  | 2.467          | 5.907          | 4.384          | 3.855          |
y2        | Kurtosis  | 11.973         | 62.129         | 33.005         | 29.730         |

Note: **p < 0.01, related-samples Friedman's two-way analysis of variance by ranks; the null hypothesis that the distributions of relative error of att-GCN, DNN, RF, and SVR are the same is rejected.
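The significance test reported in the note of Table 4 can be reproduced with the related-samples Friedman test in SciPy. A minimal sketch with synthetic per-sample relative errors (the arrays below are placeholders, not the study's data):

import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)
# Placeholder relative errors of the four models evaluated on the same test samples
err_att_gcn = 0.15 + 0.10 * rng.random(1000)
err_dnn = 0.16 + 0.12 * rng.random(1000)
err_rf = 0.18 + 0.15 * rng.random(1000)
err_svr = 0.22 + 0.20 * rng.random(1000)

# Related-samples Friedman two-way analysis of variance by ranks
stat, p_value = friedmanchisquare(err_att_gcn, err_dnn, err_rf, err_svr)
print(stat, p_value)  # p < 0.01 would reject the hypothesis of identical error distributions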
(2) The previous conditions of the tunneling operation parameters and the geological environment are bound to affect the forecasting of TBM performance variation. The data are observed every 20 mm of net stroke and are dynamic and complex in nature. It can therefore be believed that the previous net strokes tend to have an effect on the next net stroke during the procedure of shield tunneling. Notably, the attention incorporated in the developed att-GCN model paves a new way to capture the influence from the previous observations (one plausible graph construction for this is sketched below). To provide a clearer understanding of this influence, Fig. 20 takes ring No. 270 between the net strokes of 180 mm and 360 mm as an example. Since the optimal in-degree of the attention is determined as 4, Fig. 20 (c) is examined in detail: for the prediction at the 180 mm net stroke, the first, second, third, and fourth previous net strokes influence the prediction to varying degrees, with influence values of 0.455, 0.170, 0.216, and 0.160, respectively. In general, when the location of a net stroke is farther away from the focused site, its influence on the prediction is more likely to wane. Quantifying the significance of the previous conditions for the current penetration rate and energy consumption is beneficial to better understand and evaluate the TBM performance, eventually supporting data-driven decision making for improving tunneling efficiency.
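One plausible way to encode this "previous net strokes influence the next one" structure as a directed graph for a graph convolutional model is sketched below; the construction is an assumption made for illustration and is not necessarily the exact graph definition used in the paper.

import numpy as np

def previous_stroke_adjacency(num_nodes, in_degree):
    # Directed adjacency over a sequence of net-stroke observations:
    # A[i, j] = 1 means the earlier observation j is connected to (influences) the later observation i.
    A = np.zeros((num_nodes, num_nodes), dtype=float)
    for i in range(num_nodes):
        for j in range(max(0, i - in_degree), i):
            A[i, j] = 1.0
    return A

# Hypothetical usage: ten consecutive 20 mm net strokes with in-degree = 4
A = previous_stroke_adjacency(10, 4)
print(A[9])  # the last stroke receives edges from its four predecessors, mirroring Fig. 20 (c)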


Fig. 16. Mean of relative error for (a) y1 and (b) y2 in 33 testing rings based on four intelligent models.

Fig. 17. Prediction results of y1 for ring No. 320 based on four intelligent models: (a) Data samples; (b) Scatter plot.

(3) When the measured values of the two objectives are taken as inputs to carry out the mechanism of online learning, the att-GCN model is more capable of dynamically forecasting the TBM performance with higher accuracy. Since the tunnel excavation is an ongoing process, the ground truth of the two objectives can be recorded in the dataset continuously. Therefore, how to make full use of these constantly updated objective values becomes an issue worthy of discussion, since they can be effective in enhancing the prediction quality and realizing real-time prediction (a rolling prediction loop of this kind is sketched below). To quantitatively verify the effectiveness of the objective values, Table 5 compares the att-GCN performance with and without the observations of the two objectives under different in-degrees of the attention mechanism. It is clear that the input of the known objective value y1 brings a great improvement of more than 36.835%, 45.920%, and 41.769% in terms of RMSE, MAE, and MAPE, respectively. When the ground truth of objective y2 is available, it helps to significantly reduce the RMSE, MAE, and MAPE by at least 32.521%, 42.280%, and 44.978%, respectively. From the view of the relative error as shown in Fig. 21, the att-GCN model that learns the objective values lowers the relative error to an average of 0.085 (y1) and 0.079 (y2) in the 33 testing rings, which is only slightly more than half of the relative error obtained when the objective values are unknown. Apart from ring No. 320, all other rings benefit from the inputs of the known objective values, whose relative error gains around 41% improvement on average compared to the condition taking no account of the real values of the two objectives. Figs. 22 and 23 take the prediction results on ring No. 100 as an example, aiming to visually illustrate the effects of the known objective values on the 61 testing points of a single ring. Evidently, when the objective values are known, the prediction in the blue lines of Figs. 22 (a) and 23 (a) conforms more tightly to the measured values. Also, the orange points in Figs. 22 (b) and 23 (b), which stand for the results without consideration of the objective values, obviously deviate from the line y = x. That is to say, full usage of the actual values of the two objectives provides valuable opportunities to facilitate more accurate att-GCN-based prediction, which has the potential to better guide the control of TBM performance.
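A rolling one-step-ahead loop of the kind described above can be sketched as follows; the model object and feature layout are hypothetical (any regressor with a scikit-learn-style predict method would do), so this is an illustration of the online-learning idea rather than the study's implementation.

import numpy as np

def online_predict(model, features, y_measured):
    # features:   (T, d) machine/geological features per 20 mm net stroke
    # y_measured: (T, 2) measured penetration rate and energy consumption
    # At each step, the measured objectives of the previous net stroke are appended
    # to the current features, mimicking the "known objective value" setting.
    preds = []
    for t in range(1, len(features)):
        x_t = np.concatenate([features[t], y_measured[t - 1]])
        preds.append(model.predict(x_t[None, :])[0])  # predict both objectives for stroke t
    return np.asarray(preds)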


Fig. 18. Prediction results of y2 for ring No. 320 based on four intelligent models: (a) Data samples; (b) Scatter plot.

Fig. 19. Prediction evaluation under different in-degree values of the attention mechanism.

(4) Under the condition of knowing the objective values, the developed att-GCN model trained in an online manner is also superior to DNN, RF, and SVR in terms of the three evaluation metrics RMSE, MAE, and MAPE. As outlined in Table 6, all four algorithms benefit from learning the ground truth of the two objectives, which brings significant improvement in the three evaluation indicators. It also proves the contribution of the objective values to forming an online learning mechanism in the prediction task, which drives the model to adapt to the ongoing excavation process and achieve higher accuracy. Compared to the worst performance, from SVR, the RMSE, MAE, and MAPE of att-GCN are smaller by 0.159, 0.340, and 2.59% for y1 and by 1309.478, 526.810, and 3.22% for y2, respectively. The boxplot in Fig. 24 evaluates the prediction performance based on the relative error. According to the mean and median values, the priority of the four models can be ranked as: att-GCN > RF > DNN > SVR. More specifically, the mean (median) value of the relative error for the predicted penetration rate and energy consumption is 0.085 (0.047) and 0.079 (0.051) under the adoption of the att-GCN model. Overall, when the objective values are fed into the predictive model, att-GCN outperforms the others with the smallest RMSE, MAE, MAPE, and average relative error. In addition, our work concentrates on the multi-output regression task that simultaneously predicts multiple target variables of TBM performance given a set of inputs. However, the existing studies on multi-output regression are insufficient. The core idea behind most studies is to simply convert the multi-output regression problem into independent single-target problems, building a separate model for each target and then concatenating all predictions (the contrast is sketched below). The main weakness lies in the independent prediction, which may negatively affect the computational efficiency and prediction performance [47]. Meanwhile, they mainly learn from the entire dataset and pay little attention to the online learning concept. Differently, the proposed att-GCN-based method is proven useful to inherently support predictive modeling involving two outputs, namely the sequences of TBM penetration rate and energy consumption, at the same time, yielding a relatively higher accuracy (Fig. 24). As for the engineering value, our method returns prediction results concerning two aspects of TBM performance simultaneously and reliably, which provides robust evidence to guide the TBM excavation process in the pursuit of higher efficiency and lower energy consumption.
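The contrast between the problem-transformation strategy (one independent model per target) and a natively multi-output learner can be illustrated with scikit-learn on a toy stand-in for the TBM dataset (7 inputs, 2 targets); this is only a sketch of the two strategies, not the models used in the study.

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# Toy data standing in for the TBM features and the two objectives
X, Y = make_regression(n_samples=500, n_features=7, n_targets=2, noise=0.1, random_state=0)

# Problem transformation: one independent SVR per target, predictions concatenated afterwards
independent = MultiOutputRegressor(SVR()).fit(X, Y)

# Natively multi-output learner: a single model predicts both targets jointly
joint = RandomForestRegressor(random_state=0).fit(X, Y)

print(independent.predict(X[:3]).shape, joint.predict(X[:3]).shape)  # both (3, 2)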


Fig. 20. Measurement of influence from the previous net stroke condition on the prediction results of the current net stroke under different in-degree values: (a) In-degree = 2; (b) In-degree = 3; (c) In-degree = 4; (d) In-degree = 5.

Table 5
Comparison of prediction performance with known and unknown objective values.

Objective | In-degree | Unknown objective value (RMSE / MAE / MAPE) | Known objective value (RMSE / MAE / MAPE) | Improvement (RMSE / MAE / MAPE)
y1 | 1 | 4.323 / 3.212 / 17.701% | 2.558 / 1.737 / 9.942% | 40.834% / 45.920% / 43.836%
y1 | 2 | 4.265 / 3.020 / 15.451% | 2.409 / 1.492 / 8.997% | 43.506% / 50.589% / 41.769%
y1 | 3 | 4.286 / 3.078 / 16.869% | 2.408 / 1.502 / 8.498% | 43.808% / 51.206% / 49.620%
y1 | 4 | 4.063 / 2.944 / 15.317% | 2.454 / 1.479 / 8.504% | 39.594% / 49.764% / 44.479%
y1 | 5 | 4.097 / 2.971 / 15.900% | 2.588 / 1.483 / 8.058% | 36.835% / 50.081% / 49.325%
y2 | 1 | 3939.324 / 2307.163 / 15.240% | 2634.474 / 1331.704 / 8.385% | 33.124% / 42.280% / 44.978%
y2 | 2 | 4114.237 / 2424.552 / 16.735% | 2711.646 / 1340.102 / 8.699% | 34.091% / 44.728% / 48.015%
y2 | 3 | 4121.517 / 2380.218 / 15.781% | 2781.164 / 1319.090 / 8.349% | 32.521% / 44.581% / 47.097%
y2 | 4 | 4091.140 / 2366.797 / 15.547% | 2645.317 / 1280.786 / 7.934% | 35.340% / 45.885% / 48.968%
y2 | 5 | 4341.359 / 2470.127 / 15.718% | 2654.197 / 1260.221 / 7.939% | 38.863% / 48.982% / 49.494%

6. Conclusions and future works

In this paper, a novel graph-based network model named att-GCN, built on the integration of GCN and the scaled dot-product attention mechanism, is developed for a multi-output regression task, aiming to make an accurate prediction of the TBM excavation performance and gain a deep understanding of the ongoing tunneling procedure. The contribution of this research lies in two aspects. From the theoretical aspect, the significant novelty is placed on the innovative development of a hybrid deep learning model, att-GCN, to effectively model nonlinear relationships between inputs and outputs. There are four key points to be noted.


Fig. 21. Comparison of relative error for (a) y1 and (b) y2 with known and unknown objective value.

Fig. 22. Prediction results of y1 for ring No. 100 with and without the objective value: (a) Data samples; (b) Scatter plot.

Fig. 23. Prediction results of y2 for ring No. 100 with and without the objective value: (a) Data samples; (b) Scatter plot.

Firstly, a more advanced attention mechanism, the scaled dot-product attention, is reasonably incorporated into the basic GCN model instead of the classical additive attention, which raises not only the computational efficiency but also the prediction accuracy. Secondly, the attention mechanism over neighbors can easily focus on the most important and relevant parts of the inputs, which plays an important role in making the deep learning algorithm more transparent and explainable. Thirdly, an encoder-decoder architecture is designed to make the model more suitable for sequence-to-sequence data, and thus the proposed att-GCN algorithm is more appropriate for handling the multi-output regression task on the prepared TBM-related dataset.


Table 6
Prediction performance of four intelligent models with known objective value.

Objective | Model | Known objective value (RMSE / MAE / MAPE) | Improvement over unknown objective value (RMSE / MAE / MAPE)
y1 | att-GCN | 2.454103 / 1.479162 / 8.50% | 41.26% / 50.15% / 45.05%
y1 | DNN | 2.641292 / 1.801029 / 10.14% | 40.38% / 43.25% / 38.33%
y1 | RF | 2.882373 / 1.735178 / 9.84% | 31.49% / 42.15% / 45.70%
y1 | SVR | 2.612699 / 1.819318 / 11.09% | 44.77% / 50.64% / 48.71%
y2 | att-GCN | 2645.317 / 1280.786 / 7.93% | 38.68% / 47.14% / 47.71%
y2 | DNN | 2808.706 / 1509.135 / 10.29% | 39.29% / 47.52% / 51.95%
y2 | RF | 2710.926 / 1347.805 / 8.63% | 33.21% / 46.39% / 53.64%
y2 | SVR | 3954.795 / 1807.596 / 11.16% | 17.13% / 44.05% / 60.84%

Fig. 24. Boxplot of relative error for (a) y1 and (b) y2 under the consideration of objective value.
Fourthly, to make the att-GCN model more practical in the underground construction project, the model can be trained in an online learning manner, adaptively learning from the continuous stream of TBM-related data at each time step. This provides a valuable opportunity to instantly update the model and significantly improve the prediction accuracy. As a result, the developed att-GCN model has shown advantages in easing the burden on feature engineering, enhancing the model interpretability through attention mechanisms, and improving the prediction ability over popular machine learning algorithms. From the practical aspect, the penetration rate and energy consumption of the TBM can be forecasted simultaneously, efficiently, and dynamically. The interactions between pairs of segment rings can be quantitatively captured to find out which should be paid attention to. In the end, the accurate prediction results can assist in adjusting the operation of the TBM and the construction progress in a data-driven manner, which holds promise for the intelligent feedforward control of the TBM towards sustainable and efficient shield tunneling.

A deep excavation case study in the Singapore MRT construction project is conducted to validate the effectiveness of the proposed method. In total, 330 rings are taken into account and the monitoring data are collected at an interval of 20 mm of net stroke. Some useful results can be concluded as follows: (1) By fully learning all seven TBM mechanical parameters in the time-series format, the developed att-GCN model is good at nonlinearly interpreting the TBM performance for each segment ring. As a result, it supports multi-output regression with high accuracy, achieving relatively low MAPEs of 15.475% and 15.173% for the two objectives (y1 and y2), respectively. (2) The prediction performance of att-GCN is obviously superior to the other three popular intelligent models, including DNN, RF, and SVR. Under the evaluation indicator of MAPE, the prediction result of att-GCN improves on the previous best result (DNN) by 0.964% in y1 and on the previous best result (RF) by 3.439% in y2. (3) The attention mechanism plays an important role in measuring the influence of previous observations on the TBM performance in the next net stroke. Herein, it is better to set the in-degree of the attention as 4, which paves a new way to raise the prediction accuracy. (4) To cope with monitoring data that is collected in sequence during shield tunneling, att-GCN can be reasonably developed into a real-time prediction model, showing its advantages in data efficiency and adaptability. When the objective values are continuously input into the att-GCN model, it brings over 40% improvement of MAPE in predicting y1 and y2.

There are some limitations for further improvement. For one thing, instead of the manual design of att-GCN, we can refer to an advanced technique called neural architecture search (NAS) to determine the optimal combination of hyperparameters, architecture, and training process of the complex att-GCN model in an automatic and cost-efficient manner. Various NAS methods concerning evolutionary and search algorithms can be implemented to generate a high-performance att-GCN model, aiming to further enhance the prediction reliability. For another, with the incorporation of an outstanding multi-objective optimization algorithm [48], our proposed decision-making approach can potentially demonstrate a valuable application prospect in the creation of a digital twin system. It is expected that the digital twin can offer a unique opportunity to facilitate better safety monitoring, prediction, and control under a high degree of automation and information, enabling smarter tunnel construction management.

Declaration of Competing Interest

The authors declare that they have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.

Acknowledgments

The Ministry of Education Tier 1 Grant, Singapore (No. 04MNP002126C120), the Start-Up Grant at Huazhong University of Science and Technology (No. 3004242122), and the Shanghai Sailing Program (No. 22YF1419100) are acknowledged for their financial support of this research.


References

[1] G. Shi, C. Qin, J. Tao, C. Liu, A VMD-EWT-LSTM-based multi-step prediction approach for shield tunneling machine cutterhead torque, Knowl.-Based Syst. 228 (2021) 107213, https://doi.org/10.1016/j.knosys.2021.107213.
[2] C. Zhou, L.Y. Ding, M.J. Skibniewski, H. Luo, H.T. Zhang, Data based complex network modeling and analysis of shield tunneling performance in metro construction, Adv. Eng. Inform. 38 (2018) 168–186, https://doi.org/10.1016/j.aei.2018.06.011.
[3] C. Zhou, L. Ding, Y. Zhou, H. Zhang, M.J. Skibniewski, Hybrid support vector machine optimization model for prediction of energy consumption of cutter head drives in shield tunneling, J. Comput. Civ. Eng. 33 (3) (2019) 04019019, https://doi.org/10.1061/(ASCE)CP.1943-5487.0000833.
[4] Q. Gong, L. Yin, H. Ma, J. Zhao, TBM tunnelling under adverse geological conditions: an overview, Tunn. Undergr. Space Technol. 57 (2016) 4–17, https://doi.org/10.1016/j.tust.2016.04.002.
[5] M. Shi, L. Zhang, W. Sun, X. Song, A fuzzy c-means algorithm guided by attribute correlations and its application in the big data analysis of tunnel boring machine, Knowl.-Based Syst. 182 (2019) 104859, https://doi.org/10.1016/j.knosys.2019.07.030.
[6] M. Shi, T. Zhang, L. Zhang, W. Sun, X. Song, A fuzzy c-means algorithm based on the relationship among attributes of data and its application in tunnel boring machine, Knowl.-Based Syst. 191 (2020) 105229, https://doi.org/10.1016/j.knosys.2019.105229.
[7] X. Fu, L. Zhang, Spatio-temporal feature fusion for real-time prediction of TBM operating parameters: a deep learning approach, Autom. Constr. 132 (2021) 103937, https://doi.org/10.1016/j.autcon.2021.103937.
[8] K. Guo, L. Zhang, Multi-source information fusion for safety risk assessment in underground tunnels, Knowl.-Based Syst. 227 (2021) 107210, https://doi.org/10.1016/j.knosys.2021.107210.
[9] C. Zhou, T. Kong, Y. Zhou, H. Zhang, L. Ding, Unsupervised spectral clustering for shield tunneling machine monitoring data with complex network theory, Autom. Constr. 107 (2019) 102924, https://doi.org/10.1016/j.autcon.2019.102924.
[10] S.-S. Lin, S.-L. Shen, N. Zhang, A. Zhou, Modelling the performance of EPB shield tunnelling using machine and deep learning algorithms, Geosci. Front. 12 (5) (2021) 101177, https://doi.org/10.1016/j.gsf.2021.101177.
[11] X. Gao, M. Shi, X. Song, C. Zhang, H. Zhang, Recurrent neural networks for real-time prediction of TBM operating parameters, Autom. Constr. 98 (2019) 225–235, https://doi.org/10.1016/j.autcon.2018.11.013.
[12] K. Elbaz, S.-L. Shen, A. Zhou, D.-J. Yuan, Y.-S. Xu, Optimization of EPB shield performance with adaptive neuro-fuzzy inference system and genetic algorithm, Appl. Sci. 9 (4) (2019) 780, https://doi.org/10.3390/app9040780.
[13] M. Mirahmadi, M. Tabaei, M.S. Dehkordi, Estimation of the specific energy of TBM using the strain energy of rock mass, case study: Amir-Kabir water transferring tunnel of Iran, Geotech. Geol. Eng. 35 (5) (2017) 1991–2002, https://doi.org/10.1007/s10706-017-0222-z.
[14] Y. Pan, L. Zhang, Roles of artificial intelligence in construction engineering and management: a critical review and future trends, Autom. Constr. 122 (2021) 103517, https://doi.org/10.1016/j.autcon.2020.103517.
[15] Y. Pan, L. Zhang, Data-driven estimation of building energy consumption with multi-source heterogeneous data, Appl. Energy 268 (2020) 114965, https://doi.org/10.1016/j.apenergy.2020.114965.
[16] L.-j. Jing, J.-b. Li, C. Yang, S. Chen, N. Zhang, X.-x. Peng, A case study of TBM performance prediction using field tunnelling tests in limestone strata, Tunn. Undergr. Space Technol. 83 (2019) 364–372, https://doi.org/10.1016/j.tust.2018.10.001.
[17] W. Zhang, et al., State-of-the-art review of soft computing applications in underground excavations, Geosci. Front. 11 (4) (2020) 1095–1106, https://doi.org/10.1016/j.gsf.2019.12.003.
[18] J. Zhou, et al., Predicting TBM penetration rate in hard rock condition: a comparative study among six XGB-based metaheuristic techniques, Geosci. Front. 12 (3) (2021) 101091, https://doi.org/10.1016/j.gsf.2020.09.020.
[19] M.A. Grima, P. Bruines, P. Verhoef, Modeling tunnel boring machine performance by neuro-fuzzy methods, Tunn. Undergr. Space Technol. 15 (3) (2000) 259–269, https://doi.org/10.1016/S0886-7798(00)00055-9.
[20] X. Yin, Q. Liu, X. Huang, Y. Pan, Perception model of surrounding rock geological conditions based on TBM operational big data and combined unsupervised-supervised learning, Tunn. Undergr. Space Technol. 120 (2022) 104285, https://doi.org/10.1016/j.tust.2021.104285.
[21] P. Zhang, H.-N. Wu, R.-P. Chen, T. Dai, F.-Y. Meng, H.-B. Wang, A critical evaluation of machine learning and deep learning in shield-ground interaction prediction, Tunn. Undergr. Space Technol. 106 (2020) 103593, https://doi.org/10.1016/j.tust.2020.103593.
[22] J. Yang, S. Yagiz, Y.-J. Liu, F. Laouafa, Comprehensive evaluation of machine learning algorithms applied to TBM performance prediction, Underground Space 7 (1) (2022) 37–49, https://doi.org/10.1016/j.undsp.2021.04.003.
[23] H. Tao, W. Jingcheng, Z. Langwen, Prediction of hard rock TBM penetration rate using random forests, in: The 27th Chinese Control and Decision Conference (2015 CCDC), IEEE, 2015, pp. 3716–3720, https://doi.org/10.1109/CCDC.2015.7162572.
[24] W. Sun, M. Shi, C. Zhang, J. Zhao, X. Song, Dynamic load prediction of tunnel boring machine (TBM) based on heterogeneous in-situ data, Autom. Constr. 92 (2018) 23–34, https://doi.org/10.1016/j.autcon.2018.03.030.
[25] J. Zhou, et al., Optimization of support vector machine through the use of metaheuristic algorithms in forecasting TBM advance rate, Eng. Appl. Artif. Intell. 97 (2021) 104015, https://doi.org/10.1016/j.engappai.2020.104015.
[26] Y. Pan, L. Zhang, Mitigating tunnel-induced damages using deep neural networks, Autom. Constr. 138 (2022) 104219, https://doi.org/10.1016/j.autcon.2022.104219.
[27] I. Shahrour, W. Zhang, Use of soft computing techniques for tunneling optimization of tunnel boring machines, Underground Space 6 (3) (2021) 233–239, https://doi.org/10.1016/j.undsp.2019.12.001.
[28] D.J. Armaghani, E.T. Mohamad, M.S. Narayanasamy, N. Narita, S. Yagiz, Development of hybrid intelligent models for predicting TBM penetration rate in hard rock condition, Tunn. Undergr. Space Technol. 63 (2017) 29–43, https://doi.org/10.1016/j.tust.2016.12.009.
[29] C. Zhou, H. Xu, L. Ding, L. Wei, Y. Zhou, Dynamic prediction for attitude and position in shield tunneling: a deep learning method, Autom. Constr. 105 (2019) 102840, https://doi.org/10.1016/j.autcon.2019.102840.
[30] S. Feng, et al., Tunnel boring machines (TBM) performance prediction: a case study using big data and deep learning, Tunn. Undergr. Space Technol. 110 (2021) 103636, https://doi.org/10.1016/j.tust.2020.103636.
[31] S. Zhang, H. Tong, J. Xu, R. Maciejewski, Graph convolutional networks: a comprehensive review, Comput. Soc. Netw. 6 (1) (2019) 1–23, https://doi.org/10.1186/s40649-019-0069-y.
[32] L. Zhao, et al., T-gcn: a temporal graph convolutional network for traffic prediction, IEEE Trans. Intell. Transp. Syst. 21 (9) (2019) 3848–3858, https://doi.org/10.1109/TITS.2019.2935152.
[33] C. Zhou, L. Ding, Y. Zhou, H. Luo, Topological mapping and assessment of multiple settlement time series in deep excavation: a complex network perspective, Adv. Eng. Inform. 36 (2018) 1–19, https://doi.org/10.1016/j.aei.2018.02.005.
[34] J. Gehring, M. Auli, D. Grangier, Y.N. Dauphin, A convolutional encoder model for neural machine translation, arXiv preprint (2016) arXiv:1611.02344, https://arxiv.org/abs/1611.02344.
[35] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio, Y. Bengio, Graph attention networks, Stat 1050 (2017) 20, https://arxiv.org/abs/1710.10903.
[36] Y. Pan, L. Zhang, J. Unwin, M.J. Skibniewski, Discovering spatial-temporal patterns via complex networks in investigating COVID-19 pandemic in the United States, Sustain. Cities Soc. 77 (2022) 103508, https://doi.org/10.1016/j.scs.2021.103508.
[37] D. Duvenaud, et al., Convolutional networks on graphs for learning molecular fingerprints, arXiv preprint (2015) arXiv:1509.09292, https://arxiv.org/abs/1509.09292.
[38] T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, arXiv preprint (2016) arXiv:1609.02907, https://arxiv.org/abs/1609.02907.
[39] R. Vijayan, G. Mohler, Forecasting retweet count during elections using graph convolution neural networks, in: 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), IEEE, 2018, pp. 256–262, https://doi.org/10.1109/DSAA.2018.00036.
[40] A. Vaswani, et al., Attention is all you need, Adv. Neural Inf. Proces. Syst. 30 (2017), https://papers.nips.cc/paper/2017.
[41] D. Chicco, M.J. Warrens, G. Jurman, The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation, PeerJ Comput. Sci. 7 (2021) e623, https://doi.org/10.7717/peerj-cs.623.
[42] H. Fattahi, N. Babanouri, Applying optimized support vector regression models for prediction of tunnel boring machine performance, Geotech. Geol. Eng. 35 (5) (2017) 2205–2217, https://doi.org/10.1007/s10706-017-0238-4.
[43] G.R. Garcia, G. Michau, H.H. Einstein, O. Fink, Decision support system for an intelligent operator of utility tunnel boring machines, Autom. Constr. 131 (2021) 103880, https://doi.org/10.1016/j.autcon.2021.103880.
[44] Z.A. Khan, A.N. Siddiquee, N.Z. Khan, U. Khan, G. Quadir, Multi response optimization of wire electrical discharge machining process parameters using Taguchi based grey relational analysis, Procedia Mater. Sci. 6 (2014) 1683–1695, https://doi.org/10.1007/s00170-006-0672-6.
[45] K. Were, D.T. Bui, Ø.B. Dick, B.R. Singh, A comparative assessment of support vector regression, artificial neural networks, and random forests for predicting and mapping soil organic carbon stocks across an Afromontane landscape, Ecol. Indic. 52 (2015) 394–403, https://doi.org/10.1016/j.ecolind.2014.12.028.
[46] Y. Pan, L. Zhang, BIM log mining: learning and predicting design commands, Autom. Constr. 112 (2020) 103107, https://doi.org/10.1016/j.autcon.2020.103107.
[47] H. Borchani, G. Varando, C. Bielza, P. Larranaga, A survey on multi-output regression, Wiley Interdisc. Rev. Data Min. Knowl. Discov. 5 (5) (2015) 216–233, https://doi.org/10.1002/widm.1157.
[48] L. Zhang, P. Lin, Multi-objective optimization for limiting tunnel-induced damages considering uncertainties, Reliab. Eng. Syst. Saf. 216 (2021) 107945, https://doi.org/10.1016/j.ress.2021.107945.
