Use of Machine Learning
Article in Journal of Energy Resources Technology, Transactions of the ASME · April 2021
DOI: 10.1115/1.4048070
All content following this page was uploaded by Ahmed Saihati on 23 February 2021.
Keywords: torque and drag, horizontal drilling, random forest, Mahalanobis distance,
petroleum engineering, petroleum wells-drilling/production/construction
Journal of Energy Resources Technology Copyright © 2020 by ASME APRIL 2021, Vol. 143 / 043201-1
A challenging problem that arises in this domain is the use of a single data entry of the friction coefficient when the well profile is designed using modern T&D software packages, regardless of the type of formation being drilled. In addition, the friction coefficient used to calculate T&D values in the planning phase needs to be calibrated during the actual drilling phase [2,5–7]. To do that, drilling surface loads including pick-up weight, slack-off weight, and surface torque reading are used as data inputs for the software. Then, the software performs backward calculations to find the friction coefficient, which will be different for each drilling surface load [2,6,8]. Therefore, the friction coefficient has to be considered carefully, not only to lend credibility to the model and predict T&D confidently but also so as not to obscure downhole problems that might cause an unhealthy drilling environment.

The first T&D model was developed by Johancsik et al. [2]. This model was put in a differential equation standard form by Sheppard et al. [9]. It is the most common model used for drill string analysis, and it has been adopted broadly for well planning [10]. Vos and Reiber [4] combined real-time friction coefficients with equivalent circulating density (ECD) and vibration to gain insights about downhole conditions including the hole cleaning condition, drilling efficiency, and wellbore stability. Rae et al. [10] suggested designing a well profile using a T&D simulator to calculate surface drilling torque and hook load and comparing it with actual field data. If there is inconsistency in the results, it implies that either the model is not reliable or the hole condition has deteriorated. Mason and Chen [11] emphasized that the effect of drag force as a result of pipe movement in the opposite direction of drilling fluid flow and wellbore tortuosity should be considered in the soft string model for better accuracy of T&D estimation. Mirhaj et al. [8] found from a field case study that the friction coefficients resulting from reverse calculations of hook load during drilling and while tripping activities are 0.05 and 0.2, respectively. Mitchell et al. [12] developed a new model that adds bending moment and shear forces to the pre-existing conventional T&D model. The new model also determines the contact forces between the drill string and wellbore more accurately for a complex well path, which results in a better friction coefficient.

It is clear from the literature that the friction coefficient needs to be iteratively altered in the model using field T&D data to increase model reliability. Moreover, the friction coefficient needed to match the torque results is different from the coefficient needed to match the drag results; it can also differ from trip to trip [13]. Thus, vigilance, caution, and domain knowledge are essential when using the T&D model.

One way of avoiding the above problems is to use artificial intelligence (AI) and machine learning (ML) models, which could be built based on drilling surface parameters only. This viable solution has been proposed to eliminate the need to alter the friction coefficient and to assist the crew to have real-time intuition about T&D. This paper will introduce an intelligent system that analyses surface drilling parameters to identify possible downhole issues by providing an alarm based on a pre-defined threshold. The proposed alarm system could promote safer operations while drilling extended horizontal wells and improve the response time limit for the drilling crew to prevent possible stuck pipe incidents.

The types of models used in the oil and gas industry are discussed in Sec. 2, which is then followed by a discussion of different AI and ML models, random forest (RF), artificial neural network (ANN), and functional network (FN). Sections 3 and 4 describe the datasets and the experimental design methodology for developing the intelligent system. Section 5 presents the results of the study and a detailed discussion, while conclusions are presented in Sec. 6.

2 Data-Driven Models
Models used in the oil and gas industry can be classified into three categories: (i) mathematical, (ii) physical, and (iii) empirical [14]. Mathematical models are based on first principles, such as the conservation of energy, mass, and momentum, or on empirical relations. Although mathematical models can simulate real-life situations to forecast future behaviors, they do not provide effective solutions under some conditions [15]. On the other hand, empirical models, which are based on experiments and observations, are easy to establish, but might not be accurate and cannot be generalized [14,16,17]. Conventional models, which entail several assumptions and simplifications, require extensive procedures based on trial and error until desired outcomes are obtained. Furthermore, they are unable to handle complex relations and account for noisy data [17]. Data mining, which plays a crucial role in data-driven models, can extract hidden predictive insights from large and complex datasets. It integrates ML and pattern recognition algorithms with statistical and visualization tools to find anomalies and correlations within large datasets.

Data-driven models incorporate two methods: (i) computational intelligence, which involves artificial neural networks (ANN), fuzzy rule-based systems (ANFIS), and genetic algorithms, and (ii) ML models that are based on the theoretical foundation applied by computational intelligence [18]. These two methods are powerful tools that boost the capability to detect and uncover hidden information and enable operational improvement in the oil and gas industry.

AI and ML have been widely used in the drilling industry to generate insights that would help to predict, detect, and describe trends. Hedge et al. [19] used logistic regression, support vector machine (SVM), RF, and Gaussian mixture to classify the severity of the stick-slip index (SSI) during drilling as low or high. The data consisted of surface and downhole drilling parameters, and vibrational measurements obtained from a measurement-while-drilling (MWD) sensor. In the study, a threshold of one was used to separate the dataset into two classes, low or high. If SSI is greater than 1, the data points are classified as high, and if SSI is less than 1, the data points are classified as low. The results showed that the RF outperformed other models with an average accuracy of 90%.

Gurina et al. [20] used an analogs search approach to detect accidents during directional drilling. Two types of data were used in the study: (i) 94 operational accidents obtained from old wells including stuck pipe, drill string wash-out, breaks of drilling, mud loss, shale collars, and fluid show and (ii) measurement-while-drilling data such as depth of the drill bit, torque, WOB, SPP, RS, Q, gas content, and weight on the hook. The database values of mean, variance, slope angle, deviation, and relative coefficient of the measurement-while-drilling data were calculated and used as input parameters for the gradient boosting (GB) classification model. The Precision-Recall curve was used to check the model quality since the problem was an unbalanced classification problem. The model had a Precision-Recall curve of 0.6086, which indicates an adequate model. Abbas et al. [21] developed an ANN model to predict ROP in highly angled wells using depth, WOB, RS, bit type, bit working hours, torque, Q, SPP, total flow area (TFA), azimuth, inclination, mud weight (MW), mud fluid rheology (e.g., funnel viscosity, plastic viscosity, yield point), rock compressive strength, vertical stress, maximum horizontal stress, and minimum horizontal stress as inputs. The highest coefficient of determination (R² = 0.97) was obtained using three layers and 30 neurons with the tan-sigmoid (TANSIG) transfer function. Elkatatny [22] developed ANN, ANFIS, and SVM models to obtain a continuous profile of the static Poisson’s ratio for a carbonate reservoir. 610 core samples and log data (bulk density, compressional time, shear time) were used to train and test the models. The ANN achieved the best results compared to the ANFIS and SVM. The developed model can be used to estimate the Poisson’s ratio without a need for coring and extensive lab work.

Abdelgawad et al. [23] developed an ANN to estimate the rheological properties of bentonite spud mud. The input parameters of the model were MW, marsh funnel viscosity, and solid percent. The ANN model was combined with the self-adaptive differential evolution algorithm (SaDE) to optimize the developed ANN model. The model predicted the rheological properties with an
Table 1 Statistical parameters for the training data (7186 data points)
Statistical parameter Q (gal/min) HL (klbf) ROP (ft/h) RS (RPM) SPP (psi) WOB (klbf) Torque (kft.lbf)
Table 2 Statistical parameters for the testing data (1797 data points)
Statistical parameter Q (gal/min) HL (klbf) ROP (ft/h) RS (RPM) SPP (psi) WOB (klbf) Torque (kft.lbf)
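The captions of Tables 1 and 2 imply that the 8983 available data points were divided into 7186 training points and 1797 testing points, i.e., roughly an 80/20 split. A minimal Python sketch of such a split is given below. The random shuffling, the seed, and the function name `split_indices` are assumptions made for illustration; this excerpt does not state how the authors partitioned the data.

```python
import random


def split_indices(n_points, train_fraction=0.8, seed=0):
    """Shuffle row indices and split them into training and testing parts.

    Assumption: a random split; the source only reports the resulting sizes.
    """
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    n_train = int(n_points * train_fraction)
    return indices[:n_train], indices[n_train:]


# Sizes taken from the captions of Tables 1 and 2.
train_idx, test_idx = split_indices(7186 + 1797)
print(len(train_idx), len(test_idx))  # 7186 1797
```

With a fixed seed the split is reproducible, which makes it easier to compare candidate models on the same testing set.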
4.6 The Last Day Leading up to the Incident. The best-developed model out of the three was used to predict the surface drilling torque for the last day leading up to the incident in Well-1. The statistical parameters of the last day leading up to the incident are listed in Table 3. The input parameters for the last day leading up to the incident in Well-1 were checked to ensure that they fall within the same range as the dataset used to train the best model.

4.7 Real-Time Alarm Detection. To perform real-time anomaly detection, a threshold value needs to be determined to

5 Results and Discussion
5.1 Models Assessment. The RF model optimum parameters are presented in Table 5. The RF predicted the actual torque with AAPE of 1.46% and R of 0.99 in the training set, while AAPE and R were 3.98% and 0.93, respectively, in the testing set. Figures 3(a) and 3(b) are cross-plots of the actual and predicted torque of the training and testing sets, respectively.

Table 6 shows the performance of the ANN model with different training functions and their optimal number of neurons when using the LOGSIG transfer function. Table 7 shows the performance of the ANN model with different training functions and their optimal number of neurons when using the TANSIG transfer function. The lowest AAPE in the testing set was the criterion to select the optimum ANN model. The analysis shows that the ANN model, when using the LOGSIG transfer function with a training function,

Table 5 The optimum parameters of the RF model
Optimum parameters
max_features    Log2
max_depth       23
n_estimators    100

Fig. 3 Cross-plots of the actual torque versus predicted torque using RF for (a) training set and (b) testing set
Table 7 The performance of the ANN model using different training functions with TANSIG transfer function and one hidden layer
Fig. 4 Cross-plots of the actual torque versus predicted torque using ANN for (a) training set and (b) testing set
Table 8 The performance of the FN model with different methods and relationship types
Fig. 5 Cross-plots of the actual torque versus predicted torque using FN for (a) training set and
(b) testing set
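The models in Sec. 5.1 are compared using AAPE and R. The Python sketch below shows one common way to compute both, assuming that AAPE denotes the average absolute percentage error and that R is the Pearson correlation coefficient between actual and predicted torque. The function names and the sample torque values are hypothetical and are not taken from the paper's data.

```python
import math


def aape(actual, predicted):
    """Average absolute percentage error, in percent (actual values must be nonzero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical torque readings (kft.lbf), for illustration only.
actual_torque = [10.0, 20.0, 30.0, 40.0]
predicted_torque = [11.0, 19.0, 33.0, 38.0]
print(aape(actual_torque, predicted_torque))
print(pearson_r(actual_torque, predicted_torque))
```

Selecting the model with the lowest testing-set AAPE, as described for the ANN, then amounts to evaluating `aape` on each candidate's testing predictions and keeping the minimum.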
Fig. 6 Depth versus the actual and predicted torque (the last day leading up to the incident, Well-1)
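Before the trained model is applied to the last-day data, Sec. 4.6 describes checking that the input parameters lie within the range of the training dataset. A minimal Python sketch of such a check follows. The feature names mirror the column headers of Table 1, while the numeric values, the row layout (one dictionary per data point), and the function names are hypothetical.

```python
def training_ranges(training_rows):
    """Return {feature: (min, max)} computed over the training rows."""
    features = training_rows[0].keys()
    return {
        f: (min(r[f] for r in training_rows), max(r[f] for r in training_rows))
        for f in features
    }


def out_of_range(new_rows, ranges):
    """List (row_index, feature, value) for every value outside the training range."""
    issues = []
    for i, row in enumerate(new_rows):
        for feature, (low, high) in ranges.items():
            if not low <= row[feature] <= high:
                issues.append((i, feature, row[feature]))
    return issues


# Hypothetical values, for illustration only.
train = [
    {"Q": 500.0, "RS": 100.0, "WOB": 10.0},
    {"Q": 650.0, "RS": 160.0, "WOB": 35.0},
]
last_day = [
    {"Q": 600.0, "RS": 120.0, "WOB": 20.0},
    {"Q": 700.0, "RS": 150.0, "WOB": 30.0},  # Q exceeds the training maximum
]
print(out_of_range(last_day, training_ranges(train)))  # [(1, 'Q', 700.0)]
```

An empty list indicates that every last-day input falls inside the range seen during training, so the model is not being asked to extrapolate.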
Fig. 8 Depth versus the actual and predicted torque (the last day leading up to the incident, Well-2)
Fig. 9 Mahalanobis distances of the normal and healthy trend (the last day leading up to the incident, Well-2)

react and take action to mitigate the problem promptly, thereby minimizing the unproductive time associated with the incident.

5.3 Alarm Detection in Well-2. RF was used to predict the torque on the last day leading up to the incident in Well-2. Figure 8 indicates that the hole condition starts to deteriorate at a depth of 15,060 ft.

Similar to Well-1, the Mahalanobis distances of all observations of the modeled surface drilling torque (normal and healthy trend) in Well-2 were calculated to decide on a threshold. Figure 9 shows that the Mahalanobis distances for the majority of the data points fall between 1.36 and 3.39. However, 11 data points lie farthest from the centroid of the distribution, within the range of 3.97–4.26. Therefore, a threshold value of 4.26 is used to flag an anomaly when the actual surface drilling torque is examined.

The Mahalanobis distances of the actual drilling surface torque were calculated and compared with the pre-determined threshold, i.e., 4.26. The Mahalanobis distances for the last two drilled

6 Conclusions
The key contribution of this work is the solution it provides for streamlining the early detection of operational anomalies. Unlike traditional drilling models, ML models provide predictive capabilities. An ML model, i.e., RF, was built to predict the surface torque using actual field data of Well-1. Well-2 was used to assess the capability of the intelligent system in detecting downhole abnormalities in the last day leading up to the incident in Well-2. Based on the results, the following can be concluded:
– The RF predicted the actual torque with AAPE of 1.46% and R of 0.99 in the training set, while AAPE and R were 3.98% and 0.93, respectively, in the testing set.
– The RF model recognized that the actual torque trend in the last day leading up to the incident in Well-1 diverged from the predictive model at a depth of 14,900 ft.
– The RF model recognized that the actual torque trend in the last day leading up to the incident in Well-2 deviated from the predictive model at a depth of 15,060 ft.
– The intelligent system raised a real-time alarm to alert the drilling crew 9 h before any abnormality was observed or reported by the drilling crew or monitoring engineers in Well-1. In Well-2, the alarm was raised 7 h before any abnormality was observed by the drilling crew.

Conflict of Interest
There are no conflicts of interest.
Appendix A
Table 10 Mahalanobis distances for the last two drilled stands, Well-1
Table 11 Mahalanobis distances for the last two drilled stands, Well-2
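The alarm logic described in Sec. 5.3 can be sketched in Python as follows: the threshold is taken as the largest Mahalanobis distance observed on the modeled (normal and healthy) torque trend, and an alarm is raised at the first actual observation whose distance exceeds it. Only the values 1.36, 3.39, 3.97, and 4.26 come from the text; the remaining numbers, the function names, and the use of a strict "greater than" comparison are assumptions made for illustration.

```python
def select_threshold(reference_distances):
    """Use the largest distance of the normal and healthy trend as the threshold."""
    return max(reference_distances)


def first_alarm(actual_distances, threshold):
    """Return the index of the first observation above the threshold, or None."""
    for index, distance in enumerate(actual_distances):
        if distance > threshold:  # assumption: strict inequality
            return index
    return None


# Distances of the modeled torque trend; bounds follow Sec. 5.3, the rest is hypothetical.
reference = [1.36, 2.10, 3.39, 3.97, 4.26]
# Hypothetical distances of the actual torque for successive observations.
actual = [2.80, 3.90, 4.10, 5.20, 6.70]

threshold = select_threshold(reference)
print(threshold)                       # 4.26
print(first_alarm(actual, threshold))  # 3
```

Mapping the returned index back to its depth or timestamp gives the point at which the drilling crew would be alerted.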
Appendix B
Example of calculating the Mahalanobis distance
Step 1: Finding a dataset
Suppose that the following dataset (Y) was obtained and we need to find the Mahalanobis distance of $x = \langle 66.0, 640.0, 44.0 \rangle$:

$\bar{Y}_j = \frac{1}{n} \sum_{i=1}^{n} Y_{ij}$

where $Y_j$ denotes the jth column of Y, i is the row number, and j is the column number.

$\bar{Y}_A = 68.0, \quad \bar{Y}_B = 600.0, \quad \bar{Y}_C = 40.0$

Therefore,

$\mu = \langle 68.0, 600.0, 40.0 \rangle$
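The calculation outlined in this appendix can be sketched in plain Python as shown below. The five rows of `Y` are illustrative values chosen so that their column means equal $\langle 68.0, 600.0, 40.0 \rangle$; they are not taken from the appendix table, which is not reproduced here. The use of the sample covariance (division by n − 1) and all function names are likewise assumptions.

```python
import math


def column_means(rows):
    """Mean of each column of a list of equally long rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sample_covariance(rows):
    """Sample covariance matrix (assumption: normalized by n - 1)."""
    n = len(rows)
    mu = column_means(rows)
    centered = [[v - m for v, m in zip(row, mu)] for row in rows]
    k = len(mu)
    return [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(k)]
        for a in range(k)
    ]


def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, k))
        z[r] = (m[r][k] - tail) / m[r][r]
    return z


def mahalanobis(x, rows):
    """sqrt((x - mu)^T S^-1 (x - mu)), with mu and S estimated from rows."""
    mu = column_means(rows)
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(sample_covariance(rows), diff)  # z = S^-1 (x - mu)
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))


# Illustrative dataset (hypothetical); its column means are 68.0, 600.0, 40.0.
Y = [
    [64.0, 580.0, 29.0],
    [66.0, 570.0, 33.0],
    [68.0, 590.0, 37.0],
    [69.0, 660.0, 46.0],
    [73.0, 600.0, 55.0],
]
x = [66.0, 640.0, 44.0]
print(column_means(Y))  # [68.0, 600.0, 40.0]
print(mahalanobis(x, Y))
```

Solving the linear system instead of explicitly inverting the covariance matrix keeps the sketch short and avoids a separate inversion routine.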