Coupled ML Models For Drilling Optimization
PII: S1875-5100(18)30254-3
DOI: 10.1016/j.jngse.2018.06.006
Reference: JNGSE 2605
Please cite this article as: Hegde, C., Gray, K., Evaluation of coupled machine learning models for drilling optimization, Journal of Natural Gas Science & Engineering (2018), doi: 10.1016/j.jngse.2018.06.006.
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and all
legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
Hildebrand Department of Petroleum and Geosystems Engineering, The University of Texas at Austin
* Corresponding Author
Abstract
Drilling optimization can provide significant value to an oil and gas project, especially in a low-price environment. This is generally approached by optimizing the rate of penetration (ROP) of the well, which may not always be the best strategy. Two additional strategies (or models) can be used to optimize a well: the torque on bit (TOB) response, to reduce vibrations at the bit, or the mechanical specific energy (MSE), to reduce the energy used by the bit. This paper evaluates these three models for drilling optimization based on several criteria. Models for ROP, TOB and MSE are built using a data-driven approach with the random forests algorithm, using drilling operational parameters such as weight-on-bit, flow-rate, rotary speed, and rock strength as inputs. The drilling models are optimized using a meta-heuristic optimization algorithm to compute the ideal drilling operational parameters for drilling ahead of the bit. Machine learning is used to develop these models because the resulting models are coupled, which enables calculation of interaction effects. Results show that optimizing the ROP model leads to a 28% improvement in ROP on average; however, this also increases the MSE and the TOB, which is undesirable. Optimizing the MSE model results in a smaller increase in ROP (20%). This is accompanied by a decrease in MSE (by 15%) and a decrease in TOB (by 7%), which may result in longer bit life and additional savings over time. Hypothesis testing has been used to ensure that all simulations conducted in this paper show statistically significant results.
Keywords: MSE, ROP, data-driven, machine learning, drilling, optimization, data analytics
1. Introduction
Drilling optimization has been a focal point of drilling research for a couple of decades. Over this period, engineers, operators, and researchers have made efforts to reduce the cost of drilling by optimizing different metrics such as rate of penetration (ROP), vibrations, mechanical specific energy (MSE), torque on bit (TOB), and cost per foot of drilling. These metrics affect drilling time, productivity, and the equipment used to drill a given well, and therefore directly affect costs.
Modeling of the rate of penetration (ROP) has always been of interest since it represents the speed of drilling, which is an easy means of measuring productivity in drilling. This led to research in ROP modeling techniques, which can be broadly classified as equation-based (or empirical) models and data-driven (or machine learning) models (C. M. Hegde, 2016). The low accuracy of equation-based models led to the use of machine learning algorithms to model ROP, such as neural networks (Chamkalani et al., 2017), response surface modeling (Moraveji & Naderi, 2016) and random forests (Chiranth Hegde & Gray, 2017). The increase in accuracy is attributed to the non-parametric approach, in which the models are not constrained to a given functional form (for example, WOB raised to a power in Bingham’s model (Bingham, 1964)); the predictions depend on the input data and the selection of input parameters (or features).
Mechanical specific energy (MSE) is a measure of the amount of energy required to break a unit volume of rock. It was introduced in the mining industry as a quality measure for drilling (Teale, 1965). MSE was initially used in the oil and gas industry for bit efficiency evaluation and bit selection (Rabia, 1985; Rabia, Farrelly, & Barr, 1986). More recently, MSE was introduced as a real-time optimization metric for controlling drilling operational parameters (Dupriest & Koederitz, 2005). MSE has since been used in place of ROP as a real-time drilling optimization metric, since it represents the energy required to excavate rock (Dupriest, 2006; Guerrero & Kull, 2007).
Prediction of downhole torque during drilling can help mitigate drilling problems as well as aid in drilling optimization by improving the mechanical specific energy (MSE) of drilling. Surface torque is commonly measured on the rig and is typically easy to obtain. Unfortunately, it is still rare to obtain torque on bit (TOB) measurements close to the bit, which are more important since they play an integral role in the MSE calculations. Because torque is measured at the rig surface using a sensor, most torque and drag modeling is focused on predicting the loss of torque from the surface to the bit, that is, on estimating the torque on bit (TOB). Subsequent models have built upon the basic model introduced earlier (Johancsik, Friesen, & Dawson, 1984). Downhole torque and drag modeling provides one method to estimate the TOB without downhole measurements. However, predictions in real-time require an analytical model with a closed-form solution to cope with computational constraints (Gerbaud, Menand, & Sellami, 2006). Recently, pioneering work by Ertas (Ertas, Bailey, Wang, & Pastusek, 2014) has shown that TOB estimation is possible using surface torque by applying the transfer matrices technique. This estimate of TOB was then used for vibration control and calculation. Some authors (C. Hegde, Wallace, & Gray, 2015) have used statistical learning methods to predict downhole torque using surface drilling parameters. They argued that analytical models were inaccurate and that FEM-based models are not feasible in real-time. A simple solution was to use a pattern matching or machine learning algorithm to learn the torque response downhole.
Drilling optimization models (or objective functions) such as ROP, MSE or TOB can be used to determine ideal parameters for drilling a well (Tansev, 1975). These ideal parameters are computed by optimizing a given model (or objective function). The objective of this paper is to evaluate three such drilling models or objective functions and their effect on drilling a well, answering questions such as: what would happen if ROP is optimized for this well instead of MSE? Such questions have not been addressed in the past since drilling models are uncoupled: it is not possible to measure the change in ROP if weight-on-bit is changed to minimize MSE. Traditional practice is to build separate models for ROP, TOB, and MSE of a given well, independent of each other. Hence, there is no interaction between these models even though they depend on the same input parameters: weight-on-bit, rotary speed, flow-rate and rock strength.
Three different drilling optimization models (ROP, TOB, and MSE) are evaluated. The models are built using machine learning algorithms with the same input features (weight-on-bit, flow-rate, rotary speed, and rock strength), which results in coupled drilling optimization models. Hence, the effect of one model on another can be easily evaluated. Using these coupled models, a thorough evaluation can be conducted to determine the best drilling optimization model (or objective) to use while drilling a given well. Statistical tools such as hypothesis testing can be used to calculate confidence intervals for the interaction effects of each model.
A procedure to optimize engineering operations using different machine learning models is laid out. Fitting functions are first determined using machine learning algorithms to create ROP, TOB, and MSE models (or objective functions). Drilling control parameters (drilling parameters which can be changed to control the response) are used to fit the objective functions. Heuristic algorithms used to optimize the drilling model functions are briefly discussed. These algorithms are then used (on the fitted drilling models) to determine the best control parameters to implement ahead of the bit. Three objective functions (ROP, TOB, and MSE) are evaluated to determine the effect of choosing a given objective function for drilling optimization. Model evaluation is performed by running simulations on data measured while drilling a well in the Williston Basin, North Dakota. Results show that choosing MSE as an objective function can be more suitable and performs better than purely optimizing ROP.
This section discusses the development of each drilling optimization model (ROP, MSE, and TOB) using machine learning algorithms. In general, drilling parameters measured on the rig can be naively classified into control parameters, uncontrollable parameters, and response parameters. Control parameters are drilling operational parameters which can be controlled by the drilling engineer on the rig: WOB, RPM, and flow rate. Uncontrollable parameters are those which cannot be changed by engineers while drilling a well, such as the strength of the rock, geological properties, and maximum pump power. Response parameters (or in this case the objectives) are those which change when control parameters are changed: ROP, MSE, downhole vibrations, and TOB.
The aim of modeling is to represent the objective (or response) as a mathematical function of controllable and uncontrollable parameters. The response (or objective) can then be optimized by changing the controllable parameters. This section shows how the objective functions can be modeled in terms of controllable and uncontrollable parameters. Additional details on the algorithms and on fitting each model are given in the appendix.
ROP is modeled using a data-driven approach for the purposes of this paper, as shown by Hegde (Chiranth Hegde & Gray, 2017). The ROP model is built with the random forest algorithm using RPM, WOB, flow-rate, and UCS (calculated from correlations of sonic logs using offset well data) as input features. The training data are fit using the random forests algorithm; this yields a (fitted) data-driven model for ROP as a function of the input parameters: RPM, WOB, flow-rate, and UCS. The performance of the ROP model is plotted in Figure 1.
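The fitting step described above might look roughly like the following sketch. It assumes scikit-learn's RandomForestRegressor with 100 trees and 2 features per split (the values reported in the appendix) and a 50/50 train-test split as in the Figure 1 caption. The data, the column order, and the ROP relationship are synthetic placeholders invented for illustration; they are not the field data or the authors' code.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in for depth-based drilling data (NOT field data).
# Assumed column order: RPM, WOB (klbf), flow rate (gpm), UCS (kpsi).
X = np.column_stack([
    rng.uniform(60, 180, n),
    rng.uniform(10, 40, n),
    rng.uniform(400, 700, n),
    rng.uniform(5, 25, n),
])
# Hypothetical ROP response (ft/hr), used only so the example has a target.
rop = 0.02 * X[:, 0] * X[:, 1] / X[:, 3] + 0.01 * X[:, 2] + rng.normal(0, 0.5, n)

# Half of the data for training, half held out for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, rop, train_size=0.5, random_state=0
)

rop_model = RandomForestRegressor(n_estimators=100, max_features=2, random_state=0)
rop_model.fit(X_train, y_train)

# Mean absolute error normalized by the measured value on the held-out data.
pred = rop_model.predict(X_test)
normalized_error = np.mean(np.abs(pred - y_test) / np.abs(y_test))
print(f"mean normalized test error: {normalized_error:.3f}")
```

The same pattern would apply to a TOB model by swapping the target column while keeping the four input features unchanged.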
Figure 1: Evaluation of errors due to ROP, TOB and MSE model predictions. (top) Normalized errors of the MSE, TOB and ROP models. The random forest algorithm was used to train a model on each formation using half the data for training. The trained model was evaluated for prediction accuracy on the test data. The errors are well within 15% for ROP, TOB, and MSE, showing the accuracy of these models as compared to the field measurements. (bottom) The models are used to predict ROP (left), TOB (middle) and MSE (right), and compared to the measurements seen in the field. As seen from these figures, the models perform well, with a low error during prediction. These models are sufficiently accurate to be used thereafter in this paper for optimization analysis and simulations.
input parameters as described by (C. Hegde et al., 2015). Modeling torque as a function of control parameters (using a data-driven approach) allows easy coupling of the ROP and TOB models (Pavone, 1992). The TOB model is fitted as a function of WOB, RPM, flow-rate and UCS using the random forests algorithm. The accuracy of the model, shown in Figure 1, is around 90% on average. By fitting the TOB model with the same input parameters as the ROP model, it is possible to measure the change of ROP and TOB jointly (or in a coupled manner) when drilling control parameters are changed, resulting in coupled drilling models.
Alternatively, a full-scale physics-based model of torque on bit (TOB) (Stephane Menand et al., 2006) can be used to model the TOB. TOB can also be calculated using differential pressure (S Menand & Mills, 2017). A transfer matrix approach introduced by (Ertas, Bailey, Wang, & Pastusek, 2013) has shown great promise in modeling downhole torque. This method uses only surface parameters, further extending the value of this model. These deterministic calculations of TOB do not allow coupling with ROP. It is not the aim here to intricately model TOB to predict stick-slip or vibration events. TOB has been modeled since it is evaluated as an objective
MSE has been defined differently based on its intended application (Armenta, 2008; Aurele, 2017; Dupriest & Koederitz, 2005; Teale, 1965). For the purposes of the cost function in this analysis, the absolute value of MSE is not as important as the relative change in MSE (as long as the MSE calculation is kept consistent). MSE has been modeled according to Equation 1, following work presented by Teale (Teale, 1965). MSE is calculated using the ROP and TOB models, which themselves are data-driven models as described in the previous sections. Hence, MSE is indirectly a function of drilling control parameters. Errors in MSE calculations are summarized in Figure 1.
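$$MSE = \frac{WOB}{A_B} + \frac{120 \pi \cdot RPM \cdot TOB}{A_B \cdot ROP}$$ (Equation 1)
where A_B is the cross-sectional area of the bit.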
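As a worked illustration of Equation 1, the sketch below evaluates a Teale-style MSE in field units, assuming the commonly quoted form MSE = WOB/A + 120π·RPM·TOB/(A·ROP) with WOB in lbf, torque in ft-lbf, ROP in ft/hr and bit area in square inches. The function name, the 8.75 in bit diameter and the operating values are hypothetical and chosen only to show the arithmetic.

```python
import math

def teale_mse(wob_lbf, rpm, tob_ftlbf, rop_ft_per_hr, bit_diameter_in):
    """Teale-style MSE in psi from WOB (lbf), RPM, torque (ft-lbf) and ROP (ft/hr)."""
    area = math.pi * bit_diameter_in ** 2 / 4.0  # bit cross-sectional area, in^2
    thrust_term = wob_lbf / area
    rotary_term = 120.0 * math.pi * rpm * tob_ftlbf / (area * rop_ft_per_hr)
    return thrust_term + rotary_term

# Same WOB, RPM and torque; only ROP differs between the two cases.
base = teale_mse(25_000, 120, 4_000, 60, 8.75)
faster = teale_mse(25_000, 120, 4_000, 80, 8.75)
print(f"MSE at 60 ft/hr: {base:.0f} psi, at 80 ft/hr: {faster:.0f} psi")
```

With torque held fixed, a higher ROP lowers the rotary term, which is why the relative change in MSE depends on how ROP and TOB move together.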
The three drilling objectives modeled in this section using the random forest algorithm are coupled. Since the ROP and TOB models have the same input features, a change in WOB can be used to simulate a change in both ROP and TOB. Since the MSE model is a function of the ROP and TOB models themselves, it is also coupled.
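The coupling can be sketched as two random forest models fitted on one shared feature matrix, with MSE derived from their predictions. Everything below is an illustrative assumption rather than the authors' implementation: the synthetic data, the units (WOB in klbf, TOB in kft-lbf), the bit area constant, and the helper name coupled_response.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 300
# Shared inputs: WOB (klbf), RPM, flow rate (gpm), UCS (kpsi). Synthetic, not field data.
X = np.column_stack([
    rng.uniform(10, 40, n),
    rng.uniform(60, 180, n),
    rng.uniform(400, 700, n),
    rng.uniform(5, 25, n),
])
# Hypothetical responses: ROP in ft/hr, TOB in kft-lbf.
rop = 0.02 * X[:, 0] * X[:, 1] / X[:, 3] + 5.0 + rng.normal(0, 0.3, n)
tob = 0.15 * X[:, 0] + 0.05 * X[:, 3] + 1.0 + rng.normal(0, 0.1, n)

# Both models see exactly the same features, which is what couples them.
rop_model = RandomForestRegressor(n_estimators=100, max_features=2, random_state=0).fit(X, rop)
tob_model = RandomForestRegressor(n_estimators=100, max_features=2, random_state=0).fit(X, tob)

BIT_AREA_IN2 = 60.0  # assumed bit area, in^2

def coupled_response(wob, rpm, flow, ucs):
    """Evaluate ROP, TOB and the derived MSE at one shared operating point."""
    x = np.array([[wob, rpm, flow, ucs]])
    rop_hat = float(rop_model.predict(x)[0])
    tob_hat = float(tob_model.predict(x)[0])
    # Teale-style MSE in psi; factors of 1000 convert klbf and kft-lbf.
    mse_hat = (wob * 1000 / BIT_AREA_IN2
               + 120 * np.pi * rpm * tob_hat * 1000 / (BIT_AREA_IN2 * rop_hat))
    return rop_hat, tob_hat, mse_hat

low = coupled_response(15, 120, 550, 15)
high = coupled_response(35, 120, 550, 15)
print("WOB 15 klbf -> ROP, TOB, MSE:", low)
print("WOB 35 klbf -> ROP, TOB, MSE:", high)
```

Changing a single control (WOB here) moves all three responses at once, which is the interaction effect that separately built models cannot expose.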
RPM, changing these controllable parameters to their “ideal” setting would enable the operator to maximize ROP as shown in Figure 2 (or similarly TOB and MSE).
Figure 2: Contour plots of the WOB and RPM sample space for the random forest ROP model. The ideal control parameters that maximize ROP, as determined by the optimization algorithm, are plotted as a red star. The optimization algorithm searches the control parameter space for values of WOB and RPM which will maximize ROP, as shown in the figure.
These “ideal” settings can be calculated by using an optimization algorithm (Lummus, 1970). This involves selection of the optimal “setting” (maximum or minimum) from the entire solution space of control variables. The choice of the optimization algorithm depends on the function that is being optimized. Knowing the functional form (exponential, log, linear, affine) aids in the selection of the correct algorithm. Black-box optimization algorithms are used to find optimal control parameters for data-driven drilling models since the functional form of a data-driven model is unknown: prior knowledge of the function cannot be used to aid optimization. However, optimization of more complex functions, where competing parameters have to be optimized, is often not efficient with non-gradient based optimization algorithms (Boyd & Vandenberghe, 2010). Gradient-based algorithms, in turn, often get stuck in local minima. This can be avoided by making use of a metaheuristic algorithm, such as the particle swarm algorithm, which has been used to optimize all cost functions presented in this paper. The theory and pseudo code for the optimization of an objective function are discussed in the appendix.
well drilled in the Bakken shale. The data contain drilling parameters measured on the surface and downhole, averaged to a depth-based format (per 0.25 ft of drilled depth). Drilling the vertical portion of the well did not make use of a downhole motor or rotary steerable system (tools which are commonly used in directional drilling); however, downhole measurements were obtained. Depth-based averaged data are preferred for ROP modeling since they tend to be less noisy and trends are well preserved (Payette et al., 2015; Wallace, Hegde, & Gray, 2015). A simplified stratigraphic column is shown in Figure 3 (left). The entire interval of data used in this paper was drilled by Marathon Oil with a Smith 616 PDC bit. The ROP vs. depth plot for this dataset is shown in Figure 3 (right).
Figure 3: (left) Generalized stratigraphic column for the Williston Basin, North Dakota (Theloy, 2014); (right) ROP vs. depth plot over different formations in a vertical section of a well drilled in the Williston Basin, North Dakota. ROP, TOB, and vibration models are built on field data collected at the surface and downhole while drilling this well (Chiranth Hegde et al., 2018).
Figure 4 shows the basic methodology behind data-driven optimization in drilling. A portion of the well is drilled (without any modeling), and the data collected while drilling this interval are called the training set. These training data are used to build data-driven models for ROP, TOB, and MSE. An objective function is defined (ROP, TOB or MSE) and optimized using the PSO algorithm with operational constraints set by the on-site engineer. The optimized control parameters are used to drill ahead of the bit. After the completion of another joint/stand, newly acquired data are used to update the model. This process is repeated in a closed loop until the entire formation is drilled. The process is discussed in depth by Hegde (C. M. Hegde, 2018).
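The closed loop described above can be sketched as follows. This is a toy under stated assumptions, not the workflow used in the paper: the rig is replaced by a hypothetical response function (drill_stand), the operational bounds are invented, UCS is held constant, and a plain random search inside the bounds stands in for the PSO step to keep the example short.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
BOUNDS = np.array([[10.0, 40.0],     # WOB, klbf (assumed operational limits)
                   [60.0, 180.0],    # RPM
                   [400.0, 700.0]])  # flow rate, gpm
UCS = 15.0  # kpsi, held constant in this toy formation

def drill_stand(setpoint, n_points=40):
    """Hypothetical rig response: logged inputs and measured ROP for one stand."""
    controls = np.asarray(setpoint) * rng.normal(1.0, 0.03, (n_points, 3))  # setpoint jitter
    rop = 0.02 * controls[:, 0] * controls[:, 1] / UCS + 0.01 * controls[:, 2]
    X = np.column_stack([controls, np.full(n_points, UCS)])
    return X, rop + rng.normal(0.0, 0.3, n_points)

def best_setpoint(model, n_candidates=500):
    """Random search inside the bounds; a simple stand-in for the PSO step."""
    cand = rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], (n_candidates, 3))
    X = np.column_stack([cand, np.full(n_candidates, UCS)])
    return cand[np.argmax(model.predict(X))]

# Initial interval drilled without a model: varied settings form the training set.
X_hist = np.column_stack([rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], (60, 3)), np.full(60, UCS)])
y_hist = (0.02 * X_hist[:, 0] * X_hist[:, 1] / UCS + 0.01 * X_hist[:, 2]
          + rng.normal(0.0, 0.3, 60))

setpoints = []
for stand in range(3):
    model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_hist, y_hist)
    setpoint = best_setpoint(model)        # optimize the chosen objective (ROP here)
    X_new, y_new = drill_stand(setpoint)   # drill ahead with the optimized controls
    X_hist = np.vstack([X_hist, X_new])    # newly acquired data update the model
    y_hist = np.concatenate([y_hist, y_new])
    setpoints.append(setpoint)
    print(f"stand {stand + 1}: WOB={setpoint[0]:.1f}, RPM={setpoint[1]:.0f}, flow={setpoint[2]:.0f}")
```

Refitting on the accumulated history at every stand is the simplest update rule; other schemes, such as a sliding window of recent data, would fit the same loop.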
The machine learning models implemented in this paper were built in Python using the scikit-learn package (Pedregosa et al., 2011) in an Anaconda environment. This implementation can be easily scaled for deployment using cloud computing applications. The run-time for training the random forest models on this dataset is trivial (<100 milliseconds). However, for larger datasets, big data technologies can help with computational constraints (Zhang et al., 2015).
Figure 4: Flowchart describing the drilling optimization process. The first step is to acquire drilling data by drilling one or more stands into a formation. These data are not modeled and are treated as training data. Models are then built on this training data. An objective function (MSE for example) is defined and optimized using the PSO algorithm. The optimized drilling control parameters are implemented for drilling the next joint/stand. Data acquired from the next joint/stand can be used to update all the models, after which the cycle is repeated in a closed loop.
Simulations are conducted in this paper to evaluate and compare the effect of different objective functions. A drilling simulation is used to quantify expected results, paralleling a reservoir simulation; instead of production curves, ROP, TOB, and MSE curves are analyzed. A simulation entails using a drilling model to simulate the actual drilling environment and model ROP, TOB and MSE ahead of the bit (in a known or unknown environment, with the assumption that an accurate machine learning based model is able to generalize well).
objective function) depends on the operator’s discretion and project goals. Statistical significance tests for the optimization of each metric are addressed in the appendix.
The effect of using ROP (as an objective function) to determine ideal drilling control parameters is analyzed. ROP is modeled using the random forest algorithm on the training dataset (40% of formation data in this case). Ideal parameters are computed using the PSO algorithm on the trained model (for each formation). The changes in ROP, MSE, and TOB due to the implementation of these control parameters (calculated based on the ROP model to maximize ROP) are plotted in Figure 5.
The 95% confidence interval of the response has also been plotted. By definition, 95% of the time the range defined by this interval will contain the possible improvement in ROP. In each simulation performed for the confidence interval, the training and test sets are changed by randomly choosing the data points that act as the training and test sets. The train-test split (the fraction of data points assigned to the training set) is kept fixed at 0.4: at each iteration, 40% of the data points in a formation are chosen to be part of the training set, and the rest form the test set. 1000 Monte Carlo based simulations are evaluated to calculate the 95% confidence interval of each drilling response parameter. ROP change for each individual formation is plotted in Figure 6. The figures show that using ROP as the objective function has the potential to improve ROP by 28% on average. However, along with the increase in ROP, there is a corresponding increase in MSE (4%) and TOB (10%), which may not be desirable.
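The resampling procedure can be sketched as repeated random 40/60 splits followed by a percentile interval on the simulated gain. The sketch below is only an approximation of that idea under several assumptions: the data are synthetic, a coarse candidate grid replaces the PSO search, the forest is kept small, and 100 repetitions are used instead of 1000 to keep the run short.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 200
# Synthetic formation data (NOT field data): WOB (klbf), RPM, UCS (kpsi).
X = np.column_stack([rng.uniform(10, 40, n), rng.uniform(60, 180, n), rng.uniform(10, 20, n)])
rop = 0.02 * X[:, 0] * X[:, 1] / X[:, 2] + 5.0 + rng.normal(0, 0.3, n)

# Fixed candidate control settings (WOB, RPM) searched in every simulation.
wob_grid, rpm_grid = np.meshgrid(np.linspace(10, 40, 7), np.linspace(60, 180, 7))
candidates = np.column_stack([wob_grid.ravel(), rpm_grid.ravel()])

def simulate_once(train_fraction=0.4):
    """One random split: fit on the training part, estimate % ROP gain on the test part."""
    idx = rng.permutation(n)
    n_train = int(train_fraction * n)
    train, test = idx[:n_train], idx[n_train:]
    model = RandomForestRegressor(n_estimators=20, max_features=2, random_state=0)
    model.fit(X[train], rop[train])
    # Pick the candidate with the highest predicted ROP at the median test UCS.
    ucs_med = np.median(X[test, 2])
    grid = np.column_stack([candidates, np.full(len(candidates), ucs_med)])
    best = candidates[np.argmax(model.predict(grid))]
    # Re-evaluate every test row with its controls replaced by the chosen setpoint.
    X_opt = X[test].copy()
    X_opt[:, :2] = best
    baseline = model.predict(X[test]).mean()
    optimized = model.predict(X_opt).mean()
    return 100.0 * (optimized - baseline) / baseline

gains = np.array([simulate_once() for _ in range(100)])
ci_low, ci_high = np.percentile(gains, [2.5, 97.5])
print(f"mean gain {gains.mean():.1f}%, 95% interval [{ci_low:.1f}%, {ci_high:.1f}%]")
```

The same loop could record simulated TOB and MSE changes alongside ROP to obtain an interval for each response.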
Figure 5: Effect on ROP, MSE, and TOB on the test set when the formation is drilled with the optimal control parameters calculated by the PSO algorithm using ROP as an objective function. The changes in the drilling parameters are simulated by observing the changes on a machine learning model built for each drilling parameter. The figure shows that if ROP alone is optimized, an increase in ROP is accompanied by an increase in TOB and MSE, which might be undesirable. The shaded regions around the dotted lines (for ROP, TOB, and MSE) represent the 95% confidence interval for each prediction.
Figure 6: Effect of ROP optimization on MSE in each formation. The changes in MSE are simulated by observing the changes on a machine learning model built for ROP and TOB. The figure shows that if ROP alone is optimized, an increase in ROP is accompanied by an increase in MSE, which might be undesirable.
Reducing the TOB response of the drill bit will reduce the MSE since MSE is directly proportional to TOB. Ideal parameters are computed using the PSO algorithm on the trained model for each formation. The effect of the optimal control parameters and their 95% confidence interval are plotted in Figure 7. Changes in TOB in each formation are plotted in Figure 8. The figures show that using TOB as the objective function only slightly reduces TOB (12%), whereas an increase in MSE (13%) and ROP (9%) is observed.
Figure 7: Effect on ROP, MSE, and TOB on the test set when the formation is drilled with the optimal control parameters as calculated by the PSO algorithm using TOB as an objective function. The changes in the drilling parameters are simulated by observing the changes on a machine learning model built for each drilling parameter. The figure shows that if TOB alone is optimized, a small decrease in TOB and MSE is accompanied by a reduced increase in ROP. The shaded regions around the dotted lines (for ROP, TOB, and MSE) represent the 95% confidence interval for each prediction.
Figure 8: Effect of TOB optimization on MSE in each formation. The changes in MSE are simulated by observing the changes on a machine learning model built for ROP and TOB. The figure shows that if TOB alone is optimized, a smaller increase in ROP is accompanied by a slight increase in MSE, as compared to purely optimizing ROP.
each built using a train-test ratio of 40/60. Ideal parameters are computed using the PSO algorithm. However, the convergence of the algorithm in this case takes substantially longer (~10x) since there are competing responses. ROP and TOB themselves are functions of WOB, RPM, flow-rate, and UCS. Parameters (for minimization of MSE) must be chosen such that ROP can be increased with a corresponding decrease (or minimal increase) in TOB, so that the overall MSE can be minimized. The changes in ROP, MSE, and TOB resulting from these control parameters are plotted in Figure 9 along with their 95% confidence intervals. A smaller increase in ROP is observed as compared to Figure 5. TOB and MSE are observed to be lower, which is desirable. A decrease in MSE shows that less energy is used by the bit to destroy a given volume of rock, implying that rock breakage during drilling is more efficient. A reduction in TOB leads to a reduced power consumption of the motor and may lead
Figure 9: Effect on ROP, MSE, and TOB on the test set when the formation is drilled with the optimal control parameters as calculated by the PSO algorithm using MSE as an objective function. The changes in the drilling parameters are simulated by observing the changes on a machine learning model built for each drilling parameter. The figure shows that if MSE is minimized by controlling RPM, WOB and flow-rate, an increase in ROP is accompanied by a decrease in TOB and a decrease in MSE, which is highly desirable. The shaded regions around the dotted lines (for ROP, TOB, and MSE) represent the 95% confidence interval for each prediction.
Figure 10: Effect of MSE optimization on the MSE response in each formation. The changes in MSE are simulated by observing the changes on a machine learning model built for ROP and TOB. The reduction in MSE in the simulated test set shows that optimal parameters can be calculated which will increase ROP and decrease MSE at the same time.
6. Conclusions
Drilling optimization is still actively being researched. However, comparison of different drilling optimization models for drilling the same well is relatively unexplored in the literature. This paper evaluated three objective functions or models for drilling optimization. This was performed by changing the objective function and evaluating its effect on key performance indicators (KPI) observed while drilling the well. The reason such analysis has not been addressed is that conventional or empirical drilling models are uncoupled. Separate models are built individually for ROP, TOB, and MSE of a given well, independent of each other. Hence, there is no interaction between these models even though they depend on the same input parameters.
This is solved by using machine learning algorithms to model different response functions. Three different drilling models (ROP, TOB, and MSE) are built using the random forest algorithm with the same input features (weight-on-bit, flow-rate, rotary speed, and rock strength), which couples them. Hence, the effect of one model on another can be easily evaluated. Using these coupled models, a thorough evaluation is conducted to determine the best drilling optimization model (or objective) for a given well.
Based on the analysis conducted in this paper, the optimization of the ROP model led to an increase in MSE and TOB. Using ROP as an objective function leads to an average increase in ROP of 28%, accompanied by average increases of 4% in MSE and 10% in torque. While the improvement in ROP can save time, the increase in MSE and TOB may lead to additional costs due to non-optimal use of bit energy, excessive vibrations, and drilling dysfunction, which can offset the time saved due to improvements in ROP. The optimization of TOB did not yield favorable results. When TOB is used as an objective function, a small reduction in TOB is accompanied by a small improvement in ROP and an increase in MSE (which is undesirable). However, in cases with high axial, lateral or torsional vibrations, this might be more desirable since reducing vibrations can help avoid drilling dysfunction and tool failure. An average reduction of TOB by 12%, ROP increase by 9% and MSE increase by 13% was observed. Optimizing based on the MSE objective function led to an average decrease of 15% in MSE, an increase in ROP of 20% and a reduction in torque of 7%. By far, using MSE as an objective function gives the most balanced improvement for drilling: an increase in ROP, a reduction in torque, and a reduction in MSE concurrently. While the increase in ROP is reduced when compared to purely optimizing ROP, there is an improvement in MSE and TOB which can increase the longevity of the bit. This paper has shown how machine learning algorithms can be used to model different objective functions using coupled models to determine the best objective function (to be optimized) for a given well.
Acknowledgements
The authors wish to thank the Wider Windows Industrial Affiliate Program, the University of Texas at Austin, for financial and logistical support of this work. Project support and technical discussions with industrial colleagues from Wider Windows sponsors BHP Billiton, British Petroleum, Chevron, ConocoPhillips, Halliburton, Marathon, National Oilwell Varco, Occidental Oil and Gas and Shell are gratefully acknowledged.
Acronyms
• KPI: Key performance indicator
• MSE: Mechanical specific energy
• PDC: Polycrystalline diamond compact
• PSO: Particle swarm optimization
• ROP: Rate of penetration
• RPM: Rotations per minute
• TOB: Torque on bit
• UCS: Unconfined compressive strength
• WOB: Weight-on-bit
compressive strength of rock (UCS) using the random forests algorithm. The random forest algorithm is a machine learning algorithm that can be used to model ROP as a function of drilling input parameters. It builds numerous decision trees and averages them to decrease the variance, producing a reliable estimate of ROP using RPM, WOB, flow-rate and UCS as inputs. In addition to averaging trees, a random subset of the features is chosen at each split in the tree-building process. The number of features chosen is a hyperparameter of the random forest algorithm and was set to 2 in this model (chosen using cross-validation). This randomized selection of features decorrelates the trees, which helps improve the accuracy of the ensemble. Another important hyperparameter is the number of trees used in the random forest model. The dataset was bootstrapped 100 times to create sub-samples, and these sub-samples were used to construct 100 trees (the number of trees was limited to 100 since using additional trees did not improve accuracy). The ROP model was built separately on each of the 14 different formations. This model can be used to predict ROP given the surface-based drilling parameters. TOB can be modeled using the same process (as shown for ROP) as a function of WOB, RPM, flow-rate and UCS. Parametric studies have shown that, for the modeling of ROP and TOB, the random forest model generalizes best and outperforms other machine learning algorithms (C. M. Hegde, 2018). The MSE model is calculated from the ROP and TOB models according to Equation 1.
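The modeling steps described above can be sketched in Python with scikit-learn (Pedregosa et al., 2011). This is a minimal illustration, not the authors' implementation: the data are synthetic stand-ins for a single formation, the helper names (`fit_forest`, `mse_psi`) and the 8.75 in. bit diameter are hypothetical, and the MSE expression is the widely used Teale (1965) form in field units, which is assumed here rather than confirmed to match Equation 1 term for term.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data (NOT field data). Columns: WOB, RPM, flow rate, UCS.
X = np.column_stack([
    rng.uniform(10, 40, n),    # WOB, klbf
    rng.uniform(60, 180, n),   # RPM
    rng.uniform(400, 700, n),  # flow rate, gpm
    rng.uniform(5, 30, n),     # UCS, ksi
])
# Made-up responses, used only so that the example runs end to end.
rop = 50 + 2.0 * X[:, 0] + 0.3 * X[:, 1] - 1.5 * X[:, 3] + rng.normal(0, 3, n)
tob = 150.0 * X[:, 0] + 5.0 * X[:, 3] + rng.normal(0, 100, n)


def fit_forest(X, y):
    """Fit a forest with the settings quoted in the text:
    100 bootstrapped trees and 2 candidate features per split."""
    model = RandomForestRegressor(
        n_estimators=100, max_features=2, bootstrap=True, random_state=0
    )
    return model.fit(X, y)


# In the paper one model is built per formation; a single formation is shown here.
rop_model = fit_forest(X, rop)
tob_model = fit_forest(X, tob)


def mse_psi(wob_lbf, rpm, torque_ftlbf, rop_fthr, bit_diameter_in):
    """Teale-type MSE in psi (assumed form, field units)."""
    area = np.pi * bit_diameter_in ** 2 / 4.0  # bit area, in^2
    return wob_lbf / area + 120.0 * np.pi * rpm * torque_ftlbf / (area * rop_fthr)


# Coupled prediction: ROP and TOB come from the forests, MSE from both.
x_new = np.array([[25.0, 120.0, 550.0, 15.0]])
rop_hat = rop_model.predict(x_new)[0]
tob_hat = tob_model.predict(x_new)[0]
mse_hat = mse_psi(25.0e3, 120.0, tob_hat, rop_hat, bit_diameter_in=8.75)
print(f"ROP={rop_hat:.1f} ft/hr, TOB={tob_hat:.0f} ft-lbf, MSE={mse_hat:.0f} psi")
```

Because MSE is computed from the ROP and TOB predictions, any change in WOB, RPM or flow-rate propagates through both forests before reaching MSE, which is the coupling the paper relies on.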
B.1 Theory
This algorithm is a stochastic optimization technique modeled after the swarming or flocking of animals (Kennedy, 2011). It is similar to the simplex algorithm; however, more than three samples are used, and each sample is called a particle. PSO operates well in a multi-dimensional setting where, at each iteration, the solution is tweaked toward the current best solution.
Each particle in the PSO has a location and a velocity. The velocity determines the direction and speed of travel at the next time step. Put another way, if x(t-1) and x(t) are the locations in space of the particle at times t-1 and t, then at time t: v = x(t) - x(t-1). Each particle starts off with a random location and a random velocity.
At each time step, the velocity vector of each particle is updated based on the global optimum discovered up to that time step. The velocity vector is modified so that it points, with a specified magnitude, toward the global optimum. The particle is then moved according to the velocity vector plus some noise.
P = {}
For swarmsize times do
    P = P U {new random particle x with random initial velocity}
Best = none
Repeat
    For each particle Pi in P:
        evaluate ROP(Pi)
        if Best is none or ROP(Pi) > ROP(Best):
            Best = Pi
    For each particle Pi in P:
        Xg = best particle location globally
        Xl = best local particle location
        Xp = best personal particle location
        For each dimension do
            b = rand(0, beta)
            c = rand(0, gamma)
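As a complement to the pseudocode, the following is a minimal runnable sketch of a PSO maximizer in Python. It is a simplified global-best variant written for illustration only: it uses a personal-best and a global-best attraction term (no local-neighborhood term), and the inertia weight, the coefficient limits `beta` and `delta`, the clipping to bounds, and the toy objective `toy_rop` are all assumptions rather than the settings used in the paper. In the paper's workflow the objective would be the trained ROP (or TOB, MSE) model.

```python
import numpy as np


def pso_maximize(objective, lower, upper, swarm_size=30, n_iter=200,
                 inertia=0.7, beta=1.5, delta=1.5, seed=0):
    """Maximize `objective` inside the box [lower, upper] with a basic PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    span = upper - lower

    # Random initial locations and (small) random initial velocities.
    x = rng.uniform(lower, upper, size=(swarm_size, dim))
    v = 0.1 * rng.uniform(-span, span, size=(swarm_size, dim))

    # Personal bests and the global best found so far.
    p_best = x.copy()
    p_val = np.array([objective(p) for p in x])
    g_idx = int(np.argmax(p_val))
    g_best, g_val = p_best[g_idx].copy(), p_val[g_idx]

    for _ in range(n_iter):
        # Random pull strengths per particle and per dimension.
        b = rng.uniform(0.0, beta, size=(swarm_size, dim))
        d = rng.uniform(0.0, delta, size=(swarm_size, dim))
        v = inertia * v + b * (p_best - x) + d * (g_best - x)
        x = np.clip(x + v, lower, upper)  # keep parameters in the feasible range

        val = np.array([objective(p) for p in x])
        improved = val > p_val
        p_best[improved] = x[improved]
        p_val[improved] = val[improved]

        g_idx = int(np.argmax(p_val))
        if p_val[g_idx] > g_val:
            g_best, g_val = p_best[g_idx].copy(), p_val[g_idx]
    return g_best, g_val


def toy_rop(params):
    """Hypothetical smooth ROP surface peaking at WOB=30, RPM=120."""
    wob, rpm = params
    return 100.0 - (wob - 30.0) ** 2 - 0.01 * (rpm - 120.0) ** 2


best_params, best_rop = pso_maximize(toy_rop, lower=[10.0, 60.0], upper=[50.0, 200.0])
print(best_params, best_rop)
```

Clipping to the bounds is one simple way to keep WOB and RPM inside operational limits; other constraint-handling schemes are possible.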
Hypothesis testing is commonly used in scientific studies to test a stated hypothesis. It consists of evaluating a null hypothesis and rejecting it if sufficient evidence for the alternative hypothesis exists. It is widely used in the fields of statistics, science,
The confidence interval, commonly chosen at a 95% level, represents the range expected to contain the true value of the population mean. For example, if the confidence interval for MSE in the Lodgepole Limestone formation is 5000-15000 psi, intervals constructed in this way will contain the population mean of MSE for Lodgepole Limestone 95% of the time. The confidence interval in a way represents the lower and upper limits of allowable statistical variation. Hence, if the confidence intervals of two different distributions intersect, the distributions cannot be claimed to be statistically different.
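The interval-overlap rule above can be sketched as follows. This is an illustrative example on synthetic MSE samples, assuming a Student-t interval for the mean; the helper names `mean_confidence_interval` and `intervals_overlap` are hypothetical and the numbers are not taken from the paper.

```python
import numpy as np
from scipy import stats


def mean_confidence_interval(sample, confidence=0.95):
    """Analytic (Student-t) confidence interval for the mean of a sample."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    mean = sample.mean()
    sem = sample.std(ddof=1) / np.sqrt(n)  # standard error of the mean
    t_crit = stats.t.ppf(0.5 + confidence / 2.0, df=n - 1)
    return mean - t_crit * sem, mean + t_crit * sem


def intervals_overlap(a, b):
    """True if the closed intervals a=(lo, hi) and b=(lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


rng = np.random.default_rng(1)
mse_a = rng.normal(10000.0, 2500.0, 200)  # synthetic MSE sample, psi
mse_b = rng.normal(14000.0, 2500.0, 200)  # synthetic MSE sample, psi

ci_a = mean_confidence_interval(mse_a)
ci_b = mean_confidence_interval(mse_b)
print(ci_a, ci_b, "overlap:", intervals_overlap(ci_a, ci_b))
```

Non-overlapping intervals are a conservative indication of a difference; the test on the difference of the means used later in this appendix is the more direct check.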
For the study in this paper, the objective of the hypothesis tests is to determine whether the optimization results in a significant change in ROP, MSE, and TOB: a difference large enough that it could not plausibly have occurred by chance. An example is illustrated in Figure C-1, where ROP optimization has been studied in the Piper Limestone formation. The distribution in red is the ROP distribution before optimization, i.e. that observed while drilling the formation. The green distribution represents the ROP after optimization: the new ROP once optimal parameters are used for drilling the same formation. The null hypothesis being tested is that both distributions are the same, i.e. there is no difference between the distributions shown in red and green. This would imply that the formation is being drilled as efficiently as possible. The alternative hypothesis states that ROP increases after optimization. In other words, there exists a set of input parameters which can result in an improved ROP while drilling this formation. Based on the p-value of 5.43E-6 from a two-sample t-test (Casella & Berger, 2002) conducted on the data, the null hypothesis is rejected in favor of the alternative hypothesis for the analysis pertaining to Figure C-1. The optimization algorithm's solution helps improve the ROP. Such an analysis can also be useful in determining the efficiency of the optimization algorithm. This principle can be easily extended to MSE and TOB, to check whether the improvements shown by the simulator after optimization are significant compared to the data observed before.
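A test of this kind can be sketched with SciPy as shown below. The samples are synthetic, not the Piper Limestone data, and two choices are assumptions made for the example: the Welch form of the test (`equal_var=False`) and the 0.05 significance level. The one-sided `alternative="greater"` option mirrors the alternative hypothesis that ROP increases after optimization and requires SciPy 1.6 or later.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
rop_before = rng.normal(60.0, 15.0, 300)  # synthetic measured ROP, ft/hr
rop_after = rng.normal(85.0, 15.0, 300)   # synthetic optimized ROP, ft/hr

# H0: the means are equal. H1: mean ROP after optimization is greater.
t_stat, p_value = stats.ttest_ind(
    rop_after, rop_before, equal_var=False, alternative="greater"
)

alpha = 0.05  # assumed significance level
reject_null = p_value < alpha
print(f"t={t_stat:.2f}, p={p_value:.3g}, reject H0: {reject_null}")
```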
Figure C-1: Hypothesis test on ROP simulation. The distribution in red represents the ROP values before ROP optimization. The distribution in green shows the ROP values after optimization. Hypothesis testing is used to determine the p-value from the difference between the two distributions, based on their means (represented as dotted lines in the figure) and standard deviations. When analyzed using a two-sample t-test, this case results in a p-value of 5E-6, showing that
This analysis assumes that the data points are independent, that the distributions are normal (Gaussian), and that the ROP predictions made using the ROP model are accurate. The ROP predictions are considered accurate because the random forest model was tested and showed a low error, which indicates that it predicts ROP accurately and should generalize well for a given formation. The data points are independent; however, not all distributions of ROP are approximately normal. Distributions can deviate from normality, as shown in Figure C-2 for the Charles Sandstone formation. Violation of the normality assumption can lead to incorrect conclusions due to overinflated p-values and falsely narrow confidence intervals. An alternative is to use a non-parametric method such as the bootstrap (Efron, 1982; C. M. Hegde, Wallace, & Gray, 2015) to calculate confidence intervals for each distribution. If the confidence interval of the difference between the means of the two distributions does not include 0, the distributions are said to be statistically different: the null hypothesis is rejected (Casella & Berger, 2002). In the case of Figure C-2, the confidence interval of the difference of the means is 21.63 to 29.29 ft/hr.
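A bootstrap interval for the difference of the means can be sketched as below. This is an illustration under stated assumptions: the percentile method is used (one of several bootstrap interval constructions), the number of resamples is arbitrary, the helper name `bootstrap_diff_ci` is hypothetical, and the skewed samples are synthetic rather than the Charles Sandstone data.

```python
import numpy as np


def bootstrap_diff_ci(before, after, n_boot=5000, confidence=0.95, seed=0):
    """Percentile-bootstrap CI for mean(after) - mean(before)."""
    rng = np.random.default_rng(seed)
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        b = rng.choice(before, size=before.size, replace=True)
        a = rng.choice(after, size=after.size, replace=True)
        diffs[i] = a.mean() - b.mean()
    tail = (1.0 - confidence) / 2.0
    lo, hi = np.quantile(diffs, [tail, 1.0 - tail])
    return lo, hi


rng = np.random.default_rng(3)
# Skewed (log-normal) synthetic ROP samples, deliberately non-Gaussian.
rop_before = rng.lognormal(mean=4.0, sigma=0.4, size=300)
rop_after = rng.lognormal(mean=4.0, sigma=0.4, size=300) + 25.0

ci_lo, ci_hi = bootstrap_diff_ci(rop_before, rop_after)
significant = not (ci_lo <= 0.0 <= ci_hi)  # reject H0 if 0 is outside the CI
print(f"95% CI for difference in means: {ci_lo:.2f} to {ci_hi:.2f} ft/hr")
```

No normality assumption enters the calculation, which is why the bootstrap interval can differ from the analytic one when the distributions are skewed.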
Figure C-2: Hypothesis test on ROP simulation. The distribution in red represents the ROP values before ROP optimization and the distribution in green the values after. The means are plotted as dotted lines. In this case, the normality assumption breaks down for both distributions, which affects the p-values and conclusions of hypothesis testing.
A hypothesis test was conducted to determine the significance of the ROP improvement for each formation. The distributions of the original and improved ROP, along with their means, have been plotted in Figure C-3. Both analytic confidence intervals and bootstrap confidence intervals are shown in Table C-1. Based on the results in Table C-1, it can be concluded that ROP optimization significantly improves the ROP in all formations. The ROP improvements are not likely to have been caused by chance. Table C-2 summarizes similar statistics for TOB, which show that all TOB improvements are significant. Table C-3 summarizes similar statistics for MSE. The MSE reduction achieved by optimization is significant for all but one formation. The confidence interval for Kibbey Limestone contains 0, which means that a statistically significant difference in the means is not observed. Both analytical and non-parametric intervals have been tabulated in Tables C-1, C-2, and C-3. The differences between their values reflect the non-normality of the probability distributions.
Figure C-3: ROP optimization distribution for all formations. The distribution plotted in red is the measured ROP. The distribution plotted in green is the optimized ROP. The means are plotted as dotted lines.
Table C-1: Confidence Intervals for difference in means for ROP Optimization
Table C-2: Confidence Intervals for difference in means for TOB Optimization
Spearfish Sandstone          -1835.13   -1369.51   -1814.27   -1352.63
Tyler Sandstone              -903.065   -713.406   -898.202   -707.66
Kibbey Lime Limestone        -1204.53   -423.405   -1187.92   -425.699
Kibbey Lime Shale            -384.939   -95.6488   -370.342   -77.3619
Charles Sandstone            -1816.55   -1332.19   -1808.06   -1321.17
Charles Limestone            -1349.05   -966.215   -1334.8    -951.774
Ratcliffe Sandstone          -1077.36   -451.148   -1067.67   -452.202
Base Last Salt Limestone     -2266.38   -1525.56   -2263.11   -1531.31
Base Last Salt Sandstone     -344.398   -49.3379   -324.733   -29.2358
Mission Canyon Limestone     -495.743   -365.405   -493.632   -362.429
Limestone
Lodgepole Limestone          -2869.99   -1593.38   -2837.36   -1551.74
References
Armenta, M. (2008). SPE 116667 Identifying Inefficient Drilling Conditions Using Drilling-Specific Energy, (September), 21–24. https://doi.org/10.2118/116667-MS
Aurele, M. (2017). New Formulation of Mechanical Specific Energy (MSE) Taking into Account the Hydraulic Effects for PDC Bits. The University of Texas at Austin.
Bingham, M. G. (1964). A new approach to interpreting Rock Drillability. The Oil and Gas Journal.
Boyd, S., & Vandenberghe, L. (2010). Convex Optimization. Optimization Methods and Software (Vol. 25). https://doi.org/10.1080/10556781003625177
Casella, G., & Berger, R. L. (2002). Statistical inference (Vol. 2). Duxbury Pacific Grove, CA.
Chamkalani, A., Zendehboudi, S., Amani, M., Chamkalani, R., James, L., & Dusseault, M. (2017). Pattern recognition insight into drilling optimization of shaly formations. Journal of Petroleum Science and Engineering, 156, 322–339.
Dupriest, F. E., & Koederitz, W. L. (2005). Maximizing Drill Rates with Real-Time Surveillance of Mechanical Specific Energy. SPE/IADC Drilling Conference, (SPE/IADC 92194), 1–10. https://doi.org/10.2118/92194-MS
Efron, B. (1982). The jackknife, the bootstrap and other resampling plans. SIAM.
Ertas, D., Bailey, J. R., Wang, L., & Pastusek, P. E. (2013). Drillstring Mechanics Model for Surveillance, Root Cause Analysis, and Mitigation of Torsional and Axial Vibrations. In SPE/IADC Drilling Conference. Society of Petroleum Engineers. https://doi.org/10.2118/163420-MS
Ertas, D., Bailey, J., Wang, L., & Pastusek, P. E. (2014). Drillstring Mechanics Model for Surveillance, Root Cause Analysis, and Mitigation of Torsional Vibrations. Society of Petroleum Engineers. https://doi.org/10.2118/163420-PA
Gerbaud, L., Menand, S., & Sellami, H. (2006). PDC bits: all comes from the cutter rock interaction. In IADC/SPE Drilling Conference (p. 1).
Guerrero, C. A., & Kull, B. J. (2007). Deployment of an SeROP Predictor Tool for Real-Time Bit Optimization. SPE/IADC Drilling Conference, (SPE 105201), 1–14. https://doi.org/10.2118/105201-MS
Hegde, C., Daigle, H., & Gray, K. (2018). Performance comparison of algorithms for real-time rate of penetration optimization in drilling using data-driven models. SPE Journal.
Hegde, C., Daigle, H., Millwater, H., & Gray, K. (2017). Analysis of rate of penetration (ROP) prediction in drilling using physics-based and data-driven models. Journal of Petroleum Science and Engineering, 159, 295–306. https://doi.org/10.1016/j.petrol.2017.09.020
Hegde, C., & Gray, K. E. (2017). Use of machine learning and data analytics to increase drilling efficiency for nearby wells. Journal of Natural Gas Science and Engineering, 40, 327–335. https://doi.org/10.1016/j.jngse.2017.02.019
Hegde, C. M. (2016). Application of statistical learning techniques for rate of penetration (ROP) prediction in drilling. The University of Texas at Austin.
Hegde, C. M. (2018). End-to-end drilling optimization using machine learning. The University of Texas at Austin.
Hegde, C. M., Wallace, S. P., & Gray, K. E. (2015). Use of Regression and Bootstrapping in Drilling Inference and Prediction. In SPE Middle East Intelligent Oil and Gas Conference and Exhibition. Society of Petroleum Engineers.
Hegde, C., Wallace, S., & Gray, K. (2015). Real time prediction and classification of torque and drag during drilling using statistical learning methods. SPE Eastern Regional Meeting, 2015–Janua. https://doi.org/10.2118/177313-MS
Johancsik, C. A., Friesen, D. B., & Dawson, R. (1984). Torque and drag in directional wells-
Lummus, J. L. (1970). Drilling optimization. Journal of Petroleum Technology, 22(11), 1–379.
Menand, S., & Mills, K. (2017). Use of Mechanical Specific Energy Calculation in Real-Time to Better Detect Vibrations and Bit Wear While Drilling. Paper AADE-17-NTCE-0332017 Proceedings of the 2017 AADE National Technical Conference and Exhibition Held at the Hilton Houston North Hotel, Houston, Texas, April 11-12.
Menand, S., Sellami, H., Tijani, M., Stab, O., Dupuis, D. C., & Simon, C. (2006). Advancements in 3D drillstring mechanics: from the bit to the topdrive. In IADC/SPE drilling conference. Society of Petroleum Engineers.
Moraveji, M. K., & Naderi, M. (2016). Drilling rate of penetration prediction and optimization using response surface methodology and bat algorithm. Journal of Natural Gas Science and Engineering, 31, 829–841.
Payette, G. S., Pais, D., Spivey, B., Wang, L., Bailey, J. R., Pastusek, P., & Owens, M. (2015). Mitigating Drilling Dysfunction Using a Drilling Advisory System: Results from Recent Field Applications. International Petroleum Technology Conference. https://doi.org/10.2523/IPTC-18333-MS
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., … Dubourg, V. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research,
Tansev, E. (1975). A Heuristic Approach to Drilling Optimization. Spe.
Teale, R. (1965). The Concept of Specific Energy in Rock Drilling. International Journal of Rock Mechanics and Mining Science, 2(1), 57–73. https://doi.org/10.1016/0148-9062(65)90022-7
Theloy, C. (2014). Integration of geological and technological factors influencing production in the Bakken play, Williston Basin. Colorado School of Mines.
Wallace, S. P., Hegde, C. M., & Gray, K. E. (2015). A System for Real-Time Drilling Performance Optimization and Automation Based on Statistical Learning Methods. In SPE Middle East Intelligent Oil and Gas Conference and Exhibition. https://doi.org/10.2118/176804-MS
Zhang, Z., Barbary, K., Nothaft, F. A., Sparks, E., Zahn, O., Franklin, M. J., … Perlmutter, S. (2015). Scientific computing meets big data technology: An astronomy use case. In Big Data (Big Data), 2015 IEEE International Conference on (pp. 918–927). IEEE.