URTeC: 2896522

Rate of penetration (ROP) modeling using hybrid models: deterministic and machine learning
Chiranth Hegde*, Cesar Soares, and Ken Gray;
The University of Texas at Austin
Copyright 2018, Unconventional Resources Technology Conference (URTeC) DOI 10.15530/urtec-2018-2896522

This paper was prepared for presentation at the Unconventional Resources Technology Conference held in Houston, Texas, USA, 23-25 July 2018.

The URTeC Technical Program Committee accepted this presentation on the basis of information contained in an abstract submitted by the author(s). The contents of this paper
have not been reviewed by URTeC and URTeC does not warrant the accuracy, reliability, or timeliness of any information herein. All information is the responsibility of, and, is
subject to corrections by the author(s). Any person or entity that relies on any information obtained from this paper does so at their own risk. The information herein does not
necessarily reflect any position of URTeC. Any reproduction, distribution, or storage of any part of this paper by anyone other than the author without the written consent of URTeC
is prohibited.

Abstract

This study explores three distinct approaches to ROP modeling: deterministic, data-driven, and hybrid models.
Deterministic or physics-based models rely on a fixed equation derived from drilling physical principles and have
been the traditional workhorse of the industry. Newer and more powerful data-driven models utilize machine
learning and predictive analytics to enhance ROP prediction and optimization. However, the improved predictive
accuracy achieved with statistical techniques comes at the expense of model interpretability. In order to overcome
this disadvantage, a novel hybrid modeling technique is introduced.

A novel way to formulate hybrid models is discussed by presenting two broad strategies: ensembles of a single
deterministic model (hybrid-One) and ensembles of several deterministic models (hybrid-N). They provide a means
to encode the physics of drilling formulated in deterministic models into machine learning algorithms. Hybrid
models also enable inference on ROP models which provide valuable insight.

Both classes of hybrid models predict ROP with a greater accuracy than physics-based models alone; purely data-
driven models perform marginally better in most cases. On the other hand, hybrid models offer higher
interpretability, as they are built from deterministic models. Inference using hybrid models has been discussed with
a case study for Mission Canyon Limestone.

Hence, hybrid models are employed for ROP prediction and optimization by computing ideal drilling operating
parameters – weight-on-bit, RPM, and flowrate – for each rock formation in the vertical section of a Bakken shale
horizontal well. The study demonstrates the use of hybrid models for higher accuracy (than deterministic models)
and higher interpretability (than machine learning models) – providing an optimal tradeoff.

Keywords: Drilling, Machine Learning, Inference, ROP, Analytics

1. Introduction

Rate of penetration (ROP) has been a focal point in drilling optimization for decades: the rate at which a well is
drilled is a key indicator of drilling efficiency. Higher ROP implies faster drilling which in turn implies better rig
performance and higher rig productivity. Given this inherent interest in ROP modeling, deterministic models have
been developed over the past 50 years for ROP prediction based on laboratory experiments (Bingham, 1964;
Bourgoyne Jr & Young Jr, 1974; Hareland & Rampersad, 1994; Motahhari, Hareland, & James, 2010). The
performance of these models has been routinely questioned (Soares, Daigle, & Gray, 2016) since their applicability
on new datasets is not guaranteed. Advances in computational power and machine learning over the past few years
have seen the birth of many new data-driven ROP prediction models – models based purely on data statistics (C.
Hegde, Daigle, Millwater, & Gray, 2017; C. Hegde & Gray, 2017). These models utilize machine learning
algorithms for ROP prediction and have been shown to generalize well for different formations.

Data-driven modeling techniques provide high-performance, reproducible, and scalable solutions. However, they
have an unknown functional form which makes their interpretation difficult. They use drilling parameters as inputs,
while not constraining functional form (C. Hegde, Daigle, & Gray, 2018). This allows them to model the data more
closely, resulting in higher accuracy. This increase in accuracy is accompanied by a decrease in model inference.
The known functional form of a deterministic model, however, can be used to derive insights (such as the most
influential input drilling parameter). Hence, finding a middle ground between these two models can result in a good
trade-off between accuracy and inference. The goal of this paper is to evaluate the performance of hybrid models – a
combination of deterministic and machine learning models – in modeling drilling rate of penetration (ROP). This
modeling methodology has been carried out with success in climate science (Goldstein, Coco, Murray, & Green,
2014) where combining the strengths of inductive (data-driven) and deductive (physics-based) approaches in a
single hybrid model was successful.

This paper analyzes different ROP modeling strategies. Bingham (1964), Bourgoyne and Young (BY) (1974), and
Motahhari et al. (2010) constitute the applied deterministic ROP models. Different machine learning algorithms –
regression, regularized regression, bagging, random forests, and K-nearest neighbors – are used to model ROP; the
best machine learning algorithm based on cross validation error is used as a data-driven ROP model. A novel way to
formulate hybrid models has been discussed by presenting two broad strategies: ensembles of a single deterministic
model (hybrid-One) and ensembles of several deterministic models (hybrid-N). Both types of hybrid models are
evaluated for ROP predictive accuracy and model inference on twelve different formations by running experiments
on data measured while drilling a well in the Williston Basin, North Dakota. A case study for Mission Canyon
limestone shows the application of each drilling model in detail – for prediction and inference. Hybrid models result
in a higher ROP prediction accuracy when compared to deterministic models. Additionally, they can be used as
effective inferential tools, unlike data-driven models.

2. Dataset and Attributes

2.1 Experiments
Experiments are conducted to test the accuracy of different models for ROP prediction on measured field data.
Figure 1 shows the ROP data plotted for the entire well, colored based on geological rock formation. Field measured
data used to conduct and evaluate models in this paper were collected on the surface. The well was drilled using a
Smith PDC 616 bit. Models are built individually on each formation to adapt for varying geology.

Figure 2 shows the methodology of test error evaluation. A portion of the well is drilled (without any modeling); the
data collected in this interval are the training data. This dataset is used to build the ROP model and select
empirical constants (for deterministic models) or hyperparameter values (for hybrid and machine learning models).
The portion of drilling data not used for model building (colored red in Figure 2) is the test set which is held out
during training. The held-out test set ensures unbiased evaluation of the model error since it is an “unseen” portion of
the dataset. Once the model is built on training data it can be used for prediction of ROP ahead of the bit (colored
green in Figure 2). It is important to ensure that test error is low (not just the training error) since low training error
does not imply low test error. If a model has low training error and high test error, it is said to have overfit the data.
Overfitting occurs due to excessive variance of the model and underfitting due to excessive bias of the model. Bias
is accumulated when a simple model is used to explain a complex real-world phenomenon. Variance is associated
with error accumulated in the model when data sets are changed – how well the model generalizes. The bias-
variance tradeoff is a constant theme in machine learning (James, Witten, Hastie, & Tibshirani, 2013). Picking the
best model – which results in low bias and low variance – is often best managed by cross-validation. Previous work
related to the use of machine learning for ROP prediction (C. Hegde et al., 2017) determined that data collected in
drilling the first 20-30% of formation depth were sufficient to build an accurate data-driven model. Since this paper
evaluates deterministic, data-driven, and hybrid models, 40% of formation depth data was used for training to ensure
better fitting.
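
The hold-out scheme described above can be sketched as a depth-ordered split. This is a minimal illustration only: the function name `split_by_depth`, the record layout, and the sample values are assumptions made for the example and are not taken from the field dataset.

```python
def split_by_depth(records, train_fraction=0.4):
    """Split depth-indexed records into a shallow training set and a deeper test set.

    records: list of (depth_ft, rop_ft_per_hr) tuples (hypothetical layout).
    The shallowest `train_fraction` of the records is used for training,
    mimicking data gathered before predicting ROP ahead of the bit.
    """
    ordered = sorted(records, key=lambda rec: rec[0])
    n_train = int(len(ordered) * train_fraction)
    return ordered[:n_train], ordered[n_train:]


# Synthetic example: ten observations at 0.25 ft spacing (made-up values).
records = [(6001.0 + 0.25 * i, 50.0 + i) for i in range(10)]
train, test = split_by_depth(records)
```

Splitting by depth rather than at random keeps the test set strictly deeper than the training set, which matches the intended use of predicting ahead of the bit.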

Figure 1: Scatter plot showing field data used for ROP model validation

Figure 2: Schematic of ROP modeling within a formation

2.2 Dataset Attributes

Data used for validation of experiments have been collected in a depth-based drilling data set spanning from 6001 ft
to 9128 ft TVD. Data from twelve formations have been evaluated in this paper. These data were acquired from
drilling the vertical portion of a well in the Bakken shale. The data set contains drilling parameters – measured on
the surface and downhole – in a depth-based format, per 0.25 ft of drilled depth. A simplified stratigraphic column
of the formations drilled is shown in Figure 3. The entire interval of data used for analysis in this paper was drilled
by Marathon Oil.

Figure 3: Generalized stratigraphic column for the Williston Basin, North Dakota (Theloy, 2014)

3. ROP Models

3.1 Deterministic Models

Bingham’s model (Bingham, 1964) is the earliest deterministic model considered in this paper. The model was
designed to be applied to any bit-type.

$$ROP = a \cdot RPM \left( \frac{WOB}{D_b} \right)^{b}, \qquad \text{(Equation 1)}$$

where ROP is the rate of penetration (ft/hr), WOB is the weight on bit (klb), RPM is the rotary speed of the drill bit
(revolutions/sec), Db is the bit diameter (in), and ‘a’ and ‘b’ are constants fitted for a given rock formation. The
constants are evaluated for each formation by conditioning them to the training set. The constant ‘a’ quantifies
the ease of drilling through a given formation.

The Bourgoyne and Young (BY) model (Bourgoyne Jr & Young Jr, 1974) estimates ROP as a function of eight
parameters as shown in Equation 2.

$$\frac{dD}{dt} = \exp\left( a_1 + \sum_{j=2}^{8} a_j x_j \right) \qquad \text{(Equation 2)}$$

where D is the well depth (ft), t is the time (hr), 𝑎1 is the formation strength parameter, 𝑎2 is the normal compaction
trend exponent, 𝑎3 is the undercompaction exponent, 𝑎4 is the pressure differential exponent, 𝑎5 is the bit weight
exponent, 𝑎6 is the rotary speed exponent, 𝑎7 is the tooth wear exponent, and 𝑎8 is the hydraulic exponent.
Coefficients 𝑎1 through 𝑎8 are determined with a multiple regression technique, using several data points to
determine the eight unknowns that best fit a specific set of field data.

Motahhari’s model (Motahhari et al., 2010) introduced a PDC-bit-specific model and incorporated within it a wear
function and the effect of rock strength on ROP (Equation 3).

$$ROP = W_f \left( \frac{G \, RPM^{\gamma} \, WOB^{\alpha}}{D_b \, CCS} \right) \qquad \text{(Equation 3)}$$

where CCS is the confined compressive strength of the rock (psi), Wf is the wear function, G is the model coefficient
which represents the drillability, and α and γ are ROP-related model exponents. The CCS used
in Motahhari’s model requires laboratory testing at different confining pressures, which is seldom undertaken
(Menand & Mills, 2017). Field-based correlations and a rock failure envelope (such as Mohr-Coulomb) can be used
instead to determine the CCS of a rock. Unconfined compressive strength (UCS) data in this dataset come from
calculations using field-based correlations on sonic log measurements.

The Bingham model is the most basic ROP model. It considers two main input features affecting ROP: WOB and
RPM. Motahhari’s model adds on the effect of rock strength to this model. The BY model adds flowrate among
other parameters to Bingham’s model. All three deterministic models have empirical constants which are dependent
on the geology and bit design. These empirical constants are calculated using the training data. Fitting these
constants involves a fitting procedure or optimization algorithm.
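
As an illustration of fitting these constants, the sketch below estimates Bingham's ‘a’ and ‘b’ from training data. It assumes a log-space least squares fit, obtained by taking logarithms of Equation 1 so that ln(ROP/RPM) = ln(a) + b ln(WOB/Db); the paper does not state which fitting procedure was used, and the function names and data values here are hypothetical.

```python
import math


def fit_bingham(rpm, wob, rop, bit_diameter):
    """Estimate Bingham constants a and b by least squares in log space.

    Assumption: ln(ROP / RPM) = ln(a) + b * ln(WOB / Db) is fitted instead of
    minimizing the error in ROP directly.
    """
    x = [math.log(w / bit_diameter) for w in wob]
    y = [math.log(r / n) for r, n in zip(rop, rpm)]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b


def predict_bingham(a, b, rpm, wob, bit_diameter):
    """Evaluate Equation 1 for one observation."""
    return a * rpm * (wob / bit_diameter) ** b


# Synthetic, noise-free data generated with a = 2.0 and b = 1.2 (made-up values).
bit_diameter = 8.75
rpm = [60.0, 80.0, 100.0, 120.0]
wob = [10.0, 15.0, 20.0, 25.0]
rop = [predict_bingham(2.0, 1.2, n, w, bit_diameter) for n, w in zip(rpm, wob)]
a_hat, b_hat = fit_bingham(rpm, wob, rop, bit_diameter)
```

On noise-free data the fit recovers the generating constants; on field data a direct nonlinear least squares fit in ROP units may be preferred, since the log transform changes how errors are weighted.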

3.2 Machine Learning Models

The training set is used to build (or form) the model; this model is used to predict the response (ROP in this case).
The model building process incorporates feature engineering – inclusion of domain knowledge – to ensure that the
model is built obeying the physics of the process. The model is built as a function of its feature vectors: RPM,
WOB, flow rate, and strength of the rock (UCS). Data-driven models do not pose any constraint on functional form,
thereby making them more flexible as compared to deterministic models. The machine learning algorithms used to
fit data-driven models are explained in Section 4.

3.3 Hybrid Models

These models combine deterministic and machine learning models and their objective is to provide a trade-off
between interpretability and prediction accuracy. Two approaches are utilized to build a hybrid model: ensembles of
a single deterministic model (hybrid-One) and ensembles of several deterministic models (hybrid-N).

3.3.1 Hybrid-One
This model uses one deterministic ROP model within an ensemble algorithmic framework. Deterministic models
used in this paper have fitting parameters (or constants) which are determined based on the geology and drill bit
design. The Hybrid-One model uses an ensemble machine learning algorithm to effectively fit these parameters in
batches.

The ensemble algorithm combines many versions (or “realizations”) of the same deterministic model to yield a
hybrid model. Figure 4 shows a flow chart of this algorithm by using Bingham’s model as a base. ‘N’ versions of
Bingham’s model are combined using an ensemble algorithm such as bagging or random forests. A “version” of
Bingham’s model is a model with unique fitted coefficients. These coefficients are selected by minimizing the
prediction error of the deterministic model on a subset (or batch) of the training set. For example, a Bingham model
can be built using a random one tenth of the training set. This constitutes a form or version of the model as shown in
Figure 5. This is repeated ‘N’ times. All ‘N’ models are fed into an ensemble algorithm which is used to combine all
versions of the model into a single predictive model – the Hybrid-One model. This algorithm works on the premise
that many input features can be underrepresented in the final model when the entire formation is fitted at once. In
contrast, this two-layered fitting procedure guards against outliers and increases the influence of underrepresented
input features.
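
The Hybrid-One procedure can be sketched as follows, using Bingham's model as the base. This is a simplified illustration under stated assumptions: each realization is fitted by log-space least squares on a random tenth of the training rows, and the realizations are combined by plain averaging (a bagging-style combination). The function names, default settings, and synthetic data are hypothetical and do not reproduce the paper's implementation.

```python
import math
import random


def fit_bingham(rows, bit_diameter):
    """Log-space least squares fit of Bingham's a and b (assumed fitting method).

    rows: list of (rpm, wob, rop) tuples.
    """
    x = [math.log(wob / bit_diameter) for _, wob, _ in rows]
    y = [math.log(rop / rpm) for rpm, _, rop in rows]
    x_mean, y_mean = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    return math.exp(y_mean - b * x_mean), b


def fit_hybrid_one(rows, bit_diameter, n_versions=25, batch_fraction=0.1, seed=0):
    """Fit n_versions Bingham realizations, each on a random batch of the training rows."""
    rng = random.Random(seed)
    batch_size = max(2, int(len(rows) * batch_fraction))
    return [fit_bingham(rng.sample(rows, batch_size), bit_diameter)
            for _ in range(n_versions)]


def predict_hybrid_one(versions, rpm, wob, bit_diameter):
    """Combine the realizations by averaging their predictions (bagging-style)."""
    preds = [a * rpm * (wob / bit_diameter) ** b for a, b in versions]
    return sum(preds) / len(preds)


# Synthetic training data with a = 2.0, b = 1.2 and mild multiplicative noise (made up).
data_rng = random.Random(42)
bit_diameter = 8.75
rows = []
for _ in range(200):
    rpm_i = data_rng.uniform(50.0, 150.0)
    wob_i = data_rng.uniform(5.0, 30.0)
    noise = math.exp(data_rng.gauss(0.0, 0.05))
    rows.append((rpm_i, wob_i, 2.0 * rpm_i * (wob_i / bit_diameter) ** 1.2 * noise))

versions = fit_hybrid_one(rows, bit_diameter)
rop_hat = predict_hybrid_one(versions, rpm=100.0, wob=20.0, bit_diameter=bit_diameter)
```

Because every realization keeps the Bingham functional form, the spread of the fitted (a, b) pairs across batches can be inspected directly, which is one way the hybrid retains interpretability.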

Figure 4: Schematic for building a Hybrid-One model using Bingham’s model as the base deterministic model

Figure 5: Building a form of a Bingham model; Training set is randomly sampled as shown in the figure for data, which is used to fit a Bingham
model.

3.3.2 Hybrid-N
All three previously described deterministic ROP models are combined to provide better ROP predictions. This is
based on the premise that certain downhole conditions which are not explained by one model can be explained using
another model. Therefore, it is hypothesized that the deficiencies of models are cancelled out by the other models in
the ensemble – like a team effort. Each model is assigned a weight (𝑤𝑖 ) determined mathematically to reduce the
overall error in ROP prediction. Machine learning algorithms are used to determine the weight since they account
for correlation between the input ROP models and account for overfitting. Figure 6 shows a schematic of the hybrid-N
model where three deterministic models – Bingham, Motahhari, and BY – are combined using an
ensembling algorithm to yield a predictive model. This template can be easily extended to numerous other oil and
gas applications such as prediction of production using different models, decline curve analysis for better accuracy,
history-matching, etc. A constraint is often beneficial for ensembling different algorithms to ensure that the
optimization problem is well defined (Friedman, Hastie, & Tibshirani, 2001). The weights of each model are set to
only take on values between 0 and 1. Additionally, the sum of all ‘N’ model weights is set to 1 (this makes it a
quadratic programming problem, making the optimization easy (Friedman et al., 2001)). There can be cases where
some model weights are driven to zero, indicating that the model in question does not contribute to the final model.

Figure 6: Schematic for building Hybrid-N ROP model using three deterministic models

4. Algorithms

This section describes algorithms which are used for the development of hybrid and data-driven ROP models.

4.1 Mean
This algorithm would be applied to build the hybrid-N model; it predicts ROP as the equally weighted average of the
‘N’ input predictions (Equation 4). For the case shown in Figure 6, since three input models were used, ROP is
predicted using a third of each model’s prediction. While not a machine learning algorithm, this method has been
presented given its sheer simplicity. The premise is that the ‘N’ ROP models used are dependent on different drilling
parameters and can compensate for each other. For example, Bingham’s ROP model highly depends on RPM and WOB,
whereas Motahhari’s model is the only one sensitive to rock strength.
$$ROP = \frac{1}{N} \sum_{n=1}^{N} D_n, \qquad \text{(Equation 4)}$$

where, ‘N’ is the number of input deterministic models and Dn is the ROP prediction of the nth model.

4.2 Regression and Stacking

Regression is the simplest machine learning technique which assumes that the data are linearly related. For the
hybrid-N algorithm, it assumes that the metadata - predictions of ROP models themselves - can be combined
linearly to predict the response (Equation 5). Regression assigns each deterministic model with a weight, which
measures its relative contribution to the final ROP model. The least squares algorithm is used to minimize the 𝑙2
norm. The premise of using linear regression is that ROP predictions from different models are sufficiently non-
linear; these non-linear metadata can be combined linearly to predict the response.

$$ROP = \sum_{n=1}^{N} w_n D_n, \qquad \text{(Equation 5)}$$
where, w𝑛 is the weight of each prediction, and 𝐷𝑛 is the ROP prediction of the nth deterministic model. Since these
models are prone to overfitting, adding constraints often helps improve the efficacy of regression. By constraining
the weights (w𝑛 ) to sum to 1 and remain non-negative the problem is better formulated. This ensures that stacking
does not assign higher weights to more complex models and converts the least squares regression to a quadratic
programming problem (Friedman et al., 2001).
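
The constrained stacking fit can be sketched in a few lines. The paper frames it as a quadratic programming problem; the example below instead uses projected gradient descent onto the probability simplex as a simple stand-in solver, which is an assumption made to keep the sketch dependency-free. The function names and the small synthetic prediction matrix are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def fit_stacking_weights(predictions, rop, n_iter=20000):
    """Least squares weights with w >= 0 and sum(w) = 1 via projected gradient descent.

    predictions: list of rows, one ROP prediction per deterministic model.
    rop: measured ROP for each row.
    """
    m = len(predictions)
    n_models = len(predictions[0])
    # Step size from an upper bound on the Lipschitz constant of the gradient.
    lipschitz = 2.0 * sum(d * d for row in predictions for d in row) / m
    step = 1.0 / lipschitz
    w = [1.0 / n_models] * n_models
    for _ in range(n_iter):
        residuals = [sum(wn * dn for wn, dn in zip(w, row)) - y
                     for row, y in zip(predictions, rop)]
        grad = [2.0 / m * sum(r * row[n] for r, row in zip(residuals, predictions))
                for n in range(n_models)]
        w = project_to_simplex([wn - step * g for wn, g in zip(w, grad)])
    return w


# Made-up predictions from three models; the target mixes the first two as 0.3 / 0.7.
predictions = [
    [50.0, 80.0, 20.0],
    [60.0, 40.0, 90.0],
    [70.0, 100.0, 30.0],
    [90.0, 60.0, 110.0],
    [110.0, 120.0, 50.0],
    [40.0, 30.0, 70.0],
]
rop = [0.3 * row[0] + 0.7 * row[1] for row in predictions]
weights = fit_stacking_weights(predictions, rop)
```

In this synthetic case the third model receives a weight close to zero, which mirrors the situation where a deterministic model does not contribute to the final ROP model.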

4.3 Ridge Regression

This algorithm adds constraints to the regression equation by adding a regularization term (Equation 6).

$$ROP = \sum_{n=1}^{N} w_n D_n + \lambda \sum_{n=1}^{N} w_n^2, \qquad \text{(Equation 6)}$$
where, w𝑛 are the weights of each prediction, D𝑛 is the ROP prediction of the nth deterministic model, and λ is a
tuning parameter selected using cross-validation. The extra term in Equation 6 as compared to Equation 5 is the
regularization term; it adds a constraint on the weights which helps prevent overfitting by penalizing large weights
(making the model less flexible).
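
One common way to read the regularization term, offered here as an interpretation rather than the authors' exact formulation, is as a penalty added to the least squares objective used to fit the weights. In the sketch below, ROP_i is the measured ROP at training observation i, D_{i,n} is the prediction of model n at that observation, and m is the number of training observations; these symbols are introduced only for this illustration.

```latex
\[
\hat{w} = \arg\min_{w} \; \sum_{i=1}^{m} \left( ROP_i - \sum_{n=1}^{N} w_n D_{i,n} \right)^{2}
          + \lambda \sum_{n=1}^{N} w_n^{2}
\]
\[
\hat{w} = \left( D^{\top} D + \lambda I \right)^{-1} D^{\top} \, ROP
\]
```

Setting λ = 0 recovers ordinary least squares, while larger values of λ shrink the weights toward zero.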

4.4 K-Nearest Neighbor (KNN)

The KNN is a non-parametric method which does not assume any functional form. For a specified value of K and a
prediction point x0, the algorithm searches for the K closest neighbors (N0). The estimate is the average of the
responses of all the training data in N0 (Equation 7). This model will perform poorly in high dimensional spaces (with
the increase of input parameters or dimensions, the ability of the algorithm to find nearest neighbors decreases). This
algorithm can be used to predict ROP independently as a standalone data-driven model or by combining different
deterministic model predictions using the hybrid-One algorithm.
$$ROP = \frac{1}{K} \sum_{n=1}^{K} ROP_{N_0}, \qquad \text{(Equation 7)}$$
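
A minimal sketch of the KNN estimate described above is given below. It assumes Euclidean distance on features that are already on comparable scales; the function name and the small (RPM, WOB) dataset are made up for illustration.

```python
import math


def knn_predict(train_x, train_y, x0, k=3):
    """Average the response of the k training points closest to x0 (Euclidean distance)."""
    def distance(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], x0))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / k


# Made-up (RPM, WOB) observations and their ROP values in ft/hr.
train_x = [(60.0, 10.0), (62.0, 11.0), (100.0, 25.0), (102.0, 26.0), (61.0, 10.5)]
train_y = [50.0, 54.0, 110.0, 114.0, 52.0]
rop_hat = knn_predict(train_x, train_y, x0=(61.0, 10.4), k=3)
```

In practice the inputs would usually be standardized first, since a feature with a large numeric range would otherwise dominate the distance.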

4.5 Trees
Trees represent a simple non-linear method to predict data. A simple tree has been built to predict ROP (Figure 7). It
consists of a series of splitting rules. The first split assigns observations having flow rate > 374 to the right branch.
The next parameter evaluated is UCS: if it is less than 5428 psi then an ROP prediction of 110 ft/hr is returned. On
the other hand, if the flow rate is less than 374 and RPM is less than 63 then a prediction of 53 ft/hr is
returned. Overall, the tree stratifies data into different segments based on input features.

A tree is built using a simple (greedy) algorithm. The sample predictor space of ‘N’ dimensions is divided into ‘m’
distinct non-overlapping regions (Rm). For each observation that falls into a region Ri the mean value of the training
values in Ri is the predicted ROP. The goal is to find regions (Ri) which minimize the sum of squared errors of the
entire training set. Trees are typically grown until a stopping criterion is reached: a minimum of three observations
in each region. Long trees are prone to overfitting and are commonly pruned using cross-validation as described by
James et al. (2013). Trees are the building blocks of ensemble algorithms like bagging and random forests.
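
The greedy step at the heart of tree building can be sketched for a single feature: every candidate threshold is tried and the one that minimizes the sum of squared errors is kept. A full tree would repeat this search over all features and recurse into each resulting region until the stopping criterion is met; that recursion is omitted here. The function name and the flow rate values are hypothetical.

```python
def best_split(values, rop):
    """Find the threshold on one feature that minimizes the sum of squared errors.

    Each side of the split predicts the mean ROP of its training observations.
    Returns (threshold, total_sse), or None if the feature has a single value.
    """
    def sse(ys):
        if not ys:
            return 0.0
        mean = sum(ys) / len(ys)
        return sum((y - mean) ** 2 for y in ys)

    best = None
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2.0
        left = [y for v, y in zip(values, rop) if v < threshold]
        right = [y for v, y in zip(values, rop) if v >= threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Made-up flow rates and ROP values with an obvious break between 370 and 380.
flow_rate = [350.0, 360.0, 370.0, 380.0, 390.0, 400.0]
rop = [50.0, 52.0, 54.0, 108.0, 110.0, 112.0]
threshold, total_sse = best_split(flow_rate, rop)
```

The search is greedy because each split is chosen for its immediate error reduction, without looking ahead to later splits.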

4.6 Bagging
Trees are not very accurate predictors and suffer from high variance. If a tree-based predictor is built on two
separate halves of the training dataset, the two resulting trees are likely to be different. An algorithm with low
variance (like linear regression) would not have this problem. The variance of independent samples (X1, X2,…Xn),
each with variance σ², can be reduced by averaging them, since the variance of the mean (X̄) of these samples is σ²/n.
In this setting, multiple trees can be averaged to reduce their overall variance, preventing overfitting. These trees
can be built on pseudo independent data sets carved from the original dataset using bootstrapping (Efron, 1987). To
summarize, a tree is grown on each bootstrapped training sample and the trees are averaged to reduce the overall
variance. This powerful algorithm described using trees can be generalized to any model. Multiple models built on
bootstrapped training data sets, each with high variance, can be averaged to reduce variance. This is extended to the
Hybrid-N model where deterministic models with high variance (like the Bingham model) are averaged to yield a much
better ROP predictor.
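
The variance argument above can be written out explicitly. The first line is the independent case quoted in the text; the second is the standard extension to identically distributed but correlated predictors, where the pairwise correlation ρ is a symbol introduced here for illustration and is not used in the paper.

```latex
\[
\mathrm{Var}(\bar{X}) = \mathrm{Var}\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right)
  = \frac{1}{n^{2}} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma^{2}}{n}
  \quad \text{(independent } X_i \text{, common variance } \sigma^{2}\text{)}
\]
\[
\mathrm{Var}(\bar{X}) = \rho \, \sigma^{2} + \frac{1 - \rho}{n} \, \sigma^{2}
  \quad \text{(pairwise correlation } \rho\text{)}
\]
```

Models built on bootstrapped samples share training data, so ρ is generally greater than zero and averaging cannot push the variance below ρσ². This is one motivation for reducing the correlation between the averaged models.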

Figure 7: Sample tree used to predict ROP using RPM, WOB, flow rate, and UCS as inputs

4.7 Random Forests

The random forests (Breiman, 2001) algorithm is an extension of the bagging algorithm which is appropriate for
modeling non-linear data. It applies a condition which decorrelates trees and improves model accuracy. Rather than
considering all available input drilling features (X1, X2,…Xn), only a subset of input features – selected using cross
validation – are used to build a tree. This powerful algorithm is used to predict ROP for the data-driven model. This
algorithm can be generalized to be used outside ROP prediction.

5. Results and Discussions

Models are evaluated in this section based on their test errors. 40% of the data in each formation are used for
building the model (training set) and the rest to evaluate accuracy (test set). Model hyperparameters are fine-tuned
using cross validation on the training set; trained models are then evaluated on the test set for an unbiased test
accuracy. Deterministic models and machine learning algorithms were used to model ROP based on field data.
Figure 8 summarizes the performance of using deterministic models for ROP prediction. Results indicate that the
random forests algorithm outperforms deterministic models for ROP prediction. Among the deterministic models,
Motahhari’s model performs better than Bingham and the BY model. In a few formations the deterministic models
yield acceptable results. However, in most cases, they underfit and result in low test accuracy.

A random forests model built using WOB, RPM, flow-rate and UCS as input features produced lower error rates
than all other models explored so far. Different machine learning algorithms have been evaluated for ROP prediction
(Figure 9) of which random forests outperformed all others in test accuracy paralleling results seen in literature (C.
Hegde et al., 2017; C. Hegde, Wallace, & Gray, 2015; C. Hegde & Gray, 2017; Ch. Hegde, 2016).

Figure 8: Performance of ROP models on test data for the entire dataset. Three deterministic models and one machine learning algorithm have
been evaluated for test set accuracy.

Figure 9: ROP prediction using data-driven models. Random forests ROP prediction has the lowest error as compared to all other statistical
learning algorithms

Regularized (or ridge) regression, KNN, and bagging were used as ensembling algorithms to combine each
deterministic model using the hybrid-One algorithm. Since the input models are used as metadata for the ensemble,
this model is prone to overfitting (especially if the different “realizations” of the deterministic models are similar).
Ridge regression and bagging give some protection against overfitting and serve as good candidates for ensembling
algorithms. Figures 10, 11, and 12 evaluate the effect of each ensembling algorithm on the accuracy of a
corresponding hybrid-One model. Ridge regression produces the best hybrid models using Bingham and BY as a
base deterministic model, resulting in a significant increase in test accuracy over their vanilla deterministic
counterparts. For the Motahhari model, using bagging works well and improves accuracy. Overall, the results show
that the hybrid-One algorithm results in better ROP prediction as compared to purely deterministic models, as
hypothesized. However, these models still do not always outperform machine learning models, although most
hybrid-One models come close.

Figure 10: Hybrid-One models using the bagging algorithm. Bingham and Motahhari hybrid-One models perform better than the deterministic
models; the hybrid-One BY model has a higher mean error than the deterministic model itself; all models perform worse than machine learning
models.

Figure 11: Hybrid-One models using the random forest algorithm. All three hybrid-One models perform worse than the deterministic models and
machine learning models.

Figure 12: Hybrid-One models using the ridge regression algorithm. The hybrid-One models with Bingham and BY perform much better than the
deterministic models. The hybrid-One Motahhari model does not perform as well as the deterministic model overall. The machine
learning model outperforms all models.

Using the hybrid-N algorithm, models were built and evaluated against test data. A total of six algorithms – mean,
regression, bagging, random forests, KNN and stacking – were used to combine deterministic models in an effort to
identify the best hybrid-N algorithm (Figure 13). Results indicate that the stacking algorithm outperforms other
evaluated algorithms. Additionally, stacking is a linear algorithm which makes its interpretation easy; Table 1
summarizes the weights assigned to each deterministic model by the stacking algorithm. These weights can be
considered to be a proxy for the importance of each deterministic model in the final ROP model.

Data-driven models perform well in most cases. In certain instances, with a shortage of data, it may be more feasible
to use a deterministic model. Hybrid models provide an alternative way to fit deterministic models which often
boosts accuracies and still provides some inference. In general, for a field application a bag-of-models approach
should be preferred – the best model for that specific formation based on cross-validation error is used as opposed to
an overall best model for the entire well.

Figure 13: Performance of hybrid-N models for ROP modeling. A total of six algorithms have been evaluated to determine the best algorithm for
the use of the hybrid-N model.

Table 1: Weights assigned to each deterministic model, by formation, by the stacking algorithm used to create a hybrid-N ROP model

Hybrid-N Model Contribution

Formation                    Bingham   Motahhari   BY
Piper Limestone              0         0.35        0.65
Spearfish Sandstone          0.59      0.21        0.2
Pine Salt Sandstone          0         0.77        0.23
Broom Creek Sandstone        0         0.64        0.36
Tyler Sandstone              0.1       0.76        0.15
Kibbey Lime Limestone        1         0           0
Kibbey Lime Shale            0         0           1
Charles Sandstone            0         0.6         0.4
Charles Limestone            0.05      0.76        0.19
Ratcliffe Sandstone          0.65      0.35        0
Base Last Salt Limestone     0.04      0.91        0.04
Base Last Salt Sandstone     0.25      0.26        0.49
Mission Canyon Limestone     0.11      0.89        0
Lodgepole Limestone          0         0.24        0.76

5.1 Model Analysis and Inference

Deterministic models provide utmost transparency in the ROP modeling process. The exact form of the equation is
known, which makes model inference easy. The parameters which affect the ROP – RPM, WOB, UCS, and flowrate
– are built into the equation; hence, once the constants are determined it is easy to carry out an inferential analysis.
For example, if the best fit Bingham model had RPM raised to 0.5 and WOB raised to 1.5, then WOB would be the
most influential parameter to change ROP. The caveat is that these models do not result in accurate predictions.
In contrast, machine learning based ROP models result in much better predictive accuracy of ROP; however, they
have low inferential capabilities (except in the case of linear regression-based methods). The random forest
algorithm allows us to rank input features (or compute the importance of each input feature) in the model which can
be used as a proxy for inference. An example of this feature importance has been shown in Figure 14. It is calculated
by measuring the increase in information (or Gini index) with a split in the decision tree for a given input feature.
This is averaged across all trees and plotted (Figure 14). Since the random forest and bagging algorithms build trees
using a greedy algorithm, the feature importance shown may not be representative of the data; it merely represents
the feature importance in the model. Additionally, it does not provide a direction of change but only a magnitude of
importance.

Figure 14: (Left) ROP prediction using random forests in Lodgepole Limestone; (Right) Drilling parameter interest using random forest for ROP
prediction in Lodgepole Limestone

Hybrid models provide a tradeoff between accuracy and the ability to perform model inference. The hybrid-One
algorithm introduced an alternative method to fit and tune hyperparameters using deterministic models. Since it is an
additive model, it can be used to determine the most influential sub-models, which can be used for inference (since
these sub-models are “realizations” of a deterministic model). The hybrid-N algorithm provides a simple method to
evaluate the importance of a deterministic model for each formation. This can be traced back to the most influential
input parameter of the dominant model. The Bingham model calculates ROP based on the change in RPM and
WOB. The most dominant term added in Motahhari’s model is UCS (when compared to Bingham); BY adds
flow-rate. Hence, it can be hypothesized that the weight of a deterministic model (Table 1) in the hybrid-N algorithm is
representative of the dominant drilling operating parameters (note that WOB and RPM are combined into one
parameter). This provides a proxy for model importance and thereby input feature importance. This model
importance can be compared to the feature importance which is calculated using the random forests algorithm
(Table 2).
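
As a minimal sketch of this hypothesis, the snippet below maps the hybrid-N weights of a formation to the operating parameter associated with its highest-weighted deterministic model. The weights are copied from Table 2; the mapping dictionary and the function name are illustrative assumptions that simply restate the hypothesis above and are not part of the hybrid-N algorithm itself.

```python
# Dominant operating parameter hypothesized for each deterministic model
# (WOB and RPM are treated as one combined parameter).
DOMINANT_PARAMETER = {
    "Bingham": "WOB and RPM",
    "Motahhari": "UCS",
    "BY": "flow-rate",
}

# Hybrid-N weights (Bingham, Motahhari, BY) for three formations in Table 2.
HYBRID_N_WEIGHTS = {
    "Ratcliffe Sandstone": (0.65, 0.35, 0.0),
    "Mission Canyon Limestone": (0.11, 0.89, 0.0),
    "Lodgepole Limestone": (0.0, 0.24, 0.76),
}


def dominant_model(weights):
    """Return (model name, hypothesized dominant parameter) for the largest weight."""
    names = ("Bingham", "Motahhari", "BY")
    model = max(zip(names, weights), key=lambda pair: pair[1])[0]
    return model, DOMINANT_PARAMETER[model]


for formation, weights in HYBRID_N_WEIGHTS.items():
    model, parameter = dominant_model(weights)
    print(f"{formation}: {model} -> {parameter}")
```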

Table 2 shows similar trends between the two cases analyzed. A higher feature importance of UCS is associated
with higher weighting of the Motahhari model in all cases. In a few cases where the random forest model associates
high feature importance to RPM and WOB, the Bingham model is dominant or most relevant for the hybrid-N
model. Moderate importance of all features is associated with a high weight of the BY model. The BY model is made of
eight input parameters and is not overly dominant in one feature, making this a plausible result (as opposed to the
initial hypothesis that BY is a flow-rate dominant model). This shows that these modeling schemes are essentially moving
in the same direction; some methods are more flexible than others, which may result in a better fit.

Table 2: Comparison of random forest feature importance and hybrid-N model weights. Important random forest features are highlighted in blue
and green for high hybrid-N weights

                          Random Forest Importance  Hybrid-N Model Contribution
Formation                 WOB   RPM   Flow  UCS     Bingham  Motahhari  BY
Piper Limestone           0.35  0.2   0.11  0.34    0        0.35       0.65
Spearfish Sandstone       0.28  0.15  0.31  0.25    0.59     0.21       0.2
Pine Salt Sandstone       0.08  0.43  0.31  0.18    0        0.77       0.23
Broom Creek Sandstone     0.37  0.19  0.07  0.36    0        0.64       0.36
Tyler Sandstone           0.61  0.07  0.04  0.28    0.1      0.76       0.15
Kibbey Lime Limestone     0.17  0.24  0.02  0.58    1        0          0
Kibbey Lime Shale         0.5   0.16  0.13  0.22    0        0          1
Charles Sandstone         0.22  0.09  0.15  0.54    0        0.6        0.4
Charles Limestone         0.27  0.19  0.14  0.41    0.05     0.76       0.19
Ratcliffe Sandstone       0.58  0.12  0.15  0.14    0.65     0.35       0
Base Last Salt Limestone  0.32  0.19  0.14  0.35    0.04     0.91       0.04
Base Last Salt Sandstone  0.47  0.15  0.1   0.28    0.25     0.26       0.49
Mission Canyon Limestone  0.26  0.12  0.09  0.53    0.11     0.89       0
Lodgepole Limestone       0.43  0.1   0.2   0.26    0        0.24       0.76

6. Case Study: Mission Canyon Limestone

This section applies the previously discussed ROP analysis to the Mission Canyon formation, which is made up of
limestone. The data set was separated into training and test sets with a 40/60 split.

6.1 ROP Prediction

ROP modeling is undertaken on the training set. The Motahhari model is used as the base deterministic model for
this formation since it was the most accurate among the deterministic models. The random forest model is built
using 1000 trees and a feature subset of 2; WOB, RPM, flow-rate, and UCS are used as input parameters. All
deterministic models are stacked using the stacking algorithm for the Hybrid-N ROP predictor; the model’s weights
are displayed in Table 2. One hundred Motahhari models are ensembled using the random forest algorithm to create
the hybrid-One model. The test errors and ROP predictions of these models are plotted in Figure 15. All models
perform well, producing low test error, and both hybrid models perform better than the deterministic and random
forest models.
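
The stacking step behind the Hybrid-N predictor can be sketched as a search for weights that combine the deterministic predictions. The example below is an illustration under stated assumptions, not the stacking algorithm used in this work: it assumes the weights are non-negative and sum to one (the weights in Table 2 sum to approximately one), finds them by a brute-force grid search, and uses made-up prediction values. The function name `fit_convex_weights` is hypothetical.

```python
def fit_convex_weights(predictions, observed, step=0.01):
    """Grid-search three non-negative weights summing to 1 that minimize squared error."""
    n_steps = round(1 / step)
    best_weights, best_error = None, float("inf")
    for i in range(n_steps + 1):
        for j in range(n_steps + 1 - i):
            k = n_steps - i - j
            weights = (i / n_steps, j / n_steps, k / n_steps)
            error = sum(
                (sum(w * p[t] for w, p in zip(weights, predictions)) - y) ** 2
                for t, y in enumerate(observed)
            )
            if error < best_error:
                best_weights, best_error = weights, error
    return best_weights


# Made-up ROP predictions from three deterministic models (illustrative only).
bingham = [20.0, 35.0, 50.0, 42.0, 30.0, 60.0]
motahhari = [25.0, 30.0, 58.0, 40.0, 33.0, 52.0]
by = [40.0, 28.0, 45.0, 55.0, 20.0, 48.0]

# Synthetic "observed" ROP built from known weights so the search can be checked.
observed = [0.11 * b + 0.89 * m for b, m in zip(bingham, motahhari)]

weights = fit_convex_weights([bingham, motahhari, by], observed)
print(weights)
```

A model whose predictions do not help explain the observations ends up with a weight of zero, which is the situation the BY model is in for this formation.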

6.2 ROP Inference

ROP models built in the previous section are used for inference. The deterministic model used, Motahhari’s model,
has three formation-related constants. α and γ represent the exponents of WOB and RPM in Equation 3. For this
formation α is 1.064 and γ is 1. This shows that a percent change in WOB will have a higher impact on the ROP
than a percent change in RPM, which provides field intuition.
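
To make this reading of the exponents concrete, the short derivation below treats every term of Equation 3 other than WOB and RPM as a constant; that simplification and the 10% step size are assumptions chosen for illustration.

```latex
\begin{align*}
ROP &\propto WOB^{\alpha}\, RPM^{\gamma}
  \quad\Longrightarrow\quad
  \frac{\partial \ln ROP}{\partial \ln WOB} = \alpha = 1.064,
  \qquad
  \frac{\partial \ln ROP}{\partial \ln RPM} = \gamma = 1, \\
\text{10\% more WOB:}\quad \frac{ROP_{new}}{ROP_{old}} &= 1.1^{1.064} \approx 1.107, \\
\text{10\% more RPM:}\quad \frac{ROP_{new}}{ROP_{old}} &= 1.1^{1} = 1.1 .
\end{align*}
```

Under these assumptions the two effects are close in size, with WOB slightly ahead.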

Random forest’s feature importance plot (Figure 16) shows that UCS is the most important parameter
influencing ROP, followed by WOB, RPM, and flow-rate. Hence, to change ROP while drilling this formation,
WOB is the key controlling parameter, as rock strength is not a controllable parameter. Unlike the deterministic
model, this method does not provide the direction of change.

Hybrid-N model weights are: 0.11 for Bingham, 0.89 for Motahhari, and 0 for the BY model. Zero weight assigned
to a model implies that the model does not help improve ROP predictions. From an inferential perspective, the
dominant drilling feature of the excluded deterministic model – flow-rate in this case – is not an important feature
influencing the ROP. This is reinforced by the random forest model’s feature importance plot shown in Figure 16.
The weights of the hybrid-N model imply that rock strength is the key feature which affects ROP (followed by
WOB and RPM). These results align perfectly with the previously discussed models. However, in this case, unlike
random forests, it is possible to derive the exact relationship between ROP and the input features. The model is
composed of two ROP models (0.11*Bingham + 0.89*Motahhari); each of these models has a relationship
between ROP, WOB, and RPM which can be used for inference in the form of an additive model, as shown in
Equation 8.

$$ROP \propto RPM \cdot WOB^{0.5} + RPM \cdot WOB^{1.064} \qquad \text{(Equation 8)}$$
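
A hedged numerical sketch of how Equation 8 might be used for inference follows. It assumes that each term can be written with a single lumped coefficient (`a` for the Bingham term, `b` for the Motahhari term) that absorbs the hybrid-N weight and all other model constants; the default values 0.11 and 0.89 are placeholders taken from the weights, not fitted coefficients, and the function names are hypothetical. The sketch computes the local sensitivity of ROP to each input, i.e. the percent change in ROP per percent change in that input.

```python
def additive_rop(rpm, wob, a=0.11, b=0.89):
    """Additive ROP form of Equation 8 with assumed lumped coefficients a and b."""
    return a * rpm * wob ** 0.5 + b * rpm * wob ** 1.064


def wob_sensitivity(rpm, wob, a=0.11, b=0.89):
    """d ln(ROP) / d ln(WOB): a weighted average of the two WOB exponents."""
    bingham_term = a * rpm * wob ** 0.5
    motahhari_term = b * rpm * wob ** 1.064
    return (0.5 * bingham_term + 1.064 * motahhari_term) / (bingham_term + motahhari_term)


def rpm_sensitivity(rpm, wob, a=0.11, b=0.89, step=1e-6):
    """Finite-difference estimate of d ln(ROP) / d ln(RPM)."""
    base = additive_rop(rpm, wob, a, b)
    bumped = additive_rop(rpm * (1.0 + step), wob, a, b)
    return (bumped / base - 1.0) / step


# Hypothetical operating point (units are not specified in this sketch).
print(wob_sensitivity(rpm=120.0, wob=25.0))
print(rpm_sensitivity(rpm=120.0, wob=25.0))
```

Because RPM enters both terms linearly, its sensitivity is 1 regardless of the coefficients, while the WOB sensitivity always lies between the two exponents, 0.5 and 1.064, for positive coefficients and positive WOB.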

The hybrid-One model combines different versions of a deterministic model. Bagging was used for ensembling
Motahhari models (since it gave the best ROP predictions); the feature importance of the bagging predictor can be
used to determine the most prominent versions of Motahhari’s model (Figure 17). For the Mission Canyon formation,
the most influential version of the Motahhari model is the 4th version, for which the empirical constants α and γ are
both 1. Other versions have decreasing amounts of influence on the model; the empirical constants for these versions
can be calculated to form an additive equation for ROP inference (similar to Equation 8). However, in this case
extracting the empirical constants of versions 1, 6, and 0 provides no additional information. This provides the same
level of inference as the deterministic model but fits the data better, resulting in higher accuracy. This method of
model fitting and inference bears the same flavor as the boosting algorithm: the best deterministic
model is fit to the data, and the portion of the data left unexplained (the residuals) is refit with the same deterministic
model to reduce prediction error. Inference can be performed as in the hybrid-N model; the terms in Equation 8
for this method would all be composed of different “realizations” of the same model.
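
The idea of combining several “realizations” of one deterministic model can be sketched with a bootstrap. This is a simplified stand-in rather than the hybrid-One algorithm itself: it uses a one-variable power law ROP = k * WOB^α instead of the full Motahhari model, fits each realization by least squares in log space, and combines the realizations by plain averaging instead of the bagging predictor used in this work. All names and data are hypothetical.

```python
import math
import random


def fit_power_law(wob, rop):
    """Least-squares fit of ln(rop) = ln(k) + alpha * ln(wob); returns (k, alpha)."""
    xs = [math.log(v) for v in wob]
    ys = [math.log(v) for v in rop]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    return math.exp(y_mean - alpha * x_mean), alpha


def bootstrap_realizations(wob, rop, n_models=100, seed=0):
    """Fit the same power law to bootstrap resamples of the training data."""
    rng = random.Random(seed)
    realizations = []
    while len(realizations) < n_models:
        picks = [rng.randrange(len(wob)) for _ in wob]
        if len({wob[i] for i in picks}) < 2:
            continue  # a fit needs at least two distinct WOB values
        realizations.append(
            fit_power_law([wob[i] for i in picks], [rop[i] for i in picks])
        )
    return realizations


def ensemble_predict(realizations, wob_value):
    """Average the predictions of all realizations (plain bagging-style mean)."""
    return sum(k * wob_value ** alpha for k, alpha in realizations) / len(realizations)


# Synthetic, noise-free data generated from k = 2.0 and alpha = 1.064.
wob = [5.0 + i for i in range(20)]
rop = [2.0 * w ** 1.064 for w in wob]

realizations = bootstrap_realizations(wob, rop)
mean_alpha = sum(alpha for _, alpha in realizations) / len(realizations)
print(round(mean_alpha, 3), round(ensemble_predict(realizations, 10.0), 2))
```

With noise-free data every realization recovers the same constants; with noisy field data the realizations would differ, and the constants of the most influential ones are what would be inspected for inference.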

Figure 12: ROP predictions of models developed for Mission Canyon Limestone; (Left) ROP predictions of deterministic Motahhari model;
(Left-middle) ROP predictions of data-driven random forest model; (Right-middle) ROP predictions of deterministic hybrid-One model; (Right) ROP
predictions of deterministic Hybrid-N model.

Figure 13: Random forest ROP model input feature importance for Mission Canyon Limestone.

Figure 14: Hybrid-One model feature importance for Mission Canyon Limestone

7. Conclusions

Deterministic models are often the industry standard for ROP prediction. Although sound in physics, these models
introduce many empirical coefficients and functional constraints for fitting, which often lead to poor results. Machine
learning models, on the other hand, utilize data to predict ROP; they are more flexible, which allows
them to be better ROP predictors.

Empirical coefficients in deterministic models are calculated by conditioning the models to the data – the training
set. Data-driven models, on the other hand, rely purely on the data. Hybrid models combine the two approaches. Two
algorithms for building hybrid ROP models have been discussed: hybrid-One and hybrid-N. The hybrid-One model
combines different versions (or realizations) of a single deterministic model – providing an alternative method to
determine empirical constants. The hybrid-N model combines predictions from different deterministic models using
an ensembling algorithm. This is based on the premise that one model can compensate for the weaknesses of other
models present in the ensemble. The hybrid model algorithms introduced in this paper can easily be extended to
other applications in oil and gas.

Traditional, hybrid, and data-driven models were simulated to predict ROP in different formations. Models were
evaluated using normalized ROP test error. The machine learning model, using random forests, outperformed all
other models. However, adopting the bag-of-models approach can be more beneficial: this method selects the best
model for each formation based on the cross-validation error. It produces better results than selecting an
overall best algorithm.

In general, ROP data are nonlinear; deterministic models attempt to estimate these nonlinear data using power law or
exponential functions and as a result underfit the data. They provide great inferential insight into ROP modeling,
which is not available using machine learning. The hybrid models provide good inferential insight using additive
ROP equations. For the hybrid-One model, the “version(s)” of a deterministic model with the highest weight(s) (or
that which explains the highest variance) can be used for inference. The hybrid-N algorithm provides valuable
insights: the weight of a deterministic ROP model in the hybrid-N algorithm can be used to gain information about
the dominant drilling parameter which influences ROP. Feature importance of random forests and the weights of the
hybrid-N model showed synergy, indicating that both modeling techniques were converging to the same solution.
A case study implements these ideas to model ROP and gain insight into the drilling process for the Mission Canyon
Limestone formation.

Acknowledgements
The authors wish to thank the Wider Windows Industrial Affiliate Program, the University of Texas at Austin, for
financial and logistical support of this work. Project support from Wider Windows sponsors BHP Billiton, British
Petroleum, Chevron, ConocoPhillips, Halliburton, Marathon, National Oilwell Varco, Occidental Oil and Gas and
Shell is gratefully acknowledged.

Acronyms
• CCS: Confined compressive strength
• KNN: K nearest neighbors
• PDC: Polycrystalline diamond compact
• ROP: Rate of penetration
• RPM: Rotations per minute
• UCS: Unconfined compressive strength
• WOB: Weight-on-bit

References

Bingham, M. G. (1964). A new approach to interpreting Rock Drillability. The Oil and Gas Journal.

Bourgoyne Jr, A. T., & Young Jr, F. S. (1974). A multiple regression approach to optimal drilling and abnormal
pressure detection. Society of Petroleum Engineers Journal, 14(4), 371–384.

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324

Efron, B. (1987). Better bootstrap confidence intervals. Journal of the American Statistical Association, 82(397),
171–185.

Friedman, J., Hastie, T., & Tibshirani, R. (2001). The elements of statistical learning (Vol. 1). Springer series in
statistics New York.

Goldstein, E. B., Coco, G., Murray, A. B., & Green, M. O. (2014). Data-driven components in a model of inner-
shelf sorted bedforms: a new hybrid model. Earth Surface Dynamics, 2(1), 67.

Hareland, G., & Rampersad, P. R. (1994). Drag-bit model including wear. In SPE Latin America/Caribbean
Petroleum Engineering Conference. Society of Petroleum Engineers.

Hegde, C., Daigle, H., & Gray, K. (2018). Performance comparison of algorithms for real-time rate of penetration
optimization in drilling using data-driven models. SPE Journal.

Hegde, C., Daigle, H., Millwater, H., & Gray, K. (2017). Analysis of rate of penetration (ROP) prediction in drilling
using physics-based and data-driven models. Journal of Petroleum Science and Engineering, 159, 295–306.
https://doi.org/10.1016/j.petrol.2017.09.020

Hegde, C., & Gray, K. E. (2017). Use of machine learning and data analytics to increase drilling efficiency for
nearby wells. Journal of Natural Gas Science and Engineering, 40, 327–335.
https://doi.org/10.1016/j.jngse.2017.02.019

Hegde, C., Wallace, S., & Gray, K. (2015). Using trees, bagging, and random forests to predict rate of penetration
during drilling. In SPE Middle East Intelligent Oil and Gas Conference and Exhibition. Society of Petroleum
Engineers.

Hegde, Ch. (2016). Application of statistical learning techniques for rate of penetration (ROP) prediction in
drilling. The University of Texas at Austin.

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning: with applications
in R. Springer Texts in Statistics. https://doi.org/10.1007/978-1-4614-7138-7

Menand, S., & Mills, K. (2017). Use of Mechanical Specific Energy Calculation in Real-Time to Better Detect
Vibrations and Bit Wear While Drilling. Paper AADE-17-NTCE-0332017 Proceedings of the 2017 AADE
National Technical Conference and Exhibition Held at the Hilton Houston North Hotel, Houston, Texas, April
11-12.

Motahhari, H. R., Hareland, G., & James, J. A. (2010). Improved drilling efficiency technique using integrated PDM
and PDC bit parameters. Journal of Canadian Petroleum Technology, 49(10), 45–52.

Soares, C., Daigle, H., & Gray, K. (2016). Evaluation of PDC bit ROP models and the effect of rock strength on
model coefficients. Journal of Natural Gas Science and Engineering, 34, 1225–1236.
https://doi.org/10.1016/j.jngse.2016.08.012

Theloy, C. (2014). Integration of geological and technological factors influencing production in the Bakken play,
Williston Basin. Colorado School of Mines.
