
Machine Learning in Rock Facies Classification: An Application of XGBoost


Licheng Zhang, Cheng Zhan


Summary

Big data analysis has drawn much attention across different industries. Geoscientists, meanwhile, have been analyzing voluminous data for many years without even bragging about how big it is. In this paper, we present an application of machine learning, more specifically the gradient boosting method, to rock facies classification based on certain geological features and constraints. Gradient boosting is a popular and effective approach to classification that produces a prediction model as an ensemble of weak models, typically decision trees. The key to making gradient boosting work successfully lies in introducing a customized objective function and tuning the parameters iteratively based on cross-validation. Our model achieves a rather high F1 score when evaluated on data from two test wells.

Introduction and Background

Machine learning is emerging as a very promising area and should make the work of future geoscientists more fun and less tedious. Furthermore, with maturing neural network technology, geological interpretation could become more automatic and accurate; for example, in the Gulf of Mexico region, salt body characterization (challenging for the velocity model) might be elevated to the next level of higher-quality seismic images.

There are a few decision-tree-based algorithms for handling classification problems. One is the random forest, which operates by constructing multiple decision trees to reduce the possible variance error in each model. Another widely used technique is gradient boosting, which has been applied successfully in many Kaggle competitions. This method focuses on where the model performs poorly and improves those areas by introducing a new learner to compensate for the existing model.

This facies classification problem was originally introduced in The Leading Edge by Brendon Hall in October 2016 (Hall, 2016). It has evolved into the first machine learning contest in the SEG; more information can be found at https://github.com/seg/2016-ml-contest. At the time we submitted this paper, our ranking was 5th on the leaderboard.

The data are from the Council Grove gas reservoir in southwest Kansas. The Panoma Council Grove Field is predominantly a carbonate gas reservoir encompassing 2700 square miles in southwestern Kansas. The dataset consists of training data from ten wells (4149 examples), each example carrying seven predictor variables and a rock facies (class) label, and validation (test) data (830 examples from two wells) with the same seven predictor variables in the feature vector. Facies are based on the examination of cores from nine wells taken vertically at half-foot intervals. The predictor variables include five wireline log measurements and two geologic constraining variables derived from geologic knowledge; these are essentially continuous variables sampled at a half-foot rate.

The seven predictor variables are:

Five wireline log measurements:
- Gamma ray (GR)
- Resistivity logging (ILD_log10)
- Photoelectric effect (PE)
- Neutron-density porosity difference (DeltaPHI)
- Average neutron-density porosity (PHIND)

Two geologic constraints:
- Nonmarine-marine indicator (NM_M)
- Relative position (RELPOS)

The nine discrete facies (classes of rocks), their abbreviated labels, and the corresponding adjacent facies are listed in Table 1. The facies gradually blend into one another, and some neighboring facies are rather close, so mislabeling between neighbors can occur.

Table 1:

Class of rocks                 Facies   Label   Adjacent facies
Nonmarine sandstone            1        SS      2
Nonmarine coarse siltstone     2        CSiS    1,3
Nonmarine fine siltstone       3        FSiS    2
Marine siltstone and shale     4        SiSh    5
Mudstone                       5        MS      4,6
Wackestone                     6        WS      5,7
Dolomite                       7        D       6,8
Packstone-grainstone           8        PS      6,7,9
Phylloid-algal bafflestone     9        BS      7,8


Methodology

Generally speaking, there are three types of machine learning algorithms: supervised learning, unsupervised learning, and reinforcement learning. The application in this paper belongs to the category of supervised learning. This type of algorithm involves a target/outcome variable (or dependent variable) that is to be predicted from a given set of predictors (independent variables, usually called features). Using these feature variables, a function that maps inputs to the desired outputs is generated, and the training process continues until the model achieves a satisfactory level of accuracy on the training data. Examples of supervised learning include regression, decision trees, random forests, KNN, logistic regression, etc.

The algorithm adopted here is XGBoost (eXtreme Gradient Boosting), an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine learning algorithms under the gradient boosting framework. XGBoost provides parallel tree boosting (also known as GBDT or GBM) that solves many data science problems in a fast and accurate way. It was created and developed by Tianqi Chen, a Ph.D. student at the University of Washington. More details about XGBoost can be found at http://dmlc.cs.washington.edu/xgboost.html.

The basic idea of boosting is to combine hundreds of simple trees with low accuracy into a more accurate model. Every iteration generates a new tree for the model. There are many ways to create a new tree; a famous one is the gradient boosting machine proposed by Friedman (Friedman, 2001), which uses gradient descent to generate the new tree based on all previous trees, driving the objective function toward its minimum.

An objective function usually consists of two parts, a training loss and a regularization term:

    Obj(\theta) = L(\theta) + \Omega(\theta)    (1)

where L is the training loss function and \Omega is the regularization term. The training loss measures how well the model performs on the training data, while the regularization term controls the complexity of the model, which helps prevent overfitting. The complexity of each tree is defined as follows:

    \Omega(f) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2    (2)

There is, of course, more than one way to define the complexity, and this particular one works well in practice. The objective function in XGBoost is then defined as:

    obj = \sum_{j=1}^{T} \left[ G_j w_j + \frac{1}{2} (H_j + \lambda) w_j^2 \right] + \gamma T    (3)

More details about the notation can be found at http://xgboost.readthedocs.io/en/latest/model.html.
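Here T is the number of leaves in a tree, w_j is the score on leaf j, \gamma and \lambda are regularization constants, and G_j and H_j are the sums of the first- and second-order gradients of the loss over the examples assigned to leaf j. Following the standard XGBoost derivation, minimizing (3) with respect to each leaf weight gives the optimal score and the best achievable objective for a fixed tree structure:

    w_j^* = -\frac{G_j}{H_j + \lambda}, \qquad obj^* = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T

This latter quantity is the score XGBoost uses to compare candidate tree structures when deciding how to split.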


Data Analysis and Model Selection

Before building any machine learning model, it is necessary to perform some exploratory analysis and cleanup. First, we examine the data that will be used to train the classifier. The data consist of 5 wireline log measures, 2 indicator variables, and 1 facies label at half-foot intervals. In machine learning terminology, each log measurement is a feature vector that maps a set of 'features' (the log measures) to a class (the facies type).

The pandas library in Python is a great tool for loading the data into a dataframe structure for further manipulation.

Some basic statistical analyses are then produced, for example, the distribution of each class (Figure 1a), a heatmap of the features (Figure 1b), which gives a correlation plot for observing relationships between variables, and log plots for wells (Figure 1c). These figures are the initial building blocks for exploring the data; the visualization libraries are seaborn and matplotlib.
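As a rough sketch of this step in Python (the CSV file name and the 'Facies' column name are assumptions; the feature mnemonics follow the predictor table above):

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the training data into a dataframe (file name assumed).
data = pd.read_csv('facies_vectors.csv')

# Distribution of the nine facies classes (cf. Figure 1a).
sns.countplot(x='Facies', data=data)
plt.show()

# Correlation heatmap of the predictor variables (cf. Figure 1b).
features = ['GR', 'ILD_log10', 'DeltaPHI', 'PHIND', 'PE', 'NM_M', 'RELPOS']
sns.heatmap(data[features].corr(), annot=True, cmap='coolwarm')
plt.show()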


The next step is data preparation and model selection. The goal is to build a reliable model that predicts the Y values (Facies) based on the X values (the seven predictor variables).

To enhance XGBoost's speed over many iterations, we create a DMatrix format. This process sorts the data up front to optimize XGBoost's tree building and correspondingly reduces the runtime, which is especially helpful when learning with a large number of training examples.
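A minimal sketch of this conversion (X_train, y_train, and X_test are assumed to be the feature matrices and facies labels prepared from the dataframe):

import xgboost as xgb

# Pack the features and labels into XGBoost's optimized DMatrix structure;
# the data is pre-sorted internally, which speeds up repeated tree construction.
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test)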

On the other hand, in order to quantify the quality of the models, certain metrics are needed. We use accuracy metrics for judging the models. A simple and easy guide to the terminology (e.g., accuracy, precision, recall) can be found at http://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/.
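For instance, such metrics might be computed with scikit-learn as sketched below (y_true and y_pred are assumed to be the true and predicted facies labels):

from sklearn.metrics import accuracy_score, f1_score, classification_report

# Overall accuracy and a micro-averaged F1 score across the nine facies.
print('Accuracy:', accuracy_score(y_true, y_pred))
print('F1 (micro):', f1_score(y_true, y_pred, average='micro'))

# Per-facies precision, recall, and F1.
print(classification_report(y_true, y_pred))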

There are several main parameters to be tuned to get a good model for this rock facies classification problem; they are summarized in Table 2.

Table 2: main parameters

Learning rate: Step size shrinkage employed to prevent overfitting. It shrinks the feature weights to make the boosting process more conservative.
N_estimators: The number of trees.
Max_depth: Maximum depth of a tree; increasing this value makes the model more complex (and more likely to overfit).
Min_child_weight: Minimum sum of instance weight needed in a child.
Gamma: Minimum loss reduction required to make a further partition on a leaf node of the tree.
Subsample: Subsample ratio of the training instances.
Colsample_bytree: Subsample ratio of features when constructing each tree.
Objective: 'multi:softmax': Sets XGBoost to perform multiclass classification using the softmax objective.
nthread: Number of parallel threads used to run XGBoost.

Figure 1: (a) Distribution of facies. (b) Heatmap of features. (c) Log plots for wells SHRIMPLIN and SHANKLE.
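As an illustration of how these parameters map onto XGBoost's scikit-learn wrapper (the values shown are generic starting points, not the tuned settings of the final model):

import xgboost as xgb

clf = xgb.XGBClassifier(
    learning_rate=0.1,          # step size shrinkage
    n_estimators=200,           # number of trees
    max_depth=5,                # maximum depth of each tree
    min_child_weight=1,         # minimum sum of instance weight in a child
    gamma=0,                    # minimum loss reduction for a further split
    subsample=0.8,              # row subsampling ratio
    colsample_bytree=0.8,       # feature subsampling ratio per tree
    objective='multi:softmax',  # multiclass classification via softmax
    nthread=4,                  # number of parallel threads
)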


Algorithm parameter tuning is a critical process in achieving the optimal performance of a given algorithm, and it needs to be carefully justified before moving into production. Our workflow for optimizing the parameters is as follows:

- Pick initial parameters (e.g., default values)
- Tune the tree-based parameters (e.g., adjust max_depth and min_child_weight simultaneously)
- Calibrate gamma, subsample and colsample_bytree
- Balance the regularization parameters
- Reduce the learning rate and update the number of trees

The reason we adopt such a flow lies in the nature of the XGBoost algorithm, which is robust enough not to overfit as trees are added, whereas too high a learning rate can degrade its ability to predict new test data. As we reduce the learning rate and increase the number of trees, the computation becomes expensive and can take a long time on standard personal computers.

Grid search is a typical approach to parameter tuning that methodically builds and evaluates a model for each combination of parameters in a specified grid. For instance, the code below examines different combinations of 'max_depth' and 'min_child_weight'.
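A minimal sketch of such a search, assuming scikit-learn's GridSearchCV wrapped around an XGBClassifier (the candidate values and the X_train/y_train variables are illustrative):

from sklearn.model_selection import GridSearchCV
import xgboost as xgb

param_grid = {
    'max_depth': [3, 5, 7, 9],
    'min_child_weight': [1, 3, 5],
}

# Evaluate every combination in the grid with cross-validation.
search = GridSearchCV(
    estimator=xgb.XGBClassifier(learning_rate=0.1, n_estimators=200,
                                objective='multi:softmax', nthread=4),
    param_grid=param_grid,
    scoring='f1_micro',
    cv=5,
)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
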
Another way to tailor the parameters is random search, which complements the predefined grid search procedure currently being exploited. In this case, we did not find that random search improved the final results much.

After several iterations, the final model is built. Cross-validation is conducted to assess its performance before applying it to the two blind test wells. The best accuracy (F1 score) we have obtained so far is 0.564, ranked 5th in the contest. We also examine the feature importance plot of the model. Importance provides a score that indicates how useful or valuable each feature was in the construction of the boosted decision trees within the model. The more an attribute is used to make key decisions within the decision trees, the higher its relative importance.
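Such a plot can be produced with XGBoost's built-in helper, roughly as follows (clf is assumed to be the trained booster or classifier):

import matplotlib.pyplot as plt
from xgboost import plot_importance

# Rank the seven predictors by how often they are used in splits of the boosted trees.
plot_importance(clf)
plt.show()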


Conclusions

We have successfully applied the gradient boosting method to a rock facies classification problem. A potential application of such predictions could be to validate the velocity model for seismic data. This could be viewed as a commencing endeavor for more machine learning applications in the oil and gas sector in the near future.

Acknowledgments

The authors would like to thank Ted Petrou, Aiqun Huang and Zhongyang Dong for discussion. We also thank Yan Xu for reviewing the manuscript.

References

Chen, T. & Guestrin, C., 2016. XGBoost: A scalable tree boosting system. arXiv preprint arXiv:1603.02754.

Friedman, J. H., 2001. Greedy function approximation: a gradient boosting machine. Annals of Statistics, pp. 1189-1232.

Hall, B., 2016. Facies classification using machine learning. The Leading Edge, 35(10), pp. 906-909.

Natekin, A. & Knoll, A., 2013. Gradient boosting machines, a tutorial. Frontiers in Neurorobotics, 7.

