Application of Machine Learning Techniques in Mineral Classification

A R T I C L E  I N F O

Keywords:
Mineral segmentation
Machine learning
SEM-EDS
U-net
The Bakken Formation

A B S T R A C T

Mineral classification and segmentation are time-consuming in geological image processing. The development of machine learning methods shows promise as a technique for replacing manual classification. In this study, the performances of five shallow machine learning classification algorithms and a deep learning algorithm were compared for pixel-level mineral classification of Scanning Electron Microscopy - Energy Dispersive X-Ray Spectroscopy (SEM-EDS) images. Five machine learning models, including Logistic Regression (LR), Linear Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Random Forest (RF), and Artificial Neural Networks (ANN), and a deep learning CNN U-Net model were used in this study. Thirteen mineral phases were classified on SEM-EDS images of a shale sample taken from the Bakken Formation. Hyperparameters of the models were tuned using a grid-search method. A randomly selected balanced dataset was used for the shallow models, while the original cropped images were used for the U-Net.
The experimental results showed that all classification algorithms achieved high F1 scores, ranging from 0.86 to 0.92. The RF demonstrated the best performance among the five machine learning models, with an F1 score of 0.92. Additionally, sensitivity analysis on the size of the dataset demonstrated that the LR algorithm and the SVM were less sensitive to dataset reduction, while the k-NN, RF, and ANN models were more influenced. Sensitivity analysis of noise suggested that noise added to the elemental maps of silicon, aluminum, magnesium, calcium, potassium, and iron would decrease the performance of the RF. Furthermore, noise in silicon had the greatest effect on the prediction results compared to the other elements. In addition, the non-linear classifiers showed a larger performance score drop when noise was simultaneously added to all the element densities. Though the U-Net shows poor performance on the segmentation of minority classes due to the negative effect of the imbalanced dataset, the U-Net model still outperformed the RF model when it comes to unseen shale samples.
1. Introduction

A significant approach in image analysis for microstructure visualization and quantification of rock samples is the use of scanning electron microscopy (SEM) (Klaver et al., 2012; Kelly et al., 2016; Sun et al., 2019). Pore structure information and mineralogy obtained from SEM image analysis are the basis for digital rock analysis and related computational simulation work (Kong et al., 2019). Therefore, identifying the mineral phases on SEM images is intimately related to the subsequent analyses. Image segmentation involves partitioning and clustering the image into continuous and homogeneous regions (Haralick and Shapiro, 1985). Segmenting the pore space and organic matter is the relatively easier part due to their obviously lower grayscale values (dark) on SEM images (Wu et al., 2019) (Fig. 1). This can be done by a thresholding method, where all pixels with grayscale values above (or below) the threshold are assigned to a particular class (Andrew, 2018). However, this thresholding technique fails when other common shale minerals, such as quartz, feldspar, carbonate minerals, and clay minerals, are present, because of their similar white-gray color in backscattered SEM images (Fig. 1). Therefore, in order to classify these minerals in a backscattered SEM image, more information, such as Energy Dispersive X-Ray Spectroscopy (EDS) data, is needed.
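The grayscale thresholding described above can be sketched as follows. This is only an illustrative sketch: the cutoff values and the three-class split (pore, organic matter, mineral) are assumptions for demonstration, not values from this study.

```python
import numpy as np

# Simulated 8-bit backscattered SEM image (grayscale 0-255).
rng = np.random.default_rng(0)
bse = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# Assumed grayscale cutoffs (illustrative only): pores appear darkest,
# organic matter slightly brighter, and minerals occupy the white-gray range.
T_PORE, T_OM = 40, 90

labels = np.full(bse.shape, 2, dtype=np.uint8)   # 2 = mineral (default)
labels[bse < T_OM] = 1                           # 1 = organic matter
labels[bse < T_PORE] = 0                         # 0 = pore

print(np.unique(labels))
```

Because the white-gray minerals overlap in grayscale, no choice of `T_OM` alone can separate quartz from feldspar or carbonates, which is the limitation the text describes.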
* Corresponding author.
E-mail addresses: [email protected] (C. Li), [email protected] (L. Kong).
https://doi.org/10.1016/j.petrol.2020.108178
Received 5 March 2020; Received in revised form 11 November 2020; Accepted 23 November 2020
Available online 1 December 2020
0920-4105/© 2020 Elsevier B.V. All rights reserved.
C. Li et al. Journal of Petroleum Science and Engineering 200 (2021) 108178
Fig. 4. Schematic illustration of the procedure for acquiring mineralogy map through MAPS (modified from (Saif et al., 2017)).
Fig. 5. The workflow of generating the dataset for the shallow models.
map (Fig. 3). The acquisition of the mineral information was carried out by the software MAPS, an automated mineral mapping package commercialized by FEI. A mineral map of shale Sample 1, scanned over the same area as the element maps, was generated by MAPS. Mineralogy mapping is performed and captured in the following steps, as illustrated in Fig. 4 (Saif et al., 2017): a) the sample area is first divided into multiple tiles; b) the electron beam scans each tile to produce a BSE image; c) each analysis point is examined by the EDS X-ray detector and an X-ray spectrum is acquired; d) phase classification matches the observed spectrum at each point with known phases in a mineral database; and e) pixels are assigned to mineral compositions and a mineral map is generated.

A total of 13 main mineral phases, including K-feldspar, Quartz, Dolomite, Illite, Pyrite, Albite, Muscovite, Calcite, Annite, Organic Matter, Anorthite, Ankerite, and Chamosite, were recognized in the sample. Some minor minerals were also recognized by the software; these are labeled as 'unknown' in Fig. 3 and were ignored in this study. Different mineral phases are labeled with different colors in the mineral map (Fig. 3).

2.3. Feature extraction and dataset establishment

The five machine learning models share the same input dataset, while the deep learning CNN U-Net model uses a different one. For the shallow models, the input is the pixel-level grayscale values extracted from the element intensity maps, and the label data is the mineral class at each pixel. The U-Net model, in contrast, was trained end-to-end: the input is images cut from the element maps, and the output is the corresponding mineral map.

2.3.1. Feature extraction and dataset of machine learning models

The workflow of feature extraction for the shallow learning models is illustrated in Fig. 5. The input data are the intensity matrices extracted from the 12 element images. The brightness/intensity of each element at each pixel was extracted from the grayscale maps (Fig. 2), where the brightness or intensity values are integers ranging from 0 (black) to 255 (white); the brighter the pixel, the higher the intensity of the corresponding element. Each element intensity was normalized to the range of 0–1 by dividing the pixel values by 255. The densities of the 12 elements at the same pixel location were then extracted as a vector containing 12 element feature values (Fig. 5). The correlation matrix of these elemental intensities shows that the correlations between Al and K, Mg and Ca, and S and Fe are strongly positive, while Si and Ca, and Si and Mg, are highly negatively correlated (Fig. 6). This is easy to understand, as each mineral has a fixed chemical formula. For example, the chemical formula of K-feldspar is KAlSi3O8, and therefore the presence of the element Al is strongly related to the presence of the element K. Meanwhile, only one mineral can be present at each pixel, which accounts for the strong negative relationship between elements that do not occur within the same mineral. Additionally, the mineral classes at the corresponding pixels in the mineral map are the labeled ground truth data (Fig. 5), resulting in 13 mineral classes.

The mineral class distribution shown in Fig. 3 is imbalanced. The percentage of each mineral class was calculated to quantify the level of the imbalance. Table 1 shows the percentage of each mineral class for Sample 1; the sample comprises mostly K-feldspar, Quartz, Dolomite, and Illite.

Table 1
Mineral distribution in Sample 1.

Mineral/Class    Percentage (%)
Albite           1.14
K-feldspar       38.87
Quartz           28.25
Illite           12.91
Dolomite         14.14
Pyrite           2.24
Calcite          0.81
Muscovite        0.72
Annite           0.32
Ankerite         0.18
Anorthite        0.13
OM               0.16
Chamosite        0.13

To build a balanced dataset for training and validation, 1200 pixels per class were randomly selected, which resulted in 15,600 total pixels (1200 pixels per class × 13 classes) (Fig. 5). A matrix of size 15,600 × 12 was used as input, where 15,600 denotes the total number of samples and 12 denotes the 12 element density features extracted at each data point/pixel. Additionally, the ground truth data is a vector of size 15,600 × 1 which contains the mineral class labels (Fig. 5).

The dataset of 15,600 data points from Sample 1 was split into training and validation datasets, where the cross-validation dataset was 25% of the entire dataset and the training process comprised a 4-fold cross-validation. In the end, after training and validation, the best selected model was then applied to Sample 2.

2.3.2. Data processing and dataset for U-Net

Compared to the machine learning models, for which the input data are the element intensity matrices extracted from the element intensity maps, the training of the U-Net is end-to-end. The element maps (Fig. 2) and the mineral map (Fig. 3) are sliced into training samples (cropped images) of 128 × 128 pixels, which are also flipped in both vertical and horizontal directions to generate sufficient training samples through a data augmentation procedure. Each input sample has a total of 12 channels (elements), and the output has 13 channels (mineral classes). After data augmentation, there are a total of 147 samples for training and testing, which were further divided into training and model validation sets with fractions of 75% and 25%.

2.4. Performance metric

To evaluate the prediction performance of the different models, the multiclass version of the F1 score was used. The F1 score is defined as follows (Müller et al., 2016):

F1 = 2 · (Precision · Recall) / (Precision + Recall)    (1)

Precision = TP / (TP + FP)    (2)

Recall = TP / (TP + FN)    (3)
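Eqs. (1)–(3), together with the macro and micro averaging strategies used to report the results, can be written out directly. This is a minimal NumPy sketch, not the authors' implementation:

```python
import numpy as np

def f1_scores(y_true, y_pred, n_classes):
    """Per-class F1 from Eqs. (1)-(3), plus macro and micro averages."""
    f1_per_class = []
    tp_all = fp_all = fn_all = 0
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0   # Eq. (2)
        recall = tp / (tp + fn) if tp + fn else 0.0      # Eq. (3)
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)            # Eq. (1)
        f1_per_class.append(f1)
        tp_all, fp_all, fn_all = tp_all + tp, fp_all + fp, fn_all + fn
    # Macro: average the per-class F1 scores, so minority classes count equally.
    macro = float(np.mean(f1_per_class))
    # Micro: pool TP/FP/FN over all pixels; for single-label multiclass data
    # this equals the overall pixel accuracy.
    p = tp_all / (tp_all + fp_all)
    r = tp_all / (tp_all + fn_all)
    micro = 2 * p * r / (p + r)
    return macro, micro

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
macro, micro = f1_scores(y_true, y_pred, 3)
```

The gap between macro and micro scores is what reveals poor minority-class performance later in the paper: a model can misclassify a rare phase entirely while the pixel-pooled micro score barely moves.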
3.1.4. Random Forest

The Random Forest algorithm works effectively on a variety of problems. It is an ensemble of multiple decision trees (Breiman, 2001; Liaw et al., 2002). The idea is to address the issue that an individual decision tree may be prone to overfitting a portion of the data. By combining different individual decision trees into an ensemble (Fig. 8), a random forest can average out the individual mistakes and reduce the risk of overfitting (Breiman, 2001). A random forest creates tens to hundreds of individual decision trees on a training set, and each individual tree is constructed by introducing random variation (Liaw et al., 2002). This random variation during tree building happens in two ways. First, the data used to train an individual tree in the forest ensemble, referred to as the sub-sample or bootstrap sample, is selected randomly. Second, in an individual decision tree, the best feature to split a node is picked from a randomly selected subset of features, instead of from all possible features. Randomizing these two processes guarantees that all the decision trees in the random forest are different. Once the random forest is trained, each individual tree makes a prediction for the target classes, and for a classification problem the overall prediction is based on a weighted vote across all trees (Breiman, 2001). An advantage of the random forest is that the data do not need to be preprocessed. However, to achieve good performance it is critical to tune the important hyperparameters, which include the maximum depth of the trees and the maximum number of features.

3.1.5. Artificial Neural Network

Among various machine learning algorithms, one of the most popular is the Artificial Neural Network (ANN). This algorithm can extract both implicit and complex data correlations from large amounts of training data. The typical structure of an ANN model consists of a series of layers. Each layer contains a number of "neuron" units and computes a weighted sum of its inputs plus a bias term, followed by a non-linear transformation (Zurada, 1992). The results obtained by this procedure are then fed into the next layer. The training process involves minimizing the difference between the true and predicted values; during this process, the weights and biases of each layer are iteratively updated by the backpropagation algorithm. Due to the nonlinear activation functions and hidden neurons, deep neural networks can be established to deal with situations where input-output mappings are extensively complex.

Fig. 10. Effect of regularization parameter on performance of (a) logistic regression classifier and (b) linear SVM.
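The two sources of randomness described above can be sketched with scikit-learn's RandomForestClassifier. The data below are invented stand-ins that only mimic the 12-feature layout of this study; note also that scikit-learn averages class probabilities across trees rather than taking a strict hard vote, so the manual vote is only an approximation of `forest.predict`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the 12-element feature vectors (not the paper's data).
rng = np.random.default_rng(1)
X = rng.random((600, 12))
y = (X[:, 0] + X[:, 3] > 1.0).astype(int)   # arbitrary rule for a toy 2-class label

# bootstrap=True draws a random sub-sample of the rows for each tree;
# max_features limits the features considered at each node split.
forest = RandomForestClassifier(
    n_estimators=50, bootstrap=True, max_features=5,
    min_samples_leaf=4, random_state=0,
).fit(X, y)

# Approximate the ensemble decision by a majority vote over the trees.
votes = np.mean([tree.predict(X[:5]) for tree in forest.estimators_], axis=0)
print((votes > 0.5).astype(int), forest.predict(X[:5]))
```

Because every tree sees a different bootstrap sample and a different feature subset at each split, the trees err in different places, and the vote averages those errors out.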
3.2. U-Net
Table 2
Prediction performance of different classifiers.

Learning models    F1-score

For the shallow machine learning models, the validation dataset was used to measure the prediction performance of each classifier. Performance results obtained by cross-validation are shown in Table 2. With the exception of the linear SVM, the F1 scores show only slight contrast when different averaging strategies are applied. The Random Forest classifier, with a micro F1 score of 0.9238, performed best among the five shallow models, followed by the k-NN, LR, ANN, and linear SVM. Moreover, the scores calculated by the different averaging strategies show only slight differences, meaning that models trained on a balanced dataset can perform well regardless of the distribution of mineral phases.

For the U-Net model, the following scores were observed on the validation dataset: F1-micro (0.8832), F1-weighted (0.8784), and F1-macro (0.7301). The F1 score averaged over all classes (F1-macro) is much lower than the score averaged over all pixels (F1-micro), meaning that the U-Net model showed poor performance when classifying the minor classes. This is mainly due to the negative effect of the imbalanced training dataset.

Fig. 12. The effect of (a) minimum samples at each leaf, (b) number of decision trees, and (c) maximum features on the performance.

By default, the trees are grown fully on the training dataset (Fig. 12a). This leads to trees built by the base algorithm being prone to overfitting, as they become incredibly large and complex. The model can be simplified by setting a lower limit on the minimum number of samples in an individual leaf (min_sample_leaf); the simplified model is referred to as the 'pruned tree'. In this paper, we tested the performance of the RF classifier on both training and validation datasets by cross-validation and examined the effect of max_feature and min_sample_leaf on the prediction results. The outcome showed that as min_sample_leaf increased, the score of the classifier on the training dataset decreased, while the prediction performance on the validation data improved as the parameter increased from 1 to 2. Only a slight difference in performance was observed when the parameter was in the range of 2–4, with F1 scores ranging from 0.918 to 0.920. The optimal parameters max_feature = 5 and min_sample_leaf = 4 were chosen for the RF classifier in this study. Additionally, the effect of the number of trees in the forest (n_estimator) on classifier performance was evaluated as well. The results show that n_estimator = 50 provided the best performance for the RF classifier.

Fig. 13. Comparison of the performance of different models on various dataset sizes.
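The grid-search tuning described above can be sketched with scikit-learn's GridSearchCV. The data here are random stand-ins, and the grid simply mirrors the hyperparameters named in the text (scikit-learn spells them `min_samples_leaf`, `max_features`, and `n_estimators`); this is an assumed reconstruction of the procedure, not the authors' script.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy stand-in for the balanced pixel dataset (12 features per pixel).
rng = np.random.default_rng(2)
X = rng.random((400, 12))
y = rng.integers(0, 3, size=400)

param_grid = {
    "min_samples_leaf": [1, 2, 3, 4],   # pruning strength
    "max_features": [3, 5, 7],          # features considered per split
    "n_estimators": [25, 50],           # trees in the forest
}

# 4-fold cross-validation scored with the micro-averaged F1, as in the paper.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid, cv=4, scoring="f1_micro",
).fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Every parameter combination is refit on each of the 4 folds, so the grid above costs 24 × 4 model fits; coarse grids refined around the best cell keep this tractable.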
Fig. 16. Mineral maps with (a) ground truth labels, and predicted maps from (b) RF and (c) U-Net.
still outperforms the others. However, when comparing the drop in prediction score, it was found that the score drop of the non-linear classifiers was larger than that of the linear classifiers (Fig. 15b). Compared to the relatively simple linear classifiers, when trained on a dataset in the presence of noise, the k-NN, RF, and ANN models tended to overfit the noisy data, resulting in larger performance score drops.

4.4. An example of applying the RF classifier and U-Net on an unseen sample
ground truth map, most of the pixels were correctly predicted by this model. In terms of mineral classes, the major minerals, including quartz, illite, pyrite, dolomite, and calcite, were correctly recognized. However, when looking at the wrong predictions, it was observed that most of these errors are related to k-feldspar (Table 3), which was predicted as either Quartz, Illite, or Albite. A potential reason is that these minerals share similar constituent elements. For example, comparing the chemical formula of k-feldspar, KAlSi3O8, with that of Illite, K0.65Al2.0[Al0.65Si3.35O10](OH)2, both contain the elements Al, K, Si, O, and H. It appears that the close intensities of these elements at a pixel can lead to the classifier failing to distinguish one mineral from another. Further efforts are needed to improve the RF model's performance regarding this issue. Moreover, some discrete K-feldspar pixels were wrongly predicted as Quartz; a potential reason is that the elemental intensity obtained from the X-rays of these isolated k-feldspar pixels is prone to interference from the surrounding Quartz pixels, and therefore they tend to be identified as Quartz.

Compared to the RF model, the mineral map predicted by the U-Net model shows better performance on isolated small particles, because the U-Net model takes not only the intensity information but also the location information of the input data. Another characteristic of the U-Net predictions is that the particle shapes tend to be rounder than in the ground truth map (Fig. 16a), as the U-Net was originally designed for separating cells in medical research. Furthermore, the U-Net failed to predict the minor mineral classes in some pixels; for example, it missed several Muscovite particles shown in brown. This is mainly due to the shortage of minor-class examples in the imbalanced training dataset for the U-Net model.

5. Conclusion

In this study, mineral segmentation of SEM-EDS images using five shallow machine learning algorithms and a deep learning CNN U-Net model was implemented and compared. For the shallow learning models, balanced datasets with different training sample sizes were tested. Additionally, a sensitivity analysis of the effect of noise on each model was also examined. Finally, the trained RF model and U-Net were applied to an unseen sample to compare their performance on mineral segmentation.

Results demonstrate that all classification algorithms show a high overall score ranging from 86% to 92%. Random Forest demonstrates the best performance among the five shallow models, with an F1 score of 0.92. Sensitivity analysis on dataset size shows that Logistic Regression and the linear SVM were less sensitive to dataset size, while k-Nearest Neighbors, Random Forest, and ANN were more sensitive to the reduction of the size of the training dataset. Sensitivity analysis of noise indicates that noise added to the elemental maps of silicon, aluminum, magnesium, calcium, potassium, and iron would decrease the performance of the RF due to their wider distribution in the element maps. When it comes to unseen shale samples, although the U-Net shows relatively poor performance on segmenting the minor mineral classes due to the negative effect of an imbalanced dataset, it still outperformed the RF model in terms of correctly classifying more pixels (higher F1 score) and better segmenting small isolated particles.

Credit author statement

Chunxiao Li: Conceptualization, Methodology, Data Analytics, Programming, Writing - Original draft preparation. Dongmei Wang: Supervision, Reviewing and Editing. Lingyun Kong: Programming for U-Net Model, Critical Revision, Language Improvement, and Proofreading.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The authors would like to thank the North Dakota Geological Survey and Core Library for allowing us access to the shale sample, particularly Jeffrey Bader, state geologist and director, as well as Kent Hollands, laboratory technician.

Nomenclature

SEM    Scanning electron microscopy
LR     Logistic Regression
SVM    Linear Support Vector Machine
k-NN   k-Nearest Neighbor
RF     Random Forest
ANN    Artificial Neural Networks
EDS    Energy Dispersive X-ray Spectroscopy
MAPS   Modular automated processing system
TP     True positive prediction
FP     False positive prediction
FN     False negative prediction

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.petrol.2020.108178.

References

Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., et al., 2016. TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283.
Al-Obaidi, M., Heidari, Z., Casey, B., Williams, R., Spath, J., 2018. Automatic well-log-based fabric-oriented rock classification for optimizing landing spots and completion intervals in the midland basin. In: SPWLA 59th Annual Logging Symposium. Society of Petrophysicists and Well-Log Analysts.
Andrew, M., 2018. A quantified study of segmentation techniques on synthetic geological XRM and FIB-SEM images. Comput. Geosci. 22, 1503–1512. https://doi.org/10.1007/s10596-018-9768-y.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.
Cover, T., Hart, P., 1967. Nearest neighbor pattern classification. IEEE Trans. Inf. Theor. 13, 21–27.
Dreiseitl, S., Ohno-Machado, L., 2002. Logistic regression and artificial neural network classification models: a methodology review. J. Biomed. Inf. 35, 352–359. https://doi.org/10.1016/S1532-0464(03)00034-0.
Esmaeilzadeh, S., Salehi, A., Hetz, G., Olalotiti-lawal, F., Darabi, H., Castineira, D., 2020. Multiscale modeling of compartmentalized reservoirs using a hybrid clustering-based non-local approach. J. Petrol. Sci. Eng. 184, 106485. https://doi.org/10.1016/j.petrol.2019.106485.
Esmaeilzadeh, S., Salehi, A., Hetz, G., Olalotiti-lawal, F., Darabi, H., Castineira, D., 2019. A general spatio-temporal clustering-based non-local formulation for multiscale modeling of compartmentalized reservoirs. In: SPE Western Regional Meeting. https://doi.org/10.2118/195329-MS.
Guntoro, P.I., Tiu, G., Ghorbani, Y., Lund, C., Rosenkranz, J., 2019. Application of machine learning techniques in mineral phase segmentation for X-ray microcomputed tomography (μCT) data. Miner. Eng. 142, 105882. https://doi.org/10.1016/j.mineng.2019.105882.
Gupta, I., Rai, C., Sondergeld, C.H., Devegowda, D., 2018. Rock typing in eagle ford, barnett, and woodford formations. SPE Reservoir Eval. Eng. 21, 654–670. https://doi.org/10.2118/189968-PA.
Haralick, R.M., Shapiro, L.G., 1985. Image segmentation techniques. Comput. Vis. Graph Image Process 29, 100–132.
Hosmer Jr., D.W., Lemeshow, S., Sturdivant, R.X., 2013. Applied Logistic Regression. John Wiley & Sons.
Izadi, H., Sadri, J., Bayati, M., 2017. An intelligent system for mineral identification in thin sections based on a cascade approach. Comput. Geosci. 99, 37–49. https://doi.org/10.1016/j.cageo.2016.10.010.
Izadi, H., Sadri, J., Mehran, N.-A., 2015. A new intelligent method for minerals segmentation in thin sections based on a novel incremental color clustering. Comput. Geosci. 81, 38–52. https://doi.org/10.1016/j.cageo.2015.04.008.
Jung, H., Jo, H., Kim, S., Lee, K., Choe, J., 2018. Geological model sampling using PCA-assisted support vector machine for reliable channel reservoir characterization. J. Petrol. Sci. Eng. 167, 396–405. https://doi.org/10.1016/j.petrol.2018.04.017.
Kelly, S., El-Sobky, H., Torres-Verdín, C., Balhoff, M.T., 2016. Assessing the utility of FIB-SEM images for shale digital rock physics. Adv. Water Resour. 95, 302–316. https://doi.org/10.1016/j.advwatres.2015.06.010.
Klaver, J., Desbois, G., Urai, J.L., Littke, R., 2012. BIB-SEM study of the pore space morphology in early mature Posidonia Shale from the Hils area, Germany. Int. J. Coal Geol. 103, 12–25. https://doi.org/10.1016/j.coal.2012.06.012.
Knaup, A., Jernigen, J., Curtis, M., Sholeen, J., Borer, J.I., Sondergeld, C., Rai, C., 2019. Unconventional reservoir microstructural analysis using SEM and machine learning. In: SPE/AAPG/SEG Unconventional Resources Technology Conference. https://doi.org/10.15530/urtec-2019-638.
Kong, L., Ostadhassan, M., Hou, X., Mann, M., Li, C., 2019. Microstructure characteristics and fractal analysis of 3D-printed sandstone using micro-CT and SEM-EDS. J. Petrol. Sci. Eng. 175, 1039–1048.
Li, C., Ostadhassan, M., Abarghani, A., Fogden, A., Kong, L., 2019. Multi-scale evaluation of mechanical properties of the Bakken shale. J. Mater. Sci. 54, 2133–2151.
Li, C., Ostadhassan, M., Guo, S., Gentzis, T., Kong, L., 2018. Application of PeakForce tapping mode of atomic force microscope to characterize nanomechanical properties of organic matter of the Bakken Shale. Fuel 233, 894–910. https://doi.org/10.1016/j.fuel.2018.06.021.
Li, H., Misra, S., 2019. Long short-term memory and variational autoencoder with convolutional neural networks for generating NMR T2 distributions. Geosci. Rem. Sens. Lett. IEEE 16, 192–195. https://doi.org/10.1109/LGRS.2018.2872356.
Liaw, A., Wiener, M., 2002. Classification and regression by randomForest. R News 2, 18–22.
Luo, X., 2019. Ensemble-based kernel learning for a class of data assimilation problems with imperfect forward simulators. PLoS One 14, 1–40. https://doi.org/10.1371/journal.pone.0219247.
Marmo, R., Amodio, S., Tagliaferri, R., Ferreri, V., Longo, G., 2005. Textural identification of carbonate rocks by image processing and neural network: methodology proposal and examples. Comput. Geosci. 31, 649–659. https://doi.org/10.1016/j.cageo.2004.11.016.
Miao, X., Wang, J., Wang, Z., Sui, Q., Gao, Y., Jiang, P., 2019. Automatic recognition of highway tunnel defects based on an improved U-net model. IEEE Sens. J. https://doi.org/10.1109/JSEN.2019.2934897.
Misra, S., Li, H., He, J., 2019. Machine Learning for Subsurface Characterization. Gulf Professional Publishing.
Müller, A.C., Guido, S., 2016. Introduction to Machine Learning with Python: a Guide for Data Scientists. O'Reilly Media, Inc.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., 2011. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.
Pirrie, D., Butcher, A.R., Power, M.R., Gottlieb, P., Miller, G.L., 2004. Rapid quantitative mineral and phase analysis using automated scanning electron microscopy (QemSCAN); potential applications in forensic geoscience. Geol. Soc. Lond. Spec. Publ. 232, 123–136. https://doi.org/10.1144/GSL.SP.2004.232.01.12.
Probst, P., Wright, M.N., Boulesteix, A.-L., 2019. Hyperparameters and tuning strategies for random forest. WIREs Data Min. Knowl. Discov. 9, e1301. https://doi.org/10.1002/widm.1301.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, Lecture Notes in Computer Science. Springer International Publishing, Cham, pp. 234–241. https://doi.org/10.1007/978-3-319-24574-4_28.
Saif, T., Lin, Q., Butcher, A.R., Bijeljic, B., Blunt, M.J., 2017. Multi-scale multi-dimensional microstructure imaging of oil shale pyrolysis using X-ray micro-tomography, automated ultra-high resolution SEM, MAPS Mineralogy and FIB-SEM. Appl. Energy 202, 628–647. https://doi.org/10.1016/j.apenergy.2017.05.039.
Smith, M.G., Bustin, R.M., 1996. Lithofacies and paleoenvironments of the upper devonian and lower mississippian Bakken Formation, Williston Basin. Bull. Can. Petrol. Geol. 44, 495–507.
Sun, W., Zuo, Y., Wu, Z., Liu, H., Xi, S., Shui, Y., Wang, J., Liu, R., Lin, J., 2019. Fractal analysis of pores and the pore structure of the Lower Cambrian Niutitang shale in northern Guizhou province: investigations using NMR, SEM and image analyses. Mar. Petrol. Geol. 99, 416–428. https://doi.org/10.1016/j.marpetgeo.2018.10.042.
Suykens, J.A., Vandewalle, J., 1999. Least squares support vector machine classifiers. Neural Process. Lett. 9, 293–300.
Tang, D., Spikes, K., 2017. Segmentation of shale SEM images using machine learning. In: SEG Technical Program Expanded Abstracts 2017. Society of Exploration Geophysicists, Houston, Texas, pp. 3898–3902. https://doi.org/10.1190/segam2017-17738502.1.
Tang, Y., Zhang, Y.-Q., Chawla, N.V., Krasser, S., 2008. SVMs modeling for highly imbalanced classification. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 39, 281–288.
Temirchev, P., Gubanova, A., Kostoev, R., Gryzlov, A., Voloskov, D., Koroteev, D., Simonov, M., Akhmetov, A., Margarit, A., Ershov, A., 2019. Reduced order reservoir simulation with neural-network based hybrid model. In: SPE Russian Petroleum Technology Conference. Society of Petroleum Engineers. https://doi.org/10.2118/196864-MS.
Temirchev, P., Simonov, M., Kostoev, R., Burnaev, E., Oseledets, I., Akhmetov, A., Margarit, A., Sitnikov, A., Koroteev, D., 2020. Deep neural networks predicting oil movement in a development unit. J. Petrol. Sci. Eng. 184, 106513. https://doi.org/10.1016/j.petrol.2019.106513.
Ulker, E., Sorgun, M., 2016. Comparison of computational intelligence models for cuttings transport in horizontal and deviated wells. J. Petrol. Sci. Eng. 146, 832–837. https://doi.org/10.1016/j.petrol.2016.07.022.
Wang, L., 2005. Support Vector Machines: Theory and Applications. Springer Science & Business Media.
Wu, Y., Misra, S., Sondergeld, C., Curtis, M., Jernigen, J., 2019. Machine learning for locating organic matter and pores in scanning electron microscopy images of organic-rich shales. Fuel 253, 662–676. https://doi.org/10.1016/j.fuel.2019.05.017.
Zurada, J.M., 1992. Introduction to Artificial Neural Systems. West, St. Paul.