Data-Driven Machine Learning

This study presents data-driven machine learning approaches for lithofacies identification in complex geological environments, particularly focusing on the Lower Indus Basin. Four methods—multi-resolution graph-based clustering (MRGC), artificial neural networks (ANN), K-nearest neighbors (KNN), and self-organizing maps (SOM)—are evaluated for their effectiveness based on varying core sample availability. The findings indicate that MRGC is optimal for limited data scenarios, while KNN is more suitable for larger datasets, highlighting the importance of method selection in enhancing lithofacies identification in the petroleum industry.


Geo-spatial Information Science

ISSN: (Print) (Online) Journal homepage: www.tandfonline.com/journals/tgsi20

Data-driven machine learning approaches for precise lithofacies identification in complex geological environments

Muhammad Ali, Peimin Zhu, Ma Huolin, Ren Jiang, Hao Zhang, Umar Ashraf
& Wakeel Hussain

To cite this article: Muhammad Ali, Peimin Zhu, Ma Huolin, Ren Jiang, Hao Zhang, Umar Ashraf & Wakeel Hussain (18 Oct 2024): Data-driven machine learning approaches for precise lithofacies identification in complex geological environments, Geo-spatial Information Science, DOI: 10.1080/10095020.2024.2405635

To link to this article: https://doi.org/10.1080/10095020.2024.2405635

© 2024 Wuhan University. Published by Informa UK Limited, trading as Taylor & Francis Group.

Published online: 18 Oct 2024.

Full Terms & Conditions of access and use can be found at https://www.tandfonline.com/action/journalInformation?journalCode=tgsi20

Data-driven machine learning approaches for precise lithofacies identification in complex geological environments

Muhammad Ali (a,b), Peimin Zhu (a), Ma Huolin (a), Ren Jiang (c), Hao Zhang (a), Umar Ashraf (d) and Wakeel Hussain (a)

(a) Institute of Geophysics & Geomatics, China University of Geosciences, Wuhan, China; (b) State Key Laboratory of Geomechanics and Geotechnical Engineering, Institute of Rock and Soil Mechanics, Chinese Academy of Sciences, Wuhan, China; (c) Research Institute of Petroleum Exploration and Development, PetroChina Company Limited, Beijing, China; (d) School of Ecology and Environmental Sciences, Yunnan University, Kunming, China

ABSTRACT

Reservoir characterization is a vital task within the oil and gas industry, with the identification of lithofacies in subsurface formations being a fundamental aspect of this process. However, lithofacies identification in complex geological environments with high dimensions, such as the Lower Indus Basin in Pakistan, poses a notable challenge, especially when dealing with limited data. To address this issue, we propose four common data-driven machine learning approaches: multi-resolution graph-based clustering (MRGC), artificial neural networks (ANN), K-nearest neighbors (KNN), and self-organizing map (SOM). We utilized these proposed approaches to assess their performance in scenarios with varying core sample availability, specifically evaluating their effectiveness in identifying lithofacies within the Lower Goru formation of the middle Indus Basin. The study reveals that in scenarios with a limited number of core samples, MRGC is the preferred choice, while KNN or MRGC is more suitable for larger datasets. The results demonstrate the superior performance of MRGC and KNN in lithofacies identification within the specified geological environment, with SOM following closely behind, and ANN exhibiting comparatively lower efficacy. The accurate identification of lithofacies from the selected model is complemented by the application of the truncated Gaussian simulation method for facies modeling. Comparative results confirm the excellent agreement between the model identification of lithofacies from well logs and electro-facies obtained from the truncated Gaussian simulation electro-facies volume. This study highlights the crucial role of selecting the right machine learning approach for precise lithofacies identification and modeling in complex geological environments. The comparative analysis provides practitioners in the petroleum industry with insights into the strengths and limitations of each method, enhancing existing knowledge. In conclusion, this research emphasizes the significance of comprehensive research and method selection for advancing lithofacies identification in diverse formations or study areas, ultimately benefiting the broader field of subsurface characterization in the petroleum industry.

ARTICLE HISTORY
Received 4 October 2023
Accepted 12 September 2024

KEYWORDS
Reservoir characterization; lithofacies identification; machine learning; core sample availability; truncated Gaussian simulation

1. Introduction
Lithofacies classification is the foundation of geological surveys, using logging data to assign lithofacies types to rock samples (Dubois, Bohling, and Chakrabarti 2007). Lithology is of great significance in reservoir evaluation. Different lithofacies have specific ranges of pore and permeability changes, and pore and permeability data can be fitted better according to the lithofacies result (Chang, Kopaska, and Chen 2002). The study of sedimentary microfacies is crucial for accurate reservoir prediction, especially in scenarios with limited coring well data. Precise lithofacies identification using logging data becomes crucial for predicting tight sandstone formations (Ali et al. 2023a). The inherent challenges of tight reservoirs, characterized by matrix microporosity, low permeability, complex layering, and strong heterogeneity, highlight the difficulty of achieving high-precision lithofacies identification based on the classification of rock structure components. The complex layering and strong heterogeneity further compound the difficulty of accurately classifying lithofacies based solely on the classification of rock structure components (Bloch, Lander, and Bonnell 2002; Lai et al. 2018b; Zhang, Ambrose, and Xie 2021). In this context, the utilization of traditional methods that heavily rely on extensive core sample images for lithofacies prediction has proven challenging due to limited data availability. Therefore, a conventional approach plays a crucial role in lithofacies identification to obtain information about lithofacies from tight reservoirs (Liu et al. 2020; Lyu et al. 2019; Valentín et al. 2019; Zhou et al. 2016). Several traditional techniques have been proposed to classify lithofacies beyond core samples. One such

CONTACT Peimin Zhu zhupm@cug.edu.cn

© 2024 Wuhan University. Published by Informa UK Limited, trading as Taylor & Francis Group.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The terms on which this article has been published allow the posting of the Accepted Manuscript in a repository by the author(s) or with their consent.

method is conventional logging identification, which includes the structural characteristic parameter method (Feng et al. 2018; Li et al. 2020), the curve overlap method (Lai et al. 2020), etc. However, these methods often depend more on the experience and knowledge of interpreters and generally exhibit low accuracy. The second approach involves special logging identification methods, such as the use of imaging logging to identify lithofacies. However, these methods are challenging to apply widely due to their high cost. To address this issue, the conventional method enables more efficient and accurate reservoir lithofacies classification using efficient and reliable machine learning methods in complex reservoirs for exploration, development, and production.

In recent years, with the focus on leveraging artificial intelligence for subsurface characterization in the oil and gas industry, there has been notable progress (Antariksa, Muammar, and Lee 2022; Song et al. 2021; Valentín et al. 2019). Chawshin et al. (2021) developed a convolutional neural network (CNN) that utilizes 2D core CT scan image slices as input to perform automatic lithofacies prediction. Kim (2022) developed an integrated approach for lithofacies classification in the Eagle Ford shale and the Austin Chalk, addressing challenges in defining geological heterogeneity in these unconventional reservoirs. They utilized core samples, thin sections, SEM images, and wireline logs, including natural gamma (GR), deep resistivity (LLD), sonic (DT), and density (RHOB) logs. A CNN model was trained to classify four lithofacies based on various geological features, providing a data-driven method for reservoir characterization. Zhen et al. (2023) utilized classical boosting machine learning algorithms to identify deep-water submarine fan lithofacies types in a West African oilfield. By addressing sample imbalance through oversampling techniques and optimizing hyperparameters with a genetic algorithm, the proposed MAHAKIL-GA-GBDT algorithm achieved a high accuracy of 0.986. Dixit, McColgan, and Kusler (2020) applied machine learning algorithms to predict rock facies in the Umiat Oil Field of Alaska. Utilizing limited core data and mineralogical information, they identified five sand reservoir lithofacies in the Lower Member Grandstand. The integration of machine learning algorithms, including self-organizing maps, with wireline log data resulted in successful facies predictions, particularly validated in nearby uncored wells using observed seismic data. Geng et al. (2020) proposed a learning-based non-linear regression method, support vector machine regression (SVR), for accurate modeling of the radiometric transforming relation in relative radiometric normalization (RRN) of coarse-resolution data. They conducted experiments, including synthetic and real data, comparing SVR with other methods like linear regression, artificial neural network (ANN), and random forest (RF), demonstrating the effectiveness of SVR in enhancing RRN performance. Al-Qaness et al. (2022) utilized a model, AOOBL-ANFIS, to enhance the adaptive neuro-fuzzy inference system (ANFIS) for oil production estimation. They optimized ANFIS parameters using a modified aquila optimizer (AO) with the opposition-based learning (OBL) technique. The AOOBL-ANFIS model outperformed classic ANFIS, other modified ANFIS models, and time series forecasting methods in terms of performance metrics and computational time. Pei et al. (2023) introduced a deep learning-based algorithm called FCN-Attention for classifying line-of-sight (LOS) and non-line-of-sight (NLOS) propagation in ultra-wideband (UWB) location-based services. FCN-Attention utilizes a fully convolutional network (FCN) and a self-attention mechanism to improve feature extraction and description, achieving high classification accuracies on various datasets and outperforming existing algorithms. Alzubaidi et al. (2021) introduced a CNN-based method that used core images to predict lithology automatically and quickly, but the method did not perform well in the subdivision of rock types. Zhang et al. (2021) used convolutional neural networks to build a deep learning model for lithofacies identification from core images. This work effectively provided the first-glance analysis of core data; however, the generalization of the model needed to be improved. Although these methods greatly reduced the identification time, they still required many core sample images for network training, and labeling the samples was also a challenge. Therefore, relatively low-cost well logs instead of core samples were used for lithofacies identification.

Log data is widely used in lithofacies identification and evaluation due to its high vertical resolution and good continuity (Lai et al. 2018a). The composition and structure of the reservoir will lead to different lithofacies being divided, and the corresponding logging response will also be different (Hemmesch et al. 2014; Ozkan et al. 2011). Therefore, Bhattacharya, Carr, and Pal (2016) input five one-dimensional logs and other derived parameters into the lithofacies model using three machine learning algorithms, such as ANN, and proved that lithofacies identification could be modeled in that way. Similarly, Wu et al. (2020) selected deep resistivity (RT), spontaneous potential (SP), natural gamma (GR), sonic (DT), compensated neutron (CNL), and density (DEN) to summarize the logging response characteristics of five lithofacies based on the experimental results of core composition analysis,

and successfully predicted the distribution of each lithofacies in a single well. He et al. (2016) optimized the identification model constructed by DEN, AC, RT, and other logs through the comparison of core observation and X-ray diffraction, and qualitatively identified lithofacies through the intersection diagram. Compared with GR, DEN, DT, and other one-dimensional logs, resistivity images can directly observe formation changes and identify lithofacies boundaries. Their appearance improves the accuracy of lithofacies identification. Nishitsuji and Exley (2019) address a key challenge in the energy industry by optimizing deep-learning architectures, with a focus on labor-intensive hyperparameter optimization. Using Optuna, a global optimizer, the study fine-tunes parameters for an extended long short-term memory model in predicting lithological facies. Although the macro difference with and without Optuna is minor, the results indicate notable commercial impacts, particularly in scenarios with small yet challenging targets. Ying and Bao-Zhi (2011) used the support vector machine (SVM) algorithm to process conventional logs such as natural gamma, photoelectric absorption cross-section index, etc. They explained it with the help of micro-resistivity images and finally made a better analysis of the volcanic lithofacies. However, the SVM is difficult to apply to large-scale training samples, and the neural network easily falls into a local optimum (LeCun, Bengio, and Hinton 2015). Therefore, Yu et al. (2021) established a lithology identification and classification model using the gradient boosting decision tree (GBDT) ensemble learning algorithm. The model correctly identified the lithofacies of the volcanic rocks using core and FMI-calibrated conventional logs as input. On this basis, further research has been carried out to improve the efficiency and accuracy of identification. Lan et al. (2021) proposed a semi-supervised learning strategy for conventional log data based on positive and unlabeled machine learning, which only marked limited log samples, and successfully obtained five carbonate logging lithofacies, but the accuracy of the results needed to be improved. However, most of the above lithofacies identification methods require certain a priori judgment results for guidance. This kind of method is greatly influenced by manual subjectivity and has low precision and a huge workload. Therefore, it is necessary to identify lithofacies automatically. Tian et al. (2016) used the multi-resolution graph-based clustering (MRGC) method to automatically cluster the logs of the Amu Darya basin without prior knowledge, and finally obtained different lithofacies. Chai et al. (2009) designed an automatic lithofacies classification method for sedimentary facies of reef-shoal reservoirs. These studies again demonstrate the trend toward automatic lithofacies classification and identification methods.

Therefore, the objective of this paper is to introduce a comprehensive approach for lithofacies identification in tight sandstone formations using machine learning techniques. Given the challenges associated with limited core data, our focus is on leveraging the efficiency, speed, and accuracy of machine learning methods to address the complexities of tight reservoirs. Specifically, we have chosen four widely used techniques: self-organizing map (SOM), multi-resolution graph-based clustering (MRGC), K-nearest neighbor (KNN), and artificial neural network (ANN). In this study, the Lower Goru formation in the Kadanwari block of the central Indus Basin serves as a testbed. Two core sample wells and two identification and verification wells have been selected for a thorough comparative analysis of lithofacies identification. By showcasing the effectiveness of the chosen machine learning techniques in this real-world scenario, we aim to establish their applicability and reliability in enhancing lithofacies prediction. Ultimately, the culmination of this research will involve applying the truncated Gaussian simulation technique to create a facies model based on the lithofacies identified through the machine learning approaches. This model will not only contribute to a deeper understanding of lithofacies distribution in the studied reservoir but will also assist in identifying prospective areas with potential for tight sandstone formations. Through this research, we seek to establish a robust methodology for efficient and accurate lithofacies prediction in challenging geological settings. In summary, our study differentiates itself from previous approaches through its emphasis on log data, the careful selection of diverse machine learning techniques, and the rigorous validation of results in a specific geological context. This innovative combination contributes to the advancement of lithofacies identification methodologies, offering a more robust and applicable solution for reservoir prediction in tight sandstone formations.

2. Geological setting and data analysis

In this section, we provide a comprehensive overview of the geological setting and data analysis for the Kadanwari and Sawan gas fields in the Lower Goru formation within the Central Indus Basin, Pakistan (Figure 1(a)). The section is structured as follows.

2.1. Geographical and geological description

The study area encompasses the Kadanwari and Sawan gas fields, focusing on the conventional sands in D, E, F, and the tight G sand layer. This region, situated in the Central Indus Basin, is characterized by a complex geometrically progradational sequence environment, formed during three significant tectonic events (Ahmad and Chaudhry 2002; Ali et al. 2019, 2020; Ashraf et al. 2019). The structural configuration
Therefore, the objective of this paper is to introduce events (Ahmad and Chaudhry 2002; Ali et al. 2019,
a comprehensive approach for lithofacies identification 2020; Ashraf et al. 2019). The structural configuration

Figure 1. Presents a clear visual representation, highlighting (a) the geographical positioning, (b) the lithological composition
within the study region, and (c) the sedimentology model.

of the field, shaped by these tectonic events, has a significant impact on the reservoir characteristics (Figure 1(a)). The Lower Goru sands have been divided into seven sand-bearing intervals (Figure 1(b)) from bottom B-Sand to top H-Sand (Ahmad and Chaudhry 2002). The primary producing sands in the area are E-Sand and G-Sand, while D-Sand and F-Sand have also yielded production from select wells (Ali et al. 2023b; Ali et al. 2019, 2020). In Kadanwari, E-Sand, the main producer, is characterized as a conventional reservoir, forming an elongate body trending SW-NE parallel to the paleo shoreline of the Early Cretaceous time. However, B, C, D, G, and H exhibit tight characteristics. G-Sand has been productive post-hydraulic fracturing, and F-Sand exhibits hot sand characteristics in the field (Ashraf et al. 2019, 2020).

2.2. Deltaic system characteristics

The Kadanwari and Sawan (from C to H layers) of the Lower Goru represent a clastic delta system characterized by a river-dominant regime with additional wave and tidal transformations. River dynamics leave their mark on both sand-prone “proximal” and fine-grained “distal” facies. Proximal facies exhibit cross-bedded medium to coarse sandstones, while distal facies are typified by hummocky cross-lamination, associated with hyperpycnal flow during massive seasonal storms

and floods (Valzania et al. 2011). Distinctive variations in the size and shape of delta lobes deposited at different stages are evident (Figure 1(c)).

2.3. Data description and analysis

The logging and core data used in this paper are from four wells, K-15, K-14, K-13, and S-10, in the Lower Goru formation of the Kadanwari and Sawan gas field block in the central Indus Basin. Two coring wells (Well K-15 and Well K-14) were selected as core sample wells, and the other two wells (Well S-10 and Well K-13) were selected as identification effect verification wells. The input logging curves are the selected well logs: sonic (DT), density (RHOB), neutron (NPHI), deep lateral resistivity (LLD), photoelectric factor (PEF), spontaneous potential (SP), and natural gamma (GR).

3. Methodology

The procedure of utilizing machine learning algorithms to predict lithofacies in wells where core facies data does not exist, based on raw well log data from training wells, comprises multiple stages, as depicted in Figure 2. It is essential to emphasize that, prior to training and employing the models to predict lithofacies in new wells, various critical measures need to be undertaken for data preprocessing, exploration, and sample preparation. These phases involve eliminating erroneous data, incorporating additional features and clustering, scaling and normalizing the data, creating sequential samples, and dividing the dataset into subsets designated for training and testing. These actions are instrumental in ensuring the quality of the data samples that will subsequently be input into the machine learning models, ultimately leading to high-quality predictions.

3.1. Data cleaning

Due to instrument errors or recording errors, some outliers are inevitably generated in the logging data. When the model is sensitive to outliers, these outliers tend to negatively affect the results (Ashraf et al. 2024a, 2024b, 2024c; Valzania et al. 2011). The logging data basically conforms to the normal distribution (Zheng et al. 2021). To detect outliers, we

Figure 2. Showcases a visual representation delineating the methodological framework.


Figure 3. Illustration of pauta criterion.
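
The three-sigma rule illustrated in Figure 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`pauta_mask`, `lagrange_value`, `fill_outliers`), the use of the population standard deviation, the choice of four interpolation nodes, and the synthetic gamma-ray values are all hypothetical.

```python
import statistics


def pauta_mask(values, k=3.0):
    """Return True for samples farther than k standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0.0:
        return [False] * len(values)
    return [abs(v - mean) > k * std for v in values]


def lagrange_value(xs, ys, x):
    """Evaluate the Lagrange polynomial through the nodes (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fill_outliers(depth, values, mask, n_nodes=4):
    """Replace flagged samples using the n_nodes nearest unflagged samples."""
    good = [i for i, bad in enumerate(mask) if not bad]
    filled = list(values)
    for i, bad in enumerate(mask):
        if bad:
            nodes = sorted(good, key=lambda g: abs(depth[g] - depth[i]))[:n_nodes]
            filled[i] = lagrange_value([depth[g] for g in nodes],
                                       [values[g] for g in nodes], depth[i])
    return filled


# Hypothetical gamma-ray readings on a 0.5 m depth grid with one spike.
depth = [0.5 * i for i in range(21)]
gr = [50.0 + i for i in range(21)]
gr[10] = 500.0
mask = pauta_mask(gr)
cleaned = fill_outliers(depth, gr, mask)
```

Note that the three-sigma test needs a reasonably long window: with a population standard deviation, a single spike among fewer than about eleven samples can never exceed three standard deviations, so very short intervals should be screened another way.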

use the Pauta criterion (Li, Wen, and Wang 2016). As Figure 3 shows, under the Pauta criterion a gross error is identified with a confidence probability of 99.7%, which corresponds to three times the standard deviation. If a value falls outside this confidence interval, it is treated not as a random error but as a gross error. The outliers in this paper are filled with Lagrange interpolation.

3.2. Machine learning algorithms

In the course of this research, we strategically identified and selected four machine learning algorithms, taking into account the specific characteristics of the study area scenario and the constraints posed by limited core data availability. The chosen machine learning algorithms encompass SOM, KNN, MRGC, and ANN. This strategic selection not only addresses the complexities of the geological study area but also ensures a comprehensive and effective approach to our analysis.

3.2.1. Self-organizing map (SOM)

SOM is a type of ANN proposed by Teuvo Kohonen, a professor at Helsinki University in Finland, in 1981. Therefore, it is also known as the Kohonen algorithm (Kohonen 1991). The SOM is an unsupervised training neural network. By introducing the concept of a neighborhood function, it achieves self-organizing and unsupervised learning. In other words, all neurons are placed on a topology determined in advance according to prior knowledge (Bhattacharya, Carr, and Pal 2016; Cai and Chen 2022; Hussain et al. 2022; Wang et al. 2020). The introduction of the neighborhood function restricts SOM training, ensuring that the training does not fall into a local minimum. It adopts a two-dimensional SOM structure and consists of an input layer and a competition layer. The dimension of the input layer is consistent with the dimension of the input sample vector. The nodes of the competition layer are generally distributed in a two-dimensional array. A node in the competition layer represents a neuron, and each neuron is connected through lateral inhibition. The input layer and the competition layer are connected through full connection (Figure 4).

One key advantage of SOM is its ability to perform clustering and dimensionality reduction simultaneously. This makes it particularly useful for visualizing high-dimensional data in a lower-dimensional space. The competitive learning process, where neurons compete to be activated, allows SOM to identify patterns and relationships within the data. Additionally, SOM has been applied in various fields, such as image recognition, data mining, and feature extraction, showcasing its versatility in solving complex problems across different domains (Ali et al. 2022; Ali et al. 2023).

The specific steps of SOM algorithm implementation are as follows:

(1) Network initialization: initialize weights W, etc.

(2) Randomly select an input vector from the input samples.

(3) Calculate the distance between x_i and the competing layer neuron j, and find the neuron g with the smallest distance from x_i.

(4) Adjust the weights of the neuron g and of the neurons included in its neighborhood N(T),

where w_ij represents the weights of the input layer node i and the competition layer node j; η represents the learning rate, which generally decreases with the number of iterations.

3.2.2. Multi-resolution graph-based clustering (MRGC)

The multi-resolution graph-based clustering (MRGC) technique was proposed by Ye et al. (2000). MRGC is a multi-dimensional dot matrix pattern recognition method based on a nonparametric

Figure 4. The basic architecture of the self-organizing map (SOM). The input x is fully connected to the array of map nodes which is
most often and also in this illustration two-dimensional. Each map node is visualized as a circle on the grid.
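
For the SOM architecture shown in Figure 4, the four training steps (initialization, random sample selection, winner search, neighborhood update) can be sketched as follows. This is an illustrative sketch rather than the article's code: the Gaussian neighborhood, the linear decay of the learning rate and radius, the floor of 0.3 on the radius, and all function and parameter names are assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Step 3: grid position of the node whose weight vector is closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def update(weights, x, winner, lr, sigma):
    """Step 4: pull the winner and its grid neighbors toward x."""
    for node, w in weights.items():
        grid_d2 = (node[0] - winner[0]) ** 2 + (node[1] - winner[1]) ** 2
        h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # assumed Gaussian neighborhood
        for i in range(len(w)):
            w[i] += lr * h * (x[i] - w[i])


def train_som(samples, rows=3, cols=3, epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(samples[0])
    # Step 1: random initial weights for every node of the competition layer.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total = epochs * len(samples)
    for step in range(total):
        x = rng.choice(samples)                   # Step 2: random input vector
        winner = best_matching_unit(weights, x)   # Step 3: smallest distance
        decay = 1.0 - step / total                # shrinks from 1 toward 0
        update(weights, x, winner, lr0 * decay, max(sigma0 * decay, 0.3))
    return weights


# Hypothetical normalized two-log samples (for example GR and RHOB scaled to 0..1).
samples = [[0.05, 0.10], [0.10, 0.05], [0.90, 0.95], [0.95, 0.90]]
som = train_som(samples, rows=2, cols=2, epochs=20, seed=1)
```

After training, each log sample is assigned to the map node returned by `best_matching_unit`, and nodes (or groups of nodes) are then interpreted as electrofacies.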

KNN algorithm and graphic data representation (Shi et al. 2017). This method is different from other clustering algorithms, such as ANN and SOM, which usually require a large number of parameters to be set and involve a complicated calculation process. The setting of some empirical parameters often has a great impact on, or introduces uncertainty into, the results, whereas the MRGC analysis method does not depend on the preference of analysts. It does not need to know the structure of the cluster data in advance, and the operation speed is very fast. This method can automatically determine the optimal lithologic clustering scheme by using well-logging data in combination with actual needs, and it is also easy to extend to adjacent areas. Compared with other methods, this clustering method has unique advantages.

There are two key parameters in the MRGC cluster analysis method: the proximity index (PI) and the kernel representation index (KRI). The proximity index is a weighting function based on the measurement point x relative to all other measurement points. Its specific expression is as follows:

where the measurement point x is the mth adjacent point of the measurement point y, m ≤ P − 1; α is a smoothing factor, α > 0. The value PI(x) varies from zero to one. When the value PI(x) is larger, it means that the point is closer to the “core” of a certain class.

The kernel representation index (KRI) is a combined function that combines the proximity index PI(x), the distance function B(x, y), and the neighborhood function A(x, y). The proximity index PI(x) is a very important factor for the KRI, but it is only a local index. Therefore, the distance function B(x, y) and the neighborhood function A(x, y) are introduced. Among them, the proximity index can effectively identify the core of the cluster, the neighborhood function A(x, y) can generate clusters of equivalent size, and the distance function B(x, y) can form clusters of equivalent volume. Therefore, the combination of A(x, y) and B(x, y) can form an effective balance on the cluster size and volume and obtain consistent results. The optimal clustering scheme can

be obtained based on the calculated KRI, in which KRI can be obtained by the following formula:

For instance, as depicted in Figure 5(a), the middle data point, x1, which has a PI value of 0.9, together with the remaining neighboring points, the majority of which have PI values lower than 0.9, forms an attraction set. Because it is independent of every other data point, x1 is referred to as a “free attractor”. A similar situation occurs in Figure 5(b), where the middle point again has a PI value of 0.9. Here, the distance that separates x2 and x1 is greater than that of the majority of the points inside the dataset attracted by x1; more specifically, the KRI computation criterion is satisfied only by x2, which has a PI value of 0.95. Equation (8) is used to determine the KRI of the point x1.

3.2.3. K-nearest neighbor (KNN)

K-nearest neighbor (KNN) is one of the simplest classification and recognition algorithms based on supervised machine learning (Soucy and Mineau 2001). In KNN, each sample can be represented by its nearest K neighbors, and new samples can be directly classified based on the pre-classification of the data set without undergoing a learning and training process (Bezdek, Chuah, and Leep 1986; Villegas et al. 2017). The neighbors of a sample to be classified are objects that have already been correctly classified, and the category of the sample is determined according to the category of the nearest one or several samples. Therefore, the KNN method is suitable for classification problems with overlapping sample sets or overlapping class domains because it is not affected by outliers and its algorithm is simple and direct. It can also be used when the sample size and the number of features are small, but the number of samples per class is required to be balanced (Figure 6).

Moreover, KNN is a non-parametric and instance-based learning algorithm, meaning it does not make assumptions about the underlying data distribution and relies on the specific instances in the training set for making predictions. This characteristic makes KNN particularly useful in scenarios where the decision boundaries are complex and not easily defined by a simple mathematical function. The algorithm’s simplicity and flexibility, however, come at the cost of computational efficiency, especially as the size of the dataset grows. Despite its computational challenges, KNN remains a popular choice in various applications such as pattern recognition, image classification, and

Figure 5. Attraction schematic diagram.

Figure 6. Visual illustration of the KNN algorithm.
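The nearest-neighbor decision rule of Section 3.2.3 can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the implementation used in this study: the function names `euclidean` and `knn_predict`, the two normalized features, and the facies labels are all hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(train_x, train_y, x, k=1):
    """Classify x by majority vote among its k nearest training samples.

    With k=1 this reduces to the nearest-neighbor rule: x receives the
    class of the single closest known sample.
    """
    # Rank the known samples by their distance to x and keep the k closest.
    ranked = sorted(zip(train_x, train_y), key=lambda s: euclidean(s[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, already-normalized two-feature samples with core labels.
train_x = [(0.90, 0.20), (0.80, 0.30), (0.85, 0.25),
           (0.20, 0.80), (0.30, 0.70), (0.25, 0.90)]
train_y = ["shale", "shale", "shale", "sand", "sand", "sand"]

label_1nn = knn_predict(train_x, train_y, (0.70, 0.40), k=1)
label_3nn = knn_predict(train_x, train_y, (0.30, 0.75), k=3)
print(label_1nn, label_3nn)
```

Every prediction scans all stored samples, which is the computational cost noted in the text as the dataset grows.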


recommendation systems, showcasing its versatility in solving diverse problems.

The specific steps of KNN algorithm implementation are as follows:

It is assumed that there are C classes $k_1, k_2, k_3, \ldots, k_C$, and each class $k_i$ has $P_i$ ($i = 1, 2, 3, \ldots, C$) samples indicating the class. The discriminant function of the class $k_i$ is defined as:

$$g_i(x) = \min_{k} \lVert x - x_i^{k} \rVert, \quad k = 1, 2, \ldots, P_i \tag{9}$$

where i means the $k_i$ type, and k means the k-th of the $P_i$ samples of class $k_i$.

According to Equation (9), the decision rule can be written as: if $g_j(x) = \min_i g_i(x)$, then the decision is $x \in k_j$. This decision-making technique is called the nearest neighbor technique; that is, for the unknown sample x, as long as the Euclidean distances between x and the $n = \sum_{i=1}^{C} P_i$ known class samples are compared, x is assigned the same class as the nearest sample, and the class of sample x can be determined.

3.2.4. Artificial neural network (ANN)

ANN is broadly utilized in the domain of classification and identification. Here is a basic description of the backpropagation (BP) neural network (Alakbari, Elkatatny, and Baarimah 2016; Arkalgud, McDonald, and Brackenridge 2021; Ashraf et al. 2021; Onalo et al. 2018). The BP neural network is a gradient descent approach that has been created to reduce the overall error, also known as the mean error, of the output that is computed by the network (Ali et al. 2021; Balmer, Weibel, and Huang 2021; Guresen and Kayakutlu 2011). Figure 7 illustrates such a network. This network is composed of three distinct types of layers: an input layer, an output layer, and one or several hidden layers in between them. The input layer receives the initial data, the hidden layers process and transform this information, and the output layer produces the final result. The connections between the layers, known as weights, are adjusted during training to minimize the difference between the predicted output and the actual output, facilitating the learning process and enhancing the network’s predictive capabilities.

Assume a neural network with a hidden layer, an input layer n, and an output layer m. Here $b_j$ indicates the output of the hidden layer, $\theta_j$ is the threshold of the hidden layer, $\theta_k$ represents the threshold of the output layer, $f_1$ is the transfer function of the hidden layer, $f_2$ is the transfer function of the output layer, $w_{ij}$ are the input-layer-to-hidden-layer weights, and $w_{jk}$ are the hidden-layer-to-output-layer weights. After that, we will be able to obtain the output of the network, which is denoted by $y_k$, while the output of the jth neuron of the hidden layer is denoted by $t_k$.

Calculating the output $y_k$ of the output layer, this is:

Defining the error function by the network actual output, that is:

The purpose of network training is to have the network error decrease to a predetermined minimum, or to stop at a specific training step, by continuously adjusting the weights and thresholds. The prediction samples are then entered into the trained network, and the findings of the prediction are obtained.

Moreover, to enhance the robustness and reliability of machine learning models, a valuable technique known as K-fold cross-validation can be employed (Pal et al. 2022; Ruidas et al. 2023). This approach involves dividing the dataset into K segments, with one segment utilized for validation (verification wells) and the remaining K − 1 segments combined for training (core sample wells). This process is

Figure 7. The topology of the backpropagation neural network.
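The forward pass, error function, and weight adjustment described for the BP network can be sketched as follows. The sketch relies on one common textbook convention, which is an assumption here rather than the formulation of this study: hidden output $b_j = f_1(\sum_i w_{ij} x_i - \theta_j)$, network output $y_k = f_2(\sum_j w_{jk} b_j - \theta_k)$, and error $E = \frac{1}{2}\sum_k (t_k - y_k)^2$, with $t_k$ taken as the expected output. The sigmoid transfer functions, the function names, and all numeric values are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic transfer function, used here for both f1 and f2."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_ih, theta_h, w_ho, theta_o):
    """Return (hidden outputs b, network outputs y) for one input vector."""
    b = [sigmoid(sum(x[i] * w_ih[i][j] for i in range(len(x))) - theta_h[j])
         for j in range(len(theta_h))]
    y = [sigmoid(sum(b[j] * w_ho[j][k] for j in range(len(b))) - theta_o[k])
         for k in range(len(theta_o))]
    return b, y

def error(t, y):
    """Half the summed squared difference between expected and actual output."""
    return 0.5 * sum((tk - yk) ** 2 for tk, yk in zip(t, y))

def update_output_weights(w_ho, b, y, t, lr):
    """One gradient-descent step on the hidden-to-output weights only."""
    return [[w_ho[j][k] + lr * (t[k] - y[k]) * y[k] * (1.0 - y[k]) * b[j]
             for k in range(len(y))]
            for j in range(len(b))]

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output.
x = [0.5, 0.8]
w_ih = [[0.1, 0.4], [-0.3, 0.2]]   # w_ih[i][j]: input i -> hidden j
theta_h = [0.0, 0.1]
w_ho = [[0.3], [-0.2]]             # w_ho[j][k]: hidden j -> output k
theta_o = [0.05]
t = [1.0]                          # expected output (assumed meaning of t_k)

b, y = forward(x, w_ih, theta_h, w_ho, theta_o)
error_before = error(t, y)
w_ho = update_output_weights(w_ho, b, y, t, lr=0.5)
_, y_new = forward(x, w_ih, theta_h, w_ho, theta_o)
error_after = error(t, y_new)
print(error_before, error_after)
```

A full BP implementation would also propagate the error back to the input-to-hidden weights and to both sets of thresholds; only the output-layer weights are updated here to keep the sketch short.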


iterated K times, and the average error of all iterations is reported, allowing each sample in the dataset to be tested. Integrating K-fold cross-validation with an exhaustive grid search technique aids in hyperparameter tuning, ensuring optimal model performance. This approach not only evaluates the model’s predictability but also addresses overfitting concerns during the training process. In the context of our model development, we have incorporated five folds in the K-fold cross-validation process (Figure 8). This comprehensive approach contributes to a more thorough assessment of the model’s generalization capabilities. Typical data splits are 70% training and 30% testing.

3.2.5. Feature selection

Feature selection is an essential pre-processing method commonly employed to reduce the dimensionality of a dataset by systematically removing irrelevant features from an available set of features. There are many benefits to employing feature selection within a machine learning workflow (Ali et al. 2021; Guyon and Elisseeff 2003; Li, Li, and Liu 2017). These include simplifying the model, which increases the understanding of the processes that generated the predictions; reducing model training times, thus reducing the computational cost required for modeling; reducing the potential of model overfitting; and avoiding the curse of dimensionality.

In this research, we have selected the Pearson’s correlation algorithm of feature selection (Arkalgud, McDonald, and Brackenridge 2021). Univariate feature selection identifies the best features based on univariate linear regression tests. Rather than looking at all the features together, this method takes each feature separately and identifies the relationship of that feature with the target feature (Abellana and Lao 2023). Pearson’s correlation is a statistical approach that measures the strength of a linear relationship between two variables (a and b). The method attempts to draw the line of best fit through the data, with values ranging from −1 to +1. A value of 0 indicates no linear relationship between the variables. Values between −1 and 0 indicate a negative relationship, and values between 0 and +1 indicate a positive correlation.

There are several conventional log curves, so it is necessary to select the feature curves possessing a higher correlation with the target curve before training the model. Nine logging curves were selected from the dataset (Figure 9), providing a range of common measurements typically acquired within a well. To evaluate the ranking mechanism, several curves that are not used for lithofacies recognition, for example, TVD, LLM, and LLS, were also selected to evaluate how the feature selection model handled them. The results of the feature selection methods are shown in Figure 9. Neutron porosity (NPHI) is indicated as the feature having the most significant impact, with an influence factor value of 36.21. Following a significant drop-off, the next most relevant feature identified is compressional sonic (DT), with an influence factor of 14.04. Another significant, although smaller, decrease in impact factor is observed between Gamma-ray (GR) (12.17), Deep resistivity (LLD) (10.01), and density

Figure 8. Visualization of the cross-validation process involving random subsampling. The initial dataset undergoes random
partitioning, creating a training set for model development and a testing set for validation purposes.
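The combination of five-fold cross-validation with an exhaustive grid search can be sketched as follows. This is a simplified illustration under stated assumptions: samples are shuffled individually rather than split by well, the model is a one-feature K-nearest-neighbor classifier, and the function names, the data, and the candidate grid are hypothetical.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        yield train, folds[f]

def knn_predict(xs, ys, x, k):
    """Majority vote among the k training values closest to x (1-D feature)."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
    return Counter(ys[i] for i in nearest).most_common(1)[0][0]

def cv_error(xs, ys, k, n_folds=5):
    """Average misclassification rate of k-NN over all folds."""
    fold_errors = []
    for train, val in k_fold_indices(len(xs), n_folds):
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        wrong = sum(knn_predict(tx, ty, xs[i], k) != ys[i] for i in val)
        fold_errors.append(wrong / len(val))
    return sum(fold_errors) / len(fold_errors)

# Hypothetical one-feature dataset with two classes.
xs = [i / 20 for i in range(20)]
ys = ["shale" if x < 0.5 else "sand" for x in xs]

# Exhaustive grid search: evaluate every candidate K with 5-fold CV.
grid = [1, 3, 5]
scores = {k: cv_error(xs, ys, k) for k in grid}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

Splitting by well, as described in the text, would replace the shuffled sample indices with groups of samples that share a well identifier.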

Figure 9. (a) Weight feature importance scores and (b) heatmap of correlation features for lithofacies. Following a successful
ranking, the top five features scaled down with the feature selection method were carried forward to modeling to construct
models for lithofacies.
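Ranking features by Pearson's correlation with a target curve, as described in Section 3.2.5, can be sketched as below. The sketch uses the standard definition $r = \sum (a_i - \bar{a})(b_i - \bar{b}) / \sqrt{\sum (a_i - \bar{a})^2 \sum (b_i - \bar{b})^2}$; the function names and the numeric values are hypothetical, and the influence factors reported in Figure 9 are not reproduced by it.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def rank_features(features, target):
    """Sort features by the absolute value of their correlation with target."""
    scores = {name: abs(pearson_r(vals, target))
              for name, vals in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical curves sampled at five depths (not data from this study).
target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "NPHI": [2.0, 4.0, 6.0, 8.0, 10.0],   # perfectly positively correlated
    "GR":   [5.0, 4.0, 3.0, 2.0, 1.0],    # perfectly negatively correlated
    "CAL":  [1.0, -1.0, 0.0, -1.0, 1.0],  # no linear relationship
}

ranking = rank_features(features, target)
top_two = [name for name, _ in ranking[:2]]
print(ranking)
```

Using the absolute value keeps strongly negative correlations, which are as informative for a linear model as strongly positive ones.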

(RHOB) (7.70). In contrast, the photoelectric factor (PEF), spontaneous potential (SP), and caliper (CAL) inputs were ranked as the least impactful and relevant input features, with influence factors of 6 and <6, respectively. Therefore, it can be suggested that the top five ranked inputs are the most relevant features.

3.2.6. Measuring prediction accuracy
Evaluation metrics are required to estimate the performance of the model (Arkalgud, McDonald, and Brackenridge 2021). The difference between the actual logs and pseudo logs can be calculated by these evaluation metrics, as there are several evaluation metrics for regression. A simple following method has been employed:

3.3. Normalization of input curves

Normalization is a critical process in almost all machine learning workflows, defined as the process of the application of corrective shifts to log traces to remove systematic errors. There are several scaling techniques available that could be applied to scale a given dataset. Standardization and normalization are two of the most common techniques for data preprocessing and scaling. Standardization scales the data by removing the mean and scaling to unit variance. In normalization, the most popular technique is to normalize each sample by using the minimum and maximum value of each feature, thus normalizing the samples to a range of zero and one (Ioffe and Szegedy 2015). Scaling results from these common techniques are negatively impacted when there are outliers in the dataset. In this study, to mitigate any outlier issue, we used a scaling technique called the z-score method, which removes the average value of each curve and scales the data according to its standard deviation (Equation 15). The effect of feature scaling can be clearly shown when using kernel density pair plots. Figure 10 displays the kernel density pair plots generated for all features after applying the z-score method. After normalization, all the features have been scaled to a range of 0 and 1 (Figure 10a,b). The RMSE value of the model after normalization is lower than that of the model before normalization (Figure 10c,d). The selected hyperparameters based on the optimization routine are a max depth of 12, a learning rate of 0.05, and a minimum child weight of 8.

$$z_i = \frac{x_i - \mu}{\sigma} \tag{15}$$

where $\mu$ is the average value of the characteristic curve, $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$; N is the number of samples of a characteristic curve; $x_i$ is the value at the i-th sample point of a characteristic curve; and $\sigma$ is the standard deviation, $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$.

4. Area of petrologic features

The Lower Goru formation of the Kadanwari field block is distributed in the NE-SW direction, and the lithology is mainly composed of sand, siltstone, silty-shaly, shaly sand, and shale (Figure 11).

Generally, there are many lithological classifications in the area, and the lithology changes rapidly in the longitudinal direction. However, the logging curve is limited by its resolution and cannot reflect the lithological information of thin layers. In addition, too many lithology types will also reduce the

Figure 10. Comparison of statistical characteristics of feature curves (a) before and (b) after normalization, and RMSE values from hyperparameter optimization of the models (c) before and (d) after normalization; the RMSE is lower after normalization.
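The scaling techniques named in Section 3.3 behave differently when a curve contains an outlier, which the following sketch illustrates. It is a hedged example, not the preprocessing code of this study: the function names and the gamma-ray values are hypothetical, `z_score` uses the population standard deviation with the 1/N convention, and `robust_scale` uses the median and interquartile range.

```python
import statistics

def z_score(values):
    """Standardize: subtract the mean, divide by the population std (1/N)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def robust_scale(values):
    """Subtract the median and divide by the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    med = statistics.median(values)
    return [(v - med) / (q3 - q1) for v in values]

# Hypothetical gamma-ray readings with one outlier (300).
gr = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0, 300.0]

gr_z = z_score(gr)
gr_mm = min_max(gr)
gr_rb = robust_scale(gr)
print(gr_z, gr_mm, gr_rb, sep="\n")
```

With min-max scaling the single outlier compresses the six ordinary readings into less than a tenth of the output range, whereas median and interquartile-range scaling keeps them evenly spread and leaves only the outlier far from zero.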

Figure 11. Location and stratigraphic profile of the study area.


Table 1. Lithofacies classification of Lower Goru formation.

Facies Description
Macro Facies 1 Shale to silty shale (Shelf deposits)
Macro Facies 2 Siltstone to silty-shaly sandstone (prodelta shales with turbiditic layers)
Macro Facies 3 Low-porosity, low permeability cemented sandstone (very distal mouth bar fringe)
Macro Facies 4 Low-medium porosity, low permeability sideritic sandstone (shoreface to distal mouth bar)
Macro Facies 5 High-porosity, high permeability sandstone (mouth bar)
Macro Facies 6 Highly chamosite/siderite affected lithologies (chamositized mouth bar)

accuracy of lithology identification. The purpose of lithological identification is to analyze the energy of the water body in the sedimentary period by using lithological information, to meet the demand for fine research on sedimentary microfacies. Therefore, considering the demand for sedimentary microfacies research and the recognition accuracy, the lithofacies are divided into six categories (Table 1). Because the particle size of sandstone and fine sandstone is relatively coarse, it is difficult to distinguish them in terms of logging characteristics, while both fine-grained sand and granular sand reflect a high-energy sedimentary environment, so they are collectively classified as granular sand facies. The granular sand facies represent the high-energy sedimentary environment, the siltstone facies represent the medium-energy sedimentary environment, and the clay or shale facies represent the low-energy sedimentary environment. The division into six types of lithofacies can not only meet the production demand but also ensure high-precision lithofacies identification.

5. Logging facies analysis method

A log facies is a set of logging features and their combinations that can reflect the sedimentary characteristics and distinguish this sediment from other sediments. Rock facies refers to the total geological environment in which rocks are formed, including temperature, climate, stratum, age, etc. It is a rock or rock combination formed in a certain sedimentary environment. Each type of lithofacies has different physical properties, such as high porosity or low porosity, high radioactivity or low radioactivity, oil, and gas, or water-bearing and fracture development. As a result, the same lithofacies may have multiple log characteristics, so one rock facies may correspond to multiple log facies.

The number of log facies can be set according to actual needs. Generally, the number of log facies is larger than the actual number of rock facies. The larger the number of log facies, the more complex the corresponding relationship with the rock facies. According to the literature, it is more appropriate for the number of log facies to be set to 1 to 3 times the number of rock facies (Al Hasan et al. 2023). The rock facies in the study area are divided into 6 categories, and the number of log facies is set to 10 categories.

The method of log facies analysis is to establish the relationship between log facies and rock facies and to transform log facies into rock facies to realize rock facies identification. The relationship between log facies and rock facies is established based on core calibration. Due to the complexity of the rock electrical relationship, log facies and rock facies often cannot be placed in complete one-to-one correspondence. It is necessary to determine the corresponding relationship by referring to the probability of different rock facies corresponding to each log facies. The probability is equal to the percentage of the cumulative thickness of different rock facies corresponding to each log facies in the total thickness of the log facies. Taking the SOM method of K-14 well as an example, the total thickness of log facies 5 is 19 m, and the thickness of the good sand reservoir part corresponding to log facies 5 is 16 m, so the probability of log facies 5 corresponding to good sand reservoir facies is 83.8%. The probability of the corresponding medium sand reservoir facies is 16.2%, and the probability of the corresponding shale and other facies is 0%. It is comprehensively considered that log facies 5 corresponds to the good sand reservoir facies (Table 2). Similarly, based on the percentage of core calibration probability, a good sand reservoir corresponds to log facies 5 and log facies 9. If the probability of a similar selection of other rock facies is greater than 70%, the corresponding relationship between log facies and rock facies is established.

6. Results and discussion

In this section, we will discuss the results of the selected machine learning models and the accuracy of predicting lithofacies in a geometrically progradational sequence environment. Based on the consistency table, lithofacies are divided into six divisions, as shown in Figure 12. The relationship between lithofacies and logging characteristics is analyzed in Figure 12, which represents a cross plot of scattered points for various parameters. It can be observed that the natural gamma-ray vs deep resistivity log cross plot provides a good distinction for lithology. Conversely, the acoustic vs deep resistivity log cross plot exhibits a relatively weak discrimination ability for facies. Different lithological categories overlap significantly and cannot be effectively distinguished by logging data alone in this complex lithological and depositional environment.

Table 2. Probability percentages indicating the relationship between log facies and corresponding rock facies based on core
calibration data. Bold values highlight significant probabilities supporting the identification of specific rock facies.
              Core Facies 1  Core Facies 2  Core Facies 3  Core Facies 4  Core Facies 5  Core Facies 6
Log Facies 1  0              0              59.3           22.1           18.6           0
Log Facies 2  0              0              70.2           29.8           0              0
Log Facies 3  0              17.4           1.4            81.3           0              0
Log Facies 4  0              0              83.5           10.1           6.3            0
Log Facies 5  0              0              0              0              83.8           16.2
Log Facies 6  1.5            83.1           1.5            0              13.8           0
Log Facies 7  100            0              0              0              0              0
Log Facies 8  45.4           33             0              0              21.6           0
Log Facies 9  100            0              0              0              0              0
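The core-calibration rule of Section 5 (a log facies is assigned to a core facies when its probability exceeds 70%) can be applied to the percentages of Table 2 as in the sketch below. The probabilities are copied from Table 2; the function names and the thicknesses in the usage line are hypothetical, and treating a log facies with no probability above the threshold as unassigned is an assumption of this sketch.

```python
# Probabilities (%) from Table 2: log facies -> [core facies 1..6].
TABLE_2 = {
    1: [0, 0, 59.3, 22.1, 18.6, 0],
    2: [0, 0, 70.2, 29.8, 0, 0],
    3: [0, 17.4, 1.4, 81.3, 0, 0],
    4: [0, 0, 83.5, 10.1, 6.3, 0],
    5: [0, 0, 0, 0, 83.8, 16.2],
    6: [1.5, 83.1, 1.5, 0, 13.8, 0],
    7: [100, 0, 0, 0, 0, 0],
    8: [45.4, 33, 0, 0, 21.6, 0],
    9: [100, 0, 0, 0, 0, 0],
}

def facies_probabilities(thicknesses):
    """Percentage of the total log-facies thickness taken by each core facies."""
    total = sum(thicknesses)
    return [100.0 * t / total for t in thicknesses]

def assign_core_facies(probabilities, threshold=70.0):
    """Return the 1-based core facies whose probability exceeds the threshold,
    or None when no core facies is dominant enough (assumed behavior)."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return best + 1 if probabilities[best] > threshold else None

mapping = {log_facies: assign_core_facies(p) for log_facies, p in TABLE_2.items()}
print(mapping)

# Hypothetical thicknesses (m) of two core facies inside one log facies.
print(facies_probabilities([3.0, 1.0]))
```

Under this rule log facies 1 and 8 remain unassigned, because their largest probabilities (59.3% and 45.4%) do not exceed the 70% threshold.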

Figure 12. Cross plot of logging characteristics of different lithofacies.

Recognizing the limitations of conventional logging data in such a complex setting, we leverage the power of diverse machine learning methodologies. The ensemble includes SOM, KNN, MRGC, and ANN. These techniques collectively aim to discern, compare, and enhance the recognition effects, providing a robust framework for unraveling the intricacies of lithofacies in the face of challenging geological and depositional conditions.

Both SOM and MRGC belong to cluster analysis methods. The process of identifying lithofacies by cluster analysis is to first determine the number of cluster log facies; then cluster to obtain log facies, calibrate them against the core in coring wells, and establish the corresponding relationship between log facies and lithofacies; and finally convert log facies into lithofacies to realize lithofacies identification of non-coring wells and non-coring sections. The correspondence between logging facies and rock facies is the key to determining the identification effect. The corresponding relationship can be determined based on the existing logging theoretical basis. Therefore, even if the number of core samples is small, cluster analysis can be performed and the method can still identify lithofacies.

The key to cluster analysis is log facies analysis, where log facies are translated into geological facies. In the process of transformation, there may be multiple log facies corresponding to one rock facies. For example, changes can be seen in Figure 13 in the neutron, density, and acoustic log characteristics in the pay zone, showing that the porosity log response changes from low to high. Taking the MRGC method as an example, the high-porosity pay zone corresponds to log facies 8, the medium-porosity medium pay zone corresponds to log facies 9, and the low-porosity non-pay zone corresponds to log facies 1.

Figure 13 provides a comprehensive visual representation of the transformative process applied to the log facies of the K-15 well after core calibration. The utilization of both SOM and MRGC cluster analysis techniques enables the conversion of log facies colors into corresponding lithologies, establishing a crucial link between the acquired well data and its geological significance. After carefully reviewing Figure 14, it becomes apparent that the lithologies derived from the log facies transformation align remarkably well with the lithology profile extracted from the actual core samples. This alignment underscores the effectiveness and reliability of the applied methods in accurately capturing the geological characteristics of the subsurface formations. Specifically, employing the SOM method reveals that the coring section of the K-15 well spans a thickness of 9.8 m. Notably, the coincidence thickness, where lithofacies identification matches with the actual lithology, extends to 24.5 m. Despite the relatively lower recognition coincidence rate of 40%, the SOM method provides valuable

Figure 13. The log facies correlated with core calibration by SOM and MRGC of K-15 well.

Figure 14. The corresponding relationship between lithofacies and logging.

insights into the lithological composition of the well. In contrast, the MRGC method demonstrates a higher level of accuracy in lithofacies identification. The coincidence thickness under the MRGC approach is measured at 10.8 m, with an impressive identification coincidence rate of 90.7%. This signifies a robust capability of MRGC in precisely characterizing lithological transitions and patterns within the K-15 well. Figure 14 serves as a crucial visual aid, offering a side-by-side comparison of the transformed log facies using

SOM and MRGC methods against the lithology profile derived from actual core samples. The consistent alignment observed in the lithological interpretations underscores the reliability of the chosen methodologies in converting log data into meaningful geological information, thus enhancing our understanding of the subsurface geological conditions in the vicinity of the K-15 well.

Similarly, we take another well as an example, where sufficient core data information was not available; therefore, we combined the core data information of the nearby K-15 well with the K-14 well to identify and compare the actual rock facies with log facies in the existing well. We employed multiple machine learning models in this scenario. Figure 15 visualizes the predictive results of the four different machine learning models for the K-14 well. The evaluation results are displayed in log tracks 10 to 13, which show the performance of each model more intuitively. Tracks 11 and 13 of Figure 15 demonstrate that the MRGC and KNN models exhibit a notable recognition effect for various lithologies, surpassing the other methods, ANN and SOM. The distinct classification performance of MRGC and KNN in these tracks underscores their efficacy in handling diverse lithological variations within the well. Notably, Figure 15 highlights that the ANN and SOM models show comparatively lower effectiveness, particularly in scenarios involving imbalanced classification of lithologies. Further analysis of Figure 15 makes it evident that the MRGC algorithms not only showcase a good classification effect for each lithology but also demonstrate robustness in handling complex geological variations. The visualization in Figure 15 allows us to discern the detailed performance differences among the machine learning models, guiding us toward a more informed selection of models based on their suitability for specific lithological contexts. In summary, the detailed examination of Figure 15 underscores the significance of employing MRGC and KNN models for effective lithology prediction and classification, particularly in instances where data imbalances pose challenges for other machine learning algorithms. The results from Figures 14 and 15 highlight the effectiveness of the MRGC algorithm in accurately classifying lithologies and its robust performance in handling complex geological variations within the study area. Therefore, based on these observations, MRGC emerges as the most suitable method for lithofacies identification in our study. Consequently, we utilized the MRGC algorithms to construct a facies model in the next step.

The next step is to develop the lithofacies model. We have used the truncated Gaussian simulation (TGS) technique in this work on the G5 layer at the Kadanwari gas field. The TGS method is employed to build a facies model that can reduce the characterization uncertainties and, prospectively, give a better definition of the area. The TGS approach was first established to render

Figure 15. Comparison of lithofacies identification results with the different machine learning approaches.

Figure 16. Illustration of the sequential model building based on lithofacies (a) estimated facies proportions (b) section across the
chronostratigraphic sublayer of G distributions. (c) WS-se section across the TGS facies correlated with the well log information.
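The core idea of truncated Gaussian simulation, in which a Gaussian value selects the lithotype once thresholds have been determined from the facies proportions, can be sketched as follows. This is a deliberately reduced illustration and not the TGS workflow of this study: the Gaussian values are drawn independently, so there is no variogram, spatial correlation, or well conditioning, and the function names and the proportions are hypothetical.

```python
import bisect
import random
from statistics import NormalDist

def thresholds_from_proportions(proportions):
    """Gaussian thresholds whose intervals carry the given facies proportions."""
    cumulative, cuts = 0.0, []
    for p in proportions[:-1]:
        cumulative += p
        cuts.append(NormalDist().inv_cdf(cumulative))
    return cuts

def truncate(value, cuts):
    """Index of the facies whose threshold interval contains the Gaussian value."""
    return bisect.bisect_right(cuts, value)

# Hypothetical global proportions of three ordered facies.
proportions = [0.5, 0.3, 0.2]
cuts = thresholds_from_proportions(proportions)

# Independent standard Gaussian draws stand in for a simulated Gaussian field.
rng = random.Random(42)
facies = [truncate(rng.gauss(0.0, 1.0), cuts) for _ in range(10000)]
simulated = [facies.count(f) / len(facies) for f in range(len(proportions))]
print(cuts, simulated)
```

When proportions vary in space, as with the vertical proportion curves mentioned in the text, the thresholds would be recomputed for each cell or layer from the local proportions.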

stochastic images of sedimentary geology in fluvial-deltaic environments (Beucher et al. 1999; López and Aldana 2007). The fundamental purpose of the TGS model was to substitute poor geological calculations with multiple Gaussian distributions based on a random function for commonly used geostatistical simulations. The simulation of a lithotype relies on the value given for the Gaussian random function; this value selects the lithotype once the thresholds have been determined. The TGS model is normally employed in sedimentary environments. After undergoing a geometric modification that flattens one or more chronostratigraphic markers, the simulation is run in a “working simulation grid.” The facies proportions are shown as vertical proportion curves. These curves can differ spatially.

After that, the MRGC technique was utilized to propagate the lithofacies information obtained from the cored wells to the non-cored wells. The lithofacies were then modeled employing TGS constrained to depositional facies based on the geological information and chronostratigraphic markers (Figure 16b). Figure 16a shows the global proportion curves for the data set. Figure 16c shows the cross sections of one simulation for the same data set. This simulation has been performed by using proportions that vary in space.

The results obtained for each of the facies of the main G5 layer follow the geological information and previous studies in the area and honor the well data. It is important to notice the good lateral correlation obtained with the TGS method. As can be observed in Figure 16c, there is an excellent agreement between the sedimentary facies interpreted from the well logs and the electro-facies obtained from the TGS electro-facies volume.

7. Conclusions

This research presents a comprehensive study of machine learning approaches for lithofacies identification in complex geological environments with high dimensions, focusing specifically on the G5 layer of the Lower Goru formation in the Kadanwari gas field, Central Indus Basin, Pakistan. By systematically comparing and analyzing the practical application outcomes of four distinct methods (SOM, MRGC, ANN, and KNN) and incorporating TGS analysis for reservoir characterization, this research not only addresses the challenges associated with applying machine learning to well log data but also establishes a robust framework with global implications. The findings underscore the strengths of unsupervised learning methods in lithofacies relationship establishment, particularly highlighting SOM’s fault tolerance networks and MRGC’s efficacy in handling domain intersection with multiple lithofacies classifications. Similarly, the supervised learning methods, namely ANN and KNN, demonstrate their respective adaptability and problem-solving capacities. In real-world scenarios, such as the complex litho-electric relationships of fluvio-deltaic environments with limited core data, the study recommends the KNN and MRGC methods as suitable. The integration of TGS analysis following precise lithofacies identification enhances reservoir characterization, providing a facies volume that aids in defining prospective areas. Notably, the TGS approach outperforms previous non-linear techniques, offering conclusive results and establishing

a strong lateral correlation between identified well logging lithofacies and TGS lithofacies. Beyond the Kadanwari gas field, this research provides valuable insights and a practical reference for sedimentary lithofacies logging identification in diverse formations and study areas, contributing significantly to the global interest in advancing the application of machine learning techniques in geological and reservoir studies. The proposed workflow ensures consistency, reliability, and efficiency in results, ultimately saving time and effort in data processing and interpretation on a broader scientific scale.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Funding

This research was funded by financial support from the National Natural Science Foundation of China [Grant Nos. 41774145, 72243011] and China’s National Key R&D Program [Grant No. 2023YFB4104200].

Notes on contributors

Muhammad Ali earned his Ph.D. in geophysics from the Institute of Geophysics & Geomatics at China University of Geosciences (Wuhan) and subsequently completed a postdoctoral research fellowship at the same institution. He is currently a full-time researcher at the Institute of Rock and Soil Mechanics, Chinese Academy of Sciences. His research interests include petroleum geology, seismic and well log analysis, rock physics modeling, exploration, and machine learning.

Peimin Zhu is a professor at the Institute of Geophysics & Geomatics, China University of Geosciences (Wuhan), China. His research interests include seismic imaging, seismic inversion, and radar imaging for the interiors of the moon, Mars and asteroids.

Ma Huolin is currently an associate professor at the China University of Geosciences (Wuhan), China. He focuses on geophysical numerical modeling, geophysical inversion, and the application of electromagnetic theory to both surface and subsurface environments.

Ren Jiang received his M.S. degree in geophysics from the Research Institute of Petroleum Exploration & Development (RIPED), PetroChina. He is currently working as an advanced engineer in RIPED in seismic rock physics and reservoir prediction.

Hao Zhang received the M.S. and Ph.D. degrees in geophysics from the Institute of Geophysics & Geomatics, China University of Geosciences (Wuhan), China. He is currently a post-doctoral researcher at the China University of Geosciences (Wuhan) and works on the applications of deep learning in seismic interpretation, including image processing and seismic facies classification.

Umar Ashraf received his Ph.D. in geophysics from the Institute of Geophysics and Geomatics, China University of Geosciences (Wuhan), China. He is currently an assistant professor of geophysics at Yunnan University. He focuses on geophysical methods for subsurface exploration and seismic data analysis.

Wakeel Hussain received his M.S. degree in oil and gas engineering from the China University of Geosciences (Wuhan), China. He is currently pursuing a Ph.D. degree at the China University of Geosciences (Wuhan). His research interests include petroleum geology, well-log analysis, and facies classification.

ORCID

Muhammad Ali http://orcid.org/0000-0001-9795-1117
Peimin Zhu http://orcid.org/0000-0003-1613-9261
Ren Jiang http://orcid.org/0000-0002-6750-2297
Hao Zhang http://orcid.org/0000-0001-7845-3489
Umar Ashraf http://orcid.org/0000-0003-2402-3605
Wakeel Hussain http://orcid.org/0009-0007-3582-3612

Data availability statement

The data supporting the findings of this study can be obtained from the corresponding author upon a reasonable request.

References

Abellana, D. P. M., and D. M. Lao. 2023. “A New Univariate Feature Selection Algorithm Based on the Best–Worst Multi-Attribute Decision-Making Method.” Decision Analytics Journal 7:100240. https://doi.org/10.1016/j.dajour.2023.100240.

Ahmad, N., and S. Chaudhry. 2002. “Kadanwari Gas Field, Pakistan: A Disappointment Turns into an Attractive Development Opportunity.” Petroleum Geoscience 8 (4): 307–316. https://doi.org/10.1144/petgeo.8.4.307.

Alakbari, F. S., S. Elkatatny, and S. O. Baarimah. 2016. “Prediction of Bubble Point Pressure Using Artificial Intelligence AI Techniques.” In Paper presented at the SPE Middle East Artificial Lift Conference and Exhibition, Manama, Kingdom of Bahrain, November 30–December 1. https://doi.org/10.2118/184208-MS.

Al Hasan, R., M. H. Saberi, M. A. Riahi, and A. K. Manshad. 2023. “Electro-Facies Classification Based on Core and Well-Log Data.” Journal of Petroleum Exploration and Production Technology 13 (11): 2197–2215. https://doi.org/10.1007/s13202-023-01668-5.

Ali, J., U. Ashraf, A. Anees, S. Peng, M. U. Umar, H. Vo Thanh, U. Khan, et al. 2022. “Hydrocarbon Potential Assessment of Carbonate-Bearing Sediments in a Meyal Oil Field, Pakistan: Insights from Logging Data Using Machine Learning and Quanti Elan Modeling.” American Chemical Society Omega 7 (43): 39375–39395. https://doi.org/10.1021/acsomega.2c05759.

Ali, M., U. Ashraf, P. Zhu, H. Ma, R. Jiang, G. Lei, J. Ullah, J. Ali, H. Vo Thanh, and A. Anees. 2023a. “Quantitative Characterization of Shallow Marine Sediments in Tight Gas Fields of Middle Indus Basin: A Rational Approach of Multiple Rock Physics Diagnostic Models.” Processes 11 (2): 323. https://doi.org/10.3390/pr11020323.
Ali, M., R. Jiang, H. Ma, H. Pan, K. Abbas, U. Ashraf, and J. Ullah. 2021. “Machine Learning - A Novel Approach of Well Logs Similarity Based on Synchronization Measures to Predict Shear Sonic Logs.” Journal of Petroleum Science & Engineering 203:108602. https://doi.org/10.1016/j.petrol.2021.108602.
Ali, M., M. J. Khan, M. Ali, and S. Iftikhar. 2019. “Petrophysical Analysis of Well Logs for Reservoir Evaluation: A Case Study of “Kadanwari” Gas Field, Middle Indus Basin, Pakistan.” Arabian Journal of Geosciences 12 (6): 215. https://doi.org/10.1007/s12517-019-4389-x.
Ali, M., H. Ma, H. Pan, U. Ashraf, and R. Jiang. 2020. “Building a Rock Physics Model for the Formation Evaluation of the Lower Goru Sand Reservoir of the Southern Indus Basin in Pakistan.” Journal of Petroleum Science & Engineering 194:107461. https://doi.org/10.1016/j.petrol.2020.107461.
Ali, M., P. Zhu, R. Jiang, H. Ma, M. Ehsan, W. Hussain, H. Zhang, U. Ashraf, and J. Ullaah. 2023b. “Reservoir Characterization Through Comprehensive Modeling of Elastic Logs Prediction in Heterogeneous Rocks Using Unsupervised Clustering and Class-Based Ensemble Machine Learning.” Applied Soft Computing 148:110843. https://doi.org/10.1016/j.asoc.2023.110843.
Ali, N., J. Chen, X. Fu, W. Hussain, M. Ali, S. M. Iqbal, A. Anees, M. Hussain, M. Rashid, and H. V. Thanh. 2023. “Classification of Reservoir Quality Using Unsupervised Machine Learning and Cluster Analysis: Example from Kadanwari Gas Field, SE Pakistan.” Geosystems and Geoenvironment 2 (1): 100123. https://doi.org/10.1016/j.geogeo.2022.100123.
Al-Qaness, M. A. A., A. A. Ewees, H. Fan, A. M. AlRassas, and M. Abd Elaziz. 2022. “Modified Aquila Optimizer for Forecasting Oil Production.” Geo-Spatial Information Science 25 (4): 519–535. https://doi.org/10.1080/10095020.2022.2068385.
Alzubaidi, F., P. Mostaghimi, P. Swietojanski, S. R. Clark, and R. T. Armstrong. 2021. “Automated Lithology Classification from Drill Core Images Using Convolutional Neural Networks.” Journal of Petroleum Science & Engineering 197:107933. https://doi.org/10.1016/j.petrol.2020.107933.
Antariksa, G., R. Muammar, and J. Lee. 2022. “Performance Evaluation of Machine Learning-Based Classification with Rock-Physics Analysis of Geological Lithofacies in Tarakan Basin, Indonesia.” Journal of Petroleum Science & Engineering 208:109250. https://doi.org/10.1016/j.petrol.2021.109250.
Arkalgud, R., A. McDonald, and R. Brackenridge. 2021. “Automated Selection of Inputs for Log Prediction Models Using a New Feature Selection Method.” Paper presented at the SPWLA 62nd Annual Logging Symposium, Virtual Event, May 17–20. https://doi.org/10.30632/SPWLA-2021-0091.
Ashraf, U., W. Shi, H. Zhang, A. Anees, R. Jiang, M. Ali, H. N. Mangi, and X. Zhang. 2024c. “Reservoir Rock Typing Assessment in a Coal-Tight Sand-Based Heterogeneous Geological Formation Through Advanced AI Methods.” Scientific Reports 14 (1): 5659.
Ashraf, U., H. Zhang, A. Anees, M. Ali, H. N. Mangi, and X. Zhang. 2024a. “An Ensemble-Based Strategy for Robust Predictive Volcanic Rock Typing Efficiency on a Global Scale: A Novel Workflow Driven by Big Data Analytics.” Science of the Total Environment 173425. https://doi.org/10.1016/j.scitotenv.2024.173425.
Ashraf, U., H. Zhang, A. Anees, M. Ali, X. Zhang, S. Abbasi, and H. Nasir. 2020. “Controls on Reservoir Heterogeneity of a Shallow-Marine Reservoir in Sawan Gas Field, SE Pakistan: Implications for Reservoir Quality Prediction Using Acoustic Impedance Inversion.” Water 12 (11): 2972. https://doi.org/10.3390/w12112972.
Ashraf, U., H. Zhang, A. Anees, H. N. Mangi, M. Ali, X. Zhang, M. Imraz, et al. 2021. “A Core Logging, Machine Learning and Geostatistical Modeling Interactive Approach for Subsurface Imaging of Lenticular Geobodies in a Clastic Depositional System, SE Pakistan.” Natural Resources Research 30 (3): 2807–2830. https://doi.org/10.1007/s11053-021-09849-x.
Ashraf, U., H. Zhang, H. V. Thanh, A. Anees, M. Ali, Z. Duan, H. N. Mangi, and X. Zhang. 2024b. “A Robust Strategy of Geophysical Logging for Predicting Payable Lithofacies to Forecast Sweet Spots Using Digital Intelligence Paradigms in a Heterogeneous Gas Field.” Natural Resources Research: 1–22.
Ashraf, U., P. Zhu, Q. Yasin, A. Anees, M. Imraz, H. N. Mangi, and S. Shakeel. 2019. “Classification of Reservoir Facies Using Well Log and 3D Seismic Attributes for Prospect Evaluation and Field Development: A Case Study of Sawan Gas Field, Pakistan.” Journal of Petroleum Science & Engineering 175:338–351. https://doi.org/10.1016/j.petrol.2018.12.060.
Balmer, M., R. Weibel, and H. Huang. 2021. “Value of Incorporating Geospatial Information into the Prediction of On-Street Parking Occupancy – A Case Study.” Geo-Spatial Information Science 24 (3): 438–457. https://doi.org/10.1080/10095020.2021.1937337.
Beucher, H., F. Fournier, B. Doligez, and J. Rozanski. 1999. “Using 3D Seismic-Derived Information in Lithofacies Simulations. A Case Study.” Paper presented at the SPE Annual Technical Conference and Exhibition, Houston, Texas, October 3–6. https://doi.org/10.2118/56736-MS.
Bezdek, J. C., S. K. Chuah, and D. Leep. 1986. “Generalized K-Nearest Neighbor Rules.” Fuzzy Sets and Systems 18 (3): 237–256. https://doi.org/10.1016/0165-0114(86)90004-7.
Bhattacharya, S., T. R. Carr, and M. Pal. 2016. “Comparison of Supervised and Unsupervised Approaches for Mudstone Lithofacies Classification: Case Studies from the Bakken and Mahantango-Marcellus Shale, USA.” Journal of Natural Gas Science & Engineering 33:1119–1133. https://doi.org/10.1016/j.jngse.2016.04.055.
Bloch, S., R. H. Lander, and L. Bonnell. 2002. “Anomalously High Porosity and Permeability in Deeply Buried Sandstone Reservoirs: Origin and Predictability.” AAPG Bulletin 86 (2): 301–328. https://doi.org/10.1306/61EEDABC-173E-11D7-8645000102C1865D.
Cai, J., and Y. Chen. 2022. “A Novel Unsupervised Deep Learning Method for the Generalization of Urban Form.” Geo-Spatial Information Science 25 (4): 568–587. https://doi.org/10.1080/10095020.2022.2068384.
Chai, H., N. Li, C. Xiao, X. Liu, D. Li, C. Wang, and D. Wu. 2009. “Automatic Discrimination of Sedimentary Facies and Lithologies in Reef-Bank Reservoirs Using Borehole Image Logs.” Applied Geophysics 6 (1): 17–29. https://doi.org/10.1007/s11770-009-0011-4.
Chang, H. C., D. C. Kopaska, and H. C. Chen. 2002. “Identification of Lithofacies Using Kohonen Self-Organizing Maps.” Computers & Geosciences 28 (2): 223–229. https://doi.org/10.1016/S0098-3004(01)00067-X.
Chawshin, K., A. Gonzalez, C. F. Berg, D. Varagnolo, Z. Heidari, and O. Lopez. 2021. “Classifying Lithofacies
from Textural Features in Whole Core CT-Scan Images.” SPE Reservoir Evaluation & Engineering 24 (2): 341–357. https://doi.org/10.2118/205354-PA.
Dixit, N., P. McColgan, and K. Kusler. 2020. “Machine Learning-Based Probabilistic Lithofacies Prediction from Conventional Well Logs: A Case from the Umiat Oil Field of Alaska.” Energies 13 (18): 4862. https://doi.org/10.3390/en13184862.
Dubois, M. K., G. C. Bohling, and S. Chakrabarti. 2007. “Comparison of Four Approaches to a Rock Facies Classification Problem.” Computers & Geosciences 33 (5): 599–617. https://doi.org/10.1016/j.cageo.2006.08.011.
Feng, C., W. Shi, Y. Hu, and X. Zhao. 2018. “Depositional Environments and Petrofacies of X–XII Sand Groups of K2qn3 Formation, Daqingzijing Area, Songliao Basin, China.” Journal of Petroleum Exploration and Production Technology 8 (2): 363–374. https://doi.org/10.1007/s13202-017-0400-9.
Geng, J., W. Gan, J. Xu, R. Yang, and S. Wang. 2020. “Support Vector Machine Regression (SVR)-Based Nonlinear Modeling of Radiometric Transforming Relation for the Coarse-Resolution Data-Referenced Relative Radiometric Normalization (RRN).” Geo-Spatial Information Science 23 (3): 237–247. https://doi.org/10.1080/10095020.2020.1785958.
Guresen, E., and G. Kayakutlu. 2011. “Definition of Artificial Neural Networks with Comparison to Other Networks.” Procedia Computer Science 3:426–433. https://doi.org/10.1016/j.procs.2010.12.071.
Guyon, I., and A. Elisseeff. 2003. “An Introduction to Variable and Feature Selection.” Journal of Machine Learning Research 3:1157–1182. https://doi.org/10.1162/153244303322753616.
He, J., W. Ding, Z. Jiang, A. Li, R. Wang, and Y. Sun. 2016. “Logging Identification and Characteristic Analysis of the Lacustrine Organic-Rich Shale Lithofacies: A Case Study from the Es3L Shale in the Jiyang Depression, Bohai Bay Basin, Eastern China.” Journal of Petroleum Science & Engineering 145:238–255. https://doi.org/10.1016/j.petrol.2016.05.017.
Hemmesch, N. T., N. B. Harris, C. A. Mnich, and D. Selby. 2014. “A Sequence-Stratigraphic Framework for the Upper Devonian Woodford Shale, Permian Basin, West Texas.” AAPG Bulletin 98 (1): 23–47. https://doi.org/10.1306/05221312077.
Hussain, M., S. Liu, U. Ashraf, M. Ali, W. Hussain, N. Ali, and A. Anees. 2022. “Application of Machine Learning for Lithofacies Prediction and Cluster Analysis Approach to Identify Rock Type.” Energies 15 (12): 4501. https://doi.org/10.3390/en15124501.
Ioffe, S., and C. Szegedy. 2015. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.” Paper presented at the Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 448–456. https://doi.org/10.48550/arXiv.1502.03167.
Kim, J. 2022. “Lithofacies Classification Integrating Conventional Approaches and Machine Learning Technique.” Journal of Natural Gas Science & Engineering 100:104500. https://doi.org/10.1016/j.jngse.2022.104500.
Kohonen, T. 1991. “Self-Organizing Maps: Optimization Approaches.” Paper presented at the Proceedings of the 1991 International Conference on Artificial Neural Networks, 981–990. Espoo, Finland, June 24–28. https://doi.org/10.1016/B978-0-444-89178-5.50003-8.
Lai, J., X. Fan, B. Liu, X. Pang, S. Zhu, W. Xie, and G. Wang. 2020. “Qualitative and Quantitative Prediction of Diagenetic Facies via Well Logs.” Marine & Petroleum Geology 120:104486. https://doi.org/10.1016/j.marpetgeo.2020.104486.
Lai, J., G. Wang, S. Wang, J. Cao, M. Li, X. Pang, C. Han, et al. 2018a. “A Review on the Applications of Image Logs in Structural Analysis and Sedimentary Characterization.” Marine & Petroleum Geology 95:139–166. https://doi.org/10.1016/j.marpetgeo.2018.04.020.
Lai, J., G. Wang, S. Wang, J. Cao, M. Li, X. Pang, Z. Zhou, et al. 2018b. “Review of Diagenetic Facies in Tight Sandstones: Diagenesis, Diagenetic Minerals, and Prediction via Well Logs.” Earth-Science Reviews 185:234–258. https://doi.org/10.1016/j.earscirev.2018.06.009.
Lan, X., C. Zou, Z. Kang, and X. Wu. 2021. “Log Facies Identification in Carbonate Reservoirs Using Multiclass Semi-Supervised Learning Strategy.” Fuel 302:121145. https://doi.org/10.1016/j.fuel.2021.121145.
LeCun, Y., Y. Bengio, and G. Hinton. 2015. “Deep Learning.” Nature 521 (7553): 436–444. https://doi.org/10.1038/nature14539.
Li, L., Z. Wen, and Z. Wang. 2016. “Outlier Detection and Correction During the Process of Groundwater Lever Monitoring Base on Pauta Criterion with Self-Learning and Smooth Processing.” Paper presented at the AsiaSim SCS AutumnSim, 643. Springer, Singapore.
Li, S., H. He, R. Hao, H. Chen, H. Bie, and P. Liu. 2020. “Depositional Regimes and Reservoir Architecture Characterization of Alluvial Fans of Karamay Oilfield in Junggar Basin, Western China.” Journal of Petroleum Science & Engineering 186:106730. https://doi.org/10.1016/j.petrol.2019.106730.
Li, Y., T. Li, and H. Liu. 2017. “Recent Advances in Feature Selection and Its Applications.” Knowledge and Information Systems 53 (3): 551–577. https://doi.org/10.1007/s10115-017-1059-8.
Liu, J., Z. Liu, K. Xiao, Y. Huang, and W. Jin. 2020. “Characterization of Favorable Lithofacies in Tight Sandstone Reservoirs and Its Significance for Gas Exploration and Exploitation: A Case Study of the 2nd Member of Triassic Xujiahe Formation in the Xinchang Area, Sichuan Basin.” Petroleum Exploration and Development 47 (6): 1194–1205. https://doi.org/10.1016/S1876-3804(20)60129-5.
López, M., and M. Aldana. 2007. “Facies Recognition Using Wavelet Based Fractal Analysis and Waveform Classifier at the Oritupano-A Field, Venezuela.” Nonlinear Processes in Geophysics 14 (4): 325–335. https://doi.org/10.5194/npg-14-325-2007.
Lyu, Q., S. Luo, Y. Guan, J. Fu, X. Niu, L. Xu, S. Feng, and S. Li. 2019. “A New Method of Lithologic Identification and Distribution Characteristics of Fine-Grained Sediments: A Case Study in Southwest of Ordos Basin, China.” Open Geosciences 11 (1): 17–28. https://doi.org/10.1515/geo-2019-0002.
Nishitsuji, Y., and R. Exley. 2019. “Elastic Impedance Based Facies Classification Using Support Vector Machine and Deep Learning.” Geophysical Prospecting 67 (4): 1040–1054. https://doi.org/10.1111/1365-2478.12682.
Onalo, D., S. Adedigba, F. Khan, L. A. James, and S. Butt. 2018. “Data Driven Model for Sonic Well Log Prediction.” Journal of Petroleum Science & Engineering 170:1022–1037. https://doi.org/10.1016/j.petrol.2018.06.072.
Ozkan, A., S. Cumella, K. Milliken, and S. Laubach. 2011. “Prediction of Lithofacies and Reservoir Quality Using Well Logs, Late Cretaceous Williams Fork Formation, Mamm Creek Field, Piceance Basin, Colorado.” AAPG Bulletin 95 (10): 1699–1723. https://doi.org/10.1306/01191109143.
Pal, S. C., D. Ruidas, A. Saha, A. R. M. T. Islam, and I. Chowdhuri. 2022. “Application of Novel Data-Mining Technique Based Nitrate Concentration Susceptibility Prediction Approach for Coastal Aquifers in India.” Journal of Cleaner Production 346:131205. https://doi.org/10.1016/j.jclepro.2022.131205.
Pei, Y., R. Chen, D. Li, X. Xiao, and X. Zheng. 2023. “FCN-Attention: A Deep Learning UWB NLOS/LOS Classification Algorithm Using Fully Convolution Neural Network with Self-Attention Mechanism.” Geo-Spatial Information Science 27 (4): 1162–1181. https://doi.org/10.1080/10095020.2023.2178334.
Ruidas, D., S. C. Pal, A. R. M. Towfiqul, and A. Saha. 2023. “Hydrogeochemical Evaluation of Groundwater Aquifers and Associated Health Hazard Risk Mapping Using Ensemble Data Driven Model in a Water Scares Plateau Region of Eastern India.” Exposure and Health 15 (1): 113–131. https://doi.org/10.1007/s12403-022-00480-6.
Shi, X., Y. Cui, X. Guo, H. Yang, R. Chen, T. Li, R. Li, J. Wang, R. Wang, and L. Meng. 2017. “Logging Facies Classification and Permeability Evaluation: Multi-Resolution Graph Based Clustering.” Paper presented at the SPE Annual Technical Conference and Exhibition. https://doi.org/10.2118/187030-MS.
Song, L., Z. Liu, C. Li, C. Ning, Y. Hu, Y. Wang, F. Hong, et al. 2021. “Prediction and Analysis of Geomechanical Properties of Jimusaer Shale Using a Machine Learning Approach.” Paper presented at the SPWLA 62nd Annual Logging Symposium, May 2021, Virtual Event. https://doi.org/10.30632/SPWLA-2021-0089.
Soucy, P., and G. W. Mineau. 2001. “A Simple KNN Algorithm for Text Categorization.” Paper presented at the Proceedings 2001 IEEE International Conference on Data Mining, 647–648. San Jose, CA, USA.
Tian, Y., H. Xu, X. Y. Zhang, H. J. Wang, T. C. Guo, L. J. Zhang, and X. L. Gong. 2016. “Multi-Resolution Graph-Based Clustering Analysis for Lithofacies Identification from Well Log Data: Case Study of Intraplatform Bank Gas Fields, Amu Darya Basin.” Applied Geophysics 13 (4): 598–607. https://doi.org/10.1007/s11770-016-0588-3.
Valentín, M. B., C. R. Bom, J. M. Coelho, M. D. Correia, M. P. de Albuquerque, M. P. de Albuquerque, and E. L. Faria. 2019. “A Deep Residual Convolutional Neural Network for Automatic Lithological Facies Identification in Brazilian Pre-Salt Oilfield Wellbore Image Logs.” Journal of Petroleum Science & Engineering 179:474–503. https://doi.org/10.1016/j.petrol.2019.04.030.
Valzania, S., M. Kfoury, M. Grandis, A. Valdisturlo, G. Fanello, L. Guerra, S. Heikal, A. Kashif, and A. Sultan. 2011. “Kadanwari Field: A Tight Gas Reservoir Study and a Successful Pilot Well Give New Life to an Exploited Field.” Paper presented at the SPE EUROPEC/EAGE Annual Conference and Exhibition, Vienna, Austria, May 23–26. https://doi.org/10.2118/143001-ms.
Villegas, G., W. Liao, R. Criollo, W. Philips, and D. Ochoa. 2017. “Detection of Leaf Structures in Close-Range Hyperspectral Images Using Morphological Fusion.” Geo-Spatial Information Science 20 (4): 325–332. https://doi.org/10.1080/10095020.2017.1399673.
Wang, Z., D. Gao, X. Lei, D. Wang, and J. Gao. 2020. “Machine Learning-Based Seismic Spectral Attribute Analysis to Delineate a Tight-Sand Reservoir in the Sulige Gas Field of Central Ordos Basin, Western China.” Marine & Petroleum Geology 113:104136. https://doi.org/10.1016/j.marpetgeo.2019.104136.
Wu, D., S. Liu, H. Chen, L. Lin, Y. Yu, C. Xu, and B. Pan. 2020. “Investigation and Prediction of Diagenetic Facies Using Well Logs in Tight Gas Reservoirs: Evidences from the Xu-2 Member in the Xinchang Structural Belt of the Western Sichuan Basin, Western China.” Journal of Petroleum Science & Engineering 192:107326. https://doi.org/10.1016/j.petrol.2020.107326.
Ye, S.-J., and P. Rabiller. 2000. “A New Tool for Electro-Facies Analysis: Multi-Resolution Graph-Based Clustering.” Paper presented at the SPWLA 41st Annual Logging Symposium, Dallas, Texas, June 2000.
Ying, Z., and P. Bao-Zhi. 2011. “The Application of SVM and FMI to the Lithologic Identification of Volcanic Rocks.” Geophysical and Geochemical Exploration 35 (5): 634–633. https://doi.org/10.1007/s12583-011-0163-z.
Yu, Z., Z. Wang, F. Zeng, P. Song, B. A. Baffour, P. Wang, W. Wang, and L. Li. 2021. “Volcanic Lithology Identification Based on Parameter-Optimized GBDT Algorithm: A Case Study in the Jilin Oilfield, Songliao Basin, NE China.” Journal of Applied Geophysics 194:104443. https://doi.org/10.1016/j.jappgeo.2021.104443.
Zhang, J., W. Ambrose, and W. Xie. 2021. “Applying Convolutional Neural Networks to Identify Lithofacies of Large-N Cores from the Permian Basin and Gulf of Mexico: The Importance of the Quantity and Quality of Training Data.” Marine & Petroleum Geology 133:105307. https://doi.org/10.1016/j.marpetgeo.2021.105307.
Zhen, Y., Y. Xiao, X. Zhao, X. Lu, J. Fang, J. Kang, and L. Liu. 2023. “Identifying Lithofacies Types by Boosting Algorithm and Resampling Technique: A Case Study of Deep-Water Submarine Fans in an Oil Field in West Africa.” Petroleum Science and Technology: 1–24. https://doi.org/10.1080/10916466.2023.2256787.
Zheng, W., F. Tian, Q. Di, W. Xin, F. Cheng, and X. Shan. 2021. “Electrofacies Classification of Deeply Buried Carbonate Strata Using Machine Learning Methods: A Case Study on Ordovician Paleokarst Reservoirs in Tarim Basin.” Marine & Petroleum Geology 123:104720. https://doi.org/10.1016/j.marpetgeo.2020.104720.
Zhou, Z., G. Wang, Y. Ran, J. Lai, Y. Cui, and X. Zhao. 2016. “A Logging Identification Method of Tight Oil Reservoir Lithology and Lithofacies: A Case from Chang7 Member of Triassic Yanchang Formation in Heshui Area, Ordos Basin, NW China.” Petroleum Exploration and Development 43 (1): 65–73. https://doi.org/10.1016/S1876-3804(16)30007-6.




Published online: 18 Oct 2024.


