Seismic Inversion With Deep Learning
https://doi.org/10.1007/s10596-021-10118-2
ORIGINAL PAPER
Silvia L. Pintea1 · Siddharth Sharma2 · Femke C. Vossepoel3 · Jan C. van Gemert1 · Marco Loog1 ·
Dirk J. Verschuur2
Received: 29 January 2021 / Accepted: 11 November 2021 / Published online: 21 December 2021
© The Author(s) 2021
Abstract
This article investigates bypassing the inversion steps involved in a standard litho-type classification pipeline and performing
the litho-type classification directly from imaged seismic data. We consider a set of deep learning methods that map the
seismic data directly into litho-type classes, trained on two variants of synthetic seismic data: (i) one in which we image the
seismic data using a local Radon transform to obtain angle gathers, and (ii) another in which we start from the subsurface-
offset gathers, based on correlations over the seismic data. Our results indicate that this single-step approach provides a faster
alternative to the established pipeline while being convincingly accurate. We observe that adding the background model as
input to the deep network optimization is essential in correctly categorizing litho-types. Also, starting from the angle gathers
obtained by imaging in the Radon domain is more informative than using the subsurface offset gathers as input.
Fig. 1 Comparative diagram. Top: the baseline seismic inversion approach starting from an imaging process, followed by estimating the wavelet at the well, a time-consuming elastic inversion, and finally litho-type classification from estimated elastic parameters, followed by the verification step. Bottom: the proposed deep learning approach that aims at solving the seismic inversion problem in one step. We start from a local imaging process, where we use either subsurface-offset gathers or angle gathers. These, together with a background model obtained from the well-log, form the input of a deep network classifier which predicts litho-classes to be verified
approach known as Naive Bayes [21], which assumes independence between the variables, is easy to implement and provides good results. In this work, we apply both the full Bayesian and the Naive Bayes approach as baselines.

Here, we consider a set of deep learning methods inspired by the image segmentation and image classification literature [20, 23]. We compare the considered deep learning methods with an advanced elastic inversion approach, performing a non-linear elastic inversion based on a local full-waveform inversion (FWI) in the tau-p domain, which provides an estimate of the elastic parameters in the reservoir area [15]. These elastic parameters form the input of a Bayesian inference process to determine the litho-types. We evaluate the estimated litho-types against the petrophysical analysis, here referred to as the ground-truth labeling. Throughout this paper, we call this process the baseline approach, depicted in Fig. 1 (top). As an alternative to this, we propose a one-step algorithm, which bypasses the individual steps of elastic inversion and litho-type classification. In the proposal, as in the baseline approach, we start from the seismic data. We perform an imaging process to obtain either angle gathers [25] or, alternatively, subsurface-offset gather responses [29]. Note that the subsurface-offset gathers implicitly store angle-dependent information in the offset data. These responses represent the input of a deep-learning pipeline that maps them directly to corresponding litho-types. To solve the inversion problem without ambiguity, we also use background models as prior information and input these into the considered deep-learning pipeline. Figure 1 (bottom) depicts the deep learning proposal. The method for obtaining background models is similar in both the baseline and the deep learning approach. These background models represent prior geologic knowledge obtained from typical seismic velocity analysis. For the purpose of this paper, we create a synthetic earth model based on the “Book Cliffs” geological model, derived from the Book Cliffs outcrop in Colorado, Utah (US). We use this model to evaluate the deep learning approach and the baseline approach. In both cases, we evaluate the predicted litho-types by comparison against the ground-truth labeling. Although a direct comparison between the baseline and the deep learners is not possible due to the nature of the data processing, we evaluate both approaches as transparently as possible.

1.1 Naming conventions

Throughout the paper, we refer to “model” as the earth geophysical model. We use the names “learner” or “classifier” for the machine learning methods, and specifically the deep learning networks. We denote the litho-types predicted by the learners as “labels” or “targets”. We use the terms “classes” and “litho-classes” to refer to the litho-types discriminated by the considered classifiers.
2 Baseline: FWI and Bayesian classification

For the baseline approach, we carry out the estimation of lithology from seismic data in two steps. Initially, we invert the global seismic data to estimate the elastic properties, followed by a statistical classification of lithology using the inverted elastic properties and well logs [1]. In the present work, we use local 1.5D elastic full-waveform inversion (FWI) to invert the seismic data, and we map the lithology using a Bayesian classification method [10].

The dataset used in this article is based on the “Book Cliffs” geological model, derived from the outcrop in Colorado, Utah (US). The Book Cliffs model was initially built by Tetyukhina et al. (2014) to represent a realistic reservoir layer containing 8 different litho-types. This model was downscaled by [11] and changed to have improved geologic structures that contain 12 litho-types with unique vp (p-wave velocity), vs (s-wave velocity), and ρ (density). Typically, in lithology prediction, a rough separation of litho-types into 3-4 groups is defined based on the physical properties of rocks [34]. In our work, we group these 12 litho-types together to represent 4 different classes, each with distinct properties and increasing clay content: ‘coal’, ‘sand’, ‘sandy-shale’, and ‘shale’. Based on the lithological model, we generate the model for the compressional (p-wave) velocity, shear-wave (s-wave) velocity, and density of the rock (vp, vs, ρ) and upscale it to the seismic frequency (60 Hz) using Backus averaging [2]. The upscaling process introduces a spread in vp, vs, and ρ, as shown in Fig. 2. It is evident from Fig. 2 that the litho-classes correspond to well-separated ranges of values. Although in reality the classes are not well-separated, the Bayesian methods work by calculating posterior probabilities for every class and assigning the class with the highest posterior probability. If the data is highly overlapping, different classes may have similar posterior probabilities, thus greatly affecting the classifier’s performance. Therefore, here we consider a more idealistic case. In our synthetic experiment, we define a ground-truth model that contains the three elastic properties of rocks, vp, vs, and ρ, measured at different well locations, and interpolate these spatially. Figure 3 shows the sampling of the locations along which we define the three data subsets we use for training, validating, and testing our classifiers.

The inversion scheme applied in this paper was developed by Gisolf and van den Berg [16] and more extensively described by Gisolf, Haffinger, and Doulgeris [17]. The method has been applied in various studies for the purpose of reservoir-oriented seismic inversion of synthetic data [12]. This inversion scheme has been shown to provide better quantitative images of the subsurface for both synthetic and real-field data [13, 26, 27]. The inversion uses a wave-equation-based approach on pre-stack AVO (amplitude versus offset) or AVP (amplitude versus ray-parameter) information, and it solves locally the 1.5D full elastic wave equation, in conjunction with inverting for the elastic parameters: compressibility (κ), shear-compliance (M), and density (ρ).

For the application of the inversion scheme on our considered Book Cliffs model, we generate synthetic data in the tau-p domain over this model. For this, we use the Kennett invariant embedding method [19], for 10 different ray-parameters, or horizontal slowness parameters. The highest ray parameter is such that on the outermost trace the maximum angle of incidence is 42 degrees. We make this choice because in real data we never have angles that
Fig. 4 Inverted elastic properties. Top: Our modified Book Cliffs model with 4 litho-types. Middle: Inverted compressibility (κ). Bottom:
Inverted shear-compliance (M). We use full-waveform inversion to invert for the elastic properties
are more than 40 degrees, and we know that FWI can suffer from missing angles. This allows us to test the limitations of the learning approaches in terms of missing angles.

For the modeling, we use a zero-phase band-pass wavelet with a maximum frequency of 60 Hz. Figure 4 shows the synthetic data after inversion. The inversion reconstructs the general structure of the lithofacies model; however, it fails to resolve the thin coal seams. In addition, in some places, the lateral continuity is missing. This may be due to the application of the inversion on individual common mid-points (CMPs) independently. An accurate estimation of density requires a broad angle range (> 45 degrees). Given that the synthetic modeling contains less than 45 degrees, we did not invert for density.

2.2.1 Full-waveform inversion details

In this study, we use an inversion based on the wave equation on pre-stack AVO, or AVP. We solve the 1.5D full elastic wave equation locally, in conjunction with inverting for the elastic parameters: compressibility κ and shear compliance M, or their inverses: bulk modulus K and shear modulus μ. If data quality permits, density ρ can also be inverted for.

FWI parameterization In this inversion scheme, the outputs are the contrast in shear compliance M = 1/μ (μ is the shear modulus), compressibility κ = 1/K (K is the bulk modulus), and bulk density (ρ). The properties actually inverted for are the normalised relative contrasts of the absolute properties against very smooth background properties (κ0, M0, ρ0). The background medium is a very important aspect of the inversion, as the incident wave field and Green’s function are calculated in this medium. It has to be smooth because we rely on the WKB (Wentzel–Kramers–Brillouin) approximation, which is only valid in smooth media. Additionally, it should be a smooth heterogeneous background medium that is non-reflective over the data bandwidth. On the other hand, one would like to have as much information as possible in the background, because it represents the starting model for the inversion, and to keep the contrasts χ as low as possible to reduce the non-linearity of the problem. However, for the current state of the art, we need to keep the backgrounds non-reflective over the bandwidth of the data. Usually, low wave-number backgrounds are derived from well logs and interpolated between well logs. That is another reason why the background should have only a low wave-number content.

Forward modelling The forward modelling in the inversion is based on the scattering approach for calculating wave propagation in inhomogeneous elastic media. This makes use of the integral formulation of the wave equation. For the purpose of matching it to the observed data, we use the data equation, which is a subset of the full integral
equation, or the object equation. We show the data equation and object equation for the simple single-parameter acoustic case:

P_data(x_r, x_s, ω) = ∫_{x∈D} G(x_r, x, ω) χ(x) P_tot(x, x_s, ω) dx    (1)

where G(·) is the Green’s function, x_r and x_s are the receiver location and source location, respectively, and ω is the frequency. The integral over x is an integral over the whole object domain. Equation 1 predicts the data recorded at the surface P_data in terms of the wave field transmitted by a source that propagates to every point in the subsurface. The wave field is transmitted back from the points where the contrast χ is non-zero to the surface through the smooth background medium. The contrast function χ is:

χ(x) = 1 − (c_0(x) / c(x))²    (2)

where c(x) is the unknown subsurface acoustic wave-velocity model and c_0(x) is the known background medium. On the other hand, the object domain equation predicts the total wave field at each grid point in the subsurface:

P_tot(x, x_s, ω) = P_inc(x, x_s, ω) + ∫_{x′∈D} G(x, x′, ω) χ(x′) P_tot(x′, x_s, ω) dx′    (3)

Equation 3 can be used to estimate the total wave field with all its complex propagation, given that the contrast function is known. Equation 3 can be substituted in Eq. 1 to obtain the recorded seismic data in terms of subsurface properties.

Optimization scheme The inversion is an iterative process where the linearized inversion of the recorded data is alternated with the re-calculation of the total wave field in the object domain (Fig. 5). The inversion kernel and the re-calculation of the total wave field are based on the full elastic wave equation. These are carried out such that every re-calculation brings in a higher order of multiple scattering in the modelled data. Optimization is needed to ensure non-divergence of the field updates. In the context of the iterative inversion scheme, Eqs. 1 and 3 are solved alternately for the elastic case. The process is augmented by using the Born approximation, where the incident wave field propagating in the background medium is subjected to a simple linear inversion to estimate the approximate subsurface properties. These approximate subsurface properties are then used to update the total field in the domain using Eq. 3. This process is repeated until the estimated subsurface properties and the updated total field no longer change.

2.3 Bayesian classification

Our considered baseline approaches follow [4, 31] and rely on a full Bayesian classifier and a Naive Bayes classifier. For classification, one is interested in calculating the class C_j (litho-type) probabilities given the elastic properties X (observed data), and assigning the class with the highest posterior probability:

argmax_{C_j} P(C_j | X=x) = argmax_{C_j} [P(X=x | C_j) / P(X=x)] P(C_j),    (4)

where P(C_j | X=x) is the posterior probability of the jth litho-type given the elastic parameters, P(C_j) is the prior litho-type probability, and P(X=x | C_j) is the likelihood of the elastic property (X=x) to be from the jth litho-type. P(X=x) = ∑_{k=1}^{m} P(X=x | C_k) P(C_k) is the probability of the elastic property. If we have n elastic properties for a given depth sample, the posterior in Eq. 4 becomes:

P(C_j | X_1=x_1, .., X_n=x_n) = [P(X_1=x_1, .., X_n=x_n | C_j) / P(X_1=x_1, .., X_n=x_n)] P(C_j).    (5)

Eq. 5 provides the full Bayesian treatment by including the correlation among all elastic properties, giving rise to a multivariate distribution conditioned on the classes. We estimate this posterior from the training data by using kernel density estimation (KDE) with Gaussian kernels. A simplified approach of Eq. 5 is the Naive Bayes classifier, which assumes independence among elastic properties:

P(C_j | X_1=x_1, .., X_n=x_n) = P(C_j) ∏_{i=1}^{n} P(X_i=x_i | C_j) / [∑_{k=1}^{m} P(C_k) ∏_{i=1}^{n} P(X_i=x_i | C_k)]    (6)
where m is the number of litho-types. The above posterior only requires evaluating univariate distributions conditioned on the classes. As before, we estimate these from the training dataset using KDE with Gaussian kernels. We estimate the target priors P(C_j) by counting the number of instances for each target in the training dataset and normalising this to 1.

For the full Bayesian treatment we estimate a bivariate density (P(κ, M)), whereas for the Naive Bayes we obtain univariate distributions (P(κ)P(M)). We do not expect a significant difference between the full Bayesian and Naive Bayes classifiers because, in our case, we consider only two inverted elastic properties: compressibility (κ) and shear compliance (M). Figure 6 shows the estimated 2D probability densities for the four litho-type classes.

3 Seismic litho-class estimation with deep learning

For the deep learning approaches, we employ the same “Book Cliffs” geological model described in Section 2.1. This is a single model corresponding to a single ground-truth lithology output. Deep networks require numerous training examples (models) with their associated labels (ground-truth lithologies). To overcome this bottleneck we propose to process the data as described below.

3.1 Data description for deep learning

Starting from the same geological model as for the baseline, we create a large number of samples to be used for training the deep-learning classifiers. The dataset creation process depicted in Fig. 7 is as follows: we construct snippets of subsurface models from a range of possible models that we expect in our geology. A snippet is a very small piece of a 1D model (say 150 meters) containing a few layers of elastic parameters. Each snippet is extended by including an overburden, creating a realistic-scale 1.5D model. However, the overburden has smooth velocity variations. Subsequently, we generate a seismic response and apply the backpropagation and imaging process to arrive at the local subsurface gathers, either in the linear Radon domain or in the subsurface offset domain. These local responses are used as input for the training stage. By this process, all subsurface responses have natural limitations from the overburden and surface acquisition parameters (aperture, seismic source signal, etc.).
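The Bayesian classification of Section 2.3 (Eq. 4 and its Naive Bayes simplification) can be sketched compactly. The snippet below is a minimal NumPy illustration with a hand-rolled univariate Gaussian KDE, a fixed bandwidth, and toy two-class data; it is not the paper's implementation, and the bandwidth and sample sizes are assumptions for illustration only:

```python
import numpy as np

def gaussian_kde_logpdf(train, x, bandwidth):
    """Log-density of a 1D Gaussian KDE fitted on `train`, evaluated at `x`."""
    diff = (x[:, None] - train[None, :]) / bandwidth
    log_kernels = -0.5 * diff**2 - 0.5 * np.log(2 * np.pi) - np.log(bandwidth)
    return np.logaddexp.reduce(log_kernels, axis=1) - np.log(len(train))

def naive_bayes_kde_predict(X_train, y_train, X_test, bandwidth=0.5):
    """Naive Bayes with per-class univariate Gaussian-KDE likelihoods.

    X_train: (N, n) elastic properties; y_train: (N,) litho-class labels.
    Returns the argmax-posterior class per test sample (Eq. 4).
    """
    classes = np.unique(y_train)
    log_post = np.zeros((len(X_test), len(classes)))
    for j, c in enumerate(classes):
        Xc = X_train[y_train == c]
        # Prior P(C_j): class counts normalised to 1.
        log_post[:, j] = np.log(len(Xc) / len(X_train))
        # Independence assumption: sum of univariate log-likelihoods.
        for i in range(X_train.shape[1]):
            log_post[:, j] += gaussian_kde_logpdf(Xc[:, i], X_test[:, i], bandwidth)
    # The shared evidence term P(X=x) does not change the argmax.
    return classes[np.argmax(log_post, axis=1)]

# Toy example with two well-separated "elastic properties".
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 0.3, (100, 2)), rng.normal(3.0, 0.3, (100, 2))])
y_train = np.array([0] * 100 + [1] * 100)
pred = naive_bayes_kde_predict(X_train, y_train, np.array([[0.1, 0.0], [2.9, 3.1]]))
# pred → [0, 1]
```

The full Bayesian variant of Eq. 5 would replace the per-property product by a single multivariate (here bivariate) KDE per class; everything else stays the same.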
Fig. 7 Deep learning dataset creation: (i) We select a vertical cross-section. (ii) Along this section we crop fixed-length snippets. (iii) These snippets have associated litho-classes, representing the ground-truth labels for the deep-learning method. (iv) The original snippet is blurred and subsampled, and this creates the background model for this specific snippet. (v) The same input snippets together with a simple overburden are used to simulate seismic responses. (vi) The seismic responses are then either (vi.1) imaged via the linear Radon domain to obtain angle gathers, or (vi.2) cross-correlated to obtain subsurface-offset gathers. These together with the background models (iv) represent the inputs to the deep networks
Figure 7 shows all steps involved in dataset creation: (i) We select a vertical cross-section (a “well-log”). (ii) Along this cross-section we crop fixed-length snippets. (iii) The litho-classes corresponding to these snippets are used as ground truth for the deep-learning pipeline – the learning labels. (iv) We blur and subsample the original elastic properties in the snippet, vp, vs, ρ, to form the background model. For fairness of comparison, we use the same procedure in the baseline method for defining the model prior. (v) We add a simple overburden to each snippet containing elastic properties and simulate the expected seismic responses through this overburden. This imposes typical offset/angle limitations on the data and includes typical imaging artifacts. (vi) Finally, we image the simulated seismic responses, giving rise to two different datasets:

(1) Book-Cliffs angle: We preprocess the seismic data corresponding to each location along with the snippet, via the linear Radon domain, to obtain the angle gathers [9, 33]. Note that we refer to these as “angle gathers” while in principle they are horizontal ray-parameter gathers.
(2) Book-Cliffs offset: We define subsurface-offset gathers as cross-correlations between the forward-propagated source fields and the backpropagated seismic responses [6, 29]. This is done via depth migration, specifically in our case WEM (Wave Equation Migration) using one-way recursive propagation operators, and applying a subsurface-offset imaging condition at each depth level.

The considered deep-learning classifiers use the background model as input, as well as a dataset-specific input: angle gathers or subsurface-offset gathers. The true litho-classes represent the learning targets. We sample the three data subsets (training, test, validation) at the same locations as for the baseline classifiers (Fig. 3), where each location contains ≈ 1,900 data tuples composed of: background model, imaged seismic inputs, and litho-classes.

3.2 Deep learning approaches

For deep learning methods, the few hundred samples available represent a relatively small dataset size. Because of this, we consider a set of small network architectures, selected based on their popularity for visual classification tasks resembling our litho-type classification task. LeNet [20] is a standard, commonly-used architecture for classification on small datasets, while UNet [23] is the most popular segmentation architecture to date. The lithology classification implies predicting a class label at every input location, which is similar to image semantic segmentation. However, our outputs are 1D vectors of litho-types rather than 2D segmentation maps. We experiment with the following architecture variations: (a) LeNet-2D: a 2D network inspired by the popular LeNet-5 architecture [20], as depicted in Fig. 8(a); (b) LeNet-1D: a 1D version of the same network, depicted in Fig. 8(b), where the 1D convolution is performed across the angles when using the angle gathers, and across the offsets when using subsurface-offset gathers; (c) UNet-2D: a light-weight version of the network
Fig. 8 Deep learning methods: (a) LeNet-2D inspired by [20], where the background and input seismic data are processed on separate branches; (b) LeNet-1D is similar to the LeNet-2D but using only 1D convolutions; (c) UNet-2D inspired by [23], where the seismic inputs and the background models are concatenated at input level; (d) UNet-1D is similar to the UNet-2D while performing 1D convolutions. The yellow blocks represent the inputs/outputs of the learners, the blue blocks are convolutions, and the gray blocks are fully-connected layers
proposed by [23], shown in Fig. 8(c); and (d) UNet-1D: a UNet-inspired [23] 1D network displayed in Fig. 8(d), where again the 1D convolution is performed across the angles/offsets dimension. Given the limited depth of the networks, for all architectures we use filter sizes of 13 × 13 px and 13 × 1 px, respectively, to increase the receptive field size.

We feed into the networks the input data and the background model and predict the associated ground-truth litho-classes. In the LeNet-like architectures, we add the background model via fully-connected layers into the network and concatenate its output with the downstream network features. The UNet-like networks are fully-convolutional, thus we concatenate the background model to the seismic data as input to the network. For the 2D UNet we replicate the background model spatially to match the resolution of the seismic data.
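The way the background model enters a fully-convolutional network can be sketched as a channel concatenation with spatial replication. The shapes and function name below are illustrative assumptions, not the exact dimensions used in the paper:

```python
import numpy as np

def assemble_unet_input(angle_gather, background):
    """Concatenate an imaged gather with a replicated background model.

    angle_gather: (n_angles, n_depth) imaged seismic input.
    background:   (3, n_depth) smooth vp, vs, rho background traces.
    For a 2D network, each background trace is replicated along the
    angle axis so both inputs share the same spatial resolution.
    """
    n_angles, _ = angle_gather.shape
    # (3, n_depth) -> (3, n_angles, n_depth): repeat each trace per angle.
    bg_replicated = np.repeat(background[:, None, :], n_angles, axis=1)
    seismic = angle_gather[None, :, :]  # (1, n_angles, n_depth)
    # Channel-wise concatenation: 1 seismic channel + 3 background channels.
    return np.concatenate([seismic, bg_replicated], axis=0)

x = assemble_unet_input(np.zeros((10, 150)), np.ones((3, 150)))
# x.shape → (4, 10, 150)
```

A 1D variant would skip the replication and concatenate the background traces to each depth column directly.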
Fig. 9 Visualization of baseline predictions: Example of true and inverted elastic properties at common-mid-point 200, together with the predicted litho-types versus the true litho-types. The predictions are made using full Bayesian and Naive Bayes classification over the inversion results for M and κ (inverted: blue, true: red). The Naive Bayes has marginally better predictions
Table 1 Comparative numeric evaluation: Class-normalized accuracies on the two datasets for the considered network architectures as well as the two baseline methods.
We evaluate also the added value of the background models. The results indicate that starting from the angle gathers (Book-Cliffs angle) is slightly more beneficial, especially for the fully-convolutional networks. Overall, the UNet classifiers on Book-Cliffs angle achieve the best generalization
Bayesian and Naive Bayes classifiers to these inputs. Figure 9 shows the predictions made on common-mid-point 200 for the full Bayesian and Naive Bayes classifiers, along with the true lithology. The average accuracy of both classifiers is ≈70-75%; however, Naive Bayes outperforms the full Bayesian classifier for some targets, especially for the ‘shale’ and ‘sandy-shale’ classes. The low accuracy can be attributed to the imperfect inversion results used as a starting point for the lithology classification. The true parameters versus inverted parameters are shown in Fig. 9 (left).

4.2 Deep learning results

To bypass the inversion step in the baseline method, we use deep networks trained on the training sets of our two synthetic datasets: Book-cliffs angle and Book-cliffs offset. We evaluate on the corresponding test sets, where we set the hyperparameters using the validation sets.

All deep networks are trained for 500 epochs, using the standard Adam optimizer and batch sizes of 64 samples. We used a scheduled learning rate starting from 0.003 and reduced at epochs 100, 200 and 300, and a weight decay of 0.0001. We standardize the input data, both the imaged seismic data and the background models, by making them zero mean and unit standard deviation, using training-set statistics.

Table 1 shows the classification accuracies of the deep learners, LeNet-1D, LeNet-2D, UNet-2D and UNet-1D, when compared to each other and with the baseline methods. We also report the performance of the networks with and without the background models, as well as just using the background models. For all considered classifiers, we measure performance as class-normalized accuracies. For the deep learning classifiers, we also report standard deviations over three runs and the number of parameters. The two LeNet-inspired networks perform on par in terms of accuracy, while the UNet methods are more accurate on both datasets, both with and without background models. Adding the background models as input to the network is beneficial for all architectures, and more specifically for the UNet architectures, where it gives a 13-16% improvement in accuracy. Compared to each other, the UNet networks using 1D versus 2D inputs perform approximately on par when considering the variance of the learners across different initializations. The advantage of the UNet methods may come from both having a fully convolutional network incorporating translation invariance, and from concatenating the background models earlier on in the network rather than at the end through a fully-connected layer. The deep network architectures obtain competitive performance when compared to the baseline approaches; however, it is difficult to reach a strong conclusion, as the seismic data is processed differently between the baseline
Fig. 10 Class confusion matrices: (a) The full Bayesian classification with an average accuracy of ≈ 70%; (b) The Naive Bayes classification with an average accuracy of ≈ 73%; (c) The LeNet-2D with an accuracy of ≈ 73%; (d) The LeNet-1D with an accuracy of ≈ 69%; (e) The UNet-2D with an accuracy of ≈ 83%; and (f) The UNet-1D classifier on the Book-Cliffs angle test data with an accuracy of ≈ 87%. The Naive Bayes classifier outperforms the full Bayesian classifier. The UNet classifiers display less class confusion
Fig. 11 Visualization of UNet-1D predictions: Results on the Book-Cliffs angle test set. We display the true litho-types versus the predicted litho-types. Each color represents a litho-type. We can see the predicted litho-types follow the true litho-types, which indicates that the network is able to generalize well on our considered data
and the deep-network methods. We additionally test using only the background models, and we find the deep networks are effective at extracting the correct lithology class from only the input blurred version of the elastic properties. This may be due to the well-known ability of deep learning methods to perform input deblurring.

Figure 10 shows the confusion matrices for the two baseline approaches and for the deep learning methods trained on Book-Cliffs angle. In the Bayesian approaches, ‘sand’ seems to be often misclassified as ‘shale’, and ‘coal’ as ‘shale’, while the LeNet-like networks display a larger confusion for the class ‘coal’. For the UNet-1D deep network, the errors are different: the most common misclassification is ‘coal’ as ‘sand’. Although we train and evaluate at the same locations in the Book Cliffs model, these confusion matrices cannot be directly compared between the baselines and the deep networks, because the methods process the seismic data differently.
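The class-normalized accuracy reported throughout can be computed from a confusion matrix as the mean of per-class accuracies, so that rare classes such as ‘coal’ count equally. A small NumPy sketch with illustrative toy labels (the class count of 4 matches the litho-classes; everything else is assumed for the example):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows: true litho-class, columns: predicted litho-class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def class_normalized_accuracy(y_true, y_pred, n_classes):
    """Mean of per-class recalls, insensitive to class imbalance."""
    cm = confusion_matrix(y_true, y_pred, n_classes)
    # Guard against classes absent from y_true (empty rows).
    per_class = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)
    return per_class.mean()

# Toy example: class 2 ('sandy-shale', say) is always misclassified as 3.
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 0, 1, 1, 3, 3, 3, 3])
acc = class_normalized_accuracy(y_true, y_pred, n_classes=4)
# acc → 0.75 (three classes fully correct, one fully wrong)
```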
Table 2 Importance of background models: We evaluate the importance of the background model for the deep networks by considering four cases: no background, using a constant background, using the median vp, vs, ρ value of the blurred background per snippet, as well as using the blurred background
We report accuracy over 3 repetitions on the validation set for both the Book-cliffs offset and Book-cliffs angle datasets. A constant background
across locations is detrimental to the classification, while the more informative the background is, the higher the accuracy.
Figure 11 visualizes true and predicted litho-types over the test set of Book-Cliffs angle when using the UNet-1D learner. Each color represents one of the 4 litho-types: the predicted litho-types closely resemble the true ones. The explanation for this exceptional performance is that previous work shows that a shallow network can successfully approximate the Zoeppritz equations used in the FWI [14], and the FWI is an iterative process adapting a set of parameters to a data fit, which is similar to the training procedure of artificial neural networks. Additionally, our data does not contain neighboring seismic interferences and relies on informative background models. We conclude that on this specific dataset the UNet-1D network can generalize well. However, we do not know how well these classifiers generalize to real seismic data, which is more challenging, and when using less informative background models.

4.2.1 Importance of background models

Here we test the importance of the background models in the deep learning methods. We consider four scenarios: using no background, using a constant background obtained by keeping only the background values at training location CMP-100, using a background defined as a single median vp, vs, ρ value of the already blurred background per snippet, and using the blurred background. We report validation accuracies over 3 repetitions for both the Book-cliffs offset and Book-cliffs angle datasets.

Deep networks are well known to be able to perform input deblurring; therefore, the best results are obtained when using the blurred background. Interestingly, in Table 2, using a constant background is detrimental to the lithology classification, especially for the deep learning methods where the background is processed earlier on in the network, such as the UNet methods. Moreover, for the UNet-2D the background is replicated to match the input data dimensions, thus the network focuses more on the background, and if this background is not informative the accuracy suffers greatly. Just using a median value of the blurred background allows the network to find better solutions when compared to not using any background or using an uninformative background such as the constant background. For the deep networks the background plays a stronger role than for the standard baselines, and therefore a more informative background greatly improves results.
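The four background scenarios above can be sketched by degrading a true elastic trace in different ways. The snippet below is an illustrative assumption, not the paper's processing: the blur is a simple moving average, the subsampling factor is arbitrary, and the "constant" variant just holds one value everywhere rather than the actual CMP-100 values:

```python
import numpy as np

def make_backgrounds(trace, blur_len=15, subsample=5):
    """Build illustrative background variants for one 1D elastic trace.

    trace: (n_depth,) true elastic property (e.g. vp) along depth.
    Returns a dict of background models, each the same length as `trace`.
    """
    # Blurred background: moving-average smoothing, then subsample and
    # hold each value over the subsampling interval (blur + subsample).
    kernel = np.ones(blur_len) / blur_len
    blurred = np.convolve(trace, kernel, mode="same")
    coarse = np.repeat(blurred[::subsample], subsample)[: len(trace)]
    return {
        "none": np.zeros_like(trace),                       # no background
        "constant": np.full_like(trace, trace[0]),          # one value everywhere
        "median": np.full_like(trace, np.median(blurred)),  # per-snippet median
        "blurred": coarse,                                  # smooth low-wavenumber model
    }

bgs = make_backgrounds(np.linspace(2000.0, 3000.0, 150))
```

Ordered from least to most informative, these variants mirror the columns of Table 2: the smoother, locally-derived background carries the most recoverable information.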
Fig. 12 Training curves: We show the training and validation losses and accuracies of the four considered deep networks, when training on the Book-cliffs angle dataset and evaluating on the validation set: (a) LeNet-2D, (b) LeNet-1D, (c) UNet-2D, (d) UNet-1D. The LeNet-based architectures display a strongly overfitting behavior where the training loss is nearly 0, while the validation loss is increasing. The UNet-based networks are less prone to overfitting
11. Feng, R., Luthi, S.M., Gisolf, D., et al.: Obtaining a high-resolution geological and petrophysical model from the results of reservoir-orientated elastic wave-equation-based seismic inversion. Pet. Geosci. 23(3), 376–385 (2017)
12. Feng, R., Luthi, S.M., Gisolf, D., et al.: Obtaining a high-resolution geological and petrophysical model from the results of reservoir-orientated elastic wave-equation-based seismic inversion. Pet. Geosci. 23(3), 376–385 (2017). https://doi.org/10.1144/petgeo2015-076
13. Feng, R., Luthi, S.M., Gisolf, D., et al.: Reservoir lithology classification based on seismic inversion results by hidden Markov models: Applying prior geological information. Mar. Pet. Geol. 93, 218–229 (2018). https://doi.org/10.1016/j.marpetgeo.2018.03.004
14. Ganssle, G.: Neural networks. Lead. Edge 37(8), 616–619 (2018)
15. Gisolf, A., van den Berg, P.: Target oriented non-linear inversion of seismic data. In: 72nd EAGE Conference and Exhibition - Workshops and Fieldtrips, European Association of Geoscientists & Engineers, pp. cp–161 (2010)
16. Gisolf, A., van den Berg, P.: Target-oriented non-linear inversion of time-lapse seismic data. In: SEG Technical Program Expanded Abstracts 2010, Society of Exploration Geophysicists, pp. 2860–2864 (2010)
17. Gisolf, D., Haffinger, P.R., Doulgeris, P.: Reservoir-oriented wave-equation-based seismic amplitude variation with offset inversion, vol. 5. https://doi.org/10.1190/int-2016-0157.1 (2017)
18. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
19. Kennett, B.: Seismic Wave Propagation in Stratified Media. ANU Press (2009)
20. LeCun, Y., Bottou, L., Bengio, Y., et al.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
21. Li, Y., Anderson-Sprecher, R.: Facies identification from well logs: A comparison of discriminant analysis and naïve Bayes classifier. J. Pet. Sci. Eng. 53(3), 149–157 (2006)
22. Loog, M.: Supervised classification: Quite a brief overview. In: Machine Learning Techniques for Space Weather, pp. 113–145. Elsevier (2018)
23. Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241. Springer (2015)
24. Sakurai, S., Melvin, J., et al.: Facies discrimination and permeability estimation from well logs for the Endicott field. In: SPWLA 29th Annual Logging Symposium, Society of Petrophysicists and Well-Log Analysts (1988)
25. Sava, P.C., Fomel, S.: Angle-domain common-image gathers by wavefield continuation methods. Geophysics 68(3), 1065–1074 (2003)
26. Sharma, S., Gisolf, D., Luthi, S., et al.: Strategies to include geological knowledge in full waveform inversion. In: International Conference and Exhibition, Barcelona, Spain, 3-6 April 2016, Society of Exploration Geophysicists and American Association of Petroleum Geologists. https://doi.org/10.1190/ice2016-6518022.1 (2016)
27. Sharma, S., Gisolf, D., Luthi, S.: Bayesian update of wave equation based seismic inversion using geological prior information and scenario testing. In: 80th EAGE Conference and Exhibition 2018, European Association of Geoscientists & Engineers, vol. 1, pp. 1–5 (2018)
28. Siripitayananon, P., Chen, H.C., Hart, B.S.: A new technique for lithofacies prediction: back-propagation neural network. In: Proceedings of ACMSE: The 39th Association of Computing Machinery South Eastern Conference, Citeseer, pp. 31–38 (2001)
29. Symes, W.W.: Migration velocity analysis and waveform inversion. Geophys. Prospect. 56(6), 765–790 (2008)
30. Tang, H., White, C.D.: Multivariate statistical log log-facies classification on a shallow marine reservoir. J. Pet. Sci. Eng. 61(2-4), 88–93 (2008)
31. Vossepoel, F., Darnet, M., Gesbert, S., et al.: Detecting hydrocarbons in carbonates: Joint interpretation of CSEM and seismic. In: Society of Exploration Geophysicists International Exposition and 80th Annual Meeting 2010, SEG (2010)
32. White, R., Simm, R.: Tutorial: Good practice in well ties. First Break 21(10) (2003)
33. van Wijngaarden, A.: Imaging and characterization of angle-dependent seismic reflection data. PhD thesis, Delft University of Technology (1998)
34. Zhao, T., Li, F., Marfurt, K.J.: Seismic attribute selection for unsupervised seismic facies analysis using user-guided data-adaptive weights. Geophysics 83(2), O31–O44 (2018)
35. Zheng, Y.: Elastic pre-stack seismic inversion in stratified media using machine learning. In: 81st EAGE Conference and Exhibition 2019 (2019)

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.