Learned multiphysics inversion with differentiable programming and machine learning

Mathias Louboutin1*, Ziyi Yin2*, Rafael Orozco3, Thomas J. Grady II3, Ali Siahkoohi3, Gabrio Rizzuti4, Philipp A. Witte5, Olav Møyner6, Gerard J. Gorman7, and Felix J. Herrmann1

Downloaded 09/24/24 to 128.12.122.235. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms


https://doi.org/10.1190/tle42070474.1

Abstract

We present the Seismic Laboratory for Imaging and Modeling/Monitoring open-source software framework for computational geophysics and, more generally, inverse problems involving the wave equation (e.g., seismic and medical ultrasound), regularization with learned priors, and learned neural surrogates for multiphase flow simulations. By integrating multiple layers of abstraction, the software is designed to be both readable and scalable, allowing researchers to easily formulate problems in an abstract fashion while exploiting the latest developments in high-performance computing. The design principles and their benefits are illustrated and demonstrated by means of building a scalable prototype for permeability inversion from time-lapse crosswell seismic data, which, aside from coupling of wave physics and multiphase flow, involves machine learning.

Motivation

Advancements in high-performance computing techniques have led to giant leaps in computational (exploration) geophysics over the past decades. These developments have led, for instance, to the adoption of wave-equation-based inversion technologies such as full-waveform inversion (FWI) and reverse time migration (RTM) that, due to their adherence to wave physics, have resulted in superior imaging in complex geologies. While these techniques rank among the most sophisticated imaging technologies, their implementation relies, with few exceptions, on monolithic low-level (C/Fortran) implementations. The most notable exceptions are iWave++ (Sun and Symes, 2010), the Julia Devito Inversion framework (JUDI.jl) of the Seismic Laboratory for Imaging and Modeling (SLIM) (Witte et al., 2019a; Louboutin et al., 2023), and Chevron’s COFII (Washbourne et al., 2021). As a consequence, due to their lack of abstraction and modern programming constructs, these low-level implementations are difficult and costly to maintain, especially when performance considerations prevail over best software practices. A noteworthy attempt at modernizing wave-equation inversion frameworks is Deepwave (Richardson, 2018), which implements FWI using PyTorch (Paszke et al., 2019). Despite state-of-the-art examples and applications for 2D inversion, this work is limited by the aforementioned pitfalls as it relies on handwritten low-level C/CUDA code, reducing the flexibility and extensibility to new physics and three-dimensional problems. It also does not integrate machine learning with FWI as advocated in this work. While these implementation design choices lead to performant code for specific problems, such as FWI, they often hinder the implementation of new algorithms, e.g., based on different objective functions or constraints, as well as coupling existing code bases with external software libraries. For instance, combining wave-equation-based inversion with machine learning frameworks or coupling wave physics with multiphase fluid-flow solvers is considered challenging and costly. Thus, our industry runs the risk of losing its ability to innovate, a situation exacerbated by the challenges we face due to the energy transition.

In this work, we present a flexible and agile software framework that aims to resolve these challenges and is designed to be scalable, differentiable, and interoperable. We first introduce the design principles of our software framework, followed by a concrete usage scenario for time-lapse seismic monitoring of geologic carbon storage. This illustrative and didactic example involves the integration of multiple software modules for different types of physics with machine learning techniques such as learned deep priors and neural surrogates. For each module, we explain the choices we made and how these modules are connected through software abstractions and overarching high-level programming language constructs. The merits of our proposed framework are demonstrated on a preliminary 2D case study involving the realistic Compass model (Jones et al., 2012). We conclude by discussing remaining challenges and future work directions.

Design principles

To address the shortcomings of current software implementations that impede progress, we have embarked on the development of a performant software framework. For instance, our wave propagators, implemented in Devito (Louboutin et al., 2019; Luporini et al., 2020), are used in production by contractors and oil and gas majors while enabling rapid, low-cost, scalable, and interoperable algorithm development for multiphysics and machine learning problems that run on a variety of chipsets (e.g., ARM, Intel, POWER) and graphics accelerators (e.g., NVIDIA, AMD,

* The first and second authors contributed equally to this work.
1 Georgia Institute of Technology, School of Earth and Atmospheric Sciences, Atlanta, Georgia, USA. E-mail: [email protected]; felix.her[email protected].
2 Georgia Institute of Technology, School of Computational Science and Engineering, Atlanta, Georgia, USA. E-mail: [email protected].
3 Georgia Institute of Technology, College of Computing, Atlanta, Georgia, USA. E-mail: [email protected]; [email protected]; ali.siahkoohi@gmail.com.
4 Utrecht University, Utrecht, Netherlands. E-mail: [email protected].
5 Microsoft Corp., Redmond, Washington, USA. E-mail: [email protected].
6 SINTEF, Trondheim, Norway. E-mail: [email protected].
7 Imperial College London, Department of Earth Science and Engineering, London, UK. E-mail: [email protected].

474 The Leading Edge July 2023 Special Section: Digitalization in energy
Intel). To achieve this, we adopt contemporary software design practices that include high-level abstractions, software design principles, and utilization of modern programming languages such as Python (Rossum and Drake, 2009) and Julia (Bezanson et al., 2017). We also make use of abstractions provided by domain-specific languages (DSLs) such as the Rice Vector Library (Padula et al., 2009) and the Unified Form Language (Alnaes et al., 2015; Rathgeber et al., 2016) and adopt reproducible research practices introduced by the trailblazing open-source initiative Madagascar (Fomel et al., 2013), which made use of version control and an abstraction based on the software construction tool SCons.

To meet the challenges of modern software design in a performance-critical environment, we adhere to three key principles, in addition to the fundamental principle of separation of concerns. First, we adopt mathematical language to inform our abstractions. Mathematics is concise, unambiguous, well understood, and leads to natural abstractions for the

• wave physics, through partial differential equations as put to practice by Devito, which relies on Symbolic Python (SymPy) (Meurer et al., 2017) to define partial differential equations. Given the symbolic expressions, Devito automatically generates highly optimized, possibly domain-decomposed, parallel C code that targets the available hardware with near-optimal performance for 3D acoustic, tilted-transverse-isotropic, or elastic wave equations;
• linear algebra, through matrix-free linear operators, as in JUDI.jl (Witte et al., 2019a; Louboutin et al., 2023), a high-level linear algebra DSL for wave-equation-based modeling and inversion. These ideas date back to SPOT (van den Berg and Friedlander, 2009) with more recent implementations JOLI.jl (Modzelewski et al., 2023) in Julia and PyLops in Python (Ravasi and Vasconcelos, 2020); and
• optimization, through definition of objective functions, also known as loss functions, that need to be minimized via SlimOptim.jl (Louboutin et al., 2022c), subject to mathematical constraints, which can be imposed through SetIntersectionProjection.jl (Peters and Herrmann, 2019; Peters et al., 2022).

Second, we exploit hierarchy within wave-equation-based inversion problems that naturally leads to a separation of concerns. At the highest level, we deal with linear operators, specifically matrix-free Jacobians of wave-based inversion, with JUDI.jl and parallel file input/output with SegyIO.jl (Lensink et al., 2023) on premise, or in the cloud (Azure) via JUDI4Cloud.jl (Louboutin et al., 2022b) and CloudSegyIO.jl (Modzelewski and Louboutin, 2022). At the intermediate and lower level, we make extensive use of Devito (Louboutin et al., 2019; Luporini et al., 2020), a just-in-time compiler for stencil-based time-domain finite-difference calculations, the development of which SLIM has been involved in over the years.

Third, we build on the principles of differentiable programming as advocated by Innes et al. (2019) and intrusive automatic differentiation introduced by D. Li et al. (2020) to integrate wave physics with machine learning frameworks and multiphase flow. Specifically, we employ automatic differentiation (AD) through the use of the chain rule, including abstractions that allow the user to add derivative rules, as in ChainRules.jl (White et al., 2022, 2023).

During the Federal University of Rio Grande do Norte’s inaugural FWI workshop in 2015, we at SLIM started articulating these design principles (Lin and Herrmann, 2015), which over the years culminated in scalable parallel software frameworks for time-harmonic FWI (Silva and Herrmann, 2019), for time-domain RTM and FWI (Witte et al., 2018, 2019a; Louboutin et al., 2023), and for abstracted FWI (Louboutin et al., 2022a) allowing for connections with machine learning. Aside from developing software for wave-equation-based inversion, we have been involved more recently in the development of scalable machine learning solutions, including the Julia package InvertibleNetworks.jl (Witte et al., 2023), which implements memory-efficient invertible deep neural networks such as (conditional) normalizing flows (NFs) (Rezende and Mohamed, 2015), and scalable distributed Fourier neural operators (FNOs) (Z. Li et al., 2020) in the dfno software package (Grady et al., 2022a, 2022b). All of these will be described in more detail later in this paper.

Figure 1. The multiphysics forward model. The permeability, K, is generated from Gaussian noise with a pretrained NF, G, followed by two-phase flow simulations through S, rock physics denoted by R, and time-lapse seismic data simulations via wave physics, F.

To illustrate how these design principles can lead to solutions of complex learned coupled inversions, we consider in the ensuing

sections end-to-end inversion of time-lapse seismic data for the spatial permeability distribution (D. Li et al., 2020). As can be seen from Figure 1, this inversion problem is rather complex, and its solution arguably benefits from our three design principles listed earlier. In this formulation, the latent representation for the permeability is taken via a series of nonlinear operations all the way to the time-lapse seismic data. In the remainder of this exposition, we will detail how the different components in this learned inversion problem are implemented so that the coupled inversion can be carried out. The results presented are preliminary, representing a snapshot of how research is conducted according to the design principles.

Learned time-lapse end-to-end permeability inversion

Combating climate change and dealing with the energy transition call for solutions to problems of increasing complexity. Building seismic monitoring systems for geologic CO2 and/or H2 storage falls in this category. To demonstrate how math-inspired abstractions can help, we consider inversion of permeability from crosswell time-lapse data (see Figure 2 for experimental setup) involving (1) coupling of wave physics with two-phase (brine/CO2) flow using Jutul.jl (Møyner et al., 2023) state-of-the-art reservoir modeling software in Julia; (2) learned regularization with NFs with InvertibleNetworks.jl; and (3) learned surrogates for the fluid-flow simulations with FNOs. This type of inversion problem is especially challenging because it involves different types of physics to estimate the past, current, and future saturation and pressure distributions of CO2 plumes from crosswell data in saline aquifers. In the subsequent sections, we demonstrate how we invert time-lapse data using the separate software packages listed in Figure 1.

Figure 2. Experimental setup. The black X symbol in the middle of the model indicates the CO2 injection location. The seismic sources are on the left-hand side of the model (shown as yellow X symbols) and receivers are on the right-hand side of the model (shown as red dots). Overlaid in gray is the compressional wavespeed with simulated CO2 saturation modeled for 18 years.

Wave-equation-based inversion. Due to its unmatched ability to resolve CO2 plumes, active-source time-lapse seismic is arguably the preferred imaging modality when monitoring geologic storage (Ringrose, 2020). In its simplest form for a single time-lapse vintage, FWI involves minimizing the $\ell_2$-norm misfit/loss function between observed and synthetic data, i.e., we have

$$\underset{m}{\operatorname{minimize}} \quad \frac{1}{2}\left\| F(m)\,q - d \right\|_2^2 \quad \text{where} \quad F(m) = P_r\,A(m)^{-1}\,P_s^\top. \tag{1}$$

In this formulation, the symbol $F(m)$ represents the forward modeling operator (wave physics), parameterized by the squared slowness $m$. This forward operator acting on the sources consists of the composition of source injection operator $P_s^\top$, with $\top$ denoting the transpose operator, solution of the discretized wave equation via $A(m)^{-1}$, and restriction to the receivers via the linear operator $P_r$. The vector $q$ represents the seismic sources, and the vector $d$ contains single-vintage seismic data collected at the receiver locations. Thanks to our adherence to the math, the corresponding Julia code to invert for the unknown squared slowness m with JUDI.jl reads

    # Forward modeling to generate seismic data.
    Pr = judiProjection(recGeometry)  # setup receiver
    Ps = judiProjection(srcGeometry)  # setup sources
    Ainv = judiModeling(model)        # setup wave-equation solver
    F = Pr * Ainv * Ps'               # forward modeling operator
    d = F(m_true) * q                 # generate observed data
    # Gradient descent to invert for the unknown squared slowness.
    for it = 1:maxiter
        d0 = F(m) * q                 # generate synthetic data
        J = judiJacobian(F(m), q)     # setup the Jacobian operator of F
        g = J' * (d0 - d)             # gradient w.r.t. squared slowness
        m = m - t * g                 # gradient descent with steplength t
    end

To obtain this concise and abstract formulation for FWI, we utilized hierarchical abstractions for propagators in Devito and linear algebra tools in JUDI.jl, including matrix-free implementations for F and its Jacobian J. While the preceding stand-alone implementation allows for (sparsity-promoting) seismic (Louboutin and Herrmann, 2017; Louboutin et al., 2018; Herrmann et al., 2019; Witte et al., 2019b; Rizzuti et al., 2020, 2021; Siahkoohi et al., 2020a, 2020b, 2020c; Yang et al., 2020; Yin et al., 2021, 2023) and medical (Yin et al., 2020; Orozco et al., 2021, 2023a, 2023b) inversions, it relies on hand-derived implementations for the adjoint of the Jacobian J' and for the derivative of the loss function. Although this approach is viable, relying solely on hand-derived derivatives can become cumbersome when we want to utilize machine learning models or when we need to couple the wave equation to the multiphase flow equations.
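To make concrete what a hand-derived adjoint of the Jacobian involves, the following is a minimal sketch on a toy problem, not JUDI.jl code. It assumes a small dense matrix A(m) = Lap + Diagonal(m) as a stand-in for the discretized wave equation, and the names `Lap`, `forward`, `loss_and_gradient`, and `invert` are hypothetical. For this toy operator, the derivative of A with respect to the i-th entry of m is the unit matrix e_i e_i', so the adjoint-state gradient reduces to minus the pointwise product of the forward and adjoint fields.

```julia
using LinearAlgebra

n = 20
Lap = SymTridiagonal(fill(2.0, n), fill(-1.0, n - 1))  # fixed stencil part (assumption)
A(m) = Matrix(Lap) + Diagonal(m)                       # toy stand-in for A(m)

Ps = zeros(1, n); Ps[1, 3] = 1.0                       # source injection at grid point 3
Pr = zeros(4, n)                                       # restriction to four receivers
for (k, i) in enumerate((5, 10, 15, 18))
    Pr[k, i] = 1.0
end
q = [1.0]                                              # source amplitude

forward(m) = Pr * (A(m) \ (Ps' * q))                   # F(m) * q = Pr * A(m)^-1 * Ps' * q

# Loss and hand-derived gradient via the adjoint-state method.
function loss_and_gradient(m, d)
    u = A(m) \ (Ps' * q)                 # forward field
    r = Pr * u - d                       # data residual
    v = A(m)' \ (Pr' * r)                # adjoint field
    return 0.5 * norm(r)^2, -(v .* u)    # g_i = -v_i * u_i for this toy A(m)
end

m_true = 1.0 .+ 0.5 .* sin.(range(0, π; length = n))
d = forward(m_true)                      # "observed" data

# Gradient descent with a halving line search on the steplength t.
function invert(m, d; maxiter = 50)
    for it in 1:maxiter
        f, g = loss_and_gradient(m, d)
        gmax = maximum(abs, g)
        gmax == 0 && break
        t = 0.1 / gmax                   # largest update per entry is 0.1
        while t > 1e-12 && loss_and_gradient(m - t * g, d)[1] >= f
            t /= 2
        end
        t > 1e-12 || break
        m = m - t * g
    end
    return m
end

m0 = ones(n)
m_est = invert(m0, d)
```

Every new misfit function or new parameterization of m would require rederiving and recoding the last line of `loss_and_gradient` by hand, which is the burden that the paragraph above describes.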

To allow for this situation, we make use of Julia’s differentiable programming ecosystem that includes tools to use AD and to add differentiation rules via ChainRules.jl. Using this tool, the AD system can be taught how to differentiate JUDI.jl via the following differentiation rule for the forward propagator:

    # Custom AD rule for wave modeling operator.
    function rrule(::typeof(*), F::judiModeling, q)
        y = F * q  # forward modeling
        # The pullback function for gradient calculations.
        pullback(dy) = NoTangent(), judiJacobian(F, q)' * dy, F' * dy
        return y, pullback
    end

In this rule, the pullback function takes as input the data residual, dy, and outputs the gradient of F * q with respect to the operator * (no gradient), the model parameters, and the source distribution. With this differentiation rule, the above gradient descent algorithm can be implemented as follows:

    # Define the loss function.
    loss(m) = .5f0 * norm(F(m) * q - d)^2f0
    # Gradient descent to invert for the squared slowness.
    for it = 1:maxiter
        g = gradient(loss, m)[1]  # gradient computation via AD
        m = m - t * g             # gradient descent with steplength t
    end

Compared to the original implementation, this code only needs F(m) and the function loss(m). With the help of the above rrule, Julia’s AD system⁸ is capable of computing the gradients (line 5). Aside from remaining performant (we still make use of the adjoint-state method to compute the gradients), the advantage of this approach is that it allows for much more flexibility, e.g., in situations where the squared slowness is parameterized in terms of a pretrained neural network or in terms of the output of multiphase flow simulations. In the next section, we show how trained NFs can serve as priors to improve the quality of FWI.

⁸ In this case, we used reverse AD provided by Zygote.jl, the AD system provided by Julia machine learning package Flux.jl. Because ChainRules.jl is AD system agnostic, another choice could have been made.

Deep priors and NFs. NFs are generative models that take advantage of invertible deep neural network architectures to learn complex distributions from training examples (Dinh et al., 2016). The term “flow” refers to the transformation of data from a complex distribution to a simple one. The term “normalizing” refers to the standard Gaussian (normal) target distribution that the network learns to map images to. For example, in seismic inversion applications, we are interested in approximating the distribution of earth models to use as priors in downstream tasks. NFs learn to map samples from the target distribution (i.e., earth models) to zero-mean unit standard deviation Gaussian noise using a sequence of trainable nonlinear invertible layers. Once trained, one can resample new Gaussian noise and pass it through the inverse sequence of layers to obtain new generative samples from the target distribution. NFs are an attractive choice for generative models in seismic applications (Zhang and Curtis, 2020, 2021; Siahkoohi and Herrmann, 2021; Siahkoohi et al., 2021, 2022, 2023; Zhao et al., 2021) because they provide fast sampling and allow for memory-efficient training due to their intrinsic invertibility, which eliminates the need to store intermediate activations during backpropagation. Memory efficiency is particularly important for seismic applications due to the 3D volumetric nature of the seismic models. Thus, our methods need to scale well in this regime.

Figure 3. Demonstration of Gaussianization of Compass slices during training of an NF. The data used for this didactic example are openly available and this figure is in the InvertibleNetworks.jl repository.

To illustrate the practical use of NFs as priors in seismic inverse problems, we trained an NF on slices from the Compass model (Jones et al., 2012). The training of an NF is laid out in Figure 3 where, for illustrative purposes, we demonstrate a training run on small (64 × 64) slices of the Compass model. Each row shows the normalization (image $m$ transformed to $z_m$, intended to be white Gaussian noise with zero mean and unit standard deviation) during training and its generative inverse (white noise $z \sim \mathcal{N}(0, 1)$ to image $\tilde{m}$) during each epoch. From Figure 3, we clearly observe the intended behavior. As the training proceeds, the NFs transform

the true model better toward white noise while its inverse progressively generates more realistic looking generative velocity models. To perform a comparison with traditional FWI, we train an NF on full model size slices (512 × 256 grid points). In Figure 5, we compare generative samples from the NF with the slices used to train the model shown in Figure 4. Although there are still irregularities, the model has learned important qualitative aspects of the model that will be useful in inverse problems. To demonstrate this usefulness, we test our prior on an FWI inverse problem. Because our NF prior is trained independently, it is flexible and can be plugged into different inverse problems easily.

Figure 4. Examples of Compass 2D slices used to train an NF prior.

Our FWI experiment includes ocean-bottom nodes, Ricker wavelet with no energy below 4 Hz, and additive colored Gaussian noise that has the same bandwidth as the noise-free data. For FWI with our learned prior, we minimize

$$\underset{z}{\operatorname{minimize}} \quad \frac{1}{2}\left\| F\left(\mathcal{G}_{\theta^*}(z)\right) q - d \right\|_2^2 + \frac{\lambda}{2}\|z\|_2^2, \tag{2}$$

where $\mathcal{G}_{\theta^*}$ is a pretrained NF with weights $\theta^*$. After training, the inverse of the NF maps realistic Compass-like earth samples to white noise, i.e., $\mathcal{G}_{\theta^*}^{-1}(m) = z \sim \mathcal{N}(0, I)$. Because the NFs are designed to be invertible, the action of the pretrained NF, $\mathcal{G}_{\theta^*}$, on Gaussian noise $z$ produces realistic samples of earth models (see Figure 5). We use this capability in equation 2 where the unknown model parameters in $m$ are reparameterized by $\mathcal{G}_{\theta^*}(z)$. The regularization term, $\frac{\lambda}{2}\|z\|_2^2$, penalizes the latent variable $z$ with large $\ell_2$-norm, where $\lambda$ balances the misfit and regularization terms. Consequently, this learned regularizer encourages FWI results that are more likely to be realistic earth models (Asim et al., 2020). However, notice that the optimization routine now requires differentiation through both the physical operator (wave physics, $F$) and the pretrained NF ($\mathcal{G}_{\theta^*}$), and only a true invertible implementation like ours, with minimal memory footprint for both training and inference, can provide scalability.

Due to JUDI.jl’s rrule for F and InvertibleNetworks.jl’s rrule for G, integration of machine learning with FWI becomes straightforward, involving replacement of m by G(z) on line 6. Minimizing the objective function in equation 2 now translates to

    # Load the pretrained NF and weights.
    G = NetworkGlow(nc, nc_hidden, depth, nscales)
    set_params!(G, θ)
    # Set up the ADAM optimizer.
    opt = ADAM()
    # Define the reparameterized loss function including penalty term.
    loss(z) = .5f0 * norm(F(G(z)) * q - d)^2f0 + .5f0 * λ * norm(z)^2f0
    # ADAM iterations.
    for it = 1:maxiter
        g = gradient(loss, z)[1]  # gradient computation with AD
        update!(opt, z, g)        # update z with ADAM
    end
    # Convert latent variable to squared slowness.
    m = G(z)

In Figure 6, we compare the results of FWI with our learned prior against unregularized FWI. Because our prior regularizes the solution toward realistic models, we obtain a velocity estimate that is closer to the ground truth. To measure the performance of our method, we use peak signal-to-noise ratio (PS/N) and see an increase from 12.98 dB with traditional FWI to 14.77 dB with the learned prior.

Through this simple example, we demonstrated the ability to easily integrate our state-of-the-art wave-equation propagators with Julia’s differentiable programming system. By applying these design principles to other components of the end-to-end inversion, we design a seismic monitoring framework for real-world applications in subsurface reservoirs.

Fluid-flow simulation and permeability inversion. As stated earlier, our goal is to estimate the permeability from time-lapse crosswell monitoring data collected at a CO2 injection site (cf. Figure 2). Compared to conventional seismic imaging, time-lapse monitoring of geologic storage differs because it aims to image time-lapse changes in the CO2 plume while obtaining estimates for the reservoir’s fluid-flow properties. This involves coupling wave modeling operators to fluid-flow physics to track the CO2 plumes underground. The fluid-flow physics models the slow process of CO2 partly replacing brine in the pore space of the reservoir, which involves solving the multiphase flow equations. For this purpose, we need access to reservoir simulation software capable of modeling two-phase

(brine/CO2) flow. While several proprietary and open-source reservoir simulators exist, including MRST (Lie and Møyner, 2021), GEOSX (Settgast et al., 2022), and Open Porous Media (Rasmussen et al., 2021), few support differentiation of the simulator’s output (CO2 saturation) with respect to its input (the spatial permeability distribution K in Figure 1). We use the recently developed external Julia package JutulDarcy.jl that supports Darcy flow and serves as a front-end to Jutul.jl (Møyner et al., 2023), which provides accurate Jacobians with respect to K. Jutul.jl is an implicit solver for finite-volume discretizations that internally uses AD to calculate the Jacobian. It has a performance and feature set comparable to commercial multiphase flow simulators and accounts for realistic effects (e.g., dissolution, interphase mass exchange, compressibility, capillary effects) and residual trapping mechanisms. It also provides accurate sensitivities through an adjoint formulation of the subsurface multiphase flow equations. To integrate the Jacobian of this software package into Julia’s differentiable programming system, we wrote the light “wrapper package” JutulDarcyRules.jl (Yin and Louboutin, 2023) that adds an rrule for the nonlinear operator $\mathcal{S}(K)$, which maps the permeability distribution, $K$, to the spatially varying CO2 concentration snapshots, $c = \{c_i\}_{i=1}^{n_v}$, over $n_v$ monitoring time steps (cf. Figure 1). Addition of this rrule allows these packages to interoperate with other packages in Julia’s AD ecosystem. The following shows a basic example where the ADAM algorithm is used to invert for subsurface permeability given the full history of CO2 concentration snapshots:

    # Generate CO2 concentration.
    c = S(K_true)
    # Set up ADAM optimizer.
    opt = ADAM()
    # Define the loss function.
    loss(K) = .5f0 * norm(S(K) - c)^2f0
    # ADAM iterations.
    for it = 1:maxiter
        g = gradient(loss, K)[1]  # gradient computed with AD
        update!(opt, K, g)        # update K with ADAM
    end

During each iteration of the preceding loop, Julia’s machine learning package Flux.jl (Innes et al., 2018; Innes, 2018b) uses the custom gradient defined by the aforementioned rrule, calling the high-performance adjoint code from JutulDarcy.jl. Our adaptable software framework also facilitates effortless substitution of deep learning models in lieu of the numerical fluid-flow simulator. In the next section, we introduce distributed FNOs and discuss how this neural surrogate contributes to our inversion framework.

Fourier neural operator surrogates. While the integration of multiphase flow modeling into the Julia differentiable programming ecosystem opens the way to carry out end-to-end inversions (as explained later), fluid-flow simulations are computationally expensive, a notion compounded by the fact that these simulations have to be done many times during inversion. For this reason, we switch to a data-driven approach where a neural operator is first trained on simulation examples, pairs $\{K, \mathcal{S}(K)\}$, to learn the mapping from permeability models, $K$, to the corresponding CO2 snapshots, $c := \mathcal{S}(K)$. After incurring initial offline training costs, this neural surrogate provides a fast alternative to numerical solvers with acceptable accuracy. FNOs (Z. Li et al., 2020), a neural network architecture based on spectral convolutions that capture the long-range correlations rather than localized spatial convolutions, have been introduced recently as a surrogate for elliptic partial differential equations such as the Darcy or Burgers equation. This spectral architecture has been applied successfully to simulate two-phase flow during geologic CO2 storage projects (Wen et al., 2022). Independently, Yin et al. (2022) used a trained FNO to replace the fluid-flow simulations as part of end-to-end inversion and showed that AD of Julia’s machine learning package can be used to compute gradients with respect to the permeability using Flux.jl’s reverse-mode AD system Zygote.jl (Innes, 2018a). After training, the above permeability inversion from concentration snapshots, $c$, is carried out by simply replacing $\mathcal{S}$ by $\mathcal{S}_{w^*}$ with $w^*$ being the weights of the pretrained FNO. Thanks to the AD system, the gradient with respect to $K$ is computed automatically. Thus, after loading the trained FNO and redefining the operator $\mathcal{S}$, the aforementioned code remains exactly the same. For implementation details on the FNO and its training, we refer to Yin et al. (2022) and Grady et al. (2022b).
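The spectral convolution at the heart of an FNO can be illustrated with a minimal 1D sketch. This is not the dfno implementation: it assumes a single real-valued channel, uses an explicit discrete Fourier sum instead of an FFT library, and the function name `spectral_conv` and the fixed `weights` are hypothetical. In an actual FNO the complex weights are trained and the layer is combined with pointwise linear maps and nonlinearities.

```julia
using LinearAlgebra

# Spectral convolution of a real signal x: transform to the Fourier domain,
# scale the lowest `length(weights)` modes by complex weights, discard the
# remaining modes, and transform back.
function spectral_conv(x::AbstractVector{<:Real}, weights::AbstractVector{<:Complex})
    n, modes = length(x), length(weights)
    modes <= n ÷ 2 || throw(ArgumentError("keep at most n ÷ 2 modes"))
    j = 0:n-1
    y = zeros(n)
    for k in 0:modes-1
        basis = cis.(2π * k .* j ./ n)     # exp(2πi * j * k / n) for all j
        xk = dot(basis, x)                 # DFT coefficient (dot conjugates basis)
        scale = k == 0 ? 1 : 2             # mode k plus its conjugate partner n - k
        y .+= (scale / n) .* real.((weights[k + 1] * xk) .* basis)
    end
    return y
end

n = 16
j = 0:n-1
x_low  = cos.(2π * 2 .* j ./ n)    # energy in mode 2 only
x_high = cos.(2π * 6 .* j ./ n)    # energy in mode 6 only
w = ones(ComplexF64, 4)            # identity weights on modes 0 to 3

y_low  = spectral_conv(x_low, w)   # retained mode passes through
y_high = spectral_conv(x_high, w)  # truncated mode is removed
```

Because each retained Fourier mode depends on every sample of the input, a single layer of this kind couples all grid points, which is the long-range behavior contrasted above with localized spatial convolutions.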

Figure 5. Generative samples from our trained prior. Their similarity to the training samples in Figure 4 suggests that our NF has learned a useful prior.

Putting it all together
As a final step in our end-to-end permeability inversion, we introduce a nonlinear rock-physics model, denoted by $\mathcal{R}$. Based on the patchy saturation model (Avseth et al., 2010), this model nonlinearly maps the time-lapse CO2 saturations to decreases in the seismic properties (compressional wavespeeds, $v = \{v_i\}_{i=1}^{n_v}$) within the reservoir with the Julia code

# Patchy saturation function.
# Input: CO2 saturation, velocity, density, porosity.
# Optional: bulk modulus of mineral, brine, CO2; density of CO2, brine.
# Output: velocity, density.
function Patchy(sw, vp, rho, phi;
                bulk_min=36.6f9, bulk_fl1=2.735f9, bulk_fl2=0.125f9,
                ρw=7f2, ρo=1f3)
    # Relate vp to vs, set modulus properties.
    vs = vp ./ sqrt(3f0)
    bulk_sat1 = rho .* (vp.^2f0 .- 4f0/3f0 .* vs.^2f0)
    shear_sat1 = rho .* (vs.^2f0)
    # Calculate bulk modulus if filled with 100% CO2.
    # Operators trail each line so that the expression continues across lines.
    patch_temp = bulk_sat1 ./ (bulk_min .- bulk_sat1) .-
                 bulk_fl1 ./ phi ./ (bulk_min .- bulk_fl1) .+
                 bulk_fl2 ./ phi ./ (bulk_min .- bulk_fl2)
    bulk_sat2 = bulk_min ./ (1f0 ./ patch_temp .+ 1f0)
    # Calculate new bulk modulus as weighted harmonic average (elementwise).
    bulk_new = 1f0 ./ ((1f0 .- sw) ./ (bulk_sat1 .+ 4f0/3f0 .* shear_sat1) .+
                       sw ./ (bulk_sat2 .+ 4f0/3f0 .* shear_sat1)) .- 4f0/3f0 .* shear_sat1
    # Calculate new density and velocity.
    rho_new = rho .+ phi .* sw .* (ρw - ρo)
    vp_new = sqrt.((bulk_new .+ 4f0/3f0 .* shear_sat1) ./ rho_new)
    return vp_new, rho_new
end
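For readers who prefer formulas, the steps of the listing can be written out as follows. This is a transcription of the code, not an independent derivation, and the symbols are our own shorthand: $B$ denotes a bulk modulus (to avoid a clash with the permeability $\mathbf{K}$), $\mu$ the shear modulus, $s$ the CO2 saturation, $\phi$ the porosity, and the subscripts min, fl1, fl2 follow the keyword arguments `bulk_min`, `bulk_fl1`, and `bulk_fl2`. All operations act elementwise on the gridded fields.

```latex
\begin{aligned}
v_s &= v_p/\sqrt{3}, \qquad
B_{\mathrm{sat1}} = \rho\left(v_p^2 - \tfrac{4}{3}v_s^2\right), \qquad
\mu = \rho\, v_s^2,\\
T &= \frac{B_{\mathrm{sat1}}}{B_{\min} - B_{\mathrm{sat1}}}
   - \frac{B_{\mathrm{fl1}}}{\phi\,(B_{\min} - B_{\mathrm{fl1}})}
   + \frac{B_{\mathrm{fl2}}}{\phi\,(B_{\min} - B_{\mathrm{fl2}})}, \qquad
B_{\mathrm{sat2}} = \frac{B_{\min}}{1/T + 1},\\
B_{\mathrm{new}} &= \left[\frac{1 - s}{B_{\mathrm{sat1}} + \tfrac{4}{3}\mu}
   + \frac{s}{B_{\mathrm{sat2}} + \tfrac{4}{3}\mu}\right]^{-1} - \tfrac{4}{3}\mu,\\
\rho_{\mathrm{new}} &= \rho + \phi\, s\,(\rho_w - \rho_o), \qquad
v_{p,\mathrm{new}} = \sqrt{\frac{B_{\mathrm{new}} + \tfrac{4}{3}\mu}{\rho_{\mathrm{new}}}}.
\end{aligned}
```

The second line has the form of a Gassmann-type fluid substitution from the first fluid to the second, and the third line is a saturation-weighted harmonic average of the P-wave moduli $B + \tfrac{4}{3}\mu$ of the two end-member states.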

We map the changes in the wavespeeds to time-lapse seismic data, $\mathbf{d} = \{\mathbf{d}_i\}_{i=1}^{n_v}$, via the block-diagonal seismic modeling⁹ operator $\mathcal{F}(\mathbf{v}) = \operatorname{diag}\bigl(\{\mathbf{F}_i(\mathbf{v}_i)\,\mathbf{q}_i\}_{i=1}^{n_v}\bigr)$. In this formulation, the single-vintage forward operators $\mathbf{F}_i$ and corresponding sources, $\mathbf{q}_i$, are allowed to vary between vintages.
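The block-diagonal structure simply means that each vintage is modeled independently with its own operator and source. A minimal sketch, assuming a hypothetical `toy_operator` in place of a wave-equation solve (the actual operators are provided by JUDI.jl and are not reproduced here), reads:

```julia
using LinearAlgebra

# Hypothetical per-vintage operator F_i(v_i): a small diagonal matrix whose
# entries depend on the wavespeed v_i. It only mimics the dependence on v_i.
toy_operator(v::AbstractVector) = Diagonal(1 ./ v .^ 2) + 0.1 * I

# Block-diagonal forward map: vintage i produces d_i = F_i(v_i) * q_i, and the
# data vectors of all vintages are collected in a list.
forward(vs, qs) = [toy_operator(v) * q for (v, q) in zip(vs, qs)]

vs = [[2.0, 2.0, 2.0], [2.0, 1.9, 2.0]]   # two vintages; wavespeed drops in cell 2
qs = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]   # sources may differ between vintages
ds = forward(vs, qs)
```

Because no vintage depends on another, the list comprehension could be replaced by a parallel map without changing the result.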


With the fluid-flow (surrogate) solver, $\mathcal{S}$, the rock-physics module, $\mathcal{R}$, and wave-physics module, $\mathcal{F}$, in place, along with regularization via reparametrization using $\mathcal{G}_{\theta^\ast}$, we are now in a position to formulate the desired end-to-end inversion problem as

$$\underset{\mathbf{z}}{\operatorname{minimize}} \quad \frac{1}{2}\left\| \mathcal{F}\circ\mathcal{R}\circ\mathcal{S}\bigl(\mathcal{G}_{\theta^\ast}(\mathbf{z})\bigr) - \mathbf{d} \right\|_2^2 + \frac{\lambda}{2}\left\|\mathbf{z}\right\|_2^2, \tag{3}$$

where the inverted permeability can be calculated by $\mathbf{K}^\ast = \mathcal{G}_{\theta^\ast}(\mathbf{z}^\ast)$ with $\mathbf{z}^\ast$ the latent space minimizer of equation 3. As illustrated in Figure 1, we obtain the nonlinear end-to-end map by composing the fluid-flow, rock, and wave physics, according to $\mathcal{F}\circ\mathcal{R}\circ\mathcal{S}$. The corresponding Julia code reads

using LinearAlgebra: norm
using Zygote: gradient
using Flux.Optimise: ADAM, update!

# Set up ADAM optimizer.
opt = ADAM()
# Define the reparameterized loss function including penalty term.
# Parentheses make explicit that the composed map F ∘ R ∘ S is applied to G(z).
loss(z) = .5f0 * norm((F ∘ R ∘ S)(G(z)) - d)^2f0 + .5f0 * λ * norm(z)^2f0
# ADAM iterations.
for it = 1:maxiter
    g = gradient(loss, z)[1]   # gradient computed by AD
    update!(opt, z, g)         # update z in place by ADAM
end
# Convert latent variable to permeability.
K = G(z)
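The structure of this loop can be tried out without any of the physics packages. The sketch below is a toy under stated assumptions: the four maps `G`, `S`, `R`, and `F` are hypothetical elementwise functions and not the paper's operators, central finite differences stand in for AD, and plain gradient descent with a fixed step stands in for ADAM.

```julia
using LinearAlgebra

# Hypothetical stand-ins for the four maps (NOT the paper's operators).
G(z) = exp.(z)                 # "prior": latent variable -> positive permeability
S(K) = tanh.(K)                # "flow": permeability -> concentration
R(c) = 2 .- 0.5 .* c           # "rock physics": concentration -> wavespeed
F(v) = [1.0, 2.0, 3.0] .* v    # "wave physics": wavespeed -> data

forward = F ∘ R ∘ S ∘ G        # composed end-to-end map
z_true = [0.3, -0.2, 0.1]
d = forward(z_true)            # noise-free synthetic data
λ = 1e-3
loss(z) = 0.5 * norm(forward(z) - d)^2 + 0.5 * λ * norm(z)^2

# Central finite differences stand in for AD in this sketch.
function fd_gradient(f, z; h=1e-6)
    g = similar(z)
    for i in eachindex(z)
        e = zeros(length(z))
        e[i] = h
        g[i] = (f(z + e) - f(z - e)) / (2h)
    end
    return g
end

# Plain gradient descent with a fixed step stands in for ADAM.
function descend(z0; step=0.1, maxiter=200)
    z = copy(z0)
    for _ in 1:maxiter
        z .-= step .* fd_gradient(loss, z)
    end
    return z
end

z0 = zeros(3)
z_hat = descend(z0)
K_hat = G(z_hat)               # convert latent variable to "permeability"
```

The point of the sketch is the design choice rather than the numbers: the optimization loop only sees `loss`, so any of the four maps can be replaced (for instance, a physics solver by a surrogate) without touching the loop. How closely `z_hat` approaches `z_true` depends on the step length and iteration count chosen here for illustration.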

⁹ Note, we parameterized this forward modeling in terms of the compressional wavespeed.

This end-to-end inversion procedure, which utilizes a learned deep prior and a pretrained FNO surrogate, was successfully employed by Yin et al. (2022) on a simple, stylized, blocky high-low permeability model. The procedure uses AD, with rrules for the wave and fluid physics in combination with Julia's innate AD capabilities, to compute the gradient of the objective in equation 3, which incorporates fluid-flow, rock, and wave physics. Next, we share early results from applying the proposed end-to-end inversion in a more realistic setting derived from real data (cf. Figure 2).

Preliminary inversion results


While initial results by Yin et al. (2022) were encouraging and showed strong benefits from the learned prior, the permeability model and fluid-flow simulations considered in their study were too simplistic. To evaluate the proposed end-to-end inversion methodology in a more realistic setting, we consider the permeability model plotted in Figure 7a, which we derived from a slice of the Compass model (Jones et al., 2012) shown in Figure 2. To generate realistic CO2 plumes in this model, we run immiscible and compressible two-phase flow simulations with JutulDarcy.jl over a period of 18 years, with five snapshots plotted at years 10, 15, 16, 17, and 18. These CO2 snapshots are shown in the first row of Figure 8. Next, given the fluid-flow simulation, we use the patchy saturation model (Avseth et al., 2010) to convert each CO2 concentration snapshot, $\mathbf{c}_i$, $i = 1 \ldots n_v$, to a corresponding wavespeed model, $\mathbf{v}_i$, $i = 1 \ldots n_v$, with $\mathbf{v} = \mathcal{R}(\mathbf{c})$. We then use JUDI.jl to generate synthetic time-lapse data, $\mathbf{d}_i$, $i = 1 \ldots n_v$, for each vintage.
Figure 6. (a) Ground truth. (b) Traditional FWI without prior resulting in 12.98 dB PS/N. (c) Our FWI result with learned prior resulting in 14.77 dB PS/N.

Figure 7. (a) Ground truth permeability. (b) Initial permeability with homogeneous values in the reservoir, with a 7.06 dB S/N. (c) Inverted permeability from physics-based inversion, with a 7.26 dB S/N. (d) Inverted permeability with neural surrogate approximation, with a 7.10 dB S/N.

During the inversion, the first 15 years of time-lapse data, $\mathbf{d}_i$, $i = 1 \ldots 15$, from the aforementioned synthetic experiment are inverted with permeabilities within the reservoir initialized by a single reasonable value as shown in Figure 7b. Inversion results obtained after 25 passes through the data for the physics-driven two-phase flow solver and its learned neural surrogate approximation are included in Figure 7c and Figure 7d, respectively. Both results were obtained with 200 iterations of the preceding code block. Each time-lapse vintage consists of 960 receivers and 32 shots. To limit the number of wave-equation solves, gradients were calculated for only four shots per iteration, selected randomly with replacement, so that 200 iterations amount to 200 × 4 / 32 = 25 passes through the data. While these results obtained without learned regularization are somewhat preliminary, they lead to the following observations. First, both inversion results for the permeability follow the inverted cone shape of the CO2. This is to be expected because permeability can only be inverted where CO2 has flowed over the first 15 years. Second, the inverted permeability follows trends of this strongly heterogeneous model. Third, as expected, details and continuity of the results obtained with the two-phase flow solver are better. In part, this can be explained by the fact that there are no guarantees that the model iterations remain within the statistical distribution on which the FNO was trained. Fourth, the implementation of this workflow benefited greatly from the aforementioned software design principles. For instance, the use of abstractions made it trivial to replace physics-driven two-phase flow solvers with their learned counterparts.

Despite being preliminary, the inversion results show that this framework is conducive to producing current CO2 plume estimates and near-future forecasts. As described by Yin et al. (2022), these capabilities can be achieved through use of the physics simulator or the trained FNO surrogate. The 18-year CO2 simulations in both inverted permeability models are reasonable when comparing the true plume development plotted in the top row of Figure 8 with plumes simulated from the inverted models plotted in rows three and four of Figure 8. While certain details are missing in the estimates for the past, current, and predicted CO2 concentrations, the inversion constitutes a considerable improvement compared to plumes generated in the starting model for the permeability plotted in the second row of Figure 8. An early version of the presented workflow can be found in the Julia package Seis4CCS.jl. As the project matures, updated workflows and codes will be pushed to GitHub.

Figure 8. CO2 plume estimation and prediction. The first two columns are the CO2 concentration snapshots at year 10 and year 15 of the first 15 years of simulation monitored seismically. The last three columns are forecasted snapshots at years 16, 17, and 18, where no seismic data are available. The first row corresponds to the ground truth CO2 plume simulated by the unseen ground truth permeability model. The second row contains plume simulations in the starting model, with a 10.99 dB S/N on the first 15 years of CO2 snapshots and 8.51 dB on the last 3 years. Rows three and four contain estimated and predicted CO2 plumes for the physics-based and surrogate-based permeability inversion. The S/N values of the first 15 years of the estimated CO2 plume are 17.72 and 16.17 dB for the physics-based inversion and the surrogate-based inversion, respectively. The S/N values for the CO2 plume forecasts for the last 3 years are 15.69 and 14.05 dB for the physics-based inversion and the surrogate-based inversion, respectively.

Remaining challenges
We hope we have been able to convince the reader that working with abstractions has its benefits. Due to the math-inspired abstractions, which naturally lead to modularity and separation of concerns, we were able to accelerate the research and development cycle for the end-to-end inversion. As a result, we created a development environment that allowed us to include machine learning techniques. Relatively late in the development cycle, it also gave us the opportunity to swap out the original 2D reservoir simulation code for a much more powerful and fully featured industry-strength 3D code developed by a national lab. What we unfortunately have not yet been able to do is demonstrate our ability to scale this end-to-end inversion to 3D, even though both the Devito-based propagators and Jutul.jl's fluid-flow simulations have been demonstrated on industry-scale problems. Lack of access to large-scale computational resources makes it challenging in an academic environment to validate the proposed methodology on 4D synthetic and field data, even though the computational toolchain presented in this paper is fully differentiable and, in principle, capable of scale-up. Most components have been tested separately and verified on realistic 3D examples (Grady et al., 2022b; Louboutin and Herrmann, 2022, 2023; Møyner and Bruer, 2023) and efforts are underway to remove fundamental memory and other bottlenecks.

Scale-up NFs. Generative models, NFs included, call for relatively large training sets and large computational resources for training. While efforts have been made to create training sets for more traditional machine learning tasks, no public-domain training set exists that contains realistic 3D examples. A positive is that NFs (Rezende and Mohamed, 2015) have a small memory footprint compared to diffusion models

(Song et al., 2020), so training this type of network will be feasible when training sets and compute become available. In our laboratory, we were already able to successfully train and evaluate NFs on 256 × 256 × 64 models. In some cases where geophysicists might not have enough samples for velocity/permeability models, one could use in-house legacy models to train the NFs as a preparation step for inverting the seismic data. We leave the potential investigation to future studies.

Scale-up neural operators. Since the seminal paper by Z. Li et al. (2020), there has been a flurry of publications on the use of FNOs as neural surrogates for expensive multiphase fluid-flow solvers used to simulate CO2 injection as part of geologic storage projects (Wen et al., 2022, 2023). While there is good reason for this excitement, challenges remain when scaling this technique to realistic 3D problems. In that case, additional measures must be taken. For instance, by nesting FNOs Wen et al. (2023) were able to divide 3D domains into smaller hierarchical subdomains centered around the wells, an approach that is only viable when certain assumptions are met. Because of this nested decomposition, these authors avoid the large memory footprint of 3D FNOs and report a speedup of many orders of magnitude. Given the potential impact of irregular CO2 flow, e.g., leakage, we try as much as possible to avoid making assumptions on the flow behavior and propose an accurate distributed FNO structure based on a domain decomposition of the network's input and network weights (Grady et al., 2022b). By using DistDL (Hewett and Grady II, 2020), a software package that supports "model parallelism" in machine learning, our dfno software package partitions the input data and network weights across multiple GPUs such that each partition is able to fit in the memory of a single GPU. As reported by Grady et al. (2022b), our work demonstrated the validity of dfno on a realistic problem and for reasonable training-set sizes (permeability/CO2 concentration pairs), with permeability models derived from the Sleipner benchmark model (Furre et al., 2017). On 16 timesteps and models of size 64 × 118 × 263, we reported what is, from our perspective, a more realistic speedup of more than 1300× compared to the simulation time on Open Porous Media (Rasmussen et al., 2021), one of the leading open-source reservoir simulators. These results confirm a similar independent approach advocated by Witte et al. (2022). Even though we are working with our industrial partners and Extreme Scale Solutions to further improve these numbers, we are confident that distributed FNOs are able to scale to 3D with a high degree of parallel efficiency.

Software mentioned in this article (in order of first mention)

| Software | URL |
| --- | --- |
| JUDI.jl | https://github.com/slimgroup/JUDI.jl |
| COFII | https://github.com/ChevronETC/Examples |
| Devito | https://github.com/devitocodes/devito |
| Julia | https://julialang.org/ |
| SymPy | https://www.sympy.org/en/index.html |
| SPOT | https://github.com/mpf/spot |
| JOLI.jl | https://github.com/slimgroup/JOLI.jl |
| PyLops | https://pylops.readthedocs.io/en/stable/ |
| SlimOptim.jl | https://github.com/slimgroup/SlimOptim.jl |
| SetIntersectionProjection.jl | https://github.com/slimgroup/SetIntersectionProjection.jl |
| SegyIO.jl | https://github.com/slimgroup/SegyIO.jl |
| JUDI4Cloud.jl | https://github.com/slimgroup/JUDI4Cloud.jl |
| CloudSegyIO.jl | https://github.com/slimgroup/CloudSegyIO.jl |
| ChainRules.jl | https://github.com/JuliaDiff/ChainRules.jl |
| InvertibleNetworks.jl | https://github.com/slimgroup/InvertibleNetworks.jl |
| dfno | https://github.com/slimgroup/dfno |
| Jutul.jl | https://github.com/sintefmath/Jutul.jl |
| Zygote.jl | https://github.com/FluxML/Zygote.jl |
| Flux.jl | https://github.com/FluxML/Flux.jl |
| JutulDarcy.jl | https://github.com/sintefmath/JutulDarcy.jl |
| JutulDarcyRules.jl | https://github.com/slimgroup/JutulDarcyRules.jl |
| Seis4CCS.jl | https://github.com/slimgroup/Seis4CCS.jl |
| ParametricOperators.jl | https://github.com/slimgroup/ParametricOperators.jl |

Table 1. Current state of SLIM's software stack. To underline collaboration and active participation in other open-source projects, we included the external software packages (denoted by *) as well as how these are integrated into our software framework.

| Package | 3D | GPU | AD | Parallelism |
| --- | --- | --- | --- | --- |
| Devito* | yes | yes | no | Domain decomposition via MPI, multithreading via OpenMP |
| JUDI.jl | yes | yes | yes | Multithreading via OpenMP, task parallel |
| JUDI4Cloud.jl | yes | yes | yes | Multithreading via OpenMP, task parallel |
| InvertibleNetworks.jl | yes | yes | yes | Julia-native multithreading |
| dfno | yes | yes | yes | Domain decomposition via MPI |
| Jutul.jl* | yes | soon | yes | Julia-native multithreading |
| JutulDarcyRules.jl | yes | soon | yes | Julia-native multithreading |
| Seis4CCS.jl | yes | yes | yes | Julia-native multithreading |
| ParametricOperators.jl | yes | yes | yes | Domain decomposition via MPI, Julia-native multithreading |

Toward scalable open-source software. In addition to allowing for reproduction of published results, we are advocates of pushing out scalable open-source software to help with the energy transition and with combating climate change. As observed in other fields, most notably in machine learning, open-source software leads to accelerated rates of innovation, a feature needed in industries faced with major challenges. Despite the exposition on our experiences implementing end-to-end permeability inversion, this work constitutes a snapshot of an ongoing project. However, many of the software components listed in Table 1 are in an advanced stage of development and to a large degree are ready to be tested in 3D and ultimately on field data. For instance, all of our software supports large-scale 3D simulation and AD. In addition, we are in an advanced state of development to support GPU for all codes. For those curious about future

developments, we also include the Julia package ParametricOperators.jl, which is designed to allow for high-dimensional parallel tensor manipulations in support of future Julia-native implementations of distributed FNOs.

The work presented in this paper would not have been possible without open-source efforts from other groups, most notably by researchers at the UK's Imperial College London, who spearheaded the development of Devito, and researchers at Norway's SINTEF. By integrating these packages into Julia's agile differentiable programming environment, we believe that we are well on our way to arrive at a software environment that is much more viable than the sum of its parts. We welcome readers to check https://github.com/slimgroup for the latest developments.

Conclusions
In this work, we introduced a software framework for geophysical inverse problems and machine learning that provides a scalable, portable, and interoperable environment for research and development at scale. We showed that through carefully chosen design principles, software with math-inspired abstractions can be created that naturally leads to desired modularity and separation of concerns without sacrificing performance. We achieve this by combining Devito's automatic code generation for wave propagators with Julia's modern highly performant and scalable programming capabilities, including differentiable programming. These features enabled us to quickly implement a prototype, in principle scalable to 3D, for permeability inversion from time-lapse crosswell seismic data. Aside from the use of proper abstractions, our approach to solving this relatively complex multiphysics problem relied extensively on Julia's innate algorithmic differentiation capabilities, supplemented by auxiliary performant derivatives for the wave/fluid-flow physics and for components of the machine learning. Because of these design choices, we developed an agile and relatively easy to maintain compact software stack where low-level code is hidden through a combination of math-inspired abstractions, modern programming practices, and automatic code generation.

Acknowledgments
This research was carried out with the support of the Georgia Research Alliance and industrial partners of the ML4Seismic Center. The authors thank Henryk Modzelewski (University of British Columbia) and Rishi Khan (Extreme Scale Solutions) for constructive discussions. This work was supported in part by the U.S. National Science Foundation grant OAC 2203821 and Department of Energy grant no. DE-SC0021515.

Data and materials availability
Our software framework is organized into registered Julia packages, all of which can be found on the SLIM GitHub page (https://github.com/slimgroup). The software packages described in this paper are all open source and released under the MIT license for use by the community.

Corresponding author: [email protected]

References
Alnaes, M. S., J. Blechta, J. Hake, A. Johansson, B. Kehlet, A. Logg, C. Richardson, J. Ring, M. E. Rognes, and G. N. Wells, 2015, The FEniCS project version 1.5: Archive of Numerical Software, 3, no. 100, https://doi.org/10.11588/ans.2015.100.20553.
Asim, M., M. Daniels, O. Leong, A. Ahmed, and P. Hand, 2020, Invertible generative models for inverse problems: Mitigating representation error and dataset bias: Proceedings of the 37th International Conference on Machine Learning, PMLR, http://proceedings.mlr.press/v119/asim20a.html, accessed 2 June 2023.
Avseth, P., T. Mukerji, and G. Mavko, 2010, Quantitative seismic interpretation: Applying rock physics tools to reduce interpretation risk: Cambridge University Press.
Bezanson, J., A. Edelman, S. Karpinski, and V. B. Shah, 2017, Julia: A fresh approach to numerical computing: SIAM Review, 59, no. 1, 65–98, https://doi.org/10.1137/141000671.
Dinh, L., J. Sohl-Dickstein, and S. Bengio, 2016, Density estimation using Real NVP: arXiv preprint, https://doi.org/10.48550/arXiv.1605.08803.
Fomel, S., P. Sava, I. Vlad, Y. Liu, and V. Bashkardin, 2013, Madagascar: Open-source software project for multidimensional data analysis and reproducible computational experiments: Journal of Open Research Software, 1, no. 1, e8, https://doi.org/10.5334/jors.ag.
Furre, A.-K., O. Eiken, H. Alnes, J. N. Vevatne, and A. F. Kiær, 2017, 20 years of monitoring CO2-injection at Sleipner: Energy Procedia, 114, 3916–3926, https://doi.org/10.1016/j.egypro.2017.03.1523.
Grady, T., Infinoid, and M. Louboutin, 2022a, slimgroup/dfno: Optimal comm: Zenodo, https://doi.org/10.5281/zenodo.6981516.
Grady II, T. J., R. Khan, M. Louboutin, Z. Yin, P. A. Witte, R. Chandra, R. J. Hewett, and F. J. Herrmann, 2022b, Model-parallel Fourier neural operators as learned surrogates for large-scale parametric PDEs: arXiv preprint, https://doi.org/10.48550/arXiv.2204.01205.
Herrmann, F. J., A. Siahkoohi, and G. Rizzuti, 2019, Learned imaging with constraints and uncertainty quantification: arXiv preprint, https://doi.org/10.48550/arXiv.1909.06473.
Hewett, R. J., and T. J. Grady II, 2020, A linear algebraic approach to model parallelism in deep learning: arXiv preprint, https://doi.org/10.48550/arXiv.2006.03108.
Innes, M., 2018a, Don't unroll adjoint: Differentiating SSA-form programs: arXiv preprint, https://doi.org/10.48550/arXiv.1810.07951.
Innes, M., 2018b, Flux: Elegant machine learning with Julia: Journal of Open Source Software, 3, no. 25, 602, https://doi.org/10.21105/joss.00602.
Innes, M., A. Edelman, K. Fischer, C. Rackauckas, E. Saba, V. B. Shah, and W. Tebbutt, 2019, A differentiable programming system to bridge machine learning and scientific computing: arXiv preprint, https://doi.org/10.48550/arXiv.1907.07587.
Innes, M., E. Saba, K. Fischer, D. Gandhi, M. C. Rudilosso, N. M. Joy, T. Karmali, A. Pal, and V. Shah, 2018, Fashionable modelling with Flux: arXiv preprint, https://doi.org/10.48550/arXiv.1811.01457.
Jones, C. E., J. A. Edgar, J. I. Selvage, and H. Crook, 2012, Building complex synthetic models to evaluate acquisition geometries and velocity inversion technologies: 74th Conference and Exhibition, EAGE, Extended Abstracts, https://doi.org/10.3997/2214-4609.20148575.
Lensink, K., H. Modzelewski, M. Louboutin, yzhang3198, and Z. Yin, 2023, slimgroup/SegyIO.jl: v0.8.3: Zenodo, https://doi.org/10.5281/zenodo.7502671.
Li, D., K. Xu, J. M. Harris, and E. Darve, 2020, Coupled time-lapse full-waveform inversion for subsurface flow problems using intrusive automatic differentiation: Water Resources Research, 56, no. 8, e2019WR027032, https://doi.org/10.1029/2019WR027032.
Li, Z., N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar, 2020, Fourier neural operator for parametric partial differential equations: arXiv preprint, https://doi.org/10.48550/arXiv.2010.08895.
Lie, K.-A., and O. Møyner, eds., 2021, Advanced modelling with the MATLAB reservoir simulation toolbox: Cambridge University Press, https://doi.org/10.1017/9781009019781.
484 The Leading Edge July 2023 Special Section: Digitalization in energy
Lin, T. T. Y., and F. J. Herrmann, 2015, The student-driven HPC Orozco, R., A. Siahkoohi, G. Rizzuti, T. van Leeuwen, and F. J.
environment at SLIM: Presented at the Inaugural Full-Waveform Herrmann, 2023b, Adjoint operators enable fast and amortized
Inversion Workshop, https://slim.gatech.edu/Publications/Public/ machine learning based Bayesian uncertainty quantification:
Conferences/IIPFWI/lin2015IIPFWIsdh/lin2015IIPFWIsdh_pres. Proceedings SPIE 12464, Medical Imaging 2023: Image Processing,
Downloaded 09/24/24 to 128.12.122.235. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/page/policies/terms

pdf, accessed 2 June 2023. 124641L, https://doi.org/10.1117/12.2651691.


Louboutin, M., and F. J. Herrmann, 2017, Extending the search space Padula, A. D., S. D. Scott, and W. W. Symes, 2009, A software framework
of time-domain adjoint-state FWI with randomized implicit time for abstract expression of coordinate-free linear algebra and optimiza-
shifts: 79th Conference and Exhibition, EAGE, Extended Abstracts, tion algorithms: ACM Transactions on Mathematical Software, 36,
https://doi.org/10.3997/2214-4609.201700831. no. 2, 8, https://doi.org/10.1145/1499096.1499097.
Louboutin, M., and F. J. Herrmann, 2022, Enabling wave-based inversion Paszke, A., S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T.
on GPUs with randomized trace estimation: 83rd Annual conference Killeen, 2019, PyTorch: An imperative style, high-performance deep
and Exhibition, EAGE, Extended Abstracts, https://doi. learning library: Proceedings of the 33rd International Conference
org/10.3997/2214-4609.202210531. on Neural Information Processing Systems, 8026–8037.
Louboutin, M., and F. J. Herrmann, 2023, Wave-based inversion at scale Peters, B., and F. J. Herrmann, 2019, Algorithms and software for projec-
on GPUs with randomized trace estimation, https://slim.gatech.edu/ tions onto intersections of convex and non-convex sets with applica-
Publications/Public/Submitted/2023/louboutin2023rte/paper.html, tions to inverse problems: arXiv preprint, https://doi.org/10.48550/
accessed 2 June 2023. arXiv.1902.09699.
Louboutin, M., M. Lange, F. Luporini, N. Kukreja, P. A. Witte, F. J. Peters, B., M. Louboutin, and H. Modzelewski, 2022, slimgroup/
Herrmann, P. Velesko, and G. J. Gorman, 2019, Devito (v3.1.0): An SetIntersectionProjection.jl: v0.2.4: Zenodo, https://doi.org/10.5281/
embedded domain-specific language for finite differences and geo- zenodo.7257913.
physical exploration: Geoscientific Model Development, 12, no. 3, Rasmussen, A. F., T. H. Sandve, K. Bao, A. Lauser, J. Hove, B. Skaflestad,
1165–1187, https://doi.org/10.5194/gmd-12-1165-2019. R. Klöfkorn, et al., 2021, The Open Porous Media Flow reservoir
Louboutin, M., P. Witte, and F. J. Herrmann, 2018, Effects of wrong simulator: Computers & Mathematics with Applications, 81, 159–185,
adjoints for RTM in TTI media: 88th Annual International Meeting, https://doi.org/10.1016/j.camwa.2020.05.014.
SEG, Expanded Abstracts, 331–335, https://doi.org/10.1190/ Rathgeber, F., D. A. Ham, L. Mitchell, M. Lange, F. Luporini,
segam2018-2996274.1. A. T. T. Mcrae, G.-T. Bercea, G. R. Markall, and P. H. J. Kelly,
Louboutin, M., P. Witte, A. Siahkoohi, G. Rizzuti, Z. Yin, R. Orozco, 2016, Firedrake: Automating the finite element method by composing
and F. J. Herrmann, 2022a, Accelerating innovation with software abstractions: ACM Transactions on Mathematical Software, 43, no.
abstractions for scalable computational geophysics: Second 3, Article 24, https://doi.org/10.1145/2998441.
International Meeting for Applied Geoscience & Energy, SEG/ Ravasi, M., and I. Vasconcelos, 2020, PyLops — A linear-operator
DOI:10.1190/tle42070474.1

APPG, Expanded Abstracts, 1482–1486, https://doi.org/10.1190/ Python library for scalable algebra and optimization: SoftwareX, 11,
image2022-3750561.1. 100361, https://doi.org/10.1016/j.softx.2019.100361.
Louboutin, M., P. Witte, Z. Yin, H. Modzelewski, Kerim, C. da Costa, Rezende, D., and S. Mohamed, 2015, Variational inference with normal-
and P. Nogueira, 2023, slimgroup/JUDI.jl: v3.2.3: Zenodo, https:// izing flows: Proceedings of the 32nd International Conference on
doi.org/10.5281/zenodo.7785440. Machine Learning, 1530–1538, http://proceedings.mlr.press/v37/
Louboutin, M., Z. Yin, and F. J. Herrmann, 2022b, slimgroup/ rezende15.html, accessed 2 June 2023.
JUDI4Cloud.jl: First public release: Zenodo, https://doi.org/10.5281/ Richardson, A., 2018, Seismic full-waveform inversion using deep
zenodo.6386831.
learning tools and techniques: arXiv preprint, https://doi.org/10.48550/arXiv.1801.07232.
Louboutin, M., Z. Yin, and F. J. Herrmann, 2022c, slimgroup/SlimOptim.jl: v0.2.0: Zenodo, https://doi.org/10.5281/zenodo.7019463.
Luporini, F., M. Louboutin, M. Lange, N. Kukreja, P. Witte, J. Hückelheim, C. Yount, P. H. J. Kelly, F. J. Herrmann, and G. J. Gorman, 2020, Architecture and performance of devito, a system for automated stencil computation: ACM Transactions on Mathematical Software, 46, no. 1, 6, https://doi.org/10.1145/3374916.
Meurer, A., C. P. Smith, M. Paprocki, O. Čertík, S. B. Kirpichev, M. Rocklin, A. M. T. Kumar, et al., 2017, SymPy: Symbolic computing in Python: PeerJ Computer Science, 3, e103, https://doi.org/10.7717/peerj-cs.103.
Modzelewski, H., and M. Louboutin, 2022, slimgroup/CloudSegyIO.jl: v1.0.1: Zenodo, https://doi.org/10.5281/zenodo.7434854.
Modzelewski, H., M. Louboutin, Z. Yin, D. Karrasch, and R. Orozco, 2023, slimgroup/JOLI.jl: v0.8.5: Zenodo, https://doi.org/10.5281/zenodo.7752660.
Møyner, O., and G. Bruer, 2023, sintefmath/JutulDarcy.jl: v0.2.2: Zenodo, https://doi.org/10.5281/zenodo.7775738.
Møyner, O., M. Johnsrud, H. M. Nilsen, X. Raynaud, K. O. Lye, and Z. Yin, 2023, sintefmath/jutul.jl: v0.2.5: Zenodo, https://doi.org/10.5281/zenodo.7775759.
Orozco, R., M. Louboutin, A. Siahkoohi, G. Rizzuti, T. van Leeuwen, and F. J. Herrmann, 2023a, Amortized normalizing flows for transcranial ultrasound with uncertainty quantification, https://openreview.net/forum?id=LoJG-lUIlk, accessed 5 June 2023.
Orozco, R., A. Siahkoohi, G. Rizzuti, T. van Leeuwen, and F. J. Herrmann, 2021, Photoacoustic imaging with conditional priors from normalizing flows, https://openreview.net/forum?id=woi1OTvROO1, accessed 2 June 2023.
Ringrose, P., 2020, How to store CO2 underground: Insights from early-mover CCS projects: Springer, https://doi.org/10.1007/978-3-030-33113-9.
Rizzuti, G., M. Louboutin, R. Wang, and F. J. Herrmann, 2021, A dual formulation of wavefield reconstruction inversion for large-scale seismic inversion: Geophysics, 86, no. 6, R879–R893, https://doi.org/10.1190/geo2020-0743.1.
Rizzuti, G., A. Siahkoohi, P. A. Witte, and F. J. Herrmann, 2020, Parameterizing uncertainty by deep invertible networks: An application to reservoir characterization: 90th Annual International Meeting, SEG, Expanded Abstracts, 1541–1545, https://doi.org/10.1190/segam2020-3428150.1.
Settgast, R., C. Sherman, B. Corbett, S. Klevtsov, F. Hamon, A. Mazuyer, A. Vargas, et al., 2022, GEOSX/GEOSX: v0.2.1-alpha: Zenodo, https://doi.org/10.5281/zenodo.7151032.
Siahkoohi, A., and F. J. Herrmann, 2021, Learning by example: Fast reliability-aware seismic imaging with normalizing flows: First International Meeting for Applied Geoscience & Energy, SEG/AAPG, Expanded Abstracts, 1580–1585, https://doi.org/10.1190/segam2021-3581836.1.
Siahkoohi, A., G. Rizzuti, and F. Herrmann, 2020a, A deep-learning based Bayesian approach to seismic imaging and uncertainty quantification: 82nd Conference and Exhibition, EAGE, Extended Abstracts, https://doi.org/10.3997/2214-4609.202010770.
Siahkoohi, A., G. Rizzuti, and F. J. Herrmann, 2020b, Uncertainty quantification in imaging and automatic horizon tracking — A Bayesian deep-prior based approach: 90th Annual International Meeting, SEG, Expanded Abstracts, 1636–1640, https://doi.org/10.1190/segam2020-3417560.1.
Siahkoohi, A., G. Rizzuti, and F. J. Herrmann, 2020c, Weak deep priors for seismic imaging: 90th Annual International Meeting, SEG,

Expanded Abstracts, 2998–3002, https://doi.org/10.1190/segam2020-3417568.1.
Siahkoohi, A., G. Rizzuti, and F. J. Herrmann, 2022, Deep Bayesian inference for seismic imaging with tasks: Geophysics, 87, no. 5, S281–S302, https://doi.org/10.1190/geo2021-0666.1.
Siahkoohi, A., G. Rizzuti, M. Louboutin, P. A. Witte, and F. J. Herrmann, 2021, Preconditioned training of normalizing flows for variational inference in inverse problems: arXiv preprint, https://doi.org/10.48550/arXiv.2101.03709.
Siahkoohi, A., G. Rizzuti, R. Orozco, and F. J. Herrmann, 2023, Reliable amortized variational inference with physics-based latent distribution correction: Geophysics, 88, no. 3, R297–R322, https://doi.org/10.1190/geo2022-0472.1.
Silva, C. D., and F. Herrmann, 2019, A unified 2D/3D large scale software environment for nonlinear inverse problems: ACM Transactions on Mathematical Software, 45, no. 1, 7, https://doi.org/10.1145/3291042.
Song, Y., J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, 2020, Score-based generative modeling through stochastic differential equations: arXiv preprint, https://doi.org/10.48550/arXiv.2011.13456.
Sun, D., and W. W. Symes, 2010, IWAVE implementation of adjoint state method: Technical Report 10-06, Department of Computational and Applied Mathematics: Rice University, https://pdfs.semanticscholar.org/6c17/cfe41b76f6b745c435891ea6ba6f4e2c2dbf.pdf, accessed 5 June 2023.
van den Berg, E., and M. P. Friedlander, 2009, Spot: A linear-operator toolbox for Matlab: Presented at SCAIM Seminar.
van Rossum, G., and F. L. Drake, 2009, Python 3 reference manual: CreateSpace.
Washbourne, J., S. Kaplan, M. Merino, U. Albertin, A. Sekar, C. Manuel, S. Mishra, M. Chenette, and A. Loddoch, 2021, Chevron optimization framework for imaging and inversion (COFII) — An open source and cloud friendly Julia language framework for seismic modeling and inversion: First International Meeting for Applied Geoscience & Energy, SEG/AAPG, Expanded Abstracts, 792–796, https://doi.org/10.1190/segam2021-3594362.1.
Wen, G., Z. Li, K. Azizzadenesheli, A. Anandkumar, and S. M. Benson, 2022, U-FNO — An enhanced Fourier neural operator-based deep-learning model for multiphase flow: Advances in Water Resources, 163, 104180, https://doi.org/10.1016/j.advwatres.2022.104180.
Wen, G., Z. Li, Q. Long, K. Azizzadenesheli, A. Anandkumar, and S. Benson, 2023, Real-time high-resolution CO2 geological storage prediction using nested Fourier neural operators: Energy & Environmental Science, 16, no. 4, 1732–1741, https://doi.org/10.1039/D2EE04204E.
White, F. C., M. Abbott, M. Zgubic, J. Revels, S. Axen, A. Arslan, S. Schaub, et al., 2023, JuliaDiff/ChainRules.jl: v1.47.0: Zenodo, https://doi.org/10.5281/zenodo.7628788.
White, F. C., M. Zgubic, M. Abbott, J. Revels, N. Robinson, A. Arslan, D. Widmann, et al., 2022, JuliaDiff/ChainRulesCore.jl: v1.15.6: Zenodo, https://doi.org/10.5281/zenodo.7107911.
Witte, P., M. Louboutin, K. Lensink, M. Lange, N. Kukreja, F. Luporini, G. Gorman, and F. J. Herrmann, 2018, Full-waveform inversion, part 3: Optimization: The Leading Edge, 37, no. 2, 142–145, https://doi.org/10.1190/tle37020142.1.
Witte, P., M. Louboutin, R. Orozco, G. Rizzuti, A. Siahkoohi, F. Herrmann, B. Peters, P. Haraldsson, and Z. Yin, 2023, slimgroup/InvertibleNetworks.jl: v2.2.4: Zenodo, https://doi.org/10.5281/zenodo.7693048.
Witte, P. A., R. J. Hewett, K. Saurabh, A. Sojoodi, and R. Chandra, 2022, SciAI4Industry — Solving PDEs for industry-scale problems with deep learning: arXiv preprint, https://doi.org/10.48550/arXiv.2211.12709.
Witte, P. A., M. Louboutin, N. Kukreja, F. Luporini, M. Lange, G. J. Gorman, and F. J. Herrmann, 2019a, A large-scale framework for symbolic implementations of seismic inversion algorithms in Julia: Geophysics, 84, no. 3, F57–F71, https://doi.org/10.1190/geo2018-0174.1.
Witte, P. A., M. Louboutin, F. Luporini, G. J. Gorman, and F. J. Herrmann, 2019b, Compressive least-squares migration with on-the-fly Fourier transforms: Geophysics, 84, no. 5, R655–R672, https://doi.org/10.1190/geo2018-0490.1.
Yang, M., Z. Fang, P. A. Witte, and F. J. Herrmann, 2020, Time-domain sparsity promoting least-squares reverse time migration with source estimation: Geophysical Prospecting, 68, no. 9, 2697–2711, https://doi.org/10.1111/1365-2478.13021.
Yin, Z., H. T. Erdinc, A. P. Gahlot, M. Louboutin, and F. J. Herrmann, 2023, Derisking geologic carbon storage from high-resolution time-lapse seismic to explainable leakage detection: The Leading Edge, 42, no. 1, 69–76, https://doi.org/10.1190/tle42010069.1.
Yin, Z., and M. Louboutin, 2023, slimgroup/JutulDarcyRules.jl: v0.2.4: Zenodo, https://doi.org/10.5281/zenodo.7762154.
Yin, Z., M. Louboutin, and F. J. Herrmann, 2021, Compressive time-lapse seismic monitoring of carbon storage and sequestration with the joint recovery model: First International Meeting for Applied Geoscience & Energy, SEG/AAPG, Expanded Abstracts, 3434–3438, https://doi.org/10.1190/segam2021-3569087.1.
Yin, Z., R. Orozco, P. A. Witte, M. Louboutin, G. Rizzuti, and F. J. Herrmann, 2020, Extended source imaging — A unifying framework for seismic and medical imaging: 90th Annual International Meeting, SEG, Expanded Abstracts, 3502–3506, https://doi.org/10.1190/segam2020-3426999.1.
Yin, Z., A. Siahkoohi, M. Louboutin, and F. J. Herrmann, 2022, Learned coupled inversion for carbon sequestration monitoring and forecasting with Fourier neural operators: Second International Meeting for Applied Geoscience & Energy, SEG/AAPG, Expanded Abstracts, 467–472, https://doi.org/10.1190/image2022-3722848.1.
Zhang, X., and A. Curtis, 2020, Seismic tomography using variational inference methods: Journal of Geophysical Research: Solid Earth, 125, no. 4, e2019JB018589, https://doi.org/10.1029/2019JB018589.
Zhang, X., and A. Curtis, 2021, Bayesian geophysical inversion using invertible neural networks: Journal of Geophysical Research: Solid Earth, 126, no. 7, e2021JB022320, https://doi.org/10.1029/2021JB022320.
Zhao, X., A. Curtis, and X. Zhang, 2021, Bayesian seismic tomography using normalizing flows: Geophysical Journal International, 228, no. 1, 213–239, https://doi.org/10.1093/gji/ggab298.

© 2023 The Authors. Published by the Society of Exploration Geophysicists. All article content, except where otherwise noted (including republished material), is licensed under a Creative Commons Attribution 4.0 International (CC BY) license. See https://creativecommons.org/licenses/by/4.0/. Distribution or reproduction of this work in whole or in part commercially or noncommercially requires full attribution of the original publication, including its digital object identifier (DOI).
