
Computers & Geosciences 155 (2021) 104833

Contents lists available at ScienceDirect

Computers and Geosciences


journal homepage: www.elsevier.com/locate/cageo

PINNeik: Eikonal solution using physics-informed neural networks


Umair bin Waheed a,*, Ehsan Haghighat b, Tariq Alkhalifah c, Chao Song c, Qi Hao a

a Department of Geosciences, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia
b Department of Civil Engineering, Massachusetts Institute of Technology, MA, 02139, USA
c Physical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955, Saudi Arabia

ARTICLE INFO

Keywords:
Eikonal equation
Physics-informed neural networks
Seismic modeling
Traveltimes

ABSTRACT

The eikonal equation is utilized across a wide spectrum of science and engineering disciplines. In seismology, it regulates seismic wave traveltimes needed for applications like source localization, imaging, and inversion. Several numerical algorithms have been developed over the years to solve the eikonal equation. However, these methods require considerable modifications to incorporate additional physics, such as anisotropy, and may even break down for certain complex forms of the eikonal equation, requiring approximation methods. Moreover, they suffer from a computational bottleneck when repeated computations are needed for perturbations in the velocity model and/or the source location, particularly in large 3D models. Here, we propose an algorithm to solve the eikonal equation based on the emerging paradigm of physics-informed neural networks (PINNs). By minimizing a loss function formed by imposing the eikonal equation, we train a neural network to output traveltimes that are consistent with the underlying partial differential equation. We observe sufficiently high traveltime accuracy for most applications of interest. We also demonstrate how the proposed algorithm harnesses machine learning techniques like transfer learning and surrogate modeling to speed up traveltime computations for updated velocity models and source locations. Furthermore, we use a locally adaptive activation function and adaptive weighting of the terms in the loss function to improve convergence rate and solution accuracy. We also show the flexibility of the method in incorporating medium anisotropy and free-surface topography compared to conventional methods that require significant algorithmic modifications. These properties of the proposed PINN eikonal solver are highly desirable in obtaining a flexible and efficient forward modeling engine for seismological applications.

1. Introduction

The eikonal (from the Greek word εικων = image) equation is a first-order non-linear partial differential equation (PDE) encountered in the wave propagation and geometric optics literature. It was first derived by Sir William Rowan Hamilton in the year 1831 (Masoliver and Ros, 2009). The eikonal equation finds its roots in both wave propagation theory and geometric optics. In wave propagation, the eikonal equation can be derived from the first term of the Wentzel-Kramers-Brillouin (WKB) expansion of the wave equation (Paris and Hurd, 1969), whereas in geometric optics, it can be derived using Huygens' principle (Arnold, 2013).

Despite its origins in optics, the eikonal equation finds applications in many science and engineering problems. To name a few, in image processing, it is used to compute distance fields from one or more points (Adalsteinsson and Sethian, 1994), to infer 3D surface shapes from intensity values in 2D images (Rouy and Tourin, 1992), and for image denoising (Malladi and Sethian, 1996), segmentation (Alvino et al., 2007), and registration (Cao et al., 2004). In robotics, the eikonal equation is extensively used for optimal path planning and navigation, e.g., for domestic robots (Ventura and Ahmad, 2014), autonomous underwater vehicles (Petres et al., 2007), and Mars Rovers (Garrido et al., 2016). In computer graphics, the eikonal equation is used to compute geodesic distances for extracting shortest paths on discrete and parametric surfaces (Spira and Kimmel, 2004; Raviv et al., 2011). In semi-conductor manufacturing, the eikonal equation is used for etching, deposition, and lithography simulations (Helmsen et al., 1996; Adalsteinsson and Sethian, 1996). Furthermore, and of primary interest to us, the eikonal equation is routinely employed in seismology to compute traveltime fields needed for many applications, including statics and moveout correction (Lawton, 1989), traveltime tomography (Guo et al., 2019), microseismic source localization (Grechka et al., 2015), and Kirchhoff migration (Lambare et al., 2003).

* Corresponding author.
E-mail address: [email protected] (U. Waheed).

https://doi.org/10.1016/j.cageo.2021.104833
Received 4 September 2020; Received in revised form 17 May 2021; Accepted 17 May 2021
Available online 4 June 2021
0098-3004/© 2021 Elsevier Ltd. All rights reserved.
The fast marching method (FMM) and the fast sweeping method (FSM) are the two most commonly used algorithms for solving the eikonal equation. FMM belongs to the family of algorithms also referred to as single-pass methods. The first such algorithm is attributed to John Tsitsiklis (1995), who used a control-theoretic discretization of the eikonal equation and emulated a Dijkstra-like shortest path algorithm. However, a few months later, a finite-difference approach, also based on Dijkstra-like ordering and updating, was developed (Sethian, 1996). The FMM combines entropy-satisfying upwind schemes for gradient approximations and a fast sorting mechanism to solve the eikonal equation in a single pass.

The FSM, on the other hand, is a multi-pass algorithm that combines Gauss-Seidel iterations with alternating sweeping orderings to solve the eikonal equation (Zhao, 2005). The idea behind the algorithm is that the characteristics of the eikonal equation can be divided into a finite number of pieces, and information propagating along each piece can be accounted for by one of the sweeping directions. Therefore, FSM converges in a finite number of iterations, irrespective of the grid size.

Both FMM and FSM were initially proposed to solve the eikonal equation on rectangular grids. However, many different approaches have since been proposed, extending them to other discretizations and formulations. A detailed analysis and comparison of these fast methods can be found in Gómez et al. (2019).

On a different front, deep learning is fast emerging as a potentially disruptive tool to tackle longstanding research problems across science and engineering disciplines (Najafabadi et al., 2015). Recent advances in the field of Scientific Machine Learning have demonstrated the largely untapped potential of deep learning for applications in scientific computing. The idea to use neural networks for solving PDEs has been around since the 1990s (Lee and Kang, 1990; Lagaris et al., 1998). However, recent advances in the theory of deep learning, coupled with a massive increase in computational power and efficient graph-based implementations of new algorithms and automatic differentiation (Baydin et al., 2017), have seen a resurgence of interest in using neural networks to approximate the solution of PDEs.

This resurgence is confirmed by the advances made in the recent literature on scientific computing. For example, Ling et al. (2016) used a deep neural network (DNN) for modeling turbulence in fluid dynamics, while Han et al. (2018) proposed a deep learning algorithm to solve the non-linear Black–Scholes equation, the Hamilton–Jacobi–Bellman equation, and the Allen–Cahn equation. Similarly, Sirignano and Spiliopoulos (2018) developed a mesh-free algorithm based on deep learning for efficiently solving high-dimensional PDEs. In addition, Tompson et al. (2017) used a convolutional neural network to speed up the solution of the sparse linear system required to obtain a numerical solution of the Navier-Stokes equation.

Recently, Raissi et al. (2019) developed a deep learning framework for the solution and discovery of PDEs. The so-called physics-informed neural network (PINN) leverages the capabilities of DNNs as universal function approximators. In contrast with conventional deep learning approaches, PINNs restrict the space of admissible solutions by enforcing the validity of the underlying PDE governing the actual physics of the problem. This is achieved by using a simple feed-forward network leveraging automatic differentiation (AD), also known as algorithmic differentiation. PINNs have already demonstrated success in solving a wide range of non-linear PDEs, including the Burgers, Schrödinger, Navier-Stokes, and Allen-Cahn equations (Raissi et al., 2019). Moreover, PINNs have also been successfully applied to problems arising in the geosciences (Xu et al., 2019; Karimpouli and Tahmasebi, 2020; Song et al., 2021; Bai and Tahmasebi, 2021; Waheed et al., 2021).

In this paper, we propose a paradigm shift from conventional numerical algorithms to solve the eikonal equation. Using a loss function defined by the underlying PDE, we train a DNN to yield the solution of the eikonal equation. To mitigate the point-source singularity, we use the factored eikonal equation. Through tests on benchmark synthetic models, we study the accuracy properties of the proposed solver. We also explore how machine learning techniques like transfer learning and surrogate modeling can potentially speed up repeated traveltime computations with updated velocity models and/or source locations. We also demonstrate the flexibility of the proposed scheme in incorporating additional physics and surface topography into the eikonal solution.

The main contributions of this paper are as follows: (1) We propose a novel algorithm to solve the eikonal equation based on neural networks, which predicts functional solutions by setting the underlying PDE as a loss function to optimize the network's parameters. The proposed algorithm achieves sufficiently high accuracy on models of practical interest. (2) Through the use of transfer learning, we show how repeated traveltime computations can be done efficiently. On the contrary, conventional algorithms like fast marching and fast sweeping require the same computational effort even for small perturbations in the velocity model or source location. (3) We demonstrate that by constructing surrogate models with respect to the source location, the computations can be sped up dramatically, as only a single evaluation of the trained neural network is needed for perturbations in the source location. Such a model can also be effectively used for sensitivity analysis. (4) We demonstrate the flexibility of the proposed approach in incorporating additional physics by simply updating the loss function, and the fact that no special treatment is needed to accurately account for surface topography or any irregularly shaped domain.

The rest of the paper is organized as follows. We begin by describing the theoretical underpinnings of the algorithm. Then, we present numerical tests probing into the accuracy of the proposed framework on synthetic velocity models. We also explore the applicability of transfer learning and surrogate modeling to efficiently solve the eikonal equation. Next, we discuss the strengths and limitations of the approach, including the implications of this work for the field of numerical eikonal solvers. This is followed by some concluding remarks.

2. Theory

In this section, we first introduce the eikonal equation and the factorization idea. This is followed by a brief overview of deep neural networks and their capabilities as function approximators. Next, we briefly explain the concept of automatic differentiation. Finally, putting these pieces together, we present the proposed algorithm for solving the eikonal equation.

2.1. Eikonal equation

The eikonal equation is a non-linear, first-order, hyperbolic PDE of the form:

$$|\nabla T(\mathbf{x})|^2 = \frac{1}{v^2(\mathbf{x})}, \quad \forall\, \mathbf{x} \in \Omega, \qquad T(\mathbf{x}_s) = 0, \tag{1}$$

where Ω is a domain in R^d with d as the space dimension, T(x) is the traveltime or Euclidean distance to any point x from the point-source x_s, v(x) is the velocity defined on Ω, and ∇ denotes the spatial differential operator. Equation (1) simply means that the gradient of the arrival-time surface is inversely proportional to the speed of the wavefront. This is also commonly known as the isotropic eikonal equation, as the velocity is not a function of the wave propagation direction (∇T/|∇T|).

To avoid the singularity due to the point-source (Qian and Symes, 2002), we consider the factored eikonal equation (Fomel et al., 2009). The factorization approach relies on factoring the unknown traveltime (T(x)) into two functions. One of the functions is specified analytically, such that the other function is smooth in the source neighborhood. Specifically, we consider multiplicative factorization, i.e.,

$$T(\mathbf{x}) = T_0(\mathbf{x})\,\tau(\mathbf{x}), \tag{2}$$


where T0(x) is known and τ(x) is the unknown function. Plugging equation (2) in equation (1), we get the factored eikonal equation:

$$T_0^2\,|\nabla\tau|^2 + \tau^2\,|\nabla T_0|^2 + 2\,T_0\,\tau\,(\nabla T_0 \cdot \nabla\tau) = \frac{1}{v^2(\mathbf{x})}, \qquad \tau(\mathbf{x}_s) = 1. \tag{3}$$

The known factor T0 is computed analytically using the expression:

$$T_0(\mathbf{x}) = \frac{|\mathbf{x} - \mathbf{x}_s|}{v(\mathbf{x}_s)}, \tag{4}$$

where v(x_s) is the velocity at the source location.
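Equations (3) and (4) are straightforward to check numerically. The following minimal NumPy sketch (our own illustration, not the paper's code; the grid and names are hypothetical) computes T0 and its analytical gradient and verifies that τ ≡ 1 satisfies the factored equation in a homogeneous medium, where T0 is itself the exact traveltime:

```python
import numpy as np

# 2 x 2 km grid; source at (1 km, 1 km); homogeneous velocity of 2 km/s.
x, z = np.meshgrid(np.linspace(0.0, 2.0, 101), np.linspace(0.0, 2.0, 101))
xs, zs, v = 1.0, 1.0, 2.0

# Known factor T0 = |x - xs| / v(xs) and its analytical gradient (equation (4)).
dist = np.sqrt((x - xs) ** 2 + (z - zs) ** 2)
T0 = dist / v
safe = np.where(dist > 0.0, dist, np.inf)  # guard the singular source point
dT0dx, dT0dz = (x - xs) / (v * safe), (z - zs) / (v * safe)

# In a homogeneous medium, tau = 1 (so grad tau = 0) solves equation (3) exactly.
tau = np.ones_like(T0)
dtaudx, dtaudz = np.zeros_like(T0), np.zeros_like(T0)
residual = (T0 ** 2 * (dtaudx ** 2 + dtaudz ** 2)
            + tau ** 2 * (dT0dx ** 2 + dT0dz ** 2)
            + 2.0 * T0 * tau * (dT0dx * dtaudx + dT0dz * dtaudz)
            - 1.0 / v ** 2)
mask = dist > 0.0
print(np.abs(residual[mask]).max())  # ~1e-16: the factored PDE is satisfied
```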
2.2. Deep feed-forward neural networks

A feed-forward neural network is a set of neurons organized in layers in which evaluations are performed sequentially through the layers. It can be seen as a computational graph having an input layer, an output layer, and an arbitrary number of hidden layers. In a fully connected neural network, neurons in adjacent layers are connected with each other, but neurons within a single layer share no connection.

Thanks to the universal approximation theorem, a neural network with n neurons in the input layer and m neurons in the output layer can be used to represent a multi-dimensional function u : R^n → R^m (Hornik et al., 1989), as shown in Fig. 1. For illustration, we consider a network of L + 1 layers starting with input layer 0, the output layer L, and L − 1 hidden layers. The number of neurons in each layer is denoted as k_0 = n, k_1, …, k_L = m. Each connection between the i-th neuron in layer l − 1 and the j-th neuron in layer l has a weight w^l_{ji} associated with it. Moreover, for each neuron in layer l, we have an associated bias term b^l_i, i = 1, …, k_l. Each neuron represents a mathematical operation, whereby it takes a weighted sum of its inputs plus a bias term and passes it through an activation function. The output from the k-th neuron in layer l is given as (Bishop, 2006):

$$u_k^l = \sigma\!\left(\sum_{j=1}^{k_{l-1}} w_{kj}^l\, u_j^{l-1} + b_k^l\right), \tag{5}$$

where σ(·) represents the activation function. Commonly used activation functions are the logistic sigmoid, the hyperbolic tangent, and the rectified linear unit (Sibi et al., 2013). By dropping the subscripts, we can write equation (5) compactly in vectorial form:

$$\mathbf{u}^l = \sigma\!\left(\mathbf{W}^l \mathbf{u}^{l-1} + \mathbf{b}^l\right), \tag{6}$$

where W^l is the matrix of weights corresponding to connections between layers l − 1 and l, u^l and b^l are vectors given by u^l_k and b^l_k, respectively, and the activation function is applied element-wise. Computational frameworks, such as Tensorflow (Abadi et al., 2015), can be used to evaluate data-flow graphs like the one given in equation (6) efficiently using parallel execution. The input values can be defined as tensors (multi-dimensional arrays), and the computation of the outputs is vectorized and distributed across the available computational resources for efficient evaluation.

Fig. 1. Schematic representation of a feed-forward neural network with L − 1 hidden layers.
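For concreteness, the following NumPy sketch (our own, with illustrative layer sizes) evaluates equations (5) and (6) for a small fully connected network with two inputs and one output:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(u, W, b, activation=np.tanh):
    """One layer of equation (6): u_l = sigma(W_l u_{l-1} + b_l), sigma element-wise."""
    return activation(W @ u + b)

# Layer sizes: k0 = 2 (inputs x, z), two hidden layers of 20 neurons, kL = 1.
sizes = [2, 20, 20, 1]
params = [(rng.standard_normal((m, n)), rng.standard_normal(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

u = np.array([0.5, 1.0])  # a single input point (x, z)
for i, (W, b) in enumerate(params):
    last = i == len(params) - 1  # linear activation on the output layer
    u = dense(u, W, b, activation=(lambda s: s) if last else np.tanh)
print(u)  # the network output, e.g., a stand-in for tau(x, z)
```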
2.3. Approximation property of neural networks

Neural networks are well-known for their strong representational power. It has been shown that a neural network with a single hidden layer and a finite number of neurons can be used to represent any bounded continuous function to any desired accuracy. This is also known as the universal approximation theorem (Cybenko, 1989; Hornik et al., 1989). It was later shown that by using a non-linear activation function and a deep network, the total number of neurons can be significantly reduced (Lu et al., 2017). Therefore, we seek a trained deep neural network (DNN) that could represent the mapping between the input (x) and the output (τ(x)) of the factored eikonal equation for a given velocity model (v(x)).

It is worth noting that while neural networks are, in theory, capable of representing very complex functions compactly, finding the actual parameters (weights and biases) needed to solve a given PDE can be quite challenging.

2.4. Automatic differentiation

Solving a PDE using PINNs requires derivatives of the network's output with respect to the inputs. There are four possible ways to compute derivatives (Baydin et al., 2017; Margossian, 2019): (1) hand-coded analytical derivatives, (2) symbolic differentiation, (3) numerical approximation such as finite-differences, and (4) automatic differentiation (AD).

Manually worked-out derivatives may be exact, but they are not automated and, thus, impractical. Symbolic differentiation is also exact, but it is memory intensive and prohibitively slow, as one could end up with exponentially large expressions to evaluate. While numerical differentiation is easy to implement, it can be highly inaccurate due to round-off errors. On the contrary, AD uses exact expressions with floating-point values instead of symbolic strings, and it involves no approximation error, resulting in accurate evaluation of derivatives at machine precision. However, an efficient implementation of AD can be non-trivial. Fortunately, many existing computational frameworks, such as Tensorflow (Abadi et al., 2015) and PyTorch (Paszke et al., 2017), have made efficiently implemented AD libraries available. In fact, in deep learning, backpropagation (Rumelhart et al., 1986), a specialized form of reverse-mode AD, has been the mainstay for training neural networks.

To understand how AD works, consider a simple fully-connected neural network with two inputs (x1, x2), one output (y), and one neuron in the hidden layer. Let us assume the network's weights and biases are assigned such that:

$$\nu = 2x_1 + 3x_2 - 1, \qquad h = \sigma(\nu) = \frac{1}{1 + e^{-\nu}}, \qquad y = 5h + 2, \tag{7}$$

where h represents the output from the neuron in the hidden layer, computed by applying the sigmoid function (σ) to the weighted sum of the inputs (ν).

To illustrate the idea, let us say we are interested in computing the partial derivatives ∂y/∂x1 and ∂y/∂x2 at (x1, x2) = (1, −1). AD requires one forward pass and one backward pass through the network to compute these derivatives, as detailed in Table 1. To compute higher-order derivatives, AD can be applied recursively through the network in the same manner.


Fig. 2. A workflow for the proposed eikonal solver: A randomly initialized neural network is trained on a set of randomly selected collocation points (x*, z*) in the model space with given velocity v(x*, z*) and the known traveltime function T0(x*, z*) and its spatial derivative ∇T0(x*, z*) to minimize the loss function given in equation (8). Once the network is trained, it is evaluated on a regular grid of points (x, z) to yield an estimate of the unknown traveltime field τ̂, which is then multiplied with the known traveltime part T0 to yield the estimated eikonal solution T̂.

Table 1
Example of forward and reverse pass computations needed by AD to compute partial derivatives of the output with respect to the inputs at (x1, x2) = (1, −1) for the expressions given in equation (7).

Forward pass:
  x1 = 1
  x2 = −1
  ν = 2x1 + 3x2 − 1 = −2
  h = 1/(1 + e^(−ν)) = 0.119
  y = 5h + 2 = 2.596

Reverse pass:
  ∂y/∂y = 1
  ∂y/∂h = ∂(5h + 2)/∂h = 5
  ∂y/∂ν = (∂y/∂h)·(∂h/∂ν) = 5 × e^(−ν)/(1 + e^(−ν))² = 0.525
  ∂y/∂x1 = (∂y/∂ν)·(∂ν/∂x1) = (∂y/∂ν) × 2 = 1.050
  ∂y/∂x2 = (∂y/∂ν)·(∂ν/∂x2) = (∂y/∂ν) × 3 = 1.575

For a deeper understanding of AD, we refer the interested reader to Elliott (2018).
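The computations in Table 1 can be reproduced with the AD machinery of any of these frameworks; a minimal TensorFlow sketch (ours, not from the paper) is:

```python
import tensorflow as tf

x1 = tf.Variable(1.0)
x2 = tf.Variable(-1.0)

with tf.GradientTape() as tape:
    nu = 2.0 * x1 + 3.0 * x2 - 1.0   # weighted sum of the inputs
    h = tf.sigmoid(nu)               # hidden neuron output
    y = 5.0 * h + 2.0                # network output

dy_dx1, dy_dx2 = tape.gradient(y, [x1, x2])
print(float(dy_dx1), float(dy_dx2))  # ~1.050 and ~1.575, matching Table 1
```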

2.5. Solving the eikonal equation

To solve the eikonal equation (1), we leverage the capabilities of neural networks as function approximators and define a loss function that minimizes the residual of the factored eikonal equation at a chosen set of training (collocation) points. This is achieved with (i) a DNN approximation of the unknown traveltime field variable τ(x); (ii) a loss function incorporating the eikonal equation and sampled on a collocation grid; (iii) a differentiation algorithm, i.e., AD in this case, to evaluate partial derivatives of τ(x) with respect to the spatial coordinates; and (iv) an optimizer to minimize the loss function by updating the network parameters.

To illustrate the idea, let us consider a two-dimensional domain Ω ⊂ R², where x = (x, z) ∈ [0, 2] × [0, 2], as shown in Fig. 3. A source term is considered at x_s = (x_s, z_s), where τ(x_s) = 1. The unknown traveltime factor τ(x) is approximated by a multilayer DNN N_τ, i.e., τ(x) ≈ τ̂(x) = N_τ(x; θ), where x = (x, z) are the network inputs, τ̂ is the network output, and θ represents the set of all trainable parameters of the network.

Fig. 3. A velocity model with a constant velocity gradient of 0.5 s⁻¹ in the vertical direction. The velocity at zero depth is equal to 2 km/s and it increases linearly to 3 km/s at a depth of 2 km. Black star indicates the point-source location used for the tests.

The loss function can now be constructed using a mean-squared-error (MSE) norm as:

$$\mathcal{J} = \frac{1}{N_I}\sum_{\mathbf{x}^* \in I} \|\mathcal{L}\|^2 + \frac{1}{N_I}\sum_{\mathbf{x}^* \in I} \|\mathcal{H}(-\hat{\tau})\,|\hat{\tau}|\|^2 + \|\hat{\tau}(\mathbf{x}_s) - 1\|^2, \tag{8}$$

where

$$\mathcal{L} = T_0^2\,|\nabla\hat{\tau}|^2 + \hat{\tau}^2\,|\nabla T_0|^2 + 2\,T_0\,\hat{\tau}\,(\nabla T_0 \cdot \nabla\hat{\tau}) - \frac{1}{v^2(\mathbf{x})} \tag{9}$$

forms the residual of the factored eikonal equation.

The first term on the right side of equation (8) imposes the validity of the factored eikonal equation on a given set of training points x* ∈ I, with N_I as the number of sampling points. The second term forces the solution τ̂ to be positive by penalizing negative solutions using the Heaviside function H(·). The last term requires the solution to be unity at the point-source x_s = (x_s, z_s).

Network parameters θ are then identified by minimizing the loss function (8) on a set of sampling (training) points x* ∈ I, i.e.,

$$\arg\min_{\theta}\; \mathcal{J}(\mathbf{x}^*; \theta). \tag{10}$$

Once the DNN is trained, we evaluate the network on a set of regular grid-points in the computational domain to obtain the unknown traveltime part. The final traveltime solution is obtained by multiplying it with the known traveltime part, i.e.,

$$\hat{T}(\mathbf{x}) = T_0(\mathbf{x})\,\hat{\tau}(\mathbf{x}). \tag{11}$$

A pictorial description of the proposed algorithm is shown in Fig. 2. It should be highlighted that the computation of derivatives on the velocity model boundaries using AD is straightforward and does not need any special treatment.

It is also worth emphasizing that the proposed approach is different from traditional (or non-physics constrained) deep learning techniques. The training of the network here refers to the tuning of the weights and biases of the network such that the resulting solution minimizes the loss function J on a given set of training points. The training set here refers to the collocation points, usually chosen randomly from within the computational domain. The number of collocation points needed to obtain a sufficiently accurate solution increases with the heterogeneity of the velocity model.

Contrary to supervised learning applications, the network here learns without any labeled set. To understand this point, consider a randomly initialized network, which will output a certain value τ̂_{i,j} for each point (i, j) in the training set. These output values are used to calculate the residual using equation (8). Based on this residual, the network adjusts its weights and biases, allowing it to produce a τ̂ that adheres to the underlying factored eikonal equation (3).
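Putting these pieces together, a compact TensorFlow sketch of the loss in equations (8) and (9) might look as follows. This is our own minimal illustration (the paper's implementation is built on the SciANN package, and the names here are hypothetical); `model` is assumed to map (x, z) to τ̂, while T0 and its gradient are supplied analytically from equation (4):

```python
import tensorflow as tf

def eikonal_loss(model, x, z, T0, dT0dx, dT0dz, inv_v2, x_s, z_s):
    """MSE loss of equations (8)-(9): PDE residual + positivity + source terms."""
    with tf.GradientTape(persistent=True) as tape:
        tape.watch([x, z])
        tau = model(tf.stack([x, z], axis=1))[:, 0]  # tau_hat at collocation points
    dtaudx = tape.gradient(tau, x)
    dtaudz = tape.gradient(tau, z)
    # Residual of the factored eikonal equation (9).
    L = (T0**2 * (dtaudx**2 + dtaudz**2)
         + tau**2 * (dT0dx**2 + dT0dz**2)
         + 2.0 * T0 * tau * (dT0dx * dtaudx + dT0dz * dtaudz)
         - inv_v2)
    pde_term = tf.reduce_mean(L**2)
    # Heaviside-weighted magnitude H(-tau)|tau| = max(-tau, 0) penalizes negatives.
    positivity = tf.reduce_mean(tf.nn.relu(-tau)**2)
    # tau_hat must equal 1 at the point-source.
    tau_s = model(tf.stack([x_s, z_s], axis=1))
    source_term = tf.reduce_mean((tau_s - 1.0)**2)
    return pde_term + positivity + source_term
```

The three terms are shown with equal weights for readability; the adaptive weighting of the loss terms used in the paper's examples is described in the numerical tests section.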


3. Numerical tests

In this section, we test the proposed PINN eikonal solver for computing traveltimes emanating from a point-source. We consider several velocity models, including a highly heterogeneous portion of the Marmousi model. We also include a model with irregular topography and anisotropy to demonstrate the flexibility of the proposed method compared to conventional algorithms.

For each example presented below, we use a neural network with 10 hidden layers containing 20 neurons each, and we minimize the neural network's loss function using full-batch optimization. The input layer consists of two neurons, one for each spatial coordinate (x, z), and the output layer has a single neuron to provide the estimated traveltime factor τ̂(x, z). The network architecture is chosen using some initial tests and kept fixed for the entire study to avoid the need for architecture tuning for each new velocity model.

The activation function also plays an important role in the optimization of the network. We use a locally adaptive inverse tangent function for all hidden layers except the final layer, which has a linear activation function. Locally adaptive activation functions have recently shown better learning capabilities than traditional or fixed activation functions, achieving a higher convergence rate and solution accuracy (Jagtap et al., 2020). Using a scalable parameter in the activation function for each neuron changes the slope of the activation function and, therefore, alters the loss landscape of the network, yielding improved performance.
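A locally adaptive arctangent activation of this kind can be sketched as a custom layer with one trainable slope parameter per neuron. The code below is our own minimal Keras-style illustration in the spirit of Jagtap et al. (2020), not the authors' implementation:

```python
import tensorflow as tf

class AdaptiveArctan(tf.keras.layers.Layer):
    """Arctangent activation with a trainable slope per neuron (cf. Jagtap et al., 2020)."""

    def __init__(self, scale=10.0, **kwargs):
        super().__init__(**kwargs)
        self.scale = scale  # fixed scale factor n in sigma(n * a * x)

    def build(self, input_shape):
        # One trainable slope parameter per neuron, initialized so n * a = 1.
        self.a = self.add_weight(
            name="slope", shape=(input_shape[-1],),
            initializer=tf.keras.initializers.Constant(1.0 / self.scale))

    def call(self, inputs):
        return tf.math.atan(self.scale * self.a * inputs)

# Example: one hidden block of the tau-network with the adaptive activation.
block = tf.keras.Sequential([tf.keras.layers.Dense(20), AdaptiveArctan()])
```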
Moreover, recent investigations into the PINN model have shown that the optimization process suffers from a discrepancy in the convergence rates of the different components of the loss function (Wang et al., 2020b). Using the statistics of the back-propagated gradients, adaptive weights can be assigned to the different terms in the loss function, balancing the magnitudes of the back-propagated gradients. We adopt this strategy to adaptively assign weights to each term in the loss function for improved convergence. For more information on the adaptive weighting strategy, we refer the interested reader to Wang et al. (2020a, 2020b).

The PINN framework is implemented using the SciANN package (Haghighat and Juanes, 2021) – a high-level Tensorflow wrapper for scientific computations. For comparison, we use the first-order finite-difference fast sweeping solution, which is routinely used for traveltime computations in seismological applications.

First, we consider a 2 × 2 km² model with vertically varying velocity. The velocity at zero depth is 2 km/s, and it increases linearly with a gradient of 0.5 s⁻¹. We consider the point-source to be located at (1 km, 1 km). The model is shown in Fig. 3, with the black star depicting the point-source location. The model is discretized on a 101 × 101 grid with a grid spacing of 20 m along both axes. For a model with a constant velocity gradient, the analytical traveltime solution is given as (Slotnick, 1959):

$$T(\mathbf{x}) = \frac{1}{\sqrt{g_v^2 + g_h^2}}\, \operatorname{arccosh}\!\left(1 + \frac{\left(g_v^2 + g_h^2\right)\,|\mathbf{x} - \mathbf{x}_s|^2}{2\, v(\mathbf{x})\, v(\mathbf{x}_s)}\right), \tag{12}$$

where T(x) is the traveltime value at some grid point x from a point-source located at x_s. Likewise, v(x) is the velocity at the grid-point x and v(x_s) is the velocity at the point-source location. The velocity gradients along the vertical and horizontal dimensions are denoted by g_v and g_h, respectively. Therefore, for the model in Fig. 3, g_v = 0.5 s⁻¹, g_h = 0 s⁻¹, x_s = (1 km, 1 km), and v(x_s) = 2.5 km/s.
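Equation (12) is simple to evaluate for reference-error computations; a minimal NumPy sketch for this class of models (our own illustration) is:

```python
import numpy as np

def analytical_traveltime(x, z, xs, zs, v0, gv, gh):
    """Equation (12): traveltime in a constant-velocity-gradient medium (Slotnick, 1959)."""
    g2 = gv**2 + gh**2
    v = v0 + gv * z + gh * x           # linear velocity model
    vs = v0 + gv * zs + gh * xs        # velocity at the source
    dist2 = (x - xs)**2 + (z - zs)**2
    return np.arccosh(1.0 + g2 * dist2 / (2.0 * v * vs)) / np.sqrt(g2)

# Model of Fig. 3: v = 2 + 0.5 z km/s, source at (1 km, 1 km), 101 x 101 grid.
x, z = np.meshgrid(np.linspace(0, 2, 101), np.linspace(0, 2, 101))
T = analytical_traveltime(x, z, xs=1.0, zs=1.0, v0=2.0, gv=0.5, gh=0.0)
```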
In Fig. 4, we demonstrate the efficacy of using a locally adaptive activation function (Jagtap et al., 2020) and the adaptive weighting strategy for the terms in the loss function (Wang et al., 2020a) compared to the standard PINN approach. For both cases, the network is trained on 25% of the total grid points, randomly selected, using the Adam optimizer for 10,000 epochs. We observe markedly improved convergence by using the locally adaptive arctangent activation function and adaptive weights for the loss terms. Therefore, all examples reported in this study utilize these techniques to accelerate the convergence of the PINN model.

Fig. 4. A comparison of loss curves for training of the velocity model shown in Fig. 3 using the arctangent activation function with fixed weights in the loss function (blue) compared to the training performed using the locally adaptive arctangent activation function with adaptive weights for the loss terms (orange). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

In Fig. 5a, we show the absolute traveltime errors for the PINN eikonal solver considering the same velocity model and source position. Once the network is trained, we evaluate it on the same 101 × 101 regular grid. For comparison, we also plot the absolute traveltime errors for the first-order fast sweeping solution in Fig. 5b on the same grid. We observe that despite using only 25% of the total grid points for training, the PINN eikonal solution is significantly more accurate than the first-order fast sweeping solution. As can be seen in Fig. 5b, the fast sweeping solution suffers from large errors in the diagonal direction, whereas the errors for the PINN eikonal solver are more randomly distributed. We also plot traveltime contours in Fig. 6, visually comparing the analytical solution with the PINN eikonal solution and the first-order fast sweeping solution.

Next, we investigate the applicability of transfer learning to the PINN eikonal solver. Transfer learning is a machine learning technique that relies on storing knowledge gained while solving one problem and applying it to a different but related problem. We explore whether the network trained on the previous example can be used to compute the solution for a different source location and velocity model. To this end, we consider a 6 × 6 km² velocity model with a vertical gradient of 0.4 s⁻¹ and a horizontal gradient of 0.1 s⁻¹. The point-source is also relocated to (4 km, 1 km), as shown in Fig. 7. The model is discretized on a 301 × 301 grid with a grid spacing of 20 m along both axes.

To train the network for this case, instead of initializing the network with random weights, we use the weights from the network trained for the previous example. We re-train this neural network using 25% of the total grid points, selected randomly, using the L-BFGS-B solver (Zhu et al., 1997). Starting with pre-trained weights allows us to use a super-linear optimization method for faster convergence, as opposed to starting with the first-order Adam optimizer for stable convergence and then switching to the L-BFGS-B optimizer, as suggested in previous studies (Raissi et al., 2019). For comparison, we also train a neural network with random initialization; the loss curves are compared in Fig. 8. Despite using a different velocity model with a relocated point-source and a larger model, we observe that the network with pre-trained weights converges much faster than the one trained from scratch. This is a highly desirable property of the PINN eikonal solver, since seismic applications, such as earthquake source localization or seismic imaging, require repeated traveltime computations for multiple source locations and updated velocity models. In comparison, conventional numerical algorithms, such as the fast sweeping method, require the same computational effort for even a slight variation in the velocity model or source location, which is a major source of the computational bottleneck, particularly in large 3D models.
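In a Keras-style implementation, this warm start amounts to saving the converged network's parameters and loading them before re-training on the new model. A minimal sketch with hypothetical file and function names (not the authors' SciANN code) is:

```python
import tensorflow as tf

def make_model():
    """A stand-in tau-network: 10 hidden layers of 20 neurons, linear output."""
    layers = [tf.keras.layers.Dense(20, activation="tanh") for _ in range(10)]
    return tf.keras.Sequential([tf.keras.Input(shape=(2,)), *layers,
                                tf.keras.layers.Dense(1)])

model = make_model()
# ... train on the first velocity model with the loss of equation (8) ...
model.save_weights("pinn_model_A.weights.h5")

# Transfer learning: initialize the new run from the stored weights
# instead of a random initialization, then continue training.
new_model = make_model()
new_model.load_weights("pinn_model_A.weights.h5")
# ... re-train on the updated velocity model / relocated source ...
```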


Fig. 5. The absolute traveltime errors for the PINN eikonal solution (a) and the first-order fast sweeping solution (b) for the velocity model and the source location shown in Fig. 3.

Fig. 6. Traveltime contours for the analytical solution (red), PINN eikonal solution (dashed black), and the first-order fast sweeping solution (dotted blue). The velocity model and the source location considered are shown in Fig. 3. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 7. A velocity model with a constant vertical velocity gradient of 0.4 s⁻¹ and a horizontal velocity gradient of 0.1 s⁻¹. Black star indicates the point-source location used for the test.

Fig. 9 compares the absolute traveltime errors computed using the PINN eikonal solution with the pre-trained initial model and the first-order fast sweeping solution. We observe that despite using significantly fewer epochs, the solution accuracy is not compromised. The traveltime contours, shown in Fig. 10, confirm this observation visually.

Next, we explore whether a PINN model trained on solutions computed for various source locations in a given velocity model can be used as a surrogate model. To do this, the neural network is modified to include the source location x_s = (x_s, z_s) as inputs, in addition to the grid points x = (x, z). We train the network on solutions computed for 16 sources located at regular intervals in the same model, as shown in Fig. 11. These computed solutions act as additional data points for training the surrogate model, and these points, along with the computed solutions, can be lumped with the boundary term in the loss function. Through this training process, the network learns the mapping from a considered source location x_s and a point in the model space x to the corresponding traveltime factor value τ̂(x_s, x). Once the surrogate model is trained with source locations as additional input parameters, the traveltime function for new source locations in the given velocity model can be computed rapidly with just a single evaluation of the network, as sketched below. This is similar to obtaining an analytic solver, as no further training is needed for computing traveltimes corresponding to additional source locations. This property is particularly advantageous for large 3D models that need thousands of such computations for inversion applications.
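Structurally, the surrogate only changes the network's input signature from (x, z) to (x, z, x_s, z_s); a minimal Keras-style sketch (our own naming, not the authors' SciANN code) is:

```python
import tensorflow as tf

# Surrogate tau-network: inputs are the evaluation point AND the source location.
inputs = tf.keras.Input(shape=(4,))   # (x, z, xs, zs)
h = inputs
for _ in range(10):                   # same 10 x 20 architecture as before
    h = tf.keras.layers.Dense(20, activation="tanh")(h)
tau_hat = tf.keras.layers.Dense(1)(h)
surrogate = tf.keras.Model(inputs, tau_hat)

# After training, traveltimes for a *new* source need only a forward pass:
# T_hat = T0(x, z; xs, zs) * surrogate([x, z, xs, zs]).
```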


Fig. 8. A comparison of convergence history for training of the velocity model shown in Fig. 7 using a randomly initialized model (blue) and a pre-trained initial model (orange), indicating the number of epochs needed for convergence. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

To demonstrate whether the surrogate model approach yields an accurate solution, we use the trained surrogate model to compute the traveltime solution for a source located at (3.05 km, 4.95 km). This source position is deliberately chosen to be the furthest away from the training source locations, to better analyze the accuracy limits of the surrogate model. We can confirm by looking at the absolute traveltime errors, shown in Fig. 12, that the trained surrogate model yields a highly accurate solution compared to the first-order fast sweeping solution, even though no additional training is performed for this randomly chosen source point (see Fig. 13 for traveltime contours).

Moreover, transfer learning can be used to efficiently build surrogate models for updated velocity models, i.e., by initializing the PINN surrogate model for the updated velocities using weights from the already trained surrogate model. Therefore, the transfer learning technique combined with surrogate modeling can be used to build a highly efficient traveltime modeling engine for seismic inversion, compared to conventional algorithms that do not afford such flexibility.

Next, to demonstrate the flexibility of the proposed framework, we consider an anisotropic model with irregular topography. We consider the simplest form of anisotropy, known as elliptical anisotropy, which uses a vertical and a horizontal velocity to parameterize the model and to define an elliptical phase velocity surface. The considered model parameters are shown in Fig. 14, with a rugged topography layer on top. The introduction of anisotropy necessitates the construction of a new solver for the fast sweeping method, whereas for PINNs it requires only an update to the loss function term that embeds the residual of the PDE, as sketched below. Moreover, for conventional methods, the presence of non-uniform topography requires special treatment, such as mathematical flattening of the free-surface and solving a topography-dependent eikonal equation (Lan and Zhang, 2013). This not only adds to the complexity of the eikonal solver but also results in a considerable increase in the computational cost. On the contrary, being a mesh-free method, PINNs do not require any special treatment. For models with irregular topography, only the grid points below the free-surface are used for training and evaluation of the network.
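As an illustration of such a loss-term swap, the sketch below (our own, written for the unfactored traveltime for readability) evaluates the residual of an elliptically anisotropic eikonal equation, v_h²(∂T/∂x)² + v_v²(∂T/∂z)² = 1, using AD; in the PINN only these few lines change relative to the isotropic case:

```python
import tensorflow as tf

def elliptical_residual(model, x, z, vh2, vv2):
    """Residual of the elliptically anisotropic eikonal equation,
    vh^2 * T_x^2 + vv^2 * T_z^2 - 1, evaluated with AD.
    (Sketch on the unfactored traveltime T for readability.)"""
    with tf.GradientTape(persistent=True) as tape:
        tape.watch([x, z])
        T = model(tf.stack([x, z], axis=1))[:, 0]
    Tx = tape.gradient(T, x)  # horizontal traveltime derivative
    Tz = tape.gradient(T, z)  # vertical traveltime derivative
    return vh2 * Tx**2 + vv2 * Tz**2 - 1.0
```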
Despite the model being anisotropic in nature, we initialize the PINN model using pre-trained weights from the isotropic model shown in Fig. 3. The velocity model is discretized using a grid spacing of 50 m along both axes. The training is done using only 12% of the total grid points, selected randomly below the free-surface, using the L-BFGS-B optimizer for 2000 epochs. For a source located at (5 km, 8 km), the absolute traveltime errors for the PINN and fast sweeping solvers are shown in Fig. 15. The corresponding traveltime contours are shown in Fig. 16. Again, we observe better accuracy for PINNs compared to the first-order fast sweeping method.

Fig. 9. The absolute traveltime errors for the PINN eikonal solution (a) and the first-order fast sweeping solution (b) for the velocity model and the source location shown in Fig. 7.

Finally, we test the PINN eikonal solver on a highly heterogeneous portion of the Marmousi model, as shown in Fig. 17. We consider a source located at (1 km, 1 km). This is a particularly challenging model due to sharp velocity variations. The model is discretized on a 101 × 101 grid with a spacing of 20 m along both axes. Starting with the pre-trained weights from the model shown in Fig. 3, we train the network on 30% of the total grid points, randomly selected from the discretized computational domain. The training is done using an L-BFGS-B solver for 12,000 epochs. Once the network is trained, we evaluate it on the same 101 × 101 regular grid. For comparison, we also compute the traveltime solution using the first-order fast sweeping method on the same regular grid. The absolute traveltime errors for both approaches are compared in Fig. 18, and the traveltime contours are shown in Fig. 19. We observe significantly better accuracy for the PINN eikonal solver compared to the first-order fast sweeping method.

However, compared to previous examples, we also observe considerably slower convergence for the Marmousi model. This can be attributed to the spectral bias of fully-connected neural networks (Rahaman et al., 2019), which underscores the learning bias of deep neural networks towards smoother representations. Since the solution for the Marmousi model contains plenty of local fluctuations or high-frequency features compared to prior examples, this means a longer training time for the neural network to accurately capture the underlying local features. While the use of locally adaptive activation functions, gradient balancing, transfer learning, and a second-order optimizer helps improve the convergence rate, further advances are needed to make PINNs computationally feasible for such highly heterogeneous velocity models.


Fig. 10. Traveltime contours for the analytical solution (red), PINN eikonal solution (dashed black), and the first-order fast sweeping solution (dotted blue). The velocity model and the source location considered are shown in Fig. 7. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 11. A velocity model with a constant vertical velocity gradient of 0.4 s⁻¹ and a horizontal velocity gradient of 0.1 s⁻¹. Black stars indicate locations of sources used to train the network as a surrogate model.

Fig. 12. The absolute traveltime errors for the solution computed using the PINN surrogate model (a) and the first-order fast sweeping solution (b) for the velocity model shown in Fig. 11 and a source located at (3.05 km, 4.95 km).

4. Discussion

In a conventional deep learning application, a neural network is trained by minimizing a loss function that typically measures the mismatch between the network's predicted outputs and their expected (true) values, also known as training data. However, there are several limitations associated with such models that solely rely on a labeled dataset and are oblivious to the scientific principles governing real-world phenomena. For cases when the available training and test data are insufficient, such models often learn spurious relationships that are misleading. However, the biggest concern with such data-driven models is the lack of scientific consistency of their predictions with known physical laws, which have been the cornerstone of knowledge discovery across scientific disciplines for centuries.

A case in point is the failure of Google Flu Trends – a system designed to predict the onset of flu solely based on Google search queries, without taking into account the physical knowledge of the disease spread. Despite its success in the initial years that were used to train the model, it soon started overestimating by several factors, to the point that it was eventually taken down (Lazer et al., 2014). Such problems with black-box data science methods on scientific problems have been reported in several other publications (Caldwell et al., 2014; Marcus and Davis, 2014; Karpatne et al., 2017). Furthermore, consider a neural network with the rectified linear unit (ReLU) activation function. These networks show excellent training and convergence characteristics for data-driven setups. However, the first spatial or temporal derivative of such a network is trivially discontinuous, while the second derivative is identically zero. Considering that most physical phenomena are driven by gradients, such networks cannot show generalization capabilities. In a physics-informed setup, these networks cannot be trained at all because the governing PDEs are not satisfied.
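This behavior is easy to check with AD; the following TensorFlow sketch (ours, not from the paper) differentiates a small ReLU network twice with respect to its input and returns zero almost everywhere:

```python
import tensorflow as tf

x = tf.Variable([[0.7]])
relu_net = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        y = relu_net(x)
    dy_dx = inner.gradient(y, x)      # piecewise-constant in x
d2y_dx2 = outer.gradient(dy_dx, x)    # zero almost everywhere
print(d2y_dx2)
```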


Fig. 13. Traveltime contours for solutions obtained using the analytical formula (red), the PINN surrogate model (dashed black), and the first-order fast sweeping solver (dotted blue). The velocity model considered is shown in Fig. 11, with a source located at (3.05 km, 4.95 km). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Therefore, in the present work, we propose an eikonal solver based on the framework of physics-informed neural networks (Raissi et al., 2019). We leverage the capabilities of neural networks as universal function approximators (Hornik et al., 1989) and define a loss function to minimize the residual of the governing eikonal equation at a chosen set of training points. This is achieved with a simple feed-forward neural network leveraging the concept of automatic differentiation (Baydin et al., 2017). Through numerical tests, we observe that the proposed algorithm yields sufficiently accurate traveltimes for most seismic applications of interest. We demonstrate this by comparing the accuracy of the proposed approach against the first-order fast sweeping solution, which is a popular numerical algorithm for solving the eikonal equation.

We observe that the transfer learning technique can be used to speed up the convergence of the network for new source locations and/or updated velocity models by initializing the PINN model with the weights of a previously trained network. Moreover, having computed solutions corresponding to a few sources for a given velocity model, we can also build a surrogate model with respect to the source locations by adding them as input parameters. This essentially means that this surrogate model can then be used to compute traveltime fields corresponding to new source locations for the same velocity model with just a single evaluation of the network. These observations effectively demonstrate the potential of the proposed approach in massively speeding up many seismic applications that rely on repeated traveltime computations for multiple source locations and velocity models.

For a rudimentary analysis of the computational cost, we note that the training of the surrogate model, shown in Fig. 11, takes about 92.7 s, while each subsequent evaluation for a new source location takes merely 0.11 s. On the contrary, it takes about 1.5 s for the fast sweeping solver to obtain a single solution. Therefore, the computational cost of both approaches would be the same for the first 67 source locations, after which each PINN solution is about 13.5 times faster. Given the fact that usually thousands of such source evaluations are needed for a given velocity model, the computational attractiveness of PINNs is obvious. These computations are done using an NVIDIA Tesla P100 GPU.
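The break-even source count follows directly from these timings (our arithmetic, rounded):

$$92.7\,\mathrm{s} + 0.11\,\mathrm{s}\times n \;\le\; 1.5\,\mathrm{s}\times n \;\Longrightarrow\; n \;\ge\; \frac{92.7}{1.5 - 0.11} \approx 66.7,$$

so the surrogate pays off from the 67th source onward, with each subsequent solution about 1.5/0.11 ≈ 13.6 times faster than a fast sweeping run.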
Fig. 14. The vertical (a) and horizontal (b) velocity models for the elliptically anisotropic model with irregular topography. Black star indicates the position of the point-source.

Moreover, the solution computed with the PINN surrogate model is considerably more accurate; therefore, if we need similar accuracy with the fast sweeping method, we either need to use a much smaller grid spacing or a high-order version, leading to a substantial increase in the computational cost of the fast sweeping solver. Such high accuracy is often required for quantities that rely on traveltime derivatives, such as amplitudes and take-off angles. While these traveltime derivatives need to be separately computed for conventional methods, we obtain them as a by-product of the PINN training process.


Fig. 15. The absolute traveltime errors for the PINN eikonal (a) and first-order fast sweeping (b) solutions for the anisotropic model and the source location shown in Fig. 14.

Fig. 16. Traveltime contours for the reference solution (red), PINN eikonal solution (dashed black), and the first-order fast sweeping solution (dotted blue). The velocity model and the source location considered are shown in Fig. 14. The black solid curve indicates the surface topography. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 17. A highly heterogeneous portion of the Marmousi model used for traveltime computation. Black star indicates the point-source location used for the test.

Nevertheless, it is worth noting that the actual computational advantage of using the proposed algorithm, compared to conventional numerical solvers, depends on many factors, including the neural network architecture, optimization hyper-parameters, and sampling techniques. If the initialization of the network and the learning rate of the optimizer are chosen carefully, the training can be completed quite efficiently. Moreover, the type of activation function used, the adaptive weighting of the loss function terms, and the availability of second-order optimization techniques can accelerate the training significantly. Additional factors that would dictate the computational advantage include the complexity/size of the velocity model and the number of seismic sources. Therefore, a detailed study is needed to quantify the computational efficiency afforded by the proposed PINN eikonal solver compared to conventional algorithms, considering the aforementioned factors. Since the computational cost of solving the anisotropic eikonal equation is 3–5 times that of the isotropic case (Waheed et al., 2015), PINNs will be computationally more attractive for traveltime modeling in anisotropic media.

Other advantages of the proposed framework include the ease of extension to complex eikonal equations, for example, the anisotropic eikonal equation, by simply modifying the neural network's loss function. Since the method is mesh-free, contrary to conventional algorithms, incorporating topography does not require any special treatment. Furthermore, the point-source location does not need to be on a regular grid, as required by conventional numerical algorithms. This often results in using a smaller than necessary computational grid for conventional methods, thereby increasing the computational load, or incurring errors by relocating the point-source to the nearest available grid point. Furthermore, our PINN eikonal solver uses Tensorflow at the backend, which allows easy deployment of computations across a variety of platforms (CPUs, GPUs) and architectures (desktops, clusters). On the contrary, significant effort needs to be spent on adapting conventional algorithms to benefit from different computational platforms or architectures.


Fig. 18. The absolute traveltime errors for the PINN eikonal solution (a) and the first-order fast sweeping solution (b) for the velocity model and the source location shown in Fig. 17.

Fig. 19. Traveltime contours for the reference solution (red), PINN eikonal solution (dashed black), and the first-order fast sweeping solution (dotted blue). The velocity model and the source location considered are shown in Fig. 17. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Nevertheless, there are a few challenges in our observations that need further investigation. Most importantly, we observe slow convergence for velocity models with sharp variations. Although the use of a second-order optimization method, a locally adaptive activation function, and adaptive weighting of the terms in the loss function considerably improves the convergence rate, additional advances are needed, particularly to make PINNs computationally feasible for traveltime modeling of areas with complex subsurface geologic structures. Additional solutions may include a denser sampling of training points around parts of the velocity model with a large velocity gradient (Anitescu et al., 2019) or the use of a non-local PINN framework (Haghighat et al., 2020). Another challenge concerns the optimal choice of the network hyper-parameters, which are often highly problem-dependent. Recent advances in the field of meta-learning may enable the automated selection of optimal architectures.

5. Conclusions

We proposed a novel algorithm to solve the eikonal equation using a deep learning framework. Through tests on benchmark synthetic models, we show that the accuracy of the proposed approach is better than the first-order fast sweeping solution. Depending on the heterogeneity of the velocity model, we also note that training is needed for only a fraction of the total grid points in the computational domain to reliably reconstruct the solution. We also observed that transfer learning can be used to speed up convergence for new velocity models and/or source locations. Moreover, having computed solutions corresponding to a few source locations for a given velocity model, surrogate modeling can be used to train a network to instantly yield traveltime solutions corresponding to new source locations. These properties, not afforded by conventional numerical algorithms, potentially allow us to massively speed up seismic inversion applications, particularly for large 3D models. Moreover, the extension of the proposed framework to more complex eikonal equations, such as the anisotropic eikonal equation, requires only an update to the loss function according to the underlying PDE. Furthermore, contrary to conventional methods, the mesh-free nature of the proposed method allows easy incorporation of surface topography, which is often an important consideration for land seismic data.

We must, however, note that the conventional algorithms to solve the eikonal equation have evolved over several decades, gradually improving in their performance. While we have demonstrated the immense potential of the proposed framework, further study is needed to meet the robustness and efficiency required in practice. We demonstrated the proposed framework on 2D models for simplicity of illustration. The extension to 3D velocity models is straightforward.

Computer code availability

All accompanying codes are publicly available at https://github.com/umairbinwaheed/PINNeikonal.


CRediT authorship contribution statement

Umair bin Waheed: Conceptualization, Methodology, Validation, Visualization, Software, Writing - original draft. Ehsan Haghighat: Conceptualization, Methodology, Software, Writing - original draft. Tariq Alkhalifah: Supervision, Conceptualization, Validation, Writing - review and editing. Chao Song: Methodology, Visualization, Writing - review and editing. Qi Hao: Validation, Writing - review and editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We extend our gratitude to Prof. Sjoerd de Ridder and three anonymous reviewers for their constructive feedback that helped us in improving the paper.

References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., 2015. TensorFlow: large-scale machine learning on heterogeneous systems. https://www.tensorflow.org/. Software available from tensorflow.org.
Adalsteinsson, D., Sethian, J., 1996. Level set methods for etching, deposition and photolithography development. Journal of Technology Computer Aided Design TCAD, 1–67.
Adalsteinsson, D., Sethian, J.A., 1994. A fast level set method for propagating interfaces. J. Comput. Phys. 118.
Alvino, C., Unal, G., Slabaugh, G., Peny, B., Fang, T., 2007. Efficient segmentation based on eikonal and diffusion equations. Int. J. Comput. Math. 84, 1309–1324.
Anitescu, C., Atroshchenko, E., Alajlan, N., Rabczuk, T., 2019. Artificial neural network methods for the solution of second order boundary value problems. Comput. Mater. Continua (CMC) 59, 345–359.
Arnold, V.I., 2013. Mathematical Methods of Classical Mechanics, vol. 60. Springer Science & Business Media.
Bai, T., Tahmasebi, P., 2021. Accelerating geostatistical modeling using geostatistics-informed machine learning. Comput. Geosci. 146, 104663.
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M., 2017. Automatic differentiation in machine learning: a survey. J. Mach. Learn. Res. 18, 5595–5637.
Bishop, C.M., 2006. Pattern Recognition and Machine Learning. Springer.
Caldwell, P.M., Bretherton, C.S., Zelinka, M.D., Klein, S.A., Santer, B.D., Sanderson, B.M., 2014. Statistical significance of climate sensitivity predictors obtained by data mining. Geophys. Res. Lett. 41, 1803–1808.
Cao, Z., Pan, S., Li, R., Balachandran, R., Fitzpatrick, J.M., Chapman, W.C., Dawant, B.M., 2004. Registration of medical images using an interpolated closest point transform: method and validation. Med. Image Anal. 8, 421–427.
Cybenko, G., 1989. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems 2, 303–314.
Elliott, C., 2018. The simple essence of automatic differentiation. Proceedings of the ACM on Programming Languages 2, 1–29.
Fomel, S., Luo, S., Zhao, H., 2009. Fast sweeping method for the factored eikonal equation. J. Comput. Phys. 228, 6440–6455.
Garrido, S., Álvarez, D., Moreno, L., 2016. Path planning for Mars rovers using the fast marching method. In: Robot 2015: Second Iberian Robotics Conference. Springer, pp. 93–105.
Gómez, J.V., Álvarez, D., Garrido, S., Moreno, L., 2019. Fast methods for eikonal equations: an experimental survey. IEEE Access 7, 39005–39029.
Grechka, V., De La Pena, A., Schisselé-Rebel, E., Auger, E., Roux, P.F., 2015. Relative location of microseismicity. Geophysics 80, WC1–WC9.
Guo, R., Li, M., Yang, F., Xu, S., Abubakar, A., 2019. First arrival traveltime tomography using supervised descent learning technique. Inverse Probl. 35, 105008.
Haghighat, E., Bekar, A.C., Madenci, E., Juanes, R., 2020. A nonlocal physics-informed deep learning framework using the peridynamic differential operator. arXiv preprint arXiv:2006.00446.
Haghighat, E., Juanes, R., 2021. SciANN: a Keras/TensorFlow wrapper for scientific computations and physics-informed deep learning using artificial neural networks. Comput. Methods Appl. Mech. Eng. 373, 113552.
Han, J., Jentzen, A., Weinan, E., 2018. Solving high-dimensional partial differential equations using deep learning. Proc. Natl. Acad. Sci. Unit. States Am. 115, 8505–8510.
Helmsen, J.J., Puckett, E.G., Colella, P., Dorr, M., 1996. Two new methods for simulating photolithography development in 3D. In: Optical Microlithography IX, International Society for Optics and Photonics, pp. 253–261.
Hornik, K., Stinchcombe, M., White, H., 1989. Multilayer feedforward networks are universal approximators. Neural Network. 2, 359–366.
Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E., 2020. Locally adaptive activation functions with slope recovery for deep and physics-informed neural networks. Proceedings of the Royal Society A 476, 20200334.
Karimpouli, S., Tahmasebi, P., 2020. Physics informed machine learning: seismic wave equation. Geoscience Frontiers 11, 1993–2001.
Karpatne, A., Atluri, G., Faghmous, J.H., Steinbach, M., Banerjee, A., Ganguly, A., Shekhar, S., Samatova, N., Kumar, V., 2017. Theory-guided data science: a new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 29, 2318–2331.
Lagaris, I.E., Likas, A., Fotiadis, D.I., 1998. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Network. 9, 987–1000.
Lambare, G., Operto, S., Podvin, P., Thierry, P., 2003. 3D ray+Born migration/inversion—part 1: theory. Geophysics 68, 1348–1356.
Lan, H., Zhang, Z., 2013. Topography-dependent eikonal equation and its solver for calculating first-arrival traveltimes with an irregular surface. Geophys. J. Int. 193, 1010–1026.
Lawton, D.C., 1989. Computation of refraction static corrections using first-break traveltime differences. Geophysics 54, 1289–1296.
Lazer, D., Kennedy, R., King, G., Vespignani, A., 2014. The parable of Google Flu: traps in big data analysis. Science 343, 1203–1205.
Lee, H., Kang, I.S., 1990. Neural algorithm for solving differential equations. J. Comput. Phys. 91, 110–131.
Ling, J., Kurzawski, A., Templeton, J., 2016. Reynolds averaged turbulence modelling using deep neural networks with embedded invariance. J. Fluid Mech. 807, 155–166.
Lu, Z., Pu, H., Wang, F., Hu, Z., Wang, L., 2017. The expressive power of neural networks: a view from the width. In: Advances in Neural Information Processing Systems, pp. 6231–6239.
Malladi, R., Sethian, J.A., 1996. A unified approach to noise removal, image enhancement, and shape recovery. IEEE Trans. Image Process. 5, 1554–1568.
Marcus, G., Davis, E., 2014. Eight (no, nine!) problems with big data. The New York Times 6, 2014.
Margossian, C.C., 2019. A review of automatic differentiation and its efficient implementation. Wiley Interdisciplinary Reviews: Data Min. Knowl. Discov. 9, e1305.
Masoliver, J., Ros, A., 2009. From classical to quantum mechanics through optics. Eur. J. Phys. 31, 171.
Najafabadi, M.M., Villanustre, F., Khoshgoftaar, T.M., Seliya, N., Wald, R., Muharemagic, E., 2015. Deep learning applications and challenges in big data analytics. Journal of Big Data 2, 1.
Paris, D.T., Hurd, F.K., 1969. Basic Electromagnetic Theory. McGraw-Hill Education.
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A., 2017. Automatic differentiation in PyTorch. In: Proceedings of Neural Information Processing Systems.
Petres, C., Pailhas, Y., Patron, P., Petillot, Y., Evans, J., Lane, D., 2007. Path planning for autonomous underwater vehicles. IEEE Transactions on Robotics 23, 331–341.
Qian, J., Symes, W.W., 2002. An adaptive finite-difference method for traveltimes and amplitudes. Geophysics 67, 167–176.
Rahaman, N., Baratin, A., Arpit, D., Draxler, F., Lin, M., Hamprecht, F., Bengio, Y., Courville, A., 2019. On the spectral bias of neural networks. In: International Conference on Machine Learning. PMLR, pp. 5301–5310.
Raissi, M., Perdikaris, P., Karniadakis, G.E., 2019. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707.
Raviv, D., Bronstein, A.M., Bronstein, M.M., Kimmel, R., Sochen, N., 2011. Affine-invariant geodesic geometry of deformable 3D shapes. Comput. Graph. 35, 692–697.
Rouy, E., Tourin, A., 1992. A viscosity solutions approach to shape-from-shading. SIAM J. Numer. Anal. 29, 867–884.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-propagating errors. Nature 323, 533–536.
Sethian, J.A., 1996. A fast marching level set method for monotonically advancing fronts. Proc. Natl. Acad. Sci. Unit. States Am. 93, 1591–1595.
Sibi, P., Jones, S.A., Siddarth, P., 2013. Analysis of different activation functions using back propagation neural networks. J. Theor. Appl. Inf. Technol. 47, 1264–1268.
Sirignano, J., Spiliopoulos, K., 2018. DGM: a deep learning algorithm for solving partial differential equations. J. Comput. Phys. 375, 1339–1364.
Slotnick, M., 1959. Lessons in Seismic Computing. Soc. Expl. Geophys. 268.
Song, C., Alkhalifah, T., Waheed, U.B., 2021. Solving the frequency-domain acoustic VTI wave equation using physics-informed neural networks. Geophys. J. Int. 225, 846–859.
Spira, A., Kimmel, R., 2004. An efficient solution to the eikonal equation on parametric manifolds. Interfaces Free Boundaries 6, 315–327.
Tompson, J., Schlachter, K., Sprechmann, P., Perlin, K., 2017. Accelerating Eulerian fluid simulation with convolutional networks. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70. JMLR.org, pp. 3424–3433.
Tsitsiklis, J.N., 1995. Efficient algorithms for globally optimal trajectories. IEEE Trans. Automat. Contr. 40, 1528–1538.
Ventura, R., Ahmad, A., 2014. Towards optimal robot navigation in domestic spaces. In: Robot Soccer World Cup. Springer, pp. 318–331.


Waheed, U.B., Alkhalifah, T., Haghighat, E., Song, C., Virieux, J., 2021. PINNtomo: seismic tomography using physics-informed neural networks. arXiv preprint arXiv:2104.01588.
Waheed, U.B., Yarman, C.E., Flagg, G., 2015. An iterative, fast-sweeping-based eikonal solver for 3D tilted anisotropic media. Geophysics 80, C49–C58.
Wang, S., Teng, Y., Perdikaris, P., 2020a. Understanding and mitigating gradient pathologies in physics-informed neural networks. arXiv preprint arXiv:2001.04536.
Wang, S., Yu, X., Perdikaris, P., 2020b. When and why PINNs fail to train: a neural tangent kernel perspective. arXiv preprint arXiv:2007.14527.
Xu, Y., Li, J., Chen, X., 2019. Physics informed neural networks for velocity inversion. In: SEG Technical Program Expanded Abstracts 2019. Society of Exploration Geophysicists, pp. 2584–2588.
Zhao, H., 2005. A fast sweeping method for eikonal equations. Math. Comput. 74, 603–627.
Zhu, C., Byrd, R.H., Lu, P., Nocedal, J., 1997. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans. Math. Software 23, 550–560.
