Parameterized Neural Ordinary Differential Equations: Applications To Computational Physics Problems
Abstract
This work proposes an extension of neural ordinary differential equations (NODEs) by introducing an
additional set of ODE input parameters to NODEs. This extension allows NODEs to learn multiple dynamics
specified by the input parameter instances. Our extension is inspired by the concept of parameterized ordinary
differential equations, which are widely investigated in computational science and engineering contexts,
where characteristics of the governing equations vary over the input parameters. We apply the proposed
parameterized NODEs (PNODEs) for learning latent dynamics of complex dynamical processes that arise
in computational physics, which is an essential component for enabling rapid numerical simulations for
time-critical physics applications. For this, we propose an encoder-decoder-type framework, which models
latent dynamics as PNODEs. We demonstrate the effectiveness of PNODEs with important benchmark
problems from computational physics.
Keywords: model reduction, deep learning, autoencoders, machine learning, nonlinear manifolds, neural
ordinary differential equations, latent-dynamics learning
1. Introduction
Such approaches typically i) collect solution data for a set of training parameter instances, ii) build a parameterized surrogate model, and iii) fit the model by training with the data collected in step i).
In the field of deep learning, similar efforts have been made for learning latent dynamics of various
physical processes [21, 6, 31, 39, 13]. Neural ordinary differential equations (NODEs), a method of learning
time-continuous dynamics in the form of a system of ordinary differential equations from data, comprise
a particularly promising approach for learning latent dynamics of dynamical systems. NODEs have been
studied in [45, 6, 40, 27, 7, 15], and this body of work has demonstrated their ability to successfully learn
latent dynamics and to be applied to downstream tasks [6, 39].
Because NODEs learn latent dynamics in the form of ODEs, they are a natural fit as a latent-dynamics model in reduced-order modeling of physical processes and have been applied to several computational physics problems, including turbulence modeling [34, 30] and future-state prediction in fluids problems [1]. As pointed out in [10, 5], however, NODEs learn a single set of network weights that best fits a given training data set. This results in an NODE model with limited expressivity and often leads to unnecessarily complex dynamics [12]. To overcome this shortcoming, we propose to extend NODEs to have a
set of input parameters that specify the dynamics of the NODE model, which leads to parameterized NODEs
(PNODEs). With this simple extension, PNODEs can represent multiple trajectories such that the dynamics
of each trajectory are characterized by the input parameter instance.
The main contributions of this paper are
• an extension to NODEs that enables them to learn multiple trajectories with a single set of network
weights; even for the same initial condition, the dynamics can be different for different input parameter
instances,
• a framework for learning latent dynamics of parameterized ODEs arising in computational physics
problems,
• a demonstration of the effectiveness of the proposed framework with advection-dominated benchmark
problems, which are a class of problems where classical linear latent-dynamics learning methods (e.g.,
principal component analysis) often fail to learn accurately [25].
2. Related work
Classical reduced-order modeling. Classical reduced-order modeling (ROM) techniques rely heavily on linear
methods such as the proper orthogonal decomposition (POD) [19], which is analogous to principal component
analysis [20], for constructing the mappings between a high-dimensional space and a low-dimensional space.
These ROMs then identify the latent-dynamics model by executing a (linear) projection process on the
high-dimensional equations, e.g., Galerkin projection [19]. We refer readers to [2, 3] for a complete survey on
classical methods.
Physics-aware deep-learning-based reduced-order modeling. Recent work has extended classical ROMs by
replacing proper orthogonal decomposition with nonlinear dimension reduction techniques emerging from
deep learning [25, 26]. These approaches operate by identifying a nonlinear mapping (via, e.g., convolutional
autoencoders) and subsequently identifying the latent dynamics as certain residual minimization problems,
which are defined on the latent space and are derived from the governing equations.
Another class of physics-aware methods includes explicitly modeling time-integration schemes [32, 41, 47, 14] and adding stability/structure-preserving constraints in the latent dynamics [11, 18]. We emphasize that our approach is closely related to [41], where neural networks are trained to approximate the action of a first-order time-integration scheme applied to the latent dynamics and, at each time step, the neural network takes a set of problem-specific parameters as well as the reduced state as input. Thus, our approach can be seen as a time-continuous generalization of the approach in [41].
Purely data-driven deep-learning-based reduced-order modeling. Another approach for developing deep-
learning-based ROMs is to learn both nonlinear mappings and latent dynamics in purely data-driven ways.
Latent dynamics are modeled as recurrent neural networks with long short-term memory (LSTM) units along
with linear POD mappings [44, 37, 30] or nonlinear mappings constructed via (convolutional) autoencoders
[16, 46, 29, 42].
Enhancing NODE. Augmented NODEs [10] extend NODEs by appending additional state variables to the hidden state, which allows NODEs to learn dynamics using the additional dimensions and, consequently, to have increased expressivity. ANODE [15] discretizes the integration range into a fixed number of steps (i.e., checkpoints) to mitigate numerical instability in the backward pass of NODEs; ACA [50] further extends this approach by adopting an adaptive step-size solver in the backward pass. ANODEV2 [49] proposes a coupled system of neural ODEs, where both the hidden state variables and the network weights are allowed to evolve over time and their dynamics are approximated by neural networks. Neural optimal control [5] formulates an NODE model as a controlled dynamical system and infers the optimal control via an encoder network. This formulation results in an NODE that adjusts its dynamics for different input data. Moreover, improved training strategies for NODEs have been studied in [12], and an extension using spectral elements in the discretization of NODEs has been proposed in [35].
3. Neural ODE
Neural ODEs (NODEs) are a family of deep neural network models that parameterize the time-continuous
dynamics of hidden states using a system of ODEs:
$$\frac{dz(t)}{dt} = f_\Theta(z(t), t; \Theta), \qquad (3.1)$$
where z(t) is a time-continuous representation of a hidden state, fΘ is a parameterized velocity function,
which defines the dynamics of hidden states over time, and Θ is a set of neural network weights. Given the
initial condition z(0) (i.e., the input), the hidden state z(t) at any time t can be obtained by solving the initial value problem (IVP) (3.1). To solve the IVP, a black-box differential equation solver can be employed and the hidden states can be computed with the desired accuracy:

z1, . . . , znt = ODESolve(z(0), fΘ, t1, . . . , tnt). (3.2)
In the backward pass, as proposed in [6], gradients are computed by solving another system of ODEs, which is derived using the adjoint sensitivity method [33]; this allows memory-efficient training of the NODE model. As pointed out in [10, 5], an NODE model learns a single dynamics for the entire data distribution and, thus, results in a model with limited expressivity.
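To make this concrete, the following is a minimal sketch of a NODE forward pass in PyTorch, assuming the torchdiffeq package (whose odeint_adjoint implements the adjoint-based backward pass of [6]); the architecture and dimensions are illustrative rather than those used in our experiments.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint  # adjoint method for the backward pass [6]

class ODEFunc(nn.Module):
    """The parameterized velocity function f_Theta(z(t), t; Theta) of Eq. (3.1)."""
    def __init__(self, latent_dim=5, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, latent_dim),
        )

    def forward(self, t, z):
        # An autonomous velocity; t could be concatenated to z for non-autonomous dynamics.
        return self.net(z)

f = ODEFunc()
z0 = torch.randn(16, 5)            # a batch of initial hidden states z(0)
t = torch.linspace(0.0, 1.0, 100)  # time instances t_1, ..., t_nt
zs = odeint(f, z0, t)              # Eq. (3.2); shape (100, 16, 5)
```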
5. Applications to computational physics problems
We now investigate PNODEs within the context of performing model reduction of computational physics
problems. We start by formally introducing the full-order model that we seek to reduce. We then describe
our proposed framework, which uses PNODEs (or NODEs) as the reduced-order (latent-dynamics) model.
where û(t; µ), û : [0, T] × D → Rp denotes the reduced state, which is a low-dimensional representation of the high-dimensional state (i.e., p ≪ N). Analogously, û0(µ), û0 : D → Rp denotes the reduced parameterized initial condition, and fˆ(û, t; µ), fˆ : Rp × [0, T] × D → Rp denotes the reduced velocity. The objective of the ROM is to learn both a nonlinear mapping and a latent-dynamics model such that the ROM generates accurate approximations to the full-order-model solution, i.e., d(û) ≈ u.
where fˆΘ(·, ·; ·, Θ) : Rp × [0, T] × D → Rp denotes the reduced velocity, i.e., we model the ROM (Eq. (5.2)) as a PNODE. To achieve this goal, we propose a framework where, besides the latent-dynamics model described by the PNODE, two additional functions are required: i) an encoder, which maps a high-dimensional initial state u0(µ) to a reduced initial state û0(µ), and ii) a decoder, which maps a set of reduced states ûk, k = 1, . . . , nt, to a set of high-dimensional approximate states ũk, k = 1, . . . , nt. We approximate these functions with two neural networks: the encoder û = henc(u; θenc), henc : RN → Rp, and the decoder ũ = hdec(û; θdec), hdec : Rp → RN (i.e., d = hdec). Here, θ = (θenc, θdec) are the network weights.
With all these neural networks defined, the forward pass of the framework can be described as
1. encode a reduced initial state from the given initial condition: û0(µ) = henc(u0(µ); θenc),
2. solve the system of ODEs defined by the PNODE (or NODE) to obtain a set of reduced states ûk, k = 1, . . . , nt,
3. decode the set of reduced states to a set of high-dimensional approximate states: ũk = hdec(ûk; θdec), k = 1, . . . , nt, and
4. compute a loss function L(ũ1, . . . , ũnt, u1, . . . , unt).

Figure 1: The forward pass of the proposed framework: i) the encoder (red arrow), which provides a reduced initial state to the PNODE, ii) solving the PNODE (or NODE) with the initial state, which results in a set of reduced states, and iii) the decoder (blue arrows), which maps the reduced states to high-dimensional approximate states.
Figure 1 illustrates the computational graph of the forward pass in the proposed framework. We emphasize that the proposed framework takes only the initial states from the training data and the problem-specific ODE parameters µ as inputs. PNODEs can still learn multiple trajectories, which are characterized by the ODE parameters, even if the same initial state is given for different ODE parameters, which is not achievable with NODEs. Furthermore, the proposed framework is significantly simpler than the common neural-network settings for NODEs when they are used to learn latent dynamics, namely the sequence-to-sequence architectures of [6, 39, 48, 29], which require that a (part of a) sequence be fed into the encoder network to produce a context vector, which is then fed into the NODE decoder network as an initial condition.
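A minimal sketch of this forward pass is given below, again assuming PyTorch and torchdiffeq; one simple way to realize the parameterization is to concatenate µ to the latent state inside the velocity function. The names (PNODEFunc, forward_pass) and dimensions are hypothetical.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint

class PNODEFunc(nn.Module):
    """Parameterized reduced velocity: mu is concatenated to the latent state."""
    def __init__(self, latent_dim=5, mu_dim=2, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + mu_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, latent_dim),
        )
        self.mu = None  # problem-specific ODE parameters, set before each solve

    def forward(self, t, z):
        return self.net(torch.cat([z, self.mu], dim=-1))

def forward_pass(encoder, decoder, pnode, u0, mu, t):
    z0 = encoder(u0)           # 1. reduced initial state from the initial condition
    pnode.mu = mu              #    same initial state, different mu -> different dynamics
    zs = odeint(pnode, z0, t)  # 2. reduced states at t_1, ..., t_nt
    u_tilde = decoder(zs)      # 3. high-dimensional approximate states
    return u_tilde             # 4. e.g., loss = nn.functional.mse_loss(u_tilde, u_ref)
```

For a convolutional decoder, the leading time dimension of zs would be folded into the batch dimension before decoding; an MLP decoder broadcasts over it directly.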
6. Numerical experiments
In the following, we apply the proposed framework to learn the latent dynamics of parameterized dynamical systems from computational physics. We then demonstrate the effectiveness of the proposed framework with results of numerical experiments performed on these benchmark problems.
where nt is the number of time steps. The mode-2 unfolding [23] of the solution tensor U gives
where U (µktrain ) ∈ RN ×(nt +1) consists of the FOM solution snapshots for µktrain and the first column
corresponds to the initial condition u0(µktrain). Among the collected solution snapshots, only the first columns of U(µktrain), k = 1, . . . , ntrain (i.e., the initial conditions), are fed into the framework; the rest of the solution snapshots are used in computing the loss function.
Assuming the FOM arises from a spatially discretized partial differential equation, the total degrees of freedom N can be written as N = nu × n1 × · · · × nnd, where nu is the number of different types of solution variables (e.g., chemical species), ni is the number of grid points along the i-th spatial dimension, and nd denotes the number of spatial dimensions of the partial differential equation. Note that this spatially-distributed data representation is analogous to multi-channel images (i.e., nu corresponds to the number of channels); as such, we utilize (transposed) convolutional layers [24, 17] in our encoder and decoder.
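As a small illustration of this layout (with hypothetical shapes), a snapshot with nu solution variables on an n1 × n2 grid is arranged channel-first, exactly like a multi-channel image:

```python
import numpy as np

nu, n1, n2 = 4, 64, 32                 # e.g., 4 solution variables on a 64 x 32 grid
u_flat = np.random.rand(nu * n1 * n2)  # one FOM snapshot with N = nu * n1 * n2 entries

# Channel-first layout with nu "channels", as consumed by 2D convolutional layers;
# a batch for torch.nn.Conv2d would then have shape (batch, nu, n1, n2).
u_img = u_flat.reshape(nu, n1, n2)
```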
$$\frac{\| U(\mu^k_{\text{test}}) - \tilde{U}(\mu^k_{\text{test}}) \|_F}{\| U(\mu^k_{\text{test}}) \|_F}, \qquad (6.1)$$
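In code, this error metric is a ratio of Frobenius norms of snapshot matrices; a minimal NumPy sketch:

```python
import numpy as np

def relative_error(U_ref, U_approx):
    """Relative error of Eq. (6.1): the Frobenius norm of the difference between
    the reference and approximate snapshot matrices, normalized by the Frobenius
    norm of the reference. np.linalg.norm defaults to the Frobenius norm here."""
    return np.linalg.norm(U_ref - U_approx) / np.linalg.norm(U_ref)
```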
Table 1: Network architecture: kernel filter length κ, number of kernel filters nκ , and strides s at each (transposed) convolutional
layers.
Encoder Decoder
Conv-layer (4 layers) FC-layer (1 layer)
κ [16, 8, 4, 4] din = p, dout = 128
nκ [ 8, 16, 32, 64] Trans-conv-layer (4 layers)
s [ 2, 4, 4, 4] κ [ 4, 4, 8, 16]
FC-layer (1 layer) nκ [32, 16, 8, 1]
din = 128, dout = p s [ 4, 4, 4, 2]
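To illustrate how Table 1 translates into a network, the encoder could be assembled as below. This sketch assumes a one-dimensional input with 256 grid points (so that the flattened convolution output has the 128 entries the FC layer expects), ELU activations [8], and padding choices of our own; none of these are fixed by the table alone.

```python
import torch.nn as nn

p = 5  # reduced dimension

# Encoder of Table 1; the paddings are our assumptions, chosen so that a
# 256-point 1D input yields 64 channels x 2 points = 128 flattened features.
encoder = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=16, stride=2, padding=7),  nn.ELU(),  # -> (8, 128)
    nn.Conv1d(8, 16, kernel_size=8, stride=4, padding=2),  nn.ELU(),  # -> (16, 32)
    nn.Conv1d(16, 32, kernel_size=4, stride=4),            nn.ELU(),  # -> (32, 8)
    nn.Conv1d(32, 64, kernel_size=4, stride=4),            nn.ELU(),  # -> (64, 2)
    nn.Flatten(),                                                     # -> 128
    nn.Linear(128, p),
)
```

The decoder mirrors this structure with the FC layer first, followed by the four transposed-convolutional layers of Table 1.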
The relative errors for NODE and PNODE are 2.6648 × 10^{-3} and 2.6788 × 10^{-3}; the differences between the two models are negligible.
2 For setting p, we follow the results of the study on the effective latent dimension shown in [25].
Figure 2: Snapshots of reference solutions at t = {7.77, 11.7, 19.5, 23.3, 27.2, 35.0}.
Figure 3: Reconstruction: snapshots of reference solutions and approximated solutions using NODE (left) and PNODE (right) at t = {(35/15)k}, k = 1, . . . , 15.
The training parameter instances correspond to Dtrain = {(4.25 + (0.139)k, 0.015)}, k = 0, 2, 4, 6, the validating parameter instances correspond to Dval = {(4.25 + (0.139)k, 0.015)}, k = 1, 5, and the testing parameter instances correspond to Dtest = {(4.67, 0.015), (5.22, 0.015)}. Note that the initial condition is identical for all parameter instances, i.e., u0(µ) = 1.
Figure 4: Training, validating, and testing parameter instances for (a) Scenario 1 and (b) Scenario 2.
We train the framework with NODE and PNODE with the same set of hyperparameters. Again, the
reduced dimension is set to p = 5. Figures 5a–5b depict snapshots of reference solutions and approximated
solutions using NODE and PNODE. Both NODE and PNODE learn the boundary condition (i.e., 4.67 at x = 0) accurately. For NODE, however, this is only because the testing boundary condition lies exactly halfway between the two validating boundary conditions (and in the middle of the four training boundary conditions): minimizing the mean squared error drives NODE to learn a single trajectory whose boundary condition is the midpoint of the two validating boundary conditions, 4.389 and 4.944. Moreover, because NODE learns a single trajectory that minimizes the MSE, it fails to learn the correct dynamics and produces increasingly poor approximate solutions as time proceeds. In contrast, PNODE accurately approximates solutions up to the final time. Table 2 (second row) shows the relative ℓ2-errors (Eq. (6.1)) for both NODE and PNODE.
Continuing from the previous experiment, we test the second testing parameter instance, Dtest = {(5.22, 0.015)}, which is located outside Dtrain (i.e., next to µ(7) in Figure 4a). The results are shown in Figures 5c–5d: the NODE again learns only a single trajectory whose boundary condition lies in the middle of the validating parameter instances, whereas the PNODE accurately produces approximate solutions for the new testing parameter instance. Table 2 (third row) reports the relative errors.
Figure 5: Prediction Scenario 1: snapshots of reference solutions (red) and approximated solutions (green) using NODE (left) and PNODE (right) at t = {(35/15)k}, k = 1, . . . , 15, for µ1test = (4.67, 0.015) (top) and µ2test = (5.22, 0.015) (bottom).
Table 2: Prediction Scenario 1: the relative ℓ2-errors.
                NODE               PNODE
µ1test = µ(4)   4.3057 × 10^{-2}   3.6547 × 10^{-3}
µ2test = µ(8)   1.5740 × 10^{-1}   5.6900 × 10^{-3}
Next, in the second scenario, we vary both parameters µ1 and µ2 as shown in Figure 4b: the sets of the training, validating, and testing parameter instances correspond to
Dtrain = {(4.25 + (0.139)k, 0.015 + (0.002)l)}, {(k, l)} = {(0, 0), (0, 2), (2, 0), (2, 2)},
Dval = {(4.25 + (0.139)k, 0.015 + (0.002)l)}, {(k, l)} = {(1, 0), (0, 1), (2, 1), (1, 2)},
Dtest = {(4.25 + (0.139)k, 0.015 + (0.002)l)}, {(k, l)} = {(1, 1), (3, 2), (2, 3), (3, 3)}.
We have tested the full set of testing parameter instances, and Table 3 reports the relative errors; the results show that PNODE achieves sub-1% errors in most cases, whereas NODE incurs errors of around 10% in most cases. The 1.7% error of NODE for µ1test is achieved only because this testing parameter instance is located in the middle of the validating parameter instances (and the training parameter instances).
Table 3: Prediction Scenario 2: the relative ℓ2-errors.
                 NODE               PNODE
µ1test = µ(5)    1.7422 × 10^{-2}   3.2672 × 10^{-3}
µ2test = µ(10)   1.0713 × 10^{-1}   7.7303 × 10^{-3}
µ3test = µ(11)   8.9229 × 10^{-2}   8.5650 × 10^{-3}
µ4test = µ(12)   1.2377 × 10^{-1}   1.0735 × 10^{-2}
Figure 6: The geometry of the spatial domain for chemically reacting flow.
for i ∈ {H2, O2, H2O}. Here, (vH2, vO2, vH2O) = (2, 1, −2) denote the stoichiometric coefficients, (WH2, WO2, WH2O) = (2.016, 31.9, 18) denote the molecular weights in units of g·mol^{-1}, ρ = 1.39 × 10^{-3} g·cm^{-3} denotes the density of the mixture, R = 8.314 J·mol^{-1}·K^{-1} denotes the universal gas constant, and Q = 9800 K denotes the heat of reaction. The problem has two input parameters (i.e., nµ = 2), µ = (µ1, µ2) = (A, E), where A and E denote the pre-exponential factor and the activation energy, respectively.
Figure 6 depicts the geometry of the spatial domain and the boundary conditions are set as:
• Γ2: the inflow boundary with Dirichlet boundary conditions wT = 950 K and (wH2, wO2, wH2O) = (0.0282, 0.2259, 0),
• Γ1 and Γ3: Dirichlet boundary conditions wT = 300 K and (wH2, wO2, wH2O) = (0, 0, 0),
• Γ4, Γ5, and Γ6: homogeneous Neumann conditions,
and the initial condition is set as wT = 300 K and (wH2, wO2, wH2O) = (0, 0, 0) (i.e., the domain is initially empty of chemical species). For collecting data, we employ a finite-difference method with 64 × 32 uniform grid points (i.e., N = nu × n1 × n2 = 4 × 64 × 32) and the second-order backward differentiation formula (BDF2) with a uniform time step ∆t = 10^{-4} and final time 0.06 (i.e., nt = 600). Figure 7 depicts snapshots of the reference solutions of each species for the training parameter instance (µ1, µ2) = (2.3375 × 10^{12}, 5.6255 × 10^{3}).

Figure 7: Snapshots of reference solutions of temperature (first row), H2 (second row), O2 (third row), and H2O (fourth row) at t = {0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06} (from left to right).
Table 4: Network architecture: kernel filter length κ = κ1 = κ2 , number of kernel filters nκ , and strides s = s1 = s2 at each
(transposed) convolutional layer.
Encoder Decoder
Conv-layer (4 layers) FC-layer (1 layer)
κ [16, 8, 4, 4] din = p, dout = 512
nκ [ 8, 16, 32, 64] Trans-conv-layer (4 layers)
s [ 2, 2, 2, 2] κ [ 4, 4, 8, 16]
FC-layer (1 layer) nκ [32, 16, 8, 4]
din = 512, dout = p s [ 2, 2, 2, 2]
3 We have not investigated different scaling strategies for scaling of parameter instances as it is not the focus of this study.
The sets of the training, validating, and testing parameter instances correspond to
Dtrain = {(2.3375 × 10^{12} + (0.5946 × 10^{12})k, 5.6255 × 10^{3} + (0.482 × 10^{3})l)}, {(k, l)} = {(0, 0), (0, 2), (0, 4), (2, 0), (2, 2), (2, 4)},
Dval = {(2.3375 × 10^{12} + (0.5946 × 10^{12})k, 5.6255 × 10^{3} + (0.482 × 10^{3})l)}, {(k, l)} = {(0, 1), (1, 0), (1, 4), (2, 3)},
Dtest = {(2.3375 × 10^{12} + (0.5946 × 10^{12})k, 5.6255 × 10^{3} + (0.482 × 10^{3})l)}, {(k, l)} = {(0, 3), (1, 1), (1, 2), (1, 3), (2, 1), (3, 0), (3, 1), (4, 0)}.
Table 5 presents the relative ℓ2-errors of approximate solutions computed using NODE and PNODE
for testing parameter instances in the predictive scenario. The first three rows in Table 5 correspond to
the results of testing parameter instances at the middle three red circles in Figure 8. As expected, both
NODE and PNODE work well for these testing parameter instances: NODE is expected to work well for
these testing parameter instances because the single trajectory that minimizes the MSE over validating
parameter instances would be the trajectory associated with the testing parameter µ(8) . As we consider
testing parameter instances that are distant from µ(8), we observe PNODE to be significantly more accurate than NODE. From these observations, the NODE model can be considered to be overfitted to a single trajectory that minimizes the MSE. This overfitting can be mitigated to a certain extent by applying, e.g., early stopping; however, this cannot fundamentally fix the limitation of NODE (i.e., fitting a single trajectory to the entire input data distribution).
Table 5: Prediction: the relative ℓ2-errors.
                 NODE               PNODE
µ1test = µ(7)    9.2823 × 10^{-3}   4.2993 × 10^{-3}
µ2test = µ(8)    3.3450 × 10^{-3}   4.6429 × 10^{-3}
µ3test = µ(9)    4.1516 × 10^{-3}   5.0617 × 10^{-3}
µ4test = µ(4)    4.0835 × 10^{-2}   5.6011 × 10^{-3}
µ5test = µ(12)   3.4767 × 10^{-2}   4.4133 × 10^{-3}
µ6test = µ(16)   5.9410 × 10^{-2}   1.2935 × 10^{-2}
µ7test = µ(17)   5.4553 × 10^{-2}   1.1785 × 10^{-2}
µ8test = µ(18)   7.4881 × 10^{-2}   2.4660 × 10^{-2}
6.5. Problem 3: Quasi-1D Euler equation
For the third benchmark problem, we consider the quasi-one-dimensional Euler equations for modeling
inviscid compressible flow in a one-dimensional converging–diverging nozzle with a continuously varying
cross-sectional area [28]. The system of the governing equations is
$$\frac{\partial w}{\partial t} + \frac{1}{A} \frac{\partial f(w)}{\partial x} = g(w),$$

where

$$w = \begin{pmatrix} \rho \\ \rho u \\ e \end{pmatrix}, \qquad f(w) = \begin{pmatrix} \rho u \\ \rho u^2 + p \\ (e + p) u \end{pmatrix}, \qquad g(w) = \begin{pmatrix} 0 \\ \dfrac{p}{A} \dfrac{\partial A}{\partial x} \\ 0 \end{pmatrix},$$
with p = (γ − 1)ρε, ε = e/ρ − u^2/2, and A = A(x). Here, ρ denotes density, u denotes velocity, p denotes pressure, ε denotes energy per unit mass, e denotes total energy density, γ denotes the specific heat ratio, and A(x) denotes the converging–diverging nozzle cross-sectional area. We consider a specific heat ratio of γ = 1.3, a specific gas constant of R = 355.4 m^2/s^2/K, a total temperature of Ttotal = 300 K, and a total pressure of ptotal = 10^6 N/m^2. The cross-sectional area A(x) is determined by a cubic-spline interpolation over the points (x, A(x)) = {(0, 0.2), (0.25, 1.05µ), (0.5, µ), (0.75, 1.05µ), (1, 0.2)}, where µ determines the width of the middle cross-sectional area. Figure 9 depicts the schematic of the converging–diverging nozzle determined by A(x), parameterized by the width of the middle cross-sectional area, µ. A perfect gas, which obeys the ideal gas law (i.e., p = ρRT), is assumed.
Figure 9: The geometry of the spatial domain for the quasi-1D Euler equation: the converging–diverging nozzle.
For the initial condition, the initial flow field is computed as follows: a zero pressure-gradient flow field is constructed via the isentropic relations

$$M(x) = \frac{M_m A_m}{A(x)} \left( \frac{1 + \frac{\gamma - 1}{2} M(x)^2}{1 + \frac{\gamma - 1}{2} M_m^2} \right)^{\frac{\gamma + 1}{2(\gamma - 1)}}, \qquad p(x) = p_{\text{total}} \left( 1 + \frac{\gamma - 1}{2} M(x)^2 \right)^{\frac{-\gamma}{\gamma - 1}},$$

$$T(x) = T_{\text{total}} \left( 1 + \frac{\gamma - 1}{2} M(x)^2 \right)^{-1}, \qquad \rho(x) = \frac{p(x)}{R T(x)}, \qquad c(x) = \sqrt{\gamma \frac{p(x)}{\rho(x)}}, \qquad u(x) = M(x) c(x),$$
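These relations can be evaluated numerically; the sketch below (assuming SciPy, with hypothetical function names) solves the implicit area–Mach relation pointwise and then recovers the primitive variables:

```python
import numpy as np
from scipy.optimize import fsolve

gamma, R = 1.3, 355.4
T_total, p_total, M_m = 300.0, 1.0e6, 2.0

def mach(A, A_m, M_guess):
    """Solve the implicit area-Mach relation for M at one location; A_m is the
    area at x = 0.5. M_guess < 1 picks the subsonic branch, > 1 the supersonic."""
    ratio = lambda M: 1.0 + 0.5 * (gamma - 1.0) * M**2
    residual = lambda M: M - (M_m * A_m / A) * (ratio(M) / ratio(M_m)) ** (
        (gamma + 1.0) / (2.0 * (gamma - 1.0)))
    return fsolve(residual, M_guess)[0]

def primitive_state(M):
    """p, T, rho, u from the isentropic relations at a local Mach number M."""
    p = p_total * (1.0 + 0.5 * (gamma - 1.0) * M**2) ** (-gamma / (gamma - 1.0))
    T = T_total / (1.0 + 0.5 * (gamma - 1.0) * M**2)
    rho = p / (R * T)
    u = M * np.sqrt(gamma * p / rho)  # c = sqrt(gamma p / rho)
    return p, T, rho, u
```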
where M denotes the Mach number, c denotes the speed of sound, and a subscript m indicates the flow quantity at x = 0.5 m. The shock is located at x = 0.85 m, and the velocity across the shock (u2) is computed by using the jump relations for a stationary shock and the perfect-gas equation of state. The velocity across the shock satisfies the quadratic equation
$$\left( \frac{1}{2} - \frac{\gamma}{\gamma - 1} \right) u_2^2 + \frac{\gamma}{\gamma - 1} \frac{n}{m} u_2 - h = 0,$$
where m = ρ2u2 = ρ1u1, n = ρ2u2^2 + p2 = ρ1u1^2 + p1, and h = (e2 + p2)/ρ2 = (e1 + p1)/ρ1. The subscripts 1 and 2 indicate quantities to the left and to the right of the shock, respectively. We consider a specific Mach number of Mm = 2.0.
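Since the pre-shock state trivially satisfies the jump relations, u = u1 is one root of this quadratic, and u2 follows from Vieta's formula for the product of the roots; a small sketch (the helper name is hypothetical):

```python
def post_shock_velocity(rho1, u1, p1, e1, gamma=1.3):
    """Velocity u2 behind a stationary shock. Writing the quadratic as
    a*u^2 + b*u + c = 0 with a = 1/2 - gamma/(gamma - 1) and c = -h, the
    product of the roots is c/a; u = u1 is the trivial root, so u2 = (c/a)/u1."""
    h = (e1 + p1) / rho1            # total specific enthalpy (conserved)
    a = 0.5 - gamma / (gamma - 1.0)
    c = -h
    return (c / a) / u1
```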
For spatial discretization, we employ a finite-volume scheme with 128 equally spaced control volumes and fully implicit boundary conditions, which leads to N = nu n1 = 3 × 128 = 384. At each intercell face, the Roe flux-difference splitting method is used to compute the flux. For time discretization, we employ the backward Euler scheme with a uniform time step ∆t = 10^{-3} and final time 0.6 (i.e., nt = 600). Figure 10 depicts the snapshots of reference solutions of the Mach number M(x) for the middle cross-sectional area µ = 0.15 at t = {0.1, 0.2, 0.3, 0.4, 0.5, 0.6}.
Figure 10: Snapshots of reference solutions of Mach number M(x) for µ = 0.15 at t = {0.1, 0.2, 0.3, 0.4, 0.5, 0.6}.
The varying parameter of this problem is the width of the middle cross-sectional area, which determines the geometry of the spatial domain and, thus, determines the initial condition as well as the dynamics. Analogously to the previous two benchmark problems, we select 4 training parameter instances, 3 validating parameter instances, and 3 testing parameter instances (Figure 11).
Figure 11: Visualization of the training, validating, and testing parameter instances for the quasi-1D Euler equations.
Varying µ results in fairly distinct initial conditions, but does not significantly affect variations in the dynamics; both the initial condition and the dynamics are parameterized by the same input parameter, the width of the middle cross-sectional area of the spatial domain.
Table 6: Network architecture: kernel filter length κ, number of kernel filters nκ , and strides s at each layer of (transposed)
convolutional layers.
Encoder Decoder
Conv-layer (5 layers) FC-layer (1 layer)
κ [16, 8, 4, 4, 4] din = p, dout = 512
nκ [16, 32, 64, 64, 128] Trans-conv-layer (5 layers)
s [ 2, 2, 2, 2, 2] κ [ 4, 4, 4, 8, 16]
FC-layer (1 layer) nκ [64, 64, 32, 16, 3]
din = 512, dout = p s [ 2, 2, 2, 2, 2]
Our general observation is that the benefits of using PNODE are most pronounced when the dynamics are parameterized and there is a single initial condition. Moreover, we expect larger improvements in approximation accuracy over NODE when the dynamics vary significantly over the input parameters, for instance, in modeling infectious diseases such as the novel coronavirus (COVID-19) [43], where the dynamics of transmission are greatly affected by model parameters determined by, e.g., quarantine policy and social distancing.
7. Conclusions
In this study, we proposed a parameterized extension of neural ODEs and a novel framework for reduced-order modeling of complex numerical simulations of computational physics problems. Our simple extension allows neural ODE models to learn multiple complex trajectories. This extension overcomes the main drawback of neural ODEs, namely that only a single set of dynamics is learned for the entire data distribution. We have demonstrated the effectiveness of parameterized neural ODEs on several benchmark problems from computational fluid dynamics, and have shown that the proposed method outperforms neural ODEs.
8. Acknowledgments
This paper describes objective technical results and analysis. Any subjective views or opinions that might
be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the
United States Government. Sandia National Laboratories is a multimission laboratory managed and operated
by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell
International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under
contract DE-NA0003525.
Figure 12: Snapshots of reference solutions (solid red lines) and approximated solutions (dashed green lines) of Mach number M(x) for µ = 0.15 at t = {0.1, 0.2, 0.3, 0.4, 0.5, 0.6}. The approximated solutions are obtained by using the framework with PNODE.
[1] I. Ayed, E. de Bézenac, A. Pajot, J. Brajard, and P. Gallinari, Learning dynamical systems
from partial observations, arXiv preprint arXiv:1902.11136, (2019).
[2] P. Benner, S. Gugercin, and K. Willcox, A survey of projection-based model reduction methods
for parametric dynamical systems, SIAM review, 57 (2015), pp. 483–531.
[3] P. Benner, M. Ohlberger, A. Cohen, and K. Willcox, Model Reduction and Approximation:
Theory and Algorithms, SIAM, 2017.
[4] M. Buffoni and K. Willcox, Projection-based model reduction for reacting flows, in 40th Fluid
Dynamics Conference and Exhibit, 2010, p. 5008.
[5] M. Chalvidal, M. Ricci, R. VanRullen, and T. Serre, Neural optimal control for representation
learning, arXiv preprint arXiv:2006.09545, (2020).
[6] R. T. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, Neural ordinary differential
equations, in Advances in neural information processing systems, 2018, pp. 6571–6583.
[7] M. Ciccone, M. Gallieri, J. Masci, C. Osendorfer, and F. Gomez, NAIS-Net: Stable deep networks from non-autonomous differential equations, in Advances in Neural Information Processing Systems, 2018, pp. 3025–3035.
[8] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, Fast and accurate deep network learning by
exponential linear units (ELUs), arXiv preprint arXiv:1511.07289, (2015).
[9] J. R. Dormand and P. J. Prince, A family of embedded Runge–Kutta formulae, Journal of Computational and Applied Mathematics, 6 (1980), pp. 19–26.
[10] E. Dupont, A. Doucet, and Y. W. Teh, Augmented neural ODEs, in Advances in Neural Information
Processing Systems, 2019, pp. 3140–3150.
[11] N. B. Erichson, M. Muehlebach, and M. W. Mahoney, Physics-informed autoencoders for
Lyapunov-stable fluid flow prediction, arXiv preprint arXiv:1905.10866, (2019).
[12] C. Finlay, J.-H. Jacobsen, L. Nurbekyan, and A. M. Oberman, How to train your neural ODE,
arXiv preprint arXiv:2002.02798, (2020).
[13] L. Fulton, V. Modi, D. Duvenaud, D. I. Levin, and A. Jacobson, Latent-space dynamics for
reduced deformable simulation, in Computer Graphics Forum, vol. 38, Wiley Online Library, 2019,
pp. 379–391.
[14] N. Geneva and N. Zabaras, Modeling the dynamics of PDE systems with physics-constrained deep auto-regressive networks, Journal of Computational Physics, 403 (2020), p. 109056.
[15] A. Gholami, K. Keutzer, and G. Biros, ANODE: Unconditionally accurate memory-efficient gradients for neural ODEs, arXiv preprint arXiv:1902.10298, (2019).
[16] F. J. Gonzalez and M. Balajewicz, Deep convolutional recurrent autoencoders for learning low-
dimensional feature dynamics of fluid systems, arXiv preprint arXiv:1808.01346, (2018).
[17] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, MIT press, 2016.
[18] Q. Hernandez, A. Badias, D. Gonzalez, F. Chinesta, and E. Cueto, Deep learning of
thermodynamics-aware reduced-order models from data, arXiv preprint arXiv:2007.03758, (2020).
[21] M. Karl, M. Soelch, J. Bayer, and P. van der Smagt, Deep variational bayes filters: Unsupervised
learning of state space models from raw data, in International Conference on Learning Representations,
2017.
[22] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980,
(2014).
[23] T. G. Kolda and B. W. Bader, Tensor decompositions and applications, SIAM review, 51 (2009),
pp. 455–500.
[24] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, 521 (2015), pp. 436–444.
[25] K. Lee and K. Carlberg, Deep conservation: A latent dynamics model for exact satisfaction of
physical conservation laws, arXiv preprint arXiv:1909.09754, (2019).
[26] K. Lee and K. T. Carlberg, Model reduction of dynamical systems on nonlinear manifolds using
deep convolutional autoencoders, Journal of Computational Physics, 404 (2020), p. 108973.
[27] Y. Lu, A. Zhong, Q. Li, and B. Dong, Beyond finite layer neural networks: Bridging deep
architectures and numerical differential equations, in International Conference on Machine Learning,
2018, pp. 3276–3285.
[28] R. W. MacCormack, Numerical computation of compressible and viscous flow, American Institute of
Aeronautics and Astronautics, Inc., 2014.
[29] R. Maulik, B. Lusch, and P. Balaprakash, Reduced-order modeling of advection-dominated systems
with recurrent neural networks and convolutional autoencoders, arXiv preprint arXiv:2002.00470, (2020).
[30] R. Maulik, A. Mohan, B. Lusch, S. Madireddy, P. Balaprakash, and D. Livescu, Time-series
learning of latent-space dynamics for reduced-order model closure, Physica D: Nonlinear Phenomena, 405
(2020), p. 132368.
[31] J. Morton, A. Jameson, M. J. Kochenderfer, and F. Witherden, Deep dynamical modeling
and control of unsteady fluid flows, in Advances in Neural Information Processing Systems, 2018,
pp. 9258–9268.
[32] S. Pawar, S. Rahman, H. Vaddireddy, O. San, A. Rasheed, and P. Vedula, A deep learning
enabler for nonintrusive reduced order modeling of fluid flows, Physics of Fluids, 31 (2019), p. 085101.
[33] L. S. Pontryagin, The mathematical theory of optimal processes, (1962).
[34] G. D. Portwood, P. P. Mitra, M. D. Ribeiro, T. M. Nguyen, B. T. Nadiga, J. A. Saenz,
M. Chertkov, A. Garg, A. Anandkumar, A. Dengel, et al., Turbulence forecasting via neural
ODE, arXiv preprint arXiv:1911.05180, (2019).
[35] A. Quaglino, M. Gallieri, J. Masci, and J. Koutník, SNODE: Spectral discretization of neural ODEs for system identification, arXiv preprint arXiv:1906.07038, (2019).
[36] A. Quarteroni, A. Manzoni, and F. Negri, Reduced Basis Methods for Partial Differential Equations:
an Introduction, vol. 92, Springer, 2015.
[37] S. M. Rahman, S. Pawar, O. San, A. Rasheed, and T. Iliescu, Nonintrusive reduced order
modeling framework for quasigeostrophic turbulence, Physical Review E, 100 (2019), p. 053306.
[38] M. Rewienski, A trajectory piecewise-linear approach to model order reduction of nonlinear dynamical
systems, PhD thesis, Massachusetts Institute of Technology, 2003.
[39] Y. Rubanova, R. T. Chen, and D. Duvenaud, Latent ODEs for irregularly-sampled time series, arXiv preprint arXiv:1907.03907, (2019).
[40] L. Ruthotto and E. Haber, Deep neural networks motivated by partial differential equations, Journal
of Mathematical Imaging and Vision, (2019), pp. 1–13.
[41] O. San, R. Maulik, and M. Ahmed, An artificial neural network framework for reduced order
modeling of transient flows, Communications in Nonlinear Science and Numerical Simulation, 77 (2019),
pp. 271–287.
[42] J. Tencer and K. Potter, Enabling nonlinear manifold projection reduced-order models by extending
convolutional neural networks to unstructured data, arXiv preprint arXiv:2006.06154, (2020).
[43] H. Wang, Z. Wang, Y. Dong, R. Chang, C. Xu, X. Yu, S. Zhang, L. Tsamlag, M. Shang, J. Huang, et al., Phase-adjusted estimation of the number of coronavirus disease 2019 cases in Wuhan, China, Cell Discovery, 6 (2020), pp. 1–8.
[44] Z. Wang, D. Xiao, F. Fang, R. Govindan, C. C. Pain, and Y. Guo, Model identification of
reduced order fluid dynamics systems using deep learning, International Journal for Numerical Methods
in Fluids, 86 (2018), pp. 255–268.
[45] E. Weinan, A proposal on machine learning via dynamical systems, Communications in Mathematics
and Statistics, 5 (2017), pp. 1–11.
[46] S. Wiewel, M. Becher, and N. Thuerey, Latent space physics: Towards learning the temporal
evolution of fluid flow, in Computer Graphics Forum, vol. 38, Wiley Online Library, 2019, pp. 71–82.
[47] X. Xie, G. Zhang, and C. G. Webster, Non-intrusive inference reduced order model for fluids using
deep multistep neural network, Mathematics, 7 (2019), p. 757.
[48] C. Yildiz, M. Heinonen, and H. Lahdesmaki, ODE2VAE: Deep generative second order ODEs with
Bayesian neural networks, in Advances in Neural Information Processing Systems, 2019, pp. 13412–13421.