Approximation of Solution Operators for High-dimensional PDEs (Part 2)

Learning solution operators of PDEs The previously discussed methods all share a common feature: they aim at solving specific instances of a given PDE. Therefore, they need to be rerun from scratch
when any part of the problem configuration (e.g., initial value, boundary value, problem domain) changes.
In contrast, the solution operator of a PDE can directly map a problem configuration to its corresponding
solution. When it comes to learning solution operators, several DNN approaches have been proposed.
One approach attempts to approximate Green’s functions for some linear PDEs [10, 79, 9, 55], as solutions
to such PDEs have explicit expressions in terms of their Green’s functions. However, this approach only applies to the small class of linear PDEs whose solutions can be represented using Green’s functions. Moreover,
Green’s functions have singularities, and special care is needed to approximate them using neural networks.
For example, rational functions are used as activation functions of DNNs to address singularities in [9]. In
[10], the singularities are represented with the help of fundamental solutions.
For more general nonlinear PDEs, DNNs have been used for operator approximation and meta-learning
for PDEs [64, 33, 58, 59, 52, 86, 85, 74]. For example, the work [33] considers solving parametric PDEs in
low dimensions (d ≤ 3 in their examples). Their method requires discretization of the PDE
system and needs to be supplied by many full-order solutions for different combinations of time discretization
points and parameter selections for their network training. Then their method applies proper orthogonal
decomposition to these solutions to obtain a set of reduced bases to construct solutions for new problems.
This can be seen as using classical solutions to develop a data-driven solution operator. The work [74] requires
a massive number of pairs of ODE/PDE controls and the corresponding system outputs, which are produced by solving the original ODE/PDE system; a DNN is then trained on these pairs to learn the mapping between the two, both of which must be discretized as vectors under their proposed framework.
In contrast, DeepONets [58, 59, 85] seek to approximate solution mappings by using a “branch” network and a “trunk” network. FNOs [52, 86] use Fourier transforms to map the input to a low-dimensional frequency space
and then back to the solution on a spatial grid. In addition, several works apply spatial discretization of the
problem or transform the problem domains and use convolutional neural networks (CNNs) [73, 34, 92] or
graph neural networks (GNNs) [48, 2, 57] to create a mapping from initial conditions to solutions. Interested
readers may also refer to generalizations and extensions of these methods in [16, 27, 2, 12, 17, 64, 67, 59, 47].
A key similarity of all these methods is that they require some form of domain discretization and, for training, often a large number of labeled pairs of IVP initial conditions (or PDE parameters) and the corresponding solutions obtained by other methods. This limits their applicability to high-dimensional problems, where such training data are unavailable or a mesh is prohibitively expensive to generate due to the curse of dimensionality.
The work [30] develops a new framework to approximate the solution operator of a given evolution PDE
by parameterizing the solution as a neural network (or any reduced-order model) and learning a vector
field that determines the proper evolution of the network’s parameters. Therefore, the infinite-dimensional
solution operator approximation reduces to finding a vector field in the finite-dimensional parameter space
of the network. In [30], this vector field is obtained by solving a least squares problem using a large number
of sampled network parameters in the parameter space.

Differences between our proposed approach and existing ones Our approach follows the framework
proposed in [30] and thus allows solution operator approximation for high-dimensional PDEs; it requires neither spatial discretization nor numerous solution examples of the given PDE for training. This avoids the issues that hinder the application of the other existing methods (e.g., DeepONet [58] and FNO [52]).
On the other hand, the approach in this work improves both approximation accuracy and efficiency
over [30] by leveraging a new training strategy for the control vector field in the parameter space, based on Neural ODEs (NODEs) [15]. In particular, the new approach samples only initial points in the parameter space and automatically generates a large number of informative samples along the resulting trajectories during training; the optimal parameters of the control vector field are learned by minimizing the approximation error of the PDE along these trajectories. This avoids both random sampling in a high-dimensional parameter space and the expensive least squares problems of [30]. As a result, the new approach achieves orders of magnitude higher accuracy and faster training compared to [30]. Moreover, we develop a new error estimate that handles
a more general class of nonlinear PDEs than [30] does. Both the theoretical advancements and numerical
improvements will be demonstrated in the present paper.
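
To make this strategy concrete, the following is a conceptual sketch of the trajectory-based training idea, written under simplifying assumptions of our own: a one-hidden-layer reduced-order model uθ, the heat operator F[u] = Δu, and plain forward Euler with backpropagation through the rollout standing in for a full Neural ODE solver. All names, shapes, and hyperparameters are illustrative; this is not the implementation used in the paper.

```python
import jax
import jax.numpy as jnp

d, m = 5, 16                    # spatial dimension and hidden width (illustrative)
p = d * m + 2 * m               # number of parameters theta of the reduced-order model

def u_theta(theta, x):
    """Reduced-order model u_theta(x): one hidden layer, scalar output."""
    W1 = theta[:d * m].reshape(m, d)
    b1 = theta[d * m:d * m + m]
    w2 = theta[d * m + m:]
    return w2 @ jnp.tanh(W1 @ x + b1)

def F_u(theta, x):
    """Illustrative operator F[u_theta](x) = Laplacian of u_theta at x (heat equation)."""
    return jnp.trace(jax.hessian(lambda y: u_theta(theta, y))(x))

def V_xi(xi, theta):
    """Control vector field V_xi(theta) in R^p, itself a small network with parameters xi."""
    A1, b1, A2 = xi
    return A2 @ jnp.tanh(A1 @ theta + b1)

def residual(xi, theta, x):
    """Pointwise PDE residual: grad_theta u_theta(x) . V_xi(theta) - F[u_theta](x)."""
    g = jax.grad(u_theta, argnums=0)(theta, x)
    return g @ V_xi(xi, theta) - F_u(theta, x)

def trajectory_loss(xi, theta0, xs, T=1.0, n_steps=20):
    """Roll out theta(t) by forward Euler under V_xi and accumulate squared
    residuals along the trajectory; gradients w.r.t. xi flow through the rollout."""
    dt = T / n_steps
    theta, loss = theta0, 0.0
    for _ in range(n_steps):
        r = jax.vmap(lambda x: residual(xi, theta, x))(xs)
        loss = loss + dt * jnp.mean(r ** 2)
        theta = theta + dt * V_xi(xi, theta)
    return loss

# One gradient-descent step on xi, averaged over a batch of sampled initial thetas.
key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
h = 64                                           # hidden width of V_xi (illustrative)
xi = (0.1 * jax.random.normal(k1, (h, p)), jnp.zeros(h), 0.1 * jax.random.normal(k2, (p, h)))
theta0s = jax.random.normal(k3, (8, p))          # sampled initial parameters only
xs = jax.random.uniform(k4, (64, d))             # collocation points in Omega = (0,1)^d
batch_loss = lambda xi: jnp.mean(jax.vmap(lambda th0: trajectory_loss(xi, th0, xs))(theta0s))
grads = jax.grad(batch_loss)(xi)
xi = jax.tree_util.tree_map(lambda w, g: w - 1e-3 * g, xi, grads)
```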

3 Proposed Method
3.1 Solution operator and its parameterization
We follow the problem setting in [30] and let Ω be an open bounded set in Rd and F a (possibly) nonlinear
differential operator of functions u : Ω → R with necessary regularity conditions, which will be specified
below. We consider the IVP of the evolution PDE defined by F with arbitrary initial value as follows:
∂t u(x, t) = F[u](x, t),    x ∈ Ω, t ∈ (0, T],
u(x, 0) = g(x),             x ∈ Ω,                                          (1)

where T > 0 is a prescribed terminal time and g : Rd → R is the initial value. For ease of presentation, we assume throughout this paper that u is compactly supported in Ω (for compatibility, we henceforth assume that g has zero trace on ∂Ω). We denote by ug the solution to the IVP (1) with initial value g. The solution operator SF of the IVP (1) is then the mapping from the initial value g to the solution ug:

SF : C(Ω̄) ∩ C^2(Ω) → C^{2,1}(Ω̄ × [0, T]), such that g ↦ SF(g) := ug.        (2)

Our goal is to find a numerical approximation to SF . Namely, we want to find a fast computational scheme
SF that takes any initial value g as input and accurately estimates ug with low computational complexity. Specifically,
we expect this scheme SF to satisfy the following properties:
1. The scheme applies to PDEs on high-dimensional Ω ⊂ Rd with d ≥ 5.
2. The computational complexity of evaluating the mapping g ↦ SF(g) is much lower than that of solving problem (1)
directly.
It is important to note that the second item above is due to the substantial difference between solving
(1) for any given but fixed initial value g and finding the solution operator (2) that maps any g to the
corresponding solution ug . In the literature, most methods belong to the former class, such as finite difference
and finite element methods, as well as most state-of-the-art machine-learning based methods. However, these
methods are computationally expensive if (1) must be solved with many different initial values, either in
parallel or sequentially, and they essentially need to start from scratch for every new g. In sharp contrast, our method belongs to the latter class: it approximates the solution operator SF, which, once found, allows us to compute ug for any given g at a much lower computational cost.
To approximate the solution operator SF in (2), we follow the strategy developed in [30] and construct a
new control mechanism in the parameter space Θ of a prescribed reduced-order model uθ . Specifically, we
first select a model structure uθ (e.g., a DNN) to represent solutions of the IVP. We assume that uθ(x) := u(x; θ) is C^1 smooth with respect to θ. This is a mild condition satisfied by many different parameterizations, such as all typical DNNs with smooth activation functions. Suppose there exists a trajectory {θ(t) : 0 ≤ t ≤ T} in the parameter space; then we only need

∇θ uθ(t)(x) · θ̇(t) = ∂t(uθ(t)(x)) = F[uθ(t)](x),    ∀ x ∈ Ω, t ∈ (0, T]        (3)

and the initial θ(0) to satisfy uθ(0) = g. The first and second equalities of (3) are due to the chain rule and
the goal for uθ(t) (·) to solve the PDE (1), respectively. Here we use ∇θ to denote the partial derivative with
respect to θ (and ∇ is the partial derivative with respect to x). To achieve (3), [30] proposed to learn a DNN
Vξ with parameters ξ by solving the following nonlinear least squares problem:
min_ξ ∫_Θ ∫_Ω |∇θ uθ(x) · Vξ(θ) − F[uθ](x)|^2 dx dθ.        (4)
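
As a point of reference, the objective (4) can be estimated by Monte Carlo sampling over Θ × Ω. The short sketch below reuses the illustrative u_theta, V_xi, and residual functions from the code sketch in the previous section; like that sketch, it is an assumption-laden illustration (here with plain gradient descent) rather than the solver actually used in [30].

```python
def least_squares_loss(xi, thetas, xs):
    """Monte Carlo estimate of (4): mean squared residual over sampled
    parameters theta in Theta and points x in Omega (no trajectory structure)."""
    r = jax.vmap(lambda th: jax.vmap(lambda x: residual(xi, th, x))(xs))(thetas)
    return jnp.mean(r ** 2)

# [30] then seeks xi minimizing this loss over many sampled thetas,
# e.g. (illustratively) by gradient descent:
thetas = jax.random.normal(jax.random.PRNGKey(1), (256, p))   # samples in Theta
g4 = jax.grad(least_squares_loss)(xi, thetas, xs)
xi_ls = jax.tree_util.tree_map(lambda w, gr: w - 1e-3 * gr, xi, g4)
```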

Once ξ is obtained, one can effectively approximate the solution of the IVP with any initial value g: first
find θ(0) by fitting uθ(0) to g (e.g., θ(0) = argmin_θ ∥uθ − g∥_2^2, which is fast to compute) and then numerically
integrate θ̇(t) = Vξ (θ(t)) (which is again fast) in the parameter space Θ. The solution trajectory {θ(t) :
0 ≤ t ≤ T } induces a path {uθ(t) : 0 ≤ t ≤ T } as an approximation to the solution of the IVP. The total
computational cost is substantially lower than that of solving the IVP (1) directly, as shown in [30].
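
A rough illustration of this two-step inference procedure, again reusing the illustrative definitions from the sketches above (with plain gradient descent for the fitting step and forward Euler for the integration, both stand-ins for whichever schemes one prefers):

```python
def fit_initial_theta(g, theta_init, xs, steps=500, lr=1e-2):
    """Fit theta(0) = argmin_theta sum_i |u_theta(x_i) - g(x_i)|^2 by gradient descent."""
    gx = jax.vmap(g)(xs)
    mse = lambda th: jnp.mean((jax.vmap(lambda x: u_theta(th, x))(xs) - gx) ** 2)
    grad_fn = jax.jit(jax.grad(mse))
    theta = theta_init
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta)
    return theta

def integrate(xi, theta0, T=1.0, n_steps=100):
    """Integrate the parameter ODE d(theta)/dt = V_xi(theta) on [0, T] by forward Euler."""
    dt = T / n_steps
    path = [theta0]
    for _ in range(n_steps):
        path.append(path[-1] + dt * V_xi(xi, path[-1]))
    return jnp.stack(path)                  # theta(t_k) at times t_k = k * dt

# Example: for the initial value g(x) = exp(-|x|^2), u_{theta(t_k)} approximates u_g(., t_k).
g = lambda x: jnp.exp(-jnp.sum(x ** 2))
theta0 = fit_initial_theta(g, jnp.zeros(p), xs)
theta_path = integrate(xi, theta0)
```

Each uθ(t_k) along the returned path can then be evaluated at arbitrary points x ∈ Ω, without any spatial mesh.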
