5 PINNs
A PREPRINT
a
Department of Mechanical Engineering, Massachusetts Institute of Technology
b
Department of Chemical Engineering, Massachusetts Institute of Technology
c
Department of Mathematics, Massachusetts Institute of Technology
ABSTRACT
One of the obstacles hindering the scaling-up of the initial successes of machine learning in practical
engineering applications is the dependence of the accuracy on the size of the database that “drives” the
algorithms. Incorporating the already-known physical laws into the training process can significantly
reduce the size of the required database. In this study, we establish a neural network-based computational
framework to characterize the finite deformation of elastic plates, which in classic theories
is described by the Föppl–von Kármán (FvK) equations with a set of boundary conditions (BCs).
A neural network is constructed by taking the spatial coordinates as the input and the displacement
field as the output to approximate the exact solution of the FvK equations. The physical information
(PDEs, BCs, and potential energies) is then incorporated into the loss function, and a pseudo dataset
is sampled without knowing the exact solution to finally train the neural network. The prediction
accuracy of the modeling framework is carefully examined by applying it to four different loading
cases: in-plane tension with non-uniformly distributed stretching forces, in-plane central-hole tension,
out-of-plane deflection, and buckling under compression. Three ways of formulating the loss function
are compared: 1) purely data-driven, 2) PDE-based, and 3) energy-based. Comparison with finite
element simulations shows that all three approaches can characterize the elastic
deformation of plates with satisfactory accuracy if trained properly. Compared with incorporating
the PDEs and BCs in the loss, using the total potential energy shows certain advantages in terms of the
simplicity of hyperparameter tuning and the computational efficiency.
Keywords Physics-informed neural network · structural mechanics · elastic plates · Ritz method
1 Introduction
In the past half-decade, machine learning has been studied extensively and has achieved remarkable successes in a wide spectrum
of scientific problems, including image processing [1, 2], cognitive science [3], genomics [4], drug discovery [5], and
materials design [6], to name a few. It has shown prominent advantages over other methods in effectively handling
complex natural systems with a daunting number of variables. Recently, we have witnessed a growing number of
initial successes of machine learning, especially deep learning, in modeling complex engineering systems with high
dimensionality (the number of variables and degrees of freedom) [7, 8], for example, predicting the remaining useful life
of a battery cell based on its partial life-cycle data [9]. In most cases, machine learning algorithms serve as a data-driven
approach, which has proven effective in predicting the performance of a system even when the underlying physics has not
been fully elucidated. However, like other statistical methods such as curve fitting and feature engineering, the accuracy
∗
Corresponding author. Emails: [email protected] (W.L.), [email protected] (M.Z.B), [email protected] (J.Z.)
A Physics-Guided Neural Network Framework for Elastic Plates: Comparison of Governing Equations-Based and
Energy-Based Approaches (A PREPRINT)
of machine learning algorithms depends strongly on the quantity and quality of the dataset used for training [10]. In
cases where little data is accessible or a large database is not affordable, for example at the microscales and nanoscales
of unknown materials, machine learning algorithms may lose their power, thus hindering the scaling-up of those initial
successes. In contrast, physics-based or first-principles-based models are commonly less reliant on the size of the
dataset because the governing physical laws are established beforehand and only a small amount of
data is required to calibrate the unknown parameters.
Bridging the gap between the data-driven approaches and the physics-based approaches creates a promising opportunity
to develop novel computational methods that have the potential to unite the advantages of both approaches: characterizing
high-dimensional systems with a small dataset. One type of these methods is often referred to as physics-guided
data-driven methods, which aim to implement the already-known physics into the data-driven approaches [8, 11, 12,
13, 14, 15, 16, 17, 18]. In this paper, we focus on machine learning algorithms, particularly artificial neural networks
(ANNs). Generally, there are three key elements in a machine learning application: a dataset, a model, and a training
process. In the open literature, the term “physics-guided machine learning” (PGML) is used
loosely. Here, we summarize the recent progress on this topic by classifying the existing studies
according to which key element they worked on.
The first category of studies implements physics into machine learning algorithms by generating a dataset following
physical laws [19, 20]. Instead of collecting data from the expensive and time-consuming experiments, attempts were
made to generate data through first-principle theories and physics-based simulations. In this way, the PGML algorithm
can learn from the known physics behind the man-made data. For example, Chen et al. [20] predicted the phonon
density of states of crystalline solids with unseen elements using a density functional perturbation theory-based phonon
database to train a Euclidean neural network. Another example is that Li et al. [19] generated a large numerical dataset
of lithium-ion battery failure behavior under mechanical impact loads with a well-calibrated physics-based detailed
model and trained various machine learning algorithms to get the safety envelope of the cell.
The second category of PGML studies implements physical laws by designing a physics-guided machine learning
model [8, 21, 22]. Various types of models have been used and studied for machine learning systems to solve a target
regression or classification problem, for example, ANNs and support vector machines (SVMs). Many existing studies
treated the model as a black box and empirically chose a set of model parameters, such as the number of nodes
and hidden layers of ANNs. But it has become a consensus that understanding the physics of the problem can guide
the design of the model. E’s research team [8, 21] is a clear pioneer in this aspect. In one of their recent studies [8], a
neural network was designed with several subnetworks representing the solution at different time instances to solve
high-dimensional partial differential equations (PDEs). Several important physical equations were successfully solved
with their algorithm, including the nonlinear Black–Scholes equation with default risk, the Hamilton–Jacobi–Bellman
equation, and the Allen–Cahn equation.
The last category of PGML studies imposes physics in the training process of the algorithm. A typical example is the
“physics-informed neural network” (PINN) proposed by Raissi et al. [14]. PINN introduces the PDEs and the associated
initial and boundary conditions (ICs and BCs) into the loss function and solves the PDEs by minimizing it. The authors
successfully applied the PINN approach to solving the Burgers’ equation and the Navier–Stokes equations, which are the
governing equations of a variety of flows. The same idea was adopted by Lu et al. [23] to solve a series of PDEs such as
the diffusion equation, and the authors created an open-source library for solving different PDEs, named
“DeepXDE”. Besides PINN, E et al. [13] proposed a deep learning Ritz method where the loss function is defined by
the energy functional corresponding to the PDEs. This type of loss-function-based optimization
algorithm can not only predict the performance of a system by solving the governing equations but also identify the
unknown parameters in a physical law through an inverse process. Raissi et al. [14] showed preliminary successes in
discovering the unknown parameters in the Burgers’ equation using PINN. In fact, this type of general application of
data-driven methods does not rely on machine learning algorithms. Zhao et al. [24] used an inverse approach to learn
pattern-forming equations such as Cahn–Hilliard and Allen–Cahn from image data. Zhao’s approach remained
effective even with a very small set of images. This success was recently extended by Effendy et al. [25] to analyze
and design the electrochemical impedance spectroscopy of energy storage systems.
Although we classify here the existing PGML studies in the open literature into three categories, it is worth noting that
there is no absolute boundary between them. It is possible to combine these approaches in one algorithm, and with the
rapid development of machine learning technologies, we are already witnessing an increasing number of advanced
algorithms that have all three of the above merits [26, 27].
There is a recent trend that the success of physics-guided machine learning algorithms, which was initially achieved in
fields such as fluid dynamics and mass and heat transfer, is now being extended into the field of applied mechanics
of solids. Haghighat et al. [28] applied the PINN framework to predicting the mechanical responses of linear elastic
materials, and their predictions agreed well with the finite element simulations. Wu et al. [29] designed a recurrent
neural network-accelerated multiscale model to describe the elasto-plastic behavior of heterogeneous media subjected to
random cyclic and non-proportional loading paths. A recent study by Samaniego et al. [16] adopted an energy approach
(variational method) to solve the PDEs in solid mechanics with machine learning and showed a high prediction accuracy
for the given examples. Huang et al. [30] developed a machine learning-based plasticity model that can effectively
predict the behaviors of history-dependent materials. In our opinion, one of the challenges for machine learning
applications in predicting the mechanical responses of solids is that most of the variables of interest (such as stress and
strain) are highly tensorial, or multi-axial. As a result, each direction has its own PDEs, leading to a large total number
of equations as well as BCs and ICs. Therefore, although the above studies all compared their predictions with other
numerical methods and showed satisfactory agreement, the most fundamental questions, such as how the loss function
should be formulated, are still not fully settled.
The purpose of the present study is to develop a neural network framework for predicting the mechanical responses of
elastic plates using the third category of the aforementioned PGML strategies (implementing physics into the training
process). Our work is distinguished from the existing publications in a number of ways. First, we carefully
investigate the reliability of the neural network framework by solving a set of high-order and highly non-linear equations.
In the classic theories of elastic plates, the governing equations are the well-known Föppl–von Kármán
equations, which consist of two second-order PDEs and one fourth-order PDE. Second, two different approaches are
used to construct the loss function, one based on PDEs and BCs (inspired by PINN), and the other based on the total
potential energy of the whole structure (inspired by deep Ritz). Third, the proposed computational framework is
applied to four different loading cases of the plates for evaluation. Non-linearities that stem from the loading condition
and the geometry of the plates are purposely introduced to push the developed numerical framework to its limit.
The paper starts with a brief introduction of the classic Kirchhoff plate theory. The theory of the neural network
framework is then presented, together with a comparison with the conventional purely data-driven machine learning
framework. Finally, four exemplary loading conditions are investigated, and some key features of the framework
are discussed.
For a given mechanical system, there are infinitely many possible configurations that satisfy the geometric constraints.
However, only the one that also satisfies the equilibrium condition is the true configuration. The displacement field
corresponding to the true configuration is the true displacement, and the virtual displacements represent all the possible
configurations consistent with the geometric constraints. The virtual work is then the work done by all
the forces through the virtual displacements. The virtual work done by the internal stresses and by the external body or surface
forces are respectively defined as the internal virtual work δU and the external virtual work δV . Among all the admissible
configurations, the one corresponding to the equilibrium configuration makes the total virtual work δΠ vanish. The
principle of virtual displacements is then stated as
δΠ = δU + δV = 0, (1)
where δ is the variation operator. From this equation, we derive the governing equations of the plates in the
following by writing down the internal and external work and making use of the principles of the variational method.
We adopt the classic plate theory that is based on the following three Kirchhoff hypotheses [31]: i) straight lines normal
to the plate mid-surface remain normal after deformation and thus are also called transverse normals; ii) the transverse
normals remain straight after deformation, and iii) the thickness of the plate does not change after deformation.
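[Figure 1 labels: material point (x, y, z) and mid-plane point (x, y, 0); transverse load q(x, y); mid-plane displacement u0; transverse normals with rotation θ1 = ∂w/∂x; boundary stresses σnn, σns, σnr in the local axes (n, s, r)]
Figure 1: Illustration of the reference and deformed configurations of an elastic plate and the boundary conditions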
Figure 1 shows a plate of thickness h in the Cartesian coordinate system (x, y, z). The x-y plane coincides with the geometric
mid-plane of the plate and the z-direction is taken positive downward. Without loss of generality, the three components
of the displacement field along the x, y, and z axes are denoted as ux , uy , and w, respectively. Based on the Kirchhoff
hypotheses, the strain components can be written as (refer to A.1 for a detailed derivation),
where ε0αβ (α, β = 1, 2) are the membrane strains that represent the in-plane deformation, and καβ (α, β = 1, 2) are
the curvatures (often known as the bending strains) that come from the transverse bending,
\[ \varepsilon^0_{\alpha\beta} = \frac{1}{2}\left( u^0_{\alpha,\beta} + u^0_{\beta,\alpha} + w_{,\alpha} w_{,\beta} \right), \tag{3} \]
where the comma notation “,” indicates a partial derivative (for example, w,α = ∂w/∂α), u0α (α = x, y) are the displacements
on the mid-plane (u0α (xα ) = uα (xα , 0)), and w,α (α = x, y) are respectively the rotation angles of the transverse normal
along the x and y axes.
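The membrane strain of Eq. (3) can be evaluated pointwise once the displacement gradients are known. The following is a minimal sketch, not the authors' implementation; the function name `membrane_strain` and the nested-list data layout are assumptions made for illustration.

```python
def membrane_strain(grad_u0, grad_w):
    """Return the 2x2 membrane strain tensor of Eq. (3).

    grad_u0[a][b] holds the mid-plane displacement gradient u0_{a,b} and
    grad_w[a] holds the deflection slope w_{,a}, with a, b in {0, 1}.
    (Hypothetical data layout, chosen only for this illustration.)
    """
    return [
        [0.5 * (grad_u0[a][b] + grad_u0[b][a] + grad_w[a] * grad_w[b])
         for b in range(2)]
        for a in range(2)
    ]


# No in-plane displacement gradient, but a deflection slope w_{,x} = 0.2:
# the quadratic term alone produces a membrane stretch in the x direction.
eps = membrane_strain([[0.0, 0.0], [0.0, 0.0]], [0.2, 0.0])
```

The example illustrates the geometric non-linearity of the theory: a pure out-of-plane rotation still stretches the mid-plane through the product term w,α w,β.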
\[ N_{\alpha\beta} = \int_{-h/2}^{h/2} \sigma_{\alpha\beta}\, dz, \tag{6} \]
\[ M_{\alpha\beta} = \int_{-h/2}^{h/2} \sigma_{\alpha\beta}\, z\, dz. \tag{7} \]
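As a quick numerical check of the thickness integrals in Eq. (6) and Eq. (7), the sketch below integrates one stress component through the thickness with a midpoint rule. The helper name `resultants`, the layer count, and the linear stress profile are illustrative assumptions.

```python
def resultants(stress, h, n=200):
    """Midpoint-rule approximation of Eqs. (6) and (7) for one stress component.

    stress: callable z -> sigma(z); h: plate thickness; n: number of layers.
    """
    dz = h / n
    force = 0.0
    moment = 0.0
    for k in range(n):
        z = -h / 2 + (k + 0.5) * dz   # mid-height of layer k
        force += stress(z) * dz       # contribution to N
        moment += stress(z) * z * dz  # contribution to M
    return force, moment


h = 2.0
# Hypothetical linear profile: uniform part 5.0 plus bending part 3.0 * z.
N, M = resultants(lambda z: 5.0 + 3.0 * z, h)
```

For a linear profile σ = σ0 + σ1 z, the closed-form values are N = σ0 h and M = σ1 h³/12, so only the uniform part contributes to the force and only the bending part contributes to the moment.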
We consider a distributed transverse (z-direction) pressure qt on the top surface and a set of mixed traction-displacement
boundary conditions. For generality, we define a local coordinate system (n, s, z) at any point on the boundary, where n and s
correspond to the normal and tangential directions. The applied traction can thereby be described by
the normal stress σ̂nn , the tangential stress σ̂ns , and the transverse shear stress σ̂nz . Hence, the external virtual work can be
calculated as
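\[ \delta V = -\int_{\Omega} q_t\, \delta w\, dx\, dy - \int_{\Gamma_\sigma} \left( \hat{N}_{nn}\, \delta u^0_n + \hat{N}_{ns}\, \delta u^0_s - \hat{M}_{nn}\, \delta w_{,n} - \hat{M}_{ns}\, \delta w_{,s} + \hat{N}_{nz}\, \delta w \right) ds, \tag{8} \]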
where un and us are respectively the displacements along the boundary normal and tangential directions, u0n and u0s
are respectively the corresponding displacements at the mid-plane, w,n and w,s are respectively the rotation angles of
the transverse normal along the boundary normal and tangential directions, and the applied thickness-integrated forces
and moments are defined in the same way as in Eq. (6) and Eq. (7).
To express the variation of internal energy in terms of virtual displacements, integration by parts needs to be performed
several times (details can be found in A.2). Applying the principle of virtual displacements and rearranging the
coefficients, we finally have
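\[
\begin{aligned}
0 &= \delta U + \delta V \\
&= -\int_{\Omega} \left\{ N_{\alpha\beta,\beta}\, \delta u^0_\alpha + \left[ (N_{\alpha\beta} w_{,\beta})_{,\alpha} + M_{\alpha\beta,\alpha\beta} - q_t \right] \delta w \right\} dx\, dy \\
&\quad + \int_{\Gamma_\sigma} \left[ N_{\alpha\beta} n_\beta\, \delta u^0_\alpha + \left( N_{\alpha\beta} w_{,\beta} n_\alpha + M_{\alpha\beta,\beta} n_\alpha \right) \delta w - M_{\alpha\beta} n_\beta\, \delta w_{,\alpha} + \cdots \right] ds.
\end{aligned} \tag{9}
\]
Because the virtual displacements δu0α and δw are arbitrary inside Ω, their coefficients in the domain integral must vanish, which gives
\[
\begin{aligned}
P_\alpha &\equiv N_{\alpha\beta,\beta} = 0, \\
P_z &\equiv (N_{\alpha\beta} w_{,\beta})_{,\alpha} + M_{\alpha\beta,\alpha\beta} - q_t = 0.
\end{aligned} \tag{10}
\]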
Eq. (10) is the general form of the governing equations of plates. Its special case for linear elasticity is the well-known
Föppl–von Kármán (FvK) equations, named after August Föppl [32] and Theodore von Kármán [33]. To derive them,
the constitutive equations of an isotropic elastic plate should be established,
where δαβ is the Kronecker delta (δαβ = 1 if α = β and δαβ = 0 if α ≠ β), C is the stretching stiffness, or axial
rigidity, and D is the bending stiffness, or flexural rigidity,
\[ C = \frac{Eh}{1-\nu^2}, \qquad D = \frac{Eh^3}{12(1-\nu^2)}. \tag{13} \]
Here E is the Young’s modulus and ν is the Poisson’s ratio. Substituting Eq. (2), Eq. (3), Eq. (11) and Eq. (12) into Eq.
(10), we can get the FvK equations for the isotropic elastic plate in terms of displacement.
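To make the role of the two stiffnesses concrete, here is a minimal sketch that evaluates Eq. (13) and applies an isotropic linear law to a strain or curvature tensor. The specific law used, R_αβ = K[(1 − ν) t_αβ + ν t_γγ δ_αβ], is the standard isotropic plate relation and is an assumption here, since Eq. (11) and Eq. (12) are not reproduced in this text; the thickness h = 1 is also a hypothetical value.

```python
def plate_stiffness(E, nu, h):
    """Stretching stiffness C and bending stiffness D of Eq. (13)."""
    C = E * h / (1.0 - nu ** 2)
    D = E * h ** 3 / (12.0 * (1.0 - nu ** 2))
    return C, D


def isotropic_resultant(stiffness, nu, t):
    """ASSUMED standard isotropic law: R_ab = K[(1 - nu) t_ab + nu t_cc delta_ab].

    With K = C and t = membrane strain it returns the force resultants N;
    with K = D and t = curvature it returns the moment resultants M.
    """
    trace = t[0][0] + t[1][1]
    return [
        [stiffness * ((1.0 - nu) * t[a][b] + (nu * trace if a == b else 0.0))
         for b in range(2)]
        for a in range(2)
    ]


# E and nu follow the first example of the paper; h = 1.0 is hypothetical.
C, D = plate_stiffness(E=70.0, nu=0.3, h=1.0)
N = isotropic_resultant(C, 0.3, [[0.01, 0.0], [0.0, 0.0]])
```

Under this assumed law, a uniaxial membrane strain of 0.01 gives N_xx = 0.01 C and, through the Poisson coupling, N_yy = 0.003 C.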
In Eq. (10), the two in-plane equations have second-order spatial derivatives and the out-of-plane equation has
fourth-order spatial derivatives. Therefore, a total of eight boundary conditions are required. As in the derivation
using adjacent equilibrium, it is possible to identify all these BCs on each edge of the plate. But the advantage of
using the energy method is that it gives not only the governing PDEs but also the complete description of the
boundary conditions that leads to a unique solution. By setting the integral along the stress boundary (the
second integral, along Γσ ) in Eq. (9) to zero, we can obtain the boundary conditions. The quantities with a variation
are referred to as the primary variables that constitute the geometric boundary conditions, and the coefficients of the
variations are referred to as the secondary variables that constitute the natural boundary conditions. We can see there
are five primary variables u0x , u0y , w, w,x , and w,y for a plate with the edges aligned with the x and y axes, which
indicates a total of ten boundary conditions (five geometric and five natural boundary conditions). This is seemingly
inconsistent with the eight boundary conditions from the order analysis of the PDEs. The reason is that there are
only four independent primary variables among the above five variables: only the rotation about the normal axis is
considered in the plate theory. Interested readers are referred to A.3 for the detailed derivation. After a transformation
from the global Cartesian coordinate system to the local coordinate system (n, s, z), the second integral becomes
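\[ 0 = \int_{\Gamma_\sigma} \left[ \left( N_{nn} - \hat{N}_{nn} \right) \delta u^0_n + \left( N_{ns} - \hat{N}_{ns} \right) \delta u^0_s + \left( V_n - \hat{V}_n \right) \delta w + \left( M_{nn} - \hat{M}_{nn} \right) \delta w_{,n} \right] ds, \tag{14} \]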
where
It is then clear that the four primary variables are u0n , u0s , w, and w,n , respectively corresponding to the in-plane
displacement in the normal direction, the in-plane displacement in the tangential direction, the out-of-plane deflection,
and the rotation along the normal axis, and the four secondary variables are Nnn , Nns , Vn , and Mnn , respectively
corresponding to the in-plane normal force, the in-plane tangential force, the shear force, and the bending moment.
A fully connected neural network is constructed to approximate the exact solution of the displacement field. It consists
of an input layer, an output layer, and hidden layers in between. Here, we take the spatial coordinates (x, y) as the inputs
and the displacement fields (ux , uy , w) to be predicted as the outputs (see Figure 3a).
Considering an L-layer neural network, or an (L − 1)-hidden-layer neural network, with Pk neurons in the k-th layer
(P0 = 2 is the dimension of inputs and PL = 3 is the dimension of outputs), for the j-th neuron in the k-th layer,
the output A^k_j is obtained by taking the weighted sum of the outputs of the previous layer, adding a bias, and then applying an
activation function,
\[ A^k_j = f\left( \sum_{i=1}^{P_{k-1}} W^k_{ij} A^{k-1}_i + b^k_j \right), \tag{16} \]
where W^k_ij are the weights and b^k_j are the biases. f (·) represents the nonlinear activation function. Some common choices (see
Fig. 3b) are the rectified linear unit (ReLU, f (x) = max{x, 0}), the logistic sigmoid function (f (x) = 1/(1 + e^{−x})),
and the hyperbolic tangent (Tanh, f (x) = (e^x − e^{−x})/(e^x + e^{−x})). The above equation can be written in vector and
matrix form,
\[ A^k = f\left( (W^k)^T A^{k-1} + b^k \right), \tag{17} \]
Figure 2: Flow chart of the physics-guided machine learning framework compared with the purely data-driven approach
Figure 3: Construction of the artificial neural network, (a) fully-connected multi-layer network, (b) activation functions: ReLU(x), Sigmoid(x), and Tanh(x)
where W^k = [W^k_ij] ∈ R^{P_{k−1} × P_k} and b^k = [b^k_j] ∈ R^{P_k} are the weight matrix and bias vector, respectively. A^k
represents the output vector of the k-th layer. The neural network can then be defined recursively as follows,
Note that the activation function is not applied at the output layer. It should also be pointed out that the lowercase
w without subscripts denotes the out-of-plane displacement whereas the capital W and Wij represent the weights
of the neural network. The weights and biases are the trainable parameters of the neural network, and there is a
total of Σ_{i=1}^{L} P_{i−1} P_i weights and Σ_{i=1}^{L} P_i biases (P0 = 2, PL = 3). It has been proven that a neural network can
approximate any target function given arbitrary width or depth [39]. This approximation ability of the neural
network makes it possible to represent the full-field solution of the PDEs.
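The layer recursion and the parameter count described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the hidden-layer widths, the uniform random initialization, the Tanh activation, and all function names are hypothetical; only P0 = 2 and PL = 3 come from the text.

```python
import math
import random


def init_network(sizes, seed=0):
    """Create weights W^k (P_{k-1} x P_k) and biases b^k (P_k) for sizes = [P_0, ..., P_L]."""
    rng = random.Random(seed)
    layers = []
    for p_in, p_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.uniform(-1.0, 1.0) for _ in range(p_out)] for _ in range(p_in)]
        b = [0.0] * p_out
        layers.append((W, b))
    return layers


def forward(layers, xy):
    """Evaluate Eq. (16) layer by layer; no activation is applied at the output layer."""
    a = list(xy)
    for k, (W, b) in enumerate(layers):
        z = [sum(W[i][j] * a[i] for i in range(len(a))) + b[j]
             for j in range(len(b))]
        is_output = (k == len(layers) - 1)
        a = z if is_output else [math.tanh(v) for v in z]
    return a


def count_parameters(sizes):
    """Number of weights (sum of P_{i-1} P_i) and biases (sum of P_i) for i = 1..L."""
    n_weights = sum(p * q for p, q in zip(sizes[:-1], sizes[1:]))
    n_biases = sum(sizes[1:])
    return n_weights, n_biases


sizes = [2, 20, 20, 3]  # hypothetical widths; only P_0 = 2 and P_L = 3 come from the text
net = init_network(sizes)
ux, uy, w = forward(net, (0.5, -0.25))
n_w, n_b = count_parameters(sizes)
```

For these assumed widths the count gives 2·20 + 20·20 + 20·3 = 500 weights and 20 + 20 + 3 = 43 biases.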
The loss function is perhaps the most essential part of a neural network algorithm because the already-known physical
laws will be implemented here. As illustrated in Figure 2, we consider three ways of formulating the loss function.
The first is purely data-driven and compares the prediction of the displacement field only with the data observed from
experiments or simulations. Therefore, it is not a physics-guided algorithm. We include it here for the comparison with
the other two loss functions, which reflect the physical laws. The second one is defined on the PDEs and BCs, and
the third one is defined on the total potential energy of the plate. It is worth noting that the PDE-based loss is close to the
concept of PINN proposed by Raissi et al. [14] and the energy-based loss can be viewed as an implementation of the Ritz
method [13].
Purely data-driven
Like most conventional data-driven methods, the loss function of a regression problem can be constructed by the
difference between the predictions and the true experimental or numerical observations, such as the mean square error,
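\[ L_{\text{Data-driven}} = \sum_{i=1}^{Q} \frac{1}{Q} \left[ \left( u^i_x - \hat{u}^i_x \right)^2 + \left( u^i_y - \hat{u}^i_y \right)^2 + \left( w^i - \hat{w}^i \right)^2 \right], \tag{19} \]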
where the superscript i indicates the i-th training sample, ûix , ûiy , and ŵi are the observed displacements in the training
dataset, uix , uiy , and wi are the predicted displacements, and Q is the total number of training samples, which should be
sufficiently large to ensure an acceptable accuracy.
It is worth noting that the purely data-driven loss function can also be defined on the strain field or the stress field.
To realize these formulations, one can replace the u, w and û, ŵ components in Eq. (19) with the corresponding strain or stress components. It is also common to use a
combination of the displacement, strain, and stress fields. This point will be investigated in the following sections of the
paper.
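The mean square error of Eq. (19) is straightforward to evaluate. A minimal sketch follows; the function name and the tuple-based data layout are assumptions made for illustration.

```python
def data_driven_loss(pred, obs):
    """Mean squared error of Eq. (19).

    pred, obs: equal-length lists of (u_x, u_y, w) tuples, one per training sample.
    """
    if len(pred) != len(obs) or not pred:
        raise ValueError("pred and obs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(pred, obs):
        total += sum((pi - oi) ** 2 for pi, oi in zip(p, o))
    return total / len(pred)


# Two hypothetical samples: predicted versus observed displacements.
pred = [(0.10, 0.00, 0.50), (0.20, 0.10, 0.40)]
obs = [(0.10, 0.00, 0.40), (0.00, 0.10, 0.40)]
loss = data_driven_loss(pred, obs)
```

The same function could be applied to strain or stress components by changing what the tuples contain.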
PDE-based
For the studied plate theory, the outputs of the neural network should satisfy the governing PDEs (Eq. (10)) and the
BCs (Eq. (14)). The first way to implement the plate theory into the algorithm is, therefore, to construct the loss with
the residuals of the PDEs and BCs,
\[ L_{\text{PDEs}} = \sum_{i=1}^{Q_P} \frac{1}{Q_P} \left[ \left( P^i_x \right)^2 + \left( P^i_y \right)^2 + \left( P^i_z \right)^2 \right], \tag{21} \]
where Px , Py , and Pz are the residuals of the three governing PDEs defined in Eq. (10), and QP is the number of training samples
within the solution domain.
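\[ L_{\text{BCs}} = \sum_{i=1}^{Q_{bs}} \frac{1}{Q_{bs}} \left[ \left( N^i_{nn} - \hat{N}^i_{nn} \right)^2 + \left( N^i_{ns} - \hat{N}^i_{ns} \right)^2 + \left( V^i_n - \hat{V}^i_n \right)^2 + \left( M^i_{nn} - \hat{M}^i_{nn} \right)^2 \right], \tag{22} \]
\[ L_{\text{BCd}} = \sum_{i=1}^{Q_{bd}} \frac{1}{Q_{bd}} \left[ \left( u^i_{0n} - \hat{u}^i_{0n} \right)^2 + \left( u^i_{0s} - \hat{u}^i_{0s} \right)^2 + \left( w^i - \hat{w}^i \right)^2 + \left( w^i_{,n} - \hat{w}^i_{,n} \right)^2 \right]. \tag{23} \]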
λs and λd are the weights of the loss on the stress boundary and the displacement boundary, the hat notation indicates the prescribed
secondary and primary variables defined in Eq. (14) at the boundaries, the corresponding variables without a hat
are the predicted outputs of the neural network, and Qbs and Qbd are the numbers of samples at the static boundary and
kinematic boundary, respectively.
By minimizing the total loss, the PDEs and boundary conditions can be satisfied. Mathematically, the neural network is
able to approximate the exact solution.
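A sketch of how the three residual terms might be combined is given below. The weighted sum L = L_PDEs + λs L_BCs + λd L_BCd is an assumption inferred from the description of λs and λd as boundary weights; the function names, residual values, and weight values are hypothetical.

```python
def mean_square(residuals):
    """Mean over sample points of the summed squared residual components."""
    return sum(sum(r ** 2 for r in point) for point in residuals) / len(residuals)


def pde_based_loss(pde_res, bcs_res, bcd_res, lam_s=1.0, lam_d=1.0):
    """ASSUMED combination: L = L_PDEs + lam_s * L_BCs + lam_d * L_BCd."""
    return (mean_square(pde_res)
            + lam_s * mean_square(bcs_res)
            + lam_d * mean_square(bcd_res))


# Hypothetical residuals: (P_x, P_y, P_z) in the domain, four mismatch
# components per point on the static and kinematic boundaries.
pde_res = [(0.1, 0.0, 0.2), (0.0, 0.1, 0.0)]
bcs_res = [(0.2, 0.0, 0.0, 0.0)]
bcd_res = [(0.0, 0.1, 0.0, 0.0), (0.0, 0.1, 0.0, 0.0)]
loss = pde_based_loss(pde_res, bcs_res, bcd_res, lam_s=10.0, lam_d=100.0)
```

Because the three terms generally differ in magnitude and units, the choice of λs and λd is a hyperparameter-tuning task, which is one of the points of comparison with the energy-based loss.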
Energy-based
The second way to implement the plate theory into the algorithm is to directly take the total potential energy as the loss
and minimize the loss according to the principle of minimum potential energy. However, as E et al. [13] pointed out
in their study on the deep Ritz method, the challenging issue is how to incorporate the kinematic boundaries into the
total potential energy because it is not automatically included. Here, we construct the loss function based on the total
potential energy with a penalty energy term [13, 34, 35],
LEnergy-based = Π = U + V + T, (24)
V is the virtual work done by external forces on the boundary (defined as negative to be consistent with the principle
of virtual displacements)
\[ V = \int_{\Gamma_\sigma} \left[ -\hat{N}_{nn} u_{0n} - \hat{N}_{ns} u_{0s} - \hat{N}_{nz} w + \hat{M}_{ns} w_{,s} + \hat{M}_{nn} w_{,n} \right] ds, \tag{26} \]
and T is the penalty term that enforces the kinematic boundary conditions,
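\[ T = \lambda^*_n \int_{\Gamma_d} \left| u_{0n} - \hat{u}_{0n} \right| ds + \lambda^*_s \int_{\Gamma_d} \left| u_{0s} - \hat{u}_{0s} \right| ds + \lambda^*_w \int_{\Gamma_d} \left| w - \hat{w} \right| ds + \lambda^*_{w,n} \int_{\Gamma_d} \left| w_{,n} - \hat{w}_{,n} \right| ds. \tag{27} \]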
Here, λ∗n , λ∗s , λ∗w , and λ∗w,n are four Lagrangian multipliers that have the dimension of force (λ∗n , λ∗s , λ∗w ) or moment
(λ∗w,n ), representing the applied forces and moments on the kinematic boundaries. Note that if the kinematic BCs are
satisfied, this penalty term vanishes. In addition, the first variation of this penalty term is zero (δT = 0), and therefore,
taking the variation of the total potential energy functional Π still returns δU + δV . This is consistent with Eq. (1).
The energy integrals can be evaluated numerically. For integration over the domain, we first uniformly sample QP
points; the integral can then be approximated by
\[ U_{\text{num}} = \sum_{i=1}^{Q_P} U^i\, \delta A_i \approx \sum_{i=1}^{Q_P} \frac{1}{Q_P} U^i A_t, \tag{28} \]
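\[ U = \frac{1}{2} \left( N_{\alpha\beta}\, \varepsilon^0_{\alpha\beta} + M_{\alpha\beta}\, \kappa_{\alpha\beta} \right). \tag{29} \]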
U^i is, therefore, the internal energy density at the i-th sample point, and δAi is the discrete area around the sampled point,
which can be estimated by At /QP when QP is sufficiently large, where At is the total area.
Similarly,
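\[ V_{\text{num}} = \sum_{i=1}^{Q_{bs}} V^i\, \delta l_i \approx \sum_{i=1}^{Q_{bs}} \frac{1}{Q_{bs}} V^i l_t, \tag{30} \]
\[ V = -\hat{N}_{nn} u_{0n} - \hat{N}_{nz} w - \hat{N}_{ns} u_{0s} + \hat{M}_{ns} w_{,s} + \hat{M}_{nn} w_{,n}. \tag{31} \]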
δli is the discrete length around the sampled point, approximately lt /Qbs , where lt is the total length of the static
boundaries.
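The sampling-based quadrature of Eq. (28) amounts to a Monte Carlo estimate: the mean of the sampled density times the total area. The sketch below shows this for a rectangular domain; the function name, the quadratic test density, and the 10 × 10 domain are hypothetical choices for illustration, and the boundary integral of Eq. (30) would follow the same pattern with a length in place of the area.

```python
import random


def domain_integral(density, x_range, y_range, n_points, rng):
    """Approximate an area integral as in Eq. (28): (A_t / Q_P) * sum of sampled densities."""
    (x0, x1), (y0, y1) = x_range, y_range
    area = (x1 - x0) * (y1 - y0)          # total area A_t
    total = 0.0
    for _ in range(n_points):
        x = rng.uniform(x0, x1)           # uniform random sample in the domain
        y = rng.uniform(y0, y1)
        total += density(x, y)
    return area * total / n_points


rng = random.Random(0)
# Hypothetical energy density U(x, y) = x^2 + y^2 on a 10 x 10 domain.
estimate = domain_integral(lambda x, y: x * x + y * y,
                           (0.0, 10.0), (0.0, 10.0), 20000, rng)
exact = 2.0 * 10.0 ** 4 / 3.0             # closed-form value of the integral
```

The statistical error of such an estimate decays with the square root of the number of samples, which is one reason the whole sample set is needed in each evaluation of an energy-based loss.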
T can be numerically interpreted in the same way as LBCd in Eq. (23). For simplicity, we assume all the four Lagrangian
multipliers are of the same value λd ,
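\[ T_{\text{num}} = \lambda_d \sum_{i=1}^{Q_{bd}} \frac{1}{Q_{bd}} \left[ \left( u^i_{0n} - \hat{u}^i_{0n} \right)^2 + \left( u^i_{0s} - \hat{u}^i_{0s} \right)^2 + \left( w^i - \hat{w}^i \right)^2 + \left( w^i_{,n} - \hat{w}^i_{,n} \right)^2 \right]^{1/2}. \tag{32} \]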
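A minimal sketch of the discrete penalty term of Eq. (32) is shown below. The function name, the tuple layout, and the numerical values are assumptions made for illustration.

```python
import math


def penalty_term(pred, target, lam_d):
    """T_num of Eq. (32): lam_d times the mean Euclidean mismatch on the kinematic boundary.

    pred, target: lists of (u_0n, u_0s, w, w_n) tuples at the Q_bd boundary samples.
    """
    total = 0.0
    for p, t in zip(pred, target):
        total += math.sqrt(sum((pi - ti) ** 2 for pi, ti in zip(p, t)))
    return lam_d * total / len(pred)


# Two hypothetical boundary samples; the second one satisfies the kinematic BCs exactly.
pred = [(0.3, 0.4, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)]
target = [(0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)]
T = penalty_term(pred, target, lam_d=100.0)
```

Unlike the squared mismatch in Eq. (23), the square root makes the penalty scale linearly with the boundary violation, so the term vanishes exactly when the kinematic BCs are met.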
It should be noted that both PDE-based and energy-based loss functions require the calculation of the partial derivatives
of the outputs with respect to the inputs. Most existing machine learning platforms (e.g. TensorFlow [40] and PyTorch
[41]) are already equipped with default gradient algorithms, and users can obtain the gradients efficiently.
Higher-order partial derivatives can also be calculated by these algorithms but take more computation time. An alternative
strategy is to introduce the derivatives into the outputs of the neural network. For example, if we set the outputs to be
(ux , uy , ux,x , ux,y , uy,x , uy,y ), the second-order derivatives of ux and uy can be obtained by taking only the
first derivative of the outputs. In this way, we can avoid calculating high-order derivatives. The disadvantage is that
it increases the size of the neural network, and extra constraints are needed to enforce the mathematical relations
among the outputs. For instance, in the above example, the derivative of the first output ux with respect to the first input x should be
equal to the third output ux,x . For simplicity, we adopted the default gradient algorithm in PyTorch to ensure higher
accuracy of the calculation of the derivatives at the sacrifice of computational efficiency.
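To illustrate how exact higher-order derivatives of a network output with respect to its input can be propagated, the sketch below uses a small truncated-Taylor ("jet") class in plain Python. This is a swapped-in technique (forward-mode differentiation for one input variable), not the reverse-mode gradient algorithm of PyTorch used in the paper; the class, the one-neuron network, and its weights are all hypothetical.

```python
import math


class Jet:
    """Value, first and second derivative of a quantity with respect to one input x."""

    def __init__(self, v, d1=0.0, d2=0.0):
        self.v, self.d1, self.d2 = v, d1, d2

    def __add__(self, other):
        other = other if isinstance(other, Jet) else Jet(other)
        return Jet(self.v + other.v, self.d1 + other.d1, self.d2 + other.d2)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Jet) else Jet(other)
        return Jet(
            self.v * other.v,
            self.d1 * other.v + self.v * other.d1,                               # product rule
            self.d2 * other.v + 2.0 * self.d1 * other.d1 + self.v * other.d2,    # Leibniz rule
        )

    __rmul__ = __mul__


def tanh(a):
    """Chain rule for tanh up to second order."""
    t = math.tanh(a.v)
    f1 = 1.0 - t * t          # tanh'
    f2 = -2.0 * t * f1        # tanh''
    return Jet(t, f1 * a.d1, f2 * a.d1 ** 2 + f1 * a.d2)


def tiny_net(x, w1=1.5, b1=0.2, w2=0.7):
    """One-neuron network u(x) = w2 * tanh(w1 * x + b1) with hypothetical weights."""
    return w2 * tanh(w1 * x + b1)


x = Jet(0.3, 1.0, 0.0)        # seed: dx/dx = 1, d2x/dx2 = 0
u = tiny_net(x)               # u.v, u.d1, u.d2 hold u, du/dx, d2u/dx2 at x = 0.3
```

The cost of carrying the extra derivative slots through every operation mirrors the remark above that higher-order derivatives take more computation time.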
As indicated by Eq. (19), the purely data-driven loss function can be minimized only when the displacement
field (ûx , ûy , ŵ) at a sufficiently large number of sample points can be observed, which requires a large experimental or
numerical database. The exact solution for (ûx , ûy , ŵ) is always preferable to numerical simulation results for a
reliable training process. Experimental data is good but usually comes with measurement uncertainties. In addition, as
we already pointed out, high-order equations such as the FvK equations are difficult to solve analytically. Therefore,
obtaining a satisfactory dataset to train the data-driven algorithm is a big challenge even for a simple structure like
an elastic plate.
The training data of the physics-guided algorithms has two important features: 1) exact solutions are not required
(the loss function only contains the predicted outputs), which means that the training data are simply the input spatial
coordinates; 2) the data should be sampled both within the solution domain and at the boundaries. Theoretically, the training
data can be sampled in any size and with any strategy. Two possible sampling strategies are: 1) the data points are sampled at
the beginning and remain unchanged during the training process. The samples can be uniform grid points or randomly
distributed points. 2) the data points are sampled randomly at each training epoch, which means the training dataset
changes during training. For the energy-based loss function, due to the requirement of numerical integration (Eq. (28) and
Eq. (30)), we choose the second strategy, where we first randomly sample the data points from a uniform distribution and
then resample the data during each training epoch to obtain more accurate and consistent results. For the PDE-based loss
function, the same strategy is used. It should also be noted that the data points can be non-uniformly distributed with
local refinement, which can be used as a means to improve the results. This point will be further discussed in the next
section.
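The two sampling strategies can be sketched as follows. This is only an illustrative sketch: the helper names `sample_interior` and `sample_edges` are hypothetical, the domain is assumed to be the quarter model [0, l/2] × [0, l/2] of the first example, and the loop body stands in for whatever loss evaluation and optimizer step is actually used.

```python
import numpy as np

def sample_interior(n, half_l, rng):
    """Uniform random points inside the quarter plate [0, l/2] x [0, l/2]."""
    return rng.uniform(0.0, half_l, size=(n, 2))

def sample_edges(n, half_l, rng):
    """Uniform random points on each of the four straight edges."""
    s = rng.uniform(0.0, half_l, size=(4, n))
    zeros, full = np.zeros(n), np.full(n, half_l)
    return {
        "left":   np.column_stack([zeros, s[0]]),   # x = 0
        "bottom": np.column_stack([s[1], zeros]),   # y = 0
        "right":  np.column_stack([full, s[2]]),    # x = l/2
        "top":    np.column_stack([s[3], full]),    # y = l/2
    }

rng = np.random.default_rng(0)
half_l = 10.0  # mm, quarter model of a 20 mm x 20 mm plate

# Strategy 1: sample once and reuse the same points at every epoch.
fixed_pts = sample_interior(10_000, half_l, rng)

# Strategy 2: redraw the points at every epoch.
for epoch in range(3):
    pts = sample_interior(10_000, half_l, rng)
    edges = sample_edges(1_000, half_l, rng)
    # ... evaluate the loss on pts and edges, then update the network ...
```

With the second strategy the network sees a different set of collocation points at every epoch, so no fixed point set can be overfitted.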
The training process essentially optimizes the weights and biases of a neural network to minimize the loss function. It
is common to train the network with small batches drawn from the training dataset, and this remains applicable to the PDE-based loss
function. For the energy-based loss function, however, the network has to be trained with a single batch (the whole dataset)
at each optimization iteration so that the numerical integration is evaluated reliably.
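The need for a full batch can be illustrated with a small numerical experiment. The sketch below assumes a Monte Carlo estimate of the form (area × mean of the integrand); the function `density` is a hypothetical smooth stand-in, not the plate energy density, chosen so that the exact integral is known.

```python
import numpy as np

rng = np.random.default_rng(0)
half_l = 10.0
area = half_l ** 2

def density(pts):
    # Hypothetical smooth integrand; it is NOT the plate energy density.
    x, y = pts[:, 0], pts[:, 1]
    return np.sin(np.pi * x / (2 * half_l)) ** 2 + 0.1 * y

exact = 100.0  # closed-form integral of `density` over [0, 10] x [0, 10]

def mc_integral(n):
    """Monte Carlo estimate: domain area times the sample mean."""
    pts = rng.uniform(0.0, half_l, size=(n, 2))
    return area * density(pts).mean()

# Repeat the estimate many times to compare the scatter of the two batch sizes.
full = np.array([mc_integral(10_000) for _ in range(200)])
mini = np.array([mc_integral(128) for _ in range(200)])
print("full batch: mean %.2f, std %.2f" % (full.mean(), full.std()))
print("mini batch: mean %.2f, std %.2f" % (mini.mean(), mini.std()))
```

Both estimates are unbiased, but the scatter of the mini-batch estimate is much larger, which is one way to see why an integral-type loss is evaluated on the whole sample set.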
We start with a two-dimensional case that involves only in-plane deformation under the plane-stress condition, as illustrated
in Figure 4a. The two lateral edges of a 20 mm × 20 mm (l × l) square elastic plate are subjected to a non-uniform
stretching force that follows a sinusoidal distribution p = sin (yπ/l) MPa. The other two edges (upper and lower) are
traction-free. The origin of the Cartesian coordinate system is placed at the center of the square with the x axis pointing to the
right. The Young’s modulus of the elastic plate is 70 MPa and the Poisson’s ratio is 0.3. Due to the symmetry of the
geometry and loads, only one quarter of the plate is modeled (see Figure 4b). The boundary conditions at the four edges
of the quarter model are as follows,
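$$
\begin{aligned}
&u_x\big|_{x=0} = 0, \qquad u_y\big|_{y=0} = 0,\\
&N_{xx}\big|_{x=l/2} = p \cdot h = \sin\!\left(\frac{y}{l}\pi\right) h, \qquad N_{xy}\big|_{x=l/2} = 0,\\
&N_{yy}\big|_{y=l/2} = 0, \qquad N_{xy}\big|_{y=l/2} = 0.
\end{aligned} \tag{33}
$$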
The two governing PDEs for this case can be written in terms of the displacement field,
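$$
\begin{aligned}
P_x &\equiv \frac{E}{1-\nu^2}\left(\frac{\partial^2 u_x}{\partial x^2} + \frac{1-\nu}{2}\frac{\partial^2 u_x}{\partial y^2} + \frac{1+\nu}{2}\frac{\partial^2 u_y}{\partial x\,\partial y}\right) = 0,\\
P_y &\equiv \frac{E}{1-\nu^2}\left(\frac{\partial^2 u_y}{\partial y^2} + \frac{1-\nu}{2}\frac{\partial^2 u_y}{\partial x^2} + \frac{1+\nu}{2}\frac{\partial^2 u_x}{\partial x\,\partial y}\right) = 0.
\end{aligned} \tag{34}
$$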
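As a quick check of Eq. (34), the residuals P_x and P_y can be evaluated numerically for a trial displacement field. The sketch below is illustrative only: a trained network would normally supply the derivatives by automatic differentiation, whereas central finite differences are swapped in here to keep the example dependency-free. The function `navier_residuals` and the quadratic test fields are hypothetical; the first field is constructed to satisfy both equations, and the second violates the x-equation.

```python
import numpy as np

E, nu = 70.0, 0.3  # Young's modulus (MPa) and Poisson's ratio from the text

def navier_residuals(ux, uy, x, y, d=1e-2):
    """Evaluate P_x and P_y of Eq. (34) with central differences of step d."""
    def dxx(f):
        return (f(x + d, y) - 2 * f(x, y) + f(x - d, y)) / d**2
    def dyy(f):
        return (f(x, y + d) - 2 * f(x, y) + f(x, y - d)) / d**2
    def dxy(f):
        return (f(x + d, y + d) - f(x + d, y - d)
                - f(x - d, y + d) + f(x - d, y - d)) / (4 * d**2)
    c = E / (1 - nu**2)
    Px = c * (dxx(ux) + 0.5 * (1 - nu) * dyy(ux) + 0.5 * (1 + nu) * dxy(uy))
    Py = c * (dyy(uy) + 0.5 * (1 - nu) * dxx(uy) + 0.5 * (1 + nu) * dxy(ux))
    return Px, Py

a = 1e-3
def ux_eq(x, y):          # u_x = a x^2
    return a * x**2
def uy_eq(x, y):          # u_y = -4a/(1+nu) x y, chosen so that P_x = P_y = 0
    return -4.0 * a / (1 + nu) * x * y
def uy_zero(x, y):        # u_y = 0, which leaves P_x = 2 a E / (1 - nu^2)
    return np.zeros_like(x)

rng = np.random.default_rng(0)
X, Y = rng.uniform(0.0, 10.0, 50), rng.uniform(0.0, 10.0, 50)
Px_eq, Py_eq = navier_residuals(ux_eq, uy_eq, X, Y)
Px_bad, Py_bad = navier_residuals(ux_eq, uy_zero, X, Y)
print(np.abs(Px_eq).max(), np.abs(Py_eq).max(), Px_bad.mean())
```

The mean of the squared residuals over the sample points is the quantity that a PDE-based loss penalizes.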
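Two different loss functions are defined for comparison according to Eq. (20) and Eq. (24). The PDE-based loss
function is defined in Eq. (35), and its individual terms are
$$
\begin{aligned}
L_{\mathrm{PDEs}} &= \frac{1}{Q_P}\sum_{i=1}^{Q_P}\left[\left(P_x^i\right)^2 + \left(P_y^i\right)^2\right],\\
L_{\mathrm{BCsx}} &= \frac{1}{Q_{sx}}\sum_{i=1}^{Q_{sx}}\left[\left(N_{xx}^i - p^i \cdot h\right)^2 + \left(N_{xy}^i\right)^2\right],\\
L_{\mathrm{BCsy}} &= \frac{1}{Q_{sy}}\sum_{i=1}^{Q_{sy}}\left[\left(N_{yy}^i\right)^2 + \left(N_{xy}^i\right)^2\right],\\
L_{\mathrm{BCd}} &= \frac{1}{Q_{dx}}\sum_{i=1}^{Q_{dx}}\left(u_x^i\right)^2 + \frac{1}{Q_{dy}}\sum_{i=1}^{Q_{dy}}\left(u_y^i\right)^2,
\end{aligned} \tag{36}
$$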
[Figure 4 panel annotations: load p = sin(πy/l); traction condition N_xx = p · h, N_xy = 0; free condition N_yy = 0, N_xy = 0; free condition n_x · N_xx + n_y · N_xy = 0, n_y · N_yy + n_x · N_xy = 0; symmetry conditions u_x = 0 and u_y = 0.]
Figure 4: (a) Loading condition of non-uniform tension of plate and (b) one-quarter equivalent model; (c) loading
condition of uniaxial central-hole tension and (d) one-quarter equivalent model.
where variables with superscript i indicate values evaluated at the i-th training sample, and Q_P , Q_sx , Q_sy , Q_dx , and Q_dy are
the total numbers of training samples within the domain, on the right and upper edges (static boundary), and on the
left and lower edges (kinematic boundary), respectively. The membrane forces are computed from the displacement
gradients according to Eq. (11).
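The residual terms above can be combined in code along the following lines. This is a minimal sketch under stated assumptions: the weighted sum L_PDEs + λs (L_BCsx + L_BCsy) + λd L_BCd is an assumed form of the total loss, and the function and argument names are hypothetical. The upper-edge term penalizes N_yy and N_xy, which is what a traction-free upper edge requires.

```python
import numpy as np

def mse(*residuals):
    """Mean over the samples of the summed squared residuals."""
    return float(np.mean(sum(np.square(r) for r in residuals)))

def pde_based_loss(Px, Py, Nxx_r, Nxy_r, ph_r, Nyy_t, Nxy_t, ux_l, uy_b,
                   lam_s=0.1, lam_d=1.0):
    """Assumed weighted sum of the PDE and boundary-condition residual terms.

    Px, Py       : PDE residuals at interior points
    Nxx_r, Nxy_r : membrane forces on the loaded edge x = l/2
    ph_r         : prescribed traction p*h at the same edge points
    Nyy_t, Nxy_t : membrane forces on the free edge y = l/2
    ux_l, uy_b   : displacements on the edges x = 0 and y = 0
    """
    L_pdes = mse(Px, Py)
    L_bcsx = mse(Nxx_r - ph_r, Nxy_r)
    L_bcsy = mse(Nyy_t, Nxy_t)
    L_bcd = mse(ux_l) + mse(uy_b)
    return L_pdes + lam_s * (L_bcsx + L_bcsy) + lam_d * L_bcd

# Toy usage with made-up residual values (not results from the paper).
z = np.zeros(4)
loss = pde_based_loss(np.ones(4), z, z + 2.0, z, z, z, z, z, z, lam_s=0.1)
print(loss)
```

Each boundary term has its own sample count, so the means are taken per term before weighting.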
The energy-based loss function is,
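$$
\begin{aligned}
L_{\text{Energy-based}} &= U + V + T\\
&= \int_0^{l/2}\!\!\int_0^{l/2} \frac{Eh}{2\left(1-\nu^2\right)}\left[\left(\frac{\partial u_x}{\partial x}\right)^2 + \left(\frac{\partial u_y}{\partial y}\right)^2 + 2\nu\,\frac{\partial u_x}{\partial x}\frac{\partial u_y}{\partial y} + \frac{1-\nu}{2}\left(\frac{\partial u_x}{\partial y} + \frac{\partial u_y}{\partial x}\right)^2\right] dx\,dy\\
&\quad - \int_0^{l/2}\left[N_{xx}(y)\cdot u_x(y)\right]_{x=l/2} dy + \int_0^{l/2}\left[u_x(y)\right]^2_{x=0} h\,dy + \int_0^{l/2}\left[u_y(x)\right]^2_{y=0} h\,dx.
\end{aligned} \tag{37}
$$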
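To make the structure of Eq. (37) concrete, the sketch below evaluates U and V by Monte Carlo integration for a case with a known answer. It is a hedged illustration, not the training code: a uniform load p replaces the sinusoidal one so that the analytic field u_x = p x/E, u_y = −ν p y/E can be used, the thickness h = 1 mm is an assumed value, and the penalty term T is zero because this field satisfies u_x = 0 at x = 0 and u_y = 0 at y = 0.

```python
import numpy as np

E, nu = 70.0, 0.3          # material constants from the text
h = 1.0                    # plate thickness in mm (assumed value)
half_l, p = 10.0, 1.0      # quarter-plate side (mm) and uniform load (MPa)

def strain_energy_density(exx, eyy, gxy):
    """Integrand of U in Eq. (37), written with displacement gradients."""
    return E * h / (2 * (1 - nu**2)) * (
        exx**2 + eyy**2 + 2 * nu * exx * eyy + 0.5 * (1 - nu) * gxy**2)

rng = np.random.default_rng(0)
pts = rng.uniform(0.0, half_l, size=(10_000, 2))   # interior samples
edge_y = rng.uniform(0.0, half_l, size=1_000)      # samples on x = l/2

# Gradients of the analytic uniform-tension field at the sample points.
exx = np.full(len(pts), p / E)
eyy = np.full(len(pts), -nu * p / E)
gxy = np.zeros(len(pts))
ux_right = np.full(len(edge_y), p * half_l / E)    # u_x on the loaded edge

U = half_l**2 * strain_energy_density(exx, eyy, gxy).mean()  # area * mean
V = -half_l * (p * h * ux_right).mean()                      # -(edge length) * mean
total = U + V
print(U, V, total)
```

For a linear elastic body in equilibrium the strain energy equals half of the external work, so the total potential energy here should equal −h p² (l/2)² / (2E), which gives a simple sanity check for an implementation of the loss.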
The fully connected neural network defined in Eq. (18) is used, with the outputs replaced by the 2D displacement
fields (u_x , u_y ). We modified the outputs as follows,
$$
\begin{aligned}
\left(u'_x,\ u'_y\right) &= \mathcal{N}(x, y),\\
u_x &= u'_x \cdot x,\\
u_y &= u'_y \cdot y,
\end{aligned} \tag{38}
$$
where u'_x and u'_y are the outputs of the neural network N (x, y), and u_x and u_y are the modified outputs. The displacement
boundary conditions (u_x |x=0 = 0, u_y |y=0 = 0) are then satisfied automatically, which means that the loss term L_BCd is always
zero. This simplifies the calculation of the loss function (T ≡ 0 is automatically satisfied).
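The output modification of Eq. (38) can be illustrated with a small NumPy network. This is a sketch only: the weights are random and untrained, the layer sizes follow the 5-hidden-layer, 5-neuron architecture mentioned in the text, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [2, 5, 5, 5, 5, 5, 2]   # 2 inputs, 5 hidden layers of 5 neurons, 2 outputs
params = [(rng.normal(size=(m, n)), rng.normal(size=n))
          for m, n in zip(sizes[:-1], sizes[1:])]

def network(xy):
    """Plain tanh MLP N(x, y) with random, untrained weights."""
    a = xy
    for W, b in params[:-1]:
        a = np.tanh(a @ W + b)
    W, b = params[-1]
    return a @ W + b

def displacement(xy):
    """Apply Eq. (38): multiply the raw outputs by x and y."""
    raw = network(xy)
    ux = raw[:, 0] * xy[:, 0]   # vanishes on the edge x = 0
    uy = raw[:, 1] * xy[:, 1]   # vanishes on the edge y = 0
    return ux, uy

left = np.column_stack([np.zeros(5), np.linspace(0.0, 10.0, 5)])
bottom = np.column_stack([np.linspace(0.0, 10.0, 5), np.zeros(5)])
print(displacement(left)[0], displacement(bottom)[1])
```

Because the constraint holds for any weights, it is satisfied throughout training and not only at convergence.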
At the same time, we solved the problem using the finite element method with an extremely fine mesh size of 0.1 mm
(10,000 elements in total) in Abaqus/Standard. Since this is a simple mechanical problem, the results were regarded as
the exact solution for evaluating the accuracy of the machine learning methods.
We also applied the purely data-driven machine learning method for comparison with our physics-guided methods.
The training data were extracted from the FE simulation results with the spatial positions (x, y) as the inputs and the
displacement fields (ûx , ûy ) as the outputs. 10,000 samples were used for training, with a batch size of 128 and varied
Figure 5: Predicted horizontal displacement and longitudinal membrane force of FE simulation (a,d), and purely
data-driven models trained with different loss functions defined by: (b,e) mean square error of displacement field, (c,f)
mean square error of displacement and membrane force fields, under non-uniform stretching load. (For interpretation of
the references to color in this figure legend, the reader is referred to the web version of this article.)
learning rates ($10^{-3}$ for the initial 3,000 epochs, $10^{-4}$ for the subsequent 6,000 epochs, and $10^{-5}$ for another 3,000 epochs).
The hyperbolic tangent (tanh) function was used as the activation function. The loss function is constructed from the mean
square error (MSE) of the displacement field. Figure 5a and b compare the contour plots of the
displacement magnitude obtained by the FE simulation with the predictions of a trained 5-hidden-layer neural network (5 neurons
in each layer). The predicted longitudinal membrane force field was then obtained according to Eq. (11),
as shown in Figure 5d. The coefficients of determination ($R^2$ values) of the predicted outputs were calculated to
evaluate the accuracy of the global prediction of the displacement and membrane force fields (see Table 1). Although
the 5-hidden-layer network predicts both fields well globally, the membrane force prediction
is less accurate than the displacement prediction (see Figure 5d and e as well as the $R^2$ values in Table 1). This is mainly because
the membrane force is not directly included in the loss function and the differentiation used to calculate the membrane
force magnifies the error of the displacement prediction. We then trained the same network with the same procedure but
with a different loss function defined by the MSE of both the displacement and membrane force fields. The results
(Figure 5c and f) show that the accuracy of the membrane force prediction improves without affecting the accuracy of
the displacement prediction. Figure 6a and b respectively present the MSE of the displacement and membrane force versus
the training epochs. The final MSE of the membrane force decreases significantly when the network is trained with both
fields. To further visualize the local accuracy, we plotted the displacement and membrane force along the diagonal
line of the square plate. As shown in Figure 6c and d, the same trend can be seen.
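For reference, the coefficient of determination used as the global accuracy measure can be computed as in the sketch below. It assumes the standard definition $R^2 = 1 - SS_{res}/SS_{tot}$ with the FE values as the reference; the function name `r2_score` and the sample arrays are illustrative and are not data from the paper.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

# Made-up reference and predicted values, for illustration only.
u_ref = np.array([0.00, 0.05, 0.10, 0.15, 0.20])
u_pred = np.array([0.00, 0.05, 0.11, 0.15, 0.19])
print(r2_score(u_ref, u_pred))
```

A value of 1 indicates a perfect match, and a value of 0 corresponds to predicting the mean of the reference field everywhere.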
The above analysis gives a general idea of how well the ANN can predict the displacement field. For the models with
physics-guided loss functions, we use a network of the same size (5 layers with 5 neurons in each layer). We uniformly
sampled 10,000 data points within the solution area and 1,000 data points on each edge. The model is first trained
with the PDE-based loss function defined in Eq. (35), following the procedure described in Section 3. The hyperparameter
λs in Eq. (35), the weight of the BC residuals, is carefully tuned. Here, we show the results of two cases, λs = 1.0
and λs = 0.1, for comparison. The latter turns out to perform better (see the MSE in Figure 6 and the R2
values in Table 1). The predicted displacement and membrane force fields are shown in Figure 7c and d, where the
distribution is captured with high accuracy. The model is then trained with the energy-based loss function; the
accuracy of both the displacement and membrane force fields is similar to that of the PDE-based loss function. In terms
Figure 6: Mean square error of displacement (a) and membrane force (b) predictions. Predicted horizontal displacement
(c) and longitudinal membrane force (d) along the diagonal line.
Table 1: Coefficients of determination of the predicted displacement and membrane force fields under non-uniform
stretching load
Figure 7: Predicted horizontal displacement and longitudinal membrane force of FE simulation (a,b), PDE-based
model with λs = 0.1 (c,d), and energy-based model (e,f) under non-uniform stretching load. (For interpretation of the
references to color in this figure legend, the reader is referred to the web version of this article.)
of computational efficiency, the energy-based model takes much less time (20 minutes for the energy-based model and
160 minutes for the PDE-based model, on a 4-core Intel Core i5 CPU without GPU acceleration). This is largely because the
energy-based method does not require the calculation of high-order derivatives.
Discussion on sampling size. For the purely data-driven methods, the accuracy relies heavily on the quality and size of
the training data. To investigate the influence of the sampling size on the two physics-guided models (energy-based and
PDE-based), we trained the models with two smaller sampling sizes (1,000 and 200 samples within the domain and
100 and 20 on the edge, respectively), as shown in Figure 8b and c. The absolute error of the membrane force prediction
with different sampling sizes is shown in Figure 8d, e, and f for the PDE-based model and in Figure 8g, h, and i for the
energy-based model. The accuracy of the energy-based model decreases significantly with a smaller sample size, whereas the
PDE-based model retains a relatively high accuracy. This disadvantage of the energy-based model will be magnified in the next example, where
stress concentration exists in the specimen. The underlying mechanism will be further discussed in Sections 5.2 and 5.3.
The second application still focuses on in-plane loading, but we carefully investigate the effect of geometric
nonlinearity by introducing a hole at the center of the specimen. The governing equations are the same as Eq. (34).
The central hole leads to stress concentration at the edge of the hole, which makes this a tough loading case for the
machine learning method to solve. The problem is illustrated in Figure 4c: the hole has a diameter of 5 mm and sits at the center
of a 20 mm × 20 mm (l × l) square plate. A uniformly distributed stretching load (1 MPa) is applied at both ends. The
Figure 8: Absolute error of predicted longitudinal membrane force of PDE-based and energy-based models trained
with different size of training samples: (a, d, g) 10,000 samples, (b, e, h) 1,000 samples, (c, f, i) 200 samples. (For
interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
loading case is equivalent to the one-quarter model shown in Figure 4d, and the boundary conditions for the 4 straight
edges and 1 arc edge are listed as follows,
$$
u_x\big|_{x=0} = 0, \qquad u_y\big|_{y=0} = 0, \qquad N_{xx}\big|_{x=l/2} = p \cdot h, \qquad N_{xy}\big|_{x=l/2} = 0, \tag{39}
$$
Table 2: Coefficients of determination of the predicted displacement and membrane force fields for central-hole tension
the membrane force field. When the model is trained only with the displacement field, the error in the membrane force
prediction is significant, especially around the central hole; when trained with both fields, the accuracy can be greatly
improved. There are two important implications: 1) the central hole introduces stronger nonlinearity compared with the
previous case, making the modeling more difficult; 2) a 5-layer network can well describe both fields, but the fitting
will not be satisfactory unless the model is trained properly.
We then trained the same 5-hidden-layer network with the physics-guided loss functions, one based on PDEs and
the other based on energy. The same training procedure as in the previous case was used, and λs = 0.1 was set for the
PDE-based loss function. We found that both models can qualitatively predict the distribution of the displacement field
(see Figure 9d, e). The stress concentration phenomenon is also captured, although the concentration factor may
not be. The predicted distributions of the longitudinal membrane force are plotted in Figure 10d, e. The prediction of the
energy-based model is the closest to the FE simulation result and significantly better than that of the PDE-based model. It should be
noted that, as demonstrated in the previous case, the PDE-based method can potentially achieve an accuracy similar to that of
the energy-based method by tuning the hyperparameters (training epochs, learning rates, sampling size, batch size,
weights in the loss function, etc.). In practice, however, it is difficult to find the optimum hyperparameters efficiently.
Attempts were made by increasing the sample size to an extremely large number (40,000), by tuning the weights in the
loss function, by adjusting the batch size from 64 to 512, and by enlarging the network to 20 layers (20 neurons in each
layer). However, none of these attempts returned satisfactory results. How to optimize the
hyperparameters for a strongly nonlinear problem remains an open question. Here, the unsatisfactory predictions are reported to show this weakness
of the PDE-based approach.
A quantitative comparison between the physics-guided models and the FE results is performed by plotting the magnitude
of N_xx along the edge of the central hole in polar coordinates, as shown in Figure 11. It should be noted that
a closed-form analytical solution exists for the uniaxial tension of an infinitely large plate with a central hole [42], which
predicts $N_{xx} = p\,(1 - 2\cos 2\varphi)\sin^2\varphi$. This analytical model is also plotted. At the same time, the R2 values are
summarized in Table 2. It is found that the error of the energy-based prediction around the stress concentration area is still notable
compared with the FE simulation and the closed-form solution, although the predicted global
distribution is qualitatively correct.
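The closed-form expression quoted above is easy to evaluate along the hole edge. The sketch below assumes that φ is measured from the loading direction and uses p = 1 as a normalized load; the function name is hypothetical. It is only a reference curve for an infinite plate, so a finite 20 mm plate with a 5 mm hole is expected to deviate from it.

```python
import numpy as np

def nxx_hole_edge(phi, p=1.0):
    """N_xx on the hole edge of an infinite plate, as quoted in the text."""
    return p * (1.0 - 2.0 * np.cos(2.0 * phi)) * np.sin(phi) ** 2

phi = np.linspace(0.0, np.pi / 2, 91)   # one quarter of the hole edge, 1-degree steps
nxx = nxx_hole_edge(phi)
peak = nxx.max()                        # attained at phi = pi/2
print(peak, np.degrees(phi[np.argmax(nxx)]))
```

The peak value of 3p at φ = π/2 is the classical stress concentration factor of 3 for a circular hole under uniaxial tension.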
To further improve the accuracy, we looked into the sampling strategy, which the first in-plane
tension example revealed to be important for the numerical integration of the potential energy. First, it was found that the
sampling size should be large enough to capture the edge of the central hole. As shown in Figure 12a and b, a small
sample size cannot capture the edge of the central hole well, whereas 10,000 samples turned out to be sufficient for this example.
Second, we proposed a sampling strategy with local refinement. Given that the area of stress concentration and high energy
density is located near the central hole, we divided the whole domain into two regions (indicated by red and blue in
Figure 12c). The region around the central hole is sampled at a higher density (local refinement). We
performed the numerical integration within each region, and the total potential energy is the sum over the two regions.
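A two-region integration of this kind can be sketched as follows. The sketch rests on several assumptions that are not taken from the paper: the refined region is a quarter annulus a ≤ r ≤ R around the hole with R = 5 mm, each region is integrated as (region area × sample mean), and the integrand `f` is a simple test function with a known integral instead of the energy density.

```python
import numpy as np

rng = np.random.default_rng(0)
half_l, a = 10.0, 2.5    # quarter-plate side and hole radius (mm)
R = 5.0                  # outer radius of the refined region (assumed)

def sample_near_hole(n):
    """Uniform points in the quarter annulus a <= r <= R."""
    r = np.sqrt(rng.uniform(a**2, R**2, n))   # sqrt keeps the area density uniform
    t = rng.uniform(0.0, np.pi / 2, n)
    return np.column_stack([r * np.cos(t), r * np.sin(t)])

def sample_far_field(n):
    """Uniform points in the quarter plate with r > R, by rejection sampling."""
    out = np.empty((0, 2))
    while len(out) < n:
        c = rng.uniform(0.0, half_l, size=(2 * n, 2))
        out = np.vstack([out, c[np.hypot(c[:, 0], c[:, 1]) > R]])
    return out[:n]

area_near = np.pi * (R**2 - a**2) / 4
area_far = half_l**2 - np.pi * R**2 / 4

def integrate(f, n_near=5_000, n_far=5_000):
    """Sum of the per-region Monte Carlo integrals."""
    near, far = sample_near_hole(n_near), sample_far_field(n_far)
    return area_near * f(near).mean() + area_far * f(far).mean()

def f(pts):                      # test integrand x + y
    return pts[:, 0] + pts[:, 1]

exact = 1000.0 - 2.0 * a**3 / 3.0   # integral of x + y over the plate minus the hole
estimate = integrate(f)
print(estimate, exact)
```

With equal sample counts, the point density near the hole is several times higher than in the far field because the refined region is much smaller, while the area weights keep the total integral consistent.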
It should also be pointed out that even the energy-based method does not perfectly agree with the FE result. However,
for such a nonlinear problem with stress concentration, the FE simulations cannot be fully trusted either. We expect a
more persuasive comparison with experimental data in future studies. Here, to further scrutinize the energy-based
method, we applied it to three more cases with central holes of different sizes and shapes. In addition, an even more
complicated case of a three-hole plate was studied. All the results are shown in Figure 13. As expected, the stress
concentration factor increases as the aspect ratio (ly /lx ) of the central hole increases. The energy-based model
correctly captures this trend and also gives reasonable predictions of the full stress field. There are still
local deviations between the energy-based model and the FE simulation, but a prediction accuracy of over 95%
should already meet the requirements of most engineering applications.
Figure 9: Horizontal displacement field prediction of FE simulation (a), purely data-driven models with two different
loss functions defined by: (b) mean square error of displacement field, (c) mean square error of displacement and
membrane force fields, PDE-based model (d), and energy-based models trained with uniform sampling (e) and sampling
with local refinement (f), for the central-hole tension. (For interpretation of the references to color in this figure legend,
the reader is referred to the web version of this article.)
The above two examples served as strict comparisons of the purely data-driven approach and the two models with
PDE-based and energy-based loss functions, using FE simulations as the reference. It has been demonstrated
that both the PDE-based and the energy-based models can provide high-accuracy predictions that are close to
the FE results, but the PDE-based model relies heavily on the optimization of the hyperparameters. In the following two
examples, we also compared the three ANN methods and reached the same conclusions. Therefore, the details of
the comparison are not presented for conciseness. Instead, we focus on the energy-based method while still
using the FE simulation as the reference, although it is not the exact solution.
In this example, we consider the deflection of a square plate under uniform out-of-plane pressure. Unlike the
previous two, this case involves both in-plane and out-of-plane deformation. As illustrated in Figure 14a, a uniformly
distributed transverse pressure of 10 Pa is applied on a 100 mm × 100 mm square plate whose four edges are clamped.
The Young’s modulus and Poisson’s ratio of the plate are set to 70 MPa and 0.3, respectively. The governing PDEs are
listed in Eq. (10). The boundary conditions are
$$
u_x = 0, \quad u_y = 0, \quad w = 0, \quad \frac{\partial w}{\partial n} = 0 \qquad (\text{at } x = \pm 50 \text{ or } y = \pm 50). \tag{40}
$$
FE simulations with an element size of 0.1 mm are performed in Abaqus/Standard with shell elements (4-node doubly
curved thin shell elements with reduced integration, hourglass control, and finite membrane strains). A 5-hidden-layer
neural network (5 neurons in each layer) is trained with the energy-based loss function, and the comparison of the predicted
deflection field with the FE simulation is shown in Figure 14b-e. A quantitative comparison of the distribution along the
central line is shown in Figure 14f. It is found that the energy-based algorithm still provides a satisfactory prediction
of the out-of-plane displacement. It should be noted that the small deviation between the energy-based method and the
FE simulation cannot be fully ascribed to the computational error of the former. This is because our physics-guided
neural network framework for elastic plates is constructed to implement the classic plate theory based on the Kirchhoff
Figure 10: Longitudinal membrane force field prediction of FE simulation (a), purely data-driven models with two
different loss functions defined by: (b) mean square error of displacement field, (c) mean square error of displacement
and membrane force fields, PDE-based model (d), and energy-based models trained with uniform sampling (e) and
sampling with local refinement (f), for the central-hole tension. (For interpretation of the references to color in this
figure legend, the reader is referred to the web version of this article.)
Figure 11: Predicted horizontal displacement (a) and longitudinal membrane force (b) of different models around the
central hole.
Figure 12: Illustration of different sampling size and strategy: (a) uniform sampling with small size and (b) large size,
(c) sampling with local refinement (For interpretation of the references to color in this figure legend, the reader is
referred to the web version of this article.)
hypotheses. In the FE simulations, the governing equations of the shell elements are slightly different and depend
on the integration algorithms and element size. For a stricter comparison, we could either employ experimental data or
develop finite element simulations based on the same classic plate theory.
In the last example, we investigate the buckling of the same plate as the third example under in-plane compressive
loads. Figure 15a and b show the two different boundary conditions that were studied. One has a simply-supported left
edge (ux = 0, w = 0, Nyy = 0, Myy = 0 at x = −50), and the other clamped (ux = 0, w = 0, Nyy = 0, ∂w/∂x = 0
at x = −50). In both cases, there is no out-of-plane load. Therefore, trivial solutions that only involve the in-plane
deformation (i.e. no out-of-plane deflection, w = 0) exist because the trivial solutions always satisfy the out-of-plane
governing equation. The deformation of the plate follows the trivial solutions when the load is sufficiently small, but as
the load increases, there is a point where the plate will bifurcate into a more stable configuration (with lower potential
energy) in a buckled shape. In plate theory, the first buckling mode is usually determined by seeking the lowest total
potential energy. Therefore, we applied the energy-based model to predict it. The PDE-based loss function is not
suitable for the buckling analysis since it inevitably converges to the trivial in-plane solution.
For the neural network algorithm, a 5-hidden-layer neural network (5 neurons in each layer) with the energy-based
loss function was constructed. The FE simulations were performed in Abaqus/Standard to obtain the first buckling mode.
Modal analysis was first conducted to obtain the different buckling modes. The first buckling mode configuration was
then introduced as a geometric imperfection with a maximum transverse deviation of 0.01 mm, and the model with the
imperfection was used to simulate the in-plane compression with the implicit solver. The buckled configurations predicted
by the neural network algorithm for the two cases are shown in Figure 15c and d, respectively. In addition, Figure 15e
and f respectively compare the deflection of the central line with the FE simulations. We can see that the buckling
configurations under the two different boundary conditions are both correctly predicted.
The machine learning algorithms are designed to find the global minimum, which corresponds to the first buckling mode
in the studied case. An interesting question is whether the energy-based model always converges to the global minimum or
may instead find local minima corresponding to higher buckling modes. We initialized the neural network to three different
buckling modes: the first mode and two higher modes (Figure 16a). This can be realized by pre-training the
network to fit the initial configurations shown in Figure 16b. Figure 16c presents the predicted deflections, where we
can see that the final deformed profiles are almost the same regardless of the initial configuration. These results serve
as a validation that the optimization algorithm used in this study can find the global minimum. They also
suggest that the energy-based approach is not able to find the higher modes.
5 Discussions
5.1 Comparison through the four loading conditions
In this study, we chose four typical loading conditions to validate the physics-guided neural network framework that
we developed. Although they are all simple in terms of loading and boundary conditions, they together provided a
Figure 13: Comparison of FE simulation and energy-based model results for different shape and size central-hole
tension: (a) lx = ly = 4 mm, (b) lx = 2 mm, ly = 4 mm, (c) lx = 4 mm, ly = 2 mm, and (d) three holes. The
membrane force fields predicted by the FE simulation of the four cases are respectively shown in (e), (f), (g), and (h).
The predictions of the energy-based model are respectively shown in (i), (j), (k), and (l). (For interpretation of the
references to color in this figure legend, the reader is referred to the web version of this article.)
Figure 14: Loading case of plate deflection under out-of-plane pressure (a). The predicted 3D configuration and
2D out-of-plane displacement field after deflection are shown in (b) and (d) for the FE simulation and in (c) and (e) for the
energy-based model. The deflection of the central line is compared in (f). (For interpretation of the references to color in
this figure legend, the reader is referred to the web version of this article.)
Figure 15: Prediction of the buckling of elastic plates with two different types of constraints: simply supported (a) and
clamped (b). The buckled configurations are shown in (c) and (d). The deflections of the central line are compared with
FE simulations (e) and (f). (For interpretation of the references to color in this figure legend, the reader is referred to the
web version of this article.)
Figure 16: (a) Three different buckling modes under in-plane compression. Mode 1 corresponds to the lowest potential
energy. (b) Deflection profiles of the three buckling modes for initializing the neural network. (c) Prediction of deflection
when initialized with different deformation mode. The neural network is pre-trained to fit the initial deflection profiles.
rather comprehensive investigation of the accuracy of different approaches. The first example involved only in-plane
deformation, but a non-uniform stretching force was applied. Among the four examples, this may be the simplest task
for modeling. However, the differences between the data-driven approach and the physics-guided approaches and
between the PDE-based and energy-based approaches were already clear. The second example was also in-plane, but
geometrical nonlinearity was generated by introducing a circular hole at the center of the plate. Different aspect ratios
of the central hole were investigated to obtain a wide range of the stress concentration factor, and a three-hole plate was
solved to push the computational framework to its limit. We observed that there is a small difference between the FE
simulation and energy-based model. However, it is promising to find that the accuracy of the energy-based method
did not decrease as the stress concentration factor increased. In other words, this method is stable. The third example
involved a pressure in the z-direction so that the out-of-plane governing equation could no longer be neglected. The
energy-based method still provided a satisfactory prediction. The last example was more challenging due to instability.
No out-of-plane load was applied, but out-of-plane deformation occurred through buckling. The energy-based method
showed a great advantage over its PDE-based counterpart because the latter always converged to the trivial solution
with w = 0. Therefore, these four examples covered almost all the important aspects of plate deformation.
The major task of the present study is to compare the PDE-based and energy-based approaches of formulating the loss
function. Although there is no absolute conclusion about which is better, it is clear that these two approaches have
their own pros and cons. In this sub-section, we summarize them in three aspects: hyperparameters, sampling, and
computational efficiency.
Hyperparameters – The PDE-based loss function involves a larger number of hyperparameters than the energy-based
does. This is a clear disadvantage of the PDE-based approach, although it was found through the first example of
in-plane tension that the two approaches could achieve a similar accuracy as long as the hyperparameters could be
determined properly. In the second in-plane example, we showed that it is difficult to find the optimum hyperparameters
and that the prediction of the PDE-based approach could fail to match the results if the non-optimum hyperparameters
are used.
Sampling – The energy-based approach depends significantly more on the size and resolution of the training samples
than the PDE-based approach, which is an important weakness. This is because the energy-based approach has to
perform the numerical integration of the total potential energy, which relies on the discretization of the domain. In this
sense, the energy-based approach is a “mesh sensitive” tool, although the concept of meshing is not explicitly stated. In
the second in-plane example, the sampling strategy is close to a meshing step in conventional FE simulations. The
difference is that there is no strict requirement for the “mesh” quality.
Computational efficiency – To achieve the same accuracy that is sufficiently high to approximate the exact solution,
the energy-based approach turns out to be more efficient. Here, the efficiency not only refers to the computation time
that the algorithm takes to converge but also includes the time of the user to tune the hyperparameters.
5.3 Fundamental differences between the PDE-based and energy-based loss functions
There are two prominent differences between the PDE-based and energy-based loss functions. The first lies in the
order of the partial derivatives involved. The energy-based loss function deals with the strain components, which are
functions of the first derivatives of the displacement field. By constructing the neural network to directly output the
strain components, for example in the first example of in-plane tension, it is possible to avoid additional computational
errors arising from the differentiation process. In contrast, the PDE-based loss function has to include the residuals of
the PDEs and BCs simultaneously. Therefore, it is almost impossible to reduce the order of the equations by computational
treatments, and consequently a number of differentiation operations have to be performed in the algorithm, accumulating
computational error. The second difference between the PDE-based and energy-based loss functions is the number of
residual terms. The complete PDE-based loss function has to sum up a total of eleven residuals coming
from three equations and eight boundary conditions. The energy-based loss function, however, deals with only one residual by introducing
the penalty term into the total energy. Even if the penalty term is considered an independent quantity, there are still only
two residuals to sum up. This is a big simplification. As a result of these two aspects, the PDE-based approach is
less computationally efficient.
There are more fundamental underlying mechanisms behind these two computational differences. As already pointed
out above, the energy approach and the PDE governing equations are mathematically equivalent. The energy-based loss
(Eq. (24)) and the PDE-based loss (Eq. (20)) are respectively formulated following these two approaches and, therefore,
should also be equivalent. However, the equivalence can only be achieved when the weight ratios of the PDE-based
loss, λs and λd, can be determined in advance to have the physical meanings of displacement and force, respectively. In
other words, λs and LBcs, and λd and LBcd, should be two pairs of conjugate variables in terms of potential energy. In
practice, this is not feasible, not only because such values of λs and λd are difficult to calculate but also because they
are usually not uniform over all boundaries. In a practical application of neural network-based computational frameworks,
λs and λd are usually chosen by the user through a careful tuning procedure. Our study suggests that physics could guide
the determination of the hyperparameters, particularly the weights of the residuals. This is an important future
topic that is worth a comprehensive investigation.
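The role of the weights can be seen in a small worked case. The sketch below is a hypothetical 1D bar with a deliberately poor one-parameter trial function, chosen so that the PDE residual and the force BC cannot both vanish; the single weight `lam` only stands in for the role played by λs and λd, and none of the names come from the framework of this study. Under these assumptions the minimizer of the weighted residual loss moves with `lam`, while the energy minimizer involves no weight at all.

```python
# Hypothetical 1D bar: EA*u'' + q = 0, u(0) = 0, EA*u'(L) = P.
# Trial u(x) = b*x**2 cannot satisfy the PDE and the force BC at the same time.
EA, L, P, q = 1.0, 1.0, 1.0, 1.0

def pde_loss(b, lam):
    # Integrated squared PDE residual plus weighted squared force-BC residual.
    return L * (2.0 * EA * b + q) ** 2 + lam * (2.0 * EA * b * L - P) ** 2

def b_pde(lam):
    # Closed-form minimizer of pde_loss with respect to b.
    return (lam * P - q) / (2.0 * EA * (1.0 + lam * L))

def b_energy():
    # Minimizer of the potential energy
    # (2/3)*EA*b**2*L**3 - b*(q*L**3/3 + P*L**2); no weight appears.
    return (q * L + 3.0 * P) / (4.0 * EA * L)

for lam in (0.1, 1.0, 10.0):
    print(f"lam = {lam:5.1f}  ->  b_pde = {b_pde(lam):+.4f}")
print(f"energy-based  ->  b = {b_energy():+.4f}")
```

Neither answer is accurate here because the trial function is so restrictive; the point is only that the residual-based result depends on a user-chosen weight.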
We have seen that for loading cases with high nonlinearity there is still a relatively large deviation of the local predictions
between the energy-based neural network model and the FE results. Although we have noted that FE results are not
necessarily the exact solution and that it is unfair to attribute the deviation only to the computational error of the neural
network-based algorithms, it is still necessary to point out the limitations of the proposed computational framework.
The first limitation is indicated by its name: the accuracy and the applicability of this physics-guided computational
framework are largely determined by the physical laws that the user chooses to implement. In our study, the physical
law is the classic plate theory. It is based on the strong Kirchhoff hypotheses, which cease to apply to
moderately thick plates. The framework was developed based on these hypotheses and, therefore, inherits their limitations.
The second limitation stems from the basic, shallow neural network we have used. Its capability to
approximate a highly non-uniform displacement or strain field is limited due to its simplicity. A deep neural network
that involves a larger number of hidden layers is likely to increase accuracy. In the present study, we focused on the
implementation of physics into machine learning algorithms and did not try a deeper neural network due to
limited computational resources. Another approach to improve the approximation ability is to modify the structure
of the neural network or to seek other machine learning models. For example, Wang and Zhang [12] added extra connections
between non-adjacent layers to improve the approximation ability of the neural network.
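As an illustration of what such extra connections can look like, the following is a minimal sketch of a generic residual-style shortcut, in which each hidden layer adds its output to its input. It is an assumption made for illustration and not necessarily the architecture of [12]; the function names `dense` and `forward_with_skip` are hypothetical.

```python
import math

def dense(x, W, b):
    # Fully connected layer with tanh activation: tanh(W x + b).
    return [math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, b)]

def forward_with_skip(x, layers):
    # Each hidden layer adds its output to its input, which creates a path
    # that bypasses the layer; all layers must therefore keep the same width.
    h = list(x)
    for W, b in layers:
        h = [h_i + y_i for h_i, y_i in zip(h, dense(h, W, b))]
    return h

# Two hidden layers of width 2 with arbitrary illustrative weights.
layers = [
    ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1]),
    ([[-0.4, 0.6], [0.2, 0.2]], [0.05, 0.0]),
]
print(forward_with_skip([0.3, -0.7], layers))
```

With all weights and biases set to zero the network reduces to the identity map, which is one reason such shortcuts tend to make deeper networks easier to train.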
A Physics-Guided Neural Network Framework for Elastic Plates: Comparison of Governing Equations-Based and
Energy-Based Approaches A P REPRINT
5.5 Special challenges for applications of neural network-based algorithms in predicting the mechanical
responses of solids
As mentioned in the literature survey in the introductory section, many initial successes of PGML or PINN algorithms
have been achieved in modeling the dynamics of fluids as well as the mass and heat transfer. To model the mechanical
responses of solids, a special challenge is that the variables of interest (stresses and strains) are tensorial. By
comparison, in many fluid problems only the pressure is of interest. Consequently, more PDEs and BCs have to be
implemented in the loss function, which leads to the lower accuracy of the PDE-based algorithms mentioned in Section 5.4.
This point becomes very clear in our present study: the governing equations are established in three directions, and
each edge of the plate has four pairs of conjugate variables as BCs. It is also worth noting that our present study only
considered elastic plates. Implementing plasticity theories, even in the simplest model, introduces more intermediate
state variables and therefore more PDEs and ordinary differential equations (ODEs) to be solved. This is one more
special challenge that neural network-based algorithms face in predicting the deformation of solids.
As a pilot study, we demonstrated that the energy-based neural network framework can provide a satisfactory prediction
of the mechanical response of elastic plates, one that is comparable to the FE simulation results. This seemingly simple
conclusion may lead to the contribution of this work being underrated. The conclusion can be generalized: for a system that
is governed by a large number of PDEs and BCs, if the principle of minimum potential energy is applicable, a
machine learning algorithm designed to minimize the potential energy will be more effective and efficient than one that
directly minimizes the total sum of all the residuals stemming from the PDEs and BCs. One potential extension of our neural
network framework is modeling the thermodynamics of materials, which is also based on energy indicators.
While there is no evidence that the current accuracy of the framework is better than that of FE simulations,
we expect that the neural network framework will show clearer advantages over FE simulations as the complexity
of the system increases. One such potential application is the modeling of multiphysics and multiscale
systems, with lithium-ion batteries as a typical example. It is well known that FE simulations of these systems often
struggle to satisfy stringent convergence criteria. Energy-based models are a promising way to relax the
convergence requirement and provide an approximate solution for practical applications. The mechanism behind this is a
tradeoff between modeling accuracy and computational feasibility.
6 Conclusion
In this study, we established a physics-guided neural network-based computational framework to predict the mechanical
responses of elastic plates. The physical laws that were implemented into the algorithm were from the classic plate
theory derived following the Kirchhoff hypotheses. The governing PDEs are the well-known FvK equations, which
can be derived from the principle of virtual displacement. In our computational framework, a neural network was
constructed to output the displacement fields (or strain fields in some cases) with the input of spatial coordinates.
Three different ways of formulating the loss function were investigated. One was purely data-driven by comparing
the predicted displacement field with the observed one from tests or FE simulations. The other two were based on the
physical laws. The PDE-based loss function was the total sum of all the residuals stemming from the PDEs and BCs,
and the energy-based loss function simply used the total potential energy. The computational framework that we developed
was then applied to four different types of loading conditions, including 1) the in-plane tension with non-uniformly
distributed stretching force to study the effect of the nonlinearity from external loads, 2) the in-plane central-hole
tension to investigate the nonlinearity from geometric imperfections, 3) the out-of-plane deflection to examine the
capability of modeling the out-of-plane deformation, and 4) the buckling induced by uniaxial compression to validate
the algorithm on instability analysis. In all the four cases, FE simulations with an extremely fine mesh size were
performed as references. Through these validations and comparisons, the following conclusions can be drawn.
1) Both the PDE-based and the energy-based neural network algorithms developed in this study can approximately
predict the mechanical response of elastic plates with a satisfactory accuracy that is close to the FE
simulations if the hyperparameters are properly tuned.
2) The advantage of the model with the energy-based loss function is that it has a small number of hyperparameters
and is computationally more efficient. Its disadvantage is that it relies on a large sampling size and a fine
sampling resolution.
3) The model with the PDE-based loss function has an advantage over the energy-based model in that it is less
dependent on sampling and has the potential to be “mesh-free”.
4) In order to achieve good accuracy, it is suggested that the purely data-driven approach be trained with data from
both the displacement and membrane force fields.
5) The fundamental differences between the energy-based and PDE-based approaches largely stem from the
determination of the weights in the loss function and the calculation of the derivatives. Deciphering the
relationship between the weights and their physical meanings can potentially improve the PDE-based approach.
It is optimistically expected that our energy-based neural network framework will have a wide spectrum of applications in
future studies. Particularly, it provides an important energy-optimization inspiration for modeling complex engineering
systems involving multiple scales and multiple physics.
A Appendix
Following the Kirchhoff hypotheses, the displacement field can be expressed as:
where $u_\alpha$ ($\alpha = x, y$) and $w$ denote the in-plane and out-of-plane displacements, respectively, and $u^0_\alpha$
represents the corresponding displacement at the mid-plane (i.e., $u^0_\alpha(x, y) = u_\alpha(x, y, 0)$).
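The general three-dimensional second-order nonlinear Green strains are
$$
\varepsilon_{\alpha\beta} = \frac{1}{2}\left(u_{\alpha,\beta} + u_{\beta,\alpha} + u_{\gamma,\alpha} u_{\gamma,\beta}\right), \quad (\alpha, \beta, \gamma = x, y, z). \tag{A.2}
$$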
We consider the moderate deformation of a plate, meaning that the transverse (i.e., out-of-plane) displacement gradients
$u_{z,x} = w_{,x}$ and $u_{z,y} = w_{,y}$ can be relatively large, whereas the in-plane displacement gradients $u_{\alpha,\beta}$
($\alpha, \beta = x, y$) are small owing to the large width and length. The second-order terms in the Green strains can
therefore be omitted, except for the products $w_{,\alpha} w_{,\beta}$ ($\alpha, \beta = x, y$).
Substituting Eq. (A.1) into the above equation, the strains simplify to the following strains of the
2D plate theory,
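$$
\begin{aligned}
\varepsilon_{\alpha\beta} &= \frac{1}{2}\left(u^0_{\alpha,\beta} + u^0_{\beta,\alpha} + w_{,\alpha} w_{,\beta}\right) - z\, w_{,\alpha\beta}, \quad (\alpha, \beta = x, y), \\
\varepsilon_{\gamma 3} &= 0, \quad (\gamma = x, y, z).
\end{aligned}
\tag{A.3}
$$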
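As a worked check of this simplification, the steps below assume the standard Kirchhoff kinematics, taken here as the form of Eq. (A.1); this is an assumption stated for illustration, chosen because it reproduces the $-z\, w_{,\alpha\beta}$ term of Eq. (A.3). Products involving the small in-plane gradients are dropped, as stated above.

```latex
% Assumed Kirchhoff kinematics (standard form, taken as the content of Eq. (A.1))
u_\alpha(x, y, z) = u^0_\alpha(x, y) - z\, w_{,\alpha}(x, y), \qquad u_z(x, y, z) = w(x, y).

% Displacement gradients that follow from this assumption
u_{\alpha,\beta} = u^0_{\alpha,\beta} - z\, w_{,\alpha\beta}, \qquad
u_{z,\alpha} = w_{,\alpha}, \qquad
u_{\alpha,z} = -w_{,\alpha}.

% In-plane strains from Eq. (A.2), keeping only the quadratic term in w
\varepsilon_{\alpha\beta}
  \approx \tfrac{1}{2}\left(u_{\alpha,\beta} + u_{\beta,\alpha} + u_{z,\alpha} u_{z,\beta}\right)
  = \tfrac{1}{2}\left(u^0_{\alpha,\beta} + u^0_{\beta,\alpha} + w_{,\alpha} w_{,\beta}\right) - z\, w_{,\alpha\beta}.

% Transverse shear: the linear part vanishes identically
2\,\varepsilon_{\alpha z} \approx u_{\alpha,z} + u_{z,\alpha} = -w_{,\alpha} + w_{,\alpha} = 0.
```

The symmetry $w_{,\alpha\beta} = w_{,\beta\alpha}$ is what combines the two bending contributions into the single term $-z\, w_{,\alpha\beta}$.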
The virtual strains are calculated from the virtual displacements according to Eq. (2) and Eq. (3). For the first term in
Eq. (5), applying integration by parts, we have
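$$
\begin{aligned}
\int_\Omega N_{\alpha\beta}\,\delta\varepsilon^0_{\alpha\beta}\,dx\,dy
&= \int_\Omega \frac{1}{2} N_{\alpha\beta}\left(\delta u^0_{\alpha,\beta} + \delta u^0_{\beta,\alpha} + \delta w_{,\alpha} w_{,\beta} + w_{,\alpha}\,\delta w_{,\beta}\right) dx\,dy \\
&= \int_\Omega N_{\alpha\beta}\left(\delta u^0_{\alpha,\beta} + \delta w_{,\alpha} w_{,\beta}\right) dx\,dy \\
&= \int_\Gamma \left(N_{\alpha\beta}\,\delta u^0_\alpha\, n_\beta + N_{\alpha\beta}\, w_{,\beta}\,\delta w\, n_\alpha\right) ds
 - \int_\Omega \left[N_{\alpha\beta,\beta}\,\delta u^0_\alpha + \left(N_{\alpha\beta} w_{,\beta}\right)_{,\alpha}\delta w\right] dx\,dy,
\end{aligned}
\tag{A.4}
$$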
where integration by parts is applied to obtain the displacement variation instead of its gradient, and
$\mathbf{n} = n_x \mathbf{e}_x + n_y \mathbf{e}_y$ is the outward normal on the boundary ($n_x$ and $n_y$ are the direction
cosines of the unit normal). For the second term, we have
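$$
\begin{aligned}
\int_\Omega M_{\alpha\beta}\,\delta\kappa_{\alpha\beta}\,dx\,dy
&= -\int_\Omega M_{\alpha\beta}\,\delta w_{,\alpha\beta}\,dx\,dy \\
&= -\int_\Gamma M_{\alpha\beta}\,\delta w_{,\alpha}\, n_\beta\, ds + \int_\Omega M_{\alpha\beta,\beta}\,\delta w_{,\alpha}\,dx\,dy \\
&= -\int_\Gamma M_{\alpha\beta}\,\delta w_{,\alpha}\, n_\beta\, ds + \int_\Gamma M_{\alpha\beta,\beta}\,\delta w\, n_\alpha\, ds - \int_\Omega M_{\alpha\beta,\alpha\beta}\,\delta w\,dx\,dy.
\end{aligned}
\tag{A.5}
$$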
We perform a coordinate transformation between the global Cartesian (x, y, z) coordinate and the local Cartesian
coordinate (n, s, r) (see Fig. 1),
where θ is the angle between the global x axis and the local n axis along the counterclockwise direction. The
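displacements and stresses under the two coordinates are related by
$$
\begin{bmatrix} \sigma_{nn} \\ \sigma_{ns} \end{bmatrix}
=
\begin{bmatrix} n_x^2 & n_y^2 & 2 n_x n_y \\ -n_x n_y & n_x n_y & n_x^2 - n_y^2 \end{bmatrix}
\begin{bmatrix} \sigma_{xx} \\ \sigma_{yy} \\ \sigma_{xy} \end{bmatrix}. \tag{A.8}
$$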
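The stress transformation in Eq. (A.8) can be checked numerically. The following is a minimal sketch, assuming that the direction cosines are $n_x = \cos\theta$ and $n_y = \sin\theta$ with $\theta$ measured counterclockwise from the global $x$ axis to the local $n$ axis; the function name `to_local_stress` is hypothetical.

```python
import math

def to_local_stress(sxx, syy, sxy, theta):
    # Direction cosines of the outward normal n; theta is measured
    # counterclockwise from the global x axis to the local n axis.
    nx, ny = math.cos(theta), math.sin(theta)
    s_nn = nx * nx * sxx + ny * ny * syy + 2.0 * nx * ny * sxy
    s_ns = -nx * ny * sxx + nx * ny * syy + (nx * nx - ny * ny) * sxy
    return s_nn, s_ns

# Pure shear seen from a boundary whose normal is rotated by 45 degrees.
print(to_local_stress(0.0, 0.0, 1.0, math.pi / 4.0))
```

For $\theta = 0$ the local components reduce to $(\sigma_{xx}, \sigma_{xy})$, and for pure shear at $\theta = 45^\circ$ the shear component vanishes while the normal component equals the applied shear, as expected from Mohr's circle.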
According to the above relations, the stress boundary integrands in Eq. (9) can be rewritten with quantities under the
local coordinate,
and
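$$
-\int_{\Gamma_\sigma} M_{ns}\,\frac{\partial \delta w}{\partial s}\, ds
= -\left[M_{ns}\,\delta w\right]_{\Gamma_\sigma} + \int_{\Gamma_\sigma} \frac{\partial M_{ns}}{\partial s}\,\delta w\, ds. \tag{A.11}
$$
$\left[M_{ns}\,\delta w\right]_{\Gamma_\sigma}$ is zero when the stress boundary is closed or $M_{ns} = 0$, which leads to Eq. (14).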
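The vanishing of the boundary term on a closed boundary can be verified numerically. The sketch below uses made-up periodic test functions for $M_{ns}$ and $\delta w$ along a closed boundary of length $2\pi$; these functions and the name `closed_boundary_integrals` are assumptions for illustration only. When the bracketed term is zero, the integral of $M_{ns}\,\partial\delta w/\partial s$ and the integral of $(\partial M_{ns}/\partial s)\,\delta w$ are equal and opposite.

```python
import math

def closed_boundary_integrals(n=2000):
    # Arc length s runs once around a closed boundary of length 2*pi, so
    # every field is periodic in s. M_ns and delta_w are made-up test functions.
    ds = 2.0 * math.pi / n
    lhs = 0.0  # integral of M_ns * d(delta_w)/ds
    rhs = 0.0  # integral of d(M_ns)/ds * delta_w
    for i in range(n):
        s = i * ds
        m_ns = 2.0 + math.cos(s)
        dm_ns = -math.sin(s)
        dw = math.sin(s) + 0.5 * math.cos(2.0 * s)
        ddw = math.cos(s) - math.sin(2.0 * s)
        lhs += m_ns * ddw * ds
        rhs += dm_ns * dw * ds
    return lhs, rhs

lhs, rhs = closed_boundary_integrals()
print(lhs, rhs, lhs + rhs)
```

The equally spaced sum is used because it is very accurate for smooth periodic integrands, so the cancellation is visible to near machine precision.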
Acknowledgment
J.Z. and W.L. are grateful for the support from AVL, Hyundai, Murata, Tesla, Toyota North America,
Volkswagen/Audi/Porsche, and other industrial partners through the MIT Industrial Battery Consortium. M.Z.B. is grateful
for the support from the Toyota Research Institute through the D3BATT Center on Data-Driven-Design of Rechargeable
Batteries. Thanks are also due to the MIT-Indonesia Seed Fund for supporting J.Z.’s postdoctoral study.
References
[1] M. Egmont-Petersen, D. de Ridder, and H. Handels. “Image processing with neural networks- A review”. In:
Pattern Recognition (2002). ISSN: 00313203. DOI: 10.1016/S0031-3203(01)00178-9.
[2] W. Rawat and Z. Wang. “Deep convolutional neural networks for image classification: A comprehensive review”.
In: Neural Computation (2017). ISSN: 1530888X. DOI: 10.1162/NECO_a_00990.
[3] R. M. French. “Introduction to Neural and Cognitive Modeling”. In: Biological Psychology (2002). ISSN:
03010511. DOI: 10.1016/s0301-0511(02)00012-1.
[4] M. W. Libbrecht and W. S. Noble. “Machine learning applications in genetics and genomics”. In: Nature Reviews
Genetics (2015). ISSN: 14710064. DOI: 10.1038/nrg3920.
[5] Y. C. Lo et al. “Machine learning in chemoinformatics and drug discovery”. In: Drug Discovery Today (2018).
ISSN: 18785832. DOI: 10.1016/j.drudis.2018.05.010.
[6] R. Ramprasad et al. “Machine learning in materials informatics: Recent applications and prospects”. In: npj
Computational Materials (2017). ISSN: 20573960. DOI: 10.1038/s41524-017-0056-5.
[7] M. Alber et al. “Integrating machine learning and multiscale modeling—perspectives, challenges, and opportuni-
ties in the biological, biomedical, and behavioral sciences”. In: npj Digital Medicine (2019). ISSN: 2398-6352.
DOI: 10.1038/s41746-019-0193-y.
[8] J. Han, A. Jentzen, and W. E. “Solving high-dimensional partial differential equations using deep learning”.
In: Proceedings of the National Academy of Sciences of the United States of America 115.34 (2018), pp. 8505–8510.
ISSN: 10916490. DOI: 10.1073/pnas.1718942115.
[9] K. A. Severson et al. “Data-driven prediction of battery cycle life before capacity degradation”. In: Nature Energy
4.5 (2019), pp. 383–391. ISSN: 20587546. DOI: 10.1038/s41560-019-0356-8.
[10] A. Famili et al. “Data preprocessing and intelligent data analysis”. In: Intelligent Data Analysis (1997). ISSN:
15714128. DOI: 10.3233/IDA-1997-1102.
[11] J. Sirignano and K. Spiliopoulos. “DGM: A deep learning algorithm for solving partial differential equations”. In:
Journal of Computational Physics 375 (2018), pp. 1339–1364. ISSN: 10902716. DOI: 10.1016/j.jcp.2018.08.029.
[12] Z. Wang and Z. Zhang. “A mesh-free method for interface problems using the deep learning approach”. In:
Journal of Computational Physics 400 (2020). ISSN: 10902716. DOI: 10.1016/j.jcp.2019.108963.
[13] W. E and B. Yu. “The Deep Ritz Method: A Deep Learning-Based Numerical Algorithm for Solving
Variational Problems”. In: Communications in Mathematics and Statistics 6.1 (2018), pp. 1–14. ISSN: 2194671X.
DOI: 10.1007/s40304-018-0127-z.
[14] M. Raissi, P. Perdikaris, and G. E. Karniadakis. “Physics-informed neural networks: A deep learning framework
for solving forward and inverse problems involving nonlinear partial differential equations”. In: Journal of
Computational Physics 378 (2019), pp. 686–707. ISSN: 10902716. DOI: 10.1016/j.jcp.2018.10.045.
[15] J. X. Wang, J. L. Wu, and H. Xiao. “Physics-informed machine learning approach for reconstructing Reynolds
stress modeling discrepancies based on DNS data”. In: Physical Review Fluids 2.3 (2017), pp. 1–22. ISSN:
2469990X. DOI: 10.1103/PhysRevFluids.2.034603.
[16] E. Samaniego et al. “An energy approach to the solution of partial differential equations in computational
mechanics via machine learning: Concepts, implementation and applications”. In: Computer Methods in Applied
Mechanics and Engineering 362 (2020), p. 112790. ISSN: 00457825. DOI: 10.1016/j.cma.2019.112790.
[17] A. Karpatne et al. “Theory-guided data science: A new paradigm for scientific discovery from data”. In:
IEEE Transactions on Knowledge and Data Engineering 29.10 (2017), pp. 2318–2331. ISSN: 10414347. DOI:
10.1109/TKDE.2017.2720168.
[18] Z. Long, Y. Lu, and B. Dong. “PDE-Net 2.0: Learning PDEs from data with a numeric-symbolic hybrid deep
network”. In: Journal of Computational Physics 399 (2019), p. 108925.
[19] W. Li et al. “Data-Driven Safety Envelope of Lithium-Ion Batteries for Electric Vehicles”. In: Joule (2019),
pp. 1–13. ISSN: 25424351. DOI: 10.1016/j.joule.2019.07.026.
[20] Z. Chen et al. Direct prediction of phonon density of states with Euclidean neural network. 2020. eprint:
arXiv:2009.05163.
[21] L. Zhang et al. “DeePCG: Constructing coarse-grained models via deep neural networks”. In: Journal of Chemical
Physics (2018). ISSN: 00219606. DOI: 10.1063/1.5027645.
[22] J. Darbon, G. P. Langlois, and T. Meng. “Overcoming the curse of dimensionality for some Hamilton–Jacobi
partial differential equations via neural network architectures”. In: Research in Mathematical Sciences (2020).
ISSN: 21979847. DOI: 10.1007/s40687-020-00215-6.
[23] L. Lu et al. “DeepXDE: A deep learning library for solving differential equations”. In: (2019), pp. 1–21.
[24] H. Zhao et al. “Learning the Physics of Pattern Formation from Images”. In: Physical Review Letters (2020).
ISSN: 10797114. DOI: 10.1103/PhysRevLett.124.060201.
[25] S. Effendy, J. Song, and M. Z. Bazant. “Analysis, Design, and Generalization of Electrochemical Impedance
Spectroscopy (EIS) Inversion Algorithms”. In: Journal of The Electrochemical Society (2020). ISSN: 1945-7111.
DOI: 10.1149/1945-7111/ab9c82.
[26] W. E, J. Han, and L. Zhang. “Integrating Machine Learning with Physics-Based Modeling”. In: (2020), pp. 1–23.
[27] E. Qian et al. “Lift & Learn: Physics-informed machine learning for large-scale nonlinear dynamical systems”.
In: Physica D: Nonlinear Phenomena (2020). ISSN: 01672789. DOI: 10.1016/j.physd.2020.132401.
[28] E. Haghighat et al. “A deep learning framework for solution and discovery in solid mechanics”. In: (2020).
[29] L. Wu et al. “A recurrent neural network-accelerated multi-scale model for elasto-plastic heterogeneous materials
subjected to random cyclic and non-proportional loading paths”. In: Computer Methods in Applied Mechanics
and Engineering (2020). ISSN: 00457825. DOI: 10.1016/j.cma.2020.113234.
[30] D. Huang et al. “A machine learning based plasticity model using proper orthogonal decomposition”. In:
Computer Methods in Applied Mechanics and Engineering 365 (2020), p. 113008. ISSN: 00457825. DOI:
10.1016/j.cma.2020.113008.
[31] J. N. Reddy. Theory and Analysis of Elastic Plates and Shells. CRC press, 2006. DOI: 10.1201/9780849384165.
[32] A. Föppl. Vorlesungen über technische Mechanik. Vol. 4. BG Teubner, 1899.
[33] T. V. Kármán. “Festigkeitsprobleme im Maschinenbau”. In: Mechanik. Springer, 1907, pp. 311–385. DOI:
10.1007/978-3-663-16028-1_5.
[34] J. Zhu, X. Zhang, and T. Wierzbicki. “Stretch-induced wrinkling of highly orthotropic thin films”. In: International
Journal of Solids and Structures 139-140 (2018), pp. 238–249. ISSN: 00207683.
DOI: 10.1016/j.ijsolstr.2018.02.005.
[35] E. Cerda, K. Ravi-Chandar, and L. Mahadevan. “Thin films: Wrinkling of an elastic sheet under tension”. In:
Nature (2002). ISSN: 00280836. DOI: 10.1038/419579b.
[36] E. Puntel, L. Deseri, and E. Fried. “Wrinkling of a stretched thin sheet”. In: Journal of Elasticity (2011). ISSN:
03743535. DOI: 10.1007/s10659-010-9290-5.
[37] V. Nayyar, K. Ravi-Chandar, and R. Huang. “Stretch-induced stress patterns and wrinkles in hyperelastic thin
sheets”. In: International Journal of Solids and Structures (2011). ISSN: 00207683.
DOI: 10.1016/j.ijsolstr.2011.09.004.
[38] A. A. Sipos and E. Fehér. “Disappearance of stretch-induced wrinkles of thin sheets: A study of orthotropic
films”. In: International Journal of Solids and Structures (2016). ISSN: 00207683.
DOI: 10.1016/j.ijsolstr.2016.07.021.
[39] G. Cybenko. “Approximation by superpositions of a sigmoidal function”. In: Mathematics of control, signals
and systems 2.4 (1989), pp. 303–314.
[40] M. Abadi et al. Tensorflow: A system for large-scale machine learning. 2016.
[41] A. Paszke et al. Automatic differentiation in pytorch. 2017.
[42] M. H. Sadd. Elasticity: theory, applications, and numerics. Academic Press, 2009.