Artificial Neural Network Methods For The Solution of Second Order Boundary Value Problems
Abstract: We present a method for solving partial differential equations using artificial
neural networks and an adaptive collocation strategy. In this procedure, a coarse grid of
training points is used at the initial training stages, while more points are added at later
stages based on the value of the residual at a larger set of evaluation points. This method
increases the robustness of the neural network approximation and can result in significant
computational savings, particularly when the solution is non-smooth. Numerical results
are presented for benchmark problems for scalar-valued PDEs, namely Poisson and
Helmholtz equations, as well as for an inverse acoustics problem.
1 Introduction
Artificial neural networks (ANNs) have been a topic of great interest in the machine
learning community due to their ability to solve very difficult problems, particularly in
the fields of image processing and object recognition, speech recognition, medical
diagnosis, etc. More recently, applications have been found in engineering, especially
where large data sets are involved. From a mathematical point of view, neural networks
are also interesting due to their ability to efficiently approximate arbitrary functions
[Cybenko (1989)].
A natural question is to determine whether ANNs can be used to approximate the solution
of partial differential equations which commonly appear in physics, engineering and
mathematical problems. Several articles, and even a book [Yadav (2015)], have recently been
devoted to this topic. In most of the approaches considered, a collocation-type
method is employed which attempts to fit the governing equations and the boundary
conditions at randomly selected points in the domain and on the boundary. Among these
methods we mention the Deep Galerkin Method [Sirignano and Spiliopoulos (2018)],
Physics Informed Neural Networks [Raissi, Perdikaris, and Karniadakis (2019)], as well
as the earlier works of Lagaris et al. and others [Lagaris, Likas and Fotiadis (1998); Lagaris, Likas
and Papageorgiou (2000); van Milligen, Tribaldos and Jiménez (1995); Kumar and Yadav
(2011); McFall and Mahan (2009)]. These methods appear to produce reasonably accurate
results, particularly for high-dimensional domains [Han, Jentzen and Weinan (2018)] and
domains with complex geometries [Berg and Nyström (2018)], where the meshfree
character of these methods makes them competitive with established discretization methods.
Another related approach is to use an energy minimization formulation of the governing
equation as in Weinan et al. [Weinan and Yu (2018); Wang and Zhang (2019)]. This
formulation has the advantage that only the first derivatives need to be computed for a 2nd
order problem; however, it requires a more precise integration procedure, and not all
governing equations can be cast in an energy-minimization framework.
In this work, we employ a collocation formulation for solving 2nd order boundary value
problems such as Poisson’s equation and Helmholtz equation. Different from existing
methods which typically use a randomly scattered set of collocation points, we present an
adaptive approach for selecting the collocation points based on the value of the residual at
previous training steps. This approach can improve the robustness of the collocation method,
particularly in cases where the solution has a non-smooth region in which increasing the
number of training points is beneficial.
The paper is structured as follows: in Section 2 we give an overview of artificial neural
networks and briefly discuss their approximation properties. The application of ANNs to
forward and inverse boundary-value problems is discussed in Section 3. Detailed
numerical results are presented in Section 4, followed by concluding remarks.
Aside from the size of the neural network (the number of hidden layers and the number of
neurons in each layer) and the choice of training points, other important parameters are the
selection of the activation function and the choice of the minimization algorithm. Typical
activation functions are ramp functions such as the ReLU, the sigmoid (logistic) function,
and the hyperbolic tangent (tanh). In this work, we use the tanh activation function, which is
preferable due to its smoothness. For optimization, we use the Adam (adaptive moment
estimation) optimizer, which is based on stochastic gradient descent, followed by a
quasi-Newton method (L-BFGS), which builds an approximate Hessian at each iteration.
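As an illustration only (a minimal sketch, not the code used in the paper; the network sizes, learning rate, iteration counts, and the stand-in loss are assumptions), the two-stage Adam/L-BFGS optimization could be organized in PyTorch as follows:

```python
import math
import torch

torch.manual_seed(0)

# Small fully connected network with tanh activations (sizes are illustrative).
model = torch.nn.Sequential(
    torch.nn.Linear(2, 10), torch.nn.Tanh(),
    torch.nn.Linear(10, 10), torch.nn.Tanh(),
    torch.nn.Linear(10, 1),
)

# Stand-in loss: fit a smooth target at fixed sample points. In the actual method,
# this would be replaced by the collocation loss built from PDE and boundary residuals.
pts = torch.rand(200, 2)
target = torch.sin(2 * math.pi * pts[:, :1]) * torch.cos(2 * math.pi * pts[:, 1:])

def loss_fn():
    return torch.mean((model(pts) - target) ** 2)

# Stage 1: Adam (adaptive-moment stochastic gradient descent).
adam = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):                      # iteration count is illustrative
    adam.zero_grad()
    loss_fn().backward()
    adam.step()

# Stage 2: L-BFGS, a quasi-Newton method that maintains an approximate Hessian.
lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=500)

def closure():
    lbfgs.zero_grad()
    loss = loss_fn()
    loss.backward()
    return loss

lbfgs.step(closure)
```

Running Adam first typically moves the parameters out of poor initial regions before the quasi-Newton stage refines the fit, which matches the two-stage strategy described above.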
Figure 2: The steps of the adaptive collocation method, assuming the residual values are
higher in the center of the domain
4 Numerical results
4.1 Poisson equation on the unit square
We first consider a Poisson equation with Dirichlet and Neumann boundary conditions:
$$
\begin{aligned}
-\Delta u(x, y) &= 8\pi^2 \sin(2\pi x)\cos(2\pi y) && \text{for } (x, y) \in (0,1)^2, \\
u(x, y) &= 0 && \text{for } x = 0, \\
\frac{\partial u}{\partial y}(x, y) &= 0 && \text{for } y = 0 \text{ and } y = 1, \\
\frac{\partial u}{\partial x}(x, y) &= 2\pi \cos(2\pi x)\cos(2\pi y) && \text{for } x = 1.
\end{aligned}
$$
The exact solution of this equation is $u(x, y) = \sin(2\pi x)\cos(2\pi y)$. We consider a loss
function of the form:
$$
\begin{aligned}
\mathbb{C}(u) := {}& \frac{1}{N_{int}} \sum_{j=1}^{N_{int}} \left[ \Delta u\big(x_j^{*}, y_j^{*}\big) + 8\pi^2 \sin\big(2\pi x_j^{*}\big)\cos\big(2\pi y_j^{*}\big) \right]^2
+ \frac{1}{N_{left}} \sum_{j=1}^{N_{left}} \left[ u\big(x_j^{left}, y_j^{left}\big) \right]^2 \\
&+ \frac{1}{N_{bottom}} \sum_{j=1}^{N_{bottom}} \left[ \frac{\partial u}{\partial y}\big(x_j^{bottom}, y_j^{bottom}\big) \right]^2
+ \frac{1}{N_{top}} \sum_{j=1}^{N_{top}} \left[ \frac{\partial u}{\partial y}\big(x_j^{top}, y_j^{top}\big) \right]^2 \\
&+ \frac{1}{N_{right}} \sum_{j=1}^{N_{right}} \left[ \frac{\partial u}{\partial x}\big(x_j^{right}, y_j^{right}\big) - 2\pi \cos\big(2\pi x_j^{right}\big)\cos\big(2\pi y_j^{right}\big) \right]^2,
\end{aligned}
$$
where $(x_j^{*}, y_j^{*})$ are interior collocation points, and $(x_j^{left}, y_j^{left})$, $(x_j^{bottom}, y_j^{bottom})$,
$(x_j^{top}, y_j^{top})$, $(x_j^{right}, y_j^{right})$ are collocation points on the left, bottom, top, and right
portions of the boundary, respectively. The evaluation set consists of a grid of $29^2$
equally-spaced points. The training (collocation) points at subsequent iterations were chosen
by selecting the top 30% of the evaluation points with the highest residual values.
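To make the structure of this cost functional concrete, the sketch below (our illustration, not the authors' implementation) shows how the interior and boundary terms could be assembled with automatic differentiation in PyTorch; `model` is a network mapping $(x, y)$ to $u$, such as the one sketched earlier, and the point sets are tensors of shape $(N, 2)$:

```python
import math
import torch

def gradient(model, xy):
    """u and its gradient [u_x, u_y] at points xy of shape (N, 2)."""
    xy = xy.clone().requires_grad_(True)
    u = model(xy)
    du = torch.autograd.grad(u.sum(), xy, create_graph=True)[0]
    return u, du, xy

def laplacian(model, xy):
    """u_xx + u_yy at points xy, differentiable w.r.t. the network parameters."""
    _, du, xy = gradient(model, xy)
    u_xx = torch.autograd.grad(du[:, 0].sum(), xy, create_graph=True)[0][:, 0]
    u_yy = torch.autograd.grad(du[:, 1].sum(), xy, create_graph=True)[0][:, 1]
    return u_xx + u_yy, xy

def poisson_loss(model, xy_int, xy_left, xy_bottom, xy_top, xy_right):
    """Mean-squared interior residual plus the four boundary terms of C(u)."""
    lap, xy = laplacian(model, xy_int)
    f = 8 * math.pi**2 * torch.sin(2 * math.pi * xy[:, 0]) * torch.cos(2 * math.pi * xy[:, 1])
    loss = torch.mean((lap + f) ** 2)                       # PDE residual, Delta u + f
    loss = loss + torch.mean(model(xy_left) ** 2)           # Dirichlet: u = 0 on x = 0
    for xy_n in (xy_bottom, xy_top):                        # Neumann: du/dy = 0 on y = 0, 1
        _, du, _ = gradient(model, xy_n)
        loss = loss + torch.mean(du[:, 1] ** 2)
    _, du, xy_r = gradient(model, xy_right)                 # Neumann data on x = 1
    g = 2 * math.pi * torch.cos(2 * math.pi * xy_r[:, 0]) * torch.cos(2 * math.pi * xy_r[:, 1])
    return loss + torch.mean((du[:, 0] - g) ** 2)
```

Each `torch.mean` corresponds to one of the $1/N$-weighted sums in the cost functional above.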
The results obtained for a shallow network with one hidden layer of 10 neurons are
shown in Figure 3. The blue dots represent the interior collocation points, while the red
and green dots represent the points corresponding to the Dirichlet and Neumann
boundary conditions, respectively. We note that even for this simple network with 41
parameters (30 weights and 11 biases), an accurate solution can be obtained. Because the
solution is smooth throughout the domain, the training points are generally evenly
distributed. However, more points are selected near the corners and boundaries since the
residuals and the actual errors are higher there.
The relative $L^2$ errors obtained by increasing the number of layers while keeping the
number of neurons per layer fixed and using the same refinement strategy are shown in
Tab. 1. It can be observed that except for the single-layer network, the error decreases
significantly as more training points are used. Moreover, the error for deeper networks is
greatly reduced compared to the single-layer network, although the number of parameters
and the computational cost increase as well.
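For reference (assuming 10 neurons per hidden layer, as in the text), a fully connected network with two inputs, one output, and $L$ hidden layers of $n$ neurons each has

$$
N_{param} = (2n + n) + (L - 1)(n^2 + n) + (n + 1)
$$

parameters (input-layer weights and biases, hidden-to-hidden weights and biases, and output weights and bias), which for $n = 10$ gives 41, 151, and 261 parameters for the one-, two-, and three-layer networks of Tab. 1.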
Table 1: Relative $L^2$ errors for different levels of refinement and different numbers of
layers for the Poisson equation on the unit square

               Training points   1 layer      2 layers     3 layers
Refinement 1   361               0.07475152   0.00486584   0.00116222
Refinement 2   816               0.03761188   0.00119753   0.00043185
Refinement 3   1271              0.08465629   0.00026268   0.00026697
$$
\mathbb{C}(u) := \frac{1}{N_{int}} \sum_{j=1}^{N_{int}} \left[ \Delta u\big(x_j^{*}, y_j^{*}\big) - u\big(x_j^{*}, y_j^{*}\big) + f\big(x_j^{*}, y_j^{*}\big) \right]^2
+ \frac{\gamma}{N_{bnd}} \sum_{j=1}^{N_{bnd}} \left[ u\big(x_j^{bnd}, y_j^{bnd}\big) \right]^2 .
$$
We first choose a neural network with 2 hidden layers of 10 neurons each and the tanh
activation function. The initial set of collocation points consists of $N_{int} = 19^2$ points in
the interior and $N_{bnd} = 168$ points on the boundary, spaced uniformly as shown in Fig. 4.
Subsequent refinements are done according to the same procedure as in the first example.
The relative $L^2$ error is calculated as 0.00053738 for the initial training and decreases to
0.00036766 and 0.00043207 as the number of collocation points is increased.
The exact solution in polar coordinates is $u_{ex}(r, \theta) = r^{1/2} \sin(\theta/2)$, which has the
singular term $r^{1/2}$ creating approximation difficulties near the origin. In finite element
methods, a more refined mesh is typically required to obtain a good approximation. This
problem was also investigated in Weinan et al. [Weinan and Yu (2018)] using an energy
minimization method.
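For illustration, the exact solution can be evaluated in Cartesian coordinates as in the sketch below; the angle convention is our assumption, chosen so that the branch cut of $\theta$ coincides with the internal boundary $y = 0$, $x < 0$ discussed further down:

```python
import numpy as np

def u_exact(x, y):
    """r^(1/2) * sin(theta/2) with theta in (-pi, pi], so that the discontinuity
    (branch cut) lies along the negative x-axis, i.e. the internal boundary."""
    r = np.hypot(x, y)
    theta = np.arctan2(y, x)          # values in (-pi, pi]
    return np.sqrt(r) * np.sin(theta / 2.0)
```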
The geometry is modelled by considering 3 rectangular subdomains (−1,0) × (−1,1),
(0,1) × (−1,0), and (0,1) × (0,1). We define a loss function of the form:
$$
\mathbb{C}(u) := \frac{1}{N_{int}} \sum_{j=1}^{N_{int}} \left[ \Delta u\big(x_j^{*}, y_j^{*}\big) \right]^2
+ \frac{\gamma}{N_{bnd}} \sum_{j=1}^{N_{bnd}} \left[ u\big(r_j^{bnd}, \theta_j^{bnd}\big) - u_{ex}\big(r_j^{bnd}, \theta_j^{bnd}\big) \right]^2 .
$$
In the initial grid we choose equally spaced points with a distance of 0.05 in the x and y
directions. For the points on the boundary, we choose more densely spaced points, with a
distance of 0.025 in Cartesian coordinates, and we set a penalty factor of $\gamma = 500$ to
ensure that the boundary conditions are respected. As before, we evaluate the model on
grids with more points and append the points where the residual value is large to the
training set in the next step.
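A minimal sketch of this residual-based refinement step (our illustration; `residual` stands for a function returning the PDE residual of the current network at given points, and the 30% fraction follows the first example):

```python
import numpy as np

def refine_training_set(train_pts, eval_pts, residual, fraction=0.3):
    """Append to the training set the fraction of evaluation points with the
    largest absolute residual under the currently trained network."""
    res = np.abs(residual(eval_pts))              # residual on the finer evaluation grid
    n_add = int(fraction * len(eval_pts))
    worst = np.argsort(res)[-n_add:]              # indices of the largest residuals
    return np.concatenate([train_pts, eval_pts[worst]], axis=0)
```

After each such refinement the network is retrained on the enlarged point set.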
The results obtained by the adaptive collocation scheme using a network with 3 hidden
layers and 30 neurons each are shown in Fig. 5. In general, the residual values in a
narrow region around the singularity are much larger than in the rest of the domain, and
the corresponding points are selected in the subsequent training step. Also, larger residuals
are observed along the line $y = 0$, $x < 0$, as the neural network trained on a coarser grid has
difficulty correctly capturing the end of the internal boundary. However, as can be
seen from the plots, the error diminishes as the number of training points increases. The
accuracy can be further improved by choosing larger networks although the number of
training points needs to be increased as well.
Figure 5: Error between the exact and computed solutions for the Poisson equation with a
singularity at the origin, and the training sets at each refinement step, for a network with
4 hidden layers and 30 neurons per layer
$$
\begin{bmatrix} i k_x & -i k_x \\ (k - k_x)\exp(-2 i k_x) & (k + k_x)\exp(2 i k_x) \end{bmatrix}
\begin{bmatrix} A_1 \\ A_2 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \end{bmatrix}.
$$
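As a small numerical illustration (our sketch; the definition $k_x = \sqrt{k^2 - (m\pi)^2}$ is an assumption based on the usual acoustic duct mode of order $m$), the coefficients $A_1$ and $A_2$ can be obtained by solving this $2 \times 2$ complex system:

```python
import numpy as np

def duct_coefficients(k, m):
    """Solve the 2x2 system above for A1, A2, assuming k_x = sqrt(k^2 - (m*pi)^2)."""
    kx = np.sqrt(complex(k**2 - (m * np.pi) ** 2))
    M = np.array([[1j * kx,                      -1j * kx],
                  [(k - kx) * np.exp(-2j * kx),  (k + kx) * np.exp(2j * kx)]])
    return np.linalg.solve(M, np.array([1.0 + 0j, 0.0 + 0j]))

A1, A2 = duct_coefficients(k=12, m=2)    # wavenumber and mode from the benchmark in Fig. 6
```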
In the following, we compute only the real part of the solution $u(x, y)$, as the imaginary
part can be computed by a similar procedure. As before, we define a loss function which
minimizes the residual of the governing equation at interior and boundary points:
$$
\begin{aligned}
\mathbb{C}(u) := {}& \frac{1}{N_{int}} \sum_{j=1}^{N_{int}} \left[ \Delta u\big(x_j^{*}, y_j^{*}\big) + k^2 u\big(x_j^{*}, y_j^{*}\big) \right]^2 \\
&+ \frac{\gamma}{N_{bnd}} \sum_{j=1}^{N_{bnd}} \left[ \frac{\partial u_{ex}}{\partial n}\big(x_j^{bnd}, y_j^{bnd}\big) - \frac{\partial u}{\partial n}\big(x_j^{bnd}, y_j^{bnd}\big) \right]^2 .
\end{aligned}
$$
The results of the adaptive collocation method are shown in Fig. 6. We have used a
neural network with 3 hidden layers of 30 neurons each and a grid of 99 × 49 uniformly
spaced points in the interior of the domain in the initial step. For the boundary, we have
used $N_{bnd} = 400$ uniformly spaced points and a penalty parameter of $\gamma = 100$. As
before, the size of the training set is increased based on the residual value on a finer grid
(with double the points in each direction) in subsequent steps. Due to the oscillatory
nature of the solution, the additional training points are also generally evenly distributed
in the domain with higher concentration in the areas where the residual value was initially
larger than average.
Figure 6: Computed solution, the error, and the sets of training points for the acoustic
duct benchmark problem with $k = 12$ and $m = 2$
Here $u_{ex}$ has the same form as in the previous section but with $k = 4$ and $m = 1$. We
start with $k = 1$ as an initial guess and seek to minimize the loss function with $k$ as a free
parameter. For this problem, we choose a grid of $149 \times 29$ equally spaced points in the
interior of the domain, $N_{bnd} = 800$ boundary collocation points, and $\gamma = 100$.
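A hedged sketch (ours, not the authors' code) of how the wavenumber can be exposed as a trainable parameter alongside the network weights in PyTorch:

```python
import torch

model = torch.nn.Sequential(                    # network for u(x, y); sizes are illustrative
    torch.nn.Linear(2, 30), torch.nn.Tanh(),
    torch.nn.Linear(30, 30), torch.nn.Tanh(),
    torch.nn.Linear(30, 1),
)
k = torch.nn.Parameter(torch.tensor(1.0))       # initial guess k = 1, learned jointly with the weights

optimizer = torch.optim.Adam(list(model.parameters()) + [k], lr=1e-3)

def helmholtz_residual(xy):
    """Interior residual Delta u + k^2 u, with k entering the loss as a free parameter."""
    xy = xy.clone().requires_grad_(True)
    u = model(xy)
    du = torch.autograd.grad(u.sum(), xy, create_graph=True)[0]
    u_xx = torch.autograd.grad(du[:, 0].sum(), xy, create_graph=True)[0][:, 0]
    u_yy = torch.autograd.grad(du[:, 1].sum(), xy, create_graph=True)[0][:, 1]
    return u_xx + u_yy + k**2 * u.squeeze(-1)
```

Because $k$ appears in the residual, its gradient is obtained automatically, and the same Adam/L-BFGS schedule as in the forward problems can be applied.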
The results for this example are presented in Fig. 7. We can observe that the solution has
been represented with reasonable accuracy, both in terms of $u(x, y)$ and $k$. The
relative $L^2$ error for $u(x, y)$ in this example is 0.084, while the computed $k$ is 3.882,
compared to 4 in the reference solution. As in the other examples, we have used the
Adam optimizer followed by a quasi-Newton method (L-BFGS). It can be noted that the
latter converges significantly faster; however, in many cases performing a stochastic
gradient-descent method like Adam first helps the solver avoid becoming trapped in early local minima.
5 Conclusions
We have presented a collocation method for solving boundary value problems using
artificial neural networks. The method is completely mesh-free, as only scattered sets of
points are used for training and evaluation. Although uniform grids of training
points have been used in the initial training step, the method could easily be adapted to
scattered data obtained, e.g., by Latin hypercube sampling. The method was
shown to produce results with good accuracy for the parameters chosen, although, as is
common in deep learning methods, parameter selection may require some manual tuning.
A more detailed study of the convergence and approximation properties of neural networks,
as well as the selection of robust minimization procedures, remain open research topics.
Moreover, the applicability of these methods to energy minimization formulations, for the
differential equations which allow it, can be investigated in future work.
References
Berg, J.; Nyström, K. (2018): A unified deep artificial neural network approach to partial
differential equations in complex geometries. Neurocomputing, vol. 317, pp. 28-41.
Cybenko, G. (1989): Approximation by superpositions of a sigmoidal function.
Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303-314.
Weinan, E.; Yu, B. (2018): The deep Ritz method: a deep learning-based numerical
algorithm for solving variational problems. Communications in Mathematics and
Statistics, vol. 6, no. 1, pp. 1-12.
Han, J.; Jentzen, A.; Weinan, E. (2018): Solving high-dimensional partial differential
equations using deep learning. Proceedings of the National Academy of Sciences, vol.
115, no. 34, pp. 8505-8510.
Hornik, K.; Stinchcombe, M.; White, H. (1989): Multilayer feedforward networks are
universal approximators. Neural Networks, vol. 2, no. 5, pp. 359-366.
Kumar, M.; Yadav, N. (2011): Multilayer perceptrons and radial basis function neural
network methods for the solution of differential equations: a survey. Computers &
Mathematics with Applications, vol. 62, no. 10, pp. 3796-3811.
Lagaris, I. E.; Likas, A. C.; Papageorgiou, D. G. (2000): Neural-network methods for
boundary value problems with irregular boundaries. IEEE Transactions on Neural
Networks, vol. 11, no. 5, pp. 1041-1049.
Lagaris, I. E.; Likas, A.; Fotiadis, D. I. (1998): Artificial neural networks for solving
ordinary and partial differential equations. IEEE Transactions on Neural Networks, vol. 9,
no. 5, pp. 987-1000.
Lu, Z.; Pu, H.; Wang, F.; Hu, Z.; Wang, L. (2017): The expressive power of neural
networks: a view from the width. Advances in Neural Information Processing Systems,
vol. 30, pp. 6231-6239.
McFall, K. S.; Mahan, J. R. (2009): Artificial neural network method for solution of
boundary value problems with exact satisfaction of arbitrary boundary conditions. IEEE
Transactions on Neural Networks, vol. 20, no. 8, pp. 1221-1233.
Raissi, M.; Perdikaris, P.; Karniadakis, G. E. (2019): Physics-informed neural
networks: a deep learning framework for solving forward and inverse problems involving
nonlinear partial differential equations. Journal of Computational Physics, vol. 378, pp.
686-707.
Sirignano, J.; Spiliopoulos, K. (2018): DGM: a deep learning algorithm for solving
partial differential equations. Journal of Computational Physics, vol. 375, pp. 1339-1364.
van Milligen, B. Ph.; Tribaldos, V.; Jiménez, J. A. (1995): Neural network differential
equation and plasma equilibrium solver. Physical Review Letters, vol. 75, no. 20, pp.
3594-3597.
Wang, Z.; Zhang, Z. (2019): A mesh-free method for interface problems using the deep
learning approach. arXiv:1901.00618.
Yadav, N. (2015): An Introduction to Neural Network Methods for Differential
Equations. Springer, Netherlands.