Artificial Neural Network With The Levenberg-Marquardt Algorithm For Numerical Solution of Two-Dimension Poisson's Equation
RESEARCH
*Correspondence:
Anup Kumar Thander,
[email protected]
This study introduces an Artificial Neural Network (ANN) framework for solving the two-dimensional Poisson's equation within a rectangular domain. It focuses on the training of a three-layer neural network containing hidden neurons. The feedforward ANN is trained in MATLAB, which computes the weights of all neurons in the network structure. These learned weights are subsequently applied in the trained network model to predict the desired output of a specific partial differential equation. The architecture of the ANN consists of three layers: one input layer, one hidden layer, and one output layer. In this study, we employ an ANN configuration with 50 hidden neurons. Training is carried out in MATLAB using the Levenberg–Marquardt algorithm (LMA) for optimization. Furthermore, the study presents surface and contour plots that illustrate the computed solution of the partial differential equation, and error functions are plotted to assess the effectiveness of the ANN model.
Keywords: artificial neural network, Levenberg–Marquardt algorithm, Poisson’s equation, optimization algorithms,
numerical solution
problems and understand how they perform in comparison to traditional numerical methods.

The paper is structured as follows: Section 2 covers the details of the partial differential equation and domain discretization. In Section 3, an overview of the feedforward neural network and its properties is presented. Section 4 discusses the Levenberg–Marquardt algorithm (LMA). Section 5 presents detailed numerical results. Finally, Section 6 offers concluding remarks.
2. The Poisson equation within a rectangular region

The two-dimensional Poisson's equation within a rectangular domain, where x ∈ [0, A] and y ∈ [0, B], is

    ∂²ν/∂x² + ∂²ν/∂y² = f(x, y)    (1)

Here the forcing function, chosen to be consistent with the exact solution given below, is f(x, y) = −[x(A − x) + y(B − y)].

We prescribe Dirichlet boundary conditions on all four sides of the rectangle as

    ν(x, 0) = 0,  ν(x, B) = 0,  ∀ x ∈ (0, A)
    ν(0, y) = 0,  ν(A, y) = 0,  ∀ y ∈ (0, B)

For the numerical resolution of Equation (1), we have decomposed the rectangular solution region into mesh points (11, 13). Here h_x and h_y are the step lengths along the x and y directions, respectively. Let (x_i, y_j) be a mesh point in the region. Then x_i = x_0 + i h_x and y_j = y_0 + j h_y. We have taken (x_0, y_0) = (0, 0), A = 3, B = 2, h_x = 0.03, and h_y = 0.02.

The exact solution of the PDE in Equation (1) is

    ν_exact = x(A − x) y(B − y) / 2

This exact solution is used to compute the error function (||ν_exact − ν_ann||_p) numerically in L^p space (p = 2). Here ν_ann is the solution of Equation (1) obtained using the ANN.
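To make the discretization concrete, the following MATLAB sketch builds the mesh and evaluates the exact solution and the corresponding forcing function; the variable names are illustrative and are not taken from the paper.

    % Minimal sketch of the mesh and exact solution described above (illustrative names).
    A = 3; B = 2; hx = 0.03; hy = 0.02;      % domain size and step lengths
    x = 0:hx:A;                              % x_i = x_0 + i*hx with x_0 = 0
    y = 0:hy:B;                              % y_j = y_0 + j*hy with y_0 = 0
    [X, Y] = meshgrid(x, y);                 % rectangular mesh of points (x_i, y_j)

    v_exact = X.*(A - X).*Y.*(B - Y)/2;      % exact solution of Equation (1)
    f = -(X.*(A - X) + Y.*(B - Y));          % forcing function consistent with v_exact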
3. Feed-forward neural network

One of the two primary types of artificial neural networks, a feedforward neural network, is distinguished by the way information is processed and passed between its layers. Feedforward neural networks are structured as a sequence of interconnected layers (1–7). The initial layer is connected to the network's input, and each subsequent layer is linked to the one preceding it. Ultimately, the last layer generates the network's output. These networks are versatile and can be applied to map inputs to outputs across various problem domains. A feedforward neural network, particularly one equipped with a single hidden layer containing an adequate number of neurons, can approximate and fit any finite input-output mapping problem. In essence, it can adapt to a wide range of tasks where inputs must be transformed into corresponding outputs. Additionally, specialized variations of feedforward networks are available, including networks designed for fitting purposes and those tailored for pattern recognition tasks (6, 7, 17, 18). These variations allow the network to be adapted to specific problem types, enhancing its applicability across various domains.

Equation (2) gives a general mathematical representation of an individual neuron within a feedforward neural network architecture.

    a_k^l = Σ_{p=1}^{N_{l−1}} w_{k,p}^l X_p + b_{k,0}^l,   k = 1, 2, ..., N_l    (2)

Here, N_l represents the number of neurons in the l-th layer (14). Every hidden neuron, labeled "k", is supplied with the output of every input neuron "p" from the input layer, scaled by the weight w_{k,p}^l. The sum of weighted inputs is then fed into the activation function F_l to determine the output of the hidden-layer neuron. This information is subsequently passed forward to the output layer, where a comparable weighted-sum process takes place for each output neuron. The term b_{k,0}^l signifies the bias of the neuron indexed "k" in the l-th layer. Bias values are incorporated to introduce a degree of randomness into the initial conditions, which ultimately enhances the network's likelihood of reaching convergence. In Equation (3), the weight matrix connecting the (l − 1)-th layer to the l-th layer is defined as (14)

    ω^l = [ w_{1,1}  w_{1,2}  ...  w_{1,l−1}
            w_{2,1}  w_{2,2}  ...  w_{2,l−1}
            ...      ...      ...  ...
            w_{l,1}  w_{l,2}  ...  w_{l,l−1} ]_{l×(l−1)}    (3)
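As a concrete illustration of Equations (2) and (3), the forward pass through one layer is a weighted sum plus bias followed by an activation. The MATLAB sketch below is a minimal example under assumed layer sizes and a tanh activation; the paper does not specify these details.

    % Sketch of Equation (2) for one layer l: a_k = F_l( sum_p w_{k,p} X_p + b_{k,0} ).
    % Sizes and the tanh activation are assumptions for illustration only.
    Nprev = 2; Nl = 50;            % e.g., two inputs (x, y) feeding 50 hidden neurons
    W = randn(Nl, Nprev);          % weight matrix omega^l of Equation (3), N_l x N_{l-1}
    b = randn(Nl, 1);              % biases b_{k,0}^l
    Xprev = rand(Nprev, 1);        % outputs of the previous layer
    a = tanh(W*Xprev + b);         % layer output, one entry per neuron k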
4. The Levenberg–Marquardt algorithm (LMA)

The Levenberg–Marquardt algorithm (LMA) (1–6, 14, 16) is a widely used trust-region method designed to locate a minimum of a function, be it linear or non-linear, within a parameter space.
It essentially builds an internal model of the objective function, often quadratic, to establish a trust region. When a good fit is achieved, the trust region expands. However, like many numerical techniques, LMA can be sensitive to the initial parameter values. In traditional implementations of the Levenberg–Marquardt method, finite differences are employed to approximate the Jacobian matrix. Within the realm of artificial neural networks, this method is well suited to training small to medium-sized problems. It combines elements from both gradient descent and Gauss–Newton methods, resulting in an adaptive and reliable optimization tool. LMA often assures successful problem-solving due to its adaptable nature. However, when we represent the back-propagation (BP) method as gradient descent, the algorithm tends to slow down and may not achieve an optimal solution. Conversely, if we express BP as Gauss–Newton, the algorithm substantially increases the likelihood of reaching an optimal solution. Within this algorithm, an approximation for calculating the Hessian matrix (H_a) is presented in Equation (4), while the gradient (G) computation is expressed in Equation (5).

    H_a = J_a^T J_a    (4)

    G = J_a^T E_r    (5)

Here, the Jacobian matrix is denoted as J_a and E_r represents the network's error vector. In this context, the Levenberg–Marquardt algorithm (LMA) exhibits behavior akin to the Newton method. This can be articulated via the following convergence procedure:

    Z_{k+1} = Z_k − (J_a^T J_a + µI)^{−1} J_a^T E_r    (6)

Here, Z_{k+1} denotes the new weight value, computed from the current weight Z_k by the Newton-type update above. I is the identity matrix and µ is the learning factor. Notably, the algorithm can successfully converge even when the error landscape is considerably more intricate than a simple quadratic scenario. The core concept behind the Levenberg–Marquardt algorithm is a hybrid training approach: in regions with intricate curvature, the steepest descent technique is employed until the local curvature is suitable for a quadratic estimation; the method then transitions to an approximation akin to the Gauss–Newton algorithm, notably expediting the convergence.
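A single update of the form given by Equations (4) to (6) can be sketched in MATLAB as follows; the Jacobian, error vector, parameter vector, and damping factor here are illustrative placeholders rather than quantities taken from the trained network.

    % One Levenberg–Marquardt step (Equations 4-6) with placeholder data.
    nPar = 10;                          % number of weights/biases being updated
    z  = randn(nPar, 1);                % current parameter (weight) vector Z_k
    Ja = randn(25, nPar);               % Jacobian of 25 residuals w.r.t. the parameters
    Er = randn(25, 1);                  % network error vector
    mu = 1e-3;                          % learning (damping) factor

    Ha = Ja' * Ja;                      % Equation (4): approximate Hessian
    G  = Ja' * Er;                      % Equation (5): gradient
    z_new = z - (Ha + mu*eye(nPar)) \ G;    % Equation (6): damped update Z_{k+1}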
5. Design and training ANN

The training process of an ANN involves adjusting weights and biases based on input data to reduce the discrepancy between the network's outputs and the intended target results. For a comprehensive overview of neural networks, please refer to (12, 14). To train a multi-layer ANN, the backpropagation (BP) algorithm is commonly utilized to iteratively update the weights and biases. The precision of an ANN depends heavily on the availability of a substantial training dataset.

Typically, the training data are segmented into three separate parts: training, validation, and testing sets. Each of these divisions is utilized separately to assess the training's effectiveness. This approach allows the training results for the entire dataset to be assessed and facilitates comparisons between various training algorithms and ANN architectures.

In MATLAB, we have generated a substantial dataset consisting of 100 × 100 samples for solving Poisson's equation. Throughout the training phase, these samples are categorized into three distinct subsets: seventy percent is designated for training, fifteen percent is set aside for validation, and the remaining fifteen percent is reserved for testing. Training of the ANN is conducted using the Levenberg–Marquardt algorithm.

We have completed three training cycles for each network configuration using this method, with each training cycle consisting of 1,000 epochs (14). An "epoch" is one pass in which all the training data samples are used once to update the network's weights.
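A minimal MATLAB sketch of this training setup is given below. It assumes the Deep Learning (Neural Network) Toolbox and, as a further assumption, uses the exact solution values on the 100 × 100 grid as training targets; the 50 hidden neurons, the 70/15/15 split, the 1,000 epochs, and the trainlm (Levenberg–Marquardt) option follow the description above.

    % Sketch of the training setup described above (assumes the Deep Learning Toolbox;
    % using the exact solution as the target data is an assumption for illustration).
    A = 3; B = 2;
    [X, Y] = meshgrid(linspace(0, A, 100), linspace(0, B, 100));   % 100 x 100 samples
    inputs  = [X(:)'; Y(:)'];                                      % 2 x N matrix of (x, y)
    targets = (X(:).*(A - X(:)).*Y(:).*(B - Y(:))/2)';             % 1 x N target values

    net = feedforwardnet(50, 'trainlm');       % one hidden layer, 50 neurons, LM training
    net.divideParam.trainRatio = 0.70;         % 70% training
    net.divideParam.valRatio   = 0.15;         % 15% validation
    net.divideParam.testRatio  = 0.15;         % 15% testing
    net.trainParam.epochs = 1000;              % 1,000 epochs per training cycle

    [net, tr] = train(net, inputs, targets);   % train the network
    v_ann = net(inputs);                       % ANN approximation of the solution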
Figure 1 illustrates the artificial neural network architecture designed for the numerical solution of the partial differential equation outlined in Equation (1). Figures 2 and 3 depict the surface and contour plots, respectively, showcasing the numerical solution obtained using the current method for the partial differential equation.

The effectiveness of the network training is evaluated by quantifying the discrepancy between the computed output (y^nn) of the neural network and the intended target output for training (y^t). In essence, we set a threshold error value that is deemed sufficiently small for us to consider the output precise. The assessment of the network training operation depends on the speed and efficiency with which this error approaches the predefined cutoff point. The most widely employed measure of the output error is the mean squared error (MSE), as shown in Equation (7) (14).

    MSE = (1/N) Σ_{i=1}^{N} (y_i^{nn} − y_i^t)²    (7)

Here, N denotes the number of outputs.
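Equation (7) corresponds to a one-line computation; in the sketch below, y_nn and y_t are placeholder vectors standing in for the network outputs and the training targets.

    % Mean squared error of Equation (7) with placeholder vectors.
    y_t  = rand(1, 100);                 % intended target outputs
    y_nn = y_t + 0.01*randn(1, 100);     % computed network outputs (illustrative)
    MSE  = mean((y_nn - y_t).^2);        % Equation (7)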
In the MATLAB simulation environment, we use a 64-bit floating-point representation for the weights, biases, and training data within the ANN model.

Figures 4 and 5 display training performance results of the Levenberg–Marquardt algorithm (LMA). In Figure 4, we observe the performance curve where the MSE decreases as the number of epochs increases. It is noteworthy that the error in the test set and the error in the validation set show analogous patterns. Importantly, there is no prominent overfitting issue observed up to epoch 1000, which corresponds to the point where the best validation performance is achieved and the MSE reaches a remarkably small value.
FIGURE 2 | Surface plot of the solution of the PDE.
FIGURE 4 | Mean squared error (MSE) values for different epochs.
Figures 8 and 9 display the surface and contour plots of the error function associated with the PDE. This error function is computed in Euclidean space, comparing the analytical solution of the PDE with the numerical solution obtained using the current method.
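Error surfaces of this kind can be produced with a short MATLAB sketch such as the one below; v_ann is a placeholder for the trained network's output reshaped onto the mesh, so the plotted values here are illustrative only.

    % Sketch of the error-function plots (exact minus ANN solution on the mesh).
    A = 3; B = 2;
    [X, Y] = meshgrid(linspace(0, A, 100), linspace(0, B, 100));
    v_exact = X.*(A - X).*Y.*(B - Y)/2;
    v_ann   = v_exact + 1e-4*randn(size(v_exact));   % placeholder for the ANN solution
    err = v_exact - v_ann;                           % pointwise error function
    L2err = norm(err(:), 2);                         % ||v_exact - v_ann||_2 on the mesh

    figure; surf(X, Y, err); title('Error function: surface plot');
    figure; contourf(X, Y, err); colorbar; title('Error function: contour plot');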
Liu et al. (23) employed a similar approach to obtain a numerical solution for elliptic partial differential equations (PDEs) using an artificial neural network (ANN)-based radial basis function (RBF) collocation method. In their approach, the training data encompass the prescribed boundary values of the dependent variable and the radial distances between exterior fictitious sources and the boundary points of the solution domain. This technique is suitable when dealing with Dirichlet boundary conditions. In contrast, our method is versatile, as it can handle both Dirichlet and Neumann boundary conditions. Nonetheless, a drawback of our method is the need to discretize the solution domain, a requirement that differs from Liu et al.'s approach (23). Furthermore, our proposed method has limitations when it comes to addressing elliptic PDEs with complex geometries, where employing finite-difference meshes with a uniform grid size is unfeasible.

6. Conclusion
This paper is centered on the development and refinement of a specialized Artificial Neural Network (ANN) design for addressing Poisson's equation. In particular, a 3-layer ANN structure was trained using optimization techniques, including the Levenberg–Marquardt algorithm within the MATLAB environment. Notably, the hidden layer of the ANN consisted of 50 neurons. The numerical findings demonstrate that this ANN-based method can achieve an error below a certain threshold. As part of future research endeavors, our aim is to extend this work to address the numerical solution of the Helmholtz wave equation within the context of specific rib-structured waveguides, utilizing artificial neural networks as a promising approach.

References

1. Yadav N, Yadav A, Kumar M. An Introduction to Neural Network Methods for Differential Equations. Springer Briefs in Applied Sciences and Technology: Computational Intelligence. Berlin: Springer (2015).
2. Jiang Z, Jiang J, Yao Q, Yang G. A neural network-based PDE solving algorithm with high precision. Sci Rep. (2023) 13:4479.
3. Althubiti S, Kumar M, Goswami P, Kumar K. Artificial neural network for solving the nonlinear singular fractional differential equations. Appl Math Sci Eng. (2023) 31:2187389.
4. Basir S, Senocak I. Physics and equality constrained artificial neural networks: application to forward and inverse problems with multi-fidelity data fusion. J Comput Phys. (2022) 463:111301.
5. Seo J. A pretraining domain decomposition method using artificial neural networks to solve elliptic PDE boundary value problems. Sci Rep. (2022) 12:13939.
6. Sun Y, Zhang L, Schaeffer H. NeuPDE: neural network based ordinary and partial differential equations for modeling time-dependent data. Proc Mach Learn Res. (2020) 107:352–72.
7. Blechschmidt J, Ernst O. Three ways to solve partial differential equations with neural networks — a review. GAMM-Mitteilungen. (2021) 44:e202100006.
8. Thander AK, Bhattacharyya S. Optical confinement study of different semiconductor rib wave guides using higher order compact finite difference method. Optik. (2016) 127:2116–20.
9. Li Y, Hu X. Artificial neural network approximations of Cauchy inverse problem for linear PDEs. Appl Math Comput. (2022) 414:126678.
10. Bhattacharya K, Hosseini B, Kovachki NB, Stuart AM. Model reduction and neural networks for parametric PDEs. SMAI J Comput Math. (2021) 7:121–57.
11. Thander AK, Bhattacharyya S. Study of optical modal index for semiconductor rib wave guides using higher order compact finite difference method. Optik. (2017) 131:775–84.
12. Zhang L. Artificial neural networks model design of Lorenz chaotic system for EEG pattern recognition and prediction. Proceedings of the 2017 IEEE Life Sciences Conference (LSC). London (2017).
13. Thander AK, Mandal G. Optical waveguide analysis using alternative direction implicit (ADI) method in combination with successive over-relaxation (SOR) algorithm. J Optics. (2023).
14. Zhang L. Artificial neural network model design and topology analysis for FPGA implementation of Lorenz chaotic generator. Proceedings of the 2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering (CCECE). New York, NY (2017).
15. Yadav AK, Chandel SS. Artificial neural network based prediction of solar radiation for Indian stations. Int J Comput Applic. (2012) 50:975–8887.
16. Szczuka M, Slezak D. Feedforward neural networks for compound signals. Theor Comput Sci. (2011) 412:5960–73.
17. Sunny J, Schmitz J, Zhang L. Artificial neural network modelling of Rossler's and Chua's chaotic systems. Proceedings of the 2018 IEEE Canadian Conference on Electrical & Computer Engineering (CCECE). New York, NY (2018).
18. Zhang L. Chaotic system design based on recurrent artificial neural network for the simulation of EEG time series. Int J Cogn Inform Natl Intell. (2019) 13:103.
19. Bhattacharyya S, Thander A. Study of H-field using higher-order compact (HOC) finite difference method (FDM) in semiconductor rib waveguide structure. J Optics. (2019) 48:345–56.
20. Dua V, Dua P. A simultaneous approach for parameter estimation of a system of ordinary differential equations, using artificial neural network approximation. Ind Eng Chem Res. (2012) 51:1809–14.
21. Dua V. An artificial neural network approximation-based decomposition approach for parameter estimation of system of ordinary differential equations. Comput Chem Eng. (2011) 35:545–53.
22. Pratama DA, Bakar MA, Ismail NB, Mashuri M. ANN-based methods for solving partial differential equations: a survey. Arab J Basic Appl Sci. (2022) 29:233–48.
23. Liu C, Ku C. A novel ANN-based radial basis function collocation method for solving elliptic boundary value problems. Mathematics. (2023) 11:3935.