Physics-Informed Neural Nets for Control of Dynamical Systems
Eric Aislan Antonelo (a), Eduardo Camponogara (a), Laio Oriel Seman (a,b), Eduardo Rehbein de Souza (a), Jean Panaioti Jordanou (a), Jomi Fred Hübner (a)
(a) Department of Automation and Systems Engineering, Federal University of Santa Catarina, Florianópolis - Brazil, CEP: 88040-900
(b) Graduate Program in Applied Computer Science, University of Vale do Itajaí, Itajaí - Brazil, CEP: 88302-901
∗ Corresponding author. E-mail: [email protected]
Email addresses: [email protected] (Eduardo Camponogara), [email protected] (Laio Oriel Seman), [email protected] (Eduardo Rehbein de Souza), [email protected] (Jean Panaioti Jordanou), [email protected] (Jomi Fred Hübner)
Abstract
Physics-informed neural networks (PINNs) impose known physical laws on the learning of deep neural networks, ensuring that they respect the physics of the process while decreasing the demand for labeled data. For systems represented by Ordinary Differential Equations (ODEs), the conventional PINN has a continuous time input variable and outputs the solution of the corresponding ODE. In their original form, PINNs neither allow control inputs nor can they simulate over variable long-range intervals without serious degradation of their predictions. In this
context, this work presents a new framework called Physics-Informed Neural Nets for Control (PINC), which pro-
poses a novel PINN-based architecture that is amenable to control problems and able to simulate for longer-range
time horizons that are not fixed beforehand, making it a very flexible framework when compared to traditional
PINNs. Furthermore, this long-range time simulation of differential equations is faster than numerical methods
since it relies only on signal propagation through the network, making it less computationally costly and, thus, a
better alternative for simulation of models in Model Predictive Control. We showcase our proposal in the control
of two nonlinear dynamic systems: the Van der Pol oscillator and the four-tank system.
Keywords: physics-informed neural networks, deep learning, nonlinear model predictive control.
1. Introduction
In the era of Industry 4.0, the simulation and control of complex real-world systems in smart and efficient ways become increasingly important. Thus, harnessing deep learning for the smart automation and control of real plants is not only desirable but also inevitable. One way to achieve this is by using deep neural networks
as models in Model Predictive Control (MPC) (Grüne & Pannek, 2011). MPC is a technique that has become
standard for multivariate control in industry and academia (Camacho & Bordons, 2013). Since its inception in the
1970s, MPC has been successfully applied in the oil and gas (Jordanou et al., 2018), aerospace (Eren et al., 2017)
and process industries, as well as in robotics (Nascimento et al., 2018). The main idea of MPC is to control a
system by employing a prediction model: at every iteration of the control loop, an optimization problem is solved
using a model of the plant in a receding horizon approach.
There are two main cases we consider in which the practical application of MPC, or even just the efficient simulation of a dynamic system, is a challenge: (a) sparse or insufficient historical data of the real plant to build a sufficiently accurate machine learning model; (b) when the numerical simulation of a precise model given by Ordinary Differential Equations (ODEs) or Partial Differential Equations (PDEs) is too costly to be considered in a real-time application. However, a recently introduced approach for training deep neural networks using laws of physics, namely Physics-Informed Neural Networks (PINNs) (Raissi et al., 2017, 2019), effectively addresses both of the aforementioned challenges. For the first challenge (a), we assume that a priori knowledge built previously by experts or borrowed from the laws of nature is available. For (b), instead of relying on numerical solutions of differential equations, PINNs can be used to ease the computational burden of solving ODEs or PDEs, consequently extending the application of MPC to more real-time scenarios.
A standard PINN has a continuous-time t as input, and the system's state variables as output y. The main outcome of this approach is that the need for real data collection is reduced to a minimum, since the behavior of the system is learned mainly from the physics-based loss term.
2. Related Works
An early example is Cavagnari et al. (1999), which employs a neural network as a function approximator, trained offline to minimize a control-related cost function directly, without the need to calculate a model predictive controller during training.
In the vein of Recurrent Neural Networks (RNNs), works such as Jordanou et al. (2021) and Pan & Wang (2012) utilize Echo State Networks (ESNs) as dynamical models for the MPC. Jordanou et al. (2021) uses a trajectory linearization approach (Ławryńczuk, 2014), differentiating the input-output sensitivities along the nonlinear free response over the prediction horizon to calculate a forced response (Camacho & Bordons, 2013). In Pan & Wang (2012), the whole ESN is approximated by a state-space system for computation of the control action. Terzi et al. (2020) rely on the same reduction approach, but employ LSTMs instead of ESNs.
Another example is the classical Approximate Predictive Control (Witt & Werner, 2010), which employs a feedforward neural network that implements dynamics through the application of delayed outputs as inputs (an external dynamics model (Nelles, 2001)), obtains an ARX (Auto-Regressive with eXogenous input) model from the network through differentiation, and performs GPC (Generalized Predictive Control) calculations per time step (Camacho & Bordons, 2013). Hertneck et al. (2018) consider a neural network as an approximation of an MPC, in the same vein as Åkesson & Toivonen (2006).
3. Methods
The PINN framework (Raissi et al., 2019) allows one to find data-driven solutions of PDEs or ODEs automatically. In this paper, nonlinear ODEs are considered in the following general form:

$$\dot{y} + \mathcal{N}[y] = 0, \quad t \in [0, T], \tag{1}$$
where N [·] is a nonlinear differential operator and y represents the state of the dynamic system (the latent ODE
solution).
We define F(y) to be equivalent to the left-hand side of Equation (1):

$$F(y) := \dot{y} + \mathcal{N}[y]. \tag{2}$$
Here, y also represents the output of a multilayer neural network (hence the notation y instead of x) which has
the continuous time t as input: y = fw (t), where fw represents the mapping function obtained by a deep network
parameterized by adaptive weights w. This formulation implies that a neural network must learn to compute the
solution of a given ODE.
Assuming an autonomous system for this formulation, a given neural network y(t) is trained using optimizers
such as ADAM (Kingma & Ba, 2014) or L-BFGS (Andrew & Gao, 2007) to minimize a mean squared error
(MSE) cost function:
MSE = MSEy + MSEF , (3)
where
$$\mathrm{MSE}_y = \frac{1}{N_y}\sum_{i=1}^{N_y} \frac{1}{N_t}\sum_{j=1}^{N_t} \left| y_i(t_j) - \hat{y}_{ij} \right|^2, \tag{4a}$$

$$\mathrm{MSE}_F = \frac{1}{N_y}\sum_{i=1}^{N_y} \frac{1}{N_F}\sum_{k=1}^{N_F} \left| F(y_i(t_k)) \right|^2, \tag{4b}$$
where: Nt, NF, and Ny correspond to the number of training data samples, the number of collocation points, and the number of outputs of the neural network, respectively; yi(·) is the i-th output of the network; ŷij represents the desired i-th output for yi(·), considering the j-th data pair (tj, ŷij). The first loss term MSEy corresponds to the usual cost function for regression (Bishop, 2006) based on collected training data {(tj, ŷij) : j = 1, . . . , Nt}, which usually provides the boundary (initial or terminal) conditions of ODEs when solving these equations.
The second loss term MSEF penalizes the misadjusted behavior of y(t), measured by F(y) in Equation (2), whereby the physical structure of the solution is imposed by F(y) at a finite set of randomly sampled collocation points {tk : k = 1, . . . , NF}. Experiments show that the training data size Nt required for learning a certain dynamical behavior
is drastically reduced due to the a priori information assimilated from MSEF . As the differential equation of the
physical system is assumed to be represented by F (y) = 0, the term MSEF is a measure of how well the PINN
adheres to the solution of the physical model. This physics-informed approach provides a framework that unifies a
previously available theoretical, possibly approximate model and measured data from processes, which is capable
of correcting imprecisions in the theoretical model or providing sample efficiency in process modeling.
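To make the two loss terms concrete, the following is a minimal sketch (not the authors' code) of how Equations (3)-(4) could be evaluated with automatic differentiation in TensorFlow, assuming a toy scalar ODE ẏ + y = 0 as the operator N; the network architecture and all names are illustrative.

```python
import tensorflow as tf

# Illustrative fully connected network y = f_w(t); layer sizes are arbitrary here.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation="tanh", input_shape=(1,)),
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(1),
])

def ode_residual(t):
    """F(y) = dy/dt + N[y] at the collocation times t (Eq. (2)).
    Assumption: N[y] = y, i.e., the toy ODE dy/dt + y = 0."""
    t = tf.convert_to_tensor(t, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(t)
        y = model(t)
    dy_dt = tape.gradient(y, t)        # automatic differentiation w.r.t. time
    return dy_dt + y                   # residual that should vanish

def pinn_loss(t_data, y_data, t_colloc):
    mse_y = tf.reduce_mean(tf.square(model(t_data) - y_data))   # Eq. (4a)
    mse_f = tf.reduce_mean(tf.square(ode_residual(t_colloc)))   # Eq. (4b)
    return mse_y + mse_f                                        # Eq. (3)
```

Here the data points anchor the boundary conditions, while the collocation residual regularizes the network toward the ODE solution everywhere else.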
The cost function of the MPC optimization problem is composed of mathematical expressions established in the controller's design phase, taking many forms. Usually, quadratic functions are used to penalize the error in reference tracking.
According to Camacho & Bordons (2013), there are several ways to classify these controllers taking into
account characteristics such as model linearity, treatment of uncertainties, and how the optimization problem is
solved. In this work, we focus on the case of nonlinear models, more specifically on Nonlinear Model Predictive Control (NMPC) (Grüne & Pannek, 2011). The discrete NMPC formulation is given by:

$$J = \sum_{j=N_1}^{N_2} \left\| x[k+j] - x_{\mathrm{ref}}[k+j] \right\|_Q^2 + \sum_{i=0}^{N_u-1} \left\| \Delta u[k+i] \right\|_R^2 \tag{5a}$$

subject to:

$$x[k+j+1] = f(x[k+j], u[k+j]), \quad \forall j = 0, \ldots, N_2 - 1 \tag{5b}$$
$$u[k+j] = u[k-1] + \sum_{i=0}^{j} \Delta u[k+i], \quad \forall j = 0, \ldots, N_u - 1 \tag{5c}$$
$$u[k+j] = u[k+N_u-1], \quad \forall j = N_u, \ldots, N_2 - 1 \tag{5d}$$
$$h(x[k+j], u[k+j]) \leq 0, \quad \forall j = N_1, \ldots, N_2 \tag{5e}$$
$$g(x[k+j], u[k+j]) = 0, \quad \forall j = N_1, \ldots, N_2 \tag{5f}$$
where k represents the time step at which the MPC problem is being computed, x[k] is the recurrent state of the
dynamic system which, for simplification purposes, is also the output (i.e., x = y), xref is the set-point signal
over the prediction horizon (i.e., reference), being defined by the first penalized instant k + N1 and the last instant
k +N2 . The cost function J is the penalization of the quadratic error between the model output x and the reference
xref along the horizon, and the penalization of the control increment ∆u. Each penalization is weighted by the
diagonal matrices Q and R, respectively. Eqn. (5b) is the constraint imposed by the considered state-equation
model with x as the state, and equations (5e) and (5f) refer to inequality and equality constraints imposed by
functions h and g, respectively. Eqns. (5c) and (5d) define the relation between the control action u and the
control increments, which are aggregated into the control action from time k up until either the control action time
k + j or k + Nu − 1.
The optimization problem is defined by equations from (5a) to (5f) and results in a Non-Linear Programming
(NLP) Problem, which can be solved using well-established methods like Sequential Quadratic Programming
(SQP) (Nocedal & Wright, 2006) and the Interior-Point (IP) method, available in commercial (Gill et al., 2005)
and non-commercial solvers (Wächter & Biegler, 2006). The NLP is solved at each time step k, and typical
approaches only apply the first control increment into the system (Camacho & Bordons, 2013).
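As a concrete illustration of the receding-horizon loop, the sketch below solves one unconstrained-dynamics instance of Problem (5) with SciPy's SLSQP (an SQP-type method); the one-step model `predict`, the horizons, the weights, and the bounds on the increments are illustrative assumptions, not the formulation used later in this paper.

```python
import numpy as np
from scipy.optimize import minimize

def nmpc_step(x0, u_prev, x_ref, predict, N2=5, Nu=5, Q=10.0, R=1.0):
    """One receding-horizon iteration of Problem (5): optimize the control
    increments over the horizon, then return only the first control action.
    `predict(x, u)` is the one-step model x[k+1] = f(x[k], u[k])."""
    def cost(du):
        u = u_prev + np.cumsum(du)             # Eq. (5c): accumulate increments
        x, J = x0, 0.0
        for j in range(N2):
            uj = u[min(j, Nu - 1)]             # Eq. (5d): hold the last control
            x = predict(x, uj)                 # Eq. (5b): roll the model forward
            J += Q * np.sum((x - x_ref) ** 2)  # tracking term of Eq. (5a)
        return J + R * np.sum(du ** 2)         # increment penalty of Eq. (5a)

    res = minimize(cost, np.zeros(Nu), method="SLSQP",
                   bounds=[(-0.5, 0.5)] * Nu)  # illustrative bounds on du
    return u_prev + res.x[0]                   # apply only the first increment
```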
In the PINC framework, the network input is extended with the initial state and the control input, y(t) = fw(t, y(0), u), where fw represents the mapping given by a deep network parameterized by weights w. In this work, we assume the control input to be a constant value for the time interval t ∈ [0, T]. Thus, the new formulation provides a response y(t) conditioned on u and y(0) during this interval of T seconds.
Traditional PINNs tend to degrade rapidly over long time intervals and can only accept an input t within the range in which the network was trained. The PINC framework significantly alleviates this degradation and enables control applications by dividing the problem into M equidistant control time intervals, each of T seconds (see Fig. 4). We call this shorter period of T seconds the inner continuous time interval of the problem, in which a solution of an ODE is obtained given some initial condition y(0) (which models the current system state) and control input u for t ∈ [0, T]. This ODE solution y(t), which is the output of the network, is found by a single PINC network; that is, the same network solves all M intermediate problems, which results from learning the ODE solution for a particular range of initial conditions and control inputs that vary over the complete time horizon, but which stay constant for t ∈ [0, T].
Evaluating the trained network at the end of the inner interval, t = T, yields the discrete-time transition map

$$y[k] = \hat{f}_w(y[k-1], u[k]). \tag{8}$$

We call f̂w the control interface for the PINC framework. Thus, ∂f̂w/∂u can be computed to provide the Jacobian matrix to solvers used in MPC, possibly by means of automatic differentiation. This control interface
provides the prediction of the states of the dynamic system at the vertical lines in Fig. 4, that is, at every Ts
seconds, the state y[k] is predicted in a single forward net propagation operation, for k = 1, ..., M . This differs
from numerical integration methods that need to integrate over the continuous inner interval (Iserles, 1996).
Since the prediction is fed back as an input at every discrete timestep, errors are expected to accumulate in a long free run. This is not exclusive to this approach, and is common to recurrent neural networks. However, because MPC works in a receding-horizon approach, at every timestep k of the control loop, the input y[k − 1] representing the initial state is set to the real system's state ŷ[k − 1] (Fig. 3b). Thus, the prediction horizon in MPC always starts from the true initial state ŷ[k − 1]; that is, Equation (8) becomes

$$y[k] = \hat{f}_w(\hat{y}[k-1], u[k]). \tag{9}$$
Figure 3: Modes of operation of the PINC network. (a) PINC net operates in self-loop mode, using its own output
prediction as next initial state, after T seconds. This operation mode is used within one iteration of MPC, for
trajectory generation until the prediction horizon of MPC completes (predicted output from Fig. 1). (b) Block
diagram for PINC connected to the plant. One pass through the diagram arrows corresponds to one MPC iteration, applying a control input u for Ts seconds to both the plant and the PINC network. Note that the initial state of the PINC net is set to the real output of the plant. In practice, in MPC, these two operation modes are executed alternately (optimization in the prediction horizon, and application of the control action).
The error might accumulate when the MPC model is used in a future finite prediction horizon to solve a
constrained optimization problem. In this case, the prediction y[k − 1] is fed back as no readings from the real
process at a future time are possible (Fig. 3a).
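To make the two modes concrete, here is a schematic sketch in Python, under the assumption that `f_hat` is the trained control interface of Eq. (8) and that `solve_mpc` and `plant_step` stand in for the optimizer and the real process; all names are illustrative.

```python
def rollout_self_loop(f_hat, y0, u_seq):
    """Fig. 3a / Eq. (8): each prediction is fed back as the next initial
    state, so small errors may accumulate along the horizon."""
    y, traj = y0, []
    for u in u_seq:
        y = f_hat(y, u)        # one forward pass predicts T seconds ahead
        traj.append(y)
    return traj

def control_loop(f_hat, plant_step, solve_mpc, y0, steps):
    """Fig. 3b / Eq. (9): at every MPC iteration the model's initial state
    is reset to the measured plant state, avoiding error accumulation
    across control steps."""
    y_meas, history = y0, []
    for _ in range(steps):
        u = solve_mpc(f_hat, y_meas)   # optimize over the horizon (self-loop inside)
        y_meas = plant_step(u)         # apply u for Ts seconds, then read sensors
        history.append(y_meas)
    return history
```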
3.3.3. Training
The first loss term in Equation (3) can be generalized to the PINC network as:
$$\mathrm{MSE}_y = \frac{1}{N_y}\sum_{i=1}^{N_y} \frac{1}{N_t}\sum_{j=1}^{N_t} \left| y_i(v_j) - \hat{y}_{ij} \right|^2, \tag{10}$$
where the pair (vj, ŷj) corresponds to the j-th training example and vj = (t, y(0), u)j is the whole input to the network (i.e., time, initial state, and control input). Usually, this dataset comes from measured data, but in this work we will show that, if we assume that the given ODE is an exact representation of the process, it is enough for this dataset to contain only the initial conditions of the modelled ODE. For instance, one such training data pair is ((0, 0.4, 0.6), 0.4), which means: at t = 0 the initial state is 0.4, the control input is 0.6, and the desired output is equal to the initial state (0.4). Thus, for all data points, t = 0, while y(0) and u are randomly sampled from intervals defined according to the modelled dynamic system. This means that MSEy represents the mean squared error for all randomly sampled initial conditions of the considered ODE and control inputs. Note also that the input yj(0) is equal to the desired output ŷj in the training set, such that the network must learn to reproduce
Figure 4: Representation of a trained PINC network evolving through time in self-loop mode (Fig. 3a) for tra-
jectory generation in prediction horizon. The top dashed black curve corresponds to a predicted trajectory y of
a hypothetical dynamic system in continuous time. The states y[k] are snapshots of the system in discrete time
k positioned at the equidistant vertical lines. Between two vertical lines (during the inner continuous interval
between steps k and k + 1), the PINC net learns the solution of an ODE with t ∈ [0, T ], conditioned on a fixed
control input u[k] (blue solid line) and initial state y(0) (green thick dashed line). Control action u[k] is changed
at the vertical lines and kept fixed for T seconds, and the initial state y(0) in the interval between steps k and
k + 1 is updated to the last state of the previous interval k − 1 (indicated by the red curved arrow). The PINC net
can directly predict the states at the vertical lines without the need to infer intermediate states t < T as numerical
simulation does. Here, we assume that T = Ts and, thus, the number of discrete timesteps M is equal to the
length of the prediction horizon in MPC.
the initial state yj(0) at the network output y(vj) when t = 0. In practice, the aforementioned assumption allows training with randomly sampled data for solving the ODE, without ever requiring measured process data.
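A sketch of how such a boundary dataset could be generated follows; the sampling ranges are those used later for the Van der Pol experiments and, like all names here, are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def sample_boundary_data(n, y_range=(-3.0, 3.0), u_range=(-1.0, 1.0), n_states=2):
    """Training pairs (v_j, y_hat_j) for Eq. (10): t = 0 for every point, and
    the desired output equals the sampled initial state, as in the example
    pair ((0, 0.4, 0.6), 0.4) above."""
    t = np.zeros((n, 1))                             # always on the boundary t = 0
    y0 = rng.uniform(*y_range, size=(n, n_states))   # random initial states
    u = rng.uniform(*u_range, size=(n, 1))           # random (constant) control inputs
    v = np.hstack([t, y0, u])                        # network input v = (t, y(0), u)
    return v, y0                                     # target output = initial state
```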
The second loss term in Equation (3) is rewritten as:
$$\mathrm{MSE}_F = \frac{1}{N_y}\sum_{i=1}^{N_y} \frac{1}{N_F}\sum_{k=1}^{N_F} \left| F(y_i(v_k)) \right|^2, \tag{11}$$
where vk corresponds to the k-th collocation point (t, y(0), u)k, in which all three types of inputs (and not only the last two), i.e., time, initial condition, and control input, are randomly sampled from their respective intervals. Specifically, the interval for t is [0, T], where T is the inner continuous interval of the PINC framework.
Basically, this formulation means that the PINC net is trained with data points that lie on the boundary of simulations, i.e., only initial states of ODEs are presented to the loss function in Eq. (10). In practice, this does not require collecting data from ODE simulators. On the other hand, the collocation points in MSEF serve to regularize the PINC net to satisfy the behavior defined by F. Thus, in the training process, the PINC net is only directly informed with an initial state in Eq. (10), and its physics-informed loss in Eq. (11) must enforce the structure of the differential equation on its output y(·) for the remaining inner continuous interval of T seconds (e.g., t ∈ (0, T]).
The total loss can be generalized to MSE = MSEy + λ · MSEF, where λ is a rescaling factor chosen so that both terms are approximately on the same scale. Once the PINC net structure, the datasets, and the losses are defined, the training process starts with the ADAM optimizer (Kingma & Ba, 2014) for K1 epochs, and subsequently continues with the L-BFGS optimizer (Andrew & Gao, 2007) for K2 iterations in order to adapt the net weights w towards the minimization of MSE. Note that automatic differentiation is employed for the physics-informed term MSEF in Eq. (11), using deep learning frameworks such as TensorFlow.
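One possible realization of this two-stage schedule is sketched below, combining a Keras ADAM loop with SciPy's L-BFGS-B over the flattened weights; `loss_fn` is assumed to be a closure evaluating MSE = MSEy + λ·MSEF for the current weights, and the code is a sketch, not the authors' implementation.

```python
import numpy as np
import tensorflow as tf
from scipy.optimize import minimize

def train_two_stage(model, loss_fn, K1=5000, K2=2000):
    """Stage 1: ADAM for K1 full-batch epochs; stage 2: L-BFGS refinement
    for up to K2 iterations, one way to implement the schedule above."""
    opt = tf.keras.optimizers.Adam(learning_rate=1e-3)
    for _ in range(K1):
        with tf.GradientTape() as tape:
            loss = loss_fn()
        grads = tape.gradient(loss, model.trainable_variables)
        opt.apply_gradients(zip(grads, model.trainable_variables))

    shapes = [tuple(v.shape) for v in model.trainable_variables]
    sizes = [int(np.prod(s)) for s in shapes]

    def set_flat(w):
        # Write a flat float64 vector back into the model's variables.
        parts = np.split(w, np.cumsum(sizes)[:-1])
        for v, p, s in zip(model.trainable_variables, parts, shapes):
            v.assign(p.reshape(s).astype(v.dtype.as_numpy_dtype))

    def value_and_grad(w):
        set_flat(w)
        with tf.GradientTape() as tape:
            loss = loss_fn()
        g = tape.gradient(loss, model.trainable_variables)
        flat_g = np.concatenate([x.numpy().ravel() for x in g]).astype(np.float64)
        return float(loss), flat_g

    w0 = np.concatenate([v.numpy().ravel() for v in model.trainable_variables])
    res = minimize(value_and_grad, w0, jac=True, method="L-BFGS-B",
                   options={"maxiter": K2})
    set_flat(res.x)
    return model
```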
3.3.4. NMPC
After training, the PINC net is used as a model in nonlinear MPC, whose algorithm is described in Section 3.2.
Thus, the control interface function fbw in Equation (8) replaces Equation (5b) in the MPC formulation, redefining
the notation of a dynamic system’s state by the prediction given by the PINC network, i.e., x[k] = y[k]. After
these substitutions, we arrive at a Multiple Shooting (MS)-inspired formulation for the NMPC problem under the
PINC framework:
$$J = \sum_{j=N_1}^{N_2} \left\| y[k+j] - y_{\mathrm{ref}}[k+j] \right\|_Q^2 + \sum_{i=0}^{N_u-1} \left\| \Delta u[k+i] \right\|_R^2 \tag{12a}$$
while being subject to:
$$y[k+j+1] = \hat{f}_w(y[k+j], u[k+j]), \quad \forall j = 0, \ldots, N_2 - 1 \tag{12b}$$
$$u[k+j] = u[k-1] + \sum_{i=0}^{j} \Delta u[k+i], \quad \forall j = 0, \ldots, N_u - 1 \tag{12c}$$
$$u[k+j] = u[k+N_u-1], \quad \forall j = N_u, \ldots, N_2 - 1 \tag{12d}$$
$$h(y[k+j], u[k+j]) \leq 0, \quad \forall j = N_1, \ldots, N_2 \tag{12e}$$
$$g(y[k+j], u[k+j]) = 0, \quad \forall j = N_1, \ldots, N_2 \tag{12f}$$
3.4. Metrics
The evaluation of the PINC net prediction performance is done on a validation set in self-loop mode (Figures 3a
and 4). In particular, the generalization MSE is computed only at the discrete time steps (vertical lines in Fig. 4):
$$\mathrm{MSE}_{\mathrm{gen}} = \frac{1}{N_y}\sum_{i=1}^{N_y} \frac{1}{N}\sum_{k=1}^{N} \bigl( y_i[k] - \hat{y}_i[k] \bigr)^2, \tag{13}$$

where: y[k] is the prediction of the PINC net given by Equation (8) and ŷ[k] is obtained with a Runge-Kutta (RK) simulation of the true model of the plant; N is the length of the vector y; and the same control input signal u[k] is given to both the PINC net and the RK model.
The control performance is measured by employing the Integral of Absolute Error (IAE) on a simulation of
C iterations:
$$\mathrm{IAE} = \frac{1}{N_y}\sum_{i=1}^{N_y} \sum_{k=1}^{C} \bigl| y_i^{\mathrm{ref}}[k] - y_i[k] \bigr| \tag{14}$$
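Both metrics are straightforward to compute from stacked trajectories; a small sketch follows, assuming arrays of shape (steps, Ny) and illustrative function names.

```python
import numpy as np

def mse_gen(y_pred, y_true):
    """Eq. (13): squared error at the discrete steps (vertical lines in
    Fig. 4), averaged over time and then over the Ny outputs."""
    return float(np.mean(np.mean((y_pred - y_true) ** 2, axis=0)))

def iae(y, y_ref):
    """Eq. (14): absolute error summed over the C simulation steps and
    averaged over the Ny outputs."""
    return float(np.mean(np.sum(np.abs(y_ref - y), axis=0)))
```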
4. Experiments
This section presents experiments regarding the application of PINC to the modeling and control of the Van der
Pol Oscillator and the four-tank system, which are two dynamical systems often considered for nonlinear analysis
in the literature.
Algorithm 1: PINC Training Algorithm
input: K1, K2, F(·), {(vj, ŷj) : j = 1, . . . , Nt}, {vk : k = 1, . . . , NF};
initialize PINC weights w with the Xavier normal distribution;
// Train with ADAM
for K1 epochs do
    Compute the gradients of Eq. (10) + Eq. (11) (with F(·)) with respect to w using the data points {(vj, ŷj) : j = 1, . . . , Nt} and collocation points {vk : k = 1, . . . , NF};
    Update w with the ADAM optimizer and the obtained gradients;
// Train with L-BFGS
for K2 iterations do
    Compute the gradients of Eq. (10) + Eq. (11) (with F(·)) with respect to w using the data points {(vj, ŷj) : j = 1, . . . , Nt} and collocation points {vk : k = 1, . . . , NF};
    Update w with the L-BFGS optimizer and the obtained gradients;
    Save the network w with the best performance seen so far on a validation set using Eq. (13);
output: network w with the lowest validation error;
4.1. Van der Pol Oscillator

The Van der Pol oscillator with an exogenous input is described by the following ODEs:

$$\dot{x}_1 = x_2 \tag{16a}$$
$$\dot{x}_2 = \mu(1 - x_1^2)\,x_2 - x_1 + u \tag{16b}$$

where µ = 1 is referred to as the damping parameter, which affects how much the system will oscillate, x = (x1, x2) is the system state, and u is an exogenous control action.
By inspection, the Van der Pol oscillator has an equilibrium at x̄ = (u, 0), which is stable for a constant input u ∈ (−√3, −1) or u ∈ (1, √3). The oscillator also has a limit cycle that can be perceived in polar coordinates (Y. Hafeez et al., 2015). For our experiments, u ∈ [−1, 1] and x1, x2 ∈ [−3, 3].
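For reference, a sketch of the right-hand side of Eq. (16) and a classic fourth-order Runge-Kutta step, the integrator this paper uses for target trajectories and the baseline NMPC model, is given below; the step size dt and the function names are left as illustrative choices.

```python
import numpy as np

def van_der_pol(x, u, mu=1.0):
    """Right-hand side of Eq. (16) with damping parameter mu = 1."""
    x1, x2 = x
    return np.array([x2, mu * (1.0 - x1 ** 2) * x2 - x1 + u])

def rk4_step(f, x, u, dt):
    """One classic fourth-order Runge-Kutta (RK4) step of size dt."""
    k1 = f(x, u)
    k2 = f(x + 0.5 * dt * k1, u)
    k3 = f(x + 0.5 * dt * k2, u)
    k4 = f(x + dt * k3, u)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
```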
(a) Effect of the number of layers and neurons per layer (b) Effect of the number of collocation points Nf and data points Nt
Figure 5: Analysis of the PINC net for the Van der Pol oscillator. The network training time is fixed to a constant number of iterations. The MSE validation error is computed according to Equation (13). (a) The log10 of the MSE error as a function of network complexity, averaged over 10 different simulations. The best generalization error (10^−2.87) is achieved with a deep network of 10 layers with 20 neurons each. (b) The effect of the number of collocation points Nf and data points Nt on generalization performance, averaged over 5 different randomly initialized networks.
The parameter λ is set empirically so that MSEy and MSEF are not in disparate scales. The validation dataset is
composed of 1810 points obtained using a randomly generated control action u (e.g., Fig. 9), which is equivalent
to 905s of simulation, since Ts = 0.5s. The validation or generalization error considers the self-loop mode of
PINC to compute Eq. (13).
The first experiment analyses the network complexity (Fig. 5a) and shows the validation MSE using Eq. (13)
averaged over 10 different random initializations of the network weights. In general, as the network grows deeper
and with more neurons per layer, the performance increases. Besides, layers with 3 or 5 neurons are not sufficient
to model the required task. Note that these errors would decrease even further if training had been extended
for more epochs (correcting the lower performance of the net of 10 layers with 15 neurons each, for instance).
Although the network of 10 layers with 20 neurons each achieves the best performance, we choose to continue
the following experiments with a configuration of 4 layers of 20 neurons each, which also showed excellent
performance, but with less computational overhead.
In Fig. 5b, the proportion between data points and collocation points is investigated. Each error cell in the
plot corresponds to the average of 5 different experiments with randomly generated networks. Clearly, 40 data
points are not enough, and the proportion Nf /Nt should be considerably higher than 4 (hence the dark cells in the
upper-right corner of the plot).
Figure 6: Comparison between the conventional PINN (dashed grey line) and the proposed PINC (solid blue and pink lines) for long-range simulation of the Van der Pol oscillator with fixed control input u and fixed initial condition x = (x1, x2) along the simulation. The target trajectories for states x1 and x2 are also plotted as black solid lines, but are completely superimposed by the PINC predictions y1 and y2. From left to right, the PINN nets are trained with fixed T ∈ {0.5, 1, 2, 5, 10}, while the PINC net is trained only once with T = 0.5s, even though it can run for arbitrarily longer simulation times not fixed beforehand.
The RMSE errors for these experiments are summarized in Fig. 7, making clear the high prediction error obtained by the conventional PINN relative to the proposed PINC approach when the T used for PINN training is lower than 10s. At T = 10s, PINN has a slightly lower error than PINC, likely because of the small accumulation of prediction errors during the self-loop simulation of PINC.
Figure 7: Performance comparison in terms of RMSE between the target trajectory of the Van der Pol oscillator
and the predicted trajectory for the PINN and PINC networks from Fig. 6 for a simulation of 10s. The horizontal
axis corresponds to the fixed T used for training the PINN network. See text for more details.
The control input u was kept fixed here to compare the conventional and the new approach. However, PINC can have a variable u along the simulation, an additional advantage that allows control applications, as showcased in the next section.
Figure 8: MSE evolution during training of the final PINC net. Previous experiments from Section 4.1.2 stopped training at the vertical dashed line. The validation dataset consists of 1810 points, or 905s of simulation, since T = Ts = 0.5s. The validation MSE is noisier because it is computed on a much smaller dataset and in self-loop mode using Eq. (13).
To view the PINC prediction after training, we randomly generate a control input u for 10s. In Fig. 9, the predicted trajectory is given for such a control input. With our method, we can directly infer each circle in the trajectory using (8) every T = 0.5s. The trajectory between two consecutive circles can be predicted by varying the input t of the network and keeping the other inputs y(0) and u fixed. The prediction matches the target trajectory very well: the latter is also plotted, but is superimposed by the former.
The resulting control from PINC can be seen in Fig. 10 in a simulation of 60s, where MPC was employed
to find the optimal value of the control input, considering a prediction and control horizon of 5T (or 2.5s). The
control parameters are given as follows: N1 = 1, N2 = 5, Nu = N2 , Q = 10I, and R = I. Here, the
optimization in MPC to find a control input at the current timestep uses the PINC network’s predicted trajectory
for future timesteps, i.e., for the prediction horizon of 2.5s. This procedure is repeated for all 120 points of
the plotted trajectory. Here, the role of the plant to be controlled is taken by the Van der Pol oscillator, whose
states are obtained by an RK integrator. The control performance for a 60s simulation is presented in Table 1,
which also shows the result when the original ODE model is used as the predictive model in NMPC instead
of the PINC net. In this case, the classic, fourth-order Runge-Kutta method (RK4) is employed as a numerical
solution method to compute the states of the system for NMPC. Since other approximations of the plant are unlikely to improve on the ODE model itself, this justifies our comparison against the baseline NMPC. Remarkably, PINC achieves practically the same result as the ODE/RK approach in terms of RMSE and
IAE, while being slightly faster on average when executed with 10 repetitions on the same desktop computer. The
right plot in Fig. 10 also shows the NMPC considering the ODE/RK model as the predictive model, in thick yellow
line, showing that the lines from PINC and ODE/RK match very well.
Figure 9: PINC net prediction for the Van der Pol oscillator on test data. The grey dashed line gives the randomly generated input u, while the predictions for the oscillator states x1 and x2 correspond to the solid blue and pink lines, respectively. The target trajectory, from the RK method, is also plotted in black, but is not visible as the prediction completely superimposes it. Each dot in the predicted trajectory corresponds to the vertical lines in Fig. 4, when the control action and initial state change (after Ts = T = 0.5s).
Figure 10: Control of the Van der Pol oscillator with PINC and comparison with the ODE model. The reference
trajectory for x1 (x2 ) is given by a dashed black step signal (fixed at zero), while the controlled variables are the
states x1 and x2 given by blue and pink lines. The control input u is the manipulated variable in the lower plot,
found by MPC. Left: simulation totalling 60s exclusively by PINC. Right: comparison with ODE model (plant)
as predictive model in yellow thick line during the first 20s. See text for more details.
4.2. Four Tanks System

The four tanks system (Johansson, 2000) is described by the following ODEs:

$$\dot{h}_1 = \frac{\gamma_1 k_1 u_1 + \omega_3 - \omega_1}{A_1} \tag{17a}$$
$$\dot{h}_2 = \frac{\gamma_2 k_2 u_2 + \omega_4 - \omega_2}{A_2} \tag{17b}$$
$$\dot{h}_3 = \frac{(1 - \gamma_2) k_2 u_2 - \omega_3}{A_3} \tag{17c}$$
$$\dot{h}_4 = \frac{(1 - \gamma_1) k_1 u_1 - \omega_4}{A_4} \tag{17d}$$

where the flow through each tank orifice ωi is described by the Bernoulli orifice equation, which adds the sole nonlinearity of the system:

$$\omega_i = a_i \sqrt{2 g h_i} \tag{18}$$
with g as the acceleration of gravity. The parameters used for this application are the same as the ones stipulated
for the non-minimum phase experiment in Johansson (2000).
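A sketch of the corresponding right-hand side in code is given below; the parameter vectors are assumed to be supplied by the caller from Johansson (2000), and the level is clipped at zero before the square root, one of the workarounds mentioned in Section 4.3 for invalid arguments.

```python
import numpy as np

G = 981.0  # acceleration of gravity in cm/s^2 (tank levels in cm); an assumption

def four_tanks(h, u, a, A, k, gamma):
    """Right-hand side of Eqs. (17)-(18). Arrays a, A, k, gamma hold the
    orifice areas, tank areas, pump gains, and valve ratios; the paper takes
    their values from the non-minimum-phase setup of Johansson (2000)."""
    w = a * np.sqrt(2.0 * G * np.maximum(h, 0.0))        # Eq. (18), clipped at h = 0
    return np.array([
        (gamma[0] * k[0] * u[0] + w[2] - w[0]) / A[0],   # Eq. (17a)
        (gamma[1] * k[1] * u[1] + w[3] - w[1]) / A[1],   # Eq. (17b)
        ((1.0 - gamma[1]) * k[1] * u[1] - w[2]) / A[2],  # Eq. (17c)
        ((1.0 - gamma[0]) * k[0] * u[0] - w[3]) / A[3],  # Eq. (17d)
    ])
```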
Figure 11: Schematic representation of the four tanks system, from Brandão (2018).
Figure 12: PINC net prediction for the four tanks system on test data, with randomly generated control input signals similar to Fig. 9. The predictions for the levels of tanks h1 and h2 correspond to the solid blue and pink lines, respectively. The target trajectories, from the RK method, are also plotted as dark solid lines without dots. At each dot in the predicted trajectory, the PINC net receives new inputs for control action and initial state (every Ts = T = 10s), as explained in Fig. 3a. The vertical dashed line indicates the prediction horizon used for MPC.
The MPC uses this trajectory only up to 50s, equivalent to a prediction horizon of 5 steps, indicated by the vertical dashed line in the figure, and the next optimization procedure in MPC resets the initial state to the true value as obtained by sensors of the real process (Fig. 3b).
PINC's control employs prediction and control horizons both of 5 steps (50s in simulation time) for this four tanks system. Besides, both the h3 and h4 tank levels are constrained to the interval [0.6, 5.5]cm. The results are shown in Fig. 13, where the controlled and constrained tank levels are presented in the first two topmost plots, and the control action found by MPC is shown in the bottom plot. The plots on the right-hand side show a close-up of the initial 160s of the simulation. The control was successful in spite of the constraints imposed on h3 and h4 (which were respected) and some minor steady-state error, which can be corrected by computing a correction factor through filtering of the error between the measurement and the network prediction, as done in Jordanou et al. (2018) for a recurrent network. In Fig. 14, we use the same simulation setup, focusing on the timesteps between 500s and 1300s, to compare with the response (in yellow) of the control using the plant reference model as the predictive model in MPC. This ODE/RK-based model is the reference model that represents the plant itself, which justifies the negligible steady-state error observed in the figure. We can notice that the PINC simulation is very close to the nominal MPC given by the ODE/RK model. This comparison suffices, as another NMPC would employ an approximation of the ODE/RK model as a predictive model.
The control performance in terms of RMSE and IAE is shown in Table 1. Although the IAE seems to differ more between PINC and ODE/RK, the RMSE errors for both methods are almost equivalent. The average time spent on a desktop computer for the complete control simulation using PINC, repeated 10 times, is 10.85s, which is 23.3% lower than using the ODE of the four tanks as a model for MPC (14.15s); this is remarkable given the architecture of 5 hidden layers with 20 neurons each that is used for PINC.
Figure 13: Control of the four tanks system with PINC. The controlled variables are the tank levels h1 and h2
given by blue and pink lines, respectively, whereas the reference trajectory for h1 (h2 ) is given by a dashed black
(grey) step signal. The control inputs u are the manipulated voltages shown in the lower plot, found by MPC.
Dashed grey horizontal lines represent the lower and upper limits for both h3 and h4 . Left: simulation totalling
2400s. Right: close-up on the first 160s. The initial conditions for h1 and h2 are (2, 2), which is the minimum of
the allowed interval [2, 20]. See text for more details.
Figure 14: Control of the four tanks system with PINC and the RK/ODE model for timesteps 500s to 1300s from
Fig. 13. All the yellow lines correspond to the simulation where the ODE model itself is used as predictive model
(using a numerical solution at each timestep in the prediction horizon), whereas the remaining lines belong to the
simulation with PINC net as predictive model.
(a) 5 layers of 20 neurons each (left: k1 and k2 perturbed; right: initial condition perturbed)
(b) 8 layers of 20 neurons each (left: k1 and k2 perturbed; right: initial condition perturbed)
Figure 15: Sensitivity to modeling errors (left) and to initial conditions (right) for the MPC of the four tanks system with the PINC net as the model. For each of the plots, 151 runs of the MPC algorithm using a trained PINC net of 5 layers (top plots) and 8 layers (bottom plots) are executed. The resulting IAEs between the references (as in Fig. 13) and the controlled signals h1 and h2 are computed and shown in a histogram. The initial conditions are h1 = h2 = 9, which is the middle point of the allowed interval.
The first sensitivity experiment injects random deviations into the values of the k1 and k2 parameters. The perturbed values k̃1 and k̃2 are sampled from the uniform probability distribution U5%(x) = U(0.95x, 1.05x), with x being the nominal value with which the PINN was trained.
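A minimal sketch of this perturbation sampling follows; the nominal pump gains shown are placeholders, since the paper takes them from Johansson (2000).

```python
import numpy as np

rng = np.random.default_rng()

def perturb_5pct(x):
    """Sample from U5%(x): uniform on [0.95x, 1.05x]."""
    return rng.uniform(0.95 * x, 1.05 * x)

# Perturbed pump gains for one of the 151 runs (nominal values are placeholders):
k1_tilde = perturb_5pct(3.14)
k2_tilde = perturb_5pct(3.29)
```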
Altogether, 151 simulations were carried out, injecting the deviations into the system. The results are shown in the first column of Fig. 15 for two different networks, one with 5 hidden layers of 20 neurons each, and another with 8 hidden layers of 20 neurons each. Despite the random variation of the parameters k1 and k2, the IAE of the system varies within a tolerance range considered adequate. Fig. 16 shows the control of the four tanks when the controlled plant has the maximum deviation of 5% in the k1 and k2 parameters, showcasing that the perturbation only slightly biases the trajectories.
A second experiment consisted of perturbing the initial condition with a uniform distribution. The perturbed initial conditions for h1 and h2 are sampled from U5%(x) = U(0.95x, 1.05x), with x being the nominal value 9 for both states. The results are shown in the second column of Fig. 15. The peak of the histogram approximately coincides with the IAE obtained by the unperturbed model of the plant. Thus, other initial conditions can imply relatively lower or higher IAE. Notice that the bottom plots show instances with lower IAE, evidencing the higher accuracy of a deeper network, with 8 hidden layers, in this particular situation.
In summary, the sensitivity experiments imply that the system does not lose much performance in terms of
IAE. In fact, the occurrence of lower IAE simulations is actually higher. Since the PINC control strategy has no
integrators, a small steady state error that depends on model match is expected. As there is more model mismatch
when the parameters k1 and k2 distance themselves from the nominal ones, the steady-state error is expected to
be higher but still within an acceptable range of IAEs.
Figure 16: Control of the four tanks system with PINC as in Fig. 13, but with maximum perturbation of 5% in k1
and k2 parameters of the model being controlled. Initial conditions as in Fig. 15, that is, h1 = h2 = 9. Control
results were adequate even though the IAE was 1022, which is in the tail of the histogram from Fig. 15(a) left plot.
4.3. Discussion
In the context of ODE simulation, our proposal has shown that, after the network is trained, it is possible to speed up the runtime of these simulations (by up to 30% on average). With further enhancements in the network inference (e.g., parallelization) or in the case of extending our method to PDEs, we envision an even higher gain (e.g., 10x faster, as reported in the literature for simulation of PDEs with PINNs).
One of the main obstacles to having a fully effective simulation from a PINC network is the long training of such networks. Nonetheless, this is a common issue in most proposals dealing with deep learning, and especially with PINNs. However, after training, the network can directly predict any state in the range [0, T] without requiring an integration with intermediate points, as numerical simulation methods do.
We noticed that a precise optimization algorithm towards the end of the training (e.g., L-BFGS) is essential for obtaining a precise model. Besides, preliminary work on identifying more complex plants (e.g., in the oil and gas industry) shows that skip connections (Lee et al., 2015; He et al., 2016) can help the training of deep networks by helping to backpropagate the gradient to the deepest layers during training, further improving the precision of the final trained model.
Furthermore, challenges to the learning of PINNs can arise from discontinuities in the ODE equations that model the plant, such as the presence of the max operator. Also, the random initialization of the weights of neural networks may cause different results and can produce invalid arguments to functions such as the square root, if present in the ODE equations. Some fixes or workarounds can be applied in these cases.
5. Conclusion
We have proposed a new framework that makes Physics-Informed Neural Networks (PINNs) amenable to
control methods, such as MPC, opening a wide range of application opportunities. This Physics-Informed Neural
Nets-based Control (PINC) approach allows a PINN to work for longer-range time intervals that are not fixed
beforehand, without severe prediction degradation as it normally does, and makes it easy to employ such networks
in MPC applications. In control applications, this framework (a) provides a way to identify a system by integrating
collected data from a plant with a priori expert knowledge in the form of ordinary differential equations; (b) can
simulate differential equations faster than numerical solution methods, specially if extended to Partial Differential
Equations (PDE), making PINNs more appealing to real-time control applications. Although we only used initial
20
conditions as real training data, we foresee that using additional sparse data will make the training of PINC nets
much faster.
In future work, we intend to extend the framework to systems described by Differential-Algebraic Equations (DAEs) and PDEs, and to systems for which prior knowledge is uncertain (unknown parameters), as well as to apply PINC to industrial control problems, such as in the oil and gas industry, for which some prior knowledge of the ODEs is available in addition to very noisy or sparse data. We expect that the reduction in the computational burden of using PINC for control scenarios will be even more relevant in comparison to the numerical solution approach as the model becomes increasingly more complex, or in the case of models described by PDEs. Finally, we envision that the application of system identification in an industrial setting will expand if we use complementary sources of information for training deep networks, that is, physical laws and historical sparse data, making feasible a wide range of previously unsolved applications in systems and control.
Acknowledgments
This work was funded in part by CNPq (Grant 308624/2021-1) and FAPESC (Grant 2021TR2265).
References
Åkesson, B. M., & Toivonen, H. T. (2006). A neural network model predictive controller. Journal of Process
Control, 16, 937–946.
Andersson, J., Åkesson, J., & Diehl, M. (2012). Dynamic optimization with CasADi. In Proceedings of the IEEE
Conference on Decision and Control (pp. 681–686). doi:10.1109/CDC.2012.6426534.
Andrew, G., & Gao, J. (2007). Scalable training of L1-regularized log-linear models. In Proceedings of the 24th
International Conference on Machine Learning ICML’07 (pp. 33–40). New York, NY, USA: Association for
Computing Machinery. doi:10.1145/1273496.1273501.
Antonelo, E. A., Camponogara, E., Seman, L. O., de Souza, E. R., Jordanou, J. P., & Hübner, J. F. (2021). Physics-informed neural nets for control of dynamical systems. URL: https://arxiv.org/abs/2104.02556. doi:10.48550/ARXIV.2104.02556.
Biegler, L. T. (2010). Nonlinear Programming: Concepts, Algorithms, and Applications to Chemical Processes.
Philadelphia: SIAM.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning (Information Science and Statistics). Springer.
Brandão, A. S. M. (2018). Controle Preditivo com Geração de Código: Um Estudo Comparativo. Master's thesis, Universidade Federal de Santa Catarina.
Brown, R. G., & Hwang, P. Y. C. (1992). Introduction to Random Signals and Applied Kalman Filtering. John
Wiley & Sons.
Camacho, E. F., & Bordons, C. (2013). Model Predictive Control. Springer Science & Business Media.
Cavagnari, L., Magni, L., & Scattolini, R. (1999). Neural network implementation of nonlinear receding-horizon
control. Neural computing & applications, 8, 86–92.
Chen, J., & Liu, Y. (2021). Probabilistic physics-guided machine learning for fatigue data analysis. Expert Systems
with Applications, 168, 114316. doi:10.1016/j.eswa.2020.114316.
Cheng, P., Chen, M., Stojanovic, V., & He, S. (2021). Asynchronous fault detection filtering for piece-
wise homogenous markov jump linear systems via a dual hidden markov model. Mechanical Systems
and Signal Processing, 151, 107353. URL: https://doi.org/10.1016/j.ymssp.2020.107353.
doi:10.1016/j.ymssp.2020.107353.
Eren, U., Prach, A., Koçer, B. B., Raković, S. V., Kayacan, E., & Açıkmeşe, B. (2017). Model predictive control
in aerospace systems: Current state and opportunities. Journal of Guidance, Control, and Dynamics, 40,
1541–1566. doi:10.2514/1.G002507.
Gill, P. E., Murray, W., & Saunders, M. A. (2005). SNOPT: An SQP algorithm for large-scale constrained
optimization. SIAM Review, 47, 99–131. doi:10.1137/S0036144504446096.
Gokhale, G., Claessens, B., & Develder, C. (2022). Physics informed neural networks for control oriented thermal
modeling of buildings. Applied Energy, 314, 118852.
Grüne, L., & Pannek, J. (2011). Nonlinear Model Predictive Control: Theory and Algorithms. Springer.
Harp, D. R., O’Malley, D., Yan, B., & Pawar, R. (2021). On the feasibility of using physics-informed machine
learning for underground reservoir pressure management. Expert Systems with Applications, 178, 115006.
doi:10.1016/j.eswa.2021.115006.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778).
Hertneck, M., Köhler, J., Trimpe, S., & Allgöwer, F. (2018). Learning an approximate model predictive controller
with guarantees. IEEE Control Systems Letters, 2, 543–548.
Iserles, A. (1996). A First Course in the Numerical Analysis of Differential Equations. Cambridge University
Press.
Johansson, K. (2000). The quadruple-tank process: A multivariable laboratory process with an adjustable zero.
IEEE Transactions on Control Systems Technology, 8, 456–465. doi:10.1109/87.845876.
Jordanou, J. P., Antonelo, E. A., & Camponogara, E. (2021). Echo state networks for practical nonlinear model
predictive control of unknown dynamic systems. IEEE Transactions on Neural Networks and Learning
Systems, (pp. 1–15). doi:10.1109/TNNLS.2021.3136357.
Jordanou, J. P., Camponogara, E., Antonelo, E. A., & Aguiar, M. A. S. (2018). Nonlinear model
predictive control of an oil well with echo state networks. IFAC-PapersOnLine, 51, 13–18.
doi:10.1016/j.ifacol.2018.06.348.
Kingma, D. P., & Ba, J. (2014). ADAM: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Kumar, A., Ridha, S., Narahari, M., & Ilyas, S. U. (2021). Physics-guided deep neural network to characterize
non-newtonian fluid flow for optimal use of energy resources. Expert Systems with Applications, (p. 115409).
doi:10.1016/j.eswa.2021.115409.
Ławryńczuk, M. (2014). Computationally Efficient Model Predictive Control Algorithms. Springer International
Publishing.
Lee, C.-Y., Xie, S., Gallagher, P., Zhang, Z., & Tu, Z. (2015). Deeply-supervised nets. In Artificial Intelligence
and Statistics (pp. 562–570). PMLR.
Liu, X.-Y., & Wang, J.-X. (2021). Physics-informed dyna-style model-based deep reinforcement learning for
dynamic control. Proceedings of the Royal Society A, 477, 20210618.
Meng, X., Li, Z., Zhang, D., & Karniadakis, G. E. (2020). PPINN: Parareal physics-informed neural net-
work for time-dependent PDEs. Computer Methods in Applied Mechanics and Engineering, 370, 113250.
doi:10.1016/j.cma.2020.113250.
Nascimento, T. P., Dórea, C. E. T., & Gonçalves, L. M. G. (2018). Nonlinear model predictive control for trajectory
tracking of nonholonomic mobile robots: A modified approach. International Journal of Advanced Robotic
Systems, 15. doi:10.1177/1729881418760461.
Nelles, O. (2001). Nonlinear System Identification: From Classical Approaches to Neural Networks and Fuzzy
Models. (1st ed.). Berlin: Springer.
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. (2nd ed.). New York, NY, USA: Springer.
Normey-Rico, J. E., & Camacho, E. F. (2007). Control of Dead-time Processes. Springer London.
doi:10.1007/978-1-84628-829-6.
Ortega, J. G., & Camacho, E. (1996). Mobile robot navigation in a partially structured static environment, using
neural predictive control. Control Engineering Practice, 4, 1669–1679.
Pan, Y., & Wang, J. (2012). Model predictive control of unknown nonlinear dynamical systems based on recurrent
neural networks. IEEE Transactions on Industrial Electronics, 59, 3089–3101.
Pang, G., & Karniadakis, G. E. (2020). Physics-informed learning machines for partial differential equations:
Gaussian processes versus neural networks. Kevrekidis P., Cuevas-Maraver J., Saxena A. (eds) Emerging
Frontiers in Nonlinear Science. Nonlinear Systems and Complexity, 32, 323–343.
Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2017). Physics informed deep learning (Part I): Data-driven
solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561, .
Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707. doi:10.1016/j.jcp.2018.10.045.
Schultz, W. C., & Rideout, V. C. (1961). Control system performance measures: Past, present, and future. IRE
Transactions on Automatic Control, AC-6, 22–35. doi:10.1109/TAC.1961.6429306.
Sirignano, J., & Spiliopoulos, K. (2018). DGM: A deep learning algorithm for solving partial differential equa-
tions. Journal of Computational Physics, 375, 1339–1364. doi:10.1016/j.jcp.2018.08.029.
Stinis, P. (2020). Enforcing constraints for time series prediction in supervised, unsupervised and
reinforcement learning. In Proceedings of the AAAI 2020 Spring Symposium on Combining
Artificial Intelligence and Machine Learning with Physical Sciences. volume 2587. URL:
http://ceur-ws.org/Vol-2587/article_5.pdf.
Terzi, E., Bonetti, T., Saccani, D., Farina, M., Fagiano, L., & Scattolini, R. (2020). Learning-based predictive control of the cooling system of a large business centre. Control Engineering Practice, 97, 104348. doi:10.1016/j.conengprac.2020.104348.
Wächter, A., & Biegler, L. T. (2006). On the implementation of an interior-point filter line-
search algorithm for large-scale nonlinear programming. Mathematical Programming, 106, 25–57.
doi:10.1007/s10107-004-0559-y.
Wei, T., Li, X., & Stojanovic, V. (2021). Input-to-state stability of impulsive reac-
tion–diffusion neural networks with infinite distributed delays. Nonlinear Dynam-
ics, 103, 1733–1755. URL: https://doi.org/10.1007/s11071-021-06208-6.
doi:10.1007/s11071-021-06208-6.
Witt, J., & Werner, H. (2010). Approximate model predictive control for nonlinear multivariable systems. Model
Predictive Control, (pp. 141–166). doi:10.5772/46955.
Y. Hafeez, H., Ndikilar, C. E., & Isyaku, S. (2015). Analytical study of the Van der Pol equation in the autonomous
regime. Progress in Physics, 11, 252–255.
Yang, L., Meng, X., & Karniadakis, G. E. (2021). B-PINNs: Bayesian physics-informed neural networks for
forward and inverse PDE problems with noisy data. Journal of Computational Physics, 425, 109913.
Zhai, H., & Sands, T. (2021). Physics-informed deep operator control: Controlling chaos in Van der Pol oscillating circuits. arXiv preprint arXiv:2112.14707.
Zhu, Y., Zabaras, N., Koutsourelakis, P.-S., & Perdikaris, P. (2019). Physics-constrained deep learning for high-
dimensional surrogate modeling and uncertainty quantification without labeled data. Journal of Computa-
tional Physics, 394, 56–81. doi:10.1016/j.jcp.2019.05.024.