PINNecho
Keywords: Echo state networks; Physics-informed neural networks; Chaotic dynamical systems

Abstract

We propose a physics-informed echo state network (ESN) to predict the evolution of chaotic systems. Compared to conventional ESNs, the physics-informed ESNs are trained to solve supervised learning tasks while ensuring that their predictions do not violate physical laws. This is achieved by introducing an additional loss function during the training, which is based on the system's governing equations. The additional loss function penalizes non-physical predictions without the need of any additional training data. This approach is demonstrated on a chaotic Lorenz system and a truncation of the Charney–DeVore system. Compared to the conventional ESNs, the physics-informed ESNs improve the predictability horizon by about two Lyapunov times. This approach is also shown to be robust with regard to noise. The proposed framework shows the potential of using machine learning combined with prior physical knowledge to improve the time-accurate prediction of chaotic dynamical systems.
1. Introduction

Over the past few years, there has been a rapid increase in the development of machine learning techniques, which have been applied with success to various disciplines, from image or speech recognition [1,2] to playing Go [3]. However, the application of such methods to the study and forecasting of physical systems has only recently been explored, including some applications in the field of fluid dynamics [4–7]. One of the major challenges in using machine learning algorithms for the study of complex physical systems is the prohibitive cost of data generation and acquisition for training [8,9]. However, in complex physical systems, there exists a large amount of prior knowledge, such as governing equations and conservation laws, which can be exploited to improve existing machine learning approaches. These hybrid approaches, called physics-informed machine learning or theory-guided data science [10], have been applied with some success to flow-structure interaction problems [4], turbulence modelling [5], the solution of partial differential equations (PDEs) [9], cardiovascular flow modelling [11], and physics-based object tracking in computer vision [12].

In this study, we propose an approach to combine physical knowledge with a machine learning algorithm to time-accurately forecast the evolution of chaotic dynamical systems. The machine learning tools we use are based on reservoir computing [13], in particular, echo state networks (ESNs). ESNs are used here instead of more conventional recurrent neural networks (RNNs), such as the long short-term memory unit, because ESNs proved particularly accurate in predicting chaotic dynamics for a longer time horizon than other machine learning networks [13]. ESNs are also generally easier to train than other RNNs, and they have recently been used to predict the evolution of spatiotemporal chaotic systems [14,15]. In the present study, ESNs are augmented by physical constraints to accurately forecast the evolution of two prototypical chaotic systems, the Lorenz system [16] and the Charney–DeVore system [17]. The robustness of the proposed approach with regard to noise is also analysed. Compared to previous physics-informed machine learning approaches, which mostly focused on identifying solutions of PDEs using feedforward neural networks [4,9,11], the approach proposed here is applied to a form of RNN for the modelling of chaotic systems. The objective is to train the ESN in conjunction with physical knowledge to reproduce the dynamics of the original system, and so for the ESN to be a digital twin of the real system.

Section 2 details the method used for training and for forecasting the dynamical systems, both with conventional ESNs and the newly proposed physics-informed ESNs (PI-ESNs). Results are presented in Section 3 and final comments are summarized in Section 4.
* Corresponding author at: Institute for Advanced Study, Technical University of Munich, Germany (visiting).
E-mail address: [email protected] (L. Magri).
https://doi.org/10.1016/j.jocs.2020.101237
Received 18 February 2020; Received in revised form 11 August 2020; Accepted 18 October 2020
Available online 31 October 2020
1877-7503/© 2020 Elsevier B.V. All rights reserved.
N.A.K. Doan et al. Journal of Computational Science 47 (2020) 101237
Fig. 1. Schematic of the ESN during (a) training and (b) future prediction. The physical constraints are imposed during the training phase (a).
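The training and closed-loop prediction phases sketched in Fig. 1 can be illustrated in a few lines of NumPy. This is a minimal sketch of a standard ESN (random sparse reservoir, tanh state update, ridge-regression readout), not the paper's exact formulation: the function names, the absence of a leak rate and biases, and the initialization details are assumptions for illustration only.

```python
import numpy as np

def make_reservoir(n_units, n_in, sigma_in=0.15, spectral_radius=0.4,
                   avg_degree=3, seed=0):
    """Random input matrix W_in and sparse reservoir W (illustrative init).

    sigma_in is the input scaling, spectral_radius plays the role of the
    paper's Lambda, and avg_degree the average connectivity <d>.
    """
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-sigma_in, sigma_in, size=(n_units, n_in))
    # Sparse reservoir with ~avg_degree nonzero entries per row.
    W = rng.uniform(-1.0, 1.0, size=(n_units, n_units))
    W *= rng.random((n_units, n_units)) < avg_degree / n_units
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, inputs):
    """Teacher-forced state collection: x(n+1) = tanh(W_in u(n) + W x(n))."""
    x = np.zeros(W.shape[0])
    states = []
    for u in inputs:
        x = np.tanh(W_in @ u + W @ x)
        states.append(x)
    return np.array(states)

def ridge_readout(states, targets, gamma=1e-4):
    """W_out from ridge regression with Tikhonov factor gamma."""
    X = states
    return np.linalg.solve(X.T @ X + gamma * np.eye(X.shape[1]),
                           X.T @ targets).T

def predict(W_in, W, W_out, x0, u0, n_steps):
    """Closed-loop forecasting: the readout is fed back as the next input,
    mirroring the prediction phase of Fig. 1(b)."""
    x, u, outs = x0.copy(), u0.copy(), []
    for _ in range(n_steps):
        x = np.tanh(W_in @ u + W @ x)
        u = W_out @ x
        outs.append(u)
    return np.array(outs)
```

In this sketch the only trained quantity is `W_out`; `W_in` and `W` stay fixed after initialization, which is what makes ESNs cheaper to train than fully trained RNNs.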
function. Therefore, this procedure ensures that the ESN becomes predictive because of data training and that the ensuing prediction is consistent with the physics. It is motivated by the fact that, in many complex physical systems, the cost of data acquisition is prohibitive and, thus, there are many instances where only a small amount of data is available for the training of neural networks. In this context, most existing machine learning approaches lack robustness. The proposed approach better leverages the information content of the data that the recurrent neural network uses. The physics-informed framework is straightforward to implement because it only requires the evaluation of the residual; it does not require the computation of the exact solution. Practically, the optimization of W_out is performed with the L-BFGS-B algorithm [21], using the W_out obtained by ridge regression (Eq. (4)) as the initial guess.

3. Results

3.1. Lorenz system

The approach is first demonstrated on the Lorenz system [16], which is governed by

u̇1 = σ(u2 − u1)   (8a)
u̇2 = u1(ρ − u3) − u2   (8b)
u̇3 = u1u2 − βu3   (8c)

where ρ = 28, σ = 10 and β = 8/3. These are the standard values of the Lorenz system that spawn a chaotic solution [16]. The size of the training dataset is Nt = 1000 and the timestep between two time instants is Δt = 0.01. This corresponds to roughly 10 Lyapunov times [22].

The parameters of the reservoir, both for the conventional and PI-ESNs, are: σin = 0.15, Λ = 0.4 and 〈d〉 = 3. In the case of the conventional ESN, γ = 0.0001. These values of the hyperparameters are taken from previous studies [14,15].

For the PI-ESN, a prediction horizon of Np = 1000 points is used and the physical error is estimated by discretizing Eq. (8) with an explicit Euler time-integration scheme. The choice of Np = 1000 balances the error based on the data and the error based on the physical constraints. A balancing factor, similar to the Tikhonov regularization factor, could also be used for this purpose. However, the proposed method based on collocation points provides additional information for the training of the PI-ESN, as the physical residual has to be minimized at the collocation points. Increasing Np may be beneficial for the accuracy of the PI-ESN, but at the cost of a more computationally expensive training. Therefore, Np is chosen as a trade-off.

The predictions of the Lorenz system by the conventional ESN and the PI-ESN, for a particular case where the reservoir has 200 units, are compared with the actual evolution in Fig. 2, where the time is normalized by the largest Lyapunov exponent, λmax = 0.934. Fig. 3 shows the evolution of the associated normalized error, which is defined as

E(n) = ||u(n) − û(n)|| / 〈||u||²〉^(1/2)   (9)

where 〈⋅〉 denotes the time average. The PI-ESN shows a remarkable improvement of the time over which the predictions are accurate. Indeed, the time for the normalized error to exceed 0.2, which is the threshold used here to define the predictability horizon, increases from 4

Fig. 2. Prediction of the Lorenz system (a) u1, (b) u2, (c) u3 with the conventional ESN (dotted red lines) and the PI-ESN (dashed blue lines). The actual evolution of the Lorenz system is shown with full black lines.
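As a concrete illustration of the setup above, the sketch below generates a Lorenz training trajectory (the paper uses Nt = 1000 and Δt = 0.01) and evaluates a physics loss from the explicit-Euler residual of Eq. (8) at the collocation points, in the spirit of the term Ep in the PI-ESN loss. The function names and the use of RK4 for the reference integration are assumptions for illustration, not the paper's code.

```python
import numpy as np

RHO, SIGMA, BETA = 28.0, 10.0, 8.0 / 3.0  # standard chaotic parameters

def lorenz_rhs(u):
    """Right-hand side f(u) of the Lorenz system, Eq. (8)."""
    u1, u2, u3 = u
    return np.array([SIGMA * (u2 - u1),
                     u1 * (RHO - u3) - u2,
                     u1 * u2 - BETA * u3])

def generate_training_data(n_steps=1000, dt=0.01, u0=(1.0, 1.0, 1.0)):
    """Integrate the Lorenz system with RK4 to build a training trajectory."""
    u = np.array(u0, dtype=float)
    traj = [u.copy()]
    for _ in range(n_steps):
        k1 = lorenz_rhs(u)
        k2 = lorenz_rhs(u + 0.5 * dt * k1)
        k3 = lorenz_rhs(u + 0.5 * dt * k2)
        k4 = lorenz_rhs(u + dt * k3)
        u = u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj.append(u.copy())
    return np.array(traj)

def physics_loss(pred, dt=0.01):
    """Mean-squared explicit-Euler residual over the Np collocation points:
    R(n) = (u(n+1) - u(n))/dt - f(u(n)).  A physical trajectory makes this
    residual small (O(dt)); non-physical predictions are penalized."""
    resid = ((pred[1:] - pred[:-1]) / dt
             - np.array([lorenz_rhs(u) for u in pred[:-1]]))
    return np.mean(resid ** 2)
```

During PI-ESN training, a loss of this form would be evaluated on the network's own closed-loop predictions at the collocation points and added to the data misfit, so that no extra training data are needed.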
Fig. 5. (a) Evolution of the modal amplitudes of the CDV system (black to light
gray: u1 to u6 ). The shaded grey box indicates the data used for training. (b)
Phase plots of the u1 − u4 trajectory.
For the prediction, the parameters for the ESNs are: σin = 2.0, Λ = 0.9 and 〈d〉 = 3. For the conventional ESN, γ = 0.0001. These values are obtained after performing a grid search. For the PI-ESN, a prediction horizon of Np = 3000 points is used. Compared to the Lorenz system, where the same number of collocation points as training points was used, comparatively fewer collocation points are used here. This choice was made to decrease the computational cost of the optimization process, as the cost of computing Ep is proportional to Np. Nonetheless, that

Fig. 6. Prediction of the CDV system for (a) u1, u2 and u3 and (b) u4, u5 and u6 with the conventional ESN (dotted lines) and the PI-ESN (dashed lines). The actual evolution of the CDV system is shown with full lines.
Fig. 7. Error on the prediction from the conventional and PI-ESN for the prediction shown in Fig. 6.
Fig. 8. Mean predictability horizon of the conventional ESN (red line with
circles), PI-ESN (blue line with crosses), hybrid-b with ϵ = 0.05 (full green line
with triangles), hybrid-b with ϵ = 1.0 (dashed green line with triangles),
hybrid-C with ϵ = 0.05 (full orange line with downward triangles) and hybrid-C
with ϵ = 1.0 (dashed orange line with downward triangles) as a function of the
reservoir size (Nx ) for the CDV system. (For interpretation of the references to
color in this figure legend, the reader is referred to the web version of
this article.)
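The normalized error of Eq. (9) and the resulting predictability horizon (the first crossing of the 0.2 threshold, expressed in Lyapunov times) reported in figures such as Fig. 8 can be computed as follows; the function names are illustrative.

```python
import numpy as np

def normalized_error(u_true, u_pred):
    """Normalized error of Eq. (9): instantaneous misfit norm divided by
    the time-averaged norm of the reference solution."""
    denom = np.sqrt(np.mean(np.sum(u_true ** 2, axis=1)))
    return np.linalg.norm(u_true - u_pred, axis=1) / denom

def predictability_horizon(u_true, u_pred, dt, lyap_max, threshold=0.2):
    """First time at which E(n) exceeds the threshold, in Lyapunov times
    (i.e. scaled by the largest Lyapunov exponent lyap_max)."""
    E = normalized_error(u_true, u_pred)
    above = np.nonzero(E > threshold)[0]
    n_cross = above[0] if above.size else len(E)
    return n_cross * dt * lyap_max
```

For the Lorenz case above one would call `predictability_horizon(u_true, u_pred, dt=0.01, lyap_max=0.934)`; averaging this quantity over several initial conditions gives a mean predictability horizon of the kind plotted in Fig. 8.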
which is close to the exact model. The accuracy is, however, less marked than it is in the Lorenz system because an error in the parameters b or C is amplified by more significant model nonlinearities, as these parameters appear in all the governing equations of the CDV system (Eq. (10)). The accuracy of hybrid-b is lower than the accuracy of hybrid-C because the nonlinear dynamics is more sensitive to small errors in b, which affects all the coefficients of the CDV equations (Eqs. (10) and (11)). Similarly to the Lorenz system, when the model error is larger (ϵ = 1.0), the predictability horizon is smaller than with the accurate approximate model (ϵ = 0.05).

3.3. Robustness with respect to noise

In this section, we study the robustness of the results presented in the previous sections for the Lorenz and CDV systems with regard to noise. To do so, the training data used in Sections 3.1 and 3.2 are perturbed by adding Gaussian measurement noise to the training datasets. Two cases with signal-to-noise ratios (SNRs) of 20 and 30 dB are considered, which are typical noise levels encountered in experimental fluid mechanics [25].

The evolution of the Lorenz and CDV systems and the predictions from the conventional and PI-ESNs are shown in Figs. 9 and 10, respectively. In those figures, it is seen that the proposed approach still improves the prediction capability of the PI-ESN despite the training with noisy data. This originates from the physics-based regularization term in the loss function in Eq. (7), which provides the information required during the training as to how to appropriately filter the noise. Indeed, the physics-based loss provides the constraints that the components of the output have to satisfy, therefore providing an indication as to how to filter the noise. In addition, for the Lorenz system, the conventional ESN diverges during its prediction while the PI-ESN's prediction remains bounded. This highlights the improved robustness of the physics-informed approach. This is an encouraging result, which can potentially enable the use of the proposed approach with noisy data from physical experiments whose governing equations are known.

Fig. 9. (a) Prediction of the Lorenz system with the conventional ESN (dotted lines) and the PI-ESN (dashed lines) with 200 units trained from noisy data (SNR = 20 dB) and (b) zoom of the evolution before the divergence of the conventional ESN. The actual (noise-free) evolution of the Lorenz system is shown with full grey lines. (c) Error on the prediction for the conventional ESN and PI-ESN.

The mean predictability horizon for the two systems and the two noise levels is shown in Fig. 11, which also shows a comparison with the hybrid approach with ϵ = 0.05. For the Lorenz system, compared to the ESN trained on non-noisy data in Fig. 4, the mean predictability horizon is smaller. Furthermore, for the data-only ESN, the predictability horizon decreases for large reservoirs. This is because the ESN starts overfitting the noisy data and, thereby, reproduces a noisy behaviour and deteriorates its prediction. On the other hand, the PI-ESN maintains a satisfactory predictability horizon for the same large reservoirs. This indicates that the physics-based regularization in the loss function (Ep in Eq. (7)) enhances the robustness of the PI-ESN. The predictability
Fig. 11. Mean predictability horizon of the conventional ESN (dotted line with
circles), PI-ESN (full line with crosses), hybrid or hybrid-b (dashed-dotted line
with upward triangles) and hybrid-C (dashed line with downward triangles)
trained from noisy data (red: SNR = 20 dB, blue: SNR = 30 dB) as a function of
the reservoir size (Nx ) for the (a) Lorenz and (b) CDV systems. Hybrid methods
are used with ϵ = 0.05. (For interpretation of the references to color in this
figure legend, the reader is referred to the web version of this article.)
larger dimension than the Lorenz system. Hence, it would require larger reservoirs than those considered here before the occurrence of noise overfitting. Finally, the accuracy of the hybrid method is similar to that of the PI-ESN. As for the Lorenz system, this is because of the effect of the noisy data used in the training.
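The noisy training data of Section 3.3 can be emulated by adding zero-mean Gaussian measurement noise whose variance is set by the target SNR. The sketch below is an assumption about the noise model: in particular, defining the signal power per component is one reasonable choice, not necessarily the paper's exact procedure.

```python
import numpy as np

def add_noise(data, snr_db, rng=None):
    """Perturb each component of `data` (shape: time x components) with
    zero-mean Gaussian noise whose variance satisfies
    10 * log10(P_signal / P_noise) = snr_db."""
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(data ** 2, axis=0)            # per component
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    return data + rng.normal(0.0, np.sqrt(noise_power), size=data.shape)
```

Calling `add_noise(training_data, 20.0)` and `add_noise(training_data, 30.0)` would reproduce the two noise levels considered in the paper (20 and 30 dB).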
by using the underlying physical laws as constraints.

Conflict of interest

The authors declare no conflict of interest. Luca Magri, on behalf of all the authors.

Declaration of Competing Interest

The authors report no declarations of interest.

Acknowledgements

The authors acknowledge the support of the Technical University of Munich – Institute for Advanced Study, funded by the German Excellence Initiative and the European Union Seventh Framework Programme under grant agreement no. 291763. L.M. also acknowledges the Royal Academy of Engineering Research Fellowship Scheme.

References

[1] A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks, Adv. Neural Inf. Process. Syst. 25 (2012) 1097–1105.
[2] G. Hinton, L. Deng, D. Yu, G. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, B. Kingsbury, Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups, IEEE Signal Process. Mag. 29 (2012) 82–97.
[3] D. Silver, A. Huang, C.J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, D. Hassabis, Mastering the game of Go with deep neural networks and tree search, Nature 529 (2016) 484–489.
[4] M. Raissi, Z. Wang, M.S. Triantafyllou, G. Karniadakis, Deep learning of vortex-induced vibrations, J. Fluid Mech. 861 (2019) 119–137.
[5] J. Ling, A. Kurzawski, J. Templeton, Reynolds averaged turbulence modelling using deep neural networks with embedded invariance, J. Fluid Mech. 807 (2016) 155–166.
[6] S. Jaensch, W. Polifke, Uncertainty encountered when modelling self-excited thermoacoustic oscillations with artificial neural networks, Int. J. Spray Combust. Dyn. 9 (2017) 367–379.
[7] J.-L. Wu, H. Xiao, E. Paterson, Physics-informed machine learning approach for augmenting turbulence models: a comprehensive framework, Phys. Rev. Fluids 3 (2018) 074602, arXiv:1801.02762v3.
[8] K. Duraisamy, G. Iaccarino, H. Xiao, Turbulence modeling in the age of data, Annu. Rev. Fluid Mech. 51 (2019) 357–377, arXiv:1804.00183.
[9] M. Raissi, P. Perdikaris, G. Karniadakis, Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys. 378 (2019) 686–707.
[10] A. Karpatne, G. Atluri, J.H. Faghmous, M. Steinbach, A. Banerjee, A. Ganguly, S. Shekhar, N. Samatova, V. Kumar, Theory-guided data science: a new paradigm for scientific discovery from data, IEEE Trans. Knowl. Data Eng. 29 (2017) 2318–2331, arXiv:1612.08544.
[11] G. Kissas, Y. Yang, E. Hwuang, W.R. Witschey, J.A. Detre, P. Perdikaris, Machine learning in cardiovascular flows modeling: predicting arterial blood pressure from non-invasive 4D flow MRI data using physics-informed neural networks, Comput. Methods Appl. Mech. Eng. 358 (2020) 112623, arXiv:1905.04817.
[12] R. Stewart, S. Ermon, Label-free supervision of neural networks with physics and domain knowledge, 31st AAAI Conf. Artif. Intell. (2017) 2576–2582, arXiv:1609.05566.
[13] M. Lukoševičius, H. Jaeger, Reservoir computing approaches to recurrent neural network training, Comput. Sci. Rev. 3 (2009) 127–149.
[14] J. Pathak, A. Wikner, R. Fussell, S. Chandra, B.R. Hunt, M. Girvan, E. Ott, Hybrid forecasting of chaotic processes: using machine learning in conjunction with a knowledge-based model, Chaos 28 (2018) 041101, arXiv:1803.04779.
[15] J. Pathak, B. Hunt, M. Girvan, Z. Lu, E. Ott, Model-free prediction of large spatiotemporally chaotic systems from data: a reservoir computing approach, Phys. Rev. Lett. 120 (2018) 024102.
[16] E.N. Lorenz, Deterministic nonperiodic flow, J. Atmos. Sci. 20 (1963) 130–141.
[17] D.T. Crommelin, J.D. Opsteegh, F. Verhulst, A mechanism for atmospheric regime behavior, J. Atmos. Sci. 61 (2004) 1406–1419.
[18] M. Lukoševičius, A practical guide to applying echo state networks, in: G. Montavon, G.B. Orr, K.-R. Müller (Eds.), Neural Networks: Tricks of the Trade, Springer, 2012.
[19] P. Verzelli, C. Alippi, L. Livi, Echo state networks with self-normalizing activations on the hyper-sphere, Sci. Rep. 9 (2019) 1–14, arXiv:1903.11691.
[20] H. Jaeger, H. Haas, Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication, Science 304 (2004) 78–80.
[21] R.H. Byrd, P. Lu, J. Nocedal, C. Zhu, A limited memory algorithm for bound constrained optimization, SIAM J. Sci. Comput. 16 (1995) 1190–1208.
[22] S.H. Strogatz, Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering, Perseus Books Publishing, 1994.
[23] Z.Y. Wan, P. Vlachas, P. Koumoutsakos, T.P. Sapsis, Data-assisted reduced-order modeling of extreme events in complex dynamical systems, PLOS ONE 13 (2018) 1–22, arXiv:1803.03365.
[24] D.T. Crommelin, A.J. Majda, Strategies for model reduction: comparing different optimal bases, J. Atmos. Sci. 61 (2004) 2206–2217.
[25] N.T. Ouellette, H. Xu, E. Bodenschatz, A quantitative study of three-dimensional Lagrangian particle tracking algorithms, Exp. Fluids 40 (2006) 301–313.