
Engineering Applications of Artificial Intelligence 96 (2020) 103996


A tutorial on solving ordinary differential equations using Python and hybrid physics-informed neural network
Renato G. Nascimento, Kajetan Fricke, Felipe A.C. Viana ∗
Department of Mechanical and Aerospace Engineering, University of Central Florida, Orlando, FL 32816-8030, USA

Keywords: Physics-informed neural network; Scientific machine learning; Uncertainty quantification; Hybrid model Python implementation

Abstract: We present a tutorial on how to directly implement integration of ordinary differential equations through recurrent neural networks using Python. In order to simplify the implementation, we leveraged modern machine learning frameworks such as TensorFlow and Keras. Besides offering implementations of basic models (such as multilayer perceptrons and recurrent neural networks) and optimization methods, these frameworks offer powerful automatic differentiation. With all that, the main advantage of our approach is that one can implement hybrid models combining physics-informed and data-driven kernels, where data-driven kernels are used to reduce the gap between predictions and observations. Alternatively, we can also perform model parameter identification. In order to illustrate our approach, we used two case studies. The first one consisted of performing fatigue crack growth integration through Euler's forward method using a hybrid model combining a data-driven stress intensity range model with a physics-based crack length increment model. The second case study consisted of performing model parameter identification of a dynamic two-degree-of-freedom system through Runge–Kutta integration. The examples presented here as well as source codes are all open-source under the GitHub repository https://github.com/PML-UCF/pinn_code_tutorial.

1. Introduction

Deep learning and physics-informed neural networks (Cheng et al., 2018; Shen et al., 2018; Chen et al., 2018; Pang and Karniadakis, 2020) have received growing attention in science and engineering over the past few years. The fundamental idea, particularly with physics-informed neural networks, is to leverage laws of physics in the form of differential equations in the training of neural networks. This is fundamentally different from using neural networks as surrogate models trained with data collected at a combination of inputs and output values. Physics-informed neural networks can be used to solve the forward problem (estimation of response) and/or the inverse problem (model parameter identification).

Although there is no consensus on nomenclature or formulation, we see two different and very broad approaches to physics-informed neural networks. There are those using neural networks as approximate solutions for the differential equations (Chen et al., 2018; Raissi et al., 2019; Raissi and Karniadakis, 2018; Pan and Duraisamy, 2020). Essentially, through collocation points, the neural network hyperparameters are optimized to satisfy initial/boundary conditions as well as the constitutive differential equation itself. For example, Raissi et al. (2019) present an approach for solving and discovering the form of differential equations using neural networks, which has a companion GitHub repository (https://github.com/maziarraissi/PINNs) with detailed and documented Python implementation. The authors propose using deep neural networks to handle the direct problem of solving differential equations through the loss function (the functional used in the optimization of hyperparameters). The formulation is such that neural networks are parametric trial solutions of the differential equation and the loss function accounts for errors with respect to initial/boundary conditions and collocation points. The authors also present a formulation for learning the coefficients of differential equations given observed data (i.e., calibration). The proposed method is applied to both the Schroedinger equation, a partial differential equation used in quantum mechanics, and the Allen–Cahn equation, an established equation for describing reaction–diffusion systems. Pan and Duraisamy (2020) introduced a physics-informed machine learning approach to learn the continuous-time Koopman operator. The authors apply the derived method to nonlinear dynamical systems, in particular within the field of fluid dynamics, such as modeling the unstable wake flow behind a cylinder. In order to derive the method, the authors used a measure-theoretic approach to create a deep neural network. Both differential and recurrent model types are derived, where the latter is used when discrete trajectory data can be obtained, whereas the differential form is suitable when governing equations are at one's disposal. This physics-informed neural network approach shows its strength regarding uncertainty quantification and is robust against noisy input signals.

∗ Corresponding author.
E-mail address: [email protected] (F.A.C. Viana).
URL: https://pml-ucf.github.io/ (F.A.C. Viana).

https://doi.org/10.1016/j.engappai.2020.103996
Received 18 June 2020; Received in revised form 15 September 2020; Accepted 2 October 2020
0952-1976/© 2020 Elsevier Ltd. All rights reserved.

Alternatively, there are those building hybrid models that directly code reduced order physics-informed models within deep neural networks (Nascimento and Viana, 2020; Yucesan and Viana, 2020; Dourado and Viana, 2020; Karpatne et al., 2017; Singh et al., 2019). This implies that the computational cost of these physics-informed kernels has to be comparable to the linear algebra found in neural network architectures. It also means that tuning the physics-informed kernel hyperparameters through backpropagation requires that adjoints be readily available (through automatic differentiation (Baydin et al., 2018), for example). For example, Yucesan and Viana (2020) proposed a hybrid modeling approach which combines reduced-order models and machine learning for improving the accuracy of cumulative damage models used to predict wind turbine main bearing fatigue. The reduced-order models capture the behavior of bearing loads and bearing fatigue, while machine learning models account for uncertainty in grease degradation. The model was successfully used to predict grease degradation and bearing fatigue across a wind park and, with that, optimize the regreasing intervals. Karpatne et al. (2017) presented an interesting taxonomy for what the authors called theory-guided data science. In the paper, they discuss how one could augment machine learning models with physics-based domain knowledge and walk from simple correlation-based models, to hybrid models, to fully physics-informed machine learning (such as in solving differential equations directly). The authors discuss examples in hydrological modeling, computational chemistry, mapping surface water dynamics, and turbulence modeling.

We will focus on discussing a Python implementation for hybrid physics-informed neural networks. We believe these hybrid implementations can have an impact in real-life applications, where reduced order models capturing the physics are available and well adopted. Most of the time, the computational efficiency of reduced order models comes at the cost of loss of physical fidelity. Hybrid implementations of physics-informed neural networks can help reduce the gap between predictions and observed data.

Our approach starts with the analytical formulation and passes through the numerical integration method before landing in the neural network implementation. Depending on the application, different numerical integration methods can be used. While this is an interesting topic, it is not the focus of our paper. Instead, we will focus on how to move from the analytical formulation to the numerical implementation of the physics-informed neural network model.

We will address the implementation of ordinary differential equation solvers using two case studies in engineering. Fatigue crack propagation is used as an example of first order ordinary differential equations. In this example, we show how physics-informed neural networks can be used to mitigate epistemic (model-form) uncertainty in reduced order models. Forced vibration of a 2 degree-of-freedom system is used as an example of a system of second order ordinary differential equations. In this example, we show how physics-informed neural networks can be used to estimate model parameters of a physical system. The main intent of this paper is to be a tutorial for a hybrid implementation of physics-informed neural networks. The remainder of the paper is organized as follows. Section 2 specifies the implementation choices in terms of language, libraries, and public repositories (needed for replication of results). Section 3.2 presents the formulation and implementation for integrating a first order ordinary differential equation with the simple Euler's forward method. Section 3.3 details the formulation and implementation for integrating a system of coupled second order differential equations with the Runge–Kutta method. Section 4 closes the paper recapitulating salient points and presenting conclusions and future work. Finally, the Appendix summarizes concepts about neural networks used in this paper.

2. Code repository and replication of results

In this paper, we will use TensorFlow (Abadi et al., 2016) (version 2.0.0-beta1), Keras (Chollet et al., 2015), and the Python application programming interface. We will leverage the object orientation capabilities of the framework to define classes that implement the Euler's forward method. For further information on how to customize neural network architectures within TensorFlow, the reader is referred to the TensorFlow documentation (tensorflow.org).

In order to replicate our results, the interested reader can download the codes and data available at Fricke et al. (2020). Throughout this paper, we will highlight the main features of the codes found in this repository. We also refer to the PINN package (Viana et al., 2019), a freely available base package for physics-informed neural networks, which contains specialized implementations and examples of cumulative damage models.

3. Physics-informed neural network for ordinary differential equations

In this section, we will focus on our hybrid physics-informed neural network implementation for ordinary differential equations. This is especially useful for problems where physics-informed models are available, but known to have predictive limitations due to model-form uncertainty or model-parameter uncertainty. We start by providing the background on recurrent neural networks and then discuss how we implement them for numerical integration.

3.1. Background: Recurrent neural networks

Recurrent neural networks (Goodfellow et al., 2016) extend traditional feed-forward networks to handle time-dependent responses. As illustrated in Fig. 1, in every time step 𝑡, recurrent neural networks apply a transformation to a state 𝐲 such that

$$\mathbf{y}_t = f(\mathbf{y}_{t-1}, \mathbf{x}_t), \tag{1}$$

where 𝑡 ∈ [0, …, 𝑇] represents the time discretization; 𝐲 ∈ ℝ^{n_y} are the states representing the quantities of interest; 𝐱 ∈ ℝ^{n_x} are input variables; and 𝑓(.) is the transformation cell. Depending on the application, 𝐲 can be observed in every time step 𝑡 or only at specific observation times.

Popular recurrent neural network cell designs include the long short-term memory (Hochreiter and Schmidhuber, 1997) and the gated recurrent unit (Cho et al., 2014), as illustrated in Fig. 1. Although very useful in data-driven applications (time-series data, Connor et al., 1994; Sak et al., 2014; speech recognition, Graves et al., 2013; text sequences, Sutskever et al., 2011; etc.), these cell designs do not implement numerical integration directly. In this paper, we will show how to implement specialized recurrent neural networks for numerical integration. The only requirements are that computations stay within linear algebra complexity (so that the computational cost stays comparable to any other neural network architecture) and that gradients with respect to trainable parameters are made available (so that backpropagation can be used for optimization). Keeping these two constraints in mind, we can design customized recurrent neural network cells that perform the desired integration technique. For the sake of illustration, in Sections 3.2 and 3.3, we customize two recurrent neural network cells, one for Euler integration and one for Runge–Kutta integration, as shown in Fig. 1.
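Before specializing any cells, it is worth seeing how little machinery Eq. (1) actually requires. The following sketch (ours, purely illustrative and independent of TensorFlow) marches an arbitrary cell through time; a hand-written Euler cell for dy/dt = -y plays the role of the transformation 𝑓(.):

def march(cell, y0, x_series):
    # repeatedly apply y_t = f(y_{t-1}, x_t), i.e., Eq. (1)
    y, history = y0, [y0]
    for x_t in x_series:
        y = cell(y, x_t)
        history.append(y)
    return history

# example: Euler cell integrating dy/dt = -y with step h = 0.1
h = 0.1
euler_cell = lambda y, x: y + h * (-y)
print(march(euler_cell, 1.0, range(50))[-1])   # roughly exp(-5)

The physics-informed cells of Sections 3.2 and 3.3 follow exactly this pattern, with the cell written as a Keras layer so that it can carry trainable parameters.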


Fig. 1. Recurrent neural network. Two popular cell designs used in data-driven applications are illustrated in contrast with the two physics-informed cells we discuss in this paper.

3.2. First order ordinary differential equations

Consider the first order ordinary differential equation expressed in the form

$$\frac{dy}{dt} = f(\mathbf{x}(t), y, t), \tag{2}$$

where 𝐱(𝑡) are controllable inputs to the system, 𝑦 is the output of interest, and 𝑡 is time. The solution to Eq. (2) depends on initial conditions (𝑦 at 𝑡 = 0), input data (𝐱(𝑡) known at different time steps), and the computational cost associated with the evaluation of 𝑓(.).

3.2.1. Case study: Fatigue crack propagation

In this case study, we consider the tracking of low cycle fatigue damage. We are particularly interested in a control point that is monitored for a fleet of assets (e.g., compressors, aircraft, etc.). This control point sits at the center of a large plate in which loads are applied perpendicularly to the crack plane. As depicted in Fig. 2(a), under such circumstances, fatigue crack growth progresses following Paris law (Paris and Erdogan, 1963)

$$\frac{da}{dN} = C\,(\Delta K(t))^m \quad \text{and} \quad \Delta K(t) = F \Delta S(t) \sqrt{\pi a(t)}, \tag{3}$$

where 𝑎 is the fatigue crack length, 𝐶 and 𝑚 are material properties, Δ𝐾 is the stress intensity range, Δ𝑆 is the far-field cyclic stress time history, and 𝐹 is a dimensionless function of geometry (Dowling, 2012).

We assume that the control point inspection occurs at regular intervals. Scheduled inspection of part of the fleet is adopted to reduce the cost associated with it (mainly downtime, parts, and labor). As inspection data is gathered, the predictive models for fatigue damage are updated. In turn, the updated models can be used to guide the decision of which machines should be inspected next.

Fig. 2(b) illustrates all the data used in this case study. There are 300 machines, each one accumulating 7300 loading cycles. Not all machines are subjected to the same mission mix. In fact, the duty cycles can greatly vary, driving different fatigue damage accumulation rates throughout the fleet. In this case study, we consider that while the history of cyclic loads is known throughout the operation of the fleet, the crack length history is not available. We divided the entire data set consisting of 300 machines into 60 assets used for training and 240 assets providing the test data sets. In real life, these numbers depend on the cost associated with inspection (grounding the aircraft implies loss of revenue besides the cost of the actual inspection). For the sake of this example, we observed 60 time histories of 7300 data points each (a total of 438,000 input points) and only 60 output observations. The test data consists of 240 time histories of 7300 data points each (a total of 1,752,000 input points) and no crack length observations. In order to highlight the benefits of the hybrid implementation, we use only 60 crack length observations taken after the entire load cycle regime. The fact that we directly implemented the governing equation in a recurrent neural network cell compensates for the small number of available output data points. Hence, for training purposes we only use the aforementioned 60 assets, while the data for the remaining 240 machines can be utilized as a validation data set.

3.2.2. Computational implementation

For the sake of this example, assume that the material is characterized by 𝐶 = 1.5 × 10⁻¹¹, 𝑚 = 3.8, and 𝐹 = 1, and that the initial crack length is 𝑎₀ = 0.005 m. Δ𝑆(𝑡) is estimated either through structural health monitoring systems or high-fidelity finite element analysis together with cycle count methods (e.g., the rain flow method, Collins, 1993). This way, the numerical method used to solve Eq. (3) hinges on the availability and computational cost associated with Δ𝑆(𝑡). In this example, let us assume that the far-field stresses are available at every cycle at very low computational cost (for example, the load cases are known and stress analysis is performed beforehand).

Within folder first_order_ode of the repository available at Fricke et al. (2020), the interested reader will find all data files used in this case study. File a0.csv has the value for the initial crack length (𝑎₀ = 0.005 m) used throughout this case study. Files Stest.csv and Strain.csv contain the load histories for the fleet of 300 machines as well as the 60 machines used in the training of the physics-informed neural network. Files atest.csv and atrain.csv contain the fatigue crack length histories for the fleet as well as the 60 observed crack lengths used in the training of the physics-informed neural network.

We will then show how Δ𝐾ₜ can be estimated through a multilayer perceptron (MLP), which works as a corrector on any poor estimation of either Δ𝑆(𝑡) or Δ𝐾ₜ (should it have been implemented through a physics-informed model).


Fig. 2. Fatigue crack propagation details. Crack growth is governed by Paris law, Eq. (3), a first order ordinary differential equation. The input is time history of far-field stress
cyclic loads, 𝛥𝑆, and the output is the fatigue crack length, 𝑎. In this case study, 300 aircraft are submitted to a wide range of loads (due to different missions and mission mixes).
This explains the large variability in observed crack length after 5 years of operation.
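Before building the hybrid model, it may help to see Eq. (3) integrated by a physics-only Euler loop. The sketch below is ours and purely illustrative (it is not part of the repository): it uses the material constants given in Section 3.2.2 and a synthetic constant far-field stress range in place of a real load history:

import numpy as np

C, m, F = 1.5e-11, 3.8, 1.0
a = 0.005                         # initial crack length a0 (m)
dS = np.full(7300, 100.0)         # synthetic stress range history, one entry per cycle
for dS_t in dS:
    dK = F * dS_t * np.sqrt(np.pi * a)   # stress intensity range, Eq. (3)
    a = a + C * dK**m                    # Euler forward step with unit cycle increment
print(a)                          # crack length after 7300 cycles

The hybrid model of Section 3.2.2 keeps this loop intact and only replaces the Δ𝐾 expression with a trainable multilayer perceptron.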

Therefore, we can simply use the Euler's forward method (Press et al., 2007) (with unit time step) to obtain

$$a_n = a_0 + \sum_{t=1}^{n} \Delta a(\Delta S_t, a_{t-1}), \tag{4}$$

$$\Delta a_t = C \Delta K_t^m \quad \text{and} \quad \Delta K_t = \mathrm{MLP}\left(\Delta S_t, a_{t-1}; \mathbf{w}, \mathbf{b}\right), \tag{5}$$

where 𝐰 and 𝐛 are the trainable hyperparameters.

Similarly to regular neural networks, we use observed data to tune 𝐰 and 𝐛 by minimizing a loss function. Here we use the mean squared error:

$$\Lambda = \frac{1}{n}(\mathbf{a} - \hat{\mathbf{a}})^T(\mathbf{a} - \hat{\mathbf{a}}), \tag{6}$$

where 𝑛 is the number of observations, 𝐚 are the fatigue crack length observations, and 𝐚̂ are the fatigue crack lengths predicted using the hybrid physics-informed neural network.

Listing 1 lists all the necessary packages. Besides pandas and numpy for data importation and manipulation, we import a series of packages out of TensorFlow. We import Dense and RNN to leverage native multilayer perceptron and recurrent neural networks (see the Appendix for a brief overview of these architectures). We import Sequential and Layer so that we can specialize our model. We import RMSprop for hyperparameter optimization. Finally, the other operators are needed to move data to TensorFlow-friendly structures.

Listing 2 shows the important snippets of the implementation of the Euler integrator cell (to avoid clutter, we leave out the lines that are needed for data-type reinforcement). This is a class inherited from Layer, which is what TensorFlow recommends for implementing custom layers. The __init__ method, constructor of the EulerIntegratorCell, assigns the constants 𝐶 and 𝑚 as well as the initial state 𝑎₀. This method also creates an attribute dKlayer in the EulerIntegratorCell object. As we will detail later, this is an interesting feature that essentially allows us to specify any model for dKlayer. Although dKlayer can be implemented using physics, as we discussed before, we will illustrate the case in which dKlayer is a multilayer perceptron. The call method effectively implements Eq. (5). With regards to numerically integrating fatigue crack growth, we still have to implement Eq. (4).

Here, we will use the TensorFlow native recurrent neural network class, RNN, to effectively march in time and, therefore, implement Eq. (4). Listing 3 details how we can use an object from the EulerIntegratorCell and couple it with RNN to create a model ready to be trained. The function create_model takes C, m, a0, and dKlayer so that an EulerIntegratorCell object can be instantiated. Additionally, it also takes batch_input_shape and return_sequences. The variable batch_input_shape is used within EulerIntegratorCell to reinforce the shape of the inputs. Although batch_input_shape is not directly specified in EulerIntegratorCell, it belongs to **kwargs and it will be consumed in the constructor of Layer.

Eq. (4) starts to be implemented when PINN, the object of class RNN, is instantiated. As a recurrent neural network, PINN has the ability to march through time and execute the call method of the euler object. Lines 10 to 14 of Listing 3 are needed so that an optimizer and loss function are linked to the model that will be created. In this example, we use the mean square error ('mse') as loss function and RMSprop as an optimizer.

With EulerIntegratorCell and create_model defined, we can proceed to training and predicting with the hybrid physics-informed neural network model. Listing 4 details how to build the main portion of the Python script. From line 2 to line 9, we are simply defining the material properties and loading the data. After that, we can create the dKlayer model. Within TensorFlow, Sequential is used to create models that are stacks of several layers. Dense is used to define a layer of a neural network. Line 12 initializes dKlayer, preparing it to receive the different layers in sequence. Line 13 adds the first layer with 5 neurons (and tanh as activation function). Line 14 adds the second layer with 1 neuron. Creating the hybrid physics-informed neural network model is as simple as calling create_model, as shown in line 19. As is, model is ready to be trained, which is done in line 23. For the sake of the example though, we can check the predictions before and after the training (lines 22 and 24, respectively). The fact that we have to slice the third dimension of the array with [:,:] is simply an artifact of TensorFlow. The way the code is implemented, predictions are done by marching through time while integrating fatigue crack growth starting from a0. However, since we have set return_sequences=False (default in create_model), the predictions are returned only for the very last cycle. Setting that flag to True would change the behavior of predict_on_batch, which would return the entire time series.

Fig. 3 illustrates the results obtained when running the codes within folder first_order_ode available at Fricke et al. (2020). Fig. 3(a) shows the history of the loss function (mean square error) throughout the training. The loss converges rapidly within the first ten epochs and shows minor further convergence in the following ten epochs. We would like to point out that experienced TensorFlow users could further customize the implementation to stop the hyperparameter optimization as the loss function converges. Fig. 3(b) shows the prediction against the actual fatigue crack length at the last loading cycle for a test set (data points not used to train the physics-informed neural network). While results may vary from run to run, given that RMSprop implements a stochastic gradient descent algorithm, it is clear that the hybrid physics-informed neural network was able to learn the latent (hidden) stress intensity range model. Finally, we repeated the training of the proposed physics-informed neural network 100 times so that we can study the repeatability of results. Fig. 3(c) shows the histograms of the mean squared error at both training and test sets. Most of the time, the mean squared error is below 20 × 10⁻⁶ (m)², while it was never above 100 × 10⁻⁶ (m)². Considering that the observed crack lengths are within 5 × 10⁻³ (m) to 35 × 10⁻³ (m), these values of mean square error are sufficiently small.
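Since dKlayer carries all the trainable weights, it can be reused after training to inspect the full integrated histories. A minimal sketch (ours; it reuses create_model from Listing 3 and the variables of Listing 4, and a0_test is a name we introduce for illustration):

# initial crack length replicated for the 240 test machines
a0_test = np.asarray(pd.read_csv('./data/a0.csv'))[0, 0] * np.ones((Stest.shape[0], 1))
model_seq = create_model(C=C, m=m, dKlayer=dKlayer,
                         a0=convert_to_tensor(a0_test, dtype=float32),
                         batch_input_shape=Stest.shape,
                         return_sequences=True)
aHist = model_seq.predict_on_batch(Stest)   # crack length at every cycle, not only the last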


1 # basic packages
2 import pandas as pd
3 import numpy as np
4
5 # keras essentials
6 from tensorflow.keras.layers import RNN, Dense, Layer
7 from tensorflow.keras import Sequential
8 from tensorflow.keras.optimizers import RMSprop
9
10 # tensorflow operators
11 from tensorflow.python.framework import tensor_shape
12 from tensorflow import float32, concat, convert_to_tensor

Listing 1: Import section for the Euler integration example.

1 class EulerIntegratorCell(Layer):
2     def __init__(self, C, m, dKlayer, a0, units=1, **kwargs):
3         super(EulerIntegratorCell, self).__init__(**kwargs)
4         self.units = units
5         self.C = C
6         self.m = m
7         self.a0 = a0
8         self.dKlayer = dKlayer
9
10        ...
11
12    def call(self, inputs, states):
13
14        ...
15
16        x_d_tm1 = concat((inputs, a_tm1[0, :]), axis=1)
17        dk_t = self.dKlayer(x_d_tm1)
18        da_t = self.C * (dk_t ** self.m)
19        a = da_t + a_tm1[0, :]
20        return a, [a]

Listing 2: Euler integrator cell.
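The elided lines in Listing 2 hold the housekeeping that the Keras cell contract expects, notably a state_size property, a build method, and the initial state. A minimal sketch of what they can look like (ours, written against the generic Keras cell interface; the repository version may differ in details):

    @property
    def state_size(self):
        # tells the RNN wrapper the dimension of the recurrent state
        return self.units

    def build(self, input_shape, **kwargs):
        # the cell itself holds no trainable weights; dKlayer carries them
        self.built = True

    def get_initial_state(self, inputs=None, batch_size=None, dtype=None):
        # start the integration at the initial crack length a0
        return convert_to_tensor(self.a0, dtype=float32)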

1 def create_model(C, m, dKlayer, a0, batch_input_shape, return_sequences):
2     euler = EulerIntegratorCell(C=C, m=m, dKlayer=dKlayer, a0=a0,
3                                 batch_input_shape=batch_input_shape, return_state=False)
4     PINN = RNN(cell=euler, batch_input_shape=batch_input_shape,
5                return_sequences=return_sequences, return_state=False)
6     model = Sequential()
7     model.add(PINN)
8     model.compile(loss='mse', optimizer=RMSprop(1e-2))
9     return model

Listing 3: Create model function for the Euler integration example.

3.3. System of second order ordinary differential equations

In this section, we will focus on our hybrid physics-informed neural network implementation of a system of second order ordinary differential equations. In the case study, we will highlight the useful aspect of system identification, that is, when observed data is used to estimate parameters of the governing equations.

Consider the system of second order ordinary differential equations expressed in the form

$$\mathbf{P}(t)\frac{d^2\mathbf{y}}{dt^2} + \mathbf{Q}(t)\frac{d\mathbf{y}}{dt} + \mathbf{R}(t)\mathbf{y} = \mathbf{u}(t), \tag{7}$$

where 𝐮(𝑡) are controllable inputs to the system, 𝐲 are the outputs of interest, and 𝑡 is time. The solution to Eq. (7) depends on initial conditions (𝐲 as well as 𝑑𝐲/𝑑𝑡 at 𝑡 = 0) and input data (𝐮(𝑡) known at different time steps).

3.3.1. Case study: Forced vibration of 2-degree-of-freedom system

In this case study, we consider the motion of two masses linked together by springs and dashpots, as depicted in Fig. 4(a). The number of degrees of freedom of a system is the number of independent coordinates necessary to define motion (equal to the number of masses in this case). Under such circumstances, the equations of motion are obtained using Newton's second law

$$\mathbf{M}\ddot{\mathbf{y}} + \mathbf{C}\dot{\mathbf{y}} + \mathbf{K}\mathbf{y} = \mathbf{u}, \quad \text{or alternatively} \quad \ddot{\mathbf{y}} = f(\mathbf{u}, \dot{\mathbf{y}}, \mathbf{y}) = \mathbf{M}^{-1}\left(\mathbf{u} - \mathbf{C}\dot{\mathbf{y}} - \mathbf{K}\mathbf{y}\right), \tag{8}$$


1 if __name__ == "__main__":
2     # Paris law coefficients
3     [C, m] = [1.5E-11, 3.8]
4
5     # data
6     Stest = np.asarray(pd.read_csv('./data/Stest.csv'))[:, :, np.newaxis]
7     Strain = np.asarray(pd.read_csv('./data/Strain.csv'))[:, :, np.newaxis]
8     atrain = np.asarray(pd.read_csv('./data/atrain.csv'))
9     a0 = np.asarray(pd.read_csv('./data/a0.csv'))[0, 0] * np.ones((Strain.shape[0], 1))
10
11    # stress-intensity layer
12    dKlayer = Sequential()
13    dKlayer.add(Dense(5, input_shape=(2,), activation='tanh'))
14    dKlayer.add(Dense(1))
15
16    ...
17
18    # fitting physics-informed neural network
19    model = create_model(C=C, m=m, dKlayer=dKlayer,
20                         a0=convert_to_tensor(a0, dtype=float32),
21                         batch_input_shape=Strain.shape)
22    aPred_before = model.predict_on_batch(Stest)[:, :]
23    model.fit(Strain, atrain, epochs=20, steps_per_epoch=1, verbose=1)
24    aPred = model.predict_on_batch(Stest)[:, :]

Listing 4: Training and predicting in the Euler integration example.
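The elision at line 16 of Listing 4 leaves out how dKlayer is prepared before the hybrid model is trained. One plausible preparation, sketched below under our own assumptions (the ranges and number of epochs are illustrative, not repository values), is to pre-train the MLP on the physics-based estimate Δ𝐾 = Δ𝑆 √(𝜋𝑎) from Eq. (3) with 𝐹 = 1, so that the optimizer starts from a sensible stress intensity model:

# hypothetical pre-training of dKlayer on a rough physics-based estimate of dK
S_range = np.linspace(50.0, 150.0, 1000)
a_range = np.linspace(0.005, 0.05, 1000)
dK_range = S_range * np.sqrt(np.pi * a_range)   # Eq. (3) with F = 1

dKlayer.compile(loss='mse', optimizer=RMSprop(1e-2))
dKlayer.fit(np.transpose([S_range, a_range]), dK_range, epochs=100, verbose=0)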

Fig. 3. Euler integration results. After training is complete, the model-form uncertainty is greatly reduced. The trained model can be used directly for predictions outside the training set. We observe repeatability of results after repeating the training of the physics-informed neural network with varying initialization of weights.

where:

$$\mathbf{M} = \begin{bmatrix} m_1 & 0 \\ 0 & m_2 \end{bmatrix}, \quad \mathbf{C} = \begin{bmatrix} c_1 + c_2 & -c_2 \\ -c_2 & c_2 + c_3 \end{bmatrix}, \quad \mathbf{K} = \begin{bmatrix} k_1 + k_2 & -k_2 \\ -k_2 & k_2 + k_3 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \quad \text{and} \quad \mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}. \tag{9}$$

We assume that while the masses and spring coefficients are known, the damping coefficients are not. Once these coefficients are estimated based on available data, the equations of motion can be used for predicting the mass displacements given any input conditions (useful for design of vibration control strategies, for example).

Fig. 4(b) and 4(c) illustrate the data used in this case study. Here, we used 𝑚1 = 20 (kg), 𝑚2 = 10 (kg), 𝑐1 = 30 (N.s/m), 𝑐2 = 5 (N.s/m), 𝑐3 = 10 (N.s/m), 𝑘1 = 2 × 10³ (N/m), 𝑘2 = 1 × 10³ (N/m), and 𝑘3 = 5 × 10³ (N/m) to generate the data. On the training data, a constant force 𝑢1(𝑡) = 1 (N) is applied to mass 𝑚1, while 𝑚2 is left free. On the test data, time-varying forces are applied to both masses. The displacements of both masses are observed every 0.002 (s) for two seconds. The observed displacements of the training data are contaminated with Gaussian noise with zero mean and 1.5 × 10⁻⁵ standard deviation.

3.3.2. Computational implementation

Within folder second_order_ode of the repository available at Fricke et al. (2020), the interested reader will find the training and test data in the data.csv and data02.csv files, respectively. The time stamp is given by column t. The input forces are given by columns u1 and u2. The measured displacements are given by columns yT1 and yT2. Finally, the actual (but unknown) displacements are given by columns y1 and y2.

With defined initial conditions 𝐲(𝑡 = 0) = 𝐲₀ and 𝐲̇(𝑡 = 0) = 𝐲̇₀, we can use the classic Runge–Kutta method (Press et al., 2007; Butcher and Wanner, 1996) to numerically integrate Eq. (8) over time with time step ℎ:


Fig. 4. Forced vibration details. Response of a two degree of freedom system is a function of input forces applied at the two masses. Training data is contaminated with Gaussian
noise (emulating noise in sensor reading). Test data is significantly different from training data.

$$\begin{bmatrix} \dot{\mathbf{y}}_{n+1} \\ \mathbf{y}_{n+1} \end{bmatrix} = \begin{bmatrix} \dot{\mathbf{y}}_n \\ \mathbf{y}_n \end{bmatrix} + h \sum_i b_i \boldsymbol{\kappa}_i, \qquad \boldsymbol{\kappa}_i = \begin{bmatrix} \mathbf{k}_i \\ \bar{\mathbf{k}}_i \end{bmatrix},$$

$$\mathbf{k}_1 = f(\mathbf{u}_n, \dot{\mathbf{y}}_n, \mathbf{y}_n), \qquad \bar{\mathbf{k}}_1 = \dot{\mathbf{y}}_n,$$

$$\mathbf{k}_i = f\Bigl(\mathbf{u}_{n + c_i h},\; \dot{\mathbf{y}}_n + h \sum_{j}^{i-1} a_{ij}\,\mathbf{k}_j,\; \mathbf{y}_n + h \sum_{j}^{i-1} a_{ij}\,\bar{\mathbf{k}}_j\Bigr), \qquad \bar{\mathbf{k}}_i = \dot{\mathbf{y}}_n + h \sum_{j}^{i-1} a_{ij}\,\mathbf{k}_j, \tag{10}$$

$$\mathbf{A} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 1/2 & 0 & 0 & 0 \\ 0 & 1/2 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} 1/6 \\ 1/3 \\ 1/3 \\ 1/6 \end{bmatrix}, \qquad \mathbf{c} = \begin{bmatrix} 0 \\ 1/2 \\ 1/2 \\ 1 \end{bmatrix}.$$

In this section, we will show how we can use observed data to tune specific coefficients in Eq. (8). Specifically, we will tune the damping coefficients 𝑐1, 𝑐2, and 𝑐3 by minimizing the mean squared error:

$$\Lambda = \frac{1}{n}(\mathbf{y} - \hat{\mathbf{y}})^T(\mathbf{y} - \hat{\mathbf{y}}), \tag{11}$$

where 𝑛 is the number of observations, 𝐲 are the observed displacements, and 𝐲̂ are the displacements predicted using the physics-informed neural network.

We will use all the packages shown in Listing 1, in addition to linalg imported from tensorflow (we did not show a separate listing to avoid clutter). Listing 5 shows the important snippets of the implementation of the Runge–Kutta integrator cell (to avoid clutter, we leave out the lines that are needed for data-type reinforcement). The __init__ method, constructor of the RungeKuttaIntegratorCell, assigns the mass, stiffness, and damping coefficient initial guesses, as well as the initial state and Runge–Kutta coefficients. The call method effectively implements Eq. (10) while the _fun method implements Eq. (8).

Listing 6 details how we use objects from RungeKuttaIntegratorCell and RNN to create a model ready to be trained. The function create_model takes the m, c, and k arrays, dt, initial_state, batch_input_shape, and return_sequences so that a RungeKuttaIntegratorCell object is instantiated. Parameter batch_input_shape is used within RungeKuttaIntegratorCell to reinforce the shape of the inputs (although it is not directly specified in RungeKuttaIntegratorCell, it belongs to **kwargs and it will be consumed in the constructor of Layer).

Similarly to the Euler example, we will use the TensorFlow native recurrent neural network class, RNN, to effectively march in time. The march-through-time portion of Eq. (8) starts to be implemented when PINN, the object of class RNN, is instantiated. As a recurrent neural network, PINN has the ability to march through time and execute the call method of the rkCell object. Lines 9 to 11 of Listing 6 are needed so that an optimizer and loss function are linked to the model that will be created. In this example, we use the mean square error ('mse') as loss function and RMSprop as an optimizer.

Listing 7 details how to build the main portion of the Python script. From line 2 to line 5, we are simply defining the masses, spring coefficients (which are assumed to be known), as well as damping coefficients, which are unknown and will be fitted using observed data (here, the values only represent an initial guess for the hyperparameter optimization). Creating the hybrid physics-informed neural network model is as simple as calling create_model, as shown in line 16. As is, model is ready to be trained, which is done in line 19. For the sake of the example though, we can check the predictions at the training set before and after the training (lines 18 and 20, respectively).

Fig. 5 illustrates the results obtained when running the codes within folder second_order_ode available at Fricke et al. (2020). Fig. 5(a) shows the history of the loss function (mean square error) throughout the training. Figs. 5(b) and 5(c) show the prediction against actual displacements. Similarly to the Euler case study, results may vary from run to run, depending on the initial guess for 𝑐1, 𝑐2, and 𝑐3 as well as the performance of RMSprop. The loss converges rapidly within 20 epochs and only marginally further improves after 40 epochs. As illustrated in Fig. 5(b), the predictions converge to the observations, filtering the noise in the data. Fig. 5(c) shows that the model parameters identified after training the model allowed for accurate predictions on the test set. In order to further evaluate the performance of the model, we created contaminated training data sets where we emulate the case that sensors used to read the output displacement exhibit a burst of high noise levels at different points in the time series. For example, Fig. 5(d) illustrates the case in which the burst of high noise level happens between 0.5 (s) and 0.75 (s); while in Fig. 5(e), this data corruption happened at two different time periods (0.1 to 0.2 (s) and 0.4 to 0.5 (s)). In both cases, the model parameters identified after training the model allowed for accurate predictions. Noise in the data imposes a challenge for model parameter identification. Table 1 lists the identified parameters for the separate model training runs with and without the bursts of corrupted data. As expected, 𝑐1 is easier to identify, since it is connected between the wall and 𝑚1, which is twice as large as 𝑚2. On top of that, the force is applied in 𝑚1.
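As a plain NumPy counterpart to the TensorFlow cell that follows, the sketch below (ours, for illustration only) integrates Eq. (8) with the classic Runge–Kutta stages of Eq. (10), using the parameter values printed in Section 3.3.1:

import numpy as np

M = np.diag([20.0, 10.0])                               # m1, m2
C = np.array([[30.0 + 5.0, -5.0], [-5.0, 5.0 + 10.0]])  # from c1, c2, c3 via Eq. (9)
K = np.array([[2e3 + 1e3, -1e3], [-1e3, 1e3 + 5e3]])    # from k1, k2, k3 via Eq. (9)
Minv = np.linalg.inv(M)

def f(u, ydot, y):
    # Eq. (8): y'' = M^-1 (u - C y' - K y)
    return Minv @ (u - C @ ydot - K @ y)

h = 0.002                           # time step (s)
u = np.array([1.0, 0.0])            # constant force on m1, as in the training data
y, ydot = np.zeros(2), np.zeros(2)
for _ in range(1000):               # two seconds of motion
    k1 = f(u, ydot, y);                        kb1 = ydot
    k2 = f(u, ydot + 0.5*h*k1, y + 0.5*h*kb1); kb2 = ydot + 0.5*h*k1
    k3 = f(u, ydot + 0.5*h*k2, y + 0.5*h*kb2); kb3 = ydot + 0.5*h*k2
    k4 = f(u, ydot + h*k3, y + h*kb3);         kb4 = ydot + h*k3
    ydot = ydot + (h/6.0)*(k1 + 2*k2 + 2*k3 + k4)
    y    = y    + (h/6.0)*(kb1 + 2*kb2 + 2*kb3 + kb4)
print(y)                            # displacements of m1 and m2 at t = 2 s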


1 class RungeKuttaIntegratorCell(Layer):
2     def __init__(self, m, c, k, dt, initial_state, **kwargs):
3         super(RungeKuttaIntegratorCell, self).__init__(**kwargs)
4         self.Minv = linalg.inv(np.diag(m))
5         self._c = c
6         self.K = self._getCKmatrix(k)
7         self.A = np.array([0., 0.5, 0.5, 1.0], dtype='float32')
8         self.B = np.array([[1/6, 2/6, 2/6, 1/6]], dtype='float32')
9         self.dt = dt
10        ...
11
12    def build(self, input_shape, **kwargs):
13        self.kernel = self.add_weight("C", shape=self._c.shape,
14            trainable=True, initializer=lambda shape, dtype: self._c,
15            **kwargs)
16        self.built = True
17
18    def call(self, inputs, states):
19        C = self._getCKmatrix(self.kernel)
20        y = states[0][:, :2]
21        ydot = states[0][:, 2:]
22
23        yddoti = self._fun(self.Minv, self.K, C, inputs, y, ydot)
24        yi = y + self.A[0] * ydot * self.dt
25        ydoti = ydot + self.A[0] * yddoti * self.dt
26        fn = self._fun(self.Minv, self.K, C, inputs, yi, ydoti)
27        for j in range(1, 4):
28            yn = y + self.A[j] * ydot * self.dt
29            ydotn = ydot + self.A[j] * yddoti * self.dt
30            ydoti = concat([ydoti, ydotn], axis=0)
31            fn = concat([fn, self._fun(self.Minv, self.K, C, inputs, yn, ydotn)], axis=0)
32
33        y = y + linalg.matmul(self.B, ydoti) * self.dt
34        ydot = ydot + linalg.matmul(self.B, fn) * self.dt
35        return y, [concat(([y, ydot]), axis=-1)]
36
37    def _fun(self, Minv, K, C, u, y, ydot):
38        return linalg.matmul(u - linalg.matmul(ydot, C, transpose_b=True) - linalg.matmul(y, K, transpose_b=True), Minv, transpose_b=True)
39    ...

Listing 5: Runge–Kutta integrator cell.
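The helper _getCKmatrix is elided from Listing 5. Given the tridiagonal pattern of 𝐂 and 𝐊 in Eq. (9), a minimal sketch of such a helper (ours; the repository version may assemble the tensor differently):

    def _getCKmatrix(self, a):
        # assemble [[a1 + a2, -a2], [-a2, a2 + a3]] as in Eq. (9)
        return convert_to_tensor([[a[0] + a[1], -a[1]],
                                  [-a[1], a[1] + a[2]]], dtype=float32)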

1 def create_model(m, c, k, dt, initial_state, batch_input_shape,
2                  return_sequences=True, unroll=False):
3     rkCell = RungeKuttaIntegratorCell(m=m, c=c, k=k, dt=dt,
4                                       initial_state=initial_state)
5     PINN = RNN(cell=rkCell, batch_input_shape=batch_input_shape,
6                return_sequences=return_sequences,
7                return_state=False, unroll=unroll)
8     model = Sequential()
9     model.add(PINN)
10    model.compile(loss='mse', optimizer=RMSprop(1e4), metrics=['mae'])
11    return model

Listing 6: Create model function for the Runge–Kutta integration example.

In this particular example, the outputs show low sensitivity to 𝑐2 and 𝑐3. Fig. 5(f) shows a comparison between the actual training data (in the form of mean and 95% confidence interval) and the predicted curves when 70 ≤ 𝑐2 ≤ 110 and 15 ≤ 𝑐3 ≤ 120. Despite the apparently large ranges for 𝑐2 and 𝑐3, their influence on the variation of the predicted output is still smaller than the noise in the data.

4. Summary and closing remarks

In this paper, we discussed Python implementations of ordinary differential equation solvers using recurrent neural networks with customized repeatable cells implementing hybrid physics-informed kernels. In our examples, we found that this approach is useful for both quantification of model-form uncertainty as well as model parameter estimation. We demonstrated our framework on two examples:

1 if __name__ == "__main__":
2     # masses, spring coefficients, and damping coefficients
3     m = np.array([20.0, 10.0], dtype='float32')
4     c = np.array([10.0, 10.0, 10.0], dtype='float32')  # initial guess
5     k = np.array([2e3, 1e3, 5e3], dtype='float32')
6
7     # data
8     df = pd.read_csv('./data/data.csv')
9     t = df[['t']].values
10    dt = (t[1] - t[0])[0]
11    utrain = df[['u0', 'u1']].values[np.newaxis, :, :]
12    ytrain = df[['yT0', 'yT1']].values[np.newaxis, :, :]
13
14    # fitting physics-informed neural network
15    initial_state = np.zeros((1, 2 * len(m)), dtype='float32')
16    model = create_model(m, c, k, dt, initial_state=initial_state,
17                         batch_input_shape=utrain.shape)
18    yPred_before = model.predict_on_batch(utrain)[0, :, :]
19    model.fit(utrain, ytrain, epochs=100, steps_per_epoch=1, verbose=1)
20    yPred = model.predict_on_batch(utrain)[0, :, :]

Listing 7: Training and predicting in the Runge–Kutta integration example.
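After model.fit, the identified damping coefficients sit in the single trainable weight that build registered under the name "C" in Listing 5, so they can be read back directly; a minimal sketch:

# the RNN cell's only trainable weight holds [c1, c2, c3]
c_identified = model.get_weights()[0]
print(c_identified)   # compare against the values reported in Table 1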

Fig. 5. Runge–Kutta integration results. After training, the damping coefficients are identified (Table 1). The model can be used in test cases that are completely different from training. Due to the nature of the physics that governs the problem, responses are less sensitive to coefficients 𝑐2 and 𝑐3, when compared to 𝑐1. Nevertheless, model identification is successful even when the noise level varies throughout the training data.

Table 1
Identified damping coefficients. Actual values for the coefficients are 𝑐1 = 100.0, 𝑐2 = 110.0, and 𝑐3 = 120.0. Due to the nature of the physics that governs the problem, responses are less sensitive to coefficients 𝑐2 and 𝑐3, when compared to 𝑐1 (Fig. 5(f)).

Noise in observed output                          | 𝑐1    | 𝑐2   | 𝑐3
Gaussian                                          | 115.1 | 71.6 | 16.7
Gaussian with single burst of high contamination  | 113.2 | 70.0 | 17.1
Gaussian with double burst of high contamination  | 109.2 | 70.7 | 15.3

• Euler integration of fatigue crack propagation: our hybrid model framework characterized the model-form uncertainty regarding the stress intensity range used in Paris law. We implemented the numerical integration of the first order differential equation through the Euler method given that the time history of far-field stresses is available. For this case study, we observed good repeatability of results with regards to variations in the initialization of the neural network hyperparameters.
• Runge–Kutta integration of a 2 degree-of-freedom vibrations system: our hybrid approach is capable of model parameter identification. We implemented the numerical integration of the second order differential equation through the Runge–Kutta method


given that the physics is known and both inputs and outputs
are observed through time. For this case study, we saw that the
identified model parameters led to accurate prediction of the
system displacements.

In this paper, we demonstrated the ability to directly implement physics-based models into a hybrid neural network and leverage the graph-based modeling capabilities found in platforms such as TensorFlow. Specifically, our implementation inherits the capabilities offered by these frameworks, such as the implementation of the recurrent neural network base class, automatic differentiation, and optimization methods for hyperparameter optimization. The examples presented here as well as source codes are all open-source under the MIT License and are available in the GitHub repository https://github.com/PML-UCF/pinn_code_tutorial.

In terms of future work, we believe there are many opportunities to advance the implementation as well as the application of the demonstrated approach. For example, one can explore further aspects of parallelization, potentially extending the implementation to low-level CUDA implementation (TensorFlow Contributors, 2020). Another interesting aspect to explore would be data batching and dropout (Srivastava et al., 2014), which could be particularly useful when dealing with large datasets. In terms of applications, we believe that hybrid implementations like the ones we discussed are beneficial when reduced order models can capture part of the physics. Then data-driven models can compensate for the remaining uncertainty and reduce the gap between predictions and observations. In the immediate future, it would be interesting to see applications in dynamical systems and controls. For example, Altan and collaborators proposed a new model predictive controller for target tracking of a three-axis gimbal system (Altan and Hacioglu, 2020) as well as a real-time control system for UAV path tracking (Altan et al., 2018). The UAV control system is based on a nonlinear auto-regressive exogenous neural network, while the proposed model predictive controller for the gimbal system is based on a Hammerstein model. The proposed control methods are used for real-time tracking of a moving target under influence from external disturbances. It would be interesting to see how our proposed approach can be incorporated in real-time controls.

CRediT authorship contribution statement

Renato G. Nascimento: Methodology, Software, Formal analysis, Investigation, Data curation, Writing, Visualization. Kajetan Fricke: Methodology, Software, Formal analysis, Investigation, Data curation, Writing, Visualization. Felipe A.C. Viana: Conceptualization, Methodology, Validation, Software, Formal analysis, Investigation, Writing, Supervision, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix. Multilayer perceptrons and recurrent neural networks

Fig. 6. Multilayer perceptron.

Fig. 6 illustrates the popular multilayer perceptron. Each layer can have one or more perceptrons (nodes in the graph). A perceptron applies a linear combination to the input variables followed by an activation function

$$v = f(z) \quad \text{and} \quad z = \mathbf{w}^T\mathbf{u} + b, \tag{12}$$

where 𝑣 is the perceptron output; 𝐮 are the inputs; 𝐰 and 𝑏 are the perceptron hyperparameters; and 𝑓(.) is the activation function. Throughout this paper, we used the hyperbolic tangent (tanh), the sigmoid, and the exponential linear unit (elu) activation functions (although others could also be used, such as the rectified exponential linear unit):

$$\tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}}, \qquad \mathrm{sigmoid}(z) = \frac{1}{1 + e^{-z}}, \qquad \mathrm{elu}(z) = \begin{cases} z & \text{when } z > 0, \\ e^z - 1 & \text{otherwise}, \end{cases} \tag{13}$$

The choice of the number of layers, the number of neurons in each layer, and the activation functions is outside the scope of this paper. Depending on the computational cost associated with the application, we even encourage the interested reader to pursue neural architecture search (Kandasamy et al., 2018; Liu et al., 2018; Elsken et al., 2019) for optimization of the data-driven portions of the model.

References

Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D.G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., Wicke, M., Yu, Y., Zheng, X., 2016. TensorFlow: A system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI. pp. 265–283.
Altan, A., Aslan, O., Hacioglu, R., 2018. Real-time control based NARX neural networks of hexarotor UAV with load transporting system for path tracking. In: 2018 6th International Conference on Control Engineering Information Technology. CEIT, IEEE, Istanbul, Turkey, pp. 1–6. http://dx.doi.org/10.1109/CEIT.2018.8751829.
Altan, A., Hacioglu, R., 2020. Model predictive control of three-axis gimbal system mounted on UAV for real-time target tracking under external disturbances. Mech. Syst. Signal Process. 138, http://dx.doi.org/10.1016/j.ymssp.2019.106548.
Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M., 2018. Automatic differentiation in machine learning: a survey. J. Mach. Learn. Res. 18 (153), 1–43.
Butcher, J., Wanner, G., 1996. Runge–Kutta methods: some historical notes. Appl. Numer. Math. 22 (1), 113–151. http://dx.doi.org/10.1016/S0168-9274(96)00048-7, Special Issue Celebrating the Centenary of Runge-Kutta Methods.
Chen, T.Q., Rubanova, Y., Bettencourt, J., Duvenaud, D.K., 2018. Neural ordinary differential equations. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (Eds.), 31st Advances in Neural Information Processing Systems. Curran Associates, Inc., pp. 6572–6583.
Cheng, Y., Huang, Y., Pang, B., Zhang, W., 2018. ThermalNet: A deep reinforcement learning-based combustion optimization system for coal-fired boiler. Eng. Appl. Artif. Intell. 74, 303–311. http://dx.doi.org/10.1016/j.engappai.2018.07.003.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
Chollet, F., et al., 2015. Keras. https://keras.io.
Collins, J.A., 1993. Failure of Materials in Mechanical Design: Analysis, Prediction, Prevention. John Wiley & Sons.
Connor, J.T., Martin, R.D., Atlas, L.E., 1994. Recurrent neural networks and robust time series prediction. IEEE Trans. Neural Netw. 5 (2), 240–254. http://dx.doi.org/10.1109/72.279188.
Dourado, A., Viana, F.A.C., 2020. Physics-informed neural networks for missing physics estimation in cumulative damage models: a case study in corrosion fatigue. ASME J. Comput. Inf. Sci. Eng. 20 (6), 061007. http://dx.doi.org/10.1115/1.4047173.
Dowling, N.E., 2012. Mechanical Behavior of Materials: Engineering Methods for Deformation, Fracture, and Fatigue. Pearson.
Elsken, T., Metzen, J.H., Hutter, F., 2019. Neural architecture search: a survey. J. Mach. Learn. Res. 20 (55), 1–21.
Fricke, K., Nascimento, R.G., Viana, F.A.C., 2020. Python Implementation of Ordinary Differential Equations Solvers using Hybrid Physics-informed Neural Networks. Zenodo, http://dx.doi.org/10.5281/zenodo.3895408, URL: https://github.com/PML-UCF/pinn_ode_tutorial.


Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep Learning. MIT Press, URL: http://www.deeplearningbook.org.
Graves, A., Mohamed, A., Hinton, G., 2013. Speech recognition with deep recurrent neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing. pp. 6645–6649. http://dx.doi.org/10.1109/ICASSP.2013.6638947.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9 (8), 1735–1780. http://dx.doi.org/10.1162/neco.1997.9.8.1735.
Kandasamy, K., Neiswanger, W., Schneider, J., Poczos, B., Xing, E.P., 2018. Neural architecture search with Bayesian optimisation and optimal transport. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (Eds.), Advances in Neural Information Processing Systems, Vol. 31. Curran Associates, Inc., pp. 2016–2025.
Karpatne, A., Atluri, G., Faghmous, J.H., Steinbach, M., Banerjee, A., Ganguly, A., Shekhar, S., Samatova, N., Kumar, V., 2017. Theory-guided data science: A new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 29 (10), 2318–2331. http://dx.doi.org/10.1109/TKDE.2017.2720168.
Liu, C., Zoph, B., Neumann, M., Shlens, J., Hua, W., Li, L.-J., Fei-Fei, L., Yuille, A., Huang, J., Murphy, K., 2018. Progressive neural architecture search. In: The European Conference on Computer Vision, ECCV.
Nascimento, R.G., Viana, F.A.C., 2020. Cumulative damage modeling with recurrent neural networks. AIAA J. http://dx.doi.org/10.2514/1.J059250, Online First.
Pan, S., Duraisamy, K., 2020. Physics-informed probabilistic learning of linear embeddings of nonlinear dynamics with guaranteed stability. SIAM J. Appl. Dyn. Syst. 19 (1), 480–509. http://dx.doi.org/10.1137/19M1267246.
Pang, G., Karniadakis, G.E., 2020. Physics-informed learning machines for partial differential equations: Gaussian processes versus neural networks. In: Nonlinear Systems and Complexity. Springer International Publishing, pp. 323–343. http://dx.doi.org/10.1007/978-3-030-44992-6_14.
Paris, P., Erdogan, F., 1963. A critical analysis of crack propagation laws. J. Basic Eng. 85 (4), 528–533. http://dx.doi.org/10.1115/1.3656900.
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 2007. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, New York, USA.
Raissi, M., Karniadakis, G.E., 2018. Hidden physics models: machine learning of nonlinear partial differential equations. J. Comput. Phys. 357, 125–141. http://dx.doi.org/10.1016/j.jcp.2017.11.039.
Raissi, M., Perdikaris, P., Karniadakis, G., 2019. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707. http://dx.doi.org/10.1016/j.jcp.2018.10.045.
Sak, H., Senior, A., Beaufays, F., 2014. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In: Fifteenth Annual Conference of the International Speech Communication Association. Singapore. pp. 338–342. https://www.isca-speech.org/archive/interspeech_2014/i14_0338.html.
Shen, C., Qi, Y., Wang, J., Cai, G., Zhu, Z., 2018. An automatic and robust features learning method for rotating machinery fault diagnosis based on contractive autoencoder. Eng. Appl. Artif. Intell. 76, 170–184. http://dx.doi.org/10.1016/j.engappai.2018.09.010.
Singh, S.K., Yang, R., Behjat, A., Rai, R., Chowdhury, S., Matei, I., 2019. PI-LSTM: Physics-infused long short-term memory network. In: 2019 18th IEEE International Conference on Machine Learning and Applications. ICMLA, IEEE, Boca Raton, USA, pp. 34–41. http://dx.doi.org/10.1109/ICMLA.2019.00015.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R., 2014. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15 (56), 1929–1958.
Sutskever, I., Martens, J., Hinton, G., 2011. Generating text with recurrent neural networks. In: Getoor, L., Scheffer, T. (Eds.), 28th International Conference on Machine Learning. ACM, Bellevue, USA, pp. 1017–1024, URL: https://icml.cc/2011/papers/524_icmlpaper.pdf.
TensorFlow Contributors, 2020. Create an op. URL: https://www.tensorflow.org/guide/create_op.
Viana, F.A.C., Nascimento, R.G., Yucesan, Y., Dourado, A., 2019. Physics-Informed Neural Networks Package. Zenodo, http://dx.doi.org/10.5281/zenodo.3356877, URL: https://github.com/PML-UCF/pinn.
Yucesan, Y.A., Viana, F.A.C., 2020. A physics-informed neural network for wind turbine main bearing fatigue. Int. J. Progn. Health Manag. 11 (1), 27–44.
