A Paradigm for Data-Driven Predictive Modeling Using Field Inversion and Machine Learning

E.J. Parish, K. Duraisamy

Journal of Computational Physics 305 (2016) 758–774
Abstract

We propose a modeling paradigm, termed field inversion and machine learning (FIML), that seeks to comprehensively harness data from sources such as high-fidelity simulations and experiments to aid the creation of improved closure models for computational physics applications. In contrast to inferring model parameters, this work uses inverse modeling to obtain corrective, spatially distributed functional terms, offering a route to directly address model-form errors. Once the inference has been performed over a number of problems that are representative of the deficient physics in the closure model, machine learning techniques are used to reconstruct the model corrections in terms of variables that appear in the closure model. These reconstructed functional forms are then used to augment the closure model in a predictive computational setting. As a first demonstrative example, a scalar ordinary differential equation is considered, wherein the model equation has missing and deficient terms. Following this, the methodology is extended to the prediction of turbulent channel flow. In both of these applications, the approach is demonstrated to be able to successfully reconstruct functional corrections and yield accurate predictive solutions while providing a measure of model-form uncertainties.

Keywords: Data-driven modeling; Machine learning; Closure modeling
Even with the tremendous growth in computational power during the past decade, simulations based on first-principles
models of physical systems remain prohibitively expensive for most practical problems. As a result, one has to rely on
coarse-grained models to characterize or predict the overall state of a complex system or its statistical properties. Derivation
of the more affordable models, however, involves a number of additional assumptions that can limit their accuracy. The
pursuit of accurate closures in coarse-grained or intermediate/low-fidelity models is typically a central issue and pacing item
in many scientific disciplines. At the same time, it is becoming feasible to run first-principles or high-fidelity simulations
under idealized conditions and in some regimes of interest. Concurrently, experimental techniques have evolved to a point
where high resolution information can be provided at many scales, including those that are of direct relevance to problems
of interest for physicists and engineers. Against this backdrop, data mining techniques have already made their mark in many
disciplines of science and engineering by providing improved physical insight as well as quantitative data for modeling.
Unprecedented opportunities exist in going one step further and directly utilizing available data to improve and generate
predictive models that can be used in practical analysis and design in a robust manner.
Physical modeling has always been data-driven to a degree. Typically, a theory or set of theories is formulated, and unknown model coefficients and functions are empirically determined by correlating the response of the model with available data. Over the past decade, more formal calibration procedures have emerged in many different fields of application.
Discipline-specific reviews are presented in Navon [1] for data assimilation in weather forecasting and in Kerschen et al. [2] for system identification in structural dynamics. Given some data $G_d$ and a model output $G_m(Q, \lambda)$, where $Q$ are state variables and $\lambda$ are model parameters, frequentist (typically least-squares) or Bayesian procedures are formulated to infer optimal values of $\lambda$. This is usually accomplished by minimizing the difference between the data and the model output. The least-squares procedure is conceptually simple and can offer probability measures on the output, whereas the Bayesian approach can be formulated more rigorously and can account for a more precise prescription of prior knowledge and probability structure, though at a much higher expense. Nevertheless, both types of techniques have been successfully used for parameter estimation.
Errors in the underlying structure of the model may result in inadequacies [3], even if the best possible set of parameters has been inferred; most continuum models have this issue in one form or the other. Predicted values of the outputs may never match the true value in a deterministic or statistical sense for many models. A widely used approach to address model inadequacy is the Bayesian calibration framework of Kennedy and O’Hagan (Ref. [4] and its derivatives [5,6], etc.). The essence of their approach is to represent the output quantity of interest as the model output augmented by a data-informed discrepancy term,

$$ G_d = G_m(Q, \lambda) + \delta + \epsilon, $$

where $\delta$ is the model discrepancy and $\epsilon$ is the observational error.
1. Mathematical setting
Consider a physical system that is governed by a set of non-linear equations (partial differential or otherwise). The truth-model of the system is given by

$$ \mathcal{R}_T(Q_T) = 0, $$

along with well-posed initial and boundary conditions. The operator $\mathcal{R}_T$ contains the governing equations of the system while $Q_T$ contains the model variables. The physical system is modeled by a lower-fidelity set of equations, $\mathcal{R}(Q) = 0$, that involves a closure model; in this paper, the focus is on the closure model and not on issues such as numerical discretization errors, initial/boundary condition uncertainties, etc. To account for the deficiencies of the closure, a spatially distributed corrective function $\beta(\eta)$, treated as a random function $\beta(\eta, \omega)$, is introduced into the model equations, yielding a stochastic system, Eq. (5).
2. Field inversion
The challenge of creating the stochastic system in Eq. (5) is in the estimation of the distribution of the function β(η).
Model discrepancy inhibits direct extraction of the functional form of β(η) from available data. Instead, an inverse problem
is posed to infer the distribution of β such that realizations of the stochastic system are consistent with underlying physics;
potentially both in the mean and higher order statistics. Bayesian inversion is used to obtain β(ω) in the form of functional
corrections. The functional correction is obtained by inferring β at every grid point in the computational domain. The
process of inferring β can be thought of as follows: We start by having an estimate of β , along with a certain amount
of confidence in that estimate. This is the prior probability of β , given by p (β). Next, we observe an external system and
obtain an observational dataset d along with its associated uncertainty. For a given β , there is some probability that the
model will reproduce the dataset d. This is given by the likelihood function h(d|β). Given p (β) and h(d|β), there exists
some probability of β given the observations d. This is the posterior probability q(β|d). The goal of the inverse is to obtain
q(β|d). Mathematically, the posterior probability distribution is given by Bayes’ theorem

$$ q(\beta \mid d) = \frac{h(d \mid \beta)\, p(\beta)}{c}, \qquad (6) $$

where $c = \int h(d \mid \beta)\, p(\beta)\, d\beta$. The solution of Eq. (6) can be made tractable via assumptions regarding the distributions of $d$ and $\beta$. In the case that the dataset $d$ and the random function $\beta$ are Gaussian, and the likelihood $h(d \mid \beta)$ is Gaussian, it can be shown [18] that the problem of determining the distribution of $\beta$ reduces to estimating the maximum a posteriori (MAP) solution, which is found by solving the deterministic optimization problem

$$ \beta_{map} = \arg\min_{\beta}\ \frac{1}{2}\left[\big(d - h(\beta)\big)^{T} C_m^{-1} \big(d - h(\beta)\big) + \big(\beta - \beta_{prior}\big)^{T} C_\beta^{-1} \big(\beta - \beta_{prior}\big)\right], \qquad (7) $$
where Cm and Cβ are the observational and prior covariance matrices, respectively. The observational covariance is deter-
mined by the statistics of the observed dataset d and the prior covariance is determined by prior knowledge of the system.
The parameter-to-observable map h(β) is a subset of the governing equations. The parameter being optimized in Eq. (7) is β. The term being minimized is referred to as the cost function J, i.e.

$$ \mathcal{J} = \frac{1}{2}\left[\big(d - h(\beta)\big)^{T} C_m^{-1} \big(d - h(\beta)\big) + \big(\beta - \beta_{prior}\big)^{T} C_\beta^{-1} \big(\beta - \beta_{prior}\big)\right]. \qquad (8) $$
The dimensionality of the optimization problem scales with the number of discrete parameters being optimized. In this
case, β is being optimized at every point in the domain so the dimensionality of the optimization problem scales with the
number of mesh points. In the linear case, the covariance of the posterior is given by the inverse of the Hessian of the cost
function J evaluated at the MAP point
$$ C_{\beta_{map}} = H_{\beta_{map}}^{-1} = \left[\frac{d^2 \mathcal{J}}{d\beta_i\, d\beta_j}\right]^{-1}_{\beta_{map}}. \qquad (9) $$
In the non-linear case, Eq. (9) becomes an approximation that is a result of a linearization about the MAP point. Once the
MAP solution and posterior covariance are found, realizations of β can be drawn from the posterior distribution. A standard
method is to perform a Cholesky decomposition on $C_{\beta_{map}}$ such that

$$ C_{\beta_{map}} = L L^{T}, \qquad (10) $$

with realizations then drawn as

$$ \beta = \beta_{map} + L s, \qquad (11) $$

where s is a vector, the components of which are independent standard normal variates.
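A sketch of this sampling step, assuming `C_beta_map` is symmetric positive definite:

```python
import numpy as np

def sample_posterior(beta_map, C_beta_map, n_samples, rng=None):
    """Draw realizations of beta via Eqs. (10)-(11)."""
    if rng is None:
        rng = np.random.default_rng()
    L = np.linalg.cholesky(C_beta_map)            # Eq. (10): C = L L^T
    s = rng.standard_normal((beta_map.size, n_samples))
    return beta_map[:, None] + L @ s              # Eq. (11): one column per sample
```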
It is well-recognized that the Gaussian assumption is a strong one. In a general setting, the data d and prior β may not
be truly Gaussian. In non-linear problems, even if d and the prior β are Gaussian, the posterior distribution may not be
Gaussian in nature. We adopt the Gaussian approximation to make the approach feasible in high-dimensional
problems. The error introduced by the Gaussian assumption is problem-dependent and can be estimated by analyzing
the model response, but in the non-linear case, the posterior distribution must be viewed as an approximation. Computing
non-Gaussian posterior distributions requires expensive sampling-based methods, such as Markov chain Monte Carlo (MCMC)
methods (see Appendix A). These methods are especially costly in high-dimensional state spaces. The inversion procedure
described in this paper is scalable, while sampling-based methods are not.
In the following subsections, different aspects of the inversion process are discussed.
For field inversion, the dimensionality of the inversion scales with the number of mesh points. Though simplifications
may be performed by constructing a surrogate representation of β over the computational domain, we pursue the more
detailed approach of estimating β at every grid point in the computational domain. The resulting optimization problem is
high-dimensional and efficient methods of minimizing the cost function are needed. Gradient-based methods are used to
solve the inverse problem in this work. These methods require derivatives with respect to a large number of parameters,
which are efficiently calculated using a discrete adjoint [19] formulation. To determine the gradient, the adjoint equation is
first solved for ψ ,
$$ \left[\frac{\partial R}{\partial Q}\right]^{T} \psi = -\left[\frac{\partial \mathcal{J}}{\partial Q}\right]^{T}, \qquad (12) $$
where J is the cost function, R is the residual of the primal equations, and Q are the model variables. The gradient is then
computed using
$$ G = \frac{d\mathcal{J}}{d\beta} = \frac{\partial \mathcal{J}}{\partial \beta} + \psi^{T} \frac{\partial R}{\partial \beta}. \qquad (13) $$
The optimization problem is solved using BFGS [20] or, in problematic cases, using steepest descent. The solution of the
adjoint equation requires the computation of the Jacobian of the primal equations, which is calculated analytically.
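Schematically, for a fully discrete system with dense Jacobians, Eqs. (12)–(13) can be written as follows; this is a sketch only, since in practice the Jacobians are sparse and problem-specific.

```python
import numpy as np

def adjoint_gradient(dRdQ, dRdb, dJdQ, dJdb):
    """Discrete adjoint gradient of the cost function.

    dRdQ : (M, M) residual Jacobian w.r.t. the states Q
    dRdb : (M, N) residual Jacobian w.r.t. the parameters beta
    dJdQ : (M,)   partial derivative of J w.r.t. the states
    dJdb : (N,)   partial derivative of J w.r.t. the parameters
    """
    psi = np.linalg.solve(dRdQ.T, -dJdQ)  # adjoint solve, Eq. (12)
    return dJdb + dRdb.T @ psi            # gradient G = dJ/dbeta, Eq. (13)
```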
To determine the posterior covariance, a Hessian calculation is required. In this work, an adjoint-adjoint method is used
to compute the Hessian. For a system with M discrete model variables and N optimization parameters (the model variables
are Q at each grid point and the optimization parameter is β at each grid point), the Hessian is computed by [21]:
$$ H_{ij} = \frac{\partial^2 \mathcal{J}}{\partial \beta_i \partial \beta_j} + \psi_m \frac{\partial^2 R_m}{\partial \beta_i \partial \beta_j} + \mu_{i,m} \frac{\partial R_m}{\partial \beta_j} + \nu_{i,n} \frac{\partial^2 \mathcal{J}}{\partial Q_n \partial \beta_j} + \nu_{i,n}\, \psi_m \frac{\partial^2 R_m}{\partial Q_n \partial \beta_j}, \quad m, n \in [1, M], \qquad (14) $$

where

$$ \frac{\partial R_m}{\partial Q_n}\, \nu_{i,n} = -\frac{\partial R_m}{\partial \beta_i}, \quad m, n \in [1, M],\ i \in [1, N], \qquad (15) $$

$$ \frac{\partial R_m}{\partial Q_k}\, \mu_{i,m} = -\frac{\partial^2 \mathcal{J}}{\partial \beta_i \partial Q_k} - \psi_m \frac{\partial^2 R_m}{\partial \beta_i \partial Q_k} - \nu_{i,n} \frac{\partial^2 \mathcal{J}}{\partial Q_n \partial Q_k} - \nu_{i,n}\, \psi_m \frac{\partial^2 R_m}{\partial Q_n \partial Q_k}, \quad k, m, n \in [1, M],\ i \in [1, N]. \qquad (16) $$
A low-rank approximation is useful for a diagonal observational covariance matrix. A diagonal covariance assumes that
the data are uncorrelated. This is a reasonable approximation when measurement error dominates the observational vari-
ance. It is additionally relevant for cases where data is not available to build a complete covariance matrix. For a cost
function with a diagonal observational covariance matrix and no prior, the cost function simplifies to
$$ \mathcal{J} = \frac{1}{2} \sum_{i=1}^{M} \left( \frac{h(\beta)_i - d_i}{\sigma_i} \right)^{2}. \qquad (17) $$
To approximate the Hessian, M scalar-valued functions are defined,

$$ f_i(\beta) = \frac{h(\beta)_i - d_i}{\sigma_i}, \quad i = 1, 2, \ldots, M. \qquad (18) $$
The gradient of the non-regularized cost function can be computed as

$$ \nabla \mathcal{J}(\beta) = \frac{1}{2} \sum_{i=1}^{M} \nabla \big(f_i(\beta)^{2}\big) = \sum_{i=1}^{M} f_i(\beta)\, \nabla f_i(\beta), \qquad (19) $$

or, equivalently, in terms of the Jacobian $J_f$ of the residual vector $f = (f_1, \ldots, f_M)^{T}$,

$$ \nabla \mathcal{J}(\beta) = J_f(\beta)^{T} f(\beta). \qquad (20) $$

The Jacobian of the scalar-valued functions can then be used to approximate the Hessian: the exact Hessian is

$$ H(\beta) = J_f(\beta)^{T} J_f(\beta) + \sum_{i=1}^{M} f_i(\beta)\, \nabla^2 f_i(\beta), \qquad (21) $$

and neglecting the second-order term yields the Gauss–Newton approximation

$$ H(\beta) \approx J_f(\beta)^{T} J_f(\beta). \qquad (22) $$
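In code, the Gauss–Newton approximation amounts to a single matrix product of the residual Jacobian with itself; a small sketch:

```python
import numpy as np

def scaled_residuals(h_beta, d, sigma):
    """f_i of Eq. (18): componentwise scaled mismatch."""
    return (h_beta - d) / sigma

def gauss_newton_hessian(Jf):
    """Eq. (22): H ~ Jf^T Jf, with Jf the (M, N) Jacobian of f."""
    return Jf.T @ Jf
```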
In the case of an uninformed prior with a high covariance, the magnitude of the observational covariance will generally
be much less than that of the prior. The solution of the inverse problem is expectedly sensitive to the specification of Cm . In
general, Cm is determined by the statistics of the observational data. The availability of observational statistics varies from
case to case, and it is thus important to quantify the performance of the inversion for different forms of Cm. In this work,
three different models are considered. The simplest possible model we consider takes the form

$$ C_m = \sigma_{obs}^{2} I, \qquad (23) $$

where $\sigma_{obs}$ is a scalar that is representative of some mean variance of the observations. Such a model neglects all covariances. The second model considered is an extension of the above,

$$ C_m = \boldsymbol{\sigma}_{obs}^{2} I, \qquad (24) $$
where $\boldsymbol{\sigma}_{obs}$ is a vector containing the variances for each observation. This model assumes that $\boldsymbol{\sigma}_{obs}$ can be determined from
available data. The third model considered assumes the availability of a complete set of statistics, in which case the exact
covariance matrix of the data vector D is given by
$$ C_{m,ij} = E\big[(D_i - \bar{D}_i)(D_j - \bar{D}_j)\big], \qquad (25) $$

where $\bar{D}$ denotes the mean of the data vector.
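Given an ensemble of observed realizations (rows are realizations, columns are observation locations), the three covariance models can be estimated as in the following sketch:

```python
import numpy as np

def cov_scalar(samples):
    """Eq. (23): a single mean variance for all observations."""
    return np.mean(np.var(samples, axis=0)) * np.eye(samples.shape[1])

def cov_diagonal(samples):
    """Eq. (24): per-observation variances on the diagonal."""
    return np.diag(np.var(samples, axis=0))

def cov_full(samples):
    """Eq. (25): complete empirical covariance of the data vector."""
    return np.cov(samples, rowvar=False)
```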
3. Machine learning
The inversion produces solutions of the correction β that are in a spatio-temporal (or in the present work, spatial) form,
for specific problems. If the inversion is performed over a large number of problems and objective functions, problem-specific inference can be converted to general modeling knowledge via supervised machine learning [22,23] algorithms. These techniques can be used to elicit the functional relation β(η), where η(Q) are the model input features. The examples provided
in this paper use Gaussian Processes [24] (GPs), and the interaction between the inverse and ML formulation is limited to
variances, i.e. the ML algorithm does not use the entire covariance matrix generated in the inversion. In the GP formulation,
it is assumed that the output function at the training points, $Y_{train}$ (where $Y_{train}$ consists of inferred spatio-temporal fields of β for a wide class of problems), is drawn from the distribution

$$ Y_{train} \sim \mathcal{N}\big(0,\ \phi(\eta_{train}, \eta_{train}) + \lambda\big), \qquad (27) $$

where $\phi$ is the kernel (covariance) matrix evaluated at the training inputs and $\lambda$ is a noise covariance. The predictive mean and variance at a test input $\eta_{test}$ then follow the standard Gaussian process regression formulas [24],

$$ \bar{y}_{test} = \boldsymbol{\phi}^{T} (\phi + \lambda)^{-1} Y_{train}, \qquad \sigma^{2}_{test} = \phi(\eta_{test}, \eta_{test}) - \boldsymbol{\phi}^{T} (\phi + \lambda)^{-1} \boldsymbol{\phi}, $$

where $\boldsymbol{\phi}$ is a vector whose elements are $\phi(\eta_{train,i}, \eta_{test})$. It is noted that in the current implementation the machine learning process only returns variances. The covariance of the training data assumed in Eq. (27) is a mathematical construct designed to help the machine learning process and has no direct relation to the true covariance of the training data.
The hyperparameters h in Eq. (27) are found by maximizing the probability of obtaining an output distribution Ytrain
given input features η and hyperparameters h. This is done by maximizing the log marginal likelihood function,
$$ \log p(Y_{train} \mid \eta_{train}, h) = -\frac{1}{2}\log\big|\phi + \lambda\big| - \frac{1}{2}\, Y_{train}^{T} (\phi + \lambda)^{-1} Y_{train} - \frac{N}{2}\log(2\pi), \qquad (31) $$
where N is the number of training points.
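A compact sketch of the GP pieces used here, with a squared-exponential kernel standing in for φ (the kernel choice is an assumption, as the text does not state it):

```python
import numpy as np

def kernel(X1, X2, ls=1.0, amp=1.0):
    """Squared-exponential covariance (an assumed form of phi)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) / ls) ** 2
    return amp**2 * np.exp(-0.5 * d2.sum(axis=-1))

def log_marginal_likelihood(X, y, ls, amp, noise):
    """Eq. (31) for a zero-mean GP with i.i.d. noise variance `noise`."""
    K = kernel(X, X, ls, amp) + noise * np.eye(len(y))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^-1 y
    return (-np.log(np.diag(L)).sum()                    # -0.5 log|K|
            - 0.5 * y @ alpha                            # -0.5 y^T K^-1 y
            - 0.5 * len(y) * np.log(2 * np.pi))

def predict(X, y, Xs, ls, amp, noise):
    """Predictive mean and variance at the test inputs Xs."""
    K = kernel(X, X, ls, amp) + noise * np.eye(len(y))
    ks = kernel(X, Xs, ls, amp)
    Kinv_ks = np.linalg.solve(K, ks)
    mean = Kinv_ks.T @ y
    var = amp**2 - np.einsum('ij,ij->j', ks, Kinv_ks)
    return mean, var
```

The hyperparameters (ls, amp, noise) play the role of h and are chosen by maximizing log_marginal_likelihood.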
4. Model problem: non-linear heat conduction

The framework is now applied to a scalar non-linear ordinary differential equation that resembles one-dimensional heat
conduction with radiative and convective heat sources. The “true” model is taken to be
$$ \frac{d^2 T}{dz^2} = \varepsilon(T)\big(T_\infty^4 - T^4\big) + h\,(T_\infty - T), \quad z \in [0, 1], \qquad (32) $$
with homogeneous boundary conditions. We will refer to ε as the emissivity of the material and h as the convection
coefficient. In the true process, the emissivity is a stochastic non-linear function of temperature and is given by
$$ \varepsilon(T) = \left[1 + 5\sin\!\left(\frac{3\pi}{200}\,T\right) + \exp(0.02\,T) + \mathcal{N}(0,\,0.1)\right] \times 10^{-4}. \qquad (33) $$

Fig. 1. Solutions of the base model compared to the mean of the true process.

Table 1. Summary of the conditions used for the inversion.
The convection coefficient is taken to be a constant of h = 0.5. To demonstrate the framework, we consider the case where
the true process (Eqs. (32) and (33)) is unknown. The process is imperfectly modeled by
$$ \frac{d^2 T}{dz^2} = \varepsilon_0 \big(T_\infty^4(z) - T^4\big), \qquad (34) $$
with ε0 = 5 × 10−4 . Eq. (34) will be referred to as the base model. The resulting model outputs are shown in Fig. 1. The
model particularly suffers when T ∞ is low, where the ignored linear term is significant. The inverse is posed by adding a
spatial multiplier to ε0
$$ \frac{d^2 T}{dz^2} = \beta(z)\, \varepsilon_0 \big(T_\infty^4(z) - T^4\big). \qquad (35) $$
The goal of the framework is to obtain β( z) from the inversion, and then to learn β = β( T , T ∞ ). Note that β will encapsulate
both the true form of ε and the convective heat transfer term. The true solution for β is
$$ \beta(T, T_\infty) = \beta_r + \beta_c = \frac{1}{\varepsilon_0}\left[1 + 5\sin\!\left(\frac{3\pi}{200}\,T\right) + \exp(0.02\,T) + \mathcal{N}(0,\,0.1)\right] \times 10^{-4} + \frac{h}{\varepsilon_0}\, \frac{T_\infty - T}{T_\infty^4 - T^4}. \qquad (36) $$
Synthetic data is generated by solving 100 realizations of the true process (Eq. (32)) for T ∞ ∈ [5, 10, . . . , 50]. The governing
equation is solved using second order central differences on a uniform mesh with 31 grid points. These synthetic data are
used as observational data for the inverse calculations. Note that the inversion is performed on the same computational
grid. A summary of the conditions used for the inversion is given in Table 1.
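The synthetic-data step can be reproduced with a few lines of NumPy/SciPy; the nonlinear solver below (fsolve) is an assumption, since the text does not specify one.

```python
import numpy as np
from scipy.optimize import fsolve

def solve_true_model(T_inf, n=31, h=0.5, rng=None):
    """One realization of the true process, Eqs. (32)-(33), discretized with
    second-order central differences on a uniform 31-point mesh with
    homogeneous boundary conditions."""
    if rng is None:
        rng = np.random.default_rng()
    dz = 1.0 / (n - 1)
    noise = rng.normal(0.0, 0.1, n)                 # stochastic part of Eq. (33)

    def residual(T_int):
        T = np.concatenate(([0.0], T_int, [0.0]))   # homogeneous BCs
        eps = (1 + 5*np.sin(3*np.pi/200*T) + np.exp(0.02*T) + noise) * 1e-4
        rhs = eps*(T_inf**4 - T**4) + h*(T_inf - T) # source terms of Eq. (32)
        lap = (T[:-2] - 2*T[1:-1] + T[2:]) / dz**2  # d2T/dz2
        return lap - rhs[1:-1]

    T_int = fsolve(residual, np.zeros(n - 2))
    return np.concatenate(([0.0], T_int, [0.0]))

# 100 realizations of the true process for each T_inf in {5, 10, ..., 50}
data = {Ti: np.stack([solve_true_model(Ti) for _ in range(100)])
        for Ti in range(5, 51, 5)}
```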
The inversion is performed using the various models for the observational covariance matrix Cm that were previously
discussed. For $C_m = \sigma_{obs}^{2} I$, the observational data is used to compute a single mean variance. For $C_m = \boldsymbol{\sigma}_{obs}^{2} I$, the observational data is used to compute the variance of temperature at each grid point. An uninformative prior is selected that
corresponds with the baseline model, i.e. βprior = 1. The prior variance is selected such that the 2σ limits of the prior PDF
of temperature encompass the observed solution. The prior PDF for temperature is determined by solving the forward model
for samples of βprior . In this case, the forward model was sampled 100 times. Elementary statistical formulae can be used
as a general guideline to determine the number of required samples. Given n statistically independent samples, the error on
the mean σ X and the error on the (co)variances σ S can be approximated by
$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}, \qquad \sigma_{S} = \sigma^{2} \sqrt{\frac{2}{n-1}}. \qquad (37) $$
For clarity, the entire solution process is outlined:
1. Sample the prior distribution of β via Equations (10) and (11) with the assumed prior covariance matrix Cβ .
(a) For each sample of β solve the model equation (Equation (35)) to determine distributions for temperature.
(b) Determine if Cβ was appropriately chosen by ensuring that the observed temperature profile falls within the ±2σ
limits of the distribution predicted by the model equation.
2. Solve the inverse problem with the various models for Cm by solving the optimization problem.
3. Sample the posterior distribution of βmap via Equations (10) and (11) with Cβmap .
(a) For each sample of β , solve the model equation to determine the posterior distributions for temperature about the
MAP point.
The results of the inversion for T ∞ = 50 are of the most interest and are shown in Fig. 2. It is first seen that the MAP
solution for temperature coincides with the observed value for all models. The MAP solution for β , however, only coincides
with the true solution when the complete observational covariance is used. For the diagonal models of Cm that ignore
covariance, the posterior variance is too high in the center of the domain, as seen in Figs. 2a and 2b. However, the posterior
variance is directly correlated to the local accuracy of βmap . When the complete observational covariance is used the correct
posterior distribution is inferred across the entire domain.
Fig. 3 gives a compiled summary of the inferred βmap for the 10 cases. For plotting purposes, βr and βc are extracted
from βmap and the uncertainty bounds are attached to βr . The missing and deficient terms in the model equation have been
effectively inferred, especially in Fig. 3c. For the lower-order representations of Cm , β was inferred correctly over most of
the domain; with error being present at low and high temperatures. This error is well reflected by the posterior variance.
The solutions using the complete observational covariance are seen to yield extremely accurate inferences for β and σ over
the entire domain.
Several conclusions can be drawn from the inference step. First, the performance of the inversion was comparable for the scalar model $C_m = \sigma_{obs}^{2} I$ and the diagonal model $C_m = \boldsymbol{\sigma}_{obs}^{2} I$. This shows that an accurate observational covariance is needed to correctly infer the posterior
distribution of β . Second, if a simpler observational covariance is used, the posterior distribution can still provide informa-
tion on the accuracy of the inference; but the posterior distribution is not representative of the underlying physics. Here
we make an important note that the objective of this model problem was to infer the correction β and its proper posterior
distribution. However, it is often desirable to infer a correction for a mean quantity (as in the next example). Under these
settings, the covariance matrices should be constructed differently than in the procedure described above. For example, $\sigma_{obs}^{mean} = \sigma_{obs}/\sqrt{N_{samples}}$ is a more appropriate standard deviation for mean quantities. In general, an arbitrarily low variance can always be set for the observable, in which case the resulting MAP solutions will be in closer agreement with the observed data (the discrepancies between the MAP and true solutions in Fig. 2 can be eliminated with this method).
In this section it was shown that, unless the correct observational covariance is used, the statistics of the resulting posterior distribution can be inaccurate. In the absence of the true observational covariance, a practical alternative is to carry out the inversion for low-order statistics such as mean quantities.
As described in Section 3, a machine learning algorithm utilizing GPs is used to elicit the functional relationship β( T , T ∞ )
from the spatial data generated in the inversion. The data generated from the inversion using the exact observational
covariance is used for training. It is noted that the ML formulation employed only makes use of the variance predicted
by the inverse, rather than the entire covariance matrix. The hyperparameters for the GP are optimized off-line, and then
the resulting model is injected into the solver at every iteration of the solution. At each iteration, the solver calls the residual calculation routine, which in turn calls the ML algorithm. The ML algorithm is queried with T and T∞ and returns βML and σML.
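Schematically, the injection looks like the following; `ml_model` is a hypothetical wrapper around the trained GP that returns (βML, σML) for given (T, T∞).

```python
import numpy as np

def residual_with_ml(T, T_inf, dz, eps0, ml_model):
    """Interior residual of Eq. (35), with beta supplied by the ML model
    at every solver iteration."""
    beta_ml, sigma_ml = ml_model(T, T_inf)         # query the trained GP
    rhs = beta_ml * eps0 * (T_inf**4 - T**4)
    lap = (T[:-2] - 2*T[1:-1] + T[2:]) / dz**2
    return lap - rhs[1:-1], sigma_ml
```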
The machine-learned predictive model was evaluated for a variety of new predictive cases. Table 2 gives a summary of
the conditions and the performance of the ML model. Case 1 provides examples with similar “physics” to the training cases (i.e. T∞ = constant), and the results of the implementation are excellent. Cases 2 and 3 explore the performance of the
model for regions of T ∞ > 50, for which there were no training data. For these cases, an improvement is observed over
the baseline model. Cases 4 and 5 explore the performance of the ML model for lower values of T ∞ , where the linear heat
transfer term becomes important. Again, it is seen that the performance of the model is much improved. The solutions for
Case 2 and Case 5 with the predictive model are given in Figs. 4 and 5. In both cases, the ML solution for temperature and
β is much improved from the base model, but error is still present. However, excellent correlation is seen between model error and the predicted variance. Similar correlations were seen for all cases, suggesting that the posterior variance is an indicator of local model accuracy. This feature is extremely useful in a predictive setting.

Fig. 2. Posterior model distributions at T∞ = 50 for each model of Cm. From left to right, T and β with 2σ limits, σ, and Cβmap are shown, respectively.

Table 2. Summary of cases used to test the predictive model. The L2 norm is used to compute the errors reported in column 5.
5. Application: turbulent channel flow

The modeling of turbulent flows has been a long-standing obstacle to the application of computational fluid dynamics
(CFD) to many practical problems. Direct numerical simulations (DNS) attempt to resolve all scales of turbulence but the
resolution requirements make this technique infeasible for most flows of engineering interest. To compute practical high
Reynolds number flows, near-wall modeling is performed using a Reynolds-averaged Navier–Stokes (RANS)-type closure.
RANS-based methods are typically formulated using a combination of theory and intuition. Traditionally, a number of free
parameters remain in the model; these are calibrated using empirical fitting and are often found to be deficient in many flows. The key issue is that the main source of error is in the functional form of the model terms. Functional relationships elicited directly from high-fidelity simulation or experimental data will not translate to RANS model improvements, since the inference has to be performed within the context of the model. The technique outlined in this section infers and reconstructs a correction that is consistent with the low-fidelity (RANS) model.

Fig. 3. Summary of posterior model distributions for each model of Cm. From left to right, βr, βc, and σ are shown.
The FIML framework is applied to turbulent channel flow with a k − ω turbulence model [25]. The Reynolds-averaged
momentum equation for incompressible fully-developed channel flow is given by
$$ \frac{\partial}{\partial y}\left(\mu \frac{\partial u}{\partial y} - \rho\, \overline{u'v'}\right) - \frac{\partial p}{\partial x} = 0, \qquad (38) $$
where p and u are the mean pressure and velocity, respectively. The process of Reynolds averaging introduces the unclosed Reynolds stresses, $\tau_{ij} = -\rho\, \overline{u_i' u_j'}$. Determining $\tau_{ij}$ is the fundamental challenge of turbulence modeling. The k − ω model
makes use of the Boussinesq approximation, where the Reynolds-stress tensor is assumed to take the form
$$ \tau_{ij} = 2\nu_t S_{ij} - \frac{2}{3} k \delta_{ij}, \qquad (39) $$
with νt being the turbulent eddy viscosity, k the turbulent kinetic energy, and S i j the mean strain-rate tensor. The still
unclosed turbulent eddy viscosity is then determined by introducing transport equations for the turbulent kinetic energy k
and the specific dissipation rate ω . On dimensional grounds, the turbulent eddy viscosity is modeled by
$$ \nu_t = C_\mu \frac{k}{\omega}, \qquad (40) $$
where C μ is a constant of proportionality. For the case of planar channel flow, the transport equations for k and ω become
ordinary differential equations of the form
$$ \nu_t \left(\frac{\partial u}{\partial y}\right)^{2} - \alpha^{*} k \omega + \frac{\partial}{\partial y}\left[\left(\nu + \sigma^{*} \frac{k}{\omega}\right) \frac{\partial k}{\partial y}\right] = 0, \qquad (41) $$

$$ \gamma \left(\frac{\partial u}{\partial y}\right)^{2} - \alpha \omega^{2} + \frac{\partial}{\partial y}\left[\left(\nu + \sigma \frac{k}{\omega}\right) \frac{\partial \omega}{\partial y}\right] = 0. \qquad (42) $$
The standard closure coefficients for the Wilcox k − ω model are used and are given in Table 3, along with the associated
boundary conditions for the channel flow. Equations (38) through (42) will be referred to as the base model. Numerically,
these governing equations are discretized with second-order finite differences and the system is solved by introducing
pseudo-time derivatives to the left hand side of Equations (38), (41), and (42). Implicit time integration is then used to
iterate the system to a steady state. The equations are solved on a geometrically graded mesh with the first grid point
placed well into the viscous sublayer at $y^+ \approx 0.05$. At the wall, the boundary condition for ω becomes singular, which is handled numerically by analyzing the asymptotic behavior [25] of ω.

Table 3. Summary of model coefficients and boundary conditions for the k − ω model in planar channel flow. The channel wall is at y = 0 and the mid-plane of the channel is at y = h/2.

    Cμ = 1.00,  α* = 0.09,  σ* = 0.6,  γ = 13/25,  α = 0.09,  σ = 0.5
    y = 0:    u = 0,  k = 0,  ω = ωw
    y = h/2:  ∂u/∂y = ∂k/∂y = ∂ω/∂y = 0
5.1. Inversion
The functional correction β( y ) is introduced as a multiplier to the production term in the turbulent kinetic energy
equation,
$$ \nu_t\, \beta(y) \left(\frac{\partial u}{\partial y}\right)^{2} - \alpha^{*} k \omega + \frac{\partial}{\partial y}\left[\left(\nu + \sigma^{*} \frac{k}{\omega}\right) \frac{\partial k}{\partial y}\right] = 0. \qquad (43) $$
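In discrete form, the corrected k-equation residual might be assembled as in the sketch below (central differences via numpy.gradient; coefficient values from Table 3); this is illustrative only, not the authors' solver.

```python
import numpy as np

def k_residual(u, k, om, beta, y, nu, alpha_star=0.09, sigma_star=0.6):
    """Residual of the beta-augmented k-equation, Eq. (43), on a mesh y."""
    nut = k / om                                   # Eq. (40) with C_mu = 1
    dudy = np.gradient(u, y)
    production = beta * nut * dudy**2              # corrected production term
    destruction = alpha_star * k * om
    flux = (nu + sigma_star * k / om) * np.gradient(k, y)
    return production - destruction + np.gradient(flux, y)
```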
Introducing β to the production term modifies the entire turbulence model, and is equivalent to adding an additive source
term. DNS data from Jimenez et al. [26] are used in the inverse modeling. Of this data, the velocity profiles are targeted;
i.e. d = uDNS. Since the DNS data provide a near-perfect observation of the truth, the observational covariance is taken to be $C_m = \sigma_{obs}^{2} I$, where $\sigma_{obs} = 10^{-10}$. This choice neglects covariance in the observed data. Constructing a more accurate
observational covariance, as was done in the previous problem, requires statistics that are not readily available from DNS.
In this case, the two-point correlation of the mean streamwise velocity in the wall normal direction could be used to build
an accurate observational covariance. However, a plausible alternative could be to use an approximation to the two-point
correlation to build a more accurate observational covariance.
The prior distribution is determined by the same process discussed previously, where σ p = 0.5 was selected such that
the DNS velocity profile falls within the 2σ limits of the prior model. With σobs = 10−10 and σ p = 0.5, the dependence on
the prior has been effectively eliminated for regions where the inferred function is sensitive to the data.
The inversion is performed for different wall-shear stress-based Reynolds numbers Reτ ∈ [180, 550, 950, 2000, 4200]. An
example of the resulting posterior distribution for inferred velocity and β is given in Fig. 6. The MAP solution is seen to
match the DNS data very well. Due to the low observational variance, the posterior distribution for velocity collapses on the
MAP solution. The posterior distribution for β also collapses on the MAP estimate for y + > 5. Turbulent production within
the viscous sublayer ( y + < 5) is very small and thus the value of β in this region is largely inconsequential and cannot be
inferred with a high degree of confidence.
A summary of the inferred corrections for all Reynolds numbers is given in Fig. 7. A universal scaling is seen with y +
within the inner layer, and with y near the center of the channel, both results being consistent with the underlying physics.
A Reynolds number dependence that is usually missed in traditional turbulence models is additionally observed. A detailed
summary of the inversion is provided in [27].
Gaussian processes are again used to extract the functional relationship β(η). The non-dimensional input features η considered in this process are the inverse solutions for $\{S k/\varepsilon,\ d\sqrt{k}/\nu,\ P/\varepsilon,\ y^+\}$ at Reτ ∈ [180, 550, 950, 4200]; Reτ = 2000 was omitted from this training data set. In the training process, only input features within the inner layer were considered.
It is well recognized that a functional relationship β(η) between the model correction and input features may not exist; or
minimally the accuracy of the functional extraction may vary across the solution space. Injecting the ML algorithm into the
solver may not be sufficient for cases where local ML predictions are highly inaccurate. One method to make an appropriate
model update is to consider an additional Bayesian update step after the machine learning has completed. This final Bayesian
update step (with the appropriate assumptions) is given by
$$ \beta_{post} = \arg\min_{\beta}\ \frac{1}{2}\left[\big(\beta - \beta_{ML}\big)^{T} C_{\beta_{ML}}^{-1} \big(\beta - \beta_{ML}\big) + \big(\beta - \beta_{prior}\big)^{T} C_{\beta_{prior}}^{-1} \big(\beta - \beta_{prior}\big)\right]. \qquad (44) $$
Note that C βML and βML are functions of both the inverse and the machine learning algorithms, while βprior is specified in
the inverse. The posterior model will assimilate to the ML model for regions where the variance of the ML model is low.
For regions of high variance, the posterior model will assimilate to the prior model. Although not shown, the results using
this method for the previous model problem were comparable to those reported in Table 2.
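Under the Gaussian assumptions, Eq. (44) has the closed-form solution of a precision-weighted average; a minimal sketch:

```python
import numpy as np

def final_update(beta_ml, C_ml, beta_prior, C_prior):
    """Minimizer of Eq. (44): where the ML variance is low the result follows
    beta_ml; where it is high the result reverts to beta_prior."""
    P_ml = np.linalg.inv(C_ml)          # ML precision
    P_prior = np.linalg.inv(C_prior)    # prior precision
    return np.linalg.solve(P_ml + P_prior,
                           P_ml @ beta_ml + P_prior @ beta_prior)
```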
The posterior model described above is now tested at Reτ = 2000. The ML model is queried once at the inverse solution
to construct a machine learned correction. This correction is applied during the predictive solution to obtain the final
posterior model. The impact of the final Bayesian update is seen in Fig. 8. The ML prediction performs well within the inner layer; as such, a low training variance is predicted and the posterior model assimilates to the ML model. In the channel core, the ML model is unable to extract an accurate functional relationship. This lack of accuracy is well reflected by the high ML variance in this solution region, as the posterior model assimilates to the prior model.

Fig. 6. Posterior model distributions for planar channel flow at Reτ = 934. The ±2σ limits are shaded in both figures.

Fig. 7. Summary of inferred β for Reτ ∈ [186, 547, 934, 2004, 4200].
Fig. 9 shows the resulting velocity predictions and the associated 95% confidence intervals. Two features worth noting
are the improved performance within the inner layer and the correlation between the confidence intervals and model error.
The MAP solution within the inner layer is much improved, in particular the slight bump that characterizes the buffer layer
is well captured in the posterior model. In the outer layer, however, the mean velocity is under-predicted. In this region,
a high ML variance was predicted and the posterior model reverts to the prior model. The failure to predict the increased
destruction of TKE in the channel core leads to an eddy viscosity that is too high and an under-prediction of velocity. It is worth
noting that the turbulent eddy viscosity predicted by the baseline k − ω model is too high in the channel core. When the
correct behavior within the inner layer is captured and the baseline model is used in the channel core, an under-prediction
in velocity is expected. The confidence intervals given by the posterior model again provide a reasonable estimate of the
underlying uncertainty of the model. While the resulting PDFs should not be viewed as exact, they provide information
about local model accuracy, which is of paramount value to the practitioner.
6. Summary

The wealth of available data from high-fidelity simulations and high-resolution experiments provides unprecedented opportunities to more comprehensively inform closure models. In this work, a data-driven modeling approach, which we refer to as FIML (field inversion and machine learning), was presented. The proposed approach moves beyond parameter
calibration and uses data to directly infer information about the functional form of model discrepancies. The inference
process generates function correction information for specific problems. Once the inference is applied over a number of
problems, machine learning is used to reconstruct the inferred function in terms of variables that will be available during
predictive simulations using lower fidelity models. This step aims to create generic modeling knowledge from the inferred
information. The reconstructed function is then embedded into a predictive solution process. In contrast to existing calibra-
tion frameworks, our approach uses data to directly infer information about underlying model discrepancies and provides a
methodology to generalize the inferred information. This approach provides insight into model error at a fundamental level,
rather than at the level of the output.
The framework was applied to a scalar non-linear ODE model problem, in which missing and deficient terms were re-
constructed and the predictive capability of the improved model was confirmed. In a second application, the methodology was extended to
turbulent channel flow, where DNS data was used to inform a standard Reynolds-averaged closure model. While it was
shown that precise observational statistics may be needed to precisely quantify the posterior distribution, simple approxi-
mations for the prior statistics and linearized Gaussian assumptions for the posterior proved to be sufficient to obtain mean
solutions and posterior distributions that are representative of the modeling error.
The field inversion process directly provides comprehensive information about model discrepancies, which is of great use
to the modeler in the quest to formulate more accurate closures. The machine learning step could be considered as one tool
that can be used to reconstruct the discrepancy in terms of low fidelity model information. It was demonstrated that, for
the simple problems considered, it is possible to use machine learning methods to elicit functional relationships and the
associated uncertainties for the corrections obtained in the inference process. This extraction allows for predictive modeling.
The examples in this paper are illustrative in nature. For the framework to be able to offer improved predictions in
practical situations, inverse problems must be solved over a wide class of problems (and over multiple objective functions
of interest) that will be representative of the deficient physics in the baseline model. Concurrently, the tendency of the
learning process to over-fit data must also be avoided. At every stage of the process, the underlying physical insight is
irreplaceable and thus it is left to the modeler to make judicious choices about the data, prior information and introduction
of one or more correction functions. Further, physical considerations such as realizability and consistency with asymptotic
limits should be enforced.
A number of challenges remain for a full-scale implementation in complex problems. These include grid/numerical
scheme dependence of the inferred corrections, solver convergence, scalability, learning errors, accounting for non-Gaussian
behavior, etc. The present work has nevertheless demonstrated that the FIML method can play a significant role in using
data to more comprehensively inform predictive models, offering a route to creating improved closure approximations while
providing measures of model-form uncertainties.
Acknowledgements
This work was supported by NASA LEARN project NNX15AN98A and by the NSF via grant 1507928.
Appendix A. Assessment of the Gaussian assumption via MCMC

The inverse procedure outlined in this paper makes strong assumptions about the Gaussian nature of the underlying
PDFs. In our approach, it is assumed that distributions for the prior probability p (β), observational data d, likelihood h(d|β),
and conditional probability q(β|d) are all Gaussian. In the linear case, if d and p (β) are Gaussian, the posterior distribution
will be Gaussian. This need not be true in the non-linear case. To accurately infer the posterior PDF in the non-linear case,
more expensive methods (such as sampling) need to be utilized. Since the non-linear heat problem presented in Section 4 is
relatively simple, Markov chain Monte Carlo (MCMC) simulations were performed. Sampling was carried out with the Python package PyMC [28], which utilizes the Metropolis–Hastings algorithm, to determine the posterior distribution. Fig. 10 shows
the posterior distribution determined by MCMC sampling for T ∞ = 50 with the complete observational covariance matrix
(Eq. (25)) and compares it to the MAP solution obtained through Bayesian inversion. It is seen that the comparison between
the two methods is excellent. The MAP solution coincides almost perfectly with the mean MCMC solution. Additionally, the
posterior PDFs are Gaussian across the entire domain and compare well with the MAP solution. While it is not prudent
to generalize these results to other non-linear problems, the Gaussian assumption appears to be reasonable in this specific
problem.
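For reference, a generic random-walk Metropolis sampler of the kind used for this comparison (a sketch; the actual computations used PyMC, and this is not the PyMC API):

```python
import numpy as np

def metropolis(log_post, beta0, n_steps=50_000, step=0.05, rng=None):
    """Random-walk Metropolis-Hastings; log_post evaluates the unnormalized
    log posterior, log h(d|beta) + log p(beta)."""
    if rng is None:
        rng = np.random.default_rng()
    beta = np.asarray(beta0, dtype=float).copy()
    lp = log_post(beta)
    chain = np.empty((n_steps, beta.size))
    for i in range(n_steps):
        prop = beta + step * rng.standard_normal(beta.size)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # accept/reject step
            beta, lp = prop, lp_prop
        chain[i] = beta
    return chain
```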
Appendix B. Computational cost

The primary computational cost in the FIML framework arises from the functional inversion and the construction of the machine-learned model, both of which are off-line processes. The inverse problem requires the solution of a high-dimensional
optimization problem, which could be simplified by using a parametric representation of the function β . The number of
iterations required by optimization algorithms varies from problem to problem, but the computational cost is O (100) solves
of the forward model. In the presented work, the forward model is additionally solved for realizations of β . This sampling
process is performed for both the prior and posterior distributions. The number of samples required depends on the de-
sired accuracy of statistical quantities, but O (100) samples serves as a representative estimate. Additionally, the sampling
process is embarrassingly parallel. Note that the number of forward solves required for sampling and optimization (assum-
ing gradient-based methods) does not scale with dimensionality. Constructing the Gaussian Process ML model requires an
N × N inversion (with N being the number of training points). Although this is an off-line process, it can become pro-
hibitively expensive for very large training sets, demanding sparse and approximate solvers. Additionally, determining the
GP hyperparameters requires the matrix inversion at each iteration of the optimization algorithm. For high-dimensional
learning problems, more efficient ML algorithms (such as neural networks or approximate GPs [23]) need to be considered.
The on-line cost of the FIML framework is realized in the evaluation of the ML model. In the case of GPs, this evalua-
tion requires a matrix-vector multiplication. In practice, the introduction of the ML model may have an impact on solver
convergence and stability, both of which could affect the computational cost.
Fig. 10. The posterior distribution of β as obtained through MCMC is compared to the MAP solution. The upper left figure compares the mean solution
obtained with MCMC and the 95% confidence intervals. The upper right figure shows the PDF for β at z = 0.1. The lower left and right figures show the
PDF for β at z = 0.5 and z = 0.96 respectively.
References
[1] I.M. Navon, Data assimilation for numerical weather prediction: a review, in: Data Assimilation for Atmospheric, Oceanic, and Hydrologic Applications,
Springer, 2009.
[2] G. Kerschen, K. Worden, A. Vakakis, J. Golinval, Past, present, and future of system identification in structural dynamics, Mech. Syst. Signal Process. 20
(2006) 505–592.
[3] N.R. Council, Assessing the Reliability of Complex Models: Mathematical and Statistical Foundations of Verification, Validation, and Uncertainty Quan-
tification, National Academies Press, 2012.
[4] M.C. Kennedy, A. O’Hagan, Bayesian calibration of computer models, J. R. Stat. Soc., Ser. B, Stat. Methodol. 63 (3) (2001) 425–464.
[5] M.C. Kennedy, C.W. Anderson, S. Conti, A. O’Hagan, Case studies in Gaussian process modelling of computer codes, Reliab. Eng. Syst. Saf. 91 (10) (2006)
1301–1309.
[6] S. Conti, J.P. Gosling, J.E. Oakley, A. O’hagan, Gaussian process emulation of dynamic computer codes, Biometrika 96 (3) (2009) 663–676.
[7] J. Brynjarsdóttir, A. O’Hagan, Learning about physical parameters: the importance of model discrepancy, Inverse Probl. 30 (11) (2014) 114007.
[8] P.D. Arendt, D.W. Apley, W. Chen, Quantification of model uncertainty: calibration, model discrepancy, and identifiability, J. Mech. Des. 134 (10) (2012)
100908.
[9] R.C. Smith, Uncertainty Quantification: Theory, Implementation, and Applications, Computational Science and Engineering, vol. 12, SIAM, 2013.
[10] S.H. Cheung, T.A. Oliver, E.E. Prudencio, S. Prudhomme, R.D. Moser, Bayesian uncertainty analysis with applications to turbulence modeling, Reliab. Eng.
Syst. Saf. 96 (9) (2011) 1137–1149.
[11] W.N. Edeling, P. Cinnella, R.P. Dwight, Bayesian estimates of parameter variability in the k − ε turbulence model, 2014.
[12] J.L. Beck, L.S. Katafygiotis, Updating models and their uncertainties. I: Bayesian statistical framework, J. Eng. Mech. 124 (4) (1998) 455–461.
[13] L.M. Berliner, K. Jezek, N. Cressie, Y. Kim, C. Lam, C.V.D. Veen, Modeling dynamic controls on ice streams: a Bayesian statistical approach, J. Glaciol. 54
(2008) 705–714.
[14] K. Sargsyan, H. Najm, R. Ghanem, On the statistical calibration of physical models, Int. J. Chem. Kinet. 47 (4) (April 2015) 246–276.
[15] J.L. Loeppky, D. Bingham, W.J. Welch, Computer model calibration or tuning in practice, Technical report, University of British Columbia, 2006.
[16] C. Soize, Stochastic modeling of uncertainties in computational structural dynamics – recent theoretical advances, J. Sound Vib. 332 (10) (2013) 2379–2395.
[17] M.L. Mehta, Random Matrices, Pure and Applied Mathematics, vol. 142, Academic Press, 2004.
[18] R. Aster, Parameter Estimation and Inverse Problems, Elsevier Academic Press, 2005.
[19] M.B. Giles, M.C. Duta, J.-D. Müller, N.A. Pierce, Algorithm developments for discrete adjoint methods, AIAA J. 41 (2) (2003) 198–205.
[20] J.E. Dennis Jr., J.J. Moré, Quasi-Newton methods, motivation and theory, SIAM Rev. 19 (1) (1977) 46–89.
[21] P. Caplan, Numerical computation of second derivatives with applications to optimization problems, Unpublished academic report, MIT.
[22] B.D. Tracey, K. Duraisamy, J.J. Alonso, A machine learning strategy to assist turbulence model development, in: 53rd AIAA Aerospace Sciences Meeting,
The American Institute of Aeronautics and Astronautics, 2015.
[23] Z.J. Zhang, K. Duraisamy, Machine learning methods for data-driven turbulence modeling, in: AIAA Aviation and Aeronautics Forum and Exposition,
Dallas, Texas, June 2015.
[24] C.E. Rasmussen, C.K.I. Williams, Gaussian Processes for Machine Learning, MIT Press, 2006.
[25] D.C. Wilcox, Turbulence Modeling for CFD, vol. 2, DCW Industries, La Canada, CA, 1998.
[26] J. Jimenez, S. Hoyas, Turbulent fluctuations above the buffer layer of wall-bounded flows, J. Fluid Mech. 611 (2008) 215–236.
[27] E. Parish, K. Duraisamy, Quantification of turbulence modeling uncertainties using full field inversion, in: AIAA Aviation and Aeronautics Forum and
Exposition, Dallas, Texas, June 2015.
[28] A. Patil, D. Huard, C. Fonnesbeck, PyMC: Bayesian stochastic modelling in Python, J. Stat. Softw. 35 (4) (2010).