Accelerated Construction of Projection-Based Reduced-Order Models Via Incremental Approaches

Agouzal and Taddei, Advanced Modeling and Simulation in Engineering Sciences (2024) 11:8
https://doi.org/10.1186/s40323-024-00263-5
RESEARCH ARTICLE, Open Access

Eki Agouzal1,2,3 and Tommaso Taddei2,3*
* Correspondence: [email protected]
1 EDF Lab Paris-Saclay, EDF R&D, 7 Boulevard Gaspard Monge, Palaiseau 91120, France
2 IMB, UMR 5251, University of Bordeaux, 351, cours de la Libération, Talence 33400, France
3 Inria Bordeaux Sud-Ouest, Team MEMPHIS, 200 Av. de la Vieille Tour, Talence 33405, France

Abstract
We present an accelerated greedy strategy for training of projection-based reduced-order models for parametric steady and unsteady partial differential equations. Our approach exploits hierarchical approximate proper orthogonal decomposition to speed up the construction of the empirical test space for least-square Petrov–Galerkin formulations, a progressive construction of the empirical quadrature rule based on a warm start of the non-negative least-square algorithm, and a two-fidelity sampling strategy to reduce the number of expensive greedy iterations. We illustrate the performance of our method for two test cases: a two-dimensional compressible inviscid flow past a LS89 blade at moderate Mach number, and a three-dimensional nonlinear mechanics problem to predict the long-time structural response of the standard section of a nuclear containment building under external loading.
Keywords: Parameterized partial differential equations, Model order reduction, Adaptive sampling, Hyper-reduction
Introduction
Projection-based model reduction of parametric systems
In the past few decades, several studies have shown the potential of model order reduc-
tion (MOR) techniques to speed up the solution to many-query and real-time problems,
and ultimately enable the use of physics-based three-dimensional models for design and
optimization, uncertainty quantification, real-time control and monitoring tasks. The
distinctive feature of MOR methods is the offline/online computational decomposition:
during the offline stage, high-fidelity (HF) simulations are employed to generate an empir-
ical reduced-order approximation of the solution field and a parametric reduced-order
model (ROM); during the online stage, the ROM is solved to estimate the solution field
and relevant quantities of interest for several parameter values. Projection-based ROMs
(PROMs) rely on the projection of the equations onto a suitable low-dimensional test
space. Successful MOR techniques should hence achieve significant online speedups at
acceptable offline training costs. This work addresses the reduction of offline training
costs of PROMs for parametric steady and unsteady partial differential equations (PDEs).
© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit
to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The
images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise
in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright
holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
We are interested in the solution to steady parametric conservation laws. We denote by $\mu$ the vector of $p$ model parameters in the compact parameter region $\mathcal{P} \subset \mathbb{R}^p$; given the domain $\Omega \subset \mathbb{R}^d$ ($d = 2$ or $d = 3$), we introduce the Hilbert spaces $(\mathcal{X}, \|\cdot\|)$ and $(\mathcal{Y}, |||\cdot|||)$ defined over $\Omega$. Then, we consider problems of the form
\[ \text{find } u_\mu \in \mathcal{X} : \quad \mathcal{R}_\mu(u_\mu, v) = 0, \quad \forall\, v \in \mathcal{Y}, \;\; \mu \in \mathcal{P}, \tag{1} \]
where $\mathcal{R} : \mathcal{X} \times \mathcal{Y} \times \mathcal{P} \to \mathbb{R}$ is the parametric residual associated with the PDE of interest. We here focus on linear approximations, that is, we consider reduced-order approximations of the form
\[ \widehat{u}_\mu = Z \widehat{\alpha}_\mu, \tag{2} \]
where $Z : \mathbb{R}^n \to \mathcal{X}$ is a linear operator, and $n$ is much smaller than the size of the HF model; $\widehat{\alpha} : \mathcal{P} \to \mathbb{R}^n$ is the vector of generalized coordinates. We here exploit the least-square Petrov-Galerkin (LSPG, [1]) ROM formulation proposed in [2], which is well-adapted to approximating advection-dominated problems: the approach relies on the definition of a low-dimensional empirical test space $\widehat{\mathcal{Y}} \subset \mathcal{Y}$; furthermore, it relies on the approximation of the HF residual $\mathcal{R}$ through hyper-reduction [3] to enable fast online calculations of the solution to the ROM. In more detail, we consider an empirical quadrature (EQ) procedure [4,5] for hyper-reduction: EQ methods recast the problem of finding a sparse quadrature rule to approximate $\mathcal{R}$ as a sparse representation problem and then resort to optimization algorithms to find an approximate solution. Following [4], we resort to the non-negative least-square (NNLS) method to find the quadrature rule.
We further consider the application to unsteady problems: to ease the presentation, we consider one-step time discretizations based on the time grid $\{t^{(k)}\}_{k=0}^{K}$. Given $\mu \in \mathcal{P}$, we seek the sequence $\{u_\mu^{(k)}\}_{k=0}^{K}$ such that
\[ \begin{cases} \mathcal{R}_\mu^{(k)}\big(u_\mu^{(k)}, u_\mu^{(k-1)}, v\big) = 0 & \forall\, v \in \mathcal{Y}; \\ u_\mu^{(0)} = \bar{u}_\mu^{(0)}; \end{cases} \tag{3} \]
for all $\mu \in \mathcal{P}$. As for the steady-state case, we consider linear ansatzes of the form $\widehat{u}_\mu^{(k)} = Z \widehat{\alpha}_\mu^{(k)}$, where $Z : \mathbb{R}^n \to \mathcal{X}$ is a linear time- and parameter-independent operator, and $\widehat{\alpha}^{(0)}, \ldots, \widehat{\alpha}^{(K)} : \mathcal{P} \to \mathbb{R}^n$ are obtained by projecting Eq. (3) onto a low-dimensional test space. To speed up online costs, we also replace the HF residual in (3) with a rapidly-computable surrogate through the same hyper-reduction technique considered for steady-state problems.
Following the seminal work by Veroy [6], numerous authors have resorted to greedy
methods to adaptively sample the parameter space, with the ultimate aim of reducing the
number of HF simulations performed during the training phase. Algorithm 1 summarizes
the general methodology. First, we initialize the reduced-order basis (ROB) Z and the
ROM based on a priori sampling of the parameter space; second, we repeatedly solve the
ROM and we estimate the error over a range of parameters Ptrain ⊂ P ; third, we compute
the HF solution for the parameter that maximizes the error indicator; fourth, if the error is
above a certain threshold, we update the ROB Z and the ROM, and we iterate; otherwise,
we terminate. Note that the algorithm depends on an a posteriori error indicator $\Delta$: if $\Delta$ is a rigorous a posteriori error estimator, we might apply the termination criterion directly to the error indicator (and hence save one HF solve). The methodology has been
extended to unsteady problems in [7]: the method in [7] combines a greedy search driven
by an a posteriori error indicator with proper orthogonal decomposition (POD, [8,9]) to
compress the temporal trajectory.
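Algorithm 1 Abstract weak greedy algorithm
1: Choose $\mathcal{P}_\star = \{\mu^{\star,i}\}_{i=1}^{n_0}$ and compute the HF solutions $\mathcal{S}_\star = \{u^{\rm hf}_\mu : \mu \in \mathcal{P}_\star\}$.
2: for $n = n_0 + 1, \ldots, n_{\max}$ do
3:   Update the ROB $Z$ and the ROM.
4:   Compute the estimate $\widehat{u}_\mu$ and evaluate the error indicator $\Delta_\mu$ for all $\mu \in \mathcal{P}_{\rm train}$.
5:   Compute $u^{\rm hf}_{\mu^{\star,n}}$ for $\mu^{\star,n} = \arg\max_{\mu \in \mathcal{P}_{\rm train}} \Delta_\mu$; update $\mathcal{P}_\star$ and $\mathcal{S}_\star$.
6:   if $\|u^{\rm hf}_{\mu^{\star,n}} - \widehat{u}_{\mu^{\star,n}}\| < {\rm tol}\, \|u^{\rm hf}_{\mu^{\star,n}}\|$ then
7:     Update the ROB $Z$ and the ROM.
8:     break
9:   end if
10: end for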
As observed by several authors, greedy methods enable effective sampling of the param-
eter space [10]; however, they suffer from several limitations that might ultimately limit
their effectiveness compared to standard a priori sampling. First, Algorithm 1 is inher-
ently sequential and cannot hence benefit from parallel architectures. Second, Algorithm 1
requires the solution to the ROM and the evaluation of the error indicator for several
parameter values at each iteration of the offline stage; similarly, it requires the update
of the ROM—i.e., the trial and test bases, the reduced quadrature, and possibly the data
structures employed to evaluate the error indicator. These observations motivate the
development of more effective training strategies for MOR.
Contributions and relation to previous work

We propose an acceleration strategy for Algorithm 1 based on three ingredients: (i) a
hierarchical construction of the empirical test space for LSPG ROMs based on the hierar-
chical approximate POD (HAPOD, [11]); (ii) a progressive construction of the empirical
quadrature rule based on a warm start of the NNLS algorithm; (iii) a two-fidelity sam-
pling strategy that is based on the application of the strong-greedy algorithm (see, e.g.,
[12, section 7.3]) to a dataset of coarse simulations. Regarding (iii), sampling based on coarse
simulations is employed to initialize Algorithm 1 (cf. Line 1) and ultimately reduce the
number of expensive greedy iterations. We illustrate the performance of our method for
two test cases: a two-dimensional compressible inviscid flow past a LS89 blade at moder-
ate Mach number, and a three-dimensional nonlinear mechanics problem to predict the
long-time structural response of the standard section of a nuclear containment building
(NCB) under external loading.
Our method shares several features with previously-developed techniques. Incremental
POD techniques have been extensively applied to avoid the storage of the full snapshot
set for unsteady simulations (see, e.g., [11] and the references therein): here, we adapt
the incremental approach described in [11] to the construction of the test space for
LSPG formulations. Chapman [13] extensively discussed the parallelization of the NNLS
algorithm for MOR applications: we envision that our method can be easily combined
with the method in [13] to further reduce the training costs. Several authors have also
devised strategies to speed up the greedy search through the vehicle of a surrogate error
model [14,15]; our multi-fidelity strategy extends the work by Barral [16] to unsteady
problems and to a more challenging—from the perspective of the HF solver—compressible
flow test. We remark that our approach is similar in scope to the work by Benaceur
[17] that devised a progressive empirical interpolation method (EIM, [18]) for hyper-
reduction of nonlinear problems. Finally, we observe that multi-fidelity techniques have
been extensively considered for non-intrusive MOR (see, e.g., [19] and the references
therein).
The paper is organized as follows. In “Projection-based model reduction of parametric
systems”, we review relevant elements of the construction of MOR techniques based
on LSPG projection for steady conservation laws; we further address the construction
of Galerkin ROMs for problems of the form (3). Then, in “Accelerated construction
of PROMs”, we present the accelerated strategy for both classes of ROMs. “Numerical
results” illustrates the performance of the method for two model problems. “Conclusions”
draws some conclusions and discusses future developments.
Projection-based model reduction of parametric systems

“Preliminary definitions and tools” summarizes notation that is employed throughout the paper and reviews three algorithms (POD, strong-greedy, and the active set method for NNLS problems) that are used afterwards. “Least-square Petrov–Galerkin formulation of steady problems” reviews the LSPG formulation employed in the present work for steady conservation laws and illustrates the strategies employed for the definition of the quadrature rule
and the empirical test space. Finally, “Vanilla POD-greedy for Galerkin time-marching
ROMs” addresses model reduction of unsteady systems of the form (3).
Preliminary definitions and tools

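We denote by $\mathcal{T}_{\rm hf} = \big(\{x_j^{\rm hf}\}_{j=1}^{N_{\rm nd}}, \mathtt{T}\big)$ the HF mesh of the domain $\Omega$ with nodes $\{x_j^{\rm hf}\}_j$ and connectivity matrix $\mathtt{T}$; we introduce the elements $\{D_k\}_{k=1}^{N_{\rm e}}$ and the facets $\{F_j\}_{j=1}^{N_{\rm f}}$ of the mesh and we define the open set $\widetilde{F}_j$ in $\Omega$ as the union of the elements of the mesh that share the facet $F_j$, with $j = 1, \ldots, N_{\rm f}$. We further denote by $\mathcal{X}_{\rm hf}$ the HF space associated with the mesh $\mathcal{T}_{\rm hf}$ and we define $N_{\rm hf} := \dim(\mathcal{X}_{\rm hf})$.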
Exploiting the previous definitions, we can introduce the HF residual:
\[ \mathcal{R}^{\rm hf}_\mu(w, v) = \sum_{k=1}^{N_{\rm e}} r^{\rm e}_{k,\mu}\big(w|_{D_k}, v|_{D_k}\big) + \sum_{j=1}^{N_{\rm f}} r^{\rm f}_{j,\mu}\big(w|_{\widetilde{F}_j}, v|_{\widetilde{F}_j}\big), \tag{4} \]
for all $w, v \in \mathcal{X}_{\rm hf}$. To shorten notation, in the remainder, we do not explicitly include the restriction operators in the residual. The global residual can be viewed as the sum of local element-wise residuals and local facet-wise residuals, which can be evaluated at a cost that is independent of the total number of elements and facets, and is based on local information. The nonlinear infinite-dimensional statement (1) translates into the high-dimensional problem: find $u^{\rm hf}_\mu \in \mathcal{X}_{\rm hf}$ such that
\[ \mathcal{R}^{\rm hf}_\mu\big(u^{\rm hf}_\mu, v\big) = 0 \quad \forall\, v \in \mathcal{X}_{\rm hf}. \tag{5} \]
Since (5) is nonlinear, its solution requires solving a nonlinear system of $N_{\rm hf}$ equations with $N_{\rm hf}$ unknowns. Towards this end, we here resort to the pseudo-transient continuation (PTC) strategy proposed in [20].
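We use the method of snapshots (cf. [8]) to compute POD eigenvalues and eigenvectors. Given the snapshot set $\{u^k\}_{k=1}^{n_{\rm train}} \subset \mathcal{X}_{\rm hf}$ and the inner product $(\cdot,\cdot)_{\rm pod}$, we define the Gramian matrix $\mathbf{C} \in \mathbb{R}^{n_{\rm train} \times n_{\rm train}}$, $\mathbf{C}_{k,k'} = (u^k, u^{k'})_{\rm pod}$, and we define the POD eigenpairs $\{(\lambda_i, \zeta_i)\}_{i=1}^{n_{\rm train}}$ as
\[ \mathbf{C} \boldsymbol{\zeta}_i = \lambda_i \boldsymbol{\zeta}_i, \qquad \zeta_i := \sum_{k=1}^{n_{\rm train}} (\boldsymbol{\zeta}_i)_k\, u^k, \qquad i = 1, \ldots, n_{\rm train}, \]
with $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_{n_{\rm train}} \ge 0$. In our implementation, we orthonormalize the modes, that is $(\zeta_n, \zeta_n)_{\rm pod} = 1$ for $n = 1, \ldots, n_{\rm train}$. In the remainder we use notation
\[ \big[\{\zeta_i\}_{i=1}^{n}, \{\lambda_i\}_{i=1}^{n}\big] = {\rm POD}\Big(\{u^k\}_{k=1}^{n_{\rm train}},\, n,\, (\cdot,\cdot)_{\rm pod}\Big) \tag{6} \]
to refer to the application of POD to the snapshot set $\{u^k\}_{k=1}^{n_{\rm train}}$. The number of modes $n$ can be chosen adaptively by ensuring that the retained energy content is above a certain threshold (see, e.g., [12, Eq. (6.12)]).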
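The method of snapshots described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `pod_snapshots` and the matrix `X` that encodes the inner product $(\cdot,\cdot)_{\rm pod}$ are assumptions made for the example, and the retained eigenvalues are assumed to be strictly positive.

```python
import numpy as np

def pod_snapshots(U, n, X=None):
    """POD by the method of snapshots (illustrative sketch).

    U : (N, n_train) array whose columns are the snapshots u^k.
    n : number of modes to retain (retained eigenvalues assumed positive).
    X : (N, N) SPD matrix such that (u, v)_pod = u^T X v; identity if None.
    Returns the orthonormal modes (N, n) and the n leading eigenvalues.
    """
    XU = U if X is None else X @ U
    C = U.T @ XU                         # Gramian C[k, k'] = (u^k, u^k')_pod
    lam, V = np.linalg.eigh(C)           # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:n]    # keep the n largest
    lam, V = lam[order], V[:, order]
    Z = U @ V                            # zeta_i = sum_k (V_i)_k u^k
    Z = Z / np.sqrt(lam)                 # (zeta_i, zeta_i)_pod = lam_i before scaling
    return Z, lam
```

With the Euclidean inner product the returned eigenvalues coincide with the squared singular values of the snapshot matrix, which gives a simple consistency check.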
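We further recall the strong-greedy algorithm: the algorithm takes as input the snapshot set $\{u^k\}_{k=1}^{n_{\rm train}}$, an integer $n$, an inner product $(\cdot,\cdot)_{\rm sg}$ and the induced norm $\|\cdot\|_{\rm sg} = \sqrt{(\cdot,\cdot)_{\rm sg}}$, and returns a set of $n$ indices ${\rm I}_{\rm sg} \subset \{1, \ldots, n_{\rm train}\}$
\[ {\rm I}_{\rm sg} = \texttt{strong-greedy}\Big(\{u^k\}_{k=1}^{n_{\rm train}},\, n,\, (\cdot,\cdot)_{\rm sg}\Big). \tag{7} \]
The dimension $n$ can be chosen adaptively by ensuring that the projection error is below a given threshold ${\rm tol}_{\rm sg} > 0$,
\[ n = \min\left\{ n' \in \{1, \ldots, n_{\rm train}\} : \max_{k \in \{1, \ldots, n_{\rm train}\}} \frac{\|\Pi_{\mathcal{Z}_{n'}^\perp} u^k\|_{\rm sg}}{\|u^k\|_{\rm sg}} \le {\rm tol}_{\rm sg} \right\}, \tag{8} \]
where $\mathcal{Z}_{n'}$ denotes the $n'$-dimensional space obtained after $n'$ steps of the greedy procedure in Algorithm 2, $\mathcal{Z}_{n'}^\perp$ is the orthogonal complement of the space $\mathcal{Z}_{n'}$ and $\Pi_{\mathcal{Z}_{n'}^\perp} : \mathcal{X} \to \mathcal{Z}_{n'}^\perp$ is the orthogonal projection operator onto $\mathcal{Z}_{n'}^\perp$.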
We conclude this section by reviewing the active set method [21] that is employed to find a sparse solution to the non-negative least-square problem:
\[ \min_{\rho \in \mathbb{R}^N} \|G\rho - b\|_2 \quad \text{s.t.} \quad \rho \ge 0, \tag{9} \]
for a given matrix $G \in \mathbb{R}^{M \times N}$ and a vector $b \in \mathbb{R}^M$. Algorithm 3 reviews the computational procedure. In the remainder, we use notation
\[ [\rho] = \texttt{NNLS}(G, b, \delta, P_0), \tag{10} \]

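Algorithm 2 Strong-greedy algorithm
Inputs: $\{u^k\}_{k=1}^{n_{\rm train}}$ snapshot set, $n$ size of the desired reduced space, $(\cdot,\cdot)_{\rm sg}$ inner product.
Outputs: ${\rm I}_{\rm sg}$ indices of selected snapshots.
1: Choose $\mathcal{Z}_0 = \emptyset$, ${\rm I}_{\rm sg} = \emptyset$, set ${\rm I}_{\rm train} = \{1, \ldots, n_{\rm train}\}$.
2: for $i = 1, \ldots, n$ do
3:   Compute $i^\star = \arg\max_{i \in {\rm I}_{\rm train}} \|\Pi_{\mathcal{Z}_{i-1}^\perp} u^i\|_{\rm sg}$.
4:   Update $\mathcal{Z}_i = \mathcal{Z}_{i-1} \cup {\rm span}\{u^{i^\star}\}$ and ${\rm I}_{\rm sg} = {\rm I}_{\rm sg} \cup \{i^\star\}$.
5: end for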
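Algorithm 2 can be sketched as follows, under the simplifying assumption that $(\cdot,\cdot)_{\rm sg}$ is the Euclidean inner product and that $n$ does not exceed the rank of the snapshot matrix. The function name `strong_greedy` is hypothetical; the sketch stores the projection residuals of all snapshots and updates them by one Gram-Schmidt step per iteration.

```python
import numpy as np

def strong_greedy(U, n):
    """Return the indices of n snapshots (columns of U) selected greedily.

    Sketch under assumptions: Euclidean inner product, n <= rank(U).
    """
    R = np.array(U, dtype=float)          # residuals of the projection onto Z_{i-1}
    selected = []
    for _ in range(n):
        norms = np.linalg.norm(R, axis=0)
        i_star = int(np.argmax(norms))    # snapshot worst approximated by Z_{i-1}
        selected.append(i_star)
        q = R[:, i_star] / norms[i_star]  # new orthonormal direction
        R = R - np.outer(q, q @ R)        # project residuals onto the complement
    return selected
```

Zero-based indices are returned, in contrast with the one-based convention of Algorithm 2.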
Algorithm 3 Active set method for (9)
Inputs: $G \in \mathbb{R}^{M \times N}$, $b \in \mathbb{R}^M$, $\delta > 0$, $P_0$.
Output: $\rho = x$ approximate solution to (9), it number of iterations to meet convergence criterion.
1: Choose $x = 0$, $w = G^\top b$, $P = P_0$, it $= 0$.
2: while true do
3:   Compute $r = G(:, P)\, x(P) - b$.
4:   if $\#P = N$ or $\|r\|_2 \le \delta \|b\|_2$ then
5:     break
6:   end if
7:   Set $i^\star = \arg\max_{j \notin P} (w)_j$.
8:   Set $P = P \cup \{i^\star\}$.
9:   while true do
10:    it = it + 1
11:    Define $z \in \mathbb{R}^N$ s.t. $z(P^{\rm c}) = 0$, $z(P) = G(:, P)^\dagger b$.
12:    if $z \ge 0$ then
13:      Set $x = z$, $w = G^\top (b - G x)$.
14:      break
15:    end if
16:    $I = \{i \in \{1, \ldots, N\} : (z)_i < 0\}$.
17:    $[\alpha, i^\star] = \min_{i \in I} \dfrac{(x)_i}{(x)_i - (z)_i + \epsilon}$.
18:    $P = P \setminus \{i^\star\}$.
19:    $x = x - \alpha (x - z)$.
20:  end while
21: end while
to refer to the application of Algorithm 3.

Note that the method takes as input a set of indices (which is initialized with the empty set in the absence of prior information) to initialize the process. Given the matrix $G = [g_1, \ldots, g_N]$, the vector $x \in \mathbb{R}^N$, and the set of indices $P = \{p_i\}_{i=1}^{m} \subset \{1, \ldots, N\}$, we use notation $G(:, P) := [g_{p_1}, \ldots, g_{p_m}] \in \mathbb{R}^{M \times m}$ and $x(P) = {\rm vec}((x)_{p_1}, \ldots, (x)_{p_m}) \in \mathbb{R}^m$; we denote by $\#P$ the cardinality of the discrete set $P$, and we introduce the complement of $P$ in $\{1, \ldots, N\}$ as $P^{\rm c} = \{1, \ldots, N\} \setminus P$. Given the vector $x \in \mathbb{R}^N$ and the set of indices $I \subset \{1, \ldots, N\}$, notation $[\alpha, i^\star] = \min_{i \in I} (x)_i$ signifies that $\alpha = \min_{i \in I} (x)_i$ and $i^\star \in I$ realizes the minimum, $\alpha = (x)_{i^\star}$. The constant $\epsilon > 0$ is intended to avoid division by zero and is set to $2^{-1022}$. The computational cost of Algorithm 3 is dominated by the cost to repeatedly solve the least-square problem at Line 11: in “Numerical results”, we hence report the total number of least-square solves needed to achieve convergence (cf. output it in Algorithm 3).
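A compact NumPy transcription of Algorithm 3 may help to see where the initial index set $P_0$ enters. This is a sketch, not the authors' code: the function name `nnls_active_set`, the `max_it` cap, and the early exit when no candidate has a positive gradient entry are safeguards added for this illustration and are not part of Algorithm 3.

```python
import numpy as np

def nnls_active_set(G, b, delta, P0=(), eps=2.0 ** -1022, max_it=10_000):
    """Active set method for min ||G x - b||_2 s.t. x >= 0 (illustrative sketch).

    P0 is the initial index set (empty for a cold start).
    Returns x and the number of least-square solves.
    """
    N = G.shape[1]
    x = np.zeros(N)
    P = set(P0)
    w = G.T @ (b - G @ x)
    it = 0
    while it < max_it:                           # cap added for safety (assumption)
        r = G @ x - b
        if len(P) == N or np.linalg.norm(r) <= delta * np.linalg.norm(b):
            break
        outside = [j for j in range(N) if j not in P]
        i_star = outside[int(np.argmax(w[outside]))]
        if w[i_star] <= 0.0:                     # no descent direction (added safeguard)
            break
        P.add(i_star)
        while True:
            it += 1
            idx = sorted(P)
            if not idx:                          # guard against an empty active set
                x = np.zeros(N)
                break
            z = np.zeros(N)
            z[idx] = np.linalg.lstsq(G[:, idx], b, rcond=None)[0]
            if np.all(z[idx] >= 0.0):
                x = z
                w = G.T @ (b - G @ x)
                break
            neg = [i for i in idx if z[i] < 0.0]
            ratios = [x[i] / (x[i] - z[i] + eps) for i in neg]
            k = int(np.argmin(ratios))
            x = x - ratios[k] * (x - z)          # step towards z while staying feasible
            P.discard(neg[k])
    return x, it
```

A warm start amounts to passing the support of a previous solution, for instance `P0=np.flatnonzero(x_prev)`.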
Least-square Petrov–Galerkin formulation of steady problems

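Given the reduced-order basis (ROB) $Z = [\zeta_1, \ldots, \zeta_n] : \mathbb{R}^n \to \mathcal{X}_{\rm hf}$, following [16], we consider the LSPG formulation
\[ \widehat{u}_\mu = Z \widehat{\alpha}_\mu \quad \text{with} \quad \widehat{\alpha}_\mu \in \arg\min_{\alpha \in \mathbb{R}^n} \max_{\psi \in \widehat{\mathcal{Y}}} \frac{\mathcal{R}^{\rm eq}_\mu(Z\alpha, \psi)}{|||\psi|||} \tag{11a} \]
where $\widehat{\mathcal{Y}} = {\rm span}\{\psi_i\}_{i=1}^{m}$ is a suitable test space that is chosen below and the empirical residual $\mathcal{R}^{\rm eq}_\mu$ satisfies
\[ \mathcal{R}^{\rm eq}_\mu(w, v) = \sum_{k=1}^{N_{\rm e}} \rho_k^{\rm eq,e}\, r^{\rm e}_{k,\mu}(w, v) + \sum_{j=1}^{N_{\rm f}} \rho_j^{\rm eq,f}\, r^{\rm f}_{j,\mu}(w, v) \tag{11b} \]
where $\rho^{\rm eq,e} \in \mathbb{R}_+^{N_{\rm e}}$ and $\rho^{\rm eq,f} \in \mathbb{R}_+^{N_{\rm f}}$ are sparse vectors of non-negative weights. Provided that $\{\psi_i\}_{i=1}^{m}$ is an orthonormal basis of $\widehat{\mathcal{Y}}$, we can rewrite (11a) as
\[ \min_{\alpha \in \mathbb{R}^n} \big\| \mathbf{R}^{\rm eq}_\mu(\alpha) \big\|_2, \quad \text{with} \quad \big(\mathbf{R}^{\rm eq}_\mu(\alpha)\big)_i = \mathcal{R}^{\rm eq}_\mu(Z\alpha, \psi_i), \quad i = 1, \ldots, m, \tag{12} \]
which can be efficiently solved using the Gauss-Newton method (GNM). Note that (12) does not explicitly depend on the choice of the test norm $|||\cdot|||$: dependence on the norm is implicit in the choice of $\{\psi_i\}_{i=1}^{m}$. Formulation (11)–(12) depends on the choice of the test space $\widehat{\mathcal{Y}}$ and the empirical weights $\rho^{\rm eq,e}, \rho^{\rm eq,f}$: in the remainder of this section, we address the construction of these ingredients.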
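To illustrate how a nonlinear least-square problem of the form (12) is handled by the Gauss-Newton method, the sketch below applies the iteration to a generic residual. The function `gauss_newton` and its callbacks `residual` and `jacobian` are hypothetical names introduced for this example; in an actual ROM they would assemble the reduced residual $\mathbf{R}^{\rm eq}_\mu(\alpha)$ and its Jacobian with respect to $\alpha$.

```python
import numpy as np

def gauss_newton(residual, jacobian, alpha0, tol=1e-10, max_iter=50):
    """Minimize ||residual(alpha)||_2 by Gauss-Newton (illustrative sketch)."""
    alpha = np.array(alpha0, dtype=float)
    for _ in range(max_iter):
        R = residual(alpha)                  # vector in R^m
        J = jacobian(alpha)                  # matrix in R^{m x n}, m >= n
        # Linearized least-square problem: min_s ||R + J s||_2.
        step = np.linalg.lstsq(J, -R, rcond=None)[0]
        alpha = alpha + step
        if np.linalg.norm(step) <= tol * max(1.0, np.linalg.norm(alpha)):
            break
    return alpha

# Toy residual with m = 3 test functions and n = 2 generalized coordinates.
toy_residual = lambda a: np.array([a[0] - 1.0, a[1] - 2.0, a[0] * a[1] - 2.0])
toy_jacobian = lambda a: np.array([[1.0, 0.0], [0.0, 1.0], [a[1], a[0]]])
alpha_opt = gauss_newton(toy_residual, toy_jacobian, [0.0, 0.0])
```

No line search or damping is included; practical implementations typically add one of the two for robustness.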
We remark that Carlberg [1] considered a different projection method, which is based
on the minimization of the Euclidean norm of the discrete residual: due to the particular
choice of the test norm, the approach of [1] does not require the explicit construction
of the empirical test space. We further observe that Yano and collaborators [22,23] have
considered different formulations of the empirical residual (11b). A thorough comparison
between different projection methods and different hyper-reduction techniques is beyond
the scope of the present study.
Construction of the empirical test space

As discussed in [2], the test space $\widehat{\mathcal{Y}}$ should approximate the Riesz representers of the functionals associated with the action of the Jacobian on the elements of the trial ROB. Given the snapshot set of HF solutions $\{u^{\rm hf}_\mu : \mu \in \mathcal{P}_{\rm train}\}$ with $\mathcal{P}_{\rm train} = \{\mu^k\}_{k=1}^{n_{\rm train}} \subset \mathcal{P}$ and the ROB $\{\zeta_i\}_{i=1}^{n}$, we hence apply POD to the test snapshot set $\mathcal{S}_n^{\rm test} := \{\Psi_{k,i} : k = 1, \ldots, n_{\rm train},\ i = 1, \ldots, n\}$, where
\[ ((\Psi_{k,i}, v)) = \mathcal{J}^{\rm hf}_{\mu^k}\big[u^{\rm hf}_{\mu^k}\big](\zeta_i, v), \quad \forall\, v \in \mathcal{X}_{\rm hf}, \tag{13} \]
where $\mathcal{J}^{\rm hf}_\mu[w] : \mathcal{X}_{\rm hf} \times \mathcal{X}_{\rm hf} \to \mathbb{R}$ denotes the Fréchet derivative of the HF residual at $w$. It is also useful to provide an approximate computation of the right-hand side of (13),
\[ ((\Psi_{k,i}, v)) \approx \frac{1}{\epsilon}\Big( \mathcal{R}^{\rm hf}_{\mu^k}\big(u^{\rm hf}_{\mu^k} + \epsilon \zeta_i, v\big) - \mathcal{R}^{\rm hf}_{\mu^k}\big(u^{\rm hf}_{\mu^k}, v\big) \Big) \quad \forall\, v \in \mathcal{X}_{\rm hf}, \tag{14} \]
with $|\epsilon| \ll 1$. The evaluation of the right-hand side of (14) involves the computation of the residual at $u^{\rm hf}_{\mu^k} + \epsilon \zeta_i$ for $i = 1, \ldots, n$; on the other hand, the evaluation of (13) requires the computation of the Jacobian matrix at $u^{\rm hf}_{\mu^k}$ and the post-multiplication by the algebraic counterpart of $Z$. Both (13) and (14), which are equivalent in the limit $\epsilon \to 0$, are used in the incremental approach of “Progressive construction of empirical test space and quadrature”.
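The trade-off between (13) and (14) can be seen on a small algebraic example. The residual and Jacobian below are hypothetical stand-ins for the HF operators, chosen so that the exact Jacobian action is available for comparison; the function names are assumptions of this sketch.

```python
import numpy as np

def residual(u):
    # Hypothetical nonlinear algebraic residual (stand-in for the HF residual).
    return u ** 3 + 2.0 * u - 1.0

def jacobian(u):
    # Exact Jacobian of the residual above (diagonal for this toy problem).
    return np.diag(3.0 * u ** 2 + 2.0)

def jacobian_action_fd(u, zeta, eps=1e-6):
    # Right-hand side of (14): one extra residual evaluation per direction zeta.
    return (residual(u + eps * zeta) - residual(u)) / eps

u = np.linspace(0.0, 1.0, 5)
zeta = np.ones(5)
exact = jacobian(u) @ zeta            # right-hand side of (13)
approx = jacobian_action_fd(u, zeta)  # right-hand side of (14)
fd_error = np.max(np.abs(exact - approx))
```

The finite difference variant avoids assembling the Jacobian at the price of a truncation error proportional to `eps`; the Riesz representer is then obtained by solving a linear system with the matrix of the inner product $((\cdot,\cdot))$.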
Construction of the empirical quadrature rule

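Following [5], we seek $\rho^{\rm eq,e} \in \mathbb{R}_+^{N_{\rm e}}$ and $\rho^{\rm eq,f} \in \mathbb{R}_+^{N_{\rm f}}$ in (11b) such that
(i) (efficiency constraint) the number of nonzero entries in $\rho^{\rm eq,e}$, $\rho^{\rm eq,f}$, ${\rm nnz}(\rho^{\rm eq,e})$ and ${\rm nnz}(\rho^{\rm eq,f})$, is as small as possible;
(ii) (constant function constraint) the constant function is approximated correctly:
\[ \Big| \sum_{k=1}^{N_{\rm e}} \rho_k^{\rm eq,e} |D_k| - |\Omega| \Big| \ll 1, \qquad \Big| \sum_{j=1}^{N_{\rm f}} \rho_j^{\rm eq,f} |F_j| - \sum_{j=1}^{N_{\rm f}} |F_j| \Big| \ll 1; \tag{15} \]
(iii) (manifold accuracy constraint) for all $\mu \in \mathcal{P}_{\rm train,eq} = \{\mu^k\}_{k=1}^{n_{\rm train} + n_{\rm train,eq}}$, the empirical residual satisfies
\[ \big\| \mathbf{R}^{\rm hf}_\mu(\alpha^{\rm train}_\mu) - \mathbf{R}^{\rm eq}_\mu(\alpha^{\rm train}_\mu) \big\|_2 \ll 1, \tag{16a} \]
where $\mathbf{R}^{\rm hf}_\mu$ corresponds to substituting $\rho_1^{\rm eq,e} = \ldots = \rho_{N_{\rm e}}^{\rm eq,e} = \rho_1^{\rm eq,f} = \ldots = \rho_{N_{\rm f}}^{\rm eq,f} = 1$ in (11b) and $\alpha^{\rm train}_\mu$ satisfies
\[ \alpha^{\rm train}_\mu = \begin{cases} \arg\min_{\alpha \in \mathbb{R}^n} \|Z\alpha - u^{\rm hf}_\mu\|, & \text{if } \mu \in \mathcal{P}_{\rm train}; \\ \arg\min_{\alpha \in \mathbb{R}^n} \|\mathbf{R}^{\rm hf}_\mu(\alpha)\|_2, & \text{if } \mu \notin \mathcal{P}_{\rm train}; \end{cases} \tag{16b} \]
and $\mathcal{P}_{\rm train} = \{\mu^k\}_{k=1}^{n_{\rm train}}$ is the set of parameters for which the HF solution is available. In the remainder, we use $\mathcal{P}_{\rm train,eq} = \mathcal{P}_{\rm train}$.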
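By tedious but straightforward calculations, we find that
\[ \mathcal{R}^{\rm eq}_{\mu^k}\big(Z\alpha^{\rm train}_{\mu^k}, \psi_i\big) = \mathbf{G}^{\rm e}_{k,i} \cdot \rho^{\rm eq,e} + \mathbf{G}^{\rm f}_{k,i} \cdot \rho^{\rm eq,f} \tag{17a} \]
for some row vectors $\mathbf{G}^{\rm e}_{k,i} \in \mathbb{R}^{1 \times N_{\rm e}}$ and $\mathbf{G}^{\rm f}_{k,i} \in \mathbb{R}^{1 \times N_{\rm f}}$, $k = 1, \ldots, n_{\rm train}$ and $i = 1, \ldots, m$. Therefore, if we define
\[ \mathbf{G}^{\rm cnst,e} = \big[|D_1|, \ldots, |D_{N_{\rm e}}|\big], \qquad \mathbf{G}^{\rm cnst,f} = \big[|F_1|, \ldots, |F_{N_{\rm f}}|\big], \tag{17b} \]
we find that the constraints (15) and (16) can be expressed in the algebraic form
\[ \big\| \mathbf{G}\big(\rho^{\rm eq} - \rho^{\rm hf}\big) \big\|_2 \ll 1, \quad \text{with} \quad \mathbf{G} = \begin{bmatrix} \mathbf{G}^{\rm e}_{1,1} & \mathbf{G}^{\rm f}_{1,1} \\ \vdots & \vdots \\ \mathbf{G}^{\rm e}_{n_{\rm train},m} & \mathbf{G}^{\rm f}_{n_{\rm train},m} \\ \mathbf{G}^{\rm cnst,e} & 0 \\ 0 & \mathbf{G}^{\rm cnst,f} \end{bmatrix}, \quad \rho^{\rm eq} = \begin{bmatrix} \rho^{\rm eq,e} \\ \rho^{\rm eq,f} \end{bmatrix}, \tag{17c} \]
and $\rho^{\rm hf} = [1, \ldots, 1]^\top$.

In conclusion, the problem of finding the sparse weights $\rho^{\rm eq,e}, \rho^{\rm eq,f}$ can be recast as a sparse representation problem
\[ \min_{\rho \in \mathbb{R}_+^{N_{\rm e} + N_{\rm f}}} {\rm nnz}(\rho), \quad \text{s.t.} \quad \|\mathbf{G}\rho - \mathbf{b}\|_2 \le \delta \|\mathbf{b}\|_2, \tag{18} \]
where ${\rm nnz}(\rho)$ is the number of non-zero entries in the vector $\rho$, for some user-defined tolerance $\delta > 0$, and $\mathbf{b} = \mathbf{G}\rho^{\rm hf}$. Following [4], we resort to the NNLS algorithm discussed in “Preliminary definitions and tools” (cf. Algorithm 3) to find an approximate solution to (18).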
A posteriori error indicator

The final element of the formulation is the a posteriori error indicator that is employed for the parameter exploration. We here consider the residual-based error indicator (cf. [16]),
\[ \Delta : \mu \in \mathcal{P} \mapsto \sup_{v \in \mathcal{X}_{\rm hf}} \frac{\mathcal{R}^{\rm hf}_\mu(\widehat{u}_\mu, v)}{|||v|||}. \tag{19} \]
Note that the evaluation of (19) requires the solution to a symmetric positive definite linear system of size $N_{\rm hf}$: it is hence ill-suited for real-time online computations; nevertheless, in our experience the offline cost associated with the evaluation of (19) is comparable with the cost that is needed to solve the ROM. Clearly, this is related to the size of the mesh and is hence strongly problem-dependent.
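The reason why (19) reduces to one symmetric positive definite solve can be made explicit with a short calculation. The algebraic notation below is an assumption of this note and is not taken from the paper: $\{\varphi_i\}_{i=1}^{N_{\rm hf}}$ is a basis of $\mathcal{X}_{\rm hf}$, $\mathbf{X}$ is the matrix of the inner product that induces $|||\cdot|||$, and $\mathbf{r}_\mu$ is the algebraic residual.

```latex
% Assumed notation: X_{ij} = ((\varphi_j, \varphi_i)),
% (r_\mu)_i = R^{hf}_\mu(\widehat{u}_\mu, \varphi_i), and v = \sum_i (\mathbf{v})_i \varphi_i.
\begin{aligned}
\Delta_\mu
  &= \sup_{\mathbf{v} \in \mathbb{R}^{N_{\rm hf}}}
     \frac{\mathbf{v}^\top \mathbf{r}_\mu}{\sqrt{\mathbf{v}^\top \mathbf{X} \mathbf{v}}}
   = \sqrt{\mathbf{r}_\mu^\top \mathbf{X}^{-1} \mathbf{r}_\mu},
\end{aligned}
% The supremum is attained at the Riesz representer \mathbf{v}^\star = \mathbf{X}^{-1} \mathbf{r}_\mu
% (Cauchy-Schwarz inequality in the inner product induced by \mathbf{X}).
```

Under these assumptions, each evaluation of the indicator costs one residual assembly and one linear solve with $\mathbf{X}$, whose factorization can be computed once and reused for all parameters.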
Overview of the computational procedure

Algorithm 4 provides a detailed summary of the construction of the ROM at each step
of Algorithm 1 (cf. Line 7). Some comments are in order. The cost of Algorithm 4 is
dominated by the assembly of the test snapshot set and by the solution to the NNLS
problem. We also notice that the storage of $\mathcal{S}_n^{\rm test}$ scales with $\mathcal{O}(n^2)$ and is hence the dominant memory cost of the offline procedure.
Vanilla POD-greedy for Galerkin time-marching ROMs

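We denote by $u_\mu \in \mathbb{R}^d$ the displacement field, by $\sigma_\mu \in \mathbb{R}^{d \times d}$ the Cauchy tensor, by $\varepsilon_\mu = \nabla_{\rm s} u_\mu$ the strain tensor with $\nabla_{\rm s} \bullet = \frac{1}{2}(\nabla \bullet + \nabla \bullet^\top)$; we further introduce the vector of internal variables $\gamma_\mu \in \mathbb{R}^{d_{\rm int}}$. Then, we introduce the quasi-static equilibrium equations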
Algorithm 4 Generation of the ROM
Inputs: $Z = [\zeta_1, \ldots, \zeta_{n-1}]$ current ROB, tol $> 0$ tolerance for Algorithm 3, $u^{\rm hf}_{\mu^n}$ new snapshot, $m$ size of the test space.
Outputs: $Z = [\zeta_1, \ldots, \zeta_n]$ new ROB, ROM for the generalized coordinates.
1: Apply Gram-Schmidt orthogonalization to define the new ROB $Z$.
2: Assemble the test snapshot set $\mathcal{S}_n^{\rm test}$ using (13).
3: Apply POD to find $\widehat{\mathcal{Y}}$: $\{\psi_i\}_{i=1}^{m} = {\rm POD}\big(\mathcal{S}_n^{\rm test}, m, ((\cdot,\cdot))\big)$.
4: Assemble the matrix and vector $\mathbf{G}$, $\mathbf{b}$ (cf. (17)).
5: Solve the NNLS problem (cf. Algorithm 3).
6: Update the ROM data structures.
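(completed with suitable boundary and initial conditions)
\[ \begin{cases} -\nabla \cdot \sigma_\mu = f & \text{in } \Omega \\ \sigma_\mu = \mathcal{F}^\sigma_\mu(\varepsilon_\mu, \gamma_\mu) & \text{in } \Omega \\ \dot{\gamma}_\mu = \mathcal{F}^\gamma_\mu(\varepsilon_\mu, \gamma_\mu) & \text{in } \Omega \end{cases} \tag{20} \]
where $\mathcal{F}^\sigma_\mu : \mathbb{R}^{d \times d} \times \mathbb{R}^{d_{\rm int}} \to \mathbb{R}^{d \times d}$ and $\mathcal{F}^\gamma_\mu : \mathbb{R}^{d \times d} \times \mathbb{R}^{d_{\rm int}} \to \mathbb{R}^{d_{\rm int}}$ are suitable parametric functions that encode the material constitutive law. Note that Newton's law (20)$_1$ does not include the inertial term; the temporal evolution is hence entirely driven by the constitutive law (20)$_3$. Equation (20) is discretized using the FE method in space and a one-step finite difference (FD) method in time. Given the parameter $\mu \in \mathcal{P}$, the FE space $\mathcal{X}_{\rm hf}$ and the time grid $\{t^{(k)}\}_{k=0}^{K}$, we seek the sequence $\{u_\mu^{(k)}\}_{k=0}^{K}$ such that
\[ \mathcal{R}^{(k)}_\mu\big(u^{(k)}_\mu, v\big) = \sum_{\ell=1}^{N_{\rm e}} r^{(k),{\rm e}}_{\ell,\mu}\big(u^{(k)}_\mu, v\big) + \sum_{j \in {\rm I}_{\rm bnd}} r^{(k),{\rm f}}_{j,\mu}\big(u^{(k)}_\mu, v\big) = 0 \quad \forall\, v \in \mathcal{X}_{\rm hf}, \tag{21} \]
where the elemental residuals satisfy
\[ r^{(k),{\rm e}}_{\ell,\mu}(w, v) = \int_{D_\ell} \Big( \mathcal{F}^\sigma_\mu\big(\nabla_{\rm s} w, \gamma^{(k)}_\mu(w)\big) : \nabla_{\rm s} v - f \cdot v \Big)\, dx, \quad \ell = 1, \ldots, N_{\rm e}, \quad \text{with } \gamma^{(k)}_\mu(w) = \mathcal{F}^\gamma_{\mu,\Delta t}\big(\nabla_{\rm s} w, \varepsilon^{(k-1)}_\mu, \gamma^{(k-1)}_\mu\big), \tag{22} \]
${\rm I}_{\rm bnd} \subset \{1, \ldots, N_{\rm f}\}$ are the indices of the boundary facets and the facet residuals $\{r^{(k),{\rm f}}_{j,\mu}\}_{j \in {\rm I}_{\rm bnd}}$ incorporate boundary conditions. Note that $\mathcal{F}^\gamma_{\mu,\Delta t}$ is the FD approximation of the constitutive law (20)$_3$.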
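Given the reduced space $\mathcal{Z} = {\rm span}\{\zeta_i\}_{i=1}^{n} \subset \mathcal{X}_{\rm hf}$, the time-marching hyper-reduced Galerkin ROM of (21)–(22) reads as: given $\mu \in \mathcal{P}$, find $\{\widehat{u}^{(k)}_\mu\}_{k=0}^{K}$ such that
\[ \mathcal{R}^{(k),{\rm eq}}_\mu\big(\widehat{u}^{(k)}_\mu, v\big) = \sum_{\ell=1}^{N_{\rm e}} \rho^{\rm eq,e}_\ell\, r^{(k),{\rm e}}_{\ell,\mu}\big(\widehat{u}^{(k)}_\mu, v\big) + \sum_{j \in {\rm I}_{\rm bnd}} \rho^{\rm eq,f}_j\, r^{(k),{\rm f}}_{j,\mu}\big(\widehat{u}^{(k)}_\mu, v\big) = 0, \quad \forall\, v \in \mathcal{Z}, \tag{23} \]
for $k = 1, \ldots, K$, where $\rho^{\rm eq,e} \in \mathbb{R}_+^{N_{\rm e}}$ and $\rho^{\rm eq,f} \in \mathbb{R}_+^{N_{\rm f}}$ are suitably-chosen sparse vectors of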
weights. We observe that the Galerkin ROM (23) does not reduce the total number of time
steps: solution to (23) hence requires the solution to a sequence of K nonlinear problems
of size n and is thus likely much more expensive than the solution to (Petrov-) Galerkin
ROMs for steady-state problems. Furthermore, several independent studies have shown
that residual-based a posteriori error estimators for (23) are typically much less sharp
and reliable than their counterparts for steady-state problems. These observations have
motivated the development of space-time formulations [24]: to our knowledge, however,
space-time methods have not been extended to problems with internal variables.
The abstract Algorithm 1 can be readily extended to time-marching ROMs for unsteady
PDEs: the POD-Greedy method [7] combines a greedy search in the parameter domain
with a temporal compression based on POD. At each iteration it of the algorithm, we update the reduced space $\mathcal{Z}$ using the newly-computed trajectory $\{u^{(k)}_{\mu^{\star,{\rm it}}}\}_{k=1}^{K}$ and we update the quadrature rule. The reduced space $\mathcal{Z}$ should be updated using hierarchical methods that avoid the storage of the full space-time trajectory for all sampled parameters: the hierarchical approximate POD (HAPOD, see [11] and also “Empirical test space”) or the hierarchical approach in [25, section 3.5] that generates nested spaces: we refer to [26, section 3.2.1] for further details and extensive numerical investigations for a model problem with internal variables. Here, we consider the nested approach of [25]: we denote by $n_{\rm it}$ the dimension of the reduced space $\mathcal{Z}_{\rm it}$ at the it-th iteration, and we denote by $n^{\rm new} = n_{\rm it} - n_{{\rm it}-1}$ the number of modes added at each iteration,
\[ \mathcal{Z}_{\rm it} = \mathcal{Z}_{{\rm it}-1} \oplus \mathcal{Z}^{\rm new}, \quad \text{where} \quad \mathcal{Z}^{\rm new} = {\rm POD}\Big( \big\{\Pi_{\mathcal{Z}_{{\rm it}-1}^\perp} u^{(k)}_{\mu^{\star,{\rm it}}}\big\}_{k=1}^{K},\, n^{\rm new}({\rm tol}),\, (\cdot,\cdot) \Big), \tag{24a} \]
where $n^{\rm new}({\rm tol})$ satisfies
\[ n^{\rm new}({\rm tol}) = \min\left\{ n' : \frac{\sum_{k=1}^{K} \Delta t^{(k)} \big\| u^{(k)}_{\mu^{\star,{\rm it}}} - \Pi_{\mathcal{Z}_{{\rm it}-1} \oplus \mathcal{Z}^{\rm new}_{n'}} u^{(k)}_{\mu^{\star,{\rm it}}} \big\|^2}{\sum_{k=1}^{K} \Delta t^{(k)} \big\| u^{(k)}_{\mu^{\star,{\rm it}}} \big\|^2} \le {\rm tol}^2, \;\; \mathcal{Z}^{\rm new}_{n'} = {\rm span}\{\zeta^{\rm new}_{{\rm it},i}\}_{i=1}^{n'} \right\}, \tag{24b} \]
for some tolerance tol $> 0$, with $\Delta t^{(k)} = t^{(k)} - t^{(k-1)}$, $k = 1, \ldots, K$. The quadrature rule is obtained using the same procedure described in “Empirical quadrature”: exploiting (24), it is easy to verify that the matrix $\mathbf{G}$ (cf. (17c)) can be rewritten as (we omit the details)
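\[ \mathbf{G}^{({\rm it})} = \begin{bmatrix} \begin{matrix} \mathbf{G}^{\rm cnst,e} & 0 \\ 0 & \mathbf{G}^{\rm cnst,f} \end{matrix} \\ \mathbf{G}^{({\rm it}-1)}_{\rm acc} \\ \mathbf{G}^{({\rm it})}_{\rm new} \end{bmatrix}, \quad \text{with} \quad \begin{cases} \mathbf{G}^{({\rm it}-1)}_{\rm acc} \in \mathbb{R}^{(n_{{\rm it}-1} ({\rm it}-1) K) \times (N_{\rm e} + N_{\rm f})}, \\ \mathbf{G}^{({\rm it})}_{\rm new} \in \mathbb{R}^{(n^{\rm new} ({\rm it}-1) K + n_{\rm it} K) \times (N_{\rm e} + N_{\rm f})}. \end{cases} \tag{25} \]
Note that $\mathbf{G}^{({\rm it})}_{\rm acc}$ corresponds to the rows of the matrix $\mathbf{G}$ associated with the manifold accuracy constraints (cf. (16)) at the it-th iteration of the greedy procedure.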
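The nested update (24) can be sketched in NumPy as follows. The sketch assumes the Euclidean inner product and an orthonormal current basis; the function name `nested_pod_update` and its arguments are hypothetical and introduced only for illustration.

```python
import numpy as np

def nested_pod_update(Z, U_new, dt, tol):
    """Augment an orthonormal basis Z (N, n_old) with POD modes of a new trajectory.

    U_new : (N, K) snapshots u^(1), ..., u^(K) of the newly computed trajectory.
    dt    : (K,) time steps Delta t^(k), used as weights in criterion (24b).
    tol   : relative tolerance on the time-integrated projection error.
    Euclidean inner product assumed (sketch only).
    """
    R = U_new - Z @ (Z.T @ U_new)              # projection onto the complement of Z_{it-1}
    W, _, _ = np.linalg.svd(R, full_matrices=False)   # POD modes of the residuals, cf. (24a)
    total = np.sum(dt * np.sum(U_new ** 2, axis=0))
    for n_new in range(W.shape[1] + 1):
        # Error left after adding the first n_new modes, cf. (24b).
        E = R - W[:, :n_new] @ (W[:, :n_new].T @ R)
        err = np.sum(dt * np.sum(E ** 2, axis=0))
        if err <= tol ** 2 * total:
            break
    return np.hstack([Z, W[:, :n_new]]), n_new
```

Because the new modes are computed from residuals that are orthogonal to the current space, the augmented basis stays orthonormal and the previous basis vectors are left untouched, which is what makes the spaces nested.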
Accelerated construction of PROMs

“Progressive construction of empirical test space and quadrature” discusses the incremental strategies proposed in this work to reduce the costs associated with the construction of the empirical test space and the empirical quadrature in Algorithm 4; “Multi-fidelity sampling” summarizes the multi-fidelity sampling strategy, which is designed to reduce the total number of greedy iterations required by Algorithm 1. “Progressive construction of empirical test space and quadrature” and “Multi-fidelity sampling” focus on LSPG ROMs for steady problems; in “Extension to unsteady problems” we discuss the extension to Galerkin ROMs for unsteady problems.
Progressive construction of empirical test space and quadrature

Empirical test space
By reviewing Algorithms 1 and 4, we notice that the test snapshot set $\mathcal{S}_n^{\rm test}$ satisfies
\[ \mathcal{S}_n^{\rm test} = \mathcal{S}_{n-1}^{\rm test} \cup \{\Psi_{k,n}\}_{k=1}^{n-1} \cup \{\Psi_{n,i}\}_{i=1}^{n}; \tag{26} \]
therefore, at each iteration, it suffices to solve $2n - 1$ (as opposed to $n^2$) Riesz problems of the form (13) to define $\mathcal{S}_n^{\rm test}$. As in [16], we rely on Cholesky factorization with fill-in reducing permutations of rows and columns, to reduce the cost of solving the Riesz problems. In the numerical experiments, we rely on (13), which involves the assembly of the Jacobian matrix, to compute $\{\Psi_{n,i}\}_{i=1}^{n}$, while we consider the finite difference approximation (14) to compute $\{\Psi_{k,n}\}_{k=1}^{n-1}$ with $\epsilon = 10^{-6}$.
In order to lower the memory costs associated with the storage of the test snapshot set $\mathcal{S}_n^{\rm test}$ and also the cost of performing POD, we consider a hierarchical approach to construct the test space $\widehat{\mathcal{Y}}$. In this work, we apply the (distributed) hierarchical approximate proper orthogonal decomposition: the HAPOD approach is related to incremental singular value decomposition [27] and guarantees near-optimal performance with respect to the standard POD (cf. [11]). Given the POD space $\{\psi_i\}_{i=1}^{m}$ such that $((\psi_i, \psi_j)) = \delta_{i,j}$ and the corresponding eigenvalues $\{\lambda_i\}_{i=1}^{m}$, and the new set of snapshots $\{\Psi_{k,n}\}_{k=1}^{n-1} \cup \{\Psi_{n,i}\}_{i=1}^{n}$, HAPOD corresponds to applying POD to a suitable snapshot set that combines information from current and previous iterations,
\[ \big[\{\psi_i^{\rm new}\}_{i=1}^{m_{\rm new}}, \{\lambda_i^{\rm new}\}_{i=1}^{m_{\rm new}}\big] = {\rm POD}\big(\mathcal{S}_n^{\rm incr},\, m_{\rm new},\, ((\cdot,\cdot))\big), \tag{27} \]
with $\mathcal{S}_n^{\rm incr} := \{\sqrt{\lambda_i}\, \psi_i\}_{i=1}^{m} \cup \{\Psi_{k,n}\}_{k=1}^{n-1} \cup \{\Psi_{n,i}\}_{i=1}^{n}$. Note that the modes $\{\psi_i\}_{i=1}^{m}$ are scaled to properly take into account the energy content of the modes in the subsequent iterations. Note also that the storage cost of the method scales with $m + 2n - 1$, which is much lower than $n^2$, provided that $m \ll n^2$. Note also that the POD spaces $\{\widehat{\mathcal{Y}}_n\}_n$, which are generated at each iteration of Algorithm 1 using (27), are not nested.
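One step of the incremental update (27) can be sketched as follows, assuming the Euclidean inner product in place of $((\cdot,\cdot))$. The helper names `pod` and `incremental_pod` are hypothetical. The example also shows why the modes are scaled by $\sqrt{\lambda_i}$: when the previous compression retains all modes, the incremental eigenvalues coincide with those of a POD of the full snapshot set.

```python
import numpy as np

def pod(S, m):
    """POD with the Euclidean inner product: m leading modes and eigenvalues."""
    W, s, _ = np.linalg.svd(S, full_matrices=False)
    return W[:, :m], s[:m] ** 2

def incremental_pod(Psi_old, lam_old, S_new, m_new):
    """One step of (27): compress the scaled old modes together with new snapshots."""
    S_incr = np.hstack([Psi_old * np.sqrt(lam_old), S_new])
    return pod(S_incr, m_new)

rng = np.random.default_rng(1)
S1 = rng.standard_normal((20, 4))     # snapshots from previous iterations
S2 = rng.standard_normal((20, 3))     # snapshots added at the current iteration
Psi, lam = pod(S1, 4)                 # previous compression (no truncation here)
Psi_incr, lam_incr = incremental_pod(Psi, lam, S2, 5)
_, lam_full = pod(np.hstack([S1, S2]), 5)
```

When the previous step truncates, the two sets of eigenvalues differ by an amount controlled by the discarded energy, which is the near-optimality property discussed in [11].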
Empirical quadrature
Exploiting (17), it is straightforward to verify that if the test spaces are nested — that is,
n−1 ⊂ Y
Y n , then the EQ matrix Gn at iteration n satisfies

Gn−1
Gn = , (28)
Gnew
n
where Gnew has k = n × dim(Y n \ Yn−1 ) + dim(Y

n−1 ) rows. We hence observe that we
n
can reduce the cost of assembling the EQ matrix Gn by exploiting (28), provided that we
rely on nested test spaces: since, in our experience, the cost of assembling Gn is negligible,
the use of non-nested test spaces does not hinder offline performance.
On the other hand, due to the strong link between consecutive EQ problems that are solved at each iteration of the greedy procedure, we propose to initialize the NNLS Algorithm 3 using the solution from the previous greedy iteration, that is
$$
P = \left\{ i \in \{1, \ldots, N_e + N_f\} \, : \, \left( \rho^{{\rm eq},(n-1)} \right)_i \neq 0 \right\}. \tag{29}
$$
We provide extensive numerical investigations to assess the effectiveness of this choice.
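To illustrate the warm start (29), the sketch below implements a Lawson-Hanson-type active-set NNLS solver [21] that accepts an initial guess of the set of nonzero weights. It is not a transcription of Algorithm 3: the function name `nnls_active_set`, the relative residual stopping test with tolerance `delta`, and the way an infeasible guess is repaired (indices with non-positive least-squares values are dropped before the main loop starts) are assumptions made for illustration.

```python
import numpy as np


def nnls_active_set(G, b, passive0=None, delta=1e-8, tol=1e-12, max_iter=None):
    """Active-set NNLS: min ||G x - b||_2 subject to x >= 0.

    passive0: optional guess of the indices of the nonzero weights (warm start).
    Returns (x, indices of nonzero weights, number of outer iterations).
    """
    n = G.shape[1]
    max_iter = 3 * n if max_iter is None else max_iter
    passive = np.zeros(n, dtype=bool)
    x = np.zeros(n)

    def ls_on(mask):
        # Unconstrained least squares restricted to the columns in `mask`.
        s = np.zeros(n)
        if mask.any():
            s[mask] = np.linalg.lstsq(G[:, mask], b, rcond=None)[0]
        return s

    if passive0 is not None:
        passive[np.asarray(passive0, dtype=int)] = True
        # Shrink the guessed set until its least-squares solution is positive.
        while passive.any():
            s = ls_on(passive)
            bad = passive & (s <= 0.0)
            if not bad.any():
                x = s
                break
            passive &= ~bad

    n_iter = 0
    while n_iter < max_iter and (~passive).any():
        r = b - G @ x
        if np.linalg.norm(r) <= delta * np.linalg.norm(b):
            break                                  # residual small enough
        w = G.T @ r                                # dual variables
        j = int(np.argmax(np.where(passive, -np.inf, w)))
        if w[j] <= tol:
            break                                  # optimality conditions hold
        n_iter += 1
        passive[j] = True
        s = ls_on(passive)
        while passive.any() and s[passive].min() <= 0.0:
            # Step back towards the last feasible point and drop zeroed indices.
            neg = passive & (s <= 0.0)
            den = x[neg] - s[neg]
            alpha = np.min(x[neg] / np.where(den > 0.0, den, 1.0))
            x = x + alpha * (s - x)
            passive &= x > tol
            x[~passive] = 0.0
            s = ls_on(passive)
        x = s
    return x, np.flatnonzero(passive), n_iter
```

In a greedy loop, the indices returned at iteration $n-1$ would be passed as `passive0` at iteration $n$; if the new least-squares problem is close to the previous one, most of the guessed indices survive and only a few outer iterations remain.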
Summary of the incremental weak-greedy algorithm

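Algorithm 5 summarizes the full incremental generation of the ROM. In the numerical experiments, we set $m = 2n$: at iteration $n$, the set $S_n^{\rm incr}$ then contains $2(n-1)$ scaled modes from the previous iteration and $2n - 1$ new snapshots, so that its storage scales with $4n - 3$ as opposed to $n^2$.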
Algorithm 5 Incremental generation of the ROM
Inputs: $Z = [\zeta_1, \ldots, \zeta_{n-1}]$ current ROB, $\delta > 0$ tolerance for Algorithm 3, $u^{\rm hf}_{\mu_n}$ new snapshot, $\{(\psi_i^{\rm old}, \lambda_i^{\rm old})\}_{i=1}^{m}$ POD eigenpairs for test space, $m$ size of the test space, $P^{(n-1)}$ initial condition for Algorithm 3.
Outputs: $Z = [\zeta_1, \ldots, \zeta_n]$ new ROB, ROM for the generalized coordinates, $\{(\psi_i, \lambda_i)\}_{i=1}^{m}$ new POD eigenpairs for test space, $P^{(n)}$ initial condition for Algorithm 3 at subsequent iteration.
1: Apply Gram-Schmidt orthogonalization to define the new ROB $Z$.
2: Assemble the test snapshot set $S_n^{\rm incr}$ in (27).
3: Apply POD to find $\widehat{\mathcal{Y}}$, $[\{\psi_i\}_{i=1}^{m}, \{\lambda_i\}_{i=1}^{m}] = {\rm POD}\left( S_n^{\rm incr}, m, ((\cdot,\cdot)) \right)$.
4: Assemble the matrix and vector $G$, $b$ (cf. (17)).
5: Solve the NNLS problem (cf. Algorithm 3) with initial condition given by $P^{(n-1)}$.
6: Update the ROM data structures and compute $P^{(n)}$ using (29).
Multi-fidelity sampling
The incremental strategies of “Progressive construction of empirical test space and
quadrature” do not affect the total number of greedy iterations required by Algorithm 1
to achieve the desired accuracy; furthermore, they do not address the reduction of the
cost of the HF solves. Following [16], we here propose to resort to coarser simulations
to learn the initial training set $\mathcal{P}_\star$ (cf. Line 1 of Algorithm 1) and to initialize the HF solver.
Algorithm 6 summarizes the computational procedure.
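Algorithm 6 Multi-fidelity weak-greedy algorithm
1: Define two meshes $\mathcal{T}_{\rm hf}^0$ and $\mathcal{T}_{\rm hf}$, where $\mathcal{T}_{\rm hf}^0$ is coarser than $\mathcal{T}_{\rm hf}$.
2: Generate a ROM for the HF model associated with the coarse mesh $\mathcal{T}_{\rm hf}^0$ (e.g. using Algorithm 1), $Z^0$, $\widehat{\alpha}^0 : \mathcal{P} \to \mathbb{R}^n$.
3: Estimate the solution for all $\mu \in \mathcal{P}_{\rm train}$ and store the generalized coordinates $\{\widehat{\alpha}^0_\mu : \mu \in \mathcal{P}_{\rm train}\}$.
4: Apply Algorithm 2 to $\{\widehat{\alpha}^0_\mu : \mu \in \mathcal{P}_{\rm train}\}$ to obtain the initial sample $\mathcal{P}_\star$.
5: Apply Algorithm 1 to the HF model associated with the fine mesh $\mathcal{T}_{\rm hf}$; use $\mathcal{P}_\star$ to initialize the greedy method, and use the estimate $\mu \mapsto Z^0 \widehat{\alpha}^0_\mu$ to initialize the HF solver.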
Some comments are in order.
• In the numerical experiments, we choose the cardinality $n_0$ of $\mathcal{P}_\star$ according to (8) with $tol_{\rm sg} = tol$ (cf. Algorithm 1). Note that, since the strong greedy algorithm is applied to the generalized coordinates of the coarse ROM, $n_0$ cannot exceed the size $n$ of the ROM.
• We observe that increasing the cardinality of $\mathcal{P}_\star$ ultimately leads to a reduction of the number of sequential greedy iterations; it hence enables much more effective parallelization of the offline stage. Note also that Algorithm 6 can be coupled with parametric mesh adaptation tools to build an effective problem-aware mesh $\mathcal{T}_{\rm hf}$ (cf. [16]); in this work, we do not exploit this feature of the method.
• We observe that the multi-fidelity Algorithm 6 critically depends on the choice of the coarse grid $\mathcal{T}_{\rm hf}^0$: an excessively coarse mesh $\mathcal{T}_{\rm hf}^0$ might undermine the quality of the initial sample $\mathcal{P}_\star$ and of the initial condition for the HF solve, while an excessively fine mesh $\mathcal{T}_{\rm hf}^0$ reduces the computational gain. In the numerical experiments, we provide extensive investigations of the influence of the coarse mesh on performance.
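Step 4 of Algorithm 6 applies a strong-greedy selection to the coarse generalized coordinates. The sketch below shows one possible realization; it is not a transcription of Algorithm 2, and the function name `strong_greedy`, the Euclidean norm, and the relative stopping test (a stand-in for criterion (8), which is not reproduced here) are assumptions.

```python
import numpy as np


def strong_greedy(A, tol=1e-3, max_it=None):
    """Greedy column selection on the coarse generalized coordinates.

    A: (n, n_train) array, one column of coordinates per training parameter.
    Returns the indices of the selected parameters, in order of selection.
    """
    n, n_train = A.shape
    max_it = min(n, n_train) if max_it is None else max_it
    R = np.array(A, dtype=float)               # projection residuals
    ref = np.linalg.norm(R, axis=0).max()      # scale for the relative test
    selected = []
    for _ in range(max_it):
        err = np.linalg.norm(R, axis=0)
        i = int(np.argmax(err))                # worst-approximated parameter
        if err[i] <= tol * ref:
            break
        q = R[:, i] / err[i]                   # new orthonormal direction
        R -= np.outer(q, q @ R)                # deflate all residuals
        selected.append(i)
    return selected
```

Since the columns live in $\mathbb{R}^n$, the loop cannot select more than $n$ parameters, which is consistent with the remark that $n_0$ cannot exceed the size of the coarse ROM.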
Extension to unsteady problems

Acceleration of the POD-greedy algorithm for the Galerkin ROM (23) relies on two independent building blocks: first, the incremental construction of the quadrature rule; second, a two-fidelity sampling strategy. Regarding the quadrature procedure, we notice that at each greedy iteration, the matrix $G$ in (25) admits the decomposition in (28). If the number of new modes is modest compared to $n_{{\rm it}-1}$, the rows of $G_{\rm new}^{({\rm it})}$ are significantly less numerous than the columns of $G^{({\rm it})}$: we can hence reduce the cost of the construction of $G^{({\rm it})}$ by keeping in memory the EQ matrix from the previous iteration; furthermore, the decomposition (28) motivates the use of the active set of weights from the previous iterations to initialize the NNLS algorithm at the current iteration. On the other hand, the extension of the two-fidelity sampling strategy in Algorithm 6 simply relies on the generalization of the strong-greedy procedure (Algorithm 2) to unsteady problems, which is illustrated in the algorithm below.
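Algorithm 7 POD-strong-greedy algorithm
Inputs: $\{ \{u_i^{(k)}\}_{k=1}^{K} \}_{i=1}^{n_{\rm train}}$ snapshot set (one trajectory per training parameter), maxit maximum number of iterations, tol tolerance for data compression, $(\cdot,\cdot)_{\rm sg}$ inner product.
Outputs: $I_{\rm sg}$ indices of selected snapshots.
1: Choose $Z_0 = \emptyset$, $I_{\rm sg} = \emptyset$, set $I_{\rm train} = \{1, \ldots, n_{\rm train}\}$.
2: for it $= 1, \ldots,$ maxit do
3:   Compute $i^\star = \arg\max_{i \in I_{\rm train}} \sum_{k=1}^{K} \min_{w \in Z_{{\rm it}-1}} \| w - u_i^{(k)} \|_{\rm sg}^2$.
4:   Update the reduced space $Z_{\rm it} = \text{data-compression}\left( Z_{{\rm it}-1}, \{u_{i^\star}^{(k)}\}_{k=1}^{K}, tol \right)$ (cf. (24)).
5:   $I_{\rm sg} = I_{\rm sg} \cup \{i^\star\}$.
6: end for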
Algorithm 7 takes as input a set of trajectories and returns the indices of the selected
parameters. To shorten the notation, we here assume that the time grid is the same for
all parameters: however, the algorithm can be readily extended to cope with parameter-
dependent temporal discretizations. We further observe that the procedure depends on
the data compression strategy employed to update the reduced space at each greedy
iteration: in this work, we consider the same hierarchical strategy (cf. [25, section 3.5])
employed in the POD-(weak-)greedy algorithm.
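The structure of Algorithm 7 can be sketched as follows. The sketch makes several assumptions that are not taken from the text: the Euclidean inner product replaces $(\cdot,\cdot)_{\rm sg}$, the data-compression step is realized as a POD of the projection residual of the selected trajectory with a relative energy criterion (one common form of hierarchical POD-greedy update, which may differ from (24)), and the name `pod_strong_greedy` is hypothetical.

```python
import numpy as np


def pod_strong_greedy(trajs, max_it, tol=1e-5):
    """POD-strong-greedy selection over a set of trajectories.

    trajs: array of shape (n_train, N, K), one trajectory per parameter.
    Returns the selected indices and an orthonormal basis Z of shape (N, n).
    """
    n_train, N, K = trajs.shape
    Z = np.zeros((N, 0))
    selected = []
    for _ in range(max_it):
        # Residual of every snapshot after projection onto span(Z).
        res = trajs - Z @ (Z.T @ trajs)
        err = np.sum(res ** 2, axis=(1, 2))      # summed over space and time
        i_star = int(np.argmax(err))
        if err[i_star] == 0.0:
            break                                 # all trajectories reproduced
        # Data compression: POD of the residual of the selected trajectory.
        U, s, _ = np.linalg.svd(res[i_star], full_matrices=False)
        energy = np.cumsum(s ** 2)
        n_new = int(np.searchsorted(energy, (1.0 - tol ** 2) * energy[-1])) + 1
        Z = np.hstack([Z, U[:, :n_new]])
        selected.append(i_star)
    return selected, Z
```

Because the new modes are extracted from the residual, they are orthogonal to the current basis, so `Z` remains orthonormal without an explicit re-orthogonalization step.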
Numerical results
We present numerical results for a steady compressible inviscid flow past a LS89 blade
(cf. “Transonic compressible flow past an LS89 blade”), and for an unsteady nonlinear
mechanics problem that simulates the long-time mechanical response of the standard
section of a containment building under external loading (cf. “Long-time mechanical
response of the standard section of a containment building under external loading”).
Simulations of “Transonic compressible flow past an LS89 blade” are performed in Matlab 2022a [28] based on an in-house code, and executed on a commodity Linux workstation (RAM 32 GB, Intel i7 CPU 3.20 GHz x 12). The HF simulations of “Long-time mechanical response of the standard section of a containment building under external loading” are performed using the FE software code_aster [29] and executed on a commodity Linux workstation (RAM 32 GB, Intel i7-9850H CPU 2.60 GHz x 12); on the other hand, the MOR procedure relies on an in-house Python code and is executed on a Windows workstation (RAM 16 GB, Intel i7-9750H CPU 2.60 GHz x 12).
Transonic compressible flow past an LS89 blade

Model problem
We consider the problem of estimating the solution to the two-dimensional Euler equa-
tions past an array of LS89 turbine blades; the same model problem is considered in [30],
for a different parameter range. We consider the computational domain depicted in Fig. 1a;
we prescribe total temperature, total pressure and flow direction at the inflow, static pres-
sure at the outflow, non-penetration (wall) condition on the blade and periodic boundary
conditions on the lower and upper boundaries. We study the sensitivity of the solution with respect to two parameters: the free-stream Mach number $\mathrm{Ma}_\infty$ and the height of the channel $H$, $\mu = [H, \mathrm{Ma}_\infty]$. We consider the parameter domain $\mathcal{P} = [0.9, 1.1] \times [0.2, 0.9]$.
Fig. 1 Inviscid flow past an array of LS89 turbine blades. a Partition associated with the geometric map. b, c Behavior of the Mach number for two parameter values
We deal with geometry variations through a piecewise-smooth mapping associated with the partition in Fig. 1a. We set $H_{\rm ref} = 1$ and we define the curve $x_1 \mapsto f_{\rm btm}(x_1)$ that describes the lower boundary $\Gamma_{\rm btm}$ of the domain $\Omega = \Omega(H = 1)$; then, we define $\bar{H} > 0$ such that $x_1 \mapsto f_{\rm btm}(x_1) + \bar{H}$ and $x_1 \mapsto f_{\rm btm}(x_1) + H - \bar{H}$ do not intersect the blade for any $H \in [0.9, 1.1]$; finally, we define the geometric mapping
$$
\Phi_H^{\rm geo}(x = [x_1, x_2]) = \begin{bmatrix} x_1 \\ \psi_H^{\rm geo}(x) \end{bmatrix}, \tag{30a}
$$
where
$$
\psi_H^{\rm geo}(x) = \begin{cases}
o_1(x_1) + C(H) \left( x_2 - o_1(x_1) \right) & x_2 < o_1(x_1), \\
o_2(x_1) + C(H) \left( x_2 - o_2(x_1) \right) & x_2 > o_2(x_1), \\
x_2 & \text{otherwise},
\end{cases} \tag{30b}
$$
with $o_1(x_1) = f_{\rm btm}(x_1) + \bar{H}$, $o_2(x_1) = f_{\rm btm}(x_1) + H_{\rm ref} - \bar{H}$ and $C(H) = \frac{H - H_{\rm ref}}{2 \bar{H}} + 1$.
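A short numerical check helps to see how the mapping (30) changes the channel height from $H_{\rm ref}$ to $H$ while leaving the central strip around the blade untouched. In the sketch below, the function name `geo_map`, the sinusoidal lower boundary `f_btm` and the value `H_bar = 0.2` are placeholders chosen for illustration; they are not the quantities used in the paper.

```python
import numpy as np

H_REF = 1.0  # reference channel height


def geo_map(x1, x2, H, f_btm, H_bar):
    """Piecewise-linear (in x2) stretching of the lower and upper strips."""
    o1 = f_btm(x1) + H_bar                 # top of the lower strip
    o2 = f_btm(x1) + H_REF - H_bar         # bottom of the upper strip
    C = (H - H_REF) / (2.0 * H_bar) + 1.0  # stretching factor of each strip
    y2 = np.where(x2 < o1, o1 + C * (x2 - o1),
                  np.where(x2 > o2, o2 + C * (x2 - o2), x2))
    return x1, y2


# Hypothetical lower boundary, used only to exercise the map.
f_btm = lambda x: 0.1 * np.sin(x)
x1 = np.linspace(0.0, 2.0, 5)
_, y_bottom = geo_map(x1, f_btm(x1), 1.1, f_btm, 0.2)
_, y_top = geo_map(x1, f_btm(x1) + H_REF, 1.1, f_btm, 0.2)
mapped_height = y_top - y_bottom
```

Each strip of thickness $\bar{H}$ is stretched by $C(H)$, so the total height becomes $H_{\rm ref} - 2\bar{H} + 2\bar{H}\, C(H) = H$; the array `mapped_height` can be compared against the chosen value of $H$ to confirm this.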
Fig. 1b, c show the distribution of the Mach field for μ(1) = [0.95, 0.78] and μ(2) =
[1.05, 0.88]. We notice that for large values of the free-stream Mach number the solution
develops a normal shock on the upper side of the blade and two shocks at the trailing
edge: effective approximation for higher values of the Mach number requires the use of
nonlinear approximations and is beyond the scope of the present work.
Fig. 2 Compressible flow past a LS89 blade. a–c Three computational meshes. d Behavior of the average and
maximum error (31)
Results
We consider a hierarchy of six P2 meshes with $N_e = 1827, 2591, 3304, 4249, 7467, 16353$ elements, respectively; Fig. 2a–c show three computational meshes. Figure 2d shows maximum and mean errors between the HF solution $u_\mu^{{\rm hf},(6)}$ and the corresponding HF solution associated with the $i$-th mesh,
$$
E_\mu^{(i)} = \frac{\| u_\mu^{{\rm hf},(i)} - u_\mu^{{\rm hf},(6)} \|}{\| u_\mu^{{\rm hf},(6)} \|}, \tag{31}
$$
over five randomly-chosen parameters in $\mathcal{P}$. We remark that the error is measured in the reference configuration $\Omega$, which corresponds to $H = 1$. We consider the $L^2(\Omega)$ norm $\| \cdot \| = \sqrt{\int_\Omega (\cdot)^2 \, dx}$.
Figure 3 compares the performance of the standard (“std”) greedy method with the per-
formance of the incremental (“incr”) procedures of “Progressive construction of empirical
test space and quadrature” for the coarse mesh (mesh 1). In all the tests below, we consider
a training space Ptrain based on a ten by ten equispaced discretization of the parameter
domain. Figure 3a shows the number of iterations of the NNLS algorithm 3, while Fig. 3b shows the wall-clock cost (in seconds) on a commodity laptop, and Fig. 3c shows the percentage of sampled weights $\frac{{\rm nnz}(\rho^{\rm eq})}{N_e + N_f} \times 100\%$. We observe that the proposed initialization of the active set method leads to a significant reduction of the total number of iterations, and to a non-negligible reduction of the total wall-clock cost (see footnote 1), without affecting the performance of the method. Figure 3d shows the cost of constructing the test space: we notice that the progressive construction of the test space enables a significant reduction of offline costs. Finally, Fig. 3e shows the averaged out-of-sample performance of the LSPG ROM over $n_{\rm test} = 20$ randomly-chosen parameters $\mathcal{P}_{\rm test}$,
$$
E_{\rm avg} = \frac{1}{n_{\rm test}} \sum_{\mu \in \mathcal{P}_{\rm test}} \frac{\| u_\mu^{{\rm hf},(1)} - \widehat{u}_\mu \|}{\| u_\mu^{{\rm hf},(1)} \|},
$$
and Fig. 3f shows the online costs: we observe that the standard and the incremental approaches lead to nearly-equivalent results for all values of the ROB size $n$ considered.
Footnote 1: The computational cost reduction is not as significant as the reduction in the total number of iterations, because the cost per iteration depends on the size of the least-square problem to be solved (cf. Line 11, Algorithm 3), which increases as we increase the cardinality of the active set.
Fig. 3 Compressible flow past a LS89 blade. Progressive construction of quadrature rule and test space, coarse mesh ($N_e = 1827$)
Figure 4 replicates the tests of Fig. 3 for the fine mesh (mesh 6). As for the coarser grid, we find that the progressive construction of the test space and of the quadrature rule does not hinder online performance of the LSPG ROM and ensures significant offline savings. We notice that the relative error over the test set is significantly larger: reduction of the mesh size leads to more accurate approximations of sharp features (shocks, wakes) that are troublesome for linear approximations. On the other hand, we notice that online costs are nearly the same as for the coarse mesh for the corresponding value of the ROB size $n$: this confirms the effectiveness of the hyper-reduction procedure.
Figure 5 investigates the effectiveness of the sampling strategy based on the strong-greedy algorithm. We rely on Algorithm 6 to identify the training set of parameters $\mathcal{P}_\star$ for different choices of the coarse mesh $\mathcal{T}_{\rm hf}^0 = \mathcal{T}_{\rm hf}^{(i)}$, for $i = 1, \ldots, 6$, and for $\mathcal{T}_{\rm hf} = \mathcal{T}_{\rm hf}^{(6)}$. Then, we measure performance in terms of the maximum relative projection error over $\mathcal{P}_{\rm train}$,
$$
E_n^{{\rm proj},(i)} = \max_{\mu \in \mathcal{P}_{\rm train}} \; \inf_{\zeta \in Z_n^{(i)}} \frac{\| u_\mu^{{\rm hf},(6)} - \zeta \|}{\| u_\mu^{{\rm hf},(6)} \|}, \quad \text{with} \;\; Z_n^{(i)} = {\rm span}\left\{ u_\mu^{{\rm hf},(6)} : \mu \in \mathcal{P}_{\star,n}^{(i)} \right\}, \tag{32}
$$
where $\mathcal{P}_{\star,n}^{(i)}$ is the set of the first $n$ parameters selected through Algorithm 6 based on the coarse mesh $\mathcal{T}_{\rm hf}^{(i)}$. Figure 5a shows the behavior of the projection error $E_n^{{\rm proj},(i)}$ (32) for three different choices of the coarse mesh; to provide a concrete reference, we also report the performance of twenty sequences of reduced spaces obtained by randomly selecting sequences of parameters in $\mathcal{P}_{\rm train}$. Figure 5b, c show the parameters selected through Algorithm 6 for two different choices of the coarse mesh: we observe that the selected parameters are clustered in the proximity of $\mathrm{Ma}_\infty = 0.9$.
Fig. 4 Compressible flow past a LS89 blade. Progressive construction of quadrature rule and test space; fine mesh ($N_e = 16353$)
Fig. 5 Compressible flow past a LS89 blade; sampling. a Behavior of $E_n^{{\rm proj},(i)}$ (32) for six different choices of the coarse mesh, and for random samples. b, c Parameters $\{\mu^{\star,j}\}_j$ selected by Algorithm 6 for two different coarse meshes
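The indicator (32) is straightforward to evaluate once the fine-mesh snapshots are available. The following sketch assumes that the snapshots are stored column-wise in a matrix and, for simplicity, uses the Euclidean norm in place of the $L^2(\Omega)$ norm; the function name `max_rel_projection_error` is hypothetical.

```python
import numpy as np


def max_rel_projection_error(U_train, selected, n):
    """Maximum relative projection error over the training snapshots.

    U_train:  (N, n_train) matrix of fine-mesh snapshots, one per parameter.
    selected: ordered indices of the sampled parameters.
    n:        number of sampled parameters that span the reduced space.
    """
    # Orthonormal basis of the span of the first n selected snapshots.
    Zn, _ = np.linalg.qr(U_train[:, selected[:n]])
    residual = U_train - Zn @ (Zn.T @ U_train)
    return np.max(np.linalg.norm(residual, axis=0)
                  / np.linalg.norm(U_train, axis=0))
```

Evaluating the function for $n = 1, 2, \ldots$ with index sequences produced from different coarse meshes, or from random draws, gives curves of the kind compared in Fig. 5a; since the spaces are nested, each curve is non-increasing in $n$.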
Table 1 compares the costs of the standard weak-greedy algorithm (“vanilla”), the weak-greedy algorithm with progressive construction of the test space and the quadrature rule (“incr”), and the two-fidelity Algorithm 6 with coarse mesh given by $\mathcal{T}_{\rm hf}^0 = \mathcal{T}_{\rm hf}^{(1)}$ (“incr+MF”). To ensure a fair comparison, we impose that the final ROM has the same number of modes (twenty) for all cases. Training of the coarse ROM in Algorithm 6 is based on the weak-greedy algorithm with progressive construction of the test space and the quadrature rule, with tolerance $tol = 10^{-3}$: this leads to a coarse ROM with $n_0 = 14$ modes, which corresponds to an initial training set $\mathcal{P}_\star$ of cardinality 14 in the greedy method (cf. Line 5, Algorithm 6). The ROMs associated with three different training strategies show comparable performance in terms of online cost and $L^2$ errors.
Table 1 Compressible flow past a LS89 blade. Overview of offline costs for three computational strategies and two different fine meshes (coarse mesh: $\mathcal{T}_{\rm hf}^{(1)}$)
              ROB    ES        EQP      Greedy search   HF solves   Overhead   Total
Fine mesh $\mathcal{T}_{\rm hf}^{(5)}$
Vanilla       1.1    714.5     247.4    930.6           3554.8      0.6        5449.0
incr          1.1    177.9     221.2    928.4           3504.8      0.6        4834.0
incr+MF       0.9    131.0     121.9    362.7           2030.0      692.1      3338.5
Fine mesh $\mathcal{T}_{\rm hf}^{(6)}$
Vanilla       3.4    2067.8    668.4    2091.2          10,997.4    2.3        15,830.5
incr          3.2    444.8     538.4    1952.8          10,529.1    2.3        13,470.6
incr+MF       3.6    366.4     362.5    905.8           9597.7      699.9      11,935.9
For the fine mesh $\mathcal{T}_{\rm hf}^{(6)}$, the two-fidelity training leads to a reduction of offline costs of roughly 25% with respect to the vanilla implementation and of roughly 11% with respect to the incremental implementation; for the fine mesh $\mathcal{T}_{\rm hf}^{(5)}$ (which has roughly one half as many elements as the finer grid), the two-fidelity training leads to a reduction of offline costs of roughly 39% with respect to the vanilla implementation and of roughly 31% with respect to the incremental implementation. In particular, we notice that the initialization based on the coarse model, which is the same for both cases, is significantly more effective for the HF model associated with mesh 5 than for the HF model associated with mesh 6. The empirical finding strengthens the observation made in “Multi-fidelity sampling” that the choice of the coarse approximation is a compromise between overhead costs, which increase as we increase the size of the coarse mesh, and accuracy of the coarse solution.
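The percentages quoted above follow directly from the “Total” column of Table 1. The snippet below reproduces the arithmetic; the dictionary layout and the helper name `reduction_percent` are only illustrative.

```python
# Total offline costs copied from the "Total" column of Table 1.
totals = {
    "mesh 5": {"vanilla": 5449.0, "incr": 4834.0, "incr+MF": 3338.5},
    "mesh 6": {"vanilla": 15830.5, "incr": 13470.6, "incr+MF": 11935.9},
}


def reduction_percent(reference, new):
    """Relative cost reduction of `new` with respect to `reference`, in percent."""
    return 100.0 * (1.0 - new / reference)


savings = {
    mesh: {
        "vs vanilla": reduction_percent(t["vanilla"], t["incr+MF"]),
        "vs incr": reduction_percent(t["incr"], t["incr+MF"]),
    }
    for mesh, t in totals.items()
}
```

With these numbers, the two-fidelity strategy saves about 24.6% (vanilla) and 11.4% (incr) on mesh 6, and about 38.7% (vanilla) and 30.9% (incr) on mesh 5.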
We remark that for the first two cases the HF solver is initialized using the solution for the closest parameter in the training set
$$
u^0_{\mu^{\star,n}} = u^{\rm hf}_{\bar{\mu}}, \quad \text{with} \;\; \bar{\mu} = \arg\min_{\mu' \in \{\mu^{\star,i}\}_{i=1}^{n-1}} \| \mu^{\star,n} - \mu' \|_2,
$$
for $n = 2, 3, \ldots$; for $n = 1$, we rely on a coarse solver and on a continuation strategy with respect to the Mach number: this initialization is inherently sequential. On the other hand, in the two-fidelity procedure, the HF solver is initialized using the reduced-order solver that is trained using coarse HF data: this choice enables trivial parallelization of the HF solves for the initial parameter sample $\mathcal{P}_\star$. We hence expect to achieve higher computational savings for parallel computations.
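The nearest-parameter initialization used in the first two strategies amounts to a nearest-neighbor search in parameter space. A minimal sketch follows, with the hypothetical helper `nearest_sampled_parameter` and made-up parameter values; no rescaling of the parameters is applied, which is an assumption.

```python
import numpy as np


def nearest_sampled_parameter(mu_new, mu_sampled):
    """Index of the previously sampled parameter closest to mu_new (2-norm)."""
    mu_sampled = np.atleast_2d(np.asarray(mu_sampled, dtype=float))
    dist = np.linalg.norm(mu_sampled - np.asarray(mu_new, dtype=float), axis=1)
    return int(np.argmin(dist))


# Illustrative values of mu = [H, Ma_inf]; the HF solve for mu_new would start
# from the stored HF solution at mu_sampled[idx].
mu_sampled = [[0.9, 0.2], [1.1, 0.9], [1.0, 0.5]]
idx = nearest_sampled_parameter([1.02, 0.55], mu_sampled)
```

The search depends on the parameters already processed, which is why this initialization is sequential, whereas the ROM-based initial guess of the two-fidelity procedure can be evaluated for all parameters of the initial sample independently.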
Long-time mechanical response of the standard section of a containment building under external loading
Model problem
We study the long-time mechanical response of a three-dimensional standard section of a nuclear power plant containment building: the highly-nonlinear mechanical response is activated by thermal effects; its simulation requires the coupling of thermal, hydraulic and mechanical (THM) responses. A thorough presentation of the mathematical model and of the MOR procedure is provided in [31]. The ultimate goal of the simulation is to predict the temporal behavior of several quantities of interest (QoIs), such as water saturation in concrete, delayed deformations, and stresses: these QoIs are directly related to the leakage rate, whose estimate is of paramount importance for the design of NCBs. The deformation field is also important to conduct validation and calibration studies against real-world data.
Fig. 6 Mechanical response of a NCB under external loading. a Normal force on a horizontal cable. b Tangential and (c) vertical strains on the outer wall of the standard section of the containment building, for $\eta_{\rm dc} = 5 \cdot 10^{9}$ and three values of $\kappa$ on the coarse mesh
Following [32], we consider a weak THM coupling procedure to model deferred defor-
mations within the material; weak coupling is appropriate for large structures under nor-
mal operational loads. The MOR process is exclusively applied to estimate the mechan-
ical response: the results from thermal and hydraulic calculations are indeed used as
input data for the mechanical calculations, which constitute the computational bottle-
neck of the entire simulation workflow. To model the mechanical response of the con-
crete structure, we consider a three-dimensional nonlinear rheological creep model with
internal variables; on the other hand, we consider a one-dimensional linear elastic model
for the prestressing cables: the state variables are hence the displacement field of the
three-dimensional concrete structure and of the one-dimensional steel cables. We assume
that the whole structure satisfies the small-strain and small-displacement hypotheses. To
establish connectivity between concrete and steel nodes, a kinematic linkage is imple-
mented: a point within the steel structure and its corresponding point within the concrete
structure are assumed to share identical displacements.
We study the solution behavior with respect to two parameters: the desiccation creep viscosity ($\eta_{\rm dc}$) and the basic creep consolidation parameter ($\kappa$) in the parameter range $\mu \in \mathcal{P} = [5 \cdot 10^{8}, 5 \cdot 10^{10}] \times [10^{-5}, 10^{-3}] \subset \mathbb{R}^2$. Figure 6 shows the behavior of (a) the normal force on a horizontal cable, and (b) the tangential and (c) vertical strains on the outer wall of the standard section of the containment building, for three distinct parameter values $\mu^{(i)} = (5 \cdot 10^{9}, \kappa^{(i)})$, for $\kappa^{(i)} \in \{10^{-5}, 10^{-4}, 10^{-3}\}$, $i = 1, 2, 3$. Notation “-E” indicates that the HF data are associated with the outer face of the structure. Note that the value of the consolidation parameter $\kappa$ affects the rate of decay of the various quantities.
Results
High-fidelity solver. We consider the two distinct three-dimensional meshes depicted in Fig. 7: a coarse mesh of $N_e = 784$ three-dimensional hexahedral elements and a refined mesh with $N_e = 1600$ elements. Each mesh features the geometry of a portion of the building halfway up the barrel, and is crossed by two vertical and three horizontal cables. We consider an adaptive time-stepping scheme: approximately 45 to 50 time steps are needed to reach the specified final time step, for all parameters in $\mathcal{P}$ and for both meshes. Table 2 provides an overview of the costs of the HF solver over the training set for the two meshes: we observe that the wall-clock cost of a full HF simulation is roughly nine minutes for the coarse mesh and seventeen minutes for the refined mesh. We consider a 7 by 7 training set $\mathcal{P}_{\rm train}$ and a 5 by 5 test set; parameters are logarithmically spaced in both directions.
Fig. 7 Mechanical response of a NCB under external loading. Tridimensional meshes (HEXA20) used for code_aster calculations: a coarse mesh (mesh 1: $N_e = 784$); b refined mesh (mesh 2: $N_e = 1600$)
Table 2 Mechanical response of a NCB under external loading. HF CPU cost in seconds [s] for the HF simulations on the coarse (mesh 1) and the refined mesh (mesh 2)
          Mean       Max        Min       Q1        Median    Q3
Mesh 1    546.91     905.53     386.96    387.04    387.12    387.19
Mesh 2    1034.07    1658.53    747.60    748.20    749.78    749.40
Table 3 Mechanical response of a NCB under external loading. Computation of maximum (columns 1 and 3) and average (columns 2 and 4) errors over the training set for several errors on the quantities of interest: maximum error over all time steps and time-averaged error (cf. (33))
Column 1: $\max_{\mu \in \mathcal{P}_{\rm train}} E_\mu^{\max}(\cdot)$; column 2: $\sum_{\mu \in \mathcal{P}_{\rm train}} E_\mu^{\max}(\cdot) / n_{\rm train}$; column 3: $\max_{\mu \in \mathcal{P}_{\rm train}} E_\mu^{\rm avg}(\cdot)$; column 4: $\sum_{\mu \in \mathcal{P}_{\rm train}} E_\mu^{\rm avg}(\cdot) / n_{\rm train}$
                          Column 1             Column 2             Column 3             Column 4
$N_{H2}$                  2.86 · 10^{-2}       6.55 · 10^{-4}       2.86 · 10^{-2}       3.91 · 10^{-5}
$\varepsilon_{tt}$-E      4.42 · 10^{-2}       7.68 · 10^{-3}       7.87 · 10^{-3}       4.84 · 10^{-3}
$\varepsilon_{zz}$-E      5.03 · 10^{-2}       1.09 · 10^{-2}       1.38 · 10^{-2}       6.73 · 10^{-3}
Figure 8 showcases the evolution of normal forces in the central horizontal cable (NH2 ),
the vertical (εzz ) and the tangential (εtt ) deformations on the outer surface of the geometry.
Table 3 shows the behavior of the maximum and average relative errors
$$
E_\mu^{\max} = \frac{\max_{k=1,\ldots,K} | q_\mu^{{\rm hf},(1)}(t^{(k)}) - q_\mu^{{\rm hf},(2)}(t^{(k)}) |}{\max_{k=1,\ldots,K} | q_\mu^{{\rm hf},(2)}(t^{(k)}) |}, \qquad
E_\mu^{\rm avg} = \frac{\sum_{k=1}^{K} \Delta t^{(k)} \, | q_\mu^{{\rm hf},(1)}(t^{(k)}) - q_\mu^{{\rm hf},(2)}(t^{(k)}) |}{\sum_{k=1}^{K} \Delta t^{(k)} \, | q_\mu^{{\rm hf},(2)}(t^{(k)}) |}, \tag{33}
$$
for the three quantities of interest of Fig. 8. We notice that the two meshes lead to nearly-equivalent results for this model problem.
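For a scalar quantity of interest sampled on a common time grid, the two indicators can be evaluated as in the sketch below. The function name `qoi_relative_errors` is hypothetical, and the time-step weights `dt` are applied to both numerator and denominator of the averaged error, which is an assumption about the exact form of (33).

```python
import numpy as np


def qoi_relative_errors(q_coarse, q_fine, dt):
    """Maximum and time-averaged relative errors between two QoI histories.

    q_coarse, q_fine: values of the QoI at the K time steps on the two meshes.
    dt:               time-step sizes used as weights in the averaged error.
    """
    q_coarse = np.asarray(q_coarse, dtype=float)
    q_fine = np.asarray(q_fine, dtype=float)
    dt = np.asarray(dt, dtype=float)
    diff = np.abs(q_coarse - q_fine)
    e_max = diff.max() / np.abs(q_fine).max()
    e_avg = np.sum(dt * diff) / np.sum(dt * np.abs(q_fine))
    return e_max, e_avg


# Small made-up history, only to exercise the formulas.
e_max, e_avg = qoi_relative_errors([1.0, 2.0, 3.0], [1.0, 2.0, 4.0], [1.0, 1.0, 2.0])
```

Because the scheme is adaptive, the two meshes may in practice produce different time grids; the sketch assumes that both histories have already been interpolated to the same instants.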
Fig. 8 Mechanical response of a NCB under external loading. Comparison of the quantities of interest computed
for the two meshes of the standard section: (a) normal force on a horizontal cable, (b) tangential and (c) vertical
strains on the outer wall of the standard section of the containment building
Model reduction. We assess the performance of the Galerkin ROM over training and test sets. Given the sequence of parameters $\{\mu^{\star,{\rm it}}\}_{{\rm it}=1}^{\rm maxit}$, for it $= 1, \ldots,$ maxit,
1. we solve the HF problem to find the trajectory $\{u^{(k)}_{\mu^{\star,{\rm it}}}\}_{k=1}^{K}$;
2. we update the reduced space $Z$ using (24) with tolerance $tol = 10^{-5}$;
3. we update the quadrature rule using the (incremental) strategy described in “Extension to unsteady problems”.
Below, we assess (i) the effectiveness of the incremental strategy for the construction of the quadrature rule, and (ii) the impact of the sampling strategy on performance. Towards this end, we compute the projection error
$$
E_{\rm it}^{\rm proj}(\mathcal{P}_\cdot) = \max_{\mu \in \mathcal{P}_\cdot} \frac{ ||| \{ u_\mu^{(k)} - \Pi_{Z_{\rm it}} u_\mu^{(k)} \}_{k=1}^{K} ||| }{ ||| \{ u_\mu^{(k)} \}_{k=1}^{K} ||| }, \qquad \cdot \in \{{\rm train}, {\rm test}\}, \tag{34}
$$
where $\Pi_{Z_{\rm it}}$ denotes the projection onto $Z_{\rm it}$, which is applied to each snapshot of the trajectory, and $||| \{ v^{(k)} \}_{k=1}^{K} ||| = \sqrt{\sum_{k=1}^{K} \Delta t^{(k)} \| v^{(k)} \|^2}$ is the discrete $L^2(0, T; \| \cdot \|)$ norm; as in [31], we consider $\| \cdot \| = \| \cdot \|_2$. Further results on the prediction error and online speedups of the ROM are provided in [31].
Figure 9 illustrates the performance of the EQ procedure; in this test, we consider the finer mesh (mesh 2), and we select the parameters $\{\mu^{\star,{\rm it}}\}_{{\rm it}=1}^{{\rm maxit}=15}$ using the POD-strong-greedy Algorithm 7 based on the HF results on mesh 1. Figure 9a–c show the number of iterations required by Algorithm 3 to meet the convergence criterion, the computational cost, and the percentage of sampled elements, which is directly related to the online costs, for $\delta = 10^{-4}$. As for the previous test case, we observe that the progressive construction of the quadrature rule drastically reduces the NNLS iterations without hindering performance. Figure 9d shows the speedup of the incremental method for three choices of the tolerance $\delta$ and for several iterations of the iterative procedure. We notice that the speedup increases as the iteration count increases: this can be explained by observing that the percentage of new columns added during the it-th step in the matrix $G$ decays with it (cf. (25)). We further observe that the speedup increases as we decrease the tolerance $\delta$: this is justified by the fact that the number of iterations required by Algorithm 3 increases as we decrease $\delta$.
Figures 10 and 11 investigate the efficacy of the greedy sampling strategy. Figure 10 shows the parameters $\{\mu^{\star,j}\}_j$ selected by Algorithm 7 for (a) the coarse mesh (mesh 1) and (b) the refined mesh (mesh 2): we observe that the majority of the sampled parameters are clustered in the bottom left corner of the parameter domain for both meshes. Figure 11 shows the performance (measured through the projection error (34)) of the two samples depicted in Figure 10 on training and test sets; to provide a concrete reference, we also compare performance with five randomly-generated samples. Interestingly, we observe that for this model problem the choice of the sampling strategy has little effect on performance; nevertheless, also in this case, the greedy procedure based on the coarse mesh consistently outperforms random sampling.
Fig. 9 Mechanical response of a NCB under external loading. Progressive construction of the quadrature rule on mesh 2. a–c Performance of standard (“std”) and incremental (“incr”) EQ procedures for several iterations of the greedy procedures, for $\delta = 10^{-4}$. d Computational speedup for three choices of the tolerance $\delta$
Fig. 10 Mechanical response of a NCB under external loading. Parameters $\{\mu^{\star,j}\}_j$ selected by Algorithm 7 for (a) the coarse mesh (mesh 1) and (b) the refined mesh (mesh 2)
Fig. 11 Mechanical response of a NCB under external loading. Behavior of the projection error $E_{\rm it}^{\rm proj}$ (34) for parameters selected by Algorithm 7 based on coarse (mesh 1) and fine data (mesh 2); comparison with random sampling. a Performance on $\mathcal{P}_{\rm test}$ (5 × 5). b Performance on $\mathcal{P}_{\rm train}$ (7 × 7). c Behavior of the basis size $n_{\rm it}$
Conclusions
The computational burden of the offline training stage remains an outstanding challenge of
MOR techniques for nonlinear, non-parametrically-affine problems. Adaptive sampling of
the parameter domain based on greedy methods might contribute to reduce the number of
offline HF solves that is needed to meet the target accuracy; however, greedy methods are
inherently sequential and introduce non-negligible overhead that might ultimately hinder
the benefit of adaptive sampling. To address these issues, in this work, we proposed two
new strategies to accelerate greedy methods: first, a progressive construction of the ROM
based on HAPOD to speed up the construction of the empirical test space for LSPG
ROMs and on a warm start of the NNLS algorithm to determine the empirical quadrature
rule; second, a two-fidelity sampling strategy to reduce the number of expensive greedy
iterations.
The numerical results of “Numerical results” illustrate the effectiveness and the gener-
ality of our methods for both steady and unsteady problems. First, we found that the warm
start of the NNLS algorithm enables a non-negligible reduction of the computational cost
without hindering performance. Second, we observed that the sampling strategy based
on coarse data leads to near-optimal performance: this result suggests that multi-fidelity
algorithms might be particularly effective to explore the parameter domain in the MOR
training phase.
The empirical findings of this work motivate further theoretical and numerical inves-
tigations that we wish to pursue in the future. First, we wish to analyze the performance
of multi-fidelity sampling methods for MOR: the ultimate goal is to devise a priori and a
posteriori indicators to drive the choice of the mesh hierarchy and the cardinality $n_0$ of the set $\mathcal{P}_\star$ in Algorithm 6. Second, we plan to extend the two elements of our formulation (progressive ROM generation and multi-fidelity sampling) to optimization problems for which the primary objective of model reduction is to estimate a suitable quantity of interest (goal-oriented MOR): in this respect, we envision combining our formulation with adaptive techniques for optimization [33–35].
Acknowledgements
The authors thank Dr. Jean-Philippe Argaud, Dr. Guilhem Ferté (EDF R&D) and Dr. Michel Bergmann (Inria Bordeaux) for
fruitful discussions. The first author thanks the code_aster development team for fruitful discussions on the FE software
employed for the numerical simulations of section 4.2.
Author contributions
Eki Agouzal: methodology, software, investigation. Tommaso Taddei: conceptualization, methodology, software,
investigation, writing.
Funding
This work was partly funded by ANRT (French National Association for Research and Technology) and EDF.
Availability of data and materials.

The data that support the findings of this study are available from the corresponding author, TT, upon reasonable
request. The software developed for the investigations of section 4.2 is expected to be integrated in the open-source
Python library Mordicus [36], which is funded by a French “Fonds Unique Interministériel” (FUI) project.
Declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have
appeared to influence the work reported in this paper.
Received: 17 January 2024 Accepted: 24 March 2024

References
1. Carlberg K, Farhat C, Cortial J, Amsallem D. The GNAT method for nonlinear model reduction: effective implementation
and application to computational fluid dynamics and turbulent flows. J Comput Phys. 2013;242:623–47.
2. Taddei T, Zhang L. Space-time registration-based model reduction of parameterized one-dimensional hyperbolic
PDEs. ESAIM Math Model Numer Anal. 2021;55(1):99–130.
3. Ryckelynck D. Hyper-reduction of mechanical models involving internal variables. Int J Numer Method Eng.
2009;77(1):75–89.
4. Farhat C, Chapman T, Avery P. Structure-preserving, stability, and accuracy properties of the energy-conserving
sampling and weighting method for the hyper reduction of nonlinear finite element dynamic models. Int J Numer
Method Eng. 2015;102(5):1077–110.
5. Yano M, Patera AT. An LP empirical quadrature procedure for reduced basis treatment of parametrized nonlinear
PDEs. Comput Methods Appl Mech Eng. 2019;344:1104–23.
6. Veroy K, Prud’Homme C, Rovas D, Patera A. A posteriori error bounds for reduced-basis approximation of parametrized
noncoercive and nonlinear elliptic partial differential equations. In: 16th AIAA Computational Fluid Dynamics Confer-
ence, p. 3847, 2003.
7. Haasdonk B, Ohlberger M. Reduced basis method for finite volume approximations of parametrized linear evolution
equations. ESAIM Math Model Numer Anal. 2008;42(2):277–302.
8. Sirovich L. Turbulence and the dynamics of coherent structures. I. Coherent structures. Quart Appl Math. 1987;45(3):561–71.
9. Volkwein S. Model reduction using proper orthogonal decomposition. Lecture Notes, Institute of Mathematics and
Scientific Computing, University of Graz. see http://www.uni-graz.at/imawww/volkwein/POD.pdf. 1025; 2011.
10. Cohen A, DeVore R. Approximation of high-dimensional parametric PDEs. Acta Numer. 2015;24:1–159.
11. Himpe C, Leibner T, Rave S. Hierarchical approximate proper orthogonal decomposition. SIAM J Sci Comput.
2018;40(5):3267–92.
12. Quarteroni A, Manzoni A, Negri F. Reduced basis methods for partial differential equations: an introduction, vol. 92.
Berlin: Springer; 2015.
13. Chapman T, Avery P, Collins P, Farhat C. Accelerated mesh sampling for the hyper reduction of nonlinear computa-
tional models. Int J Numer Method Eng. 2017;109(12):1623–54.
14. Feng L, Lombardi L, Antonini G, Benner P. Multi-fidelity error estimation accelerates greedy model reduction of
complex dynamical systems. Int J Numer Method Eng. 2023;124(3):5312–33.
15. Paul-Dubois-Taine A, Amsallem D. An adaptive and efficient greedy procedure for the optimal training of parametric
reduced-order models. Int J Numer Method Eng. 2015;102(5):1262–92.
16. Barral N, Taddei T, Tifouti I. Registration-based model reduction of parameterized PDEs with spatio-parameter
adaptivity. J Comput Phys. 2023;112727.
17. Benaceur A, Ehrlacher V, Ern A, Meunier S. A progressive reduced basis/empirical interpolation method for nonlinear
parabolic problems. SIAM J Sci Comput. 2018;40(5):2930–55.
18. Barrault M, Maday Y, Nguyen NC, Patera AT. An empirical interpolation method: application to efficient reduced-basis
discretization of partial differential equations. CR Math. 2004;339(9):667–72.
19. Conti P, Guo M, Manzoni A, Frangi A, Brunton SL, Kutz JN. Multi-fidelity reduced-order surrogate modeling. arXiv
preprint arXiv:2309.00325 2023.
20. Yano M, Modisette J, Darmofal D. The importance of mesh adaptation for higher-order discretizations of aerodynamic
flows. In: 20th AIAA Computational Fluid Dynamics Conference, p. 3852, 2011.
21. Lawson CL, Hanson RJ. Solving least squares problems. SIAM, 1995.
22. Yano M. Discontinuous Galerkin reduced basis empirical quadrature procedure for model reduction of parametrized
nonlinear conservation laws. Adv Comput Math. 2019;45(5):2287–320.
23. Du E, Yano M. Efficient hyperreduction of high-order discontinuous Galerkin methods: element-wise and point-wise
reduced quadrature formulations. J Comput Phys. 2022;466: 111399.
24. Urban K, Patera A. An improved error bound for reduced basis approximation of linear parabolic problems. Math
Comput. 2014;83(288):1599–615.
25. Haasdonk B. Reduced basis methods for parametrized PDEs—a tutorial introduction for stationary and instationary
problems. Model Reduct Approx Theory Algo. 2017;15:65.
26. Iollo A, Sambataro G, Taddei T. An adaptive projection-based model reduction method for nonlinear mechanics with
internal variables: application to thermo-hydro-mechanical systems. Int J Numer Method Eng. 2022;123(12):2894–918.
27. Brand M. Fast online SVD revisions for lightweight recommender systems. In: Proceedings of the 2003 SIAM
International Conference on Data Mining, pp. 37–46, 2003, SIAM.
28. MATLAB: R2022a. The MathWorks Inc., Natick, Massachusetts. 2022.
29. Finite Element code_aster, Analysis of Structures and Thermomechanics for Studies and Research. Electricité de
France (EDF), Open source on www.code-aster.org (1989–2024).
30. Taddei T. Compositional maps for registration in complex geometries. arXiv preprint arXiv:2308.15307 2023.
31. Agouzal E, Argaud J-P, Bergmann M, Ferté G, Taddei T. Projection-based model order reduction for prestressed
concrete with an application to the standard section of a nuclear containment building. arXiv preprint arXiv:2401.05098
2024.
32. Bouhjiti D. Analyse probabiliste de la fissuration et du confinement des grands ouvrages en béton armé et
précontraint-application aux enceintes de confinement des réacteurs nucléaires (cas de la maquette vercors). Acad J
Civil Eng. 2018;36(1):464–71.
33. Zahr MJ, Farhat C. Progressive construction of a parametric reduced-order model for PDE-constrained optimization.
Int J Numer Method Eng. 2015;102(5):1111–35.
34. Alexandrov NM, Lewis RM, Gumbert CR, Green LL, Newman PA. Approximation and model management in
aerodynamic optimization with variable-fidelity models. J Aircraft. 2001;38(6):1093–101.
35. Yano M, Huang T, Zahr MJ. A globally convergent method to accelerate topology optimization using on-the-fly model
reduction. Comput Methods Appl Mech Eng. 2021;375: 113635.
36. Mordicus Python Package. Consortium of the FUI Project MOR DICUS. Electricité de France (EDF), Open source on
https://gitlab.com/mordicus/mordicus. 2022.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
