0% found this document useful (0 votes)

51 views

On Lazy Evaluation As A Tool To Optimize The Efficiency of Large Scale Numerical Simulations in Python

This document summarizes a new approach to optimize the efficiency of large-scale numerical simulations in Python using lazy evaluation. Specifically, it discusses applying lazy evaluation to the evaluation of partial differential equation (PDE) coefficients in the escript Python module. Currently in escript, PDE coefficient expressions are evaluated immediately for all mesh elements, which can be memory intensive. The new approach uses lazy evaluation to compute coefficients on a per-element basis during matrix assembly without storing all intermediate results. This reduces memory usage and improves performance. The document presents elastic-plastic saturated porous media flow as an example problem to demonstrate the benefits of the lazy evaluation approach.

Uploaded by

Maruf Muhammad

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

51 views

On Lazy Evaluation As A Tool To Optimize The Efficiency of Large Scale Numerical Simulations in Python

Uploaded by

Maruf Muhammad

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 9

Procedia Computer

Science
ProcediaComputer
Procedia Computer Science
Science001 (2010)
(2010)1–9
2145–2153
www.elsevier.com/locate/procedia

International Conference on Computational Science, ICCS 2010

On lazy evaluation as a tool to optimize the eﬃciency of large scale

numerical simulations in Python
L. Gross1 , A. Amirbekyan, J. Fenwick, L. Gao, A. Mohajeri, H. Muhlhaus
Earth Systems Science Computational Center,
School of Earth Sciences,
The University of Queensland,
St. Lucia, QLD 4072, Australia.

Abstract
Scientists working on mathematical models want to concentrate on the design of models. They pay little attention
to numerical methods such as the finite element method (FEM), their implementation and parallelization. The escript
module in python provides an environment in which scientists can define new models using a language of partial
differential equation (PDE) and spatial functions which is natural for the formulation of continuous models. This
approach defines a high level of abstraction from the underlying data structures and frees modelers from issues of
optimized implementation and parallelization. In its current implementation escript evaluates expressions which
define PDE coefficients immediately for all nodes or elements of an FEM mesh. In the paper we will demonstrate
that for complex rheologies such as the Drucker–Prager plasticity model, the memory requirements for this strategy
are the limiting factor for scaling up the mesh size. The python module is backed by an escript C++ library where
the processing is performed. We will discuss an new extension to the PDE coefficient handling provided by this C++
library which uses a lazy evaluation technique and will demonstrate the efficiency of this new extension in terms
of compute time and memory usage for a practical engineering application, namely the simulation of elastic-plastic,
saturated porous media.
c 2010 Published by Elsevier Ltd.

Keywords:
Python Programming Language, Partial Differential Equations, Finite Element Methods.

1. Introduction

The escript package [1] is a python module [2] for implementing mathematical models based on partial differential
equations (PDEs). It provides scientists with an easy-to-use and flexible framework to specify complex, non-linear
and coupled PDEs using python scripts. The PDE is solved by an appropriate solver library while escript handles
the definition and evaluation of the PDE coefficients. Abstraction from the data structures allows simulations to be

Email addresses: [email protected] (L. Gross), [email protected] (A. Amirbekyan), [email protected] (J.
Fenwick), [email protected] (L. Gao), [email protected] (A. Mohajeri), [email protected] (H. Muhlhaus)
1 Corresponding author.

1877-0509 c 2010 Published by Elsevier Ltd.

doi:10.1016/j.procs.2010.04.240
2146 L. Gross et al. / Procedia Computer Science 1 (2010) 2145–2153
L. Gross et al. / Procedia Computer Science 00 (2010) 1–9 2

independent from the PDE solver library and the underlying computer architecture. Modelers can run simulations
on their desktop PCs as well as massive parallel supercomputers without costly modifications to the program code.
A variety of applications have successfully used escript including earthquakes [3], Earth mantel convection [4] and
volcanic lava [5].
For reasons of better flexibility and ease of use for inexperienced users, traditional PDE solving environments
such as ELLPACK [6] and FASTFLO [7] are designed with their own scripting languages rather than requiring the
use of low level languages like C. As a python module, escript has the benefits of a scripting language without users
needing to learn another new language. In escript, computationally expensive calculations such as the assembly and
solution of a system of linear equation are entirely implemented in C/C++ to optimize performance. However, the
ease of use comes at some cost for the overall performance. In many cases, this is acceptable as the drastic reduction
of development time for simulation codes significantly improves the project productivity in particular in a research
environment.
In the current escript (version 3.0), an operation within an expression defined on the python level is evaluated
immediately for all elements in the finite element mesh on the C/C++ level. Typically the costs for the evaluation
of PDE coefficients can be neglected in comparison to the overall compute time which is clearly dominated by the
solution of the linear systems. However, in this paper we will discuss an example for which the overhead from this
evaluation strategy is significant and in fact dominates the memory cost for the FEM solver. So the problem size is
restricted by evaluation of coefficients rather than the solver.
In order to eliminate this limitation in escript we use the functional programming strategy of lazy evaluation [8, 9]
to compute PDE coefficients. Lazy evaluation means that expressions are not evaluated until their values are required.
That is, expressions can be assigned to variables without actually computing the value of the expressions. In escript,
lazy evaluation is a feature of our objects and not of the python language as a whole.
Our use of lazy evaluation means that expressions can be evaluated on a per element basis during the assemply of
the global matrix without storing intermediate results for all elements at any point in time. This evaluation process
operates entirely on the C/C++ level and is taylored to efficiently incorporate PDE coefficients defined by complex
expressions for large finite element meshes into the assembling process. Existing simulation scripts using escript can
easily make use of this new technique by setting a switch at the beginning of the simulation. In order to avoid the
storage of large volumes of data our approach does not automatically cache intermediate results as it is typically done
in the functional programming approach. However, significant performance improvements can be achieved when the
user explicitly request evaluation for critical variables which are reused in the next time or iteration step (typically
less then five variables even in a complex simulation).
The paper is organized as follows: In the next section we will describe a model problem, namely elastic-plastic
saturated porous media flow. We will give a short overview of escript concepts, then show how the model problem
is implemented in escript and will discuss the limitation of the current approach for the model problem. In Section 4
we will present our new approach in more detail and in Section 5 will show memory and timing measurement for the
model problem to demonstrate its efficiency. Finally we will draw some conclusions.

2. Modelling Elastic-Plastic Saturated Porous Media

In this section, we present the model for an elastic-plastic, saturated porous media as an example of a more
complex model that can be solved using escript. The solution algorithm we outline is suggested in the literature for
practical applications, e.g. see [10]. To simplify the presentation we ignore boundary coinditions. The state of the
media is described by the velocity v, stress σi j and the pore pressure q over a domain Ω.
From Darcy’s law and the mass conservation law one derives the diﬀusion equation. By applying the backward
Euler scheme for the time step size dt, the pore pressure q is updated from the previous time step q− by the solving
the PDE
1 k 1
q − dt q,i = q− + dt · v j, j (1)
KU − K μf ,i KU − K
where for any function Z, Z,i denotes the derivative of Z with respect to xi and Einstein’s summation convention is
applied. The parameter k is the soil permeability, μ f is the pore ﬂuid viscosity, KU is the undrained bulk modulus and
K is the bulk modulus of the porous media.
L. Gross et al. / Procedia Computer Science 1 (2010) 2145–2153 2147
L. Gross et al. / Procedia Computer Science 00 (2010) 1–9 3

To calculate the displacement increment du = dt · v, one has to solve the PDE:

−(S i jkl duk,l ), j = σ−i j − dq δi j (2)
,j

where dq = q − q− is the pore pressure increment and S i jkl is the tangential tensor which depends on the rheology in
use. The argument of the divergence expression on the right hand side is the stress σ−i j from the previous time step.
For the Drucker–Prager Model, the stress is updated using the following procedure: First the new stress state σeij due
to elastic deformation is calculated:
2
σeij = σ−i j + 2G Di j + ((K − G)Dkk − dq)δi j + σ−ik Wk j − Wik σ−k j (3)
3
where G is the shear modulus, and Di j and Wi j are the strain and spin are deﬁned as the symmetric and non-symmetric
part of the gradient of the displacement increment du:
1 1
Di j = (dui, j + du j,i ) and Wi j = (dui, j − du j,i ) . (4)
2 2
The yield function F is evaluated for the elastic stress as

1 1 e e
F = τe − α · pe − τY with pe = − σekk and τe = (σ ) (σi j ) (5)
3 2 ij
with the deviatoric stress (σeij ) = σeij + pe δi j . The values τY and α are the yield stress and the friction parameter
respectively, both of which are given as a function of the plastic shear strain γ p = γ p− + λ̇. The plastic multiplier λ̇ is
set as
F 0 for F < 0
λ̇ = χ with χ = (6)
h + G + αK 1 else
dτY
where the factor χ marks when the yield condition is violated and h = dγ p is the hardening modulus. The tangential

tensor S i jkl for the next update step for the Drucker-Prager plasticity model is given by

2 1
S i jkl = G(δik δ jl + δ jk δil ) + (K − G)δi j δkl + (δik σ−l j − δ jl σ−ik + δ jk σ−il − δil σ−k j )
3 2 G (7)
χ G
+σ−i j δkl − σ−il δ jk − σ + αKδ i j σ
+ αKδ kl
h + G + α2 K τ i j τ kl

3. The escript module

Obviously, the solution of the two linear PDEs (2) for velocity and (1) for pressure are the key components when
implementing the model problem outlined in the previous section. Typically the modeler developing the model will
focus on how the coefficients of PDE are set as they define the outcome of the model. In the following sections we will
briefly outline the basic concept of the escript module [1] and will show how the model problem can be implemented
in python using the escript module.

3.1. Linear Partial Diﬀerential Equations

The keystone of the escript environment is the LinearPDE class in python which provides the interface for the
user to deﬁne and solve a general, second order, linear PDE. The general form of a PDE for an unknown vector-valued
function ui represented by the LinearPDE class is

−(Ai jkl uk,l + Bi jk uk ), j + Cikl uk,l + Dik uk = −Xi j, j + Yi . (8)

The coeﬃcients A, B, C, D, X and Y are functions of their location in the domain. Moreover, natural boundary
conditions of the form
n j (Ai jkl uk,l + Bi jk uk − Xi j, j ) = yi (9)
2148 L. Gross et al. / Procedia Computer Science 1 (2010) 2145–2153
L. Gross et al. / Procedia Computer Science 00 (2010) 1–9 4

Continuous Functions on Interior Functions on Boundary Functions on

Nodes Quadrature Points Quadrature Points

Figure 1: Location of Data-points for Selected FunctionSpace Types on a Finite Element Mesh.

can be defined. In this condition, (n j ) defines the outer normal field of boundary of the domain and y is a given
function. To restrict values of ui to given values, ri , on certain locations in the domains one can define constraints of
the form
ui = ri where qi > 0 , (10)
and qi is a characteristic function of the locations where the constraint is applied. For instance to implement the PDE
for the update of the of the pore pressure in equation (1) the user needs to write the python script
from esys.escript.linearPDE import LinearPDE
mypde=LinearPDE(mydom)
mypde.setValue(A=dt*k/mu_f*kronecker(mydom), D=1./(K_U-K),
Y=1./(K_U-K)*q+div(du))
q=mypde.getSolution()
where kronecker denotes the Kronecker δi j symbol. In this script the object mydom is a Domain class object and
defines the domain of the PDE. Instances of the Domain class (in practice of a Domain subclass) are created through
a function of the PDE solver library to be used to solve the PDE. For instance, the following python code
from esys.finley import ReadMesh
mydom=ReadMesh("mesh.fly")
creates the Domain class object mydom represented by the finite element (FEM) mesh in the file mesh.fly for the finley
solver library [11]. The finley library reads the mesh and distributes the mesh data across the available processors if
a parallel computer is used. It is beyond the scope of this paper to discuss the Domain class in detail. However, note
that a Domain class object contains information about the discretization method and the PDE solver library including
the distribution of FEM mesh items such as nodes and elements.

3.2. Spatial Functions

From the modeler’s point of view, the coeﬃcients of the PDE and the solution of the PDE are functions of their
locations in the domain of interest. The design of escript adopts this view point by introducing the concept of functions
of a spatial variable. These functions are represented by their values on a set of so-called data-points where the choice
of the data-points depends on the type of function to be represented. The type is described by the FunctionSpace
L. Gross et al. / Procedia Computer Science 1 (2010) 2145–2153 2149
L. Gross et al. / Procedia Computer Science 00 (2010) 1–9 5

class object. As shown in Figure 1 data-points can be selected as the nodes of a ﬁnite element mesh to represent
continuous function on the domain, as quadrature points in the interior of the domain to represent function which may
be discontinuous across element boundaries or as quadrature points on the boundary to represent functions which are
deﬁned on the boundary only. In escript, functions can be combined through arithmetic expressions where the results
inherit the associated FunctionSpace from their arguments.
The stress update step described in Section 2 is written in escript in the following method

from escript import *

def getStress(du, dq, sigma, gamma_p, tau_Y, G, K, alpha, beta, h):
g=grad(du)
D=symmetric(g)
W=nonsymmetric(g)
s_e=sigma+2*G*D+((K-2./3*G)*trace(D)-dq)*kronecker(3) \
+2*symmetric(matrix_mult(W,sigma)
p_e=-trace(s_e)/3.
s_e_dev=deviatoric(s_e)
tau_e=sqrt(1./2)*length(s_e_dev)
F=tau_e-alpha*positive(p_e)-tau_Y
chi=whereNonNegative(F)
l=chi*F/(h+G+alpha*beta*K)
tau=tau_e-l*G
sigma=tau/tau_e * s_e_dev - (p_e+l*alpha*K) *kronecker(3)
gamma_p=gamma_p+l
return sigma, gamma_p
The arguments du and dq which define the current displacement and pore pressure increment are solution of a PDE
and are represented by their values at the FEM nodes. The stress sigma and plastic shear strain gamma_p which are
updated by this method are represented by their values at the quadrature points in the interior of the domain. The
calculation of the gradient g=grad(du) leads to a change in the data-points to be used for representation. In fact,
the gradient is calculated by interpolating the displacement increment du given on the nodes and then evaluating the
derivative of the interpolation polynomial on the quadrature points on an element-by-element basis. Note that in the
expression (K-2./3*G)*trace(D)-dq, different data-points for representation are used for D and dq. In this case
escript automatically performs an interpolation of dq to the quadrature points.
Typically the material parameters such as the yield stress tau_Y in the getStress method are floating point num-
bers. However, for composite materials the material parameters become piecewise constant functions of their location
in the domain. For this case, escript provides a special type of representation called tagging. At mesh generation
time, each individual element is marked by a tag according to its material type present at the location of the element.
A piecewise constant function is represented by a dictionary mapping tags to values. In a typical simulation scenario
where a very large number of elements are used to represent a composition of a small number of materials the tagged
representation, if appropriate, is obviously computationally more efficient than the general expanded representation
where each data-point is holding an individual value. Expanded, tagged and constant representations can be combined
in a single expression where behind the scenes escript uses the appropriate interpretation. As pointed out earlier, this
may include interpolations to another set of data-points. From the perspective of an escript user, this is a key feature as
it allows the user to develop complex and coupled mathematical models without worrying about what type of model
parameters may be fed into the model at a later stage.

4. Lazy Evaluation

To calculate the solution of the PDE, a linear system Mu = f needs to be solved where u represents the solution
at the nodes of the FEM mesh, f is the right hand side and M is the so-called stiffness matrix. The right hand side
and the stiffness matrix are assembled from the PDE coefficients and the shape functions N used to interpolate the
2150 L. Gross et al. / Procedia Computer Science 1 (2010) 2145–2153
L. Gross et al. / Procedia Computer Science 00 (2010) 1–9 6

solution from their values on the FEM nodes across the elements. If we focus on the coeﬃcient A of the PDE, then
the assembly of the stiﬀness matrix entry M(i,μ)(k,λ) is given as

M(i,μ)(k,λ) = Ai jkl (x)Nμ, j Nλ,l dx = Ai jkl (x)Nμ, j Nλ,l dx = me(i,μ)(k,λ) (11)
Ω e Ωe e

where i and k run through the number of solution components and μ and λ through the nodes of the FEM mesh. The
value me(i,μ)(k,λ) is the local element matrix for an element Ωe . The element matrix is calculated using a numerical
integration scheme

me(i,μ)(k,λ) = Ai jkl (x)Nμ, j Nλ,l dx ≈ Aeijkl (q) Nμ,e j (q)Nλ,l
e
(q)we (q) (12)
Ωe q
e
where w are the integration weights and index q refers to the quadrature point. The assembly process is performed
on an element-by-element basis requiring a very small amount of additional memory to calculate the derivatives of
the shape functions on an element. The values Aeijkl (q) for all quadrature points q are accessed when element e is
processed only and are typically not used elsewhere in the simulation.
The following method defines the tangential operator for the Drucker–Prager update step defined in Equation (7):
from escript import *
def getTangentialTensor(sigma, tau_Y, G, K, alpha, beta, h):
k3=kronecker(3)
pp=positive(-trace(sigma)/3)
s_dev=deviatoric(sigma)
tau=sqrt(1./2)*length(s_dev)
chif=whereNonNegative(tau-alpha*pp-tau_Y)/(h+G+alpha*beta*K)*tau**2
sXk3=outer(sigma,k3)
k3Xk3=outer(k3,k3)
S= G*(swap_axes(k3Xk3,0,3)+swap_axes(k3Xk3,1,3)) + (K-2./3.*G)*k3Xk3 \
+ 0.5*( swap_axes(swap_axes(sXk3,0,2),2,3) - swap_axes(sXk3,1,2) \
- swap_axes(swap_axes(sXk3,0,3),2,3) + swap_axes(sXk3,1,3) ) \
+ sXk3 - swap_axes(swap_axes(sXk3,1,2),2,3) \
- outer(chif*(G*s_dev+tau*beta*K*k3),G*s_dev+tau*alpha*K*k3)
return S
The tangential tensor S has rank four. As it is derived from the stress sigma it is represented by its values at the
quadrature points in each element and can be used directly in the assembly process for A (see Equation (12)). With 8
quadrature points per element and 34 = 81 tensor entries per data-point, the storage of the tangential tensor for a three
dimensional geometry requires at least 648 floating point numbers per element. So for a mesh with 100, 000 elements
one needs about half a gigabyte just to store the PDE coefficient A alone. However, this is a very optimistic estimate
for the memory requirements as it ignores the necessity to create temporary results.
To give an idea on the overhead let’s have a look at a simplified version of the calculation of the tangential tensor:
sXk3=outer(sigma,k3)
S= sXk3 - swap_axes(swap_axes(sXk3,1,2),2,3)
The python interpreter executes these statements using two temporary variables t1 and t2 in the form
sXk3=outer(sigma,k3)
t1=swap_axes(sXk3,1,2)
t2=swap_axes(t1,2,3)
S= sXk3 - t2
In the best case scenario when the temporary variable t1 is deleted after the calculation of t2 this operation will
require the storage of at least three tensors of rank four at the time where S is calculated The total memory cost
per element is 1944 floating point numbers. For a mesh of 100, 000 elements this adds up to over 1.5 gigabytes to
L. Gross et al. / Procedia Computer Science 1 (2010) 2145–2153 2151
L. Gross et al. / Procedia Computer Science 00 (2010) 1–9 7

Peak Memory Usage Normalized Peak Memory Usage w.r.t. Non-lazy

2048 1
Non-lazy
Lazy
Tuned
Reference 0.8
1024
Memory Usage (MB)

0.6
512
0.4

256
0.2
Lazy
Tuned
128 0
1000 2000 4000 8000 16000 32000 64000 128000 1000 2000 4000 8000 16000 32000 64000 128000
Number of Unknowns Number of Unknowns

Figure 2: Peak memory used for a simulation for lazy and tuned runs relative to non-lazy runs. Memory includes the memory for storage of the
systems of linear equations to be solved. The line marked as Reference shows the slope of the theorical linear growth of the peak memory with
the number of unknowns.

construct the tangential operators which is about five times more than required to store the coefficient matrix of the
linear system to be solved. This means that the limiting resource factor for the simulation is the assemblage process
rather than solving the linear system as it is the typical situation in escript based simulations.
In almost all cases tensors of rank four are used once for the assembly process only. Consequently the best
approach for reducing the memory requirement is not to compute the tensors for all elements at once but to build the
value for a specific element at the time the value is required and then discard it. To implement this strategy, escript
can store a function as the expression used to define it rather than its explicit value at each data-point. An expression
will only be evaluated when it is actually required (hence “lazy evaluation”) for example as part of the assembly
process in Equation 12 or as a subexpression of another object. If an expression is evaluated at a given element e, each
subexpression is evaluated for the particular element e and the calculated value and the identity e of the corresponding
element are recorded in a small buffer. This way if a subexpression has already been evaluated for the current element
e its value does not need to be recomputed. If the recorded element does not match the current element e, then the
stored value is replaced. In our example, the tensor sXk3 is used twice in the calculation of S , once directly and once
in the calculation of the temporary variable t1. The first time a value of sXk3 is requested for an element it will be
computed, while the second time it will be returned directly from the buffer. Significantly, with lazy evaluation, the
memory cost is reduced to approximately 16KB (excluding storage for inputs) independent of the number of elements.
In escript, the python interpreter provides a convienient environment in which to manipulate expressions. The
data structures and their manipulation are implemented entirely in C/C++. When evalation is requested from within
the solver library, it is executed within the escript C++ library. Values are exchanged via C pointers.

5. Timings

The escript implementation of the model problem as described Section 3 has been tested for three-dimensional
meshes with varying numbers of elements. As the timings for the PDE coefficient evaluation does not depend on
the mesh structure a rectangular mesh is used in the tests. All runs have been performed on a single core of an SGI
ICE 8200 EX node with 32 GB of RAM using the Intel C/C++ compiler version 10. First as a control, simulations
were run with lazy evaluation disabled. This run sets up the baseline for our experiments to compare against. We
call this run non-lazy. In the second test, a single line was added to the python script to enable lazy evaluation in
the C++ part of escript. No other changes were made to the code. This is to show the effect of using lazy evaluation
with minimal effort on the part of the user. This type of test is labelled as lazy. When lazy evaluation is enabled in
this form it is used for all expression involving escript functions so lazy evaluation is not just used in the calculation
of the tangential tensor. As expressions are defined in terms of values from previous iteration steps, eg. sigma and
gamma_p expressions become significantly larger and hence more expensive to evaluate in later iterations where in fact
2152 L. Gross et al. / Procedia Computer Science 1 (2010) 2145–2153
L. Gross et al. / Procedia Computer Science 00 (2010) 1–9 8

Execution Time for Computing and Assembling Tangential Tensor Speedup for Computing and Assembling Tangential Tensor
8192 2
Non-lazy Lazy
Lazy Tuned
4096 Tuned 1.8
2048

1024 1.6
Time (s)

512
1.4
256
1.2
128

64 1
32
0.8
1000 2000 4000 8000 16000 32000 64000 128000 1000 2000 4000 8000 16000 32000 64000 128000
Number of Unknowns Number of Unknowns

Figure 3: Execution time for computing and assembling tangential tensor and Speed-up of the execution time for lazy and tuned runs over
non-lazy runs. Timings do not include the time for solving the linear systems.

a majority of the values need to be recalculated every time an expression is evaluated. To reduce this overhead escript
evaluates expressions with depth greater than 70 immediately. This is the price to be paid to deploy lazy evaluation
globally.
Eﬃciency can be improved with help from the user. As the two variables sigma and gamma_p are reused in the
next iteration step an explicit evaluation of this variables is enforced after their expressions have been deﬁned. In fact,
the user needs to just add the line

resolveGroup((sigma,gamma_p))

which triggers a simultaneously evaluation of the two expressions for sigma and gamma_p. The evaluation of one of
the expressions at a time would require the reevaluation of shared subexpressions, eg. of l. This type of evaluation is
labelled as tuned.
Figure 2 shows the total memory used for the whole simulation for the lazy and tuned type relative to the
non-lazy approach. As it is impractical to measure memory usage of the assembly process, only we present the total
memory including memory for the systems of linear equations. The total saving in memory for the lazy type is in
the best case about 10% for the reasons pointed out above. With the additional small code modifications a reduction
of memory usage of over 70% is achieved.
Figure 3 shows the execution time of all three types of runs and the speedup of lazy and tuned runs over
non-lazy runs for computing the tangential tensor and assembling of the stiffness matrix. The speed-up increases
with the number of elements and reaches almost two for the largest instance and the tuned type. This significant
performance improvement is caused by a dramatic reduction of cache misses while for the lazy type the speed-up
is not as good as for the tuned type due to reevaluation. In the first iteration step about 35% of the execution time
is spent in the expression evaluation and the assembly of the linear systems. This portion drops to 23% if the tuned
implementation is used which leads to a reduction of the compute time by 18%. This speed-up is pessimistic as the
simulation did not use the optimal linear solver.

6. Summary

We presented an extension to the escript C++ library which uses lazy evaluation to speed up three-dimensional
escript simulations using python as a high-level programming environment. Limiting the complexity of the expres-
sions to be evaluated is crucial to achieve optimal speed-up and to minimize memory usage. With rather moderate
modiﬁcations to the python code (in the example discussed in this paper two line need to be added) a speed-up of
a factor two and a memory reduction of 70% can be achieved. To implement the required modiﬁcations the user
needs to identify variables which will be reused in the next iteration step. We believe that considering the fact that in
L. Gross et al. / Procedia Computer Science 1 (2010) 2145–2153 2153
L. Gross et al. / Procedia Computer Science 00 (2010) 1–9 9

practice these variable will be critical state variables of the model even an unexperienced user will be able to insert
the necessary modiﬁcations into the code. This implementation of lazy evaluation which will be available in escript
version 3.1 can be used with both MPI and OpenMP parallelism.

Acknowledgements

This work is supported by the AuScope National Collaborative Research Infrastructure Strategy, the Queensland
State Government and The University of Queensland.

References
[1] L. Gross, L. Bourgouin, A. J. Hale, H.-B. Muhlhaus, Interface modeling in incompressible media using level sets in escript, Physics of the
Earth and Planetary Interiors 163 (2007) 23–34. doi:doi:10.1016/j.pepi.2007.04.004.
[2] M. Lutz, Programming Python, 2nd Edition, O’Reilly, 2001.
[3] L. M. Olsen-Kettle, D. Weatherley, E. Saez, L. Gross, H.-B. Mühlhaus, H. L. Xing, Analysis of slip-weakening frictional laws with static
restrengthening and their implications on the scaling, asymmetry, and mode of dynamic rupture on homogeneous and bimaterial interfaces,
J. Geophys. Res. 113 (2008) B08307. doi:doi:10.1029/2007JB005454.
[4] M. Davies, H.-B. Mühlhaus, L. Gross, Thermal effects in the evolution of initially layered mantle material, Journal Pure and Applied Geo-
physics 163 (11–12). doi:10.1007/s00024-006-0143-x.
[5] L. Bourgouin, H.-B. Mühlhaus, A. J. Hale, A. Arsac, Studying the influence of a solid shell on lava dome growth and evolution using the
level set method, Geophysical Journal International 170 (3) (2007) 1431–1438.
[6] J. R. Rice, R. F. Boisvert, Solving Elliptic Problems Using ELLPACK., Vol. 2, Springer Series in Computational Software, 1985.
[7] A. N. Stokes, N. G. Barton, L. X.-L., Z. Z., Fluids, finite elements and multi-physics, NOTES ON NUMERICAL FLUID MECHANICS 85
(2003) 262–276.
[8] D. P. Friedman, M. Wand, Essentials of Programming Languages, 3rd Edition, MIT Press, 2008.
[9] A. K. Pandey, Programming Languages, Principles and Paradigms, Alpha Science, 2008.
[10] N. Manoharan, S. P. Dasgupta, Consolidation analysis of elasto-plastic soil, Computers & Structures 54 (6) (1995) 1005–1021.
[11] M. Davies, L. Gross, H.-B. Mühlhaus, Scripting high-performance earth systems simulations on the sgi altix 3700., in: Proceedings of the 7th
international conference on high-performance computing and grid in the Asia Pacific region, SIGHPC / Information Processing Society of
Japan (IPSJ) and IEEE Computer Society Japan Chapter, IEEE Computer Society, 2004, pp. 244–251. doi:10.1109/HPCASIA.2004.1324041.

Finite Elements and Approximation
From Everand
Finite Elements and Approximation
O. C. Zienkiewicz
4.5/5 (4)
Online Finite Element Analysis Course
From Everand
Online Finite Element Analysis Course
Dr. James A. Mandel P.E.
3/5 (1)
Theory of Computation and Application- Automata,Formal languages,Computational Complexity (2nd Edition): 2, #1
From Everand
Theory of Computation and Application- Automata,Formal languages,Computational Complexity (2nd Edition): 2, #1
S. R. Jena
No ratings yet
Métodos numéricos aplicados a Ingeniería: Casos de estudio usando MATLAB
From Everand
Métodos numéricos aplicados a Ingeniería: Casos de estudio usando MATLAB
Héctor Jorquera González
5/5 (1)
FPDE User Guide
No ratings yet
FPDE User Guide
77 pages
Computational Geometry: Exploring Geometric Insights for Computer Vision
From Everand
Computational Geometry: Exploring Geometric Insights for Computer Vision
Fouad Sabry
No ratings yet
Prepared For The U.S. Department of Energy, UNDER CONTRACT DE-AC02-76CH03073
No ratings yet
Prepared For The U.S. Department of Energy, UNDER CONTRACT DE-AC02-76CH03073
15 pages
Graph Layout Support for Model-Driven Engineering
From Everand
Graph Layout Support for Model-Driven Engineering
Miro Spönemann
No ratings yet
Introduction to Quantum Computing & Machine Learning Technologies: 1, #1
From Everand
Introduction to Quantum Computing & Machine Learning Technologies: 1, #1
M. Sreedevi
No ratings yet
Proto Fem Paper
100% (1)
Proto Fem Paper
23 pages
Application of The FEM To Structural Engieering Problems
No ratings yet
Application of The FEM To Structural Engieering Problems
5 pages
Cppnow2012 Submission 13
No ratings yet
Cppnow2012 Submission 13
12 pages
Finite Element Method
From Everand
Finite Element Method
Gouri Dhatt
1/5 (1)
Real-Time Critical Systems
From Everand
Real-Time Critical Systems
Jordan Lee Mauro-Buhagiar
3/5 (1)
Japanese High Technology
From Everand
Japanese High Technology
Kazutake IMANI
No ratings yet
Computer Vision Graph Cuts: Exploring Graph Cuts in Computer Vision
From Everand
Computer Vision Graph Cuts: Exploring Graph Cuts in Computer Vision
Fouad Sabry
No ratings yet
829-Article Text-5973-2-10-20210114
No ratings yet
829-Article Text-5973-2-10-20210114
11 pages
Hidden Line Removal: Unveiling the Invisible: Secrets of Computer Vision
From Everand
Hidden Line Removal: Unveiling the Invisible: Secrets of Computer Vision
Fouad Sabry
No ratings yet
FEA Summary
No ratings yet
FEA Summary
4 pages
Analysis of Experimental Data Microsoft®Excel or Spss??! Sharing of Experience English Version: Book 3
From Everand
Analysis of Experimental Data Microsoft®Excel or Spss??! Sharing of Experience English Version: Book 3
Ping Yuen PY Cheng
No ratings yet
Parallelized Implicit Nonlinear FEA Program For Real Scale RC Structures Under Cyclic Loading
No ratings yet
Parallelized Implicit Nonlinear FEA Program For Real Scale RC Structures Under Cyclic Loading
10 pages
Mesh Generation: Advances and Applications in Computer Vision Mesh Generation
From Everand
Mesh Generation: Advances and Applications in Computer Vision Mesh Generation
Fouad Sabry
No ratings yet
Learning PyTorch 2.0, Second Edition: Utilize PyTorch 2.3 and CUDA 12 to experiment neural networks and deep learning models
From Everand
Learning PyTorch 2.0, Second Edition: Utilize PyTorch 2.3 and CUDA 12 to experiment neural networks and deep learning models
Matthew Rosch
No ratings yet
Learning PyTorch 2.0, Second Edition
From Everand
Learning PyTorch 2.0, Second Edition
Matthew Rosch
No ratings yet
Computational Science: An Introduction for Scientists and Engineers
From Everand
Computational Science: An Introduction for Scientists and Engineers
Christopher D Wentworth
No ratings yet
Level Set Method: Advancing Computer Vision, Exploring the Level Set Method
From Everand
Level Set Method: Advancing Computer Vision, Exploring the Level Set Method
Fouad Sabry
No ratings yet
Machine Learning - Advanced Concepts
From Everand
Machine Learning - Advanced Concepts
Derrick Mwiti
No ratings yet
Algorithms for Minimization Without Derivatives
From Everand
Algorithms for Minimization Without Derivatives
Richard P. Brent
No ratings yet
Automated Theorem Proving: Fundamentals and Applications
From Everand
Automated Theorem Proving: Fundamentals and Applications
Fouad Sabry
No ratings yet
Ansys Manual 2013
No ratings yet
Ansys Manual 2013
111 pages
referensi kelebihan abaqus
No ratings yet
referensi kelebihan abaqus
5 pages
Advanced Guide to Dynamic Programming in Python: Techniques and Applications
From Everand
Advanced Guide to Dynamic Programming in Python: Techniques and Applications
Adam Jones
No ratings yet
Oracle Certified Professional Java Programmer OCPJP 1Z0 809
From Everand
Oracle Certified Professional Java Programmer OCPJP 1Z0 809
Manish Soni
No ratings yet
Group Method of Data Handling: Fundamentals and Applications for Predictive Modeling and Data Analysis
From Everand
Group Method of Data Handling: Fundamentals and Applications for Predictive Modeling and Data Analysis
Fouad Sabry
No ratings yet
1 Background - The 2D Distinct Element Method
No ratings yet
1 Background - The 2D Distinct Element Method
36 pages
NMCE - Volume 3 - Issue 4 - Pages 1-9
No ratings yet
NMCE - Volume 3 - Issue 4 - Pages 1-9
9 pages
Live Trace Visualization for System and Program Comprehension in Large Software Landscapes
From Everand
Live Trace Visualization for System and Program Comprehension in Large Software Landscapes
Florian Fittkau
No ratings yet
Learn C++
From Everand
Learn C++
Aishik Dutta
No ratings yet
Data Structures and Algorithms with Python
From Everand
Data Structures and Algorithms with Python
Aadinath Pothuvaal
No ratings yet
AI for Everyone: An Intermediate Guide to Artificial Intelligence
From Everand
AI for Everyone: An Intermediate Guide to Artificial Intelligence
Nova Clarke
No ratings yet
Fea Handouts
No ratings yet
Fea Handouts
51 pages
Python Machine Learning: Machine Learning Algorithms for Beginners - Data Management and Analytics for Approaching Deep Learning and Neural Networks from Scratch
From Everand
Python Machine Learning: Machine Learning Algorithms for Beginners - Data Management and Analytics for Approaching Deep Learning and Neural Networks from Scratch
Ahmed Ph. Abbasi
No ratings yet
epx_flyer2.0
No ratings yet
epx_flyer2.0
2 pages
Applications of Coupled Eulerian-Lagrangian Method To Geotechnical Problems With Large Deformations
No ratings yet
Applications of Coupled Eulerian-Lagrangian Method To Geotechnical Problems With Large Deformations
16 pages
Mastering Dynamic Programming in Python
From Everand
Mastering Dynamic Programming in Python
Ed A Norex
No ratings yet
A User-Defined Element For Dynamic Analysis of Saturated Porous Media in ABAQUS
No ratings yet
A User-Defined Element For Dynamic Analysis of Saturated Porous Media in ABAQUS
17 pages
Math for Deep Learning: What You Need to Know to Understand Neural Networks
From Everand
Math for Deep Learning: What You Need to Know to Understand Neural Networks
Ronald T. Kneusel
No ratings yet
Computational Tools For Teaching Graduate Courses in Geotechnical Engineering
No ratings yet
Computational Tools For Teaching Graduate Courses in Geotechnical Engineering
5 pages
Advanced Unix Programming
From Everand
Advanced Unix Programming
Prof. N. B Venkateswarlu
No ratings yet
Computer Assisted Proof: Fundamentals and Applications
From Everand
Computer Assisted Proof: Fundamentals and Applications
Fouad Sabry
No ratings yet
C + +: C++ programming
From Everand
C + +: C++ programming
Ummed Singh
No ratings yet
An Introduction To FE Analysis With Excel-VBA: [email protected] - JP
No ratings yet
An Introduction To FE Analysis With Excel-VBA: [email protected] - JP
15 pages
Introduction to MATLAB for Scientists and Engineers: A Practical Guide to Computational Problem Solving
From Everand
Introduction to MATLAB for Scientists and Engineers: A Practical Guide to Computational Problem Solving
Eric Okoth Ogur
No ratings yet
Programming the Finite Element Method
From Everand
Programming the Finite Element Method
I. M. Smith
No ratings yet
DESIGN ALGORITHMS TO SOLVE COMMON PROBLEMS: Mastering Algorithm Design for Practical Solutions (2024 Guide)
From Everand
DESIGN ALGORITHMS TO SOLVE COMMON PROBLEMS: Mastering Algorithm Design for Practical Solutions (2024 Guide)
ARCHER PAUL
No ratings yet
Pathways to Machine Learning and Soft Computing: 邁向機器學習與軟計算之路（國際英文版）
From Everand
Pathways to Machine Learning and Soft Computing: 邁向機器學習與軟計算之路（國際英文版）
Jyh-Horng Jeng
No ratings yet
Adding Elements Into FEAP
100% (1)
Adding Elements Into FEAP
26 pages
Learn Python in One Hour: Programming by Example
From Everand
Learn Python in One Hour: Programming by Example
Victor R. Volkman
3/5 (2)
OpenCL Programming and Architecture: Definitive Reference for Developers and Engineers
From Everand
OpenCL Programming and Architecture: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Computer Aided Engineering Notes Imp
100% (1)
Computer Aided Engineering Notes Imp
6 pages
An Increasing of The Efficiency of Microbiological Synthesis of 1 3 Propanediol From Crude Glycerol by The Concentration of Biomass
No ratings yet
An Increasing of The Efficiency of Microbiological Synthesis of 1 3 Propanediol From Crude Glycerol by The Concentration of Biomass
7 pages
Evaluation of Different Expression Systems For The Heterologous Expression of Pyranose 2 Oxidase From Trametes Multicolor in e Coli
No ratings yet
Evaluation of Different Expression Systems For The Heterologous Expression of Pyranose 2 Oxidase From Trametes Multicolor in e Coli
9 pages
The Exometabolome of Clostridium Thermocellum Reveals Overflow Metabolism at High Cellulose Loading
No ratings yet
The Exometabolome of Clostridium Thermocellum Reveals Overflow Metabolism at High Cellulose Loading
11 pages
Genome Mining For Natural Product Biosynthetic Gene Clusters in The Subsection V Cyanobacteria
No ratings yet
Genome Mining For Natural Product Biosynthetic Gene Clusters in The Subsection V Cyanobacteria
20 pages
An Efficient Method For The Immobilization of Inulinase Using New Types of Polymers Containing Epoxy Groups
No ratings yet
An Efficient Method For The Immobilization of Inulinase Using New Types of Polymers Containing Epoxy Groups
12 pages
Correlation Between Agar Plate Screening and Solid State Fermentation For The Prediction of Cellulase Production by Trichoderma Strains
No ratings yet
Correlation Between Agar Plate Screening and Solid State Fermentation For The Prediction of Cellulase Production by Trichoderma Strains
8 pages
A Significant Number of Reported Absidia Corymbifera Lichtheimia Corymbifera Infections Are Caused by Lichtheimia Ramosa Syn Lichtheimia Hongkongensis An Emerging Cause of Mucormycosis
No ratings yet
A Significant Number of Reported Absidia Corymbifera Lichtheimia Corymbifera Infections Are Caused by Lichtheimia Ramosa Syn Lichtheimia Hongkongensis An Emerging Cause of Mucormycosis
8 pages
Recycling of Sewage Sludge As Production Medium For Cellulase by A Bacillus Megaterium Strain
No ratings yet
Recycling of Sewage Sludge As Production Medium For Cellulase by A Bacillus Megaterium Strain
15 pages
Physiological Characterization of Thermotolerant Yeast For Cellulosic Ethanol Production
No ratings yet
Physiological Characterization of Thermotolerant Yeast For Cellulosic Ethanol Production
12 pages
The Impact of Diclofenac and Ibuprofen On Biofilm Formation On The Surface of Polypropylene Mesh
No ratings yet
The Impact of Diclofenac and Ibuprofen On Biofilm Formation On The Surface of Polypropylene Mesh
7 pages
Brief Report: Corynebacterium Glutamicum
No ratings yet
Brief Report: Corynebacterium Glutamicum
5 pages
High-Throughput Strategies For Penicillin G Acylase Production in Re. Coli Fed-Batch Cultivations
No ratings yet
High-Throughput Strategies For Penicillin G Acylase Production in Re. Coli Fed-Batch Cultivations
13 pages
Amino Acid Catabolism Directed Biofuel Production in Clostridium Sticklandii An Insight Into Model Driven Systems Engineering
No ratings yet
Amino Acid Catabolism Directed Biofuel Production in Clostridium Sticklandii An Insight Into Model Driven Systems Engineering
29 pages
Degradation of o Xylene and M Xylene by A Novel Sulfate Reducer Belonging To The Genus Desulfotomaculum
No ratings yet
Degradation of o Xylene and M Xylene by A Novel Sulfate Reducer Belonging To The Genus Desulfotomaculum
11 pages
Imac Capture of Recombinant Protein From Unclarified Mammalian Cell Feed Streams
No ratings yet
Imac Capture of Recombinant Protein From Unclarified Mammalian Cell Feed Streams
11 pages
The Sequence Capture by Hybridization A New Approach For Revealing The Potential of Mono Aromatic Hydrocarbons Bioattenuation in A Deep Oligotrophic Aquifer
No ratings yet
The Sequence Capture by Hybridization A New Approach For Revealing The Potential of Mono Aromatic Hydrocarbons Bioattenuation in A Deep Oligotrophic Aquifer
11 pages
Soybean Glucosidase Immobilisated On Chitosan Beads and Its Application in Soy Drink Increase The Aglycones
No ratings yet
Soybean Glucosidase Immobilisated On Chitosan Beads and Its Application in Soy Drink Increase The Aglycones
8 pages
Streptomyces Flavogriseus hs1 Isolation and Characterization of Extracellular Proteases and Their Compatibility With Laundry Detergents
No ratings yet
Streptomyces Flavogriseus hs1 Isolation and Characterization of Extracellular Proteases and Their Compatibility With Laundry Detergents
9 pages
Enhanced Rhamnolipid Production in Burkholderia Thailandensis Transposon Knockout Strains Deficient in Polyhydroxyalkanoate Pha Synthesis
No ratings yet
Enhanced Rhamnolipid Production in Burkholderia Thailandensis Transposon Knockout Strains Deficient in Polyhydroxyalkanoate Pha Synthesis
12 pages
Simultaneous Environmental Manipulations in Semi Perfusion Cultures of Cho Cells Producing RH Tpa
No ratings yet
Simultaneous Environmental Manipulations in Semi Perfusion Cultures of Cho Cells Producing RH Tpa
10 pages
4.partial Differential Equation by K-Sankara-Rao
65% (20)
4.partial Differential Equation by K-Sankara-Rao
86 pages
Indian Institute of Space Science and Technology: B.Tech. Aerospace Engineering Curriculum & Syllabus
No ratings yet
Indian Institute of Space Science and Technology: B.Tech. Aerospace Engineering Curriculum & Syllabus
59 pages
Applied Mathematics for Engineers and Physicists 3rd Edition Louis A. Pipespdf download
100% (1)
Applied Mathematics for Engineers and Physicists 3rd Edition Louis A. Pipespdf download
25 pages
Meem5800 r02 Aem Summer 2016
No ratings yet
Meem5800 r02 Aem Summer 2016
4 pages
DST Group TR 3530
No ratings yet
DST Group TR 3530
43 pages
Lecture Notes, Revised, PDEsm, M, M, MM, M, M
No ratings yet
Lecture Notes, Revised, PDEsm, M, M, MM, M, M
12 pages
MATHS 361: Partial Differential Equations Study Guide: Semester 1 2016
No ratings yet
MATHS 361: Partial Differential Equations Study Guide: Semester 1 2016
3 pages
Course Contents Full Copy MA MSC
No ratings yet
Course Contents Full Copy MA MSC
21 pages
Resume Eric Radke
No ratings yet
Resume Eric Radke
1 page
Brief Notes On Solving PDE's and Integral Equations
No ratings yet
Brief Notes On Solving PDE's and Integral Equations
11 pages
New Microsoft Office Word Document
No ratings yet
New Microsoft Office Word Document
4 pages
Week 9
No ratings yet
Week 9
17 pages
Nine Mathematical Challenges An Elucidation A Kechris N Makarov download
No ratings yet
Nine Mathematical Challenges An Elucidation A Kechris N Makarov download
83 pages
Differential Equations & Modeling Applications: Dennis G.Zill'
No ratings yet
Differential Equations & Modeling Applications: Dennis G.Zill'
4 pages
TOCHER, K. D., The Art of Simulation
No ratings yet
TOCHER, K. D., The Art of Simulation
192 pages
FY BTech Structure and Syllabus 2020 21
No ratings yet
FY BTech Structure and Syllabus 2020 21
59 pages
Keller Box Shooting Method
No ratings yet
Keller Box Shooting Method
30 pages
Similarity Transformations For Partial Differential Equations
No ratings yet
Similarity Transformations For Partial Differential Equations
7 pages
Exact Dirichlet Boundary Physics-Informed Neural Network EPINN
No ratings yet
Exact Dirichlet Boundary Physics-Informed Neural Network EPINN
18 pages
Preview
No ratings yet
Preview
58 pages
6.Lab6 06-03-24_Packed Bed Reactor
No ratings yet
6.Lab6 06-03-24_Packed Bed Reactor
24 pages
Linear Wave Theory
No ratings yet
Linear Wave Theory
33 pages
Numerical Methods and Analysis
No ratings yet
Numerical Methods and Analysis
13 pages
Finite Element Method For Maxwell's Equations
No ratings yet
Finite Element Method For Maxwell's Equations
6 pages
178491Full download (Ebook) Essential mathematical methods for physicists by Hans J. Weber, Frank Harris, George B. Arfken ISBN 9780080469850, 9780120598779, 008046985X, 0120598779 pdf docx
100% (2)
178491Full download (Ebook) Essential mathematical methods for physicists by Hans J. Weber, Frank Harris, George B. Arfken ISBN 9780080469850, 9780120598779, 008046985X, 0120598779 pdf docx
67 pages
Computational Fluid Dynamics For Incompressible Flows: Classification of Pdes
No ratings yet
Computational Fluid Dynamics For Incompressible Flows: Classification of Pdes
14 pages
AKU Eng 2018 Syllabus PDF
No ratings yet
AKU Eng 2018 Syllabus PDF
70 pages
ADAPTIVE MESH REFINEMENT IN NUMERICAL METHODS FOR FRACTIONAL DIFFERENTIAL EQUATIONS
No ratings yet
ADAPTIVE MESH REFINEMENT IN NUMERICAL METHODS FOR FRACTIONAL DIFFERENTIAL EQUATIONS
32 pages
Quant Interview and Exam Prep
No ratings yet
Quant Interview and Exam Prep
21 pages
Partial Differential Equations For Scientists and
No ratings yet
Partial Differential Equations For Scientists and
7 pages

On Lazy Evaluation As A Tool To Optimize The Efficiency of Large Scale Numerical Simulations in Python

Uploaded by

On Lazy Evaluation As A Tool To Optimize The Efficiency of Large Scale Numerical Simulations in Python

Uploaded by

Procedia Computer

International Conference on Computational Science, ICCS 2010

On lazy evaluation as a tool to optimize the eﬃciency of large scale

1877-0509 c 2010 Published by Elsevier Ltd.

2. Modelling Elastic-Plastic Saturated Porous Media

To calculate the displacement increment du = dt · v, one has to solve the PDE:

3. The escript module

3.1. Linear Partial Diﬀerential Equations

−(Ai jkl uk,l + Bi jk uk ), j + Cikl uk,l + Dik uk = −Xi j, j + Yi . (8)

Continuous Functions on Interior Functions on Boundary Functions on

3.2. Spatial Functions

from escript import *

Peak Memory Usage Normalized Peak Memory Usage w.r.t. Non-lazy

You might also like