Nek5000 Tutorial: Velocity Prediction, ANL MAX Experiment
Nek5000 Tutorial
Paul Fischer
Aleks Obabko
Stefan Kerkemeier
James Lottes
Katie Heisey
Shashi Aithal
Yulia Peet
Presenters
Paul Fischer
– spectral element overview
– Nek5000
– Prenek
Additional help
– Shashi Aithal – Nek5000 on fusion, RANS development
– Yulia Peet – multidomain coupling
– Katie Heisey – automated build/test suite, example suite, mesh partitioner
– Stefan Kerkemeier – principal software engineer
Argonne National
Laboratory
Objectives
Course Objectives:
Outline
Nek5000 capabilities
Equations, timestepping, and SEM basics
Workflow example
– Setting initial and boundary conditions
– Basic runtime analysis
– Parallel / serial issues that you should understand
Using VisIt to analyze results
Mesh generation options
– Building meshes with genbox, prenek, and morphing
Walking through examples; hands on simulations
Some Resources
www.mcs.anl.gov/~fischer/Nek5000
Part I
Nek5000 capabilities
– Gallery
– Brief history
– Equations solved
– Features overview:
• Spectral element discretization
• Convergence properties (nek5_svn/examples)
• Scalability
Applications
1 Merzari et al., Proper Orthogonal Decomposition of the flow in a T-junction, Proc. ICAPP (2010)
2 Hirota et al., Exptl Study on Turbulent Flow and Mixing in Counter-Type T-junction, J. Therm. Sci. & Tech. 3, 157 – 58 (2008)
Experiment (Re=90K) exhibits more rapid recovery of profile than simulation (Re=40K)
[Figure: profiles at x/D = 1.6, 2.6, 3.6, 4.6. Legend: Lo-res Re=40K, Med-res Re=40K, Expt Re=90K]
Parallel Scaling
7300 pts/processor
η=0.8 @ P=131072
www.mcs.anl.gov/~fischer/sem1b
HEDL geometry
Reh = 10,500
DNS / LES code for fluid dynamics, heat transfer, MHD, combustion,…
– 100K lines of code: f77 (70K) & C (30K)
– Interfaces w/ VisIt & MOAB/Cubit
Based on high-order spectral element method (Patera '84, Maday & Patera '89)
– Started as Nekton 2.0. First 3D SEM code. (F., Ho, & Ronquist, '86-'89)
High-order FEM featuring
– Minimal numerical dispersion/dissipation (Nth-order accuracy, N=5-15)
– Loosely coupled elements (C0 continuity between elements)
– Tightly coupled dofs within elements (full stiffness matrices – never formed)
Why High-Order ?
– For fixed final error ε_f, require: numerical dispersion error ~ (λ / L) ε_f << 1
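A rough way to see why this requirement favours high order is to ask how many grid points per wavelength a scheme needs before its phase-speed error drops below the tolerance. The sketch below is an illustration under stated assumptions, not part of Nek5000: it takes λ as the wavelength of a feature, L as the distance it travels, and ε_f as the allowed final error (names assumed here), and it compares textbook 2nd- and 4th-order centred finite differences rather than spectral elements. The numbers L/λ = 100 and ε_f = 1% are hypothetical.

```python
import math

def phase_speed_ratio_2nd(kh):
    """Numerical / exact phase speed for 2nd-order centred differences."""
    return math.sin(kh) / kh

def phase_speed_ratio_4th(kh):
    """Numerical / exact phase speed for 4th-order centred differences."""
    return (8.0 * math.sin(kh) - math.sin(2.0 * kh)) / (6.0 * kh)

def points_per_wavelength(ratio_fn, tol):
    """Smallest integer number of points per wavelength with |1 - ratio| <= tol."""
    ppw = 3
    while abs(1.0 - ratio_fn(2.0 * math.pi / ppw)) > tol:
        ppw += 1
    return ppw

# Hypothetical numbers: the feature travels 100 wavelengths, final error 1%.
wavelengths_travelled = 100.0                 # L / lambda
final_error = 0.01                            # eps_f
tol = final_error / wavelengths_travelled     # (lambda / L) * eps_f

ppw2 = points_per_wavelength(phase_speed_ratio_2nd, tol)
ppw4 = points_per_wavelength(phase_speed_ratio_4th, tol)
print("2nd order:", ppw2, "points per wavelength")
print("4th order:", ppw4, "points per wavelength")
```

The higher-order stencil needs far fewer points per wavelength for the same tolerance, and the gap widens as L/λ grows.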
Strengths of Nek5000
Highly scalable
– Fast scalable multigrid solvers
– Scales to > 290,000 processors with ~10^4 pts/proc on BG/P
Extensively tested
> 10s of platforms over 25 years
> 150 journal articles & > 60 users worldwide
> 400 tests after each build to ensure verified source (more tests to be added)
[Figure: Iterations / Step, with curves labelled U and B]
Scaling to P=262144 Cores
Stefan Kerkemeier, ETHZ/ANL
Production combustion and reactor simulations on ALCF BG/P demonstrate scaling to P=131072 with n/P ~ 5000-10,000 and η ~ .7
Test problem with 7 billion points scales to P=262144 on Jülich BG/P with η ~ .7 – tests 64-bit global addressing for gs communication framework
Limitations of Nek5000
No steady-state NS or RANS:
– unsteady RANS under development / test – Aithal
Mesh Anisotropy
Refinement in region of interest… yields unwanted high aspect-ratio cells in the far field
Refinement propagation leads to
– unwanted elements in far-field
– high aspect-ratio cells that are detrimental
to iterative solver performance (F. JCP'97)
Morphing:
Part 2 (a)
nek5_svn repository
– nek5_svn
• trunk
– nek – makenek script and source files
– tools – several utilities (prenek, genbox, etc.) and scripts
nek5_svn repository
nek5_svn
|-- 3rd_party
|-- branches
|-- examples
|   |-- axi
|   |-- benard
|   |-- conj_ht
|   |-- eddy
|   |-- fs_2
|   |-- fs_hydro
|   |-- kovasznay
|   |-- lowMach_test
|   |-- moab
|   |-- peris
|   |-- pipe
|   |-- rayleigh
|   |-- shear4
|   |-- timing
|   |-- turbChannel
|   |-- turbJet
|   `-- vortex
|-- tags
|-- tests
`-- trunk
    |-- nek
    |   `-- source files….
    `-- tools
        |-- amg_matlab
        |-- avg
        |-- genbox
        |-- genmap
        |-- makefile
        |-- maketools
        |-- n2to3
        |-- nekmerge
        |-- postnek
        |-- prenek
        |-- reatore2
        `-- scripts
Snapshot of SIZE
Part 2 (b)
Outline
Nek5000 capabilities
Equations, timestepping, and SEM basics
Workflow example
– Setting initial and boundary conditions
– Basic runtime analysis
– Parallel / serial issues that you should understand
Using VisIt to analyze results
Mesh generation options
– Building meshes with genbox, prenek, and morphing
Walking through examples; hands on simulations
Incompressible MHD
Timestepping
Timestepping Design
Implicit:
– symmetric and (generally) linear terms,
– fixed flow rate conditions
Explicit:
– nonlinear, nonsymmetric terms,
– user-provided rhs terms, including
• Boussinesq and Coriolis forcing
Rationale:
– div u = 0 constraint is fastest timescale
– Viscous terms: explicit treatment of 2nd-order derivatives would require Δt ~ O(Δx^2)
– Convective terms require only Δt ~ O(Δx)
– For high Re, temporal-spatial accuracy dictates Δt ~ O(Δx)
– Linear symmetric is “easy” – nonlinear nonsymmetric is “hard”
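The two scalings in the rationale can be compared with a few lines of arithmetic. This is a minimal sketch under textbook assumptions, not a Nek5000 calculation: it uses the forward-Euler limit for 1D diffusion with 2nd-order differences, Δt <= Δx^2/(2ν), and a unit CFL number for convection. The viscosity, speed, and grid spacings are hypothetical.

```python
def explicit_diffusion_dt(dx, nu):
    """Forward-Euler stability limit for 1D diffusion: dt <= dx^2 / (2 nu)."""
    return dx * dx / (2.0 * nu)

def convective_dt(dx, speed, cfl=1.0):
    """CFL-type limit for explicit convection: dt <= cfl * dx / |u|."""
    return cfl * dx / abs(speed)

nu = 1.0e-3     # hypothetical viscosity
speed = 1.0     # hypothetical convective speed

for dx in (0.1, 0.01, 0.001):
    print(dx, explicit_diffusion_dt(dx, nu), convective_dt(dx, speed))
```

Each tenfold refinement cuts the diffusive limit by 100 but the convective limit only by 10, which is the motivation for treating the viscous terms implicitly and the convective terms explicitly.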
BDF2/EXT2 Example
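As a hedged illustration of the idea, the standard BDF2/EXT2 step for du/dt = λu + N(u) treats λu implicitly and extrapolates N explicitly: (3u^{n+1} - 4u^n + u^{n-1})/(2Δt) = λu^{n+1} + 2N(u^n) - N(u^{n-1}). The sketch below applies it to a scalar model problem, du/dt = -u - u^2, chosen here only because it has a closed-form solution; the function names and the first-order start-up step are assumptions of this example, not Nek5000 code.

```python
import math

def bdf2_ext2(u0, lam, nonlinear, dt, nsteps):
    """Integrate du/dt = lam*u + nonlinear(u): lam*u implicit, nonlinear(u) explicit."""
    # Start-up step: first-order BDF1/EXT1, since only one history value exists.
    u_prev = u0
    u = (u0 + dt * nonlinear(u0)) / (1.0 - dt * lam)
    for _ in range(nsteps - 1):
        # EXT2: extrapolate the explicit term to the new time level.
        n_ext = 2.0 * nonlinear(u) - nonlinear(u_prev)
        # BDF2: (3*u_new - 4*u + u_prev) / (2*dt) = lam*u_new + n_ext, solved for u_new.
        u_new = (4.0 * u - u_prev + 2.0 * dt * n_ext) / (3.0 - 2.0 * dt * lam)
        u_prev, u = u, u_new
    return u

def exact(t, u0=1.0):
    """Exact solution of du/dt = -u - u^2 with u(0) = u0."""
    decay = math.exp(-t)
    return u0 * decay / (1.0 + u0 * (1.0 - decay))

def error_at_t1(nsteps):
    """Absolute error at t = 1 when using nsteps equal steps."""
    approx = bdf2_ext2(1.0, -1.0, lambda u: -u * u, 1.0 / nsteps, nsteps)
    return abs(approx - exact(1.0))

err_coarse = error_at_t1(200)
err_fine = error_at_t1(400)
print(err_coarse, err_fine, err_coarse / err_fine)
```

Halving the step should reduce the error by roughly a factor of four, the signature of a second-order scheme.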
BDFk/EXTk
Characteristics Timestepping
Spatial Discretization
3D nonconforming mesh
for arterio-venous graft simulations:
E = 6168 elements, N = 7
– poor choices:
– good choices:
hi(x)
Monomials: x^k
Uniform Points
GLL Points ~ N^3
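For readers who want to see where the GLL (Gauss-Lobatto-Legendre) points sit, the sketch below computes them with a Newton iteration on (x^2 - 1) P_N'(x), started from Chebyshev-Lobatto guesses. It is an illustrative stand-alone routine with names chosen here, not the routine Nek5000 itself uses.

```python
import math

def legendre_pair(n, x):
    """Return (P_n(x), P_{n-1}(x)) from the three-term recurrence, for n >= 1."""
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p, p_prev

def gll_points(n, iterations=50):
    """Gauss-Lobatto-Legendre points for polynomial order n (n + 1 points on [-1, 1])."""
    points = []
    for j in range(n + 1):
        x = -math.cos(math.pi * j / n)          # Chebyshev-Lobatto initial guess
        for _ in range(iterations):
            p, p_prev = legendre_pair(n, x)
            # Newton step on f(x) = x*P_n - P_{n-1}, whose derivative is (n+1)*P_n.
            x -= (x * p - p_prev) / ((n + 1) * p)
        points.append(x)
    return points

pts = gll_points(4)
spacing = [b - a for a, b in zip(pts, pts[1:])]
print(pts)
print(min(spacing), 2.0 / 4)   # smallest GLL gap vs. the uniform gap for the same N
```

The points cluster toward the element ends, which is what keeps high-order Lagrange interpolation well behaved compared with uniform points.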
Extension to 2D
[Figure: panels N=4 and N=10; annotation: basis coefficients]
hi(r)
mxm
Mapped Geometries
Multidimensional Integration
In particular,
The fundamental difficulty is that the discrete operators do not commute, except
under special circumstances (e.g., Fourier bases).
Velocity, u in PN, continuous
Pressure, p in PN-2, discontinuous
Formulations: PN – PN-2, PN – PN
Nek will ensure that the problem type is compatible with the discretization choice.
For most cases, speed is determined by the pressure solve, which addresses the fastest timescales in
the system (the acoustic waves).
– For PN - PN-2, the solver has been highly optimized over the last 15 years.
– The PN - PN version was developed by the ETH group (Tomboulides, Frouzakis, Kerkemeier) for
low Mach-number combustion and has only recently been folded into the production Nek5000
code.
There are many more, particularly for moving walls, free surface, etc.
Part 3
Workflow Example
Outline
Nek5000 capabilities
Equations, timestepping, and SEM basics
Workflow example
– Parallel / serial issues that you should understand
– Setting initial and boundary conditions
– Basic runtime analysis
Using VisIt to analyze results
Mesh generation options
– Building meshes with genbox, prenek, and morphing
Walking through examples; hands on simulations
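Parallel Structure
[Figure: five elements split across two processors]
global element numbering:  5 2 | 1 3 4
local element numbering:   2 1 | 1 2 3
processor:                proc 0 | proc 1
Parallel, local element numbering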
For the most part, don't care about global element numbering
– (We'll show some examples where one might)
Key point is that,
– on proc 0, nelt=2 (nelt = # elements in temperature domain)
– on proc 1, nelt=3 (nelv = # elements in fluid domain, usually = nelt)
– proc 0: nelt=2, lglel=(2,5)
– proc 1: nelt=3, lglel=(1,3,4)
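The local-to-global map on this slide can be mimicked with a small dictionary. This is a sketch for intuition only: the element lists are copied from the slide, while the dictionary layout and the helper `global_to_local` are hypothetical names introduced here, not Nek5000 data structures.

```python
# Local-to-global element map per processor, copied from the slide (1-based ids).
lglel = {0: [2, 5], 1: [1, 3, 4]}

# Each processor only knows how many elements it holds.
nelt = {proc: len(elems) for proc, elems in lglel.items()}

def global_to_local(local_to_global):
    """Invert the map: global element id -> (processor, local element id)."""
    owner = {}
    for proc, elems in local_to_global.items():
        for local_id, global_id in enumerate(elems, start=1):
            owner[global_id] = (proc, local_id)
    return owner

owner = global_to_local(lglel)
print(nelt)       # {0: 2, 1: 3}
print(owner[5])   # (0, 2): global element 5 is local element 2 on proc 0
```

User code normally loops over local elements 1..nelt and needs this kind of lookup only when something must be tied to a specific global element.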
Serial Structure
real u(lx1,ly1,lz1,lelt)
• Indicates that u is a collection of elements, e=1,…,Nelt <= lelt, each of size (N+1)^d, d=2 or 3
Serial and parallel versions share the same loop; the parallel version adds the final glsum call:
      s=0
      do e=1,nelv
      do iz=1,nz1
      do iy=1,ny1
      do ix=1,nx1
         s=s+u(ix,iy,iz,e)
      enddo
      enddo
      enddo
      enddo
      s=glsum(s,1)   ! parallel only: combine the local sums from all processors
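The pattern of a local loop followed by one global reduction can be imitated without MPI. The sketch below is a toy model with two pretend ranks; `local_sum` and `glsum_emulated` are stand-in names invented for this example, and the data are hypothetical.

```python
def local_sum(u_local):
    """Sum every point of every element stored on one rank."""
    s = 0.0
    for element in u_local:
        for value in element:
            s += value
    return s

def glsum_emulated(local_values):
    """Stand-in for the global reduction: every rank receives the same total."""
    total = sum(local_values)
    return [total for _ in local_values]

# Hypothetical data: 5 elements of 4 points each; element e holds the value e.
elements = [[float(e)] * 4 for e in range(1, 6)]
rank_data = [elements[:2], elements[2:]]      # 2 elements on rank 0, 3 on rank 1

partial = [local_sum(u) for u in rank_data]   # what each rank has before the reduction
totals = glsum_emulated(partial)              # what each rank has after it
print(partial)   # [12.0, 48.0]
print(totals)    # [60.0, 60.0]
```

Without the reduction each rank holds only its own partial sum, so the serial loop alone gives a different answer on every processor.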
      n=nx1*ny1*nz1*nelv
      s=0
      do …
      s=glmax(s,1)   ! parallel only: maximum over all processors
      n=nx1*ny1*nz1*nelv
      s=glsum(u,n)   ! sum over all points on all processors
      s=vlsum(u,n)   ! sum over the points on this processor only
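The one-line form works because a Fortran array u(lx1,ly1,lz1,lelt) is stored contiguously with the element index varying slowest, so the first n = nx1*ny1*nz1*nelv entries are exactly the points of the elements in use. The sketch below checks this with a flat Python list; the sizes are hypothetical, `flat_index` and `vlsum_emulated` are names made up for this example, and it assumes nx1 = lx1 (and likewise in y and z) so that the storage has no gaps.

```python
nx1, ny1, nz1 = 2, 2, 1     # hypothetical points per element (2D case, nz1 = 1)
nelv, lelt = 3, 4           # 3 elements in use, storage declared for 4

def flat_index(ix, iy, iz, e):
    """0-based offset of u(ix,iy,iz,e) in column-major storage (1-based arguments)."""
    return (ix - 1) + nx1 * ((iy - 1) + ny1 * ((iz - 1) + nz1 * (e - 1)))

def vlsum_emulated(vec, n):
    """Local sum of the first n entries of a flat vector."""
    return sum(vec[:n])

# Fill unused storage with a sentinel so that reading past n would be noticed.
storage = [99.0] * (nx1 * ny1 * nz1 * lelt)
nested = 0.0
for e in range(1, nelv + 1):
    for iz in range(1, nz1 + 1):
        for iy in range(1, ny1 + 1):
            for ix in range(1, nx1 + 1):
                storage[flat_index(ix, iy, iz, e)] = float(e)
                nested += float(e)

n = nx1 * ny1 * nz1 * nelv
flat = vlsum_emulated(storage, n)
print(nested, flat)   # 24.0 24.0
```

The nested loop and the flat sum agree, and the sentinel values in the unused element never enter the result.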
Eddy Example
A:
– Make a new mesh
– Change the bcs in .rea and .usr files
– Look at the error
genbox
genbox
[Figure: 2D box with axes x, y and face labels f1, f2, f3]
– f3: “wall”
– f4: “symmetry”
genbox example, 2D
genbox, 3D
[Figure: 3D box with axes x, y, z and face labels f1, f2, f3, f4, f6]
genbox conventions
# indicates comment
If nelx (y, or z) < 0, user provides x_0 < x_nelx, and ratio, so that
domain [x_0, x_nelx] is partitioned into |nelx| subdomains, with dx_(i+1) = ratio*dx_i
If ndim < 0, genbox generates .rea and .re2 (binary) file [new convention]
Base input file must match dimension (2D or 3D) of the given case
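The geometric spacing rule in the conventions above, dx_(i+1) = ratio*dx_i, fixes every element boundary once the end points, the element count, and the ratio are known. The sketch below works that out; it is not genbox's source, `geometric_breakpoints` is a name invented here, and `nel` is the positive element count (the absolute value of the negative nelx given in the input file).

```python
def geometric_breakpoints(x0, x1, nel, ratio):
    """Element boundaries on [x0, x1] with dx[i+1] = ratio * dx[i]."""
    if ratio == 1.0:
        first = (x1 - x0) / nel
    else:
        # Sum of the geometric series: first * (ratio**nel - 1) / (ratio - 1) = x1 - x0.
        first = (x1 - x0) * (ratio - 1.0) / (ratio ** nel - 1.0)
    points = [x0]
    dx = first
    for _ in range(nel):
        points.append(points[-1] + dx)
        dx *= ratio
    points[-1] = x1   # remove accumulated round-off at the end point
    return points

pts = geometric_breakpoints(0.0, 1.0, 3, 2.0)
print(pts)   # widths 1/7, 2/7, 4/7
```

A ratio above 1 concentrates small elements near x_0, and a ratio below 1 concentrates them near the other end.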