Nek5000 Tutorial: Velocity Prediction, ANL MAX Experiment

This document provides an overview of the Nek5000 tutorial. It will introduce the capabilities and features of the Nek5000 computational fluid dynamics software, including the spectral element method discretization. The tutorial will demonstrate how to set up and run basic flow simulations in Nek5000 and analyze the results using VisIt. Hands-on examples will allow participants to simulate flows on their own.

Uploaded by

Smitten Clark

Machine Translated by Google

Nek5000 Tutorial

Velocity prediction, ANL MAX experiment.

Paul Fischer
Aleks Obabko
Stefan Kerkemeier
James Lottes

Katie Heisey
Shashi Aithal
Yulia Peet

Mathematics and Computer Science Division


Argonne National Laboratory

Presenters

Paul Fischer
– spectral element overview
– Nek5000
– Prenek

Aleks Obabko ( & Hank Childs, LBL)


– VisIt overview

Additional help
– Shashi Aithal – Nek5000 on fusion, RANS development
– Yulia Peet – multidomain coupling
– Katie Heisey – automated build/test suite, example
suite, mesh partitioner
– Stefan Kerkemeier – principal software engineer

Objectives

Course Objectives:

– Provide an overview of Nek5000 capabilities


– Introduce users to Nek5000 and VisIt usage

By the end of the day, you should be able to run some basic flow simulations


Outline

Nek5000 capabilities
Equations, timestepping, and SEM basics
Workflow example
– Setting initial and boundary conditions
– Basic runtime analysis
– Parallel / serial issues that you should understand
Using VisIt to analyze results
Mesh generation options
– Building meshes with genbox, prenek, and morphing
Walking through examples; hands on simulations


Some Resources

Nek5000 wiki page (google nek5000)

www.mcs.anl.gov/~fischer/Nek5000


Part I

Nek5000 capabilities
– Gallery
– Brief history
– Equations solved
– Features overview:
• Spectral element discretization
• Convergence properties (nek5_svn/examples)
• Scalability


Applications

Clockwise from upper left:


Reactor thermal-hydraulics
Astrophysics
Combustion
Oceanography
Vascular flow modeling


Coarse DNS: Channel Flow at Reb=13,000

Simulations by J. Ohlsson, KTH, Stockholm


Low Re Turbulence in Complex Domains

Arteriovenous graft flow @ Re=1200

Loth, F., Bassiouny, Ann. Rev. Fluid Mech. (2008)



Influence of Reynolds Number and Flow Division on urms

Validated simulations allow prediction of the relative influences of flow division and Reynolds number on transition to turbulence in arteriovenous grafts.


Nek5000 LES Validation: T-Junction Studies


E. Merzari ANL

Square T-junction simulation and comparison with experimental data


– 20 M points, first point at y+ < 1, Reout = 7000

1 Merzari et al., Proper Orthogonal Decomposition of the flow in a T-junction, Proc. ICAPP (2010)
2 Hirota et al., Exptl Study on Turbulent Flow and Mixing in Counter-Type T-junction, J. Therm. Sci. & Tech. 3, 157 – 58 (2008)


NEA/OECD Blind T-Junction Benchmark

Thermal striping experiment with hot/cold inlets at Re ~ 10^5


Inlet velocity and temperature data provided by Vattenfall.
Of 29 entries, Nek5000 submission ranked 1st and 6th,
respectively, in temperature and velocity prediction (CFD4NRS 2010)


Velocity Comparison Downstream of T-junction

Medium resolution results are in excellent agreement at x=1.6 & 2.6

Experiment (Re=90K) exhibits more rapid recovery of profile than simulation (Re=40K)

[Figure: velocity profiles at x/D = 1.6, 2.6, 3.6, and 4.6, plotted against horizontal position y and vertical position z. Curves: Lo-res Re=40K, Med-res Re=40K, Expt Re=90K]



Parallel Scaling: Subassembly 217 Wire-Wrapped Pins

– 3 million 7th-order spectral elements (n = 1.01 billion)
– 16384–131072 processors of IBM BG/P
– Parallel efficiency η = 0.8 at P = 131072 (7300 pts/processor)

www.mcs.anl.gov/~fischer/sem1b

Nek5000 / Star Cross-Channel Velocity Comparison

HEDL geometry
Reh = 10,500

W.D. Pointer et al., Simulations of Turbulent Diffusion in Wire-Wrapped Sodium Fast Reactor Fuel Assemblies, Best Paper Award, FR09, Kyoto (2009)


Nek5000 Brief History

DNS / LES code for fluid dynamics, heat transfer, MHD, combustion,…
– 100K lines of code: f77 (70K) & C (30K)
– Interfaces w/ VisIt & MOAB/Cubit

Based on high-order spectral element method (Patera '84, Maday & Patera '89)
– Started as Nekton 2.0. First 3D SEM code. (F., Ho, & Ronquist, '86-'89)

– First commercially available code for distributed-memory computers (marketed by Fluent as Nekton into the mid-90s)

Nek5000 is a highly scalable variant of Nekton
– Gordon Bell Prize in HPC, 4096 processors (Tufo & F. '99)
– 20% of peak on 262,000 processors of BG/P (Kerkemeier, Parker & F. '10)


Spectral Element Overview

High-order FEM featuring
– Minimal numerical dispersion/dissipation (Nth-order accuracy, N=5-15)
– Loosely coupled elements (C0 continuity between elements)
– Tightly coupled dofs within elements (full stiffness matrices, never formed)

Standard domain decomposition + message-passing based parallelism

Iterative solvers imply local work with dense operators, followed by data exchanges to update interface values


Why High-Order ?

Large problem sizes enabled by peta- and exascale computers allow propagation of small features (size λ) over distances L >> λ.

– Dispersion errors accumulate linearly with time:
  ~ |correct speed – numerical speed| * t (for each wavenumber)
  error at t_final ~ (L / λ) * |numerical dispersion error|

– For fixed final error ε_f, require: numerical dispersion error ~ (λ / L) ε_f << 1

– High-order methods most efficiently deliver small dispersion errors (Kreiss & Oliger 72, Gottlieb et al. 2007)
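
A rough worked example of the error budget above may help. The function name and all numbers are illustrative assumptions, not values from the slides: a feature carried over 100 times its own size, with a 1% target for the final error.

```python
# Illustrative dispersion-error budget; every number here is an assumption.

def allowed_dispersion_error(feature_size, distance, final_error):
    """Dispersion error allowed per feature length so that the error
    accumulated over distance/feature_size lengths stays near final_error."""
    lengths_travelled = distance / feature_size
    return final_error / lengths_travelled

# A feature of size 1 carried over a distance of 100 with a 1% target.
budget = allowed_dispersion_error(feature_size=1.0, distance=100.0, final_error=0.01)
print(f"allowed numerical dispersion error: {budget:.1e}")
```

The longer the propagation distance relative to the feature size, the smaller the per-length error the scheme has to deliver, which is the argument for high order.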


Spectral Element Convergence: Exponential with N


SEM Excellent transport properties, even for non-smooth solutions

Convection of non-smooth data on a 32x32 grid (K1 x K1 spectral elements of order N). (cf. Gottlieb & Orszag 77)


Strengths of Nek5000

High-order accuracy at low cost
– Extremely rapid (exponential) convergence in space
– 3rd-order accuracy in time

Highly scalable
– Fast scalable multigrid solvers
– Scales to > 290,000 processors with ~10^4 pts/proc on BG/P

Extensively tested
– > 10s of platforms over 25 years
– > 150 journal articles & > 60 users worldwide
– > 400 tests after each build to ensure verified source (more tests to be added)

Solver Performance: Hybrid Schwarz-Multigrid

Magneto-rotational instability (Obabko, Cattaneo & F.)
– E=140000, N=9 (n = 112 M), P=32768 (BG/L)
– ~1.2 sec/step
– ~8 iterations/step for U & B
– Key is to have a scalable coarse-grid solver

[Figure: iterations per step for U and B]


Scaling to P=262144 Cores
Stefan Kerkemeier, ETHZ/ANL

Production combustion and reactor simulations on ALCF BG/P demonstrate scaling to P=131072 with n/P ~ 5000-10,000 and η ~ 0.7

Test problem with 7 billion points scales to P=262144 on Julich BG/P with η ~ 0.7 – tests 64-bit global addressing for gs communication framework

BG/P strong scaling, P=8192 – 131072. Parallel efficiency for autoignition application:
– > 83% on P=131K, for n/P ~ 6200, E=810,000, N=9
– > 73% on P=131K, for n/P ~ 3100, E=810,000, N=7

BG/P strong scaling, P=32768 – 262144. Parallel efficiency, model problem:
– > 70% on P=262K
– > 7 billion points (tests n > 2^31)

Limitations of Nek5000

No steady-state NS or RANS:
– unsteady RANS under development / test – Aithal

Lack of monotonicity for under-resolved simulations
– limits, e.g., LES + combustion
– A high priority for 2011-12

Meshing complex geometries:
– fundamental: meshing always a challenge; hex-based meshes intrinsically anisotropic
– technical: meshing traditionally not supported as part of advanced modeling development

Mesh Anisotropy

A common refinement scenario (somewhat exaggerated):

[Figure: refinement in region of interest yields unwanted high aspect-ratio cells in the far field]

Refinement propagation leads to
– unwanted elements in far-field
– high aspect-ratio cells that are detrimental to iterative solver performance (F. JCP'97)

Some Meshing Options

genbox: unions of tensor-product boxes

prenek: basically 2D + some 3D or 3D via extrusion (n2to3)

Grow your own: 217 pin mesh via matlab; BioMesh

3rd party: CUBIT + MOAB, TrueGrid, Gambit, Star CD

Morphing:


Part 2 (a)

Equations, timestepping, and spectral element formulation

…but first, a bit of code structure.


nek5_svn repository

Key subdirectories in the repo:

– nek5_svn

• trunk
– nek – makenek script and source files
– tools – several utilities (prenek, genbox, etc.) and scripts

• examples – several case studies

Typical steps to run a case:
– Create a working directory and copy contents of a similar example case to this directory
– Modify case files to suit
– Copy makenek from nek and type makenek <case>
– Run job using a script (tools/scripts) and analyze results (postx/VisIt)

nek5_svn repository

Top level:

nek5_svn
|-- 3rd_party
|-- branches
|-- examples
|   |-- axi
|   |-- benard
|   |-- conj_ht
|   |-- eddy
|   |-- fs_2
|   |-- fs_hydro
|   |-- kovasznay
|   |-- lowMach_test
|   |-- moab
|   |-- peris
|   |-- pipe
|   |-- rayleigh
|   |-- shear4
|   |-- timing
|   |-- turbChannel
|   |-- turbJet
|   `-- vortex
|-- tags
|-- tests
`-- trunk

Inside trunk:

nek5_svn
|-- :
`-- trunk
    |-- nek
    |   |-- source files….
    |   `-- :
    `-- tools
        |-- amg_matlab
        |-- avg
        |-- genbox
        |-- genmap
        |-- makefile
        |-- maketools
        |-- n2to3
        |-- nekmerge
        |-- postnek
        |-- prenek
        |-- reatore2
        `-- scripts

Base Nek5000 Case Files

SIZE – an f77 include file that determines
– spatial dimension (ldim = 2 or 3)
– approximation order (lx1, lx2, lx3, lxd); N := lx1-1
– upper bound on number of elements per processor: lelt
– upper bound on total number of elements: lelg

<case>.rea – a file specifying
– job control parameters (viscosity, dt, Nsteps, integrator, etc.)
– geometry: element vertex and curvature information
– boundary condition types
– restart conditions

<case>.usr – f77 source file specifying
– initial and boundary conditions
– variable properties
– forcing and volumetric heating
– geometry morphing
– data analysis options: min/max, runtime average, rms, etc.
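
To get a feel for what the SIZE parameters imply, here is a small sizing sketch in Python. The function and its return values are hypothetical helpers, not part of Nek5000, and the per-processor element bound assumes a perfectly balanced partition. The sample numbers (E = 810,000, N = 9, P = 131072) are the ones quoted for the autoignition scaling runs.

```python
import math

def size_estimates(lx1, ldim, nelgt, nprocs):
    """Rough sizing numbers implied by SIZE-style parameters (illustrative only)."""
    order = lx1 - 1                        # polynomial order N := lx1 - 1
    pts_per_elem = lx1 ** ldim             # (N+1)^d points stored per element
    total_pts = nelgt * pts_per_elem       # shared interface points counted per element
    min_lelt = math.ceil(nelgt / nprocs)   # assumes an evenly balanced partition
    return order, pts_per_elem, total_pts / nprocs, min_lelt

order, pts_per_elem, pts_per_proc, min_lelt = size_estimates(
    lx1=10, ldim=3, nelgt=810_000, nprocs=131_072)
print(order, pts_per_elem, round(pts_per_proc), min_lelt)
```

Under these assumptions lelt has to be at least min_lelt; in practice one would leave some headroom, since the actual partition need not be perfectly even.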

Snapshot of SIZE


Snapshots of .rea file

Parameters section Geometry and boundary conditions


Snapshot of .usr file


Derived Nek5000 Case Files

<case>.re2 – binary file specifying
– geometry: element vertex and curvature information
– boundary condition types
This file is not required for small problems but is important for element counts E > ~10,000.

<case>.map – ascii file derived from .rea/.re2 files specifying
– mesh interconnect topology
– element-to-processor map
This file is needed for each run and is generated by running the “genmap” tool (once, for a given .rea file).

amg…dat – binary files derived from .rea/.re2 files specifying
– algebraic multigrid coarse-grid solver parameters
These files are needed only for large processor counts (P > 10,000) and element counts (E > 50,000).


Part 2 (b)

Equations, timestepping, and spectral element formulation


Outline

Nek5000 capabilities
Equations, timestepping, and SEM basics
Workflow example
– Setting initial and boundary conditions
– Basic runtime analysis
– Parallel / serial issues that you should understand
Using VisIt to analyze results
Mesh generation options
– Building meshes with genbox, prenek, and morphing
Walking through examples; hands on simulations


Equation Sets (2D/3D)

Incompressible Navier-Stokes plus energy equation

plus additional passive scalars:

Also supports incompressible MHD, low Mach-number hydro, free-surface, and conjugate heat transfer formulations.
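
The equations on this slide did not survive extraction. For orientation, the standard textbook form of the incompressible Navier-Stokes and energy equations is sketched below. The notation (ρ, μ, c_p, k, body force f, volumetric heat source q''') is a conventional choice assumed here and is not copied from the slide; additional passive scalars are usually taken to satisfy an equation of the same form as the one for T.

```latex
\begin{aligned}
\rho\left(\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right)
  &= -\nabla p + \nabla\cdot\left(\mu\,\nabla\mathbf{u}\right) + \rho\,\mathbf{f},
  \qquad \nabla\cdot\mathbf{u} = 0, \\
\rho c_p\left(\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T\right)
  &= \nabla\cdot\left(k\,\nabla T\right) + q''' .
\end{aligned}
```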

Steady State Equations

Steady Stokes (plus boundary conditions):

Steady conduction (plus boundary conditions):


Constant Property Equation Set

Incompressible Navier-Stokes + energy equation

In Nek parlance, material properties are specified in the .rea file in either dimensional or nondimensional (convective time scale) form, or as variable properties in the f77 routine uservp() (.usr file).

Nek provides a scalable framework to advance these equations with user-defined properties. LES & RANS can be incorporated in this framework. (See /examples.)

Incompressible MHD

— plus appropriate boundary conditions on u and B

Typically, Re >> Rm >> 1

Semi-implicit formulation yields independent Stokes problems for u and B


Incompressible MHD, Elsasser Variables

– A pair of Oseen problems: z− convects z+, z+ convects z−

– Similar form for Re ≠ Rm exists.

— A reasonable starting point for LES development…



Timestepping


Navier-Stokes Time Advancement

Nonlinear term: explicit via BDFk/EXTk or characteristics (Pironneau '82)

Linear Stokes problem: pressure/viscous decoupling:
– 3 Helmholtz solves for velocity (“easy” w/ Jacobi-preconditioned CG)
– (consistent) Poisson equation for pressure (computationally dominant)


MHD Time Advancement

1. Compute nonlinear contributions (explicit, in Elsasser form, dealiased)
2. Solve well-conditioned Helmholtz problems for u_i^n, i=1,…,3
3. Filter u_i^n
4. Solve consistent Poisson problem for p^n
5. Compute div-free correction of u_i^n
6. Repeat 2. – 4. for B_i^n


Timestepping Design

Implicit:
– symmetric and (generally) linear terms,
– fixed flow rate conditions

Explicit:
– nonlinear, nonsymmetric terms,
– user-provided rhs terms, including
  • Boussinesq and Coriolis forcing

Rationale:
– div u = 0 constraint is fastest timescale
– Viscous terms: explicit treatment of 2nd-order derivatives requires Δt ~ O(Δx^2)
– Convective terms require only Δt ~ O(Δx)
– For high Re, temporal-spatial accuracy dictates Δt ~ O(Δx)
– Linear symmetric is “easy”; nonlinear nonsymmetric is “hard”

BDF2/EXT2 Example


BDF2/EXT2 Example


BDF2/EXT2 Example
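
The worked example on these slides was lost in extraction. As a stand-in, here is a minimal Python sketch of BDF2/EXT2 applied to a scalar model problem u' = a u + b u, where the a-term is treated implicitly (BDF2) and the b-term is extrapolated from the two previous steps (EXT2). The model problem, coefficients, and function names are assumptions chosen for illustration; this is not the slide's example and not Nek5000 code.

```python
import cmath

def bdf2_ext2(lam_implicit, lam_explicit, dt, t_end):
    """Advance u' = lam_implicit*u + lam_explicit*u from u(0) = 1.

    BDF2 for the time derivative and the implicit term:
        (3 u^{n+1} - 4 u^n + u^{n-1}) / (2 dt) = lam_implicit * u^{n+1} + EXT2
    EXT2 for the explicit term:
        EXT2 = 2 * lam_explicit * u^n - lam_explicit * u^{n-1}
    """
    lam = lam_implicit + lam_explicit
    nsteps = round(t_end / dt)
    u_prev = 1.0 + 0.0j                  # u at t = 0
    u_curr = cmath.exp(lam * dt)         # exact value used to start the two-step scheme
    for _ in range(nsteps - 1):
        explicit = 2.0 * lam_explicit * u_curr - lam_explicit * u_prev
        rhs = 4.0 * u_curr - u_prev + 2.0 * dt * explicit
        u_next = rhs / (3.0 - 2.0 * dt * lam_implicit)
        u_prev, u_curr = u_curr, u_next
    return u_curr

def error(dt):
    lam_i, lam_e = -0.1, 1.0j            # weak "diffusion", purely imaginary "convection"
    exact = cmath.exp((lam_i + lam_e) * 1.0)
    return abs(bdf2_ext2(lam_i, lam_e, dt, 1.0) - exact)

ratio = error(0.01) / error(0.005)
print(f"error ratio when dt is halved: {ratio:.2f}")
```

For a second-order scheme, halving dt should cut the error by roughly a factor of four, which is what the printed ratio checks.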


Stability of ABk, BDFk/EXTk Timesteppers

Derived from model problem: du/dt = λu

Crucially, the chosen schemes encompass part of the imaginary axis. Important for high Reynolds number flows.

[Figure: stability regions in the λΔt plane]


BDFk/EXTk

BDF3/EXT3 is essentially the same as BDF2/EXT2
– O(Δt^3) accuracy
– essentially the same cost
– accessed by setting Torder=3 (2 or 1) in .rea file

For convection-diffusion and Navier-Stokes, the “EXTk” part of the timestepper implies a CFL (Courant-Friedrichs-Lewy) constraint

For the spectral element method, Δx ~ N^-2, which is restrictive.
– We therefore often use a characteristics-based timestepper. (IFCHAR = T in the .rea file)
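
The N^-2 scaling of the smallest grid spacing can be checked numerically. The sketch below computes Gauss-Lobatto-Legendre points on [-1, 1] with a standard Newton iteration and prints the smallest spacing multiplied by N^2, which should stay roughly constant. The function name is hypothetical and the code is an independent illustration, not taken from Nek5000.

```python
import math

def gll_nodes(order):
    """Gauss-Lobatto-Legendre nodes on [-1, 1] for polynomial order `order` (>= 2),
    found by Newton iteration on x*P_N(x) - P_{N-1}(x) = 0."""
    n = order
    x = [-math.cos(math.pi * k / n) for k in range(n + 1)]    # Chebyshev initial guess
    for _ in range(100):
        max_change = 0.0
        for i in range(n + 1):
            p_prev, p = 1.0, x[i]                              # P_0 and P_1
            for k in range(2, n + 1):                          # recurrence up to P_N
                p_prev, p = p, ((2 * k - 1) * x[i] * p - (k - 1) * p_prev) / k
            step = (x[i] * p - p_prev) / ((n + 1) * p)         # Newton update
            x[i] -= step
            max_change = max(max_change, abs(step))
        if max_change < 1e-15:
            break
    return x

scaled = []
for order in (4, 8, 16):
    x = gll_nodes(order)
    dx_min = x[1] - x[0]                                       # spacing next to the endpoint
    scaled.append(dx_min * order ** 2)
    print(f"N={order:2d}  min spacing={dx_min:.4f}  N^2*spacing={scaled[-1]:.2f}")
```

Because an explicit CFL limit scales with the smallest spacing, doubling N cuts the allowable explicit timestep by about four on a fixed element size.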


Characteristics Timestepping

Apply BDFk to material derivative, eg, for k=2:

Amounts to finite-differencing along the characteristic leading into xj


Characteristics Timestepping

Δt can be >> Δt_CFL (e.g., Δt ~ 5-10 x Δt_CFL)

Don't need position (e.g., X_j^{n-1}) of characteristic departure point, only the value of u^{n-1}(x) at these points.

These values satisfy the pure hyperbolic problem:

which is solved via explicit timestepping with Δs ~ Δt_CFL


Spatial Discretization


Spectral Element Method (Patera 84, Maday & Patera 89)

Variational method, similar to FEM, using GL quadrature.

Domain partitioned into E high-order quadrilateral (or hexahedral) elements (decomposition may be nonconforming - localized refinement)

Trial and test functions represented as Nth-order tensor-product polynomials within each element. (N ~ 4 -- 15, typ.)

EN^3 gridpoints in 3D, EN^2 gridpoints in 2D.

Converges exponentially fast with N for smooth solutions.

[Figure: 3D nonconforming mesh for arterio-venous graft simulations: E = 6168 elements, N = 7]


Spectral Element Method: Poisson Example


Spectral Element Method: Poisson Example


SEM Function Representation

Key point is that there is a continuous representation of all variables:

Since φ_j(x) is known a priori, we know how to differentiate and integrate.

Moreover, choose the φ_j to be computationally convenient


SEM Function Representation

SEM choices for φ_j:

– High-order polynomials on each element

– Compactly supported (sparse matrices, highly parallel)

– Stable Lagrangian interpolants:
  • Basis coefficients are also grid-point values
    – Easy to implement boundary conditions
  • Grid-points chosen to be Gauss-Lobatto-Legendre quadrature points: diagonal mass matrix and low-cost operator evaluation

– Local tensor-product bases:
  • ijk indexing (low storage & minimal indirect addressing)
  • Matrix-free fast tensor-product operator evaluation (Orszag '80): memory is O(n), work is O(nN), not O(nN^3) !!

How to get to high-order? Step 1: 1D

Stable high-order basis for Nth-order polynomial approximation space:

– poor choices:

– good choices:

hi(x)


Condition Number of 1D Stiffness Matrix

GLL nodal basis: good conditioning, minimal round-off error

[Figure: condition number vs. N for three bases: monomials x^k, uniform points, and GLL points (~ N^3)]


How to get to high-order? Step 2: 1D

Replace integrals with Gauss-Lobatto-Legendre quadrature:

with

where

Yields a diagonal mass matrix; preserves spectral accuracy. (However, beware stability issues….)
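
The quadrature formulas on this slide were lost in extraction. As an independent illustration, the Python sketch below computes Gauss-Lobatto-Legendre nodes and weights with a standard Newton iteration and checks the quadrature on a polynomial. The function name and test integrand are assumptions; an (N+1)-point GLL rule is exact for polynomials up to degree 2N-1.

```python
import math

def gll_nodes_weights(order):
    """GLL nodes and weights on [-1, 1] for polynomial order `order` (>= 2)."""
    n = order
    x = [-math.cos(math.pi * k / n) for k in range(n + 1)]    # Chebyshev initial guess
    w = [0.0] * (n + 1)
    for i in range(n + 1):
        for _ in range(100):
            p_prev, p = 1.0, x[i]                              # P_0 and P_1
            for k in range(2, n + 1):                          # recurrence up to P_N
                p_prev, p = p, ((2 * k - 1) * x[i] * p - (k - 1) * p_prev) / k
            step = (x[i] * p - p_prev) / ((n + 1) * p)         # Newton update
            x[i] -= step
            if abs(step) < 1e-15:
                break
        w[i] = 2.0 / (n * (n + 1) * p * p)                     # w_i = 2 / (N (N+1) P_N(x_i)^2)
    return x, w

# N = 4 (five points) integrates x^6 exactly, since 6 <= 2N - 1 = 7.
x, w = gll_nodes_weights(4)
approx = sum(wi * xi ** 6 for xi, wi in zip(x, w))
print(approx, 2.0 / 7.0)
```

With Lagrange basis functions defined on these same nodes, the quadrature of a product of two basis functions is nonzero only when they coincide, which is why the mass matrix comes out diagonal with the weights on its diagonal.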

Extension to 2D

Nodal bases on the Gauss-Lobatto-Legendre points:

basis coefficients

N=4

N=10


Matrix-Matrix Based Derivative Evaluation

Local tensor-product form (2D),

hi(r)

allows derivatives to be evaluated as matrix-matrix products:

mxm
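
The formulas on this slide were lost, so here is a small Python sketch of the idea: with values stored as u[i][j] on a tensor grid, the derivative along each index is a single matrix-matrix product with the 1D derivative matrix. The 3-point equispaced derivative matrix and the helper names are illustrative assumptions; a real SEM code would use the GLL points and an optimized mxm kernel.

```python
def matmul(a, b):
    """Plain dense matrix-matrix product (the role played by an mxm kernel)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Hypothetical 1D derivative matrix for Lagrange interpolants on {-1, 0, 1};
# it differentiates polynomials up to degree 2 exactly.
D = [[-1.5, 2.0, -0.5],
     [-0.5, 0.0, 0.5],
     [0.5, -2.0, 1.5]]
pts = [-1.0, 0.0, 1.0]

# u(r, s) = r^2 * s sampled on the tensor grid, stored as u[i][j] = u(r_i, s_j).
u = [[r * r * s for s in pts] for r in pts]

du_dr = matmul(D, u)               # derivative along the first index:  D u
du_ds = matmul(u, transpose(D))    # derivative along the second index: u D^T
print(du_dr)
print(du_ds)
```

Each product costs O(N^3) operations for the (N+1)^2 points of a 2D element, compared with O(N^4) if the full Kronecker-product matrix were formed and applied.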


Mapped Geometries

2D basis function, N=10


Notes about Mapped Elements

Best to use affine (i.e., linear) transformations in order to preserve underlying GLL spacing for stability and accurate quadrature.

Avoid singular corners: ~180° or ~0°

Avoid high-aspect-ratio cells, if possible


Multidimensional Integration

Given that we have Lagrangian interpolants based on GLL quadrature points, we have

In particular,

In Nek, this vector reduction is implemented as: alpha = glsc2(u,bm1,n)
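
The reduction named above can be pictured as a weighted dot product on each processor followed by a global sum. The Python sketch below mimics that pattern with two simulated ranks. The helper names and data are hypothetical, and the reading of glsc2 as "local sum of u*bm1, then all-reduce" is inferred from the slide rather than taken from the Nek5000 source.

```python
def local_weighted_sum(u, bm, n):
    """Sum of u[i] * bm[i] over the first n local points (one rank's share)."""
    return sum(u[i] * bm[i] for i in range(n))

def global_sum(partials):
    """Stand-in for the all-reduce that combines every rank's partial sum."""
    return sum(partials)

# Two hypothetical ranks, each holding values u and diagonal mass-matrix
# entries bm (quadrature weight times Jacobian) for its own grid points.
rank_data = [
    ([1.0, 2.0, 3.0], [0.5, 1.0, 0.5]),
    ([4.0, 5.0],      [0.25, 0.25]),
]
alpha = global_sum(local_weighted_sum(u, bm, len(u)) for u, bm in rank_data)
print(alpha)
```

Because the mass matrix is diagonal, the integral of u over the whole domain reduces to this single weighted sum.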



Local “Matrix-Free” Stiffness Matrix in 3D

For a deformed spectral element, Ω^k,

Operation count in R^d is only O(N^{d+1}), not O(N^{2d}) [Orszag '80]

Memory access is 7 x number of points (G_rr, G_rs, etc., are diagonal)

Work is dominated by matrix-matrix products involving D_r, D_s, etc.


Generic SEM Operator Evaluation

Spectral element coefficients stored on element basis (u_L, not u)

– local work (matrix-matrix products)
– nearest-neighbor (gather-scatter) exchange

Decouples complex physics (A_L) from communication (QQ^T)
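
The gather-scatter step can be illustrated on a toy mesh. In the Python sketch below, values stored element by element are summed wherever elements share a global point, and the sum is copied back to every local copy. The function name, the local-to-global map, and the data are hypothetical; this shows the QQ^T idea only and is not the gs library used by Nek5000.

```python
def gather_scatter(u_local, local_to_global):
    """Apply QQ^T: sum local values that share a global id, then copy the sum back."""
    totals = {}
    for elem_vals, elem_ids in zip(u_local, local_to_global):
        for val, gid in zip(elem_vals, elem_ids):
            totals[gid] = totals.get(gid, 0.0) + val         # Q^T: gather (sum)
    return [[totals[gid] for gid in elem_ids]                 # Q: scatter (copy)
            for elem_ids in local_to_global]

# Two 1D elements with three points each; global point 2 is shared.
local_to_global = [[0, 1, 2], [2, 3, 4]]
u_local = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
result = gather_scatter(u_local, local_to_global)
print(result)
```

Only the shared interface value changes; interior values pass through untouched, which is why the exchange involves nearest neighbors only.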


Navier-Stokes Discretization Options

Imposition of the constraint div u = 0 is a major difficulty in solving the incompressible Navier-Stokes equations, both from theoretical and implementation perspectives.

Was not well-understood till the mid-80s (give or take…).

The fundamental difficulty is that the discrete operators do not commute, except
under special circumstances (eg, Fourier bases).

Nek supports two distinct approaches:


– Option 1 (PN-PN-2):
• discretize in space using compatible approximation spaces
• solve coupled system for pressure/velocity

– Option 2 (PN-PN, or splitting):


• discretize in time first
• take continuous divergence of momentum equation to arrive at a
Poisson equation for pressure, with special boundary conditions

P_N – P_{N-2} Spectral Element Method for Navier-Stokes (MP 89)

Velocity, u in P_N, continuous
Pressure, p in P_{N-2}, discontinuous

[Figure: Gauss-Lobatto-Legendre points (velocity); Gauss-Legendre points (pressure)]

Consistent Splitting for Unsteady Stokes (MPR 90, Blair-Perot 93, Couzy 95)

E – consistent Poisson operator for pressure, SPD
– boundary conditions applied in velocity space
– most compute-intensive phase

Comparison of P_N – P_{N-2} and P_N – P_N Options in Nek

                        P_N – P_{N-2}                    P_N – P_N
– SIZE:                 lx2=lx1-2                        lx2=lx1
– pressure:             discontinuous                    continuous
– solver:               E = D B^-1 D^T                   A (std. Laplacian)
– preconditioner:       SEMG                             Schwarz (but to be upgraded)
– free-surface:         Yes                              No
– ALE:                  Yes                              No
– low Mach:             No                               Yes
– LES:                  OK                               Better
– low Re:               Better                           OK
– var. prop.:           implicit (stress formulation)    semi-implicit
– spectrally accurate:  Yes                              Yes

Nek will ensure that the problem type is compatible with the discretization choice.

For most cases, speed is determined by the pressure solve, which addresses the fastest timescales in
the system (the acoustic waves).

– For PN - PN-2, the solver has been highly optimized over the last 15 years.

– The PN - PN version was developed by the ETH group (Tomboulides, Frouzakis, Kerkemeier) for
low Mach-number combustion and has only recently been folded into the production Nek5000
code.

Navier-Stokes Boundary Conditions

A few key boundary conditions are listed below.

There are many more, particularly for moving walls, free surface, etc.

Special conditions include:


– Recycling boundary conditions (special form of “v”)
– Accelerated outflow to avoid incoming characteristics

Thermal Boundary Conditions

A few key boundary conditions are listed below.


Part 3

Workflow Example


Outline

Nek5000 capabilities
Equations, timestepping, and SEM basics
Workflow example
– Parallel / serial issues that you should understand
– Setting initial and boundary conditions
– Basic runtime analysis
Using VisIt to analyze results
Mesh generation options
– Building meshes with genbox, prenek, and morphing
Walking through examples; hands on simulations


Serial / Parallel Issues

Locally, the SEM is structured.

Globally, the SEM is unstructured.

Vectorization and serial performance derive from the structured aspects of the computation.

Parallelism and geometric flexibility derive from the unstructured, element-by-element, operator evaluation.

Elements, or groups of elements, are distributed across processors, but an element is never subdivided.


Parallel Structure

Elements are assigned in ascending order to each processor

[Figure: five elements split across two processors]

                                    proc 0 | proc 1
Serial, global element numbering:    5  2  |  1  3  4
Parallel, local element numbering:   2  1  |  1  2  3


Parallel Structure

[Figure: five elements split across two processors]

                                    proc 0 | proc 1
Serial, global element numbering:    5  2  |  1  3  4
Parallel, local element numbering:   2  1  |  1  2  3

For the most part, don't care about global element numbering
– (We'll show some examples where one might)

Key point is that,
– on proc 0, nelt=2
– on proc 1, nelt=3
(nelt = # elements in temperature domain)
(nelv = # elements in fluid domain, usually = nelt)

Parallel Structure

[Figure: five elements split across two processors]

                                    proc 0 | proc 1
Serial, global element numbering:    5  2  |  1  3  4
Parallel, local element numbering:   2  1  |  1  2  3

Arrays that distinguish which processor has which elements:
– proc 0: nelt=2, lgel=(2,5)
– proc 1: nelt=3, lgel=(1,3,4)

Common arrays (scaling as nelgt, but only two such arrays):
– gllel=(1,1,2,3,2), gllnid=(1,0,1,1,0)
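
How these arrays relate to one another can be checked with a few lines of Python. The helper below is a hypothetical illustration, not Nek5000 code: starting from gllnid (the owning processor of each global element), it rebuilds lgel for each processor and the global-to-local index gllel, using the five-element example from the slide.

```python
def build_maps(gllnid, nprocs):
    """From gllnid (owning rank of each global element, elements numbered from 1)
    build lgel for each rank and the global-to-local index gllel."""
    lgel = [[] for _ in range(nprocs)]
    gllel = []
    for global_id, rank in enumerate(gllnid, start=1):
        lgel[rank].append(global_id)      # ascending global order on each rank
        gllel.append(len(lgel[rank]))     # 1-based local index on that rank
    return lgel, gllel

lgel, gllel = build_maps([1, 0, 1, 1, 0], nprocs=2)
print("lgel on proc 0:", lgel[0])
print("lgel on proc 1:", lgel[1])
print("gllel:", gllel)
```

The per-processor list lengths are the nelt values, and only gllel and gllnid need storage proportional to the global element count.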

Serial Structure

All data contiguously packed (and quad-aligned):

real u(lx1,ly1,lz1,lelt)
• Indicates that u is a collection of elements, e=1,…,nelt <= lelt, each of size (N+1)^d, d=2 or 3


Serial / Parallel Usage

A common operation (1st way…)

Serial version:

    s=0
    do e=1,nelv
    do iz=1,nz1
    do iy=1,ny1
    do ix=1,nx1
       s=s+u(ix,iy,iz,e)
    enddo,…,enddo

Parallel version: the same loop, followed by

    s=glsum(s,1)


Serial / Parallel Usage

A common operation (2nd way…)

Serial version:

    n=nx1*ny1*nz1*nelv
    s=0
    do i=1,n
       s=s+u(i,1,1,1)
    enddo

Parallel version: the same loop, followed by

    s=glsum(s,1)


Serial / Parallel Usage

A common operation (3rd way…)

Serial and parallel versions are identical:

    n=nx1*ny1*nz1*nelv
    s=glsum(u,n)

– If you want a local sum:

    s=vlsum(u,n)

– Note: Important that every processor calls glsum()!!


Structure of .usr file

Let's look at a file!


Structure of .rea file

Let's look at Kovasznay example…


Starting Nek5000 on Fusion

Install source and build tools
– ssh to fusion.lcrc.anl.gov
– Add +pgi-9.0 to your .soft file and “resoft”
– svn co https://svn.mcs.anl.gov/repos/nek5 nek5_svn
– cd nek5_svn/trunk/tools and specify compiler in "m…":
    F77="pgf77"
    CC="pgcc"
– maketools all


Running First Case: Eddy Problem

cd ~/nek5_svn/examples; mkdir t1; cd t1; cp ../eddy/* .
cp ~/nek5_svn/trunk/nek/makenek .
makenek eddy_uv
nekb eddy_uv 1      (runs on 1 node = 8 cores)

– Results output to:
  • logfile – stdout: timestepping info, computed errors, etc.
  • eddy_uv.fld01,…,eddy_uv.fld12 – velocity & pressure distributions (binary)


A quick peek at the data

Type “postx &”, then

   click            type    comment
1. SET TIME         12      load fld12
2. SET QUANTITY
3. VORTICITY
4. PLOT

Final error is in eddy_uv.fld11

To check the error:

   click            type    comment
1. SET TIME         11      load fld11
2. SET QUANTITY
3. VELOCITY
4. MAGNITUDE
5. PLOT

Eddy Example

Q: What does the error look like with inflow/outflow boundary conditions?

A:
– Make a new mesh
– Change the bcs in .rea and .usr files
– Look at the error

To build the new mesh, we'll use genbox


genbox


genbox

genbox provides a simple way to generate a basic box mesh comprising an nelx x nely x nelz array of elements, or a composite mesh with several boxes.

It uses an existing base mesh as input to specify parameters, etc. and generates a new set of elements and associated boundary conditions.

The output is "box.rea"

One can then run "genmap"

Assuming the code is already compiled with an appropriate .usr file, one can then run Nek5000


genbox

genbox geometry (2D) – uses a symmetric face ordering

[Figure: box in the x-y plane with faces f1 (left), f2 (right), f3 (bottom), f4 (top)]

BC: v ,O ,W ,SYM yields
– f1: “velocity”
– f2: “outflow”
– f3: “wall”
– f4: “symmetry”

genbox example, 2D

genbox generates a 2D or 3D input file "box.rea"


genbox, 3D

genbox face ordering in 3D:

[Figure: 3D box with axes x, y, z; visible face labels: f1, f2, f3, f4, f6]


Multibox Case: Backward Facing Step

BCs for internal faces are blank

Use additional boxes for more control over mesh grading, etc.

genbox conventions

# indicates comment

If nelx (y, or z) > 0, user provides x_0,…,x_nelx in ascending order, possibly on multiple lines

If nelx (y, or z) < 0, user provides x_0 < x_nelx, and ratio, so that domain [x_0, x_nelx] is partitioned into |nelx| subdomains, with dx_{i+1} = ratio*dx_i

If ndim < 0, genbox generates .rea and .re2 (binary) file [new convention]

“B” or “b” for Box indicates a box descriptor follows

"C" or "c" for Circle indicates a circle descriptor (currently supported?)

BCs must be 3 characters (including blanks) !

Base input file must match dimension (2D or 3D) of the given case
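
The graded-spacing rule in the conventions above (each interval is ratio times the previous one) can be sketched in a few lines of Python. The function name and sample values are hypothetical, and this is an illustration of the geometric-series arithmetic, not genbox's actual implementation.

```python
def graded_points(x0, x1, nel, ratio):
    """Split [x0, x1] into nel intervals with dx[i+1] = ratio * dx[i]."""
    if ratio == 1.0:
        dx0 = (x1 - x0) / nel
    else:
        # The interval lengths form a geometric series that must sum to x1 - x0.
        dx0 = (x1 - x0) * (1.0 - ratio) / (1.0 - ratio ** nel)
    points = [x0]
    for i in range(nel):
        points.append(points[-1] + dx0 * ratio ** i)
    points[-1] = x1    # remove accumulated round-off at the end point
    return points

pts = graded_points(0.0, 1.0, nel=4, ratio=2.0)
print(pts)
```

A ratio below one clusters points toward the upper end of the interval instead, which is the usual way to refine near a wall at that end.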