Introductory OpenFOAM® Course, 2 to 6 July 2012
Joel Guerrero
University of Genoa, DICAT
Dipartimento di Ingegneria delle Costruzioni, dell'Ambiente e del Territorio
Your Lecturer
Joel GUERRERO
[email protected]
Domain Decomposition
• The mesh and fields are decomposed using the decomposePar utility.
• The main goal is to break up the domain with minimal effort, but in
such a way as to guarantee a fairly economical solution.
• The geometry and fields are broken up according to a set of
parameters specified in a dictionary named decomposeParDict that
must be located in the system directory of the case.
• In the decomposeParDict file the user must set the number of
domains into which the case should be decomposed; usually this
corresponds to the number of cores available for the calculation
(a decomposeParDict sketch is given at the end of this list).
• numberOfSubdomains 2;
• The user has a choice of several methods of decomposition, specified
by the method keyword.
• On completion, a set of subdirectories will have been created, one for
each processor. The directories are named processorN where N = 0,
1, 2, … Each directory contains the decomposed fields.
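A minimal decomposeParDict sketch is shown below (the scotch method and the simpleCoeffs values are illustrative assumptions; adjust them to your case):

// system/decomposeParDict (sketch)
numberOfSubdomains 2;        // one subdomain per core used in the run

method             scotch;   // no geometric input required

// coefficients needed only if the simple method is used;
// the product of the n entries must equal numberOfSubdomains
simpleCoeffs
{
    n       (2 1 1);         // number of splits in the x, y and z directions
    delta   0.001;           // cell skew factor
}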
Running in parallel
Domain Decomposition
• simple: simple geometric decomposition in which the domain is split
into pieces by direction.
• hierarchical: Hierarchical geometric decomposition which is the
same as simple except the user specifies the order in which the
directional split is done.
• scotch: requires no geometric input from the user and attempts to
minimize the number of processor boundaries (similar to metis).
• manual: Manual decomposition, where the user directly specifies the
allocation of each cell to a particular processor.
• When post-processing cases that have been run in parallel, the user
has several options; the most common is to reconstruct the mesh and
field data to recreate the complete domain and fields and then
post-process as usual (the complete workflow is sketched after this list):
• reconstructPar
• paraFoam
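A typical parallel session from the terminal therefore looks like the following sketch (the solver name pisoFoam and the number of cores are assumptions; use your own solver and match -np to numberOfSubdomains):

# decompose the mesh and fields according to system/decomposeParDict
decomposePar

# run the solver in parallel; the -parallel flag is required
mpirun -np 2 pisoFoam -parallel > log.pisoFoam 2>&1

# recreate the complete domain and fields, then post-process
reconstructPar
paraFoam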
Running in parallel
• To launch a job in a cluster with PBS, you will need to write a small
script file in which you tell the job scheduler which resources you want
to use and what you want to do.
#!/bin/bash
#
# Simple PBS batch script that reserves 8 nodes and runs an
# MPI program on 64 processors (8 processors on each node).
# The walltime is 24 hours.
#
#PBS -N openfoam_simulation //name of the job
#PBS -l nodes=8:ppn=8:nehalem,walltime=24:00:00 //requested resources and max execution time
#PBS -m abe -M [email protected] //send an email as soon as the job is launched or terminated
The annotations after // are not PBS comments; they are only explanatory
notes. PBS comments start with the number sign (#), while lines beginning
with #PBS are directives for the job scheduler.
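The directives above only reserve resources; the commands to execute come next in the body of the script. A minimal sketch, assuming the case is in the directory the job is submitted from and that the solver is pisoFoam (both assumptions):

# go to the directory from which the job was submitted (the case directory)
cd $PBS_O_WORKDIR

# run the solver on 64 MPI processes; the -parallel flag is required
mpirun -np 64 pisoFoam -parallel > log.pisoFoam 2>&1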
Running in a cluster using a job scheduler
• To launch your job you need to use the qsub command (part of the
PBS job scheduler). The command will send your job to the queue
(see the example below).
• qsub script_name
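For example (the script name is an assumption; qstat and qdel are the standard PBS commands to monitor and cancel jobs):

qsub openfoam_simulation.pbs   # submit the job; qsub prints the assigned job ID
qstat -u $USER                 # list the status of your queued and running jobs
qdel <job_id>                  # cancel a job if needed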
What is cufflink?
cufflink stands for Cuda For FOAM Link. cufflink is an open source library
for linking numerical methods based on Nvidia's Compute Unified Device
Architecture (CUDA™) C/C++ programming language and
OpenFOAM®. Currently, the library utilizes the sparse linear solvers of
Cusp and methods from Thrust to solve the linear Ax = b system derived
from OpenFOAM's lduMatrix class and return the solution vector. cufflink
is designed to utilize the coarse-grained parallelism of OpenFOAM® (via
domain decomposition) to allow multi-GPU parallelism at the level of the
linear system solver.
Cufflink Features
• Currently only supports the OpenFOAM-extend fork of the
OpenFOAM code.
• Single GPU support.
• Multi-GPU support via OpenFOAM's coarse-grained parallelism
achieved through domain decomposition (experimental).
Running with a GPU
Cufflink Features
• A conjugate gradient solver based on Cusp for symmetric matrices (e.g.
pressure), with a choice of
• Diagonal Preconditioner.
• Sparse Approximate Inverse Preconditioner.
• Algebraic Multigrid (AMG) Preconditioner based on Smoothed
Aggregation.
• A bi-conjugate gradient stabilized solver based on Cusp for asymmetric
matrices (e.g. velocity, epsilon, k), with a choice of
• Diagonal Preconditioner.
• Sparse Approximate Inverse Preconditioner.
• Single precision (sm_10), double precision (sm_13), and Fermi architecture
(sm_20) are supported. The double precision solvers are recommended over
single precision because of known errors encountered in the Smoothed
Aggregation Preconditioner in single precision.
Running with a GPU
Running cufflink in OpenFOAM extend
Once the cufflink library has been compiled, to use it in OpenFOAM
you need to include the line
libs ("libCufflink.so");
in your controlDict dictionary (a controlDict sketch is given after the fvSolution
entry below). In addition, a solver must be chosen in the fvSolution dictionary:
p
{
    solver          cufflink_CG;
    preconditioner  none;
    tolerance       1e-10;
    //relTol        1e-08;
    maxIter         10000;
    storage         1;   // COO=0, CSR=1, DIA=2, ELL=3, HYB=4; all other numbers use the default (CSR)
    gpusPerMachine  2;   // for the multi-GPU version, on a machine with 2 GPUs per node
    AinvType        ;
    dropTolerance   ;
    linStrategy     ;
}
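For reference, the libs entry mentioned above sits in system/controlDict next to the usual run-control keywords. A minimal sketch (the application and the time settings are illustrative assumptions):

application     icoFoam;
startFrom       startTime;
startTime       0;
stopAt          endTime;
endTime         0.5;
deltaT          0.005;
writeControl    timeStep;
writeInterval   20;

libs            ("libCufflink.so");   // load the cufflink GPU solvers at run time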
Additional tutorials
In the folder $path_to_openfoamcourse/parallel_tut you will find many
tutorials; try to go through each one to understand how to set up a parallel
case in OpenFOAM.
Thank you for your attention
Hands-on session