PETSc Tutorial
Tutorial Objectives
Introduce the Portable, Extensible Toolkit for
Scientific Computation (PETSc)
Demonstrate how to write a complete parallel
implicit PDE solver using PETSc
Learn about PETSc interfaces to other packages
How to learn more about PETSc
What is PETSc?
PETSc history
Begun in September 1991
Now: over 4,000 downloads of version 2.0
PETSc Concepts
How to specify the mathematics of the problem
Data objects
vectors, matrices
Tutorial Topics
Getting started
sample results
programming paradigm
Data objects
vectors (e.g., field variables)
matrices (e.g., sparse Jacobians)
Viewers
object information
visualization
Solvers
linear
nonlinear
timestepping (and ODEs)
Tutorial Topics: Using PETSc with Other Packages
http://www.mathworks.com
Tutorial Approach
From the perspective of an application programmer:
1. Beginner - basic functionality
2. Intermediate - selecting options, performance evaluation and tuning
3. Advanced - user-defined customization of algorithms and data structures
4. Developer - advanced customizations, intended primarily for use by library developers
Intermediate
Experiment with options
Determine opportunities for improvement
Advanced
Extend algorithms and/or data structures as needed
Developer
Consider interface and efficiency issues for integration and interoperability of multiple toolkits
Structure of PETSc
(Layered structure, top to bottom:)
PETSc PDE Application Codes
ODE Integrators | Visualization
Nonlinear Solvers, Unconstrained Minimization | Interface
Linear Solvers: Preconditioners + Krylov Methods
Object-Oriented Matrices, Vectors, Indices | Grid Management
Profiling Interface
Computation and Communication Kernels: MPI, MPI-IO, BLAS, LAPACK
(PETSc numerical components:)
Time Steppers: Euler, Other
Krylov Subspace Methods: CG, CGS, Bi-CG-STAB, TFQMR, Other
Preconditioners: Additive Schwarz, Block Jacobi, Jacobi, ILU, ICC, LU (sequential only), Others
Matrices: Compressed Sparse Row (AIJ), Blocked Compressed Sparse Row (BAIJ), Block Diagonal (BDIAG), Dense, Other
Distributed Arrays
Index Sets: Indices, Block Indices, Stride, Other
Vectors
(Flow of control for PDE solution: user code performs application initialization, function evaluation, Jacobian evaluation, and post-processing; PETSc code provides the solvers, including KSP and PC.)
(The same flow of control when interfacing with other tools: packages such as SAMRAI, SPAI, ILUDTP, and PVODE plug into the user-code and PETSc solver stages, e.g., SPAI and ILUDTP as preconditioners within PC.)
Levels of Abstraction in Mathematical Software
Application-specific interface
Programmer manipulates objects associated with
the application
Solvers
Linear Systems
Nonlinear Systems
Timestepping
Performance Debugging
Integrated profiling using -log_summary
Profiling by stages of an application
User-defined events
Approach
Distributed memory, shared-nothing
Requires only a compiler (single node or processor)
Access to data on remote machines through MPI
Collectivity
MPI communicators (MPI_Comm) specify collectivity
(processors involved in a computation)
All PETSc creation routines for solver and data objects are
collective with respect to a communicator, e.g.,
VecCreate(MPI_Comm comm, int m, int M, Vec *x)
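As an illustration (not from the original slides), here is a minimal sketch contrasting a vector created collectively on PETSC_COMM_WORLD with one created only on the local process; PETSC_DECIDE lets PETSc choose the local size, and exact calling sequences may differ slightly between PETSc versions:

#include "petsc.h"
/* Sketch: creation routines are collective over their communicator,
   so every process in that communicator must make the call.         */
int main( int argc, char *argv[] )
{
  Vec x;   /* parallel vector shared by all processes in PETSC_COMM_WORLD */
  Vec y;   /* sequential vector, local to each process                    */
  PetscInitialize( &argc, &argv, NULL, NULL );
  VecCreateMPI( PETSC_COMM_WORLD, PETSC_DECIDE, 100, &x );  /* collective      */
  VecCreateSeq( PETSC_COMM_SELF, 100, &y );                 /* one process only */
  /* (vector destruction omitted; see VecDestroy)                            */
  PetscFinalize();
  return 0;
}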
Hello World
#include "petsc.h"
int main( int argc, char *argv[] )
{
  PetscInitialize( &argc, &argv, NULL, NULL );
  PetscPrintf( PETSC_COMM_WORLD, "Hello World\n" );
  PetscFinalize();
  return 0;
}
Data Objects
Vectors (Vec)
focus: field data arising in nonlinear PDEs
Matrices (Mat)
focus: linear operators arising in nonlinear PDEs (i.e., Jacobians)
Object creation (beginner)
Object assembly (beginner)
Setting options (intermediate)
Viewing (intermediate)
User-defined customizations (advanced)
Vectors
Fundamental objects for storing field
solutions, right-hand sides, etc.
VecCreateMPI(...,Vec *)
MPI_Comm - processors that share the vector
number of elements local to this processor
total number of elements
(figure: vector elements distributed across proc 0 through proc 4)
Vector Assembly
VecSetValues(Vec,...)
number of entries to insert/add
indices of entries
values to add
mode: [INSERT_VALUES,ADD_VALUES]
VecAssemblyBegin(Vec)
VecAssemblyEnd(Vec)
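A minimal sketch of the create/set/assemble sequence (not from the original slides; written against the PETSc 2.x-era style used here, where indices are plain ints - newer versions use PetscInt/PetscScalar):

#include "petsc.h"
/* Sketch: each process inserts entries by global index; the assembly
   calls then communicate any values destined for other processes.    */
int main( int argc, char *argv[] )
{
  Vec         x;
  int         i, istart, iend;
  PetscScalar v;
  PetscInitialize( &argc, &argv, NULL, NULL );
  VecCreateMPI( PETSC_COMM_WORLD, PETSC_DECIDE, 100, &x );
  VecGetOwnershipRange( x, &istart, &iend );   /* locally owned index range */
  for (i=istart; i<iend; i++) {
    v = (PetscScalar) i;
    VecSetValues( x, 1, &i, &v, INSERT_VALUES );
  }
  VecAssemblyBegin( x );   /* communication may overlap other local work */
  VecAssemblyEnd( x );
  PetscFinalize();
  return 0;
}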
Selected Vector Operations
Function              Operation
VecAXPY( )            y = y + a*x
VecAYPX( )            y = x + a*y
VecWAXPY( )           w = a*x + y
VecScale( )           x = a*x
VecCopy( )            y = x
VecPointwiseMult( )   w_i = x_i * y_i
VecMax( )             r = max x_i
VecShift( )           x_i = s + x_i
VecAbs( )             x_i = |x_i|
VecNorm( )            r = ||x||
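A small illustrative sketch using a few of these operations (not from the original slides; calling sequences follow current PETSc conventions and may differ slightly from the PETSc 2.0-era API, and the helper name is hypothetical):

#include "petsc.h"
/* Sketch: duplicate a vector's layout, copy its values, and take a norm. */
int compute_copy_norm( Vec x )
{
  Vec       y;
  PetscReal nrm;
  VecDuplicate( x, &y );          /* y has the same layout and communicator as x */
  VecCopy( x, y );                /* y = x          */
  VecNorm( y, NORM_2, &nrm );     /* nrm = ||y||    */
  PetscPrintf( PETSC_COMM_WORLD, "norm = %g\n", (double)nrm );
  /* destroy y with VecDestroy() when done (calling sequence varies by version) */
  return 0;
}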
Example programs
Location: petsc/src/sys/examples/tutorials/
E ex2.c - synchronized printing
Location: petsc/src/vec/examples/tutorials/
E ex3.c, ex3f.F
Sparse Matrices
Fundamental objects for storing linear operators
(e.g., Jacobians)
MatCreateMPIAIJ(...,Mat *)
MPI_Comm - processors that share the matrix
number of local rows and columns
number of global rows and columns
optional storage pre-allocation information
Matrix Assembly
MatSetValues(Mat,...)
number of rows to insert/add
indices of rows and columns
number of columns to insert/add
values to add
mode: [INSERT_VALUES,ADD_VALUES]
MatAssemblyBegin(Mat)
MatAssemblyEnd(Mat)
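A minimal sketch of matrix creation and assembly (not from the original slides; the helper name and the tridiagonal stencil are purely illustrative, and index types follow the PETSc 2.x-era int convention):

#include "petsc.h"
/* Sketch: assemble a 1D tridiagonal operator row by row. */
int assemble_tridiagonal( int n, Mat *A )
{
  int         i, istart, iend, col[3];
  PetscScalar val[3];
  /* n x n matrix, PETSc picks the row distribution; rough preallocation of
     3 nonzeros per row in the diagonal block, 2 in the off-diagonal block  */
  MatCreateMPIAIJ( PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
                   n, n, 3, NULL, 2, NULL, A );
  MatGetOwnershipRange( *A, &istart, &iend );
  for (i=istart; i<iend; i++) {
    if (i == 0 || i == n-1) continue;      /* boundary rows omitted for brevity */
    col[0] = i-1;  col[1] = i;    col[2] = i+1;
    val[0] = -1.0; val[1] = 2.0;  val[2] = -1.0;
    MatSetValues( *A, 1, &i, 3, col, val, INSERT_VALUES );
  }
  MatAssemblyBegin( *A, MAT_FINAL_ASSEMBLY );
  MatAssemblyEnd( *A, MAT_FINAL_ASSEMBLY );
  return 0;
}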
Blocked matrix formats: performance benefit
(Bar chart: MFlop/sec for matrix-vector products and triangular solves, Blocked vs. Basic storage; 3D compressible Euler code, block size 5, IBM Power2.)
Viewers
Viewer Concepts
Information about PETSc objects
runtime choices for solvers, nonzero info for matrices, etc.
Visualization
simple x-window graphics
vector fields
matrix sparsity structure
Default viewers
Solution components, using runtime option -snes_vecmonitor
(figure: solution components - velocity u, velocity v, vorticity, temperature T)
ASCII viewer formats:
VIEWER_FORMAT_ASCII_DEFAULT
VIEWER_FORMAT_ASCII_MATLAB
VIEWER_FORMAT_ASCII_COMMON
VIEWER_FORMAT_ASCII_INFO
etc.
Solvers: Usage Concepts
Context variables
Solver options
Callback routines
Customization
(Flow of control for linear PDE solution: user code performs application initialization, evaluation of A and b, and post-processing; PETSc code solves Ax = b via KSP and PC.)
Linear Solvers
Goal: Support the solution of linear systems,
Ax=b,
particularly for sparse, parallel problems arising
within PDE-based models
User provides:
Code to evaluate A, b
Example: Helmholtz problem
Helmholtz equation:   ∇^2 u + k^2 u = 0
Radiation condition:  lim_{r→∞} r^{1/2} (∂u/∂r + iku) = 0
(figure: real and imaginary parts of the solution)
(figure: natural ordering; close-up view)
Linear solvers (outline):
Matrix-free solvers (advanced)
User-defined customizations (advanced)
Context Variables
Are the key to solver organization
Contain the complete state of an algorithm,
including
parameters (e.g., convergence tolerance)
functions that run the algorithm (e.g.,
convergence monitoring routine)
information about the current state (e.g., iteration
number)
Creating the SLES context (Fortran):
call SLESCreate(MPI_COMM_WORLD,sles,ierr)
Krylov Methods (KSP)
Conjugate Gradient
GMRES
CG-Squared
Bi-CG-stab
Transpose-free QMR
etc.
Preconditioners (PC)
Block Jacobi
Overlapping Additive Schwarz
ICC, ILU via BlockSolve95
ILU(k), LU (sequential only)
etc.
Basic linear solver code (C/C++)
SLES sles;      /* linear solver context */
Mat  A;         /* matrix */
Vec  x, b;      /* solution, RHS vectors */
int  n, its;    /* problem dimension, number of iterations */

MatCreate(MPI_COMM_WORLD,n,n,&A);    /* then assemble matrix A */
VecCreate(MPI_COMM_WORLD,n,&x);
VecDuplicate(x,&b);                  /* then assemble RHS vector b */

SLESCreate(MPI_COMM_WORLD,&sles);
SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN);
SLESSetFromOptions(sles);
SLESSolve(sles,b,x,&its);
Basic linear solver code (Fortran)
SLES     sles
Mat      A
Vec      x, b
integer  n, its, ierr

call MatCreate(MPI_COMM_WORLD,n,n,A,ierr)
call VecCreate(MPI_COMM_WORLD,n,x,ierr)
call VecDuplicate(x,b,ierr)
C     then assemble matrix A and vector b
call SLESCreate(MPI_COMM_WORLD,sles,ierr)
call SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN,ierr)
call SLESSetFromOptions(sles,ierr)
call SLESSolve(sles,b,x,its,ierr)
-ksp_type [cg,gmres,bcgs,tfqmr,...]
-pc_type [lu,ilu,jacobi,sor,asm,...]
-ksp_max_it <max_iters>
-ksp_gmres_restart <restart>
-pc_asm_overlap <overlap>
-pc_asm_type [basic,restrict,interpolate,none]
etc ...
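For instance, a hypothetical run of one of the SLES example programs (listed later in this tutorial; the executable name ex4 is only illustrative) might combine several of these options:

mpirun -np 4 ex4 -ksp_type gmres -ksp_gmres_restart 60 -pc_type asm -pc_asm_overlap 2 -ksp_max_it 200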
Monitoring convergence:
-ksp_monitor      - print the preconditioned residual norm at each iteration
-ksp_xmonitor     - plot the preconditioned residual norm in an x-window
-ksp_truemonitor  - print the true residual norm ||b-Ax|| at each iteration
Helmholtz: Scalability
128x512 grid, wave number = 13, IBM SP
GMRES(30)/Restricted Additive Schwarz
1 block per proc, 1-cell overlap, ILU(1) subdomain solver
Procs   Iterations   Time (sec)   Speedup
1       221          163.01       -
2       222          81.06        2.0
4       224          37.36        4.4
8       228          19.49        8.4
16      229          10.85        15.0
32      230          6.37         25.6
SLESCreate( )         - Create SLES context
SLESSetOperators( )   - Set linear operators
SLESSetFromOptions( ) - Set runtime solver options for [SLES, KSP, PC]
SLESSolve( )          - Run linear solver
SLESView( )           - View solver options
SLESDestroy( )        - Destroy solver
Preconditioner (PC) options
Procedural Interface     Runtime Option
PCSetType( )             -pc_type [lu,ilu,jacobi,sor,asm,...]
PCILUSetLevels( )        -pc_ilu_levels <levels>
PCSORSetIterations( )    -pc_sor_its <its>
PCSORSetOmega( )         -pc_sor_omega <omega>
PCASMSetType( )          -pc_asm_type [basic,restrict,interpolate,none]
PCGetSubSLES( )          (access subdomain solvers)
Krylov method (KSP) options
Functionality                  Procedural Interface              Runtime Option
Set Krylov method              KSPSetType( )                     -ksp_type [cg,gmres,bcgs,tfqmr,cgs,...]
Set monitoring routine         KSPSetMonitor( )                  -ksp_monitor, -ksp_xmonitor, -ksp_truemonitor, -ksp_xtruemonitor
Set convergence tolerances     KSPSetTolerances( )               -ksp_rtol <rt> -ksp_atol <at> -ksp_max_its <its>
Set GMRES restart parameter    KSPGMRESSetRestart( )             -ksp_gmres_restart <restart>
Set orthogonalization routine  KSPGMRESSetOrthogonalization( )   -ksp_unmodifiedgramschmidt, -ksp_irorthog
  for GMRES
Example programs
Location: petsc/src/sles/examples/tutorials/
ex4.c
ex9.c
E ex22.c
ex15.c
E - on-line exercise
Nonlinear solvers (outline):
Matrix-free solvers (advanced)
User-defined customizations (advanced)
(Flow of control for nonlinear PDE solution: user code performs application initialization, function evaluation, Jacobian evaluation, and post-processing; PETSc code runs the nonlinear solve via SNES, KSP, and PC.)
Nonlinear Solvers
Goal: For problems arising from PDEs,
support the general solution of F(u) = 0
User provides:
Code to evaluate F(u)
Code to evaluate Jacobian of F(u) (optional)
or use sparse finite difference approximation
or use automatic differentiation (coming soon!)
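A sketch of what such a user-provided function evaluation routine typically looks like (not from the original slides; the names FormFunction and ApplicationCtx are placeholders, and the callback shape follows the SNESSetFunction usage shown in the code fragments below):

#include "petsc.h"
/* Sketch of a user residual evaluation F(u), registered via
   SNESSetFunction(snes, F, FormFunction, &user). The solver calls it
   with the current iterate x and expects the residual in f.          */
typedef struct {
  double lidvelocity, grashof;   /* example application parameters */
} ApplicationCtx;

int FormFunction( SNES snes, Vec x, Vec f, void *ptr )
{
  ApplicationCtx *user = (ApplicationCtx*) ptr;   /* user-defined context */
  /* ... compute f = F(x) here, e.g., a finite difference residual that
         uses user->lidvelocity and user->grashof ...                    */
  return 0;
}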
Driven cavity model
Velocity-vorticity formulation
Flow driven by lid and/or buoyancy
Logically regular grid, parallelized with DAs
Finite difference discretization
Source code: petsc/src/snes/examples/tutorials/ex8.c
Solution components: velocity u, velocity v, vorticity, temperature T
Nonlinear solver code (C/C++)
SNES snes;     /* nonlinear solver context */
Mat  J;        /* Jacobian matrix */
Vec  x, F;     /* solution and residual vectors */
int  n, its;   /* problem dimension, number of iterations */
...
MatCreate(MPI_COMM_WORLD,n,n,&J);
VecCreate(MPI_COMM_WORLD,n,&x);
VecDuplicate(x,&F);
SNESCreate(MPI_COMM_WORLD,SNES_NONLINEAR_EQUATIONS,&snes);
SNESSetFunction(snes,F,EvaluateFunction,usercontext);
SNESSetJacobian(snes,J,EvaluateJacobian,usercontext);
SNESSetFromOptions(snes);
SNESSolve(snes,x,&its);
Nonlinear solver code (Fortran)
SNES     snes
Mat      J
Vec      x, F
integer  n, its, ierr
...
call MatCreate(MPI_COMM_WORLD,n,n,J,ierr)
call VecCreate(MPI_COMM_WORLD,n,x,ierr)
call VecDuplicate(x,F,ierr)
call SNESCreate(MPI_COMM_WORLD,
&    SNES_NONLINEAR_EQUATIONS,snes,ierr)
call SNESSetFunction(snes,F,EvaluateFunction,PETSC_NULL,ierr)
call SNESSetJacobian(snes,J,EvaluateJacobian,PETSC_NULL,ierr)
call SNESSetFromOptions(snes,ierr)
call SNESSolve(snes,x,its,ierr)
-ksp_type [cg,gmres,bcgs,tfqmr,...]
-pc_type [lu,ilu,jacobi,sor,asm,...]
-snes_type [ls,tr,...]
SNESCreate( )         - Create SNES context
SNESSetFunction( )    - Set function eval. routine
SNESSetJacobian( )    - Set Jacobian eval. routine
SNESSetFromOptions( ) - Set runtime solver options for [SNES, SLES, KSP, PC]
SNESSolve( )          - Run nonlinear solver
SNESView( )           - View solver options
SNESDestroy( )        - Destroy solver
Nonlinear solver (SNES) options
Functionality                Procedural Interface                          Runtime Option
Set nonlinear solver         SNESSetType( )                                -snes_type [ls,tr,umls,umtr,...]
Set monitoring routine       SNESSetMonitor( )                             -snes_monitor, -snes_xmonitor
Set convergence tolerances   SNESSetTolerances( )
Set line search routine      SNESSetLineSearch( )
View solver options          SNESView( )
Set linear solver options    SNESGetSLES( ), SLESGetKSP( ), SLESGetPC( )
Example programs
Location: petsc/src/snes/examples/tutorials/
ex1.c, ex1f.F
ex4.c, ex4f.F
E ex18.c (multigrid)
E ex19.c
E - on-line exercise
Timestepping solvers (outline):
User-defined customizations (advanced)
(Flow of control for time-dependent PDE solution, U_t = F(U,Ux,Uxx): user code performs function evaluation, Jacobian evaluation, and post-processing; PETSc code runs the timestepping and underlying nonlinear/linear solvers.)
Timestepping Solvers
Goal: Support the (real and pseudo) time
evolution of PDE systems
Ut = F(U,Ux,Uxx,t)
User provides:
Code to evaluate F(U,Ux,Uxx,t)
Code to evaluate Jacobian of F(U,Ux,Uxx,t)
or use sparse finite difference approximation
or use automatic differentiation (coming soon!)
Example: U_t = U U_x + U_xx
U(0,x) = sin(2πx)
U(t,0) = U(t,1)
Timestepping Solvers
Euler
Backward Euler
Pseudo-transient continuation
Interface to PVODE, a sophisticated parallel ODE
solver package by Hindmarsh et al. of LLNL
Adams
BDF
Timestepping Solvers
Allow full access to all of the PETSc
nonlinear solvers
linear solvers
distributed arrays, matrix assembly tools, etc.
TSCreate( )          - Create TS context
TSSetRHSFunction( )  - Set function eval. routine
TSSetRHSJacobian( )  - Set Jacobian eval. routine
TSSetFromOptions( )  - Set runtime solver options for [TS, SNES, SLES, KSP, PC]
TSSolve( )           - Run timestepping solver
TSView( )            - View solver options actually used at runtime (alternative: -ts_view)
TSDestroy( )         - Destroy solver
Timestepping (TS) options
Functionality                Procedural Interface             Runtime Option
Set timestepping solver      TSSetType( )                     -ts_type [euler,beuler,pseudo,...]
Set monitoring routine                                        -ts_monitor, -ts_xmonitor
Set timestep duration        TSSetDuration( )                 -ts_max_steps <maxsteps>, -ts_max_time <maxtime>
View solver options          TSView( )                        -ts_view
Set underlying SNES, SLES,   TSGetSNES( ), SNESGetSLES( ),    -snes_monitor, -snes_rtol <rt>, -ksp_type <ksptype>,
  KSP, and PC options        SLESGetKSP( ), SLESGetPC( )      -ksp_rtol <rt>, -pc_type <pctype>
Example programs
Location: petsc/src/ts/examples/tutorials/
ex3.c - uniprocessor heat equation
ex4.c - parallel heat equation
E - on-line exercise
Mesh Definitions:
Structured Meshes
Semi-Structured Meshes
Mesh Types
Structured: DA objects
Unstructured: VecScatter objects
Usage Concepts
Geometric data
Data structure creation
Ghost point updates
Local numerical computation
Ghost Values
(figure: local nodes and ghost nodes)
                          Structured meshes          Unstructured meshes
Geometric data            stencil [implicit]         elements, edges, vertices
Data structure creation   DACreate( )  (DA, AO)      VecScatterCreate( )  (VecScatter, AO)
Ghost point updates       DAGlobalToLocal( )         VecScatter( )
Local numerical           loops over I,J,K indices   loops over entities
  computation
Distributed arrays (outline):
DA creation (beginner)
Viewing (intermediate)
Distributed arrays: structured meshes
Data structures: DA, AO
Ghost point updates: DAGlobalToLocal( )
Local numerical computation: loops over I,J,K indices
Update ghost points
DAGlobalToLocalBegin(DA,...)
DAGlobalToLocalEnd(DA,...)
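A usage sketch (not from the original slides; it assumes a DA and its global/local vectors created elsewhere, the helper name is hypothetical, and some PETSc 2.x versions require including "petscda.h" or "da.h" for the DA interface):

#include "petsc.h"
/* Sketch: refresh the ghost points of a local work vector from the
   global vector associated with a distributed array.                */
int update_ghosts( DA da, Vec global, Vec local )
{
  /* both calls are collective; the Begin/End split allows overlapping
     the ghost-point communication with purely local computation      */
  DAGlobalToLocalBegin( da, global, INSERT_VALUES, local );
  /* ... local work that does not need the ghost values can go here ... */
  DAGlobalToLocalEnd( da, global, INSERT_VALUES, local );
  return 0;
}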
Distributed Arrays
Data layout and ghost values
(figure: ghost regions for a box-type stencil vs. a star-type stencil, shown across several processors)
DACreate1d(...,DA *)
MPI_Comm - processors containing array
DA_STENCIL_[BOX,STAR]
DA_[NONPERIODIC,XPERIODIC]
DACreate2d(...,DA *)
DA_[NON,X,Y,XY]PERIODIC
DAGlobalToLocalEnd(DA,...)
Unstructured Meshes
Setting up communication patterns is much more
complicated than the structured case due to
mesh dependence
discretization dependence
Cell-centered
Vertex-centered
Cell and vertex centered (e.g., staggered grids)
Mixed triangles and quadrilaterals
(Recap: structured meshes use DA and AO objects, with an implicit stencil, DAGlobalToLocal( ) ghost point updates, and local computation looping over I,J,K indices; unstructured meshes use VecScatter and AO objects over elements, edges, and vertices, with VecScatterCreate( )/VecScatter( ) and local computation looping over entities.)
(figure: driven cavity solution components [u, v, vorticity, T] - velocity v, vorticity, temperature T)
Experimentation
Main Routine
(Flow of control: user code handles application initialization, function evaluation (part C), Jacobian evaluation (part D), and post-processing; PETSc code provides the solvers, including KSP and PC.)
Driven Cavity:
Running the program (1)
Matrix-free Jacobian approximation with no preconditioning
(via -snes_mf) does not use explicit Jacobian evaluation
1 processor: (thermally-driven flow)
mpirun -np 1 ex8 -snes_mf -snes_monitor -grashof 1000.0 -lidvelocity 0.0
Debugging and Errors
Profiling and performance tuning (outline):
User-defined events (intermediate)
Performance tuning:
Matrix optimizations (intermediate)
Application optimizations (intermediate)
Algorithmic tuning (advanced)
Profiling
Integrated monitoring of
time
floating-point performance
memory usage
communication
Conclusion
Summary (beginner)
New features (beginner)
Interfacing with other packages (beginner)
Extensibility issues (developer)
References (beginner)
Summary
Using callbacks to set up the problems for ODE
and nonlinear solvers
Managing data layout and ghost point
communication with DAs and VecScatters
Evaluating parallel functions and Jacobians
Consistent profiling and error handling
Multigrid Support:
Recently simplified for structured grids
Linear Example:
3-dim linear problem on mesh of dimensions mx x my x mz
stencil width = sw, degrees of freedom per point = dof
using piecewise linear interpolation
ComputeRHS() and ComputeMatrix() are user-provided functions

DAMG *damg;
DAMGCreate(comm,nlevels,NULL,&damg);
DAMGSetGrid(damg,3,DA_NONPERIODIC,DA_STENCIL_STAR,mx,my,mz,sw,dof);
DAMGSetSLES(damg,ComputeRHS,ComputeMatrix);
DAMGSolve(damg);
solution = DAMGGetx(damg);
Multigrid Support
Nonlinear Example:
3-dim nonlinear problem on mesh of dimensions mx x my x mz
stencil width = sw, degrees of freedom per point = dof
using piecewise linear interpolation
ComputeFunc() and ComputeJacobian() are user-provided functions

DAMG *damg;
DAMGCreate(comm,nlevels,NULL,&damg);
DAMGSetGrid(damg,3,DA_NONPERIODIC,DA_STENCIL_STAR,mx,my,mz,sw,dof);
DAMGSetSNES(damg,ComputeFunc,ComputeJacobian);
DAMGSolve(damg);
solution = DAMGGetx(damg);
Overture
Overture is a framework for generating
discretizations of PDEs on composite grids.
PETSc can be used as a black box linear
equation solver (a nonlinear equation solver is
under development).
Advanced features of PETSc such as the runtime
options database, profiling, debugging info, etc.,
can be exploited through explicit calls to the
PETSc API.
Overture Essentials
Read the grid
CompositeGrid cg;
getFromADataBase(cg,nameOfOGFile);
cg.update();
u.display();
return(0);
}
Matrix partitioning interface (e.g., via ParMETIS)
MatPartitioningCreate(MPI_Comm,MatPartitioning *ctx)
MatPartitioningSetAdjacency(ctx,matrix)
MatPartitioningSetVertexWeights(ctx,weights)   (optional)
MatPartitioningSetFromOptions(ctx)
MatPartitioningApply(ctx,IS *partitioning)
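A sketch of these calls strung together (not from the original slides; the helper name is hypothetical, and selecting ParMETIS via -mat_partitioning_type is an assumption about the runtime option name):

#include "petsc.h"
/* Sketch: compute a new row partitioning of a parallel matrix through
   the MatPartitioning interface; the actual partitioner (e.g., ParMETIS)
   is chosen at runtime via MatPartitioningSetFromOptions().             */
int repartition( Mat adjacency, IS *partitioning )
{
  MatPartitioning ctx;
  MatPartitioningCreate( PETSC_COMM_WORLD, &ctx );
  MatPartitioningSetAdjacency( ctx, adjacency );   /* graph to partition      */
  MatPartitioningSetFromOptions( ctx );            /* pick the partitioner    */
  MatPartitioningApply( ctx, partitioning );       /* new process of each row */
  /* destroy ctx with MatPartitioningDestroy() when done */
  return 0;
}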
Using PVODE via the TS interface:
TSCreate(MPI_Comm,TS_NONLINEAR,&ts)
TSSetType(ts,TS_PVODE)
... regular TS functions
TSPVODESetType(ts,PVODE_ADAMS)
Using the SPAI preconditioner:
PCSetType(pc,PCSPAI)
PCSPAISetXXX(pc,...)
Matlab
PetscMatlabEngineCreate(MPI_Comm,machinename,PetscMatlabEngine *eng)
PetscMatlabEnginePut(eng,PetscObject obj)
  Vector
  Matrix
PetscMatlabEngineEvaluate(eng,"R = QR(A);")
PetscMatlabEngineGet(eng,PetscObject obj)
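A round-trip sketch (not from the original slides; the helper name is hypothetical, and the use of NULL for the local machine and of PetscObjectSetName to control the variable name seen inside Matlab are assumptions based on the engine interface shown above):

#include "petsc.h"
/* Sketch: push a PETSc vector into a Matlab engine, run a command on it,
   and pull the result back into the same vector.                         */
int matlab_roundtrip( Vec x )
{
  PetscMatlabEngine eng;
  PetscMatlabEngineCreate( PETSC_COMM_WORLD, NULL, &eng );  /* NULL: local machine */
  PetscObjectSetName( (PetscObject)x, "x" );                /* name used in Matlab */
  PetscMatlabEnginePut( eng, (PetscObject)x );
  PetscMatlabEngineEvaluate( eng, "x = 2*x;" );
  PetscMatlabEngineGet( eng, (PetscObject)x );
  return 0;
}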
SAMRAI
SAMRAI provides an infrastructure for solving
PDEs using adaptive mesh refinement with
structured grids.
SAMRAI developers wrote a new class of PETSc
vectors that uses SAMRAI data structures and
methods.
This enables use of the PETSc matrix-free linear
and nonlinear solvers.
TAO
The Toolkit for Advanced Optimization (TAO) provides
software for large-scale optimization problems, including
unconstrained optimization
bound constrained optimization
nonlinear least squares
nonlinearly constrained optimization
TAO Interface
TAO_SOLVER     tao;            /* optimization solver */
Vec            x, g;           /* solution and gradient vectors */
ApplicationCtx usercontext;    /* user-defined context */

TaoInitialize();
/* Initialize Application -- Create variable and gradient vectors x and g */ ...
TaoCreate(MPI_COMM_WORLD,"tao_lmvm",&tao);
TaoSetFunctionGradient(tao,x,g,FctGrad,(void*)&usercontext);
TaoSolve(tao);
/* Finalize application -- Destroy vectors x and g */ ...
TaoDestroy(tao);
TaoFinalize();
Extensibility Issues
Most PETSc objects are designed to allow one to
drop in a new implementation with a new set of
data structures (similar to implementing a new
class in C++).
Heavily commented example codes include
Krylov methods: petsc/src/sles/ksp/impls/cg
preconditioners: petsc/src/sles/pc/impls/jacobi
Caveats Revisited
Developing parallel, non-trivial PDE solvers that
deliver high performance is still difficult, and
requires months (or even years) of concentrated
effort.
PETSc is a toolkit that can ease these difficulties
and reduce the development time, but it is not a
black-box PDE solver nor a silver bullet.
Users are invited to interact directly with us
regarding correctness or performance issues by
writing to [email protected].
References
http://www.mcs.anl.gov/petsc/docs