MPICH User's Guide
Version 3.0.4
Mathematics and Computer Science Division
Argonne National Laboratory
Pavan Balaji
Wesley Bland
James Dinan
David Goodell
William Gropp
Rob Latham
Antonio Peña
Rajeev Thakur
Past Contributors:
David Ashton
Darius Buntinas
Ralph Butler
Anthony Chan
Jayesh Krishna
Ewing Lusk
Guillaume Mercier
Rob Ross
Brian Toonen
April 24, 2013
This work was supported by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, SciDAC Program, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357.
Contents

1 Introduction
2 Migrating from MPICH1
3 Quick Start
4 Compiling and Linking
5 Running Programs
  5.1 Standard mpiexec
  5.2 Extensions for All Process Management Environments
  5.3 mpiexec Extensions for the Hydra Process Manager
  5.4 Extensions for the SMPD Process Management Environment
    5.4.1 mpiexec Arguments for SMPD
  5.5 Extensions for the gforker Process Management Environment
    5.5.1 Environment Variables for mpiexec
  5.6 Restrictions of the remshell Process Manager
  5.7 Using MPICH with SLURM and PBS
    5.7.1 OSC mpiexec
6 Debugging
  6.1 TotalView
7 Checkpointing
  7.1 Configuring for checkpointing
  7.2 Taking checkpoints
8 Other Tools Provided with MPICH
9 MPICH on Windows
  9.1 Directories
  9.2 Compiling
  9.3 Running

References
1 Introduction
This manual assumes that MPICH has already been installed. For instructions on how to install MPICH, see the MPICH Installer's Guide, or the
README in the top-level MPICH directory. This manual explains how to
compile, link, and run MPI applications, and use certain tools that come
with MPICH. This is a preliminary version and some sections are not complete yet. However, there should be enough here to get you started with
MPICH.
2 Migrating from MPICH1

2.3 Command-Line Arguments in Fortran
MPICH1's configure devoted some effort to finding the libraries that contained the right versions of iargc and getarg and including those libraries
with which the mpif77 script linked MPI programs. Since MPICH does not
require access to command line arguments to applications, these functions
are optional, and configure does nothing special with them. If you need
them in your applications, you will have to ensure that they are available in
the Fortran environment you are using.
3 Quick Start
To use MPICH, you will have to know the directory where MPICH has been
installed. (Either you installed it there yourself, or your system administrator has installed it. One place to look in this case might be /usr/local.
If MPICH has not yet been installed, see the MPICH Installer's Guide.)
We suggest that you put the bin subdirectory of that directory into your
path. This will give you access to assorted MPICH commands to compile,
link, and run your programs conveniently. Other commands in this directory
manage parts of the run-time environment and execute tools.
One of the first commands you might run is mpichversion to find out
the exact version and configuration of MPICH you are working with. Some
of the material in this manual depends on just what version of MPICH you
are using and how it was configured at installation time.
You should now be able to run an MPI program. Let us assume that the
directory where MPICH has been installed is /home/you/mpich-installed,
and that you have added that directory to your path, using
setenv PATH /home/you/mpich-installed/bin:$PATH
for tcsh and csh, or
export PATH=/home/you/mpich-installed/bin:$PATH
for bash or sh. Then to run an MPI program, albeit only on one machine,
you can do:
cd /home/you/mpich-installed/examples
mpiexec -n 3 ./cpi
Details for these commands are provided below, but if you can successfully execute them here, then you have a correctly installed MPICH and
have run an MPI program.
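If you would like to try a program of your own, a minimal MPI program in C looks like the following (a sketch; the file name hello.c is arbitrary):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);               /* initialize the MPI environment */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();                       /* clean up MPI before exiting */
    return 0;
}

It can be compiled and run with the mpicc and mpiexec commands from the bin directory mentioned above:

mpicc -o hello hello.c
mpiexec -n 3 ./hello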
4 Compiling and Linking

4.1 Special Issues for C++
The problem is that both stdio.h and the MPI C++ interface use SEEK_SET,
SEEK_CUR, and SEEK_END. This is really a bug in the MPI standard. You can
try adding
#undef SEEK_SET
#undef SEEK_END
#undef SEEK_CUR
before mpi.h is included, or add the definition
-DMPICH_IGNORE_CXX_SEEK
to the command line (this will cause the MPI versions of SEEK_SET etc. to
be skipped).
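For example, a C++ source file using both stdio.h and MPI might begin as follows (a sketch of the first workaround):

/* stdio.h defines SEEK_SET, SEEK_CUR, and SEEK_END as macros */
#include <stdio.h>
/* undefine them so they do not conflict with the MPI C++ interface */
#undef SEEK_SET
#undef SEEK_CUR
#undef SEEK_END
#include <mpi.h>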
4.2 Special Issues for Fortran
MPICH provides two kinds of support for Fortran programs. For Fortran 77
programmers, the file mpif.h provides the definitions of the MPI constants
such as MPI_COMM_WORLD. Fortran 90 programmers should use the MPI module
instead; this provides all of the definitions as well as interface definitions for
many of the MPI functions. However, this MPI module does not provide
full Fortran 90 support; in particular, interfaces for the routines, such as
MPI_Send, that take choice arguments are not provided.
5 Running Programs

The MPI Standard describes mpiexec as a suggested way to run MPI programs. MPICH implements the mpiexec standard, and also provides some
extensions.
5.1 Standard mpiexec
Here we describe the standard mpiexec arguments from the MPI Standard [1]. The simplest form of a command to start an MPI job is
mpiexec -f machinefile -n 32 a.out
to start the executable a.out with 32 processes (providing an MPI_COMM_WORLD
of size 32 inside the MPI application). Other options are supported, for
search paths for executables, working directories, and even a more general
way of specifying a number of processes. Multiple sets of processes can be
run with different executables and different values for their arguments, with
: separating the sets of processes, as in:
mpiexec -f machinefile -n 1 ./master : -n 32 ./slave
It is also possible to start a one-process MPI job (with an MPI_COMM_WORLD
whose size is equal to 1) without using mpiexec. This process will become
an MPI process when it calls MPI_Init, and it may then call other MPI functions. Currently, MPICH does not fully support calling the dynamic process
routines from the MPI standard (e.g., MPI_Comm_spawn or MPI_Comm_accept)
from processes that are not started with mpiexec.
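For example, the cpi example can be run directly as a singleton job (a sketch, run from the examples directory):

./cpi

This starts a one-process job whose MPI_COMM_WORLD has size 1.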
5.2 Extensions for All Process Management Environments
Some mpiexec arguments are specific to particular communication subsystems (devices) or process management environments (process managers). Our intention is to make all arguments as uniform as possible
across devices and process managers. For the time being we will document
these separately.
5.3 mpiexec Extensions for the Hydra Process Manager
5.4 Extensions for the SMPD Process Management Environment
SMPD is an alternate process manager that runs on both Unix and Windows. It can launch jobs across both platforms if the binary formats match
(big/little endianness and sizes of the C types int, long, void*, etc.).
5.4.1 mpiexec Arguments for SMPD
mpiexec for smpd accepts the standard MPI mpiexec options. Execute
mpiexec
or
mpiexec -help2
to print the usage options. Typical usage:
mpiexec -n 10 myapp.exe
All options to mpiexec:

-n x
-np x
  launch x processes

-localonly x
-np x -localonly
  launch x processes on the local machine

-machinefile filename
  use a file to list the names of machines to launch on

-host hostname
  launch on the specified host

-hosts n host1 host2 ... hostn
  launch on the n specified hosts

-phrase passphrase
  specify the passphrase used to authenticate connections to smpd

-smpdfile filename
  specify the file where the smpd options are stored, including the passphrase
  (Unix-only option)

-path search path
  search path for the executable, ';' separated

-timeout seconds
  timeout for the job

Windows-specific options:

-map drive:\\host\share
  map a drive on all the nodes; this mapping will be removed when the
  processes exit

-logon
  prompt for user account and password

-pwdfile filename
  read the account and password from the file specified;
  put the account on the first line and the password on the second

-nopopup_debug
  disable the system popup dialog if the process crashes

-priority class[:level]
  set the process startup priority class and optionally level;
  class = 0,1,2,3,4 = idle, below, normal, above, high
  level = 0,1,2,3,4,5 = idle, lowest, below, normal, above, highest
  the default is -priority 2:3

-register
  encrypt a user name and password to the Windows registry

-remove
  delete the encrypted credentials from the Windows registry

-validate [-host hostname]
  validate the encrypted credentials for the current or specified host

-delegate
  use passwordless delegation to launch processes

-impersonate
  use passwordless authentication to launch processes

-plaintext
  don't encrypt the data on the wire
5.5 Extensions for the gforker Process Management Environment
gforker is a process management system for starting processes on a single machine, so called because the MPI processes are simply forked from
the mpiexec process. This process manager supports programs that use
MPI_Comm_spawn and the other dynamic process routines, but does not support the use of the dynamic process routines from programs that are not
started with mpiexec. The gforker process manager is primarily intended
as a debugging aid as it simplifies development and testing of MPI programs
on a single node or processor.
5.5.1 Environment Variables for mpiexec
MPIEXEC_PREFIX_STDOUT Set the prefix used for lines sent to standard output. A %d is replaced with the rank in MPI_COMM_WORLD; a %w is replaced with an indication of which MPI_COMM_WORLD in MPI jobs that
involve multiple MPI_COMM_WORLDs (e.g., ones that use MPI_Comm_spawn
or MPI_Comm_connect).

MPIEXEC_PREFIX_STDERR Like MPIEXEC_PREFIX_STDOUT, but for standard error.

MPIEXEC_STDOUTBUF Sets the buffering mode for standard output. Valid
values are NONE (no buffering), LINE (buffering by lines), and BLOCK
(buffering by blocks of characters; the size of the block is implementation defined). The default is NONE.

MPIEXEC_STDERRBUF Like MPIEXEC_STDOUTBUF, but for standard error.
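For example, to prefix each line of standard output with the rank that produced it (a sketch assuming a bash-like shell):

MPIEXEC_PREFIX_STDOUT='[%d] ' mpiexec -n 4 ./cpi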
5.6 Restrictions of the remshell Process Manager
5.7 Using MPICH with SLURM and PBS
There are multiple ways of using MPICH with SLURM or PBS. Hydra
provides native support for both SLURM and PBS, and is likely the easiest
way to use MPICH on these systems (see the Hydra documentation above
for more details).
Alternatively, SLURM also provides compatibility with MPICH's internal process management interface. To use this, you need to configure
MPICH with SLURM support, and then use the srun job launching utility
provided by SLURM.
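For example, a sketch of building and launching under SLURM (the configure options assume SLURM's PMI library can be found automatically, and the process count is illustrative):

./configure --with-pm=none --with-pmi=slurm
make; make install
srun -n 16 ./app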
For PBS, MPICH jobs can be launched in two ways: (i) using Hydra's
mpiexec with the appropriate options corresponding to PBS, or (ii) using
the OSC mpiexec.
5.7.1 OSC mpiexec
Pete Wyckoff from the Ohio Supercomputer Center provides an alternate utility called OSC mpiexec to launch MPICH jobs on PBS systems. More information can be found here: http://www.osc.edu/~pw/mpiexec
6 Debugging
6.1 TotalView
MPICH supports use of the TotalView debugger from Etnus. If MPICH has
been configured to enable debugging with TotalView then one can debug an
MPI program using
totalview mpiexec -a -n 3 cpi
You will get a popup window from TotalView asking whether you want to
start the job in a stopped state. If so, when the TotalView window appears,
you may see assembly code in the source window. Click on main in the stack
window (upper left) to see the source of the main function. TotalView will
show that the program (all processes) is stopped in the call to MPI_Init.
If you have TotalView 8.1.0 or later, you can use a TotalView feature
called indirect launch with MPICH. Invoke TotalView as:
totalview <program> -a <program args>
Then select the Process/Startup Parameters command. Choose the Parallel tab in the resulting dialog box and choose MPICH as the parallel system. Then set the number of tasks using the Tasks field and enter other
needed mpiexec arguments into the Additional Starter Arguments field.
7 Checkpointing
7.1 Configuring for checkpointing
First, you need to have BLCR version 0.8.2 installed on your machine. If
it's installed in the default system location, add the following two options
to your configure command:
--enable-checkpointing --with-hydra-ckpointlib=blcr

If BLCR is not installed in the default system location, you will also need to add

--with-blcr=BLCR_INSTALL_DIR

where BLCR_INSTALL_DIR is the directory where BLCR has been installed
(whatever was specified in --prefix when BLCR was configured). Note,
checkpointing is only supported with the Hydra process manager. Hydra
will be used by default, unless you choose something else with the --with-pm=
configure option.
After it's configured, compile as usual (e.g., make; make install).
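Putting it together, a complete configure invocation might look like the following (a sketch; the paths are illustrative only):

./configure --prefix=/home/you/mpich-installed \
            --enable-checkpointing \
            --with-hydra-ckpointlib=blcr \
            --with-blcr=/usr/local/blcr
make; make install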
7.2 Taking checkpoints
To use checkpointing, include the -ckpointlib option for mpiexec to specify the checkpointing library to use and -ckpoint-prefix to specify the
directory where the checkpoint images should be written:
shell$ mpiexec -ckpointlib blcr \
-ckpoint-prefix /home/buntinas/ckpts/app.ckpoint \
-f hosts -n 4 ./app
While the application is running, the user can request a checkpoint at
any time by sending a SIGUSR1 signal to mpiexec. You can also automatically checkpoint the application at regular intervals using the mpiexec option
-ckpoint-interval to specify the number of seconds between checkpoints:
shell$ mpiexec -ckpointlib blcr \
-ckpoint-prefix /home/buntinas/ckpts/app.ckpoint \
-ckpoint-interval 3600 -f hosts -n 4 ./app
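A checkpoint can also be requested manually by signaling mpiexec, for example (a sketch assuming a single mpiexec instance on the node and that pgrep is available):

kill -USR1 $(pgrep mpiexec)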
The checkpoint/restart parameters can also be controlled with the environment variables HYDRA_CKPOINTLIB, HYDRA_CKPOINT_PREFIX and
HYDRA_CKPOINT_INTERVAL.
Each checkpoint generates one file per node. Note that checkpoints for
all processes on a node will be stored in the same file. Each time a new
checkpoint is taken, an additional set of files is created. The files are numbered by the checkpoint number. This allows the application to be restarted
from checkpoints other than the most recent. The checkpoint number can
be specified with the -ckpoint-num parameter. To restart a process:
shell$ mpiexec -ckpointlib blcr \
-ckpoint-prefix /home/buntinas/ckpts/app.ckpoint \
-ckpoint-num 5 -f hosts -n 4
Note that by default, the process will be restarted from the first checkpoint, so in most cases, the checkpoint number should be specified.
8 Other Tools Provided with MPICH
MPICH also includes a test suite for MPI functionality; this suite may be
found in the mpich/test/mpi source directory and can be run with the
command make testing. This test suite should work with any MPI implementation, not just MPICH.
9 MPICH on Windows

9.1 Directories
9.2 Compiling
The libraries in the lib directory were compiled with MS Visual C++ .NET
2003 and Intel Fortran 8.1. These compilers and any others that can link
with the MS .lib files can be used to create user applications. gcc and g77
for Cygwin can be used with the libmpich*.a libraries.
For MS Developer Studio users: Create a project and add
C:\Program Files\MPICH\include
to the include path and
C:\Program Files\MPICH\lib
to the library path. Add mpi.lib and cxx.lib to the link command. Add
cxxd.lib to the Debug target link instead of cxx.lib.
Intel Fortran 8 users should add fmpich.lib to the link command.
Cygwin users should use libmpich.a and libfmpichg.a.
9.3 Running
MPI jobs are run from a command prompt using mpiexec.exe. See Section 5.4 on mpiexec for smpd for a description of the options to mpiexec.
References
[1] Message Passing Interface Forum. MPI-2: A Message-Passing Interface Standard. International Journal of High Performance Computing Applications, 12(1-2):1-299, 1998.