Message Passing Interface (MPI)
An Interface Specification:
M P I = Message Passing Interface
MPI is a specification for the developers and users of message passing libraries. By itself, it is NOT a library - but rather the
specification of what such a library should be.
MPI primarily addresses the message-passing parallel programming model: data is moved from the address space of one process to
that of another process through cooperative operations on each process.
Simply stated, the goal of the Message Passing Interface is to provide a widely used standard for writing message passing programs.
The interface attempts to be:
practical
portable
efficient
flexible
The MPI standard has gone through a number of revisions, with the most recent version being MPI-3.
Interface specifications have been defined for C and Fortran90 language bindings:
C++ bindings from MPI-1 are removed in MPI-3
MPI-3 also provides support for Fortran 2003 and 2008 features
Actual MPI library implementations differ in which version and features of the MPI standard they support. Developers/users will
need to be aware of this.
Programming Model:
Originally, MPI was designed for distributed memory architectures, which were becoming increasingly popular at that time (1980s - early 1990s).
As architecture trends changed, shared memory SMPs were combined over networks creating hybrid distributed memory / shared
memory systems.
MPI implementors adapted their libraries to handle both types of underlying memory architectures seamlessly. They also
adapted/developed ways of handling different interconnects and protocols.
A summary of LC's MPI environment is provided here, along with links to additional detailed information.
MVAPICH
General Info:
MVAPICH MPI from Ohio State University is the default MPI library on all of LC's Linux clusters.
As of June 2014, LC's default version is MVAPICH 1.2
MPI-1 implementation that includes support for MPI-I/O, but not for MPI one-sided communication.
Based on MPICH-1.2.7 MPI library from Argonne National Laboratory
Not thread-safe. All MPI calls should be made by the master thread in a multi-threaded MPI program.
See /usr/local/docs/mpi.mvapich.basics for LC usage details.
MVAPICH2 is also available on LC Linux clusters
MPI-2 implementation based on MPICH2 MPI library from Argonne National Laboratory
Not currently the default - requires the "use" command to load the selected dotkit - see https://computing.llnl.gov/?set=jobs&page=dotkit for details.
Thread-safe
See /usr/local/docs/mpi.mvapich2.basics for LC usage details.
MVAPICH2 versions 1.9 and later implement MPI-3 according to the developer's documentation.
A code compiled with MVAPICH on one LC Linux cluster should run on any LC Linux cluster.
Clusters with an interconnect - message passing is done in shared memory on-node and over the switch inter-node
Clusters without an interconnect - message passing is done in shared memory
More information:
/usr/local/docs on LC's clusters:
mpi.basics
mpi.mvapich.basics
mpi.mvapich2.basics
MPI build scripts (compiler wrappers) and the underlying compilers they invoke:

C:       mpicc (gcc), mpigcc (gcc), mpiicc (icc), mpipgcc (pgcc)
C++:     mpiCC (g++), mpig++ (g++), mpiicpc (icpc), mpipgCC (pgCC)
Fortran: mpif77 (g77), mpigfortran (gfortran), mpiifort (ifort), mpipgf77 (pgf77), mpipgf90 (pgf90)
Header File:
Required for all programs that make MPI library calls.
C include file:
#include "mpi.h"
Fortran include file:
include 'mpif.h'
With MPI-3 Fortran, the USE mpi_f08 module is preferred over the Fortran include file shown above.
Format of MPI Calls:
C names are case sensitive; Fortran names are not.
Programs must not declare variables or functions with names beginning with the prefix MPI_ or PMPI_ (profiling interface).
C Binding
Format:
rc = MPI_Xxxxx(parameter, ... )
Example:
rc = MPI_Bsend(&buf,count,type,dest,tag,comm)
Fortran Binding
Format:
CALL MPI_XXXXX(parameter,..., ierr)
Example:
CALL MPI_BSEND(buf,count,type,dest,tag,comm,ierr)
Rank:
Within a communicator, every process has its own unique, integer identifier assigned by the system when the process initializes. A
rank is sometimes also called a "task ID". Ranks are contiguous and begin at zero.
Used by the programmer to specify the source and destination of messages. Often used conditionally by the application to control
program execution (if rank=0 do this / if rank=1 do that), as in the short example below.
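A minimal sketch (illustrative, not from the tutorial itself) of rank-based branching in C:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);                  /* initialize the MPI environment         */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* obtain this task's rank (0..ntasks-1)  */

    if (rank == 0)
        printf("Rank 0: doing the coordinator work\n");
    else
        printf("Rank %d: doing worker work\n", rank);

    MPI_Finalize();                          /* shut down the MPI environment          */
    return 0;
}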
Error Handling:
Most MPI routines include a return/error code parameter, as described in the "Format of MPI Calls" section above.
However, according to the MPI standard, the default behavior of an MPI call is to abort if there is an error. This means you will
probably not be able to capture a return/error code other than MPI_SUCCESS (zero).
The standard does provide a means to override this default error handler so that return codes can actually be checked (see the sketch below). You can
also consult the error handling section of the relevant MPI Standard documentation located at http://www.mpi-forum.org/docs/.
The types of errors displayed to the user are implementation dependent.
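A minimal sketch of capturing return codes, assuming the MPI-2 routine MPI_Comm_set_errhandler is available (older codes use MPI_Errhandler_set for the same purpose):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rc, value = 0, len;
    char msg[MPI_MAX_ERROR_STRING];

    MPI_Init(&argc, &argv);

    /* Replace the default MPI_ERRORS_ARE_FATAL handler so errors are returned as codes */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    rc = MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (rc != MPI_SUCCESS) {
        MPI_Error_string(rc, msg, &len);     /* translate the error code into text */
        fprintf(stderr, "MPI_Bcast failed: %s\n", msg);
    }

    MPI_Finalize();
    return 0;
}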
MPI_Comm_size
Returns the total number of MPI processes in the specified communicator, such as MPI_COMM_WORLD. If the communicator is
MPI_COMM_WORLD, then it represents the number of MPI tasks available to your application.
MPI_Comm_size (comm,&size)
MPI_COMM_SIZE (comm,size,ierr)
MPI_Comm_rank
Returns the rank of the calling MPI process within the specified communicator. Initially, each process will be assigned a unique
integer rank between 0 and number of tasks - 1 within the communicator MPI_COMM_WORLD. This rank is often referred to as a
task ID. If a process becomes associated with other communicators, it will have a unique rank within each of these as well.
MPI_Comm_rank (comm,&rank)
MPI_COMM_RANK (comm,rank,ierr)
MPI_Abort
Terminates all MPI processes associated with the communicator. In most MPI implementations it terminates ALL processes
regardless of the communicator specified.
MPI_Abort (comm,errorcode)
MPI_ABORT (comm,errorcode,ierr)
MPI_Get_processor_name
Returns the processor name. Also returns the length of the name. The buffer for "name" must be at least
MPI_MAX_PROCESSOR_NAME characters in size. What is returned into "name" is implementation dependent - may not be the
same as the output of the "hostname" or "host" shell commands.
MPI_Get_processor_name (&name,&resultlength)
MPI_GET_PROCESSOR_NAME (name,resultlength,ierr)
MPI_Get_version
Returns the version and subversion of the MPI standard that's implemented by the library.
MPI_Get_version (&version,&subversion)
MPI_GET_VERSION (version,subversion,ierr)
MPI_Initialized
Indicates whether MPI_Init has been called - returns flag as either logical true (1) or false (0). MPI requires that MPI_Init be called
once and only once by each process. This may pose a problem for modules that want to use MPI and are prepared to call MPI_Init if
necessary. MPI_Initialized solves this problem.
MPI_Initialized (&flag)
MPI_INITIALIZED (flag,ierr)
MPI_Wtime
Returns an elapsed wall clock time in seconds (double precision) on the calling processor.
MPI_Wtime ()
MPI_WTIME ()
MPI_Wtick
Returns the resolution in seconds (double precision) of MPI_Wtime.
MPI_Wtick ()
MPI_WTICK ()
MPI_Finalize
Terminates the MPI execution environment. This function should be the last MPI routine called in every MPI program - no other
MPI routines may be called after it.
MPI_Finalize ()
MPI_FINALIZE (ierr)
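Pulling the environment management routines together, a small self-contained sketch in C (illustrative, not LC-specific code):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int  ntasks, rank, len, version, subversion;
    char hostname[MPI_MAX_PROCESSOR_NAME];
    double t_start, t_end;

    MPI_Init(&argc, &argv);                              /* must be called exactly once    */
    MPI_Comm_size(MPI_COMM_WORLD, &ntasks);              /* how many tasks in the job      */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);                /* this task's rank               */
    MPI_Get_processor_name(hostname, &len);              /* implementation-dependent name  */
    MPI_Get_version(&version, &subversion);              /* MPI standard level             */

    t_start = MPI_Wtime();
    /* ... application work would go here ... */
    t_end = MPI_Wtime();

    printf("Task %d of %d on %s (MPI %d.%d), elapsed %.6f s, tick %.2e s\n",
           rank, ntasks, hostname, version, subversion,
           t_end - t_start, MPI_Wtick());

    MPI_Finalize();                                      /* last MPI call in the program   */
    return 0;
}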
Point-to-point send routines come in several flavors, including blocking and non-blocking sends and receives, synchronous and
buffered sends, combined send/receive, and "ready" send.
Any type of send routine can be paired with any type of receive routine.
MPI also provides several routines associated with send - receive operations, such as those used to wait for a message's arrival or
probe to find out if a message has arrived.
Buffering:
In a perfect world, every send operation would be perfectly synchronized with its matching receive. This is rarely the case.
Somehow or other, the MPI implementation must be able to deal with storing data when the two tasks are out of sync.
Consider the following two cases:
A send operation occurs 5 seconds before the receive is ready - where is the message while the receive is pending?
Multiple sends arrive at the same receiving task which can only accept one send at a time - what happens to the messages that
are "backing up"?
The MPI implementation (not the MPI standard) decides what happens to data in these types of cases. Typically, a system buffer
area is reserved to hold data in transit.
Blocking vs. Non-blocking:
Blocking send and receive routines only "return" after it is safe to modify or use the application buffer.
Non-blocking send and receive routines return almost immediately. They do not wait for any communication events to complete,
such as message copying from user memory to system buffer space or the actual arrival of the message.
Non-blocking operations simply "request" the MPI library to perform the operation when it is able. The user cannot predict
when that will happen.
It is unsafe to modify the application buffer (your variable space) until you know for a fact the requested non-blocking
operation was actually performed by the library. There are "wait" routines used to do this.
Non-blocking communications are primarily used to overlap computation with communication and exploit possible
performance gains, as in the sketch below.
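A minimal non-blocking exchange sketch in C (illustrative): post MPI_Irecv and MPI_Isend for the ring neighbors, do independent work, then call MPI_Waitall before touching the buffers.

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, ntasks, prev, next;
    int sendbuf, recvbuf;
    MPI_Request reqs[2];
    MPI_Status  stats[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &ntasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    prev = (rank - 1 + ntasks) % ntasks;          /* left neighbor in a ring   */
    next = (rank + 1) % ntasks;                   /* right neighbor in a ring  */
    sendbuf = rank;

    MPI_Irecv(&recvbuf, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&sendbuf, 1, MPI_INT, next, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... computation that does not touch sendbuf/recvbuf can overlap here ... */

    MPI_Waitall(2, reqs, stats);                  /* buffers are safe to use after this */
    printf("Task %d received %d from task %d\n", rank, recvbuf, prev);

    MPI_Finalize();
    return 0;
}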
Order and Fairness:
Order:
MPI guarantees that messages will not overtake each other.
If a sender sends two messages (Message 1 and Message 2) in succession to the same destination, and both match the same
receive, the receive operation will receive Message 1 before Message 2.
If a receiver posts two receives (Receive 1 and Receive 2), in succession, and both are looking for the same message, Receive
1 will receive the message before Receive 2.
Order rules do not apply if there are multiple threads participating in the communication operations.
Fairness:
MPI does not guarantee fairness - it's up to the programmer to prevent "operation starvation".
Example: task 0 sends a message to task 2. However, task 1 sends a competing message that matches task 2's receive. Only
one of the sends will complete.
MPI point-to-point communication routines generally take an argument list in one of the following formats:
Blocking send
MPI_Send(buffer,count,type,dest,tag,comm)
Non-blocking send
MPI_Isend(buffer,count,type,dest,tag,comm,request)
Blocking receive
MPI_Recv(buffer,count,type,source,tag,comm,status)
Non-blocking receive
MPI_Irecv(buffer,count,type,source,tag,comm,request)
Buffer
Program (application) address space that references the data that is to be sent or received. In most cases, this is simply the variable
name that is to be sent/received. For C programs, this argument is passed by reference and usually must be prepended with an
ampersand: &var1
Data Count
Indicates the number of data elements of a particular type to be sent.
Data Type
For reasons of portability, MPI predefines its elementary data types. The tables below list those required by the standard.
C Data Types

MPI_CHAR                     signed char
MPI_WCHAR                    wchar_t
MPI_SHORT                    signed short int
MPI_INT                      signed int
MPI_LONG                     signed long int
MPI_LONG_LONG_INT
MPI_LONG_LONG                signed long long int
MPI_SIGNED_CHAR              signed char
MPI_UNSIGNED_CHAR            unsigned char
MPI_UNSIGNED_SHORT           unsigned short int
MPI_UNSIGNED                 unsigned int
MPI_UNSIGNED_LONG            unsigned long int
MPI_UNSIGNED_LONG_LONG       unsigned long long int
MPI_FLOAT                    float
MPI_DOUBLE                   double
MPI_LONG_DOUBLE              long double
MPI_C_COMPLEX
MPI_C_FLOAT_COMPLEX          float _Complex
MPI_C_DOUBLE_COMPLEX         double _Complex
MPI_C_LONG_DOUBLE_COMPLEX    long double _Complex
MPI_C_BOOL                   _Bool
MPI_INT8_T                   int8_t
MPI_INT16_T                  int16_t
MPI_INT32_T                  int32_t
MPI_INT64_T                  int64_t
MPI_UINT8_T                  uint8_t
MPI_UINT16_T                 uint16_t
MPI_UINT32_T                 uint32_t
MPI_UINT64_T                 uint64_t
MPI_BYTE                     8 binary digits
MPI_PACKED                   data packed or unpacked with MPI_Pack()/MPI_Unpack()

Fortran Data Types

MPI_CHARACTER                character(1)
MPI_INTEGER                  integer
MPI_INTEGER1                 integer*1
MPI_INTEGER2                 integer*2
MPI_INTEGER4                 integer*4
MPI_REAL                     real
MPI_REAL2                    real*2
MPI_REAL4                    real*4
MPI_REAL8                    real*8
MPI_DOUBLE_PRECISION         double precision
MPI_COMPLEX                  complex
MPI_DOUBLE_COMPLEX           double complex
MPI_LOGICAL                  logical
MPI_BYTE                     8 binary digits
MPI_PACKED                   data packed or unpacked with MPI_Pack()/MPI_Unpack()
Notes:
Programmers may also create their own data types (see Derived Data Types).
MPI_Recv
Receive a message and block until the requested data is available in the application buffer in the receiving task.
MPI_Recv (&buf,count,datatype,source,tag,comm,&status)
MPI_RECV (buf,count,datatype,source,tag,comm,status,ierr)
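A minimal blocking send/receive sketch (illustrative; assumes the job is run with at least two tasks):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, msg;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        msg = 42;
        MPI_Send(&msg, 1, MPI_INT, 1, 99, MPI_COMM_WORLD);           /* dest=1, tag=99             */
    } else if (rank == 1) {
        MPI_Recv(&msg, 1, MPI_INT, 0, 99, MPI_COMM_WORLD, &status);  /* blocks until data arrives  */
        printf("Task 1 received %d from task %d with tag %d\n",
               msg, status.MPI_SOURCE, status.MPI_TAG);
    }

    MPI_Finalize();
    return 0;
}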
MPI_Ssend
Synchronous blocking send: Send a message and block until the application buffer in the sending task is free for reuse and the
destination process has started to receive the message.
MPI_Ssend (&buf,count,datatype,dest,tag,comm)
MPI_SSEND (buf,count,datatype,dest,tag,comm,ierr)
MPI_Bsend
Buffered blocking send: permits the programmer to allocate the required amount of buffer space into which data can be copied until
it is delivered. Insulates against the problems associated with insufficient system buffer space. Routine returns after the data has
been copied from application buffer space to the allocated send buffer. Must be used with the MPI_Buffer_attach routine.
MPI_Bsend (&buf,count,datatype,dest,tag,comm)
MPI_BSEND (buf,count,datatype,dest,tag,comm,ierr)
MPI_Buffer_attach
MPI_Buffer_detach
Used by programmer to allocate/deallocate message buffer space to be used by the MPI_Bsend routine. The size argument is
specified in actual data bytes - not a count of data elements. Only one buffer can be attached to a process at a time.
MPI_Buffer_attach (&buffer,size)
MPI_Buffer_detach (&buffer,size)
MPI_BUFFER_ATTACH (buffer,size,ierr)
MPI_BUFFER_DETACH (buffer,size,ierr)
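A buffered send sketch (illustrative; assumes at least two tasks). Note that MPI_BSEND_OVERHEAD bytes must be added to the payload size when attaching the buffer:

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, data = 7, bufsize;
    void *buf;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    bufsize = sizeof(int) + MPI_BSEND_OVERHEAD;        /* payload bytes + required overhead  */
    buf = malloc(bufsize);
    MPI_Buffer_attach(buf, bufsize);                   /* one attached buffer per process    */

    if (rank == 0)
        MPI_Bsend(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* returns once data is copied */
    else if (rank == 1)
        MPI_Recv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);

    MPI_Buffer_detach(&buf, &bufsize);                 /* blocks until buffered data is sent */
    free(buf);
    MPI_Finalize();
    return 0;
}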
MPI_Rsend
Blocking ready send. Should only be used if the programmer is certain that the matching receive has already been posted.
MPI_Rsend (&buf,count,datatype,dest,tag,comm)
MPI_RSEND (buf,count,datatype,dest,tag,comm,ierr)
MPI_Sendrecv
Send a message and post a receive before blocking. Will block until the sending application buffer is free for reuse and until the
receiving application buffer contains the received message.
MPI_Sendrecv (&sendbuf,sendcount,sendtype,dest,sendtag,
...... &recvbuf,recvcount,recvtype,source,recvtag,
...... comm,&status)
MPI_SENDRECV (sendbuf,sendcount,sendtype,dest,sendtag,
...... recvbuf,recvcount,recvtype,source,recvtag,
...... comm,status,ierr)
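A ring-shift sketch using MPI_Sendrecv (illustrative); combining the send and receive avoids the deadlock that can occur when every task issues a blocking send first:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, ntasks, left, right, sendval, recvval;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &ntasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    right = (rank + 1) % ntasks;
    left  = (rank - 1 + ntasks) % ntasks;
    sendval = rank;

    /* Send to the right neighbor while receiving from the left neighbor */
    MPI_Sendrecv(&sendval, 1, MPI_INT, right, 0,
                 &recvval, 1, MPI_INT, left,  0,
                 MPI_COMM_WORLD, &status);

    printf("Task %d received %d from task %d\n", rank, recvval, left);
    MPI_Finalize();
    return 0;
}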
MPI_Wait
MPI_Waitany
MPI_Waitall
MPI_Waitsome
MPI_Wait blocks until a specified non-blocking send or receive operation has completed. For multiple non-blocking operations, the
programmer can specify any, all or some completions.
MPI_Wait (&request,&status)
MPI_Waitany (count,&array_of_requests,&index,&status)
MPI_Waitall (count,&array_of_requests,&array_of_statuses)
MPI_Waitsome (incount,&array_of_requests,&outcount,
...... &array_of_offsets, &array_of_statuses)
MPI_WAIT (request,status,ierr)
MPI_WAITANY (count,array_of_requests,index,status,ierr)
MPI_WAITALL (count,array_of_requests,array_of_statuses,
...... ierr)
MPI_WAITSOME (incount,array_of_requests,outcount,
...... array_of_offsets, array_of_statuses,ierr)
MPI_Probe
Performs a blocking test for a message. The "wildcards" MPI_ANY_SOURCE and MPI_ANY_TAG may be used to test for a
message from any source or with any tag. For the C routine, the actual source and tag will be returned in the status structure as
status.MPI_SOURCE and status.MPI_TAG. For the Fortran routine, they will be returned in the integer array status(MPI_SOURCE)
and status(MPI_TAG).
MPI_Probe (source,tag,comm,&status)
MPI_PROBE (source,tag,comm,status,ierr)
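A common pattern (sketched here, assuming at least two tasks) is to probe for a message of unknown size, query its length with MPI_Get_count, and then post a matching receive:

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, count;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int data[5] = {1, 2, 3, 4, 5};
        MPI_Send(data, 5, MPI_INT, 1, 7, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int *recvbuf;
        MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);  /* block until a message is pending   */
        MPI_Get_count(&status, MPI_INT, &count);                          /* number of MPI_INTs in that message */
        recvbuf = malloc(count * sizeof(int));
        MPI_Recv(recvbuf, count, MPI_INT, status.MPI_SOURCE, status.MPI_TAG,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Task 1 received %d ints from task %d\n", count, status.MPI_SOURCE);
        free(recvbuf);
    }

    MPI_Finalize();
    return 0;
}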
MPI_Irecv
Identifies an area in memory to serve as a receive buffer. Processing continues immediately without actually waiting for the message
to be received and copied into the application buffer. A communication request handle is returned for handling the pending
message status. The program must use calls to MPI_Wait or MPI_Test to determine when the non-blocking receive operation
completes and the requested message is available in the application buffer.
MPI_Irecv (&buf,count,datatype,source,tag,comm,&request)
MPI_IRECV (buf,count,datatype,source,tag,comm,request,ierr)
MPI_Issend
Non-blocking synchronous send. Similar to MPI_Isend(), except MPI_Wait() or MPI_Test() indicates when the destination process
has received the message.
MPI_Issend (&buf,count,datatype,dest,tag,comm,&request)
MPI_ISSEND (buf,count,datatype,dest,tag,comm,request,ierr)
MPI_Ibsend
Non-blocking buffered send. Similar to MPI_Bsend() except MPI_Wait() or MPI_Test() indicates when the destination process has
received the message. Must be used with the MPI_Buffer_attach routine.
MPI_Ibsend (&buf,count,datatype,dest,tag,comm,&request)
MPI_IBSEND (buf,count,datatype,dest,tag,comm,request,ierr)
MPI_Irsend
Non-blocking ready send. Similar to MPI_Rsend() except MPI_Wait() or MPI_Test() indicates when the destination process has
received the message. Should only be used if the programmer is certain that the matching receive has already been posted.
MPI_Irsend (&buf,count,datatype,dest,tag,comm,&request)
MPI_IRSEND (buf,count,datatype,dest,tag,comm,request,ierr)
MPI_Test
MPI_Testany
MPI_Testall
MPI_Testsome
MPI_Test checks the status of a specified non-blocking send or receive operation. The "flag" parameter is returned logical true (1) if
the operation has completed, and logical false (0) if not. For multiple non-blocking operations, the programmer can specify any, all
or some completions.
MPI_Test (&request,&flag,&status)
MPI_Testany (count,&array_of_requests,&index,&flag,&status)
MPI_Testall (count,&array_of_requests,&flag,&array_of_statuses)
MPI_Testsome (incount,&array_of_requests,&outcount,
...... &array_of_offsets,&array_of_statuses)
MPI_TEST (request,flag,status,ierr)
MPI_TESTANY (count,array_of_requests,index,flag,status,ierr)
MPI_TESTALL (count,array_of_requests,flag,array_of_statuses,ierr)
MPI_TESTSOME (incount,array_of_requests,outcount,
...... array_of_offsets,array_of_statuses,ierr)
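A polling sketch with MPI_Test (illustrative; assumes at least two tasks): post a non-blocking receive, then test for completion between pieces of other work:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, value = 0, flag = 0;
    MPI_Request request;
    MPI_Status  status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 123;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        while (!flag) {
            /* ... useful work could be done here between tests ... */
            MPI_Test(&request, &flag, &status);   /* flag becomes true when the receive completes */
        }
        printf("Task 1 eventually received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}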
It is the programmer's responsibility to ensure that all processes within a communicator participate in any collective operations.
Unexpected behavior, including program failure, can occur if even one task in the communicator doesn't participate.
Types of Collective Operations:
Synchronization - processes wait until all members of the
group have reached the synchronization point.
Data Movement - broadcast, scatter/gather, all to all.
Collective Computation (reductions) - one member of the
group collects data from the other members and performs an
operation (min, max, add, multiply, etc.) on that data.
Programming Considerations and Restrictions:
With MPI-3, collective operations can be blocking or nonblocking. Only blocking operations are covered in this tutorial.
Collective communication routines do not take message tag
arguments.
Collective operations within subsets of processes are accomplished by first partitioning the subsets into new groups and then
attaching the new groups to new communicators (discussed in the Group and Communicator Management Routines section).
Can only be used with MPI predefined datatypes - not with MPI Derived Data Types.
MPI-2 extended most collective operations to allow data movement between intercommunicators (not covered here).
MPI_Bcast
Data movement operation. Broadcasts (sends) a message from the process with rank "root" to all other processes in the group.
Diagram Here
MPI_Bcast (&buffer,count,datatype,root,comm)
MPI_BCAST (buffer,count,datatype,root,comm,ierr)
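A broadcast sketch (illustrative): only the root knows the value initially, and every task - including the root - calls MPI_Bcast:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, nsteps = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        nsteps = 1000;                                   /* only root has the value initially */

    MPI_Bcast(&nsteps, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* all tasks make the same call      */
    printf("Task %d: nsteps = %d\n", rank, nsteps);

    MPI_Finalize();
    return 0;
}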
MPI_Scatter
Data movement operation. Distributes distinct messages from a single source task to each task in the group.
Diagram Here
MPI_Scatter (&sendbuf,sendcnt,sendtype,&recvbuf,
...... recvcnt,recvtype,root,comm)
MPI_SCATTER (sendbuf,sendcnt,sendtype,recvbuf,
...... recvcnt,recvtype,root,comm,ierr)
MPI_Gather
Data movement operation. Gathers distinct messages from each task in the group to a single destination task. This routine is the
reverse operation of MPI_Scatter.
Diagram Here
MPI_Gather (&sendbuf,sendcnt,sendtype,&recvbuf,
...... recvcount,recvtype,root,comm)
MPI_GATHER (sendbuf,sendcnt,sendtype,recvbuf,
...... recvcount,recvtype,root,comm,ierr)
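A scatter/gather sketch (illustrative; NPER is a made-up per-task chunk size): the root scatters one chunk to each task, each task modifies its chunk, and the root gathers the results back in rank order:

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

#define NPER 4   /* illustrative number of elements handled by each task */

int main(int argc, char *argv[]) {
    int rank, ntasks, i;
    int chunk[NPER];
    int *full = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &ntasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                                    /* root builds the full array */
        full = malloc(ntasks * NPER * sizeof(int));
        for (i = 0; i < ntasks * NPER; i++)
            full[i] = i;
    }

    /* Each task receives its NPER-element slice of the root's array */
    MPI_Scatter(full, NPER, MPI_INT, chunk, NPER, MPI_INT, 0, MPI_COMM_WORLD);

    for (i = 0; i < NPER; i++)                          /* local work on the slice */
        chunk[i] *= 2;

    /* Root collects the modified slices back, ordered by rank */
    MPI_Gather(chunk, NPER, MPI_INT, full, NPER, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("full[0..3] after gather: %d %d %d %d\n", full[0], full[1], full[2], full[3]);
        free(full);
    }

    MPI_Finalize();
    return 0;
}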
MPI_Allgather
Data movement operation. Concatenation of data to all tasks in a group. Each task in the group, in effect, performs a one-to-all
broadcasting operation within the group.
Diagram Here
MPI_Allgather (&sendbuf,sendcount,sendtype,&recvbuf,
...... recvcount,recvtype,comm)
MPI_ALLGATHER (sendbuf,sendcount,sendtype,recvbuf,
...... recvcount,recvtype,comm,ierr)
MPI_Reduce
Collective computation operation. Applies a reduction operation on all tasks in the group and places the result in one task.
Diagram Here
MPI_Reduce (&sendbuf,&recvbuf,count,datatype,op,root,comm)
MPI_REDUCE (sendbuf,recvbuf,count,datatype,op,root,comm,ierr)
The predefined MPI reduction operations appear below. Users can also define their own reduction functions by using the
MPI_Op_create routine.
MPI Reduction Operation      Operation                    C Data Types         Fortran Data Types

MPI_MAX                      maximum                      integer, float       integer, real
MPI_MIN                      minimum                      integer, float       integer, real
MPI_SUM                      sum                          integer, float       integer, real
MPI_PROD                     product                      integer, float       integer, real
MPI_LAND                     logical AND                  integer              logical
MPI_BAND                     bit-wise AND                 integer, MPI_BYTE    integer, MPI_BYTE
MPI_LOR                      logical OR                   integer              logical
MPI_BOR                      bit-wise OR                  integer, MPI_BYTE    integer, MPI_BYTE
MPI_LXOR                     logical XOR                  integer              logical
MPI_BXOR                     bit-wise XOR                 integer, MPI_BYTE    integer, MPI_BYTE
MPI_MAXLOC                   maximum value and location
MPI_MINLOC                   minimum value and location
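A reduction sketch (illustrative): every task contributes its rank and the root receives the sum:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, ntasks, local, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &ntasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local = rank;                                     /* each task's contribution       */
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)                                    /* only the root holds the result */
        printf("Sum of ranks 0..%d = %d\n", ntasks - 1, total);

    MPI_Finalize();
    return 0;
}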
MPI_Allreduce
Collective computation operation + data movement. Applies a reduction operation and places the result in all tasks in the group. This
is equivalent to an MPI_Reduce followed by an MPI_Bcast.
Diagram Here
MPI_Allreduce (&sendbuf,&recvbuf,count,datatype,op,comm)
MPI_ALLREDUCE (sendbuf,recvbuf,count,datatype,op,comm,ierr)
MPI_Reduce_scatter
Collective computation operation + data movement. First does an element-wise reduction on a vector across all tasks in the group.
Next, the result vector is split into disjoint segments and distributed across the tasks. This is equivalent to an MPI_Reduce followed
by an MPI_Scatter operation.
Diagram Here
MPI_Reduce_scatter (&sendbuf,&recvbuf,recvcount,datatype,
...... op,comm)
MPI_REDUCE_SCATTER (sendbuf,recvbuf,recvcount,datatype,
...... op,comm,ierr)
MPI_Alltoall
Data movement operation. Each task in a group performs a scatter operation, sending a distinct message to all the tasks in the group
in order by index.
Diagram Here
MPI_Alltoall (&sendbuf,sendcount,sendtype,&recvbuf,
...... recvcnt,recvtype,comm)
MPI_ALLTOALL (sendbuf,sendcount,sendtype,recvbuf,
...... recvcnt,recvtype,comm,ierr)
MPI_Scan
Performs a scan operation with respect to a reduction operation across a task group.
Diagram Here
MPI_Scan (&sendbuf,&recvbuf,count,datatype,op,comm)
MPI_SCAN (sendbuf,recvbuf,count,datatype,op,comm,ierr)
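A scan sketch (illustrative): an inclusive prefix sum, where task i ends up with the sum of the contributions from tasks 0 through i:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, value, prefix;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    value = rank + 1;                                        /* task i contributes i+1 */
    MPI_Scan(&value, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("Task %d: inclusive prefix sum = %d\n", rank, prefix);  /* 1, 3, 6, 10, ... */

    MPI_Finalize();
    return 0;
}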