Lecture 11 MPI Point-to-Point Communication Programming
Presenter: Liangqiong Qu
Assistant Professor
Process rank
• MPI uses objects called communicators and groups to define which collection of
processes may communicate with each other.
• Within a communicator, every process has its own unique integer identifier (rank),
assigned by the system when the process initializes.
Review of Lecture 10 --- Beginner’s MPI Toolbox
• Send/receive buffer may safely be reused after the call has completed
• MPI_Send() must specify a particular destination rank and tag; MPI_Recv() does not (it may use wildcards)
Review of Lecture 10 --- Point-to-Point Communication
▪ For a communication to succeed:
• The sender must specify a valid destination.
• The receiver must specify a valid source rank (or MPI_ANY_SOURCE).
• The communicator used by the sender and receiver must be the same
(e.g., MPI_COMM_WORLD).
• The tags specified by the sender and receiver must match (or
MPI_ANY_TAG for receiver).
• The data types of the messages being sent and received must match.
• The receiver's buffer must be large enough to hold the received message.
Outline
▪ Examples
MPI Blocking Point-to-Point Communication
▪ MPI_Send
• MPI_Send blocks until the whole message has arrived at the receiver or the whole message has
been copied into a system buffer.
• After MPI_Send returns, the send buffer is safe for reuse.
▪ MPI_Recv
• MPI_Recv blocks until the message has been received into the buffer specified by the buffer
argument.
• If no message is available, the process remains blocked until one becomes available.
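As an illustration, here is a minimal sketch of a blocking exchange between rank 0 and rank 1. The variable names and message value are made up for illustration; the lecture's own examples may differ.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Blocks until 'value' has been copied out (to the receiver
           or into a system buffer); afterwards it is safe to reuse. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Blocks until the message has actually arrived in 'value'. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}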
[Figure: data path of a blocking transfer through the system buffer and the network between sender and receiver]
Question: What happens with the above operation in MPI blocking point-to-point
communication?
Deadlock in MPI Blocking Point-to-Point Communication
• Two processes each initiate a blocking send to the other without posting a receive
• A deadlock occurs when two or more processes block indefinitely while each waits for the
other to act on the same set of resources (here: to post the matching receive)
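A minimal sketch of that situation, assuming a message too large for the system buffer (the buffer size N is a made-up parameter, and exactly two ranks are assumed):

#include <mpi.h>
#include <stdlib.h>

/* Deadlock sketch: both ranks post a blocking send first. For messages
 * too large for the system buffer, neither MPI_Send can complete, so
 * neither process ever reaches its MPI_Recv. */
#define N (1 << 22)

int main(int argc, char *argv[]) {
    int rank, other;
    double *sendbuf = malloc(N * sizeof(double));
    double *recvbuf = malloc(N * sizeof(double));

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;                       /* assumes exactly 2 ranks */

    MPI_Send(sendbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD);  /* both ranks block here */
    MPI_Recv(recvbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);                                 /* never reached */

    MPI_Finalize();
    free(sendbuf); free(recvbuf);
    return 0;
}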
Deadlock Prevention
[Figure: code listings for Process 0 and Process 1]
To complete the first communication, both programs have to wait for 5 seconds. Then,
to complete the second communication, both have to wait an additional 6 seconds.
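One common prevention pattern, shown here as a sketch that reuses the buffers from the deadlock sketch above (it is not necessarily the exact code on the slide), is to order the blocking calls by rank, or to let MPI_Sendrecv handle both directions at once:

/* Rank-ordered send/receive: even ranks send first, odd ranks receive
 * first, so a matching receive is always posted for every send. */
if (rank % 2 == 0) {
    MPI_Send(sendbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD);
    MPI_Recv(recvbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
} else {
    MPI_Recv(recvbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Send(sendbuf, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD);
}

/* Alternatively, MPI_Sendrecv combines both operations and lets the
 * MPI library order them internally. */
MPI_Sendrecv(sendbuf, N, MPI_DOUBLE, other, 0,
             recvbuf, N, MPI_DOUBLE, other, 0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);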
Nonblocking Point-to-Point Communication
▪ Blocking point-to-point communication
• Limitations: may cause deadlock, idle time, etc
• Return only when the buffer is ready to be reused
▪ Nonblocking point-to-point communication
• A nonblocking MPI call returns immediately to the next statement without
waiting for the task to complete. This enables other work to proceed right away.
• You cannot safely reuse the send buffer until a wait/test has completed
successfully, or you otherwise know for certain that the message has been received.
Nonblocking Point-to-Point Communication
▪ Nonblocking point-to-point communication
▪ Nonblocking calls merely initiate the communication process
▪ The status of the data transfer, and the success of the communication, must be
verified at a later point in the program.
▪ The purpose of a nonblocking send is mostly to notify the system of the
existence of an outgoing message: the actual transfer might take place later.
▪ It is up to the programmer to keep the send buffer intact until it can be verified
that the message has actually been copied someplace else.
▪ In either case, before trusting the contents of the message buffer, the
programmer must check its status using the MPI_Wait or MPI_Test functions.
Standard Nonblocking Send/Receive
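The standard nonblocking send and receive have the following C signatures; both return immediately and hand back a request handle that is later passed to a wait/test call:

int MPI_Isend(const void *buf, int count, MPI_Datatype datatype,
              int dest, int tag, MPI_Comm comm, MPI_Request *request);

int MPI_Irecv(void *buf, int count, MPI_Datatype datatype,
              int source, int tag, MPI_Comm comm, MPI_Request *request);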
• MPI_Wait: you hang around the counter until your order number is definitely
ready. (Maybe your friends won't talk to you until you bring them some food!)
• MPI_Test: you go to the counter and ask whether your order number is ready. (If it
isn't, you can go back to your table and talk some more with your friends.)
MPI_Wait
• Waiting forces the process into "blocking mode": the sending process simply
waits for the request to finish. If a process calls MPI_Wait right after MPI_Isend, the send
behaves the same as calling MPI_Send.
Figure 1. Inserting the Wait call immediately after the Isend call emulates a blocking send, and the extra
call doesn't even add very much overhead.
• There are three ways to wait: MPI_Wait, MPI_Waitany, and MPI_Waitall. They can be
called as follows:
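Their C signatures are:

int MPI_Wait(MPI_Request *request, MPI_Status *status);

int MPI_Waitany(int count, MPI_Request array_of_requests[],
                int *index, MPI_Status *status);

int MPI_Waitall(int count, MPI_Request array_of_requests[],
                MPI_Status array_of_statuses[]);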
• MPI_Wait just waits for the completion of the given request (single request). As soon
as the request is complete an instance of MPI_Status is returned in status.
• MPI can handle multiple communication requests
• MPI_Waitany waits for the first completed request in an array of requests before
continuing. As soon as a request completes, index is set to the position of the
completed request within array_of_requests
• MPI_Waitall waits for the completion of an array of requests; it returns only after all
provided requests have completed
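For example, a sketch (a fragment from inside an MPI program; buffer names and message sizes are illustrative) of waiting on several outstanding receives at once:

/* Post several nonblocking receives, then wait for all of them. */
MPI_Request reqs[3];
MPI_Status  stats[3];
double buf0[10], buf1[10], buf2[10];

MPI_Irecv(buf0, 10, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &reqs[0]);
MPI_Irecv(buf1, 10, MPI_DOUBLE, 2, 0, MPI_COMM_WORLD, &reqs[1]);
MPI_Irecv(buf2, 10, MPI_DOUBLE, 3, 0, MPI_COMM_WORLD, &reqs[2]);

/* ... other work can proceed while the messages are in flight ... */

MPI_Waitall(3, reqs, stats);   /* returns only when all three have completed */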
MPI_Test
• flag: variable of type int used to test for completion. It tells whether the request had completed
at the time of the test. If flag != 0, the request has been completed.
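Its C signature, and a sketch of polling while doing other work (the request is assumed to come from an earlier MPI_Isend/MPI_Irecv, and do_other_work() is a hypothetical helper used only for illustration):

int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status);

/* Poll the request and keep computing until it completes. */
int flag = 0;
MPI_Status status;
while (!flag) {
    MPI_Test(&request, &flag, &status);  /* flag != 0 once the request completes */
    if (!flag)
        do_other_work();                 /* hypothetical helper */
}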
Example of Nonblocking Point-to-Point Communication
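The slide's code is not reproduced here; a minimal sketch of the same idea, with two ranks exchanging a value via MPI_Isend/MPI_Irecv and completing with MPI_Waitall (variable names are illustrative), might look like this:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, other, sendval, recvval;
    MPI_Request reqs[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other   = 1 - rank;          /* assumes exactly 2 ranks */
    sendval = rank + 100;

    /* Both calls return immediately; the transfers proceed "in the background". */
    MPI_Irecv(&recvval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&sendval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &reqs[1]);

    /* Other work could be overlapped here. */

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    printf("Rank %d received %d\n", rank, recvval);

    MPI_Finalize();
    return 0;
}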
Nonblocking Point-to-Point Communication
• Standard nonblocking send/recv MPI_Isend()/MPI_Irecv()
• Return of call does not imply completion of operation
• Use MPI_Wait*() / MPI_Test*() to check for completion using request handles
• Potential benefits
• Overlapping of communication with work (not guaranteed by MPI standard)
• Overlapping send and receive
• Avoiding synchronization and idle times
▪ double MPI_Wtime();
Returns the current wall-clock time stamp, in seconds
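A typical usage sketch (a fragment for timing a code region; requires stdio.h for the printf):

double t_start, t_end;

t_start = MPI_Wtime();            /* wall-clock time in seconds */
/* ... code region to be timed ... */
t_end = MPI_Wtime();

printf("Elapsed time: %f seconds\n", t_end - t_start);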
• Split up interval [a, b] into equal disjoint chunks and compute partial results in parallel
Example 2. Parallel Integration in MPI
Task: calculate the integral over [a, b] in parallel,
using 4 processes; let a = 0, b = 2
• Split up interval [a, b] into equal
disjoint chunks
• Compute partial results in parallel
• Collect global sum at rank 0
Example 2. Parallel Integration in MPI
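A sketch of this master-worker scheme, assuming a midpoint rule and a placeholder integrand f(x) = x² (the lecture's actual code and integrand may differ):

#include <mpi.h>
#include <stdio.h>

/* Placeholder integrand -- the lecture's actual f(x) may differ. */
static double f(double x) { return x * x; }

int main(int argc, char *argv[]) {
    int rank, size, i, n = 100000;           /* points per local chunk */
    double a = 0.0, b = 2.0;
    double local_a, local_b, x, res = 0.0, sum;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Split [a, b] into 'size' equal disjoint chunks. */
    local_a = a + rank * (b - a) / size;
    local_b = local_a + (b - a) / size;

    /* Midpoint rule on the local chunk: the partial result. */
    for (i = 0; i < n; i++) {
        x = local_a + (i + 0.5) * (local_b - local_a) / n;
        res += f(x) * (local_b - local_a) / n;
    }

    if (rank == 0) {
        sum = res;                            /* master's own partial result */
        for (i = 1; i < size; i++) {          /* one receive at a time */
            MPI_Recv(&res, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            sum += res;
        }
        printf("Integral over [%g, %g] = %f\n", a, b, sum);
    } else {
        MPI_Send(&res, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}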
Remarks on Parallel Integration Example
• Gathering results from processes is a very common task in MPI - there are more
efficient and elegant ways to do this (covered in future lectures).
• This is a reduction operation (summation). There are more efficient and elegant ways
to do this (covered in future lectures).
• The “master” process waits for one receive operation to be completed before the next
one is initiated. There are more efficient ways... You guessed it!
• “Master-worker” schemes are quite common in MPI programming but scalability
to high process counts may be limited.
• Every process has its own res variable, but only the master process actually uses it;
it is typical for MPI codes to use more memory than actually needed.
Thank You!