
CS-3006 Lecture 7: MPI Advanced Topics

The document provides an overview of collective communication in the Message Passing Interface (MPI), detailing key operations such as Broadcast, Scatter, Gather, and Reduce. It outlines the properties and functionalities of these operations, including synchronization and data reduction methods. Additionally, it introduces various MPI functions like MPI_Bcast, MPI_Scatter, MPI_Gather, and their variations, along with examples and demos for practical understanding.

Message Passing Interface (MPI) - Collective Communication

Dr. Muhammad Mateen Yaqoob,

Department of AI & DS,


National University of Computer & Emerging Sciences,
Islamabad Campus
Collective Communications

• Broadcast
• Scatter
• Gather
• Reduce
Collective Communications
• Processes may need to communicate with everyone else

• Three Main Classes:


1. Communications: Broadcast, Gather, Scatter
2. Synchronization: Barriers
3. Reductions: sum, max, etc.

• Properties:
– Must be executed by all processes of the communicator
– All processes in the group call the same operation at (roughly) the same time
– All collective operations are blocking operations
Broadcast
• A one-to-many communication

[Figure: before the bcast only the root (e.g., root=1) holds the data item; after the bcast every process holds its own copy.]
• root: rank of the sending process (i.e., the root process)
• must be given identically by all processes
Collective communication: Broadcast
Broadcasting with MPI_Bcast

• The contents of the send buffer are copied from the sender (i.e., the root process) to all processes in the communicator (the root keeps its own copy)
• The type signature (number of elements, data type) on every process must be the same as on the root process
Broadcasting with MPI_Bcast
Demo: BroadCast.c
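The demo file itself is not reproduced here; the following is a minimal sketch of what a BroadCast.c-style program might look like (the buffer name, value, and choice of root 0 are illustrative assumptions):

/* Minimal MPI_Bcast sketch: root 0 broadcasts one int to all processes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Only the root fills the buffer before the broadcast. */
    if (rank == 0)
        value = 42;

    /* Every process calls MPI_Bcast with the same root and type signature. */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d received value %d\n", rank, value);

    MPI_Finalize();
    return 0;
}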
Collective communication: Scatter
MPI_Scatter
• MPI_Scatter is a collective routine that is similar to MPI_Bcast
• It sends chunks of an array to different processes
MPI_Scatter

• The root sends a part of its send buffer to each process
• Process k receives sendcount elements starting at sendbuf + k*sendcount
MPI_Scatter - Example
Demo: Scatter.c
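As a hedged illustration (not necessarily the actual Scatter.c), the sketch below has the root scatter one int to each process; the array contents are assumptions for the example:

/* Minimal MPI_Scatter sketch: root 0 sends sendbuf[k] to process k. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, size, recv;
    int *sendbuf = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                      /* send buffer only matters at the root */
        sendbuf = malloc(size * sizeof(int));
        for (int i = 0; i < size; i++)
            sendbuf[i] = 10 * i;          /* element i goes to process i */
    }

    /* sendcount = 1: each process receives one element, process k gets sendbuf[k]. */
    MPI_Scatter(sendbuf, 1, MPI_INT, &recv, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d got %d\n", rank, recv);

    free(sendbuf);
    MPI_Finalize();
    return 0;
}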
Collective communication: Gather
MPI_Gather
• MPI_Gather is the inverse of MPI_Scatter
• It takes elements from many processes and gathers them at a single root process
MPI_Gather

• The root receives data from all processes (from send buffers)
• It stores the data in the receive buffer ordered by the process
number of the senders
MPI_Gather - Example
Demo: Gather.c
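A minimal sketch of a Gather.c-style program (assumed, not the original demo), where each process contributes its rank and root 0 collects the values in rank order:

/* Minimal MPI_Gather sketch: the root collects one int from every process. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, size;
    int *recvbuf = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0)                        /* receive buffer only needed at the root */
        recvbuf = malloc(size * sizeof(int));

    /* recvcount = 1 is the count received from EACH process, not the total. */
    MPI_Gather(&rank, 1, MPI_INT, recvbuf, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < size; i++)
            printf("recvbuf[%d] = %d\n", i, recvbuf[i]);
        free(recvbuf);
    }

    MPI_Finalize();
    return 0;
}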
MPI_Scatterv
• MPI_Scatterv is a collective routine that is similar to MPI_Scatter
• It sends variable chunks of an array to different processes

[Figure: the root's send buffer is split into variable-sized chunks that are distributed to Process-0, Process-1, Process-2, and Process-3.]
Credits: https://www.cineca.it/
MPI_Scatterv
Demo: ScatterV.c

sendbuf: address of send buffer (significant only at root)
sendcounts: integer array (of length group size) specifying the number of elements to send to each process
displs: integer array (of length group size); entry i specifies the displacement (relative to sendbuf) from which to take the outgoing data to process i
sendtype: data type of send buffer elements
recvbuf: address of receive buffer
recvcount: number of elements in receive buffer (integer)
recvtype: data type of receive buffer elements
root: rank of sending process (integer)
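To show how sendcounts and displs fit together, here is a hedged sketch (not the course's ScatterV.c) in which process i receives i+1 elements; the data values are assumptions:

/* MPI_Scatterv sketch: variable-sized chunks, process i gets i+1 ints. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, size;
    int *sendbuf = NULL, *sendcounts = NULL, *displs = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int recvcount = rank + 1;                     /* variable chunk size */
    int *recvbuf = malloc(recvcount * sizeof(int));

    if (rank == 0) {
        int total = size * (size + 1) / 2;        /* 1 + 2 + ... + size elements */
        sendbuf    = malloc(total * sizeof(int));
        sendcounts = malloc(size * sizeof(int));
        displs     = malloc(size * sizeof(int));
        for (int i = 0, offset = 0; i < size; i++) {
            sendcounts[i] = i + 1;                /* i+1 elements for process i */
            displs[i]     = offset;               /* offset into sendbuf */
            offset       += sendcounts[i];
        }
        for (int j = 0; j < total; j++)
            sendbuf[j] = j;
    }

    MPI_Scatterv(sendbuf, sendcounts, displs, MPI_INT,
                 recvbuf, recvcount, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d received %d element(s)\n", rank, recvcount);

    free(recvbuf); free(sendbuf); free(sendcounts); free(displs);
    MPI_Finalize();
    return 0;
}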
MPI_Gatherv
• A different number of elements can be received from each process by the root
• Individual messages are stored in the receive buffer according to displs

Credits: https://www.cineca.it/
MPI_Gatherv
Demo: GatherV.c

sendbuf: address of send buffer
sendcount: number of elements in send buffer (integer)
sendtype: data type of send buffer elements
recvbuf: address of receive buffer (significant only at root)
recvcounts: integer array (of length group size) containing the number of elements that are to be received from each process (significant only at root)
displs: integer array (of length group size); entry i specifies the displacement relative to recvbuf at which to place the data from process i (significant only at root)
recvtype: data type of receive buffer elements (handle)
root: rank of receiving process (integer)
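A matching hedged sketch for MPI_Gatherv (not necessarily the course's GatherV.c): process i contributes i+1 elements and the root places them using recvcounts and displs. The data values are assumptions:

/* MPI_Gatherv sketch: root 0 gathers i+1 ints from process i. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, size;
    int *recvbuf = NULL, *recvcounts = NULL, *displs = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int sendcount = rank + 1;
    int *sendbuf = malloc(sendcount * sizeof(int));
    for (int i = 0; i < sendcount; i++)
        sendbuf[i] = rank;                    /* send my rank, (rank+1) times */

    int total = size * (size + 1) / 2;        /* total elements gathered at root */
    if (rank == 0) {
        recvbuf    = malloc(total * sizeof(int));
        recvcounts = malloc(size * sizeof(int));
        displs     = malloc(size * sizeof(int));
        for (int i = 0, offset = 0; i < size; i++) {
            recvcounts[i] = i + 1;            /* expect i+1 ints from process i */
            displs[i]     = offset;           /* where to place them in recvbuf */
            offset       += recvcounts[i];
        }
    }

    MPI_Gatherv(sendbuf, sendcount, MPI_INT,
                recvbuf, recvcounts, displs, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < total; i++)
            printf("%d ", recvbuf[i]);
        printf("\n");
    }

    free(sendbuf); free(recvbuf); free(recvcounts); free(displs);
    MPI_Finalize();
    return 0;
}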
Home Tasks
• MPI_Allgather
• Similar to MPI_Gather, but the result is available to all
processes
• MPI_Allgatherv
• Similar to MPI_Gatherv, but the result is available to all
processes
• MPI_Alltoall
• Similar to MPI_Allgather, but each process performs a
scatter followed by a gather
• MPI_Alltoallv
• Similar to MPI_Alltoall, but messages to different
processes can have different lengths
MPI_Alltoall
MPI_Alltoall redistributes data so that each process receives one block from the send buffer of every other process. It is one way to implement a matrix (data) transposition.
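A hedged sketch of MPI_Alltoall with one int per destination; the value scheme (100*rank + j) is only an assumption chosen to make the redistribution visible:

/* MPI_Alltoall sketch: process i sends 100*i + j to process j. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *sendbuf = malloc(size * sizeof(int));
    int *recvbuf = malloc(size * sizeof(int));
    for (int j = 0; j < size; j++)
        sendbuf[j] = 100 * rank + j;      /* block j is destined for process j */

    /* After the call, recvbuf[i] on this process holds the block that
       process i addressed to this rank -- a transposition of the layout. */
    MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    printf("Process %d received:", rank);
    for (int i = 0; i < size; i++)
        printf(" %d", recvbuf[i]);
    printf("\n");

    free(sendbuf); free(recvbuf);
    MPI_Finalize();
    return 0;
}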
Synchronization
Barrier Synchronization

Credits: https://medium.com/@jaydesai36/barrier-synchronization-in-threads-3c56f947047
MPI_Barrier
Demo: Barrier.c

It synchronizes ALL processes in the communicator: each process blocks until every process has called MPI_Barrier.
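A minimal sketch of a Barrier.c-style program (assumed, not the original demo): no process prints its "after" line before all processes have reached the barrier, although the ordering of the printed lines across ranks is still not guaranteed.

/* MPI_Barrier sketch: all processes wait for each other at the barrier. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    printf("Process %d: before the barrier\n", rank);

    MPI_Barrier(MPI_COMM_WORLD);   /* blocks until all processes have arrived */

    printf("Process %d: after the barrier\n", rank);

    MPI_Finalize();
    return 0;
}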
Reductions
Collective communication: Reduce

• Data reduction involves reducing a set of numbers into a smaller set of numbers via a function.
• Example: Consider the list [1, 2, 3, 4, 5]. Reducing this list with the sum function produces sum([1, 2, 3, 4, 5]) = 15.
• Similarly, the multiplication reduction would yield multiply([1, 2, 3, 4, 5]) = 120.
Reductions
The communicated data of the processes are combined via a specified operation, e.g. '+'.

Two different variants:
– Result is only available at the root process (MPI_Reduce)
– Result is available at all processes (MPI_Allreduce)

Input values (at each process):
– Scalar variable: the operation combines the values of all processes
– Array: the elements of the arrays are combined element-wise; the result is an array
MPI_Reduce

• This operation combines the elements in the send buffers of all processes and delivers the result to the root.
• count, op, and root have to be identical on all processes.
MPI_Reduce - Example
Demo: Reduction.c

Credits: https://dps.uibk.ac.at
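Since the demo file is not shown here, the following is a minimal Reduction.c-style sketch (assumed) in which every process contributes its rank and root 0 receives the sum:

/* MPI_Reduce sketch: sum of all ranks delivered to root 0. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* count, op and root must be identical on all processes. */
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of ranks 0..%d = %d\n", size - 1, sum);

    MPI_Finalize();
    return 0;
}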
Reduction Operations
– Predefined operations include MPI_SUM, MPI_PROD, MPI_MAX, and MPI_MIN, among others
Data types:
– Operations are defined for appropriate data types
MPI_Allreduce

Similar to MPI_Reduce, but the result is returned to all processes

Credits: https://dps.uibk.ac.at
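A hedged sketch of MPI_Allreduce: the same sum-of-ranks reduction as above, but the result arrives at every process, so there is no root argument:

/* MPI_Allreduce sketch: every process obtains the sum of all ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Allreduce(&rank, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("Process %d sees sum = %d\n", rank, sum);

    MPI_Finalize();
    return 0;
}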
Any Questions?
