Parallel and Distributed Programming
Dr. Muhammad Naveed Akhtar
Lecture – 04a
Distributed Memory Programming with MPI
Roadmap
• Writing your first MPI program.
• Using the common MPI functions.
• The Trapezoidal Rule in MPI.
• Collective communication.
• MPI derived datatypes.
• Performance evaluation of MPI programs.
• Parallel sorting.
• Safety in MPI programs.
Distributed and Shared Memory Systems
[Figures: a shared-memory system and a distributed-memory system.]
Hello World!
Identifying MPI processes
• It is common practice to identify processes by nonnegative integer ranks.
• p processes are numbered 0, 1, 2, ..., p-1.
Our first MPI program
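The slide's listing is not reproduced in this handout; below is a minimal sketch of an MPI "hello world" consistent with the output shown on the compilation/execution slide (the greeting text and variable names are illustrative).

#include <stdio.h>
#include <string.h>
#include <mpi.h>

#define MAX_STRING 100

int main(void) {
    char greeting[MAX_STRING];
    int  comm_sz;   /* number of processes */
    int  my_rank;   /* my process rank     */

    MPI_Init(NULL, NULL);                        /* set up MPI                */
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);     /* how many processes?       */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);     /* which one am I?           */

    if (my_rank != 0) {
        /* Every process except 0 sends a greeting to process 0. */
        sprintf(greeting, "Greetings from process %d of %d !", my_rank, comm_sz);
        MPI_Send(greeting, strlen(greeting) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    } else {
        /* Process 0 prints its own greeting, then receives and prints the rest in rank order. */
        printf("Greetings from process %d of %d !\n", my_rank, comm_sz);
        for (int q = 1; q < comm_sz; q++) {
            MPI_Recv(greeting, MAX_STRING, MPI_CHAR, q, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("%s\n", greeting);
        }
    }

    MPI_Finalize();                              /* clean up MPI              */
    return 0;
}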
MPI Program Compilation and Execution

Compile:
  mpicc -g -Wall -o mpi_hello mpi_hello.c
  • mpicc: wrapper script used to compile MPI programs
  • -g: produce debugging information
  • -Wall: turn on all warnings
  • -o mpi_hello: create an executable file with this name
  • mpi_hello.c: the source file

Execute:
  mpiexec -n <np> <executable>
  • mpiexec: wrapper script used to run MPI programs
  • -n <np>: specify the number of processes
  • <executable>: the executable file to run

Execute with 4 processes:
  mpiexec -n 4 ./mpi_hello
  Greetings from process 0 of 4 !
  Greetings from process 1 of 4 !
  Greetings from process 2 of 4 !
  Greetings from process 3 of 4 !

Execute with only 1 process:
  mpiexec -n 1 ./mpi_hello
  Greetings from process 0 of 1 !
MPI Program Structure
• Written in C: has a main function and uses standard headers such as stdio.h and string.h.
• Must also include the <mpi.h> header file.
• Identifiers defined by MPI start with "MPI_".
• The first letter following the underscore is uppercase for function names and MPI-defined types.
• This convention helps avoid confusion with user-defined identifiers.
MPI Components
• MPI_Init
• Tells MPI to do all the necessary setup.
• MPI_Finalize
• Tells MPI we’re done, so clean up anything allocated for this program.
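Their prototypes in the MPI standard are:

int MPI_Init(int* argc_p, char*** argv_p);   /* pass NULL, NULL if argc/argv are not needed */
int MPI_Finalize(void);

No other MPI functions should be called before MPI_Init or after MPI_Finalize.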
MPI Communicators
• A communicator is a collection of processes that can send messages to each other.
• MPI_Init defines a communicator that consists of all the processes created when the program is started.
• It is called MPI_COMM_WORLD.
• MPI_Comm_size returns the number of processes in the communicator.
• MPI_Comm_rank returns my rank (the rank of the process making this call).
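The two query functions have these prototypes:

int MPI_Comm_size(MPI_Comm comm, int* comm_sz_p);  /* number of processes in the communicator */
int MPI_Comm_rank(MPI_Comm comm, int* my_rank_p);  /* rank of the process making this call    */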
SPMD and Data Types
• SPMD: Single Program, Multiple Data.
• We compile one program.
• Process 0 does something different: it receives messages and prints them while the other processes do the work.
• The if-else construct makes our program SPMD.
• For communication we need a datatype: the message itself is just a sequence of 0s and 1s, so MPI must be told what type of data is being communicated (e.g., MPI_CHAR, MPI_INT, MPI_DOUBLE).
Communication (Send / Receive)
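The two blocking point-to-point calls shown on this slide have the following prototypes (parameter names are the textbook-style ones used below):

int MPI_Send(void* msg_buf_p, int msg_size, MPI_Datatype msg_type,
             int dest, int tag, MPI_Comm communicator);

int MPI_Recv(void* msg_buf_p, int buf_size, MPI_Datatype buf_type,
             int source, int tag, MPI_Comm communicator, MPI_Status* status_p);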
Message Matching (Send / Receive)

MPI_Send(send_buf_p, send_buf_sz, send_type, dest, send_tag, send_comm);
MPI_Recv(recv_buf_p, recv_buf_sz, recv_type, src, recv_tag, recv_comm, &status);

The first three arguments of each call define the data being communicated. A message sent by process q is received by process r (q and r are rank numbers) when:
• recv_comm = send_comm
• recv_tag = send_tag
• dest = r and src = q
• the datatypes match: recv_type = send_type
• the receive buffer is big enough: recv_buf_sz ≥ send_buf_sz
Receiving Messages (Incomplete Information)
• A receiver can get a message without knowing:
• the amount of data in the message (send_buf_sz),
• the sender of the message (src), by passing MPI_ANY_SOURCE,
• or the tag of the message (send_tag), by passing MPI_ANY_TAG.

MPI_Recv(recv_buf_p, recv_buf_sz, recv_type, src, recv_tag, recv_comm, &status);

• After the call, the status argument tells us who sent the message, with which tag, and whether an error occurred:
  MPI_Status status;
  status.MPI_SOURCE
  status.MPI_TAG
  status.MPI_ERROR
• How much data am I receiving? Pass &status and the datatype to MPI_Get_count to find out.
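A minimal sketch of this pattern, as a fragment inside main after the rank/size calls (the buffer size of 100 is illustrative):

double     buf[100];
MPI_Status status;
int        count;

MPI_Recv(buf, 100, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
MPI_Get_count(&status, MPI_DOUBLE, &count);   /* number of MPI_DOUBLEs actually received */
printf("Received %d doubles from process %d with tag %d\n",
       count, status.MPI_SOURCE, status.MPI_TAG);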
Issues with Send and Receive
• Exact behavior is determined by the MPI implementation.
• MPI_Send may buffer the message or block, depending on the message size and implementation-defined cutoffs.
• MPI_Recv always blocks until a matching message is received.
• Know your implementation; don't make assumptions!
Trapezoidal rule in MPI
The Trapezoidal Rule
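The figures on this slide are not reproduced; as a reminder, with n trapezoids of equal width h = (b - a)/n and x_i = a + i*h, the rule approximates

  integral from a to b of f(x) dx  ≈  h * [ f(x_0)/2 + f(x_1) + f(x_2) + ... + f(x_(n-1)) + f(x_n)/2 ].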
Pseudo-Code for a serial program
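The pseudo-code itself is not reproduced here; a serial sketch in C, assuming an illustrative integrand f(x) = x*x and textbook-style names (Trap, approx):

#include <stdio.h>

double f(double x) { return x * x; }          /* integrand: illustrative choice */

/* Trapezoidal rule over [a, b] with n trapezoids of width h */
double Trap(double a, double b, int n, double h) {
    double approx = (f(a) + f(b)) / 2.0;
    for (int i = 1; i <= n - 1; i++)
        approx += f(a + i * h);
    return h * approx;
}

int main(void) {
    double a = 0.0, b = 3.0;   /* interval endpoints (example values) */
    int    n = 1024;           /* number of trapezoids (example value) */
    double h = (b - a) / n;
    printf("Estimate of the integral: %.15e\n", Trap(a, b, n, h));
    return 0;
}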
Parallelizing the Trapezoidal Rule
• Partition problem solution into tasks.
• Identify communication channels between tasks.
• Aggregate tasks into composite tasks.
• Map composite tasks to cores.
Parallel Pseudo-Code
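The slide's pseudo-code is not reproduced; the usual outline, with each process handling n/comm_sz trapezoids, is roughly:

/* Parallel pseudo-code sketch */
Get a, b, n;
h = (b - a) / n;
local_n = n / comm_sz;                   /* trapezoids per process            */
local_a = a + my_rank * local_n * h;     /* left endpoint of my subinterval   */
local_b = local_a + local_n * h;         /* right endpoint of my subinterval  */
local_integral = Trap(local_a, local_b, local_n, h);
if (my_rank != 0)
    Send local_integral to process 0;
else {
    total_integral = local_integral;
    for each process q = 1, ..., comm_sz - 1
        Receive local_integral from q and add it to total_integral;
    Print total_integral;
}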
First version of MPI Program
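The listing spans the next slides and is not reproduced here; a sketch consistent with the pseudo-code above (the hard-coded a, b, and n are illustrative, and comm_sz is assumed to divide n evenly):

#include <stdio.h>
#include <mpi.h>

double f(double x) { return x * x; }   /* integrand: illustrative choice */

/* Serial trapezoidal rule over [left, right] with count trapezoids of width h */
double Trap(double left, double right, int count, double h) {
    double est = (f(left) + f(right)) / 2.0;
    for (int i = 1; i <= count - 1; i++)
        est += f(left + i * h);
    return h * est;
}

int main(void) {
    int    my_rank, comm_sz, n = 1024, local_n;
    double a = 0.0, b = 3.0, h, local_a, local_b;
    double local_int, total_int;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    h = (b - a) / n;         /* h is the same on every process              */
    local_n = n / comm_sz;   /* number of trapezoids handled by each process */
    local_a = a + my_rank * local_n * h;
    local_b = local_a + local_n * h;
    local_int = Trap(local_a, local_b, local_n, h);

    if (my_rank != 0) {
        MPI_Send(&local_int, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    } else {
        total_int = local_int;
        for (int source = 1; source < comm_sz; source++) {
            MPI_Recv(&local_int, 1, MPI_DOUBLE, source, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            total_int += local_int;
        }
        printf("With n = %d trapezoids, our estimate of the integral from %f to %f = %.15e\n",
               n, a, b, total_int);
    }

    MPI_Finalize();
    return 0;
}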
Dealing with Output in an MPI Program
• Each process just prints a message.
• The output of the program is unpredictable: the order in which the processes' lines appear can change from run to run.
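A minimal sketch of the pattern, as a fragment inside main after the rank/size calls (the message text is illustrative):

printf("Proc %d of %d > Some output line\n", my_rank, comm_sz);
/* With several processes, lines from different ranks may appear in any order:
   MPI makes no guarantee about how stdout from different processes is interleaved. */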
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 24
Handling Input in an MPI Program
• Most MPI implementations only allow process 0 in MPI_COMM_WORLD to access stdin.
• Process 0 must read the data (e.g., with scanf) and send it to the other processes.
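A sketch of a Get_input function that follows this pattern with point-to-point calls (the names and parameters mirror the trapezoid program; the exact code on the slide may differ):

void Get_input(int my_rank, int comm_sz, double* a_p, double* b_p, int* n_p) {
    if (my_rank == 0) {
        /* Only process 0 reads from stdin ... */
        printf("Enter a, b, and n\n");
        scanf("%lf %lf %d", a_p, b_p, n_p);
        /* ... then forwards the values to every other process. */
        for (int dest = 1; dest < comm_sz; dest++) {
            MPI_Send(a_p, 1, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
            MPI_Send(b_p, 1, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
            MPI_Send(n_p, 1, MPI_INT,    dest, 0, MPI_COMM_WORLD);
        }
    } else {
        MPI_Recv(a_p, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(b_p, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(n_p, 1, MPI_INT,    0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
}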
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 25
Collective communication
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 26
Tree-structured communication
[Figures: two alternative tree-structured global sums, Scenario 1 and Scenario 2.]
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 27
MPI_Reduce
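The prototype shown on the slide is the standard one; an example call is the global sum from the trapezoid program:

int MPI_Reduce(void* input_data_p, void* output_data_p, int count,
               MPI_Datatype datatype, MPI_Op operator,
               int dest_process, MPI_Comm comm);

/* Example: combine every process's local_int into total_int on process 0. */
MPI_Reduce(&local_int, &total_int, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);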
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 28
Collective vs. Point-to-Point Communications
• All the processes in the communicator must call the same collective function.
• For example, a program that attempts to match a call to MPI_Reduce on one process with a call to MPI_Recv on another process is erroneous; it will likely hang or crash.
• The arguments passed by each process to an MPI collective communication must be "compatible."
• For example, if one process passes 0 as the dest_process and another passes 1, the calls to MPI_Reduce are erroneous; the program will likely hang or crash.
• The output_data_p argument is only used on dest_process.
• However, all of the processes still need to pass an actual argument corresponding to output_data_p, even if it's just NULL.
• Point-to-point communications are matched on the basis of tags; collective communications don't use tags.
• Collectives are matched solely on the basis of the communicator and the order in which they're called.
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 29
Example (Multiple calls to MPI_Reduce)
• Suppose that each process calls MPI_Reduce with operator MPI_SUM, and destination process 0.
• At first glance, it might seem that after the two calls to MPI_Reduce, the value of b will be 3, and
the value of d will be 6.
• However, the names of the memory locations are irrelevant to the matching of the calls to
MPI_Reduce.
• The order of the calls determines the matching, so the value stored in b will be 1 + 2 + 1 = 4, and the value stored in d will be 2 + 1 + 2 = 5 (see the sketch below).
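The slide's code is not reproduced; a sketch consistent with the sums above, assuming comm_sz == 3, each process holding a = 1 and c = 2, and process 1 issuing its reductions in the opposite order:

int a = 1, b = 0, c = 2, d = 0;
if (my_rank == 1) {
    MPI_Reduce(&c, &d, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);  /* matched with the first reduction on ranks 0 and 2 */
    MPI_Reduce(&a, &b, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);  /* matched with the second reduction on ranks 0 and 2 */
} else {
    MPI_Reduce(&a, &b, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Reduce(&c, &d, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
}
/* On process 0: b = 1 + 2 + 1 = 4 and d = 2 + 1 + 2 = 5, because reductions are
   matched by call order, not by variable names. */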
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 30
MPI_Allreduce
• Useful in a situation in which all of the processes need the result
of a global sum in order to complete some larger computation.
[Figure: a global sum followed by distribution of the result.]
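Its prototype is identical to MPI_Reduce except that there is no dest_process, since every process receives the result:

int MPI_Allreduce(void* input_data_p, void* output_data_p, int count,
                  MPI_Datatype datatype, MPI_Op operator, MPI_Comm comm);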
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 31
Butterfly Structured Global Sum
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 32
Broadcast
• Data belonging to a single process is sent to all of the processes in the communicator.
[Figure: a tree-structured broadcast.]
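The collective that implements this is MPI_Bcast:

int MPI_Bcast(void* data_p, int count, MPI_Datatype datatype,
              int source_proc, MPI_Comm comm);
/* data_p is the send buffer on source_proc and the receive buffer on every other process. */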
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 33
A version of Get_input that uses MPI_Bcast
Function Prototype
Function Definition
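The code itself is not reproduced in this handout; a sketch of Get_input rewritten around MPI_Bcast (parameter names mirror the earlier version):

void Get_input(int my_rank, int comm_sz, double* a_p, double* b_p, int* n_p) {
    if (my_rank == 0) {
        printf("Enter a, b, and n\n");
        scanf("%lf %lf %d", a_p, b_p, n_p);
    }
    /* Every process, including 0, calls MPI_Bcast: 0 sends, the rest receive. */
    MPI_Bcast(a_p, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(b_p, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(n_p, 1, MPI_INT,    0, MPI_COMM_WORLD);
}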
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 34
Data Distribution / Partition Strategies
• Block partitioning
• Assign blocks of consecutive components to each process.
• Cyclic partitioning
• Assign components in a round robin fashion.
• Block-cyclic partitioning
• Use a cyclic distribution of blocks of components.
[Table: distributing a 12-element vector among 3 processes under each strategy; see the sketch below.]
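A reconstruction of that distribution as comments (assuming the common choice of block size 2 for the block-cyclic case):

/* Components of a 12-element vector assigned to 3 processes:
 *
 *   Block:               P0: 0 1 2 3    P1: 4 5  6  7    P2: 8 9 10 11
 *   Cyclic:              P0: 0 3 6 9    P1: 1 4  7 10    P2: 2 5  8 11
 *   Block-cyclic (b=2):  P0: 0 1 6 7    P1: 2 3  8  9    P2: 4 5 10 11
 */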
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 35
Compute Vector Sum (Serial / Parallel)
• Vector sum (math): z_i = x_i + y_i for i = 0, 1, ..., n-1.
• Serial approach: a single loop over all n components.
• Parallel sum (same prototype): each process loops over only "its" local items.
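A sketch of the two versions (the code on the slide may differ in names):

/* Serial: add all n components. */
void Vector_sum(double x[], double y[], double z[], int n) {
    for (int i = 0; i < n; i++)
        z[i] = x[i] + y[i];
}

/* Parallel: same prototype shape, but each process passes its local blocks
   and local_n, so it only touches its own components. */
void Parallel_vector_sum(double local_x[], double local_y[],
                         double local_z[], int local_n) {
    for (int local_i = 0; local_i < local_n; local_i++)
        local_z[local_i] = local_x[local_i] + local_y[local_i];
}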
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 36
Scatter and Gather (Distribute and Collect)
• MPI_Scatter can be used in a function that reads in an entire vector on process 0 but only sends the needed components to each of the other processes.
• MPI_Gather collects all of the components of the vector onto process 0, and then process 0 can process all of the components.
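Their prototypes:

int MPI_Scatter(void* send_buf_p, int send_count, MPI_Datatype send_type,
                void* recv_buf_p, int recv_count, MPI_Datatype recv_type,
                int src_proc, MPI_Comm comm);

int MPI_Gather(void* send_buf_p, int send_count, MPI_Datatype send_type,
               void* recv_buf_p, int recv_count, MPI_Datatype recv_type,
               int dest_proc, MPI_Comm comm);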
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 37
Reading and distributing a vector (Scatter)
Function Prototype
Function Definition
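The slide's code is not reproduced; a sketch of a Read_vector that scatters a vector read on process 0 (needs stdio.h, stdlib.h, and mpi.h; names are textbook-style):

void Read_vector(double local_a[], int local_n, int n, char vec_name[],
                 int my_rank, MPI_Comm comm) {
    double* a = NULL;
    if (my_rank == 0) {
        a = malloc(n * sizeof(double));
        printf("Enter the vector %s\n", vec_name);
        for (int i = 0; i < n; i++)
            scanf("%lf", &a[i]);
        MPI_Scatter(a, local_n, MPI_DOUBLE, local_a, local_n, MPI_DOUBLE, 0, comm);
        free(a);
    } else {
        /* The send buffer argument is ignored on non-root processes. */
        MPI_Scatter(a, local_n, MPI_DOUBLE, local_a, local_n, MPI_DOUBLE, 0, comm);
    }
}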
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 38
Print a distributed vector (Gather)
Function Definition
Function Prototype
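A matching Print_vector sketch that gathers the distributed vector onto process 0 for printing:

void Print_vector(double local_b[], int local_n, int n, char title[],
                  int my_rank, MPI_Comm comm) {
    double* b = NULL;
    if (my_rank == 0) {
        b = malloc(n * sizeof(double));
        MPI_Gather(local_b, local_n, MPI_DOUBLE, b, local_n, MPI_DOUBLE, 0, comm);
        printf("%s\n", title);
        for (int i = 0; i < n; i++)
            printf("%f ", b[i]);
        printf("\n");
        free(b);
    } else {
        MPI_Gather(local_b, local_n, MPI_DOUBLE, b, local_n, MPI_DOUBLE, 0, comm);
    }
}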
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 39
Allgather
• Concatenates the contents of each process’ send_buf_p and stores this in each process’
recv_buf_p.
• As usual, recv_count is the amount of data being received from each process.
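Its prototype has no root argument, since every process receives the concatenation:

int MPI_Allgather(void* send_buf_p, int send_count, MPI_Datatype send_type,
                  void* recv_buf_p, int recv_count, MPI_Datatype recv_type,
                  MPI_Comm comm);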
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 40
Matrix-Vector Multiplication
• The i-th component of y is the dot product of the i-th row of A with x:
  y_i = a_i0 x_0 + a_i1 x_1 + ... + a_i,n-1 x_n-1
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 41
Matrix-Vector Multiplication (Serial Version)
• Serial pseudocode and serial program (listings not reproduced here).
• In C the matrix is stored as a one-dimensional array in row-major order, so element A[i][j] is accessed as A[i*n + j].
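A serial sketch consistent with that storage scheme (m rows, n columns):

void Mat_vect_mult(double A[], double x[], double y[], int m, int n) {
    for (int i = 0; i < m; i++) {
        y[i] = 0.0;
        for (int j = 0; j < n; j++)
            y[i] += A[i*n + j] * x[j];   /* A[i][j] lives at A[i*n + j] */
    }
}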
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 42
Matrix-Vector Multiplication (MPI Version)
Function Prototype
Function Definition
Serial Version Comparison
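The MPI code is not reproduced here; a sketch in which each process holds a block of local_m rows of A and a block of local_n components of x, gathers the full x with MPI_Allgather, and then computes its block of y (needs stdlib.h):

void Mat_vect_mult(double local_A[], double local_x[], double local_y[],
                   int local_m, int n, int local_n, MPI_Comm comm) {
    double* x = malloc(n * sizeof(double));

    /* Every process needs the entire input vector x. */
    MPI_Allgather(local_x, local_n, MPI_DOUBLE, x, local_n, MPI_DOUBLE, comm);

    /* Same loops as the serial version, but only over my local_m rows. */
    for (int local_i = 0; local_i < local_m; local_i++) {
        local_y[local_i] = 0.0;
        for (int j = 0; j < n; j++)
            local_y[local_i] += local_A[local_i*n + j] * x[j];
    }

    free(x);
}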
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 43
Questions and comments?