Lecture 10: Introduction to MPI Programming
Presenter: Liangqiong Qu
Assistant Professor
▪ Introduction to MPI
▪ Parallel Execution in MPI
▪ Communicator and Rank
▪ MPI Blocking Point-to-Point Communication
▪ Beginner’s MPI Toolbox
▪ Examples
Review of Previous Lecture: Dominant Architectures of HPC Systems
• Shared-memory computers: A shared-memory parallel computer is a system in
which a number of CPUs work on a common, shared physical address space.
Shared-memory programming enables immediate access to all data from all
processors without explicit communication
• Distributed memory computers: A distributed-memory architecture is a system
where each processor or node has its own local memory, and they communicate
with each other through message passing.
• Hybrid (shared/distributed memory) computers: A hybrid architecture is a
system that combines the features of both shared-memory and distributed-memory
architectures.
• Features
• No global shared address space
• Data exchange and communication between processors is done
by explicitly passing messages through NI (network interfaces)
• Programming
• No remote memory access on distributed-memory systems
• Requires sending messages back and forth between processors
• Many free Message Passing Interface (MPI) libraries available
The Message Passing Paradigm
• A brief history of MPI: Before the 1990s, many libraries could facilitate building parallel
applications, but there was no standard, accepted way of doing it. At the Supercomputing
1992 conference, researchers gathered to define a standard interface for performing
message passing - the Message Passing Interface (MPI). This standard interface allowed
programmers to write parallel applications that were portable to all major parallel
architectures.
• Data exchange between processes: Send/receive messages via MPI library calls
• No automatic workload distribution
The MPI Standard
• Documents : https://www.mpi-forum.org/docs/
Serial Programming vs Parallel Programming (MPI) Terminologies
Parallel Execution in MPI
• Processes run throughout program
execution
Program startup
• MPI start mechanism:
• Launches tasks/processes
• Establishes communication
context (“communicator”)
• Bindings:
• MPI function calls follow a
specific naming convention
error = MPI_Xxxxx(…);
Process rank
• MPI uses objects called communicators and groups to define which collection of
processes may communicate with each other.
• Within a communicator, every process has a unique integer identifier, its rank,
assigned by the system when the process initializes. A rank is sometimes also called a
“task ID”. Ranks are contiguous and begin at zero.
Communicator and Rank
• Communicator defines a set of processes (MPI_COMM_WORLD: all)
• A pointer is a variable that stores the memory address of another variable as its value.
• A pointer variable is declared with the * operator and points to a value of a specific
data type (like int).
• You can also get the value of the variable the pointer points to, by using the * operator
(the dereference operator):
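For example (a minimal sketch in plain C; the variable name rank is chosen here only to match the MPI examples later):

#include <stdio.h>

int main(void) {
    int rank = 7;
    int *p = &rank;                        /* p stores the memory address of rank      */

    printf("value of rank:   %d\n", rank);
    printf("address of rank: %p\n", (void *)p);
    printf("value via *p:    %d\n", *p);   /* dereferencing reads the stored value     */

    *p = 42;                               /* writing through the pointer changes rank */
    printf("rank is now:     %d\n", rank);
    return 0;
}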
Communicator and Rank: & and * in C-Programming (Background)
• In C, function arguments are passed by value by default. This means when you pass a
variable (e.g., rank) to a function, the function receives a copy of its value. That is C
allocates new memory for the parameter inside the function.
• Any modifications to the parameter inside the function do not affect the original
variable outside.
Communicator and Rank: & and * in C-Programming (Background)
• Since the function only receives a copy, we need to pass a pointer instead, i.e., the
memory address of the variable, so that the function can modify the original.
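A short sketch of the difference (the helper functions set_by_value and set_by_pointer are hypothetical names used only for this illustration):

#include <stdio.h>

/* Gets a copy of x: changes here never reach the caller's variable. */
void set_by_value(int x)    { x = 0; }

/* Gets the address of x: writing through the pointer changes the caller's variable. */
void set_by_pointer(int *x) { *x = 0; }

int main(void) {
    int rank = 5;

    set_by_value(rank);
    printf("after set_by_value:   %d\n", rank);   /* still 5 */

    set_by_pointer(&rank);
    printf("after set_by_pointer: %d\n", rank);   /* now 0  */
    return 0;
}

This is exactly why MPI_Comm_rank(MPI_COMM_WORLD, &rank) takes &rank: the function writes the result into the caller's variable through its address.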
Communicator and Rank: & and * in C-Programming (Background)
• When a variable is created in C, a memory address is assigned to the variable. The
memory address is the location of where the variable is stored on the computer.
• When we assign a value to the variable, it is stored at this memory address. To access
the address, use the address-of operator (&); the result is where the variable is stored:
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
int size;
MPI_Comm_size(MPI_COMM_WORLD, &size);   /* the "&size" syntax passes the address of the "size" variable to the function */
• The “&” symbol is used in MPI calls to pass the address of a variable, allowing the
function to directly modify the value at that memory location. MPI functions usually
return their results through such pointer arguments rather than with a return statement;
the integer return value is reserved for the error code (error = MPI_Xxxxx(…)).
General MPI Program Structure
• Header declaration (#include <mpi.h>)
• Serial code
• Initialize MPI environment, parallel code with MPI library calls, terminate MPI environment
• Serial code
Step 1: Write the Code: MPI “Hello World!” in C
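The listing itself is not reproduced in these notes; a minimal version of such a hello_world.c might look like the following sketch (the exact code on the slide may differ slightly):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, size;

    MPI_Init(&argc, &argv);                   /* initialize the MPI environment */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);     /* rank (ID) of this process      */
    MPI_Comm_size(MPI_COMM_WORLD, &size);     /* total number of processes      */

    printf("Hello World from rank %d of %d\n", rank, size);

    MPI_Finalize();                           /* terminate the MPI environment  */
    return 0;
}

Every process launched by mpirun runs this same program; only the value reported for rank differs between them.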
Step 2: Compiling the Code
• If you try to compile this code locally with gcc, you might run into problems
▪ module avail
List all the available modules in the HPC system
▪ module avail X
List all installed versions of modules matching “X”
▪ module unload X
Unload the specific module X from your current environment
Step 2: Compiling Code Basic: Load the Right Module for Compilers
• If you try to compile this code locally with gcc, you might run into problems
• Running
• Startup wrappers: mpirun or mpiexec
• mpirun -np 4 ./hello_world
• Details are implementation specific
Review of Lecture 2---Batch System for Running Jobs with HPC
Submitting job scripts
A job script must contain directives to inform the batch system about the
characteristics of the job. These directives appear as comments (#SBATCH) in the job
script and have to conform to the sbatch syntax.
Step 3: Running the Code
• Preparing job scripts and running the code with Scheduler
• Example: Slurm as our HKU HPC system
• mpirun and the scheduler distribute the executable to the right nodes
• After preparation of job scripts, then submit with sbatch command: sbatch submit-hello.sh
Step 3: Running the Code
• Running the code
• Example to understand distribution of program
• E.g., executing the MPI program on 4 processors
• Normally the batch system handles the processor allocation
• Understanding the role of mpirun is important (the command below
runs hello_world on 4 processors)
▪ Sender
• Which processor is sending the message?
• Where is the data on the sending processor?
• What kind of data is being sent?
• How much data is there?
▪ Receiver
• Which processor is receiving the message?
• Where should the data be left on the receiving processor?
• How much data is the receiving processor prepared to accept?
Sender (rank i) → Receiver (rank j)
▪ Data types
▪ Basic
▪ MPI derived types
Predefined Data Types in MPI (Selection)
MPI data type      C data type
MPI_CHAR           char
MPI_INT            int
MPI_LONG           long int
MPI_FLOAT          float
MPI_DOUBLE         double
MPI_BYTE           (8 binary digits, no C equivalent)
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
• void* is a generic (“void”) pointer that can hold the address of any data
type in C
Standard Blocking Send
int MPI_Send(void *buf, int count , MPI_Datatype datatype, int dest, int tag, MPI_Comm
comm)
▪ At completion
• Send buffer can be reused as you see fit
• Status of destination is unknown - the message could be anywhere
Standard Blocking Send
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
buf address of send buffer
count # of elements to send
datatype MPI data type
dest destination processor rank
tag message tag
comm communicator
Standard Blocking Receive
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag,
MPI_Comm comm, MPI_Status *status);
buf address of receive buffer
count maximum # of elements expected to be received
datatype MPI data type
source sending processor rank
tag message tag
comm communicator
status address of status object. It is a struct that you can access if
necessary to have more information on the message you just
received.
▪ At completion
• Message has been received successfully
• The message length (and, if wildcards were used, the tag and the sender) is not
known directly from the arguments; it can be retrieved from the status object
Source and Tag WildCards
▪ In certain cases, we might want to allow receiving messages from any sender or
with any tag.
▪ MPI_Recv accepts wildcards for the source and tag arguments:
MPI_ANY_SOURCE, MPI_ANY_TAG
MPI_Status s;                             /* filled in by the preceding MPI_Recv call */
int count;
MPI_Get_count(&s, MPI_DOUBLE, &count);    /* actual number of elements received       */
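A sketch of how the wildcards and MPI_Get_count fit together (here rank 1 sends 42 doubles to rank 0; the count and tag values are arbitrary choices for illustration):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    double buf[100];
    int rank, count, i;
    MPI_Status s;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        for (i = 0; i < 42; i++) buf[i] = (double)i;
        MPI_Send(buf, 42, MPI_DOUBLE, 0, 7, MPI_COMM_WORLD);
    } else if (rank == 0) {
        /* accept a message from any sender with any tag */
        MPI_Recv(buf, 100, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &s);
        MPI_Get_count(&s, MPI_DOUBLE, &count);
        printf("received %d doubles from rank %d with tag %d\n",
               count, s.MPI_SOURCE, s.MPI_TAG);
    }

    MPI_Finalize();
    return 0;
}

Run with at least two processes (e.g., mpirun -np 2); the receiver learns the actual message length, source, and tag from the status object rather than from its own arguments.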
Requirements for Point-to-Point Communication
▪ For a communication to succeed:
• The sender must specify a valid destination.
• The receiver must specify a valid source rank (or MPI_ANY_SOURCE).
• The communicator used by the sender and receiver must be the same (e.g.,
MPI_COMM_WORLD).
• The tags specified by the sender and receiver must match (or MPI_ANY_TAG for
receiver).
• The data types of the messages being sent and received must match.
• The receiver's buffer must be large enough to hold the received message.
Beginner’s MPI Toolbox
• Send/receive buffer may safely be reused after the call has completed
• MPI_Send() must specify a particular destination rank and tag; MPI_Recv() may use the wildcards MPI_ANY_SOURCE and MPI_ANY_TAG
Example 1. Exchanging Data with MPI Send/Receive (Pingpong.c)
Example 1. Exchanging Data with MPI Send/Receive
• The MPI_Send() function is used to send a certain number of elements of some
datatype to another MPI rank; this routine blocks until the send buffer can safely
be reused (the message may or may not have reached the destination yet)
• The MPI_Recv() function is used to receive a certain number of elements of some
datatype from another MPI rank; this routine blocks until a matching message has
actually been received from the source process
• This form of MPI communication is
called ‘blocking'
• int MPI_Recv(void *buf, int count, MPI_Datatype
datatype, int source, int tag, MPI_Comm comm,
MPI_Status *status);
Example 1. Exchanging Data with MPI Send/Receive
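The Pingpong.c listing is not included in these notes; the sketch below shows what such a program typically looks like (two ranks bounce an integer back and forth; the number of rounds and the tag value are arbitrary choices for illustration):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, size, ball = 0, round;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0) printf("Run this example with at least 2 processes.\n");
        MPI_Finalize();
        return 0;
    }

    for (round = 0; round < 4; round++) {
        if (rank == 0) {
            ball++;                                                      /* "ping" */
            MPI_Send(&ball, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);           /* blocks until the buffer can be reused */
            MPI_Recv(&ball, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &status);  /* blocks until the reply arrives        */
            printf("round %d: rank 0 got the ball back with value %d\n", round, ball);
        } else if (rank == 1) {
            MPI_Recv(&ball, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            ball++;                                                      /* "pong" */
            MPI_Send(&ball, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }
        /* ranks other than 0 and 1 simply do nothing in this example */
    }

    MPI_Finalize();
    return 0;
}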
https://forms.gle/zDdrPGCkN7ef3UG5A