Lecture 13: Derived Datatypes in MPI Programming
Presenter: Liangqiong Qu
Assistant Professor
▪ General assumption: MPI does a better job at collectives than you would by
emulating them with point-to-point calls
Review of Lecture 12: Synchronization and Data Movement
▪ Synchronization (barrier) MPI_Barrier(MPI_Comm comm)
• Explicit synchronization of all ranks from specified communicator
▪ Data movement (broadcast, scatter, gather)
• Broadcasting happens when one process wants to send the same information to every
other process.
MPI_Bcast(void* buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
• Scatter: Distributes distinct messages from a single root rank to each rank in the
communicator.
MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype
recvtype, int root, MPI_Comm comm)
• Gather: Receives a message from each rank and places the i-th rank's message at the i-th
position in the receive buffer.
int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int
recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm )
[Figure: data movement patterns of broadcast, scatter, and gather]
Review of Lecture 12: Global Computation in MPI
▪ Global computation (MPI_Reduce, scatter, gather)
• MPI_Reduce: Collective computation operation. Applies a reduction operation on all tasks in
communicator and places the result in root rank.
MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype,
MPI_Op op, int root, MPI_Comm comm);
• MPI_Scan: Performs an inclusive prefix reduction of the data stored in sendbuf at each
process; rank i receives in recvbuf the reduction of the values from ranks 0 through i.
MPI_Scan(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm);
MPI_Op op here indicates the reduce operation (MPI predefined or your own)
Overview of Lecture 13: Derived Datatypes in MPI
▪ MPI_Type_vector
▪ MPI_Type_create_subarray
▪ MPI_Type_create_struct
Prerequisite: Bits and Bytes in Computer
• A bit is the smallest unit of information in computing and
digital communication.
• Everything in a computer is 0's and 1's. The bit stores just
a 0 or 1: it's the smallest building block of storage.
• Byte: One byte = a collection of 8 bits, e.g. 0 1 0 1 1 0 1 0
Review of Lecture 4: At the core of CPU performance lies the transistor, a tiny
electronic switch that can either allow a signal to pass (representing the on state,
or 1) or block it (representing the off state, or 0). This fundamental behavior of
transistors is what enables the representation of a bit, the smallest unit of data
in computing.
Prerequisite: Bits and Bytes in Computer
• The byte is a unit of digital information that most commonly consists of 8 bits.
Historically, the byte was the number of bits used to encode a single character of
text in a computer.
• Different data types (e.g., int, float) require different amounts of memory
(bits/bytes).
• In programming, specifying the data type of a variable tells the computer how much
memory to allocate and how to interpret the data.
Predefined Data Types in MPI
• Different data types (e.g., int, float) require
different amounts of memory (bits/bytes).
• MPI provides predefined datatypes
like MPI_INT and MPI_FLOAT to ensure proper
memory allocation and data interpretation during
communication.
• MPI_Type_create_struct(…)
specifies the data layout of user-defined structs (or classes)
• MPI_Type_vector(…)
specifies strided data, i.e. same-type data with missing
elements
• MPI_Type_create_subarray(…)
specifies sub-ranges of multi-dimensional arrays
A Flexible, Vector-Like Type: MPI_Type_vector
▪ Creates a vector (strided) datatype:
MPI_Type_vector(int count, int blocklength, int stride, MPI_Datatype
oldtype, MPI_Datatype * newtype);
Input arguments:
• count is the number of contiguous blocks
• blocklength is the number of elements in each block
• stride is the number of elements between the starts of consecutive blocks
• oldtype is the datatype of the elements
Output arguments:
• newtype: new datatype (handle)
A Flexible, Vector-Like Type: MPI_Type_vector
MPI_Type_vector(int count, int blocklength, int stride, MPI_Datatype oldtype,
MPI_Datatype * newtype);
▪ Get the lower bound and the extent (span from the first byte to the last byte) of
datatype
• MPI_Type_get_extent(MPI_Datatype datatype, MPI_Aint *lb, MPI_Aint
*extent);
• Lower bound refers to the starting byte address of the datatype, while the extent
represents the span from the first byte to the last byte of the datatype.
• MPI_Aint is an MPI type that represents an address or offset in memory.
How to Obtain and Handle Address
▪ int MPI_Get_address(const void *location, MPI_Aint *address);
• Get the address of a location in memory
• (input argument) location: The element to obtain the address of.
• (output argument) address: Address of location
▪ Example: sending one column of an nrows × ncols row-major matrix.
MPI_Type_vector creates a vector (strided) datatype with count = nrows,
blocklength = 1 (one element per block), stride = ncols (the number of elements
between the starts of consecutive blocks), and MPI_FLOAT as the original datatype.
A Sub-array Type: MPI_Type_create_subarray
MPI_Type_create_subarray(int ndims, const int array_of_sizes[], const int
array_of_subsizes[], const int array_of_starts[], int order, MPI_Datatype
oldtype, MPI_Datatype *newtype)
Input arguments:
• ndims: number of array dimensions
• array_of_sizes: number of elements in each dimension of the full array
• array_of_subsizes: number of elements in each dimension of the subarray
• array_of_starts: starting coordinates of the subarray in each dimension
• order: array storage order flag (row-major: MPI_ORDER_C, or column-
major: MPI_ORDER_FORTRAN)
Output arguments:
• newtype: new datatype (handle)
Most Flexible Type: MPI_Type_create_struct
MPI_Type_create_struct(int block_count, const int block_lengths[],
const MPI_Aint displs[], MPI_Datatype block_types[],
MPI_Datatype *new_datatype)
Input arguments:
• block_count: The number of blocks to create.
• block_lengths : Array containing the length of each block.
• displs: Array containing the displacement for each block, expressed in bytes.
The displacement is the distance between the start of the MPI datatype created
and the start of the block.
• block_types : Type of elements in each block
Output arguments:
• newtype: new datatype (handle)
Most Flexible Type: MPI_Type_create_struct
▪ MPI_Type_create_struct is the most flexible routine to create an MPI datatype. It
describes blocks with arbitrary datatypes and arbitrary displacements.
MPI_Type_create_struct(int block_count, const int block_lengths[],
const MPI_Aint displs[], MPI_Datatype block_types[],
MPI_Datatype* new_datatype);
Derived Data Types: Summary
▪ A flexible tool to communicate complex data structures in MPI
▪ Most important calls:
• MPI_Type_create_struct(…)
specifies the data layout of user-defined structs (or classes)
• MPI_Type_vector(…)
specifies strided data, i.e. same-type data with missing elements
• MPI_Type_create_subarray(…)
specifies sub-ranges of multi-dimensional arrays
• MPI_Type_commit, MPI_Type_free
• MPI_Get_address, MPI_Aint_add, MPI_Aint_diff
▪ Matching rule: a send and a receive match if the specified basic datatypes match one by
one, regardless of displacements
▪ At the receiver side, incoming data items are automatically placed according to the
displacements of the receiver's datatype
Thank You!