
CS326 – Parallel and Distributed Computing
Lecture #19, 20
Agenda

• Odd-Even Sort
  • Sequential formulation
  • Parallel formulation
  • Computational and communication complexity
• Collective communication operations in MPI
  • MPI_Barrier
  • MPI_Bcast
  • MPI_Reduce
  • Predefined reduction operations
  • MPI_Allreduce
  • MPI_Scan
  • MPI_Gatherv, MPI_Allgather and MPI_Scatter
  • MPI_Alltoall
Parallel Odd-Even Sort

* https://www.slideshare.net/richakumari37266/parallel-sorting-algorithm
Collective Communication and Computation Operations

• MPI provides its own optimized implementations for most of the collective
  operations that we performed in Chapter 4
• These operations are called collective because all of the processes must call
  the collective function
• Every collective operation takes a communicator (such as MPI_COMM_WORLD) as an
  argument
  • All the processes within that communicator must make a corresponding call to
    the operation
Collective Communication and Computation Operations

Barrier synchronization operation

• The barrier synchronization operation is performed in MPI using:

  int MPI_Barrier(MPI_Comm comm)

• The call to MPI_Barrier returns only after all the processes in the group have
  called this function

• Example program: barrier.c
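A minimal sketch of what barrier.c might look like (the actual course file is not shown, so this is only an illustration of the call):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    printf("Process %d reached the barrier\n", rank);
    MPI_Barrier(MPI_COMM_WORLD);   /* blocks until every process in the communicator calls it */
    printf("Process %d passed the barrier\n", rank);

    MPI_Finalize();
    return 0;
}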


Collective Communication and Computation Operations

The one-to-all broadcast:

  int MPI_Bcast(void *buf, int count, MPI_Datatype datatype,
                int source, MPI_Comm comm)

• The buffer of the source process is copied to the buffers of all other
  processes

• Example program: broadcast.c
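A minimal sketch of what broadcast.c might look like (assumed contents, not the actual course file). The source process fills the buffer; after MPI_Bcast every process holds the same value:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, data = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        data = 42;                 /* only the source sets the value */

    MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Process %d has data = %d\n", rank, data);

    MPI_Finalize();
    return 0;
}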


Collective Communication and Computation Operations

The all-to-one reduction operation

• Dual of the one-to-all broadcast
• Every process, including the target, provides a sendbuf holding the value to be
  used in the reduction
• After the reduction, the reduced value is stored in the recvbuf of the target
  process
• Every process must also provide a recvbuf, even though it may not be the target
  of the reduction

  int MPI_Reduce(void *sendbuf, void *recvbuf, int count,
                 MPI_Datatype datatype, MPI_Op op, int target,
                 MPI_Comm comm)

• Here MPI_Op is an MPI-defined set of operations for reduction
• Example program: all_to_one_reduction.c
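A minimal sketch of what all_to_one_reduction.c might look like (assumed). Every process contributes its rank, and process 0 receives the sum of all ranks:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, sum = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)                 /* only the target holds the reduced value */
        printf("Sum of all ranks = %d\n", sum);

    MPI_Finalize();
    return 0;
}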
Collective Communication and Computation Operations

The all-to-one reduction operation

[Figure: illustration of the all-to-one reduction operation]
Collective Communication and Computation Operations

MPI_MAXLOC and MPI_MINLOC

• The operation MPI_MAXLOC combines pairs of values (vi, li) and returns the pair
  (v, l) such that v is the maximum among all vi's and l is the corresponding li
  (if there is more than one such pair, it is the smallest among these li's)
• MPI_MINLOC does the same, except for the minimum value of vi

[Figure: an example use of the MPI_MINLOC and MPI_MAXLOC operators]
Collective Communication and Computation Operations

MPI_MAXLOC and MPI_MINLOC

MPI datatypes for the data pairs used with the MPI_MAXLOC and MPI_MINLOC
reduction operations:

  MPI Datatype           C Datatype
  MPI_2INT               pair of ints
  MPI_SHORT_INT          short and int
  MPI_LONG_INT           long and int
  MPI_LONG_DOUBLE_INT    long double and int
  MPI_FLOAT_INT          float and int
  MPI_DOUBLE_INT         double and int

• Example program: maxloc_reduction.c
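A minimal sketch of what maxloc_reduction.c might look like (assumed). Each process contributes a (value, rank) pair in the MPI_DOUBLE_INT layout; the target receives the maximum value together with the rank that owns it:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank;
    struct { double value; int rank; } in, out;   /* layout matching MPI_DOUBLE_INT */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    in.value = 100.0 - rank;       /* some per-process value (illustrative) */
    in.rank  = rank;               /* location tag carried along with the value */

    MPI_Reduce(&in, &out, 1, MPI_DOUBLE_INT, MPI_MAXLOC, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Max value %.1f found on process %d\n", out.value, out.rank);

    MPI_Finalize();
    return 0;
}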
Collective Communication and Computation Operations

The All-Reduce operation

• MPI_Allreduce is used when the result of the reduction operation is needed by
  all processes
• Equivalent to an all-to-one reduction followed by a one-to-all broadcast

  int MPI_Allreduce(void *sendbuf, void *recvbuf,
                    int count, MPI_Datatype datatype,
                    MPI_Op op, MPI_Comm comm)

• After the Allreduce operation, the recvbuf of every process contains the
  reduced value
• Note: no target for the reduction is given
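A minimal sketch of an MPI_Allreduce example (no example file is named on this slide, so this is illustrative only). Every process ends up with the same global sum:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, sum;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Allreduce(&rank, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    printf("Process %d: sum of all ranks = %d\n", rank, sum);   /* same value on every rank */

    MPI_Finalize();
    return 0;
}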
Collective Communication and Computation Operations

Prefix (scan) operation

• Recall section 4.3 on prefix-sum: after the operation, every process has the
  sum of the buffers of the previous processes and its own
• MPI_Scan() is the MPI primitive for prefix operations
• All the operators that can be used for reduction can also be used for the scan
  operation
• If the buffer is an array of elements, then recvbuf is also an array containing
  the element-wise prefix at each position

  int MPI_Scan(void *sendbuf, void *recvbuf, int count,
               MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)

• Program example: scan.c
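A minimal sketch of what scan.c might look like (assumed). After MPI_Scan with MPI_SUM, process i holds rank 0 + ... + rank i, i.e. the inclusive prefix sum:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, prefix;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Scan(&rank, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    printf("Process %d: inclusive prefix sum = %d\n", rank, prefix);

    MPI_Finalize();
    return 0;
}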
Collective Communication and Computation Operations

Exclusive-Prefix (Exscan) operation

• Exclusive prefix-sum: after the operation, every process has the sum of the
  buffers of the previous processes, excluding its own
• MPI_Exscan() is the MPI primitive for exclusive-prefix operations
• The recvbuf of the first process remains unchanged, as there is no process
  before it. Some MPI distributions place the identity value of the given
  associative operator there.

  int MPI_Exscan(void *sendbuf, void *recvbuf, int count,
                 MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)

• Program example: escan.c
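A minimal sketch of what escan.c might look like (assumed). After MPI_Exscan with MPI_SUM, process i holds rank 0 + ... + rank i-1; the result on process 0 is left undefined by the standard, as noted above:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, prefix = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Exscan(&rank, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Process 0: recvbuf is undefined after MPI_Exscan\n");
    else
        printf("Process %d: exclusive prefix sum = %d\n", rank, prefix);

    MPI_Finalize();
    return 0;
}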
Collective Communication and Computation Operations

MPI_Gather and its variants

• Recall section 4.4: after the Gather operation, a single target process
  accumulates (concatenates) the buffers of all the other processes without any
  reduction operator
• Each process sends the element(s) in its sendbuf to the target process
• The total number of elements sent by each process must be the same. This number
  is specified in sendcount and is equal to recvcount.
• On the target, recvbuf stores the elements sent by all the processes in rank
  order. Elements received from process i are stored starting at index
  i*sendcount of recvbuf.

  int MPI_Gather(void *sendbuf, int sendcount,
                 MPI_Datatype senddatatype, void *recvbuf,
                 int recvcount, MPI_Datatype recvdatatype,
                 int target, MPI_Comm comm)

• Example program: gather.c
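A minimal sketch of what gather.c might look like (assumed). Each process sends its rank and process 0 collects the ranks in rank order:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, size, *recvbuf = NULL;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0)                                       /* only the target needs recvbuf */
        recvbuf = malloc(size * sizeof(int));

    MPI_Gather(&rank, 1, MPI_INT, recvbuf, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < size; i++)
            printf("recvbuf[%d] = %d\n", i, recvbuf[i]); /* element i came from rank i */
        free(recvbuf);
    }

    MPI_Finalize();
    return 0;
}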
Collective Communication and Computation Operations

MPI_Gather and its variants

• Gatherv
  • Each process can have a different message length
  • recvcounts[i] = number of elements to be received from the ith process
  • displs[i] = starting index in recvbuf at which to store the message received
    from the ith process

  int MPI_Gatherv(void *sendbuf, int sendcount,
                  MPI_Datatype senddatatype, void *recvbuf,
                  int *recvcounts, int *displs,
                  MPI_Datatype recvdatatype, int target,
                  MPI_Comm comm)

• Example program: gatherv.c


Collective Communication and Computation Operations

MPI_Gather and its variants

• Gatherv (displs calculation example)
  • Let each process have one more element than its rank
  • Then displs[] at the target is calculated as follows:

                 P0    P1        P2          P3
    data         32    12, 15    4, 9, 14    20, 23, 27, 31
    recvcounts   1     2         3           4
    displs       0     0+1=1     1+2=3       3+3=6

• Example program: gatherv.c (see the sketch below)
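A minimal sketch of what gatherv.c might look like (assumed). Each process contributes rank+1 elements, matching the table above, and the target computes recvcounts[] and displs[] before gathering:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int sendcount = rank + 1;                  /* one more element than the rank */
    int *sendbuf = malloc(sendcount * sizeof(int));
    for (int i = 0; i < sendcount; i++)
        sendbuf[i] = rank * 10 + i;            /* illustrative data */

    int *recvbuf = NULL, *recvcounts = NULL, *displs = NULL;
    if (rank == 0) {
        recvcounts = malloc(size * sizeof(int));
        displs     = malloc(size * sizeof(int));
        int total = 0;
        for (int i = 0; i < size; i++) {
            recvcounts[i] = i + 1;             /* elements expected from rank i */
            displs[i]     = total;             /* running sum of previous counts */
            total        += recvcounts[i];
        }
        recvbuf = malloc(total * sizeof(int));
    }

    MPI_Gatherv(sendbuf, sendcount, MPI_INT,
                recvbuf, recvcounts, displs, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        int total = size * (size + 1) / 2;
        for (int i = 0; i < total; i++)
            printf("%d ", recvbuf[i]);
        printf("\n");
        free(recvbuf); free(recvcounts); free(displs);
    }
    free(sendbuf);
    MPI_Finalize();
    return 0;
}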


Collective Communication and Computation Operations

MPI_Gather and its variants

• MPI_Allgather
  • Same as the all-to-all broadcast described in section 4.2
  • Every process serves as a target for the gather

  int MPI_Allgather(void *sendbuf, int sendcount,
                    MPI_Datatype senddatatype, void *recvbuf,
                    int recvcount, MPI_Datatype recvdatatype,
                    MPI_Comm comm)

• Note: no target for the gather
• Unlike MPI_Gather, it gathers the sendbufs of all the processes into the
  recvbufs of all the processes
• Example program: allgather.c
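A minimal sketch of what allgather.c might look like (assumed). Every process contributes its rank and every process receives the full array of ranks:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *recvbuf = malloc(size * sizeof(int));  /* every process needs a full recvbuf */
    MPI_Allgather(&rank, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    printf("Process %d received:", rank);
    for (int i = 0; i < size; i++)
        printf(" %d", recvbuf[i]);
    printf("\n");

    free(recvbuf);
    MPI_Finalize();
    return 0;
}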


Collective Communication and Computation Operations

MPI_Gather and its variants

• MPI_Allgatherv

  int MPI_Allgatherv(void *sendbuf, int sendcount,
                     MPI_Datatype senddatatype, void *recvbuf,
                     int *recvcounts, int *displs,
                     MPI_Datatype recvdatatype, MPI_Comm comm)

  • Here every process must supply valid, calculated recvcounts and displs arrays
  • Furthermore, every process must provide a recvbuf (an array) of sufficient
    size to store all the elements from all the processes
Collective Communication and Computation Operations

MPI_Scatter

• Scatters the data stored in the sendbuf of the source process among all the
  processes, as discussed in Chapter 4

  int MPI_Scatter(void *sendbuf, int sendcount,
                  MPI_Datatype senddatatype, void *recvbuf,
                  int recvcount, MPI_Datatype recvdatatype,
                  int source, MPI_Comm comm)

• sendcount and recvcount should be the same and represent the number of
  elements given to each process
• Example program: scatter.c
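A minimal sketch of what scatter.c might look like (assumed). The source prepares one value per process; each process receives the element whose index equals its rank:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, size, recv;
    int *sendbuf = NULL;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                            /* only the source fills sendbuf */
        sendbuf = malloc(size * sizeof(int));
        for (int i = 0; i < size; i++)
            sendbuf[i] = 100 + i;
    }

    MPI_Scatter(sendbuf, 1, MPI_INT, &recv, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Process %d received %d\n", rank, recv);

    if (rank == 0) free(sendbuf);
    MPI_Finalize();
    return 0;
}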


Collective Communication and Computation Operations

MPI_Scatterv

• Here sendcounts is an array of size P such that its ith index contains the
  number of elements to be sent to the ith process
• displs[i] indicates the index in sendbuf from which sendcounts[i] values are to
  be sent to the ith process

  int MPI_Scatterv(void *sendbuf, int *sendcounts, int *displs,
                   MPI_Datatype senddatatype, void *recvbuf,
                   int recvcount, MPI_Datatype recvdatatype,
                   int source, MPI_Comm comm)

• The values of sendbuf and sendcounts at all processes except the source are
  ignored, but the pointers must still be provided, even if they point to nothing
• Every process has to calculate its own recvcount
• Example program: scatterv.c
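A minimal sketch of what scatterv.c might look like (assumed). The source sends rank+1 elements to each process, mirroring the gatherv example; every process computes its own recvcount as rank+1:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *sendbuf = NULL, *sendcounts = NULL, *displs = NULL;
    if (rank == 0) {                            /* only the source builds these arrays */
        sendcounts = malloc(size * sizeof(int));
        displs     = malloc(size * sizeof(int));
        int total = 0;
        for (int i = 0; i < size; i++) {
            sendcounts[i] = i + 1;              /* elements destined for rank i */
            displs[i]     = total;              /* offset of rank i's block in sendbuf */
            total        += sendcounts[i];
        }
        sendbuf = malloc(total * sizeof(int));
        for (int i = 0; i < total; i++)
            sendbuf[i] = i;                     /* illustrative data */
    }

    int recvcount = rank + 1;                   /* each process knows its own count */
    int *recvbuf = malloc(recvcount * sizeof(int));

    MPI_Scatterv(sendbuf, sendcounts, displs, MPI_INT,
                 recvbuf, recvcount, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d received %d element(s)\n", rank, recvcount);

    free(recvbuf);
    if (rank == 0) { free(sendbuf); free(sendcounts); free(displs); }
    MPI_Finalize();
    return 0;
}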
Questions
References
1. Kumar, V., Grama, A., Gupta, A., & Karypis, G. (2017). Introduction to Parallel Computing. Redwood City, CA: Benjamin/Cummings.
