Slides 2
Chapter 2
Message-Passing Computing
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen, 2004 Pearson Education Inc. All rights reserved.
[Figure: source files are compiled to suit each processor, producing executables that are loaded onto processor 0 through processor p - 1.]
[Figure: dynamic process creation — a running process executes spawn(), which starts execution of process 2.]
[Figure: passing a message between two processes — process 1 executes send(&x, 2) and the data in x moves to process 2, which executes recv(&y, 1) and receives it into y.]
[Figure: synchronous send() and recv() — (a) send() occurs first: the sending process issues a request to send and suspends until process 2 reaches recv() and returns an acknowledgment, after which the message is transferred and both processes continue; (b) recv() occurs first: the receiving process suspends until the request to send arrives, then the message and acknowledgment are exchanged and both processes continue.]
More than one version exists, depending upon the actual semantics for returning.
[Figure: message buffer between processes — process 1 executes send() and continues once the message is placed in the message buffer; process 2 later executes recv() and reads the message from the buffer.]
Buffers are only of finite length, and a point could be reached when the send routine is held up because all available buffer space has been exhausted. Then the send routine will wait until storage becomes available again - i.e., the routine then behaves as a synchronous routine.
Message Tag
Used to differentiate between different types of messages being
sent.
[Figure: message tag example — process 1 executes send(&x, 2, 5), sending x to process 2 with tag 5; process 2 executes recv(&y, 1, 5), accepting a message only from process 1 with tag 5 into y.]
Broadcast
Sending same message to all processes concerned with problem.
Multicast - sending same message to defined group of processes.
[Figure: broadcast action — process 0 through process p - 1 each call bcast(); the data in the root's buffer (buf) is copied to every process. MPI form: MPI_Bcast().]
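A minimal sketch of the MPI form (illustrative code, not from the original slides): every process in MPI_COMM_WORLD, including the root, makes the same MPI_Bcast() call, after which all processes hold a copy of the root's data.
int data[10];                                      /* same buffer supplied on every process */
MPI_Bcast(data, 10, MPI_INT, 0, MPI_COMM_WORLD);   /* root is rank 0 */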
Scatter
Sending each element of an array in root process to a separate
process. Contents of ith location of array sent to ith process.
[Figure: scatter action — process 0 through process p - 1 each call scatter(); element i of the root's array (buf) is sent to process i. MPI form: MPI_Scatter().]
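A minimal sketch of the MPI form (illustrative code, not from the original slides); sendbuf and recvbuf are assumed names, and the send count is the number of items sent to each individual process:
int recvbuf[10];                      /* each process receives 10 elements */
/* sendbuf is significant only at the root (rank 0) and holds 10 ints per process */
MPI_Scatter(sendbuf, 10, MPI_INT, recvbuf, 10, MPI_INT, 0, MPI_COMM_WORLD);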
Gather
Having one process collect individual values from set of processes.
[Figure: gather action — process 0 through process p - 1 each call gather(); the data item from process i is collected into location i of the root's buffer (buf). MPI form: MPI_Gather().]
Reduce
Gather operation combined with specified arithmetic/logical operation.
Example
Values could be gathered and then added together by root:
[Figure: reduce action — process 0 through process p - 1 each call reduce(); the data items are combined with the specified operation (here +) into the root's buffer (buf). MPI form: MPI_Reduce().]
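A minimal sketch of the MPI form (illustrative code, not from the original slides): each process contributes one local value and the sum is left in result on the root, rank 0.
int myvalue, result;                  /* myvalue is set locally on each process */
MPI_Reduce(&myvalue, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);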
[Figure: PVM on a network of workstations — each workstation runs a PVM daemon alongside the application program (executable); messages between the application programs are sent through the network via the daemons.]
MPI
Process Creation and Execution
Communicators
Defines the scope of a communication operation.
Initially, all processes are enrolled in a universe called MPI_COMM_WORLD.
[Figure: unsafe message passing with library calls — process 0 executes send(,1,) both in its own code and inside a library routine lib(), and process 1 executes recv(,0,) both in its own code and inside lib(); with matching on source and destination alone, a message intended for the library routine may be picked up by the wrong recv(), as the two scenarios show.]
MPI Solution
Communicators
Default Communicator
MPI_COMM_WORLD exists as the first communicator for all the processes existing in the application.
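A minimal sketch (illustrative, not from the original slides) of finding the number of processes in MPI_COMM_WORLD and this process's rank within it:
int numprocs, myrank;
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);   /* number of processes */
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);     /* this process's rank, 0 to numprocs - 1 */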
Point-to-Point Communication
Uses send and receive routines with message tags (and communicator). Wild card message tags are available.
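A minimal sketch of a wild-card receive (illustrative code, not from the original slides), using the standard MPI constants MPI_ANY_SOURCE and MPI_ANY_TAG:
int x;
MPI_Status status;
MPI_Recv(&x, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
/* status.MPI_SOURCE and status.MPI_TAG report the actual source and tag */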
Blocking Routines
Return when they are locally complete - when location used to hold
message can be used again or altered without affecting message
being sent.
A blocking send will send the message and return. This does not
mean that the message has been received, just that the process is
free to move on without adversely affecting the message.
Example
To send an integer x from process 0 to process 1:
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);              /* find rank */
if (myrank == 0) {
    int x;
    MPI_Send(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD);
} else if (myrank == 1) {
    int x;
    MPI_Status status;
    MPI_Recv(&x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status);
}
Nonblocking Routines
Nonblocking send - MPI_Isend() - will return immediately, even before the source location is safe to be altered.
Example
To send an integer x from process 0 to process 1 and allow process 0 to continue:
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);              /* find rank */
if (myrank == 0) {
    int x;
    MPI_Request req1;
    MPI_Status status;
    MPI_Isend(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD, &req1);
    compute();                                       /* overlap computation with the send */
    MPI_Wait(&req1, &status);
} else if (myrank == 1) {
    int x;
    MPI_Status status;
    MPI_Recv(&x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status);
}
Any type of send routine can be used with any type of receive
routine.
Collective Communication
Involves set of processes, defined by an intra-communicator.
Message tags not present.
Example
To gather items from the group of processes into process 0, using dynamically allocated memory in the root process, we might use
int data[10];                                     /* data to be gathered from processes */
    .
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);           /* find rank */
if (myrank == 0) {
    MPI_Comm_size(MPI_COMM_WORLD, &grp_size);     /* find group size */
    buf = (int *)malloc(grp_size*10*sizeof(int)); /* allocate memory */
}
MPI_Gather(data, 10, MPI_INT, buf, 10, MPI_INT, 0, MPI_COMM_WORLD);
/* note: the receive count (10) is the number of items received from each process, not the total */
Barrier
As in all message-passing systems, MPI provides a means of
synchronizing processes by stopping each one until they all have
reached a specific barrier call.
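A minimal sketch of the call (illustrative, not from the original slides): no process returns from MPI_Barrier() until every process in the communicator has reached it.
MPI_Barrier(MPI_COMM_WORLD);          /* all processes wait here for each other */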
Sample MPI program.
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#define MAXSIZE 1000
int main(int argc, char *argv[])
{
    int myid, numprocs;
    int data[MAXSIZE], i, x, low, high, myresult = 0, result;
    char fn[255];
    FILE *fp;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    if (myid == 0) {
        /* Open input file and initialize data */
        strcpy(fn, getenv("HOME"));
        strcat(fn, "/MPI/rand_data.txt");
        if ((fp = fopen(fn, "r")) == NULL) {
            printf("Can't open the input file: %s\n\n", fn);
            exit(1);
        }
        for (i = 0; i < MAXSIZE; i++) fscanf(fp, "%d", &data[i]);
    }
    /* broadcast data */
    MPI_Bcast(data, MAXSIZE, MPI_INT, 0, MPI_COMM_WORLD);
    /* Add my portion of data */
    x = MAXSIZE/numprocs;
    low = myid * x;
    high = low + x;
    for (i = low; i < high; i++)
        myresult += data[i];
    printf("I got %d from %d\n", myresult, myid);
    /* Compute global sum */
    MPI_Reduce(&myresult, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (myid == 0) printf("The sum is %d.\n", result);
    MPI_Finalize();
    return 0;
}
Computational Time
Can be estimated in a similar way to that of a sequential algorithm,
by counting number of computational steps. When more than one
process being executed simultaneously, count computational steps
of most complex process. Generally, some function of n and p, i.e.
tcomp = f(n, p)
The time units of tp are those of a computational step.
Often break down computation time into parts. Then
tcomp = tcomp1 + tcomp2 + tcomp3 + ...
where tcomp1, tcomp2, tcomp3, ... are the computation times of each part.
Analysis usually done assuming that all processors are the same and operating at the same speed.
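A hedged illustration (an example constructed here, not taken from the slides): if each of p processes first adds its n/p numbers and the partial sums are then combined in log2 p further steps, counting one addition as one computational step gives
tcomp = tcomp1 + tcomp2 = (n/p - 1) + log2 p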
Communication Time
Will depend upon the number of messages, the size of each
message, the underlying interconnection structure, and the mode of
transfer. Many factors, including network structure and network
contention. For a first approximation, we will use
tcomm1 = tstartup + n * tdata
for the communication time of message 1.
tstartup is the startup time, essentially the time to send a message
with no data. Assumed to be constant.
tdata is the transmission time to send one data word, also assumed
constant, and there are n data words.
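A hedged numerical illustration with invented values: if tstartup = 100 time steps, tdata = 1 time step, and a message carries n = 500 data words, then
tcomm = 100 + 500 * 1 = 600 time steps.
For short messages the startup term dominates; for long messages the n * tdata term dominates.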
[Figure: idealized communication time against the number of data items (n) — a straight line whose intercept is the startup time.]
Both startup and data transmission times, tstartup and tdata, are
measured in units of one computational step, so that we can add
tcomp and tcomm together to obtain the parallel execution time, tp.
Benchmark Factors
With ts, tcomp, and tcomm, can establish speedup factor and
computation/communication ratio for a particular algorithm/
implementation:
Speedup factor = ts / tp = ts / (tcomp + tcomm)

Computation/communication ratio = tcomp / tcomm
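A hedged numerical illustration with invented values: if ts = 10000 steps, tcomp = 2000 steps, and tcomm = 600 steps, then
Speedup factor = 10000 / (2000 + 600), approximately 3.8
Computation/communication ratio = 2000 / 600, approximately 3.3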
[Figure: space-time diagram of a parallel program — processes 1, 2, and 3 alternate between computing, waiting, and executing message-passing system routines, with arrows showing messages passed between processes.]
To measure the execution time between point L1 and point L2 in the code, we might have a construction such as:
L1: time(&t1);                     /* start timer */
    .
    .
L2: time(&t2);                     /* stop timer */
    .
elapsed_time = difftime(t2, t1);   /* elapsed_time = t2 - t1, in seconds */
printf("Elapsed time = %5.2f seconds", elapsed_time);
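Within an MPI program, a hedged alternative sketch (not the construction above) uses the standard routine MPI_Wtime(), which returns elapsed wall-clock time in seconds as a double:
double t1, t2;
t1 = MPI_Wtime();                  /* start timer */
/* ... code being timed ... */
t2 = MPI_Wtime();                  /* stop timer */
printf("Elapsed time = %f seconds\n", t2 - t1);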
Set up paths
Hostfile
Before starting MPI for the first time, need to create a hostfile.
Sample hostfile:
ws404
#is-sm1 //Currently not executing, commented
pvm1 //Active processors, UNCC sun cluster called pvm1 - pvm8
pvm2
pvm3
pvm4
pvm5
pvm6
pvm7
pvm8
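As a hedged usage note (exact commands depend on the particular MPI installation and are not taken from these slides): with an MPICH-style setup, an SPMD program might be compiled with a command such as mpicc -o prog prog.c and started on the listed hosts with a command such as mpirun -np 8 prog.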