Chapter - 2 - Parallel Hardware and Parallel Software
Peter Pacheco
Figure 2.1
Control unit: responsible for deciding which instruction in a program should be executed. (the boss)
Fetch/read: memory to CPU
Write/store: CPU to memory
Starting a thread is called forking.
Terminating a thread is called joining.
Figure 2.2
float z[1000];
…
sum = 0.0;
for (i = 0; i < 1000; i++)
    sum += z[i];
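The loop above reads consecutive elements of z, the access pattern usually used to illustrate why caches help. A minimal self-contained sketch follows; the helper name `sum_array` and the explicit declarations of `sum` and `i` are assumptions, since the slide elides them.

```c
#include <stddef.h>

/* Hypothetical helper (not from the slides): sums n floats in index order.
   z[i] and z[i+1] are adjacent in memory, so a cache line loaded for one
   element typically also holds the next few elements. */
float sum_array(const float *z, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += z[i];
    return sum;
}
```

Calling `sum_array(z, 1000)` on the slide's array performs the same accumulation as the original loop.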
Cache levels: L1, L2, L3

fetch x (x is in L1):
  L1: x, sum
  L2: y, z, total
  L3: A[ ], radius, r1, center

fetch x (x is only in main memory):
  L1: y, sum
  L2: r1, z, total
  L3: A[ ], radius, center
  main memory: x
superscalar
z = x + y;
if (z > 0)      /* speculation: z will be positive */
    w = x;
else
    w = y;
PARALLEL HARDWARE
MISD: multiple instruction stream, single data stream (not covered)
MIMD: multiple instruction stream, multiple data stream
[Figure labels: control unit, n ALUs, n data items]
Vector registers.
Capable of storing a vector of operands and
operating simultaneously on their contents.
Vector instructions.
Operate on vectors rather than scalars.
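As a rough illustration of the kind of work vector instructions target, here is an element-wise loop in plain C. The function name `vector_add` is hypothetical, and whether a compiler actually emits vector instructions for it depends on the compiler, its flags, and the hardware.

```c
#include <stddef.h>

/* Hypothetical example (not from the slides): adds y into x element by element.
   Every iteration applies the same operation to different data, so a
   vectorizing compiler may process several elements per instruction. */
void vector_add(double *x, const double *y, size_t n) {
    for (size_t i = 0; i < n; i++)
        x[i] += y[i];
}
```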
Figure 2.3
Figure 2.4
Two categories:
Shared memory interconnects
Distributed memory interconnects
Crossbar –
Allows simultaneous communication among
different devices.
Faster than buses.
But the cost of the switches and links is relatively
high.
(a) A crossbar switch connecting 4 processors (Pi) and 4 memory modules (Mj)
(b) Configuration of internal switches in a crossbar
Indirect interconnect
Switches may not be directly connected to a
processor.
Figure 2.8
Figure 2.9
Figure 2.10
Bisection bandwidth
A measure of network quality.
Instead of counting the number of links joining
the halves, it sums the bandwidth of the links.
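A small sketch of that definition: given the bandwidths of the links that cross the cut between the two halves, the bisection bandwidth is their sum. The function name and the array representation of the links are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not from the slides): sums the bandwidths of the
   links that join the two halves of the network. link_bw[i] is the
   bandwidth of the i-th crossing link, in any consistent unit. */
double bisection_bandwidth(const double *link_bw, size_t n_links) {
    double total = 0.0;
    for (size_t i = 0; i < n_links; i++)
        total += link_bw[i];
    return total;
}
```

For example, cutting a ring into two halves severs two links, so with links of one billion bits per second each the function returns two billion bits per second.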
impractical
Figure 2.11
Figure 2.12
Figure 2.13
Figure 2.14
Figure 2.15
Figure 2.16
latency (seconds)
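Latency is usually combined with bandwidth to estimate how long a message takes to arrive. The sketch below assumes the common model, transmission time = latency + message length / bandwidth; the model, the function name, and the parameter names are assumptions, since only the latency label survives in these slides.

```c
/* Hypothetical helper (assumed model, not taken from the slides):
   latency_s      - time before the first byte arrives, in seconds
   n_bytes        - message length in bytes
   bandwidth_bps  - sustained transfer rate in bytes per second */
double message_time(double latency_s, double n_bytes, double bandwidth_bps) {
    return latency_s + n_bytes / bandwidth_bps;
}
```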
Figure 2.17
x = 2; /* shared variable */
y0 eventually ends up equal to 2
y1 eventually ends up equal to 6
z1 = ???
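Linear speedup: $T_{\text{parallel}} = T_{\text{serial}} / p$

Speedup: $S = \frac{T_{\text{serial}}}{T_{\text{parallel}}}$

Efficiency: $E = \frac{S}{p} = \frac{T_{\text{serial}} / T_{\text{parallel}}}{p} = \frac{T_{\text{serial}}}{p \cdot T_{\text{parallel}}}$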
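The speedup and efficiency definitions translate directly into code. This is a minimal sketch; the function names are hypothetical and the run-times are assumed to be positive and measured in the same unit.

```c
/* S = Tserial / Tparallel */
double speedup(double t_serial, double t_parallel) {
    return t_serial / t_parallel;
}

/* E = S / p = Tserial / (p * Tparallel), where p is the number of cores */
double efficiency(double t_serial, double t_parallel, int p) {
    return speedup(t_serial, t_parallel) / p;
}
```

With linear speedup, Tparallel = Tserial / p, so `speedup` returns p and `efficiency` returns 1.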
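$0.1 \times T_{\text{serial}} = 2$

Overall parallel run-time is

$T_{\text{parallel}} = 0.9 \times T_{\text{serial}} / p + 0.1 \times T_{\text{serial}} = 18 / p + 2$

$S = \frac{T_{\text{serial}}}{0.9 \times T_{\text{serial}} / p + 0.1 \times T_{\text{serial}}} = \frac{20}{18 / p + 2}$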
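The worked example above (Tserial = 20, with 90% of the work parallelized) can be checked numerically. This is the calculation usually presented as Amdahl's law. The function names and the `parallel_fraction` parameter are hypothetical, and the parallel part is assumed to scale perfectly with p.

```c
/* Tparallel = f * Tserial / p + (1 - f) * Tserial,
   where f is the fraction of the serial run-time that is parallelized. */
double amdahl_parallel_time(double t_serial, double parallel_fraction, int p) {
    return parallel_fraction * t_serial / p
         + (1.0 - parallel_fraction) * t_serial;
}

/* S = Tserial / Tparallel */
double amdahl_speedup(double t_serial, double parallel_fraction, int p) {
    return t_serial / amdahl_parallel_time(t_serial, parallel_fraction, p);
}
```

For Tserial = 20 and f = 0.9, the run-time is 18 / p + 2, so the speedup stays below 20 / 2 = 10 however large p becomes.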
Timing functions: MPI_Wtime (MPI), omp_get_wtime (OpenMP)
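Both functions return wall-clock time in seconds as a double, and the usual pattern is to take the difference of two calls around the code being timed. The sketch below shows that pattern without requiring MPI or OpenMP: `wall_time` is a hypothetical stand-in built on the C11 function `timespec_get`, not part of either library.

```c
#include <time.h>

/* Hypothetical stand-in for MPI_Wtime / omp_get_wtime: returns wall-clock
   seconds as a double, using C11 timespec_get. */
double wall_time(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Times one call of work() and returns the elapsed wall-clock seconds. */
double time_call(void (*work)(void)) {
    double start = wall_time();
    work();
    double finish = wall_time();
    return finish - start;
}
```

In an MPI or OpenMP program, the two calls to `wall_time` would be replaced by MPI_Wtime or omp_get_wtime respectively.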