PC 2
Parallel Programs
Content
3. Excess Computation:
The fastest sequential algorithm for a problem may be difficult or impossible to parallelize.
In that case, use a parallel algorithm based on a poorer but easily parallelizable sequential algorithm,
that is, one with a higher degree of concurrency.
The difference between the computation performed by the parallel program and by the best serial
program is the excess computation overhead incurred by the parallel program.
Even a parallel algorithm based on the best serial algorithm may still perform more
aggregate computation than the serial algorithm.
Example: Fast Fourier Transform algorithm.
In its serial version, the results of certain computations can be reused.
In the parallel version, these results cannot be reused because they are generated by different PEs.
Therefore, some computations are performed multiple times on different PEs.
Total parallel overhead
The overheads incurred by a parallel program are encapsulated
into a single expression referred to as the overhead function.
We define overhead function or total overhead of a parallel
system, To, as the total time collectively spent by all the PEs
over and above that required by the fastest known sequential
algorithm for solving the same problem on a single processing
element.
Consider:
The total time spent in solving a problem, summed over all
processing elements, is pTP.
TS units of this time are spent performing useful work,
and the remainder is overhead. Therefore, To = pTP - TS.
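As a minimal sketch, the overhead function can be computed directly as To = pTP - TS, which follows from the definition of total overhead given above. The function name and the sample timings below are hypothetical and chosen only for illustration.

```python
def total_overhead(p: int, t_parallel: float, t_serial: float) -> float:
    """Total overhead To = p * TP - TS.

    p          -- number of processing elements (PEs)
    t_parallel -- parallel runtime TP on p PEs
    t_serial   -- runtime TS of the fastest known sequential algorithm
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    return p * t_parallel - t_serial


# Hypothetical timings: the best serial run takes 100 time units,
# and a run on 8 PEs takes 20 time units.
overhead = total_overhead(p=8, t_parallel=20.0, t_serial=100.0)
print(overhead)  # 60.0
```

With these made-up numbers, the 8 PEs collectively spend 160 time units, of which 100 are useful work and 60 are overhead.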
3. Software inertia: billions of dollars' worth of existing software makes it hard to switch
to parallel systems; the cost of converting the “decks” to parallel programs and retraining
the programmers is prohibitive.
However:
Not all programs needed in the future have already been written.
New applications will be developed, and new problems will become solvable with increased performance.
Students are being trained to think parallel.
Tools are being developed to transform sequential code into parallel code automatically.
Asymptotic Analysis of Parallel Programs
Evaluating a set of parallel programs for solving a given problem
Example: sorting
The fastest serial programs for this problem run in time O(n log n).
Let us look at four different parallel algorithms A1, A2, A3, and A4, for sorting a given list.
Objective of this exercise is to determine which of these four algorithms is the best.

        A1           A2         A3             A4
p       n^2          log n      n              √n
TP      1            n          √n             √n log n
S       n log n      log n      √n log n       √n
E       (log n)/n    1          (log n)/√n     1
pTP     n^2          n log n    n^1.5          n log n
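The S, E, and pTP rows can be checked from the p and TP rows. The worked derivation below does this for A3, assuming the usual definitions of speedup (S = TS/TP), efficiency (E = S/p), and cost (pTP), with TS = n log n taken from the serial bound and constant factors ignored.

```latex
\begin{align*}
T_S &= n \log n, \qquad p = n, \qquad T_P = \sqrt{n} \\
S &= \frac{T_S}{T_P} = \frac{n \log n}{\sqrt{n}} = \sqrt{n}\,\log n \\
E &= \frac{S}{p} = \frac{\sqrt{n}\,\log n}{n} = \frac{\log n}{\sqrt{n}} \\
p\,T_P &= n \cdot \sqrt{n} = n^{1.5}
\end{align*}
```

The same three steps applied to A1, A2, and A4 give the remaining columns.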
Sorting example
The simplest metric is speed:
the algorithm with the lowest TP is the best.
By this metric, algorithm A1 is the best, followed by A3, A4, and A2.
By efficiency, the ranking changes: A2 and A4 are the best, followed by A3 and A1.
The costs of algorithms A1 and A3 (n^2 and n^1.5) are higher than the serial runtime of
n log n, and therefore neither of these algorithms is cost optimal.
Algorithms A2 and A4 are cost optimal.
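The two rankings can be reproduced numerically. The sketch below is illustrative only: it treats the asymptotic expressions for p and TP as exact formulas with constant factor 1 and base-2 logarithms, which is an assumption made for the example and not part of the analysis itself. The function name metrics and the choice n = 2^20 are likewise hypothetical.

```python
import math


def metrics(n: float) -> dict:
    """Evaluate TP, S, E, and cost for A1..A4 at a concrete n.

    Assumption: asymptotic terms are used as exact values
    (constant factor 1, log base 2).
    """
    log_n = math.log2(n)
    root_n = math.sqrt(n)
    t_serial = n * log_n  # best serial runtime, n log n

    # (p, TP) pairs for the four algorithms
    algos = {
        "A1": (n * n, 1.0),
        "A2": (log_n, n),
        "A3": (n, root_n),
        "A4": (root_n, root_n * log_n),
    }

    table = {}
    for name, (p, t_par) in algos.items():
        speedup = t_serial / t_par
        table[name] = {
            "TP": t_par,
            "S": speedup,
            "E": speedup / p,
            "cost": p * t_par,
        }
    return table


table = metrics(2.0 ** 20)

# Rank by speed: lowest parallel runtime first.
by_speed = sorted(table, key=lambda a: table[a]["TP"])

# Under the constant-factor-1 assumption, cost optimal means E == 1.
cost_optimal = [a for a in table if math.isclose(table[a]["E"], 1.0)]

print(by_speed)      # ['A1', 'A3', 'A4', 'A2']
print(cost_optimal)  # ['A2', 'A4']
```

In a real analysis, cost optimality means that pTP grows no faster than TS asymptotically; the equality test on E here works only because the constants are fixed at 1.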
Conclusions:
It is important to first understand the objectives of parallel algorithm
analysis and to use appropriate metrics, because different
metrics may often give contradictory outcomes.