Lect 5 Brent

The document discusses parallel and distributed algorithms, specifically focusing on merging two sorted arrays using PRAM algorithms and Brent's Law. It explains the process of parallel merging with binary search and introduces Brent's Theorem, which relates to the efficiency of parallel computations. Additionally, it covers cost-optimal solutions and non-obvious applications of prefix sums and polynomial evaluation in parallel computing.



PARALLEL AND DISTRIBUTED ALGORITHMS


BY
DEBDEEP MUKHOPADHYAY
AND
ABHISHEK SOMANI
http://cse.iitkgp.ac.in/~debdeep/courses_iitkgp/PAlgo/index.htm

PRAM ALGORITHMS:
BRENT’S LAW

MERGING TWO SORTED ARRAYS


An optimal RAM algorithm creates the merged list one element at a
time.
 Requires at most n-1 comparisons to merge two sorted lists of n/2 elements each.
 Time complexity: Θ(n).

 Can we do it in less time?

PARALLEL MERGE
Consider two sorted lists of distinct elements, each of size n/2.
We spawn n processors, one for each element of the list to be
merged.
In parallel, each processor performs a binary search for its element in the other half of the array.
 Element in the lower half of the array performs a binary search in the upper half.
 Element in the upper half of the array performs a binary search in the lower half.


THE TASK OF P3

Lower half, A[1] … A[8]:  1  5  7  13  17  19  23
Upper half, A[9] … A[16]: 2  4  8  11  12  21  24

A[i=3]=7 is larger than i-1 = 3-1 = 2 elements in the lower array (lower with respect to index).
P3 performs a binary search with A[3] in the upper array and gets a position high = index of the largest element smaller than 7, so high = 10.
Thus, 7 is larger than 2 elements in the lower array and larger than (high - n/2) = 10-8 = 2 elements in the upper array.
So P3 can calculate the position of 7 in the merged list: it comes after (i-1) + (high - n/2) elements, i.e. the position is (i + high - n/2).

THE TASK OF P11

Lower half, A[1] … A[8]:  1  5  7  13  17  19  23
Upper half, A[9] … A[16]: 2  4  8  11  12  21  24

A[i=11]=8 is larger than i-(n/2+1) = 11-9 = 2 elements in the upper array (lower with respect to index).
P11 performs a binary search with A[11] in the lower array and gets a position high = index of the largest element smaller than 8, so high = 3.
Thus, 8 is larger than 2 elements in the upper array and larger than high = 3 elements in the lower array.
So P11 can calculate the position of 8 in the merged list: it comes after (i - n/2 - 1) + high elements, i.e. the position is (i + high - n/2).
Thus the same expression is used to place the elements in their proper position in the merged list.


THE PRAM ALGORITHM
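
A minimal sketch of this parallel merge, with 0-based indexing assumed (the slides use 1-based indexing); each loop iteration plays the role of one PRAM processor P_i, and the names are illustrative only:

```python
from bisect import bisect_left

def parallel_merge(A):
    """Merge A[0:n//2] and A[n//2:n], both sorted, with distinct elements.
    Each loop iteration plays the role of one PRAM processor P_i: it
    binary-searches its element in the other half and writes it straight
    to its final position in the merged array (all target positions are
    distinct, so there are no write conflicts)."""
    n = len(A)
    half = n // 2
    merged = [None] * n
    for i in range(n):                            # done in parallel on a PRAM
        if i < half:                              # element from the lower half
            # count of upper-half elements smaller than A[i]
            smaller = bisect_left(A, A[i], half, n) - half
            merged[i + smaller] = A[i]
        else:                                     # element from the upper half
            # count of lower-half elements smaller than A[i]
            smaller = bisect_left(A, A[i], 0, half)
            merged[(i - half) + smaller] = A[i]
    return merged

# Values taken from the slides' example halves:
print(parallel_merge([1, 5, 7, 13, 2, 4, 8, 11]))   # [1, 2, 4, 5, 7, 8, 11, 13]
```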

PRAM (CONTD.)

Note that the final writing into the array is done by the processors
without any conflict. All the locations are distinct.
Also note that the total number of operations performed has increased from Θ(n) in the sequential algorithm to Θ(n log n) in the parallel algorithm.


COST-OPTIMAL SOLUTIONS
We have seen examples of PRAM algorithms which are not cost-optimal.
Is there a cost-optimal parallel reduction algorithm that also has the same time complexity?

BRENT’S THEOREM (1974)


Assume a parallel computer where each processor can perform an
operation in unit time.
Further, assume that the computer has exactly enough processors to
exploit the maximum concurrency in an algorithm with M operations,
such that T time steps suffice.
Brent’s Theorem says that a similar computer with fewer processors, P, can perform the algorithm in time T_P ≤ T + (M − T)/P.


BRENT’S THEOREM (PROOF)


Let s_i denote the number of computational operations performed by the parallel algorithm A at step i, where 1 ≤ i ≤ T.
By definition, M = ∑_{i=1}^{T} s_i.
Thus, using P processors we can simulate step i in time ⌈s_i/P⌉.
By definition,

 T_P ≤ ∑_{i=1}^{T} ⌈s_i/P⌉ ≤ ∑_{i=1}^{T} ((s_i − 1)/P + 1) = T + ∑_{i=1}^{T} (s_i − 1)/P = T + (M − T)/P.

Note that this reduction is work-preserving, meaning that the total work does not change.
Also note that P is smaller than the initial number of processors, which manifests itself as an increase in the time required.
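
A small sketch (assumed, not from the slides) that makes this simulation argument concrete: step i's s_i operations are executed in ⌈s_i/P⌉ rounds, and the total time never exceeds T + (M − T)/P.

```python
import math

def simulated_time(s, P):
    """Time to simulate the algorithm on P processors: step i's s[i]
    operations take ceil(s[i]/P) rounds."""
    return sum(math.ceil(si / P) for si in s)

# Operation profile of a tree reduction of 16 values:
# level i performs 16 / 2^i additions, i = 1..4.
s = [8, 4, 2, 1]
M, T, P = sum(s), len(s), 3
print(simulated_time(s, P))      # 7
print(T + (M - T) / P)           # Brent's bound: 4 + 11/3 ≈ 7.67
```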


APPLICATION TO PARALLEL REDUCTION


We know of a solution with a large number of processors (n processors), which takes Θ(log n) time.
Let us reduce the number of processors to p = n/log n processors.
Thus, by Brent's Theorem,

 T_p ≤ log n + ((n − 1) − log n) / (n / log n) = Θ(log n).

Thus reducing the number of processors from n to n/log n does not change the complexity of the parallel algorithm.
If the total number of operations performed by the parallel algorithm is the same as in an optimal sequential algorithm, then a cost-optimal parallel algorithm does exist.
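
A sketch of the resulting cost-optimal reduction, shown here for summation (the function name and structure are illustrative, not from the slides): each of the roughly n/log n simulated processors first sums a block of about log n values sequentially, and the partial sums are then combined by the usual logarithmic tree.

```python
import math

def cost_optimal_sum(a):
    """Sum n values with ~n/log n (simulated) processors.
    Phase 1: each processor sums a block of ~log2(n) values sequentially.
    Phase 2: pairwise tree reduction over the partial sums (log p steps)."""
    n = len(a)
    block = max(1, int(math.log2(n)))
    partial = [sum(a[j:j + block]) for j in range(0, n, block)]   # phase 1
    while len(partial) > 1:                                       # phase 2
        partial = [partial[k] + (partial[k + 1] if k + 1 < len(partial) else 0)
                   for k in range(0, len(partial), 2)]
    return partial[0]

print(cost_optimal_sum(list(range(1, 17))))   # 136
```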


AN ORDER ANALYSIS: WORK-DEPTH MODEL


Let W_i denote the computation (number of operations) in the i-th level.
Thus, by assigning ⌈W_i/P⌉ operations to each of the P processors in the PRAM, the operations for level i can be performed in ⌈W_i/P⌉ steps.
Summing the time over all the D (depth) levels,

 T(W, D, P) = O(∑_{i=1}^{D} ⌈W_i/P⌉) = O(∑_{i=1}^{D} (W_i/P + 1)) = O(W/P + D).

Note: W is the total work done by the sequential algorithm, which we have assumed is the same as the work of the parallel algorithm.
The total cost incurred by the PRAM is O(W + PD).
A cost-optimal solution can thus be obtained if PD ≤ W, or P ≤ W/D.
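
A tiny illustration of this bound (the tree-reduction figures below are standard assumptions, not taken from the slides): for a reduction of n values, W = n − 1 and D = log2 n, so any P ≤ W/D keeps the cost P·T(W, D, P) at O(W).

```python
import math

def time_bound(W, D, P):
    """Work-depth bound on the parallel time: O(W/P + D)."""
    return W / P + D

n = 1024
W, D = n - 1, int(math.log2(n))   # tree reduction: W = n - 1, D = log2 n
P = W // D                        # largest P with P*D <= W
print(P, time_bound(W, D, P), P * time_bound(W, D, P))   # cost stays O(W)
```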


EXERCISES
Now think about the cost-optimal solutions that we discussed, like reduction, prefix sum, suffix sum, pointer jumping, tree traversal, etc., in the light of Brent's Law.


NON-OBVIOUS APPLICATIONS OF PREFIX SUM / PARALLEL SCAN / REDUCTIONS
Suppose we have an array of 0's and 1's, and we want to determine how many 1's the array begins with.
 Ex: (1,1,1,0,1,1,0,1). The answer is 3.

It may be non-intuitive to think of an associative operator which we might use here!
However, there is a common trick, which we can try to learn.


THE TRICK
Let us describe any segment of the array by the notation (x,p):
 x denotes the number of leading 1's in the segment
 p denotes whether the segment contains only 1's.

Thus, each element a_i is replaced by (a_i, a_i).
How do we combine (x,p) and (y,q)?
Let us define an operator ⊗ to do this.
It is intuitive that (x,p) ⊗ (y,q) = (x+py, pq). Why?
Is this operator associative?
 ((x,p) ⊗ (y,q)) ⊗ (z,r) = (x+py, pq) ⊗ (z,r) = (x+py+pqz, pqr)
 (x,p) ⊗ ((y,q) ⊗ (z,r)) = (x,p) ⊗ (y+qz, qr) = (x+p(y+qz), pqr) = (x+py+pqz, pqr)

Now all the previous parallelizations can be applied.
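
A minimal sketch of this trick (a sequential fold stands in for the parallel reduction; since ⊗ is associative, any parallel grouping gives the same answer):

```python
from functools import reduce

def otimes(a, b):
    """(x, p) ⊗ (y, q) = (x + p*y, p*q): x counts leading 1's,
    p is 1 iff the segment consists only of 1's."""
    x, p = a
    y, q = b
    return (x + p * y, p * q)

def leading_ones(bits):
    # Each element a_i is replaced by the pair (a_i, a_i) before reducing.
    return reduce(otimes, [(b, b) for b in bits])[0]

print(leading_ones([1, 1, 1, 0, 1, 1, 0, 1]))   # 3
```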


EXAMPLE
Input:      1     1     1     0     1     1     0     1
Pairs:    (1,1) (1,1) (1,1) (0,0) (1,1) (1,1) (0,0) (1,1)
Level 1:     (2,1)       (1,0)       (2,1)       (0,0)
Level 2:           (3,0)                   (2,0)
Result:                     (3,0)



CAN YOU EVALUATE A POLYNOMIAL IN PARALLEL USING A SIMILAR METHOD?
Consider a polynomial: a_0·x^(n-1) + a_1·x^(n-2) + … + a_(n-2)·x + a_(n-1).
Each segment also denotes a polynomial; for example, the first two coefficients denote a_0·x + a_1.
Let us consider (p,y) to denote a segment:
 p denotes the value of the segment's polynomial evaluated at x
 y denotes the value of x^n, where n is the length of the segment.

Thus, each element a_i is replaced by (a_i, x).
How do we combine (p,y) and (q,z)?
Let us define an operator ⊗ to do this.
It is intuitive that (p,y) ⊗ (q,z) = (pz+q, yz). Why?
Is this operator associative?
 ((a,x) ⊗ (b,y)) ⊗ (c,z) = (ay+b, xy) ⊗ (c,z) = (ayz+bz+c, xyz)
 (a,x) ⊗ ((b,y) ⊗ (c,z)) = (a,x) ⊗ (bz+c, yz) = (ayz+bz+c, xyz)

Now all the previous parallelizations can be applied.
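
The same kind of sketch for polynomial evaluation (again a sequential fold stands in for the parallel reduction; the left-to-right grouping is exactly Horner's rule):

```python
from functools import reduce

def otimes(a, b):
    """(p, y) ⊗ (q, z) = (p*z + q, y*z): p is the value of the segment's
    polynomial at x, y is x raised to the segment's length."""
    p, y = a
    q, z = b
    return (p * z + q, y * z)

def eval_poly(coeffs, x):
    # Coefficients a_0 .. a_(n-1) of a_0*x^(n-1) + ... + a_(n-1).
    # Each a_i is replaced by (a_i, x) before reducing.
    return reduce(otimes, [(a, x) for a in coeffs])[0]

print(eval_poly([2, 0, 1, 3], 5))   # 2*125 + 0*25 + 1*5 + 3 = 258
```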


ANNOUNCEMENTS:
NO CLASS ON 13TH AUGUST.
QUIZ ON 14TH AUGUST, 2015 AT 4:30 PM
-- SYLLABUS (TILL THIS POINT)
IF YOU AGREE, WE CAN SWAP THE ABOVE
TOO!
