Chapter 02

PRAM
A Model of Serial Computation (RAM)
Time & Space Complexities
The PRAM Model of Parallel Computation
Various Models of PRAM
PRAM Algorithms
Parallel Reduction
Parallel Prefix Sums
Parallel Merge Sort
Reducing the Number of Processors
Brent’s Theorem
Complexity Theory & PRAM
PRAM

PRAM stands for Parallel Random Access Machine.
It is unrealistically simple; it ignores the complexity of interprocessor communication. Why?
Because it allows parallel-algorithm designers to treat processing power as an unlimited resource.
RAM

The RAM is a theoretical model of serial computation: a one-address computer.
It consists of:
A memory
A read-only input tape
A write-only output tape
A program
[Figure: RAM organization. A read-only input tape (x1 x2 … xn), a program with a location counter, a memory consisting of the accumulator r0 and registers r1, r2, r3, …, and a write-only output tape (y1 y2 … yn).]
Time & Space Complexities

Time complexity: the maximum time taken by the program over all inputs of size n.
Space complexity: the maximum space used by the program over all inputs of size n.
Two criteria for measuring time & space:
Uniform cost: every operation and every memory cell counts as one unit.
Logarithmic cost: the cost of an operation or cell grows with the number of bits involved.
The PRAM Model of Parallel Computation

The PRAM consists of:
A control unit
A global memory
An unbounded set of processors, each with its own private memory and a unique index, which is needed to enable and disable a processor or to influence which memory location it accesses.
[Figure: PRAM organization. A control unit; processors P1, P2, …, Pp, each with its own private memory; an interconnection network; and a global memory.]
How a PRAM Works

A PRAM computation begins with:
the input stored in global memory, and
a single active processing element.
The computation terminates when the last processor halts.
…

During each step of the computation an active processor may:
Read a value from a single private (local) or global memory location.
Write a value to a single private (local) or global memory location.
Perform a single RAM operation.
During a computation step, a processor may also activate another processor.
Cost

The cost of a PRAM computation is the product of the parallel time complexity and the number of processors used.
A PRAM algorithm with time complexity Θ(log p) using p processors has cost Θ(p log p).
Various Models of PRAM

EREW (Exclusive Read, Exclusive Write)
CREW (Concurrent Read, Exclusive Write), which is the default model.
CRCW (Concurrent Read, Concurrent Write)
…

The following policies handle concurrent writes:
Common
all processors concurrently writing into the same global address must write the same value.
Arbitrary
if multiple processors write to the same address, one of them is chosen arbitrarily and its value is stored.
Priority
if multiple processors write to the same address, the processor with the lowest index succeeds (highest priority).
Compare these policies: which is the weakest, which is the strongest, and why? (A small sketch simulating the three policies follows below.)
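Below is a minimal Python sketch (not from the slides) of how a single CRCW memory cell could resolve simultaneous writes under the Common, Arbitrary, and Priority policies. The function name resolve_concurrent_writes and the (processor index, value) representation are assumptions made for illustration.

# Hypothetical illustration: resolving concurrent writes to one global address.
# Each write is a (processor_index, value) pair arriving in the same PRAM step.
import random

def resolve_concurrent_writes(writes, policy):
    """Return the value stored after one CRCW step, or raise on an illegal step."""
    if not writes:
        return None
    if policy == "common":
        # All processors must write the same value, otherwise the step is illegal.
        values = {v for _, v in writes}
        if len(values) != 1:
            raise ValueError("Common CRCW: processors wrote different values")
        return values.pop()
    if policy == "arbitrary":
        # Any one of the competing writes may succeed.
        return random.choice(writes)[1]
    if policy == "priority":
        # The processor with the lowest index wins.
        return min(writes, key=lambda w: w[0])[1]
    raise ValueError("unknown policy: " + policy)

writes = [(3, 7), (0, 5), (2, 7)]
print(resolve_concurrent_writes(writes, "priority"))   # 5 (processor 0 wins)
print(resolve_concurrent_writes(writes, "arbitrary"))  # 5 or 7, chosen arbitrarily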
PRAM Algorithms

PRAM algorithms have two phases:
a) A sufficient number of processors is activated.
b) The activated processors perform the computation in parallel.
…

To activate p processors, we need ⌈log p⌉ steps, since each already-active processor can activate one more processor per step, doubling the number of active processors.
The meta-instruction
spawn (<processor names>)
is used to activate processors.
To denote a code segment to be executed in parallel by the specified processors, we use a parallel construct.
…

The general format is:
for all <processor list> do
  <statement list>
end for
We can also use the other usual control structures, such as:
if-then-else-endif
for-endfor
while-endwhile
repeat-until
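As a rough illustration (not from the slides), the spawn / for all construct can be thought of as a synchronous loop over processor indices: in each step every active processor executes the same statement, and all writes take effect together. The Python sketch below simulates this on one machine; the helper name for_all is an assumption made for illustration.

# Hypothetical sketch: simulating one synchronous "for all p_i ... do" PRAM step.
# All processors read the old state, then all writes are applied together,
# which mimics the lock-step execution assumed by the PRAM model.

def for_all(processor_indices, step, memory):
    """Run one parallel step: every processor computes its writes, then all writes commit."""
    writes = []
    for i in processor_indices:          # conceptually simultaneous
        writes.extend(step(i, memory))   # step returns a list of (address, value) pairs
    for address, value in writes:        # commit phase
        memory[address] = value
    return memory

# Example: every processor p_i doubles A[i] in parallel.
A = [4, 3, 8, 2]
for_all(range(len(A)), lambda i, mem: [(i, 2 * mem[i])], A)
print(A)  # [8, 6, 16, 4]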
Parallel Reduction

Binary trees are important for designing parallel algorithms:
Starting from the root (fan-out):
broadcasting
divide and conquer
Starting from the leaves (fan-in):
reduction
…

Given a set of n values a1, a2, …, an and an associative binary operation ⊕, reduction is the process of computing
a1 ⊕ a2 ⊕ a3 ⊕ … ⊕ an
Ex:
Array A (indices 0 … 7): 4 3 8 2 9 1 0 5
j = 0 (P0 … P3): partial sums 7, 10, 10, 5 stored in A[0], A[2], A[4], A[6]
j = 1 (P0, P2): 17 stored in A[0], 15 stored in A[4]
j = 2 (P0): 32 stored in A[0]
Algorithm

SUM (EREW PRAM)
Initial condition: list of n ≥ 1 elements stored in A[0 … (n−1)]
Final condition: sum of the elements stored in A[0]
Global variables: n, A[0 … (n−1)], j
…

begin
  spawn (p0, p1, p2, …, p⌊n/2⌋−1)
  for all pi where 0 ≤ i ≤ ⌊n/2⌋ − 1 do
    for j ← 0 to ⌈log n⌉ − 1 do
      if i mod 2^j = 0 and 2i + 2^j < n then
        A[2i] ← A[2i] + A[2i + 2^j]
      end if
    end for
  end for
end
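A minimal single-machine simulation of the SUM algorithm above (a Python illustration, not part of the slides): the loop over i plays the role of the ⌊n/2⌋ spawned processors, and each value of j corresponds to one synchronous PRAM step.

import math

def pram_sum(A):
    """Simulate the EREW SUM reduction; on return A[0] holds the total."""
    n = len(A)
    if n <= 1:
        return A[0] if A else 0
    for j in range(math.ceil(math.log2(n))):         # one iteration per PRAM step
        for i in range(n // 2):                      # processors p_0 ... p_{n/2-1} in parallel
            if i % (2 ** j) == 0 and 2 * i + 2 ** j < n:
                A[2 * i] += A[2 * i + 2 ** j]        # each step touches disjoint cells (EREW)
    return A[0]

A = [4, 3, 8, 2, 9, 1, 0, 5]
print(pram_sum(A))  # 32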
Tracing

When j = 0
0 mod 2^0 = 0  a[0] ← a[0] + a[1]
1 mod 2^0 = 0  a[2] ← a[2] + a[3]
2 mod 2^0 = 0  a[4] ← a[4] + a[5]
3 mod 2^0 = 0  a[6] ← a[6] + a[7]
When j = 1
0 mod 2^1 = 0  a[0] ← a[0] + a[2]
1 mod 2^1 = 1  (no operation)
2 mod 2^1 = 0  a[4] ← a[4] + a[6]
3 mod 2^1 = 1  (no operation)
…

When j = 2
0 mod 2^2 = 0  a[0] ← a[0] + a[4]
1 mod 2^2 = 1  (no operation)
2 mod 2^2 = 2  (no operation)
3 mod 2^2 = 3  (no operation)
Analysis

Time complexity: Θ(log n), given ⌊n/2⌋ processors.
Spawning the ⌊n/2⌋ processors takes ⌈log ⌊n/2⌋⌉ steps.
Cost: Θ(log n) · ⌊n/2⌋ = Θ(n log n).
Prefix Sums

Given a set of n values a1, a2, …, an and an associative binary operation ⊕, the prefix sums problem is to compute the n quantities:
a1
a1 ⊕ a2
a1 ⊕ a2 ⊕ a3
…
a1 ⊕ a2 ⊕ a3 ⊕ … ⊕ an
Analysis

Time complexity: Θ(log n), given n − 1 processors.
Spawning the n − 1 processors takes ⌈log (n − 1)⌉ steps.
Cost: Θ(log n) · (n − 1) = Θ(n log n).
Algorithm

PREFIX-SUMS (CREW PRAM)
Initial condition: list of n ≥ 1 elements stored in A[0 … (n−1)]
Final condition: each element A[i] contains A[0] ⊕ A[1] ⊕ … ⊕ A[i]
Global variables: n, A[0 … (n−1)], j
…

begin
  spawn (p1, p2, …, pn−1)
  for all pi where 1 ≤ i ≤ n − 1 do
    for j ← 0 to ⌈log n⌉ − 1 do
      if i − 2^j ≥ 0 then
        A[i] ← A[i] + A[i − 2^j]
      end if
    end for
  end for
end
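A similar single-machine simulation of PREFIX-SUMS (a Python illustration, not part of the slides). Because a processor may read a cell that another processor writes during the same step, the sketch snapshots the array before each step, mirroring the PRAM convention that all reads in a step happen before any write.

import math

def pram_prefix_sums(A):
    """Simulate the CREW PREFIX-SUMS algorithm; A[i] ends up holding A[0]+...+A[i]."""
    n = len(A)
    if n < 2:
        return A
    for j in range(math.ceil(math.log2(n))):   # one iteration per PRAM step
        old = A[:]                             # reads in a step see the values from
                                               # before any of the step's writes
        for i in range(1, n):                  # processors p_1 ... p_{n-1} in parallel
            if i - 2 ** j >= 0:
                A[i] = old[i] + old[i - 2 ** j]
    return A

print(pram_prefix_sums([4, 3, 8, 2, 9, 1, 0, 5]))
# [4, 7, 15, 17, 26, 27, 27, 32]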
Ex

Array A (indices 0 … 7): 4 3 8 2 9 1 0 5
j = 0 (p1 … p7): 4 7 11 10 11 10 1 5
j = 1 (p2 … p7): 4 7 15 17 22 20 12 15
j = 2 (p4 … p7): 4 7 15 17 26 27 27 32
Tracing

When j = 0
1 − 2^0 ≥ 0  a[1] ← a[1] + a[0]
2 − 2^0 ≥ 0  a[2] ← a[2] + a[1]
3 − 2^0 ≥ 0  a[3] ← a[3] + a[2]
4 − 2^0 ≥ 0  a[4] ← a[4] + a[3]
5 − 2^0 ≥ 0  a[5] ← a[5] + a[4]
6 − 2^0 ≥ 0  a[6] ← a[6] + a[5]
7 − 2^0 ≥ 0  a[7] ← a[7] + a[6]
…

When j = 1
1 − 2^1 < 0  (no operation)
2 − 2^1 ≥ 0  a[2] ← a[2] + a[0]
3 − 2^1 ≥ 0  a[3] ← a[3] + a[1]
4 − 2^1 ≥ 0  a[4] ← a[4] + a[2]
5 − 2^1 ≥ 0  a[5] ← a[5] + a[3]
6 − 2^1 ≥ 0  a[6] ← a[6] + a[4]
7 − 2^1 ≥ 0  a[7] ← a[7] + a[5]
…

When j = 2
1 − 2^2 < 0  (no operation)
2 − 2^2 < 0  (no operation)
3 − 2^2 < 0  (no operation)
4 − 2^2 ≥ 0  a[4] ← a[4] + a[0]
5 − 2^2 ≥ 0  a[5] ← a[5] + a[1]
6 − 2^2 ≥ 0  a[6] ← a[6] + a[2]
7 − 2^2 ≥ 0  a[7] ← a[7] + a[3]
Merging Two Sorted Lists

An optimal RAM algorithm creates the merged list one element at a time. It requires at most n − 1 comparisons to merge two sorted lists of n/2 elements each.
The complexity of the RAM algorithm is Θ(n).
…

Each processor performs a binary search on the half of the array opposite to its own element, and finds the final position of its element by adding its own index to the result of the search.
For example, P4 is in the lower half and is associated with element 9; it searches the upper half and finds that three of its elements precede 9 (9 belongs after index 3 of that half). Therefore the final position is 4 + 3 = 7.
Algorithm

MERGE-LISTS (CREW PRAM)
Initial condition: two sorted lists of n/2 elements each, stored in A[1] … A[n/2] and A[(n/2)+1] … A[n]
Final condition: merged list in locations A[1] … A[n]
Global variables: A[1 … n]
Local variables: x, low, high, index
…

begin
  spawn (p1, p2, …, pn)
  for all pi where 1 ≤ i ≤ n do
    if i ≤ n/2 then
      low ← (n/2) + 1
      high ← n
    else
      low ← 1
      high ← n/2
    end if
…
X a[i]
repeat
index ⌊(low + high)/2⌋
If x < A[index] then
high index –1
else
low index +1
endif
until low > high
A[high+i–n/2] x
endfor
end
37
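A single-machine simulation of MERGE-LISTS (a Python illustration, not part of the slides), using 0-based indexing instead of the 1-based A[1 … n] of the pseudocode, and writing to a separate output list so the sequential simulation cannot overwrite values it still needs to read. It assumes all elements are distinct, which the exclusive-write version of the algorithm requires.

def pram_merge(A):
    """Simulate MERGE-LISTS on a list of even length whose two halves are sorted.

    Assumes all elements are distinct (needed for exclusive writes).
    Returns the fully merged list.
    """
    n = len(A)
    half = n // 2
    result = [None] * n
    for i in range(n):                       # processors p_1 ... p_n in parallel
        x = A[i]
        if i < half:                         # element from the lower half:
            low, high = half, n - 1          # binary-search the upper half
        else:                                # element from the upper half:
            low, high = 0, half - 1          # binary-search the lower half
        while low <= high:
            index = (low + high) // 2
            if x < A[index]:
                high = index - 1
            else:
                low = index + 1
        # 'high' now points to the last element of the other half that precedes x.
        if i < half:
            result[i + (high - half + 1)] = x    # local rank + upper elements before x
        else:
            result[(i - half) + (high + 1)] = x  # local rank + lower elements before x
    return result

A = [1, 5, 7, 9, 13, 17, 19, 23, 2, 4, 8, 11, 12, 21, 22, 24]
print(pram_merge(A))
# [1, 2, 4, 5, 7, 8, 9, 11, 12, 13, 17, 19, 21, 22, 23, 24]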
Ex

Lower half (p1 … p8): 1 5 7 9 13 17 19 23
Upper half (p9 … p16): 2 4 8 11 12 21 22 24
Merged result A[1] … A[16]: 1 2 4 5 7 8 9 11 12 13 17 19 21 22 23 24
Reducing the Number of Processors

A cost-optimal parallel algorithm is an algorithm whose cost is in the same complexity class as an optimal sequential algorithm.
For example, parallel reduction is not cost optimal:
PRAM cost = time complexity × number of processors
= Θ(log n) × Θ(n)
= Θ(n log n),
which is greater than the Θ(n) complexity of the optimal sequential algorithm.
…

A cost-optimal parallel algorithm exists if the total number of operations performed by the parallel algorithm is in the same complexity class as that of an optimal sequential algorithm.
For example, the number of operations performed by parallel reduction is
n/2 + n/4 + n/8 + … + 1 = n − 1 ∈ Θ(n),
while the number of operations performed sequentially is
n − 1 ∈ Θ(n).
…

Since both the parallel algorithm and the sequential algorithm perform n − 1 operations, a cost-optimal variant exists.
The number of processors required to perform the n − 1 operations in Θ(log n) time is
p = ⌈(n − 1) / log n⌉ ≈ n / log n
Ex: n = 16, p = 16/4 = 4
[Figure: the 16 values are split into 4 blocks of 4. In steps 1-3 each processor sequentially adds the values of its own block; in steps 4-5 the 4 partial sums are combined by parallel reduction into the final total.]
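The Python sketch below (an illustration, not part of the slides) shows this cost-optimal strategy: with p ≈ n/log n simulated processors, each processor first adds its own block of about log n values sequentially, and the p partial sums are then combined with the parallel reduction used earlier.

import math

def cost_optimal_sum(values):
    """Sum n values using roughly n/log n simulated processors.

    Phase 1: each processor sequentially adds its own block of about log n values.
    Phase 2: the partial sums are combined by the earlier parallel reduction.
    """
    n = len(values)
    if n <= 2:
        return sum(values)
    block = math.ceil(math.log2(n))                     # ~log n elements per processor
    p = math.ceil(n / block)                            # ~n / log n processors
    partial = [sum(values[k * block:(k + 1) * block])   # phase 1: one block per processor,
               for k in range(p)]                       # conceptually done in parallel
    for j in range(math.ceil(math.log2(p))):            # phase 2: ceil(log p) parallel steps
        for i in range(p // 2):                         # processors p_0 ... p_{p/2-1}
            if i % (2 ** j) == 0 and 2 * i + 2 ** j < p:
                partial[2 * i] += partial[2 * i + 2 ** j]
    return partial[0]

print(cost_optimal_sum(list(range(16))))  # 120, in Theta(log n) parallel steps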
Brent’s Theorem

Given A, a parallel algorithm with computation time t: if parallel algorithm A performs m computational operations, then p processors can execute algorithm A in time
t + (m − t)/p
Ex: Parallel Reduction

For parallel reduction, m = n − 1 operations, t = log n, and p = n/log n processors, so
t + (m − t)/p = log n + ((n − 1) − log n) · (log n / n)
             = log n + ((n − 1) log n)/n − (log² n)/n
             ≤ 2 log n
             = Θ(log n)
Complexity Theory & PRAM

• P: the class of problems solvable by a deterministic algorithm in polynomial time.
• NP: the class of problems solvable by a nondeterministic algorithm in polynomial time.
• P-complete: a problem L ∈ P is P-complete if every other problem in P can be transformed to L by a log-space reduction.
• NP-complete: a problem L ∈ NP is NP-complete if every other problem in NP can be transformed to L in polynomial time.
…

• NC: the class of problems solvable on a PRAM in polylogarithmic time using a polynomial number of processors.
[Figure: containment of the classes. NC ⊆ P ⊆ NP, with the P-complete problems drawn inside P and the NP-complete problems inside NP.]