Notes 04
1 Overview
In the last lecture we covered the Circuit Model, defined runtime and work, and proved Brent's Theorem.
In this lecture we will elaborate on runtime, work, and cost and define them without a dependence on the number of processors, p. We will also define parallelism and work efficiency.
We want to define runtime, work, and cost of an algorithm independently of the number of proces-
sors, p, that will be used.
We’ll use the following summation algorithm to illustrate the next few points:
sum(A[1 ... n]) {
    if n == 1
        return A[1]
    else
        in parallel do {
            l = sum(A[1 ... n/2])
            r = sum(A[n/2 + 1 ... n])
        }
        return l + r
}
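As an illustration, here is a minimal sketch of this recursive parallel sum in Go; the function name, the use of goroutines for the fork/join, and the assumption that n ≥ 1 are our own choices, not part of the lecture.

package main

import (
    "fmt"
    "sync"
)

// sum splits the slice in half and sums the two halves in parallel,
// mirroring the pseudocode above (base case: a single element; assumes n >= 1).
func sum(a []int) int {
    if len(a) == 1 {
        return a[0]
    }
    mid := len(a) / 2
    var left, right int
    var wg sync.WaitGroup
    wg.Add(1)
    go func() {
        defer wg.Done()
        left = sum(a[:mid]) // sum(A[1 ... n/2]) in parallel
    }()
    right = sum(a[mid:]) // sum(A[n/2 + 1 ... n]) on the current goroutine
    wg.Wait()
    return left + right
}

func main() {
    a := []int{3, 1, 4, 1, 5, 9, 2, 6}
    fmt.Println(sum(a)) // 31
}

The goroutine per split plays the role of the "in parallel do" block; the WaitGroup is the join before returning l + r.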
Let’s say that the number of processors, p, is equal to the number of elements, n, being summed.
Now, we’ll calculate the runtime, work and cost for this algorithm. Figure 1 is a depiction of the
work done on each level of the recursive Sum algorithm.
Similarly to solving a recurrence, we can define the runtime as follows:

t_{p=n}(n) = t_{p/2}(n/2) + Θ(1)   if n > 1
           = Θ(1)                  otherwise
           = Θ(log n)
By looking at Figure 1, we can see that the equation given above for runtime is true. We know that constant work is done at each node. The array is split in half on each level, which gives us the t_{p/2}(n/2) term when n > 1. We only need to follow one path down the tree since we assume that the nodes on each level are done in parallel. When n = 1, we hit the base case and constant work is done at each node.
It may seem odd that the number of processors has been halved in this equation:

t_{p=n}(n) → t_{p/2}(n/2) + Θ(1)
We can see that this works out because of our assumption that p = n. Writing the runtime as t(n, p):

t(n, p) = t(n/2, p/2) + Θ(1)
t(n, n) = t(n/2, n/2) + Θ(1)

Let g(n) = t(n, n); then g(n) = g(n/2) + Θ(1), which can then be solved:

g(n) = Θ(log n)
t(n, n) = Θ(log n)
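As a quick sanity check (our own unrolling, not from the lecture), assume n is a power of two and write the constant hidden in the Θ(1) term as c:

\[
g(n) = g(n/2) + c = g(n/4) + 2c = \cdots = g(1) + c \log_2 n = \Theta(\log n).
\]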
We can now determine the work done in the algorithm:

w(n) = 2w(n/2) + Θ(1)   if n > 1
     = Θ(1)             otherwise
     = Θ(n)
Looking at the tree in Figure 1, the right column shows the amount of work done on each level. We can see that there are log n + 1 levels and each level has n/2^i work, where i is equal to the level. We see that the work could then be described as

Σ_{i=0}^{log n} n/2^i = 2n − 1 ≤ 2n
Thus, we get the idea that work = Θ(n), which can be proved using substitution or the Master
method. We can see that work is the same as it would be for the sequential sum algorithm.
We previously defined cost to be p · t_p(n). Thus,

cost = n · Θ(log n) = Θ(n log n)
We can see that cost is not equal to work. In many cases cost and work are the same, but not always, as we have seen here. Essentially, work is a measure of the total number of operations used in the algorithm, i.e., the number of operations that one processor would need to perform when p = 1. Cost, on the other hand, depends on the number of processors being used. When p is large (as here, with p = n), some of the processors are not utilized on most of the levels, so cost is greater than the actual number of operations needed by the algorithm. Regardless of p, at least w(n) operations must be performed, so the following must hold: cost(n) ≥ w(n).
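To make the gap between cost and work concrete, here is a hypothetical instance (our own numbers, charging one unit of time per tree level): take n = p = 1024. Then

\[
w(n) = 2n - 1 = 2047, \qquad t_p(n) \approx \log_2 n + 1 = 11, \qquad
\mathrm{cost} = p \cdot t_p(n) \approx 1024 \cdot 11 = 11264 \gg w(n).
\]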
We can reduce the cost by modifying the algorithm so that the recursion stops once a subarray has at most log N elements, where N is the original input size; such a subarray is then summed sequentially by a single processor:

sum(A[1 ... n], N) {
    if n ≤ log N {
        sum = 0
        for i = 1 to n { sum = sum + A[i] }
        return sum
    }
    else
        in parallel do {
            l = sum(A[1 ... n/2], N)
            r = sum(A[n/2 + 1 ... n], N)
        }
        return l + r
}
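A corresponding Go sketch of this variant (our own code, with hypothetical names; the cutoff log N is computed from the original input size n, which plays the role of the extra parameter N in the pseudocode):

package main

import (
    "fmt"
    "math"
    "sync"
)

// sumCutoff mirrors the modified pseudocode: once a subarray has at most
// log2(N) elements, one goroutine sums it sequentially; above the cutoff it
// recurses on the two halves in parallel.
func sumCutoff(a []int, n int) int {
    cutoff := int(math.Log2(float64(n)))
    if cutoff < 1 {
        cutoff = 1
    }
    if len(a) <= cutoff {
        s := 0
        for _, v := range a { // sequential base case: Θ(log N) work per leaf
            s += v
        }
        return s
    }
    mid := len(a) / 2
    var left, right int
    var wg sync.WaitGroup
    wg.Add(1)
    go func() {
        defer wg.Done()
        left = sumCutoff(a[:mid], n)
    }()
    right = sumCutoff(a[mid:], n)
    wg.Wait()
    return left + right
}

func main() {
    a := make([]int, 1024)
    for i := range a {
        a[i] = 1
    }
    fmt.Println(sumCutoff(a, len(a))) // 1024
}

The sequential loop at the base case is exactly the Θ(log N) leaf work discussed below.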
We can see in Figure 2, below, that the tree for this algorithm is slightly different from the earlier one.
We will now determine the runtime of this algorithm:

t_p(n) = t_{p/2}(n/2) + Θ(1)   if n > log N
       = Θ(n)                  otherwise
       = Θ(log N)
We see that the depth of the tree above the leaves is now log(N/log N). We obtain this by taking the depth of the original tree, log N, and subtracting the depth of the part below the level where n = log N:

log(N/log N) = log N − log log N = log p
Each level above the leaves requires constant time because there are p processors that work in parallel. At the last level, each leaf has log N elements. These elements are summed by one processor, which takes linear time, so each leaf takes Θ(log N) time to process. By this reasoning, we get:

t_p(n) = Θ(log N − log log N + log N)
       = Θ(log N)
In order to achieve this runtime, we need a processor for each leaf, so

p = 2^{log(N/log N)} = N/log N
The work of this algorithm satisfies essentially the same recurrence as before, w(n) = 2w(n/2) + Θ(1), now with a Θ(log N)-work base case at each leaf, so

w(n) = Θ(n)

We can use the Master Method to verify that this is true: compare Θ(1) to n^{log_2 2} = n. Since Θ(1) = O(n^{1−ε}), by Case 1,

w(n) = Θ(n)
Using the equation stated earlier, we can determine the cost:

cost(N) = t_p(N) · p
        = Θ(log N) · N/log N
        = Θ(N)
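As a hypothetical numeric check (our own numbers, charging one unit per parallel step): take N = 2^16 = 65536, so log N = 16 and log log N = 4. Then

\[
p = \frac{N}{\log N} = 4096, \qquad \log\frac{N}{\log N} = \log N - \log\log N = 12,
\]
\[
t_p \approx 12 + 16 = 28 = \Theta(\log N), \qquad
\mathrm{cost} = p \cdot t_p \approx 4096 \cdot 28 = 114688 = \Theta(N).
\]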
If we didn't know about Brent's Theorem, then we would instead design this algorithm so that the base case has size N/p. We can see the tree for this algorithm in Figure 3. Its runtime is

t_p(n) = t_{p/2}(n/2) + Θ(1)   if n > N/p
       = Θ(n)                  otherwise
       = Θ(log p + N/p)
We would want cost to equal work. We know that cost is equal to Θ(p log p + N ), and work
is equal to Θ(N ). In order to make cost equal to Θ(N ), we need the term p log p to be dominated
by N. If we choose p ≤ N/log N, then we can see that p log p < N; substituting p = N/log N:

(N/log N) · log(N/log N) < N
(N/log N) · (log N − log log N) < N
N · (1 − (log log N)/log N) < N
N − N(log log N)/log N < N

For N > 2 we have log log N > 0, so the term N(log log N)/log N is positive and the inequality holds. So, we say that p ≤ N/log N.
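Checking the inequality on the same hypothetical instance N = 2^16 = 65536:

\[
p = \frac{N}{\log N} = 4096, \qquad p \log p = 4096 \cdot 12 = 49152 < 65536 = N.
\]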
In our summation example earlier, w(N) = Θ(N) and t∞(N) = Θ(log N), since the sum cannot be completed in less time than the critical path.
Theorem 4. If p ≤ w(n)/t∞(n), then cost = Θ(w(n)).

Proof. t_p(n) = w(n)/p + t∞(n) (by Brent's Theorem; note that t∞(n) = T(n) for circuits).

cost = p · t_p(n)
     = p(w(n)/p + t∞(n))
     = w(n) + p · t∞(n)
     ≤ w(n) + w(n)        (since p ≤ w(n)/t∞(n))
     = 2w(n)

Combined with cost(n) ≥ w(n), this gives cost = Θ(w(n)).
Thus, if we create an algorithm so that it has maximum parallelism and use no more than w(n)/t∞(n) processors, then cost will be equal to work.
We achieve this by keeping the algorithm work efficient while making the critical path, t∞(n), as short as possible. Then

t_p(n) = w(n)/p + t∞(n),

and minimizing t∞(n) maximizes w(n)/t∞(n), which lets us use a larger p; a larger p makes the w(n)/p term, and hence the runtime, smaller. Since t∞(n) cannot be greater than w(n)/p as long as p ≤ w(n)/t∞(n), we need to minimize t∞(n).
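Using the usual definition of parallelism as the ratio w(n)/t∞(n) (the same quantity that bounds p in Theorem 4), the summation example has

\[
\mathrm{parallelism} = \frac{w(n)}{t_\infty(n)} = \frac{\Theta(n)}{\Theta(\log n)} = \Theta\!\left(\frac{n}{\log n}\right),
\]

so up to Θ(n/log n) processors can be used productively while keeping cost = Θ(w(n)).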
Theorem 5. An algorithm A that runs in time T_A(n) on a p-processor CRCW PRAM can be implemented on a p-processor EREW PRAM in time t′_p = Θ(T_A(n) · log p).