26 Parallel Algorithms

Uploaded by

abhijeet.pundkar2021

Introduction to Parallel Algorithms

Changing our assumptions
• So far most or all of your study of computer science has assumed
that only one thing happens at a time in a given program.
 sequential programming: Each statement executes in sequence.

• Removing this assumption creates challenges and opportunities:

 Programming: How can we divide work among threads of execution
and coordinate (synchronize) among them?
 Algorithms: How can performing activities in parallel speed up a program?
• (more throughput: work done per unit time)
 Data structures: May need to support concurrent access (multiple
threads operating on data at the same time).

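Brief architecture history
• CPU: Central Processing Unit. The brain of a computer.
 From ~1980-2005, CPU clock speed (MHz, later GHz) got exponentially faster.
 Roughly doubled every 1.5 years (popularly called "Moore's Law", which strictly refers to transistor counts doubling about every two years).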

• But we are reaching limits of classic CPU design.

 Increasing speeds further generates too much heat.
 Pushing a single CPU much beyond ~3-4 GHz makes it overheat or become
unstable under normal cooling.

• Current work-around: Use multiple processors.

 Or, more recently, produce one CPU containing many processors in it.
 core: A processor-within-a-processor.
• A "multi-core" processor is one with several cores inside.

Using many cores
• What can you do with multiple CPUs (or cores)?
 Run multiple different programs at the same time (processes).
• Example: Core 1 runs Firefox; Core 2 runs iTunes; Core 3 runs Eclipse...
• Technically, programs receive "time slices" of attention from cores.
• Your OS (Windows, OSX, Linux) already does this for you.

• Do multiple things at once within the same program (threads).

 This will be our focus. More difficult; must be done manually.
 Requires rethinking everything about our algorithms, from how to
implement data-structure operations, to Big-Oh, to ...

• Writing correct/fast parallel code is much harder than sequential.

 Especially in common languages like Java and C.

Shared memory model
• Each thread has its own unshared call stack and local variables.
 Some objects are shared between multiple threads:
any object declared at a global scope (e.g. a static field) or passed from one thread to another.
• Separate processes do not share memory with each other.

[Figure: Thread1, Thread2, and Thread3 each hold unshared state (locals, call stack); all of them access the shared state (global objects, static fields).]
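
The split between unshared and shared state can be illustrated with a short sketch. The class name `SharedMemoryDemo`, the method `run`, and the field `sharedTotal` are hypothetical names chosen for this example, not part of the slides. Each thread counts in a local variable on its own stack and then publishes the result once to a static field that every thread can see.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; all names here are illustrative assumptions.
class SharedMemoryDemo {
    // Shared: a static field is visible to every thread in the program.
    static final AtomicInteger sharedTotal = new AtomicInteger();

    // Starts threadCount threads; each counts privately, then publishes once.
    static int run(int threadCount, int perThread) {
        sharedTotal.set(0);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                int local = 0;                 // unshared: lives on this thread's own stack
                for (int k = 0; k < perThread; k++) {
                    local++;
                }
                sharedTotal.addAndGet(local);  // shared: atomic update seen by all threads
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();                      // wait so the final total is complete
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", ie);
            }
        }
        return sharedTotal.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));      // 4 threads x 1000 increments each
    }
}
```

An `AtomicInteger` is used for the shared field because a plain `int` updated by several threads at once could lose increments; the local counters need no such protection because no other thread can reach them.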

Parallel vs. concurrent
• parallel: Using multiple processing resources (CPUs, cores) at once to solve a problem faster.
 Example: A sorting algorithm that has several threads each sort part of the array.
[Figure: work split across several CPUs/cores]

• concurrent: Multiple execution flows (e.g. threads) accessing a shared resource at the same time.
 Example: Many threads trying to make changes to the same data structure (a global list, map, etc.).
[Figure: several CPUs/cores accessing one resource]

• Many programmers confuse these two concepts.

 Threads are often used to implement both.
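
As a small illustration of the concurrent case, the sketch here has several threads update one shared map. The class `ConcurrentAccessDemo` and the method `countVotes` are hypothetical names invented for this example. It relies on `ConcurrentHashMap.merge`, which performs each read-modify-write atomically, so no update is lost.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; names are assumptions made for illustration.
class ConcurrentAccessDemo {
    // Every thread writes to the same map: a shared resource accessed concurrently.
    static Map<String, Integer> countVotes(int threadCount, int votesPerThread) {
        Map<String, Integer> tally = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            String key = (i % 2 == 0) ? "even" : "odd";   // threads alternate between two keys
            threads[i] = new Thread(() -> {
                for (int k = 0; k < votesPerThread; k++) {
                    tally.merge(key, 1, Integer::sum);    // atomic increment of the shared entry
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", ie);
            }
        }
        return tally;
    }
}
```

With a plain `HashMap` the same loop could corrupt the map or drop updates, which is the kind of problem concurrent data structures exist to prevent.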

Algorithm example
• Write a method named sum that computes the total sum of all
elements in an array of integers.
 For now, just write a normal solution that doesn't use parallelism.

index 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
value 22 18 12 -4 27 30 36 50 7 68 91 56 2 85 42 98

// normal sequential solution
public static int sum(int[] a) {
    int total = 0;
    for (int i = 0; i < a.length; i++) {
        total += a[i];
    }
    return total;
}

Parallelizing the algorithm
• Write a method named sum that computes the total sum of all
elements in an array of integers.
 How can we parallelize this algorithm if we have 2 CPUs/cores?

index 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
value 22 18 12 -4 27 30 36 50 7 68 91 56 2 85 42 98

sum1 = 22+18+12+-4+27+30+36+50 = 191 sum2 = 7+68+91+56+2+85+42+98 = 449

sum = sum1 + sum2 = 640

 Compute sum of each half of array in a thread.

 Add the two sums together.

Initial steps
• First, write a method that sums a partial range of the array:

// normal sequential solution
// sums a[min] through a[max - 1] (min inclusive, max exclusive)
public static int sumRange(int[] a, int min, int max) {
    int total = 0;
    for (int i = min; i < max; i++) {
        total += a[i];
    }
    return total;
}

Runnable partial sum
• Now write a runnable class that can sum a partial array:

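public class Summer implements Runnable {
    private int[] a;
    private int min, max, sum;

    public Summer(int[] a, int min, int max) {
        this.a = a;
        this.min = min;
        this.max = Math.min(max, a.length);   // clamp so the last chunk cannot overrun the array
    }

    public int getSum() {
        return sum;   // only meaningful after the thread running this Summer has been joined
    }

    public void run() {
        // sumRange is the static method from the previous slide, called through its class, Sorting
        sum = Sorting.sumRange(a, min, max);
    }
}

Sum method w/ threads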
• Now modify the overall sum method to run Summers in threads:

// Parallel version (two threads)
public static int sum(int[] a) {
    Summer firstHalf = new Summer(a, 0, a.length / 2);
    Summer secondHalf = new Summer(a, a.length / 2, a.length);
    Thread thread1 = new Thread(firstHalf);
    thread1.start();
    Thread thread2 = new Thread(secondHalf);
    thread2.start();
    try {
        thread1.join();   // wait for both halves before reading their sums
        thread2.join();
    } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();   // restore the interrupt flag instead of swallowing it
    }
    return firstHalf.getSum() + secondHalf.getSum();
}

More than 2 threads
public static int sum(int[] a) {   // many threads version
    int threadCount = 5;   // what number is best?
    int len = (int) Math.ceil(1.0 * a.length / threadCount);   // chunk size, rounded up
    Summer[] summers = new Summer[threadCount];
    Thread[] threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++) {
        summers[i] = new Summer(a, i * len, (i + 1) * len);   // Summer clamps max to a.length
        threads[i] = new Thread(summers[i]);
        threads[i].start();
    }
    try {
        for (Thread t : threads) {
            t.join();
        }
    } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();   // restore the interrupt flag
    }

    int total = 0;
    for (Summer summer : summers) {
        total += summer.getSum();
    }
    return total;
}

How many threads to use?
• You can find out how many cores/CPUs your machine has:
 int cores = Runtime.getRuntime().availableProcessors();

• You'd think that would be the ideal number of threads.

 Sometimes yes, sometimes no.
 Your program does not always get all of the cores to use.

• Too few threads can be bad (core(s) sit idle).

• Too many threads can be bad (overhead of creating Threads).
 A bad ratio can slow the algorithm: e.g. 8 threads for 6 cores.
 If threads are lightweight to create, making tons of threads can be very
effective (e.g. make 1000 threads, set them all loose!).
• Java's traditional (platform) Threads are too heavy-weight for this to be practical (virtual threads, added in Java 21, are much lighter).
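
One common way to avoid paying the thread-creation cost for every chunk is to reuse a fixed pool of threads. The sketch here is one possible approach, not the slides' method: the class `PooledSum` is a hypothetical name, and sizing the pool to `availableProcessors()` is an assumption that, as noted, is sometimes right and sometimes not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper; the pool size is an assumption to be tuned by measurement.
class PooledSum {
    static long sum(int[] a) {
        int workers = Math.max(1, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int len = (a.length + workers - 1) / workers;      // chunk size, rounded up
            List<Future<Long>> parts = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                final int min = Math.min(i * len, a.length);   // clamp both ends to the array
                final int max = Math.min(min + len, a.length);
                parts.add(pool.submit(() -> {
                    long partial = 0;
                    for (int k = min; k < max; k++) {
                        partial += a[k];
                    }
                    return partial;                            // becomes the Future's value
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();                           // blocks until that chunk is done
            }
            return total;
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while summing", ie);
        } catch (ExecutionException ee) {
            throw new IllegalStateException("a worker failed", ee.getCause());
        } finally {
            pool.shutdown();                                   // let the pool's threads exit
        }
    }
}
```

Changing the single `workers` value makes it easy to time the same input with different thread counts and see where adding threads stops helping.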

Parallel merge sort
• How can merge sort be parallelized if we have 2 CPUs/cores?
index    0   1   2   3   4   5   6   7
value   22  18  12  -4  58   7  31  42

split:  [22 18 12 -4]   [58 7 31 42]
sort:   [-4 12 18 22]   [7 31 42 58]
merge:  [-4 7 12 18 22 31 42 58]
 Idea:
• Split array in half.
• Recursively sort each half in its own thread.
• Merge.

Runnable merge sort
• Write a runnable class that can merge sort an array:

public class MergeSortRunner implements Runnable {
    private int[] a;

    public MergeSortRunner(int[] a) {
        this.a = a;
    }

    public void run() {
        mergeSort(a);
    }
}

Merge sort w/ threads
• Now modify the merge sort method to sort in threads:
// Parallel version (two threads); requires import java.util.Arrays
public static void parallelMergeSort(int[] a) {
    if (a.length < 2) { return; }
    // split array in half
    int[] left = Arrays.copyOfRange(a, 0, a.length / 2);
    int[] right = Arrays.copyOfRange(a, a.length / 2, a.length);
    // sort each half (in parallel)
    Thread lThread = new Thread(new MergeSortRunner(left));
    Thread rThread = new Thread(new MergeSortRunner(right));
    lThread.start();
    rThread.start();
    try {
        lThread.join();
        rThread.join();
    } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();   // restore the interrupt flag
    }
    // merge them back together
    merge(left, right, a);
}
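
The merge sort slides call `mergeSort(a)` and `merge(left, right, a)` without showing them. A minimal sketch of what those helpers might look like follows, assuming only the signatures implied by the calls; the enclosing class name `MergeSortHelpers` is hypothetical.

```java
import java.util.Arrays;

// Hypothetical home for the helpers the slides call but do not show.
class MergeSortHelpers {
    // Sequential merge sort: sorts a in place using temporary copies of each half.
    static void mergeSort(int[] a) {
        if (a.length < 2) { return; }
        int[] left = Arrays.copyOfRange(a, 0, a.length / 2);
        int[] right = Arrays.copyOfRange(a, a.length / 2, a.length);
        mergeSort(left);
        mergeSort(right);
        merge(left, right, a);
    }

    // Merges two sorted arrays into result; result.length must equal
    // left.length + right.length.
    static void merge(int[] left, int[] right, int[] result) {
        int i = 0;   // next unused index in left
        int j = 0;   // next unused index in right
        for (int k = 0; k < result.length; k++) {
            // take from left when right is exhausted, or when left's element is not larger
            if (j >= right.length || (i < left.length && left[i] <= right[j])) {
                result[k] = left[i++];
            } else {
                result[k] = right[j++];
            }
        }
    }
}
```

Because `merge` writes into the original array only after both halves are fully sorted, the two halves can be sorted by different threads without either one touching the other's data.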

More than 2 threads?
• If we want to use more than 2 threads, it is tricky to code.
 Have to keep an array of threads/runnables.
 Tough to merge all the partial results together when done.

• A better way: divide-and-conquer parallelism

 Have each call spawn two threads, which spawn two threads, ...
 Each thread merges the results of its two sub-threads; easier to manage.

Modified Runnable
• Modify the runnable class to accept a level:
 Level 0 : base case; just do a sequential merge sort.
 Level K : spawn two threads at level K-1 to sort each half.

public class MergeSortRunner implements Runnable {
    private int[] a;
    private int level;

    public MergeSortRunner(int[] a, int level) {
        this.a = a;
        this.level = level;
    }

    public void run() {
        parallelMergeSort(a, level);
    }
}

Merge sort w/ threads
• Now modify the merge sort method to use levels:
// Parallel version (many threads); requires import java.util.Arrays
public static void parallelMergeSort(int[] a) {
    // 3 levels => 2^3 = 8 threads doing the sequential sorts
    // (2 + 4 + 8 = 14 threads are created in total across the levels)
    parallelMergeSort(a, 3);
}

private static void parallelMergeSort(int[] a, int level) {
    if (a.length < 2) { return; }
    if (level == 0) { mergeSort(a); return; }
    // split array in half
    int[] left = Arrays.copyOfRange(a, 0, a.length / 2);
    int[] right = Arrays.copyOfRange(a, a.length / 2, a.length);
    // sort each half (in parallel)
    Thread lThread = new Thread(new MergeSortRunner(left, level - 1));
    Thread rThread = new Thread(new MergeSortRunner(right, level - 1));
    lThread.start();
    rThread.start();
    try {
        lThread.join();
        rThread.join();
    } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();   // restore the interrupt flag
    }
    // merge them back together
    merge(left, right, a);
}
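
Java's standard library also ships a fork/join framework (`java.util.concurrent`, since Java 7) built around the same divide-and-conquer idea, with a shared pool of worker threads instead of one new `Thread` per split. The sketch here swaps that framework in for the hand-written level counter; the class name `ForkJoinMergeSort` and the `THRESHOLD` value are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical fork/join variant of the slides' parallel merge sort.
class ForkJoinMergeSort extends RecursiveAction {
    private static final int THRESHOLD = 1_000;   // assumed cutoff; tune by measurement
    private final int[] a;

    ForkJoinMergeSort(int[] a) {
        this.a = a;
    }

    @Override
    protected void compute() {
        if (a.length <= THRESHOLD) {
            Arrays.sort(a);                        // small input: sort sequentially
            return;
        }
        int[] left = Arrays.copyOfRange(a, 0, a.length / 2);
        int[] right = Arrays.copyOfRange(a, a.length / 2, a.length);
        // run both halves as subtasks and wait for both to finish
        invokeAll(new ForkJoinMergeSort(left), new ForkJoinMergeSort(right));
        merge(left, right, a);
    }

    // Merges two sorted arrays into result (length = left.length + right.length).
    private static void merge(int[] left, int[] right, int[] result) {
        int i = 0, j = 0;
        for (int k = 0; k < result.length; k++) {
            if (j >= right.length || (i < left.length && left[i] <= right[j])) {
                result[k] = left[i++];
            } else {
                result[k] = right[j++];
            }
        }
    }

    // Entry point: sorts a in place using the JVM-wide common pool.
    static void sort(int[] a) {
        ForkJoinPool.commonPool().invoke(new ForkJoinMergeSort(a));
    }
}
```

The size threshold plays the role of "level 0" in the slides: below it, splitting further would cost more in task overhead than the parallelism gains.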

Amdahl's Law
• Amdahl's Law: The speedup that can be achieved by parallelizing a
program is limited by the sequential fraction of the program.
 Example: If one third (~33%) of the program must be performed sequentially, then no
matter how many processors you use, the speedup can never exceed 3x.
 An example of diminishing returns from adding more processors.
• "Nine couples can't make a baby in one month."

 Therefore, part of the trick becomes learning how to minimize the portion
of the program that must be performed sequentially.
• Making better parallel algorithms.
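
The usual statement of Amdahl's Law is speedup = 1 / (s + (1 - s) / n), where s is the sequential fraction and n is the number of processors. The short sketch here evaluates that formula for the one-third example; the class `Amdahl` and method `speedup` are hypothetical names.

```java
// Hypothetical helper that evaluates Amdahl's Law.
class Amdahl {
    // speedup = 1 / (s + (1 - s) / n)
    //   s = fraction of the run time that must stay sequential (0..1)
    //   n = number of processors working on the parallel part
    static double speedup(double sequentialFraction, int processors) {
        return 1.0 / (sequentialFraction + (1.0 - sequentialFraction) / processors);
    }

    public static void main(String[] args) {
        // With s = 1/3 the speedup approaches, but never reaches, 3x.
        for (int n : new int[] {1, 2, 4, 8, 64, 1024}) {
            System.out.printf("n=%d speedup=%.2f%n", n, speedup(1.0 / 3, n));
        }
    }
}
```

With s = 1/3, two processors give a 1.5x speedup and four give 2x; every further doubling buys less, which is the diminishing-returns effect described on the slide.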

Map/Reduce
• map/reduce: A strategy for implementing parallel algorithms.
 map: A master worker takes the problem input, divides it into smaller
sub-problems, and distributes the sub-problems to workers (threads).
 reduce: The master worker collects sub-solutions from the workers and
combines them in some way to produce the overall answer.
• Our multi-threaded merge sort is an example of such an algorithm.

• Frameworks and tools have been written to perform map/reduce.

 MapReduce framework by Google
 Hadoop framework (an Apache project, developed largely at Yahoo!)
 related to the ideas of Big Data and Cloud Computing
 also related to functional programming
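
The link to functional programming shows up directly in Java's stream API, where a map step and a reduce step can be written as two chained calls. This is only a single-machine sketch of the pattern, not the Google or Hadoop frameworks; the class `StreamMapReduce` and method `sumOfSquares` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical example of the map/reduce pattern using Java streams.
class StreamMapReduce {
    static long sumOfSquares(int[] a) {
        return Arrays.stream(a)
                .parallel()                     // let the library split the work across threads
                .mapToLong(x -> (long) x * x)   // map: transform each element independently
                .sum();                         // reduce: combine the partial results into one value
    }
}
```

The map step is safe to run in parallel because each element is processed on its own, and the reduce step works because addition can combine partial sums in any grouping.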

Thread object methods
Method name                  Description
getPriority()                gets this thread's running priority
setPriority(int)             sets the priority; named values: Thread.MIN_PRIORITY, NORM_PRIORITY, MAX_PRIORITY
getName(), setName(name)     gets/sets the name of this thread as a string
getState()                   thread's state: one of Thread.State.NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, or TERMINATED
interrupt()                  sets the thread's interrupt flag; a thread blocked in sleep, wait, or join receives an InterruptedException
isAlive()                    returns true if the thread has been started and has not yet terminated
join(), join(ms)             waits indefinitely, or for at most the given number of milliseconds, for the thread to finish running
start()                      begins running the thread's run method in a new thread of execution
stop()                       instructs a thread to stop immediately (deprecated; unsafe)
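
A few of these methods can be seen working together in a short sketch. The class `ThreadLifecycleDemo`, the method `demo`, and the field `sawInterrupt` are hypothetical names; the latch is only there to make sure the worker is running before it is interrupted.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo of getState, setName, start, interrupt, and join.
class ThreadLifecycleDemo {
    static final AtomicBoolean sawInterrupt = new AtomicBoolean();

    // Returns { state before start(), state after join() }.
    static Thread.State[] demo() {
        sawInterrupt.set(false);
        CountDownLatch started = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            started.countDown();               // signal that run() has begun
            try {
                Thread.sleep(60_000);          // would block for a minute...
            } catch (InterruptedException ie) {
                sawInterrupt.set(true);        // ...but interrupt() wakes it early
            }
        });
        worker.setName("worker-1");
        Thread.State before = worker.getState();   // NEW: created but not started
        worker.start();
        try {
            started.await();                   // wait until the worker is running
            worker.interrupt();                // a request to stop, not a forced kill
            worker.join();                     // wait for run() to return
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ie);
        }
        Thread.State after = worker.getState();    // TERMINATED: run() has returned
        return new Thread.State[] { before, after };
    }
}
```

Note that `interrupt()` only delivers a signal; the worker decides how to react, which is why it is the recommended replacement for the deprecated `stop()`.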

Thread static methods
Static method name      Description
activeCount()           estimate of the number of live threads in the current thread's thread group
dumpStack()             causes current thread to print a stack trace
getAllStackTraces()     returns stack trace data for all live threads
currentThread()         returns the Thread object for the currently executing thread
holdsLock(obj)          returns true if current thread holds the lock on the given object
sleep(ms)               causes the current thread to wait for at least the given number of ms before continuing
yield()                 hints to the scheduler that the current thread is willing to pause and let others run
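
Two of the static methods, `Thread.currentThread()` and `Thread.holdsLock(obj)`, are easy to try out in isolation. The class `ThreadStaticsDemo` and its method names are hypothetical, chosen only for this sketch.

```java
// Hypothetical demo of Thread.currentThread() and Thread.holdsLock(obj).
class ThreadStaticsDemo {
    // Returns { holdsLock outside synchronized, holdsLock inside synchronized }.
    static boolean[] lockOwnership() {
        Object lock = new Object();
        boolean outside = Thread.holdsLock(lock);   // no lock taken yet
        boolean inside;
        synchronized (lock) {
            inside = Thread.holdsLock(lock);        // this thread now owns the lock
        }
        return new boolean[] { outside, inside };
    }

    // Starts a thread with the given name and reports the name it sees for itself.
    static String nameSeenByWorker(String name) {
        String[] seen = new String[1];
        Thread t = new Thread(() -> seen[0] = Thread.currentThread().getName(), name);
        t.start();
        try {
            t.join();   // join also guarantees the write to seen[0] is visible here
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", ie);
        }
        return seen[0];
    }
}
```

Inside the lambda, `Thread.currentThread()` refers to the worker rather than the caller, which is why the reported name matches the one passed to the `Thread` constructor.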

Sleeping a thread
try {
    Thread.sleep(ms);
} catch (InterruptedException ie) {
    Thread.currentThread().interrupt();   // restore the interrupt flag
}

 Causes current thread to wait for at least the given number of milliseconds.
 If the program has other threads, they will be given a chance to run.
 Useful for writing code that checks for an update periodically (e.g. checking for new network messages every 2 sec).
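
A periodic check of this kind might be sketched as a small polling loop. The class `Poller`, the method `pollUntil`, and its parameters are hypothetical names; the condition being polled is supplied by the caller, so nothing here is tied to a particular network API.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper built on Thread.sleep.
class Poller {
    // Checks condition up to maxAttempts times, sleeping intervalMs between checks.
    // Returns the attempt number on which the condition was true, or -1 if it
    // never became true or the thread was interrupted while sleeping.
    static int pollUntil(BooleanSupplier condition, long intervalMs, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return attempt;
            }
            try {
                Thread.sleep(intervalMs);      // other threads get to run while we wait
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return -1;                     // stop polling when asked to
            }
        }
        return -1;
    }
}
```

A call such as `Poller.pollUntil(() -> inbox.hasMessages(), 2000, 30)` would check every two seconds for up to a minute, where `inbox` stands in for whatever object the program is watching.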
