Juan Mendivelso
MULTITHREADING ALGORITHMS
SERIAL ALGORITHMS & PARALLEL ALGORITHMS
Serial Algorithms: Suitable for running on a
uniprocessor computer in which only one
instruction executes at a time.
Parallel Algorithms: Run on a multiprocessor
computer that permits multiple instructions to
execute concurrently.
PARALLEL COMPUTERS
Computers with multiple processing units.
They can be:
Chip multiprocessors: Inexpensive
laptops/desktops. They contain a single multicore
integrated-circuit chip that houses multiple processor
“cores”, each of which is a full-fledged processor
with access to common memory.
Clusters:
Built from individual computers with a
dedicated network interconnecting them.
Intermediate price/performance.
Supercomputers: Combination of custom
architectures and custom networks to deliver the
highest performance (instructions per second).
High price.
MODELS FOR PARALLEL COMPUTING
Although the random-access machine model
was accepted early on for serial computing, no
single model has been established for parallel
computing.
A major reason is that vendors have not agreed
on a single architectural model for parallel
computers.
For example, some parallel computers feature
shared memory, where all processors can
access any location of memory.
Others employ distributed memory, where each
processor has a private memory.
However, the trend appears to be toward
shared-memory multiprocessors.
STATIC THREADING
Shared-memory parallel computers use static
threading.
Software abstraction of “virtual processors” or
threads sharing a common memory.
Each thread can execute code independently.
For most applications, threads persist for the
duration of a computation.
PROBLEMS OF STATIC THREADING
Programming a shared-memory parallel
computer directly using static threads is
difficult and error prone.
Dynamically partitioning the work among the
threads so that each thread receives
approximately the same load turns out to be
complicated.
The programmer must use complex
communication protocols to implement a
scheduler to load-balance the work.
This has led to the creation of concurrency
platforms. They provide a layer of software that
coordinates, schedules and manages the
parallel-computing resources.
DYNAMIC MULTITHREADING
Class of concurrency platform.
It allows programmers to specify parallelism in
applications without worrying about
communication protocols, load balancing, etc.
The concurrency platform contains a scheduler
that load-balances the computation
automatically.
It supports:
Nested parallelism: It allows a subroutine to be
spawned, allowing the caller to proceed while the
spawned subroutine is computing its result.
Parallel loops: like ordinary for loops, except that
the iterations can execute concurrently.
ADVANTAGES OF DYNAMIC MULTITHREADING
The user only specifies the logical parallelism.
Simple extension of the serial model with the
keywords parallel, spawn and sync.
Clean way to quantify parallelism.
Many multithreaded algorithms involving
nested parallelism follow naturally from the
Divide & Conquer paradigm.
BASICS OF MULTITHREADING
Fibonacci Example
The serial algorithm Fib(n) does repeated work:
it recomputes the same subproblems, so its
running time is exponential in n.
However, the two recursive calls are independent!
Parallel algorithm P-Fib(n): spawn one recursive
call and execute the other concurrently.
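A minimal Python sketch of the idea (an illustration, not the book's pseudocode): spawn maps to starting a thread, sync maps to joining it. Note that CPython's GIL means this shows only the logical parallelism, not an actual speedup.

```python
import threading

def fib(n):
    """Serial Fib(n): recomputes subproblems, exponential time."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

def p_fib(n):
    """P-Fib(n): 'spawn' becomes a thread, 'sync' becomes join()."""
    if n < 2:
        return n
    result = {}
    # spawn: the child computes P-Fib(n-1) while the parent continues
    child = threading.Thread(target=lambda: result.update(x=p_fib(n - 1)))
    child.start()
    y = p_fib(n - 2)   # the parent works on the other call concurrently
    child.join()       # sync: wait for the spawned child to finish
    return result['x'] + y

print(p_fib(10))  # 55, the same value as the serial fib(10)
```

Deleting the thread machinery (start/join) from `p_fib` yields exactly the serial `fib`, which is the serialization discussed next.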
SERIALIZATION
Concurrency keywords: spawn, sync and
parallel
The serialization of a multithreaded algorithm
is the serial algorithm that results from deleting
the concurrency keywords.
NESTED PARALLELISM
It occurs when the keyword spawn precedes a
procedure call.
It differs from an ordinary procedure call in
that the procedure instance that executes the
spawn (the parent) may continue to execute
in parallel with the spawned subroutine (its child)
instead of waiting for the child to complete.
KEYWORD SPAWN
It doesn’t say that a procedure must execute
concurrently with its spawned children; only
that it may!
The concurrency keywords express the logical
parallelism of the computation.
At runtime, it is up to the scheduler to
determine which subcomputations actually run
concurrently by assigning them to processors.
KEYWORD SYNC
A procedure cannot safely use the values
returned by its spawned children until after it
executes a sync statement.
The keyword sync indicates that the procedure
must wait until all its spawned children have
been completed before proceeding to the
statement after the sync.
Every procedure executes a sync implicitly
before it returns.
COMPUTATIONAL DAG
We can see a multithread computation as a
directed acyclic graph G=(V,E) called a
computational dag.
The vertices are instructions and the edges
represent dependencies between instructions,
where (u,v) є E means that instruction u must
execute before instruction v.
If a chain of instructions contains no parallel
control (no spawn, sync, or return), we may
group them into a single strand, each of which
represents one or more instructions.
Instructions involving parallel control are not
included in strands, but are represented in the
structure of the dag.
For example, if a strand has two successors,
one of them must have been spawned, and a
strand with multiple predecessors indicates the
predecessors joined because of a sync.
Thus, in the general case, the set V forms the
set of strands, and the set E of directed edges
represents dependencies between strands
induced by parallel control.
If G has a directed path from strand u to strand v,
we say that the two strands are (logically) in
series. Otherwise, strands u and v are (logically)
in parallel.
We can picture a multithreaded computation as
a dag of strands embedded in a tree of
procedure instances.
Example!
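As a sketch of the series/parallel definition (using a small made-up four-strand dag, not a figure from the source): two strands are in series exactly when the dag has a directed path between them.

```python
# A small hypothetical strand dag: edges (u, v) mean u must finish
# before v starts. Strand 1 spawns 2 and continues as 3; 2 and 3
# join at strand 4 because of a sync.
edges = {1: [2, 3], 2: [4], 3: [4], 4: []}

def reachable(g, u, v):
    """Depth-first search: is there a directed path from u to v?"""
    stack, seen = [u], set()
    while stack:
        w = stack.pop()
        if w == v:
            return True
        if w not in seen:
            seen.add(w)
            stack.extend(g[w])
    return False

def in_series(g, u, v):
    """Strands are in series iff a directed path joins them."""
    return reachable(g, u, v) or reachable(g, v, u)

print(in_series(edges, 1, 4))  # True: strand 4 depends on strand 1
print(in_series(edges, 2, 3))  # False: 2 and 3 are logically in parallel
```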
We can classify the edges:
Continuation edges: connect a strand u to its
successor u’ within the same procedure instance.
Call edges: represent normal procedure calls.
Return edges: connect a strand u that returns to its
calling procedure with the strand x immediately
following the next sync in the calling procedure.
A computation starts with an initial strand and
ends with a single final strand.
IDEAL PARALLEL COMPUTER
A parallel computer that consists of a set of
processors and a sequentially consistent shared
memory.
Sequentially consistent means that the shared
memory behaves as if the multithreaded
computation’s instructions were interleaved to
produce a linear order that preserves the partial
order of the computation dag.
Depending on scheduling, the ordering could
differ from one run of the program to another.
The ideal-parallel-computer model makes some
performance assumptions:
Each processor in the machine has equal
computing power
It ignores the cost of scheduling.
PERFORMANCE MEASURES
Work:
Total time to execute the entire computation on
one processor.
Sum of the times taken by each of the strands.
In the computational dag, it is the number of
strands (assuming each strand takes a time unit).
Span:
Longest time to execute the strands along any path
in the dag.
The span equals the number of vertices on a
longest or critical path.
Example!
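With unit-time strands, both measures can be read directly off the dag: work is the vertex count, span is a longest-path computation. A sketch on the same kind of hypothetical dag as before:

```python
# Hypothetical strand dag: strand 1 spawns 2, continues as 3;
# 2 and 3 sync at strand 4. Each strand takes one time unit.
edges = {1: [2, 3], 2: [4], 3: [4], 4: []}

def work(g):
    """Work T1 = total number of (unit-time) strands."""
    return len(g)

def span(g):
    """Span T_inf = number of vertices on a longest (critical) path."""
    memo = {}
    def longest_from(u):  # longest path starting at u, counting u itself
        if u not in memo:
            memo[u] = 1 + max((longest_from(v) for v in g[u]), default=0)
        return memo[u]
    return max(longest_from(u) for u in g)

print(work(edges), span(edges))  # 4 3: T1 = 4, T_inf = 3 (e.g. 1->2->4)
```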
The actual running time of a multithreaded
computation depends also on how many
processors are available and how the
scheduler allocates strands to processors.
Running time on P processors: TP
Work: T1
Span: T∞ (unlimited number of processors)
The work and span provide lower bounds on the
running time TP of a multithreaded computation
on P processors:
Work law: TP ≥ T1 /P
Span law: TP ≥ T∞
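The two laws combine into a single lower bound, max(T1/P, T∞). A worked example with hypothetical numbers (T1 = 100, T∞ = 10):

```python
def tp_lower_bound(T1, Tinf, P):
    """Combine the work law (TP >= T1/P) and the span law (TP >= T_inf)."""
    return max(T1 / P, Tinf)

# Hypothetical computation: work T1 = 100, span T_inf = 10.
for P in (1, 4, 16, 64):
    print(f"P={P:2d}: TP >= {tp_lower_bound(100, 10, P)}")
# With P = 4 the work law dominates (TP >= 25.0);
# with P = 64 the span law dominates (TP >= 10).
```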
Speedup:
Speedup of a computation on P processors is the
ratio T1 /TP
How many times faster the computation is on P
processors than on one processor.
It’s at most P.
Linear speedup: T1 /TP = Θ(P)
Perfect linear speedup: T1 /TP =P
Parallelism:
T1 /T∞
Average amount of work that can be
performed in parallel for each step along the critical
path.
As an upper bound, the parallelism gives the
maximum possible speedup that can be achieved
on any number of processors.
The parallelism provides a limit on the possibility of
attaining perfect linear speedup.
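The speedup bound can be made concrete (same hypothetical numbers as above, so parallelism = 100/10 = 10): the work law caps speedup at P, and the span law caps it at the parallelism T1/T∞.

```python
def max_speedup(T1, Tinf, P):
    """Speedup T1/TP is at most P (since TP >= T1/P) and at most
    the parallelism T1/T_inf (since TP >= T_inf)."""
    return min(P, T1 / Tinf)

# Hypothetical computation: T1 = 100, T_inf = 10, parallelism = 10.
for P in (2, 10, 50):
    print(f"P={P:2d}: speedup <= {max_speedup(100, 10, P)}")
# Beyond P = 10, adding processors cannot raise the speedup bound.
```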
SCHEDULING
Good performance depends on more than
minimizing the span and work.
The strands must also be scheduled efficiently
onto the processors of the parallel machine.
The multithreaded programming model provides
no way to specify which strands to execute on
which processors. Instead, we rely on the
concurrency platform’s scheduler.
A multithreaded scheduler must schedule the
computation with no advance knowledge of
when strands will be spawned or when they will
complete—it must operate on-line.
Moreover, a good scheduler operates in a
distributed fashion, where the threads
implementing the scheduler cooperate to load-
balance the computation.
To keep the analysis simple, we shall consider
an on-line centralized scheduler, which knows
the global state of the computation at any given
time.
In particular, we shall consider greedy
schedulers, which assign as many strands to
processors as possible in each time step.
If at least P strands are ready to execute during
a time step, we say that the step is a complete
step, and a greedy scheduler assigns any P of
the ready strands to processors.
Otherwise, fewer than P strands are ready to
execute, in which case we say that the step is
an incomplete step, and the scheduler assigns
each ready strand to its own processor.
A greedy scheduler executes a multithreaded
computation in time: TP ≤ T1 /P + T∞
Greedy scheduling is provably good because it
achieves the sum of the lower bounds as an
upper bound.
Besides, it is within a factor of 2 of optimal.
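The greedy bound can be checked by simulation (a sketch on a hypothetical dag with unit-time strands, not a proof): run complete and incomplete steps as defined above and count them.

```python
# Hypothetical strand dag: source strand 0 forks into 6 parallel
# strands, which all join at sink strand 7. Unit-time strands.
edges = {0: [1, 2, 3, 4, 5, 6],
         1: [7], 2: [7], 3: [7], 4: [7], 5: [7], 6: [7],
         7: []}

def greedy_time(g, P):
    """Simulate a greedy scheduler on P processors; return TP in steps."""
    indeg = {u: 0 for u in g}
    for u in g:
        for v in g[u]:
            indeg[v] += 1
    ready = [u for u in g if indeg[u] == 0]
    steps = 0
    while ready:
        # Complete step: run any P ready strands; incomplete: run them all.
        running, ready = ready[:P], ready[P:]
        steps += 1
        for u in running:
            for v in g[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    ready.append(v)
    return steps

T1 = greedy_time(edges, 1)             # work: 8 strands in total
Tinf = greedy_time(edges, len(edges))  # span: critical path has 3 strands
TP = greedy_time(edges, 2)
print(T1, Tinf, TP)                    # 8 3 5
assert TP <= T1 / 2 + Tinf             # greedy bound: TP <= T1/P + T_inf
```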