
Parallel and Distributed Computing
COMP3139
Contents
• Parallelism algorithms
• Synchronization
• Synchronization mechanisms
• Challenges
• Best practices
• Data and work partitioning
PARALLELIZATION ALGORITHMS
LOOP PARALLELISM

• Loop parallelism is one of the simplest and most widely used strategies in parallel computing.
• It involves distributing the iterations of a loop across multiple
processors, where each iteration can be executed independently.
• This strategy works best when there is no data dependency between
iterations, meaning each iteration can run without needing to wait for
the result of others.

PARALLELIZATION ALGORITHMS
LOOP PARALLELISM

• Example: Suppose you want to compute the square of numbers in an array.
• Each computation is independent, making it a perfect candidate for loop parallelism. Using OpenMP, this can be easily parallelized, as in the sketch below.
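
A minimal sketch of this example, assuming a C compiler with OpenMP support (e.g. built with -fopenmp); the array size N and the variable names are illustrative choices, not part of the original slide:

#include <stdio.h>
#include <omp.h>

#define N 1000

int main(void) {
    int data[N], squares[N];

    for (int i = 0; i < N; i++)
        data[i] = i;

    /* Each iteration is independent, so OpenMP can split the
       loop across the available threads with no further locking. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        squares[i] = data[i] * data[i];

    printf("squares[10] = %d\n", squares[10]);
    return 0;
}

Each thread receives a subset of the iterations, and because no iteration reads another iteration's result, no synchronization is needed inside the loop.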

PARALLELIZATION ALGORITHMS
PIPELINE PARALLELISM

• It is a strategy where a task is broken down into a sequence of stages, with each stage performing part of the overall task.
• Each stage works concurrently, processing data as soon as it is
available from the previous stage, like an assembly line.
• This approach is useful in scenarios where data flows through
different processing phases, such as video streaming or image
processing.

PIPELINE PARALLELISM
• Example code
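
The slide's code is not reproduced here, so the following is a minimal two-stage pipeline sketch using POSIX threads and a small bounded buffer; the item count, buffer size, and the "processing" done in stage 2 are illustrative assumptions:

#include <pthread.h>
#include <stdio.h>

#define ITEMS 8
#define QSIZE 4

/* Bounded buffer connecting stage 1 to stage 2. */
static int queue[QSIZE];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

/* Stage 1: produce raw items (e.g. decoded frames). */
static void *stage1(void *arg) {
    for (int i = 0; i < ITEMS; i++) {
        pthread_mutex_lock(&m);
        while (count == QSIZE)
            pthread_cond_wait(&not_full, &m);
        queue[tail] = i;               /* hand the item to stage 2 */
        tail = (tail + 1) % QSIZE;
        count++;
        pthread_cond_signal(&not_empty);
        pthread_mutex_unlock(&m);
    }
    return NULL;
}

/* Stage 2: process each item as soon as it arrives. */
static void *stage2(void *arg) {
    for (int i = 0; i < ITEMS; i++) {
        pthread_mutex_lock(&m);
        while (count == 0)
            pthread_cond_wait(&not_empty, &m);
        int item = queue[head];
        head = (head + 1) % QSIZE;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&m);
        printf("stage 2 processed item %d\n", item * 10);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, stage1, NULL);
    pthread_create(&t2, NULL, stage2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

Stage 2 starts processing item 0 while stage 1 is still producing later items, which is exactly the assembly-line overlap described above.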
PARALLELIZATION ALGORITHMS
SPECULATIVE PARALLELISM

• It involves executing tasks that might or might not be necessary.


• The idea is to compute potential outcomes in advance, even when it’s
uncertain which computation path will be chosen.
• This is especially useful in algorithms that involve branching, such as
decision trees, where future computation depends on a choice that
hasn’t been made yet.

SPECULATIVE PARALLELISM
• EXAMPLE CODE
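
As the original code is not included, here is a hedged sketch of speculative execution with OpenMP sections: both computation paths run before the branch decision is known, and the unneeded result is simply discarded. The helpers path_a, path_b, and decide are hypothetical placeholders for the two expensive branches and the branch condition:

#include <stdio.h>
#include <omp.h>

/* Placeholder computations standing in for two expensive branch bodies. */
static long path_a(void) { long s = 0; for (long i = 0; i < 1000000; i++) s += i; return s; }
static long path_b(void) { long p = 1; for (long i = 1; i <= 20; i++) p *= i; return p; }
static int  decide(void) { return 1; }   /* which branch is actually needed */

int main(void) {
    long result_a = 0, result_b = 0;
    int take_a = 0;

    /* Speculatively compute both outcomes and the decision in parallel;
       one of the two results will be thrown away. */
    #pragma omp parallel sections
    {
        #pragma omp section
        result_a = path_a();
        #pragma omp section
        result_b = path_b();
        #pragma omp section
        take_a = decide();
    }

    printf("result = %ld\n", take_a ? result_a : result_b);
    return 0;
}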
SYNCHRONIZATION

• Synchronization is a key concept in parallel computing that ensures correct operation when multiple threads or processes are working concurrently and accessing shared resources.

• The goal is to prevent race conditions, inconsistencies, or deadlocks that can occur when parallel threads try to read and write shared data at the same time.
SYNCHRONIZATION MECHANISMS

1. Mutex (Mutual Exclusion): A lock that ensures only one thread at a time
can access a shared resource or critical section of code.

• Mutexes are used when modifying shared variables or data structures to prevent race conditions.

• Example:
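
A minimal sketch of the mutex idea using POSIX threads (the slide's own example is not reproduced here); two threads increment a shared counter, and the mutex makes each increment atomic with respect to the other thread:

#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;

static void *increment(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&counter_mutex);   /* enter the critical section */
        counter++;                            /* shared variable update */
        pthread_mutex_unlock(&counter_mutex); /* leave the critical section */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);  /* always 200000 with the mutex */
    return 0;
}

Without the lock/unlock pair the final count becomes unpredictable; the race-condition example later in this lecture shows exactly that.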

SYNCHRONIZATION MECHANISMS

2. Semaphores: A counter-based
synchronization mechanism that
controls access to a pool of resources.
• Useful when multiple threads need to
access limited resources (e.g.,
multiple readers/writers).
• Example: A semaphore with a count
of 3 would allow three threads to
access the resource simultaneously
before blocking others.
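
A sketch of that count-of-3 example using POSIX semaphores; the number of worker threads and the sleep that simulates work on the resource are illustrative assumptions:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

/* A semaphore initialised to 3: at most three threads may hold a
   "resource slot" at once; the remaining threads block in sem_wait. */
static sem_t slots;

static void *worker(void *arg) {
    long id = (long)arg;
    sem_wait(&slots);                    /* acquire one of the 3 slots */
    printf("thread %ld is using a resource slot\n", id);
    sleep(1);                            /* simulate work on the resource */
    sem_post(&slots);                    /* release the slot */
    return NULL;
}

int main(void) {
    pthread_t threads[6];
    sem_init(&slots, 0, 3);
    for (long i = 0; i < 6; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);
    for (int i = 0; i < 6; i++)
        pthread_join(threads[i], NULL);
    sem_destroy(&slots);
    return 0;
}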
SYNCHRONIZATION MECHANISMS

3. Locks: General synchronization primitives that ensure only one thread can execute a block of code at a time.

• Holding a lock is how one thread tells other threads: "I'm changing this thing, don't touch it right now."

SYNCHRONIZATION MECHANISMS

Types:
• Spinlocks: Threads continuously check whether a lock is available, which is useful in situations where waiting times are very short.
• A spinlock is a synchronization mechanism used in operating systems to protect shared resources from simultaneous access by multiple threads or processes.
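
A minimal spinlock sketch built on a C11 atomic flag (an illustration, not the slide's original code); a thread that finds the lock taken simply keeps retrying instead of sleeping:

#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

/* The atomic flag is the lock: set means "held", clear means "free". */
static atomic_flag lock = ATOMIC_FLAG_INIT;
static long shared = 0;

static void spin_lock(void)   { while (atomic_flag_test_and_set(&lock)) { /* busy-wait */ } }
static void spin_unlock(void) { atomic_flag_clear(&lock); }

static void *worker(void *arg) {
    for (int i = 0; i < 100000; i++) {
        spin_lock();
        shared++;          /* very short critical section: ideal for spinning */
        spin_unlock();
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared = %ld\n", shared);
    return 0;
}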
SYNCHRONIZATION MECHANISMS

Types:
• Read-Write Locks: Allows multiple
readers but only one writer at a
time.
• Read-write locks allow simultaneous
read access by many threads while
restricting write access to only one
thread at a time.
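
A sketch using the POSIX read-write lock API; the reader and writer bodies are illustrative:

#include <pthread.h>
#include <stdio.h>

static int shared_value = 0;
static pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;

static void *reader(void *arg) {
    pthread_rwlock_rdlock(&rwlock);     /* many readers may hold this at once */
    printf("reader %ld sees %d\n", (long)arg, shared_value);
    pthread_rwlock_unlock(&rwlock);
    return NULL;
}

static void *writer(void *arg) {
    pthread_rwlock_wrlock(&rwlock);     /* a writer gets exclusive access */
    shared_value = 42;
    pthread_rwlock_unlock(&rwlock);
    return NULL;
}

int main(void) {
    pthread_t r1, r2, w;
    pthread_create(&w,  NULL, writer, NULL);
    pthread_create(&r1, NULL, reader, (void *)1L);
    pthread_create(&r2, NULL, reader, (void *)2L);
    pthread_join(w, NULL);
    pthread_join(r1, NULL);
    pthread_join(r2, NULL);
    return 0;
}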

CHALLENGES ADDRESSED BY
SYNCHRONIZATION

• Race Conditions: Occur when multiple threads try to update shared data without proper synchronization, leading to unpredictable results.

• Example: Incrementing a shared counter in two threads without a mutex might lead to lost updates.
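
A sketch of that lost-update scenario: the same counter loop as in the mutex example earlier, but with the lock removed; counter++ is a read-modify-write that the two threads can interleave:

#include <pthread.h>
#include <stdio.h>

static long counter = 0;   /* shared, updated without any synchronization */

static void *increment(void *arg) {
    for (int i = 0; i < 100000; i++)
        counter++;         /* read-modify-write: updates from the two
                              threads can interleave and be lost */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* Expected 200000, but this typically prints a smaller, varying number. */
    printf("counter = %ld\n", counter);
    return 0;
}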

CHALLENGES ADDRESSED BY
SYNCHRONIZATION

• Deadlocks: Happen when two or more threads are waiting indefinitely for resources locked by each other.
• Example: Thread A locks resource
1 and waits for resource 2, while
Thread B locks resource 2 and
waits for resource 1.
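
A sketch of that scenario with two POSIX mutexes; the sleep calls are only there to make the interleaving (and therefore the deadlock) reproducible, so this program intentionally never terminates:

#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t resource1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t resource2 = PTHREAD_MUTEX_INITIALIZER;

/* Thread A: locks resource 1, then waits for resource 2. */
static void *thread_a(void *arg) {
    pthread_mutex_lock(&resource1);
    sleep(1);                          /* give thread B time to grab resource 2 */
    pthread_mutex_lock(&resource2);    /* blocks forever: B holds it */
    pthread_mutex_unlock(&resource2);
    pthread_mutex_unlock(&resource1);
    return NULL;
}

/* Thread B: locks resource 2, then waits for resource 1 (the reverse order). */
static void *thread_b(void *arg) {
    pthread_mutex_lock(&resource2);
    sleep(1);
    pthread_mutex_lock(&resource1);    /* blocks forever: A holds it */
    pthread_mutex_unlock(&resource1);
    pthread_mutex_unlock(&resource2);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, thread_a, NULL);
    pthread_create(&b, NULL, thread_b, NULL);
    pthread_join(a, NULL);             /* never returns: the threads deadlock */
    pthread_join(b, NULL);
    return 0;
}

Acquiring the two mutexes in the same order in both threads (the best practice on the next slide) removes the deadlock.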

BEST PRACTICES FOR
SYNCHRONIZATION

• Minimize the use of locks to avoid performance bottlenecks.

• Ensure all threads acquire locks in the same order to prevent deadlocks.

• Use high-level synchronization constructs (e.g., thread-safe queues) where possible.

DATA AND WORK PARTITIONING

• Partitioning is one of the fundamental steps in parallel computing, where tasks or data are divided into smaller units that can be processed concurrently by different processors or cores.
• The data partitioning criteria and the partitioning strategy decide how the dataset is divided.
DATA AND WORK PARTITIONING

1. Data Partitioning: Dividing the input data into chunks that can be
processed independently.

• In applications like matrix multiplication or image processing, the data can be partitioned into smaller blocks or sections that are processed in parallel.

i. Block Partitioning: The data is divided into equal-sized blocks, and each block is assigned to a processor.

• This is simple but may not balance the load efficiently if some data
chunks are more compute-intensive than others.
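
A sketch of block partitioning with OpenMP, where schedule(static) gives each thread one contiguous, roughly equal-sized block of iterations; the array contents and the doubling operation are illustrative:

#include <stdio.h>
#include <omp.h>

#define N 16

int main(void) {
    int data[N];
    for (int i = 0; i < N; i++) data[i] = i;

    /* schedule(static) splits the index range into one contiguous,
       roughly equal-sized block per thread. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < N; i++) {
        printf("thread %d handles element %d\n", omp_get_thread_num(), i);
        data[i] *= 2;
    }
    return 0;
}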
DATA AND WORK PARTITIONING

ii. Cyclic Partitioning: Tasks are distributed in a round-robin fashion. This method is more efficient when data chunks have varying computational loads.
Example: In a matrix multiplication
problem, partitioning each row of the
matrix across processors would allow each
processor to compute the partial results
concurrently.
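
A sketch of cyclic (round-robin) distribution with OpenMP; schedule(static, 1) deals iterations out one at a time to the threads, which helps when per-iteration costs vary. The squaring loop here is only a stand-in for such work, not the matrix example above:

#include <stdio.h>
#include <omp.h>

#define N 16

int main(void) {
    int work[N];

    /* schedule(static, 1) hands out iterations round-robin: with 4 threads,
       thread 0 gets 0, 4, 8, ... and thread 1 gets 1, 5, 9, ..., spreading
       uneven per-iteration costs more evenly across the threads. */
    #pragma omp parallel for schedule(static, 1)
    for (int i = 0; i < N; i++)
        work[i] = i * i;

    printf("work[5] = %d\n", work[5]);
    return 0;
}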

DATA AND WORK PARTITIONING

2. Work Partitioning:

• Dividing computational tasks into smaller tasks that can be executed in parallel.

• Work partitioning is typically used in recursive algorithms (like quicksort) where the problem can be split into subproblems that can be solved in parallel.

DATA AND WORK PARTITIONING

i. Task Decomposition: The program’s functionality is divided into tasks, where each task performs a separate part of the overall computation.

ii. Functional Decomposition: Each processor performs a different function or operation on the same or different data.

• Example: In a parallel quicksort algorithm, after partitioning the array around a pivot, the left and right sub-arrays can be sorted by separate threads, as in the sketch below.
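
A sketch of that idea using OpenMP tasks: after the (sequential) partition step, the two sub-arrays are sorted as independent tasks; the Lomuto partition scheme and the sample array are illustrative choices:

#include <stdio.h>
#include <omp.h>

/* Partition around the last element as pivot (Lomuto scheme). */
static int partition(int a[], int lo, int hi) {
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            i++;
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    }
    int t = a[i + 1]; a[i + 1] = a[hi]; a[hi] = t;
    return i + 1;
}

/* The two sub-arrays are independent, so each is sorted as a separate task. */
static void quicksort(int a[], int lo, int hi) {
    if (lo >= hi) return;
    int p = partition(a, lo, hi);
    #pragma omp task shared(a)
    quicksort(a, lo, p - 1);
    #pragma omp task shared(a)
    quicksort(a, p + 1, hi);
    #pragma omp taskwait
}

int main(void) {
    int a[] = {9, 3, 7, 1, 8, 2, 5, 4, 6, 0};
    int n = sizeof a / sizeof a[0];

    #pragma omp parallel
    {
        #pragma omp single        /* one thread starts; the tasks fan out */
        quicksort(a, 0, n - 1);
    }

    for (int i = 0; i < n; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}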

CHALLENGES IN PARTITIONING

• Ensuring that tasks or data are independent of each other to avoid race conditions.

• Balancing the workload across processors to avoid idling or overloading any single processor.

THANK YOU
