
HPC UNIT 2

Decomposition:
1. Dividing a computation into smaller sub-computations that can be executed in parallel is called decomposition.
2. Decomposition speeds up the overall computation.
Task:
1. A task is a programmer-defined unit of computation.
2. Tasks are generated by subdividing the main computation through decomposition.
Dependency Graph: -
1. A decomposition can be illustrated in the form of a directed graph, with nodes corresponding to tasks and edges indicating that the result of one task is required for processing the next.
2. Such a graph is called a task dependency graph.
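
A minimal Python sketch of a task dependency graph, assuming a hypothetical three-task decomposition of (a + b) * (c + d); graphlib (standard library, Python 3.9+) reports which tasks are ready, and tasks that become ready together can run in parallel:

from graphlib import TopologicalSorter

# Hypothetical tasks for computing (a + b) * (c + d):
#   t1 = a + b, t2 = c + d, t3 = t1 * t2
# Each node maps to the set of tasks whose results it needs.
graph = {
    "t1": set(),
    "t2": set(),
    "t3": {"t1", "t2"},
}

ts = TopologicalSorter(graph)
ts.prepare()
while ts.is_active():
    ready = list(ts.get_ready())          # tasks whose dependencies are satisfied
    print("can run in parallel:", ready)  # first t1 and t2 together, then t3
    ts.done(*ready)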

Parallel Algorithm Models: -


Master-Slave Model:
1. Focuses on task distribution and centralized control.
2. Working: - 1) Master: assigns tasks. 2) Slave: executes tasks and reports back.
3. Control: Centralized; the master dictates the tasks.
Example: In a rendering farm, the master assigns rendering tasks to slave computers. Slaves render frames and send
them back to the master for assembly.
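
A minimal Python sketch of the master-slave model, assuming multiprocessing.Pool supplies the slave processes and render_frame is a hypothetical stand-in for the real rendering work; the main process acts as the master that assigns tasks and assembles the results:

from multiprocessing import Pool

def render_frame(frame_number):
    # Slave: executes its assigned task and reports the result back.
    return f"frame {frame_number:03d} rendered"

if __name__ == "__main__":
    frames = range(8)                                # work the master wants done
    with Pool(processes=4) as pool:                  # four slave processes
        rendered = pool.map(render_frame, frames)    # master assigns the tasks
    print("\n".join(rendered))                       # master assembles the results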

Data-Parallel Model:
1. Performs the same operation on different data concurrently.
2. Working: - Every processor performs the same operation on its assigned data.
3. Control: Centralized control for data distribution.
Example: Calculating the average temperature across multiple weather stations. Each processor computes a partial result (the sum and count of its station's readings), and the main system combines these into the overall average.
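
A minimal Python sketch of the data-parallel model: every worker applies the same operation (a partial sum and count) to its own station's readings, and the main process combines the partial results. The station readings are hypothetical:

from multiprocessing import Pool

def partial_stats(readings):
    # The same operation is applied to every chunk of data.
    return sum(readings), len(readings)

if __name__ == "__main__":
    stations = [                        # hypothetical temperature readings per station
        [21.0, 22.5, 23.0],
        [18.0, 19.5],
        [25.0, 24.0, 26.5, 25.5],
    ]
    with Pool(processes=3) as pool:
        parts = pool.map(partial_stats, stations)
    total, count = map(sum, zip(*parts))          # combine the partial results
    print("overall average:", total / count)

Combining sums and counts (rather than averaging the per-station averages) keeps the result exact even when stations have different numbers of readings.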

Task-Parallel Model:
1. Executes independent tasks concurrently.
2. Working: - Processors handle different tasks.
3. Control: Can be centralized or decentralized.
Example: Web server handling multiple requests. Each request is a task, and they can be processed concurrently.
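
A minimal Python sketch of the task-parallel model, treating each incoming request as an independent task; serve_page and run_query are hypothetical handlers, and a thread pool processes the unrelated tasks concurrently:

from concurrent.futures import ThreadPoolExecutor

def serve_page(path):
    return f"200 OK: contents of {path}"

def run_query(sql):
    return f"200 OK: results of '{sql}'"

if __name__ == "__main__":
    # Each (handler, argument) pair is one independent task.
    requests = [(serve_page, "/index.html"),
                (run_query, "SELECT name FROM users"),
                (serve_page, "/about.html")]
    with ThreadPoolExecutor(max_workers=3) as executor:
        futures = [executor.submit(handler, arg) for handler, arg in requests]
        for future in futures:
            print(future.result())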

Characteristics of Tasks: -
1. Task Generation:
• Static: All tasks are known before execution starts (think of a grocery list you make beforehand).
• Dynamic: Tasks are created during execution (like encountering unexpected ingredients while cooking).
2. Task Sizes:
• Uniform: All tasks require roughly the same amount of work.
• Non-uniform: Tasks have varying workloads.
3. Knowledge of Task Sizes:
• Known: The workload of each task is known upfront.
• Unknown: The workload is not known until runtime.
4. Size of Data Associated with Tasks:
• Small tasks require minimal data to execute.
• Large tasks involve processing significant amounts of data.
Characteristics of Interactions:
1. Read-only vs. Read-write Interactions:
• Read-only: Tasks only access data from other tasks without modifying it.
• Read-write: Tasks read and modify the data associated with other tasks.
2. One-way vs. Two-way Interactions:
• One-way: A task initiates communication and completes it without further exchange.
• Two-way: Tasks engage in back-and-forth communication.

Decomposition Techniques

Recursive Decomposition:

1. Concept: Divides the problem into smaller sub-problems recursively until each can be solved by a single processor.

2. Recursive decomposition utilizes multiple processors efficiently for sorting large datasets.

3. Implementation: Each processor works on a subset of data, recursively applying quicksort until sorted.

4. Example: Parallel quicksort for sorting a large dataset of numbers.
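
A minimal Python sketch of recursive decomposition via parallel quicksort, kept to one level of decomposition for simplicity: the array is partitioned around a pivot, and the two independent sub-problems are sorted by separate worker processes:

from multiprocessing import Pool
import random

def quicksort(data):
    # Ordinary sequential quicksort, run inside each worker.
    if len(data) <= 1:
        return data
    pivot = data[0]
    left = [x for x in data[1:] if x < pivot]
    right = [x for x in data[1:] if x >= pivot]
    return quicksort(left) + [pivot] + quicksort(right)

def parallel_quicksort(data, workers=2):
    # One level of recursive decomposition: partition around a pivot,
    # then sort the two independent sub-problems in separate processes.
    if len(data) <= 1:
        return data
    pivot = data[0]
    subproblems = [[x for x in data[1:] if x < pivot],
                   [x for x in data[1:] if x >= pivot]]
    with Pool(workers) as pool:
        left_sorted, right_sorted = pool.map(quicksort, subproblems)
    return left_sorted + [pivot] + right_sorted

if __name__ == "__main__":
    numbers = [random.randint(0, 999) for _ in range(20)]
    print(parallel_quicksort(numbers))

Stopping the parallel recursion after the first partition avoids nesting process pools inside workers; deeper decomposition would instead reuse one pool with a recursion-depth cutoff.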

Data Decomposition:

1. Concept: Divides data into smaller chunks distributed to different processors.

2. Data decomposition enables parallel processing for matrix operations, improving efficiency.

3. Implementation: Each processor multiplies its assigned sub-matrices independently; the partial results are then combined.

4. Example: Parallel matrix multiplication of large matrices A and B.
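
A minimal Python sketch of data decomposition for matrix multiplication: the rows of A are split into chunks, each worker multiplies its chunk by B, and the partial results are concatenated into the product (plain nested lists keep the sketch dependency-free):

from multiprocessing import Pool

def multiply_chunk(args):
    # Each worker receives some rows of A together with all of B.
    rows, B = args
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in rows]

def parallel_matmul(A, B, workers=2):
    chunk = (len(A) + workers - 1) // workers          # rows of A per worker
    tasks = [(A[i:i + chunk], B) for i in range(0, len(A), chunk)]
    with Pool(workers) as pool:
        partial_results = pool.map(multiply_chunk, tasks)
    return [row for part in partial_results for row in part]

if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 6], [7, 8]]
    B = [[1, 0], [0, 1]]                               # identity, so the product equals A
    print(parallel_matmul(A, B))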

Exploratory Decomposition:

1. Concept: Divides the problem space into sub-spaces for concurrent exploration.

2. Speeds up search for optimal solutions by utilizing multiple processors.

3. Implementation: Each processor explores its sub-space, identifies promising solutions, and communicates
findings for refinement.

4. Example: Parallel parameter space search for optimizing a scientific model.
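
A minimal Python sketch of exploratory decomposition, assuming a hypothetical one-dimensional model_error to minimise: the parameter space is split into sub-ranges, each worker searches its own sub-range, and the main process keeps the best solution found overall:

from multiprocessing import Pool

def model_error(x):
    # Hypothetical objective to minimise; the true optimum is at x = 3.7.
    return (x - 3.7) ** 2 + 1.0

def search_subspace(bounds):
    low, high = bounds
    candidates = [low + i * (high - low) / 100 for i in range(101)]
    best = min(candidates, key=model_error)
    return best, model_error(best)        # most promising solution in this sub-space

if __name__ == "__main__":
    subspaces = [(0.0, 2.5), (2.5, 5.0), (5.0, 7.5), (7.5, 10.0)]   # one per processor
    with Pool(processes=4) as pool:
        findings = pool.map(search_subspace, subspaces)
    best_x, best_err = min(findings, key=lambda f: f[1])
    print(f"best parameter ~ {best_x:.2f}, error ~ {best_err:.2f}")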

Limitations of parallelizing any algorithm: -


1. Amdahl's Law: the inherently sequential fraction of the work limits the achievable speedup (a worked example follows this list).
2. Communication overhead slows parallel processing.
3. Load imbalance reduces overall efficiency.
4. Memory bottleneck restricts scalability.
5. Synchronization complexity affects performance.
6. Not all algorithms parallelize effectively.
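
A short worked example of Amdahl's Law: if a fraction s of the work is inherently serial and p processors are used, speedup = 1 / (s + (1 - s) / p), so the speedup can never exceed 1 / s. Assuming a serial fraction of 10%:

def amdahl_speedup(serial_fraction, processors):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

if __name__ == "__main__":
    s = 0.10                                  # assume 10% of the work is inherently serial
    for p in (2, 4, 16, 1024):
        print(f"p = {p:4d}: speedup = {amdahl_speedup(s, p):.2f}")
    # As p grows, the speedup approaches 1 / s = 10x, no matter how many
    # processors are added.
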
Mapping Techniques for Load Balancing: -
Mapping techniques for load balancing in high-performance computing ensure tasks are distributed evenly across processors to maximize efficiency.

1. Static Mapping:
• Tasks are assigned to processors before execution.
• The allocation criteria are predetermined, such as task size or complexity.
• Implementation is straightforward, reducing complexity.
• However, load imbalance is possible if the tasks vary greatly in size.
2. Dynamic Mapping:
• Task assignments are adjusted during runtime based on system conditions.
• Processor performance and workload distribution are monitored constantly.
• Tasks are reallocated to ensure a balanced load across processors.
• Offers adaptability to changing workloads, minimizing idle time.
• However, it introduces overhead due to constant monitoring and decision-making (the sketch below contrasts the two approaches).
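
A minimal Python sketch contrasting the two approaches, assuming hypothetical tasks with known but uneven costs: static round-robin mapping fixes the assignment before execution, while the simulated dynamic mapping always gives the next task to the worker that becomes free first:

from collections import deque

task_costs = [5, 1, 1, 1, 8, 1, 1, 2]       # hypothetical, uneven per-task workloads
workers = 2

# Static mapping: round-robin assignment fixed before execution.
static = {w: [] for w in range(workers)}
for i, cost in enumerate(task_costs):
    static[i % workers].append(cost)
print("static load per worker:", {w: sum(c) for w, c in static.items()})

# Dynamic mapping (simulated): the worker that becomes free first
# takes the next pending task.
pending = deque(task_costs)
load = [0] * workers
while pending:
    w = load.index(min(load))
    load[w] += pending.popleft()
print("dynamic load per worker:", dict(enumerate(load)))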

Classification of Dynamic Mapping Techniques: -

Centralized vs. Distributed:

1. Centralized: Master assigns tasks, simpler but can be a bottleneck.

2. Distributed: Processes communicate directly, scalable but more complex.

Deterministic vs. Non-deterministic:

1. Deterministic: Rule-based task assignment, predictable but less adaptable.

2. Non-deterministic: Adaptive approaches, potentially better balancing but less predictable.

Informational Requirements:

1. Sender-initiated: Overloaded processes notify others so that some of their tasks can be reassigned.

2. Receiver-initiated: Idle processes request tasks, which helps balance the load (see the work-queue sketch after this classification).

3. Bilateral: Combines sender and receiver initiation for proactive balancing.

Communication Overhead:

1. Low-overhead: Minimizes communication, suitable for cost-sensitive scenarios.

2. High-overhead: Detailed information gathering for better balancing, but requires more communication.
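
A minimal Python sketch of receiver-initiated dynamic mapping using a shared work queue: idle workers pull the next task as soon as they finish their current one, so faster workers naturally take on more work. Squaring a number stands in for the real task:

from multiprocessing import Process, Queue

def worker(task_queue, result_queue):
    while True:
        task = task_queue.get()                   # idle worker requests the next task
        if task is None:                          # sentinel value: no more work
            break
        result_queue.put((task, task * task))     # squaring stands in for real work

if __name__ == "__main__":
    tasks, num_workers = list(range(20)), 4
    task_queue, result_queue = Queue(), Queue()
    for t in tasks:
        task_queue.put(t)
    for _ in range(num_workers):
        task_queue.put(None)                      # one sentinel per worker
    procs = [Process(target=worker, args=(task_queue, result_queue))
             for _ in range(num_workers)]
    for p in procs:
        p.start()
    results = [result_queue.get() for _ in tasks] # collect before joining
    for p in procs:
        p.join()
    print(sorted(results))
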
Different anomalies in parallel algorithms

1. Slow Performance: Parallel processing can be slower due to communication overhead and workload imbalances.

2. Reverse Speedup: Adding more processors can make a task run slower, especially for small problems or limited
parallelization.

3. Sequential Bottleneck: Inherent sequential work limits overall speedup, as per Amdahl's Law.

4. Data Sharing Issues: Shared-memory systems encounter performance problems due to data access conflicts and
cache invalidation.

5. Synchronization Overhead: Excessive coordination between processors using locks or barriers can slow down
processing.

6. Algorithmic Mismatch: Some algorithms are unsuitable for parallel execution, resulting in unexpected behaviour.

Principles of Parallel Algorithm Design

1. Designing parallel algorithms involves effectively utilizing multiple processors to solve a problem more efficiently
than a sequential approach.
2. Decomposition: Break the problem into smaller, independent tasks or data chunks.
3. Mapping: Assign these tasks or data chunks to processors, aiming for balanced workloads.
4. Communication: Minimize communication between processors to reduce overhead.
5. Synchronization: Coordinate processor execution using synchronization mechanisms sparingly.
6. Algorithmic Suitability: Choose algorithms suitable for parallelization, considering inherent parallelism and
avoiding bottlenecks.

Different methods for containing Interaction Overheads

1. Reduce Communication Costs: Keep frequently used data close to the processors that use it to minimize communication.
2. Optimize Communication Patterns: Minimize data contention and overlap communication with computation.
3. Strategic Data Management: Carefully replicate often-used data on each processor to avoid repeated communication, but be cautious of increased memory usage (see the sketch after this list).
4. Utilize System Optimizations: Use pre-optimized communication libraries for efficient data exchange patterns
like broadcasting and gathering.
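
A minimal Python sketch of strategic data replication, assuming a hypothetical read-only lookup table: it is copied once into each worker through the Pool initializer instead of being sent with every task, trading extra memory per worker for less communication:

from multiprocessing import Pool

_lookup = None                                    # per-worker replica of the shared data

def init_worker(lookup):
    # Replicate the read-only table once, when the worker process starts.
    global _lookup
    _lookup = lookup

def score(item):
    return _lookup[item] * 2                      # uses the local replica, no extra communication

if __name__ == "__main__":
    lookup = {i: i * 10 for i in range(1000)}     # hypothetical often-used data
    with Pool(processes=4, initializer=init_worker, initargs=(lookup,)) as pool:
        print(pool.map(score, [1, 5, 42]))        # -> [20, 100, 840]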
