Parallel Programming Models
Sathish Vadhiyar
Department of Computational and Data Sciences
Supercomputer Education and Research Centre
Indian Institute of Science, Bangalore, India
September 13, 2019 SERC Training Workshop
Parallel Programming and Challenges
• Recall the advantages and motivation of
parallelism
• But parallel programs incur overheads
not seen in sequential programs
– Communication delay
– Idling
– Synchronization
Challenges
[Figure: execution timelines of two processes, P0 and P1, showing computation, communication, synchronization, and idle time]
How do we evaluate a parallel program?
• Execution time, T(p, n)
• Speedup, S
– S(p, n) = T(1, n) / T(p, n)
– Usually, S(p, n) < p
– Sometimes S(p, n) > p (superlinear speedup)
• Efficiency, E
– E(p, n) = S(p, n)/p
– Usually, E(p, n) < 1
– Sometimes, greater than 1
• Scalability – the limitations of parallel computing; how speedup and efficiency behave as n and p grow
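As a quick illustration of these metrics, here is a small sketch in C that computes speedup and efficiency from measured execution times (the timings and processor count are made-up values):

#include <stdio.h>

int main(void)
{
    /* hypothetical measured times for a fixed problem size n */
    double t1 = 100.0;   /* T(1, n): sequential execution time in seconds */
    double tp = 14.0;    /* T(p, n): parallel execution time in seconds */
    int p = 8;           /* number of processors */

    double speedup = t1 / tp;          /* S(p, n) = T(1, n) / T(p, n) */
    double efficiency = speedup / p;   /* E(p, n) = S(p, n) / p */

    printf("S(%d, n) = %.2f, E(%d, n) = %.2f\n", p, speedup, p, efficiency);
    return 0;
}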
PARALLEL PROGRAMMING
CLASSIFICATION AND STEPS
Parallel Program Models
• Single Program Multiple Data (SPMD)
• Multiple Program Multiple Data (MPMD)
Courtesy: http://www.llnl.gov/computing/tutorials/parallel_comp/
Programming Paradigms
• Shared memory model – Threads, OpenMP,
CUDA
• Message passing model – MPI
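For a flavour of the shared memory model, here is a minimal OpenMP sketch (the array, the loop body, and the sum are purely illustrative; compile with an OpenMP flag such as gcc -fopenmp). The message passing model is developed with the equation-solver example on the later slides.

#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void)
{
    static double a[N];
    double sum = 0.0;

    /* the threads share the array a and divide the iterations among themselves;
       each thread keeps a private partial sum that OpenMP combines at the end */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = i * 0.5;
        sum += a[i];
    }

    printf("sum = %f (using up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}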
Parallelizing a Program
Given a sequential program/algorithm, how do we go about producing a parallel version?
Four steps in program parallelization
1. Decomposition
Identifying tasks with a large extent of possible concurrent activity; splitting the problem into tasks
2. Assignment
Grouping the tasks into processes with best load
balancing
3. Orchestration
Reducing synchronization and communication costs
4. Mapping
Mapping of processes to processors (if possible)
Steps in Creating a Parallel Program
[Figure: the sequential computation is decomposed into tasks, the tasks are assigned to processes (p0, p1, p2, p3), the processes are orchestrated into a parallel program, and the program is mapped onto processors (P0, P1, P2, P3); decomposition and assignment together constitute partitioning]
Decomposition and Assignment
• Specifies how to group tasks together for a process
– Balance workload, reduce communication and
management cost
• In practical cases, both steps are combined into one step, trying to answer the question “What is the role of each parallel processing entity?”
Data Parallelism and Domain Decomposition
• The given data is divided across the processing entities
• Each process owns and computes a portion
of the data – owner-computes rule
• A multi-dimensional domain in simulations is divided into subdomains, one per processing entity
• This is called domain decomposition
Domain decomposition and Process Grids
• The given P processes are arranged in multiple dimensions, forming a process grid
• The domain of the problem is divided among the process grid
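With MPI, such a process grid can be formed with MPI_Dims_create and MPI_Cart_create; the following is a minimal sketch, where the choice of a 2-D, non-periodic grid is an assumption for illustration:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs, coords[2];
    int dims[2] = {0, 0}, periods[2] = {0, 0};
    MPI_Comm grid_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Dims_create(nprocs, 2, dims);             /* factor nprocs into a 2-D grid */
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &grid_comm);
    MPI_Cart_coords(grid_comm, rank, 2, coords);  /* this process's position in the grid */

    printf("rank %d -> position (%d, %d) in a %d x %d process grid\n",
           rank, coords[0], coords[1], dims[0], dims[1]);

    MPI_Comm_free(&grid_comm);
    MPI_Finalize();
    return 0;
}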
Illustrations
[Figures illustrating the division of a domain among a process grid]
Data Distributions
• To divide the data in a dimension among the processes in that dimension, data distribution schemes are followed
• Common data distributions:
– Block: for regular computations
– Block-cyclic: when there is load imbalance across space
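A minimal sketch of the two ownership rules along one dimension, mapping a global index to the process that owns it (the problem size, process count, and block size are illustrative):

#include <stdio.h>

/* Block: process p owns one contiguous block of ceil(n/P) consecutive indices */
int block_owner(int i, int n, int P)
{
    int block = (n + P - 1) / P;   /* ceiling of n/P */
    return i / block;
}

/* Block-cyclic: blocks of size b are dealt to the P processes round-robin */
int block_cyclic_owner(int i, int b, int P)
{
    return (i / b) % P;
}

int main(void)
{
    int n = 16, P = 4, b = 2;
    for (int i = 0; i < n; i++)
        printf("index %2d: block owner %d, block-cyclic owner %d\n",
               i, block_owner(i, n, P), block_cyclic_owner(i, b, P));
    return 0;
}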
Task parallelism
• Independent tasks are identified
• The tasks may or may not process different data
Based on Task Partitioning
• Based on task dependency graph
[Figure: task dependency graph for 8 tasks, a binary tree with task 0 at the root, tasks 0 and 4 at the next level, tasks 0, 2, 4, 6 below them, and tasks 0-7 at the leaves]
• In general, the task-partitioning problem is NP-complete
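The figure has the shape of a binary-tree (recursive-halving) reduction; the sketch below shows that dependency structure, where at each level task i combines its value with that of task i + step (the values are illustrative):

#include <stdio.h>

#define N 8

int main(void)
{
    double val[N] = {1, 2, 3, 4, 5, 6, 7, 8};   /* one value per leaf task */

    /* level by level, task i absorbs the partial result of task i + step:
       tasks 0, 2, 4, 6 work at the first level, tasks 0 and 4 at the next,
       and task 0 alone at the root, as in the tree above */
    for (int step = 1; step < N; step *= 2)
        for (int i = 0; i < N; i += 2 * step)
            val[i] += val[i + step];

    printf("reduced value held by task 0: %f\n", val[0]);
    return 0;
}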
Orchestration
• Goals
–Structuring communication
–Synchronization
• Challenges
–Organizing data structures – packing
–Small or large messages?
–How to organize communication and synchronization?
Orchestration
• Maximizing data locality
– Minimizing volume of data exchange
• Not communicating intermediate results – e.g. dot product
– Minimizing frequency of interactions - packing
• Minimizing contention and hot spots
– Avoid having every process use the same communication pattern with the other processes
• Overlapping computations with interactions
– Split computations into phases: those that depend on communicated data (type 1) and those that do not (type 2)
– Initiate communication for type 1; during the communication, perform type 2 (see the MPI sketch after this list)
• Replicating data or computations
– Balancing the extra computation or storage cost with
the gain due to less communication
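A minimal MPI sketch of this overlap for a 1-D row decomposition (the local sizes and the stencil are illustrative): the ghost-row exchange is initiated with nonblocking calls, the interior rows that do not need the ghost data (type 2) are updated while the messages are in flight, and the boundary rows (type 1) are updated only after MPI_Waitall.

#include <mpi.h>

#define NLOC 64    /* assumed number of locally owned rows */
#define NCOL 66    /* assumed row length including boundary columns */

static double grid[NLOC + 2][NCOL];   /* rows 0 and NLOC+1 are ghost rows */

static void update_row(int i)         /* illustrative five-point stencil on row i */
{
    for (int j = 1; j < NCOL - 1; j++)
        grid[i][j] = 0.2 * (grid[i][j] + grid[i][j-1] + grid[i-1][j] +
                            grid[i][j+1] + grid[i+1][j]);
}

int main(int argc, char **argv)
{
    int rank, nprocs, nreq = 0;
    MPI_Request req[4];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* initiate the communication needed by the boundary rows (type 1) */
    if (rank > 0) {
        MPI_Irecv(grid[0], NCOL, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, &req[nreq++]);
        MPI_Isend(grid[1], NCOL, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, &req[nreq++]);
    }
    if (rank < nprocs - 1) {
        MPI_Irecv(grid[NLOC + 1], NCOL, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD, &req[nreq++]);
        MPI_Isend(grid[NLOC],     NCOL, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD, &req[nreq++]);
    }

    /* type 2: interior rows do not touch the ghost rows; overlap with communication */
    for (int i = 2; i <= NLOC - 1; i++)
        update_row(i);

    /* type 1: the boundary rows need the ghost rows, so wait for the exchange first */
    MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE);
    update_row(1);
    update_row(NLOC);

    MPI_Finalize();
    return 0;
}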
Mapping
• Which process runs on which particular
processor?
–Can depend on network topology,
communication pattern of processes
–On processor speeds in case of
heterogeneous systems
Assignment -- Option 3
[Figure: one possible assignment, with contiguous blocks of the domain given to processes P0, P1, P2, and P4]
Orchestration
• Different for different programming
models/architectures
– Shared address space
• Naming: global address space
• Synch. through barriers and locks
– Distributed Memory /Message passing
• Non-shared address space
• Send-receive messages + barrier for synch.
SAS Version – Generating Processes
1. int n, nprocs; /* matrix: (n+2)-by-(n+2) elts. */
2. float **A, diff = 0;
2a. LockDec (diff_lock); /* lock protecting the shared diff */
2b. BarrierDec (barrier1); /* barrier used for synchronization */
3. main()
4. begin
5. read(n); /* read input parameter: matrix size */
5a. read(nprocs); /* read the number of processes */
6. A ← g_malloc (a 2-d array of (n+2) x (n+2) floats);
6a. Create (nprocs-1, Solve, A); /* spawn nprocs-1 additional processes running Solve */
7. initialize(A); /* initialize the matrix A somehow */
8. Solve (A); /* the parent process also calls the solver */
8a. Wait_for_End (nprocs-1); /* wait for the other processes to finish */
9. end main
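Create and Wait_for_End are pseudocode primitives; the sketch below shows how the same structure could be expressed with POSIX threads (the worker body and the number of workers are placeholders; compile with -pthread):

#include <stdio.h>
#include <pthread.h>

#define NPROCS 4   /* assumed number of parallel workers */

static void *Solve(void *arg)   /* placeholder worker; the real solver would go here */
{
    long pid = (long)arg;
    printf("worker %ld running\n", pid);
    return NULL;
}

int main(void)
{
    pthread_t tid[NPROCS - 1];

    /* Create(nprocs-1, Solve, ...): spawn nprocs-1 additional workers */
    for (long p = 1; p < NPROCS; p++)
        pthread_create(&tid[p - 1], NULL, Solve, (void *)p);

    Solve((void *)0);   /* the parent also participates, as pid 0 */

    /* Wait_for_End(nprocs-1): wait for the workers to finish */
    for (int p = 1; p < NPROCS; p++)
        pthread_join(tid[p - 1], NULL);

    return 0;
}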
SAS Version -- Solve
10. procedure Solve (A) /*solve the equation system*/
11. float **A; /*A is an (n + 2)-by-(n + 2) array*/
12. begin
13. int i, j, pid, done = 0; /*pid: this process's id, 0..nprocs-1*/
14. float temp;
14a. mybegin = 1 + (n/nprocs)*pid;
14b. myend = mybegin + (n/nprocs) - 1;
15. while (!done) do /*outermost loop over sweeps*/
16. diff = 0; /*initialize difference to 0*/
16a. Barrier (barrier1, nprocs);
17. for i ← mybegin to myend do /*sweep over this process's rows*/
18. for j ← 1 to n do
19. temp = A[i,j]; /*save old value of element*/
20. A[i,j] ← 0.2 * (A[i,j] + A[i,j-1] + A[i-1,j] +
21. A[i,j+1] + A[i+1,j]); /*compute average*/
22. diff += abs(A[i,j] - temp); /*every process updates the shared diff*/
23. end for
24. end for
25. if (diff/(n*n) < TOL) then done = 1;
26. end while
27. end procedure
SAS Version -- Issues
• SPMD program
• Wait_for_end – all to one communication
• How is diff accessed among processes?
– Mutex to ensure diff is updated correctly.
– A single lock around every update is too much synchronization!
– Need not synchronize for every grid point; accumulate into a private mydiff and update the shared diff only once per sweep.
• What about access to A[i][j], especially the boundary
rows between processes?
• Can loop termination be determined without any
synch. among processes?
– Do we need any synchronization for the termination condition statement?
SAS Version -- Solve
10. procedure Solve (A) /*solve the equation system*/
11. float **A; /*A is an (n + 2)-by-(n + 2) array*/
12. begin
13. int i, j, pid, done = 0;
14. float mydiff, temp;
14a. mybegin = 1 + (n/nprocs)*pid;
14b. myend = mybegin + (n/nprocs) - 1;
15. while (!done) do /*outermost loop over sweeps*/
16. mydiff = diff = 0; /*initialize local and global difference to 0*/
16a. Barrier (barrier1, nprocs);
17. for i ← mybegin to myend do /*sweep over this process's rows*/
18. for j ← 1 to n do
19. temp = A[i,j]; /*save old value of element*/
20. A[i,j] ← 0.2 * (A[i,j] + A[i,j-1] + A[i-1,j] +
21. A[i,j+1] + A[i+1,j]); /*compute average*/
22. mydiff += abs(A[i,j] - temp); /*accumulate into the private mydiff*/
23. end for
24. end for
24a. lock (diff_lock);
24b. diff += mydiff; /*one locked update per sweep, not per grid point*/
24c. unlock (diff_lock);
24d. Barrier (barrier1, nprocs); /*all contributions must be in diff before testing*/
25. if (diff/(n*n) < TOL) then done = 1;
25a. Barrier (barrier1, nprocs); /*no process may reset diff before all have tested it*/
26. end while
27. end procedure
SAS Program
• done condition evaluated redundantly by all
• Code that does the update is identical to the sequential program
– each process has private mydiff variable
• Most interesting special operations are for
synchronization
– accumulations into shared diff have to be mutually
exclusive
– why the need for all the barriers?
• Is this a good global reduction?
– What is the utility of this parallel accumulate, which serializes all processes through a single lock? (see the OpenMP reduction sketch below)
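For comparison, here is a hedged sketch of the same per-sweep accumulation using an OpenMP reduction clause, which combines the private per-thread differences without funnelling every process through a single lock (the grid size is illustrative, and, as in the pseudocode, rows shared between neighbouring threads are updated without synchronization within a sweep):

#include <stdio.h>
#include <math.h>

#define N 512

static double A[N + 2][N + 2];   /* interior N x N points, fixed boundary */

int main(void)
{
    double diff = 0.0;

    /* each thread accumulates a private copy of diff; the runtime sums the
       copies into the shared diff when the parallel loop ends */
    #pragma omp parallel for reduction(+:diff)
    for (int i = 1; i <= N; i++)
        for (int j = 1; j <= N; j++) {
            double temp = A[i][j];
            A[i][j] = 0.2 * (A[i][j] + A[i][j-1] + A[i-1][j] +
                             A[i][j+1] + A[i+1][j]);
            diff += fabs(A[i][j] - temp);
        }

    printf("diff/(n*n) = %e\n", diff / ((double)N * N));
    return 0;
}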
Message Passing Version
• Cannot declare A to be global shared array
– compose it from per-process private arrays
– usually allocated in accordance with the assignment of work -- the owner-computes rule
• a process assigned a set of rows allocates them locally
• Structurally similar to SPMD SAS
• Orchestration different
– data structures and data access/naming
– communication
– synchronization
• Ghost rows
Data Layout and Orchestration
[Figure: row-wise data layout across processes P0, P1, P2, and P4]
• Data partition allocated per processor
• Add ghost rows to hold boundary data
• Send edges to neighbors
• Receive into ghost rows
• Compute as in the sequential program
Message Passing Version – Generating Processes
1. int n, nprocs; /* matrix: (n+2)-by-(n+2) elts. */
2. float **myA; /* this process's local rows, including ghost rows */
3. main()
4. begin
5. read(n); /* read input parameter: matrix size */
5a. read(nprocs); /* read the number of processes */
/* 6. A ← g_malloc(...): no globally shared array in the message passing version */
6a. Create (nprocs-1, Solve); /* spawn nprocs-1 additional processes running Solve */
/* 7. initialize(A): each process initializes its own myA inside Solve */
8. Solve (); /* the parent process also calls the solver */
8a. Wait_for_End (nprocs-1);
9. end main
Message Passing Version – Array Allocation and Ghost-row Copying
10. procedure Solve () /*solve the equation system*/
11. /* no shared array argument; each process works on its private myA */
12. begin
13. int i, j, pid, done = 0;
14. float mydiff, temp;
14a. myend = (n/nprocs); /* number of rows owned by this process */
6. myA ← malloc (a 2-d array of (n/nprocs + 2) x (n + 2) floats); /* rows 0 and myend+1 are ghost rows */
7. initialize (myA); /* initialize myA LOCALLY */
15. while (!done) do /*outermost loop over sweeps*/
16. mydiff = 0; /*initialize local difference to 0*/
16a. if (pid != 0) then
SEND (&myA[1,0], n*sizeof(float), (pid-1), row); /* first owned row to the upper neighbour */
16b. if (pid != nprocs-1) then
SEND (&myA[myend,0], n*sizeof(float), (pid+1), row); /* last owned row to the lower neighbour */
16c. if (pid != 0) then
RECEIVE (&myA[0,0], n*sizeof(float), (pid-1), row); /* top ghost row from the upper neighbour */
16d. if (pid != nprocs-1) then
RECEIVE (&myA[myend+1,0], n*sizeof(float), (pid+1), row); /* bottom ghost row from the lower neighbour */
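The SEND/RECEIVE pairs above map naturally onto MPI; here is a minimal sketch of the same ghost-row exchange using MPI_Sendrecv, with MPI_PROC_NULL standing in for the missing neighbours at the two ends (the local sizes are illustrative):

#include <mpi.h>

#define MYEND 64   /* assumed n/nprocs: rows owned per process */
#define NCOL  66   /* assumed n + 2: row length including boundary columns */

static float myA[MYEND + 2][NCOL];   /* rows 0 and MYEND+1 are ghost rows */

int main(int argc, char **argv)
{
    int pid, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &pid);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int up   = (pid == 0)          ? MPI_PROC_NULL : pid - 1;
    int down = (pid == nprocs - 1) ? MPI_PROC_NULL : pid + 1;

    /* send the first owned row up, receive the bottom ghost row from below */
    MPI_Sendrecv(myA[1],         NCOL, MPI_FLOAT, up,   0,
                 myA[MYEND + 1], NCOL, MPI_FLOAT, down, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* send the last owned row down, receive the top ghost row from above */
    MPI_Sendrecv(myA[MYEND], NCOL, MPI_FLOAT, down, 0,
                 myA[0],     NCOL, MPI_FLOAT, up,   0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}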
Message Passing Version – Solver
12. begin
… … …
15. while (!done) do /*outermost loop over sweeps*/
… … …
17. for i ← 1 to myend do /*sweep over this process's rows*/
18. for j ← 1 to n do
19. temp = myA[i,j]; /*save old value of element*/
20. myA[i,j] ← 0.2 * (myA[i,j] + myA[i,j-1] + myA[i-1,j] +
21. myA[i,j+1] + myA[i+1,j]); /*compute average*/
22. mydiff += abs(myA[i,j] - temp);
23. end for
24. end for
24a. if (pid != 0) then /*workers send their local diff to process 0 and wait for the verdict*/
24b. SEND (mydiff, sizeof(float), 0, DIFF);
24c. RECEIVE (done, sizeof(int), 0, DONE);
24d. else /*process 0 accumulates, tests convergence, and broadcasts done*/
24e. for k ← 1 to nprocs-1 do
24f. RECEIVE (tempdiff, sizeof(float), k, DIFF);
24g. mydiff += tempdiff;
24h. end for
24i. if (mydiff/(n*n) < TOL) then done = 1;
24j. for k ← 1 to nprocs-1 do
24k. SEND (done, sizeof(int), k, DONE);
24l. end for
25. end while
26. end procedure
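In MPI, the explicit gather of mydiff to process 0 and the broadcast of done (lines 24a-24l) can be replaced by a single collective; here is a hedged sketch of the convergence test using MPI_Allreduce (the sweep itself is elided, and the problem size, tolerance, and iteration cap are illustrative):

#include <mpi.h>

#define TOL 1e-5

int main(int argc, char **argv)
{
    int n = 1024, done = 0, iter = 0;

    MPI_Init(&argc, &argv);

    while (!done) {
        float mydiff = 0.0f, diff;

        /* ... ghost-row exchange and local sweep would go here,
               accumulating the local difference into mydiff ... */

        /* sum the local differences and deliver the total to every process */
        MPI_Allreduce(&mydiff, &diff, 1, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);

        /* every process evaluates the same condition, so no broadcast of done is needed */
        if (diff / ((float)n * n) < TOL || ++iter >= 100)
            done = 1;
    }

    MPI_Finalize();
    return 0;
}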