Parallel Programming 1

Uploaded by

attack3rx0

Amdahl’s Law

• If 1/s of the program is sequential, then you can never get a
  speedup better than s.
– (Normalized) sequential execution time = 1/s + (1 - 1/s) = 1
– Best parallel execution time on p processors = 1/s + (1 - 1/s)/p
– As p goes to infinity, the parallel execution time approaches 1/s
– So the speedup approaches 1/(1/s) = s.
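
The bound can be checked numerically. The following is a minimal sketch of the formula above; the names amdahl_speedup and print_amdahl_table are hypothetical and not part of the slides.

```c
#include <stdio.h>

/* Speedup on p processors when a fraction 1/s of the normalized
   run time is sequential: 1 / (1/s + (1 - 1/s)/p). */
double amdahl_speedup(double s, double p)
{
    double sequential = 1.0 / s;          /* part that cannot be parallelized */
    double parallel   = 1.0 - sequential; /* part that is divided among p processors */
    return 1.0 / (sequential + parallel / p);
}

/* Prints the speedup for a few processor counts (hypothetical helper). */
void print_amdahl_table(double s)
{
    int counts[] = {1, 2, 8, 64, 1024};
    for (int k = 0; k < 5; k++)
        printf("p = %4d  speedup = %.3f\n", counts[k],
               amdahl_speedup(s, counts[k]));
}
```

Calling print_amdahl_table(10.0) shows the speedup growing with p while staying below s = 10, however many processors are added.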
Why keep something sequential?

● Some parts of the program are not parallelizable (because of dependencies).

● Some parts may be parallelizable, but the overhead of running them in parallel outweighs the time saved.
How could two statements execute in parallel?

● On one processor:
    statement1;
    statement2;

● On two processors:
    processor1:      processor2:
    statement1;      statement2;
Fundamental Assumption

● Processors execute independently: there is no control over the order of execution among processors.
How could two statements execute in parallel?

• Possibility 1 (statement1 executes first)
    Processor1:      Processor2:
    statement1;
                     statement2;
• Possibility 2 (statement2 executes first)
    Processor1:      Processor2:
                     statement2;
    statement1;
How could two statements execute in parallel?

● Their order of execution must not matter!

● In other words,
    statement1; statement2;
  must be equivalent to
    statement2; statement1;
Example 1
a = 1;
b = a;

● Statements cannot be executed in parallel: the second statement reads the value that the first one writes.
● Program modifications may make it possible.
Example 2
a = f(x);
b = a;

● It may not be wise to change the program: for example, replacing b = a with b = f(x) would evaluate f twice.
Example 3
a = 1;
a = 2;

● Statements cannot be executed in parallel.
Dependencies
● What prevents us from parallelizing the previous code examples is the dependencies between their statements.
Types of Dependencies
● True dependency
● Anti-dependency
● Output dependency
True dependence
Statements S1, S2.

S2 has a true dependence on S1 iff S2 reads a value written by S1.
Anti-dependence
Statements S1, S2.

S2 has an anti-dependence on S1 iff S2 writes a value read by S1.
Output Dependence
Statements S1, S2.

S2 has an output dependence on S1 iff S2 writes a variable written by S1.
How could two statements execute in parallel?

S1 and S2 can execute in parallel iff there are no dependencies between S1 and S2:
– no true dependency
– no anti-dependency
– no output dependency
Some dependencies can be removed.
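
The three definitions can be turned into a small check. This is a simplified sketch, not a real dependence analyzer: it assumes each statement is described by hand as a set of variables read and a set of variables written (one bit per variable), and every name in it (Stmt, VAR_A, dependences, ...) is hypothetical.

```c
/* Hypothetical model: one bit per variable. */
enum { VAR_A = 1 << 0, VAR_B = 1 << 1, VAR_X = 1 << 2 };

typedef struct {
    unsigned reads;   /* variables the statement reads */
    unsigned writes;  /* variables the statement writes */
} Stmt;

enum { DEP_TRUE = 1, DEP_ANTI = 2, DEP_OUTPUT = 4 };

/* Dependences of s2 on s1, where s1 comes first in program order. */
int dependences(Stmt s1, Stmt s2)
{
    int deps = 0;
    if (s1.writes & s2.reads)  deps |= DEP_TRUE;   /* S2 reads what S1 wrote */
    if (s1.reads  & s2.writes) deps |= DEP_ANTI;   /* S2 writes what S1 read */
    if (s1.writes & s2.writes) deps |= DEP_OUTPUT; /* both write the same variable */
    return deps;
}

/* S1 and S2 may run in parallel only if no dependence was found. */
int can_run_in_parallel(Stmt s1, Stmt s2)
{
    return dependences(s1, s2) == 0;
}
```

With a = 1 described as {0, VAR_A} and b = a as {VAR_A, VAR_B}, the check reports a true dependence (Example 1); two writes to a, as in Example 3, report an output dependence.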
Example 4
● Most parallelism occurs in loops.

for(i=0; i<100; i++)
    a[i] = i;

● No dependencies.
● Iterations can be executed in parallel.
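
One way to express this in practice is an OpenMP directive. The slides do not name a parallel programming system, so the use of OpenMP here is an assumption, and fill_indices is a hypothetical wrapper around the loop of Example 4.

```c
#define N 100

/* Fills a[i] = i. No iteration reads or writes data used by another
   iteration, so the iterations may run in any order or at the same time. */
void fill_indices(int a[N])
{
    /* Assumption: OpenMP. The pragma is ignored if OpenMP is not enabled,
       and the loop then simply runs sequentially. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = i;
}
```

With GCC or Clang the directive takes effect when compiling with -fopenmp; the result is the same either way because the order of the iterations does not matter.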
Example 5
for(i=0; i<100; i++) {
    a[i] = i;
    b[i] = 2*i;
}

Iterations and statements can be executed in parallel.
Example 6

for(i=0;i<100;i++) a[i] = i;
for(i=0;i<100;i++) b[i] = 2*i;

Iterations and loops can be executed in parallel.

Example 7
for(i=0; i<100; i++)
    a[i] = a[i] + 100;

● There is a dependence … on itself! Each iteration reads and writes the same element a[i].
● No dependence crosses iterations, so the loop is still parallelizable.
Example 8
for( i=1; i<100; i++ )
    a[i] = f(a[i-1]);

● Dependence between a[i] and a[i-1]: iteration i reads the value that iteration i-1 writes.
● Loop iterations are not parallelizable.
Loop-carried dependence
● A loop-carried dependence is a dependence between statements executed in different iterations of a loop; it is present only because the statements are part of the execution of the loop.
● Otherwise, we call it a loop-independent dependence.
● Loop-carried dependencies prevent parallelization of the loop iterations.
Example 9
for(i=0; i<100; i++ )
    for(j=1; j<100; j++ )
        a[i][j] = f(a[i][j-1]);

● The dependence is independent of the i loop: no value flows between different values of i.
● Loop-carried dependence on j.
● Outer loop can be parallelized, inner loop cannot.
Example 10
for( j=1; j<100; j++ )
    for( i=0; i<100; i++ )
        a[i][j] = f(a[i][j-1]);

● Inner loop can be parallelized, outer loop cannot.
● Less desirable situation: the parallel work has to be started and synchronized once per outer iteration.
● Loop interchange is sometimes possible.
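
For this particular loop nest the interchange is legal, because each element depends only on its left neighbour in the same row. The sketch below compares the two loop orders; the body of f is a hypothetical stand-in (the slides do not define it), the function names are made up, and the OpenMP directive is an assumption.

```c
#define N 100

/* Hypothetical stand-in for f; kept bounded to avoid integer overflow. */
static int f(int x) { return (3 * x + 1) % 1009; }

/* Example 10 order: the outer j loop carries the dependence. */
void sweep_j_outer(int a[N][N])
{
    for (int j = 1; j < N; j++)
        for (int i = 0; i < N; i++)
            a[i][j] = f(a[i][j - 1]);
}

/* After interchange (Example 9 order): rows are independent, so the outer
   i loop can be parallelized; each row is still swept in increasing j. */
void sweep_i_outer(int a[N][N])
{
    #pragma omp parallel for   /* assumption: OpenMP; ignored if not enabled */
    for (int i = 0; i < N; i++)
        for (int j = 1; j < N; j++)
            a[i][j] = f(a[i][j - 1]);
}
```

Both functions produce the same array from the same starting column a[i][0], but the interchanged version needs only one parallel region instead of one per value of j.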
Level of loop-carried dependence
● The level of a loop-carried dependence is the nesting depth of the loop that carries the dependence.

● It indicates which loops can be parallelized.
Be careful … Example 11
printf("a");
printf("b");

Statements have a hidden output dependence due to the output stream.
Be careful … Example 12
a = f(x);
b = g(x);

Statements could have a hidden dependence if f and g update the same variable.
Also depends on what f and g can do to x.
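
A minimal sketch of such a hidden dependence, assuming hypothetical bodies for f and g that both update a shared counter (the slides do not define them):

```c
/* Shared state that is invisible at the call sites. */
static int calls = 0;

/* Hypothetical f and g: both update the same variable. */
int f(int x) { calls++; return x + calls; }
int g(int x) { calls++; return x * calls; }

/* a = f(x); b = g(x); in program order. */
void run_f_then_g(int x, int *a, int *b)
{
    calls = 0;
    *a = f(x);
    *b = g(x);
}

/* The same two statements in the opposite order. */
void run_g_then_f(int x, int *a, int *b)
{
    calls = 0;
    *b = g(x);
    *a = f(x);
}
```

For x = 10 the first order gives a = 11 and b = 20, while the second gives a = 12 and b = 10. The order matters, so the two calls cannot safely run in parallel even though neither statement mentions the other's result.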
Be careful … Example 13
for(i=0; i<100; i++)
    a[i+10] = f(a[i]);

● Dependence chain through a[0], a[10], a[20], …
● Dependence chain through a[1], a[11], a[21], …
● …
● Some parallel execution is possible: the chains are independent of each other.
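
One way to exploit this is to give each chain to a different processor. The sketch below is an illustration under assumptions: the body of f is a hypothetical stand-in, the function names are made up, and OpenMP is assumed for the parallel loop.

```c
#define N 100
#define DIST 10

/* Hypothetical stand-in for f; kept bounded to avoid integer overflow. */
static int f(int x) { return (3 * x + 1) % 1009; }

/* Original loop of Example 13; a must have N + DIST elements. */
void update_sequential(int *a)
{
    for (int i = 0; i < N; i++)
        a[i + DIST] = f(a[i]);
}

/* Same result, organized as DIST independent chains. Chain k touches only
   a[k], a[k+10], a[k+20], ..., so different chains can run in parallel. */
void update_by_chains(int *a)
{
    #pragma omp parallel for   /* assumption: OpenMP; ignored if not enabled */
    for (int k = 0; k < DIST; k++)
        for (int i = k; i < N; i += DIST)   /* sequential inside one chain */
            a[i + DIST] = f(a[i]);
}
```

The available parallelism is limited to 10, the distance of the dependence, because each chain still has to be walked in order.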
Be careful … Example 14
for( i=1; i<100;i++ ) {
    a[i] = …;
    ... = a[i-1];
}
● Dependence between a[i] and a[i-1]: iteration i reads the value that iteration i-1 writes.
● Complete parallel execution is impossible.
● Pipelined parallel execution is possible.
Be careful … Example 15
for( i=0; i<100; i++ )
    a[i] = f(a[indexa[i]]);

Cannot tell for sure.

● Parallelization depends on user knowledge of the values in indexa[].
● The user can tell, the compiler cannot.
Optimizations: Example 16
for (i = 0; i < 100000; i++)
    a[i + 1000] = a[i] + 1;

Cannot be parallelized as it is: iteration i + 1000 reads the element that iteration i writes.

May be parallelized by applying certain code transformations.
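
The slides do not say which transformation is meant. One possibility, sketched here as an assumption, is to split the loop into blocks of 1000 iterations: within a block no iteration reads an element written by another iteration of the same block, so the block can run in parallel while the blocks themselves run one after another. The function names and the use of OpenMP are hypothetical.

```c
#define N 100000
#define DIST 1000

/* Original loop of Example 16; a must have N + DIST elements. */
void shift_sequential(int *a)
{
    for (int i = 0; i < N; i++)
        a[i + DIST] = a[i] + 1;
}

/* Blocked version: the outer loop over blocks stays sequential, while the
   DIST iterations inside one block are independent of each other. */
void shift_blocked(int *a)
{
    for (int start = 0; start < N; start += DIST) {
        /* Reads a[start .. start+DIST-1], writes a[start+DIST .. start+2*DIST-1]. */
        #pragma omp parallel for   /* assumption: OpenMP; ignored if not enabled */
        for (int i = start; i < start + DIST; i++)
            a[i + DIST] = a[i] + 1;
    }
}
```

Each block must finish before the next one starts, since the next block reads exactly the elements the current block writes.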
An aside
● Parallelizing compilers analyze program dependencies to decide what can be parallelized.
● When parallelizing by hand, the user does the same analysis.
● The compiler is more convenient and less error-prone.
● The user is more powerful and can analyze more patterns.
To remember
● Statement order must not matter.
● Statements must not have dependencies.
● Some dependencies can be removed.
● Some dependencies may not be obvious.
