OpenMP

The document provides an overview of OpenMP, a library for parallel programming that enables the division of computational work among multiple threads in shared memory systems. It covers concepts such as program structure, directives, clauses, and the differences between concurrency and parallelism. Additionally, it explains the FORK-JOIN model, memory models, and various OpenMP programming constructs with examples.


Index

• OpenMP
• Program Structure
• Directives: parallel, for
• Clauses
  • if
  • private
  • shared
  • default(shared/none)
  • firstprivate
  • num_threads

• References
1. Introduction
• Serial Programming
  • Develop a serial program and optimize for performance
• Real-world scenario:
  • Run multiple programs
  • Large and complex problems
  • Time consuming
• Solution:
  • Use parallel machines
  • Use multi-core machines
• Why parallel?
  • Reduce the execution time
  • Run multiple programs
• What is parallel programming?
  • Obtain the same amount of computation with multiple cores or threads at low frequency (fast)
1. Introduction
• Concurrency
  • Condition of a system in which multiple tasks are logically active at the same time, but they may not necessarily run in parallel
• Parallelism
  • Subset of concurrency
  • Condition of a system in which multiple tasks are active at the same time and run in parallel
1. Introduction
• Shared Memory Machines
  • All processors share the same memory
  • Variables can be shared or private
  • Communication via shared memory
  • Multi-threading: portable, easy to program and use, but not very scalable
  • OpenMP-based programming
• Distributed Memory Machines
  • Each processor has its own memory
  • Variables are independent
  • Communication by passing messages (network)
  • Multi-processing: difficult to program, but scalable
  • MPI-based programming
1. OpenMP: API
Open Specification for Multi-Processing (OpenMP)
• A library used to divide the computational work in a program and add parallelism to a serial program (by creating threads)
• An Application Program Interface (API) used to explicitly direct multi-threaded, shared-memory parallelism
• API Components
  • Compiler directives
  • Runtime library routines
  • Environment variables
• Standardization
  • Jointly defined and endorsed by major computer hardware and software vendors
1. OpenMP
FORK-JOIN Parallelism
• An OpenMP program begins as a single process: the master thread. The master thread executes sequentially until the first parallel region construct is encountered.
• FORK: when a parallel region is encountered, the master thread
  – creates a group of threads, and
  – becomes the master of this group of threads, with thread id 0 within the group.
• The statements in the program that are enclosed by the parallel region construct are then executed in parallel among these threads.
• JOIN: when the threads complete executing the statements in the parallel region construct, they synchronize and terminate, leaving only the master thread.
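
A minimal sketch of the FORK-JOIN model (an illustration, not from the original slides; compile with gcc -fopenmp; the number of threads is implementation-defined unless set explicitly):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    printf("Before the region: only the master thread runs.\n");

    #pragma omp parallel               /* FORK: a team of threads is created */
    {
        printf("Thread %d of %d inside the parallel region.\n",
               omp_get_thread_num(), omp_get_num_threads());
    }                                  /* JOIN: implied barrier; workers end */

    printf("After the region: only the master thread remains.\n");
    return 0;
}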
1. OpenMP
• I/O
  • OpenMP does not specify parallel I/O.
  • It is up to the programmer to ensure that I/O is conducted correctly within the context of a multi-threaded program.

• Memory Model
  • Threads can "cache" their data and are not required to maintain exact consistency with real memory all of the time.
  • When it is critical that all threads view a shared variable identically, the programmer is responsible for ensuring that the variable is updated by all threads as needed (see the flush directive in the sketch below).
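
A hedged sketch of using flush to publish an update (an illustration only, not from the slides; real code should prefer atomic constructs or locks over this busy-wait pattern):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int flag = 0;                         /* shared between the two threads */
    #pragma omp parallel num_threads(2)
    {
        if (omp_get_thread_num() == 0) {
            flag = 1;
            #pragma omp flush(flag)       /* make the update visible        */
        } else {
            int seen = 0;
            while (!seen) {
                #pragma omp flush(flag)   /* re-read flag from memory       */
                seen = flag;
            }
            printf("consumer saw the flag.\n");
        }
    }
    return 0;
}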
1. OpenMP: API
• Architecture (figure omitted)
• OpenMP Language Extensions (figure omitted)
2. OpenMP Programming

Program hello.c:

#include <stdio.h>
#include <omp.h>

int main(void)
{
    #pragma omp parallel
    printf("Hello, world.\n");
    return 0;
}

Compiling the program:

$ gcc -fopenmp hello.c -o hello

Output (one line per thread; here with two threads):

Hello, world.
Hello, world.
2. OpenMP Programming
• Directives
  • An OpenMP executable directive applies to the succeeding structured block or an OpenMP construct. A "structured block" is a single statement or a compound statement with a single entry at the top and a single exit at the bottom.

• Clauses
  • Not all clauses are valid on all directives. The set of clauses that is valid on a particular directive is described with the directive. Most clauses accept a comma-separated list of list items. All list items appearing in a clause must be visible.

• Runtime Library Routines
  • Execution environment routines affect and monitor threads, processors, and the parallel environment. Lock routines support synchronization with OpenMP locks. Timing routines support a portable wall-clock timer. Prototypes for the runtime library routines are defined in the file "omp.h". A sketch using all three families follows.
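
A hedged sketch touching each family of runtime library routines (an illustration, not from the slides):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    omp_lock_t lock;                       /* lock routines                  */
    omp_init_lock(&lock);

    double t0 = omp_get_wtime();           /* timing routines                */
    int count = 0;

    #pragma omp parallel                   /* execution environment routines */
    {
        omp_set_lock(&lock);               /* one thread in here at a time   */
        count++;
        omp_unset_lock(&lock);
    }

    omp_destroy_lock(&lock);
    printf("threads counted: %d, elapsed: %f s\n",
           count, omp_get_wtime() - t0);
    return 0;
}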
2. OpenMP Programming: Parallel

Directives:
• parallel
• for
• sections
• single
• task
• master
• critical
• barrier
• taskwait
• atomic
• flush
• ordered
• threadprivate

Clauses:
• default(shared|none)
• shared
• private
• firstprivate
• lastprivate
• reduction
• copyin
• copyprivate

Runtime library routines:
• omp_set_num_threads
• omp_get_num_threads
• omp_get_max_threads
• omp_get_thread_num(void)
• ... etc.
2. OpenMP Programming
• Directives
  • An OpenMP executable directive applies to the succeeding structured block or an OpenMP construct. A "structured block" is a single statement or a compound statement with a single entry at the top and a single exit at the bottom.
  • Directives are case sensitive.
  • A directive starts with #pragma omp.
  • Directives cannot be embedded within continued statements, and statements cannot be embedded within directives.
  • Only one directive-name can be specified per directive.
2. OpenMP Programming: Directives

The parallel construct forms a team of threads and starts parallel execution.

#pragma omp parallel [clause[[,] clause]...] new-line
    structured-block

Clauses:
  if(scalar-expression)
  num_threads(integer-expression)
  default(shared|none)
  private(list)
  firstprivate(list)
  shared(list)
  copyin(list)
  reduction(operator:list)

Restrictions:
• A program that branches into or out of a parallel region is non-conforming.
• A program must not depend on any ordering of the evaluations of the clauses of the parallel directive, or on any side effects of those evaluations.
• At most one if clause can appear on the directive.
• At most one num_threads clause can appear on the directive; the num_threads expression must evaluate to a positive integer value.
2. OpenMP Programming: Directives

#pragma omp parallel [clause[[,] clause]...] new-line
    structured-block
(clauses as listed above)

• A team of threads is created to execute the parallel region.
• The thread that encounters the parallel construct becomes the master thread.
• The thread id of the master is 0.
• All threads, including the master thread, execute the parallel region.
• omp_get_thread_num() provides the thread id.
• There is an implied barrier at the end of a parallel region.
• If a thread in a team executing a parallel region encounters another parallel directive, it creates a new team, and that thread becomes the master of the new team (see the sketch below).
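
A hedged sketch of nested parallel regions (an illustration, not from the slides; omp_set_nested enables nesting and is deprecated in OpenMP 5.0 in favor of omp_set_max_active_levels):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    omp_set_nested(1);                       /* allow nested teams         */
    #pragma omp parallel num_threads(2)
    {
        int outer = omp_get_thread_num();
        #pragma omp parallel num_threads(2)  /* each outer thread becomes  */
        {                                    /* master of a new inner team */
            printf("outer thread %d, inner thread %d\n",
                   outer, omp_get_thread_num());
        }
    }
    return 0;
}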
2. OpenMP Programming: Directives

#pragma omp parallel [clause[,]clause...] • If execution of a thread terminates while


new-line inside a parallel region, execution of all
threads in all teams terminates.
Structured-block
• The order of termination of threads
Clause: if(scalar-expression) is unspecified
num_threads(integer-expression) • All the work done by a team prior to
default(shared|none) any barrier which the team has passed
private(list) in the program is guaranteed to be
complete.
firstprivate(list)
• The amount of work done by each
shared(list) thread after the last barrier that it
copyin(list) passed and before it terminates is
unspecified
reduction(operator:list)
2. OpenMP Programming: Clauses

#pragma omp parallel [clause[,]clause...] If Clause


new-line • A structured block is executed in parallel
Structured-block if the evaluation of if() clause
is evaluated as true.
Clause: if(scalar-expression)
• A missing if clause is equivalent to an if
num_threads(integer-expression) clause that evaluates true.
default(shared|none) • At most one if () clause can appear on
private(list) the directive.
firstprivate(list)
shared(list)
copyin(list)
reduction(operator:list)
2. OpenMP Programming: if clause

Program hello.c with an if clause:

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int par = 0;
    #pragma omp parallel if(par == 1) num_threads(4)
    printf("Hello, world.\n");
    return 0;
}

What is the result of the program?

Hello, world.

OR

Hello, world.
Hello, world.
Hello, world.
Hello, world.

(Answer: since par == 0, the if clause evaluates to false, so the region runs with a single thread and prints one line.)
2. OpenMP Programming: Clauses – num_threads

The num_threads() clause:
• The num_threads expression must evaluate to a positive integer.
• It sets the number of threads for the execution of the parallel region.

Some of the execution environment routines:
• To set the number of threads: void omp_set_num_threads(int num_threads);
• To get the number of threads in the current team: int omp_get_num_threads(void);
• To find the maximum number of threads available to form a team: int omp_get_max_threads(void);
• To get the thread id: int omp_get_thread_num(void);
2. OpenMP Programming : num_threads clause

%%Program hello.c What is the result of the


program?
%% if () clause

Hello, world.
#include <stdio.h>
#include <omp.h> OR

int main(void) Hello, world.


{ Hello, world.
int par=0; Hello, world.
#pragma omp parallel if(par==1) num_threads(6) Hello, world.
printf("Hello, world.\n"); Hello, world.
return 0; Hello, world.
}
2. OpenMP Programming: num_threads

%%Program hello.c What is the result of the


program?
%% if () clause

Hello, world.
#include <stdio.h>
#include <omp.h> OR

int main(void) Hello, world.


{ Hello, world.
int par=1; Hello, world.
#pragma omp parallel if(par==1) num_threads(6) Hello, world.
printf("Hello, world.\n"); Hello, world.
return 0; Hello, world.
}
2. OpenMP Programming: Clauses

#pragma omp parallel [clause[,]clause...] new-line Default(shared/none) , shared(list) Clause


Structured-block • default (shared) clause causes all variables
referenced in the construct which have
Clause: if(scalar-expression) implicitly determined sharing attributes to be
num_threads(integer-expression) shared.
default(shared|none) • default(none) clause requires that
each variable which is referenced in the
private(list) construct, and that does not have a
predetermined sharing attribute, must
firstprivate(list) have its
shared(list) sharing attribute explicitly determined by
being listed in a data sharing attribute
copyin(list) clause
reduction(operator:list) • shared(list) : One or more list items must be
shared among all the threads in a team.
2. OpenMP Programming: Directives

#pragma omp parallel [clause[,]clause...] new-line private (list) Clause


Structured-block • private (list) clause declares one or more list
Clause: if(scalar-expression) items must be private to a thread.

num_threads(integer-expression) • A list item that appears in the


reduction clause of a parallel
default(shared|none) construct must not appear in a
private(list) private clause on a work-sharing construct.
firstprivate(list)
shared(list)
copyin(list)
reduction(operator:list)
2. OpenMP Programming: Clauses – default(shared)

// N: total number of iterations Shared : A, B, C


#pragma omp parallel default(shared)
private(threadnum, numthreads, low,high, i) Other variables are default
{
int threadnum = omp_get_thread_num(),
numthreads = omp_get_num_threads(); Shared : A, B, C, N=100

Threadnum =1
int low = N*threadnum/numthreads, Threadnum =0
Numthreads =2 Numthreads =2
high = N*(threadnum+1)/numthreads; Low=50
Low=0
High=50 High=100
for (i=low; i<high; i++)
a[i]=b[i]+c[i]
}
2. OpenMP Programming: Directives – Default(none)

// N: total number of iterations


Shared : A, B, C
#pragma omp parallel default(none)
shared(A,B,C,N) private(threadnum, numthreads, Other variables are default
low,high, i)
{
int threadnum = omp_get_thread_num(),
numthreads = omp_get_num_threads(); Shared : A, B, C, N=100

Threadnum =0 Threadnum =1
int low = N*threadnum/numthreads, Numthreads =2
Numthreads =2
high = N*(threadnum+1)/numthreads; Low=0 Low=50
High=50 High=100
for (i=low; i<high; i++)
a[i]=b[i]+c[i]
}
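
A complete, runnable version of the sketch above (the array type, initialization, and thread count are assumptions added here, not from the slides):

#include <stdio.h>
#include <omp.h>

#define N 100

int main(void)
{
    double a[N], b[N], c[N];
    int threadnum, numthreads, low, high, i;

    for (i = 0; i < N; i++) { b[i] = i; c[i] = 2.0 * i; }  /* test data */

    #pragma omp parallel default(none) shared(a, b, c) \
            private(threadnum, numthreads, low, high, i) num_threads(2)
    {
        threadnum  = omp_get_thread_num();
        numthreads = omp_get_num_threads();
        low  = N *  threadnum      / numthreads;   /* this thread's slice */
        high = N * (threadnum + 1) / numthreads;
        for (i = low; i < high; i++)
            a[i] = b[i] + c[i];
    }

    printf("a[10] = %.1f (expected 30.0)\n", a[10]);
    return 0;
}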
2. OpenMP Programming: Clauses – shared

Since x is shared, a change made in one thread is visible to all other threads.
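
A hedged sketch of what the slide's missing code figure likely showed (the exact code is an assumption):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int x = 10;
    #pragma omp parallel shared(x) num_threads(2)
    {
        if (omp_get_thread_num() == 0)
            x = 15;                        /* one thread updates x        */
        #pragma omp barrier                /* wait until the update lands */
        printf("thread %d sees x = %d\n",  /* every thread now sees 15    */
               omp_get_thread_num(), x);
    }
    return 0;
}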
2. OpenMP Programming: Clauses – private

The variable x is private here, so a change of value performed in one thread is not visible to other threads.
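
A hedged sketch of the behavior (the exact code of the missing figure is an assumption):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int x = 10;
    #pragma omp parallel private(x) num_threads(2)
    {
        x = 100 + omp_get_thread_num();    /* each thread has its own x */
        printf("thread %d: x = %d\n", omp_get_thread_num(), x);
    }
    printf("after the region: x = %d\n", x);   /* original x unchanged  */
    return 0;
}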
2. OpenMP Programming: Clauses – private

The variable x is private here. The value of x is 10 before the parallel region. Since x is private, the value 10 is not seen by the threads: when the parallel region is entered, each thread's x contains a garbage value because it is not initialized.
2. OpenMP Programming: Clauses – shared

x is shared and is assigned the value 15 inside the parallel region. Each thread assigns 15 and then updates it with x = x + 1, but because the threads race on the shared variable, an update made by one thread may not be reflected in the others.
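
A hedged sketch of the race described above (the exact code is an assumption; an atomic or critical construct would make the update safe):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int x = 0;
    #pragma omp parallel shared(x) num_threads(4)
    {
        x = 15;       /* every thread overwrites x                 */
        x = x + 1;    /* unsynchronized read-modify-write: updates */
                      /* from other threads can be lost            */
        printf("thread %d sees x = %d\n", omp_get_thread_num(), x);
    }
    printf("final x = %d (unpredictable)\n", x);
    return 0;
}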
2. OpenMP Programming: Clauses – shared

x is shared and there is no competing assignment in the parallel region, so the update is reflected in all other threads. Synchronization is the job of the programmer.
2. OpenMP Programming: Clauses

The firstprivate(list) clause:
• firstprivate(list) declares one or more list items to be private to a thread and initializes each of them with the value that the corresponding original item has when the construct is encountered.
• For a firstprivate clause on a parallel construct, the initial value of the new item is the value of the original list item that exists immediately prior to the parallel construct for the thread that encounters the construct.
2. OpenMP Programming: Clauses – private

x is private. On entering the parallel region, each thread's copy of x holds some arbitrary value.
2. OpenMP Programming: Clauses – private

x is declared but not initialized, so it already holds a garbage value; inside the region, x is private to each thread.
2. OpenMP Programming: Clauses – firstprivate

firstprivate works as follows:
1. Each thread's copy is initialized with the value the variable has in the master thread before the parallel region.
2. The copy is private to the thread once it enters the parallel region.
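
A hedged sketch contrasting firstprivate with plain private (the exact code of the missing figure is an assumption):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int x = 10;
    #pragma omp parallel firstprivate(x) num_threads(2)
    {
        /* each thread's x starts at 10, copied from the original; */
        /* with private(x) it would start uninitialized instead    */
        x += omp_get_thread_num();
        printf("thread %d: x = %d\n", omp_get_thread_num(), x);
    }
    printf("after the region: x = %d\n", x);   /* original unchanged */
    return 0;
}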
2. OpenMP Programming: Directives – for

A loop construct specifies that the iterations of the loop will be distributed among and executed by the encountering team of threads.

#pragma omp for [clause[[,] clause]...] new-line
    for-loops

Clauses:
  private(list)
  firstprivate(list)
  lastprivate(list)
  reduction(operator:list)
  schedule(kind[, chunk_size])
  collapse(n)
  ordered
  nowait

Example:

#pragma omp parallel
#pragma omp for
for (i = 0; i < N; i++) {
    // do something with i
}
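
A runnable sketch combining the loop construct with a reduction clause (N, sum, and the loop body are assumptions added here, not from the slides):

#include <stdio.h>
#include <omp.h>

#define N 1000

int main(void)
{
    int i;
    long sum = 0;

    #pragma omp parallel for reduction(+:sum)  /* iterations split among */
    for (i = 0; i < N; i++)                    /* threads; partial sums  */
        sum += i;                              /* combined at the end    */

    printf("sum = %ld (expected %d)\n", sum, (N - 1) * N / 2);
    return 0;
}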
Index

• OpenMP
• Program Structure
• Directives: parallel, for
• Clauses
  • if
  • private
  • shared
  • default(shared/none)
  • firstprivate
  • num_threads
Reference
Text Books and/or Reference Books:
1. Professional CUDA C Programming – John Cheng, Max Grossman, Ty McKercher, 2014
2. Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers – B. Wilkinson, M. Allen, Pearson Education, 1999
3. Designing and Building Parallel Programs – I. Foster, 2003
4. Parallel Programming in C with MPI and OpenMP – Michael J. Quinn, 2004
5. Introduction to Parallel Programming – Peter S. Pacheco, Morgan Kaufmann Publishers, 2011
6. Advanced Computer Architectures: A Design Approach – Dezso Sima, Terence Fountain, Peter Kacsuk, 2002
7. Parallel Computer Architecture: A Hardware/Software Approach – David E. Culler, Jaswinder Pal Singh, Anoop Gupta, 2011
8. Introduction to Parallel Computing – Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar, Pearson, 2011
Reference
Acknowledgements:
1. Introduction to OpenMP, https://www3.nd.edu/~zxu2/acms60212-40212/Lec-12-OpenMP.pdf
2. Introduction to Parallel Programming for Shared Memory Machines, https://www.youtube.com/watch?v=LL3TAHpxOig
3. OpenMP Application Program Interface, Version 2.5, May 2005
4. OpenMP Application Program Interface, Version 5.0, November 2018
