OpenMP

OpenMP is a standard for shared memory parallel programming using compiler directives. It uses a fork-join model where the master thread forks additional threads to execute a parallel region. OpenMP consists of work-sharing constructs like parallel loops to distribute work, synchronization constructs to coordinate threads, and data environment constructs to specify data scope. It supports both loop-level and task-based parallelism and can be used to write hybrid MPI/OpenMP programs.

Shared Memory Parallelism - OpenMP

Sathish Vadhiyar
Credits/Sources:
OpenMP C/C++ standard (openmp.org)
OpenMP tutorial (http://www.llnl.gov/computing/tutorials/openMP/#Introduction)
OpenMP SC99 tutorial presentation (openmp.org)
Dr. Eric Strohmaier (University of Tennessee, CS594 class, Feb 9, 2000)
Introduction

 A portable programming model and standard for shared memory programming using compiler directives
 Directives: constructs or statements in the program that apply some action to a block of code
 A specification for a set of compiler directives, library routines, and environment variables – standardizing pragmas
 Easy to program: a developer can convert a sequential program to a parallel one incrementally by adding directives
 First version in 1997; development over the years up to the latest version, 4.5, in 2015
Fork-Join Model
 Begins as a single thread called the master thread
 Fork: when a parallel construct is encountered, a team of threads is created
 Statements in the parallel region are executed in parallel
 Join: at the end of the parallel region, the team threads synchronize and terminate
OpenMP consists of…

 Work-sharing constructs
 Synchronization constructs
 Data environment constructs
 Library calls, environment variables
Introduction
 Mainly supports loop-level parallelism
 Specifies parallelism for a region of code: fine-grained parallelism
 The number of threads can be varied from one region to another – dynamic parallelism
 Speedup is limited by Amdahl's law – sequential portions remain in the code
 Applications have varying phases of parallelism
 Also supports
 Coarse-grained parallelism – sections and tasks
 Execution on accelerators
 SIMD vectorization
 Task-core affinity
parallel construct

#pragma omp parallel [clause [, clause] …] new-line
    structured-block

 Clauses include if, num_threads, default, private, firstprivate, shared, reduction, copyin
 Can support nested parallelism
Parallel construct - Example
#include <omp.h>
#include <stdio.h>

int main() {
    int nthreads, tid;

    #pragma omp parallel private(nthreads, tid)
    {
        tid = omp_get_thread_num();              /* id of this thread */
        printf("Hello World from thread %d\n", tid);

        if (tid == 0) {                          /* only the master thread */
            nthreads = omp_get_num_threads();
            printf("Number of threads = %d\n", nthreads);
        }
    }
    return 0;
}
Work sharing construct

 For distributing the execution among the threads that encounter it
 3 types of work sharing constructs – loops, sections, single
for construct

 For distributing the loop iterations among the threads

#pragma omp for [clause [, clause] …] new-line
    for-loop

 Clauses include schedule, private, firstprivate, lastprivate, reduction, ordered, nowait
for construct

 Restrictions on the structure of the for loop so that the compiler can determine the number of iterations – e.g. no branching out of the loop
 The assignment of iterations to threads depends on the schedule clause
 Implicit barrier at the end of the for construct unless nowait is specified
schedule clause

1. schedule(static, chunk_size) – iterations divided into chunks of chunk_size, distributed to threads in round-robin order
2. schedule(dynamic, chunk_size) – same chunks, but distributed dynamically as threads become idle
3. schedule(runtime) – decision deferred to runtime (e.g. via the OMP_SCHEDULE environment variable); implementation dependent
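
A small sketch comparing the schedule kinds on the same loop (the work() function, N, and the chunk size of 4 are illustrative choices, not from the original slides):

#include <omp.h>
#include <stdio.h>
#define N 16

static void work(int i) { printf("iter %2d on thread %d\n", i, omp_get_thread_num()); }

int main() {
    int i;
    #pragma omp parallel private(i)
    {
        /* static: 4-iteration chunks assigned round-robin, fixed at loop entry */
        #pragma omp for schedule(static, 4)
        for (i = 0; i < N; i++) work(i);

        /* dynamic: an idle thread grabs the next 4-iteration chunk */
        #pragma omp for schedule(dynamic, 4)
        for (i = 0; i < N; i++) work(i);

        /* runtime: schedule taken from the OMP_SCHEDULE environment variable */
        #pragma omp for schedule(runtime)
        for (i = 0; i < N; i++) work(i);
    }
    return 0;
}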
for - Example

#include <omp.h>
#define CHUNKSIZE 100
#define N 1000

int main() {
    int i, chunk;
    float a[N], b[N], c[N];

    /* Some initializations */
    for (i = 0; i < N; i++)
        a[i] = b[i] = i * 1.0;

    chunk = CHUNKSIZE;

    #pragma omp parallel shared(a,b,c,chunk) private(i)
    {
        #pragma omp for schedule(dynamic,chunk) nowait
        for (i = 0; i < N; i++)
            c[i] = a[i] + b[i];
    } /* end of parallel region */

    return 0;
}
Coarse-grained parallelism – sections and tasks
 sections
 tasks – a dynamic mechanism for creating units of work
 depend clause for task
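
A minimal sketch of both constructs (variable names and output strings are illustrative): sections give each block to one thread of the team, while tasks with depend clauses order dynamically created units of work.

#include <omp.h>
#include <stdio.h>

int main() {
    int x = 0;                      /* shared in the parallel regions below */

    /* sections: each section block is executed once, by some thread of the team */
    #pragma omp parallel sections
    {
        #pragma omp section
        printf("section A on thread %d\n", omp_get_thread_num());
        #pragma omp section
        printf("section B on thread %d\n", omp_get_thread_num());
    }

    /* tasks: created by one thread, executed by any thread of the team;
       depend clauses order tasks that touch the same data */
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: x)     /* producer */
        x = 10;
        #pragma omp task depend(in: x)      /* consumer: runs only after the producer */
        printf("x = %d\n", x);
    }   /* tasks are complete at the implicit barrier of single */
    return 0;
}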


Synchronization directives
flush directive

 A point at which a consistent view of memory is provided among the threads
 Thread-visible variables (global variables, shared variables, etc.) are written back to memory
 If a var-list is given, only the variables in the list are flushed
flush – Example
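
A minimal producer/consumer sketch of the flush directive (the data and flag variable names are illustrative):

#include <omp.h>
#include <stdio.h>

int main() {
    int data = 0, flag = 0;

    #pragma omp parallel sections num_threads(2) shared(data, flag)
    {
        #pragma omp section
        {                                    /* producer */
            data = 42;
            #pragma omp flush(data)          /* make data visible before the flag is set */
            flag = 1;
            #pragma omp flush(flag)
        }
        #pragma omp section
        {                                    /* consumer */
            int ready = 0;
            while (!ready) {
                #pragma omp flush(flag)      /* re-read the flag from memory */
                ready = flag;
            }
            #pragma omp flush(data)          /* ensure data is up to date before use */
            printf("data = %d\n", data);
        }
    }
    return 0;
}

Newer OpenMP code would additionally mark the flag accesses with atomic constructs.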
Data Scope Attribute Clauses

 Most variables are shared by default
 Data scopes are explicitly specified with data scope attribute clauses
 Clauses:
1. private
2. firstprivate
3. lastprivate
4. shared
5. default
6. reduction
7. copyin
8. copyprivate
threadprivate
 Global variables in the variable-list are made private to each thread
 Each thread gets its own copy
 The copies persist between different parallel regions

#include <omp.h>
#include <stdio.h>

int alpha[10], beta[10], i;
#pragma omp threadprivate(alpha)

int main() {
    /* Explicitly turn off dynamic threads */
    omp_set_dynamic(0);

    /* First parallel region */
    #pragma omp parallel private(i,beta)
    for (i = 0; i < 10; i++) alpha[i] = beta[i] = i;

    /* Second parallel region: alpha persists (threadprivate), beta does not */
    #pragma omp parallel
    printf("alpha[3]= %d and beta[3]= %d\n", alpha[3], beta[3]);

    return 0;
}
private, firstprivate & lastprivate

 private (variable-list)
 variable-list is private to each thread
 A new object with automatic storage duration is allocated for the construct

 firstprivate (variable-list)
 The new private object is initialized with the value of the original object that existed prior to the construct

 lastprivate (variable-list)
 The value of the private object from the last iteration or the last section is assigned back to the original object
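
A short sketch of firstprivate and lastprivate together (variable names are illustrative):

#include <omp.h>
#include <stdio.h>

int main() {
    int i, x = 10, last = -1;

    /* firstprivate: each thread's private x starts at 10;
       lastprivate: after the loop, last holds the value from the final iteration */
    #pragma omp parallel for firstprivate(x) lastprivate(last)
    for (i = 0; i < 8; i++) {
        x += i;            /* updates the thread's private copy only */
        last = i;          /* the value from iteration 7 is copied back to the original */
    }

    printf("x = %d (original unchanged), last = %d\n", x, last);
    return 0;
}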
shared, default, reduction

 shared(variable-list)

 default(shared | none)
 Specifies the sharing behavior of all of the variables visible in the
construct

 reduction(op: variable-list)
 Private copies of the variables are made for each thread
 The final value of the original object at the end of the region is the combination (using op) of all the private copies
default - Example
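
A minimal sketch combining default(none) with a reduction clause (array and variable names are illustrative):

#include <omp.h>
#include <stdio.h>
#define N 1000

int main() {
    int i;
    double a[N], sum = 0.0;

    for (i = 0; i < N; i++) a[i] = 1.0;

    /* default(none) forces the scope of every variable to be stated explicitly;
       reduction(+:sum) gives each thread a private sum, combined with + at the end */
    #pragma omp parallel for default(none) shared(a) private(i) reduction(+:sum)
    for (i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %f\n", sum);
    return 0;
}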
Library Routines (API)

 Querying functions (number of threads, etc.)
 General purpose locking routines
 Setting the execution environment (dynamic threads, nested parallelism, etc.)
API

 omp_set_num_threads(num_threads)
 omp_get_num_threads()
 omp_get_max_threads()
 omp_get_thread_num()
 omp_get_num_procs()
 omp_in_parallel()
 omp_set_dynamic(dynamic_threads)
 omp_get_dynamic()
 omp_set_nested(nested)
 omp_get_nested()
API(Contd..)
 omp_init_lock(omp_lock_t *lock)
 omp_init_nest_lock(omp_nest_lock_t *lock)
 omp_destroy_lock(omp_lock_t *lock)
 omp_destroy_nest_lock(omp_nest_lock_t *lock)
 omp_set_lock(omp_lock_t *lock)
 omp_set_nest_lock(omp_nest_lock_t *lock)
 omp_unset_lock(omp_lock_t *lock)
 omp_unset_nest_lock(omp_nest_lock_t *lock)
 omp_test_lock(omp_lock_t *lock)
 omp_test_nest_lock(omp_nest_lock_t *lock)

 omp_get_wtime()
 omp_get_wtick()

 omp_get_thread_num()
 omp_get_num_procs()
 omp_get_num_devices()
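
A short sketch using some of the query and timing routines above (the thread count of 4 is an arbitrary choice):

#include <omp.h>
#include <stdio.h>

int main() {
    double t0 = omp_get_wtime();            /* wall-clock time in seconds */

    omp_set_num_threads(4);                 /* request 4 threads for the next region */

    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0)      /* only the master thread reports */
            printf("%d threads on %d processors\n",
                   omp_get_num_threads(), omp_get_num_procs());
    }

    printf("elapsed: %f s (timer resolution %g s)\n",
           omp_get_wtime() - t0, omp_get_wtick());
    return 0;
}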
Lock details

 Simple locks and nestable locks
 A simple lock may not be set again while it is already in the locked state
 A nestable lock may be set multiple times by the same thread
 Simple locks are available if they are unlocked
 Nestable locks are available if they are unlocked or owned by the calling thread
Example – Nested lock
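
A minimal sketch of a nestable lock (the add/add_twice functions and the counter are illustrative):

#include <omp.h>
#include <stdio.h>

omp_nest_lock_t lck;
int count = 0;

void add(int n) {
    omp_set_nest_lock(&lck);      /* nesting count goes from 1 to 2 when called from add_twice */
    count += n;
    omp_unset_nest_lock(&lck);
}

void add_twice(int n) {
    omp_set_nest_lock(&lck);
    add(n);                       /* re-acquiring the same nestable lock is legal for the owner */
    add(n);
    omp_unset_nest_lock(&lck);
}

int main() {
    omp_init_nest_lock(&lck);

    #pragma omp parallel
    add_twice(1);                 /* each thread adds 2 to count under the lock */

    omp_destroy_nest_lock(&lck);
    printf("count = %d\n", count);
    return 0;
}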
Example 1: Jacobi Solver
Example 2: BFS Version 1 (Nested Parallelism)
Example 3: BFS Version 3 (Using Task Construct)
Hybrid Programming – Combining MPI and OpenMP benefits
 MPI
- explicit parallelism, no synchronization problems
- suitable for coarse grain
 OpenMP
- easy to program, dynamic scheduling allowed
- only for shared memory, data synchronization problems
 MPI/OpenMP Hybrid
- can combine MPI data placement with OpenMP fine-grain parallelism
- suitable for clusters of SMPs ("clumps")
- can implement a hierarchical model
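
A minimal hybrid sketch (assumes an MPI installation and compilation with, e.g., mpicc -fopenmp; the output format is illustrative):

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank;

    /* request a threading level that allows OpenMP threads inside each rank */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* coarse grain across MPI ranks, fine grain within a rank via OpenMP */
    #pragma omp parallel
    printf("rank %d, thread %d of %d\n",
           rank, omp_get_thread_num(), omp_get_num_threads());

    MPI_Finalize();
    return 0;
}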
 END
Definitions

 Construct – a statement containing a directive and a structured block
 Directive – based on C #pragma directives

#pragma <omp id> <other text>
#pragma omp directive-name [clause [, clause] …] new-line

Example:
#pragma omp parallel default(shared) private(beta,pi)
Parallel construct

 A parallel region is executed by multiple threads
 If the num_threads clause, omp_set_num_threads(), or the OMP_NUM_THREADS environment variable is not used, the number of threads created is implementation dependent
 The number of physical processors hosting the threads is also implementation dependent
 Threads are numbered from 0 to N-1
 Nested parallelism by embedding one parallel construct inside another (see the sketch below)
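
A minimal sketch of nested parallelism (team sizes of 2 are arbitrary):

#include <omp.h>
#include <stdio.h>

int main() {
    omp_set_nested(1);                        /* enable nested parallelism */

    #pragma omp parallel num_threads(2)
    {
        int outer = omp_get_thread_num();

        /* each outer thread forks its own inner team */
        #pragma omp parallel num_threads(2)
        printf("outer thread %d, inner thread %d\n", outer, omp_get_thread_num());
    }
    return 0;
}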
