
CS 3006

Parallel and Distributed Computing


Lecture 10
Danyal Farhat
FAST School of Computing
NUCES Lahore
Shared Memory Programming with
OpenMP
Lecture’s Agenda
• Shared Memory Programming
• Shared-Memory Programming (Fork/Join Parallelism)
• Shared-Memory vs. Message Passing Model
• Introduction to OpenMP
• OpenMP Solution Stack Model
• Implementation using Visual Studio C++
• How does OpenMP work?
Pragmas
parallel for Pragma
Lecture’s Agenda (Cont.)
• Canonical [allowed] Shape of for-loop Condition
• Shared and Private Variables
• Basic Library Functions
• Program Structure
• First Program: Hello World
• Home Task
• Summary
• Additional Resources
Shared Memory Programming
• Physically: processors in a computer share access to the same RAM
• Virtually: threads running on the processors interact with one
another via shared variables in the common address space of a
single process
• Making performance improvements to serial code tends to be
easier with multithreading than with the message-passing paradigm
Message passing usually requires the code/algorithm to be redesigned
Multithreading allows incremental parallelism: take it one loop at a time
• Clusters today are commonly made up of multiple processors per
compute node; using OpenMP together with MPI is a strategy to improve
performance at both levels (i.e., shared and distributed memory)
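A minimal hybrid sketch (not from the slides; the problem, variable names, and sizes are assumed for illustration): MPI splits the data across processes at the distributed-memory level, while OpenMP threads share each process's portion at the shared-memory level.

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);                       /* distributed-memory level */
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int N = 1000000;                        /* assumed problem: sum 0 .. N-1 */
    int chunk = N / nprocs;
    int lo = rank * chunk;                        /* each rank takes one block      */
    int hi = (rank == nprocs - 1) ? N : lo + chunk;

    double local_sum = 0.0;
    #pragma omp parallel for reduction(+:local_sum)   /* shared-memory level */
    for (int i = lo; i < hi; i++)
        local_sum += i;                           /* the rank's threads share this loop */

    double total = 0.0;
    MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("total = %.0f\n", total);

    MPI_Finalize();
    return 0;
}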
Shared Memory Programming (Cont.)
Shared-Memory Programming (Fork/Join Parallelism)
• Multi-threaded programming is the most common shared-memory programming methodology
• A serial code begins execution; the process is the master thread, or only executing thread
• When a parallel portion of the code is reached, the master thread can “fork” more threads to work on it
• When the parallel portion of the code has completed, the threads “join” again, and the master thread continues executing the serial code
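An illustrative sketch of this fork/join pattern (the thread count of 4 is an assumption, not part of the original slide): serial code runs in the master thread, the parallel region forks a team of threads, and execution joins back to the master afterwards.

#include <omp.h>
#include <stdio.h>

int main(void)
{
    printf("serial part: master thread only\n");    /* before the fork */

    #pragma omp parallel num_threads(4)             /* fork: 4 threads execute this block */
    {
        printf("parallel part: thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }                                               /* implicit join at the end of the region */

    printf("serial part again: master thread only\n");
    return 0;
}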
Shared-Memory Model vs. Message-Passing Model
Shared-Memory Model
• One active thread at start and end of the program
• Number of active threads inside program changes dynamically during
execution
• Supports incremental parallelism
The process of converting a sequential program to a parallel program a little bit at
a time
Message-Passing Model
• All processes remain active throughout execution of program
• Sequential-to-parallel transformation requires major effort
• No incremental parallelism
Transformation done in one giant step rather than many tiny steps
Introduction to OpenMP
• OpenMP has emerged as a standard method for shared-memory
programming
Just as MPI has become the standard for distributed-memory
programming
Codes are portable
Performance is usually good enough
• Compiler support
C, C++ & Fortran
Intel (icc -openmp) and GNU (gcc -fopenmp)
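As a usage sketch (the file name is an assumption): a program saved as hello.c can typically be built with gcc -fopenmp hello.c -o hello and then run directly; the OMP_NUM_THREADS environment variable (e.g., OMP_NUM_THREADS=4) tells the runtime how many threads to create.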
Introduction to OpenMP (Cont.)
• OpenMP (Open Multi-Processing) is an API for writing multithreaded applications
• Provides an implementation model to distribute and decompose the work across multiple processors
• Uses threads to deploy work
• Greatly simplifies writing multi-threaded (MT) programs in Fortran, C and C++
Introduction to OpenMP (Cont.)
• The OpenMP API is based on:
A set of compiler directives for depicting the parallelism in the source code
A library of subroutines
A set of environment variables
• OpenMP directives in C and C++ are based on #pragma compiler directives
• Works on the basis of the Fork-Join model
OpenMP Solution Stack Model

Source: http://openmp.org/mp-documents/omp-hands-on-SC08.pdf
Implementation using Visual Studio C++
• Turn on OpenMP support in Visual Studio
• Project Properties → Configuration Properties → C/C++ → Language → OpenMP Support
How does OpenMP work?
• C programs often express data-parallel operations as for loops
for (i = first; i < size; i += 2)
    marked[i] = 1;
• OpenMP makes it easy to indicate when the iterations of a loop may execute in parallel
• Compiler takes care of generating code that forks/joins threads and allocates the iterations to threads
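As a sketch reusing the loop and variables above (declarations assumed), a single pragma is enough to mark the iterations for parallel execution; the pragma itself is introduced on the next slides.

#pragma omp parallel for        /* compiler generates the fork/join and divides the iterations */
for (i = first; i < size; i += 2)
    marked[i] = 1;              /* iterations are independent, so they may run in any order */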
How does OpenMP work? - Pragmas
• Pragma: a compiler directive in C or C++
Stands for “pragmatic information”
✔ Pragmatic information: practical information, realistic insight
A way for the programmer to communicate with the compiler
Compiler is free to ignore pragmas
• Syntax:
#pragma omp <rest of pragma>
• A pragma precedes a region that can be parallelized, flagging to the
compiler that performing its operations in parallel does not affect the
program semantics (i.e., doesn’t affect the program’s logic).
How does OpenMP work? - parallel for Pragma
• Format:
#pragma omp parallel for
for (i = 0; i < N; i++)
    a[i] = b[i] + c[i];
• The compiler must be able to determine the total number of iterations before the loop executes in order to parallelize it
• The body of the for-loop must not allow premature exits (e.g., break, return, exit, or goto statements are not allowed)
• However, loops with a ‘continue’ statement are allowed, as it does not cause a premature exit
Canonical [allowed] Shape of for-loop Condition
• Canonical shape: allowed shape, traditional shape, according to rules
• In the form sketched below, ‘inc’ can be any constant value
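A sketch of the canonical form, based on the general OpenMP loop rules (the original slide shows this as a figure; variable names here are illustrative):

for (index = start; index < end; index += inc)

The test may also be written with <=, > or >=, and the increment may equivalently be index++, ++index, index--, --index, index -= inc, or index = index + inc; the point is that the trip count can be computed before the loop starts.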
Shared and Private Variables
Shared Variable:
• Has the same address in the execution context of every thread
Private Variable:
• Has a different address in the execution context of every thread
• A thread cannot access the private variables of another thread
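A hedged sketch (variable names and loop bound are assumptions): the private clause gives each thread its own copy of a variable, while shared variables are seen by all threads.

#include <omp.h>
#include <stdio.h>

int main(void)
{
    int n = 8;                                    /* shared: one copy seen by all threads */
    int tmp;                                      /* listed as private below: one copy per thread */

    #pragma omp parallel for private(tmp) shared(n)
    for (int i = 0; i < n; i++) {                 /* the loop index i is private automatically */
        tmp = i * i;                              /* each thread writes only its own tmp */
        printf("%d squared is %d\n", i, tmp);
    }
    return 0;
}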
Basic Library Functions
• omp_get_num_procs
int procs = omp_get_num_procs();   // number of CPUs/cores in the machine
• omp_get_num_threads
int threads = omp_get_num_threads();   // number of active threads
                                       // should be called from a parallel region
• omp_get_max_threads
printf("Only %d threads can be forked\n", omp_get_max_threads());
// can be called outside of a parallel region
Basic Library Functions (Cont.)
• omp_get_thread_num
printf("Hello from thread id %d\n", omp_get_thread_num());   // returns the thread id
• omp_set_num_threads
omp_set_dynamic(0);        // disable dynamic adjustment of the thread count
omp_set_num_threads(4);    // set the thread count to 4


Program Structure

Source: Introduction to Parallel Computing (Karypis and Co.)


First Program: Hello World
#include <omp.h>
#include <cstdio>      // for printf
#include <iostream>
using namespace std;

int main()
{
    omp_set_num_threads(4);              // runtime function to request a certain number of threads
    #pragma omp parallel
    {
        int Id = omp_get_thread_num();   // runtime function returning a thread ID
        printf("hello(%d)", Id);
        printf("world(%d)\n", Id);
    }
    return 0;
}
First Program: Hello World – 2nd Example
#include <omp.h>
#include <cstdio>      // for printf
int numT;

int main()
{
    #pragma omp parallel num_threads(4)   // clause to request a certain number of threads
    {
        int Id = omp_get_thread_num();
        numT = omp_get_num_threads();     // runtime function returning no. of threads actually created
        printf("hello(%d)", Id);
        printf("world(%d)\n", Id);
    }
    return 0;
}
Activity
•Write an OpenMP program that creates 2 threads to run in parallel
and displays the thread id of each created thread.
You must write the complete code.
Summary
•Shared Memory Programming
Processors share RAM; threads share variables
Performance improvement is easier with multithreading than with the
message-passing paradigm
•Shared-Memory Programming (Fork/Join Parallelism)
Using "fork", master thread creates worker threads
Using "join", worker threads return to the master thread
•Shared-Memory vs. Message Passing Model
Shared-Memory Model: One active thread at start and end of program,
incremental parallelism
Message-Passing Model: All processes remain active throughout execution
of the program, no incremental parallelism
Summary (Cont.)
•Introduction to Open Multi-Processing (OpenMP)
OpenMP is an API for writing multithreaded applications
The OpenMP API is based on:
✔ A set of compiler directives for depicting the parallelism in the source code
✔ A library of subroutines
✔ A set of Environment Variables

•OpenMP Solution Stack Model


•Implementation using Visual Studio C++
•How does OpenMP work?
C programs often express data-parallel operations as for loops
OpenMP makes it easy to indicate when the iterations of a loop may
execute in parallel
Summary (Cont.)
•Pragmas
A compiler directive in C or C++
A way for the programmer to communicate with the compiler
•parallel for Pragma
•Canonical [allowed] Shape of for-loop Condition
•Shared and Private Variables
Shared Variable: Has same address in execution context of every thread
Private Variable: Has different address in execution context of every thread
Summary (Cont.)
•Basic Library Functions
omp_get_num_procs
omp_get_num_threads
omp_get_max_threads
omp_get_thread_num
omp_set_num_threads
•Program Structure
•First Program: Hello World
•Home Task
Additional Resources
•Introduction to Parallel Computing by Ananth Grama, Anshul Gupta,
George Karypis, and Vipin Kumar

Chapter 7: Programming Shared Address Space Platforms

•Introduction to OpenMP by Tim Mattson (Intel)

https://www.youtube.com/watch?v=nE-xN4Bf8XI&list=PLLX-Q6B8xqZ8n8bwjGdzBJ25X2utwnoEG&index=1
Questions?
