Parallel Computing
Tien-Hsiung Weng
翁添雄
Dept. of Computer Science and Information Engineering, Da-Yeh University
[email protected]
Lecture 4
Programming Shared-Address
Space with OpenMP
Topics
• Introduction to OpenMP
• OpenMP directives
– specifying concurrency
• parallel regions
• loops, task parallelism
• Synchronization directives
– reductions, barrier, critical, ordered
• Data handling clauses
– shared, private, firstprivate, lastprivate
• Library primitives
• Environment variables
• Example of SPMD style OpenMP program
Introduction to OpenMP
• Open specifications for Multi Processing
• An API for explicit multi-threaded, shared memory parallelism
• Three components (see the sketch below)
– compiler directives
– runtime library routines
– environment variables
• Higher-level programming model than Pthreads
– support for concurrency, synchronization, and data handling
– no need to explicitly manage mutexes, condition variables, data scoping, and initialization
• Portable
– API is specified for C/C++ and Fortran
– implementations on many platforms (most Unix, Windows NT)
• Standardized
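A minimal sketch touching all three components (the file name, compile command, and printed text are illustrative, not from the lecture): a compiler directive, a runtime library routine, and the OMP_NUM_THREADS environment variable.

/* compile: gcc -fopenmp hello_omp.c -o hello_omp       (directives enabled)
   run:     OMP_NUM_THREADS=4 ./hello_omp               (environment variable) */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    #pragma omp parallel                      /* compiler directive */
    printf("hello from thread %d\n",
           omp_get_thread_num());             /* runtime library routine */
    return 0;
}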
Introduction to OpenMP
• Parallelism is explicit
– It is not an automatic parallel programming model
– the programmer has full control of (and responsibility for)
parallelization
• No data locality control
– not guaranteed to make the most efficient use of shared
memory
• Not necessarily implemented identically by all
vendors
• Designed for shared-address-space machines
– Not for distributed memory parallel systems (by itself)
Introduction to OpenMP
• Advantages
– Ease of use
– Enables incremental parallelization of a
serial program
– Supports both coarse-grain and fine-grain
parallelism
– Portable
– Standard
OpenMP: Fork-Join Parallelism
• OpenMP program begins execution as a single
master thread
• Master thread executes sequentially until 1st parallel
region
• When a parallel region is encountered, master thread
– creates a group of threads
– becomes the master of this group of threads
– is assigned the thread id 0 within the group
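The following sketch (an assumed example, not from the slides) makes the fork-join behavior visible: the master thread runs the serial parts alone, forks a team at the parallel region, and joins back afterwards.

#include <stdio.h>
#include <omp.h>

int main(void)
{
    printf("serial part: master thread only\n");

    #pragma omp parallel        /* fork: a team of threads is created here */
    {
        printf("parallel region: thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }                           /* join: implicit barrier, team disbands */

    printf("serial part again: back to the master thread\n");
    return 0;
}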
OpenMP Directive Format
• C and C++ use compiler directives
– prefix: #pragma …
• Fortran uses significant comments
– prefixes: !$OMP, C$OMP, *$OMP
• A directive consists of a directive name
followed by clauses
• C:
– #pragma omp parallel default(shared) private(i,j)
• Fortran:
– !$OMP PARALLEL DEFAULT(SHARED)
PRIVATE(i,j)
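A short sketch showing the C directive above in a compilable context (the use of the thread id and team size inside the region is illustrative):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int i, j;

    #pragma omp parallel default(shared) private(i,j)
    {
        i = omp_get_thread_num();   /* private: each thread has its own i and j */
        j = omp_get_num_threads();
        printf("thread %d of %d\n", i, j);
    }
    return 0;
}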
OpenMP parallel Region
Directives
#pragma omp parallel [clause list]
Meaning
• if (is_parallel == 1) num_threads(8)
– if the value of the variable is_parallel is one, create 8 threads; otherwise the region is executed by a single thread
• private (a) shared (b)
– each thread gets its own private copy of variable a
– all threads share a single copy of variable b
• firstprivate(c)
– each private copy of c is initialized with the value of c in the master thread when the parallel
directive is encountered
• default(none)
– default state of a variable is specified as none (rather than shared)
– signals error if not all variables are specified as shared or private
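A combined sketch of the clauses above in one directive (is_parallel, a, b, c and their values are placeholders; is_parallel is listed as shared only to satisfy default(none)):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int is_parallel = 1;
    int a = -1, b = -1, c = 5;

    #pragma omp parallel if (is_parallel == 1) num_threads(8) \
            default(none) private(a) shared(b, is_parallel) firstprivate(c)
    {
        a = omp_get_thread_num();      /* private: each thread writes its own a     */
        c = c + a;                     /* firstprivate: every copy of c starts at 5 */
        if (a == 0)                    /* shared: a single copy of b, written       */
            b = omp_get_num_threads(); /* by thread 0 only                          */
    }

    /* outside the region, a and c keep their original values; b was shared */
    printf("a = %d, c = %d, b = %d\n", a, c, b);
    return 0;
}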
OpenMP Programming Model
[side-by-side OpenMP vs. Pthread comparison; only the labels and the Pthread-style code below survive the extraction]
void F(int n, int a[], int b[])
{
    int i, tmp;                /* local allocation on the stack: private to the thread */
    for (i = 0; i < n; i++) {  /* swap a[i] and b[i] through the private temporary */
        tmp  = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
}
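For the OpenMP side of the comparison, a hedged sketch of the same loop parallelized with a directive (the parallel for construct and private clause are assumptions of this example, not taken from the original slide):

void F_omp(int n, int a[], int b[])
{
    int i, tmp;
    /* tmp is made private so each thread keeps its own temporary,
       just like a stack-local variable in a Pthread function */
    #pragma omp parallel for private(tmp)
    for (i = 0; i < n; i++) {
        tmp  = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
}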
OpenMP Library Primitives
Processor count
int omp_get_num_procs(); /* number of processors currently available */
int omp_in_parallel(); /* returns nonzero when called inside a parallel region */
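A usage sketch for the two routines (assumed context, not from the slides):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    printf("processors available: %d\n", omp_get_num_procs());
    printf("inside a parallel region? %d\n", omp_in_parallel());        /* 0 here */

    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0)
            printf("inside a parallel region? %d\n", omp_in_parallel()); /* 1 here */
    }
    return 0;
}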
OpenMP Environment Variables
• OMP_NUM_THREADS
– specifies the default number of threads for a
parallel region
• OMP_DYNAMIC
– specifies whether the number of threads can be
dynamically adjusted
• OMP_NESTED
– enables nested parallelism
• OMP_SCHEDULE
– specifies the schedule of parallel for loops whose
schedule clause is set to runtime
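A sketch showing these variables in action (the shell commands in the comment and the loop bounds are illustrative):

/* run, for example, as:
     export OMP_NUM_THREADS=4
     export OMP_SCHEDULE="static,16"
     ./a.out
*/
#include <stdio.h>
#include <omp.h>

int main(void)
{
    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0)
            printf("team size from OMP_NUM_THREADS: %d\n",
                   omp_get_num_threads());

        /* schedule(runtime) defers the loop schedule to OMP_SCHEDULE */
        #pragma omp for schedule(runtime)
        for (int i = 0; i < 64; i++)
            if (i % 16 == 0)
                printf("iteration %2d on thread %d\n", i, omp_get_thread_num());
    }
    return 0;
}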
OpenMP SPMD Style
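A hedged sketch of the SPMD style previewed by this title (the array size and partitioning are assumptions of this example): every thread executes the same code and uses its thread id to select its own slice of the data, instead of relying on a worksharing for directive.

#include <stdio.h>
#include <omp.h>
#define N 1000

int main(void)
{
    static double a[N];

    #pragma omp parallel
    {
        int id    = omp_get_thread_num();
        int nth   = omp_get_num_threads();
        int chunk = (N + nth - 1) / nth;            /* ceiling division */
        int start = id * chunk;
        int end   = (start + chunk < N) ? start + chunk : N;

        for (int i = start; i < end; i++)           /* each thread: its own slice */
            a[i] = 2.0 * i;
    }

    printf("a[N-1] = %f\n", a[N - 1]);
    return 0;
}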