Lecture - 06 (Shared Memory Programming With OpenMP)
Dr. Muhammad Naveed Akhtar
#pragma
• Special preprocessor instructions.
• Typically added to a system to allow behaviors that aren’t part of the basic C specification.
• Compilers that don’t support the pragmas ignore them.
Parallel and Distributed Programming (Dr. M. Naveed Akhtar) 5
Hello World!
• Output with 4 threads (the order of the lines varies from run to run):
Hello from thread 2 of 4
Hello from thread 3 of 4
Hello from thread 0 of 4
Hello from thread 1 of 4
• Execute with only 1 Thread: ./omp_hello 1
Hello from thread 0 of 1
• Execute without specifying Threads: ./omp_hello
Segmentation fault (core dumped)
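The segmentation fault in the last run is consistent with a program that reads argv[1] without first checking that it exists. A minimal sketch of a guarded version follows. The helper name parse_thread_count, the fallback of one thread, and the upper bound of 1024 are assumptions for illustration, not part of the lecture's omp_hello source.

```c
#include <stdlib.h>

/* Hypothetical helper: return the thread count given on the command line,
 * or 1 when the argument is missing, malformed, or out of range. */
int parse_thread_count(int argc, char *argv[]) {
    if (argc < 2) {
        return 1;               /* no argument: never touch argv[1] */
    }
    char *end = NULL;
    long value = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0' || value < 1 || value > 1024) {
        return 1;               /* not a number, trailing junk, or out of range */
    }
    return (int) value;
}
```

Calling this from main before the parallel region would let ./omp_hello run with a single thread instead of crashing.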
OpenMP fork/join Model
Clause
• Text that modifies a directive.
• The num_threads clause can be added to a parallel directive.
• It allows the programmer to specify the number of threads that should execute the following block.
# pragma omp parallel num_threads ( thread_count )
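The num_threads clause can be sketched in a small function that forks a team, lets each thread print a greeting, and joins again. The function name hello_team is an assumption for illustration. The _OPENMP guards let the same file build and run with a single thread when OpenMP is not enabled, and the runtime may grant fewer threads than requested.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Fork a team of at most thread_count threads; each prints a greeting.
 * Returns the number of threads that actually ran the block. */
int hello_team(int thread_count) {
    int greeted = 0;

#   pragma omp parallel num_threads(thread_count)
    {
#ifdef _OPENMP
        int my_rank = omp_get_thread_num();
        int team_size = omp_get_num_threads();
#else
        int my_rank = 0;        /* compiled without OpenMP: one thread */
        int team_size = 1;
#endif
        printf("Hello from thread %d of %d\n", my_rank, team_size);

#       pragma omp atomic
        greeted++;              /* shared counter, updated safely */
    }   /* implicit barrier: the team joins here */

    return greeted;
}
```

With gcc the OpenMP version is built with -fopenmp. Without that flag the pragmas are ignored and the function greets once.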
We can avoid this problem by declaring a private variable inside the parallel block and moving the critical section after the function call.
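The pattern can be sketched with a trapezoidal-rule integration, assuming that is the computation being shared among the threads. The integrand f(x) = x * x and the names local_trap, trap, and my_result are assumptions for illustration. Each thread stores its partial result in a private variable, so only the final addition sits inside the critical section.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Assumed integrand for illustration: f(x) = x * x. */
static double f(double x) { return x * x; }

/* Each thread integrates its own sub-interval of [a, b]. */
static double local_trap(double a, double b, int n) {
#ifdef _OPENMP
    int my_rank = omp_get_thread_num();
    int thread_count = omp_get_num_threads();
#else
    int my_rank = 0, thread_count = 1;
#endif
    double h = (b - a) / n;
    int local_n = n / thread_count;  /* assumes n divisible by thread_count */
    double local_a = a + my_rank * local_n * h;
    double local_b = local_a + local_n * h;

    double sum = (f(local_a) + f(local_b)) / 2.0;
    for (int i = 1; i < local_n; i++) {
        sum += f(local_a + i * h);
    }
    return sum * h;
}

double trap(double a, double b, int n, int thread_count) {
    double global_result = 0.0;

#   pragma omp parallel num_threads(thread_count)
    {
        double my_result = local_trap(a, b, n);  /* private; runs in parallel */

#       pragma omp critical
        global_result += my_result;              /* only the add is serialized */
    }
    return global_result;
}
```

Had the function call itself been placed inside the critical section, the threads would have integrated their sub-intervals one at a time.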
Serial Approach
What happened?
• OpenMP compilers don’t check for dependences among iterations in a loop that’s being parallelized with a parallel for directive.
• A loop in which the results of one or more iterations depend on other iterations cannot, in general, be correctly parallelized by OpenMP.
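A small loop with this kind of dependence is the Fibonacci recurrence, used here as an assumed example. The function name fill_fibo is hypothetical. The loop is deliberately left serial, and the comment explains why a parallel for directive would be unsafe.

```c
/* Fill fibo[0..n-1] with Fibonacci numbers (requires n >= 2).
 * Iteration i reads fibo[i - 1] and fibo[i - 2], which earlier iterations
 * write: a loop-carried dependence. Putting
 *     # pragma omp parallel for
 * on this loop would compile, but threads could read entries that have not
 * been computed yet, so the loop is kept serial. */
void fill_fibo(long fibo[], int n) {
    fibo[0] = fibo[1] = 1;
    for (int i = 2; i < n; i++) {
        fibo[i] = fibo[i - 1] + fibo[i - 2];
    }
}
```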
Serial Solution
OpenMP solution #1
loop dependency
OpenMP solution #2
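A sketch of how such a dependence can be removed, assuming the computation is the alternating series 4 * (1 - 1/3 + 1/5 - ...) for pi. Instead of flipping the sign from one iteration to the next, each iteration derives the sign from its own index k, and factor is made private so threads do not overwrite each other's value. The function name estimate_pi is an assumption for illustration.

```c
/* Estimate pi with n terms of the series 4 * (1 - 1/3 + 1/5 - 1/7 + ...). */
double estimate_pi(long n, int thread_count) {
    double factor;
    double sum = 0.0;

#   pragma omp parallel for num_threads(thread_count) \
        reduction(+: sum) private(factor)
    for (long k = 0; k < n; k++) {
        factor = (k % 2 == 0) ? 1.0 : -1.0;  /* depends only on k */
        sum += factor / (2 * k + 1);
    }
    return 4.0 * sum;
}
```

Leaving factor shared would reintroduce a race, because one thread could change the sign between another thread's assignment and its use.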
The default clause
• Lets the programmer specify the scope of each variable in a block.
default (none)
• With this clause the compiler will require that we specify the scope of each variable we use in the block and that has been declared outside the block.
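A minimal sketch of default(none) on a loop that scales an array. The function name scale and its parameters are assumptions for illustration. Every variable declared outside the loop and used inside it (a, n, factor) has to appear in a scope clause, while the loop variable i is declared inside the construct and is therefore private.

```c
/* Multiply every element of a[0..n-1] by factor. */
void scale(double a[], int n, double factor, int thread_count) {
#   pragma omp parallel for num_threads(thread_count) \
        default(none) shared(a, n, factor)
    for (int i = 0; i < n; i++) {   /* i is declared in the loop: private */
        a[i] *= factor;
    }
}
```

Removing factor from the shared clause should make an OpenMP compiler reject the loop, which is the check the clause is meant to provide.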
Even Phase ……
Odd Phase ……
Even Phase
• Cyclic schedule:
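In OpenMP a cyclic assignment of iterations can be requested with schedule(static, 1), which hands out chunks of one iteration to the threads in round-robin order. The sketch below records which thread ran each iteration. The function name record_owners is an assumption for illustration.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Record in owner[i] the rank of the thread that ran iteration i. */
void record_owners(int owner[], int n, int thread_count) {
#   pragma omp parallel for num_threads(thread_count) schedule(static, 1)
    for (int i = 0; i < n; i++) {
#ifdef _OPENMP
        owner[i] = omp_get_thread_num();
#else
        owner[i] = 0;           /* compiled without OpenMP: one thread */
#endif
    }
}
```

If the runtime grants three threads for n = 6, the recorded owners are 0, 1, 2, 0, 1, 2.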
• Unlike the critical directive, the atomic directive can only protect critical sections that consist of a single C assignment statement.
• Further, the statement must have one of the following forms:
x <op>= <expression>;
x++;
++x;
x--;
--x;
• Here <op> can be one of the binary operators (+, *, -, /, &, ^, |, <<, >>)
• Many processors provide a special load-modify-store instruction.
• A critical section that only does a load-modify-store can be protected much more efficiently by using this special instruction rather than the constructs that are used to protect more general critical sections.
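A minimal sketch of the atomic directive protecting a single increment of a shared counter. The function name count_even and its parameters are assumptions for illustration.

```c
/* Count how many of the n values are even, using one shared counter. */
int count_even(const int values[], int n, int thread_count) {
    int count = 0;

#   pragma omp parallel for num_threads(thread_count)
    for (int i = 0; i < n; i++) {
        if (values[i] % 2 == 0) {
#           pragma omp atomic
            count++;            /* a single load-modify-store on count */
        }
    }
    return count;
}
```

The test values[i] % 2 == 0 runs outside the protected statement, so only the increment itself is serialized.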
Thread-Safety
• A block of code is thread-safe if it can be simultaneously executed by multiple threads without causing problems.
Example
• Suppose we want to use multiple threads to “tokenize” a file that consists of ordinary English text.
• The tokens are just contiguous sequences of characters separated from the rest of the text by white space: a space, a tab, or a newline.
• Divide the input file into lines of text and assign the lines to the threads in a round-robin fashion.
• The first line goes to thread 0, the second goes to thread 1, . . . , the tth goes to thread t-1, the (t+1)st goes to thread 0, etc.
Solution (Simple Approach)
• We can serialize access to the lines of input using semaphores.
• After a thread has read a single line of input, it can tokenize the line using the strtok function.
What happened?
• strtok caches the input line by declaring a variable to have static storage class.
• This causes the value stored in this variable to persist from one call to the next.
• Unfortunately for us, this cached string is shared, not private.
• Thus, thread 0’s call to strtok with the third line of the input has apparently overwritten the contents of thread 1’s call with the second line.
• So the strtok function is not thread-safe. If multiple threads call it simultaneously, the output may not be correct.
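One common remedy on POSIX systems is strtok_r, which keeps its position in a caller-supplied pointer instead of a static variable, so each thread can tokenize its own line independently. The sketch below is one possible fix rather than the lecture's own code. The function name count_tokens is an assumption for illustration, and strtok_r is not available on every platform.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r on strict compilers */
#include <stdlib.h>
#include <string.h>

/* Count whitespace-separated tokens in line without modifying it.
 * Returns -1 if memory for the working copy cannot be allocated. */
int count_tokens(const char *line) {
    size_t len = strlen(line) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, line, len);

    int count = 0;
    char *saveptr = NULL;       /* per-call state replaces strtok's static */
    for (char *tok = strtok_r(copy, " \t\n", &saveptr);
         tok != NULL;
         tok = strtok_r(NULL, " \t\n", &saveptr)) {
        count++;
    }
    free(copy);
    return count;
}
```

Because copy and saveptr are local to each call, two threads running count_tokens at the same time do not share any tokenizer state.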
• Thread-Safe Routines in the C Library
• The random number generator random in stdlib.h, the time conversion function localtime in time.h, and many other library functions are also not thread-safe.
Concluding Remarks
• OpenMP:
• A standard for programming shared-memory systems.
• Uses both special functions and preprocessor directives called pragmas.
• Programs start multiple threads rather than multiple processes.
• Many OpenMP directives can be modified by clauses.
• A major problem of shared memory programs is the possibility of race conditions.
• OpenMP provides several mechanisms for ensuring mutual exclusion in critical sections.
• Critical directives and Named critical directives
• Atomic directives and Simple locks
• By default most systems use a block-partitioning of the iterations in a parallelized for loop.
• OpenMP offers a variety of scheduling options.
• In OpenMP the scope of a variable is the collection of threads to which the variable is accessible.
• A reduction is a computation that repeatedly applies the same reduction operator to a sequence of operands in order to get a single result.
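The reduction idea is not limited to addition. As a sketch, the function below uses max as the reduction operator, which C programs can use from OpenMP 3.1 onward. The function name max_value is an assumption for illustration. Each thread keeps a private running maximum, and OpenMP combines the private copies into largest when the loop ends.

```c
/* Return the largest of values[0..n-1]; requires n >= 1. */
double max_value(const double values[], int n, int thread_count) {
    double largest = values[0];

#   pragma omp parallel for num_threads(thread_count) reduction(max: largest)
    for (int i = 1; i < n; i++) {
        if (values[i] > largest) {
            largest = values[i];
        }
    }
    return largest;
}
```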