
DA-1

Course Code: BCSE205L
Course Name: Computer Architecture and Organisation
Class Number: CH2024250100951
Course Faculty: ANUSHA K
Slot: C1+TC1-AB3-501

Student Name: Veditha.R
Registration Number: 23BCE1301
Introduction to OpenMP and Pragmas
OpenMP (Open Multi-Processing) is an API that supports multi-platform shared memory
multiprocessing programming in C, C++, and Fortran. OpenMP simplifies parallel
programming by providing a set of compiler directives, runtime library routines, and
environment variables that help create and manage parallel programs. OpenMP is widely used
to leverage the capabilities of multi-core processors by enabling easy parallelization of code
without requiring significant modifications to existing sequential code.
What is a Pragma?
A pragma is a special kind of instruction or directive to the compiler that provides additional
information about how to compile the code. In OpenMP, pragmas are used to tell the compiler
how to parallelize certain parts of the code. Pragmas in OpenMP begin with `#pragma omp`
and instruct the compiler to execute the following block of code using multiple threads,
distribute iterations of loops, synchronize threads, or perform other parallel operations.
OpenMP pragmas (directives) do not affect the logic of the program if OpenMP is disabled.
They are simply ignored when OpenMP support is not enabled during compilation. This makes
it easy to maintain both parallel and sequential versions of the code in one source file.

OpenMP Pragmas (Syntaxes) and Their Explanations


1. #pragma omp parallel
This directive creates a parallel region where a team of threads is formed to execute the code
block in parallel. Each thread executes the same code simultaneously.
#pragma omp parallel
{
    // Code inside this block is executed by multiple threads
}
Explanation: This is the most basic OpenMP directive. It creates a region in which every thread of the team executes the enclosed code concurrently. Each thread has a unique ID, available through `omp_get_thread_num()`. With 4 threads, the block is executed 4 times, once by each thread, at the same time.
Uses: It is useful when you want multiple threads to run a set of instructions in parallel
without any specific control over loop iterations or sections of code.
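For illustration, a minimal complete program using this directive is sketched below (compile with gcc -fopenmp; the message text is only an example):

#include <stdio.h>
#include <omp.h>

int main(void) {
    #pragma omp parallel
    {
        // Each thread in the team executes this block once.
        int id = omp_get_thread_num();      // unique ID of this thread
        int total = omp_get_num_threads();  // size of the thread team
        printf("Hello from thread %d of %d\n", id, total);
    }
    return 0;
}
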
2. #pragma omp for (the Fortran equivalent is !$omp do)
This directive is used to distribute the iterations of a loop among multiple threads in a parallel
region.
#pragma omp parallel
{
    #pragma omp for
    for (int i = 0; i < N; i++) {
        // Loop iterations are distributed among threads
    }
}
Explanation: When placed before a loop inside a parallel region, OpenMP divides the loop iterations among the threads. For example, with 4 threads, 100 iterations, and the default static schedule, each thread performs roughly 25 iterations. OpenMP handles the distribution of iterations based on the number of available threads and the schedule in use.
Uses: It is ideal for parallelizing loops where iterations are independent of each other, such as
numerical computations over arrays, matrix multiplication, etc.
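As a small sketch (the array size and name are chosen only for this example), the worksharing loop below records which thread executed each iteration:

#include <stdio.h>
#include <omp.h>

#define N 8

int main(void) {
    int owner[N];  // owner[i] = ID of the thread that ran iteration i

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < N; i++) {
            owner[i] = omp_get_thread_num();
        }
    }

    for (int i = 0; i < N; i++)
        printf("iteration %d was run by thread %d\n", i, owner[i]);
    return 0;
}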

3. #pragma omp parallel for


This is a combined form of the `parallel` and `for` directives. It simplifies the syntax for
parallelizing loops.
#pragma omp parallel for
for (int i = 0; i < N; i++) {
    // Loop iterations are executed in parallel
}

Explanation: This directive merges the creation of a parallel region and the parallelization of
the loop. It simplifies the syntax when the sole purpose of the parallel region is to parallelize
the loop.
Uses: When you want a more concise way to parallelize loops without explicitly defining
separate `parallel` and `for` pragmas.
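A small sketch of element-wise vector addition with this directive (the array names, size, and initial values are illustrative):

#include <stdio.h>

#define N 1000

int main(void) {
    static double a[N], b[N], c[N];

    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    // The iterations are independent, so they can safely be split among threads.
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        c[i] = a[i] + b[i];
    }

    printf("c[N-1] = %f\n", c[N - 1]);
    return 0;
}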

4. #pragma omp sections


The `sections` directive is used when different sections of code need to be executed by
different threads in parallel.
#pragma omp parallel sections
{
    #pragma omp section
    {
        // Code block 1, executed by one thread
    }
    #pragma omp section
    {
        // Code block 2, executed by another thread
    }
}

Explanation: Each `section` is executed by a separate thread. This is useful for tasks that can be divided into distinct, independent pieces of work. Unlike `for`, where loop iterations are divided among threads, `sections` allows non-looped code blocks to be run in parallel.
Uses: Useful for tasks like parallelizing different parts of a computation where each section
performs a unique operation (e.g., different data processing pipelines).
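As an illustrative sketch (the sum and product computations stand in for any two independent pieces of work), two sections can run concurrently like this:

#include <stdio.h>

int main(void) {
    long sum = 0, product = 1;

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            // One thread computes a sum ...
            for (int i = 1; i <= 10; i++) sum += i;
        }
        #pragma omp section
        {
            // ... while another thread computes a product.
            for (int i = 1; i <= 10; i++) product *= i;
        }
    }

    printf("sum = %ld, product = %ld\n", sum, product);
    return 0;
}

Each section writes to a different variable, so no extra synchronization is needed here.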

5. #pragma omp single


The `single` directive ensures that only one thread (any one thread) executes the block of
code, while the rest of the threads wait.
#pragma omp parallel
{
    #pragma omp single
    {
        // Only one thread executes this block
    }
}
Explanation: The code inside the `single` directive is executed by exactly one thread; the other threads skip the block and wait at the implicit barrier at the end of the construct unless a `nowait` clause is added. This is useful for serializing a piece of code that must not be executed in parallel, such as reading input or writing output.
Uses: When only one thread should execute a particular block of code, like reading or
initializing shared data.
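A small sketch in which one thread initializes a shared value while the others wait (the variable name and value are chosen only for the example):

#include <stdio.h>
#include <omp.h>

int main(void) {
    int n = 0;

    #pragma omp parallel
    {
        #pragma omp single
        {
            // Exactly one thread performs the initialization; the others
            // wait at the implicit barrier at the end of the single block.
            n = 42;
            printf("initialized by thread %d\n", omp_get_thread_num());
        }
        // After the single block, every thread sees the initialized value.
        printf("thread %d sees n = %d\n", omp_get_thread_num(), n);
    }
    return 0;
}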

6. #pragma omp master


The `master` directive ensures that only the master thread (thread 0) executes the enclosed
code.
#pragma omp parallel
{
    #pragma omp master
    {
        // Only the master thread executes this block
    }
}
Explanation: This is like `single`, but specifically for the master thread. Other threads do not
execute this block, nor do they wait for the master thread to finish.
Uses: When only the master thread (thread 0) should perform a task, such as finalizing results
or managing critical resources.

7. #pragma omp critical
This directive ensures that the enclosed block of code is executed by only one thread at a
time to prevent race conditions.
#pragma omp parallel
{
    #pragma omp critical
    {
        // Critical section: only one thread at a time can execute this
    }
}
Explanation: The `critical` pragma ensures mutual exclusion. Only one thread at a time can
enter the critical section, preventing data races and inconsistencies when accessing shared
data.
Uses: Useful for protecting critical sections of code that modify shared resources, such as
updating a shared counter, adding elements to a shared list, etc.
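A minimal sketch of protecting a shared counter (the loop count is arbitrary):

#include <stdio.h>

int main(void) {
    int counter = 0;

    #pragma omp parallel
    {
        for (int i = 0; i < 1000; i++) {
            // Without the critical section, concurrent increments could be lost.
            #pragma omp critical
            counter++;
        }
    }

    // Expected result: 1000 multiplied by the number of threads.
    printf("counter = %d\n", counter);
    return 0;
}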

8. #pragma omp barrier


The `barrier` pragma synchronizes threads, forcing them to wait at a specified point until all
threads have reached this point.
#pragma omp parallel
{
    // Code before barrier
    #pragma omp barrier
    // Code after barrier, executed only after all threads reach this point
}
Explanation: This ensures that all threads have completed their tasks before moving forward,
which is often necessary to ensure proper synchronization when threads share data or
computation.
Uses: Useful in stages of parallel programs where synchronization between threads is
necessary before proceeding to the next phase.
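A small two-phase sketch (it assumes at most 64 threads; the values written are illustrative): each thread first fills its own slot, and the barrier guarantees that all slots are written before any thread reads its neighbour's slot.

#include <stdio.h>
#include <omp.h>

#define MAX_THREADS 64

int main(void) {
    int data[MAX_THREADS] = {0};

    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        int nthreads = omp_get_num_threads();

        data[id] = id * 10;        // phase 1: each thread fills its own slot

        #pragma omp barrier        // wait until every slot has been written

        int neighbour = (id + 1) % nthreads;   // phase 2: read another slot
        printf("thread %d reads neighbour value %d\n", id, data[neighbour]);
    }
    return 0;
}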

9. #pragma omp atomic


The `atomic` pragma ensures that a specific memory operation (e.g., increment, update) is
performed atomically, without interruption from other threads.
#pragma omp parallel
{
    #pragma omp atomic
    sum += value;
}

Explanation: This pragma ensures that a single memory operation is executed in a way that
prevents race conditions. It is more lightweight than `critical` because it applies only to a
single memory operation.
Uses: Ideal for simple operations on shared variables (e.g., incrementing a shared counter).
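A small sketch that sums the numbers 1 to 100 using atomic updates (a reduction, shown in the next section, would usually be faster for this pattern):

#include <stdio.h>

int main(void) {
    long sum = 0;

    #pragma omp parallel for
    for (int i = 1; i <= 100; i++) {
        // The update of sum is performed atomically, so no increments are lost.
        #pragma omp atomic
        sum += i;
    }

    printf("sum = %ld\n", sum);  // 5050
    return 0;
}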

10. The reduction clause


The `reduction` clause performs a reduction operation (such as sum, product, max, etc.)
across all threads at the end of a parallel region.
int sum = 0;
#pragma omp parallel for reduction(+:sum)
for (int i = 0; i < N; i++) {
    sum += i;
}

Explanation: Each thread works on its own private copy of the reduction variable (e.g.,
`sum`). After the loop ends, OpenMP combines the results from all threads into the final value
of the reduction variable.
Uses: Common in operations like summing or multiplying elements of an array in parallel,
where the partial results from each thread must be combined at the end.
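A complete sketch of summing an array with a reduction (the array name, size, and contents are chosen only for this example):

#include <stdio.h>

#define N 1000

int main(void) {
    static double a[N];
    for (int i = 0; i < N; i++) a[i] = 1.0;

    double total = 0.0;

    // Each thread accumulates into its own private copy of total;
    // the copies are combined with + when the loop finishes.
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < N; i++) {
        total += a[i];
    }

    printf("total = %f\n", total);  // 1000.000000
    return 0;
}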

11. The private clause


This clause declares variables as private to each thread, meaning each thread gets its own
independent copy of the variable.
int x = 10;
#pragma omp parallel private(x)
{
    // Each thread has its own private copy of x
}
Explanation: `private` ensures that each thread has its own instance of a variable. Changes
made by one thread do not affect the values of the same variable in other threads.
Uses: Useful for variables that should not be shared, such as loop counters or temporary
variables used inside parallel loops.
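A small sketch showing that each thread works on its own copy of x and that the original value is left untouched (note that a private copy starts uninitialized; firstprivate would copy the original value in):

#include <stdio.h>
#include <omp.h>

int main(void) {
    int x = 10;

    #pragma omp parallel private(x)
    {
        // Each thread has its own uninitialized copy of x.
        x = omp_get_thread_num();
        printf("thread %d: private x = %d\n", omp_get_thread_num(), x);
    }

    printf("after the parallel region, the original x is still %d\n", x);
    return 0;
}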

12. The shared clause


This clause declares variables as shared among all threads, meaning all threads access and
modify the same instance of the variable.

int x = 10;
#pragma omp parallel shared(x)
{
    // All threads access the same shared x
}

Explanation: Shared variables are accessible to all threads and can be read or written by any
thread. However, race conditions may arise if multiple threads attempt to write to the same
shared variable at the same time.
Uses: Used when multiple threads need to access and modify shared data, but care should be
taken to ensure proper synchronization (using `critical` or `atomic` if necessary).

13. #pragma omp threadprivate


This pragma declares global or static variables as private to each thread.
static int x;
#pragma omp threadprivate(x)

Explanation: Each thread gets its own private copy of global or static variables. This is useful
when global or static variables need to maintain separate values for each thread.
Uses: Useful in parallel applications where global variables must be made private to each
thread.
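A small sketch with a per-thread call counter (the function and variable names are chosen only for this example):

#include <stdio.h>
#include <omp.h>

static int call_count = 0;
#pragma omp threadprivate(call_count)

void do_work(void) {
    call_count++;  // each thread updates its own copy
}

int main(void) {
    #pragma omp parallel
    {
        do_work();
        do_work();
        // Every thread reports 2: the counter is private to each thread.
        printf("thread %d: call_count = %d\n", omp_get_thread_num(), call_count);
    }
    return 0;
}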

14. The schedule clause


The `schedule` clause controls how loop iterations are assigned to threads.
#pragma omp parallel for schedule(static, 4)
for (int i = 0; i < N; i++) {
    // Loop iterations scheduled with static partitioning
}
Explanation: The `schedule` clause allows finer control over how iterations are assigned to
threads. It can use different strategies:
- `static`: Divides loop iterations into chunks of a fixed size and assigns them to threads in
a round-robin fashion.
- `dynamic`: Dynamically assigns chunks to threads as they become free, which can help
with load balancing.
- `guided`: Threads are assigned progressively smaller chunks as they complete tasks.
- `runtime`: The schedule is determined at runtime based on environment variables.

Uses: Useful when the load among iterations is imbalanced or when tuning the performance
of parallel loops.
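A small sketch in which later iterations are more expensive than earlier ones, so a dynamic schedule balances the load better than the default static one (the work function is purely illustrative):

#include <stdio.h>

long work(int i) {             // cost grows with i, so iterations are imbalanced
    long s = 0;
    for (int j = 0; j < i * 1000; j++) s += j;
    return s;
}

int main(void) {
    long total = 0;

    // Chunks of 4 iterations are handed out to threads as they become free.
    #pragma omp parallel for schedule(dynamic, 4) reduction(+:total)
    for (int i = 0; i < 100; i++) {
        total += work(i);
    }

    printf("total = %ld\n", total);
    return 0;
}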

15. #pragma omp flush


The `flush` directive synchronizes threads by ensuring memory consistency.
#pragma omp flush
Explanation: Ensures that the values of shared variables are consistent across all threads by
flushing all updates to memory. It is useful in scenarios where threads may be updating
shared variables, and you want to ensure that all threads have the latest values before
proceeding.
Uses: Used to guarantee that all threads have a consistent view of shared memory, especially
before or after critical sections.

16. #pragma omp task


The `task` directive defines a block of code as a task, which can be executed by any thread in
the team.
#pragma omp task
{
    // Task code to be executed in parallel
}

Explanation: The `task` directive allows for deferred parallel execution. A task can be created
by one thread but may be executed by another thread. It is useful for irregular workloads
where tasks are created dynamically.
Uses: Ideal for dividing work into tasks that can be executed asynchronously by different
threads.

17. #pragma omp taskwait


The `taskwait` directive forces a thread to wait until all child tasks it has created are
complete.
#pragma omp taskwait

Explanation: It is a synchronization point for tasks. A thread waits until all the tasks it has
spawned are completed before continuing.
Uses: Used in applications where task dependencies must be honored before proceeding to
the next phase.
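The classic illustration of `task` and `taskwait` together is a recursive Fibonacci computation, sketched below (the recursion is kept deliberately simple; a real implementation would stop creating tasks for small n):

#include <stdio.h>

// Each recursive call becomes a task; taskwait makes the parent
// wait for its two child tasks before combining their results.
long fib(int n) {
    long x, y;
    if (n < 2) return n;

    #pragma omp task shared(x)
    x = fib(n - 1);

    #pragma omp task shared(y)
    y = fib(n - 2);

    #pragma omp taskwait
    return x + y;
}

int main(void) {
    long result;

    #pragma omp parallel
    {
        #pragma omp single     // one thread creates the top-level task tree
        result = fib(20);
    }

    printf("fib(20) = %ld\n", result);  // 6765
    return 0;
}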

18. #pragma omp parallel sections

This directive is used when different threads need to execute different sections of
code in parallel.
#pragma omp parallel sections
{
    #pragma omp section
    {
        // Section 1, executed by one thread
    }
    #pragma omp section
    {
        // Section 2, executed by another thread
    }
}

Explanation: Each `section` is executed by a separate thread. Unlike loops, where iterations are divided among threads, `sections` allows distinct code blocks to be executed concurrently by different threads.
Uses: Useful when you want to perform different tasks in parallel, such as processing different
parts of data independently.

Conclusion
OpenMP provides a flexible and simple API for parallel programming. By using pragmas,
developers can parallelize their code with minimal changes. OpenMP handles the complexities
of thread creation, synchronization, and workload distribution, allowing developers to focus on
the logic of their parallel applications. From simple parallel loops to complex task-based
parallelism, OpenMP offers powerful constructs for leveraging multi-core processors
effectively.
