
CS424.

Parallel Computing Lab #6

1 OpenMP
OpenMP (Open Multi-Processing) is an Application Programming Interface (API) that supports multi-platform shared-
memory multiprocessing programming. It allows developers to write parallel programs that can leverage the power
of multiple cores or processors within a single computer system.

Key Concepts

• Shared Memory: All threads in an OpenMP program share a single address space, enabling them
to access and exchange data efficiently.
• Threads: OpenMP uses threads of execution to perform tasks concurrently. These threads share the program's
data and instructions.
• Directives: OpenMP directives are compiler instructions embedded within the source code. These directives
specify how to parallelize specific sections of the code.
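The following minimal sketch ties these concepts together (it is our own illustration, not one of the lab's numbered code listings): a directive marks a structured block, a team of threads executes it, and every thread reads the same variable from shared memory.

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        int shared_value = 42;        /* lives in shared memory, visible to every thread */

        /* Directive: the following block is executed by a team of threads. */
        #pragma omp parallel
        {
            /* Each thread reads the same shared variable. */
            printf("a thread sees shared_value = %d\n", shared_value);
        }
        return 0;
    }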

Benefits of OpenMP
• Improved Performance: By utilizing multiple cores/processors, OpenMP programs can execute faster
compared to their sequential counterparts.
• Portability: OpenMP is a widely supported standard, allowing code to be portable across different platforms
with minimal modifications.
• Ease of Use: Compared to lower-level parallel programming models, OpenMP offers a relatively simpler
approach for parallelization.

Common OpenMP Constructs:


• #pragma omp parallel: Creates a team of threads that execute the code block in parallel.
• #pragma omp for: Parallelizes a loop, allowing iterations to be executed concurrently across threads.
• #pragma omp critical: Defines a critical section where only one thread can execute at a time, ensuring
data consistency for shared resources.
• #pragma omp atomic: Ensures that a specific memory update (for example, an increment of a shared
variable) is performed atomically within a parallel program.
• #pragma omp barrier: Creates a synchronization point that all threads must reach before any of them
proceeds further. (These constructs are combined in the sketch after this list.)
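The hedged sketch below (again our own example, not one of the lab's listings) combines these constructs: a parallel region, a work-shared loop whose iterations update a shared counter atomically, an explicit barrier, and a critical section that prints one thread at a time.

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        int done = 0;

        #pragma omp parallel               /* create a team of threads */
        {
            #pragma omp for                /* split the iterations among the threads */
            for (int i = 0; i < 100; i++) {
                #pragma omp atomic         /* make the shared increment atomic */
                done++;
            }

            /* Explicit synchronization point (here redundant with the implicit
               barrier at the end of the for construct, shown for illustration). */
            #pragma omp barrier

            #pragma omp critical           /* only one thread prints at a time */
            printf("thread %d sees done = %d\n", omp_get_thread_num(), done);
        }
        return 0;
    }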

Common OpenMP Functions:


• omp_get_thread_num(): Returns the calling thread's identifier within the current team (an integer
from 0 to the team size minus 1).
• omp_get_num_threads(): Returns the total number of threads in the current team (1 when called
outside a parallel region).
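A short sketch using both functions (an assumed example in the spirit of the lab's hello-world program, but not copied from it); the lines may appear in any order because the threads print concurrently.

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        #pragma omp parallel
        {
            int id = omp_get_thread_num();         /* this thread's id within the team */
            int nthreads = omp_get_num_threads();  /* size of the team */
            printf("Hello from thread %d of %d\n", id, nthreads);
        }
        return 0;
    }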

Common OpenMP Clauses:


• Data-Sharing Clauses
o default(shared | none): Sets the default data-sharing attribute for variables referenced within
a parallel construct. shared makes all variables shared by default, while none requires every
variable used in the construct to be given an explicit data-sharing attribute.
o shared(list): Declares a list of variables to be shared by all threads in a parallel region.
o private(list): Declares a list of variables to be private to each thread in a team. Each thread will
have its own copy of these variables.
• Loop Control Clauses
o for: Combined with the parallel directive (as #pragma omp parallel for) to distribute the
iterations of a loop across the team.
o schedule(type, chunk_size): Controls how loop iterations are distributed among threads.
Common types include static, dynamic, and guided. The optional chunk_size specifies how many
iterations are handed to a thread at a time.
• Synchronization Clauses:
o barrier: Creates a synchronization point that all threads in a team must reach before any of them
proceeds further.
o critical(name): Defines a critical section where only one thread can execute at a time, ensuring
data consistency for shared resources. The optional name argument allows for named critical sections.
• num_threads(num): Specifies the number of threads to be created within a parallel region. This overrides
the default number of threads used by the OpenMP runtime.
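A hedged sketch showing several of these clauses together (variable names and sizes are chosen here for illustration): an explicit team size, default(none) forcing explicit data-sharing attributes, a shared output array, a private temporary, and a dynamic schedule.

    #include <stdio.h>
    #include <omp.h>

    #define N 16

    int main(void) {
        int a[N];
        int tmp = 0;     /* listed as private below: each thread gets its own copy */

        /* default(none) forces every variable used in the region to appear in a clause. */
        #pragma omp parallel for num_threads(4) default(none) \
                shared(a) private(tmp) schedule(dynamic, 2)
        for (int i = 0; i < N; i++) {
            tmp = i * i;       /* written to the thread's private copy */
            a[i] = tmp;        /* each iteration writes a distinct element: no race */
        }

        for (int i = 0; i < N; i++)
            printf("a[%d] = %d\n", i, a[i]);
        return 0;
    }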

Compiling and Running:


• Compile: Use a compiler that supports OpenMP (e.g., gcc with the -fopenmp flag).
• Run: Execute the compiled program. The output will display greetings from each thread created by the
#pragma omp parallel directive. The number of threads may vary depending on your system
configuration (e.g., number of cores).
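For example, assuming the source file is saved as hello.c (a file name chosen here for illustration), a typical sequence is gcc -fopenmp hello.c -o hello to compile and ./hello to run. Setting the OMP_NUM_THREADS environment variable before running changes the default number of threads the runtime creates.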

2 Examples
• Code 1 demonstrates a simple use of the OpenMP parallel construct. Compile and run the program and study the
output of several runs.
• Code 2 shows a simple OpenMP "Hello, world!" program in which every thread identifies its id. Compile and
run the program and study the output of several runs.
• Code 3, Code 4, and Code 5 show different use cases of the for construct. Compile and run each program and
study the output of several runs.

3 Practice
1. In Code 3, what is the impact of not using the private clause? Explain.
2. What is the difference between Code 3 and Code 4 in terms of the number of thread teams?
3. How many iterations are executed if six threads execute Code 3? What about Code 4?
4. Explain the output of Code 5.
5. Write a parallel version of a program that computes and displays the dot product of 2 vectors, a and b. Make
sure that race conditions are not going to happen.

    for (i = 0; i < n; i++)
        dotproduct += a[i] * b[i];

6. Consider the following loop.


    a[0] = 0;
    for (i = 1; i < n; i++)
        a[i] = a[i-1] + i;

There's clearly a loop-carried dependence, as the value of a[i] can't be computed without the value of a[i-1].
Can you see a way to eliminate this dependence and parallelize the loop?
7. Suppose a C program declares an array a as follows:
float a[] = {4.0, 3.0, 3.0, 1000.0};

a) What is the output of the following block of code?

    int i;
    float sum = 0.0;
    for (i = 0; i < 4; i++)
        sum += a[i];
    printf("sum = %4.1f\n", sum);

b) Now consider the following code:

    int i;
    float sum = 0.0;
    #pragma omp parallel for num_threads(2) reduction(+:sum)
    for (i = 0; i < 4; i++)
        sum += a[i];
    printf("sum = %4.1f\n", sum);

Suppose that the run-time system assigns iterations i = 0, 1 to thread 0 and i = 2, 3 to thread 1. What is the
output of this code?

Code 1, Code 2, Code 3, Code 4, Code 5 (listings provided separately as images in the original handout).
