Gauravkumar 221it027@it301 Lab2
Gaurav Kumar
221IT027
//CODE:-
#include <stdio.h>
#include <sys/time.h>
#include <omp.h>
int main() {
    int i, num_threads = 4;                 /* assumed value for this run */
    long num_steps = 1000000;               /* assumed number of integration steps */
    double step = 1.0 / (double)num_steps;
    double x, pi, sum;
    double seq_time, parallel_time, speedup, efficiency;
    long long time_start, time_end;
    struct timeval TimeValue_Start, TimeValue_Final;
    struct timezone TimeZone_Start, TimeZone_Final;
    /* Sequential midpoint-rule integration of 4/(1+x^2) over [0,1] */
    sum = 0.0;
    gettimeofday(&TimeValue_Start, &TimeZone_Start);
    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    gettimeofday(&TimeValue_Final, &TimeZone_Final);
    time_start = (long long)TimeValue_Start.tv_sec * 1000000 + TimeValue_Start.tv_usec;
    time_end = (long long)TimeValue_Final.tv_sec * 1000000 + TimeValue_Final.tv_usec;
    seq_time = (time_end - time_start) / 1000000.0;
    printf("Sequential calculation:\n");
    printf("Calculated Pi: %f, Time taken: %lf seconds\n\n", pi, seq_time);
    /* Parallel version: the same loop, with the partial sums combined by a reduction */
    sum = 0.0;
    omp_set_num_threads(num_threads);
    gettimeofday(&TimeValue_Start, &TimeZone_Start);
    #pragma omp parallel for private(x) reduction(+:sum)
    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    gettimeofday(&TimeValue_Final, &TimeZone_Final);
    time_start = (long long)TimeValue_Start.tv_sec * 1000000 + TimeValue_Start.tv_usec;
    time_end = (long long)TimeValue_Final.tv_sec * 1000000 + TimeValue_Final.tv_usec;
    parallel_time = (time_end - time_start) / 1000000.0;
    speedup = seq_time / parallel_time;
    efficiency = speedup / num_threads;
    printf("Parallel calculation with %d threads:\n", num_threads);
    printf("Calculated Pi: %f, Time taken: %lf seconds\n", pi, parallel_time);
    printf("Speedup: %lf, Efficiency: %lf\n", speedup, efficiency);
    return 0;
}
//OUTPUT:-
ANALYSIS:-
This OpenMP program calculates the value of Pi using numerical integration via the midpoint
rule. The program first computes Pi sequentially, then repeats the calculation in parallel
using different numbers of threads. For each parallel computation, the program measures
the time taken, calculates the speedup and efficiency, and prints the results.
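For reference, the reported figures are assumed to follow the standard definitions: with p
threads, speedup is S(p) = T_sequential / T_parallel(p) and efficiency is E(p) = S(p) / p. Under
these definitions, the efficiency of 0.121929 reported for 32 threads corresponds to a speedup
of roughly 32 × 0.121929 ≈ 3.9.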
1. Performance Analysis:
o Speedup: As expected, the speedup increases with the number of threads.
However, the speedup is not linear, meaning that doubling the number of
threads does not necessarily halve the computation time.
o Efficiency: The efficiency decreases as the number of threads increases. This
is typical in parallel computing due to overheads such as thread management
and synchronization.
▪ For example, at 32 threads, efficiency drops significantly to 0.121929,
indicating that adding more threads yields diminishing returns. This is
likely due to the overhead becoming more significant compared to the
work done by each thread.
2. Impact of Parallelization:
o Using more threads does reduce the computation time, but the benefits
decrease with higher thread counts. At some point, the overhead of
managing many threads outweighs the benefits of parallelization, as seen
with the 32-thread case.
• The program effectively demonstrates the principles of parallel computing, including
speedup and efficiency.
• The code shows that while parallelization can significantly reduce computation time,
there are limits to its effectiveness due to overhead and the nature of the problem
being parallelized.
• Understanding the trade-offs between speedup and efficiency is crucial for
optimizing parallel programs.
Q.2 Develop an OpenMP program for matrix multiplication (C = A*B). Analyze the
speedup and efficiency of the parallelized code. Vary the size of your matrices
over 250, 500, 750, 1000, and 2000, and measure the runtime with one thread.
For each matrix size, vary the number of threads over 2, 4, and 8, and plot the
speedup versus the number of threads. Compute the efficiency.
//CODE:-
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
/* Fill an n x n matrix (row-major, 1-D storage) with pseudo-random values */
void initializeMatrix(double *M, int n) {
    for (int i = 0; i < n * n; i++) M[i] = rand() / (double)RAND_MAX;
}
int main() {
    int sizes[] = {250, 500, 750, 1000, 2000};
    int numSizes = sizeof(sizes) / sizeof(sizes[0]);
    int threads[] = {1, 2, 4, 8};
    int numThreads = sizeof(threads) / sizeof(threads[0]);
    for (int s = 0; s < numSizes; s++) {
        int n = sizes[s];
        double *A = malloc(n * n * sizeof(double));
        double *B = malloc(n * n * sizeof(double));
        double *C = malloc(n * n * sizeof(double));
        initializeMatrix(A, n); initializeMatrix(B, n);
        for (int t = 0; t < numThreads; t++) {
            omp_set_num_threads(threads[t]);
            double start = omp_get_wtime();
            /* C = A * B: the rows of C are distributed across the threads */
            #pragma omp parallel for
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++) {
                    double sum = 0.0;
                    for (int k = 0; k < n; k++) sum += A[i * n + k] * B[k * n + j];
                    C[i * n + j] = sum;
                }
            printf("n=%d, threads=%d, time=%lf s\n", n, threads[t], omp_get_wtime() - start);
        }
        free(A); free(B); free(C);
    }
    return 0;
}
//OUTPUT:-
speedups = {
    250:  [1.0, 1.779375, 2.131326, 1.995449],
    500:  [1.0, 1.914561, 2.517489, 3.225659],
    750:  [1.0, 1.906177, 2.684145, 3.968366],
    1000: [1.0, 1.873228, 2.900678, 4.127598],
    2000: [1.0, 1.908705, 3.233444, 4.381061]
}
threads = [1, 2, 4, 8]   # one entry per speedup value; the single-thread run is the baseline
# Plotting script (matplotlib); the per-size plot loop is a straightforward reconstruction
import matplotlib.pyplot as plt
for size, s in speedups.items():
    plt.plot(threads, s, marker='o', label=f'{size} x {size}')
plt.xlabel('Number of Threads')
plt.ylabel('Speedup')
plt.title('Speedup vs. Number of Threads for Different Matrix Sizes')
plt.legend()
plt.grid(True)
plt.show()
// Graph:-
ANALYSIS:-
Matrix Size: 250 x 250
• Threads: 2 → Speedup: 1.78, Efficiency: 0.89
• Threads: 4 → Speedup: 2.13, Efficiency: 0.53
• Threads: 8 → Speedup: 1.99, Efficiency: 0.25
• Observation: Speedup improves up to 4 threads but falls back slightly at 8 threads, and
efficiency drops steadily, suggesting that parallelization overhead dominates for this
small matrix.
Matrix Size: 500 x 500
• Threads: 2 → Speedup: 1.91, Efficiency: 0.96
• Threads: 4 → Speedup: 2.52, Efficiency: 0.63
• Threads: 8 → Speedup: 3.23, Efficiency: 0.40
• Observation: Speedup scales better with increasing threads than for the 250 x 250
matrix, but efficiency still drops as more threads are used.
Matrix Size: 750 x 750
• Threads: 2 → Speedup: 1.91, Efficiency: 0.95
• Threads: 4 → Speedup: 2.68, Efficiency: 0.67
• Threads: 8 → Speedup: 3.97, Efficiency: 0.50
• Observation: Similar to the 500 x 500 case, but the gains in speedup with more
threads are more substantial.
Matrix Size: 1000 x 1000
• Threads: 2 → Speedup: 1.87, Efficiency: 0.94
• Threads: 4 → Speedup: 2.90, Efficiency: 0.73
• Threads: 8 → Speedup: 4.13, Efficiency: 0.52
• Observation: For this larger matrix size, speedup becomes more noticeable with 8
threads, though efficiency continues to decrease.
Matrix Size: 2000 x 2000
• Threads: 2 → Speedup: 1.91, Efficiency: 0.95
• Threads: 4 → Speedup: 3.23, Efficiency: 0.81
• Threads: 8 → Speedup: 4.38, Efficiency: 0.55
• Observation: Significant speedup is observed with 8 threads, indicating that larger
matrices benefit more from parallelization. The efficiency remains relatively high
compared to smaller matrices.
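The efficiency values quoted above follow directly from the measured speedups
(efficiency = speedup / number of threads). A minimal post-processing sketch in C that
reproduces them from the speedup data in the output (illustrative only; this helper is not
part of the original lab program):
#include <stdio.h>
int main() {
    int threads[] = {1, 2, 4, 8};
    int sizes[] = {250, 500, 750, 1000, 2000};
    /* Speedups copied from the output section above */
    double speedup[5][4] = {
        {1.0, 1.779375, 2.131326, 1.995449},
        {1.0, 1.914561, 2.517489, 3.225659},
        {1.0, 1.906177, 2.684145, 3.968366},
        {1.0, 1.873228, 2.900678, 4.127598},
        {1.0, 1.908705, 3.233444, 4.381061}
    };
    for (int s = 0; s < 5; s++)
        for (int t = 0; t < 4; t++)
            printf("n=%d, threads=%d: speedup=%.2f, efficiency=%.2f\n",
                   sizes[s], threads[t], speedup[s][t], speedup[s][t] / threads[t]);
    return 0;
}
For example, this reproduces the 0.55 efficiency reported for the 2000 x 2000 case on 8
threads (4.381061 / 8 ≈ 0.55).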
Graph Analysis
Plot: "Speedup vs. Number of Threads for Different Matrix Sizes"
• X-Axis: Number of threads (1, 2, 4, 8).
• Y-Axis: Speedup.
1. Speedup Trends:
o For all matrix sizes, the speedup increases as the number of threads
increases.
o The speedup curve tends to be more pronounced for larger matrices (e.g.,
size 2000), indicating better parallelization efficiency.
2. Smaller Matrices (e.g., 250, 500):
o The speedup gain is modest, especially when increasing threads from 4 to 8.
o The efficiency drops faster, showing that the overhead of managing more
threads outweighs the benefits of parallel processing for smaller matrices.
3. Larger Matrices (e.g., 1000, 2000):
o The speedup is more significant, particularly with 8 threads.
o The efficiency remains higher compared to smaller matrices, suggesting that
larger matrices have enough computational workload to benefit more from
parallelism.
4. General Efficiency:
o Efficiency decreases as the number of threads increases, which is typical in
parallel computing due to overhead and contention.
o Larger matrices maintain better efficiency with more threads, making them
more suitable for parallel execution.
• Small Matrices: Parallelization offers limited benefits due to lower computational
demands and higher overhead.
• Large Matrices: Parallelization is highly effective, particularly with a greater number
of threads, leading to significant speedup and better resource utilization.
• Optimal Thread Usage: For maximum efficiency, a balance must be struck between
the matrix size and the number of threads. Too many threads for smaller matrices
result in diminishing returns, while larger matrices can effectively leverage more
threads for substantial performance gains.
//CODE:-
#include <stdio.h>
#include <omp.h>
#define MAX_N 100                          /* assumed maximum number of terms */
long long fib[MAX_N];
omp_lock_t print_lock;
/* Fill fib[0..limit-1] with the Fibonacci series */
void generateFibonacci(int limit) {
    fib[0] = 0;
    if (limit > 1) fib[1] = 1;
    for (int i = 2; i < limit; i++)
        fib[i] = fib[i - 1] + fib[i - 2];
}
int main() {
    int n = 10;                            /* number of terms; the analysis below refers to 10 terms */
    int num_threads = 2;                   /* assumed; the program requires at least 2 threads */
    double start_time, end_time, total_time;
    omp_init_lock(&print_lock);
    if (n > MAX_N) {
        printf("Number of terms exceeds maximum limit of %d\n", MAX_N);
        return 1;
    }
    if (num_threads < 2) {
        printf("Number of threads must be at least 2.\n");
        return 1;
    }
    // Start timing
    start_time = omp_get_wtime();
    omp_set_num_threads(num_threads);
    /* Assumed structure for the parallel region: thread 0 generates the series
       and thread 1 prints it under the print lock. */
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        if (tid == 0) generateFibonacci(n);
        #pragma omp barrier                /* the series must be complete before printing */
        if (tid == 1) {
            omp_set_lock(&print_lock);
            for (int i = 0; i < n; i++)
                printf("fib[%d] = %lld (printed by thread %d)\n", i, fib[i], tid);
            omp_unset_lock(&print_lock);
        }
    }
    end_time = omp_get_wtime();
    total_time = end_time - start_time;
    printf("Total time: %lf seconds\n", total_time);
    omp_destroy_lock(&print_lock);
    return 0;
}
//OUTPUT:-
ANALYSIS:-
o Increasing the number of threads generally results in reduced total time and
higher speedup.
o However, the efficiency values in the output are anomalous: efficiency should
normally decrease as the thread count grows because of overhead and
diminishing returns.
o For such a small problem size (10 Fibonacci numbers), both the sequential and
the parallel runs finish almost instantly, so the measured times are dominated
by thread-management overhead rather than by useful work; this makes the
computed speedup and efficiency values unreliable and can produce the
unusually high figures seen here (see the sketch after this analysis).
o In practical scenarios with larger problem sizes, the measurements are more
meaningful and show the usual pattern: efficiency decreases as more threads
are used because of the growing management and synchronization overhead.
The program successfully demonstrates multi-threaded generation and printing of the
Fibonacci series using OpenMP. The observed high speedup and efficiency values are
primarily due to the small problem size and very fast computation time.
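To make the overhead argument concrete, the following sketch (an illustration, not part of
the lab code) times the serial generation of 10 Fibonacci terms against the cost of merely
opening and closing an empty OpenMP parallel region; on most systems the latter is far
larger, which is why speedup and efficiency figures for such a tiny workload are unreliable.
#include <stdio.h>
#include <omp.h>
int main() {
    long long fib[10] = {0, 1};
    /* Time the actual work: 10 Fibonacci terms computed serially */
    double t0 = omp_get_wtime();
    for (int i = 2; i < 10; i++) fib[i] = fib[i - 1] + fib[i - 2];
    double work = omp_get_wtime() - t0;
    /* Time an empty parallel region: pure thread-management overhead */
    double t1 = omp_get_wtime();
    #pragma omp parallel
    {
        /* intentionally empty */
    }
    double overhead = omp_get_wtime() - t1;
    printf("fib[9] = %lld, work: %g s, parallel-region overhead: %g s\n",
           fib[9], work, overhead);
    return 0;
}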
//CODE:-
#include <stdio.h>
#include <omp.h>
int main() {
    int size = 5;
    int A[5] = {1, 2, 3, 4, 5};            /* values inferred from the analysis below (C[0] = 1 + 10, C[4] = 5 + 14) */
    int B[5] = {10, 11, 12, 13, 14};
    int C[5];
    omp_set_num_threads(5);
    /* One iteration (one element of C) per thread */
    #pragma omp parallel for
    for (int i = 0; i < size; i++) {
        C[i] = A[i] + B[i];
        printf("Thread %d computed C[%d] = %d\n", omp_get_thread_num(), i, C[i]);
    }
    return 0;
}
//OUTPUT:-
ANALYSIS:-
This OpenMP program performs vector addition of two one-dimensional arrays (A and B) of
size 5 using 5 threads.
1. Thread Assignment:
o The iterations are divided among the threads (here, one element per thread),
but the order in which the threads execute and print is non-deterministic and
can vary between runs. In this run, threads 0, 4, 3, 1, and 2 printed their
results in that order.
2. Computation:
o Each thread correctly adds corresponding elements from arrays A and B and
stores the result in C. For example, thread 0 computes C[0] = 1 + 10 = 11, and
thread 4 computes C[4] = 5 + 14 = 19.
3. Output Order:
o The output order of the threads in the terminal may not follow the order of
the indices because the threads execute concurrently. However, each element
of C is correctly calculated based on the input arrays.
4. Final Result:
o The resultant vector C correctly contains the sums of the corresponding
elements of arrays A and B, demonstrating successful parallel computation.
• The program efficiently utilizes 5 threads to perform vector addition. The
parallelization ensures that the task is divided among the threads, potentially
speeding up the computation compared to a single-threaded approach.
• The output correctly reflects the work done by each thread, showing that each
thread handled different parts of the task and contributed to the final result.
//CODE:-
#include <stdio.h>
#include <omp.h>
int main() {
    int i, num_threads = 4;
    int x = 10, y = 20;                    /* shared by default; initial values taken from the analysis below */
    omp_set_num_threads(num_threads);
    /* i is private; x and y stay shared, so every thread's update is visible to the others */
    #pragma omp parallel private(i)
    {
        i = omp_get_thread_num();
        x += i; y += i;                    /* reconstructed update, consistent with the reported final x = 16, y = 26 */
        printf("Thread %d: x = %d, y = %d\n", i, x, y);
    }
    printf("After parallel region: x = %d, y = %d\n", x, y);
    return 0;
}
//OUTPUT:-
(b) Execute the same program with firstprivate(), record the results and write
your observation.
//CODE:-
#include <stdio.h>
#include <omp.h>
int main() {
    int i, num_threads = 4, x = 10, y = 20;
    omp_set_num_threads(num_threads);
    /* firstprivate gives each thread its own copy of x and y, initialised to 10 and 20 */
    #pragma omp parallel private(i) firstprivate(x, y)
    {
        i = omp_get_thread_num();
        x += i; y += i;                    /* modifies only this thread's private copies */
        printf("Thread %d: x = %d, y = %d\n", i, x, y);
    }
    printf("After parallel region: x = %d, y = %d\n", x, y);   /* still 10 and 20 */
    return 0;
}
//OUTPUT:-
ANALYSIS:-
In the first program, the variables x and y are shared among all threads, but i is declared as
private.
Each thread has its own private copy of i, which it sets to its thread number
(omp_get_thread_num()).
Since x and y are shared, any modification to these variables by one thread will affect their
values as seen by other threads.
Threads execute in parallel, and due to the shared nature of x and y, their values are updated
by each thread.
The final output of x = 16 and y = 26 is the result of the cumulative modifications by all
threads.
The thread execution order is non-deterministic, so the sequence in which x and y are
updated varies with each run.
In the second program, the variables x and y are declared as firstprivate.
Each thread gets its own private copy of x and y, initialized to 10 and 20, respectively, before
entering the parallel region.
Changes made to x and y inside a thread do not affect the copies of other threads or the
original values of x and y.
Each thread starts with its own copy of x = 10 and y = 20.
The threads modify their copies of x and y, but these changes are local to each thread.
After the parallel region, the original x and y remain unchanged at 10 and 20, respectively.
This behavior contrasts with the first program, where x and y were shared and each thread's
changes were visible to the others.
• private(i), with x and y shared: the threads modify the same x and y concurrently, which
can lead to race conditions and makes the order of updates, and hence the output,
non-deterministic.
• firstprivate(x, y): Each thread operates on its own copy of x and y, initialized with the
values before the parallel region. The original variables remain unchanged after the
parallel region.
This comparison highlights the importance of choosing the right data-sharing attribute
(shared, private, or firstprivate) depending on whether a variable should be visible to all
threads or each thread should work on its own copy, initialized from the value it held before
the parallel region.