
Homework #4 EL6201 - Parallel System

Ardianto Satriawan
23213079

Semester II 2013/2014

1 OpenMP Matrix Addition


For base comparison, a 500 × 500 matrix addition program is created as follows. Matrices A and
B are initialized in matrices500.h:

/**********************************/
/* Ardianto Satriawan             */
/* Semester II 2013/14            */
/* Serial Matrix Addition         */
/* EL6201 Parallel System         */
/**********************************/

#define LENGTH 500

#include <stdio.h>
#include <math.h>
#include <time.h>

/* the matrices to add are in this header file, declared by extern */
#include "matrices500.h"

int main(int argc, char *argv[]) {
    clock_t tic = clock();

    /* result of matrix addition */
    int C[LENGTH][LENGTH];

    /* matrix addition calculation */
    int i, j;
    for (i = 0; i < LENGTH; i++) {
        for (j = 0; j < LENGTH; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }

    clock_t toc = clock();
    printf("Elapsed: %f seconds\n", (double)(toc - tic) / CLOCKS_PER_SEC);
    return 0;
}

matrixadd.c
I created four variants of the OpenMP matrix addition by adding one of the following directives
before the nested for loop:

#pragma omp parallel for

#pragma omp parallel for private(j)

#pragma omp parallel for schedule(dynamic)

#pragma omp parallel for schedule(dynamic,2)

Here are their code listings; the only difference between the four versions is the OpenMP directive placed before the outer for loop.
/***********************************************/
/* Ardianto Satriawan                          */
/* Semester II 2013/14                         */
/* Matrix Addition                             */
/* using #pragma omp parallel for directive    */
/* EL6201 Parallel System                      */
/***********************************************/

#define LENGTH 500

#include <stdio.h>
#include <math.h>
#include <time.h>

/* the matrices to add are in this header file, declared by extern */
#include "matrices500.h"

int main(int argc, char *argv[]) {
    clock_t tic = clock();

    /* result of matrix addition */
    int C[LENGTH][LENGTH];

    /* matrix addition calculation */
    int i, j;
    #pragma omp parallel for
    for (i = 0; i < LENGTH; i++) {
        for (j = 0; j < LENGTH; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }

    clock_t toc = clock();
    printf("Elapsed: %f seconds\n", (double)(toc - tic) / CLOCKS_PER_SEC);
    return 0;
}

matrixaddomp1.c

/***********************************************/
/* Ardianto Satriawan                          */
/* Semester II 2013/14                         */
/* Matrix Addition                             */
/* using #pragma omp parallel for private(j)   */
/* EL6201 Parallel System                      */
/***********************************************/

#define LENGTH 500

#include <stdio.h>
#include <math.h>
#include <time.h>

/* the matrices to add are in this header file, declared by extern */
#include "matrices500.h"

int main(int argc, char *argv[]) {
    clock_t tic = clock();

    /* result of matrix addition */
    int C[LENGTH][LENGTH];

    /* matrix addition calculation */
    int i, j;
    #pragma omp parallel for private(j)
    for (i = 0; i < LENGTH; i++) {
        for (j = 0; j < LENGTH; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }

    clock_t toc = clock();
    printf("Elapsed: %f seconds\n", (double)(toc - tic) / CLOCKS_PER_SEC);
    return 0;
}

matrixaddomp2.c

/*****************************************************/
/* Ardianto Satriawan                                */
/* Semester II 2013/14                               */
/* Matrix Addition                                   */
/* using #pragma omp parallel for schedule(dynamic)  */
/* EL6201 Parallel System                            */
/*****************************************************/

#define LENGTH 500

#include <stdio.h>
#include <math.h>
#include <time.h>

/* the matrices to add are in this header file, declared by extern */
#include "matrices500.h"

int main(int argc, char *argv[]) {
    clock_t tic = clock();

    /* result of matrix addition */
    int C[LENGTH][LENGTH];

    /* matrix addition calculation */
    int i, j;
    #pragma omp parallel for schedule(dynamic)
    for (i = 0; i < LENGTH; i++) {
        for (j = 0; j < LENGTH; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }

    clock_t toc = clock();
    printf("Elapsed: %f seconds\n", (double)(toc - tic) / CLOCKS_PER_SEC);
    return 0;
}

matrixaddomp3.c
/*******************************************************/
/* Ardianto Satriawan                                  */
/* Semester II 2013/14                                 */
/* Matrix Addition                                     */
/* using #pragma omp parallel for schedule(dynamic,2)  */
/* EL6201 Parallel System                              */
/*******************************************************/

#define LENGTH 500

#include <stdio.h>
#include <math.h>
#include <time.h>

/* the matrices to add are in this header file, declared by extern */
#include "matrices500.h"

int main(int argc, char *argv[]) {
    clock_t tic = clock();

    /* result of matrix addition */
    int C[LENGTH][LENGTH];

    /* matrix addition calculation */
    int i, j;
    #pragma omp parallel for schedule(dynamic,2)
    for (i = 0; i < LENGTH; i++) {
        for (j = 0; j < LENGTH; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }

    clock_t toc = clock();
    printf("Elapsed: %f seconds\n", (double)(toc - tic) / CLOCKS_PER_SEC);
    return 0;
}

matrixaddomp4.c
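A note on timing before the comparison: the listings above measure time with clock() from <time.h>, which on many systems reports CPU time accumulated over all threads rather than elapsed wall-clock time. The following is only a sketch of a wall-clock measurement using omp_get_wtime(); it is not the code used for the measured results below, and it assumes the same matrices500.h header and compilation with an OpenMP flag such as GCC's -fopenmp.

/* Timing sketch only (not the assignment's measured code): omp_get_wtime()
 * returns elapsed wall-clock seconds.                                      */
#define LENGTH 500

#include <stdio.h>
#include <omp.h>

#include "matrices500.h"

int main(int argc, char *argv[]) {
    static int C[LENGTH][LENGTH];      /* static to keep it off the stack */
    double tic = omp_get_wtime();

    int i, j;
    #pragma omp parallel for private(j)
    for (i = 0; i < LENGTH; i++) {
        for (j = 0; j < LENGTH; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }

    printf("Elapsed: %f seconds\n", omp_get_wtime() - tic);
    return 0;
}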
The comparison between the serial version and the four OpenMP variants can be viewed in the following table:

We can see that #pragma omp parallel for private(j) makes the program run about 1.21 times
faster than the serial version, while the other directives make the program run slower. In the
variants without private(j), the inner index j is shared by all threads (only the parallelized
loop's index i is made private automatically), which creates a data race on j and hurts both
correctness and performance.
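As a side note, a common alternative to the private(j) clause is to declare the loop indices inside the for statements themselves (C99): variables declared inside the parallel region are private to each thread automatically. A minimal sketch of this, using a hypothetical helper matrix_add that is not one of the measured variants:

#define LENGTH 500

/* Sketch only: declaring i and j in the for statements (C99) makes them
 * private to each thread, so no private(j) clause is needed.             */
void matrix_add(const int A[LENGTH][LENGTH],
                const int B[LENGTH][LENGTH],
                int C[LENGTH][LENGTH]) {
    #pragma omp parallel for
    for (int i = 0; i < LENGTH; i++) {
        for (int j = 0; j < LENGTH; j++) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }
}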

2 OpenMP Matrix Multiplication


For base comparison, a 500 × 500 matrix multiplication program is created as follows. Matrices
A and B are initialized in matrices500.h:

/**********************************/
/* Ardianto Satriawan             */
/* Semester II 2013/14            */
/* Serial Matrix Multiplication   */
/* EL6201 Parallel System         */
/**********************************/

#define LENGTH 500

#include <stdio.h>
#include <math.h>
#include <time.h>

/* the matrices to multiply are in this header file, declared by extern */
#include "matrices500.h"

int main(int argc, char *argv[]) {
    clock_t tic = clock();

    /* result of matrix multiplication */
    int C[LENGTH][LENGTH];

    /* matrix multiplication calculation */
    int i, j, k;
    for (i = 0; i < LENGTH; i++) {
        for (j = 0; j < LENGTH; j++) {
            C[i][j] = 0;
            for (k = 0; k < LENGTH; k++) {
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }

    clock_t toc = clock();
    printf("Elapsed: %f seconds\n", (double)(toc - tic) / CLOCKS_PER_SEC);
    return 0;
}

matrixmul.c
Analyzing the data dependency of the operation C[i][j] += A[i][k] * B[k][j], we can see that
the iterations over i are independent of each other, and the same holds for the iterations over j.
For simplicity, we consider parallelizing only the outer for loop by adding

#pragma omp parallel for default(none) shared(A,B,C) private(j,k)

before the nested for loop. (A sketch that parallelizes both independent loops is shown after the
listing below.)
/***************************************/
/* Ardianto Satriawan                  */
/* Semester II 2013/14                 */
/* Matrix Multiplication OpenMP variant*/
/* EL6201 Parallel System              */
/***************************************/

#define LENGTH 500

#include <stdio.h>
#include <math.h>
#include <time.h>

/* the matrices to multiply are in this header file, declared by extern */
#include "matrices500.h"

int main(int argc, char *argv[]) {
    clock_t tic = clock();

    /* result of matrix multiplication */
    int C[LENGTH][LENGTH];

    /* matrix multiplication calculation */
    int i, j, k;

    #pragma omp parallel for default(none) shared(A,B,C) private(j,k)
    for (i = 0; i < LENGTH; i++) {
        for (j = 0; j < LENGTH; j++) {
            C[i][j] = 0;
            for (k = 0; k < LENGTH; k++) {
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }

    clock_t toc = clock();
    printf("Elapsed: %f seconds\n", (double)(toc - tic) / CLOCKS_PER_SEC);
    return 0;
}

matrixmulomp.c
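As mentioned above, only the outer i loop is parallelized here. A sketch of an alternative that parallelizes both independent loops follows; it assumes a compiler supporting OpenMP 3.0's collapse clause, uses a hypothetical helper matrix_mul_collapse, and is not one of the measured versions.

#define LENGTH 500

/* Sketch only: collapse(2) merges the i and j loops into one parallel
 * iteration space. Indices declared in the for statements are private,
 * and sum is declared inside the region, so it is private as well.       */
void matrix_mul_collapse(const int A[LENGTH][LENGTH],
                         const int B[LENGTH][LENGTH],
                         int C[LENGTH][LENGTH]) {
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < LENGTH; i++) {
        for (int j = 0; j < LENGTH; j++) {
            int sum = 0;
            for (int k = 0; k < LENGTH; k++) {
                sum += A[i][k] * B[k][j];
            }
            C[i][j] = sum;
        }
    }
}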
With the OpenMP directive (#pragma), the i for loop is divided into multiple chunks, and each
chunk is assigned to a thread. Hence, multiple threads can compute their assigned chunks in parallel.
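To make this chunk-to-thread assignment visible, here is a small illustrative program that records which rows each thread was given. It is not part of the assignment; it assumes at most 64 threads and the usual static default schedule, and must be compiled with an OpenMP flag such as -fopenmp.

#include <stdio.h>
#include <omp.h>

#define LENGTH  500
#define MAXTHR  64   /* assumed upper bound on thread count for this sketch */

/* Sketch only: record which rows each thread processes. With a static
 * (contiguous-block) schedule, first..last is the thread's chunk of i.   */
int main(void) {
    int first[MAXTHR], last[MAXTHR];
    for (int t = 0; t < MAXTHR; t++) first[t] = last[t] = -1;

    #pragma omp parallel for
    for (int i = 0; i < LENGTH; i++) {
        int t = omp_get_thread_num();   /* each thread writes only its slot */
        if (first[t] == -1) first[t] = i;
        last[t] = i;
    }

    for (int t = 0; t < MAXTHR; t++)
        if (first[t] != -1)
            printf("thread %2d handled rows %3d..%3d\n", t, first[t], last[t]);
    return 0;
}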
The performance comparison between the serial and OpenMP versions:

Speed-up = 2.64 / 1.40 ≈ 1.88 times

The parallel version is about 1.88 times faster than the serial version on a dual-core system,
which is quite good.
