Problem Statement

1) The problem is to write a program that takes a matrix as input and calculates the sum of each column to output in a result vector. 2) The program is implemented sequentially using a for loop to iterate through the matrix and add column elements to the result vector. 3) A parallel implementation is also created using threads, with one thread calculating the sum for each column concurrently.

Uploaded by

ash1205

Problem Statement:

Given a matrix M of size n x m, write a program using C++ that computes the sum of
each column so that the result vector V of size m is defined like so:
V[j] = sum from i=0 to n-1 of M(i,j), for j in [0,m)
Example:
The matrix has size 4x8, hence n=4 and m=8.
The resultant vector of size m=8 is the sum of columns such that
V[j] with j in [0,8)
for j=0
V[0]
= sum from i=0 to n-1 of M(i,j)
i.e. sum from i=0 to 3 of M(i,0), where j=0 and n=4
i.e. M(0,0) + M(1,0) + M(2,0) + M(3,0)
i.e. 2 + 9 + 3 + 5 = 19.
V[0] = 19.
for j=1
V[1]
= sum from i=0 to n-1 of M(i,j)
i.e. sum from i=0 to 3 of M(i,1), where j=1 and n=4
i.e. M(0,1) + M(1,1) + M(2,1) + M(3,1)
i.e. 3 + 8 + 4 + 1 = 16.
V[1] = 16.
Similarly, all the 8 values (from 0 through 7) can be computed as shown in the figure.
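The definition can be checked with a short sketch. The function name column_sums and the use of std::vector are assumptions made for illustration and are not taken from the original program. The sketch also assumes every row has the same length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original program): returns V where
// V[j] is the sum over i = 0..n-1 of M[i][j].
// Assumes every row of M has the same number of columns.
std::vector<float> column_sums(const std::vector<std::vector<float>>& M)
{
    if (M.empty())
    {
        return {};
    }
    const std::size_t m = M[0].size();   // number of columns
    std::vector<float> V(m, 0.0f);       // result vector, initialised to zero
    for (const auto& row : M)            // i runs over the n rows
    {
        for (std::size_t j = 0; j < m; ++j)
        {
            V[j] += row[j];
        }
    }
    return V;
}
```

Only the first two columns of the example matrix are quoted in the text, (2, 9, 3, 5) and (3, 8, 4, 1). Passing them as a 4 x 2 matrix gives V = {19, 16}, matching V[0] and V[1] above.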
Task 1: Sequential and scalar implementation.
Partial sum is implemented in C++ with floating point values of single precision using
the float data type. A two dimensional matrix M of n x m is used to store the input and a
one dimensional array of size m is used to store the output. The choice of data
structures is to keep the solution simple.
The program is written for and targeted at a 64 bit Linux with Intel Core i3 Nehalem
(the code-name for an Intel processor micro-architecture) architecture.
The program has three parts:
1. Accept input.
Input is stored in a two dimensional array named M of size rowlength x collength.
All the elements are floats. Values are stored at indices 0 through row-1 and 0 through
col-1; the last row and column are left empty. Indexing starts from 0, therefore the
1st row, 1st column element is given by [0][0], and so forth, as indicated in the table below.

7 (0,0)    5 (0,1)    3 (0,2)    0 (0,3)
0 (1,0)    3 (1,1)    2 (1,2)    1 (1,3)
2 (2,0)    1 (2,1)    1 (2,2)    0 (2,3)
2. Calculate partial sum.
This is the computation part of the algorithm. It solves the problem sequentially by
computing values of V[j] for j from 0 through collength-1.
for (int i = 0; i < rowlength; ++i)
{
    for (int j = 0; j < collength; ++j)
    {
        V[j] += M[i][j];
    }
}
The loop works as follows:
First i is set to 0 and j loops from 0 through 3 (as collength is 4 in our example).
When j < collength evaluates as false, the inner loop exits and i is incremented.
Now i is set to 1 and j loops again from 0 through 3. This continues till the outer loop
condition fails.
The vector V during the execution is as follows:
Initially, when it is declared, V is initialised to all zeroes. Therefore, before the loop is
entered, V={0,0,0,0}.
When the inner loop is executed with i=0 fixed, V is the same as the first row elements (0,0),
(0,1), (0,2), (0,3): V={0,0,0,0}+{7,5,3,0}={7,5,3,0}.
When the inner loop is executed with i=1 fixed, V is changed to the second row elements added
to the first row elements, (1,0), (1,1), (1,2), (1,3): V={7,5,3,0}+{0,3,2,1}={7,8,5,1}
and so on.
In other words,
i=0
j=0  v[0] = v[0]+M[0][0] = 0+7 = 7
j=1  v[1] = v[1]+M[0][1] = 0+5 = 5
j=2  v[2] = v[2]+M[0][2] = 0+3 = 3
j=3  v[3] = v[3]+M[0][3] = 0+0 = 0
i=1
j=0  v[0] = v[0]+M[1][0] = 7+0 = 7
j=1  v[1] = v[1]+M[1][1] = 5+3 = 8
j=2  v[2] = v[2]+M[1][2] = 3+2 = 5
j=3  v[3] = v[3]+M[1][3] = 0+1 = 1
i=2
j=0  v[0] = v[0]+M[2][0] = 7+2 = 9
j=1  v[1] = v[1]+M[2][1] = 8+1 = 9
j=2  v[2] = v[2]+M[2][2] = 5+1 = 6
j=3  v[3] = v[3]+M[2][3] = 1+0 = 1
All these steps are performed sequentially one after the other.
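The trace can be reproduced with a small sketch that runs the same nested loop on the 3 x 4 example matrix and records V after each row. The helper name trace_partial_sums and the snapshot vector are assumptions added for illustration; the original program only keeps the final V.

```cpp
#include <vector>

const int rowlength = 3;
const int collength = 4;

// The 3 x 4 example matrix used in the walkthrough.
const float M[rowlength][collength] = {
    {7, 5, 3, 0},
    {0, 3, 2, 1},
    {2, 1, 1, 0},
};

// Hypothetical helper: runs the sequential loop and records V after each row.
std::vector<std::vector<float>> trace_partial_sums()
{
    std::vector<float> V(collength, 0.0f);        // V starts as all zeroes
    std::vector<std::vector<float>> snapshots;
    for (int i = 0; i < rowlength; ++i)
    {
        for (int j = 0; j < collength; ++j)
        {
            V[j] += M[i][j];
        }
        snapshots.push_back(V);                   // state of V once row i is added
    }
    return snapshots;
}
```

The three snapshots are {7,5,3,0}, {7,8,5,1} and {9,9,6,1}, the last being the final result vector.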
3. Printing the output.
This is simply done by looping through the output vector V from 0 through m-1.
Sample output: [figure not available]
Task 2: Parallel Implementation
A parallel implementation of partial sum is done using the threads built into C++'s newest
standard, C++11, which GCC on Linux implements on top of pthreads.
Advantages:
Parallel processing can easily be achieved using the <thread> header file. Calling
thread(funP, arguments) creates a thread which executes the funP function
in parallel with the parent thread executing the main() function.
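A minimal sketch of this pattern follows. The worker function multiply and the wrapper multiply_in_thread are hypothetical stand-ins for funP and its caller; only std::thread and join() come from the standard library.

```cpp
#include <thread>

// Hypothetical worker standing in for funP: stores a * b into *out.
void multiply(int a, int b, int* out)
{
    *out = a * b;
}

// Starts one thread running multiply(a, b, &result) and waits for it.
int multiply_in_thread(int a, int b)
{
    int result = 0;
    std::thread t(multiply, a, b, &result);  // the new thread starts running here
    t.join();                                // the parent blocks until it finishes
    return result;
}
```

With GCC on Linux such code is typically compiled with -std=c++11 -pthread. Reading result after join() is safe because join() returns only once the thread has finished.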
Limitations:
At the time of writing, the C++11 standard thread library is not implemented on all
compilers. It is there in GCC on Linux, but not in MinGW on Windows. TDM-GCC is one of the
few, if not the only, compilers on Windows to implement this library.
If a two dimensional array needs to be passed as an argument, its size should be known
and specified before compilation. A workaround was to define M and V as global
variables.
The data structure for holding matrix M is changed from a two dimensional array to a
single dimensional array of size n*m. M[i][j] for 2D is equivalent to M[i*col+j].
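The index equivalence can be illustrated with a short sketch. The helper names flatten and flat_index are assumptions for illustration; the original program simply stores M as a one dimensional array from the start.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies an n x m matrix into a 1D row-major array,
// so that flat[i * m + j] holds the old M[i][j].
std::vector<float> flatten(const std::vector<std::vector<float>>& M)
{
    std::vector<float> flat;
    for (const auto& row : M)
    {
        flat.insert(flat.end(), row.begin(), row.end());
    }
    return flat;
}

// Row-major position of element (i, j) in a matrix with col columns.
std::size_t flat_index(std::size_t i, std::size_t j, std::size_t col)
{
    return i * col + j;
}
```

For the 3 x 4 example matrix, element (1,3) lands at position 1*4+3 = 7 and element (2,0) at position 2*4+0 = 8 of the flat array.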
Parallel computation part of algorithm:
Instead of sequentially computing the value of V[j] one after another, we have one thread
for every element in V.
If there are m columns in the matrix, we create m threads, each deriving the partial sum
for its column.
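// needs <thread> and <vector>; all threads are started before any is joined
vector<thread> threads;
for (int j = 0; j < collength; ++j)
{
    threads.emplace_back(addcol, j, rowlength, collength);
}
for (thread& T : threads)
{
    T.join();
}
The first loop creates one thread per column, each calling the addcol function. The second
loop calls join() so that main() waits till all the threads complete their execution. This is
to ensure that main does not exit prematurely. Joining each thread inside the creation loop
would make every thread finish before the next one starts, so the columns would be summed
one after another instead of in parallel.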
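void addcol(int j, int rowlength, int collength)
{
    // sum column j of the row-major matrix M into V[j]
    for (int i = 0; i < rowlength; i++)
    {
        V[j] += M[i * collength + j];
    }
    // lines printed by different threads may interleave
    cout << "thread " << j << " computed V[" << j << "] as " << V[j] << endl;
}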
The addcol function adds the values of one column and gets the value of one element of V.
The long output is to demonstrate that the values are generated by individual threads.
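Putting the pieces together, one thread per column can be sketched as below. This is a variant under stated assumptions rather than the original program: M and V are passed as pointers instead of being globals, the printing is left out, and the names sum_column and parallel_column_sums are hypothetical.

```cpp
#include <thread>
#include <vector>

// Variant of addcol without globals: sums column j of a row-major matrix
// into V[j]. Each thread writes a different V[j], so no locking is needed.
void sum_column(const float* M, float* V, int j, int rowlength, int collength)
{
    float sum = 0.0f;
    for (int i = 0; i < rowlength; ++i)
    {
        sum += M[i * collength + j];
    }
    V[j] = sum;
}

// Hypothetical wrapper: starts one thread per column, then joins them all.
std::vector<float> parallel_column_sums(const std::vector<float>& M,
                                        int rowlength, int collength)
{
    std::vector<float> V(collength, 0.0f);
    std::vector<std::thread> threads;
    for (int j = 0; j < collength; ++j)
    {
        threads.emplace_back(sum_column, M.data(), V.data(), j, rowlength, collength);
    }
    for (auto& t : threads)
    {
        t.join();   // wait for every column to finish before returning V
    }
    return V;
}
```

For the 3 x 4 example matrix stored row by row as {7,5,3,0, 0,3,2,1, 2,1,1,0}, the result is {9,9,6,1}, the same as the sequential version.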
When n is large, there is a huge advantage of parallel computing as compared to the
sequential solution. When both n and m are small, the sequential solution is better
because of the overhead of creating threads. For moderate size inputs, the run time of
both solutions is comparable.
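One way to observe this trade-off on a given machine is to time both versions on the same input. The sketch below is an assumption-laden illustration, not the measurement method used in the original: the function names, the lambda-based threaded version and the time_ms helper are all hypothetical, and no timings are claimed here.

```cpp
#include <chrono>
#include <thread>
#include <vector>

// Sequential column sums over a row-major n x m matrix.
std::vector<float> sums_sequential(const std::vector<float>& M, int n, int m)
{
    std::vector<float> V(m, 0.0f);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j)
            V[j] += M[i * m + j];
    return V;
}

// One thread per column, as in Task 2.
std::vector<float> sums_threaded(const std::vector<float>& M, int n, int m)
{
    std::vector<float> V(m, 0.0f);
    std::vector<std::thread> threads;
    for (int j = 0; j < m; ++j)
    {
        threads.emplace_back([&M, &V, j, n, m] {
            float sum = 0.0f;
            for (int i = 0; i < n; ++i)
                sum += M[i * m + j];
            V[j] = sum;               // each thread writes its own element
        });
    }
    for (auto& t : threads)
        t.join();
    return V;
}

// Hypothetical timing helper: wall-clock milliseconds taken by f(M, n, m).
template <typename F>
double time_ms(F f, const std::vector<float>& M, int n, int m)
{
    auto start = std::chrono::steady_clock::now();
    f(M, n, m);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_ms(sums_sequential, M, n, m) and time_ms(sums_threaded, M, n, m) for increasing n gives a rough picture of where thread creation overhead stops dominating. Results depend on the hardware, the compiler flags and the number of cores.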
Sample output: [figure not available]
