Daa 1
Unit : 6
Multithreaded and
Distributed Algorithms
Multithreaded algorithms
Introduction
- parallel: This keyword is used with the for loop to indicate that
each iteration can be executed in parallel.
- spawn: This keyword creates a new child sub-process (a spawned
subroutine) that runs concurrently while the parent keeps executing.
- sync: This keyword forces the procedure to wait until all parallel
threads it has spawned finish.
• These keywords add parallelism without affecting the remaining
sequential program.
•e.g.: Simple recursive Fibonacci algorithm without parallelism and
with parallelism.
The parallel Fibonacci algorithm achieves parallelism with the help of
the keywords spawn and sync. The keyword spawn creates a child which
computes P-FIB(n − 1), while the parent continues executing and
computes P-FIB(n − 2). Thus both computations proceed in parallel.
Finally, these executions are synchronized using the sync keyword.
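The spawn/sync pseudocode can be sketched in Python, mapping spawn to creating a thread and sync to joining it (this mapping is an illustrative assumption, not part of the pseudocode):

```python
import threading

def p_fib(n, result):
    """Parallel Fibonacci; result is a one-element list used as an out-parameter."""
    if n <= 1:                 # base case, computed serially
        result[0] = n
        return
    left, right = [0], [0]
    # "spawn": a child thread computes P-FIB(n - 1) ...
    child = threading.Thread(target=p_fib, args=(n - 1, left))
    child.start()
    # ... while the parent continues and computes P-FIB(n - 2)
    p_fib(n - 2, right)
    # "sync": wait for the spawned child before combining results
    child.join()
    result[0] = left[0] + right[0]
```

Because of Python's global interpreter lock, this sketch illustrates the control structure rather than a real speedup.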
Performance Measures
Theoretical efficiency of multithreaded algorithms is computed using
two basic metrics:
• Work (T1): the total time to execute the entire computation on one
processor. Work is the sum of the times taken by each of the strands.
• Span (T∞): the longest time to execute the strands along any path in
the computation's directed acyclic graph.
From these we derive:
• Speedup: the ratio T1/TP, where TP is the running time on P processors.
• Parallelism: Work/Span, i.e. T1/T∞.
• Parallelism can be interpreted in three ways:
a. Ratio: The average amount of work that can be performed
for each step of parallel execution time.
b. Upper bound: The maximum possible speedup that can be
achieved on any number of processors.
c. Limit: The limit on the possibility of attaining perfect linear
speedup. Once the number of processors exceeds the
parallelism, the computation cannot possibly achieve
perfect linear speedup; the more processors we use beyond
the parallelism, the less perfect the speedup.
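As a worked example with hypothetical numbers (not from the source), suppose the work is T1 = 1024 and the span is T∞ = 16:

```python
# Hypothetical figures chosen for illustration: work T1 and span T_inf
work, span = 1024, 16
parallelism = work / span                 # = 64.0, the maximum possible speedup
# On P processors the speedup T1/TP can never exceed min(P, parallelism):
bounds = {P: min(P, parallelism) for P in (8, 64, 512)}
# beyond 64 processors there is no further gain: bounds[512] == 64.0
```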
• The span of P-FIB satisfies the recurrence T∞(n) = max(T∞(n−1), T∞(n−2)) + Θ(1)
= T∞(n−1) + Θ(1), which has solution T∞(n) = Θ(n). Since the work is
T1(n) = Θ(φ^n), where φ is the golden ratio, the parallelism of P-FIB(n) is
T1(n)/T∞(n) = Θ(φ^n / n), which grows dramatically as n gets large. Thus, on
even the largest parallel computers, a modest value of n suffices to achieve
near-perfect linear speedup for P-FIB(n), because this procedure
exhibits considerable parallel slackness.
Parallel loops
• Many algorithms contain loops all of whose iterations can operate
in parallel. As we shall see, we can parallelize such loops using the
spawn and sync keywords, but it is much more convenient to
specify directly that the iterations of such loops can run
concurrently. Our pseudocode provides this functionality via the
parallel concurrency keyword, which precedes the for keyword in a
for loop statement.
• As an example, consider the problem of multiplying an n × n matrix
A = (aij) by an n-vector x = (xj). The resulting n-vector y = (yi) is given
by the equation
yi = Σ (j = 1 to n) aij xj ,
for i = 1, 2, ..., n. We can perform matrix-vector
multiplication by computing all the entries of y in parallel
as follows:
MAT-VEC (A, x)
1. n = A.rows
2. let y be a new vector of length n
3. parallel for i = 1 to n
4.     yi = 0
5. parallel for i = 1 to n
6.     for j = 1 to n
7.         yi = yi + aij xj
8. return y
In this code, the parallel for keywords in lines 3 and 5
indicate that the iterations of the respective loops may
be run concurrently. A compiler can implement each
parallel for loop as a divide-and-conquer subroutine
using nested parallelism.
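The parallel for loop can be sketched with a thread pool in Python (the use of concurrent.futures here is an assumed, illustrative mapping of parallel for, not the compiler strategy described above):

```python
from concurrent.futures import ThreadPoolExecutor

def mat_vec(A, x):
    """Compute y = A x, evaluating the rows as independent parallel tasks."""
    n = len(A)
    y = [0] * n

    def compute_row(i):
        # serial inner loop: y[i] = sum over j of A[i][j] * x[j]
        s = 0
        for j in range(n):
            s += A[i][j] * x[j]
        y[i] = s

    # "parallel for i = 1 to n": one task per row
    with ThreadPoolExecutor() as pool:
        list(pool.map(compute_row, range(n)))  # force all tasks to complete
    return y
```

Each y[i] is written by exactly one task, so the loop iterations are free of determinacy races.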
For example, the parallel for loop in lines 5–7 can be
implemented with the call MAT-VEC-MAIN-LOOP(A, x, y, n,
1, n), where the compiler produces the auxiliary
subroutine MAT-VEC-MAIN-LOOP as follows:
A dag representing the computation of MAT-VEC-MAIN-
LOOP(A, x, y, 8, 1, 8). The two numbers within each
rounded rectangle give the values of the last two
parameters (i and i′ in the procedure header) in the
invocation (spawn or call) of the procedure. The black
circles represent strands corresponding to either the
base case or the part of the procedure up to the spawn
of MAT-VEC-MAIN-LOOP in line 5; the shaded circles
represent strands corresponding to the part of the
procedure that calls MAT-VEC-MAIN-LOOP in line 6 up
to the sync in line 7, where it suspends until the
spawned subroutine in line 5 returns; and the white
circles represent strands corresponding to the
(negligible) part of the procedure after the sync up to
the point where it returns.
MAT-VEC-MAIN-LOOP (A, x, y, n, i, i′)
1. if i == i′
2.     for j = 1 to n
3.         yi = yi + aij xj
4. else mid = ⌊(i + i′)/2⌋
5.     spawn MAT-VEC-MAIN-LOOP (A, x, y, n, i, mid)
6.     MAT-VEC-MAIN-LOOP (A, x, y, n, mid + 1, i′)
7.     sync
This code recursively spawns the first half of the iterations of the loop
to execute in parallel with the second half of the iterations and then
executes a sync, thereby creating a binary tree of execution where the
leaves are individual loop iterations, as shown in Figure.
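A 0-indexed Python sketch of this divide-and-conquer scheme, again mapping spawn to a thread and sync to join (an illustrative assumption):

```python
import threading

def mat_vec_main_loop(A, x, y, n, i, ip):
    """Recursively split the iteration range [i, ip] (0-indexed) in half."""
    if i == ip:                       # base case: one loop iteration
        for j in range(n):
            y[i] += A[i][j] * x[j]
    else:
        mid = (i + ip) // 2
        # spawn: the first half of the range runs in a child thread ...
        child = threading.Thread(target=mat_vec_main_loop,
                                 args=(A, x, y, n, i, mid))
        child.start()
        # ... while the parent executes the second half
        mat_vec_main_loop(A, x, y, n, mid + 1, ip)
        child.join()                  # sync
```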
Race Conditions
• A multithreaded algorithm is deterministic if and only if it
does the same thing on the same input, no matter how
the instructions are scheduled. A multithreaded algorithm
is nondeterministic if its behaviour varies from run to run.
• A multithreaded algorithm that is intended to be
deterministic may nevertheless behave nondeterministically
because it contains a determinacy race.
• A determinacy race is a situation which occurs when two
logically parallel instructions access the same memory
location and at least one of the instructions performs a
write. This condition is called a race condition.
e.g., consider two logically parallel strands that each increment a shared
variable a. Each increment performs three operations:
1) Read a from memory into one of the processor's registers.
2) Increment the value in the register.
3) Write the value in the register back into a in memory.
• One possible interleaving is illustrated below (r1 and r2 are the
registers of the two strands):
step | a | r1 | r2
  1  | 0 | –  | –
  2  | 0 | 0  | –
  3  | 0 | 1  | –
  4  | 0 | 1  | 0
  5  | 0 | 1  | 1
  6  | 1 | 1  | 1
  7  | 1 | 1  | 1
Both strands read a before either writes it back, so the final value of a
is 1 even though two increments were performed.
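The lost update can be reproduced deterministically by spelling out the interleaved steps in straight-line code:

```python
# Deterministic replay of the racy interleaving in the table above
a = 0        # step 1: the shared variable starts at 0
r1 = a       # step 2: strand 1 reads a into its register
r1 += 1      # step 3: strand 1 increments its register
r2 = a       # step 4: strand 2 reads a (still 0!)
r2 += 1      # step 5: strand 2 increments its register
a = r1       # step 6: strand 1 writes back, a == 1
a = r2       # step 7: strand 2 writes back, a == 1; one increment is lost
```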
For multithreaded merge sort with a serial MERGE procedure, the parallelism is only
Θ(lg n). As we can see, this is low parallelism, which means that even with massive input,
having hundreds of processors would not be beneficial. So, to increase the parallelism,
we can parallelize the serial MERGE procedure itself.
Distributed Algorithms
Introduction
• Distributed algorithms are algorithms designed to run on a
distributed network or on a multiprocessor.
• The distributed network is formed by interconnected processors.
Distributed algorithms are a sub-type of parallel algorithms: separate
parts of the algorithm are typically executed concurrently on
independent processors, each having limited information about what
the other parts of the algorithm are doing.
• Application areas:
- Distributed computing
- Telecommunication
- Scientific computing
- Distributed information processing
- Distributed database systems
- Real-time process control
• The key challenge in developing and implementing distributed
algorithms is successfully coordinating the behaviour of the
independent parts of the algorithm in the face of processor failures
and unreliable communication links.
Distributed breadth first search
• Assuming that we have a connected network of n nodes, the
distributed BFS algorithm proceeds as follows:
1) Each node is assigned a unique ID.
2) A random node is chosen as the root node. This node initiates the
BFS process by sending a message to all its neighbours, requesting
that they provide their IDs.
3) Upon receiving the IDs from its neighbours, the root node computes
the shortest paths from itself to all other nodes using the IDs and
creates a tree. The tree is then broadcast to all nodes in the
network.
4) Each node receives the tree and sets its own ID as the parent of all
nodes in its subtree.
5) The process continues until all nodes have been visited.
The main advantage of the distributed BFS algorithm is that it is scalable
and can be used to find the shortest path in large networks.
The main disadvantage of the distributed BFS algorithm is that it requires
a lot of communication between nodes, which can be costly.
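The level-by-level message exchange can be simulated in a single process; the round structure below is a sketch of the idea, not the exact protocol described above:

```python
def distributed_bfs(adj, root):
    """Simulate level-synchronous BFS: in each round, every newly
    discovered node sends a 'join my tree' message to its neighbours."""
    parent = {root: root}             # the root is its own parent
    frontier = [root]
    while frontier:                   # one communication round per level
        next_frontier = []
        for u in frontier:
            for v in adj[u]:          # u messages each neighbour v
                if v not in parent:   # v accepts the first message it receives
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
    return parent                     # the BFS tree as child -> parent links
```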
Minimum spanning tree
• Application to leader election:
− Convergecast from the leaves until messages meet at a node or an edge.
− Works with any spanning tree, not just the MST.
− E.g., in an asynchronous ring, this yields O(n log n) messages for leader
election.
• Lower bound on message complexity:
− Ω(n log n), from the leader election lower bound and the reduction
above.
String Matching
• Introduction
• A string matching algorithm is also called a "string searching
algorithm." This is a vital class of string algorithms: the task is to
find a place where one or several strings (patterns) are found
within a larger string (the text).
• Given a text array T[1..n] of n characters and a pattern array
P[1..m] of m characters, the problem is to find an integer s,
called a valid shift, where 0 ≤ s ≤ n − m and T[s+1..s+m] = P[1..m].
In other words, we want to find every occurrence of P in T, i.e.,
every position where P is a substring of T. The characters of P and T
are drawn from some finite alphabet, such as {0, 1} or
{A, B, ..., Z, a, b, ..., z}.
Given a string T[1..n], the substrings are represented as
T[i..j] for some 1 ≤ i ≤ j ≤ n: the string formed by the
characters in T from index i to index j, inclusive. This
means that a string is a substring of itself (take i = 1
and j = n).
A proper substring of the string T[1..n] is T[i..j] for some
1 ≤ i ≤ j ≤ n with (i, j) ≠ (1, n); that is, we must have
either i > 1 or j < n.
Using these definitions, we can say that, given any string
T[1..n], the substrings are
T[i..j] = T[i] T[i+1] T[i+2] ... T[j] for some 1 ≤ i ≤ j ≤ n,
and the proper substrings are those substrings with i > 1
or j < n.
Naïve string matching algorithm
• The naïve approach tests all possible placements of the
pattern P[1..m] relative to the text T[1..n]. We try each
shift s = 0, 1, ..., n − m successively, and for each shift s
we compare T[s+1..s+m] to P[1..m].
• The naïve algorithm finds all valid shifts using
a loop that checks the condition P[1..m] =
T[s+1..s+m] for each of the n − m + 1
possible values of s.
Algorithm
NAIVE-STRING-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. for s ← 0 to n − m
4.     do if P[1..m] = T[s+1..s+m]
5.         then print "Pattern occurs with shift" s
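The pseudocode translates directly to Python (using 0-indexed strings, so the shifts it returns are exactly the values of s above):

```python
def naive_string_matcher(text, pattern):
    """Return every valid shift s with text[s:s+m] == pattern."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):         # try each shift s = 0, 1, ..., n - m
        if text[s:s + m] == pattern:   # compare m characters at this shift
            shifts.append(s)
    return shifts
```

For example, naive_string_matcher("aabaacaab", "aab") finds the shifts 0 and 6.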
Rabin-Karp (hash-based) matching, step-by-step approach:
1. Initially, calculate the hash value of the pattern.
2. Start iterating from the beginning of the text:
- Calculate the hash value of the current substring of length m.
- If the hash values of the current substring and the pattern are the
same, check whether the substring is identical to the pattern.
- If they are identical, store the starting index as a valid answer;
otherwise, continue with the next substring.
3. Return the starting indices as the required answer.
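These steps can be sketched as follows; the base and modulus values are illustrative choices, not from the source:

```python
def rabin_karp(text, pattern, base=256, mod=101):
    """Hash-based matching with a rolling hash; returns all valid shifts."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    h = pow(base, m - 1, mod)               # weight of the window's leading char
    p_hash = t_hash = 0
    for i in range(m):                      # hash the pattern and first window
        p_hash = (base * p_hash + ord(pattern[i])) % mod
        t_hash = (base * t_hash + ord(text[i])) % mod
    shifts = []
    for s in range(n - m + 1):
        # equal hashes may be a collision, so verify character by character
        if p_hash == t_hash and text[s:s + m] == pattern:
            shifts.append(s)
        if s < n - m:                       # roll the hash to the next window
            t_hash = (base * (t_hash - ord(text[s]) * h)
                      + ord(text[s + m])) % mod
    return shifts
```

The explicit verification step makes the matcher correct even when two different windows happen to share a hash value.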