
Design and Analysis of Algorithms (15CS43)

MODULE 2-DIVIDE AND CONQUER

Prepared by

ARPITHA C N
Asst. professor,
Dept. of CS&E,
A.I.T., Chikamagaluru

2.1 GENERAL METHOD


Given a function to compute on n inputs, the divide-and-conquer strategy suggests splitting the inputs into k distinct subsets, 1 < k ≤ n, yielding k subproblems. These subproblems must be solved, and then a method must be found to combine the sub-solutions into a solution of the whole. If the subproblems are still relatively large, then the divide-and-conquer strategy can possibly be reapplied. Often the subproblems resulting from a divide-and-conquer design are of the same type as the original problem. For those cases the reapplication of the divide-and-conquer principle is naturally expressed by a recursive algorithm. Smaller and smaller subproblems of the same kind are generated until eventually subproblems that are small enough to be solved without splitting are produced.

Example 1: Detecting a counterfeit coin


Problem:
You are given a bag with 16 coins and told that one of these coins may be counterfeit. Further, you are told that counterfeit coins are lighter than genuine ones. Your task is to determine whether the bag contains a counterfeit coin or not. To aid you in this task, you have a machine that compares the weights of two sets of coins and tells you which set is lighter or whether both sets have the same weight.
Solution 1: Without using divide and conquer
We can compare the weights of coins 1 and 2. If coin 1 is lighter than coin 2, then coin 1 is counterfeit and we are done with our task. If coin 2 is lighter than coin 1, then coin 2 is counterfeit. If both coins have the same weight, we compare coins 3 and 4. Again, if one coin is lighter, a counterfeit coin has been detected and we are done. If not, we compare coins 5 and 6. Proceeding in this way, we can determine whether the bag contains a counterfeit
coin by making at most eight weight comparisons. This process also identifies the
counterfeit coin.
Solution 2: With using divide and conquer
Another approach is to use the divide-and-conquer methodology. Suppose that our 16-coin instance is considered a large instance. In step 1, we divide the original instance into two or more similar instances. Let us divide our 16-coin instance into two 8-coin instances by arbitrarily selecting 8 coins for the first instance (A) and the remaining 8 coins for the second instance (B). In step 2, we need to determine whether A or B has a counterfeit coin. For this step we use our machine to compare the weights of the coin sets A and B. If both sets have the same weight, a counterfeit coin is not present in the 16 coins. If A and B have different weights, a counterfeit coin is present and it lies in the lighter set. In step 3 we take the results from step 2 and generate the answer for the original 16-coin instance. For the counterfeit coin problem step 3 is easy: the 16-coin instance has a counterfeit coin iff either A or B has one. So with just one weight comparison we can complete the task of determining the presence of a counterfeit coin.
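As a rough Python illustration of this halving strategy (this is only a sketch: the weigh helper stands in for the weighing machine, coins are represented by their weights, and we assume the number of coins is a power of 2 with at most one lighter coin):

def weigh(left, right):
    # Simulated weighing machine: -1 if the left set is lighter, 0 if both sets
    # have the same weight, 1 if the right set is lighter.
    if sum(left) < sum(right):
        return -1
    if sum(left) > sum(right):
        return 1
    return 0

def find_counterfeit(coins):
    # Return the index of the lighter (counterfeit) coin, or None if none exists.
    lo, hi = 0, len(coins)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        result = weigh(coins[lo:mid], coins[mid:hi])
        if result == 0:
            return None          # the two halves balance: no counterfeit coin
        if result == -1:
            hi = mid             # the counterfeit is in the lighter left half
        else:
            lo = mid             # the counterfeit is in the lighter right half
    return lo

# Example: find_counterfeit([10]*5 + [9] + [10]*10) returns 5.

With 16 coins, the very first weighing already answers the yes/no question posed above; the remaining iterations of the loop go further and identify the coin itself.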

To be more precise, suppose we consider the divide-and-conquer strategy when it splits the input into two subproblems of the same kind as the original problem. This splitting is typical of many of the problems we examine here. We can write a control abstraction that mirrors the way an algorithm based on divide and conquer will look. By a control abstraction we mean a procedure whose flow of control is clear but whose primary operations are specified by other procedures whose precise meanings are left undefined. The DAndC algorithm is initially invoked as DAndC(P), where P is the problem to be solved.

Small(P) is a Boolean-valued function that determines whether the input size is small enough that the answer can be computed without splitting. If this is so, the function S is invoked. Otherwise the problem P is divided into smaller subproblems. These subproblems P1, P2, P3, …, Pk are solved by recursive applications of DAndC. Combine is a function that determines the solution to P using the solutions to the k subproblems. Suppose the size of P is n and the sizes of the k subproblems are n1, n2, …, nk, respectively.

ALGORITHM DAndC(P)
{
If small(P) then return S(P)
else
{
Divide P into smaller instances P1,P2,P3,…,Pk , k>=1
Apply DAndC to each of these subproblems
return Combine(DAndC(P1), DAndC(P2), DAndC(P3),…… DAndC(Pk));
}
}
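A generic Python rendering of this control abstraction might look as follows; the parameter names small, solve_directly, divide and combine are placeholders for the problem-specific routines Small, S, the dividing step and Combine, and are not part of any library:

def d_and_c(problem, small, solve_directly, divide, combine):
    # Generic divide-and-conquer driver mirroring DAndC(P).
    if small(problem):
        return solve_directly(problem)        # corresponds to S(P)
    subproblems = divide(problem)             # P1, P2, ..., Pk with k >= 1
    solutions = [d_and_c(p, small, solve_directly, divide, combine)
                 for p in subproblems]
    return combine(solutions)                 # Combine(DAndC(P1), ..., DAndC(Pk))

# Example use: summing a list by repeatedly halving it.
total = d_and_c([3, 1, 4, 1, 5],
                small=lambda p: len(p) <= 1,
                solve_directly=lambda p: p[0] if p else 0,
                divide=lambda p: [p[:len(p)//2], p[len(p)//2:]],
                combine=sum)                  # total == 14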

Then the computing time of DAndC is described by the recurrence relation.

T(n) = g(n)                                    n small
     = T(n1) + T(n2) + ... + T(nk) + f(n)      otherwise          Eq 2.1

where T(n) is the time for DAndC on any input of size n and g(n) is the time to compute the answer directly for small inputs. The function f(n) is the time for dividing P and combining the solutions to the subproblems. For divide-and-conquer-based algorithms that produce subproblems of the same type as the original problem, it is very natural to first describe such algorithms using recursion. The complexity of many divide-and-conquer algorithms is given by a recurrence relation of the form

T(n) = T(1)                n = 1
     = a·T(n/b) + f(n)     n > 1          Eq 2.2

where a and b are known constants. We assume T(1) is known and n is a power of b (i.e., n = b^k).

One of the methods for solving this recurrence relation is called the substitution method. This method repeatedly makes substitutions for each occurrence of the function T on the right-hand side until all such occurrences disappear.

Example 1:

Consider the case in which a=1 and b=2. Let T(1)=2 and f(n)=n.

We have,

T(n) = 2T(n/2) + n
     = 2[2T(n/4) + n/2] + n = 4T(n/4) + 2n
     = 4[2T(n/8) + n/4] + 2n = 8T(n/8) + 3n

In general we see that
T(n) = 2^i T(n/2^i) + i·n, for any log₂ n ≥ i ≥ 1.
In particular,
T(n) = 2^(log₂ n) T(n/2^(log₂ n)) + n log₂ n, corresponding to the choice i = log₂ n.

Thus T(n) = n·T(1) + n log₂ n = n log₂ n + 2n.

Beginning with the recurrence relation 2.2 and using the substitution method, it can be shown that T(n) = n^(log_b a) [T(1) + u(n)],

where u(n) = Σ (j = 1 to k) h(b^j) and h(n) = f(n)/n^(log_b a). Table 2.1 tabulates the asymptotic values of u(n) for various h(n).

Table 2.1: u(n) values for various h(n)

h(n)                        u(n)
O(n^r), r < 0               O(1)
Θ((log n)^i), i ≥ 0         Θ(((log n)^(i+1)) / (i+1))
Ω(n^r), r > 0               Θ(h(n))

Example 2:

T(n) = T(1)              n = 1
     = T(n/2) + c        n > 1

Solution: Comparing with Eq 2.2, we see that a = 1, b = 2 and f(n) = c. So
log_b a = 0 and h(n) = f(n)/n^(log_b a) = c = c(log n)^0 = Θ((log n)^0).
From Table 2.1, u(n) = Θ(log n). So T(n) = n^(log_b a) [T(1) + Θ(log n)] = Θ(log n).
Example 3:

Next consider the case a = 2, b = 2, and f(n) = cn.

For this recurrence, log_b a = 1 and h(n) = f(n)/n = c = Θ((log n)^0).

Hence u(n) = Θ(log n) and T(n) = n[T(1) + Θ(log n)] = Θ(n log n).

2.2 BINARY SEARCH

Let ai, 1 ≤ i ≤ n, be a list of elements that are sorted in nondecreasing order. Consider the problem of determining whether a given element x is present in the list. If x is present, we have to determine a value j such that aj = x. If x is not in the list, then j is to be set to zero (0). Let P = (n, ai, …, al, x) denote an arbitrary instance of this search problem (n is the number of elements in the list, ai, …, al is the list of elements, and x is the element searched for).

Divide and conquer can be used to solve this problem. Let Small(P) be true if n = 1. In this case S(P) will take the value i if x = ai; otherwise it will take the value 0. Then g(1) = Θ(1). If P has more than one element, it can be divided (or reduced) into a new subproblem as follows.

Pick an index q (in the range [i, l]) and compare x with aq. There are three possibilities:

1) x = aq: in this case the problem P is immediately solved.
2) x < aq: the problem P reduces to (q−i, ai, …, aq−1, x).
3) x > aq: in this case the sublist to be searched is aq+1, …, al, and P reduces to (l−q, aq+1, …, al, x).

In this example, any given problem P gets divided (reduced) into one new subproblem. This division takes only Θ(1) time. After a comparison with aq, the instance remaining to be solved can be solved by using this divide-and-conquer scheme again. If q is always chosen such that aq is the middle element (that is, q = ⌊(n+1)/2⌋), then the resulting search algorithm is called binary search. The algorithm below describes this binary search method, where binsrch has four inputs a[], i, l and x. It is initially invoked as binsrch(a,1,n,x).

Recursive binary search algorithm

Algorithm binsrch(a,i,l,x)

//Given an array a[i..l] of elements in non decreasing order, 1≤i≤l, determine whether x
//is present, and if so, return j such that x=a[j]; else return 0.
{
If(l=i) then //if small(P)


{
If(x=a[i]) then return i;
else return 0;
}
else
{
//reduce the P into subproblem
mid:= └(i+l)/2 ┘;
If(x=a[mid]) then return mid;
else if(x<a[mid]) then
return binsrch(a,i,mid-1,x);

else return binsrch(a,mid+1,l,x);
}
}
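For reference, a direct Python transcription of this recursive scheme, as a minimal sketch (0-based indices, with -1 rather than 0 signalling absence, since 0 is a valid index in Python):

def binsrch(a, i, l, x):
    # Search sorted a[i..l] for x; return an index j with a[j] == x, or -1 if absent.
    if i > l:                        # empty range: x is not present
        return -1
    if i == l:                       # Small(P): a single element remains
        return i if a[i] == x else -1
    mid = (i + l) // 2
    if a[mid] == x:
        return mid
    if x < a[mid]:
        return binsrch(a, i, mid - 1, x)
    return binsrch(a, mid + 1, l, x)

# Example: binsrch([-6, 0, 7, 9, 23], 0, 4, 9) returns 3.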

A non-recursive version of binsrch is given below. This binsrch has three inputs a, n and x. The while loop continues processing as long as there are more elements left to check. At the conclusion of the procedure, 0 is returned if x is not present, or j is returned such that a[j] = x.

Iterative binary search algorithm

Algorithm binsrch(a,n,x)

//Given an array a[1:n] of elements in non decreasing order, n>=0, determine whether x
//is present, and if so, return j such that x=a[j]; else return 0.
{
low:=1; high:=n;
while(low≤high) do
{
mid:= └(low+high)/2 ┘;
if(x<a[mid]) then high:=mid-1;
else if(x>a[mid]) then low:=mid+1;
else return mid;
}
return 0;
}
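The iterative version carries over to Python almost line for line; again this is only a sketch, with 0-based indexing and -1 for "not found":

def binary_search(a, x):
    # Iterative binary search over a sorted list a.
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2
        if x < a[mid]:
            high = mid - 1
        elif x > a[mid]:
            low = mid + 1
        else:
            return mid
    return -1

# On the 14-element list used in the example below:
# a = [-15, -6, 0, 7, 9, 23, 54, 82, 101, 112, 125, 131, 142, 151]
# binary_search(a, 151) -> 13, binary_search(a, 9) -> 4, binary_search(a, -14) -> -1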

Example:
Let us consider the following 14 elements:
-15,-6,0,7,9,23,54,82,101,112,125,131,142,151
Solution:
Place them in a[1..14], and simulate the steps that binsrch goes through as it searches for different values of x. Only the variables low, high, and mid need to be traced as we simulate the algorithm. We try the following values for x: 151, −14 and 9, for two successful searches and one unsuccessful search. Table 2.2 shows the traces of binsrch on
these three inputs.

Table 2.2: Three examples of binary search on 14 elements

          x = 151              x = −14              x = 9
low  high  mid      low  high  mid      low  high  mid
 1    14    7        1    14    7        1    14    7
 8    14   11        1     6    3        1     6    3
12    14   13        1     2    1        4     6    5
14    14   14        2     2    2        found
found                2     1
                     not found (low > high)

Theorem 1: Algorithm binsrch(a,n,x) works correctly.


Proof:
We assume that all statements work as expected and that comparisons such as
x>a[mid] are appropriately carried out. Initially low = 1, high = n, n ≥ 0, and a[1] ≤ a[2] ≤ ... ≤ a[n]. If n = 0, the while loop is not entered and 0 is returned. Otherwise we observe that
each time through the loop the possible elements to be checked for equality with x are
a[low],a[low+1],….a[mid],…a[high]. If x=a[mid], then the algorithm terminates
successfully. Otherwise the range is narrowed by either increasing low to mid+1 or
decreasing high to mid-1. Clearly this narrowing of the range does not affect the outcome of
the search. If low becomes greater than high, then x is not present and hence the loop is
exited.

Analysis:
Suppose we begin by determining the time for binsrch algorithm on the previous
example data set. We observe that the only operations in the algorithm are comparisons
and some arithmetic and data movements. We concentrate on comparisons between x and
the elements in a[], recognizing that the frequency count of all other operations is of the
same order as that for these comparisons. Comparisons between x and elements of a[] are
referred to as element comparisons. We assume that only one comparison is needed to
determine which of the three possibilities of if statement holds. The number of element
comparisons needed to find each of the 14 elements is

a            [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14]
Elements     -15  -6   0   7   9  23  54  82 101  112  125  131  142  151
Comparisons    3   4   2   4   3   4   1   4   3    4    2    4    3    4

No element requires more than 4 comparisons to be found. The average is obtained by summing the comparisons needed to find all 14 items and dividing by 14; this yields 45/14, or approximately 3.21, comparisons per successful search on the average. There are 15 possible ways that an unsuccessful search may terminate depending on the value of x. If x < a[1], the algorithm requires 3 element comparisons to determine that x is not present. For all the remaining possibilities, binsrch requires 4 element comparisons. Thus the average number of element comparisons for an unsuccessful search is (3 + 14·4)/15 = 59/15 ≈ 3.93.

Binary decision tree

Fig 2.1 binary decision tree for binary search, n=14

Fig 2.1 contains a binary decision tree that traces the way in which these values are produced by binsrch. The first comparison is of x with a[7]. If x < a[7], then the next comparison is with a[3]; similarly, if x > a[7], then the next comparison is with a[11]. Each path through the tree represents a sequence of comparisons in the binary search method. If x is present, then the algorithm will end at one of the circular nodes, which lists the index into the array where x was found. If x is not present, the algorithm will terminate at one of the square nodes. The circular nodes are called internal nodes and the square nodes are called external nodes.

Theorem 2:

If n is in the range [2^(k−1), 2^k), then binsrch makes at most k comparisons for a successful search and either k−1 or k comparisons for an unsuccessful search.
Or: the time required for a successful search is O(log n) and for an unsuccessful search is Θ(log n).
Proof:

Consider the binary decision tree describing the action of binsrch on n elements. All successful searches end at a circular node whereas all unsuccessful searches end at a square node. If 2^(k−1) ≤ n < 2^k, then all circular nodes are at levels 1, 2, …, k whereas all square nodes are at levels k and k+1 (note: the root node is at level 1). The number of element comparisons needed to terminate at a circular node on level i is i, whereas the number of element comparisons needed to terminate at a square node at level i is only i−1. Hence the time required for a successful search is O(log n) and for an unsuccessful search is Θ(log n).

In conclusion we are now able to completely describe the computing time of binary search
by giving formulas that describe the best, average and worst cases:

Successful searches:   best Θ(1),   average Θ(log n),   worst Θ(log n)
Unsuccessful searches: best, average, worst — all Θ(log n)

Let us find the number of key comparisons in the worst case, Cworst(n). The worst-case inputs include all arrays that do not contain a given search key. Since after one comparison the algorithm faces the same situation but for an array half the size, we get the following recurrence relation for Cworst(n):
Cworst(n) = Cworst(⌊n/2⌋) + 1 for n > 1,   Cworst(1) = 1
Assume n = 2^k.
C(2^k) = C(2^(k−1)) + 1
       = (C(2^(k−2)) + 1) + 1 = C(2^(k−2)) + 2
       = (C(2^(k−3)) + 1) + 2 = C(2^(k−3)) + 3
       :
After i iterations
C(2^k) = C(2^(k−i)) + i
Replace i by k:
C(2^k) = C(2^0) + k
Since Cworst(1) = 1,
C(2^k) = 1 + k = 1 + log₂ n ∈ Θ(log₂ n)

2.3 FINDING MAXIMUM AND MINIMUM

Let us consider another simple problem that can be solved by the divide and conquer
technique. The problem is to find the maximum and minimum number in a set of n
elements. The algorithm straightmaxmin accomplishes this.
Algorithm straightmaxmin(a,n,max,min)
//Set max to the maximum and min to the minimum of a[1…n].
{ max:=min:=a[1];
for i:=2 to n do
{
if(a[i]>max) then max:=a[i];
if(a[i]<min) then min:=a[i];
}
}
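A Python sketch of the same straightforward scan, returning the pair instead of using output parameters:

def straight_max_min(a):
    # Return (max, min) of a non-empty list a using one left-to-right scan.
    largest = smallest = a[0]
    for x in a[1:]:
        if x > largest:
            largest = x
        if x < smallest:
            smallest = x
    return largest, smallest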
In analyzing the complexity of the algorithm we first concentrate on the number of
comparisons. The justification for this is that the frequency count for other operations in
this algorithm is of the same order as that for element comparisons. More importantly,
when the elements in a[1…n] are polynomials, vectors, very large numbers, or strings of
characters, the cost of an element comparison is much higher than the cost of the other
operations. Hence the time is determined mainly by the total cost of the element
comparisons.

Straightmaxmin requires 2(n−1) element comparisons in the best, average, and


worst cases. An immediate improvement is possible by realizing that the comparison
a[i]<min is necessary only when a[i]>max is false.
Hence we can replace the contents of the for loop by

if(a[i]>max) then max:=a[i];


else if (a[i]<min) then min:=a[i];

Now the best case occurs when the elements are in increasing order. The number
of element comparisons is n-1. The worst case occurs when the elements are in decreasing
order. In this case the number of element comparison is 2(n-1).

A divide and conquer algorithm for this problem would proceed as follows:

Let P = (n, a[i], …, a[j]) denote an arbitrary instance of the problem. Here n is the number of elements in the list a[i], …, a[j] and we are interested in finding the maximum and minimum of this list. Let Small(P) be true when n ≤ 2. In this case, the maximum and minimum are a[i] if n = 1. If n = 2, the problem can be solved by making one comparison.
If the list has more than 2 elements, P has to be divided into smaller instances. For example, we might divide P into the two instances P1 = (⌊n/2⌋, a[1], …, a[⌊n/2⌋]) and P2 = (n − ⌊n/2⌋, a[⌊n/2⌋+1], …, a[n]). After having divided P into two smaller subproblems, we can solve them by recursively invoking the same divide-and-conquer algorithm. To combine the solutions for P1 and P2 to obtain a solution for P, note that if MAX(P) and MIN(P) are the maximum and minimum of the elements in P, then MAX(P) is the larger of MAX(P1) and MAX(P2). Also MIN(P) is the smaller of MIN(P1) and MIN(P2).
Maxmin is a recursive algorithm that finds the maximum and minimum of the set of elements {a(i), a(i+1), …, a(j)}. The situations of set sizes one (i = j) and two (i = j−1) are handled separately. For sets containing more than two elements, the midpoint is determined and two new subproblems are generated. The two maxima are compared and the two minima are compared to obtain the solution for the entire set.
The procedure is initially invoked by the statement,
Maxmin(1,n,x,y)

Algorithm maxmin(i,j,max,min)
//a[1:n] is a global array. Parameters i and j are integers, 1≤i≤j≤n. The effect is to set
//max and min to the largest and smallest values in a[i:j], respectively.
{
If(i=j) then max:=min:=a[i];
else if(i=j-1)then
{
If(a[i]<a[j]) then
{
max:=a[j]; min:=a[i];
}
else
{
max:=a[i]; min:=a[j];
}
}
else
{
// if P is not small, divide P into subproblems.
// find where to split the set
mid:= └(i+j)/2 ┘;
//solve the subproblems.
maxmin(i,mid,max,min);
maxmin(mid+1,j,max1,min1);
// combine the solutions
If(max<max1) then max:=max1;
If(min>min1) then min:=min1;

}
}
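The recursive version can be sketched in Python as follows, returning the (max, min) pair rather than using reference parameters:

def maxmin(a, i, j):
    # Return (max, min) of a[i..j] by divide and conquer.
    if i == j:                       # one element
        return a[i], a[i]
    if i == j - 1:                   # two elements: a single comparison
        return (a[j], a[i]) if a[i] < a[j] else (a[i], a[j])
    mid = (i + j) // 2
    max1, min1 = maxmin(a, i, mid)       # solve the two subproblems
    max2, min2 = maxmin(a, mid + 1, j)
    return max(max1, max2), min(min1, min2)   # combine the solutions

# Example: maxmin([22, 13, -5, -8, 15, 60, 17, 31, 47], 0, 8) returns (60, -8).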

Example:
Suppose we simulate maxmin on the following nine elements:
A: [1] [2] [3] [4] [5] [6] [7] [8] [9]
22 13 -5 -8 15 60 17 31 47
A good way of keeping track of recursive calls is to build a tree by adding a node each time
a new call is made. For this algorithm each node has four items of information: i,j,max and
min. on the array a[] above, the tree of fig 2.2 is produced.

Fig 2.2 :Trees of recursive calls of maxmin.

The root node contains 1 and 9 as the values of i and j corresponding to the initial call to maxmin. This execution produces two new calls to maxmin, where i and j have the values 1, 5 and 6, 9 respectively, and thus splits the set into two subsets of approximately the same size. From the tree we can immediately see that the maximum depth of recursion is four. The circled numbers in the upper left corner of each node represent the order in which max and min are assigned.
If T(n) represents the number of element comparisons made by maxmin, then the resulting recurrence relation is

T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + 2     n > 2
     = 1                            n = 2
     = 0                            n = 1

When n is a power of two, n = 2^k for some positive integer k, this becomes T(n) = 2T(n/2) + 2.
Using the smoothness rule, replace n by 2^k and apply backward substitution:

T(2^k) = 2T(2^(k−1)) + 2
       = 2[2T(2^(k−2)) + 2] + 2 = 2²T(2^(k−2)) + 2² + 2
       = 2²[2T(2^(k−3)) + 2] + 2² + 2 = 2³T(2^(k−3)) + 2³ + 2² + 2
       :
After i iterations
T(2^k) = 2^i T(2^(k−i)) + 2^i + 2^(i−1) + ... + 2
Replace i by k−1:
T(2^k) = 2^(k−1) T(2) + 2^(k−1) + 2^(k−2) + ... + 2
       = 2^(k−1)·1 + 2^k − 2                      (since T(2) = 1)
       = 2^k(1/2 + 1) − 2 = (3/2)·2^k − 2
T(n) = 3n/2 − 2

Note: 3n/2-2 is the best, average, and worst case number of comparisons when n is a
power of 2.
Compared with the 2n − 2 comparisons for the straightforward method, this is a saving of 25% in comparisons.

2.4 MERGE SORT


Mergesort is a perfect example of a successful application of the divide-and-conquer technique. It sorts a given array A[0..n − 1] by dividing it into two halves A[0..⌊n/2⌋−1] and A[⌊n/2⌋..n − 1], sorting each of them recursively, and then merging the two smaller sorted arrays into a single sorted one.

ALGORITHM Mergesort(A[0..n − 1])


//Sorts array A[0..n − 1] by recursive mergesort
//Input: An array A[0..n − 1] of orderable elements
//Output: Array A[0..n − 1] sorted in nondecreasing order
if n > 1
copy A[0……..⌞n/2⌟−1] to B[0 ……..⌞n/2⌟−1]
copy A[⌞n/2⌟……..n − 1] to C[0 ……..⌞n/2⌟−1]
Mergesort(B[0.. ⌞n/2⌟−1])
Mergesort(C[0.. ⌞n/2⌟−1])
Merge(B, C, A)

The merging of two sorted arrays can be done as follows. Two pointers (array
indices) are initialized to point to the first elements of the arrays being merged. The
elements pointed to are compared, and the smaller of them is added to a new array being
constructed; after that, the index of the smaller element is incremented to point to its
immediate successor in the array it was copied from. This operation is repeated until one of
the two given arrays is exhausted, and then the remaining elements of the other array are
copied to the end of the new array.

ALGORITHM Merge(B[0..p − 1], C[0..q − 1], A[0..p + q − 1])


//Merges two sorted arrays into one sorted array
//Input: Arrays B[0..p − 1] and C[0..q − 1] both sorted
//Output: Sorted array A[0..p + q − 1] of the elements of B and C
i ←0; j ←0; k←0
while i <p and j <q do
if B[i]≤ C[j ]
A[k]←B[i]; i ←i + 1
else A[k]←C[j ]; j ←j + 1
k←k + 1
if i = p
copy C[j..q − 1] to A[k..p + q − 1]
else copy B[i..p − 1] to A[k..p + q − 1]
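The two routines above translate directly into Python; the following is a minimal sketch (returning a new list instead of sorting in place):

def mergesort(a):
    # Sort list a in nondecreasing order; returns a new sorted list.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    b = mergesort(a[:mid])           # sort the first half
    c = mergesort(a[mid:])           # sort the second half
    return merge(b, c)

def merge(b, c):
    # Merge two sorted lists b and c into one sorted list.
    result, i, j = [], 0, 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            result.append(b[i]); i += 1
        else:
            result.append(c[j]); j += 1
    result.extend(b[i:])             # at most one of these two is non-empty
    result.extend(c[j:])
    return result

# Example: mergesort([8, 3, 2, 9, 7, 1, 5, 4]) returns [1, 2, 3, 4, 5, 7, 8, 9].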

The operation of the algorithm on the list 8, 3, 2, 9, 7, 1, 5, 4 is illustrated in Figure 2.3.

Fig 2.3 Example for mergesort operation

How efficient is mergesort? Assuming for simplicity that n is a power of 2, the recurrence relation for the number of key comparisons C(n) is
C(n) = 2C(n/2) + Cmerge(n) for n > 1, C(1) = 0.
Let us analyze Cmerge(n), the number of key comparisons performed during the
merging stage. At each step, exactly one comparison is made, after which the total number
of elements in the two arrays still needing to be processed is reduced by 1. In the worst
case, neither of the two arrays becomes empty before the other one contains just one
element (e.g., smaller elements may come from the alternating arrays). Therefore, for the
worst case, Cmerge(n) = n − 1, and we have the recurrence

Cworst(n) = 2Cworst(n/2) + n − 1 for n > 1, Cworst(1) = 0.

Replace n by 2^k and use backward substitution:
C(2^k) = 2C(2^(k−1)) + 2^k − 1
       = 2[2C(2^(k−2)) + 2^(k−1) − 1] + 2^k − 1
       = 2²C(2^(k−2)) + 2^k − 2 + 2^k − 1 = 2²C(2^(k−2)) + 2^(k+1) − 3
       = 2²[2C(2^(k−3)) + 2^(k−2) − 1] + 2^(k+1) − 3
       = 2³C(2^(k−3)) + 2^k − 2² + 2^(k+1) − 3 = 2³C(2^(k−3)) + 3·2^k − (2³ − 1)
       :
After i iterations
C(2^k) = 2^i C(2^(k−i)) + i·2^k − (2^i − 1)
Replace i by k:
C(2^k) = 2^k C(1) + k·2^k − (2^k − 1) = k·2^k − (2^k − 1)        (since C(1) = 0)

C(n) = n log₂ n − (n − 1) = n log₂ n − n + 1
C(n) ∈ Θ(n log₂ n)

2.5 QUICK SORT


Quicksort is the other important sorting algorithm that is based on the divide-and-conquer approach. Unlike mergesort, which divides its input elements according to their
position in the array, quicksort divides them according to their value. A partition is an
arrangement of the array’s elements so that all the elements to the left of some element
A[s] are less than or equal to A[s], and all the elements to the right of A[s] are greater than
or equal to it:
A[0] . . . A[s − 1] A[s] A[s + 1] . . . A[n – 1]
all are ≤A[s] all are ≥A[s]

Obviously, after a partition is achieved, A[s] will be in its final position in the sorted array,
and we can continue sorting the two sub arrays to the left and to the right of A[s]
independently (e.g., by the same method). Note the difference with merge sort: there, the
division of the problem into two subproblems is immediate and the entire work happens in
combining their solutions; here, the entire work happens in the division stage, with no
work required to combine the solutions to the subproblems.
Here is pseudo code of quicksort: call Quicksort(A[0..n − 1]) where

ALGORITHM Quicksort(A[l..r])
//Sorts a subarray by quicksort
//Input: Subarray of array A[0..n − 1], defined by its left and right
// indices l and r
//Output: Subarray A[l..r] sorted in nondecreasing order
if l < r
s ←Partition(A[l..r]) //s is a split position
Quicksort(A[l..s − 1])
Quicksort(A[s + 1..r])

A partition of A[0..n − 1] and, more generally, of its subarray A[l..r] (0 ≤ l < r ≤ n − 1) can be achieved by the following algorithm. As before, we start by selecting a pivot
—an element with respect to whose value we are going to divide the subarray. There are
several different strategies for selecting a pivot; we will return to this issue when we

analyze the algorithm’s efficiency. For now, we use the simplest strategy of selecting the
subarray’s first element: p = A[l].
Unlike the Lomuto algorithm, we will now scan the subarray from both ends, comparing
the subarray’s elements to the pivot. The left-to-right scan, denoted below by index pointer
i, starts with the second element. Since we want elements smaller than the pivot to be in
the left part of the subarray, this scan skips over elements that are smaller than the pivot
and stops upon encountering the first element greater than or equal to the pivot. The right-
to-left scan, denoted below by index pointer j, starts with the last element of the subarray.
Since we want elements larger than the pivot to be in the right part of the subarray, this
scan skips over elements that are larger than the pivot and stops on encountering the first
element smaller than or equal to the pivot. (Why is it worth stopping the scans after
encountering an element equal to the pivot? Because doing this tends to yield more even
splits for arrays with a lot of duplicates, which makes the algorithm run faster. For
example, if we did otherwise for an array of n equal elements, we would have gotten a split
into subarrays of sizes n − 1 and 0, reducing the problem size just by 1 after scanning the
entire array.)
After both scans stop, three situations may arise, depending on whether or not the
scanning indices have crossed. If scanning indices i and j have not crossed, i.e., i < j, we
simply exchange A[i] and A[j ] and resume the scans by incrementing i and decrementing j,
respectively.
i j
P All are≤p ≥p …… ≤p All are ≥p

If the scanning indices have crossed over, i.e., i > j, we will have partitioned the subarray
after exchanging the pivot with A[j ]:

j i
P All are≤p ≤p ≥p All are ≥p

Finally, if the scanning indices stop while pointing to the same element, i.e., i = j, the value
they are pointing to must be equal to p (why?). Thus, we have the subarray partitioned,
with the split position s = i = j :

i==j
P All are≤p =p All are ≥p

We can combine the last case with the case of crossed-over indices (i > j ) by exchanging the
pivot with A[j ] whenever i ≥ j .
Here is pseudocode implementing this partitioning procedure.

ALGORITHM Partition(A[l..r])
//Partitions a subarray by Hoare’s algorithm, using the first element as a pivot
//Input: Subarray of array A[0..n − 1], defined by its left and right indices l and r (l<r)
//Output: Partition of A[l..r], with the split position returned as this function’s value

p←A[l]
i ←l; j ←r + 1
repeat
repeat i ←i + 1 until A[i]≥ p
repeat j ←j − 1 until A[j ]≤ p
swap(A[i], A[j ])
until i ≥ j
swap(A[i], A[j ]) //undo last swap when i ≥ j
swap(A[l], A[j ])
return j
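The partitioning procedure and quicksort can be rendered in Python as the sketch below; here the explicit bound check i <= r plays the role of the sentinel discussed in the next paragraph, and the structured loop makes the pseudocode's "undo last swap" step unnecessary:

def hoare_partition(a, l, r):
    # Partition a[l..r] around the pivot a[l]; return the split position.
    p = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and a[i] < p:   # left-to-right scan stops at an element >= p
            i += 1
        j -= 1
        while a[j] > p:              # right-to-left scan stops at an element <= p
            j -= 1
        if i >= j:                   # scanning indices have crossed (or met)
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]          # put the pivot in its final position
    return j

def quicksort(a, l=0, r=None):
    # Sort list a in place.
    if r is None:
        r = len(a) - 1
    if l < r:
        s = hoare_partition(a, l, r)
        quicksort(a, l, s - 1)
        quicksort(a, s + 1, r)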

Note that index i can go out of the subarray’s bounds in this pseudocode. Rather than
checking for this possibility every time index i is incremented, we can append to array
A[0..n − 1] a “sentinel” that would prevent index i from advancing beyond position n. Note
that the more sophisticated method of pivot selection mentioned at the end of the section
makes such a sentinel unnecessary.
An example of sorting an array by quicksort is given in Figure 2.4.

FIGURE 2.4 Example of quicksort operation. (a) Array’s transformations with pivots
shown in bold. (b) Tree of recursive calls to Quicksort with input values l and r of subarray
bounds and split position s of a partition obtained.

Quick sort efficiency:


Best case:
The number of key comparisons made before a partition is achieved is n + 1 if the
scanning indices cross over and n if they coincide (why?). If all the splits happen in the
middle of corresponding subarrays, we will have the best case. The number of key
comparisons in the best case satisfies the recurrence

Cbest(n) = 2Cbest(n/2) + n for n > 1,


Cbest(1) = 0.

According to the Master Theorem, Cbest(n) ∈ Θ(n log₂ n); solving it exactly for n = 2^k yields Cbest(n) = n log₂ n.

Worst case:

Number of key comparisons made at the successive partitioning stages: (n + 1), n, (n − 1), . . . , 3 (see the sum below).

In the worst case, all the splits will be skewed to the extreme: one of the two
subarrays will be empty, and the size of the other will be just 1 less than the size of the
subarray being partitioned. This unfortunate situation will happen, in particular, for
increasing arrays, i.e., for inputs for which the problem is already solved! Indeed, if A[0..n −
1] is a strictly increasing array and we use A[0] as the pivot, the left-to-right scan will stop

on A[1] while the right-to-left scan will go all the way to reach A[0], indicating the split at
position 0.
So, after making n + 1 comparisons to get to this partition and exchanging the pivot A[0]
with itself, the algorithm will be left with the strictly increasing array A[1..n − 1] to sort.
This sorting of strictly increasing arrays of diminishing sizes will continue until the last one
A[n − 2..n − 1] has been processed. The total number of key comparisons made will be equal
to

Cworst(n) = (n + 1) + n + . . . + 3 = (n + 1)(n + 2)/2 − 3
Cworst(n) ∈ Θ(n²).

Average case:
Thus, the question about the utility of quicksort comes down to its average-case behavior. Let Cavg(n) be the average number of key comparisons made by quicksort on a randomly ordered array of size n. A partition can happen in any position s (0 ≤ s ≤ n−1) after n + 1 comparisons are made to achieve the partition. After the partition, the left and
right subarrays will have s and n − 1− s elements, respectively. Assuming that the partition
split can happen in each position s with the same probability 1/n, we get the following
recurrence relation:
Cavg(n) = (1/n) Σ (s = 0 to n−1) [(n + 1) + Cavg(s) + Cavg(n − 1 − s)]    for n > 1,
Cavg(0) = 0, Cavg(1) = 0.

Its solution, which is much trickier than the worst- and best-case analyses, turns out to be

Cavg(n) ≈ 2n ln n ≈ 1.38n log2 n.

Thus, on the average, quicksort makes only 38% more comparisons than in the best case.
Moreover, its innermost loop is so efficient that it usually runs faster than mergesort on randomly ordered arrays of nontrivial sizes. This certainly justifies the name given to the
algorithm by its inventor.

2.6 STRASSEN'S MATRIX MULTIPLICATION

Let A and B be two n × n matrices. The product matrix C = AB is also an n × n matrix whose (i, j)th element is formed by taking the elements in the ith row of A and the jth column of B and multiplying them to get
C(i, j) = Σ (1 ≤ k ≤ n) A(i, k)·B(k, j)          eq(1)
for all i and j between 1 and n. To compute C(i, j) using this formula, we need n multiplications. As the matrix C has n² elements, the time for the resulting matrix multiplication algorithm, which we refer to as the conventional method, is Θ(n³).

The divide-and-conquer strategy suggests another way to compute the product of two n × n matrices. For simplicity we assume that n is a power of 2, that is, that there exists a nonnegative integer k such that n = 2^k. In case n is not a power of 2, then enough rows and columns of zeros can be added to both A and B so that the resulting dimensions are a power of two. Imagine that A and B are each partitioned into four square submatrices, each submatrix having dimensions n/2 × n/2. Then the product AB can be computed by using the above formula for the product of 2 × 2 matrices. If AB is

[ A11  A12 ] [ B11  B12 ]   [ C11  C12 ]
[ A21  A22 ] [ B21  B22 ] = [ C21  C22 ]          eq(2)

Then,
C11 = A11·B11 + A12·B21
C12 = A11·B12 + A12·B22
C21 = A21·B11 + A22·B21
C22 = A21·B12 + A22·B22          eq(3)

If n = 2, then formulas (2) and (3) are computed using a multiplication operation for the elements of A and B. These elements are typically floating point numbers. For n > 2, the elements of C can be computed using matrix multiplication and addition operations applied to matrices of size n/2 × n/2. Since n is a power of 2, these matrix products can be recursively computed by the same algorithm we are using for the n × n case. This algorithm will continue applying itself to smaller-sized submatrices until n becomes suitably small (n = 2) so that the product is computed directly.
To compute AB using eq(3), we need to perform eight multiplications of n/2 × n/2 matrices and four additions of n/2 × n/2 matrices. Since two n/2 × n/2 matrices can be added in time cn² for some constant c, the overall computing time T(n) of the divide-and-conquer algorithm is given by the recurrence

T(n) = b                      n ≤ 2
     = 8T(n/2) + cn²          n > 2

Where b and c are constants.


This recurrence can be solved in the same way as earlier recurrences to obtain T(n) = O(n³). Hence no improvement over the conventional method has been made. Since matrix multiplications are more expensive than matrix additions (O(n³) versus O(n²)), we can attempt to reformulate the equation for Cij so as to have fewer multiplications and possibly more additions. Volker Strassen discovered a way to compute the Cij's of eq(3) using only 7 multiplications and 18 additions or subtractions. His method involves first

computing the seven n/2 × n/2 matrices P, Q, R, S, T, U and V. As can be seen, P, Q, R, S, T, U, V can be computed using 7 matrix multiplications and 10 matrix additions or subtractions. The Cij's require an additional 8 additions or subtractions.

P=(A11+A22)(B11+B22)
Q=(A21+A22)B11
R=A11(B12 –B22)
S= A22(B21-B11)
T=(A11+A12)B22
U=(A21-A11)(B11+B12)
V=(A12-A22)(B21+B22)

C11=P+S-T+V
C12=R+T
C21=Q+S
C22=P+R-Q+U
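As an illustration only, here is a compact NumPy sketch of Strassen's scheme; it assumes n is a power of 2, and the function name and the use of NumPy are choices made for this example rather than anything prescribed by the notes:

import numpy as np

def strassen(A, B):
    # Multiply two n x n matrices (n a power of 2) with 7 recursive multiplications.
    n = A.shape[0]
    if n <= 2:                                   # small case: multiply directly
        return A @ B
    m = n // 2
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    P = strassen(A11 + A22, B11 + B22)
    Q = strassen(A21 + A22, B11)
    R = strassen(A11, B12 - B22)
    S = strassen(A22, B21 - B11)
    T = strassen(A11 + A12, B22)
    U = strassen(A21 - A11, B11 + B12)
    V = strassen(A12 - A22, B21 + B22)
    C11 = P + S - T + V
    C12 = R + T
    C21 = Q + S
    C22 = P + R - Q + U
    return np.block([[C11, C12], [C21, C22]])    # reassemble the four quadrants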

The resulting recurrence relation for T(n) is

T(n) = b                      n ≤ 2
     = 7T(n/2) + an²          n > 2

where a and b are constants. Working with this formula, we get

T(n) = an²[1 + 7/4 + (7/4)² + ... + (7/4)^(k−1)] + 7^k T(1)
     ≤ cn²(7/4)^(log₂ n) + 7^(log₂ n),   c a constant
     = cn^(log₂ 4 + log₂ 7 − log₂ 4) + n^(log₂ 7)
     = O(n^(log₂ 7)) ≈ O(n^2.81)

2.7 Advantages and Disadvantages Of Divide and Conquer

Advantages
1.Solving difficult problems
Divide and conquer is a powerful tool for solving conceptually difficult problems: all it requires is a way of breaking the problem into sub-problems, of solving the trivial cases, and of combining the sub-solutions into a solution to the original problem. Similarly, decrease and conquer only requires reducing the problem to a single smaller problem, such as the classic Tower of Hanoi puzzle, which reduces moving a tower of height n to moving a tower of height n − 1.
2.Algorithm efficiency

The divide-and-conquer paradigm often helps in the discovery of efficient algorithms. It was the key, for example, to Karatsuba's fast multiplication method, the quicksort and mergesort algorithms, and the Strassen algorithm for matrix multiplication.
3.Parallelism

Divide and conquer algorithms are naturally adapted for execution in multi-processor
machines, especially shared-memory systems where the communication of data between
processors does not need to be planned in advance, because distinct sub-problems can be
executed on different processors.
4.Memory access
Divide-and-conquer algorithms naturally tend to make efficient use of memory caches. The
reason is that once a sub-problem is small enough, it and all its sub-problems can, in
principle, be solved within the cache, without accessing the slower main memory. An
algorithm designed to exploit the cache in this way is called cache-oblivious, because it does
not contain the cache size as an explicit parameter.
5.Roundoff control
In computations with rounded arithmetic, e.g. with floating point numbers, a divide-and-
conquer algorithm may yield more accurate results than a superficially equivalent iterative
method. For example, one can add N numbers either by a simple loop that adds each datum
to a single variable, or by a D&C algorithm called pairwise summation that breaks the data
set into two halves, recursively computes the sum of each half, and then adds the two sums.
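A minimal Python sketch of such a pairwise summation (the function name and the base-case size are illustrative choices, not a library routine):

def pairwise_sum(data, lo=0, hi=None):
    # Sum data[lo:hi] by recursively halving the range; keeping the two partial
    # sums of comparable magnitude tends to reduce floating-point roundoff error.
    if hi is None:
        hi = len(data)
    if hi - lo <= 2:                 # small base case: sum directly
        return sum(data[lo:hi])
    mid = (lo + hi) // 2
    return pairwise_sum(data, lo, mid) + pairwise_sum(data, mid, hi)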
Disadvantages
1.Recursion
Divide-and-conquer algorithms are naturally implemented as recursive procedures. In that
case, the partial sub-problems leading to the one currently being solved are automatically
stored in the procedure call stack. A recursive function is a function that calls itself within
its definition
2.Explicit stack
Divide and conquer algorithms can also be implemented by a non-recursive program that
stores the partial sub-problems in some explicit data structure, such as a stack, queue,
or priority queue. This approach allows more freedom in the choice of the sub-problem
that is to be solved next, a feature that is important in some applications — e.g. in breadth-
first recursion and the branch and bound method for function optimization. This approach
is also the standard solution in programming languages that do not provide support for
recursive procedures.

3.Stack size
In recursive implementations of D&C algorithms, one must make sure that there is
sufficient memory allocated for the recursion stack, otherwise the execution may fail
because of stack overflow. Fortunately, D&C algorithms that are time-efficient often have
relatively small recursion depth. For example, the quicksort algorithm can be implemented so that it never requires more than log₂ n nested recursive calls to sort n items.


4. Choosing the base cases
In any recursive algorithm, there is considerable freedom in the choice of the  base cases,
the small subproblems that are solved directly in order to terminate the recursion.
5. Sharing repeated subproblems
For some problems, the branched recursion may end up evaluating the same sub-problem many times over. In such cases it may be worth identifying and saving the solutions to these overlapping subproblems, a technique commonly known as memoization.

2.8 Decrease and Conquer Approach


The decrease-and-conquer technique is based on exploiting the relationship
between a solution to a given instance of a problem and a solution to its smaller instance.
Once such a relationship is established, it can be exploited either top down or bottom up.
The bottom-up variation is usually implemented iteratively, starting with a solution to the
smallest instance of the problem; it is called sometimes the incremental approach.
There are three major variations of decrease-and-conquer:
1. Decrease by a constant
2. Decrease by a constant factor
3. Variable size decrease

In the decrease-by-a-constant variation, the size of an instance is reduced by the


same constant on each iteration of the algorithm. Typically, this constant is equal to one
(Figure 2.5), although other constant size reductions do happen occasionally.
Consider, as an example, the exponentiation problem of computing a^n where a ≠ 0 and n is a nonnegative integer. The relationship between a solution to an instance of size n and an instance of size n − 1 is obtained by the obvious formula a^n = a^(n−1) · a. So the function f(n) = a^n can be computed either "top down" by using its recursive definition

f(n) = f(n − 1) · a     if n > 1
     = a                if n = 1
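Computed top down, this decrease-by-one definition becomes the following Python sketch (assuming a ≠ 0 and an integer n ≥ 1):

def power_decrease_by_one(a, n):
    # Compute a**n from the relation a**n = a**(n-1) * a; n decreases by 1 per call.
    if n == 1:
        return a
    return power_decrease_by_one(a, n - 1) * a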

FIGURE 2.5 Decrease-(by one)-and-conquer technique.

The decrease-by-a-constant-factor technique suggests reducing a problem instance by the same constant factor on each iteration of the algorithm. In most applications, this constant factor is equal to two. For an example, let us revisit the exponentiation problem. If the instance of size n is to compute a^n, the instance of half its size is to compute a^(n/2), with the obvious relationship between the two: a^n = (a^(n/2))². But since we consider here instances with integer exponents only, the former does not work for odd n. If n is odd, we have to compute a^(n−1) by using the rule for even-valued exponents and then multiply the result by a. To summarize, we have the following formula:

f(n) = (a^(n/2))²            if n is even and positive
     = (a^((n−1)/2))² · a    if n is odd and greater than 1
     = a                     if n = 1
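In Python, this decrease-by-half rule might be sketched as (again for an integer n ≥ 1):

def power_decrease_by_half(a, n):
    # Compute a**n; the instance size is roughly halved at every step.
    if n == 1:
        return a
    if n % 2 == 0:
        half = power_decrease_by_half(a, n // 2)
        return half * half                      # a**n = (a**(n/2))**2 for even n
    half = power_decrease_by_half(a, (n - 1) // 2)
    return half * half * a                      # a**n = (a**((n-1)/2))**2 * a for odd n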

FIGURE 2.6 Decrease-(by half)-and-conquer technique.

Finally, in the variable-size-decrease variety of decrease-and-conquer, the size-


reduction pattern varies from one iteration of an algorithm to another. Euclid’s algorithm
for computing the greatest common divisor provides a good example of such a situation.
Recall that this algorithm is based on the formula
gcd(m, n) = gcd(n, m mod n).
Though the value of the second argument is always smaller on the right-hand side than on
the left-hand side, it decreases neither by a constant nor by a constant factor.
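A Python sketch of Euclid's algorithm:

def gcd(m, n):
    # Euclid's algorithm: the second argument shrinks by a variable amount each step.
    while n != 0:
        m, n = n, m % n
    return m

# Example: gcd(60, 24) = gcd(24, 12) = gcd(12, 0) = 12.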

Topological Sorting
A directed graph, or digraph for short, is a graph with directions specified for all its edges (Figure 2.7a is an example). The adjacency matrix and adjacency lists are still two principal
means of representing a digraph. There are only two notable differences between
undirected and directed graphs in representing them: (1) the adjacency matrix of a
directed graph does not have to be symmetric; (2) an edge in a directed graph has just one
(not two) corresponding nodes in the digraph’s adjacency lists.

FIGURE 2.7 (a) Digraph. (b) DFS forest of the digraph for the DFS traversal started at a.

As a motivating example, consider a set of five required courses {C1, C2, C3, C4, C5} a part-
time student has to take in some degree program. The courses can be taken in any order as
long as the following course prerequisites are met: C1 and C2 have no prerequisites, C3
requires C1 and C2, C4 requires C3, and C5 requires C3 and C4. The student can take only
one course per term. In which order should the student take the courses?
The situation can be modeled by a digraph in which vertices represent courses and
directed edges indicate prerequisite requirements (Figure 2.8). In terms of this digraph, the
question is whether we can list its vertices in such an order that for every edge in the
graph, the vertex where the edge starts is listed before the vertex where the edge ends.
(Can you find such an ordering of this digraph’s vertices?) This problem is called
topological sorting. It can be posed for an

FIGURE 2.8 Digraph representing the prerequisite structure of five courses.

FIGURE 2.9 (a) Digraph for which the topological sorting problem needs to be solved.
(b) DFS traversal stack with the subscript numbers indicating the popping off
order. (c) Solution to the problem.

arbitrary digraph, but it is easy to see that the problem cannot have a solution if a digraph
has a directed cycle. Thus, for topological sorting to be possible, a digraph in question must
be a dag. It turns out that being a dag is not only necessary but also sufficient for
topological sorting to be possible; i.e., if a digraph has no directed cycles, the topological
sorting problem for it has a solution. Moreover, there are two efficient algorithms that both
verify whether a digraph is a dag and, if it is, produce an ordering of vertices that solves the
topological sorting problem.
The first algorithm is a simple application of depth-first search: perform a DFS
traversal and note the order in which vertices become dead-ends (i.e., popped off the
traversal stack). Reversing this order yields a solution to the topological sorting problem,
provided, of course, no back edge has been encountered during the traversal. If a back edge
has been encountered, the digraph is not a dag, and topological sorting of its vertices is
impossible.
Why does the algorithm work? When a vertex v is popped off a DFS stack, no vertex u with
an edge from u to v can be among the vertices popped off before v. (Otherwise, (u, v) would
have been a back edge.) Hence, any such vertex u will be listed after v in the popped-off
order list, and before v in the reversed list. Figure 2.9 illustrates an application of this
algorithm to the digraph in Figure 2.8. Note that in Figure 2.9c, we have drawn the edges of
the digraph, and they all point from left to right as the problem’s statement requires. It is a
convenient way to check visually the correctness of a solution to an instance of the
topological sorting problem.
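As an illustration, a Python sketch of this DFS-based method; the digraph is assumed to be given as an adjacency-list dictionary, and for brevity the sketch assumes the input is a dag (the back-edge check is omitted):

def topological_sort_dfs(graph):
    # graph: dict mapping each vertex to the list of vertices it has edges to.
    visited, order = set(), []

    def dfs(v):
        visited.add(v)
        for w in graph[v]:
            if w not in visited:
                dfs(w)
        order.append(v)              # v becomes a dead end: record the popping-off order

    for v in graph:
        if v not in visited:
            dfs(v)
    return order[::-1]               # reverse the popping-off order

# The five-course digraph of Figure 2.8:
courses = {'C1': ['C3'], 'C2': ['C3'], 'C3': ['C4', 'C5'], 'C4': ['C5'], 'C5': []}
print(topological_sort_dfs(courses))   # ['C2', 'C1', 'C3', 'C4', 'C5']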

FIGURE 2.10 Illustration of the source-removal algorithm for the topological sorting
problem. On each iteration, a vertex with no incoming edges is deleted
from the digraph.

The second algorithm is based on a direct implementation of the decrease-(by
one)-and-conquer technique: repeatedly, identify in a remaining digraph a source, which is
a vertex with no incoming edges, and delete it along with all the edges outgoing from it. (If
there are several sources, break the tie arbitrarily. If there are none, stop because the
problem cannot be solved). The order in which the vertices are deleted yields a solution to
the topological sorting problem. The application of this algorithm to the same digraph
representing the five courses is given in Figure 2.10
Note that the solution obtained by the source-removal algorithm is different from
the one obtained by the DFS-based algorithm. Both of them are correct, of course; the
topological sorting problem may have several alternative solutions. The tiny size of the
example we used might create a wrong impression about the topological sorting problem.
As to applications of topological sorting in computer science, they include instruction
scheduling in program compilation, cell evaluation ordering in spreadsheet formulas, and
resolving symbol dependencies in linkers.
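The source-removal algorithm described above can be sketched in Python in the same adjacency-list setting; this is essentially Kahn's algorithm, with in-degree counting standing in for physically deleting vertices:

from collections import deque

def topological_sort_source_removal(graph):
    # Repeatedly delete a vertex with no incoming edges; return the deletion order,
    # or None if the digraph has a directed cycle (no source remains at some point).
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    sources = deque(v for v in graph if indegree[v] == 0)
    order = []
    while sources:
        v = sources.popleft()
        order.append(v)
        for w in graph[v]:           # deleting v removes all edges outgoing from it
            indegree[w] -= 1
            if indegree[w] == 0:
                sources.append(w)
    return order if len(order) == len(graph) else None

# On the five-course digraph this returns ['C1', 'C2', 'C3', 'C4', 'C5'],
# matching the solution produced by the source-removal process in Figure 2.10.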

Master Theorem: If f(n) ∈ Θ(n^d) where d ≥ 0 in the recurrence T(n) = aT(n/b) + f(n), then

T(n) ∈ Θ(n^d)              if a < b^d
       Θ(n^d log n)        if a = b^d
       Θ(n^(log_b a))      if a > b^d

Analogous results hold for the O and Ω notations, too.
For example, the recurrence for the number of additions A(n) made by the divide-and-conquer sum-computation algorithm (see above) on inputs of size n = 2^k is

Example 1: A(n) = 2A(n/2) + 1.

Thus, for this example, a = 2, b = 2, and d = 0; hence, since a > b^d,

A(n) ∈ Θ(n^(log_b a)) = Θ(n^(log₂ 2)) = Θ(n).

Example 2: T(n) = 4T(n/3) + n

a = 4, b = 3 and d = 1. Since a > b^d,

T(n) = Θ(n^(log₃ 4)).

Example 3: T(n) = 2T(n/2) + n²

a = 2, b = 2, d = 2. Since a < b^d,

T(n) = Θ(n^d) = Θ(n²).

Example 4: T(n) = 8T(n/2) + n³

a = 8, b = 2, d = 3. Since a = b^d (8 = 2³),

T(n) = Θ(n^d log n) = Θ(n³ log n).

