UNIT I AND UNIT II
INTRODUCTION
Algorithm analysis: Time and space complexity - Asymptotic Notations and its properties
Best case, Worst case and average case analysis – Recurrence relation: substitution method -
Lower bounds – searching: linear search, binary search and Interpolation Search, Pattern
search: The naïve string-matching algorithm - Rabin-Karp algorithm - Knuth-Morris-Pratt
algorithm. Sorting: Insertion sort – heap sort
NOTION OF AN ALGORITHM
An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite amount of time.
The example here is to find the gcd of two integers in three different ways. The gcd of two nonnegative, not-both-zero integers m and n, denoted gcd(m, n), is defined as the largest integer that divides both m and n evenly, i.e., with a remainder of zero.
Euclid of Alexandria outlined an algorithm, for solving this problem in one of the volumes of his
Elements.
gcd(m, n) = gcd(n, m mod n)
Euclid's algorithm:
ALGORITHM Euclid(m, n)
//Computes gcd(m, n) by Euclid’s algorithm
//Input: Two nonnegative, not-both-zero integers m and n
//Output: Greatest common divisor of m and n
while n ≠ 0 do
    r ← m mod n
    m ← n
    n ← r
return m
This algorithm stops when the second number becomes 0. The second number of the pair gets smaller with each iteration, and it cannot become negative: indeed, the new value of n on the next iteration is m mod n, which is always smaller than n. Hence, the value of the second number in the pair eventually becomes 0, and the algorithm stops.
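As a quick illustration, here is a minimal Python sketch of Euclid's algorithm as described above (the function name is my own):

def euclid_gcd(m, n):
    # Repeatedly replace (m, n) with (n, m mod n) until n becomes 0.
    while n != 0:
        m, n = n, m % n
    return m

print(euclid_gcd(60, 24))   # prints 12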
The second method for the same problem is obtained from the definition itself: the gcd of m and n is the largest integer that divides both numbers evenly. Obviously, such a number cannot be greater than the smaller of the two numbers, which we denote t = min{m, n}. So we start by checking whether t divides both m and n: if it does, t is the answer; if it does not, t is decreased by 1 and we try again. (For the example given below, this is repeated until t reaches 12, where the procedure stops.)
The third method uses prime factorizations. Example: 60 = 2 · 2 · 3 · 5 and 24 = 2 · 2 · 2 · 3, so gcd(60, 24) = 2 · 2 · 3 = 12.
This procedure is more complex, and as stated it is ambiguous, since the step of obtaining the prime factorization is not itself specified. To make it a legitimate and efficient algorithm, we must also incorporate an algorithm for finding the prime factors.
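A minimal Python sketch of the consecutive-integer-checking idea behind the second method (the function name is my own; it assumes m and n are positive):

def gcd_by_checking(m, n):
    # Start from the smaller of the two numbers and count down
    # until a common divisor is found.
    t = min(m, n)
    while t >= 1:
        if m % t == 0 and n % t == 0:
            return t
        t -= 1

print(gcd_by_checking(60, 24))   # prints 12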
Design an algorithm
An input to an algorithm specifies an instance of the problem the algorithm solves. It is also important to specify exactly the range of instances the algorithm needs to handle. Before this we have to understand the problem clearly and resolve any doubts after reading the problem's description. A correct algorithm should work for all possible inputs.
The second step is to ascertain the capabilities of the computational device. The essence of the von Neumann machine architecture is captured by the RAM (random-access machine) model: instructions are executed one after another, one operation at a time. Algorithms designed to be executed on such machines are called sequential algorithms. An algorithm that can execute operations concurrently is called a parallel algorithm; the RAM model does not support this.
The next decision is to choose between solving the problem exactly or solving it approximately. Based on this, algorithms are classified as exact and approximation algorithms. There are three reasons to choose an approximation algorithm. First, there are problems, such as extracting square roots or solving non-linear equations, that cannot be solved exactly. Second, solving a problem exactly may be unacceptably slow because of the problem's complexity, e.g., the traveling salesman problem. Third, an approximation algorithm can be part of a more sophisticated algorithm that solves a problem exactly.
Data structures play a vital role in designing and analyzing algorithms. Some of the algorithm design techniques also depend on structuring the data that specify a problem's instance.
Algorithms + Data Structures = Programs
Correctness has to be proved for every algorithm: we must prove that the algorithm yields the required result for every legitimate input in a finite amount of time. For some algorithms, a proof of correctness is quite easy; for others it can be quite complex. A common technique for proving correctness is mathematical induction, because an algorithm's iterations provide a natural sequence of steps needed for such proofs. To show that an algorithm is incorrect, however, we need just one instance of its input for which the algorithm fails. If it is incorrect, we redesign the algorithm, possibly keeping the same decisions about data structures, design technique, etc.
The notion of correctness for approximation algorithms is less straightforward than it is for exact algorithms. For example, in gcd(m, n) two observations are made: the second number gets smaller on every iteration, and the algorithm stops when the second number becomes 0.
∗ Analyzing an algorithm
There are two kinds of algorithm efficiency: time efficiency and space efficiency. Time efficiency indicates how fast the algorithm runs; space efficiency indicates how much extra memory the algorithm needs. Another desirable characteristic is simplicity. Simpler algorithms are easier to understand and to program, and the resulting programs are easier to debug. For example, Euclid's algorithm for gcd(m, n) is simpler than the algorithm that uses prime factorization. Another desirable characteristic is generality. Two issues here are the generality of the problem the algorithm solves and the range of inputs it accepts. Designing an algorithm in general terms is sometimes easier; for example, it is easier to solve the general problem of computing the gcd of two integers than some special case of it. But at times designing a general algorithm is unnecessary, difficult, or even impossible. For example, it is unnecessary to sort a list of n numbers to find its median, which is its ⌈n/2⌉th smallest element. As to the range of inputs, we should aim at a range of inputs that is natural for the problem at hand.
∗ Coding an algorithm
The algorithm is then programmed in some programming language. Formal verification is feasible only for small programs; validity is usually established through testing and debugging. If the inputs are guaranteed to fall within a valid range, they require no verification. Some compilers allow code optimization, which can speed up a program by a constant factor, whereas a better algorithm can make a difference of orders of magnitude in running time. The analysis has to be done on various sets of inputs.
A good algorithm is the result of repeated effort and rework. The program's stopping/terminating condition has to be set. Optimality is an interesting issue that depends on the complexity of the problem to be solved. Another important issue is the question of whether or not every problem can be solved by an algorithm. Lastly, one should avoid the ambiguity that can arise in a complicated algorithm.
IMPORTANT PROBLEM TYPES
The two motivating forces for studying any problem type are its practical importance and some specific characteristics.
The different types are:
1. Sorting
2. Searching
3. String processing
4. Graph problems
5. Combinatorial problems
6. Geometric problems
7. Numerical problems.
1. Sorting
The sorting problem asks us to rearrange the items of a given list in ascending order. We usually sort lists of numbers, characters or strings, as well as records such as those kept about students, library holdings or company employees. In a record, a specially chosen piece of information guides the sorting; for example, student records can be sorted either by register number or by name. Such a piece of information is called a key.
Sorting is most important as a way of speeding up the searching of records. There are many different sorting algorithms. Some algorithms sort an arbitrary list of size n using about n log2 n comparisons; on the other hand, no algorithm that sorts by key comparisons can do substantially better than that. Although some algorithms are better than others, there is no algorithm that is the best in all situations. Some algorithms are simple but relatively slow, while others are faster but more complex. Some are suitable only for lists residing in fast memory, while others can be adapted for sorting large files stored on a disk, and so on.
Sorting algorithms have two important properties. An algorithm is called stable if it preserves the relative order of any two equal elements in its input. For example, if we sort a student list by GPA and two students have the same GPA, a stable sort keeps them in their original relative order. An algorithm is said to be in place if it does not require extra memory beyond a few memory units. Some sorting algorithms are in place and others are not.
2. Searching
The searching problem deals with finding a given value, called a search key, in a given set. Searching algorithms range from the straightforward sequential search to the strikingly different binary search. These algorithms play an important role in real-life applications because they are used for storing and retrieving information from large databases. Some algorithms work faster but require more memory; some are very fast but applicable only to sorted arrays. Unlike sorting, searching in many applications must also deal with the addition and deletion of records. In such cases, the data structures and algorithms are chosen to strike a balance among the required set of operations.
3. String processing
A string is a sequence of characters from an alphabet. String-processing problems include string matching, i.e., searching for a given pattern in a text; the naïve, Rabin-Karp and Knuth-Morris-Pratt algorithms covered later in this unit solve this problem.
4. Graph problems
One of the most interesting areas in algorithmics is graph algorithms. A graph is a collection of points called vertices, some of which are connected by line segments called edges. Graphs are used for modeling a wide variety of real-life applications, such as transportation and communication networks.
Basic graph algorithms include graph traversal, shortest-path and topological sorting algorithms. Some graph problems are very hard: only very small instances of such problems can be solved in a realistic amount of time, even with the fastest computers.
Two well-known hard problems are the following. The traveling salesman problem asks for the shortest tour through n cities that visits every city exactly once. The graph-coloring problem is to assign the smallest number of colors to the vertices of a graph so that no two adjacent vertices have the same color. The latter arises in event scheduling: if the events are represented by vertices that are connected by an edge whenever the corresponding events cannot be scheduled at the same time, a solution to the graph-coloring problem gives an optimal schedule.
5. Combinatorial problems
The traveling salesman problem and the graph-coloring problem are examples of combinatorial problems. These are problems that ask us to find a combinatorial object, such as a permutation, a combination or a subset, that satisfies certain constraints and has some desired property (e.g., maximizes a value or minimizes a cost).
These problems are difficult for the following reasons. First, the number of combinatorial objects grows extremely fast with a problem's size. Second, there are no known algorithms that solve most such problems exactly in an acceptable amount of time.
6. Geometric problems
Geometric algorithms deal with geometric objects such as points, lines and polygons, and with shapes such as triangles, circles, etc. These algorithms have applications in computer graphics, robotics, and so on. The two most widely studied problems are the closest-pair problem (given n points in the plane, find the closest pair among them) and the convex-hull problem, which asks for the smallest convex polygon that includes all the points of a given set.
7. Numerical problems
This is another large special area of applications, where the problems involve mathematical objects of a continuous nature: solving equations, computing definite integrals, evaluating functions, and so on. Most such problems can be solved only approximately. They also require manipulating real numbers, which can be represented in a computer only approximately; this can lead to an accumulation of round-off errors. The algorithms designed for these problems are used mainly in scientific and engineering applications.
THE ANALYSIS FRAMEWORK
For analyzing the efficiency of algorithms there are two kinds of efficiency: time efficiency and space efficiency. Time efficiency indicates how fast an algorithm runs; space efficiency deals with the extra space the algorithm requires. The space requirement is usually of less concern because of today's large and fast main memory and cache memory, so we concentrate more on time efficiency.
Almost all algorithms run longer on larger inputs. For example, it takes longer to sort larger arrays, multiply larger matrices, and so on. It is therefore logical to investigate an algorithm's efficiency as a function of some parameter n indicating the algorithm's input size. For example, n will be the size of the list for problems of sorting, searching, etc. For the problem of evaluating a polynomial p(x) = a_n x^n + ... + a_0 of degree n, it will be the polynomial's degree or the number of its coefficients, which is larger by one than its degree.
The choice of input-size measure can also be influenced by the operations of the algorithm. For example, if a spell-checking algorithm examines individual characters of its input, we measure the size by the number of characters; if it works with whole words, we measure it by the number of words.
Note: special care is needed when measuring the size of inputs for algorithms involving properties of numbers. For such algorithms, computer scientists prefer measuring size by the number b of bits in n's binary representation:
b = ⌊log2 n⌋ + 1.
Units for measuring Running time
We could use some standard unit of time to measure the running time of a program implementing the algorithm. The drawbacks of such an approach are: the dependence on the speed of a particular computer, the quality of the program implementing the algorithm, the compiler used to generate its machine code, and the difficulty of clocking the actual running time of the program. For simplicity, we want a measure that does not depend on these extraneous factors.
One possible approach is to count the number of times each of the algorithm's operations is executed. A simpler way is to identify the most important operation of the algorithm, called the basic operation, the operation contributing the most to the total running time, and to compute the number of times the basic operation is executed.
The basic operation is usually the most time-consuming operation in the algorithm's innermost loop. For example, most sorting algorithms work by comparing elements (keys) of the list being sorted with each other; for such algorithms, the basic operation is the key comparison.
Let c_op be the time of execution of an algorithm's basic operation on a particular computer, and let C(n) be the number of times this operation needs to be executed for this algorithm. Then we can estimate the running time T(n) as T(n) ≈ c_op C(n).
Here, the count C(n) does not contain any information about operations that are not basic, and in fact the count itself is often computed only approximately. The constant c_op is also an approximation whose reliability is not easy to assess. Still, the formula is useful. If this algorithm is executed on a machine that is ten times faster than the one we have, its running time will be about ten times smaller. Or, assuming that C(n) = ½ n(n - 1), how much longer will the algorithm run if we double its input size? The answer is about four times longer. Indeed, for all but very small values of n,
C(n) = ½ n(n - 1) = ½ n^2 - ½ n ≈ ½ n^2
and therefore
T(2n)/T(n) ≈ [c_op C(2n)] / [c_op C(n)] ≈ [½ (2n)^2] / [½ n^2] = 4.
Note that we did not need the value of c_op: it cancelled out in the ratio. Also, the multiplicative constant ½ cancelled out. Therefore, the efficiency analysis framework ignores multiplicative constants and concentrates on the count's order of growth to within a constant multiple for large input sizes.
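A tiny Python check (illustrative only) of the fourfold-growth claim for C(n) = ½ n(n - 1):

def C(n):
    # basic-operation count assumed in the text
    return n * (n - 1) / 2

for n in [100, 1000, 10000]:
    # the ratio C(2n)/C(n) approaches 4 as n grows
    print(n, C(2 * n) / C(n))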
Orders of Growth
Order of growth is mainly of interest for large input sizes; a difference in running times on small inputs is not what distinguishes efficient algorithms from inefficient ones.
Values of several functions important for analysis of algorithms:
n        log2 n   n        n log2 n    n^2      n^3      2^n          n!
10       3.3      10       3.3x10^1    10^2     10^3     10^3         3.6x10^6
10^2     6.6      10^2     6.6x10^2    10^4     10^6     1.3x10^30    9.3x10^157
10^3     10       10^3     1.0x10^4    10^6     10^9
10^4     13       10^4     1.3x10^5    10^8     10^12
10^5     17       10^5     1.7x10^6    10^10    10^15
10^6     20       10^6     2.0x10^7    10^12    10^18
(Values of 2^n and n! beyond n = 10^2 are too large to be shown.)
The function growing most slowly among these is the logarithmic function; a program whose basic-operation count is logarithmic runs practically instantaneously on inputs of all realistic sizes. Although specific values of such a count depend, of course, on the logarithm's base, the formula
log_a n = log_a b · log_b n
makes it possible to switch from one base to another, leaving the count logarithmic but with a new multiplicative constant.
On the other end, the exponential function 2^n and the factorial function n! grow so fast that their values become astronomically large even for rather small values of n. These two functions are referred to as exponential-growth functions. Algorithms that require an exponential number of operations are practical for solving only problems of very small sizes.
Another way to appreciate the qualitative difference among the orders of growth is to consider how the functions react to, say, a twofold increase in the value of their argument n. The function log2 n increases in value by just 1 (since log2 2n = log2 2 + log2 n = 1 + log2 n); the linear function increases twofold; the function n log n increases slightly more than twofold; the quadratic function n^2 increases fourfold (since (2n)^2 = 4n^2) and the cubic function n^3 eightfold (since (2n)^3 = 8n^3); the value of 2^n gets squared (since 2^(2n) = (2^n)^2); and n! increases much more than that.
Worst-case, Best-case and Average-case efficiencies:
The running time depends not only on the input size but also on the specifics of a particular input. Consider, as an example, sequential search. This is a straightforward algorithm that searches for a given item (a search key K) in a list of n elements by checking successive elements of the list until either a match with the search key is found or the list is exhausted.
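A minimal Python sketch of sequential search (illustrative; it returns the index of the first match or -1):

def sequential_search(lst, key):
    # Check successive elements until a match is found or the list is exhausted.
    for i, item in enumerate(lst):
        if item == key:
            return i          # best case: first element matches, 1 comparison
    return -1                 # worst case: n comparisons, no match

print(sequential_search([5, 3, 8, 1], 8))   # prints 2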
Clearly, the running time of this algorithm can be quite different for the same list size n. In the
worst case, when there are no matching elements or the first matching element happens to be the last
one on the list, the algorithm makes the largest number of key comparisons among all possible inputs
of size n; Cworst (n) = n.
The worst-case efficiency of an algorithm is its efficiency for the worst-case input of size n, which is an input of size n for which the algorithm runs the longest among all possible inputs of that size. To determine it, we analyze the algorithm to see what kind of inputs yield the largest value of the basic operation's count C(n) among all possible inputs of size n, and then compute this worst-case value Cworst(n).
The best- case efficiency of an algorithm is its efficiency for the best-case input of size n, which
is an input of size n for which the algorithm runs the fastest among all inputs of that size. First, determine
the kind of inputs for which the count C(n) will be the smallest among all possible inputs of size n. Then
ascertain the value of C(n) on the most convenient inputs. For example, for sequential search with input size n, if the first element equals the search key, Cbest(n) = 1.
Neither the best-case nor the worst-case gives the necessary information about an algorithm’s
behaviour on a typical or random input. This is the information that the average-case efficiency seeks
to provide. To analyze the algorithm’s average-case efficiency, we must make some assumptions about
possible inputs of size n.
Let us consider again sequential search. The standard assumptions are that:
1. the probability of a successful search is equal to p (0 ≤ p ≤ 1), and,
2. the probability of the first match occurring in the ith position is the same for every i.
Accordingly, the probability of the first match occurring in the ith position of the list is p/n for every i, and the number of comparisons made by the algorithm in such a situation is i. In the case of an unsuccessful search, the number of comparisons is n, with the probability of such a search being (1 - p). Therefore,
Cavg(n) = [1 · p/n + 2 · p/n + ... + n · p/n] + n · (1 - p)
        = (p/n) · n(n + 1)/2 + n(1 - p)
        = p(n + 1)/2 + n(1 - p).
This general formula yields the answers we need. For example, if p = 1 (i.e., the search is successful), the average number of key comparisons made by sequential search is (n + 1)/2; that is, the algorithm will inspect, on average, about half of the list's elements. If p = 0 (i.e., the search is unsuccessful), the average number of key comparisons will be n, because the algorithm will inspect all n elements on all such inputs.
Note that the average-case efficiency cannot be obtained by simply taking the average of the worst-case and best-case efficiencies.
Another type of efficiency is called amortized efficiency. It applies not to a single run of an
algorithm but rather to a sequence of operations performed on the same data structure. In some situations
a single operation can be expensive, but the total time for an entire sequence of such n operations is
always better than the worst-case efficiency of that single operation multiplied by n. It is considered in
algorithms for finding unions of disjoint sets.
Recaps of Analysis framework:
1. Both time and space efficiencies are measured as functions of the algorithm's input size.
2. Time efficiency is measured by counting the number of times the algorithm’s basic operation is
executed. Space efficiency is measured by counting the number of extra memory units consumed
by the algorithm.
3. The efficiencies of some algorithms may differ significantly for input of the same size. For such
algorithms, we need to distinguish between the worst-case, average-case and best-case efficiencies.
4. The framework's primary interest lies in the order of growth of the algorithm's running time as its
input size goes to infinity.
ASYMPTOTIC NOTATIONS
The efficiency analysis framework concentrates on the order of growth of an algorithm's basic operation count as the principal indicator of the algorithm's efficiency. To compare and rank such orders of growth, we use three notations: O (big oh), Ω (big omega) and Θ (big theta). First, we give informal definitions, in which t(n) and g(n) can be any nonnegative functions defined on the set of natural numbers; t(n) will be the running time or basic-operation count C(n), and g(n) is some simple function to compare the count with.
Informal Introduction:
O[g(n)] is the set of all functions with a smaller or same order of growth as g(n). E.g.: n ∈ O(n^2), 100n + 5 ∈ O(n^2), ½ n(n - 1) ∈ O(n^2).
The first two are linear and have a smaller order of growth than g(n) = n^2, while the last one is quadratic and hence has the same order of growth as n^2. On the other hand, n^3 ∉ O(n^2), 0.00001 n^3 ∉ O(n^2), n^4 + n + 1 ∉ O(n^2): the functions n^3 and 0.00001 n^3 are both cubic and have a higher order of growth than n^2, and so has the fourth-degree polynomial n^4 + n + 1.
The second notation, Ω[g(n)], stands for the set of all functions with a larger or same order of growth as g(n). E.g.: n^3 ∈ Ω(n^2), ½ n(n - 1) ∈ Ω(n^2), 100n + 5 ∉ Ω(n^2).
Finally, Θ[g(n)] is the set of all functions that have the same order of growth as g(n).
E.g., an^2 + bn + c with a > 0 is in Θ(n^2).
O-notation:
Definition: A function t(n) is said to be in O[g(n)], denoted t(n) ∈ O[g(n)], if t(n) is bounded above by some constant multiple of g(n) for all large n, i.e., if there exist some positive constant c and some nonnegative integer n0 such that t(n) ≤ c g(n) for all n ≥ n0.
E.g., 100n + 5 ∈ O(n^2).
Proof: 100n + 5 ≤ 100n + n (for all n ≥ 5) = 101n ≤ 101 n^2.
Thus, as values of the constants c and n0 required by the definition, we can take 101 and 5, respectively.
Note that the constants c and n0 are not unique. For example, we can also take c = 105 and n0 = 1 (since 100n + 5 ≤ 105n for all n ≥ 1).
Ω-Notation:
Definition: A function t(n) is said to be in Ω[g(n)], denoted t(n) ∈ Ω[g(n)], if t(n) is bounded below by some positive constant multiple of g(n) for all large n, i.e., if there exist some positive constant c and some nonnegative integer n0 such that
t(n) ≥ c g(n) for all n ≥ n0.
For example, n^3 ∈ Ω(n^2): indeed, n^3 ≥ n^2 for all n ≥ 0, so we can select c = 1 and n0 = 0.
θ - Notation:
Definition: A function t(n) is said to be in Θ[g(n)], denoted t(n) ∈ Θ[g(n)], if t(n) is bounded both above and below by some positive constant multiples of g(n) for all large n, i.e., if there exist some positive constants c1 and c2 and some nonnegative integer n0 such that c2 g(n) ≤ t(n) ≤ c1 g(n) for all n ≥ n0.
Example: Let us prove that ½ n(n - 1) ∈ Θ(n^2). First, we prove the right inequality (the upper bound):
½ n(n - 1) = ½ n^2 - ½ n ≤ ½ n^2 for all n ≥ 0.
Second, we prove the left inequality (the lower bound):
½ n(n - 1) = ½ n^2 - ½ n ≥ ½ n^2 - ½ n · ½ n (for all n ≥ 2) = ¼ n^2.
Hence, we can select c2 = ¼, c1 = ½ and n0 = 2.
Useful property involving these Notations:
The following property is used in analyzing algorithms that consist of two consecutively executed parts.
THEOREM: If t1(n) ∈ O(g1(n)) and t2(n) ∈ O(g2(n)), then t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}).
PROOF: The proof extends to orders of growth the following simple fact about four arbitrary real numbers a1, b1, a2, b2: if a1 ≤ b1 and a2 ≤ b2, then a1 + a2 ≤ 2 max{b1, b2}.
Since t1(n) ∈ O(g1(n)), there exist some positive constant c1 and some nonnegative integer n1 such that
t1(n) ≤ c1 g1(n) for all n ≥ n1.
Similarly, since t2(n) ∈ O(g2(n)),
t2(n) ≤ c2 g2(n) for all n ≥ n2.
Let us denote c3 = max{c1, c2} and consider n ≥ max{n1, n2} so that we can use both inequalities. Adding the two inequalities above yields the following:
t1(n) + t2(n) ≤ c1 g1(n) + c2 g2(n)
             ≤ c3 g1(n) + c3 g2(n) = c3 [g1(n) + g2(n)]
             ≤ 2 c3 max{g1(n), g2(n)}.
Hence, t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}), with the constants c and n0 required by the O definition being 2c3 = 2 max{c1, c2} and max{n1, n2}, respectively.
This implies that an algorithm's overall efficiency is determined by the part with the larger order of growth, i.e., its least efficient part.
For example, we can check whether an array has identical elements by means of the following two-part algorithm: first, sort the array by applying some known sorting algorithm; second, scan the sorted array to check its consecutive elements for equality. If, for example, the sorting algorithm used in the first part makes no more than ½ n(n - 1) comparisons (and hence is in O(n^2)) while the second part makes no more than n - 1 comparisons (and hence is in O(n)), the efficiency of the entire algorithm will be in O(max{n^2, n}) = O(n^2).
A convenient method for comparing orders of growth is based on computing the limit of the ratio of the two functions in question. Three principal cases may arise:

    lim (n→∞) t(n)/g(n) = 0        implies that t(n) has a smaller order of growth than g(n);
    lim (n→∞) t(n)/g(n) = c > 0    implies that t(n) has the same order of growth as g(n);
    lim (n→∞) t(n)/g(n) = ∞        implies that t(n) has a larger order of growth than g(n).

Note that the first two cases mean that t(n) ∈ O(g(n)), the last two mean that t(n) ∈ Ω(g(n)), and the second case means that t(n) ∈ Θ(g(n)).
EXAMPLE 1: Compare the orders of growth of ½ n(n - 1) and n^2. (This is one of the examples we did above to illustrate the definitions.)

    lim (n→∞) [½ n(n - 1)] / n^2 = ½ lim (n→∞) (n^2 - n)/n^2 = ½ lim (n→∞) (1 - 1/n) = ½.

Since the limit is equal to a positive constant, the functions have the same order of growth or, symbolically, ½ n(n - 1) ∈ Θ(n^2).
Solving recurrences
Substitution method
A lot of things in this class reduce to induction. In the substitution method for
solving recurrences we
Guess the form of the solution.
Use mathematical induction to find the constants and show that the solution
works.
Example
Consider the recurrence T(n) = 2T(⌊n/2⌋) + n. We guess that the solution is T(n) = O(n log n), i.e., that T(n) ≤ cn log n for an appropriate constant c > 0. (As we will see below, we will not be able to prove the statement for all n ≥ 1.)
As our inductive hypothesis, we assume T(n) ≤ cn log n for all positive numbers less than n. Therefore, T(⌊n/2⌋) ≤ c⌊n/2⌋ log(⌊n/2⌋), and
T(n) ≤ 2(c⌊n/2⌋ log(⌊n/2⌋)) + n
     ≤ cn log(n/2) + n
     = cn log n - cn log 2 + n
     = cn log n - cn + n
     ≤ cn log n     (for c ≥ 1)
Now we need to show the base case. This is tricky, because if T(n) ≤ cn log n, then T(1) ≤ 0, which cannot hold. So we revise our induction so that we only prove the statement for n ≥ 2, and the base cases of the induction proof (which is not the same as the base case of the recurrence!) are n = 2 and n = 3. (We are allowed to do this because asymptotic notation only requires us to prove our statement for n ≥ n0, and we can set n0 = 2.)
We choose n = 2 and n = 3 for our base cases because when we expand the recurrence formula, we will always go through either n = 2 or n = 3 before we hit the case where n = 1.
So proving the inductive step as above, plus proving that the bound works for n = 2 and n = 3, suffices for our proof that the bound works for all n ≥ 2.
Plugging the numbers into the recurrence formula, we get T (2) = 2T (1) + 2
= 4 and T (3) = 2T (1) + 3 = 5. So now we just need to choose a c that satisfies those
constraints on T (2) and T (3). We can choose c = 2, because 4 ≤ 2 · 2 log 2 and 5 ≤ 2
· 3 log 3.
Therefore, we have shown that T (n) ≤ 2n log n for all n ≥ 2, so T (n) = O(n log
n).
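As a quick numerical sanity check (not a proof), a small Python sketch that compares the recurrence values with the bound 2n log n, assuming the base case T(1) = 1 used above:

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 2*T(floor(n/2)) + n with T(1) = 1 (assumed base case)
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

for n in [2, 3, 10, 100, 1000]:
    print(n, T(n), 2 * n * math.log2(n))   # T(n) stays at or below the bound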
Warnings
Warning: Using the substitution method, it is easy to prove a weaker bound than the one you're supposed to prove. For instance, if the runtime is O(n), you might still be able to substitute cn^2 into the recurrence and prove that the bound is O(n^2). This is technically true, but don't let it mislead you into thinking it's the best bound on the runtime. People often get burned by this on exams!
Warning: You must prove the exact form of the induction hypothesis. For example, in the recurrence T(n) = 2T(⌊n/2⌋) + n, we could falsely "prove" T(n) = O(n) by guessing T(n) ≤ cn and then arguing T(n) ≤ 2(c⌊n/2⌋) + n ≤ cn + n = O(n). Here we needed to prove T(n) ≤ cn, not T(n) ≤ (c + 1)n. Accumulated over many recursive calls, those "plus ones" add up.
Recursion tree
A recursion tree is a tree where each node represents the cost of a certain recursive subproblem. We can then sum up the numbers in each node to get the cost of the entire algorithm.
Note: We would usually use a recursion tree to generate possible guesses for
the runtime, and then use the substitution method to prove them. However, if you are
very careful when drawing out a recursion tree and summing the costs, you can actually
use a recursion tree as a direct proof of a solution to a recurrence.
If we are only using recursion trees to generate guesses and not prove anything,
we can tolerate a certain amount of “sloppiness” in our analysis. For example, we can
ignore floors and ceilings when solving our recurrences, as they usually do not affect
the final guess.
Example
Consider the recurrence T(n) = 3T(n/4) + cn^2. In the recursion tree, the top node has cost cn^2, because the first call to the function does cn^2 units of work, aside from the work done inside the recursive subcalls. The nodes on the second layer all have cost c(n/4)^2, because the functions are now being called on problems of size n/4, and each does c(n/4)^2 units of work aside from the work done inside its recursive subcalls, and so on.
The bottom layer (base case) is special because each of its nodes contributes T(1) to the cost.
Analysis: First we find the height of the recursion tree. Observe that a node at depth i reflects a subproblem of size n/4^i. The subproblem size hits n = 1 when n/4^i = 1, or i = log4 n. So the tree has log4 n + 1 levels.
Now we determine the cost of each level of the tree. The number of nodes at depth i is 3^i. Each node at depth i = 0, 1, ..., log4 n - 1 has a cost of c(n/4^i)^2, so the total cost of level i is 3^i c(n/4^i)^2 = (3/16)^i cn^2. However, the bottom level is special: each of the bottom nodes contributes cost T(1), and there are 3^(log4 n) = n^(log4 3) of them.
So the total cost of the entire tree is

T(n) = cn^2 + (3/16) cn^2 + (3/16)^2 cn^2 + ... + (3/16)^(log4 n - 1) cn^2 + Θ(n^(log4 3))
     = Σ (i = 0 to log4 n - 1) (3/16)^i cn^2 + Θ(n^(log4 3)).

The first term is just the sum of a geometric series. This looks complicated, but we can bound it (from above) by the sum of the corresponding infinite series:

T(n) ≤ Σ (i = 0 to ∞) (3/16)^i cn^2 + Θ(n^(log4 3)) = 1/(1 - 3/16) · cn^2 + Θ(n^(log4 3)) = (16/13) cn^2 + Θ(n^(log4 3)).
Since functions in Θ(n^(log4 3)) are also in O(n^2), this whole expression is O(n^2). Therefore, we can guess that T(n) = O(n^2).
Now we can check our guess using the substitution method. Recall that the original recurrence was T(n) = 3T(⌊n/4⌋) + Θ(n^2). We want to show that T(n) ≤ dn^2 for some constant d > 0. By the induction hypothesis, we have that T(⌊n/4⌋) ≤ d⌊n/4⌋^2. So, using the same constant c > 0 as before, we have
T(n) ≤ 3d⌊n/4⌋^2 + cn^2 ≤ 3d(n/4)^2 + cn^2 = (3/16) dn^2 + cn^2 ≤ dn^2,
where the last step holds as long as d ≥ (16/13) c.
Note that we would also have to identify a suitable base case and prove that the bound holds for it; we do not have time to cover this in lecture, but you should do it in your homework.
The master theorem is a formula for solving recurrences of the form T(n) = aT(n/b) + f(n), where a ≥ 1, b > 1 and f(n) is asymptotically positive. (Asymptotically positive means that the function is positive for all sufficiently large n.)
This recurrence describes an algorithm that divides a problem of size n into a subproblems, each of size n/b, and solves them recursively. (Note that n/b might not be an integer, but in Section 4.6 of the book it is proved that replacing T(n/b) with T(⌈n/b⌉) or T(⌊n/b⌋) does not affect the asymptotic behavior of the recurrence. So we will just ignore floors and ceilings here.)
The theorem is as follows:
Case 1: If f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
Case 2: If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n).
Case 3: If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
The master theorem compares the function n^(log_b a) to the function f(n). Intuitively, if n^(log_b a) is larger (by a polynomial factor), then the solution is T(n) = Θ(n^(log_b a)). If f(n) is larger (by a polynomial factor), then the solution is T(n) = Θ(f(n)). If they are the same size, then we multiply by a logarithmic factor.
Be warned that these cases are not exhaustive: for example, it is possible for f(n) to be asymptotically larger than n^(log_b a), but not larger by a polynomial factor (no matter how small the exponent in the polynomial is). For example, this is true when f(n) = n^(log_b a) · log n. In this situation, the master theorem does not apply, and you would have to use another method to solve the recurrence.
1.3.1 Examples:
To use the master theorem, we simply plug the numbers into the formula.
Example 3: T(n) = 3T(n/4) + n log n. Here n^(log_b a) = n^(log_4 3) = O(n^0.793). For ε = 0.2, we have f(n) = Ω(n^(log_4 3 + ε)). So case 3 applies if we can show that a·f(n/b) ≤ c·f(n) for some c < 1 and all sufficiently large n. This would mean 3·(n/4) log(n/4) ≤ c·n log n. Setting c = 3/4 causes this condition to be satisfied, so T(n) = Θ(n log n).
Example 4: T(n) = 2T(n/2) + n log n. Here the master method does not apply: n^(log_b a) = n, and f(n) = n log n. Case 3 does not apply because even though n log n is asymptotically larger than n, it is not polynomially larger. That is, the ratio f(n)/n^(log_b a) = log n is asymptotically less than n^ε for all positive constants ε.
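The following is a small illustrative Python helper of my own, restricted to recurrences whose driving term is a plain polynomial n^k (so it cannot handle the n log n cases above):

import math

def master_theorem_poly(a, b, k):
    # Solves T(n) = a*T(n/b) + Theta(n^k) for a >= 1, b > 1 and a polynomial driving term.
    crit = math.log(a, b)                   # the critical exponent log_b(a)
    if math.isclose(k, crit):
        return f"Theta(n^{k} * log n)"      # case 2
    if k < crit:
        return f"Theta(n^{crit:.3f})"       # case 1
    return f"Theta(n^{k})"                  # case 3 (regularity holds for polynomials)

print(master_theorem_poly(3, 4, 2))   # T(n) = 3T(n/4) + n^2  ->  Theta(n^2)
print(master_theorem_poly(2, 2, 1))   # T(n) = 2T(n/2) + n    ->  Theta(n^1 * log n)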
2.1 Divide and conquer strategy
Instead of the daily stock price, consider the daily change in price, which on each day can be either a positive or a negative number. Let array A store these changes. Now we have to find the subarray of A that maximizes the sum of the numbers in that subarray (the maximum-subarray problem).
Now divide the array into two halves. Any maximum subarray must either be entirely in the first half, entirely in the second half, or it must span the border between the first and the second half. If the maximum subarray is entirely in the first half (or the second half), we can find it using a recursive call on a subproblem half as large.
If the maximum subarray spans the border, then the sum of that subarray is the sum of two parts: the part between the buy date and the border, and the part between the border and the sell date. To maximize the sum of the subarray, we must maximize the sum of each part.
We can do this by simply (1) iterating over all possible buy dates to maximize the first part and (2) iterating over all possible sell dates to maximize the second part. Note that this takes linear time instead of quadratic time, because we no longer have to iterate over buy and sell dates simultaneously.
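A minimal Python sketch of the strategy just described; the function names and the choice to return only the maximum sum (rather than the buy and sell indices) are my own:

def max_crossing_sum(A, low, mid, high):
    # Best sum of a subarray ending at mid (left part of the border-spanning subarray)...
    left_sum, total = float("-inf"), 0
    for i in range(mid, low - 1, -1):
        total += A[i]
        left_sum = max(left_sum, total)
    # ...plus the best sum of a subarray starting at mid+1 (right part).
    right_sum, total = float("-inf"), 0
    for j in range(mid + 1, high + 1):
        total += A[j]
        right_sum = max(right_sum, total)
    return left_sum + right_sum

def max_subarray_sum(A, low, high):
    if low == high:
        return A[low]                                 # base case: one element
    mid = (low + high) // 2
    return max(max_subarray_sum(A, low, mid),         # entirely in the left half
               max_subarray_sum(A, mid + 1, high),    # entirely in the right half
               max_crossing_sum(A, low, mid, high))   # spans the border

changes = [13, -3, -25, 20, -3, -16, -23, 18, 20, -7, 12, -5, -22, 15, -4, 7]
print(max_subarray_sum(changes, 0, len(changes) - 1))   # prints 43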
2.1.1 Runtime analysis
Note that we are omitting the correctness proof, because the main point is to give an example of the
divide and conquer strategy. In the homework, you would normally need to provide a correctness
proof, unless we say otherwise.
First we analyze the runtime of FindMaxCrossingSubarray. Since each iteration of each of the two for loops takes Θ(1) time, we just need to count how many iterations there are altogether. The for loop of lines 3-7 makes mid - low + 1 iterations, and the for loop of lines 10-14 makes high - mid iterations, so the total number of iterations is high - low + 1 = n. Therefore, the helper function takes Θ(n) time.
Now we proceed to analyze the runtime of the main function. For the base case, T(1) = Θ(1), since line 2 takes constant time.
For the recursive case, lines 1 and 3 take constant time. Lines 4 and 5 take T(⌊n/2⌋) and T(⌈n/2⌉) time, since each of the subproblems has that many elements. The FindMaxCrossingSubarray procedure takes Θ(n) time, and the rest of the code takes Θ(1) time. So T(n) = Θ(1) + T(⌊n/2⌋) + T(⌈n/2⌉) + Θ(n) + Θ(1) = 2T(n/2) + Θ(n) (ignoring the floors and ceilings).
By case 2 of the master theorem, this recurrence has the solution T (n) = Θ(n log n).
3 If we have extra time
Consider the recurrence T(n) = T(n/3) + T(2n/3) + O(n). Let c represent the constant factor in the O(n) term.
The longest simple path from the root to a leaf is n → (2/3)n → (2/3)^2 n → ... → 1. Since (2/3)^k n = 1 when k = log_(3/2) n, the height of the tree is log_(3/2) n.
We find that each level costs at most cn, but as we go down from the root, more and more internal nodes are absent, so the costs become smaller. Fortunately, we only care about an upper bound.
Based on this we can guess O(n log n) as an upper bound, and verify it by the substitution method.
In this section, we systematically apply the general framework to analyze the efficiency of
recursive algorithms. Let us start with a very simple example that demonstrates all the principal steps
typically taken in analyzing recursive algorithms.
Example 1: Compute the factorial function F(n) = n! for an arbitrary nonnegative integer n. Since n! = 1 · ... · (n - 1) · n = (n - 1)! · n for n ≥ 1, and 0! = 1 by definition, we can compute F(n) = F(n - 1) · n with the following recursive algorithm.
ALGORITHM F(n)
// Computes n! recursively
// Input: A nonnegative integer n
// Output: The value of n!
if n = 0 return 1
else return F(n - 1) * n
For simplicity, we consider n itself as an indicator of this algorithm's input size (rather than
the number of bits in its binary expansion). The basic operation of the algorithm is multiplication,
whose number of executions we denote M(n). Since the function F(n) is computed according to the
formula
F(n) = F(n - 1) · n for n > 0,
the number of multiplications M(n) needed to compute it must satisfy the equality
M(n) = M(n - 1) + 1 for n > 0.
Indeed, M(n - 1) multiplications are spent to compute F(n - 1), and one more multiplication is needed to multiply the result by n.
The last equation defines the sequence M(n) that we need to find. Note that the equation
defines M(n) not explicitly, i.e., as a function of n, but implicitly as a function of its value at another
point, namely n — 1. Such equations are called recurrence relations or, for brevity, recurrences.
Recurrence relations play an important role not only in analysis of algorithms but also in some areas
of applied mathematics. Our goal now is to solve the recurrence relation M(n) = M(n — 1) + 1, i.e.,
to find an explicit formula for the sequence M(n) in terms of n only.
Note, however, that there is not one but infinitely many sequences that satisfy this recurrence.
To determine a solution uniquely, we need an initial condition that tells us the value with which the
sequence starts. We can obtain this value by inspecting the condition that makes the algorithm stop
its recursive calls:
if n = 0 return 1.
This tells us two things. First, since the calls stop when n = 0, the smallest value of n for which
this algorithm is executed and hence M(n) defined is 0. Second,by inspecting the code's exiting line,
we can see that when n = 0, the algorithm performs no multiplications. Thus, the initial condition we
are after is
M (0) = 0.
Thus, we have succeeded in setting up the recurrence relation and initial condition for the algorithm's number of multiplications M(n):
M(n) = M(n - 1) + 1 for n > 0, (2.1)
M(0) = 0.
Before we embark on a discussion of how to solve this recurrence, let us pause to reiterate an important point. We are dealing here with two recursively defined functions. The first is the factorial function F(n) itself; it is defined by the recurrence
F(n) = F(n - 1) · n for n ≥ 1, F(0) = 1.
The second is the number of multiplications M(n) needed to compute F(n) by the recursive
algorithm whose pseudocode was given at the beginning of the section. As we just showed, M(n) is
defined by recurrence (2.1). And it is recurrence (2.1) that we need to solve now.
Though it is not difficult to "guess" the solution, it will be more useful to arrive at it in a
systematic fashion. Among several techniques available for solving recurrence relations, we use what
can be called the method of backward substitutions. The method's idea (and the reason for the name)
is immediately clear from the way it
applies to solving our particular recurrence:
M(n) = M(n - 1) + 1                          substitute M(n - 1) = M(n - 2) + 1
     = [M(n - 2) + 1] + 1 = M(n - 2) + 2     substitute M(n - 2) = M(n - 3) + 1
     = [M(n - 3) + 1] + 2 = M(n - 3) + 3.
After inspecting the first three lines, we see an emerging pattern, which makes it possible to predict not only the next line (what would it be?) but also a general formula for the pattern: M(n) = M(n - i) + i. Strictly speaking, the correctness of this formula should be proved by mathematical induction, but it is easier to get to the solution first as follows and then verify its correctness.
What remains is to take advantage of the initial condition given. Since it is specified for n = 0, we substitute i = n in the pattern's formula to get the ultimate result of our backward substitutions:
M(n) = M(n - n) + n = M(0) + n = n.
The benefits of the method illustrated in this simple example will become clear very soon,
when we have to solve more difficult recurrences. Also note that the simple iterative algorithm that
accumulates the product of n consecutive integers requires the same number of multiplications, and
it does so without the overhead of time and space used for maintaining the recursion's stack.
The issue of time efficiency is actually not that important for the problem of computing n!,
however. The function's values get so large so fast that we can realistically compute its values only
for very small n's. Again, we use this example just as a simple and convenient vehicle to introduce
the standard approach to analyzing recursive algorithms.
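As a quick illustration, a Python sketch (names my own) that instruments the recursive factorial to confirm that M(n) = n:

def factorial_with_count(n):
    # Returns (n!, number of multiplications performed).
    if n == 0:
        return 1, 0
    value, mults = factorial_with_count(n - 1)
    return value * n, mults + 1

for n in [0, 1, 5, 10]:
    value, mults = factorial_with_count(n)
    print(n, value, mults)   # the multiplication count equals n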
Generalizing our experience with investigating the recursive algorithm for computing n!, we
can now outline a general plan for investigating recursive algorithms.
Example 2: the algorithm to find the number of binary digits in the binary representation of a
positive decimal integer.
ALGORITHM BinRec(n)
//Input: A positive decimal integer n
//Output: The number of binary digits in n's binary representation
if n = 1 return 1
else return BinRec(⌊n/2⌋) + 1
Let us set up a recurrence and an initial condition for the number of additions A(n) made by the algorithm. The number of additions made in computing BinRec(⌊n/2⌋) is A(⌊n/2⌋), plus one more addition is made by the algorithm to increase the returned value by 1. This leads to the recurrence
A(n) = A(⌊n/2⌋) + 1 for n > 1. (2.2)
Since the recursive calls end when n is equal to 1 and there are no additions made then, the
initial condition is
A(1) = 0
The presence of ⌊n/2⌋ in the function's argument makes the method of backward substitutions stumble on values of n that are not powers of 2. Therefore, the standard approach to solving such a recurrence is to solve it only for n = 2^k and then take advantage of the theorem called the smoothness rule, which claims that under very broad assumptions the order of growth observed for n = 2^k gives a correct answer about the order of growth for all values of n. (Alternatively, after getting a solution for powers of 2, we can sometimes fine-tune this solution to get a formula valid for an arbitrary n.) So let us apply this recipe to our recurrence, which for n = 2^k takes the form
A(2^k) = A(2^(k-1)) + 1 for k > 0,   A(2^0) = 0.
Now backward substitutions encounter no problems:
A(2^k) = A(2^(k-1)) + 1
       = [A(2^(k-2)) + 1] + 1 = A(2^(k-2)) + 2
       ...
       = A(2^(k-i)) + i
       ...
       = A(2^(k-k)) + k.
Thus, we end up with
A(2^k) = A(1) + k = k,
or, after returning to the original variable n = 2^k and, hence, k = log2 n,
A(n) = log2 n ∈ Θ(log n).
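As a quick illustration (not part of the original analysis), a Python sketch that instruments BinRec with an addition counter; integer division stands in for ⌊n/2⌋:

import math

def bin_rec_with_count(n):
    # Returns (number of binary digits of n, number of additions performed).
    if n == 1:
        return 1, 0
    digits, adds = bin_rec_with_count(n // 2)
    return digits + 1, adds + 1

for n in [1, 2, 8, 16, 1000]:
    digits, adds = bin_rec_with_count(n)
    print(n, digits, adds, math.floor(math.log2(n)))   # adds equals floor(log2 n)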
Example: Fibonacci numbers
F(n) = F(n - 1) + F(n - 2) for n > 1 -------- (2.3)
and two initial conditions
F(0) = 0, F(1) = 1. -------- (2.4)
Though the Fibonacci numbers have many fascinating properties, we limit our discussion to a
few remarks about algorithms for computing them. Actually, the sequence grows so fast that it is the
size of the numbers rather than a time-efficient method for computing them that should be of primary
concern here. Also, for the sake of simplicity, we consider such operations as additions and
multiplications at unit cost in the algorithms that follow. Since the Fibonacci numbers grow infinitely
large (and grow rapidly), a more detailed analysis than the one offered here is warranted. These
caveats notwithstanding, the algorithms we outline and their analysis are useful examples for a student
of design and analysis of algorithms.
To begin with, we can use recurrence (2.3) and initial condition (2.4) for the obvious
recursive algorithm for computing F(n).
ALGORITHM F(n)
//Computes the nth Fibonacci number recursively by using its definition //Input:
A nonnegative integer n
//Output: The nth Fibonacci number
if n ≤ 1 return n
else return F(n - 1) + F(n - 2)
Analysis:
The algorithm's basic operation is clearly addition, so let A(n) be the number of additions
performed by the algorithm in computing F(n). Then the numbers of additions needed for computing
F(n — 1) and F(n — 2) are A(n — 1) and A(n — 2), respectively, and the algorithm needs one more
addition to compute their sum. Thus,
we get the following recurrence for A(n):
A(n) = A(n - 1) + A(n - 2) + 1 for n > 1, (2.8)
A(0)=0, A(1) = 0.
The recurrence A(n) - A(n - 1) - A(n - 2) = 1 is quite similar to recurrence (2.7), but its right-hand side is not equal to zero. Such recurrences are called inhomogeneous recurrences. There are general techniques for solving inhomogeneous recurrences (see Appendix B or any textbook on discrete mathematics), but for this particular recurrence, a special trick leads to a faster solution. We can reduce our inhomogeneous recurrence to a homogeneous one by rewriting it as
[A(n) + 1] - [A(n - 1) + 1] - [A(n - 2) + 1] = 0
and substituting B(n) = A(n) + 1:
B(n) - B(n - 1) - B(n - 2) = 0,
B(0) = 1, B(1) = 1.
This homogeneous recurrence can be solved exactly in the same manner as recurrence
(2.7) was solved to find an explicit formula for F(n).
We can obtain a much faster algorithm by simply computing the successive elements of
the Fibonacci sequence iteratively, as is done in the following algorithm.
ALGORITHM Fib(n)
//Computes the nth Fibonacci number iteratively by using its definition
//Input: A nonnegative integer n
//Output: The nth Fibonacci number
F[0] ← 0; F[1] ← 1
for i ← 2 to n do
    F[i] ← F[i - 1] + F[i - 2]
return F[n]
This algorithm clearly makes n - 1 additions (for n ≥ 2). Hence, it is linear as a function of n and "only" exponential as a function of the number of bits b in n's binary representation. Note that using an extra array for storing all the preceding elements of the Fibonacci sequence can be avoided: storing just the two most recent values is enough to accomplish the task.
The third alternative for computing the nth Fibonacci number is to use a closed-form formula based on the golden ratio φ. The efficiency of the algorithm will obviously be determined by the efficiency of the exponentiation algorithm used for computing φ^n. If it is done by simply multiplying φ by itself n - 1 times, the algorithm will be in Θ(n) = Θ(2^b). There are faster algorithms for the exponentiation problem. Note also that special care should be exercised in implementing this approach to computing the nth Fibonacci number: since all its intermediate results are irrational numbers, we would have to make sure that their approximations in the computer are accurate enough so that the final round-off yields the correct result.
Finally, there exists a Θ(log n) algorithm for computing the nth Fibonacci number that manipulates only integers. It is based on the equality

[ F(n-1)  F(n)   ]     [ 0  1 ]^n
[ F(n)    F(n+1) ]  =  [ 1  1 ]        for n ≥ 1,

together with an efficient way of computing matrix powers.
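A hedged Python sketch of this Θ(log n) approach using repeated squaring of the 2x2 matrix (function names are my own):

def mat_mult(X, Y):
    # 2x2 integer matrix product.
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

def mat_pow(M, n):
    # Compute M^n by repeated squaring: Θ(log n) matrix multiplications.
    result = [[1, 0], [0, 1]]          # identity matrix
    while n > 0:
        if n % 2 == 1:
            result = mat_mult(result, M)
        M = mat_mult(M, M)
        n //= 2
    return result

def fib(n):
    if n == 0:
        return 0
    # [[0,1],[1,1]]^n = [[F(n-1), F(n)], [F(n), F(n+1)]]
    return mat_pow([[0, 1], [1, 1]], n)[0][1]

print([fib(i) for i in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]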
Linear Search
Also known as the sequential search, the linear search is the most basic searching algorithm.
With a big-O notation of O(n), the linear search consists of comparing each element of the data
structure with the one you are searching for. It's up to your implementation whether you return
the value you were looking for or a Boolean according to whether or not the value was found.
As you can probably guess, this is a very inefficient process.
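A minimal Python sketch of linear search as just described, returning a Boolean:

def linear_search(data, target):
    # Compare each element with the target; O(n) comparisons in the worst case.
    for element in data:
        if element == target:
            return True
    return False

print(linear_search([4, 9, 2, 7], 7))   # True
print(linear_search([4, 9, 2, 7], 5))   # False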
Binary Search (recursive approach)
binarySearch(arr, item, beg, end)
    if beg <= end
        midIndex = (beg + end) / 2        // integer division
        if item == arr[midIndex]
            return midIndex
        else if item < arr[midIndex]
            return binarySearch(arr, item, beg, midIndex - 1)
        else
            return binarySearch(arr, item, midIndex + 1, end)
    return -1
Interpolation Search
Interpolation search works on sorted arrays; instead of always probing the middle element as binary search does, it estimates (interpolates) the likely position of the search key from the values at the ends of the current range.
Complexity
Worst case time complexity: O(N)
Average case time complexity: O(log log N)
Best case time complexity: O(1)
Space complexity: O(1)
On assuming a uniform distribution of the data on the linear scale used for interpolation, the
performance can be shown to be O(log log n).
Dynamic Interpolation Search is possible in o(log log n) time using a novel data structure.
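A hedged Python sketch of interpolation search on a sorted list of numbers (iterative form; variable names are my own):

def interpolation_search(arr, key):
    lo, hi = 0, len(arr) - 1
    while lo <= hi and arr[lo] <= key <= arr[hi]:
        if arr[hi] == arr[lo]:                     # avoid division by zero
            return lo if arr[lo] == key else -1
        # Estimate the probe position proportionally to where the key
        # falls between arr[lo] and arr[hi].
        pos = lo + (key - arr[lo]) * (hi - lo) // (arr[hi] - arr[lo])
        if arr[pos] == key:
            return pos
        if arr[pos] < key:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1

print(interpolation_search([10, 20, 30, 40, 50, 60], 40))   # prints 3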
The Rabin-Karp-Algorithm
The Rabin-Karp string matching algorithm calculates a hash value for the pattern, as well as for each m-character subsequence of the text to be compared. If the hash values are unequal, the algorithm computes the hash value for the next m-character subsequence. If the hash values are equal, the algorithm compares the pattern and the m-character subsequence character by character. In this way, there is only one hash comparison per text subsequence, and character matching is required only when the hash values match.
RABIN-KARP-MATCHER(T, P, d, q)
1. n ← length[T]
2. m ← length[P]
3. h ← d^(m-1) mod q
4. p ← 0
5. t0 ← 0
6. for i ← 1 to m
7.     do p ← (d·p + P[i]) mod q
8.        t0 ← (d·t0 + T[i]) mod q
9. for s ← 0 to n - m
10.    do if p = ts
11.        then if P[1..m] = T[s+1..s+m]
12.            then print "Pattern occurs with shift" s
13.       if s < n - m
14.           then t(s+1) ← (d(ts - T[s+1]·h) + T[s+m+1]) mod q
Example: For string matching with working modulo q = 11, how many spurious hits does the Rabin-Karp matcher encounter in the text T = 31415926535... ?
T = 31415926535.......
P = 26
Here q = 11, and p = P mod q = 26 mod 11 = 4.
The matcher then scans the text, flagging every window whose hash value equals 4 and comparing each such window with P; windows that hash to 4 but do not equal 26 are the spurious hits.
Complexity:
The running time of RABIN-KARP-MATCHER in the worst case is O((n - m + 1)m), but the algorithm has a good average-case running time. If the expected number of valid shifts is small (O(1)) and the prime q is chosen to be sufficiently large, then the Rabin-Karp algorithm can be expected to run in time O(n + m) plus the time required to process spurious hits.
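A compact Python sketch of the same idea using a rolling hash with radix d = 10, suitable for digit strings such as the text in the example above (0-based shifts; names are my own):

def rabin_karp(text, pattern, d=10, q=11):
    n, m = len(text), len(pattern)
    h = pow(d, m - 1, q)                   # d^(m-1) mod q
    p = t = 0
    for i in range(m):                     # preprocessing: initial hash values
        p = (d * p + int(pattern[i])) % q
        t = (d * t + int(text[i])) % q
    shifts = []
    for s in range(n - m + 1):
        if p == t and text[s:s + m] == pattern:
            shifts.append(s)               # genuine match (hash and characters agree)
        if s < n - m:                      # roll the hash to the next window
            t = (d * (t - int(text[s]) * h) + int(text[s + m])) % q
    return shifts

print(rabin_karp("31415926535", "26"))     # positions (0-based shifts) where "26" occurs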
The Knuth-Morris-Pratt (KMP) Algorithm
The KMP algorithm avoids re-examining text characters by precomputing, for the pattern P, a prefix function π: π[i] is the length of the longest proper prefix of P[1..i] that is also a suffix of P[1..i]. The matcher then uses π to decide how far the pattern can be shifted after a mismatch.
Example (prefix function): for a pattern of length m = 7, the computation starts with
m = length[P] = 7, π[1] = 0, k = 0,
and after iterating over the remaining six positions the prefix-function computation is complete.
Let us now execute the KMP algorithm to find whether P occurs in T. For P, the prefix function π was computed previously and is as follows:
Solution:
Initially: n = size of T = 15, m = size of P = 7.
Pattern P has been found to occur in the string T. The total number of shifts that took place for the match to be found is i - m = 13 - 7 = 6.
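A hedged Python sketch of the KMP prefix function and matcher (0-based indexing; the driver strings are an illustration of my own, not the example above):

def compute_prefix_function(pattern):
    m = len(pattern)
    pi = [0] * m
    k = 0
    for i in range(1, m):
        while k > 0 and pattern[k] != pattern[i]:
            k = pi[k - 1]              # fall back to the next-longest border
        if pattern[k] == pattern[i]:
            k += 1
        pi[i] = k
    return pi

def kmp_match(text, pattern):
    pi = compute_prefix_function(pattern)
    shifts, q = [], 0                  # q = number of pattern characters matched so far
    for i, ch in enumerate(text):
        while q > 0 and pattern[q] != ch:
            q = pi[q - 1]
        if pattern[q] == ch:
            q += 1
        if q == len(pattern):          # full match ending at position i
            shifts.append(i - len(pattern) + 1)
            q = pi[q - 1]
    return shifts

print(kmp_match("ababcabcabababd", "ababd"))   # prints [10]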
Insertion Sort
Insertion sort works the way many people sort playing cards in their hands: take the next element and insert it into its proper place within the already-sorted part of the array. Although it is simple to use, it is not appropriate for large data sets, as the time complexity of insertion sort in the average case and worst case is O(n^2), where n is the number of items. Insertion sort is therefore less efficient than other sorting algorithms such as heap sort, quick sort and merge sort. Its advantages are:
o Simple implementation
o Efficient for small data sets
o Adaptive, i.e., it is appropriate for data sets that are already substantially sorted.
Algorithm
The simple steps for achieving insertion sort are listed as follows -
Step 1 - If the element is the first element, assume that it is already sorted. Return 1.
Step 2 - Pick the next element, and store it separately in a key.
Step 3 - Now, compare the key with all elements in the sorted array.
Step 4 - If the element in the sorted array is smaller than the current element, then move to the next element. Else, shift greater elements in the array towards the right.
To understand the working of the insertion sort algorithm, let's take an unsorted array.
It will be easier to understand the insertion sort via an example.
Here, 31 is greater than 12. That means both elements are already in ascending order.
So, for now, 12 is stored in a sorted sub-array.
Here, 25 is smaller than 31, so 31 is not at the correct position. Now, swap 31 with 25. Along with swapping, insertion sort also checks the new element against all elements in the sorted sub-array.
For now, the sorted sub-array has only one other element, i.e. 12, and 25 is greater than 12. Hence, the sorted sub-array remains sorted after swapping.
Now, two elements in the sorted array are 12 and 25. Move forward to the next
elements that are 31 and 8.
Both 31 and 8 are not sorted. So, swap them.
Now, the sorted array has three items that are 8, 12 and 25. Move to the next items
that are 31 and 32.
Hence, they are already sorted. Now, the sorted array includes 8, 12, 25 and 31.
Swapping makes 31 and 17 unsorted. So, swap them too.
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the
array is already sorted. The best-case time complexity of insertion sort is O(n).
o Average Case Complexity - It occurs when the array elements are in jumbled
order that is not properly ascending and not properly descending. The average
case time complexity of insertion sort is O(n2).
o Worst Case Complexity - It occurs when the array elements are required to be
sorted in reverse order. That means suppose you have to sort the array elements
in ascending order, but its elements are in descending order. The worst-case
time complexity of insertion sort is O(n2).
2. Space Complexity
The space complexity of insertion sort is O(1): the sort is performed in place, and only a constant amount of extra storage (the key and index variables) is needed. A runnable implementation is sketched below.
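A minimal Python sketch of insertion sort, applied to the array used in the walkthrough above; the final loop prints the sorted array element by element:

def insertion_sort(a):
    for i in range(1, len(a)):
        key = a[i]                 # element to be inserted into the sorted part a[0..i-1]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]        # shift greater elements one position to the right
            j -= 1
        a[j + 1] = key
    return a

a = [12, 31, 25, 8, 32, 17]
insertion_sort(a)
for i in range(len(a)):
    print(a[i], end=" ")           # 8 12 17 25 31 32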
Heap Sort Algorithm
In this article, we will discuss the Heapsort Algorithm. Heap sort processes the
elements by creating the min-heap or max-heap using the elements of the given array.
Min-heap or max-heap represents the ordering of array in which the root element
represents the minimum or maximum element of the array.
Before knowing more about the heap sort, let's first see a brief description of Heap.
What is a heap?
A heap is a complete binary tree, where a binary tree is a tree in which each node can have at most two children. A complete binary tree is a binary tree in which all the levels except the last level (the leaf level) are completely filled, and all the nodes in the last level are as far left as possible.
Algorithm
HeapSort(arr)
    BuildMaxHeap(arr)
    for i = length(arr) downto 2
        swap arr[1] with arr[i]
        heap_size[arr] = heap_size[arr] - 1
        MaxHeapify(arr, 1)
End

BuildMaxHeap(arr)
    heap_size(arr) = length(arr)
    for i = length(arr)/2 downto 1
        MaxHeapify(arr, i)
End

MaxHeapify(arr, i)
    L = left(i)
    R = right(i)
    if L ≤ heap_size[arr] and arr[L] > arr[i]
        largest = L
    else
        largest = i
    if R ≤ heap_size[arr] and arr[R] > arr[largest]
        largest = R
    if largest != i
        swap arr[i] with arr[largest]
        MaxHeapify(arr, largest)
End
In heap sort, basically, two phases are involved in sorting the elements; they are as follows -
o The first step includes the creation of a heap by adjusting the elements of the
array.
o After the creation of the heap, now remove the root element of the heap repeatedly by moving it to the end of the array, and then restore the heap structure with the remaining elements.
Now let's see the working of heap sort in detail by using an example. To understand it
more clearly, let's take an unsorted array and try to sort it using heap sort. It will make
the explanation clearer and easier.
First, we have to construct a heap from the given array and convert it into max heap.
After converting the given heap into max heap, the array elements are -
Next, we have to delete the root element (89) from the max heap. To delete this node,
we have to swap it with the last node, i.e. (11). After deleting the root element, we
again have to heapify it to convert it into max heap.
After swapping the array element 89 with 11, and converting the heap into max-heap,
the elements of array are -
In the next step, again, we have to delete the root element (81) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (54). After deleting the
root element, we again have to heapify it to convert it into max heap.
After swapping the array element 81 with 54 and converting the heap into max-heap,
the elements of array are -
In the next step, we have to delete the root element (76) from the max heap again. To
delete this node, we have to swap it with the last node, i.e. (9). After deleting the root
element, we again have to heapify it to convert it into max heap.
After swapping the array element 76 with 9 and converting the heap into max-heap,
the elements of array are -
In the next step, again we have to delete the root element (54) from the max heap. To
delete this node, we have to swap it with the last node, i.e. (14). After deleting the root
element, we again have to heapify it to convert it into max heap.
After swapping the array element 54 with 14 and converting the heap into max-heap,
the elements of array are -
In the next step, again we have to delete the root element (22) from the max heap. To
delete this node, we have to swap it with the last node, i.e. (11). After deleting the root
element, we again have to heapify it to convert it into max heap.
After swapping the array element 22 with 11 and converting the heap into max-heap,
the elements of array are -
In the next step, again we have to delete the root element (14) from the max heap. To
delete this node, we have to swap it with the last node, i.e. (9). After deleting the root
element, we again have to heapify it to convert it into max heap.
After swapping the array element 14 with 9 and converting the heap into max-heap,
the elements of array are -
In the next step, again we have to delete the root element (11) from the max heap. To
delete this node, we have to swap it with the last node, i.e. (9). After deleting the root
element, we again have to heapify it to convert it into max heap.
After swapping the array element 11 with 9, the elements of array are -
Now, heap has only one element left. After deleting it, heap will be empty.
1. Time Complexity
The time complexity of heap sort is O(n log n) in all three cases (best case, average
case, and worst case), since the height of a complete binary tree with n elements is log n.
2. Space Complexity
The space complexity of heap sort is O(1), since it sorts the array in place. Heap sort
is not a stable sorting algorithm.
# Python program for the heap sort algorithm described above
def heapify(arr, n, i):
    largest = i              # initialize largest as root
    l = 2 * i + 1            # left = 2*i + 1
    r = 2 * i + 2            # right = 2*i + 2
    # See if the left child of root exists and is greater than root
    if l < n and arr[l] > arr[largest]:
        largest = l
    # See if the right child of root exists and is greater than the largest so far
    if r < n and arr[r] > arr[largest]:
        largest = r
    # Change root if needed and heapify the affected subtree
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

# The main function to sort an array of given size
def heapSort(arr):
    n = len(arr)
    # Build a maxheap.
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # One by one extract elements from the heap
    for i in range(n - 1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]
        heapify(arr, i, 0)

arr = [12, 11, 13, 5, 6, 7]
heapSort(arr)
n = len(arr)
for i in range(n):
    print(arr[i])
Output
Sorted array is
5
6
7
11
12
13
Time Complexity: O(n*log(n))
UNIT II GRAPH ALGORITHMS
Graph algorithms: Representations of graphs - Graph traversal: DFS – BFS - applications - Connectivity,
strong connectivity, bi-connectivity - Minimum spanning tree: Kruskal’s and Prim’s algorithm- Shortest
path: Bellman-Ford algorithm - Dijkstra’s algorithm - Floyd-Warshall algorithm Network flow: Flow
networks - Ford-Fulkerson method – Matching: Maximum bipartite matching
Like trees, graphs represent a fundamental data structure used in computer science. We often hear about
cyber space as being a new frontier for mankind, and if we look at the structure of cyberspace, we see
that it is structured as a graph; in other words, it consists of places (nodes), and connections between those
places. Some applications of graphs include the fact that trees are a special case of
graphs, in that they are the acyclic and connected graphs, and that trees are used in
many fundamental data structures.
• The degree of a vertex v, denoted deg(v), equals the number of edges that are incident with v.
• Handshaking Theorem: Σ v∈V deg(v) = 2|E|.
Example 1. Let G = (V, E), where V = {SD, SF, LA, SB, SJ, OAK} and
E = {(SD, LA), (SD, SF ), (LA, SB), (LA, SF ), (LA, SJ), (LA, OAK), (SB, SJ)}
are edges which represent flights between two cities. Provide the following for the graph:
A directed graph is a graph G = (V, E) whose edges have direction. In this case, given (u, v) ∈ E,
u is called the start vertex and v is called the end vertex. Moreover, the in-degree of vertex
v, denoted deg−(v), is the number of edges for which v is the end vertex, and the out-degree of
vertex v, denoted deg+(v), is the number of edges for which v is the start vertex.
An undirected graph can be made into a directed graph by orienting each edge of the graph; that is,
by assigning a direction to each edge. Conversely, each directed graph has associated with it an underlying
undirected graph which is obtained by removing the orientation of each edge.
Example 2. Redraw the graph from Example 1, but now with oriented edges.
Recall that a path in a graph G = (V, E) (either directed or undirected) from vertex u to vertex v
is a sequence of vertices u = v0, v1, . . . , vn = v for which (vi, vi+1) ∈ E for all i = 0, 1, . . . , n − 1.
We then say that G is connected provided there is a path from every vertex u ∈ V to every other
vertex v ∈ V. In what follows we present algorithms that pertain to the degree of connectivity of a graph.
To begin, we consider two different ways of traversing a graph; i.e. visiting each node of the graph.
The first algorithm makes use of a FIFO queue data structure, and is called a breadth-first
traversal, while the second uses a stack data structure, and is called a depth-first traversal.
Breadth-First Graph Traversal Algorithm.
Mark a start node and insert it into the (initially empty) FIFO queue Q.
While Q is nonempty:
    Remove node u from the front of Q; for each unmarked neighbor v of u, mark v and insert v at the rear of Q.
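A small Python sketch of the breadth-first traversal just described; it assumes the graph is given as a dictionary of adjacency lists (for an undirected graph each edge appears in both lists) and returns the spanning forest as a list of tree-edge lists. Names are illustrative.

from collections import deque

def bfs_forest(adj):
    marked = set()
    forest = []                          # one list of tree edges per tree
    for start in adj:                    # each unmarked vertex starts a new tree
        if start in marked:
            continue
        marked.add(start)
        tree_edges = []
        Q = deque([start])
        while Q:
            u = Q.popleft()              # remove node u from the front of Q
            for v in adj[u]:
                if v not in marked:      # (u, v) becomes a tree edge
                    marked.add(v)
                    tree_edges.append((u, v))
                    Q.append(v)          # insert v at the rear of Q
        forest.append(tree_edges)
    return forest

Edges of the graph that never become tree edges are the cross edges described next.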
In addition to visiting each node, this procedure implicitly yields a spanning forest of trees (or a
spanning tree if the forest has only one tree) whose edges are comprised of those of the form (u, v) where
u was the node that was removed from Q and led to v being marked. For undirected graphs, a breadth-
first traversal partitions the edges of G into those that are used in the forest, and those that are not
used. The latter edges are called cross edges, since they always connect nodes that are on different
branches of one of the (spanning) trees of the spanning forest.
Example 3. For the graph G = (V, E), where
V = {a, b, c, d, e, f, g, h, i, j, k}
E = {(a, b), (a, c), (b, c), (b, d), (b, e), (b, g), (c, g), (c, f ),
(d, f ), (f, g), (f, h), (g, h), (i, j), (i, k), (j, k)}.
Show the forest that results when performing a breadth-first traversal of the G. Assume that all
adjacency lists follow an alphabetical ordering.
Depth-First Graph Traversal Algorithm.
Mark a start node and push it on to the (initially empty) stack S.
While S is nonempty:
    Let u be at the top of S.
    Let v be the first unmarked neighbor/child of u.
    If v does not exist: pop u from S.
    Otherwise: mark v and push v on to S.
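A Python sketch of the stack-based depth-first traversal above, under the same assumed adjacency-list representation as the BFS sketch:

def dfs_forest(adj):
    marked = set()
    forest = []
    for start in adj:
        if start in marked:
            continue
        marked.add(start)
        tree_edges = []
        S = [start]                      # explicit stack
        while S:
            u = S[-1]                    # u is at the top of S
            # first unmarked neighbor/child of u, if any
            v = next((w for w in adj[u] if w not in marked), None)
            if v is None:
                S.pop()                  # v does not exist: pop u from S
            else:
                marked.add(v)            # otherwise: mark v and push it on to S
                tree_edges.append((u, v))
                S.append(v)
        forest.append(tree_edges)
    return forest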
In addition to visiting each node, this procedure also implicitly yields a spanning forest of trees (or a
spanning tree if the forest has only one tree) whose edges are comprised of those of the form (u, v) where
u was the node from the front of S that reached v and caused it to be marked. For undirected graphs, a
depth-first traversal partitions the edges of G into those that are used in the forest, and those that are
not used. The latter edges are called backward edges, since they always connect nodes that are on
the same tree branch (i.e., the edge connects a descendant to an ancestor).
Example 4. For the graph G = (V, E) from Example 3, where
V = {a, b, c, d, e, f, g, h, i, j, k}
E = {(a, b), (a, c), (b, c), (b, d), (b, e), (b, g), (c, g), (c, f ),
(d, f ), (f, g), (f, h), (g, h), (i, j), (i, k), (j, k)},
show the forest that results when performing a depth-first traversal of G. Assume that all
adjacency lists follow an alphabetical ordering.
Theorem 1. An undirected graph G is connected if and only if a breadth-first traversal of G
yields a forest consisting of exactly one tree.
Proof of Theorem 1. Suppose a breadth-first traversal of G yields a forest with exactly one tree
T. Then G is connected since T is connected (by definition a tree is connected and acyclic), and is
a spanning tree for G.
Now suppose G is connected. Suppose u is the root of the first tree. Let v be any other vertex of G.
Then there is a path P : u = v0, v1, . . . , vn−1 = v from u to v. Notice that, in the first iteration of the
outer while loop, v1 has the opportunity to be marked when v0 is removed from the queue. And
inductively, assuming that vi−1 will be entered in the queue (during the first outer-loop iteration), vi
will then have the opportunity to be marked. Therefore, by induction, v will be marked in the first
iteration of the outer while loop. Since v was arbitrary, it follows that all vertices of G belong to the
first and only tree (each iteration of the outer loop corresponds with a new tree).
Corollary 1. The size of the forest generated in a breadth-first traversal of a undirected graph equals
the number of connected components of that graph. Moreover, each tree in the forest represents a
spanning tree for one of the connected components; in other words, a tree that contains each vertex of
a component.
For directed graphs, first note that there are two kinds of connectivity to consider. The first is called
strong connectivity, and requires that all paths follow the orientation of each directed edge. The other
kind is called weak connectivity and allows for paths that ignore edge orientation. Of course, weak
connectivity can be tested using the breadth-first traversal algorithm described above. On the other
hand, the (strong) connectivity of a directed graph G = (V, E) can be determined by making two depth-
first traversals of the graph. In performing the first traversal, we recursively compute the post order
of each vertex. In the first depth-first traversal, for the base case, let v be the first vertex that is marked,
but has no unmarked children. Then the post order of v equals 1. For the recursive case, suppose v is
a vertex and the post order of each of v’s children has been computed. Then the post order of v is one
more than the largest post order of any of its children. Henceforth we let post(v) denote the post order
of v.
The second depth-first traversal is performed with what is called the reversal of G. The reversal Gr
of a directed graph G = (V, E) is defined as Gr = (V, Er), where Er is the set of edges obtained by
reversing the orientation of each edge in E. Moreover, during this second depth-first search, when
selecting a vertex to mark and enter into the stack during execution of the outer while loop, the vertex
of highest post order (obtained from the first depth-first traversal) is always chosen.
Theorem 2. In the second depth-first traversal of directed graph G = (V, E), the resulting forest
represents the set of strongly connected components of G.
Proof of Theorem 2. Let T1, T2, . . . , Tm, be the trees of the forest obtained from the second
traversal, and in the order in which they were constructed. First note that, if u and v are in different
trees, then u and v must be in different strongly-connected components. For example, suppose u is in
Ti and v is in Tj, with i < j, then there is no path in Gr from the root of Ti to v, which implies no
path from v to the root of Ti in G. Thus, being in the same tree is a necessary condition for being in
the same strongly-connected component. We now show that this condition is also sufficient, which
will complete the proof.
Now let u and v be in the same tree Tj, for some j = 1, . . . , m. Let r be the root of this tree. Then
there are paths from r to u and from r to v in Gr, which implies paths from u to r and from v to r
in G. We now show that there is also a path from r to u in G (and similarly from r to v). This will
imply a path from u to v (and similarly a path from v to u) in G by combining the paths from u to
r, and then from r to v.
To this end, since r is the root of a second-round depth-first tree, we know that r must have higher
post order than u, as computed during the first round. In the first round, r must be in the same tree as
u, since there is a path from u to r and r has a higher post order than u. But then, since u has post
order less than that of r, u must be a descendant of r, in which case there is a path from r to u.
Example 5. Use the algorithm described above to determine the strongly-connected components of
the following directed graph. G = (V, E), where
V = {a, b, c, d, e, f, g, h, i, j}
E = {(a, c), (b, a), (c, b), (c, f ), (d, a), (d, c), (e, c),
(e, d), (f, b), (f, g), (f, h), (h, g), (h, i), (i, j), (j, h)}.
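A Python sketch of the two-traversal procedure described above: the first depth-first traversal records post orders, and the second traverses the reversal Gr, always starting a new tree from the unvisited vertex of highest post order. The function names are illustrative, and the driver uses the edge list of Example 5.

from collections import defaultdict

def strongly_connected_components(vertices, edges):
    adj, radj = defaultdict(list), defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)                 # edges of the reversal Gr

    # First depth-first traversal of G: list vertices in increasing post order.
    post, visited = [], set()
    def dfs1(u):
        visited.add(u)
        for w in adj[u]:
            if w not in visited:
                dfs1(w)
        post.append(u)                    # u is finished: it gets the next post number
    for v in vertices:
        if v not in visited:
            dfs1(v)

    # Second traversal, on Gr, choosing roots in decreasing post order;
    # each tree of this forest is one strongly connected component.
    visited.clear()
    components = []
    def dfs2(u, comp):
        visited.add(u)
        comp.append(u)
        for w in radj[u]:
            if w not in visited:
                dfs2(w, comp)
    for v in reversed(post):
        if v not in visited:
            comp = []
            dfs2(v, comp)
            components.append(comp)
    return components

V = list("abcdefghij")
E = [("a","c"),("b","a"),("c","b"),("c","f"),("d","a"),("d","c"),("e","c"),
     ("e","d"),("f","b"),("f","g"),("f","h"),("h","g"),("h","i"),("i","j"),("j","h")]
print(strongly_connected_components(V, E))
# should report {a, b, c, f} and {h, i, j} as components, with d, e, g as singletons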
Biconnectivity
An undirected graph is called biconnected if there are no vertices, called articulation points, or
edges, called bridges, whose removal disconnects the graph.
Example 6. For the graph G = (V, E), where V = {a, b, c, d, e, f, g} and
E = {(a, c), (a, d), (b, c), (b, e), (c, d), (c, e), (d, f ), (e, g)},
find all articulation points and bridges.
Biconnectivity algorithm
We now show how to determine all articulation points and bridges using a single depth-first traversal
for a connected undirected graph.
Let D denote the depth-first spanning tree that is constructed for G = (V, E). For all v ∈ V, let num(v)
denote the order in which v is added to D (i.e. the order in which it is marked and placed in the
stack), and low(v) be recursively defined as the minimum of the following:
1. num(v),
2. the lowest num(w) for any back edge (v, w), and
3. the lowest low(w) for any child w of v in D.
Lemma 1. Let D denote the depth-first spanning tree that is constructed for G = (V, E), and v be any
vertex of D. Suppose T1 and T2 are two distinct subtrees rooted at v. Then there are no edges in G
that connect a vertex in T1 with a vertex in T2.
Proof of Lemma 1. Assume that T1 is generated first in the depth-first traversal that constructs
D. In other words, num(u) < num(w) for every u ∈ T1 and w ∈ T2. Thus, there cannot be an edge, say
(u, w) connecting T1 and T2, since otherwise, during the depth-first traversal, edge (u, w) would have
been added to D, and one would have w ∈ T1, a contradiction.
Theorem 3. Let G = (V, E) be an undirected and connected simple graph. Let D be a depth-first
spanning tree for G. Then v ∈ V is an articulation point iff either v is the root of D and two or more subtrees
are rooted at v, or v is not the root, but has a child w for which low(w) ≥ num(v).
Proof of Theorem 3. If v is the root and two or more subtrees are rooted at v, then the above lemma implies
that v’s removal from G will disconnect all of those subtrees. Therefore v is an articulation point.
Moreover, if there is only one subtree rooted at v, then removing v does not disconnect D, and hence v
is not an articulation point.
Now suppose v is not the root, and there is a child w of v in D for which low(w) ≥num(v). Then
the only path from w to an ancestor of v must pass through v, which implies v is an articulation point,
since its removal will disconnect w from the ancestor.
Finally, assume v is a non-root articulation point. First note that v cannot be a leaf of D, since the
removal of v would still leave D connected, and thus all other nodes remain interconnected via
D − {v}. Let D̂ denote the part of D that remains after v and its subtrees are removed from D.
Then it must be the case that there is at least one subtree rooted at v that has no back edges that connect
to D̂. If this were not the case, then v would not be an articulation point, since all of its descendants
would remain connected via D̂. Hence, letting w denote the root of this subtree, we
have low(w) = num(w) > num(v); and the proof is complete.
Corollary 2. Let G = (V, E) be an undirected and connected simple graph. Let D be a depth-first
spanning tree for G. Then e ∈ E is a bridge iff e = (v, w) is an edge of D and low(w) ≥ num(v).
Example 7. Using the graph from Example 6, show the depth-first spanning tree, along with the
low and num values of each vertex. Verify Theorem 3 for this example.
A directed acyclic graph (DAG) is simply a directed graph that has no cycles (i.e. paths of length
greater than zero that begin and end at the same vertex). DAGs have several practical applications.
One such example is to let T be a set of tasks that must be completed in order to complete a large
project; then one can form a graph G = (T, E), where (t1, t2) ∈ E iff t1 must be completed before t2 can
be started. Such a graph can be used to form a schedule for when each task should be completed, and hence
provide an estimate for when the project should be completed.
The following proposition suggests a way to efficiently check if a directed graph is a DAG.
Theorem 4. If directed graph G = (V, E) is a DAG, then it must have one vertex with out-degree equal
to zero.
Proof of Theorem 4. If DAG G did not have such a vertex, then one could construct a path
having arbitrary length. For example, if P is some path that ends at v ∈ V, then P can be extended by
adding to it a vertex w, where (v, w) ∈ E. We know that w exists, since deg+(v) > 0. Thus, in constructing
a path of length greater than |V|, it follows that at least one vertex in V must occur more than once.
Letting P′ denote the subpath that begins and ends at this vertex, we see that P′ is a cycle, which
contradicts the fact that G is a DAG.
While Q is nonempty:
Remove node v from the front of Q.
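The pseudocode above is only a fragment. One standard way of carrying out this check and producing a topological sort is Kahn's algorithm, sketched here in Python; this is a common realization under assumed names, not necessarily the exact procedure the source goes on to present.

from collections import deque

def topological_sort(vertices, edges):
    indeg = {v: 0 for v in vertices}
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    Q = deque(v for v in vertices if indeg[v] == 0)   # vertices with no incoming edges
    order = []
    while Q:
        v = Q.popleft()                               # remove node v from the front of Q
        order.append(v)
        for w in adj[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                Q.append(w)
    # If some vertex was never removed, the graph contains a cycle and is not a DAG.
    return order if len(order) == len(vertices) else None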
use the above algorithm to verify that G is a DAG, and provide a topological sort for V .
GREEDY TECHNIQUE
The change-making problem is a good example of the greedy concept: give change
for a specific amount n with the least number of coins of the denominations d1 > d2 > … > dm. For
example, with d1 = 25, d2 = 10, d3 = 5, d4 = 1, to give change for 48 cents the greedy choice is to give
1 coin of d1, then 2 coins of d2 and finally 3 coins of d4, which yields an optimal solution with the fewest
coins (a small code sketch of this choice follows the list below). The greedy technique is applicable
to optimization problems. It suggests constructing a solution through a sequence of steps, each
expanding a partially constructed solution obtained so far, until a complete solution to the problem
is reached. On each step the choice made must be –
Feasible - i.e., it has to satisfy the problem constraints.
Locally optimal - i.e., it has to be the best local choice among all feasible
choices available on that step.
Irrevocable – i.e., once made, it cannot be changed on subsequent steps of the
algorithm.
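A small Python sketch of the greedy change-making choice on the denominations of the example above; this greedy rule happens to be optimal for this coin system, though, as noted, a greedy choice is not optimal for every problem.

def greedy_change(amount, denominations):
    # denominations must be listed in decreasing order: d1 > d2 > ... > dm
    coins = {}
    for d in denominations:
        coins[d], amount = divmod(amount, d)   # take as many coins of value d as fit
    return coins

print(greedy_change(48, [25, 10, 5, 1]))
# {25: 1, 10: 2, 5: 0, 1: 3} -> one 25, two 10s and three 1s, six coins in all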
PRIM’S ALGORITHM
A spanning tree of connected graph is its connected acyclic sub graph that contains all the
vertices of the graph. A minimum spanning tree of a weighted connected graph is its spanning tree
of the smallest weight, where the weight of the tree is defined as the sum of the weight on all its
edges. The minimum spanning tree problem is the problem of finding a minimum spanning tree for
a given weighted connected graph.
Two serious obstacles are: first, the number of spanning tree grows exponentially with the
graph size. Second, generating all spanning trees for the given graph is not easy; in fact, it’s more
difficult than finding a minimum spanning for a weighted graph by using one of several efficient
algorithms available for this problem.
Prim’s algorithm constructs a minimum spanning tree thru a sequence of expanding sub
trees. The initial sub tree in such a sequence consists of a single vertex selected arbitrarily from the
set V of the graph’s vertices. On each iteration, we expand the current tree in the greedy manner by
simply attaching to it the nearest vertex not in that tree. The algorithm stops after all the graph's
vertices have been included in the tree being constructed. The total number of iterations is n − 1,
where n = |V| is the number of vertices.
Algorithm Prim(G)
// Input: A weighted connected graph G = {V, E}
// Output: ET, the set of edges composing a minimum spanning tree of G.
VT <- {v0}        // the tree can be started from an arbitrary vertex
ET <- ø
for i <- 1 to |V| - 1 do
    find a minimum-weight edge e* = (v*, u*) among all the edges (v, u)
    such that v is in VT and u is in V - VT
    VT <- VT U {u*}
    ET <- ET U {e*}
return ET
The algorithm needs to provide each vertex not in the current tree with information
about the shortest edge connecting it to a tree vertex. Vertices that are not adjacent to any tree
vertex are labeled "∞" (infinity). The vertices not in the tree are divided into two sets: the fringe and the unseen. The
fringe contains only the vertices that are not in the tree but are adjacent to at least one tree vertex;
the next vertex to be added is selected from the fringe. The unseen vertices are all the other vertices of the graph
(they are yet to be seen). Ties can be broken arbitrarily.
After we identify a vertex u* to be added to the tree, we need to perform 2 operations:
Move u* from the set V-VT to the set of tree vertices VT.
For each remaining vertex u in V−VT that is connected to u* by a shorter edge than
u's current distance label, update its labels by u* and the weight of the edge between u* and u,
respectively.
Prim’s algorithm yields always a Minimum Spanning Tree. The proof is by induction.
Since T0 consists of a single vertex and hence must be a part of any Minimum Spanning Tree.
Assume that Ti-1 is part of some Minimum Spanning Tree. We need to prove that Ti generated from
Ti−1 is also a part of a Minimum Spanning Tree, by contradiction. Let us assume that no Minimum
Spanning Tree of the graph can contain Ti. Let ei = (v, u) be the minimum-weight edge from a vertex
in Ti−1 to a vertex not in Ti−1 used by Prim's algorithm to expand Ti−1 to Ti. By our assumption, ei
cannot belong to any Minimum Spanning Tree, including the tree T that contains Ti−1. Therefore, if we
add ei to T, a cycle must be formed.
In addition to edge ei = (v, u), this cycle must contain another edge (v', u') connecting
a vertex v' Є Ti−1 to a vertex u' which is not in Ti−1. If we now delete the edge (v', u') from this
cycle, we obtain another spanning tree of the entire graph whose weight is less than or equal to the
weight of T, since the weight of ei is less than or equal to the weight of (v', u'). Hence this new tree is a
Minimum Spanning Tree that contains Ti, which contradicts the assumption that no Minimum Spanning Tree contains Ti.
The efficiency of this algorithm depends on the data structures chosen for the graph itself and for the
priority queue of the set V − VT, whose vertex priorities are the distances to the nearest tree vertices.
For example, if the graph is represented by its weight matrix and the priority queue is implemented as an
unordered array, the algorithm's running time will be in θ(|V|²).
If a graph is represented by its adjacency linked lists and the priority queue is
implemented as a min-heap, the running time of the algorithm is in O(|E| log |V|). This is because
the algorithm performs |V| − 1 deletions of the smallest element and makes |E| verifications and,
possibly, changes of an element's priority in a min-heap of size not greater than |V|. Each of these
operations is an O(log |V|) operation. Hence, the running time is in
(|V| − 1 + |E|) O(log |V|) = O(|E| log |V|), because, in a connected graph, |V| − 1 ≤ |E|.
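A Python sketch of Prim's algorithm with adjacency lists and a binary min-heap, matching the O(|E| log |V|) implementation discussed above. The graph representation is assumed, and stale heap entries are simply skipped instead of having their priority decreased, which is the usual idiom with Python's heapq.

import heapq

def prim_mst(adj, start):
    # adj: dict mapping each vertex to a list of (neighbor, weight) pairs
    in_tree = {start}
    ET = []                                   # edges of the minimum spanning tree
    heap = [(w, start, u) for u, w in adj[start]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(adj):
        w, v, u = heapq.heappop(heap)         # smallest-weight fringe edge (v, u)
        if u in in_tree:
            continue                          # stale entry: u was attached earlier
        in_tree.add(u)
        ET.append((v, u, w))
        for x, wx in adj[u]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, u, x))
    return ET

Every edge is pushed on the heap at most twice, so the running time stays within the O(|E| log |V|) bound derived above.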
KRUSKAL’S ALGORITHM
The algorithm begins by sorting the graph's edges in increasing order of their weights.
Then, starting with the empty subgraph, it scans this sorted list, adding the next edge on the list to
the current subgraph if such an inclusion does not create a cycle and simply skipping the edge
otherwise.
Algorithm Kruskal(G)
// Input: A weighted connected graph G = <V, E>
// Output: ET, the set of edges composing a Minimum Spanning Tree of G
Sort E in increasing order of the edge weights w(e1) ≤ … ≤ w(e|E|)
ET <- ø ; ecounter <- 0     // initialize the set of tree edges and its size
k <- 0                      // the number of processed edges
while ecounter < |V| - 1 do
    k <- k + 1
    if ET U {ek} is acyclic: ET <- ET U {ek}; ecounter <- ecounter + 1
return ET
Applying Kruskal's algorithm is not actually simpler than Prim's, because on each iteration it has to check
whether the newly added edge forms a cycle. Each connected component of the subgraph generated is a tree,
because it has no cycles.
There are efficient algorithms for performing these checks, called union-find algorithms.
With them, the running time is dominated by the time needed for sorting the edge weights of the given graph,
which is O(|E| log |E|).
Kruskal's algorithm requires a dynamic partition of some n-element set S into a collection
of disjoint subsets S1, S2, S3, …, Sk. After being initialized as a collection of n one-element subsets,
each containing a different element of S, the collection is subjected to a sequence of intermixed union
and find operations. Here we deal with an abstract data type of a collection of disjoint subsets of a
finite set with the following operations:
makeset(x): Creates a one-element set {x}. It is assumed that this operation can be applied to each
of the elements of set S only once.
find(x): Returns a subset containing x.
Union(x,y): Constructs the union of the disjoint subsets Sx & Sy containing x & y
respectively and adds it to the collection to replace Sx & Sy, which are deleted from it.
For example, let S = {1, 2, 3, 4, 5, 6}. Then makeset(i) creates the set {i}, and applying this operation
six times creates the single-element sets
{1}, {2}, {3}, {4}, {5}, {6}.
Performing union (1,4) & union(5,2) yields
{1,4},{5,2},{3},{6}
The implementation uses one element from each of the disjoint subsets in a collection as that subset's
representative. There are two principal alternatives for implementing this data structure: the first one, called
quick find, optimizes the time efficiency of the find operation; the second one, called quick union, optimizes
the union operation.
(Figure: linked-list representation of the quick-find structure, here for the subsets {1, 4, 5, 2} and {3, 6};
each list header stores the subset's size and pointers to the first and last elements of the list.)
Subset representatives:
Element index    Representative's index
1                1
2                1
3                3
4                1
5                1
6                3
12
CS3401/ ALGORITHMS
For makeset(x) the time is in θ(1), and for n elements it will be in θ(n). The efficiency of find(x) is
also in θ(1); union(x, y) takes the longest of all the operations, since it involves updating representatives
and deleting a list.
An element ai has its representative updated only when it is in the smaller of the two subsets being
united; hence, after Ai updates, the set containing ai has at least 2^Ai elements. Since the entire set has
n elements, 2^Ai ≤ n and hence Ai ≤ log2 n. Therefore, the total number of possible updates of the
representatives for all n elements of S will not exceed n log2 n.
Thus for union by size, the time efficiency of a sequence of at most n-1 unions and m finds
is in O(n log n+m).
Quick union represents each subset by a rooted tree. The nodes of the tree contain the subset
elements, with the root's element considered the subset's representative; the tree's edges are directed
from children to their parents.
Makeset(x) requires θ (1) time and for ‘n’ elements it is θ (n), a union (x,y) is θ (1) and
find(x) is in θ (n) because a tree representing a subset can degenerate into linked list with n nodes.
The union operation is to attach a smaller tree to the root of a larger one. The tree size can
be measured either by the number of nodes or by its height(union-by-rank). To execute each find it
takes O(log n) time. Thus, for quick union , the time efficiency of at most n-1 unions and m finds is
in O(n+ m log n).
An even better efficiency can be obtained by combining either variety of quick union with path
compression. This modification makes every node encountered during the execution of a find
operation point to the tree's root.
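A Python sketch that combines Kruskal's algorithm with the union-find structure just described, using union by size and path compression; vertex labels and function names are assumptions made for the sketch.

def kruskal_mst(n, edges):
    # n: number of vertices labelled 0..n-1; edges: list of (weight, u, v)
    parent = list(range(n))                 # every vertex starts as its own root
    size = [1] * n

    def find(x):                            # follow parent links, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):                        # union by size; False means same subset (cycle)
        rx, ry = find(x), find(y)
        if rx == ry:
            return False
        if size[rx] < size[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        size[rx] += size[ry]
        return True

    ET = []
    for w, u, v in sorted(edges):           # scan edges in increasing order of weight
        if union(u, v):                     # adding (u, v) does not create a cycle
            ET.append((u, v, w))
            if len(ET) == n - 1:
                break
    return ET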
BELLMAN-FORD ALGORITHM
Given a graph G = (V, E), a weight function w: E -> R, and a source node s, find the shortest path
from s to v for every v in V.
Example:
Correctness
Fact 1: The distance estimate d[v] never underestimates the actual shortest path distance from
s to v.
Fact 2: If there is a shortest path from s to v containing at most i edges, then after iteration i
of the outer for loop:
d[v] <= the actual shortest path distance from s to v.
Theorem: Suppose that G is a weighted graph without negative weight cycles and let s denote the
source node. Then Bellman- Ford correctly calculates the shortest path distances from s.
Proof: Every shortest path has at most |V| - 1 edges. By Fact 1 and 2, the distance estimate d[v] is equal
to the shortest path length after |V|-1 iterations.
Variations
One can stop the algorithm if an iteration does not modify any distance estimate. This is beneficial
if shortest paths are likely to use fewer than |V| − 1 edges.
One can detect negative weight cycles by checking whether distance estimates can be reduced
after |V|-1 iterations.
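The Bellman-Ford pseudocode itself is not reproduced above, so here is a Python sketch of the relaxation loop that Facts 1 and 2 refer to, including the two variations just mentioned (early stopping and negative-cycle detection); the edge-list representation is an assumption.

def bellman_ford(vertices, edges, s):
    # edges: list of (u, v, w) for a directed edge u -> v of weight w
    INF = float("inf")
    d = {v: INF for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):        # at most |V| - 1 relaxation rounds
        changed = False
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                changed = True
        if not changed:                       # variation: stop if nothing changed
            break
    for u, v, w in edges:                     # variation: detect negative-weight cycles
        if d[u] + w < d[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return d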
DIJKSTRA’S ALGORITHM
The single-source shortest-paths problem: for a given vertex called the source in a weighted
connected graph, find shortest paths to all its other vertices. It asks for a set of paths, each
leading from the source to a different vertex in the graph, though some paths may, of course, have
edges in common.
This algorithm is applicable to graphs with nonnegative weights only. It finds shortest paths
to a graph's vertices in order of their distance from a given source. First, it finds the shortest path
from the source to a vertex nearest to it, then to a second nearest, and so on. In general, before its
ith iteration commences, the algorithm has already identified the shortest paths to i − 1 other vertices
nearest to the source. These vertices, the source, and the edges of the shortest paths form a subtree Ti,
and the next vertex chosen should be among the vertices adjacent to the vertices of Ti, the fringe
vertices. To identify the ith nearest vertex, the algorithm computes, for every fringe vertex u, the sum
of the distance to the nearest tree vertex v (the weight of the edge (v, u)) and the length dv of the
shortest path from the source to v, and then selects the vertex with the smallest such sum. Finding the
next nearest vertex u* becomes the simple task of finding a fringe vertex with the smallest d value.
Ties can be broken arbitrarily.
After we have identified a vertex u* to be added to the tree, we need to perform two operations:
Move u* from the fringe to the set of the tree vertices.
For each remaining fringe vertex u that is connected to u* by an edge of weight w(u*,u)
s.t. du* + w(u*,u) < du, update the labels of u by u* and du* + w(u*,u), respectively.
Algorithm Dijkstra(G, s)
// Input: A weighted connected graph G = <V, E> and its source vertex s
// Output: The length dv of a shortest path from s to v and its penultimate vertex pv
// for every vertex v in V.
Initialize(Q)        // initialize vertex priority queue to empty
for every vertex v in V do
dv <- ∞; pv <-null
Insert(Q,v, dv) //initialize vertex priority in the priority queue
ds<- 0;
decrease(Q,s, ds) //update priority of s with ds
VT <- ø
for i<-0 to |V|-1 do
u* <- deleteMin(Q) //delete the minimum priority
element VT <- VT U {u*}
for every vertex u in V-VT that is adjacent to u* do
if du* + w(u*,u) < du
du <- du* + w(u*,u);
pu <- u*
Decrease(Q,u, du)
The time efficiency depends on the data structure used for implementing the priority queue and for
representing the input graph. It is in θ(|V|²) for graphs represented by their weight matrix and the priority
queue implemented as an unordered array. For graphs represented by their adjacency linked lists and
the priority queue implemented as a min_heap it is in O(|E|log|V|).
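A Python sketch of Dijkstra's algorithm with a binary min-heap, corresponding to the O(|E| log |V|) bound above; as in the Prim sketch earlier, outdated heap entries are skipped rather than having their priorities decreased in place.

import heapq

def dijkstra(adj, s):
    # adj: dict mapping each vertex to a list of (neighbor, weight) pairs; weights >= 0
    d = {v: float("inf") for v in adj}
    p = {v: None for v in adj}                # penultimate vertex on the shortest path
    d[s] = 0
    heap = [(0, s)]
    done = set()
    while heap:
        du, u = heapq.heappop(heap)
        if u in done:
            continue                          # stale heap entry
        done.add(u)
        for v, w in adj[u]:
            if du + w < d[v]:                 # relax edge (u, v)
                d[v] = du + w
                p[v] = u
                heapq.heappush(heap, (d[v], v))
    return d, p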
Given a weighted digraph (G, c), determine for each pair of nodes u, v ∈ V (G) (the length of) a minimum
weight path from u to v.
Time complexity Θ(n·A(n, m)) of computing the matrix D by finding the single-source shortest paths (SSSP)
from each node as the source in turn, where A(n, m) is the cost of one SSSP computation.
The APSP complexity is Θ(n³) for the adjacency-matrix version of Dijkstra's SSSP algorithm: A(n, m) = n².
The APSP complexity is Θ(n²m) for the Bellman-Ford SSSP algorithm: A(n, m) = nm.
Floyd's algorithm –
One of the simplest known algorithms for computing the distance matrix (three nested for-loops; Θ(n³) time
complexity):
1. Start with the weight matrix of the graph as the initial distance matrix.
2. At each step k, maintain the matrix of shortest distances from node i to node j, not passing through
nodes higher than k.
3. Update the matrix at each step to see whether node k shortens the current best distance.
Better than the Dijkstra’s algorithm for dense graphs, probably not for sparse ones.
Based on Warshall’s algorithm (just tells whether there is a path from node i to node j, not concerned
with length).
Floyd’s Algorithm
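A Python sketch of Floyd's algorithm as described above: three nested loops over k, i and j, replacing the current best distance whenever passing through node k is shorter. The input is assumed to be an n × n weight matrix with float('inf') marking missing edges and 0 on the diagonal.

def floyd(W):
    n = len(W)
    D = [row[:] for row in W]                 # D(0) is the weight matrix itself
    for k in range(n):                        # allow node k as an intermediate node
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

D[i][j] then holds the length of a shortest path from node i to node j, and the running time is Θ(n³), as stated above.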
Bipartite Graphs
Bipartite graph: a graph whose vertices can be partitioned into two disjoint sets V and U, not
necessarily of the same size, so that every edge connects a vertex in V to a vertex in U.
A graph is bipartite if and only if it does not have a cycle of an odd length
A bipartite graph is 2-colorable: the vertices can be colored in two colors so that every edge has
its vertices colored differently
Matching in a Graph
A matching in a graph is a subset of its edges with the property that no two edges share a vertex.
A maximum (or maximum cardinality) matching is a matching with the largest number of edges
1) always exists
2) not always unique
Free Vertices and Maximum Matching
(Figure: a matching M in this graph.)
For a given matching M, a vertex is called free (or unmatched) if it is not an endpoint of any
edge in M; otherwise, a vertex is said to be matched
If every vertex is matched, then M is a maximum matching
If there are unmatched or free vertices, then M may be able to be improved
We can immediately increase a matching by adding an edge connecting two free vertices
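The surviving text stops short of an algorithm for maximum bipartite matching, so here is a Python sketch of the standard augmenting-path (Kuhn's) approach, which repeatedly improves the matching exactly in the spirit of the remarks above; the representation and names are assumptions.

def max_bipartite_matching(adj_V, U):
    # adj_V: dict mapping each vertex of V to its list of neighbors in U
    match_U = {u: None for u in U}            # partner in V of each U-vertex, or None if free

    def try_augment(v, visited):
        for u in adj_V[v]:
            if u in visited:
                continue
            visited.add(u)
            # u is free, or its current partner can be re-matched elsewhere
            if match_U[u] is None or try_augment(match_U[u], visited):
                match_U[u] = v
                return True
        return False

    matching_size = 0
    for v in adj_V:                           # try to augment from every V-vertex
        if try_augment(v, set()):
            matching_size += 1
    return matching_size, match_U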
UNIT III ALGORITHM DESIGN TECHNIQUES
Divide and Conquer methodology: Finding maximum and minimum - Merge sort - Quick sort , Dynamic
programming: Elements of dynamic programming — Matrix-chain multiplication - Multi stage graph — Optimal
Binary Search Trees. Greedy Technique: Elements of the greedy strategy - Activity-selection problem –- Optimal
Merge pattern — Huffman Trees.
Divide and Conquer is a well-known design technique. It works according to the following plan:
a) A problem's instance is divided into several smaller instances of the same problem, ideally of about
equal size.
b) The smaller instances are solved recursively.
c) The solutions of the smaller instances are combined to get a solution of the original problem.
As an example, let us consider the problem of computing the sum of n numbers a0, a1, …, an−1.
If n > 1, we can divide the problem into two instances: compute the sum of the first n/2 numbers and the
sum of the remaining n/2 numbers, recursively. Once each sub-sum is obtained, add the two values to get
the final solution. If n = 1, then return a0 as the solution.
i.e., a0 + a1 + … + an−1 = (a0 + … + an/2−1) + (an/2 + … + an−1)
This is not an efficient way to solve this particular problem; the brute-force algorithm works just as well
here. Hence, not all problems are best solved by divide-and-conquer. It is best suited for parallel
computations, in which each sub-problem can be solved simultaneously by its own processor.
Analysis:
In general, for any problem, an instance of size n can be divided into several instances of size
n/b with a of them needing to be solved. Here, a & b are constants; a≥1 and b > 1. Assuming that size n
is a power of b; to simplify it, the recurrence relation for the running time T(n) is:
T(n) = a T (n / b) + f (n)
Where f (n) is a function, which is the time spent on dividing the problem into smaller ones and
on combining their solutions. This is called the general divide-and- conquer recurrence. The order of
growth of its solution T(n) depends on the values of the constants a and b and the order of growth of
the function f (n).
Theorem:
If f(n) Є θ(n^d) where d ≥ 0 in the above recurrence equation, then
T(n) Є  θ(n^d)            if a < b^d
        θ(n^d log n)      if a = b^d
        θ(n^(log_b a))    if a > b^d
For example, the recurrence equation for the number of additions A(n) made by divide-and-
conquer on inputs of size n = 2^k is:
A(n) = 2 A(n/2) + 1
Thus, for this example, a = 2, b = 2, and d = 0; hence, since a > b^d,
A(n) Є θ(n^(log_b a)) = θ(n^(log_2 2)) = θ(n).
MERGE SORT
The merging of two sorted arrays can be done as follows. Two pointers (array indices) are initialized
to point to the first elements of the arrays being merged. The elements pointed to are compared, and the
smaller of them is added to a new array being constructed; after that, the index of the smaller element is
incremented to point to its immediate successor in the array it was copied from. This operation is repeated
until one of the two given arrays is exhausted, and then the remaining elements of the other array are
copied to the end of the new array.
Algorithm Merge(B[0…p−1], C[0…q−1], A[0…p+q−1])
// Merges two sorted arrays into one sorted array.
// Input: Arrays B[0..p−1] and C[0..q−1], both sorted
// Output: Sorted array A[0…p+q−1] of the elements of B and C
i = 0; j = 0; k = 0
while i < p and j < q do
    if B[i] ≤ C[j]
        A[k] = B[i]; i = i + 1
    else
        A[k] = C[j]; j = j + 1
    k = k + 1
if i = p
    copy C[j..q−1] to A[k..p+q−1]
else
    copy B[i..p−1] to A[k..p+q−1]
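A Python sketch of merge sort built on the Merge procedure above; the merge here returns a new list rather than filling a third array in place, but it performs the comparisons exactly as described.

def merge(B, C):
    A, i, j = [], 0, 0
    while i < len(B) and j < len(C):
        if B[i] <= C[j]:
            A.append(B[i]); i += 1
        else:
            A.append(C[j]); j += 1
    A.extend(B[i:] if j == len(C) else C[j:])   # copy the rest of the unfinished array
    return A

def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

print(merge_sort([8, 3, 2, 9, 7, 1, 5, 4]))     # [1, 2, 3, 4, 5, 7, 8, 9]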
Analysis:
Assuming for simplicity that n is a power of 2, the recurrence relation for the number of key
comparisons C(n) is
At each step, exactly one comparison is made, after which the total number of elements in the
two arrays still needed to be processed is reduced by one element. In the worst case, neither of the two
arrays becomes empty before the other one contains just one element. Therefore, for the worst case,
Cmerge (n) = n-1 and the recurrence is:
Cworst(n) = 2Cworst(n/2) + n − 1 for n > 1, Cworst(1) = 0.
When n is a power of 2, n = 2^k, successive substitution of a recurrence of the form C(n) = 2C(n/2) + cn gives
C(n) = 2C(n/2) + cn
     = 2(2C(n/4) + cn/2) + cn = 4C(n/4) + 2cn
     = 4(2C(n/8) + cn/4) + 2cn = 8C(n/8) + 3cn
     :
     = 2^k C(1) + k·cn
     = an + cn·log2 n,   where a = C(1).
Since n = 2^k, we get k = log2 n (because log2 n = k·log2 2 = k·1 = k), so C(n) Є θ(n log2 n). It is easy to
see that the same order of growth holds whenever 2^k ≤ n ≤ 2^(k+1); in fact, the exact solution of the
worst-case recurrence is Cworst(n) = n·log2 n − n + 1.
1. It uses 2n locations. The additional n locations can be eliminated by introducing a link field,
LINK(1:n), whose entries are indices in [0:n]; these act as pointers to elements of A, and a zero entry
marks the end of a list. For the two linked sub-lists Q and R of the example, we conclude that
A(2) < A(4) < A(1) < A(6) and A(5) < A(3) < A(7) < A(8).
2. The stack space used for recursion. The maximum depth of the stack is proportional to log2n. This is
developed in top-down manner. The need for stack space can be eliminated if we develop algorithm in
Bottom-up approach.
It can be done as an in-place algorithm, which is more complicated and has a larger multiplicative
constant.
QUICK SORT
Quick sort is another sorting algorithm that is based on the divide-and-conquer strategy. Quick sort
divides its input elements according to their values. It rearranges the elements of a given array A[0…n−1]
to achieve its partition, a situation where all the elements before some position s are smaller than or equal
to A[s] and all the elements after s are greater than or equal to A[s].
After this partition, A[s] will be in its final position, and sorting proceeds recursively for the two
sub-arrays.
The partition of A[0..n−1] and of its sub-arrays A[l..r] (0 ≤ l < r ≤ n − 1) can be achieved by the following
algorithm. First, select an element with respect to whose value we are going to divide the sub-array,
called the pivot. The pivot by default is considered to be the first element in the list, i.e.
P = A (l)
The method which we use to rearrange is as follows which is an efficient method based on two
scans of the sub array ; one is left to right and the other right to left comparing each element with the
pivot. The left-to-right scan starts with the second element. Since we need elements smaller than the pivot to
be in the first part of the sub array, this scan skips over elements that are smaller than the pivot and
stops on encountering the first element greater than or equal to the pivot. The right-to-left scan starts with the last
element of the sub array. Since we want elements larger than the pivot to be in the second part of the
sub array, this scan skips over elements that are larger than the pivot and stops on encountering the
first smaller element than the pivot.
Three situations may arise, depending on whether or not the scanning indices have crossed. If the
scanning indices i and j have not crossed, i.e. i < j, we exchange A[i] and A[j] and resume the scans by
incrementing i and decrementing j, respectively.
If the scanning indices have crossed over, i.e. i>j, we have partitioned the array after
exchanging the pivot with A [j].
Finally, if the scanning indices stop while pointing to the same elements, i.e. i=j, the value they
are pointing to must be equal to p. Thus, the array is partitioned. The cases where, i>j and i=j can be
combined to have i ≥ j and do the exchanging with the pivot.
until i ≥ j
swap(A[i], A[j])     // undo the last swap when i ≥ j
swap(A[l], A[j])
return j
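A Python sketch of quick sort with the two-scan partition described above, using the first element of each sub-array as the pivot; index handling follows the stop-and-swap rules just stated.

def partition(A, l, r):
    p = A[l]                           # the pivot is the first element of the sub-array
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and A[i] < p:     # left-to-right scan skips elements smaller than p
            i += 1
        j -= 1
        while A[j] > p:                # right-to-left scan skips elements larger than p
            j -= 1
        if i >= j:
            break
        A[i], A[j] = A[j], A[i]
    A[l], A[j] = A[j], A[l]            # place the pivot in its final position
    return j

def quick_sort(A, l=0, r=None):
    if r is None:
        r = len(A) - 1
    if l < r:
        s = partition(A, l, r)
        quick_sort(A, l, s - 1)
        quick_sort(A, s + 1, r)

a = [5, 3, 1, 9, 8, 2, 4, 7]
quick_sort(a)
print(a)                               # [1, 2, 3, 4, 5, 7, 8, 9]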
Analysis:
The efficiency is based on the number of key comparisons. If all the splits happen in the middle
of the sub-arrays, we will have the best case. The number of key comparisons then satisfies the recurrence
Cbest(n) = 2Cbest(n/2) + n for n > 1, Cbest(1) = 0.
According to the theorem above, Cbest(n) Є θ(n log2 n); solving it exactly for n = 2^k yields
Cbest(n) = n log2 n.
In the worst case, all the splits will be skewed to the extreme : one of the two sub arrays
will be empty while the size of the other will be just one less than the size of a subarray being
partitioned. It happens, in particular, for increasing arrays, i.e., for inputs that are already sorted. If A [0…n-
1] is a strictly increasing array and we use A [0] as the pivot, the L→ R scan will stop on A[1]
while the R
→ L scan will go all the way to reach A[0], indicating the split at position 0:
So, after making n+1 comparisons to get to this partition and exchanging the pivot A [0]
with itself, the algorithm will find itself with the strictly increasing array A[1..n-1] to sort. This
sorting of increasing arrays of diminishing sizes will continue until the last one A[n- 2..n-1] has
been processed. The total number of key comparisons made will be equal to
Cworst(n) = (n + 1) + n + … + 3 = (n + 1)(n + 2)/2 − 3 Є θ(n²).
Finally, the average case efficiency, let Cavg(n) be the average number of key
comparisons made by quick sort on a randomly ordered array of size n. Assuming that the
partition split can happen in each position s (0 ≤ s ≤ n − 1) with the same probability 1/n, we get
the following recurrence relation:
Cavg(n) = (1/n) Σ (s = 0 to n−1) [ (n + 1) + Cavg(s) + Cavg(n − 1 − s) ] for n > 1,
Cavg(0) = 0, Cavg(1) = 0.
Its solution turns out to be Cavg(n) ≈ 2n ln n ≈ 1.38 n log2 n.
Thus, on the average, quick sort makes only 38% more comparisons than in the best case.
To refine this algorithm : efforts were taken for better pivot selection methods (such as the median
– of – three partitioning that uses as a pivot the median of the left most, right most and the middle
element of the array) ; switching to a simpler sort on smaller sub files ; and recursion elimination
(so called non recursive quick sort). These improvements can cut the running time of the
algorithm by 20% to 25%
Partitioning can be useful in applications other than sorting, which is used in selection
problem also.
DYNAMIC PROGRAMMING
The bottom-up approach solves all of the smaller sub-problems. The top-down approach avoids solving
the unnecessary sub-problems; combined with a table of stored results, this is called the memory-function
(memoization) technique.
Optimal Substructure
Overlapping Sub-problems
Variant: Memoization
Sub-problems must be optimal, otherwise the optimal splitting would not have been optimal.
There is usually a suitable "space" of sub-problems. Some spaces are more "natural" than others.
For matrix-chain multiply we chose sub-problems as sub chains. We could have chosen all
arbitrary products, but that would have been much larger than necessary! DP based on that would
have to solve too many sub-problems.
Recursive Approach
The idea is very simple, placing parentheses at every possible place. For example, for a matrix chain M1
M2M3M4 we can have place parentheses like -
(M1) (M2M3M4)
(M1M2) (M3M4)
(M1M2M3) (M4)
So for a matrix chain of length n, M1M2...Mn we can place parentheses in n−1 ways, notice that
placing parentheses will not solve the problem rather it will split the problem into smaller sub-
problems which are later used to find the answer for the main problem. Like in (M1) (M2M3M4) the
problem is divided into (M1) and (M2M3M4) which can be solved again by placing parentheses and
splitting the problem into smaller sub-problems until size becomes less than or equal to 2.
So the minimum number of steps required to multiply a matrix chain of length n, M1M2...Mn, is the
minimum over all n−1 top-level splits of the combined cost of the resulting sub-problems.
To implement it we can use a recursive function in which we will place parentheses at every possible
place, then in each step divide the main problem into two sub-problems n left part and right part.
Later, the result obtained from the left and right parts can be used to find the answer of the main
problem.
Pseudocode of Recursive Approach
// mat = Matrix chain of length n
// low = 1, high = n-1 initially
MatrixChainMultiplication(mat[], low , high):
// 0 steps are required if low equals high
// case of only 1 sized sub-problem.
If(low=high):
return 0
it means we are left with a sub-problem of size 1.i.e. a single matrix Mi so we will
return 0.
Then, we will iterate from low to high and place parentheses for each i such
that low≤i<high.
During each iteration we will also calculate the cost incurred if we place the split at that
particular position i, i.e. cost of left sub-problem + cost of right sub-problem
+ mat[low-1]×mat[i]×mat[high]
At last we will return the minimum of the costs found during the iteration.
Memoization: What if we stored sub-problems and used the stored solutions in a recursive algorithm? This
is like divide-and-conquer, top down, but should benefit like DP which is bottom-up. Memoized version
maintains an entry in a table. One can use a fixed table or a hash table.
The algorithm is almost the same as the naive recursive solution with the only change that here we will use
a dp array of dimension n×n.
Steps -
Declare a global dp array of dimension n×n and initialize all the cells with value −1.
dp[i][j] holds the answer for the sub-problem i..j and whenever we will need to calculate i..j we would
simply return the stored answer instead of calculating it again.
Initialize variables low with 1, and high with n−1 and call the function to find the answer to this
problem.
If low=high then return 0 because we don't need any step to solve sub-problem with a single matrix.
If dp[low][high]≠−1 it means the answer for this sub-problem has already been calculated. So we will
simply return dp[low][high] instead of solving it again.
In the other case, for each i=low to i=high−1, we will find the minimum number of steps in solving the
left sub-problem, right sub-problem, and steps required in multiplying the result obtained from both
sides. Store the result obtained in dp[low][high] and return it.
At last, low = 0 and high = n−1, i.e. dp[0][n−1], will be returned.
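A Python sketch of the memoized matrix-chain procedure just outlined; mat holds the matrix dimensions (matrix Mi is mat[i−1] × mat[i]), dp stores −1 for sub-problems not yet solved, and the sample chain is illustrative.

def matrix_chain_order(mat):
    n = len(mat) - 1                                # number of matrices in the chain
    dp = [[-1] * (n + 1) for _ in range(n + 1)]     # -1 means "not calculated yet"

    def solve(low, high):
        if low == high:
            return 0                                # a single matrix needs no multiplication
        if dp[low][high] != -1:
            return dp[low][high]                    # reuse the stored answer
        best = float("inf")
        for i in range(low, high):                  # place the outermost split after Mi
            cost = (solve(low, i) + solve(i + 1, high)
                    + mat[low - 1] * mat[i] * mat[high])
            best = min(best, cost)
        dp[low][high] = best
        return best

    return solve(1, n)

print(matrix_chain_order([10, 30, 5, 60]))          # 4500 for M1(10x30), M2(30x5), M3(5x60)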
MULTISTAGE GRAPH
Algorithm MULTI_STAGE(G, k, n, p)
// Input:
//   k: Number of stages in graph G = (V, E)
//   c[i, j]: Cost of edge (i, j)
cost[n] ← 0
for j ← n – 1 to 1 do
    // Let r be a vertex such that (j, r) ∈ E and c[j, r] + cost[r] is minimum
    cost[j] ← c[j, r] + cost[r]
    π[j] ← r
end
// Trace the minimum-cost path
p[1] ← 1
p[k] ← n
for j ← 2 to k - 1 do
    p[j] ← π[p[j - 1]]
end
Complexity Analysis of Multistage Graph
If graph G has |E| edges, then cost computation time would be O(n + |E|). The complexity of tracing
the minimum cost path would be O(k), k < n. Thus total time complexity of multistage graph using
dynamic programming would be O(n + |E|).
Example: Find the minimum path cost between vertex s and t for the following multistage graph using
dynamic programming.
Solution:
Solution to multistage graph using dynamic programming is constructed as,
Cost[j] = min{c[j, r] + cost[r]}
Here, number of stages k = 5, number of vertices n = 12, source s = 1 and target t = 12
Initialization:
Cost[n] = 0 ⇒ Cost[12] = 0.
p[1] = s ⇒ p[1] = 1
p[k] = t ⇒ p[5] = 12.
r = t = 12.
Stage 4:
Stage 3:
p[6] = 10
= min{3 + 2, 4 + 4} = min{5, 8} = 5
p[7] = 10
Stage 2:
= min{4 + 7, 2 + 5, 1 + 7} = min{11, 7, 8} = 7
p[2] = 7
p[3] = 6
p[4] = 8
Stage 1:
Cost[1] = min{ c[1, 2] + Cost[2], c[1, 3] + Cost[3], c[1, 4] + Cost[4], c[1, 5] + Cost[5]}
= min{ 9 + 7, 7 + 9, 3 + 18, 2 + 15 } = min{16, 16, 21, 17} = 16
p[1] = 2
p[2] = 7
p[7] = 10
p[10] = 12
Hence the minimum-cost path is 1 → 2 → 7 → 10 → 12, with total cost 16.
OPTIMAL BINARY SEARCH TREES
Optimal Binary Search Tree extends the concept of the Binary Search Tree. A Binary Search Tree (BST) is a
nonlinear data structure which is used in many scientific applications for reducing the search time. In BST,
left child is smaller than root and right child is greater than root. This arrangement simplifies the search
procedure.
Optimal Binary Search Tree (OBST) is very useful in dictionary search. The probability of searching is
different for different words. OBST has great application in translation. If we translate the book from
English to German, equivalent words are searched from English to German dictionary and replaced in
translation. Words are searched same as in binary search tree order.
Binary search tree simply arranges the words in lexicographical order. Words like ‘the’, ‘is’, ‘there’ are
very frequent words, whereas words like ‘xylophone’, ‘anthropology’ etc. appears rarely.
It is not a wise idea to keep less frequent words near root in binary search tree. Instead of storing words in
binary search tree in lexicographical order, we shall arrange them according to their probabilities. This
arrangement facilitates few searches for frequent words as they would be near the root. Such tree is called
Optimal Binary Search Tree.
Consider the sequence of n keys K = < k1, k2, k3, …, kn> of distinct probability in sorted order such that
k1< k2< … <kn. Words between each pair of key lead to unsuccessful search, so for n keys, binary search
tree contains n + 1 dummy keys di, representing unsuccessful searches.
Two different representation of BST with same five keys {k1, k2, k3, k4, k5} probability is shown in
following figure
With n nodes, there exist Ω(4^n / n^(3/2)) different binary search trees. An exhaustive search for an optimal
binary search tree therefore takes a huge amount of time.
The goal is to construct a tree which minimizes the total search cost. Such tree is called optimal binary
search tree. OBST does not claim minimum height. It is also not necessary that parent of sub tree has higher
priority than its child.
Dynamic programming can help us to find such an optimal tree.
Mathematical formulation
We formulate the OBST with following observations
Any sub tree in OBST contains keys in sorted order ki…kj, where 1 ≤ i ≤ j ≤ n.
Sub tree containing keys ki…kj has leaves with dummy keys di-1….dj.
Suppose kr is the root of sub tree containing keys ki…..kj. So, left sub tree of root kr contains keys
ki….kr-1 and right sub tree contain keys kr+1 to kj. Recursively, optimal sub trees are constructed from
the left and right sub trees of kr.
Let e[i, j] represents the expected cost of searching OBST. With n keys, our aim is to find and
minimize e[1, n].
Base case occurs when j = i – 1, because we just have the dummy key di-1 for this case. Expected
search cost for this case would be e[i, j] = e[i, i – 1] = qi-1.
For the case j ≥ i, we have to select any key kr from ki…kj as a root of the tree.
With kr as the root key of the sub tree containing keys ki…kj, the sum of probabilities of that sub tree is
defined as
w[i, j] = Σ (l = i to j) pl + Σ (l = i−1 to j) ql
and the expected search cost satisfies
e[i, j] = min over i ≤ r ≤ j of { e[i, r − 1] + e[r + 1, j] + w[i, j] }.
e[i, j] gives the expected cost of searching in the optimal binary search tree on keys ki…kj; e[1, n] is the
expected cost of the whole optimal binary search tree.
Algorithm OBST(p, q, n)
// e[1…n+1, 0…n ] : Optimal sub tree
// w[1…n+1, 0…n] : Sum of probability
// root[1…n, 1…n] : Used to construct OBST
for i ← 1 to n + 1 do
e[i, i – 1] ← qi – 1
w[i, i – 1] ← qi – 1
end
for m ← 1 to n do
for i ← 1 to n – m + 1 do
j←i+m–1
e[i, j] ← ∞
w[i, j] ← w[i, j – 1] + pj + qj
for r ← i to j do
t ← e[i, r – 1] + e[r + 1, j] + w[i, j]
if t < e[i, j] then
e[i, j] ← t
root[i, j] ← r
end
end
end
end
Example
Problem: Let p (1 : 3) = (0.5, 0.1, 0.05) q(0 : 3) = (0.15, 0.1, 0.05, 0.05) Compute and construct OBST for
above values using Dynamic approach.
Solution:
Here, p(1 : 3) = (0.5, 0.1, 0.05) and q(0 : 3) = (0.15, 0.1, 0.05, 0.05); the e, w and root tables are filled in
exactly as in the algorithm above.
Example
Considering the following tree, the cost is 2.80, though this is not an optimal result.
e[i, j]  (columns: i = 1 … 6; rows: j = 5 down to 0)
j\i    1      2      3      4      5      6
5    2.75   2.00   1.30   0.90   0.50   0.10
4    1.75   1.20   0.60   0.30   0.05
3    1.25   0.70   0.25   0.05
2    0.90   0.40   0.05
1    0.45   0.10
0    0.05

w[i, j]  (columns: i = 1 … 6; rows: j = 5 down to 0)
j\i    1      2      3      4      5      6
5    1.00   0.80   0.60   0.50   0.35   0.10
4    0.70   0.50   0.30   0.20   0.05
3    0.55   0.35   0.15   0.05
2    0.45   0.25   0.05
1    0.30   0.10
0    0.05

root[i, j]  (columns: i = 1 … 5; rows: j = 5 down to 1)
j\i    1    2    3    4    5
5      2    4    5    5    5
4      2    2    4    4
3      2    2    3
2      1    2
1      1
GREEDY ALGORITHMS
Greedy algorithms build a solution part by part, choosing the next part in such a way, that it gives an
immediate benefit. This approach never reconsiders the choices taken previously. This approach is mainly
used to solve optimization problems. Greedy method is easy to implement and quite efficient in most of
the cases. Hence, we can say that Greedy algorithm is an algorithmic paradigm based on heuristic that
follows local optimal choice at each step with the hope of finding global optimal solution.
In many problems, it does not produce an optimal solution though it gives an approximate (near optimal)
solution in a reasonable time.
Components of Greedy Algorithm
Greedy algorithms have the following five components −
A candidate set − A solution is created from this set.
A selection function − Used to choose the best candidate to be added to the solution.
A feasibility function − Used to determine whether a candidate can be used to contribute to the
solution.
An objective function − Used to assign a value to a solution or a partial solution.
A solution function − Used to indicate whether a complete solution has been reached.
Areas of Application
Greedy approach is used to solve many problems, such as
Finding the shortest path between two vertices using Dijkstra’s algorithm.
Finding the minimal spanning tree in a graph using Prim’s /Kruskal’s algorithm, etc.
Where Greedy Approach Fails
In many problems, a greedy algorithm fails to find an optimal solution; moreover, it may even produce the
worst possible solution. Problems like Travelling Salesman and 0/1 Knapsack cannot be solved optimally
using this approach.
An Activity Selection Problem
The activity selection problem is a mathematical optimization problem. Our first illustration is the problem
of scheduling a resource among several competing activities. We find that a greedy algorithm provides a
well-designed and simple method for selecting a maximum-size set of mutually compatible activities.
Suppose S = {1, 2, ..., n} is the set of n proposed activities. The activities share a resource which can be
used by only one activity at a time, e.g., a Tennis Court, a Lecture Hall, etc. Each activity i has a start time
si and a finish time fi, where si ≤ fi. If selected, activity i takes place during the half-open time interval
[si, fi). Activities i and j are compatible if the intervals [si, fi) and [sj, fj) do not overlap (i.e. i and j are
compatible if si ≥ fj or sj ≥ fi). The activity-selection problem is to choose a maximum-size set of mutually
compatible activities.
S = (A1 A2 A3 A4 A5 A6 A7 A8 A9 A10)
Si = (1,2,3,4,7,8,9,9,11,12)
fi = (3,5,4,7,10,9,11,13,12,14)
Compute a schedule where the greatest number of activities takes place.
Solution: The solution to the above Activity scheduling problem using a greedy strategy is
illustrated below:
Arranging the activities in increasing order of end time
Now, schedule A1
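A Python sketch of the greedy strategy on the data of this example: sort the activities by finish time and repeatedly pick the first one whose start time is not earlier than the finish time of the last activity chosen.

def select_activities(start, finish):
    names = [f"A{i + 1}" for i in range(len(start))]
    acts = sorted(zip(finish, start, names))      # arrange activities by increasing end time
    chosen, last_finish = [], float("-inf")
    for f, s, name in acts:
        if s >= last_finish:                      # compatible with everything chosen so far
            chosen.append(name)
            last_finish = f
    return chosen

Si = [1, 2, 3, 4, 7, 8, 9, 9, 11, 12]
fi = [3, 5, 4, 7, 10, 9, 11, 13, 12, 14]
print(select_activities(Si, fi))
# expected: ['A1', 'A3', 'A4', 'A6', 'A7', 'A9', 'A10'] - seven mutually compatible activities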
OPTIMAL MERGE PATTERN
Merge a set of sorted files of different length into a single sorted file. We need to find an optimal solution,
where the resultant file will be generated in minimum time.
If the number of sorted files are given, there are many ways to merge them into a single sorted file. This
merge can be performed pair wise. Hence, this type of merging is called as 2-way merge patterns.
As, different pairings require different amounts of time, in this strategy we want to determine an optimal
way of merging many files together. At each step, two shortest sequences are merged.
To merge a p-record file and a q-record file requires possibly p + q record moves, the obvious choice
being, merge the two smallest files together at each step.
Two-way merge patterns can be represented by binary merge trees. Let us consider a set of n sorted
files {f1, f2, f3, …, fn}. Initially, each element of this is considered as a single node binary tree. To find
this optimal solution, the following algorithm is used.
Algorithm: TREE(n)
for i := 1 to n – 1 do
    declare new node
    node.leftchild := least(list)
    node.rightchild := least(list)
    node.weight := (node.leftchild).weight + (node.rightchild).weight
    insert(list, node)
return least(list)
At the end of this algorithm, the weight of the root node represents the optimal cost.
Example
Let us consider the given files, f1, f2, f3, f4 and f5 with 20, 30, 10, 5 and 30 number of elements
respectively.
If merge operations are performed according to the provided sequence, then
M1 = merge f1 and f2 => 20 + 30 = 50
M2 = merge M1 and f3 => 50 + 10 = 60
M3 = merge M2 and f4 => 60 + 5 = 65
M4 = merge M3 and f5 => 65 + 30 = 95
Hence, the total number of operations is
50 + 60 + 65 + 95 = 270
Step-1
Step-2
Step-3
Step-4
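A Python sketch of the optimal two-way merge pattern using a min-heap: at each step the two smallest files are removed, merged, and the merged size is pushed back, exactly as in the TREE algorithm above. On the five files of this example the accumulated cost should come to 205, compared with 270 for the sequence shown earlier.

import heapq

def optimal_merge_cost(sizes):
    heap = list(sizes)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)         # the two smallest files
        b = heapq.heappop(heap)
        total += a + b                  # cost of merging them
        heapq.heappush(heap, a + b)     # the merged file rejoins the pool
    return total

print(optimal_merge_cost([20, 30, 10, 5, 30]))   # 205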
HUFFMAN TREES
Suppose we have to encode a text that comprises n characters from some alphabet by
assigning to each of the text's characters some sequence of bits called the codeword. Two types of
encoding are used: fixed-length encoding, which assigns to each character a bit string of the same length m,
and variable-length encoding, which assigns codewords of different lengths to different characters. The
latter introduces a problem that fixed-length encoding does not have: telling how many bits of an encoded
text represent each character. This is resolved by using prefix-free codes, in which no codeword is a prefix
of another codeword.
Huffman’s algorithm
Step 1: Initialize n one-node trees and label them with the characters of the alphabet.
Record the frequency of each character in its tree’s root to indicate the tree’s weight.
Step 2: Repeat the following operation until a single tree is obtained. Find two trees with the
smallest weight. Make them the left and right sub trees of a new tree and record the sum of
their weights in the root of the new tree as its weight. A tree constructed by the above algorithm
is called a Huffman tree. It defines a Huffman code.
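A Python sketch of Steps 1 and 2 of Huffman's algorithm: keep the one-node trees in a min-heap keyed by weight, repeatedly join the two lightest trees, and finally read the codewords off the resulting tree. The character weights in the driver are illustrative, not the ones from the example below.

import heapq
from itertools import count

def huffman_codes(freq):
    # freq: dict mapping each character to its weight (frequency or probability)
    tie = count()                                            # tie-breaker keeps tuples comparable
    heap = [(w, next(tie), ch) for ch, w in freq.items()]    # Step 1: one-node trees
    heapq.heapify(heap)
    while len(heap) > 1:                                     # Step 2: join the two lightest trees
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    codes = {}
    def assign(node, code):
        if isinstance(node, tuple):                          # internal node: descend into subtrees
            assign(node[0], code + "0")
            assign(node[1], code + "1")
        else:
            codes[node] = code or "0"
    assign(heap[0][2], "")
    return codes

print(huffman_codes({"A": 0.4, "B": 0.1, "C": 0.2, "D": 0.15, "_": 0.15}))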
Example: consider the five-character alphabet {A, B, C, D, _} with the following occurrence
probabilities:
Character: A   B   C   D   _
With the occurrence probabilities given and the codeword lengths obtained, the
expected number of bits per char in this code is:
The draw back can be overcome by the so called dynamic Huffman encoding,
in which the coding tree is updated each time a new char is read from source text.
Huffman's code is not limited to data compression. The sum Σ (i = 1 to n) li·wi, where
li is the length of the simple path from the root to the ith leaf and wi is its weight, is called the weighted
path length. From this, decision trees can be obtained, which are used in guessing-game applications.