MSC Computer Science: Karnataka State Open University
Mukthagangotri, Mysore-570006
Course Name: Advanced Data Structures with Algorithms Credit: 4 Unit No: 1-16
Editorial Committee
Dr. D M Mahesha MCA.,PhD Chairman
BOS Chairman,
Assistant Professor & Programme co-ordinator(PG)
DoS&R in Computer Science,
Karnataka State Open University, Mysuru-570 006.
Copy Right
Registrar,
Karnataka State Open University,
Mukthagangotri, Mysore-570 006.
Printed and Published on behalf of Karnataka State Open University, Mysore-570 006 by
the Registrar (Administration)-2022
TABLE OF CONTENTS
BLOCK
GRAPHS
III
UNIT-9 Elementary graph algorithms: representations of graphs – breadth-first search – depth-first search. 142-157
Preface
In the current era of intelligent systems, computer science plays a very important role in
building a completely automated society. The problems at hand have to be tackled on time, or
in real time, so that the time required to solve them with the aid of computers becomes
negligible. What matters in achieving this is the complexity of the method (algorithm)
adopted to solve a problem. The method may achieve the intended goal either by reducing the
search domain (in memory) or by the use of indexing on large memory. That is the trade-off
we have to understand while designing a suitable algorithm for solving a problem. When
several ways of solving a problem exist, we have to look for the most efficient one among
them. The efficiency of a method depends on our requirement specification. A
method/algorithm which is efficient for somebody may not be efficient for others. Before one
adopts any method, he/she has to work out the trade-offs of all algorithms with respect to
their time requirement, space requirement, correctness, accuracy, robustness, simplicity in
terms of transparency etc., which are generally called the quality factors.
Block 1 begins with design strategies. Unit 1 introduces algorithms, how to test algorithms,
performance measurement and practical complexities. The 2nd unit focuses on analyzing
algorithms and analyzing control structures, with supplementary examples. The 3rd unit
covers asymptotic notations, time complexity and space complexity. The 4th unit discusses
recurrences: the substitution method, the recursion-tree method and the solving of
recurrences are also explained.
Block 2 begins with binary search trees. Unit 5 covers the basic concepts of querying a
binary search tree, insertion and deletion in Red-Black trees, and the properties of Red-Black
trees. The 6th unit focuses on insertion and deletion in B-Trees, the basic operations of
B-Trees, and deleting a key from a B-Tree. The 7th unit covers Fibonacci heaps: their
structure and mergeable-heap operations. The 8th unit discusses decreasing a key, deleting a
node, and bounding the maximum degree.
Block 3 begins with graphs. Unit 9 covers elementary graph algorithms: representations of
graphs, breadth-first search and depth-first search. The 10th unit focuses on topological sort,
strongly connected components, and minimum spanning trees: growing a minimum spanning
tree. The 11th unit covers the algorithms of Kruskal and Prim, and single-source shortest
paths: the Bellman-Ford algorithm and single-source shortest paths in directed acyclic
graphs. The 12th unit explains Dijkstra's algorithm and all-pairs shortest paths: shortest paths
and matrix multiplication, and the Floyd-Warshall algorithm.
Block 4 begins with dynamic programming. Unit 13 covers dynamic programming:
matrix-chain multiplication, elements of dynamic programming, and the longest common
subsequence. The 14th unit focuses on greedy algorithms: an activity-selection problem and
elements of the greedy strategy. The 15th unit covers Huffman codes and introduces
NP-complete and NP-hard problems. The 16th unit discusses NP-completeness: polynomial
time, polynomial-time verification, NP-completeness and reducibility, NP-completeness
proofs, and NP-complete problems are also explained.
UNIT – 1
STRUCTURE
1.0 Objectives
1.1 Introduction to Algorithms
1.2 Algorithms as a Technology
1.3 Testing of Algorithms
1.4 Performance Measurement
1.5 Practical Complexities
1.6 Summary
1.7 Keywords
1.9 Reference
1.0 OBJECTIVES
1.1 INTRODUCTION TO ALGORITHMS
The 21st-century field of computer science studies how to solve problems effectively and
efficiently with the aid of computers. Solving a problem by the use of computers requires a
thorough knowledge and understanding of the problem. The problem could be of any complexity,
ranging from a simple problem of adding two numbers to that of making the computer capable of
taking decisions on time in a real environment, automatically, by understanding the situation or
environment, as if the decision were taken by a human being. In order to automate the task of solving
a problem, one has to think of many ways of arriving at the solution. A way of arriving at a solution
from the problem domain is called an algorithm. Thus, one can have many algorithms for the same
problem.
When many algorithms exist, we have to select the one which best suits our
requirements through analysis of algorithms. Indeed, the design and analysis of algorithms are
two major and interesting subfields of computer science; many scientists work in these subfields
just for the fun of it. Once the most efficient algorithm is selected, it gets coded in a programming
language, which essentially requires knowledge of that language. Finally, we execute the coded
algorithm on a machine (computer) of a particular architecture. Thus, the field of computer science
broadly encompasses the design of algorithms, their analysis, their coding in a programming
language, and their execution on a particular machine architecture.
It shall be noticed that the fundamental notion in all of the above is the term algorithm. This
signifies the prominence of algorithms in the field of computer science, and thus the algorithm
deserves a complete definition. According to the dictionary, an algorithm is a 'process or set of rules
for computer calculation'. However, it is something beyond that definition.
1.2 ALGORITHMS AS A TECHNOLOGY
When we use an algorithm to calculate the answer to a particular problem, we usually assume that
the rules will, if applied correctly, indeed give us the correct answer. A set of rules that calculates
23 times 51 as 1170 (the correct product is 1173) is not generally useful in practice. However, in
some circumstances such approximate algorithms can be useful. If we want to calculate the square
root of 2, for instance, no algorithm can give us an exact answer in decimal notation, since the
decimal representation of the square root of 2 is infinitely long and nonrepeating. In this case, we
shall be content if an algorithm can give us an answer that is as precise as we choose: 4 figures of
accuracy, or 10 figures, or whatever we want.
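As a sketch of such an approximation algorithm (the function name and structure are ours, not from the text), Newton's iteration computes the square root of 2 to any chosen number of significant figures:

```python
from decimal import Decimal, getcontext

def sqrt2(figures):
    # Approximate the square root of 2 to the requested number of
    # significant figures using Newton's iteration x -> (x + 2/x) / 2.
    getcontext().prec = figures + 5      # work with a few guard digits
    x = Decimal(1)
    # Newton's iteration roughly doubles the number of correct digits
    # per step, so a handful of iterations suffices.
    for _ in range(figures.bit_length() + 4):
        x = (x + Decimal(2) / x) / 2
    getcontext().prec = figures
    return +x                            # round to the requested precision

print(sqrt2(4))    # 1.414
print(sqrt2(10))   # 1.414213562
```

Raising `figures` gives as many digits of accuracy as we please, exactly as the paragraph describes.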
The definiteness property insists that each step in the algorithm is unambiguous. A step is said to be
unambiguous if it is clear, in the sense that the action specified by the step can be performed without
any dilemma or confusion. The finiteness property states that the algorithm must terminate its
execution after a finite number of steps (or after a finite period of execution time). That is to say, it
should not get into an endless process. The effectiveness property is indeed the most important
property of any algorithm. There could be many algorithms for a given problem. But, one algorithm
may be effective with respect to a set of constraints and the others may not be effective with respect
to the same constraints. However, they may be more effective with respect to some other constraints.
In fact, because of the effectiveness property associated with algorithms, finding the most
optimal/effective algorithm, even for an already solved problem, is still open to the research
community for further research.
To study the effectiveness of an algorithm, we have to find out the minimum and maximum
number of operations the algorithm takes to solve the desired problem. Profiling of the time
requirement and the space requirement should be done. The profiling may be based on the number
of inputs, the number of outputs, or the nature of the input. The process of determining the minimum
cost and the maximum cost (say, in terms of CPU time and memory locations) is called analysis of
an algorithm. In the subsequent sections, we present methods of analyzing an algorithm.
1.4 PERFORMANCE MEASUREMENT
This is the final stage of algorithm evaluation. A question to be answered when the program
is ready for execution (after the algorithm has been devised, subjected to a priori analysis, coded into
a program, debugged and compiled) is: how do we actually evaluate the time taken by the program?
Obviously, the time required to read the input data or write the output should not be taken into
account. If somebody is keying in the input data through the keyboard, or if data is being read from an
input device, the speed of operation depends on the speed of the device, not on the speed of
the algorithm. So we have to exclude that time while evaluating the program. Similarly, the time to
write the output to any device should also be excluded. Almost all systems provide a facility to
measure elapsed system time, for example through library functions such as time() or clock(). Calls
to these can be inserted at appropriate places in the program, where they act as stop-clock
measurements. For example, the system time can be noted down just after all the inputs have been
read. Another reading can be taken just before the output operations start. The difference between
the two readings is the actual running time of the program. If multiple inputs and outputs are there,
the timing operations should be placed suitably so as to exclude the I/O operations.
It is not enough if this is done for one data set. Normally various data sets are chosen and the
performance is measured as explained above. A plot of data size n v/s the actual time can be drawn
which gives an insight into the performance of the algorithm.
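The stop-clock procedure described above can be sketched as follows; the sorting routine and the data sizes are illustrative choices of ours, not from the text:

```python
import time

def bubble_sort(a):
    # A deliberately slow algorithm, so that growth with n is visible.
    a = list(a)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

profile = []
for n in (500, 1000, 2000):
    data = list(range(n, 0, -1))           # input prepared first: excluded from timing
    start = time.perf_counter()            # first stop-clock reading
    bubble_sort(data)
    elapsed = time.perf_counter() - start  # second reading minus the first
    profile.append((n, elapsed))

for n, t in profile:
    print(n, round(t, 4))
```

Plotting the (n, elapsed) pairs gives exactly the data-size-versus-time curve the paragraph describes.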
The entire procedure explained above is called "profiling". Unfortunately, however, the times
provided by the system clock are not always dependable. Most often they are only indicative in
nature and should not be taken as accurate measurements. Especially when the time durations
involved are of the order of 1-2 milliseconds, the figures tend to vary from one run to another,
even with the same program and the same input values.
Irrespective of what we have seen here and in the subsequent discussions, devising algorithms
is both an art and a science. As a science part, one can study certain standard methods (as we do in
this course) but there is also an individual style of programming which comes only by practice.
1.5 PRACTICAL COMPLEXITIES
We have seen that the time complexity of a program is generally some function of the
instance characteristics. This function is very useful in determining how the time
requirements vary as the instance characteristics change. We can also use the complexity
function to compare two programs P and Q that perform the same task. Assume that program
P has complexity (n) and that program Q has complexity (n2). We can assert that; program
P is faster than program Q is for sufficiently large n. To see the validity of this assertion,
observe that the actual computing time of P is bounded from above by cn for some constant
c and for all n n > nl, while that of Q is bounded from below by dn2 for some constant d and
all n, n n2. Since cn dn2 for n c/d, program P is faster than program Q whenever n
max{n1,n2, c/d).
One should always be cautiously aware of the presence of the phrase sufficiently large
in the as assertion of the preceding discussion. When deciding which of the two programs
to use, we must know whether the n we are dealing with is, in fact, sufficiently large. If program
P actually runs in 106n milliseconds while program Q runs in n2 milliseconds and if we always
have n 106, then program Q is the one to use.
To get a feel for how the various functions grow with n, you should study figures 2.1
and 2.2. These figures show that 2^n grows very rapidly with n. In fact, if a program needs 2^n
steps for execution, then when n = 40 the number of steps needed is approximately
1.1×10^12. On a computer performing 1,000,000,000 steps per second, this program would
require about 18.3 minutes. If n = 50, the same program would run for about 13 days on
this computer. When n = 60, about 36.6 years would be required to execute the program,
and when n = 100, about 4×10^13 years would be needed. We can conclude that the utility of
programs with exponential complexity is limited to small n (typically n < 40).
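These running-time figures can be checked with a few lines of arithmetic (the helper function below is our own illustration):

```python
STEPS_PER_SECOND = 1_000_000_000        # the hypothetical computer above
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def runtime_seconds(steps):
    # Wall-clock time to execute the given number of steps.
    return steps / STEPS_PER_SECOND

print(runtime_seconds(2 ** 40) / 60)                 # 2^40 steps: ~18.3 minutes
print(runtime_seconds(2 ** 50) / 86400)              # 2^50 steps: ~13 days
print(runtime_seconds(2 ** 60) / SECONDS_PER_YEAR)   # 2^60 steps: decades
print(runtime_seconds(100 ** 10) / SECONDS_PER_YEAR) # n^10, n = 100: millennia
```

Running it reproduces the orders of magnitude quoted in the text for both the exponential and the high-degree polynomial cases.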
Programs whose complexity is a high-degree polynomial are also of limited
utility. For example, if a program needs n^10 steps, then our 1,000,000,000-steps-per-second
computer needs 10 seconds when n = 10; 3171 years when n = 100; and 3.17×10^13
years when n = 1000. If the program's complexity had been n³ steps instead, then the
computer would need 1 second when n = 1000, about 16.7 minutes when n = 10,000, and
11.57 days when n = 100,000.
1.6 SUMMARY
In this unit, we have introduced algorithms and given a glimpse of all the phases we should go
through when studying an algorithm and its variations: the introduction to algorithms, algorithms
as a technology, the testing of algorithms, the performance measurement of algorithms, and
practical complexities. All in all, this unit presents the basic idea behind the analysis of algorithms.
1.7 KEYWORDS
1) Algorithm
2) Testing
3) Performance measurement
4) Practical complexities
1.9 REFERENCES
1) Gilles Brassard and Paul Bratley, Fundamentals of Algorithmics, Prentice Hall,
Englewood Cliffs, New Jersey 07632.
2) Sartaj Sahni, 2000, Data Structures, Algorithms and Applications in C++, McGraw Hill
International Edition.
3) Goodman and Hedetniemi, 1987, Introduction to the Design and Analysis of Algorithms,
McGraw Hill International Editions.
UNIT – 2
ANALYZING ALGORITHMS
STRUCTURE
2.0 Objectives
2.6 Summary
2.7 Keywords
2.9 Reference
2.0 OBJECTIVES
What is a barometer?
2.1 ANALYSIS OF ALGORITHMS
In practical situations, it may not be sufficient that an algorithm works properly and yields the
desired results. A single problem can be solved in many different ways, and hence it is possible to
design several algorithms to perform the same job. However, when an algorithm is executed (in the
form of a program) it uses the computer's resources (the CPU time, the memory etc.): the CPU to
perform operations and the memory to hold the program and data. An algorithm which consumes
fewer resources is indeed a better one. Hence, the process of analyzing the algorithm is an
indispensable component in the study of algorithms. Analysis of algorithms, or performance
analysis, refers to the task of determining how much computing time and storage an algorithm
requires to run to completion.
A straightforward method of analyzing an algorithm is to code it and then execute it,
measuring the space and time requirements on a specific computer for various data sets. This
straightforward method, however, is costly, time consuming and inconvenient. Hence, alternate
methods need to be evolved; that is, one should be able to arrive at the requirements by going
through the lines of the algorithm. To do this, we keep in mind that each line of an algorithm gets
converted into one or more instructions (operations) for the computer. Hence, by counting such
instructions (or operations) one can approximate the time required. Similarly, the various data
structures provide information about the amount of storage space necessary.
But in real, pragmatic situations, an algorithm is normally quite lengthy and involves several
loops, so that the 'actual count' may become unbelievably different and unanticipated. It should be
noticed that the instructions themselves are of different types (involving arithmetic operations,
logical operations, simple data movement etc.). Certain operations, like division and multiplication,
take longer than operations like addition, subtraction and data movement. Having obtained the final
count, it is not really possible to decide the exact time required by the algorithm, but the count can
be thought of as a fair approximation. Identifying the more complex and essential operations or
functions and the time required for them is also a way of analyzing, since the time required for
simple instructions becomes negligible in lengthy algorithms. (Note that we are more interested in
comparing one algorithm with another than in actually evaluating them with respect to their costs.
Hence, these approximations most often do not affect our final judgment.) Thus, the problem of
analyzing algorithms reduces to identifying the costliest instructions and summing up the time
required by them. A given algorithm may work very efficiently with a few data sets, but may become
sluggish with others. Hence, the choice of a sufficient number of data sets, representing all possible
cases, becomes important while analyzing an algorithm.
If there is more than one possible way of solving a problem, then one may think of more than
one algorithm for the same problem. Hence, it is necessary to know in what domains these algorithms
are applicable. The data domain is an important aspect to be known in the field of algorithms. Once we
have more than one algorithm for a given problem, how do we choose the best among them? The
solution is to devise some data sets and determine a performance profile for each of the algorithms.
A best-case data set can be obtained by having all distinct data in the set. But it is always complex to
determine a data set which exhibits some average behaviour for all kinds of algorithms.
Analysis of algorithms is a challenging area which needs great mathematical skill. The use of
mathematics allows us to make a quantitative judgment about the value of an algorithm. This
quantitative value can be used to select the best one out of the many algorithms designed to solve the
same problem. In order to obtain this quantitative value, an algorithm can be analyzed at two different
stages. An algorithm can be analyzed just by looking into the algorithm, i.e., without executing it.
This type of analysis is called a priori analysis. In this type of analysis, one obtains a
function (of some relevant parameters) which bounds the computing time of the algorithm. That is,
we get a lower limit and an upper limit such that the computing time of the algorithm always lies
between these limits, irrespective of the nature of the data sets. In the case of a complex algorithm,
analyzing and determining the parameters for time and space consumption without actually
executing the program is challenging. On the other hand, the algorithm can be tested through its
execution, and the actual time and memory required can be determined.
This way of knowing the performance of the algorithm is called a posteriori analysis. In this
analysis, we obtain statistics by running the program, and hence we get the accurate cost of the
algorithm's execution. When compared to a priori analysis, a posteriori analysis can be more easily
comprehended. Thus, we feel it is better to understand more about a posteriori analysis with an
example.
The most important step in a priori analysis is identifying the statements which consume
more time. Such statements can be selected either because they are complex (like division or
multiplication), or because they get executed many times, or, more often, both. Based on these we
can arrive at a sort of approximation to the actual execution time.
We do not actually know the time taken for a multiplication, but we can assume that
example (a) takes one unit of time, (b) takes n units of time (because it gets executed n times, i.e.,
its frequency count is n), and example (c) takes n² units of time (it gets executed n × n times, i.e.,
its frequency count is n²). These values 1, n and n² are said to be in increasing order of magnitude.
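The examples (a), (b) and (c) referred to above are not reproduced in this extract; a typical trio with frequency counts 1, n and n² might look like this (our own illustration):

```python
def example_a(x, y):
    # (a) a single multiplication: frequency count 1.
    return x * y

def example_b(a):
    # (b) one multiplication inside a loop over n items:
    # the statement executes n times, so its frequency count is n.
    total = 0
    for x in a:
        total += x * x
    return total

def example_c(n):
    # (c) a doubly nested loop: the inner statement executes
    # n * n times, so its frequency count is n^2.
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1
    return count
```

For n = 4, `example_c` performs its inner statement 16 times, matching the n² frequency count.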
Philosophically, the above can be interpreted as follows. To travel a given distance, a plane
takes negligible time, a motor car takes some time, a cycle takes much longer, and a person walking
takes even more time. While one can clearly see that the actual time taken depends on the distance
and also on the actual speeds of the vehicles in question, their "orders" are fixed. Given the orders,
we directly say that the plane takes the least time and the walker takes the maximum time. This is
a priori analysis, and it thus requires a lot of a priori knowledge about the data and the functionalities.
To actually find the time taken by algorithms, it is necessary to execute them and note
the timings. However, to make things complex, the performance of an algorithm often depends on
the type of the inputs and the order in which they are given. Hence, the performance of an algorithm
cannot be labelled by one value, but often requires three different cases: best-case performance,
average-case performance and worst-case performance.
Even when two different algorithms solving the same problem are represented by order
notations, it is not always possible to say which among them is the best one. For instance,
consider the problem of finding the sum of the first n natural numbers. The following two
algorithms achieve the same.
Although both algorithms A and B accomplish the same desired task of finding the
sum of the first n natural numbers, they have their own behaviours. Let us assume that all the
arithmetic operations take equal time (one unit), and let us also assume that the assignment
operation takes negligible time when compared to any arithmetic operation and hence can be
neglected. Under this assumption, algorithm A takes, for a given value of n, n units of time
(the 'for' loop runs n times), while algorithm B always takes exactly 3 units of time (one addition,
one multiplication and one division), irrespective of the value of n.
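The two algorithms can be sketched as follows (our own rendering; the text's original pseudocode for A and B is not reproduced in this extract):

```python
def sum_a(n):
    # Algorithm A: accumulate with a loop; roughly n additions,
    # so the time grows linearly with n.
    s = 0
    for i in range(1, n + 1):
        s = s + i
    return s

def sum_b(n):
    # Algorithm B: the closed form n(n+1)/2; a fixed 3 operations
    # (one addition, one multiplication, one division) for any n.
    return n * (n + 1) // 2
```

Both return the same answer (e.g. 5050 for n = 100), but A's cost depends on n while B's does not, which is precisely the behaviour contrasted above.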
Fig. 2.2 Graph of time taken by algorithm A and B
If we plot graphs of the time taken by algorithms A and B, then the graph of A is a linearly
increasing one (see Fig. 2.2(a)) and the graph of B is a constant one (see Fig. 2.2(b)).
Therefore, it shall be noticed that the behaviour of (time taken by) algorithm A depends
on the value of n, while the behaviour of (time taken by) algorithm B is independent of n.
One may therefore feel that algorithm B is always preferred to algorithm A. However, it shall
be noticed that algorithm B is not as simple as algorithm A from the point of view of
understanding its functionality, as one should be familiar with the formula in the case of algorithm B.
This is what we call the trade-off between simplicity and efficiency.
The analysis of algorithms usually proceeds from the inside out. First, we determine the time
required by individual instructions (this time is often bounded by a constant); then we combine these
times according to the control structures that combine the instructions in the program. Some control
structures such as sequencing - putting one instruction after another - are easy to analyze whereas
others such as while loops are more subtle. In this unit, we give general principles that are useful in
analyses involving the most frequently encountered control structures, as well as examples of the
application of these principles.
Sequencing
Let P1 and P2 be two fragments of an algorithm. They may be single instructions or complicated
subalgorithms. Let t1 and t2 be the times taken by P1 and P2, respectively. These times may depend on
various parameters, such as the instance size. The sequencing rule says that the time required to
compute "P1; P2", that is, first P1 and then P2, is simply t1 + t2. By the maximum rule, this time is in
θ(max(t1, t2)). Despite its simplicity, applying this rule is sometimes less obvious than it may appear.
For example, it could happen that one of the parameters that control t2 depends on the result of the
computation performed by P1. Thus, the analysis of "P1; P2" cannot always be performed by
considering P1 and P2 independently.
"For" loops
For loops are the easiest loops to analyse. Consider the following loop.
for i ← 1 to m do P(i)
Here and throughout the book, we adopt the convention that when m = 0 this is not an error; it simply
means that the controlled statement P(i) is not executed at all. Suppose this loop is part of a larger
algorithm, working on an instance of size n. (Be careful not to confuse m and n.) The easiest case is
when the time taken by P(i) does not actually depend on i, although it could depend on the instance
size or, more generally, on the instance itself. Let t denote the time required to compute P(i). In this
case, the obvious analysis of the loop is that P(i) is performed m times, each time at a cost of t, and
thus the total time required by the loop is simply l = mt. Although this approach is usually adequate,
there is a potential pitfall: we did not take account of the time needed for loop control. After all, our
for loop is shorthand for something like the following while loop.
i←1
while i ≤ m do
P(i)
i←i+1
In most situations, it is reasonable to count at unit cost the test i ≤ m, the instructions i ← 1 and i
← i + 1, and the sequencing operations (go to) implicit in the while loop. Let c be an upper bound on
the time required by each of these operations. The time l taken by the loop is thus bounded above by

l ≤ c              for i ← 1
  + (m + 1)c       for the tests i ≤ m
  + mt             for the executions of P(i)
  + mc             for the executions of i ← i + 1
  + mc             for the sequencing operations
  = (t + 3c)m + 2c.

Moreover, this time is clearly bounded below by mt. If c is negligible compared to t, our previous
estimate that l is roughly equal to mt was therefore justified, except for one crucial case: l ≈ mt is
completely wrong when m = 0 (it is even worse if m is negative!).
Resist the temptation to say that the time taken by the loop is in θ(mt) on the pretext that the
θ notation is only required to be effective beyond some threshold such as m ≥ 1. The problem with this
argument is that if we are in fact analyzing the entire algorithm rather than simply the for loop, the
threshold implied by the θ notation concerns n, the instance size, rather than m, the number of times
we go round the loop, and m = 0 could happen for arbitrarily large values of n. On the other hand,
θ(mt) is correct provided t is bounded below by some constant (which is always the case in practice),
and provided there exists a threshold n0 such that m ≥ 1 whenever n ≥ n0.
The analysis of for loops is more interesting when the time t(i) required for P(i) varies as a
function of i. (In general, the time required for P(i) could depend not only on i but also on the instance
size n, or even on the instance itself.) If we neglect the time taken by the loop control, which is usually
adequate provided m ≥ 1, the same for loop

for i ← 1 to m do P(i)

takes a time given not by a multiplication but rather by a sum: it is Σ(i=1 to m) t(i). We illustrate the
analysis of for loops with a simple algorithm for computing the Fibonacci sequence, shown below.
function Fibiter(n)
    i ← 1; j ← 0
    for k ← 1 to n do
        j ← i + j
        i ← j - i
    return j
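A runnable rendering of Fibiter (ours, in Python) behaves as the pseudocode describes:

```python
def fibiter(n):
    # Iterative Fibonacci: after the k-th trip round the loop,
    # i holds f(k-1) and j holds f(k); j is f(n) on exit.
    i, j = 1, 0
    for _ in range(n):
        j = i + j
        i = j - i
    return j

print([fibiter(k) for k in range(8)])   # [0, 1, 1, 2, 3, 5, 8, 13]
```

Note that the two loop-body assignments must execute in this order; together they advance the pair (f(k-1), f(k)) to (f(k), f(k+1)) without a temporary variable.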
If we count all arithmetic operations at unit cost, the instructions inside the for loop take constant
time. Let the time taken by these instructions be bounded above by some constant c. Not taking loop
control into account, the time taken by the for loop is bounded above by n times this constant: nc.
Since the instructions before and after the loop take negligible time, we conclude that the algorithm
takes a time in O(n). Similar reasoning yields that this time is also in Ω(n), hence it is in θ(n). We know
that it is not reasonable to count the additions involved in the computation of the Fibonacci sequence
at unit cost unless n is very small. Therefore, we should take account of the fact that an instruction as
simple as "j ← i + j" is increasingly expensive each time round the loop. It is easy to program
long-integer additions and subtractions so that the time needed to add or subtract two integers is in the
exact order of the number of figures in the larger operand. To determine the time taken by the k-th trip
round the loop, we need to know the length of the integers involved. We can prove by mathematical
induction that the values of i and j at the end of the k-th iteration are f(k-1) and f(k), respectively. This is
precisely why the algorithm works: it returns the value of j at the end of the n-th iteration, which is
therefore f(n), as required. Moreover, de Moivre's formula tells us that the size of f(k) is in θ(k). Therefore,
the k-th iteration takes a time in θ(k - 1) + θ(k), which is the same as θ(k). Let c be a constant such that
this time is bounded above by ck for all k ≥ 1. If we neglect the time required for the loop control and
for the instructions before and after the loop, we conclude that the time taken by the algorithm is
bounded above by

Σ(k=1 to n) ck = c n(n + 1)/2 ∈ O(n²).

Similar reasoning yields that this time is in Ω(n²), and therefore it is in θ(n²). Thus it makes a crucial
difference in the analysis of Fibiter whether or not we count arithmetic operations at unit cost.
The analysis of for loops that start at a value other than 1, or that proceed by larger steps, should be
obvious at this point. Consider the following loop for example.

for i ← 5 to m step 2 do P(i)

Here, P(i) is executed ((m - 5) ÷ 2) + 1 times, provided m ≥ 3. (For a for loop to make sense, the
endpoint should always be at least as large as the starting point minus the step.)
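The trip-count formula can be cross-checked mechanically (our own illustration; ÷ is taken as floor division):

```python
def trip_count(m):
    # Number of executions of P(i) in "for i <- 5 to m step 2",
    # per the formula ((m - 5) / 2) + 1 with floor division.
    return (m - 5) // 2 + 1

# Compare the formula with an actual stepped loop i = 5, 7, 9, ..., <= m.
for m in range(3, 30):
    executed = len(range(5, m + 1, 2))
    assert executed == trip_count(m)
```

For instance, m = 11 gives the iterations i = 5, 7, 9, 11, and the formula yields ((11 - 5) ÷ 2) + 1 = 4.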
Recursive calls
Let T(n) be the time taken by a call on Fibrec(n). If n < 2, the algorithm simply returns n, which
takes some constant time a. Otherwise, most of the work is spent in the two recursive calls, which
take time T(n - 1) and T(n - 2), respectively. Moreover, one addition involving f(n-1) and f(n-2) (the
values returned by the recursive calls) must be performed, as well as the control of the recursion and
the test "if n < 2". Let h(n) stand for the work involved in this addition and control, that is, the time
required by a call on Fibrec(n) ignoring the time spent inside the two recursive calls. By definition of
T(n) and h(n), we obtain the following recurrence.

T(n) = a                             if n < 2
T(n) = T(n - 1) + T(n - 2) + h(n)    otherwise
If we count the additions at unit cost, h(n) is bounded by a constant and we conclude that
Fibrec(n) takes a time exponential in n. This is doubly exponential in the size of the instance, since the
value of n is exponential in the size of n.
If we do not count the additions at unit cost, h(n) is no longer bounded by a constant. Instead,
h(n) is dominated by the time required for the addition of f(n-1) and f(n-2) for sufficiently large n. We
know that this addition takes a time in the exact order of n. Therefore h(n) ∈ θ(n). Surprisingly, the
result is the same regardless of whether h(n) is constant or linear: it is still the case that T(n) ∈ θ(f(n)).
In conclusion, Fibrec(n) takes a time exponential in n whether or not we count additions at unit cost!
The only difference lies in the multiplicative constant hidden in the θ notation.
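A runnable rendering of Fibrec (ours; the text's original pseudocode is not reproduced in this extract) makes the exponential growth in the number of calls visible:

```python
def fibrec(n):
    # Recursive Fibonacci, as analysed above: two recursive calls
    # plus one addition, giving T(n) = T(n-1) + T(n-2) + h(n).
    if n < 2:
        return n
    return fibrec(n - 1) + fibrec(n - 2)

def fibrec_calls(n):
    # Total number of calls made by fibrec(n). It satisfies the same
    # recurrence shape (plus 1 for the call itself), so it grows in
    # the exact order of the Fibonacci numbers: 2*fib(n+1) - 1.
    if n < 2:
        return 1
    return fibrec_calls(n - 1) + fibrec_calls(n - 2) + 1

print(fibrec(10))        # 55
print(fibrec_calls(10))  # 177
```

Doubling n from 10 to 20 multiplies the call count by over a hundred, which is the exponential behaviour the analysis predicts.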
While and repeat loops are usually harder to analyze than for loops because there is no
obvious a priori way to know how many times we shall have to go round the loop. The standard
technique for analyzing these loops is to find a function of the variables involved whose value
decreases each time around. To conclude that the loop will eventually terminate, it suffices to show
that this value must be a positive integer. (You cannot keep decreasing an integer indefinitely.) To
determine how many times the loop is repeated, however, we need to understand better how the
value of this function decreases. An alternative approach to the analysis of while loops consists of
treating them like recursive algorithms. The analysis of repeat loops is carried out similarly.
We shall study the binary search algorithm, which illustrates perfectly the analysis of while loops.
The purpose of binary search is to find an element x in an array T[1..n] that is sorted in non-decreasing
order. Assume for simplicity that x is guaranteed to appear at least once in T. We require to find an
integer i such that 1 ≤ i ≤ n and T[i] = x. The basic idea behind binary search is to compare x with the
element y in the middle of T. The search is over if x = y; it can be confined to the upper half of the array
if x > y; otherwise, it is sufficient to search the lower half. We obtain the following algorithm.
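The algorithm referred to here appears as a figure in the original and is not reproduced; the following Python sketch is a reconstruction from the description above, keeping the 1-based indices i, j and k used in the analysis below (Python's arrays are 0-based, hence the k − 1 offsets):

```python
def binary_search(T, x):
    """Return a 1-based index i with T[i-1] == x.

    T is sorted in non-decreasing order and x is assumed to occur in T,
    as in the text.  i and j delimit the section still under consideration.
    """
    i, j = 1, len(T)
    while i < j:
        k = (i + j) // 2        # middle of the section under consideration
        if x < T[k - 1]:
            j = k - 1           # confine the search to the lower half
        elif x > T[k - 1]:
            i = k + 1           # confine the search to the upper half
        else:
            i = j = k           # found: collapse the interval
    return i
```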
Recall that to analyze the running time of a while loop, we must find a function of the variables
involved whose value decreases each time round the loop. In this case, it is natural to consider j - i +
1, which we shall call d. Thus d represents the number of elements of T still under consideration.
Initially, d = n. The loop terminates when i ≥ j, which is equivalent to d ≤ 1. Each time round the loop,
there are three possibilities: either j is set to k − 1, i is set to k + 1, or both i and j are set to k. Let d and d̂
stand respectively for the value of j − i + 1 before and after the iteration under consideration. We
use i, j, î and ĵ similarly. If x < T[k], the instruction "j ← k − 1" is executed and thus î = i and
ĵ = ((i + j) ÷ 2) − 1. Therefore,

    d̂ = ĵ − î + 1 = ((i + j) ÷ 2) − i ≤ (i + j)/2 − i = (j − i)/2 < d/2.

Similarly, if x > T[k], the instruction "i ← k + 1" is executed and thus î = ((i + j) ÷ 2) + 1 and ĵ = j.
Therefore,

    d̂ = ĵ − î + 1 = j − ((i + j) ÷ 2) ≤ j − (i + j − 1)/2 = (j − i + 1)/2 = d/2.

Finally, if x = T[k], then i and j are set to the same value and thus d̂ = 1; but d was at least
2, since otherwise the loop would not have been reentered. We conclude that d̂ ≤ d/2 whichever
case happens, which means that the value of d is at least halved each time round the loop. Since we
stop when d ≤ 1, the process must eventually stop, but how much time does it take?
To determine an upper bound on the running time of binary search, let d_l denote the value of
j − i + 1 at the end of the l-th trip round the loop for l ≥ 1, and let d_0 = n. Since d_{l−1} is the value of
j − i + 1 before starting the l-th iteration, we have proved that d_l ≤ d_{l−1}/2 for all l ≥ 1. It follows
immediately by mathematical induction that d_l ≤ n/2^l. But the loop terminates when d ≤ 1, which
happens at the latest when l = ⌈lg n⌉. We conclude that the loop is entered at most ⌈lg n⌉ times. Since
each trip round the loop takes constant time, binary search takes a time in O(log n). Similar reasoning
yields a matching lower bound of Ω(log n) in the worst case, and thus binary search takes a time in
θ(log n). This is true even though our algorithm can go much faster in the best case, when x is situated
precisely in the middle of the array.
The analysis of many algorithms is significantly simplified when one instruction or one test can be
singled out as a barometer. A barometer instruction is one that is executed at least as often as any other
instruction in the algorithm. (There is no harm if some instructions are executed up to a constant
number of times more often than the barometer, since their contribution is absorbed in the asymptotic
notation.) Provided the time taken by each instruction is bounded by a constant, the time taken by
the entire algorithm is in the exact order of the number of times that the barometer instruction is
executed.
This is useful because it allows us to neglect the exact times taken by each instruction. In particular, it
avoids the need to introduce constants such as those bounding the time taken by various elementary
operations, which are meaningless since they depend on the implementation, and they are discarded
when the final result is expressed in terms of asymptotic notation. For example, consider the analysis
of Fibiter algorithm when we count all arithmetic operations at unit cost. We saw that the algorithm
takes a time bounded above by cn for some meaningless constant c, and therefore that it takes a time
in θ(n). It would have been simpler to say that the instruction j ← i + j can be taken as a barometer, that
this instruction is obviously executed exactly n times, and therefore the algorithm takes a time in θ(n).
Selection sorting will provide a more convincing example of the usefulness of barometer instructions
in the next section.
When an algorithm involves several nested loops, any instruction of the innermost loop can usually
be used as barometer. However, this should be done carefully because there are cases where it is
necessary to take account of the implicit loop control. This happens typically when some of the loops
are executed zero times, because such loops do take time even though they entail no executions of
the barometer instruction. If this happens too often, the number of times the barometer instruction
is executed can be dwarfed by the number of times empty loops are entered, and therefore it was an
error to consider it as a barometer. Consider for instance pigeon-hole sorting. Here we generalize the
algorithm to handle the case where the elements to be sorted are integers known to lie between 1
and s rather than between 1 and 10000. Recall that T[1..n] is the array to be sorted and U[1..s] is an
array constructed so that U[k] gives the number of times integer k appears in T. The final phase of the
algorithm rebuilds T in nondecreasing order as follows from the information available in U.
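The rebuilding loop announced here appears as a figure in the original; the Python sketch below (an illustrative reconstruction, with U[k] counting the occurrences of k as in the text) shows both the counting phase and the final rebuilding phase:

```python
def pigeonhole_sort(T, s):
    """Sort a list T of integers known to lie between 1 and s, in place.

    The first loop builds the counts U; the nested loops at the end are
    the final phase analyzed in the text (outer loop: s trips; inner-loop
    test "U[k] != 0": the correct barometer, performed n + s times).
    """
    U = [0] * (s + 1)            # U[k] = number of times k appears in T
    for v in T:
        U[v] += 1
    i = 0                        # next position of T to be rebuilt
    for k in range(1, s + 1):    # outer loop: executed s times
        while U[k] != 0:         # inner-loop test
            T[i] = k
            i += 1
            U[k] -= 1
    return T
```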
To analyze the time required by this process, we use "U [k]" to denote the value originally stored in
U[k] since all these values are set to 0 during the process. It is tempting to choose any of the
instructions in the inner loop as a barometer. For each value of k, these instructions are executed U[k]
times. The total number of times they are executed is therefore ∑ U[k], the sum being taken over 1 ≤ k ≤ s. But this sum is equal
to n, the number of integers to sort, since the sum of the number of times that each element appears
gives the total number of elements. If indeed these instructions could serve as a barometer, we would
conclude that this process takes a time in the exact order of n. A simple example is sufficient to
convince us that this is not necessarily the case. Suppose U[k] = 1 when k is a perfect square and U[k]
= 0 otherwise. This would correspond to sorting an array T containing exactly once each perfect square
between 1 and n², using s = n² pigeon-holes. In this case, the process clearly takes a time in Ω(n²), since
the outer loop is executed s times. Therefore, it cannot be that the time taken is in θ(n). This proves
that the choice of the instructions in the inner loop as a barometer was incorrect. The problem arises
because we can only neglect the time spent initializing and controlling loops provided we make sure
to include something even if the loop is executed zero times.
The correct and detailed analysis of the process is as follows. Let a be the time needed for the test
U[k] ≠ 0 each time round the inner loop and let b be the time taken by one execution of the instructions
in the inner loop, including the implicit sequencing operation to go back to the test at the beginning
of the loop. To execute the inner loop completely for a given value of k takes a time tₖ = (1 + U[k])a +
U[k]b, where we add 1 to U[k] before multiplying by a to take account of the fact that the test is
performed each time round the loop and one more time to determine that the loop has been
completed. The crucial thing is that this time is not zero even when U[k] = 0. The complete process
takes a time

    c + ∑ (d + tₖ)    (the sum being taken over 1 ≤ k ≤ s),

where c and d are new constants to take account of the time needed to initialize and control the outer
loop, respectively. When simplified, this expression yields c + (a + d)s + (a + b)n. We conclude that the
process takes a time in θ(n + s). Thus the time depends on two independent parameters n and s; it
cannot be expressed as a function of just one of them. It is
easy to see that the initialization phase of pigeon-hole sorting also takes a time in θ(n + s), unless
virtual initialization is used, in which case a time in θ(n) suffices for that phase. In any case, this sorting
technique takes a time in θ(n + s) in total to sort n integers between 1 and s. If you prefer, the
maximum rule can be invoked to state that this time is in θ(max(n, s)). Hence, pigeon-hole sorting is
worthwhile, but only provided s is small enough compared to n. For instance, if we are interested in
the time required as a function only of the number of elements to sort, this technique succeeds in
astonishing linear time if s Є O(n) but it chugs along in quadratic time when s Є θ(n²).
Despite the above, the use of a barometer is appropriate to analyze pigeon-hole sorting. Our problem
was that we did not choose the proper barometer. Instead of the instructions inside the inner loop,
we should have used the inner-loop test " U [k] ≠ 0 " as a barometer. Indeed, no instructions in the
process are executed more times than this test is performed, which is the definition of a barometer.
It is easy to show that this test is performed exactly n + s times, and therefore the correct conclusion
about the running time of the process follows immediately without need to introduce meaningless
constants.
In conclusion, the use of a barometer is a handy tool to simplify the analysis of many algorithms, but
this technique should be used with care.
Selection sort
Let us consider the selection sort algorithm shown below, which is a good example for the
analysis of nested loops.
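The algorithm itself appears as a figure in the original; the Python sketch below (an illustrative reconstruction in 0-based indexing, with a counter added for the barometer test discussed next) follows the usual selection sort that the analysis describes:

```python
def selection_sort(T):
    """Selection sort; also returns how often the barometer test ran.

    For each i, the minimum of T[i..n-1] is found and moved to position i.
    comparisons counts executions of the innermost test "T[j] < minx".
    """
    n = len(T)
    comparisons = 0
    for i in range(n - 1):           # i-th trip round the outer loop
        minj, minx = i, T[i]
        for j in range(i + 1, n):    # inner loop: n - i - 1 trips
            comparisons += 1         # barometer: the test just below
            if T[j] < minx:
                minj, minx = j, T[j]
        T[minj] = T[i]               # place the minimum at position i
        T[i] = minx
    return T, comparisons
```

For n items the test runs exactly n(n − 1)/2 times, matching the θ(n²) bound derived below.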
The time spent by each trip round the inner loop is not constant: it takes longer when
T[j] < minx. It is nevertheless bounded above by some constant c (that takes the loop control into account). For
each value of i, the instructions in the inner loop are executed n − (i + 1) + 1 = n − i times, and therefore
the time taken by the inner loop is t(i) ≤ (n − i)c. The time taken for the i-th trip round the outer loop
is bounded above by b + t(i) for an appropriate constant b that takes account of the elementary
operations before and after the inner loop and of the loop control for the outer loop. Therefore, the
total time spent by the algorithm is bounded above by

    ∑ (b + (n − i)c)  for 1 ≤ i ≤ n − 1,  which equals  (n − 1)b + c·n(n − 1)/2,

which is in O(n²). Similar reasoning shows that this time is also in Ω(n²) in all cases, and therefore
selection sort takes a time in θ(n²) to sort n items.
The above argument can be simplified, obviating the need to introduce explicit constants such
as b and c, once we are comfortable with the notion of a barometer instruction. Here, it is natural to
take the innermost test "if T[j] < minx" as a barometer and count the exact number of times it is
executed. This is a good measure of the total running time of the algorithm because none of the loops
can be executed zero times (in which case loop control could have been more time consuming than
our barometer). The number of times that the test is executed is easily seen to be

    ∑ (n − i)  for 1 ≤ i ≤ n − 1,  which equals  n(n − 1)/2.

Thus the number of times the barometer instruction is executed is in θ(n²), which automatically gives
the running time of the algorithm itself.
Insertion Sort
Let us consider one more sorting technique, insertion sort, for analysis. The procedure for
insertion sort is as shown below.
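The procedure itself appears as a figure in the original; the Python sketch below (an illustrative 0-based reconstruction) implements the standard insertion sort the analysis assumes, counting executions of the while-loop test chosen as barometer:

```python
def insertion_sort(T):
    """Insertion sort; also returns how often the while-loop test ran.

    Element x = T[i] is inserted into the already-sorted prefix T[0..i-1].
    tests counts executions of the loop condition "j >= 0 and x < T[j]".
    """
    tests = 0
    for i in range(1, len(T)):
        x = T[i]
        j = i - 1
        while True:
            tests += 1                  # barometer: the test just below
            if j >= 0 and x < T[j]:
                T[j + 1] = T[j]         # shift larger element to the right
                j -= 1
            else:
                break
        T[j + 1] = x
    return T, tests
```

On an array initially in descending order, the test runs n(n + 1)/2 − 1 times, in agreement with the worst-case count derived below.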
Unlike selection sorting, the time taken to sort n items by insertion depends significantly on
the original order of the elements. Here, we analyze this algorithm in the worst case. To analyze the
running time of this algorithm, we choose as barometer the number of times the while loop condition
(j > 0 and x < T [j]) is tested.
Suppose for a moment that i is fixed. Let x = T[i], as in the algorithm. The worst case arises
when x is less than T[j] for every j between 1 and i − 1, since in this case we have to compare x to
T[i − 1], T[i − 2], ..., T[1] before we leave the while loop because j = 0. Thus the while loop test is performed
i times in the worst case. This worst case happens for every value of i from 2 to n when the array is
initially sorted into descending order. The barometer test is thus performed

    ∑ i  for 2 ≤ i ≤ n,  which equals  n(n + 1)/2 − 1

times in total, which is in θ(n²). This shows that insertion sort also
takes a time in θ(n²) to sort n items in the worst case.
2.6 SUMMARY
In this unit, we analyzed algorithms: why and when an algorithm and its variations are studied,
the process of designing an algorithm, the growth functions that describe its running time, and the
analysis of the control structures involved in its execution. Finally, the use of a barometer instruction
to simplify the analysis of an algorithm was presented.
2.8 REFERENCES
1) Gilles Brassard and Paul Bratley, Fundamentals of Algorithmics, Prentice Hall, Englewood Cliffs,
New Jersey 07632.
2) Sartaj Sahni, Data Structures, Algorithms and Applications in C++, McGraw Hill
International Edition, 2000.
3) Goodman and Hedetniemi, Introduction to the Design and Analysis of Algorithms,
McGraw Hill International Editions, 1987.
UNIT – 3
Asymptotic Notation
STRUCTURE
3.0 Objectives
3.1 Asymptotic Notations
3.2 Standard Notations and Common Functions
3.6 Summary
3.7 Keywords
3.8 References
3.0 OBJECTIVES
3.1 ASYMPTOTIC NOTATIONS
Asymptotic analysis is input bound: if there is no input to the algorithm, it is concluded to work in
constant time. Other than the input, all other factors are considered constant.
Asymptotic analysis refers to computing the running time of an operation in mathematical units of
computation. For example, the running time of one operation may be computed as f(n) while that of
another operation is computed as g(n²). This means the first operation's running time will increase
linearly with the increase in n, while the running time of the second operation will increase
quadratically as n increases. Conversely, the running times of both operations will be nearly the
same if n is small.
Following are the commonly used asymptotic notations to calculate the running time complexity of
an algorithm.
Ο Notation
Ω Notation
θ Notation
The notation Ο(n) is the formal way to express the upper bound of an algorithm's running time. It
measures the worst case time complexity or the longest amount of time an algorithm can possibly
take to complete.
For example, for a function f(n),
Ο(f(n)) = { g(n) : there exist constants c > 0 and n₀ such that g(n) ≤ c·f(n) for all n > n₀ }.
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running time. It
measures the best case time complexity or the best amount of time an algorithm can possibly take
to complete.
Ω(f(n)) = { g(n) : there exist constants c > 0 and n₀ such that g(n) ≥ c·f(n) for all n > n₀ }.
The notation θ(n) is the formal way to express both the lower bound and the upper bound of an
algorithm's running time. It is represented as follows −
θ(f(n)) = { g(n) : g(n) = Ο(f(n)) and g(n) = Ω(f(n)) }.
constant − Ο(1)
logarithmic − Ο(log n)
linear − Ο(n)
quadratic − Ο(n²)
cubic − Ο(n³)
polynomial − n^Ο(1)
exponential − 2^Ο(n)
3.2 STANDARD NOTATIONS AND COMMON FUNCTIONS
Definition: f(n) = O(g(n)) (read as "f of n equals big oh of g of n"), if and only if there exist two positive
integer constants c and n₀ such that |f(n)| ≤ c·|g(n)| for all n ≥ n₀.
In other words, suppose we are determining the computing time f(n) of some algorithm,
where n may be the number of inputs to the algorithm, or the number of outputs, or their sum, or any
other relevant parameter. Since f(n) is machine dependent (it depends on which computer we are
working on), an a priori analysis cannot determine f(n), the actual complexity, as described earlier.
However, it can determine a g(n) such that f(n) = O(g(n)). An algorithm is said to have a computing time
O(g(n)) (of the order of g(n)) if the resulting times of running the algorithm on some computer with
the same type of data, but for increasing values of n, will always be less than some constant times
|g(n)|. We use some polynomial of n, which acts as an upper limit, and we can be sure that the
algorithm does not take more than the time prescribed by this upper limit. For instance, let us consider
the following algorithm.
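The algorithm itself appears as a figure in the original; the statement counts quoted below (1 time, n + 1 times, n times, 1 time) are consistent with a simple summation loop such as this Python reconstruction (an assumption, not the original figure):

```python
def array_sum(a):
    """Sum the n elements of array a, with the statement counts
    discussed in the text marked as comments."""
    n = len(a)
    total = 0              # statement (1): executed 1 time
    i = 0
    while i < n:           # statement (2): tested n + 1 times
        total += a[i]      # statement (3): executed n times
        i += 1
    return total           # statement (4): executed 1 time
```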
In the above algorithm, statement (1) is executed 1 time, statement (2) is executed n+1 times,
statement (3) is executed n times, and statement (4) is executed 1 time. Thus, the total time taken is
2n+3.
In order to represent the time complexity of the above algorithm as f(n) = O(n), it is required to find the
integer constants c and n₀ which satisfy the above definition of O notation. That is, an algorithm with the
time complexity 2n + 3 obtained from a priori analysis can be represented as O(n) because 2n + 3 ≤
3n for all n ≥ 3; here c = 3 and n₀ = 3.
The most commonly encountered complexities are O(1), O(log n), O(n), O(n log n), O(n²), O(n³)
and O(2ⁿ). Algorithms whose running times involve higher powers of n are rarely practical. O(1) means a
computing time that is constant. O(n) is called linear, O(n²) is called quadratic, O(n³) is called cubic and
O(2ⁿ) is called exponential. The commonly used complexities can thus be arranged in increasing
order as follows:

    O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ)
If we substitute different values of n and plot the growth of these functions, it becomes
obvious that at lower values of n there is not much difference between them. But as n increases, the
values of the higher powers grow much faster than the lower ones, and hence the difference increases.
For example, at n = 2, 3, 4, ..., 9 the value of 2ⁿ happens to be less than n³, but once n ≥ 10, 2ⁿ shows
a drastic growth.
The O-notation discussed so far is the most popular of the asymptotic notations and is used
to define the upper bound of the performance of an algorithm, also referred to as the worst-case
performance of an algorithm. But it is not the only complexity we have. Sometimes, we may wish to
determine the lower bound of an algorithm, i.e., the least value the complexity of an algorithm can
take. This is denoted by Ω (omega).
Definition: f(n) = Ω(g(n)) (read as "f of n equals omega of g of n") if and only if there exist positive non-
zero constants C and n₀ such that |f(n)| ≥ C|g(n)| for all n ≥ n₀.
In some cases both the upper and lower bounds of an algorithm can be the same. Such a situation is
described by the θ-notation.
Definition: f(n) = θ(g(n)) if and only if there exist positive constants C₁, C₂ and n₀ such that for all n >
n₀, C₁|g(n)| ≤ f(n) ≤ C₂|g(n)|.
It is important to decide how we are going to describe our algorithms. If we try to explain them
in English, we rapidly discover that natural languages are not at all suited to this kind of thing. To avoid
confusion, we shall in future specify our algorithms by giving a corresponding program. We assume
that the reader is familiar with at least one well-structured programming language such as Pascal.
However, we shall not confine ourselves strictly to any particular programming language: in this way,
the essential points of an algorithm will not be obscured by relatively unimportant programming
details, and it does not really matter which well-structured language the reader prefers.
A few aspects of our notation for programs deserve special attention. We use phrases in
English in our programs whenever this makes for simplicity and clarity. Similarly, we use mathematical
language, such as that of algebra and set theory, whenever appropriate, including the mathematical
symbols introduced in Section 1.4.7. As a consequence, a single "instruction" in our programs may have to
be translated into several instructions-perhaps a while loop-if the algorithm is to be implemented in
a conventional programming language. Therefore, you should not expect to be able to run the
algorithms we give directly: you will always be obliged to make the necessary effort to transcribe them
into a "real" programming language. Nevertheless, this approach best serves our primary purpose, to
present as clearly as possible the basic concepts underlying our algorithms.
To simplify our programs further, we usually omit declarations of scalar quantities (integer,
real, or Boolean). In cases where it matters-as in recursive functions and procedures-all variables used
are implicitly understood to be local variables, unless the context makes it clear otherwise. In the same
spirit of simplification, the proliferation of begin and end statements that plagues programs written in
Pascal, is avoided: the range of statements such as if, while, or for, as well as that of declarations such
as procedure, function, or record, is shown by indenting the statements affected. The statement
return marks the dynamic end of a procedure or a function, and in the latter case it also supplies the
value of the function.
We do not declare the type of parameters in procedures and functions, nor the type of the
result returned by a function, unless such declarations make the algorithm easier to understand.
Scalar parameters are passed by value, which means they are treated as local variables within the
procedure or function, unless they are declared to be var parameters, in which case they can be used
to return a value to the calling program. In contrast, array parameters are passed by reference, which
means that any modifications made within the procedure or function are reflected in the array
actually passed in the calling statement.
Finally, we assume that the reader is familiar with the concepts of recursion, record, and
pointer. The last two are denoted exactly as in Pascal, except for the omission of begin and end in
records. In particular, pointers are denoted by the symbol “↑ ".
To wrap up this section, here is a program for multiplication. Here ÷ denotes integer division:
any fraction in the answer is discarded. We can compare this program to the informal English
description of the same algorithm.

function Multiply (m, n)
    result ← 0
    repeat
        if m is odd then result ← result + n
        m ← m ÷ 2
        n ← n + n
    until m = 0
    return result
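For comparison, here is a runnable Python version of the same halve-and-double scheme (Python is our choice for illustration; the ÷ of the pseudocode becomes floor division, and m is assumed to be at least 1):

```python
def multiply(m, n):
    """Multiply two positive integers by repeatedly halving m and
    doubling n, accumulating n whenever m is odd."""
    result = 0
    while m >= 1:
        if m % 2 == 1:       # when m is odd, the current n contributes
            result += n
        m //= 2              # integer division: any fraction is discarded
        n += n               # double n
    return result
```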
There is often a time-space tradeoff involved in a problem; that is, it cannot be solved
with both little computing time and low memory consumption. One then has to make a compromise
and exchange computing time for memory consumption or vice versa, depending on which
algorithm one chooses and how one parameterizes it.
3.6 SUMMARY
In this unit, we discussed the asymptotic notation of algorithms. We analyzed the most
frequently encountered notations: big oh, omega and theta. We also learnt how standard
notations and common functions, such as those describing time complexity and space complexity,
are used in the analysis of many algorithms.
3.7 KEYWORDS
1) Notation
2) Space complexity
3) Time complexity
4) Asymptotic notation
3.8 REFERENCES
1) Gilles Brassard and Paul Bratley, Fundamentals of Algorithmics, Prentice Hall, Englewood Cliffs,
New Jersey 07632.
2) Sartaj Sahni, Data Structures, Algorithms and Applications in C++, McGraw Hill
International Edition, 2000.
3) Goodman and Hedetniemi, Introduction to the Design and Analysis of Algorithms,
McGraw Hill International Editions, 1987.
UNIT – 4
RECURRENCES
STRUCTURE
4.0 Objectives
4.1 Introduction to Recurrences
4.5 Summary
4.6 Keywords
4.8 Reference
4.0 OBJECTIVES
4.1 INTRODUCTION TO RECURRENCES
A recurrence relation, in the context of algorithm design, is typically a growth function that
represents the running time of the algorithm with respect to the input size for a particular type of
analysis (e.g., worst-case). We usually formulate it as a function that is written in terms of itself
(recursive case), together with a constant value for small inputs (base case).
Something like T(n) = T(n/2) + 1 when n > 1, with T(1) = 1, could represent the running
time of an algorithm such as binary search in the worst case. We call it a recurrence because it is
defined in terms of itself, recursively. Recurrences occur very frequently in the analysis of divide-and-conquer
algorithms.
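A quick way to get a feel for such a recurrence is to unfold it mechanically. The Python sketch below (an illustration, not part of the text) evaluates T(n) = T(n ÷ 2) + 1 with T(1) = 1 by counting how many times n can be halved:

```python
def T(n):
    """Unfold the recurrence T(n) = T(n // 2) + 1, T(1) = 1."""
    steps = 1                # the base-case cost T(1) = 1
    while n > 1:
        n //= 2              # one recursive step: halve the input
        steps += 1           # ... at unit cost
    return steps
```

The result is ⌊lg n⌋ + 1, confirming the logarithmic behaviour expected of binary search.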
Recurrences do not just occur within the context of our field, and are studied quite extensively in all
mathematical disciplines, and beyond.
The substitution method is a powerful approach that is able to prove upper bounds for almost all
recurrences. However, its power is not always needed; for certain types of recurrences, the master
method (see below) can be used to derive a tight bound with less work. In those cases, it is better
to simply use the master method, and to save the substitution method for recurrences that actually
need its full power.
Note that the substitution method still requires the use of induction. The induction will always be
of the same basic form, but it is still important to state the property you are trying to prove, split
into one or more base cases and the inductive case, and note when the inductive hypothesis is being
used.
Substitution method example
Consider the following recurrence relation, which shows up fairly frequently for some types of
algorithms:
T(1) = 1
T(n) = 2T(n−1) + c1
By expanding this out a bit (using the "iteration method"), we can guess that this will be O(2ⁿ). To
use the substitution method to prove this bound, we now need to guess a closed-form upper bound
based on this asymptotic bound. We will guess an upper bound of k·2ⁿ − b, where b is some constant.
We include the b in anticipation of having to deal with the constant c1 that appears in the recurrence
relation, and because it does no harm. In the process of proving this bound by induction, we will
generate a set of constraints on k and b, and if b turns out to be unnecessary, we will be able to set
it to whatever we want at the end.
Our property, then, is T(n) ≤ k·2ⁿ − b, for some two constants k and b. Note that this property
logically implies that T(n) is O(2ⁿ), which can be verified with reference to the definition of O.
Inductive case: We assume our property is true for n − 1, i.e., T(n − 1) ≤ k·2ⁿ⁻¹ − b. We now want to
show that it is true for n.

    T(n) = 2T(n − 1) + c1
         ≤ 2(k·2ⁿ⁻¹ − b) + c1
         = k·2ⁿ − 2b + c1
         ≤ k·2ⁿ − b        (provided b ≥ c1)

Base case: T(1) = 1 ≤ 2k − b, which holds provided k ≥ (b + 1)/2. So we end up with two constraints
that need to be satisfied for this proof to work, and we can satisfy them simply by letting b = c1
and k = (b + 1)/2, which is always possible, as the definition of O allows us to choose any constants.
Therefore, we have proved that our property is true, and so T(n) is O(2ⁿ).
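The constraints can be sanity-checked numerically (a check, not a proof). The Python sketch below evaluates the recurrence directly and compares it with the bound k·2ⁿ − b, for the constants chosen above and an arbitrary choice c1 = 3:

```python
c1 = 3                # arbitrary constant from the recurrence (assumption)
b = c1                # constraint from the inductive case
k = (b + 1) / 2       # constraint from the base case

def T(n):
    """Evaluate T(1) = 1, T(n) = 2*T(n-1) + c1 directly."""
    return 1 if n == 1 else 2 * T(n - 1) + c1

def holds(n):
    """Does the claimed upper bound k*2**n - b hold at n?"""
    return T(n) <= k * 2 ** n - b
```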
The biggest thing worth noting about this proof is the importance of adding additional terms to
the upper bound we assume. In almost all cases in which the recurrence has constants or lower-order
terms, it will be necessary to have additional terms in the upper bound to "cancel out" the
constants or lower-order terms. Without the right additional terms, the inductive case of the proof
will get stuck in the middle, or generate an impossible constraint; this is a signal to go back to your
upper bound and determine what else needs to be added to it that will allow the proof to proceed
without causing the bound to change in asymptotic terms.
The recursion tree for this recurrence has the following form:
In this case, it is straightforward to sum across each row of the tree to obtain the total work done at
a given level:
This is a geometric series, thus in the limit the sum is O(n²). The depth of the tree in this case does not
really matter; the amount of work at each level is decreasing so quickly that the total is only a
constant factor more than the root.
Recursion trees can be useful for gaining intuition about the closed form of a recurrence, but they
are not a proof (and in fact it is easy to get the wrong answer with a recursion tree, as is the case
with any method that includes ''...'' kinds of reasoning). As we saw earlier, a good way of
establishing a closed form for a recurrence is to make an educated guess and then prove by induction
that your guess is indeed a solution. Recursion trees can be a good method of guessing.
Consider, for example, a recurrence of the form T(n) = T(n/3) + T(2n/3) + n. Expanding out the first
few levels, the recurrence tree is unbalanced: the longest path is the rightmost one, and its length
is log_{3/2} n. Hence our guess for the closed form of this recurrence is O(n log n).
The indispensable last step when analyzing an algorithm is often to solve a recurrence
equation. With a little experience and intuition most recurrences can be solved by intelligent
guesswork. However, there exists a powerful technique that can be used to solve certain classes of
recurrences almost automatically. This is the main topic of this section: the technique of the
characteristic equation.
Intelligent Guesswork
This approach generally proceeds in four stages: calculate the first few values of the
recurrence, look for regularity, guess a suitable general form, and finally prove by mathematical
induction (perhaps constructive induction) that this form is correct. Consider the following recurrence.
    T(1) = 1
    T(n) = 3T(n ÷ 2) + n    for n > 1        (4.1)
One of the first lessons experience will teach you if you try solving recurrences is that discontinuous
functions such as the floor function (implicit in n ÷ 2) are hard to analyze. Our first step is to replace n
÷ 2 with the better-behaved "n/2" with a suitable restriction on the set of values of n that we consider
initially. It is tempting to restrict n to being even since in that case n ÷ 2 = n/2, but recursively dividing
an even number by 2 may produce an odd number larger than 1. Therefore, it is a better idea to restrict
n to being an exact power of 2. Once this special case is handled, the general case follows painlessly
in asymptotic notation.
First, we tabulate the value of the recurrence on the first few powers of 2.

    n     1    2    4    8    16    32
    T(n)  1    5    19   65   211   665

Each term in this table but the first is computed from the previous term. For instance, T(16) = 3 ×
T(8) + 16 = 3 × 65 + 16 = 211. But is this table useful? There is certainly no obvious pattern in this
sequence! What regularity is there to look for? The solution becomes apparent if we keep more
"history" about the value of T(n). Instead of writing T(2) = 5, it is more useful to write T(2) = 3 × 1 + 2.
Then,
    T(4)  = 3T(2) + 4  = 3² × 1 + 3 × 2 + 4
    T(8)  = 3T(4) + 8  = 3³ × 1 + 3² × 2 + 3 × 4 + 8
    T(16) = 3T(8) + 16 = 3⁴ × 1 + 3³ × 2 + 3² × 4 + 3 × 8 + 16

The pattern is now obvious. In general, for n = 2^k, summing the geometric series gives

    T(2^k) = 3^k + ∑ 3ⁱ2^(k−i)  (0 ≤ i ≤ k − 1) = 3^(k+1) − 2^(k+1)        (4.2)
It is easy to check this formula against our earlier tabulation. By induction (not mathematical
induction), we are now convinced that the above equation is correct.
With hindsight, Equation (4.2) could have been guessed with just a little more intuition.
For this it would have been enough to tabulate the value of T(n) + in for small values of i, such as
−2 ≤ i ≤ 2. This time, it is immediately apparent that T(n) + 2n is an exact power of 3, from which Equation
(4.2) is readily derived.
What happens when n is not a power of 2? Solving recurrence 4.1 exactly is rather difficult.
Fortunately, this is unnecessary if we are happy to obtain the answer in asymptotic notation. For this,
it is convenient to rewrite Equation 4.2 in terms of T(n) rather than in terms of T(2^k). Since n = 2^k, it
follows that k = lg n. Therefore,

    T(n) = 3^(lg n + 1) − 2^(lg n + 1) = 3 · 3^(lg n) − 2n = 3n^(lg 3) − 2n,

and hence T(n) Є θ(n^(lg 3)) when n is a power of 2.
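The closed form can be checked mechanically against the recurrence. In the Python sketch below (an illustration, not part of the text), T evaluates recurrence 4.1 and closed_form evaluates Equation 4.2:

```python
def T(n):
    """Evaluate recurrence 4.1: T(1) = 1, T(n) = 3*T(n // 2) + n.

    Exact for n a power of 2, the case studied in the text.
    """
    return 1 if n == 1 else 3 * T(n // 2) + n

def closed_form(k):
    """Conjectured solution (Equation 4.2): T(2**k) = 3**(k+1) - 2**(k+1)."""
    return 3 ** (k + 1) - 2 ** (k + 1)
```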
Homogeneous Recurrences
We begin our study of the technique of the characteristic equation with the resolution of
homogeneous linear recurrences with constant coefficients, that is, recurrences of the form

    a₀tₙ + a₁tₙ₋₁ + ⋯ + aₖtₙ₋ₖ = 0        (4.4)

where the tᵢ are the values we are looking for. In addition to Equation 4.4, the values of tᵢ on k values
of i (usually 0 ≤ i ≤ k − 1 or 1 ≤ i ≤ k) are needed to determine the sequence. These initial conditions will
be considered later. Until then, Equation 4.4 typically has infinitely many solutions. This recurrence is
linear because it does not contain terms of the form tₙ₋ᵢtₙ₋ⱼ, (tₙ₋ᵢ)², and so on;
homogeneous because the linear combination of the tₙ₋ᵢ is equal to zero; and
with constant coefficients because the aᵢ are constants.
Consider for instance our now familiar recurrence for the Fibonacci sequence,

    fₙ = fₙ₋₁ + fₙ₋₂.

This recurrence easily fits the mould of Equation 4.4 after obvious rewriting:

    fₙ − fₙ₋₁ − fₙ₋₂ = 0.

Therefore, the Fibonacci sequence corresponds to a homogeneous linear recurrence with constant
coefficients with k = 2, a₀ = 1 and a₁ = a₂ = −1.
Before we even start to look for solutions to Equation 4.4, it is interesting to note that any
linear combination of solutions is itself a solution. In other words, if fₙ and gₙ both satisfy Equation 4.4,
and if we set tₙ = c fₙ + d gₙ for arbitrary constants c and d, then tₙ is also a solution to Equation 4.4.
This is true because

    a₀tₙ + a₁tₙ₋₁ + ⋯ + aₖtₙ₋ₖ = c(a₀fₙ + ⋯ + aₖfₙ₋ₖ) + d(a₀gₙ + ⋯ + aₖgₙ₋ₖ) = c · 0 + d · 0 = 0.
Trying to solve a few easy examples of recurrences of the form of Equation 4.4 (not the Fibonacci sequence) by intelligent guesswork suggests looking for solutions of the form tn = x^n, where x is a constant as yet unknown. If we try this guessed solution in Equation 4.4, we obtain
a0 x^n + a1 x^(n-1) + ... + ak x^(n-k) = 0.
This equation is satisfied if x = 0, a trivial solution of no interest. Otherwise, the equation is satisfied if and only if
a0 x^k + a1 x^(k-1) + ... + ak = 0.
This equation of degree k in x is called the characteristic equation of the recurrence 4.4, and the polynomial on its left-hand side is called the characteristic polynomial.
Recall that the fundamental theorem of algebra states that any polynomial p(x) of degree k has exactly k roots (not necessarily distinct), which means that it can be factorized as a product of k monomials
p(x) = a0 (x - r1)(x - r2) ... (x - rk),
where the ri may be complex numbers. Moreover, these ri are the only solutions of the equation p(x) = 0.
Consider any root ri of the characteristic polynomial. Since p(ri) = 0 it follows that x = ri is a solution to the characteristic equation and therefore ri^n is a solution to the recurrence. Since any linear combination of solutions is also a solution, it follows that
tn = c1 r1^n + c2 r2^n + ... + ck rk^n (4.6)
satisfies the recurrence for any choice of constants c1, c2, ..., ck. The remarkable fact, which we do not prove here, is that Equation 4.4 has only solutions of this form provided all the ri are distinct. In this case, the k constants can be determined from k initial conditions by solving a system of k linear equations in k unknowns.
For the Fibonacci sequence the characteristic equation is x^2 - x - 1 = 0, whose roots are r1 = (1 + √5)/2 and r2 = (1 - √5)/2. The general solution is therefore of the form
fn = c1 r1^n + c2 r2^n. (4.7)
It remains to use the initial conditions to determine the constants c1 and c2. When n = 0, Equation 4.7 yields f0 = c1 + c2. But we know that f0 = 0. Therefore, c1 + c2 = 0. Similarly, when n = 1, Equation 4.7 together with the second initial condition tells us that f1 = c1 r1 + c2 r2 = 1. Remembering that the values of r1 and r2 are known, this gives us two linear equations in the two unknowns c1 and c2.
Solving these equations, we obtain c1 = 1/√5 and c2 = -1/√5. Thus
fn = (1/√5) [((1 + √5)/2)^n - ((1 - √5)/2)^n],
which is de Moivre's famous formula for the Fibonacci sequence. Notice how much easier the technique of the characteristic equation is than the approach by constructive induction. It is also more precise, since all we were able to discover with constructive induction was that "fn grows exponentially in a number close to φ"; now we have an exact formula.
4.5 SUMMARY
In this unit, we studied recurrences. We saw the recursion-tree method, and we saw that with a little experience and intuition most recurrences can be solved by intelligent guesswork. The characteristic equation is a powerful technique that can be used to solve certain classes of recurrence almost automatically.
4.6 KEYWORDS
1) Recurrence
2) Homogeneous
3) Substitution
4.8 REFERENCES
1) Fundamentals of Algorithmics: Gilles Brassard and Paul Bratley, Prentice Hall Englewood
Cliffs, New Jersey 07632.
2) Sartaj Sahni, 2000, Data structures, Algorithms and Applications in C++, McGraw Hill
International Edition.
3) Goodman And Hedetniemi, 1987, Introduction to the Design and Analysis of Algorithms,
Mcgraw Hill International Editions.
UNIT – 5
STRUCTURE
5.0 Objectives
5.6 Summary
5.7 Keywords
5.9 Reference
5.0 OBJECTIVES
5.1 INTRODUCTION TO BINARY SEARCH TREES
Definition: A tree is defined as a finite set of one or more nodes such that (i) there is a specially designated node called the root, and (ii) the remaining nodes are partitioned into disjoint sets, each of which is in turn a tree, called a subtree of the root.
In the above figure, node 1 represents the root of the tree; nodes 2, 3, 4 and 9 are intermediate nodes; and nodes 5, 6, 7, 8, 10, 11 and 12 are the leaf nodes of the tree. The definition of the tree emphasizes two aspects: (i) connectedness and (ii) absence of loops or cycles. Beginning from the root node, the structure of the tree permits connectivity of the root to every other node in the tree. In general, any node is reachable from anywhere in the tree. Also, with branches providing links between the nodes, the structure ensures that no set of nodes links together to form a closed loop or cycle.
1. There is one and only one path between every pair of vertices in a tree, T.
2. A tree with n vertices has n-1 edges.
There are several basic terminologies associated with trees. There is a specially designated node called the root node. The number of subtrees of a node is known as the degree of the node. Nodes that have zero degree are called leaf nodes or terminal nodes. The rest of the nodes are called intermediate nodes. The nodes which hang from branches emanating from a node are called children, and the node from which the branches emanate is known as the parent node. Children of the same parent node are referred to as siblings. The ancestors of a given node are those nodes that occur on the path from the root to the given node. The degree of a tree is the maximum degree of any node in the tree. The level of a node is defined by letting the root node occupy level 0. The rest of the nodes occupy various levels depending on their association: if a parent node occupies level i, then its children occupy level i+1. This gives the tree a hierarchical structure with the root occupying the topmost level, 0. The height or depth of a tree is defined to be the maximum level of any node in the tree.
A forest is a set of zero or more disjoint trees. The removal of the root node from a tree results
in a forest.
A binary tree has the characteristic that all nodes have at most two branches, that is, all nodes have a degree of at most 2. Therefore, a binary tree is either empty or consists of a root node and two disjoint binary trees termed the left subtree and the right subtree. Figure 2.2 shows an example binary tree.
Figure 2.2 An example binary tree
The number of levels in the tree is called the "depth" of the tree. A "complete" binary tree is one which allows sequencing of the nodes such that each level is maximally accommodated before the next level is begun, i.e., the siblings are accommodated before the children of any one of them. A binary tree which is maximally accommodated, with all leaves at the same level, is called a "full" binary tree. A full binary tree is always complete, but a complete binary tree need not be full. Figure 2.2 is an example of a full binary tree and Figure 2.3 illustrates a complete binary tree.
The maximum number of vertices in a binary tree can be found as follows: level i can hold at most 2^i vertices, so a binary tree of depth l has at most
2^0 + 2^1 + 2^2 + ... + 2^l = 2^(l+1) - 1 vertices.
A two-dimensional array can be used to store the adjacency relations very easily and can be used to represent a binary tree. In this representation, to represent a binary tree with n vertices we use an n×n matrix. Figure 5.3(a) shows a binary tree and Figure 5.3(b) shows its adjacency matrix representation.
(a) A binary tree (b) Adjacency matrix representation
Here, the row indices correspond to the parent nodes and the column indices to the child nodes. That is, a row corresponding to vertex vi having the entries 'L' and 'R' indicates that vi has as its left child the vertex indexing the column with the entry 'L', and as its right child the vertex indexing the column with the entry 'R'. A column corresponding to a vertex vi with no entries indicates that vi is the root node; all other columns have exactly one entry. Each row may have 0, 1 or 2 entries: zero entries in a row indicate that the corresponding vertex vi is a leaf node, one entry indicates that the node has only one child, and two entries indicate that the node has both the left and right children. The entry "L" is used to indicate the left child and "R" is used to indicate the right child.
From the above representation, we can see that the storage space utilization is not efficient. Let us examine the space utilization of this method of binary tree representation. Let n be the number of vertices. The space allocated is an n × n matrix, i.e., we have n^2 locations allocated, but only n - 1 entries in the matrix (one per branch). Therefore, the percentage of space utilization is ((n - 1) / n^2) × 100.
The percentage of space utilized decreases as n increases. For large ‘n’, the percentage of
utilization becomes negligible. Therefore, this way of representing a binary tree is not efficient
in terms of memory utilization.
Since the two dimensional array is a sparse matrix, we can consider the prospect of mapping
it onto a single dimensional array for better space utilization. In this representation, we have
to note the following points:
The left child of the ith node is placed at the 2ith position.
The right child of the ith node is placed at the (2i+1)th position.
The parent of the ith node is at the (i/2)th position in the array.
If l is the depth of the binary tree, then the number of possible nodes in the binary tree is 2^(l+1) - 1. Hence it is necessary to have 2^(l+1) - 1 locations allocated to represent the binary tree.
Figure 2.5 shows a binary tree and Figure 2.6 shows its one-dimensional array representation.
For a complete and full binary tree there is 100% utilization, and there is maximum wastage if the binary tree is right-skewed or left-skewed, where only l + 1 locations are utilized out of the 2^(l+1) - 1 allocated.
An important observation to be made here is that the organization of the data in the binary
tree decides the space utilization of the representation used.
5.4 INSERTION AND DELETION IN RED-BLACK TREES
Red-black trees are an evolution of binary search trees that aim to keep the tree
balanced without affecting the complexity of the primitive operations. This is done by
colouring each node in the tree with either red or black and preserving a set of properties
that guarantee that the deepest path in the tree is no longer than twice the shortest one.
Using these properties, we can show in two steps that a red-black tree which contains n nodes has a height of O(log n); thus all primitive operations on the tree will be O(log n), since their cost is a function of the tree height.
1. First, notice that for a red-black tree with height h, bh(root) is at least h/2, by the red-node property (each red node must have black children, so at least half the nodes on any root-to-leaf path are black).
2. The next step is to use the following lemma:
Lemma: A subtree rooted at node v has at least 2^bh(v) – 1 internal nodes
Proof by induction: The basis is when h(v) = 0, which means that v is a leaf node; therefore bh(v) = 0 and the subtree rooted at node v has 2^bh(v) - 1 = 2^0 - 1 = 0 internal nodes.
Inductive hypothesis: assume the claim holds for every node of height less than that of v, i.e., every such node u roots a subtree with at least 2^bh(u) - 1 internal nodes.
For any non-leaf node v (height > 0), the black height of each of its two children is at least bh(v) - 1; this holds with equality if the child is black, while if the child is red its black height equals bh(v). By applying the hypothesis we conclude that each child subtree has at least 2^[bh(v)-1] - 1 internal nodes; accordingly, the subtree rooted at v has at least
2^[bh(v)-1] - 1 + 2^[bh(v)-1] - 1 + 1 = 2^bh(v) - 1
internal nodes, which ends the proof.
By applying the lemma to the root node (with bh of at least h/2, as shown above) we get
n >= 2^(h/2) – 1
where n is the number of internal nodes of a red-black tree (the subtree rooted at the root).
Playing with the equation a little bit (n + 1 >= 2^(h/2), hence lg(n+1) >= h/2), we get h <= 2 lg(n+1), which guarantees the logarithmic height bound of red-black trees.
ROTATIONS
How does inserting or deleting nodes affect a red-black tree? To ensure that its colour scheme and properties are not violated, red-black trees employ a key operation known as rotation. A rotation is a binary operation between a parent node and one of its children that swaps the two nodes and modifies their pointers while preserving the inorder traversal of the tree (so that elements remain sorted).
There are two types of rotations: left rotation and right rotation. A left rotation swaps the parent node with its right child, while a right rotation swaps the parent node with its left child. Here are the steps involved in a left rotation (for a right rotation, just exchange "left" and "right" below):
Operations on red-black tree (insertion, deletion and retrieval)
Red-black tree operations are a modified version of BST operations, with the modifications
aiming to preserve the properties of red-black trees while keeping the operations
complexity a function of tree height.
RED-BLACK TREE DELETION:
The same concept behind red-black tree insertion applies here. Removing a node from a red-black tree makes use of the BST deletion procedure and then restores the red-black tree properties in O(log n). The total running time for the deletion process is therefore O(log n), which meets the complexity requirement for the primitive operations.
Retrieving a node from a red-black tree doesn't require more than the use of the BST search procedure, which takes O(log n) time.
1. Every node is either red or black.
2. The root is black.
3. Every leaf (NIL) is black.
4. If a node is red, then both its children are black.
5. For each node, all simple paths from the node to descendant leaves contain the same
number of black nodes.
How to insert a new node in a red-black tree
Pseudo code:
1. Check whether the tree is empty.
2. If the tree is empty, then insert the new node as the root node with colour black.
3. If the tree is not empty, then insert the new node as a leaf node with colour red.
4. If the parent of the new node is black, then exit from the operation.
5. If the parent of the new node is red, then check the colour of the parent node's sibling (the uncle of the new node).
6. If it is coloured black or is NULL, then make a suitable rotation and recolour.
7. If it is coloured red, then recolour, and also check the parent's parent of the new node: if it is not the root node, then recolour it. Repeat until the tree becomes a red-black tree.
Create Red Black Tree by Inserting following number.
8, 18, 5, 15, 17, 25
Insert (8)
First we check whether the tree is empty. Here the tree is empty, so the new node becomes the root node with colour black.
Insert (18)
Here the tree is not empty, so insert the new node as a leaf node with colour red. (An RBT is a self-balancing binary search tree, so we must also follow the rules of a binary search tree while inserting a node:)
1. The left subtree contains values lesser than the root node.
2. The right subtree contains values greater than the root node.
Insert (5)
Tree is not Empty so insert new Node with Red colour.
Insert (15)
Tree is not Empty so insert new Node with Red colour.
Here there are two consecutive red nodes (18 & 15). The colour of the parent's sibling (the uncle, 5) of the new node is red and the parent's parent is the root node, so we recolour to make it a red-black tree.
Insert (17)
Tree is not Empty so insert new Node with Red colour.
Here there are two consecutive red nodes (15 & 17). The parent's sibling of the new node is NULL, so we need a rotation; here we need an LR rotation and recolouring. After the left rotation, the node whose value is 17 becomes the parent node of 15.
After the right rotation and recolouring, the node whose value is 17 becomes the parent node of 15 and 18.
Insert (25)
The tree is not empty, so insert the new node with colour red.
Here there are two consecutive red nodes (18 & 25). The colour of the parent's sibling (the uncle, 15) of the new node is red, so we recolour and recheck upward.
After Recolouring, the tree is satisfying all the Red Black Tree properties.
5.6 SUMMARY
In this unit, we have introduced binary search trees. A glimpse of all the phases we should go through when we study binary search trees and their variations was given. In the study of binary search trees, we covered the process of insertion and deletion in red-black trees as well as the properties of red-black trees. All in all, the basic idea behind binary search trees is given in this unit.
5.7 KEYWORDS
3) self-balancing
4) connectedness
5.9 REFERENCES
1. Sartaj Sahni, 2000, Data structures, algorithms and applications in C++, McGraw Hill
international edition.
2. Horowitz and Sahni, 1983, Fundamentals of Data structure, Galgotia publications
3. Horowitz and Sahni, 1998, Fundamentals of Computer algorithm, Galgotia
publications.
4. Narsingh Deo, 1990, Graph theory with applications to engineering and computer
science, Prentice hall publications.
5. Tremblay and Sorenson, 1991, An introduction to data structures with applications,
McGraw Hill edition.
6. Dromey R. G., 1999, How to solve it by computers, Prentice Hall publications, India.
UNIT – 6
STRUCTURE
6.0 Objectives
6.1 Introduction
6.6 Summary
6.7 Keywords
6.9 Reference
6.0 OBJECTIVES
Define B-trees.
6.1 INTRODUCTION
Here we will see what B-Trees are. The B-Tree is a specialized m-way search tree, widely used for disk access. A B-tree of order m can have at most m-1 keys and m children. It can store a large number of elements in a single node, so the height is relatively small. This is one great advantage of B-Trees.
A B-Tree has all of the properties of an m-way tree, along with some other properties:
Every node in a B-Tree holds at most m children.
Every node, except the root and the leaves, holds at least ⌈m/2⌉ children.
The root node must have at least two children (unless it is a leaf).
All leaf nodes must be at the same level.
Example of B-Tree
B-Trees support the basic operations of searching, insertion and deletion. Within each node, the keys are kept sorted. The key at position i has a child before and after it: the child stored before it holds smaller values, and the child to its right holds bigger values.
Here we will see, how to perform the insertion into a B-Tree. Suppose we have a B-Tree like
below −
Example of B-Tree −
To insert an element, the idea is very similar to the BST, but we have to follow some rules. Each node has at most m children and m-1 keys. If we insert an element into a node, there are two situations. If the node has fewer than m-1 keys, then the new element is inserted directly into the node. If it already has m-1 keys, then we take all its keys together with the element to be inserted, find their median, and send the median value up to the parent of that node (applying the same criteria there); the remaining keys are split into two separate nodes from the left half and the right half.
Suppose we want to insert 79 into the tree. At first it is compared with the root; it is greater than 56, so we move to the rightmost subtree. Now it is less than 81, so we move to the left subtree. After that it is inserted into that node. Now there are three elements [66, 78, 79]. The median value is 78, so 78 goes up, the parent node becomes [78, 81], and the remaining keys are split into two nodes: one holds 66, and the other holds 79.
B-Tree after inserting 79.
Algorithm
BTreeInsert(root, key)
Input − the root of the tree, and the key to insert. We assume that the key is not already present in the tree.
x := read root
if x is full, then
    y := new node
    z := new node
    locate the middle object oi stored in x; move the objects to the left of oi into node y
    and move the objects to the right of oi into node z
    if x is an index node, then move the child pointers accordingly
    x->child[1] := address of y
    x->child[2] := address of z
end if
Deletion of B-Tree
Here we will see, how to perform the deletion of a node from B-Tree. Suppose we have a
BTree like below −
Example of B-Tree −
Deletion has two parts. At first we have to find the element; that strategy is like searching. For deletion, we have to take care of some rules: each node must retain at least ⌈m/2⌉ - 1 keys. So if we delete an element and a node is left with too few keys, the tree adjusts itself by borrowing from or merging with a sibling. If an entire node is emptied, its children are merged, and if a merged node becomes too large (size m), it is split into two parts and again the median value goes up.
Suppose we want to delete 46. Now there are two children. [45], and [47, 49], then they will
be merged, it will be [45, 47, 49], now 47 will go up.
Algorithm
BTreeDelete(x, key)
Input − the root of the tree, and the key to delete. We assume that the key is present in the tree.
if x is leaf, then
delete object with key ‘key’ from x
else if x does not contain the object with key ‘key’, then
locate the child x->child[i] whose key range is holding ‘key’
y := x->child[i]
if y has m/2 elements, then
If the sibling node z immediate to the left or right of y, has at least one more
object than m/2, add one more object by moving x->key[i] from x to y, and
move that last or first object from z to x. If y is non-leaf node, then last or first
child pointer in z is also moved to y
else
any immediate sibling of y has m/2 elements, merge y with immediate sibling
end if
BTreeDelete(y, key)
else
if the child y that precedes ‘key’ in x has at least m/2 + 1 objects, then
find the predecessor k of ‘key’ in the sub-tree rooted at y, then recursively delete k
from the sub-tree and replace key with k in x
else if y has m/2 elements, then
check the child z, which immediately follows ‘key’ in x
if z has at least m/2 + 1 elements, then
find the successor k of ‘key’ in the sub-tree rooted at z, recursively delete k
from the sub-tree, and replace key with k in x
else
both y and z have m/2 elements; merge them into one node, and push ‘key’
down to the new node as well. Recursively delete ‘key’ from this new node
end if
end if
end if
6.3 DEFINITION OF B-TREES
Just as AVL trees are balanced binary search trees, B-trees are balanced M-way search trees. By imposing a balance condition, the shape of an AVL tree is constrained in a way which guarantees that the search, insertion, and withdrawal operations are all O(log n), where n is the number of items in the tree. The shapes of B-trees are constrained for the same reasons and with the same effect.
Definition (B-Tree) A B-tree of order M is either the empty tree or it is an M-way search tree T with the following properties:
1. The root of T has at least two subtrees and at most M subtrees.
2. Each internal node of T (other than the root) has at least ⌈M/2⌉ subtrees and at most M subtrees.
3. All the external nodes of T are at the same level.
A B-tree of order one is clearly impossible. Hence, B-trees of order M are really only defined for M ≥ 2. However, in practice we expect that M is large, for the same reasons that motivate M-way search trees--large databases in secondary storage.
The figure below gives an example of a B-tree of order M = 3. By the definition, the root of a B-tree of order three has either two or three subtrees, and the internal nodes also have either two or three subtrees. Furthermore, all the external nodes, which are shown as small boxes in the figure, are at the same level.
It turns out that the balance conditions imposed by the definition are good in the same sense as the AVL balance conditions. That is, the balance condition guarantees that the height of a B-tree is logarithmic in the number of keys in the tree, and the time required for insertion and deletion operations remains proportional to the height of the tree even when balancing is required.
Theorem The minimum number of keys in a B-tree of order M ≥ 2 and height h ≥ 0 is 2⌈M/2⌉^h - 1.
Proof Clearly, a B-tree of height zero contains at least one key. Consider a B-tree of order M and height h > 0. By the definition, each internal node (except the root) has at least ⌈M/2⌉ subtrees. This implies that the minimum number of keys contained in such an internal node is ⌈M/2⌉ - 1. The minimum number of keys at level zero is 1; at level one, 2(⌈M/2⌉ - 1); at level two, 2⌈M/2⌉(⌈M/2⌉ - 1); at level three, 2⌈M/2⌉^2 (⌈M/2⌉ - 1); and so on.
Therefore the minimum number of keys in a B-tree of height h > 0 is given by the summation
1 + Σ (i = 1 to h) 2⌈M/2⌉^(i-1) (⌈M/2⌉ - 1) = 1 + 2(⌈M/2⌉^h - 1) = 2⌈M/2⌉^h - 1.
Thus, we have shown that a B-tree satisfies the first criterion of a good balance condition--the height of a B-tree with n internal nodes is O(log n). What remains to be shown is that the balance condition can be efficiently maintained during insertion and withdrawal operations. To see that it can, we need to look at an implementation.
You must start with the root node and then find the suitable leaf node to which the new key will be added, using the binary search tree rules. Now, you check whether the leaf node has an empty place and, if so, add the new key there. If the leaf node is full, you must split it and send the median key up to its parent node. You repeat this until all elements are placed in the B-tree.
Code:
#include<bits/stdc++.h>
using namespace std;

class btreenode
{
    int *key;        // an array of keys
    int t;           // minimum degree (defines the range of the number of keys)
    btreenode **c;   // a child pointers array
    int n;           // current number of keys
    bool leaf;       // true when this node is a leaf
public:
    btreenode(int t1, bool leaf1);
    // a function to insert a new key in the subtree rooted with a non-full node
    void insertnonfull(int k);
    void splitchild(int i, btreenode *y);
    void traverse();
    // we make btree a friend of btreenode so that btree can access
    // the private members of this class
    friend class btree;
};

// class btree
class btree
{
    btreenode *root;
    int t;
public:
    btree(int _t)
    {
        root = NULL;
        t = _t;
    }
    void traverse()
    {
        if (root != NULL)
            root->traverse();
    }
    void insert(int k);
};

btreenode::btreenode(int t1, bool leaf1)
{
    t = t1;
    leaf = leaf1;
    key = new int[2*t-1];
    c = new btreenode *[2*t];
    n = 0;
}

void btreenode::traverse()
{
    // there are n keys and n+1 children; traverse child i, print key i,
    // and finish with the last child
    int i;
    for (i = 0; i < n; i++)
    {
        if (leaf == false)
            c[i]->traverse();
        cout << " " << key[i];
    }
    if (leaf == false)
        c[i]->traverse();
}

void btree::insert(int k)
{
    // check if tree is empty
    if (root == NULL)
    {
        root = new btreenode(t, true);
        root->key[0] = k;
        root->n = 1;
    }
    else
    {
        if (root->n == 2*t-1)   // the root is full: the tree grows in height
        {
            btreenode *s = new btreenode(t, false);
            s->c[0] = root;
            s->splitchild(0, root);
            int i = 0;
            if (s->key[0] < k)
                i++;
            s->c[i]->insertnonfull(k);
            root = s;
        }
        else
            root->insertnonfull(k);
    }
}

void btreenode::insertnonfull(int k)
{
    int i = n-1;
    if (leaf == true)
    {
        // shift larger keys right and place k in its sorted position
        while (i >= 0 && key[i] > k)
        {
            key[i+1] = key[i];
            i--;
        }
        key[i+1] = k;
        n = n+1;
    }
    else // if this node is not a leaf
    {
        while (i >= 0 && key[i] > k)
            i--;
        if (c[i+1]->n == 2*t-1)
        {
            splitchild(i+1, c[i+1]);
            if (key[i+1] < k)
                i++;
        }
        c[i+1]->insertnonfull(k);
    }
}

// split the full child y of this node; i is the index of y in c[]
void btreenode::splitchild(int i, btreenode *y)
{
    btreenode *z = new btreenode(y->t, y->leaf);
    z->n = t - 1;
    for (int j = 0; j < t-1; j++)
        z->key[j] = y->key[j+t];
    if (y->leaf == false)
        for (int j = 0; j < t; j++)
            z->c[j] = y->c[j+t];
    y->n = t - 1;
    for (int j = n; j >= i+1; j--)
        c[j+1] = c[j];
    c[i+1] = z;
    for (int j = n-1; j >= i; j--)
        key[j+1] = key[j];
    key[i] = y->key[t-1];   // the median key moves up into this node
    n = n + 1;
}

int main()
{
    btree p(3);
    p.insert(15);
    p.insert(2);
    p.insert(25);
    p.insert(16);
    p.insert(32);
    p.insert(30);
    p.insert(6);
    p.insert(7);
    cout << "Traversal of the constructed tree is:";
    p.traverse();
    cout << endl;
    return 0;
}
To search, within each node you start from the leftmost key and compare it with the search key. If it doesn't match, you move to the next key in the node, or descend into the appropriate child, until the key is found or the search reaches a leaf.
Code:
// btreenode and btree are as in the insertion listing above; the only
// additions are a search member declared in each class:
//     btreenode *search(int k);                              // in btreenode
//     btreenode *search(int k)                               // in btree
//     { return (root == NULL) ? NULL : root->search(k); }

btreenode *btreenode::search(int k)
{
    // find the first key greater than or equal to k
    int i = 0;
    while (i < n && k > key[i])
        i++;
    // if the key is found in this node, return this node
    if (i < n && key[i] == k)
        return this;
    // if the key isn't found here and the node is a leaf, the key is absent
    if (leaf == true)
        return NULL;
    // go to the appropriate child
    return c[i]->search(k);
}

int main()
{
    btree t(3);
    t.insert(13);
    t.insert(8);
    t.insert(5);
    t.insert(6);
    t.insert(11);
    t.insert(3);
    t.insert(7);
    t.insert(27);
    cout << "Traversal of the constructed tree is:";
    t.traverse();
    cout << endl;
    int k = 6;
    if (t.search(k) != NULL)
        cout << k << " is present in the tree\n";
    else
        cout << k << " is not present in the tree\n";
    k = 15;
    if (t.search(k) != NULL)
        cout << k << " is present in the tree\n";
    else
        cout << k << " is not present in the tree\n";
    return 0;
}
Deletion from a B-tree is more complex than insertion because you can delete a key from
any node, not only a leaf, and you must rearrange the node's children when you delete a
key from an internal node.
Code:
// btreenode and btree are as in the two previous listings (constructor,
// insert, insertnonfull, splitchild, traverse and search); in addition,
// btreenode declares the removal helpers defined below, and btree declares
//     void remove(int k);

int btreenode::findkey(int k)
{
    // index of the first key that is >= k
    int idx = 0;
    while (idx < n && key[idx] < k)
        ++idx;
    return idx;
}

void btreenode::remove(int k)
{
    int idx = findkey(k);
    if (idx < n && key[idx] == k)   // the key is in this node
    {
        if (leaf)
            removefromleaf(idx);
        else
            removefromnonleaf(idx);
    }
    else
    {
        if (leaf)
        {
            cout << "The key " << k << " is not found in the tree\n";
            return;
        }
        // true when the key may lie in the subtree rooted at the last child
        bool flag = (idx == n);
        // if there are fewer than t keys in the child where the key is
        // expected to exist, fill that child first
        if (c[idx]->n < t)
            fill(idx);
        // we recurse on the (idx-1)th child if the last child has been merged;
        // if not, we go to the (idx)th child, which now contains at least t keys
        if (flag && idx > n)
            c[idx-1]->remove(k);
        else
            c[idx]->remove(k);
    }
    return;
}

void btreenode::removefromleaf(int idx)
{
    for (int j = idx+1; j < n; ++j)
        key[j-1] = key[j];
    n--;
    return;
}

void btreenode::removefromnonleaf(int idx)
{
    int k = key[idx];
    // in the subtree rooted at c[idx], look for k's predecessor 'pred'
    if (c[idx]->n >= t)
    {
        int pred = getpred(idx);
        key[idx] = pred;
        c[idx]->remove(pred);
    }
    // examine c[idx+1] if the child c[idx] contains fewer than t keys
    else if (c[idx+1]->n >= t)
    {
        int succ = getsucc(idx);
        key[idx] = succ;
        c[idx+1]->remove(succ);
    }
    else
    {
        merge(idx);
        c[idx]->remove(k);
    }
    return;
}

int btreenode::getpred(int idx)
{
    // rightmost key of the subtree rooted at c[idx]
    btreenode *cur = c[idx];
    while (!cur->leaf)
        cur = cur->c[cur->n];
    return cur->key[cur->n-1];
}

int btreenode::getsucc(int idx)
{
    // leftmost key of the subtree rooted at c[idx+1]
    btreenode *cur = c[idx+1];
    while (!cur->leaf)
        cur = cur->c[0];
    return cur->key[0];
}

void btreenode::fill(int idx)
{
    if (idx != 0 && c[idx-1]->n >= t)
        borrowfromprev(idx);
    else if (idx != n && c[idx+1]->n >= t)
        borrowfromnext(idx);
    else
    {
        if (idx != n)
            merge(idx);
        else
            merge(idx-1);
    }
    return;
}

// a function that takes a key from c[idx-1] and moves it into c[idx]
void btreenode::borrowfromprev(int idx)
{
    btreenode *child = c[idx];
    btreenode *sibling = c[idx-1];
    // key[idx-1] goes down from the parent into the child, and the
    // sibling's last key goes up into the parent
    for (int i = child->n-1; i >= 0; --i)
        child->key[i+1] = child->key[i];
    // if c[idx] isn't a leaf, move all of its child pointers one step ahead
    if (!child->leaf)
        for (int i = child->n; i >= 0; --i)
            child->c[i+1] = child->c[i];
    child->key[0] = key[idx-1];
    if (!child->leaf)
        child->c[0] = sibling->c[sibling->n];
    key[idx-1] = sibling->key[sibling->n-1];
    child->n += 1;
    sibling->n -= 1;
    return;
}

// a function that takes a key from c[idx+1] and stores it in c[idx]
void btreenode::borrowfromnext(int idx)
{
    btreenode *child = c[idx];
    btreenode *sibling = c[idx+1];
    child->key[child->n] = key[idx];
    if (!child->leaf)
        child->c[child->n + 1] = sibling->c[0];
    key[idx] = sibling->key[0];
    for (int j = 1; j < sibling->n; ++j)
        sibling->key[j-1] = sibling->key[j];
    if (!sibling->leaf)
        for (int j = 1; j <= sibling->n; ++j)
            sibling->c[j-1] = sibling->c[j];
    child->n++;
    sibling->n--;
    return;
}

// merge c[idx+1] into c[idx], pulling key[idx] down between them
void btreenode::merge(int idx)
{
    btreenode *child = c[idx];
    btreenode *sibling = c[idx+1];
    child->key[t-1] = key[idx];
    for (int j = 0; j < sibling->n; ++j)
        child->key[j+t] = sibling->key[j];
    if (!child->leaf)
        for (int j = 0; j <= sibling->n; ++j)
            child->c[j+t] = sibling->c[j];
    // move all keys following idx in the current node one step back
    for (int i = idx+1; i < n; ++i)
        key[i-1] = key[i];
    // move the child pointers after idx+1 one step back
    for (int j = idx+2; j <= n; ++j)
        c[j-1] = c[j];
    child->n += sibling->n + 1;
    n--;
    delete(sibling);
    return;
}

void btree::remove(int k)
{
    if (!root)
    {
        cout << "The tree is empty\n";
        return;
    }
    root->remove(k);
    // if the root is left with no keys, make its first child the new root
    if (root->n == 0)
    {
        btreenode *tmp = root;
        if (root->leaf)
            root = NULL;
        else
            root = root->c[0];
        delete tmp;
    }
    return;
}

int main()
{
    btree p(3);
    p.insert(1);
    p.insert(13);
    p.insert(7);
    p.insert(10);
    p.insert(11);
    p.insert(6);
    p.insert(14);
    p.insert(15);
    cout << "Traversal of the tree:";
    p.traverse();
    cout << endl;
    p.remove(6);
    cout << "Traversal of tree after deleting 6\n";
    p.traverse();
    cout << endl;
    p.remove(13);
    cout << "Traversal of tree after deleting 13\n";
    p.traverse();
    cout << endl;
    return 0;
}
6.6 SUMMARY
In this unit, we have introduced B-Trees: their definition, and the insertion and deletion operations. Through worked examples and code, we have seen step by step how the insertion and deletion processes work and what tree they produce.
6.7 KEYWORDS
B-Trees
Key
Complex
Children
6.8 QUESTIONS FOR SELF STUDY
1) What are B-Trees? Explain.
2) Briefly explain insertion and deletion in a B-tree.
3) How do you insert and delete an element in a B-tree?
4) Explain the basic operations on B-trees.
6.9 REFERENCES
1. Bayer R. The universal B-tree for multidimensional indexing: general concepts. In Proc. Int. Conf. on Worldwide Computing and Its Applications (WWCA), 1997, pp. 198–209.
2. Bayer R. and McCreight E.M. Organization and maintenance of large ordered indices. Acta Inf., 1, 1972.
3. Comer D. The ubiquitous B-tree. ACM Comput. Surv., 11(2), 1979.
4. Knuth D. The Art of Computer Programming, Vol. 3: Sorting and Searching. Addison-Wesley, MA, USA, 1973.
5. Robinson J. The K-D-B tree: a search structure for large multidimensional dynamic indexes. In Proc. ACM SIGMOD Int. Conf. on Management of Data, 1981, pp. 10–18.
6. Srinivasan V. and Carey M.J. Performance of B+ tree concurrency algorithms. VLDB J., 2(4):361–406, 1993.
UNIT – 7
FIBONACCI HEAPS
STRUCTURE
7.0 Objectives
7.1 Introduction
7.4 Mergeable-Heap Operations
7.5 Summary
7.6 Keywords
7.8 Reference
7.0 OBJECTIVES
7.1 INTRODUCTION
A Fibonacci heap is an unordered collection of rooted trees that obey the min-heap property:
the key of every node is greater than or equal to that of its parent. The roots of the trees are
linked to form a linked list, termed the root list. There is also a min pointer that keeps track
of the minimum element, so that the minimum can be retrieved in constant time. The children
of each node are maintained in a doubly linked list, termed the child list, so that insertion
and deletion at an arbitrary position in the list can be performed in constant time. Each node
has pointers to its left sibling, right sibling, parent, and one of its children. Each node also
records its degree (the number of children of the node), its mark, and its key (data). We shall
now see various Fibonacci heap operations.
A binary heap can be characterized as a binary tree with two constraints:
Completeness - A binary heap is a complete binary tree: every level except possibly the last
is full, and the elements of the last level are filled in from left to right.
Heapness - Every parent node is either greater than or equal to, or less than or equal to, each
of its children. If every parent is greater than its children, the heap is called a max heap;
otherwise it is called a min heap. Max heaps are used for heap sort and min heaps for
priority queues.
We shall consider a min heap and use an array implementation for it.
Basic Operations
The basic operations of a min heap are the following.
Insert - insert an element in a heap.
Get Minimum - get minimum element from the heap.
Remove Minimum - remove the minimum element from the heap
Insert Operation
Get Minimum
Return the first element of the array implementing the heap; this is the root.
int getMinimum(){
return intArray[0];
}
Remove Minimum
Whenever the minimum is to be removed, overwrite the root with the last element of the
array and reduce the size of the heap by 1. Then heap the element down while the heap
property is broken: compare the element with its children's values and swap them if
required.
void removeMin() {
intArray[0] = intArray[size - 1];
size--;
if (size > 0)
heapDown(0);
}
Like binomial heaps, Fibonacci heaps are collections of trees, and they are loosely based on
binomial heaps. Unlike binomial heaps, in which the trees are ordered, the trees within a
Fibonacci heap are rooted but unordered.
Each node x in a Fibonacci heap contains a pointer p[x] to its parent and a pointer child[x] to
any one of its children. The children of x are linked together in a circular doubly linked list
known as the child list of x. Each child y in a child list has pointers left[y] and right[y] to its
left and right siblings respectively. If node y is an only child, then left[y] = right[y] = y. The
order in which siblings appear in a child list is arbitrary.
Example of Fibonacci Heap
This Fibonacci heap H consists of five rooted trees and 16 nodes. The line with the arrow
head indicates the root list. The minimum node in the list is denoted by min[H], which here holds 4.
Asymptotically fast algorithms for problems such as computing minimum spanning trees and
finding single-source shortest paths make essential use of Fibonacci heaps.
Marked Nodes
An important part of the Fibonacci heap is how it marks nodes within the trees. The decrease-
key operation marks a node when one of its children is cut from it; this allows the heap to
track some history about each node. Essentially, the mark records that a node has already
lost one child, so that we know when it is about to lose a second.
When a second child is cut from its parent, the parent itself is moved to the root list. This
ensures that the structure of the Fibonacci heap does not stray too far from that of the binomial heap,
which is one of the properties that enables the data structure to achieve its amortised time
bounds.
Notation
n = number of nodes in heap.
Operations
The different operations supported by a Fibonacci heap, with their amortized costs, are:
Union - O(1)
Extract minimum - O(log n)
Decrease key - O(1)
Deletion - O(log n)
Find Minimum
Finding minimum is one of the most important operations regarding Fibonacci heaps. A
pointer to minimum node of the root list is always kept up to date.
Insertion
Insertion into a Fibonacci heap is similar to the insert operation of a binomial heap: a heap of
one element is created and the two heaps are merged with the merge function. The minimum
element pointer is updated if necessary, and the total number of nodes in the heap increases
by one.
To insert a node x into a Fibonacci heap H, the following algorithm is followed:
1. Create a new node x with the given key.
2. If H is empty then:
   make x the only node in the root list and set H(min) = x.
3. Else:
   insert x into the root list of H, and if key[x] < key[H(min)] then set H(min) = x.
Union
Union concatenates the root lists of two Fibonacci heaps and sets the minimum node to
whichever tree's minimum node is smaller. The union of two Fibonacci heaps Tree1 and Tree2
can be accomplished by the following algorithm:
1. Join the root lists of Tree1 and Tree2 to make a single Fibonacci heap H.
2. If Tree1(min) < Tree2(min) then:
H(min) = Tree1(min).
3. Else:
H(min) = Tree2(min).
Example:
Implementation
struct node
{
node* parent;
node* child;
node* left;
node* right;
int key;
};
// Insert a new node into the root list of the heap
void insertion(int val)
{
    struct node* new_node = new node;
    new_node->key = val;
    new_node->parent = NULL;
    new_node->child = NULL;
    new_node->left = new_node;
    new_node->right = new_node;
    if(min != NULL)
    {
        (min->left)->right = new_node;
        new_node->right = min;
        new_node->left = min->left;
        min->left = new_node;
        if(new_node->key < min->key){
            min = new_node;
        }
    }
    else
    {
        min = new_node;
    }
    count++;
}
// Display the root list of the heap
void display()
{
    node* ptr = min;
    if (ptr == NULL)
        cout << "The Heap is Empty" << endl;
    else {
        cout << "The root nodes of Heap are: " << endl;
        do {
            cout << ptr->key;
            ptr = ptr->right;
            if (ptr != min) {
                cout << "-->";
            }
        } while (ptr != min && ptr->right != NULL);
        cout << endl << "The heap has " << count << " nodes" << endl;
    }
}
// Print the minimum key of the heap
void find_min()
{
    cout << "min of heap is: " << min->key << endl;
}
Extract Min
It works by first making a root out of each of the minimum node’s children and removing the
minimum node from the root list. It then consolidates the root list by linking roots of equal
degree until at most one root remains of each degree.
It is one of the most important operations on Fibonacci heaps. Much of a Fibonacci
heap's speed advantage comes from the fact that it delays consolidating heaps after
operations until extract-min is called, whereas binomial heaps consolidate
immediately. Consolidation restores the property that no two roots have the same
degree: whenever two roots of the same degree are found, their trees are linked.
The following algorithm is used for extract-min in a Fibonacci heap:
1. Delete the minimum node, set the min pointer to the next node in the root list, and add
all subtrees (children) of the deleted node to the root list.
2. Traverse the root list and consolidate:
If the degrees of two roots are different, move the degree pointer to the next node.
If the degrees of two roots are the same, join the corresponding trees by a union (link) operation.
Decrease Key
Decrease key lowers the key of a node. If the min-heap property becomes violated, the node
is cut from its tree and joins the root list as its own tree. The parent of the node is then cut if
it is marked; this continues for each ancestor until a parent that is not marked is encountered,
which is then marked. The pointer to the minimum node is updated if the node's new value is
less than the current minimum.
Algorithm
1. Decrease the value of the node 'x' to the new chosen value.
2. If the min-heap property is not violated, update the min pointer if necessary and stop.
3. Else, cut off the link between 'x' and its parent p[x].
4. Add the tree rooted at 'x' to the root list, updating the min pointer if necessary.
5. If p[x] is unmarked, mark it and stop.
6. Else, add p[x] to the root list, updating the min pointer if necessary, and repeat steps 5
and 6, taking p[p[x]] as the parent under consideration.
Example:
Deletion
Delete is performed by calling decrease key to reduce the node's key to negative infinity,
which pulls the node to the top of its tree. Extract minimum is then called to remove it from
the heap.
Algorithm
1. Decrease the value of the node to be deleted, 'x', to a minimum (negative infinity) by the
decrease-key operation.
2. The min-heap property brings 'x' into the root list as the minimum.
3. Apply extract-min to remove 'x' from the heap.
Implementation
// Insert a new node into the heap
void insertion(int val)
{
    struct node* new_node = new node;
    new_node->key = val;
    new_node->degree = 0;
    new_node->mark = 'W';
    new_node->parent = NULL;
    new_node->child = NULL;
    new_node->left = new_node;
    new_node->right = new_node;
    if (mini != NULL) {
        (mini->left)->right = new_node;
        new_node->right = mini;
        new_node->left = mini->left;
        mini->left = new_node;
        if (new_node->key < mini->key)
            mini = new_node;
    }
    else {
        mini = new_node;
    }
    no_of_nodes++;
}
// Link ptr2 as a child of ptr1 (both are roots; key[ptr1] <= key[ptr2])
void Fibonnaci_link(struct node* ptr2, struct node* ptr1)
{
    (ptr2->left)->right = ptr2->right;
    (ptr2->right)->left = ptr2->left;
    if (ptr1->right == ptr1)
        mini = ptr1;
    ptr2->left = ptr2;
    ptr2->right = ptr2;
    ptr2->parent = ptr1;
    if (ptr1->child == NULL)
        ptr1->child = ptr2;
    ptr2->right = ptr1->child;
    ptr2->left = (ptr1->child)->left;
    ((ptr1->child)->left)->right = ptr2;
    (ptr1->child)->left = ptr2;
    if (ptr2->key < (ptr1->child)->key)
        ptr1->child = ptr2;
    ptr1->degree++;
}
// Consolidate the root list so that no two roots have equal degree
void Consolidate()
{
    int temp1;
    float temp2 = (log(no_of_nodes)) / (log(2));
    int temp3 = temp2;
    struct node* arr[temp3 + 1]; // degrees range from 0 to temp3
    for (int i = 0; i <= temp3; i++)
arr[i] = NULL;
node* ptr1 = mini;
node* ptr2;
node* ptr3;
node* ptr4 = ptr1;
do {
ptr4 = ptr4->right;
temp1 = ptr1->degree;
while (arr[temp1] != NULL) {
ptr2 = arr[temp1];
if (ptr1->key > ptr2->key) {
ptr3 = ptr1;
ptr1 = ptr2;
ptr2 = ptr3;
}
if (ptr2 == mini)
mini = ptr1;
Fibonnaci_link(ptr2, ptr1);
if (ptr1->right == ptr1)
mini = ptr1;
arr[temp1] = NULL;
temp1++;
}
arr[temp1] = ptr1;
ptr1 = ptr1->right;
} while (ptr1 != mini);
mini = NULL;
for (int j = 0; j <= temp3; j++) {
if (arr[j] != NULL) {
arr[j]->left = arr[j];
arr[j]->right = arr[j];
if (mini != NULL) {
(mini->left)->right = arr[j];
arr[j]->right = mini;
arr[j]->left = mini->left;
mini->left = arr[j];
if (arr[j]->key < mini->key)
mini = arr[j];
}
else {
mini = arr[j];
}
if (mini == NULL)
mini = arr[j];
else if (arr[j]->key < mini->key)
mini = arr[j];
}
}
}
// Extract the minimum node from the heap
void Extract_min()
{
    if (mini == NULL)
cout << "The heap is empty" << endl;
else {
node* temp = mini;
node* pntr;
pntr = temp;
node* x = NULL;
if (temp->child != NULL) {
x = temp->child;
do {
pntr = x->right;
(mini->left)->right = x;
x->right = mini;
x->left = mini->left;
mini->left = x;
if (x->key < mini->key)
mini = x;
x->parent = NULL;
x = pntr;
} while (pntr != temp->child);
}
(temp->left)->right = temp->right;
(temp->right)->left = temp->left;
mini = temp->right;
if (temp == temp->right && temp->child == NULL)
mini = NULL;
else {
mini = temp->right;
Consolidate();
}
no_of_nodes--;
}
}
// Cut the node 'found' from its parent 'temp' and add it to the root list
void Cut(struct node* found, struct node* temp)
{
    if (found == found->right)
temp->child = NULL;
(found->left)->right = found->right;
(found->right)->left = found->left;
if (found == temp->child)
temp->child = found->right;
temp->degree = temp->degree - 1;
found->right = found;
found->left = found;
(mini->left)->right = found;
found->right = mini;
found->left = mini->left;
mini->left = found;
found->parent = NULL;
found->mark = 'B';
}
// Decrease the key of the node 'found' to val
void Decrease_key(struct node* found, int val)
{
    if (mini == NULL) { cout << "The Heap is Empty" << endl; return; }
    if (found == NULL) { cout << "Node not found in the Heap" << endl; return; }
    found->key = val;
struct node* temp = found->parent;
if (temp != NULL && found->key < temp->key) {
Cut(found, temp);
Cascase_cut(temp);
}
if (found->key < mini->key)
mini = found;
}
// Delete the node with key val: its key is first decreased to 0
// (via the decrease-key operation, not shown here), then extracted
void Deletion(int val)
{
    if (mini == NULL)
        cout << "The heap is empty" << endl;
    else {
        // delete the minimum value node, whose key is now 0
        Extract_min();
        cout << "Key Deleted" << endl;
    }
}
// Display the root list of the heap
void display()
{
    node* ptr = mini;
    if (ptr == NULL)
        cout << "The Heap is Empty" << endl;
    else {
        cout << "The root nodes of Heap are: " << endl;
        do {
            cout << ptr->key;
            ptr = ptr->right;
            if (ptr != mini) {
                cout << "-->";
            }
        } while (ptr != mini && ptr->right != NULL);
        cout << endl << "The heap has " << no_of_nodes << " nodes" << endl << endl;
    }
}
Complexity and Comparison
Comparison of time complexities for the various operations:
To determine the amortized cost of FIB-HEAP-INSERT, let H be the input Fibonacci heap and
H' be the resulting Fibonacci heap. Then t(H') = t(H) + 1 and m(H') = m(H), and the increase in
potential is ((t(H) + 1) + 2m(H)) - (t(H) + 2m(H)) = 1. Since the actual cost is O(1), the amortized
cost is O(1) + 1 = O(1).
The minimum node of a Fibonacci heap H is given by the pointer H.min, so we can find the
minimum node in O(1) actual time. Because the potential of H does not change, the amortized
cost of this operation is equal to its O(1) actual cost.
7.4 MERGEABLE-HEAP OPERATIONS
A mergeable heap is a data structure that stores a collection of keys and supports the
following operations.
• Insert: Insert a new key into a heap. This operation can also be used to create a new heap
containing just one key.
• FindMin: Return the smallest key in a heap.
• DeleteMin: Remove the smallest key from a heap.
• Merge: Merge two heaps into one. The new heap contains all the keys that used to be in
the old heaps, and the old heaps are (possibly) destroyed.
If we never had to use DeleteMin, mergeable heaps would be completely trivial. Each "heap"
just stores the single record (if any) with the smallest key. Inserts and Merges require only
one comparison to decide which record to keep, so they take constant time. FindMin
obviously takes constant time as well.
If we need DeleteMin, but we don't care how long it takes, we can still implement mergeable
heaps so that Inserts, Merges, and FindMins take constant time. We store the records in a
circular doubly-linked list, and keep a pointer to the minimum key. Now deleting the
minimum key takes Θ(n) time, since we have to scan the linked list to find the new smallest key.
Here we describe a data structure called a Fibonacci heap that supports Inserts, Merges, and
FindMins in constant time, even in the worst case, and also handles DeleteMin in O(log n)
amortized time. That means that any sequence of n Inserts, m Merges, f FindMins, and d
DeleteMins takes O(n + m + f + d log n) time.
7.5 SUMMARY
In this unit, we have discussed Fibonacci heaps and their structure, and considered a few
examples. We have also learnt the various mergeable-heap operations.
7.6 KEYWORDS
Heaps
FindMin
DeleteMin
Amortized
7.8 REFERENCES
[1] Gerth Stølting Brodal. Fast meldable priority queues. In Proc. 4th International Workshop
on Algorithms and Data Structures (WADS), 1995.
[2] Gerth Stølting Brodal. Worst-case efficient priority queues. In Proc. 7th Annual ACM-SIAM
Symposium on Discrete Algorithms (SODA), 1996.
[3] Svante Carlsson, J. Ian Munro, and Patricio V. Poblete. An implicit binomial queue with
constant insertion time. In Proc. 1st Scandinavian Workshop on Algorithm Theory (SWAT), 1988.
[4] Timothy M. Chan. Quake heaps: a simple alternative to Fibonacci heaps. Manuscript, 2009.
[5] Clark Allan Crane. Linear lists and priority queues as balanced binary trees. PhD thesis,
Stanford University, 1972.
UNIT – 8
FIBONACCI HEAPS: DECREASING A KEY AND DELETING A NODE
STRUCTURE
8.0 Objectives
8.1 Introduction
8.4 Summary
8.5 Keywords
8.6 Questions for Self Study
8.7 References
8.0 OBJECTIVES
8.1 INTRODUCTION
We saw how binomial heaps support in O(lg n) worst-case time the mergeable-heap
operations INSERT, MINIMUM, EXTRACT-MIN, and UNION, plus the operations DECREASE-
KEY and DELETE. In this unit, we shall examine Fibonacci heaps, which support the same
operations but have the advantage that operations that do not involve deleting an element
run in O(1) amortized time. From a theoretical standpoint, Fibonacci heaps are especially
desirable when the number of EXTRACT-MIN and DELETE operations is small relative to the
number of other operations performed. This situation arises in many applications. For
example, some algorithms for graph problems may call DECREASE-KEY once per edge. For
dense graphs, which have many edges, the O(1) amortized time of each call of DECREASE-
KEY adds up to a big improvement over the Θ(lg n) worst-case time of binary or binomial
heaps. The asymptotically fastest algorithms to date for problems such as computing
minimum spanning trees and finding single-source shortest paths make essential use of
Fibonacci heaps.
From a practical point of view, however, the constant factors and programming complexity
of Fibonacci heaps make them less desirable than ordinary binary (or k-ary) heaps for most
applications. Thus, Fibonacci heaps are predominantly of theoretical interest. If a much
simpler data structure with the same amortized time bounds as Fibonacci heaps were
developed, it would be of great practical use as well. Like a binomial heap, a Fibonacci heap
is a collection of trees. Fibonacci heaps, in fact, are loosely based on binomial heaps. If
neither DECREASE-KEY nor DELETE is ever invoked on a Fibonacci heap, each tree in the heap
is like a binomial tree. Fibonacci heaps differ from binomial heaps, however, in that they have
a more relaxed structure, allowing for improved asymptotic time bounds. Work that
maintains the structure can be delayed until it is convenient to perform.
In this section, we show how to decrease the key of a node in a Fibonacci heap in O(1)
amortized time and how to delete any node from an n-node Fibonacci heap in O(D(n))
amortized time. These operations do not preserve the property that all trees in the Fibonacci
heap are unordered binomial trees. They are close enough, however, that we can bound the
maximum degree D(n) by O(lg n). Proving this bound will imply that FIB-HEAP-EXTRACT-
MIN and FIB-HEAP-DELETE run in O(lg n) amortized time.
FIB-HEAP-DECREASE-KEY(H, x, k)
1  if k > key[x]
2      then error "new key is greater than current key"
3  key[x] ← k
4  y ← p[x]
5  if y ≠ NIL and key[x] < key[y]
6      then CUT(H, x, y)
7           CASCADING-CUT(H, y)
8  if key[x] < key[min[H]]
9      then min[H] ← x
CUT(H, x, y)
1  remove x from the child list of y, decrementing degree[y]
2  add x to the root list of H
3  p[x] ← NIL
4  mark[x] ← FALSE
CASCADING-CUT(H, y)
1  z ← p[y]
2  if z ≠ NIL
3      then if mark[y] = FALSE
4              then mark[y] ← TRUE
5              else CUT(H, y, z)
6                   CASCADING-CUT(H, z)
The FIB-HEAP-DECREASE-KEY procedure works as follows. Lines 1-3 ensure that the new key
is no greater than the current key of x and then assign the new key to x. If x is a root or
if key[x] ≥ key[y], where y is x's parent, then no structural changes need occur, since heap
order has not been violated. Lines 4-5 test for this condition.
If heap order has been violated, many changes may occur. We start by cutting x in line 6.
The CUT procedure "cuts" the link between x and its parent y, making x a root.
We use the mark fields to obtain the desired time bounds. They help to produce the following
effect. Suppose that x is a node that has undergone the following history:
As soon as the second child has been lost, x is cut from its parent, making it a new root. The
field mark[x] is TRUE if steps 1 and 2 have occurred and one child of x has been cut.
The CUT procedure, therefore, clears mark[x] in line 4, since it performs step 1. (We can now
see why line 3 of FIB-HEAP-LINK clears mark[y]: node y is being linked to another node, and
so step 2 is being performed. The next time a child of y is cut, mark[y] will be set to TRUE.)
We are not yet done, because x might be the second child cut from its parent y since the time
that y was linked to another node. Therefore, line 7 of FIB-HEAP-DECREASE-KEY performs
a cascading-cut operation on y. If y is a root, then the test in line 2 of CASCADING-CUT causes
the procedure to just return. If y is unmarked, the procedure marks it in line 4, since its first
child has just been cut, and returns. If y is marked, however, it has just lost its second child; y is
cut in line 5, and CASCADING-CUT calls itself recursively in line 6 on y's parent z.
The CASCADING-CUT procedure recurses its way up the tree until either a root or an
unmarked node is found.
Once all the cascading cuts have occurred, lines 8-9 of FIB-HEAP-DECREASE-KEY finish up by
updating min[H] if necessary.
Figure 8.2 shows the execution of two calls of FIB-HEAP-DECREASE-KEY, starting with the
Fibonacci heap shown in Figure 8.1(a). The first call, shown in Figure 8.2(b), involves no
cascading cuts. The second call, shown in Figures 8.2(c)-(e), invokes two cascading cuts.
We shall now show that the amortized cost of FIB-HEAP-DECREASE-KEY is only O(1). We start
by determining its actual cost. The FIB-HEAP-DECREASE-KEY procedure takes O(1) time, plus
the time to perform the cascading cuts. Suppose that CASCADING-CUT is recursively
called c times from a given invocation of FIB-HEAP-DECREASE-KEY. Each call of CASCADING-
CUT takes O(1) time exclusive of recursive calls. Thus, the actual cost of FIB-HEAP-DECREASE-
KEY, including all recursive calls, is O(c).
We next compute the change in potential. Let H denote the Fibonacci heap just prior to
the FIB-HEAP-DECREASE-KEY operation. Each recursive call of CASCADING-CUT, except for
the last one, cuts a marked node and clears the mark bit. Afterward, there are t(H) + c trees
(the original t(H) trees, c - 1 trees produced by cascading cuts, and the tree rooted at x) and at
most m(H) - c + 2 marked nodes (c - 1 were unmarked by cascading cuts and the last call
of CASCADING-CUT may have marked a node). The change in potential is therefore at most
((t(H) + c) + 2(m(H) - c + 2)) - (t(H) + 2m(H)) = 4 - c.
Thus, the amortized cost of FIB-HEAP-DECREASE-KEY is at most O(c) + 4 - c = O(1),
since we can scale up the units of potential to dominate the constant hidden in O(c).
You can now see why the potential function was defined to include a term that is twice the
number of marked nodes. When a marked node y is cut by a cascading cut, its mark bit is
cleared, so the potential is reduced by 2. One unit of potential pays for the cut and the clearing
of the mark bit, and the other unit compensates for the unit increase in potential due to
node y becoming a root.
Figure 8.2 Two calls of FIB-HEAP-DECREASE-KEY. (a) The initial Fibonacci heap. (b) The node
with key 46 has its key decreased to 15. The node becomes a root, and its parent (with key
24), which had previously been unmarked, becomes marked. (c)-(e) The node with key 35 has
its key decreased to 5. In part (c), the node, now with key 5, becomes a root. Its parent, with
key 26, is marked, so a cascading cut occurs. The node with key 26 is cut from its parent and
made an unmarked root in (d). Another cascading cut occurs, since the node with key 24 is
marked as well. This node is cut from its parent and made an unmarked root in part (e). The
cascading cuts stop at this point, since the node with key 7 is a root. (Even if this node were
not a root, the cascading cuts would stop, since it is unmarked.) The result of the FIB-HEAP-
DECREASE-KEY operation is shown in part (e), with min[H] pointing to the new minimum node.
It is easy to delete a node from an n-node Fibonacci heap in O(D(n)) amortized time, as is done
by the following pseudocode. We assume that there is no key value of -∞ currently in the
Fibonacci heap.
FIB-HEAP-DELETE(H, x)
1  FIB-HEAP-DECREASE-KEY(H, x, -∞)
2  FIB-HEAP-EXTRACT-MIN(H)
To prove that the amortized time of FIB-HEAP-EXTRACT-MIN and FIB-HEAP-DELETE is O(lg n),
we must show that the upper bound D(n) on the degree of any node of an n-node Fibonacci
heap is O(lg n). By Exercise 8-1.2-3, when all trees in the Fibonacci heap are unordered
binomial trees, D(n) = ⌊lg n⌋. The cuts that occur in FIB-HEAP-DECREASE-KEY, however, may
cause trees within the Fibonacci heap to violate the unordered binomial tree properties. In
this section, we shall show that because we cut a node from its parent as soon as it loses two
children, D(n) is O(lg n). In particular, we shall show that D(n) ≤ ⌊log_φ n⌋, where
φ = (1 + √5)/2 is the golden ratio.
The key to the analysis is as follows. For each node x within a Fibonacci heap, define size(x) to
be the number of nodes, including x itself, in the subtree rooted at x. (Note that x need not
be in the root list--it can be any node at all.) We shall show that size(x) is exponential
in degree[x]. Bear in mind that degree[x] is always maintained as an accurate count of the
degree of x.
Lemma 8.3
Let x be any node in a Fibonacci heap, and suppose that degree[x] = k. Let y1, y2, ..., yk denote
the children of x in the order in which they were linked to x, from the earliest to the latest.
Then, degree[y1] ≥ 0 and degree[yi] ≥ i - 2 for i = 2, 3, ..., k.
Proof Obviously, degree[y1] ≥ 0. For i ≥ 2, we note that when yi was linked to x, all of y1,
y2, ..., yi-1 were children of x, so we must have had degree[x] ≥ i - 1. Node yi is linked to x
only if degree[x] = degree[yi], so we must have also had degree[yi] ≥ i - 1 at that time. Since
then, node yi has lost at most one child, since it would have been cut from x if it had lost two
children. We conclude that degree[yi] ≥ i - 2.
We finally come to the part of the analysis that explains the name "Fibonacci heaps." Recall
from Section 2.2 that for k = 0, 1, 2, ..., the kth Fibonacci number Fk is defined by the
recurrence F0 = 0, F1 = 1, and Fk = Fk-1 + Fk-2 for k ≥ 2.
Lemma 8.4
For all integers k ≥ 0, Fk+2 = 1 + (F0 + F1 + ... + Fk).
Proof The proof is by induction on k. The base case is F2 = 1 = 1 + F0. We now assume the
inductive hypothesis that Fk+1 = 1 + (F0 + ... + Fk-1), and we have
Fk+2 = Fk + Fk+1 = Fk + 1 + (F0 + ... + Fk-1) = 1 + (F0 + ... + Fk).
The following lemma and its corollary complete the analysis. They use the inequality (proved
in Exercise 2.2-8)
Fk+2 ≥ φ^k,
where φ = (1 + √5)/2 ≈ 1.618 is the golden ratio.
Lemma 8.5
Let x be any node in a Fibonacci heap, and let k = degree[x]. Then size(x) ≥ Fk+2 ≥ φ^k.
Proof Let sk denote the minimum possible value of size(z) over all nodes z such that degree[z]
= k. Trivially, s0 = 1, s1 = 2, and s2 = 3. The number sk is at most size(x). As in Lemma 8.3,
let y1, y2, ..., yk denote the children of x in the order in which they were linked to x. To
compute a lower bound on size(x), we count one for x itself and one for the first child y1 (for
which size(y1) ≥ 1) and then apply Lemma 8.3 for the other children. We thus have
size(x) ≥ sk ≥ 2 + (s0 + s1 + ... + sk-2).
We now show by induction on k that sk ≥ Fk+2 for all nonnegative integers k. The bases, for
k = 0 and k = 1, are trivial. For the inductive step, we assume that k ≥ 2 and that si ≥ Fi+2
for i = 0, 1, ..., k - 1. We have
sk ≥ 2 + (s0 + s1 + ... + sk-2) ≥ 2 + (F2 + F3 + ... + Fk) = 1 + (F0 + F1 + ... + Fk) = Fk+2.
The last equality follows from Lemma 8.4.
Thus, we have shown that size(x) ≥ sk ≥ Fk+2 ≥ φ^k.
Corollary 8.6
The maximum degree D(n) of any node in an n-node Fibonacci heap is O(lg n).
Proof Let x be any node in an n-node Fibonacci heap, and let k = degree[x]. By Lemma 8.5,
we have n ≥ size(x) ≥ φ^k. Taking base-φ logarithms yields k ≤ log_φ n. (In fact, because
k is an integer, k ≤ ⌊log_φ n⌋.) The maximum degree D(n) of any node is thus O(lg n).
8.4 SUMMARY
In this unit we have learnt some advanced concepts: decreasing a key, deleting a node, and
bounding the maximum degree. In this block we have covered hierarchical data structures
and their operations.
8.5 KEYWORDS
Node bounding
Key
Degree
Equality
Lemma
8.6 QUESTIONS FOR SELF STUDY
8.7 REFERENCES
[1] Gerth Stølting Brodal. Fast meldable priority queues. In Proc. 4th International Workshop
on Algorithms and Data Structures (WADS), 1995.
[2] Gerth Stølting Brodal. Worst-case efficient priority queues. In Proc. 7th Annual ACM-SIAM
Symposium on Discrete Algorithms (SODA), 1996.
[3] Svante Carlsson, J. Ian Munro, and Patricio V. Poblete. An implicit binomial queue with
constant insertion time. In Proc. 1st Scandinavian Workshop on Algorithm Theory (SWAT), 1988.
UNIT – 9
GRAPHS
STRUCTURE
9.0 Objectives
9.1 Introduction
9.5 Summary
9.6 Keywords
9.8 Reference
9.0 OBJECTIVES
9.1 INTRODUCTION
We have defined non-linear data structures, and we mentioned that trees and graphs are
examples of non-linear data structures. To recall, in non-linear data structures, unlike linear
data structures, an element is permitted to have any number of adjacent elements.
Definition 1: A graph G = (V, E) is a finite nonempty set V of objects called vertices together
with a (possibly empty) set E of unordered pairs of distinct vertices of G called edges.
Definition 2: A digraph G = (V, E) is a finite nonempty set V of vertices together with a (possibly
empty) set E of ordered pairs of vertices of G called arcs.
An arc that begins and ends at the same vertex u is called a loop. We usually (but not always)
disallow loops in our digraphs. By being defined as a set, E does not contain duplicate (or
multiple) edges/arcs between the same two vertices. For a given graph (or digraph) G, we also
denote the set of vertices by V(G) and the set of edges (or arcs) by E(G) to lessen any
ambiguity.
Definition 3: The order of a graph (digraph) G = (V, E) is |V|, sometimes denoted by |G|, and
the size of this graph is |E|.
Sometimes we view a graph as a digraph where every unordered edge (u, v) is replaced by
two directed arcs (u, v) and (v, u). In this case, the size of a graph is half the size of the
corresponding digraph.
Definition 4: A walk in a graph (digraph) G is a sequence of vertices v0,v1…vn such that for all
0 ≤ i < n, (vi,vi+1) is an edge (arc) in G. The length of the walk v0,v1…vn is the number n. A path
is a walk in which no vertex is repeated. A cycle is a walk (of length at least three for graphs)
in which v0 = vn and no other vertex is repeated; sometimes, it is understood, we omit vn from
the sequence.
In the next example, we display a graph G1 and a digraph G2, both of order 5. The size of the
graph G1 is 6, where E(G1) = {(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)}, while the size of the
digraph G2 is 7, where E(G2) = {(0, 2), (1, 0), (1, 2), (1, 3), (3, 1), (3, 4), (4, 2)}.
Example 1: For the graph G1 of Figure 1.1, the following sequences of vertices are classified
as being walks, paths, or cycles.
Sequence   Walk?  Path?  Cycle?
0, 3, 2    No     No     No
0, 1, 0    Yes    No     No
Example 2: For the digraph G2 of Figure 1.1, the following sequences of vertices are classified
as being walks, paths, or cycles.
Sequence        Walk?  Path?  Cycle?
0, 1, 2, 3, 4   No     No     No
0, 2, 4         No     No     No
3, 1, 3, 1, 0   Yes    No     No
Definition 5: A graph G is connected if there is a path between all pairs of vertices u and v of
V(G). A digraph G is strongly connected if there is a path from vertex u to vertex v for all pairs
u and v in V(G).
In Figure 1.1, the graph G1 is connected, but the digraph G2 is not strongly connected because
there are no arcs leaving vertex 2. However, the underlying graph of G2 is connected.
Definition 6: In a graph, the degree of a vertex v, denoted by deg(v), is the number of edges
incident to v. For digraphs, the out-degree of a vertex v is the number of arcs {(v, x) Є E | x Є
V} incident from v (leaving v), and the in-degree of vertex v is the number of arcs {(x, v) Є E |
x Є V} incident to v (entering v).
For a graph, the in-degree and out-degree are the same as the degree. For our graph G1, we
have deg(0) = 2, deg(1) = 2, deg(2) = 4, deg(3) = 2 and deg(4) = 2. We may concisely write this
as a degree sequence (2, 2, 4, 2, 2) if there is a natural ordering (e.g., 0, 1, 2, 3, 4) of the
vertices. The in-degree sequence and out-degree sequence of the digraph G2 are (1, 1, 3, 1, 1)
and (1, 3, 0, 2, 1), respectively. The degree of a vertex of a digraph is sometimes defined as
the sum of its in-degree and out-degree. Using this definition, a degree sequence of G2 would
be (2, 4, 3, 3, 2).
Definition 7: A weighted graph is a graph whose edges have weights. These weights can be
thought of as the cost involved in traversing the edge. Figure 1.2 shows a weighted
graph.
Figure 1.2 A weighted graph
Definition 8: If the removal of an edge makes a graph disconnected, then that edge is called a
cut edge or bridge.
Definition 9: If the removal of a vertex makes a graph disconnected, then that vertex is called a
cut vertex.
Definition 10: A connected graph without a cycle in it is called a tree. The pendant vertices of
a tree are called leaves.
Definition 11: A graph without self-loops and parallel edges is called a simple graph.
Definition 12: A graph which can be traced without repeating any edge is called an Eulerian
graph. If all vertices of a graph happen to be of even degree, then the graph is Eulerian.
Definition 13: If exactly two vertices of a graph are of odd degree and all other vertices are of
even degree, then it is called an open Eulerian graph. In an open Eulerian graph the starting
and ending points of the trace must be the odd-degree vertices.
Definition 14: A graph containing a cycle that visits every vertex exactly once (the edges need
not all be used) is called a Hamiltonian graph.
Definition 15: The total degree of a graph is twice the number of edges. That is, total degree
= 2 * |E|.
It follows that the sum of the degrees of all even-degree vertices plus the sum of the degrees
of all odd-degree vertices is even.
The sequential or matrix representations of graphs have the following methods:
A graph with n nodes can be represented as an n x n adjacency matrix A such that the element
Aij = 1 if there is an edge from vi to vj,
Aij = 0 otherwise.
Note that the number of 1s in a row represents the out-degree of a node. In the case of an
undirected graph, the number of 1s in a row represents the degree of the node. The total
number of 1s in the matrix represents the number of arcs of a digraph; for an undirected
graph it is twice the number of edges.
represents number of edges. Figure 1.3(a) shows a graph and Figure 1.3(b) shows its
adjacency matrix.
Figure 1.4(a) shows a digraph and Figure 1.4(b) shows its adjacency matrix.
Let G be a graph with n vertices and e edges. Define an n x e matrix M = [mij], whose n rows
correspond to the n vertices and whose e columns correspond to the e edges, as
mij = 1 if edge ej is incident upon vertex vi,
mij = 0 otherwise.
Matrix M is known as the incidence matrix representation of the graph G. Figure 1.5(a) shows
a graph and Figure 1.5(b) shows its incidence matrix.
e1 e2 e3 e4 e5 e6 e7
v1 1 0 0 0 1 0 0
v2 1 1 0 0 0 1 1
v3 0 1 1 0 0 0 0
v4 0 0 1 1 0 0 1
v5 0 0 0 1 1 1 0
The incidence matrix contains only two elements, 0 and 1. Such a matrix is called a binary
matrix or a (0, 1)-matrix.
The following observations about the incidence matrix can readily be made:
1. Since every edge is incident on exactly two vertices, each column of an incidence
matrix has exactly two 1’s.
2. The number of 1’s in each row equals the degree of the corresponding vertex.
3. A row with all 0’s, therefore, represents an isolated vertex.
Figure 1.6(a) Undirected graph Figure 1.6(b) Linked representation of a graph
The depth first search algorithm starts by visiting an arbitrary node of the graph and marking it as visited. After visiting the current node, we pick one of its unvisited adjacent nodes as the next node for traversal, push the current node onto a stack, and move to that adjacent node. This continues until the current node has no unvisited neighbours. If unvisited nodes still remain, backtracking (popping the stack) is used until all the nodes are visited. Thus depth first search uses a stack as its storage structure, holding the information about nodes that is needed during backtracking.
Before seeing how to search for a node in a graph using depth first search, we need to understand how depth first search traverses a graph. Consider a graph G as shown in Figure 1(a). The traversal starts with node 1 (Figure 1(b)): mark the node as traversed (gray shading indicates a traversed node) and push the node number 1 onto the stack. As it has only one adjacent node, 4, we move to node number 4. Mark node number 4 (Figure 1(c)) and push 4 onto the stack. Node number 4 has two adjacent nodes, namely 2 and 5.
Figure 1: Traversal of a graph using depth first search algorithm
Select one node arbitrarily (for implementation purposes we can select the node with the smallest number) and move to that node; in this case we move to node 2 and push the node number 2 onto the stack. Similarly we move to node 5 from node 2, pushing 5 onto the stack, and then move to node 3 from node 5 and push node 3 onto the stack (Figures 1(d)-(j)). Figure 1(k) shows the elements present in the stack at the end. From node 3 there is no possibility to traverse further. From this point onwards we backtrack to check whether there are any nodes which have not been traversed. Pop the top node 3 from the stack. Now check whether there is any possibility to traverse from the element present at the top of the stack. The top element is 5, and there is an edge from node 5 which has not been traversed (see Figure 2(b); the line marked in red is the untraversed edge). This edge leads to 4, which has already been visited, and since there is no other possibility for traversing from node 5, pop node 5 from the stack. Repeat the same process; at the end there will be no elements in the stack, indicating that all the vertices of the graph have been traversed.
Figure 2. Backtracking operations for the depth first search algorithm
Figures 1 and 2 demonstrated depth first search for traversal purposes. The same technique can be used to search for an element in the graph. Given a graph with n nodes, we can check whether a given node is present in the graph or not. Each time we visit a node we check whether that node is the same as the search node; if it is, we stop the procedure and declare that the node is present, else we push that node onto the stack and continue traversing until the stack becomes empty.
Let us consider a tree example and illustrate the working principle of the depth first search.
Let the search element be F.
Figure 3. A binary tree with 8 nodes.
Figure 4(a) – 4(d) : Various steps in depth first search algorithm.
Note: Depth first search method uses stack as a data structure.
In contrast to depth first search, which explores nodes in top-to-bottom fashion, postponing the traversal of adjacent elements, the breadth first search algorithm first traverses all the adjacent nodes of a starting node; then all remaining unvisited nodes of a connected graph are traversed in the same manner.
It is convenient to use a queue to trace the operation of breadth first search. The queue is
initialized with the traversal’s starting node, which is marked as visited. On each iteration, the
algorithm identifies all unvisited nodes that are adjacent to the front node, marks them as
visited, and adds them to the queue; after that front node is removed from the queue.
Let us consider the same example of tree traversal Figure 3.
The starting node is A: insert A into the queue and mark A as traversed. Move to its successor elements {B, C}, insert them into the queue and mark them as traversed. Since there is no other element adjacent to node A, remove A, the first element, from the queue. The next element in the queue is B; check for its successor nodes. Since B has no successors, remove B from the queue. The next element in the queue is C; find its successor elements, i.e., {D, F}. Insert them into the queue and mark them as traversed. Since C has no other successors, remove C from the queue. The next element in the queue is D; its successor is E, so insert E into the queue and mark it as traversed. Now D has no further successors, hence remove D from the queue. The next element in the queue is F; find its successors, i.e., {H, I}. Insert them into the queue and mark them as visited. Again F has no further successors, so remove it from the queue and check the next element. The next element is E; E has no successors, so remove it. The next elements are H and I; traverse them in the same way.
For searching an element using breadth first search, similarly to depth first search, we traverse the graph using breadth first traversal, and if a node matching the search element occurs during the traversal we declare that the search element is present in the graph.
9.5 SUMMARY
In this unit we have presented the basics of elementary Graph Algorithms. We have
presented the Representation of Graph, Breadth first search and depth first search.
9.6 KEYWORDS
Depth first search
Breadth first search
Queue
Traverse
Non-linear data structures
Undirected graphs
9.7 QUESTIONS FOR SELF-STUDY
4) Mention the difference between the depth first search and breadth first search algorithms.
9.8 REFERENCES
1) Sartaj Sahni, 2000, Data Structures, Algorithms and Applications in C++, McGraw-Hill international edition.
2) Horowitz and Sahni, 1983, Fundamentals of Data Structures, Galgotia Publications.
3) Narsingh Deo, 1990, Graph Theory with Applications to Engineering and Computer Science, Prentice Hall.
4) Tremblay and Sorenson, 1991, An Introduction to Data Structures with Applications, McGraw-Hill edition.
5) Ramesh, Anand and Gautham, C and Data Structures by Practice.
6) G. A. V. Pai, Data Structures and Algorithms: Concepts, Techniques and Applications, Tata McGraw-Hill, New Delhi.
UNIT – 10
TOPOLOGICAL SORT
STRUCTURE
10.0 Objectives
10.1 Introduction
10.2 Topological Sort
10.3 Strongly Connected Components
10.4 Minimum spanning Tree
10.5 Growing a minimum Spanning tree
10.6 Summary
10.7 Keywords
10.8 Questions for self-study
10.9 Reference
10.0 OBJECTIVES
Define sorting
Topological sort
Minimum spanning tree
10.1 INTRODUCTION
Sorting refers to arranging data in a particular format. Sorting algorithm specifies the way to
arrange data in a particular order. Most common orders are in numerical or lexicographical
order.
The importance of sorting lies in the fact that data searching can be optimized to a very high level if data is stored in a sorted manner. Sorting is also used to represent data in more readable formats, as in a telephone directory (sorted by name) or a dictionary (sorted alphabetically).
Sorting algorithms may require some extra space for comparisons and for temporary storage of a few data elements. Algorithms that do not require any extra space are said to sort in-place, i.e., within the array itself. This is called in-place sorting. Bubble sort is an example of in-place sorting.
However, in some sorting algorithms, the program requires space which is more than or
equal to the elements being sorted. Sorting which uses equal or more space is called not-in-
place sorting. Merge-sort is an example of not-in-place sorting.
If a sorting algorithm, after sorting the contents, does not change the relative order in which similar (equal) elements appear, it is called stable sorting.
If a sorting algorithm, after sorting the contents, changes the relative order in which similar elements appear, it is called unstable sorting.
Stability of an algorithm matters when we wish to maintain the sequence of original
elements, like in a tuple for example.
An adaptive algorithm takes advantage of elements that are already sorted in the input. A non-adaptive algorithm is one which does not take into account the elements which are already sorted; it tries to re-order every single element to establish sortedness.
Important Terms
Some terms are generally coined while discussing sorting techniques, here is a brief
introduction to them −
Increasing Order
A sequence of values is said to be in increasing order, if the successive element is greater than the current one. For example, 1, 3, 4, 6, 8, 9 are in increasing order, as every next element is greater than the previous element.
Decreasing Order
A sequence of values is said to be in decreasing order, if the successive element is less than the current one. For example, 9, 8, 6, 4, 3, 1 are in decreasing order, as every next element is less than the previous element.
Non-Increasing Order
A sequence of values is said to be in non-increasing order, if the successive element is less than or equal to the current one; equal values may repeat. For example, 9, 8, 6, 3, 3, 1 are in non-increasing order.
Non-Decreasing Order
A sequence of values is said to be in non-decreasing order, if the successive element is greater than or equal to the current one; equal values may repeat. For example, 1, 3, 3, 6, 8, 9 are in non-decreasing order.
10.2 TOPOLOGICAL SORT
A topological sort of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering. A topological order is possible if and only if the graph has no directed cycles, i.e., it is a directed acyclic graph (DAG).
The above graph has many valid topological ordering of vertices like,
7 5 3 1 4 2 0 6
7 5 1 2 3 4 0 6
5 7 3 1 0 2 6 4
3 5 7 0 1 2 6 4
5 7 3 0 1 4 6 2
7 5 1 3 4 0 6 2
5 7 1 2 3 0 6 4
3 7 0 5 1 4 2 6
Note that for every directed edge u —> v, u comes before v in the ordering.
The topological order of a graph can also be obtained using the depth-first search (DFS) algorithm. Here, Kahn's topological sort algorithm is introduced, which provides an efficient way to print the topological order.
Kahn’s topological sort algorithm works by finding vertices with no incoming edges and
removing all outgoing edges from these vertices. Following is a pseudocode for Kahn’s
topological sort algorithm taken from Wikipedia:
Kahn's–Algorithm (graph)
    L ← empty list that will contain the sorted vertices
    S ← set of all vertices with no incoming edges
    while S is non-empty do
        remove a vertex n from S
        add n to the tail of L
        for each vertex m with an edge e from n to m do
            remove edge e from the graph
            if m has no other incoming edges, then insert m into S
    if the graph still has edges, then
        report error (the graph has at least one cycle)
    else
        return L
Note that a DAG has at least one such vertex which has no incoming edges.
How can we remove an edge from the graph or check if a vertex has no other incoming edge
in constant time?
The idea is to maintain in-degree information of all graph vertices in a map or an array, say indegree[], for constant-time operations. Here, indegree[m] stores the total number of incoming edges to vertex m. If vertex m has no incoming edge and is ready to be processed, its in-degree will be 0, i.e., indegree[m] = 0.
Following is a C++ implementation of Kahn's topological sort algorithm:
#include <iostream>
#include <vector>
using namespace std;

// Data structure to store a graph edge
struct Edge {
    int src, dest;
};

// A class to represent a graph object
class Graph
{
public:
    // a vector of vectors to represent an adjacency list
    vector<vector<int>> adjList;

    // stores indegree of a vertex
    vector<int> indegree;

    // Graph Constructor
    Graph(vector<Edge> const &edges, int n)
    {
        // resize the vector to hold `n` elements of type `vector<int>`
        adjList.resize(n);

        // initialize indegree
        indegree.assign(n, 0);

        // add edges to the directed graph
        for (auto &edge: edges)
        {
            // add an edge from source to destination
            adjList[edge.src].push_back(edge.dest);

            // increment in-degree of destination vertex by 1
            indegree[edge.dest]++;
        }
    }
};

// Function to perform a topological sort on a given DAG
vector<int> doTopologicalSort(Graph const &graph)
{
    vector<int> L;

    // get the total number of nodes in the graph
    int n = graph.adjList.size();

    vector<int> indegree = graph.indegree;

    // Set of all nodes with no incoming edges
    vector<int> S;
    for (int i = 0; i < n; i++)
    {
        if (!indegree[i]) {
            S.push_back(i);
        }
    }

    while (!S.empty())
    {
        // remove a node `u` from `S`
        int u = S.back();
        S.pop_back();

        // add `u` at the tail of `L`
        L.push_back(u);

        for (int m: graph.adjList[u])
        {
            // remove the edge from `u` to `m` from the graph
            indegree[m] -= 1;

            // if `m` has no other incoming edges, insert `m` into `S`
            if (!indegree[m]) {
                S.push_back(m);
            }
        }
    }

    // if the graph still has edges, then it has at least one cycle
    for (int i = 0; i < n; i++)
    {
        if (indegree[i]) {
            return {};
        }
    }

    return L;
}

int main()
{
    // vector of graph edges as per the above diagram
    vector<Edge> edges =
    {
        { 0, 6 }, { 1, 2 }, { 1, 4 }, { 1, 6 }, { 3, 0 }, { 3, 4 },
        { 5, 1 }, { 7, 0 }, { 7, 1 }
    };

    // total number of nodes in the graph (labelled from 0 to 7)
    int n = 8;

    // build a graph from the given edges
    Graph graph(edges, n);

    // Perform topological sort
    vector<int> L = doTopologicalSort(graph);

    // print topological order
    if (L.size()) {
        for (int i: L) {
            cout << i << " ";
        }
    } else {
        cout << "Graph has at least one cycle. Topological sorting is not possible";
    }

    return 0;
}
Output:
7 5 1 2 3 4 0 6
10.3 STRONGLY CONNECTED COMPONENTS
Connectivity in an undirected graph means that every vertex can reach every other vertex via some path. If the graph is not connected, it can be broken down into Connected Components. Strong connectivity applies only to directed graphs. A directed graph is strongly connected if there is a directed path from any vertex to every other vertex. This is the same as connectivity in an undirected graph, the only difference being that strong connectivity applies to directed graphs and requires directed paths instead of just paths. Similar to connected components, a directed graph can be broken down into Strongly Connected Components.
Basic/Brute Force method to find Strongly Connected Components:
Strongly connected components can be found one by one: first the strongly connected component including node 1 is found. Then, if node 2 is not included in the strongly connected component of node 1, a similar process to the one outlined below is used for node 2; otherwise the process moves on to node 3, and so on.
So, how to find the strongly connected component which includes node 1? Let there be a list
which contains all nodes, these nodes will be deleted one by one once it is sure that the
particular node does not belong to the strongly connected component of node 1. So, initially
all nodes from 1 to N are in the list. Let length of list be LEN, current index be IND and the
element at current index ELE. Now for each of the elements at index IND+1,...,LEN, assume
the element is Other Element, it can be checked if there is a directed path from Other
Element to ELE by a single O(V+E) DFS, and if there is a directed path from ELE to Other
Element, again by a single O(V+E) DFS. If not, Other Element can be safely deleted from the
list.
After all these steps, the list has the following property: every element can reach ELE,
and ELE can reach every element via a directed path. But the elements of this list may or may
not form a strongly connected component, because it is not confirmed that there is a path
from other vertices in the list excluding ELE to the all other vertices of the list excluding ELE.
So to do this, a similar process to the above mentioned is done on the next element (at next
index IND+1) of the list. This process needs to check whether elements at indices IND+2, ...,
LEN have a directed path to the element at index IND+1. It should also check if the element at
index IND+1 has a directed path to those vertices. If not, such nodes can be deleted from the
list. Now one by one, the process keeps on deleting elements that must not be there in the
Strongly Connected Component of 1.
In the end, list will contain a Strongly Connected Component that includes node 1. Now, to
find the other Strongly Connected Components, a similar process must be applied on the next
element (that is 2), only if it has not already been a part of some previous Strongly Connected
Component (here, the Strongly Connected Component of 1). Else, the process continues to
node 3 and so on.
The time complexity of the above algorithm is O(V³).
A better approach, known as Kosaraju's algorithm, just does DFS twice and has a much better complexity, O(V+E), than the brute-force approach. First define a Condensed Component Graph as a graph with at most V nodes and at most E edges, in which every node is a Strongly Connected Component and there is an edge from C to C′, where C and C′ are Strongly Connected Components, if there is an edge from any node of C to any node of C′.
It can be proved that the Condensed Component Graph will be a Directed Acyclic Graph (DAG).
To prove it, assume the contrary, that it is not a DAG and there is a cycle. Now observe
that on the cycle, every strongly connected component can reach every other strongly
connected component via a directed path, which in turn means that every node on the cycle
can reach every other node in the cycle, because in a strongly connected component every
node can be reached from any other node of the component. So if there is a cycle, the cycle
can be replaced with a single node because all the Strongly Connected Components on that
cycle will form one Strongly Connected Component.
Therefore, the Condensed Component Graph will be a DAG. Now, a DAG has the property
that there is at least one node with no incoming edges and at least one node with no outgoing
edges. Call the above 2 nodes as Source and Sink nodes. Now observe that if a DFS is done
from any node in the Sink (which is a collection of nodes as it is a Strongly Connected
Component), only nodes in the Strongly Connected Component of Sink are visited. Now,
removing the sink also results in a DAG, with maybe another sink. So the above process can
be repeated until all Strongly Connected Components are discovered. So at each step any
node of Sink should be known. This should be done efficiently.
Now a property can be proven for any two nodes C and C′ of the Condensed Component
Graph that share an edge, that is let C→C′ be an edge. The property is that the finish
time of DFS of some node in C will be always higher than the finish time of all nodes of C′.
Proof: There are 2 cases, when DFS first discovers either a node in C or a node in C′.
Case 1: When DFS first discovers a node in C: Now at some time during the DFS, nodes
of C′ will start getting discovered (because there is an edge from C to C′), then all nodes
of C′ will be discovered and their DFS will be finished in sometime (Why? Because it is a
Strongly Connected Component and will visit everything it can, before it backtracks to the
node in C, from where the first visited node of C′ was called). Therefore, for this case, the
finish time of some node of C will always be higher than finish time of all nodes of C′.
Case 2: When DFS first discovers a node in C′: Now, no node of C has been discovered
yet. DFS of C′ will visit every node of C′ and maybe more of other Strongly Connected
Component's if there is an edge from C′ to that Strongly Connected Component. Observe that
now any node of C will never be discovered because there is no edge from C′ to C.
Therefore, DFS of every node of C′ is already finished and DFS of any node of C has not even
started yet. So clearly finish time of some node (in this case all) of C, will be higher than the
finish time of all nodes of C′.
So, if there is an edge from C to C′ in the condensed component graph, the finish time of some
node of C will be higher than finish time of all nodes of C′. In other words, topological sorting
(a linear arrangement of nodes in which edges go from left to right) of the condensed
component graph can be done, and then some node in the leftmost Strongly Connected
Component will have higher finishing time than all nodes in the Strongly Connected
Component's to the right in the topological sorting.
Now the only problem left is how to find some node in the sink Strongly Connected
Component of the condensed component graph. The condensed component graph can be
reversed, then all the sources will become sinks and all the sinks will become sources. Note
that the Strongly Connected Components of the reversed graph will be same as the Strongly
Connected Components of the original graph.
Now a DFS can be done on the new sinks, which will again lead to finding Strongly Connected
Components. And now the order in which DFS on the new sinks needs to be done, is known.
The order is that of decreasing finishing times in the DFS of the original graph. This is because
it was already proved that an edge from C to C′ in the original condensed component graph
means that finish time of some node of C is always higher than finish time of all nodes of C′.
So when the graph is reversed, sink will be that Strongly Connected Component in which there
is a node with the highest finishing time. Since edges are reversed, DFS from the node with
highest finishing time, will visit only its own Strongly Connected Component.
Now a DFS can be done from the next valid node (valid meaning not visited yet in the previous DFSs) which has the next highest finishing time. In this way all Strongly Connected Components will be found. The complexity of the above algorithm is O(V+E), and it requires only two DFSs.
10.4 MINIMUM SPANNING TREE
The cost of a spanning tree is the sum of the weights of all the edges in the tree. A graph can have many spanning trees. A minimum spanning tree is the spanning tree whose cost is minimum among all the spanning trees. A graph can also have more than one minimum spanning tree.
Minimum spanning tree has direct application in the design of networks. It is used in
algorithms approximating the travelling salesman problem, multi-terminal minimum cut
problem and minimum-cost weighted perfect matching. Other practical applications are:
1. Cluster Analysis
2. Handwriting recognition
3. Image segmentation
10.5 GROWING A MINIMUM SPANNING TREE
A cut of a graph is a partition of its vertices into two disjoint sets. A crossing edge is an edge that connects a vertex in one set with a vertex in the other. For simplicity, we assume all edge weights are distinct; under this assumption, the MST is unique. The following properties lead to a number of MST algorithms.
Proposition. (Cut property)
Given any cut in an edge-weighted graph (with all edge weights distinct), the crossing edge
of minimum weight is in the MST of the graph.
The cut property is the basis for the algorithms that we consider for the MST problem.
Specifically, they are special cases of the greedy algorithm.
Proposition. (Greedy MST algorithm)
The following method colors black all edges in the MST of any connected edge-weighted graph with V vertices: starting with all edges colored gray, find a cut with no black edges, color its minimum-weight edge black, and continue until V-1 edges have been colored black.
Edge-weighted graph data type.
The either() and other() methods are useful for accessing an edge's two vertices; the compareTo() method compares edges by weight. Edge.java is a straightforward implementation.
We represent edge-weighted graphs using the following API:
We allow parallel edges and self-loops. EdgeWeightedGraph.java implements the API using the adjacency-lists representation.
MST API.
10.6 SUMMARY
In this unit we have learnt sorting as well as topological sorting. We have discussed strongly connected components, and covered the minimum spanning tree and how it is grown.
10.7 KEYWORDS
MST
Weighted
Disjoint
Spanning
10.9 REFERENCES
1) Sartaj Sahni, 2000, Data Structures, Algorithms and Applications in C++, McGraw-Hill international edition.
2) Horowitz and Sahni, 1983, Fundamentals of Data Structures, Galgotia Publications.
3) Horowitz and Sahni, 1998, Fundamentals of Computer Algorithms, Galgotia Publications.
4) Narsingh Deo, 1990, Graph Theory with Applications to Engineering and Computer Science, Prentice Hall.
5) Tremblay and Sorenson, 1991, An Introduction to Data Structures with Applications, McGraw-Hill edition.
UNIT – 11
GRAPHS ALGORITHMS
STRUCTURE
11.0 Objectives
11.1 Introduction
11.2 Kruskal and Prim – Single-source shortest paths
11.3 The Bellman Ford Algorithm – single source shortest paths in directed acyclic graphs
11.4 Summary
11.5 Keywords
11.7 Reference
11.0 OBJECTIVES
11.1 INTRODUCTION
A graph is an abstract notation used to represent the connection between pairs of objects.
A graph consists of −
Vertices − Interconnected objects in a graph are called vertices. Vertices are also
known as nodes.
Edges − Edges are the links that connect the vertices.
There are two types of graphs −
Directed graph − In a directed graph, edges have direction, i.e., edges go from one
vertex to another.
Undirected graph − In an undirected graph, edges have no direction.
Graph Coloring
Graph coloring is a method to assign colors to the vertices of a graph so that no two adjacent
vertices have the same color. Some graph coloring problems are −
Vertex coloring − A way of coloring the vertices of a graph so that no two adjacent
vertices share the same color.
Edge Coloring − It is the method of assigning a color to each edge so that no two
adjacent edges have the same color.
Face coloring − It assigns a color to each face or region of a planar graph so that no
two faces that share a common boundary have the same color.
Chromatic Number
Chromatic number is the minimum number of colors required to color a graph. For example,
the chromatic number of the following graph is 3.
The concept of graph coloring is applied in preparing timetables, mobile radio frequency assignment, Sudoku, register allocation, and coloring of maps.
11.2 KRUSKAL AND PRIM – SINGLE-SOURCE SHORTEST PATHS
Kruskal's Algorithm
Kruskal's algorithm differs from Prim's in the following manner. It does not insist on nearness to a vertex already existing in the partial spanning tree. As long as the new incoming low-cost edge does not form a loop, it is included in the tree. A broad outline of the algorithm is as follows:
Choose an edge with the lowest cost. Add it to the spanning tree. Delete it from the
set of edges.
From the set of edges choose the next low cost edge. Try it on the partial spanning
tree. If no loop is created, add it to the spanning tree, otherwise discard. In either case,
delete it from the set of edges.
Repeat the operation till (n-1) edges are picked up from the set of edges and added to
the spanning tree which spans over the vertex set V.
We now see its effect on the graph considered for Prim's algorithm.
Complexity
Since Kruskal's method works on the basis of sorting the edges by their weights, the complexity in the worst case is O(|E| log |E|). Using a disjoint-set (union-find) structure, the work performed after sorting is nearly linear in the number of edges.
Prim’s algorithm
Prim’s algorithm starts with the least cost edge. Then, it chooses another edge that is
adjacent to this edge and is of least cost and attaches it to the first edge. The process
continues as follows:
1) At each stage, choose an edge that is adjacent to any of the nodes of the partially
constructed spanning tree and the least weighted amongst them.
2) If the node selected above forms a loop/circuit, then reject it and select the next edge
that satisfies the criteria.
3) Repeat the process (n-1) times for the graph with n vertices.
To further clarify the situation let us trace the application of Prim’s algorithm to the following
graph.
Now we see how the Prim’s algorithm works on it.
Complexity
The algorithm, though appears to be fairly complex, can be looked up as made up of
several parts. At each stage, the nearest edge (indicating the edge with the least weight that
connects an outside vertex to a vertex that is in the partially built up spanning tree) is
identified and added to the tree. It can be seen that the complexity of the algorithm is Θ(n²). The complexity can be reduced to O((n + |E|) log n) using a heap-based priority queue. You are expected to refer to additional books and obtain more information about this.
This gives another application for greedy algorithms on graphs. Often, graphs are used
to indicate paths - roadmaps, pipelines etc. Graphs can be used to represent the highway
structure of a state or country with vertices representing cities and edges representing
sections of highway. The edges can then be assigned weights which may be either the
distance between the two cities connected by the edge or the average time to drive along
that section of highway. A motorist wishing to drive from city A to B would be interested in
answers to the following questions:
1) Is there a path from A to B?
2) If there is more than one path from A to B, which is the shortest path?
The problems defined by these questions are special case of the path problem we
study in this section. The length of a path is now defined to be the sum of the weights of the
edges on that path. The starting vertex of the path is referred to as the source and the last
vertex, destination. The graphs are digraphs representing streets. Consider a digraph G = (V,
E), with the distance to be traveled as weights on the edges. The problem is to determine the
shortest path from v0 to all the remaining vertices of G. It is assumed that all the weights
associated with the edges are positive. The shortest path between v0 and some other node v
is an ordering among a subset of the edges. Hence this problem fits the ordering paradigm.
Example:
Consider the digraph of Figure 8.1. Let the numbers on the edges be the costs of traveling along that route. If a person is interested in traveling from v1 to v2, then he encounters many paths. Some of them are:
v1 → v2 = 50 units.
v1→ v3→ v4→ v2 = 10 + 15 + 20 = 45 units.
Figure 8.1
The cheapest path among these is the path along v1→ v3→ v4→ v2. The cost of the
path is 10 + 15 + 20 = 45 units. Even though there are three edges on this path, it is cheaper
than traveling along the path connecting v1 and v2 directly i.e., the path v1 → v2 that costs 50
units. One can also notice that, it is not possible to travel to v6 from any other node.
A much simpler method would be to solve it using the matrix representation. The steps to be followed are as follows:
1. Find the adjacency matrix for the given graph. The adjacency matrix for figure 8.1
is given below
2. Consider v1 to be the source and choose the minimum entry in the row v1. In the
above table the minimum in row v1 is 10.
3. Find out the column in which the minimum is present, for the above example it is
column v3. Hence, this is the node that has to be next visited.
4. Compute a matrix by eliminating v1 and v3 columns. Initially retain only row v1. The
second row is computed by adding 10 to all values of row v3.
5. Find the minimum in each column. Now select the minimum from the resulting
row. In the above example the minimum is 25. Repeat step 3 followed by step 4
till all vertices are covered or single column is left.
Finally, the cheapest path from v1 to all other vertices is given by v1 → v3 → v4 → v2 → v5. The simple algorithm suggested here is Dijkstra's algorithm, which finds the shortest path from the initial vertex to all other vertices. The devised algorithm is as follows.
191
Complexity
The time taken by this algorithm on a graph with n vertices is O(n²). Any shortest path
algorithm must examine each edge in the graph at least once, since any of the edges could be
in a shortest path. Hence the minimum time taken is Ω(|E|). However, since the costs are
represented in a cost matrix, this representation takes Ω(n²) time. The worst-case
complexity can be reduced to O((n + |E|) log n), using adjacency lists and a heap, which is left as an assignment for you.
11.3 SINGLE SOURCE SHORTEST PATHS IN A DIRECTED ACYCLIC GRAPH
A weighted directed acyclic graph and a source vertex are given. We have to find the
shortest distance from the source vertex to all other vertices in the graph.
For graphs with negative edge weights we can use another algorithm such as Bellman-Ford;
for positive weights, Dijkstra's algorithm is also helpful. Here, for a directed
acyclic graph, we will use the topological sorting technique to reduce the complexity.
Output:
Shortest Distance from Source Vertex 1
Infinity 0 2 6 5 3
Algorithm
topoSort(u, visited, stack)
Input: starting node u, the visited list to keep track, the stack.
Output: Sort the nodes in a topological way.
Begin
mark u as visited
for all vertex v, which is connected with u, do
if v is not visited, then
topoSort(v, visited, stack)
done
push u into the stack
End
shortestPath(start)
Input − The starting node.
Output − List of the shortest distance of all vertices from the starting node.
Begin
   initially mark all nodes as unvisited
   for each node i in the graph, do
      if i is not visited, then
         topoSort(i, visited, stack)
   done
   set dist[i] := infinity for every node i, and dist[start] := 0
   while the stack is not empty, do
      pop u from the stack
      if dist[u] is not infinity, then
         for each vertex v connected with u, do
            if dist[u] + cost[u, v] < dist[v], then
               dist[v] := dist[u] + cost[u, v]
      done
   done
   for each node i in the graph, do
      if dist[i] is infinity, then
         display Infinity
      else
         display dist[i]
   done
End
Example
#include<iostream>
#include<stack>
using namespace std;
#define NODE 6
#define INF 9999
int cost[NODE][NODE] = {
   {0, 5, 3, INF, INF, INF},
   {INF, 0, 2, 6, INF, INF},
   {INF, INF, 0, 7, 4, 2},
   {INF, INF, INF, 0, -1, 1},
   {INF, INF, INF, INF, 0, -2},
   {INF, INF, INF, INF, INF, 0}
};
void topoSort(int u, bool visited[], stack<int> &stk) {
   visited[u] = true;
   for(int v = 0; v < NODE; v++)
      if(cost[u][v] && cost[u][v] != INF && !visited[v])
         topoSort(v, visited, stk); // visit all successors of u first
   stk.push(u); // then push u: the stack holds a topological order
}
void shortestPath(int start) {
   stack<int> stk;
   int dist[NODE];
   bool vis[NODE];
   for(int i = 0; i < NODE; i++)
      vis[i] = false; // make all nodes as unvisited at first
   for(int i = 0; i < NODE; i++)
      if(!vis[i])
         topoSort(i, vis, stk); // fill the stack in topological order
   for(int i = 0; i < NODE; i++)
      dist[i] = INF;
   dist[start] = 0; // distance of the source from itself is 0
   while(!stk.empty()) {
      int nextVert = stk.top(); // process vertices in topological order
      stk.pop();
      if(dist[nextVert] != INF) {
         for(int v = 0; v < NODE; v++) {
            if(cost[nextVert][v] && cost[nextVert][v] != INF) {
               if(dist[v] > dist[nextVert] + cost[nextVert][v])
                  dist[v] = dist[nextVert] + cost[nextVert][v]; // relax edge
            }
         }
      }
   }
   for(int i = 0; i < NODE; i++)
      (dist[i] == INF) ? cout << "Infinity " : cout << dist[i] << " ";
}
int main() {
   int start = 1;
   cout << "Shortest Distance From Source Vertex " << start << endl;
   shortestPath(start);
}
Output
Shortest Distance From Source Vertex 1
Infinity 0 2 6 5 3
11.4 SUMMARY
In this unit, we have described two problems where greedy strategy is used to provide
optimal solution. In single source shortest path problem, we described how greedy strategy
is used to determine the shortest path from a single source to all the remaining vertices of G.
In the case of minimum cost spanning tree problem, we intend to build a least cost spanning
tree, stage by stage, using the greedy method. Obviously at each stage, we choose the edge
with the least weight from amongst the available edges. With this, in mind, we described two
algorithms Prim’s and Kruskal’s respectively, which work on greedy principle.
11.5 KEYWORDS
Graph
Kruskal’s algorithm
Greedy strategy
Complexity
11.7 REFERENCES
1) Fundamentals of Algorithmics: Gilles Brassard and Paul Bratley, Prentice Hall
Englewood Cliffs, New Jersey 07632.
2) Sartaj Sahni, 2000, Data structures, Algorithms and Applications in C++, McGraw Hill
International Edition.
UNIT – 12
GRAPHS ALGORITHMS
STRUCTURE
12.0 Objectives
12.5 Keywords
12.7 Reference
12.0 OBJECTIVES
12.1 DIJKSTRA’S ALGORITHMS
The main problem is the same as the previous one, from the starting node to any other node,
find the smallest distances. In this problem, the main difference is that the graph is
represented using the adjacency matrix. (Cost matrix and adjacency matrix is similar for this
purpose).
For the adjacency list representation, the time complexity is O(V^2) where V is the number of
nodes in the graph G(V, E)
Input and Output
Input:
The adjacency matrix:
Output:
0 to 1, Using: 0, Cost: 3
0 to 2, Using: 1, Cost: 5
0 to 3, Using: 1, Cost: 4
0 to 4, Using: 3, Cost: 6
0 to 5, Using: 2, Cost: 7
0 to 6, Using: 4, Cost: 7
Algorithm
dijkstraShortestPath(n, dist, next, start)
Input − Total number of nodes n, distance list for each vertex, next list to store which node
comes next, and the seed or start vertex.
Output − The shortest paths from start to all other vertices.
Begin
   create a status list to hold the current status of each vertex
   for all vertices u in V do
      status[u] := unconsidered
      dist[u] := distance from source using the cost matrix
      next[u] := start
   done
   status[start] := considered
   while an unconsidered vertex remains, do
      u := the unconsidered vertex with minimum dist[u]
      status[u] := considered
      for all unconsidered vertices v adjacent to u, do
         if dist[u] + cost[u, v] < dist[v], then
            dist[v] := dist[u] + cost[u, v]
            next[v] := u
      done
   done
End
#include<iostream>
using namespace std;
#define V 7
#define INF 9999
int costMat[V][V]; // fill with the cost adjacency matrix shown above (INF = no edge)
int getMinVert(int n, int dist[], int status[]) {
   int min = INF, index = -1;
   for(int u = 0; u < n; u++)
      if(status[u] == 1 && dist[u] < min) {
         min = dist[u]; // minimum unconsidered vertex distance
         index = u;
      }
   return index; // -1 when all vertices are considered
}
void dijkstra(int n, int dist[], int next[], int s) {
   int status[V], u;
   for(u = 0; u < n; u++) {
      status[u] = 1; // unconsidered vertex
      dist[u] = costMat[u][s]; // distance from source
      next[u] = s;
   }
   status[s] = 0; // the source itself is considered
   while((u = getMinVert(n, dist, status)) != -1) {
      status[u] = 0; // mark u as considered
      for(int v = 0; v < n; v++)
         if(status[v] == 1 && costMat[u][v] != INF
               && dist[u] + costMat[u][v] < dist[v]) {
            dist[v] = dist[u] + costMat[u][v];
            next[v] = u;
         }
   }
}
int main() {
   int dis[V], next[V], i, start = 0;
   dijkstra(V, dis, next, start);
   for(i = 0; i < V; i++)
      if(i != start)
         cout << start << " to " << i << ", Using: " << next[i]
              << ", Cost: " << dis[i] << endl;
}
Given a directed graph G = (V, E), it is required to find shortest paths between all pairs
of vertices. In other words, beginning from vertex 1, finding shortest paths to each of the
vertices 2, 3, 4, … , n. Similarly from vertex 2 to 1, 3, 4, …, n. Note that the shortest path from
1 to 2 may not be the same as from 2 to 1. In fact, when a path from 1 to 2 exists, a path from
2 to 1 need not necessarily exist.
One method of solving the problem is to repeatedly apply the concept of single source
shortest path by considering each vertex as a source vertex. Given n vertices, we have to
invoke the single source shortest path algorithm n times, by giving each of the vertices as the
source vertex. In this section, we look at a slightly different approach using dynamic
programming concept to find the shortest path between all pairs of vertices.
Let Ak(i, j) represent the shortest path from i to j going through no vertex of index
greater than k, i.e., Ak(i, j) = min( Ak−1(i, j), Ak−1(i, k) + Ak−1(k, j) ).
Algorithm: All_pair_shortest_paths
Input: C, is a cost adjacency matrix of the graph G(V, E).
N, the number of vertices.
Output: A(i, j), the cost of the shortest path between i and j.
Method :
For i = 1 to n do
C(i, i) = 0
For end
For i = 1 to n do
For j = 1 to n do
A(i, j) = C (i, j) //copy cost into A
For end
For end
For k = 1 to n do
For i = 1 to n do
For j = 1 to n do
A[i, j] = min(A[i, j], A[i, k] + A[k, j])
For end
For end
For end
Algorithm ends
Figure 15.2
Let us try the above method on a 4 vertex graph shown in Figure 15.2. The cost matrix of the
above figure appears as follows
Now we compute the Ak matrices
The values in the last matrix A4 give the cost of going from every vertex to every other vertex.
Complexity
The above algorithm has three 'for' loops nested one within the other. Hence the complexity
of the above algorithm is O(n³).
In linear algebra, matrices play an important role in dealing with different
concepts. A matrix is a rectangular array or table of numbers, symbols, or expressions,
arranged in rows and columns. We can perform various operations on
matrices, such as addition, subtraction, multiplication and so on. In this section, you will learn
how to multiply a matrix by another matrix, the corresponding algorithm and formula, and 2×2 and 3×3 matrix
multiplication with examples in detail.
Example: Multiply the matrix A = [3 4 −1; 0 9 5] by the scalar 4.
Solution:
Given,
A = [3 4 −1; 0 9 5]
4 × A = 4 × [3 4 −1; 0 9 5]
Now, we have to multiply each element of the matrix A by 4.
= [12 16 −4; 0 36 20]
This is the required matrix after multiplying the given matrix by the constant or scalar value,
i.e. 4. (Rows of a matrix are written here separated by semicolons.)
Notation
If A is an m×n matrix and B is an n×q matrix, then the matrix product of A and B is represented
by:
X = AB
where X is the resulting matrix of dimension m×q. (The product is defined only when the
number of columns of A equals the number of rows of B.)
Let’s say A and B are two matrices, such that,
A=[A11A12⋯A1nA21A22⋯A2n………….Am1Am2⋯Amn], B=[B11B12⋯B1nB21B22⋯B2n……
…….Bm1Bm2⋯Bmn]
Then Matrix C = AB is denoted by
C = [C11C12…….C1cC21C22…….C2c……………Ca1Ca2…….Cac]
An element in matrix C where C is the multiplication of Matrix A X B.
C = Cxy = Ax1By1 +….. + AxbBby = ∑k=1b AxkBky for x = 1…… a and y= 1…….c
The main classes of matrix multiplication algorithms are:
Iterative algorithms
Divide and conquer algorithms
Sub-cubic algorithms
Parallel and distributed algorithms
Matrix multiplication is implemented in various programming languages such as C, Java, etc.
The most common cases are 2×2, 3×3 and 4×4 multiplication of matrices.
The operation is binary, with entries in a set on which the operations of addition, subtraction,
multiplication, and division are defined. These operations are the same as the corresponding
operations on real and rational numbers.
Although there are many applications of matrices, essentially, multiplication of matrices is an
operation in linear algebra: linear mappings, which respect addition and scalar
multiplication, are represented by matrices, and their composition by matrix multiplication.
One can also find a wide range of algorithms designed for meshes. This type of algorithm aims to
minimize the inherent inefficiency of standard array algorithms, where there can be a delay in
the arrival of data from two different matrices.
The product of two matrices A and B is defined if the number of columns of A is equal
to the number of rows of B.
If AB is defined, then BA need not be defined
If both A and B are square matrices of the same order, then both AB and BA are
defined.
If AB and BA are both defined, it is not necessary that AB = BA.
If the product of two matrices is a zero matrix, it is not necessary that one of the
matrices is a zero matrix.
For example, let A = [3 7; 4 9] and B = [6 2; 5 8]. Then
AB11 = 3 × 6 + 7 × 5 = 53
AB12 = 3 × 2 + 7 × 8 = 62
AB21 = 4 × 6 + 9 × 5 = 69
AB22 = 4 × 2 + 9 × 8 = 80
Therefore matrix AB = [53 62; 69 80]
Commutative Property
Matrix multiplication is not commutative: if A and B are two 2×2 matrices, then in general
AB ≠ BA
In matrix multiplication, the order matters a lot.
For example,
If A = [1 2; 3 4] and B = [3 2; 1 4] are the two matrices, then
A × B = [1 2; 3 4] × [3 2; 1 4]
A × B = [5 10; 13 22]
But,
B × A = [3 2; 1 4] × [1 2; 3 4]
B × A = [9 14; 13 18]
This shows that AB ≠ BA.
Hence, the multiplication of two matrices is not commutative.
Associative Property
If A, B and C are the three matrices, the associative property of matrix multiplication states
that,
(AB) C = A(BC)
Let A = [1 2; 1 1]
B = [3 2; 1 2]
C = [0 1; 2 3]
LHS = (AB)C
A × B = [1 2; 1 1] × [3 2; 1 2]
A × B = [5 6; 4 4]
(AB)C = [5 6; 4 4] × [0 1; 2 3]
(AB)C = [12 23; 8 16]
RHS = A(BC)
BC = [3 2; 1 2] × [0 1; 2 3]
BC = [4 9; 4 7]
A(BC) = [1 2; 1 1] × [4 9; 4 7]
A(BC) = [12 23; 8 16]
Hence, the associative property of matrix multiplication is proved.
Distributive Property
If A, B and C are the three matrices, the distributive property of matrix multiplication states
that,
(B+C)A = BA +CA
A(B+C) = AB + AC
Multiplicative Identity Property
If I is the identity matrix of compatible order, then A · I = I · A = A.
Dimension Property
In matrix multiplication, the product of an m × n matrix and an n × a matrix is an m × a matrix.
For example, if matrix A is a 2 × 3 matrix and matrix B is a 3 × 4 matrix, then AB is a 2 × 4 matrix.
Solved Example
Multiplication of 4×4 matrices is explained below with two 4×4 matrices A and B.
A = [7 14 15 6; 4 8 12 3; 14 21 6 9; 13 7 6 4],
B = [5 7 14 2; 8 16 4 9; 13 6 8 4; 6 3 2 4]
Following the same steps as in the previous two examples, we can construct the AB matrix.
AB = [378 381 286 224; 258 237 190 140; 370 497 346 277; 223 251 266 129]
Given a directed or an undirected weighted graph G with n vertices, the task is to find the
length of the shortest path dij between each pair of vertices i and j.
The graph may have negative weight edges, but no negative weight cycles.
If there is such a negative cycle, you can just traverse it over and over, making the cost of
the path smaller with each iteration. So you can make certain paths arbitrarily small, or in
other words, the shortest path is undefined. This automatically means that an undirected
graph cannot have any negative weight edge, since such an edge already forms a negative
cycle: you can move back and forth along it as many times as you like.
This algorithm can also be used to detect the presence of negative cycles. The graph has a
negative cycle if at the end of the algorithm, the distance from a vertex v to itself is negative.
This algorithm was published simultaneously in articles by Robert Floyd and Stephen
Warshall in 1962. However, in 1959, Bernard Roy had already published essentially the same
algorithm, but its publication went unnoticed.
The key idea of the algorithm is to partition the process of finding the shortest path between
any two vertices to several incremental phases.
Let us number the vertices starting from 1 to n. The matrix of distances is d[][].
Before k-th phase (k=1…n), d[i][j] for any vertices i and j stores the length of the shortest path
between the vertex i and vertex j, which contains only the vertices {1,2,...,k−1} as internal
vertices in the path.
In other words, before k-th phase the value of d[i][j] is equal to the length of the shortest path
from vertex i to the vertex j, if this path is allowed to enter only the vertex with numbers
smaller than k (the beginning and end of the path are not restricted by this property).
It is easy to make sure that this property holds for the first phase. For k=0, we can fill matrix
with d[i][j]=wij if there exists an edge between i and j with weight wij and d[i][j]=∞ if there
doesn't exist an edge. In practice ∞ will be some high value. As we shall see later, this is a
requirement for the algorithm.
Suppose now that we are in the k-th phase, and we want to compute the matrix d[][] so that
it meets the requirements for the (k+1)-th phase. We have to fix the distances for some
vertices pairs (i,j). There are two fundamentally different cases:
The shortest way from the vertex i to the vertex j with internal vertices from the
set {1,2,…,k} coincides with the shortest path with internal vertices from the
set {1,2,…,k−1}.
In this case, d[i][j] will not change during the transition.
The shortest path with internal vertices from {1,2,…,k} is shorter.
This means that the new, shorter path passes through the vertex k. This means that we
can split the shortest path between i and j into two paths: the path between i and k, and
the path between k and j. It is clear that both this paths only use internal vertices
of {1,2,…,k−1} and are the shortest such paths in that respect. Therefore we already have
computed the lengths of those paths before, and we can compute the length of the
shortest path between i and j as d[i][k]+d[k][j].
Combining these two cases we find that we can recalculate the length of all pairs (i,j) in the k-
th phase in the following way:
dnew[i][j]=min(d[i][j],d[i][k]+d[k][j])
Thus, all the work that is required in the k-th phase is to iterate over all pairs of vertices and
recalculate the length of the shortest path between them. As a result, after the n-th phase,
the value d[i][j] in the distance matrix is the length of the shortest path between i and j, or
is ∞ if the path between the vertices i and j does not exist.
A last remark - we don't need to create a separate distance matrix dnew[][] for temporarily
storing the shortest paths of the k-th phase, i.e. all changes can be made directly in the
matrix d[][] at any phase. In fact at any k-th phase we are at most improving the distance of
any path in the distance matrix, hence we cannot worsen the length of the shortest path for
any pair of the vertices that are to be processed in the (k+1)-th phase or later.
Implementation
Let d[][] be a 2D array of size n×n, which is filled according to the 0-th phase as explained
earlier. We also set d[i][i] = 0 for every i at the 0-th phase.
It is assumed that if there is no edge between any two vertices i and j, then the matrix
at d[i][j] contains a large number (large enough so that it is greater than the length of any
path in this graph). Then this edge will always be unprofitable to take, and the algorithm will
work correctly.
However, if there are negative weight edges in the graph, special measures have to be taken.
Otherwise the resulting values in the matrix may be of the form ∞−1, ∞−2, etc., which, of course,
still indicate that no path exists between the respective vertices. Therefore, if the graph
has negative weight edges, it is better to write the Floyd-Warshall algorithm in the following
way, so that it does not perform transitions using paths that don't exist.
for (int k = 0; k < n; ++k) {
for (int i = 0; i < n; ++i) {
for (int j = 0; j < n; ++j) {
if (d[i][k] < INF && d[k][j] < INF)
d[i][j] = min(d[i][j], d[i][k] + d[k][j]);
}
}
}
It is easy to maintain additional information with which it will be possible to retrieve the
shortest path between any two given vertices in the form of a sequence of vertices.
For this, in addition to the distance matrix d[][], a matrix of ancestors p[][] must be
maintained, which will contain the number of the phase where the shortest distance between
two vertices was last modified. It is clear that the number of the phase is nothing more than
a vertex in the middle of the desired shortest path. Now we just need to find the shortest
path between vertices i and p[i][j], and between p[i][j] and j. This leads to a simple recursive
reconstruction algorithm of the shortest path.
If the weights of the edges are not integer but real, it is necessary to take the errors, which
occur when working with float types, into account.
The Floyd-Warshall algorithm has the unpleasant effect, that the errors accumulate very
quickly. In fact if there is an error in the first phase of δ, this error may propagate to the
second iteration as 2δ, to the third iteration as 4δ, and so on.
To avoid this, the algorithm can be modified to take the error (EPS = δ) into account, by
performing the relaxation only when the new path is shorter by more than EPS:
if (d[i][k] + d[k][j] < d[i][j] - EPS)
    d[i][j] = d[i][k] + d[k][j];
Formally, the Floyd-Warshall algorithm does not apply to graphs containing negative weight
cycle(s). But for all pairs of vertices i and j for which there doesn't exist a path starting at i,
visiting a negative cycle, and ending at j, the algorithm will still work correctly.
For the pair of vertices for which the answer does not exist (due to the presence of a negative
cycle in the path between them), the Floyd algorithm will store any number (perhaps highly
negative, but not necessarily) in the distance matrix. However it is possible to improve the
Floyd-Warshall algorithm, so that it carefully treats such pairs of vertices, and outputs them,
for example as −INF.
This can be done in the following way: let us run the usual Floyd-Warshall algorithm for the
given graph. Then the shortest path between vertices i and j does not exist if, and only if, there
is a vertex t reachable from i, from which j is also reachable, with d[t][t] < 0.
In addition, when using the Floyd-Warshall algorithm for graphs with negative cycles, we
should keep in mind that situations may arise in which distances can go into the negative
exponentially fast. Therefore integer overflow must be handled by limiting the minimal
distance by some value (e.g. −INF).
12.5 SUMMARY
In this unit, we presented a dynamic programming algorithm that solves the matrix
multiplication problem. A product of matrices is fully parenthesized if it is either a single
matrix or the product of two fully parenthesized matrix products, surrounded by parentheses.
We learnt how to parenthesize a chain of matrices that can have a dramatic impact on the
cost of evaluating the product. Few examples are considered to illustrate the different costs
incurred by different parenthesizations of a matrix product. The unit also addressed all pair
shortest path problem and the dynamic programming algorithm to solve this problem.
Complexity of the dynamic programming algorithms to solve these two problems is analyzed.
12.6 KEYWORDS
Dynamic programming
Matrix- multiplication
All pair shortest path
Recursive solution
Directed graph
Parenthesization
4. Explain the algorithm to solve all pair shortest path problem.
12.8 REFERENCES
UNIT – 13
STRUCTURE
13.0 Objectives
13.1 Introduction
13.5 Summary
13.6 Keywords
13.8 Reference
13.0 OBJECTIVES
13.1 INTRODUCTION
Dynamic programming is arguably the most difficult of the five design methods we
are studying. It has its foundations in the principle of optimality. We can use this method to
obtain elegant and efficient solutions to many problems that cannot be so solved with either the
greedy or divide-and-conquer methods.
13.2 DYNAMIC PROGRAMMING
Example 13.1 [Shortest Path] Consider the digraph of Figure 13.1. We wish to find a shortest
path from the source vertex a = 1 to the destination vertex d = 5. We need to make decisions
on the intermediate vertices. The choices for the first decision are 2, 3, and 4. That is, from
vertex 1 we may move to any one of these vertices. Suppose we decide to move to vertex 3.
Now we need to decide on how to get from 3 to 5. If we go from 3 to 5 in a suboptimal way,
then the 1-to-5 path constructed cannot be optimal, even under the restriction that from vertex
1 we must go to vertex 3. For example, if we use the suboptimal path 3, 2, 5 with length 9, the
constructed 1-to-5 path 1, 3, 2, 5 has length 11. Replacing the suboptimal path 3, 2, 5 with an
optimal one, 3, 4, 5, results in the path 1, 3, 4, 5 of length 9.
So for this shortest-path problem, suppose that our first decision gets us to some vertex
v. Although we do not know how to make this first decision we do know that the remaining
decisions must be optimal for the problem of going from v to d.
Figure 13.1
Example 13.2 [0/1 Knapsack Problem] Consider the 0/1 knapsack problem. We need to make decisions on the
values of x1, ..., xn. Suppose we are deciding the values of the xis in the order i = 1, 2, ..., n.
If we set x1 = 0, then the available knapsack capacity for the remaining objects (i.e., objects
2, 3, ..., n) is c. If we set x1 = 1, the available knapsack capacity is c − w1. Let r ∈ {c, c − w1}
denote the remaining knapsack capacity.
Following the first decision, we are left with the problem of filling a knapsack with
capacity r. The available objects (i.e., 2 through n) and the available capacity r define the
problem state following the first decision. Regardless of whether x1 is 0 or 1, [x2, ..., xn] must
be an optimal solution for the problem state following the first decision. If not, there is a
solution [y2, ..., yn] that provides a greater profit for the problem state following the first
decision. So [x1, y2, ..., yn] is a better solution for the initial problem.
Suppose that n = 3. w = [100, 14, 10], p = [20, 18, 15], and c = 116. If we set x1 = 1,
then following this decision, the available knapsack capacity is 16. [x2, x3] = [0, 1] is a feasible
solution to the two-object problem that remains. It returns a profit of 15. However, it is not an
optimal solution to the remaining two-object problem, as [x2, x3] = [1, 0] is feasible and returns
a greater profit of 18. So x = [1, 0, 1] can be improved to x = [1, 1, 0]. If we set x1 = 0, the
available capacity for the two-object instance that remains is 116. If the subsequence [x2, x3]
is not an optimal solution for this remaining instance, then [x1, x2, x3] cannot be optimal for
the initial instance.
Example 13. 3 [Airfares] A certain airline has the following airfare structure: From Atlanta to
New York or Chicago, or from Los Angeles to Atlanta, the fare is $100; from Chicago to New
York, it is $20; and for passengers connecting through Atlanta, the Atlanta to Chicago segment
is only $20. A routing from Los Angeles to New York involves decisions on the intermediate
airports. If problem states are encoded as (origin, destination) pairs, then following a decision
to go from Los Angeles to Atlanta, the problem state is we are at Atlanta and need to get to
New York. The cheapest way to go from Atlanta to New York is a direct flight with cost $100.
Using this direct flight results in a total Los Angeles-to-New York cost of $200. However, the
cheapest routing is Los Angeles-Atlanta-Chicago-New York with a cost of $140, which
involves using a suboptimal decision subsequence for the Atlanta-to-New York problem
(Atlanta-Chicago-New York).
If instead we encode the problem state as a triple (tag, origin, destination) where tag is
zero for connecting flights and 1 for all others, then once we reach Atlanta, the state becomes
(0, Atlanta, New York) for which the optimal routing is through Chicago.
When optimal decision sequences contain optimal decision subsequences, we can establish
recurrence equations, called dynamic-programming recurrence equations that enable us to
solve the problem in an efficient way.
Recursive Solution: The dynamic-programming recurrence equations for the 0/1 knapsack
problem will be discussed in this section. A natural way to solve such a recurrence for
the value f(1, c) of an optimal knapsack packing is by a recursive program such as Program
13.1. This code assumes that p, w, and n are global. The invocation F(1, c) returns the value
of f(1, c).
Let t(n) be the time this code takes to solve an instance with n objects. We see that t(1)
= a and t(n) ≤ 2t(n − 1) + b for n > 1, where a and b are constants. This recurrence solves to
t(n) = O(2^n).
Example 13.4 Consider the case n = 5, p = [6, 3, 5, 4, 6], w = [2, 2, 6, 5, 4], and c = 10. To
determine f(1, 10), function F is invoked as F(1,10). The recursive calls made are shown by
the tree of Figure 13.2. Each node has been labelled by the value of y. Nodes on level j have i
= j. So the root denotes the invocation F (1, 10). Its left and right children, respectively, denote
the invocations F (2, 10) and F (2, 8). In all, 28 invocations are made. Notice that several
invocations redo the work of previous invocations. For example, f(3, 8) is computed twice, as
are f(4, 8), f(4, 6), f(4, 2), f(5, 8), f(5, 6), f(5, 3), f(5, 2), and f(5, 1). If we save the results of
previous invocations, we can reduce the number of invocations to 19 because we eliminate the
shaded nodes of Figure 13.2.
As observed in above Example, Program 13.1 is doing more work than necessary. To
avoid computing the same f(i, y) value more than once, we may keep a list L of f(i, y)s
that have already been computed. The elements of this list are triples of the form (i, y, f(i, y)).
Before making an invocation F(i, y), we see whether the list L contains a triple of the form (i,
y, * ) where * denotes a wildcard. If so, f(i, y) is retrieved from the list. If not, the invocation is
made and then the triple (i, y, f(i, y)) is added to L. L may be stored as a hash table or as a
binary search tree.
Iterative Solution with Integer Weights: We can devise a fairly simple iterative algorithm
(Program 13.2) to solve for f(1, c) when the weights are integers. It computes each f (i, y)
exactly once. Program 13.2 uses a two-dimensional f [][] to store the values of the function f.
The code for the traceback needed to determine the xi values that result in the optimal filling
appears in Program 13.2.
}
Program 13.2 Iterative computation of f and x
The complexity of function Knapsack is Θ(nc) and that of Traceback is Θ(n).
Tuple Method (Optional): There are two drawbacks to the code of Program 13.2. First, it
requires that the weights be integers. Second, it is slower than Program 13.1 when the knapsack
capacity is large. In particular, if c > 2^n, its complexity is Ω(n·2^n). We can overcome both of
these shortcomings by using a tuple approach in which, for each i, f(i, y) is stored as an ordered
list P(i) of pairs (y, f(i, y)) that correspond to the y values at which the function f changes. The
pairs in each P(i) are in increasing order of y. Also, since f(i, y) is a nondecreasing function
of y, the pairs are also in increasing order of f(i, y).
Our next example of dynamic programming is an algorithm that solves the problem of
multiplying a chain of n matrices
A1 A2 … An                                                   (15.1)
We can evaluate the expression (15.1) using the standard algorithm for multiplying pairs of
matrices as a subroutine once we have parenthesized it to resolve all ambiguities in how the
matrices are multiplied together. Matrix multiplication is associative, and so all
parenthesizations yield the same product. A product of matrices is fully parenthesized if it is
either a single matrix or the product of two fully parenthesized matrix products, surrounded by
parentheses.
How we parenthesize a chain of matrices can have a dramatic impact on the cost of evaluating
the product. Consider first the cost of multiplying two matrices. The standard algorithm is given
by the following pseudocode. The attributes rows and columns are the numbers of rows and
columns in a matrix.
We can multiply two matrices A and B only if they are compatible: the number of columns of
A must equal the number of rows of B. If A is a p x q matrix and B is a q x r matrix, the resulting
matrix C is a p x r matrix. The time to compute C is dominated by the number of scalar
multiplications in line 8, which is pqr. In what follows, we shall express costs in terms of the
number of scalar multiplications.
Note that in the matrix-chain multiplication problem, we are not actually multiplying
matrices. Our goal is only to determine an order for multiplying matrices that has the lowest
cost. Typically, the time invested in determining this optimal order is more than paid for by the
time saved later on when actually performing the matrix multiplications (such as performing
only 7500 scalar multiplications instead of 75,000).
If P(n) denotes the number of alternative parenthesizations of a product of n matrices, then
P(1) = 1 and
P(n) = Σ k=1..n−1 P(k) P(n − k), for n ≥ 2                    (15.2)
A simpler exercise is to show that the solution to the recurrence (15.2) is Ω(2^n). The number of
solutions is thus exponential in n, and the brute-force method of exhaustive search makes for a
poor strategy when determining how to optimally parenthesize a matrix chain.
We shall go through these steps in order, demonstrating clearly how we apply each step to the
problem.
For our first step in the dynamic-programming paradigm, we find the optimal
substructure and then use it to construct an optimal solution to the problem from optimal
solutions to subproblems. In the matrix-chain multiplication problem, we can perform this step
as follows. For convenience, let us adopt the notation Ai..j, where i ≤ j , for the matrix that results
from evaluating the product Ai Ai+1.. Aj. Observe that if the problem is nontrivial, i.e., i < j, then
to parenthesize the product Ai Ai+1.. Aj, we must split the product between Ak and Ak+1 for some
integer k in the range i ≤ k < j. That is, for some value of k, we first compute the matrices Ai..k
and Ak+1..j and then multiply them together to produce the final product Ai..j. The cost of
parenthesizing this way is the cost of computing the matrix Ai..k, plus the cost of computing
Ak+1..j, plus the cost of multiplying them together.
Now we use our optimal substructure to show that we can construct an optimal solution
to the problem from optimal solutions to subproblems. We have seen that any solution to a
nontrivial instance of the matrix-chain multiplication problem requires us to split the product,
and that any optimal solution contains within it optimal solutions to subproblem instances.
Thus, we can build an optimal solution to an instance of the matrix-chain multiplication
problem by splitting the problem into two subproblems (optimally parenthesizing Ai Ai+1.. Ak
and Ak+1Ak+2…Aj, finding optimal solutions to subproblem instances, and then combining these
optimal subproblem solutions. We must ensure that when we search for the correct place to
split the product, we have considered all possible places, so that we are sure of having examined
the optimal one.
Next, we pick as our subproblems the problems of determining the minimum cost of parenthesizing Ai Ai+1.. Aj for 1
≤ i ≤ j ≤ n. Let m[i, j] be the minimum number of scalar multiplications needed to compute the
matrix Ai..j; for the full problem, the lowest-cost way to compute A1..n would thus be m[1, n] .
We can define m[i, j] recursively as follows. If i = j, the problem is trivial; the chain consists
of just one matrix Ai..i = Ai, so that no scalar multiplications are necessary to compute the
product. Thus, m[i, i] = 0 for i = 1, 2, ..., n. To compute m[i, j] when i < j, we take advantage
of the structure of an optimal solution from step 1. Let us assume that to optimally parenthesize,
we split the product Ai Ai+1.. Aj between Ak and Ak+1, where i ≤ k < j. Then, m[i, j] equals the
minimum cost for computing the subproducts Ai..k and Ak+1..j, plus the cost of multiplying these
two matrices together. Recalling that each matrix Ai is Pi-1 × Pi, we see that computing the
matrix product Ai..k Ak+1..j takes Pi-1 Pk Pj scalar multiplications. Thus, we obtain

m[i, j] = m[i, k] + m[k + 1, j] + Pi-1 Pk Pj.
This recursive equation assumes that we know the value of k, which we do not. There are only
j - i possible values for k, however, namely k = i, i + 1, …, j -1. Since the optimal
parenthesization must use one of these values for k, we need only check them all to find the
best. Thus, our recursive definition for the minimum cost of parenthesizing the product Ai Ai+1..
Aj becomes

m[i, j] = 0                                                        if i = j,
m[i, j] = min { m[i, k] + m[k + 1, j] + Pi-1 Pk Pj : i ≤ k < j }   if i < j.    (15.3)
The m[i, j] values give the costs of optimal solutions to subproblems, but they do not provide
all the information we need to construct an optimal solution. To help us do so, we define s[i, j]
to be a value of k at which we split the product Ai Ai+1.. Aj in an optimal parenthesization. That
is, s[i, j] equals a value k such that m[i, j] = m[i, k] + m[k + 1, j] + Pi-1 Pk Pj.
At this point, we could easily write a recursive algorithm based on recurrence (15.3) to compute
the minimum cost m[1, n] for multiplying A1A2 … An. This recursive algorithm takes
exponential time, which is no better than the brute-force method of checking each way of
parenthesizing the product.
Observe that we have relatively few distinct subproblems: one subproblem for each choice of i
and j satisfying 1 ≤ i ≤ j ≤ n. A recursive algorithm may encounter each subproblem many times
in different branches of its recursion tree; dynamic programming instead computes each
subproblem only once and stores the answer in a table.
In order to implement the bottom-up approach, we must determine which entries of the
table we refer to when computing m[i, j]. Equation (15.3) shows that the cost m[i, j] of
computing a matrix-chain product of j – i + 1 matrices depends only on the costs of computing
matrix-chain products of fewer than j – i + 1 matrices. That is, for k = i, i + 1, …, j -1, the
matrix Ai..k is a product of k – i + 1 < j – i + 1 matrices and the matrix Ak+1..j is a product of j -
k < j – i + 1 matrices. Thus, the algorithm should fill in the table m in a manner that corresponds
to solving the parenthesization problem on matrix chains of increasing length. For the
subproblem of optimally parenthesizing the chain Ai Ai+1.. Aj, we consider the subproblem size
to be the length j – i + 1 of the chain.
The algorithm first computes m[i, i] = 0 for i = 1, 2, ..., n (the minimum costs for chains
of length 1) in lines 3–4. It then uses recurrence (15.3) to compute m[i, i + 1] for i = 1,2, ... ,
n – 1 (the minimum costs for chains of length l = 2) during the first execution of the for loop
in lines 5–13. The second time through the loop, it computes m[i, i + 2] for i = 1,2, ... , n – 2
(the minimum costs for chains of length l = 3), and so forth. At each step, the m[i, j] cost
computed in lines 10–13 depends only on table entries m[i, k] and m[k + 1, j] already computed.
Figure 15.1 The m and s tables computed by MATRIX-CHAIN-ORDER for n = 6 and the
following matrix dimensions (the dimension sequence is p = <30, 35, 15, 5, 10, 20, 25>, i.e.,
A1 is 30 × 35, A2 is 35 × 15, A3 is 15 × 5, A4 is 5 × 10, A5 is 10 × 20, and A6 is 20 × 25):
used. The figure shows the table rotated to make the main diagonal run horizontally. The matrix
chain is listed along the bottom. Using this layout, we can find the minimum cost m[i, j] for
multiplying a subchain Ai Ai+1.. Aj of matrices at the intersection of lines running northeast from
Ai and northwest from Aj. Each horizontal row in the table contains the entries for matrix chains
of the same length. MATRIX-CHAIN-ORDER computes the rows from bottom to top and
from left to right within each row. It computes each entry m[i, j] using the products Pi-1 Pk Pj
for k = i, i + 1, …, j -1 and all entries southwest and southeast from m[i, j].
A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields
a running time of O(n³) for the algorithm. The loops are nested three deep, and each loop index
(l, i, and k) takes on at most n - 1 values. MATRIX-CHAIN-ORDER is much more efficient
than the exponential-time method of enumerating all possible parenthesizations and checking
each one.
In the example of Figure 15.1, the call PRINT-OPTIMAL-PARENS (s,1,6) prints the
parenthesization ((A1(A2 A3))((A4 A5) A6)).
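The MATRIX-CHAIN-ORDER and PRINT-OPTIMAL-PARENS procedures described above can be sketched in Python as follows. This is a direct transcription of recurrence (15.3) rather than the printed pseudocode, and the dimension sequence p in the example is the one that yields the parenthesization shown above.

```python
def matrix_chain_order(p):
    """Bottom-up DP for recurrence (15.3); matrix A_i has shape p[i-1] x p[i]."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][j]: min scalar multiplications
    s = [[0] * (n + 1) for _ in range(n + 1)]   # s[i][j]: optimal split point k
    for length in range(2, n + 1):              # chain length j - i + 1
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = float("inf")
            for k in range(i, j):               # try every split A_i..k | A_k+1..j
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k

    return m, s

def print_optimal_parens(s, i, j):
    """Rebuild the optimal parenthesization from the split table s."""
    if i == j:
        return "A%d" % i
    k = s[i][j]
    return "(" + print_optimal_parens(s, i, k) + print_optimal_parens(s, k + 1, j) + ")"

p = [30, 35, 15, 5, 10, 20, 25]        # dimension sequence for n = 6 matrices
m, s = matrix_chain_order(p)
print(m[1][6])                         # 15125
print(print_optimal_parens(s, 1, 6))   # ((A1(A2A3))((A4A5)A6))
```

The triple loop mirrors the O(n³) analysis above: chain length, left endpoint, and split point each range over at most n values.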
The longest common subsequence (LCS) problem is to find the longest subsequence that occurs in both
of the given sequences.
Subsequence
A sequence Z = <z1, z2, z3, ..., zm> is called a subsequence of a sequence S if and only if Z can be
derived from S by deleting some elements of S without changing the order of the remaining elements.
Common Subsequence
Suppose, X and Y are two sequences over a finite set of elements. We can say that Z is a common
subsequence of X and Y, if Z is a subsequence of both X and Y.
If a set of sequences is given, the longest common subsequence problem is to find a common
subsequence of all the sequences that is of maximal length.
The longest common subsequence problem is a classic computer science problem, the basis of data
comparison programs such as the diff-utility, and has applications in bioinformatics. It is also widely
used by revision control systems, such as SVN and Git, for reconciling multiple changes made to a
revision-controlled collection of files.
Naïve Method
Let X be a sequence of length m and Y a sequence of length n. Check for every subsequence
of X whether it is a subsequence of Y, and return the longest common subsequence found.
Dynamic Programming
Let X = <x1, x2, x3, ..., xm> and Y = <y1, y2, y3, ..., yn> be the sequences. To compute the length of
the longest common subsequence of X and Y, the following algorithm is used.
In this procedure, table C[m, n] is computed in row-major order and another table B[m, n] is
computed to construct an optimal solution.
else
Print-LCS(B, X, i, j-1)
Analysis
To populate the table, the outer for loop iterates m times and the inner for loop iterates n times.
Hence, the complexity of the algorithm is O(mn), where m and n are the lengths of the two strings.
Example
In this example, we have two strings X = BACDB and Y = BDCB to find the longest common
subsequence.
Following the algorithm LCS-Length-Table-Formulation (as stated above), we have calculated table C
(shown on the left hand side) and table B (shown on the right hand side).
In table B, instead of ‘D’, ‘L’ and ‘U’, we use the diagonal arrow, left arrow and up arrow,
respectively. After generating table B, the LCS is determined by the function Print-LCS. The result is BCB.
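The table-filling procedure and traceback described above can be sketched in Python as follows (ties between the up and left arrows are resolved toward 'up', which matches the result BCB):

```python
def lcs(X, Y):
    """Fill table c, where c[i][j] is the LCS length of X[:i] and Y[:j],
    then trace back through the implicit arrows (as Print-LCS does)."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1            # diagonal arrow 'D'
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])  # up 'U' or left 'L'
    # Traceback from c[m][n], collecting matched characters in reverse.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1]); i -= 1; j -= 1         # follow 'D'
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1                                       # follow 'U'
        else:
            j -= 1                                       # follow 'L'
    return "".join(reversed(out))

print(lcs("BACDB", "BDCB"))   # BCB
```

As analyzed above, filling the table costs O(mn); the traceback adds only O(m + n).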
13.5 SUMMARY
In this unit, we looked into the concept of dynamic programming. Dynamic programming is
essentially about taking a series of decisions, one at every stage. This helps us to derive the
benefits of exhaustive search at a much lower cost. The approach was applied to several
practical problems, including matrix-chain multiplication and the longest common subsequence.
A recursive solution to the 0/1 knapsack problem using the dynamic programming approach was
presented, and the complexity of this algorithm was analyzed.
13.6 KEYWORDS
Dynamic programming
Knapsack problem
Recursive solution
Tuple method
Traceback
Complexity
Iterative solution
13.8 REFERENCES
UNIT – 14
GREEDY ALGORITHM
STRUCTURE
14.0 Objectives
14.4 Summary
14.5 Keywords
14.7 Reference
14.0 OBJECTIVES
14.1 GREEDY ALGORITHM
The greedy method is a method of choosing a subset of a dataset as the solution set that results in some profit.
Consider a problem having n inputs. We are required to obtain a solution which is a series of subsets
that satisfy some constraints or conditions. Any subset, which satisfies these constraints, is called a
feasible solution. It is required to obtain a feasible solution that maximizes or minimizes an objective
function. This feasible solution finally obtained is called optimal solution. The concept is called Greedy
because at each stage we choose the “best” available solution i.e., we are “greedy” about the output.
In the greedy strategy, one can devise an algorithm that works in stages, considering one input at a
time; at each stage, a decision is taken on whether the chosen data item leads to an optimal solution.
If the inclusion of a particular data item results in an optimal solution, then the item is added to
the partial solution set. On the other hand, if its inclusion results in an infeasible solution,
then the item is eliminated from the solution set.
Stated in simple terms, the greedy algorithm suggests that we should be “greedy” about the
intermediate solution i.e., if at any intermediate stage k different options are available to us, choose an
option which “maximizes” the output.
Sometimes the problem under greedy strategy could be to select a subset out of given n inputs,
and sometimes it could be to reorder the n data in some optimal sequence.
SELECT selects the best possible solution (or input) from the available inputs and includes it in the
solution. If it is feasible (in some cases, the constraints may not allow us to include it in the solution,
even if it produces the best results), then it is appended to the partially built solution. The whole process
is repeated till all the options are exhausted.
In the next few sections, we look into some of the applications of the Greedy method.
Our first example is the problem of scheduling a resource among several competing activities. We
shall find that a greedy algorithm provides an elegant and simple method for selecting a maximum-
size set of mutually compatible activities.
Suppose we have a set S = { 1, 2, . . . , n} of n proposed activities that wish to use a resource, such as
a lecture hall, which can be used by only one activity at a time. Each activity i has a start time si and
a finish time fi, where si ≤ fi. If selected, activity i takes place during the half-open time interval [si, fi).
Activities i and j are compatible if the intervals [si, fi) and [sj, fj) do not overlap (i.e., i and j are
compatible if si ≥ fj or sj ≥ fi). The activity-selection problem is to select a maximum-size set of
mutually compatible activities.
A greedy algorithm for the activity-selection problem is given in the following pseudocode. We assume
that the input activities are in order by increasing finishing time:
f1 ≤ f2 ≤ . . . ≤ fn .
(14.1)
If not, we can sort them into this order in time O(n lg n), breaking ties arbitrarily. The pseudocode
assumes that inputs s and f are represented as arrays.
GREEDY-ACTIVITY-SELECTOR(s, f)
1 n ← length[s]
2 A ← {1}
3 j ← 1
4 for i ← 2 to n
5     do if si ≥ fj
6         then A ← A ∪ {i}
7              j ← i
8 return A
The operation of the algorithm is shown in Figure 14.1. The set A collects the selected activities. The
variable j specifies the most recent addition to A. Since the activities are considered in order of
nondecreasing finishing time, fj is always the maximum finishing time of any activity in A. That is,
fj = max{fk : k ∈ A}.
(14.2)
Lines 2-3 select activity 1, initialize A to contain just this activity, and initialize j to this activity. Lines 4-
7 consider each activity i in turn and add i to A if it is compatible with all previously selected activities.
To see if activity i is compatible with every activity currently in A, it suffices by equation (14.2) to check
(line 5) that its start time si is not earlier than the finish time fj of the activity most recently added to A.
If activity i is compatible, then lines 6-7 add it to A and update j. The GREEDY-ACTIVITY-
SELECTOR procedure is quite efficient. It can schedule a set S of n activities in Θ(n) time, assuming
that the activities were already sorted initially by their finish times.
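The pseudocode above can be rendered in Python as the following sketch. The sample start and finish times are an assumption chosen to match the standard 11-activity example; they are not reproduced from Figure 14.1 itself.

```python
def greedy_activity_selector(s, f):
    """s[i], f[i]: start/finish time of activity i (1-indexed, sorted by f)."""
    n = len(s) - 1              # index 0 is unused, keeping 1-based numbering
    A = [1]                     # line 2: always select activity 1
    j = 1                       # line 3: most recent addition to A
    for i in range(2, n + 1):   # lines 4-7
        if s[i] >= f[j]:        # compatible with everything chosen so far
            A.append(i)
            j = i
    return A

# Assumed sample data for 11 activities (None pads index 0):
s = [None, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [None, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(greedy_activity_selector(s, f))   # [1, 4, 8, 11]
```

A single pass over the pre-sorted activities confirms the Θ(n) running time claimed above.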
Figure 14.1 The operation of GREEDY-ACTIVITY-SELECTOR on 11 activities given at the left. Each row
of the figure corresponds to an iteration of the for loop in lines 4-7. The activities that have been
selected to be in set A are shaded, and activity i, shown in white, is being considered. If the starting
time si of activity i occurs before the finishing time f j of the most recently selected activity j (the arrow
between them points left), it is rejected. Otherwise (the arrow points directly up or to the right), it is
accepted and put into set A.
The activity picked next by GREEDY-ACTIVITY-SELECTOR is always the one with the earliest finish time
that can be legally scheduled. The activity picked is thus a "greedy" choice in the sense that, intuitively,
it leaves as much opportunity as possible for the remaining activities to be scheduled. That is, the
greedy choice is the one that maximizes the amount of unscheduled time remaining.
Theorem 14.1 Algorithm GREEDY-ACTIVITY-SELECTOR produces solutions of maximum size for the
activity-selection problem.
Proof Let S = {1, 2, . . . , n} be the set of activities to schedule. Since we are assuming that the activities
are in order by finish time, activity 1 has the earliest finish time. We wish to show that there is an
optimal solution that begins with a greedy choice, that is, with activity 1.
Suppose that A ⊆ S is an optimal solution to the given instance of the activity-selection problem, and
let us order the activities in A by finish time. Suppose further that the first activity in A is activity k.
If k = 1, then schedule A begins with a greedy choice. If k ≠ 1, we want to show that there is another
optimal solution B to S that begins with the greedy choice, activity 1. Let B = A - {k} ∪ {1}.
Because f1 ≤ fk, the activities in B are disjoint, and since B has the same number of activities as A, it is
also optimal. Thus, B is an optimal solution for S that contains the greedy choice of activity 1.
Therefore, we have shown that there always exists an optimal schedule that begins with a greedy
choice.
Moreover, once the greedy choice of activity 1 is made, the problem reduces to finding an optimal
solution for the activity-selection problem over those activities in S that are compatible with activity
1. That is, if A is an optimal solution to the original problem S, then A' = A - {1} is an optimal solution
to the activity-selection problem S' = {i ∈ S : si ≥ f1}. Why? If we could find a solution B' to S' with more
activities than A', adding activity 1 to B' would yield a solution B to S with more activities than A,
thereby contradicting the optimality of A. Therefore, after each greedy choice is made, we are left
with an optimization problem of the same form as the original problem. By induction on the number
of choices made, making the greedy choice at every step produces an optimal solution.
A greedy algorithm obtains an optimal solution to a problem by making a sequence of choices. For
each decision point in the algorithm, the choice that seems best at the moment is chosen. This
heuristic strategy does not always produce an optimal solution, but as we saw in the activity-selection
problem, sometimes it does. This section discusses some of the general properties of greedy methods.
How can one tell if a greedy algorithm will solve a particular optimization problem? There is no way in
general, but there are two ingredients that are exhibited by most problems that lend themselves to a
greedy strategy: the greedy-choice property and optimal substructure.
Greedy-choice property
The first key ingredient is the greedy-choice property: a globally optimal solution can be arrived at by
making a locally optimal (greedy) choice. Here is where greedy algorithms differ from dynamic
programming. In dynamic programming, we make a choice at each step, but the choice may depend
on the solutions to subproblems. In a greedy algorithm, we make whatever choice seems best at the
moment and then solve the subproblems arising after the choice is made. The choice made by a greedy
algorithm may depend on choices so far, but it cannot depend on any future choices or on the
solutions to subproblems. Thus, unlike dynamic programming, which solves the subproblems bottom
up, a greedy strategy usually progresses in a top-down fashion, making one greedy choice after
another, iteratively reducing each given problem instance to a smaller one.
Of course, we must prove that a greedy choice at each step yields a globally optimal solution, and this
is where cleverness may be required. Typically, as in the case of Theorem 14.1, the proof examines a
globally optimal solution. It then shows that the solution can be modified so that a greedy choice is
made as the first step, and that this choice reduces the problem to a similar but smaller problem.
Then, induction is applied to show that a greedy choice can be used at every step. Showing that a
greedy choice results in a similar but smaller problem reduces the proof of correctness to
demonstrating that an optimal solution must exhibit optimal substructure.
Optimal substructure
A problem exhibits optimal substructure if an optimal solution to the problem contains within it
optimal solutions to subproblems. This property is a key ingredient of assessing the applicability of
dynamic programming as well as greedy algorithms. As an example of optimal substructure, recall that
the proof of Theorem 14.1 demonstrated that if an optimal solution A to the activity selection problem
begins with activity 1, then the set of activities A' = A - {1} is an optimal solution to the activity-
selection problem S' = {i ∈ S : si ≥ f1}.
The 0-1 knapsack problem is posed as follows. A thief robbing a store finds n items; the ith item is
worth vi dollars and weighs wi pounds, where vi and wi are integers. He wants to take as valuable a
load as possible, but he can carry at most W pounds in his knapsack for some integer W. What items
should he take? (This is called the 0-1 knapsack problem because each item must either be taken or
left behind; the thief cannot take a fractional amount of an item or take an item more than once.)
In the fractional knapsack problem, the setup is the same, but the thief can take fractions of items,
rather than having to make a binary (0-1) choice for each item. You can think of an item in the 0-1
knapsack problem as being like a gold ingot, while an item in the fractional knapsack problem is more
like gold dust.
Both knapsack problems exhibit the optimal-substructure property. For the 0-1 problem, consider the
most valuable load that weighs at most W pounds. If we remove item j from this load, the remaining
load must be the most valuable load weighing at most W - wj that the thief can take from the n - 1
original items excluding j. For the comparable fractional problem, consider that if we remove a
weight w of one item j from the optimal load, the remaining load must be the most valuable load
weighing at most W - w that the thief can take from the n - 1 original items plus wj - w pounds of
item j.
Although the problems are similar, the fractional knapsack problem is solvable by a greedy strategy,
whereas the 0-1 problem is not. To solve the fractional problem, we first compute the value per
pound vi/wi for each item. Obeying a greedy strategy, the thief begins by taking as much as possible
of the item with the greatest value per pound. If the supply of that item is exhausted and he can still
carry more, he takes as much as possible of the item with the next greatest value per pound, and so
forth until he can't carry any more. Thus, by sorting the items by value per pound, the greedy algorithm
runs in O(n lg n) time. The proof that the fractional knapsack problem has the greedy-choice property
is left as Exercise 14.3-1.
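Using the three items of Figure 14.2 (values 60, 100 and 120 dollars; weights 10, 20 and 30 pounds; W = 50), the greedy strategy just described can be sketched as:

```python
def fractional_knapsack(values, weights, W):
    """Take items (or fractions of them) in decreasing order of value per pound."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    total, remaining = 0.0, W
    for i in order:
        take = min(weights[i], remaining)        # whole item, or what still fits
        total += values[i] * take / weights[i]   # pro-rated value of the fraction
        remaining -= take
        if remaining == 0:
            break
    return total

print(fractional_knapsack([60, 100, 120], [10, 20, 30], 50))   # 240.0
```

The thief takes all of items 1 and 2 and 20 of the 30 pounds of item 3, for 60 + 100 + 80 = 240 dollars, matching Figure 14.2 (c).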
To see that this greedy strategy does not work for the 0-1 knapsack problem, consider the problem
instance illustrated in Figure 14.2(a). There are 3 items, and the knapsack can hold 50 pounds. Item 1
weighs 10 pounds and is worth 60 dollars. Item 2 weighs 20 pounds and is worth 100 dollars. Item 3
weighs 30 pounds and is worth 120 dollars. Thus, the value per pound of item 1 is 6 dollars per pound,
which is greater than the value per pound of either item 2 (5 dollars per pound) or item 3 (4 dollars
per pound). The greedy strategy, therefore, would take item 1 first. As can be seen from the case
analysis in Figure 14.2(b), however, the optimal solution takes items 2 and 3, leaving 1 behind. The
two possible solutions that involve item 1 are both suboptimal.
For the comparable fractional problem, however, the greedy strategy, which takes item 1 first, does
yield an optimal solution, as shown in Figure 14.2 (c). Taking item 1 doesn't work in the 0-1 problem
because the thief is unable to fill his knapsack to capacity, and the empty space lowers the effective
value per pound of his load. In the 0-1 problem, when we consider an item for inclusion in the
knapsack, we must compare the solution to the subproblem in which the item is included with the
solution to the subproblem in which the item is excluded before we can make the choice. The problem
formulated in this way gives rise to many overlapping subproblems--a hallmark of dynamic
programming, and indeed, dynamic programming can be used to solve the 0-1 problem. (See Exercise
14.3-2.)
Figure 14.2 The greedy strategy does not work for the 0-1 knapsack problem. (a) The thief must select
a subset of the three items shown whose weight must not exceed 50 pounds. (b) The optimal subset
includes items 2 and 3. Any solution with item 1 is suboptimal, even though item 1 has the greatest
value per pound. (c) For the fractional knapsack problem, taking the items in order of greatest value
per pound yields an optimal solution.
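For contrast, here is a minimal dynamic-programming sketch for the 0-1 problem on the same Figure 14.2 instance. It returns the optimal 220 dollars (items 2 and 3), whereas the greedy-by-ratio strategy described above achieves only 160 (items 1 and 2).

```python
def knapsack_01(values, weights, W):
    """Classic 0-1 knapsack DP: dp[w] = best value achievable with capacity w."""
    dp = [0] * (W + 1)
    for v, wt in zip(values, weights):
        # Iterate capacities downward so each item is used at most once.
        for w in range(W, wt - 1, -1):
            dp[w] = max(dp[w], dp[w - wt] + v)
    return dp[W]

print(knapsack_01([60, 100, 120], [10, 20, 30], 50))   # 220
```

Each of the O(nW) table entries compares the include and exclude subproblems, exactly the overlapping-subproblem structure noted in the paragraph above.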
14.4 SUMMARY
In this unit, we described the greedy method and the properties a problem must satisfy for a greedy
algorithm to work, illustrated the approach with the activity-selection problem, and contrasted the
fractional and 0-1 knapsack problems to show when the greedy strategy succeeds and when it fails.
14.5 KEYWORDS
Greedy Heuristic
Data structure
Complexity, Greedy algorithm
14.6 QUESTIONS FOR SELF STUDY
14.7 REFERENCES
1) Fundamentals of Algorithmics: Gilles Brassard and Paul Bratley, Prentice Hall Englewood
Cliffs, New Jersey 07632.
2) Sartaj Sahni, 2000, Data structures, Algorithms and Applications in C++, McGraw Hill
International Edition.
3) Goodman And Hedetniemi, 1987, Introduction to the Design and Analysis of Algorithms,
Mcgraw Hill International Editions.
UNIT – 15
HUFFMAN CODES
STRUCTURE
15.0 Objectives
15.3 Summary
15.4 Keywords
15.6 Reference
15.0 OBJECTIVES
Define NP complete
15.1 HUFFMAN CODES
Huffman code is a particular type of optimal prefix code that is commonly used for lossless data
compression. It compresses data very effectively, saving 20% to 90% of memory depending on the
characteristics of the data being compressed. We consider the data to be a sequence of characters.
Huffman's greedy algorithm uses a table giving how often each character occurs (i.e., its frequency)
to build up an optimal way of representing each character as a binary string. Huffman code was
proposed by David A. Huffman in 1951. Suppose we have a 100,000-character data file that we wish
to store compactly. We assume that there are only 6 different characters in that file. The frequencies of
the characters are given by:
+------------------------+-----+-----+-----+-----+-----+-----+
| Character | a | b | c | d | e | f |
+------------------------+-----+-----+-----+-----+-----+-----+
|Frequency (in thousands)| 45 | 13 | 12 | 16 | 9 | 5 |
+------------------------+-----+-----+-----+-----+-----+-----+
We have many options for how to represent such a file of information. Here, we consider the problem
of designing a Binary Character Code in which each character is represented by a unique binary string,
which we call a codeword.
+------------------------+-----+-----+-----+-----+-----+-----+
| Character | a | b | c | d | e | f |
+------------------------+-----+-----+-----+-----+-----+-----+
| Fixed-length Codeword | 000 | 001 | 010 | 011 | 100 | 101 |
+------------------------+-----+-----+-----+-----+-----+-----+
|Variable-length Codeword| 0 | 101 | 100 | 111 | 1101| 1100|
+------------------------+-----+-----+-----+-----+-----+-----+
If we use a fixed-length code, we need three bits to represent 6 characters. This method requires
300,000 bits to code the entire file. Now the question is, can we do better?
A variable-length code can do considerably better than a fixed-length code, by giving frequent
characters short codewords and infrequent characters long codewords. This code requires: (45 X 1 +
13 X 3 + 12 X 3 + 16 X 3 + 9 X 4 + 5 X 4) X 1000 = 224000 bits to represent the file, which saves
approximately 25% of memory.
One thing to remember, we consider here only codes in which no codeword is also a prefix of some
other codeword. These are called prefix codes. For variable-length coding, we code the 3-character
file abc as 0.101.100 = 0101100, where "." denotes the concatenation.
Prefix codes are desirable because they simplify decoding. Since no codeword is a prefix of any other,
the codeword that begins an encoded file is unambiguous. We can simply identify the initial codeword,
translate it back to the original character, and repeat the decoding process on the remainder of the
encoded file. For example, 001011101 parses uniquely as 0.0.101.1101, which decodes to aabe. In
short, all the combinations of binary representations are unique. Say for example, if one letter is
denoted by 110, no other letter will be denoted by 1101 or 1100. This is because you might face
confusion on whether to select 110 or to continue on concatenating the next bit and select that one.
Compression Technique:
The technique works by creating a binary tree of nodes. These can be stored in a regular array, the size
of which depends on the number of symbols, n. A node can be either a leaf node or an internal node.
Initially all nodes are leaf nodes, which contain the symbol itself, its frequency and optionally, a link
to its child nodes. As a convention, bit '0' represents a left child and bit '1' represents a right child. A
priority queue is used to store the nodes; it yields the node with the lowest frequency when popped. The
process is described below:
1. Create a leaf node for each symbol and add it to the priority queue.
2. While there is more than one node in the queue:
1. Remove the two nodes of highest priority from the queue.
2. Create a new internal node with these two nodes as children and with frequency
equal to the sum of the two nodes' frequency.
3. Add the new node to the queue.
3. The remaining node is the root node and the Huffman tree is complete.
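The construction steps above can be sketched with Python's heapq module. For the six-character frequency table given earlier, the resulting code lengths (1 bit for a; 3 bits for b, c, d; 4 bits for e, f) reproduce the 224,000-bit total computed above.

```python
import heapq

def huffman_code_lengths(freq):
    """Build the Huffman tree bottom-up and return {symbol: codeword length}."""
    # Heap entries: (frequency, tie-breaker, {symbol: depth so far}).
    heap = [(f, i, {sym: 0}) for i, (sym, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:                 # steps 2.1-2.3 of the process above
        f1, _, d1 = heapq.heappop(heap)  # the two lowest-frequency nodes
        f2, _, d2 = heapq.heappop(heap)
        # Merging pushes every symbol in either subtree one level deeper.
        merged = {sym: d + 1 for sym, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, count, merged))
        count += 1
    return heap[0][2]

freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
lengths = huffman_code_lengths(freq)
print(lengths)
print(sum(freq[s] * lengths[s] for s in freq))   # 224 (thousands of bits)
```

Tracking only depths (codeword lengths) is enough to compute the compressed size; assigning the actual 0/1 labels follows the left/right convention stated above.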
example:
Z.left = x = Q.pop
Z.right = y = Q.pop
Z.frequency = x.frequency + y.frequency
Q.push(Z)
end while
Return Q
Although this algorithm is linear-time given sorted input, for arbitrary input it requires pre-sorting.
Thus, since sorting takes O(n log n) time in general, both methods have the same overall complexity.
Since n here is the number of symbols in the alphabet, which is typically a very small number (compared
to the length of the message to be encoded), time complexity is not very important in the choice of
this algorithm.
Decompression Technique:
The process of decompression is simply a matter of translating the stream of prefix codes to individual
byte value, usually by traversing the Huffman tree node by node as each bit is read from the input
stream. Reaching a leaf node necessarily terminates the search for that particular byte value. The leaf
value represents the desired character. Usually the Huffman Tree is constructed using statistically
adjusted data on each compression cycle, thus the reconstruction is fairly simple. Otherwise, the
information to reconstruct the tree must be sent separately. The pseudo-code:
current := current.right
endif
i := i+1
endwhile
print current.symbol
endfor
Greedy Explanation:
Huffman coding looks at the occurrence of each character and stores it as a binary string in an optimal
way. The idea is to assign variable-length codes to input characters; the lengths of the assigned codes
are based on the frequencies of the corresponding characters. We create a binary tree and operate on it
in bottom-up manner so that the least two frequent characters are as far as possible from the root. In
this way, the most frequent character gets the smallest code and the least frequent character gets the
largest code.
A problem is in the class NPC if it is in NP and is as hard as any problem in NP. A problem is NP-hard if
all problems in NP are polynomial time reducible to it, even though it may not be in NP itself.
If a polynomial time algorithm exists for any of these problems, all problems in NP would be
polynomial time solvable. These problems are called NP-complete. The phenomenon of NP-
completeness is important for both theoretical and practical reasons.
Definition of NP-Completeness
A language B is NP-complete if it satisfies two conditions
B is in NP
Every A in NP is polynomial time reducible to B.
If a language satisfies the second property, but not necessarily the first one, the language B is known
as NP-Hard. Informally, a search problem B is NP-Hard if there exists some NP-
Complete problem A that Turing reduces to B.
A problem in NP-Hard cannot be solved in polynomial time unless P = NP. If a problem is proved to
be NPC, there is no need to waste time trying to find an efficient exact algorithm for it. Instead, we can
focus on designing approximation algorithms.
NP-Complete Problems
Following are some NP-Complete problems, for which no polynomial time algorithm is known.
Determining whether a graph has a Hamiltonian cycle
Determining whether a Boolean formula is satisfiable, etc.
NP-Hard Problems
The following problems are NP-Hard
The circuit-satisfiability problem
Set Cover
Vertex Cover
Travelling Salesman Problem
In this context, we will now show that TSP is NP-Complete.
TSP is NP-Complete
The traveling salesman problem consists of a salesman and a set of cities. The salesman has to visit
each one of the cities, starting from a certain one and returning to the same city. The challenge of the
problem is that the traveling salesman wants to minimize the total length of the trip.
Proof
To prove TSP is NP-Complete, first we have to prove that TSP belongs to NP. Given a proposed tour
as a certificate, we check that the tour contains each vertex exactly once. Then the total cost of the
edges of the tour is calculated, and we check whether the cost is at most a given bound. This can all be
completed in polynomial time. Thus TSP belongs to NP.
Secondly, we have to prove that TSP is NP-hard. To prove this, one way is to show that Hamiltonian
cycle ≤p TSP (as we know that the Hamiltonian cycle problem is NP-complete).
Assume G = (V, E) to be an instance of Hamiltonian cycle.
Hence, an instance of TSP is constructed. We create the complete graph G' = (V, E'), where

E' = {(i, j) : i, j ∈ V and i ≠ j}.

Thus, the cost function is defined as follows:

t(i, j) = 0 if (i, j) ∈ E, and t(i, j) = 1 otherwise.
Now, suppose that a Hamiltonian cycle h exists in G. It is clear that the cost of each edge
in h is 0 in G' as each edge belongs to E. Therefore, h has a cost of 0 in G'. Thus, if graph G has a
Hamiltonian cycle, then graph G' has a tour of 0 cost.
Conversely, we assume that G' has a tour h' of cost at most 0. The cost of edges in E' are 0 and 1 by
definition. Hence, each edge must have a cost of 0 as the cost of h' is 0. We therefore conclude
that h' contains only edges in E.
We have thus proven that G has a Hamiltonian cycle, if and only if G' has a tour of cost at most 0. TSP
is NP-complete.
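The reduction can be exercised on a small instance: we build the cost function t of G' from a graph G and brute-force the cheapest tour. The two tiny graphs below are illustrative examples, not taken from the text; the 4-cycle has a Hamiltonian cycle, so its G' has a tour of cost 0, while removing one edge forces every tour to cost at least 1.

```python
from itertools import permutations

def min_tour_cost(V, E):
    """Cheapest tour of G' = (V, all pairs), where t(i, j) = 0 if (i, j) is an
    edge of G and 1 otherwise, as in the reduction's cost function."""
    edges = {frozenset(e) for e in E}
    t = lambda i, j: 0 if frozenset((i, j)) in edges else 1
    first, rest = V[0], V[1:]
    best = float("inf")
    for perm in permutations(rest):       # fix the start vertex; try every tour
        tour = (first,) + perm + (first,)
        best = min(best, sum(t(a, b) for a, b in zip(tour, tour[1:])))
    return best

cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]       # G has a Hamiltonian cycle
print(min_tour_cost([0, 1, 2, 3], cycle))      # 0
print(min_tour_cost([0, 1, 2, 3], cycle[:-1])) # 1: no Hamiltonian cycle remains
```

The brute force is exponential, of course; the point is only to observe that a 0-cost tour of G' exists exactly when G has a Hamiltonian cycle.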
The differences between NP-Hard and NP-Complete problems can be summarized as follows.

Meaning and Definition:
    NP-Hard: A problem X is NP-Hard when some NP-Complete problem Y is reducible to X in
    polynomial time.
    NP-Complete: A problem X is NP-Complete when it is in NP and every problem Y in NP is
    reducible to X in polynomial time.

Presence in NP:
    NP-Hard: An NP-Hard problem does not itself have to be in NP.
    NP-Complete: An NP-Complete problem must be both NP-Hard and in NP.

Decision Problem:
    NP-Hard: This type of problem need not be a decision problem.
    NP-Complete: This type of problem is always a decision problem (exclusively).

Example:
    NP-Hard: Circuit satisfiability, Vertex Cover, the Halting problem, etc.
    NP-Complete: Determining whether a graph has a Hamiltonian cycle, determining whether a
    Boolean formula is satisfiable, etc.
15.4 SUMMARY
In this unit, we have described Huffman codes, introduced the classes NP-complete and NP-hard through
selected problems, and differentiated between NP-hard and NP-complete.
15.5 KEYWORDS
Greedy Explanation
NP-Hard
NP Problems
Vertex Cover
15.7 REFERENCES
[1] M. Gupta and B. Kumar. Web Page Compression using Huffman Coding Technique. Department of CSE,
JMIT Radaur, and Department of IT, Lingaya's University.
[2] I. M. Pu. Fundamental Data Compression.
[3] A Fast Adaptive Huffman Coding Algorithm. IEEE Transactions on Communications, Vol. 41, Issue 4.
[4] S. Aaronson. Is P versus NP formally independent? Bulletin of the EATCS, (81), October 2003.
UNIT – 16
NP COMPLETE PROBLEMS
STRUCTURE
16.0 Objectives
16.1 NP-Completeness and Polynomial Time
16.2 Polynomial Time Verification
16.6 Summary
16.7 Keywords
16.9 References
16.0 OBJECTIVES
16.1 NP-COMPLETENESS AND POLYNOMIAL TIME
There are several ways that a problem could be considered hard. For example, we might have trouble
understanding the definition of the problem itself. At the beginning of a large data collection and
analysis project, developers and their clients might have only a hazy notion of what their goals actually
are, and need to work that out over time. For other types of problems, we might have trouble finding
or understanding an algorithm to solve the problem. Understanding spoken English and translating it
to written text is an example of a problem whose goals are easy to define, but whose solution is not
easy to discover. But even though a natural language processing algorithm might be difficult to write,
the program’s running time might be fairly fast. There are many practical systems today that solve
aspects of this problem in reasonable time.
None of these is what is commonly meant when a computer theoretician uses the word “hard”.
Throughout this section, “hard” means that the best-known algorithm for the problem is expensive in
its running time. One example of a hard problem is Towers of Hanoi. It is easy to understand this
problem and its solution. It is also easy to write a program to solve this problem. But, it takes an
extremely long time to run for any “reasonably” large value of n. Try running a program to solve
Towers of Hanoi for only 30 disks!
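The exponential move count is easy to observe directly. A small sketch (the function name and move-list representation are illustrative):

```python
def hanoi(n, src="A", dst="C", aux="B", moves=None):
    """Solve Towers of Hanoi, returning the list of moves.
    The number of moves is 2^n - 1, so the running time is Θ(2^n)."""
    if moves is None:
        moves = []
    if n > 0:
        hanoi(n - 1, src, aux, dst, moves)   # move n-1 disks out of the way
        moves.append((src, dst))             # move the largest disk
        hanoi(n - 1, aux, dst, src, moves)   # move n-1 disks back on top
    return moves

for n in (5, 10, 15):
    print(n, len(hanoi(n)))  # 31, 1023, 32767 -- doubling with each extra disk
```

For 30 disks the move count exceeds one billion, which is why the program above is impractical for even modest n.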
The Towers of Hanoi problem takes exponential time; that is, its running time is Θ(2^n). This is
radically different from an algorithm that takes Θ(n log n) time or Θ(n^2) time. It is even
radically different from a problem that takes Θ(n^4) time. These are all examples of polynomial
running time, because the exponents for all terms of these equations are constants. If we buy a new
computer that runs twice as fast, the size of problem with complexity Θ(n^4) that we can solve in
a certain amount of time is increased by the fourth root of two. In other words, there is a multiplicative
factor increase, even if it is a rather small one. This is true for any algorithm whose running time can
be represented by a polynomial.
Consider what happens if you buy a computer that is twice as fast and try to solve a bigger Towers of
Hanoi problem in a given amount of time. Because its complexity is Θ(2^n), we can solve a
problem only one disk bigger! There is no multiplicative factor, and this is true for any exponential
algorithm: A constant factor increase in processing power results in only a fixed addition in problem-
solving power.
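The contrast between a multiplicative increase and a plus-one increase can be checked with a little arithmetic. A sketch (the "step budget" framing and function names are assumptions made for illustration):

```python
def max_n(steps, cost):
    """Largest problem size n whose cost fits within the given step budget."""
    n = 0
    while cost(n + 1) <= steps:
        n += 1
    return n

budget = 10**12  # hypothetical number of steps our old computer can perform

# Doubling the budget (a computer twice as fast) for a Θ(n^4) algorithm
# multiplies the solvable size by the fourth root of two (~1.19):
print(max_n(budget, lambda n: n**4), "->", max_n(2 * budget, lambda n: n**4))
# 1000 -> 1189

# For a Θ(2^n) algorithm, the same doubling adds exactly one to n:
print(max_n(budget, lambda n: 2**n), "->", max_n(2 * budget, lambda n: 2**n))
# 39 -> 40
```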
There are a number of other fundamental differences between polynomial running times and
exponential running times that argue for treating them as qualitatively different. Polynomials are
closed under composition and addition. Thus, running polynomial-time programs in sequence, or
having one program with polynomial running time call another a polynomial number of times yields
polynomial time. Also, all computers known are polynomially related. That is, any program that runs
in polynomial time on any computer today, when transferred to any other computer, will still run in
polynomial time.
There is a practical reason for recognizing this distinction. In practice, most polynomial-time algorithms
are “feasible” in that they can handle reasonably large inputs in reasonable time. In contrast, most
algorithms requiring exponential time are not practical to run even for fairly modest sizes of input.
One could argue that a program with high polynomial degree (such as n^100) is not practical, while
an exponential-time program with cost 1.001^n is practical. But the reality is that we know of
almost no problems where the best polynomial-time algorithm has high degree (they nearly all have
degree four or less), while almost no exponential-time algorithms (whose cost is O(c^n)) have
their constant c close to one. So there is not much gray area between polynomial and exponential
time algorithms in practice.
For the purposes of this Module, we define a hard algorithm to be one that runs in exponential time,
that is, in Ω(c^n) for some constant c > 1. A definition for a hard problem will be presented
soon.
soon.
The Theory of NP-Completeness
Imagine a magical computer that works by guessing the correct solution from among all of the possible
solutions to a problem. Another way to look at this is to imagine a super parallel computer that could
test all possible solutions simultaneously. Certainly this magical (or highly parallel) computer can do
anything a normal computer can do. It might also solve some problems more quickly than a normal
computer can. Consider some problem where, given a guess for a solution, checking the solution to
see if it is correct can be done in polynomial time. Even if the number of possible solutions is
exponential, any given guess can be checked in polynomial time (equivalently, all possible solutions
are checked simultaneously in polynomial time), and thus the problem can be solved in polynomial
time by our hypothetical magical computer. Another view of this concept is this: If you cannot get the
answer to a problem in polynomial time by guessing the right answer and then checking it, then you
cannot do it in polynomial time in any other way.
The idea of “guessing” the right answer to a problem—or checking all possible solutions in parallel to
determine which is correct—is called a non-deterministic choice. An algorithm that works in this
manner is called a non-deterministic algorithm, and any problem with an algorithm that runs on a
non-deterministic machine in polynomial time is given a special name: It is said to be a problem in NP.
Thus, problems in NP are those problems that can be solved in polynomial time on a non-deterministic
machine.
Not all problems requiring exponential time on a regular computer are in NP. For example, Towers of
Hanoi is not in NP, because it must print out O(2^n) moves for n disks. A non-deterministic
machine cannot “guess” and print the correct answer in less time.
On the other hand, consider the TRAVELING SALESMAN problem.
Problem
TRAVELING SALESMAN 1
Input: A complete, directed graph GG with positive distances assigned to each edge in the graph.
Figure 16.1 illustrates this problem. Five vertices are shown, with edges and associated costs between
each pair of vertices. (For simplicity, Figure 16.1 shows an undirected graph, assuming that the cost is
the same in both directions, though this need not be the case.) If the salesman visits the cities in the
order ABCDEA, they will travel a total distance of 13. A better route would be ABDCEA, with cost 11.
The best route for this particular graph would be ABEDCA, with cost 9.
Figure 16.1 An illustration of the TRAVELING SALESMAN problem. Five vertices are shown, with edges
between each pair of cities. The problem is to visit all of the cities exactly once, returning to the start
city, with the least total cost.
We cannot solve this problem in polynomial time with a guess-and-test non-deterministic computer.
The problem is that, given a candidate cycle, while we can quickly check that the answer is indeed a
cycle of the appropriate form, and while we can quickly calculate the length of the cycle, we have no
easy way of knowing if it is in fact the shortest such cycle. However, we can solve a variant of this
problem cast in the form of a decision problem. A decision problem is simply one whose answer is
either YES or NO. The decision problem form of TRAVELING SALESMAN is as follows.
16.2 POLYNOMIAL TIME VERIFICATION
In order to define NP-completeness, we need to first define NP. Unfortunately, providing a rigorous
definition of NP will involve a presentation of the notion of nondeterministic models of computation,
and will take us away from our main focus. (Formally, NP stands for nondeterministic polynomial
time.) Instead, we will present a very simple, “hand-wavy” definition, which will suffice for our
purposes. To do so, it is important to first introduce the notion of a verification algorithm. Many
language recognition problems may be hard to solve, but they have the property that it is easy
to verify that a string is in the language. Recall the Hamiltonian cycle problem defined above. As
we saw, there is no obviously efficient way to find a Hamiltonian cycle in a graph. However, suppose
that a graph did have a Hamiltonian cycle and someone wanted to convince us of its existence. This
person would simply tell us the vertices in the order that they appear along the cycle. It would be a
very easy matter for us to inspect the graph and check that this is indeed a legal cycle and that it visits all
the vertices exactly once. Thus, even though we know of no efficient way to solve the Hamiltonian
cycle problem, there is a very efficient way to verify that a given graph has one.
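Such a verification can be sketched as a short polynomial-time procedure (the graph encoding and function name are illustrative assumptions):

```python
def verify_hamiltonian_cycle(n, edges, certificate):
    """Polynomial-time verifier: given a graph on vertices 0..n-1 (edges as
    a set of pairs) and a claimed vertex ordering, check that the ordering
    is a Hamiltonian cycle."""
    if sorted(certificate) != list(range(n)):   # every vertex exactly once
        return False
    undirected = edges | {(v, u) for (u, v) in edges}
    # consecutive vertices (wrapping around) must be joined by an edge
    return all((certificate[i], certificate[(i + 1) % n]) in undirected
               for i in range(n))

square = {(0, 1), (1, 2), (2, 3), (3, 0)}
print(verify_hamiltonian_cycle(4, square, [0, 1, 2, 3]))  # True
print(verify_hamiltonian_cycle(4, square, [0, 2, 1, 3]))  # False
```

The check is quadratic at worst in the graph size—comfortably polynomial—even though finding such a cycle appears hard.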
The given cycle in the above example is called a certificate. A certificate is a piece of information which
allows us to verify that a given string is in a language in polynomial time. More formally, given a
language L, and given x ∈ L, a verification algorithm is an algorithm which, given x and a string y called
the certificate, can verify that x is in the language L using this certificate as help. If x is not in L then
there is nothing to verify. If there exists a verification algorithm that runs in polynomial time, we say
that L can be verified in polynomial time. Note that not all languages have the property that they are
easy to verify. For example, consider the language UHC of graphs that have exactly one Hamiltonian cycle.
There is no known polynomial time verification algorithm for this language. For example, suppose
that a graph G is in the language UHC. What information would someone give us that would allow us
to verify that G is indeed in the language? They could certainly show us one Hamiltonian cycle, but it
is unclear that they could provide us with any easily verifiable piece of information that would
demonstrate that this is the only one.
The class NP: We can now define the complexity class NP.
Definition: NP is the set of all languages that can be verified in polynomial time.
Observe that if we can solve a problem efficiently without a certificate, we can certainly solve it given
the additional help of a certificate. Therefore, P ⊆ NP. However, it is not known whether P = NP. It
seems unreasonable to think that this should be so. In other words, just being able to verify that you
have a correct solution does not help you in finding the actual solution very much. Most experts
believe that P ≠ NP, but no one has a proof of this. Next, we will define the notions of NP-hard
and NP-complete. There is one last ingredient that will be needed before defining NP-completeness,
namely the notion of a polynomial time reduction.
The most compelling reason why theoretical computer scientists believe that P ≠ NP is the existence
of the class of "NP-complete" problems. This class has the surprising property that if any NP-complete
problem can be solved in polynomial time, then every problem in NP has a polynomial-time solution,
that is, P = NP. Despite years of study, though, no polynomial-time algorithm has ever been discovered
for any NP-complete problem.
The language HAM-CYCLE is one NP-complete problem. If we could decide HAM-CYCLE in polynomial
time, then we could solve every problem in NP in polynomial time. In fact, if NP - P should turn out to
be nonempty, we could say with certainty that HAM-CYCLE ∈ NP - P.
The NP-complete languages are, in a sense, the "hardest" languages in NP. In this section, we shall
show how to compare the relative "hardness" of languages using a precise notion called "polynomial-
time reducibility." Then we formally define the NP-complete languages, and we finish by sketching a
proof that one such language, called CIRCUIT-SAT, is NP-complete.
Reducibility
Intuitively, a problem Q can be reduced to another problem Q′ if any instance of Q can be "easily
rephrased" as an instance of Q′, the solution to which provides a solution to the instance of Q. For
example, the problem of solving linear equations in an indeterminate x reduces to the problem of
solving quadratic equations. Given an instance ax + b = 0, we transform it to 0x2 + ax + b = 0, whose
solution provides a solution to ax + b = 0. Thus, if a problem Q reduces to another problem Q′,
then Q is, in a sense, "no harder to solve" than Q′.
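The linear-to-quadratic example can be made concrete. A sketch (the solver interface is a hypothetical stand-in for the quadratic-equation "oracle"; it handles the degenerate leading coefficient so that reduced instances are accepted):

```python
import math

def quadratic_roots(a, b, c):
    """Oracle for solving a*x^2 + b*x + c = 0 over the reals (handles the
    degenerate case a == 0 as a linear equation)."""
    if a == 0:
        return [-c / b] if b != 0 else []
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})

def solve_linear(a, b):
    """Reduce the instance a*x + b = 0 to the quadratic 0*x^2 + a*x + b = 0
    and hand it to the quadratic solver, as in the text."""
    return quadratic_roots(0, a, b)

print(solve_linear(2, -6))  # [3.0]
```

The transformation step (prepending a zero coefficient) is trivially polynomial, which is what makes linear equations "no harder" than quadratic ones.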
Formally, language L1 is polynomial-time reducible to language L2 (written L1 ≤p L2) if there is a
polynomial-time computable function f such that x ∈ L1 if and only if f(x) ∈ L2.
We call the function f the reduction function, and a polynomial-time algorithm F that computes f is
called a reduction algorithm.
Figure 16.3 illustrates the idea of a polynomial-time reduction from a language L1 to another
language L2. Each language is a subset of {0, 1}*. The reduction function f provides a polynomial-time
mapping such that if x ∈ L1, then f(x) ∈ L2. Moreover, if x ∉ L1, then f (x) ∉ L2. Thus, the reduction
function maps any instance x of the decision problem represented by the language L1 to an
instance f (x) of the problem represented by L2. Providing an answer to whether f(x) ∈ L2 directly
provides the answer to whether x ∈ L1.
Polynomial-time reductions give us a powerful tool for proving that various languages belong to P.
We would not want to do something like the above for every proof of NP-completeness! Fortunately,
we can rely on the fact that polynomial-time reduction is transitive: if L1 ≤p L2 and L2 ≤p L3,
then L1 ≤p L3.
Transitivity follows from the definitions and from the fact that the composition of two polynomials
is itself polynomial.
This means that we can prove that other problems are in NPC without having to reduce every possible
problem to them. The general procedure for proving that L is in NPC is:
1. Show that L ∈ NP, i.e., a proposed solution for L can be verified in polynomial time.
2. Select a problem L' that is already known to be NP-complete.
3. Give a polynomial-time reduction that maps every instance of L' to some instance of L.
Important: Why doesn't the reverse, mapping every instance of L to some instance of L', work?
Because we want to show that L can be used to solve every problem in NP. We are doing this via L',
which already has this property, so we have to be able to solve every instance of L'.
The CLRS text steps through reduction of problems as shown in the figure. We do not have time to go
through the proofs in detail, so we just indicate the general nature of the reductions. In studying the
following you should become aware of the diversity of NPC problems, and also get the general idea of
how reductions work in case in the future you encounter a potential NPC problem (such as iThingy
configuration!).
Satisfiability (SAT)
An instance of SAT is a boolean formula φ. A truth assignment is a set of values for the variables
of φ, and a satisfying assignment is a truth assignment under which φ evaluates to 1 (true). SAT asks
whether a given formula has a satisfying assignment.
SAT ∈ NP: There are 2^n possible assignments, but a given assignment can be checked in polynomial
time.
SAT is NP-Hard: CIRCUIT-SAT is reduced to SAT in polynomial time through a construction that turns
each CIRCUIT-SAT gate into a small logical formula for SAT:
The resulting boolean formula is satisfied just when the circuit is satisfied. (You can verify that the
formula shown is equivalent to the circuit.)
x10 ∧ (x4 ↔ ¬ x3) ∧ (x5 ↔ (x1 ∨ x2)) ∧ (x6 ↔ ¬ x4) ∧ (x7 ↔ (x1 ∧ x2 ∧ x4)) ∧ (x8 ↔ (x5 ∨ x6)) ∧ (x9 ↔
(x6 ∨ x7)) ∧ (x10 ↔ (x7 ∧ x8 ∧ x9))
This shows that we can reduce an arbitrary instance of CIRCUIT-SAT to a specialized instance of SAT in
polynomial time. That means if we can solve SAT we can solve any instance of CIRCUIT-SAT in
polynomially related time, and since we know that CIRCUIT-SAT is NPC, transitively we can use SAT to
solve any instance of any problem in NP: SAT ← CIRCUIT-SAT and CIRCUIT-SAT ← Problems in
NP implies SAT ← Problems in NP.
Furthermore, only a polynomial cost is incurred in the translation, so the time required to solve SAT is
polynomially related to that of the problems in NP. If we can solve SAT in polynomial time we can
solve any problem in NP in polynomial time!
Mapping an arbitrary instance of SAT to a specialized instance of CIRCUIT-SAT would not work. Such a
reduction would go in the wrong direction to give logical transitivity: CIRCUIT-SAT ← SAT and CIRCUIT-
SAT ← Problems in NP does not let us infer SAT ← Problems in NP.
Reduction proofs require that we handle any possible case of a known NPC problem. It would be
complicated to handle all the possible forms of SAT formulas, so it is useful to have a more restricted
logical form for the target for reduction proofs. 3-CNF serves this purpose.
A literal in a boolean formula is an occurrence of a variable or its negation, such as x1 and ¬x1
A boolean formula is in conjunctive normal form (CNF) if it is a conjunction of clauses, each of which
is the disjunction of one or more literals.
A boolean formula is in 3-conjunctive normal form (3-CNF) if each clause has exactly three distinct
literals. For example:
(x1 ∨ ¬x1 ∨ ¬x2) ∧ (x3 ∨ x2 ∨ x4) ∧ (¬x1 ∨ ¬x3 ∨ ¬x4)
3-CNF-SAT asks whether a boolean formula in 3-CNF is satisfiable by an assignment of truth values
to the variables.
3-CNF-SAT ∈ NP: There is an exponential number of possible variable assignments, but a given one
can be checked in polynomial time merely by substituting and evaluating the expression.
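The substitute-and-evaluate check can be sketched in a few lines (the integer literal encoding, where k means x_k and -k means ¬x_k, is an assumption for illustration):

```python
def satisfies_3cnf(clauses, assignment):
    """Polynomial-time verifier for 3-CNF-SAT.  clauses is a list of triples
    of nonzero ints (k means x_k, -k means its negation); assignment maps
    variable index -> bool.  Each clause needs at least one true literal."""
    def lit(l):
        v = assignment[abs(l)]
        return v if l > 0 else not v
    return all(any(lit(l) for l in clause) for clause in clauses)

phi = [(1, -2, -3), (-1, -2, -3), (-1, -2, 3), (1, 2, 3)]
print(satisfies_3cnf(phi, {1: True, 2: False, 3: True}))   # True
print(satisfies_3cnf(phi, {1: True, 2: True, 3: False}))   # False
```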
3-CNF-SAT is NP-Hard: SAT can be reduced to 3-CNF-SAT through a polynomial-time process of:
1. parsing the SAT expression into a binary tree with literals as leaves and connectives as
internal nodes;
2. introducing a variable yi for the output of each internal node;
3. rewriting as the conjunction of the root variable and a clause for each node of the binary
tree (yi ↔ the literal for its child nodes);
4. converting each clause to conjunctive normal form (see text), first by converting to
disjunctive normal form and then applying DeMorgan's laws to convert to CNF; and then
5. supplying dummy variables as needed to convert clauses of 1 or 2 variables into 3-CNF.
¬(a ∧ b) ≡ ¬a ∨ ¬b
¬(a ∨ b) ≡ ¬a ∧ ¬b
resulting in:
(¬y1 ∨ ¬y2 ∨ ¬x2) ∧ (¬y1 ∨ y2 ∨ ¬x2) ∧ (¬y1 ∨ y2 ∨ x2) ∧ (y1 ∨ ¬y2 ∨ x2).
PROBLEM-3- CLIQUE
A clique in an undirected graph G = (V, E) is a subset V' ⊆ V, each pair of which is connected by an
edge in E (a complete subgraph of G). (Examples of cliques of sizes between 2 and 7 are shown on the
right.)
The clique problem is the problem of finding a clique of maximum size in G. This can be converted to
a decision problem by asking whether a clique of a given size k exists in the graph:
CLIQUE = {⟨G, k⟩ : G is a graph containing a clique of size k}
CLIQUE ∈ NP: One can check a solution in polynomial time. (Given a set of proposed vertices, how
could you check that they are a CLIQUE?)
CLIQUE is NP-Hard: 3-CNF-SAT is reduced to CLIQUE by building a graph G with one vertex per literal
occurrence and an edge between two vertices whenever they lie in different clauses and their literals
are consistent (neither is the negation of the other). If there are k clauses in φ, we ask whether
the graph has a k-clique. For the example above, which has three clauses, one such k-clique is formed
by the three lighter nodes: all clauses are satisfied if those three literals are true. Can you find
another clique that makes the formula true?
In general, the claim is that such a clique exists in G if and only if there is a satisfying assignment for
φ:
Only if direction (If φ can be satisfied, then there is a clique in G):
o If φ can be satisfied, then we can assign values to the literals, such that at least one
literal in each clause is assigned value 1; i.e., they are consistent.
o Consider the vertices corresponding to those literals. Since the literals are consistent
and they are in different clauses, there is an edge between every pair of them.
o Since there are k clauses in φ we have a subset of at least k vertices in the graph with
edges between every pair of vertices, i.e., we have a k-clique in G.
Any arbitrary instance of 3-CNF-SAT can be converted to an instance of CLIQUE in polynomial time
with this particular structure. That means if we can solve CLIQUE we can solve any instance of 3-CNF-
SAT, and since we know that 3-CNF-SAT is NPC, transitively we can solve any instance in NP in
polynomially related time. Be sure you understand why mapping an arbitrary instance of CLIQUE to a
specialized instance of 3-CNF-SAT would not work.
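The construction just described can be sketched in code (the vertex and edge encodings, with vertices tagged by clause index and literal, are illustrative assumptions):

```python
from itertools import combinations

def cnf_to_clique(clauses):
    """Build the CLIQUE instance for a 3-CNF formula (integer literal
    encoding: k means x_k, -k its negation).  One vertex per literal
    occurrence; edges join consistent literals in different clauses.
    Returns (vertices, edges, k) with k = number of clauses."""
    vertices = [(ci, li, lit)
                for ci, clause in enumerate(clauses)
                for li, lit in enumerate(clause)]
    edges = {(u, v) for u, v in combinations(vertices, 2)
             if u[0] != v[0]        # different clauses
             and u[2] != -v[2]}     # literals are consistent
    return vertices, edges, len(clauses)

phi = [(1, -2, -3), (-1, 2, 3)]
V, E, k = cnf_to_clique(phi)
print(len(V), k)  # 6 vertices, k = 2
```

The output graph has size polynomial in the formula, so the reduction runs in polynomial time, as required.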
PROBLEM-4- VERTEX-COVER
Given an undirected graph G = (V, E), each vertex “covers” its incident edges, and a vertex cover
for G is a set of vertices that covers all the edges in E. For example, in the graph on the right,
{w, z} is a vertex cover. So are {v, w, y} and V = {u, v, w, x, y, z}.
VERTEX-COVER is NP-Hard: There is a straightforward reduction of CLIQUE to VERTEX-COVER. Given an
instance G = (V, E) of CLIQUE, one computes the complement of G, which we will call Gc = (V, Ē),
where (u, v) ∈ Ē iff (u, v) ∉ E. For example, on the left side of the figure we have G, an instance
of CLIQUE, and its complement Gc on the right, for which we find a minimum vertex cover.
The graph G has a clique of size k iff the complement graph has a vertex cover of size |V| − k.
(Note that this is an existence claim, not a minimization claim: a smaller cover may be possible.)
If direction (If G has a k-clique, then Gc has a vertex cover of size |V| − k):
We show that none of the k vertices in the clique need to be in the cover.
o They are all connected to each other in G, so none of them will be connected to each
other in Gc.
o Thus, every edge in Gc must involve at least one vertex not in the clique, so the
clique vertices can be excluded from the cover: we can use vertices from the
remaining |V| − k vertices to cover all the edges in Gc.
o The minimum vertex cover may be smaller than |V| − k, but we know that |V|
− k will work.
Only if direction (If Gc has a vertex cover of size |V| − k, then G has a k-clique):
We will use proof by contrapositive to show that if there is no clique of size k in G, then
there is no vertex cover of size |V| − k in Gc.
o Assume for the sake of contradiction that there is no k-clique in G, but there is a
vertex cover V' in Gc of size |V'| = |V| − k.
o The non-existence of a k-clique in G means that at least two vertices in every subset
of k vertices are not connected in G.
o Consider the subset V \ V', i.e. a subset of k vertices that are not part of the
presumed (|V| − k)-sized vertex cover of Gc. By the above, there exist at least two
vertices in this subset (let's call them u and v), such that there is no edge (u,v) in G.
o But if edge (u,v) is not in G, it must exist in Gc (the complement of G).
o But if there is an edge (u,v) in Gc, and neither u nor v are in the vertex cover, then
edge (u,v) is uncovered and V' is not a valid vertex cover.
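The "if" direction can be checked concretely on a small example (the helper names and graph encoding are assumptions for illustration):

```python
from itertools import combinations

def complement(n, edges):
    """Complement graph Gc on vertices 0..n-1: (u, v) is an edge of Gc
    iff it is not an edge of G."""
    und = edges | {(v, u) for (u, v) in edges}
    return {(u, v) for u, v in combinations(range(n), 2) if (u, v) not in und}

def is_vertex_cover(edges, cover):
    """Every edge must have at least one endpoint in the cover."""
    return all(u in cover or v in cover for (u, v) in edges)

# G: a triangle {0, 1, 2} plus an extra vertex 3 attached to vertex 0.
G = {(0, 1), (1, 2), (0, 2), (0, 3)}
clique = {0, 1, 2}               # a k-clique in G, k = 3
Gc = complement(4, G)            # here Gc = {(1, 3), (2, 3)}
# The |V| - k = 1 remaining vertices cover every edge of Gc:
print(is_vertex_cover(Gc, {0, 1, 2, 3} - clique))  # True
```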
PROBLEM-5- HAMILTONIAN-CYCLE
A graph G = (V, E) contains a Hamiltonian cycle if it contains a simple cycle C of size |V|. That
is, G contains a cycle that visits every vertex exactly once.
HAM-CYCLE is NP-Hard: VERTEX-COVER can be reduced to HAM-CYCLE using the widget construction
described below.
Any Hamiltonian cycle must include all the vertices in the widget (a), but there are only three ways
to pass through each widget (b, c, and d in the figure). If only vertex u is included in the cover, we
will use path (b); if only vertex v then path (d); otherwise path (c) to include both. (Not traversing the
widget is not an option because at least one of the two vertices must be chosen to cover the edge.)
The widgets are then wired together in sequences that chain all the widgets that involve a given
vertex, so if the vertex is selected all of the widgets corresponding to its edges will be reached.
Finally, k selector vertices are added and wired such that each selects one of the k vertices in the
cover of size k. I leave it to you to examine the discussion in the text to see how clever these
reductions can be!
One of the more famous NPC problems is TSP: Suppose you are a traveling salesperson, and you
want to visit n cities exactly once in a Hamiltonian cycle. A cost function c assigns costs (distance) to
the edges (roads) between vertices (cities), and you want to choose a tour with minimum total cost
of the edges. Written as a decision problem:
TSP = {⟨G, c, k⟩ : G is a complete graph, c is a cost function on the edges of G, and G has a
traveling-salesman tour with cost at most k}
Only exponential solutions have been found to date (including brute force and dynamic programming,
as indicated by the cartoon), although it is easy to check a solution in polynomial time.
Many NP-Complete problems are of a numerical nature. We already mentioned integer linear
programming. Another example is the subset-sum problem: given a finite set S of positive integers
and an integer target t > 0, does there exist a subset of S that sums to t?
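A standard pseudo-polynomial dynamic program decides SUBSET-SUM; a minimal sketch follows. Note that this does not contradict NP-completeness: its running time depends on the magnitude of t, which is exponential in the number of bits used to encode t.

```python
def subset_sum(S, t):
    """Decide SUBSET-SUM by dynamic programming over reachable sums.
    Runs in O(len(S) * t) time -- pseudo-polynomial, not polynomial in
    the input size, since t is encoded in only log t bits."""
    reachable = {0}
    for x in S:
        # extend every previously reachable sum by x, capped at t
        reachable |= {r + x for r in reachable if r + x <= t}
    return t in reachable

print(subset_sum({1, 4, 9, 16, 25}, 30))  # True: 1 + 4 + 25 = 30
print(subset_sum({1, 4, 9, 16, 25}, 31))  # False
```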
The proof reduces 3-CNF-SAT to SUBSET-SUM. Please see the text for the details of yet another
clever reduction!
Briefly, each variable and each clause of the formula contributes digits to a table of numbers,
chosen so that a subset summing to the target encodes a satisfying assignment. For example, see how
this formula maps to the table shown in the text:
(x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ ¬x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x3)
16.6 SUMMARY
In this unit, we looked into the concept of NP-completeness and polynomial time. Polynomial-time
verification is essentially about checking a proposed certificate for a solution efficiently. This
lets us define NP-completeness and reducibility, work through completeness proofs, and finally
survey several NP-complete problems.
16.7 KEYWORDS
linear programming
exponential solutions
widget
VERTEX-COVER
16.9 REFERENCES
1. Aho, A.V., Hopcroft, J.E., and Ullman, J.D. [1974]: The Design and Analysis of Computer
Algorithms. Addison-Wesley, Reading, 1974.
2. Ausiello, G., Crescenzi, P., Gambosi, G., Kann, V., Marchetti-Spaccamela, A., and Protasi, M.
[1999]: Complexity and Approximation: Combinatorial Optimization Problems and Their
Approximability Properties. Springer, Berlin, 1999.
3. Bovet, D.B., and Crescenzi, P. [1994]: Introduction to the Theory of Complexity. Prentice-Hall,
New York, 1994.
4. Garey, M.R., and Johnson, D.S. [1979]: Computers and Intractability: A Guide to the Theory of
NP-Completeness. Freeman, San Francisco, 1979, Chapters 1–3, 5, and 7.
5. Horowitz, E., and Sahni, S. [1978]: Fundamentals of Computer Algorithms. Computer Science
Press, Potomac, 1978, Chapter 11.
6. Johnson, D.S. [1981]: The NP-completeness column: an ongoing guide. Journal of Algorithms,
starting with Vol. 4 (1981).
7. Karp, R.M. [1975]: On the complexity of combinatorial problems. Networks 5 (1975), 45–68.