PG - M.Sc. - Computer Science - 341 11 - Design and Analysis of Algorithms - Binder
Authors
Dr Mudasir M Kirmani, Assistant Professor-cum-Junior Scientist, Sher-e-Kashmir University of Sciences and Technology
of Kashmir
Dr Syed Mohsin Saif Andrabi, Assistant Professor, Islamic University of Science & Technology, Awantipora, Jammu
and Kashmir
Units (1, 4-5, 7.2, 10.0-10.3, 11-12, 13.7)
S. Mohan Naidu, Principal & Visiting Faculty, VRN College of Computer Science and Management, Tirupathi and Viswa Bharathi
P.G. College of Engineering & Management, Hyderabad
Unit (3)
Rohit Khurana, CEO, ITL Education Solutions Ltd.
Units (6, 7.3-7.4, 8.0-8.2, 9, 10.4-10.10, 13.0-13.6, 13.8-13.14, 14.3-14.8)
Sunita Tiwari, Faculty in Computer Science & Information Technology Department at JSS Academy of Technical Education,
Noida
Shilpi Sengupta, Lecturer of Computer Science and Engineering in JSS Academy of Technical Education, Noida
Unit (14.0-14.2)
Vikas® Publishing House: Units (2, 7.0-7.1, 7.5-7.10, 8.2.1-8.7)
All rights reserved. No part of this publication which is material protected by this copyright notice
may be reproduced or transmitted or utilized or stored in any form or by any means now known or
hereinafter invented, electronic, digital or mechanical, including photocopying, scanning, recording
or by any information storage or retrieval system, without prior written permission from the Alagappa
University, Karaikudi, Tamil Nadu.
Information contained in this book has been published by VIKAS® Publishing House Pvt. Ltd. and has
been obtained by its Authors from sources believed to be reliable and correct to the best of their
knowledge. However, the Alagappa University, the Publisher and its Authors shall in no event be liable for
any errors, omissions or damages arising out of use of this information and specifically disclaim any
implied warranties of merchantability or fitness for any particular use.
Work Order No. AU/DDE/DE1-616/Printing of Course Materials/2019, Dated 19.11.2019 Copies - 200
SYLLABI-BOOK MAPPING TABLE
Design and Analysis of Algorithms
Syllabi Mapping in Book
BLOCK 3: Dynamic Programming and Search Binary Trees
Unit 7: General method: computing a binomial coefficient, Warshall's and Floyd's algorithms, optimal binary search trees, knapsack problems (Unit 7: General Method, Pages 68-82)
Unit 8: Greedy Technique: general method (Unit 8: Greedy Technique, Pages 83-95)
Unit 9: Applications: Prim's algorithm, Kruskal's algorithm, Dijkstra's algorithm (Unit 9: Applications, Pages 96-110)
BLOCK 4: Sorting and Optimization Problem
Unit 10: Sorting and searching algorithms: decrease and conquer, insertion sort, depth first search and breadth first search, topological sorting (Unit 10: Sorting and Searching Algorithms, Pages 111-131)
Unit 11: Generating combinatorial objects: transform and conquer, presorting, heap and heap sort (Unit 11: Generating Combinatorial Objects, Pages 132-140)
Unit 12: Optimization problems: reductions, reduction to graph problems (Unit 12: Optimization Problems, Pages 141-151)
NOTES
UNIT 1 INTRODUCTION TO
ALGORITHMS
Structure
1.0 Introduction
1.1 Objectives
1.2 Notion of Algorithm
1.3 Fundamentals of Algorithmic Problem Solving
1.4 Important Problem Types
1.5 Fundamentals of Analysis of Algorithm Efficiency
1.6 Answers to Check Your Progress Questions
1.7 Summary
1.8 Key Words
1.9 Self Assessment Questions and Exercises
1.10 Further Readings
1.0 INTRODUCTION
1.1 OBJECTIVES
Line 1: Start
Line 2: Collect the ingredients: vessel, water, tea leaves, sugar, milk
Line 3: Switch on the heater
Line 4: Put the empty vessel on the heater
Line 5: Pour water as per the requirement into the vessel
Line 6: Wait till the water is boiled
Line 7: Add milk, tea leaves and sugar as per the requirement
Line 8: Wait till the mixture is boiled
Line 9: Pour the boiling mixture into a cup
Line 10: Serve the tea
The algorithm given above to prepare a cup of tea is only one of the methods to
prepare tea. Different methods are used to prepare a cup of tea, and the method may
vary from one place to another. Therefore, the algorithm written here is not the only
algorithm to prepare tea; rather, any number of algorithms can be written to prepare
it. Similarly, a given computational problem can be solved by many different computer
programs.
The notion of algorithm
An algorithm is a systematic sequence of instructions that must be executed in
order to achieve its objective. The basic characteristics of an algorithm are given below:
(i) Finite
(ii) Single entry and single exit point
(iii) Achieves the desired objective
The program given above will not terminate, as the condition mentioned in the
loop is always true. The program will continuously display the text “I am stuck in the
loop as the condition will never terminate” on the screen. Therefore, the need of the
hour in this case is to write a program or an algorithm which allows a user to
terminate the program as and when required.
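The finiteness property can be illustrated with a minimal sketch (the function name count_to is a hypothetical helper, not from the text): replacing an always-true loop condition with one that eventually becomes false guarantees that the loop, and hence the algorithm, terminates.

```c
/* Sketch of finiteness: the loop condition eventually becomes false,
   so the algorithm is guaranteed to terminate. */
int count_to(int limit) {
    int i = 0;
    while (i < limit) {  /* becomes false once i reaches limit */
        i++;
    }
    return i;            /* terminates after exactly limit iterations */
}
```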
Single entry and single exit point: Every algorithm should have a single entry
point and a single exit point. A program having multiple entry and exit points
will lead to improper execution of the algorithm.
Table 1.1 Single entry and single exit point
Algorithm:
Line 1: start
Line 2: declare a, b, c
Program using C language:
void main()
{
The number of variables used in the algorithms given above varies, which
affects how optimally resources are used. The same algorithm can be written in
different ways in order to achieve the desired objective, and this helps a user
differentiate between a good algorithm and a bad algorithm. Comparing algorithms
based on the usage of resources such as memory and time taken motivates the
reader to study the domain of algorithm analysis. Keeping in mind the proper
utilization of memory and of the time taken to solve a problem, a good algorithm
will always use the resources of a system efficiently.
Problem solving is the main initiator of writing algorithms or computer programs.
Everyone around you has different problems or requirements on a daily basis, and
every individual would like to develop automated tools for specific problem-solving
methods or procedures. Therefore, a computer program or an algorithm is written
to automate a problem-solving technique by developing a set of instructions that
fulfils the requirement of a user. The process of writing an algorithm for a
problem-solving method includes different steps, shown in Figure 1.1.
Figure 1.1 Steps in writing an algorithm: understanding the problem/requirement, preparing the complete flowchart of the problem, and testing the algorithm
In the real-world scenario, the number of problems existing at present is infinite.
However, in the domain of design and analysis of algorithms, researchers and the
scientific community have mainly focused on the following types of problems:
Sorting
Searching
Graph problems
String processing
Geometric problems
Combinatorial problems
Numerical analysis problems
Sorting: The sorting problem is arranging a sequence of items in either ascending
or descending order based on the requirements of a user. The sorted list of items
can be prepared based on an integer, a character or a field, depending on the
need of the process. The sorted list of items is used for different purposes, and one
of its important uses is in searching for an item within a given list in less time.
Different sorting algorithms are available for implementation; depending on the
situation, an appropriate sorting algorithm is selected for use. Some of the sorting
algorithms at present in use are given below:
Selection sort
Merge sort
Bubble sort
Insertion sort
Quick sort
Radix sort
Heap sort
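As an illustration, one of the listed algorithms, insertion sort, can be sketched in C. This is a minimal ascending-order version for int arrays, written for this discussion rather than taken from the book's later units.

```c
/* Insertion sort: grow a sorted prefix by inserting each next element
   into its correct position within that prefix. Ascending order. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {  /* shift larger elements right */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;  /* drop the key into the gap */
    }
}
```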
Searching: The searching problem is related to searching for an item in a list of
items. The item searched for can be an integer or a character, which helps an end-user
search for an attribute as and when required. Searching algorithms are designed
using different methods, such as the sequential (linear) searching technique and the
binary searching technique. Sorting algorithms are also used in searching: the list
of items is first sorted, which makes the process of searching easier and faster.
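Since a sorted list makes searching faster, a minimal sketch of the binary searching technique on a sorted int array may help; it is illustrative and not a listing from this book.

```c
/* Binary search on a sorted (ascending) array.
   Returns the index of key, or -1 if key is absent. */
int binary_search(const int a[], int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* avoids overflow of (lo + hi) */
        if (a[mid] == key) return mid;
        if (a[mid] < key) lo = mid + 1; /* key lies in the upper half */
        else              hi = mid - 1; /* key lies in the lower half */
    }
    return -1;
}
```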
Graph problems: A graph is a collection of nodes, also known as vertices, which
are connected with one another by edges. Graphs are one of the pivotal parts of
the analysis of algorithms, as many algorithms use graph-theoretic concepts in one
form or another. Graphs can be used in different real-life problems such as the
travelling salesman problem, optimal network path problems and shortest-path
searching algorithms. Graphs are also used in developing advanced electronic chips,
where different cores are put on a single chip by simulating the concept prior to
implementing it in the physical design.
String processing: With the advent of technology in general and bio-informatics
in particular, the need for analyzing and processing text and strings has increased
manifold. This has motivated researchers and practitioners across the globe to
develop algorithms for analyzing and processing strings. The main objective of
string processing algorithms is to detect the presence of defined strings in order
to analyze huge files effectively and efficiently. At present this need is exemplified
by pattern recognition in the bio-informatics domain for genome sequence matching
and information retrieval.
Geometric problems: In order to construct a geometric shape, different processes
are executed, from plotting points to connecting the points with lines based on
different parameters; automating these processes gives rise to geometric problem
algorithms. Geometric problem algorithms are applied in different domains, across
horizontals and verticals, in order to serve humanity in general and computer
science in particular. Domains such as bio-medical equipment, robotics and graphics
are some of the fields where the usage of geometric problem algorithms has become
a necessity.
Combinatorial problems: A problem will not necessarily have only one method
of solution. Every solution method starts from an initial state and explores the
possible outcomes, moving from one state to another in order to evaluate each
outcome and select an optimal option. The number of possible outcomes at a
particular state is obtained using permutations and combinations.
Numerical analysis problems: These algorithms are used mainly in mathematical
problems where the solutions generated are continuous. Finding the roots of a
function between two given parameters using different methods is one of the best
examples where the application of these algorithms helps mathematicians reach
higher levels of efficiency.
algorithm. Memory and time are the two basic parameters used to find the efficiency
of an algorithm. However, some other parameters are also used to find the efficiency
of an algorithm, and the list of parameters is given below:
Size of input
Running time
Worst, best and average scenarios
Asymptotic notations
Size of input: The efficiency of an algorithm is also evaluated based on the size
of its input. For example, the size of the input for a word-count algorithm is the
number of words, and for a letter-count algorithm the size of the input is the
number of letters given as input to the algorithm.
Running time: The efficiency of an algorithm based on running time depends on
the unit used to measure running time. For example, if the unit of measurement is
seconds, then the efficiency will be evaluated in seconds; if the measuring unit is
nanoseconds, then the evaluation will be in nanoseconds. The running time of an
algorithm also depends directly on the speed of the computer, the compiler used
and the quality of the algorithm. For uniformity, therefore, the basic operations
within an algorithm are identified, and the time taken to complete them is evaluated
in order to analyze the efficiency of the algorithm.
Worst, best and average scenarios: Algorithm efficiency is also evaluated over
different possibilities for the input. In the worst case, the input makes the algorithm
do the maximum amount of work, and the efficiency evaluated on such input is
used when comparing algorithms. In the best case, the input is ideal for the
algorithm, and the efficiency evaluated on it serves as another benchmark in finding
an optimal algorithm. Similarly, for average inputs the efficiency of the algorithm
is evaluated in order to guide the best selection of an algorithm.
Order of growth: The efficiency of an algorithm is evaluated based on the order
of growth of its running time: how the algorithm performs when the system executing
it becomes very fast, and what the efficiency of the algorithm is when the input
size is doubled. The efficiency can be estimated using the basic equation given below:
T(n) = c_op × C(n)
where T(n) is the running time of the algorithm,
c_op is the time taken for a single basic operation, and
C(n) is the number of basic operations within the algorithm.
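The equation above can be sketched directly in code; the per-operation cost and the operation count used below are hypothetical values chosen for illustration.

```c
/* Sketch of T(n) = c_op * C(n): estimated running time as the cost of one
   basic operation (here in nanoseconds, hypothetical) times the number of
   basic operations the algorithm performs. */
long running_time_ns(long c_op_ns, long n_basic_ops) {
    return c_op_ns * n_basic_ops;
}
```

For a linear algorithm, C(n) = n, so doubling the input size doubles the estimated running time.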
Asymptotic notations: The efficiency of an algorithm is measured using asymptotic
notations, where the three notations O, Ω and Θ are used. The notation O is known
as big “oh”, the notation Ω is known as big omega and the notation Θ is known as
big theta. For big O notation a function t(n) is said to be in O(g(n)), for big Ω
notation a function t(n) is said to be in Ω(g(n)) and for big Θ notation a function
t(n) is said to be in Θ(g(n)).
1.7 SUMMARY
1.9 SELF ASSESSMENT QUESTIONS AND
EXERCISES
UNIT 2 ASYMPTOTIC NOTATIONS
2.0 INTRODUCTION
Asymptotic notations are a way to express time and space complexity; they represent
the running time of an algorithm. If we have more than one algorithm with alternative
steps, then to choose among them, the algorithm with the lesser complexity should be
selected. To represent these complexities, asymptotic notations are used.
Asymptotic notations are of three types: Oh, Omega and Theta. These are further
classified as Big-oh, Little-oh, Big Omega, Little Omega, etc. This unit will
introduce you to all of these.
2.1 OBJECTIVES
It is the method of expressing the tight bound of an algorithm’s running time. For
non-negative functions f(n) and g(n), if there exist an integer n0 and positive
constants c1 and c2 such that for all integers n ≥ n0,
0 ≤ c1g(n) ≤ f(n) ≤ c2g(n)
Figure 2.1 Graphical representation of the Θ bound: f(n) is sandwiched between c1g(n) and c2g(n)
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1g(n)
≤ f(n) ≤ c2g(n) for all n ≥ n0}.
A function f(n) belongs to the set Θ(g(n)) if it can be ‘sandwiched’
between c1g(n) and c2g(n) for sufficiently large n.
Figure 2.1 shows the functions f(n) and g(n), where we have f(n) =
Θ(g(n)). For all values of n ≥ n0, the value of f(n) lies at or above c1g(n) and at or
below c2g(n). In other words, for all n ≥ n0, the function f(n) is equal to g(n) to
within a constant factor. We say that g(n) is an asymptotically tight bound for
f(n).
Theta notation provides a tight bound on the running time of an algorithm.
Consider the following examples:
1. f(n) = 123
122·1 ≤ f(n) ≤ 123·1
Here, c1 = 122, c2 = 123 and n0 = 0
So, f(n) = Θ(1)
2. f(n) = 3n + 5
3n < 3n + 5 ≤ 4n
Here, c1 = 3, c2 = 4 and n0 = 5
So, f(n) = Θ(n)
3. f(n) = 3n^2 + 5
3n^2 < 3n^2 + 5 ≤ 4n^2
Here, c1 = 3, c2 = 4 and n0 = 5
So, f(n) = Θ(n^2)
4. f(n) = 7n^2 + 5n
7n^2 < 7n^2 + 5n for all n, so c1 = 7
Also, 7n^2 + 5n ≤ 8n^2 for n ≥ n0 = 5, so c2 = 8
So, f(n) = Θ(n^2)
5. f(n) = 2^n + 6n^2 + 3n
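The sandwich condition used in these examples can also be spot-checked over a finite range of n; a small sketch for example 4 (7n^2 < 7n^2 + 5n ≤ 8n^2 for n ≥ 5), where the helper name holds_theta is hypothetical. A finite check only witnesses the bound on the tested range; it does not prove it for all n.

```c
/* Verify c1*g(n) <= f(n) <= c2*g(n) for every n in [n0, limit], with
   f(n) = 7n^2 + 5n and g(n) = n^2 (example 4 above). Returns 1 if the
   sandwich holds over the whole tested range, 0 otherwise. */
int holds_theta(long c1, long c2, long n0, long limit) {
    for (long n = n0; n <= limit; n++) {
        long f = 7 * n * n + 5 * n;
        long g = n * n;
        if (!(c1 * g <= f && f <= c2 * g)) return 0;  /* bound violated */
    }
    return 1;
}
```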
While Θ notation asymptotically bounds a function from above and below, Big-oh
(O) is the formal method of expressing only the upper bound of an algorithm’s
running time. It describes the limiting behaviour of a function as the argument
tends towards a particular value or infinity, and it is the measure of the longest
amount of time the algorithm could possibly take to complete. For non-negative
functions f(n) and g(n), if there exist an integer n0 and a constant c > 0 such that
for all integers n ≥ n0,
0 ≤ f(n) ≤ cg(n)
O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤
cg(n) for all n ≥ n0}.
Consider the following examples:
1. f(n) = 13
f(n) ≤ 13·1
Here, c = 13 and n0 = 0
So, f(n) = O(1)
2. f(n) = 3n + 5
3n + 5 ≤ 3n + 5n
3n + 5 ≤ 8n
Here, c = 8 and n0 = 1
So, f(n) = O(n)
3. f(n) = 3n^2 + 5
3n^2 + 5 ≤ 3n^2 + 5n
3n^2 + 5 ≤ 3n^2 + 5n^2
3n^2 + 5 ≤ 8n^2
Here, c = 8 and n0 = 1
So, f(n) = O(n^2)
4. f(n) = 7n^2 + 5n
7n^2 + 5n ≤ 7n^2 + 5n^2
7n^2 + 5n ≤ 12n^2
Here, c = 12 and n0 = 1
So, f(n) = O(n^2)
5. f(n) = 2^n + 6n^2 + 3n
2^n + 6n^2 + 3n ≤ 2^n + 6n^2 + 3n^2 ≤ 2^n + 6·2^n + 3·2^n ≤ 10·2^n
Here, c = 10 and n0 = 1
So, f(n) = O(2^n)
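Such upper bounds can likewise be spot-checked over a finite range; a sketch for example 2 (3n + 5 ≤ 8n for n ≥ 1), where holds_big_oh is a hypothetical helper. Again, a finite check witnesses but does not prove the bound.

```c
/* Check 0 <= f(n) <= c*g(n) for every n in [n0, limit], with
   f(n) = 3n + 5 and g(n) = n (example 2 above). */
int holds_big_oh(long c, long n0, long limit) {
    for (long n = n0; n <= limit; n++) {
        long f = 3 * n + 5;
        if (f < 0 || f > c * n) return 0;  /* bound violated */
    }
    return 1;  /* bound held on the whole tested range */
}
```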
Big-oh Ratio Theorem: Let f(n) and g(n) be such that lim(n→∞) f(n)/g(n) exists.
Then f(n) = O(g(n)) if lim(n→∞) f(n)/g(n) = c < ∞, including the case in which
the limit is 0.
Some loose bounds are as follows:
5n + 4 = O(n^2)
7n^3 + 5 = O(n^4)
10^5·n^2 + 3 = O(n^3)
Big Omega (Ω) is the method used for expressing the lower bound of an algorithm’s
running time. It is the measure of the smallest amount of time the algorithm could
possibly take to complete.
For non-negative functions f(n) and g(n), if there exist an integer n0 and a
constant c > 0 such that for all integers n ≥ n0,
0 ≤ cg(n) ≤ f(n)
Consider the following examples:
1. f(n) = 13
f(n) ≥ 12·1
where c = 12 and n0 = 0
So, f(n) = Ω(1)
2. f(n) = 3n + 5
3n + 5 > 3n
where c = 3 and n0 = 1
So, f(n) = Ω(n)
3. f(n) = 3n^2 + 5
3n^2 + 5 > 3n^2
where c = 3 and n0 = 1
So, f(n) = Ω(n^2)
4. f(n) = 7n^2 + 5n
7n^2 + 5n > 7n^2
where c = 7 and n0 = 1
So, f(n) = Ω(n^2)
5. f(n) = 2^n + 6n^2 + 3n
2^n + 6n^2 + 3n > 2^n
where c = 1 and n0 = 1
So, f(n) = Ω(2^n)
Big Omega Ratio Theorem: Let f(n) and g(n) be such that lim(n→∞) f(n)/g(n)
exists. Then f(n) = Ω(g(n)) if lim(n→∞) f(n)/g(n) > 0, including the case in which
the limit is ∞.
Some incorrect bounds are as follows:
2n + 4 ≠ Ω(n^2)
2n^3 + 5 ≠ Ω(n^4)
10n^2 + 3 ≠ Ω(n^3)
Some loose bounds are as follows:
5n + 4 = Ω(1)
7n^3 + 5 = Ω(n^2)
10^5·n^2 + 3 = Ω(n)
The asymptotic upper bound provided by Big-oh (O) notation may or may not be
asymptotically tight. We use Little-oh (o) notation to denote an upper bound that
is not asymptotically tight:
0 ≤ f(n) < cg(n)
For Little-oh notation: lim(n→∞) f(n)/g(n) = 0
Example: 3n + 9 = o(n^2), since lim(n→∞) (3n + 9)/n^2 = 0
The asymptotic lower bound provided by Big Omega (Ω) notation may or may
not be asymptotically tight. We use Little Omega (ω) notation to denote a lower
bound that is not asymptotically tight:
0 ≤ cg(n) < f(n)
For Little Omega (ω) notation: lim(n→∞) f(n)/g(n) = ∞
Figure 2.2 shows the concept of asymptotic notation. Consider a function, say
f(n) = an^2 + bn + c. Also, let us consider the band for n^2 (the maximum
contributing term for large n), whose lower and upper bounds are c1g(n) and
c2g(n), respectively.
So, the function can be rewritten in terms of several asymptotic notations
as:
an^2 + bn + c = Ω(1)
an^2 + bn + c = Ω(n)
an^2 + bn + c = Θ(n^2)
an^2 + bn + c = ω(1)
an^2 + bn + c = ω(n)
an^2 + bn + c = O(n^2)
an^2 + bn + c = O(n^6)
an^2 + bn + c = O(n^100)
an^2 + bn + c = o(n^19)
Also, an^2 + bn + c ≠ o(n^2) (since o can never be a tight bound)
an^2 + bn + c ≠ ω(n^19)
an^2 + bn + c ≠ O(n)
Figure 2.2 The band between c1g(n) and c2g(n) bounding f(n)
Some more incorrect bounds are as follows:
7n + 5 ≠ O(1)
2n + 3 ≠ O(1)
3n^2 + 16n + 2 ≠ O(n)
5n^3 + n^2 + 3n + 2 ≠ O(n^2)
7n + 5 ≠ Ω(n^2)
2n + 3 ≠ Ω(n^3)
10n^2 + 7 ≠ Ω(n^4)
7n + 5 ≠ Θ(n^2)
2n^2 + 3 ≠ Θ(n^3)
Some more loose bounds are as follows:
2n + 3 = O(n^2)
4n^2 + 5n + 6 = O(n^4)
5n^2 + 3 = Ω(1)
2n^3 + 3n^2 + 2 = Ω(n^2)
Some correct bounds are as follows:
2n + 8 = O(n)
2n + 8 = O(n^2)
2n + 8 = Θ(n)
2n + 8 = Ω(n)
2n + 8 = o(n^2)
2n + 8 ≠ o(n)
2n + 8 ≠ ω(n)
4n^2 + 3n + 9 = O(n^2)
4n^2 + 3n + 9 = Ω(n^2)
4n^2 + 3n + 9 = Θ(n^2)
4n^2 + 3n + 9 = o(n^3)
4n^2 + 3n + 9 ≠ o(n^2)
4n^2 + 3n + 9 ≠ ω(n^2)
Correlation
f(n) = Θ(g(n)) is analogous to f = g
f(n) = O(g(n)) is analogous to f ≤ g
f(n) = Ω(g(n)) is analogous to f ≥ g
(i) Transitivity
(ii) Reflexivity
(iii) Symmetry
(iv) Transpose Symmetry
(i) Transitivity
f(n) = Θ(g(n)) and g(n) = Θ(h(n)) imply f(n) = Θ(h(n))
f(n) = O(g(n)) and g(n) = O(h(n)) imply f(n) = O(h(n))
f(n) = Ω(g(n)) and g(n) = Ω(h(n)) imply f(n) = Ω(h(n))
f(n) = o(g(n)) and g(n) = o(h(n)) imply f(n) = o(h(n))
f(n) = ω(g(n)) and g(n) = ω(h(n)) imply f(n) = ω(h(n))
(ii) Reflexivity
f(n) = Θ(f(n))
f(n) = O(f(n))
f(n) = Ω(f(n))
(iii) Symmetry
f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n))
(iv) Transpose Symmetry
f(n) = O(g(n)) if and only if g(n) = Ω(f(n))
f(n) = o(g(n)) if and only if g(n) = ω(f(n))
Comparisons
f(n) is asymptotically smaller than g(n) if f(n) = o(g(n)).
f(n) is asymptotically larger than g(n) if f(n) = ω(g(n)).
2.7 ANSWERS TO CHECK YOUR PROGRESS
QUESTIONS
1. Big Omega (Ω) is the method used for expressing the lower bound of an
algorithm’s running time.
2. We use Little Omega (ω) notation to denote a lower bound that is not
asymptotically tight.
2.8 SUMMARY
Long Answer Questions
UNIT 3 PERFORMANCE ANALYSIS
3.0 INTRODUCTION
3.1 OBJECTIVES
Are procedures structured in such a way that they are able to perform
logical sub-functions?
Is the code of the algorithm readable?
These criteria are very important as far as writing software is concerned,
especially for large systems. Algorithms can also be judged using some other criteria
having a more direct relationship with their performance. These have to do with
their computing time and storage requirements.
Definitions
Profiling: Profiling is the process of executing a correct program on data sets
and measuring the time and storage taken to compute the results. It is also known
as a performance profile. These timing figures can confirm and point out logical
places for performing useful optimization and hence are very useful. Profiling can
be done on programs that have been devised, coded, proved correct and debugged
on a computer.
Debugging: Debugging refers to the process of executing a program on sample
data sets to determine whether faulty results occur. In other words, debugging is
concerned with conducting tests to uncover errors and to ensure that the defined
input gives actual results that agree with the required results.
Debugging only points to the presence of errors; it does not point to their
absence. Debugging is not testing but always occurs as a consequence of testing.
Debugging begins with the execution of a test case. The debugging process attempts
to match symptom with cause, thereby leading to error correction. Debugging has
two outcomes. Either the error is detected and corrected or the error is not found.
A priori analysis: Also known as machine-independent and programming-
language-independent analysis, it is done to bound the algorithm’s computing time.
A posteriori testing: Also known as machine-dependent and programming-
language-dependent analysis, it is done to collect actual statistics about the
algorithm’s consumption of time and space while it is executing.
A priori analysis of algorithms is concerned chiefly with determination of
order of magnitude/frequency count of the step/statement. This can be determined
directly from the algorithm, independent of the machine it will be executed on and
the programming language the algorithm is written in.
For example, consider the three program segments a, b and c:
a. x ← x + y
b. for i ← 1 to n
       x ← x + y
   repeat
c. for i ← 1 to n
       for j ← 1 to n
           x ← x + y
       repeat
   repeat
For segment a the frequency count is 1; for segment b the frequency count is n;
for segment c the frequency count is n^2. These frequencies 1, n, n^2 are said to
be of different increasing orders of magnitude.
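The three frequency counts can be confirmed by instrumenting each segment with a counter; a hypothetical C sketch (the function names are illustrative):

```c
/* Count how many times the statement x = x + y executes in each segment. */
int count_a(void) {
    int c = 0;
    c++;                              /* single statement: one execution */
    return c;
}
int count_b(int n) {
    int c = 0;
    for (int i = 1; i <= n; i++)
        c++;                          /* executes n times */
    return c;
}
int count_c(int n) {
    int c = 0;
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= n; j++)
            c++;                      /* executes n*n times */
    return c;
}
```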
3.2.1 Space Complexity
The space complexity of an algorithm indicates the quantity of temporary storage
required for running the algorithm, i.e. the amount of memory needed by the
algorithm to run to completion.
In most cases, we do not count the storage required for the inputs or the
outputs as part of the space complexity. This is because the space complexity
is used to compare different algorithms for the same problem, in which case the
input/output requirements are fixed. Also, we cannot do without the input or the
output, and we want to count only the storage that may be saved. We also do not
count the storage required for the program itself, since it is independent of the size
of the input.
Like time complexity, space complexity refers to the worst case, and it is
usually denoted as an asymptotic expression in the size of the input. Thus, an
O(1)-space algorithm requires a constant amount of space, independent of the
size of the input.
The amount of memory an algorithm needs to run to completion is called its
space complexity. The space required by an algorithm consists of the following
two components:
(i) Fixed or static part: The fixed or static part is not dependent on the
characteristics (such as number or size) of the inputs and outputs. It includes
various types of space, such as instruction space (i.e., space for the code),
space for simple variables and fixed-size component variables, space for
constants, etc.
(ii) Variable or dynamic part: The variable or dynamic part consists of the
space required by component variables whose size depends on the particular
problem instance being solved at run-time, the space needed by referenced
variables, and the recursion stack space (which depends on the instance
characteristics).
The space requirement S(p) of an algorithm p is S(p) = c + Sp(instance
characteristics), where ‘c’ is a constant. We concentrate on estimating
Sp(instance characteristics), since the first part is static.
Example 3.1: The problem instances for the algorithm are characterized by n, the
The time complexity of an algorithm may be defined as the amount of time the
computer requires to run to completion.
The time T(P) consumed by a program P is the sum of the compile time and
the run time (execution time). The compile time is independent of the instance
characteristics. Also, it may be assumed that a compiled program can be run many
times without recompilation. As a result, we are more interested in the run time of
a program, denoted by tp(instance characteristics).
Many factors on which tp depend are not known at the time a program is
written; so it is always better to estimate tp. If we happen to know the type of the
compiler used, then we could proceed to find the number of additions, subtractions,
multiplications, divisions, compare statements, loads, stores and so on that would
be made by a program P.
So we can obtain an expression of the form
t_p(n) = c_a·ADD(n) + c_s·SUB(n) + c_m·MUL(n) + c_d·DIV(n) + ...
where n denotes the instance characteristics, and c_a, c_s, c_m, c_d and so on
denote the time needed for an addition, subtraction, multiplication, division and
so on.
But here we need to note that the exact amount of time needed for the
operations mentioned here cannot be determined; so instead we count the number
of program steps.
A program step is defined as a syntactically or semantically meaningful
segment of a program that has an execution time that is independent of the instance
characteristics.
For example, consider the statement return a + b × c + d − e/f.
This can be regarded as a single step, since its execution time is independent
of the instance characteristics.
The number of steps assigned to any program statement depends on the
type of statement. Comments do not count as program steps; a general
assignment statement that does not call another algorithm is counted as one
step, whereas in an iterative statement such as for, while and repeat-until, we
count a step only for the control part of the statement.
The general syntax for ‘for’ and ‘while’ statements is as follows:
for i = <expr1> to <expr2> do
while <expr> do
Each execution of the control part of a while statement is given a step count
equal to the number of step counts assignable to <expr>. The step count for each
execution of the control part of a for statement is one, unless the counts attributable
to <expr1> and <expr2> are functions of the instance characteristics.
5. In the field of computer programming, the term code refers to instructions
to a computer in a programming language.
3.6 SUMMARY
Algorithms can be judged using some other criteria having a more direct
relationship with their performance.
Debugging refers to the process of execution of program on sample data
sets for determining if faulty results occur.
Debugging only points to the presence of errors; it does not point to their
absence. Debugging is not testing but always occurs as a consequence of
testing.
Debugging begins with the execution of a test case. The debugging process
attempts to match symptom with cause, thereby leading to error correction.
A priori analysis of algorithms is concerned chiefly with determination of
order of magnitude/frequency count of the step/statement.
The space complexity of an algorithm indicates the quantity of temporary
storage required for running the algorithm, i.e. the amount of memory needed
by the algorithm to run to completion.
The amount of memory an algorithm needs to run to completion is called its
space complexity.
The time complexity of an algorithm may be defined as the amount of time
the computer requires to run to completion.
A pseudo-code is neither an algorithm nor a program. It is an art of expressing
a program in simple English that parallels the forms of a computer language.
In the field of computer programming, the term code refers to instructions
to a computer in a programming language.
If you are using a procedural language, you need to ensure that the code
begins at the first executable statement and continues linearly to a final return
or end-of-block statement.
Portable code makes it possible for the source file to be compiled with any
compiler.
Analysis is the first technical step in the program development process.
The design phase will begin after the software analysis process. It is a multi-
step process.
Program testing begins after the implementation. The importance of
software testing lies in uncovering errors, assuring software quality and
reviewing the analysis, design and implementation phases.
This activity includes amendments, measurements and tests in the existing Performance Analysis
software.
A program should be correct and designed in accordance with the
specifications so that anyone can understand the design of the program.
3.9 FURTHER READINGS
BLOCK - II
UNIT 4 ANALYSIS OF RECURSIVE ALGORITHMS
4.0 INTRODUCTION
4.1 OBJECTIVES
4.2 RECURSION
4.3 RECURSIVE ALGORITHMS
4.4.2 Graphical Representation
4.6 SUMMARY
4.8 SELF ASSESSMENT QUESTIONS AND EXERCISES
UNIT 5 EMPIRICAL ANALYSIS OF ALGORITHMS
Structure
5.0 Introduction
5.1 Objectives
5.2 Brute Force
5.2.1 Selection Sort using Brute Force Approach
5.3 Selection Sort
5.4 Bubble Sort
5.5 Sequential Sorting
5.6 Answers to Check Your Progress Questions
5.7 Summary
5.8 Key Words
5.9 Self Assessment Questions and Exercises
5.10 Further Readings
5.0 INTRODUCTION
5.1 OBJECTIVES
Selection sort works on the basic principle of sorting the given array of data
by repeatedly selecting the largest or smallest item from the unsorted part of the
array. The selection of the smallest/largest number is carried out by scanning the
entire length of the array, that is, all the elements of the array are examined to
locate the desired element, which is then marked as sorted. The selection of a
given number is validated by performing comparisons of the current element with
its adjacent elements in the list, and thereafter the proper element is swapped into
place at its correct index in the sorted array.
Let’s consider an array A[n] with ‘n’ elements to be sorted. Sorting on the
basis of selecting the largest or smallest number in the list requires scanning all
the ‘n’ elements in the array A[n], consuming n-1 comparisons. Similarly, finding
the next largest or smallest element from the unsorted array requires n-1
comparisons, where n is now the effective size of the array A[n]. After every single
element is sorted, the effective size of the array is reduced by one. For example, if
the initial size is n=5, then in the first pass n-1 = 5-1 = 4 comparisons will happen
and the effective size becomes 4. To find the second sorted element, n-1 = 4-1 = 3
comparisons will happen because the effective size of the array has been reduced
from 5 to 4. Comparisons and swapping conclude once the effective size becomes
1. That means the total number of comparisons for n=5 is 4+3+2+1 = 10.
Algorithm
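The selection sort procedure outlined above can be rendered as a runnable Python sketch (the rendering is ours; the unit's own listings are language-neutral pseudocode):

```python
def selection_sort(arr):
    """Sort arr in place by repeatedly selecting the smallest remaining element."""
    n = len(arr)
    for i in range(n - 1):                  # effective size shrinks by one per pass
        min_idx = i
        for j in range(i + 1, n):           # n-1, n-2, ... comparisons
            if arr[j] < arr[min_idx]:
                min_idx = j
        # place the selected element at its final index
        arr[i], arr[min_idx] = arr[min_idx], arr[i]
    return arr
```

For n=5 this performs 4+3+2+1 = 10 comparisons, as counted above.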
Example
Analysis:
Let A[n] be a given array with random elements.
Selecting either the smallest or largest element requires scanning n
elements.
It performs n-1 comparisons.
Therefore, the subsequent elements are selected with the following pattern of
comparisons: (n-1), (n-2), (n-3), ..., where n is the original size of the
array; equivalently, each pass performs (n-1) comparisons, with the effective
n reduced by one after each element is sorted.
Therefore, the total number of comparisons (C) made is equal to,
C = (n-1)+(n-2)+(n-3)+…+2+1 = n(n-1)/2
The best case, average case and worst case complexities of selection sort
are all O(n²), since the number of comparisons performed does not depend
on the initial order of the elements.
Bubble sort is one of the simplest and oldest sorting approaches. Bubble
sort performs the sorting process in much the same fashion as selection sort:
each element in the list is compared with its immediate neighbour within the
list, and if the sort criterion is met, the items are swapped. This particular
technique is also called sinking sort because the smallest element in the sorted
array sinks to the bottom of the array, that is, to index 0, while the largest
element bubbles to the top of the array. Bubble sort differs from selection
sort in that, in bubble sort, the comparisons are carried out between adjacent
pairs of the array. The element produced by the first iteration is not necessarily
in its final sorted position with respect to all elements of the array. In bubble
sort the sorting process begins by comparing the first two elements (i and
i+1) in the list; if the element at the lower index is smaller than the other
element, no swapping is done, otherwise the elements are swapped. In the
next iteration the comparison is performed between elements i+1 and i+2 of
the array. Similarly, the next comparison pair will be i+2 and i+3, and this
will continue up to the last element of the array. The same pattern repeats,
pass after pass, until the final sorted array is obtained.
Let’s consider an array A[n] with ‘n’ elements to be sorted using bubble
sort. The array takes n-1 passes to become fully sorted. In the first pass there
are n-1 comparisons, in pass two n-2, and similarly in subsequent passes
n-3, …, 2 and 1.
That means that to find the smallest number in the list, all the ‘n’ elements
of the array A[n] are scanned, but in pairs, each pair consisting of adjacent
array elements. However, a single scan does not necessarily place more than
one element at its exact sorted location. The operational behaviour of bubble
sort is described with the help of the example provided below.
Example
Algorithm
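A minimal Python sketch of the bubble sort passes described above; the inner loop shrinks by one element after every pass, mirroring the n-1, n-2, … comparison counts (the rendering is ours):

```python
def bubble_sort(arr):
    """Sort arr in place by repeatedly swapping adjacent out-of-order pairs."""
    n = len(arr)
    for p in range(n - 1):              # the array takes n-1 passes
        for i in range(n - 1 - p):      # n-1, n-2, ... comparisons per pass
            if arr[i] > arr[i + 1]:     # compare the adjacent pair
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
    return arr
```

After each pass the largest remaining element has bubbled to the top end of the unsorted region.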
For the first position in the sorted list, the whole list is scanned sequentially.
The first position where 14 is stored presently, we search the whole list and find
that 10 is the lowest value.
So we replace 14 with 10. After one iteration 10, which happens to be the
minimum value in the list, appears in the first position of the sorted list.
For the second position, where 33 is residing, we start scanning the rest of
the list in a linear manner.
We find that 14 is the second lowest value in the list and it should appear at
the second place. We swap these values.
After two iterations, two least values are positioned at the beginning in the
list in a sequential sorted ascending manner.
The same process or methodology is applied to the rest of the items in the
array.
Check Your Progress
1. How does comparison based sorting approach sort data?
2. What is Brute force defined as?
3. How is the selection of a given number validated?
1. Comparison based sorting approach sorts data by comparing the data values
of the elements.
2. Brute force is defined as a type of problem solving approach wherein a
problem's solution is based directly on the problem statement or the problem
definition that is provided.
3. The selection of a given number is validated by performing comparisons of
the current element with its adjacent elements in the list.
5.7 SUMMARY
6.0 INTRODUCTION
In the field of computer science and mathematics, we often come across various
problems which are quite complex, and solving such problems is a difficult task.
Designing a solution for those problems which theoretically can be solved
algorithmically is quite tough. Hence, in order to solve such problems, many new
techniques and algorithms have been developed, out of which divide-and-conquer
is an efficient one.
The divide-and-conquer technique solves a problem by breaking a large
problem that is difficult to solve into sub-problems, solving these sub-problems
recursively and then combining the answers. This unit discusses algorithms such as
binary search, modular exponentiation, quick sort, and merge sort which are based
on the divide-and-conquer technique. It also gives a brief comparison of various
algorithms in terms of their time complexities.
6.1 OBJECTIVES
6.3 EXPONENTIATION
        x                       if n = 1
x^n =   (x^2)^(n/2)             if n is even
        x.(x^2)^((n-1)/2)       if n is odd
In this algorithm only O (log n) multiplications are used; therefore, the
computation of xn becomes faster.
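The recurrence above translates directly into a short recursive routine; this Python rendering is a sketch of the idea, not code from the text:

```python
def fast_power(x, n):
    """Compute x**n with O(log n) multiplications (n >= 1)."""
    if n == 1:
        return x
    # (x^2)^(n/2) for even n; x * (x^2)^((n-1)/2) for odd n
    half = fast_power(x * x, n // 2)
    return half if n % 2 == 0 else x * half
```

Each recursive call halves the exponent, which is where the O(log n) multiplication count comes from.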
The modular exponentiation, that is, x^b mod n, for very large b and n,
can be computed using the same technique. Modular exponentiation is useful in
computer science, especially in the field of cryptography. Let the binary
representation of b be (bm, bm-1, bm-2, ..., b1, b0), where bm is the most significant
bit and b0 is the least significant bit. Then x^b mod n can be computed by
Algorithm 6.1.
Algorithm 6.1: Modular Exponentiation
modular_exponentiation(x,b,n)
1. Set c=0
2. Set res=1
//let(bm, bm-1, bm-2…..,b1, b0) be the binary
//representation of b
3. for i=m downto 0
4. {
5. Set c=2*c
6. Set res=(res*res) mod n
7. if (bi=1)
8. {
9. Set c=c+1
10. Set res=(res*x) mod n
11. }
12. }
13. return res
Step 1: Initially, the value of c is 0 and the value of res is 1, according to steps 1
and 2 of the algorithm. During the first iteration (that is, i = 9), we get c =
0 and res = 1 from steps 5 and 6 of the algorithm. The condition in step
7 holds true (as b9 = 1), which results in:
c = 0 + 1 = 1 (from step 9 of the algorithm)
res = (1*5) mod 765 = 5 (from step 10 of the algorithm)
Step 2: In the second iteration (that is, i = 8), we get c = 1 * 2 = 2, and res =
(5 * 5) mod 765 = 25 from steps 5 and 6 of the algorithm, respectively.
Now, since the condition in step 7 of the algorithm evaluates to false (as
b8 = 0), the values of c and res remain 2 and 25, respectively.
Step 3: In the third iteration (that is, i = 7), we get c = 2 * 2 = 4, and res =
(25 * 25) mod 765 = 625 from steps 5 and 6 of the algorithm, respectively.
As b7 = 1, the condition in step 7 of the algorithm evaluates to true. This
makes the value of c = 4 + 1 = 5 and res = (625 * 5) mod 765 = 65.
Proceeding in this manner through each iteration, we get the final value of
res = 655. That is, 5^650 mod 765 = 655.
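Algorithm 6.1 can be checked with a direct Python transcription (the variable names follow the algorithm; `bin(b)` supplies the bits bm … b0, and the rendering is ours):

```python
def modular_exponentiation(x, b, n):
    """Left-to-right binary method of Algorithm 6.1: returns x**b mod n."""
    c, res = 0, 1
    for bit in bin(b)[2:]:       # bits b_m down to b_0
        c = 2 * c                # step 5
        res = (res * res) % n    # step 6: squaring
        if bit == '1':           # step 7: test the current bit
            c = c + 1            # step 9
            res = (res * x) % n  # step 10
    return res
```

For the worked example, modular_exponentiation(5, 650, 765) returns 655, matching the trace.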
The binary search technique is used to search for a particular data item in a sorted
(in ascending or descending order) array. In this technique, the value to be searched
(say, item) is compared with the middle element of the array. If item is equal to the
middle element, then search is successful. If item is smaller than the middle value,
item is searched in the segment of the array before the middle element. However,
if item is greater than the middle value, item is searched in the array segment after
the middle element. This process is repeated until the value is found or the array
segment is reduced to a single element that is not equal to item.
At every stage of the binary search technique, the array is reduced to a
smaller segment. It searches a particular data item in the lowest possible number
of comparisons. Hence, the binary search technique is used for larger and sorted
arrays, as it is faster as compared to linear search. For example, consider an
array ARR shown in Figure 6.2.
To search an item (say, 7) using binary search in the array ARR with size=7,
these steps are performed.
1. Initially, set LOW=0 and HIGH=size–1. The middle of the array is determined
using the formula MID=(LOW+ HIGH)/2, that is, MID=(0+6)/2, which is
equal to 3. Thus, ARR [MID]=4.
2. Since the value stored at ARR[3] is less than the value to be searched, that
is 7, the search process is now restricted from ARR[4] to ARR[6]. Now
LOW is 4 and HIGH is 6. The middle element of this segment of the array is
calculated as MID=(4+6)/2, that is, 5. Thus, ARR[MID]=6.
3. The value stored at ARR[5] is less than the value to be searched, hence the
search process begins from the subscript 6. As ARR[6] is the last element,
the item to be searched is compared with this value. Since ARR[6] is the
value to be searched, the search is successful.
Algorithm 6.2: Binary Search
binary_search(ARR,size,item)
//ARR is the list in which the element is to be searched
1. Set LOW=0
2. Set HIGH=size-1
3. while (LOW <= HIGH)
4. {
5. Set MID=(LOW + HIGH)/2
6. If (item=ARR[MID])
7. return MID
8. Else If (item<ARR[MID])
9. Set HIGH=MID–1
10. Else
11. Set LOW=MID+1
12. }
13. return -1 //item not found in the list
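A runnable Python version of Algorithm 6.2 (the loop condition is read as LOW <= HIGH; the rendering is ours):

```python
def binary_search(arr, item):
    """Return the index of item in sorted arr, or -1 if it is absent."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2     # middle of the current segment
        if arr[mid] == item:
            return mid
        elif item < arr[mid]:
            high = mid - 1          # search the left segment
        else:
            low = mid + 1           # search the right segment
    return -1                       # item not found in the list
```

With ARR = [1, 2, 3, 4, 5, 6, 7] and item 7, the call returns 6, matching the three steps traced above.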
6.5 QUICK SORT

The quick sort algorithm is based on the fact that it is easier and faster to sort
two smaller arrays than one larger array. Thus, it follows the principle of divide-
and-conquer. The quick sort algorithm first picks up a partitioning element, called
pivot, that divides the list
into two sub lists such that all the elements in the left sub list are smaller than the
pivot, and all the elements in the right sub list are greater than the pivot. Once the
given list is partitioned into two sub lists, these two sub lists are sorted separately.
The same process is applied to sort the elements of left and right sub lists. This
process is repeated recursively until each sub list contains not more than one
element.
As we have discussed, the main task in quick sort is to find the pivot that
partitions the given list into two halves so that the pivot is placed at its appropriate
location in the array. The choice of pivot has a significant effect on the efficiency of
quick sort algorithm. The simplest way is to choose the first element as pivot.
However, first element is not always a good choice, especially if the given list is
already or nearly ordered. For better efficiency, the middle element is chosen as
pivot. For simplicity, we will take the first element as pivot.
The steps involved in quick sort algorithm are as follows:
1. Initially, three variables pivot, beg and end are taken, such that both
pivot and beg refer to the 0th position, and end refers to (n-1)th
position in the list.
2. Starting with the element referred to by end, the array is scanned from
right to left, and each element on the way is compared with the element
referred to by pivot. If the element referred to by pivot is greater
than the element referred to by end, they are swapped and Step 3 is
performed. Otherwise, end is decremented by 1 and Step 2 is continued.
3. Starting with the element referred to by beg, the array is scanned from left
to right, and each element on the way is compared with the element referred
to by pivot. If the element referred to by pivot is smaller than the
element referred to by beg, they are swapped and Step 2 is performed.
Otherwise, beg is incremented by 1 and Step 3 is continued.
The first pass terminates when pivot, beg and end all refer to the
same array element. This indicates that the element referred to by pivot is
placed at its final position. The elements to the left of this element are smaller than
this element and elements to its right are greater.
To understand the quick sort algorithm, consider an unsorted array shown
in Figure 6.3. The steps to sort the values stored in the array in ascending order
using quick sort are given here.
2. The scanning of elements is started from the end of the list. ARR[pivot] (that
is, 8) is greater than ARR[end] (that is, 4). Therefore, they are swapped.
3. Now, the scanning of elements is started from the beginning of the list. Since
ARR[pivot] (that is, 8) is greater than ARR[beg], therefore beg is
incremented by 1, and the list remains unchanged.
4. Next, since the element ARR[pivot] is smaller than ARR[beg], they are swapped.
5. Again, the list is scanned from right to left. Since, ARR[pivot] is smaller
than ARR[end], therefore the value of end is decremented by 1, and the
list remains unchanged.
7. Now, since ARR[pivot] is greater than ARR[end], they are swapped.
8. Now, the list is scanned from left to right. Since, ARR[pivot] is greater
than ARR[beg], value of beg is incremented by 1, and the list remains
unchanged.
At this point, since the variables pivot, beg and end all refer to the same
element, the first pass is terminated and the value 8 is placed at its appropriate
position. The elements to its left are smaller than 8, and elements to its right are
greater than 8. These two sub lists are again sorted using the same procedure.
Algorithm 6.3: Quick Sort
quick_sort(ARR,size,lb,ub)
1. Set i=1 //i is a static integer variable
2. If (lb<ub)
3. {
4. Call splitarray(ARR,lb,ub) //returning an
//integer value pivot
5. Print ARR after ith pass
6. Set i=i+1
7. Call quick_sort(ARR,size,lb,pivot – 1)
//recursive call to quick_sort() to
//sort left sub list
8. Call quick_sort(ARR,size,pivot + 1,ub);
//recursive call to quick_sort()
//to sort right sub list
9. }
10. Else If (ub=size-1)
11. Print “No. of passes: ”, i
splitarray(ARR,lb,ub)
//spiltarray partitions the list into two sub lists such
//that the elements in left sub list are smaller than
//ARR[pivot], and elements in the right sub list are
//greater than ARR[pivot]
1. Set flag=0
2. Set beg=pivot=lb
3. Set end=ub
4. while (flag != 1)
5. {
6. while (ARR[pivot] <= ARR[end] AND pivot != end)
7. Set end=end–1
8. If (pivot=end)
9. Set flag=1
10. Else
11. {
12. Set temp=ARR[pivot]
13. Set ARR[pivot]=ARR[end]
14. Set ARR[end]=temp
15. Set pivot=end
Self-Instructional
Material 57
16. }
17. If (flag != 1)
18. {
19. while (ARR[pivot] >= ARR[beg] AND pivot != beg)
20. Set beg=beg+1
21. If (pivot=beg)
22. Set flag=1
23. Else
24. {
25. Set temp=ARR[pivot]
26. Set ARR[pivot]=ARR[beg]
27. Set ARR[beg]=temp
28. Set pivot=beg
29. }
30. }
31. }
32. return pivot
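Algorithm 6.3's partitioning scheme can be transcribed into Python (the scan conditions are read as <= and >=, which the pass-by-pass trace above requires; the rendering is ours):

```python
def split_array(arr, lb, ub):
    """Partition arr[lb..ub] around arr[lb]; mirrors splitarray in Algorithm 6.3."""
    beg, pivot, end = lb, lb, ub
    while True:
        # scan from right to left past elements >= the pivot element
        while arr[pivot] <= arr[end] and pivot != end:
            end -= 1
        if pivot == end:
            return pivot
        arr[pivot], arr[end] = arr[end], arr[pivot]
        pivot = end
        # scan from left to right past elements <= the pivot element
        while arr[pivot] >= arr[beg] and pivot != beg:
            beg += 1
        if pivot == beg:
            return pivot
        arr[pivot], arr[beg] = arr[beg], arr[pivot]
        pivot = beg

def quick_sort(arr, lb, ub):
    """Recursively sort arr[lb..ub] using the partition above."""
    if lb < ub:
        p = split_array(arr, lb, ub)   # pivot is now at its final position
        quick_sort(arr, lb, p - 1)     # sort the left sub list
        quick_sort(arr, p + 1, ub)     # sort the right sub list
    return arr
```

The first pass ends when pivot, beg and end coincide, exactly as described in the steps above.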
given problem. The random numbers are generated by a random number generator.
Since the randomizer will generate different values with each execution, the
intermediate passes of the algorithm may vary for the same input data. The expected
complexity of this algorithm does not depend on the input, but the actual running
time is affected greatly by the random numbers chosen.
Algorithm 6.4: Randomized Quick Sort
randomized_quick_sort(ARR,size,lb,ub)
1. Set i=1 //i is a static integer variable
2. If (lb < ub)
3. {
4. Call randomized_splitarray(ARR,lb,ub)
5. Print ARR after ith pass
6. Set i=i+1
7. Call randomized_quick_sort(ARR,size,lb,pivot–1)
//recursive call to randomized_quick_sort()
8. Call randomized_quick_sort(ARR,size,pivot+1,ub)
//recursive call to randomized_quick_sort()
9.}
10.Else If (ub=size-1)
11. Print “No. of passes: ”, i
randomized_splitarray(ARR,lb,ub)
//randomized_splitarray() randomly chooses an element
//from the list and exchanges it with the first element
//and then calls splitarray
1. Set i=Random(lb,ub)
2. Exchange ARR[lb] and ARR[i]
3. return splitarray(ARR,lb,ub)
splitarray (ARR,lb,ub)
1. Set flag=0
2. Set beg=pivot=lb
3. Set end=ub
4. while (flag != 1)
5. {
6. while (ARR[pivot] <= ARR[end] AND pivot != end)
7. Set end=end–1
8. If (pivot=end)
9. Set flag=1
10. Else
11. {
12. Set temp=ARR[pivot]
13. Set ARR[pivot]=ARR[end]
14. Set ARR[end]=temp
15. Set pivot=end
16. }
17. If (flag != 1)
18. {
19. while (ARR[pivot] >= ARR[beg] AND pivot != beg)
20. Set beg=beg+1
21. If (pivot=beg)
22. Set flag=1
23. Else
24. {
25. Set temp=ARR[pivot]
26. Set ARR[pivot]=ARR[beg]
27. Set ARR[beg]=temp
28. Set pivot=beg
29. }
30. }
31. }
32. return pivot
Like quick sort, merge sort algorithm also follows the principle of divide-and-conquer.
In this sorting, the list is first divided into two halves. The left and right
sub lists obtained are recursively divided into two sub lists until each sub list contains
not more than one element. The sub lists containing only one element do not require
any sorting. Therefore, we start merging the sub lists of size one to obtain the
sorted sub list of size two. Similarly, the sub lists of size two are then merged to
obtain the sorted sub list of size four. This process is repeated until we get the final
sorted array.
To understand the merge sort algorithm, consider an unsorted array shown
in Figure 6.4. The steps to sort the values stored in the array in ascending order
using merge sort are given here.
2. The left sub list is considered first, and it is again divided into two sub lists.
Now, low=0 and high=3, therefore, mid=(0+3)/2=1. Thus, the left sub
list is divided into two halves from the 2nd element. The sub lists are as
follows:
3. These two sub lists are again divided into sub lists such that all of them
contain one element. Now the sub lists are as follows:
4. Since each sub list now contains one element, they are first merged to
produce the two arrays of size 2. First, the sub lists containing the elements
18 and 13 are merged to give one sorted sub array, and the sub lists containing
the elements 5 and 20 are merged to give another sorted sub array. The
two sorted sub arrays are as follows:
5. Now these two sub arrays are again merged to give the following sorted
sub array of size 4.
6. After sorting the left half of the array, we perform the same steps for the
right half. The sorted right half of the array is given below:
7. Finally, the left and right halves of the array are merged to give the sorted
array as shown in Figure 6.5.
22. Set merged[k]=ARR[i]
23. Set i=i+1
24. Set k=k+1
25. }
26. }
27. If (j <= ur)
28. {
29. while (j <= ur)
30. {
31. Set merged[k]=ARR[j]
32. Set j=j+1
33. Set k=k+1
34. }
35. }
36. Set k=ll
37. while (k <= ur)
38. {
39. Set ARR[k]=merged[k]
40. Set k=k+1
41. }
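The complete procedure, including the merge step sketched in steps 22-41 above, can be transcribed into Python as follows (a sketch; the slice-based copy-back replaces steps 36-41, and the rendering is ours):

```python
def merge_sort(arr, ll, ur):
    """Recursively split arr[ll..ur] in half, then merge the sorted halves."""
    if ll >= ur:
        return arr                      # one element: already sorted
    mid = (ll + ur) // 2
    merge_sort(arr, ll, mid)            # sort the left half
    merge_sort(arr, mid + 1, ur)        # sort the right half
    merged, i, j = [], ll, mid + 1
    while i <= mid and j <= ur:         # merge the two sorted runs
        if arr[i] <= arr[j]:
            merged.append(arr[i]); i += 1
        else:
            merged.append(arr[j]); j += 1
    merged.extend(arr[i:mid + 1])       # leftover from the left half
    merged.extend(arr[j:ur + 1])        # leftover from the right half
    arr[ll:ur + 1] = merged             # copy back, as in steps 36-41
    return arr
```

For the array beginning 18, 13, 5, 20 used in the steps above, the left half merges to 13, 18 and 5, 20 before producing the sorted run of size 4.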
Volker Strassen is a German mathematician born in 1936. His algorithm for matrix
multiplication is still one of the main methods that outperforms the general matrix
multiplication algorithm.
Assume that X and Y are two n x n matrices. We need to determine the
matrix Z as the product of matrices X and Y, that is Z = X x Y, and Z is also an
n x n matrix. The conventional method to compute the element at position
Z[i, j] is as follows:

Z(i,j) = Σ (1 <= k <= n) X(i,k).Y(k,j)                … (6.1)
For n=2, the matrix Z is obtained by directly multiplying the elements of X
and Y. However, for n>2, the matrices are recursively divided into n/2 x n/2
sub-matrices, and multiplication and addition operations are applied to them.
If the matrices are of size 4x4, then to compute XY using Equation 6.3, we
need eight multiplications and four additions of n/2 x n/2 matrices. Since two
matrices of size n/2 x n/2 can be added in cn^2 time, where c is a constant, the
overall computing time T(n) of the divide-and-conquer technique is as follows:

         b                      if n <= 2
T(n) =
         8T(n/2) + cn^2         if n > 2
In Strassen's method, the product is obtained by first computing the seven
products P1 to P7 using the formulas given in Equation 6.4, followed by computing
the Zij using the formulas given in Equation 6.5.
P1 = (X11 + X22)(Y11 + Y22)
P2 = (X21 + X22)Y11
P3 = X11(Y12 - Y22)
P4 = X22(Y21 - Y11)
P5 = (X11 + X12)Y22                    … (6.4)
P6 = (X21 - X11)(Y11 + Y12)
P7 = (X12 - X22)(Y21 + Y22)
Z11 = P1 + P4 - P5 + P7
Z12 = P3 + P5                          … (6.5)
Z21 = P2 + P4
Z22 = P1 + P3 - P2 + P6
As we can see, to compute P1, P2, P3, P4, P5, P6 and P7, seven matrix
multiplications and 10 matrix additions or subtractions are required, and to compute
the Zij, 8 additions or subtractions are required. The time complexity T(n) of this
technique is as follows:
         b                      if n <= 2
T(n) =                                          … (6.6)
         7T(n/2) + an^2         if n > 2
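The seven products and the recombination of Equations 6.4 and 6.5 can be checked numerically for 2x2 matrices (one level of the recursion; the Python rendering is ours):

```python
def strassen_2x2(X, Y):
    """One level of Strassen's recursion on 2x2 matrices (Equations 6.4-6.5)."""
    (x11, x12), (x21, x22) = X
    (y11, y12), (y21, y22) = Y
    p1 = (x11 + x22) * (y11 + y22)   # the seven products of Equation 6.4
    p2 = (x21 + x22) * y11
    p3 = x11 * (y12 - y22)
    p4 = x22 * (y21 - y11)
    p5 = (x11 + x12) * y22
    p6 = (x21 - x11) * (y11 + y12)
    p7 = (x12 - x22) * (y21 + y22)
    # recombination of Equation 6.5
    return [[p1 + p4 - p5 + p7, p3 + p5],
            [p2 + p4, p1 + p3 - p2 + p6]]
```

For example, strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]) yields [[19, 22], [43, 50]], the same result as conventional multiplication, using seven multiplications instead of eight.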
6.9 SUMMARY
At every stage of the binary search technique, the array is reduced to a
smaller segment.
In each iteration, binary search algorithm reduces the array to one half.
The main task in quick sort is to find the pivot that partitions the given list
into two halves so that the pivot is placed at its appropriate location in the
array.
The quick sort algorithm gives worst case performance when the list is
already sorted.
The complexity of quick sort algorithm in worst case is O(n2) which is
observed when the list is already sorted.
The randomized version of quick sort is a better option when the inputs are
large.
Like quick sort, merge sort algorithm also follows the principle of divide-and-conquer.
In the first pass of merge sort algorithm, the given array is divided into two
halves and each half is sorted separately.
Volker Strassen is a German mathematician born in 1936. His algorithm for
matrix multiplication is still one of the main methods that outperforms the
general matrix multiplication algorithm.
Thus, the time complexity of divide-and-conquer technique is also O(n3),
which is same as the conventional approach of matrix multiplication.
Long Answer Questions
1. “The divide-and-conquer technique is one of the widely used techniques
to develop algorithms for problems which can be divided into sub-problems
(smaller in size but similar to the actual problem) so that they can be solved
efficiently.” Explain with the help of an example.
2. “The binary search technique is used to search for a particular data item in
a sorted (in ascending or descending order) array.” Discuss.
3. What is general strategy? Discuss the steps of general strategy.
General Method
7.0 INTRODUCTION
7.1 OBJECTIVES
In each iteration, a particular entry in the table is filled in, row by row,
using a recursive approach.
Algorithm Binomial(n, k)
for i = 0 to n do
    for j = 0 to min(i, k) do
        C[i, j] = 1 if j = 0 or j = i, else C[i-1, j-1] + C[i-1, j]
return C[n, k]
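The row-by-row table fill can be sketched in Python (the table C and the loop bounds follow the algorithm above; the rendering is ours):

```python
def binomial(n, k):
    """Fill the DP table row by row: C[i][j] = C[i-1][j-1] + C[i-1][j]."""
    C = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                C[i][j] = 1                             # boundary entries
            else:
                C[i][j] = C[i - 1][j - 1] + C[i - 1][j]  # Pascal's rule
    return C[n][k]
```

Each entry is computed from the row above it, so no recursion and no repeated subproblem evaluation are needed.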
7.3 FLOYD-WARSHALL ALGORITHM
All pairs-shortest path problem is to find the shortest path between all pairs of
vertices in a graph G=(V,E). For example, if we are given a graph consisting of
five cities, say A, B, C, D, and E, then the aim is to find the shortest path between
all pairs of vertices such as from A to B, B to C, A to E, E to D, and so on. The
problem is efficiently solved by the Floyd-Warshall algorithm that we will discuss
in this section. Another way is to apply the single source shortest path algorithm on
all vertices, which is explained in the next section.
7.3.1 The Floyd-Warshall Algorithm
The Floyd-Warshall algorithm is used to find the shortest path between all pairs of
vertices in a directed graph G=(V,E). This algorithm uses the dynamic
programming approach in a different manner. This algorithm defines the structure
of the shortest path by considering the ‘intermediate’ vertices of the shortest path,
where an intermediate vertex of the path p = {v1, v2, v3, …, vk} can be any
vertex other than v1 and vk.
Let G be the graph and V be the vertex set of the graph, where V = {1, 2, 3, 4,
…, n}. For any pair of vertices i, j ∈ V, consider all paths from i to j whose
intermediate vertices belong to the subset {1, 2, 3, 4, …, k} of V for some
vertex k, and let p be the minimum weighted path among them all. The Floyd-
Warshall algorithm establishes a notable relationship between path p and other
shortest paths from i to j with the intermediate vertices from set {1, 2, 3, 4,
…, k-1}.
If k is not an intermediate vertex of path p, then all the intermediate vertices
of path p belong to the set {1, 2, 3, 4, …, k-1}. Hence, a shortest path from
vertex i to j with all intermediate vertices in set {1, 2, 3, 4, …, k-1} is also a
shortest path with all intermediate vertices in set {1, 2, 3, 4, …, k}. If k is an
intermediate vertex of path p, then we break path p into two paths, say p1 (from
i to k) and p2 (from k to j), such that all the intermediate vertices of p1 and p2
are in set {1, 2, 3, 4, …, k-1}.
i --p1--> k --p2--> j
From the above discussion, a recursive solution for the estimation of the
shortest path can be made as follows. Let dij(k) be the weight of the shortest path
from vertex i to j with all intermediate vertices belonging to set {1, 2, 3, 4,
…,k}.
If k is zero, that is, no intermediate vertex exists between i and j, then
there will be a single edge from i to j, and hence dij(0) = Wij. For this, first we
need to define Wij. Actually, W is the weight matrix defined as:

        0                                    if i = j
Wij =   the weight of directed edge (i,j)    if i != j and (i,j) ∈ E
        ∞                                    if i != j and (i,j) ∉ E
On the basis of the above definition, the recursive definition can be given as:

dij(k) = Wij                                           if k = 0
dij(k) = min(dij(k-1), dik(k-1) + dkj(k-1))            if k >= 1
The algorithm given below is used to compute the all pairs shortest path
using the above recurrence relation.
Algorithm 7.1 All Pairs Shortest Path
FLOYD-WARSHALL(W)
//consider set of vertices V={1, 2, 3, ..., n}
1. Set n = rows in matrix W
2. Set d(0) = W
3. for k = 1 to n do
4. for i = 1 to n do
5. for j = 1 to n do
6. Set dij(k) = min(dij(k-1), dik(k-1)+dkj(k-1))
7. return d(n)
Example 7.1: Given the directed graph shown in Figure 7.2, design the initial
n×n matrix W, then compute the values of dij(k) for increasing values of k, till it
returns the matrix d(n) of shortest path weights.

(Figure 7.2: a directed graph on vertices 1-5 with weighted edges, including
negative weights -4 and -5; the edges and weights are those recorded in the
matrix W below.)
Solution: The initial weight matrix is:

W =
 0   3   8   ∞  -4
 ∞   0   ∞   1   7
 ∞   4   0   ∞   ∞
 2   ∞  -5   0   ∞
 ∞   ∞   ∞   6   0

Here, we have considered the direct path between all the vertices, so d(0) = W,
that is,

d(0) =
 0   3   8   ∞  -4
 ∞   0   ∞   1   7
 ∞   4   0   ∞   ∞
 2   ∞  -5   0   ∞
 ∞   ∞   ∞   6   0

Now, after applying the algorithm we get the following matrices:

d(1) =
 0   3   8   ∞  -4
 ∞   0   ∞   1   7
 ∞   4   0   ∞   ∞
 2   5  -5   0  -2
 ∞   ∞   ∞   6   0

d(2) =
 0   3   8   4  -4
 ∞   0   ∞   1   7
 ∞   4   0   5  11
 2   5  -5   0  -2
 ∞   ∞   ∞   6   0

d(3) =
 0   3   8   4  -4
 ∞   0   ∞   1   7
 ∞   4   0   5  11
 2  -1  -5   0  -2
 ∞   ∞   ∞   6   0

d(4) =
 0   3  -1   4  -4
 3   0  -4   1  -1
 7   4   0   5   3
 2  -1  -5   0  -2
 8   5   1   6   0

d(5) =
 0   1  -3   2  -4
 3   0  -4   1  -1
 7   4   0   5   3
 2  -1  -5   0  -2
 8   5   1   6   0

This is the required matrix that shows all pairs shortest paths of the graph.
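Algorithm 7.1 can be transcribed into Python and checked against the worked example; float('inf') plays the role of ∞, and the rendering is ours:

```python
INF = float('inf')

def floyd_warshall(W):
    """Algorithm 7.1: relax d[i][j] through each intermediate vertex k in turn."""
    n = len(W)                          # number of rows in matrix W
    d = [row[:] for row in W]           # d(0) = W
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # d_ij(k) = min(d_ij(k-1), d_ik(k-1) + d_kj(k-1))
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d                            # d(n): all pairs shortest path weights
```

Running it on the weight matrix of the example reproduces the final matrix d(5) above.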
An Optimal Binary Search Tree (OBST) is a binary search tree in which nodes
are arranged in such a way that the cost of searching any value in this tree is
minimum. Let us consider a given set of n distinct key values
A={a1,a2,...,an}, where a1<a2<...<an and it is required to construct
binary search tree from these key values. The search for any of these key values
will be successful. However, there may be many searches for the key values which
are not part of the set A, and thus will always be unsuccessful. To represent the
key values that are not part of A, dummy (external) nodes are added in the tree.
If there are n key values, then there will be n+1 dummy nodes. Let d0, d1, ..., dn
represent the dummy nodes for key values not present in set A. Here, d0 represents
all key values less than a1, di (for i = 1 to n-1) represents all the key values
between ai and ai+1, and dn represents all the key values greater than an.
Figure 7.3 shows a binary search tree with dummy nodes (represented by
square) added. Here, the internal nodes (shown by circle) represent the key values
(a1 to a6) which are actually stored in the tree, while the external nodes represent
the key values (d0 to d6) which are not present in the tree.
Let pi be the probability that a search will be for the key value ai. Let qi be
the probability that a search will be unsuccessful and will end up at the dummy
node di. This implies,
Σ (i=1 to n) pi + Σ (i=0 to n) qi = 1
Using the probabilities of searches for key values represented by both internal
and dummy nodes, the expected cost of a search in a binary search tree T can be
determined. Let the cost of a search of a particular node be the number of nodes
visited, which is equal to one more than the depth of the node to be searched in
tree T. Then the expected cost c of a search in binary tree T is given by the
following equation.
c = Σ (i=1 to n) (depthT(ai) + 1).pi + Σ (i=0 to n) (depthT(di) + 1).qi
  = 1 + Σ (i=1 to n) depthT(ai).pi + Σ (i=0 to n) depthT(di).qi
The aim is to create a binary search tree with minimum expected search
cost, that is, an optimal binary search tree. Unfortunately, the binary search tree
obtained using the greedy technique may or may not be optimal. This is because it
always creates the tree in the decreasing order of the probabilities, that is, by
taking the highest probability first, then the second highest, and so on. The resulting
binary search tree may not always be the best solution of the problem. A guaranteed
optimal binary search tree can be obtained using the dynamic programming
technique, which is discussed in next unit.
Knapsack problems can be explained by considering the following example. A thief wants
to rob a store which contains n items. Item i has weight wi and is worth vi dollars;
the profit earned for the ith item is pi = vi/wi, and the capacity of the Knapsack (bag) is W.
If a fraction xi, 0 ≤ xi ≤ 1, of item i is placed in the bag, then a profit of pixi is
earned. The main objective of the problem is for the thief to take as valuable a
load as possible without exceeding W, i.e., to maximize the total profit earned.
So, we have to maximize

\sum_{i=1}^{n} p_i x_i

such that

\sum_{i=1}^{n} w_i x_i \le W

where 0 \le x_i \le 1 and 1 \le i \le n.
The following are two versions of Knapsack problem:
0-1 Knapsack: In 0-1 Knapsack, an item must either be taken whole or left, but
cannot be taken in a fractional amount, i.e., the value of xi for the ith item is either 0 or
1. If an item is taken then xi = 1 and if an item is left then xi = 0. The 0-1 Knapsack
problem can be solved using the dynamic programming method.
Fractional Knapsack: In fractional Knapsack, the thief can fraction the
items. Fractional Knapsack problem can be solved using the greedy method.
Consider the following algorithm for fractional Knapsack. Here, we consider
three arrays for weights, values and profits, respectively. W denotes the capacity
of the Knapsack. This algorithm provides the fractions of items taken.
FRACTIONAL-KNAPSACK (w, v, W)
1. for i ← 1 to n
2.    do p[i] ← v[i]/w[i]
3. Arrange the items in the descending order of profit using any efficient sorting algorithm
4. for i ← 1 to n
5.    do x[i] ← 0.0
6. U ← W
7. for i ← 1 to n
8.    do if (w[i] > U)
9.       then break
10.      else x[i] ← 1.0
11.           U ← U – w[i]
12. if (i ≤ n)
13.    then x[i] ← U/w[i]
Arranging the profit array using merge sort or heap sort takes O(n log n) time; the
remaining steps take only O(n) time, so the algorithm runs in O(n log n) time overall.
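As a concrete sketch, FRACTIONAL-KNAPSACK can be written in C as follows. This is a 0-based translation assuming at most 64 items, and a simple selection sort stands in for the merge or heap sort that would give the O(n log n) bound:

```c
#include <assert.h>

/* A sketch of FRACTIONAL-KNAPSACK in C (0-based, at most 64 items).
 * Items are taken in decreasing order of profit p[i] = v[i]/w[i];
 * x[i] receives the fraction of item i loaded into the Knapsack. */
void fractional_knapsack(int n, const double w[], const double v[],
                         double W, double x[]) {
    int idx[64];                        /* item order by profit */
    for (int i = 0; i < n; i++) { idx[i] = i; x[i] = 0.0; }
    /* selection sort on profit density, descending (O(n^2); merge
     * or heap sort would give the O(n log n) bound from the text) */
    for (int i = 0; i < n - 1; i++)
        for (int j = i + 1; j < n; j++)
            if (v[idx[j]] / w[idx[j]] > v[idx[i]] / w[idx[i]]) {
                int t = idx[i]; idx[i] = idx[j]; idx[j] = t;
            }
    double U = W;                       /* remaining capacity */
    for (int i = 0; i < n; i++) {
        int k = idx[i];
        if (w[k] > U) {                 /* item does not fit whole */
            x[k] = U / w[k];            /* take the fitting fraction */
            break;
        }
        x[k] = 1.0;                     /* take the whole item */
        U -= w[k];
    }
}
```

On the instance of Example 7.3 below (n = 3, W = 20), this yields the fractions x = (0, 1, 0.5).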
Example 7.3: Consider the following instance of the Knapsack problem.
n = 3, W = 20, (v1, v2, v3) = (25, 24, 15) and (w1, w2, w3) = (18, 15, 10).
Find the optimal solution to the Knapsack problem.
Solution:
i Items Weight (w[i]) Value (v[i]) Profit p[i]= v[i]/w[i]
1 I1 18 25 1.38
2 I2 15 24 1.6
3 I3 10 15 1.5
Now arrange the items in the decreasing order of their profit.
i Items Weight (w[i]) Value (v[i]) Profit p[i]= v[i]/w[i]
1 I2 15 24 1.6
2 I3 10 15 1.5
3 I1 18 25 1.38
In the for loop of steps 4–5, x[1] = x[2] = x[3] = 0, i.e., we have not taken any
item yet.

i  U   w[i] > U        x[i]
1  20  15 > 20, false  1 (Originally it is item I2 so x[2] = 1)
2  5   10 > 5, true    break

Since i ≤ n, Step 13 sets x[i] = U/w[i] = 5/10 = 1/2 (originally it is item I3, so
x[3] = 1/2). So, the solution according to the original items is x[1] = 0, x[2] = 1,
x[3] = 1/2, i.e., the optimal solution is (0, 1, 1/2).

Now consider a larger instance of the fractional Knapsack problem with n = 7,
W = 15, (v1, ..., v7) = (10, 5, 15, 7, 6, 18, 3) and (w1, ..., w7) = (2, 3, 5, 7, 1, 4, 1).

i  Items  Weight (w[i])  Value (v[i])  Profit p[i]= v[i]/w[i]
1  I1     2              10            5
2  I2     3              5             1.67
3  I3     5              15            3
4  I4     7              7             1
5  I5     1              6             6
6  I6     4              18            4.5
7  I7     1              3             3
Now arrange the items in the decreasing order of their profit.
i Items Weight (w[i]) Value (v[i]) Profit p[i]= v[i]/w[i]
1 I5 1 6 6
2 I1 2 10 5
3 I6 4 18 4.5
4 I7 1 3 3
5 I3 5 15 3
6 I2 3 5 1.67
7 I4 7 7 1
i  U   w[i] > U       x[i]
1  15  1 > 15, false  1 (Originally it is item I5 so x[5] = 1)
2  14  2 > 14, false  1 (Originally it is item I1 so x[1] = 1)
3  12  4 > 12, false  1 (Originally it is item I6 so x[6] = 1)
4  8   1 > 8, false   1 (Originally it is item I7 so x[7] = 1)
5  7   5 > 7, false   1 (Originally it is item I3 so x[3] = 1)
6  2   3 > 2, true    2/3 (Originally it is item I2 so x[2] = 2/3)
So, the solution according to the original items is,
x[1] = 1, x[2] = 2/3, x[3] = 1, x[4] = 0, x[5] = 1, x[6] = 1 and x[7] = 1
The optimal solution is (1, 2/3, 1, 0, 1, 1, 1).
Total profit = p[1]x[1] + p[2]x[2] + p[3]x[3] + p[4]x[4] + p[5]x[5]
+ p[6]x[6] + p[7]x[7]
= (5 × 1) + (1.67 × 2/3) + (3 × 1) + (1 × 0) + (6 × 1) + (4.5 × 1)
+ (3 × 1)
= 5 + 1.11 + 3 + 0 + 6 + 4.5 + 3
= 22.61 units
7.7 SUMMARY

In dynamic programming, the main problem is divided into sub-problems
and the solution of the main problem is expressed in terms of the solutions
obtained for the smaller sub-problems.
The most favorable approach used to compute binomial coefficients of any
expression is dynamic programming.
Dynamic programming is the best-suited approach for optimization problems.
In dynamic programming the solution of the problem is arrived by using
multistage optimized decisions.
The all-pairs shortest path problem is to find the shortest path between all pairs
of vertices in a graph G = (V,E).
The Floyd-Warshall algorithm is used to find the shortest path between all
pairs of vertices in a directed graph G = (V,E).
An Optimal Binary Search Tree (OBST) is a binary search tree in which
nodes are arranged in such a way that the cost of searching any value in this
tree is minimum.
UNIT 8 GREEDY TECHNIQUE
8.0 INTRODUCTION
8.1 OBJECTIVES
Consider that a large ship has to be loaded with cargo. The cargo is containerized
and all containers are of the same size, but the weights of different containers may
differ. Let the cargo capacity of the ship be c and the weight of the ith container
be wi, where 1 ≤ i ≤ n. We wish to load the ship with the maximum number of
containers.
To solve this problem using the greedy method, consider a variable x whose
value is either 0 or 1.
If x[i] = 0, it means that ith container is not loaded in the cargo.
If x[i] = 1, it means that ith container is loaded in the cargo.
We wish to assign values to the x[i]s that satisfy the following constraint:

\sum_{i=1}^{n} w_i\, x[i] \le c

There exist many feasible solutions because many assignments of the x[i]s
satisfy the given constraint, and a feasible solution which maximizes

\sum_{i=1}^{n} x[i]

is an optimal solution.
Hence, we proceed according to the greedy method as: In the first stage,
we select the container with the least weight, then select the container with the next
smallest weight and continue in this way until the capacity of cargo is reached or
we have finished with the containers. Selecting the containers in this way will keep
the total weight of the containers minimum, and hence leave maximum capacity so
that more and more containers will be loaded in the cargo.
CONTAINER-LOADING (A, capacity, n, x)
1. MERGESORT(A)
2. for i ← 1 to n
3.    do x[i] ← 0
4. i ← 1
5. while (i ≤ n and A[i].weight ≤ capacity)
6.    do x[A[i].id] ← 1
7.       capacity ← capacity – A[i].weight
8.       i ← i + 1
In this pseudocode, we are given an array A in which the weights of all containers are
arranged. The capacity of the ship is denoted by capacity and the number of containers
by n. x denotes whether a container is selected or not; accordingly, x is 1 or 0.
A[i].weight denotes the weight of the container at location i in the array A.
A[i].id denotes an identifier in the range 1 to n; this id records at which
location the container appeared in the original array.
Analysis: In Step 1 we use the merge sort technique to sort the containers
according to their weight in increasing order; heap sort could equally be used.
So, Step 1 takes O(n log n) time. Steps 2 and 3 take
O(n) time, and similarly Steps 5 to 8 take O(n) time.
So, T(n) = O(n log n) + O(n) = O(n log n)
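The pseudocode above can be sketched in C as follows. This is a 0-based translation in which each container carries its original index in an id field; an insertion sort stands in for MERGESORT for brevity (using merge or heap sort would give the stated O(n log n) bound):

```c
#include <assert.h>

/* A sketch of CONTAINER-LOADING in C. Each container keeps its
 * original position in id, so x can be reported in the original
 * order. Returns the number of containers loaded. */
typedef struct { int weight; int id; } Container;

int container_loading(Container A[], int n, int capacity, int x[]) {
    /* sort containers by weight in increasing order */
    for (int i = 1; i < n; i++) {
        Container key = A[i];
        int j = i - 1;
        while (j >= 0 && A[j].weight > key.weight) { A[j+1] = A[j]; j--; }
        A[j+1] = key;
    }
    int loaded = 0;
    for (int i = 0; i < n; i++) x[i] = 0;
    for (int i = 0; i < n && A[i].weight <= capacity; i++) {
        x[A[i].id] = 1;                /* load this container */
        capacity -= A[i].weight;
        loaded++;
    }
    return loaded;
}
```

On the instance of Example 8.1 below (weights 100, 200, 50, 90, 150, 50, 20 and 80, capacity 400), it loads six containers.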
Example 8.1: Suppose we have 8 containers whose weights are 100, 200, 50,
90, 150, 50, 20 and 80, and a ship whose capacity, c = 400. Use CONTAINER-
LOADING algorithm to find an optimal solution to this container loading problem.
Solution: Apply the above CONTAINER-LOADING algorithm as:
Initially: A 100 200 50 90 150 50 20 80
In Step 1 we use a sorting technique and thus the array becomes,
20 50 50 80 90 100 150 200
In Steps 2 and 3, we set x[i] = 0, which indicates that no container has been
selected yet.
x[1] = x[2] = x[3] = x[4] = x[5] = x[6] = x[7] = x[8] = 0
In Steps 5 to 8, containers are loaded in increasing order of weight: 20, 50, 50,
80, 90 and 100 are loaded, for a total weight of 390, after which the remaining
capacity (10) is less than the weight of the next container (150), so the loop
stops. Thus six containers are loaded and, according to the original positions,
the solution is
x[1] = 1, x[2] = 0, x[3] = 1, x[4] = 1, x[5] = 0, x[6] = 1, x[7] = 1, x[8] = 1
Example 8.2: Suppose you have 6 containers whose weights are 50, 10, 30, 20,
60 and 5. Apply the CONTAINER-LOADING algorithm to this instance.
Initially: A 50 10 30 20 60 5
After sorting the array becomes,
5 10 20 30 50 60
5.    do if sm ≥ fi
6.       then A ← A ∪ {am}
7.            i ← m
8. return A
In this problem, we first select the activity that finishes earliest, i.e., the
activity which releases the resource soonest. Then we ignore all those
activities which are not compatible with the selected activity and select the next
compatible activity with the earliest finish time. Assume that the activities are
arranged in the order of their increasing finish times so that the process of selecting
an activity becomes faster.
Example 8.3: Consider the following 11 activities along with their start and finish
times.
i:  1 2 3 4 5 6 7 8 9 10 11
si: 1 3 0 5 3 5 6 8 8 2 12
fi: 4 5 6 7 8 9 10 11 12 13 14
Compute a schedule where the largest number of activities takes place.
Solution: First, we arrange the activities in the increasing order of their finish
time. In this example, the activities are already given in the increasing order of
their finish time. According to the algorithm, in Step 2 we select the first activity
into set A. In Steps 4 to 7 we have a loop from the 2nd to the nth activity; we select
every activity whose start time is greater than or equal to the finish time of the
activity most recently selected, i.e., we select those activities which are compatible
with the activities already selected.
n = 11; initially A = {a1}, i = 1.

m   sm  fi  Result
2   3   4   Condition fails
3   0   4   Condition fails
4   5   4   Condition true, A = {a1, a4}, i = 4
5   3   7   Condition fails
6   5   7   Condition fails
7   6   7   Condition fails
8   8   7   Condition true, A = {a1, a4, a8}, i = 8
9   8   11  Condition fails
10  2   11  Condition fails
11  12  11  Condition true, A = {a1, a4, a8, a11}

Return A = {a1, a4, a8, a11}
RECURSIVE-ACTIVITY-SELECTOR (s, f, i, j)
1. m ← i + 1
2. while m < j and sm < fi
3.    do m ← m + 1
4. if m < j
5.    then return {am} ∪ RECURSIVE-ACTIVITY-SELECTOR (s, f, m, j)
6.    else return ∅
The initial call is RECURSIVE-ACTIVITY-SELECTOR (s, f, 0, n + 1).
The operation of RECURSIVE-ACTIVITY-SELECTOR is shown as
follows:
Analysis: Both versions, i.e., iterative and recursive, run in Θ(n) time if
the activities are arranged in increasing order of their finish times.
If the activities are not sorted, then first sort them using either merge sort or
heap sort, which takes O(n log n) time.
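The iterative greedy activity selector can be sketched in C as follows, assuming the activities are already sorted by finish time. The arrays are 0-based internally, but the selected activities are reported with the 1-based indices used in Example 8.3:

```c
#include <assert.h>

/* A sketch of the iterative greedy activity selector in C. It
 * assumes activities are sorted by finish time, stores the 1-based
 * indices of the selected activities into sel, and returns how
 * many activities were selected. */
int activity_selector(int n, const int s[], const int f[], int sel[]) {
    int count = 0, last_finish;
    if (n == 0) return 0;
    sel[count++] = 1;                 /* always take the first activity */
    last_finish = f[0];
    for (int m = 1; m < n; m++) {
        if (s[m] >= last_finish) {    /* compatible with last chosen */
            sel[count++] = m + 1;
            last_finish = f[m];
        }
    }
    return count;
}
```

On the data of Example 8.3 it selects activities 1, 4, 8 and 11.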
(The accompanying figure traces RECURSIVE-ACTIVITY-SELECTOR on the activities of
Example 8.3. With a fictitious activity a0 having f0 = 0, the call with i = 0 selects
a1 (m = 1); the call RECURSIVE-ACTIVITY-SELECTOR(s, f, 1, 12) skips a2 (s = 3) and
a3 (s = 0) and selects a4 (m = 4); the call with i = 4 skips a5, a6 and a7 and
selects a8 (m = 8); and the final calls select a11, giving A = {a1, a4, a8, a11}.)
A variable length code does much better than fixed length coding, i.e., the saving
percentage increases using this technique. If we represent each character with an
unequal number of bits, then
a 0
b 101
c 100
d 111
e 1101
f 1100
So, the total bits required are: (45000 × 1) + (13000 × 3) + (12000 × 3) +
(16000 × 3) + (9000 × 4) + (5000 × 4) = 2.24 × 10^5 bits,
a saving of approximately 25 per cent, and this is an optimal character
coding for this file.
While using variable length code, we use prefix codes. Prefix codes are
those codes in which no codeword is a prefix of some other codeword. Prefix
codes are used because they simplify decoding.
Example 8.4: (a) Is {101, 0011, 011, 1011} a prefix code?
(b) Is {0, 101, 1100, 1101, 100} a prefix code?
Solution: (a) {101, 0011, 011, 1011} is not a prefix code because here the codeword
101 is a prefix of the codeword 1011.
(b) {0, 101, 1100, 1101, 100} is a prefix code because here no codeword is a
prefix of any other codeword.
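The prefix-code property is easy to check mechanically. The following small C helper (not from the text, just an illustration) tests every ordered pair of codewords and reports whether any codeword is a prefix of another:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if no codeword is a prefix of another codeword,
 * i.e., the set is a prefix code; returns 0 otherwise.
 * O(k^2 * L) for k codewords of length at most L, which is
 * fine for hand-sized examples like Example 8.4. */
int is_prefix_code(const char *codes[], int k) {
    for (int i = 0; i < k; i++)
        for (int j = 0; j < k; j++)
            if (i != j &&
                strncmp(codes[i], codes[j], strlen(codes[i])) == 0)
                return 0;   /* codes[i] is a prefix of codes[j] */
    return 1;
}
```

Applied to Example 8.4, set (a) is rejected (101 is a prefix of 1011) while set (b) is accepted.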
HUFFMAN (C)
1. n ← | C |
2. Q ← C
3. for i ← 1 to n – 1
4.    do Allocate a new node z
5.       left[z] ← x ← EXTRACT-MIN(Q)
6.       right[z] ← y ← EXTRACT-MIN(Q)
7.       f[z] ← f[x] + f[y]
8.       INSERT(Q, z)
9. return EXTRACT-MIN(Q)
In this algorithm, we have a set C which contains the characters. The characters
are maintained in a priority queue Q according to the increasing order of their
frequencies. INSERT(Q, z) inserts a node z into the priority queue.
EXTRACT-MIN(Q) removes and returns the element having the minimum
key from the priority queue.
Analysis: The priority queue in Step 2 can be initialized in O(n) time by
building a min-heap. Steps 3 to 8 are executed exactly n – 1
times, and since each heap operation takes O(log n) time, Steps 3 to 8 contribute
O(n log n) time.
Thus, running Huffman's algorithm on a set C of n characters takes
O(n log n) time.
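The merging loop of HUFFMAN can be illustrated compactly in C. The sketch below replaces the priority queue with a linear scan for the two smallest frequencies (so it runs in O(n^2) rather than the O(n log n) of the heap-based version), and it returns the cost of the optimal code, i.e., the sum over all characters of frequency times codeword length:

```c
#include <assert.h>

/* Huffman merging with a linear scan instead of a heap. Each merge
 * of two subtrees with combined frequency z pushes every leaf in
 * them one level deeper, so adding z at each merge accumulates the
 * total weighted path length, which equals the cost of the code. */
long huffman_cost(int n, const long freq_in[]) {
    long f[64], cost = 0;
    for (int i = 0; i < n; i++) f[i] = freq_in[i];
    for (int step = 0; step < n - 1; step++) {
        int a = -1, b = -1;            /* indices of the two smallest */
        for (int i = 0; i < n; i++) {
            if (f[i] < 0) continue;    /* slot already merged away */
            if (a < 0 || f[i] < f[a]) { b = a; a = i; }
            else if (b < 0 || f[i] < f[b]) b = i;
        }
        long z = f[a] + f[b];          /* frequency of the new node */
        cost += z;                     /* every merge adds one level */
        f[a] = z;                      /* reuse slot a for node z */
        f[b] = -1;                     /* mark slot b as consumed */
    }
    return cost;
}
```

For the frequencies of Example 8.5 (45, 13, 12, 16, 9, 5) it returns 224, matching the 2.24 × 10^5 bits per 100,000 characters computed earlier.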
Example 8.5: What is an optimal Huffman code for the following set of
frequencies? u : 45, v : 13, w : 12, x : 16, y : 9, z : 5
Solution: First, arrange the characters in the increasing order of their frequencies.
Repeatedly merge the two nodes with the smallest frequencies: z : 5 and y : 9 give
a node of frequency 14; w : 12 and v : 13 give 25; 14 and x : 16 give 30; 25 and
30 give 55; and finally u : 45 and 55 give the root of frequency 100. Labelling
each left branch 0 and each right branch 1 gives the final tree.
To find the codeword for each character, we start from the root and reach the leaf
which contains the character. So, the codeword for each character is given as:
u 0
v 101
w 100
x 111
y 1101
z 1100
8.4 SUMMARY
Fixed Length Code: Using this code each character in the file is represented
by equal number of bits.
Variable Length Code: It is a code that does much better than fixed
length coding, i.e., the saving percentage increases using this technique.
8.6 SELF ASSESSMENT QUESTIONS AND
EXERCISES
UNIT 9 APPLICATIONS
Structure
9.0 Introduction
9.1 Objectives
9.2 Minimal Spanning Tree
9.2.1 Kruskal’s Algorithm
9.2.2 Prim’s Algorithm
9.3 Dijkstra’s Algorithm
9.4 Answers to Check Your Progress Questions
9.5 Summary
9.6 Key Words
9.7 Self Assessment Questions and Exercises
9.8 Further Readings
9.0 INTRODUCTION
9.1 OBJECTIVES
A spanning tree of a connected graph G is a tree that covers all the vertices and the
edges required to connect those vertices in the graph. Formally, a tree T is called a
spanning tree of a connected graph G if the following two conditions hold.
1. T contains all the vertices of G, and
Algorithm 9.1: Kruskal's Algorithm
Greedy_kruskal(E)
//E is the set of edges in graph G containing n nodes. MST
//contains the edges in the minimum spanning tree
1. Set i=0
2. while(i < n-1)
3. {
4.    Find and remove minimum cost edge(x,y) from the set of edges
5.    Set e={x,y}
6.    Set root_x=find(x)   //find the root node of tree containing x
      Set root_y=find(y)   //find the root node of tree containing y
7.    If(root_x ≠ root_y)
8.    {
9.       Merge the trees containing x and y
10.      Set MST=union(MST,e)   //add minimum edge to the tree
11.      Set i=i+1
12.   }
13. }
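The algorithm above can be sketched in C with a simple union-find. For brevity the edges are assumed to be pre-sorted by cost and the find() has no path compression or union by rank; a real implementation would sort the edges first, which dominates the O(E log E) running time:

```c
#include <assert.h>

/* A sketch of Kruskal's algorithm over pre-sorted edges.
 * parent[] implements a naive union-find; the function fills
 * mst[] with the accepted edges and returns the total cost. */
typedef struct { int x, y, cost; } Edge;

static int parent[64];

static int find_root(int v) {          /* find(): root of v's tree */
    while (parent[v] != v) v = parent[v];
    return v;
}

int kruskal(int n, const Edge sorted_edges[], int m, Edge mst[]) {
    int taken = 0, total = 0;
    for (int v = 0; v < n; v++) parent[v] = v;
    for (int e = 0; e < m && taken < n - 1; e++) {
        int rx = find_root(sorted_edges[e].x);
        int ry = find_root(sorted_edges[e].y);
        if (rx != ry) {                /* edge joins two distinct trees */
            parent[ry] = rx;           /* merge the trees */
            mst[taken++] = sorted_edges[e];
            total += sorted_edges[e].cost;
        }
    }
    return total;                      /* cost of the spanning tree */
}
```

On a small assumed graph with edges (0,1,1), (1,2,2), (0,2,3) and (2,3,4), the edge (0,2) is rejected because both endpoints already lie in the same tree, and the spanning tree costs 7.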
the other smallest entry in the rows of V1, Vk and Vm is found and the corresponding
edge is inserted in the sub tree. This process is continued until all the n vertices get
connected by n-1 edges. Figure 9.5 shows the adjacency weight matrix of the
graph shown in Figure 9.3, and Figure 9.6 illustrates the steps for constructing
the minimal spanning tree for this graph using the greedy strategy.
Fig. 9.6 Constructing the Minimum-Cost Tree Using Prim's Algorithm
Algorithm 9.2: Prim’s Algorithm Applications
Greedy_Prims(Edge,r,cost,MST)
//Edge is the set of edges in Graph G having r vertices and
//cost C. MST[1:r-1,1:2] is the array to hold set minimum
//cost edges in the spanning tree NOTES
1. Set Minimum_cost=cost[m,n] //[m,n] is the edge having
//minimum cost in Edge
2. Set MST[1,1]=m
3. Set MST[1,2]=n
4. Set j=1
5. while(j = r)
6. {
7. If(cost[j,n]<cost[j,m]) //determine the adjacent
//vertex
8. Set near_vertex[j]=n
9. Else
10. Set near_vertex[j]=m
11. Set j=j+1
12. }
13. Set near_vertex[m]=0
14. Set near_vertex[n]=0
15. Set j=2
16. while(j = r-1) //build the remaining spanning
tree
17. {
//Let i is any index such that near_vertex[i]?0 and
//cost[i,near_vertex[i]] is minimum
18. Set MST[j,1]=i
19. Set MST[j,2]=near_vertex[i] //determine the next edge
//to be included in the
//spanning tree
20. Set Minimum_cost= Minimum_cost+cost[i,near_vertex[i]]
21. Set near_vertex[i]=0
22. Set m=1
23. while(m = r) //update next_vertex
24. {
25. If((near_vertex[m]?0) AND (cost[m,near_
vertex[m])>cost[m,i]))
26. Set near_vertex[m]=i
27. Set m=m+1
28. }
29. Set j=j+1
30. }
31. return Minimum_cost
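The near_vertex idea can be sketched in C over an adjacency weight matrix. The version below remembers, for every vertex outside the tree, the cost of its cheapest edge into the tree (INF marks a missing edge); it runs in O(r^2) for r vertices. The graph in the test is an assumed small example, not the one of Figure 9.3:

```c
#include <assert.h>
#define INF 1000000

/* A sketch of Prim's algorithm over an adjacency weight matrix
 * (at most 8 vertices; INF = no edge). near_cost[v] holds the
 * cheapest known edge joining v to the growing tree, mirroring
 * the near_vertex[] array of Algorithm 9.2. */
int prim(int r, int cost[][8], int start) {
    int in_tree[8] = {0}, near_cost[8], total = 0;
    for (int v = 0; v < r; v++) near_cost[v] = cost[start][v];
    in_tree[start] = 1;
    for (int step = 0; step < r - 1; step++) {
        int best = -1;
        for (int v = 0; v < r; v++)        /* cheapest outside vertex */
            if (!in_tree[v] && (best < 0 || near_cost[v] < near_cost[best]))
                best = v;
        in_tree[best] = 1;
        total += near_cost[best];
        for (int v = 0; v < r; v++)        /* update near_cost */
            if (!in_tree[v] && cost[best][v] < near_cost[v])
                near_cost[v] = cost[best][v];
    }
    return total;                          /* Minimum_cost */
}
```

On the same assumed 4-vertex graph used for Kruskal's algorithm, it produces the same spanning tree cost of 7, as expected.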
9.3 DIJKSTRA’S ALGORITHM
The different shortest paths, assuming that V1 is the source vertex are given
below:
Table 9.1 Shortest Paths

From V1 to V2: V1 → V2 (length 2)
From V1 to V3: V1 → V2 → V3 (length 5)
From V1 to V4: V1 → V2 → V4 (length 6)
From V1 to V5: V1 → V2 → V4 → V5 (length 9)
Example 9.1: Find the shortest path for the given graph using Dijkstra’s algorithm
assuming that the source vertex is 1.
Solution: The adjacency matrix for the above graph, G is shown below:
To compute the shortest path for all the vertices, follow the steps given below:
Step 1:
Consider all the outward edges from the source vertex 1 (see Figure
9.10(a)). Initially, the source vertex has weight 0 and the vertices connected
directly to it have the weight specified on their edges (see Figure 9.10(b)).
This weight is the distance d, of the vertex from the source vertex. A vertex that
is not directly connected to the source vertex has weight equal to ∞. At this step, S
= {1}, where S contains the list of vertices that have already been visited.
Step 2:
Next, select the vertex having the least weight among all the unvisited vertices,
i.e., vertex 2, and consider the outward edges from vertex 2 (see Figure 10.10(c)
and Figure 9.10(c)). Then, set the distance of vertex 4 to 8 + 1 = 9, obtained by
adding the distance of vertex 2 to the weight specified on the edge from 2 to 4
(see Figure 9.10(d)). Also, change the distance, d, of vertex 3 to 3, as the route
through vertex 2 is shorter than the one assigned previously (directly from
vertex 1 to vertex 3). Thus, S = {1, 2}.
Step 3:
Now, the next vertex with least weight after considering vertex 2 is 3 (see Figure
9.10(e)). Connect all the outgoing edges from 3 and adjust the distance of the
vertices accordingly. If any vertex can have the shorter distance by using the edge
from vertex 3, then its assigned distance, d, will be replaced with new distance as
done for vertex 6 (see Figure 9.10(f)). Thus, S = {1, 2, 3}.
Step 4:
Thereafter, the next vertex with the least weight after considering vertex 3 is 5 (see Figure
9.10(g)). Adjust the distance of the vertices according to vertex 5 (see Figure 9.10(h)).
If any vertex can have the shorter distance by using the edge from vertex 5, then its
assigned distance, d, will be replaced with the new distance as done for vertex 4.
Thus, S = {1, 2, 3, 5}.
Step 5:
Consider vertex 4 now as it has the least weight after vertex 5 (see Figure 9.10(i))
and change the value for distance, d, of other vertices only if the new value is less
than the previously assigned value as done for vertex 6 (see Figure 9.10(j)). Thus,
S = {1, 2, 3, 5, 4}.
Step 6:
Finally, select vertex 6. Figure 9.10(l) shows the shortest distance from vertex 1 (source
vertex) to every other vertex V. Thus, S = {1, 2, 3, 5, 4, 6}.
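The step-by-step process above can be sketched in C over an adjacency weight matrix. The graph used in the test is an assumed example (the matrix of Example 9.1 is not reproduced here); it is constructed so that the resulting distances match those of Table 9.1, with vertices V1..V5 numbered 0..4:

```c
#include <assert.h>
#define INF 1000000

/* A sketch of Dijkstra's algorithm (at most 8 vertices; INF = no
 * edge). d[v] ends up holding the shortest distance from src to v;
 * done[] plays the role of the visited set S in the text. */
void dijkstra(int n, int cost[][8], int src, int d[]) {
    int done[8] = {0};
    for (int v = 0; v < n; v++) d[v] = INF;
    d[src] = 0;
    for (int step = 0; step < n; step++) {
        int u = -1;
        for (int v = 0; v < n; v++)        /* pick least-weight vertex */
            if (!done[v] && (u < 0 || d[v] < d[u]))
                u = v;
        if (u < 0 || d[u] == INF) break;   /* rest is unreachable */
        done[u] = 1;
        for (int v = 0; v < n; v++)        /* relax outgoing edges */
            if (!done[v] && cost[u][v] != INF && d[u] + cost[u][v] < d[v])
                d[v] = d[u] + cost[u][v];
    }
}
```

With the assumed edges V1→V2 (2), V1→V3 (6), V2→V3 (3), V2→V4 (4), V3→V5 (7) and V4→V5 (3), the computed distances are 0, 2, 5, 6 and 9, agreeing with Table 9.1.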
[Fig. 9.10, panels (a)-(l): the graph after each of the steps described above]
9.4 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. Kruskal’s algorithm requires listing the edges in the increasing order of their
weights.
2. A spanning tree of a connected graph G is a tree that covers all the vertices
and the edges required to connect those vertices in the graph.
9.5 SUMMARY
A spanning tree of a connected graph G is a tree that covers all the vertices
and the edges required to connect those vertices in the graph.
In Kruskal’s approach, initially all the n vertices of the graph are considered
as distinct partial trees having one vertex each, and all the edges are listed in
the increasing order of their weights.
Kruskal’s algorithm requires listing the edges in the increasing order of their weights.
9.8 FURTHER READINGS
BLOCK - IV
UNIT 10 SORTING AND SEARCHING
ALGORITHMS
Structure
10.0 Introduction
10.1 Objectives
10.2 Decrease and Conquer
10.3 Insertion Sort
10.4 DFS and BFS
10.4.1 Depth-First Search
10.4.2 Breadth-First Search
10.5 Topological Sorting
10.5.1 Topological Sorting
10.6 Answers to Check Your Progress Questions
10.7 Summary
10.8 Key Words
10.9 Self Assessment Questions and Exercises
10.10 Further Readings
10.0 INTRODUCTION
Algorithm analysis should begin with a clear statement of the task to be performed.
This allows us both to check that the algorithm is correct and to ensure that the
algorithms we are comparing perform the same task. A sorting algorithm in
computer science is an algorithm that puts elements of a list in a certain order. A
search algorithm on the other hand is a step-by-step procedure used to locate
specific data among a certain collection of data. This is also considered a fundamental
procedure in computing. In computer science the difference between a fast
application and a slower one often lies in the use of the proper search algorithm.
This unit will explain sorting and searching algorithms in detail.
10.1 OBJECTIVES
10.2 DECREASE AND CONQUER

Decrease and conquer is a problem-solving approach especially used to perform
searching and sorting operations. Like divide and conquer, it reduces the main
problem to a smaller sub-problem at each step of its execution. However, decrease
and conquer is not the same as divide and conquer.
The principal approach of the decrease and conquer strategy involves
three major activities:
1. Decrease: Reduce the main problem domain into smaller sub-problem
instances.
2. Conquer: Resolve these smaller sub-problems to obtain desired result.
3. Extend: Extend the result obtained in step 2 to arrive at final problem
result.
The complexity of the problem can be reduced by three different variations
of decrease and conquer approach of problem solving:
(a) Decrease by a Constant amount
(b) Decrease by a Constant factor or
(c) Decrease by a Variable factor
Decrease by a Constant Amount

In this variation, the problem domain is reduced by the same constant amount
at every individual step until the problem arrives at the desired result. In other
words, at every iteration of execution the main problem is reduced by some
constant amount. In most cases this constant amount is the integer value 1.
Examples of algorithms where decrease by a constant amount is used to solve
the problem are:
Insertion sort
Depth first search
Breadth first Search
Topological sorting
Problems to generate permutations, subsets, etc.
Decrease by a Constant Factor

In this variation, the problem domain is reduced by the same constant factor
on every iteration or step of the program logic until a result is arrived at. In
most cases this constant factor is 2, and reduction by a constant factor other
than two is a very rare situation in an algorithm. Examples of algorithms where
decrease by a constant factor is used to solve the problem are:
Binary Search
Fake Coin Problem
Russian Peasant Multiplication
Josephus problem, etc.
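Binary search is the standard illustration of decrease by a constant factor: each comparison discards half of the remaining range, so the instance size shrinks by factor 2 per step, giving O(log n) time. A minimal C sketch:

```c
#include <assert.h>

/* Binary search over a sorted array: each iteration halves the
 * live range [lo, hi], a decrease-by-a-constant-factor step.
 * Returns the index of key, or -1 if it is not present. */
int binary_search(const int a[], int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* avoids integer overflow */
        if (a[mid] == key) return mid;  /* found: return index */
        if (a[mid] < key) lo = mid + 1; /* keep the right half */
        else hi = mid - 1;              /* keep the left half */
    }
    return -1;                          /* not present */
}
```

For example, searching the sorted array {10, 14, 19, 27, 33, 35} for 27 returns index 3.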
Decrease by Variable Size

In this variation, the main problem instance is reduced by a variable-size
reduction factor at each individual step or iteration of the algorithm. In other
words, the reduction factor varies from one iteration to another. Examples of
algorithms where decrease by variable size is used to solve the problem are:
Euclid’s algorithm for GCD
Partition-based algorithm for selection problem
Interpolation search
Search and insertion in binary search trees, etc.
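Euclid's algorithm illustrates decrease by variable size: the instance (m, n) is reduced to (n, m mod n), and how much the instance shrinks depends on the values themselves rather than on a fixed constant or factor. A minimal C sketch:

```c
#include <assert.h>

/* Euclid's algorithm for the greatest common divisor. The step
 * (m, n) -> (n, m mod n) is a variable-size reduction: the amount
 * removed varies from one iteration to the next. */
int gcd(int m, int n) {
    while (n != 0) {
        int r = m % n;   /* variable-size reduction step */
        m = n;
        n = r;
    }
    return m;
}
```

For example, gcd(1071, 462) reduces through (462, 147) and (147, 21) to the answer 21.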
Working of Insertion Sort

In order to understand the practical aspect of insertion sort, let us consider the
following example. Let ‘Array’ be an unsorted array with the following elements:
Array[6]=[14,33,27,10,35,19]
The insertion sort in the above array begins by comparing the first two
elements of the array, that is, 14 and 33. On comparing 14 and 33, it is found
that the elements are already in sorted (ascending) order; therefore, no
swapping is done and 14 becomes the first element of the left (sorted) sub-array.
The next step is to compare 33 with 27. On comparing, it is found that
27 is smaller than 33. Therefore, the element 27 needs to be inserted at the position
before 33 by performing swapping. Before the element 27 is placed in the
sorted sub-array, it is compared with all the elements within the left sorted
sub-array so that the swapped element gets its exact place in the sorted sub-array.
The array will now look like:
Array[6]=[14,27,33,10,35,19]
Now the comparison will again begin from element 14 and proceed across 27,
then 33 and so on. As 14 is greater than 10, the elements need to be swapped and
10 will be placed at the first location in the sorted sub-array. The whole process of
comparing, shifting and inserting an element continues until a fully sorted list is
obtained. After iterating across all elements, the final sorted array will look like:
Array[6]=[10,14,19,27,33,35]
Algorithm: Insertion Sort

Step 1 - If it is the first element, it is already sorted; return 1
Step 2 - Pick the next element
Step 3 - Compare with all elements in the sorted sub-list
Step 4 - Shift all the elements in the sorted sub-list that are greater than the value to be sorted
Step 5 - Insert the value
Step 6 - Repeat until the list is sorted
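The steps above can be sketched directly in C:

```c
#include <assert.h>

/* Insertion sort, following the steps above: element i is compared
 * against the sorted sub-list a[0..i-1], larger elements are
 * shifted one place right, and the element is inserted. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];                 /* pick the next element */
        int j = i - 1;
        while (j >= 0 && a[j] > key) {  /* shift larger elements */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;                 /* insert the value */
    }
}
```

Applied to the array of the worked example, {14, 33, 27, 10, 35, 19}, it produces {10, 14, 19, 27, 33, 35}.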
10.4 DFS AND BFS

Traversing a Graph
One of the most common operations that can be performed on graphs is traversing,
i.e., visiting all vertices that are reachable from a given vertex. The most commonly
used methods for traversing a graph are depth-first search and breadth-first
search.
10.4.1 Depth-First Search
In depth-first search, starting from any vertex, a single path P of the graph is
traversed until a vertex is found whose all-adjacent vertices have already been
visited. The search then backtracks on path P until a vertex with unvisited adjacent
vertices is found and then begins traversing a new path P’ starting from that vertex,
and so on. This process continues until all the vertices of graph are visited. It must
be noted that there is always a possibility of traversing a vertex more than one
time. Thus, it is required to keep track whether the vertex is already visited.
For example, the depth-first search for a graph, shown in Figure 10.1(a),
results in a sequence of vertices 1, 2, 4, 3, 5, 6, which is obtained as follows:
[Fig. 10.1 (a) Graph]
Vertex 1: 2 3 4 6 NULL
Vertex 2: 1 4 NULL
Vertex 3: 1 4 5 6 NULL
Vertex 4: 1 2 3 NULL
Vertex 5: 3 6 NULL
Vertex 6: 1 3 5 NULL
{    /* adjacency list is initialized with NULL */
    int i;
    for(i=1; i<=num; i++)
        arr_ptr[i]=NULL;
}
void input(Node *arr_ptr[], int num)
{
    Node *nptr,*save;
    int i,j,num_vertex,item;
    for(i=1; i<=num; i++)
    {
        printf("Enter the no. of vertices in adjacency list a[%d] : ",i);
        scanf("%d", &num_vertex);
        for(j=1; j<=num_vertex; j++)
        {
            printf("Enter the value of vertex : ");
            scanf("%d", &item);
            nptr=(Node*)malloc(sizeof(Node));
            nptr->info=item;
            nptr->next=NULL;
            if(arr_ptr[i]==NULL)
                arr_ptr[i]=save=nptr;   /* 'save' tracks the last node */
            else
            {
                save->next=nptr;
                save=nptr;
            }
        }
    }
}
void display(Node *arr_ptr[], int num)
{
    int i;
    Node *ptr;
    printf("\n\nGraph is:\n");
    for(i=1;i<=num;i++)
    {
        ptr=arr_ptr[i];
        printf("\na[%d] ",i);
        while(ptr != NULL)
        {
            printf(" -> %d", ptr->info);
            ptr=ptr->next;
        }
    }
}
void depth_first_search(int v, Node *arr_ptr[])
{
    Node *ptr;
    visited[v]=True;      /* mark this vertex as visited */
    printf("%d\t", v);
    ptr=*(arr_ptr+v);     /* assign address of adjacency list to ptr */
    while(ptr!=NULL)
    {
        if(visited[ptr->info]==False)
            depth_first_search(ptr->info, arr_ptr);
        else
            ptr=ptr->next;
    }
}
Enter the value of vertex : 6
Enter the no. of vertices in adjacency list a[2] : 2
Enter the value of vertex : 1
Enter the value of vertex : 4
Enter the no. of vertices in adjacency list a[3] : 4
Enter the value of vertex : 1
Enter the value of vertex : 4
Enter the value of vertex : 5
Enter the value of vertex : 6
Enter the no. of vertices in adjacency list a[4] : 3
Enter the value of vertex : 1
Enter the value of vertex : 2
Enter the value of vertex : 3
Enter the no. of vertices in adjacency list a[5] : 2
Enter the value of vertex : 3
Enter the value of vertex : 6
Enter the no. of vertices in adjacency list a[6] : 3
Enter the value of vertex : 1
Enter the value of vertex : 3
Enter the value of vertex : 5
Values are inputted in the graph
Graph is:
a[1] -> 2 -> 3 -> 4 -> 6
a[2] -> 1 -> 4
a[3] -> 1 -> 4 -> 5 -> 6
a[4] -> 1 -> 2 -> 3
a[5] -> 3 -> 6
a[6] -> 1 -> 3 -> 5
1. Set v = 1
2. Set visited[v] = True   //mark first vertex as visited
3. Print v
4. Call qinsert(v)   //insert this vertex in queue
5. While isqempty() = False   //check if there is an element in queue
      Set v = qdelete()   //remove the next vertex v from the queue
      Set ptr = *(arr_ptr+v)   //assign address of adjacency list to ptr;
                               //ptr is a pointer of type Node
      While ptr != NULL
         If visited[ptr->info] = False
            Call qinsert(ptr->info)
            Set visited[ptr->info] = True
            Print ptr->info
         End If
         Set ptr = ptr->next
      End While
   End While
6. End
#define False 0
typedef struct node
{
int info;
struct node *next;
}Node;
int visited[MAX];
int queue[MAX];
int Front, Rear;
void create_graph(Node *[], int num);
void input(Node *[], int num);
void breadth_first_search(Node *[]);
void qinsert(int);
int qdelete();
int isqempty();
void display(Node *[], int num);
void main()
{
Node *arr_ptr[MAX];
int nvertex;
clrscr();
    printf("\nEnter the number of vertices in Graph: ");
    scanf("%d", &nvertex);
create_graph(arr_ptr, nvertex);
input(arr_ptr, nvertex);
    printf("\nValues are inputted in the graph");
display(arr_ptr, nvertex);
Front=Rear=-1;
breadth_first_search(arr_ptr);
getch();
}
void create_graph(Node *arr_ptr[], int num)
{
int i;
for(i=1; i<=num; i++)
arr_ptr[i]=NULL;
}
        printf("Underflow! Queue is empty");
        exit(1);
    }
item=queue[Front];
if (Front==Rear)
Front=Rear=-1;
else
Front++;
return item;
}
int isqempty()
{
if (Front==-1)
return True;
return False;
}
The output of the program is as follows:
Enter the number of vertices in Graph: 6
Enter the no. of vertices in adjacency list a[1] : 4
Enter the value of vertex : 2
Enter the value of vertex : 3
Enter the value of vertex : 4
Enter the value of vertex : 6
Enter the no. of vertices in adjacency list a[2] : 2
Enter the value of vertex : 1
Enter the value of vertex : 4
Enter the no. of vertices in adjacency list a[3] : 4
Enter the value of vertex : 1
Enter the value of vertex : 4
Enter the value of vertex : 5
Enter the value of vertex : 6
Enter the no. of vertices in adjacency list a[4] : 3
Enter the value of vertex : 1
Enter the value of vertex : 2
Enter the value of vertex : 3
Enter the no. of vertices in adjacency list a[5] : 2
Enter the value of vertex : 3
Enter the value of vertex : 6
Enter the no. of vertices in adjacency list a[6] : 3
Enter the value of vertex : 1
Enter the value of vertex : 3
Enter the value of vertex : 5
Values are inputted in the graph
Graph is:
a[1] -> 2 -> 3 -> 4 -> 6
a[2] -> 1 -> 4
a[3] -> 1 -> 4 -> 5 -> 6
a[4] -> 1 -> 2 -> 3
a[5] -> 3 -> 6
a[6] -> 1 -> 3 -> 5
Breadth First Search: 1 2 3 4 6 5
Applications of Graphs
Graphs have applications in diverse areas. Various real-life situations, like traffic flow, analysis of electrical circuits, finding shortest routes and applications related to computation, can easily be modelled by using graphs. Some of the applications of graphs, such as topological sorting and minimum spanning trees, are discussed in the following section.
Clearly, if a directed graph contains a cycle, a topological ordering of vertices
is not possible. This is because for any two vertices Vi and Vj in the cycle, Vi precedes Vj as well as Vj precedes Vi. To exemplify this, let us study the simple cyclic directed graph shown in Figure 10.3. The topological sort for this graph is (1, 2, 3, 4), assuming vertex 1 as the starting vertex. Since there exists a path from vertex 4 to 1, according to the definition of a topological sort, vertex 4 must appear before vertex 1, which contradicts the topological sort generated for this graph. Hence, a topological sort can exist only for an acyclic graph.
Fig. 10.4 Acyclic Directed Graph
The steps for finding the topological sort for this graph are shown in Figure 10.5 as follows:
Figure 10.5 depicts the successive removal of a vertex with 0 indegree: (a) vertex 1, (b) vertex 3, (c) vertex 2, (d) vertex 4, (e) vertex 5 and (f) vertex 7, which yields the topological sort (1, 3, 2, 4, 5, 7, 6).
Another possible topological sort for this graph is (1, 3, 4, 2, 5, 7, 6).
Hence, it can be concluded that the topological sort for an acyclic graph is
not unique. Topological ordering can be represented graphically. In this
representation, edges are also included to justify the ordering of vertices, as shown in Figure 10.6.
1 3 2 4 5 7 6
(a)
1 3 4 2 5 7 6
(b)
Fig. 10.6 Graphical Representation of Topological Sort
10.6 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS
1. Insertion sort is one of the most widely used sorting approaches, in which an array is sorted one element at a time.
2. Decrease and conquer is a problem-solving approach especially used to perform searching and sorting operations.
3. To implement depth-first search, an array of pointers arr_ptr is maintained.
4. A few applications of graphs include traffic flow, analysis of electrical circuits, finding shortest routes and applications related to computation.
10.7 SUMMARY
UNIT 11 GENERATING COMBINATORIAL OBJECTS
Structure
11.0 Introduction
11.1 Objectives
11.2 Generating Combinatorial Objects
11.3 Transform and Conquer
11.3.1 Presorting
11.3.2 Heap
11.4 Answers to Check Your Progress Questions
11.5 Summary
11.6 Key Words
11.7 Self Assessment Questions and Exercises
11.8 Further Readings
11.0 INTRODUCTION
11.1 OBJECTIVES
Combinatorial objects are characterized as objects that can be put into one-to-one correspondence with a finite set of integers.

Combinatorial analysis is the part of mathematics which instructs one to find out and show all the possible patterns by which a given combination of a number of things might be related and combined, so that one may be sure that no collection or arrangement of the conceivable things has been missed or left uncounted. Generating combinatorial objects results in the generation of: all subsets of a given set, all possible permutations of the numbers in a set, and the partitions of an integer n into k parts.
In order to understand this concept, let us consider the following example, which generates subsets from a given collection of numbers that constitutes a set.
Suppose the set contains the elements a0, a1, …, an-1. A subset is then expressed as a binary string whose length equals the size of the set, say n. Let us assume a set S = {1, 2, 3}, where n = 3. Each bit is assigned 0 if the corresponding element of the set is dropped from the subset, and 1 if it is selected. Therefore, all combinations of bits, starting from the null set (0,0,0) up to (1,1,1) (the full set {1,2,3}), that is, the numbers from 0 to 2^n − 1, are generated.
The procedure continues up to the generation of all possible subsets.
Example
11.3.2 Heap
Definition of Heap: A (binary) heap is an array of objects that can be viewed as a nearly complete binary tree. Equivalently, a heap is a binary tree represented as an array of objects, with the conditions that the tree is essentially complete and the key at each node is greater than or equal to the keys at its children nodes.
A heap is a specialized tree-based data structure which is essentially an almost
complete tree that satisfies the heap property such that in a max heap, for any given
node C, if P is a parent node of C, then the key (the value) of P is greater than or
equal to the key of C. In a min heap, the key of P is less than or equal to the key of
C. The node at the ‘Top’ of the heap (with no parents) is called the ‘Root’ node.
Heap is, therefore, a binary tree with nodes of the tree which are assigned
with some keys and must satisfy the following criteria:
Tree’s Structure or Shape: Binary tree is essentially complete that means
the tree is completely filled on all levels with a possible exception where the lowest
level is filled from left to right, and some rightmost leaves may be missing.
Heap Order or Parental Dominance Requirement: For every node 'i' in the binary tree, the value stored at 'i' is greater than or equal to the values stored at its children nodes.
In other words, a heap can be defined as a complete binary tree with the property that the value at each node is at least as large as (or, in a min heap, as small as) the values at its children (if they exist). This property is also called the heap property.
Initialize the essentially complete binary tree with n nodes by placing keys in
the order given and then heapify the tree.
Heapify
Compare 7 with its children.
Compare 2 with its children.
Properties of Heaps
1. The height of a heap with n nodes is floor(lg n).
2. The root node of a heap holds the highest-priority item.
3. A node together with all of its descendants is itself a heap.
4. An array can be used to implement a heap, and all operations applicable to arrays can be performed on a heap.
5. If a node is indexed as i (with the root indexed as 1), then its left child has index 2i and its right child has index 2i+1.
6. At any level i of a heap, there are at most 2^i elements.
7. In a (max) heap, the value of a parent node is always at least as high as the values of its children nodes.
Heapsort
Stage 1: Construct a heap for a given list of n keys
Stage 2: Repeat operation of root removal n-1 times:
Exchange keys in the root and in the last (rightmost) leaf
Decrease heap size by 1
If necessary, swap new root with larger child until the heap condition holds
Analysis of Heapsort
HEAPSORT(A)
1 BUILD-HEAP(A)
2 for i ← length[A] downto 2
3     do exchange A[1] ↔ A[i]
4        heap-size[A] ← heap-size[A] − 1
5        HEAPIFY(A, 1)
Check Your Progress
1. What are combinatorial objects characterized as?
2. What is presorting defined as?
11.5 SUMMARY
11.6 KEY WORDS
UNIT 12 OPTIMIZATION PROBLEMS
12.0 INTRODUCTION
12.1 OBJECTIVES
12.2 REDUCTIONS
Example 12.2:
Case (a): Performing multiplication of two matrices M1 and M2 and returning the result of the multiplication.
Case (b): Performing a squaring operation on a matrix M; the result is the squared matrix M × M.
The algorithm that is used to perform multiplication of two matrices can also be used to perform the squaring of a matrix: the matrix to be squared is supplied as both operands, and the same algorithm as was used for matrix multiplication is then applied.
The travelling salesperson problem describes a graphical solution for a salesperson to perform effective sales by travelling through all the connected cities without repeating any city, and finally returning to the original city from where the sales travel started. The path the salesperson selects must be a minimum cost route. Let us reduce this situation by constructing a directed graph G(V,E) which defines a particular instance of the travelling salesperson problem domain. Let cij be the cost incurred by the salesperson to travel from node 'i' to node 'j'; in other words, cij is the cost of edge (i,j). Cities are represented by vertices of the graph, where V is the set of all vertices in G(V,E) and V = {v1, v2, …, vn}. Each edge in G(V,E) is assigned a weight that represents the cost of travel associated with that edge (i,j). Let us assume that the initial vertex from where the salesperson starts the travel is labelled v1; therefore, the solution space S of the problem is expressed as S = {1, X, 1}, where 'X' ranges over all possible permutations of the intermediate nodes {2, 3, …, n}.
The travelling salesman problem is solved by finding the minimum cost path starting from some node v1 and returning to the same node without repeating any intermediate node. When the travelling salesman problem is reduced using the branch and bound algorithm, the cost matrix describing the cost associated with the various paths that can be opted for during the travel is also reduced. A row of the cost matrix is reduced if and only if it contains at least one zero and the remaining entries are non-negative numbers. By extension, a matrix is said to be reduced if and only if its every row and column is reduced (that is, contains at least one zero).
12.4.1 Branching
The branching part of the algorithm works by dividing the solution space into two sub-graphs or groups. More precisely, each node divides the remaining solution space into two halves wherein, in one, certain nodes are included in the final solution and, in the other, they are excluded from the solution. Each node is associated with a lower bound, as represented in Figure 12.2 below.
12.4.2 Bounding
Bounding deals with how to compute the cost associated with each node. The cost at each node is obtained by performing the following operations on the cost matrix.

Subtract a constant from any row or column. Subtracting this constant does not impact the optimal solution of the desired path: the cost of the path changes but not the path itself.

Let M be the cost matrix of a graph G=(V,E). The cost associated with each individual node is calculated as follows:
o Let R be any node in the graph G and A(R) its associated reduced matrix.
o The cost of the child S of node R, for the chosen edge (i,j), is obtained as follows:
– Set row 'i' and column 'j' to infinity.
– Set A(i,j) to infinity.
– Reduce the resulting matrix and let RCL represent the total reduced cost.
– Cost(S) = Cost(R) + RCL + A(i,j)
– The reduced matrix MR of M is produced; let L be the value subtracted from M.
– L is the lower bound of the conceived path, and the cost of the path is reduced by the value L.
Let us consider the following example to understand the operational aspect of this reduction approach to problem solution.
Let M be the cost matrix, equal to
The state space tree that is the outcome of the matrix reduction is represented in Figure 12.3 below.
Reducing row 3 by 2, the cost matrix M will become
The matrix is reduced and RCL=0, and the cost of node 2 (from vertex 1 to 2) is
Cost(2) = Cost(1) + A(1,2) = 25 + 10 = 35
Similarly, if the salesman goes to vertex 3 (node 3), the cost matrix will be:
The RCL=11 and the cost of going through node 3 is calculated as:
Cost(3) = Cost(1) + RCL + A(1,3) = 25 + 11 + 17 = 53
If the salesman goes to vertex 4 (node 4), the cost matrix will be:
The rows and columns are already reduced; therefore, RCL=0 and the cost will be calculated as:
Cost(4) = Cost(1) + RCL + A(1,4) = 25 + 0 + 0 = 25.
In case the salesman goes to vertex 5 (node 5), the cost matrix will be:
The cost matrix needs to be reduced at row 2 by 2 and row 4 by 3; the columns are already reduced. The cost matrix will be:
Reduce row 2 by 2:
Reduce row 4 by 3:
12.6 SUMMARY
The travelling salesperson problem describes a graphical solution for a salesperson to perform effective sales by travelling through all the connected cities without repeating any city and finally returning to the original city from where the sales travel started.
When the travelling salesman problem is reduced using the branch and bound algorithm, the cost matrix describing the cost factor associated with the various paths that can be opted for during the travel is also reduced.
Cost Matrix: It describes the cost factor associated with the various paths that can be opted for during traversal.
Optimization Problem: It is a mathematical problem, or any computational domain, where the main purpose of the solution is to reveal the best possible solution among all possible outcomes of the problem.
12.9 FURTHER READINGS
BLOCK - V
BACKTRACKING AND GRAPH TRAVERSALS
UNIT 13 GENERAL METHOD
Structure
13.0 Introduction
13.1 Objectives
13.2 8-Queen’s Problem
13.3 Sum of Subsets
13.4 Graph Coloring
13.5 Hamiltonian Cycles
13.6 Branch and Bound
13.6.1 Branch and Bound Search Methods
13.7 Assignment Problem
13.8 0/1 Knapsack Problem
13.9 Traveling Salesman Problem
13.10 Answers to Check Your Progress Questions
13.11 Summary
13.12 Key Words
13.13 Self Assessment Questions and Exercises
13.14 Further Readings
13.0 INTRODUCTION
Backtracking is a general technique that incrementally builds a solution and, on reaching a dead end, moves backwards to an earlier decision point to try another choice. Depth-first search is an algorithm for traversing or searching a tree. Graph traversal, on the other hand, is also known as graph search and refers to the process of visiting each vertex in a graph. Such traversals are classified by the order in which the vertices are visited. This unit will explain these topics in detail.
13.1 OBJECTIVES
13.2 8-QUEEN’S PROBLEM
The name ‘backtrack’ was first given by D. H. Lehmer in the 1950s. Backtracking is a very useful technique for solving problems that require finding a set of
solutions or an optimal solution satisfying some constraints. In this technique, if
several choices are there for a given problem, then any choice is selected and we
proceed towards finding the solution. However, if at any stage, it is found that this
choice does not provide the required solution then from that point, we backtrack
to previous step and select another choice. The process is continued until the
desired solution to the given problem is obtained.
In most of the applications of backtrack method, an optimal solution is
expressed as an n-tuple (a1, a2, a3, ..., an), where ai belongs to some finite set
Si. Further, the solution is based on finding one or more vectors that maximizes or
minimizes or satisfies a criterion function P(a1, a2, a3, ..., an). The criterion function is also called the bounding function. Note that sometimes, it is required to find all
vectors that satisfy P.
Consider an example that depicts the idea behind this technique. Let f[1:n]
be an unsorted array containing n elements and we have to sort this array using the
backtracking technique. The sorted sequence can be represented as an n-tuple (a1, a2, a3, ..., an), where ai is the index of the ith smallest element in array f. The criterion function is given by f[ai] ≤ f[ai+1] for 1 ≤ i < n. The set Si is a finite set of integers in the range [1,n].
If ki is the size of the set Si, then there are k (where k = k1k2k3...kn) n-tuples that may be the possible solutions satisfying the criterion function P. The brute force approach can be used to generate all the k n-tuples and evaluate each one of them to determine the optimal solution. But this method is tedious and time-consuming. On the other hand, backtracking requires a smaller number of trials to determine the solution. The idea behind the backtracking technique is to generate the
solution vector by adding one component at a time and to use modified criterion
functions Pi (a1, a2, a3, …, ai). These functions are used to check whether
solution vector (a1, a2, a3, …, ai) leads to an optimal solution or not. If at any
stage, it is found that the partial vector (a1, a2, a3, …, ai) cannot result in optimal
solution, then the ki+1...kn possible test vectors need not be considered and hence can be ignored completely from the set of feasible solutions.
Note that most of the problems that can be solved by using backtracking
technique require all the solutions to satisfy a complex set of constraints. These
constraints are divided into two categories: explicit and implicit.
Explicit Constraints: These are the rules that allow each ai to take values
from the given set only. They depend upon the particular instance I of the
problem being solved. Some examples of explicit constraints are given below:
ai = 0 or 1, or Si = {0,1}
ai ≥ 0, or Si = {all nonnegative real numbers}
li ≤ ai ≤ ui, or Si = {b : li ≤ b ≤ ui}
Notice that the solution space for I is defined by all the tuples that satisfy
the explicit constraints.
Implicit Constraints: These are the rules that identify all the tuples in the
solution space of I which satisfy the bounding function.
Some Important Terminologies
Backtracking algorithm finds the problem solutions by searching the solution
space (represented as a set of n tuples) for the given problem instance in a
systematic manner. To help in searching, a tree organization is used for the solution
space, which is referred to as state space tree. The set of all the paths from the
root node of the tree to other nodes is referred to as the state space of the
problem. Each node in the state space tree defines a problem state. A problem
state for which the path from the root node to the node defining that problem state
defines a tuple in the solution space (that is, the tuple satisfies the explicit
constraints), is referred to as the solution state. A solution state (say, s) for
which the path from the root node to s defines a tuple that satisfies the implicit
constraints is referred to as an answer state.
After defining these terms, we can understand how a problem can be solved
using backtracking. Solving any problem using backtracking involves the following
four steps.
1. Construct the state space tree for the given problem.
2. Generate the problem states from the state space tree in a systematic manner.
3. Determine which problem states are the solution states.
4. Determine which solution states are the answer states.
The most important of these steps is the generation of problem states from
the state space tree. This can be performed using two methods. Both methods are
similar in the sense as they both start from the root node and proceed to generate
other nodes. While generating the nodes, nodes are referred to as live nodes,
E-nodes and dead nodes. A node which has been generated but its children have
not yet been generated is known as live node. A live node is referred to as an
E-node (expanded node) if its children are currently being generated. After all the
children of an E-node have been generated or if a generated node is not to be
further expanded, it is referred to as a dead node. Notice that in both methods,
we have a list of live nodes. Moreover, a live node can be killed at any stage
without further generating all its children with the help of a bounding function. The
bounding functions for any problem are chosen such that whatever method is
adopted, it always generates at least one answer node.
Both methods differ in the path they follow while generating the problem
states. In one method, the state space tree is traversed in depth-first manner for
generating the problem states. When a new child (say, C) of the current E-node
(say, R) is generated, C becomes the new E-node. The node R becomes the
E-node again when the subtree rooted at C has been fully explored. In contrast, in the second
method, an E-node remains an E-node until all its children have been generated,
that is, it becomes a dead node. The former state generation method is referred to
as backtracking, while the latter method of state generation is referred to as branch
and bound method.
Note: Many tree organizations are possible for a single solution space.
8-Queen’s Problem
Eight queen’s (8-queen’s) problem is the challenge to place eight queens on the
chessboard so that no two queens attack each other. By attacking, we mean that
no two queens can be on the same row, column, or diagonal. For this problem, all
the possible set of solutions can be represented as 8-tuples (x1, x2, ..., x8) where xi is the column on which queen i is placed. Further, the explicit constraint is Si = {1, 2, 3, 4, 5, 6, 7, 8} for 1 ≤ i ≤ 8, as according to the first constraint, only one queen can be placed in one row. Thus, the solution space consists of 8^8 8-tuples. However, the other two constraints, that queens must be on different columns and no two queens can be on the same diagonal, are the implicit constraints. If we consider only the first of the implicit constraints, then the solution space size is reduced from 8^8 tuples to 8! tuples.
Before discussing the solution to 8-queen’s problem, let us consider the
n-queen’s problem. It is the generalization of the 8-queen’s problem. Here, the
problem is to place n queens on an n*n chessboard such that no two queens
attack each other. We can observe that such an arrangement is possible only for n ≥ 4, because for n=1, the problem has a trivial solution, and for n=2 and n=3, no solution exists.
Assuming n=4, we have to place four queens on a 4*4 chessboard such
that no two queens attack each other. That is, queens must be on different rows,
columns and diagonals. The explicit constraint for this problem is Si = {1,2,3,4}, where 1 ≤ i ≤ 4, as no two queens can be placed in the same row. In addition, two
implicit constraints are that no two queens can be placed in the same column and
also not on the same diagonal. Here, if only rows and columns are to be different,
then solution space consists of 4! ways in which queens can be placed on the
chessboard but if third constraint, that is, no two queens can be on the same
diagonal is considered, then the size of the solution space is reduced very much.
Consider Figure 13.1 for understanding how backtracking works for 4-queen’s
problem.
position (4, 3) as shown in Figure 13.1(g). Finally, all queens have been placed on
the 4*4 chessboard and we get the optimal solution as 4-tuple vector {2, 4, 1, 3}.
For other possible solutions, the whole process is repeated again with choosing
alternative options.
Figure 13.2 shows the state space tree for 4-queen’s problem using
backtracking technique. This technique generates the necessary nodes and stops
if the next node does not satisfy the constraint, that is, if two queens are attacking.
It can be seen that all the solutions in the solution space for 4-queen’s problem can
be represented as 4-tuples (x1, x2, x3, x4) where xi represents the column on
which queen i is placed.
Algorithm 13.1: QueenPlace Function
QueenPlace(w,u)
//function returns true if queen can be placed at wth row
//and uth column
1. Set i=1
2. while(i <= w-1)
3. {
4.   If((a[i]=u) OR (Abs(a[i]-u)=Abs(i-w))) //checks whether
     //two queens are in the same column or on the same diagonal
5.     return false
6.   Set i=i+1
7. }
8. return true
With the help of the above algorithm, we are now able to find the solutions to the 8-queen’s problem using backtracking. Note that there are in total 92 solutions to the 8-queen’s problem; if rotations and reflections of the board are considered equivalent, then there are 12 unique solutions. Some possible solutions for the 8-queen’s problem are shown in Figure 13.3.
13.3 SUM OF SUBSETS
The left subtree of each node at a specific level (say, i) defines all the
subsets which include the weight wi, while the right subtree defines all subsets
which do not include wi. In other words, for each node at a specific level (say, i),
the left child represents xi = 1, while the right child represents xi = 0. The bounding
function Bj(x1, …, xj) is true if and only if:

Σ(i=1 to j) wi·xi + Σ(i=j+1 to n) wi ≥ s
Observe that (x1, …, xj) leads to an answer node only if the condition of the bounding function is satisfied. If the wi’s are initially in increasing order, then (x1, …, xj) can lead to an answer node only if:

Σ(i=1 to j) wi·xi + wj+1 ≤ s

Thus, the modified bounding function Bj(x1, …, xj) is true if and only if:

Σ(i=1 to j) wi·xi + Σ(i=j+1 to n) wi ≥ s

and

Σ(i=1 to j) wi·xi + wj+1 ≤ s
Algorithm 13.3 describes the solution to sum of subsets problem using recursive
backtracking.
Algorithm 13.3: Sum of Subsets Problem using Backtracking
Sum_of_Sub_Back(m,j,p)
//prints all subsets of w[1..n] that add up to s.
//Variable m holds the value Σ(k=1 to j-1) w[k]*x[k] and p holds
//the value Σ(k=j to n) w[k]. The w[k]’s are in increasing order.
//It is assumed that w[1] ≤ s and Σ(i=1 to n) w[i] ≥ s.
Example 13.1: Let w={3, 4, 5, 6} and s=13. Trace Algorithm 13.3 to find all possible subsets of w that sum to s.
Solution: Given that n = 4, w = {3, 4, 5, 6} and s = 13. To find the desired subsets, we start with j = 1. Thus, m = Σ(k=1 to 0) w[k]*x[k] = 0 and p = Σ(k=1 to 4) w[k] = w[1]+w[2]+w[3]+w[4] = 3+4+5+6 = 18. Now follow these steps to find the subsets.
1. Set x[1] = 1 (means include w[1] in the subset) and check the condition at step 3 of the algorithm. As m + w[1] = 0 + 3 = 3 ≠ 13, we move to step 7 of the algorithm and check whether m + w[1] + w[2] ≤ 13 or not. Since 0 + 3 + 4 = 7 ≤ 13, we again call the algorithm with arguments m = m + w[1] = 0 + 3 = 3, j = 2 and p = p – w[1] = 18 – 3 = 15 and thus, move to step 2 of the algorithm.
2. Set x[2] = 1 (means include w[2] in the subset) and check the condition at step 3 of the algorithm. As m + w[2] = 3 + 4 = 7 ≠ 13, we move to step 7 of the algorithm and check whether m + w[2] + w[3] ≤ 13 or not. Since 3 + 4 + 5 = 12 ≤ 13, we again call the algorithm with arguments m = m + w[2] = 3 + 4 = 7, j = 3 and p = p – w[2] = 15 – 4 = 11 and thus, move to step 2 of the algorithm.
3. Set x[3] = 1 (means include w[3] in the subset) and check the condition at step 3 of the algorithm. As m + w[3] = 7 + 5 = 12 ≠ 13, we move to step 7 of the algorithm and check whether m + w[3] + w[4] ≤ 13 or not. Since 7 + 5 + 6 = 18 > 13, we need to backtrack to the previous step. Now, to generate the right child, we move to step 11 of the algorithm and check whether [(m + p – w[3] ≥ 13) AND (m + w[4] ≤ 13)] or not. Since [(7 + 11 – 5 = 13 ≥ 13) AND (7 + 6 = 13 ≤ 13)] evaluates to true, set x[3] = 0 (means remove w[3] from the subset). We then again call the algorithm with arguments m = 7, j = 4 and p = p – w[3] = 11 – 5 = 6 and thus, move to step 2 of the algorithm.
4. Set x[4] = 1 (means include w[4] in the subset) and check the condition at step 3 of the algorithm. As m + w[4] = 7 + 6 = 13 = s, we print the solution as x[1..4] = {1, 1, 0, 1}. Next, we move to step 11 to generate the right child. Since now the condition at step 11 evaluates to false, the process is terminated.
nodes have the same color. The integer m is known as chromatic number of the
graph. For example, the graph shown in Figure 13.5 can be colored using three colors
1, 2, and 3. Hence, the chromatic number for the graph is 3.
Now, consider the 4-color problem for planar graphs which is a special
case of the m-colorability decision problem. A planar graph is defined as a graph
that can be drawn in a plane in such a way that no two edges of graph cross each
other. Figure 13.6 shows a map with five regions. Now, the problem is to determine
whether all the regions can be colored in such a way that no two adjacent regions
have the same color by using the given four colors only.
The map shown in Figure 13.6 can be transformed into a graph shown in
Figure 13.7 where each region acts as a node of a graph and two adjacent regions
are represented by an edge joining the corresponding nodes.
Now, we are to determine all the different ways in which the given graph
can be colored using at most m colors. Let us consider the graph G represented by
its adjacency matrix G[1:n,1:n] where n is defined as the number of nodes in
the graph. For each edge (i,j) in G, G[i,j]=1; otherwise G[i,j]=0. The colors
are represented by the numbers 1, 2, 3, …, m. The solutions to the problem are
represented in the form of n-tuple (a1, a2,...., an)where ai is the color of the
node i. The algorithm for this problem is given here.
Algorithm 13.4: m-Coloring Graph Problem
mColoring(h)
//h is the index of the next node to be colored. Array a[1:n]
//holds the color of each of the n nodes
1. do
2. {
3.   do //assigning color to hth node
4.   {
5.     Set a[h]=(a[h]+1)mod(m+1) //Select next highest color
6.     If(a[h]=0) //check if all colors have been used
7.       return
8.     Set k=1
9.     while(k <= n) //this loop determines whether the color
                     //is distinct from the adjacent colors
10.    {
11.      If((G[h,k] ≠ 0) AND (a[h]=a[k])) //check if (h,k) is an
         //edge and if adjacent nodes have the same color
12.        break
13.      Set k=k+1
14.    }
15.    If(k=n+1)
16.      break //New color found
17.  }while(true) //Try to find another color
18.  If(a[h]=0) //No new color for this node
19.    return
20.  If(h=n) //All nodes are colored with at most m colors
21.    Print(a[1:n]) //displays the solution vector
22.  Else
23.    mColoring(h+1) //Call mColoring() for value h+1
24. }while(true)
Example 13.2: Trace Algorithm 13.4 for the sample planar graph shown in Figure 13.8 and obtain its chromatic number.
Solution: For the given graph, we have n=4. Thus, the chromatic number m = 4
- 1 = 3. Initialize a[] as {0, 0, 0, 0}.
Step 1: Follow the steps of the algorithm for h=1, that is, node 1. From step 5, we get a[1]=((a[1]+1) mod(3+1))=1. This makes the condition in step 6 evaluate to false. Next, for k=1, the condition specified in step 11, that is, (G[1,1] ≠ 0 AND a[1]=a[1]), evaluates to false and hence, k is incremented. Observe that the condition also evaluates to false for k=2, 3 and 4. When k=5, the while loop is exited. Now, as the condition specified in step 15 evaluates to true, we exit from the inner do loop as well. Next, the conditions at steps 18 and 20 evaluate to false, thus the statement at step 23 is executed and mColoring() is called for h=2. The array a[] becomes {1, 0, 0, 0} up to this step.
Step 2: For h=2, we get a[2]=((a[2]+1) mod (3+1))=1 from step 5.
This makes the condition in step 6 evaluate to false. Next, for k=1,
the condition (G[2,1]≠0 AND a[2]=a[1]) evaluates to true and
hence, we exit the while loop. Now, as the condition at step 15
evaluates to false, we loop back to step 5 and get a[2]=2. This again
makes the condition in step 6 evaluate to false. Next, for k=1, the
condition (G[2,1]≠0 AND a[2]=a[1]) evaluates to false and hence,
the value of k is incremented. The condition also evaluates to false for
k=2, 3 and 4. When k=5, the while loop is exited. The condition at
step 15 evaluates to true and we exit from the inner do loop. Next, as
the conditions at steps 18 and 20 evaluate to false, the statement at
step 23 is executed and mColoring() is called for h=3. The array
a becomes {1, 2, 0, 0} up to this step.
Step 3: The above procedure executes for h=3 and we obtain the modified
array a = {1, 2, 3, 0}.
Step 4: For h=4, we get a[4]=((a[4]+1) mod (3+1))=1 from step 5.
This makes the condition in step 6 evaluate to false. Next, for k=1,
the condition (G[4,1]≠0 AND a[4]=a[1]) evaluates to false, so
the value of k is incremented. The condition also evaluates to false for
k=2, 3 and 4 and thus, the while loop is exited. For k=5, the condition
specified in step 15 evaluates to true and thus, we exit from the inner
do loop. Next, the condition at step 18 evaluates to false and the
condition at step 20 evaluates to true. Hence, the array a = {1, 2,
3, 1} is printed.
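The procedure of Algorithm 13.4 can be sketched in Python. This is an illustrative 0-indexed adaptation, not the book's exact routine; the example graph is an assumed four-node graph (a triangle with a fourth node attached) consistent with the trace of Example 13.2.

```python
def m_coloring(G, m):
    """Return every way to colour the nodes of G (adjacency matrix,
    0-indexed) with at most m colours so that adjacent nodes differ."""
    n = len(G)
    a = [0] * n            # a[i] = colour of node i (0 = uncoloured)
    solutions = []

    def colour(h):
        for c in range(1, m + 1):       # try each colour in turn for node h
            if all(not (G[h][k] and a[k] == c) for k in range(n)):
                a[h] = c                # colour c is safe for node h
                if h == n - 1:
                    solutions.append(a[:])   # all nodes coloured
                else:
                    colour(h + 1)
                a[h] = 0                # backtrack

    colour(0)
    return solutions

# Assumed graph: triangle on nodes 0,1,2 plus node 3 adjacent to node 2.
G = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
sols = m_coloring(G, 3)
```

The first solution produced, [1, 2, 3, 1], matches the vector {1, 2, 3, 1} obtained in the trace.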
(a) Graph with Hamiltonian Cycle (b) Graph without Hamiltonian Cycle
Note: A graph is said to be Hamiltonian if and only if a Hamiltonian cycle exists in it.
To find a Hamiltonian cycle in a graph G using backtracking, we start with
any arbitrary vertex, say 1; the first element of the partial solution becomes the
root of the implicit tree. Then, the next adjacent vertex is selected and added to
the tree. In case, at any stage, it is found that any vertex other than vertex 1
completes a cycle, we backtrack one step and proceed by selecting another vertex.
It is clear that the solution vector obtained using backtracking technique is
in the form of (a1, a2, ...., an), where ai depicts the ith visited vertex of the cycle.
To find the solution for Hamiltonian cycle problem, it is required to find the set of
the candidate vertices for ai if a1, a2, ...., ai-1 have already been chosen. If the first
vertex is to be chosen (that is, i=1), then any one of the n vertices can be assigned
to ai. But we assume that the cycle always begins from vertex 1 (that is, a1=1).
Considering this, the algorithm to find all the Hamiltonian cycles in a graph G using
recursive backtracking algorithm is given below.
Algorithm 13.5: Finding Hamiltonian Cycles in a Graph
HamiltonianCycle(i)
//Graph is represented as an adjacency matrix G[1:n,1:n].
//All cycles begin at vertex 1. a[] is the solution vector.
1. do //Generate legal values for a[i]
2. {
3. do
4. {
5. Set a[i]=(a[i]+1)mod(n+1) //Next vertex
6. If(a[i]=0)
7. break
8. If(G[a[i-1],a[i]]≠0) //Check the existence of edge
9. {
10. Set k=1
11. while(k ≤ i-1) //Checking for distinctness
12. {
13. If(a[k]=a[i])
14. break
15. Else
16. Set k=k+1
17. }
18. If(k=i) //If true, vertex is distinct
19. {
20. If((i < n) OR ((i = n) AND G[a[n],a[1]]≠0))
21. break
22. }
23. }
24. }while(false)
25. If(a[i]=0)
26. return
27. If(i=n)
28. Print(a[1:n]) //Returns Hamiltonian cycle path
29. Else
30. HamiltonianCycle(i+1)
31. }while(false)
In this algorithm, initially the adjacency matrix G[1:n,1:n] is given, a[2:n] is set
to 0 and a[1] is set to 1. a[1:n-1] is a path of n-1 distinct vertices. The
algorithm starts by generating possible value for a[i]. a[i] is assigned to the
next highest numbered vertex which is not present in a[1:i-1] and is connected
by an edge to a[i-1]; otherwise, a[i]=0. After assigning a value to a[i], the
function HamiltonianCycle() for the next vertex (that is, i+1) is executed. This
continues until i=n. If i=n, a[i] is connected to a[1].
Example 13.3: Consider a graph G = (V,E) shown in Figure 13.10. Find a
Hamiltonian cycle using backtracking method.
Select any vertex from the vertices 2 and 4, which are adjacent to vertex 1,
say 2. Note that we can select any vertex, but generally we choose vertices
in numerical order.
Select vertex 3 from the vertices 1, 3 and 6 that are adjacent to 2 as the
vertex 1 has already been visited and vertex 3 comes first in numerical
order from the remaining vertices.
Select the vertex 4 from the vertices 2, 4, 5 and 6 that are adjacent to 3.
Self-Instructional
168 Material
General Method
NOTES
Backtrack one step and remove vertex 6 from the partial solution. For the
same reason, backtrack to vertex 3 and remove vertices 5 and 4 from the
partial solution.
Select the vertex 5 from the vertices adjacent to 3. After proceeding further
as done earlier, we reach vertex 4, once again at a dead end. So
backtrack to vertex 5 and select vertex 6. From this vertex, we cannot
proceed further.
Backtrack to the vertex 3 from where we can proceed further. After proceeding
from vertex 3, we reach vertex 1, that is, a Hamiltonian cycle is
obtained. So, the complete solution is 1-2-3-6-5-4-1.
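Algorithm 13.5 and the walk-through above can be sketched in Python (0-indexed). The adjacency matrix below encodes the neighbour sets mentioned in the example (vertex 4's neighbours are inferred from the edges the solution uses); the generator name is ours.

```python
def hamiltonian_cycles(G):
    """Yield every Hamiltonian cycle of G (adjacency matrix, 0-indexed)
    that starts and ends at vertex 0, in backtracking order."""
    n = len(G)
    a = [0] * n                          # a[i] = i-th vertex on the tour

    def extend(i):
        for v in range(1, n):            # candidate vertices for position i
            if G[a[i - 1]][v] and v not in a[:i]:
                a[i] = v
                if i == n - 1:
                    if G[v][a[0]]:       # edge back to the start closes a cycle
                        yield a[:] + [a[0]]
                else:
                    yield from extend(i + 1)
                a[i] = 0                 # backtrack

    yield from extend(1)

# Vertices 1..6 of the example become indices 0..5.
G = [[0, 1, 0, 1, 0, 0],
     [1, 0, 1, 0, 0, 1],
     [0, 1, 0, 1, 1, 1],
     [1, 0, 1, 0, 1, 0],
     [0, 0, 1, 1, 0, 1],
     [0, 1, 1, 0, 1, 0]]
first = next(hamiltonian_cycles(G))
```

The first cycle found corresponds to 1-2-3-6-5-4-1, the solution obtained above.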
The branch and bound procedure involves two steps: one is branching and the
second is bounding. In branching, the search space (S) is split into two or
more smaller sets, say S1, S2, …, Sn (also known as nodes), whose union
covers S. This step is called so because the process is repeated recursively on each
of the smaller sets to form a tree structure. The second step, bounding,
computes the lower bound and the upper bound of each node of the tree. The
main idea behind the branch and bound approach is to reduce the number of
nodes that are eligible to become an answer node by safely discarding a node
whose lower bound is greater than the upper bound of some other node in the
tree formed so far. This step is called pruning, and is usually implemented by
maintaining a global variable U (shared among all nodes of the tree) that records
the minimum upper bound seen among all subsets examined so far. Any node
whose lower bound is greater than U can be discarded. This step is repeated
till the candidate set S is reduced to a single element or the upper bound
and lower bound become the same.
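The pruning idea can be sketched as a generic minimisation skeleton. Everything here (the function name, the LIFO choice, the toy problem) is illustrative rather than the book's notation; the point is the global bound U and the discard test.

```python
def branch_and_bound(root, branch, lower, cost, is_leaf):
    """Generic LIFO branch and bound for minimisation: keep the best
    (smallest) complete cost U found so far and discard any live node
    whose lower bound already exceeds U."""
    U = float('inf')
    best = None
    live = [root]                        # LIFO list of live nodes
    while live:
        node = live.pop()
        if lower(node) > U:
            continue                     # pruned: cannot beat best solution
        if is_leaf(node):
            if cost(node) < U:
                U, best = cost(node), node
        else:
            live.extend(branch(node))
    return best, U

# Toy use: choose 3 bits to minimise the binary number they form.
best, U = branch_and_bound(
    root='',
    branch=lambda s: [s + '1', s + '0'],
    lower=lambda s: int((s + '000')[:3], 2),  # fill remaining bits with 0
    cost=lambda s: int(s, 2),
    is_leaf=lambda s: len(s) == 3)
```

The toy search reaches the all-zeros leaf first and then prunes every other subtree, because each remaining live node has a lower bound greater than U = 0.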
In branch and bound technique, there are two additional requirements that
are not required in backtracking. These additional requirements differentiate both
these techniques and help in finding an optimal solution for a given problem. These
additional requirements are:
Lower or Upper Bound: For each node, there is an upper bound in case
of a maximization problem and a lower bound in case of a minimization
problem. This bound is obtained by using partial solution.
Previous Checking: The bound for each node is calculated by using the partial
solution. This calculated bound is compared with the best previous
partial result. If the new partial solution leads to a worse result than the best
solution so far, that part is not explored further. Otherwise, the checking
continues until a complete solution to the problem is obtained using the best
result obtained so far.
13.6.1 Branch and Bound Search Methods
As already discussed, the branch and bound technique finds
the optimum solution by using bound values (upper bound and lower bound).
These bound values are calculated using the D-search, breadth-first search
and least cost (LC) search techniques. In the branch and bound approach, breadth-first
search is called FIFO search, as the list of live nodes is a queue that follows
the first-in-first-out discipline. Similarly, D-search is called LIFO search, as the list of live
nodes is a last-in-first-out list. These calculations help us select the path to
follow and the nodes to explore. These search methods are discussed
in detail in this section.
FIFO Branch and Bound Search
In FIFO branch and bound search, each new node is placed into a queue. Once all the
children of the current E-node have been generated, the node at the front of the queue
becomes the new E-node. To understand how the nodes can be explored in FIFO
branch and bound search, consider the state space tree shown in Figure 13.11.
Assume that node 14 is the answer node and all others are killed.
Initially, the queue (represented as Q) is empty and the root node 1 is the
E-node. Expand and generate the children of E-node (node 1), that is, node 2, 3,
4 and 5 and place them in a queue as shown below.
Here, the live nodes are node 2, 3, 4 and 5. Using the FIFO approach,
remove the element placed at the front of queue, that is, node 2. The next E-node
now is node 2, thus generate all its children, that is, node 6 and 7 and put them in
the rear of queue as shown below.
Now, node 3 becomes E-node. Note that the children of node 3, which are
node 8 and 9 are not placed in the queue as they are already killed. Thus, the
queue becomes as shown below.
Now, as node 4 becomes the E-node, remove it from the queue. The child
of node 4, which is node 10 is not placed in the queue as it is already killed. Thus,
the queue becomes as shown as follows.
Next, node 5 becomes the E-node, so remove it from the queue. Its child
node, that is, node 11, is not added to the queue as it is already killed. Thus, the
queue becomes as shown below.
Now, as node 6 becomes the E-node, remove it from the queue. The children
of node 6, which are node 12 and 13 are not entered in the queue as they are
already killed. Thus, the queue becomes as shown below.
Now, node 7 becomes the E-node. Remove it from the queue and generate
its only child node, that is, node 14. The queue now becomes as shown below.
As node 14 is the answer node, the search ends here. Thus, the optimal
path obtained using FIFO branch and bound search is (1 2 7 14).
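The FIFO exploration above can be sketched directly. The tree and the killed set below are read off the walk-through of Figure 13.11; the function name is ours.

```python
from collections import deque

def fifo_search(tree, killed, answer, root):
    """FIFO branch and bound on an explicit state space tree: live nodes
    wait in a queue; killed children are never enqueued. Returns the
    root-to-answer path."""
    parent = {root: None}
    q = deque([root])
    while q:
        e_node = q.popleft()                 # next E-node, FIFO order
        for child in tree.get(e_node, []):
            if child in killed:
                continue                     # bounded (killed) node
            parent[child] = e_node
            if child == answer:              # reconstruct the path
                path = [child]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            q.append(child)
    return None

tree = {1: [2, 3, 4, 5], 2: [6, 7], 3: [8, 9],
        4: [10], 5: [11], 6: [12, 13], 7: [14]}
killed = {8, 9, 10, 11, 12, 13}
path = fifo_search(tree, killed, answer=14, root=1)
```

The path returned is (1 2 7 14), as in the walk-through.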
LIFO Branch and Bound Search
In LIFO branch and bound search, the children of an E-node are placed in a stack
instead of a queue. Thus, the elements are pushed and popped from one end. To
understand the LIFO branch and bound search, consider the state space tree shown in
Figure 13.12. Assume that the answer node is node 12 and all others are killed nodes.
Further, assume a stack S, which is empty initially, and the root node 1 is the
E-node. Push all children of node 1 (which are node 2, 3, 4 and 5) onto the stack
with node 2 as the top element.
Now, node 2 becomes the E-node. Pop it from the stack, generate its
children and push them onto the stack. As the only child of node 2 is node 6,
which is already killed, it is not pushed onto the stack. Thus, the stack becomes as
shown here.
Next, node 3 becomes the E-node, so pop it from the stack. Since its child
node, node 7, is a killed node, it is not pushed onto the stack. Thus, the stack becomes
as shown below.
Now, node 4 becomes the E-node, so pop it from the stack. The children
of node 4, which are node 8 and 9 are not pushed onto the stack, as they are
killed nodes. Thus, the stack becomes as shown below.
Now, node 5 becomes the E-node. Pop it from the stack, generate all its
children and push them onto the stack as shown below.
Next, pop the current E-node, that is node 10 and push node 12 onto the
stack, as shown here.
As node 12 is the answer node, the search ends here. Thus, the optimal path obtained using LIFO branch and bound search is (1 5 10 12).
The assignment problem deals with the assignment of different jobs or tasks to workers
in such a manner that each worker gets exactly one particular job. This one-to-one
strategy of assignment is adopted so that all jobs are completed in the least time or at the
least cost. In more technical terms, the problem describes the mechanism to
assign 'n' different tasks to 'n' different workers so that the total time or cost
incurred is optimal. Variants of the assignment problem also help a programmer
handle the situation when the number of jobs and the number of workers are not the
same, that is, either of the two can be more or less. For example, how should the salesmen
of a company be assigned to departments to maximize the
total sales value, or how should a route map for buses ferrying across cities be designed to
reduce layover time? The assignment problem can be solved by implementing the following
approaches:
(a) Enumeration method
(b) Transportation method
(c) Simplex method
(d) Hungarian assignment method
Solution to Assignment Problem
In order to find the optimal assignment of 'n' jobs to 'n' different workers to
minimize the cost of work, the first step is to construct an n-by-n cost
matrix, say M. Let P represent the workers (P = P1, P2, P3, …, Pn). In the constructed
cost matrix M, every single value Mij represents the cost of
assigning job Jj to person Pi, where 1 ≤ i ≤ n and 1 ≤ j ≤ n.
Let's consider a given cost matrix M where three persons (P1, P2 and P3)
are assigned three jobs (J1, J2 and J3):

        J1   J2   J3
P1       6    9    5
P2       4    8    3
P3       5   11    6

From the given cost matrix, different assignment cases emerge:
Case 1: If person P1 is assigned J1, P2 is assigned J2 and P3 is assigned
job J3.
General Method Case 2: If person P1 is assigned job J2, P2 is assigned job J1 and P3
is assigned job J3
Case 3: If Person P1 is assigned Job J3, Person P2 job J1 and Person P3
job J2
Case 4: If Person P1 is assigned job J2, Person P2 job J3 and person P3
job J1.
These cases continue to emerge by obtaining all possible permutations from
the given problem.
Let’s consider case 1 to find the total cost to perform all the three jobs
assigned to persons(P1 to P3)
Case 1: P1→J1, P2→J2 and P3→J3 = 6 + 8 + 6 = 20
Case 2: P1→J2, P2→J1 and P3→J3 = 9 + 4 + 6 = 19
Case 3: P1→J3, P2→J1 and P3→J2 = 5 + 4 + 11 = 20
Case 4: P1→J2, P2→J3 and P3→J1 = 9 + 3 + 5 = 17
All the cases must be explored to find the optimal assignment of jobs to
persons by obtaining the optimal value.
If we look at the above matrix M again and pick the minimum possible
cost in each row, we find:
P1→J3, P2→J3 and P3→J1 = 5 + 3 + 5 = 13
Note that this is not a valid assignment (both P1 and P2 pick J3). However,
the cost of any job assignment, including the optimal solution, cannot be smaller
than the sum obtained above, that is, 13 (the sum of the minimum cost values in
each row). Therefore, in matrix M, the cost of any assignment cannot be less than
13. In this situation, the sum obtained is called the lower bound for the problem.
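The row-minimum lower bound and the exhaustive enumeration of cases can be checked with a short sketch. The matrix values below are the ones implied by the case costs above; the function names are ours.

```python
from itertools import permutations

# Cost matrix M: rows are persons P1..P3, columns are jobs J1..J3.
M = [[6, 9, 5],
     [4, 8, 3],
     [5, 11, 6]]

def lower_bound(M):
    """Sum of row minima: no assignment can cost less than this."""
    return sum(min(row) for row in M)

def best_assignment(M):
    """Enumerate all n! one-to-one assignments and keep the cheapest."""
    n = len(M)
    best = min(permutations(range(n)),
               key=lambda jobs: sum(M[p][jobs[p]] for p in range(n)))
    return best, sum(M[p][best[p]] for p in range(n))

jobs, total = best_assignment(M)
```

Enumeration confirms Case 4 (P1→J2, P2→J3, P3→J1, cost 17) as the optimum, while the lower bound is 13.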
As we know, the objective of the knapsack problem is to fill the knapsack with the given
items in such a way that the total weight put in the knapsack does not exceed the capacity
of the knapsack and the maximum profit is obtained. This problem is a maximization
problem as we consider the maximum value of profit, and hence we will use the upper
bound value. As already discussed, the knapsack problem is defined as:
Maximize Σ (i=1 to n) pi·xi
subject to Σ (i=1 to n) wi·xi ≤ m
xi = 0 or 1, 1 ≤ i ≤ n
UBound(cp,cw,k,m)
//cp is the current profit total, cw is the current weight
//total, k is the index of the last decided item and m is the
//knapsack size
1. Set p=cp
2. Set w=cw
3. Set i=k+1
4. while (i ≤ n) //n is the number of weights and profits
5. {
6. If (w+w[i] ≤ m)
7. {
8. Set w=w+w[i]
9. Set p=p+p[i]
10. }
11. Set i=i+1
12. }
13. return p
Example 13.4: Consider an instance of the knapsack problem where n=3, m=4,
(w1, w2, w3) = (2, 3, 4) and (p1, p2, p3) = (3, 4, 5). Fill this knapsack using the
branch and bound technique so as to give the maximum possible profit.
Solution: To solve this problem, using branch and bound technique, follow these
steps.
Step 1: Calculate profit per weight as shown here.
Step 2: Compute the upper bound for the root node as given here.
Since, p = 0, m = 4, w = 0, p1/w1 = 1.5
Thus, UB = 0 + (4 - 0)*(1.5) = 6
Step 3: Include item I1 (as indicated by the left branch in Figure 13.13) and compute
the upper bound for this node as given here.
Since, p = 3, m = 4, w = 2, p2/w2 = 1.3
Thus, UB = 3 + (4 - 2)*(1.3) = 5.6
Step 4: Include item I2.
Now, p = (0+3) = 3, m = 4, w = (2+3) = 5, p2/w2 = 1.3
Here, w>m. Since, w cannot be greater than m, we will backtrack to previous
node without exploring this node.
Step 5: Exclude item I2 , that is, include item I3 .
Now, p = (0+3) = 3, m = 4, w = (2+4) = 6
Again, w>m, so backtrack to root node.
Step 6: Exclude item I1 (as indicated by the right branch in Figure 13.13), in which
case there is no item in the knapsack. Compute the upper bound for this node as
given here.
Since, p = 0, m = 4, w = 0, p2/w2 = 1.3
Thus, UB = 0 + (4 - 0)*(1.3) = 5.2
Step 7: Include item I2 and compute the upper bound for this node as given here.
Since, p = (0+4) = 4, m = 4, w = 0+3 = 3, p3/w3 = 1.25
Thus, UB = 4 + (4 - 3)*(1.25) = 5.25
Step 8:
Exclude item I2 , that is, include item I3 and compute the upper bound for this node
as given here.
Since, p = 0+5 = 5, m = 4, w = 0+4 = 4, p3/w3 = 1.25
Thus, UB = 5 + (4 - 4)*(1.25) = 5
Finally, the complete solutions correspond to the leaf nodes. Here, the node including
only item I3 (weight 4, profit 5) gives the maximum achievable profit, that is, 5.
Although the node including item I1 has the largest upper bound (5.6), neither of the
remaining items fits alongside it, so its achievable profit is only 3. Thus, packing
item I3 alone gives the optimum solution to the given problem.
Note that if the given knapsack problem is solved using the backtracking
technique, the solution obtained would be the same, as both of these problem-solving
techniques provide the optimal solution for the knapsack problem.
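The example can be checked with a short branch and bound sketch that uses the fractional bound UB = p + (m − w)·(pi/wi) from the worked steps. The function name and the pruning details are ours; items are assumed sorted by decreasing profit/weight ratio, as in the example.

```python
def knapsack_bb(w, p, m):
    """0/1 knapsack by branch and bound with the fractional upper bound
    of the worked example: current profit plus remaining capacity times
    the best remaining profit/weight ratio."""
    n = len(w)
    best = 0

    def bound(i, profit, weight):
        if i < n:
            return profit + (m - weight) * p[i] / w[i]
        return profit

    def explore(i, profit, weight):
        nonlocal best
        best = max(best, profit)            # every node is a feasible packing
        if i == n or bound(i, profit, weight) <= best:
            return                          # leaf, or cannot beat best so far
        if weight + w[i] <= m:              # left branch: include item i
            explore(i + 1, profit + p[i], weight + w[i])
        explore(i + 1, profit, weight)      # right branch: exclude item i

    explore(0, 0, 0)
    return best

best = knapsack_bb([2, 3, 4], [3, 4, 5], 4)
```

For the instance of Example 13.4 the search returns a maximum profit of 5, obtained by packing item I3 alone.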
As you have already learned in the previous unit, in the travelling salesperson problem, the
salesperson is required to visit n cities in such an order that he visits each
city exactly once and returns to the city from where he started, incurring
minimum cost. Let G=(V,E) be a directed graph representing the travelling salesperson
problem, where V is a set of vertices representing cities and E is a set of edges. Let the
number of vertices (cities) be n, that is, |V| = n. Further, assume that c(i,j)>0 is the
cost of an edge (i,j), representing the cost of travelling from city i to j. Set c(i,j) = ∞
when there is no edge between vertices i and j. Without loss of generality, we assume
that the tour begins and ends at vertex 1. Let S be the solution space; the tour will
be of the form (1, i1, i2, . . . , in-1, 1) ∈ S, if and only if (ij, ij+1) ∈ E.
We will solve this problem by using the LCBB method. To search the state
space tree of the travelling salesperson problem, it is required to define a cost function C(.)
and two other functions ĉ(.) and u(.) in such a way that ĉ(x) ≤ C(x) ≤ u(x)
for each node x. Further, the cost C(.) will be such that the solution node having
the least C(.) represents the shortest tour in G.
The steps to solve this problem are as follows:
1. Obtain ĉ(x) by reducing the cost matrix representing the travelling salesperson
problem. A matrix is said to be reduced if all its rows and columns are
reduced, and a row or a column is said to be reduced if and only if it contains
at least one zero. The total of all the values, L, subtracted from the matrix to
reduce it is the minimum cost for any tour. This value is used as the ĉ(.) for
the root of the state space tree.
For example, consider the cost matrix representing the graph G shown in
Figure 13.14.
(a) Reducing rows (b) Reducing column (c) Reduced cost matrix
Now, L = 4+3+4+2+6+3 = 22. Hence, every tour in the graph
will have a cost of at least 22.
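Matrix reduction can be sketched as follows. Since the cost matrix of Figure 13.14 is not reproduced in the text, the example below uses a small hypothetical matrix; the reduction logic is the same.

```python
INF = float('inf')

def reduce_matrix(M):
    """Reduce each row, then each column, of a cost matrix so that every
    reducible row and column contains a zero; return the reduced matrix
    and the total amount subtracted (the lower bound L for any tour)."""
    n = len(M)
    M = [row[:] for row in M]
    total = 0
    for i in range(n):                        # row reduction
        m = min(M[i])
        if m not in (0, INF):
            total += m
            M[i] = [x - m if x != INF else INF for x in M[i]]
    for j in range(n):                        # column reduction
        m = min(M[i][j] for i in range(n))
        if m not in (0, INF):
            total += m
            for i in range(n):
                if M[i][j] != INF:
                    M[i][j] -= m
    return M, total

# Hypothetical 3-city cost matrix (INF on the diagonal: no self-loops).
R, L = reduce_matrix([[INF, 4, 7],
                      [5, INF, 3],
                      [6, 2, INF]])
```

Here the row minima 4, 3, 2 and the column minimum 2 are subtracted, giving L = 11 as the lower bound for any tour of this small instance.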
2. Associate the least cost to the root of the state space tree and generate all
the children nodes for this node.
3. Obtain the reduced cost matrix for each child node. For this, consider that A
is the reduced cost matrix for node x, and let y be a child node of node x.
The tree edge (x, y) corresponds to the edge (i,j) included in the tour.
Now, the reduced cost matrix for node y, say B, can be obtained by following
these steps.
(a) Change all the entries of row i and column j to ∞.
(b) Set the entry in the matrix corresponding to the edge (j, i), that is, A[j,i], to ∞.
(c) Reduce all the rows and columns of the matrix so obtained, except the
rows and columns containing only ∞.
4. Now ĉ(y) can be obtained as follows:
ĉ(y) = ĉ(x) + A[i,j] + l
where l = the total value subtracted from matrix A to obtain matrix B. For the
upper bound function u, ∞ can be assigned to each node x, that is, u(x) = ∞.
Further, ĉ(.) = C(.) for leaf nodes can be determined easily, since each
leaf represents a unique tour.
5. Select the node with minimum ĉ(.) as the next E-node and explore it further.
This procedure is repeated till we get a solution node with ĉ(.) less than or equal to the ĉ(.) of all
other live nodes.
Select root node 1 as the E-node, as we have assumed the tour will begin from
vertex 1. The ĉ(.) for the E-node is 22 (=L).
As a next step, nodes 2, 3, 4 and 5 (corresponding to vertices 2, 3, 4 and
5) are generated for E-node 1. Now, we have to calculate ĉ(.) for all these nodes
corresponding to the edges (1,2), (1,3), (1,4) and (1,5).
Now, generate the reduced cost matrix for nodes 2, 3, 4 and 5 and compute ĉ(.) for each of them.
ĉ(8) = 22 + 0 (cost of edge (5,4)) + 0 = 22
Out of these, ĉ(8) is minimum. Therefore, the next vertex selected is 4 and
we will select node 8 as our next E-node and generate nodes 9 and 10
(corresponding to vertices 2 and 3) for the E-node 8. Using the same procedure,
obtain the reduced cost matrix and compute ĉ(.) for nodes 9 and 10.
Node 9, path(1,5,4,2) , edges: (1,5), (5,4), (4,2)
ĉ(10) = 22 + 0 (cost of edge (4,3)) + 5 = 27
Out of these, ĉ(9) is minimum. Therefore, the next vertex selected is 2 and
we will select node 9 as our next E-node. Generate solution node 11 (corresponding
to vertex 3) for the E-node 9. Using the same procedure, obtain the reduced cost
matrix and compute ĉ(.) for node 11.
Node 11, path (1,5,4,2,3), edges: (1,5), (5,4), (4,2), (2,3)
6. The assignment problem deals with the assignment of different jobs or tasks
to workers in such a manner that each worker gets exactly one particular
job.
13.11 SUMMARY
Backtracking is a very useful technique used for solving the problems that
require finding a set of solutions or an optimal solution satisfying some
constraints.
In this technique, if several choices are available for a given problem, then any
one choice is selected and we proceed towards finding the solution.
Note that most of the problems that can be solved by using backtracking
technique require all the solutions to satisfy a complex set of constraints.
These constraints are divided into two categories: explicit and implicit.
Backtracking algorithm finds the problem solutions by searching the solution
space
To help in searching, a tree organization is used for the solution space,
which is referred to as state space tree.
Each node in the state space tree defines a problem state.
The most important of these steps is the generation of problem states from
the state space tree.
While generating the nodes, nodes are referred to as live nodes, E-nodes
and dead nodes.
A planar graph is defined as a graph that can be drawn in a plane in such a
way that no two edges of the graph cross each other.
A Hamiltonian cycle c of G is a cycle that goes through every vertex exactly
once and returns to its initial position.
To find a Hamiltonian cycle in a graph G using backtracking, we start with
any arbitrary vertex, say 1, the first element of the partial solution becomes
the root of the implicit tree.
The elimination of possibilities in one step is known as pruning.
Backtracking is useful in the problems where there are many possibilities
but few of them are required to test for complete solution.
The branch and bound procedure involves two steps: one is branching and
the second is bounding.
The second step, bounding, computes the lower bound and the upper bound of each node of the tree.
Explicit Constraints: These are the rules that allow each ai to take values
from the given set only.
Implicit Constraints: These are the rules that identify all the tuples in the
solution space of I which satisfy the bounding function.
13.14 FURTHER READINGS
Graph Traversals
14.0 INTRODUCTION
A graph is a non-linear data structure. Graph traversal (also known as graph search)
refers to the process of visiting (checking and/or updating) each vertex in a graph.
Such traversals are classified by the order in which the vertices are visited. This
unit will explain graph traversals in detail.
14.1 OBJECTIVES
14.2 GRAPHS
A graph is a non-linear data structure. A data structure in which each node has at
most one successor node is called a linear data structure, for example, an array, linked
list, stack or queue. A data structure in which a node may have more than one successor
node is called a non-linear data structure.
Many problems can be naturally formulated in terms of elements and
their interconnections. A graph is a mathematical representation for such situations.
A graph can be defined as follows:
A graph consists of a finite set of vertices (or nodes) V and a finite set of edges E. Each
edge is uniquely identified by a pair of vertices [x, y]. A graph can be represented
by G(V, E).
(Figures: example graphs with labelled vertices and weighted edges.)
(Figures: graph G with vertices labelled by in-degree and out-degree; graph G7 and the subgraph of G7.)
Adjacency list
An adjacency list of a graph is used to keep track of all edges incident to a vertex.
This representation can be done with an array of size N, every ith index specifies
the list (list of incident edges) for vertex i.
Searching the vertices adjacent to a given node is easy and cheap in this
representation. Addition of an edge is also an easy task,
whereas deletion of an existing edge is a difficult operation. An example of an adjacency
list is shown in Figure 14.5. The graph contains five nodes with six edges. For
each vertex there is a corresponding incidence list. Vertex 1 is connected with vertex
2 and vertex 5. Therefore, the adjacency representation contains node 2 and node 5
in the list of vertex 1.
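An adjacency list can be built in Python as follows. Vertex 1's list [2, 5] matches the description of Figure 14.5, while the remaining four of the six edges are assumed for illustration.

```python
def build_adjacency_list(n, edges):
    """Array of lists: index i holds the vertices adjacent to vertex i
    (1-indexed, as in the figures; index 0 is unused)."""
    adj = [[] for _ in range(n + 1)]
    for u, v in edges:              # undirected: record each edge both ways
        adj[u].append(v)
        adj[v].append(u)
    return adj

# Five nodes, six edges; (1,2) and (1,5) follow the text, the rest are assumed.
adj = build_adjacency_list(5, [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5)])
```

Adding an edge is a pair of appends; deleting one requires scanning two lists, which is why deletion is the costlier operation.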
Graph G
Adjacency multi-list
An adjacency multi-list is a representation in which there are two parts, directory
information (an array is used to represent all the vertices of the graph and is called
directory) and another part represented by the set of linked list for incident edge
information. For each node of graph, there is one entry in the directory information
and every directory entry node i points to an adjacency list for node i. Each edge
record appears on two adjacency lists. We use the following data structure to
represent the node of adjacency list.
Vi | Next1 | Vj | Next2
Figure 14.6 shows a graph with four nodes and its adjacency multi-list
representation.
BFS Example
The following example shows the working of BFS. Given a graph G with 12
vertices, the starting vertex is 1. In Figure 14.8, a node added to the queue is shown
by a filled circle and the queue is also shown at every stage.
Traverse vertex 1.
Now visit vertex 11; none of its neighbours is unvisited.
Now visit vertex 12; none of its neighbours is unvisited.
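The queue-driven traversal can be sketched in Python. The small graph below is illustrative, not the 12-vertex graph of Figure 14.8; the function name is ours.

```python
from collections import deque

def bfs(adj, start):
    """Breadth-first traversal using a queue; returns the vertices in
    the order they are visited."""
    visited = {start}
    order = []
    q = deque([start])
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:               # enqueue each unvisited neighbour
            if v not in visited:
                visited.add(v)
                q.append(v)
    return order

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}
order = bfs(adj, 1)
```

Vertices are visited level by level: the start vertex first, then all its neighbours, then their neighbours, and so on.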
Depth-First Traversal
The Depth-First Traversal (DFS) uses a stack as the supporting data structure
(recall that a stack is a first-in last-out data structure). DFS begins from the start node
and records the backtracking path from the root to the node presently under
consideration. DFS is a way of traversal very similar to the preorder traversal of a tree.
BFS is short and bushy whereas DFS is long and stringy. DFS works as follows:
It begins from the starting node S.
Then it visits a node N along a path P which starts at node S.
Then it visits a neighbour of a neighbour of node S, and so on.
After coming to the end of path P, it similarly continues along another path P'
and so on.
DFS Algorithm
Algorithm DFS(G, S)
// Given a graph G and a starting vertex S, it uses a Boolean array VISITED
// of size equal to the number of nodes in the graph.
1) For each vertex U ∈ V(G) do
VISITED [U] = FALSE
End for
2) push the starting node S on to the stack
3) while stack is not empty do
a) Pop the top node X of the stack and set VISITED[X]=TRUE
b) For each neighbor P of node X
If VISITED [P] =false and not already on stack then
Push P onto the stack
End if
End for
End while (step 3)
4) EXIT
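Algorithm DFS can be rendered in Python. Pushing neighbours in reverse order is our choice so that the first-listed neighbour ends up on top of the stack, reproducing the stack contents shown in the example that follows; the adjacency lists match the five-vertex graph of Figure 14.9 as described in that walk-through.

```python
def dfs(adj, start):
    """Iterative depth-first traversal mirroring Algorithm DFS: pop a
    vertex, mark it visited, and push each unvisited neighbour that is
    not already waiting on the stack."""
    visited = {v: False for v in adj}
    on_stack = {start}
    stack = [start]
    order = []
    while stack:
        x = stack.pop()
        visited[x] = True
        order.append(x)
        for nb in reversed(adj[x]):     # reversed: first-listed ends on top
            if not visited[nb] and nb not in on_stack:
                stack.append(nb)
                on_stack.add(nb)
    return order

adj = {'A': ['B', 'C', 'D', 'E'],
       'B': ['A', 'C'],
       'C': ['A', 'B', 'D', 'E'],
       'D': ['A', 'C'],
       'E': ['A', 'C']}
order = dfs(adj, 'A')
```

On this graph the traversal visits A, B, C, D, E, in agreement with the step-by-step stack trace below.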
DFS Example
Given a graph G (Figure 14.9) with five vertices A, B, C, D and E, DFS works as
follows:
Initially
STACK is empty and
VISITED:
A B C D E
FALSE FALSE FALSE FALSE FALSE
Push the starting vertex A onto the stack (Figure 14.10).
STACK : A <-Top
VISITED:
A B C D E
TRUE FALSE FALSE FALSE FALSE
Now pop A from the stack and push the neighbours of A onto the stack. Here
the neighbours of A are B, C, D and E.
STACK : E D C B <-Top
VISITED:
A B C D E
TRUE FALSE FALSE FALSE FALSE
Now it pops the top of the stack, B, and pushes those neighbours of B onto the
stack which are not on the stack till now. The neighbours of B are A and C;
A is already visited and C is already on the stack, so nothing is pushed.
STACK : E D C <-Top
VISITED:
A B C D E
TRUE TRUE FALSE FALSE FALSE
Now it pops up C from the stack and pushes the neighbours of C onto the stack.
The neighbours of C are A, B, D and E. A and B are already visited, and D
and E are already on the stack, so nothing is pushed.
STACK : E D <-Top
VISITED:
A B C D E
TRUE TRUE TRUE FALSE FALSE
Now it pops up D from the stack and pushes the neighbours of D onto the stack.
The neighbours of D are A and C, which are already visited.
STACK : E <-Top
VISITED:
A B C D E
TRUE TRUE TRUE TRUE FALSE
Now it pops up E from the stack and pushes the neighbours of E onto the stack.
The neighbours of E are A and C. But both A and C are already visited.
STACK : <-Top
VISITED:
A B C D E
TRUE TRUE TRUE TRUE TRUE
Now the stack is empty, which means all the nodes have been visited.
Kosaraju's algorithm efficiently computes the strongly connected components of a directed graph.
(Figure: a directed graph, parts (a) and (b).)
Graph Traversals Finally, the acyclic component graph is shown as follows.
14.3 NP HARD AND NP COMPLETE PROBLEMS
Basic Concepts
So far, we have come across many problems in this book. For some problems like
ordered searching, sorting, etc., there exist polynomial time algorithmic solutions
with complexities ranging from O(n) to O(n2), where n is the size of the input. The
problems for which a polynomial time solution exists (or is known) are class P
problems. There are, however, some problems like knapsack, travelling salesperson,
etc., for which no polynomial time algorithm is known so far. In addition, no one has
yet been able to prove that a polynomial time solution cannot exist for these problems.
These problems fall under another class of problems, that is, class NP.
Class NP problems can be further categorized into two classes of problems:
NP-hard and NP-complete. An NP-complete problem has an interesting characteristic:
it can be solved in polynomial time if and only if every other NP-complete
problem can also be solved in polynomial time. Further, if an NP-hard problem can be
solved in polynomial time, then all NP-complete problems can be solved in
polynomial time. This implies that all NP-complete problems are NP-hard, but it
is not necessary that all NP-hard problems are NP-complete.
Besides the NP-hard and NP-complete classes, there can be more problem
classes having characteristic mentioned above. We will restrict our discussion to
NP-hard and NP-complete classes, which are computationally related; both of
these can be solved using non-deterministic computation.
14.3.1 Non-Deterministic Algorithms
Before proceeding to the concept of non-deterministic algorithms, let us first
understand what deterministic algorithms are. The deterministic algorithms are
algorithms in which the result obtained from each operation is uniquely defined.
Till now we have been using the deterministic algorithms to solve the problems.
However, to deal with NP problems, the above stated limitation on the result of
each operation must be removed from the algorithm. The algorithms can be allowed
to have the operations whose results are not uniquely defined but are restricted to
some specified sets of possibilities. Such algorithms are known as
non-deterministic algorithms. These algorithms are executed on special machines
called non-deterministic machines. Such machines do not exist in practice.
For specifying non-deterministic algorithms, three functions are required to
be defined which are as follows:
Choice(S): It selects one of the elements of set S; selection is made at
random. Consider a statement a = Choice(1,n). This statement will
assign any one of the values in the range [1,n] to a. Note that there is no
rule specifying how these values are chosen from the set.
Failure(): It indicates that the algorithm terminates unsuccessfully. A non-
deterministic algorithm terminates unsuccessfully if and only if there is not a
single set of choices in the specified sets of choices that can lead to the
successful completion of the algorithm.
Success(): It indicates that the algorithm terminates successfully. If there exists
a set of choices that can lead to the successful completion of the algorithm,
then it is certain that one such set of choices will always be selected and the
algorithm terminates successfully.
Note that the time taken to compute the functions Choice(), Failure() and
Success() is O(1).
Non-Deterministic Search
Consider the problem of searching for an element a in an unordered set of integers
S[1:n], where n > 1. To solve this problem, we have to find an index k such that
S[k]=a, or report k=0 if a does not exist in S. The non-deterministic
algorithm to solve this problem is given in Algorithm 14.1.
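Algorithm 14.1 is not reproduced here, but its idea can be simulated deterministically in Python (a sketch under the assumption that Choice(1,n) is replaced by trying every index in turn):

```python
def nd_search(S, a):
    """Deterministic simulation of the non-deterministic search.

    Each index k simulates one outcome of Choice(1, n); returning k
    plays the role of Success(), and returning 0 plays Failure()."""
    n = len(S)
    for k in range(1, n + 1):   # try every possible choice of k
        if S[k - 1] == a:       # this choice leads to Success()
            return k
    return 0                    # every choice leads to Failure()

print(nd_search([5, 2, 9, 4], 9))  # element found at index 3
print(nd_search([5, 2, 9, 4], 7))  # element absent, so k = 0
```

The non-deterministic machine "guesses" the right k in O(1) choices; the deterministic simulation pays for that guess with a linear scan.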
Given a knapsack of capacity c and n items, where each item i has a weight wti
and a profit pi. If a fraction xi (0 ≤ xi ≤ 1) of item i is kept in the knapsack,
then a profit of pi·xi is earned. The objective of this optimization problem is to
fill the knapsack with the items in such a way that the profit earned is maximum.
This optimization problem can be recast as a decision problem. The objective of the
knapsack decision problem is to check whether the value 0 or 1 can be assigned to
each xi (1 ≤ i ≤ n) such that Σ pi·xi ≥ mp and Σ wti·xi ≤ c, where mp is a given
number. If this decision problem cannot be computed in deterministic polynomial
time, then the optimization problem cannot either. The non-deterministic algorithm
for the knapsack decision problem is given in Algorithm 14.3.
Algorithm 14.3: Non-Deterministic Knapsack Decision Problem
NDKDP(p,wt,n,c,mp,x)
1. Set W=0
2. Set P=0
3. Set i=1
4. while(i <= n)
5. {
6.    Set x[i]=Choice(0,1)   //assign 0 or 1 value
7.    Set W=W+x[i]*wt[i]     //compute total weight
                             //corresponding to the choice of x[]
8.    Set P=P+x[i]*p[i]      //compute total profit
                             //corresponding to the choice of x[]
9.    Set i=i+1
10. }
11. If ((W>c) OR (P<mp))     //checks if the total weight exceeds the
                             //knapsack capacity or the resultant
                             //profit is less than mp
12.    Failure()
13. Else
14.    Success()
In this algorithm, the non-deterministic time for executing the while loop is
O(n), as each iteration takes O(1) time. Hence, the total non-deterministic time
to run this algorithm is O(n). Note that no polynomial time deterministic
algorithm is known for this problem.
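A non-deterministic machine does not exist in practice, but Algorithm 14.3 can be simulated deterministically by trying every possible sequence of Choice(0,1) outcomes; the 2^n sequences are exactly why this simulation is not polynomial. A sketch (the function name and example values are illustrative assumptions):

```python
from itertools import product

def nd_kdp(p, wt, c, mp):
    """Deterministic simulation of NDKDP: each 0/1 tuple x simulates one
    run of the n Choice(0,1) calls; a run Succeeds if the total weight is
    within capacity c and the total profit is at least mp."""
    n = len(p)
    for x in product((0, 1), repeat=n):
        W = sum(xi * w for xi, w in zip(x, wt))   # total weight
        P = sum(xi * pi for xi, pi in zip(x, p))  # total profit
        if W <= c and P >= mp:
            return True    # some choice sequence reaches Success()
    return False           # all choice sequences reach Failure()

# Items with profits (10, 5, 15) and weights (2, 3, 5), capacity 7:
print(nd_kdp([10, 5, 15], [2, 3, 5], c=7, mp=25))  # items 1 and 3 fit
print(nd_kdp([10, 5, 15], [2, 3, 5], c=7, mp=26))  # no assignment works
```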
Satisfiability Problem
The objective of the satisfiability problem is to determine whether a formula is true
for some assignment of truth values to its Boolean variables, say x1, x2, .... A formula
in the propositional calculus is composed of literals (a literal is either a variable
or its negation) and the operators AND (∧), OR (∨) and NOT (¬). We say a
formula is in n-conjunctive normal form (n-CNF) if it comprises the AND of clauses
that are the OR of n Boolean variables or their negations. For example,
(x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ x2 ∨ ¬x3) is in 3-CNF.
For the satisfiability problem, a polynomial time non-deterministic algorithm
can be obtained easily; it terminates successfully if and only if the given formula
F(x1, x2,…, xn) is satisfiable. The non-deterministic algorithm for the
satisfiability problem is given in Algorithm 14.5.
Algorithm 14.5: Non-Deterministic Satisfiability Problem
NDSAT(F,n)
//F is the formula and n is the number of variables x1,x2,..,xn
1. Set i=1
2. while(i <= n)
3. {
4.    Set xi=Choice(false,true)  //selects a truth value for assignment
5.    Set i=i+1
6. }
7. If F(x1,....,xn)
8.    Success()
9. Else
10.   Failure()
The computing time for this algorithm is equal to the sum of the time taken
to select truth values for x1,.....,xn, that is, O(n), and the time required to
deterministically evaluate the expression F for that assignment. The latter time
is proportional to the length of the formula F.
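As with the knapsack case, NDSAT can be simulated deterministically by enumerating all 2^n truth assignments. In this sketch the formulas F and G are illustrative assumptions, not examples from the text:

```python
from itertools import product

def nd_sat(F, n):
    """Deterministic simulation of NDSAT: each truth assignment
    corresponds to one sequence of Choice(false, true) outcomes."""
    for values in product((False, True), repeat=n):
        if F(*values):
            return True   # Success(): F is satisfiable
    return False          # Failure(): no assignment satisfies F

# F = (x1 or not x2) and (x2 or x3): satisfiable, e.g. with x1 = x2 = True
F = lambda x1, x2, x3: (x1 or not x2) and (x2 or x3)
print(nd_sat(F, 3))

# G = x1 and not x1: unsatisfiable under every assignment
G = lambda x1: x1 and not x1
print(nd_sat(G, 1))
```

The simulation makes the gap concrete: the non-deterministic algorithm runs in time proportional to n plus the length of F, while the deterministic simulation may examine all 2^n assignments.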
14.3.2 NP-Hard and NP-Complete Classes
P is defined as the set of all decision problems that can be solved by deterministic
algorithms within polynomial time, whereas NP is defined as the set of all decision
problems that can be solved by non-deterministic algorithms within polynomial
time.
From these definitions, it is clear that P ⊆ NP, as deterministic algorithms are
special cases of non-deterministic algorithms. Now, the thing to be
determined is whether P = NP or P ≠ NP. This problem is not solved yet. However,
some other useful results have been obtained. One of them is stated above, that is,
P ⊆ NP. The relationship between P and NP on the basis of the assumption P ≠ NP is
depicted in Figure 14.12.
Here, if the input calls for bit S(i,j,0) to be 1, then T(i,j,0) is S(i,j,0);
otherwise, T(i,j,0) is ¬S(i,j,0). Hence, if there is no input, then:
[Formula omitted in the original.]
where each Lt states that there is a unique instruction for step t, and is
defined as:
[Formula omitted in the original.]
Lt is true if and only if exactly one of the R(j,t)’s is true, where 1 ≤ j ≤ l.
Also note that L is in CNF.
4. The formula M is given by:
[Formula omitted in the original.]
Here, each Mi,t states that either instruction i is not the one which will
be executed at time t, or, if it is executed at time t, then the instruction to be
executed at time t+1 will definitely be determined by instruction i. Mi,t
is defined as:
[Formula omitted in the original.]
Here, each Ni,t states that at time t, instruction i is not executed, or
it is and the status of the p(n) words after step t is correct with respect to the
status before step t and the resultant changes from i. Formally, Ni,t is defined
as:
[Formula omitted in the original.]
14.4 ANSWERS TO CHECK YOUR PROGRESS
QUESTIONS
1. A graph is a non-linear data structure.
2. In a directed graph, the edges consist of an ordered pair of vertices.
3. The Depth-First-Traversal (DFS) uses a stack as a supporting data structure.
4. Class NP problems can be categorized into two classes of problems: NP-
hard and NP-complete.
5. The objective of satisfiability problem is to determine whether a formula is
true for some sequence of truth values to its Boolean variables.
14.5 SUMMARY
A graph is a non-linear data structure. A data structure in which each node
has at most one successor node is called a linear data structure, for example
an array, linked list, stack or queue.
Many problems can be naturally formulated in terms of elements and
their interconnections.
In a directed graph, the edges consist of an ordered pair of vertices (one-
way edges).
A weighted graph is one in which edges are associated with some weight;
this weight can be distance, time or any cost function.
Every edge contributes to the degree of exactly two vertices in an
undirected graph.
An adjacency matrix is a way of representing graphs in memory.
An adjacency list of a graph is used to keep track of all edges incident to a
vertex.
Searching the vertices adjacent to each node is an easy and cheap task in this
representation.
An adjacency list is preferred over an adjacency matrix because of its
compact structure.
There is always a need to traverse a graph to find out the structure of graph
used in various applications (recall traversal is visiting each node of a data
structure exactly once).
The Depth-First-Traversal (DFS) uses a stack as a supporting data structure.
A spanning tree of a graph G is a connected subgraph G’ which is a tree and
contains all the vertices (but fewer edges) of graph G. A spanning tree of a
graph G with N vertices contains N-1 edges.
Kosaraju’s algorithm efficiently computes strongly connected
components and it is the simplest to understand. There is a better algorithm
than Kosaraju’s algorithm, called Tarjan’s algorithm, which improves
performance by a factor of two.
The deterministic algorithms are algorithms in which the result obtained from
each operation is uniquely defined.
Class NP problems can be categorized into two classes of problems: NP-
hard and NP-complete.
9. Find all the possible spanning trees for the following graph:
[Figure: a weighted graph with edge weights 2, 3, 4, 5, 7 and 8, omitted in the original.]