
For B.E., COMPUTER SCIENCE AND ENGINEERING BRANCH


As per the Latest Syllabus of Anna University, Chennai
(Regulations 2021)

LAKSHMI PUBLICATIONS
Plot No.73, VGP Gokul Nagar, 2nd Main Road (40 Feet Road),
Perumbakkam, Chennai-600 100, Tamil Nadu, INDIA.
Phone: 044-49523977, 98945 98598
E-mail: [email protected]
[email protected]
Website: www.lakshmipublications.com
ALGORITHMS
by

First Edition: January 2023

Copyright © 2023 exclusively by the Authors

All Rights Reserved

No part of this publication can be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photocopying, recording or otherwise, without the prior written permission of the author.

Price: Rs.
ISBN
Published by and copies can be had from:

Head Office:
LAKSHMI PUBLICATIONS
Plot No.73, VGP Gokul Nagar,
2nd Main Road (40 Feet Road),
Perumbakkam, Chennai-600 100,
Tamil Nadu, INDIA.
Phone: 044-49523977
Mobile: 98945 98598
E-mail: [email protected]
[email protected]
Website: www.lakshmipublications.com

Branch Office:
LAKSHMI BOOK HOUSE
No.88, Pidari South Street,
(Govt. Hospital Road),
Sirkali – 609 110. (TK)
Nagapattinam (Dt)
Phone: 044 - 4952 3977
PREFACE
The goal of writing the textbook "ALGORITHMS" was to satisfy the needs of
Tamilnadu engineering students enrolled at Anna University. This book is
appropriate for fourth semester Engineering College students in the Computer
Science branch in accordance with Regulation 2021.

FEATURES OF THE BOOK

The main aim is to give students a textbook of a kind that is not very common at the moment. Every effort has been made to ensure that the subject's contents are as accurate as possible; clarity and correctness are scrupulously adhered to throughout the book.

The goal and purpose of the intended audience are kept tightly in focus, and much effort has been put into presenting accurate results. The contents will help readers feel at ease and grasp the depth of the subject.

A brief introduction and concise captions for the figures and tables are provided for the content of the book. Emphasis is placed on avoiding grammatical mistakes and using the proper format.

- AUTHORS
ACKNOWLEDGEMENT

We first and foremost express our gratitude to GOD ALMIGHTY for giving us the mental fortitude required to complete the preparation of this book.

We would like to extend our sincere appreciation to everyone who helped out, discussed things, supplied feedback, allowed us to use their comments, and helped with the editing, proofreading, and design.

We would like to express our gratitude to the Management, Principal Dr. N. Balaji, Vice Principal Dr. S. Soundararajan and our HOD Dr. V. P. Gladispushparathi for guiding us in publishing the book.

We also thank our family members, friends and other loved ones whose support has always been with us.

With great concern for the welfare of engineering students, we would like to extend a very special thanks to Mrs. Nirmala Durai, Proprietrix of Lakshmi Publications, Mr. A. Durai, B.E., Founder and Managing Director of Lakshmi Publications, and Mrs. P.S. Malathi for providing professional support in all of the activities throughout the development and production of this book.

Suggestions and comments for further improvement of this book are most welcome. Please mail me at [email protected].


For B.E., COMPUTER SCIENCE AND ENGINEERING BRANCH

UNIT I: INTRODUCTION 9
Algorithm analysis: Time and space complexity - Asymptotic Notations and its properties - Best case, worst case and average case analysis - Recurrence relation: substitution method - Lower bounds - Searching: linear search, binary search and interpolation search - Pattern search: The naïve string-matching algorithm - Rabin-Karp algorithm - Knuth-Morris-Pratt algorithm. Sorting: Insertion sort - heap sort.

UNIT II: GRAPH ALGORITHMS 9


Graph algorithms: Representations of graphs - Graph traversal: DFS - BFS - applications - Connectivity, strong connectivity, bi-connectivity - Minimum spanning tree: Kruskal's and Prim's algorithm - Shortest path: Bellman-Ford algorithm - Dijkstra's algorithm - Floyd-Warshall algorithm - Network flow: Flow networks - Ford-Fulkerson method - Matching: Maximum bipartite matching.

UNIT III: ALGORITHM DESIGN TECHNIQUES 9


Divide and Conquer methodology: Finding maximum and minimum - Merge sort - Quick sort - Dynamic programming: Elements of dynamic programming - Matrix-chain multiplication - Multi stage graph - Optimal binary search trees. Greedy Technique: Elements of the greedy strategy - Activity-selection problem - Optimal merge pattern - Huffman trees.

UNIT IV: STATE SPACE SEARCH ALGORITHMS 9


Backtracking: n-Queens problem - Hamiltonian circuit problem - Subset sum problem - Graph colouring problem. Branch and Bound: Solving 15-puzzle problem - Assignment problem - Knapsack problem - Travelling salesman problem.
UNIT V: NP-COMPLETE AND APPROXIMATION ALGORITHM 9
Tractable and intractable problems: Polynomial time algorithms - Venn diagram representation - NP-algorithms - NP-hardness and NP-completeness - Bin packing problem - Problem reduction: TSP - CNF problem. Approximation algorithms: TSP - Randomized algorithms: concept and application - primality testing - randomized quick sort - Finding kth smallest number.
CONTENTS
UNIT I

1.1. Algorithm Analysis ..................................................................................... 1.1


1.1.1. Time and Space Complexity .................................................................. 1.1
1.1.2. Asymptotic Notations and its Properties ................................................ 1.2
1.1.3. Measurement of Complexity of an Algorithm ....................................... 1.6
1.2. Recurrence Relation ................................................................................... 1.8
1.2.1. Substitution Method ............................................................................... 1.9
1.2.2. Lower Bound Theory ........................................................................... 1.10
1.3. Linear Search ............................................................................ 1.12
1.3.1. Working of Linear Search..................................................... 1.12
1.3.2. Algorithm ............................................................................. 1.13
1.4. Binary Search............................................................................ 1.15
1.5. Interpolation Search ................................................................. 1.17
1.6. Pattern Search .......................................................................... 1.18
1.7. The Naïve String Matching Algorithm .................................. 1.19
1.8. The Rabin-Karp Algorithm .................................................... 1.20
1.9. The Knuth-Morris-Pratt (KMP) Algorithm .......................... 1.22
1.9.1. Components of KMP Algorithm .......................................... 1.23
1.9.2. Running Time Analysis ....................................................... 1.23
1.9.3. The KMP Matcher................................................................ 1.25
1.9.4. Running Time Analysis ....................................................... 1.25

1.10. Sorting ....................................................................................... 1.29

1.10.1. Insertion Sort ........................................................................ 1.29
1.10.2. Heap Sort Algorithm ............................................................ 1.32
Two Marks Question and Answers (Part - A) ............................................... 1.40
Part - B & C ...................................................................................................... 1.42

UNIT II

2.1. Graph Algorithms ...................................................................................... 2.1


2.1.1. Representations of Graphs ...................................................................... 2.1

2.2. Graph Traversals........................................................................................ 2.3


2.2.1. Depth-First Search Algorithm ................................................................ 2.3

2.2.2. Breadth First Search (BFS) .................................................................... 2.8

2.2.3. Applications of DFS and BFS algorithms ............................................ 2.12

2.2.4. Strongly Connected Components ......................................................... 2.13

2.2.5. Biconnectivity ...................................................................................... 2.14

2.3. Spanning Tree ........................................................................................... 2.15


2.3.1. Minimum Spanning Tree ..................................................................... 2.16

2.3.2. Kruskal’s Algorithm ............................................................................ 2.16

2.3.3. Prim’s Algorithm ................................................................................. 2.18

2.4. Shortest Path ............................................................................................. 2.19


2.4.1. Bellman-Ford Algorithm...................................................................... 2.20

2.4.2. Dijkstra's Algorithm ............................................................................. 2.21

2.4.3. Floyd Warshall Algorithm ................................................................... 2.26

2.5. Flow Networks .......................................................................................... 2.29



2.6. Ford - Fulkerson Algorithm .................................................................... 2.30


2.7. Maximum Bipartite Matching................................................................. 2.31
Two Marks Question and Answers (Part - A) .............................................. 2.33
Part - B & C ...................................................................................................... 2.35

UNIT III

3.1. Divide and Conquer Methodology ............................................................ 3.1


3.1.1. Finding Maximum and Minimum .......................................................... 3.3

3.1.2. Merge Sort .............................................................................................. 3.4

3.1.3. Quick Sort .............................................................................................. 3.6

3.2. Dynamic Programming .............................................................................. 3.7


3.2.1. Elements of Dynamic Programming ..................................................... 3.7

3.2.2. Applications of Dynamic Programming ................................................ 3.8

3.3. Matrix Chain Multiplication ..................................................................... 3.9

3.4. Multistage Graph...................................................................................... 3.14

3.5. Optimal Binary Search Trees .................................................................. 3.18

3.6. Greedy Technique .................................................................................... 3.22


3.6.1. Elements of the Greedy Strategy.......................................................... 3.23

3.7. Activity - selection problem ..................................................................... 3.24

3.8. Optimal Merge Pattern ............................................................................ 3.26

3.9. Huffman Trees .......................................................................................... 3.29

Two Marks Question and Answers (Part - A) ............................................... 3.31

Part - B & C ...................................................................................................... 3.33



UNIT IV

4.1. Backtracking ............................................................................................... 4.1


4.1.1. n - Queens problem ............................................................................... 4.5
4.1.2. Hamiltonian Circuit Problem ................................................................. 4.9
4.1.3. Subset Sum Problem ............................................................................ 4.13
4.2. Graph colouring problem ........................................................................ 4.16
4.3. Branch and Bound.................................................................................... 4.20
4.3.1. Solving 15-Puzzle Problem .................................................................. 4.20
4.3.2. Assignment Problem ............................................................................ 4.26
4.3.3. Knapsack Problem ............................................................................... 4.37
4.3.4. Algorithm of solving Knapsack Problem ............................................. 4.37
4.3.5. Travelling Salesman Problem .............................................................. 4.42
4.3.6. Dynamic Programming ........................................................................ 4.43
Two Marks Question and Answers (Part - A) ............................................... 4.46
Part - B & C ...................................................................................................... 4.48

UNIT V

5.1. Tractable and Intractable Problems ......................................................... 5.1


5.1.1. Polynomial time (p - time) reduction ..................................................... 5.4
5.1.2. Problems in NP and NP - Complete are very hard
to solve but easy to check ........................................................................ 5.5
5.1.3. Problems in NP - Complete are the hardest problems in NP ................. 5.5
5.1.4. Problems in NP - Complete stand or fall together ................................. 5.5

5.2. Polynomial Time Algorithms..................................................................... 5.6


5.3. Venn Diagram Representation .................................................................. 5.9
5.3.1. Venn Diagram Formula........................................................................ 5.14
5.3.2. Applications of Venn Diagram ............................................................ 5.15
5.4. NP - Algorithms ........................................................................................ 5.16
5.5. NP - Hardness and NP - Completeness................................................... 5.17
5.6. Bin Packing Problem................................................................................ 5.20
5.7. Problem Reduction ................................................................................... 5.23
5.7.1. Problem Reduction Algorithm ............................................................. 5.24
5.8. TSP ............................................................................................................. 5.25
5.9. 3- CNF problem ........................................................................................ 5.28
5.10. Approximation Algorithms: TSP .......................................................... 5.30
5.11. Randomized Algorithms: Concept and Application ........................... 5.32
5.11.1. Linearity of Expectation ..................................................................... 5.32
5.11.2. Primality Testing ................................................................................ 5.34
5.12. Randomized Quick Sort ......................................................................... 5.39
5.13. Finding kth Smallest Number ................................................................. 5.43
Two Marks Question and Answers (Part-A) ................................................. 5.46
Part - B & C ...................................................................................................... 5.51

P.1. Searching and Sorting Algorithms ........................................................... P.1


P.2. Graph Algorithms ...................................................................................... P.9
P.3. Algorithm Design Techniques .................................................................P.31
P.4. State Space Search Algorithms ............................................................... P.36
P.5. Approximation Algorithms - Randomized Algorithms .......................... P.38
Model Question Papers .................................................................... MQ.1 - MQ.8
UNIT I
INTRODUCTION
Algorithm analysis: Time and space complexity - Asymptotic Notations and its properties - Best case, worst case and average case analysis - Recurrence relation: substitution method - Lower bounds - Searching: linear search, binary search and interpolation search - Pattern search: The naïve string matching algorithm - Rabin-Karp algorithm - Knuth-Morris-Pratt algorithm. Sorting: Insertion sort - heap sort.

1.1. ALGORITHM ANALYSIS

• Analyzing an algorithm has come to mean predicting the resources that the algorithm requires. Occasionally, resources such as memory, communication bandwidth, or computer hardware are of primary concern, but most often it is computational time that we want to measure.
• Generally, by analyzing several candidate algorithms for a problem, we can identify the most efficient one. Such analysis may indicate more than one viable candidate.
• Before we can analyze an algorithm, we must have a model of the implementation technology, including a model for the resources of that technology and their costs.

1.1.1. TIME AND SPACE COMPLEXITY


 Human nature aspires to seek an efficient way to assemble their daily tasks.
The predominant thought process behind innovation and technology is to
make life easier for people by providing ways to solve problems they may
encounter. The same thing happens within the world of computer science
and digital products. We write algorithms that are efficient and take up less
memory to perform better.
Time complexity is the time taken by the algorithm to execute each set of instructions. It is always better to select the most efficient algorithm when a simple problem can be solved by several different methods.

Space complexity usually refers to the amount of memory consumed by the algorithm. It is composed of two different spaces: auxiliary space and input space.
A good algorithm is one that takes less time in execution and saves space during
the process. Ideally, we have to find a middle ground between space and time, but we
can settle for the average. Let's look at a simple algorithm for finding the sum of two numbers.
Step #01: Start.
Step #02: Create two variables (a & b).
Step #03: Store integer values in 'a' and 'b'.   → Input
Step #04: Create a variable named 'Sum'.
Step #05: Store the sum of 'a' and 'b' in the variable 'Sum'.   → Output
Step #06: End.
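As a rough illustration, here is a minimal C++ sketch of the steps above (our own example, not from the text; the values stored in 'a' and 'b' are arbitrary). It uses a constant number of operations and variables, so both its time and space complexity are O(1):

#include <iostream>

int main() {                       // Step #01: Start
    int a = 3, b = 4;              // Steps #02-#03: create 'a' and 'b' and store inputs
    int Sum = a + b;               // Steps #04-#05: store the sum in 'Sum'
    std::cout << Sum << std::endl; // prints 7 (Output)
    return 0;                      // Step #06: End
}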
Following are the factors that play a significant role in the long-term usage of an algorithm:
1. Efficiency - We've already talked about how much efficiency matters in creating a good algorithm. It is efficiency that reduces computational time and generates quick output.
2. Finiteness - The algorithm must terminate after completing a specified
number of steps. Otherwise, it’ll use more memory space, and it’s
considered a bad practice. Stack overflow and out-of-bounds conditions
may occur if it goes on for infinite loops or recursion.
3. Correctness - A good algorithm should produce a correct result
irrespective of the size of the input provided.

1.1.2. ASYMPTOTIC NOTATIONS AND ITS PROPERTIES


The main idea of asymptotic analysis is to have a measure of the efficiency of algorithms that doesn't depend on machine-specific constants and doesn't require the algorithms to be implemented and the running times of programs to be compared. Asymptotic notations are mathematical tools to represent the time complexity of algorithms for asymptotic analysis.

There are mainly three asymptotic notations:


1. Big-O Notation (O-notation)
2. Omega Notation (Ω-notation)
3. Theta Notation (Θ-notation)
1. Big-oh notation: Big-oh is the formal method of expressing the upper bound of an algorithm's running time. It is a measure of the longest amount of time the algorithm can take. The function f(n) = O(g(n)) [read as "f of n is big-oh of g of n"] if and only if there exist positive constants c and n₀ such that
f(n) ≤ c·g(n) for all n ≥ n₀
Hence, the function g(n) is an upper bound for the function f(n), as g(n) grows at least as fast as f(n).

Fig. 1.1. Asymptotic Upper Bound


For Example:
1. 3n + 2 = O(n) as 3n + 2 ≤ 4n for all n ≥ 2
2. 3n + 3 = O(n) as 3n + 3 ≤ 4n for all n ≥ 3
Hence, the complexity of f (n) can be represented as O (g (n))
2. Omega (Ω) Notation: The function f(n) = Ω(g(n)) [read as "f of n is omega of g of n"] if and only if there exist positive constants c and n₀ such that
f(n) ≥ c·g(n) for all n ≥ n₀
For Example:
f(n) = 8n² + 2n − 3 ≥ 8n² − 3
     = 7n² + (n² − 3) ≥ 7n² for all n ≥ 2
Thus, c = 7 and n₀ = 2.
Hence, the complexity of f(n) can be represented as Ω(g(n)), i.e., Ω(n²).

Fig. 1.2. Asymptotic Lower Bound


3. Theta (θ) Notation: The function f(n) = θ(g(n)) [read as "f of n is theta of g of n"] if and only if there exist positive constants k₁, k₂ and n₀ such that
k₁·g(n) ≤ f(n) ≤ k₂·g(n) for all n ≥ n₀

Fig. 1.3. Asymptotic Tight Bound


For Example:
3n + 2 = θ(n) as 3n + 2 ≥ 3n and 3n + 2 ≤ 4n for all n ≥ 2;
k₁ = 3, k₂ = 4, and n₀ = 2
Hence, the complexity of f(n) can be represented as θ(g(n)).
The Theta Notation is more precise than both the big-oh and Omega notation. The
function f (n) = θ (g (n)) if g(n) is both an upper and lower bound.
Properties of Asymptotic Notations:
1. General Properties:
If f(n) is O(g(n)) then a·f(n) is also O(g(n)), where a is a constant.

Example:
f(n) = 2n² + 5 is O(n²);
then 7·f(n) = 7(2n² + 5) = 14n² + 35 is also O(n²).
Similarly, this property is satisfied by both Θ and Ω notation.

2. Transitive Properties:
If f (n) is O(g(n)) and g(n) is O(h(n)) then f (n) = O(h(n)).

Example:
If f(n) = n, g(n) = n² and h(n) = n³,
then n is O(n²) and n² is O(n³); hence n is O(n³).
Similarly, this property is satisfied by both Θ and Ω notation.

3. Reflexive Properties:
Reflexive properties are always easy to understand after transitive.
If f(n) is given then f(n) is O(f(n)), since the maximum value of f(n) is f(n) itself. Hence f(n) and O(f(n)) are always tied in a reflexive relation.

Example:
f(n) = n²; then f(n) is O(n²), i.e., O(f(n)).
Similarly, this property is satisfied by both Θ and Ω notation.

4. Symmetric Properties:
If f (n) is Θ(g(n)) then g(n) is Θ(f (n)).

Example:
If f(n) = n² and g(n) = n²,
then f(n) = Θ(n²) and g(n) = Θ(n²).
This property is satisfied only by Θ notation.

5. Transpose Symmetric Properties:


If f (n) is O(g(n)) then g(n) is Ω (f (n)).
Example:
If f(n) = n and g(n) = n²,
then n is O(n²) and n² is Ω(n).
This property is satisfied only by O and Ω notations.

6. Some More Properties:


1. If f (n) = O(g(n)) and f (n) = Ω(g(n)) then f (n) = Θ(g(n))
2. If f(n) = O(g(n)) and d(n) = O(e(n)) then f(n) + d(n) = O(max(g(n), e(n)))

Example:
f(n) = n, i.e., O(n)
d(n) = n², i.e., O(n²)
then f(n) + d(n) = n + n², i.e., O(n²)

1.1.3. MEASUREMENT OF COMPLEXITY OF AN ALGORITHM


Based on the above three notations of Time Complexity there are three cases to
analyze an algorithm:

1. Worst Case Analysis (Mostly used)


In the worst-case analysis, we calculate the upper bound on the running time of
an algorithm. We must know the case that causes a maximum number of
operations to be executed. For Linear Search, the worst case happens when the
element to be searched (x) is not present in the array. When x is not present, the
search() function compares it with all the elements of arr[] one by one. Therefore,
the worst-case time complexity of the linear search would be O(n).

2. Best Case Analysis (Very Rarely used)


In the best-case analysis, we calculate the lower bound on the running time of an
algorithm. We must know the case that causes a minimum number of operations to
be executed. In the linear search problem, the best case occurs when x is present at
the first location. The number of operations in the best case is constant (not
dependent on n). So time complexity in the best case would be Ω(1)

3. Average Case Analysis (Rarely used)

In average case analysis, we take all possible inputs, calculate the computing time for each of them, sum all the calculated values, and divide the sum by the total number of inputs. We must know (or predict) the distribution of cases. For the linear search problem, let us assume that all cases are uniformly distributed (including the case of x not being present in the array). So we sum the costs of all the cases and divide the sum by (n + 1):

Average-case time = [θ(1) + θ(2) + ... + θ(n) + θ(n + 1)] / (n + 1), which works out to θ(n).

Linear search algorithm:

// C implementation of the approach


#include <stdio.h>
// Linearly search x in arr[].
// If x is present then return the index,
// otherwise return -1
int search(int arr[], int n, int x)
{
int i;
for (i = 0; i < n; i++) {
if (arr[i] == x)
return i;
}
return -1;
}
/* Driver's code*/
int main()
{

int arr[] = { 1, 10, 30, 15 };


int x = 30;
int n = sizeof(arr) / sizeof(arr[0]);
// Function call
printf("%d is present at index %d", x,
search(arr, n, x));
getchar();
return 0;
}

Output
30 is present at index 2
Time Complexity Analysis: (In Big-O notation)
• Best Case: O(1). This occurs if the element to be searched is at the first index of the given list; the number of comparisons in this case is 1.
• Average Case: O(n). This occurs, for example, if the element to be searched is around the middle index of the given list.
• Worst Case: O(n). This occurs if:
  - the element to be searched is at the last index, or
  - the element to be searched is not present in the list.

1.2. RECURRENCE RELATION

A recurrence is an equation or inequality that describes a function in terms of its values on smaller inputs. To solve a recurrence relation means to obtain a function defined on the natural numbers that satisfies the recurrence.

For example, the worst-case running time T(n) of the MERGE SORT procedure is described by the recurrence

T(n) = θ(1)               if n = 1
T(n) = 2T(n/2) + θ(n)     if n > 1
There are four methods for solving Recurrence:
1. Substitution Method
2. Iteration Method
3. Recursion Tree Method
4. Master Method

1.2.1. SUBSTITUTION METHOD


The Substitution Method consists of two main steps:
1. Guess the solution.
2. Use mathematical induction to find the boundary condition and show that the guess is correct.

Examples of the process of solving recurrences using substitution


Let's say we have the recurrence relation given below.
T(n) = 2T(n − 1) + c₁, (n > 1)
T(1) = 1
We know that the answer is probably T(n) = O(2^n). The observation that we are almost doubling the number of O(1) operations for a constant decrease in n leads to the guess. We will thus show that T(n) ≤ k·2^n − b and prove our answer.
Now we come to the use of induction. Here the base case occurs when n = 1:
T(1) = 1 ≤ k·2¹ − b = 2k − b
2k − b ≥ 1
k ≥ (b + 1)/2
Inductive step: We assume that the property we guessed holds for n − 1 and prove that in such a case it also holds for n:
T(n) = 2T(n − 1) + c₁
T(n) ≤ 2(k·2^(n−1) − b) + c₁
T(n) ≤ k·2^n − 2b + c₁
T(n) ≤ k·2^n − b (letting b = c₁)

After the above steps we obtain b = c₁ and k ≥ (b + 1)/2. These constraints have infinitely many solutions, and we are free to choose any constants, since big-O notation does not care about constants being multiplied, divided, added or subtracted.
Hence, we have confirmed that T(n) = O(2^n).
Let's cover another example. The recurrence relation is described below.
T(n) = 2T(n/2) + n, n > 1
T(1) = 1
T(n) = O(n log n) is our guess. We must therefore show that T(n) ≤ c₁·n·log(n) + c₂ to prove our guess.
Base case (n = 1):
T(1) = 1 ≤ c₁·1·log(1) + c₂
1 ≤ c₁·1·0 + c₂
c₂ ≥ 1
Inductive step (n > 1):
T(n) ≤ 2T(n/2) + n
T(n) ≤ 2(c₁·(n/2)·log(n/2) + c₂) + n
T(n) ≤ c₁·n·log(n) − c₁·n·log(2) + 2c₂ + n
T(n) ≤ c₁·n·log(n) + (1 − c₁·log(2))·n + 2c₂
We can conclude that the above holds for any c₁ ≥ 1/log(2) and a suitable c₂. We again have numerous solutions for the above set of inequalities, and hence the guess is accurate and T(n) = O(n·log(n)).

1.2.2. LOWER BOUND THEORY

What is the lower bound?


Let us assume L(n) to be the running time of an algorithm A. Then g(n) is a lower bound of algorithm A if there exist two constants C and N such that L(n) ≥ C·g(n) for all n > N.
Lower bound theory is a technique used to establish that a given algorithm is the most efficient possible. This is done by discovering a function g(n) that is a lower bound on the time that any algorithm must take to solve the given problem. If we have an algorithm whose computing time is of the same order as g(n), then we know that asymptotically we cannot do better. The concept is based on calculating the minimum time required to execute an algorithm; hence the name lower bound theory (or base bound theory). Lower bound theory uses a number of methods/techniques to find the lower bound.

If f(n) is the time for some algorithm, then we write f(n) = Ω(g(n)) to mean that g(n) is a lower bound of f(n). Formally, there exist positive constants c and n₀ such that |f(n)| ≥ c·|g(n)| for all n > n₀. In addition to developing lower bounds to within a constant factor, we try to determine more exact bounds whenever possible. Deriving good lower bounds is more difficult than devising efficient algorithms, because a lower bound states a fact about all possible algorithms for solving a problem. Generally, we cannot enumerate and analyze all these algorithms, so lower bound proofs are often hard to obtain.
The proof techniques that are useful for obtaining lower bounds are:

1. Comparison trees:
Comparison trees are the computational model useful for determining lower
bounds for sorting and searching problems.

2. Oracles and adversary arguments:
One of the important techniques for obtaining lower bounds consists of making use of an oracle.

3. Lower bounds through reduction:
This is a very important lower-bound technique. It calls for reducing the given problem to a problem for which a lower bound is already known.

4. Techniques for the algebraic problem:


Substitution and linear independence are two methods used for deriving lower
bounds on algebraic and arithmetic problems. The algebraic problems are operation
on integers, polynomials, and rational functions.

Application:
The lower bound is necessary for any algorithm: after we have designed it, we can compare its actual complexity with the lower bound. If the complexity of the algorithm and the order of the lower bound are the same, then we can declare the algorithm optimal. One of the best examples of an optimal algorithm is merge sort. The upper bound should match the lower bound, that is, L(n) = U(n). The easiest method to find a lower bound is the trivial lower bound method: if we can easily observe a lower bound on the basis of the number of inputs taken and outputs produced, it is known as a trivial lower bound. For example, multiplication of an n × n matrix.

5. Ordered Searching:
It is a type of searching in which the list is already sorted.

Explanation:
In linear search, we compare the key with the first element; if it does not match, we compare it with the second element, and so on, until we check against the nth element. Otherwise, we end up with a failure, having made n comparisons.

1.3. LINEAR SEARCH

Linear search is a sequential searching algorithm where we start from one end and check every element of the list until the desired element is found. It is the simplest searching algorithm.

1.3.1. WORKING OF LINEAR SEARCH


Let's see the working of the linear search algorithm. To understand it, let's take an unsorted array; it will be easy to understand the working of linear search with an example. Let the elements of the array be −

Fig. 1.4.
Let the element to be searched is K = 41
Now, start from the first element and compare K with each element of the array.

Fig. 1.5.
Introduction 1.13

The value of K, i.e., 41, does not match the first element of the array. So, move to the next element and follow the same process until the respective element is found.

Fig. 1.6.
Now, the element to be searched is found, so the algorithm will return the index of the element.

1.3.2. ALGORITHM
1. Linear_Search(a, n, val)  // 'a' is the given array, 'n' is the size of the array, 'val' is the value to search
2. Step 1: set pos = –1
3. Step 2: set i = 1
4. Step 3: repeat step 4 while i ≤ n
5. Step 4: if a[i] == val
6.            set pos = i
7.            print pos
8.            go to step 6
9.         [end of if]
10.        set i = i + 1
11. [end of loop]
12. Step 5: if pos = –1
13.            print "value is not present in the array"
14.         [end of if]
15. Step 6: exit

Example:
#include <stdio.h>
int linearSearch(int a[], int n, int val) {
// Going through the array sequentially
for (int i = 0; i < n; i++)
{
if (a[i] == val)
return i+1;
}
return -1;
}
int main() {
int a[] = {70, 40, 30, 11, 57, 41, 25, 14, 52}; // given array
int val = 41; // value to be searched
int n = sizeof(a) / sizeof(a[0]); // size of array
int res = linearSearch(a, n, val); // Store result
printf("The elements of the array are - ");
for (int i = 0; i < n; i++)
printf("%d ", a[i]);

printf("\nElement to be searched is - %d", val);


if (res == -1)
printf("\nElement is not present in the array");
else
printf("\nElement is present at %d position of array", res);
return 0;
}

Output
The elements of the array are – 70 40 30 11 57 41 25 14 52
Element to be searched is – 41
Element is present at 6 position of array

1.4. BINARY SEARCH

Explanation:
In binary search, we check the middle element against the key; if the middle element is greater, we search the first half, otherwise we search the second half, and repeat the same process. The diagram below illustrates binary search on an array consisting of 4 elements.

Fig. 1.7.

Calculating the lower bound: The maximum number of comparisons is n. Let there be k levels in the comparison tree.
1. The number of nodes in the tree will be at most 2^k − 1.
2. Any comparison-based search of an element in a list of size n requires at least n nodes in the tree, since there can be up to n comparisons in the worst case; hence 2^k − 1 ≥ n.
3. Each level costs one comparison, so the number of comparisons is k ≥ log₂(n + 1).
Thus the lower bound of any comparison-based search on a list of n elements cannot be less than log(n). Therefore we can say that binary search is optimal, as its complexity is Θ(log n).
Algorithm
1. Binary_Search(a, lower_bound, upper_bound, val)  // 'a' is the given array, 'lower_bound' is the index of the first array element, 'upper_bound' is the index of the last array element, 'val' is the value to search
2. Step 1: set beg = lower_bound, end = upper_bound, pos = –1
3. Step 2: repeat steps 3 and 4 while beg ≤ end
4. Step 3: set mid = (beg + end)/2
5. Step 4: if a[mid] = val
6.            set pos = mid
7.            print pos
8.            go to step 6
9.         else if a[mid] > val
10.           set end = mid – 1
11.        else
12.           set beg = mid + 1
13.        [end of if]
14. [end of loop]
15. Step 5: if pos = –1
16.           print "value is not present in the array"
17.        [end of if]
18. Step 6: exit
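A direct C++ rendering of this pseudocode (our own sketch, using 0-based indexing instead of the 1-based indexing above):

#include <iostream>

// Returns the 0-based index of val in the sorted array a[0..n-1], or -1.
int binarySearch(const int a[], int n, int val) {
    int beg = 0, end = n - 1;
    while (beg <= end) {
        int mid = beg + (end - beg) / 2;  // same as (beg+end)/2, without overflow
        if (a[mid] == val) return mid;
        if (a[mid] > val) end = mid - 1;  // search the left half
        else              beg = mid + 1;  // search the right half
    }
    return -1;  // value is not present in the array
}

int main() {
    int a[] = {10, 14, 19, 26, 27, 31, 33, 35, 42, 44};
    int n = sizeof(a) / sizeof(a[0]);
    std::cout << binarySearch(a, n, 27) << "\n";  // prints 4
    std::cout << binarySearch(a, n, 41) << "\n";  // prints -1
    return 0;
}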
1.5. INTERPOLATION SEARCH

Interpolation search finds a particular item by computing the probe position. Initially, the probe position is the position of the middle-most item of the collection.

If a match occurs, then the index of the item is returned. To split the list into two parts, we use the following method −

mid = Lo + ((Hi − Lo) / (A[Hi] − A[Lo])) * (X − A[Lo])

where −
A = list
Lo = Lowest index of the list
Hi = Highest index of the list
A[n] = Value stored at index n in the list
X = Element being searched

If the item being searched is greater than the middle item, then the probe position is recalculated in the sub-array to the right of the middle item; otherwise, the item is searched in the sub-array to the left of the middle item. This process continues on the sub-array until the size of the sub-array reduces to zero.

The runtime complexity of the interpolation search algorithm is Ο(log(log n)), as compared to the Ο(log n) of binary search, in favorable situations.

Algorithm
As it is a refinement of binary search, the steps to search for the 'target' data value using position probing are −
Step 1 − Start searching data from the middle of the list.
Step 2 − If it is a match, return the index of the item, and exit.
Step 3 − If it is not a match, compute the probe position.
Step 4 − Divide the list using the probing formula and find the new middle.
Step 5 − If data is greater than middle, search in the higher sub-list.
Step 6 − If data is smaller than middle, search in the lower sub-list.
Step 7 − Repeat until a match is found.
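A runnable C++ sketch of this probing scheme (our own illustration, assuming a sorted array of integers; function and variable names are ours):

#include <iostream>

// Returns the index of x in the sorted array A[lo..hi], or -1 if absent.
int interpolationSearch(const int A[], int lo, int hi, int x) {
    while (lo <= hi && x >= A[lo] && x <= A[hi]) {
        if (A[hi] == A[lo])                 // avoid division by zero
            return (A[lo] == x) ? lo : -1;
        // Probe position from the interpolation formula in the text.
        int mid = lo + (int)((double)(hi - lo) / (A[hi] - A[lo]) * (x - A[lo]));
        if (A[mid] == x) return mid;
        if (A[mid] < x)  lo = mid + 1;      // search the higher sub-list
        else             hi = mid - 1;      // search the lower sub-list
    }
    return -1;
}

int main() {
    int A[] = {10, 12, 13, 16, 18, 19, 20, 21, 22, 23, 24, 33, 35, 42, 47};
    int n = sizeof(A) / sizeof(A[0]);
    std::cout << interpolationSearch(A, 0, n - 1, 18) << "\n";  // prints 4
    return 0;
}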

1.6. PATTERN SEARCH

The pattern searching algorithms are sometimes also referred to as string searching algorithms and are considered a part of the string algorithms. These algorithms are useful when searching for a string within another string.

Given a text array T[1..n] of n characters and a pattern array P[1..m] of m characters, the problem is to find an integer s, called a valid shift, where 0 ≤ s ≤ n − m and T[s + 1..s + m] = P[1..m]. In other words, we want to find every place where P occurs as a substring of T. The items of P and T are characters drawn from some finite alphabet such as {0, 1} or {A, B, ..., Z, a, b, ..., z}.

Given a string T[1..n], the substrings are represented as T[i..j] for some 1 ≤ i ≤ j ≤ n: the string formed by the characters in T from index i to index j, inclusive. This definition implies that a string is a substring of itself (take i = 1 and j = n).

A proper substring of string T[1..n] is a substring T[i..j] with either i > 1 or j < n; that is, any substring other than T itself.

Using these definitions, we can say that for any string T[1..n], the substrings are
T[i..j] = T[i] T[i + 1] T[i + 2] ... T[j] for some 1 ≤ i ≤ j ≤ n.
Text:    A A B A A C A A D A A B A A B A
Index:   0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Pattern: A A B A
Pattern found at shifts 0, 9 and 12.

Algorithms used for String Matching:

The following methods are used to find a string within a text:
1. The Naive String Matching Algorithm
2. The Rabin-Karp Algorithm
3. Finite Automata
4. The Knuth-Morris-Pratt Algorithm
5. The Boyer-Moore Algorithm

1.7. THE NAÏVE STRING MATCHING ALGORITHM

The naïve approach tests all the possible placements of pattern P[1..m] relative to text T[1..n]. We try each shift s = 0, 1, ..., n − m successively, and for each shift s compare T[s + 1..s + m] to P[1..m].

The naïve algorithm finds all valid shifts using a loop that checks the condition P[1..m] = T[s + 1..s + m] for each of the n − m + 1 possible values of s.

NAIVE-STRING-MATCHER (T, P)
1. n ← length[T]
2. m ← length[P]
3. for s ← 0 to n − m
4.     do if P[1..m] = T[s + 1..s + m]
5.         then print "Pattern occurs with shift" s
Example:
Suppose T = 1011101110 and P = 111. Find all the valid shifts.
Checking each shift s = 0, 1, ..., 7: T[s + 1..s + 3] equals 111 only for s = 2 (T[3..5] = 111) and s = 6 (T[7..9] = 111). So the valid shifts are 2 and 6.
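A short C++ version of the naïve matcher (our own sketch of the pseudocode above, converted to 0-based indexing) reproduces these shifts:

#include <iostream>
#include <string>

// Print every valid shift s (0-indexed) where P occurs in T.
void naiveStringMatcher(const std::string& T, const std::string& P) {
    int n = T.size(), m = P.size();
    for (int s = 0; s <= n - m; s++) {
        if (T.compare(s, m, P) == 0)          // compare T[s..s+m-1] with P
            std::cout << "Pattern occurs with shift " << s << "\n";
    }
}

int main() {
    naiveStringMatcher("1011101110", "111");  // prints shifts 2 and 6
    return 0;
}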
1.8. THE RABIN-KARP ALGORITHM

The Rabin-Karp string matching algorithm calculates a hash value for the pattern, as well as for each m-character subsequence of the text to be compared. If the hash values are unequal, the algorithm computes the hash value for the next m-character sequence. If the hash values are equal, the algorithm compares the pattern and the m-character sequence character by character. In this way, there is only one hash comparison per text subsequence, and character matching is only required when the hash values match.

RABIN-KARP-MATCHER (T, P, d, q)
1. n ← length[T]
2. m ← length[P]
3. h ← d^(m−1) mod q
4. p ← 0
5. t₀ ← 0
6. for i ← 1 to m
7.     do p ← (d·p + P[i]) mod q
8.        t₀ ← (d·t₀ + T[i]) mod q
9. for s ← 0 to n − m
10.    do if p = tₛ
11.        then if P[1..m] = T[s + 1..s + m]
12.            then print "Pattern occurs with shift" s
13.       if s < n − m
14.           then tₛ₊₁ ← (d(tₛ − T[s + 1]·h) + T[s + m + 1]) mod q
Example: For string matching, working modulo q = 11, how many spurious hits does the Rabin-Karp matcher encounter in the text T = 31415926535 with pattern P = 26?
1. T = 31415926535, P = 26, d = 10 and q = 11.
2. P mod q = 26 mod 11 = 4.
3. Now find every window whose hash equals P mod q. The hash values of the 2-character windows of T are: 31 → 9, 14 → 3, 41 → 8, 15 → 4, 59 → 4, 92 → 4, 26 → 4, 65 → 10, 53 → 9, 35 → 2.
4. Four windows hash to 4: the windows 15, 59 and 92 are spurious hits (the hash matches but the characters do not), while 26 is a genuine match. So the matcher encounters 3 spurious hits.
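The following C++ sketch (our own, with alphabet size d = 10 for decimal digits and q = 11 as in the example) runs the Rabin-Karp matcher and reports both genuine matches and spurious hits:

#include <iostream>
#include <string>

// Rabin-Karp matcher over a digit string; d is the alphabet size, q the modulus.
void rabinKarpMatcher(const std::string& T, const std::string& P, int d, int q) {
    int n = T.size(), m = P.size();
    int h = 1;                            // h = d^(m-1) mod q
    for (int i = 0; i < m - 1; i++) h = (h * d) % q;
    int p = 0, t = 0;
    for (int i = 0; i < m; i++) {         // preprocess: hash of P and first window
        p = (d * p + (P[i] - '0')) % q;
        t = (d * t + (T[i] - '0')) % q;
    }
    for (int s = 0; s <= n - m; s++) {
        if (p == t) {                     // hash hit: verify character by character
            if (T.compare(s, m, P) == 0)
                std::cout << "Pattern occurs with shift " << s << "\n";
            else
                std::cout << "Spurious hit at shift " << s << "\n";
        }
        if (s < n - m)                    // rolling hash for the next window
            t = ((t - (T[s] - '0') * h % q + q) * d + (T[s + m] - '0')) % q;
    }
}

int main() {
    rabinKarpMatcher("31415926535", "26", 10, 11);  // 3 spurious hits, match at 6
    return 0;
}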
1.9. THE KNUTH-MORRIS-PRATT (KMP) ALGORITHM

Knuth, Morris and Pratt introduced a linear-time algorithm for the string matching problem. A matching time of O(n) is achieved by avoiding comparisons with elements of 'S' that have previously been involved in a comparison with some element of the pattern 'p' to be matched; i.e., backtracking on the string 'S' never occurs.

1.9.1. COMPONENTS OF KMP ALGORITHM


1. The Prefix Function (Π): The Prefix Function, Π for a pattern
encapsulates knowledge about how the pattern matches against the shift of
itself. This information can be used to avoid a useless shift of the pattern
'p.' In other words, this enables avoiding backtracking of the string 'S.'
2. The KMP Matcher: With string 'S,' pattern 'p' and prefix function 'Π' as
inputs, find the occurrence of 'p' in 'S' and returns the number of shifts of
'p' after which occurrences are found.

The Prefix Function (Π)


The following pseudo code computes the prefix function Π:

COMPUTE-PREFIX-FUNCTION (P)

1. m ← length[P]    // 'p' pattern to be matched
2. Π[1] ← 0
3. k ← 0
4. for q ← 2 to m
5.     do while k > 0 and P[k + 1] ≠ P[q]
6.         do k ← Π[k]
7.     if P[k + 1] = P[q]
8.         then k ← k + 1
9.     Π[q] ← k
10. return Π

1.9.2. RUNNING TIME ANALYSIS


In the above pseudo code for calculating the prefix function, the for loop from step 4 to step 9 runs m − 1 times. Steps 1 to 3 take constant time. Hence the running time of computing the prefix function is O(m).
Example: Compute Π for the pattern p = ababaca:

Solution:
Initially: m = length[p] = 7
Π[1] = 0
k = 0

Step 1: q = 2, k = 0; Π[2] = 0
q  1  2  3  4  5  6  7
p  a  b  a  b  a  c  a
Π  0  0

Step 2: q = 3, k = 0; Π[3] = 1
q  1  2  3  4  5  6  7
p  a  b  a  b  a  c  a
Π  0  0  1

Step 3: q = 4, k = 1; Π[4] = 2
q  1  2  3  4  5  6  7
p  a  b  a  b  a  c  a
Π  0  0  1  2

Step 4: q = 5, k = 2; Π[5] = 3
q  1  2  3  4  5  6  7
p  a  b  a  b  a  c  a
Π  0  0  1  2  3

Step 5: q = 6, k = 3; Π[6] = 0
q  1  2  3  4  5  6  7
p  a  b  a  b  a  c  a
Π  0  0  1  2  3  0

Step 6: q = 7, k = 0; Π[7] = 1
q  1  2  3  4  5  6  7
p  a  b  a  b  a  c  a
Π  0  0  1  2  3  0  1

1.9.3. THE KMP MATCHER


The KMP Matcher, with the pattern 'p', the string 'S' and the prefix function 'Π' as input, finds a match of p in S. The following pseudo code computes the matching component of the KMP algorithm:

KMP-MATCHER (T, P)
1. n ← length[T]
2. m ← length[P]
3. Π ← COMPUTE-PREFIX-FUNCTION (P)
4. q ← 0                       // number of characters matched
5. for i ← 1 to n              // scan S from left to right
6.     do while q > 0 and P[q + 1] ≠ T[i]
7.         do q ← Π[q]         // next character does not match
8.     if P[q + 1] = T[i]
9.         then q ← q + 1      // next character matches
10.    if q = m                // is all of p matched?
11.        then print "Pattern occurs with shift" i − m
12.            q ← Π[q]

1.9.4. RUNNING TIME ANALYSIS


The for loop beginning in step 5 runs 'n' times, i.e., as long as the length of the string 'S'. Since steps 1 to 4 take constant time, the running time is dominated by this for loop. Thus the running time of the matching function is O(n).
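Both components can be combined into a compact C++ sketch (our own, 0-indexed; the sample text below is reconstructed from the steps of the following example, since the book's figure for T is not reproduced here):

#include <iostream>
#include <string>
#include <vector>

// pi[q] = length of the longest proper prefix of P that is also a suffix of P[0..q].
std::vector<int> computePrefixFunction(const std::string& P) {
    int m = P.size();
    std::vector<int> pi(m, 0);
    int k = 0;
    for (int q = 1; q < m; q++) {
        while (k > 0 && P[k] != P[q])
            k = pi[k - 1];                 // fall back along the prefix function
        if (P[k] == P[q])
            k++;
        pi[q] = k;
    }
    return pi;
}

void kmpMatcher(const std::string& T, const std::string& P) {
    std::vector<int> pi = computePrefixFunction(P);
    int q = 0;                             // number of characters matched so far
    for (int i = 0; i < (int)T.size(); i++) {
        while (q > 0 && P[q] != T[i])
            q = pi[q - 1];                 // next character does not match
        if (P[q] == T[i])
            q++;                           // next character matches
        if (q == (int)P.size()) {          // all of P matched
            std::cout << "Pattern occurs with shift " << i - q + 1 << "\n";
            q = pi[q - 1];
        }
    }
}

int main() {
    // Text reconstructed from the worked example's steps; prints shift 6.
    kmpMatcher("bacbabababaca", "ababaca");
    return 0;
}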
Example: Given a string 'T' and pattern 'P' as follows:

Let us execute the KMP algorithm to find whether 'P' occurs in 'T'.
For 'p' the prefix function Π was computed previously and is as follows:

q  1  2  3  4  5  6  7
p  a  b  a  b  a  c  a
Π  0  0  1  2  3  0  1

Solution:
Initially: n = size of T = 15
m = size of P = 7

Step 1: i = 1, q = 0

Comparing P[1] with T[1]

P[1] does not match with T[1]. 'p' will be shifted one position to the right.
Step 2: i = 2, q = 0

Comparing P[1] with T[2]

P[1] matches T[2]. Since there is a match, p is not shifted

Step 3: i = 3, q = 1

Comparing P[2] with T[3] P[2] doesn’t match with T[3]



Backtracking on p, Comparing P[1] and T[3]

Step 4: i = 4, q = 0

Comparing P[1] with T[4] P[1] doesn’t match with T[4]

Step 5: i = 5, q = 0

Comparing P[1] with T[5] P[1] match with T[5]

Step 6: i = 6, q = 1

Comparing P[2] with T[6] P[2] matches with T[6]

Step 7: i = 7, q = 2

Comparing P[3] with T[7] P[3] matches with T[7]



Step 8: i = 8, q = 3

Comparing P[4] with T[8] P[4] matches with T[8]

Step 9: i = 9, q = 4

Comparing P[5] with T[9] P[5] matches with T[9]

Step 10: i = 10, q = 5

Comparing P[6] with T[10] P[6] doesn’t match with T[10]

Backtracking on p, comparing P[4] with T[10], because after the mismatch q = Π[5] = 3

Step 11: i = 11, q = 4

Comparing P[5] with T[11] P[5] match with T[11]

Step 12: i = 12, q = 5

Comparing P[6] with T[12] P[6] matches with T[12]



Step 13: i = 13, q = 6

Comparing P[7] with T[13]. P[7] matches with T[13].

Pattern 'P' has been found to completely occur in the string 'T'. The total number of shifts that took place for the match to be found is i − m = 13 − 7 = 6 shifts.

1.10. SORTING

A sorting algorithm is used to rearrange a given array or list of elements according to a comparison operator on the elements. The comparison operator is used to decide the new order of the elements in the respective data structure.
1.10.1. INSERTION SORT


This is an in-place comparison-based sorting algorithm. Here, a sub-list is
maintained which is always sorted. For example, the lower part of an array is
maintained to be sorted. An element which is to be 'insert'ed in this sorted sub-list,
has to find its appropriate place and then it has to be inserted there. Hence the
name, insertion sort.
The array is searched sequentially and unsorted items are moved and inserted into the sorted sub-list (in the same array). This algorithm is not suitable for large data sets, as its average and worst case complexity are of O(n²), where n is the number of items.
Below is an unsorted array for our example.

Insertion sort compares the first two elements.



It finds that both 14 and 33 are already in ascending order. For now, 14 is in
sorted sub-list.

Insertion sort moves ahead and compares 33 with 27.

And finds that 33 is not in the correct position.

It swaps 33 with 27. It also checks with all the elements of sorted sub-list. Here
we see that the sorted sub-list has only one element 14, and 27 is greater than 14.
Hence, the sorted sub-list remains sorted after swapping.

By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10.

These values are not in a sorted order.

So we swap them.

However, swapping makes 27 and 10 unsorted.

Hence, we swap them too.

Again we find 14 and 10 in an unsorted order.



We swap them again. By the end of third iteration, we have a sorted sub-list of 4
items.

This process goes on until all the unsorted values are covered in a sorted sub-list.

Algorithm
Step 1 − If it is the first element, it is already sorted.
Step 2 − Pick next element
Step 3 − Compare with all elements in the sorted sub-list
Step 4 − Shift all the elements in the sorted sub-list that is greater than the
value to be sorted
Step 5 − Insert the value
Step 6 − Repeat until list is sorted

Pseudocode
procedure insertionSort( A : array of items )
int holePosition
int valueToInsert
for i = 1 to length(A) inclusive do:
/* select value to be inserted */
valueToInsert = A[i]
holePosition = i
/*locate hole position for the element to be inserted */

while holePosition > 0 and A[holePosition-1] > valueToInsert do:


A[holePosition] = A[holePosition-1]
holePosition = holePosition -1

end while

/* insert the number at hole position */


A[holePosition] = valueToInsert
end for
end procedure
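A runnable C++ rendering of this pseudocode (our own sketch; the input array reuses the values 14, 33, 27 and 10 from the walkthrough above, plus a few extra elements):

#include <iostream>

void insertionSort(int A[], int n) {
    for (int i = 1; i < n; i++) {
        int valueToInsert = A[i];           // select the value to be inserted
        int holePosition = i;
        // shift larger elements of the sorted sub-list one step to the right
        while (holePosition > 0 && A[holePosition - 1] > valueToInsert) {
            A[holePosition] = A[holePosition - 1];
            holePosition--;
        }
        A[holePosition] = valueToInsert;    // insert the value at the hole position
    }
}

int main() {
    int A[] = {14, 33, 27, 10, 35, 19, 42, 44};
    int n = sizeof(A) / sizeof(A[0]);
    insertionSort(A, n);
    for (int i = 0; i < n; i++)
        std::cout << A[i] << " ";           // prints 10 14 19 27 33 35 42 44
    std::cout << "\n";
    return 0;
}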

1.10.2. HEAP SORT ALGORITHM


Heap sort processes the elements by creating the min-heap or max-heap using the
elements of the given array. Min-heap or max-heap represents the ordering of array
in which the root element represents the minimum or maximum element of the array.
Heap sort basically recursively performs two main operations -
 Build a heap H, using the elements of array.
 Repeatedly delete the root element of the heap formed in 1st phase.
Before knowing more about the heap sort, let's understand about Heap
A heap is a complete binary tree, and the binary tree is a tree in which the node
can have the utmost two children. A complete binary tree is a binary tree in which all
the levels except the last level, i.e., leaf node, should be completely filled, and all the
nodes should be left-justified
Heapsort is a popular and efficient sorting algorithm. The concept of heap sort is
to eliminate the elements one by one from the heap part of the list, and then insert
them into the sorted part of the list.
In heap sort, basically, there are two phases involved in the sorting of elements.
By using the heap sort algorithm, they are as follows -
 The first step includes the creation of a heap by adjusting the elements of
the array.
 After the creation of heap, now remove the root element of the heap
repeatedly by shifting it to the end of the array, and then store the heap
structure with the remaining elements.
Let's take an unsorted array and try to sort it using heap sort.

Fig. 1.8.
First, we have to construct a heap from the given array and convert it into max
heap.

Fig. 1.9.
After converting the given heap into max heap, the array elements are -

Fig. 1.10.
Next, we have to delete the root element (89) from the max heap. To delete this
node, we have to swap it with the last node, i.e. (11). After deleting the root element,
we again have to heapify it to convert it into max heap.

Fig. 1.11.

After swapping the array element 89 with 11, and converting the heap into max-
heap, the elements of array are -

Fig. 1.12.
In the next step, again, we have to delete the root element (81) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (54). After deleting the
root element, we again have to heapify it to convert it into max heap.

Fig. 1.13.
After swapping the array element 81 with 54 and converting the heap into max-
heap, the elements of array are -

Fig. 1.14.
In the next step, we have to delete the root element (76) from the max heap again.
To delete this node, we have to swap it with the last node, i.e. (9). After deleting the
root element, we again have to heapify it to convert it into max heap.

Fig. 1.15.

After swapping the array element 76 with 9 and converting the heap into max-
heap, the elements of array are -

Fig. 1.16.
In the next step, again we have to delete the root element (54) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (14). After deleting the
root element, we again have to heapify it to convert it into max heap.

Fig. 1.17.
After swapping the array element 54 with 14 and converting the heap into max-
heap, the elements of array are -

Fig. 1.18.
In the next step, again we have to delete the root element (22) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (11). After deleting the
root element, we again have to heapify it to convert it into max heap.

Fig. 1.19.

After swapping the array element 22 with 11 and converting the heap into max-
heap, the elements of array are -

Fig. 1.20.
In the next step, again we have to delete the root element (14) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (9). After deleting the
root element, we again have to heapify it to convert it into max heap.

Fig. 1.21.
After swapping the array element 14 with 9 and converting the heap into max-
heap, the elements of array are -

Fig. 1.22.
In the next step, again we have to delete the root element (11) from the max heap.
To delete this node, we have to swap it with the last node, i.e. (9). After deleting the
root element, we again have to heapify it to convert it into max heap.

Fig. 1.23.
After swapping the array element 11 with 9, the elements of array are -

Fig. 1.24.
Now, heap has only one element left. After deleting it, heap will be empty.

Fig. 1.25.
After completion of sorting, the array elements are -

Fig. 1.26.
Now, the array is completely sorted.

Algorithm
heapify(array, size)
Input − An array of data, and the total number in the array
Output − The max heap using an array element
Begin
for i := 1 to size do
node := i
par := floor (node / 2)
while par >= 1 do
if array[par] < array[node] then
swap array[par] with array[node]
node := par
par := floor (node / 2)
done
done
End
heapSort(array, size)
Input − An array of data, and the total number in the array
Output − The sorted array
Begin
   for i := size to 1 decrease by 1 do
      heapify(array, i)
      swap array[1] with array[i]
   done
End

Example

#include <iostream>
using namespace std;

void display(int *array, int size) {
   for (int i = 1; i <= size; i++)
      cout << array[i] << " ";
   cout << endl;
}

void heapify(int *array, int n) {
   // create a max heap on array[1..n] by sifting each node up towards the root
   for (int i = 1; i <= n; i++) {
      int node = i;
      int par = node / 2;
      while (par >= 1) {
         // if the node is bigger than its parent, swap them
         if (array[par] < array[node])
            swap(array[par], array[node]);
         node = par;
         par = node / 2;   // move up and check the next ancestor
      }
   }
}

void heapSort(int *array, int n) {
   for (int i = n; i >= 1; i--) {
      heapify(array, i);          // re-build the max heap on array[1..i] each time
      swap(array[1], array[i]);   // move the current maximum to the end
   }
}

int main() {
   int n;
   cout << "Enter the number of elements: ";
   cin >> n;
   int arr[n + 1];   // effective index starts from i = 1
   cout << "Enter elements:" << endl;
   for (int i = 1; i <= n; i++)
      cin >> arr[i];
   cout << "Array before Sorting: ";
   display(arr, n);
   heapSort(arr, n);
   cout << "Array after Sorting: ";
   display(arr, n);
   return 0;
}

Output
Enter the number of elements: 6
Enter elements:
30 8 99 11 24 39
Array before Sorting: 30 8 99 11 24 39
Array after Sorting: 8 11 24 30 39 99

TWO MARKS QUESTION AND ANSWERS (PART - A)

1. What is Algorithm analysis?


Analyzing an algorithm has come to mean predicting the resources that the
algorithm requires. Occasionally, resources such as memory, communication
band-width, or computer hardware are of primary concern, but most often it is
computational time that we want to measure.
2. Write a short note on Asymptotic Notations and its properties.
The main idea of asymptotic analysis is to have a measure of the efficiency of
algorithms that don’t depend on machine-specific constants and don’t require
algorithms to be implemented and time taken by programs to be compared.
Asymptotic notations are mathematical tools to represent the time complexity of
algorithms for asymptotic analysis.
There are mainly three asymptotic notations:
• Big-O Notation (O-notation)
• Omega Notation (Ω-notation)
• Theta Notation (Θ-notation)
3. What are the four methods for solving Recurrence?
• Substitution Method
• Iteration Method
• Recursion Tree Method
• Master Method

4. Write a short note on Binary search.


In binary search, we check the middle element against the key, if it is
greater we search the first half else we check the second half and repeat the
same process.
5. What are the different methods used for string matching?
• The Naive String Matching Algorithm
• The Rabin-Karp Algorithm
• Finite Automata
• The Knuth-Morris-Pratt Algorithm
• The Boyer-Moore Algorithm
6. Write an algorithm for Naïve String matching.
NAIVE-STRING-MATCHER (T, P)
1. n ← length[T]
2. m ← length[P]
3. for s ← 0 to n − m
4.     do if P[1..m] = T[s + 1..s + m]
5.         then print "Pattern occurs with shift" s
7. What is Rabin-Karp-Algorithm?
The Rabin-Karp string matching algorithm calculates a hash value for the
pattern, as well as for each M-character subsequences of text to be compared. If
the hash values are unequal, the algorithm will determine the hash value for next
M-character sequence. If the hash values are equal, the algorithm will analyze
the pattern and the M-character sequence. In this way, there is only one
comparison per text subsequence, and character matching is only required when
the hash values match.
8. Write about the Knuth-Morris-Pratt (KMP) Algorithm.
Knuth, Morris and Pratt introduced a linear-time algorithm for the string
matching problem. A matching time of O(n) is achieved by avoiding
comparison with an element of 'S' that has previously been involved in a
comparison with some element of the pattern 'p' to be matched, i.e., backtracking
on the string 'S' never occurs.

9. What is Sorting?
A Sorting Algorithm is used to rearrange a given array or list of elements
according to a comparison operator on the elements. The comparison operator is
used to decide the new order of elements in the respective data structure.
10. What is Insertion sort?
This is an in-place comparison-based sorting algorithm. Here, a sub-list is
maintained which is always sorted. For example, the lower part of an array is
maintained to be sorted. An element which is to be inserted in this sorted sub-list
has to find its appropriate place and then be inserted there. Hence
the name, insertion sort.
11. What is the Heap Sort Algorithm?
Heap sort processes the elements by creating the min-heap or max-heap using
the elements of the given array. Min-heap or max-heap represents the ordering
of array in which the root element represents the minimum or maximum element
of the array.

1. Explain in detail about Asymptotic Notations and its properties


2. Write in detail on Linear search with an example.
3. Explain briefly about Binary search and Interpolation search.
4. Discuss naïve string matching algorithm in detail
5. Write about Rabin-Karp algorithm
6. Explain in detail about Knuth-Morris-Pratt algorithm
7. Explain Insertion sort with an example.
8. Explain heap sort with an example.

*******************
UNIT II
GRAPH ALGORITHMS
Graph algorithms: Representations of graphs - Graph traversal: DFS – BFS -
applications - Connectivity, strong connectivity, bi-connectivity - Minimum
spanning tree: Kruskal’s and Prim’s algorithm- Shortest path: Bellman-Ford
algorithm - Dijkstra’s algorithm - Floyd-Warshall algorithm Network flow: Flow
networks - Ford-Fulkerson method – Matching: Maximum bipartite matching

A graph is a unique data structure in programming that consists of a finite set of
nodes or vertices and a set of edges that connect these vertices. Adjacent vertices are
those vertices that are connected to each other by the same edge. In simple terms, a
graph is a visual representation of vertices and edges sharing some connection or
relationship. Although there are plenty of graph algorithms that you might be
familiar with, only some of them are put to use. The reason for this is simple: the
standard graph algorithms are designed to solve millions of problems with just a few
lines of logically coded technique.

2.1.1. REPRESENTATIONS OF GRAPHS


There are two ways to store Graphs into the computer's memory:
 Sequential representation (or, Adjacency matrix representation)
 Linked list representation (or, Adjacency list representation)
In sequential representation, an adjacency matrix is used to store the graph.
Whereas in linked list representation, there is a use of an adjacency list to store the
graph.

Sequential Representation
In sequential representation, there is a use of an adjacency matrix to represent the
mapping between vertices and edges of the graph. We can use an adjacency matrix to
represent the undirected graph, directed graph, weighted directed graph, and
weighted undirected graph.
If adj[i][j] = w, it means that an edge exists from vertex i to vertex j with
weight w.
An entry aij in the adjacency matrix representation of an undirected graph G will
be 1 if an edge exists between Vi and Vj. If an undirected graph G consists of n
vertices, then the adjacency matrix for that graph is n × n, and the matrix A = [aij]
can be defined as -
aij = 1 {if there is an edge between Vi and Vj}
aij = 0 {otherwise}
It means that, in an adjacency matrix, 0 represents that no association exists
between the nodes, whereas 1 represents the existence of an edge between two
vertices.
If there is no self-loop present in the graph, the diagonal entries of
the adjacency matrix will be 0.
Now, let's see the adjacency matrix representation of an undirected graph.

Fig. 2.1.
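As an illustration, the following is a minimal C++ sketch that builds and prints the adjacency matrix of a small undirected graph; the vertex count and edge list here are assumed purely for demonstration.

#include <iostream>
using namespace std;

int main() {
    int n = 4;                                  // assumed number of vertices (0..3)
    int adj[4][4] = {0};                        // adjacency matrix, initially no edges
    int edges[4][2] = {{0,1}, {0,3}, {1,2}, {2,3}};   // assumed undirected edges
    for (auto &e : edges) {
        adj[e[0]][e[1]] = 1;                    // mark the edge in both directions,
        adj[e[1]][e[0]] = 1;                    // since the graph is undirected
    }
    for (int i = 0; i < n; i++) {               // print the n x n matrix
        for (int j = 0; j < n; j++)
            cout << adj[i][j] << " ";
        cout << endl;
    }
    return 0;
}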

Linked List Representation


An adjacency list is used in the linked representation to store the Graph in the
computer's memory. It is efficient in terms of storage as we only have to store the
values for edges.

Fig. 2.2.
Let's see the adjacency list representation of an undirected graph.
In the above figure, we can see that there is a linked list or adjacency list for every
node of the graph. From vertex A, there are paths to vertex B and vertex D. These
nodes are linked to node A in the given adjacency list.
An adjacency list is maintained for each node present in the graph, which stores
the node value and a pointer to the next adjacent node to the respective node. If all
the adjacent nodes are traversed, then store the NULL in the pointer field of the last
node of the list.
The sum of the lengths of adjacency lists is equal to twice the number of edges
present in an undirected graph.
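A minimal C++ sketch of the same idea, again with an assumed edge list: each undirected edge is pushed into the lists of both endpoints, which is why the total length of all lists is twice the number of edges.

#include <iostream>
#include <vector>
using namespace std;

int main() {
    int n = 4;                                  // assumed number of vertices
    vector<vector<int>> adj(n);                 // one adjacency list per vertex
    int edges[4][2] = {{0,1}, {0,3}, {1,2}, {2,3}};   // assumed undirected edges
    for (auto &e : edges) {
        adj[e[0]].push_back(e[1]);              // store the edge in the list of
        adj[e[1]].push_back(e[0]);              // each of its two endpoints
    }
    for (int v = 0; v < n; v++) {
        cout << v << ": ";
        for (int u : adj[v]) cout << u << " ";
        cout << endl;
    }
    return 0;
}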

Graph traversal means visiting every vertex and edge exactly once in a well-
defined order. While using certain graph algorithms, you must ensure that each
vertex of the graph is visited exactly once. The order in which the vertices are visited
is important and may depend upon the algorithm or question that you are solving.
During a traversal, it is important that you track which vertices have been visited.
The most common way of tracking vertices is to mark them.

2.2.1. DEPTH-FIRST SEARCH ALGORITHM


The depth-first search or DFS algorithm traverses or explores data structures, such
as trees and graphs. The algorithm starts at the root node (in the case of a graph, you

can use any random node as the root node) and examines each branch as far as
possible before backtracking.

Fig. 2.3.
When a dead-end occurs in any iteration, the Depth First Search (DFS) method
traverses the network in a depthward motion and uses a stack data structure to
remember the next vertex from which to start a search.
Consider the following graph as an example of how to use the DFS algorithm.

Fig. 2.4.
Step 1: Select vertex A as the source node and mark it as visited.
You should push vertex A to the top of the stack.

Fig. 2.5.
Step 2: Any nearby unvisited vertex of vertex A, say B, should be visited.
You should push vertex B to the top of the stack.

Fig. 2.6.
Step 3: Visit any adjacent unvisited vertex of vertex B, say C or D.
Imagine you have chosen vertex C, and you want to make C a visited
vertex.
Vertex C is pushed to the top of the stack.

Fig. 2.7.
Step 4: Among the nearby unvisited vertices of vertex C, select vertex D and
designate it as a visited vertex.
Vertex D is pushed to the top of the stack.

Fig. 2.8.
Step 5: Vertex E is the lone unvisited adjacent vertex of vertex D, thus
marking it as visited.

Fig. 2.9.
Vertex E should be pushed to the top of the stack.

Step 6: Vertex E's nearby vertices, namely vertices C and D, have been visited,
so pop vertex E from the stack.

Fig. 2.10.
Step 7: Now that all of vertex D's nearby vertices, namely vertex B and C,
have been visited, pop vertex D from the stack.

Fig. 2.11.
Step 8: Similarly, vertex C's adjacent vertices have already been visited;
therefore, pop it from the stack.

Fig. 2.12.
Step 9: There is no more unvisited adjacent vertex of B, thus pop it from the
stack.

Fig. 2.13.
Step 10: All of the nearby vertices of vertex A, namely B and C, have already been
visited, so pop vertex A from the stack as well.

Fig. 2.14.

DEPTH FIRST SEARCH ALGORITHM-


DFS (V, E)
for each vertex v in V[G]
do color[v] ← WHITE
π[v] ← NIL
time ← 0
for each vertex v in V[G]
do if color[v] = WHITE
then Depth_First_Search(v)

Depth_First_Search (v)

color[v] ← GRAY
time ← time + 1
d[v] ← time
for each vertex u adjacent to v
do if color[u] = WHITE
then π[u] ← v
Depth_First_Search(u)
color[v] ← BLACK
time ← time + 1
f[v] ← time
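The pseudocode above can be realised in C++ as the following minimal sketch; the colour array is folded into a single visited[] flag, and the 5-vertex edge list is assumed purely for demonstration.

#include <iostream>
#include <vector>
using namespace std;

vector<vector<int>> adj;   // adjacency list of the graph
vector<bool> visited;

void dfs(int v) {
    visited[v] = true;               // mark the vertex (GRAY/BLACK folded
    cout << v << " ";                // into a single visited flag here)
    for (int u : adj[v])
        if (!visited[u])             // recurse only into unvisited vertices
            dfs(u);
}

int main() {
    int n = 5;                       // assumed graph for illustration
    adj.assign(n, {});
    visited.assign(n, false);
    int edges[5][2] = {{0,1}, {0,2}, {1,3}, {2,3}, {3,4}};
    for (auto &e : edges) {
        adj[e[0]].push_back(e[1]);
        adj[e[1]].push_back(e[0]);
    }
    for (int v = 0; v < n; v++)      // covers disconnected components too
        if (!visited[v])
            dfs(v);
    return 0;
}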

2.2.2. BREADTH FIRST SEARCH (BFS)


There are many ways to traverse graphs. BFS is the most commonly used
approach.
BFS is a traversing algorithm where you should start traversing from a selected
node (source or starting node) and traverse the graph layerwise thus exploring the
neighbour nodes (nodes which are directly connected to source node). You must then
move towards the next-level neighbour nodes.
As the name BFS suggests, you are required to traverse the graph breadthwise as
follows:
1. First move horizontally and visit all the nodes of the current layer
2. Move to the next layer
Consider the following diagram.

Fig. 2.15.
The distance between the nodes in layer 1 is comparatively less than the distance
between the nodes in layer 2. Therefore, in BFS, you must traverse all the nodes in
layer 1 before you move to the nodes in layer 2.

Traversing child nodes


A graph can contain cycles, which may bring you to the same node again while
traversing the graph. To avoid processing of same node again, use a boolean array
which marks the node after it is processed. While visiting the nodes in the layer of a
graph, store them in a manner such that you can traverse the corresponding child
nodes in a similar order.
In the earlier diagram, start traversing from 0 and visit its child nodes 1, 2, and 3.
Store them in the order in which they are visited. This will allow you to visit the
child nodes of 1 first (i.e. 4 and 5), then of 2 (i.e. 6 and 7), and then of 3 (i.e. 7) etc.

To make this process easy, use a queue to store the node and mark it as 'visited'
until all its neighbours (vertices that are directly connected to it) are marked. The
queue follows the First In First Out (FIFO) queuing method, and therefore, the
neighbours of the node will be visited in the order in which they were inserted in the
queue, i.e. the node that was inserted first will be visited first, and so on.

TRAVERSING PROCESS

Fig. 2.16.
The traversing will start from the source node and push s in queue. s will be
marked as 'visited'.

First Iteration
 s will be popped from the queue
 Neighbors of s i.e. 1 and 2 will be traversed

Fig. 2.17.
 1 and 2, which have not been traversed earlier, are traversed. They will be:
o Pushed in the queue
o Marked as visited

Second iteration
 1 is popped from the queue
 Neighbors of 1 i.e. s and 3 are traversed
 s is ignored because it is marked as 'visited'
 3, which has not been traversed earlier, is traversed. It is:
o Pushed in the queue
o Marked as visited

Third iteration
 2 is popped from the queue
 Neighbors of 2 i.e. s, 3, and 4 are traversed
 3 and s are ignored because they are marked as 'visited'
 4, which has not been traversed earlier, is traversed. It is:
o Pushed in the queue
o Marked as visited

Fourth iteration
 3 is popped from the queue
 Neighbors of 3 i.e. 1, 2, and 5 are traversed
 1 and 2 are ignored because they are marked as 'visited'
 5, which has not been traversed earlier, is traversed. It is:
o Pushed in the queue
o Marked as visited

Fifth iteration
 4 will be popped from the queue
 Neighbors of 4 i.e. 2 is traversed
 2 is ignored because it is already marked as 'visited'

Sixth iteration
 5 is popped from the queue
 Neighbors of 5 i.e. 3 is traversed
 3 is ignored because it is already marked as 'visited'
The queue is empty and it comes out of the loop. All the nodes have been
traversed by using BFS.
If all the edges in a graph are of the same weight, then BFS can also be used to
find the minimum distance between the nodes in a graph.

Breadth First Search Algorithm-


BFS (V, E, s)
for each vertex v in V – {s}
do
color[v] ← WHITE
d[v] ← ∞
π[v] ← NIL
color[s] ← GREY
d[s] ← 0
π[s] ← NIL
Q ← { }
ENQUEUE (Q, s)
While Q is non-empty
do v ← DEQUEUE (Q)
for each u adjacent to v
do if color[u] = WHITE
then color[u] ← GREY
d[u] ← d[v] + 1
π[u] ← v
ENQUEUE (Q, u)
color[v] ← BLACK
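A minimal C++ sketch of the same procedure, with a 6-vertex edge list assumed for demonstration; a FIFO queue drives the traversal, and a dist[] array (with -1 standing for WHITE/infinity) plays the role of d[] and the colour markings.

#include <iostream>
#include <queue>
#include <vector>
using namespace std;

void bfs(const vector<vector<int>>& adj, int s) {
    vector<int> dist(adj.size(), -1);   // -1 means "not yet visited"
    queue<int> q;
    dist[s] = 0;
    q.push(s);
    while (!q.empty()) {
        int v = q.front(); q.pop();
        cout << v << " ";
        for (int u : adj[v])
            if (dist[u] == -1) {        // unvisited neighbour
                dist[u] = dist[v] + 1;  // one layer farther from s
                q.push(u);
            }
    }
}

int main() {
    int n = 6;                          // assumed graph for illustration
    vector<vector<int>> adj(n);
    int edges[6][2] = {{0,1}, {0,2}, {1,3}, {2,4}, {3,5}, {4,5}};
    for (auto &e : edges) {
        adj[e[0]].push_back(e[1]);
        adj[e[1]].push_back(e[0]);
    }
    bfs(adj, 0);                        // traverse layer by layer from vertex 0
    return 0;
}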

2.2.3. APPLICATIONS OF DFS AND BFS ALGORITHMS


The DFS or Depth First Search is used in different places. Some common uses are
 If we perform DFS on an unweighted graph, it produces the minimum
spanning tree and the all-pair shortest path tree.
 We can detect cycles in a graph using DFS. If we get one back-edge
during DFS, then there must be one cycle.
 Using DFS we can find a path between two given vertices u and v.
 Topological sorting is used for scheduling jobs from the given
dependencies among jobs. Topological sorting can be done using the DFS
algorithm.
 Using DFS, we can find the strongly connected components of a graph. If
there is a path from each vertex to every other vertex, the graph is strongly
connected.
Like DFS, the BFS (Breadth First Search) is also used in different situations.
These are like below −
 In peer-to-peer networks like BitTorrent, BFS is used to find all neighbour
nodes.
 Search engine crawlers use BFS to build their index. Starting from the
source page, a crawler finds all links in it to get new pages.
 In GPS navigation systems, BFS is used to find neighbouring places.


 In networking, when we want to broadcast some packets, we use the BFS
algorithm.
 Path finding algorithms are based on BFS or DFS.
 BFS is used in Ford-Fulkerson algorithm to find maximum flow in a
network.

2.2.4. STRONGLY CONNECTED COMPONENTS


A strongly connected component is the portion of a directed graph in which there
is a path from each vertex to another vertex. It is applicable only on a directed graph.
For example:
Let us take the graph below.

Fig. 2.18. Initial graph


The strongly connected components of the above graph are:

Fig. 2.19. Strongly connected components


You can observe that in the first strongly connected component, every vertex can
reach the other vertex through the directed path.
We can find all strongly connected components in O(V+E) time
using Kosaraju’s algorithm. Following is detailed Kosaraju’s algorithm.
1. Create an empty stack ‘S’ and do DFS traversal of a graph. In DFS
traversal, after calling recursive DFS for adjacent vertices of a vertex, push

the vertex to stack. In the above graph, if we start DFS from vertex 0, we
get vertices in stack as 1, 2, 4, 3, 0.
2. Reverse directions of all arcs to obtain the transpose graph.
3. One by one pop a vertex from S while S is not empty. Let the popped
vertex be ‘v’. Take v as source and do DFS (call DFSUtil(v)). The DFS
starting from v prints strongly connected component of v. In the above
example, we process vertices in order 0, 3, 4, 2, 1 (One by one popped
from stack).
How does this work?
The above algorithm is DFS based. It does DFS two times. DFS of a graph
produces a single tree if all vertices are reachable from the DFS starting point.
Otherwise DFS produces a forest. So DFS of a graph with only one SCC always
produces a tree. The important point to note is that DFS may produce a tree or a forest
when there is more than one SCC, depending upon the chosen starting point. For
example, in the above diagram, if we start DFS from vertices 0 or 1 or 2, we get a
tree as output. And if we start from 3 or 4, we get a forest. To find and print all
SCCs, we would want to start DFS from vertex 4 (which is a sink vertex), then move
to 3 which is sink in the remaining set (set excluding 4) and finally any of the
remaining vertices (0, 1, 2).
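The steps above can be put together in a short C++ sketch of Kosaraju's algorithm. The edge list below is assumed to match the kind of small directed graph in Fig. 2.18; it is given only for demonstration.

#include <iostream>
#include <stack>
#include <vector>
using namespace std;

vector<vector<int>> g, gt;     // graph and its transpose (reversed arcs)
vector<bool> visited;

void fillOrder(int v, stack<int>& st) {
    visited[v] = true;
    for (int u : g[v]) if (!visited[u]) fillOrder(u, st);
    st.push(v);                // push after exploring all descendants
}

void dfsPrint(int v) {         // DFS on the transpose prints one SCC
    visited[v] = true;
    cout << v << " ";
    for (int u : gt[v]) if (!visited[u]) dfsPrint(u);
}

int main() {
    int n = 5;                 // assumed 5-vertex directed graph
    g.assign(n, {}); gt.assign(n, {});
    int edges[5][2] = {{1,0}, {0,3}, {3,4}, {0,2}, {2,1}};
    for (auto &e : edges) {
        g[e[0]].push_back(e[1]);
        gt[e[1]].push_back(e[0]);   // reversed edge for the transpose
    }
    stack<int> st;
    visited.assign(n, false);
    for (int v = 0; v < n; v++) if (!visited[v]) fillOrder(v, st);
    visited.assign(n, false);
    while (!st.empty()) {           // pop vertices in decreasing finish time
        int v = st.top(); st.pop();
        if (!visited[v]) { dfsPrint(v); cout << endl; }   // one SCC per line
    }
    return 0;
}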

2.2.5. BICONNECTIVITY
An undirected graph is said to be a biconnected graph if there are two vertex-disjoint
paths between any two of its vertices. In other words, we can say that
there is a cycle between any two vertices.

Fig. 2.20. Bi- connected


We can say that a graph G is a bi-connected graph if it is connected and no
articulation points or cut vertices are present in the graph.
To solve this problem, we will use DFS traversal. Using DFS, we will try to
find whether any articulation point is present. We also check whether all
vertices are visited by the DFS; if not, we can say that the graph is not
connected.
Here's the pseudo code:
time = 0
function isBiconnected(vertex, adj[][], low[], disc[], parent[], visited[], V)
disc[vertex] = low[vertex] = time + 1
time = time + 1
visited[vertex] = true
child = 0
for i = 0 to V
if adj[vertex][i] == true
if visited[i] == false
child = child + 1
parent[i] = vertex
result = isBiconnected(i, adj, low, disc, parent, visited, V)
if result == false
return false
low[vertex] = minimum(low[vertex], low[i])
if parent[vertex] == nil AND child > 1
return false
if parent[vertex] != nil AND low[i] >= disc[vertex]
return false
else if parent[vertex] != i
low[vertex] = minimum(disc[i], low[vertex])
return true

Given an undirected and connected graph G = (V, E), a spanning tree of the
graph G is a tree that spans G (that is, it includes every vertex of G) and is a
subgraph of G (every edge in the tree belongs to G).

2.3.1. MINIMUM SPANNING TREE


The cost of the spanning tree is the sum of the weights of all the edges in the tree.
There can be many spanning trees. Minimum spanning tree is the spanning tree
where the cost is minimum among all the spanning trees. There also can be many
minimum spanning trees.
Minimum spanning tree has direct application in the design of networks. It is used
in algorithms approximating the travelling salesman problem, multi-terminal
minimum cut problem and minimum-cost weighted perfect matching. Other practical
applications are:
1. Cluster Analysis
2. Handwriting recognition
3. Image segmentation

Fig. 2.21.
There are two famous algorithms for finding the Minimum Spanning Tree:

2.3.2. KRUSKAL’S ALGORITHM


Kruskal’s Algorithm builds the spanning tree by adding edges one by one into a
growing spanning tree. Kruskal's algorithm follows greedy approach as in each
iteration it finds an edge which has least weight and add it to the growing spanning
tree.

Algorithm Steps:
 Sort the graph edges with respect to their weights.
 Start adding edges to the MST from the edge with the smallest weight until
the edge of the largest weight.

 Only add edges which do not form a cycle, i.e., edges which connect only
disconnected components.
Consider following example:

Fig. 2.22. Kruskal’s Algorithm


In Kruskal’s algorithm, at each iteration we will select the edge with the lowest
weight. So, we will start with the lowest weighted edge first, i.e., the edges with
weight 1. After that we will select the second lowest weighted edge i.e., edge with
weight 2. Notice these two edges are totally disjoint. Now, the next edge will be the
third lowest weighted edge i.e., edge with weight 3, which connects the two disjoint
pieces of the graph. Now, we are not allowed to pick the edge with weight 4, that
will create a cycle and we can’t have any cycles. So we will select the fifth lowest

weighted edge i.e., edge with weight 5. Now the other two edges will create cycles so
we will ignore them. In the end, we end up with a minimum spanning tree with total
cost 11 ( = 1 + 2 + 3 + 5).

Time Complexity:
In Kruskal’s algorithm, the most time consuming operation is sorting the edges,
which takes O(E log E) time. The total complexity of the Disjoint-Set operations is
O(E log V), so the overall time complexity of the algorithm is O(E log V), which is
the same as O(E log E).
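A compact C++ sketch of Kruskal's algorithm with a Disjoint-Set (union-find) structure; the weighted edge list is assumed purely for demonstration.

#include <algorithm>
#include <array>
#include <iostream>
#include <numeric>
#include <vector>
using namespace std;

// Disjoint-set with path compression; find() returns the component root.
vector<int> parent;
int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }

int main() {
    int n = 4;                                  // assumed graph for illustration
    // {weight, u, v} triples, one per undirected edge
    vector<array<int,3>> edges = {{1,0,1}, {2,2,3}, {3,1,2}, {4,0,2}, {5,1,3}};
    sort(edges.begin(), edges.end());           // step 1: sort by weight
    parent.resize(n);
    iota(parent.begin(), parent.end(), 0);      // each vertex starts in its own set
    int cost = 0;
    for (auto &e : edges) {
        int ru = find(e[1]), rv = find(e[2]);
        if (ru != rv) {                          // endpoints in different components:
            parent[ru] = rv;                     // the edge cannot form a cycle, so
            cost += e[0];                        // it joins the growing spanning tree
            cout << e[1] << " - " << e[2] << endl;
        }                                        // otherwise skip: it would form a cycle
    }
    cout << "MST cost = " << cost << endl;
    return 0;
}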

2.3.3. PRIM’S ALGORITHM

Prim’s Algorithm also uses the Greedy approach to find the minimum spanning
tree. In Prim’s Algorithm we grow the spanning tree from a starting position. Unlike
Kruskal's, where we add an edge, in Prim's we add a vertex to the growing spanning tree.

Algorithm Steps:
 Maintain two disjoint sets of vertices. One containing vertices that are in
the growing spanning tree and other that are not in the growing spanning
tree.
 Select the cheapest vertex that is connected to the growing spanning tree
and is not in the growing spanning tree and add it into the growing
spanning tree. This can be done using Priority Queues. Insert the vertices,
that are connected to growing spanning tree, into the Priority Queue.
 Check for cycles. To do that, mark the nodes which have been already
selected and insert only those nodes in the Priority Queue that are not
marked.
In Prim’s Algorithm, we will start with an arbitrary node (it doesn’t matter which
one) and mark it. In each iteration we will mark a new vertex that is adjacent to the
one that we have already marked. As a greedy algorithm, Prim’s algorithm will select
the cheapest edge and mark the vertex. So we will simply choose the edge with
weight 1. In the next iteration we have three options, edges with weight 2, 3 and 4.
So, we will select the edge with weight 2 and mark the vertex. Now again we have
three options, edges with weight 3, 4 and 5. But we can’t choose edge with weight 3

as it is creating a cycle. So we will select the edge with weight 4 and we end up with
the minimum spanning tree of total cost 7 ( = 1 + 2 + 4).
Consider the example below:

Fig. 2.23. Prim’s algorithm

Time Complexity:
The time complexity of the Prim’s Algorithm is O((V + E)logV) because each
edge is inserted in the priority queue only once and insertion in priority queue take
logarithmic time.
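A minimal C++ (C++17) sketch of Prim's algorithm using a priority queue; the weighted edge list is assumed for demonstration.

#include <iostream>
#include <queue>
#include <vector>
using namespace std;

int main() {
    int n = 5;   // assumed weighted undirected graph for illustration
    vector<vector<pair<int,int>>> adj(n);       // lists of {neighbour, weight}
    int edges[6][3] = {{0,1,2}, {0,3,6}, {1,2,3}, {1,3,8}, {1,4,5}, {2,4,7}};
    for (auto &e : edges) {
        adj[e[0]].push_back({e[1], e[2]});
        adj[e[1]].push_back({e[0], e[2]});
    }
    // min-priority queue of {edge weight, vertex}
    priority_queue<pair<int,int>, vector<pair<int,int>>, greater<>> pq;
    vector<bool> inMST(n, false);
    int cost = 0;
    pq.push({0, 0});                     // start from an arbitrary vertex, here 0
    while (!pq.empty()) {
        auto [w, v] = pq.top(); pq.pop();
        if (inMST[v]) continue;          // already marked: skip (this avoids cycles)
        inMST[v] = true;                 // mark the vertex as part of the tree
        cost += w;
        for (auto [u, wu] : adj[v])
            if (!inMST[u]) pq.push({wu, u});   // candidate edges to unmarked vertices
    }
    cout << "MST cost = " << cost << endl;
    return 0;
}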

Shortest path algorithms are a family of algorithms designed to solve the shortest
path problem. The shortest path problem is something most people have some
intuitive familiarity with: given two points, A and B, what is the shortest path
between them? In computer science, however, the shortest path problem can take
different forms and so different algorithms are needed to be able to solve them all.
Applications-
Shortest path algorithms have a wide range of applications such as in-
 Google Maps
 Road Networks
 Logistics Research

2.4.1. BELLMAN-FORD ALGORITHM


Bellman Ford algorithm works by overestimating the length of the path from the
starting vertex to all other vertices. Then it iteratively relaxes those estimates by
finding new paths that are shorter than the previously overestimated paths.

Step 1: Start with the weighted graph

Fig. 2.24.

Step 2: Choose a starting vertex and assign infinity path values to all other
vertices.

Fig. 2.25.

Step 3: Visit each edge and relax the path distances if they are inaccurate.

Fig. 2.26.

Step 4: We need to do this V – 1 times because, in the worst case, a vertex’s
path length might need to be readjusted V – 1 times.

Fig. 2.27.

Step 5: Notice how the vertex at the top right corner had its path length
adjusted.

Fig. 2.28.

Step 6: After all the vertices have their path lengths, we check if a negative
cycle is present.

A B C D E
0 ∞ ∞ ∞ ∞
0 4 2 ∞ ∞
0 3 2 6 6
0 3 2 1 6
0 3 2 1 6
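A minimal C++ sketch of the Bellman-Ford relaxation. The edge list below is an assumed one (vertices A..E mapped to 0..4), chosen so that the final distances match the last row of the table above; it is not taken from the figures.

#include <iostream>
#include <vector>
using namespace std;

struct Edge { int u, v, w; };

int main() {
    const int INF = 1e9;
    int n = 5;   // vertices A..E mapped to 0..4 (assumed edge list)
    vector<Edge> edges = {{0,1,4}, {0,2,2}, {1,2,3}, {2,1,1},
                          {1,3,2}, {1,4,3}, {2,3,4}, {2,4,5}, {4,3,-5}};
    vector<int> dist(n, INF);
    dist[0] = 0;                            // A is the starting vertex
    for (int i = 1; i <= n - 1; i++)        // relax every edge V - 1 times
        for (auto &e : edges)
            if (dist[e.u] != INF && dist[e.u] + e.w < dist[e.v])
                dist[e.v] = dist[e.u] + e.w;
    for (auto &e : edges)                   // one extra pass: any further
        if (dist[e.u] != INF && dist[e.u] + e.w < dist[e.v])
            cout << "Negative cycle detected" << endl;
    for (int v = 0; v < n; v++) cout << dist[v] << " ";   // prints 0 3 2 1 6
    cout << endl;
    return 0;
}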

2.4.2. DIJKSTRA'S ALGORITHM


Dijkstra algorithm is a single-source shortest path algorithm. Here, single-source
means that only one source is given, and we have to find the shortest path from the
source to all the nodes.
Let's understand the working of Dijkstra's algorithm. Consider the below
graph.

Fig. 2.29.

First, we have to consider any vertex as a source vertex. Suppose we consider


vertex 0 as a source vertex.
Here we assume that 0 as a source vertex, and distance to all the other vertices is
infinity. Initially, we do not know the distances. First, we will find out the vertices
which are directly connected to the vertex 0. As we can observe in the above graph
that two vertices are directly connected to vertex 0.

Fig. 2.30.
Let's assume that the vertex 0 is represented by 'x' and the vertex 1 is represented
by 'y'. The distance between the vertices can be calculated by using the below
formula:
d(x, y) = d(x) + c(x, y) < d(y)
= (0 + 4) < ∞
= 4 < ∞
Since 4 < ∞, we will update d(y) from ∞ to 4.
Therefore, we come to the conclusion that the formula for calculating the distance
between the vertices is:
if (d(u) + c(u, v) < d(v))
d(v) = d(u) + c(u, v)
Now we consider vertex 0 same as 'x' and vertex 4 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (0 + 8) < ∞
= 8 < ∞
Therefore, the value of d(y) is 8. We replace the infinity value of vertices 1 and 4
with the values 4 and 8 respectively. Now, we have found the shortest path from the

vertex 0 to 1 and 0 to 4. Therefore, vertex 0 is selected. Now, we will compare all the
vertices except the vertex 0. Since vertex 1 has the lowest value, i.e., 4; therefore,
vertex 1 is selected.
Since vertex 1 is selected, so we consider the path from 1 to 2, and 1 to 4. We will
not consider the path from 1 to 0 as the vertex 0 is already selected.
First, we calculate the distance between the vertex 1 and 2. Consider the vertex 1
as 'x', and the vertex 2 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (4 + 8) < ∞
= 12 < ∞
Since 12 < ∞ so we will update d(2) from ∞ to 12.
Now, we calculate the distance between the vertex 1 and vertex 4. Consider the
vertex 1 as 'x' and the vertex 4 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (4 + 11) < 8
= 15 < 8
Since 15 is not less than 8, we will not update the value d(4); it remains 8.
Till now, two nodes have been selected, i.e., 0 and 1. Now we have to compare
the nodes except the node 0 and 1. The node 4 has the minimum distance, i.e., 8.
Therefore, vertex 4 is selected.
Since vertex 4 is selected, so we will consider all the direct paths from the vertex
4. The direct paths from vertex 4 are 4 to 0, 4 to 1, 4 to 8, and 4 to 5. Since the
vertices 0 and 1 have already been selected so we will not consider the vertices 0 and
1. We will consider only two vertices, i.e., 8 and 5.
First, we consider the vertex 8. First, we calculate the distance between the vertex
4 and 8. Consider the vertex 4 as 'x', and the vertex 8 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (8 + 7) < ∞
= 15 < ∞
Since 15 is less than the infinity so we update d(8) from infinity to 15.

Now, we consider the vertex 5. First, we calculate the distance between the vertex
4 and 5. Consider the vertex 4 as 'x', and the vertex 5 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (8 + 1) < ∞
= 9 < ∞
Since 9 is less than infinity, we update d(5) from infinity to 9.
Till now, three nodes have been selected, i.e., 0, 1, and 4. Now we have to
compare the nodes except the nodes 0, 1 and 4. The node 5 has the minimum value,
i.e., 9. Therefore, vertex 5 is selected.
Since the vertex 5 is selected, so we will consider all the direct paths from vertex
5. The direct paths from vertex 5 are 5 to 8, and 5 to 6.
First, we consider the vertex 8. First, we calculate the distance between the vertex
5 and 8. Consider the vertex 5 as 'x', and the vertex 8 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (9 + 15) < 15
= 24 < 15
Since 24 is not less than 15 so we will not update the value d(8) from 15 to 24.
Now, we consider the vertex 6. First, we calculate the distance between the vertex
5 and 6. Consider the vertex 5 as 'x', and the vertex 6 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (9 + 2) < ∞
= 11 < ∞
Since 11 is less than infinity, we update d(6) from infinity to 11.
Till now, nodes 0, 1, 4 and 5 have been selected. We will compare the nodes
except the selected nodes. The node 6 has the lowest value as compared to other
nodes. Therefore, vertex 6 is selected.
Since vertex 6 is selected, we consider all the direct paths from vertex 6. The
direct paths from vertex 6 are 6 to 2, 6 to 3, and 6 to 7.
First, we consider the vertex 2. Consider the vertex 6 as 'x', and the vertex 2 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)

= (11 + 4) < 12
= 15 < 12
Since 15 is not less than 12, we will not update d(2) from 12 to 15
Now we consider the vertex 3. Consider the vertex 6 as 'x', and the vertex 3 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (11 + 14) < ∞
= 25 < ∞
Since 25 is less than ∞, so we will update d(3) from ∞ to 25.
Now we consider the vertex 7. Consider the vertex 6 as 'x', and the vertex 7 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (11 + 10) < ∞
= 21 < ∞
Since 21 is less than ∞, we will update d(7) from ∞ to 21.
Till now, nodes 0, 1, 4, 5, and 6 have been selected. Now we have to compare all
the unvisited nodes, i.e., 2, 3, 7, and 8. Since node 2 has the minimum value, i.e., 12
among all the other unvisited nodes. Therefore, node 2 is selected.
Since node 2 is selected, so we consider all the direct paths from node 2. The
direct paths from node 2 are 2 to 8, 2 to 6, and 2 to 3.
First, we consider the vertex 8. Consider the vertex 2 as 'x' and 8 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (12 + 2) < 15
= 14 < 15
Since 14 is less than 15, we will update d(8) from 15 to 14.
Now, we consider the vertex 6. Consider the vertex 2 as 'x' and 6 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (12 + 4) < 11
= 16 < 11
Since 16 is not less than 11 so we will not update d(6) from 11 to 16.

Now, we consider the vertex 3. Consider the vertex 2 as 'x' and 3 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (12 + 7) < 25
= 19 < 25
Since 19 is less than 25, we will update d(3) from 25 to 19.
Till now, nodes 0, 1, 2, 4, 5, and 6 have been selected. We compare all the
unvisited nodes, i.e., 3, 7, and 8. Among nodes 3, 7, and 8, node 8 has the minimum
value. The nodes which are directly connected to node 8 are 2, 4, and 5. Since all the
directly connected nodes are selected so we will not consider any node for the
updation.
The unvisited nodes are 3 and 7. Among the nodes 3 and 7, node 3 has the
minimum value, i.e., 19. Therefore, the node 3 is selected. The nodes which are
directly connected to the node 3 are 2, 6, and 7. Since the nodes 2 and 6 have already
been selected, we will not consider them; we consider only node 7.
Now, we consider the vertex 7. Consider the vertex 3 as 'x' and 7 as 'y'.
d(x, y) = d(x) + c(x, y) < d(y)
= (19 + 9) < 21
= 28 < 21
Since 28 is not less than 21, we will not update d(7); it remains 21.
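Putting the relaxation rule to work, the following C++ (C++17) sketch runs Dijkstra's algorithm on the 9-vertex graph of the walkthrough above, with the edge weights read off the calculations.

#include <iostream>
#include <queue>
#include <vector>
using namespace std;

int main() {
    const int INF = 1e9;
    int n = 9;   // the 9-vertex graph of the walkthrough above
    vector<vector<pair<int,int>>> adj(n);
    int E[14][3] = {{0,1,4},{0,4,8},{1,2,8},{1,4,11},{4,8,7},{4,5,1},
                    {5,8,15},{5,6,2},{2,8,2},{2,6,4},{2,3,7},{6,3,14},
                    {3,7,9},{6,7,10}};
    for (auto &e : E) {
        adj[e[0]].push_back({e[1], e[2]});
        adj[e[1]].push_back({e[0], e[2]});
    }
    vector<int> d(n, INF);
    priority_queue<pair<int,int>, vector<pair<int,int>>, greater<>> pq;
    d[0] = 0;                                // vertex 0 is the source
    pq.push({0, 0});
    while (!pq.empty()) {
        auto [du, u] = pq.top(); pq.pop();
        if (du > d[u]) continue;             // stale queue entry: u already settled
        for (auto [v, w] : adj[u])
            if (d[u] + w < d[v]) {           // the relaxation rule derived above
                d[v] = d[u] + w;
                pq.push({d[v], v});
            }
    }
    for (int v = 0; v < n; v++)              // d(7) = 21, d(8) = 14, etc.
        cout << "d(" << v << ") = " << d[v] << endl;
    return 0;
}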

2.4.3. FLOYD WARSHALL ALGORITHM


 Floyd Warshall Algorithm is a famous algorithm.
 It is used to solve All Pairs Shortest Path Problem.
 It computes the shortest path between every pair of vertices of the given
graph.
 Floyd Warshall Algorithm is an example of dynamic programming
approach.
Floyd Warshall Algorithm has the following main advantages-
 It is extremely simple.
 It is easy to implement.

Time Complexity-
 Floyd Warshall Algorithm consists of three loops over all the nodes.
 The inner most loop consists of only constant complexity operations.
 Hence, the asymptotic complexity of Floyd Warshall algorithm is O(n3).
 Here, n is the number of nodes in the given graph.

Example problem
Consider the following directed weighted graph-

Fig. 2.31.
Using Floyd Warshall Algorithm, find the shortest path distance between every
pair of vertices.
Solution:

Step-01:
Remove all the self loops and parallel edges (keeping the lowest weight edge)
from the graph.
In the given graph, there are neither self edges nor parallel edges.

Step-02:
Write the initial distance matrix.
 It represents the distance between every pair of vertices in the form of
given weights.
 For diagonal elements (representing self-loops), distance value = 0.
 For vertices having a direct edge between them, distance value = weight of
that edge.
 For vertices having no direct edge between them, distance value = ∞.

Initial distance matrix for the given graph is-

Step-03:
Using Floyd Warshall Algorithm, write the following 4 matrices-

The last matrix D4 represents the shortest path distance between every pair of
vertices.
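Since the distance matrices above are shown as figures, the following minimal C++ sketch (with an assumed initial distance matrix, not the one from the figures) illustrates the three nested loops of the Floyd Warshall algorithm.

#include <iostream>
#include <vector>
using namespace std;

int main() {
    const int INF = 1e8;              // marker for "no direct edge"
    int n = 4;                        // assumed directed weighted graph
    vector<vector<int>> d = {
        {0,   3,   INF, 7},
        {8,   0,   2,   INF},
        {5,   INF, 0,   1},
        {2,   INF, INF, 0}};
    // three nested loops over all nodes: the via-vertex k is outermost
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (d[i][k] + d[k][j] < d[i][j])
                    d[i][j] = d[i][k] + d[k][j];   // shorter path through k
    for (int i = 0; i < n; i++) {     // the final matrix holds all-pairs distances
        for (int j = 0; j < n; j++) cout << d[i][j] << " ";
        cout << endl;
    }
    return 0;
}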

In graph theory, a flow network is defined as a directed graph G = (V, E)
constrained with a function c, which bounds each edge e with a non-negative
integer value known as the capacity of the edge e, together with two additional vertices
defined as the source S and the sink T.
As shown in the flow network given below, a source vertex has all outgoing edges
and no incoming edges, more formally we can say In_degree[source]= 0 and sink
vertex has all incoming edges and no outgoing edge more
formally out_degree[sink]= 0.

Fig. 2.32.
Any flow network should satisfy all the underlying conditions --
 For all the vertices (except the source and the sink vertex), input flow must
be equal to output flow.
 For any given edge Ei in the flow network, 0 ≤ flow(Ei) ≤ capacity(Ei)
must hold; we cannot send more flow through an edge than its capacity.
 Total outflow from the source vertex must be equal to total inflow to the
sink vertex.

Ford-Fulkerson Algorithm for Maximum Flow Problem


Given a graph which represents a flow network where every edge has a capacity.
Also given two vertices source ‘s’ and sink ‘t’ in the graph, find the maximum
possible flow from s to t with following constraints:
 Flow on an edge doesn’t exceed the given capacity of the edge.
 Incoming flow is equal to outgoing flow for every vertex except s and t.
 For example, consider the following graph from CLRS book.

Fig. 2.33.
The maximum possible flow in the above graph is 23.

Fig. 2.34.

The following is simple idea of Ford-Fulkerson algorithm:


 Start with initial flow as 0.
 While there is an augmenting path from source to sink:
 Add this path-flow to flow.
 Return flow.
Time Complexity: Time complexity of the above algorithm is O(max_flow * E).
We run a loop while there is an augmenting path. In worst case, we may add 1 unit
flow in every iteration. Therefore the time complexity becomes O(max_flow * E).

 Residual Graph of a flow network is a graph which indicates additional


possible flow.
 If there is a path from source to sink in residual graph, then it is possible to
add flow. Every edge of a residual graph has a value called residual
capacity which is equal to original capacity of the edge minus current
flow. Residual capacity is basically the current capacity of the edge.
 Residual capacity is 0 if there is no edge between two vertices of residual
graph.
 We can initialize the residual graph as original graph as there is no initial
flow and initially residual capacity is equal to original capacity.
 To find an augmenting path, we can either do a BFS or DFS of the residual
graph.
 Using BFS, we can find out if there is a path from source to sink. BFS also
builds parent[] array. Using the parent[] array, we traverse through the
found path and find possible flow through this path by finding minimum
residual capacity along the path.
 We later add the found path flow to overall flow.
 The important thing is, we need to update residual capacities in the
residual graph.
 We subtract path flow from all edges along the path and we add path
flow along the reverse edges
 We need to add path flow along reverse edges because we may later need to
send flow in the reverse direction.
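The following C++ sketch combines these ideas: BFS on the residual capacities finds an augmenting path and builds the parent[] array, the bottleneck residual capacity along the path is added to the overall flow, and the forward and reverse residual capacities are updated. The capacities are those of the CLRS network in Fig. 2.33 (s = 0, t = 5).

#include <algorithm>
#include <cstring>
#include <iostream>
#include <queue>
using namespace std;

const int N = 6;
int cap[N][N];            // residual capacities between every pair of vertices
int par[N];               // parent[] array built by BFS

// BFS on the residual graph: is there an augmenting path s -> t?
bool bfs(int s, int t) {
    memset(par, -1, sizeof(par));
    par[s] = s;
    queue<int> q;
    q.push(s);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int v = 0; v < N; v++)
            if (par[v] == -1 && cap[u][v] > 0) {  // spare residual capacity
                par[v] = u;
                q.push(v);
            }
    }
    return par[t] != -1;
}

int main() {
    int s = 0, t = 5;     // the CLRS network of Fig. 2.33
    int edges[9][3] = {{0,1,16},{0,2,13},{1,3,12},{2,1,4},{3,2,9},
                       {2,4,14},{4,3,7},{3,5,20},{4,5,4}};
    for (auto &e : edges) cap[e[0]][e[1]] = e[2];
    int flow = 0;
    while (bfs(s, t)) {                         // while an augmenting path exists
        int pathFlow = 1e9;
        for (int v = t; v != s; v = par[v])     // bottleneck capacity on the path
            pathFlow = min(pathFlow, cap[par[v]][v]);
        for (int v = t; v != s; v = par[v]) {
            cap[par[v]][v] -= pathFlow;         // subtract along forward edges
            cap[v][par[v]] += pathFlow;         // add along reverse edges
        }
        flow += pathFlow;                       // add the path flow to overall flow
    }
    cout << "Maximum flow = " << flow << endl;  // prints 23 for this network
    return 0;
}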

A Bipartite Graph is a graph whose vertices can be divided into two independent
sets L and R such that every edge (u, v) either connects a vertex from L to R or a
vertex from R to L. In other words, for every edge (u, v), either u ∈ L and v ∈ R, or
u ∈ R and v ∈ L. We can also say that no edge exists that connects vertices of the same set.

Fig. 2.35.

Matching is a Bipartite Graph is a set of edges chosen in such a way that no two
edges share an endpoint. Given an undirected Graph G = (V, E), a Matching is a
subset of edge M ⊆ E such that for all vertices v ∈ V, at most one edge of M is
incident on v.
A Maximum matching is a matching of maximum cardinality, that is, a matching
M such that for any matching M′, we have |M| ≥ |M′|.

Finding a maximum bipartite matching


We can use the Ford-Fulkerson method to find a maximum matching in an
undirected bipartite graph G = (V, E) in time polynomial in |V| and |E|. The trick is to
construct a flow network G′ = (V′, E′) for the bipartite graph G as follows. We let the
source s and sink t be new vertices not in V, and we let V′ = V ∪ {s, t}. If the vertex
partition of G is V = L ∪ R, the directed edges of G′ are the edges of E, directed from
L to R, along with |V| new directed edges:

E′ = {(s, u) : u ∈ L} ∪ {(u, v) : (u, v) ∈ E} ∪ {(v, t) : v ∈ R}

Fig: A Bipartite Graph G = (V, E) with vertex partition V = L ∪ R.



Fig. 2.36. (a) Matching with cardinality 2, (b) Matching with cardinality 3, (c) The
corresponding flow network G′ with a maximum flow shown. Each edge has unit
capacity. The shaded edges from L to R correspond to those in the maximum matching
from (b)

1. What are graph algorithms?


A graph is a unique data structure in programming that consists of finite
sets of nodes or vertices and a set of edges that connect these vertices to them.
At this moment, adjacent vertices can be called those vertices that are connected
to the same edge with each other.
2. What are the 2 types of graphs representations?
 Sequential representation (or, Adjacency matrix representation)
 Linked list representation (or, Adjacency list representation)
3. Write a short note on Depth-First Search Algorithm.
The depth-first search or DFS algorithm traverses or explores data
structures, such as trees and graphs. The algorithm starts at the root node (in the
case of a graph, you can use any random node as the root node) and examines
each branch as far as possible before backtracking.
4. What is Breadth First Searching?
BFS is a traversing algorithm where you should start traversing from a
selected node (source or starting node) and traverse the graph layerwise thus

exploring the neighbour nodes (nodes which are directly connected to source
node). You must then move towards the next-level neighbour nodes.
5. What is Strongly Connected Components?
A strongly connected component is the portion of a directed graph in which
there is a path from each vertex to another vertex. It is applicable only on a
directed graph.
6. What is biconnectivity?
An undirected graph is said to be a biconnected graph if there are two vertex-disjoint
paths between any two of its vertices. In other words, we can say
that there is a cycle between any two vertices.
7. What is Minimum spanning tree?
Minimum spanning tree is the spanning tree where the cost is minimum
among all the spanning trees. There also can be many minimum spanning trees.
8. What is Kruskal’s algorithm?
Kruskal’s Algorithm builds the spanning tree by adding edges one by one into
a growing spanning tree. Kruskal's algorithm follows greedy approach as in each
iteration it finds an edge which has least weight and add it to the growing
spanning tree.
9. What is Prim’s Algorithm?
Prim’s Algorithm also uses the Greedy approach to find the minimum spanning
tree. In Prim’s Algorithm we grow the spanning tree from a starting position.
Unlike Kruskal's, where we add an edge, in Prim's we add a vertex to the
growing spanning tree.
10. What is Bellman Ford algorithm?
Bellman Ford algorithm works by overestimating the length of the path from
the starting vertex to all other vertices. Then it iteratively relaxes those estimates
by finding new paths that are shorter than the previously overestimated paths.
11. What is Dijkstra algorithm?
Dijkstra algorithm is a single-source shortest path algorithm. Here, single-
source means that only one source is given, and we have to find the shortest path
from the source to all the nodes.
12. What is Floyd Warshall Algorithm?
 Floyd Warshall Algorithm is a famous algorithm.

 It is used to solve All Pairs Shortest Path Problem.


 It computes the shortest path between every pair of vertices of the given
graph.
 Floyd Warshall Algorithm is an example of dynamic programming
approach.
13. What is Flow Networks?
In graph theory, a flow network is defined as a directed graph G = (V, E)
constrained with a function c, which bounds each edge e with a non-negative
integer value known as the capacity of the edge e, together with two additional
vertices defined as the source S and the sink T.
14. What is Ford-Fulkerson algorithm?
The following is simple idea of Ford-Fulkerson algorithm:
 Start with initial flow as 0.
 While there is an augmenting path from source to sink:
 Add this path-flow to flow.
 Return flow.

1. Explain graph representations in detail.


2. Explain Depth First Search and Breadth First Search.
3. Write in detail about Kruskal’s algorithm
4. Write in detail about Prim’s algorithm
5. Explain in detail about Bellman-Ford algorithm
6. Discuss in detail about Dijkstra’s algorithm
7. Briefly explain Floyd-Warshall algorithm
8. Explain Ford-Fulkerson method
9. Write in detail about Maximum bipartite matching.

*******************
UNIT III
ALGORITHM DESIGN
TECHNIQUES
Divide and Conquer methodology: Finding maximum and minimum - Merge sort
- Quick sort Dynamic programming: Elements of dynamic programming - Matrix-
chain multiplication - Multi stage graph - Optimal Binary Search Trees. Greedy
Technique: Elements of the greedy strategy - Activity-selection problem - Optimal
Merge pattern - Huffman Trees.

ALGORITHM DESIGN TECHNIQUES


An algorithm design technique is a general approach to solving problems
algorithmically that is applicable to a variety of problems from different areas of
computing.

The divide and conquer algorithm works in a top-down manner and is preferred for
large problems. As the name divide and conquer suggests, it follows the steps below:
Step 1: Divide the problem into several subproblems.

Step 2: Conquer or solve each sub-problem.


Step 3: Combine each sub-problem to get the required result.
Divide and conquer solves each subproblem recursively, so each subproblem is
a smaller instance of the original problem.

3.1.1. FINDING MAXIMUM AND MINIMUM


Here the array is divided into two halves. Then, using a recursive approach, the
maximum and minimum numbers in each half are found. Finally, we return the
maximum of the two maxima and the minimum of the two minima.
In this given problem, the number of elements in an array is y – x + 1, where y
is greater than or equal to x.
Max − Min(x, y) will return the maximum and minimum values of the
array numbers[x ... y].

Algorithm: Max - Min(x, y)


if y – x ≤ 1 then
return (max(numbers[x], numbers[y]), min(numbers[x], numbers[y]))
else
(max1, min1) = Max − Min(x, ⌊(x + y)/2⌋)
(max2, min2) = Max − Min(⌊(x + y)/2⌋ + 1, y)
return (max(max1, max2), min(min1, min2))

Analysis
Let T(n) be the number of comparisons made by Max − Min(x, y), where the
number of elements n = y – x + 1.
If T(n) represents the numbers, then the recurrence relation can be represented as

T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + 2   for n > 2
T(n) = 1                          for n = 2
T(n) = 0                          for n = 1
Let us assume that n is in the form of a power of 2. Hence, n = 2^k, where k is the
height of the recursion tree.

So,
T(n) = 2 · T(n/2) + 2 = 2 · (2 · T(n/4) + 2) + 2 = ... = (3n/2) – 2
Compared to the naïve method, the divide and conquer approach makes fewer
comparisons. However, using asymptotic notation, both approaches
are represented by O(n).
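A minimal C++ (C++17) sketch of the Max − Min algorithm above, with an assumed sample array for demonstration.

#include <algorithm>
#include <iostream>
#include <utility>
using namespace std;

// Returns (max, min) of numbers[x..y] using divide and conquer.
pair<int,int> maxMin(int numbers[], int x, int y) {
    if (y - x <= 1)                           // one or two elements left
        return { max(numbers[x], numbers[y]),
                 min(numbers[x], numbers[y]) };
    int mid = (x + y) / 2;
    pair<int,int> left  = maxMin(numbers, x, mid);       // conquer each half
    pair<int,int> right = maxMin(numbers, mid + 1, y);
    return { max(left.first,  right.first),              // combine: max of maxima,
             min(left.second, right.second) };           // min of minima
}

int main() {
    int numbers[] = {6, 4, 26, 14, 33, 64, 46};   // assumed sample data
    auto [mx, mn] = maxMin(numbers, 0, 6);
    cout << "Max = " << mx << ", Min = " << mn << endl;
    return 0;
}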

3.1.2. MERGE SORT


The MergeSort function repeatedly divides the array into two halves until we
reach a stage where we try to perform MergeSort on a subarray of size 1 i.e. p == r.
After that, the merge function comes into play and combines the sorted arrays into
larger arrays until the whole array is merged.

Algorithm:
step 1: start
step 2: declare array and left, right, mid variable
step 3: perform merge function.
if left >= right
return
mid= (left+right)/2
mergesort(array, left, mid)
mergesort(array, mid+1, right)
merge(array, left, mid, right)
step 4: Stop
Follow the steps below to solve the problem:
MergeSort(arr[], l, r)
If r > l
 Find the middle point to divide the array into two halves:
 middle m = l + (r – l) / 2
 Call mergeSort for first half:
 Call mergeSort(arr, l, m)
 Call mergeSort for second half:

 Call mergeSort(arr, m + 1, r)
 Merge the two halves sorted in steps 2 and 3:
 Call merge(arr, l, m, r)
The following figure illustrates the dividing (splitting) procedure.

Fig. 3.1.
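A minimal C++ sketch of the procedure, with sample data assumed for demonstration.

#include <iostream>
#include <vector>
using namespace std;

// Merge the two sorted halves arr[l..m] and arr[m+1..r].
void merge(vector<int>& arr, int l, int m, int r) {
    vector<int> tmp;
    int i = l, j = m + 1;
    while (i <= m && j <= r)                  // pick the smaller front element
        tmp.push_back(arr[i] <= arr[j] ? arr[i++] : arr[j++]);
    while (i <= m) tmp.push_back(arr[i++]);   // copy any leftovers
    while (j <= r) tmp.push_back(arr[j++]);
    for (int k = l; k <= r; k++) arr[k] = tmp[k - l];
}

void mergeSort(vector<int>& arr, int l, int r) {
    if (l >= r) return;            // zero or one element: already sorted
    int m = l + (r - l) / 2;       // middle point divides the array in two
    mergeSort(arr, l, m);          // sort the first half
    mergeSort(arr, m + 1, r);      // sort the second half
    merge(arr, l, m, r);           // merge the two sorted halves
}

int main() {
    vector<int> arr = {38, 27, 43, 3, 9, 82, 10};   // assumed sample data
    mergeSort(arr, 0, arr.size() - 1);
    for (int x : arr) cout << x << " ";
    return 0;
}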

3.1.3. QUICK SORT


It is an algorithm of Divide & Conquer type.
Divide: Rearrange the elements and split the array into two sub-arrays and an
element in between, such that each element in the left sub-array is less than or equal to
that middle element (the pivot) and each element in the right sub-array is larger than
the middle element.
Conquer: Recursively, sort two sub arrays.
Combine: Combine the already sorted array

Algorithm:

Fig. 3.2. shows the execution trace of the partition algorithm

QUICKSORT (array A, int m, int n)

1 if (n > m)
2 then
3 i ← a random index from [m, n]
4 swap A[i] with A[m]
5 o ← PARTITION (A, m, n)
6 QUICKSORT (A, m, o – 1)
7 QUICKSORT (A, o + 1, n)

Method Name: 1. Quick Sort [Worst Case]
Equation: T(n) = T(n – 1) + T(0) + n, with stopping condition T(1) = 0
Complexity: T(n) = O(n²)

Method Name: 2. Quick Sort [Average Case]
Equation: T(n) = n + 1 + (1/n) Σ k=1 to n [T(k – 1) + T(n – k)], with stopping condition T(1) = 0
Complexity: T(n) = O(n log n)
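A minimal C++ sketch of quick sort following the pseudocode above, except that the pivot is taken as the last element (a Lomuto-style partition) rather than a random index; the sample data is assumed.

#include <iostream>
#include <vector>
using namespace std;

// Lomuto-style partition: places the pivot A[n] in its final position o,
// with smaller elements to its left and larger ones to its right.
int partition(vector<int>& A, int m, int n) {
    int pivot = A[n], i = m - 1;
    for (int j = m; j < n; j++)
        if (A[j] <= pivot) swap(A[++i], A[j]);
    swap(A[i + 1], A[n]);
    return i + 1;
}

void quickSort(vector<int>& A, int m, int n) {
    if (n > m) {
        int o = partition(A, m, n);
        quickSort(A, m, o - 1);      // recursively sort the left sub-array
        quickSort(A, o + 1, n);      // recursively sort the right sub-array
    }
}

int main() {
    vector<int> A = {35, 50, 15, 25, 80, 20, 90, 45};  // assumed sample data
    quickSort(A, 0, A.size() - 1);
    for (int x : A) cout << x << " ";
    return 0;
}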

Dynamic programming is used where we have problems, which can be divided


into similar sub-problems, so that their results can be re-used. Mostly, these
algorithms are used for optimization. Before solving the in-hand sub-problem,
dynamic algorithm will try to examine the results of the previously solved sub-
problems.

Example
Let's find the Fibonacci sequence up to the 5th term. A Fibonacci series is the
sequence of numbers in which each number is the sum of the two preceding ones. For
example: 0, 1, 1, 2, 3. Here, each number is the sum of the two preceding numbers.
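A minimal C++ sketch of the memoization (top-down) approach for this Fibonacci example: each subproblem is computed once, stored in a table, and re-used on every later request.

#include <iostream>
#include <vector>
using namespace std;

vector<long long> memo;

// Top-down DP: each subproblem fib(n) is solved once and re-used.
long long fib(int n) {
    if (n <= 1) return n;
    if (memo[n] != -1) return memo[n];        // answer already in the table
    return memo[n] = fib(n - 1) + fib(n - 2); // store before returning
}

int main() {
    int n = 5;
    memo.assign(n + 1, -1);                   // -1 marks "not yet computed"
    for (int i = 0; i <= n; i++) cout << fib(i) << " ";  // 0 1 1 2 3 5
    return 0;
}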

3.2.1. ELEMENTS OF DYNAMIC PROGRAMMING


Three elements of the Dynamic Programming algorithm are:
1. Substructure
2. Table Structure
3. Bottom-Up Computation

The elements in a Dynamic Programming Solution are discussed below:


 To solve a given complex problem and to find its optimal solution, it is
broken down into similar but smaller and easily computable problems
called subproblems. Hence, the complete solution depends on many
smaller problems and their solutions. We get to the final optimal solution
after going through all subproblems and selecting the most optimal ones.
This is the substructure element of any Dynamic Programming solution.
 Any Dynamic Programming solution involves storing the optimal
solutions of the subproblems so that they don't have to be computed
again and again. To store these solutions a table structure is needed. So,
for example arrays in C++ or ArrayList in Java can be used. By using
this structured table, the solutions of previous subproblems are reused.
 The solutions to subproblems need to be computed first to be reused
again. This is called Bottom-Up Computation because we start storing
values from the bottom and then consequently upwards. The solutions to
the smaller subproblems are combined to get the final solution to the
original problem.

Approaches to Dynamic Programming


There are two types of approach that can be used to solve a problem by Dynamic
Programming:
1. Memoization or Top-Down Dynamic Programming
2. Tabulation or Bottom Up Dynamic Programming

3.2.2. APPLICATIONS OF DYNAMIC PROGRAMMING


The various applications of Dynamic Programming are :
1. Longest Common Subsequence
2. Finding Shortest Path
3. Finding Maximum Profit with other Fixed Constraints
4. Job Scheduling in Processor
5. BioInformatics
6. Optimal search solutions

Example: We are given the sequence {4, 10, 3, 12, 20, 7}. The matrices have
sizes 4 × 10, 10 × 3, 3 × 12, 12 × 20, 20 × 7. We need to compute M[i, j], 1 ≤ i ≤ j ≤
5. We know M[i, i] = 0 for all i.

Fig. 3.3.
Let us proceed with working away from the diagonal. We compute the optimal
solution for the product of 2 matrices.

Fig. 3.4.
Here p0 to p5 are positions and M1 to M5 are matrices of size p[i – 1] × p[i].

On the basis of sequence, we make a formula


For matrix Mi, p[i] gives the number of columns and p[i – 1] the number of rows.
In dynamic programming, every entry is initialized with '0', so we initialize the
diagonal with '0', and the table is filled out diagonal by diagonal.
We have to work out all the combinations, but only the minimum-cost combination is
taken into consideration.

Calculation of Product of 2 matrices:


1. m(1, 2) = M1 × M2
= (4 × 10) × (10 × 3)
= 4 × 10 × 3 = 120
2. m(2, 3) = M2 × M3
= (10 × 3) × (3 × 12)
= 10 × 3 × 12 = 360
3. m(3, 4) = M3 × M4
= (3 × 12) × (12 × 20)
= 3 × 12 × 20 = 720
4. m(4, 5) = M4 × M5
= (12 × 20) × (20 × 7)
= 12 × 20 × 7 = 1680

Fig. 3.5.
 We initialize the diagonal element with equal i, j value with '0'.
 After that second diagonal is sorted out and we get all the values
corresponded to it
Now the third diagonal will be solved out in the same way.

Now product of 3 matrices:


M [1, 3] = M1 M2 M3
1. There are two cases by which we can solve this multiplication: (M1 × M2) × M3
and M1 × (M2 × M3).
2. After solving both cases we choose the case in which the minimum output is
there.
M[1, 3] = min { M[1, 2] + M[3, 3] + p0·p2·p3 = 120 + 0 + 4·3·12 = 264,
                M[1, 1] + M[2, 3] + p0·p1·p3 = 0 + 360 + 4·10·12 = 840 }

M [1, 3] = 264
Comparing both outputs, 264 is the minimum, so we insert 264 in the table, and
the combination (M1 × M2) × M3 is chosen for the output.
M [2, 4] = M2 M3 M4
1. There are two cases by which we can solve this multiplication: (M2 × M3) × M4
and M2 × (M3 × M4).
2. After solving both cases we choose the case in which the minimum output is
there.
M[2, 4] = min { M[2, 3] + M[4, 4] + p1·p3·p4 = 360 + 0 + 10·12·20 = 2760,
                M[2, 2] + M[3, 4] + p1·p2·p4 = 0 + 720 + 10·3·20 = 1320 }

M [2, 4] = 1320
Comparing both outputs, 1320 is the minimum, so we insert 1320 in the table, and
the combination M2 × (M3 × M4) is chosen for the output.
M [3, 5] = M3 M4 M5
1. There are two cases by which we can solve this multiplication: (M3 × M4) × M5
and M3 × (M4 × M5).
2. After solving both cases we choose the case in which the minimum output is
there.
M[3, 5] = min { M[3, 4] + M[5, 5] + p2·p4·p5 = 720 + 0 + 3·20·7 = 1140,
                M[3, 3] + M[4, 5] + p2·p3·p5 = 0 + 1680 + 3·12·7 = 1932 }

M [3, 5] = 1140
Comparing both outputs, 1140 is the minimum, so we insert 1140 in the table, and
the combination (M3 × M4) × M5 is chosen for the output.

Fig. 3.6.

Now Product of 4 matrices:


M [1, 4] = M1 M2 M3 M4
There are three cases by which we can solve this multiplication:
1. (M1 × M2 × M3) × M4
2. M1 × (M2 × M3 × M4)
3. (M1 × M2) × (M3 × M4)
After solving these cases we choose the case in which the minimum output is there.
M[1, 4] = min { M[1, 3] + M[4, 4] + p0·p3·p4 = 264 + 0 + 4·12·20 = 1224,
                M[1, 2] + M[3, 4] + p0·p2·p4 = 120 + 720 + 4·3·20 = 1080,
                M[1, 1] + M[2, 4] + p0·p1·p4 = 0 + 1320 + 4·10·20 = 2120 }

M [1, 4] = 1080
Comparing the outputs of the different cases, '1080' is the minimum, so we insert
1080 in the table, and the combination (M1 × M2) × (M3 × M4) is chosen for
the output.
M [2, 5] = M2 M3 M4 M5
There are three cases by which we can solve this multiplication:
1. (M2 × M3 × M4) × M5
2. M2 × (M3 × M4 × M5)
3. (M2 × M3) × (M4 × M5)
After solving these cases we choose the case in which the minimum output is there.
M[2, 5] = min { M[2, 4] + M[5, 5] + p1·p4·p5 = 1320 + 0 + 10·20·7 = 2720,
                M[2, 3] + M[4, 5] + p1·p3·p5 = 360 + 1680 + 10·12·7 = 2880,
                M[2, 2] + M[3, 5] + p1·p2·p5 = 0 + 1140 + 10·3·7 = 1350 }

M [2, 5] = 1350

Fig. 3.7.

Comparing the outputs of the different cases, '1350' is the minimum, so we insert
1350 in the table, and the combination M2 × (M3 × M4 × M5) is chosen for the
output.

Now Product of 5 matrices:

M [1, 5] = M1 M2 M3 M4 M5

There are four cases by which we can solve this multiplication:

1. (M1 × M2 × M3 × M4) × M5

2. M1 × (M2 × M3 × M4 × M5)

3. (M1 × M2 × M3) × (M4 × M5)

4. (M1 × M2) × (M3 × M4 × M5)

After solving these cases we choose the case in which the minimum output is there.

M[1, 5] = min { M[1, 4] + M[5, 5] + p0·p4·p5 = 1080 + 0 + 4·20·7 = 1640,
                M[1, 3] + M[4, 5] + p0·p3·p5 = 264 + 1680 + 4·12·7 = 2280,
                M[1, 2] + M[3, 5] + p0·p2·p5 = 120 + 1140 + 4·3·7 = 1344,
                M[1, 1] + M[2, 5] + p0·p1·p5 = 0 + 1350 + 4·10·7 = 1630 }

M [1, 5] = 1344

Comparing the outputs of the different cases, '1344' is the minimum, so we insert
1344 in the table, and the combination (M1 × M2) × (M3 × M4 × M5) is chosen for
the output.
Final Output is:

Fig. 3.8.

Step 3: Computing Optimal Costs: let us assume that matrix Ai has dimension
p[i – 1] × p[i] for i = 1, 2, 3, ..., n. The input is a sequence (p0, p1, ..., pn) where
length[p] = n + 1. The procedure uses an auxiliary table m[1...n, 1...n] for storing the
m[i, j] costs and an auxiliary table s[1...n, 1...n] that records which index k achieved
the optimal cost in computing m[i, j].
The algorithm first computes m[i, i] ← 0 for i = 1, 2, 3, ..., n, the minimum costs
for chains of length 1.
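A minimal C++ sketch of this bottom-up computation, using the dimension sequence of the worked example above; it reproduces the minimum cost 1344 for M[1, 5].

#include <climits>
#include <iostream>
using namespace std;

int main() {
    // dimensions for the example above: matrices 4x10, 10x3, 3x12, 12x20, 20x7
    int p[] = {4, 10, 3, 12, 20, 7};
    int n = 5;                       // number of matrices
    int m[6][6] = {0};               // m[i][j] = min cost of Mi..Mj; m[i][i] = 0
    for (int len = 2; len <= n; len++)          // chain length: diagonal by diagonal
        for (int i = 1; i <= n - len + 1; i++) {
            int j = i + len - 1;
            m[i][j] = INT_MAX;
            for (int k = i; k < j; k++) {       // try every split point k
                int cost = m[i][k] + m[k+1][j] + p[i-1] * p[k] * p[j];
                if (cost < m[i][j]) m[i][j] = cost;
            }
        }
    cout << "Minimum number of multiplications = " << m[1][n] << endl;  // 1344
    return 0;
}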

Multistage Graph problem is defined as follow:


 Multistage graph G = (V, E, W) is a weighted directed graph in which
vertices are partitioned into k ≥ 2 disjoint sub sets V = {V1 , V2, …, Vk}
such that if edge (u, v) is present in E then u ∈ Vi and v ∈ Vi + 1, 1 ≤ i ≤ k.
The goal of multistage graph problem is to find minimum cost path from
source to destination vertex.
 The input to the algorithm is a k-stage graph, n vertices are indexed in
increasing order of stages.
 The algorithm operates in the backward direction, i.e. it starts from the last
vertex of the graph and proceeds in a backward direction to find minimum
cost path.
 Minimum cost of vertex j ∈ Vi from vertex r ∈ Vi+1 is defined as,
Cost[j] = min{ c[j, r] + Cost[r] }
where, c[j, r] is the weight of edge < j, r > and cost[r] is the cost of moving
from end vertex to vertex r.

Complexity Analysis of Multistage Graph


If graph G has |E| edges, then cost computation time would be O(n + |E|). The
complexity of tracing the minimum cost path would be O(k), k < n. Thus total time
complexity of multistage graph using dynamic programming would be O(n + |E|).

Example
Find minimum path cost between vertex s and t for following multistage
graph using dynamic programming.

Fig. 3.9.
Solution to multistage graph using dynamic programming is constructed as,
Cost[j] = min{c [ j , r] + cost[r]}
Here, number of stages k = 5, number of vertices n = 12, source s = 1 and target t =
12

Initialization:
Cost[n] = 0  Cost[12] = 0.
p[1] = s  p[1] = 1
p[k] = t  p[5] = 12.
r = t = 12

Stage 4:

Fig. 3.10.

Stage 3:

Vertex 6 is connected to vertices 9 and 10:


Cost[6] = min{ c[6, 10] + Cost[10], c[6, 9] + Cost[9] }
= min{5 + 2, 6 + 4} = min{7, 10} = 7
p[6] = 10

Vertex 7 is connected to vertices 9 and 10:


Cost[7] = min{ c[7, 10] + Cost[10], c[7, 9] + Cost[9] }
= min{3 + 2, 4 + 4} = min{5, 8} = 5
p[7] = 10

Vertex 8 is connected to vertex 10 and 11:


Cost[8] = min{ c[8, 11] + Cost[11], c[8, 10] + Cost[10] }
= min{6 + 5, 5 + 2} = min{11, 7} = 7
p[8] = 10

Fig. 3.11.

Stage 2:

Vertex 2 is connected to vertices 6, 7 and 8:


Cost[2] = min{ c[2, 6] + Cost[6], c[2, 7] + Cost[7], c[2, 8] + Cost[8] }
= min{4 + 7, 2 + 5, 1 + 7} = min{11, 7, 8} = 7
p[2] = 7

Vertex 3 is connected to vertices 6 and 7:


Cost[3] = min{ c[3, 6] + Cost[6], c[3, 7] + Cost[7] }
= min{2 + 7, 7 + 5} = min{9, 12} = 9
p[3] = 6

Vertex 4 is connected to vertex 8:


Cost[4] = c[4, 8] + Cost[8] = 11 + 7 = 18
p[4] = 8

Vertex 5 is connected to vertices 7 and 8:


Cost[5] = min{ c[5, 7] + Cost[7], c[5, 8] + Cost[8] }
= min{11 + 5, 8 + 7} = min{16, 15} = 15
p[5] = 8

Fig. 3.12.

Stage 1:

Vertex 1 is connected to vertices 2, 3, 4 and 5:


Cost[1] = min{c[1, 2] + Cost[2], c[1, 3] + Cost[3], c[1, 4] + Cost[4], c[1, 5]
+ Cost[5]}
= min{9 + 7, 7 + 9, 3 + 18, 2 + 15}
= min{16, 16, 21, 17}
= 16
p[1] = 2

Trace the solution:


p[1] = 2
p[2] = 7
p[7] = 10

p[10] = 12

Fig. 3.13.
Minimum cost path is : 1 – 2 – 7 – 10 – 12

Fig. 3.14.
Cost of the path is : 9 + 2 + 3 + 2 = 16

In a binary search tree, the nodes in the left subtree have a lesser value than the root node and the nodes in the right subtree have a greater value than the root node.
We know the key value of each node in the tree, and we also know the frequency of each node, i.e., how often that node is searched. The frequency and the key value determine the overall cost of searching a node. The cost of searching is a very important factor in various applications, and the overall cost of searching should be as small as possible. The time required to search a node in an arbitrary BST can be more than in a balanced binary search tree, since a balanced tree contains fewer levels. A binary search tree that minimizes this total search cost is known as an optimal binary search tree.

Example:
If the keys are 10, 20, 30, 40, 50, 60, 70

Fig. 3.15.
In the above tree, all the nodes in the left subtree are smaller than the value of the root node, and all the nodes in the right subtree are larger than the value of the root node. For such a balanced tree, the maximum time required to search a node is proportional to the height of the tree, which is about log n.
Now we will see how many binary search trees can be made from the given
number of keys.
For example: 10, 20, 30 are the keys, and the following are the binary search trees
that can be made out from these keys.
The formula for calculating the number of trees:

    Number of BSTs with n keys = (1 / (n + 1)) · 2nCn   (the nth Catalan number)

When we use the above formula for n = 3, it is found that a total of 5 trees can be created.
The cost required for searching an element depends on the number of comparisons made to find it. Now we will calculate the average number of comparisons for each of the above binary search trees.

Fig. 3.16.

In the above tree, a maximum of 3 comparisons is needed. The average number of comparisons is:

Average number of comparisons = (1 + 2 + 3) / 3 = 2

Fig. 3.17.
In the above tree, the average number of comparisons is:

Average number of comparisons = (1 + 2 + 3) / 3 = 2

Fig. 3.18.
In the above tree, the average number of comparisons is:

Average number of comparisons = (1 + 2 + 2) / 3 = 5/3

Fig. 3.19.
In the above tree, a maximum of 3 comparisons is needed. Therefore, the average number of comparisons is:

Average number of comparisons = (1 + 2 + 3) / 3 = 2

Fig. 3.20.
In the above tree, a maximum of 3 comparisons is needed. Therefore, the average number of comparisons is:

Average number of comparisons = (1 + 2 + 3) / 3 = 2
In the third case, the number of comparisons is less because the height of the tree
is less, so it's a balanced binary search tree.
Let's assume that the frequencies associated with the keys 10, 20, 30 are 3, 2, 5.
The above trees then have different total search costs (the sum of frequency × comparison count over all keys). The tree with the lowest total cost is the optimal binary search tree; here the tree with cost 17 is the lowest, so it is considered the optimal binary search tree.

Algorithm
optCostBst(keys, freq, n)
Input: Keys to insert in BST, the frequency for each key, number of keys.
Output: Minimum cost to make optimal BST.
Begin
define cost matrix of size n x n
for i in range 0 to n-1, do
cost[i, i] := freq[i]
done

for length in range 2 to n, do


for i in range 0 to (n-length), do
j := i + length – 1
cost[i, j] := ∞
for r in range i to j, do


if r > i, then
c := cost[i, r-1]
else
c := 0
if r < j, then
c := c + cost[r+1, j]
c := c + sum of frequency from i to j
if c < cost[i, j], then
cost[i, j] := c
done
done
done
return cost[0, n-1]
End
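A runnable C++ version of the pseudocode above is sketched below (optCostBst is our name; the keys themselves are not needed to compute the cost, and the frequency sum is computed with a simple inner loop rather than a prefix-sum table). For the keys 10, 20, 30 with frequencies 3, 2, 5 it prints the optimal cost 17, matching the discussion above:

#include <iostream>
#include <vector>
#include <climits>
using namespace std;

// Minimum total search cost of an optimal BST over n sorted keys,
// where freq[i] is the search frequency of the i-th key.
int optCostBst(const vector<int>& freq) {
    int n = freq.size();
    vector<vector<int>> cost(n, vector<int>(n, 0));
    for (int i = 0; i < n; i++)
        cost[i][i] = freq[i];                   // chains of one key
    for (int len = 2; len <= n; len++) {
        for (int i = 0; i + len - 1 < n; i++) {
            int j = i + len - 1;
            int fsum = 0;                       // sum of freq[i..j]
            for (int k = i; k <= j; k++) fsum += freq[k];
            cost[i][j] = INT_MAX;
            for (int r = i; r <= j; r++) {      // try each key as the root
                int c = (r > i ? cost[i][r - 1] : 0)
                      + (r < j ? cost[r + 1][j] : 0) + fsum;
                cost[i][j] = min(cost[i][j], c);
            }
        }
    }
    return cost[0][n - 1];
}

int main() {
    vector<int> freq = {3, 2, 5};     // frequencies of keys 10, 20, 30
    cout << optCostBst(freq) << endl; // prints 17
}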

Among all the algorithmic approaches, the simplest and straightforward approach
is the Greedy method. In this approach, the decision is taken on the basis of current
available information without worrying about the effect of the current decision in
future.
Greedy algorithms build a solution part by part, choosing the next part in such a
way, that it gives an immediate benefit. This approach never reconsiders the choices
taken previously. This approach is mainly used to solve optimization problems.
The greedy method is easy to implement and quite efficient in most cases. Hence, we can say that a greedy algorithm is an algorithmic paradigm based on a heuristic that makes the locally optimal choice at each step with the hope of finding a globally optimal solution.
In many problems, it does not produce an optimal solution though it gives an
approximate (near optimal) solution in a reasonable time.

Components of Greedy Algorithm


Greedy algorithms have the following five components −
 A candidate set − A solution is created from this set.
 A selection function − Used to choose the best candidate to be added to
the solution.
 A feasibility function − Used to determine whether a candidate can be
used to contribute to the solution.
 An objective function − Used to assign a value to a solution or a partial
solution.
 A solution function − Used to indicate whether a complete solution has
been reached.

Areas of Application
Greedy approach is used to solve many problems, such as
 Finding the shortest path between two vertices using Dijkstra’s algorithm.
 Finding the minimal spanning tree in a graph using Prim’s /Kruskal’s
algorithm, etc.

3.6.1. ELEMENTS OF THE GREEDY STRATEGY

Optimal Substructure:
An optimal solution to the problem contains within it optimal solutions to sub-problems. In the activity-selection problem, for instance, once the greedy choice (activity 1) is made, the remaining sub-problem A' = A − {1} can be solved again with the greedy algorithm; it consists of S' = { i ∈ S : si ≥ f1 }, the activities that start after activity 1 finishes.
The 0 - 1 knapsack problem:
A thief has a knapsack that holds at most W pounds. Item i has value vi and weight wi. The thief must choose items to maximize the value stolen while still fitting into the knapsack. Each item must be taken or left behind (0 – 1).

Fractional knapsack problem:


Both the 0 - 1 and fractional problems have the optimal substructure property. Fractional: vi / wi is the value per pound. Clearly you take as much as possible of the item with the greatest value per pound, and this continues until you fill the knapsack. The optimal (greedy) algorithm takes O(n log n), as we must sort on di = vi / wi.

Consider the same strategy for the 0 - 1 problem:


W = 50 lbs. (maximum knapsack capacity)

w1 = 10   v1 = 60    d1 = 6
w2 = 20   v2 = 100   d2 = 5
w3 = 30   v3 = 120   d3 = 4
where d is the value density.
Greedy approach: take all of item 1 and all of item 2: v1 + v2 = 160. The optimal solution is to take all of items 2 and 3: v2 + v3 = 220. Another solution is to take all of items 1 and 3: v1 + v3 = 180. All are below 50 lbs.
When solving the 0 - 1 knapsack problem, empty space lowers the effective d of
the load. Thus each time an item is chosen for inclusion we must consider both
 i included
 i excluded
These are clearly overlapping sub-problems for different i's and so best solved by
DP!
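For contrast, here is a minimal C++ sketch of the greedy algorithm for the fractional problem, using the same three items (names such as fractionalKnapsack are ours):

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

struct Item { double value, weight; };

// Greedy fractional knapsack: sort by value density, take greedily.
double fractionalKnapsack(vector<Item> items, double W) {
    sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return a.value / a.weight > b.value / b.weight;
    });
    double total = 0;
    for (const Item& it : items) {
        if (W <= 0) break;
        double take = min(it.weight, W);         // whole item, or a fraction
        total += it.value * (take / it.weight);
        W -= take;
    }
    return total;
}

int main() {
    vector<Item> items = {{60, 10}, {100, 20}, {120, 30}};
    cout << fractionalKnapsack(items, 50) << endl;  // prints 240
}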

Fig. 3.21. The greedy strategy does not work for the 0-1 knapsack problem. (a) The thief must select a subset of the three items shown whose weight must not exceed 50 pounds. (b) The optimal subset includes items 2 and 3. Any solution with item 1 is suboptimal, even though item 1 has the greatest value per pound. (c) For the fractional knapsack problem, taking the items in order of greatest value per pound yields an optimal solution.
The activity selection problem is an optimization problem used to find the maximum number of activities a person can perform if they can only work on one activity at a time. This problem is also known as the interval scheduling maximization problem (ISMP).
The greedy algorithm provides a simple, well-designed method for selecting the
maximum number of non-conflicting activities.

Algorithm
We are provided with n activities; each activity has its own start and finish time.
In order to find the maximum number of non-conflicting activities, the following
steps need to be taken:
 Sort the activities in ascending order based on their finish times.
 Select the first activity from this sorted list.
 Select a new activity from the list if its start time is greater than or equal to
the finish time of the previously selected activity.
 Repeat the last step until all activities in the sorted list are checked.

Step 1: Sort the activities in ascending order of finish times

Step 2: Select the first activity in the sorted list


3.26 Algorithms

Step 3: Select the next activity in the sorted list if its ‘start’ time is greater than or equal to the ‘finish’ time of the previously selected activity.

Hence, the person can perform 4 non-conflicting activities
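A short C++ sketch of these four steps is given below (the activity data here is illustrative, not taken from the figures):

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

struct Activity { int start, finish; };

// Greedy activity selection: maximum number of non-conflicting activities.
int maxActivities(vector<Activity> acts) {
    // Step 1: sort by finish time
    sort(acts.begin(), acts.end(), [](const Activity& a, const Activity& b) {
        return a.finish < b.finish;
    });
    int count = 1;                        // Step 2: take the first activity
    int lastFinish = acts[0].finish;
    for (size_t i = 1; i < acts.size(); i++) {
        if (acts[i].start >= lastFinish) {  // Step 3: compatible activity?
            count++;
            lastFinish = acts[i].finish;
        }
    }
    return count;
}

int main() {
    // illustrative (start, finish) pairs
    vector<Activity> acts = {{1, 2}, {3, 4}, {0, 6}, {5, 7}, {8, 9}, {5, 9}};
    cout << maxActivities(acts) << endl;  // prints 4
}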

It is a pattern that relates to merging two or more sorted files into a single sorted file. This type of merging can be done by the two-way merging method.
If we have two sorted files containing n and m records respectively then they
could be merged together, to obtain one sorted file in time O (n + m).
There are many ways in which pairwise merge can be done to get a single sorted
file. Different pairings require a different amount of computing time. The main thing
is to pairwise merge the n sorted files so that the number of comparisons will be less.
The formula for the external merging cost is:

      n
      Σ  f(i) · d(i)
    i = 1

where f(i) represents the number of records in file i and d(i) represents its depth in the merge tree.

Algorithm for optimal merge pattern

Algorithm Tree(n)
// list is a global list of n single-node binary trees
{
    for i := 1 to n – 1 do
    {
        pt := new treenode;             // get a new tree node
        (pt → lchild) := Least(list);   // merge the two trees with
        (pt → rchild) := Least(list);   // the smallest weights
        (pt → weight) := ((pt → lchild) → weight) + ((pt → rchild) → weight);
        Insert(list, pt);
    }
    return Least(list);                 // the tree left in list
}

Example:
Given a set of unsorted files: 5, 3, 2, 7, 9, 13
Now, arrange these elements in ascending order: 2, 3, 5, 7, 9, 13
After this, pick the two smallest numbers and repeat this until we are left with only one number.
Now follow following steps:

Step 1: Insert 2, 3

Step 2:

Step 3: Insert 5

Step 4: Insert 13

Step 5: Insert 7 and 9



Step 6:

So, the merging cost = 5 + 10 + 16 + 23 + 39 = 93
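Repeatedly merging the two smallest files is exactly what a min-heap gives us. The following C++ sketch (mergeCost is our name) reproduces the merging cost 93 for the files above:

#include <iostream>
#include <queue>
#include <vector>
using namespace std;

// Total cost of optimally pairwise-merging files with the given sizes.
long long mergeCost(const vector<int>& sizes) {
    priority_queue<long long, vector<long long>, greater<long long>> pq(
        sizes.begin(), sizes.end());
    long long total = 0;
    while (pq.size() > 1) {
        long long a = pq.top(); pq.pop();  // two smallest files
        long long b = pq.top(); pq.pop();
        total += a + b;                    // cost of this merge
        pq.push(a + b);                    // merged file goes back in
    }
    return total;
}

int main() {
    cout << mergeCost({5, 3, 2, 7, 9, 13}) << endl;  // prints 93
}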

Huffman coding provides codes to characters such that the length of the code
depends on the relative frequency or weight of the corresponding character. Huffman
codes are of variable-length, and without any prefix (that means no code is a prefix
of any other). Any prefix-free binary code can be displayed or visualized as a binary
tree with the encoded characters stored at the leaves.
A Huffman tree or Huffman coding tree is defined as a full binary tree in which each leaf of the tree corresponds to a letter in the given alphabet.
The Huffman tree is treated as the binary tree associated with minimum external
path weight that means, the one associated with the minimum sum of weighted path
lengths for the given set of leaves. So the goal is to construct a tree with the
minimum external path weight.
An example is given below-

Letter frequency table

Letter z k m c u d l e
Frequency 2 7 24 32 37 42 42 120

Huffman Code

Letter Freq Code Bits


e 120 0 1
d 42 101 3
l 42 110 3
u 37 100 3
c 32 1110 4
m 24 11111 5
k 7 111101 6
z 2 111100 6
The Huffman tree (for the above example) is given below -

Fig. 3.22.

Huffman (C)
1. n = |C|
2. Q ← C
3. for i=1 to n-1

4. do
5. z= allocate-Node ()
6. x= left[z]=Extract-Min(Q)
7. y= right[z] =Extract-Min(Q)
8. f [z]=f[x]+f[y]
9. Insert (Q, z)
10. return Extract-Min (Q)
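The following C++ sketch (huffmanCost is our name) implements the same Extract-Min/Insert loop with a min-priority queue, computing only the total encoded length rather than building the tree. For the frequency table above it prints 785, which matches Σ freq × bits from the code table (120·1 + 42·3 + 42·3 + 37·3 + 32·4 + 24·5 + 7·6 + 2·6 = 785):

#include <iostream>
#include <queue>
#include <vector>
using namespace std;

// Total number of bits needed to Huffman-encode symbols with these
// frequencies (equal to the sum of all internal node weights).
long long huffmanCost(const vector<long long>& freq) {
    priority_queue<long long, vector<long long>, greater<long long>> q(
        freq.begin(), freq.end());
    long long total = 0;
    while (q.size() > 1) {
        long long x = q.top(); q.pop();  // Extract-Min twice
        long long y = q.top(); q.pop();
        total += x + y;                  // weight of the new node z
        q.push(x + y);                   // Insert(Q, z)
    }
    return total;
}

int main() {
    // frequencies of z, k, m, c, u, d, l, e from the table above
    cout << huffmanCost({2, 7, 24, 32, 37, 42, 42, 120}) << endl;  // 785
}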

1. What is Algorithm design technique?


An algorithm design technique is a general approach to solving problems
algorithmically that is applicable to a variety of problems from different areas of
computing.
2. What is Merge sort?
 The MergeSort function repeatedly divides the array into two halves until
we reach a stage where we try to perform MergeSort on a subarray of size
1 i.e. p == r.
 After that, the merge function comes into play and combines the sorted
arrays into larger arrays until the whole array is merged.
3. What is Quick Sort?
 It is an algorithm of Divide & Conquer type.
 Divide: Rearrange the elements and split the array into two sub-arrays with a pivot element in between, such that each element in the left sub-array is less than or equal to the pivot and each element in the right sub-array is larger than the pivot.
 Conquer: Recursively sort the two sub-arrays.
 Combine: Combine the already sorted arrays.

4. What is Dynamic Programming?


Dynamic programming is used where we have problems, which can be
divided into similar sub-problems, so that their results can be re-used. Mostly,
these algorithms are used for optimization. Before solving the in-hand sub-
problem, dynamic algorithm will try to examine the results of the previously
solved sub-problems.
5. Write a short note on Complexity Analysis of Multistage Graph.
If graph G has |E| edges, then cost computation time would be O(n + |E|). The
complexity of tracing the minimum cost path would be O(k), k < n. Thus total
time complexity of multistage graph using dynamic programming would be O(n
+ |E|).
6. What is Greedy technique?
In this approach, the decision is taken on the basis of current available
information without worrying about the effect of the current decision in future.
7. What is activity selection problem?
The activity selection problem is an optimization problem used to find the
maximum number of activities a person can perform if they can only work on
one activity at a time. This problem is also known as the interval scheduling
maximization problem (ISMP).
8. What is Optimal merge pattern.
It is a pattern that relates to the merging of two or more sorted files in a single
sorted file. This type of merging can be done by the two-way merging method.
9. What is Huffman tree?
Huffman tree or Huffman coding tree defines as a full binary tree in which
each leaf of the tree corresponds to a letter in the given alphabet.
The Huffman tree is treated as the binary tree associated with minimum
external path weight that means, the one associated with the minimum sum of
weighted path lengths for the given set of leaves. So the goal is to construct a
tree with the minimum external path weight

1. Explain Divide and Conquer methodology in detail


2. Write in detail about Merge sort with an example
3. Write in detail about Quick sort
4. Write Elements of dynamic programming in detail
5. Explain the concept of Matrix-chain multiplication
6. Explain Optimal Binary Search Trees with an example.
7. Explain in detail about Greedy Technique.
8. Discuss on Activity-selection problem
9. Explain Huffman Trees with an example.

*******************
UNIT IV
STATE SPACE SEARCH
ALGORITHMS
Backtracking: n-Queens problem - Hamiltonian Circuit Problem - Subset Sum
Problem – Graph colouring problem Branch and Bound: Solving 15-Puzzle problem
- Assignment problem - Knapsack Problem - Travelling Salesman Problem

Backtracking is one of the techniques that can be used to solve the problem. We
can write the algorithm using this strategy. It uses the Brute force search to solve the
problem, and the brute force search says that for the given problem, we try to make
all the possible solutions and pick out the best solution from all the desired solutions.
This rule is also followed in dynamic programming, but dynamic programming is
used for solving optimization problems. In contrast, backtracking is not used in
solving optimization problems. Backtracking is used when we have multiple
solutions, and we require all those solutions.
The name backtracking itself suggests that we go back and come forward: if a partial solution satisfies the condition we continue and return success, else we go back and try another path. It is used to solve a problem in which a sequence of objects is chosen from a specified set so that the sequence satisfies some criteria.

When to use a Backtracking algorithm?


When we have multiple choices, then we make the decisions from the available
choices. In the following cases, we need to use the backtracking algorithm:
 A piece of sufficient information is not available to make the best choice,
so we use the backtracking strategy to try out all the possible solutions.
 Each decision leads to a new set of choices. Then again, we backtrack to
make new decisions. In this case, we need to use the backtracking
strategy.

How does Backtracking work?


Backtracking is a systematic method of trying out various sequences of decisions until you find one that works. Let's understand through an example.
We start with a start node. First, we move to node A. Since it is not a feasible
solution so we move to the next node, i.e., B. B is also not a feasible solution, and it
is a dead-end so we backtrack from node B to node A.

Fig. 4.1.
Suppose another path exists from node A to node C. So, we move from node A to
node C. It is also a dead-end, so again backtrack from node C to node A. We move
from node A to the starting node.

Fig. 4.2.

Fig. 4.3.
Now we will check any other path exists from the starting node. So, we move
from start node to the node D. Since it is not a feasible solution so we move from

node D to node E. The node E is also not a feasible solution. It is a dead end so we
backtrack from node E to node D.
Suppose another path exists from node D to node F. So, we move from node D to
node F. Since it is not a feasible solution and it's a dead-end, we check for another
path from node F.

Fig. 4.4.

Fig. 4.5.

Suppose another path exists from node F to node G, so we move from node F to node G. Node G is a success node.

Algorithm:
If all squares are visited
print the solution
Else
(a) Add one of the next moves to solution vector and recursively check if this
move leads to a solution. (A Knight can make maximum eight moves. We
choose one of the 8 moves in this step).
(b) If the move chosen in the above step doesn't lead to a solution then remove
this move from the solution vector and try other alternative moves.
(c) If none of the alternatives work then return false (Returning false will
remove the previously added item in recursion and if false is returned by
the initial call of recursion then "no solution exists")
Terms related to the backtracking are:
 Live node: The nodes that can be further generated are known as live
nodes.
 E node: The nodes whose children are being generated and become a
success node.
 Success node: The node is said to be a success node if it provides a
feasible solution.
 Dead node: The node which cannot be further generated and also does
not provide a feasible solution is known as a dead node.
Many problems can be solved by backtracking strategy, and that problems satisfy
complex set of constraints, and these constraints are of two types:
 Implicit constraint: It is a rule in which how each element in a tuple is
related.
 Explicit constraint: The rules that restrict each element to be chosen
from the given set.
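These ideas share a common recursive shape. The following C++ sketch (all names are ours, and the toy constraint is purely illustrative) shows the template that the problems below instantiate: extend a partial solution one choice at a time, prune choices that violate the constraints, and undo the last choice when a branch turns out to be a dead end:

#include <iostream>
#include <vector>
#include <functional>
using namespace std;

// Generic backtracking skeleton. 'solution' is the partial solution;
// isValid enforces the constraints; n is the target solution length.
bool backtrack(vector<int>& solution, int n, int numChoices,
               const function<bool(const vector<int>&, int)>& isValid) {
    if ((int)solution.size() == n)
        return true;                       // success node reached
    for (int c = 0; c < numChoices; c++) {
        if (isValid(solution, c)) {
            solution.push_back(c);         // extend: this is now a live node
            if (backtrack(solution, n, numChoices, isValid))
                return true;
            solution.pop_back();           // dead end: undo and try another
        }
    }
    return false;                          // every choice failed: backtrack
}

int main() {
    // Toy instance: build a length-4 sequence from {0, 1, 2} in which
    // consecutive elements differ (an implicit constraint).
    vector<int> sol;
    auto differ = [](const vector<int>& s, int c) {
        return s.empty() || s.back() != c;
    };
    if (backtrack(sol, 4, 3, differ)) {
        for (int v : sol) cout << v << " ";  // prints 0 1 0 1
        cout << endl;
    }
}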

APPLICATIONS OF BACKTRACKING
 N-queen problem

 Sum of subset problem


 Graph coloring
 Hamilton cycle

4.1.1. N-QUEENS PROBLEM


This problem is to find an arrangement of N queens on a chess board, such that no queen can attack any other queen on the board. A chess queen can attack horizontally, vertically and diagonally. A binary matrix is used to display the positions of the N queens, where no queen can attack another queen.

Input and Output

Input:
The size of the chess board. Generally it is 8, as 8 × 8 is the size of a normal chess board.

Output:
The matrix that represents in which row and column the N Queens can be placed.
If the solution does not exist, it will return false.
10000000
00000010
00001000
00000001
01000000
00010000
00000100
00100000
In this output, the value 1 indicates the correct place for the queens.
The 0 denotes the blank spaces on the chess board.

Algorithm
isValid(board, row, col)
Input: The chess board, row and the column of the board.

Output − True when placing a queen at position (row, col) is valid, false otherwise.
Begin
if there is a queen at the left of current col, then
return false
if there is a queen at the left upper diagonal, then
return false
if there is a queen at the left lower diagonal, then
return false;
return true //otherwise it is valid place
End
solveNQueen(board, col)
Input − The chess board, the col where the queen is trying to be placed.
Output − The position matrix where queens are placed.
Begin
if all columns are filled, then
return true
for each row of the board, do
if isValid(board, i, col), then
set queen at place (i, col) in the board
if solveNQueen(board, col+1) = true, then
return true
otherwise remove queen from place (i, col) from board.
done
return false
End

Example
#include<iostream>
using namespace std;
#define N 8

void printBoard(int board[N][N]) {


for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++)
cout << board[i][j] << " ";
cout << endl;
}
}
bool isValid(int board[N][N], int row, int col) {
for (int i = 0; i < col; i++) //check whether there is queen in the left or not
if (board[row][i])
return false;
for (int i=row, j=col; i>=0 && j>=0; i--, j--)
if (board[i][j]) //check whether there is queen in the left upper diagonal
or not
return false;
for (int i=row, j=col; j>=0 && i<N; i++, j--)
if (board[i][j]) //check whether there is queen in the left lower diagonal or
not
return false;
return true;
}
bool solveNQueen(int board[N][N], int col) {
if (col >= N) //when N queens are placed successfully
return true;
for (int i = 0; i < N; i++) { //for each row, check placing of queen is possible
or not
if (isValid(board, i, col) ) {
board[i][col] = 1; //if validate, place the queen at place (i, col)
if ( solveNQueen(board, col + 1)) //Go for the other columns recursively

return true;

board[i][col] = 0; //When no place is vacant remove that queen


}
}
return false; //when no possible order is found
}
bool checkSolution() {
int board[N][N];
for(int i = 0; i<N; i++)
for(int j = 0; j<N; j++)
board[i][j] = 0; //set all elements to 0

if ( solveNQueen(board, 0) == false ) { //starting from 0th column


cout << "Solution does not exist";
return false;
}
printBoard(board);
return true;
}
int main() {
checkSolution();
}

Output
10000000
00000010
00001000
00000001
01000000

00010000
00000100
00100000

4.1.2. HAMILTONIAN CIRCUIT PROBLEM


In an undirected graph, a Hamiltonian path is a path that visits each vertex exactly once, and a Hamiltonian cycle (or circuit) is a Hamiltonian path such that there is an edge from the last vertex to the first vertex. In this problem, we will try to determine whether a graph contains a Hamiltonian cycle or not, and when a Hamiltonian cycle is present, also print the cycle.

Input and Output

Input:
The adjacency matrix of a graph G(V, E).

Fig. 4.6.

Output:
The algorithm finds the Hamiltonian path of the given graph. For this case it is (0,
1, 2, 4, 3, 0). This graph has some other Hamiltonian paths.
If one graph has no Hamiltonian path, the algorithm should return false.

Algorithm
isValid(v, k)
Input − Vertex v and position k.
Output − Checks whether placing v in the position k is valid or not.
Begin
if there is no edge between node(k –1) to v, then

return false
if v is already taken, then
return false
return true; //otherwise it is valid
End
cycleFound(node k)
Input − node of the graph.
Output − True when there is a Hamiltonian Cycle, otherwise false.
Begin
if all nodes are included, then
if there is an edge between nodes k and 0, then
return true
else
return false;
for all vertex v except starting point, do
if isValid(v, k), then //when v is a valid edge
add v into the path
if cycleFound(k +1) is true, then
return true
otherwise remove v from the path
done
return false
End

Example
#include<iostream>
#define NODE 5
using namespace std;

int graph[NODE][NODE] = {

{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 1},
{0, 1, 1, 1, 0},
};

/* int graph[NODE][NODE] = {
{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 0},
{0, 1, 1, 0, 0},
}; */

int path[NODE];
void displayCycle() {
cout<<"Cycle: ";
for (int i = 0; i < NODE; i++)
cout << path[i] << " ";
cout << path[0] << endl; //print the first vertex again
}
bool isValid(int v, int k) {
if (graph [path[k-1]][v] == 0) //if there is no edge
return false;
for (int i = 0; i < k; i++) //if vertex is already taken, skip that
if (path[i] == v)

return false;
return true;
}
bool cycleFound(int k) {
if (k == NODE) { //when all vertices are in the path
if (graph[path[k-1]][ path[0] ] == 1 )
return true;
else
return false;
}
for (int v = 1; v < NODE; v++) { //for all vertices except starting point
if (isValid(v,k)) { //if possible to add v in the path
path[k] = v;
if (cycleFound (k+1) == true)
return true;
path[k] = -1; //when k vertex will not in the solution
}
}
return false;
}
bool hamiltonianCycle() {
for (int i = 0; i < NODE; i++)
path[i] = -1;
path[0] = 0; //first vertex as 0
if ( cycleFound(1) == false ) {
cout << "Solution does not exist"<<endl;
return false;

}
displayCycle();
return true;
}
int main() {
hamiltonianCycle();
}

Output
Cycle: 0 1 2 4 3 0

4.1.3. SUBSET SUM PROBLEM


In this problem, there is a given set with some integer elements, and another value is also provided; we have to find a subset of the given set whose sum is the same as the given sum value.
Here the backtracking approach is used: we try to build a valid subset, and when an item makes the partial solution invalid, we backtrack to the previous subset and add another element to get the solution.

Input and Output


Input:
This algorithm takes a set of numbers, and a sum value.
The Set: {10, 7, 5, 18, 12, 20, 15}
The sum Value: 35
Output:
All possible subsets of the given set, where sum of each element for every subsets
is same as the given sum value.
{10, 7, 18}
{10, 5, 20}
{5, 18, 12}
{20, 15}

Algorithm
subsetSum(set, subset, n, subSize, total, node, sum)
Input − The given set and subset, size of set and subset, a total of the subset,
number of elements in the subset and the given sum.
Output − All possible subsets whose sum is the same as the given sum.
Begin
if total = sum, then
display the subset
//go for finding next subset
subsetSum(set, subset, n, subSize-1, total-set[node], node+1, sum)
return
else
for all element i in the set, do
subset[subSize] := set[i]
subSetSum(set, subset, n, subSize+1, total+set[i], i+1, sum)
done
End

Example
#include <iostream>
using namespace std;
void displaySubset(int subSet[], int size) {
for(int i = 0; i < size; i++) {
cout << subSet[i] << " ";
}
cout << endl;
}
void subsetSum(int set[], int subSet[], int n, int subSize, int total, int nodeCount
,int sum) {
if( total == sum) {

displaySubset(subSet, subSize); //print the subset


subsetSum(set,subSet,n,subSize-1,total-set[nodeCount],nodeCount+1,sum);
//for other subsets
return;
}else {
for( int i = nodeCount; i < n; i++ ) { //find node along breadth
subSet[subSize] = set[i];
subsetSum(set,subSet,n,subSize+1,total+set[i],i+1,sum); //do for next
node in depth
}
}
}
void findSubset(int set[], int size, int sum) {
int *subSet = new int[size]; //create subset array to pass parameter of
subsetSum
subsetSum(set, subSet, size, 0, 0, 0, sum);
delete[] subSet;
}
int main() {
int weights[] = {10, 7, 5, 18, 12, 20, 15};
int size = 7;
findSubset(weights, size, 35);
}

Output
10 7 18
10 5 20
5 18 12
20 15

Graph coloring problem is a special case of graph labeling. In this problem, each node is assigned a color. But the coloring has a constraint: we cannot use the same color for any two adjacent vertices.

Fig. 4.7.
To solve this problem we use a greedy algorithm, but it does not guarantee using the minimum number of colors.

Input and Output


Input:
Adjacency matrix of the graph.
00101
00111
11010
01101
11010
Output:
Node: 0, Assigned with Color: 0
Node: 1, Assigned with Color: 0
Node: 2, Assigned with Color: 1
Node: 3, Assigned with Color: 2
Node: 4, Assigned with Color: 1

Algorithm
graphColoring(graph)

Input − The given graph.


Output − Each node with some color assigned to it.
Begin
declare a list of colors
initially set the color 0 for first node
define an array colorUsed to track which color is used, and which colors have
never used.
for all vertices i except first one, do
mark i as unassigned to any color
done
mark colorUsed to false for all vertices
for all vertices u in the graph except 1st vertex, do
for all vertex v adjacent with u, do
if color[v] is assigned, then
mark colorUsed[color[v]] := true
done
for all colors col in the color list, do
if color is not used, then
stop the loop
done
color[u] := col
for each vertex v which is adjacent with u, do
if color[v] is assigned, then
colorUsed[color[v]] := false
done
done
for all vertices u in the graph, do
display the node and its color
done
End

Example
#include<iostream>
#define NODE 6
using namespace std;
int graph[NODE][NODE] = {
{0, 1, 1, 1, 0, 0},
{1, 0, 0, 1, 1, 0},
{1, 0, 0, 1, 0, 1},
{1, 1, 1, 0, 1, 1},
{0, 1, 0, 1, 0, 1},
{0, 0, 1, 1, 1, 0}
};
void graphColoring() {
int color[NODE];
color[0] = 0; //Assign first color for the first node
bool colorUsed[NODE]; //Used to check whether color is used or not
for(int i = 1; i<NODE; i++)
color[i] = -1; //initialize all other vertices are unassigned
for(int i = 0; i<NODE; i++)
colorUsed[i] = false; //initially any colors are not chosen
for(int u = 1; u<NODE; u++) { //for all other NODE - 1 vertices
for(int v = 0; v<NODE; v++) {
if(graph[u][v]){
if(color[v] != -1) //when one color is assigned, make it unavailable
colorUsed[color[v]] = true;
}
}
int col;
for(col = 0; col<NODE; col++)

if(!colorUsed[col]) //find a color which is not assigned


break;

color[u] = col; //assign found color in the list


for(int v = 0; v<NODE; v++) { //for next iteration make color availability to
false
if(graph[u][v]) {
if(color[v] != -1)
colorUsed[color[v]] = false;
}
}
}
for(int u = 0; u<NODE; u++)
cout <<"Color: " << u << ", Assigned with Color: " <<color[u] <<endl;
}
int main() {
graphColoring();
}

Output
Node: 0, Assigned with Color: 0
Node: 1, Assigned with Color: 0
Node: 2, Assigned with Color: 1
Node: 3, Assigned with Color: 2
Node: 4, Assigned with Color: 1
Let's explore bound methods for a branch-and-bound approach to graph colouring.
Global upper bound U: as noted, we never need more than n colours, so we can use n as the initial value of U.

Lower bound function: Let S be a partial solution, in which we have assigned


colours to vertices 1, 2, ..., i. Our lower bound on the best possible solution
obtainable from S must obviously include the number of colours used so far ... can
we do better? Maybe. We can scan the vertices not yet coloured. If any of them are
adjacent to vertices covering the full set of colours used so far, then we will need at
least one more colour. If any such vertices are also adjacent to each other, we will
need at least two more colours. This test would take O(n^2) time, but since we are
doomed to exponential time in the worst case anyway, spending O(n^2) to help
reduce the tree is probably worthwhile.
So we can define L(S) = "number of colours used so far" + "number of colours we
will be forced to use in the future"
Upper bound function: Let S be a partial solution, in which we have assigned
colours to vertices 1, 2, ..., i. As a simple upper bound on the number of colours
used in the best solution obtainable from S, we could use U(S) = "number of colours
used so far" + "number of vertices not yet coloured", since the worst thing that could
happen is that each remaining vertex needs its own colour. However, we can try to
do better. If we apply any kind of heuristic (such a Greedy-style heuristic based on
"for each remaining vertex, choose the legal colour that has been used the most often
(note that the legal colours for each vertex may change as we go along)") we have a
chance of arriving at a solution that uses close to the optimal number of colours. If
we are fortunate, this will give us a relatively tight upper bound on the optimal
solution.
So we can define U(S) = "number of colours used so far" + "number of colours
used to colour the remaining vertices by applying the Greedy heuristic"
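As a sketch of how such an upper bound might be computed, the following C++ function (names ours) applies a simple first-fit variant of the greedy heuristic: it colours the vertices not yet coloured in S, giving each the first colour unused by its already-coloured neighbours, and returns the total number of colours the completed colouring uses. On the 5-vertex graph from the graph colouring Input section above it uses 3 colours:

#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;

// First-fit greedy colouring of the vertices not yet coloured in the
// partial solution S, usable as an upper bound U(S). color[v] == -1
// means v is uncoloured; adj is the adjacency matrix.
int greedyUpperBound(const vector<vector<int>>& adj, vector<int> color) {
    int n = adj.size(), maxColor = -1;
    for (int v = 0; v < n; v++)
        if (color[v] != -1) maxColor = max(maxColor, color[v]);
    for (int v = 0; v < n; v++) {
        if (color[v] != -1) continue;          // already coloured in S
        vector<bool> used(n, false);
        for (int u = 0; u < n; u++)            // colours taken by neighbours
            if (adj[v][u] && color[u] != -1) used[color[u]] = true;
        int c = 0;
        while (used[c]) c++;                   // first legal colour
        color[v] = c;
        maxColor = max(maxColor, c);
    }
    return maxColor + 1;                       // colours are numbered from 0
}

int main() {
    vector<vector<int>> adj = {
        {0, 0, 1, 0, 1}, {0, 0, 1, 1, 1}, {1, 1, 0, 1, 0},
        {0, 1, 1, 0, 1}, {1, 1, 0, 1, 0}};
    vector<int> color(5, -1);                      // empty partial solution
    cout << greedyUpperBound(adj, color) << endl;  // prints 3
}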

4.3.1. SOLVING 15-PUZZLE PROBLEM


Each state of the board can be represented as a node in a graph, and two nodes are
connected with an edge if one of the board states can be turned into the other by
sliding a tile.

Fig. 4.8.
Finding the optimal solution to a board is equivalent to finding the shortest path
from the current board state to the solved board state. A common path finding
algorithm is A*, which traverses through the search space using a heuristic which
estimates the distance to the desired end state. While A* would work for solving the
15 Puzzle, its memory usage increases exponentially with the length of the solution.
This is reasonable for the 8 Puzzle, as the hardest boards take 31 moves to solve.

Fig. 4.9.

For the 15 Puzzle, this upper limit increases to 80 moves.

Fig. 4.10.
Hence, solving the 15 Puzzle with A* can require massive amounts of memory. A
better algorithm to use is a variant of A* called IDA*. Compared to A*, it is less
efficient as it can explore the same nodes multiple times, but its memory usage is
only linear to the solution length.
For a p × q puzzle, there are (pq)!/2 boards in the search space. So the search space increases super-exponentially as the board size increases. This makes optimally solving large puzzles very impractical (the 35 Puzzle is the largest analyzed size that I could find).

Algorithm:
solve = (startGrid) ->
  frontier = new PriorityQueue
  frontier.enqueue(new SolverState(startGrid, []))
  while not frontier.empty()
    curState = frontier.dequeue()
    if curState.solved
      return curState.steps
    candidateMoves = curState.grid.validMoves()
    for move in candidateMoves
      nextGrid = curState.grid.applyMove(move)
      nextSteps = curState.steps.concat([move])
      nextState = new SolverState(nextGrid, nextSteps)
      frontier.enqueue(nextState)

// C++ program to check if a given instance of N*N-1


// puzzle is solvable or not
#include <iostream>
#define N 4
using namespace std;
// A utility function to count inversions in given
// array 'arr[]'. Note that this function can be
// optimized to work in O(n Log n) time. The idea
// here is to keep code small and simple.
int getInvCount(int arr[])
{
int inv_count = 0;
for (int i = 0; i < N * N - 1; i++)
{
for (int j = i + 1; j < N * N; j++)
{
// count pairs(arr[i], arr[j]) such that
// i < j but arr[i] > arr[j]
if (arr[j] && arr[i] && arr[i] > arr[j])
inv_count++;
}
}
return inv_count;
}
// find Position of blank from bottom
int findXPosition(int puzzle[N][N])

{
// start from bottom-right corner of matrix
for (int i = N - 1; i >= 0; i--)
for (int j = N - 1; j >= 0; j--)
if (puzzle[i][j] == 0)
return N - i;
}
// This function returns true if given
// instance of N*N - 1 puzzle is solvable
bool isSolvable(int puzzle[N][N])
{
// Count inversions in given puzzle
int invCount = getInvCount((int*)puzzle);
// If grid is odd, return true if inversion
// count is even.
if (N & 1)
return !(invCount & 1);
else // grid is even
{
int pos = findXPosition(puzzle);
if (pos & 1)
return !(invCount & 1);
else
return invCount & 1;
}
}
/* Driver program to test above functions */

int main()
{
int puzzle[N][N] =
{
{12, 1, 10, 2},
{7, 11, 4, 14},
{5, 0, 9, 15}, // Value 0 is used for empty space
{8, 13, 6, 3},
};
/*
int puzzle[N][N] = {{1, 8, 2},
{0, 4, 3},
{7, 6, 5}};
int puzzle[N][N] = {
{13, 2, 10, 3},
{1, 12, 8, 4},
{5, 0, 9, 6},
{15, 14, 11, 7},
};
int puzzle[N][N] = {
{6, 13, 7, 10},
{8, 9, 11, 0},
{15, 2, 12, 5},
{14, 3, 1, 4},
};
int puzzle[N][N] = {
{3, 9, 1, 15},

{14, 11, 4, 6},


{13, 0, 10, 12},
{2, 7, 8, 5},
};
*/
isSolvable(puzzle)? cout << "Solvable":
cout << "Not Solvable";
return 0;
}
SolverState stores the current position of all the numbers in the grid and the list of steps to get there from the starting grid. PriorityQueue is responsible for making sure we always explore the lowest-cost state first. The cost of a state is the number of steps taken from the initial state plus the estimated number of steps remaining to get to the solution. This estimate (h2 from Unit 2, Topic 31) is admissible because each move can at best reduce the estimate by one.
grid.validMoves returns a list of all valid moves to make on the grid. If the empty square is in the middle of the grid, this is all four directions. If it's in a corner, there are only two valid directions.

Output
Solvable
Time Complexity: O(n^2)
Space Complexity: O(n)

4.3.2. ASSIGNMENT PROBLEM


Let there be N workers and N jobs. Any worker can be assigned to perform any
job, incurring some cost that may vary depending on the work-job assignment. It is
required to perform all jobs by assigning exactly one worker to each job and exactly
one job to each agent in such a way that the total cost of the assignment is
minimized.

Fig. 4.11. An example job assignment problem; the green values show the optimal job assignment.
Let us explore all approaches for this problem.

Solution 1: Brute Force


We generate all n! possible job assignments, compute the total cost of each, and return the least expensive assignment. Since the solution is a permutation of the n jobs, the complexity is O(n!).

Solution 2: Hungarian Algorithm


The optimal assignment can be found using the Hungarian algorithm. The Hungarian algorithm has a worst-case run-time complexity of O(n^3).

Solution 3: DFS/BFS on state space tree


A state space tree is an N-ary tree with the property that any path from the root to a leaf node holds one of many solutions to the given problem. We can perform a depth-first search on the state space tree, but successive moves can take us away from the goal rather than bringing us closer. We can also perform a breadth-first search on the state space tree.

Solution 4: Finding Optimal Solution using Branch and Bound


The search for an optimal solution can often be sped up by using an “intelligent” ranking function, also called an approximate cost function, to avoid searching sub-trees that do not contain an optimal solution. It is similar to a BFS-like search but with one major optimization: instead of following FIFO order, we choose the live node with least cost.
There are two approaches to calculate the cost function:

1. For each worker, we choose job with minimum cost from list of
unassigned jobs (take minimum entry from each row).
2. For each job, we choose a worker with lowest cost for that job from list
of unassigned workers (take minimum entry from each column).
Let’s take below example and try to calculate promising cost when Job 2 is
assigned to worker A.

Fig. 4.12.
Since Job 2 is assigned to worker A (marked in green), cost becomes 2 and Job 2
and worker A becomes unavailable (marked in red).

Fig. 4.13.
Now we assign job 3 to worker B as it has minimum cost from list of unassigned
jobs. Cost becomes 2 + 3 = 5 and Job 3 and worker B also becomes unavailable.

Fig. 4.14.

Finally, job 1 gets assigned to worker C as it has the minimum cost among unassigned jobs, and job 4 gets assigned to worker D as it is the only job left. The total cost becomes 2 + 3 + 5 + 4 = 14.

Fig. 4.15.
Below diagram shows complete search space diagram showing optimal solution
path in green.

Fig. 4.16.

Complete Algorithm:
/* findMinCost uses Least() and Add() to maintain the

list of live nodes


Least() finds a live node with least cost, deletes
it from the list and returns it
Add(x) calculates cost of x and adds it to the list
of live nodes
Implements list of live nodes as a min heap */
// Search Space Tree Node
node
{
int job_number;
int worker_number;
node parent;
int cost;
}
// Input: Cost Matrix of Job Assignment problem
// Output: Optimal cost and Assignment of Jobs
algorithm findMinCost (costMatrix mat[][])
{
// Initialize list of live nodes(min-Heap)
// with root of search tree i.e. a Dummy node
while (true)
{
// Find a live node with least estimated cost
E = Least();
// The found node is deleted from the list
// of live nodes
if (E is a leaf node)
{
printSolution();

return;
}
for each child x of E
{
Add(x); // Add x to list of live nodes;
x->parent = E; // Pointer for path to root
}
}
}

Below is its C++ implementation.


// Program to solve Job Assignment problem
// using Branch and Bound
#include <bits/stdc++.h>
using namespace std;
#define N 4

// state space tree node


struct Node
{
// stores parent node of current node
// helps in tracing path when answer is found
Node* parent;

// contains cost for ancestors nodes


// including current node
int pathCost;

// contains least promising cost



int cost;

// contain worker number


int workerID;

// contains Job ID
int jobID;

// Boolean array assigned will contains


// info about available jobs
bool assigned[N];
};

// Function to allocate a new search tree node


// Here Person x is assigned to job y
Node* newNode(int x, int y, bool assigned[],
Node* parent)
{
Node* node = new Node;

for (int j = 0; j < N; j++)


node->assigned[j] = assigned[j];
node->assigned[y] = true;

node->parent = parent;
node->workerID = x;
node->jobID = y;

return node;
}

// Function to calculate the least promising cost


// of node after worker x is assigned to job y.
int calculateCost(int costMatrix[N][N], int x,
int y, bool assigned[])
{
int cost = 0;

// to store jobs made unavailable during this estimate;
// note: every entry must start as true (an initializer of
// {true} would only set the first element)
bool available[N];
for (int j = 0; j < N; j++)
    available[j] = true;

// start from next worker


for (int i = x + 1; i < N; i++)
{
int min = INT_MAX, minIndex = -1;

// do for each job


for (int j = 0; j < N; j++)
{
// if job is unassigned
if (!assigned[j] && available[j] &&
costMatrix[i][j] < min)
{
// store job number
minIndex = j;

// store cost
min = costMatrix[i][j];

}
}

// add cost of next worker


cost += min;

// job becomes unavailable


available[minIndex] = false;
}

return cost;
}

// Comparison object to be used to order the heap


struct comp
{
bool operator()(const Node* lhs,
const Node* rhs) const
{
return lhs->cost > rhs->cost;
}
};

// print Assignments
void printAssignments(Node *min)
{
if(min->parent==NULL)
return;

printAssignments(min->parent);
cout << "Assign Worker " << char(min->workerID + 'A')
<< " to Job " << min->jobID << endl;

// Driver code
int main()
{
// x-cordinate represents a Worker
// y-cordinate represents a Job
int costMatrix[N][N] =
{
{9, 2, 7, 8},
{6, 4, 3, 7},
{5, 8, 1, 8},
{7, 6, 9, 4}
};

/* int costMatrix[N][N] =
{
{82, 83, 69, 92},
{77, 37, 49, 92},
{11, 69, 5, 86},
{ 8, 9, 98, 23}
};
*/

/* int costMatrix[N][N] =
{
{2500, 4000, 3500},
{4000, 6000, 3500},
{2000, 4000, 2500}
};*/

/*int costMatrix[N][N] =
{
{90, 75, 75, 80},
{30, 85, 55, 65},
{125, 95, 90, 105},
{45, 110, 95, 115}
};*/

cout << "


Optimal Cost is "
<< findMinCost(costMatrix);

return 0;
}

Output :
Assign Worker A to Job 1
Assign Worker B to Job 0
Assign Worker C to Job 2
Assign Worker D to Job 3
Optimal Cost is 13

4.3.3. KNAPSACK PROBLEM


The Knapsack Problem is an Optimization Problem in which we have to find an
optimal answer among all the possible combinations. In this problem, we are given a
set of items having different weights and values. We have to find the optimal
solution considering all the given items. There are three types of knapsack problems
: 0-1 Knapsack, Fractional Knapsack and Unbounded Knapsack.

What is 0-1 Knapsack Problem


In the 0-1 Knapsack Problem, we are given a Knapsack or a Bag that can hold
weight up to a certain value. We have various items that have different weights and
values associated with them. Now we have to fill the knapsack in such a way so that
the sum of the total weights of the filled items does not exceed the maximum
capacity of the knapsack and the sum of the values of the filled items is maximum.

Optimal Solution
Select item 2 and Item 3 which will give us total value of 140 + 60 = 200 which is
the maximum value we can get among all valid combinations.
The diagram above shows a Knapsack that can hold up to a maximum weight
of 30 units. We have three items namely item1, item2, and item3. The values and
weights associated with each item are shown in the diagram. Now we have to fill the
knapsack in such a way so that the sum of the values of items filled in the knapsack
is maximum. If we try out all the valid combinations in which the total weight of the
filled items is less than or equal to 30, we will see that we can get the optimal answer
by selecting item2 and item3 in the knapsack, which gives us a total value of 140 + 60 = 200.

4.3.4. ALGORITHM OF SOLVING KNAPSACK PROBLEM


 Assume a knapsack of capacity M and n items having profit pi and weight wi.
 Sort items by profit/weight ratio: pi / wi.
 Consider items in order of decreasing ratio.
 Take as much of each item as possible.
 Traverse this sorted list as:
if(wi ≤ M)

{
Xi=1;
M=M-wi;
}
else if (wi > M && M>0)
{
Xi=M/wi;
M=0;
}
else
{
Xi=0;
}

Example:-
Let us consider that the capacity of the knapsack M = 60 and the list of provided
items are shown in the following table-
Item             A     B     C     D
Profit          280   100   120   120
Weight           40    10    20    24
Ratio (pi/wi)     7    10     6     5
However, the provided items are not sorted based on pi / wi. After sorting, the items are shown in the following table.
Item             B     A     C     D
Profit          100   280   120   120
Weight           10    40    20    24
Ratio (pi/wi)    10     7     6     5
Xi                1     1   10/20   0

Solution:
After sorting all the items according to pi / wi, item B is chosen first, as the weight of B is less than the capacity of the knapsack. Next, item A is chosen, as the available capacity of the knapsack is greater than the weight of A. Then, C is chosen as the next item. However, the whole item cannot be chosen, as the remaining capacity of the knapsack is less than the weight of C. Hence, a fraction of C (i.e. (60 − 50)/20) is chosen.
Now the capacity of the knapsack equals the total weight of the selected items, so no more items can be selected. The total weight of the selected items is 10 + 40 + 20 × (10/20) = 60, and the total profit is 100 + 280 + 120 × (10/20) = 380 + 60 = 440. Hence, this is the optimal solution; we cannot gain more profit by selecting any different combination of items.
#include<stdio.h>

int main()
{
    int p[10], w[10], n, m, i, j, temp1, temp2;
    float k[10], x[10], temp3, profit;
    printf("Enter weight of knapsack\n");
    scanf("%d", &m);
    printf("Enter no. of items\n");
    scanf("%d", &n);
    printf("enter profit and weight of each item\n");
    printf("profit weight\n");
    for(i = 0; i < n; i++)
    {
        scanf("%d%d", &p[i], &w[i]);
    }
    /* value density (profit/weight ratio) of each item */
    for(i = 0; i < n; i++)
    {
        k[i] = (float)p[i] / w[i];
    }
    /* sort items in decreasing order of density (bubble sort) */
    for(i = 0; i < n - 1; i++)
    {
        for(j = 0; j < n - 1; j++)
        {
            if(k[j] < k[j+1])
            {
                temp3 = k[j];   k[j] = k[j+1];   k[j+1] = temp3;
                temp1 = p[j];   p[j] = p[j+1];   p[j+1] = temp1;
                temp2 = w[j];   w[j] = w[j+1];   w[j+1] = temp2;
            }
        }
    }
    /* greedy selection: whole items while they fit, then a fraction */
    profit = 0;
    printf("Selection of items\n");
    for(i = 0; i < n; i++)
    {
        if(w[i] <= m)
        {
            x[i] = 1;
            m = m - w[i];
        }
        else if(w[i] > m && m > 0)
        {
            x[i] = (float)m / w[i];
            m = 0;
        }
        else
        {
            x[i] = 0;
        }
        profit = profit + x[i] * p[i];
        printf("selection=%f\n", x[i]);
    }
    printf("Maximum profit is %f\n", profit);
    return 0;
}

Output:-
Enter weight of knapsack
60
Enter no. of items
4
enter profit and weight of each item
profit weight
280 40
100 10
120 20
120 24

Selection of items
selection=1.000000
selection=1.000000
selection=0.500000
selection=0.000000
Maximum profit is 440.000000

4.3.5. TRAVELLING SALESMAN PROBLEM


Given a set of cities and the distance between every pair of cities, the problem is
to find the shortest possible route that visits every city exactly once and returns to the
starting point. Note the difference between Hamiltonian Cycle and TSP. The
Hamiltonian cycle problem is to find if there exists a tour that visits every city
exactly once. Here we know that Hamiltonian Tour exists (because the graph is
complete) and in fact, many such tours exist, the problem is to find a minimum
weight Hamiltonian Cycle.

Fig. 4.17.
For example, consider the graph shown in the figure on the right side. A TSP tour
in the graph is 1 – 2 – 4 – 3 –1. The cost of the tour is 10 + 25 + 30 + 15 which is 80.
The problem is a famous NP-hard problem. There is no known polynomial-time solution for this problem. The following are different solutions for the travelling salesman problem.

Naive Solution:
1. Consider city 1 as the starting and ending point.
2. Generate all (n – 1)! Permutations of cities.
3. Calculate the cost of every permutation and keep track of the minimum
cost permutation.
4. Return the permutation with minimum cost.
Time Complexity: Θ(n!)

4.3.6. DYNAMIC PROGRAMMING


Let the given set of vertices be {1, 2, 3, 4, ..., n}. Let us consider 1 as the starting and ending point of the output. For every other vertex i (other than 1), we find the minimum cost path with 1 as the starting point, i as the ending point, and all vertices appearing exactly once. Let the cost of this path be cost(i); the cost of the corresponding cycle would be cost(i) + dist(i, 1), where dist(i, 1) is the distance from i to 1. Finally, we return the minimum of all [cost(i) + dist(i, 1)] values.
Let us define a term C(S, i) be the cost of the minimum cost path visiting each
vertex in set S exactly once, starting at 1 and ending at i. We start with all subsets of
size 2 and calculate C(S, i) for all subsets where S is the subset, then we calculate
C(S, i) for all subsets S of size 3 and so on. Note that 1 must be present in every
subset.
If the size of S is 2, then S must be {1, i}, and
    C(S, i) = dist(1, i)
Else, if the size of S is greater than 2,
    C(S, i) = min { C(S − {i}, j) + dist(j, i) }  where j ∈ S, j ≠ i and j ≠ 1.
Program for a travelling salesman problem
#include <iostream>
using namespace std;
// there are four nodes in example graph (graph is 1-based)
const int n = 4;
// give appropriate maximum to avoid overflow
const int MAX = 1000000;
// dist[i][j] represents shortest distance to go from i to j
// this matrix can be calculated for any given graph using
// all-pair shortest path algorithms
int dist[n + 1][n + 1] = {
{ 0, 0, 0, 0, 0 }, { 0, 0, 10, 15, 20 },
{ 0, 10, 0, 25, 25 }, { 0, 15, 25, 0, 30 },
{ 0, 20, 25, 30, 0 },

};
// memoization for top down recursion
int memo[n + 1][1 << (n + 1)];
int fun(int i, int mask)
{
// base case
// if only ith bit and 1st bit is set in our mask,
// it implies we have visited all other nodes already
if (mask == ((1 << i) | 3))
return dist[1][i];
// memoization
if (memo[i][mask] != 0)
return memo[i][mask];
int res = MAX; // result of this sub-problem
// we have to travel all nodes j in mask and end the
// path at ith node so for every node j in mask,
// recursively calculate cost of travelling all nodes in
// mask except i and then travel back from node j to
// node i taking the shortest path take the minimum of
// all possible j nodes
for (int j = 1; j <= n; j++)
if ((mask & (1 << j)) && j != i && j != 1)
res = std::min(res, fun(j, mask & (~(1 << i)))
+ dist[j][i]);
return memo[i][mask] = res;
}
// Driver program to test above logic
int main()
{

int ans = MAX;


for (int i = 1; i <= n; i++)
// try to go from node 1 visiting all nodes in
// between to i then return from i taking the
// shortest route to 1
ans = std::min(ans, fun(i, (1 << (n + 1)) - 1)
+ dist[i][1]);
printf("The cost of most efficient tour = %d", ans);
return 0;
}

Output
The cost of most efficient tour = 80
For a set of size n, we consider n − 2 subsets, each of size n − 1, such that no subset has the nth vertex in it. Using the above recurrence relation, we can write a dynamic programming-based solution. There are at most O(n · 2^n) subproblems, and each one takes linear time to solve. The total running time is therefore O(n^2 · 2^n). This is much less than O(n!) but still exponential. The space required is also exponential.

1. What is Backtracking
Backtracking is one of the techniques that can be used to solve the problem.
We can write the algorithm using this strategy. It uses the Brute force search to
solve the problem, and the brute force search says that for the given problem, we
try to make all the possible solutions and pick out the best solution from all the
desired solutions.
2. When to use a Backtracking algorithm?
When we have multiple choices, then we make the decisions from the
available choices. In the following cases, we need to use the backtracking
algorithm:
 A piece of sufficient information is not available to make the best choice,
so we use the backtracking strategy to try out all the possible solutions.
 Each decision leads to a new set of choices. Then again, we backtrack to
make new decisions. In this case, we need to use the backtracking
strategy.
3. List the applications of Backtracking.
 N-queen problem
 Sum of subset problem
 Graph coloring
 Hamilton cycle
4. What is meant by n-Queens problem.
This problem is to find an arrangement of N queens on a chess board, such that no queen can attack any other queen on the board. A chess queen can attack horizontally, vertically and diagonally. A binary matrix is used to display the positions of the N queens, where no queen can attack another queen.
5. State Hamiltonian Circuit Problem
In an undirected graph, a Hamiltonian path is a path that visits each vertex exactly once, and a Hamiltonian cycle (or circuit) is a Hamiltonian path such that there is an edge from the last vertex to the first vertex. In this problem, we will try to determine whether a graph contains a Hamiltonian cycle or not.

6. Write an algorithm for Subset Sum Problem


subsetSum(set, subset, n, subSize, total, node, sum)
Input − The given set and subset, size of set and subset, a total of the subset,
number of elements in the subset and the given sum.
Output − All possible subsets whose sum is the same as the given sum.
Begin
if total = sum, then
display the subset
//go for finding next subset
subsetSum(set, subset, n, subSize-1, total-set[node], node+1, sum)
return
else
for all element i in the set, do
subset[subSize] := set[i]
subSetSum(set, subset, n, subSize+1, total+set[i], i+1, sum)
done
End
7. What is a Graph coloring problem.
Graph coloring problem is a special case of graph labeling. In this problem, each node is assigned a color. But the coloring has a constraint: we cannot use the same color for any two adjacent vertices.
8. How do you solve a 15-Puzzle problem
Given a 4 × 4 board with 15 tiles (every tile has one number from 1 to 15)
and one empty space. The objective is to place the numbers on tiles in order
using the empty space. We can slide four adjacent (left, right, above and
below) tiles into the empty space.
9. State the Assignment problem
Let there be N workers and N jobs. Any worker can be assigned to perform
any job, incurring some cost that may vary depending on the work-job
assignment. It is required to perform all jobs by assigning exactly one worker to
each job and exactly one job to each agent in such a way that the total cost of the
assignment is minimized.

10. What are the two approaches to calculate the cost function?
 For each worker, we choose job with minimum cost from list of
unassigned jobs (take minimum entry from each row).
 For each job, we choose a worker with lowest cost for that job from list
of unassigned workers (take minimum entry from each column).
11. What is 0-1 Knapsack Problem
In the 0-1 Knapsack Problem, we are given a Knapsack or a Bag that can hold
weight up to a certain value. We have various items that have different weights
and values associated with them. Now we have to fill the knapsack in such a
way so that the sum of the total weights of the filled items does not exceed the
maximum capacity of the knapsack and the sum of the values of the filled items
is maximum.
12. Define a TSP.
Given a set of cities and the distance between every pair of cities, the
problem is to find the shortest possible route that visits every city exactly once
and returns to the starting point. Note the difference between the Hamiltonian
Cycle problem and TSP. The Hamiltonian cycle problem is to find whether there
exists a tour that visits every city exactly once. Here we know that a Hamiltonian
tour exists (because the graph is complete), and in fact many such tours exist;
the problem is to find a minimum-weight Hamiltonian cycle.

1. Explain the operation and working of backtracking


2. Write an algorithm for n-Queens problem
3. Summarize Hamiltonian Circuit Problem
4. Elaborate Subset Sum Problem with an example
5. Explain with an algorithm the efficiency of Graph colouring problem
6. How do you solve the 15-Puzzle problem? Elaborate
7. Elucidate the Assignment problem with an example
8. Explain the mechanism of Knapsack Problem
9. Write in detail about Travelling Salesman Problem
*******************
UNIT V
NP - COMPLETE AND
APPROXIMATION ALGORITHM
Tractable and intractable problems: Polynomial time algorithms – Venn diagram
representation - NP- algorithms - NP-hardness and NP-completeness – Bin Packing
problem - Problem reduction: TSP – 3-CNF problem. Approximation Algorithms:
TSP - Randomized Algorithms: concept and application-primality testing -
randomized quick sort - Finding kth smallest number

Tractable Problem: A problem that is solvable by a polynomial-time algorithm.


The upper bound is polynomial.
Here are examples of tractable problems (ones with known polynomial-time
algorithms):
 Searching an unordered list
 Searching an ordered list
 Sorting a list
 Multiplication of integers (even though there's a gap)
 Finding a minimum spanning tree in a graph (even though there's a gap)
Intractable Problem: a problem that cannot be solved by a polynomial-time
algorithm. The lower bound is exponential.
From a computational complexity stance, intractable problems are problems for
which there exist no efficient algorithms to solve them.
Most intractable problems have an algorithm that provides a solution, and that
algorithm is the brute-force search.
This algorithm, however, does not provide an efficient solution and is, therefore,
not feasible for computation with anything more than the smallest input.
Examples
Towers of Hanoi: we can prove that any algorithm that solves this problem must
have a worst-case running time that is at least 2^n − 1.
* List all permutations (all possible orderings) of n numbers.

Tractable and Intractable Problems

Tractable problems: the class P


The class P is the class of problems solvable by algorithms whose time-demand
is described by a polynomial function. Such problems are said to be tractable and
in the class PTIME. A problem is in P if it admits an algorithm with worst-case
time-demand in O(n^k) for some integer k.
However there are some problems for which it is known that there are no
algorithms which can solve them in polynomial time, these are referred to as
provably intractable and as being in the class EXPTIME (exponential time) or
worse. For these problems, it has been shown that the lower bound on the time-
demand of any possible algorithm to solve them is a function that grows
unreasonably fast.

Intractable problems: the class EXPTIME and beyond


A problem is in the class EXPTIME if all algorithms to solve it have a worst-case
time demand which is in O (2 ^ p(n)) for some polynomial p(n).
Higher time-complexity classes
There are other classes of problems for which the time demand cannot be bounded
above even by a function of the form 2^p(n).
In fact there are a hierarchy of these higher time-complexity classes such that a
problem within a given class is considered more intractable than all those within
lower- ranked classes.
So beyond EXPTIME we can have EXP(EXPTIME), for which the time-demands
of all known solutions are bounded above by a multiple of 2 ^ 2 ^ p(n).
All these classes of provably intractable problems, from EXPTIME upward, can
be referred to as having a super-polynomial time demand.
Fig. 5.1.
It turns out that the most interesting class of problems is a class, which lies in
some sense between the class of tractable problems P and those of the provably
intractable, super-polynomial time problems.

The classes NP and NP-Complete

The Hamiltonian Circuit Problem


A connected, undirected, unweighted graph G has a Hamiltonian circuit if there is
a way to link all of the nodes via a closed route that visits each node once, and only
once.
As the size grows the time-demand appears to scale very badly and it is strongly
believed that there are no polynomial time algorithms for this problem.

The Travelling Salesman Problem


The TSP shares the same extremely bad scaling behavior as the Hamiltonian
circuit problem, and is one of the most famous intractable problems. This graph problem is
similar to the Hamiltonian cycle problem in that it looks for a route with the same
properties as required by the Hamiltonian cycle, but now of minimal length.
The Hamiltonian cycle and Travelling Salesman Problems belong to the class of
NP- Complete, which is a subset of the larger problem class NP. NP-Complete is a
class of problems whose time-complexity is presently unknown, though strongly
believed to be super-polynomial.

5.1.1. POLYNOMIAL TIME (P-TIME) REDUCTION


A problem A reduces in polynomial time to another problem B, written as
A ≤p B
Means that there is some procedure, taking no more than polynomial time as a
function of the size of the input to A, which
 Converts an input instance of A into an input instance of B
 Allows a suitable algorithm for problem B to be executed
 Provides a mechanism whereby the output obtained by this algorithm for
problem B can be translated back into an output for problem A
The algorithm for problem B thus also provides a solution to problem A.
Moreover, A's solution will be obtained in a time which is in the same complexity
class as the algorithm which solves B, since the extra work needed to translate is just
in p-time.
Example: To show that Hamiltonian cycle is reducible to travelling salesman
decision problem.
 Take an instance of the Hamiltonian Cycle problem, say G.
 Create a new weighted graph (G′, w) as follows:
o Nodes of G′ are the same as the nodes of G
o Add extra edges so that G′ is fully connected (so that it now has
n(n – 1)/2 edges)
o Set the weights in the new graph G′ so that if an edge existed already
in G it has weight 0, otherwise it has weight 1
 Return the Travelling Salesman Decision Problem instance ((G′, w), 0)
The reduction takes time O(n^2) in the number of nodes, since the maximum
number of edges in any undirected graph is only n(n – 1)/2, and thus the number
of added edges is bounded above by this.
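A small C++ sketch of this construction (our own illustration; adj is an adjacency matrix for G, and the returned matrix is the weight function w of G′):

// Original edges of G get weight 0, added edges get weight 1, so G
// has a Hamiltonian cycle iff G' has a tour of total weight 0.
#include <bits/stdc++.h>
using namespace std;
vector<vector<int>> hamiltonianToTSP(const vector<vector<bool>>& adj)
{
    int n = (int)adj.size();
    vector<vector<int>> w(n, vector<int>(n, 0));
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (i != j)
                w[i][j] = adj[i][j] ? 0 : 1;  // keep old edges cheap
    // Feed ((G', w), 0) to any TSP decision procedure:
    // "is there a tour of total cost <= 0?"
    return w;
}

The double loop visits every vertex pair once, so the construction runs in O(n^2), matching the bound given above.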
There are three basic defining properties of problems in NP and NP-complete.
5.1.2. PROBLEMS IN NP AND NP-COMPLETE ARE VERY HARD


TO SOLVE BUT EASY TO CHECK
The problems are hard because they appear to admit only algorithms whose time-
demand behavior is exponential. However, if a solution to a yes-instance of the
problem is asserted then it can be checked in polynomial time; this ability to check a
solution for correctness in polynomial time is referred to as a short certificate for the
problem.
The class of problems with this “very hard to solve, but easy to check” property is
known as Non-deterministic Polynomial.

5.1.3. PROBLEMS IN NP-COMPLETE ARE THE HARDEST PROBLEMS IN NP


An NP-Hard problem is one to which any problem in NP can be reduced in
polynomial time:
If A is NP-hard, for all B in NP it is true that B ≤ p A.
The class NP-Complete is the class of problems within NP itself, which have this
property:
NP-Complete = NP ∩ NP-hard

5.1.4. PROBLEMS IN NP-COMPLETE STAND OR FALL TOGETHER


Any problem in the class NP-Complete can be shown to be reducible in
polynomial time to any other problem in the class. Meaning that there is a way in
which any problem A can be mapped onto any other problem B using a number of
steps taking no more than polynomial time such that a solution for B also provides a
solution for A, and that the converse can also be done.
If A, B ∈ NP-Complete then A ≤p B (A reduces in p-time to B) and B ≤p A (B
reduces in p-time to A)
It is the best-known property of problems in NP-Complete because it means that
should a polynomial-time algorithm be found for just one problem in NP-Complete,
then all NP- Complete problems would be solvable in polynomial time.

How to assign a new problem to NP-Complete?


It would need to be shown that:
A ≤ p B1 and B2 ≤ p A, for some B1, B2 in NP-Complete
Consider, for example, a simple marking algorithm M for the PATH problem
(given a directed graph G with nodes s and t, is there a path from s to t?):
stage 1 marks the start node s, stage 3 repeatedly scans the edges of G and marks
any node reachable from an already marked node, and stage 4 accepts if t is
marked. We can analyze this algorithm to show that it runs in polynomial time.
Obviously, stages 1 and 4 are executed only once. Stage 3 runs at most m times,
because each time except the last it marks an additional node in G. Thus the total
number of stages used is at most 1 + 1 + m, giving a polynomial in the size of G.
Hence M is a polynomial-time algorithm for PATH.

What is polynomial time algorithm?


A polynomial-time algorithm is one which runs in an amount of time proportional
to some polynomial value of N, where N is some characteristic of the set over which
the algorithm runs, usually its size. For example, simple sort algorithms run
approximately in time k·N^2/2, where N is the number of elements being sorted, and
k is a proportionality constant that is CPU-dependent and specific to this problem.
Multiplication of N × N matrices runs approximately in time k·N^3, where we are
talking about a different k for this different problem. If you want to get picky, there
are other terms in that polynomial, but they don't matter as much as the N^3 term.
Polynomial-time algorithms are sometimes called tractable algorithms, because
they don't take nearly forever to run. As an alternative, some algorithms run in
exponential time, for example k^N. These algorithms are sometimes called
intractable, because they take a really long time to run to completion. It is a
very well-known fact that there is no known polynomial-time solution for NP-
Complete problems, and these problems occur a lot in the real world, so there must
be a way to handle them.
Polynomial Time Approximation Scheme (PTAS) is a type of approximation
algorithm that gives the user control over accuracy, which is a desirable feature.
These algorithms take an additional parameter ε > 0 and provide a solution that is
(1 + ε)-approximate for minimization and (1 – ε)-approximate for maximization. For
example, for a minimization problem, if ε is 0.5, then the solution provided by the
PTAS algorithm is 1.5-approximate.
The running time of a PTAS must be polynomial in terms of n; however, it can be
exponential in terms of 1/ε. In PTAS algorithms, the exponent of the polynomial can
increase dramatically as ε reduces; for example, a runtime of O(n^((1/ε)!)) is a
problem. There is a stricter scheme, the Fully Polynomial Time Approximation
Scheme (FPTAS). In FPTAS, the algorithm needs to be polynomial in both the
problem size n and 1/ε.

Example (0-1 knapsack problem):


We know that 0-1 knapsack is NP-Complete. There is a DP-based pseudo-
polynomial solution for this. But if the input values are high, the solution becomes
infeasible and there is a need for an approximate solution. One approximate solution
is to use the Greedy approach (compute the value per kg for all items and put in the
item with the highest value per kg first if it fits within W), but the Greedy approach
is not a PTAS, so we don't have control over accuracy.
Below is a FPTAS solution for 0-1 Knapsack problem:

Input:
W (Capacity of Knapsack)
val[0..n – 1] (Values of Items)
wt[0..n – 1] (Weights of Items)
1. Find the maximum valued item, i.e., find maximum value in val[]. Let this
maximum value be maxVal.
2. Compute adjustment factor k for all values
k = (maxVal * ε) / n
3. Adjust all values, i.e., create a new array val'[] whose values are the original
values divided by k. Do the following for every value val[i].
val'[i] = floor(val[i] / k)
4. Run the DP-based solution for the reduced values, i.e., val'[0..n – 1], with all
other parameters the same.
The above solution works in polynomial time in terms of both n and 1/ε. The
solution provided by this FPTAS is (1 – ε)-approximate. The idea is that rounding
off some of the least significant digits of the values makes them bounded by a
polynomial in n and 1/ε.
Example:
val[] = {12, 16, 4, 8}
wt[] = {3, 4, 5, 2}
W = 10
ε = 0.5
maxVal = 16 [maximum value in val[]]
Adjustment factor, k = (16 * 0.5)/4 = 2.0
Now we apply DP based solution on below modified instance of problem.
val'[] = {6, 8, 2, 4} [ val'[i] = floor(val[i]/k) ]
wt[] = {3, 4, 5, 2}
W = 10
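A compact C++ sketch of these steps (our own illustration) is given below. It scales the values exactly as above and then, for brevity, runs the familiar weight-indexed 0/1 knapsack DP on the scaled values; a full FPTAS would use the value-indexed DP so that the table size is polynomial in n and 1/ε.

// A hedged FPTAS sketch for 0-1 knapsack, following steps 1-4 above.
#include <bits/stdc++.h>
using namespace std;
int fptasKnapsack(int W, const vector<int>& val,
                  const vector<int>& wt, double eps)
{
    int n = (int)val.size();
    int maxVal = *max_element(val.begin(), val.end());
    double k = (maxVal * eps) / n;          // adjustment factor
    vector<int> v(n);                       // scaled values val'[i]
    for (int i = 0; i < n; i++)
        v[i] = (int)floor(val[i] / k);
    // Standard 0/1 knapsack DP over weights, on the scaled values
    vector<int> dp(W + 1, 0);
    for (int i = 0; i < n; i++)
        for (int w = W; w >= wt[i]; w--)
            dp[w] = max(dp[w], dp[w - wt[i]] + v[i]);
    return dp[W];  // best total scaled value; (1 - eps)-approximate
}
int main()
{
    // The instance of the worked example above
    vector<int> val = { 12, 16, 4, 8 }, wt = { 3, 4, 5, 2 };
    cout << fptasKnapsack(10, val, wt, 0.5) << "\n";  // prints 18
    return 0;
}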

How is the solution (1 – ε) * OPT?


Here OPT is the optimal value. Let S be the set produced by the above FPTAS
algorithm and let the total value of S be val(S). We need to show that
val(S) >= (1 – ε) * OPT
Let O be the set produced by the optimal solution (the solution with total value OPT),
i.e., val(O) = OPT. First,
val(O) – k * val'(O) <= n * k
[Because val'[i] = floor(val[i]/k)]
After the dynamic programming step, we get a set that is optimal for the scaled
instance and therefore must be at least as good as choosing the set O with the smaller
profits. From that, we can conclude,
val(S) >= k * val'(S)
>= k * val'(O)
>= val(O) – n * k
= OPT – ε * maxVal [since n * k = ε * maxVal]
>= OPT – ε * OPT [OPT >= maxVal]
= (1 – ε) * OPT
Venn Diagram
A Venn diagram is used to visually represent the differences and the similarities
between two concepts. Venn diagrams are also called logic or set diagrams and are
widely used in set theory, logic, mathematics, businesses, teaching, computer
science, and statistics.
What is a Venn Diagram?
A Venn diagram is a diagram that helps us visualize the logical relationship
between sets and their elements and helps us solve examples based on these sets. A
Venn diagram typically uses intersecting and non-intersecting circles (although other
closed figures like squares may be used) to denote the relationship between sets.

Fig. 5.2. Venn Diagram


Venn Diagram Example
Let us observe a Venn diagram example. Here is the Venn diagram that shows the
correlation between the following set of numbers.
 One set contains even numbers from 1 to 25 and the other set contains the
numbers in the 5x table from 1 to 25.
 The intersecting part shows that 10 and 20 are both even numbers and
also multiples of 5 between 1 and 25.

Fig. 5.3. Venn Diagram Example


Terms Related to Venn Diagram


Let us understand the following terms and concepts related to Venn Diagram, to
understand it better.

Universal Set
Whenever we use a set, it is easier to first consider a larger set called a universal
set that contains all of the elements in all of the sets that are being considered.
Whenever we draw a Venn diagram:
 A large rectangle is used to represent the universal set and it is usually
denoted by the symbol E or sometimes U.
 All the other sets are represented by circles or closed figures within this
larger rectangle.
 Every set is the subset of the universal set U.

Fig. 5.4.
Consider the above-given image:
 U is the universal set with all the numbers 1-10, enclosed within the
rectangle.
 A is the set of even numbers 1-10, which is the subset of the universal set
U and it is placed inside the rectangle.
 All the numbers between 1-10, that are not even, will be placed outside the
circle and within the rectangle as shown above.

Subset
Venn diagrams are used to show subsets. A subset is actually a set that is
contained within another set. Let us consider the examples of two sets A and B in the
below-given figure. Here, A is a subset of B. Circle A is contained entirely within
circle B. Also, all the elements of A are elements of set B.

Fig. 5.5. Venn Diagram for Subset and Superset


This relationship is symbolically represented as A ⊆ B. It is read as A is a subset
of B or A subset B. Every set is a subset of itself. i.e. A ⊆ A. Here is another
example of subsets :
 N = set of natural numbers
 I = set of integers
 Here N ⊂ I, because all natural numbers are integers.

Venn Diagram Symbols


There are more than 30 Venn diagram symbols. We will learn about the three
most commonly used symbols in this section. They are listed below as:
Symbol                               Explanation
The union symbol – ∪                 A ∪ B is read as A union B. Elements that belong to either set A or set B or both the sets. U is the universal set.
The intersection symbol – ∩          A ∩ B is read as A intersection B. Elements that belong to both sets A and B. U is the universal set.
The complement symbol – Ac or A'     A' is read as A complement. Elements that don't belong to set A. U is the universal set.
Let us understand the concept and the usage of the three basic Venn diagram
symbols using the image given below.

Fig. 5.6.
Symbol        It refers to                                                           Total Elements (No. of students)
A ∪ C         The number of students that prefer either burger or pizza or both.     1 + 10 + 2 + 2 + 6 + 9 = 30
A ∩ C         The number of students that prefer both burger and pizza.              2 + 2 = 4
A ∩ B ∩ C     The number of students that prefer a burger, pizza as well as hotdog.  2
Ac or A'      The number of students that do not prefer a burger.                    10 + 6 + 9 = 25

Venn Diagram for Sets Operations


In set theory, we can perform certain operations on given sets. These operations
are as follows,
 Union of Set
 Intersection of set
 Complement of set
 Difference of set
Union of Sets Venn Diagram


The union of two sets A and B can be given by: A ∪ B = {x | x ∈ A or x ∈ B}.
This operation on the elements of set A and B can be represented using a Venn
diagram with two circles. The total region of both the circles combined denotes the
union of sets A and B.

Intersection of Set Venn Diagram


The intersection of sets, A and B is given by: A ∩ B = {x : x ∈ A and x ∈ B}.
This operation on set A and B can be represented using a Venn diagram with two
intersecting circles. The region common to both the circles denotes the intersection
of set A and Set B.

Complement of Set Venn Diagram


The complement of any set A can be given as A′. This represents elements that are
not present in set A and can be represented using a Venn diagram with a circle. The
region covered in the universal set, excluding the region covered by set A, gives the
complement of A.

How to Draw a Venn Diagram?


Venn diagrams can be drawn with any number of circles, but since more than three
becomes very complicated, we will usually consider only two or three circles in a
Venn diagram. Here are the easy steps to draw a Venn diagram:
 Step 1: Categorize all the items into sets.
 Step 2: Draw a rectangle and label it as per the correlation between the
sets.
 Step 3: Draw the circles according to the number of categories you have.
 Step 4: Place all the items in the relevant circles.
Example: Let us draw a Venn diagram to show categories of outdoor and indoor
for the following pets: Parrots, Hamsters, Cats, Rabbits, Fish, Goats, Tortoises,
Horses.
 Step 1: Categorize all the items into sets (Here, its pets): Indoor pets: Cats,
Hamsters, and, Parrots. Outdoor pets: Horses, Tortoises, and Goats. Both
categories (outdoor and indoor): Rabbits and Fish.
 Step 2: Draw a rectangle and label it as per the correlation between the
two sets. Here, let's label the rectangle as Pets.
 Step 3: Draw the circles according to the number of categories you have.
There are two categories in the sample question: outdoor pets and indoor
pets. So, let us draw two circles and make sure the circles overlap.

Fig. 5.7.
 Step 4: Place all the pets in the relevant circles. If there are certain pets
that fit both the categories, then place them at the intersection of sets,
where the circles overlap. Rabbits and fish can be kept as indoor and
outdoor pets, and hence they are placed at the intersection of both circles.

Fig. 5.8. Pets


 Step 5: If there is a pet that doesn't fit either the indoor or outdoor sets,
then place it within the rectangle but outside the circles.

5.3.1. VENN DIAGRAM FORMULA


For any two given sets A and B, the Venn diagram formula is used to find one of
the following: the number of elements of A, B, A U B, or A ⋂ B when the other 3
are given. The formula says:
n(A U B) = n(A) + n(B) – n (A ⋂ B)
Here, n(A) and n(B) represent the number of elements in A and B respectively.
n(A U B) and n(A ⋂ B) represent the number of elements in A U B and A ⋂ B
respectively. This formula is further extended to 3 sets as well and it says:
n (A U B U C) = n(A) + n(B) + n(C) – n(A ⋂ B) – n(B ⋂ C) – n(C ⋂ A) +
n(A ⋂ B ⋂ C)
Here is an example of Venn diagram formula.
Example: In a cricket school, 12 players like bowling, 15 like batting, and 5 like
both. How many players like either bowling or batting?
Solution:
Let A and B be the sets of players who like bowling and batting respectively. Then
n(A) = 12
n(B) = 15
n(A ⋂ B) = 5
We have to find n(A U B). Using the Venn diagram formula,
n(A U B) = n(A) + n(B) – n (A ⋂ B)
n(A U B) = 12 + 15 - 5 = 22.
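The formula can also be checked mechanically; below is a small C++ sketch (the element sets are arbitrary examples of ours, not taken from the problem above):

// Verifying n(A U B) = n(A) + n(B) - n(A ⋂ B) using std::set_union
// and std::set_intersection on two example sets.
#include <bits/stdc++.h>
using namespace std;
int main()
{
    set<int> A = { 2, 4, 6, 8, 10, 12 };
    set<int> B = { 5, 10, 15, 20 };
    vector<int> U, I;
    set_union(A.begin(), A.end(), B.begin(), B.end(),
              back_inserter(U));
    set_intersection(A.begin(), A.end(), B.begin(), B.end(),
                     back_inserter(I));
    // Both sides of the Venn diagram formula agree:
    cout << U.size() << " == "
         << A.size() + B.size() - I.size() << "\n";  // 9 == 9
    return 0;
}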

5.3.2. APPLICATIONS OF VENN DIAGRAM


There are several advantages to using Venn diagrams. Venn diagram is used to
illustrate concepts and groups in many fields, including statistics, linguistics, logic,
education, computer science, and business.
 We can visually organize information to see the relationship between sets
of items, such as commonalities and differences, and to depict the relations
for visual communication.
 We can compare two or more subjects and clearly see what they have in
common versus what makes them different. This might be done for
selecting an important product or service to buy.
 Mathematicians also use Venn diagrams in math to solve complex
equations.
 We can use Venn diagrams to compare data sets and to find correlations.
 Venn diagrams can be used to reason through the logic behind statements
or equations.

There are computational problems that cannot be solved by algorithms even with
unlimited time. An example is the Turing Halting problem (given a program and an
input, decide whether the program will eventually halt when run with that input, or
will run forever). Alan Turing proved that a general algorithm to solve the halting
problem for all possible program-input pairs cannot exist. A key part of the proof is
that the Turing machine was used as a mathematical definition of a computer and
program.
The status of NP-Complete problems is another open question: NP-complete
problems are problems whose status is unknown. No polynomial-time algorithm has
yet been discovered for any NP-complete problem, nor has anybody yet been able to
prove that no polynomial-time algorithm exists for any of them. The interesting part
is that if any one of the NP-complete problems can be solved in polynomial time,
then all of them can be solved.

What are NP, P, NP-complete, and NP-Hard Problems?


P is a set of problems that can be solved by a deterministic Turing machine in
Polynomial-time.
NP is a set of decision problems that can be solved by a Non-deterministic Turing
Machine in Polynomial-time. P is a subset of NP (any problem that can be solved by
a deterministic machine in polynomial time can also be solved by a non-deterministic
machine in polynomial time).
Informally, NP is a set of decision problems that can be solved in polynomial
time via a "Lucky Algorithm", a magical algorithm that always makes a right guess
among the given set of choices.
NP-complete problems are the hardest problems in the NP set. A decision
problem L is NP-complete if:
1. L is in NP (Any given solution for NP-complete problems can be verified
quickly, but there is no efficient known solution).
2. Every problem in NP is reducible to L in polynomial time (Reduction is
defined below).
A problem is NP-Hard if it follows property 2 mentioned above, and doesn't need
to follow property 1. Therefore, the NP-Complete set is also a subset of the NP-Hard
set.

Fig. 5.9.

In computer science, there exist some problems whose solutions are not yet found;
these problems are divided into classes known as Complexity Classes. In complexity
theory, a Complexity Class is a set of problems with related complexity. These
classes help scientists to group problems based on how much time and space they
require to solve them and verify the solutions. It is the branch of the theory of
computation that deals with the resources required to solve a problem.
The common resources are time and space, meaning how much time the algorithm
takes to solve a problem and the corresponding memory usage. The time
complexity of an algorithm is used to describe the number of steps required to
solve a problem, but it can also be used to describe how long it takes to verify the
answer. The space complexity of an algorithm describes how much memory is
required for the algorithm to operate.

P Class
The P in the P class stands for Polynomial Time. It is the collection of decision
problems (problems with a "yes" or "no" answer) that can be solved by a
deterministic machine in polynomial time.

Features:
1. The solution to P problems is easy to find.
2. P is often a class of computational problems that are solvable and
tractable. Tractable means that the problems can be solved in theory as
well as in practice. But the problems that can be solved in theory but not in
practice are known as intractable.
This class contains many natural problems like:
1. Calculating the greatest common divisor.
2. Finding a maximum matching.
3. Decision versions of linear programming.
NP Class
The NP in NP class stands for Non-deterministic Polynomial Time. It is the
collection of decision problems that can be solved by a non-deterministic machine
in polynomial time.
Features:
1. The solutions of the NP class are hard to find since they are being solved
by a non-deterministic machine but the solutions are easy to verify.
2. Problems of NP can be verified by a Turing machine in polynomial time.
Example:
Let us consider an example to better understand the NP class. Suppose there is a
company having a total of 1000 employees having unique employee IDs. Assume
that there are 200 rooms available for them. A selection of 200 employees must be
paired together, but the CEO of the company has the data of some employees who
can't work in the same room due to some personal reasons.
This is an example of an NP problem. Since it is easy to check if the given
choice of 200 employees proposed by a coworker is satisfactory or not i.e. no pair
taken from the coworker list appears on the list given by the CEO. But generating
such a list from scratch seems to be so hard as to be completely impractical. It
indicates that if someone can provide us with the solution to the problem, we can
find the correct and incorrect pair in polynomial time. Thus for the NP class
problem, the answer is possible, which can be calculated in polynomial time.
This class contains many problems that one would like to be able to solve
effectively:
1. Boolean Satisfiability Problem (SAT).
2. Hamiltonian Path Problem.


3. Graph coloring.

Co-NP Class
Co-NP stands for the complement of NP Class. It means if the answer to a
problem in Co-NP is No, then there is proof that can be checked in polynomial time.

Features:
1. If a problem X is in NP, then its complement X' is also in Co-NP.
2. For an NP and CoNP problem, there is no need to verify all the answers at
once in polynomial time, there is a need to verify only one particular
answer “yes” or “no” in polynomial time for a problem to be in NP or
CoNP.
Some example problems for Co-NP are:
1. To check whether a number is prime.
2. Integer Factorization.

NP-hard class
An NP-hard problem is at least as hard as the hardest problem in NP and it is the
class of the problems such that every problem in NP reduces to NP-hard.

Features:
1. All NP-hard problems are not in NP.
2. It takes a long time to check them. This means if a solution for an NP-
hard problem is given then it takes a long time to check whether it is
right or not.
3. A problem A is in NP-hard if, for every problem L in NP, there exists a
polynomial-time reduction from L to A.
Some of the examples of problems in NP-hard are:
1. Halting problem.
2. Quantified Boolean formulas.
3. No Hamiltonian cycle.
NP-complete class
A problem is NP-complete if it is both NP and NP-hard. NP-complete problems
are the hard problems in NP.

Features:
1. NP-complete problems are special as any problem in NP class can be
transformed or reduced into NP-complete problems in polynomial time.
2. If one could solve an NP-complete problem in polynomial time, then one
could also solve any NP problem in polynomial time.
Some example problems include:
1. Decision version of 0/1 Knapsack.
2. Hamiltonian Cycle.
3. Satisfiability.
4. Vertex cover.

Complexity
Class Characteristic feature
P Easily solvable in polynomial time.
NP Yes, answers can be checked in polynomial time.
Co-NP No, answers can be checked in polynomial time.
All NP-hard problems are not in NP and it takes a long
NP-hard time to check them.
NP-complete A problem that is NP and NP-hard is NP-complete.

Given n items of different weights and bins each of capacity c, assign each item
to a bin such that the total number of used bins is minimized. It may be assumed
that all items have weights smaller than the bin capacity.

Example:
Input: weight[] = {4, 8, 1, 4, 2, 1}
Bin Capacity c = 10

Output: 2
We need minimum 2 bins to accommodate all items
First bin contains {4, 4, 2} and second bin {8, 1, 1}
Input: weight[] = {9, 8, 2, 2, 5, 4}
Bin Capacity c = 10

Output: 4
We need minimum 4 bins to accommodate all items.
Input: weight[] = {2, 5, 4, 7, 1, 3, 8};
Bin Capacity c = 10

Output: 3

Lower Bound
We can always find a lower bound on the minimum number of bins required. The
lower bound can be given as:
Min no. of bins >= Ceil ((Total Weight) / (Bin Capacity))
In the above examples, the lower bound for the first example is
ceil((4 + 8 + 1 + 4 + 2 + 1)/10) = 2 and the lower bound for the second example is
ceil((9 + 8 + 2 + 2 + 5 + 4)/10) = 3.
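A one-function C++ sketch of this bound (our own illustration), using integer arithmetic for the ceiling:

// Lower bound on bins: ceil(total weight / bin capacity).
#include <bits/stdc++.h>
using namespace std;
int binLowerBound(const vector<int>& weight, int c)
{
    int total = accumulate(weight.begin(), weight.end(), 0);
    return (total + c - 1) / c;  // integer ceiling of total / c
}
int main()
{
    cout << binLowerBound({ 4, 8, 1, 4, 2, 1 }, 10) << "\n";  // 2
    cout << binLowerBound({ 9, 8, 2, 2, 5, 4 }, 10) << "\n";  // 3
    return 0;
}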
This problem is an NP-hard problem and finding the exact minimum number of
bins takes exponential time. Following are approximate algorithms for this
problem.

Applications
1. Loading of containers like trucks.
2. Placing data on multiple disks.
3. Job scheduling.
4. Packing advertisements in fixed length radio/TV station breaks.
5. Storing a large collection of music onto tapes/CDs, etc.
Online Algorithms
These algorithms are for Bin Packing problems where items arrive one at a time
(in unknown order), each must be put in a bin, before considering the next item.

1. Next Fit:
When processing next item, check if it fits in the same bin as the last item. Use a
new bin only if it does not.
Below is C++ implementation for this algorithm.

C++ program to find number of bins required using next fit algorithm.
#include <bits/stdc++.h>
using namespace std;
// Returns number of bins required using next fit
// online algorithm
int nextFit(int weight[], int n, int c)
{
// Initialize result (Count of bins) and remaining
// capacity in current bin.
int res = 0, bin_rem = c;
// Place items one by one
for (int i = 0; i < n; i++) {
// If this item can't fit in current bin
if (weight[i] > bin_rem) {
res++; // Use a new bin
bin_rem = c - weight[i];
}
else
bin_rem -= weight[i];
}
return res;
}
// Driver program
int main()
{
int weight[] = { 2, 5, 4, 7, 1, 3, 8 };
int c = 10;
int n = sizeof(weight) / sizeof(weight[0]);
cout << "Number of bins required in Next Fit : "
<< nextFit(weight, n, c);
return 0;
}
Output:
Number of bins required in Next Fit : 4
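Next Fit checks only the most recent bin. A natural companion heuristic, not shown in the listing above and sketched here as our own illustration, is First Fit: scan all open bins and place the item in the first one with enough room, opening a new bin only when none fits.

// First Fit heuristic sketch for bin packing.
#include <bits/stdc++.h>
using namespace std;
int firstFit(int weight[], int n, int c)
{
    int res = 0;          // number of bins opened so far
    vector<int> bin_rem;  // remaining capacity of each open bin
    for (int i = 0; i < n; i++) {
        int j;
        // Find the first bin that can accommodate weight[i]
        for (j = 0; j < res; j++) {
            if (bin_rem[j] >= weight[i]) {
                bin_rem[j] -= weight[i];
                break;
            }
        }
        // If no existing bin fits, open a new bin
        if (j == res) {
            bin_rem.push_back(c - weight[i]);
            res++;
        }
    }
    return res;
}
int main()
{
    int weight[] = { 2, 5, 4, 7, 1, 3, 8 };
    int c = 10;
    int n = sizeof(weight) / sizeof(weight[0]);
    cout << "Number of bins required in First Fit : "
         << firstFit(weight, n, c);
    return 0;
}

On this particular input First Fit also happens to use 4 bins; its advantage shows on inputs where an earlier bin still has room for a later small item.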

Reduction
Problem decomposition refers to the problem-solving process that computer
scientists apply to solve a complex problem by breaking it down into parts that can
be more easily solved. Oftentimes, this involves modifying algorithm templates and
combining multiple algorithm ideas to solve the problem at hand: all of the
design work in the projects has been designed with this process in mind.
However, modifying an algorithm template is just one way that we can solve a
problem. An alternative approach is to represent it as a different problem, one that
can be solved without modifying an algorithm. Reduction is a problem
decomposition approach where an algorithm designed for one problem can be used
to solve another problem.
1. Modify the input so that it can be framed in terms of another problem.
2. Solve the modified input using a standard (unmodified) algorithm.
3. Modify the output of the standard algorithm to solve the original problem.
We've actually already seen part of a reduction algorithm. In Asymptotic
Analysis, we discussed one method for detecting duplicates in an arbitrary (possibly
unsorted) array: dup1 considered every pair of items in quadratic time. How might
we design a faster approach for detecting duplicates? If we know an approach for
detecting duplicates in a sorted array, then detecting duplicates in an unsorted array
reduces to sorting that array and then running the much faster dup2 algorithm. In
fact, there are many problems that reduce to sorting.
Are there any duplicate keys in an array of Comparable objects? How many
distinct keys are there in an array? Which value appears most frequently? With
sorting, you can answer these questions in linearithmic time [with merge sort]: first
sort the array, then make a pass through the sorted array, taking note of duplicate
values that appear consecutively in the ordered array.
Designing a faster sorting algorithm not only helps the problem of sorting, but
also every problem that reduces to sorting.
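A minimal C++ sketch of this reduction (our own illustration): sort the array, then scan adjacent elements in one linear pass.

// Detecting duplicates reduces to sorting plus a linear scan.
#include <bits/stdc++.h>
using namespace std;
bool hasDuplicates(vector<int> a)
{
    sort(a.begin(), a.end());              // O(n log n)
    for (size_t i = 1; i < a.size(); i++)  // duplicates are adjacent
        if (a[i] == a[i - 1])
            return true;
    return false;
}
int main()
{
    cout << boolalpha
         << hasDuplicates({ 3, 1, 4, 1, 5 }) << "\n"  // true
         << hasDuplicates({ 2, 7, 1, 8 }) << "\n";    // false
    return 0;
}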

5.7.1. PROBLEM REDUCTION ALGORITHM


1. Initialize the graph to the starting node.
2. Loop until the starting node is labelled SOLVED or until its cost goes
above FUTILITY:
(i) Traverse the graph, starting at the initial node and following the
current best path and accumulate the set of nodes that are on that
path and have not yet been expanded.
(ii) Pick one of these unexpanded nodes and expand it. If there are no
successors, assign FUTILITY as the value of this node. Otherwise,
add its successors to the graph and for each of them compute f ′(n).
If f ′(n) of any node is 0, mark that node as SOLVED.
(iii) Change the f '(n) estimate of the newly expanded node to reflect the
new information provided by its successors. Propagate this change
backwards through the graph. If any node contains a successor arc
whose descendants are all solved, label the node itself as SOLVED.
Genetic algorithms are heuristic search algorithms inspired by the process that
supports the evolution of life. The algorithm is designed to replicate the natural
selection process to carry generations, i.e. survival of the fittest of beings. Standard
genetic algorithms are divided into five phases, which are:
1. Creating the initial population.
2. Calculating fitness.
3. Selecting the best genes.
4. Crossing over.
5. Mutating to introduce variations.
These algorithms can be implemented to find a solution to optimization
problems of various types. One such problem is the Traveling Salesman Problem.
The problem says that a salesman is given a set of cities; he has to find the shortest
route so as to visit each city exactly once and return to the starting city.
Approach: In the following implementation, cities are taken as genes, a string
generated using these characters is called a chromosome, and a fitness score, which
is equal to the path length through all the cities mentioned, is used to rank the
population.
The Fitness Score is defined as the length of the path described by the gene. The
smaller the path length, the fitter the gene. The fittest of all the genes in the gene
pool survive the population test and move on to the next iteration. The number of
iterations depends upon the value of a cooling variable. The value of the cooling
variable keeps on decreasing with each iteration and reaches a threshold after a
certain number of iterations.

Algorithm:
1. Initialize the population randomly.
2. Determine the fitness of the chromosome.
3. Until done repeat:
1. Select parents.
2. Perform crossover and mutation.
3. Calculate the fitness of the new population.
4. Append it to the gene pool.


procedure GA {
   Set cooling parameter t = 0;
   Evaluate population P(t);
   While (Not Done) {
      Parents(t) = Select_Parents(P(t));
      Offspring(t) = Procreate(Parents(t));
      P(t+1) = Select_Survivors(P(t), Offspring(t));
      t = t + 1;
   }
}

How the mutation works?


Suppose there are 5 cities: 0, 1, 2, 3, 4. The salesman is in city 0 and he has to find
the shortest route to travel through all the cities back to the city 0. A chromosome
representing the path chosen can be represented as:

Fig. 5.10.
This chromosome undergoes mutation. During mutation, the position of two cities
in the chromosome is swapped to form a new configuration, except the first and the
last cell, as they represent the start and endpoint.

Fig. 5.11.
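A tiny C++ sketch of this swap mutation (our own illustration; the chromosome is a string of city labels whose first and last cells are fixed):

// Swap two interior cities of the chromosome; the first and last
// cells (the fixed start/end city) are never touched.
#include <bits/stdc++.h>
using namespace std;
string mutate(string gnome)
{
    int len = gnome.size();
    // Pick two positions in [1, len - 2]; if they happen to
    // coincide, the chromosome is returned unchanged
    int r1 = 1 + rand() % (len - 2);
    int r2 = 1 + rand() % (len - 2);
    swap(gnome[r1], gnome[r2]);
    return gnome;
}
int main()
{
    srand(time(NULL));
    // A path over 5 cities starting and ending at city 0
    cout << mutate("012340") << "\n";
    return 0;
}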
The original chromosome had a path length equal to INT_MAX, according to the
input defined below, since the path between city 1 and city 4 didn't exist. After
mutation, the new child formed has a path length equal to 21, which is a much more
optimized answer than the original assumption. This is how the genetic algorithm
optimizes solutions to hard problems.

Example:
Given a set of cities and the distance between each pair of cities, the travelling
salesman problem finds the path between these cities such that it is the shortest path
and traverses every city once, returning back to the starting point.
Problem – Given a graph G(V, E), the problem is to determine if the graph has
a TSP consisting of cost at most K.

Explanation –
In order to prove the Travelling Salesman Problem is NP-Hard, we will have to
reduce a known NP-Hard problem to this problem. We will carry out a reduction
from the Hamiltonian Cycle problem to the Travelling Salesman problem.
Every instance of the Hamiltonian Cycle problem, consisting of a graph G = (V, E)
as the input, can be converted to an instance of the Travelling Salesman problem
consisting of a graph G′ = (V′, E′) and the maximum cost K. We will construct the
graph G′ in the following way:
For all the edges e belonging to E, add the cost of edge c(e) = 1. Connect the
remaining edges, e′ belonging to E′, that are not present in the original graph G, each
with a cost c(e′) = 2.
And, set K = n, the number of vertices of G.
The new graph G′ can be constructed in polynomial time by just converting G to a
complete graph G′ and adding the corresponding costs. This reduction can be proved
by the following two claims:
 Let us assume that the graph G contains a Hamiltonian Cycle, traversing
all the vertices V of the graph. Then these vertices form a TSP of cost n,
since the tour uses all the edges of the original graph, having cost c(e) =
1. And, since it is a cycle, it returns back to the original vertex.
 Conversely, assume that the graph G′ contains a TSP of cost at most
K = n. The TSP traverses all the vertices of the graph, returning to the
original vertex. Since none of the vertices are excluded from the graph
and the cost sums to n, it necessarily uses only the edges of the graph
present in E, with cost 1, hence forming a Hamiltonian cycle within the
graph G.
Fig. 5.12.
Thus we can say that the graph G′ contains a TSP of cost at most n if and only if
the graph G contains a Hamiltonian Cycle. Therefore, any instance of the
Hamiltonian cycle problem can be reduced to an instance of the travelling salesman
problem. Thus, the TSP is NP-Hard.

The point of 3CNF is that it is a "normal form" for formulas: as you mention,
every formula is equivalent, up to a quadratic (linear?) blowup, to a 3CNF formula.
3CNF formulas are "simple", and so easier to deal with. In particular, if you ever
read about NP-completeness, you will find out that we want to put our
"challenging problems" in as simple a form as possible. This makes it easier both to
design and analyze algorithms solving these problems, and to prove that other
problems are also difficult by reducing 3CNF to them (showing how to solve 3CNF
using an algorithm for them).
We care specifically about 3CNF for historical reasons: it was the first (or one of
the first) NP-complete problems (check out Cook's paper or Karp's paper on their
respective pages). Also, 2CNF is not "complete" (arbitrary formulas are not
equivalent to a 2CNF), and it is easy to determine whether a 2CNF formula is
satisfiable or not.
The conversion from CNF to 3CNF is best explained by an example. We convert
each clause separately. The clause A ∨ B ∨ C ∨ D ∨ E is equivalent to the 3CNF
(A ∨ B ∨ x1) ∧ (¬x1 ∨ C ∨ x2) ∧ (¬x2 ∨ D ∨ E),
in the sense that the original formula (in this case, a single clause) is satisfiable iff
the converted formula is. You convert each of the clauses, and take their conjunction.
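A short C++ sketch of this clause-splitting step (our own illustration, encoding a literal as a signed integer in DIMACS style: +v for the variable v, −v for its negation):

// Split a long clause into a chain of 3-literal clauses using
// fresh variables, as in the A ∨ B ∨ C ∨ D ∨ E example above.
#include <bits/stdc++.h>
using namespace std;
vector<vector<int>> to3CNF(vector<int> clause, int& nextVar)
{
    vector<vector<int>> out;
    while (clause.size() > 3) {
        int x = nextVar++;  // fresh variable x
        // Emit (l1 v l2 v x) and continue with (~x v l3 v ... v lk)
        out.push_back({ clause[0], clause[1], x });
        vector<int> rest = { -x };
        rest.insert(rest.end(), clause.begin() + 2, clause.end());
        clause = rest;
    }
    out.push_back(clause);  // at most 3 literals remain
    return out;
}
int main()
{
    int nextVar = 6;  // variables 1..5 stand for A..E
    for (auto& c : to3CNF({ 1, 2, 3, 4, 5 }, nextVar)) {
        for (int lit : c)
            cout << lit << " ";
        cout << "\n";  // prints the three clauses shown above
    }
    return 0;
}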
Suppose you have an arbitrary formula using the connectives ¬, ∨, ∧. Picture it as
a tree, where inner nodes are labeled with ¬, ∨, ∧, and each node has either one (¬)
or two (∨, ∧) children. The first step is "pushing the negations to the leaves" using
de Morgan's rules (q.v.). That rids us of all ¬ nodes, but now leaves may be literals
(variables or negations of variables). Now we convert the formula to a CNF
recursively.
Denote by φ the function we're constructing, that takes a formula as above
(with ∧, ∨ and possibly negated variables) and returns a CNF. For the base case,
we have φ(x) = x and φ(¬x) = ¬x. For a formula of the form A ∧ B, we don't have
to work hard: we define φ(A ∧ B) = φ(A) ∧ φ(B). The hardest case is formulas of
the form A ∨ B: a somewhat economical choice is
φ(A ∨ B) = (y ∨ φ(A)) ∧ ((¬y) ∨ φ(B)),
where y is a new variable, and y ∨ φ(A) means adding y to all the clauses of φ(A).
General formulas with arbitrary connectives can be handled by expressing the
connectives using ∨, ∧, ¬, for example by writing a truth table; this can be
quite wasteful, though.
The entire construction results in a quadratic blowup in formula size (your
"occurrences of variables"; there are many other ways to define formula size).
However, the result you're quoting is only a linear blowup (indeed, by a factor
of 24). It is entirely possible that such a construction exists, but I'm not aware of it;
perhaps one of the readers is. The SAT to 3SAT part has a linear blowup with a
factor of 3.
If you do not allow adding new variables, then no simple conversion is possible.
While it is always possible to convert an arbitrary formula into a CNF (using the
"truth table" approach), not every formula is convertible to a 3CNF without adding
new variables. An example is parity on 4 variables. Also, the conversion from an
arbitrary formula to a CNF can result in exponential blow-up; for example, for parity
on n variables we go from Θ(n^2) to Θ(n·2^n).

Here, we consider several approximation algorithms, a small sample of dozens of
such algorithms suggested over the years for this famous problem. First let us answer
the question of whether we should hope to find a polynomial-time approximation
algorithm with a finite performance ratio on all instances of the traveling salesman
problem. As the following theorem [Sah76] shows, the answer turns out to be no,
unless P = NP.

THEOREM 1
If P ≠ NP, there exists no c-approximation algorithm for the traveling salesman
problem, i.e., there exists no polynomial-time approximation algorithm for this
problem such that for all instances
f(sa) ≤ c · f(s*)
for some constant c.

PROOF
By way of contradiction, suppose that such an approximation algorithm A and a
constant c exist. (Without loss of generality, we can assume that c is a positive
integer.) We will show that this algorithm could then be used for solving the
Hamiltonian circuit problem in polynomial time. We will take advantage of a
variation of the transformation used in Section 11.3 to reduce the Hamiltonian circuit
problem to the traveling salesman problem. Let G be an arbitrary graph
with n vertices. We map G to a complete weighted graph G′ by assigning weight 1 to
each edge in G and adding an edge of weight cn + 1 between each pair of vertices not
adjacent in G. If G has a Hamiltonian circuit, its length in G′ is n; hence, it is the
exact solution s* to the traveling salesman problem for G′.
Note that if sa is an approximate solution obtained for G′ by algorithm A, then
f(sa) ≤ cn by the assumption. If G does not have a Hamiltonian circuit, the
shortest tour in G′ will contain at least one edge of weight cn + 1, and hence
f(sa) ≥ f(s*) > cn. Taking into account the two derived inequalities, we could solve the
Hamiltonian circuit problem for graph G in polynomial time by mapping G to G′,
applying algorithm A to get tour sa in G′, and comparing its length with cn. Since the
Hamiltonian circuit problem is NP-complete, we have a contradiction unless P = NP.

Nearest-neighbor algorithm
The following well-known greedy algorithm is based on the nearest-
neighbor heuristic: always go next to the nearest unvisited city.
Step 1 Choose an arbitrary city as the start.
Step 2 Repeat the following operation until all the cities have been
visited: go to the unvisited city nearest the one visited last (ties can
be broken arbitrarily).
Step 3 Return to the starting city.
EXAMPLE 1 For the instance represented by the graph in Fig. 5.13, with a as the
starting vertex, the nearest-neighbor algorithm yields the tour (Hamiltonian
circuit) sa: a − b − c − d − a of length 10.

Fig. 5.13. Instance of the travelling salesman problem


The optimal solution, as can be easily checked by exhaustive search, is the
tour s*: a − b − d − c − a of length 8. Thus, the accuracy ratio of this approximation
is
r(sa) = f(sa) / f(s*) = 10 / 8 = 1.25
(i.e., tour sa is 25% longer than the optimal tour s*).


Unfortunately, except for its simplicity, not many good things can be said about
the nearest-neighbor algorithm. In particular, nothing can be said in general about the
accuracy of solutions obtained by this algorithm because it can force us to traverse a
very long edge on the last leg of the tour. Indeed, if we change the weight of edge (a,
d) from 6 to an arbitrarily large number w ≥ 6 in Example 1, the algorithm will still
yield the tour a − b − c − d − a of length 4 + w, and the optimal solution will still
be a − b − d − c − a of length 8. Hence,
r(sa) = f(sa) / f(s*) = (4 + w) / 8
which can be made as large as we wish by choosing an appropriately large value
of w. Hence, RA = ∞ for this algorithm (as it should be according to Theorem 1).
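A runnable C++ sketch of the nearest-neighbor heuristic follows (our own illustration). The distance matrix assumes the weights a−b = 1, b−c = 2, c−d = 1, a−c = 3, b−d = 3, a−d = 6, which are consistent with the tour lengths 10 and 8 quoted above for Fig. 5.13.

// Nearest-neighbor tour on a complete graph given as a distance
// matrix; ties are broken by the lower vertex index.
#include <bits/stdc++.h>
using namespace std;
int nearestNeighborTour(const vector<vector<int>>& d, int start)
{
    int n = (int)d.size(), cur = start, length = 0;
    vector<bool> visited(n, false);
    visited[start] = true;
    for (int step = 1; step < n; step++) {
        int next = -1;
        for (int v = 0; v < n; v++)  // nearest unvisited city
            if (!visited[v] && (next == -1 || d[cur][v] < d[cur][next]))
                next = v;
        length += d[cur][next];
        visited[next] = true;
        cur = next;
    }
    return length + d[cur][start];   // return to the starting city
}
int main()
{
    // Vertices 0..3 stand for a, b, c, d
    vector<vector<int>> d = { { 0, 1, 3, 6 },
                              { 1, 0, 2, 3 },
                              { 3, 2, 0, 1 },
                              { 6, 3, 1, 0 } };
    cout << nearestNeighborTour(d, 0) << "\n";  // a-b-c-d-a = 10
    return 0;
}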

What is a Randomized Algorithm?


An algorithm that uses random numbers to decide what to do next anywhere in
its logic is called a Randomized Algorithm. For example, in Randomized Quick
Sort, we use a random number to pick the next pivot (or we randomly shuffle the
array). And in Karger‟s algorithm, we randomly pick an edge.

How to analyse Randomized Algorithms?


Some randomized algorithms have deterministic time complexity. For
example, Karger's algorithm has time complexity O(E). Such algorithms are
called Monte Carlo Algorithms and are easier to analyse for the worst case.
On the other hand, the time complexity of other randomized algorithms (other
than Monte Carlo) is dependent on the value of the random variable. Such
randomized algorithms are called Las Vegas Algorithms. These algorithms are
typically analysed for the expected worst case. To compute the expected time taken
in the worst case, all possible values of the used random variable need to be
considered, and the time taken for every possible value needs to be evaluated. The
average of all evaluated times is the expected worst-case time complexity. The facts
below are generally helpful in the analysis of such algorithms.

5.11.1. LINEARITY OF EXPECTATION

Expected Number of Trials until Success


For example, consider below a randomized version of QuickSort.
A Central Pivot is a pivot that divides the array in such a way that each side has
at least 1/4 of the elements.
// Sorts an array arr[low..high]


randQuickSort(arr[], low, high)
1. If low >= high, then EXIT.
2. While pivot 'x' is not a Central Pivot.
(i) Choose uniformly at random a number from [low..high].
Let the randomly picked number be x.
(ii) Count elements in arr[low..high] that are smaller
than arr[x]. Let this count be sc.
(iii) Count elements in arr[low..high] that are greater
than arr[x]. Let this count be gc.
(iv) Let n = (high – low + 1). If sc >= n / 4 and
gc >= n / 4, then x is a central pivot.
3. Partition arr[low..high] around the pivot x.
4. // Recur for smaller elements
randQuickSort(arr, low, sc-1)
5. // Recur for greater elements
randQuickSort(arr, high-gc+1, high)
The important thing in our analysis is that the time taken by step 2 is O(n).

How many times does the while loop run before finding a central pivot?
The probability that the randomly chosen element is a central pivot is 1/2, since
half of the elements (those whose rank lies between n/4 and 3n/4) qualify.
Therefore, the expected number of times the while loop runs is 2 (the expected
number of trials until success).
Thus, the expected time complexity of step 2 is O(n).
What is the overall Time Complexity in the Worst Case?
In the worst case, each partition divides the array such that one side has n/4
elements and the other side has 3n/4 elements. The worst-case height of the
recursion tree is log_{4/3} n, which is O(log n).
T(n) < T(n/4) + T(3n/4) + O(n)
T(n) < 2T(3n/4) + O(n)
The solution of the above recurrence is O(n log n)
Note that the above randomized algorithm is not the best way to implement
randomized Quick Sort. The idea here is to simplify the analysis as it is simple to
analyse.
Typically, randomized Quick Sort is implemented by randomly picking a pivot
(no loop), or by shuffling the array elements. The expected worst-case time
complexity of this version is also O(n log n), but the analysis is more complex;
the MIT lectures on quicksort make the same point.

5.11.2. PRIMALITY TESTING


A primality test is an algorithm to decide whether an input number is prime. Some
primality tests are deterministic: they always correctly decide if a number is prime
or composite. The fastest known deterministic primality test was invented in 2004,
when three computer scientists, Agrawal, Kayal, and Saxena, invented the AKS
primality test, which operates in Õ(log(n)^6) time, where Õ(f(n)) is represented as
O(f(n)·log(f(n))^k) for some integer k [1]. Although a significant breakthrough, this
speed is rather slow when compared to information security requirements.
The advantage of prime numbers is that they are utilized in cryptography. One of
the standard cryptosystems, the RSA algorithm, needs a prime number as key, which
is generally over 1024 bits, to provide higher security. When handling such large
numbers, the trivial division method is clearly of no use: it is not simple to work
with such large numbers, particularly when the operations performed are / and %
at the time of primality testing.
There are the following types of Primality Testing which are as follows −
 Deterministic Algorithm − A deterministic primality testing algorithm
accepts an integer and always outputs a prime or a composite. This
algorithm always gives a correct answer.
 Divisibility Algorithm − The simplest primality test is as follows −
Given an input number n, check whether any integer from 2 to n − 1
divides n. If n is divisible by any such m, then n is composite, otherwise
it is prime. However, instead of testing all m up to n − 1, it is only
necessary to test m up to √n: if n is composite, then it can be factored
into two values, at least one of which must be less than or equal to √n.
 Probabilistic Algorithm − A probabilistic algorithm provides an answer
that is correct most of the time, but not all of the time. These tests decide
whether n satisfies one or more conditions that all primes must satisfy. A
probabilistic algorithm returns either a prime or a composite based on
the following rules −
o If the integer to be tested is actually a prime, the algorithm
definitely returns a prime.
o If the integer to be tested is actually a composite, it returns a
composite with probability 1 − ε, but it may return a prime with
probability ε. The probability of error can be reduced by running
the algorithm m times, which reduces it to ε^m.

Fermat's Little Theorem:


If n is a prime number, then for every a, 1 < a < n − 1,
a^(n−1) ≡ 1 (mod n)
[OR]
a^(n−1) % n = 1
Example: Since 5 is prime, 2^4 ≡ 1 (mod 5) [or 2^4 % 5 = 1],
3^4 ≡ 1 (mod 5) and 4^4 ≡ 1 (mod 5)
Since 7 is prime, 2^6 ≡ 1 (mod 7),
3^6 ≡ 1 (mod 7), 4^6 ≡ 1 (mod 7),
5^6 ≡ 1 (mod 7) and 6^6 ≡ 1 (mod 7)
If a given number is prime, then this method always returns true. If the given
number is composite (or non-prime), then it may return true or false, but the
probability of producing incorrect results for composite is low and can be reduced by
doing more iterations.
Below is algorithm:
// Higher value of k indicates probability of correct
// results for composite inputs become higher. For prime
// inputs, result is always correct


1) Repeat the following k times:
a) Pick a randomly in the range [2, n − 2]
b) If gcd(a, n) ≠ 1, then return false
c) If a^(n−1) ≢ 1 (mod n), then return false
2) Return true [probably prime].

C++ program for the Fermat primality test

#include <bits/stdc++.h>
using namespace std;

/* Iterative function to calculate (a^n) % p in O(log n) */
int power(int a, unsigned int n, int p)
{
    int res = 1;   // Initialize result
    a = a % p;     // Update 'a' if 'a' >= p
    while (n > 0) {
        // If n is odd, multiply 'a' with result
        if (n & 1)
            res = (res * a) % p;
        // n must be even now
        n = n >> 1;        // n = n/2
        a = (a * a) % p;
    }
    return res;
}

/* Recursive function to calculate gcd of 2 numbers */
int gcd(int a, int b)
{
    if (a < b)
        return gcd(b, a);
    else if (a % b == 0)
        return b;
    else
        return gcd(b, a % b);
}

// If n is prime, then always returns true. If n is
// composite, then returns false with high probability.
// Higher value of k increases probability of correct result.
bool isPrime(unsigned int n, int k)
{
    // Corner cases
    if (n <= 1 || n == 4)
        return false;
    if (n <= 3)
        return true;

    // Try k times
    while (k > 0) {
        // Pick a random number in [2..n-2];
        // above corner cases make sure that n > 4
        int a = 2 + rand() % (n - 4);

        // Checking if a and n are co-prime
        if (gcd(n, a) != 1)
            return false;

        // Fermat's little theorem
        if (power(a, n - 1, n) != 1)
            return false;

        k--;
    }
    return true;
}

// Driver program to test above function
int main()
{
    srand(time(NULL));  // seed the random generator
    int k = 3;
    isPrime(11, k) ? cout << " true\n" : cout << " false\n";
    isPrime(15, k) ? cout << " true\n" : cout << " false\n";
    return 0;
}

Output:
true
false
What is a Randomized Algorithm?


An algorithm that uses random numbers to decide what to do next anywhere in
its logic is called a Randomized Algorithm. For example, in Randomized Quick
Sort, we use a random number to pick the next pivot (or we randomly shuffle the
array). And in Karger‟s algorithm, we randomly pick an edge.

How to analyse Randomized Algorithms?


Some randomized algorithms have deterministic time complexity. For
example, this implementation of Karger‟s algorithm has time complexity is O(E).
Such algorithms are called Monte Carlo Algorithms and are easier to analyse for
worst case.
On the other hand, time complexity of other randomized algorithms (other than
Las Vegas) is dependent on value of random variable. Such Randomized algorithms
are called Las Vegas Algorithms. These algorithms are typically analysed for
expected worst case. To compute expected time taken in worst case, all possible
values of the used random variable needs to be considered in worst case and time
taken by every possible value needs to be evaluated. Average of all evaluated times
is the expected worst case time complexity. Below facts are generally helpful in
analysis os such algorithms.
// Sorts an array arr[low..high]
randQuickSort(arr[], low, high)
1. If low >= high, then EXIT.
2. While pivot 'x' is not a Central Pivot.
   (i)   Choose uniformly at random a number from [low..high].
         Let the randomly picked number be x.
   (ii)  Count elements in arr[low..high] that are smaller
         than arr[x]. Let this count be sc.
   (iii) Count elements in arr[low..high] that are greater
         than arr[x]. Let this count be gc.
   (iv)  Let n = (high-low+1). If sc >= n/4 and
         gc >= n/4, then x is a central pivot.
3. Partition arr[low..high] around the pivot x.
4. // Recur for smaller elements
   randQuickSort(arr, low, sc-1)
5. // Recur for greater elements
   randQuickSort(arr, high-gc+1, high)
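A quick sanity check on step 2 (our own arithmetic, not part of the pseudocode): at
least half of the positions in arr[low..high] are central pivots, namely those whose
rank lies between n/4 and 3n/4. Each random pick therefore succeeds with
probability at least 1/2, so the expected number of repetitions of step 2 is at most 2.
Since partitioning around a central pivot leaves at most 3n/4 elements on either
side, the expected recursion depth is O(log n) and the expected total time is
O(n log n).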

// C++ implementation of QuickSort
// using Lomuto's partition scheme.
#include <cstdlib>
#include <time.h>
#include <iostream>
using namespace std;
// This function takes last element
// as pivot, places
// the pivot element at its correct
// position in sorted array, and
// places all smaller (smaller than pivot)
// to left of pivot and all greater
// elements to right of pivot
int partition(int arr[], int low, int high)
{
// pivot
int pivot = arr[high];
// Index of smaller element

int i = (low - 1);


for (int j = low; j <= high - 1; j++)
{
// If current element is smaller
// than or equal to pivot
if (arr[j] <= pivot) {

// increment index of
// smaller element
i++;
swap(arr[i], arr[j]);
}
}
swap(arr[i + 1], arr[high]);
return (i + 1);
}
// Generates Random Pivot, swaps pivot with
// end element and calls the partition function
int partition_r(int arr[], int low, int high)
{
// Generate a random number in between
// low .. high
srand(time(NULL));
int random = low + rand() % (high - low);

// Swap A[random] with A[high]


swap(arr[random], arr[high]);
return partition(arr, low, high);
}
/* The main function that implements
QuickSort
arr[] --> Array to be sorted,
low --> Starting index,
high --> Ending index */
void quickSort(int arr[], int low, int high)
{
if (low < high) {
/* pi is partitioning index,
arr[p] is now
at right place */
int pi = partition_r(arr, low, high);
// Separately sort elements before
// partition and after partition
quickSort(arr, low, pi - 1);
quickSort(arr, pi + 1, high);
}
}
/* Function to print an array */
void printArray(int arr[], int size)

{
int i;
for (i = 0; i < size; i++)
cout<<arr[i]<<" ";
}
// Driver Code
int main()
{
int arr[] = { 10, 7, 8, 9, 1, 5 };
int n = sizeof(arr) / sizeof(arr[0]);
quickSort(arr, 0, n - 1);
printf("Sorted array: \n");
printArray(arr, n);
return 0;
}

Output
Sorted array:
1 5 7 8 9 10

Given an array and a number K, where K is smaller than the size of the array, find
the K'th smallest element in the given array. It is given that all array elements are
distinct.

Examples:
Input: arr[] = {7, 10, 4, 3, 20, 15}, K = 3

Output: 7
Input: arr[] = {7, 10, 4, 3, 20, 15}, K = 4
Output: 10

K’th smallest element in an unsorted array using sorting:


Sort the given array and return the element at index K – 1 in the sorted array.
Follow the given steps to solve the problem:
 Sort the input array in the increasing order
 Return the element at the K – 1 index (0 – Based indexing) in the sorted
array
Below is the Implementation of the above approach:
// C program to find K'th smallest element
#include <stdio.h>
#include <stdlib.h>
// Compare function for qsort
int cmpfunc(const void* a, const void* b)
{
return (*(int*)a - *(int*)b);
}
// Function to return K'th smallest
// element in a given array
int kthSmallest(int arr[], int N, int K)
{
// Sort the given array
qsort(arr, N, sizeof(int), cmpfunc);
// Return k'th element in the sorted array
return arr[K - 1];
}
// Driver's code
int main()

{
int arr[] = { 12, 3, 5, 7, 19 };
int N = sizeof(arr) / sizeof(arr[0]), K = 2;

// Function call
printf("K'th smallest element is %d",
kthSmallest(arr, N, K));
return 0;
}

Output
K'th smallest element is 5
Time Complexity: O(N log N)
Auxiliary Space: O(1)
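Note: a randomized selection (quickselect) approach, implemented in the practical
exercises later in this book, improves the expected running time to O(N).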

1. Differentiate Tractable and Intractable problems.


Tractable Problem: A problem that is solvable by a polynomial-time
algorithm.
The upper bound is polynomial. Here are examples of tractable problems
(ones with known polynomial-time algorithms):
Searching an unordered list
Searching an ordered list
Intractable Problem: a problem that cannot be solved by a polynomial-time
algorithm.
Most intractable problems have an algorithm that provides a solution, and that
algorithm is the brute-force search. This algorithm, however, does not provide
an efficient solution and is, therefore, not feasible for computation with anything
more than the smallest input.
Examples
Towers of Hanoi
2. Compare TSP and the Hamiltonian Circuit Problem.
The TSP shares the extremely bad scaling behavior of the Hamiltonian circuit
problem, and is one of the famous intractable problems. This graph problem is
similar to the Hamiltonian cycle problem in that it looks for a route with the
same properties as required by the Hamiltonian cycle, but now of minimal
length.
The Hamiltonian cycle and Travelling Salesman Problems belong to the class
of NP- Complete, which is a subset of the larger problem class NP. NP-
Complete is a class of problems whose time-complexity is presently unknown,
though strongly believed to be super-polynomial.
3. What is polynomial time algorithm?
A polynomial-time algorithm is one which runs in an amount of time
proportional to some polynomial value of N, where N is some characteristic of
the set over which the algorithm runs, usually its size. For example, simple sort
algorithms run approximately in time k N^2/2, where N is the number of

elements being sorted, and k is a proportionality constant that is CPU-dependent


and specific to this problem. Multiplication of N × N matrices runs
approximately in time k N^3, where we are talking about a different k for this
different problem.
4. Define a Venn Diagram.
A Venn diagram is a diagram that helps us visualize the logical relationship
between sets and their elements and helps us solve examples based on these sets.
A Venn diagram typically uses intersecting and non-intersecting circles
(although other closed figures like squares may be used) to denote the
relationship between sets.
5. What is a Subset.
Venn diagrams are used to show subsets. A subset is actually a set that is
contained within another set. Let us consider the examples of two sets A and B
in the below-given figure. Here, A is a subset of B. Circle A is contained entirely
within circle B. Also, all the elements of A are elements of set B.
6. List the Venn Diagram Symbols.
Venn Diagram Symbols and their Explanation:
 The union symbol (∪): A ∪ B is read as A union B. It denotes the elements
that belong to either set A or set B or both the sets. U is the universal set.
 The intersection symbol (∩): A ∩ B is read as A intersection B. It denotes
the elements that belong to both sets A and B. U is the universal set.
 The complement symbol (Ac or A'): A' is read as A complement. It denotes
the elements that don't belong to set A. U is the universal set.
7. List the Sets Operations for a Venn Diagram.
 Union of Set
 Intersection of set

 Complement of set
 Difference of set
8. How to Draw a Venn Diagram?
Venn diagrams can be drawn with unlimited circles. Since more than three
becomes very complicated, we will usually consider only two or three circles in
a Venn diagram. Here are the 4 easy steps to draw a Venn diagram:
 Step 1: Categorize all the items into sets.
 Step 2: Draw a rectangle and label it as per the correlation between the
sets.
 Step 3: Draw the circles according to the number of categories you have.
 Step 4: Place all the items in the relevant circles.
9. State in brief about NP- algorithms.
There are computational problems that cannot be solved by algorithms even
with unlimited time. For example, the Turing Halting problem (given a program
and an input, decide whether the program will eventually halt when run with
that input, or will run forever). Alan Turing proved that a general algorithm to
solve the halting problem for all possible program-input pairs cannot exist. A
key part of the proof is that the Turing machine was used as a mathematical
definition of a computer and program.
10. What are NP, P, NP-complete, and NP-Hard problems?
P is a set of problems that can be solved by a deterministic Turing machine in
Polynomial-time.
NP is a set of decision problems that can be solved by a Non-deterministic
Turing Machine in Polynomial-time. P is a subset of NP (any problem that can
be solved by a deterministic machine in polynomial time can also be solved by a
non-deterministic machine in polynomial time).
Informally, NP is a set of decision problems that can be solved in
polynomial time via a “Lucky Algorithm”, a magical algorithm that always
makes a right guess among the given set of choices.
11. Define a P Class.
The P in the P class stands for Polynomial Time. It is the collection of
decision problems(problems with a “yes” or “no” answer) that can be solved by
a deterministic machine in polynomial time.

12. Define a Bin Packing problem


Given n items of different weights and bins each of capacity c, assign each
item to a bin such that the total number of bins used is minimized. It may be
assumed that all items have weights smaller than the bin capacity.
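A minimal First Fit sketch of this problem (our own illustration, not from the text;
the function name firstFit and the sample weights are assumptions). It places each
item into the first open bin that still has room and opens a new bin when none fits;
First Fit is a simple approximation, not an exact solver:
#include <iostream>
#include <vector>
using namespace std;

// Returns the number of bins used by the First Fit heuristic
int firstFit(const vector<int>& weight, int c)
{
    vector<int> remaining; // free space left in each open bin
    for (int w : weight) {
        bool placed = false;
        for (int& r : remaining) { // try existing bins in order
            if (r >= w) { r -= w; placed = true; break; }
        }
        if (!placed)
            remaining.push_back(c - w); // open a new bin
    }
    return (int)remaining.size();
}

int main()
{
    vector<int> weight = { 2, 5, 4, 7, 1, 3, 8 };
    int c = 10; // bin capacity
    cout << "Bins used (First Fit): " << firstFit(weight, c) << endl;
    return 0;
}
First Fit never uses more than about twice the optimal number of bins, which is
why it is discussed among approximation algorithms.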
13. What is Problem reduction.
Problem decomposition refers to the problem-solving process that computer
scientists apply to solve a complex problem by breaking it down into parts that
can be more easily solved. Oftentimes, this involves modifying algorithm
templates and combining multiple algorithm ideas to solve the problem at hand.
Reduction is a problem decomposition approach where an algorithm designed
for one problem can be used to solve another problem, in three steps (a small
sketch follows the list):
 Modify the input so that it can be framed in terms of another problem.
 Solve the modified input using a standard (unmodified) algorithm.
 Modify the output of the standard algorithm to solve the original problem.
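As a toy illustration of these three steps (our own example; the names are ours):
deciding whether an array contains a duplicate element reduces to sorting, since
equal elements become adjacent in sorted order:
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// Step 1: the input needs no modification.
// Step 2: solve with a standard, unmodified sorting algorithm.
// Step 3: interpret the sorted output by scanning adjacent pairs.
bool hasDuplicates(vector<int> a)
{
    sort(a.begin(), a.end());
    for (size_t i = 1; i < a.size(); i++)
        if (a[i] == a[i - 1])
            return true;
    return false;
}

int main()
{
    vector<int> v = { 3, 1, 4, 1, 5 };
    cout << (hasDuplicates(v) ? "yes" : "no") << endl; // prints "yes"
    return 0;
}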
14. List down the phases of Standard genetic algorithms
Standard genetic algorithms are divided into five phases which are:
 Creating initial population.
 Calculating fitness.
 Selecting the best genes.
 Crossing over.
 Mutating to introduce variations.
15. Define a 3- CNF problem .
The point of 3CNF is that it is a "normal form" for formulas: every formula is
equivalent, up to a quadratic (or even linear) blowup, to a 3CNF formula. 3CNF
formulas are "simple", and so easier to deal with. In particular, if you ever read
about NP-completeness, you will find out that we want to put our "challenging
problems" in as simple a form as possible. This makes it easier both to design
and analyze algorithms solving these problems, and to prove that other
problems are also difficult by reducing 3CNF to them.
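As a small worked illustration (our own example, not from the text): a short clause
is padded by repeating a literal, e.g. (x ∨ y) becomes (x ∨ y ∨ y); a long clause is
split using a fresh variable, e.g. (a ∨ b ∨ c ∨ d) becomes (a ∨ b ∨ z) ∧ (¬z ∨ c ∨ d),
which is satisfiable exactly when the original clause is.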
16. What do you mean by Nearest-neighbor algorithm.
The following well-known greedy algorithm is based on the nearest-
neighbor heuristic: always go next to the nearest unvisited city.

Step 1 Choose an arbitrary city as the start.


Step 2 Repeat the following operation until all the cities have been
visited: go to the unvisited city nearest the one visited last (ties can
be broken arbitrarily).
Step 3 Return to the starting city.
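A compact sketch of these steps (our own code, not from the text; it assumes the
tour starts at city 0 and the graph is given as a complete cost matrix). The heuristic
tour it prints is not guaranteed optimal:
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    // Symmetric cost matrix of a small complete graph (assumed data)
    vector<vector<int>> cost = { { 0, 1, 3, 6 },
                                 { 1, 0, 2, 3 },
                                 { 3, 2, 0, 1 },
                                 { 6, 3, 1, 0 } };
    int n = 4, total = 0, cur = 0;
    vector<bool> visited(n, false);
    visited[0] = true; // Step 1: start at an arbitrary city (city 0)
    cout << 0;
    for (int step = 1; step < n; step++) {
        int next = -1;
        for (int j = 0; j < n; j++) // Step 2: go to the nearest unvisited city
            if (!visited[j] && (next == -1 || cost[cur][j] < cost[cur][next]))
                next = j;
        visited[next] = true;
        total += cost[cur][next];
        cout << " --> " << next;
        cur = next;
    }
    total += cost[cur][0]; // Step 3: return to the starting city
    cout << " --> 0, tour cost = " << total << endl;
    return 0;
}
On this matrix the heuristic produces the tour 0 --> 1 --> 2 --> 3 --> 0 of cost 10,
while the optimal tour costs 8; the same matrix appears (1-indexed) in the TSP
practical exercise later in this book.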
17. What is a Randomized Algorithm?
An algorithm that uses random numbers to decide what to do next anywhere
in its logic is called a Randomized Algorithm. For example, in Randomized
Quick Sort, we use a random number to pick the next pivot (or we randomly
shuffle the array). And in Karger's algorithm, we randomly pick an edge.
18. How to analyse Randomized Algorithms?
Some randomized algorithms have deterministic time complexity. For
example, this implementation of Karger's algorithm has time complexity
O(E). Such algorithms are called Monte Carlo Algorithms and are easier to
analyse for the worst case. On the other hand, the time complexity of other
randomized algorithms depends on the value of a random variable. Such
randomized algorithms are called Las Vegas Algorithms. These algorithms
are typically analysed for the expected worst case. To compute the expected
time taken in the worst case, all possible values of the random variable used
must be considered, and the time taken by every possible value must be
evaluated.
19. Discuss in brief on Primality testing
A primality test is an algorithm to decide whether an input number is prime.
Some primality tests are deterministic: they always correctly decide if a number
is prime or composite. The fastest known deterministic primality test was
invented in 2004, when three computer scientists, Agrawal, Kayal, and Saxena,
invented the AKS primality test, which runs in Õ(log(n)^6) time, where Õ(f(n))
denotes O(f(n)·log(f(n))^k) for some integer k. Although a significant
breakthrough, this speed is rather slow when compared to information security
requirements.
20. How to find the kth smallest number.
Given an array and a number K, where K is smaller than the size of the array,
find the Kth smallest element in the given array. It is given that all array
elements are distinct.
Examples:
Input: arr[] = {7, 10, 4, 3, 20, 15}, K = 3
Output: 7
Input: arr[] = {7, 10, 4, 3, 20, 15}, K = 4
Output: 10

1. Compare and contrast Part-B & C


2. Discuss on Polynomial time (p-time) reduction.
3. What is polynomial time algorithm? Explain in detail the mechanism.
4. Elaborate the representation of Venn diagram.
5. Elucidate the problems in NP- algorithms.
6. Classify in detail NP-hardness and NP-completeness.
7. Explain Bin Packing problem with an example.
8. Discuss the idea behind problem reduction.
9. Express your views on TSP and its algorithm.
10. Explain 3- CNF problem in detail.
11. Discuss the theorem behind Approximation Algorithms.
12. Elaborate the concept and application of Randomized Algorithms.
13. Explain the Fermat's Little Theorem.
14. Discuss on how to analyse Randomized Algorithms?
15. List in detail the steps to Find the kth smallest number.
*******************
PRACTICAL EXERCISES

1. Implement Linear Search. Determine the time required to search for


an element. Repeat the experiment for different values of n, the
number of elements in the list to be searched.

Program
#include<stdio.h>
#include<conio.h>
#include<time.h>
#include<stdlib.h>
#define max 20
int pos;
int linsearch (int,int[],int);
void main()
{
int ch=1; double t;
int n,i,a[max],k,op,low,high,pos;
clock_t begin,end;
clrscr();
while(ch)
{
printf("\n.......MENU \n 1. Linear search \n 2.Exit \n");
printf("\n enter your choice\n");
scanf("%d",&op);
switch(op)
{
case 1:printf("\n enter the number of elements \n");
scanf("%d",&n);

printf("\n enter the elements of an array\n");


for(i=0;i<n;i++)
scanf("%d",&a[i]);
printf("\n enter the element to be searched \n");
scanf("%d",&k);
begin=clock();
pos=linsearch(n,a,k);
end=clock();
if(pos==-1)
printf("\n\n Unsuccessful search");
else
printf("element %d is found at position %d",k,pos+1);
printf("\n Time taken is %lf CPU cycles \n",(end-begin)/CLK_TCK);
getch();
break;
default:printf("Invalid choice entered \n"); exit(0);
}
printf("\n Do you wish to run again(1/0) \n");
scanf("%d",&ch);
}
getch();
}
int linsearch(int n,int a[],int k)
{
delay(1000);
if(n<0) return -1;
if(k==a[n-1])
return (n-1);
else

return linsearch(n-1,a,k);
}

Output
.......MENU
1. Linear search
2.Exit
enter your choice
1
enter the number of elements
3
enter the elements of an array
25 69 98
enter the element to be searched
98
element 98 is found at position 3
Time taken is 1.978022 CPU cycles

2. Implement recursive Binary Search. Determine the time required to


search an element. Repeat the experiment for different values of n, the
number of elements in the list to be searched.

Program
#include<stdio.h>
#include<conio.h>
#include<time.h>
#include<stdlib.h>
#define max 20
int pos;
int binsearch(int,int[],int,int,int);
void main()

{
int ch=1;
double t;
int n,i,a[max],k,op,low,high,pos;
clock_t begin,end;
clrscr();
while(ch)
{
printf("\n.......MENU \n 1.BinarySearch \n 2.Exit \n");
printf("\n enter your choice\n"); scanf("%d",&op);
switch(op)
{
case 1:printf("\n enter the number of elments\n"); scanf("%d",&n);
printf("\n enter the number of an array in the order \n"); for(i=0;i<n;i++)
scanf("%d",&a[i]);
printf("\n enter the elements to be searched \n"); scanf("%d",&k);
low=0;high=n-1; begin=clock(); pos=binsearch(n,a,k,low,high);
end=clock();
if(pos==-1) printf("\n\nUnsuccessful search");
else
printf("\n element %d is found at position %d",k,pos+1);
printf("\n Time Taken is %lf CPU1 cycles \n",(end-begin)/CLK_TCK); getch();
break;
}
printf("\n Do you wish to run again(1/0) \n"); scanf("%d",&ch);
}
getch();
}
int binsearch(int n,int a[],int k,int low,int high)

{
int mid;
delay(1000);
mid=(low+high)/2;
if(low>high)
return -1;
if(k==a[mid])
return(mid);
else if(k<a[mid])
return binsearch(n,a,k,low,mid-1);
else
return binsearch(n,a,k,mid+1,high);
}

Output
.......MENU
1.BinarySearch
2.Exit
enter your choice
1
enter the number of elements
3
enter the elements of an array in the order
98
22
46
enter the element to be searched
22
element 22 is found at position 2
Time Taken is 1.978022 CPU cycles

3. Given a text txt [0...n – 1] and a pattern pat [0...m – 1], write a
function search (char pat [ ], char txt [ ]) that prints all occurrences of
pat [ ] in txt [ ]. You may assume that n > m.

Program (in python)


def naive_pattern_search(pat, text):
    m = len(pat)
    n = len(text)
    for i in range(n-m+1):
        flag = True
        for j in range(m):
            if(pat[j] != text[i+j]):
                flag = False
        if(flag):
            print("Pattern found at index {}".format(i))
print("Example-1:")
txt = "AABAACAADAABAABA"
pat = "AABA"
naive_pattern_search(pat, txt)
print("Example-2:")
txt = "abracadabra"
pat = "ab"
naive_pattern_search(pat, txt)

Output
Example-1:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 12

Example-2:
Pattern found at index 0
Pattern found at index 7

4. Sort a given set of elements using the Heap sort methods and
determine the time required to sort the elements. Repeat the
experiment for different values of n, the number of elements in the list
to be sorted and plot a graph of the time taken versus n.

Program
#include<stdio.h>
#include<conio.h>
#include<time.h>
void heapsort(int n,int arr[]);
void heapy(int n,int arr[]);
void adjust(int n,int arr[]);
void heapsort(int n,int arr[])
{
int i,item;
delay(100);
heapy(n,arr);
for(i=n;i>=1;i--)
{
item=arr[1];
arr[1]=arr[i];
arr[i]=item;
adjust(i,arr);
}
}
void heapy(int n,int arr[])
{
int i,j,k,item;
for(i=1;i<=n;i++)
{
item=arr[i]; j=i;
k=j/2;
while(k!=0 && item>arr[k])
{
arr[j]=arr[k]; j=k;

k=j/2;
}
arr[j]=item;
}
}
void adjust(int n,int arr[])
{
int i,j,item;
j=1;
item=arr[j];
i=j*2;
while(i<n)
{
if((i+1)<n)
{
if(arr[i]<arr[i+1])
i++;
}
if(item<arr[i])
{
arr[j]=arr[i]; j=i;
i=2*j;
}
else break;
}
arr[j]=item;
}
void main()
{
int i,n,arr[20];
clock_t end,start;
clrscr();

printf("\nEnter the number of Elements: \t");


scanf("%d",&n);
printf("\nEnter the %d Elements:",n); for(i=1;i<=n;i++)
scanf("%d",&arr[i]);
start=clock();
heapsort(n,arr);
end=clock();
printf("\nSorted Elements are\n"); for(i=1;i<=n;i++)
printf("%d ",arr[i]);
printf("\nTime taken by Heapsort %f CPU Cycle",(end-start)/CLK_TCK);
getch();
}

Output
Enter the number of Elements: 3
Enter the 3 Elements: 42
85
58
Sorted Elements are
42 58 85
Time taken by Heapsort 0.109890 CPU Cycle

1. Develop a program to implement graph traversal using Breadth First


Search

Program
#include <stdio.h>
#include <stdlib.h>
#define SIZE 40

struct queue {
int items[SIZE];
int front;
int rear;
};

struct queue* createQueue();


void enqueue(struct queue* q, int);
int dequeue(struct queue* q);
void display(struct queue* q);
int isEmpty(struct queue* q);
void printQueue(struct queue* q);

struct node {
int vertex;
struct node* next;
};

struct node* createNode(int);

struct Graph {
int numVertices;
struct node** adjLists;
int* visited;
};

// BFS algorithm
void bfs(struct Graph* graph, int startVertex) {
struct queue* q = createQueue();

graph->visited[startVertex] = 1;
enqueue(q, startVertex);

while (!isEmpty(q)) {
printQueue(q);
int currentVertex = dequeue(q);
printf("Visited %d\n", currentVertex);

struct node* temp = graph->adjLists[currentVertex];

while (temp) {
int adjVertex = temp->vertex;

if (graph->visited[adjVertex] == 0) {
graph->visited[adjVertex] = 1;
enqueue(q, adjVertex);
}
temp = temp->next;
}
}
}

// Creating a node
struct node* createNode(int v) {
struct node* newNode = malloc(sizeof(struct node));
newNode->vertex = v;
newNode->next = NULL;
return newNode;
}

// Creating a graph
struct Graph* createGraph(int vertices) {
struct Graph* graph = malloc(sizeof(struct Graph));
graph->numVertices = vertices;

graph->adjLists = malloc(vertices * sizeof(struct node*));


graph->visited = malloc(vertices * sizeof(int));

int i;
for (i = 0; i < vertices; i++) {
graph->adjLists[i] = NULL;
graph->visited[i] = 0;
}

return graph;
}

// Add edge
void addEdge(struct Graph* graph, int src, int dest) {
// Add edge from src to dest
struct node* newNode = createNode(dest);
newNode->next = graph->adjLists[src];
graph->adjLists[src] = newNode;

// Add edge from dest to src


newNode = createNode(src);
newNode->next = graph->adjLists[dest];

graph->adjLists[dest] = newNode;
}

// Create a queue
struct queue* createQueue() {
struct queue* q = malloc(sizeof(struct queue));
q->front = -1;
q->rear = -1;
return q;
}

// Check if the queue is empty


int isEmpty(struct queue* q) {
if (q->rear == -1)
return 1;
else
return 0;
}

// Adding elements into queue


void enqueue(struct queue* q, int value) {
if (q->rear == SIZE - 1)
printf("\nQueue is Full!!");
else {
if (q->front == -1)
q->front = 0;
q->rear++;
q->items[q->rear] = value;
}
}

// Removing elements from queue


int dequeue(struct queue* q) {
int item;
if (isEmpty(q)) {
printf("Queue is empty");
item = -1;
} else {
item = q->items[q->front];
q->front++;
if (q->front > q->rear) {
printf("Resetting queue ");
q->front = q->rear = -1;
}
}
return item;
}

// Print the queue


void printQueue(struct queue* q) {
int i = q->front;

if (isEmpty(q)) {
printf("Queue is empty");
} else {
printf("\nQueue contains \n");
for (i = q->front; i < q->rear + 1; i++) {
printf("%d ", q->items[i]);

}
}
}

int main() {
struct Graph* graph = createGraph(6);
addEdge(graph, 0, 1);
addEdge(graph, 0, 2);
addEdge(graph, 1, 2);
addEdge(graph, 1, 4);
addEdge(graph, 1, 3);
addEdge(graph, 2, 4);
addEdge(graph, 3, 4);

bfs(graph, 0);

return 0;
}

Output
Queue contains
0 Resetting queue Visited 0
Queue contains
2 1 Visited 2
Queue contains
1 4 Visited 1
Queue contains
4 3 Visited 4
Queue contains
3 Resetting queue Visited 3

2. Develop a program to implement graph traversal using Depth First


Search

Program
#include <stdio.h>
#include <stdlib.h>

struct node {
int vertex;
struct node* next;
};

struct node* createNode(int v);

struct Graph {
int numVertices;
int* visited;

// We need int** to store a two dimensional array.


// Similary, we need struct node** to store an array of Linked lists
struct node** adjLists;
};

// DFS algo
void DFS(struct Graph* graph, int vertex) {
struct node* adjList = graph->adjLists[vertex];
struct node* temp = adjList;

graph->visited[vertex] = 1;
printf("Visited %d \n", vertex);

while (temp != NULL) {


int connectedVertex = temp->vertex;
if (graph->visited[connectedVertex] == 0) {
DFS(graph, connectedVertex);
}
temp = temp->next;
}
}
// Create a node
struct node* createNode(int v) {
struct node* newNode = malloc(sizeof(struct node));
newNode->vertex = v;
newNode->next = NULL;
return newNode;
}
// Create graph
struct Graph* createGraph(int vertices) {
struct Graph* graph = malloc(sizeof(struct Graph));
graph->numVertices = vertices;
graph->adjLists = malloc(vertices * sizeof(struct node*));
graph->visited = malloc(vertices * sizeof(int));
int i;
for (i = 0; i < vertices; i++) {
graph->adjLists[i] = NULL;
graph->visited[i] = 0;
}
return graph;
}

// Add edge
void addEdge(struct Graph* graph, int src, int dest) {
// Add edge from src to dest
struct node* newNode = createNode(dest);
newNode->next = graph->adjLists[src];
graph->adjLists[src] = newNode;

// Add edge from dest to src


newNode = createNode(src);
newNode->next = graph->adjLists[dest];
graph->adjLists[dest] = newNode;
}
// Print the graph
void printGraph(struct Graph* graph) {
int v;
for (v = 0; v < graph->numVertices; v++) {
struct node* temp = graph->adjLists[v];
printf("\n Adjacency list of vertex %d\n ", v);
while (temp) {
printf("%d -> ", temp->vertex);
temp = temp->next;
}
printf("\n");
}
}
int main() {
struct Graph* graph = createGraph(4);
addEdge(graph, 0, 1);
addEdge(graph, 0, 2);

addEdge(graph, 1, 2);
addEdge(graph, 2, 3);
printGraph(graph);
DFS(graph, 2);
return 0;
}

Output
Adjacency list of vertex 0
2 -> 1 ->
Adjacency list of vertex 1
2 -> 0 ->
Adjacency list of vertex 2
3 -> 1 -> 0 ->
Adjacency list of vertex 3
2 ->
Visited 2
Visited 3
Visited 1
Visited 0

3. From a given vertex in a weighted connected graph, develop a


program to find the shortest paths to other vertices using Dijkstra’s
algorithm

Program
#include <stdio.h>
#define INFINITY 9999
#define MAX 10
void Dijkstra(int Graph[MAX][MAX], int n, int start);
void Dijkstra(int Graph[MAX][MAX], int n, int start) {

int cost[MAX][MAX], distance[MAX], pred[MAX];


int visited[MAX], count, mindistance, nextnode, i, j;
// Creating cost matrix
for (i = 0; i < n; i++)
for (j = 0; j < n; j++)
if (Graph[i][j] == 0)
cost[i][j] = INFINITY;
else
cost[i][j] = Graph[i][j];
for (i = 0; i < n; i++) {
distance[i] = cost[start][i];
pred[i] = start;
visited[i] = 0;
}
distance[start] = 0;
visited[start] = 1;
count = 1;
while (count < n - 1) {
mindistance = INFINITY;
for (i = 0; i < n; i++)
if (distance[i] < mindistance && !visited[i]) {
mindistance = distance[i];
nextnode = i;
}
visited[nextnode] = 1;
for (i = 0; i < n; i++)
if (!visited[i])
if (mindistance + cost[nextnode][i] < distance[i]) {
distance[i] = mindistance + cost[nextnode][i];

pred[i] = nextnode;
}
count++;
}
// Printing the distance
for (i = 0; i < n; i++)
if (i != start) {
printf("\nDistance from source to %d: %d", i, distance[i]);
}
}
int main() {
int Graph[MAX][MAX], i, j, n, u;
n = 7;
Graph[0][0] = 0;
Graph[0][1] = 0;
Graph[0][2] = 1;
Graph[0][3] = 2;
Graph[0][4] = 0;
Graph[0][5] = 0;
Graph[0][6] = 0;

Graph[1][0] = 0;
Graph[1][1] = 0;
Graph[1][2] = 2;
Graph[1][3] = 0;
Graph[1][4] = 0;
Graph[1][5] = 3;
Graph[1][6] = 0;

Graph[2][0] = 1;
Graph[2][1] = 2;
Graph[2][2] = 0;
Graph[2][3] = 1;
Graph[2][4] = 3;
Graph[2][5] = 0;
Graph[2][6] = 0;

Graph[3][0] = 2;
Graph[3][1] = 0;
Graph[3][2] = 1;
Graph[3][3] = 0;
Graph[3][4] = 0;
Graph[3][5] = 0;
Graph[3][6] = 1;

Graph[4][0] = 0;
Graph[4][1] = 0;
Graph[4][2] = 3;
Graph[4][3] = 0;
Graph[4][4] = 0;
Graph[4][5] = 2;
Graph[4][6] = 0;

Graph[5][0] = 0;
Graph[5][1] = 3;
Graph[5][2] = 0;
Graph[5][3] = 0;
Graph[5][4] = 2;

Graph[5][5] = 0;
Graph[5][6] = 1;

Graph[6][0] = 0;
Graph[6][1] = 0;
Graph[6][2] = 0;
Graph[6][3] = 1;
Graph[6][4] = 0;
Graph[6][5] = 1;
Graph[6][6] = 0;

u = 0;
Dijkstra(Graph, n, u);

return 0;
}

Output
Distance from source to 1: 3
Distance from source to 2: 1
Distance from source to 3: 2
Distance from source to 4: 4
Distance from source to 5: 4
Distance from source to 6: 3

4. Find the minimum cost spanning tree of a given undirected graph


using Prim’s algorithm

Program
#include<stdio.h>
#include<stdbool.h>
P.24 Algorithms

#define INF 9999999


// number of vertices in graph
#define V 5
// create a 2d array of size 5x5
//for adjacency matrix to represent graph
int G[V][V] = {
{0, 9, 75, 0, 0},
{9, 0, 95, 19, 42},
{75, 95, 0, 51, 66},
{0, 19, 51, 0, 31},
{0, 42, 66, 31, 0}};
int main() {
int no_edge; // number of edge
// create a array to track selected vertex
// selected will become true otherwise false
int selected[V];
// set selected false initially
memset(selected, false, sizeof(selected));
// set number of edge to 0
no_edge = 0;
// the number of egde in minimum spanning tree will be
// always less than (V -1), where V is number of vertices in
//graph
// choose 0th vertex and make it true
selected[0] = true;
int x; // row number
int y; // col number
// print for edge and weight
printf("Edge : Weight\n");

while (no_edge < V - 1) {


//For every vertex in the set S, find the all adjacent vertices
// , calculate the distance from the vertex selected at step 1.
// if the vertex is already in the set S, discard it otherwise
//choose another vertex nearest to selected vertex at step 1.
int min = INF;
x = 0;
y = 0;
for (int i = 0; i < V; i++) {
if (selected[i]) {
for (int j = 0; j < V; j++) {
if (!selected[j] && G[i][j]) { // not in selected and there is an edge
if (min > G[i][j]) {
min = G[i][j];
x = i;
y = j;
}
}
}
}
}
printf("%d - %d : %d\n", x, y, G[x][y]);
selected[y] = true;
no_edge++;
}

return 0;
}

Output
Edge : Weight
0 - 1 : 9
1 - 3 : 19
3 - 4 : 31
3 - 2 : 51

5. Implement Floyd’s algorithm for the All-Pairs- Shortest-Paths


problem.

Program
#include<iostream>
#include<iomanip>
#define NODE 7
#define INF 999
using namespace std;
//Cost matrix of the graph
int costMat[NODE][NODE] = {
{0, 3, 6, INF, INF, INF, INF},
{3, 0, 2, 1, INF, INF, INF},
{6, 2, 0, 1, 4, 2, INF},
{INF, 1, 1, 0, 2, INF, 4},
{INF, INF, 4, 2, 0, 2, 1},
{INF, INF, 2, INF, 2, 0, 1},
{INF, INF, INF, 4, 1, 1, 0}
};
void floydWarshal(){
int cost[NODE][NODE]; //defined to store shortest distance from any node to any node
for(int i = 0; i<NODE; i++)

for(int j = 0; j<NODE; j++)


cost[i][j] = costMat[i][j]; //copy costMatrix to new matrix
for(int k = 0; k<NODE; k++){
for(int i = 0; i<NODE; i++)
for(int j = 0; j<NODE; j++)
if(cost[i][k]+cost[k][j] < cost[i][j])
cost[i][j] = cost[i][k]+cost[k][j];
}
cout << "The matrix:" << endl;
for(int i = 0; i<NODE; i++){
for(int j = 0; j<NODE; j++)
cout << setw(3) << cost[i][j];
cout << endl;
}
}
int main(){
floydWarshal();
}

Output
The matrix:
  0  3  5  4  6  7  7
  3  0  2  1  3  4  4
  5  2  0  1  3  2  3
  4  1  1  0  2  3  3
  6  3  3  2  0  2  1
  7  4  2  3  2  0  1
  7  4  3  3  1  1  0

6. Compute the transitive closure of a given directed graph using


Warshall's algorithm.

Program
#include<stdio.h>
// Number of vertices in the graph
#define V 4
// A function to print the solution matrix
void printSolution(int reach[][V]);
// Prints transitive closure of graph[][]
// using Floyd Warshall algorithm
void transitiveClosure(int graph[][V])
{
/* reach[][] will be the output matrix
// that will finally have the
shortest distances between
every pair of vertices */
int reach[V][V], i, j, k;
for (i = 0; i < V; i++)
for (j = 0; j < V; j++)
reach[i][j] = graph[i][j];
for (k = 0; k < V; k++)
{
// Pick all vertices as
// source one by one
for (i = 0; i < V; i++)
{
// Pick all vertices as
// destination for the
// above picked source

for (j = 0; j < V; j++)


{
// If vertex k is on a path
// from i to j,
// then make sure that the value
// of reach[i][j] is 1
reach[i][j] = reach[i][j] ||
(reach[i][k] && reach[k][j]);
}
}
}
// Print the shortest distance matrix
printSolution(reach);
}
/* A utility function to print solution */
void printSolution(int reach[][V])
{
printf ("Following matrix is transitive");
printf("closure of the given graph\n");
for (int i = 0; i < V; i++)
{
for (int j = 0; j < V; j++)
{
if(i == j)
printf("1 ");
else
printf ("%d ", reach[i][j]);
}
printf("\n");

}
}
int main()
{
/* Let us create the following weighted graph
        10
 (0)------->(3)
  |        /|\
 5|         |
  |         |1
 \|/        |
 (1)------->(2)
       3        */
int graph[V][V] = { {1, 1, 0, 1},
{0, 1, 1, 0},
{0, 0, 1, 1},
{0, 0, 0, 1}
};

// Print the solution


transitiveClosure(graph);
return 0;
}

Output
Following matrix is transitive closure of the given graph
1 1 1 1
0 1 1 1
0 0 1 1
0 0 0 1

1. Develop a program to find out the maximum and minimum numbers


in a given list of n numbers using the divide and conquer technique.

Program
#include <iostream>
#include <vector>
#include <climits>
using namespace std;

void findMinAndMax(vector<int> const &nums, int low, int high, int &min, int
&max)
{
if (low == high) // common comparison
{
if (max < nums[low]) { // comparison 1
max = nums[low];
}

if (min > nums[high]) { // comparison 2


min = nums[high];
}

return;
}

if (high - low == 1) // common comparison


{
if (nums[low] < nums[high]) // comparison 1

{
if (min > nums[low]) { // comparison 2
min = nums[low];
}

if (max < nums[high]) { // comparison 3


max = nums[high];
}
}
else {
if (min > nums[high]) { // comparison 2
min = nums[high];
}

if (max < nums[low]) { // comparison 3


max = nums[low];
}
}
return;
}

int mid = (low + high) / 2;

findMinAndMax(nums, low, mid, min, max);

findMinAndMax(nums, mid + 1, high, min, max);


}

int main()

{
vector<int> nums = { 7, 2, 9, 3, 1, 6, 7, 8, 4 };

int max = INT_MIN, min = INT_MAX;

int n = nums.size();
findMinAndMax(nums, 0, n - 1, min, max);

cout << "The minimum array element is " << min << endl;
cout << "The maximum array element is " << max << endl;

return 0;
}

Output:
The minimum array element is 1
The maximum array element is 9

2. Implement Merge sort methods to sort an array of elements and


determine the time required to sort. Repeat the experiment for
different values of n, the number of elements in the list to be sorted.

Program
#include <stdio.h>
#include <stdlib.h>
void merge(int arr[], int l, int m, int r)
{
int i, j, k;
int n1 = m - l + 1;
int n2 = r - m;
int L[n1], R[n2];

for (i = 0; i < n1; i++)


L[i] = arr[l + i];
for (j = 0; j < n2; j++)
R[j] = arr[m + 1 + j];
i = 0; // Initial index of first subarray
j = 0; // Initial index of second subarray
k = l; // Initial index of merged subarray
while (i < n1 && j < n2) {
if (L[i] <= R[j]) {
arr[k] = L[i];
i++;
}
else {
arr[k] = R[j];
j++;
}
k++;
}
while (i < n1) {
arr[k] = L[i];
i++;
k++;
}
while (j < n2) {
arr[k] = R[j];
j++;
k++;
}
}

void mergeSort(int arr[], int l, int r)


{
if (l < r) {
int m = l + (r - l) / 2;

// Sort first and second halves


mergeSort(arr, l, m);
mergeSort(arr, m + 1, r);
merge(arr, l, m, r);
}
}
void printArray(int A[], int size)
{
int i;
for (i = 0; i < size; i++)
printf("%d ", A[i]);
printf("\n");
}
int main()
{
int arr[] = { 12, 11, 13, 5, 6, 7 };
int arr_size = sizeof(arr) / sizeof(arr[0]);
printf("Given array is \n");
printArray(arr, arr_size);
mergeSort(arr, 0, arr_size - 1);
printf("\nSorted array is \n");
printArray(arr, arr_size);
return 0;
}

Output
Given array is
12 11 13 5 6 7
Sorted array is
5 6 7 11 12 13

1. Implement N Queens problem using Backtracking.

Program
#define N 4
#include <stdbool.h>
#include <stdio.h>
void printSolution(int board[N][N])
{
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++)
printf(" %d ", board[i][j]);
printf("\n");
}
}
bool isSafe(int board[N][N], int row, int col)
{
int i, j;
for (i = 0; i < col; i++)
if (board[row][i])
return false;
for (i = row, j = col; i >= 0 && j >= 0; i--, j--)
if (board[i][j])
return false;

for (i = row, j = col; j >= 0 && i < N; i++, j--)


if (board[i][j])
return false;
return true;
}
bool solveNQUtil(int board[N][N], int col)
{
if (col >= N)
return true;
this queen in all rows one by one */
for (int i = 0; i < N; i++) {
if (isSafe(board, i, col)) {
board[i][col] = 1;
if (solveNQUtil(board, col + 1))
return true;
board[i][col] = 0; // BACKTRACK
}
}
return false;
}
bool solveNQ()
{
int board[N][N] = { { 0, 0, 0, 0 },
{ 0, 0, 0, 0 },
{ 0, 0, 0, 0 },
{ 0, 0, 0, 0 } };
if (solveNQUtil(board, 0) == false) {
printf("Solution does not exist");
return false;

}
printSolution(board);
return true;
}
int main()
{
solveNQ();
return 0;
}

Output
 0  0  1  0
 1  0  0  0
 0  0  0  1
 0  1  0  0

1. Implement any scheme to find the optimal solution for the Traveling
Salesperson problem and then solve the same problem instance using
any approximation algorithm and determine the error in the
approximation.

Program
#include<stdio.h>
int a[10][10],n,visit[10];
int cost_opt=0,cost_apr=0;
int least_apr(int c);
int least_opt(int c);

void mincost_opt(int city)


{

int i,ncity;
visit[city]=1;
printf("%d-->",city);
ncity=least_opt(city);
if(ncity==999)
{
ncity=1;
printf("%d",ncity);
cost_opt+=a[city][ncity];
return;
}
mincost_opt(ncity);
}
void mincost_apr(int city)
{
int i,ncity;
visit[city]=1;
printf("%d-->",city);
ncity=least_apr(city);
if(ncity==999)
{
ncity=1;
printf("%d",ncity);
cost_apr+=a[city][ncity];
return;
}
mincost_apr(ncity);
}

int least_opt(int c)
{
int i,nc=999;
int min=999,kmin=999;
for(i=1;i<=n;i++)
{
if((a[c][i]!=0)&&(visit[i]==0))
if(a[c][i]<min)
{
min=a[i][1]+a[c][i];
kmin=a[c][i];
nc=i;
}
}
if(min!=999)
cost_opt+=kmin;
return nc;
}

int least_apr(int c)
{
int i,nc=999;
int min=999,kmin=999;
for(i=1;i<=n;i++)
{
if((a[c][i]!=0)&&(visit[i]==0))
if(a[c][i]<kmin)
{
min=a[i][1]+a[c][i];

kmin=a[c][i];
nc=i;
}
}
if(min!=999)
cost_apr+=kmin;
return nc;
}
void main()
{
int i,j;
printf("Enter No. of cities:\n");
scanf("%d",&n);

printf("Enter the cost matrix\n");


for(i=1;i<=n;i++)
{
printf("Enter elements of row:%d\n",i );
for(j=1;j<=n;j++)
scanf("%d",&a[i][j]);
visit[i]=0;
}
printf("The cost list is \n");
for(i=1;i<=n;i++)
{
printf("\n\n");
for(j=1;j<=n;j++)
printf("\t%d",a[i][j]);
}

printf("\n\n Optimal Solution :\n");


printf("\n The path is :\n");
mincost_opt(1);
printf("\n Minimum cost:");
printf("%d",cost_opt);
printf("\n\n Approximated Solution :\n");
for(i=1;i<=n;i++)
visit[i]=0;
printf("\n The path is :\n");
mincost_apr(1);
printf("\nMinimum cost:");
printf("%d",cost_apr);
printf("\n\nError in approximation is approximated solution/optimal
solution=%f",
(float)cost_apr/cost_opt);
}

OUTPUT:
Enter No. of cities:
4
Enter the cost matrix
Enter elements of row:1
0 1 3 6
Enter elements of row:2
1 0 2 3
Enter elements of row:3
3 2 0 1
Enter elements of row:4
6 3 1 0
The cost list is

    0    1    3    6

    1    0    2    3

    3    2    0    1

    6    3    1    0

Optimal Solution :
The path is :
1-->2-->4-->3-->1
Minimum cost:8

Approximated Solution :
The path is :
1-->2-->3-->4-->1
Minimum cost:10

Error in approximation is approximated solution/optimal solution = 1.250000

2. Implement randomized algorithms for finding the kth smallest number

Program
#include<iostream>
#include<climits>
#include<cstdlib>
using namespace std;

int randomPartition(int arr[], int l, int r);

int kthSmallest(int arr[], int l, int r, int k)



{
if (k > 0 && k <= r - l + 1)
{
int pos = randomPartition(arr, l, r);

if (pos-l == k-1)
return arr[pos];
if (pos-l > k-1)
return kthSmallest(arr, l, pos-1, k);

return kthSmallest(arr, pos+1, r, k-pos+l-1);


}

return INT_MAX;
}
void swap(int *a, int *b)
{
int temp = *a;
*a = *b;
*b = temp;
}
int partition(int arr[], int l, int r)
{
int x = arr[r], i = l;
for (int j = l; j <= r - 1; j++)
{
if (arr[j] <= x)
{
swap(&arr[i], &arr[j]);

i++;
}
}
swap(&arr[i], &arr[r]);
return i;
}

int randomPartition(int arr[], int l, int r)


{
int n = r-l+1;
int pivot = rand() % n;
swap(&arr[l + pivot], &arr[r]);
return partition(arr, l, r);
}

int main()
{
int arr[] = {12, 3, 5, 7, 4, 19, 26};
int n = sizeof(arr)/sizeof(arr[0]), k = 3;
cout << "K'th smallest element is " << kthSmallest(arr, 0, n-1, k);
return 0;
}

Output
K'th smallest element is 5
Model Question Paper - 1
B.E./B.Tech DEGREE EXAMINATION.,
Fourth Semester
CS3401 - ALGORITHMS
(Regulations 2021)
Time: Three Hours Maximum: 100 Marks
Answer ALL Questions
PART – A (10 × 2 = 20 Marks)
1. What is Algorithm analysis?
Analyzing an algorithm has come to mean predicting the resources that the
algorithm requires. Occasionally, resources such as memory, communication
band-width, or computer hardware are of primary concern, but most often it is
computational time that we want to measure.
2. What is Rabin-Karp-Algorithm?
The Rabin-Karp string matching algorithm calculates a hash value for the
pattern, as well as for each M-character subsequences of text to be compared. If
the hash values are unequal, the algorithm will determine the hash value for next
M-character sequence. If the hash values are equal, the algorithm will analyze
the pattern and the M-character sequence. In this way, there is only one
comparison per text subsequence, and character matching is only required when
the hash values match.
3. What are the 2 types of graphs representations?
 Sequential representation (or, Adjacency matrix representation)
 Linked list representation (or, Adjacency list representation)
4. What is Bellman Ford algorithm?
Bellman Ford algorithm works by overestimating the length of the path from
the starting vertex to all other vertices. Then it iteratively relaxes those estimates
by finding new paths that are shorter than the previously overestimated paths.

5. What is Dynamic Programming?


Dynamic programming is used where we have problems, which can be
divided into similar sub-problems, so that their results can be re-used. Mostly,
these algorithms are used for optimization. Before solving the in-hand sub-
problem, dynamic algorithm will try to examine the results of the previously
solved sub-problems.
6. What is Greedy technique?
In this approach, the decision is taken on the basis of current available
information without worrying about the effect of the current decision in future.
7. List the applications of Backtracking.
 N-queen problem
 Sum of subset problem
 Graph coloring
 Hamiliton cycle
8. State Hamiltonian Circuit Problem.
In an undirected graph, the Hamiltonian path is a path, that visits each vertex
exactly once, and the Hamiltonian cycle or circuit is a Hamiltonian path, that
there is an edge from the last vertex to the first vertex. In this problem, we will
try to determine whether a graph contains a Hamiltonian cycle or not.
9. What is polynomial time algorithm?
A polynomial-time algorithm is one which runs in an amount of time
proportional to some polynomial value of N, where N is some characteristic of
the set over which the algorithm runs, usually its size. For example, simple sort
algorithms run approximately in time kN^2/2, where N is the number of
elements being sorted, and k is a proportionality constant that is CPU-dependent
and specific to this problem. Multiplication of N × N matrices runs
approximately in time kN^3, where we are talking about a different k for this
different problem.
10. Define a 3- CNF problem.
The point of 3CNF is that it is a "normal form" for formulas: every formula is
equivalent, up to a quadratic (or even linear) blowup, to a 3CNF formula.
3CNF formulas are "simple", and so easier to deal with. In

particular, if you ever read about NP-completeness, you will find out that we
want to put our "challenging problems" in as simple a form as possible. This
makes it easier both to design and analyze algorithms solving these problems,
and to prove that other problems are also difficult by reducing 3CNF to them.

PART-B (5 × 13 = 65 Marks)
11. (a) Explain in detail about Knuth-Morris-Pratt algorithm
Ans: Refer Page No.1.22
[OR]
(b) Explain briefly about Binary search and Interpolation search
Ans: Refer Page No.1.15
12. (a) Explain the concept of Depth First Search and Breadth First Search in detail.
Ans: Refer Page No. 2.3
[OR]
(b) Briefly explain Floyd-Warshall algorithm
Ans: Refer Page No.2.26
13. (a) Explain in detail about Greedy Technique.
Ans: Refer Page No.3.23
[OR]
(b) Explain the concept of Matrix-chain multiplication.
Ans: Refer Page No.3.9
14. (a) Summarize Hamiltonian Circuit Problem.
Ans: Refer Page No.4.9
[OR]
(b) How do you solve a Solving 15-Puzzle problem. Elaborate.
Ans: Refer Page No.4.20
15. (a) Explain Bin Packing problem with an example.
Ans: Refer Page No.5.20
[OR]

(b) Explain the Fermat's Little Theorem.


Ans: Refer Page No.5.35

PART - C (1 × 15 = 15 Marks)

16. (a) Discuss on Polynomial time (p-time) reduction.


Ans: Refer Page No.5.4
[OR]
(b) Write about Rabin-Karp algorithm in detail
Ans: Refer Page No.1.20

***************

Model Question Paper - 2


B.E./B.Tech DEGREE EXAMINATION.,
Fourth Semester
CS3401 - ALGORITHMS
(Regulations 2021)
Time: Three Hours Maximum: 100 Marks
Answer ALL Questions
PART – A (10 × 2 = 20 Marks)
1. What are the four methods for solving Recurrence?
 Substitution Method
 Iteration Method
 Recursion Tree Method
 Master Method
2. What is Heap Sort Algorithm.
Heap sort processes the elements by creating the min-heap or max-heap using
the elements of the given array. Min-heap or max-heap represents the ordering
of array in which the root element represents the minimum or maximum element
of the array.
3. Write a short note on Depth-First Search Algorithm.
The depth-first search or DFS algorithm traverses or explores data
structures, such as trees and graphs. The algorithm starts at the root node (in the
case of a graph, you can use any random node as the root node) and examines
each branch as far as possible before backtracking.
4. What is Floyd Warshall Algorithm?
 Floyd Warshall Algorithm is a famous algorithm.
 It is used to solve All Pairs Shortest Path Problem.
 It computes the shortest path between every pair of vertices of the given
graph.
 Floyd Warshall Algorithm is an example of dynamic programming
approach.

5. What is Quick Sort?


 It is an algorithm of Divide & Conquer type.
 Divide: Rearrange the elements and split arrays into two sub-arrays and an
element in between search that each element in left sub array is less than or
equal to the average element and each element in the right sub- array is
larger than the middle element.
 Conquer: Recursively, sort two sub arrays.
 Combine: Combine the already sorted array
6. What is Optimal merge pattern.
It is a pattern that relates to the merging of two or more sorted files in a single
sorted file. This type of merging can be done by the two-way merging method.
7. Write an algorithm for Subset Sum Problem.
subsetSum(set, subset, n, subSize, total, node, sum)
Input − The given set and subset, size of set and subset, a total of the subset,
number of elements in the subset and the given sum.
Output − All possible subsets whose sum is the same as the given sum.
Begin
if total = sum, then
display the subset
//go for finding next subset
subsetSum(set, subset, n, subSize-1, total-set[node], node+1, sum)
return
else
for all element i in the set, do
subset[subSize] := set[i]
subSetSum(set, subset, n, subSize+1, total+set[i], i+1, sum)
done
End

8. What are the two approaches to calculate the cost function?


 For each worker, we choose job with minimum cost from list of
unassigned jobs (take minimum entry from each row).
 For each job, we choose a worker with lowest cost for that job from list of
unassigned workers (take minimum entry from each column).
9. Differentiate Tractable and intractable problems.
Tractable Problem: A problem that is solvable by a polynomial-time
algorithm. The upper bound is polynomial. Here are examples of tractable
problems (ones with known polynomial-time algorithms):
Searching an unordered list
Searching an ordered list
Intractable Problem: a problem that cannot be solved by a polynomial-time
algorithm.
Most intractable problems have an algorithm that provides a solution, and that
algorithm is the brute-force search. This algorithm, however, does not provide
an efficient solution and is, therefore, not feasible for computation with anything
more than the smallest input.
Examples: Towers of Hanoi.
10. How to Draw a Venn Diagram?
Venn diagrams can be drawn with unlimited circles. Since more than three
becomes very complicated, we will usually consider only two or three circles in
a Venn diagram. Here are the 4 easy steps to draw a Venn diagram:
 Step 1: Categorize all the items into sets.
 Step 2: Draw a rectangle and label it as per the correlation between the
sets.
 Step 3: Draw the circles according to the number of categories you have.
 Step 4: Place all the items in the relevant circles.

PART-B (5 × 13 = 65 Marks)
11. (a) Explain Insertion sort with an example.
Ans: Refer Page No.1.29

[OR]
(b) Explain in detail about Asymptotic Notations and its properties.
Ans: Refer Page No.1.2
12. (a) Explain graph representations in detail
Ans: Refer Page No.2.1
[OR]
(b) Write in detail about Maximum bipartite matching.
Ans: Refer Page No.2.31
13. (a) Explain Optimal Binary Search Trees with an example.
Ans: Refer Page No.3.18
[OR]
(b) Explain Huffman Trees with an example.
Ans: Refer Page No.3.29
14. (a) Elaborate Subset Sum Problem with an example
Ans: Refer Page No.4.13
[OR]
(b) Elucidate the Assignment problem with an example
Ans: Refer Page No.4.26
15. (a) Classify in detail NP-hardness and NP-completeness.
Ans: Refer Page No.5.16
[OR]
(b) Elaborate the concept and application of Randomized Algorithms.
Ans: Refer Page No.5.32

PART - C (1 × 15 = 15 Marks)
16. (a) Explain with an algorithm the efficiency of Graph colouring problem.
Ans: Refer Page No.4.16
[OR]
(b) Write in detail about Prim’s algorithm.
Ans: Refer Page No.2.18
