Department of Information Technology
Objectives:
To analyze performance of algorithms.
To choose the appropriate data structure and algorithm design method for a specified
application.
To understand how the choice of data structures and algorithm design methods
impacts the performance of programs.
To solve problems using algorithm design methods such as the greedy method, divide
and conquer, dynamic programming, backtracking and branch and bound.
Prerequisites (Subjects) Data structures, Mathematical foundations of computer
science.
UNIT I:
Introduction: Algorithm, Pseudocode for expressing algorithms, Performance Analysis-
Space complexity, Time complexity, Asymptotic Notation- Big oh notation, Omega notation,
Theta notation and Little oh notation, Probabilistic analysis, Amortized analysis.
Divide and conquer: General method, applications-Binary search, Quick sort, Merge sort,
matrix multiplication.
UNIT II:
Searching and Traversal Techniques: Efficient non - recursive binary tree traversal
algorithm, Disjoint set operations, union and find algorithms, Spanning trees, Graph
traversals - Breadth first search and Depth first search, AND / OR graphs, game trees,
Connected Components, Bi-connected components.
UNIT III:
Greedy method: General method, applications - Job sequencing with deadlines, 0/1
knapsack problem, Minimum cost spanning trees, Single source shortest path problem.
Dynamic Programming: General method, applications-Matrix chain multiplication, Optimal
binary search trees, 0/1 knapsack problem, All pairs shortest path problem, Travelling sales
person problem, Reliability design.
UNIT IV:
Backtracking: General method, applications-n-queen problem, sum of subsets problem,
graph coloring, Hamiltonian cycles.
Branch and Bound: General method, applications - Travelling sales person problem,0/1
knapsack problem- LC Branch and Bound solution, FIFO Branch and Bound solution.
UNIT V:
NP-Hard and NP-Complete problems: Basic concepts, non-deterministic algorithms, NP-Hard and NP-Complete classes, Cook's theorem.
TEXT BOOKS:
1. Fundamentals of Computer Algorithms, Ellis Horowitz, Sartaj Sahni and Sanguthevar
Rajasekaran, Galgotia Publications Pvt. Ltd.
2. Foundations of Algorithm, 4th edition, R. Neapolitan and K. Naimipour, Jones and
Bartlett Learning.
3. Design and Analysis of Algorithms, P. H. Dave, H. B. Dave, Pearson Education,
2008.
REFERENCES:
1. Computer Algorithms, Introduction to Design and Analysis, 3rd Edition, Sara Baase and
Allen Van Gelder, Pearson Education.
2. Algorithm Design: Foundations, Analysis and Internet Examples, M. T. Goodrich and
R. Tamassia, John Wiley and Sons.
3. Fundamentals of Sequential and Parallel Algorithm, K. A. Berman and J. L. Paul,
Cengage Learning.
4. Introduction to the Design and Analysis of Algorithms, A. Levitin, Pearson
Education.
5. Introduction to Algorithms, 3rd Edition, T. H. Cormen, C. E. Leiserson, R. L. Rivest,
and C. Stein, PHI Pvt. Ltd.
6. The Design and Analysis of Computer Algorithms, Aho, Hopcroft and Ullman, Pearson
Education, 2004.
Outcomes:
Be able to analyze algorithms and improve their efficiency.
Apply different design methods, such as divide and conquer and the greedy method, to
develop algorithms for realistic problems.
Be able to understand and estimate the performance of algorithms.
UNIT I:
Introduction: Algorithm, Pseudocode for expressing algorithms, Performance Analysis-
Space complexity, Time complexity, Asymptotic Notation- Big oh notation, Omega notation,
Theta notation and Little oh notation, Probabilistic analysis, Amortized analysis.
Divide and conquer: General method, applications-Binary search, Quick sort, Merge sort,
matrix multiplication.
INTRODUCTION TO ALGORITHM
History of Algorithm
What is an Algorithm?
For example, consider an everyday procedure such as making tea: unless each step is
described precisely, it is very difficult for a machine to know how much water or milk is to
be added, and so on.
Algorithms run on computers or computational devices. For example, GPS in our
smartphones and online services such as Google Hangouts rely on them: GPS uses a
shortest-path algorithm, and online shopping uses cryptography, which uses the RSA
algorithm.
Algorithm Definition 1: An algorithm is a finite set of instructions that, if followed,
accomplishes a particular task. All algorithms must satisfy the following criteria: input,
output, definiteness, finiteness and effectiveness.
Algorithm Definition 2: An algorithm is any well-defined computational procedure that
takes some value, or set of values, as input and produces some value, or set of values,
as output.
Algorithms that are definite and effective are also called computational procedures.
A program is the expression of an algorithm in a programming language.
Keeping illegal inputs separate is the responsibility of the algorithmic problem, while
treating special classes of unusual or undesirable inputs is the responsibility of the algorithm
itself.
4 distinct areas of study of algorithms:
1. How to devise algorithms: techniques such as divide and conquer, branch and bound and
dynamic programming.
2. How to validate algorithms: check that the algorithm computes the correct answer for all
possible legal inputs. This first phase is called algorithm validation. The second phase,
after the algorithm has been turned into a program, is program proving or program
verification. A proof of correctness requires that the solution be stated in two forms. The
first form is a program, annotated by a set of assertions about the input and output
variables of the program, expressed in predicate calculus. The second form is called a
specification, also expressed in predicate calculus.
4 distinct areas of study of algorithms (contd.):
3. How to analyze algorithms: analysis of algorithms (performance analysis) refers to the
task of determining how much computing time and storage an algorithm requires.
4. How to test a program: testing has two phases, debugging and profiling. Debugging is
the process of executing programs on sample data sets to determine whether faulty
results occur and, if so, to correct them. Profiling (performance measurement) is the
process of executing a correct program on data sets and measuring the time and space
it takes to compute the results.
PSEUDOCODE:
Example of Pseudocode:
Algorithm arrayMax(A, n)
Input: array A of n integers
Output: maximum element of A
currentMax := A[0]
for i := 1 to n - 1 do
    if A[i] > currentMax then
        currentMax := A[i]
return currentMax
Control flow: if ... then ... [else ...], while ... do ..., repeat ... until ..., for ... do ...;
indentation replaces braces.
Method call: var.method(arg [, arg ...])
Return value: return expression
Expressions:
:= assignment (equivalent to = in C/Java)
= equality testing (equivalent to == in C/Java)
n^2: superscripts and other mathematical formatting are allowed
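As a cross-check, the same arrayMax logic can be written as a small C function. This is a
minimal sketch (the main driver and test data are illustrative, not from the original notes):

#include <stdio.h>

/* Returns the maximum element of A[0..n-1]; assumes n >= 1. */
int arrayMax(int A[], int n)
{
    int currentMax = A[0];
    for (int i = 1; i < n; i++)     /* scan the remaining elements */
        if (A[i] > currentMax)      /* found a larger element */
            currentMax = A[i];
    return currentMax;
}

int main(void)
{
    int a[] = {31, 26, 58, 16, 41};
    printf("max = %d\n", arrayMax(a, 5));   /* prints: max = 58 */
    return 0;
}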
PERFORMANCE ANALYSIS:
What are the criteria for judging algorithms that have a more direct relationship to
performance? Computing time and storage requirements.
Space Complexity:
The space needed by each of these algorithms is seen to be the sum of the following
components:
1. A fixed part that is independent of the characteristics (e.g., number, size) of the inputs
and outputs. This part typically includes the instruction space (i.e., space for the code),
space for simple variables and fixed-size component variables (also called aggregates),
space for constants, and so on.
2. A variable part that consists of the space needed by component variables whose size
depends on the particular problem instance being solved, the space needed by referenced
variables (to the extent that it depends on instance characteristics), and the recursion
stack space.
The space requirement S(P) of any algorithm P may therefore be written as
S(P) = c + Sp(instance characteristics),
where c is a constant.
Example 2:
Algorithm Sum(a, n)
{
    s := 0.0;
    for i := 1 to n do
        s := s + a[i];
    return s;
}
The problem instances for this algorithm are characterized by n, the number of elements to
be summed. The space needed by n is one word, since it is of type integer; the space needed
by a is at least n words, since a is an array of n floating point numbers. So we obtain
Ssum(n) >= (n + 3)
(n words for a[], and one each for n, i and s).
Time Complexity:
The time T(P) taken by a program P is the sum of the compile time and the run time
(execution time). The compile time does not depend on the instance characteristics. Also,
we may assume that a compiled program will be run several times without recompilation.
The run time is denoted by tp (instance characteristics).
The number of steps any program statement is assigned depends on the kind of statement.
We introduce a variable, count, into the program, with initial value 0. Statements to
increment count by the appropriate amount are introduced into the program. This is done
so that each time a statement in the original program is executed, count is incremented
by the step count of that statement.
Algorithm:
Algorithm Sum(a, n)
{
    s := 0.0;
    count := count + 1; // count is global; it is initially zero.
    for i := 1 to n do
    {
        count := count + 1; // For the for statement.
        s := s + a[i];
        count := count + 1; // For the assignment.
    }
    count := count + 1; // For the last time of the for statement.
    count := count + 1; // For the return.
    return s;
}
If the count is zero to start with, then it will be 2n+3 on termination. So each
invocation of Sum executes a total of 2n+3 steps.
First determine the number of steps per execution (s/e) of the statement and the
total number of times (ie., frequency) each statement is executed.
By combining these two quantities, the total contribution of all statements, the
step count for the entire algorithm is obtained.
Statement                   s/e   Frequency   Total
1. Algorithm Sum(a, n)       0       -          0
2. {                         0       -          0
3.    s := 0.0;              1       1          1
4.    for i := 1 to n do     1      n+1        n+1
5.        s := s + a[i];     1       n          n
6.    return s;              1       1          1
7. }                         0       -          0
Total                                          2n+3
How to analyse an Algorithm?
Let us form an algorithm for insertion sort (which sorts a sequence of numbers). The
pseudocode for the algorithm is given below. Identify each line of the pseudocode with
symbols such as C1, C2, ...; let Ci be the cost of the ith line. Since comment lines do not
incur any cost, C3 = 0.
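The pseudocode itself did not survive in these notes; the following is a minimal C sketch
of insertion sort (an assumed reconstruction) whose lines correspond to the costs C1-C8
in the table that follows:

void insertion_sort(int A[], int n)     /* A[1..n], 1-based to match the costs */
{
    for (int j = 2; j <= n; j++) {      /* C1: tested n times */
        int key = A[j];                 /* C2: n-1 times */
        /* C3: insert A[j] into the sorted A[1..j-1] (comment, cost 0) */
        int i = j - 1;                  /* C4: n-1 times */
        while (i > 0 && A[i] > key) {   /* C5: sum of t_j */
            A[i + 1] = A[i];            /* C6: sum of (t_j - 1) */
            i = i - 1;                  /* C7: sum of (t_j - 1) */
        }
        A[i + 1] = key;                 /* C8: n-1 times */
    }
}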
Line   Cost     No. of times executed
1      C1       n
2      C2       n-1
3      C3 = 0   n-1
4      C4       n-1
5      C5       sum over j = 2..n of t_j
6      C6       sum over j = 2..n of (t_j - 1)
7      C7       sum over j = 2..n of (t_j - 1)
8      C8       n-1
(t_j is the number of times the while-loop test is executed for that value of j.)
Best case:
· Occurs when the array is already sorted (each t_j = 1); the running time is then a
linear function of n.
Worst case:
· The worst-case running time gives a guaranteed upper bound on the running time for
any input.
· For some algorithms, the worst case occurs often. For example, when searching, the
worst case often occurs when the item being searched for is not present, and searches
for absent items may be frequent.
It is described by the highest-degree term of the formula for running time. (Drop lower-order
terms. Ignore the constant coefficient in the leading term.)
Example: we found that for insertion sort the worst-case running time is of the form
an^2 + bn + c.
Drop the lower-order terms; what remains is an^2. Ignore the constant coefficient; the result
is n^2. But we cannot say that the worst-case running time T(n) equals n^2; rather, it grows
like n^2. We therefore write T(n) = Θ(n^2) to capture the notion that the order of
growth is n^2.
We usually consider one algorithm to be more efficient than another if its worst-case
running time has a smaller order of growth.
Complexity of Algorithms
The complexity of an algorithm M is the function f(n) which gives the running time and/or
storage space requirement of the algorithm in terms of the size n of the input data; it depends
not only on the input size but also on the particular data. The complexity function f(n) for
certain cases is:
1. Best Case: the minimum possible value of f(n).
2. Average Case: the expected value of f(n).
3. Worst Case: the maximum value of f(n) for any possible input.
ASYMPTOTIC NOTATION
The following notations are commonly use notations in performance analysis and used to
characterize the complexity of an algorithm:
1. Big OH (O) ,
2. Big OMEGA ( ),
3. Big THETA ( ) and
4. Little OH (o)
the function f(n)=O(g(n)) iff there exist positive constants c and no such that
f(n)<=c*g(n) for all n, n>= no.
Omega: the function f(n)=(g(n)) iff there exist positive constants c and no such that
f(n) >= c*g(n) for all n, n >= no.
Theta: the function f(n)=(g(n)) iff there exist positive constants c1,c2 and no such that c1
g(n) <= f(n) <= c2 g(n) for all n, n >= no
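A worked instance of these definitions (a standard textbook example, not part of the
original notes): for f(n) = 3n + 2 we have 3n + 2 <= 4n for all n >= 2, so taking c = 4 and
n0 = 2 shows 3n + 2 = O(n). Likewise 3n + 2 >= 3n for all n >= 1, so 3n + 2 = Ω(n);
together these give 3n + 2 = Θ(n).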
Big-O Notation
This notation gives the tight upper bound of the given function. Generally we represent it as
f(n) = O(g(n)). That means, at larger values of n, the upper bound of f(n) is g(n). For
example, if f(n) = n^4 + 100n^2 + 10n + 50 is the running time of the given algorithm, then
g(n) = n^4. That means g(n) gives the maximum rate of growth for f(n) at larger values of n.
In general, we do not consider lower values of n; the value n0 is the point from which we
consider the rate of growth for a given algorithm, and below n0 the rates of growth may be
different.
Note: analyze algorithms at larger values of n only. What this means is, below n0 we do
not care about rates of growth.
Omega notation
Similar to the above discussion, this notation gives the tight lower bound of the given
algorithm, and we represent it as f(n) = Ω(g(n)). That means, at larger values of n, the
tight lower bound of f(n) is g(n).
For example, if f(n) = 100n^2 + 10n + 50, then g(n) is Ω(n^2).
The Ω notation can be defined as Ω(g(n)) = {f(n): there exist positive constants c and
n0 such that 0 <= c*g(n) <= f(n) for all n >= n0}. g(n) is an asymptotic lower bound for
f(n); Ω(g(n)) is the set of functions with the same or larger order of growth as g(n).
Theta notation
This notation decides whether the upper and lower bounds of a given function are the same.
The average running time of an algorithm is always between the lower bound and the upper
bound. If the upper bound (O) and lower bound (Ω) give the same result, then the Θ notation
will also have the same rate of growth. As an example, let us assume that f(n) = 10n + n is
the expression. Then its tight upper bound g(n) is O(n), and the rate of growth in the best
case is g(n) = Ω(n). In this case, the rates of growth in the best case and worst case are the
same; as a result, the average case will also be the same.
Note: for a given function (algorithm), if the rates of growth (bounds) for O and Ω are not
the same, then the rate of growth for Θ may not be the same.
Important Notes
For analysis (best case, worst case and average) we try to give the upper bound (O), the
lower bound (Ω) and the average running time (Θ). From the above examples, it should also
be clear that, for a given function (algorithm), getting the upper bound (O), the lower bound
(Ω) and the average running time (Θ) may not always be possible.
In the remaining chapters we generally concentrate on the upper bound (O), because knowing
the lower bound (Ω) of an algorithm is of little practical importance, and we use the Θ
notation if the upper bound (O) and lower bound (Ω) are the same.
Little Oh Notation
The little oh is denoted as o. It is defined as follows: let f(n) and g(n) be non-negative
functions; then f(n) = o(g(n)) iff lim (n -> infinity) f(n)/g(n) = 0, i.e., f of n is little oh
of g of n (f grows strictly more slowly than g).
PROBABILISTIC ANALYSIS
In order to perform a probabilistic analysis, we must use knowledge of, or make assumptions
about, the distribution of the inputs. Then we analyze our algorithm, computing an average-
case running time, where we take the average over the distribution of the possible inputs.
Probability theory has the goal of characterizing the outcomes of natural or conceptual
experiments. Examples of such experiments include tossing a coin three times, playing a
lottery, gambling, picking a ball from an urn containing white and red balls, and so on.
Each possible outcome of an experiment is called a sample point and the set of all possible
outcomes is known as the sample space S. In this text we assume that S is finite (such a
sample space is called a discrete sample space). An event E is a subset of the sample space S.
If the sample space consists of n sample points, then there are 2^n possible events.
Then the indicator random variable I {A} associated with event A is defined as
I {A} = 1 if A occurs ;
0 if A does not occur
Theorem 1.5
1. Prob.[Ē] = 1 - Prob.[E], where Ē is the complement of E.
2. Prob.[E1 U E2] = Prob.[E1] + Prob.[E2] - Prob.[E1 ∩ E2]
                 <= Prob.[E1] + Prob.[E2]
E[X] = sum over x of x * Prob.[X = x]
Example: consider a game in which you flip two fair coins. You earn $3 for each head but
lose $2 for each tail. The possible earnings are 6 (two heads, probability 1/4), 1 (one head
and one tail, probability 1/2) and -4 (two tails, probability 1/4). The expected value of the
random variable X representing your earnings is
E[X] = 6(1/4) + 1(1/2) - 4(1/4) = 1.
Any one of the first i candidates is equally likely to be the best qualified so far. Candidate i
has a probability of 1/i of being better qualified than candidates 1 through i-1 and thus a
probability of 1/i of being hired, so E[Xi] = 1/i.
So, by linearity of expectation,
E[X] = E[X1 + X2 + ... + Xn]
     = E[X1] + E[X2] + ... + E[Xn]
     = 1 + 1/2 + 1/3 + ... + 1/n = ln n + O(1).
AMORTIZED ANALYSIS
Amortized analysis bounds the average cost of an operation over a worst-case sequence of
operations. Three common techniques are used:
1. Aggregate analysis - the total cost T(n) of a sequence of n operations is computed, and
   each operation is charged the average cost T(n)/n.
2. Accounting method - when there is more than one type of operation, each type of
   operation may have a different amortized cost. The accounting method overcharges
   some operations early in the sequence, storing the overcharge as prepaid credit on
   specific objects in the data structure. Later in the sequence, the credit pays for
   operations that are charged less than they actually cost.
3. Potential method - maintains the credit as the potential energy of the data structure
   as a whole, instead of associating the credit with individual objects within the data
   structure. The potential method is like the accounting method in that we determine the
   amortized cost of each operation and may overcharge operations early on to compensate
   for undercharges later.
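As an illustration (not from the original notes): a dynamic array that doubles its capacity
when full has amortized O(1) append cost, since a sequence of n appends performs at most
n + (1 + 2 + 4 + ... + n) < 3n element writes. A minimal C sketch, with hypothetical names:

#include <stdlib.h>

typedef struct {
    int *data;
    int size;      /* elements stored */
    int cap;       /* allocated slots */
} DynArray;

/* Append x; doubling the capacity when full makes the cost amortized O(1). */
void append(DynArray *a, int x)
{
    if (a->size == a->cap) {                   /* table full: the expensive step */
        a->cap = (a->cap == 0) ? 1 : 2 * a->cap;
        a->data = realloc(a->data, a->cap * sizeof(int));
    }
    a->data[a->size++] = x;                    /* the cheap O(1) step */
}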
General Method
Divide and conquer splits a problem of size n into k distinct subproblems, 1 < k <= n. These
subproblems are solved, and a method is then found to combine the subsolutions into a
solution of the whole. If the subproblems are still large, then divide and conquer is reapplied.
The generated subproblems are usually of the same type as the original problem.
Control abstraction for divide and conquer:
Algorithm DAndC(P)
// P is the problem to be solved.
// Small(P) is a Boolean-valued function; if it is true, then the function S is invoked.
{
    if Small(P) then return S(P);
    else
    {
        divide P into smaller instances P1, P2, ..., Pk, k >= 1;
        apply DAndC to each of these subproblems;
        return Combine(DAndC(P1), DAndC(P2), ..., DAndC(Pk));
    }
}
The computing time of DAndC is described by the recurrence
T(n) = g(n)                                  if n is small,
     = T(n1) + T(n2) + ... + T(nk) + f(n)    otherwise,
where f(n) is the time for dividing P and combining the subsolutions. When the subproblems
have size n/b this takes the form T(n) = aT(n/b) + f(n), where a and b are constants. This is
called the general divide-and-conquer recurrence.
Advantages of DAndC:
The time spent on solving a problem using DAndC is often smaller than with other methods.
The technique is ideally suited for parallel computation.
The approach provides efficient algorithms for many problems in computer science.
The following theorem (a form of the master theorem) can be used to determine the running
time of most divide and conquer algorithms, given the recurrence relation for the problem.
If the recurrence is of the form below, then we can give the answer directly without fully
solving it.
If the recurrence is of the form T(n) = aT(n/b) + Θ(n^k log^p n), where a >= 1, b > 1, k >= 0
and p is a real number, then we can directly give the answer as:
1) If a > b^k, then T(n) = Θ(n^(log_b a)).
2) If a = b^k:
   a. If p > -1, then T(n) = Θ(n^k log^(p+1) n)
   b. If p = -1, then T(n) = Θ(n^k log log n)
   c. If p < -1, then T(n) = Θ(n^k)
3) If a < b^k:
   a. If p >= 0, then T(n) = Θ(n^k log^p n)
   b. If p < 0, then T(n) = O(n^k)
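A standard application of this theorem (worked example, not in the original notes): merge
sort satisfies T(n) = 2T(n/2) + Θ(n), i.e., a = 2, b = 2, k = 1, p = 0. Here a = b^k and
p > -1, so case 2(a) gives T(n) = Θ(n log n). Similarly, binary search satisfies
T(n) = T(n/2) + Θ(1) (a = 1, b = 2, k = 0, p = 0), and case 2(a) gives T(n) = Θ(log n).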
Time Complexity:
Data structure: array
Case           Successful search   Unsuccessful search
Best case      O(1)                O(log n)
Average case   O(log n)            O(log n)
Worst case     O(log n)            O(log n)
(an unsuccessful search takes O(log n) in all cases)
Binary search by using the iterative methodology:
C program:
int binary_search(int A[], int key, int imin, int imax)
{
    while (imax >= imin)
    {
        int imid = (imin + imax) / 2;   /* midpoint(imin, imax) */
        if (A[imid] == key)
            return imid;
        else if (A[imid] < key)
            imin = imid + 1;
        else
            imax = imid - 1;
    }
    return -1;  /* key not found */
}

Algorithm (pseudocode):
Algorithm BinarySearch(A, key, imin, imax)
{
    while (imax >= imin) do
    {
        imid := (imin + imax) / 2;
        if (A[imid] = key) then return imid;
        else if (A[imid] < key) then imin := imid + 1;
        else imax := imid - 1;
    }
    return -1; // not found
}
Merge Sort:
The merge sort splits the list to be sorted into two equal halves, and places them in separate
arrays. This sorting method is an example of the DIVIDE-AND-CONQUER paradigm i.e. it
breaks the data into two halves and then sorts the two half data sets recursively, and finally
merges them to obtain the complete sorted list. The merge sort is a comparison sort and has an
algorithmic complexity of O (n log n). Elementary implementations of the merge sort make use of
two arrays - one for each half of the data set. The following image depicts the complete procedure
of merge sort.
Advantages of Merge Sort:
1. Marginally faster than heap sort for larger sets.
2. Merge sort always does fewer comparisons than quick sort; the worst case for merge
sort does about 39% fewer comparisons than quick sort's average case.
3. Merge sort is often the best choice for sorting a linked list, because the slow random-
access performance of a linked list makes some other algorithms (such as quick sort)
perform poorly and others (such as heap sort) impractical.
void merge(int a[], int temp[], int low, int mid, int high){
    int i=low, j=mid+1, k=low;
    while(i<=mid && j<=high){            /* merge the two sorted halves */
        if(a[i]<=a[j]) temp[k++]=a[i++];
        else           temp[k++]=a[j++];
    }
    while(i<=mid)  temp[k++]=a[i++];     /* copy any leftover left half */
    while(j<=high) temp[k++]=a[j++];     /* copy any leftover right half */
    for(k=low;k<=high;k++)               /* copy the merged result back */
        a[k]=temp[k];
}
void display(int a[], int n){
    int i;
    printf("\n\nthe sorted array is\n");
    for(i=0;i<n;i++)
        printf("%d\t", a[i]);
}
Algorithm for Merge sort:
Algorithm MergeSort(low, high)
// a[low : high] is a global array to be sorted.
{
    if (low < high) then // Dividing the problem into subproblems
    {
        mid := (low + high)/2;
        MergeSort(low, mid);
        MergeSort(mid + 1, high);  // Solve the subproblems
        Merge(low, mid, high);     // Combine the solutions
    }
}

Algorithm Merge(low, mid, high)
// a[low : mid] and a[mid+1 : high] are sorted; merge them into temp and copy back.
{
    k := low; i := low; j := mid + 1;
    while ((i <= mid) and (j <= high)) do
    {
        if (a[i] <= a[j]) then
        {
            temp[k] := a[i]; i := i + 1;
        }
        else
        {
            temp[k] := a[j]; j := j + 1;
        }
        k := k + 1;
    }
    while (i <= mid) do
    {
        temp[k] := a[i]; i := i + 1; k := k + 1;
    }
    while (j <= high) do
    {
        temp[k] := a[j]; j := j + 1; k := k + 1;
    }
    for k := low to high do
        a[k] := temp[k];
}
Tree of calls of Merge sort
Consider an example (from the text book):
A[1:10] = {310, 285, 179, 652, 351, 423, 861, 254, 450, 520}
The call tree splits (1, 10) into (1, 5) and (6, 10); (1, 5) into (1, 3) and (4, 5); (6, 10) into
(6, 8) and (9, 10); and so on down to single elements, which are then merged pairwise back up.
The time for the merging operation is proportional to n, so the computing time for merge sort
is described by the recurrence
T(n) = a              if n = 1,
     = 2T(n/2) + cn   if n > 1,
where c and a are constants.
If n is a power of 2, n = 2^k, successive substitution gives
T(n) = 2T(n/2) + cn
     = 2[2T(n/4) + cn/2] + cn
     = 2^2 T(n/4) + 2cn
     = 2^3 T(n/8) + 3cn
     = 2^4 T(n/16) + 4cn
     = ...
     = 2^k T(1) + kcn
     = an + cn log n.
In asymptotic notation, T(n) = O(n log n).
Quick Sort
Quick sort is an algorithm based on the DIVIDE-AND-CONQUER paradigm that selects a pivot
element and reorders the given list in such a way that all elements smaller than the pivot are
on one side and those bigger than it are on the other. Then the sublists are recursively sorted
until the list is completely sorted. The average time complexity of this algorithm is O(n log n).
Auxiliary space used in the average case for implementing the recursive function calls is
O(log n), so it proves to be a bit costly in space, especially for large data sets.
Its worst case has a time complexity of O(n^2), which can prove very costly for large data
sets; competitive sorting algorithms such as merge sort and heap sort avoid this quadratic
worst case.
Quick sort program
#include<stdio.h>
#include<conio.h>
int n;
void main(){
    int i,low,high;
    int a[10];
    void quick(int a[10],int low,int high);
    clrscr();
    printf("\n \t\t quick sort \n");
    printf("\n enter the length of the list:");
    scanf("%d",&n);
    printf("\n enter the list elements:");
    for(i=0;i<n;i++)
        scanf("%d",&a[i]);
    low=0;
    high=n-1;
    quick(a,low,high);
    printf("\n sorted array is:");
    for(i=0;i<n;i++)
        printf(" %d",a[i]);
    getch();
}
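The definition of quick() is missing from these notes; below is a minimal sketch consistent
with the call above (a Lomuto-style partition; the structure is an assumption, not the
original program's code):

void quick(int a[10], int low, int high){
    if(low < high){
        int pivot = a[high];          /* take the last element as pivot */
        int i = low - 1, j, t;
        for(j = low; j < high; j++){  /* move elements <= pivot to the left */
            if(a[j] <= pivot){
                i++;
                t = a[i]; a[i] = a[j]; a[j] = t;
            }
        }
        t = a[i+1]; a[i+1] = a[high]; a[high] = t;  /* place the pivot */
        quick(a, low, i);             /* sort the left part */
        quick(a, i+2, high);          /* sort the right part */
    }
}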
Time Complexity
Name        Best case     Average case   Worst case   Space complexity
Bubble      O(n)          O(n^2)         O(n^2)       O(n)
Insertion   O(n)          O(n^2)         O(n^2)       O(n)
Selection   O(n^2)        O(n^2)         O(n^2)       O(n)
Quick       O(n log n)    O(n log n)     O(n^2)       O(n + log n)
Merge       O(n log n)    O(n log n)     O(n log n)   O(2n)
Heap        O(n log n)    O(n log n)     O(n log n)   O(n)
Let A and B be two n×n matrices. The product matrix C = AB is also an n×n matrix, whose
(i, j)th element is formed by taking the elements in the ith row of A and the jth column of B
and multiplying them to get
C(i, j) = sum over k = 1..n of A(i, k) B(k, j).
The divide and conquer strategy suggests another way to compute the product of two n×n
matrices.
For simplicity assume n is a power of 2, that is, n = 2^k for some nonnegative integer k.
If n is not power of two then enough rows and columns of zeros can be added to both A and
B, so that resulting dimensions are a power of two.
Let A and B be two n×n Matrices. Imagine that A & B are each partitioned into four square
sub matrices. Each sub matrix having dimensions n/2×n/2.
The product of AB can be computed by using previous formula.
If AB is a product of 2×2 block matrices, then
[C11 C12]   [A11 A12] [B11 B12]
[C21 C22] = [A21 A22] [B21 B22]
so
C11 = A11 B11 + A12 B21
C12 = A11 B12 + A12 B22
C21 = A21 B11 + A22 B21
C22 = A21 B12 + A22 B22
Strassen showed that C can instead be computed with only seven block multiplications:
P = (A11 + A22)(B11 + B22)
Q = (A21 + A22) B11
R = A11 (B12 - B22)
S = A22 (B21 - B11)
T = (A11 + A12) B22
U = (A21 - A11)(B11 + B12)
V = (A12 - A22)(B21 + B22)
C11 = P + S - T + V
C12 = R + T
C21 = Q + S
C22 = P + R - Q + U
The resulting recurrence for Strassen's method is
T(n) = b                 if n <= 2,
     = 7T(n/2) + cn^2    if n > 2,
where b and c are constants; its solution is T(n) = O(n^(log2 7)) ≈ O(n^2.81).
UNIT II:
Searching and Traversal Techniques: Efficient non - recursive binary tree traversal
algorithm, Disjoint set operations, union and find algorithms, Spanning trees, Graph
traversals - Breadth first search and Depth first search, AND / OR graphs, game trees,
Connected Components, Bi-connected components.
So we go on traversing all left nodes, pushing each node onto a stack as we visit it (we need
to visit the parent after its left subtree, so this is a case for LIFO, and hence the stack;
we start from the root). Once we reach a NULL node, we pop the node at the top of the
stack - the last node we visited - and print it. Then we check whether that node has a right
child. If yes, we move to the right child and again start traversing left nodes, pushing them
onto the stack. Once we have traversed all nodes, the stack will be empty. A C sketch of this
traversal follows.
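A minimal C sketch of this stack-based inorder traversal (the node type and the fixed-size
stack are assumptions, not from the original notes):

#include <stdio.h>
#include <stdlib.h>

typedef struct Node { int data; struct Node *left, *right; } Node;

/* Non-recursive inorder traversal using an explicit stack. */
void inorder(Node *root)
{
    Node *stack[100];            /* assumed fixed-size stack */
    int top = -1;
    Node *cur = root;
    while (cur != NULL || top >= 0) {
        while (cur != NULL) {    /* go on traversing left nodes */
            stack[++top] = cur;  /* push the node before descending */
            cur = cur->left;
        }
        cur = stack[top--];      /* pop the last node visited */
        printf("%d ", cur->data);
        cur = cur->right;        /* then traverse its right subtree */
    }
}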
[Figure: three disjoint sets represented as trees - S1 = {1, 7, 8, 9} with root 1,
S2 = {5, 2, 10} with root 5, and S3 = {3, 4, 6} with root 3.]
Disjoint set union: the combination of the elements of two disjoint sets. From the above
example, S1 U S2 = {1, 7, 8, 9, 5, 2, 10}.
For the tree representation of S1 U S2, simply make one of the trees a subtree of the other.
[Figure: the two possible results - for S1 U S2, root 5 (with children 2 and 10) becomes a
child of root 1; for S2 U S1, root 1 (with children 7, 8 and 9) becomes a child of root 5.]
Unions can be accomplished easily if, with each set name, we keep a pointer to the root of
the tree representing that set.
For presenting the union and find algorithms, we ignore the set names and identify sets just
by the roots of the trees representing them.
For example, if the set name is needed, a table such as name[i] can map each root i to its
set name.
If the n unions are performed as Union(1, 2), Union(2, 3), ..., Union(n-1, n), the result is a
degenerate tree of height n, and the simple algorithms above perform badly: the n - 1 unions
take O(n) time, the time taken for a find on an element at level i of a tree is O(i), and n
finds such as Find(1), Find(2), ..., Find(n) take O(n^2) in total.
To improve the performance of our union and find algorithms, we avoid the creation of
degenerate trees. For this we use a weighting rule for Union(i, j): if the number of nodes in
the tree with root i is fewer than the number in the tree with root j, then make j the parent
of i; otherwise make i the parent of j.
Algorithm for weighted union:
Algorithm WeightedUnion(i, j)
// Union sets with roots i and j, i != j, using the weighting rule.
// p[i] = -count[i] and p[j] = -count[j].
{
    temp := p[i] + p[j];
    if (p[i] > p[j]) then
    { // i has fewer nodes.
        p[i] := j; p[j] := temp;
    }
    else
    { // j has fewer or an equal number of nodes.
        p[j] := i; p[i] := temp;
    }
}
For implementing the weighting rule, we need to know how many nodes there are in every
tree. For this we maintain a count field in the root of every tree: if i is a root node,
count[i] is the number of nodes in that tree, stored as p[i] = -count[i].
The time required for this algorithm is O(1); the height of the resulting trees is bounded by
the following lemma.
Lemma: Let T be a tree with m nodes created as a result of a sequence of unions, each
performed using WeightedUnion. The height of T is no greater than
floor(log2 m) + 1.
Collapsing rule:
If j is a node on the path from i to its root r, and p[j] != r, then set p[j] to r.
Algorithm for collapsing find:
Algorithm CollapsingFind(i)
// Find the root of the tree containing element i. Use the
// collapsing rule to collapse all nodes from i to the root.
{
    r := i;
    while (p[r] > 0) do r := p[r]; // Find the root.
    while (i != r) do
    { // Collapse nodes from i to root r.
        s := p[i];
        p[i] := r;
        i := s;
    }
    return r;
}
Collapsing find algorithm is used to perform find operation on the tree created by
WeightedUnion.
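A compact C version of weighted union with collapsing find (the array name p follows the
notes; the wrapper functions and the singleton initialization p[i] = -1 are assumptions):

#define MAXN 100
int p[MAXN];                    /* p[root] = -count; otherwise parent index */
                                /* initialize each i as a singleton: p[i] = -1 */
int collapsing_find(int i){
    int r = i, s;
    while (p[r] > 0) r = p[r];  /* find the root */
    while (i != r) {            /* collapse the path from i to r */
        s = p[i]; p[i] = r; i = s;
    }
    return r;
}

void weighted_union(int i, int j){   /* i and j are roots, i != j */
    int temp = p[i] + p[j];          /* -(total node count) */
    if (p[i] > p[j]) {               /* tree i has fewer nodes */
        p[i] = j; p[j] = temp;
    } else {
        p[j] = i; p[i] = temp;
    }
}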
Kruskal's Algorithm: Start with no nodes or edges in the spanning tree, and repeatedly add
the cheapest edge that does not create a cycle.
Connected Component:
A connected component of a graph can be obtained by using BFST (breadth first search and
traversal) or DFST (depth first search and traversal); the tree built during the search is
also called a spanning tree.
BFST (Breadth first search and traversal):
In BFS we start at a vertex v and mark it as reached (visited).
The vertex v is at this time said to be unexplored (not yet discovered).
A vertex is said to have been explored (discovered) when all vertices adjacent from it have
been visited.
All unvisited vertices adjacent from v are visited next.
The first vertex on this list is the next to be explored; exploration continues until no
unexplored vertex is left.
These operations can be performed by using Queue.
Algorithm BFS(v)
// A bfs of G is begun at vertex v.
// For any node i, visited[i] = 1 if i has already been visited.
// The graph G and array visited[] are global; visited[] is initialized to zero.
{
    u := v; // q is a queue of unexplored vertices.
    visited[v] := 1;
    repeat
    {
        for all vertices w adjacent from u do
        {
            if (visited[w] = 0) then
            {
                Add w to q; // w is unexplored.
                visited[w] := 1;
            }
        }
        if q is empty then return; // No unexplored vertex.
        Delete u from q; // Get the first unexplored vertex.
    } until (false);
}
With adjacency lists, the time complexity is T(n, e) = Θ(n + e) and the space complexity is
S(n, e) = Θ(n); with an adjacency matrix, the time is Θ(n^2).
Algorithm DFS(v)
// A dfs of G is begun at vertex v.
// Initially the array visited[] is set to zero.
// This algorithm visits all vertices reachable from v.
// The graph G and array visited[] are global.
{
    visited[v] := 1;
    for each vertex w adjacent from v do
    {
        if (visited[w] = 0) then DFS(w);
    }
}
With adjacency lists, the time complexity is T(n, e) = Θ(n + e) and the space complexity is
S(n, e) = Θ(n); with an adjacency matrix, the time is Θ(n^2).
Bi-connected Components:
A graph G is biconnected, iff (if and only if) it contains no articulation point (joint or
junction).
A vertex v in a connected graph G is an articulation point if and only if the deletion of
vertex v, together with all edges incident to v, disconnects the graph into two or more
non-empty components.
For example, in a communication network, the failure of a station i that is an articulation
point causes loss of communication between the other stations (see graph G1 in the text
book).
If the graph is biconnected (i.e., it has no articulation point), then even if any station i
fails, we can still communicate between every two stations not including station i (see
graph Gb in the text book).
There is an efficient algorithm to test whether a connected graph is biconnected. In the case
of graphs that are not biconnected, this algorithm will identify all the articulation points.
Once it has been determined that a connected graph G is not biconnected, it may be desirable
(suitable) to determine a set of edges whose inclusion makes the graph biconnected.
UNIT III:
Greedy method: General method, applications - Job sequencing with deadlines, 0/1
knapsack problem, Minimum cost spanning trees, Single source shortest path problem.
Dynamic Programming: General method, applications-Matrix chain multiplication, Optimal
binary search trees, 0/1 knapsack problem, All pairs shortest path problem, Travelling sales
person problem, Reliability design.
Greedy Method:
The greedy method is perhaps the most straightforward design technique; it is used to
determine a feasible solution that may or may not be optimal.
Feasible solution:- Most problems have n inputs and its solution contains a subset of inputs
that satisfies a given constraint(condition). Any subset that satisfies the constraint is called
feasible solution.
Optimal solution: To find a feasible solution that either maximizes or minimizes a given
objective function. A feasible solution that does this is called optimal solution.
The greedy method suggests that an algorithm works in stages, considering one input at a
time. At each stage, a decision is made regarding whether a particular input is in an optimal
solution.
Greedy algorithms neither postpone nor revise their decisions (i.e., no backtracking).
Example: in Dijkstra's shortest-path algorithm, at each stage we permanently select the
nearest unreached vertex and never visit it again.
Application of Greedy Method:
Job sequencing with deadline
0/1 knapsack problem
Minimum cost spanning trees
Single source shortest path problem.
Job sequencing with deadlines: we are given n jobs; associated with job i is an integer
deadline di >= 0 and a profit pi > 0, and the profit pi is earned iff the job is completed by
its deadline. To complete a job, one has to process the job on a machine for one unit of
time; only one machine is available for processing jobs.
A feasible solution for this problem is a subset J of jobs such that each job in this subset
can be completed by its deadline.
The value of a feasible solution J is the sum of the profits of the jobs in J.
An optimal solution is a feasible solution with maximum value.
Ex: obtain the optimal sequence for the following jobs:
jobs:                          j1   j2   j3   j4
profits:   (p1, p2, p3, p4) = (100, 10, 15, 27)
deadlines: (d1, d2, d3, d4) = (2, 1, 2, 1)
The optimal solution is J = {j1, j4}, processed in the order j4, j1, with value 100 + 27 = 127.
We must formulate an optimization measure to determine how the next job is chosen.
Algorithm JS(d, J, n)
// d[i] >= 1, 1 <= i <= n, are the deadlines, n >= 1. The jobs are ordered such that
// p[1] >= p[2] >= ... >= p[n]. J[i] is the ith job in the optimal solution, 1 <= i <= k.
// Also, at termination d[J[i]] <= d[J[i+1]], 1 <= i < k.
{
    d[0] := J[0] := 0; // Initialize.
    J[1] := 1; // Include job 1.
    k := 1;
    for i := 2 to n do
    {
        // Consider jobs in non-increasing order of p[i].
        // Find position for i and check feasibility of insertion.
        r := k;
        while ((d[J[r]] > d[i]) and (d[J[r]] != r)) do r := r - 1;
        if ((d[J[r]] <= d[i]) and (d[i] > r)) then
        {
            // Insert i into J[].
            for q := k to (r+1) step -1 do J[q+1] := J[q];
            J[r+1] := i;
            k := k + 1;
        }
    }
    return k;
}
Note: the size of subset J must be less than or equal to the maximum deadline in the given list.
Graphs can be used to represent the highway structure of a state or country with
vertices representing cities and edges representing sections of highway.
The edges have assigned weights which may be either the distance between the 2
cities connected by the edge or the average time to drive along that section of
highway.
For example, if a motorist wishes to drive from city A to city B, we must answer the
following questions:
Is there a path from A to B?
If there is more than one path from A to B, which is the shortest path?
The length of a path is defined to be the sum of the weights of the edges on that path.
Given a directed graph G = (V, E) with edge weights w(u, v), we have to find a shortest path
from a given source vertex to the remaining vertices.
Consider the above directed graph, if node 1 is the source vertex, then shortest path
from 1 to 2 is 1,4,5,2. The length is 10+15+20=45.
As an optimization measure we can use the sum of the lengths of all paths so far
generated.
the next path to be constructed should be the next shortest minimum length path.
The greedy way to generate the shortest paths from Vo to the remaining vertices is to
generate these paths in non-decreasing order of path length.
For this 1st, a shortest path of the nearest vertex is generated. Then a shortest path to
the 2nd nearest vertex is generated and so on.
Algorithm for finding the shortest paths (Dijkstra's greedy algorithm):
Algorithm ShortestPaths(v, cost, dist, n)
// dist[j], 1 <= j <= n, is the length of a shortest path from vertex v to vertex j
// in a digraph G with n vertices and cost adjacency matrix cost[1:n, 1:n].
// dist[v] is zero.
{
    for i := 1 to n do
    { // Initialize s.
        s[i] := false;
        dist[i] := cost[v, i];
    }
    s[v] := true;
    dist[v] := 0.0; // Put v in s.
    for num := 2 to n do
    {
        // Determine n - 1 paths from v.
        Choose u from among those vertices not in s such that dist[u] is minimum;
        s[u] := true; // Put u in s.
        for (each w adjacent to u with s[w] = false) do
            // Update distances.
            if (dist[w] > dist[u] + cost[u, w]) then
                dist[w] := dist[u] + cost[u, w];
    }
}
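A minimal C sketch of the same algorithm in adjacency-matrix form (N, INF and the 0-based
indexing are assumptions, not from the original notes):

#define N 6
#define INF 1000000

/* cost[u][v] = edge weight, INF if no edge, 0 on the diagonal;
   fills dist[] with shortest distances from source v. */
void shortest_paths(int v, int cost[N][N], int dist[N])
{
    int s[N] = {0};                       /* s[i] = 1 once i is finalized */
    for (int i = 0; i < N; i++) dist[i] = cost[v][i];
    s[v] = 1; dist[v] = 0;
    for (int num = 2; num <= N; num++) {
        int u = -1;
        for (int i = 0; i < N; i++)       /* pick the nearest vertex not in s */
            if (!s[i] && (u < 0 || dist[i] < dist[u])) u = i;
        s[u] = 1;
        for (int w = 0; w < N; w++)       /* relax the edges out of u */
            if (!s[w] && dist[u] + cost[u][w] < dist[w])
                dist[w] = dist[u] + cost[u][w];
    }
}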
The greedy method suggests that a minimum cost spanning tree can be obtained by constructing
the tree edge by edge. The next edge to be included in the tree is the edge that results in a
minimum increase in the sum of the costs of the edges included so far.
There are two basic algorithms for finding minimum-cost spanning trees, and both are greedy
algorithms.
Prim's algorithm: start with any one node in the spanning tree, and repeatedly add the
cheapest edge, and the node it leads to, for which the node is not already in the spanning tree.
Kruskal's algorithm: start with no nodes or edges in the spanning tree, and repeatedly
add the cheapest edge that does not create a cycle.
Consider the graph above. Using Kruskal's method, the edges of this graph are considered
for inclusion in the minimum cost spanning tree in the order (1, 2), (3, 6), (4, 6), (2, 6),
(1, 4), (3, 5), (2, 5), (1, 5), (2, 3), and (5, 6). This corresponds to the cost sequence 10,
15, 20, 25, 30, 35, 40, 45, 50, 55. The first four edges are included in T. The next edge to
be considered is (1, 4); this edge connects two vertices already connected in T and so it is
rejected. Next, the edge (3, 5) is selected, and that completes the spanning tree.
Dynamic Programming
When optimal decision sequences contain optimal decision subsequences, we can establish
recurrence equations, called dynamic-programming recurrence equations, that enable us to
solve the problem in an efficient way.
Solve the dynamic-programming recurrence equations for the value of the optimal
solution.
A multistage graph G = (V, E) is a directed graph in which the vertices are partitioned
into k >= 2 disjoint sets Vi, 1 <= i <= k. In addition, if <u, v> is an edge in E, then u is
in Vi and v is in Vi+1 for some i, 1 <= i < k.
The cost of a path from the source s to the sink t is the sum of the costs of the edges on
the path. The multistage graph problem is to find a minimum-cost path from s to t. Each set
Vi defines a stage in the graph; every path from s to t starts in stage 1, goes to stage 2,
then to stage 3, and so on, and eventually terminates in stage k.
ALGORITHM:
Algorithm Fgraph(G, k, n, p)
// The input is a k-stage graph G = (V, E) with n vertices indexed in order of
// stages. E is a set of edges and c[i, j] is the cost of <i, j>.
// p[1 : k] is a minimum cost path.
{
    cost[n] := 0.0;
    for j := n - 1 to 1 step -1 do
    { // Compute cost[j].
        let r be a vertex such that <j, r> is an edge of G and
            c[j, r] + cost[r] is minimum;
        cost[j] := c[j, r] + cost[r];
        d[j] := r;
    }
    // Find a minimum cost path.
    p[1] := 1; p[k] := n;
    for j := 2 to k - 1 do p[j] := d[p[j - 1]];
}
The multistage graph problem can also be solved using the backward approach. Let bp(i, j)
be a minimum cost path from vertex s to a vertex j in Vi, and let Bcost(i, j) be the cost of
bp(i, j). From the backward approach we obtain:
Bcost(i, j) = min { Bcost(i - 1, l) + c(l, j) },
where the minimum is taken over l in Vi-1 such that <l, j> is in E.
Complexity Analysis:
The complexity analysis of the algorithm is fairly straightforward. If G has |E| edges, then
the time for the first for loop is Θ(|V| + |E|).
EXAMPLE 1:
Find the minimum cost path from s to t in the multistage graph of five stages shown below
(figure in the text book). Do this first using the forward approach and then using the
backward approach.
FORWARD APPROACH:
We use the following equation to find the minimum cost path from s to t:
cost(i, j) = min { c(j, l) + cost(i + 1, l) },
where the minimum is taken over l in Vi+1 such that <j, l> is in E.
cost(1, 1) = min {c(1, 2) + cost(2, 2), c(1, 3) + cost(2, 3), c(1, 4) + cost(2, 4),
                  c(1, 5) + cost(2, 5)}
           = min {9 + cost(2, 2), 7 + cost(2, 3), 3 + cost(2, 4), 2 + cost(2, 5)}
Now, first starting with
cost(2, 2) = min {c(2, 6) + cost(3, 6), c(2, 7) + cost(3, 7), c(2, 8) + cost(3, 8)}
           = min {4 + cost(3, 6), 2 + cost(3, 7), 1 + cost(3, 8)}
cost(3, 6) = min {c(6, 9) + cost(4, 9), c(6, 10) + cost(4, 10)}
           = min {6 + cost(4, 9), 5 + cost(4, 10)}
cost(4, 9) = min {c(9, 12) + cost(5, 12)} = min {4 + 0} = 4
cost(4, 10) = min {c(10, 12) + cost(5, 12)} = min {2 + 0} = 2
Therefore, cost(3, 6) = min {6 + 4, 5 + 2} = 7
cost(3, 7) = min {c(7, 9) + cost(4, 9), c(7, 10) + cost(4, 10)}
           = min {4 + cost(4, 9), 3 + cost(4, 10)} = min {4 + 4, 3 + 2} = min {8, 5} = 5
cost(3, 8) = min {c(8, 10) + cost(4, 10), c(8, 11) + cost(4, 11)}
           = min {5 + cost(4, 10), 6 + cost(4, 11)}
cost(4, 11) = min {c(11, 12) + cost(5, 12)} = min {5 + 0} = 5
Therefore, cost(3, 8) = min {5 + 2, 6 + 5} = 7
Therefore, cost(2, 2) = min {4 + 7, 2 + 5, 1 + 7} = 7
cost(2, 3) = min {c(3, 6) + cost(3, 6), c(3, 7) + cost(3, 7)}
           = min {2 + cost(3, 6), 7 + cost(3, 7)}
           = min {2 + 7, 7 + 5} = min {9, 12} = 9
cost(2, 4) = min {c(4, 8) + cost(3, 8)} = min {11 + 7} = 18
cost(2, 5) = min {c(5, 7) + cost(3, 7), c(5, 8) + cost(3, 8)} = min {11 + 5, 8 + 7}
           = min {16, 15} = 15
Therefore, cost(1, 1) = min {9 + 7, 7 + 9, 3 + 18, 2 + 15} = min {16, 16, 21, 17} = 16.
BACKWARD APPROACH:
Bcost(i, j) = min { Bcost(i - 1, l) + c(l, j) },
where the minimum is taken over l in Vi-1 such that <l, j> is in E.
Bcost(5, 12) = min {Bcost(4, 9) + c(9, 12), Bcost(4, 10) + c(10, 12), Bcost(4, 11) + c(11, 12)}
             = min {Bcost(4, 9) + 4, Bcost(4, 10) + 2, Bcost(4, 11) + 5}
Bcost(4, 9) = min {Bcost(3, 6) + c(6, 9), Bcost(3, 7) + c(7, 9)}
            = min {Bcost(3, 6) + 6, Bcost(3, 7) + 4}
Bcost(3, 6) = min {Bcost(2, 2) + c(2, 6), Bcost(2, 3) + c(3, 6)}
            = min {Bcost(2, 2) + 4, Bcost(2, 3) + 2}
Bcost(2, 2) = min {Bcost(1, 1) + c(1, 2)} = min {0 + 9} = 9
Bcost(2, 3) = min {Bcost(1, 1) + c(1, 3)} = min {0 + 7} = 7
Bcost(3, 6) = min {9 + 4, 7 + 2} = min {13, 9} = 9
Bcost(3, 7) = min {Bcost(2, 2) + c(2, 7), Bcost(2, 3) + c(3, 7), Bcost(2, 5) + c(5, 7)}
Bcost(2, 5) = min {Bcost(1, 1) + c(1, 5)} = 2
Bcost(3, 7) = min {9 + 2, 7 + 7, 2 + 11} = min {11, 14, 13} = 11
Bcost(4, 9) = min {9 + 6, 11 + 4} = min {15, 15} = 15
Bcost(4, 10) = min {Bcost(3, 6) + c(6, 10), Bcost(3, 7) + c(7, 10), Bcost(3, 8) + c(8, 10)}
Bcost(3, 8) = min {Bcost(2, 2) + c(2, 8), Bcost(2, 4) + c(4, 8), Bcost(2, 5) + c(5, 8)}
Bcost(2, 4) = min {Bcost(1, 1) + c(1, 4)} = 3
Bcost(3, 8) = min {9 + 1, 3 + 11, 2 + 8} = min {10, 14, 10} = 10
Bcost(4, 10) = min {9 + 5, 11 + 3, 10 + 5} = min {14, 14, 15} = 14
Bcost(4, 11) = min {Bcost(3, 8) + c(8, 11)} = min {10 + 6} = 16
Bcost(5, 12) = min {15 + 4, 14 + 2, 16 + 5} = min {19, 16, 21} = 16.
EXAMPLE 2:
Find the minimum cost path from s to t in the multistage graph of five stages shown below.
Do this first using the forward approach and then using the backward approach.
[Figure: five-stage graph with source s and sink t; the edge costs are those used in the
calculations below.]
SOLUTION:
FORWARD APPROACH:
cost(1, 1) = min {c(1, 2) + cost(2, 2), c(1, 3) + cost(2, 3)}
           = min {5 + cost(2, 2), 2 + cost(2, 3)}
cost(2, 2) = min {c(2, 4) + cost(3, 4), c(2, 6) + cost(3, 6)}
           = min {3 + cost(3, 4), 3 + cost(3, 6)}
cost(3, 4) = min {c(4, 7) + cost(4, 7), c(4, 8) + cost(4, 8)}
           = min {1 + cost(4, 7), 4 + cost(4, 8)}
cost(4, 7) = min {c(7, 9) + cost(5, 9)} = min {7 + 0} = 7
cost(4, 8) = min {c(8, 9) + cost(5, 9)} = min {3 + 0} = 3
Therefore, cost(3, 4) = min {1 + 7, 4 + 3} = 7
cost(2, 3) = min {c(3, 4) + cost(3, 4), c(3, 5) + cost(3, 5), c(3, 6) + cost(3, 6)}
cost(3, 5) = min {c(5, 7) + cost(4, 7), c(5, 8) + cost(4, 8)} = min {6 + 7, 2 + 3} = 5
cost(3, 6) = min {c(6, 7) + cost(4, 7), c(6, 8) + cost(4, 8)} = min {6 + 7, 2 + 3} = 5
Therefore, cost(2, 3) = min {6 + 7, 5 + 5, 8 + 5} = min {13, 10, 13} = 10
cost(2, 2) = min {3 + 7, 3 + 5} = 8
cost(1, 1) = min {5 + 8, 2 + 10} = min {13, 12} = 12
BACKWARD APPROACH:
Bcost(i, j) = min { Bcost(i - 1, l) + c(l, j) },
where the minimum is taken over l in Vi-1 such that <l, j> is in E.
Bcost(5, 9) = min {Bcost(4, 7) + c(7, 9), Bcost(4, 8) + c(8, 9)}
            = min {Bcost(4, 7) + 7, Bcost(4, 8) + 3}
Bcost(4, 7) = min {Bcost(3, 4) + c(4, 7), Bcost(3, 5) + c(5, 7), Bcost(3, 6) + c(6, 7)}
            = min {Bcost(3, 4) + 1, Bcost(3, 5) + 6, Bcost(3, 6) + 6}
Bcost(3, 4) = min {Bcost(2, 2) + c(2, 4), Bcost(2, 3) + c(3, 4)}
            = min {Bcost(2, 2) + 3, Bcost(2, 3) + 6}
Bcost(2, 2) = min {Bcost(1, 1) + c(1, 2)} = min {0 + 5} = 5
Bcost(2, 3) = min {Bcost(1, 1) + c(1, 3)} = min {0 + 2} = 2
Therefore, Bcost(3, 4) = min {5 + 3, 2 + 6} = 8
Bcost(3, 5) = min {Bcost(2, 3) + c(3, 5)} = min {2 + 5} = 7
Bcost(3, 6) = min {Bcost(2, 2) + c(2, 6), Bcost(2, 3) + c(3, 6)} = min {5 + 3, 2 + 8} = 8
Bcost(4, 8) = min {Bcost(3, 4) + c(4, 8), Bcost(3, 5) + c(5, 8), Bcost(3, 6) + c(6, 8)}
            = min {8 + 4, 7 + 2, 8 + 2} = 9
Therefore, Bcost(4, 7) = min {8 + 1, 7 + 6, 8 + 6} = 9
Bcost(5, 9) = min {9 + 7, 9 + 3} = 12.
ALL PAIRS SHORTEST PATHS
In the all pairs shortest path problem, we are to find a shortest path between every pair of
vertices in a directed graph G. That is, for every pair of vertices (i, j), we are to find a
shortest path from i to j as well as one from j to i. These two paths are the same when G is
undirected.
When no edge has a negative length, the all-pairs shortest path problem may be solved by
using Dijkstra's greedy single-source algorithm n times, once with each of the n vertices as
the source vertex.
The all pairs shortest path problem is to determine a matrix A such that A(i, j) is the length
of a shortest path from i to j. The matrix A can be obtained by solving n single-source
problems, but the dynamic programming recurrence
A_k(i, j) = min {A_(k-1)(i, j), A_(k-1)(i, k) + A_(k-1)(k, j)}, with A_0(i, j) = cost(i, j),
is simpler; it is solved in steps k = 1, 2, ..., n (step 3 is shown below).
Step 3: Solving the equation for k = 3:
A3 (1, 1) = min {A2 (1, 3) + A2 (3, 1), c (1, 1)} = min {(6 + 3), 0} = 0
A3 (1, 2) = min {A2 (1, 3) + A2 (3, 2), c (1, 2)} = min {(6 + 7), 4} = 4
A3 (1, 3) = min {A2 (1, 3) + A2 (3, 3), c (1, 3)} = min {(6 + 0), 6} = 6
A3 (2, 1) = min {A2 (2, 3) + A2 (3, 1), c (2, 1)} = min {(2 + 3), 6} = 5
A3 (2, 2) = min {A2 (2, 3) + A2 (3, 2), c (2, 2)} = min {(2 + 7), 0} = 0
A3 (2, 3) = min {A2 (2, 3) + A2 (3, 3), c (2, 3)} = min {(2 + 0), 2} = 2
A3 (3, 1) = min {A2 (3, 3) + A2 (3, 1), c (3, 1)} = min {(0 + 3), 3} = 3
A3 (3, 2) = min {A2 (3, 3) + A2 (3, 2), c (3, 2)} = min {(0 + 7), 7} = 7
A3 (3, 3) = min {A2 (3, 3) + A2 (3, 3), c (3, 3)} = min {(0 + 0), 0} = 0
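The same computation for general n is Floyd's algorithm; a minimal C sketch (the array size
N and 0-based indexing are assumptions, not from the original notes):

#define N 3
/* A initially holds the cost matrix; on return A[i][j] is the shortest i-to-j
   distance. Mirrors A_k(i,j) = min(A_{k-1}(i,j), A_{k-1}(i,k) + A_{k-1}(k,j)). */
void all_pairs(int A[N][N])
{
    for (int k = 0; k < N; k++)           /* allow vertex k as an intermediate */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (A[i][k] + A[k][j] < A[i][j])
                    A[i][j] = A[i][k] + A[k][j];
}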
TRAVELLING SALESPERSON PROBLEM
Let G = (V, E) be a directed graph with edge costs cij. The variable cij is defined such that
cij > 0 for all i and j, and cij = infinity if <i, j> is not in E. Let |V| = n and assume
n > 1. A tour of G is a directed simple cycle that includes every vertex in V. The cost of a
tour is the sum of the costs of the edges on the tour. The travelling salesperson problem is
to find a tour of minimum cost. The tour is to be a simple path that starts and ends at
vertex 1.
Let g(i, S) be the length of a shortest path starting at vertex i, going through all vertices
in S, and terminating at vertex 1. The function g(1, V - {1}) is the length of an optimal
salesperson tour. From the principle of optimality it follows that:
g(1, V - {1}) = min over 2 <= k <= n of { c1k + g(k, V - {1, k}) }   -- (1)
Generalizing equation (1), we obtain (for i not in S):
g(i, S) = min over j in S of { cij + g(j, S - {j}) }   -- (2)
Equation (2) can be solved for g(1, V - {1}) if we know g(k, V - {1, k}) for all choices
of k.
Complexity Analysis:
For each value of |S| there are n - 1 choices for i. The number of distinct sets S of size k
not including 1 and i is the binomial coefficient C(n-2, k). Hence the total number of
g(i, S) values that have to be computed before g(1, V - {1}) is:
N = sum over k = 0..n-2 of (n - 1) C(n-2, k).
To calculate this sum, we use the binomial theorem:
C(n-2, 0) + C(n-2, 1) + C(n-2, 2) + ... + C(n-2, n-2) = 2^(n-2).
Therefore,
N = (n - 1) 2^(n-2).
This is Θ(n 2^(n-2)), so there are an exponential number of g(i, S) values to calculate.
Calculating one g(i, S) requires finding the minimum of at most n quantities. Therefore, the
entire algorithm is Θ(n^2 2^(n-2)). This is better than enumerating all n! different tours to
find the best one. So we have traded one exponential growth for a much smaller exponential
growth.
Example (cost matrix from the text book):
g(2, {3, 4}) = min {c23 + g(3, {4}), c24 + g(4, {3})}
Therefore, g(2, {3, 4}) = min {9 + 20, 10 + 15} = min {29, 25} = 25
g(3, {2, 4}) = min {c32 + g(2, {4}), c34 + g(4, {2})}
Therefore, g(3, {2, 4}) = min {13 + 18, 12 + 13} = min {31, 25} = 25
g(4, {2, 3}) = min {c42 + g(2, {3}), c43 + g(3, {2})}
g(2, {3}) = min {c23 + g(3, ∅)} = 9 + 6 = 15
g(3, {2}) = min {c32 + g(2, ∅)} = 13 + 5 = 18
Therefore, g(4, {2, 3}) = min {8 + 15, 9 + 18} = min {23, 27} = 23
g(1, {2, 3, 4}) = min {c12 + g(2, {3, 4}), c13 + g(3, {2, 4}), c14 + g(4, {2, 3})}
                = min {10 + 25, 15 + 25, 20 + 23} = min {35, 40, 43} = 35
Let P (i) be the probability with which we shall be searching for 'ai'. Let Q (i) be the
probability of an un-successful search. Every internal node represents a point where a
successful search may terminate. Every external node represents a point where an
unsuccessful search may terminate.
The expected cost contribution for the internal node for 'ai' is P(i) * level(ai).
An unsuccessful search terminates at an external node, i.e., at one of the classes Ei,
0 <= i <= n. Hence the cost contribution for this node is Q(i) * (level(Ei) - 1).
The expected cost of the binary search tree is:
sum over i = 1..n of P(i) * level(ai)  +  sum over i = 0..n of Q(i) * (level(Ei) - 1).
Given a fixed set of identifiers, we wish to create a binary search tree organization. We
may expect different binary search trees for the same identifier set to have different
performance characteristics.
Each such C(i, j) with j - i = m is the minimum of m quantities, so it can be computed in
time O(m). The total time for all C(i, j) with j - i = m is therefore O(nm - m^2), and the
total time to evaluate all the C(i, j) and R(i, j) is
sum over 1 <= m <= n of (nm - m^2) = O(n^3).
Example 1: The possible binary search trees for the identifier set (a1, a2, a3) = (do, if,
stop) are shown in the text book (five trees in all). Given the equal probabilities
P(i) = Q(i) = 1/7 for all i, we have, for example:
Cost(Tree 2) = (1 + 2 + 3)/7 + (1 + 2 + 3 + 3)/7 = (6 + 9)/7 = 15/7.
A Huffman coding tree, solved by a greedy algorithm, has the limitation that the data appear
only at the leaves, and it need not preserve the property that all nodes to the left of the
root have smaller keys. Construction of an optimal binary search tree is harder, because the
data are not constrained to appear only at the leaves, and also because the tree must satisfy
the binary search tree property: it must preserve the property that all nodes to the left of
the root have smaller keys.
A dynamic programming solution to the problem of obtaining an optimal binary search tree
constructs the tree as a result of a sequence of decisions, using the principle of optimality.
A possible approach is to decide which of the ai's is to be assigned to the root node of T.
If we choose ak, then it is clear that the internal nodes for a1, a2, ..., ak-1, as well as
the external nodes for the classes E0, E1, ..., Ek-1, will lie in the left subtree, L, of the
root. The remaining nodes will be in the right subtree, R (figure in the text book).
The C(i, j) can be computed as:
C(i, j) = min over i < k <= j of { C(i, k-1) + C(k, j) + P(k) + W(i, k-1) + W(k, j) }  -- (1)
Equation (1) may be solved for C(0, n) by first computing all C(i, j) such that j - i = 1,
then all C(i, j) such that j - i = 2, then all C(i, j) with j - i = 3, and so on.
C(i, j) is the cost of the optimal binary search tree Tij. During computation we record the
root R(i, j) of each tree Tij; then an optimal binary search tree may be constructed from
these R(i, j). R(i, j) is the value of k that minimizes equation (1).
We solve the problem by first computing W(i, i+1), C(i, i+1) and R(i, i+1) for 0 <= i < n,
where W(i, j) = P(j) + Q(j) + W(i, j-1).
Example 1:
Let n = 4, and (a1, a2, a3, a4) = (do, if, need, while) Let P (1: 4) = (3, 3, 1, 1) and Q (0:
4) = (2, 3, 1, 1, 1)
Solution:
Table for recording W(i, j), C(i, j) and R(i, j) (each entry is W, C, R):
            i = 0        i = 1       i = 2      i = 3      i = 4
j - i = 0   2, 0, 0      3, 0, 0     1, 0, 0    1, 0, 0    1, 0, 0
j - i = 1   8, 8, 1      7, 7, 2     3, 3, 3    3, 3, 4
j - i = 2   12, 19, 1    9, 12, 2    5, 8, 3
j - i = 3   14, 25, 2    11, 19, 2
j - i = 4   16, 32, 2
This computation is carried out row-wise from row 0 to row 4. Initially, W(i, i) = Q(i),
C(i, i) = 0 and R(i, i) = 0 for 0 <= i <= 4.
Solving for C(0, n):
First, compute all C(i, j) such that j - i = 1: j = i + 1 and 0 <= i < 4, so i = 0, 1, 2 and
3, with i < k <= j, i.e., k = i + 1. For example:
W(3, 4) = P(4) + Q(4) + W(3, 3) = 1 + 1 + 1 = 3
C(3, 4) = W(3, 4) + min {C(3, 3) + C(4, 4)} = 3 + (0 + 0) = 3
R(3, 4) = 4
Second, compute all C(i, j) such that j - i = 2: j = i + 2 and 0 <= i < 3, so i = 0, 1, 2,
with i < k <= j, i.e., k = i + 1 or i + 2. For example:
W(0, 2) = P(2) + Q(2) + W(0, 1) = 3 + 1 + 8 = 12
C(0, 2) = W(0, 2) + min {(C(0, 0) + C(1, 2)), (C(0, 1) + C(2, 2))}
        = 12 + min {(0 + 7), (8 + 0)} = 19
R(0, 2) = 1
Continuing in this way fills the table above. From the table, C(0, 4) = 32 is the minimum
cost and R(0, 4) = 2, so the root of the tree T04 is a2. Hence the left subtree is T01 and
the right subtree is T24. The root of T01 is a1 and the root of T24 is a3.
The left and right subtrees of T01 are T00 and T11 respectively; the left and right subtrees
of T24 are T22 and T34 respectively.
Example 2:
Consider four elements a1, a2, a3 and a4 with Q0 = 1/8, Q1 = 3/16, Q2 = Q3 = Q4 = 1/16 and
P1 = 1/4, P2 = 1/8, P3 = P4 = 1/16. Construct an optimal binary search tree.
Solving for C(0, n):
First, compute all C(i, j) such that j - i = 1: j = i + 1 and 0 <= i < 4, so i = 0, 1, 2 and
3, with i < k <= j.
Second, compute all C(i, j) such that j - i = 2: j = i + 2 and 0 <= i < 3, so i = 0, 1, 2,
with i < k <= j; and so on for j - i = 3 and j - i = 4.
From the table we see that C(0, 4) = 33 is the minimum cost of a binary search tree for
(a1, a2, a3, a4). The root of the tree T04 is a2.
Hence the left subtree is T01 and the right subtree is T24. The root of T01 is a1 and the
root of T24 is a3.
The left and right subtrees of T01 are T00 and T11 respectively; the left and right subtrees
of T24 are T22 and T34 respectively.
0/1 Knapsack Problem:
We are given n objects and a knapsack. Each object i has a positive weight wi and a positive
profit pi. The knapsack can carry a weight not exceeding m. Fill the knapsack so that the
value of the objects in the knapsack is maximized.
A solution to the knapsack problem can be obtained by making a sequence of decisions on the
variables x1, x2, ..., xn. A decision on variable xi involves determining which of the values
0 or 1 is to be assigned to it. Let us assume that decisions on the xi are made in the order
xn, xn-1, ..., x1. Following a decision on xn, we may be in one of two possible states: the
capacity remaining is m and no profit has accrued, or the capacity remaining is m - wn and a
profit of pn has accrued. It is clear that the remaining decisions xn-1, ..., x1 must be
optimal with respect to the problem state resulting from the decision on xn. Otherwise,
xn, ..., x1 will not be optimal. Hence, the principle of optimality holds, and
fn(m) = max {fn-1(m), fn-1(m - wn) + pn}   -- (1)
Equation (1) can be solved for fn(m) by beginning with the knowledge f0(y) = 0 for all
y >= 0 and fi(y) = -infinity for y < 0. Then f1, f2, ..., fn can be successively computed
using equation (1).
In the set-of-pairs formulation, S^i denotes the set of reachable (profit, weight) pairs
after the first i objects, and S1^i is obtained by adding object i+1 to each pair of S^i.
Now, S^(i+1) can be computed by merging the pairs in S^i and S1^i together. Note that if
S^(i+1) contains two pairs (Pj, Wj) and (Pk, Wk) with the property that Pj <= Pk and
Wj >= Wk, then the pair (Pj, Wj) can be discarded because of equation (1). Discarding or
purging rules such as this one are also known as dominance rules; dominated tuples get
purged. In the above, (Pk, Wk) dominates (Pj, Wj).
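A minimal C sketch of the tabular solution to equation (1) (the function name, array bounds
and the small test instance are assumptions, not from the original notes):

#include <stdio.h>

#define NMAX 10
#define MMAX 100

/* f[i][y] = best profit using objects 1..i with capacity y. */
int knapsack(int n, int m, int w[], int p[])
{
    static int f[NMAX + 1][MMAX + 1];
    for (int y = 0; y <= m; y++) f[0][y] = 0;        /* f0(y) = 0 */
    for (int i = 1; i <= n; i++)
        for (int y = 0; y <= m; y++) {
            f[i][y] = f[i - 1][y];                   /* xi = 0 */
            if (y >= w[i] && f[i - 1][y - w[i]] + p[i] > f[i][y])
                f[i][y] = f[i - 1][y - w[i]] + p[i]; /* xi = 1 */
        }
    return f[n][m];
}

int main(void)
{
    int w[] = {0, 2, 3, 4}, p[] = {0, 1, 2, 5};  /* 1-based; slot 0 unused */
    printf("%d\n", knapsack(3, 6, w, p));        /* prints: 6 (objects 1 and 3) */
    return 0;
}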
Reliability Design
The problem is to design a system that is composed of several devices connected in series.
Let ri be the reliability of device Di (that is, ri is the probability that device i will
function properly); then the reliability of the entire system is the product of the ri. Even
if the individual devices are very reliable, the reliability of the system may not be very
good. For example, if n = 10 and ri = 0.99 for 1 <= i <= 10, then the product of the ri is
(0.99)^10 ≈ 0.904.
Hence, it is desirable to duplicate devices: multiple copies of the same device type are
connected in parallel.
If stage i contains mi copies of device Di, then the probability that all mi have a
malfunction is (1 - ri)^mi. Hence the reliability of stage i becomes 1 - (1 - ri)^mi.
Our problem is to use device duplication to maximize reliability. This maximization is to be
carried out under a cost constraint. Let ci be the cost of each unit of device i and let c be
the maximum allowable cost of the system being designed.
We wish to solve:
maximize   the product over 1 <= i <= n of φi(mi)
subject to the sum over 1 <= i <= n of ci mi <= c,
           mi >= 1 and integer, 1 <= i <= n,
where φi(mi) = 1 - (1 - ri)^mi is the reliability of stage i.
Assuming each ci > 0, each mi must be in the range 1 <= mi <= ui, where
ui = floor( (c + ci - sum over j = 1..n of cj) / ci ).
The upper bound ui follows from the observation that mj >= 1 for all j.
An optimal solution m1, m2, ..., mn is the result of a sequence of decisions, one decision
for each mi. The last decision made requires one to choose mn from {1, 2, 3, ..., un}. Once a
value of mn has been chosen, the remaining decisions must be such as to use the remaining
funds c - cn mn in an optimal way.
The principle of optimality holds, and
fn(c) = max over 1 <= mn <= un of { φn(mn) * fn-1(c - cn mn) }.
The fi can be computed from sets S^i of (f, x) tuples built by making decisions on
m1, m2, ..., mn. The dominance rule applies: (f1, x1) dominates (f2, x2) if f1 >= f2 and
x1 <= x2; hence, dominated tuples can be discarded from S^i.
Example 1:
Design a three-stage system with device types D1, D2 and D3. The costs are $30, $15 and $20
respectively. The cost of the system is to be no more than $105. The reliability of each
device type is 0.9, 0.8 and 0.5 respectively.
Solution:
We assume that if stage i has mi devices of type i in parallel, then φi(mi) = 1 - (1 - ri)^mi.
For example:
φ2(2) = 1 - (1 - 0.8)^2 = 0.96
φ2(3) = 1 - (1 - 0.8)^3 = 0.992
φ3(2) = 1 - (1 - 0.5)^2 = 0.75
φ3(3) = 1 - (1 - 0.5)^3 = 0.875
If S^i contains two pairs (f1, x1) and (f2, x2) with the property that f1 >= f2 and
x1 <= x2, then (f1, x1) dominates (f2, x2), and by the dominance rule (f2, x2) can be
discarded. Dominating tuples will be present in S^i and dominated tuples have to be
discarded from S^i.
Sample computations for stage 1 (u1 = 2):
f1(x) = max over 1 <= m1 <= u1 of { φ1(m1) * f0(x - 30 m1) }
f1(35) = max {φ1(1) f0(35 - 30), φ1(2) f0(35 - 60)}
       = max {0.9 × 1, φ1(2) × (-infinity)} = max {0.9, -infinity} = 0.9
f1(20) = max {φ1(1) f0(20 - 30), φ1(2) f0(20 - 60)} = -infinity,
so f1(x) = -infinity for x < 30; in particular f1(0) = -infinity.
Similarly, for stage 2,
f2(45) = max {φ2(1) f1(45 - 15), φ2(2) f1(45 - 30), φ2(3) f1(45 - 45)}
       = max {0.8 f1(30), 0.96 f1(15), 0.992 f1(0)} = 0.8 × 0.9 = 0.72.
Carrying the computation through all three stages and choosing the best tuple with cost at
most $105 gives the optimal design m1 = 1, m2 = 2, m3 = 2, with reliability
0.9 × 0.96 × 0.75 = 0.648 and cost $100.
UNIT IV:
Backtracking: General method, applications-n-queen problem, sum of subsets problem, graph
coloring, Hamiltonian cycles.
Branch and Bound: General method, applications - Travelling sales person problem,0/1
knapsack problem- LC Branch and Bound solution, FIFO Branch and Bound solution.
Backtracking is a methodical (logical) way of trying out various sequences of decisions until
you find one that works. The desired solution is expressible as an n-tuple (x1, ..., xn),
where each xi is chosen from some finite set Si, subject to two kinds of constraints.
Explicit constraints: explicit constraints are rules that restrict each xi to take on values
only from a given set.
Examples: xi >= 0, or Si = {all non-negative real numbers};
          xi = 0 or 1, or Si = {0, 1};
          li <= xi <= ui, or Si = {a : li <= a <= ui}.
The explicit constraints depend on the particular instance I of the problem being solved. All
tuples that satisfy the explicit constraints define a possible solution space for I.
Implicit constraints:
The implicit constraints are rules that determine which of the tuples in the solution space
of I satisfy the criterion function. Thus implicit constraints describe the way in which the
xi must relate to each other.
Applications of Backtracking:
N Queens Problem
Sum of subsets problem
Graph coloring
Hamiltonian cycles.
N-Queens Problem:
The 8-queens puzzle is the problem of placing eight queens on an 8×8 chessboard so that no
two queens attack each other; that is, so that no two of them are on the same row, column,
or diagonal.
The 8-queens puzzle is an example of the more general n-queens problem of placing n queens
on an n×n chessboard.
Sum of Subsets:
Given n positive weights wi and a positive value m, find all subsets of the weights whose
sums equal m.
For example, if n = 4, (w1, w2, w3, w4) = (11, 13, 24, 7) and m = 31, then the desired
subsets are (11, 13, 7) and (24, 7). The two solutions are described by the index vectors
(1, 2, 4) and (3, 4).
Here:
wi - weight of item i
m - capacity of the bag (target sum)
xi - the element of the solution vector, either one or zero, depending on whether the
weight wi is included or not: if xi = 1 then wi is chosen; if xi = 0 then wi is not chosen.
A partial vector x1, x2, ..., xk cannot lead to an answer node unless
(sum over i = 1..k of wi xi) + (sum over i = k+1..n of wi) >= m
and (sum over i = 1..k of wi xi) + w[k+1] <= m.
In the backtracking algorithm, when s + w[k] = m (where s is the sum chosen so far) a subset
has been found:
x[k] := 1;
if (s + w[k] = m) then write (x[1 : k]); // subset found
A C sketch of the full algorithm follows.
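A minimal C sketch of the recursive SumOfSub backtracking scheme (the array names follow the
notes; the driver, sorted instance and globals are assumptions):

#include <stdio.h>

int n = 4, m = 31;
int w[] = {0, 7, 11, 13, 24};   /* 1-based, sorted in nondecreasing order */
int x[5];

/* s = sum chosen so far, k = current index, r = sum of w[k..n] remaining. */
void sum_of_sub(int s, int k, int r)
{
    x[k] = 1;                                   /* try including w[k] */
    if (s + w[k] == m) {                        /* subset found: print it */
        for (int i = 1; i <= k; i++)
            if (x[i]) printf("%d ", w[i]);
        printf("\n");
    }
    else if (k < n && s + w[k] + w[k+1] <= m)   /* left child is promising */
        sum_of_sub(s + w[k], k + 1, r - w[k]);
    if (k < n && s + r - w[k] >= m && s + w[k+1] <= m) {  /* right child ok */
        x[k] = 0;                               /* exclude w[k] */
        sum_of_sub(s, k + 1, r - w[k]);
    }
}

int main(void)
{
    sum_of_sub(0, 1, 7 + 11 + 13 + 24);   /* prints: 7 11 13  and  7 24 */
    return 0;
}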
Graph Coloring:
Let G be a graph and m be a given positive integer. The graph coloring problem is to assign
colors to the vertices of an undirected graph, with the restriction that no two adjacent
vertices are assigned the same color, using at most m colors.
The optimization version calls for coloring a graph using the minimum number of colors. The
decision version, known as k-coloring, asks whether a graph is colorable using at most k
colors.
The m-colorability optimization problem asks for the smallest integer m for which the graph
G can be colored; this integer is referred to as the chromatic number of the graph.
(Examples and the adjacency matrix of the sample graph are in the text book.)
Hamiltonian Cycles:
By using backtracking we need to determine how to compute the set of possible vertices for
xk if x1, x2, ..., xk-1 have already been chosen. If k = 1, then x1 can be any of the n
vertices.
The function NextValue(k) below, together with Hamiltonian(k), gives a recursive backtracking
scheme to find all Hamiltonian cycles. This algorithm is started by first initializing the
adjacency matrix G[1:n, 1:n], then setting x[2:n] to zero and x[1] to 1, and then executing
Hamiltonian(2).
Generating the next vertex:
Algorithm NextValue(k)
// x[1 : k-1] is a path of k-1 distinct vertices. If x[k] = 0, then no vertex
// has as yet been assigned to x[k].
{
    repeat
    {
        x[k] := (x[k] + 1) mod (n + 1); // Next vertex.
        if (x[k] = 0) then return;
        if (G[x[k-1], x[k]] != 0) then
        { // Is there an edge?
            for j := 1 to k-1 do if (x[j] = x[k]) then break;
            // Check for distinctness.
            if (j = k) then // If true, then the vertex is distinct.
                if ((k < n) or ((k = n) and G[x[n], x[1]] != 0)) then return;
        }
    } until (false);
}

Finding all Hamiltonian cycles:
Algorithm Hamiltonian(k)
{
    repeat
    { // Generate values for x[k].
        NextValue(k); // Assign a legal next value to x[k].
        if (x[k] = 0) then return;
        if (k = n) then write (x[1 : n]);
        else Hamiltonian(k + 1);
    } until (false);
}
Two graph search strategies, BFS and D-search (DFS), in which the exploration of a new node
cannot begin until the node currently being explored is fully explored, can both be
generalized to branch and bound (B&B) strategies.
A BFS-like state space search will be called FIFO (First In First Out) search, as the list
of live nodes is a first-in-first-out list (queue).
A D-search (DFS)-like state space search will be called LIFO (Last In First Out) search, as
the list of live nodes is a last-in-first-out list (stack).
In backtracking, bounding functions are used to help avoid the generation of subtrees that
do not contain an answer node.
We will use three types of search strategies in branch and bound:
1) FIFO (First In First Out) search
2) LIFO (Last In First Out) search
3) LC (Least Cost) search
FIFO B&B:
FIFO branch & bound is a BFS. In this, children of the E-node (or live nodes) are inserted
in a queue.
Implementation of the list of live nodes as a queue:
Delete() - removes the head of the queue
Add() - adds the node to the end of the queue
In FIFO search, we take the node that has been waiting longest in the queue as the next
E-node.
LIFO B&B:
LIFO branch & bound is a D-search (or DFS). In this, children of the E-node (live nodes) are
inserted in a stack.
Implementation of the list of live nodes as a stack:
Delete() - removes the top of the stack
Add() - adds the node to the top of the stack
In LC (least cost) search, the expanded node (E-node) is the live node with the best (least)
cost value.
Branching: A set of solutions, which is represented by a node, can be partitioned into
mutually (jointly or commonly) exclusive (special) sets. Each subset in the partition is
represented by a child of the original node.
Lower bounding: An algorithm is available for calculating a lower bound on the cost of any
solution in a given subset.
Example: 8-puzzle
Cost function: c(x) = g(x) + h(x), where
h(x) = the number of misplaced tiles, and
g(x) = the number of moves so far.
Assumption: moving one tile in any direction costs 1.
Note: in case of a tie, choose the leftmost node.
Travelling Salesman Problem:
Def: Find a tour of minimum cost starting from a node S, going through the other nodes
only once, and returning to the starting point S.
The time complexity of TSP with the dynamic programming algorithm is O(n^2 2^n). With B&B
algorithms for this problem, the worst-case complexity will not be any better than
O(n^2 2^n), but good bounding functions enable these B&B algorithms to solve some problem
instances in much less time than required by the dynamic programming algorithm.
Let G = (V, E) be a directed graph defining an instance of TSP, and let cij = the cost of
edge <i, j>.
The state space tree for the travelling salesperson problem with n = 4 and i0 = i4 = 1 is
shown in the text book; it is the tree organization of a complete graph with |V| = 4.
[The worked cost-matrix reduction example is in the text book; for the sample matrix,
column 1 is reduced by 11.]
Define two functions ĉ(x) and u(x) such that, for every node x,
ĉ(x) <= c(x) <= u(x),
where c(x) is the cost of a minimum-cost leaf in the subtree rooted at x. Computing ĉ(·) and
u(·) gives the lower and upper bounds used by the LC branch and bound search.
Basic concepts:
NP = Nondeterministic Polynomial time.
Group 1 (polynomial-time problems):
- Problems with solution time bounded by a polynomial of a small degree.
- Most searching and sorting algorithms are polynomial-time algorithms.
- Examples: ordered search O(log n), polynomial evaluation O(n), sorting O(n log n).
Group 2 (problems not known to be polynomial):
- Problems with solution times not bounded by a polynomial (simply non-polynomial).
- These are hard or intractable problems.
- None of the problems in this group has been solved by any polynomial-time algorithm.
- Examples: travelling salesperson O(n^2 2^n), knapsack O(2^(n/2)).
No one has been able to develop a polynomial-time algorithm for any problem in the second
group (i.e., group 2). Algorithms whose computing times are greater than polynomial grow
very quickly with the input size; they require such vast amounts of time to execute that
even moderate-size problems cannot be solved.
Theory of NP-Completeness:
It shows that many of the problems with no known polynomial-time algorithms are
computationally related. This group of problems is further subdivided into two classes:
1. NP-Hard
2. NP-Complete
NP-Hard: a problem is NP-Hard if, when it can be solved in polynomial time, all NP-Complete
problems can be solved in polynomial time.
NP-Complete: a problem that is both in NP and NP-Hard.
All NP-Complete problems are NP-Hard, but some NP-Hard problems are not known to be
NP-Complete.
Nondeterministic Algorithms:
Algorithms with the property that the result of every operation is uniquely defined are termed
as deterministic algorithms. Such algorithms agree with the way programs are executed on a
computer.
Algorithms which contain operations whose outcomes are not uniquely defined but are
limited to specified set of possibilities. Such algorithms are called nondeterministic
algorithms.
The machine executing such operations is allowed to choose any one of these outcomes,
subject to a termination condition to be defined later. Nondeterministic algorithms use
three functions: Choice(S) arbitrarily chooses one of the elements of set S, Failure()
signals an unsuccessful completion, and Success() signals a successful completion.
Example (nondeterministic knapsack decision algorithm):
Algorithm DKP(p, w, n, m, r, x)
{
    W := 0; P := 0;
    for i := 1 to n do
    {
        x[i] := Choice(0, 1);
        W := W + x[i]*w[i]; P := P + x[i]*p[i];
    }
    if ((W > m) or (P < r)) then Failure();
    else Success();
}
P is the set of all decision problems solvable by deterministic algorithms in polynomial
time.
NP is the set of all decision problems solvable by nondeterministic algorithms in
polynomial time.
Notation of Reducibility:
Let L1 and L2 be problems. Problem L1 reduces to L2 (written L1 ∝ L2) iff there is a way to
solve L1 by a deterministic polynomial-time algorithm using a deterministic algorithm that
solves L2 in polynomial time.
This implies that, if we have a polynomial-time algorithm for L2, then we can solve L1 in
polynomial time.
In the proof of Cook's theorem, an algorithm Z solves a problem in NP by constructing a
propositional formula Q from a nondeterministic polynomial-time algorithm and its input,
such that Q is satisfiable iff the nondeterministic algorithm terminates successfully.
If satisfiability of a formula of length m can be decided in time O(q(m)), then the
complexity of Z is O(p^3(n) log n + q(p^3(n) log n)), where p(n) is the polynomial bound on
the nondeterministic algorithm's running time.
If satisfiability is in P, then q(m) is a polynomial function of m and the complexity of Z
becomes O(r(n)) for some polynomial r().