Module 5 ADA Notes Students

The document discusses decision trees as a model for analyzing algorithms, particularly in sorting and searching, establishing lower bounds on comparisons needed for these algorithms. It also explores the complexity classes P, NP, and NP-complete problems, highlighting the significance of polynomial-time solvable problems and the challenges posed by intractable problems. Lastly, it introduces algorithm design techniques like backtracking and branch-and-bound for coping with algorithmic limitations.

UNIT V - COPING WITH THE LIMITATIONS OF ALGORITHM POWER

5.1 DECISION TREES


Many important algorithms, such as those for sorting and searching, work by comparing items of their inputs. The performance of such algorithms can be studied with a device called a decision tree. As an example, Figure 5.1 presents a decision tree of an algorithm for finding a minimum of three numbers. Each internal node of a binary decision tree represents a key comparison indicated in the node.

FIGURE 5.1 Decision tree for finding a minimum of three numbers.

Consider a binary decision tree with n leaves and height h. Since the largest number of leaves in a binary tree of height h is 2^h, we have 2^h ≥ n, or equivalently h ≥ ⌈log2 n⌉, which puts a lower bound on the heights of binary decision trees. Hence ⌈log2 n⌉ bounds from below the worst-case number of comparisons made by any comparison-based algorithm for the problem; this bound is called the information-theoretic lower bound.
Decision Trees for Sorting

FIGURE 5.2 Decision tree for the three-element selection sort.
A triple above a node indicates the state of the array being sorted. Note the two redundant comparisons b < a, each with a single possible outcome because of the results of previously made comparisons.

FIGURE 5.3 Decision tree for the three-element insertion sort.

For the three-element insertion sort, whose decision tree is given in Figure 5.3, the average number of comparisons is (2 + 3 + 3 + 2 + 3 + 3)/6 ≈ 2.67. Under the standard assumption that all n! outcomes of sorting are equally likely, the following lower bound on the average number of comparisons Cavg made by any comparison-based algorithm in sorting an n-element list has been proved:

Cavg(n) ≥ log2 n!.
A decision tree is a convenient model of algorithms involving comparisons, in which
 internal nodes represent comparisons
 leaves represent outcomes (or input cases)

Decision Trees and Sorting Algorithms


 Any comparison-based sorting algorithm can be represented by a decision tree (for each fixed n)
 Number of leaves (outcomes) ≥ n!
 Height of a binary tree with n! leaves ≥ ⌈log2 n!⌉
 Minimum number of comparisons in the worst case ≥ ⌈log2 n!⌉ for any comparison-based sorting algorithm, since the longest path represents the worst case and its length is the height
 ⌈log2 n!⌉ ≈ n log2 n (by Stirling's approximation)
 This lower bound is tight (mergesort or heapsort attain it); a quick numerical check follows below
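
Both quantities are easy to tabulate. The following Python sketch (an illustration, not part of the original notes) compares the information-theoretic lower bound ⌈log2 n!⌉ with the estimate n log2 n for a few values of n:

import math

# Compare the exact lower bound ceil(log2(n!)) with the Stirling-style
# estimate n * log2(n) for several list sizes.
for n in [4, 8, 16, 32]:
    exact = math.ceil(math.log2(math.factorial(n)))
    estimate = n * math.log2(n)
    print(n, exact, round(estimate, 1))

For n = 16, for example, the bound is ⌈log2 16!⌉ = 45 comparisons, while n log2 n = 64; both grow at the same Θ(n log n) rate.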

Decision Trees for Searching a Sorted Array


Decision trees can be used for establishing lower bounds on the number of key comparisons
in searching a sorted array of n keys: A[0]<A[1]< . . .<A[n − 1].
The principal algorithm for this problem is binary search. The number of comparisons made by binary search in the worst case, Cworst(n), is given by the formula

Cworst(n) = ⌊log2 n⌋ + 1 = ⌈log2(n + 1)⌉

FIGURE 5.4 Ternary decision tree for binary search in a four-element array.

FIGURE 5.5 Binary decision tree for binary search in a four-element array.

As a comparison of the decision trees above illustrates, the binary decision tree is simply the ternary decision tree with all the middle subtrees eliminated. Applying the inequality h ≥ ⌈log2 n⌉ to such binary decision trees immediately yields

Cworst(n) ≥ ⌈log2(n + 1)⌉.
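
As a sanity check, the following Python sketch (illustrative only) counts the three-way comparisons made by binary search and confirms that the worst case over all keys equals ⌊log2 n⌋ + 1:

import math

def binary_search_comparisons(a, key):
    # Count the three-way comparisons binary search makes on a sorted list.
    lo, hi, count = 0, len(a) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1                 # one three-way comparison with a[mid]
        if key == a[mid]:
            return count
        elif key < a[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return count

for n in [4, 16, 100]:
    a = list(range(n))
    worst = max(binary_search_comparisons(a, k) for k in range(-1, n + 1))
    print(n, worst, math.floor(math.log2(n)) + 1)   # the two counts agree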

5.2 P, NP AND NP-COMPLETE PROBLEMS
Problems that can be solved in polynomial time are called tractable, and problems that
cannot be solved in polynomial time are called intractable.
There are several reasons for intractability.
 First, we cannot solve arbitrary instances of intractable problems in a
reasonable amount of time unless such instances are very small.
 Second, although there might be a huge difference between the running times in O(p(n)) for polynomials of drastically different degrees, where p(n) is a polynomial of the problem's input size n, there are very few useful polynomial-time algorithms with the degree of the polynomial higher than three.
 Third, polynomial functions possess many convenient properties; in particular,
both the sum and composition of two polynomials are always polynomials too.
 Fourth, the choice of this class has led to a development of an extensive theory
called computational complexity.

Definition: Class P is a class of decision problems that can be solved in polynomial time by
deterministic algorithms. This class of problems is called polynomial class.
 Problems that can be solved in polynomial time form the set that computer science theoreticians call P. A more formal definition includes in P only decision problems, which are problems with yes/no answers.
 The class of decision problems that are solvable in O(p(n)) polynomial time, where p(n) is
a polynomial of problem’s input size n
Examples:
 Searching
 Element uniqueness
 Graph connectivity
 Graph acyclicity
 Primality testing (finally proved to be in P in 2002)
 The restriction of P to decision problems can be justified by the following reasons.
 First, it is sensible to exclude problems not solvable in polynomial time
because of their exponentially large output. e.g., generating subsets of a given set
or all the permutations of n distinct items.
 Second, many important problems that are not decision problems in their most natural formulation can be reduced to a series of decision problems that are easier to study. For example, instead of asking about the minimum number of colors needed to color the vertices of a graph so that no two adjacent vertices are colored the same color, we can ask whether the vertices can be colored with no more than m colors for m = 1, 2, . . . (The latter is called the m-coloring problem.)
 Not every decision problem can be solved in polynomial time, however. Some decision problems cannot be solved at all by any algorithm; such problems are called undecidable, as opposed to decidable problems that can be solved by an algorithm. The halting problem is the classic example.
 No known polynomial-time algorithm: There are many important problems, however, for which no polynomial-time algorithm has been found:
 Hamiltonian circuit problem: Determine whether a given graph has a
Hamiltonian circuit—a path that starts and ends at the same vertex and passes
through all the other vertices exactly once.
 Traveling salesman problem: Find the shortest tour through n cities with known
positive integer distances between them (find the shortest Hamiltonian circuit in
a complete graph with positive integer weights).
 Knapsack problem: Find the most valuable subset of n items of given positive
integer weights and values that fit into a knapsack of a given positive integer
capacity.
 Partition problem: Given n positive integers, determine whether it is possible to
partition them into two disjoint subsets with the same sum.
 Bin-packing problem: Given n items whose sizes are positive rational numbers
not larger than 1, put them into the smallest number of bins of size 1.
 Graph-coloring problem: For a given graph, find its chromatic number, which is
the smallest number of colors that need to be assigned to the graph’s vertices so
that no two adjacent vertices are assigned the same color.
 Integer linear programming problem: Find the maximum (or minimum) value
of a linear function of several integer-valued variables subject to a finite set of
constraints in the form of linear equalities and inequalities.

Definition: A nondeterministic algorithm is a two-stage procedure that takes as its input an instance I of a decision problem and does the following.
1. Nondeterministic (“guessing”) stage: An arbitrary string S is generated that can be
thought of as a candidate solution to the given instance.
2. Deterministic (“verification”) stage: A deterministic algorithm takes both I and S as its
input and outputs yes if S represents a solution to instance I. (If S is not a solution to
instance I , the algorithm either returns no or is allowed not to halt at all.)
Finally, a nondeterministic algorithm is said to be nondeterministic polynomial if the time
efficiency of its verification stage is polynomial.
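
The verification stage is an ordinary polynomial-time algorithm. As a hedged illustration (not from the original notes), here is a Python verifier for the Hamiltonian circuit problem: given a graph as a dict of adjacency sets and a candidate string S (a sequence of vertices), it checks in polynomial time whether S is a solution:

def verify_hamiltonian_circuit(adj, candidate):
    # Deterministic "verification" stage: does the candidate sequence
    # describe a Hamiltonian circuit of the graph adj?
    n = len(adj)
    # It must list every vertex exactly once and return to its start.
    if len(candidate) != n + 1 or candidate[0] != candidate[-1]:
        return False
    if set(candidate[:-1]) != set(adj):
        return False
    # Every consecutive pair must be an edge of the graph.
    return all(v in adj[u] for u, v in zip(candidate, candidate[1:]))

g = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b'}}
print(verify_hamiltonian_circuit(g, ['a', 'b', 'c', 'a']))   # True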

Definition: Class NP is the class of decision problems that can be solved by nondeterministic
polynomial algorithms. This class of problems is called nondeterministic polynomial.
Most decision problems are in NP. First of all, this class includes all the problems in P:
P ⊆ NP
This is true because, if a problem is in P, we can use the deterministic polynomial time
algorithm that solves it in the verification-stage of a nondeterministic algorithm that simply ignores
string S generated in its nondeterministic (“guessing”) stage. But NP also contains the Hamiltonian
circuit problem, the partition problem, decision versions of the traveling salesman, the knapsack,
graph coloring, and many hundreds of other difficult combinatorial optimization problems. The halting
problem, on the other hand, is among the rare examples of decision problems that are known not to
be in NP.
Note that P = NP would imply that each of many hundreds of difficult combinatorial
decision problems can be solved by a polynomial-time algorithm.

Definition: A decision problem D1 is said to be polynomially reducible to a decision problem D2 if there exists a function t that transforms instances of D1 to instances of D2 such that:
1. t maps all yes instances of D1 to yes instances of D2 and all no instances of D1 to no
instances of D2.
2. t is computable by a polynomial time algorithm.

This definition immediately implies that if a problem D1 is polynomially reducible to some problem D2 that can be solved in polynomial time, then problem D1 can also be solved in polynomial time.

Definition: A decision problem D is said to be NP-complete if it is as hard as any problem in NP, i.e.,
1. it belongs to class NP, and
2. every problem in NP is polynomially reducible to D.
The fact that closely related decision problems are polynomially reducible to each other is not very
surprising. For example, let us prove that the Hamiltonian circuit problem is polynomially
reducible to the decision version of the traveling salesman problem.
FIGURE 5.6 Polynomial-time reductions of NP problems to an NP-complete problem.

Theorem: The Hamiltonian circuit problem is polynomially reducible to the decision version of the traveling salesman problem.

Proof:
We can map a graph G of a given instance of the Hamiltonian circuit problem to a complete
weighted graph G' representing an instance of the traveling salesman problem by assigning 1 as the
weight to each edge in G and adding an edge of weight 2 between any pair of nonadjacent vertices
in G. As the upper bound m on the Hamiltonian circuit length, we take m = n, where n is the
number of vertices in G (and G' ). Obviously, this transformation can be done in polynomial time.
Let G be a yes instance of the Hamiltonian circuit problem. Then G has a Hamiltonian
circuit, and its image in G' will have length n, making the image a yes instance of the decision
traveling salesman problem.
Conversely, if we have a tour of length not larger than n in G', then its length must be exactly n, and hence the tour must be made up of edges present in G, making the inverse image of this yes instance of the decision traveling salesman problem a yes instance of the Hamiltonian circuit problem.
This completes the proof.
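
The transformation used in the proof is purely mechanical. The sketch below (an illustration under the proof's assumptions, with hypothetical names) maps a graph G, given as a dict of adjacency sets, to the complete weighted graph G' and the tour-length bound m = n:

def hamiltonian_to_tsp(adj):
    # Map an HC instance to a decision-TSP instance (weights, m):
    # edges of G get weight 1, non-edges weight 2, and m = n, so G has
    # a Hamiltonian circuit iff G' has a tour of length <= n.
    vertices = sorted(adj)
    n = len(vertices)
    weight = {}
    for i, u in enumerate(vertices):
        for v in vertices[i + 1:]:
            weight[(u, v)] = 1 if v in adj[u] else 2
    return weight, n

The mapping examines each pair of vertices once, so it runs in O(n^2) time, which is polynomial, as the proof requires.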

Theorem (Cook's theorem): The CNF-satisfiability problem is NP-complete.

Proof idea: The notion of NP-completeness requires polynomial reducibility of all problems in NP, both known and unknown, to the problem in question. Given the bewildering variety of decision problems, it is nothing short of amazing that specific examples of NP-complete problems have actually been found. Nevertheless, this mathematical feat was accomplished independently by Stephen Cook in the United States and Leonid Levin in the former Soviet Union. In his 1971 paper, Cook [Coo71] showed that the so-called CNF-satisfiability problem is NP-complete.
x1 x2 x3   ¬x1 ¬x2 ¬x3   x1∨¬x2∨¬x3   ¬x1∨x2   ¬x1∨¬x2∨¬x3   (x1∨¬x2∨¬x3)∧(¬x1∨x2)∧(¬x1∨¬x2∨¬x3)
T  T  T    F   F   F         T          T           F            F
T  T  F    F   F   T         T          T           T            T
T  F  T    F   T   F         T          F           T            F
T  F  F    F   T   T         T          F           T            F
F  T  T    T   F   F         F          T           T            F
F  T  F    T   F   T         T          T           T            T
F  F  T    T   T   F         T          T           T            T
F  F  F    T   T   T         T          T           T            T

The CNF-satisfiability problem deals with boolean expressions. Each boolean expression can be represented in conjunctive normal form, such as the following expression involving three boolean variables x1, x2, and x3 and their negations denoted ¬x1, ¬x2, and ¬x3, respectively:

(x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x2) ∧ (¬x1 ∨ ¬x2 ∨ ¬x3)

The CNF-satisfiability problem asks whether or not one can assign values true and false to the variables of a given boolean expression in its CNF form to make the entire expression true. (It is easy to see that this can be done for the above formula: if x1 = true, x2 = true, and x3 = false, the entire expression is true, as the truth table above also shows.)
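
Evaluating a CNF formula under a given truth assignment is straightforward. In the illustrative sketch below (not from the original notes), a positive integer i stands for the variable xi and a negative integer -i for its negation:

# Clauses of (x1 v -x2 v -x3) ^ (-x1 v x2) ^ (-x1 v -x2 v -x3).
formula = [[1, -2, -3], [-1, 2], [-1, -2, -3]]

def satisfies(formula, assignment):
    # A clause is satisfied if at least one of its literals is true;
    # the whole CNF formula is satisfied if every clause is.
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

print(satisfies(formula, {1: True, 2: True, 3: False}))   # True

Note that only the check of one given assignment is polynomial; deciding satisfiability in general may require searching among all 2^n assignments.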
Since the Cook-Levin discovery of the first known NP-complete problems, computer
scientists have found many hundreds, if not thousands, of other examples. In particular, the well-
known problems (or their decision versions) mentioned above—Hamiltonian circuit, traveling
salesman, partition, bin packing, and graph coloring—are all NP-complete. It is known, however,
that if P != NP there must exist NP problems that neither are in P nor are NP-complete.

Showing that a decision problem is NP-complete can be done in two steps.


1. First, one needs to show that the problem in question is in NP; i.e., a randomly generated
string can be checked in polynomial time to determine whether or not it represents a
solution to the problem. Typically, this step is easy.
2. The second step is to show that every problem in NP is reducible to the problem in question in polynomial time. Because of the transitivity of polynomial reduction, this step can be done by showing that a known NP-complete problem can be transformed to the problem in question in polynomial time.
The definition of NP-completeness immediately implies that if there exists a deterministic
polynomial-time algorithm for just one NP-complete problem, then every problem in NP can be
solved in polynomial time by a deterministic algorithm, and hence P = NP.
FIGURE 5.7 NP-completeness by reduction.

Examples: TSP, knapsack, partition, graph coloring, and hundreds of other problems of combinatorial nature.

P = NP would imply that every problem in NP, including all NP-complete problems, could be solved in polynomial time. If a polynomial-time algorithm for just one NP-complete problem is discovered, then every problem in NP can be solved in polynomial time, i.e., P = NP. Most but not all researchers believe that P != NP, i.e., that P is a proper subset of NP. If P != NP, then the NP-complete problems are not in P, although many of them are very useful in practice.
FIGURE 5.8 Relation among P, NP, NP-hard and NP Complete problems

5.3 COPING WITH THE LIMITATIONS OF ALGORITHM POWER

There are some problems that are difficult to solve algorithmically. At the same time, some of them are so important that we must solve them by some other technique. Two algorithm design techniques, backtracking and branch-and-bound, often make it possible to solve at least some large instances of difficult combinatorial problems.

Both backtracking and branch-and-bound are based on the construction of a state-space tree
whose nodes reflect specific choices made for a solution’s components. Both techniques terminate
a node as soon as it can be guaranteed that no solution to the problem can be obtained by
considering choices that correspond to the node’s descendants.

We also consider a few approximation algorithms for the traveling salesman and knapsack problems, as well as three classic methods for approximate root finding: the bisection method, the method of false position, and Newton's method.

Exact Solution Strategies are given below:


Exhaustive search (brute force)
• useful only for small instances
Dynamic programming
• applicable to some problems (e.g., the knapsack problem)
Backtracking
• eliminates some unnecessary cases from consideration
• yields solutions in reasonable time for many instances but worst case is still
exponential
Branch-and-bound
• further refines the backtracking idea for optimization problems

Techniques for coping with the limitations of algorithm power are given below:


Backtracking
 n-Queens Problem
 Hamiltonian Circuit Problem
 Subset-Sum Problem
Branch-and-Bound
 Assignment Problem
 Knapsack Problem
 Traveling Salesman Problem
Approximation Algorithms for NP-Hard Problems
 Approximation Algorithms for the Traveling Salesman Problem
 Approximation Algorithms for the Knapsack Problem
Algorithms for Solving Nonlinear Equations
 Bisection Method
 False Position Method
 Newton’s Method

5.4 BACKTRACKING
 Backtracking is a more intelligent variation of exhaustive search.
 The principal idea is to construct solutions one component at a time and evaluate such
partially constructed candidates as follows.
 If a partially constructed solution can be developed further without violating the
problem’s constraints, it is done by taking the first remaining legitimate option for the
next component.
 If there is no legitimate option for the next component, no alternatives for any remaining
component need to be considered. In this case, the algorithm backtracks to replace the
last component of the partially constructed solution with its next option.
 It is convenient to implement this kind of processing by constructing a tree of choices
being made, called the state-space tree.
 Its root represents an initial state before the search for a solution begins.
 The nodes of the first level in the tree represent the choices made for the first component
of a solution, the nodes of the second level represent the choices for the second
component, and so on.
 A node in a state-space tree is said to be promising if it corresponds to a partially constructed solution that may still lead to a complete solution; otherwise, it is called nonpromising.
 Leaves represent either nonpromising dead ends or complete solutions found by the algorithm. In the majority of cases, a state-space tree for a backtracking algorithm is constructed in the manner of depth-first search.
 If the current node is promising, its child is generated by adding the first remaining
legitimate option for the next component of a solution, and the processing moves to this
child. If the current node turns out to be nonpromising, the algorithm backtracks to the
node’s parent to consider the next possible option for its last component; if there is no
such option, it backtracks one more level up the tree, and so on.
 Finally, if the algorithm reaches a complete solution to the problem, it either stops (if
just one solution is required) or continues searching for other possible solutions.
 Backtracking techniques are applied to solve the following problems
 n-Queens Problem
 Hamiltonian Circuit Problem
 Subset-Sum Problem
5.5 N-QUEENS PROBLEM
The problem is to place n queens on an n × n chessboard so that no two queens attack each
other by being in the same row or in the same column or on the same diagonal.

For n = 1, the problem has a trivial solution.


Q

For n = 2, it is easy to see that there is no solution: two queens cannot be placed on a 2 × 2 board without attacking each other.

For n = 3, it is likewise easy to check that there is no way to place 3 nonattacking queens on a 3 × 3 board.

[Boards illustrating the failed queen placements for n = 2 and n = 3.]

For n = 4, there is a solution. The four-queens problem can be solved by the backtracking technique as follows.

Step 1: Start with the empty board


1 2 3 4
1 queen 1
2 queen 2
3 queen 3
4 queen 4

Step 2: Place queen 1 in the first possible position of its row, which is in column 1 of row 1.
1 2 3 4
1 Q
2
3
4

Step 3: Place queen 2, after trying columns 1 and 2 unsuccessfully, in the first acceptable position for it, which is square (2, 3), the square in row 2 and column 3.
1 2 3 4
1 Q
2 Q
3
4

Step 4: This proves to be a dead end because there is no acceptable position for queen 3. So, the
algorithm backtracks and puts queen 2 in the next possible position at (2, 4).
1 2 3 4
1 Q
2 Q
3
4

Step 5: Then queen 3 is placed at (3, 2), which proves to be another dead end.
1 2 3 4
1 Q
2 Q
3 Q
4

Step 6: The algorithm then backtracks all the way to queen 1 and moves it to (1, 2).
1 2 3 4
1 Q
2
3
4

Step 7: Queen 2 goes to (2, 4).


1 2 3 4
1 Q
2 Q
3
4

Step 8: Queen 3 goes to (3, 1).


1 2 3 4
1 Q
2 Q
3 Q
4

Step 9: Queen 4 goes to (4, 3). This is a solution to the problem.
1 2 3 4
1 Q
2 Q
3 Q
4 Q

FIGURE 5.9 A solution to the four-queens problem on a 4 × 4 board.

The state-space tree of this search is shown in Figure 5.10.


FIGURE 5.10 State-space tree of solving the four-queens problem by backtracking. × denotes an
unsuccessful attempt to place a queen.

For n = 8, there is a solution placing 8 queens on an 8 × 8 chessboard:


1 2 3 4 5 6 7 8
1 Q
2 Q
3 Q
4 Q
5 Q
6 Q
7 Q
8 Q

FIGURE 5.11 A solution to the eight-queens problem on an 8 × 8 board.
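
The backtracking process traced above is short enough to code directly. Here is a minimal Python sketch (not part of the original notes) that places queens row by row, checking column and diagonal conflicts, and returns the first solution found:

def solve_n_queens(n):
    # Return one solution as a list of 0-based column indices, one per
    # row, or None if no placement of n nonattacking queens exists.
    cols = []                          # cols[r] = column of queen in row r

    def safe(row, col):
        # A new queen conflicts if it shares a column or a diagonal
        # with any previously placed queen.
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def backtrack(row):
        if row == n:
            return True                # all n queens placed: a solution
        for col in range(n):
            if safe(row, col):
                cols.append(col)       # try this column
                if backtrack(row + 1):
                    return True
                cols.pop()             # dead end: backtrack
        return False

    return cols if backtrack(0) else None

print(solve_n_queens(4))   # [1, 3, 0, 2], the solution of Figure 5.9
print(solve_n_queens(3))   # None, as argued above

Because columns are tried left to right, the search mirrors the sequence of placements and backtracks shown in Steps 2-9.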


5.6 HAMILTONIAN CIRCUIT PROBLEM
A Hamiltonian circuit (also called a Hamiltonian cycle, Hamilton cycle, or Hamilton
circuit) is a graph cycle (i.e., closed loop) through a graph that visits each node exactly once. A
graph possessing a Hamiltonian cycle is said to be a Hamiltonian graph.

FIGURE 5.12 Graph contains Hamiltonian circuit

Let us consider the problem of finding a Hamiltonian circuit in the graph in Figure 5.13.

Example: Find Hamiltonian circuit starts at vertex a.

FIGURE 5. 13 Graph.

Solution:
 Assume that if a Hamiltonian circuit exists, it starts at vertex a. Accordingly, we make vertex a the root of the state-space tree, as in Figure 5.14.
 In a graph G, a Hamiltonian cycle begins at some vertex v1 ∈ G and visits the remaining vertices exactly once each, in some order v1, v2, . . . , vn+1 (the vi are distinct except for v1 and vn+1, which are equal).
 The first component of our future solution, if it exists, is a first intermediate vertex of a Hamiltonian circuit to be constructed. Using alphabetical order to break the three-way tie among the vertices adjacent to a, we select vertex b. From b, the algorithm proceeds to c, then to d, then to e, and finally to f, which proves to be a dead end.
 So the algorithm backtracks from f to e, then to d, and then to c, which provides the first
alternative for the algorithm to pursue.
 Going from c to e eventually proves useless, and the algorithm has to backtrack from e to c
and then to b. From there, it goes to the vertices f , e, c, and d, from which it can
legitimately return to a, yielding the Hamiltonian circuit a, b, f , e, c, d, a. If we wanted to
find another Hamiltonian circuit, we could continue this process by backtracking from the
leaf of the solution found.
FIGURE 5.14 State-space tree for finding a Hamiltonian circuit.
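
A direct Python rendering of this search is given below (illustrative; the adjacency sets reconstruct the graph of Figure 5.13 from the walkthrough above, so they are an assumption about the lost figure):

def hamiltonian_circuit(adj, start='a'):
    # Backtracking search; returns a Hamiltonian circuit as a vertex
    # list, or None if the graph has none.
    n = len(adj)
    path = [start]

    def backtrack():
        if len(path) == n:
            # All vertices visited; the circuit closes only if the
            # last vertex is adjacent to the start.
            return start in adj[path[-1]]
        for v in sorted(adj[path[-1]]):      # alphabetical tie-breaking
            if v not in path:
                path.append(v)
                if backtrack():
                    return True
                path.pop()                   # dead end: backtrack
        return False

    return path + [start] if backtrack() else None

# Adjacency sets reconstructed from the walkthrough (an assumption):
g = {'a': {'b', 'c', 'd'}, 'b': {'a', 'c', 'f'}, 'c': {'a', 'b', 'd', 'e'},
     'd': {'a', 'c', 'e'}, 'e': {'c', 'd', 'f'}, 'f': {'b', 'e'}}
print(hamiltonian_circuit(g))   # ['a', 'b', 'f', 'e', 'c', 'd', 'a']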

5.7 SUBSET SUM PROBLEM


The subset-sum problem finds a subset of a given set A = {a1, . . . , an} of n positive
integers whose sum is equal to a given positive integer d. For example, for A = {1, 2, 5, 6, 8} and
d = 9, there are two solutions: {1, 2, 6} and {1, 8}. Of course, some instances of this problem may
have no solutions.

It is convenient to sort the set’s elements in increasing order. So, we will assume that
a1< a2 < . . . < an.

Consider the instance A = {3, 5, 6, 7} and d = 15 of the subset-sum problem, whose state-space tree is shown in Figure 5.15. The number inside a node is the sum of the elements already included in the subsets represented by the node. The inequality below a leaf indicates the reason for its termination.

FIGURE 5.15 Complete state-space tree of the backtracking algorithm applied to the instance A = {3, 5, 6, 7} and d = 15.
Example:
 The state-space tree can be constructed as a binary tree like that in Figure 5.15 for the
instance A = {3, 5, 6, 7} and d = 15.
 The root of the tree represents the starting point, with no decisions about the given elements
made as yet.
 Its left and right children represent, respectively, inclusion and exclusion of a1 in a set being
sought. Similarly, going to the left from a node of the first level corresponds to inclusion of
a2 while going to the right corresponds to its exclusion, and so on.
 Thus, a path from the root to a node on the ith level of the tree indicates which of the first i numbers have been included in the subsets represented by that node.
 We record the value of s, the sum of these numbers, in the node.
 If s is equal to d, we have a solution to the problem. We can either report this result and stop
or, if all the solutions need to be found, continue by backtracking to the node’s parent.
 If s is not equal to d, we can terminate the node as nonpromising if either of the following
two inequalities holds:
s + a[i+1] > d (the sum s is too large),
s + (a[i+1] + . . . + a[n]) < d (the sum s is too small).
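
A hedged Python sketch of this algorithm, with both pruning tests, is shown below (names are illustrative); it reports every subset of A that sums to d:

def subset_sum(a, d):
    # Backtracking for the subset-sum problem on positive integers.
    a = sorted(a)                        # assume a1 < a2 < ... < an
    solutions, chosen = [], []

    def backtrack(i, s, remaining):
        if s == d:
            solutions.append(list(chosen))
            return
        # Prune nonpromising nodes: s + a[i] > d (sum too large) or
        # s plus everything still available < d (sum too small).
        if i == len(a) or s + a[i] > d or s + remaining < d:
            return
        chosen.append(a[i])              # include a[i]
        backtrack(i + 1, s + a[i], remaining - a[i])
        chosen.pop()                     # exclude a[i]
        backtrack(i + 1, s, remaining - a[i])

    backtrack(0, 0, sum(a))
    return solutions

print(subset_sum([3, 5, 6, 7], 15))   # [[3, 5, 7]]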

General Remarks

From a more general perspective, most backtracking algorithms fit the following description. An output of a backtracking algorithm can be thought of as an n-tuple (x1, x2, . . . , xn) where each coordinate xi is an element of some finite linearly ordered set Si. For example, for the n-queens problem, each Si is the set of integers (column numbers) 1 through n.

A backtracking algorithm generates, explicitly or implicitly, a state-space tree; its nodes represent partially constructed tuples with the first i coordinates defined by the earlier actions of the algorithm. If such a tuple (x1, x2, . . . , xi) is not a solution, the algorithm finds the next element in Si+1 that is consistent with the values of (x1, x2, . . . , xi) and the problem's constraints, and adds it to the tuple as its (i + 1)st coordinate. If such an element does not exist, the algorithm backtracks to consider the next value of xi, and so on.

ALGORITHM Backtrack(X[1..i])
//Gives a template of a generic backtracking algorithm
//Input: X[1..i] specifies first i promising components of a solution
//Output: All the tuples representing the problem’s solutions
if X[1..i] is a solution write X[1..i]
else //see Problem this section
for each element x ∈ Si+1 consistent with X[1..i] and the constraints do
X[i + 1] ← x
Backtrack(X[1..i + 1])
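
A direct Python rendering of this template might look as follows (a sketch; the caller supplies is_solution and the consistent candidate sets S_{i+1} through next_options):

def backtrack(x, is_solution, next_options, report):
    # Generic backtracking over tuples x = [x1, ..., xi].
    #   is_solution(x)  -- does x represent a complete solution?
    #   next_options(x) -- elements of S_{i+1} consistent with x
    #   report(x)       -- called on every solution found
    if is_solution(x):
        report(list(x))
    else:
        for option in next_options(x):
            x.append(option)     # add the (i + 1)st coordinate
            backtrack(x, is_solution, next_options, report)
            x.pop()              # undo the choice, try the next option

For the n-queens problem, for instance, next_options would return the columns of the next row not attacked by the queens already placed.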
5.8 BRANCH AND BOUND

An optimization problem seeks to minimize or maximize some objective function, usually subject to some constraints. Note that in the standard terminology of optimization problems, a
feasible solution is a point in the problem’s search space that satisfies all the problem’s constraints
(e.g., a Hamiltonian circuit in the travelling salesman problem or a subset of items whose total
weight does not exceed the knapsack’s capacity in the knapsack problem), whereas an optimal
solution is a feasible solution with the best value of the objective function (e.g., the shortest
Hamiltonian circuit or the most valuable subset of items that fit the knapsack).

Compared to backtracking, branch-and-bound requires two additional items:


1. a way to provide, for every node of a state-space tree, a bound on the best value of the objective function on any solution that can be obtained by adding further components to the partially constructed solution represented by the node
2. the value of the best solution seen so far

If this information is available, we can compare a node’s bound value with the value of the
best solution seen so far. If the bound value is not better than the value of the best solution seen so
far—i.e., not smaller for a minimization problem and not larger for a maximization problem—the
node is nonpromising and can be terminated (some people say the branch is “pruned”). Indeed, no
solution obtained from it can yield a better solution than the one already available. This is the
principal idea of the branch-and-bound technique.

In general, we terminate a search path at the current node in a state-space tree of a branch-
and-bound algorithm for any one of the following three reasons:
1. The value of the node’s bound is not better than the value of the best solution seen so far.
2. The node represents no feasible solutions because the constraints of the problem are already
violated.
3. The subset of feasible solutions represented by the node consists of a single point (and
hence no further choices can be made)—in this case, we compare the value of the objective
function for this feasible solution with that of the best solution seen so far and update the
latter with the former if the new solution is better.

Some problems that can be solved by branch-and-bound are:


1. Assignment Problem
2. Knapsack Problem
3. Traveling Salesman Problem
5.9 ASSIGNMENT PROBLEM

Let us illustrate the branch-and-bound approach by applying it to the problem of assigning n people to n jobs so that the total cost of the assignment is as small as possible. An instance of the assignment problem is specified by an n × n cost matrix C.

We have to find a lower bound on the cost of an optimal selection without actually solving
the problem. We can do this by several methods. For example, it is clear that the cost of any
solution, including an optimal one, cannot be smaller than the sum of the smallest elements in each
of the matrix’s rows. For the instance here, this sum is 2 + 3+ 1+ 4 = 10. It is important to stress
that this is not the cost of any legitimate selection (3 and 1 came from the same column of the
matrix); it is just a lower bound on the cost of any legitimate selection. We can and will apply the
same thinking to partially constructed solutions. For example, for any legitimate selection that
selects 9 from the first row, the lower bound will be 9 + 3 + 1+ 4 = 17.

It is sensible to consider a node with the best bound as most promising, although this does
not, of course, preclude the possibility that an optimal solution will ultimately belong to a different
branch of the state-space tree. This variation of the strategy is called the best-first branch-and-
bound.

The lower-bound value for the root, denoted lb, is 10. The nodes on the first level of the tree
correspond to selections of an element in the first row of the matrix, i.e., a job for person a as
shown in Figure 5.15.

FIGURE 5.15 Levels 0 and 1 of the state-space tree for the instance of the assignment problem
being solved with the best-first branch-and-bound algorithm. The number above a node shows the
order in which the node was generated. A node’s fields indicate the job number assigned to person
a and the lower bound value, lb, for this node.

So we have four live leaves (promising leaves are also called live)—nodes 1 through 4—that may contain an optimal solution. The most promising of them is node 2 because it has the smallest lower-bound value. Following our best-first search strategy, we branch out from that node
first by considering the three different ways of selecting an element from the second row and not in
the second column—the three different jobs that can be assigned to person b (Figure 5.16).
FIGURE 5.16 Levels 0, 1, and 2 of the state-space tree for the instance of the assignment problem
being solved with the best-first branch-and-bound algorithm.

Of the six live leaves—nodes 1, 3, 4, 5, 6, and 7—that may contain an optimal solution, we again
choose the one with the smallest lower bound, node 5. First, we consider selecting the third
column’s element from c’s row (i.e., assigning person c to job 3); this leaves us with no choice but
to select the element from the fourth column of d’s row (assigning person d to job 4). This yields
leaf 8 (Figure 5.17), which corresponds to the feasible solution {a→2, b→1, c→3, d →4} with the
total cost of 13. Its sibling, node 9, corresponds to the feasible solution {a→2, b→1, c→4, d →3}
with the total cost of 25. Since its cost is larger than the cost of the solution represented by leaf 8,
node 9 is simply terminated. (Of course, if its cost were smaller than 13, we would have to replace
the information about the best solution seen so far with the data provided by this node.)

FIGURE 5.17 Complete state-space tree for the instance of the assignment problem solved with
the best-first branch-and-bound algorithm.

Now, as we inspect each of the live leaves of the last state-space tree—nodes 1, 3, 4, 6, and
7 in Figure 5.17—we discover that their lower-bound values are not smaller than 13, the value of
the best selection seen so far (leaf 8). Hence, we terminate all of them and recognize the solution
represented by leaf 8 as the optimal solution to the problem.
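
The bound function used throughout this walkthrough is simple to implement. The sketch below is illustrative; the cost matrix is an assumption, chosen to be consistent with every number quoted above (row minima 2, 3, 1, 4; the 9 in the first row; optimal cost 13):

def assignment_lb(cost, fixed_cols):
    # Lower bound for a partial solution that assigns person r to job
    # fixed_cols[r] for the first rows: the fixed costs plus, for each
    # remaining row, its smallest entry among still-unused columns.
    lb = sum(cost[r][c] for r, c in enumerate(fixed_cols))
    used = set(fixed_cols)
    for r in range(len(fixed_cols), len(cost)):
        lb += min(v for c, v in enumerate(cost[r]) if c not in used)
    return lb

C = [[9, 2, 7, 8],
     [6, 4, 3, 7],
     [5, 8, 1, 8],
     [7, 6, 9, 4]]
print(assignment_lb(C, []))    # 10, the bound at the root
print(assignment_lb(C, [0]))   # 17, after selecting 9 from the first row
print(assignment_lb(C, [1]))   # 10, node 2, the most promising leaf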
5.10 KNAPSACK PROBLEM
Let us now discuss how we can apply the branch-and-bound technique to solving the
knapsack problem. Given n items of known weights wi and values vi , i = 1, 2, . . . , n, and a
knapsack of capacity W, find the most valuable subset of the items that fit in the knapsack. It is
convenient to order the items of a given instance in descending order by their value-to-weight
ratios. Then the first item gives the best payoff per weight unit and the last one gives the worst
payoff per weight unit, with ties resolved arbitrarily:
v1/w1 ≥ v2/w2 ≥ . . . ≥ vn/wn.
It is natural to structure the state-space tree for this problem as a binary tree constructed as
follows. Each node on the ith level of this tree, 0 ≤ i ≤ n, represents all the subsets of n items that
include a particular selection made from the first i ordered items. This particular selection is
uniquely determined by the path from the root to the node: a branch going to the left indicates the
inclusion of the next item, and a branch going to the right indicates its exclusion. We record the
total weight w and the total value v of this selection in the node, along with some upper bound ub
on the value of any subset that can be obtained by adding zero or more items to this selection.
Item    Weight    Value    Value/weight
1       4         $40      10
2       7         $42      6
3       5         $25      5
4       3         $12      4
The knapsack capacity W is 10.
A simple way to compute the upper bound ub is to add to v, the total value of the items
already selected, the product of the remaining capacity of the knapsack W − w and the best per unit
payoff among the remaining items, which is vi+1/wi+1:
ub = v + (W − w)(vi+1/wi+1).

At the root, for example, ub = 0 + (10 − 0) * 10 = $100.

FIGURE 5.18 State-space tree of the best-first branch-and-bound algorithm for the instance of the
knapsack problem.
At the root of the state-space tree (see Figure 5.18), no items have been selected as yet.
Hence, both the total weight of the items already selected w and their total value v are equal to 0.
The value of the upper bound computed by the formula above is $100. Node 1, the left child of the
root, represents the subsets that include item 1. The total weight and value of the items already
included are 4 and $40, respectively; the value of the upper bound is 40 + (10 − 4) * 6 = $76. Node
2 represents the subsets that do not include item 1. Accordingly, w = 0, v = $0, and ub = 0 + (10 −
0) * 6 = $60. Since node 1 has a larger upper bound than the upper bound of node 2, it is more
promising for this maximization problem, and we branch from node 1 first. Its children—nodes 3
and 4—represent subsets with item 1 and with and without item 2, respectively.

Since the total weight w of every subset represented by node 3 exceeds the knapsack’s capacity,
node 3 can be terminated immediately. Node 4 has the same values of w and v as its parent; the
upper bound ub is equal to 40 + (10 − 4) * 5 = $70. Selecting node 4 over node 2 for the next
branching (why?), we get nodes 5 and 6 by respectively including and excluding item 3. The total
weights and values as well as the upper bounds for these nodes are computed in the same way as
for the preceding nodes. Branching from node 5 yields node 7, which represents no feasible
solutions, and node 8, which represents just a single subset {1, 3} of value $65. The remaining live
nodes 2 and 6 have smaller upper-bound values than the value of the solution represented by node
8. Hence, both can be terminated making the subset {1, 3} of node 8 the optimal solution to the
problem.
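
The upper-bound computation itself is a one-line formula. The sketch below (illustrative only) evaluates it for the first few nodes of Figure 5.18, with the items already sorted by value-to-weight ratio:

def upper_bound(values, weights, W, i, cur_w, cur_v):
    # ub = v + (W - w) * (v_{i+1} / w_{i+1}), where i is the index of
    # the next item to decide on; with no items left, ub is just v.
    if i < len(values):
        return cur_v + (W - cur_w) * (values[i] / weights[i])
    return cur_v

values, weights, W = [40, 42, 25, 12], [4, 7, 5, 3], 10
print(upper_bound(values, weights, W, 0, 0, 0))    # 100.0 at the root
print(upper_bound(values, weights, W, 1, 4, 40))   # 76.0 at node 1
print(upper_bound(values, weights, W, 1, 0, 0))    # 60.0 at node 2
print(upper_bound(values, weights, W, 2, 4, 40))   # 70.0 at node 4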

Solving the knapsack problem by a branch-and-bound algorithm has a rather unusual characteristic. Typically, internal nodes of a state-space tree do not define a point of the problem's search space, because some of the solution's components remain undefined. For the knapsack problem, however, every node of the tree represents a subset of the items given, so the information about the best subset seen so far can be updated as each new node is generated. If we had done this for the instance investigated above, we could have terminated nodes 2 and 6 before node 8 was generated, because they both are inferior to the subset of value $65 of node 5.

5.11 TRAVELING SALESMAN PROBLEM

We will be able to apply the branch-and-bound technique to instances of the travelling salesman problem if we come up with a reasonable lower bound on tour lengths. One very simple
lower bound can be obtained by finding the smallest element in the intercity distance matrix D and
multiplying it by the number of cities n. But there is a less obvious and more informative lower
bound for instances with symmetric matrix D, which does not require a lot of work to compute. It is
not difficult to show (Problem 8 in this section’s exercises) that we can compute a lower bound on
the length l of any tour as follows. For each city i, 1≤ i ≤ n, find the sum si of the distances from
city i to the two nearest cities; compute the sum s of these n numbers, divide the result by 2, and, if
all the distances are integers, round up the result to the nearest integer:
lb = ⌈s/2⌉
FIGURE 5.19 (a)Weighted graph. (b) State-space tree of the branch-and-bound algorithm to find a
shortest Hamiltonian circuit in this graph. The list of vertices in a node specifies a beginning part of
the Hamiltonian circuits represented by the node.

For example, for the instance in Figure 5.19a, the formula above yields
lb = ⌈[(1 + 3) + (3 + 6) + (1 + 2) + (3 + 4) + (2 + 3)]/2⌉ = 14.
Moreover, for any subset of tours that must include particular edges of a given graph, we can modify the lower bound accordingly. For example, for all the Hamiltonian circuits of the graph in Figure 5.19a that must include edge (a, d), we get the following lower bound by summing up the lengths
of the two shortest edges incident with each of the vertices, with the required inclusion of edges (a,
d) and (d, a):

lb = ⌈[(1 + 5) + (3 + 6) + (1 + 2) + (3 + 5) + (2 + 3)]/2⌉ = 16.


We now apply the branch-and-bound algorithm, with the bounding function given by this formula, to find the shortest Hamiltonian circuit for the graph in Figure 5.19a. To reduce the amount of potential work, we take advantage of two observations. First, without loss of generality, we can consider only tours that start at a. Second, because our graph is undirected, we can generate only tours in which b is visited before c. In addition, after visiting n − 1 = 4 cities, a tour has no choice but to visit the remaining unvisited
city and return to the starting one. The state-space tree tracing the algorithm’s application is given
in Figure 5.19b.
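
The bounding function is easy to compute from a distance matrix. In the sketch below (illustrative; the matrix is an assumption chosen to be consistent with the two-nearest-city sums used above), each city contributes the distances to its two nearest neighbors:

import math

def tsp_lower_bound(dist):
    # lb = ceil(s / 2), where s sums, over every city, the distances
    # to its two nearest other cities.
    s = 0
    for i, row in enumerate(dist):
        nearest = sorted(d for j, d in enumerate(row) if j != i)
        s += nearest[0] + nearest[1]
    return math.ceil(s / 2)

# Distances assumed for cities a, b, c, d, e of Figure 5.19a:
D = [[0, 3, 1, 5, 8],
     [3, 0, 6, 7, 9],
     [1, 6, 0, 4, 2],
     [5, 7, 4, 0, 3],
     [8, 9, 2, 3, 0]]
print(tsp_lower_bound(D))   # 14, matching the computation above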
5.12 APPROXIMATION ALGORITHMS FOR NP HARD PROBLEMS

Now we are going to discuss a different approach to handling difficult problems of combinatorial optimization, such as the travelling salesman problem and the knapsack problem.
The decision versions of these problems are NP-complete. Their optimization versions fall in the
class of NP-hard problems—problems that are at least as hard as NP-complete problems. Hence,
there are no known polynomial-time algorithms for these problems, and there are serious
theoretical reasons to believe that such algorithms do not exist.

Approximation algorithms run a gamut in level of sophistication; most of them are based on
some problem-specific heuristic. A heuristic is a common-sense rule drawn from experience rather
than from a mathematically proved assertion. For example, going to the nearest unvisited city in the
travelling salesman problem is a good illustration of this notion.

Of course, if we use an algorithm whose output is just an approximation of the actual optimal solution, we would like to know how accurate this approximation is. We can quantify the
accuracy of an approximate solution sa to a problem of minimizing some function f by the size of
the relative error (re) of this approximation,

re(sa) = (f(sa) − f(s*)) / f(s*),

where s* is an exact solution to the problem. Alternatively, since re(sa) = f(sa)/f(s*) − 1, we can simply use the accuracy ratio

r(sa) = f(sa)/f(s*)

as a measure of accuracy of sa. Note that for the sake of scale uniformity, the accuracy ratio of approximate solutions to maximization problems is usually computed as

r(sa) = f(s*)/f(sa)
to make this ratio greater than or equal to 1, as it is for minimization problems. Obviously, the
closer r(sa) is to 1, the better the approximate solution is. For most instances, however, we cannot
compute the accuracy ratio, because we typically do not know f (s*), the true optimal value of the
objective function. Therefore, our hope should lie in obtaining a good upper bound on the values of
r(sa). This leads to the following definitions.

A polynomial-time approximation algorithm is said to be a c-approximation algorithm, where c ≥ 1, if the accuracy ratio of the approximation it produces does not exceed c for any instance of the problem in question: r(sa) ≤ c.

The best (i.e., the smallest) value of c for which the inequality r(sa) ≤ c holds for all instances of the problem is called the performance ratio of the algorithm and denoted RA.

The performance ratio serves as the principal metric indicating the quality of the
approximation algorithm. We would like to have approximation algorithms with RA as close to 1
as possible. Unfortunately, as we shall see, some approximation algorithms have infinitely large
performance ratios (RA = ∞). This does not necessarily rule out using such algorithms, but it does
call for a cautious treatment of their outputs.

Approximation algorithms for NP-hard problems are presented here for:

 Traveling salesman problem (TSP)
 Knapsack problem
5.13 TRAVELING SALESMAN PROBLEM (APPROXIMATION ALGORITHM)

Greedy Algorithms for the TSP

The simplest approximation algorithms for the traveling salesman problem are based on the greedy technique. We will discuss here two such algorithms.
1. Nearest-neighbor algorithm
2. Minimum-Spanning-Tree–Based Algorithms

NEAREST-NEIGHBOR ALGORITHM
The following well-known greedy algorithm is based on the nearest-neighbor heuristic:
always go next to the nearest unvisited city.
Step 1 Choose an arbitrary city as the start.
Step 2 Repeat the following operation until all the cities have been visited: go to the unvisited city
nearest the one visited last (ties can be broken arbitrarily).
Step 3 Return to the starting city.

EXAMPLE 1 For the instance represented by the graph in Figure 5.20, with a as the starting
vertex, the nearest-neighbor algorithm yields the tour (Hamiltonian circuit) sa: a − b − c − d − a of
length 10.

FIGURE 5.20 Instance of the traveling salesman problem.


The optimal solution, as can be easily checked by exhaustive search, is the tour s*: a − b − d − c – a
of length 8. Thus, the accuracy ratio of this approximation is

r(sa) = f(sa)/f(s*) = 10/8 = 1.25

(i.e., tour sa is 25% longer than the optimal tour s*).
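
The heuristic itself takes only a few lines. In the Python sketch below (illustrative; the distance matrix is an assumption consistent with the two tour lengths quoted: a-b = 1, b-c = 2, c-d = 1, a-c = 3, b-d = 3, a-d = 6), cities are numbered 0..3 for a..d:

def nearest_neighbor_tour(dist, start=0):
    # Greedy heuristic: always move to the closest unvisited city,
    # breaking ties by lower index, then return to the start.
    n = len(dist)
    tour, visited = [start], {start}
    while len(tour) < n:
        last = tour[-1]
        nxt = min((j for j in range(n) if j not in visited),
                  key=lambda j: dist[last][j])
        tour.append(nxt)
        visited.add(nxt)
    tour.append(start)
    length = sum(dist[u][v] for u, v in zip(tour, tour[1:]))
    return tour, length

D = [[0, 1, 3, 6],
     [1, 0, 2, 3],
     [3, 2, 0, 1],
     [6, 3, 1, 0]]
print(nearest_neighbor_tour(D))   # ([0, 1, 2, 3, 0], 10): tour a-b-c-d-a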

Multifragment-heuristic algorithm
Another natural greedy algorithm for the traveling salesman problem considers it as the
problem of finding a minimum-weight collection of edges in a given complete weighted graph so
that all the vertices have degree 2.
Step 1 Sort the edges in increasing order of their weights. (Ties can be broken arbitrarily.)
Initialize the set of tour edges to be constructed to the empty set.
Step 2 Repeat this step n times, where n is the number of cities in the instance being solved:
add the next edge on the sorted edge list to the set of tour edges, provided this
addition does not create a vertex of degree 3 or a cycle of length less than n;
otherwise, skip the edge.
Step 3 Return the set of tour edges.

As an example, applying the algorithm to the graph in Figure 5.20 yields {(a, b), (c, d), (b,
c), (a, d)}. This set of edges forms the same tour as the one produced by the nearest-neighbor
algorithm. In general, the multifragment-heuristic algorithm tends to produce significantly better
tours than the nearest-neighbor algorithm, as we are going to see from the experimental data quoted
at the end of this section. But the performance ratio of the multifragment-heuristic algorithm is also
unbounded, of course.
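
A hedged sketch of the multifragment heuristic follows (illustrative; it reuses the assumed matrix of the previous sketch and uses union-find to detect cycles shorter than n):

def multifragment_tour_edges(dist):
    # Add edges cheapest-first, skipping any edge that would give a
    # vertex degree 3 or close a cycle of length less than n.
    n = len(dist)
    edges = sorted((dist[i][j], i, j)
                   for i in range(n) for j in range(i + 1, n))
    degree, parent = [0] * n, list(range(n))

    def find(x):                       # union-find root with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    for _, u, v in edges:
        if degree[u] == 2 or degree[v] == 2:
            continue                   # would create a vertex of degree 3
        ru, rv = find(u), find(v)
        if ru == rv and len(chosen) < n - 1:
            continue                   # would close a cycle of length < n
        chosen.append((u, v))
        degree[u] += 1
        degree[v] += 1
        parent[ru] = rv
        if len(chosen) == n:
            break
    return chosen

D = [[0, 1, 3, 6],
     [1, 0, 2, 3],
     [3, 2, 0, 1],
     [6, 3, 1, 0]]
print(multifragment_tour_edges(D))   # [(0, 1), (2, 3), (1, 2), (0, 3)]

These are the edges {(a, b), (c, d), (b, c), (a, d)} mentioned in the text.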
There is, however, a very important subset of instances, called Euclidean, for which we can make a nontrivial assertion about the accuracy of both the nearest-neighbor and multifragment-heuristic algorithms. These are the instances in which intercity distances satisfy the following natural conditions:
 triangle inequality: d[i, j] ≤ d[i, k] + d[k, j] for any triple of cities i, j, and k (the distance between cities i and j cannot exceed the length of a two-leg path from i to some intermediate city k to j)
 symmetry: d[i, j] = d[j, i] for any pair of cities i and j (the distance from i to j is the same as the distance from j to i)

MINIMUM-SPANNING-TREE–BASED ALGORITHMS
There are approximation algorithms for the travelling salesman problem that exploit a
connection between Hamiltonian circuits and spanning trees of the same graph. Since removing an
edge from a Hamiltonian circuit yields a spanning tree, we can expect that the structure of a
minimum spanning tree provides a good basis for constructing a shortest tour approximation. Here
is an algorithm that implements this idea in a rather straightforward fashion.
Twice-around-the-tree algorithm
Step 1 Construct a minimum spanning tree of the graph corresponding to a given instance
of the traveling salesman problem.
Step 2 Starting at an arbitrary vertex, perform a walk around the minimum spanning tree
recording all the vertices passed by. (This can be done by a DFS traversal.)
Step 3 Scan the vertex list obtained in Step 2 and eliminate from it all repeated occurrences
of the same vertex except the starting one at the end of the list. (This step is
equivalent to making shortcuts in the walk.) The vertices remaining on the list will
form a Hamiltonian circuit, which is the output of the algorithm.

EXAMPLE 2 Let us apply this algorithm to the graph in Figure 5.21a. The minimum spanning tree
of this graph is made up of edges (a, b), (b, c), (b, d), and (d, e) (Figure 5.21b). A twice-around-the-
tree walk that starts and ends at a is a, b, c, b, d, e, d, b, a. Eliminating the second b (a shortcut
from c to d), the second d, and the third b (a shortcut from e to a) yields the Hamiltonian circuit a,
b, c, d, e, a of length 39.

FIGURE 5.21 Illustration of the twice-around-the-tree algorithm. (a) Graph. (b) Walk around the
minimum spanning tree with the shortcuts.
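
An illustrative implementation (not from the original notes) combines Prim's algorithm with a preorder walk; keeping each vertex only at its first occurrence performs Steps 2 and 3 at once. It is demonstrated here on the small instance of Figure 5.20 rather than the lost Figure 5.21:

def twice_around_the_tree(dist, start=0):
    n = len(dist)
    # Step 1: Prim's algorithm for a minimum spanning tree.
    in_tree = [False] * n
    best = [float('inf')] * n
    parent = [None] * n
    best[start] = 0
    children = {v: [] for v in range(n)}
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]),
                key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u
    # Steps 2 and 3: walk the tree depth-first, keeping each vertex
    # only the first time it is met (the shortcuts).
    tour = []
    def dfs(u):
        tour.append(u)
        for v in children[u]:
            dfs(v)
    dfs(start)
    tour.append(start)
    return tour, sum(dist[u][v] for u, v in zip(tour, tour[1:]))

D = [[0, 1, 3, 6],
     [1, 0, 2, 3],
     [3, 2, 0, 1],
     [6, 3, 1, 0]]
print(twice_around_the_tree(D))   # ([0, 1, 2, 3, 0], 10)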

5.14 KNAPSACK PROBLEM (APPROXIMATION ALGORITHM)


The knapsack problem is one well-known NP-hard problem. Given n items of known
weights w1, . . . , wn and values v1, . . . , vn and a knapsack of weight capacity W, find the most
valuable subset of the items that fits into the knapsack.

GREEDY ALGORITHMS FOR THE KNAPSACK PROBLEM


We can think of several greedy approaches to this problem. One is to select the items in
decreasing order of their weights; however, heavier items may not be the most valuable in the set.
Alternatively, if we pick the items in decreasing order of their value, there is no guarantee that the knapsack's capacity will be used efficiently. Instead, we can find a greedy strategy that takes into account both the weights and the values by computing the value-to-weight ratios vi/wi, i = 1, 2, . . . , n, and selecting the items in decreasing order of these ratios. Here is the algorithm based on this greedy heuristic.

Greedy algorithm for the discrete knapsack problem


Step 1 Compute the value-to-weight ratios ri = vi/wi, i = 1, . . . , n, for the items given.
Step 2 Sort the items in nonincreasing order of the ratios computed in Step 1.(Ties can be
broken arbitrarily.)
Step 3 Repeat the following operation until no item is left in the sorted list: if the current
item on the list fits into the knapsack, place it in the knapsack and proceed to the
next item; otherwise, just proceed to the next item.

EXAMPLE 1 Let us consider the instance of the knapsack problem with the knapsack capacity 10
and the item information as follows:

Item weight value


1 4 $40
2 7 $42
3 5 $25
4 3 $12

Computing the value-to-weight ratios and sorting the items in nonincreasing order of these ratios yields

Item    weight    value    value/weight
1       4         $40      10
2       7         $42      6
3       5         $25      5
4       3         $12      4
The knapsack capacity W is 10.

The greedy algorithm will select the first item of weight 4, skip the next item of weight 7,
select the next item of weight 5, and skip the last item of weight 3. The solution obtained happens
to be optimal for this instance. So the total items value in knapsack is $65.
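
A compact Python rendering of Steps 1-3 (illustrative only):

def greedy_knapsack(weights, values, W):
    # Greedy heuristic for the discrete knapsack: take items in
    # nonincreasing order of value/weight whenever they still fit.
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    taken, load, value = [], 0, 0
    for i in order:
        if load + weights[i] <= W:
            taken.append(i + 1)        # 1-based item numbers as in the text
            load += weights[i]
            value += values[i]
    return taken, value

print(greedy_knapsack([4, 7, 5, 3], [40, 42, 25, 12], 10))   # ([1, 3], 65)

Keep in mind that for the discrete problem this greedy strategy is only a heuristic; it happens to be optimal on this instance but is not in general.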

GREEDY ALGORITHM FOR THE CONTINUOUS KNAPSACK PROBLEM


Step 1 Compute the value-to-weight ratios vi/wi, i = 1, . . . , n, for the items given.
Step 2 Sort the items in nonincreasing order of the ratios computed in Step 1. (Ties can be
broken arbitrarily.)
Step 3 Repeat the following operation until the knapsack is filled to its full capacity or no
item is left in the sorted list: if the current item on the list fits into the knapsack in its
entirety, take it and proceed to the next item; otherwise, take its largest fraction to
fill the knapsack to its full capacity and stop.
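
For the continuous version, the same idea can be sketched as follows (illustrative; here the greedy strategy is in fact optimal):

def greedy_fractional_knapsack(weights, values, W):
    # Take items by nonincreasing value/weight ratio; split the first
    # item that does not fit and stop.
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    load, value, fractions = 0.0, 0.0, {}
    for i in order:
        if load >= W:
            break
        take = min(weights[i], W - load)   # whole item or largest fraction
        fractions[i + 1] = take / weights[i]
        load += take
        value += values[i] * take / weights[i]
    return fractions, value

print(greedy_fractional_knapsack([4, 7, 5, 3], [40, 42, 25, 12], 10))
# ({1: 1.0, 2: 0.857...}, 76.0): all of item 1 and 6/7 of item 2

On the instance of Example 1 this yields $76, which also illustrates that the continuous optimum is always an upper bound on the discrete one ($65 here).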

EXAMPLE 2 A small example of an approximation scheme with k = 2 is provided below. The algorithm yields {1, 3, 4}, which is the optimal solution for this instance.

Item    weight    value    value/weight
1       4         $40      10
2       7         $42      6
3       5         $25      5
4       1         $4       4
The knapsack capacity W is 10.

subset    added items    value
{}        1, 3, 4        $69
{1}       3, 4           $69
{2}       4              $46
{3}       1, 4           $69
{4}       1, 3           $69
{1, 2}    not feasible
{1, 3}    4              $69
{1, 4}    3              $69
{2, 3}    not feasible
{2, 4}    —              $46
{3, 4}    1              $69

For each of those subsets, it needs O(n) time to determine the subset's possible extension. Thus, the algorithm's efficiency is in O(k·n^(k+1)). Note that although it is polynomial in n, the time efficiency of Sahni's scheme is exponential in k. More sophisticated approximation schemes, called fully polynomial schemes, do not have this shortcoming.
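
The enumeration behind the table above can be sketched as follows (illustrative Python; it generates every subset of at most k items, completes each greedily, and keeps the best completion found):

from itertools import combinations

def sahni_knapsack(weights, values, W, k=2):
    n = len(weights)
    order = sorted(range(n),
                   key=lambda i: values[i] / weights[i], reverse=True)
    best_set, best_value = [], 0
    for size in range(k + 1):
        for subset in combinations(range(n), size):
            load = sum(weights[i] for i in subset)
            if load > W:
                continue               # infeasible starting subset
            chosen = list(subset)
            value = sum(values[i] for i in subset)
            for i in order:            # greedy extension by ratio
                if i not in chosen and load + weights[i] <= W:
                    chosen.append(i)
                    load += weights[i]
                    value += values[i]
            if value > best_value:
                best_set, best_value = sorted(chosen), value
    return [i + 1 for i in best_set], best_value

print(sahni_knapsack([4, 7, 5, 1], [40, 42, 25, 4], 10))   # ([1, 3, 4], 69)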
