MODULE 5
Decision Trees
Many important algorithms, especially those for sorting and searching, work by comparing items of their inputs. We can study the performance of such algorithms with a device called a decision tree.
Definition: “A decision tree (also called a comparison tree) is a binary tree that represents the comparisons made among the given elements of an array while sorting or searching”.
Example: Obtain a decision tree of an algorithm for finding a minimum of three numbers.
• The algorithm’s work on a particular input of size n can be traced by a path from the root to a leaf in its decision tree, and the number of comparisons made by the algorithm on such a run is equal to the length of this path. Hence, the number of comparisons in the worst case is equal to the height of the algorithm’s decision tree. We know that for any binary tree with l leaves and height h,
h ≥ ⌈log2 l⌉.
Dept., of CSE Page 1
Analysis and Design of Algorithms BCS401
• This puts a lower bound on the heights of binary decision trees and hence the worst-
case number of comparisons made by any comparison-based algorithm for the problem
in question. Such a bound is called the information theoretic lower bound.
• Most sorting algorithms are comparison based, i.e., they work by comparing elements
in a list to be sorted. By studying properties of decision trees for such algorithms, we
can derive important lower bounds on their time efficiencies.
Example: Show the decision tree to sort elements using selection sort and show that the lower bound is ⌈log2 n!⌉.
Answer: Consider, as an example, a three-element list a, b, c of orderable items such as real
numbers or strings.
• For the outcome a < c < b obtained by sorting this list, the permutation in question is 1, 3, 2.
• In general, the number of possible outcomes for sorting an arbitrary n- element
list is equal to n!.
Example 2: Construct the decision tree for the three-element insertion sort.
Answer: The following diagram shows the decision tree for the three-element insertion sort; the average number of comparisons is (2 + 3 + 3 + 2 + 3 + 3)/6 = 2 2/3.
Under the standard assumption that all n! outcomes of sorting are equally likely, the following
lower bound on the average number of comparisons Cavg made by any comparison-based
algorithm in sorting an n-element list has been proved: Cavg(n) ≥ log2 n!.
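As a quick sanity check (a Python sketch of my own, not part of the notes), the bound ⌈log2 n!⌉ can be computed directly for small n:

```python
import math

def sorting_lower_bound(n):
    """Information-theoretic lower bound ceil(log2 n!) on the
    worst-case number of comparisons needed to sort n items."""
    return math.ceil(math.log2(math.factorial(n)))

# For n = 3 the bound is ceil(log2 6) = 3, matching the height
# of the three-element decision trees discussed above.
for n in range(2, 8):
    print(n, sorting_lower_bound(n))
```

For n = 3 this prints 3, in agreement with the decision trees for the three-element sorts above.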
The principal algorithm for this problem (searching a sorted array) is binary search. The number of comparisons made by binary search in the worst case, Cbs worst(n), is given by the formula
Cbs worst(n) = ⌊log2 n⌋ + 1 = ⌈log2(n + 1)⌉.
• Since the minimum height h of a ternary tree with l leaves is ⌈log3 l⌉, we get the following lower bound on the number of worst-case comparisons: ⌈log3(2n + 1)⌉.
• To obtain a better lower bound, we should consider binary rather than ternary decision trees, such as the one in Figure 11.5.
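To illustrate the worst-case count (a sketch with instrumentation added for illustration), binary search with one three-way comparison per iteration makes ⌊log2 n⌋ + 1 comparisons on an unsuccessful search of n sorted keys:

```python
import math

def binary_search(a, key):
    """Three-way-comparison binary search.
    Returns (index or -1, number of three-way comparisons made)."""
    low, high, comparisons = 0, len(a) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1                 # one three-way comparison of key vs a[mid]
        if key == a[mid]:
            return mid, comparisons
        elif key < a[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return -1, comparisons

n = 16
a = list(range(n))
_, c = binary_search(a, n)               # key larger than all: worst case
print(c, math.floor(math.log2(n)) + 1)   # both are 5
```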
DEFINITION 1: We say that an algorithm solves a problem in polynomial time if its worst-case time efficiency belongs to O(p(n)), where p(n) is a polynomial of the problem’s input size n (e.g., efficiencies such as n, log n, n log n, n², or n³).
• Problems that can be solved in polynomial time are called tractable, and
• Problems that cannot be solved in polynomial time are called intractable.
P - Problems
Most problems discussed in this syllabus can be solved in polynomial time by some
algorithm.
• Informally, we can think about problems that can be solved in polynomial time as the
set that computer science theoreticians call P.
• A more formal definition includes in P only decision problems, which are problems
with yes/no answers.
Examples:
• Computing the product and the greatest common divisor of two integers,
• sorting a list
• searching for a key in a list or for a pattern in a text string,
• checking connectivity and acyclicity of a graph,
• and finding a minimum spanning tree and shortest paths in a weighted graph etc.
Some decision problems cannot be solved at all by any algorithm. Such problems are called
undecidable, as opposed to decidable problems that can be solved by an algorithm.
• An example of an undecidable problem is the halting problem: given a computer program and an input to it, determine whether the program will halt on that input or continue working indefinitely on it.
NP-problems:
There are many important problems, however, for which no polynomial-time algorithm has been found, nor has the impossibility of such an algorithm been proved. What all these problems have in common is exponential growth (of order n! or 2^n) in the number of candidate solutions to examine.
• However, a common feature of a vast majority of decision problems is the fact that
although solving such problems can be computationally difficult, checking whether a
proposed solution actually solves the problem is computationally easy, i.e., it can be
done in polynomial time.
• A nondeterministic algorithm is a two-stage procedure that, given an instance I of a decision problem, first “guesses” a candidate solution S and then deterministically “verifies” whether S is a solution to I, returning yes if it is. (If S is not a solution to instance I, the algorithm either returns no or is allowed not to halt at all.)
• We say that a nondeterministic algorithm solves a decision problem if and only if for
every yes instance of the problem it returns yes on some execution.
DEFINITION 4: NP-Problems: Class NP is the class of decision problems that can be solved
by nondeterministic polynomial algorithms. This class of problems is called nondeterministic
polynomial problems.
Most decision problems are in NP. First of all, this class includes all the problems in P:
P ⊆ NP.
NP-Complete Problems
Informally, an NP-complete problem is a problem in NP that is as difficult as any other problem
in this class because, by definition, any other problem in NP can be reduced to it in polynomial
time, shown symbolically in Figure 11.6
Backtracking
Backtracking is a more intelligent variation of exhaustive search (which generates all candidate solutions and identifies the one with the desired property). The principal idea is to construct solutions one component at a time and evaluate such partially constructed candidates as follows.
• If a partially constructed solution can be developed further without violating the
problem’s constraints, it is done by taking the first remaining legitimate option for the
next component.
• If there is no legitimate option for the next component, no alternatives for any remaining
component need to be considered. In this case, the algorithm backtracks to replace the
last component of the partially constructed solution with its next option.
• It is convenient to implement this kind of processing by constructing a tree of choices
being made, called the state-space tree.
➢ Its root represents an initial state before the search for a solution begins. The nodes
of the first level in the tree represent the choices made for the first component of a
solution, the nodes of the second level represent the choices for the second
component, and so on.
➢ A node in a state-space tree is said to be promising if it corresponds to a partially
constructed solution that may still lead to a complete solution; otherwise, it is called
nonpromising.
➢ Leaves represent either nonpromising dead ends or complete solutions found by the
algorithm.
ALGORITHM Backtrack(X[1..i])
//Gives a template of a generic backtracking algorithm
//Input: X[1..i] specifies the first i promising components of a solution
//Output: All the tuples representing the problem’s solutions
if X[1..i] is a solution write X[1..i]
else
    for each element x ∈ Si+1 consistent with X[1..i] and the constraints do
        X[i + 1] ← x
        Backtrack(X[1..i + 1])
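The template can be rendered in Python roughly as follows (a sketch; the callbacks `candidates` and `is_solution` are my own names for the problem-specific parts, i.e., the sets Si+1 and the solution test):

```python
def backtrack(x, candidates, is_solution, solutions):
    """Generic backtracking: x holds the first i promising components.

    candidates(x) yields the legitimate options for component i+1;
    is_solution(x) tests whether x is a complete solution."""
    if is_solution(x):
        solutions.append(list(x))
    else:
        for option in candidates(x):
            x.append(option)             # try the next legitimate option
            backtrack(x, candidates, is_solution, solutions)
            x.pop()                      # backtrack: undo the choice

# Tiny demo: all binary strings of length 3 with no two adjacent 1s.
sols = []
backtrack([],
          lambda x: [b for b in (0, 1) if not (x and x[-1] == 1 and b == 1)],
          lambda x: len(x) == 3,
          sols)
print(len(sols))   # 5 such strings
```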
Examples:
1. With the help of a state-space tree, solve the 4-Queens problem by using the backtracking approach. 10M
2. Illustrate the n-queens problem using backtracking to solve the 4-Queens problem. 10M
3. Explain the n-queens problem with an example using the backtracking approach. 10M
a) n-Queens Problem
The problem is to place n queens on an n × n chessboard so that no two queens attack
each other by being in the same row or in the same column or on the same diagonal.
• For n = 1, the problem has a trivial solution,
• It is easy to see that there is no solution for n = 2 and n = 3.
• So let us consider the four-queens problem and solve it by the backtracking technique.
• Since each of the four queens has to be placed in its own row, all we need to do is to
assign a column for each queen on the board presented in Figure 12.1.
(× denotes an unsuccessful attempt to place a queen in the indicated column. The numbers above the nodes indicate the order in which the nodes are generated.)
Basically, for n=4, we have 2 solutions. The solution vector and position of each queen on the
chess board for the 1st solution can be written as,
(x1, x2, x3, x4) = (2, 4, 1, 3).
The 2nd solution is,
(x1, x2, x3, x4)=(3, 1, 4, 2). (refer classwork for the 2nd solution)
Time complexity: T(n) ∈ O(n!)
Space complexity: O(n²)
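A straightforward Python version of the 4-queens backtracking (a sketch, using 1-indexed column numbers) recovers exactly the two solution vectors above:

```python
def queens(n):
    """Return all solution vectors (x1..xn), where xi is the column
    of the queen in row i (1-indexed). Backtracks on column and
    diagonal conflicts."""
    solutions, x = [], []

    def safe(col):
        for row, c in enumerate(x):
            if c == col or abs(c - col) == len(x) - row:
                return False             # same column or same diagonal
        return True

    def place(row):
        if row == n:
            solutions.append(tuple(x))
            return
        for col in range(1, n + 1):      # try columns left to right
            if safe(col):
                x.append(col)
                place(row + 1)
                x.pop()                  # backtrack

    place(0)
    return solutions

print(queens(4))   # [(2, 4, 1, 3), (3, 1, 4, 2)]
```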
b) Subset-Sum Problem
1. What is backtracking? Apply backtracking to solve the following instance of the subset-sum problem: S = {5, 10, 12, 13, 15, 18}, d = 30. 10M
2. Apply backtracking to solve the following instance of the subset-sum problem: S = {3, 5, 6, 7}, d = 15. Construct the state-space tree. 5M
3. Apply the backtracking method to solve the subset-sum problem for the instance S = {5, 10, 12, 13, 15, 18}. Give all possible solutions with the state-space tree construction.
As our second example, we consider the subset-sum problem: find a subset of a given set A =
{a1, . . . , an } of n positive integers whose sum is equal to a given positive integer d.
For example:
For A = {1, 2, 5, 6, 8} and d = 9, there are two solutions: {1, 2, 6} and {1, 8}.
• The state-space tree can be constructed as a binary tree whose root represents the starting point, with no decisions about the given elements made as yet. Its left and right children represent, respectively, the inclusion and exclusion of a1 in the set being sought.
• We record the value of s, the sum of these numbers, in the node. If s is equal to d, we
have a solution to the problem. We can either report this result and stop or, if all the
solutions need to be found, continue by backtracking to the node’s parent.
• If s is not equal to d, we can terminate the node as nonpromising if either of the following two inequalities holds:
s + ai+1 > d (the sum s is too large even after adding the smallest remaining number)
s + ai+1 + . . . + an < d (the sum s is too small even if all remaining numbers are added).
Example: Let S = {3, 5, 6, 7} and d = 15. The state-space tree to solve this subset-sum instance is as follows:
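A possible Python rendering of this backtracking (the names are my own): each node carries the partial sum s, and a node is cut off either when s plus the smallest remaining element already exceeds d, or when s plus all remaining elements still falls short of d:

```python
def subset_sum(a, d):
    """All subsets of the positive list a that sum to d, found by
    backtracking over include/exclude decisions with pruning."""
    a = sorted(a)
    solutions = []

    def explore(i, chosen, s, remaining):
        if s == d:
            solutions.append(tuple(chosen))
            return
        if i == len(a):
            return
        # prune: too large even with the smallest remaining element,
        # or too small even if every remaining element is included
        if s + a[i] > d or s + remaining < d:
            return
        chosen.append(a[i])                          # include a[i]
        explore(i + 1, chosen, s + a[i], remaining - a[i])
        chosen.pop()                                 # exclude a[i]
        explore(i + 1, chosen, s, remaining - a[i])

    explore(0, [], 0, sum(a))
    return solutions

print(subset_sum([3, 5, 6, 7], 15))      # [(3, 5, 7)]
print(subset_sum([1, 2, 5, 6, 8], 9))    # [(1, 2, 6), (1, 8)]
```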
Branch-and-Bound
• Recall that the central idea of backtracking is to cut off a branch of the problem’s state-space tree as soon as we can deduce that it cannot lead to a solution. This idea can be strengthened further if we deal with an optimization problem.
• An optimization problem seeks to minimize or maximize some objective function (a
tour length, the value of items selected, the cost of an assignment, and the like), usually
subject to some constraints.
• Note that in the standard terminology of optimization problems, a feasible solution is a point in the problem’s search space that satisfies all the problem’s constraints (e.g., a subset of items whose total weight does not exceed the knapsack’s capacity in the knapsack problem), whereas an optimal solution is a feasible solution with the best value of the objective function (e.g., the most valuable subset of items that fit the knapsack).
• Compared to backtracking, branch-and-bound requires two additional items:
o a way to provide, for every node of a state-space tree, a bound on the best value
of the objective function on any solution that can be obtained by adding further
components to the partially constructed solution represented by the node.
o the value of the best solution seen so far
If this information is available, we can compare a node’s bound value with the value of the best
solution seen so far. If the bound value is not better than the value of the best solution seen so
far, the node is nonpromising and can be terminated.
In general, we terminate a search path at the current node in a state-space tree of a branch-and-
bound algorithm for any one of the following three reasons:
o The value of the node’s bound is not better than the value of the best solution seen
so far.
o The node represents no feasible solutions because the constraints of the problem
are already violated.
o The subset of feasible solutions represented by the node consists of a single
point (and hence no further choices can be made)—in this case, we compare the
value of the objective function for this feasible solution with that of the best
solution seen so far and update the latter with the former if the new solution is
better.
Example:
Knapsack Problem: Let us now discuss how we can apply the branch-and-bound technique to solving
the knapsack problem.
1. Apply the branch-and-bound approach to solve the instance of the 0/1 knapsack problem with W = 10. 10M
2. Using the branch-and-bound technique, solve the below instance of the knapsack problem. 10M
• Given n items of known weights wi and values vi , i = 1, 2, . . . , n, and a knapsack of
capacity W, find the most valuable subset of the items that fit in the knapsack.
• It is convenient to order the items of a given instance in descending order by their value-to-weight ratios. Then the first item gives the best payoff per weight unit and the last one gives the worst, with ties resolved arbitrarily: v1/w1 ≥ v2/w2 ≥ . . . ≥ vn/wn.
• A simple way to compute the upper bound ub is to add to v, the total value of the items already selected, the product of the remaining capacity of the knapsack, W − w, and the best payoff per unit weight among the remaining items, vi+1/wi+1:
ub = v + (W − w)(vi+1/wi+1).
Example: Let us apply the branch-and-bound algorithm to the following instance of the
knapsack problem:
(Note: Reorder the items in descending order of their value-to-weight ratios)
The State-space tree of the best-first branch-and-bound algorithm for the instance of the
knapsack problem is as follows:
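The tree itself appears as a figure in the original notes. As an illustrative sketch (the weights 4, 7, 5, 3 and the values 40, 42, 25 are consistent with the worked results later in this module; the fourth item’s value of 12 is an assumption), a best-first branch-and-bound using the bound ub = v + (W − w)(vi+1/wi+1) can be written as:

```python
import heapq

def knapsack_bb(items, W):
    """Best-first branch-and-bound for the 0/1 knapsack problem.

    items: (value, weight) pairs already sorted by value/weight ratio.
    A node is (next item index i, total value v, total weight w)."""
    def ub(i, v, w):
        # upper bound: v plus remaining capacity at the best remaining ratio
        if i < len(items):
            return v + (W - w) * items[i][0] / items[i][1]
        return v

    best = 0
    heap = [(-ub(0, 0, 0), 0, 0, 0)]     # max-heap via negated bounds
    while heap:
        neg_bound, i, v, w = heapq.heappop(heap)
        if -neg_bound <= best:           # bound no better than best so far
            continue
        if i == len(items):
            best = max(best, v)
            continue
        value, weight = items[i]
        if w + weight <= W:              # branch: include item i
            best = max(best, v + value)
            heapq.heappush(heap, (-ub(i + 1, v + value, w + weight),
                                  i + 1, v + value, w + weight))
        # branch: exclude item i
        heapq.heappush(heap, (-ub(i + 1, v, w), i + 1, v, w))
    return best

# Items in descending value-to-weight order: 40/4, 42/7, 25/5, 12/3 (12 assumed)
print(knapsack_bb([(40, 4), (42, 7), (25, 5), (12, 3)], 10))   # 65
```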
Comparison of backtracking and branch-and-bound:
• Efficiency: backtracking can be inefficient for large problem spaces due to exhaustive search; branch-and-bound is more efficient for optimization problems if good bounds are available.
• Solution type: backtracking finds all solutions or a feasible solution; branch-and-bound finds the optimal solution.
• Complexity: backtracking has generally exponential time complexity; branch-and-bound can be more efficient but is still potentially exponential.
• Implementation: backtracking is relatively simple and straightforward; branch-and-bound requires more sophisticated bounding functions.
Greedy algorithm for the discrete knapsack problem
• Step 1 Compute the value-to-weight ratios ri = vi/wi, i = 1, . . . , n, for the items given.
• Step 2 Sort the items in nonincreasing order of the ratios computed in Step1.
(Ties can be broken arbitrarily.)
• Step 3 Repeat the following operation until no item is left in the sorted list:
➢ if the current item on the list fits into the knapsack, place it in the knapsack and
proceed to the next item; otherwise, just proceed to the next item.
EXAMPLE: Let us consider the instance of the knapsack problem with the knapsack
capacity 10 and the item information as follows:
Result: The greedy algorithm will select the first item of weight 4, skip the next item of weight 7, select the next item of weight 5, and skip the last item of weight 3. The solution obtained happens to be optimal for this instance, with a total value of 65.
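The three greedy steps can be sketched in Python as follows (the item values 40, 42, 25 match the worked result above; the last item’s value of 12 is an assumption):

```python
def greedy_discrete_knapsack(items, W):
    """Greedy for the discrete knapsack: items as (value, weight) pairs.
    Steps 1-2: sort by value-to-weight ratio, nonincreasing.
    Step 3: take each item that still fits; otherwise skip it."""
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total_value = total_weight = 0
    for value, weight in items:
        if total_weight + weight <= W:
            total_value += value
            total_weight += weight
    return total_value

# weights 4, 7, 5, 3 with capacity 10, as in the example above
print(greedy_discrete_knapsack([(40, 4), (42, 7), (25, 5), (12, 3)], 10))   # 65
```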
But this greedy algorithm does not always yield an optimal solution; if it did, we would have a polynomial-time algorithm for an NP-hard problem. In fact, the following example shows that no finite upper bound on the accuracy of its approximate solutions can be given either.
Example: Consider the following instance,
Greedy algorithm for the continuous knapsack problem
• Step 1 Compute the value-to-weight ratios vi/wi, i = 1, . . . , n, for the items given.
• Step 2 Sort the items in nonincreasing order of the ratios computed in Step 1. (Ties
can be broken arbitrarily.)
• Step 3 Repeat the following operation until the knapsack is filled to its full capacity
or no item is left in the sorted list: if the current item on the list fits into the knapsack
in its entirety, take it and proceed to the next item; otherwise, take its largest fraction
to fill the knapsack to its full capacity and stop.
Example: If we apply this algorithm to the same instance of the knapsack problem, the algorithm will take the first item of weight 4 with value 40 and then 6/7 of the next item on the sorted list to fill the knapsack to its full capacity, contributing 42 · (6/7) = 36. Therefore, the total value = 40 + 36 = 76.
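A sketch of the continuous version on the same assumed instance (the last item’s value of 12 is again an assumption; that item is never reached here):

```python
def greedy_continuous_knapsack(items, W):
    """Greedy for the continuous knapsack: items as (value, weight) pairs,
    fractions allowed. Sort by ratio, take whole items while they fit,
    then the largest fraction of the next item to fill the knapsack."""
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total_value, capacity = 0.0, W
    for value, weight in items:
        if weight <= capacity:
            total_value += value
            capacity -= weight
        else:
            total_value += value * capacity / weight   # largest fitting fraction
            break
    return total_value

# First item taken whole (value 40), then 6/7 of the second: 40 + 42*6/7 = 76
print(greedy_continuous_knapsack([(40, 4), (42, 7), (25, 5), (12, 3)], 10))   # 76.0
```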