Unit - V
Snapshots
Introduction
Basic Concepts
Deterministic and Nondeterministic Algorithms
The Classes NP-hard and NP-complete
NP-Hard Graph Problems: Clique Decision Problem (CDP), Node Cover Decision Problem, Chromatic Number Decision Problem (CNDP), Directed Hamiltonian Cycle (DHC), Traveling Salesperson Decision Problem (TSP)
NP-Hard Scheduling Problems: Scheduling Identical Processors, Flow Shop Scheduling, Job Shop Scheduling
NP-Hard Code Generation Problems: Code Generation with Common Subexpressions, Implementing Parallel Assignment Instructions
Introduction to Approximate Algorithms for NP-Hard Problems
5.0
Introduction
The earlier chapters discussed a variety of problems and algorithms. Some of the algorithms are straightforward, some are complicated and some are tricky, but virtually all of them have complexity in $O(n^3)$, where n is loosely defined as the input size. From the point of view of this chapter, all the algorithms studied so far have low time requirements. Since many of these problems are optimization problems that arise repeatedly in applications, the need for efficient algorithms is of real practical significance.
Objective
This chapter deals with a class of important problems that have an annoying property: it is not known whether they can be solved efficiently. No reasonably fast algorithms for these problems have been found, but no one has been able to prove that the problems require a lot of time. At the end of this lesson you will be able to sharpen your skills and knowledge of NP-hard graph problems, NP-hard code generation problems and NP-hard scheduling problems.
The assignment statement x := Choice(1, n) could result in x being assigned any one of the integers in the range [1, n]. There is no rule specifying how this choice is to be made. The Failure() and Success() signals are used to define a computation of the algorithm; they cannot be used to effect a return. A nondeterministic algorithm terminates unsuccessfully if and only if there exists no set of choices leading to a success signal.
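The behaviour of Choice, Success and Failure can be illustrated with a small deterministic simulation. The sketch below is only an illustration under stated assumptions: the helper names `simulate_choice` and `nd_search` are hypothetical, and the simulation simply tries every possible choice in turn, which a real nondeterministic machine would not need to do.

```python
from typing import Callable, Iterable, Optional, Sequence


def simulate_choice(candidates: Iterable[int],
                    succeeds: Callable[[int], bool]) -> Optional[int]:
    """Deterministically simulate x := Choice(1, n) followed by a test.

    Returns a choice that leads to a success signal, or None when no
    choice does (the nondeterministic algorithm terminates unsuccessfully).
    """
    for x in candidates:
        if succeeds(x):      # this branch would signal Success()
            return x
    return None              # every branch would signal Failure()


def nd_search(a: Sequence[int], target: int) -> Optional[int]:
    """Nondeterministic search: guess a 1-based index j with a[j] == target."""
    n = len(a)
    return simulate_choice(range(1, n + 1), lambda j: a[j - 1] == target)


if __name__ == "__main__":
    data = [7, 3, 9, 3]
    print(nd_search(data, 9))   # a successful choice exists
    print(nd_search(data, 5))   # no set of choices leads to success
```

The simulation takes time proportional to n, whereas the nondeterministic algorithm it imitates is charged only for the single successful branch.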
Note that if G has only one connected component, then $n \ge |V|$. Thus, if this decision problem cannot be solved by an algorithm of complexity p(n) for some polynomial p(), then it cannot be solved by an algorithm of complexity p(|V|).

Definition
The time required by a nondeterministic algorithm performing on any given input is the minimum number of steps needed to reach a successful completion if there exists a sequence of choices leading to such a completion. In case successful completion is not possible, the time required is O(1). A nondeterministic algorithm is of complexity O(f(n)) if for all inputs of size n, $n \ge n_0$, that result in a successful completion, the time required is at most $c f(n)$ for some constants c and $n_0$.

Example 4: Satisfiability
Let $x_1, x_2, \ldots$ denote boolean variables (their value is either true or false). Let $\bar{x}_i$ denote the negation of $x_i$. A literal is either a variable or its negation. A formula in the propositional calculus is an expression that can be constructed using literals and the operations and and or. Examples of such formulas are $(x_1 \land \bar{x}_2) \lor (x_3 \land \bar{x}_4)$ and $(x_3 \lor \bar{x}_4) \land (x_1 \lor \bar{x}_2)$. The symbol $\lor$ denotes or and $\land$ denotes and.

A formula is in conjunctive normal form (CNF) if and only if it is represented as $\bigwedge_{i=1}^{k} c_i$, where the $c_i$ are clauses each represented as $\bigvee_j l_{ij}$. The $l_{ij}$ are literals. It is in disjunctive normal form (DNF) if and only if it is represented as $\bigvee_{i=1}^{k} c_i$ and each clause $c_i$ is represented as $\bigwedge_j l_{ij}$. Thus $(x_1 \land \bar{x}_2) \lor (x_3 \land \bar{x}_4)$ is in DNF whereas $(x_3 \lor \bar{x}_4) \land (x_1 \lor \bar{x}_2)$ is in CNF.

The satisfiability problem is to determine whether a formula is true for some assignment of truth values to the variables. CNF-satisfiability is the satisfiability problem for CNF formulas.

Algorithm DKP(p, w, n, m, r, x)
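Returning to the satisfiability example, the following sketch shows the gap between checking one truth assignment (fast) and deciding satisfiability by trying all assignments (exponential). The encoding is an assumption made for illustration: a literal is an integer, +i for $x_i$ and -i for its negation, and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional

Clause = List[int]   # literal +i means x_i, literal -i means the negation of x_i


def satisfies(cnf: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Check one truth assignment; time is linear in the formula size."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def cnf_satisfiable(cnf: List[Clause]) -> Optional[Dict[int, bool]]:
    """Try all 2^n assignments; return a satisfying one or None."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(cnf, assignment):
            return assignment
    return None


if __name__ == "__main__":
    # (x3 or not x4) and (x1 or not x2), the CNF example from the text
    formula = [[3, -4], [1, -2]]
    print(cnf_satisfiable(formula))
    # x1 and (not x1) has no satisfying assignment
    print(cnf_satisfiable([[1], [-1]]))
```

A nondeterministic algorithm would guess the assignment with Choice and then run only the `satisfies` check, which is why CNF-satisfiability is in NP.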
The classes NP-hard and NP-complete
Definition
P is the set of all decision problems solvable by deterministic algorithms in polynomial time. NP is the set of all decision problems solvable by nondeterministic algorithms in polynomial time, that is, the class of decision problems for which there is a polynomially bounded nondeterministic algorithm. Since deterministic algorithms are just a special case of nondeterministic ones, one can conclude that $P \subseteq NP$. Perhaps the most famous unsolved problem in computer science is whether $P = NP$ or $P \ne NP$. Figure 8.1 displays the relationship between P and NP assuming that $P \ne NP$.
Figure 8.1: Commonly believed relationship between P and NP
Theorem
Satisfiability is in P if and only if P = NP.
Definition(s)
Let $L_1$ and $L_2$ be problems. Problem $L_1$ reduces to $L_2$ (also written $L_1 \propto L_2$) if and only if there is a way to solve $L_1$ by a deterministic polynomial time algorithm using a deterministic algorithm that solves $L_2$ in polynomial time. A problem L is NP-hard if and only if satisfiability reduces to L (satisfiability $\propto$ L). A problem L is NP-complete if and only if L is NP-hard and $L \in NP$.
Figure 8.2: Commonly believed relationship among P, NP, NP-complete and NP-hard problems
Only a decision problem can be NP-complete. However, an optimization problem may be NP-hard. Furthermore, if $L_1$ is a decision problem and $L_2$ an optimization problem, it is quite possible that $L_1 \propto L_2$. One can trivially show that the knapsack decision problem reduces to the knapsack optimization problem. For the clique problem
5.2.2
NP-Hard Graph Problems
The strategy for showing that a problem $L_2$ is NP-hard is:
1. Pick a problem $L_1$ already known to be NP-hard.
2. Show how to obtain an instance I' of $L_2$ from any instance I of $L_1$ such that from the solution of I', the solution to instance I of $L_1$ (Figure 8.2) can be determined.
3. Conclude from step (2) that $L_1 \propto L_2$.
4. Conclude from steps (1) and (3) and the transitivity of $\propto$ that $L_2$ is NP-hard.
An NP-hard decision problem $L_2$ can be shown to be NP-complete by exhibiting a polynomial time nondeterministic algorithm for $L_2$.
Clique Decision Problem (CDP)
Using the theorem below, the transitivity of $\propto$ and the knowledge that satisfiability $\propto$ CNF-satisfiability, it can be established that satisfiability $\propto$ CDP. Hence, CDP is NP-hard. Since CDP $\in$ NP, CDP is also NP-complete.
Theorem
CNF-satisfiability $\propto$ clique decision problem.
Example: Consider $F = (x_1 \lor x_2 \lor x_3) \land (\bar{x}_1 \lor \bar{x}_2 \lor \bar{x}_3)$. The construction of the theorem yields the graph of Figure 8.3. This graph contains six cliques of size two. Consider the clique with vertices $\{\langle x_1,1\rangle, \langle \bar{x}_2,2\rangle\}$. By setting $x_1$ = true and $\bar{x}_2$ = true (that is, $x_2$ = false), F is satisfied. The variable $x_3$ may be set either to true or false.
Figure 8.3: vertices $\langle x_1,1\rangle$, $\langle x_2,1\rangle$, $\langle x_3,1\rangle$ and $\langle \bar{x}_1,2\rangle$, $\langle \bar{x}_2,2\rangle$, $\langle \bar{x}_3,2\rangle$
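The graph in the example can be rebuilt with a few lines of code. The sketch below assumes the standard construction for this reduction, which the surviving text does not spell out: one vertex per (literal, clause) pair, and an edge between two vertices when they come from different clauses and are not complementary literals. Literals are encoded as integers (+i for $x_i$, -i for its negation), and the function names are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

Vertex = Tuple[int, int]   # (literal, clause index)


def cdp_graph(cnf: List[List[int]]) -> Tuple[List[Vertex], Set[FrozenSet[Vertex]]]:
    """Build the clique-problem graph for a CNF formula (assumed construction)."""
    vertices = [(lit, i) for i, clause in enumerate(cnf, start=1) for lit in clause]
    edges = {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if u[1] != v[1] and u[0] != -v[0]   # different clauses, not complementary
    }
    return vertices, edges


def cliques_of_size(vertices: List[Vertex],
                    edges: Set[FrozenSet[Vertex]],
                    k: int) -> List[Tuple[Vertex, ...]]:
    """Enumerate all k-vertex cliques by brute force."""
    return [
        group for group in combinations(vertices, k)
        if all(frozenset(pair) in edges for pair in combinations(group, 2))
    ]


if __name__ == "__main__":
    # F = (x1 or x2 or x3) and (not x1 or not x2 or not x3)
    f = [[1, 2, 3], [-1, -2, -3]]
    v, e = cdp_graph(f)
    for clique in cliques_of_size(v, e, k=len(f)):
        print(clique)
```

Each clique of size k (the number of clauses) picks one literal per clause with no contradictions, so setting those literals to true satisfies F.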
In the node cover decision problem, a graph G and an integer k are given, and it is required to determine whether G has a node cover of size at most k.
Theorem
The clique decision problem $\propto$ the node cover decision problem.
Example: Figure 8.5 shows a graph G and its complement G'. In this figure, G' has a node cover of {4,5}, since every edge of G' is incident either on node 4 or on node 5. Thus, G has a clique of size 5-2=3 consisting of the nodes 1, 2 and 3.

Chromatic Number Decision Problem (CNDP)
A coloring of a graph G=(V,E) is a function $f : V \to \{1, 2, \ldots, k\}$ defined for all $i \in V$. If $(u,v) \in E$, then $f(u) \ne f(v)$. The chromatic number decision problem is to determine whether G has a coloring for a given k.
Theorem
Satisfiability with at most three literals per clause $\propto$ chromatic number decision problem.
Proof
Let F be a CNF formula having at most three literals per clause and having r clauses $C_1, C_2, \ldots, C_r$. Let $x_i$, $1 \le i \le n$, be the n variables in F. Let us assume $n \ge 4$. If n < 4, then one can determine whether F is satisfiable by trying out all eight possible truth value
Figure 8.5: A graph G and its complement
The graph G=(V,E) used in the proof is defined by
$V = \{x_1, x_2, \ldots, x_n\} \cup \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n\} \cup \{y_1, y_2, \ldots, y_n\} \cup \{C_1, C_2, \ldots, C_r\}$
where $y_1, y_2, \ldots, y_n$ are new variables, and
$E = \{(x_i, \bar{x}_i),\ 1 \le i \le n\} \cup \{(y_i, y_j) \mid i \ne j\} \cup \{(y_i, x_j) \mid i \ne j\} \cup \{(y_i, \bar{x}_j) \mid i \ne j\} \cup \{(x_i, C_j) \mid x_i \notin C_j\} \cup \{(\bar{x}_i, C_j) \mid \bar{x}_i \notin C_j\}$

To see that G is n+1 colorable if and only if F is satisfiable, first observe that the $y_i$ form a complete subgraph on n vertices. Hence, each $y_i$ must be assigned a distinct color. Without loss of generality one can assume that in any coloring of G, $y_i$ is given the color i. Since $y_i$ is also connected to all the $x_j$ and $\bar{x}_j$ except $x_i$ and $\bar{x}_i$, the color i can be assigned only to $x_i$ and $\bar{x}_i$. However, $(x_i, \bar{x}_i) \in E$ and so a new color, n+1, is needed for one of these vertices. The vertex that is assigned the new color n+1 is called a false vertex. The other vertex is a true vertex. The only way to color G using n+1 colors is to assign color n+1 to one of $\{x_i, \bar{x}_i\}$ for each i, $1 \le i \le n$.

Under what conditions can the remaining vertices be colored using no new colors? Since $n \ge 4$ and each clause has at most three literals, each $C_i$ is adjacent to a pair of vertices $x_j$, $\bar{x}_j$ for at least one j. Consequently, no $C_i$ can be assigned the color n+1. Also, no $C_i$ can be assigned a color corresponding to an $x_j$ or $\bar{x}_j$ not in clause $C_i$. The only colors available to $C_i$ are therefore those of true vertices whose literals appear in $C_i$. Hence G is n+1 colorable if and only if there is a true vertex corresponding to each $C_i$.

Directed Hamiltonian Cycle (DHC)
Theorem
CNF-satisfiability $\propto$ directed Hamiltonian cycle.

Traveling Salesperson Decision Problem (TSP)
The traveling salesperson decision problem is to determine whether a complete directed graph G=(V,E) with edge costs c(u,v) has a tour of cost at most M.
Theorem
Directed Hamiltonian cycle (DHC) $\propto$ the traveling salesperson decision problem (TSP).
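One common way to carry out this last reduction can be sketched as follows. This is an assumption for illustration, since the proof itself is not included in the text above: every edge of the original directed graph gets cost 1, every missing edge gets cost 2, and the bound is M = n. The function names are hypothetical, vertices are numbered 0..n-1, and n is assumed to be at least 2.

```python
from itertools import permutations
from typing import Dict, Set, Tuple

Edge = Tuple[int, int]


def dhc_to_tsp(n: int, edges: Set[Edge]) -> Tuple[Dict[Edge, int], int]:
    """Map a DHC instance to a TSP decision instance (cost function, bound M)."""
    cost = {
        (u, v): (1 if (u, v) in edges else 2)   # existing edges are cheap
        for u in range(n) for v in range(n) if u != v
    }
    return cost, n


def has_tour_at_most(n: int, cost: Dict[Edge, int], bound: int) -> bool:
    """Brute-force TSP decision: is there a tour of cost at most bound?"""
    for rest in permutations(range(1, n)):
        tour = (0, *rest, 0)                    # fix vertex 0 as the start
        if sum(cost[(a, b)] for a, b in zip(tour, tour[1:])) <= bound:
            return True
    return False


if __name__ == "__main__":
    cycle = {(0, 1), (1, 2), (2, 3), (3, 0)}    # has a directed Hamiltonian cycle
    path = {(0, 1), (1, 2), (2, 3)}             # has none
    for name, g in (("cycle", cycle), ("path", path)):
        c, m = dhc_to_tsp(4, g)
        print(name, has_tour_at_most(4, c, m))
```

A tour of cost at most n must use n edges of cost 1, so it exists exactly when the original graph has a directed Hamiltonian cycle.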
5.2.3
NP-Hard Scheduling Problems
The proofs in this section use an NP-hard problem called partition. This problem requires us to decide whether a given multiset $A = \{a_1, a_2, \ldots, a_n\}$ of n positive integers has a partition P such that $\sum_{i \in P} a_i = \sum_{i \notin P} a_i$. In the sum of subsets problem, it has to be determined whether $A = \{a_1, a_2, \ldots, a_n\}$ has a subset S that sums to a given integer M.
Theorem(s)
$FT(S) = \max_{1 \le i \le m} \{T_i\}$
In a preemptive schedule each job need not be processed continuously to completion on one processor. Obtaining minimum weighted mean finish time and minimum finish time non-preemptive schedules is NP-hard.
Theorem
Partition $\propto$ minimum finish time non-preemptive schedule.
Example: Consider the following input to the partition problem: $a_1 = 2$, $a_2 = 5$, $a_3 = 6$, $a_4 = 7$ and $a_5 = 10$. The corresponding minimum finish time non-preemptive schedule problem has the
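The link between partition and two-processor scheduling in this example can be checked by brute force. The sketch below is an illustration under stated assumptions: the jobs are taken to have processing times equal to the $a_i$, two identical processors are used, and the function names are hypothetical.

```python
from itertools import product
from typing import List, Tuple


def min_finish_time_two_processors(times: List[int]) -> Tuple[int, List[int]]:
    """Brute-force the minimum finish time of a non-preemptive schedule
    on two identical processors; returns (finish time, processor per job)."""
    best = (sum(times), [0] * len(times))    # everything on processor 0
    for assign in product([0, 1], repeat=len(times)):
        loads = [0, 0]
        for t, p in zip(times, assign):
            loads[p] += t
        finish = max(loads)                  # FT(S) = max over processors of T_i
        if finish < best[0]:
            best = (finish, list(assign))
    return best


def has_partition(a: List[int]) -> bool:
    """A has a partition exactly when the best schedule finishes at sum(A)/2."""
    total = sum(a)
    finish, _ = min_finish_time_two_processors(a)
    return total % 2 == 0 and finish == total // 2


if __name__ == "__main__":
    a = [2, 5, 6, 7, 10]
    finish, assign = min_finish_time_two_processors(a)
    print(finish, assign)
    print(has_partition(a))
```

For the input above the total is 30, and the split {5, 10} against {2, 6, 7} gives both processors a load of 15, so the schedule answers the partition question as well.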
$t_{1,n+2} = 0$; $t_{2,n+2} = T$; $t_{3,n+2} = T/2$, where $T = \sum_{i=1}^{n} a_i$
The preceding flow shop instance has a preemptive schedule with finish time at most 2T if and only if A has a partition.
1. If A has a partition u, then there is a non-preemptive schedule with finish time 2T, as shown in Figure 8.7.
2. If A has no partition, then all preemptive schedules for FS must have a finish time greater than 2T. This can be shown by contradiction. Assume that there is a preemptive schedule for FS with finish time at most 2T. Observations regarding this schedule are the following.
a) Task $t_{1,n+1}$ must finish by time T, as $t_{2,n+1} = T$ and this task cannot start until $t_{1,n+1}$ finishes.
b) Task $t_{3,n+2}$ cannot start before T units of time have elapsed, as $t_{2,n+2} = T$.
Observation (a) implies that only T/2 of the first T time units are free on processor one. Let V be the set of indices of tasks completed on processor 1 by time T (excluding task $t_{1,n+1}$). Then
$\sum_{i \in V} t_{1,i} < T/2$
as A has no partition. Hence
$\sum_{i \notin V,\ 1 \le i \le n} t_{3,i} > T/2$
The processing of jobs not included in V cannot commence on processor three until after time T, since their processor 1 processing is not completed until after T. This together with observation (b) implies that the total amount of processing left for processor three at time T is
$t_{3,n+2} + \sum_{i \notin V,\ 1 \le i \le n} t_{3,i} > T$
Processor three therefore cannot finish by time 2T, which contradicts the assumption.
Figure 8.7: A possible schedule (surviving label: $\{t_{1,i} \mid i \in u\}$)
Job Shop Scheduling
A job shop, like a flow shop, has m different processors. The n jobs to be scheduled require the completion of several tasks. The time of the jth task for job $J_i$ is $t_{k,i,j}$. Task j is to be performed on processor $P_k$. The tasks for any job $J_i$ are to be carried out in the order 1, 2, 3, and so on. Task j cannot begin until task j-1 (if j>1) has been completed. Note that it is quite possible for a job to have many tasks that are to be performed on the same processor. In a non-preemptive schedule, a task once begun is processed without interruption until it is completed. Obtaining either a minimum finish time preemptive schedule or a minimum finish time non-preemptive schedule is NP-hard. The proof for the non-preemptive case is very simple (use partition).
Theorem
Partition $\propto$ minimum finish time preemptive job shop schedule (m>1).
5.2.4
NP-Hard Code Generation Problems
The function of a compiler is to translate programs written in some source language into an equivalent assembly language or machine language program. Thus, the C++ compiler on the SPARC 10 translates C++ programs into the machine language of this machine. The translation clearly depends on the particular assembly language being used. The model machine A has only one register, the accumulator. If $\odot$ represents a binary operator such as +, -, * and /, then the left operand of $\odot$ must be in the accumulator. The relevant assembly language instructions are:
LOAD X: load accumulator with contents of memory location X.
STORE X: store contents of accumulator into memory location X.
OP X: OP may be ADD, SUB, MPY or DIV. The instruction OP X computes the operator OP using the contents of the accumulator as the left operand and that of memory location X as the right operand.
Figure 8.8: Two possible codes for (a+b)/(c+d)
Definition(s)
1. A translation of an expression E into the machine or assembly language of a given machine is optimal if and only if it has a minimum number of instructions.
2. A binary operator $\odot$ is commutative in the domain D if $a \odot b = b \odot a$ for all a and b in D.
Machine A can be generalized to another machine B. Machine B has $N \ge 1$ registers in which arithmetic can be performed. There are four types of machine instructions for B:
1. LOAD M, R
2. STORE M, R
3. OP R1, M, R2
4. OP R1, R2, R3
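To make the one-register machine A concrete, the sketch below simulates its LOAD, STORE and OP instructions and runs two instruction sequences for (a+b)/(c+d). Both sequences are illustrative assumptions and are not claimed to be the ones printed in Figure 8.8; the function name, the temporary locations T1 and T2 and the tuple encoding of instructions are likewise hypothetical.

```python
from typing import Dict, List, Tuple

Instruction = Tuple[str, str]   # (opcode, memory location)


def run_machine_a(program: List[Instruction], memory: Dict[str, float]) -> float:
    """Execute a program on a one-register (accumulator) machine."""
    mem = dict(memory)           # leave the caller's memory untouched
    acc = 0.0                    # the single register
    for op, x in program:
        if op == "LOAD":
            acc = mem[x]
        elif op == "STORE":
            mem[x] = acc
        elif op == "ADD":
            acc = acc + mem[x]   # accumulator is always the left operand
        elif op == "SUB":
            acc = acc - mem[x]
        elif op == "MPY":
            acc = acc * mem[x]
        elif op == "DIV":
            acc = acc / mem[x]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return acc


# Evaluate a+b first, then c+d: needs two temporaries, 8 instructions.
CODE_LONG = [("LOAD", "a"), ("ADD", "b"), ("STORE", "T1"),
             ("LOAD", "c"), ("ADD", "d"), ("STORE", "T2"),
             ("LOAD", "T1"), ("DIV", "T2")]

# Evaluate the right operand c+d first: one temporary, 6 instructions.
CODE_SHORT = [("LOAD", "c"), ("ADD", "d"), ("STORE", "T1"),
              ("LOAD", "a"), ("ADD", "b"), ("DIV", "T1")]

if __name__ == "__main__":
    values = {"a": 1.0, "b": 5.0, "c": 2.0, "d": 1.0}
    print(run_machine_a(CODE_LONG, values), len(CODE_LONG))
    print(run_machine_a(CODE_SHORT, values), len(CODE_SHORT))
```

Both programs compute the same value, but the second needs fewer instructions because the left operand of the division is still in the accumulator when DIV is issued.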
Figure 8.9: Optimal codes for N=1 and N=2

Code Generation with Common Subexpressions
When arithmetic expressions have common subexpressions, they can be represented by a directed acyclic graph (dag). Each internal node in the dag represents an operator. Assuming the expression contains only binary operators, each internal node P has out-degree two. The two nodes adjacent from P are called the left and right children of P respectively. The children of P are the roots of the dags for the left and right operands of P. Node P is the parent of its children.
Definition
A leaf is a node with out-degree zero. A level-one node is a node both of whose children are leaves. A shared node is a node with more than one parent. A leaf dag is a dag in which all shared nodes are leaves. A level-one dag is a dag in which all shared nodes are level-one nodes.
Theorem
Feedback node set (FNS) $\propto$ the optimal code generation for level-one dags on a one-register machine.
Proof
Let G, k be an instance of FNS. Let n be the number of vertices in G. A dag A is constructed with the property that the optimal code for the expression corresponding to A has at most n+k LOADs if and only if G has a feedback node set of size at most k.
5.2.5
Introduction to Approximate Algorithms for NP-Hard Problems
The best-known algorithms for NP-hard problems have a worst-case complexity that is exponential in the number of inputs. An $O(2^{n/2})$ algorithm for the knapsack problem was developed earlier. This algorithm can also be used for the partition, sum of subsets and exact cover problems. A faster exponential algorithm for an NP-hard problem increases the maximum problem size that can be solved. However, for large problem instances, even an $O(n^4)$ algorithm requires too much computational effort.

The use of heuristics in an existing algorithm may enable it to quickly solve a large instance of a problem, provided the heuristic works on that instance. A heuristic does not work equally effectively on all problem instances. If one has to produce an algorithm of low polynomial complexity to solve an NP-hard optimization problem, then it is necessary to relax the meaning of solve. One removes the requirement that the algorithm that solves the optimization problem P must always generate an optimal solution. This requirement is replaced by the requirement that the algorithm for P must always generate a feasible solution with value close to the value of an optimal solution. A feasible solution with value close to the value of an optimal solution is called an approximate solution. An approximation algorithm for P is an algorithm that generates approximate solutions for P.

In the case of NP-hard problems, approximate solutions have added importance, as exact solutions may not be obtainable in a feasible amount of computing time. An approximate solution may be all one can get using a reasonable amount of computing time. Alternatively, one can look for an algorithm for P that almost always generates optimal solutions. Algorithms with this property are called probabilistically good algorithms.

Suppose P is a problem such as knapsack or the traveling salesperson problem, I is an instance of problem P and F*(I) is the value of an optimal solution to I.
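The idea of trading optimality for speed can be illustrated on the 0/1 knapsack problem. The sketch below is not taken from the text: it uses a well-known greedy heuristic (sort by profit per unit weight, then compare with the best single item), which is commonly cited as returning at least half of the optimal value, and compares it against F*(I) computed by exhaustive search. Item weights are assumed to be positive, and the function names are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

Item = Tuple[int, int]   # (profit, weight), weight assumed positive


def optimal_value(items: List[Item], capacity: int) -> int:
    """F*(I): exhaustive search, exponential in the number of items."""
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for _, w in subset) <= capacity:
                best = max(best, sum(p for p, _ in subset))
    return best


def greedy_value(items: List[Item], capacity: int) -> int:
    """Approximate value: greedy by profit density, or the best single item."""
    total, room = 0, capacity
    for p, w in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if w <= room:                       # take the item if it still fits
            total, room = total + p, room - w
    best_single = max((p for p, w in items if w <= capacity), default=0)
    return max(total, best_single)


if __name__ == "__main__":
    instance = [(60, 10), (100, 20), (120, 30)]
    exact = optimal_value(instance, 50)
    approx = greedy_value(instance, 50)
    print(exact, approx, approx / exact)
```

The greedy routine runs in O(n log n) time, while the exhaustive search examines all subsets; the ratio printed at the end shows how close the feasible solution is to F*(I) on this instance.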
NP-Complete and NP-Hard Problems
A problem that is NP-complete has the property that it can be solved in polynomial time if and only if all other NP-complete problems can be solved in polynomial time. If an NP-hard problem can be solved in polynomial time, then all NP-complete problems can be solved in polynomial time. All NP-complete problems are NP-hard, but not every NP-hard problem is NP-complete.

Deterministic Algorithm
Algorithms that use the property that the result of every operation is uniquely defined are termed deterministic algorithms.

Nondeterministic Algorithm
If an algorithm contains operations whose outcomes are not uniquely defined but are limited to a specified set of possibilities, the machine executing such operations is allowed to choose any one of these outcomes, subject to a termination condition. This leads to the concept of a nondeterministic algorithm.
5.4
Intext Questions
1. Discuss briefly nondeterministic algorithms.
2. Explain in detail NP-hard graph problems.
3. Explain in detail the different NP-hard scheduling problems.
4. Explain the problems of NP-hard code generation.
5. What are approximate algorithms used for?
5.5
Summary
Search, which necessitates the examination of every vertex in the object being searched, is called a traversal. Algorithms, which use the property that the result of every operation is uniquely defined, are termed deterministic algorithms. A nondeterministic algorithm terminates unsuccessfully if and only if there exists no set of choices leading to a success signal. Any problem that involves the identification of an optimal value of a given cost function is known as an optimization problem.
Terminal Exercises
1. Define the classes NP-hard and NP-complete.
2. Define the clique decision problem.
3. Define the node cover decision problem.
4. Define the chromatic number decision problem.
5. Define the directed Hamiltonian cycle problem.
5.7
Supplementary Materials
1. Ellis Horowitz, Sartaj Sahni, Fundamentals of Computer Algorithms, Galgotia Publications, 1997.
2. Aho, Hopcroft, Ullman, Data Structures and Algorithms, Addison Wesley, 1987.
3. Jean-Paul Tremblay, Paul G. Sorenson, An Introduction to Data Structures with Applications, McGraw-Hill, 1984.
5.8
Assignments
Collect information on NP-hard and NP-complete problems.
Mark Allen Weiss, Data Structures and Algorithm Analysis in C++, Addison Wesley,
5.11 Keywords
NP-Complete and NP-Hard Problem, Deterministic Algorithm, Nondeterministic Algorithm, NP-Hard Graph, NP-Hard Scheduling, NP-Hard Code