Proof of NP Problems
http://www.cs.ecu.edu/karl/6420/spr16/Notes/NPcomplete/3sat.html
3-SAT
3-SAT is a restriction of SAT where each clause is required to have exactly 3 literals.
(¬x ∨ y ∨ ¬w) ∧ (¬y ∨ z ∨ w) ∧ (x ∨ ¬z ∨ y)
Because 3-SAT is a restriction of SAT, it is not obvious that 3-SAT is difficult to solve. Maybe the
restriction makes it easier.
But, in reality, 3-SAT is just as difficult as SAT; the restriction to 3 literals per clause makes no
difference.
Part (a). We must show that 3-SAT is in NP. But we already showed that SAT is in NP. Surely,
any nondeterministic algorithm for SAT also works for 3-SAT; it does not care about the
restriction to 3 literals per clause.
Part (b). We need to show, for every problem X in NP, X ≤p 3-SAT. Since SAT is NP-complete
and ≤p is transitive, we can accomplish that by showing that SAT ≤p 3-SAT.
So our goal is to find a polynomial-time reduction from SAT to 3-SAT. The reduction is a
polynomial-time computable function f that takes a clausal formula φ and yields a clausal
formula φ′ with 3 literals per clause.
The reduction function works on one clause of φ at a time. Here is what it does on a clause C.
3. Suppose that clause C is (ℓ1 ∨ ℓ2 ∨ … ∨ ℓn), where n > 3. Create new variables λ1, λ2,
…, λn−3 and replace C by clauses
(ℓ1 ∨ ℓ2 ∨ λ1) ∧
(¬λ1 ∨ ℓ3 ∨ λ2) ∧
(¬λ2 ∨ ℓ4 ∨ λ3) ∧
(¬λ3 ∨ ℓ5 ∨ λ4) ∧
… ∧
(¬λn−3 ∨ ℓn−1 ∨ ℓn)
For example, clause (x ∨ ¬y ∨ z ∨ u ∨ ¬v) is replaced by
(x ∨ ¬y ∨ λ1) ∧
(¬λ1 ∨ z ∨ λ2) ∧
(¬λ2 ∨ u ∨ ¬v)
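6. We need to show that doing this replacement does not affect whether the formula is
satisfiable.
1. Suppose that φ is satisfiable, and let A be an assignment that makes φ true.
Since assignment A makes φ true, it must make at least one of the literals in
clause C true. Extend A by setting each λ variable that appears un-negated in a
clause before the one containing that true literal to true, and the remaining λ
variables to false. For example, if n = 7 and ℓ4 is true, then λ1 and λ2 are true,
λ3 and λ4 are false, and the replacement clauses become
(ℓ1 ∨ ℓ2 ∨ true) ∧
(false ∨ ℓ3 ∨ true) ∧
(false ∨ ℓ4 ∨ false) ∧
(true ∨ ℓ5 ∨ false) ∧
(true ∨ ℓ6 ∨ ℓ7)
Every clause is true, so the extended assignment makes φ′ true.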
2. Suppose that φ′ is satisfiable. Choose values for the variables to make φ′ true.
We need to show that, no matter how the values of the λ variables are chosen,
each original clause C must have one true literal. That is, at least one of ℓ1, …,
ℓn is true.
So suppose that all of ℓ1, …, ℓn are false. From the first clause,
(ℓ1 ∨ ℓ2 ∨ λ1)
λ1 must be true. Then, from the second clause,
(¬λ1 ∨ ℓ3 ∨ λ2)
λ2 must also be true. Working through the clauses that correspond to original
clause C, we see that λ1, …, λn−3 must be true. But then the last clause,
(¬λn−3 ∨ ℓn−1 ∨ ℓn)
is false, which contradicts the choice of values that make φ′ true. So at least one
of ℓ1, …, ℓn must be true.
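The replacement of a long clause by a chain of 3-literal clauses can be sketched in Python. This is only an illustration under assumed conventions that are not in the notes: literals are nonzero integers (v for a variable, -v for its negation), and the hypothetical helper split_clause draws the new λ variables from an iterator of unused variable numbers. Clauses with 3 or fewer literals are returned unchanged here.

```python
from itertools import count

def split_clause(clause, fresh):
    """Split a clause with more than 3 literals into a chain of 3-literal clauses.

    clause: list of nonzero ints (v = variable v, -v = its negation).
    fresh:  iterator yielding unused variable numbers for the new lambda variables.
    """
    if len(clause) <= 3:
        return [list(clause)]  # short clauses are not covered by this sketch
    lam = next(fresh)
    out = [[clause[0], clause[1], lam]]          # (l1 v l2 v lambda1)
    for lit in clause[2:-2]:
        nxt = next(fresh)
        out.append([-lam, lit, nxt])             # (not lambda_i v l v lambda_{i+1})
        lam = nxt
    out.append([-lam, clause[-2], clause[-1]])   # last clause ends with two literals
    return out

# (x v not y v z v u v not v) with x..v numbered 1..5 and new variables from 6
chain = split_clause([1, -2, 3, 4, -5], count(6))
```

A clause with n literals yields n−2 clauses and uses n−3 new variables, so the output size is linear in the input size.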
2-SAT
What if we restrict SAT even further, insisting that every clause have exactly 2 literals? Call that
problem 2-SAT.
Then the problem becomes easy. There is a polynomial time algorithm to solve 2-SAT. The key
to the algorithm is that, if you are looking at clause
(ℓ1 ∨ ℓ2)
and ℓ1 is false, then ℓ2 must be true. Choosing a value for a variable leads to other variable
values being forced without much effort.
There is more to the algorithm than that, but that is the heart of it.
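One way to turn the forcing idea into a complete algorithm is to guess a value for an unassigned variable, follow all forced values, and fall back to the opposite guess if a contradiction appears. The notes do not say which algorithm they have in mind, so the sketch below is an assumption: it uses the same integer encoding of literals as a convenience, and solve_2sat is a hypothetical name.

```python
def solve_2sat(clauses, num_vars):
    """Return a satisfying assignment {var: bool} or None.

    clauses: list of pairs of nonzero ints (v = variable v true, -v = variable v false).
    """
    assignment = {}

    def propagate(lit, trial):
        # Make lit true and follow every value this forces; False means conflict.
        stack = [lit]
        while stack:
            cur = stack.pop()
            var, val = abs(cur), cur > 0
            known = trial.get(var, assignment.get(var))
            if known is not None:
                if known != val:
                    return False
                continue
            trial[var] = val
            for a, b in clauses:
                # If one literal of a clause is now false, the other must be true.
                if a == -cur:
                    stack.append(b)
                if b == -cur:
                    stack.append(a)
        return True

    for var in range(1, num_vars + 1):
        if var in assignment:
            continue
        for guess in (var, -var):
            trial = {}
            if propagate(guess, trial):
                assignment.update(trial)  # a consistent guess never needs undoing
                break
        else:
            return None  # both values of var lead to a contradiction
    return assignment
```

Each variable is guessed at most twice and each guess scans the clause list once per forced variable, so the running time is polynomial. A successful guess can be kept because every clause it touches is already satisfied, leaving only untouched clauses for later guesses.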
Proof that 4 SAT is NP complete
Prerequisite: NP-Completeness
A clique is a subgraph of a graph such that all the vertices in this subgraph
are connected with each other, that is, the subgraph is a complete graph. The
Maximum Clique Problem is to find the maximum sized clique of a given graph
G, that is, a complete graph which is a subgraph of G and contains the
maximum number of vertices. This is an optimization problem.
Correspondingly, the Clique Decision Problem is to find if a clique of size k
exists in the given graph or not.
https://www.geeksforgeeks.org/proof-that-clique-decision-problem-is-np-complete/
To prove that a problem is NP-Complete, we have to show that it belongs to
both the NP and NP-Hard classes.
The Clique Decision Problem belongs to NP – If a problem belongs to
the NP class, then it should have polynomial-time verifiability, that is, given a
certificate, we should be able to verify in polynomial time if it is a solution to
the problem.
Proof:
1. Certificate – Let the certificate be a set S of vertices of G that is claimed
to form a clique.
2. Verification – We have to check that S is a clique of size k in the
graph. Verifying that the number of nodes in S equals k takes O(k) time.
Verifying that each vertex in S is adjacent to the other (k-1) vertices in S
means checking every pair of vertices in S. (In a complete graph, each vertex
is connected to every other vertex through an edge. Hence the total number of
edges in a complete graph = kC2 = k*(k-1)/2.) Therefore, to check if the graph
formed by the k nodes in S is complete or not, it takes O(k²) = O(n²) time
(since k<=n, where n is the number of vertices in G).
Therefore, the Clique Decision Problem has polynomial time verifiability and
hence belongs to the NP Class.
The Clique Decision Problem belongs to NP-Hard – A problem L belongs
to NP-Hard if every NP problem is reducible to L in polynomial time. Now, let
the Clique Decision Problem be C. To prove that C is NP-Hard, we take an
already known NP-Hard problem, say S, and show that every instance of S can
be transformed into an instance of C with the same answer. If this reduction
can be done in polynomial time, then C is also an NP-Hard problem. The Boolean
Satisfiability Problem (S) is an NP-Complete problem, as proved by Cook’s
theorem. Therefore, every problem in NP can be reduced to S in polynomial
time. Thus, if S is reducible to C in polynomial time, every NP problem can be
reduced to C in polynomial time, thereby proving C to be NP-Hard.
Proof that the Boolean Satisfiability problem reduces to the Clique Decision
Problem
Let the boolean expression be – F = (x1 v x2) ^ (x1’ v x2’) ^ (x1 v x3) where
x1, x2, x3 are the variables, ‘^’ denotes logical ‘and’, ‘v’ denotes logical ‘or’
and x’ denotes the complement of x. Let the expression within each pair of
parentheses be a clause. Hence we have three clauses – C1, C2 and C3.
Consider the vertices as – <x1, 1>; <x2, 1>; <x1’, 2>; <x2’, 2>; <x1, 3>; <x3,
3> where the second term in each vertex denotes the clause number it
belongs to. We connect every pair of these vertices, except that –
1. No two vertices belonging to the same clause are connected.
2. No variable is connected to its complement.
Thus, the graph G (V, E) is constructed such that – V = { <a, i> | a belongs
to Ci } and E = { ( <a, i>, <b, j> ) | i is not equal to j ; b is not equal to a’ }
Consider the subgraph of G with the vertices <x2, 1>; <x1’, 2>; <x3, 3>. It
forms a clique of size 3 (depicted by the dotted line in the above figure).
Corresponding to this, for the assignment – <x1, x2, x3> = <0, 1, 1> F
evaluates to true. In general, if we have k clauses in our satisfiability
expression, the expression is satisfiable if and only if G has a clique of size k.
A clique of size k picks one literal from each clause and never picks a variable
together with its complement, so setting the picked literals to true satisfies
every clause. Conversely, picking one true literal from each clause under a
satisfying assignment gives a clique of size k. Since G can be built in
polynomial time, the satisfiability problem is reduced to the clique
decision problem.
Therefore, the Clique Decision Problem is NP-Hard.
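The graph construction used in this reduction can be sketched in Python. The encoding is an assumption made for illustration: literals are nonzero integers (-v is the complement of v), a vertex is a pair (literal, clause number), and the function names are hypothetical.

```python
from itertools import combinations

def sat_to_clique(clauses):
    """Build (vertices, edges, k) for the clique instance of a clausal formula."""
    vertices = [(lit, i) for i, clause in enumerate(clauses, start=1) for lit in clause]
    edges = {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if u[1] != v[1]        # rule 1: different clauses only
        and u[0] != -v[0]      # rule 2: never a variable with its complement
    }
    return vertices, edges, len(clauses)

def is_clique(subset, edges):
    """Check that every pair of vertices in subset is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(subset, 2))

# F = (x1 v x2) ^ (x1' v x2') ^ (x1 v x3)
vertices, edges, k = sat_to_clique([[1, 2], [-1, -2], [1, 3]])
```

With this encoding, the vertices <x2, 1>, <x1’, 2>, <x3, 3> from the example are (2, 1), (-1, 2) and (3, 3), and is_clique confirms they are pairwise connected.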
Conclusion
The Clique Decision Problem is in NP and is NP-Hard. Therefore, the Clique
Decision Problem is NP-Complete.
or
Explanation:
An instance of the problem is an input specified to the problem. An instance
of the Clique problem is a graph G (V, E) and a positive integer K, and the
problem is to check whether a clique of size K exists in G. Since an
NP-Complete problem, by definition, is a problem which is both in NP and
NP-hard, the proof for the statement that a problem is NP-Complete consists of
two parts:
1. Clique Problem is in NP
If any problem is in NP, then, given a ‘certificate’, which is a solution to
the problem, and an instance of the problem (a graph G and a positive
integer K, in this case), we will be able to verify (check whether the
solution given is correct or not) the certificate in polynomial time.
The certificate is a subset V’ of the vertices, which comprises the
vertices belonging to the clique. We can validate this solution by checking
that V’ contains K vertices and that each pair of vertices belonging to the
solution are adjacent, by simply verifying that they share an edge with each
other. This can be done in polynomial time, since there are only K*(K-1)/2
pairs to check, using the following strategy for graph G(V, E):
flag=true
If V’ does not contain exactly K vertices:
    set flag to false
For every pair {u, v} in the subset V’:
    Check that these two vertices {u, v} share an edge
    If there is no edge, set flag to false and break
If flag is true:
    Solution is correct
Else:
    Solution is incorrect
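The verification strategy above can be written as a short Python function. This is a sketch under assumptions: the graph is given as a list of vertex pairs, and verify_clique is a hypothetical name.

```python
from itertools import combinations

def verify_clique(edges, certificate, k):
    """Return True if certificate is a set of k vertices that are pairwise adjacent."""
    edge_set = {frozenset(e) for e in edges}   # undirected edges, O(1) lookup
    nodes = set(certificate)
    if len(nodes) != k:
        return False
    # Check every pair {u, v} in the certificate for a shared edge.
    return all(frozenset((u, v)) in edge_set for u, v in combinations(nodes, 2))

graph = [(1, 2), (2, 3), (1, 3), (3, 4)]
ok = verify_clique(graph, [1, 2, 3], 3)
```

The loop examines K*(K-1)/2 pairs, which matches the polynomial bound claimed for the check.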
2. Clique Problem is NP-Hard
To prove that the clique problem is NP-Hard, we take the help of a
problem that is already NP-Hard and show that this problem can be
reduced to the Clique problem.
For this, we consider the Independent Set problem, which is
NP-Complete (and hence NP-Hard). Every instance of the independent set
problem, consisting of the graph G (V, E) and an integer K, can be
converted to the required graph G’ (V’, E’) and K’ of the Clique problem.
We will construct the graph G’ by the following modifications:
V’ = V, that is, all the vertices of graph G are a part of the graph G’
E’ = complement of the edges E, that is, the edges not present in the
original graph G
K’ = K
The graph G’ is the complementary graph of G. Computing the
complementary graph G’ requires examining every pair of vertices.
Time complexity: O (V²)
We will now prove that the Independent Set instance has a solution
exactly when the constructed Clique instance does. The reduction can be
proved by the following two propositions:
Let us assume that the graph G contains an independent set of size K.
No two of these K vertices share an edge in G. Since these edges are
not contained in G, they must be present in G’. As a result, these K
vertices are all adjacent to each other in G’ and hence form a clique of
size K’ = K.
Conversely, assume that the complementary graph G’ has a clique of
size K’. Every pair of these vertices shares an edge in G’. When we
complement the graph to obtain G, none of these edges is present, so
these K’ = K vertices are not adjacent to each other. Therefore, the
graph G has an independent set of size K.
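The complement-graph construction behind this reduction is short enough to sketch in Python. The representation (a vertex list plus a list of vertex pairs) and the function name are assumptions made for illustration.

```python
from itertools import combinations

def independent_set_to_clique(vertices, edges, k):
    """Map an Independent Set instance (G, K) to a Clique instance (G', K')."""
    present = {frozenset(e) for e in edges}
    # E' holds exactly the vertex pairs that are NOT edges of G.
    complement = [
        (u, v)
        for u, v in combinations(vertices, 2)
        if frozenset((u, v)) not in present
    ]
    return list(vertices), complement, k

# Path 1 - 2 - 3: {1, 3} is an independent set of size 2 in G
v2, e2, k2 = independent_set_to_clique([1, 2, 3], [(1, 2), (2, 3)], 2)
```

In the example, the only edge of G’ is (1, 3), so the independent set {1, 3} of G is a clique of the same size in G’.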
A problem A is called NPC (NP-Complete) if it satisfies these two conditions:
1) A belongs to NP
2) B <=p A for some problem B already known to be NP-Complete (B is reducible to A in polynomial time), which shows that A is NP-Hard
In other words, if a problem is in NP as well as NP-Hard, then it is NPC.
Problem – Given a graph G(V, E) with edge costs, the problem is to determine
if the graph has a tour (a cycle that visits every vertex exactly once) of cost
at most K.
Explanation –
In order to prove the Travelling Salesman Problem is NP-Hard, we will have to
reduce a known NP-Hard problem to this problem. We will carry out a
reduction from the Hamiltonian Cycle problem to the Travelling Salesman
problem.
Every instance of the Hamiltonian Cycle problem, consisting of a graph
G = (V, E) as the input, can be converted to a Travelling Salesman problem
instance consisting of a graph G’ = (V’, E’) and the maximum cost, K. We will
construct the graph G’ in the following way:
Let V’ = V. For all the edges e belonging to E, add e to E’ with cost c(e)=1.
Connect the remaining pairs of vertices with edges e’ that are not present in
the original graph G, each with a cost c(e’)=2.
And, set K=N, where N is the number of vertices in V. The new graph G’ can be
constructed in polynomial time by just converting G to a complete graph G’
and adding the corresponding costs.
This reduction can be proved by the following two claims:
Let us assume that the graph G contains a Hamiltonian Cycle, traversing
all the vertices V of the graph. This cycle is a tour in G’ with cost N, since
it uses exactly N edges, all of them edges of the original graph having cost
c(e)=1. And, since it is a cycle, it returns to the starting vertex.
Conversely, assume that the graph G’ contains a tour with cost at most K=N.
The tour traverses all the vertices of the graph and returns to the starting
vertex, so it uses exactly N edges. Every edge costs at least 1, so a total
cost of at most N means that every edge of the tour has cost 1, that is,
belongs to E. Hence the tour forms a Hamiltonian cycle in the graph G.
Thus we can say that the graph G’ contains a tour of cost at most K if and
only if the graph G contains a Hamiltonian Cycle. Therefore, any instance of
the Hamiltonian Cycle problem can be reduced to an instance of the
Travelling Salesman problem.
Thus, the TSP is NP-Hard.
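The construction of G’ with costs 1 and 2 can be sketched in Python. The data layout and both function names are assumptions for illustration, and the brute-force tour search is exponential, so it is only meant for checking the reduction on very small graphs with at least 3 vertices.

```python
from itertools import combinations, permutations

def hamiltonian_cycle_to_tsp(vertices, edges):
    """Return (cost, K): cost 1 for edges of G, cost 2 for all other pairs, K = N."""
    present = {frozenset(e) for e in edges}
    cost = {
        frozenset((u, v)): 1 if frozenset((u, v)) in present else 2
        for u, v in combinations(vertices, 2)
    }
    return cost, len(vertices)

def has_cheap_tour(vertices, cost, k):
    """Brute force over all tours; True if some tour costs at most k."""
    first, *rest = vertices
    for order in permutations(rest):
        tour = [first, *order, first]
        total = sum(cost[frozenset(pair)] for pair in zip(tour, tour[1:]))
        if total <= k:
            return True
    return False

square = [1, 2, 3, 4]
cost, K = hamiltonian_cycle_to_tsp(square, [(1, 2), (2, 3), (3, 4), (4, 1)])
```

For the 4-cycle above the tour 1, 2, 3, 4, 1 costs exactly 4 = K. A star graph with centre 1 has no Hamiltonian cycle, and every tour in its G’ costs more than K.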
OR
As an interview question, for many years I'd ask candidates to write a brute-force
solution for the traveling salesman problem (TSP). This isn't nearly as hard as it sounds:
you just need to try every possible path, which can be done using a basic depth first
search. A lot of the time candidates who would get stuck would mention something
about how the problem is NP-complete (usually to indicate that they thought the
problem was impossible). The complexity class of TSP is not something I asked as
part of the interview, and not something I would use to judge a candidate. But hearing
that TSP is NP-complete over and over used to kind of irk me, because TSP is not NP-
complete. This surprises a lot of people, but it's true.
The term "NP-complete" has a very specific technical meaning, so it's not surprising that
people misuse the term. Therefore before continuing, I'm going to define various
complexity classes.
Definitions
P is the set of all problems that can be solved in polynomial time.
NP is the set of all problems whose solutions can be verified in polynomial time.
NP-hard problems are informally defined as those that can't be solved in polynomial
time. In other words, the problems that are harder than P. This is actually a simplified,
informal definition; later I'll give a more accurate definition.
NP-complete problems are the problems that are both NP-hard, and in NP. Proving
that a problem is in NP is usually trivial, but proving that a problem is NP-hard is
not. Boolean satisfiability (SAT) is known to be NP-hard (this is the Cook-Levin
theorem), and thus the usual way of proving that a problem is NP-complete is to prove
that there's a polynomial time transformation from SAT, or from another problem
already known to be NP-complete, to the problem.
Why TSP Is Not NP-complete
Why is TSP not NP-complete? The simple answer is that it's NP-hard, but it's not in NP.
Since it's not in NP, it can't be NP-complete.
In TSP you're looking for the shortest loop that goes through every city in a given set of
cities. Suppose you're given a set of cities, and the solution for the shortest loop among
these cities. How would you verify that the solution you're given really is the shortest
loop? In other words, how do you know there's not another loop that's shorter than the
one given to you? The only known way to verify that a provided solution is the shortest
possible solution is to actually solve TSP. Since no polynomial time algorithm for
solving TSP is known, there is no known way to check the solution in polynomial time.
Thus this problem is NP-hard, but not in NP.
In general, for a problem to be NP-complete it has to be a "decision problem", meaning
that the problem is to decide if something is true or not. There's a simple variation of
TSP called "decision TSP" that turns it into a decision problem. Imagine that instead of
finding the shortest loop going through all cities, your goal is to determine if there exists
any loop whose total length is less than some fixed number. For example, the question
might be: is there a loop that goes through all of these cities, whose total distance is
less than 100 km? In the negative case this is just as hard as regular TSP, because
you'd end up testing all possible paths. But there's an important difference: when the
answer is yes, a proposed loop can be verified in linear time by adding up all of the
distances making up the path and comparing the total to the limit, and that's what
makes this variant part of NP. There's a straightforward proof that the decision variant
is also NP-complete, by showing that the Hamiltonian cycle problem can be
transformed into the TSP decision problem in polynomial time.
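Here's a minimal sketch of that verification step in Python. The function name, the nested-dictionary distance table and the city names are made up for illustration; the point is only that checking a proposed loop takes a single pass over it.

```python
def verify_tour(cities, dist, tour, limit):
    """Return True if tour visits every city exactly once and its loop length is < limit."""
    if len(tour) != len(cities) or set(tour) != set(cities):
        return False
    # Add up consecutive distances, including the leg back to the starting city.
    total = sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))
    return total < limit

cities = ["A", "B", "C"]
dist = {
    "A": {"B": 30, "C": 50},
    "B": {"A": 30, "C": 40},
    "C": {"A": 50, "B": 40},
}
under_150 = verify_tour(cities, dist, ["A", "B", "C"], 150)
```

Note that this only confirms a "yes" answer. It says nothing about whether a shorter loop exists, which is exactly why the optimization version is harder to check.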
What Does NP-hard Really Mean?
Earlier I mentioned that the definition of NP-hard as "problems harder than P" is
informal. In nearly all cases this is sufficient, but it's not technically accurate. Formally, a
problem is NP-hard if given an oracle machine for the problem, all other problems in NP
could be solved in polynomial time.
The best known example of a problem that is in NP, but thought not to be NP-hard,
is integer factorization. It's trivial to verify that the factorization of a number is correct,
simply by taking the product of the factors given to you. This puts integer factorization in
NP. However, it is widely believed that integer factorization is not NP-hard (and thus not
NP-complete), because there doesn't appear to be any equivalence between integer
factorization and other NP-complete problems.
Integer factorization is therefore in a weird complexity class. We think it's not NP-hard,
and intuitively it feels "easier" than NP-hard problems like 3-SAT. A lot of smart people
have looked at integer factorization, and no one has found a polynomial time algorithm
for integer factorization. Thus it is probably "in-between" P and NP-hard, but no one
really knows for sure.
OR
TSP is NP-Complete???
The traveling salesman problem consists of a salesman and a set of cities. The
salesman has to visit each one of the cities starting from a certain one and returning to
the same city. The challenge of the problem is that the traveling salesman wants to
minimize the total length of the trip.
Proof
To prove TSP is NP-Complete, first we have to prove that TSP, stated as a decision
problem (is there a tour of cost at most k?), belongs to NP. Given a tour as a
certificate, we check that the tour contains each vertex once. Then the total
cost of the edges of the tour is calculated. Finally, we check that the cost is at most k.
This can be completed in polynomial time. Thus TSP belongs to NP.
Secondly, we have to prove that TSP is NP-hard. To prove this, one way is to show
that Hamiltonian cycle ≤p TSP (as we know that the Hamiltonian cycle problem is
NP-complete).
Assume G = (V, E) to be an instance of Hamiltonian cycle.
From it, an instance of TSP is constructed. We create the complete graph G' = (V, E'),
where
E′ = {(i, j) : i, j ∈ V and i ≠ j}
and the cost of an edge (i, j) is 0 if (i, j) ∈ E and 1 otherwise.
Now, suppose that a Hamiltonian cycle h exists in G. It is clear that the cost of each
edge in h is 0 in G' as each edge belongs to E. Therefore, h has a cost of 0 in G'. Thus,
if graph G has a Hamiltonian cycle, then graph G' has a tour of 0 cost.
Conversely, we assume that G' has a tour h' of cost at most 0. The cost of edges
in E' are 0 and 1 by definition. Hence, each edge must have a cost of 0 as the cost
of h' is 0. We therefore conclude that h' contains only edges in E.
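We have thus proven that G has a Hamiltonian cycle, if and only if G' has a tour of cost
at most 0. Since G' can be constructed in polynomial time, TSP is NP-hard, and together
with membership in NP, TSP is NP-complete.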
PLEASE STUDY IT
https://stackoverflow.com/questions/49837125/confusion-about-np-hard-and-np-complete-in-traveling-salesman-problems