Chap07 Handout4

CS 341: Foundations of CS II

Chapter 7
Time Complexity

Marvin K. Nakayama
Computer Science Department
New Jersey Institute of Technology
Newark, NJ 07102

Contents
• Time and space as resources
• Big-O/little-o notation, asymptotics
• Time complexity
• Polynomial time (P)
• Nondeterministic polynomial time (NP)
• NP-completeness
Introduction
• Chapters 3–5 dealt with computability theory:
  "What is and what is not possible to solve with a computer?"
• For the problems that are computable, this leads to the next question:
  "If we can decide a language A, how easy or hard is it to do so?"
• Complexity theory tries to answer this.
• We will use the Turing machine model.
  If we measure time complexity in a crude enough way, then results for TMs will also hold for all "reasonable" variants of TMs.

Counting Resources
• Two ways of measuring the "hardness" of a problem:
  1. Time complexity: How many time-steps are required in the computation of a problem?
  2. Space complexity: How many bits of memory are required for the computation?
• We will only examine time complexity in this course.
Example
• Consider the language A = { 0^k 1^k | k ≥ 0 }.
• Below is a single-tape Turing machine M1 that decides A:

M1 = "On input w, where w ∈ {0,1}* is a string:
1. Scan across the tape and reject if a 0 is found to the right of a 1.
2. Repeat the following while both 0s and 1s appear on the tape:
   Scan across the tape, crossing off a single 0 and a single 1.
3. If 0s still remain after all 1s are crossed off, or vice-versa, reject.
   Otherwise, if all 0s and 1s are crossed off, accept."

Tape: 0 0 0 1 1 1 ␣ ...

• Question: How much time does TM M1 need to decide A?

How much time does M1 need?
• The number of steps may depend on several parameters.
• Example: If the input is a graph, this could depend on the number of nodes, the number of edges, the maximum degree, or all, some, or none of the above.
• Definition: Complexity is measured as a function of the length of the input string.
  Worst case: longest running time on any input of a given length.
  Average case: average running time over inputs of a given length.
• We will only consider worst-case complexity.
Running Time
• Let M be a deterministic TM that halts on all inputs.
• We will study the relationship between the length of the encoding of a problem instance and the time complexity required to solve such an instance (worst case).
• Definition: The running time or time complexity of M is a function f : N → N defined by the maximization
  f(n) = max over |x| = n of (number of time steps of M on input x).
• Terminology:
  f(n) is the running time of M.
  M is an f(n)-time Turing machine.

Running Time (continued)
• The exact running time of most algorithms is quite complex.
• Instead we use an approximation for large problems.
• Informally, we want to focus only on the "important" parts of the running time.
• Example: 6n^3 + 2n^2 + 20n + 45 has four terms.
  6n^3 is the most important term when n is large.
  The leading coefficient 6 does not depend on n, so we focus only on n^3.
Asymptotic Notation
• Consider functions f and g, where f, g : N → R+.
• Definition: We say that f(n) = O(g(n)) if there are two positive constants c and n0 such that
  f(n) ≤ c · g(n) for all n ≥ n0.
• We say that "g(n) is an asymptotic upper bound on f(n)" and "f(n) is big-O of g(n)."
• Basic idea: ignore constant-factor differences:
  2n^3 + 52n^2 + 829n + 2193 = O(n^3).
  2 = O(1) and sin(n) + 3 = O(1).

Some Big-O Examples
• Example 1: Show f(n) = O(g(n)) for f(n) = 15n^2 + 7n and g(n) = (1/2) n^3.
  Let n0 = 16 and c = 2, so we have for all n ≥ n0:
  f(n) = 15n^2 + 7n ≤ 16n^2 ≤ n^3 = 2 · (1/2) n^3 = c · g(n).
  For the first ≤: if 7 ≤ n, then 7n ≤ n^2 (multiply both sides by n).
  For the second ≤: if 16 ≤ n, then 16n^2 ≤ n^3 (multiply by n^2).
• Example 2: 5n^4 + 27n = O(n^4).
  Take n0 = 1 and c = 32. (n0 = 3 and c = 6 also works.)
  But 5n^4 + 27n is not O(n^3): no values for c and n0 work.
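The witnesses c = 2, n0 = 16 from Example 1 can be checked numerically. The sketch below gives finite evidence for the bound over a range of n; it is a sanity check, not a proof (the helper name `bounded_by` is ours, not from the slides).

```python
def bounded_by(f, g, c, n0, upto=10_000):
    """Check f(n) <= c*g(n) for all n0 <= n <= upto.
    Finite evidence for a big-O claim, not a proof."""
    return all(f(n) <= c * g(n) for n in range(n0, upto + 1))

f = lambda n: 15 * n**2 + 7 * n
g = lambda n: n**3 / 2

print(bounded_by(f, g, c=2, n0=16))   # True: the bound holds from n0 = 16 on
print(bounded_by(f, g, c=2, n0=2))    # False: it fails for small n, which is
                                      # why big-O only requires n >= n0
```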
Polynomials vs. Exponentials
• For a polynomial
  p(n) = a1 n^k1 + a2 n^k2 + ... + ad n^kd, where k1 > k2 > ... > kd ≥ 0,
  we have p(n) = O(n^k1).
  Also, p(n) = O(n^r) for all r ≥ k1; e.g., 7n^3 + 5n^2 = O(n^4).
• Exponential functions like 2^n always eventually "overpower" polynomials.
  For all constants a and k, a polynomial f(n) = a · n^k + ... obeys f(n) = O(2^n).
  More generally, for functions in n, n^k = O(b^n) for all positive constants k and all b > 1.

Big-O for Logarithms
• Let log_b denote the logarithm with base b.
• Recall c = log_b n if b^c = n; e.g., log_2 8 = 3.
• log_b(x^y) = y log_b x, because x = b^(log_b x) and
  b^(y log_b x) = (b^(log_b x))^y = x^y.
• Note that n = 2^(log_2 n) and log_b(x^y) = y log_b x imply
  log_b n = log_b(2^(log_2 n)) = (log_2 n)(log_b 2).
  Changing the base b changes the value by only a constant factor, so when we say f(n) = O(log n), the base is unimportant.
• Note that log n = O(n); in fact, log n = O(n^d) for any d > 0.
  Polynomials overpower logarithms, just as exponentials overpower polynomials.
• Thus, n log n = O(n^2).
Big-O Properties
• O(n^2) + O(n) = O(n^2) and O(n^2) · O(n) = O(n^3).
• Sometimes we have f(n) = 2^O(n). What does this mean?
  Answer: f(n) has an asymptotic upper bound of 2^(cn) for some constant c.
• What does f(n) = 2^O(log n) mean?
  Recall the identities n = 2^(log_2 n) and n^c = 2^(c log_2 n) = 2^O(log_2 n).
  Thus, 2^O(log n) means an upper bound of n^c for some constant c.

More Remarks
• Definition:
  A bound of n^c, where c > 0 is a constant, is called polynomial.
  A bound of 2^(n^δ), where δ > 0 is a constant, is called exponential.
• f(n) = O(f(n)) for all functions f.
• [log n]^k = O(n) for all constants k.
• n^k = O(2^n) for all constants k.
• Because n = 2^(log_2 n), n is an exponential function of log n.
• If f(n) and g(m) are polynomials, then g(f(n)) is polynomial in n.
  Example: If f(n) = n^2 and g(m) = m^3, then g(f(n)) = g(n^2) = (n^2)^3 = n^6.
Little-o Notation
• Definition: Let f and g be two functions with f, g : N → R+. Then f(n) = o(g(n)) if
  lim (n → ∞) f(n)/g(n) = 0.
• Example: If f(n) = 10n^2 and g(n) = 2n^3, then f(n) = o(g(n)) because
  f(n)/g(n) = 10n^2 / (2n^3) = 5/n → 0 as n → ∞.

Remarks
• Big-O notation is about "asymptotically less than or equal to".
• Little-o is about "asymptotically much smaller than".
• Make it clear whether you mean O(g(n)) or o(g(n)).
• Make it clear which variable the function is in:
  O(x^y) can be a polynomial in x or an exponential in y.
• Simplify! Rather than O(8n^3 + 2n), use O(n^3).
• Try to keep your big-O as "tight" as possible.
  Suppose f(n) = 2n^3 + 8n^2. Although f(n) = O(n^5), it is better to write f(n) = O(n^3).
Back to the Example: TM M1 for A = { 0^k 1^k | k ≥ 0 }

M1 = "On input string w ∈ {0,1}*:
1. Scan across the tape and reject if a 0 is found to the right of a 1.
2. Repeat the following while both 0s and 1s appear on the tape:
   • Scan across the tape, crossing off a single 0 and a single 1.
3. If no 0s or 1s remain, accept; otherwise, reject."

Let's now analyze M1's run-time complexity.
• We will examine each stage separately.
• Suppose the input string w is of length n.

Analysis of Stage 1
1. Scan across the tape and reject if a 0 is found to the right of a 1.
Analysis:
• The input string w is of length n.
• Scanning requires n steps.
• Repositioning the head back to the beginning of the tape requires n steps.
• The total is 2n = O(n) steps.
Analysis of Stage 2
2. Repeat the following while both 0s and 1s appear on the tape:
   • Scan across the tape, crossing off a single 0 and a single 1.
Analysis:
• Each scan requires O(n) steps.
• Because each scan crosses off two symbols, at most n/2 scans can occur.
• The total is (n/2) · O(n) = O(n^2) steps.

Analysis of Stage 3 and Overall
3. If no 0s or 1s remain, accept; otherwise, reject.
Analysis:
• A single scan requires O(n) steps.

Total cost for each stage:
• Stage 1: O(n)
• Stage 2: O(n^2)
• Stage 3: O(n)
Overall complexity: O(n) + O(n^2) + O(n) = O(n^2).
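The three stages of M1 can be mirrored in a short Python sketch that also tallies head moves, giving concrete evidence for the quadratic bound. This is a high-level simulation of the stage structure, not a formal TM transition table, and the step-counting convention (one unit per cell visited) is our own simplification.

```python
def m1_decides(w):
    """High-level simulation of the single-tape TM M1 for
    A = { 0^k 1^k : k >= 0 }.  Returns (accepted, steps), where
    steps roughly counts head moves across the tape."""
    tape = list(w)
    n = len(tape)
    steps = 0

    # Stage 1: reject if a 0 appears to the right of a 1 (one scan).
    seen_one = False
    for c in tape:
        steps += 1
        if c == "1":
            seen_one = True
        elif c == "0" and seen_one:
            return False, steps
    steps += n  # reposition the head to the left end

    # Stage 2: while both symbols remain, cross off one 0 and one 1 per scan.
    while "0" in tape and "1" in tape:
        tape[tape.index("0")] = "x"
        tape[tape.index("1")] = "x"
        steps += 2 * n  # one full scan out and back

    # Stage 3: accept iff only crossed-off symbols remain.
    steps += n
    return not ("0" in tape or "1" in tape), steps

print(m1_decides("000111")[0])  # True
print(m1_decides("001")[0])     # False: a 0 is left over
```

Since Stage 2 makes about n/2 passes of about 2n moves each, the reported step count grows quadratically in |w|, matching the O(n^2) analysis.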
Time Complexity Class
• Definition: For a function t : N → N,
  TIME(t(n)) = { L | there is a 1-tape TM that decides language L in time O(t(n)) }.
Remarks:
• TM M1 decides the language A = { 0^k 1^k | k ≥ 0 }, and M1 has run-time complexity O(n^2).
• Thus, A ∈ TIME(n^2).
• Can we do better?

Another TM for A = { 0^k 1^k | k ≥ 0 }

M2 = "On input string w ∈ {0,1}*:
1. Scan across the tape and reject if a 0 is found to the right of a 1.
2. Repeat the following while both 0s and 1s appear on the tape:
   2.1 Scan across the tape, checking whether the total number of 0s and 1s is even or odd. If odd, reject.
   2.2 Scan across the tape, crossing off every other 0 (starting with the leftmost), and every other 1 (starting with the leftmost).
3. If no 0s or 1s remain, accept; otherwise, reject."
Why M2 Halts
• Stage 2.2 scans across the tape, crossing off every other 0 and 1.
• On each scan in Stage 2.2:
  The total number of 0s is decreased by (at least) half.
  The same holds for the 1s.
• Example: Start with 13 0s.
  After the first pass, 6 remain.
  After the second pass, 3 remain.
  After the third pass, 1 remains.
  After the fourth pass, none remain.

Why M2 Works
• Consider the parity of the 0s and 1s in Stage 2.1.
• Example: Start with 0^13 1^13.
  Initially, odd-odd (13, 13).
  Then even-even (6, 6).
  Then odd-odd (3, 3).
  Then odd-odd (1, 1).
• The result is 1011, which is the reverse of the binary representation of 13.
• Each pass checks one binary digit.
Analysis of M2
• Each stage requires O(n) time.
• Stages 1 and 3 run once each.
• Stage 2.2 eliminates half of the 0s and 1s, so Stage 2 runs O(log_2 n) times.
• The total for Stage 2 is O(log_2 n) · O(n) = O(n log n).
• Grand total: O(n) + O(n log n) = O(n log n), so language A ∈ TIME(n log n).

2-Tape TM for A = { 0^k 1^k | k ≥ 0 }

M3 = "On input string w ∈ {0,1}*:
1. Scan across the tape and reject if a 0 is found to the right of a 1.
2. Scan across the 0s to the first 1, copying the 0s to tape 2.
3. Scan across the 1s on tape 1 until the end.
   For each 1 on tape 1, cross off a 0 on tape 2.
   If no 0s are left, reject.
4. If any 0s are left, reject; otherwise, accept."

Before Stage 1: tape 1 holds 0 0 0 1 1 1; tape 2 is blank.
After Stage 2: tape 1 holds 0 0 0 1 1 1; tape 2 holds 0 0 0.
One can show that the running time of M3 is O(n).
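M2's parity argument can be sketched by tracking only the counts of remaining 0s and 1s rather than the tape itself. This is our own condensation of the machine (crossing off every other symbol, leftmost first, leaves ⌊k/2⌋ of k symbols), and it assumes, as in Stage 2.1, that the parity test applies to the total number of remaining 0s and 1s.

```python
def m2_decides(w):
    """Count-level sketch of TM M2 for A = { 0^k 1^k }:
    O(log n) passes, each rejecting on odd total parity and then
    halving the remaining 0s and 1s."""
    # Stage 1: a 0 to the right of a 1 forces a "10" adjacency somewhere.
    if "10" in w:
        return False
    zeros, ones = w.count("0"), w.count("1")
    # Stage 2: while both symbols remain...
    while zeros > 0 and ones > 0:
        # 2.1: if the total number of remaining symbols is odd, reject.
        if (zeros + ones) % 2 == 1:
            return False
        # 2.2: cross off every other 0 and every other 1 (leftmost first),
        # leaving floor(k/2) of each k.
        zeros //= 2
        ones //= 2
    # Stage 3: accept iff nothing remains.
    return zeros == 0 and ones == 0

print(m2_decides("000111"))  # True
print(m2_decides("00111"))   # False: parities of 2 and 3 differ
```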
Runtimes of TMs for A = { 0^k 1^k | k ≥ 0 }
• The runtime depends on the computational model:
  1-tape TM M1: O(n^2)
  1-tape TM M2: O(n log n)
  2-tape TM M3: O(n)
• For computability, all reasonable computational models are equivalent (Church-Turing Thesis).
• For complexity, the choice of computational model affects the time complexity.

A k-Tape TM Can Be Simulated on a 1-Tape TM with Polynomial Overhead

Theorem 7.8
• Let t(n) be a function where t(n) ≥ n.
• Then any t(n)-time multi-tape TM has an equivalent O(t^2(n))-time single-tape TM.

Example: a 3-tape TM with tape contents
  Tape 1: 0 1 1
  Tape 2: 0 0
  Tape 3: 1 0 0 1
has an equivalent 1-tape TM whose single tape holds
  # 0 1 1 # 0 0 # 1 0 0 1 #
with dotted symbols marking the positions of the virtual heads.
Review of Thm 3.13: Simulating a k-Tape TM M on a 1-Tape TM S
On input w = w1 ... wn, the 1-tape TM S does the following:
• First S prepares the initial string on its single tape:
  # w1 w2 ... wn # ␣ # ␣ # ␣ ...
• For each step of M, TM S scans its tape twice:
  1. S scans its tape from the first # (which marks the left end) to the (k+1)st # (which marks the right end) to read the symbols under the "virtual" heads.
  2. S rescans to write the new symbols and move the heads.
• If S tries to move a virtual head to the right onto a #, then M is trying to move a head onto an unused blank cell, so S has to write a blank on the tape and shift the rest of the tape right one cell.

Complexity of the Simulation
• For each step of the k-tape TM M, the 1-tape TM S performs two scans.
  The length of the active portion of S's tape determines how long S takes to perform each scan.
• In r steps, TM M can read/write in at most k × r different cells on its k tapes.
  As M has runtime t(n), at any point during M's execution, the total number of active cells on all of M's tapes is ≤ k × t(n) = O(t(n)).
  Thus, each of S's scans requires O(t(n)) time.
• Overall runtime of S:
  Initial tape arrangement: O(n) steps.
  S simulates each of M's t(n) steps using O(t(n)) steps, for a total of t(n) × O(t(n)) = O(t^2(n)) steps.
  Grand total: O(n) + O(t^2(n)) = O(t^2(n)) steps.
Running Time of Nondeterministic TMs
• What about nondeterministic TMs (NTMs)?
• Informally, an NTM makes "lucky guesses" during its computation.
• In terms of computability, there is no difference between TMs and NTMs.
• For time complexity, nondeterminism seems to make a big difference.

Definition:
• Let N be an NTM that is a decider (no looping).
• The running time of NTM N is the function f : N → N, where
  f(n) = max over |x| = n of (height of the tree of configurations for N on input x),
  i.e., the maximum number of steps that NTM N uses on any branch of the computation on any input x of size n.

[Figure: a deterministic computation is a single path of length f(n) ending in accept or reject; a nondeterministic computation is a tree of depth f(n), each of whose branches ends in accept or reject.]
Simulating an NTM N on a 1-Tape DTM D Requires Exponential Overhead

Theorem 7.11
• Let t(n) be a function with t(n) ≥ n.
• Any t(n)-time nondeterministic TM has an equivalent 2^O(t(n))-time deterministic 1-tape TM.

Proof Idea:
• Suppose N is an NTM decider running in t(n) time.
• On each input w, NTM N's computation is a tree of configurations.
• Simulate N on a 3-tape DTM D using BFS of N's computation tree:
  D tries all possible branches.
  If D finds any accepting configuration, D accepts.
  If all branches reject, D rejects.

Complexity of Simulating NTM N on a 1-Tape DTM D
• Analyze NTM N's computation tree on input w with |w| = n:
  The root is the starting configuration.
  Each node has ≤ b children, where b = the maximum number of legal choices given by N's transition function δ.
  Each branch has length ≤ t(n).
  The total number of leaves is ≤ b^t(n).
  The total number of nodes is ≤ 2 × (max number of leaves) = O(b^t(n)).
  The time to travel from the root to any node is O(t(n)).
• The DTM's runtime is ≤ the time to visit all nodes:
  O(b^t(n)) × O(t(n)) = 2^O(t(n)).
• Simulating an NTM by a DTM requires 3 tapes by Theorem 3.16.
• By Theorem 3.13, simulating the 3-tape DTM on a 1-tape DTM requires
  (2^O(t(n)))^2 = 2^(2 × O(t(n))) = 2^O(t(n)) steps.
Summary of Simulation Results
• Simulating a k-tape DTM on a 1-tape DTM increases the runtime from t(n) to O(t^2(n)), i.e., a polynomial increase in runtime.
• Simulating an NTM on a 1-tape DTM increases the runtime from t(n) to 2^O(t(n)), i.e., an exponential increase in runtime.

Polynomial Good, Exponential Bad
Time to run f(n) steps at 10^6 steps/second:

  f(n)    n = 10     n = 20     n = 30     n = 40       n = 50         n = 60
  n       .00001 s   .00002 s   .00003 s   .00004 s     .00005 s       .00006 s
  n^2     .0001 s    .0004 s    .0009 s    .0016 s      .0025 s        .0036 s
  n^3     .001 s     .008 s     .027 s     .064 s       .125 s         .216 s
  n^5     .1 s       3.2 s      24.3 s     1.7 min      5.2 min        13 min
  2^n     .001 s     1.05 s     17.9 min   12.7 days    35.7 years     366 cent.
  3^n     .059 s     58 min     6.5 years  3855 cent.   2×10^8 cent.   10^13 cent.
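A couple of the table's entries can be recomputed directly from the stated rate of 10^6 steps/second; this two-line check is ours, not part of the slides.

```python
def seconds(steps, rate=10**6):
    """Wall-clock seconds to execute `steps` machine steps at
    `rate` steps per second (the table assumes 10^6 steps/second)."""
    return steps / rate

YEAR = 60 * 60 * 24 * 365

print(seconds(2**50) / YEAR)   # about 35.7 — the 2^n entry at n = 50
print(seconds(40**5) / 60)     # about 1.7  — the n^5 entry at n = 40, in minutes
```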
Strong Church-Turing Thesis
• In general, every "reasonable" variant of the DTM (k-tape, r heads, etc.) can be simulated by a single-tape DTM with only polynomial time/space overhead.
  Any one of these models can simulate another with only a polynomial increase in the running time or space required.
  All "reasonable" models of computation are polynomially equivalent.
  The NTM is an "unreasonable" variant: it can do O(b^s) work on step s.
• If any reasonable version of a DTM can solve a problem in polynomial time, then any other reasonable type of DTM can also.
• If we ask whether a particular problem is solvable in linear time (i.e., O(n)), the answer depends on the computational model used.
• If we ask whether a particular problem A is solvable in polynomial time, the answer is independent of the reasonable computational model used.

The Class P
Because of the polynomial equivalence of DTM models, we group the languages solvable in O(n^2), O(n log n), O(n), etc., together into the polynomial-time class.

Definition: The class of languages that can be decided by a single-tape DTM in polynomial time is denoted by P, where
  P = ∪ (k ≥ 0) TIME(n^k).

Remarks:
• If we ask whether a particular problem A is solvable in polynomial time (i.e., is A ∈ P?), the answer is independent of the deterministic computational model used.
• The class P roughly corresponds to tractable (i.e., realistically solvable) problems.
Encoding of Problems
• Recall: TM running time is defined as a function of the length of the encoding ⟨x⟩ of the input x.
• But for a given problem, there are many ways to encode an input x as ⟨x⟩. We should use a "good" encoding scheme.
• For integers:
  binary is good;
  unary is bad (exponentially worse).
  Example: Suppose the input to a TM is the number 18 in decimal.
  Encoded in binary, ⟨18⟩ = 10010.
  Encoded in unary, ⟨18⟩ = 111111111111111111.
• For graphs:
  a list of nodes and edges (good);
  an adjacency matrix (good).

Example of a Problem in P: PATH
• Decision problem: Given a directed graph G with nodes s and t, does G have a path from s to t?
[Figure: a directed graph G with nodes 1–5.]
• Universe Ω = { ⟨G, s, t⟩ | G is a directed graph with nodes s, t } of instances (for a particular encoding scheme).
• The language of the decision problem comprises the YES instances:
  PATH = { ⟨G, s, t⟩ | G is a directed graph with a path from s to t } ⊆ Ω.
• For the graph G above, ⟨G, 1, 5⟩ ∈ PATH, but ⟨G, 2, 1⟩ ∉ PATH.
PATH ∈ P

Theorem 7.14
PATH ∈ P.

Brute-force algorithm:
• The input is an instance ⟨G, s, t⟩ ∈ Ω, where G is a directed graph with nodes s and t.
• Let m be the number of nodes in G, so G has ≤ m^2 edges.
  m (or m^2) roughly measures the size of the instance ⟨G, s, t⟩.
• Any path from s to t need not repeat nodes.
• Examine each potential path in G of length ≤ m, and check if the path goes from s to t.
What is the complexity of this algorithm?

Complexity of the Brute-Force Algorithm for PATH
• There are roughly m^m potential paths of length ≤ m.
  For each potential path length k = 2, 3, ..., m, check all k! permutations of k distinct nodes drawn from the C(m, k) possibilities, where
  k! = k × (k−1) × (k−2) × ... × 1 and C(m, k) = m! / (k!(m−k)!).
  Stirling's approximation: k! ~ (k/e)^k √(2πk).
• This is exponential in the number m of nodes.
• So the brute-force algorithm's runtime is exponential in the size of the input.
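The count C(m, k) · k! of candidate node sequences can be tabulated directly, which makes the exponential blow-up concrete. The helper name `brute_force_paths` is ours; it counts the sequences the brute-force algorithm would examine, per the analysis above.

```python
from math import comb, factorial

def brute_force_paths(m):
    """Number of candidate node sequences the brute-force PATH
    algorithm examines: all orderings of k distinct nodes out of m,
    summed over path lengths k = 2..m."""
    return sum(comb(m, k) * factorial(k) for k in range(2, m + 1))

print(brute_force_paths(5))              # 320 candidate sequences for m = 5
print(brute_force_paths(20) > 20**3)     # True: dwarfs the O(m^3) algorithm below
```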
A Better Algorithm Shows PATH ∈ P
On input ⟨G, s, t⟩ ∈ Ω, where G is a directed graph with nodes s and t:
1. Place a mark on node s.
2. Repeat until no additional nodes are marked:
   • Scan all edges of G.
   • If an edge (a, b) is found from a marked node a to an unmarked node b, then mark b.
3. If node t is marked, accept; otherwise, reject.

For the graph G above: ⟨G, 1, 5⟩ ∈ PATH, but ⟨G, 5, 3⟩ ∉ PATH and ⟨G, 2, 1⟩ ∉ PATH.

Complexity of the Better Algorithm for PATH
(The complexity depends on how ⟨G, s, t⟩ is encoded.)
• Suppose G is encoded as ⟨list of nodes, list of edges⟩.
• Suppose the input graph G has m nodes, so it has ≤ m^2 edges.
• Stage 1 runs only once, in O(m) time.
• Stage 2 runs at most m times:
  Each time (except the last), it marks new nodes.
  Each time requires scanning the edges, which runs in O(m^2) steps.
• Stage 3 runs only once, in O(m) time.
• Overall complexity: O(m) + O(m) · O(m^2) + O(m) = O(m^3),
so PATH ∈ P.
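The marking algorithm is essentially a graph search, and can be sketched in a few lines. The adjacency-list encoding and the particular edge set for the 5-node example are our assumptions (the slide's figure does not list its edges), chosen so that node 5 is reachable from node 1.

```python
from collections import deque

def path(graph, s, t):
    """Marking algorithm for PATH.  `graph` maps each node to a list
    of successor nodes (an assumed adjacency-list encoding)."""
    marked = {s}
    frontier = deque([s])
    while frontier:                    # Stage 2: repeat until no new marks
        a = frontier.popleft()
        for b in graph.get(a, []):     # scan edges out of a marked node
            if b not in marked:
                marked.add(b)          # mark b and remember to scan from it
                frontier.append(b)
    return t in marked                 # Stage 3

# Hypothetical edge set for a 5-node graph like the one in the slides.
G = {1: [2, 3], 2: [4], 3: [5], 4: [], 5: []}
print(path(G, 1, 5))   # True
print(path(G, 2, 1))   # False
```

Each node enters the frontier at most once and each edge is scanned at most once per mark, so the running time is polynomial in the number of nodes, matching the O(m^3) bound (and in fact beating it).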
Another Problem in P: RELPRIME
• Definition: Two integers x and y are relatively prime if 1 is the largest integer that divides both, i.e., the greatest common divisor GCD(x, y) = 1.
• Examples:
  10 and 21 are relatively prime.
  10 and 25 are not.
• Decision problem: Given integers x and y, are x and y relatively prime?
  Universe Ω = { ⟨x, y⟩ | x, y integers } of problem instances.
  Language of the decision problem:
  RELPRIME = { ⟨x, y⟩ | x and y are relatively prime } ⊆ Ω.
  So ⟨10, 21⟩ ∈ RELPRIME and ⟨10, 25⟩ ∉ RELPRIME.

Theorem 7.15
RELPRIME ∈ P.

Bad Algorithm for RELPRIME
Bad idea: Test all possible divisors (i.e., 2 to min(x, y)).
The complexity of this algorithm depends on how the integers are encoded:
• If ⟨x, y⟩ is encoded in unary (bad), then the length of ⟨x⟩ is x and the length of ⟨y⟩ is y, so testing min(x, y) values is polynomial in the length of the input ⟨x, y⟩.
• If ⟨x, y⟩ is encoded in binary (good), then the length of ⟨x⟩ is log x and the length of ⟨y⟩ is log y, so testing min(x, y) values is exponential in the length of the input ⟨x, y⟩, because n is an exponential function of log n (i.e., n = 2^(log_2 n)).
• This algorithm is pseudo-polynomial:
  polynomial running time with a bad encoding;
  exponential running time with a good encoding.
A Better Algorithm for RELPRIME

Euclidean Algorithm E:
E = "On input ⟨x, y⟩, where x, y are natural numbers encoded in binary:
1. Repeat until y = 0:
   • Assign x ← x mod y.
   • Exchange x and y.
2. Output x."

Algorithm R below solves RELPRIME, using E as a subroutine:
R = "On input ⟨x, y⟩, where x, y are natural numbers encoded in binary:
1. Run E on ⟨x, y⟩.
2. If the output of E is 1, accept; otherwise, reject."

Complexity of E:
• After the first step of Stage 1, x < y because of the mod.
• The values are then swapped, so x > y.
• One can show that each subsequent execution of Stage 1 cuts x by at least half.
• The number of times Stage 1 is executed is ≤ min(log_2 x, log_2 y).
• Thus, the total running time of E (and R) is polynomial in |⟨x, y⟩|, so
RELPRIME ∈ P.
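Algorithms E and R translate almost line for line into Python; the sketch below follows the two-step loop body from the slides (mod, then exchange).

```python
def euclid(x, y):
    """Euclidean algorithm E: returns GCD(x, y)."""
    while y != 0:          # Stage 1: repeat until y = 0
        x = x % y          #   assign x <- x mod y
        x, y = y, x        #   exchange x and y
    return x               # Stage 2: output x

def relprime(x, y):
    """Algorithm R: accept iff x and y are relatively prime."""
    return euclid(x, y) == 1

print(relprime(10, 21))  # True
print(relprime(10, 25))  # False (GCD is 5)
```

The loop iterates O(min(log_2 x, log_2 y)) times, so the running time is polynomial in the binary encoding length of the input.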
CFLs Are in P

Theorem 7.16
Every context-free language is in P.

Remarks:
• We will show that each CFL ∈ TIME(n^3), where n is the length of the input string w ∈ Σ*.
  In contrast, each regular language ∈ TIME(n). Why?
• Theorem 4.9 showed that every CFL is decidable, which we now review.
• Convert the CFG into Chomsky normal form, in which each rule has one of the forms
  A → BC, A → x, S → ε,
  where A, B, C, S are variables; S is the start variable; B and C are not the start variable; and x is a terminal.

Recall the Previous Algorithm to Decide a CFL

Lemma
If G is in Chomsky normal form and string w ∈ L(G) has length n > 0, then w has a derivation with 2n − 1 steps.

Theorem 4.9
Every CFL is a decidable language.

Proof.
• Assume L is a CFL generated by CFG G in Chomsky normal form.
• Theorem 4.7: there exists a TM S that decides
  ACFG = { ⟨G, w⟩ | G is a CFG that generates w }.
• The following TM MG decides the CFL L ⊆ Σ*:
  MG = "On input w ∈ Σ*:
  1. Run TM S on input ⟨G, w⟩.
  2. If S accepts, accept; if S rejects, reject."
The Previous Algorithm Is Exponential
• Recall that to determine if ⟨G, w⟩ ∈ ACFG, TM S tries all derivations with k = 2n − 1 steps, where n = |w| > 0.
  But the number of derivations taking k steps can be exponential in k.
  So we need to use a different algorithm.
• Use dynamic programming (DP):
  A powerful, general technique.
  Basic idea: accumulate information about smaller subproblems to solve larger subproblems.
  Store subproblem solutions in a table as they are generated.
  Look up smaller subproblem solutions as needed when solving larger subproblems.
  DP for CFGs: the Cocke-Younger-Kasami (CYK) algorithm.

Dynamic Programming
• Fix a CFG G in Chomsky normal form.
• The input to the DP algorithm is a string w = w1 w2 ... wn with |w| = n.
• In our case of DP, the subproblems are to determine which variables in G can generate each substring of w.
• Create an n × n table, where entry (i, j) is in row i, column j.
  The first diagonal holds the substrings of length 1; the next diagonal, substrings of length 2; then substrings of length 3; and so on, up to entry (1, n) for the complete string.
Dynamic Programming Table
• For i ≤ j, the (i, j)th entry contains those variables that can generate the substring wi wi+1 ... wj.
• For i > j, the (i, j)th entry is unused.
• DP starts by filling in all entries for substrings of length 1, then all entries for length 2, then all entries for length 3, etc.
• Idea: Use the shorter lengths to determine how to construct the longer lengths.

Filling in the Dynamic Programming Table
• Suppose s = uv, B ⇒* u, C ⇒* v, and there exists a rule A → BC.
  Then A ⇒* s, because A ⇒ BC ⇒* uv = s.
• Suppose the algorithm has determined which variables generate each substring of length ≤ k.
• To determine whether a variable A can generate a substring of length k + 1:
  Split the substring into 2 non-empty pieces in all possible (k) ways.
  For each split, the algorithm examines the rules A → BC.
  Each piece is shorter than the current substring, so the table tells how to generate each piece.
  Check if B generates the first piece.
  Check if C generates the second piece.
  If both are possible, then add A to the table.
Example: CYK Algorithm
Does the following CFG in Chomsky normal form generate baaba?
  S → XY | YZ    X → YX | a
  Y → ZZ | b     Z → XY | a
• Build a table t so that for i ≤ j, entry t(i, j) contains the variables that can generate the substring starting in position i and ending in position j.
• Fill in one diagonal at a time.

CYK for Substrings of Length 1
• t(1, 1): the substring b starts in position 1 and ends in position 1.
  The CFG has rule Y → b, so put Y in t(1, 1).
CYK for Substrings of Length 1 (cont.)
• t(2, 2): the substring a starts in position 2 and ends in position 2.
  The CFG has rules X → a and Z → a, so put X, Z in t(2, 2).
• Similarly fill in the other t(i, i):

        1      2      3      4      5
  1     Y
  2            X,Z
  3                   X,Z
  4                          Y
  5                                 X,Z
        b      a      a      b      a

CYK for Substrings of Length 2
• t(1, 2): the substring ba starts in position 1 and ends in position 2.
  Split ba = b ◦ a: Y ⇒* b by t(1, 1); X, Z ⇒* a by t(2, 2).
  If a rule's RHS ∈ t(1, 1) ◦ t(2, 2) = {YX, YZ}, then its LHS ⇒* ba:
  X ⇒ YX ⇒* ba and S ⇒ YZ ⇒* ba, so put S, X in t(1, 2).
CYK for Substrings of Length 2 (cont.)
• t(2, 3): the substring aa starts in position 2 and ends in position 3.
  Split aa = a ◦ a: X, Z ⇒* a by t(2, 2); X, Z ⇒* a by t(3, 3).
  If a rule's RHS ∈ t(2, 2) ◦ t(3, 3) = {XX, XZ, ZX, ZZ}, then its LHS ⇒* aa:
  Y ⇒ ZZ ⇒* aa, so put Y in t(2, 3).
• t(3, 4): the substring ab starts in position 3 and ends in position 4.
  Split ab = a ◦ b: X, Z ⇒* a by t(3, 3); Y ⇒* b by t(4, 4).
  If a rule's RHS ∈ t(3, 3) ◦ t(4, 4) = {XY, ZY}, then its LHS ⇒* ab:
  S ⇒ XY ⇒* ab and Z ⇒ XY ⇒* ab, so put S, Z in t(3, 4).
• t(4, 5): similarly handle the substring ba by adding the LHS of a rule to t(4, 5) if its RHS ∈ t(4, 4) ◦ t(5, 5); this puts S, X in t(4, 5).

        1      2      3      4      5
  1     Y      S,X
  2            X,Z    Y
  3                   X,Z    S,Z
  4                          Y      S,X
  5                                 X,Z
        b      a      a      b      a
CS 341: Chapter 7 7-61 CS 341: Chapter 7 7-62
Ex. (cont.): CYK for Substrings of Length 3 Ex. (cont.): CYK for Substrings of Length 3
Chomsky CFG: S → XY | Y Z X → YX | a Chomsky CFG: S → XY | Y Z X → YX | a
Y → ZZ | b Z → XY | a Y → ZZ | b Z → XY | a

1 2 3 4 5 1 2 3 4 5
1 Y S, X — 1 Y S, X —
2 X, Z Y 2 X, Z Y Y
3 X, Z S, Z 3 X, Z S, Z Y
4 Y S, X 4 Y S, X
5 X, Z 5 X, Z
string b a a b a string b a a b a
• t(1, 3): substring baa starts in position 1 and ends in position 3. • t(2, 4): substring aab starts in position 2 and ends in position 4.
• For each rule, add LHS to t(1, 3) if • Add LHS of rule to t(2, 4) if RHS ∈ t(2, 2) ◦ t(3, 4) ∪ t(2, 3) ◦ t(4, 4).
∗ ∗
RHS ∈ t(1, 1) ◦ t(2, 3) ∪ t(1, 2) ◦ t(3, 3). split aab = a ab : X, Z ⇒ a by t(2, 2); S, Z ⇒ ab by t(3, 4);


split baa = b aa : Y ⇒ b by t(1, 1); Y ⇒ aa by t(2, 3);
∗ so if rule RHS ∈ t(2, 2) ◦ t(3, 4) = {XS, XZ, ZS, ZZ}, then LHS ⇒ aab:
∗ ∗
so if rule RHS ∈ t(1, 1) ◦ t(2, 3) = {Y Y }, then LHS ⇒ baa. Y ⇒ ZZ ⇒ a ab
∗ ∗ ∗ ∗
split baa = ba a : S, X ⇒ ba by t(1, 2); X, Z ⇒ a by t(3, 3); split aab = aa b : Y ⇒ aa by t(2, 3); Y ⇒ b by t(4, 4);
∗ ∗
if rule RHS ∈ t(1, 2) ◦ t(3, 3) = {SX, SZ, XX, XZ}, then LHS ⇒ baa. so if rule RHS ∈ t(2, 3) ◦ t(4, 4) = {Y Y }, then LHS ⇒ aab.

CS 341: Chapter 7 7-63 CS 341: Chapter 7 7-64


Ex. (cont.): CYK for Substrings of Length 4 Ex. (cont.): CYK for Substrings of Length 4
Chomsky CFG: S → XY | Y Z X → YX | a Chomsky CFG: S → XY | Y Z X → YX | a
Y → ZZ | b Z → XY | a Y → ZZ | b Z → XY | a
1 2 3 4 5 1 2 3 4 5
1 Y S, X — — 1 Y S, X — —
2 X, Z Y Y 2 X, Z Y Y S, X, Z
3 X, Z S, Z Y 3 X, Z S, Z Y
4 Y S, X 4 Y S, X
5 X, Z 5 X, Z
string b a a b a string b a a b a

• t(1, 4): substring baab starts in position 1 and ends in position 4. • t(2, 5): substring aaba starts in position 2 and ends in position 5.
∗ ∗
• For each rule, add LHS to t(1, 4) if split a aba: X, Z ⇒ a by t(2, 2); Y ⇒ aba by t(3, 5);

so if rule RHS ∈ t(2, 2) ◦ t(3, 5) = {XY , ZY }, then LHS ⇒ aaba:
RHS ∈ ∪3k=1 t(1, k) ◦ t(k + 1, 4). ∗ ∗
S ⇒ XY ⇒ a aba, Z ⇒ XY ⇒ a aba
∗ ∗
split b aab :

Y ⇒ b by t(1, 1);

Y ⇒ aab by t(2, 4); split aa ba: Y ⇒ aa by t(2, 3); S, X ⇒ ba by t(4, 5);


so if rule RHS ∈ t(1, 1) ◦ t(2, 4) = {Y Y }, then LHS ⇒ baab. so if rule RHS ∈ t(2, 3) ◦ t(4, 5) = {Y S, Y X}, then LHS ⇒ aaba:

∗ ∗ X ⇒ Y X ⇒ aa ba
split ba ab : S, X ⇒ ba by t(1, 2); S, Z ⇒ ab by t(3, 4); ∗ ∗

so if rule RHS ∈ t(1, 2) ◦ t(3, 4) = {SS, SZ, XS, XZ}, then LHS ⇒ baab. split aab a: Y ⇒ aab by t(2, 4); X, Z ⇒ a by t(5, 5);

∗ ∗ so if rule RHS ∈ t(2, 4) ◦ t(5, 5) = {Y X, Y Z}, then LHS ⇒ aaba:
split baa b : Nothing ⇒ baa as t(1, 3) = ∅; Y ⇒ b by t(4, 4). ∗
X ⇒ Y X ⇒ aab a
CS 341: Chapter 7 7-65 CS 341: Chapter 7 7-66
Ex. (cont.): CYK for Substrings of Length 5

Does the following CFG in Chomsky Normal Form generate baaba ?

Chomsky CFG:  S → XY | Y Z    X → Y X | a
              Y → ZZ | b      Z → XY | a

         1     2     3     4     5
  1      Y    S,X    —     —   S,X,Z
  2           X,Z    Y     Y   S,X,Z
  3                 X,Z   S,Z    Y
  4                        Y    S,X
  5                             X,Z
string   b     a     a     b     a

• t(1,5): substring baaba starts in position 1 and ends in position 5.
• For each rule, add LHS to t(1,5) if RHS ∈ ∪_{k=1}^{4} t(1,k) ◦ t(k+1,5).
• Answer is YES iff start variable S ∈ t(1,5).

Overall CYK Algorithm to show every CFL ∈ P

D = “On input string w = w1 w2 · · · wn ∈ Σ∗:
 1. For w = ε, if S → ε is a rule, accept; else reject.  [w = ε case]
 2. For i = 1 to n,  [examine each substring of length 1]
 3.   For each variable A,
 4.     Test whether A → b is a rule, where b = wi.
 5.     If so, put A in table(i, i).
 6. For ℓ = 2 to n,  [ℓ is length of substring]
 7.   For i = 1 to n − ℓ + 1,  [i is start position of substring]
 8.     Let j = i + ℓ − 1,  [j is end position of substring]
 9.     For k = i to j − 1,  [k is split position]
10.       For each rule A → BC,
11.         If table(i, k) contains B and table(k+1, j) contains C,
            put A in table(i, j).
12. If S is in table(1, n), accept; else, reject.”
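The table-filling procedure above can be sketched in Python (an illustrative implementation, not code from the handout; the grammar encoding as lists of tuples is my own choice):

```python
def cyk(rules, start, w):
    """CYK membership test for a CFG in Chomsky Normal Form.
    rules: list of (lhs, rhs) pairs; rhs is a 1-tuple (terminal)
    or a 2-tuple of variables. (The w = epsilon case is omitted.)"""
    n = len(w)
    if n == 0:
        return False
    # table[(i, j)] = set of variables deriving substring w[i..j], 1-indexed
    table = {(i, j): set() for i in range(1, n + 1) for j in range(1, n + 1)}
    for i in range(1, n + 1):                      # substrings of length 1
        for lhs, rhs in rules:
            if rhs == (w[i - 1],):
                table[(i, i)].add(lhs)
    for length in range(2, n + 1):                 # length of substring
        for i in range(1, n - length + 2):         # start position
            j = i + length - 1                     # end position
            for k in range(i, j):                  # split position
                for lhs, rhs in rules:
                    if len(rhs) == 2 and rhs[0] in table[(i, k)] \
                            and rhs[1] in table[(k + 1, j)]:
                        table[(i, j)].add(lhs)
    return start in table[(1, n)]

# Grammar from the running example
rules = [("S", ("X", "Y")), ("S", ("Y", "Z")),
         ("X", ("Y", "X")), ("X", ("a",)),
         ("Y", ("Z", "Z")), ("Y", ("b",)),
         ("Z", ("X", "Y")), ("Z", ("a",))]
print(cyk(rules, "S", "baaba"))  # True: S ∈ t(1,5)
```

The three nested loops over (length, start, split) mirror stages 6–11 of D and give the same O(n³) behavior for a fixed grammar.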
CS 341: Chapter 7 7-67 CS 341: Chapter 7 7-68
Complexity of CYK Algorithm

• Each stage runs in polynomial time.
• Examine stages 2–5:
   2. For i = 1 to n,  [examine each substring of length 1]
   3.   For each variable A,
   4.     Test whether A → b is a rule, where b = wi.
   5.     If so, put A in table(i, i).
• Analysis:
  Stage 2 runs n times.
  Each time stage 2 runs, stage 3 runs v times, where
    v is number of variables in G
    v is independent of n.
  Thus, stages 4 and 5 run at most nv times,
  which is O(n) because v is independent of n.

Complexity (cont.)

 6. For ℓ = 2 to n,  [ℓ is length of substring]
 7.   For i = 1 to n − ℓ + 1,  [i is start position of substring]
 8.     Let j = i + ℓ − 1,  [j is end position of substring]
 9.     For k = i to j − 1,  [k is split position]
10.       For each rule A → BC,
11.         If table(i, k) contains B and table(k+1, j) contains C,
            put A in table(i, j).
12. If S is in table(1, n), accept. Otherwise, reject.

Analysis:
• Stage 6 runs at most n times.
• Each time stage 6 runs, stage 7 runs at most n times.
• Each time stage 7 runs, stage 9 runs at most n times.
• Each time stage 9 runs, stage 10 runs r times (r = # rules = constant).
• Thus, stage 8 runs O(n²) times, and stage 11 runs O(n³) times.

Grand total: O(n³)
CS 341: Chapter 7 7-69 CS 341: Chapter 7 7-70
Hamiltonian Path

[Figure: directed graph G on nodes 1–8, with Hamiltonian path
1 → 3 → 5 → 4 → 2 → 6 → 7 → 8 highlighted.]

• Definition: A Hamiltonian path in a directed graph G visits each
  node exactly once, e.g., 1 → 3 → 5 → 4 → 2 → 6 → 7 → 8.
• Decision problem: Given a directed graph G with nodes s and t,
  does G have a Hamiltonian path from s to t?
• Universe Ω = { ⟨G, s, t⟩ | directed graph G with nodes s, t }, and language is
    HAMPATH = { ⟨G, s, t⟩ | G is a directed graph with a
                Hamiltonian path from s to t } ⊆ Ω.
• If G is above graph, ⟨G, 1, 8⟩ ∈ HAMPATH, but ⟨G, 2, 8⟩ ∉ HAMPATH.

Hamiltonian Path

HAMPATH = { ⟨G, s, t⟩ | G is a directed graph with a
            Hamiltonian path from s to t }

• Question: How hard is it to decide HAMPATH?
• Suppose graph G has m nodes.
• Easy to come up with (exponential) brute-force algorithm:
  Generate each of the (m − 2)! potential paths.
  Check if any of these is Hamiltonian.
• Currently unknown if HAMPATH is solvable in polynomial time.
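The brute-force idea above can be sketched in Python (a minimal illustration, not the handout's code; the small example graph is my own, not the 8-node graph from the slide):

```python
from itertools import permutations

def ham_path(nodes, edges, s, t):
    """Brute-force decider for HAMPATH: try all (m-2)! orderings of the
    intermediate nodes, so this takes exponential time in general."""
    edge_set = set(edges)
    middle = [v for v in nodes if v not in (s, t)]
    for perm in permutations(middle):
        path = [s, *perm, t]
        # path is Hamiltonian iff every consecutive pair is a directed edge
        if all((u, v) in edge_set for u, v in zip(path, path[1:])):
            return True
    return False

nodes = [1, 2, 3, 4]
edges = [(1, 3), (3, 2), (2, 4), (1, 2)]
print(ham_path(nodes, edges, 1, 4))  # True: 1 → 3 → 2 → 4
print(ham_path(nodes, edges, 3, 4))  # False
```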
CS 341: Chapter 7 7-71 CS 341: Chapter 7 7-72
Hamiltonian Path

• But HAMPATH has feature known as polynomial verifiability.
• A claimed Hamiltonian path can be verified in polynomial time.
  Consider ⟨G, s, t⟩ ∈ HAMPATH, where graph G has m nodes.
  Then (# edges in G) ≤ m(m − 1) = O(m²).
  Suppose G encoded as ⟨list of nodes, list of edges⟩.
  Suppose given list p1, p2, . . . , pm of nodes that is claimed to be
  Hamiltonian path in G from s to t.
  Can verify claim by checking
  1. if each node in G appears exactly once in claimed path,
     which takes O(m²) time,
  2. if each pair (pi, pi+1) is edge in G, which takes O(m³) time.
  So verification takes time O(m³), which is polynomial in m.
• Thus, verifying a given path is Hamiltonian may be easier than
  determining its existence.

Composite Numbers

Definition: A natural number is composite if it is the product of two
integers greater than one, i.e., a composite number is not prime.

• Decision problem: Given natural number x, is x composite?
• Universe Ω = { ⟨x⟩ | natural number x }, and language is
    COMPOSITES = { ⟨x⟩ | x = pq, for integers p, q > 1 } ⊆ Ω.

Remarks:
• Can easily verify that a number is composite.
  If someone claims a number x is composite and provides a divisor p,
  just need to verify that x is divisible by p.
• In 2002, Agrawal, Kayal and Saxena proved that PRIMES ∈ P.
  But COMPOSITES is the complement of PRIMES, so COMPOSITES ∈ P.
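The two verification checks above can be sketched as a small Python function (an illustration of the verifier, not the handout's code):

```python
def verify_ham_path(nodes, edges, s, t, claimed_path):
    """Polynomial-time verifier: claimed_path is the certificate."""
    edge_set = set(edges)
    # Check 1: each node of G appears exactly once in the claimed path
    if sorted(claimed_path) != sorted(nodes):
        return False
    # The path must start at s and end at t
    if claimed_path[0] != s or claimed_path[-1] != t:
        return False
    # Check 2: each consecutive pair (p_i, p_{i+1}) is an edge of G
    return all((u, v) in edge_set
               for u, v in zip(claimed_path, claimed_path[1:]))

nodes = [1, 2, 3, 4]
edges = [(1, 3), (3, 2), (2, 4), (1, 2)]
print(verify_ham_path(nodes, edges, 1, 4, [1, 3, 2, 4]))  # True
print(verify_ham_path(nodes, edges, 1, 4, [1, 2, 3, 4]))  # False: (2,3) not an edge
```

Both checks run in polynomial time in the size of ⟨G, s, t⟩, in line with the O(m³) bound on the slide.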
CS 341: Chapter 7 7-73 CS 341: Chapter 7 7-74
Verifiability

• Some problems may not be polynomially verifiable.
  Consider the complement of HAMPATH.
  No known way to verify ⟨G, s, t⟩ ∈ complement of HAMPATH in polynomial time.
• Definition: Verifier for language A is (deterministic) algorithm V, where
    A = { w | V accepts ⟨w, c⟩ for some string c }
• String c used to verify string w ∈ A
  c is called a certificate, or proof, of membership in A.
  Certificate is only for YES instance, not for NO instance.
• We measure verifier runtime only in terms of length of w.
• A polynomial-time verifier runs in (deterministic) time that is
  polynomial in |w|.
• Language is polynomially verifiable if it has polynomial-time verifier.

Examples of Verifiers and Certificates

• For HAMPATH, a certificate for
    ⟨G, s, t⟩ ∈ HAMPATH
  is simply the Hamiltonian path from s to t.
  Can verify in time polynomial in |⟨G, s, t⟩| if path is Hamiltonian.
• For COMPOSITES, a certificate for
    ⟨x⟩ ∈ COMPOSITES
  is simply one of its divisors.
  Can verify in time polynomial in |⟨x⟩| that the given divisor actually
  divides x.
• Remark: Certificate c is only for YES instance, not for NO instance.
CS 341: Chapter 7 7-75 CS 341: Chapter 7 7-76
Class NP

Definition: NP is class of languages with polynomial-time verifiers.

Remarks:
• Class NP contains many problems of practical interest
  HAMPATH
  Travelling salesman
  All of P
• The term NP comes from nondeterministic polynomial time.
  Can define NP in terms of nondeterministic polynomial-time TMs.
• Recall: a nondeterministic TM (NTM) makes “lucky guesses” in
  computation.

NTM N1 for HAMPATH

N1 = “On input ⟨G, s, t⟩ ∈ Ω, for directed graph G with nodes s, t:
1. Write list of m numbers p1, p2, . . . , pm, where m is # of nodes in G.
   Each number in list selected nondeterministically between 1 and m.
2. Check for repetitions in list. If any found, reject.
3. Check whether p1 = s and pm = t. If either fails, reject.
4. For i = 1 to m − 1, check whether (pi, pi+1) is an edge of G.
   If any is not, reject. Otherwise, accept.”

Complexity of N1 (when G encoded as ⟨list of nodes, list of edges⟩):
• Stage 1 takes nondeterministic polynomial time: O(m).
• Stages 2 and 3 are simple deterministic poly-time checks: O(m²).
• Stage 4 runs in deterministic polynomial time: O(m³).
• Overall: O(m³) nondeterministic running time.
CS 341: Chapter 7 7-77 CS 341: Chapter 7 7-78
Equivalent Definition of NP

Theorem 7.20
A language is in NP if and only if it is decided by some polynomial-time
nondeterministic TM.

Proof Idea:
• Recall language in NP has (deterministic) poly-time verifier.
• Given a poly-time verifier, build NTM that on input w,
  guesses the certificate c and then runs verifier on input ⟨w, c⟩.
  NTM runs in nondeterministic polynomial time.
• Given a poly-time NTM, build verifier with input ⟨w, c⟩, where
  certificate c tells NTM on input w which is accepting branch.
  Verifier runs in deterministic polynomial time.

Proof: “A ∈ NP” ⇒ “A Decided by Poly-time NTM”

• Let V be polynomial-time verifier for A.
  Assume V is DTM with n^k runtime, where n is length of input w.
• Using V as subroutine, construct NTM N as follows:
  N = “On input w of length n:
  1. Nondeterministically select string c of length at most n^k.
  2. Run V on input ⟨w, c⟩.
  3. If V accepts, accept; otherwise, reject.”
• NTM N runs in nondeterministic polynomial time.
  Verifier V runs in time n^k, so certificate c must have length ≤ n^k;
  otherwise, V can’t even read entire certificate.
  Stage 1 of NTM N takes O(n^k) nondeterministic time.
CS 341: Chapter 7 7-79 CS 341: Chapter 7 7-80
Proof: “A Decided by Poly-time NTM” ⇒ “A ∈ NP”

• Assume A decided by polynomial-time NTM N.
• Use N to construct polynomial-time verifier V as follows:
  V = “On input ⟨w, c⟩, where w and c are strings:
  1. Simulate N on input w, treating each symbol of c as
     a description of each step’s nondeterministic choice.
  2. If this branch of N’s computation accepts, accept;
     otherwise, reject.”
• V runs in deterministic polynomial time.
  NTM N originally runs in nondeterministic polynomial time.
  Certificate c tells NTM N how to compute, eliminating
  nondeterminism in N’s computation.

NTIME(t(n)) and NP

Definition:
  NTIME(t(n)) = { L | L is a language decided
                  by an O(t(n))-time NTM }

Corollary 7.22
  NP = ∪_{k≥0} NTIME(n^k).

Remark:
• NP is insensitive to choice of “reasonable” nondeterministic
  computational model.
  This is because all such models are polynomially equivalent.
CS 341: Chapter 7 7-81 CS 341: Chapter 7 7-82
Example: CLIQUE

[Figure: undirected graph G on nodes 1–7 containing a 5-clique.]

• Definition: A clique in a graph is a subgraph in which every two
  nodes are connected by an edge, i.e., clique is complete subgraph.
• Definition: A k-clique is a clique of size k.
• Decision problem: Given graph G and integer k,
  does G have k-clique?
  Universe Ω = { ⟨G, k⟩ | G is undirected graph, k integer }
  Language of decision problem
    CLIQUE = { ⟨G, k⟩ | G is undirected graph with k-clique } ⊆ Ω.
  For graph G above, ⟨G, 5⟩ ∈ CLIQUE, but ⟨G, 6⟩ ∉ CLIQUE.

CLIQUE ∈ NP

Theorem 7.24
CLIQUE ∈ NP.

Proof.
• The clique is the certificate c.
• Here is a verifier for CLIQUE:
  V = “On input ⟨⟨G, k⟩, c⟩:
  1. Test whether c is a set of k different nodes in G.
  2. Test whether G contains all edges connecting nodes in c.
  3. If both tests pass, accept; otherwise, reject.”
• If graph G (encoded as ⟨list of nodes, list of edges⟩) has m nodes, then
  Stage 1 takes O(k)O(m) = O(km) time.
  Stage 2 takes O(k²)O(m²) = O(k²m²) time.
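The two-stage verifier above can be sketched in Python (an illustration, not the handout's code; the small example graph is hypothetical):

```python
from itertools import combinations

def verify_clique(nodes, edges, k, c):
    """Polynomial-time verifier for CLIQUE: certificate c is a claimed k-clique."""
    edge_set = {frozenset(e) for e in edges}  # undirected edges
    # Stage 1: c must be k distinct nodes of G
    if len(set(c)) != k or not set(c) <= set(nodes):
        return False
    # Stage 2: every pair of nodes in c must be joined by an edge
    return all(frozenset(p) in edge_set for p in combinations(c, 2))

# Small example: triangle 1-2-3 plus a pendant node 4
nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(verify_clique(nodes, edges, 3, [1, 2, 3]))  # True
print(verify_clique(nodes, edges, 3, [1, 2, 4]))  # False: 1 and 4 not adjacent
```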
CS 341: Chapter 7 7-83 CS 341: Chapter 7 7-84
Example: SUBSET-SUM

• Decision problem: Given
  collection S of numbers x1, . . . , xk
  target number t
  does some subcollection of S add up to t?
• Universe Ω = { ⟨S, t⟩ | collection S = {x1, . . . , xk}, target t }.
• Language
    SUBSET-SUM = { ⟨S, t⟩ | S = {x1, . . . , xk} and
                   ∃ {y1, . . . , yℓ} ⊆ {x1, . . . , xk}
                   with ∑_{i=1}^{ℓ} yi = t } ⊆ Ω

Example:
• ⟨{4, 11, 16, 21, 27}, 32⟩ ∈ SUBSET-SUM as 11 + 21 = 32.
• ⟨{4, 11, 16, 21, 27}, 17⟩ ∉ SUBSET-SUM.

Remark: Collections are multisets: repetitions allowed.
If number x appears r times in S, then sum can include ≤ r copies of x.

SUBSET-SUM ∈ NP

Theorem 7.25
SUBSET-SUM ∈ NP.

Proof.
• The subset is the certificate c.
• Here is a verifier V for SUBSET-SUM:
  V = “On input ⟨⟨S, t⟩, c⟩:
  1. Test whether c is a collection of numbers that sum to t.
  2. Test whether every number in c belongs to S.
  3. If both tests pass, accept; otherwise, reject.”
• When |S| = k, |c| ≤ k, so V takes O(k²) time.
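The verifier above, including the multiset subtlety from the remark, can be sketched in Python (illustrative only, not the handout's code):

```python
from collections import Counter

def verify_subset_sum(S, t, c):
    """Polynomial-time verifier for SUBSET-SUM; certificate c is a
    subcollection. S and c are multisets, given as lists."""
    # Stage 1: the certificate must sum to the target
    if sum(c) != t:
        return False
    # Stage 2: c may use each number at most as often as it appears in S
    return not (Counter(c) - Counter(S))  # empty difference means c ⊆ S

print(verify_subset_sum([4, 11, 16, 21, 27], 32, [11, 21]))  # True
print(verify_subset_sum([4, 11, 16, 21, 27], 32, [16, 16]))  # False: only one 16 in S
```

`Counter` subtraction drops non-positive counts, so an empty result exactly captures "c is a subcollection of S with repetitions respected".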
CS 341: Chapter 7 7-85 CS 341: Chapter 7 7-86
Class coNP

• The complements of CLIQUE and SUBSET-SUM are not obviously
  members of NP.
    complement of CLIQUE = { ⟨G, k⟩ | undirected graph G does not have k-clique }
  Not clear how to define certificates so that we can verify in
  polynomial time.
• It seems harder to verify that something does not exist.

Definition: The class coNP consists of languages whose complements
belong to NP.
• Language A ∈ coNP iff complement of A ∈ NP.

Remark: Currently not known if coNP is different from NP.

P vs. NP Question

• Language in P has polynomial-time decider.
• Language in NP has polynomial-time verifier (or poly-time NTM).
• P ⊆ NP because each poly-time DTM is also poly-time NTM.

[Figure: two possibilities — P properly inside NP, or P = NP.]

• Answering question whether P = NP or not is one of the great
  unsolved mysteries in computer science and mathematics.
  Most computer scientists believe P ≠ NP; e.g., jigsaw puzzle.
  Clay Math Institute (www.claymath.org) has $1,000,000 prize to
  anyone who can prove either P = NP or P ≠ NP.
CS 341: Chapter 7 7-87 CS 341: Chapter 7 7-88
Remarks on P vs. NP Question

• If P ≠ NP, then
  languages in P are tractable (i.e., solvable in polynomial time)
  languages in NP − P are intractable (i.e., polynomial-time solution
  doesn’t exist).

[Figure: two possibilities — P properly inside NP, or P = NP.]

• If any NP language A ∉ P, then P ≠ NP.
  Nobody has been able to (dis)prove ∃ language ∈ NP − P.

NP-Complete

Informally, the class NP-Complete comprises languages that are
• “hardest” languages in NP
• “least likely” to be in P
• If any NP-Complete language A ∈ P, then P = NP.
  If P ≠ NP, then every NP-Complete language A ∉ P.
• Because NP-Complete ⊆ NP,
  if any NP-Complete language A ∉ P, then P ≠ NP.

We will give a formal definition of NP-Complete later.
CS 341: Chapter 7 7-89 CS 341: Chapter 7 7-90
Satisfiability Problem

• A Boolean variable is a variable that can take on only the values
  TRUE (1) and FALSE (0).
• Boolean operations
  AND: ∧
  OR: ∨
  NOT: ¬ or overbar (x̄ = ¬x)
• Examples
  0 ∧ 1 = 0
  0 ∨ 1 = 1
  ¬0 = 1

Satisfiability Problem

• A Boolean formula (or function) is an expression involving Boolean
  variables and operations, e.g.,
    φ1 = (x̄ ∧ y) ∨ (x ∧ z̄)
• Definition: A formula is satisfiable if some assignment of 0s and 1s
  to the variables makes the formula evaluate to 1.
  Example: φ1 above is satisfiable by (x, y, z) = (0, 1, 0).
  This assignment satisfies φ1.
  Example: The following formula is not satisfiable:
    φ2 = (x ∨ y) ∧ (z ∧ z̄) ∧ (y ∨ x)
• Decision problem SAT: Given Boolean fcn φ, is φ satisfiable?
  Universe Ω = { ⟨φ⟩ | φ is a Boolean fcn }
  Language of satisfiability problem:
    SAT = { ⟨φ⟩ | φ is a satisfiable Boolean function } ⊆ Ω
  so ⟨φ1⟩ ∈ SAT and ⟨φ2⟩ ∉ SAT.
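Satisfiability of a small formula can be checked by brute force over all 2^n assignments, as sketched below (an illustration in exponential time, not a polynomial-time method; the formulas are encoded as Python functions over 0/1 values):

```python
from itertools import product

def satisfiable(formula, num_vars):
    """Try all 2^n assignments of 0/1 to the variables."""
    return any(formula(*bits) for bits in product((0, 1), repeat=num_vars))

# phi1 = (¬x ∧ y) ∨ (x ∧ ¬z)
phi1 = lambda x, y, z: ((1 - x) & y) | (x & (1 - z))
# phi2 contains the clause (z ∧ ¬z), which can never evaluate to 1
phi2 = lambda x, y, z: (x | y) & (z & (1 - z)) & (y | x)

print(satisfiable(phi1, 3))  # True, e.g., (x, y, z) = (0, 1, 0)
print(satisfiable(phi2, 3))  # False
```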
CS 341: Chapter 7 7-91 CS 341: Chapter 7 7-92
More Definitions Related to Satisfiability

• A literal is a variable or negated variable: x or x̄
• A clause is several literals joined by ORs (∨): (x1 ∨ x3 ∨ x7)
  Clause is TRUE iff at least one of its literals is TRUE.
• A Boolean function is in conjunctive normal form, called a
  cnf-formula, if it comprises several clauses connected with ANDs (∧):
    (x1 ∨ x2 ∨ x3 ∨ x4) ∧ (x3 ∨ x5 ∨ x6) ∧ (x3 ∨ x6)
• 3cnf-formula has all clauses with 3 literals:
    (x1 ∨ x2 ∨ x3) ∧ (x3 ∨ x5 ∨ x6) ∧ (x3 ∨ x6 ∨ x4) ∧ (x2 ∨ x1 ∨ x5)
• Decision problem 3SAT: Given a 3cnf-formula φ, is φ satisfiable?
  Universe Ω = { ⟨φ⟩ | φ is 3cnf-formula }
  Language of decision problem:
    3SAT = { ⟨φ⟩ | φ is a satisfiable 3cnf-function } ⊆ Ω.
  φ ∈ 3SAT iff each clause in φ has at least one literal assigned 1.

Polynomial-Time Computable Functions

Definition: A polynomial-time computable function is
  f : Σ1∗ → Σ2∗
if ∃ Turing machine that
• starts with input w ∈ Σ1∗,
• halts with only f(w) ∈ Σ2∗ on the tape, and
• has runtime that is polynomial in |w| for w ∈ Σ1∗.
CS 341: Chapter 7 7-93 CS 341: Chapter 7 7-94
Polynomial-Time Mapping Reducible: A ≤P B

Consider
• language A defined over alphabet Σ1; i.e., universe Ω1 = Σ1∗.
• language B defined over alphabet Σ2; i.e., universe Ω2 = Σ2∗.

Definition: A is polynomial-time mapping reducible to B, written
  A ≤P B
if there is a polynomial-time computable function
  f : Σ1∗ → Σ2∗
such that, for every string w ∈ Σ1∗,
  w ∈ A ⇐⇒ f(w) ∈ B.

Polynomial-Time Mapping Reducible: A ≤P B

[Figure: f maps Ω1 = Σ1∗ into Ω2 = Σ2∗, carrying A into B.]

  w ∈ A ⇐⇒ f(w) ∈ B
  YES instance for problem A ⇐⇒ YES instance for problem B

• converts questions about membership in A to membership in B
• conversion is done efficiently (i.e., in polynomial time).
CS 341: Chapter 7 7-95 CS 341: Chapter 7 7-96
Polynomial-Time Mapping Reducible

Theorem 7.31
If A ≤P B and B ∈ P, then A ∈ P.

Proof.
• B ∈ P ⇒ ∃ TM M that is polynomial-time decider for B.
• A ≤P B ⇒ ∃ function f that reduces A to B in polynomial time.
• Define TM N that decides A ⊆ Ω1 as follows:
  N = “On input w ∈ Ω1,
  1. Compute f(w) ∈ Ω2.
  2. Run M on input f(w) and output whatever M outputs.”
• Analysis of Time Complexity of TM N:
  Each stage runs once.
  Stage 1 is polynomial because f is polynomial-time function.
  Stage 2 is polynomial because M is polynomial-time decider for B.

3SAT ≤P CLIQUE

Theorem 7.32
3SAT is polynomial-time mapping reducible to CLIQUE.

Proof Idea: Convert instance ⟨φ⟩ of 3SAT problem with k clauses into
instance ⟨G, k⟩ of clique problem: ⟨φ⟩ ∈ 3SAT iff ⟨G, k⟩ ∈ CLIQUE.
• Recall
  3SAT = { ⟨φ⟩ | 3cnf-fcn φ is satisfiable }
       ⊆ { ⟨φ⟩ | 3cnf-fcn φ } ≡ Ω3,
  CLIQUE = { ⟨G, k⟩ | undirected graph G has k-clique }
         ⊆ { ⟨G, k⟩ | undirected graph G, integer k } ≡ ΩC.
• Need poly-time reducing function f : Ω3 → ΩC.
CS 341: Chapter 7 7-97 CS 341: Chapter 7 7-98
3SAT is Mapping Reducible to CLIQUE

Proof Idea: Map instance ⟨φ⟩ ∈ Ω3 of 3SAT problem with k clauses
into instance ⟨G, k⟩ ∈ ΩC of clique problem:
  ⟨φ⟩ ∈ 3SAT iff ⟨G, k⟩ ∈ CLIQUE

• Suppose φ is a 3cnf-function with k clauses, e.g.,
  φ = (x1 ∨ x2 ∨ x3) ∧ (x3 ∨ x5 ∨ x6) ∧ (x3 ∨ x6 ∨ x4) ∧ (x2 ∨ x1 ∨ x5)
• Convert φ into a graph G as follows:
  Each literal in φ corresponds to a node in G.
  Nodes in G are organized into k triples t1, t2, . . . , tk.
  Triple ti corresponds to the ith clause in φ.
  Add edges between each pair of nodes, except
    within same triple
    between contradictory literals, e.g., x1 and x̄1

3SAT is Mapping Reducible to CLIQUE

Example: 3cnf-function with k = 3 clauses and m = 2 variables:
  φ = (x1 ∨ x1 ∨ x2) ∧ (x̄1 ∨ x̄2 ∨ x̄2) ∧ (x̄1 ∨ x2 ∨ x2)

Corresponding Graph:
[Figure: three triples of nodes, one per clause — Clause 1 = {x1, x1, x2},
Clause 2 = {x̄1, x̄2, x̄2}, Clause 3 = {x̄1, x2, x2} — with edges joining all
pairs except within a triple and between contradictory literals.]
CS 341: Chapter 7 7-99 CS 341: Chapter 7 7-100
3SAT is Mapping Reducible to CLIQUE

• 3cnf-formula with k = 3 clauses and m = 2 variables
  φ = (x1 ∨ x1 ∨ x2) ∧ (x̄1 ∨ x̄2 ∨ x̄2) ∧ (x̄1 ∨ x2 ∨ x2)
  is satisfiable by assignment x1 = 0, x2 = 1.
• Resulting graph has k-clique based on true literal from each clause:

[Figure: same graph as before with a 3-clique highlighted, one node per
clause corresponding to a satisfied literal.]

3SAT is Mapping Reducible to CLIQUE

Need to show 3cnf-fcn φ with k clauses is satisfiable iff G has a k-clique.
• Key Idea: φ ∈ 3SAT iff each clause in φ has ≥ 1 true literal.
• Recall: G has node triples corresponding to clauses in φ.
• Add edges between each pair of nodes, except
  within same triple
  between contradictory literals, e.g., x1 and x̄1
• k-clique in G
  must have 1 node from each triple
  cannot include contradictory literals
• If φ ∈ 3SAT, then choose node corresponding to satisfied literal in
  each clause to get k-clique in G.
• If ⟨G, k⟩ ∈ CLIQUE, then literals corresponding to k-clique satisfy φ.

Conclusion: φ ∈ 3SAT iff ⟨G, k⟩ ∈ CLIQUE, so 3SAT ≤P CLIQUE.
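The graph construction just described can be sketched in Python (an illustration of the reduction, not the handout's code; the literal encoding as `(variable, negated)` pairs is my own):

```python
from itertools import combinations

def threesat_to_clique(clauses):
    """Reduction f(phi) = <G, k>. A literal is (var_name, negated);
    a node is (clause index, position within clause)."""
    nodes = [(i, j) for i, clause in enumerate(clauses)
             for j in range(len(clause))]
    lit = lambda n: clauses[n[0]][n[1]]
    edges = [(u, v) for u, v in combinations(nodes, 2)
             if u[0] != v[0]  # no edges within the same triple
             # no edges between contradictory literals, e.g., x1 and ¬x1
             and not (lit(u)[0] == lit(v)[0] and lit(u)[1] != lit(v)[1])]
    return nodes, edges, len(clauses)

# phi = (x1 ∨ x1 ∨ x2) ∧ (¬x1 ∨ ¬x2 ∨ ¬x2) ∧ (¬x1 ∨ x2 ∨ x2)
phi = [[("x1", False), ("x1", False), ("x2", False)],
       [("x1", True), ("x2", True), ("x2", True)],
       [("x1", True), ("x2", False), ("x2", False)]]
nodes, edges, k = threesat_to_clique(phi)
print(k, len(nodes))  # 3 9
```

Since the assignment x1 = 0, x2 = 1 satisfies φ, the resulting graph contains a 3-clique (one satisfied literal per clause), matching the proof idea.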
CS 341: Chapter 7 7-101 CS 341: Chapter 7 7-102
Reducing 3SAT to CLIQUE Takes Polynomial Time

Claim: The mapping ⟨φ⟩ → ⟨G, k⟩ is polynomial-time computable.

Proof.
• Size of given 3cnf-function φ
  k clauses
  m variables.
• Constructing graph G
  G has 3k nodes
  Adding edges entails considering each pair of nodes in G:
    (3k choose 2) = 3k(3k − 1)/2 = O(k²)
Time to construct G is polynomial in size of 3cnf-function φ.

NP-Complete

Definition: Language B is NP-Complete if
1. B ∈ NP, and
2. B is NP-Hard: For every language A ∈ NP, we have A ≤P B.

[Figure: every language A1, . . . , A5 in NP reduces to B.]

Remarks:
• NP-Complete problems are the most difficult problems in NP.
• Definition: Language B is NP-Hard if B satisfies part 2 of
  NP-Complete.
CS 341: Chapter 7 7-103 CS 341: Chapter 7 7-104
NP-Complete and P vs. NP Question

Theorem 7.35
If there is an NP-Complete language B and B ∈ P, then P = NP.

Proof.
• Consider any language A ∈ NP.
• As A ∈ NP, defn of NP-completeness implies A ≤P B.
• Recall Theorem 7.31: If A ≤P B and B ∈ P, then A ∈ P.
• Because B ∈ P, it follows that also A ∈ P by Theorem 7.31.

Identifying New NP-Complete Problems from Known Ones

Theorem 7.36
If B is NP-Complete and B ≤P C for C ∈ NP, then C is NP-Complete.

[Figure: every language A1, . . . , A5 in NP reduces to B, and B reduces to C.]
CS 341: Chapter 7 7-105 CS 341: Chapter 7 7-106
Identifying New NP-Complete Problems from Known Ones

Recall Theorem 7.36:
If B is NP-Complete and B ≤P C for C ∈ NP, then C is NP-Complete.

Proof.
• Assume that C ∈ NP.
• Must show that every A ∈ NP satisfies A ≤P C.
• Because B is NP-Complete,
  every language in NP is polynomial-time reducible to B.
  Thus, A ≤P B when A ∈ NP.
• By assumption, B is polynomial-time reducible to C.
  Hence, B ≤P C.
• But polynomial-time reductions compose.
  So A ≤P B and B ≤P C imply A ≤P C.

Cook-Levin Theorem

• Once we have one NP-Complete problem,
  can identify others by using polynomial-time reduction (Theorem 7.36).
• But identifying the first NP-Complete problem requires some effort.
• Recall satisfiability problem:
  SAT = { ⟨φ⟩ | φ is a satisfiable Boolean function }

Theorem 7.37
SAT is NP-Complete.

Proof Idea:
• SAT ∈ NP because a polynomial-time NTM can guess assignment to
  formula φ and accept if assignment satisfies φ.
• Show that SAT is NP-Hard: A ≤P SAT for every language A ∈ NP.
CS 341: Chapter 7 7-107 CS 341: Chapter 7 7-108
Proof Outline of Cook-Levin Theorem

• Let A ⊆ Σ1∗ be a language in NP.
• Need to show that A ≤P SAT.
• For every w ∈ Σ1∗, we want a (CNF) formula φ such that
    w ∈ A iff ⟨φ⟩ ∈ SAT
  via a polynomial-time reduction that constructs φ from w.
• Let N be poly-time NTM that decides A in time at most n^k
  for input w with |w| = n.
• Basic approach:
  w ∈ A ⇐⇒ NTM N accepts input w
        ⇐⇒ ∃ accepting computation history of N on w
        ⇐⇒ ∃ Boolean function φ and variables x1, . . . , xm
            with φ(x1, . . . , xm) = TRUE

Proof Outline of Cook-Levin Theorem

Idea: “Satisfying assignments of φ”
  ↔ “accepting computation history of NTM N on w”

Step 1: Describe computations of NTM N on w by Boolean variables.
• Any computation history of N = (Q, Σ, Γ, δ, q0, qA, qR) on w with
  |w| = n has ≤ n^k configurations since assumed N runs in time n^k.
• Each configuration is an element of C^(n^k), where C = Q ∪ Γ ∪ {#}
  (mark left and right ends with #, where # ∉ Γ).
• Computation described by n^k × n^k “tableau”
  Each row of tableau represents one configuration.
  Each cell in tableau contains one element of C.
• Represent contents of cell (i, j) by |C| Boolean variables
    { xi,j,s | s ∈ C }
  xi,j,s = 1 means “cell (i, j) contains s” (variable is “on”)
CS 341: Chapter 7 7-109 CS 341: Chapter 7 7-110
Proof Outline of Cook-Levin Theorem

Tableau is an n^k × n^k table of configurations:

[Figure: n^k × n^k tableau. The first row is the start configuration
# q0 w1 w2 · · · wn ⊔ · · · ⊔ #; each subsequent row is the next
configuration, with # marking the left and right ends; the last row is
the n^k-th configuration. A 2 × 3 window of cells is highlighted.]

Proof Outline of Cook-Levin Theorem

Step 2: Express conditions for an accepting sequence of configurations of
NTM N on w by Boolean formulas:
  φcell   = “for each cell (i, j), exactly one s ∈ C with xi,j,s = 1”,
  φstart  = “first row of tableau is the starting configuration of N on w”,
  φaccept = “last row of tableau is an accepting configuration of N on w”,
  φmove   = “every 2 × 3 window is consistent with N’s transition fcn”.
For example,
  φcell = ⋀_{1≤i,j≤n^k} [ ( ⋁_{s∈C} xi,j,s ) ∧ ( ⋀_{s,t∈C, s≠t} ( x̄i,j,s ∨ x̄i,j,t ) ) ]
where the first part says “for each cell (i, j), ≥ 1 symbol used”
and the second part says “not ≥ 2 symbols used”.

Step 3: Show that each of the above formulas can be
• expressed by a formula of size O((n^k)²) = O(n^{2k})
• constructed from w in time polynomial in n = |w|.

Proof Outline of Cook-Levin Theorem

Step 4: Show that N has an accepting computation history on w iff
  φ = φcell ∧ φstart ∧ φaccept ∧ φmove
has a satisfying assignment of the xi,j,s variables.

Thus, we constructed φ using a polynomial-time reduction from A to SAT:
  A ≤P SAT
Because construction holds for every A ∈ NP, SAT is then NP-Complete.

3SAT is NP-Complete

Recall
  3SAT = { ⟨φ⟩ | φ is a satisfiable 3cnf-function }

Corollary 7.42
3SAT is NP-Complete.

Proof Idea:
Can modify proof that SAT is NP-Complete (Theorem 7.37) so that
resulting Boolean function is a 3cnf-function.
CS 341: Chapter 7 7-113 CS 341: Chapter 7 7-114
Proving NP-Completeness

• Tedious to prove a language C is NP-Complete using definition:
  1. C ∈ NP, and
  2. C is NP-Hard: For every language A ∈ NP, we have A ≤P C.
• Recall Theorem 7.36:
  If B is NP-Complete and B ≤P C for C ∈ NP,
  then C is NP-Complete.
• Typically prove a language C is NP-Complete by applying Thm 7.36:
  1. Prove that language C ∈ NP.
  2. Reduce a known NP-Complete problem B to C.
     At this point, have shown that SAT and 3SAT are NP-Complete.
  3. Show that reduction takes polynomial time.

CLIQUE is NP-Complete

CLIQUE = { ⟨G, k⟩ | G is an undirected graph with a k-clique }

[Figure: the 7-node example graph with a 5-clique.]

Corollary 7.43
CLIQUE is NP-Complete.

Proof.
• Theorem 7.24: CLIQUE ∈ NP.
• Corollary 7.42: 3SAT is NP-Complete.
• Theorem 7.32: 3SAT ≤P CLIQUE.
• Thus, Theorem 7.36 implies CLIQUE is NP-Complete.
CS 341: Chapter 7 7-115 CS 341: Chapter 7 7-116
Integer Linear Programming

Definition: An integer linear program (ILP) is
• set of variables y1, y2, . . . , yn, which must take integer values.
• set of m linear inequalities:
    a11 y1 + a12 y2 + · · · + a1n yn ≤ b1
    a21 y1 + a22 y2 + · · · + a2n yn ≤ b2
      ...       ...     ...      ...    ...
    am1 y1 + am2 y2 + · · · + amn yn ≤ bm
  where the aij and bi are given constants.
• In matrix notation, Ay ≤ b, with matrix A and vectors y, b:

    A = ( a11 a12 · · · a1n )     y = ( y1 )     b = ( b1 )
        ( a21 a22 · · · a2n )         ( y2 )         ( b2 )
        (  ...           ...  )         ( ... )         ( ... )
        ( am1 am2 · · · amn )         ( yn )         ( bm )

Integer Linear Programming

Example: Can transform ≥ and = relations into ≤ relations:
    5y1 − 2y2 + y3 ≤ 7
    y1 ≥ 2          ←→  −y1 ≤ −2
    y2 + 2y3 = 8    ←→  y2 + 2y3 ≤ 8  &  y2 + 2y3 ≥ 8
becomes ILP
     5 y1 − 2 y2 + 1 y3 ≤  7
    −1 y1 + 0 y2 + 0 y3 ≤ −2
     0 y1 + 1 y2 + 2 y3 ≤  8
     0 y1 − 1 y2 − 2 y3 ≤ −8
so
    A = (  5 −2  1 )     y = ( y1 )     b = (  7 )
        ( −1  0  0 )         ( y2 )         ( −2 )
        (  0  1  2 )         ( y3 )         (  8 )
        (  0 −1 −2 )                        ( −8 )
CS 341: Chapter 7 7-117 CS 341: Chapter 7 7-118
ILP is NP-Complete

• Decision problem: Given matrix A and vector b,
  is there an integer vector y such that Ay ≤ b?
    ILP = { ⟨A, b⟩ | matrix A and vector b satisfy Ay ≤ b
            with y an integer vector }
        ⊆ { ⟨A, b⟩ | matrix A, vector b } ≡ ΩI
• Example: The instance ⟨A, b⟩ ∈ ΩI, where
    A = ( 1 2 )    b = ( 3 )
        ( 2 4 )        ( 7 )
  satisfies Ay ≤ b for y = (1, 1), so ⟨A, b⟩ ∈ ILP.
• Example: The instance ⟨C, d⟩ ∈ ΩI, where
    C = (  2 0 )    d = (  3 )
        ( −2 0 )        ( −3 )
  requires 2y1 ≤ 3 & −2y1 ≤ −3, which means 2y1 = 3, so only
  non-integer solutions y = (3/2, y2) for any y2; thus, ⟨C, d⟩ ∉ ILP.
• Theorem: ILP is NP-Complete.

ILP ∈ NP

Proof.
• The certificate c is an integer vector satisfying Ac ≤ b.
• Here is a verifier for ILP:
  V = “On input ⟨⟨A, b⟩, c⟩:
  1. Test whether c is a vector of all integers.
  2. Test whether Ac ≤ b.
  3. If both tests pass, accept; otherwise, reject.”
• If Ay ≤ b has m inequalities and n variables, then
  Stage 1 takes O(n) time
  Stage 2 takes O(mn) time
  So verifier V runs in O(mn), which is polynomial in size of instance.

Now prove ILP is NP-Hard by showing 3SAT ≤P ILP.
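The ILP verifier above can be sketched in Python (illustrative only; A is represented as a list of rows):

```python
def verify_ilp(A, b, c):
    """Polynomial-time verifier for ILP: certificate c is a claimed
    integer solution to Ay <= b."""
    # Stage 1: every entry of c must be an integer — O(n)
    if not all(isinstance(v, int) for v in c):
        return False
    # Stage 2: check Ac <= b row by row — O(mn) arithmetic operations
    return all(sum(a * v for a, v in zip(row, c)) <= bi
               for row, bi in zip(A, b))

print(verify_ilp([[1, 2], [2, 4]], [3, 7], [1, 1]))    # True
print(verify_ilp([[2, 0], [-2, 0]], [3, -3], [1, 0]))  # False: -2·1 ≤ -3 fails
```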
CS 341: Chapter 7 7-119 CS 341: Chapter 7 7-120
3SAT ≤P ILP

• Reducing fcn f : Ω3 → ΩI:
    ⟨φ⟩ ∈ 3SAT iff f(⟨φ⟩) = ⟨A, b⟩ ∈ ILP
• Consider 3cnf-formula with m = 4 variables and k = 3 clauses:
    φ = (x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x4) ∧ (x2 ∨ x4 ∨ x3)
• Define integer linear program with
  2m = 8 variables y1, ȳ1, y2, ȳ2, y3, ȳ3, y4, ȳ4
    yi corresponds to xi
    ȳi corresponds to x̄i
  3 sets of inequalities for each pair (yi, ȳi), which must be integers:
    0 ≤ y1 ≤ 1,  0 ≤ ȳ1 ≤ 1,  y1 + ȳ1 = 1
    0 ≤ y2 ≤ 1,  0 ≤ ȳ2 ≤ 1,  y2 + ȳ2 = 1
    0 ≤ y3 ≤ 1,  0 ≤ ȳ3 ≤ 1,  y3 + ȳ3 = 1
    0 ≤ y4 ≤ 1,  0 ≤ ȳ4 ≤ 1,  y4 + ȳ4 = 1
  Exactly one of yi and ȳi is 1, and other is 0.

3SAT ≤P ILP

• Recall 3cnf-formula with m = 4 variables and k = 3 clauses:
    φ = (x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x4) ∧ (x2 ∨ x4 ∨ x3)
  φ satisfiable iff each clause evaluates to 1.
  A clause evaluates to 1 iff at least one literal in the clause equals 1.
  For each clause (xi ∨ xj ∨ xℓ), create inequality yi + yj + yℓ ≥ 1
  (using ȳi in place of yi for a negated literal x̄i).
  For our example, ILP has k = 3 inequalities of this type:
    y1 + y2 + y3 ≥ 1
    y1 + y2 + y4 ≥ 1
    y2 + y4 + y3 ≥ 1
  All true for binary variables iff 3cnf-function is satisfiable.
CS 341: Chapter 7 7-121 CS 341: Chapter 7 7-122
3SAT ≤P ILP

• Given 3cnf-formula:
    φ = (x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x4) ∧ (x2 ∨ x4 ∨ x3)
• Constructed ILP:
    0 ≤ y1 ≤ 1,  0 ≤ ȳ1 ≤ 1,  y1 + ȳ1 = 1
    0 ≤ y2 ≤ 1,  0 ≤ ȳ2 ≤ 1,  y2 + ȳ2 = 1
    0 ≤ y3 ≤ 1,  0 ≤ ȳ3 ≤ 1,  y3 + ȳ3 = 1
    0 ≤ y4 ≤ 1,  0 ≤ ȳ4 ≤ 1,  y4 + ȳ4 = 1
    y1 + y2 + y3 ≥ 1
    y1 + y2 + y4 ≥ 1
    y2 + y4 + y3 ≥ 1
• Note that:
  φ satisfiable ⇐⇒ constructed ILP has solution
  (with values of variables ∈ {0, 1})

Reducing 3SAT to ILP Takes Polynomial Time

• Given 3cnf-formula φ with
  m variables: x1, x2, . . . , xm
  k clauses
• Constructed ILP has
  2m (integer) variables: y1, ȳ1, y2, ȳ2, . . . , ym, ȳm
  6m + k inequalities:
    3 sets of inequalities for each pair yi, ȳi:
      0 ≤ yi ≤ 1,  0 ≤ ȳi ≤ 1,  yi + ȳi = 1,
    so total of 6m inequalities of this type (convert = into ≤ & ≥)
    For each clause in φ, ILP has corresponding inequality, e.g.,
      (x1 ∨ x2 ∨ x3) ←→ y1 + y2 + y3 ≥ 1,
    so total of k inequalities of this type.
Thus, size of ILP is polynomial in m and k.
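The counting argument above can be mirrored in a small Python reduction that emits the ILP in pure ≤-form (an illustration, not the handout's code; the literal encoding as `(index, negated)` pairs is my own):

```python
def threesat_to_ilp(num_vars, clauses):
    """Reduction f(phi) = <A, b>. Columns are ordered y1, ybar1, ..., ym, ybarm;
    a literal is (i, negated) with 1-based variable index i."""
    n = 2 * num_vars
    A, b = [], []
    def row(coeffs, rhs):               # coeffs: {column: coefficient}
        r = [0] * n
        for col, a in coeffs.items():
            r[col] = a
        A.append(r); b.append(rhs)
    for i in range(num_vars):
        y, ybar = 2 * i, 2 * i + 1
        row({y: -1}, 0); row({y: 1}, 1)          # 0 <= yi <= 1
        row({ybar: -1}, 0); row({ybar: 1}, 1)    # 0 <= ybar_i <= 1
        row({y: 1, ybar: 1}, 1)                  # yi + ybar_i <= 1
        row({y: -1, ybar: -1}, -1)               # yi + ybar_i >= 1
    for clause in clauses:
        # clause sum >= 1 becomes -(sum of its y-columns) <= -1
        coeffs = {}
        for i, negated in clause:
            col = 2 * (i - 1) + (1 if negated else 0)
            coeffs[col] = coeffs.get(col, 0) - 1
        row(coeffs, -1)
    return A, b

# phi = (x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x4) ∧ (x2 ∨ x4 ∨ x3): m = 4, k = 3
A, b = threesat_to_ilp(4, [[(1, 0), (2, 0), (3, 0)],
                           [(1, 0), (2, 0), (4, 0)],
                           [(2, 0), (4, 0), (3, 0)]])
print(len(A))  # 27 inequalities: 6m + k = 24 + 3
```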
CS 341: Chapter 7 7-123 CS 341: Chapter 7 7-124
Many Other NP-Complete Problems

• HAMPATH, SUBSET-SUM, . . .
• Travelling Salesman Problem (TSP): Given a graph G with weighted
  edges and a threshold value d, is there a tour that visits each node
  once and has total length at most d?
• Long-Path Problem: Given a graph G with weighted edges, two nodes
  s and t in G, and a threshold value d, is there a path (with no cycles)
  from s to t with length at least d?
• Scheduling Final Exams: Is there a way to schedule final exams in a
  d-day period so no student is scheduled to take 2 exams at same time?
• Minesweeper, Sudoku, Tetris
• See Garey and Johnson (1979), Computers and Intractability: A Guide
  to the Theory of NP-Completeness, for many reductions.

NP-Hard Optimization Problems

• Decision problems have YES/NO answers.
• Many decision problems have corresponding optimization version.
• Optimization version of NP-Complete problems are NP-Hard.

Problem      Decision Version                        Optimization Version
CLIQUE       Does a graph G have                     Find largest clique
             a clique of size k?
ILP          Does ∃ integer vector y                 Find integer vector y to
             such that Ay ≤ b?                       max d·y s.t. Ay ≤ b
TSP          Does a graph G have tour                Find min length tour
             of length ≤ d?
Scheduling   Given set of tasks and constraints,     Find min time schedule
             can we finish all tasks in time d?
CS 341: Chapter 7 7-125 CS 341: Chapter 7 7-126
Why are NP-Complete and NP-Hard Important?

• Suppose you are faced with a problem and you can’t come up with an
  efficient algorithm for it.
• If you can prove the problem is NP-Complete or NP-Hard,
  then there is no known efficient algorithm to solve it.
  No known polynomial-time algorithms for NP-Complete and
  NP-Hard problems.
• How to deal with an NP-Complete or NP-Hard problem?
  Approximation algorithm
  Probabilistic algorithm
  Special cases
  Heuristic

Summary of Chapter 7

• Time complexity: In terms of size n of input w,
  how many time steps are required by TM to solve problem?
• Big-O notation: f(n) = O(g(n)) if ∃ constants c, n0 > 0 such that
  f(n) ≤ c · g(n) for all n ≥ n0.
  g(n) is an asymptotic upper bound on f(n).
  Polynomials ak n^k + ak−1 n^{k−1} + · · · = O(n^k).
  Polynomial = O(n^c) for constant c ≥ 0
  Exponential = O(2^{n^δ}) for constant δ > 0
  Exponentials are asymptotically much bigger than any polynomial
• t(n)-time k-tape TM has equivalent O(t²(n))-time 1-tape TM.
• t(n)-time NTM has equivalent 2^{O(t(n))}-time 1-tape DTM.
• Strong Church-Turing Thesis: all reasonable variants of DTM are
  polynomial-time equivalent.
CS 341: Chapter 7 7-127 CS 341: Chapter 7 7-128
• Class P comprises problems that can be solved in polynomial time.
  P includes PATH, RELPRIME, CFLs (using dynamic programming).
• Class NP: problems that can be verified in deterministic polynomial
  time (equivalently, solved in nondeterministic polynomial time).
  NP includes all of P and HAMPATH, CLIQUE, SUBSET-SUM,
  3SAT, ILP.
• P vs. NP problem:
  Know P ⊆ NP: poly-time DTM is also poly-time NTM.
  Unknown if P = NP or P ≠ NP.

• Polynomial-time mapping reducible: A ≤P B if ∃ polynomial-time
  computable function f such that
    w ∈ A ⇐⇒ f(w) ∈ B.
• Defn: language B is NP-Complete if B ∈ NP and A ≤P B for all
  A ∈ NP.
  If any NP-Complete language B is in P, then P = NP.
  If any NP language B is not in P, then P ≠ NP.
  If B is NP-Complete and B ≤P C for C ∈ NP,
  then C is NP-Complete.
  Cook-Levin Theorem: SAT is NP-Complete.
  3SAT, CLIQUE, ILP, SUBSET-SUM, HAMPATH, etc. are all
  NP-Complete.