The Complexity of Theorem-Proving Procedures
Stephen A. Cook
University of Toronto
Summary
It is shown that any recognition problem solved by a polynomial time-bounded nondeterministic Turing
machine can be reduced to the problem of determining whether a given propositional formula is a
tautology. Here reduced means, roughly speaking, that the first problem can be solved deterministically
in polynomial time provided an oracle is available for solving the second. From this notion of
reducible, polynomial degrees of difficulty are defined, and it is shown that the problem of determining
tautologyhood has the same polynomial degree as the problem of determining whether the first of two
given graphs is isomorphic to a subgraph of the second. Other examples are discussed. A method of
measuring the complexity of proof procedures for the predicate calculus is introduced and discussed.
Throughout this paper, a set of strings¹ means a set of strings on some fixed, large, finite alphabet $\Sigma$.
This alphabet is large enough to include symbols for all sets described here. All Turing machines are
deterministic recognition devices, unless the contrary is explicitly stated.
1 Tautologies and Polynomial Reducibility
Let us fix a formalism for the propositional calculus in which formulas are written as strings on $\Sigma$. Since
we will require infinitely many proposition symbols (atoms), each such symbol will consist of a member
of $\Sigma$ followed by a number in binary notation to distinguish that symbol. Thus a formula of length n can
only have about n/log n distinct function and predicate symbols. The logical connectives are $\wedge$ (and),²
$\vee$ (or), and $\neg$ (not).
The set of tautologies (denoted by {tautologies}) is a certain recursive set of strings on this alphabet,
and we are interested in the problem of finding a good lower bound on its possible recognition times. We
provide no such lower bound here, but theorem 1 will give evidence that {tautologies} is a difficult set to
recognize, since many apparently difficult problems can be reduced to determining tautologyhood. By
reduced we mean, roughly speaking, that if tautologyhood could be decided instantly (by an oracle)
then these problems could be decided in polynomial time. In order to make this notion precise, we
introduce query machines, which are like Turing machines with oracles in [1].
Transliteration of the original 1971 typewritten paper by Tim Rohlfs (rev. 3). I transcribed the text essentially exactly
as Cook wrote it; frequently, I even kept inconsistent punctuation. Whenever my version differs from Cook's, I give notice.
Minor typesetting issues are corrected without notice.
¹ Cook underlines phrases he wants to emphasize. I will use italics for this purpose.
² Cook uses & (et) instead of $\wedge$. For better readability, I will use $\wedge$, which is common usage.
A query machine is a multitape Turing machine with a distinguished tape called the query tape, and
three distinguished states called the query state, yes state, and no state, respectively. If M is a query
machine and T is a set of strings, then a T-computation of M is a computation of M in which initially M
is in the initial state and has an input string w on its input tape, and each time M assumes the query state
there is a string u on the query tape, and the next state M assumes is the yes state if $u \in T$ and the no state
if $u \notin T$. We think of an oracle, which knows T, placing M in the yes state or no state.
Definition. A set S of strings is P-reducible (P for polynomial) to a set T of strings iff there is some query
machine M and a polynomial Q(n) such that for each input string w, the T-computation of M with input
w halts within Q(|w|) steps (|w| is the length of w) and ends in an accepting state iff $w \in S$.
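As a minimal illustration of this definition (not taken from the paper), the Python sketch below models the oracle as a function argument and reduces satisfiability to {tautologies} with a single query: a formula F is satisfiable iff its negation is not a tautology. The nested-tuple formula encoding and the names `atoms`, `evaluate`, `brute_force_tautology_oracle` and `satisfiable_via_oracle` are assumptions made for illustration; the brute-force function merely stands in for the oracle and itself takes exponential time.

```python
from itertools import product

# Hypothetical encoding (not Cook's string format):
#   ("atom", name), ("not", f), ("and", f, g), ("or", f, g)

def atoms(f):
    """Return the set of atom names occurring in formula f."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(sub) for sub in f[1:]))

def evaluate(f, assignment):
    """Evaluate formula f under a dict mapping atom names to booleans."""
    op = f[0]
    if op == "atom":
        return assignment[f[1]]
    if op == "not":
        return not evaluate(f[1], assignment)
    if op == "and":
        return evaluate(f[1], assignment) and evaluate(f[2], assignment)
    if op == "or":
        return evaluate(f[1], assignment) or evaluate(f[2], assignment)
    raise ValueError(f"unknown connective: {op}")

def brute_force_tautology_oracle(f):
    """Stand-in for the oracle: checks every truth assignment (exponential)."""
    names = sorted(atoms(f))
    return all(evaluate(f, dict(zip(names, values)))
               for values in product([False, True], repeat=len(names)))

def satisfiable_via_oracle(f, oracle):
    """The 'query machine': one oracle query, constant work otherwise."""
    return not oracle(("not", f))
```

Apart from the oracle call, `satisfiable_via_oracle` does only a constant amount of work, which is the sense in which the reduction is polynomial.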
It is not hard to see that P-reducibility is a transitive relation. Thus the relation E on sets of strings,
given by $(S, T) \in E$ iff each of S and T is P-reducible to the other, is an equivalence relation. The
equivalence class containing a set S will be denoted by deg(S) (the polynomial degree of difficulty of S).
Definition. We will denote deg({0}) by $\mathcal{L}_*$
$\bigwedge_{1 \le i < j \le T} (\neg S_{i,t} \vee \neg S_{j,t})$.
For $1 \le s \le T$ and $1 \le t \le T$, $C_{s,t}$ asserts that at square s and time t there is one and only one symbol.
C is the conjunction of all the $C_{s,t}$.
D asserts that for each t there is one and only one state.
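E asserts the initial conditions are satisfied:
$$E = Q^{0}_{1} \wedge S_{1,1} \wedge P^{i_1}_{1,1} \wedge P^{i_2}_{2,1} \wedge \dots \wedge P^{i_n}_{n,1} \wedge P^{1}_{n+1,1} \wedge \dots \wedge P^{1}_{T,1}$$
where $w = \sigma_{i_1} \dots \sigma_{i_n}$, $q_0$ is the initial state and $\sigma_1$ is the blank symbol.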
F, G, and H assert that for each time t the values of the P's, Q's and S's are updated properly. For
example, G is the conjunction over all t, i, j of $G^t_{i,j}$, where $G^t_{i,j}$ asserts that if at time t the machine is in
state $q_i$ scanning symbol $\sigma_j$, then at time t + 1 the machine is in state $q_k$, where $q_k$ is the state given by
the transition function for M.⁵
$$G^t_{i,j} = \bigwedge_{s=1}^{T} \left( \neg Q^{i}_{t} \vee \neg S_{s,t} \vee \neg P^{j}_{s,t} \vee Q^{k}_{t+1} \right).$$
Finally, the formula I asserts that the machine reaches an accepting state at some time. The machine
M should be modified so that it continues to compute in some trivial fashion after reaching an accepting
state, so that A(w) will be satisfied.
It is now straightforward to verify that A(w) has all the properties asserted in the first paragraph of the
proof.
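The "one and only one" conditions asserted by C and D can be written in conjunctive form as one clause requiring at least one variable to be true, plus one clause per pair of variables forbidding both at once. The Python sketch below illustrates that encoding under stated assumptions: the literal representation `(name, polarity)`, the helper names `exactly_one` and `satisfies`, and the variable names are hypothetical and are not Cook's exact formulas.

```python
from itertools import combinations, product

def exactly_one(variables):
    """CNF clauses that hold iff exactly one of the given variables is true.

    A clause is a list of (name, polarity) literals.
    """
    at_least_one = [[(v, True) for v in variables]]
    at_most_one = [[(a, False), (b, False)] for a, b in combinations(variables, 2)]
    return at_least_one + at_most_one

def satisfies(clauses, assignment):
    """True iff every clause contains a literal made true by the assignment."""
    return all(any(assignment[name] == polarity for name, polarity in clause)
               for clause in clauses)

# Hypothetical example: four tape squares at one fixed time t.
squares = [f"S_{s},t" for s in range(1, 5)]
clauses = exactly_one(squares)

# Enumerate all assignments and keep those satisfying the clauses.
models = [values for values in product([False, True], repeat=len(squares))
          if satisfies(clauses, dict(zip(squares, values)))]
```

For T variables this produces 1 + T(T-1)/2 clauses, so each such condition has size polynomial in T.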
⁴ Here, the original paper mentions $\{q_1, \dots, q_s\}$ instead of $\{q_1, \dots, q_r\}$. There's a hardly readable, handwritten r below the
s, and Cook subsequently does not refer to s but to r; so it is likely that $q_r$ is correct.
⁵ Following this sentence, the paper contains some handwritten annotation I cannot decipher.
Theorem 2. The following sets are P-reducible to each other in pairs (and hence each has the same
polynomial degree of difficulty): {tautologies}, {DNF tautologies}, $D_3$, {subgraph pairs}.
Remark. We have not been able to add either {primes} or {isomorphic graphpairs} to the above list.
To show {tautologies} is P-reducible to {primes} would seem to require some deep results in number
theory, while showing {tautologies} is P-reducible to {isomorphic graphpairs} would probably upset a
conjecture of Corneil's [4] from which he deduces that the graph isomorphism problem can be solved in
polynomial time.
Incidentally, it is⁶ not hard to see from the Davis-Putnam procedure [5] that the set $D_2$ consisting of all
DNF tautologies with at most two conjuncts per disjunct, is in $\mathcal{L}_*$. Hence $D_2$ cannot be added to the list
in theorem 2 (unless all sets in the list are in $\mathcal{L}_*$).
Proof of theorem 2. By the corollary to theorem 1, each of the sets is P-reducible to {DNF tautologies}.
Since obviously {DNF tautologies} is P-reducible to {tautologies}, it remains to show {DNF tautologies}
is P-reducible to $D_3$ and $D_3$ is P-reducible to {subgraph pairs}.
To show {DNF tautologies} is P-reducible to $D_3$, let A be a proposition formula in disjunctive normal
form. Say $A = B_1 \vee B_2 \vee \dots \vee B_k$, where $B_1 = R_1 \wedge \dots \wedge R_s$, and each $R_i$ is an atom or negation of an atom,
and $s > 3$. Then A is a tautology if and only if $A'$ is a tautology where
$$A' = (P \wedge R_3 \wedge \dots \wedge R_s) \vee (\neg P \wedge R_1 \wedge R_2) \vee B_2 \vee \dots \vee B_k,$$
where P is a new atom. Since we have reduced the number of conjuncts in $B_1$, this process may be
repeated until eventually a formula is found with at most three conjuncts per disjunct. Clearly the entire
process is bounded in time by a polynomial in the length of A.
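The splitting step just described can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than part of the paper: a DNF formula is represented as a list of disjuncts, each a list of `(atom, polarity)` literals; the helper names `split_once`, `to_d3` and `is_tautology` are hypothetical; fresh atoms are named `_P0`, `_P1`, ... on the assumption that these names do not already occur in the input; and `is_tautology` is a brute-force check used only to compare the formula before and after.

```python
from itertools import product

def split_once(dnf, fresh):
    """Apply one step of the construction to the first disjunct with > 3 literals."""
    for idx, disjunct in enumerate(dnf):
        if len(disjunct) > 3:
            r1, r2, rest = disjunct[0], disjunct[1], disjunct[2:]
            first = [(fresh, True)] + rest       # P & R3 & ... & Rs
            second = [(fresh, False), r1, r2]    # not-P & R1 & R2
            return dnf[:idx] + [first, second] + dnf[idx + 1:]
    return None  # every disjunct already has at most three literals

def to_d3(dnf):
    """Repeat the step until no disjunct has more than three literals."""
    count = 0
    while True:
        nxt = split_once(dnf, f"_P{count}")
        if nxt is None:
            return dnf
        dnf, count = nxt, count + 1

def is_tautology(dnf):
    """Brute-force check (exponential): some disjunct is true under every assignment."""
    names = sorted({name for d in dnf for name, _ in d})
    for values in product([False, True], repeat=len(names)):
        a = dict(zip(names, values))
        if not any(all(a[n] == pol for n, pol in d) for d in dnf):
            return False
    return True

# (a & b & c & d) v ~a v ~b v ~c v ~d is a tautology.
taut = [[("a", True), ("b", True), ("c", True), ("d", True)],
        [("a", False)], [("b", False)], [("c", False)], [("d", False)]]
# (a & b & c & d & e) v ~a is not.
non_taut = [[("a", True), ("b", True), ("c", True), ("d", True), ("e", True)],
            [("a", False)]]
```

Each step replaces a disjunct of s literals by one of s - 1 literals and one of 3 literals, so the loop runs at most s - 3 times per long disjunct.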
It remains to show that $D_3$ is P-reducible to {subgraph pairs}. Suppose A is a formula in disjunctive
normal form with three conjuncts per disjunct. Thus $A = C_1 \vee \dots \vee C_k$, where $C_i = R_{i1} \wedge R_{i2} \wedge R_{i3}$, and each
$R_{ij}$ is an atom or a negation of an atom. Now let $G_1$ be the complete graph with vertices $\{v_1, v_2, \dots, v_k\}$,
and let $G_2$ be the graph with vertices $\{u_{ij}\}$, $1 \le i \le k$, $1 \le j \le 3$, such that $u_{ij}$ is connected by an edge to
$u_{rs}$ if and only if $i \ne r$ and the two literals $(R_{ij}, R_{rs})$ do not form an opposite pair (that is, they are neither
of the form $(P, \neg P)$ nor of the form $(\neg P, P)$). Thus there is a falsifying truth assignment to the formula
A iff there is a graph homomorphism $\varphi : G_1 \to G_2$ such that for each i, $\varphi(i) = u_{ij}$ for some j. (The
homomorphism $\varphi$ tells for each i which of $R_{i1}, R_{i2}, R_{i3}$ should be falsified, and the selective lack of edges
in $G_2$ guarantees that the resulting truth assignment is consistently specified.)
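The construction of the second graph can be sketched in Python as follows, assuming that vertices belonging to different disjuncts are joined exactly when their literals are not opposite. The representation (a list of k triples of `(atom, polarity)` literals) and the names `opposite`, `build_g2` and `falsifying_assignment` are assumptions for illustration. The search over choices is brute force and exponential in k; it only demonstrates the correspondence between consistent choices and falsifying assignments and is not a polynomial algorithm.

```python
from itertools import product

def opposite(l1, l2):
    """True iff the literals are P and not-P for the same atom."""
    return l1[0] == l2[0] and l1[1] != l2[1]

def build_g2(dnf3):
    """Vertices (i, j) for literal j of disjunct i; edges as described above."""
    vertices = [(i, j) for i in range(len(dnf3)) for j in range(3)]
    edges = {frozenset((u, v)) for u in vertices for v in vertices
             if u[0] != v[0] and not opposite(dnf3[u[0]][u[1]], dnf3[v[0]][v[1]])}
    return vertices, edges

def falsifying_assignment(dnf3):
    """Pick one vertex per disjunct, pairwise adjacent; return the assignment or None."""
    _, edges = build_g2(dnf3)
    k = len(dnf3)
    for choice in product(range(3), repeat=k):   # v_i is mapped to u_{i, choice[i]}
        picked = [(i, choice[i]) for i in range(k)]
        if all(frozenset((picked[a], picked[b])) in edges
               for a in range(k) for b in range(a + 1, k)):
            # Falsify each chosen literal: its atom gets the opposite polarity.
            return {dnf3[i][j][0]: not dnf3[i][j][1] for i, j in picked}
    return None

# All eight sign patterns over a, b, c form a tautology in D_3.
full = [[("a", x), ("b", y), ("c", z)] for x, y, z in product([False, True], repeat=3)]
# Dropping the disjunct a & b & c leaves a formula falsified by a = b = c = True.
partial = [d for d in full if d != [("a", True), ("b", True), ("c", True)]]
```

The returned dictionary may be a partial assignment in general; any extension of it to the remaining atoms still falsifies every disjunct.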
In order to guarantee that a one-one homomorphism $\varphi : G_1 \to G_2$ has the property that for each i,
$\varphi(i) = u_{ij}$ for some j, we modify $G_1$ and $G_2$ as follows. We select graphs $H_1, H_2, \dots, H_k$ which are
sufficiently distinct from each other that if $G_1'$ is formed from $G_1$ by attaching $H_i$ to $v_i$, $1 \le i \le k$, and
$G_2'$ is formed from $G_2$ by attaching $H_i$ to each of $u_{i1}$ and $u_{i2}$ and $u_{i3}$, $1 \le i \le k$, then every one-one
homomorphism $\varphi : G_1' \to G_2'$ has the property just stated. It is not hard to see such a construction can be
carried out in polynomial time. Then $G_1'$ can be embedded in $G_2'$ if and only if $A \notin D_3$. This completes
the proof of theorem 2.
2 Discussion
Theorem 1 and its corollary give strong evidence that it is not easy to determine whether a given
proposition formula is a tautology, even if the formula is in normal disjunctive form. Theorems 1 and
2 together suggest that it is fruitless to search for a polynomial decision procedure for the subgraph
problem, since success would bring polynomial decision procedures to many other apparently intractable
problems. Of course the same remark applies to any combinatorial problem to which {tautologies} is
P-reducible.
⁶ The original paper contains a typing error here ("it" instead of "it is").
Furthermore, the theorems suggest that {tautologies} is a good candidate for an interesting set not in
$\mathcal{L}_*$, and I feel it is worth spending considerable effort trying to prove this conjecture. Such a proof would
be a major breakthrough in complexity theory.
In view of the apparent complexity of {DNF tautologies}, it is interesting to examine the Davis-Putnam
procedure [5]. This procedure was designed to determine whether a given formula in conjunctive
normal form is satisfiable, but of course the dual procedure determines whether a given formula in
disjunctive normal form is a tautology. I have not yet been able to find a series of examples showing
the procedure (treated sympathetically to avoid certain pitfalls) must require more than polynomial time.
Nor have I found an interesting upper bound for the time required.
If we let strings represent natural numbers (or k-tuples of natural numbers) using m-adic or other
suitable notation, then the notions in the preceding sections can be made to apply to sets of numbers (or
k-place relations on numbers). It is not hard to see that the set of relations accepted in polynomial time
by some nondeterministic Turing machine is precisely the set $\mathcal{L}^+$ of relations of the form
$$(\exists y \le g_k(x))\, R(x, y) \qquad (1)$$
where $g_k(x) = 2^{(l(\max x))^k}$, $l(z)$ is the dyadic length of z, and $R(x, y)$ is an $\mathcal{L}_*$ relation ($\mathcal{L}^+$ is the class
of extended positive rudimentary relations of Bennett [6]). If we remove the bound on the quantifier
in formula (1), the class $\mathcal{L}^+$ would become the class of recursively enumerable sets. Thus if $\mathcal{L}^+$ is
the analog of the class of r.e. sets, then determining tautologyhood is the analog of the halting problem;
since, according to theorem 1, {tautologies} has the complete $\mathcal{L}^+$ degree just as the halting problem has
the complete r.e. degree. Unfortunately, the diagonal argument which shows the halting problem is not
recursive apparently cannot be adapted to show {tautologies} is not in $\mathcal{L}_*$.
3 The Predicate Calculus
Formulas in the predicate calculus are represented by strings in a manner similar to the propositional
calculus. In addition to the symbols for the latter, we need the quantifier symbols $\forall$ and $\exists$, and symbols
for forming an infinite list of individual variables, and infinite lists of function and predicate symbols of
each order (of course the underlying alphabet is still finite).
Suppose Q is a procedure which operates on the above formulas and which terminates on a given input
formula A iff A is unsatisfiable. Since there is no decision procedure for satisfiability in the predicate
calculus, it follows that there is no recursive function T such that if A is unsatisfiable, then Q will
terminate within T(n) steps, where n is the length of A. How then does one appraise the efficiency of the
procedure?
We will take the following approach. Most automatic theorem provers depend on the Herbrand theorem,
which states briefly that a formula A is unsatisfiable if and only if some conjunction of substitution
instances of the functional form fn(A) of A is truth functionally inconsistent. Suppose we order the terms
in the Herbrand universe of fn(A) according to rank, and then order in a natural way the substitution
instances of fn(A) from the Herbrand universe. The ordering should be such that in general substitution
instances which use terms with greater rank follow substitution instances which use terms of lesser rank.
Let $A_1, A_2, \dots$ be these substitution instances in order.
Definition. If A is unsatisfiable, then $\varphi(A)$ is the least k such that $A_1 \wedge A_2 \wedge \dots \wedge A_k$ is truth-functionally
inconsistent. If A is satisfiable, then $\varphi(A)$ is undefined.
Now let Q be the procedure which, given A, computes the sequence $A_1, A_2, \dots$ and for each i, tests
whether $A_1 \wedge \dots \wedge A_i$ is truth-functionally consistent. If the answer is ever no, the procedure terminates
successfully. Then clearly there is a recursive T(k) such that for all k and all formulas A, if the length
of A $\le k$ and $\varphi(A) \le k$, then Q will terminate within T(k) steps. We suggest that the function T(k) is a
measure of the efficiency of Q.
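The loop performed by the procedure Q can be sketched in Python once the substitution instances are available. The sketch below assumes, purely for illustration, that each instance has already been generated and is given as a list of ground clauses (lists of `(ground_atom, polarity)` literals); generating instances from the Herbrand universe is not shown. The names `consistent` and `least_inconsistent_prefix` are hypothetical, and the consistency test is a brute-force truth-table check.

```python
from itertools import product

def consistent(clauses):
    """Brute-force truth-functional consistency test (exponential in the number of atoms)."""
    names = sorted({name for clause in clauses for name, _ in clause})
    for values in product([False, True], repeat=len(names)):
        a = dict(zip(names, values))
        if all(any(a[n] == pol for n, pol in clause) for clause in clauses):
            return True
    return False

def least_inconsistent_prefix(instances):
    """Least k such that the conjunction of the first k instances is inconsistent.

    Returns None if no prefix of the supplied (finite) list is inconsistent;
    the real procedure would keep generating instances instead of stopping.
    """
    conjunction = []
    for k, instance in enumerate(instances, start=1):
        conjunction.extend(instance)
        if not consistent(conjunction):
            return k
    return None

# Hypothetical ground instances: P(a), P(a) -> P(f(a)), not P(f(a)).
instances = [
    [[("P(a)", True)]],
    [[("P(a)", False), ("P(f(a))", True)]],
    [[("P(f(a))", False)]],
]
```

On a satisfiable input the procedure described in the text never terminates; the finite list here sidesteps that only because the example is fixed in advance.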
For convenience, all procedures in this section will be realized on single tape Turing machines, which
we shall call simply machines.
Definition. Given a machine $M_Q$ and recursive function $T_Q(k)$, we will say $M_Q$ is of type Q and runs
within time $T_Q(k)$ provided that when $M_Q$ starts with a predicate formula A written on its tape, then $M_Q$
halts if and only if A is unsatisfiable, and for all k, if $\varphi(A) \le k$ and $|A| \le \log_2 k$, then $M_Q$ halts within
$T_Q(k)$ steps. In this case we will also say that $T_Q(k)$ is of type Q. Here |A| is the length of A.
The reason for the condition $|A| \le \log_2 k$ instead of $|A| \le k$, is that with the latter condition, finding a
lower bound for $T_Q(k)$ would be nearly equivalent to finding a lower bound for the decision problem for
the propositional calculus. In particular, theorem 3A would become obvious and trivial.
Theorem 3. A) For any $T_Q(k)$ of type Q,
$$\frac{T_Q(k)}{\sqrt{k}/(\log k)^2} \qquad (2)$$
is unbounded.
B) There is a $T_Q(k)$ of type Q such that
$$T_Q(k) \le k \, 2^{k(\log k)^2}.$$
Outline of proof. A) Given any machine M, one can construct a predicate formula A(M) which is
satisfiable if and only if M never halts when starting on a blank tape. This is done along the lines
described in Wang [7] in the proof which reduces the halting problem to the decision problem for
the predicate calculus. Further, if M halts in s steps, then $\varphi(A(M)) \le s^2$. Thus, if, contrary to (2),
$T_Q(k) = O(\sqrt{k}/\log^2 k)$, then a modification of $M_Q$ could verify in only
$$O(\sqrt{s^2}/\log^2 s^2) = O(s/\log^2 s)$$
steps that M halted in s steps (provided $m \le \log s^2$, where m is the length of A(M)). A diagonal
argument (see [8] p. 153) shows that this is impossible in general.
B) The machine $M_Q$ operates in time $T_Q$ by following the procedure outlined at the beginning of
this section. Note that the formula $A_1 \wedge A_2 \wedge \dots \wedge A_k$ has length $O(k \log^2 k)$, since we can assume
$|A| \le \log k$.
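The simplification used in part A of the outline can be spelled out as a short worked step. It assumes that the bound in (2) is $\sqrt{k}/(\log k)^2$, that $\log^2 x$ abbreviates $(\log x)^2$, and that $k = s^2$ is substituted; these readings are inferred from the surrounding text rather than stated explicitly.

```latex
\[
\frac{\sqrt{s^{2}}}{\log^{2}\!\left(s^{2}\right)}
  = \frac{s}{(2\log s)^{2}}
  = \frac{s}{4\log^{2} s}
  = O\!\left(\frac{s}{\log^{2} s}\right).
\]
```

The constant factor 1/4 is absorbed by the O-notation, which is why the two expressions in the outline agree.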
Theorem 4. If the set S of strings is accepted by a nondeterministic machine within time $T(n) = 2^n$, and
if $T_Q(k)$ is an honest (i.e. real-time countable) function of type Q, then there is a constant K so S can be
recognized by a deterministic machine within time $T_Q(K 8^n)$.
Proof. Suppose $M_1$ is a nondeterministic machine which accepts S in time $2^n$. Let $M_2$ be a nondeterministic
machine which simulates $M_1$ for exactly $2^n$ steps and then halts, unless $M_1$ accepts the input,
in which case $M_2$ computes forever. Thus for all strings w, if $w \in S$ then there is a computation for
which $M_2$ with input w fails to halt, and if $w \notin S$, then $M_2$ with input w halts within $4^n$ steps for all
computations. Now given w of length n, we may construct a formula A(w) of length O(n) such that A(w)
is satisfiable if and only if $M_1$ accepts w. (A(w) is constructed in a way similar to A(M) in the proof of
3A.)⁷ Further, if $M_2$ halts within $4^n$ steps for all possible computations, then $\varphi(A(w)) \le K (4^n)^2 = K 8^n$.
Thus, a deterministic machine M can be constructed to determine whether $w \in S$ by presenting $M_Q$ with
input A(w). If no result appears within $T_Q(K 8^n)$ steps, then $w \in S$, and otherwise $w \notin S$.
4 More Discussion
There is a large gap between the lower bound of $\sqrt{k}/(\log k)^2$ for time functions $T_Q(k)$ given in theorem
3A and a possible
$$T_Q(k) = k \, 2^{k(\log k)^2}$$
given in 3B. However, there are reasons for the gap. For example, if we could improve the result in 3B
and find a $T_Q(k)$ bounded by a polynomial in k, then by theorem 4 we could simulate a nondeterministic
$2^n$ time bounded machine deterministically in time $p(2^n)$ for some polynomial p. This is contrary to
experience which indicates deterministic simulation of a nondeterministic T(n) time bounded machine
requires time $k^{T(n)}$ in general.
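A cost of this exponential form arises naturally when a deterministic program simply tries every sequence of nondeterministic choices. The Python sketch below illustrates that naive strategy; reading k as a bound on the number of choices per step is an interpretation, not something the text states, and the names `accepts_deterministically`, `step` and `accepting` as well as the toy machine are hypothetical.

```python
from itertools import product

def accepts_deterministically(step, start, accepting, branching, time_bound):
    """Try every choice sequence: up to branching ** time_bound simulated runs.

    step(config, choice) returns the next configuration;
    accepting(config) says whether a final configuration accepts.
    Returns (accepted, number_of_runs_tried).
    """
    runs = 0
    for choices in product(range(branching), repeat=time_bound):
        runs += 1
        config = start
        for c in choices:
            config = step(config, c)
        if accepting(config):
            return True, runs
    return False, runs

# Toy nondeterministic machine: at step i, guess whether to add items[i] to a running sum.
items = [3, 5, 7]

def step(config, choice):
    index, total = config
    return (index + 1, total + choice * items[index])

found = accepts_deterministically(step, (0, 0), lambda cfg: cfg[1] == 12, 2, len(items))
missing = accepts_deterministically(step, (0, 0), lambda cfg: cfg[1] == 4, 2, len(items))
```

When no accepting run exists, all branching ** time_bound sequences are examined, which is the worst case the paragraph above refers to.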
On the other hand, if we could push up the lower bound given in theorem 3A and show
$$\frac{T_Q(k)}{2^k}$$
is unbounded, then we could conclude {tautologies} $\notin \mathcal{L}_*$