Computer Runtimes and The Length of Proofs
H. Zenil
Abstract. This paper is an experimental exploration of the relationship between the runtimes of Turing machines and the length of proofs in formal axiomatic systems. We compare the number of halting Turing machines of a given size to the number of provable theorems of first-order logic of a given size, and the runtime of the longest-running Turing machine of a given size to the proof length of the most-difficult-to-prove theorem of a given size. It is suggested that theorem provers are subject to the same non-linear tradeoff between time and size as computer programs are, affording the possibility of determining optimal timeouts and waiting times in automatic theorem proving. I provide the statistics for some small choices of parameters for both of these systems.

Keywords: halting problem, halting probability, proof length, automatic theorem proving, Busy Beaver problem, program-size complexity, small Turing machines.
1 Introduction
While profound connections between computer programs and mathematical proofs have been studied and are known (e.g. the Curry-Howard correspondence), little has been done to connect the two fields at the level of empirical practice. We present an experimental approach to the question of optimal proving times for automatic theorem provers, which bears out Calude and Stay's theoretical findings that programs either stop quickly or never halt [4]. Working with self-delimiting programs, that is, programs that are not the beginning of any other valid program, Chaitin defined the complexity of the runtime of a program which eventually halts, a quantity that we cannot effectively compute [5], and Calude and Stay have recently proven [4] that even though short programs can run for a very long time, long-running programs are the scarcest, because most programs will stop rather quickly, if they ever do, depending on their length. Thus, the probability of a machine halting decreases the longer it takes to halt, if it ever does. Just as Calude and Stay suggest that most Turing machines are fully determined qua termination by a small number of computational steps, and that the
error margin drops drastically, in [8] we have also shown that Turing machines are fully determined qua extensionality by a small number of initial input values (a theoretical value for the error margin has yet to be determined, but the very few data points that we could generate suggest that it follows at least a polynomial distribution). We undertake an experimental approach to the runtimes of deterministic Turing machines of up to three states and two symbols, in connection with, and as empirical evidence bearing on, Calude and Stay's theoretical results. Then we undertake the same experimental approach to formulas of predicate calculus, in order to find some evidence (if any) in favour of a possible similar non-linear phenomenon in the distribution of proof lengths of (dis)proven theorems in random axiom systems. Traditional intuition might make one think this an ill-fated approach, on the one hand because undecidability would interfere in any such experimental attempt, and on the other because small systems may say more about design choices than about important results. Even though such limiting effects may appear right away, one can partially circumvent them (as the Busy Beaver problem does) in an effort tantamount to other interesting experiments, including some of Calude's own interest [3] and of my own [7], the latter providing useful applications for the evaluation of the algorithmic complexity of short strings, which is difficult to calculate with the main alternative (lossless compression algorithms). With the intuition one gets from studying small systems (see [13]), it seems worthwhile and insightful to undertake this kind of experiment.

1.1 The Halting Problem
The Halting Problem for Turing machines involves deciding whether an arbitrary Turing machine M eventually halts on an arbitrary input x. One can ask whether there is a Turing machine halt_M which, given code(M) and the input x, eventually stops and produces 1 if M(x) stops, and 0 if M(x) does not stop. Turing's seminal result states that this problem cannot be solved by any Turing machine, i.e. there is no such halt_M. Halting can be recognized by simply running the machine in question; the main difficulty is to detect non-halting machines. Since many real-world problems arising in the fields of compiler optimization, automated software engineering, formal proof systems, and so forth are deeply connected to the halting problem, there is an interest in understanding the problem in order to translate theoretical results into practical applications. In [4], it was observed that for any computable probability distribution, most long runtimes are effectively rare, so that in the limit they all have the same behavior regardless of the choice of distribution. They proved that the exact time at which a program stops is not too complicated algorithmically. It is (algorithmically) non-random because most programs either stop quickly or never halt. Since non-random times are (effectively) rare, according to Calude and Stay, the density of times at which an N-bit program can stop decreases quickly.
There are (4n + 2)^(2n) possible (n, 2) deterministic Turing machines with n states and 2 symbols. We denote by (n, m) the class (or space) of all n-state m-symbol Turing machines having a bidirectional tape and remaining on the same cell when entering the (additional to the n) halting state. Among the machines that halt, there are some that print more 1s on their output tapes than any other Turing machines of the same size, and some that reach a maximum number of steps upon halting. If σ_T is the number of 1s on the tape of a Turing machine T upon halting, then

Σ(n) = max{σ_T : T ∈ (n, 2) and T halts},

with n the number of states of the Turing machine. If t_T is the number of steps that a machine T takes upon halting, then

S(n) = max{t_T : T ∈ (n, 2) and T halts},

with n the number of states of the Turing machine. Σ(n) and S(n) are non-computable functions [9], by reduction to the halting problem. Yet values are known for (n, 2) with n ≤ 4. The solution for (n, 2) with n < 3 is trivial; the process leading to the solution in (3, 2) is discussed by Lin and Rado [11]; and the process leading to the solution in (4, 2) is discussed in [1].

Solving the halting problem for small machines. It is easy to see that Σ(1) = 1 and Σ(2) = 4. Lin and Rado [11] proved Σ(3) = 6 and Brady [1] that Σ(4) = 13. The exact known values for S are S(1) = 1, S(2) = 6, S(3) = 21, S(4) = 107. These Busy Beaver values are for 2-symbol Turing machines. These numerical values of the Busy Beaver functions have been calculated by a combination of techniques, notably the exhaustive simulation of a reduced number of non-equivalent Turing machines: many machines can be decided by inspection (e.g. evident loops), and because the number of remaining cases is small enough one can either analyse them case by case or actually run the machines and analyse their behaviour until deciding whether they halt or not. This is evidently possible only because of the relatively small number of Turing machines with up to the number of states for which the values of the Busy Beaver functions are known. A program showing the evolution of all known Busy Beaver machines, developed by this paper's author, is available online [15]. The formalism followed in this paper is the same as the one originally described and followed for the Busy Beaver problem as introduced by Rado [9]. It is worth noting that the Busy Beaver problem is defined for Turing machines with initially empty tapes, and the Turing machines studied in this paper are all provided with an initially empty tape too. Turing universality tells us, however, that for every Turing machine with an arbitrary input there is a Turing machine with empty input computing the same function, hence Turing machines with empty tapes cover all possible cases (the translation may only result in some extra states).
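To make the exhaustive-simulation approach concrete, here is a minimal Mathematica sketch that enumerates and runs all (2, 2) machines with one extra halting state (state 0). The rule encoding is an illustrative assumption rather than Rado's exact enumeration: each of the four (state, symbol) pairs is mapped to one of the 4n + 2 = 10 possible actions {new state, symbol written, head move}, giving 10^4 = 10 000 machines, so the resulting counts can be compared with, but need not coincide exactly with, the table in Fig. 1.

  actions = Join[
     Tuples[{{1, 2}, {0, 1}, {-1, 1}}],    (* transitions into a non-halting state: 2*2*2 = 8 *)
     {{0, 0, 0}, {0, 1, 0}}];              (* halting transitions, writing 0 or 1, no move: 2 more *)

  runtime[rule_, maxSteps_] := Module[{tape = <||>, pos = 0, state = 1, t = 0, act},
    While[state != 0 && t < maxSteps,
     act = rule[{state, Lookup[tape, pos, 0]}];
     tape[pos] = act[[2]]; state = act[[1]]; pos += act[[3]]; t++];
    If[state == 0, t, Infinity]];          (* Infinity marks machines that did not halt in time *)

  machines = Association[Thread[Tuples[{{1, 2}, {0, 1}}] -> #]] & /@ Tuples[actions, 4];
  Tally[runtime[#, 7] & /@ machines]       (* runtime distribution; S(2) = 6, so 7 steps suffice *)

The same sketch generalizes to (3, 2) by replacing the state list {1, 2} with {1, 2, 3} and the step bound with S(3) = 21, giving the 14^6 = 7 529 536 machines discussed below.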
Calude and Stay showed that long-running Turing machines can only halt at non-random times; the density of non-random times near n is about 1/n. Long-running means that if we have a universal Turing machine U and machine M is implemented by a program m for U of length n, then U(m) runs for more than c·2^n steps, where c is some uncomputable constant depending on U.

3.1 Halting History of (2, 2) Turing Machines
We know that a machine halts if it enters the halting state before reaching the known Busy Beaver value S(n). If it does not, then it never halts. The halting problem and the halting probability problem are closely related to the Busy Beaver problem in that a solution to any one of them would yield a solution to each of the others. Consider the halting space of all (2, 2) Turing machines (with an extra halting state) provided with an empty tape. The table in Fig. 1 shows the runtime distribution at which all machines in (2, 2) halt (or do not).
t    k_t     p(k_t)
∞    6544    0.65
1    2000    0.20
2    800     0.080
3    160     0.016
4    56      0.0056
5    362     0.036
6    78      0.0078
Fig. 1. Runtime distribution at which all machines halt (those that do not halt are indicated by ∞). Where t is the number of steps, k_t the number of machines that halted at t (out of a total of 3456 that halt), and p(k_t) is the probability (over all 10 000 machines) of a machine halting, or never halting, at time t.
There are 10 000 2-state, 2-symbol Turing machines (the 10 000 figure comes simply from the formula (4n + 2)^(2n) giving the number of Turing machines with n = 2 states). No other Turing machine halts after 6 steps (see Fig. 1) in (2, 2). Machines that never halt are 6544 in number, representing around .65 of the total. What we term a runtime space is the product of a class of (n, m) Turing machines for fixed n and m, where programs are uniformly distributed, and the time space, which is discrete, with each halting time mapped to a greyscale color (the lighter the color, the sooner it halted; white means the program never halted and red means it reached the Busy Beaver value S(n)). Each point in Fig. 3 represents a Turing machine and, as defined by the corresponding spectrum in Fig. 2, the lighter the square the sooner it halted.
Fig. 2. Halting color mapping spectrum for Turing machines in (2, 2) (the last color is red, visible in the online and printed versions only)
Fig. 3. Runtime distribution plot showing all the 10 000 Turing machines in (2, 2) compressed in a Peano curve packing array (preserving the enumeration distance between machines). Some clusters may emerge due to the enumeration (e.g. terms involving transition rule parameters grouping Turing machines). The plot may look as if it had fewer than the necessary rows and columns to represent all the 10 000 Turing machines, but that is a consequence of the Peano packing: each apparent pixel is in fact a small cluster of several machines.
White cells represent machines that don't halt. Red cells (only visible in the online and color printed versions) show the Busy Beaver machines (for this space, with runtime S(2) = 6 steps). Among all the 10 000 Turing machines in (2, 2), .65 never halt, .2 halt at the first step, .08 at the second, .016 at the third, and so on. In other words, .57 of the 3456 (2, 2) Turing machines that halted did so at the first step, .81 halted before or by the second step at the latest, .84 before or by the third step at the latest, and so on (see Fig. 6).
t     k_t        100·2^(14-t)    p(k_t)
∞     5382624    –               0.71
1     1075648    819200          0.14
2     614656     409600          0.082
3     263424     204800          0.035
4     97216      102400          0.013
5     53760      51200           0.0071
6     20800      25600           0.0028
7     12512      12800           0.0017
8     4264       6400            0.00057
9     2424       3200            0.00032
10    1064       1600            0.00014
11    536        800             0.000071
12    304        400             0.000040
13    176        200             0.000023
14    128        100             0.000017
Fig. 4. Where t is the number of steps, k_t the number of machines that halted at t, and p(k_t) is the halting probability calculated from t and k_t. The function 100·2^(14-t) is a good fit to the limit behavior, relating runtimes to the number of Turing machines halting at a given runtime, for the 14 runtimes at which (3, 2) Turing machines halt.
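For reference, the fitted values quoted in the caption can be computed directly; they are the ones listed in the third column of the table above.

  Table[100*2^(14 - t), {t, 1, 14}]
  (* {819200, 409600, 204800, 102400, 51200, 25600, 12800, 6400, 3200, 1600, 800, 400, 200, 100} *)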
Fig. 5. Number of machines in (3, 2) that halt step by step versus 100·2^(14-t) (dark line; blue in the color version)
3.2 Halting History of (3, 2) Turing Machines
Interesting output distribution facts: Out of 7 529 536 machines only 2 146 912 halt; there are 5 382 624 machines that do not halt. Those machines that halt produce only 126 different output strings, with the largest being 6 digits in length (the Busy Beavers). Exactly .2 of the Turing machines produce a 0 or a 1 as output.
Fig. 6. Cumulative number of machines in (2, 2) (left) and (3, 2) (right) that halt step by step (logarithmic scale)
The fact that the figures are mostly white and lightly colored is an indicator of the sparsity of long-running machines: most machines either never halt or halt quickly.
Fig. 7. Halting spectrum for (3, 2). Last color in the spectrum is red (only visible in the online and color printed versions).
Inspired by [13], where Wolfram undertakes an exhaustive investigation of the space of propositional logic formulas, I extended his ideas to investigate the space of first-order logic. The extension wasn't trivial, among other reasons because, unlike propositional calculus, predicate calculus is undecidable, meaning that one may come across cases where formulas (or their negations) are not proven or disproven in an axiom system of first-order logic. Proof lengths are, of course, not bounded, or one would be able to decide whether a formula in an axiom system can be proven or not once it has reached a limit. Frequency of proof lengths for randomly generated formulas, however, can be studied and analyzed. Frequency distributions of (dis)proven formulas turn out to follow a similar distribution to those of randomly generated computer programs, in which most programs, just as we found for formulas, halt (or are (dis)proven) quickly, with their number diminishing fast over time. When I met Cris Calude and became acquainted with his fascinating work, including a recent collaboration with Michael Stay on the distribution of halting times of random computer programs [4], it prompted me to seek connections with these other findings (persuaded as I was of the strong connections known to exist between computation and proof theory) and to undertake an empirical
Fig. 8. Runtime deep field of a segment of runtimes from the 7 529 536 Turing machines in (3, 2). The (3, 2) Busy Beavers are barely visible as isolated red points (online and color printed versions only).
Fig. 9. This is what a typical random part of the runtime deep field looks like after a 10× zoom of a tenth-size square area of the original image (Fig. 8).
investigation of both the halting runtimes of Turing machines that Calude and Stay had calculated theoretically, and the lengths of proofs found by automatic theorem provers. It follows from Chaitin [5] and Calude and Stay [4] that to (dis)prove a formula in an axiom system one only needs to check up to the runtime beyond which the Turing machine encoding the proof search can no longer halt. Busy Beavers, as used in the previous section, are therefore relevant to automatic theorem proving because they provide an upper bound on the length of proofs. One only needs to run the computer to (dis)prove the formula up to the Busy Beaver value for the size of the Turing machine, and if it cannot be proven by then, it is undecidable for that axiom system. Moreover, Calude and Stay's work suggests that the chances of proving a formula should decrease over time, or that if a formula can be (dis)proven at all, it will likely be (dis)proven early rather than late, meaning that one can set an optimal time for a given provability certainty goal.

4.1 Computer Runtimes and Lengths of Proofs
Optimal proving times are relevant because, on the one hand, they allow one to set a maximum waiting time, given that proofs may never arrive if a theorem is undecidable in an axiom system, and, on the other, because one would know how long to wait before giving up with a certain degree of certainty of provability. If one had a goal (say, to prove a fraction of .90 of a set of formulas) one could calculate an optimal timeout and a maximum waiting time, taking advantage of the fact that in the case of theorem provers running on digital computers, there is a correspondence between runtime and proof length. The numbers involved are so large and grow so fast because of the combinatorial explosion (in the number of formulas as well as in the number of Turing machines) that we were only able to explore the tip of the iceberg of the space of all possible first-order formulas, but with interesting and encouraging results nonetheless.

4.2 Enumerating and Generating Predicate Calculus Axiom Systems with Equality
A number of sound and complete calculi have been developed enabling fully automated theorem provers for first-order logic. Equational logic is quite simple, and yet powerful [2]. Its atomic formulas are equations, making it very easy to encode and deal with. In our formalism, terms are built from variables and constants using function symbols. Equalities of the form lhs = rhs are the atomic formulas in our language, where lhs and rhs are terms. One can represent most mathematical axiom systems and theorems in equational form, so it is expressively very rich. A logical system which possesses an explicitly stated set of axioms from which theorems can be derived is an axiomatic system. In predicate calculus, a formula is in prenex normal form if it can be written as a string of quantifiers followed by a quantifier-free part. All first-order well-formed formulas (hereafter simply formulas) are logically equivalent to some formula in prenex normal form. Skolemization is a way of removing existential
quantifiers from a formula. Variables bound by existential quantifiers which are not within the scope of universal quantifiers can simply be replaced by appropriate constants. Both will be used in order to enumerate all possible quantified axioms and formulas of first-order logic. All equational formulas can be represented with two binary operators f and p, where p is a pairing function and f is an indexing operator (any possible binary function). The first parameter of f will be a constant determining its index, while the second is any other term (variable, constant, f itself or p). When the existential quantifier is inside a universal quantifier, the bound variable must be replaced by a Skolem function of the variables bound by the universal quantifiers. We can then specify any constant using a formula of the form ∀a ∀b f(a, a) = f(b, b). And the ith constant can be defined in terms of f and p recursively as follows:

c(0) = p(f(a, a), f(a, a))
c(n+1) = p(f(a, a), c(n))

Or in a single Mathematica expression:

Nest[p[f[a,a],#]&,p[a,a],i]

To represent all possible functions one can combine both f and p. For instance, f(c(i), p(c(i), x)) is the expression representing the i-th function (the function with index i) applied to x. This assumes that there are an infinite number of individuals in the most general case. Notice that x may be a list built from pairs. Formulas were enumerated and generated by the number of variables and constants on both sides of the equality. There are no formulas of length 1, simply because an equality requires at least 2 terms, one on each side. Finally, all single axioms were arranged by length. The length of an equational formula is the sum of the bound variables on both sides of the equality. Axiom systems are simply all the possible subsets over the formulas of a fixed length. Applying this operation makes the number of axiom systems grow exponentially, so we were able to proceed exhaustively only up to formulas with 3 bound variables, and to generate a sample of only 1000 axiom systems (an initial segment) for formulas with 4 bound variables. An automatic theorem prover was fed with all single formulas of 4 bound variables as its proving goal, for each of the generated axiom systems, producing almost 10^5 proofs. Among the initial 1000 axiom systems, only 607 were used, as they were proven to be consistent (no axiom was the negation of any other) and independent (no axiom could be derived from the others). An example of a formula with 3 bound variables is ∀x1 ∀x2 ∀x3, x1 = f(f(x2, x3), x1), and with four, ∀x1 ∀x2 ∀x3 ∀x4, x1 = p(f(x2, x3), x4). An example of an axiom system consisting of 2 axioms, each with 2 bound variables, is: ∀x1 ∀x2, x1 = f(x2, x1); ∀x1 ∀x2, x1 = p(x1, x2). Notice that one does not need to further compose f with p or p with f in order to produce other possible formulas, because f is a general function with an index as first parameter and any term as second parameter, which can be p or f itself, without the need of infinitely nesting each into the other in order to reach other possible constructions.
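As an illustration of these constructions, the following short Mathematica sketch builds the i-th constant with the Nest expression above and applies the i-th general function to a variable; the names c and func are introduced here for illustration only, and p, f, a and x1 remain purely symbolic.

  c[i_] := Nest[p[f[a, a], #] &, p[a, a], i];    (* the i-th constant, as in the expression above *)
  func[i_, x_] := f[c[i], p[c[i], x]];           (* the i-th general function applied to a term x *)

  c[2]           (* p[f[a, a], p[f[a, a], p[a, a]]] *)
  func[0, x1]    (* f[p[a, a], p[p[a, a], x1]]      *)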
4.3 Experimental Setting
The project was undertaken using Mathematica's built-in implementation of the well-known and award-winning theorem prover Waldmeister [12]. Waldmeister returns True after evaluating an expression in Mathematica if it can prove the conclusions from the given axioms, and False if it can prove that the conclusions do not follow from the axioms. If it can prove neither, it returns the expression unevaluated. The axiom systems generated as described in Section 4.2 were first checked for logical consistency and internal axiom independence, these being two of the most important qualities of conventional mathematical axiom systems. An axiom system A is said to be consistent if no theorem and its negation can both be derived from A. On the other hand, if A is an axiom system and a ∈ A, then a is considered independent in A, or an independent axiom of A, if a cannot be derived from A \ {a}. As with any axiomatic system, we want an axiom system to be minimal, i.e. to contain no superfluous axiom. From this point on, only consistent axiom systems were taken into account. A sketch of both checks is given at the end of this subsection.

Miscellaneous interesting first results: It was found that only .01 out of a total of 490 axiomatic systems with 1 or 2 axioms of length up to 3 bound variables were non-independent, i.e. one of their members could be derived from a combination of the others. All the 29 axiomatic systems of length 3 with 2 or more axioms were independent. This could be explained by the way in which the axiomatic systems were enumerated, because axioms closer to each other in the enumeration seem to have a better chance of being derivable from each other. The condition of being a theorem or an axiom is evidently an arbitrary convention. The number of consistent axiom systems of length 3 was only .0342 of a total of 1024 initial axiomatic systems. In the case of axiom systems of length 4 (composed of formulas of that size), .607 of them were found to be consistent. This may be interpreted in two different ways: that even when the complexity of the axiom systems grows, the overall inconsistency does not increase, or else that the process only unveils the tip of the iceberg, where systems are consistent chiefly due to their simplicity (both in terms of the number of axioms per axiom system and the length of the axioms themselves, thereby reducing the possible number of clashes).
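The two checks can be sketched as follows in Mathematica. The wrapper prove[goal, axioms], as well as the names independentQ, consistentQ and usableSystems, are introduced here purely for illustration; prove stands in for a call to the prover that returns True, False, or no result, as described above, and is not the actual interface used.

  (* an axiom system is independent if no axiom can be derived from the remaining ones *)
  independentQ[axioms_List] :=
    And @@ Table[prove[axioms[[i]], Delete[axioms, i]] =!= True, {i, Length[axioms]}];

  (* consistent, relative to a set of test formulas, if no formula and its negation are both provable *)
  consistentQ[axioms_List, formulas_List] :=
    ! Or @@ Map[(prove[#, axioms] === True && prove[Not[#], axioms] === True) &, formulas];

  (* keep only the axiom systems that pass both filters, as was done for the 607 systems used here *)
  usableSystems[systems_List, formulas_List] :=
    Select[systems, consistentQ[#, formulas] && independentQ[#] &];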
4.4 Distribution of Proof Lengths

The relation between the length of the formulas and the optimal runtime limit is of particular utility when no upper bound is known (or possible), when, for example, there are non-provable formulas for which longer runtimes will not make any difference, which, as verified herein, would cover a negligible number of cases.
A total of 89 145 formulas out of the 97 727 with at most 4 variables were proven to be theorems (or had their negations proven) after a single step. One can call such a theorem trivial simply because its proof, requiring only 1 step, can be accomplished with an axiom, and is therefore itself an axiom. The proof length (t) distribution (in percentages) of formulas with up to 4 variables is shown in Fig. 10.
t     k_t      p(k_t) (%)
1     89145    91.2184
2     2311     2.36475
3     473      0.484001
4     931      0.952654
5     928      0.949584
6     426      0.435908
7     577      0.59042
8     834      0.853398
9     1344     1.37526
10    294      0.300838
11    186      0.190326
12    206      0.210791
13    44       0.0450234
14    15       0.0153489
15    7        0.00716281
16    2        0.00204652
17    4        0.00409303
Fig. 10. Proof length (t) distribution (in percentage) of formulas with up to 4 variables
Proof length distribution of (dis)proven theorems: t is the number of steps the theorem prover has taken to produce the proof, k_t the number of formulas (dis)proven at t, and p(k_t) the percentage of theorems (dis)proven at time t, from which one can build a probability distribution. It is worth noting that the behavior of the graph in Fig. 10 resembles the first case of (2, 2) Turing machines, where the number of machines that halted was not strictly decreasing (unlike (3, 2), which was monotonically decreasing). Already .912 of the total number of theorems are proven by the very first step, with that number dropping as the total is approached. From the distribution it follows that going beyond the 7th step to the 17 steps required by the longest proofs only adds .012 of new (dis)proven formulas to the total.

Summary of proving times: A total of 89 145 formulas out of 97 727 were immediately proved (or disproved) after the first step (i.e. 91.21%). 95.96% were proven after 5 steps, and 96 969 formulas were proven after 9 steps (which is almost half of the maximum of 17 steps reached by the formulas with 4 bound variables), that is, 99.22% of the total. Letting the theorem prover run up to 17 steps only generates 758 new proofs, that is, only 0.77% of the total.
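The cumulative figures above can be recovered directly from the counts k_t in Fig. 10, for example with the short Mathematica check below (the list of counts is copied from the table):

  kt = {89145, 2311, 473, 931, 928, 426, 577, 834, 1344, 294, 186, 206, 44, 15, 7, 2, 4};
  N[Accumulate[kt]/Total[kt]]
  (* cumulative fraction (dis)proven by step t: about .912 after step 1, .96 by step 5, .992 by step 9 *)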
Fig. 12. Truth space of 97 727 proofs from the 607 consistent and independent axiom systems (x axis) against 161 formulas (y axis) from formulas with 4 bound variables. Every dot is a proof, a black square indicates that a particular theorem holds in a particular axiom system (which explains the diagonal, among other patterns) and white means the formula was proven to be false in the corresponding axiom system (i.e. the negation is a theorem). No undecidable candidate was found.
As for Turing machines (see Fig. 8), the space of proof lengths (Fig. 14) is mostly white and lightly colored as an indicator of the sparsity of long proof lengths given that most formulas are (dis)proven very quickly, suggesting that the distribution of proof lengths follows the distribution of program runtimes.
Fig. 14. Proof length deep field plot from the 97 727 formulas of up to 4 variables. Formula Busy Beavers are barely visible as isolated red points (online and color printed versions only). Points are arranged as in Fig. 12.
As with Busy Beaver Turing machines, whose values depend on the size of the Turing machines (states and symbols), proof lengths depend on the length of the formulas. One can define Busy Beaver formulas (whose values will be denoted by fBB(n)) as the formulas for which an automatic theorem prover takes the most time to decide whether they are theorems, or for which it produces the longest proof, among all the formulas of a fixed length. Unlike Turing machines, however, the size of a formula can take many forms, and may depend on the number of bound variables (as was the case in the experiments undertaken here), the number of logical operators or the number of symbols in general. It also depends on the formalism, just as Busy Beavers depend on the formalism used by Rado [9]. Following the analogy, the values of fBB(n) would work in a similar way and may be used just as Busy Beaver Turing machine values are currently used, for defining maximum runtimes and maximum output lengths for (small) Turing machines, saving time once an upper limit is known. The exact relation would also save considerable computational resources in automatic theorem proving. As explained before, the theoretical algorithmic analysis in [4] indicates that a program that has not stopped after running for a long time has smaller and smaller chances of eventually stopping, so the longer the time t, the more unlikely the program is to halt. Calude and Stay's results can be interpreted as follows: most Turing machines are fully determined qua termination by a small number of computational steps, and the error margin upon betting that a Turing machine will halt drops exponentially. Because proofs are, in effect, programs for an automatic theorem prover, one can connect this interpretation to the probability of a formula being (dis)proven in an axiom system, with the confidence error margin dropping fast. Let the optimal timeout be the number of steps by which a given fraction of the formulas of a fixed length is (dis)proven. Evidently, proving time is asymptotically optimal, in the sense that the closer to the maximum runtime (the Busy
t (runtime)    (dis)proven fraction of theorems p(t)    f(t) = 1/2^t
1              0.9                                      0.5
2              0.02                                     0.2
3              0.005                                    0.1
4              0.01                                     0.06
5              0.009                                    0.03
6              0.004                                    0.02
7              0.006                                    0.008
8              0.009                                    0.004
9              0.01                                     0.002
10             0.003                                    0.001
11             0.002                                    0.0005
12             0.002                                    0.0002
13             0.0005                                   0.0001
14             0.0002                                   0.00006
15             0.00007                                  0.00003
16             0.00002                                  0.00002
17             0.00001                                  0.00001
Fig. 15. Proof length distribution of (dis)proven theorems compared with the function f(t) = 1/2^t. Where t is the number of steps taken by the prover, p(t) the fraction of formulas (dis)proven at time t, and f(t) = 1/2^t; values are given to the first significant digit.
Beaver formula values), the greater the fraction of (dis)proven formulas. An optimal time OPTime for a given goal implies that by time t one has reached a fraction α of (dis)proved formulas. Thus

OPTime(n, α) = min{t(n) : θ_{t(n)} = α},

where n is the length of the formulas in the set, α the desired fraction of (dis)proved formulas, and θ_{t(n)} the fraction of formulas (dis)proven by time t(n) ≥ 0. Obviously 0 < OPTime(n, α) ≤ fBB(n) for each time t > 0, and OPTime(n, α) = fBB(n) if α = 1, that is, if the fraction of formulas to be (dis)proved is 1 (i.e. if the goal is to (dis)prove all the formulas of a fixed length). Just as with Busy Beavers, the exact value of OPTime(n, α) is uncomputable and unpredictable in general, but one can approach it. For example, in our formalism, for 4 bound variables it can be calculated from the probability distribution in Fig. 15 (a small sketch of such an estimate is given at the end of this section). One can ascertain, for example, that from a uniform distribution of randomly generated formulas, nearly .90 of the formulas will be proven after the first step, and that the number of new proofs from then on will rapidly drop as a function of the number of steps. The value of OPTime(n, α) can also determine a timeout for single formulas, given a confidence expectation. Which is to say that a single formula has, for example, a .90 chance of being (dis)proven in the first step, and that it has diminishing possibilities, if any, of being (dis)proven thereafter. We think that the results are robust enough to model specifications of theorem provers, despite not being completely independent. We were able to verify the results using another, very different theorem prover, the Automatic
Proof Search or AProS [10], for propositional logic and predicate calculus (the theorem prover deals, however, with all sorts of other classical and non-classical calculi). AProS uses the intercalation method to search for normal natural deduction proofs and, unlike Waldmeister, does not require a language in which the atomic formulas are identities. Notice that for this new case the definition of the length of formulas was adjusted to the new framework, given that the prover's calculus does not require equality, so no sense can be given to left- or right-hand sides. The set of randomly chosen operators used to generate formulas were the classic and, or, implies and double implies. AProS found proofs for .12 of the assertions (and for .353 of a set of assertions with no double conditionals), out of a random choice of 1000 automatically generated predicate calculus assertions with up to 4 quantifiers, 3 general functions, 3 logical operators and 3 variables. The longest proof length (runtime) was 42, with an average proof length of 13, and a distribution very close to the one obtained with Waldmeister using Mathematica.
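To connect this back to the definition of OPTime(n, α), the following Mathematica sketch estimates the optimal timeout from the empirical counts of Fig. 10 for formulas with 4 bound variables. It is only an approximation under stated assumptions: it takes the first step at which the cumulative fraction reaches at least α (rather than exactly α), and opTime is a name introduced here for illustration only.

  kt = {89145, 2311, 473, 931, 928, 426, 577, 834, 1344, 294, 186, 206, 44, 15, 7, 2, 4};
  cumulative = Accumulate[kt]/Total[kt];                       (* fraction (dis)proven by step t *)
  opTime[alpha_] := First[FirstPosition[cumulative, _?(# >= alpha &)]];

  opTime[0.90]   (* 1: a .90 goal is already met at the first step                          *)
  opTime[0.99]   (* 9: a .99 goal requires waiting up to nine steps                         *)
  opTime[1]      (* 17: (dis)proving every formula requires the maximum proof length observed *)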
A logically significant question concerns the structure of the theorems established. If significant structural features are uncovered, then one could randomly generate formulas of that structure and repeat the proof length and runtime distribution experiments. It would be quite interesting if one could find, for example, systematic biases for different theorem provers and theorem-proving techniques whose distributions deviate from each other. One can continue the process of generalizing theoretical results from computer programs to proof lengths and seek the equivalent of Busy Beavers in sets of well-defined proofs and theorem provers. Just as for larger Busy Beaver Turing machine values, the computer time and resources needed to explore much larger sets of proofs are out of reach. The experiments suggest that the statistics for theorem proving times from randomly generated formulas may follow a similar trend to the distribution of runtimes of random computer programs, and that when searching for proofs, appropriate timeouts can be set and optimal waiting times defined depending on the size of the formulas, just as it has been determined that runtimes depend on the size of machines. It is too soon, however, to declare any true resemblance, and there are always dangers in extrapolating from the behavior of small systems.

Acknowledgments. I am grateful to Cris Calude, who encouraged me to publish these results in connection with his own work [4]. I am also indebted to Stephen Wolfram, Todd Rowland and Matthew Szudzik for their support and guidance during and after the 2005 NKS Summer School at Brown University, when I started this project as part of a 3-week Summer project, inspired by Stephen Wolfram's own work in [13] and intending to extend his results from propositional logic to predicate calculus. I am also grateful to J.-P. Delahaye
with whom I've undertaken related research [7], studying the output distribution of abstract computing machines. To Wilfried Sieg for his guidance and for introducing me to AProS, which I used to strengthen the experimental results in this paper while a visiting scholar at Carnegie Mellon, and to Jeremy Avigad who brought me to Carnegie Mellon. And to the anonymous referee. Any error or omission remains, of course, the sole responsibility of this author.
References
1. Brady, A.H.: The Determination of the Value of Rado's Noncomputable Function for Four-State Turing Machines. Math. Comput. 40, 647–665 (1983)
2. Baumgartner, P., Zhang, H.: On Using Ground Joinable Equations in Equational Theorem Proving. In: Proceedings of the 3rd International Workshop on First Order Theorem Proving (St Andrews, Scotland), Fachberichte Informatik 5/2000, pp. 33–43. Universität Koblenz-Landau (2000)
3. Calude, C.S., Dinneen, M.J., Shu, C.-K.: Computing a glimpse of randomness. Experimental Mathematics 11(2), 369–378 (2002)
4. Calude, C.S., Stay, M.A.: Most programs stop quickly or never halt. Advances in Applied Mathematics 40, 295–308 (2005)
5. Chaitin, G.J.: Computing the Busy Beaver function. In: Information, Randomness & Incompleteness, pp. 74–76 (1984)
6. Chaitin, G.J.: A theory of program size formally identical to information theory. J. ACM 22, 329–340 (1975)
7. Delahaye, J.-P., Zenil, H.: Numerical Evaluation of Algorithmic Complexity for Short Strings: A Glance Into the Innermost Structure of Randomness. Appl. Math. Comput. (in press, 2011)
8. Joosten, J., Soler-Toscano, F., Zenil, H.: Program-size Versus Time Complexity, Speed-up and Slowdown Phenomena in Small Turing Machines. International Journal of Unconventional Computing (2011)
9. Rado, T.: On Non-Computable Functions. Bell System Technical J. 41, 877–884 (1962)
10. Sieg, W.: The AProS Project: Strategic Thinking & Computational Logic. Logic Journal of the IGPL 15(4), 359–368 (2007)
11. Lin, S., Rado, T.: Computer Studies of Turing Machine Problems. J. ACM 12, 196–212 (1965)
12. Hillenbrand, T., Löchner, B.: The Next WALDMEISTER Loop. In: Voronkov, A. (ed.) CADE 2002. LNCS (LNAI), vol. 2392, pp. 486–500. Springer, Heidelberg (2002)
13. Wolfram, S.: A New Kind of Science. Wolfram Media (2002)
14. Zvonkin, A.K., Levin, L.A.: The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms. Russian Math. Surveys 25(6), 83–124 (1970)
15. Zenil, H.: Busy Beaver, from the Wolfram Demonstrations Project (2009), http://demonstrations.wolfram.com/BusyBeaver/