Research Paper 5
2019 The Author(s) Published by the Royal Society. All rights reserved.
1
An account of proofs and their history is given in [5], a 429-page book that is very informative and quite comprehensive.
However, Hinkis does not provide a unifying mathematical examination.
2
The 24th Paris problem was formulated for but not included in the publication of Hilbert’s Paris lecture. For the discovery
of Hilbert’s manuscript and its significance, see [7].
3
I have profited from a stay in Lisbon that was partially supported by the FCT project ‘Hilbert’s 24th problem’ (PTDC/MHC-FIL/2583/2014).
The challenge of giving four seminar talks during this stay in October 2017 helped me to better organize
my considerations; I am grateful, in particular, to Mirko Engeler, Reinhard Kahle and Isabel Oitavem. Some aspects of the
considerations presented here are reported with many more details in [8,16]. I gave a version of this paper at the Joint
Mathematics Meeting in San Diego on 13 January 2018 as part of the Special Session on Alternative Proofs that John Dawson
had organized.
[Figure residue: diagrams relating a, d, d\e and e through the iterated images under h; the graphics are not recoverable.]
We have d\h[c] = (d ∪ (a\d))\(h[c] ∪ (a\d)), because a\d is disjoint from d. With d ∪ (a\d) = a and the
structural identity c = (a\d) ∪ h[c] = h[c] ∪ (a\d), we have d\h[c] = a\c. Thus, h2 is a function from a\c to d\h[c]. Both h1 and h2
are bijections; c and a\c partition a, whereas h[c] and d\h[c] partition d. So, the union h* of h1 and
h2 is a bijection from a to d. The above structural identity articulates a crucial insight concerning
inductively defined sets: their elements are either in the initial set (above, in a\d) or have been
obtained by the iterating function (above, h).
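The construction can be tried out on a concrete instance. The following is a minimal sketch, not taken from the paper: it assumes a = {0, 1, 2, ...}, d = a without 0, e = {2, 3, ...} and h(x) = x + 2, and all function names are hypothetical. Membership in c is decided by pulling an element back along h, which assumes that every backward orbit is finite (true in this instance).

```python
def in_chain(x, in_start, in_image, h_inv):
    """Decide whether x lies in the chain generated from the start set by h.

    x is pulled back along h until it leaves the image of h; it belongs to
    the chain exactly when the pull-back ends in the start set.
    Assumption: every element has a finite backward orbit.
    """
    while in_image(x):
        x = h_inv(x)
    return in_start(x)

# Hypothetical instance: a = {0, 1, 2, ...}, d = a minus {0}, e = {2, 3, ...}.
def h(x):
    return x + 2              # bijection from a onto e

def h_inv(y):
    return y - 2              # only meaningful for y in e

def in_e(x):
    return x >= 2

def in_d(x):
    return x >= 1

def in_a_minus_d(x):
    return x == 0

def h_star(x):
    """h on the chain c generated from a minus d, identity on the rest of a."""
    return h(x) if in_chain(x, in_a_minus_d, in_e, h_inv) else x

values = [h_star(x) for x in range(20)]
```

In this instance c is the set of even numbers, so h_star shifts the even numbers by 2 and leaves the odd numbers fixed; on any initial segment the values are pairwise distinct and lie in d.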
A modification of this proof establishes that there is a bijection h* from d to e. Let c* be the
set obtained from d\e by finitely iterating h; define h* from d to e by h*(x) = h(x) if x is in c* and
h*(x) = id(x) if x is in d\c* (figure 2). The composition of h with the inverse of h* is a bijection h**
from a to d. This bijection can be directly defined by exploiting c* as follows: h**(x) = h(x) if x is in
a\c* and h**(x) = id(x) otherwise. This is Zermelo’s argument for the Equivalence theorem. Note
that for the two arguments above (and also for König’s below) the important case arises when
a\d and d\e are non-empty; otherwise, the identity on a, respectively the given bijection h, can be
taken as the sought-after bijection.4
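The claim that composing h with the inverse of Zermelo's bijection from d to e gives the directly defined h** can be checked on a small instance. This is only a sketch under assumptions that are not in the paper: a = {0, 1, 2, ...}, d = {1, 2, ...}, e = {2, 3, ...}, h(x) = x + 2, with hypothetical function names.

```python
# Hypothetical instance: a = {0, 1, 2, ...}, d = {1, 2, ...}, e = {2, 3, ...}.
def h(x):
    return x + 2                  # bijection from a onto e

def in_c_star(x):
    """x is in c* when pulling back along h ends in d minus e = {1}."""
    while x >= 2:                 # x is in e, so it has a preimage under h
        x -= 2
    return x == 1

def zermelo_d_to_e(x):
    """Bijection from d to e: h on c*, identity on d minus c*."""
    return h(x) if in_c_star(x) else x

def zermelo_inverse(y):
    """Inverse of zermelo_d_to_e, defined for y in e."""
    return y - 2 if in_c_star(y) else y

def h_double_star(x):
    """Direct definition on a: identity on c*, h on a minus c*."""
    return x if in_c_star(x) else h(x)

composite = [zermelo_inverse(h(x)) for x in range(50)]
direct = [h_double_star(x) for x in range(50)]
```

On this instance the two lists coincide, which is what the composition argument predicts.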
König’s proof was published in [6]; his informal argument is presented rigorously in ([19], p. 55).
Adapted to my set-up, it is seen to join the earlier considerations using both c and c* (figure 3).
4
An informative analysis of Zermelo’s proof is found in [17, pp. 508–509]. Sieg & Walsh [8] recast a proof of CBT given in [18].
The bijection obtained from Banach’s proof is Dedekind’s h*.
[Figure residue: diagram of d and e with the iterated images under h; the graphic is not recoverable.]
Let r be e\(h[c] ∪ h[c*]) and define h1*(x) = h(x), if x is in c, and h1*(x) = id(x), if x is in c* ∪ r, and
h2*(x) = h(x), if x is in c ∪ r, and h2*(x) = id(x), if x is in c*.
It is not difficult to verify that h1* is the h* from Dedekind’s proof and that h2* is the h** from
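Zermelo’s proof. Here are the definitions side by side:
h*(x) = h(x) if x is in c, and h*(x) = id(x) if x is in a\c
and
h**(x) = h(x) if x is in a\c*, and h**(x) = id(x) if x is in c*.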
We have only to observe that, in the first case, a\c = c* ∪ r and, in the second case, a\c* = c ∪ r. h*
and h** are the canonical mappings that are obtained also in all the other proofs I have analysed.5
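The role of r becomes visible only when it is non-empty. The sketch below uses a hypothetical instance chosen for that purpose (it is not from the paper): a consists of pairs ("n", k) with k = 0, 1, 2, ... and pairs ("z", k) with k any integer, h shifts the first kind by 2 and the second kind by 1, d is a without ("n", 0), and e is d without ("n", 1). The partition is written in closed form for this instance rather than computed.

```python
# Hypothetical instance. Elements of a are pairs (tag, k):
#   ("n", k) with k = 0, 1, 2, ...   and   ("z", k) with k any integer.
# d is a without ("n", 0); e is d without ("n", 1).
def h(x):
    tag, k = x
    return (tag, k + 2) if tag == "n" else (tag, k + 1)   # bijection from a onto e

def part(x):
    """Place x in c, c* or r according to where its backward orbit under h ends."""
    tag, k = x
    if tag == "z":
        return "r"                        # pulls back forever, never leaves e
    return "c" if k % 2 == 0 else "c*"    # ends in ("n", 0) or in ("n", 1)

def h1_star(x):
    """h on c, identity on c* and r."""
    return h(x) if part(x) == "c" else x

def h2_star(x):
    """h on c and r, identity on c*."""
    return x if part(x) == "c*" else h(x)

sample = [("n", k) for k in range(10)] + [("z", k) for k in range(-5, 5)]
differs = [x for x in sample if h1_star(x) != h2_star(x)]
```

On the sample, the two maps disagree exactly on the elements of r, and neither ever produces ("n", 0), the single element of a\d.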
There are two important and problematic issues in the above arguments; first, we have to
find for the informally described sets c and c* an explicit set-theoretic definition and, second,
we have to prove the structural identities. If one defines c ‘from below’ as ∪ [h^n[a\d] | n ∈ N]
with h^0[a\d] = a\d and h^(n+1)[a\d] = h[h^n[a\d]], then it is immediate that c = (a\d) ∪ h[c]. This
approximation of c from below is the central construction in Bernstein’s proof [21]. Its standard
diagrammatic presentation, as given for example in [22, pp. 11–12], can be adapted as in figure 4
for the proof of Dedekind’s fundamental lemma in the following way, because d is a subset of a.
The bijection h*: a → d that is obtained ‘from the diagram’ is indeed Dedekind’s, as it is defined
by h*(x) = h(x) if x is in ∪ [h^n[a\d] | n ∈ N] and h*(x) = id(x) if x is in a\∪ [h^n[a\d] | n ∈ N]. However,
Dedekind wanted to avoid any appeal to natural numbers in the development of his general
theory of chains; after all, the natural numbers were to be founded on it. Once the natural numbers
had been given a chain-theoretic characterization, Dedekind established that the approximation
from below and the approximation from above (as the intersection of all chains containing a\d and
closed under h) yield the same set.6 The latter characterization is going to be discussed next to address the two
problematic and deeply related issues I just pointed to: an explicit set-theoretic definition of c and
c* as well as the proof of the structural identities.
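On a finite system both characterizations can be computed and compared directly. The sketch below is illustrative only: the eight-element set, the function h (given as a dictionary) and the start set b, which plays the role of a\d, are arbitrary choices, and the function names are hypothetical. The approximation from above is computed by brute force over all subsets, which is feasible only for very small sets.

```python
from itertools import combinations

def image(h, xs):
    """Image of the set xs under the function h (a dict)."""
    return {h[x] for x in xs}

def chain_from_below(b, h):
    """Union of b, h[b], h[h[b]], ... computed until nothing new appears."""
    c, layer = set(b), set(b)
    while layer:
        layer = image(h, layer) - c
        c |= layer
    return c

def chain_from_above(a, b, h):
    """Intersection of all subsets of a that contain b and are closed under h."""
    result = set(a)
    elems = sorted(a)
    for size in range(len(elems) + 1):
        for subset in combinations(elems, size):
            s = set(subset)
            if b <= s and image(h, s) <= s:
                result &= s
    return result

a = set(range(8))
h = {0: 3, 1: 1, 2: 5, 3: 6, 4: 0, 5: 2, 6: 3, 7: 7}   # arbitrary function from a to a
b = {0}                                                 # plays the role of a minus d

c_below = chain_from_below(b, h)
c_above = chain_from_above(a, b, h)
```

Here both computations give the set {0, 3, 6}, and that set equals b together with its own image under h, which is the structural identity in this small case.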
5
Many proofs, including König’s original one and more contemporary proofs like that of Doyle & Conway [20], define a
partition of a into sets c, c* and r based on the basic insight underlying figure 3; cf. also footnote 4. Scott Weinstein pointed
me to Doyle and Conway’s paper and provided a proof of the graph-theoretic fact that is crucially used there to prove CBT.
Weinstein’s proof emphasized for me the parallelism of the Doyle and Conway argument to that of König.
6
In the last part of #131 of [23], one finds an unnumbered theorem that expresses this identity. The theorem guarantees the
existence of the approximation from above—on the basis of the existence of N. In that restricted sense, the general theory
is dependent on N: the infinity axiom guarantees the existence of N in ZF; N together with the Replacement Principle (and
the union axiom) ensures the existence of a set that contains a\d and is closed under h; thus, the intersection is applied to a
non-empty set.
[Figure residue: diagram of d with the first three iterated images under h; the graphic is not recoverable.]
1. f ∈ inj(a,b) Prem
2. g ∈ inj(b,a) Prem
The additional lemmas, appealed to on lines 3, 4, 5, 7 and 8, do not at all touch the central
considerations that lead to the fundamental lemma. That part begins with proving the structural
identity for chains c = b ∪ h[c] of a subset b of a, where a is any system and h any function from a
to a. This general fact can be instantiated for the two chains c and c*; that fact, in turn, allows the
partitions used in the proofs of the fundamental lemma to define the bijections h* and h**. Here is a
diagrammatic summary:
Cantor–Bernstein theorem
|
fundamental lemma
The proofs of the structural identity for the two chains are instances of the same proof presented in
its general form below. This almost linear proof is not presented for its completed structure, but
for the possibility of indicating how it was constructed through a sequence of partial proofs with
gaps that are filled successively by forward and backward moves. (I use the line numbering of
The construction begins with the partial proof consisting of the premises 1–2 and the goal 22. To
allow the introduction of a (temporary) name for the complex term ¢(a,b,h) denoting the chain
of the system b given a and h, we employ the theorem in 3 and apply the elimination rules for
the existential quantifier and for identity to obtain the partial proof with lines 1–4 and 20–22.
Here is where the core of the proof begins, namely, to establish the identity in 20. That leads to
two new goals with gaps, thus to the partial proof: 1–4 . . . gap1 . . . 9 . . . gap2 . . . 19–22. The
reader, I hope, is now in a position to see how these new gaps are closed and how the proof of the
structural identity is completed.
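For orientation, the mathematical content of the structural identity can be written out informally as a two-inclusion argument. This is a sketch of the standard argument, under the assumption that the chain c of b is defined 'from above' as the intersection of all subsets of a that contain b and are closed under h; it is not a transcription of the numbered formal derivation, and the auxiliary name c' is introduced only for this sketch.

```latex
% Assumption: c = \bigcap \{ x \subseteq a \mid b \subseteq x,\ h[x] \subseteq x \}.
\begin{align*}
&\text{(i)}\quad b \subseteq c \text{ and } h[c] \subseteq c,
   \text{ hence } b \cup h[c] \subseteq c. \\
&\text{(ii)}\quad \text{Let } c' = b \cup h[c]. \text{ Then } b \subseteq c'
   \text{ and, by (i), } c' \subseteq c, \\
&\phantom{\text{(ii)}\quad} \text{so } h[c'] \subseteq h[c] \subseteq c'. \\
&\text{(iii)}\quad \text{Thus } c' \text{ contains } b \text{ and is closed under } h,
   \text{ so } c \subseteq c'. \\
&\text{(iv)}\quad \text{From (i) and (iii): } c = b \cup h[c].
\end{align*}
```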
Given the earlier diagrammatic summary of the two parallel ways of obtaining the CBT from
the structural identity, it seems that the multitude of proofs of the theorem has been reduced to
essentially one proof by analysing crucial concepts and related techniques. This case study presents
proof-theoretic investigations that are quite different from the standard ones (in pursuit of modified
Hilbert programmes). Nevertheless, it uses crucial insights from the traditional work and not only
opens new directions rooted in the earlier work, but actually takes up deep programmatic themes.
4. Programmatic directions
For his Paris list of mathematical problems, Hilbert had prepared a 24th problem that was not
included in their final publication [10]. As mentioned already in §1, this hastily formulated
problem called for the development of ‘a theory of the method of proof in mathematics in general’.
Hilbert made the bold claim that ‘under a given set of conditions there can be but one simplest
proof’, without indicating a notion of simplicity. If there should be two proofs for a theorem, then,
. . . you must keep going until you have derived each from the other, or until it becomes
The strategic conceptual necessity underlying the proofs of the fundamental lemma has to be
distinguished from the mathematical set-theoretic necessity of defining the set that is obtained by
finitely iterating an operation. The strategic conceptual necessity is realized through the two
variant conditions that separate Dedekind’s from Zermelo’s proof. As far as the other necessity
is concerned, three aids have been exploited to find explicit set-theoretic definitions: Bernstein’s
approximation from below, Dedekind’s approximation from above, and the Knaster–Tarski fixed-
point construction. The first way requires the availability of the natural numbers, whereas the
second and third approaches yield exactly the same two sets, c and c*.
In his Zürich talk of September 1917, Hilbert implicitly resumed the project outlined in the 24th
problem and called for the investigation of ‘the concept of the specifically mathematical proof’. It
was clear that the logical calculi Frege, Peano, Whitehead and Russell had developed would play
a crucial role in such an investigation. In lectures of the winter term 1917/18, Hilbert & Bernays
[27] used the Principia Mathematica calculus when sketching the formal development of number
theory and analysis. As the sustained formal work was far too unwieldy, they introduced in early
1922 a novel logical calculus with two explicit goals. The first goal was methodological, whereas
the second was entirely pragmatic:
(1) formulate a group of characteristic axioms for each logical connective and fix in this way
the logically relevant meaning of connectives,9 and
(2) make it easier to formalize mathematical arguments as well as to guarantee the
intelligibility of the formal object representing the informal proof.
Gentzen’s natural deduction systems are rule-based versions of the Hilbert–Bernays calculi,
but introduce one completely new feature: making and discharging assumptions. Gentzen viewed
that feature as an essential reflection of mathematical practice. A subclass of proofs in natural
deduction calculi, the so-called normal ones that do not make detours, has most striking structural
properties, among them the subformula property.10 As there was no direct way of generating
normal proofs similar to that of generating cut-free proofs in sequent calculi, the question was:
How can those properties be exploited for shaping a search for proofs? Intercalation calculi
address exactly this problem. The systematic bi-directional use of elimination and introduction
rules underlies the completeness proof for these calculi and either produces normal proofs or
allows the formulation of a counterexample. The structural features of normal proofs motivate
particular strategic moves to make proof search efficient and always goal-directed.
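The interplay of backward and forward moves can be illustrated with a toy search procedure. The sketch below is not the intercalation calculus itself: it covers only a fragment with conjunction and implication, has no classical rules, makes no claim to completeness, and uses hypothetical names and a hypothetical encoding of formulas as strings (atoms) and tuples (op, left, right).

```python
def provable(goal, assumptions=frozenset(), seen=frozenset()):
    """Goal-directed search in a toy fragment with 'and' and 'imp' only."""
    key = (goal, assumptions)
    if key in seen:                      # same gap already open on this branch
        return False
    seen = seen | {key}
    if goal in assumptions:              # gap closed
        return True
    # Backward move: apply the introduction rule for the goal's connective.
    if isinstance(goal, tuple):
        op, left, right = goal
        if op == "and" and provable(left, assumptions, seen) \
                and provable(right, assumptions, seen):
            return True
        if op == "imp" and provable(right, assumptions | {left}, seen):
            return True
    # Forward move: apply elimination rules to the available assumptions.
    for f in assumptions:
        if not isinstance(f, tuple):
            continue
        op, left, right = f
        if op == "and" and not {left, right} <= assumptions:
            if provable(goal, assumptions | {left, right}, seen):
                return True
        if op == "imp" and right not in assumptions:
            if provable(left, assumptions, seen) \
                    and provable(goal, assumptions | {right}, seen):
                return True
    return False

swap = ("imp", ("and", "A", "B"), ("and", "B", "A"))
modus_ponens = ("imp", ("and", "A", ("imp", "A", "B")), "B")
peirce = ("imp", ("imp", ("imp", "A", "B"), "A"), "A")
```

The search only ever considers subformulas of the goal and of the assumptions, which is the feature of normal proofs that the text singles out. Because the toy fragment has no classical rules, it finds derivations for swap and modus_ponens but not for peirce.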
Let us return to Hilbert’s 24th problem and the question of when two proofs in mathematics
should be considered to be the same. Recall that Noether viewed Dedekind’s and Zermelo’s proofs
as ‘exactly the same’. If Beweistheorie is to be a theory of mathematical arguments then, ultimately,
one has to find a criterion that relates the identity of proofs to the literal identity of syntactic
configurations. The latter are, or have been obtained from, formal representatives of the two proofs.
That raises, of course, the question when a formal derivation can be viewed as representing an
informal proof. Neither the question of proof representation nor the topic of proof identity can be
9
This was done explicitly to mimic for logic what Hilbert had done in Grundlagen der Geometrie [28], namely, fix the
mathematically relevant meaning of each geometric notion through a group of axioms.
10
For these structural properties, I refer to [29], in particular to the sections on The form of normal deductions in chapters III
and IV. Prawitz shows there that every branch (path) can be divided uniquely into E- and I-parts. This structural property
is underlying the strategic search for intercalation proofs. Gentzen discovered a procedure for the normalization of proofs in
intuitionist first-order logic, but could not extend it to classical logic; see my essay [30, section 6]. That was achieved, at least
partially, in [29]. The completeness proofs for intercalation calculi were established in the two papers mentioned in footnote
8 for classical and intuitionist first-order logic.
Appendix A
Here is the list of lemmas that are actually used in the top-level proof of the Cantor–Bernstein
Theorem. For a full list of lemmas used in the proof of the CBT, see ([8], appendix A). One
should note the directness of the first five observations; the remaining ones are facts concerning
Dedekind’s chains and are used in the proof of the structural identity.
11
This technique is reminiscent of the Greek way of partitioning geometric figures and showing them to be congruent by
arguing for the congruence of the parts; such Zerlegungsbeweise are the topic of Mahlo’s thesis [9]. Its most famous application
is found in Euclid’s proof of Pythagoras’ theorem; see my [31].
Equi8    a ≈ b; a ≈ c    b ≈ c
Func17
References
1. Dedekind R. 1887 Ähnliche (deutliche) Abbildung und ähnliche Systeme. In Gesammelte
mathematische Werke, vol. 3 (eds R Fricke, E Noether, Ö Ore), pp. 447–449. Braunschweig:
Vieweg.
2. Dedekind R. 1932 Gesammelte mathematische Werke, vol. 3 (eds R Fricke, E Noether, Ö Ore).
Braunschweig: Vieweg.
3. Cantor G. 1932 Gesammelte Abhandlungen mathematischen und philosophischen Inhalts
(ed. E Zermelo). Berlin, Germany: Springer.
4. Zermelo E. 1908 Untersuchungen über die Grundlagen der Mengenlehre. I. Math. Ann. 65,
261–281. (Translated in van Heijenoort 1967.) (doi:10.1007/BF01449999)
5. Hinkis A. 2013 Proofs of the Cantor-Bernstein Theorem: A mathematical excursion. Basel:
Birkhäuser Verlag.
6. König J. 1906 Sur la théorie des ensembles. C. R. Hebd. Séances Acad. Sci. 143, 110–112.
7. Thiele R. 2003 Hilbert’s twenty-fourth problem. Am. Math. Mon. 110, 1–24.
(doi:10.1080/00029890.2003.11919933)
8. Sieg W, Walsh P. Submitted. Natural formalization: deriving the Cantor-Bernstein Theorem
in ZF.
9. Mahlo P. 1908 Topologische Untersuchungen über Zerlegung in ebene und sphärische
Polygone. Dissertation, Halle.
10. Hilbert D. 1900 Mathematische Probleme. Nachrichten der Königlichen Gesellschaft der
Wissenschaften zu Göttingen 253–297.
11. Hilbert D. 1918 Axiomatisches Denken. Math. Ann. 78, 405–415. (doi:10.1007/BF01457115)
12. Gentzen G. 1934 Untersuchungen über das logische Schließen I, II. Math. Z. 39, 176–210,
405–431.
13. Gentzen G. 1936 Die Widerspruchsfreiheit der reinen Zahlentheorie. Math. Ann. 112, 493–565.
14. MacLane S. 1934 Abgekürzte Beweise im Logikkalkul. Dissertation, Göttingen.
15. MacLane S. 1935 A logical analysis of mathematical structure. Monist 45, 118–130.
(doi:10.5840/monist19354515)
16. Sieg W. 2018 In preparation. Proofs as objects.
17. Kanamori A. 2004 Zermelo and set theory. Bull. Symbol. Logic 10, 487–553.
(doi:10.2178/bsl/1102083759)
18. Banach S. 1924 Un théorème sur les transformations biunivoques. Fundam. Math. 6, 236–239.
(doi:10.4064/fm-6-1-236-239)
19. Deiser O. 2010 Introductory Note to 1901. In Collected Works/Gesammelte Werke, vol. 1 (eds
HD Ebbinghaus, C Fraser, A Kanamori), pp. 52–70. Berlin, Germany: Springer.
20. Doyle PG, Conway JH. 1994 Division by three. arXiv:math/0605779v1.
21. Bernstein F. 1905 Untersuchungen aus der Mengenlehre. Math. Ann. 61, 117–155. [This is
the publication of Bernstein’s dissertation from 1901. His proof of the Cantor-Bernstein
Theorem had been found earlier and published, with the appropriate acknowledgment,