
Undergraduate Texts in Mathematics

Editors
F. W. Gehring
P. R. Halmos
Advisory Board
C. DePrima
I. Herstein
J. Kiefer
Jerome Malitz

Introduction to
Mathematical Logic
Set Theory
Computable Functions
Model Theory

Springer-Verlag
New York Heidelberg Berlin
J. Malitz
Department of Mathematics
University of Colorado
Boulder, Colorado 80309
USA

Editorial Board

P.R. Halmos
Managing Editor
Indiana University
Department of Mathematics
Bloomington, Indiana 47401
USA

F.W. Gehring
University of Michigan
Department of Mathematics
Ann Arbor, Michigan 48104
USA

AMS Subject Classification: 02-01,04-01

With 2 Figures
Library of Congress Cataloging in Publication Data

Malitz, J.
Introduction to mathematical logic.

Bibliography: p.
Includes index.
1. Logic, Symbolic and mathematical. I. Title.
QA9.M265 511'.3 78-13588

All rights reserved.

No part of this book may be translated or reproduced in any form without written permission from Springer-Verlag.

© 1979 by Springer-Verlag New York Inc.


Softcover reprint of the hardcover 1st edition 1979

9 8 7 6 5 4 3 2 1

ISBN-13: 978-1-4613-9443-3    e-ISBN-13: 978-1-4613-9441-9


DOI: 10.1007/978-1-4613-9441-9
For Sue, Jed, and Seth
Contents

Preface ix
Glossary of Symbols xi

Part I: An Introduction to Set Theory


1.1 Introduction 1
1.2 Sets 1
1.3 Relations and Functions 6
1.4 Pairings 9
1.5 The Power Set 14
1.6 The Cantor-Bernstein Theorem 17
1.7 Algebraic and Transcendental Numbers 20
1.8 Orderings 21
1.9 The Axiom of Choice 27
1.10 Transfinite Numbers 31
1.11 Paradise Lost, Paradox Found (Axioms for Set Theory) 43
1.12 Declarations of Independence 51

Part II: An Introduction to Computability Theory


2.1 Introduction 59
2.2 Turing Machines 60
2.3 Demonstrating Computability without an Explicit Description of a Turing Machine 68
2.4 Machines for Composition, Recursion, and the "Least Operator" 79
2.5 Of Men and Machines 89
2.6 Non-computable Functions 90
2.7 Universal Machines 95
2.8 Machine Enumerability 100
2.9 An Alternate Definition of Computable Function 105
2.10 An Idealized Language 110
2.11 Definability in Arithmetic 118
2.12 The Decision Problem for Arithmetic 120
2.13 Axiomatizing Arithmetic 124
2.14 Some Directions in Current Research 129

Part III: An Introduction to Model Theory


3.1 Introduction 135
3.2 The First Order Predicate Calculus 136
3.3 Structures 138
3.4 Satisfaction and Truth 142
3.5 Normal Forms 150
3.6 The Compactness Theorem 158
3.7 Proof of the Compactness Theorem 162
3.8 The Löwenheim-Skolem Theorem 167
3.9 The Prefix Problem 174
3.10 Interpolation and Definability 180
3.11 Herbrand's Theorem 185
3.12 Axiomatizing the Validities of L 187
3.13 Some Recent Trends in Model Theory 190

Subject Index 195
Preface

This book is intended as an undergraduate senior level or beginning graduate level text for mathematical logic. There are virtually no prerequisites, although a familiarity with notions encountered in a beginning course in abstract algebra such as groups, rings, and fields will be useful in providing some motivation for the topics in Part III.
An attempt has been made to develop the beginning of each part slowly
and then to gradually quicken the pace and the complexity of the material.
Each part ends with a brief introduction to selected topics of current
interest.
The text is divided into three parts: one dealing with set theory, another
with computable function theory, and the last with model theory. Part III
relies heavily on the notation, concepts and results discussed in Part I and
to some extent on Part II. Parts I and II are independent of each other,
and each provides enough material for a one semester course.
The exercises cover a wide range of difficulty with an emphasis on more
routine problems in the earlier sections of each part in order to familiarize
the reader with the new notions and methods. The more difficult exercises
are accompanied by hints. In some cases significant theorems are devel-
oped step by step with hints in the problems. Such theorems are not used
later in the sequence.
The part dealing with set theory is intended to provide a notational and
conceptual framework for areas of mathematics outside of logic as well as
to introduce the student to those topics that are of particular interest to
those working in the foundations of set theory.
We hope that the part of the text devoted to computable functions will
be of interest to those who intend to work with real world computers.

We believe that the notation, methodology, and results of elementary logic should be a part of a general mathematics program and are of value in a wide variety of disciplines within mathematics and outside of mathematics.

Boulder, Colorado                                    J. MALITZ
March 1979
Glossary of Symbols

{ ... } 2
I 2
N 2
N+ 2
Q 2
Q+ 2
R 2
R+ 2
{x : ...} 2
∅ 2
∈ 2
⊆, ⊇, ⊂, ⊃ 2, 3
∪ 3
∪X 4
∪_{i∈I} A_i 4
∩ 4
∩X 4
B - A 4
P(X) 5
A × B 6
[B]_k 6
Dom R 7
Ran R 7
1-1 7
f : A → B 7
^AB 7
f[C] 7
f⁻¹ 7
f↾C 7
f ∘ g 7
∼ 9
≼, ≺ 15
(R,<), (Q,<), (I,<) 22
(N,<) 23
Ord 33
≈ 34
Card 36
c(x) 37
ZF 38
ZFC 46
⊨ 51
M(t) 61
Sum 63
C_{k,d} 63
P_{k,t} 63
Pred 63
Prod_n 69
Mult 70
Pow 71
Diff 71
m ∸ n 71
∃x<y 75
∀x<y 75
P(H,x) 75
Prime 75
Prim 75
Exp 76
Max 77
compress 80
copy_k 83
shift right 84
shift left 84
erase 84
#_k 91
T^S 95
STP 95
decode 95
code 98
Exp 98
RR 98
RC 98
NP 99
NS 99
NST 99
T 99
STP 99
Row 100
Mach 101
In 101
Halt 101
∀ 103
∃ 103
Rec 107
Rem 108
L 111 (see also 136)
∨, ∧, ¬ 111
∀, ∃ 111
⊕, ⊙ 111
[, ] 111
Trm 111
t(z) 112
Fm 113
⊨ 114
⊢ 124
Cons_Σ 129
Prf_Σ 129
P 133
NP 133
τ 136
Fm_τ 137
≅ 140
Th 144
Mod 144
D(𝔄) 147
∪_{α<k} 𝔄_α 170
S(Σ,X) 177
∀∃-formula 178
Th_∀∃ K 178
PART I
An Introduction to Set Theory

1.1 Introduction
Through the centuries mathematicians and philosophers have wondered if
size comparisons between infinite collections of objects can be made in a
meaningful way. Does it make sense to ask if there are as many even
numbers as odd numbers? What does it mean to say that one infinite
collection has greater magnitude than another? Can one speak of different
sizes of infinity?
Before the last three decades of the nineteenth century, mathematicians
and philosophers generally agreed that such notions are not meaningful.
But then in the early 1870s, a German mathematician, Georg Cantor
(1845-1918), in a remarkable series of papers, formulated a theory in
which size comparisons between infinite collections could be made. This
theory became known as set theory. As with many radical departures from
traditional approaches, his ideas were at first violently attacked but now
have come to be regarded as a useful and basic part of modern mathematics. This chapter is an introduction to set theory.

1.2 Sets
We use the term set to refer to any collection of objects. The objects
composing a set will be referred to as the members or elements of the set.
There are various ways to denote sets. One approach is to list the elements
of the set in some way and enclose this list in braces. For example, using
this convention, the set consisting of the numbers 1, 2, and 3 is denoted by {1,2,3}. A set is completely determined by its members, and so the order in which we list the elements is immaterial. Thus {1,2,3} = {2,3,1} = {3,2,1} = {1,3,2} = {2,1,3} = {3,1,2}.
A set may have so many members belonging to it that it is impractical
or impossible to use the above method of notation, and so other notational
devices must be used. For example, instead of using the method described
above to denote the set of all positive integers less than or equal to 10^10, we might use {1, 2, 3, ..., 10^10} to denote this set. The three dots indicate that some members of the set being described have not been listed explicitly. Of course, in using this notational device it is important to include enough members of the list before and after the three dots so that the reader will know which elements belong to the set and which do not. For example, the set of even integers between -100 and 100 inclusive should not be denoted by {-100, ..., 100} but by something like {-100, -98, ..., -4, -2, 0, 2, 4, ..., 98, 100} or by {0, 2, -2, 4, -4, ..., 98, -98, 100, -100}.
Again, the order in which the elements are listed is arbitrary as long as the
reader understands which elements of the set have not been mentioned
explicitly.
The method for denoting sets using the three dots abbreviation can also
be used for infinite sets. For example, the set of even integers can be
denoted by {0, 2, -2, 4, -4, ...} or by {..., -6, -4, -2, 0, 2, 4, 6, ...}. We will use N to denote the set {0, 1, 2, 3, ...} of natural numbers, while N+ will denote {1, 2, 3, ...}. I will denote the set of integers {0, 1, -1, 2, -2, 3, -3, ...}. Q will denote the set of rationals, Q+ the set of positive rationals, R the set of real numbers, and R+ the set of positive reals.
If a set consists of exactly those objects satisfying a certain condition,
say P, we may denote it by {x:P(x)}, which is read: "the set of all x such
that P(x) is true." For example, {x : 3 ≤ x ≤ 8 and x is a rational number} is the set of rationals between 3 and 8 inclusive. Notice that x merely represents a typical object in the set under consideration, and any letter will serve just as well in place of x. Thus {1,2,3} = {x : x is an integer and 1 ≤ x ≤ 3} = {y : y is an integer and 1 ≤ y ≤ 3} = {x : x is an integer and 0 < x < 4}. Notice that the last two conditions are different but define the same set.
We consider as a set the collection which has no members. We call this
set the null set and denote it by ∅, rather than { }.
A set may contain other sets as elements. For example, the set
{1, {2,3}} is the set whose elements are the number 1 and the set {2,3}. It is important to understand that this set has only two elements, namely 1 and {2,3}. 2 is an element of {2,3}, but 2 is not an element of {1, {2,3}}.
We write x ∈ A when x is an element of A, and x ∉ A otherwise.
Let A be a set. We say that a set B is a subset of A if each element of B is an element of A. If B is a subset of A we write B ⊆ A or A ⊇ B. If B ⊆ A
and, in addition, A ≠ B, we write B ⊂ A or A ⊃ B and say that B is a proper subset of A. So {1, {2,3}} ⊂ {1, {2,3}, 4} but {1, {2,3}} ⊄ {1, {2}, {3}}.
Notice that A ⊆ A and ∅ ⊆ A for every set A (since ∅ has no elements, it is true that every element of ∅ is an element of A). Another trivial observation is that if A ⊆ B and B ⊆ C, then A ⊆ C.
We note that if A and B are sets such that A ⊆ B and B ⊆ A, then A = B. For if x ∈ A, then since A ⊆ B, x ∈ B. Similarly, if y ∈ B we have y ∈ A. Thus A and B contain precisely the same elements and so are equal. This will be used frequently in what follows; two sets A and B will be shown to be equal by proving both A ⊆ B and B ⊆ A.
Next we consider ways of combining sets to get new sets.
The union of A and B, denoted by A ∪ B, is the set whose elements belong either to A or to B. In other words A ∪ B = {x : x ∈ A or x ∈ B}. (In mathematics we use the word "or" in the inclusive sense. So when we say that an object is in A or in B we include the case where the object is in both A and B.) For example

{1,2} ∪ {3,4} = {1,2,3,4},
{a,b,c} ∪ {a,c,d} = {a,b,c,d},
{x : x is an even integer} ∪ {x : x is an odd integer} = {x : x is an integer}.
Some of the elementary properties of the union operation are summarized
below.

Theorem 2.1.
i. A ⊆ B implies that A ∪ B = B.
ii. A ∪ B = B ∪ A.
iii. A ∪ (B ∪ C) = (A ∪ B) ∪ C.

The proof of the theorem is very easy, and we leave all parts but iii as
exercises.
To prove part iii, first suppose that x ∈ A ∪ (B ∪ C). Then either x ∈ A or x ∈ B ∪ C. If x ∈ A, then x ∈ A ∪ B, and so x ∈ (A ∪ B) ∪ C. If x ∈ B ∪ C, then x ∈ B or x ∈ C. If x ∈ B, then x ∈ A ∪ B, and so x ∈ (A ∪ B) ∪ C. If x ∈ C, then x ∈ (A ∪ B) ∪ C. Hence we have shown that whenever x ∈ A ∪ (B ∪ C), then x ∈ (A ∪ B) ∪ C; in other words, we have shown that A ∪ (B ∪ C) ⊆ (A ∪ B) ∪ C. In the same way one proves that A ∪ (B ∪ C) ⊇ (A ∪ B) ∪ C (the reader should check this). Hence A ∪ (B ∪ C) = (A ∪ B) ∪ C as claimed.
Because of part iii, no confusion can arise if parentheses are omitted from (A ∪ B) ∪ C and we write A ∪ B ∪ C.
It should be clear what is meant by A_1 ∪ A_2 ∪ ... ∪ A_n, namely, {x : x ∈ A_1 or x ∈ A_2 or ... or x ∈ A_n}. An alternative notation for this set is
∪{A_i : i ∈ N+ and i ≤ n}. In general, if X is a non-empty set of sets, then ∪X = {y : there is a Y ∈ X such that y ∈ Y}. This is called the union over X. So if X = {A_1, A_2, ..., A_n}, then ∪X = A_1 ∪ A_2 ∪ ... ∪ A_n. For example, if A_i = {x : x = i/n for some n ∈ N+} (so that A_5 = {5/1, 5/2, 5/3, 5/4, ...}), then ∪{A_i : i ∈ N+} = Q+. Instead of writing ∪{A_i : i ∈ I} we may write ∪_{i∈I} A_i.
The intersection of A and B, A ∩ B, is the set whose elements are simultaneously elements of A and of B. In other words A ∩ B = {x : x ∈ A and x ∈ B}. For example {1,3,9} ∩ {1,5,9} = {1,9} and {x : x ∈ R+ and x < 5} ∩ {x : x ∈ Q and x ≥ 3} = {x : x ∈ Q and 3 ≤ x < 5}.

Theorem 2.1. (Cont.)
i'. A ⊆ B implies that A ∩ B = A.
ii'. A ∩ B = B ∩ A.
iii'. A ∩ (B ∩ C) = (A ∩ B) ∩ C.

The proofs are very easy and left as exercises.


As in the case of the union operation, the intersection operation generalizes to the intersection over a set of sets. Letting X be a non-empty set of sets, we define the intersection over X, ∩X, to be {y : for all Y ∈ X, y ∈ Y}. So if X = {A_1, A_2, ..., A_n}, then ∩X = A_1 ∩ ... ∩ A_n. (As before, we use iii' to justify our omission of parentheses in A_1 ∩ ... ∩ A_n.) As another example let A_n = {x : x ∈ R and |x| < 1/n}. Let X = {A_n : n ∈ N+}. Then ∩X = {0}.
We say that A and B are disjoint if A n B = 0. Similarly, X is a set of
pairwise disjoint sets if for all A, B E X, either A = B or A n B = 0.
We next state some easily proved facts relating union and intersection.
The proofs are left for the exercises.

Theorem 2.1. (Cont.)

iv. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), and more generally (∪X) ∩ (∪Y) = ∪{A ∩ B : A ∈ X and B ∈ Y}.
iv'. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), and more generally (∩X) ∪ (∩Y) = ∩{A ∪ B : A ∈ X and B ∈ Y}.

The difference of A from B, denoted B - A, is the set of elements in B but not in A; in other words we define B - A = {x : x ∈ B and x ∉ A}. For example, Q+ - {x : x ∈ R and x ≤ 3} is the set of positive rationals greater than 3. As another example, {1,4,9} - {2,4,8} = {1,9}. B - A is also called the complement of A in B.
We next state several relations between the above notions.
Theorem 2.1. (Cont.)

v. A ⊆ B implies B - (B - A) = A.
vi. C ⊇ B ⊇ A implies C - A ⊇ C - B.
vii. C - (A ∪ B) = (C - A) ∩ (C - B), and more generally C - (∪X) = ∩{C - A : A ∈ X}.
viii. C - (A ∩ B) = (C - A) ∪ (C - B), and more generally C - (∩X) = ∪{C - A : A ∈ X}.

We prove vii, leaving the proof of the other clauses for the exercises. Here and throughout the text we use 'iff' to abbreviate 'if and only if'.

x ∈ C - (∪X) iff
x ∈ C and x ∉ A for all A ∈ X iff
x ∈ C - A for all A ∈ X iff
x ∈ ∩{C - A : A ∈ X}.

In other words, C - (∪X) and ∩{C - A : A ∈ X} have the same members and so are identical, as claimed in vii.
Clauses vii and viii are called De Morgan's rules.
We next define the power set, P(X), of a set X. This is the set of all subsets of X, i.e., P(X) is defined as {Y : Y ⊆ X}.
For example, if X = {1,2,3}, then P(X) = {∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}. Clearly, we always have ∅ ∈ P(X) and X ∈ P(X). Elementary properties of the power set operation will be found in Exercise 7 below.
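The finite set operations of this section are easy to experiment with; the following Python sketch (the helper `power_set` and the sample sets are ours, for illustration only) builds P({1,2,3}) and checks De Morgan's rules vii and viii on particular finite sets.

```python
from itertools import combinations

def power_set(X):
    """P(X): the set of all subsets of X, each subset as a frozenset."""
    elems = list(X)
    return {frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)}

P = power_set({1, 2, 3})      # should have 2^3 = 8 elements

# De Morgan's rules (clauses vii and viii) on sample finite sets:
C, A, B = {1, 2, 3, 4, 5}, {1, 2}, {2, 3}
vii_holds = (C - (A | B)) == (C - A) & (C - B)
viii_holds = (C - (A & B)) == (C - A) | (C - B)
```

A check on one triple of sets is of course only an illustration, not a proof; the proofs are the iff-chains given above.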

EXERCISES FOR §1.2

1. How many elements are there in each of the following sets?
{1, 2, ∅}, {1, {1, ∅}}, {∅}, {1}, {{1}}.
2. Which of the following are true?
∅ ∈ ∅, ∅ ⊆ ∅, {1} ∈ {1,2}, 1 ∈ {{1},2}.
3. Show that
(a) if A ⊆ C and B ⊆ C, then A ∪ B ⊆ C, and
(b) if C ⊆ A and C ⊆ B, then C ⊆ A ∩ B.
4. Supply the missing proofs for Theorem 2.1.
5. List the elements of P({1,2,3,4}).
6. List the elements of P(P(P(∅))).
7. Show that
(a) A ⊆ B implies P(A) ⊆ P(B);
(b) P(A ∪ B) ⊇ P(A) ∪ P(B), and more generally P(∪X) ⊇ ∪{P(A) : A ∈ X};
(c) P(A ∩ B) ⊆ P(A) ∩ P(B), and more generally P(∩X) ⊆ ∩{P(A) : A ∈ X}.
When does equality hold in (b) and in (c)?

1.3 Relations and Functions


The aim of this section is to supply definitions of 'relation', 'function' and related notions in enough generality to be of service throughout the book.
These notions ultimately rest on that of the ordered pair (a,b). Although 'ordered pair' can be defined in terms of the membership relation, as can all the notions of classical mathematics, we will not do this until later. For the time being we shall take the ordered pair (a,b) to be an undefined notion with the property that (a,b) = (c,d) if and only if a = c and b = d. For example (3,8) ≠ (8,3) (although {3,8} = {8,3}). Similarly, the only property of n-tuples that we shall use is that (a_1, ..., a_n) = (b_1, ..., b_n) iff a_i = b_i for all i ≤ n.
The Cartesian product of A and B, written A × B, is {(x,y) : x ∈ A and y ∈ B}. More generally, we define A_0 × A_1 × ... × A_n to be {(a_0, ..., a_n) : a_i ∈ A_i for each i ∈ {0, ..., n}}. For example, (1, 1/2) ∈ N × Q, but (1/2, 1) ∉ N × Q. If for each i, j ∈ {0, ..., k-1} we have B = A_i = A_j, then we abbreviate A_0 × ... × A_{k-1} by [B]_k. For example, [R]_n is Euclidean n-space.
A binary relation is a set of ordered pairs. For example {(x,y) : x < y and x ∈ N, y ∈ N} is a binary relation. So is {(3,4), (1,1)}, as well as the circle in Euclidean 2-space of radius 3 with center (4,π), namely {(x,y) : (x-4)^2 + (y-π)^2 = 3^2}.
The domain of a binary relation R, sometimes written Dom R, is {x : there is a y such that (x,y) ∈ R}; the range of R, Ran R, is {y : for some x, (x,y) ∈ R}. The field of R is Dom R ∪ Ran R. In the first of the three examples above we have Dom R = N, Ran R = N+; in the second Dom R = {3,1}, Ran R = {4,1}; and in the third Dom R = {x : 1 ≤ x ≤ 7}, Ran R = {y : π-3 ≤ y ≤ π+3}. One frequently writes xRy instead of (x,y) ∈ R, and xR̸y if (x,y) ∉ R.
More generally, a k-relation is a set of ordered k-tuples (so a 2-relation is a binary relation). As an example of a 3-relation we have {(x,y,z) : (x,y,z) ∈ [N]_3 and z is the least common multiple of x and y}. Another example is {(x,y,z) : (x,y,z) ∈ [R]_3 and x + y = z}. We do not define the domain or range of a k-relation when k ≠ 2.
The set of all primes is an example of a 1-relation, as is the set of all multiples of π.
A function f is a 2-relation such that for every x there is at most one y for which (x,y) ∈ f. In other words, if (x,y) ∈ f and (x,z) ∈ f, then y = z. When f is a function, one usually writes f(x) = y instead of (x,y) ∈ f, and says that y is the value of f at x.
For example, {(1,3), (3,1), (π,1)} is a function, but {(1,3), (3,1), (1,π)} is not. {(x,y) : x = y^3 and x ∈ N and y ∈ N} is a function, but {(x,y) : x = y^2 and x ∈ N and y ∈ I} is not.
A function f is one to one, abbreviated 1-1, if {(y,x) : f(x) = y} is a function, i.e., if whenever f(x) = y and f(z) = y we have x = z. In our examples of functions above, the second is 1-1 but the first is not.
We say that a function f is on A if Dom f = A; into B if Ran f ⊆ B; onto B if Ran f = B. If f is a function on A into B, we may write f : A → B. The notation f : A →(1-1) B adds the condition that f is 1-1, while f : A →(onto) B adds the condition that f is onto B. The set of all functions on A into B is denoted by ^AB, i.e., ^AB = {f : f : A → B}.
By f[C] we mean {y : for some x ∈ C, f(x) = y}. Notice that no restriction is placed on C; C need not be included in Dom f. For example, if f = {(x,y) : y = x^2 and x ∈ N} and C = {x : x < π and x ∈ R}, then f[C] = {0, 1, 4, 9}.
Define f⁻¹[Y] to be {x : f(x) ∈ Y}. f⁻¹[Y] is defined even if Y ⊈ Ran f. So if f(x) = 3x + 2 for all x ∈ R+, then f⁻¹[{y : 0 < y < 11}] = {x : 0 < x < 3}. As another example, if f(x) = x^2 for each x ∈ R, then f⁻¹[{y}] = {-√y, √y} for each y ∈ R+. If f : A →(1-1, onto) B, then the set {(b,a) : f(a) = b} is a 1-1 function on B onto A which we call f inverse, written f⁻¹. Notice that f(f⁻¹(b)) = b and f⁻¹(f(a)) = a for all a ∈ A and all b ∈ B.
The restriction of f to C, abbreviated f↾C, is the function g with domain C ∩ Dom f such that for each x ∈ C ∩ Dom f we have g(x) = f(x). In other words g = {(x,y) : x ∈ C ∩ Dom f and y = f(x)}.
Notice that C is arbitrary and need not be a subset of Dom f. For example, if f = {(x,y) : y = x^2 and x ∈ N} and C = {x : x < π and x ∈ R}, then f↾C = {(0,0), (1,1), (2,4), (3,9)}.
If g = f↾C and C ⊆ Dom f, then we say that f is an extension of g.
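For finite pieces of such functions, the image, preimage, and restriction are directly computable. A small Python sketch (the helper names are ours, not the text's), using the running example f = {(x,y) : y = x^2 and x ∈ N} with C playing the role of {x : x < π}:

```python
# A finite piece of f = {(x, y) : y = x^2 and x in N}.
f = {x: x * x for x in range(10)}

def image(f, C):
    """f[C] = {y : f(x) = y for some x in C}; C need not lie inside Dom f."""
    return {f[x] for x in C if x in f}

def preimage(f, Y):
    """f inverse of Y, {x : f(x) in Y}; defined even if Y is not included in Ran f."""
    return {x for x in f if f[x] in Y}

def restrict(f, C):
    """f restricted to C: the function with domain C intersect Dom f."""
    return {x: f[x] for x in f if x in C}

C = {x for x in range(10) if x < 3.14159}   # stands in for {x : x < pi}
```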
Let f ∈ ^BC and let g ∈ ^AB. The composite of f and g, written f∘g, is that element of ^AC defined by (f∘g)(x) = f(g(x)) for all x ∈ A.

Theorem 3.1. Let f ∈ ^BC, g ∈ ^AB. Then

i. if f and g are 1-1, then so is f∘g.
ii. if f is onto C and g is onto B, then f∘g is onto C.

PROOF OF i. Suppose f and g are 1-1, and (f∘g)(a) = (f∘g)(b). Then f(g(a)) = f(g(b)). Since f is 1-1, g(a) = g(b). Since g is 1-1, a = b. □

We leave the proof of part ii as an exercise (see Exercise 15).
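Theorem 3.1 can be checked on small finite functions, representing each function as a Python dict (the particular f and g below are our own sample choices):

```python
# f : {1,2,3} -> {'a','b','c'} and g : {'x','y','z'} -> {1,2,3},
# both 1-1 and onto, represented as dicts.
f = {1: 'a', 2: 'b', 3: 'c'}
g = {'x': 1, 'y': 2, 'z': 3}

fog = {x: f[g[x]] for x in g}          # (f o g)(x) = f(g(x))

def one_to_one(h):
    """A dict-function is 1-1 iff no value repeats."""
    return len(set(h.values())) == len(h)
```

As the theorem predicts, the composite is again 1-1 and onto the range of f.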


EXERCISES FOR §3

1. What are the elements of {1,3} × {1, w, 4}?
2. If A has m elements and B has n elements, how many elements does A × B have?
3. Prove that if A_i ⊆ B_i for each i ∈ {1, 2, ..., k}, then A_1 × ... × A_k ⊆ B_1 × ... × B_k.
4. Show that the relation < on Q is not a set of the form A × B.
5. If the following statement is true, prove it; if not, give a counterexample:
If A ⊇ B ∪ C, then (A × A) - (B × C) = (A - B) × (A - C).
6. Prove or disprove the following statement:
(A_1 × A_2) ∪ (B_1 × B_2) = (A_1 ∪ B_1) × (A_2 ∪ B_2).
7. Prove or disprove the following statement:
(A_1 × A_2) ∩ (B_1 × B_2) = (A_1 ∩ B_1) × (A_2 ∩ B_2).
8. For each relation R below, find Dom R, Ran R, and the field of R:
(a) R = {(1,4), (w,3), (w,1), (1,w)}.
(b) R = {(x,y) : |x| + |y| = 1}.
(c) R = {p : p ∈ [R]_2 and |p - (1,0)| + |p - (-1,0)| = 3} [where |(x_1,y_1) - (x_2,y_2)| = √((x_1 - x_2)^2 + (y_1 - y_2)^2)].
9. Which of the following are functions; which of the functions are 1-1?
(a) {(x,y) : x > 0, x^2 + y^2 = 1, and x ∈ R, y ∈ R}.
(b) {(x,y) : y > 0, x^2 + y^2 = 1, and x ∈ R, y ∈ R}.
(c) {(x,y) : x > 0, y > 0, x^2 + y^2 = 1, and x ∈ R, y ∈ R}.
(d) {(x,y,z) : x, y, z ∈ N+ and z = x^y}.
(e) {(x,y,z) : x, y, z ∈ N+ and z = 2x + 3y}.
10. Prove:
(a) f[∪X] = ∪{f[A] : A ∈ X}.
(b) f[∩X] ⊆ ∩{f[A] : A ∈ X}.
Show that equality need not hold in (b) by describing sets A and B and a function f such that f[A ∩ B] ≠ f[A] ∩ f[B].
11. Show that
(a) (f↾C)↾D = f↾(C ∩ D);
(b) ∩{f↾C : C ∈ K} = f↾∩{C : C ∈ K}.
12. Suppose A has n elements and B has m elements. How many elements are there in ^AB? Give a proof.
13. Prove:
(a) f⁻¹[∪X] = ∪{f⁻¹[A] : A ∈ X}.
(b) f⁻¹[∩X] = ∩{f⁻¹[A] : A ∈ X}.
(c) f⁻¹[A - B] ⊇ f⁻¹[A] - f⁻¹[B].
Show that equality need not hold in (c).
14. Find functions f and g such that Ran f = Ran g = Dom f = Dom g and f∘g ≠ g∘f.
15. Show that if f : B →(onto) C and g : A →(onto) B, then f∘g : A →(onto) C.

1.4 Pairings
Suppose that A and B are sets and f is a 1-1 function from A onto B. f can be thought of as an association or pairing of the elements of A with those of B such that each element x of A has a unique associate f(x) in B and, conversely, each element y in B has a unique associate f⁻¹(y) in A. Clearly, if A has ten elements, then so does B; if A has one million elements, then so does B. Indeed, the existence of such a pairing f assures us that if A has n elements, where n ∈ N, then B also has n elements. But what if A is infinite? It seems natural to use the existence of such an f to assert that A and B have equal magnitude or are equinumerous even when both sets are infinite. It is this simple idea that underlies the theory of infinite sets. We now discuss this idea in more detail and consider some of the surprising consequences.
Let A and B be arbitrary sets. A pairing between A and B is a 1-1 function on A onto B.

EXAMPLE 4.1. Let A = {1,2,3}, B = {4,5,6}. There are several pairings of A with B. One such is {(1,4), (2,5), (3,6)}; another is {(1,6), (2,4), (3,5)}.

EXAMPLE 4.2. Let A = {1,3,5,7,9,11,...}, B = {0,2,4,6,8,...}. One pairing between A and B is defined by P(n) = n - 1 for all odd n ∈ N+, i.e., P = {(1,0), (3,2), (5,4), (7,6), ...}. Clearly P is 1-1 and onto.

EXAMPLE 4.3. Let A = N, B = {0, -1, -2, -3, ...}. One pairing of A and B is given by P(n) = -n for all n ∈ N, i.e., P = {(0,0), (1,-1), (2,-2), ...}.

EXAMPLE 4.4. Let A = Q+ = {p/q : p, q ∈ N+, p/q in lowest terms}. Let B = N+. Consider the function P from A into B defined by P(p/q) = p + q. This is not a pairing between A and B, since P is neither 1-1 nor onto (verify this). This does not show that no pairing exists between A and B, but only that this particular function is not a pairing. We shall return to the question of whether or not there is a pairing between this A and B later in this section.

EXAMPLE 4.5. Let A = {1,2,3}, B = {1,2}. The reader can easily list all functions from A into B and check that no pairing is possible between A and B.

Definition. Let A and B be sets. We say that A is equinumerous to B if there is a pairing between A and B. If A and B are equinumerous we write A ∼ B.

This definition fits well with our intuition, and yet some of its con-
sequences at first glance seem bizarre. For example, a set A may properly
contain a set B and still be equinumerous to B. Of course this can not
happen if A is finite, but it can happen when A is infinite.
To illustrate, let B = {n^2 : n ∈ N}. Then N ⊃ B. Yet {(x, x^2) : x ∈ N} is a pairing between N and B, so N and B are equinumerous. At first this seems so strange that one balks at accepting the above definition of equinumerous. In fact, for centuries this very pairing was used as evidence that it is nonsensical to compare magnitudes of infinite sets.
It was Cantor who in 1874 took the bold position that the above
definition of equinumerous is the "correct" mathematical definition in
spite of such examples. Moreover, he proceeded to show that this defini-
tion leads to a significant and beautiful mathematical theory.
Why did mathematicians come to accept Cantor's views? There are
several compelling reasons. First of all, Cantor's ideas were found to be
applicable to many problems in several branches of mathematics outside
of set theory, problems that did not seem on the surface to be concerned
with comparisons of infinite magnitudes. In §1.7 we give one of Cantor's proofs of a striking fact about the real numbers which shed light on a major problem of the nineteenth century. Furthermore, set theory was found to
provide a uniform notational and conceptual framework within which all
of mathematics can be expressed. We touched on this aspect briefly in §3,
but much more will be said in the next chapter. Later we will give an
axiomatization of set theory which provides an axiomatic framework for
all of classical mathematics. Esthetically, Cantor's proofs are among the
most beautiful in mathematics, and there is no question that mathematics
has been considerably enriched by his theories and methods.

Definition. A set A is countable if A ∼ B for some B ⊆ N. Sometimes the term denumerable is used for A when A ∼ N. If for some n ∈ N, A ∼ {0, 1, ..., n-1}, then A is finite. If no such n exists, then A is infinite.

We now give more examples of sets that are equinumerous with N.

EXAMPLE 4.6. A pairing P between N and I is given by

N: 0  1   2  3   4  5   6  ...
I: 0  1  -1  2  -2  3  -3  ...

That is, P = {(0,0), (1,1), (2,-1), ...}. P can also be described by the following equations:

P(n) = -n/2 if n is even,
P(n) = (n+1)/2 if n is odd.
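The two equations defining P translate directly into code; a quick Python check (illustrative only, with our own names):

```python
def P(n):
    """The pairing between N and the integers I:
    even n maps to -n/2, odd n maps to (n+1)/2."""
    return -(n // 2) if n % 2 == 0 else (n + 1) // 2

first_seven = [P(n) for n in range(7)]
```

Evaluating P on an initial segment of N reproduces the table above and shows no integer is hit twice.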

EXAMPLE 4.7. Let A = N and B = N+. Define P : N → N+ by P(n) = n + 1 for all n ∈ N. Since P is 1-1 and onto, N ∼ N+.

We have used the word 'equinumerous' for the relation '∼' so as to imply an analogy between 'equinumerous' as the word is used informally when referring to finite sets having the same number of elements and the technical meaning given to the word in the present context. This analogy would be weak indeed without the following.

Theorem 4.8. For all sets A, B, and C:

i. A ∼ A.
ii. If A ∼ B, then B ∼ A.
iii. If A ∼ B and B ∼ C, then A ∼ C.

PROOF: i. Define P : A → A by P(a) = a for all a ∈ A. Then P is a pairing between A and A, so A ∼ A.
ii. Suppose A ∼ B. Then there is a P such that P : A →(1-1, onto) B. Clearly P⁻¹ : B →(1-1, onto) A, and so B ∼ A.
iii. Suppose P : A →(1-1, onto) B and Q : B →(1-1, onto) C. It is immediate from Theorem 3.1 that Q∘P is a pairing between A and C. □
We have already observed that {n^2 : n ∈ N} is a proper subset of N which is equinumerous to N. This is a specific instance of the following more general statement.

Theorem 4.9. If A ⊆ N and A is infinite, then N ∼ A.

PROOF: For each non-empty subset Y of N, let f(Y) be the smallest element of Y [so for all x ∈ Y we have f(Y) ≤ x]. Now define

P(0) = f(A),
P(1) = f(A - {P(0)}),
P(2) = f(A - {P(0), P(1)}),
P(3) = f(A - {P(0), P(1), P(2)}),

and so on. In general we have P(n) = f(A - {P(j) : j < n}). Notice that for each n, P(n) is defined, since A - {P(j) : j < n} is non-empty [in fact, since A is infinite and {P(j) : j < n} is finite, A - {P(j) : j < n} is infinite]. Clearly, if m > n, then P(n) ∉ A - {P(i) : i < m}, and so P(m) > P(n). Hence P is 1-1. We need only show that P is onto. If not, let m* be the least element of A - Ran P. Let X = {n : P(n) < m*}. X is finite, since P is 1-1. Hence there is an n* such that n* > n for every n ∈ X. Then P(n*) = f(A - {P(j) : j < n*}) = m*. So m* ∈ Ran P, a contradiction. □
As an application we see that the set of primes is equinumerous to N.
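When membership in A is decidable, the enumeration P of Theorem 4.9 can be generated by scanning N in increasing order: each element found is then automatically the least one not yet listed. A Python sketch (the names and the choice A = {n^2 : n ∈ N} are ours):

```python
from itertools import count

def enumerate_increasing(in_A, n_terms):
    """List P(0), ..., P(n_terms - 1) for A = {k in N : in_A(k)}.
    Scanning N in increasing order yields the least unused element each time."""
    out = []
    for k in count():
        if in_A(k):
            out.append(k)
            if len(out) == n_terms:
                return out

def is_square(k):
    # membership test for A = {n^2 : n in N}
    return int(k ** 0.5) ** 2 == k

first_five = enumerate_increasing(is_square, 5)
```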
Corollary 4.10. A countable set is finite or denumerable.

Theorem 4.11. The following statements are equivalent:

i. f : N →(onto) A for some f.
ii. Either A is finite and non-empty, or N ∼ A.
PROOF THAT i IMPLIES ii. Let f : N →(onto) A. By the preceding theorem, it is enough to find a pairing P between A and a subset of N. For each a ∈ A let P(a) be the least n ∈ N such that f(n) = a. Clearly P is 1-1 on A into N, as needed. □
PROOF THAT ii IMPLIES i. If N ∼ A then there is a pairing P between N and A which we can take as the f of clause i. If A is finite then there is an n and a g such that g : {0, 1, ..., n-1} →(1-1, onto) A. Extend g to f : N → A by defining f(x) = g(x) if x ∈ {0, 1, ..., n-1} and f(m) = g(0) if m ≥ n. □
Recall that a number n ∈ N+ is a prime if n > 1 and whenever k·l = n with k, l ∈ N+ then either k = 1 or l = 1. For example, 6 = 3·2 and so is not prime. But 2, 3, 5, 7, 11, 13, 17, 19, 23 are primes. One of the most basic theorems of arithmetic is the Prime Factorization Theorem. We shall use it frequently and so we state it below for easy reference. A proof may be found in almost any text on elementary algebra.

Theorem (Prime Factorization Theorem). For each x ∈ N+, x > 1, there is exactly one sequence k, n_1, n_2, ..., n_k of elements of N with n_k ≠ 0 such that x = 2^{n_1}·3^{n_2}· ... ·p_k^{n_k}, where p_i is the i-th prime (so p_1 = 2, p_2 = 3, p_3 = 5, p_4 = 7, p_5 = 11, etc.).

Theorem 4.12. ∪_{k∈N+} [N]_k ∼ N.

PROOF: Let x ∈ ∪_{k∈N+} [N]_k, say x = (n_1, ..., n_m) ∈ [N]_m. Define f(x) = 2^{n_1+1}·3^{n_2+1}· ... ·p_m^{n_m+1}. Then f : ∪_{k∈N+} [N]_k → N, and by the prime factorization theorem f is 1-1. Now apply Corollary 4.10. □
Of course with f as above, f↾[N]_k is 1-1 on [N]_k into N, so by Corollary 4.10, [N]_k ∼ N for each k ∈ N+. (Another proof of this is found in Exercise 8.)
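The coding used in this proof is computable; here is a Python sketch (helper names ours) of the map f(n_1, ..., n_m) = 2^{n_1+1}·3^{n_2+1}· ... ·p_m^{n_m+1}:

```python
def primes():
    """Generate p_1 = 2, p_2 = 3, p_3 = 5, ... by trial division."""
    found = []
    k = 2
    while True:
        if all(k % p for p in found):
            found.append(k)
            yield k
        k += 1

def code(seq):
    """f(n_1, ..., n_m) = 2^(n_1 + 1) * 3^(n_2 + 1) * ... * p_m^(n_m + 1)."""
    x = 1
    for n, p in zip(seq, primes()):
        x *= p ** (n + 1)
    return x
```

Because exponents are shifted by 1, tuples of different lengths never collide: by unique factorization, the code determines both m and each n_i.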
The existence of a pairing between N × N and N (different from those above) can be seen pictorially as follows:

(0,0)  (0,1)  (0,2)  (0,3) ...
(1,0)  (1,1)  (1,2)  (1,3) ...
(2,0)  (2,1)  (2,2)  (2,3) ...
(3,0)  (3,1)  (3,2)  (3,3) ...

The enumeration runs through this array diagonal by diagonal, alternating direction; this is the ordering of N × N imposed by the pairing with N:

P(0,0) = 0, P(0,1) = 1, P(1,0) = 2,
P(2,0) = 3, P(1,1) = 4, P(0,2) = 5, ....
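The zigzag enumeration can be generated diagonal by diagonal; a Python sketch (function name ours):

```python
def zigzag(n_terms):
    """Enumerate N x N along the diagonals i + j = d, alternating direction,
    matching the picture: (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ..."""
    out = []
    d = 0
    while len(out) < n_terms:
        diag = [(i, d - i) for i in range(d + 1)]
        out.extend(reversed(diag) if d % 2 == 0 else diag)
        d += 1
    return out[:n_terms]
```

Every pair appears exactly once: diagonal d contributes the d + 1 pairs with coordinate sum d, and no diagonal is revisited.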

Theorem 4.13. Suppose that A_n is countable for each n ∈ N. Then ⋃_{n∈N} A_n
is countable.

PROOF: The hypothesis asserts that for each n ∈ N there is a function
f: A_n → N (1-1). For each n we choose one such f; call it f_n. (In general there are
infinitely many such f's; cf. Exercise 1 of this section.) For each
x ∈ ⋃_{n∈N} A_n, let k(x) be the smallest j such that x ∈ A_j. Define
F(x) = 2^(k(x)) · 3^(f_{k(x)}(x)). Clearly F: ⋃_{n∈N} A_n → N. We claim that F is 1-1. For suppose
that F(x) = 2^(k(x)) · 3^(f_{k(x)}(x)), F(y) = 2^(k(y)) · 3^(f_{k(y)}(y)), and F(x) = F(y). By the Prime
Factorization Theorem, k(x) = k(y) and f_{k(x)}(x) = f_{k(y)}(y). Hence f_{k(x)}(x) =
f_{k(x)}(y), and since f_{k(x)} is 1-1, we have x = y. Therefore F: ⋃_{n∈N} A_n → N is 1-1,
i.e., ⋃_{n∈N} A_n is countable. □
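The key device in this proof, coding x by 2^(k(x)) · 3^(f_{k(x)}(x)), can be exercised on a finite stand-in family. Everything below (the sets A0, A1 and the maps f0, f1) is our own toy data, not from the text:

```python
def make_F(families, injections):
    """Build F(x) = 2^k(x) * 3^(f_k(x)(x)) from the proof of Theorem 4.13.
    families[n] plays A_n and injections[n] plays the chosen 1-1 map f_n."""
    def F(x):
        k = min(n for n, A in enumerate(families) if x in A)   # k(x): first A_n containing x
        return 2 ** k * 3 ** injections[k](x)
    return F

# a toy family: A_0 and A_1 overlap at "b", just as the A_n in the theorem may
A0, A1 = {"a", "b"}, {"b", "c"}
f0 = {"a": 0, "b": 1}.get          # a 1-1 map A_0 -> N
f1 = {"b": 0, "c": 1}.get          # a 1-1 map A_1 -> N
F = make_F([A0, A1], [f0, f1])
values = sorted(F(x) for x in A0 | A1)   # three distinct codes: F is 1-1 on the union
```

Note how the overlap is handled: "b" is coded via A_0, its first home, exactly as k(x) dictates.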

EXAMPLE 4.14. We show that Q⁺ ~ N. Let A₁ = {1/1, 2/1, 3/1, 4/1, ...}, A₂ =
{1/2, 2/2, 3/2, 4/2, ...}, and in general A_n = {m/n : m ∈ N⁺}. Then Q⁺ = ⋃_{i=1}^∞ A_i.
Clearly f_n: A_n → N⁺ (1-1, onto), where f_n(m/n) = m for each m ∈ N⁺. Now apply
Theorem 4.13.

EXAMPLE 4.15. We show that Q ~ N. Let A₀ = {0}, A₁ = Q⁺, and A₂ =
{−x : x ∈ Q⁺}. Clearly Q = A₀ ∪ A₁ ∪ A₂. Moreover, it follows from Example
4.14 that each A_i is countable. Hence, by Theorem 4.13, Q is countable.

So far all the examples discussed in this section have involved sets that
are either finite or equinumerous to N. Are there any infinite sets that are
not equinumerous to N, or is there only one "size of infinity"? If all infinite
sets are equinumerous, then there is little else to say about this notion.
However, this is not the case, as we shall see in the next section.

EXERCISES FOR §1.4
1. Show that if A is countable and not empty, then there are infinitely many f's
such that f: A → N (1-1).
2. Prove that Q⁺ ~ N by making use of a table as in the second proof that
N × N ~ N.
3. Let A ~ B, and suppose that A has 20 elements. How many pairings are there
between A and B? Justify your answer.
In Exercises 4 through 6 let A, B, C, and D be sets such that A ~ C and
B ~ D.

4. (a) Give an example to show that A ∩ B need not be equinumerous to C ∩ D.
(b) Give an example to show that A ∪ B need not be equinumerous to C ∪ D.
5. (a) Suppose A ∩ B = C ∩ D = ∅. Prove that A ∪ B ~ C ∪ D.
(b) Suppose A ∪ B ~ C ∪ D. Must A ∩ B ~ C ∩ D?
6. Suppose A, B, C, and D are finite sets. If A ∩ B ~ C ∩ D, prove that A ∪ B ~
C ∪ D.
7. Let P₁ be a pairing between A and C, and let P₂ be a pairing between B and D.
Suppose A ∩ B = C ∩ D and P₁(x) = P₂(x) for all x ∈ A ∩ B. Prove that A ∪ B ~
C ∪ D.
8. (a) Define f: N × N → N by f(m,n) = (2^m(2n+1)) − 1. Prove that f is 1-1 and
onto.
(b) Let g: [N]_k → N (1-1, onto). Define h(n₁, n₂, ..., n_k, n_{k+1}) =
(2^(g(n₁,n₂,...,n_k))(2n_{k+1}+1)) − 1. Show that h: [N]_{k+1} → N (1-1, onto).
9. Let f: N → A (onto), and suppose that f⁻¹[a] is finite for each a ∈ A. Show that
N ~ A.
10. Prove that P(A) ~ ᴬ{0,1} for any A. (Hint: For B a subset of A, define
1_B(x) = 1 if x ∈ B,
1_B(x) = 0 if x ∈ A − B.
Now let F be the function on P(A) defined by F(B) = 1_B. Show that
F: P(A) → ᴬ{0,1} (1-1, onto).)
11. Prove that A ~ B implies ᴬX ~ ᴮX, ˣA ~ ˣB, and P(A) ~ P(B).
12. Prove that A × B ~ B × A and that A × (B × C) ~ (A × B) × C.

1.5 The Power Set


In this section we show that there are infinite sets that are not equi-
numerous to N. It is then quite natural to ask if these sets have greater
magnitude than N, and we shall consider several that do. However, in
order to give a cohesive and general discussion of size comparisons one
needs an assumption that we have not explicitly mentioned so far and that
is considerably more subtle in content. This new principle, the axiom of
choice, is discussed in § 1.9, and there we shall again take up the problem
of size comparisons. (Covert use of this axiom was made in Theorem 4.13.)

Theorem 5.1. Let A be an arbitrary set, and suppose f: A → P(A). Then f is
not onto P(A), i.e., there is a B ∈ P(A) such that B ∉ Ran f.

PROOF: Let B = {x : x ∈ A and x ∉ f(x)}. Then B ⊆ A and so B ∈ P(A).
Suppose B ∈ Ran f. Then B = f(a) for some a ∈ A. We ask whether or not
a ∈ B. If a ∈ B, then, by the definition of B, a ∉ f(a). But B = f(a) so a ∉ B,
a contradiction. On the other hand, if a ∉ B, then a ∉ f(a), so, again by the
definition of B, we conclude that a ∈ B, a contradiction. Since both assumptions
a ∈ B and a ∉ B lead to contradictions, our assumption that B ∈ Ran f
is erroneous, i.e., B ∉ Ran f, as we needed to show. □

Corollary 5.2. Let A be an arbitrary set. Then A ≁ P(A).

PROOF: Suppose for some set A we have A ~ P(A). Then there is a pairing
P: A → P(A) (1-1, onto), contradicting Theorem 5.1. □

The argument used in the proof of Theorem 5.1 is known as a "diagonal
argument." To see why, consider the special case A = N. Again let f: N →
P(N). Now consider the following table:

        0     1     2     3     4   ...
f(0)   a00   a01   a02   a03   a04
f(1)   a10   a11   a12   a13   a14
f(2)   a20   a21   a22   a23   a24
f(3)   a30   a31   a32   a33   a34
...

Here a_ij is 0 if j ∉ f(i), and a_ij is 1 if j ∈ f(i). For example, if f(3) is the set
of primes, then the fourth line of the table begins as follows:
0011010100010 .... We define B as before, that is, B = {x : x ∈ N, x ∉ f(x)}
= {m : m ∈ N and a_mm = 0}. Then for each n, B ≠ f(n), since n ∈ f(n) iff
a_nn = 1 and a_nn = 1 iff n ∉ B. Note how the elements of B are determined by
the diagonal of the table, namely the entries a_nn for n ∈ N.
Diagonal arguments play an important role in many proofs in this book.
As with the proof of Theorem 5.1, the "diagonal" considered may not be
immediately obvious but it is usually helpful to rewrite the proof in tabular
form.
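The diagonal recipe works verbatim on a finite table: form B by flipping the diagonal entries, and B disagrees with every row. A small sketch (the sample rows are our own data):

```python
def diagonal_set(rows):
    """Cantor's B = {m : m not in f(m)} for a finite table whose
    m-th row plays f(m); B differs from row m at the diagonal entry a_mm."""
    return {m for m in range(len(rows)) if m not in rows[m]}

rows = [{0, 2}, {1}, set(), {0, 1, 2, 3}]
B = diagonal_set(rows)                 # here B == {2}
assert all(B != row for row in rows)   # B escapes every row of the table
```

No matter what rows are supplied, the assertion holds, because B and row m disagree about the membership of m itself.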
The power set of an infinite set is certainly infinite, since P(A) ⊇ {{a} : a ∈ A},
and so Corollary 5.2 implies that there are different sizes of infinity.
Is there a reasonable way to compare different sizes of infinity? Can one
speak of one infinite set having greater "magnitude" than another infinite
set? The following definition fits well with our intuition and provides a
basis for size comparisons of infinite sets.

We say that B is at least as numerous as A if A is equinumerous with a
subset of B. If this is the case, we write B ≽ A or A ≼ B. In other words,
B ≽ A if there is a 1-1 function from A into B. If B is at least as numerous
as A but not equinumerous to A, we say that B is more numerous than A
(or A is less numerous than B) and we write B ≻ A (or A ≺ B).

If A is an arbitrary set, then the function f defined by f(a) = {a} for all
a ∈ A yields a 1-1 function from A into P(A). Thus A ≼ P(A). By Corollary
5.2, A is not equinumerous with P(A). Hence A is less numerous than
P(A), and we have proved:

Theorem 5.3. Let A be an arbitrary set. Then A ≺ P(A).

Our definitions of "more numerous" and "at least as numerous" allow
us to make size comparisons between infinite sets. Do size comparisons
between infinite sets obey the same basic rules as size comparisons between
finite sets? If so, then our experience with finite sets will be of help when
comparing infinite sets. For example, if any of the statements in the
following theorem were false, it would be difficult to think of "at least as
numerous" as a generalization of the corresponding concept for finite sets.

Theorem 5.4. Let A, B, and C be arbitrary sets. Then:

i. A ⊆ B implies A ≼ B.
ii. A ≼ A.
iii. If A ≼ B and B ≼ C, then A ≼ C.
iv. If A ≼ B and B ≼ A, then A ~ B.
v. Either A ≼ B or B ≼ A.

The proofs of parts i and ii are immediate from the definitions. Part iii
follows from Theorem 3.1i. We will prove iv in the next section. Part v is
considerably harder and will not be proved until § 1.9.
Note that Corollary 4.10 and Theorem 5.4v together imply that the
magnitude of N is minimal among the infinite sets. In other words, we
have

Corollary 5.5. If A is infinite, then N ≼ A.

Here is a direct proof of Corollary 5.5 that does not make use of
Theorem 5.4. Let f be a function whose domain is P(A) − {∅} and such that
f(X) ∈ X for each X ∈ P(A) − {∅}. Now define g: N → A as follows:
g(0) = f(A),
g(1) = f(A − {g(0)}),
g(2) = f(A − {g(0), g(1)}),
g(3) = f(A − {g(0), g(1), g(2)}),
and so on. In other words g(n+1) = f(A − {g(m) : m ≤ n}). Notice that for
each n, g(n+1) is defined, since A − {g(m) : m ≤ n} ≠ ∅. Since g(n+1) ∉
{g(m) : m ≤ n}, it is clear that g is 1-1. Hence g: N → A (1-1). □
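The construction of g can be simulated once a choice function is in hand. Below we use `min` on a finite stand-in set as the choice function f; the actual proof needs A infinite, and the existence of such an f for an arbitrary A is exactly the axiom of choice taken up in §1.9:

```python
def greedy_injection(A, choose, n):
    """First n values g(0), ..., g(n-1), where g(k) = choose(A - {g(0), ..., g(k-1)}),
    as in the direct proof of Corollary 5.5; `choose` plays the choice function f."""
    picked = []
    for _ in range(n):
        picked.append(choose(A - set(picked)))
    return picked

A = {3, 1, 4, 5, 9, 2, 6}                 # a finite stand-in for the infinite A
g = greedy_injection(A, min, 5)           # [1, 2, 3, 4, 5]: five distinct members of A
```

Because each new value is chosen from what remains, the values are automatically distinct, which is the whole content of the claim that g is 1-1.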

EXERCISES FOR §1.5

1. If A ~ B, prove that P(A) ~ P(B).
2. If A ≼ B and B ~ C, prove that A ≼ C.
3. If A ≺ B and B ~ C, prove that A ≺ C.
4. Show that if N ≼ A ∪ B, then N ≼ A or N ≼ B.
5. Prove that if A has n elements, where n ∈ N, then P(A) has 2ⁿ elements. (Try a
proof by induction on n, using Exercise 10 of §1.4.)
6. Prove:
(a) A ≼ B ∩ C implies A ≼ B and A ≼ C.
(b) A ≺ B ∩ C implies A ≺ B and A ≺ C.
(c) B ∪ C ≼ A implies B ≼ A and C ≼ A.
(d) B ∪ C ≺ A implies B ≺ A and C ≺ A.

1.6 The Cantor-Bernstein Theorem


In this section we shall prove Theorem 5.4iv, which states that if A ≼ B
and B ≼ A, then A ~ B. This is called the Cantor-Bernstein Theorem. We
begin by proving a related result.

Theorem 6.1. If A ⊇ B, B ⊇ C, and A ~ C, then A ~ B.

PROOF: By assumption there is an f: A → C (1-1, onto). Since f maps A onto C,
f[A − B] = {f(x) : x ∈ A, x ∉ B} ⊆ C. Let A₀ = A − B, A₁ = f[A₀] = f[A − B],
A₂ = f[A₁], and so on. The situation is illustrated in Figure 1.1.

Let D = ⋃_{i=0}^∞ A_i. We define a function f′ from A into B as follows:
f′(a) = f(a) if a ∈ D,
f′(a) = a if a ∉ D.

Figure 1.1

(A glance at the illustration may be helpful at this point.) We show that f′
is a pairing between A and B.

To see that f′ is 1-1 we take distinct elements a₁ and a₂ of A and
consider four cases:
Case 1: a₁ ∈ D and a₂ ∈ D. Then f′(a₁) = f(a₁) ≠ f(a₂) = f′(a₂), since f is
1-1.
Case 2: a₁ ∉ D and a₂ ∉ D. Then of course f′(a₁) ≠ f′(a₂), since f′(a₁) =
a₁ and f′(a₂) = a₂.
Case 3: a₁ ∈ D and a₂ ∉ D. Then f′(a₁) = f(a₁) ∈ D and f′(a₂) = a₂ ∉ D.
Case 4: a₁ ∉ D and a₂ ∈ D is handled as in Case 3.

We next show that f′ is onto. Suppose b ∈ B. If b ∉ D, then f′(b) = b, so
b ∈ Ran f′. Suppose b ∈ D. Then b ∈ A_i for some i > 0 (i ≠ 0, since A₀ ∩ B =
∅). Since A_i = f[A_{i−1}], b = f(c) for some c ∈ A_{i−1}. Then c ∈ D, so f′(c) = f(c)
= b. Thus b ∈ Ran f′, and so f′ is onto B. Hence f′: A → B (1-1, onto), so A ~ B as the
theorem claims. □
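A concrete instance of this construction may help. Take A = N, B = N − {0}, C = N − {0, 1}, and the pairing f(n) = n + 2 (our own choice of example). Then A₀ = A − B = {0}, A₁ = {2}, A₂ = {4}, ..., so D is the set of even numbers, and f′ fixes the odd numbers while advancing the even ones:

```python
def f(n):
    """A pairing between A = N and its subset C = N - {0, 1}."""
    return n + 2

def f_prime(n):
    """f' from the proof of Theorem 6.1: apply f on D (here the evens), fix the rest."""
    return f(n) if n % 2 == 0 else n

image = {f_prime(n) for n in range(100)}
# f' is 1-1, never hits 0, and every element of B = N - {0} shows up eventually
```

Distinctness of the two cases is visible here: evens land on evens ≥ 2, odds stay odd, exactly as in Cases 3 and 4 of the proof.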
We can now prove

Theorem 6.2 (Cantor-Bernstein Theorem). If A ≼ B and B ≼ A, then A ~ B.

PROOF: By assumption there are functions f and g such that f: A → B (1-1),
g: B → A (1-1). Then g∘f: A → A (1-1). Let C = g∘f[A]. So A ~ C and A ⊇ C. Let
B′ = g[B]. If x ∈ C, then x = g(f(a)) for some f(a) ∈ B, so x ∈ B′. Thus
A ⊇ B′ ⊇ C and A ~ C. By Theorem 6.1 we conclude that A ~ B′. Since
g: B → g[B] (1-1, onto), B ~ B′, and so A ~ B, proving the Cantor-Bernstein theorem.
□
As an application of the Cantor-Bernstein theorem we prove

Theorem 6.3. R ~ P(N).

PROOF: We first exhibit a function f: P(N) → R (1-1). Let X ∈ P(N), i.e., X ⊆ N.
Define f(X) to be the real number 0.a₀a₁a₂a₃ ..., where a_i = 0 if i ∉ X and
a_i = 1 if i ∈ X. To see that f is 1-1, we need only recall that if two distinct decimal
expansions represent the same real number, then one of them ends in an infinite string of 9's.
This shows that P(N) ≼ R.

Next we show that R ≼ P(N). First notice that R ~ {r : −1 < r < 1}, as the
pairing f: x → x/(1+|x|) shows. Then {r : −1 < r < 1} ~ {r : 0 < r < 1}, as the
pairing g: x → ½(x+1) shows. Hence it is enough to show that A ≼ P(N),
where A = {r : 0 < r < 1}. Given r ∈ A, let 0.r₁r₂r₃ ... be its non-terminating
decimal representation. Let h(r) = {r₁, 10 + r₂, 100 + r₃, 1000 + r₄, ...}.
Clearly h: A → P(N) (1-1), and so R ≼ P(N). Since we have already shown
P(N) ≼ R, an application of the Cantor-Bernstein theorem gives R ~ P(N).
□
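The map f: P(N) → R of this proof is computable for finite sets if we truncate; the sketch below uses exact rational arithmetic so that no digits are lost (the cutoff parameter `digits` is our own device for keeping things finite):

```python
from fractions import Fraction

def f(X, digits=12):
    """f(X) = 0.a0 a1 a2 ..., with a_i = 1 iff i is in X (Theorem 6.3),
    truncated after `digits` decimal places."""
    return sum(Fraction(1, 10 ** (i + 1)) for i in sorted(X) if i < digits)

evens = {0, 2, 4}
print(f(evens))        # 10101/100000, i.e. 0.10101
```

Because every digit is 0 or 1, no value of f can end in a string of 9's, which is why distinct sets always yield distinct reals.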

We now have the machinery to prove

Theorem 6.4. Let A, B, and C be arbitrary sets. Then:

i. If A ≼ B and B ≺ C, then A ≺ C.
ii. If A ≺ B and B ≼ C, then A ≺ C.

We prove part i and leave ii as an exercise (see Exercise 5). Suppose
A ≼ B and B ≺ C. Then there are functions f and g with f: A → B (1-1), g: B → C (1-1).
Thus g∘f: A → C (1-1), and so A ≼ C. If A ~ C, we would have C ⊇ g[B] ⊇
g∘f[A] and g∘f[A] ~ A ~ C. By Theorem 6.1 this would imply C ~ g[B] ~
B, a contradiction. Thus A ≺ C.
Let P₀(N) = N, P₁(N) = P(N), P₂(N) = P(P(N)), ..., and in general
P_{n+1}(N) = P(P_n(N)). Using part i of the above theorem and Theorem 5.3, we
see that P_n(N) ≺ P_m(N) for all n, m ∈ N with n < m. Hence there are infinitely
many sizes of infinity. Are there sets more numerous than any of
the P_n's? Indeed there are. For example, let X = {P_n(N) : n ∈ N}. Then for
any B ∈ X we have B ≺ P(B) and P(B) ⊆ ⋃X. Hence B ≺ P(B) ≼ ⋃X,
and so by Theorem 6.4ii we have B ≺ ⋃X.

Of course, there are sets that are more numerous than ⋃X, such as
P(⋃X), P(P(⋃X)), and so on.
We have seen in Theorem 5.4 that there are no infinite sets less
numerous than N. Does the magnitude of N have an immediate successor,
i.e., is there a set X such that N ≺ X and for no Y do we have N ≺ Y ≺ X?
In particular, is P(N) such an X? This problem, posed by Cantor, has been
one of the most vexing in set theory, and many ramifications arising from
this problem are still under active investigation. We shall say more about
this famous problem in a later section.

EXERCISES FOR §1.6

1. Let a, b, c, d be real numbers with a < b and c < d. Let A = {r : r ∈ R and a < r <
b}, B = {r : r ∈ R and a ≤ r ≤ b}, and C = {r : r ∈ R and c < r < d}. Show that
A ~ B ~ C.
2. Let A = {r : r ∈ R and 0 ≤ r ≤ 1}. Show that A × A ~ A. [Hint: let r and s be reals
with decimal expansions 0.r₁r₂r₃ ... and 0.s₁s₂s₃ ... respectively. Consider f(r,s)
= 0.r₁s₁r₂s₂r₃s₃ ....]
3. Let A = {r : r ∈ R and −1 < r < 1}. Show that
(a) A × A ~ R × R,
(b) R × R ~ R.
4. Let B = {(x,y) : (x−a)² + (y−b)² = c² and x ∈ R, y ∈ R}, where a, b, and c are
fixed members of R and c > 0. Suppose R × R ≽ A ≽ B. Show that R × R ~ A.
5. Prove Theorem 6.4ii.

6. Show that ᴺR ~ R. [Hint: R ≼ ᴺR is clear. Let A = {r : 0 < r < 1 and r ∈ R}. Let
f ∈ ᴺA; say f(n) = 0.r_{n,1}r_{n,2}r_{n,3} .... Let F(f) = 0.s₁s₂ ..., where s_j = 0 if j is not of
the form 2ᵏ3ⁱ, and s_j = r_{k,i} otherwise. Show that F: ᴺA → R (1-1).]
7. Show that
(a) ᴺN ~ R,
(b) ~ R.
(Hint: Show R ≼ ᴺN ≼ ᴺR.)
8. Let C be the class of continuous functions. Then C ~ R.

1.7 Algebraic and Transcendental Numbers


In §1.4 we mentioned the applicability of Cantor's methods to problems
outside of set theory. In this section we shall present one of Cantor's most
striking results, achieved by applying his methods to the study of real
numbers.

By an integral polynomial in the variable x, we mean a polynomial
a₀ + a₁x + ... + a_nxⁿ, where the a_i ∈ I (I being the set of integers) and
a_n ≠ 0. A real root of an integral polynomial a₀ + a₁x + ... + a_nxⁿ = f(x) is a
real number a such that f(a) = 0. Of course, some integral polynomials
have no real roots, for example, 1 + x². A real number is said to be
algebraic if it is the root of some integral polynomial. Otherwise, it is called
transcendental. Are there transcendental numbers, or is every real number
algebraic?
Although the above question is a natural one to ask, no answer was
known until the middle of the nineteenth century. Then in 1844, the
French mathematician Liouville produced the first examples of transcendental
numbers. This was a considerable achievement, and to this day
there are many simply stated but unanswered questions concerning the
transcendence of various numbers. For example, while e was shown to be
transcendental in 1873 and π in 1882, it is still not known whether e + π or e·π is
transcendental (although one of them must be; see Exercise 3).
One is tempted to conclude that transcendental numbers are scarce. But
in 1874 Cantor proved the remarkable result that most real numbers are
transcendental. More precisely,

Theorem 7.1. Let T be the set of transcendental numbers and A the set of
algebraic numbers. Then T ~ R and A ~ N.

PROOF: We first show that there are countably many algebraic numbers.
Let I[x] be the set of all integral polynomials in the variable x. If
f(x) ∈ I[x] [say f(x) = a₀ + a₁x + ... + a_nxⁿ, a_n ≠ 0], we call n the degree of
f(x). Recall that a polynomial of degree n with real coefficients can have
at most n real roots, a fact usually proved in courses in elementary algebra.

We first prove that N ~ A. Define f as follows:

f(n) = y if n = 2^(a₀)·3^(a₁)· ... ·p_{k+1}^(a_k)·p_{k+2}^(a_{k+1}) for some a₀, a₁, ..., a_k, a_{k+1}, and y is the
(a_{k+1})th largest root of a₀ + a₁x + ... + a_kxᵏ;
f(n) = 1 otherwise.

Clearly f: N → A (onto), and so N ~ A by 4.11.

We now prove that T, the set of transcendental numbers, is equinumerous
with R. Clearly, T ≼ R, since T ⊆ R. By the Cantor-Bernstein
theorem, we need only show that R ≼ T. We know that R ~ R_(−1,1), where
R_(−1,1) = {z : −1 < z < 1 and z ∈ R} (see the proof of Theorem 6.3). So it is
enough to show that R_(−1,1) ≼ T. Choose t ∈ T so that t > 1 (we know that
such a t exists, for otherwise {z : 1 < z and z ∈ R} ⊆ A, and this is impossible,
since {z : 1 < z and z ∈ R} ~ R but A ~ N). Notice that for each n ∈ N⁺,
nt is transcendental, for if nt is a root of a₀ + a₁x + ... + a_kxᵏ, then t is a
root of a₀ + (a₁n)x + ... + (a_knᵏ)xᵏ. Let h: A → N⁺ (1-1). Define H(z) = z if z ∈
R_(−1,1) − A and H(z) = h(z)·t if z ∈ R_(−1,1) ∩ A. Then clearly
H: R_(−1,1) → T (1-1), which completes the proof. □
Later we shall prove that if B ≼ C and C is infinite, then B ∪ C ~ C.
Since A ∪ T = R, this gives an immediate proof that T ~ R.

EXERCISES FOR §1.7

1. Let A′ be the set of real roots of polynomials with rational coefficients. Show
that A = A′.
2. Let A* be the set of real roots of polynomials whose coefficients are algebraic
numbers. Show that A* = A. (Requires a bit of field theory.)
3. It is known that e and π are transcendental. Show that either e + π or e·π is
transcendental by examining the polynomial x² − (e+π)x + e·π (and using some
field theory). However, it is not known which of the two is transcendental. As a
completely trivial problem in the same vein, show that either e + π or e − π is
transcendental.

1.8 Orderings
In every branch of mathematics orderings of one sort or another are
encountered. We have no intention of giving a comprehensive classifica-
tion of the various orderings that arise, but instead we restrict our attention
to those that arise most frequently and that are particularly important in
logic.
A partial ordering is a binary relation R such that for every x, y, z:
i. ¬(xRx);
ii. xRy implies ¬(yRx);
iii. xRy and yRz implies xRz.

[Recall that xRy means (x,y) ∈ R and ¬(xRy) means (x,y) ∉ R.] We shall
often use the symbol < to denote a partial ordering.
An ordered pair (A,R) is a partially ordered structure if A ≠ ∅ and R is a
partial ordering with field ⊆ A.

EXAMPLE 8.1. For any X, (P(X), ⊂) is a partially ordered structure.

EXAMPLE 8.2. For a, b ∈ N⁺ we write a|b if there is a c ∈ N such that c ≠ 1
and c ≠ 0 and a·c = b. Then (N, |) is a partially ordered structure.

EXAMPLE 8.3. Let A be the set of polynomials with real coefficients. Let
p, q ∈ A. Write p|q if for some r ∈ A, p·r = q and the degree of p and of r is
positive. Then | is a partial ordering with field ⊆ A.

EXAMPLE 8.4. Let 𝓕 = ᴿR. Given f, g ∈ 𝓕, define f △ g if f(x) ≤ g(x) for
each x ∈ R and f(z) < g(z) for some z ∈ R. Then (𝓕, △) is a partially
ordered structure.

EXAMPLE 8.5. Let 𝒞 be the set of continuous functions with domain
{x : 0 ≤ x ≤ 1}. For f, g ∈ 𝒞 define f <* g if ∫₀¹f < ∫₀¹g. Then (𝒞, <*) is a
partially ordered structure.

A linear ordering is a partial ordering R that satisfies the additional
requirement that any two elements of its field are comparable, i.e.,
iv. for all x, y in the field of R, either xRy or x = y or yRx.
(A,R) is a linearly ordered structure if R is a linear ordering with field A
and A ≠ ∅.

Requirement iv forces the elements of A to be arranged in a chain.
Examples of linearly ordered structures are (R, <), (Q, <), (I, <), and
(N, <) (the relation < is the usual "less than" relation on R, restricted
appropriately). Examples 8.1 through 8.5 are not linearly ordered structures.
A linear ordering R is dense if Dom R has at least two elements and
between any two elements of Dom R there is another element of Dom R,
i.e., for every x, y, if xRy, then there is a z such that xRz and zRy. (A,R)
is a densely ordered structure if R is a dense linear ordering with field A.
Both Q and R are densely ordered by the usual <, but N is not.
We often will ignore the distinction between the ordered structure
(A,R) and the ordering R, referring to both as orderings of the appropriate
kind.
Let (A,R) be an ordering with x, y ∈ A. y is a successor of x in (A,R) if
xRy. y is the immediate successor of x if y is a successor of x and there is
no z ∈ A such that xRz and zRy. Predecessor and immediate predecessor
are defined analogously. A linear ordering is discrete if every element
having a successor has an immediate successor and every element having a
predecessor has an immediate predecessor. (N, <) and (I, <) are discretely
ordered, but (Q, <) and (R, <) are not.
If B ≠ ∅ and (B × B) ∩ R is a linear ordering, then B is called a chain or
a branch. In Example 8.2 above, {2ⁿ : n ∈ N⁺} is a chain. A minimal element
in a partially ordered structure (A,R) is an element a ∈ A such that for no
b ∈ A do we have bRa. If there is no b ∈ A such that aRb, then a is a
maximal element. If for each b ∈ A we have aRb or a = b, then a is the
least element, and if for each b ∈ A we have bRa or a = b, then a is the
greatest element.

In Example 8.1, ∅ is the least element and X is the greatest element. In
Example 8.5 there is no minimal element and no maximal element. If we
let A = P(N) − {∅, N}, then (A, ⊂) is a partially ordered structure with
infinitely many minimal elements and infinitely many maximal elements
(see Exercise 9).
A well ordering is a linear ordering R having the additional property
that
v. every non-empty subset of the field of R has a least element.
In other words, for every non-empty X ⊆ the field of R, there is an
element x ∈ X such that xRy for all y ∈ X − {x}.

With the usual "less than" relation the natural numbers are well
ordered. However, neither I nor R⁺ is well ordered. To see this let X be the
set of elements less than 1 (of I and R⁺ respectively). Clearly X does not
have a least element.
Next we describe a construction that provides many examples of well-ordered
structures.

Let (A, <_A) be a linearly ordered structure, and let (B_a, <_a) be a linearly
ordered structure for each a ∈ A. Define <_A ⊗ {(B_a, <_a) : a ∈ A} to be the
structure (B, <_B), where B = {(a,b) : a ∈ A and b ∈ B_a}, and (a,b)
<_B (c,d) iff a <_A c or (a = c and b <_a d).

Loosely speaking, (B, <_B) is obtained by replacing each a ∈ A with a
copy of (B_a, <_a). For example, if A = {0,1,2}, B_i = N for each i ∈ A, and
<_A and <_i are the usual orderings on A and on B_i, then <_A ⊗ {(B_i, <_i) : i ∈
A} can be viewed as the ordering obtained by stacking three copies of
(N, <) one upon the other. It is easy to see that the resulting structure is
well ordered. More generally we have the following.

Theorem 8.6.
i. <_A ⊗ {(B_a, <_a) : a ∈ A} is linearly ordered.
ii. If (A, <_A) is well ordered and if (B_a, <_a) is well ordered for each a ∈ A,
then <_A ⊗ {(B_a, <_a) : a ∈ A} is well ordered.

PROOF: i. Let (B, <_B) = <_A ⊗ {(B_a, <_a) : a ∈ A}. Let (a,b), (c,d), (e,f) ∈ B.
Clearly ¬((a,b) <_B (a,b)), for otherwise we would have a <_A a or b <_a b.
Suppose (a,b) <_B (c,d). Then either a <_A c, or a = c and b <_a d. If a <_A c,
then ¬(c <_A a), and so we cannot have (c,d) <_B (a,b). If a = c and b <_a d,
then ¬(d <_a b), and so again ¬((c,d) <_B (a,b)). Now suppose that (a,b) <_B (c,d)
and (c,d) <_B (e,f). Then either
i. a <_A c <_A e,
ii. a <_A c = e and d <_c f,
iii. a = c <_A e and b <_a d, or
iv. a = c = e and b <_a d and d <_c f.
In each case a <_A e, or a = e and b <_a f; hence (a,b) <_B (e,f). □

PROOF: ii. Let X be a non-empty subset of B. Let X₁ = {a : (a,b) ∈ X}.
X₁ ≠ ∅, and X₁ ⊆ A. Since (A, <_A) is well ordered, X₁ has a least element,
say a*. Let X₂ = {b : (a*, b) ∈ X}. Then X₂ ≠ ∅ and X₂ ⊆ B_{a*}. Since
(B_{a*}, <_{a*}) is well ordered, X₂ has a least element, say b*. Clearly (a*, b*) is
the least element of X. □
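The comparison rule defining <_B is short enough to write out. The sketch below (our own encoding of orderings as two-argument boolean functions) stacks copies of (N, <) as in the three-copies example above:

```python
def replace_lt(lt_A, lt_for):
    """(a,b) <_B (c,d) iff a <_A c, or a = c and b <_a d  (Theorem 8.6).
    lt_for(a) returns the ordering <_a used inside the copy B_a."""
    def lt_B(p, q):
        (a, b), (c, d) = p, q
        return lt_A(a, c) or (a == c and lt_for(a)(b, d))
    return lt_B

usual = lambda x, y: x < y
lt = replace_lt(usual, lambda a: usual)   # stacked copies of (N, <)
assert lt((0, 7), (1, 0))      # all of copy 0 precedes all of copy 1
assert lt((2, 3), (2, 5))      # inside a copy, compare second coordinates
assert not lt((1, 4), (1, 4))  # irreflexive, as requirement i demands
```

The two clauses of the rule correspond exactly to the two ways a pair can be smaller: it sits in an earlier block, or in the same block but earlier within it.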
An initial segment of a linearly ordered structure (A, <) is a subset
X ⊆ A such that for each x ∈ X and a ∈ A, if a < x then a ∈ X.

For example, {x : x ∈ Q and x < π} is an initial segment of Q but not of
R, and {x : x ∈ R and 0 < x ≤ 4} is an initial segment of R⁺ but not of R.
Our next theorem shows that given two well-ordered structures, an initial
segment of one of them is a copy of the other. For this we need some
definitions and several easy lemmas.

Through the remainder of this section we let (A₁, <₁) and (A₂, <₂) be
well-ordered structures.

A binary relation S ⊆ A₁ × A₂ is order preserving [with respect to
(A₁, <₁) and (A₂, <₂)] if whenever (x₁, y₁) ∈ S and (x₂, y₂) ∈ S, then x₁ <₁ x₂
iff y₁ <₂ y₂.

Clearly, an order preserving relation is a 1-1 function.
An order preserving relation S is an initial pairing if Dom S is an initial
segment of (A₁, <₁) and Ran S is an initial segment of (A₂, <₂).

Lemma 8.7. If S and S′ are initial pairings and Dom S = Dom S′, then
S = S′.

PROOF: Assume that the hypothesis holds but that for some x ∈ Dom S we
have Sx ≠ S′x. Let x* be the least such x (in the sense of <₁), say
S′x* <₂ Sx*. Then Sz = S′z <₂ S′x* for each z <₁ x*, and Sx* ≤₂ Sz for
each z ≥₁ x*, z ∈ Dom S. Thus S′x* ∉ Ran S, and so Ran S is not an initial
segment of (A₂, <₂). This contradicts the assumption that S is an initial
pairing. □

Lemma 8.8. If X is an initial segment of (A₁, <₁) and S is an initial pairing,
then S↾X is an initial pairing.

PROOF: We need only show that Ran(S↾X) is an initial segment of
(A₂, <₂). So suppose that y <₂ y′ and that Sx′ = y′, where x′ ∈ X. Since
Ran S is an initial segment and y′ ∈ Ran S, there is an x ∈ Dom S such that
Sx = y. Since S is order preserving, x <₁ x′ and so x ∈ X. Hence Ran(S↾X)
is an initial segment. □

Lemma 8.9. If S and S′ are initial pairings, then S ⊆ S′ or S′ ⊆ S.

PROOF: Suppose Dom S ⊆ Dom S′. By Lemma 8.8, S′↾Dom S is an initial
pairing, and so by Lemma 8.7, S′↾Dom S = S, i.e., S ⊆ S′. □

Lemma 8.10. If 𝒞 is a set of initial segments of (A₁, <₁), then ⋃𝒞 is an
initial segment of (A₁, <₁).

PROOF: If x₁ <₁ x₂ and x₁, x₂ ∈ A₁ with x₂ ∈ ⋃𝒞, then for some X ∈ 𝒞,
x₂ ∈ X. Since X is an initial segment, x₁ ∈ X and so x₁ ∈ ⋃𝒞. □

Theorem 8.11. If (A₁, <₁) and (A₂, <₂) are well-ordered structures, then
there is a unique order preserving function S such that either Dom S = A₁ and
Ran S is an initial segment of (A₂, <₂), or Ran S = A₂ and Dom S is an
initial segment of (A₁, <₁).

PROOF: Let ℋ be the set of all initial pairings. Let S = ⋃ℋ.
We first show that S is order preserving. For suppose (x₁, y₁) ∈ S and
(x₂, y₂) ∈ S and x₁ <₁ x₂. Then (x₁, y₁) ∈ S₁ and (x₂, y₂) ∈ S₂ for some S₁, S₂
∈ ℋ. By Lemma 8.9, S₁ ⊆ S₂ or S₂ ⊆ S₁; let's say S₁ ⊆ S₂. Hence y₁ <₂ y₂,
since S₂ is order preserving. Thus S is order preserving.

Dom S = ⋃{Dom S′ : S′ ∈ ℋ}, and this is an initial segment of
(A₁, <₁) by Lemma 8.10. Similarly Ran S is an initial segment of (A₂, <₂).
Thus S is an initial pairing.

Suppose Dom S ⊊ A₁ and Ran S ⊊ A₂. Let a be the least member (in the
sense of <₁) of A₁ − Dom S, and let b be the least member (in the sense of
<₂) of A₂ − Ran S. Then clearly S ∪ {(a,b)} ∈ ℋ, and so (a,b) ∈ S. But
then a ∈ Dom S, a contradiction. Hence Dom S = A₁ or Ran S = A₂.
The uniqueness of S follows from Lemma 8.7. □
Let <_B be a linear ordering with field B, and let <_C be a linear
ordering with field C. We say that <_B is an initial segment of <_C if B is an
initial segment of <_C and <_B = <_C ∩ (B × B). The following theorem will
be useful in the next section.

Theorem 8.12. Suppose ℋ is a set of linear orderings (well orderings) such
that whenever <_B ∈ ℋ and <_C ∈ ℋ, one of them is an initial segment of the
other. Then ⋃ℋ is a linear ordering (well ordering) and each <_B ∈ ℋ is
an initial segment of ⋃ℋ.

PROOF: ⋃ℋ is a binary relation; call it <. Suppose that y < z and z < y.
Then y <_B z and z <_C y for some <_B, <_C ∈ ℋ. We may assume that <_B is
an initial segment of <_C. But then y <_C z and z <_C y, contrary to the
assumption that <_C is a linear ordering. Hence y < z implies ¬(z < y). Similarly
one shows that ¬(x < x), and that if x < y and y < z then x < z. Hence <
is a linear ordering.

Suppose that a < b where b ∈ B, and B is the field of <_B ∈ ℋ. Then
a <_C b for some <_C ∈ ℋ. Either <_C is an initial segment of <_B, or <_B is
an initial segment of <_C; and both alternatives imply that a ∈ B. Hence <_B
is an initial segment of <.

Finally, suppose each <_B ∈ ℋ is well ordered and X is a non-empty
subset of the field of ⋃ℋ. Let a ∈ X. Then a ∈ C, where C is the field of
some <_C ∈ ℋ. Let d be the least element of X ∩ {x : x ≤_C a}. Then d is the
least element of X in the sense of <, since <_C is an initial segment of <.
Hence < is a well ordering. □

EXERCISES FOR §1.8

1. Prove: If A is finite and (A, <_A) is a linearly ordered structure, then (A, <_A) is
discretely ordered.
2. Prove: If (A, <) is densely ordered, then A is infinite.
3. Prove: A well-ordered structure is not densely ordered.
4. (a) Find a well-ordered structure that is not discretely ordered.
(b) Find a discretely ordered structure that is not well ordered.
5. (a) How many linear orderings are there on a set A of n elements?
(b) Show that the set of linear orderings of N is equinumerous to R.
6. Let (A,R) be a well-ordered structure, and let B ∩ A ≠ ∅. Show that R ∩ (B ×
B) is a well ordering.
7. Let <_Q be the usual dense ordering of the rationals, and for each r ∈ Q let
B_r = I and <_r be the usual discrete ordering on I. Show that <_Q ⊗ {(B_r, <_r) : r ∈
Q} is a discretely ordered structure.
8. Let A be an uncountable subset of R. Let < be the usual order relation on R.
Then (A × A) ∩ < is neither a discrete ordering nor a well ordering.
9. Let A = P(N) − {∅, N}.
(a) Show that (A, ⊂) is a partially ordered structure.
(b) Show that the set of maximal elements and the set of minimal elements of
(A, ⊂) are each equinumerous with N.
(c) Show that the set of branches of (A, ⊂) is equinumerous with R.
10. Let <_A be a binary relation with field A. Let B be a non-empty subset of A,
and let <_B = <_A ∩ (B × B). Give examples to show that the following are
possible:
(a) <_A is a dense ordering, and <_B is an infinite discrete ordering.
(b) <_A is a discrete ordering, and <_B is a dense ordering.

1.9 The Axiom of Choice


Let A be a set of non-empty sets. We say that a function f is a choice
function for A if the domain of f is A and f(X) ∈ X for each X ∈ A. The
terminology is quite descriptive, since f "chooses" an element in each
X ∈ A, namely f(X). There are some sets A for which the existence of a
choice function is obvious. For example, if A ⊆ P(N) and each member of
A is non-empty, we can define f(X) to be the least member of X.

On the other hand, suppose that A = P(R) − {∅}. We can no longer
define a choice function on A by defining f(X) to be the least member of
X. This is because R is not well ordered by the usual "less than" relation,
and so X need not have a least member. In fact, in a sense which will be
made clearer in a later section, the existence of a choice function on A
must be taken on faith. This is a special case of the following principle,
called the axiom of choice:

Every set of non-empty sets has a choice function.

We have already used the axiom of choice in proving that a countable
union of countable sets is countable (Theorem 4.13), and in the direct
proof of Corollary 5.5, in which it is shown that N ≼ A if A is infinite.
The axiom was used frequently and without mention in various
branches of mathematics until 1904, when Zermelo gave an explicit state-
ment of the axiom. Once it surfaced, many investigators turned their
attention to it. Some tried to prove it from simpler, more readily accepted
principles; others deduced consequences from it that seemed to some
paradoxical.
Today, the axiom has virtually universal acceptance among mathemati-
cians. There are several reasons for this. First of all, Gödel proved in 1938
that no contradiction can be derived from the axiom of choice and the
other basic assumptions about sets unless the other assumptions already
lead to a contradiction. We shall discuss this more thoroughly in a later
section.
Another reason is that the vast majority of mathematicians are con-
vinced on intuitive grounds that the axiom is true. To these mathemati-
cians the axiom is a valid assertion about sets.
Expediency provides another reason for acceptance. In several branches
of mathematics there are theorems that have long, complicated, and
non-intuitive proofs not using the axiom of choice, and short, easily
comprehended proofs using the axiom. There are cases where the first
proof discovered used the axiom of choice, and only later was a proof
found not using the axiom. In fact, the first proof of the Cantor-Bernstein
theorem used the axiom, although the proof given here does not.
Aesthetic considerations motivate acceptance also. Frequently, the
axiom of choice yields an organization of a theory that is easy to grasp and
sits well with the intuitions. This is particularly true in set theory. For
example, we shall use it in proving that given any sets A and B either
A ≼ B or B ≼ A.
In the remainder of this section, and in the next, we shall present some
of the consequences of the axiom of choice.

Theorem 9.1. Every set can be well ordered, i.e., for every set A there is a
binary relation < such that < is a well ordering of A.

The intuitive idea behind the proof that follows is this. A choice
function f on P(A) − {∅} can be used to well-order parts of A as follows.
Take f(A) to be the least element a₀. The next element is f(A − {a₀}); call
it a₁. The immediate successor of a₁ is f(A − {a₀, a₁}); call it a₂, and so on.
We collect all orderings obtained in this way in a set X and show that any
two members of X fit together in the sense that one is an initial segment of
the other. From this it follows that ⋃X is a well ordering of A.
PROOF: Let f be a choice function on P(A) − {∅}. Let X be the set of all
well orderings <B, where <B has field B for some B ⊆ A and b = f(A −
{x : x <B b}) for all b ∈ B. (So given an initial <B segment Y, f picks the
next element in the <B ordering from A − Y.) We show that X satisfies
the hypotheses of Theorem 8.12 and then that ⋃X well-orders A.
First notice that X ≠ ∅, since the empty relation well-orders {f(A)}.
Now suppose that <B and <C belong to X. By Theorem 8.11 we can
assume that we have an order preserving function S : B → C such that Ran S
is an initial segment of C. We claim that S(x) = x for all x ∈ B. If not,
there is a least b (in the sense of <B) such that S(b) ≠ b, say S(b) = d. But
d = f(A − {x : x <C d}), and x <C d implies S⁻¹(x) <B b, so S⁻¹(x) = x.
Hence {x : x <C d} = {x : x <B b}, and d = b, a contradiction. Therefore
S(x) = x for all x ∈ B, and so one of <B, <C is an initial segment of the
other. We now know that X satisfies the hypotheses of Theorem 8.12, and
hence ⋃X is a well ordering; call it <.
Now let b belong to the field of <. Then for some <B ∈ X, b ∈ B.
Hence b = f(A − {x : x <B b}), and so b = f(A − {x : x < b}). Therefore
< ∈ X.
Let A* be the field of <. We claim that A* = A. For if not, we can add
d = f(A − A*) at the top to extend < to <#, i.e., we define x <# y if x < y
when both x, y ∈ A*, and x <# d when x ∈ A*. Clearly <# ∈ X, and so
d ∈ A*, a contradiction. Therefore < well-orders A. □
Recall that we postponed the proof of Theorem 5.4v, which states that
for any sets A and B, either A ≼ B or B ≼ A. Assuming the axiom of choice,
we can now supply the proof. By Theorem 9.1, there is a well ordering <A
on A and a well ordering <B on B. Hence by Theorem 8.11 there is an
order preserving function f on A into B or on B into A. Since f is 1-1, we
have A ≼ B or B ≼ A as required. □

The next theorem is also a consequence of the axiom of choice and is
used widely throughout algebra and analysis.

Definition 9.2. Let (A, <) be a partially ordered structure. A subset X of A
is a chain if (X, < ↾ X) is a linear ordering. A chain X is maximal if no
chain properly contains X.

Theorem 9.3 (Maximal Principle). Every partially ordered structure has a
maximal chain.
PROOF: Let (A, <) be a partially ordered structure. By Theorem 9.1 there
is a well ordering (call it <*) on A. Since we need to distinguish between
the two orderings we now have on A, we shall write '<*-least element'
when we mean 'least element with respect to the ordering <*', and so on.
Let a* be the <*-least element of A. Take K to be the set of all
functions f such that Dom f is a <*-initial segment of A and Ran f is a
<-chain, and for each a ∈ Dom f
i. f(a) = a if {a} ∪ {f(b) : b <* a} is a <-chain, and
ii. f(a) = a* if {a} ∪ {f(b) : b <* a} is not a <-chain.
First we claim that if f, g ∈ K, then f ⊆ g or g ⊆ f. For if not, then there is
a <*-least c ∈ Dom f ∩ Dom g such that f(c) ≠ g(c), which is impossible,
since {f(b) : b <* c} = {g(b) : b <* c}.
Hence ⋃K is a function; call it F. Dom F is a <*-initial segment by
Lemma 8.10. It is also easy to see that F satisfies the other conditions for
membership in K, so F ∈ K.
We claim that Ran F is a maximal chain. For if not, let d be the <*-least
element of A − Ran F such that {d} ∪ Ran F is a chain. Define
F*(b) = F(b) for all b ∈ Dom F,
F*(b) = a* for all b ∈ {x : x <* d and x ∉ Dom F},
F*(d) = d.
It is easy to see that F* ∈ K, so that d ∈ Dom F* ⊆ Dom F, contradicting
the choice of d. Hence Ran F is a maximal chain in (A, <). □
Definition 9.4. A partially ordered structure (A, <) is a tree if for all a ∈ A,
{b : b < a} is <-well ordered.

Trees are frequently encountered in many branches of mathematics. An
amusing application to game theory will be found in Exercise 10. The
following theorem is particularly useful and has led to generalizations that
are still being investigated.

Theorem 9.5 (König's Infinity Lemma). Suppose (A, <) is a tree such that
A is infinite but each a ∈ A has only finitely many immediate successors.
Then (A, <) has an infinite chain.

PROOF: Well-order A by <*. Let a^> = {b : b > a}. Now by recursion on N
define aₙ to be the <*-least a ∈ A such that a is an immediate successor of
aₙ₋₁ and a^> is infinite. It is then easy to see that {aₙ : n ∈ N} is an infinite
chain. □

EXERCISES FOR § 1.9


1. Prove: If f : A → B is onto, then B ≼ A.
2. Let (A, <) be a linearly ordered structure. An infinite descending chain in
(A, <) is a countable subset {aₙ : n ∈ N} of A such that aₙ₊₁ < aₙ for each
n ∈ N. Show that < is a well ordering of A iff there is no infinite descending
chain in (A, <).
3. Show that the axiom of choice is equivalent to the following statement: If A is
a set of non-empty sets and for each X, Y ∈ A either X = Y or X ∩ Y = ∅, then
there is a set B such that for each X ∈ A, X ∩ B has exactly one element.
4. Deduce the axiom of choice from the well ordering principle.
5. Prove that the following statements are equivalent:
(a) A is infinite.
(b) A is equinumerous to a proper subset of itself.
(c) N ≼ A.
6. The maximal principle implies that every set can be well ordered. [Hint: Let F
be the set of well orderings that have a domain which is a subset of X. If <₁
and <₂ belong to F and <₁ is an initial segment of <₂, write <₁ ⊑ <₂. F is
partially ordered by ⊑. Let B be a maximal chain in F. Then ⋃B
well-orders X.]
7. The maximal principle implies the axiom of choice. Of course this is immediate
from Exercises 4 and 6, but give a direct proof along the following lines. Let X
be a set of non-empty sets, and let F be the set of all functions f such that
Dom f ⊆ X and f is a choice function on its domain. Now partially order F by
proper set inclusion and consider a maximal chain.
8. Let G be a group, and let x ∈ G, x not the identity. Then G contains a
subgroup H such that x ∉ H and no subgroup K of G is such that x ∉ K and
H ⊂ K.
9. Prove that every vector space has a basis.
10. Let G be a two player game in which the players move alternately, each player
has only finitely many alternatives on each move, and each play of the game
ends in a win for one or the other after finitely many steps. Show that either
the first player has a strategy which guarantees a win for every play of the
game, or the second player has such a strategy. [Hint: Consider a tree in which
the first level consists of the finitely many moves that player 1 can make. For
each point at level 1 let the immediate successors (at level 2) of that point be
the possible moves of player 2 in response to this particular move by player 1.
Construct the third level analogously and continue. Now apply the infinity
lemma.]

1.10 Transfinite Numbers


The size of a finite set is the natural number that is equinumerous to it.
What is the size of an infinite set? What we want is a generalization of the
notion of number so that each set will be equinumerous to some number,
finite or transfinite. In order to obtain a hint on how to proceed, we first
turn our attention to a definition of the set of finite numbers and some of
its set theoretic consequences. The definition we adopt is due to von
Neumann.
The idea is to define 0 to be ∅, 1 to be {0}, 2 to be 1 ∪ {1} (in other
words {∅, {∅}}), 3 to be 2 ∪ {2} (i.e., {∅, {∅}, {∅, {∅}}}), and so on. In
general n + 1 is n ∪ {n}.
More precisely, we take N to be the smallest set X (in the sense of ⊆)
such that
i. 0 ∈ X,
ii. whenever x ∈ X, then x ∪ {x} ∈ X.
This definition justifies "proofs by induction." To show that every
element of N has property P, one need only show that 0 has the property,
and whenever n has the property, so does n ∪ {n}.
This definition has several pleasant features. The < relation among the
natural numbers can be defined to be the ∈ relation (see Theorem 10.3
below). The successor function is then f(n) = n ∪ {n}. With this definition n
has exactly n elements, namely n = {0, 1, ..., n − 1}.
Elementary number theory is often given an axiomatic foundation
based on the Peano axioms. These state that there is a set N and a unary
function f with domain N such that:
i. f is 1-1.
ii. There is a unique element 0 ∈ N such that 0 ∉ Ran f.
iii. If X ⊆ N and 0 ∈ X and f[X] ⊆ X, then X = N.
Another attractive feature of von Neumann's definition is the ease with
which Peano's axioms are shown to be satisfied, taking f(x) to be x ∪ {x}
(see Exercise 1). One can then proceed in the usual way (see Exercises 2,3,
and 4) to define addition and multiplication of integers, and then to
develop number theory.
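Von Neumann's construction is concrete enough to carry out mechanically for small n. The following sketch (an illustration added here, not part of the text) builds the numerals as nested frozensets and checks the features just described:

```python
# An illustration (not in the text): von Neumann's natural numbers built
# as hereditarily finite sets, with frozenset standing in for "set".
def von_neumann(n):
    """Return the von Neumann numeral n = {0, 1, ..., n-1}."""
    num = frozenset()              # 0 is the empty set
    for _ in range(n):
        num = num | {num}          # successor: n + 1 = n U {n}
    return num

nums = [von_neumann(n) for n in range(6)]

# n has exactly n elements, namely the earlier numerals ...
for n, vn in enumerate(nums):
    assert len(vn) == n
    assert vn == frozenset(nums[:n])

# ... and m < n coincides with membership m in n (the relation of
# Theorem 10.3 below).
for m in range(6):
    for n in range(6):
        assert (m < n) == (nums[m] in nums[n])
```

The successor map num ∪ {num} is exactly the f(x) = x ∪ {x} of the Peano discussion above, restricted to a finite initial segment.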
Our next few theorems will tell us more about the set theoretic proper-
ties of N.

Definition 10.1. x is ∈-transitive if whenever z ∈ x and y ∈ z, then y ∈ x.

Theorem 10.2. Each n ∈ N is ∈-transitive, and so is N.


PROOF: Let X be the set of ∈-transitive members of N. Surely 0 ∈ X. By
clause ii of the definition of N, if we show that x ∪ {x} ∈ X whenever x ∈ X,
then X = N and we are done. So let x ∈ X. Suppose that y ∈ x ∪ {x} and
z ∈ y. We must show z ∈ x ∪ {x}. But y ∈ x ∪ {x} implies y ∈ x or y = x.
Since x is ∈-transitive, y ∈ x and z ∈ y gives z ∈ x. y = x yields z ∈ x
immediately. Hence x ∪ {x} is ∈-transitive. Hence each n ∈ N is ∈-transi-
tive.
Now consider the set K of all x ∈ N such that y ∈ x implies y ∈ N.
Clearly 0 ∈ K. Suppose x ∈ K, and let y ∈ x ∪ {x}. Either y ∈ x or y = x; in
either case y ∈ N. Hence by the definition of N, K = N, and so N is
∈-transitive. □
Theorem 10.3. Each n ∈ N is well ordered by ∈, and so is N.

PROOF: The set X consisting of those members of N that are well ordered
by ∈ certainly contains 0. Suppose x ∈ X. We need to see that x ∪ {x} ∈ X.
x ∉ x, for otherwise {x} would be a subset of x having no ∈-minimal
member, contrary to the assumption that ∈ well-orders x. A similar
argument shows that there is no y ∈ x such that x ∈ y. Also, if z ∈ y and
y ∈ x, then z ∈ x by the preceding theorem. These observations together with
our assumption that ∈ well-orders x show that ∈ linearly orders x ∪ {x}.
Now let y be a non-empty subset of x ∪ {x}. If y ∩ x ≠ ∅, then y has an
∈-least member, since ∈ well-orders x. If y ∩ x = ∅, then y = {x}, and x is
the ∈-minimal member (we already noticed that x ∉ x). Thus every non-empty
subset of x ∪ {x} has an ∈-minimal member, and so x ∪ {x} is well ordered by
∈. Using this, similar considerations show that ∈ well-orders N also. □
What about extending the sequence 1, 2, 3, ... to include "infinite num-
bers"? Since n = {0, 1, ..., n − 1} for each n ∈ N, a reasonable candidate for
the first number greater than each "finite number" is N itself, {0, 1, 2, ...}.
Then why not go on and consider N ∪ {N} a number, the immediate
successor of N? Calling this number N + 1, we go on to the successor of
N + 1, namely N + 1 ∪ {N + 1}, which we call N + 2. Continuing, we get
N + 3, N + 4, and so on. Continuing, let the number immediately following
N, N + 1, N + 2, ... be {0, 1, 2, ..., N, N + 1, N + 2, ...}, which we can call N·2.
Then continue from there, letting N·2 + 1 = N·2 ∪ {N·2}, and so on. This
begins the sequence of ordinal numbers, but what we need is an explicit
way of defining them all.
It is tempting to define the ordinals as being the members of the least
set X such that
i. 0 ∈ X,
ii. x ∈ X implies x ∪ {x} ∈ X,
iii. x ⊆ X implies ⋃x ∈ X.
Unfortunately, such a set X is an impossibility, as we shall see in § 1.11.
Although this approach fails, there is a satisfactory alternative. Rather
than attempting to define the collection of all ordinals, we define the
property of being an ordinal, taking as our guideline Theorems 10.2 and
10.3.

Definition 10.4. α is an ordinal if

i. α is ∈-transitive, and
ii. α is well ordered by ∈.

We shall use α, β, γ, δ to denote ordinals, and we write Ord α as an
abbreviation of 'α is an ordinal'. We shall frequently write α < β instead of
α ∈ β.
By Theorems 10.2 and 10.3, each n ∈ N is an ordinal. Other examples
are given in the following.

Theorem 10.5.
i. Ord N.
ii. Ord α implies Ord α ∪ {α}.
iii. If α ∈ β and Ord β, then Ord α.

PROOF OF i. This is immediate by Theorems 10.2 and 10.3. □
PROOF OF ii. See the proof of Theorem 10.3. □
PROOF OF iii. Suppose α ∈ β and Ord β. Let x ∈ y and y ∈ α. Since β is
∈-transitive, y ∈ β, and hence so is x. Since β is linearly ordered by ∈,
x ∈ α. Hence α is ∈-transitive. Since α ⊆ β by the ∈-transitivity of β, it
follows that α is well ordered by ∈. Hence Ord α. □

From Theorem 10.5iii we have immediately

Corollary 10.6. If α is an ordinal, then α = {β : β ∈ α and Ord β}.

As we mentioned before, one cannot consistently talk about the set of
all ordinals. (We shall consider this problem again in §1.11.) Hence, the
∈-relation restricted to the ordinals cannot be considered a set either,
since then the domain of ∈ restricted to the ordinals would be a set, but
this is the collection of all ordinals. Nevertheless, ∈ is essentially a well
ordering of the ordinals, as we now show.

Theorem 10.7. Let α, β, γ be ordinals. Then:

i. α ∉ α.
ii. α ∈ β implies β ∉ α.
iii. α ∈ β and β ∈ γ implies α ∈ γ.
iv. Either α ∈ β or β ∈ α or α = β.
v. If X is a set of ordinals and X ≠ ∅, then X has an ∈-least element.

PROOF OF i AND ii. If α ∈ α, or if α ∈ β and β ∈ α, then {α} is a non-empty
subset of α with no ∈-minimal member, contrary to the assumption that α
is an ordinal. □
PROOF OF iii. Clear, since Ord γ implies that γ is ∈-transitive. □

PROOF OF iv. By Theorem 8.11 we can assume that there is an order
preserving function f : α → Γ where Γ is an initial segment of β (or f takes β
onto an initial segment of α). We show that f is the identity function. If
not, then there is an ∈-minimal element δ of α such that f(δ) ≠ δ. But
f(δ) = {f(ξ) : ξ ∈ δ} by our assumption on f and Corollary 10.6, and {f(ξ) :
ξ ∈ δ} = {ξ : ξ ∈ δ} = δ. Hence f is the identity function. If Ran f = β, then
α = β. If not, let γ be the least element in β − Ran f. Clearly Γ = γ, and
since f is the identity on α, γ = α. So α ∈ β. □

PROOF OF v. Suppose X is a set of ordinals and α ∈ X. If α ∩ X ≠ ∅, then it
has an ∈-least member β, since α is well ordered by ∈. Clearly β is the
∈-least member of X. If α ∩ X = ∅, then α is the ∈-least member of X. □

We may write α + 1 instead of α ∪ {α}. The notation is suggested by the
fact that n + 1 = n ∪ {n} for n ∈ N with '+' the usual addition, and the
following.

Corollary 10.8.
i. α + 1 is the immediate successor of α, i.e., α + 1 is the least
ordinal β such that α ∈ β.
ii. If X is a set of ordinals then Ord ⋃X.

PROOF OF i. Certainly α ∈ α + 1, and by Theorem 10.5ii, α + 1 is an ordinal.
Suppose γ ∈ α + 1. Then γ ∈ α or γ = α by the definition of α + 1. □

PROOF OF ii. Let X be a set of ordinals, and let z = ⋃X. Suppose that
x ∈ y and y ∈ z. For some ordinal α ∈ X, we have y ∈ α. Since α is
∈-transitive, x ∈ α. Hence x ∈ z. Thus z is ∈-transitive. Suppose s, t, u ∈ z.
Then for some α, β, γ ∈ X we have s ∈ α, t ∈ β, u ∈ γ. Since the ordinals are
∈-transitive and linearly ordered by ∈, we have that s, t, u ∈ δ, where δ is
the largest ordinal among α, β, γ. It then follows that s ∉ s; s ∈ t implies
t ∉ s; s ∈ t and t ∈ u implies s ∈ u; and either s ∈ t or t ∈ s or s = t. Hence z
is linearly ordered by ∈. Now let w be a non-empty subset of z. Then
w ∩ α ≠ ∅ for some α ∈ X. It follows that the ∈-minimal member of w ∩ α
is the ∈-minimal member of w. Hence z is well ordered by ∈. □
Given any well-ordered structure (A, <), there is an ordinal α such that
the well ordering ∈ ↾ α has the same mathematical properties as the
ordering <. To make this more precise we need the following definition.
We say that two partial orderings (A, <A) and (B, <B) are isomorphic if
there is a function f from A onto B such that for each a, a′ ∈ A we have a <A a′ iff
f(a) <B f(a′). We call such a function an isomorphism. We write (A, <A) ≅
(B, <B) if (A, <A) is isomorphic to (B, <B). A structure that is isomorphic
to (A, <A) can be thought of as a sort of copy of (A, <A) that will have all
of the mathematical properties of (A, <A) that depend only on the order-
ing <A and not on the nature of the elements of A (see Exercise 6). In this
sense, every well-ordered structure is a copy of some ordinal, as we now
show.

Theorem 10.9. If (A, <) is a well-ordered structure, then there is an ordinal α
such that (A, <) ≅ (α, ∈ ↾ α).

PROOF: Suppose that X and Y are initial segments of A with X ⊆ Y, and
that f_X is an isomorphism on X onto α ∈ Ord and f_Y is an isomorphism on
Y onto β ∈ Ord. It is easy to see that f_Y ↾ X is an isomorphism on X onto
some ordinal. By Theorem 8.11 we must have f_Y ↾ X = f_X. Hence f_X ⊆ f_Y.
Now let F be the set of all isomorphisms f such that Dom f is an initial
segment of A and Ran f ∈ Ord. By Theorems 8.12 and 10.8ii, ⋃F is an
isomorphism f* on an initial segment X* of A onto some ordinal α*. We
claim that X* = A. If not, let a* be the least element in A − X*. Now
extend f* to an isomorphism f# on X* ∪ {a*} onto α* + 1 by defining
f# = f* ∪ {(a*, α*)}. Clearly f# ∈ F, so that a* ∈ Dom f*, contradicting the
choice of a*. Therefore X* = A, and (A, <) is isomorphic to (α*, ∈ ↾ α*). □

Often an argument will be divided into cases, one corresponding to well
orderings with a maximal element and one to well orderings without. An
ordinal α with a maximal element β is called a successor ordinal because
α = β + 1; otherwise α is a limit ordinal. Notice that if α is a limit ordinal,
then α = ⋃α.
The inductive definitions given for addition and multiplication of nat-
ural numbers can be extended to all the ordinals as follows.
Addition of ordinals:
α + 0 = α,
α + (β + 1) = (α + β) + 1,
α + λ = ⋃_{β∈λ} (α + β) if λ = ⋃λ.
Multiplication of ordinals:
α·0 = 0,
α·(β + 1) = (α·β) + α,
α·λ = ⋃_{β∈λ} (α·β) if λ = ⋃λ.
Exponentiation of ordinals:
α^0 = 1,
α^(β+1) = α^β·α,
α^λ = ⋃_{β∈λ} α^β if λ = ⋃λ.
Many of the familiar properties that the arithmetic operations have
when applied to finite numbers no longer hold in the more general setting.
For example, 1 + ω = ω ≠ ω + 1, and ω·2 = ω + ω ≠ 2·ω, and (1 + 1)·ω =
2·ω = ω ≠ ω + ω. Hence, the extended operations are not commutative or
distributive. On the other hand, many of the familiar properties of plus and
times do carry over (see Exercises 7 and 8). Although there are many
interesting theorems about ordinal arithmetic, they are outside our main
interest, and we go on to consider another notion of number.
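The computation behind the failure of commutativity, 1 + ω = ω ≠ ω + 1, can be written out directly from the recursion clauses for addition (a worked derivation added here for illustration; it is not in the original):

```latex
1+\omega \;=\; \bigcup_{n\in\omega}(1+n) \;=\; \bigcup_{n\in\omega}(n+1) \;=\; \omega,
\qquad\text{whereas}\qquad
\omega+1 \;=\; \omega\cup\{\omega\} \;\neq\; \omega .
```

The left sum is computed at the limit stage, where only finite pieces of the second summand are ever added; the right sum tacks a new top element onto ω.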
The finite numbers have an attribute that some of the ordinals do not
have. No two members of N are equinumerous, and so each finite set has a
unique number associated with it, its magnitude. However, this is not a
property that all ordinals enjoy. In fact, as we shall see, if α is infinite and
β is no greater than α, then α ≈ α + β ≈ α·β. However, there is a more
restrictive notion of number that does not have this defect.

Definition 10.10. A cardinal is an ordinal that is not equinumerous to any
smaller ordinal. We write Card x if x is a cardinal.

We shall use the letters κ, λ, μ, and ν to denote cardinals.

Clearly, since each cardinal is an ordinal, ∈ well-orders the cardinals in
the sense of Theorem 10.7.

Theorem 10.11.
i. Each n ∈ N is a cardinal, and N is also.
ii. If κ is an infinite cardinal, then {α : Ord α and α ≼ κ} is a cardinal, and in
fact is the smallest cardinal larger than κ.
iii. If x is a set of cardinals, then ⋃x is a cardinal.

PROOF OF i is clear. □

PROOF OF ii. Since {α : Ord α and α ≼ κ} is ∈-transitive and well ordered, it
is an ordinal β. Since we cannot have β ∈ β, we must have κ < β, and β is
not equinumerous to any α ∈ β (for α ∈ β implies α ≼ κ, and β ≼ κ would
give β ∈ β). Hence β is a cardinal greater than κ. If λ ∈ β, then λ ≼ κ, so
β is the least cardinal greater than κ. □

PROOF OF iii. Let x be a set of cardinals. Then ⋃x is an ordinal by
Corollary 10.8ii. Now suppose that α ∈ ⋃x and α ≈ ⋃x. But then α ∈ κ
for some κ ∈ x. Also, by the ∈-transitivity of κ, α ⊆ κ. Hence, by the
Cantor-Bernstein theorem (actually Theorem 6.1) we see that α ≈ κ, which
is impossible, since Card κ. Hence ⋃x is a cardinal. □
When viewed as a cardinal, N is usually denoted by ω or ω₀ or ℵ₀. With
each ordinal α associate a cardinal ω_α such that ω_α is the immediate
cardinal successor of ω_β if α = β + 1 and ω_α = ⋃_{β∈α} ω_β if α = ⋃α. ℵ_α is
often used in place of ω_α.
We use κ⁺ to denote the cardinal successor of κ. If κ = λ⁺ for some λ,
then κ is called a successor cardinal; otherwise κ is a limit cardinal.
As a consequence of the axiom of choice, every set has its magnitude.

Theorem 10.12. For every x there is a unique cardinal κ such that x ≈ κ.

PROOF: Let < be a well ordering of x. By Theorem 10.9, (x, <) is
isomorphic to (α, ∈ ↾ α) for some ordinal α. Let κ be the least ordinal such
that κ ≈ α. Then x ≈ κ. Of course, there can be no other cardinal λ such
that x ≈ λ, since this implies κ ≈ λ, which in turn implies that either κ or λ is
not a cardinal. □

Definition 10.13. If x ≈ κ and Card κ, we write c(x) = κ and say that the
cardinality of x is κ.

Although cardinals are ordinals, the arithmetic of ordinals does not
specialize to an arithmetic for cardinals. For example, the ordinal sum of
two cardinals need not be a cardinal, as is the case for ω + ω. Similarly for
ordinal multiplication of cardinals: ω·2 is not a cardinal. This suggests that
we should define the cardinal sum of κ and λ to be c(κ + λ), where '+' here
is ordinal addition, and define the cardinal product of κ and λ to be c(κ·λ)
where '·' is ordinal multiplication. These definitions are unduly com-
plicated. Equivalent (see Exercise 9) but simpler definitions that do not
depend on those for ordinal addition and multiplication are the following.

Definition 10.14.
i. The cardinal sum of κ and λ is c(κ ∪ {(0, α) : α ∈ λ}).
ii. The cardinal product of κ and λ is c(κ × λ).

We denote the cardinal sum of κ and λ and the cardinal product of κ
and λ by κ + λ and κ·λ.
Although we are using '+' and '·' to denote the ordinal addition and
multiplication as well as cardinal addition and multiplication, it should be
clear which is intended by the context and by our convention of using
κ, λ, μ, ν for cardinals and α, β, γ, δ for ordinals.
Much of cardinal arithmetic is extremely simple because of the follow-
ing fact.

Lemma 10.15. If κ is infinite, then κ·κ = κ.


PROOF: Suppose that the theorem is true for all λ ∈ κ. Let X = κ × κ, let
X′_α = {(α, β) : β ≤ α}, let X″_α = {(β, α) : β < α}, and let X_α = X′_α ∪ X″_α.
Clearly X = ⋃_{α∈κ} X_α. Next we need a particular well ordering of X, which the
diagram in Figure 10.1 will help explain.
If x, y ∈ X, say that x <* y iff one of the following holds:
i. x ∈ X_α, y ∈ X_β and α ∈ β.
ii. x ∈ X′_α, y ∈ X″_α.
iii. x, y ∈ X′_α and x = (α, β) and y = (α, γ) and β < γ.
iv. x, y ∈ X″_α and x = (β, α) and y = (γ, α) and β < γ.

Figure 10.1. (Diagram of the well ordering <* of X = κ × κ, arranged in the blocks X_α.)

Since X′_α and X″_α are well ordered by <*, so is X_α (by Theorem 8.6), and
hence so is X (by Theorem 8.6 again). We show that (X, <*) is isomorphic
to (κ, ∈ ↾ κ).
First let S be a proper initial segment of X. Then X − S has a <*-least
element x*. Let x* ∈ X_{α*}. Then S ⊆ (α* + 1) × (α* + 1) = {(γ, δ) : γ, δ ∈
α* + 1}. Let Y = (α* + 1) × (α* + 1). Since Y ≈ c(α* + 1) × c(α* + 1), and
since c(α* + 1)·c(α* + 1) = c(α* + 1) by the induction hypothesis, we have
c(S) ∈ κ. Hence, <* is a well ordering in which every proper initial segment has
cardinality less than κ.
Let β be that ordinal such that (β, ∈ ↾ β) ≅ (X, <*) (by Theorem 10.9).
Clearly β ∉ κ (otherwise Card κ is false, since β ≈ X and κ ≼ X). From what we
have just shown, κ ∉ β. Hence κ = β, and so κ ≈ X as needed. □
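The ordering <* can be made concrete on ω × ω. The sketch below (an illustration added here, not part of the text; star_key is an invented helper) places each pair in the block X_α for α the larger coordinate, with the pairs of X′_α preceding those of X″_α, and checks that block X_α occupies positions α² through (α + 1)² − 1 in the resulting enumeration:

```python
from itertools import product

# Invented helper: the sort key realizing the well ordering <* of
# Lemma 10.15 on pairs of natural numbers.  A pair lands in block
# X_alpha for alpha = max of its coordinates; within a block, the pairs
# (alpha, beta) with beta <= alpha come first, then (beta, alpha) with
# beta < alpha.
def star_key(pair):
    a, b = pair
    alpha = max(a, b)
    if a == alpha:                 # pair lies in X'_alpha
        return (alpha, 0, b)
    return (alpha, 1, a)           # pair lies in X''_alpha

n = 20
pairs = sorted(product(range(n), repeat=2), key=star_key)
position = {p: i for i, p in enumerate(pairs)}

# The enumeration is a bijection between the n*n pairs and n*n positions.
assert len(position) == n * n
assert pairs[0] == (0, 0)
# Block X_alpha has 2*alpha + 1 elements, so it starts at position alpha**2.
for alpha in range(n):
    assert star_key(pairs[alpha * alpha])[0] == alpha
```

Restricted to ω, this is one of the classical pairing bijections between ω × ω and ω, which is exactly what the lemma asserts at κ = ω.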

Theorem 10.16. Let κ and λ be cardinals with ω ≤ κ and λ ≤ κ. Then

i. κ + λ = λ + κ = κ, and
ii. if λ ≠ 0, then κ·λ = λ·κ = κ.

PROOF: Since κ + λ ≈ κ ∪ {(0, α) : α ∈ λ} ≈ λ ∪ {(0, β) : β ∈ κ} ≈ λ + κ, we have
κ + λ = λ + κ. Also κ × λ ≈ λ × κ, so that κ·λ = λ·κ. Since κ ≤ κ + λ ≤ κ + κ ≤
κ·κ and κ ≤ κ·λ ≤ κ·κ, we have by Lemma 10.15 that κ + λ = κ·λ = κ. □
Definition 10.17. Let F be a function with domain α. By ∏_{β∈α} F(β) we
mean the set of all functions f such that Dom f = α and f(β) ∈ F(β) for
each β ∈ α.

One can think of this definition as a generalization of the finite
Cartesian product.
The next theorem is extremely useful in making computations with
cardinals.

Theorem 10.18 (König's Lemma). Suppose κ_β < λ_β for each β ∈ α. Then
⋃_{β∈α} κ_β ≺ ∏_{β∈α} λ_β.

PROOF: For each δ ∈ ⋃_{β∈α} κ_β let H(δ) be that function h in ∏_{β∈α} λ_β such
that for all β ∈ α, h(β) = δ if δ ∈ κ_β and h(β) = 0 otherwise. Clearly
H : ⋃_{β∈α} κ_β → ∏_{β∈α} λ_β is 1-1. We need only show that ⋃_{β∈α} κ_β is not equi-
numerous to ∏_{β∈α} λ_β. For suppose G : ⋃_{β∈α} κ_β → ∏_{β∈α} λ_β. Let
X_β = {(G(δ))(β) : δ ∈ κ_β}. Then X_β ⊆ λ_β and λ_β − X_β ≠ ∅. Let J(β) be the
least element in λ_β − X_β. Then J ∈ ∏_{β∈α} λ_β, but clearly J ∉ Ran G. Hence G
is not onto. Therefore ⋃_{β∈α} κ_β ≺ ∏_{β∈α} λ_β. □
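A sample consequence, added here for illustration (it is not in the original text): König's Lemma rules out 2^ω = ω_ω. Taking α = ω, κ_n = ω_n, and λ_n = ω_{n+1} gives

```latex
\omega_\omega \;=\; \bigcup_{n\in\omega}\omega_n
\;\prec\; \prod_{n\in\omega}\omega_{n+1}
\;\preceq\; (\omega_\omega)^{\omega}.
```

If 2^ω were equal to ω_ω, then by the exponent laws of Theorem 10.21 below, (ω_ω)^ω = (2^ω)^ω = 2^(ω·ω) = 2^ω = ω_ω, contradicting the strict inequality.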
By analogy with finite arithmetic, one might conjecture that if 1 < κ_β ≤
λ_β for all β ∈ α, then ⋃_{β∈α} κ_β ≺ ∏_{β∈α} λ_β, but this is false (see Exercise 13).
The definition of cardinal exponentiation does not follow the pattern
used for addition and multiplication; κ^λ in the cardinal sense is not defined
as the cardinality of κ^λ in the ordinal sense. Instead, the definition is
motivated by the fact that for finite cardinals m and n, the set ^nm of
functions from n into m has exactly m^n elements.

Definition 10.19. If κ and λ are cardinals, then κ^λ = c(^λκ), the cardinality
of the set of functions from λ into κ.
Again our notation involves some ambiguity, since κ^λ now has two
different meanings depending on whether ordinal exponentiation or cardi-
nal exponentiation is used. For example, 2^ω in the ordinal sense is count-
able, but 2^ω in the cardinal sense is not. However, in the remainder of the
text, exponentiation will always mean cardinal exponentiation.
The proof of the next lemma is trivial and so is omitted.
Lemma 10.20.
i. If λ ≥ μ, then κ^λ ≥ κ^μ.
ii. If κ ≥ λ, then κ^μ ≥ λ^μ.

Theorem 10.21.
i. κ^λ·κ^μ = κ^(λ+μ).
ii. (κ^λ)^μ = κ^(λ·μ).
iii. κ^λ·μ^λ = (κ·μ)^λ.

PROOF OF i. We consider only the case where λ ≥ μ and λ ≥ ω and κ ≥ 2,
since the other cases are analogous or trivial. By the lemma, κ^λ ≥ κ^μ and
κ^λ ≥ ω. By Theorem 10.16, κ^λ·κ^μ = κ^λ. Also λ + μ = λ, so κ^(λ+μ) = κ^λ. □

PROOF OF ii. Let f ∈ ^μ(^λκ). Then for every α ∈ μ, β ∈ λ we have (f(α))(β) ∈
κ. Let H map f to the function g_f defined by g_f(α, β) = (f(α))(β). Then
H : ^μ(^λκ) → ^(λ×μ)κ is 1-1. It is also clear that each g ∈ ^(λ×μ)κ is g_f for some
f ∈ ^μ(^λκ), and so H is onto ^(λ×μ)κ. Hence (κ^λ)^μ = κ^(λ·μ). □

PROOF OF iii. We consider the case where λ ≥ ω, since the other case is
trivial. We may assume that κ ≥ μ. Then κ^λ·μ^λ = κ^λ by the lemma and
Theorem 10.16. If κ ≥ ω, then κ·μ = κ by Theorem 10.16, and so (κ·μ)^λ = κ^λ
also. If κ < ω, then (κ·μ)^λ ≤ (κ²)^λ = κ^λ by the lemma and part ii. □
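For finite cardinals the three laws reduce to counting functions between finite sets, which can be checked mechanically (an illustration added here, not in the text; num_functions is an invented helper):

```python
from itertools import product

# Invented helper: for finite sets, kappa^lambda counts the functions
# from a lambda-element set into a kappa-element set; we count them by
# enumerating all tuples of values.
def num_functions(dom_size, cod_size):
    return sum(1 for _ in product(range(cod_size), repeat=dom_size))

k, l, m = 3, 2, 2   # finite stand-ins for kappa, lambda, mu

# i.  kappa^lambda * kappa^mu = kappa^(lambda + mu)
assert num_functions(l, k) * num_functions(m, k) == num_functions(l + m, k)
# ii. (kappa^lambda)^mu = kappa^(lambda * mu)
assert num_functions(m, num_functions(l, k)) == num_functions(l * m, k)
# iii. kappa^lambda * mu^lambda = (kappa * mu)^lambda
assert num_functions(l, k) * num_functions(l, m) == num_functions(l, k * m)
```

The bijections behind these counts are the same maps H used in the infinite case; only there do the absorption laws of Theorem 10.16 take over.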

Here are two calculations that make use of Lemma 10.20 and Theorem
10.21 and show that Lemma 10.20 does not hold if we replace '≤' by '<'
throughout.
i. (2^ω)^ω = 2^(ω·ω) = 2^ω = (2^ω)^1 (compare with Lemma 10.20i).
ii. 2^ω = 2^(ω·ω) = (2^ω)^ω ≥ ω^ω, which along with 2^ω ≤ ω^ω gives 2^ω = ω^ω (compare
with Lemma 10.20ii).
As we shall see in the next sections, the value of κ^λ in relation to κ and λ
is very mysterious. For example, while it is consistent to assume that
2^ω = ω₁, it is also consistent to assume that 2^ω = ω₂. In fact, it is known that
if κ > ω and there is no countable subset X of κ such that ⋃X = κ, then
one can consistently assume that 2^ω = κ. No principle we have used so far
determines the value of 2^ω.
We end this section by considering extensions of two of the most
commonly used principles in mathematics, definition by recursion and
proof by induction on ω. What makes these principles work is the fact that
ω is well ordered. Therefore it is not surprising that these principles have
useful extensions to transfinite ordinals. We shall consider several exten-
sions here and in the problems. Others, which are more complicated to
state, will not be mentioned. For the most part, they are easy enough to
devise when the need arises, following the form of the versions given here.

Theorem 10.22 (Transfinite Induction). Suppose that for each β ∈ α, β ⊆ X
implies β ∈ X. Then α ⊆ X.

PROOF: If the hypothesis is true but the conclusion false for X and α, let γ
be the least element in α − X. But then γ ⊆ X, and so γ ∈ X, a contradic-
tion. Hence no such γ exists, i.e., α ⊆ X. □

This theorem is frequently stated in the following form.

Theorem 10.22'. Suppose that X ⊆ α and the following are true:

i. 0 ∈ X.
ii. β ∈ X and β + 1 ∈ α implies β + 1 ∈ X.
iii. β ⊆ X and β ∈ α implies ⋃β ∈ X.
Then X = α.

PROOF: Suppose γ is the least ordinal in α − X. Then γ ≠ 0, γ is not a
successor ordinal, nor is γ a limit ordinal; hence no such γ exists. □

The set X above is usually specified by a property ψ, i.e., X is given as
{β : ψ(β)}.
Another useful form of transfinite induction is the following.

Theorem 10.22". Suppose that ψ is a property such that ψ(α) whenever ψ(β)
for all β ∈ α. Then ψ(α) for every ordinal α.

PROOF: Like that of Theorem 10.22. □
The difference between Theorems 10.22 and 10.22" is that α has been
replaced by Ord (which, as we will see in the next section, is not a set), and
X has been replaced by the property ψ, which does not correspond to a set
either. Theorem 10.22' can be stated with analogous modifications.
Next we consider various transfinite generalizations of definition by
recursion. Again, we shall not try to give the most comprehensive versions
of the theorem, since these are more complicated to state, but follow the
same general outline as the simpler versions.

Theorem 10.23 (Definition by Transfinite Recursion). Let G be a function
such that for each β ∈ α and each s ∈ ^β(Ran G) we have (β, s) ∈ Dom G.
Then there is a unique function F such that Dom F = α and for each β ∈ α

F(β) = G(β, s_β),

where Dom s_β = β and for each γ ∈ β, s_β(γ) = F(γ).


PROOF: Let ℱ be the set of all functions f such that Dom f = α_f for some
ordinal α_f ≤ α, and for each β ∈ α_f, f(β) = G(β, s_β^f), where Dom s_β^f = β and
s_β^f(γ) = f(γ) for each γ ∈ β. ℱ is non-empty, since the empty function is a
member, and ℱ is partially ordered by ⊆. By Theorem 9.3, (ℱ, ⊆) has a maximal
chain; say B is such. Let F = ⋃B. Clearly F is a function and F(β) =
G(β, s_β) for each β ∈ Dom F. We need only see that α = Dom F. If not, let
δ be the least element in α − Dom F. It is obvious that F ∪ {(δ, G(δ, s_δ))} ⊋
F, where s_δ(γ) = F(γ) for each γ ∈ δ. This contradicts the maximality of B,
and so Dom F = α.
We now prove the uniqueness of F. Suppose F and F′ are two different
functions with the properties stated in the theorem. Let δ* be the least δ
such that F(δ) ≠ F′(δ). But then s_δ* = s′_δ*, where s_δ*(γ) = F(γ) and s′_δ*(γ) =
F′(γ) for each γ ∈ δ*. Hence F(δ*) = G(δ*, s_δ*) = F′(δ*), and so no such δ*
exists, i.e., F is unique. □
Another version of definition by transfinite recursion can be given,
which is in the spirit of Theorem 10.22", with Ord in place of α. To
facilitate the statement of this version we need some definitions. A prop-
erty ψ is said to be functional if for each x there is at most one y such that
ψxy [i.e., such that (x, y) has the property ψ]. If ψ is functional and ψxy, we
say that x is in the domain of ψ and y is in the range of ψ, and we write
ψx = y. Of course this notation is in complete agreement with that used for
functions. The difference is that functions are sets, whereas some proper-
ties (such as Ord α, or β = α + 1 and Ord α) do not correspond to sets, as we
shall see in the next section.

Theorem 10.23′. Suppose that ψ is a functional property such that (α, s) is
in the domain of ψ whenever s is a function from an ordinal α into the
range of ψ. Then there is a unique functional property θ such that

    θ(α) = ψ(α, s_α)

whenever Ord α and s_α is the function on α such that s_α(γ) = θ(γ) for
each γ ∈ α.
PROOF: Like that of Theorem 10.23. □
The content of this theorem is, for the time being, as vague as the notion
of property, which will not be defined precisely until §3.4.

EXERCISES FOR § 1.10


1. Let f(x) = x ∪ {x}. With von Neumann's definition of N, show that Peano's
axioms are satisfied.
2. Let + be the least set X such that:
   i. (n, 0, n) ∈ X for each n ∈ N.
   ii. (n, m, k) ∈ X implies (n, m + 1, k + 1) ∈ X for each n, m, k ∈ N.
Write n + m = l if (n, m, l) ∈ X. Show that X is a function with Dom X = N × N.
Show that n + m = m + n for each n, m ∈ N.
3. Let · be the least set X such that:
   i. (n, 0, 0) ∈ X for each n ∈ N.
   ii. (n, m, k) ∈ X implies (n, m + 1, k + n) ∈ X for each n, m, k ∈ N.
Write n · m = l if (n, m, l) ∈ X. Show that X is a function with Dom X = N × N.
Show that n · m = m · n for each n, m ∈ N.
4. Prove that n · (m + k) = n · m + n · k for each n, m, k ∈ N.
5. Prove that '≅' is an equivalence relation, i.e., prove that:
   (a) (A, <_A) ≅ (A, <_A).
   (b) (A, <_A) ≅ (B, <_B) implies (B, <_B) ≅ (A, <_A).
   (c) (A, <_A) ≅ (B, <_B) and (B, <_B) ≅ (C, <_C) implies (A, <_A) ≅ (C, <_C).
6. Show that if (A, <_A) ≅ (B, <_B) and <_A is a linear ordering, then so is
<_B. Similarly for <_A a dense ordering and a well ordering.
7. Show that α · (β + γ) = α · β + α · γ, where α, β, γ are ordinals. However,
it is not always true that (α + β) · γ = α · γ + β · γ.
8. Prove the following laws of exponents for ordinal exponentiation:

    (α^β)^γ = α^(β·γ),
    α^β · α^γ = α^(β+γ).

However, it is not true that α^γ · β^γ = (α·β)^γ for all ordinals α, β, γ.
(Consider whether 2^ω · 2^ω = 4^ω; we already know that 2^ω · 2^ω = 2^(ω+ω).)
9. Show that Definition 10.14 is equivalent to the definition of cardinal sum
and product suggested in the preceding paragraph.

10. Suppose cX = cX′, cY = cY′, and X and Y are infinite. Show that
c(X ∪ Y) = c(X′ ∪ Y′), c(X × Y) = c(X′ × Y′), and c(^Y X) = c(^Y′ X′).
11. Suppose κ, λ, and μ are cardinals. Show that (κ + λ) · μ = κ · μ + λ · μ.

12. Suppose that for each n ∈ ω, κ_n < 2^ω. Show that ⋃_{n∈ω} κ_n < 2^ω.
13. A relation R is well founded if for each x either {y : y R x} = ∅ or there
is a y* such that y* R x and there is no z such that z R y* and z R x. (Note
that every well ordering is well founded.) Prove the following extension of
Theorem 10.22: Suppose R is well founded and that X has the property that
whenever x ∈ Dom R and {y : y R x} ⊆ X, then x ∈ X. Then Dom R ⊆ X.
14. Extend Theorem 10.22″ to well-founded properties along the lines of
Exercise 13.
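For a finite fragment of N, the least set X of Exercise 2 can be computed by iterating rules i and ii to a fixed point. The bound N and the function name below are ours, chosen just for illustration.

```python
# Build the least ternary relation X closed under the two rules of Exercise 2,
# restricted to triples whose entries stay below N.

def plus_relation(N):
    X = {(n, 0, n) for n in range(N)}            # rule i
    while True:
        new = {(n, m + 1, k + 1)                 # rule ii
               for (n, m, k) in X
               if m + 1 < N and k + 1 < N}
        if new <= X:                             # nothing new: least fixed point reached
            return X
        X |= new

X = plus_relation(10)
assert all((n, m, n + m) in X for n in range(5) for m in range(5))
assert (3, 4, 7) in X and (4, 3, 7) in X        # n + m = m + n on this fragment
print(len(X))                                    # 55
```

That the iteration stabilizes is the finite analogue of X being the *least* set closed under the rules: every triple in the result is forced by finitely many applications of i and ii.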

1.11 Paradise Lost, Paradox Found (Axioms for Set Theory)
One of the most appealing features of set theory is its generality. Its broad
applicability stems from the fact that so many notions of classical
mathematics can be formulated within it, so that set theory can be used as a
foundation for all of classical mathematics. However, the unbridled use of
'set' to refer to any collection of objects quickly leads to mathematical
nonsense.
For example, one cannot say that there is a set whose elements are
exactly the ordinal numbers. Such a set would be well ordered by ∈
(Theorem 10.7) and transitive (Theorem 10.5iii), and so would itself be an
ordinal α. But then Ord α and α ∈ α, an impossibility. Exactly the same
argument shows that no set contains all the cardinals.
A more direct example was given by Bertrand Russell. Say that a set x
is normal if x ∉ x; otherwise say that x is abnormal. Is there a set X whose
members are precisely the normal sets? If so, then X is either normal or
abnormal, and not both. X is not normal, for if it were, then X ∈ X (by the
definition of X) and so X would be abnormal. On the other hand, X is not
abnormal, for if it were, then X ∈ X, and by the definition of X, X would
be normal. Hence X is neither normal nor abnormal, a contradiction.
Of course, the set of all normal sets, if it existed, would contain all
ordinals, and so this example can be reduced to the one preceding it.
However, Russell's example does not use the notion of ordinal number,
but merely set membership.
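There is a loose computational analogue of Russell's argument, in which a predicate plays the role of a set and application plays the role of membership; it is only an illustration of the self-referential regress, not part of the set-theoretic development.

```python
import sys

# "x is normal" becomes: the predicate f does not hold of itself.
# Membership x ∈ y is modeled (very loosely) as "predicate y holds of x".
normal = lambda f: not f(f)

# Asking whether `normal` is itself normal mirrors asking whether Russell's
# set X is a member of itself: the evaluation can never settle on an answer.
sys.setrecursionlimit(1000)
try:
    normal(normal)
    outcome = "answered"
except RecursionError:
    outcome = "no stable answer (infinite regress)"

print(outcome)
```

The analogy is imperfect (Python predicates are not extensional sets), but it shows the same circularity: any attempted answer to "is X normal?" immediately demands the opposite answer.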
Exactly what is it about the assumptions 'Ord is a set' or 'the set of all
normal sets exists' that leads to nonsense? This problem has been much
discussed among philosophers and mathematicians for many decades, but
no completely satisfactory answer has been given.

What should one do with set theory in view of these paradoxical 'sets'?
Should we abandon its use as a notational and conceptual framework for
classical mathematics? And what about Cantor's proof that most numbers
are transcendental? Should that be abandoned because of the paradoxes
even though none of the paradoxical "sets" are mentioned in the proof? In
fact, the paradoxical "sets" of the examples never arise in any branch of
classical mathematics, and the sets that do arise seem to be completely
innocuous as far as giving rise to contradictions is concerned.
In the early part of this century, Russell and Whitehead in their
Principia, and Zermelo in a series of papers, attempted to axiomatize a
significant portion of set theory in a way that would avoid the paradoxes.
The axiomatization given by Zermelo and later modified by Fraenkel is
the one most frequently encountered today. This axiomatization appears to
be highly successful. The axioms have strong intuitive appeal, apparently
asserting simple truths about sets. The axiomatization seems to be free of
contradiction, and moreover, is strong enough to provide a base for all of
classical mathematics. One axiom, the axiom of extensionality, says that a
set is determined by its members. The remaining axioms either state that a
certain set exists or that a set is obtained from a given set by a specified
operation. In developing set theory within such an axiomatic framework,
the only sets that can be asserted to exist are those that can be proven to
exist by a valid argument whose only premises about sets are those given
by the axioms.
The remainder of this section is devoted to the Zermelo-Fraenkel
axiomatization (abbreviated ZF).

Axiom of Extensionality. If x and y have the same elements, then x = y.

So this axiom says that a set is completely determined by its members.

Axiom of Regularity. Every non-empty set has an ∈-least member, i.e., if
there is some y ∈ x, then there is some z ∈ x for which there is no
w ∈ z ∩ x.

We could have appealed to this axiom in Definition 10.4, the definition
of ordinal. In view of the axiom of regularity, clause ii can be replaced by
   ii. α is linearly ordered by ∈.
This shortens the proofs of Theorem 10.5 and Corollary 10.6 somewhat,
but we did not want to mention this axiom at that time.

Axiom of the Null Set. There is a set with no members.

This is the set we have been calling ∅ or 0. Of course, the axiom of
extensionality implies that there is only one such set.

Axiom of Pairing. If x and y are sets, then there is a set z such that for
all w, w ∈ z iff w = x or w = y.

This axiom says that for every x and y, {x,y} exists.

Axiom of Union. For every x there is a y such that z ∈ y iff there is a
w ∈ x with z ∈ w.

So this axiom says that if x exists, then so does ⋃x.

Axiom of the Power Set. For every x there is a y such that for all z, z ∈ y
iff z ⊆ x.

In other words, if x is a set, so is P(x).

Axiom of Infinity. There is a set x such that ∅ ∈ x and whenever y ∈ x,
then y ∪ {y} ∈ x.

This axiom assures us that there is a set containing ω. The next axiom
will allow us to extract ω from such a set.

Axiom of Replacement. If P is a functional property and x is a set, then
Ran(P↾x) is a set, i.e., there is a set y such that for every z, z ∈ y iff
there is a w ∈ x such that P(w) = z.

Recall from §1.10 that a functional property P is one such that for every
x there is at most one y such that Pxy. In our examples of paradoxes the
pathological sets are inordinately large. If y is a set obtained by
replacement as the range of P restricted to x, then the magnitude of y is no
greater than that of x, which provides some intuitive justification for
replacement as an axiom.
Our statement of the axiom of replacement is somewhat sloppy in that
the notion of a property is undefined. At this point we shall take it to
mean any statement about sets mentioning only the ∈-relation. A precise
version of this axiom (and more elegant statements of the others) will be
given in §3.4.
We used the axiom of replacement in proving Theorem 10.9, although
an alternative proof can be given that avoids its use. In our proof we need
to know that there is a set F whose elements are the f_α's for α ∈ A. Let
P(u, v) hold iff u ∈ A and v = f_u, or u ∉ A and v = ∅. Then P is
functional, and the range of P restricted to A is the needed set F. The
axiom of replacement was used again in Theorem 10.11ii.
A very useful consequence of the axiom of replacement is the axiom of
separation, which says that a definable subset of a set is a set. In other
words, if x is a set and S a property, then there is a set y whose elements
are exactly those elements of x which satisfy S. To deduce the axiom of
separation from replacement, first choose some a ∈ x such that a has
property S (if no such a exists, then the axiom of the null set gives
{y : y ∈ x and Sy}). Now let Puv be the property

    Either u ∈ x and Su and v = u, or u ∉ x and v = a, or not Su and v = a.

Clearly, for every u there is a unique v such that Puv. Now the axiom of
replacement applied to x and this P gives the set {z : z ∈ x and Sz}
immediately.
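The derivation can be mirrored in code: if the only set-forming operation available is taking the image of a functional property over a set (replacement), then filtering (separation) is obtained exactly as in the proof, by mapping the unwanted elements to a fixed witness. The function names here are ours.

```python
def replacement(P, x):
    """Image of the functional property P over the set x -- the only
    set-forming primitive we allow ourselves in this sketch."""
    return frozenset(P(u) for u in x)

def separation(S, x):
    """Derive {z in x : S(z)} using only `replacement`, mimicking the proof:
    pick a witness a in x with S(a), and map everything else to a."""
    witnesses = [u for u in x if S(u)]
    if not witnesses:                 # the null-set case in the text
        return frozenset()
    a = witnesses[0]
    # "Either u in x and Su and v = u, ... or not Su and v = a"
    P = lambda u: u if S(u) else a
    return replacement(P, x)

evens = separation(lambda n: n % 2 == 0, frozenset(range(10)))
print(sorted(evens))   # [0, 2, 4, 6, 8]
```

Since the witness a itself satisfies S, the image of P over x is exactly {z ∈ x : Sz}, just as in the argument above.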
On the other hand, we will see in Exercise 7 of § 1.12 that the axiom of
separation does not imply replacement and hence is a weaker axiom.
Since the axiom of separation is a theorem of ZF as we have just seen,
there is no need to include it as an additional axiom.
Zermelo's original axiomatization included separation but not regularity.
Fraenkel later modified the axiomatization by deleting separation and
adding replacement.
Separation has been used implicitly several times in previous sections.
As another example of its use, we can prove that if x is a set, then ⋂x is
a set. If x = ∅, then we are done, and if there is some y ∈ x, then
separation gives us the existence of the set {z : z ∈ y and z ∈ w for all
w ∈ x}, which is ⋂x. Letting s be the set specified in the axiom of
infinity, and letting x be the set of all z ⊆ s such that ∅ ∈ z and
whenever u ∈ z then u ∪ {u} ∈ z (x exists by the axioms of power set and
separation), we see that ⋂x is a set, so the axioms give the existence of ω.
The axiom of separation is strong enough to develop most of classical
mathematics within our set-theoretic framework. However, quite recently,
Martin found an assertion in analysis that can be proved from replacement
but not from separation. The assertion is 'Every Borel set of reals is
determined'. (The set of Borel sets is the smallest set containing the open
intervals that is closed under complementation and countable unions. A set
X is determined if one of the two players in the following game has a
winning strategy: The players a and b move alternately, beginning with the
a player. Each chooses an integer between 0 and 9 inclusive; say a chooses
a_i on the ith move and b chooses b_j on the jth. Then player a wins just
in case 0.a₁b₂a₃b₄… ∈ X.)
The axioms mentioned so far, including replacement, constitute the
Zermelo-Fraenkel axiomatization, and as we have said are sufficient to
form a foundation for classical mathematics. Moreover, the axiomatization
is extremely elegant in that it can be stated in terms of ∈ alone (see
§3.4). On the other hand, there are statements of relatively recent vintage
that have considerable mathematical interest but cannot be proved or
disproved in ZF. Some of these have been considered as additional axioms.
The one that has gained the broadest acceptance is the axiom of choice.
The axiomatization obtained by adjoining the axiom of choice to the
Zermelo-Fraenkel axioms will be denoted by ZFC.
At this point we want to stress that everything done in the preceding
sections can be justified within ZFC, and most of it within ZF. For
example, consider the notion of the ordered pair (x, y). The only property
of the ordered pair that has mathematical significance is the following: If
(x, y) = (u, v), then x = u and y = v. If we are to develop a suitable
notion of ordered pair within our axiomatic framework, then (x, y) has to
be given a

definition in terms of x, y, and ∈, and then, using ZFC, the ordered pair
must be shown to exist and to have the requisite property. This we now do.
The axiom of pairing asserts that for every x and y there is a z such that
w ∈ z iff w = x or w = y. By extensionality, there is only one z such that
w ∈ z iff w = x or w = y. This z is, of course, the unordered pair of x and
y, which we denote by {x, y}. For x = y we write {x} instead of {x, x}. Now
define (x, y) to be {{x}, {x, y}}. Another application of pairing gives

Theorem 11.1. For every x and y, (x, y) exists.

Theorem 11.2. If (x, y) = (r, s), then x = r and y = s.


PROOF: Suppose (x, y) = (r, s). By extensionality, (x, y) and (r, s) have
the same members. Since {x} ∈ (x, y) [by definition of (x, y)],
{x} ∈ (r, s). Similarly {x, y} ∈ (r, s). But z ∈ (r, s) iff z = {r} or
z = {r, s} [by definition of (r, s)]. The argument now breaks up into two
cases.
Case 1: {x} = {r}. Then x = r by extensionality. {x, y} = {r} or
{x, y} = {r, s}. The first implies that y = r = x (extensionality again),
and so {x, y} = {x} and (x, y) = {{x}}. Since {r, s} ∈ (x, y),
extensionality implies that {r, s} = {x} and so s = x = y (extensionality),
and we are done. If {x, y} = {r, s}, then y = r or y = s by extensionality.
If y = s, we are done. If y = r, then y = x and the argument just given
shows r = s = y = x.
Case 2: {x} = {r, s}. Then by extensionality, r = s = x, so
(r, s) = {{x}}. Since {x, y} ∈ (r, s), extensionality implies y = x = s and
we are done.
Thus, arguing within ZF, we have shown that for every x and y the
ordered pair (x, y) exists and has the requisite properties. □
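The definition and the property of Theorem 11.2 can be checked mechanically on small sets; here is a sketch using Python frozensets, which behave extensionally and so can stand in for hereditarily finite sets. This is an illustration only, not part of the ZF development.

```python
def pair(x, y):
    """Kuratowski ordered pair (x, y) = {{x}, {x, y}}, built from frozensets
    so that pairs can themselves be members of further sets."""
    return frozenset({frozenset({x}), frozenset({x, y})})

# The one property that matters (Theorem 11.2): equal pairs have equal
# coordinates. A brute-force check over a small sample of values:
vals = [0, 1, 2, "a"]
for x in vals:
    for y in vals:
        for r in vals:
            for s in vals:
                if pair(x, y) == pair(r, s):
                    assert x == r and y == s
print("Theorem 11.2 holds on this sample")
```

Note that pair(x, x) collapses to {{x}}, exactly as in Case 1 of the proof, and the check above still succeeds on such degenerate pairs.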
Given x, let ⋃x be the unique y whose existence is assured by the
union axiom (the uniqueness of y is a consequence of extensionality). The
unique y of the power set axiom is denoted by P(x).
Theorem 11.3. For every X and Y, there is a Z such that for all z, z ∈ Z
iff there is an x ∈ X and a y ∈ Y such that z = (x, y).

PROOF: If X and Y are sets, so is {X, Y} by the axiom of pairing. So X ∪ Y
(i.e., ⋃{X, Y}) exists by the axiom of union. Two applications of the
power set axiom give us the existence of P(P(X ∪ Y)). For every x ∈ X
and y ∈ Y we have {x} ∈ P(X ∪ Y) and {x, y} ∈ P(X ∪ Y); hence
(x, y) ∈ P(P(X ∪ Y)). The axiom of replacement (in fact separation is
enough) gives us the existence of {(x, y) : x ∈ X, y ∈ Y, and
(x, y) ∈ P(P(X ∪ Y))}.
Denoting the Z of the theorem by X × Y, we have just proved in ZF
that the Cartesian product of any two sets exists (i.e., is a set). □
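The proof is concrete enough to run on small finite sets: build P(P(X ∪ Y)) and separate out the Kuratowski pairs. A sketch (all names ours):

```python
from itertools import combinations

def powerset(s):
    """P(s) as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )

def pair(x, y):
    # Kuratowski pair {{x}, {x, y}}
    return frozenset({frozenset({x}), frozenset({x, y})})

def product(X, Y):
    """X × Y carved out of P(P(X ∪ Y)) by separation, as in Theorem 11.3."""
    big = powerset(powerset(X | Y))
    return frozenset(z for z in big
                     if any(z == pair(x, y) for x in X for y in Y))

X, Y = frozenset({1, 2}), frozenset({3})
assert product(X, Y) == frozenset({pair(1, 3), pair(2, 3)})
print(len(product(X, Y)))   # 2
```

Every pair (x, y) really does land inside P(P(X ∪ Y)): its two members {x} and {x, y} are subsets of X ∪ Y, hence members of P(X ∪ Y).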
We could now go on and repeat all the definitions and theorems of the
preceding sections, justifying each step by an argument from the axioms of
ZFC. However, what we have just done should help convince the reader
that such a task is a straightforward, albeit tedious, exercise. At the end
of this section there are several problems that continue in this direction.
There are other statements that have been considered as axioms, but
none of them has gained the wide acceptance of the axiom of choice.
Moreover, it has been shown by Gödel that if ZF is free of contradiction,
then so is ZFC (we shall say more about this in the next section). With
some of these other axioms, either no such proof of relative consistency
can be given, or else the axiom does not appeal to the intuitions as
strongly as the axiom of choice (at least at this time), or it is not as
useful.
Perhaps the most famous of these axioms is the generalized continuum
hypothesis, abbreviated GCH. This states that the cardinal successor of the
cardinal κ is 2^κ. The special case for κ = ω, namely the assertion
ω₁ = 2^ω, is called the continuum hypothesis (CH). Since 2^ω = cP(ω) and
P(ω) ~ R (see §1.5), the CH says that every subset of R either is
equinumerous to R or is countable. As concrete a statement as this seems,
Gödel's work in 1938 and Cohen's work in 1963 show that CH and GCH can be
neither proved nor refuted within ZFC.
A very active area of investigation in the past few years has centered
around axioms of infinity. Roughly speaking, these axioms assert the
existence of extremely large cardinals. The axiom of infinity of ZF is an
axiom of infinity and gives us the existence of ω, a cardinal which has the
following two properties:
   i. n ∈ ω implies 2^n ∈ ω, and
   ii. if n ∈ ω and m₀, m₁, …, m_{n−1} ∈ ω, then ⋃_{i∈n} m_i ∈ ω.
It is natural to ask if there are any other cardinals κ that enjoy these
properties, i.e., cardinals κ such that
   i. λ ∈ κ implies 2^λ ∈ κ, and
   ii. λ ∈ κ and ν_α ∈ κ for every α ∈ λ implies ⋃_{α∈λ} ν_α ∈ κ.

Such a κ is called a strong inaccessible, and the statement that there is a
strong inaccessible other than ω we denote by IC. IC is an example of an
axiom of infinity. We shall see in the next section that IC cannot be
proved in ZFC even if we adjoin the GCH. However, there are many
mathematicians who find this axiom appealing. Their point of view is that
any axiom consistent with ZFC that enlarges the domain of set theory by
introducing new sets should be accepted. The consistency of ZFC with IC
seems highly likely, but it has been shown that a proof of relative
consistency like that given for AC or GCH is impossible.
In recent years there has been a proliferation of axioms of infinity,
asserting the existence of larger and larger cardinals, so that in
comparison a strongly inaccessible cardinal is quite small. One such axiom
asserts the existence of a weakly compact cardinal larger than ω. Let
[X]^n denote the set of n-element subsets of X: [X]^n = {{x₁, …, x_n} :
x_i ∈ X and x_i ≠ x_j if i ≠ j}. Say that κ is weakly compact if whenever
Y ⊆ [κ]², then there is some Z ⊆ κ such that cZ = κ and either [Z]² ⊆ Y
or [Z]² ⊆ [κ]² − Y. ω is weakly compact, by a famous theorem of Ramsey
(see Exercise 3). However, the axiom asserts the existence of a larger
weakly compact cardinal. Moreover, it can be shown that if ω ∈ κ and κ is
weakly compact, then κ is strongly inaccessible, but much larger than the
first strongly inaccessible cardinal. No proof of relative consistency of
this axiom with ZF can be given, since no such proof can be given for IC.
No proof of relative consistency of this axiom with ZF can be given, since
no such proof can be given for IC.
Far larger than the first uncountable weakly compact cardinal is the
first measurable cardinal, a notion introduced by Ulam. Say that a
function f : P(κ) → {0, 1} is a two-valued measure on κ if f(κ) = 1 and
whenever λ < κ and {X_α : α ∈ λ} is a set of pairwise disjoint subsets of
κ, then f(⋃_{α∈λ} X_α) = ∑_{α∈λ} f(X_α). A cardinal κ is measurable if
there is a two-valued measure f on κ.
Other cardinals have been considered which dwarf the first measurable
cardinal (huge, super-huge, Vopenka, etc.), and the game of finding ever
larger axioms of infinity continues.
Before we close this section we want to mention several famous
statements that, like the axiom of choice, are independent of ZF, i.e., can
be neither proved nor refuted within ZF. The first concerns the reals with
the usual ordering (R, <). This ordering is dense, is complete (every set
having an upper bound has a least upper bound), and has the property that no
uncountable set of intervals can be pairwise disjoint. Souslin's hypothesis
asserts that any other ordering (A, <) having these three properties is
isomorphic to the reals. This statement, earthy as it seems, is independent
not only of ZF but also of ZFC plus the GCH.
Another question, even more relevant to analysis, is the following. Is
every set of reals measurable in the sense of Lebesgue? Lebesgue
measurability is a standard notion that is treated in most texts dealing
with real analysis, and we shall not discuss it here except to say that the
Lebesgue measure assigns to certain subsets of the reals a length, and this
notion of length generalizes the usual one as applied to, say, a union of
disjoint intervals. In courses on real analysis, it is proved that sets
exist that are not Lebesgue measurable, and these proofs invariably use the
axiom of choice. In fact, choice must be used in a strong form, for Solovay
in 1964 proved that the assertion 'all sets are Lebesgue measurable' is
consistent with ZF and a restricted version of choice. This restricted
version of choice, called the countable axiom of choice, asserts that every
countable set of non-empty sets has a choice function. Countable choice and
the assumption that all sets of reals have length in the sense of Lebesgue
yield a very slick development of the early portion of real analysis.
Finally, we want to mention an axiom that is not known to be consistent
with ZF. This is Mycielski's axiom of determinacy. Recall the game
mentioned in our discussion of Martin's theorem: A set X of reals is fixed,
and two players a and b alternately choose integers between 0 and 9
inclusive; say a chooses a₁, then b chooses b₂, then a chooses a₃, etc. If
the number 0.a₁b₂a₃b₄… belongs to X, then player a is declared the winner;
otherwise b wins. The axiom of determinacy states that the game is
determined regardless of X, i.e., one of the players has a winning strategy
(of course, which player has a winning strategy will depend on X). ZFC
implies the negation of the axiom of determinacy, although ZF plus
determinacy implies that every countable set of non-empty subsets of R has
a choice function. Moreover, it is conjectured that ZF plus determinacy is
consistent with the countable axiom of choice. Much of the interest in
determinacy stems from its implications for real analysis. For example,
determinacy implies that every set of reals is Lebesgue measurable. In
addition, determinacy implies that every set of reals has the property of
Baire and every uncountable set of reals contains a perfect set (we leave
the definition of these notions to a course in real analysis). Determinacy
is also known to imply the consistency of 'there is a measurable cardinal',
which, as we shall see in the next section, shows that determinacy is not a
consequence of ZF.
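A toy run of the digit game may help fix the rules. The payoff set X below (reals in [0, 1) whose first digit is 5) is our own simple example, chosen so that player a obviously has a winning strategy; the function names are ours as well.

```python
def play(strategy_a, strategy_b, rounds=6):
    """Alternate moves, a first; each strategy sees the history of digits so
    far and returns the next digit. Returns the digit sequence produced."""
    digits = []
    for turn in range(rounds):
        mover = strategy_a if turn % 2 == 0 else strategy_b
        digits.append(mover(tuple(digits)))
    return digits

# For this X, a winning strategy for player a: open with 5.
strategy_a = lambda history: 5
strategy_b = lambda history: 7          # b's moves cannot matter for this X

digits = play(strategy_a, strategy_b)
a_wins = digits[0] == 5                  # membership test for this particular X
print(digits, a_wins)
```

For such a simple X the game is trivially determined; the force of the axiom lies in asserting determinacy for *every* X, however wild.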

EXERCISES FOR § 1.11


1. Prove in ZF that if x and y are sets, then so is ^y x, i.e., there is a
set z such that for all w, w ∈ z iff w : y → x.
2. Show that if κ is strongly inaccessible and if λ and ν_α (α ∈ λ) are less
than κ, then cΠ_{α∈λ} ν_α ∈ κ. (Recall that Π_{α∈λ} ν_α is the set of all
functions f on λ such that f(α) ∈ ν_α for each α ∈ λ.)
3. Ramsey's theorem states that whenever X₁ ∪ X₂ ∪ ⋯ ∪ X_m = [ω]^n, there
is an infinite Z ⊆ ω such that [Z]^n ⊆ X_j for some j. This theorem has
been remarkably useful in algebra, number theory, and logic. Moreover, its
generalizations initiated an entire subject, combinatorial set theory, and
motivated several large cardinal axioms. We sketch the proof, leaving the
details as exercises for the reader. We consider only the case m = 2, since
the theorem says nothing when m = 1, and the case m > 2 reduces to the case
m − 1. The case n = 1 is easy. Suppose m = 2, n = 2. Let x₀ = 0. Let
Z_{1i} = {n : n ≠ 0 and {0, n} ∈ X_i} for i = 1, 2. Either Z₁₁ or Z₁₂ is
infinite. Let Z₁ be Z₁₁ if cZ₁₁ = ω, and let p₁ = 1; otherwise let
Z₁ = Z₁₂ and p₁ = 2. Let x₁ ∈ Z₁. Let Z_{2i} = {n : n ≠ x₁, n ∈ Z₁, and
{x₁, n} ∈ X_i}. If cZ₂₁ = ω, take Z₂ = Z₂₁ and p₂ = 1; otherwise take
Z₂ = Z₂₂ and p₂ = 2. Now choose x₂ ∈ Z₂. Continuing, we get x₀, x₁, x₂, …;
Z₁, Z₂, Z₃, …; p₁, p₂, p₃, …. Either p_i = 1 for infinitely many i's, or
p_i = 2 for infinitely many i's. Suppose the first (the second case yields
a completely analogous argument). Let Z = {x_i : p_{i+1} = 1}. Take
x_i, x_j ∈ Z where i < j. Then x_j ∈ Z_{i+1}, and since p_{i+1} = 1, we
have {x_i, x_j} ∈ X₁. Thus [Z]² ⊆ X₁. An analogous argument reduces the
case m + 1 to m when m ≥ 2.
4. Prove that any set of pairwise disjoint open intervals of reals is countable.
5. Consider the game described on pp. 49 and 50. For which of the following
choices of X does the first player have a winning strategy? ∅, R, {~, ~}, Q.
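The greedy construction sketched in Exercise 3 also terminates on a finite ground set, where it still produces a set homogeneous for the coloring (though of course no longer infinite): at each step the majority color class replaces the role of the infinite class. A sketch, with all names ours:

```python
def greedy_homogeneous(N, color):
    """color(i, j) in {1, 2} for i < j; returns (Z, c) such that every pair
    from Z receives color c. Mirrors the x_i / Z_i / p_i construction of
    Exercise 3, with 'infinite' replaced by 'larger' at each step."""
    live = list(range(N))
    xs, ps = [], []
    while len(live) > 1:
        x = live.pop(0)
        class1 = [n for n in live if color(x, n) == 1]
        class2 = [n for n in live if color(x, n) == 2]
        p = 1 if len(class1) >= len(class2) else 2
        xs.append(x)
        ps.append(p)
        live = class1 if p == 1 else class2   # all later picks pair with x in color p
    c = 1 if ps.count(1) >= ps.count(2) else 2
    Z = [xs[i] for i in range(len(xs)) if ps[i] == c]
    return Z, c

color = lambda i, j: 1 if (i + j) % 2 == 0 else 2   # a sample 2-coloring of pairs
Z, c = greedy_homogeneous(40, color)
assert all(color(i, j) == c for k, i in enumerate(Z) for j in Z[k + 1:])
print(len(Z), c)
```

Homogeneity holds because once x is chosen with color p, every surviving element pairs with x in color p, and Z keeps only the x's whose recorded color is the majority one.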

1.12 Declarations of Independence


This section is a meager introduction to what is now a vast and highly
technical branch of set theory, the study of independence and relative
strength of axioms. We prove here, among other things, that IC is not
implied by ZFC. We also discuss the famous results of Gödel and Cohen.
We need several notions from model theory, which we shall meet again
in much greater detail and generality in §3. A structure is an ordered pair
(A, e) where A ≠ ∅ and e is a binary relation on A. Let σ be a statement
in the language of set theory. Say that (A, e) is a model of σ if σ is true
in (A, e) when ∈ is interpreted as e, 'for all x' is interpreted as 'for
all x ∈ A', and 'there is an x' is interpreted as 'there is an x in A'.
For example, (R, <) is a model of the axiom of extensionality, since if x
and y are real numbers such that z < x iff z < y for all real numbers z,
then x = y. However, (R, <) is not a model of the axiom of the null set. It
is also easy to see that (R, <) is a model of union but not of power set
(see Exercise 1).
If Σ is a set of assertions in the language of set theory, then we say
that (A, e) is a model of Σ if each sentence in Σ is true in (A, e). We
write (A, e) ⊨ σ or (A, e) ⊨ Σ if (A, e) is a model of σ or a model of Σ,
respectively.
Once e is defined in a given context, we may refer to (A, e) simply as A.
Our main interest in this section is in consistency results. We say that
an assertion σ is consistent with a set of assertions Σ just in case
Σ ∪ {σ} has a model. If both σ is consistent with Σ and the negation of σ
is consistent with Σ, then σ is said to be independent of Σ. So to prove
that σ is independent of Σ it is enough to display two models of Σ, one in
which σ is true and one in which σ is false. We say that Σ implies σ if
every model of Σ is a model of σ. Hence Σ implies σ iff the negation of σ
is not consistent with Σ.
As an example, let Σ be the axioms defining a group, and let σ be the
commutative or Abelian axiom. Since there are groups (i.e., models of Σ)
that are Abelian and others that are not, σ is independent of Σ.
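This group-theoretic example can be checked mechanically on two small concrete models, which is exactly the "one model each way" recipe for independence. The code below is only an illustration; the particular models (Z₃ and S₃) are our choice.

```python
from itertools import permutations

# Two models of the group axioms: (Z_3, +) is Abelian; the symmetric group
# S_3 (permutations of 3 letters under composition) is not.

z3 = list(range(3))
add = lambda a, b: (a + b) % 3

s3 = list(permutations(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))

def abelian(elems, op):
    """Does the commutative axiom hold in this model?"""
    return all(op(a, b) == op(b, a) for a in elems for b in elems)

print(abelian(z3, add))      # True:  the axiom holds in this model
print(abelian(s3, compose))  # False: and fails in this one
```

Together the two runs witness independence: neither σ nor its negation follows from the group axioms alone.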
It is easy to find examples of groups, but the situation is quite
different regarding models of ZF, and even fragments of ZF. The consistency
of ZF ∪ {σ} implies the existence of a model for ZF, but Gödel has shown
that the existence of a model of ZF cannot be proved from ZF. Since ZF is
a sufficient framework for all of classical mathematics, this means that
the existence of a model of ZF is a new assumption, to be taken on faith,
that is not provable within classical mathematics. Nevertheless, the
intuitive appeal of the axioms and the fact that no contradiction has been
derived from them in the forty-plus years following their discovery give
us confidence that such a model exists, but this (or the existence of a
model for some fragments of ZF) has to be taken as an additional
assumption, which we do without further mention.

We begin with a string of definitions and lemmas that will be
fundamental to all our consistency proofs.

Definition 12.1. Fix x. For each ordinal α define P_α(x) as follows:

    P₀(x) = x,
    P_{α+1}(x) = P(P_α(x)),
    P_γ(x) = ⋃_{β∈γ} P_β(x)  for γ = ⋃γ.

(This is an example of a recursive definition on Ord as discussed in
§1.10.)
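The finite stages P_i(∅), which reappear in Theorem 12.4, can be computed literally; their sizes grow as iterated exponentials. A sketch, with helper names ours:

```python
from itertools import combinations

def powerset(s):
    """P(s) as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )

# P_0 = emptyset, P_{i+1} = P(P_i): the first finite stages of Definition 12.1
# with x = the empty set.
stage = frozenset()
sizes = []
for i in range(5):
    sizes.append(len(stage))
    stage = powerset(stage)

print(sizes)   # [0, 1, 2, 4, 16]
```

The next stage would already have 2^16 = 65536 elements, which is why only the first few levels of the hierarchy can be written down explicitly.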
The next lemma lists some of the elementary properties of the Pa's that
we need.

Lemma 12.2.
i. If x is ∈-transitive, then so is P_α(x).
ii. If β ∈ α, then P_β(x) ∈ P_α(x) and so P_β(x) ⊆ P_α(x).
iii. β ∈ P_α(x) iff β ∈ α.

PROOF: The proof of each clause is by induction on α. Clearly, parts i
through iii are true when α = 0. Suppose i through iii hold for all δ ∈ α.
There are two cases to consider: α = γ + 1 for some γ, and α = ⋃α. We
prove the first and leave the second case, which is similar, as an
exercise.
Let z ∈ w and w ∈ P_α(x). By the definition of P_α(x), we have
w ⊆ P_γ(x) and so z ∈ P_γ(x). By the induction hypothesis P_γ(x) is
∈-transitive, and so z ⊆ P_γ(x). Hence z ∈ P_α(x), giving part i.
The definition of P_α(x) immediately gives P_γ(x) ∈ P_α(x), so by part
i, P_γ(x) ⊆ P_α(x).
If β < α, then β ≤ γ, and so by the induction hypothesis β ⊆ P_γ(x). So
β ∈ P_α(x). Conversely, if β ∈ P_α(x), then β ⊆ P_γ(x), and so
β ≤ γ < α. This proves part iii. □

Lemma 12.3. Suppose that α = ⋃α (i.e., that α is a limit ordinal). Let
M = ⋃_{β∈α} P_β(s), where s is ∈-transitive. Then (M, ∈↾M) is a model of
the following axioms:
i. null set,
ii. extensionality,
iii. regularity,
iv. pairing,
v. union,
vi. power set,
vii. choice.

PROOF OF i follows from Lemma 12.2iii. □

PROOF OF ii. Suppose x, y ∈ M and x ≠ y. Then there is a
z ∈ (x − y) ∪ (y − x). By Lemma 12.2i, z ∈ M. Hence there is a z ∈ M such
that z ∈ (x − y) ∪ (y − x), and so extensionality holds in M. □

PROOF OF iii. Let x ∈ M be non-empty, and let y be ∈-minimal in x. Then
y ∈ M by Lemma 12.2i and is an ∈-minimal member of x. □

PROOF OF iv. Suppose x, y ∈ M. Then x ∈ P_β(s) and y ∈ P_γ(s) for some
β, γ ∈ α. Let δ = β ∪ γ. Then {x, y} ⊆ P_δ(s), and so
{x, y} ∈ P_{δ+1}(s) ⊆ M. □

PROOF OF v. Let x ∈ M; say x ∈ P_β(s), where β ∈ α. If z ∈ ⋃x, then
z ∈ P_β(s) by two applications of Lemma 12.2i. Hence ⋃x ⊆ P_β(s) and so
⋃x ∈ P_{β+1}(s) ⊆ M. □

PROOF OF vi. Again let x ∈ P_β(s), where β ∈ α. If y ⊆ x, then y ⊆ P_β(s)
by Lemma 12.2i. Hence y ∈ P_{β+1}(s), and so P(x) ∈ P_{β+2}(s) ⊆ M. □

PROOF OF vii. Suppose x is a set of non-empty sets and x ∈ P_β(s), where
β ∈ α. Let f be a choice function on x. Then f is a set of ordered pairs
(y, f(y)) where y ∈ x and f(y) ∈ y. Since (y, f(y)) = {{y}, {y, f(y)}} and
since y, f(y) ∈ P_β(s) by Lemma 12.2i, we have (y, f(y)) ∈ P_{β+2}(s).
Hence f ⊆ P_{β+2}(s), and so f ∈ P_{β+3}(s) ⊆ M. So M satisfies the axiom
of choice. This concludes the proof of Lemma 12.3. □
Our next theorem shows that the axiom of infinity is not redundant. In
fact, none of the axioms of ZFC are redundant. There are several problems
at the end of this section that provide a partial proof of this claim.

Theorem 12.4. Let M = ⋃_{i∈ω} P_i(∅). Then (M, ∈↾M) is a model of ZFC
except for the axiom of infinity. Hence, the axiom of infinity is not
implied by the other axioms of ZFC.
PROOF: In view of the preceding lemma, we need only see that M satisfies
the axiom of replacement but not the axiom of infinity.
To see that replacement holds, let ψ be a property such that for every
y ∈ M there is a unique z ∈ M such that ψyz. Let x ∈ M. An easy induction
on i shows that P_i(∅) is finite for each i ∈ ω. Hence x is finite (by
Lemma 12.2i). For each y ∈ x let f(y) be the first i ∈ ω such that ψyz
for some z ∈ P_i(∅). Then the range of f is finite and has an upper bound
m ∈ ω. So {z : z ∈ M and ψyz for some y ∈ x} ⊆ P_m(∅), and hence
{z : z ∈ M and ψyz for some y ∈ x} ∈ P_{m+1}(∅) ⊆ M. Therefore,
replacement holds in M.
Now suppose x ∈ M and x satisfies the axiom of infinity, i.e., ∅ ∈ x
and y ∪ {y} ∈ x whenever y ∈ x. Hence x ⊇ ω. Let x ∈ P_i(∅) where
i ∈ ω. Since i + 1 ∈ x and x ⊆ P_i(∅) (by Lemma 12.2i), we have
i + 1 ∈ P_i(∅), contradicting Lemma 12.2iii. Hence no such x exists, and
M is not a model of the axiom of infinity. □
It may happen that an element of M has a property ψ when viewed
within the structure (M, e) but does not have the property when considered
in the context of all sets. For example, ω + 1 is a cardinal in P_{ω+2}(∅)
but not in the context of all sets (Exercise 9). In the work of Gödel and
Cohen, x may be the power set of y in the sense of one model but not in
the sense of another or in the context of all sets. However, there are many
important properties that behave more uniformly, at least with respect to
structures of the following kind.

Definition 12.5. The structure (M,e) is standard if M is ∈-transitive and
e = ∈↾M.

Definition 12.6. The n-ary property ψ is absolute if for every standard
structure M and every x₁, ..., xₙ ∈ M, we have ψx₁, ..., xₙ is a true statement
about sets iff M ⊨ ψx₁, ..., xₙ.

Lemma 12.7. The following are absolute:
i. x = y,
ii. x ∈ y,
iii. x ⊆ y,
iv. x = {y,z},
v. x = (y,z),
vi. x is an ordered pair,
vii. R is a relation,
viii. f is a function,
ix. f is a 1-1 function with Dom f = x and Ran f = y,
x. x is ∈-transitive,
xi. Ord x.
PROOF: Let (M,e) be a standard structure, so that M is ∈-transitive and e
is ∈↾M. Then parts i and ii are immediate, and iii follows directly from
ii. □
PROOF OF iv. Let x, y, z ∈ M. By part ii and the ∈-transitivity of M, for any
w ∈ M, w ∈ x iff w ∈ M and M ⊨ w ∈ x. Hence the following statements are
equivalent:
M ⊨ x = {y,z}.
For every w ∈ M, w ∈ x iff w = y or w = z.
For every w, w ∈ x iff w = y or w = z.
x = {y,z}. □
PROOF OF v follows from iv and the definition of (y,z) as {{y}, {y,z}}. □
PROOF OF vi. Let x ∈ M. The following statements are equivalent:
M ⊨ x is an ordered pair.
There exist y, z ∈ M such that x = (y,z).
There exist y, z such that x = (y,z).
x is an ordered pair.
To get from the third statement to the second, note that if x = (y,z) and
x ∈ M, then {y}, {y,z} ∈ M and so y, z ∈ M. □
PROOF OF vii follows from ii and vi. □
PROOF OF viii follows from vii. □
PROOF OF ix. Let f ∈ M. By part viii we may suppose that f is a function
and that M ⊨ f is a function. Notice that x ≠ y and (x,z), (y,z) ∈ f iff
(x,z), (y,z) ∈ M, x ≠ y, and (x,z), (y,z) ∈ f. Hence f is 1-1 just in case
M ⊨ f is 1-1. Also x ∈ Dom f iff (x,y) ∈ f for some y, iff (x,y) ∈ M and
(x,y) ∈ f for some y, iff M ⊨ x ∈ Dom f. Hence z = Dom f iff M ⊨ z = Dom f.
Similarly for Ran f. □
PROOF OF x. Let x ∈ M. Then z ∈ y and y ∈ x iff z, y ∈ M and z ∈ y and
y ∈ x. □
PROOF OF xi follows from x and ii. □
Now let's look at the other axioms of infinity discussed in §1.11. All are
known to imply IC, the axiom that asserts the existence of a strongly
inaccessible cardinal. Since our next theorem states that IC is not implied
by ZFC, it follows that the other axioms of infinity are not implied by
ZFC either.

Theorem 12.8. ZFC does not imply IC.
PROOF: Suppose ZFC implies IC, and let κ be the first strongly inaccessible
cardinal. Let M = ⋃_{β∈κ} P_β(0). We claim that (M, ∈↾M) is a model of
ZFC but not of IC. In view of Lemma 12.3 we need only show that
(M, ∈↾M) is a model of infinity and replacement but not of IC.
Since ω ∈ M and M is ∈-transitive, it follows that M is a model of the
axiom of infinity.
Now suppose ψ is a property such that for each y ∈ M there is a unique
z ∈ M such that ψyz. Let x ∈ M. Our argument now is quite similar to that
given for Theorem 12.4. For each y ∈ x we let f(y) be the least α ∈ κ such
that ψyz for some z ∈ P_α(0). Let Z be the range of f. An easy induction on
β, using the fact that κ is strongly inaccessible, shows that the cardinality
of P_β(0) belongs to κ for all β ∈ κ. Hence the cardinality of x belongs to κ,
and so does the cardinality of Z. Again using the strong inaccessibility
of κ we get that ⋃Z ∈ κ. Let γ = ⋃Z. Then {z : ψyz for some y ∈ x} ⊆
P_γ(0) and so is a member of M. Hence replacement is true in M.
We now show that IC is false in M. Let α ∈ κ (so that α is an ordinal in
M). By Lemma 12.7xi, α is an ordinal and α ∈ κ. Hence either α is not a
cardinal, or α ≤ 2^λ for some cardinal λ ∈ α, or α = ⋃{λ_β : β ∈ λ} where
λ ∈ α and {λ_β : β ∈ λ} ⊆ α. In the first two cases there is a function
f : α → P(λ) for some λ ∈ α. Since λ, α ∈ M and since f ⊆ P_{α+3}(0) ⊆ M, we
have by Lemma 12.7iii,ix that f : α → P(λ) in the sense of M. Hence α is not
strongly inaccessible. In the remaining case, λ and each λ_β for β ∈ λ
belong to P_α(0), and so {λ_β : β ∈ λ} ∈ P_{α+1}(0). Hence ⋃{λ_β : β ∈ λ} ∈ M,
so α is not strongly inaccessible in this case either. □

We shall now use the terms structure and model in a more general
context that involves an abuse of notation. We now think of a structure as
an ordered pair (M,e) where M is a unary property and e is a binary
property. The abuse of notation arises because M and e need not be sets,
in which case (M,e) no longer denotes an ordered pair of sets. However, we
think of M as the collection of all sets x having the property M. In general,
this collection, like the collection of all sets, or the collection of all
cardinals, will be "too large" to be a set.
The notion of truth in (M,e) and the notion of model are then extended
in the obvious way. For example, to say that pairing is true in (M,e), or
that (M,e) is a model of pairing, or that (M,e) ⊨ pairing, means that for
every x and y having the property M, there is a z having the property M
such that for all w having the property M, w e z iff w = x or w = y. Clearly,
there is a need to further abuse notation, and we write 'x ∈ M' instead of
'x has the property M'; also, we will write 'x e y' instead of '(x,y) has the
property e' (except for one exercise, e will be ∈, so this takes care of itself).
The definitions of '∈-transitive' and 'standard model' are generalized in
the obvious way.
Now let Σ be the axioms of ZFC other than regularity. Let x have the
property M just in case x ∈ P_α(0) for some α. Now consider (M, ∈↾M).
Trivial modifications of Lemma 12.3 and the proof of Theorem 12.4 show
that M is a model of ZFC. Hence if the axioms of ZF other than regularity
have a model, then ZF has a model. On the other hand, there is a model of
the axioms of ZF other than regularity in which regularity fails. Thus the
regularity axiom is not redundant, and in fact we have

Theorem 12.9. The axiom of regularity is independent of the other axioms of
ZF.

For many years, the status of the axiom of choice relative to ZF
remained a mystery. Is choice independent of ZF, or provable from ZF, or
perhaps even refutable from ZF? The same question arises regarding the
generalized continuum hypothesis and ZF. As far as the axioms of ZF
itself are concerned, each can be shown to be independent of the remaining
ones without too much effort (see Exercises 2 and 3 for a couple of
instances). For choice and especially the GCH, the problem is much more
difficult and more crucial: more crucial because these statements are far
less intuitive to most mathematicians than the statements of ZF; heuristic
arguments for or against these axioms as valid assertions about the
universe of all sets are not very compelling. A proof of AC from ZF would
cause AC to be accepted as a valid statement about sets; a disproof from
ZF would cause its rejection; and similarly for GCH. But such proofs are
impossible, for in 1938 Gödel, assuming the existence of a model for ZF,
described a model of ZFC plus GCH; then, in 1963, Cohen, assuming the
existence of a model for ZF, produced a model of ZF in which choice fails,
and a model of ZFC in which GCH fails. (One can show that choice is a
consequence of ZF plus GCH, and so no model of ZF plus GCH exists in
which choice is false.) Together these results prove the following.

Theorem 12.10.
i. The axiom of choice is independent of ZF.
ii. GCH is independent of ZFC.

The constructions of Gödel and Cohen are much too involved to give
here, although it might appear at first glance that part of Gödel's contribution
has already been dealt with in Theorem 12.8 and Theorem 12.9, where
we constructed a model of ZFC. However, our proofs that choice holds in
these models depended on choice being used in the universe of all sets.
Thus, assuming the truth of ZFC in the universe of all sets, we produced
models of ZFC having additional properties. But suppose that some
doubter believes that ZF is true in the universe of all sets but that choice is
not; even more, he suspects that choice can be disproved from ZF. Gödel
produced a model of ZFC assuming only that ZF has a model, and hence
showed that ZFC is as consistent as ZF, and so choice cannot be disproved
from ZF.
Not only are these theorems of Gödel and Cohen milestones in the
foundations of mathematics in themselves, but the techniques used to
prove them have been extremely fruitful in the last decade, yielding
consistency results that answered longstanding problems in logic, topology,
analysis, algebra, and other branches of mathematics. Work in this direc-
tion is still continuing at an enormous rate.

EXERCISES FOR §1.12

1. Show that (R, <) is a model of union but not power set.
2. The power set axiom cannot be proved from the remaining axioms of ZF. Let
M be the set of all x such that x, ⋃x, ⋃⋃x, ⋃⋃⋃x, ... are all
countable. Show that all the axioms of ZF are true in M except for the power
set axiom.

3. Prove the remaining half of Lemma 12.2, for α = ⋃α.


4. (a) Show that there is no finite sequence of sets x₀, x₁, ..., xₙ such that x₀ ∈ x₁
∈ ⋯ ∈ xₙ ∈ x₀.
(b) Show that there is no sequence of sets x₀, x₁, ... such that ⋯ ∈ x₂ ∈ x₁ ∈ x₀.
(Hint: Use regularity.)
5. Suppose that ψ is a property such that for every x, ψx whenever ψy for all y ∈ x.
Show that ψx for all x. An argument based on this principle is called an
induction on sets. (Hint: Use regularity.)

6. Let x' = x for all x other than x = 0 or x = 1. Let 0' = 1 and 1' = 0. Let x ∈' y iff
x ∈ y'. With M the collection of all sets, prove that (M, ∈') is a model of ZF in
which regularity fails. This along with the argument preceding Theorem 12.9
gives a proof of 12.9 and shows that regularity is independent of ZF.
7. The axiom of separation is weaker than the axiom of replacement. To show
this let M = P_{ω+ω}(0) and consider (M, ∈↾M). Verify that this is a model of
separation and all the axioms of ZF except replacement. To see that replacement
fails, consider the property ψxy which holds iff x ∈ ω and y = P_{ω+x}(0). As
we have seen in §1.11, separation is a consequence of ZF, and so this example
shows that separation is weaker than replacement.
8. Show that the following are absolute:
x = y × z,
x = ⋃y,
x = ⋂y,
x = 2,
x = ω.

9. Card x is not absolute. [Hint: Consider the structure (P_{ω+2}(0), ∈↾P_{ω+2}(0))
and Card ω+1.]
PART II
An Introduction to
Computability Theory

2.1 Introduction
What are the capabilities and limitations of computers? Are they glorified
adding machines capable of superfast arithmetic computations and nothing
else? Can they outdo man in the variety of problems they can handle?
Let's narrow the question a bit. Consider the class of number theoretic
functions that a computer can be programmed to compute or that a man
can be instructed to compute. Are any of these functions computable by a
computer but not by a man, or by a man but not by a computer? Is there a
number theoretic function that is not computable by any computer, and if
so, can such functions be described? Is there a computer that can be
programmed to compute any function that any other computer can com-
pute? Is man such a computer?
In order to make these questions amenable to mathematical analysis,
Alan Turing, in 1936, introduced a purely abstract mathematical notion of
computer and presented heuristic arguments in support of the view that his
"machines" have exactly the same computational powers as a "real computer"
or a man, at least if speed of computation is ignored.
The reason that his arguments are necessarily heuristic is that neither
the notion of "man computable" nor "real computer" is mathematically
defined. His mathematical machines are intended to be an abstraction of
"real computers" that allows precise mathematical analysis, just as the
integral is an attempt to make rigorous our intuitive notion of area.
We begin this section with a description of Turing machines, and later
take up some of the various intuitive arguments that support the thesis that
man, computers, and Turing machines are computationally equivalent.
Most of the work will be aimed at delimiting the capabilities of Turing
machines, culminating with Gödel's famous incompleteness theorem of
1931, one of the milestones of mathematics. Some time will be spent on the
consequences of the incompleteness theorem and in particular its bearing
on the problem of placing arithmetic, or any other significant branch of
mathematics, on a sound axiomatic foundation.

2.2 Turing Machines


The kind of machine we have in mind performs computations by printing
and erasing checks on a tape. The tape is partitioned into cells, and each
cell either is blank or has a single check. At any given step in the
computation, the machine scans a single cell. This scanned cell is then
checked or left blank, and the scanner may move one cell to the left or to
the right. Exactly which of these four alternatives occurs at a given step of
the computation is completely determined by two things: the internal state
of the machine (we assume the machine has only finitely many states), and
whether the scanned cell is checked or blank. The situation can be pictured
as follows:

|√|√| |√| |√|√|√| |√| ...
               ↑
               3

Here, the machine is in state 3, and cell 8 of this tape is checked and is the
scanned cell. This notation is a bit clumsy and some simplification is called
for.
Recall that N is the set of natural numbers {0, 1, 2, ...} and N⁺ is the set
of strictly positive natural numbers. If f is a function, then Dom f is its
domain and Ran f is its range. Most of the other symbols used here are
defined in §1.2.

Definition 2.1. A tape is a sequence a₁a₂a₃... where each aᵢ is 0 or 1. A
marker is a pair (j,k) where j ∈ N⁺ and k ∈ N⁺. A tape position is a
marker (j,k) and a tape a₁a₂a₃...; the jth term is the scanned term, and k is
the state for this tape position.

Here we have substituted 0's and 1's for the blanks and checks of our
initial description. We now need three functions: one to tell us how to alter
the scanned term, one to tell us which term to scan next, and one to tell us
the next state of the machine.

Definition 2.2. A Turing machine is an ordered triple of functions (d,p,s),
each with the same domain D, where D is a finite set of pairs of the
form (i,k) with i ∈ {0,1} and k ∈ N⁺, and where Ran d ⊆ {0,1}, p(i,k) ∈
{-1, 0, 1}, and Ran s ⊆ N⁺. If M = (d,p,s), then the domain of M, Dom M,
is D.

We may refer to a Turing machine simply as a machine.

The functions d, p, s are the "print-erase" function, the "next position"
function, and the "next state" function respectively. For example, if the
tape position t is given by
(3,2): 1111010 ...
and the machine M = (d,p,s) is such that
d(1,2) = 0,
p(1,2) = -1,
s(1,2) = 4,
then one application of M to t yields a new tape position
(2,4): 1101010 ....
Originally the third cell is scanned and 2 is the state of the machine. This is
denoted by the marker (3,2). d tells us to place a 0 in the scanned cell; p
tells us to move the scanner to the left, so that now the second cell is the
scanned cell; and s tells us that the new state is 4. More generally, we have

Definition 2.3. Let t be the tape position with marker (j,k) and tape a, and
let M be the machine (d,p,s). Then M(t), the successor tape position, has
marker (j+p(aⱼ,k), s(aⱼ,k)) and tape a₁a₂...aⱼ₋₁ d(aⱼ,k) aⱼ₊₁aⱼ₊₂...,
provided that (aⱼ,k) ∈ Dom M and j+p(aⱼ,k) > 0. A partial computation of
M is a sequence t₁, t₂, ..., tₘ of tape positions such that tᵢ₊₁ = M(tᵢ) for each
i < m. The sequence is a computation if M(tₘ) is not defined. If t₁, t₂, ..., tₘ
is a computation, then t₁ is the input and tₘ the output. If t₁, ..., tᵣ is a
partial computation, we may write Mʳ(t₁) for tᵣ.

There is a convenient way of writing a machine as a table. For example

          0     1
M:   1    1L2   0R1
     2    102

denotes the machine (d,p,s) with domain {(0,1), (1,1), (0,2)} where
d(0,1) = 1,  p(0,1) = -1,  s(0,1) = 2,
d(1,1) = 0,  p(1,1) = 1,   s(1,1) = 1,
d(0,2) = 1,  p(0,2) = 0,   s(0,2) = 2.
Since a p-value of 1 indicates a scanner shift to the right, a p-value of 0
indicates no movement of the scanner, and a p-value of -1 indicates a
scanner shift to the left, we use R, 0, and L in the table instead of 1, 0, and
-1.
As an example, take M to be the machine above and t the tape position
(2,1): 0100000 ....
Then M(t) is
(3,1): 0000000 ....
[Since the second term is the scanned term and its value is 1, and the state
is 1, d(1,1) = 0 is the value of the second term in M(t). p(1,1) = 1, which
says that the term to the right of the second term, i.e., the third term, is the
new scanned term. Finally s(1,1) = 1, which says that the new state is 1.] It
should be clear that the following is a computation:
(2,1): 0100000 ...
(3,1): 0000000 ...
(2,2): 0010000 ...
(2,2): 0110000 ....
An easy induction on n shows that with M as above there is a computation
that begins with

(2,1): 0 1...1 000 ...
         (n consecutive 1's)

and ends with

(n+1,2): 0 0...0 11 000 ...
           (n-1 consecutive 0's)

If we think of n as being represented by n consecutive 1's, then this
machine computes the constant function f(n) = 2.
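Definitions 2.2 and 2.3 can be checked mechanically. The Python sketch below (an illustration, not part of the text; the names `step` and `run` are ours) represents a machine as a dictionary from (scanned term, state) pairs to (d, p, s) entries, and runs the two-state machine above on the input n = 4:

```python
# A sketch (not from the text) of Definitions 2.2 and 2.3: one step prints
# d(a_j, k) in the scanned cell, moves the marker by p(a_j, k), and enters
# state s(a_j, k); the computation ends when M(t) is undefined.

def step(machine, pos):
    (j, k), tape = pos
    a = tape[j - 1] if j <= len(tape) else 0      # cells are 1-indexed; blanks are 0
    if (a, k) not in machine or j + machine[(a, k)][1] <= 0:
        return None                               # M(t) is not defined
    d, p, s = machine[(a, k)]
    new_tape = list(tape) + [0] * max(0, j - len(tape))
    new_tape[j - 1] = d                           # the print-erase function
    return (j + p, s), new_tape

def run(machine, pos):
    """Follow t, M(t), M(M(t)), ... until M is undefined; return the last position."""
    while (nxt := step(machine, pos)) is not None:
        pos = nxt
    return pos

# The two-state machine above, written as (a, k) -> (d, p, s).
M = {(0, 1): (1, -1, 2), (1, 1): (0, 1, 1), (0, 2): (1, 0, 2)}
(j, k), tape = run(M, ((2, 1), [0] + [1] * 4 + [0] * 5))
# Ends at marker (5, 2) with tape 0000110000..., i.e., two 1's: f(4) = 2.
```

Running the same machine on any input n ends at marker (n+1, 2) with exactly two 1's on the tape, matching the induction sketched above.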
In abbreviating tapes we may use 1ⁿ to denote an n-term subsequence of
1's and 0ⁿ an n-term subsequence of 0's. For example
0111001111000 ...
may be written as
0 1³ 0² 1⁴,
suppressing mention of the tail of 0's.
We may write 0^m(n₁, ..., n_k) for 0^m 0 1^{n₁} 0 1^{n₂} 0 ⋯ 0 1^{n_k}, where m ∈ N. For
m = 0 this becomes (n₁, ..., n_k), and if k = 1, then we write n₁ instead of (n₁).
The sequence or tape (n₁, ..., n_k) may be written as n̄.
The input (n₁, ..., n_k) is the marker (2,1) with the tape (n₁, ..., n_k). The
output n̄ has the leftmost 1 of the tape n̄ as the scanned term, where
k ≥ 1.
We have introduced several obvious ambiguities in our notation, confusing
the n-term sequence of 1's with the number n (= 1ⁿ), the tape
(n₁, ..., n_k) with the sequence (n₁, ..., n_k), and so on. This may however
increase the readability of what follows, and the particular meaning
intended will be apparent from the context.
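The coding of tuples by tapes can be sketched in a few lines of Python (an illustration of the convention only; the name `encode` is ours, not the text's):

```python
# A sketch (not from the text) of the input convention: the tape (n1, ..., nk)
# is 0 1^{n1} 0 1^{n2} 0 ... 0 1^{nk}, scanned at its second cell in state 1.

def encode(*ns):
    tape = []
    for n in ns:
        tape += [0] + [1] * n      # a 0 separator, then a block of n 1's
    return tape

encode(3)        # [0, 1, 1, 1], the tape 3 = 0 1^3
encode(2, 3)     # [0, 1, 1, 0, 1, 1, 1], the tape (2, 3) = 0 1^2 0 1^3
```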

Definition 2.4. Suppose that M is a machine and g is a k-ary function such
that for each (n₁, ..., n_k) ∈ ᵏ(N⁺) there is a computation with respect to M
with the input (n₁, ..., n_k) and the output g(n₁, ..., n_k). We then say that g is
computable and that M computes g.

We have already seen that the constant function f(n) = 2 is computable
and is computed by the machine

         0     1
    1    1L2   0R1
    2    102

The following theorem establishes the computability of several basic
functions.

Theorem 2.5. The following functions are computable:
i. the addition function Sum(m,n) = m + n,
ii. for each k and d in N⁺, the constant k-function C_{k,d}(n₁, ..., n_k) = d,
iii. for each k and t in N⁺ with t ≤ k, the projection k-function
P_{k,t}(n₁, ..., n_k) = n_t,
iv. the 1-function
Pred(m) = m - 1 if m ≥ 2,
Pred(m) = 1 if m = 1.
[Read 'Pred(m)' as 'Predecessor m'.]
PROOF OF i. Consider the machine

         0     1
    1          0R2
    2    1L3   1R2
    3    0R4   1L3

It is easy to see that the following is a subsequence of the computation
beginning with (m,n):
(2,1): 0 1^m 0 1^n
(3,2): 0 0 1^{m-1} 0 1^n
(m+2,2): 0 0 1^{m-1} 0 1^n
(m+1,3): 0 0 1^{m-1} 1 1^n
(2,3): 0 0 1^{m+n}
(3,4): 0 0 1^{m+n}
Hence m + n is computable. □

PROOF OF ii. We give the machine for d = 2 only, but this can be easily
modified to handle any given d:

         0     1
    1    0R2   0R1
    2    1L3   0R1
    3    103
PROOF OF iii. The general idea will be quite clear from a discussion of the
special case k = 4, t = 3. For this case we use the following machine:

         0     1
    1    0R2   0R1
    2    0R3   0R2
    3    0R4   1R3
    4    0L5   0R4
    5    0L5   1L6
    6    0R7   1L6

If (n₁,n₂,n₃,n₄) is the input, then a computation results in which the first
two blocks of 1's are erased, the third is passed over, the fourth is erased,
and then the scanner returns to the first 1 of the third block. □
PROOF OF iv. It is easy to verify that Pred(m) is computed by the following
machine:

         0     1
    1          0R2
    2    103

This completes the proof of Theorem 2.5. □


Two different machines may compute the same function. For example,
the machine given above for the sum x + y computes the same unary
function as the following:

         0     1
    1    102

namely, the identity function f(n) = n. In fact it is easy to see that any
computable function is computed by infinitely many different Turing
machines (Exercise 5).
Our machine for addition computes both a unary function and a binary
function but not a ternary function. Some machines do not compute any
function; others compute k-functions for all k.

A machine M may fail to compute a k-ary function for one of two
reasons: there is some k-ary input t such that Mʳ(t) exists for all r ∈ N⁺ (so
the partial computation does not end), or there is a computation beginning
with t that does not end in an output. An example of the first phenomenon
is given by

         0     1
    1    0R1   1R1

An example of the second is

         0     1
    1          1L1
If a function f is computable, then there is a Turing machine M that,
given the input n̄, prints out f(n̄). In this sense we can consider M as
answering the question: 'What is f(n̄)?' It is natural to ask whether Turing
machines can answer other types of mathematical questions. For example,
can these machines answer questions such as 'Is n a prime?' or 'Is m the
greatest common divisor of m₁ and m₂?'? Alternatively, we can view these
questions as queries about set membership in the following way: Let X be
the set of all primes, and let Y be the set of all triples (m, m₁, m₂) where m
is the greatest common divisor of m₁ and m₂. The first question is
equivalent to the question 'Is n ∈ X?', while the second is equivalent to 'Is
(m, m₁, m₂) ∈ Y?'. More generally, these considerations lead us to the
following notion of a computable set.

Definition 2.6. Let X be a set of k-tuples of natural numbers. The
representing function of X, R_X, is the k-function defined by R_X(n̄) = 1 if
n̄ ∈ X, and R_X(n̄) = 2 if n̄ ∉ X. We say that X is computable if R_X is a
computable k-function.
If one interprets an output of 1 as being the answer 'yes' and the output
2 as being the answer 'no', then a Turing machine which computes R_X
answers the question 'Is n̄ ∈ X?'.
Notice that the above definition is made only for sets of k-tuples for
fixed k, i.e., we do not consider sets containing both k-tuples and l-tuples
when k ≠ l.
A set P of k-tuples is called a k-relation (recall §1.3). For example, the
set of all ordered triples (m, m₁, m₂) such that m is the greatest common
divisor of m₁ and m₂ is a 3-relation. If P is a k-relation, it is often more
convenient to write P(n₁, ..., n_k) when we mean (n₁, ..., n_k) ∈ P. Thus if P is
the set of all ordered triples (m, m₁, m₂) such that m is the greatest common
divisor of m₁ and m₂, we have P(3,15,12) but not P(2,8,23). For binary
relations P, more specialized notation is frequently used, and we may write
xPy instead of (x,y) ∈ P or P(x,y). So, for example, if P is the set of all
ordered pairs (x,y) such that x ≤ y, then 3P8 but not 5P2. Actually,
this is the familiar notation, for if we write '≤' instead of 'P' this becomes
3 ≤ 8 but not 5 ≤ 2.

Thus to say that P is a computable k-relation means that there is a
Turing machine which correctly answers all questions of the form 'Is
(n₁, ..., n_k) in the relation P?'.
As usual, the 1-tuple (n) is identified with n, and the 1-relation P is
identified with {n : (n) ∈ P}.
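Before turning to machine examples, the 1-for-yes, 2-for-no convention of Definition 2.6 can be sketched in Python (an illustration only; `representing` is our name, not the text's):

```python
# A sketch (not from the text) of Definition 2.6: R_X(n-bar) = 1 if n-bar is
# in X and 2 otherwise, so a machine computing R_X answers 'Is n-bar in X?'.
from math import gcd

def representing(membership):
    return lambda *ns: 1 if membership(*ns) else 2

R_X = representing(lambda n: n % 2 == 1)                # X = the odd numbers
R_Y = representing(lambda m, m1, m2: m == gcd(m1, m2))  # Y as in the text

R_X(7), R_X(4)       # (1, 2): 7 is odd, 4 is not
R_Y(3, 15, 12)       # 1, since P(3, 15, 12) holds
R_Y(2, 8, 23)        # 2, since P(2, 8, 23) fails
```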
We next consider some examples of computable sets and relations.

EXAMPLE 2.7. Let X denote the set of all odd numbers. Then R_X(n) equals
1 if n is odd and equals 2 if n is even. It is easily seen that the following
machine computes R_X:

         0     1
    1    1L2   0R2
    2    103   0R1

Hence, X is computable.

EXAMPLE 2.8. Let X denote the set of all positive integers ≥ 2, i.e.,
X = {m : m ≥ 2}. R_X(m) = 1 if m ≥ 2 and R_X(1) = 2. It is easy to see that the
following machine computes R_X:

         0     1
    1          1R2
    2    1L5   0R3
    3    0L4   0R3
    4    0L4   101

Hence X is computable.

EXAMPLE 2.9. Let P denote the relation <. That is, P = {(m,n) : m < n}. The
representing function R_P for this relation is defined by R_P(m,n) = 1 if
m < n, and R_P(m,n) = 2 otherwise.
The Turing machine below computes R_P in the following way: Given
the input (m,n), the machine erases the first check on the tape and then the
last check on the tape, and then repeats the procedure until either the
block of 1's on the left has been erased but not the entire block on the right
(in which case all 1's are then erased and then 1 is printed), or the right
block of 1's is erased first (in which case all 1's are then erased and two 1's
are printed). The dotted line divides the table into halves: the top half
dictates the erasing of the leftmost 1 and then movement to the right after
checking that all of the 1's in the left-hand block have not been erased; the
lower half of the table dictates a dual operation but from right to left.
         0      1
    1           0R2
    2    0R5    1R3
    3    0R4    1R3
    4    0L6    1R4
    5    1011   0R5
    ................
    6           0L7
    7    0L10   1L8
    8    0L9    1L8
    9    0R1    1L9
   10    1L5    0L10
Much more interesting examples of computable functions and relations
will be found in the sections that follow.
The notion of a Turing machine can be formulated in many equivalent
ways, equivalent in the sense that the resulting set of computable functions
will be the same as the set of functions that are computable according
to our present definitions. For example, two-way infinite tapes of the
form ... a₋₂a₋₁a₀a₁a₂ ... can be used in place of one-way infinite tapes.
The terms of the tape might be restricted to {0, 1, ..., n} instead of {0, 1}.
There are many more alternative formulations, each being advantageous in
certain circumstances and disadvantageous in others. Our choice is motivated
by personal preference and expediency in the present development.

EXERCISES FOR §2.2

1. Give a Turing machine that computes C_{2,3} (see Theorem 2.5). Write down the
computation with respect to this machine, beginning with the input (2,2).
2. Give a Turing machine that computes P_{3,2} (see Theorem 2.5). Write down the
computation beginning with (2,3,4).
3. Give the computation arising from the inputs 1 and 4 with respect to the
machine in the text that computed Pred(m).
4. Give the complete sequence of tape positions arising from the input (2,1) with
respect to the machine that computed Pred(m). Is there a tape output?
5. Let f be a computable k-function. Show that there are infinitely many different
machines that compute f.
6. Determine the 1-function and 2-function computed by the following machine:

         0     1
    1    1L2   1R1
    2    0R3   1L2

7. Consider the following machine:

         0     1
    1    0R1   1R2
    2    1L3   1R2

For which inputs is there an output?
8. Find a machine that computes a k-function for all k ∈ N⁺.
9. Find a machine that computes the function f(n) = 2n.
10. Find a machine that computes the function f(n) = 3n.
11. Show that the set of all even numbers is computable.
12. Show that the set {1, 4, 7, 10, 13, ...} is computable.
13. Find a machine that computes the 2-function f(n,m) = nm. Briefly describe the
behavior of your machine, indicating why it works.
14. Find a machine that computes the 2-function
Diff'(n,m) = m - n if m - n ∈ N⁺,
Diff'(n,m) = 1 otherwise.

2.3 Demonstrating Computability Without an
Explicit Description of a Turing Machine
To prove that a particular function, set, or relation is computable by
exhibiting a Turing machine that computes it is usually a long and tedious
task even for simple examples. In this section we shall describe procedures
for proving the computability of many functions, sets, and relations
without explicitly exhibiting machines that compute them. The proofs of
these theorems are constructive in the sense that if a function is shown to
be computable by one of these theorems, then, following the proof of the
theorem, an explicit machine for computing the function can be exhibited.
Most of the proofs will be deferred until §2.4.
Suppose g₁, ..., g_r are k-functions and f is an r-function. The composition
of f with g₁, ..., g_r is the k-function h defined by
h(m̄) = f(g₁(m̄), g₂(m̄), ..., g_r(m̄)) for any k-tuple m̄. For example, if
g₁(m,n) = m + n, g₂(m,n) = mn, g₃(m,n) = m² + n, f(u,v,w) = u³ + v² + w,
then
h(m,n) = f(g₁(m,n), g₂(m,n), g₃(m,n)) = (m+n)³ + (mn)² + m² + n.

Theorem 3.1. If f is a computable r-function and g₁, ..., g_r are computable
k-functions, then the composition of f with g₁, ..., g_r is a computable
k-function.

This result states, in other words, that the composition of computable
functions is computable. The proof of this result is given in §2.4. In this
section we shall be primarily concerned with showing how results of this
type are used to demonstrate computability without recourse to machines.

EXAMPLE 3.2. Using Theorem 3.1, we see that the function f(n) = 2n is
computable, since f(n) = Sum(P_{1,1}(n), P_{1,1}(n)) = Sum(n,n) = 2n and both
the functions Sum and P_{1,1} are computable by Theorem 2.5.
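The shape of composition, and of Example 3.2, is easy to see in Python (a sketch only; Theorem 3.1, not this code, is what establishes computability, and `compose` is our name):

```python
# A sketch (not from the text) of composition:
# h(m-bar) = f(g1(m-bar), ..., gr(m-bar)) for any k-tuple m-bar.

def compose(f, *gs):
    return lambda *ms: f(*(g(*ms) for g in gs))

Sum = lambda m, n: m + n        # computable by Theorem 2.5i
P11 = lambda n: n               # the projection P_{1,1}

double = compose(Sum, P11, P11)           # Example 3.2: f(n) = Sum(n, n) = 2n
h = compose(lambda u, v, w: u**3 + v**2 + w,
            lambda m, n: m + n,
            lambda m, n: m * n,
            lambda m, n: m**2 + n)        # h(m,n) = (m+n)^3 + (mn)^2 + m^2 + n

double(5)    # 10
h(2, 3)      # 168, i.e., 5^3 + 6^2 + 4 + 3
```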

EXAMPLE 3.3. Suppose g is a computable 1-function. Define g' by g'(m) =
g(m+1). Then g'(m) = g(Sum(P_{1,1}(m), C_{1,1}(m))). Let h(m) =
Sum(P_{1,1}(m), C_{1,1}(m)). h is computable by Theorems 2.5 and 3.1.
Hence g(h(m)) is computable by Theorem 3.1. But g(h(m)) = g'(m), so g' is
computable.

It is tempting to consider the following argument as a proof of the
computability of f(n) = 2n: By Theorem 2.5, there is a machine that
computes the sum function m + n. To compute 2n we need only input
(n,n) into this machine. The flaw in this argument is that we are inputting
a 2-tuple (n,n), and not a 1-tuple as the definition of a computable
1-function requires.

EXAMPLE 3.4. We show that, for every n, the 1-function Prod_n, defined by
Prod_n(m) = nm, is computable. (In Example 3.2 above we showed that
Prod_2 is computable.) The proof is by induction on n. For n = 1, we have
Prod_1 = P_{1,1}, which is computable by Theorem 2.5. Now suppose that
Prod_k is computable. Then Prod_{k+1}(m) = Sum(Prod_k(m), P_{1,1}(m)) =
km + m = (k+1)m. Since Sum and P_{1,1} are computable by Theorem 2.5 and
Prod_k is computable by assumption, we see by Theorem 3.1 that Prod_{k+1}
is computable. This completes the induction.

As a corollary to Theorem 3.1 we have the following useful result which,
loosely speaking, allows us to change variables in a computable function:

Corollary 3.5. Let f be a computable r-function, and let k ∈ N⁺, i₁ ≤ k,
i₂ ≤ k, ..., i_r ≤ k. Then the k-function h defined by h(n̄) =
f(P_{k,i_1}(n̄), P_{k,i_2}(n̄), ..., P_{k,i_r}(n̄)) is computable.
PROOF: The result follows immediately from Theorems 2.5 and 3.1. □
EXAMPLE 3.6. Suppose f is a computable 2-function. Then the 3-function
h₁ defined by h₁(a,b,c) = f(a,b) is computable, since
h₁(a,b,c) = f(P_{3,1}(a,b,c), P_{3,2}(a,b,c)). Similarly the following 2- and
3-functions are computable:
i. the 3-function h₂ defined by h₂(a,b,c) = f(a,c),
ii. the 3-function h₃ defined by h₃(a,b,c) = f(a,a),
iii. the 2-function h₄ defined by h₄(a,b) = f(b,a).
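Corollary 3.5's change of variables is just composition with projections, which can be sketched in Python (an illustration only; `proj` and `change_vars` are our names, not the text's):

```python
# A sketch (not from the text) of Corollary 3.5:
# h(n1, ..., nk) = f(n_{i1}, ..., n_{ir}), built from projections.

def proj(i):
    return lambda *ns: ns[i - 1]       # the projection P_{k,i}

def change_vars(f, *indices):
    ps = [proj(i) for i in indices]
    return lambda *ns: f(*(p(*ns) for p in ps))

f = lambda x, y: 10 * x + y
h1 = change_vars(f, 1, 2)    # h1(a, b, c) = f(a, b), as in Example 3.6
h3 = change_vars(f, 1, 1)    # h3(a, b, c) = f(a, a)
h4 = change_vars(f, 2, 1)    # h4(a, b) = f(b, a)

h1(3, 4, 5), h3(3, 4, 5), h4(3, 4)    # (34, 33, 43)
```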

In Example 3.4, we showed that for every n, Prod_n is computable. The
reader might be inclined to think that this also shows that the multiplication
function, Mult, defined by Mult(m,n) = mn, is also computable. For to
compute Mult(m,n) we need only go to the machine that computes Prod_m
and let n be the input. The fallacy in this argument is that Mult is a
2-function: to show that Mult is computable we need to find one machine
such that if the input is (m,n), the output is mn. The above argument
would involve going to infinitely many machines, one for each value of m.
While it is not particularly difficult to write down a machine that computes
Mult directly, we instead consider another method of defining a function
in terms of given functions, a method which we will use to define Mult
from Sum. As with composition, this method, called definition by recursion,
defines a computable function when the given functions are computable.
We can define Mult (by recursion) as the unique function satisfying the following pair of equations:
Mult(1,n) = n, (1)
Mult(m+1,n) = Mult(m,n) + n. (2)
For example, using (2) repeatedly and then using (1), we see that Mult(4,7) = Mult(3,7)+7 = Mult(2,7)+7+7 = Mult(1,7)+7+7+7 = 7+7+7+7.
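Equations (1) and (2) translate directly into a recursive procedure. A minimal Python sketch (the names add and mult are ours, standing in for Sum and Mult; as in the book, the arguments range over the positive integers, so the base case is at 1):

```python
def add(m, n):
    # stands in for the computable function Sum
    return m + n

def mult(m, n):
    if m == 1:          # equation (1): Mult(1, n) = n
        return n
    # equation (2): Mult(m+1, n) = Mult(m, n) + n
    return add(mult(m - 1, n), n)
```

Evaluating mult(4, 7) unwinds exactly as in the text: mult(3,7)+7, then mult(2,7)+7+7, and so on down to the base case.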
We can rewrite (1) and (2) as follows:
Mult(1,n) = P1,1(n), (1')
Mult(m+1,n) = Sum(Mult(m,n),n). (2')
We have already shown that P1,1 and Sum are computable (Theorem 2.5). An easy application of the next theorem gives us the computability of Mult.

Theorem 3.7 (Definition by Recursion). Let f be a (k+2)-function and let g be a k-function. There is a unique (k+1)-function h satisfying the following two equations:
h(1,n̄) = g(n̄) (n̄ denotes an arbitrary k-tuple), (1)
h(m+1,n̄) = f(h(m,n̄),m,n̄). (2)
Furthermore, if f and g are computable, then so is h.

As with our definition of Mult, we note that these equations may be viewed as a set of directions that, when followed, will produce the sequence h(1,n̄), h(2,n̄), .... The first equation tells us what h(1,n̄) is; the second equation tells us how to get from a given term of the sequence h(1,n̄), h(2,n̄), ... to the next term. The existence and uniqueness of the function h is a special case of Theorem 10.23 in Part I, but a self-contained proof is sketched in Exercise 4. The proof that h is computable if f and g are is deferred until Section 2.4.
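The scheme of Theorem 3.7 can itself be written as a single higher-order procedure that takes f and g and returns h; Mult, and Pow of Example 3.8 below, then fall out as instances. A hedged Python sketch (all names ours; the k-tuple n̄ is passed as trailing arguments):

```python
def rec(f, g):
    # Theorem 3.7: return the unique h with h(1, n) = g(n) and
    # h(m+1, n) = f(h(m, n), m, n), computed by iterating upward.
    def h(m, *n):
        value = g(*n)             # h(1, n)
        for i in range(1, m):     # step from h(i, n) to h(i+1, n)
            value = f(value, i, *n)
        return value
    return h

# Instances: Mult from Sum, and Pow from Mult.
mult = rec(lambda prev, m, n: prev + n, lambda n: n)
pow_ = rec(lambda prev, m, n: mult(prev, n), lambda n: n)
```

Note that pow_(m, n) computes n^m, matching the book's argument order for Pow.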
2.3 Demonstrating Computability without an Explicit Description of a Turing Machine 71

We can now use Theorem 3.7 to get the computability of Mult. We first observe that the function f defined by f(a,b,c) = Sum(a,c) is computable by Corollary 3.5. Now apply Theorem 3.7, taking k = 1, g = P1,1, and this f.

EXAMPLE 3.8. Consider the two equations
Pow(1,n) = n = P1,1(n), (1)
Pow(m+1,n) = Mult(Pow(m,n),n). (2)
These equations have the form specified in Theorem 3.7 if we take g = P1,1 and define a 3-function f by f(a,b,c) = Mult(a,c). We know that g is computable by Theorem 2.5, and that f is computable by Corollary 3.5 and the computability of Mult (obtained above). Hence by Theorem 3.7 the unique 2-function Pow satisfying these equations is computable. Notice that from (1) and (2) we have
Pow(1,n) = n, Pow(2,n) = Mult(n,n) = n², Pow(3,n) = Mult(n²,n) = n³,
and so on. An easy induction on m shows that Pow(m,n) = n^m. The reader who does not yet appreciate the power of these methods should attempt to find a machine that computes Pow directly.

EXAMPLE 3.9. Let f(a,b,c) = Pred(a). f is computable by Theorem 2.5 and Corollary 3.5. Consider these two equations:
h(1,n) = Pred(n), (1)
h(m+1,n) = f(h(m,n),m,n). (2)
From Theorem 3.7 we see that these equations define a unique computable 2-function h. We next show, by induction on m, that h(m,n) = Diff'(m,n), where Diff' is defined as in Exercise 19, §2, that is,
Diff'(m,n) = n − m if n − m ≥ 1,
           = 1 otherwise.
For m = 1, n arbitrary, we have h(1,n) = Pred(n) = Diff'(1,n). Suppose for some k and all n we have h(k,n) = Diff'(k,n). Then h(k+1,n) = Pred(h(k,n)) = Pred(Diff'(k,n)). To complete the induction we need only show that Diff'(k+1,n) = Pred(Diff'(k,n)). If Diff'(k+1,n) ≥ 2, then Diff'(k+1,n) = n−(k+1) = (n−k)−1 = Pred(Diff'(k,n)). If Diff'(k+1,n) = 1, then n−(k+1) ≤ 1, so n−k ≤ 2. Thus Pred(Diff'(k,n)) = 1, and the induction is completed.
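The induction above is easy to check mechanically. A small Python sketch of Pred and of the recursion for h, under the book's convention that all values are positive integers (names ours):

```python
def pred(n):
    # Pred(n) = n - 1 if n >= 2, and Pred(1) = 1
    return n - 1 if n >= 2 else 1

def diff(m, n):
    # h(1, n) = Pred(n); h(m+1, n) = Pred(h(m, n)).
    # By the induction in the text, diff(m, n) = Diff'(m, n):
    # n - m when n - m >= 1, and 1 otherwise.
    if m == 1:
        return pred(n)
    return pred(diff(m - 1, n))
```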

We may write m ∸ n instead of Diff'(n,m).


Theorem 3.7 yields computable (k+1)-functions for k ≥ 1 only. A useful analog of Theorem 3.7 that yields computable 1-functions is the following:

Corollary 3.10. Let f be a computable 2-function and d a fixed positive integer. Then the unique 1-function h satisfying the following equations is computable:
h(1) = d, (1)
h(n+1) = f(h(n),n). (2)

PROOF: Define a 3-function f' by f'(a,b,c) = f(a,b). The two equations
h'(1,p) = C1,d(p), (1')
h'(n+1,p) = f'(h'(n,p),n,p) (2')
define a unique computable 2-function h'. We shall prove, by induction on n, that h(n) = h'(n,1) for all n. For n = 1, we have h(1) = d = C1,d(1) = h'(1,1). Suppose h(k) = h'(k,1). Then h(k+1) = f(h(k),k) = f'(h(k),k,1) = f'(h'(k,1),k,1) = h'(k+1,1). This completes the induction and shows that h(n) = h'(n,1) for all n. The computability of h now follows from the computability of h', C1,1, P1,1 and the fact that h(n) = h'(n,1) = h'(P1,1(n),C1,1(n)). □
EXAMPLE 3.11. The factorial function n! = n·(n−1)· ... ·1 can be defined by recursion as follows:
1! = 1,
(n+1)! = (n+1)·n!.
The second equation may be written as
(n+1)! = Mult(n!,Sum(n,C1,1(n))).
Since Mult(y,Sum(x,C1,1(x))) is computable by Theorem 2.5 and Theorem 3.1, it follows by Corollary 3.10 that n! is computable.
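Corollary 3.10 admits the same treatment as Theorem 3.7: a combinator taking f and the constant d, with the factorial of Example 3.11 as a one-line instantiation. A Python sketch (names ours):

```python
def rec1(f, d):
    # Corollary 3.10: h(1) = d, h(n+1) = f(h(n), n)
    def h(n):
        value = d
        for i in range(1, n):
            value = f(value, i)
        return value
    return h

# 1! = 1 and (n+1)! = (n+1) * n!
factorial = rec1(lambda prev, n: (n + 1) * prev, 1)
```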

EXAMPLE 3.12. Let g be a computable 1-function, and define Sg(n) = g(1) + g(2) + ... + g(n) = Σ_{i=1}^{n} g(i). Giving an alternative description of Sg and applying Corollary 3.10 gives the computability of Sg:
Sg(1) = g(1),
Sg(n+1) = Sg(n) + g(n+1)
        = Sum(Sg(n),g(Sum(n,C1,1(n))))
[here, the f of 3.10 is Sum(y,g(Sum(x,C1,1(x))))].
Starting with g a computable (k+1)-function and defining
Sg(m,n̄) = g(1,n̄) + g(2,n̄) + ... + g(m,n̄) = Σ_{i=1}^{m} g(i,n̄),
we get a computable function Sg by arguing as above but using Theorem 3.7 instead of Corollary 3.10.
A similar argument gives the computability of Hg defined by
Hg(m,n̄) = g(1,n̄)·g(2,n̄)· ... ·g(m,n̄) = Π_{i=1}^{m} g(i,n̄)
(see also Exercise 8).
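The bounded sum Sg and bounded product Hg can be sketched directly from their defining equations; the Python below (names ours) is a plain rendering of those equations, not a simulation of the recursion machinery:

```python
def bounded_sum(g, m, *n):
    # S_g(m, n) = g(1, n) + g(2, n) + ... + g(m, n)
    return sum(g(i, *n) for i in range(1, m + 1))

def bounded_prod(g, m, *n):
    # H_g(m, n) = g(1, n) * g(2, n) * ... * g(m, n)
    result = 1
    for i in range(1, m + 1):
        result *= g(i, *n)
    return result
```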

EXAMPLE 3.13. We prove that every polynomial with positive integral coefficients is computable. We first show by induction on m that each polynomial of the form f(n1,...,nm) = a·n1^k1·n2^k2· ... ·nm^km is computable. For m = 1 we have f(n1) = Mult(a,n1^k1) = Mult(C1,a(n1),Pow(C1,k1(n1),P1,1(n1))). Now suppose a·n1^k1·n2^k2· ... ·n_{m−1}^{k_{m−1}} is computable. Then so is a·n1^k1·n2^k2· ... ·nm^km, since this is
Mult(a·n1^k1· ... ·n_{m−1}^{k_{m−1}}, Pow(C1,km(nm),P1,1(nm))).
Thus any polynomial with positive integral coefficients having only one term is computable. Now we use induction on the number of terms. If h has r terms, r > 1, then h(n1,...,nm) = g(n1,...,nm) + f(n1,...,nm), where g is a polynomial with r−1 terms and f is a polynomial with one term. So f is computable, as we have just shown, and g is computable by the induction hypothesis. Hence h(n1,...,nm) = Sum(g(n1,...,nm),f(n1,...,nm)) and is computable by Theorems 2.5 and 3.1.

We next consider the computability of some relations. Recall that if P is a relation, then Rp denotes the representing function for P, and P is computable just in case Rp is.

EXAMPLE 3.14. The relations > and < are computable. We saw in Example 2.8 that the set {m : m ≥ 2} is computable. Let f be the representing function for this set, i.e., f(m) = 1 if m ≥ 2 and f(1) = 2. Then f is computable. It follows that the 2-function g defined by
g(m,n) = f((m+1) ∸ n)
is computable, since g is obtained by composition of computable functions [∸ was shown computable in Example 3.9, and m+1 = Sum(m,C1,1(m))]. Note that g(m,n) = 1 if and only if (m+1) ∸ n ≥ 2, i.e., if and only if m−n ≥ 1, or m > n. This proves that g(m,n) = 1 if m > n and g(m,n) = 2 otherwise. Thus g is the representing function of the relation > (which is the set {(m,n) : m > n}). Since g is computable, > is computable. If we define h(m,n) = g(n,m), then h is the representing function of <, i.e., h(m,n) = 1 if m < n and h(m,n) = 2 otherwise. We see from Example 3.6 that h is computable, and so the relation < is also computable.

EXAMPLE 3.15. The relations ≤ and ≥ are computable: Let f be the representing function for the relation >, i.e., f(m,n) = 1 if m > n, and f(m,n) = 2 otherwise. Let g be defined by g(m,n) = 3 ∸ f(m,n) = C2,3(m,n) ∸ f(m,n). Then g is computable (by Example 3.9, Example 3.14, Theorem 2.5, and Theorem 3.1). Also, g is the representing function for ≤, for g(m,n) = 1 if and only if 3 ∸ f(m,n) = 1, and this occurs if and only if m ≯ n, i.e., if and only if m ≤ n. Setting h(m,n) = g(n,m) shows that the relation ≥ is computable.

EXAMPLE 3.16. The equality relation is the set {(x,x) : x ∈ N+}, denoted as usual by =. The relation = is computable: Let f be the representing function of ≥, g the representing function of ≤, and h the representing function of >. Define the 2-function w by
w(m,n) = h(2,Mult(f(m,n),g(m,n))). Then w(m,n) = 1 if and only if 2 > f(m,n)·g(m,n), which happens if and only if f(m,n) = 1 and g(m,n) = 1, i.e., m ≥ n and m ≤ n. Thus w is the representing function of the equality relation.
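Examples 3.14–3.16 can be traced in code. The sketch below keeps the book's convention that a representing function returns 1 when the relation holds and 2 when it fails, building everything from the truncated difference ∸ of Example 3.9 (function names ours):

```python
def monus(m, n):
    # m - n when m - n >= 1, and 1 otherwise (Example 3.9)
    return m - n if m - n >= 1 else 1

def r_gt(m, n):
    # Example 3.14: m > n iff (m+1) monus n >= 2
    return 1 if monus(m + 1, n) >= 2 else 2

def r_le(m, n):
    # Example 3.15: representing function of <= is 3 monus R_>
    return monus(3, r_gt(m, n))

def r_eq(m, n):
    # Example 3.16: m = n iff m >= n and m <= n, i.e. iff the
    # product of the two representing values is less than 2
    return 1 if r_le(n, m) * r_le(m, n) < 2 else 2
```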

We are now in a position to prove that the collection of computable sets is closed under the Boolean operations of union, intersection, and complementation. Recall that k(N+) is the set of all k-tuples whose terms belong to N+, and X − Y = {z : z ∈ X and z ∉ Y}.

Theorem 3.17. Let X and Y be computable k-relations. Then k(N+) − X, X ∪ Y, and X ∩ Y are computable k-relations.
PROOF: The representing function for k(N+) − X is 3 ∸ Rx(n̄), since n̄ ∈ k(N+) − X implies n̄ ∉ X, so Rx(n̄) = 2 and 3 ∸ Rx(n̄) = 1; while if n̄ ∉ k(N+) − X, then n̄ ∈ X, so Rx(n̄) = 1 and 3 ∸ Rx(n̄) = 2. The computability of 3 ∸ Rx(n̄) follows from Theorem 3.1.
The representing function of X ∪ Y is R>(3,Rx(n̄)·Ry(n̄)), since n̄ ∈ X ∪ Y implies Rx(n̄) = 1 or Ry(n̄) = 1, so Rx(n̄)·Ry(n̄) is at most 2; on the other hand, n̄ ∉ X ∪ Y implies Rx(n̄) = 2 and Ry(n̄) = 2, so Rx(n̄)·Ry(n̄) is greater than 3. Computability follows from Theorem 3.1.
It is also easy to see that the representing function of X ∩ Y is R=(1,Rx(n̄)·Ry(n̄)), and computability again follows from Theorem 3.1. □
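The proof of Theorem 3.17 turns Boolean operations into arithmetic on representing functions. The same tricks in a Python sketch (names ours; value 1 means "in the relation", 2 means "not"):

```python
def r_not(r):
    # complement: 3 minus R_X swaps the values 1 and 2
    return lambda *n: 3 - r(*n)

def r_or(r1, r2):
    # union: the product R_X * R_Y is at most 2 exactly when
    # at least one factor is 1
    return lambda *n: 1 if r1(*n) * r2(*n) < 3 else 2

def r_and(r1, r2):
    # intersection: the product is 1 exactly when both factors are 1
    return lambda *n: 1 if r1(*n) * r2(*n) == 1 else 2
```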

As we have mentioned before, given a k-relation P and a k-tuple n̄, we may write Pn̄ instead of n̄ ∈ P. This notational device is extended to Boolean combinations of relations as follows: Given k-relations P and Q, we may write Pn̄ ∨ Qn̄ when n̄ ∈ P ∪ Q, Pn̄ ∧ Qn̄ when n̄ ∈ P ∩ Q, and ¬Pn̄ when n̄ ∉ P.
The following theorem and corollary are analogs of Theorem 3.1 and
Corollary 3.5 for relations.

Theorem 3.18. Let P be a computable r-relation, and let g1,...,gr be computable k-functions; then the k-relation Q, defined by 'Q(n̄) if and only if P(g1(n̄),...,gr(n̄))', is computable.
PROOF: Clearly, RQ(n̄) = Rp(g1(n̄),...,gr(n̄)). Now apply Theorem 3.1. □

Corollary 3.19. Let P be a computable r-relation, and let i1 ≤ k, i2 ≤ k, ..., ir ≤ k. Then the k-relation Q defined by 'Qn̄ if and only if P(Pk,i1(n̄),Pk,i2(n̄),...,Pk,ir(n̄))' is computable.
PROOF: Immediate from Theorems 2.5 and 3.18. □

EXAMPLE 3.20. Every finite relation is computable: We first consider the special case of a k-relation P with only one member, say (a1,...,ak). Clearly, n̄ ∈ P if and only if (Pk,1(n̄) = a1) ∧ (Pk,2(n̄) = a2) ∧ ... ∧ (Pk,k(n̄) = ak). The computability of P follows from Theorem 2.5, Example 3.16, Theorem 3.18 [to get the computability of Pk,i(n̄) = ai] and Theorem 3.17. Now let Q be a k-relation with t members, say Q = {(a_1^1,...,a_k^1),(a_1^2,...,a_k^2),...,(a_1^t,...,a_k^t)}. Then Q = P^1 ∪ P^2 ∪ ... ∪ P^t, where P^i is the relation whose only member is (a_1^i,...,a_k^i) for each i ≤ t. The computability of Q is now immediate from Theorem 3.17.
If P is a (k+1)-relation, then (∃x ≤ n1)P(x,n1,...,nk) is the relation consisting of those k-tuples (n1,n2,...,nk) for which there is at least one m equal to or less than n1 such that P(m,n1,n2,...,nk). For example, the 2-relation 'z divides y', written z|y, can be expressed as (∃x ≤ y)(x·z = y). This relation is of the form (∃x ≤ y)P(x,y,z), where P(x,y,z) is x·z = y. We say that (∃x ≤ n1)P(x,n1,n2,...,nk) is the result of applying the existential bounded quantifier (∃x ≤ n1) to P(x,n1,n2,...,nk).
Applying the universal bounded quantifier (∀x ≤ n1) to P(x,n1,n2,...,nk) gives the relation (∀x ≤ n1)P(x,n1,n2,...,nk) consisting of those k-tuples (n1,n2,...,nk) such that for all m equal to or less than n1 we have P(m,n1,n2,...,nk). For example, the 1-relation 'y is a prime', written Prime y, can be expressed as (∀x ≤ y)(x = y ∨ x = 1 ∨ ¬(x|y)).
Starting with a computable P and applying bounded quantification gives another computable relation; this is the content of the next theorem.

Theorem 3.21. If P is a computable (k+1)-relation, then both (∃x ≤ n1)P(x,n1,n2,...,nk) and (∀x ≤ n1)P(x,n1,n2,...,nk) are computable k-relations.
PROOF: The relation (∃x ≤ n1)P(x,n1,n2,...,nk) can be written as
Σ_{i=1}^{n1} Rp(i,n1,n2,...,nk) < 2·n1,
and computability follows from the assumption that Rp is computable, the computability of the function f(n1,n2,...,nk) = 2·n1, Examples 3.12 and 3.15, and Theorem 3.18. The relation (∀x ≤ n1)P(x,n1,n2,...,nk) is the same as ¬((∃x ≤ n1)(¬P(x,n1,n2,...,nk))), and computability follows from the computability of P, the line above, and Theorem 3.17. □
EXAMPLE 3.22. The relations x|y and Prime x are computable, as is easily seen from the form in which these relations are expressed above and Theorems 3.17, 3.18 and 3.21.
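Bounded quantification is a finite search, which is why it preserves computability. The Python sketch below renders the two relations of Example 3.22 with bounded any/all loops; the function names are ours, and booleans stand in for the representing-function values 1 and 2. (As stated, the prime formula is also satisfied by y = 1, since x = 1 = y there; this never matters in the book's later uses, which search only above 2.)

```python
def divides(z, y):
    # z | y  iff  (there is an x <= y)(x * z = y)
    return any(x * z == y for x in range(1, y + 1))

def is_prime(y):
    # Prime(y)  iff  (for all x <= y)(x = y or x = 1 or not (x | y))
    return all(x == y or x == 1 or not divides(x, y)
               for x in range(1, y + 1))
```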

Let P be a (k+1)-relation, and let n̄ ∈ k(N+). Suppose that there is an x such that P(n̄,x). Then there must be a smallest such x, which we denote by μxP(n̄,x) [read "the least x such that P(n̄,x)"]. If there is no x for which P(n̄,x), then μxP(n̄,x) has no meaning. Thus in general, for a given P, μxP(n̄,x) will be defined for certain n̄ and undefined for others. However, should P have the property that for all n̄ ∈ k(N+) there is an x such that P(n̄,x), then μxP(n̄,x) is defined for all n̄ ∈ k(N+). In this case the following equation defines a k-function [with domain k(N+)]:
f(n̄) = μxP(n̄,x).
As we shall see in the next section, if P is computable, there is a Turing machine that will search for the least x such that P(n̄,x) when the input is n̄. This is stated in the next theorem.
Theorem 3.23. Let P be a computable (k+1)-relation. Then there is a Turing machine M such that for every input n̄,
i. the output is μxP(n̄,x) if there is an x such that P(n̄,x), or
ii. there is no output and no x such that P(n̄,x).
Hence, if for every n̄ ∈ k(N+) there is an x such that P(n̄,x), then the function f defined by f(n̄) = μxP(n̄,x) is computable, and in fact is computed by M.
The proof is deferred to the next section.
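The machine of Theorem 3.23 performs an unbounded search. A Python sketch of the least-x operator (names ours; like the machine M, the search simply never returns when no witness exists):

```python
from itertools import count

def mu(p, *n):
    # mu x P(n, x): return the least x such that p(*n, x) holds.
    # If no such x exists, the loop runs forever -- mirroring case ii
    # of Theorem 3.23, in which the machine produces no output.
    for x in count(1):
        if p(*n, x):
            return x
```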

EXAMPLE 3.24. Let Prm(n) be the nth prime number in order of magnitude. So we have Prm(1) = 2, Prm(2) = 3, Prm(3) = 5, Prm(4) = 7, Prm(5) = 11, etc. [Do not confuse the 1-function Prm(n), which enumerates the primes, with the 1-relation Prime(n), 'n is a prime'.] Prm(n) can be defined as follows:
Prm(1) = 2,
Prm(n+1) = μx(Prime(x) ∧ (x > Prm(n))).
The relation P(y,x) given by Prime(x) ∧ (x > y) is computable by Examples 3.14 and 3.22 and Theorem 3.17. Moreover, for every y there is an x such that P(y,x). Hence the function f defined by f(y) = μxP(y,x) is computable by Theorem 3.23, and Prm is then computable from f by Corollary 3.10.
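The definition of Prm thus combines recursion with a μ-search that always terminates because there are infinitely many primes. A compact Python sketch (names ours; the inner while-loop plays the role of μx(Prime(x) ∧ x > Prm(n))):

```python
def prm(n):
    # Prm(1) = 2; Prm(k+1) = least prime strictly greater than Prm(k)
    p = 2
    for _ in range(1, n):
        x = p + 1
        # mu-search: step through x = p+1, p+2, ... until a prime is found
        while any(x % d == 0 for d in range(2, x)):
            x += 1
        p = x
    return p
```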

EXAMPLE 3.25. Let Exp'(m,n) be defined by
Exp'(m,n) = μx(¬(m^x | n) ∨ m = 1).
Then Exp' is computable by Examples 3.8, 3.16, and 3.22, and Theorems 2.5, 3.17, 3.18, and 3.23.

A set {P1,...,Pt} of k-relations is said to be a partition of k(N+) if P1 ∪ P2 ∪ ... ∪ Pt = k(N+) and Pi ∩ Pj is empty whenever i ≠ j. As another application of Theorem 3.23 we have the following theorem, which gives conditions sufficient to conclude that a function defined by cases is computable.

Theorem 3.26. Let g1,...,gt be computable k-functions, and let P1,...,Pt be computable k-relations such that {P1,...,Pt} is a partition of k(N+). Let f be the k-function defined (by cases) as follows:
f(n̄) = g1(n̄) if P1(n̄),
     = g2(n̄) if P2(n̄),
     ...
     = gt(n̄) if Pt(n̄).
Then f is computable.
PROOF: f(n̄) is the least y such that
(g1(n̄) = y ∧ P1(n̄)) ∨ (g2(n̄) = y ∧ P2(n̄)) ∨ ... ∨ (gt(n̄) = y ∧ Pt(n̄)).
The computability of f now follows from the assumption that the gi and Pi are computable, Example 3.16, and Theorems 3.17, 3.18, and 3.23. □
EXAMPLE 3.27. Let Max(n1,n2,n3) be the largest member of the set {n1,n2,n3}. Max can be defined by cases as follows:
Max(n1,n2,n3) = n1 if n1 ≥ n2 and n1 ≥ n3,
              = n2 if n2 ≥ n3 and n2 ≥ n1,
              = n3 if n3 ≥ n1 and n3 ≥ n2.
The computability of Max follows easily from Example 3.15 and Theorems 3.17 and 3.26. (See also Exercise 16.)
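The least-y construction in the proof of Theorem 3.26 can be mimicked by a small definition-by-cases combinator, with Max of Example 3.27 as an instance. A Python sketch (names ours; the cases of Max overlap on ties but then agree in value, so taking the first case that applies is harmless):

```python
def by_cases(pairs):
    # pairs: a list of (P_i, g_i); the P_i are assumed to cover
    # every input.  f(n) = g_i(n) for the first i with P_i(n).
    def f(*n):
        for p, g in pairs:
            if p(*n):
                return g(*n)
        raise ValueError("the cases do not cover this input")
    return f

# Example 3.27: Max(n1, n2, n3), defined by cases
max3 = by_cases([
    (lambda a, b, c: a >= b and a >= c, lambda a, b, c: a),
    (lambda a, b, c: b >= c and b >= a, lambda a, b, c: b),
    (lambda a, b, c: True,              lambda a, b, c: c),
])
```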

For easier reference, we collect several of the examples in this section in the following theorem.

Theorem 3.28. The functions and relations described below are computable:
i. Mult(m,n) = m·n.
ii. Pow(m,n) = n^m.
iii. All polynomials with positive integral coefficients.
iv. m ∸ n = m − n if m > n, and 1 otherwise.
v. n! = 1·2· ... ·n.
vi. Provided that g is computable, so is
Σ_{i=1}^{m} g(i,n̄) = g(1,n̄) + g(2,n̄) + ... + g(m,n̄).
vii. Provided that g is computable, so is
Π_{i=1}^{m} g(i,n̄) = g(1,n̄)·g(2,n̄)· ... ·g(m,n̄).
viii. =, ≤, ≥, <, >.
ix. All finite relations.
x. m|n (m divides n).
xi. Prime(n) (n is a prime).
xii. Prm(n) = the nth prime.
xiii. Exp'(m,n) = μx(¬(m^x | n) ∨ m = 1).
xiv. Max(n1,n2,...,nk) = the largest member of {n1,n2,...,nk}.

EXERCISES FOR §2.3


1. Let g be a computable 1-function and k a fixed positive integer. Prove that the 1-functions g', g'' defined below are computable:
(a) g'(m) = g(m+k).
(b) g''(m) = g(km).
2. Suppose f is a computable 3-function. Prove that the following functions are computable:
(a) the 3-function h1 defined by h1(a,b,c) = f(a,c,b),
(b) the 4-function h2 defined by h2(a,b,c,d) = f(b,a,c),
(c) the 4-function h3 defined by h3(a,b,c,d) = f(a+b,c,d),
(d) the 2-function h4 defined by h4(a,b) = f(a,b,a),
(e) the 1-function h5 defined by h5(a) = f(a,a,2a).
3. Prove that the 2-function f defined by f(a,b) = 2a+b is computable.
4. Let f be a (k+2)-function, and let g be a k-function. Then there is a unique function h that satisfies the following two equations:
h(1,n̄) = g(n̄), (1)
h(m+1,n̄) = f(m,n̄,h(m,n̄)) for all m ∈ N+, n̄ ∈ k(N+). (2)
[Hint: Let A be the set of all functions l such that for some m* ∈ N+,
l(1,n̄) = g(n̄),
l(m+1,n̄) = f(m,n̄,l(m,n̄)),
whenever m ≤ m* and n̄ ∈ k(N+). Then A ≠ ∅, and whenever l1 ∈ A and l2 ∈ A, then l1 ⊆ l2 or l2 ⊆ l1. Now let h = ∪A. This gives the existence of h; the proof of uniqueness is easier.]
5. Consider the function h defined by the two equations
h(1) = 1, (1)
h(n+1) = Sum(2n,h(n)). (2)
(a) Find h(6) by direct use of the equations.
(b) Prove that h is computable.
6. (a) Prove that h(n) = Σ_{i=1}^{n} i is computable.
(b) Show that h(n,m) = Σ_{i=1}^{n} i^m is computable.
7. Define g by the two equations
g(1,n) = n, (1)
g(m+1,n) = n^{g(m,n)}. (2)
Show that g is computable.



8. Let g be a computable 1-function. Show that the 1-function
Qg(n) = Π_{i=1}^{n} g(i) = g(1)·g(2)· ... ·g(n)
is computable.
9. Show that if f and g are computable 1-functions, then so is
Qf,g(n) = Π_{i=1}^{n} f(i)^{g(i)}.

10. Prove that all polynomials in any number of variables are computable.
11. Let P = {(a,b) : a² > b}. Prove that P is computable.
12. Let X be a subset of N+ such that N+ − X is finite. Prove that X is computable.
13. Let X = {(a,b,c) : a² + b² = c²}. Prove that X is computable.
14. Let Y = {(a,b,c) : a² ∸ b² > c²}. Prove that Y is computable.
15. Let g.c.d.(m,n) be the greatest common divisor of m and n. Show that this function is computable. Do the same for the least common multiple of m and n, l.c.m.(m,n).
16. Let f(n) = 1 for all n if Fermat's last theorem is true. Otherwise let f(n) = 2 for all n. Prove that f is computable.
17. Find a relation R(x,y) such that for some but not all n, there is an m such that R(n,m). Let f(n) = μxR(n,x). What is the domain of f? In what sense does the Turing machine mentioned in Theorem 3.23 compute this f?
18. Let
f(n) = n if n and n+2 are both prime,
     = 1 otherwise.
Show that f is computable. Note that f is unbounded if and only if the following unsolved conjecture is true: there are infinitely many twin prime pairs, i.e., there are infinitely many primes p such that p+2 is also prime.
19. Let X = {1} ∪ {n : 2n is the sum of two primes}. Show that X is computable. A famous unsolved conjecture of Goldbach states that every even number greater than 2 is the sum of two primes. In other words, the conjecture states that X = N+.

2.4 Machines for Composition, Recursion, and the Least Operator
In this section we prove Theorems 3.1, 3.7, and 3.23. Each proof consists of the description of a machine whose existence is asserted in the corresponding theorem. For example, Theorem 3.1 asserts the existence of a machine that computes f(g1(n̄),...,gr(n̄)) when f is a computable r-function and the g's are computable k-functions. Our description of this machine will be given in terms of machines that compute f and the g's. The description is detailed enough so that a machine for the composite function can be written down explicitly whenever machines that compute f and the g's are given explicitly. Careful proofs that our machines do what we claim they do can be given by induction on some parameter of the tape (on m in Lemma 4.2, for example), but this is messy business, and we shall not torture the reader with these details. However, following the machines explicitly through one or two computations should demonstrate the workings of the particular machine under consideration.
Similarly, the proof of computability of a function h defined recursively in terms of computable functions g and f by
h(1,n̄) = g(n̄),
h(m+1,n̄) = f(h(m,n̄),m,n̄)
consists of a description of a machine that computes h, a description given in terms of machines that compute g and f. As before, one obtains an explicit description of a machine that computes h when provided with machines that compute g and f.
To prove Theorem 3.23 we describe a machine that searches for the least x such that P(n̄,x), provided that P is computable. The description is given in terms of a machine that computes Rp.
Next we describe some "component" machines and several methods of linking machines together. Using these methods with our component machines will yield the machines for composition, recursion, and the μ-operator.
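Before working through the component machines below, it may help to be able to execute a table. The little simulator that follows is our own rendering, not the book's formalism: a table maps (state, scanned symbol) to (symbol to write, head move, next state), and the machine halts when no instruction applies. The toy `eraser` machine is hypothetical, chosen only to exercise the loop:

```python
def run(table, tape, pos=0, state=1, max_steps=10_000):
    # table: (state, scanned) -> (write, move, next_state),
    # where move is -1 (left) or +1 (right); tape maps cell -> {0, 1}.
    # Halts (returning tape, position, state) when no instruction applies.
    tape = dict(tape)
    for _ in range(max_steps):
        scanned = tape.get(pos, 0)
        if (state, scanned) not in table:
            return tape, pos, state
        write, move, state = table[(state, scanned)]
        tape[pos] = write
        pos += move
    raise RuntimeError("no halt within max_steps")

# A one-state machine that moves right, erasing 1s, and halts at the first 0.
eraser = {(1, 1): (0, +1, 1)}
tape, pos, state = run(eraser, {0: 1, 1: 1, 2: 1})
```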

Definition 4.1. Write M|t ⇒ s if there is a computation with respect to M with input t and output s. If M|t ⇒ s and M is initial, t is (k+j,l): a1...ak b1...bj...br...c1c2..., and s is (k+l,n): a1...ak b1'...bj'...br'...c1c2..., then we may write
M: ...b1...*bj...br... ⇒ ...b1'...*bj'...br'...,
where the * indicates the scanned term.
We also write *n to indicate that the leftmost cell of the block of n 1's is scanned; similarly for *0^n and *1^n.

Lemma 4.2. There is a machine, which we call compress, such that for each m and n ∈ N+
compress | ... 1 0^m *1 1^{n−1} 0 ... ⇒ ... 1 0 *1 1^{n−1} 0^m ....

PROOF: Let compress be the following machine:

      0     1
1          1L2
2    1L3   0R7
3    0R4   1R2
4    0L5   1R4
5    0L6
6    0R1   1L6

This machine first checks to see if m ≥ 2. If not, then the output has the same tape as the input. If m ≥ 2, then a partial computation yields the tape position
(k+m,1): ... 1 0^{m−1} *1^n 0 0 ...,
and this process of printing a 1 to the left of 1^n and erasing a 1 on the right of 1^n is iterated until we get ... 1 0 1^n 0^m .... □
Let M = (d,p,s). We may write M(i,k) for (d(i,k),p(i,k),s(i,k)). Let S(M) be the set of all k such that either (0,k) ∈ Dom M (the domain of M), or (1,k) ∈ Dom M, or k ∈ Ran s.

Definition 4.3. Let u > S(M). Define M_u to be the machine with domain
{(i,k) : i ∈ {0,1} and k ∈ S(M)}
such that
M_u(i,k) = M(i,k) if (i,k) ∈ Dom M,
M_u(i,k) = (i,0,u) otherwise.
Clearly (j,k)a is an output for M arising from the input t iff (j,u)a is an output of M_u arising from the input t.
t
Definition 4.4. Let [M,l] be the machine with domain {(i,l+k) : (i,k) ∈ Dom M} such that
[M,l](i,l+k) = (d(i,k),p(i,k),s(i,k)+l)
for each (i,k) ∈ Dom M.
So the table for [M,l] is the result of replacing each state k in the table of M by l+k.
82 II An Introduction to Computability Theory

Definition 4.5. Let u = S(M) + 1.
i. By M↓M1 we mean M_u ∪ [M1,u−1]. This machine is the result of joining M1 to M in tandem, so that an output from M becomes an input to M1.
ii. Let M0 be the machine whose single instruction sends state u, scanning 0, to state u+1. By M⇃M1 we mean M_u ∪ M0 ∪ [M1,u]. This machine takes all outputs (i,j)a of M in which aj = 0 and uses these as inputs to M1.
iii. Now let M0 be the machine whose single instruction sends state u, scanning 1, to state u+1. Define M⇂M1 to be M_u ∪ M0 ∪ [M1,u]. Here all outputs of M in which a 1 is scanned are converted to inputs for M1.
iv. Let ↰M be M_u ∪ M0, where M0 has the single instruction sending state u, scanning 0, back to state 1. Roughly speaking, an output (j,k)a for M in which aj = 0 is converted to an input by ↰M and looped back.
v. Take M0 to have the single instruction sending state u, scanning 1, back to state 1, and define M↱ to be M_u ∪ M0. What would be an output (j,k)a for M, where aj = 1, is looped back to M.
vi. Now let M0 send state u, scanning 0, to state u+1, and state u, scanning 1, to state u+v+1, where v = S(M1). Define the branching machine on M, M1, and M2 to be M_u ∪ M0 ∪ [M1,u] ∪ [M2,u+v]. An output from M becomes an input to M1 if the scanned term is 0, and an input to M2 if the scanned term is 1.

Lemma 4.6. For each k ∈ N+ there is a machine copy_k such that
copy_k | *(n1,...,nk) ⇒ *(n1,...,nk,n1).

PROOF: We consider only the case k = 2, which is enough to illustrate the general argument. Let M1 be the machine

      0     1
1          1R2
2    0L3   0R2
3    0L3   1R4

Clearly M1 | *(n1,n2) ⇒ 1 0^{n1} *1^{n2}.
Next we need a machine M2 such that
M2 | 0 1^{j+1} *0^{n1−j} 1^{n2} 0 1^j ⇒ 0 1^{j+1} *0^{n1−j} 1^{n2} 0 1^{j+1}.
For M2 we can take

      0     1
1    0R1   1R2
2    0R3   1R2   } go right and add a 1 to 1^j
3    1L4   1R3
4    0L5   1L4
5    0L6   1L5   } go left
6    0L6   1R7

We also need a machine M3 to check whether the cell to the right of the scanned cell in the output of M2 has a 1:

      0     1
1    0R2
2    0L3   1L3

Now we take copy_2 to be the machine assembled from M1, M2, and M3 by the linkages of Definition 4.5, looping M2 and M3 until no 1 remains to be copied. □

Definition 4.7. Define M^k by recursion on k as follows: M^1 = M, and M^{k+1} = M^k↓M (Definition 4.5 i). Thus M^k is the result of joining k copies of M in tandem.
Lemma 4.8.
i. There is a machine shift right such that
shift right | ... *1 1^{m−1} 0 1^n ... ⇒ ... 1^m 0 *1 1^{n−1} ....
ii. There is a machine shift left such that
shift left | ... 0 1^m 0 *1 1^{n−1} ... ⇒ ... 0 *1 1^{m−1} 0 1^n ....
iii. There is a machine erase such that
erase | ... *1 1^{n−1} 0 1 ... ⇒ ... 0^n 0 *1 ....
PROOF: Trivial. □
Lemma 4.9. Let g1,g2,...,gr be computable k-functions. Then there is a machine Mg1g2...gr such that for all n̄ = (n1,...,nk) we have
Mg1g2...gr | *n̄ ⇒ *n̄ g1(n̄) g2(n̄) ... gr(n̄).
PROOF: If g is a computable function we let [g] denote a machine that computes g. For Mg1 take the machines

compress
↓
shift left^k
↓
copy_{k+r−1}^k
↓
shift right
↓
[g1]
↓
compress
↓
shift left^{k+r−1}

joined in tandem as in Definition 4.5; Mg1g2...gr is obtained by repeating this block for g2,...,gr. □

PROOF OF THEOREM 3.1. Let g1,g2,...,gr be computable k-ary functions, and let f be a computable r-ary function. We want to show that the k-ary function h defined by
h(n̄) = f(g1(n̄),g2(n̄),...,gr(n̄))
is computable. Let [f] compute f. Then h is computed by

Mg1...gr
↓
erase^k
↓
[f]. □
PROOF OF THEOREM 3.7. Let g be a k-function computed by [g], and f a computable (k+2)-function. Define h by the equations
h(1,n̄) = g(n̄),
h(m+1,n̄) = f(h(m,n̄),m,n̄).
By Corollary 3.5, the function f' defined by
f'(m,n̄,r) = f(r,m,n̄)
is computable, say by [f']. This gives an alternate definition for h:
h(1,n̄) = g(n̄),
h(m+1,n̄) = f'(m,n̄,h(m,n̄)),
and this definition will guide our construction of a machine that computes h.
Let M1 be a machine such that
M1 | 1 0 *n̄ ⇒ ... 0 *1^{g(n̄)},
M1 | 1 0 *1^m n̄ ⇒ 1 0 *1^{m−1} n̄ for m > 1.
[table for M1, built from [g] and the linkages of Definition 4.5]
Next we need a machine M2 such that
M2 | 1 0 *1^{m−1} n̄ ⇒ 1 0 1^{m−1} n̄ 0 0 1^{m−2} n̄ 0 0 ... *n̄ 0 1^{g(n̄)}.
[diagram for M2, built from copy_{k+1}^{k+1}, compress, and shift left^{k+1}]

Now we need a machine M3 such that
M3 | 0 1^r n̄ 0 0 *1^{r−1} n̄ 0 1^p ⇒ *1^r n̄ 0 1^{f'(r−1,n̄,p)},
and
M3 | 1 0 *1^r n̄ 0 1^p ⇒ 0^l 0 *1^{f'(r−1,n̄,p)} for some l.
For M3 we can take
[diagram for M3, built from [f'], compress, and the linkages of Definition 4.5]
We are now ready to assemble a machine for h:
[assembly diagram: M1, then M2, with M3 looped as in Definition 4.5] □

PROOF OF THEOREM 3.23. Let P be a computable (k+1)-relation, where Rp is computed by [Rp]. We want a machine M such that, given n̄, the machine will search for the least m for which P(n̄,m).
Let M1 | *n̄ ⇒ *n̄ 0 1.
Let M2 | *n̄ 0 1^m 0^l *1 ⇒ *n̄ 0 1^{m+1}. For M2 we can take

      0     1
1    1L3   0L2
2    0L2   1R1
3    0R4   1L3

Next we describe a machine M3 such that, given *n̄ 0 1^m, the machine first yields a copy and then uses the copy to compute Rp. If the answer is no, then the output is *n̄ 0 1^{m+1}; if the answer is yes, then the output is *n̄ 0 1^m 0^{r−1} 0 for some r.
[diagram for M3: copy_{k+1}^{k+1} followed by [Rp], linked as in Definition 4.5]
We also need a machine M4 | *n̄ 0 1^m ⇒ 0^l *1^m, where l = Σ_{1≤j≤k} nj + k + 1.
[table for M4, followed by shift left^k and erase^k]
We can now assemble our machine M:
[assembly diagram: M1, then M2 and M3 looped, then M4] □
EXERCISES FOR §2.4
1. Let M be the machine

      0     1
1    1L2   1R1
2    0R3   1L2

What is the table for M_4? What is the table for the loop machine of Definition 4.5 iv applied to M, and what 1-function does it compute?

2. Let M, M1, and M2 be the machines
[tables for M, M1, and M2]
What is the table for the machine obtained from M by branching to M1 and M2 as in Definition 4.5 vi? Show that this machine computes the representing function for {x : x > 2}.
3. Define multiplication by recursion as in §2.3. Following the proof of Theorem 3.7, describe a machine for Mult and use this machine to compute Mult(3,2).
4. Suppose that R(n,x) and S(n,x) are computable relations such that for each n there is either some x such that R(n,x) or some x such that S(n,x). Show that μx(R(n,x) ∨ S(n,x)) is a computable function.

2.5 Of Men and Machines
What do we mean when we say that a given set of directions is an algorithm for computing the n-ary function f? We mean that given any (x1,...,xn) ∈ n(N+) we invariably obtain f(x1,...,xn) after a finite amount of time by following the directions of the algorithm. If there is an algorithm for f, we shall say that f is man computable. For example, the Euclidean algorithm tells us how to divide m by n, where m ∈ N and n ∈ N+, so as to obtain q and r ∈ N such that qn + r = m. Hence, the resulting functions f(n,m) = q and g(n,m) = r are man computable.
Given a Turing machine that computes a function f, we can obtain f(x1,...,xn) for arbitrary (x1,...,xn) ∈ n(N+) by constructing the appropriate tape sequence in accordance with the machine's table. Thus the table provides an algorithm for the computation of f. In this sense any machine computable function is man computable. This, of course, is a philosophical argument, not a mathematical proof, since no definition of man computable function or algorithm has been previously given that is precise enough for mathematical analysis. Nevertheless, it seems that no matter what notion of man computable one might have in mind, a table for the computation of f is an algorithm for the computation of f.
What about the converse? Is there a man computable function that is not machine computable? Turing took the philosophical stand that there is no such function. Thus, in Turing's view a function is man computable iff it is machine computable. Machine computability is then, according to Turing, the correct mathematical definition of our vague intuitive notion of man computability, and a table is an appropriate abstraction of the notion of algorithm.

There are several heuristic arguments that bolster Turing's stand. Perhaps the most simple-minded is that in over thirty years no function has been found that is man computable but not machine computable.
Another argument requires adequate faith in the genius of men like Post, Church and Kleene, Gödel, and Turing, along with a reluctance to accept coincidence as an explanation. Each of these men defined a set of number theoretic functions and proposed that their set be taken as the set of man computable functions. Although their definitions seemed to differ radically from each other, it later turned out that they all define the same set of functions, i.e., they all define the set of machine computable functions. Most of these alternative definitions were made before Turing proposed his thesis.
A third argument, and to me the most persuasive, attempts to analyze
the intuitive notion of man computability into its simplest components.
Let's say we have an algorithm for computing the unary function f. The
algorithm is a set of directions given in some language, say English. By
restating the directions if necessary, we can assume that the number n is
written as n consecutive checks on a tape divided into cells. Thus the
directions tell us how to pass from n to f(n), each step of the computation
depending on at most the preceding steps, which, as we have seen in
previous sections, can be coded as the last step of the computation. It
seems quite plausible that the directions can be written in such detail that
the passage from one step in the computation to another is accomplished
by erasing or writing checks on a tape and moving one cell at a time. This
then gives us an algorithm that is very close to a Turing machine.
For these reasons, mathematicians regard the Turing machine as the
mathematical definition of algorithm, and Turing machine computability
as the precise analog of man computability.
With slight modifications, the above arguments offered in support of the
computational equivalence of men and Turing machines can also be given
in support of the computational equivalence of real computers and Turing
machines. The speed of computation does not enter into these considera-
tions; we require only that the computation end in a finite number of
steps.

2.6 Non-computable Functions


Having suffered through the tedium of §2.3, the exhausted reader might be
willing to concede that all k-functions are computable. If not, he might
suspect that non-computable functions are in some sense rare, or difficult
to describe, or even indescribable. However, this is not the case; in this
section we show that most functions are not computable, and then we
describe some specific examples of non-computable functions.
If one accepts Turing's thesis, then these functions are not computable
by any "real machine" in any technology, even allowing a computation to
take an arbitrary (but finite) amount of time and tape. Nor are these
functions computable by man; no finite set of directions will tell a man
how to find f(x) for arbitrary x if f is not computable by a Turing
machine.
The following definition and theorem will be used to prove that among
all the functions, relatively few are computable.
By #k we mean the k-function with domain ᵏ(N+) defined by

#k(n1, n2, ..., nk) = p1^n1 · p2^n2 · ... · pk^nk,

where pi is the ith prime in order of magnitude. For example, #3(4,1,3) =
2^4 · 3^1 · 5^3.

Theorem 6.1. If #k(n̄) = #l(m̄), then k = l and n̄ = m̄.


PROOF: This is an immediate consequence of the unique factorization
theorem of arithmetic, which says that every n > 1 has one and only one
factorization of the form n = q1^r1 · q2^r2 · ... · ql^rl, where l, r1, ..., rl ∈ N+,
the q's are all prime, and qi < qi+1 for each i < l. (Our theorem uses only
the fact that each n has at most one such representation.) Since the unique
factorization theorem is usually proved in courses on algebra, we shall not
take the time to give a proof of it here. □
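The coding and its inverse can be tried out concretely. The sketch below (with our own helper names, not the text's) computes #k by multiplying prime powers and recovers the tuple by reading the exponents back off, which is exactly what Theorem 6.1 guarantees is possible:

```python
def nth_primes(k):
    """Return the first k primes (trial division; fine for small k)."""
    ps, n = [], 2
    while len(ps) < k:
        if all(n % p for p in ps):
            ps.append(n)
        n += 1
    return ps

def code(tup):
    """#k(n1, ..., nk) = p1^n1 * p2^n2 * ... * pk^nk."""
    total = 1
    for p, n in zip(nth_primes(len(tup)), tup):
        total *= p ** n
    return total

def decode(m):
    """Recover the tuple by reading off the exponents of 2, 3, 5, ...;
    unique factorization makes this well defined (Theorem 6.1)."""
    tup, k = [], 1
    while m > 1:
        p = nth_primes(k)[-1]
        e = 0
        while m % p == 0:
            m, e = m // p, e + 1
        tup.append(e)
        k += 1
    return tuple(tup)
```

On the text's example, code((4, 1, 3)) is 2^4 · 3^1 · 5^3 = 6000, and decode(6000) returns (4, 1, 3).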
Now we assign a number to each machine in such a way that different
machines are assigned different numbers. First define

#1, #0, #R, #L, #0, #j for j ≥ 1

to be

1, 2, 1, 2, 3, j

respectively. If r is a row of a machine, say r is

U V W X Y Z,

we define #r to be #6(#U, #V, #W, #X, #Y, #Z). If M is a k-row
machine, the ith row being ri, we define #M to be #k(#r1, #r2, ..., #rk).
For example, if M is the machine

0R2 1R1
1R3 0L2
1R4

then

#r1 = 2^2 · 3^1 · 5^2 · 7^1 · 11^1 · 13^1,
#r2 = 2^1 · 3^1 · 5^3 · 7^2 · 11^2 · 13^2,
#r3 = 2^1 · 3^1 · 5^4,

and

#M = #3(#r1, #r2, #r3) = 2^#r1 · 3^#r2 · 5^#r3.

The fact that different machines are assigned different numbers is im-
mediate from Theorem 6.1.
From now on, we shall not mention Theorem 6.1 when making use of it.
We can now easily show that there are functions that are not computable.
For each computable k-function f, define Gk(f) to be the smallest
number m such that m = #M for some machine M that computes f. Then
Gk is a 1-1 function on the set of all computable k-functions into N+. As
was seen in §1.5, there is no 1-1 function whose domain is the set of all
k-functions on N+ and whose range is contained in N+. Therefore, some
k-functions are not computable.
By Theorem 4.13 in Part I we see that there are ℵ0 computable
k-functions. (Hence there are ℵ0 computable functions, because a countable
union of countable sets is countable, and the set of computable
functions is ∪_{k∈N+} Ck, where Ck is the set of computable k-functions.)
Since there are 2^ℵ0 k-functions, we see that there are 2^ℵ0 k-functions that
are not computable; so most k-functions are not computable.
We shall now see that there are non-computable functions that are very
easy to describe.

EXAMPLE 6.2. First we make a list (with repetitions) of all the computable
1-functions as follows. If m is the number of a machine that computes a
1-function, we let fm be that 1-function; if m is not the number of a
machine that computes a 1-function, we let fm(n) = 1 for all n. Then each
computable 1-function occurs at least once in the sequence f1, f2, .... Now
define a 1-function F as follows:

F(n) = fn(n) + 1.

We claim that F is not computable. For suppose F is computable. Then F
is fl for some l. Hence F(l) = fl(l), but at the same time F(l) = fl(l) + 1, a
contradiction. Therefore F is not computable. (This is another example of
a diagonal argument; see the second proof of Theorem 5.1 in Part I.)
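The diagonal step can be watched numerically. Below, a short toy list of total functions stands in for the enumeration f1, f2, ... (the particular functions are our illustrative choices); F, built by the diagonal recipe, provably differs from each fl at the argument l:

```python
# a toy stand-in for the enumeration f1, f2, ... (1-indexed)
fs = [lambda n: n,           # f1
      lambda n: 2 * n,       # f2
      lambda n: n * n + 1]   # f3

def f(l, n):
    return fs[l - 1](n)

def F(n):
    """The diagonal function F(n) = fn(n) + 1."""
    return f(n, n) + 1

# F cannot occur anywhere in the list: it disagrees with fl at l
for l in range(1, len(fs) + 1):
    assert F(l) != f(l, l)
```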

EXAMPLE 6.3. As above, let f1, f2, ... be an enumeration of the computable
1-functions. Let F be the 1-function defined by

F(n) = Σ_{i=1}^{n} fi(n).

Then F is not computable. Indeed, if f is a computable 1-function, then
there is an m such that for all n > m, F(n) > f(n). For if f is a computable
1-function, then f = fl for some l, and for all n > l we have F(n) =
Σ_{i=1}^{n} fi(n) > fl(n). In other words, F eventually majorizes every
computable 1-function.

EXAMPLE 6.4 (The Self-Halting Problem). We say that M halts for the
input A if the complete sequence of tape positions determined by M that
begins with A is finite. Let K' be the set of all machines M such that M
halts for the input #M. Let K = {#M : M ∈ K'}. The self-halting problem
can be stated as follows: Is K computable, i.e., is there a machine that,
when given the code number #M of any machine M, prints out 1 if M
halts for the input #M and prints out 2 otherwise?
The answer is no: K is not computable. To see this, we argue by
contradiction. Suppose K is computable. This means that RK (the
representing function for K) is computable, where

RK(n) = 1   if n ∈ K,
        2   otherwise.

Let M1 be a machine that computes RK. By making trivial modifications of
M1 if necessary, we can assume that the state of all outputs is the same. Let
M2 be a machine that does not halt for the input 1 and yields the output 1
for the input 2. Let M be the machine

M1
↓
M2

Clearly, if M1|n ≃ 1, then M does not halt when the input is n, and if
M1|n ≃ 2, then M|n ≃ 1. Hence M halts on #M iff M1|#M ≃ 1 iff M does
not halt on #M, a contradiction. Therefore the assumption that RK is
computable is untenable, and we must conclude that K is not computable.
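The shape of this contradiction can be acted out in Python (an illustration only; the functions below are stand-ins for the text's machines, not Turing machines). Hand the construction any total candidate for RK, here one that always answers 2 ("does not halt"); the analogue of M then exposes it as wrong on itself. A candidate answering 1 fails symmetrically, by looping:

```python
def claimed_rk(prog):
    """A total candidate for the representing function of K:
    supposed to answer 1 if prog halts on itself, 2 otherwise.
    This particular candidate always says 'does not halt'."""
    return 2

def m(prog):
    """The analogue of the machine M: run forever if the candidate
    says 'halts', and halt if it says 'does not halt'."""
    if claimed_rk(prog) == 1:
        while True:
            pass
    return "halted"

# The candidate claims m does not halt on itself ...
assert claimed_rk(m) == 2
# ... yet m applied to itself halts, so the candidate is wrong.
assert m(m) == "halted"
```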

EXAMPLE 6.5. Let K1 be the set of all numbers m such that for some M,
m = #M, and M yields the output 1 when the input is #M. A slight
modification of the argument used in Example 6.4 proves the
non-computability of K1.

EXAMPLE 6.6. Let K2 be the set of all numbers of the form #M, where M
computes C1,1, the constant 1-function whose value is always 1. We shall
show that K2 is not computable. Given a machine M, let M̄ be the machine

M1
↓
M

where M1 computes C1,#M; say M1 is as given in the proof of Theorem 2.5.
It is easy to see that there is an effective procedure for getting M̄ from M;
hence, assuming Turing's thesis, the function g defined as follows is
computable:

g(m) = #M̄   if m = #M,
       1    if for no M is m = #M.

g can be shown computable by the techniques of §2.3 without recourse to
Turing's thesis, but we shall not take the time to go through this tedious
but straightforward bit of work. Notice that #M̄ ∈ K2 if and only if
#M ∈ K1 (we defined K1 in the preceding example). Hence RK2(g(m)) =
RK1(m) for all m. Since g is computable and RK1 is not (by Example 6.5),
we see by Theorem 3.1 that RK2 is not computable either. Hence K2 is not
computable.

EXAMPLE 6.7. Let K3 be the 2-relation consisting of those 2-tuples (m,n)
such that m and n are numbers of machines that compute the same
1-function. K3 is not computable: for let a be the number of a machine
which computes C1,1. Then K3(m,a) if and only if K2(m), and since K2 is
not computable, neither is K3 (by an application of Theorem 3.1).

EXAMPLE 6.8. If m = #M and M computes a 1-function, we let F(m) be the
smallest number n such that for some M', n = #M' and M' computes the
same 1-function that M computes. If m is not the number of a machine
that computes a 1-function, we let F(m) = 1. F is not computable, for
otherwise the relation (F(m1) = F(m2)) ∧ (F(m1) ≠ 1) would be computable;
but this relation is K3, shown to be non-computable above.

EXAMPLE 6.9 (The Halting Problem). Is there a machine that decides
whether or not an arbitrarily given machine always halts? More precisely,
let K4 be the set of those numbers #M such that M halts for each input.
The halting problem asks if K4 is computable. Using the function g of
Example 6.6, we see that

K4(g(#M)) if and only if K(#M),

where K is the non-computable set of Example 6.4. Hence K4 is not
computable.

Assuming Turing's thesis, Example 6.6 shows that there is no decision


procedure for determining whether or not an arbitrarily given machine
computes the constant function f(x)= 1. Example 6.7 shows that no
algorithm exists for deciding whether two arbitrarily given machines
compute the same 1-function. Example 6.8 may be interpreted as saying that
there is no effective procedure for finding the smallest machine that
computes the same 1-function as a given machine. The last example shows
that no effective procedure exists for testing whether or not a machine
halts for all inputs.

EXERCISES FOR §2.6


1. Find non-computable 1-functions f and g such that f + g and f · g are both
computable. [Recall that f + g and f · g are the functions whose values at n are
f(n) + g(n) and f(n) · g(n), respectively.]
2. Is it true that whenever f and g are non-computable 1-functions, then so is the
function h defined by

h(n) = f(g(n))?

3. Prove: If A ⊆ N+ and A is infinite, then there are non-computable sets B and C
such that A = B ∪ C and B ∩ C is empty. (Hint: Try a cardinality argument.
How many ways are there of partitioning A into two subsets?)
4. Let K be the set of all numbers of machines M that yield some number k as an
output for the input #M. Prove that K is not computable.
5. Let E be the set of all numbers of machines that compute bounded 1-functions.
Show that E is not computable.

2.7 Universal Machines


Man is a universal computer in the sense that he can compute any function
that a machine can. In this section we show that there is a machine U that
is just as versatile. Given any k, any k-function f, and any machine M
that computes f, U yields the output f(n̄) when the input is (#M, n̄). Thus
U can be "programmed" to compute f, and #M is such a program. So by
Turing's thesis, this machine U can be programmed to compute any
function that a man can compute.
We begin by assigning numbers to tape positions. A given tape position
A is completely specified by indicating the marker and which terms have
value 1. If only finitely many terms have value 1, then all of this
information can be retrieved from any number of the form #tA, defined to be

2^c1 · 3^c2 · 5^d1 · 7^d2 · ... · Prm(t+2)^dt,

where:
i. c1 is the number of the scanned term.
ii. c2 is the state of A.
iii. di is 1 if the ith term is 1, and di is 2 otherwise.
iv. t ≥ max{c1, k}, where the (k+l)th term of the tape has value 0 for all
l ∈ N+.

As an illustration, if A is (5,8) 010111000..., then one such number, #6A,
is 2^5 · 3^8 · 5^2 · 7 · 11^2 · 13 · 17 · 19. As another example, #8A is
2^5 · 3^8 · 5^2 · 7 · 11^2 · 13 · 17 · 19 · 23^2 · 29^2. When writing #tA we
shall always assume that t satisfies condition iv above.
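The coding of tape positions is easy to check by machine; the helper below (our names) builds #tA directly from the definition and reproduces the text's two illustrations:

```python
def prm(i):
    """Prm(i): the ith prime (1-indexed), by trial division."""
    count, n = 0, 1
    while count < i:
        n += 1
        if all(n % d for d in range(2, n)):
            count += 1
    return n

def code_position(scanned, state, terms, t):
    """#tA = 2^c1 * 3^c2 * 5^d1 * ... * Prm(t+2)^dt, where di is 1 if
    the ith term is 1 and 2 otherwise (terms beyond the list are 0)."""
    n = 2 ** scanned * 3 ** state
    for i in range(1, t + 1):
        term = terms[i - 1] if i <= len(terms) else 0
        n *= prm(i + 2) ** (1 if term == 1 else 2)
    return n

# the text's example: A is (5,8) 010111000...
a6 = code_position(5, 8, [0, 1, 0, 1, 1, 1], 6)
assert a6 == 2**5 * 3**8 * 5**2 * 7 * 11**2 * 13 * 17 * 19

a8 = code_position(5, 8, [0, 1, 0, 1, 1, 1], 8)
assert a8 == a6 * 23**2 * 29**2
```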
If l is #tA and M is a machine, we let lM be #t+1 M(A); otherwise we
take lM to be 1.
We need a computable 2-function STP (read "successor tape position")
such that STP(m,l) = lM if for some machine M and tape position A,
m = #M and l = #tA.
For our purposes the value of STP when (m,l) does not satisfy this
condition is immaterial. Later in this section we show that such a function
exists, but first we use this fact to obtain a universal machine.
Now let TS (read "tape sequence") be the 3-function defined recursively
as follows:

TS(m,l,1) = STP(m,l),
TS(m,l,k+1) = STP(m, TS(m,l,k)).
96 II An Introduction to Computability Theory

TS is computable, since STP is. Notice that if A1, A2, ... is a sequence of
tape positions determined by M, then #1+iAi+1 is TS(#M, #1A1, i) for all
i ∈ N+. Hence if TS(m,l,k) = TS(m,l,k+1), then the sequence determined
by M beginning with A1 ends with the tape position whose number is
TS(m,l,k).
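Running a machine thus amounts to iterating a one-step function until two successive values agree. The sketch below does this for an arbitrary Python step function (a toy one in the example; the genuine STP is constructed later in this section, and a genuine machine may of course never reach a fixed point):

```python
def run(step, start, max_steps=10_000):
    """Iterate `step` from `start` until a fixed point appears, i.e.,
    until TS(k) == TS(k+1); return the fixed value.  `max_steps`
    only guards the demo; a genuine machine may never halt."""
    current = start
    for _ in range(max_steps):
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt
    raise RuntimeError("no fixed point found (machine may not halt)")

# toy step function: decrement until 1, then stay put
assert run(lambda x: max(x - 1, 1), 7) == 1
```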
Now let U1 be a machine that prints out μx(TS(m,l,x) = TS(m,l,x+1))
when this is defined, and does not halt otherwise (an application of
Theorem 3.23). Let U2 compute TS. Let U' be the machine

U1
↓
compress
↓
U2
In a weak sense, U' is already a universal machine. For suppose M is a
machine that yields the output B when the input is A. Then U' will yield
the output #sB when the input is (#M, #1A). However, U' has several
shortcomings. First of all, A must be coded as #1A. In the second place,
the output of U' obtained from the input (#M, #1A) is #sB and not B.
In order to do away with these deficiencies, we need machines to code and
decode tape positions.
To decode the output tape position we use a machine

decode

that computes the function D(n) that subtracts the location of the first 1 on
the tape from 1 plus the location of the last 1:

D(n) = [1 + μx[(x > 2) ∧ ¬(Prm(x)^2 | n) ∧ ((¬(Prm(x+1) | n)) ∨ (Prm(x+1)^2 | n))]]
       ∸ μx((x > 2) ∧ ¬(Prm(x)^2 | n)).    (1)

Clearly, decode yields the output B when the input is #tB.
(See Exercise 1.)
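Granting the reconstruction of (1) above, D can be evaluated directly with small helpers for Prm, truncated subtraction, and the μ-operator (all names ours). On the number of a tape holding a single block of B consecutive 1s it returns B:

```python
def prm(i):
    """Prm(i): the ith prime (1-indexed)."""
    count, n = 0, 1
    while count < i:
        n += 1
        if all(n % d for d in range(2, n)):
            count += 1
    return n

def monus(a, b):
    """Truncated subtraction."""
    return max(a - b, 0)

def mu(pred):
    """Least x >= 1 satisfying pred (assumed to exist)."""
    x = 1
    while not pred(x):
        x += 1
    return x

def D(n):
    """Formula (1): (1 + location of last 1) monus (location of first 1),
    a term's location being the index x of the prime Prm(x) carrying it."""
    def no_square(x):          # the term at Prm(x) is not a 0
        return n % prm(x) ** 2 != 0
    first = mu(lambda x: x > 2 and no_square(x))
    last = mu(lambda x: x > 2 and no_square(x)
              and (n % prm(x + 1) != 0 or n % prm(x + 1) ** 2 == 0))
    return monus(1 + last, first)

# marker (1,1) and three consecutive 1s: 2^1 * 3^1 * 5 * 7 * 11
assert D(2 * 3 * 5 * 7 * 11) == 3
```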

The following computable functions will be useful in describing a
machine that codes inputs:

f(n,l) = ∏_{i=1}^{n} Prm(i)   if l = 1,
f(n,l) = [∏_{i=1}^{n} Prm(i)] · Prm(n+1)^2 · ∏_{i=1}^{Ln l} Prm(i+n+1)^Exp(i,l)   if l > 1.    (2)

It is easy to see that f is computable (Exercise 2). The usefulness of f in
coding inputs can be seen by considering the input A = (n1,n2,n3). Then

f(n1, f(n2, f(n3,1)))

codes the string of terms 1^n1 0 1^n2 0 1^n3, term by term, with exponent 1
for a 1 and exponent 2 for a 0, the exponents being the di of the definition
of #tA. We also need a function that will tack on the marker (2,1) and the
initial 0 of an input tape, and this is given by

g(x) = 2^2 · 3 · 5^2 · ∏_{i=1}^{Ln x} Prm(i+3)^Exp(i,x).    (3)

Hence, in the example above, g(f(n1, f(n2, f(n3,1)))) is #tA. It is easy to
see that g is computable (Exercise 3).
Now let M1 be any machine such that

M1|(m, n1, ..., nk) ≃ (m, n1, ..., nk, 1).


Let Mf compute f, and let Mg compute g. Take M2 to be the following
machine:

0L2 1L1
0R3 1R3
↓
Mf
↓
Mg
↓
compress

Now let code be the machine

M1
↓
M2
↓
0R1
↓
compress
↓
shift left

Now as our universal machine U we can take

code
↓
U'
↓
decode
If M is any machine and (n1, ..., nk) any tuple, then

M|(n1, ..., nk) ≃ y iff U|(#M, n1, ..., nk) ≃ y.
We still have to show that STP, the successor tape position function, is
computable. To do this we require several functions, each of which is
easily seen to be computable from its definition.
Recall that Exp'(m,n) is computable, where Exp'(m,n) = μx((m^x ∤ n) ∨ (m = 1)).
Now let

Exp(m,n) = Exp'(Prm(m),n) ∸ 1.

Then if the mth prime divides n, Exp(m,n) is the largest power of Prm(m)
that divides n.
If m is the number of a machine and l is the number of a tape position,
then the code number of the relevant row of the table is given by the
function

RR(m,l) = Exp(Exp(2,l), m).

The code number of the relevant column is

RC(l) = Exp(Exp(1,l) + 2, l).

(Recall that the 0-column is coded by 2 and the 1-column by 1.)
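Exp' and Exp transcribe directly into code; note the built-in offset (Exp' finds the least x with m^x failing to divide n, so the exponent itself is one less):

```python
def prm(i):
    """Prm(i): the ith prime (1-indexed)."""
    count, n = 0, 1
    while count < i:
        n += 1
        if all(n % d for d in range(2, n)):
            count += 1
    return n

def exp_prime(m, n):
    """Exp'(m,n): the least x with m^x not dividing n (or m = 1)."""
    x = 1
    while m != 1 and n % m ** x == 0:
        x += 1
    return x

def Exp(m, n):
    """Exp(m,n) = Exp'(Prm(m),n) - 1: the exponent of the mth prime in n."""
    return exp_prime(prm(m), n) - 1

# 360 = 2^3 * 3^2 * 5; exponents of Prm(1)=2, Prm(2)=3, Prm(3)=5, Prm(4)=7:
assert [Exp(i, 360) for i in (1, 2, 3, 4)] == [3, 2, 1, 0]
```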

The next-position function is

NP(m,l) = Exp(6 ∸ RC(l)^2, RR(m,l)).

The next-state function is defined as

NS(m,l) = Exp(2,l)                    if Exp(1,l) = 1 and NP(m,l) = 2,
          Exp(7 ∸ RC(l)^2, RR(m,l))  otherwise.

The next scanned term is given by

NST(m,l) = Exp(1,l) + 1   if NP(m,l) = 1,
           Exp(1,l) ∸ 1   if NP(m,l) = 2,
           Exp(1,l)       if NP(m,l) = 3.

The only term that might be altered in the successor tape is the scanned
term, whose location is Exp(1,l), and the new value of this term will be the
first or fourth entry in the relevant row of the machine, depending on the
relevant column; i.e., if v is the new value and j = #v, then
Exp(5 ∸ RC(l)^2, RR(m,l)) = j. Terms other than the scanned term will not
be altered, and we agree that no term will be altered if our machine
dictates a move off the left end of the tape. In order to describe the kth
term of the successor tape we use the following function:

T(k,m,l) = j            if (k = Exp(1,l)) ∧ (Exp(5 ∸ RC(l)^2, RR(m,l)) = j)
                           ∧ (k ≤ Ln l ∸ 2) ∧ ¬((Exp(1,l) = 1) ∧ (NP(m,l) = 2)),
           2            if k = Ln l ∸ 1,
           Exp(k+2,l)   otherwise,

where Ln l is the length of l, namely μx(¬(Prm(x) | l)) ∸ 1.
Finally we obtain STP as

STP(m,l) = 2^NST(m,l) · 3^NS(m,l) · ∏_{i=1}^{Ln l + 1} Prm(i+2)^T(i,m,l).

EXERCISES FOR §2.7

1. Prove that the decode function D(n) in (1) is computable.
2. Prove that the function f defined in (2) is computable.
3. Prove that the function g defined in (3) is computable.
4. Show that there is a machine M1 such that M1|(m, n1, ..., nk) ≃ (m, n1, ..., nk, 1).
5. Show that RR, RC, and NS are computable.
6. Show that NST and STP are computable.
7. Let X = {#M : M is a machine}. Show that X is computable.
8. Let L = {l : l is the number of a tape position}. Show that L is computable.

2.8 Machine Enumerability


If f is a computable 1-function, then there is an effective procedure for
enumerating its range. For suppose that M computes f. We can use M to
compute f(1), and then to compute f(2), and then f(3), and so on. In this
way we begin an enumeration of the set {f(1), f(2), f(3), ...}, which is the
range of f.
On the other hand, an effective procedure for enumerating a set X, say
as the list y1, y2, y3, ..., clearly yields an effective procedure for computing
the 1-function f(n) = yn. Accepting Turing's thesis, we conclude that f is
machine computable, and so X is the range of a computable function.
Thus we are led to make the following

Definition 8.1. A set Y ⊆ N+ is machine enumerable if and only if it is either
the range of a computable 1-function or the empty set.

Notice that the range of a computable k-function g is also the range of
the computable 1-function f defined by

f(x) = g(Exp(1,x), Exp(2,x), ..., Exp(k,x)).

Hence the notion of machine enumerable is not broadened by substituting
'k-function' for '1-function' in the definition.
Is every machine enumerable set computable? Suppose Y is machine
enumerable: say Y is the range of the computable 1-function f. We want to
know if there is an effective procedure that will enable us to decide, given
arbitrary y, whether or not y ∈ Y. We can, of course, begin to enumerate Y
by computing f(1), f(2), f(3), and so on. Then if y does belong to Y, it will
appear in our list as some f(n) after finitely many steps. But if y ∉ Y, then
we shall never know it by following this procedure. For after the first 100
computations, we have no guarantee that y ≠ f(101), and after the first
10,000 computations we have no guarantee that y ≠ f(10,001). We can
never be certain that no future computation will yield y as some f(n).
Hence, this procedure does not constitute an effective method for
determining membership in Y. Indeed, as we next see, there are sets that are
machine enumerable but not computable.
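The one-sided character of this search is easy to exhibit. The sketch below (our code; f is an illustrative computable function) enumerates Y = Ran f and confirms membership when it occurs, while a negative answer never emerges from the search itself; the bound parameter exists only so the demonstration terminates:

```python
def f(n):          # an illustrative computable 1-function
    return n * n

def confirm_membership(y, bound):
    """Search f(1), f(2), ... for y.  Returns True once y appears;
    returns None at the bound -- which says nothing about y not in Y."""
    for n in range(1, bound + 1):
        if f(n) == y:
            return True
    return None

assert confirm_membership(49, 100) is True    # 49 = f(7): confirmed
assert confirm_membership(50, 100) is None    # not found -- but not refuted
```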

EXAMPLE 8.2. Let K = {#M : M is a machine which halts for the input
#M}. In Example 6.4 we proved that K is not computable. However, K is
machine enumerable. Our proof of this is based on the computability of
the following relations and functions. Write Row(x) if x is the code number
of a row of a machine, i.e.,

Row(x) iff (∃x1, x2, ..., x6 < x)
  [[x = #6(x1, ..., x6) ∧ (x1 ∸ 1)(x2 ∸ 2)(x4 ∸ 1)(x5 ∸ 2) = 1]
  ∨ [x = #3(x1, x2, x3) ∧ (x1 ∸ 1)(x2 ∸ 2) = 1]
  ∨ [x = 7^x4 · 11^x5 · 13^x6 ∧ (x4 ∸ 1)(x5 ∸ 2) = 1]].

The computability of Row(x) gives us the computability of the set of code
numbers of machines:

Mach(x) iff (∀n < x)[¬(Prm(n) | x) ∨ Row(Exp(n,x))].

The code number of the input x is given by

In(x) = 2^2 · 3 · 5^2 · ∏_{i=1}^{x} Prm(i+3).

Now define

Halt(m,n,s) iff Mach(m) ∧ [TS(m,n,s) = TS(m,n,s+1)].

Let d be the number of any machine M that halts on the input #M. Now
define

f(k) = m   if (∃r,s < k)[2^r · 3^s = k ∧ r = m ∧ Halt(m, In(m), s)],
       d   otherwise.

Then f is computable and its range is K. Thus K is a machine enumerable,
non-computable set.

We extend our definition of machine enumerable to k-relations in a
natural way as follows.

Definition 8.3. A k-relation R is machine enumerable if and only if
{#k(n1, ..., nk) : R(n1, ..., nk)} is.

Thus a k-relation R is machine enumerable if and only if there is a


machine which enumerates the code numbers of the k-tuples that belong to
R, or R is empty. In fact it is an easy but tedious exercise to modify such a
machine so that it will print out the k-tuples themselves rather than the
code numbers of the k-tuples. So by Turing's thesis, if there is a set of
directions that will enable a man to enumerate the k-tuples belonging to
the relation R, then R is machine enumerable, and conversely.

EXAMPLE 8.4. An example of a non-computable machine enumerable
2-relation is the relation R defined by:
R(m,n) if and only if m is the code number of a machine which halts for
the input n.
R is machine enumerable, since the following function is computable:

F(r) = 2^Exp(1,r) · 3^Exp(2,r)   if (∃m < r)(∃n < r)(∃s < r)(r = 2^m · 3^n · 5^s ∧ Halt(m,n,s)),
       2^m* · 3^l*               otherwise.

Here l* is the number of the 1-input 1, and m* is the number of some
machine that halts on that input. From our discussion of the self-halting
problem, we see that R(m, In(m)) is not computable, and so neither is
R(m,n).

In the next few theorems we explore the relationship between machine


enumerability and computability.

Theorem 8.5. Every computable k-relation is machine enumerable.

PROOF: Let R be a computable k-relation. If R is empty, then it is machine
enumerable by definition. In the case where R is not empty, choose a
k-tuple s̄ ∈ R. Define f by

f(n̄) = #k(n̄)   if R(n̄),
       #k(s̄)   otherwise.

Clearly f is computable, and the range of f is {#k(n̄) : R(n̄)}. Hence R is
machine enumerable (making use of the remark following Definition 8.1). □

The examples above show that the converse of Theorem 8.5 is false.

Theorem 8.6. R is computable if and only if both R and ¬R are machine
enumerable.

PROOF: If R is computable, then so is ¬R (by Theorem 3.17). So by the
last theorem, both R and ¬R are machine enumerable.
To prove the converse, assume that R and ¬R are both machine
enumerable k-relations. If R is empty or equal to ᵏ(N+), there is nothing to
prove. If this is not the case, we choose computable 1-functions f and g
such that Ran f = {#k(n̄) : R(n̄)} and Ran g = {#k(n̄) : ¬R(n̄)}. Since every n̄
is either in the relation R or ¬R, we know that for every n̄ there is an m
such that f(m) = #k(n̄) or g(m) = #k(n̄). Thus the following relation,
which is easily seen to be R, is computable:

f(μm(f(m) = #k(n̄) ∨ g(m) = #k(n̄))) = #k(n̄). □
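The converse direction is an algorithm: interleave the two enumerations and answer according to which one reaches n first. A sketch with illustrative enumerations, the squares for R and the non-squares for ¬R (the closed form for the mth non-square is m + ⌊1/2 + √m⌋):

```python
import math

def f(m):   # enumerates R: the squares
    return m * m

def g(m):   # enumerates not-R: the mth non-square
    return m + math.floor(0.5 + math.sqrt(m))

def in_R(n):
    """Decide membership: search m = 1, 2, ... until n shows up in
    one of the two enumerations (it must, since they cover N+)."""
    m = 1
    while True:
        if f(m) == n:
            return True
        if g(m) == n:
            return False
        m += 1

assert [in_R(n) for n in (1, 4, 9, 16)] == [True] * 4
assert [in_R(n) for n in (2, 3, 5, 12)] == [False] * 4
```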

Corollary 8.7. If R is machine enumerable but not computable, then ¬R is
not machine enumerable.

Hence, using the examples at the beginning of this section and taking
complements, we obtain examples of non-machine enumerable relations.
We do not define machine enumerable functions. The reason is that
machine enumerable functions are computable, as the following theorem
shows.

Theorem 8.8. Suppose R is a machine enumerable (k+1)-relation such that
for every k-tuple m̄ there is exactly one n such that R(m̄,n). Then R is
computable, as is the k-function f defined by

f(m̄) = n if and only if R(m̄,n).

PROOF: Let R and f be as in the hypothesis of the theorem. Then
f(m̄) = μx(R(m̄,x)). Hence if R is computable, then so is f (by Theorem
3.23). To show that R is computable, let g be a computable 1-function
whose range is {#k+1(m̄,n) : R(m̄,n)}. Then R(m̄,n) if and only if

g(Exp(1, μx(g(Exp(1,x)) = #k(m̄) · (Prm(k+1))^Exp(2,x)))) = #k+1(m̄,n).

Hence R is computable. □
The next theorem pinpoints the relation between machine enumerability
and computability even further. If S is a (k+1)-relation, we define the
k-relation (∃x)S(m̄,x) to hold for m̄ if and only if there is an n such that
S(m̄,n). We say that (∃x)S(m̄,x) is the relation resulting from S(m̄,x) by
applying the existential quantifier to x. We read '∃x' as 'there is an x'.

Theorem 8.9. R is a machine enumerable k-relation if and only if there is a
computable (k+1)-relation S such that

R(m̄) if and only if (∃x)S(m̄,x).

PROOF: Let R be a machine enumerable k-relation. If R is empty, then

R(m̄) if and only if (∃x)(#k(m̄) ≠ #k(m̄) ∧ x ≠ x),

where #k(m̄) ≠ #k(m̄) ∧ x ≠ x is clearly computable. Suppose R is not
empty, so that there is a computable 1-function f with Ran f =
{#k(m̄) : R(m̄)}. Then

R(m̄) if and only if (∃x)(f(x) = #k(m̄)),

where f(x) = #k(m̄) is a computable (k+1)-relation.
To prove the other direction of the theorem, we start with a computable
(k+1)-relation S and define R(m̄) if and only if (∃x)S(m̄,x). If S is
empty, then R is empty and so is machine enumerable. If S is not empty,
we choose (m̄*, n*) ∈ S and define

f(m̄,n) = #k(m̄)    if S(m̄,n),
         #k(m̄*)   otherwise.

Then f is a computable function whose range is obviously {#k(m̄) : R(m̄)}.
Hence R is machine enumerable, as we needed to show. □

Given a (k+1)-relation S we define a k-relation R by

R(m̄) if and only if for all n, S(m̄,n).

We write (∀x)S(m̄,x) in place of R(m̄) and say that (∀x)S(m̄,x) is the
relation resulting from S(m̄,x) by applying the universal quantifier to x.
'∀x' is read 'for every x'.

Corollary 8.10. Let S1 and S2 be computable (k+1)-relations, and let R be a
k-relation. Suppose that

R(m̄) if and only if (∀x)S1(m̄,x)
     if and only if (∃x)S2(m̄,x).

Then R is computable.

PROOF: ¬R(m̄) if and only if (∃x)¬S1(m̄,x). Hence by the theorem, both
R and ¬R are machine enumerable. Therefore Theorem 8.6 applies, and
so R is computable. □

We end this section with a result in the spirit of Theorem 3.18, a result
that shows how machine enumerable relations can be combined to yield
other machine enumerable relations.

Theorem 8.11.
i. If R1 and R2 are machine enumerable k-relations, then so are R1 ∪ R2 and
R1 ∩ R2.
ii. If R is a machine enumerable (k+1)-relation, then the following are
machine enumerable k-relations:
(a) (∃x)R(ȳ,x),
(b) (∃x ≤ y1)R(ȳ,x),
(c) (∀x ≤ y1)R(ȳ,x).

PROOF OF i. By Theorem 8.9 there are computable relations S1 and S2 such
that

R1(m̄) if and only if (∃x)S1(m̄,x)

and

R2(m̄) if and only if (∃x)S2(m̄,x).

Then R1(m̄) ∨ R2(m̄) if and only if (∃x)S1(m̄,x) ∨ (∃x)S2(m̄,x) if and only
if (∃x)(S1(m̄,x) ∨ S2(m̄,x)), where S1(m̄,n) ∨ S2(m̄,n) is computable.
Hence R1 ∪ R2 is machine enumerable by Theorem 8.9. Similarly, R1(m̄) ∧
R2(m̄) if and only if (∃x)(S1(m̄, Exp(1,x)) ∧ S2(m̄, Exp(2,x))), so R1 ∩ R2
is machine enumerable. □

PROOF OF ii. Let R be a machine enumerable (k+1)-relation. By Theorem
8.9 there is a computable (k+2)-relation S such that

R(ȳ,x) if and only if (∃z)S(ȳ,x,z).

Hence,

(∃x)R(ȳ,x) if and only if (∃w)(S(ȳ, Exp(1,w), Exp(2,w))),

which (by Theorem 8.9) proves part ii(a). Also,

(∃x ≤ y1)R(ȳ,x) if and only if (∃z)(∃x ≤ y1)S(ȳ,x,z),

which proves ii(b) (by an application of Theorem 8.9 and Theorem 3.21).
Finally,

(∀x ≤ y1)R(ȳ,x) if and only if (∃w)(∀x ≤ y1)S(ȳ,x,Exp(x,w)),

which proves ii(c) (by an application of Theorem 8.9 and Theorem 3.21). □
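The device used twice in this proof, packing a pair of witnesses into a single number w and recovering them as Exp(1,w) and Exp(2,w), is prime-power coding once more; a minimal sketch:

```python
def exponent_of(p, n):
    """The exponent of the prime p in n (Exp, up to indexing)."""
    e = 0
    while n % p == 0:
        n, e = n // p, e + 1
    return e

def pair(a, b):
    """Fold the pair (a, b) into the single number 2^a * 3^b."""
    return 2 ** a * 3 ** b

w = pair(5, 9)
assert (exponent_of(2, w), exponent_of(3, w)) == (5, 9)
```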

In a later section we shall see how computability theory is related to the
problem of axiomatizing arithmetic, and we shall find some particularly
interesting sets that are machine enumerable but not computable, and
others that are not even machine enumerable.

EXERCISES FOR §2.8


1. Let f be a computable 1-function, and suppose that f is monotonic increasing,
i.e., f(n+1) > f(n) for all n ∈ N+. Show that Ran f is computable.
2. Let f and g be computable 1-functions with f(n) > g(n) for all n ∈ N+. Suppose
that g is 1-1 and that Ran g is computable. Show that Ran f is computable.
3. Let f be a computable 1-function and X a computable set. By f⁻¹(X) we mean
{x : f(x) ∈ X}. Must f⁻¹(X) be computable?
4. What is the cardinality of the set of machine enumerable sets? What is the
cardinality of the set of sets that are machine enumerable but not computable?
5. Show that every infinite computable set contains an infinite machine enumerable
non-computable subset.
6. Show that every infinite machine enumerable set contains an infinite computable
subset. (Hint: Use Exercise 1 above.)
7. Show that there is a machine enumerable 2-relation R that enumerates all
machine enumerable 1-relations, in the sense that for any machine enumerable
1-relation S there is an m such that for all n,

R(m,n) if and only if S(n).

8. Let R(m,n) hold if and only if there are machines M1, M2 such that m = #M1,
n = #M2, and M1 and M2 compute the same 1-functions, or one of the two
machines does not compute a function. Is R machine enumerable?

2.9 An Alternate Definition of Computable Function


There are ways of defining the class of computable functions without
mentioning machines, and some of these definitions are far more elegant
than the one we have been working with. However, the notion of machine
has considerable intuitive appeal and provides easier access to examples of
non-computable functions that are of interest to those who deal with real
computers.
In this section we present one of these alternate definitions of the class
of computable functions and sketch a proof of the equivalence of the new
definition with the one we have been using. The new definition will be of
use in the next chapter.
Before giving the alternate formulation, we first show that any computable
function f can be given a rather simple definition in terms of a
machine M that computes it. Let m = #M and let Ink(n̄) be the number of
the k-input n̄. Then

f(n̄) = Decode[TS(m, Ink(n̄), μs(Halt(m, Ink(n̄), s)))]
     = Decode[TS(m, Ink(n̄), μs(R=(TS(m, Ink(n̄), s+1), TS(m, Ink(n̄), s)) = 1))].    (*)


Now let 𝒞 be the set of computable functions. Suppose that 𝒳 is any class of
functions such that:
i. For each k the function Ink belongs to 𝒳, and the functions Decode,
TS, R=, and f(x) = x + 1 belong to 𝒳.
ii. If f is an r-function and belongs to 𝒳, and if g1, ..., gr are k-functions
each belonging to 𝒳, then the composition of f with g1, ..., gr belongs
to 𝒳.
iii. If g is a (k+1)-function belonging to 𝒳 and if for every k-tuple n̄ there
is an m such that g(m,n̄) = 1, then the k-function f defined by

f(n̄) = μx[g(x,n̄) = 1]

belongs to 𝒳.
Then 𝒳 contains each function f defined as in (*) above, and so 𝒞 ⊆ 𝒳.
Except for Ink we already know that each of the functions mentioned
in i belongs to 𝒞. We now show that Ink also belongs to 𝒞 for each k.
First consider the function C defined by

C(x,y) = x · ∏_{i=1}^{Ln y} [Prm(Ln x + i)]^Exp(i,y).

So if x = #(n1, ..., nk) and y = #(m1, ..., ml), then C(x,y) = #(n1, ..., nk,
m1, ..., ml). Clearly C is computable. Now we have

In1(n1) = 2^2 · 3 · 5^2 · ∏_{i=1}^{n1} Prm(3+i)

and

Ink+1(n1, ..., nk+1) = C(Ink(n1, ..., nk), 2^2 · ∏_{i=1}^{nk+1} Prm(i+1)).

We see by induction on k that each of the Ink's belongs to 𝒞.
Hence if 𝒳 is the smallest set of functions satisfying conditions i, ii, and
iii above, then 𝒳 ⊆ 𝒞 (since 𝒞 satisfies i, ii, and iii). Thus we have a new
characterization of 𝒞: 𝒞 is the smallest set 𝒳 satisfying i, ii, and iii.
The definition as it stands is not particularly elegant or useful, because
of the complexity of the functions mentioned in condition i and because
the definition of some of them is motivated only within the framework of
Turing machines. These shortcomings are bypassed by observing that each
of the functions appearing in condition i was built up from those mentioned
in Theorem 2.5 by composition, recursion, and the μ-operator.

Lemma 9.1. 𝒞 is the smallest set 𝒦 such that:
i. For each k, d ∈ N⁺ and t ≤ k, C_{k,d} ∈ 𝒦 and P_{k,t} ∈ 𝒦. Also Sum, R_=,
   and Pred belong to 𝒦.
ii. If h(n̄) = f(g₁(n̄), ..., g_k(n̄)) and f, g₁, ..., g_k ∈ 𝒦, then h ∈ 𝒦.
iii. If h is defined by
      h(1, n̄) = g(n̄),
      h(m+1, n̄) = f(m, n̄, h(m, n̄)),
   and g, f ∈ 𝒦, then h ∈ 𝒦.
iv. If g ∈ 𝒦 and for all n̄ there is an m such that g(m, n̄) = 1, and if
   h(n̄) = μx(g(x, n̄) = 1), then h ∈ 𝒦.

Next we will obtain a more elegant formulation of 𝒞 that has only two
closure conditions (closure under recursive definitions being omitted).

Definition 9.2. The set of recursive functions, Rec, is the smallest set 𝒦 such
that:
i. +, ·, and R_< belong to 𝒦, as does P_{k,t} for each k ∈ N⁺ and t ≤ k.
ii. If h(n̄) = f(g₁(n̄), ..., g_k(n̄)) and f, g₁, ..., g_k ∈ 𝒦, then h ∈ 𝒦.
iii. If g ∈ 𝒦, and for each n̄ there is an m such that g(m, n̄) = 1, and if
   h(n̄) = μx(g(x, n̄) = 1), then h ∈ 𝒦.
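The μ-operator of clause iii can likewise be sketched directly: search 1, 2, 3, ... for the least x with g(x, n̄) = 1. The hypothesis that such an x always exists is exactly what guarantees termination (a sketch; the names are illustrative):

```python
from itertools import count

def R_less(m, n):
    """Representing function of <: 1 if m < n, 2 otherwise."""
    return 1 if m < n else 2

def mu(g, *ns):
    """mu x (g(x, n-bar) = 1): least x in N+ with g(x, n-bar) = 1.
    Diverges if no such x exists, just as a Turing machine would."""
    for x in count(1):
        if g(x, *ns) == 1:
            return x

# Example: f(n) = mu x (n < x + x), i.e. the least x with 2x > n.
f = lambda n: mu(lambda x, n: R_less(n, x + x), n)
print(f(5))   # least x with 2x > 5 is 3
print(f(1))   # least x with 2x > 1 is 1
```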

The main result of this section is

Theorem 9.3. e = Rec.


As before, it is clear that Rec ⊆ 𝒞, since each of the functions mentioned
in condition i belongs to 𝒞, and 𝒞 satisfies conditions ii and iii. By the
lemma, if we show that +, R_=, Pred, and each C_{k,d} and P_{k,t} belong to
Rec, and that Rec is closed under recursive definitions, we shall have
𝒞 ⊆ Rec as needed. To see this we need the next lemma as well as some
results from number theory.

Definition 9.4. A relation is recursive if its representing function is.

Lemma 9.5.
i. For every k, d ∈ N⁺, the constant function C_{k,d} is recursive.
ii. Pred ∈ Rec.
iii. The equality relation is recursive.

iv. If P and Q are recursive k-relations, then so are P ∨ Q, P ∧ Q, and ¬P.
v. If P is a recursive (k+1)-relation, then so are
   (∀x ≤ n₁)P(x, n₁, ..., n_k) and (∃x ≤ n₁)P(x, n₁, ..., n_k).

PROOF: In each case, the proof follows easily from the form of the
equation or equivalence given below that defines the function or relation in
question.

i: C_{k,1}(n̄) = μx(x < n₁ + n₁) = μx(R_<(x, Sum(P_{k,1}(n̄), P_{k,1}(n̄))) = 1).
Now suppose C_{k,j} ∈ Rec. Then C_{k,j+1} ∈ Rec, since
   C_{k,j+1}(n̄) = μx(R_<(C_{k,j}(n̄), x) = 1).

ii: Pred(n) = μx(n < x + 2) = μx(R_<(n, Sum(x, C_{1,2}(n))) = 1).

iii: R_=(m, n) = R_<(R_<(m, n+1) · R_<(n, m+1), 2).

iv: R_{P∨Q}(n̄) = R_<(R_P(n̄) + R_Q(n̄), 4),
    R_{P∧Q}(n̄) = R_<(R_P(n̄) + R_Q(n̄), 3),
    R_{¬P}(n̄) = R_<(1, R_P(n̄)).

v: Suppose that P is recursive, and let F be the representing function of
(∀x ≤ n₁)P(x, n₁, ..., n_k). Then
   F(n̄) = R_<(n₁, μx(R_Q(x, n̄) = 1)),
where Q(x, n̄) is the recursive relation ¬P(x, n̄) ∨ (x = n₁ + 1).
Finally, note that (∃x ≤ n₁)P(x, n₁, ..., n_k) iff ¬(∀x ≤ n₁)¬P(x, n₁, ..., n_k),
and apply part iv. ∎
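The equations in the proof of part iv can be checked mechanically: with representing functions taking the value 1 for 'true' and 2 for 'false', the arithmetic of R_< really does compute ∨, ∧, and ¬. A quick Python check (function names illustrative):

```python
def R_less(m, n):
    """Representing function of <: 1 if m < n, 2 otherwise."""
    return 1 if m < n else 2

# The three identities from the proof of Lemma 9.5.iv.
def R_or(p, q):  return R_less(p + q, 4)   # 1 iff p = 1 or q = 1
def R_and(p, q): return R_less(p + q, 3)   # 1 iff p = 1 and q = 1
def R_not(p):    return R_less(1, p)       # 1 iff p = 2

for p in (1, 2):
    for q in (1, 2):
        assert R_or(p, q)  == (1 if p == 1 or q == 1 else 2)
        assert R_and(p, q) == (1 if p == 1 and q == 1 else 2)
    assert R_not(p) == (2 if p == 1 else 1)
print("all identities check")
```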
Recall that m and n are relatively prime if 1 is their greatest common
divisor. We need several number theoretic facts whose proofs are short
enough to be given here.

Theorem 9.6. If m and n are relatively prime, then there are integers s and t
such that 1 = sm + tn.
PROOF: Let u be the least positive integer that can be written in the form
sm + tn where s and t are integers (positive, negative, or 0). Notice that
u|m, for if not, then m = uk + l for some k and some l such that 0 < l < u;
hence m = (sm + tn)k + l, from which we get l = (1 - sk)m - tkn, contradict-
ing the minimality of u. Similarly u|n. Since m and n are relatively prime, u
must be 1. ∎
Now define the remainder function Rem as follows:

   Rem(m, n) = r  if m = nk + r for some k ≥ 1 and some r with 1 ≤ r < n,
   Rem(m, n) = 1  otherwise.

Rem is also defined by

   Rem(m, n) = μx((∃k ≤ m)(m = kn + x) ∨ (x = 1 ∧ ((∃k ≤ m)(m = k·n) ∨ m < n))).

Using the lemma and this definition, we see that Rem ∈ Rec.
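Rem as just defined differs from the ordinary remainder: it returns 1 both when n divides m and when m < n, so that its values stay in N⁺. A direct Python transcription (the name rem is illustrative):

```python
def rem(m, n):
    """Rem(m, n): r if m = nk + r with k >= 1 and 1 <= r < n; 1 otherwise."""
    if m >= n and m % n != 0:
        return m % n   # the ordinary remainder, which here lies in 1..n-1
    return 1           # n divides m, or m < n

print(rem(7, 3))   # 7 = 2*3 + 1  -> 1
print(rem(8, 3))   # 8 = 2*3 + 2  -> 2
print(rem(6, 3))   # 3 | 6        -> 1
print(rem(2, 5))   # 2 < 5        -> 1
```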

Theorem 9.7 (Chinese Remainder Theorem). If n₁, ..., n_k are relatively
prime in pairs, and if aᵢ < nᵢ for all i ≤ k, then there is an m such that
Rem(m, nᵢ) = aᵢ for all i ≤ k.
PROOF: Let z = n₁·n₂·…·n_k and let zᵢ = z/nᵢ. Since zᵢ and nᵢ are relatively
prime, there are integers sᵢ and tᵢ such that 1 = sᵢzᵢ + tᵢnᵢ. Hence aᵢ = aᵢsᵢzᵢ +
aᵢtᵢnᵢ, or aᵢ - aᵢtᵢnᵢ = aᵢsᵢzᵢ. Dividing both sides by nᵢ, we get Rem(aᵢsᵢzᵢ, nᵢ) =
aᵢ. Let m = a₁s₁z₁ + … + a_k s_k z_k + lz, where l is arbitrarily chosen so that
m > 0. Then
   Rem(m, nᵢ) = aᵢ  for each i ≤ k. ∎
Theorem 9.8. For each finite sequence a₁, ..., a_k there is an m and an n such
that
   Rem(m, 1 + jn) = aⱼ
for each j ≤ k.
PROOF: By the preceding theorem it is enough to find an n such that the
numbers 1 + jn for j ≤ k are relatively prime in pairs and 1 + jn > aⱼ. Let
n = b!, where b = max{a₁, ..., a_k, k}. Surely 1 + jn > aⱼ for each j ≤ k.
Suppose that some prime p divides both 1 + jn and 1 + in, where i < j ≤ k.
Then p divides (1 + jn) - (1 + in), i.e., p divides (j - i)n. Hence p | j - i or
p | n. But if p | j - i, then p | n, since j - i < k ≤ b and n = b!. So in any case,
p | n. Along with our assumption that p | 1 + jn, we get p | (1 + jn) - jn, or
p | 1, a contradiction. Hence the 1 + jn are relatively prime in pairs for j ≤ k. ∎

Lemma 9.9. Let g be a k-function and f a (k+2)-function such that g and f
are recursive. Then the function h defined as follows is recursive:
   h(1, n̄) = g(n̄),
   h(m+1, n̄) = f(m, n̄, h(m, n̄)).

PROOF: Using the lemma we see that the following relation H is recursive:
H(x, y, m, n̄, s)
iff (g(n̄) = Rem(x, 1 + y))
   ∧ (∀w < m)(Rem(x, 1 + (w+1)y) = f(w, n̄, Rem(x, 1 + wy)))
   ∧ (Rem(x, 1 + my) = s).
Notice that h(m, n̄) = s iff there is an x and a y such that H(x, y, m, n̄, s). The
lemma gives the recursiveness of the function
   G(m, n̄) = μx(∃y ≤ x)(∃s ≤ x)H(x, y, m, n̄, s).

This in turn gives the recursiveness of
   K(m, n̄) = μy(∃s ≤ G(m, n̄))H(G(m, n̄), y, m, n̄, s),
and hence the recursiveness of
   h(m, n̄) = Rem(G(m, n̄), 1 + m·K(m, n̄))
as needed. ∎
The two lemmas above conclude the proof of Theorem 9.3.

EXERCISES FOR §2.9

1. The class of primitive recursive functions, Pr, is the least class Γ such that
   i. C_{1,1}, R_=, Succ ∈ Γ [recall that C_{1,1}(n) = 1 for all n; R_=(m, n) = 1 if m = n, and
      R_=(m, n) = 2 if m ≠ n; Succ(n) = n + 1],
   ii. Γ is closed under composition,
   iii. Γ is closed under definition by recursion.
   (a) Show that Sum, Prod ∈ Pr.
   (b) Show that there is some 2-function F ∈ Rec such that for each 1-function
       g ∈ Pr there is an m for which
          g(n) = F(m, n)
       for all n ∈ N⁺.
   (c) Conclude from (b) that Pr ⊊ Rec.
2. Let Θ be the least class of relations Δ such that
   i. + and · (as 3-relations) belong to Δ,
   ii. if R, S ∈ Δ and both are k-relations, then R ∪ S ∈ Δ and ¬R ∈ Δ,
   iii. if R is a k-relation in Δ and p: k → k, then S is a k-relation in Δ, where S is
        defined by
           S(n₁, ..., n_k) iff R(n_{p(1)}, ..., n_{p(k)}),
   iv. if R is a k-relation in Δ and
          S(n₁, ..., n_k) iff (∀y ≤ n_k)R(n₁, ..., n_{k-1}, y),
       then S ∈ Δ,
   v. if R₁, R₂ are (k+1)-relations in Δ, and if S(n₁, ..., n_k) iff (∃y)R₁(n₁, ..., n_k, y)
      iff (∀y)R₂(n₁, ..., n_k, y), then S ∈ Δ.
   Show that Θ is the set of recursive relations.

2.10 An Idealized Language


In his famous address of 1928 to the International Congress in Bologna,
David Hilbert proposed a search for an adequate axiomatic foundation for
the arithmetic of the natural numbers. His hope was to use such an
axiomatization as a base for all of mathematics, since the other number

systems such as the rationals, the reals, the complex numbers, etc., can be
defined in terms of the arithmetic of the natural numbers in such a way
that their properties can be derived from those of the natural numbers.
Unfortunately, in 1931, Gödel proved that this goal was not attainable,
and with his incompleteness theorem pointed out startling limitations
inherent in the axiomatic method. With Gödel's remarkable theorem as
our goal, our first task is to define a language that is appropriate for an
axiomatic development of number theory. This language, which we call L,
can be viewed as a fragment of English, so formalized that the set of
meaningful expressions is precisely defined, and the meaning of each
assertion is unambiguous.
Since we shall be using the English language to discuss expressions of L,
we make the symbols of L distinct from those of English in order to avoid
confusion.

THE SYMBOLS OF L
Variables: v₁, v₂, v₃, ….
Constant symbols: 1, 2, 3, ….
Equality symbol: ≈.
Symbols for 'and', 'or', and 'not': ∧, ∨, ¬.
Existential and universal quantifier symbols: ∃, ∀.
Symbols for the functions of addition and multiplication: ⊕, ⊙.
Left and right parentheses: [, ].

Definition 10.1a. An expression is a finite sequence of symbols of L.

For example, ], ], ∀, 9, 3, ∀ is an expression. From now on we shall omit
the commas between successive terms of expressions, so that this will be
written ]]∀93∀.

Definition 10.1b. The set of terms, Trm, is the smallest set X of expressions
such that
i. vᵢ ∈ X for all i ∈ N⁺,
ii. i ∈ X for all i ∈ N⁺,
iii. if t₁, t₂ ∈ X, then so are [t₁ ⊕ t₂] and [t₁ ⊙ t₂].

Some examples of terms are v₂₈, 9, and [[1 ⊕ v₄] ⊙ [4 ⊙ 3]]. To see
that the last expression is a term, we first observe by conditions i and ii
that 1, v₄, 4, and 3 are terms. Hence by condition iii, [1 ⊕ v₄] and [4 ⊙ 3] are
terms; call them t₁ and t₂ respectively. Hence by iii again, [t₁ ⊙ t₂] is a term
as needed.
The expressions 3⊕4 and 2⊙v₁₁ are not terms.
Without the parentheses, we could not distinguish between
[[2⊕4] ⊙ 3] and [2⊕[4⊙3]]; i.e., the expression 2⊕4⊙3 could be
viewed both as t₁ ⊙ t₂ and as t₁′ ⊕ t₂′, where t₁, t₂, t₁′, t₂′ are respectively 2⊕4, 3, 2,

and 4⊙3. That this kind of ambiguity is avoided by the use of parentheses
is the content of the following:

Theorem 10.2 (Unique Readability for Terms). For each term t, exactly one
of the following conditions holds:
i. there is a unique i such that t = vᵢ,
ii. there is a unique i such that t = i,
iii. there is exactly one sequence t₁, t₂, s where t₁, t₂ ∈ Trm and s ∈ {⊕, ⊙}
such that t = [t₁ s t₂].
PROOF: We first observe that a proper initial segment of a term has fewer
right parentheses than left parentheses, and that a proper end segment of a
term has fewer left parentheses than right parentheses. For let Y be the set
of all terms for which this is true. Clearly all variables and constant
symbols belong to Y (these have no proper segments), and it is easily seen
that [t₁⊕t₂] ∈ Y and [t₁⊙t₂] ∈ Y if t₁, t₂ ∈ Y. Hence Y = Trm. It follows
that a term has as many right parentheses as left parentheses.
Now suppose that t is a term that is neither a variable nor a constant
symbol, say t = [t₁ * t₂], where '*' is either '⊙' or '⊕'. If t can also be
written differently as [s₁ # s₂], then s₁ is a proper initial segment of t₁ or t₁
is a proper initial segment of s₁, which is impossible in view of our remarks
about parentheses. ∎
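The parenthesis-counting argument in the proof also yields an algorithm for recognizing terms: inside a bracketed expression, the unique top-level operator is the one occurring at bracket depth zero. A Python sketch over token lists, writing '+' and '*' for ⊕ and ⊙ (a simplification; the encoding is illustrative):

```python
def is_term(t):
    """Decide whether the token list t is a term of L."""
    if len(t) == 1:                     # a variable v_i or a constant symbol
        return t[0].isdigit() or (t[0][0] == 'v' and t[0][1:].isdigit())
    if len(t) < 5 or t[0] != '[' or t[-1] != ']':
        return False
    depth = 0
    for i in range(1, len(t) - 1):      # look for the operator at depth 0
        if t[i] == '[':
            depth += 1
        elif t[i] == ']':
            depth -= 1
        elif t[i] in '+*' and depth == 0:
            # unique readability: split at this operator and recurse
            return is_term(t[1:i]) and is_term(t[i + 1:-1])
    return False

print(is_term(['[', '[', '1', '+', 'v4', ']', '*', '[', '4', '*', '3', ']', ']']))  # True
print(is_term(['3', '+', '4']))                                                     # False
```

Unique readability is what justifies returning at the first depth-zero operator: a genuine term has exactly one.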
Of course, our intention is to have the symbol 1 denote the number 1,
the symbol 2 denote the number 2, and so on. In the most natural way we
use terms other than the constant symbols to denote numbers also. For
example, [1⊕1] denotes the number 2, while [[[1⊕1]⊕1]⊕1], [2⊙2], and
[1⊕[2⊕1]] are three terms denoting the number 4.
On the other hand, terms having variables have no denotation. For
example, [3⊕v₉] has no denotation, since v₉ does not denote a specific
number. However, if a particular number is assigned to the variable v₉,
then we may think of [3⊕v₉] as denoting a number. For example, if we
assign the number 5 to v₉, then [3⊕v₉] denotes 8. We now spell out these
ideas more precisely.

Definition 10.3a. An assignment is a function whose domain is
{v₁, v₂, v₃, …} and whose range is contained in N⁺.

Definition 10.3b. If z is an assignment and t is a term, we define t⟨z⟩, the
denotation of t under z, as follows:
i. If t is the variable v, then t⟨z⟩ is z(v).
ii. If t is n, then t⟨z⟩ = n.
iii. If t = [t₁⊕t₂], then t⟨z⟩ = t₁⟨z⟩ + t₂⟨z⟩. If t = [t₁⊙t₂], then t⟨z⟩ =
t₁⟨z⟩ · t₂⟨z⟩.

For example, let t = [[v₃⊙v₁] ⊕ [2⊕v₉]], and let z be any assignment
such that z(v₁) = 5, z(v₃) = 2, z(v₉) = 3. Then by clauses i and ii,
v₁⟨z⟩ = 5, v₃⟨z⟩ = 2, v₉⟨z⟩ = 3, and 2⟨z⟩ = 2. Using clause iii, [v₃⊙v₁]⟨z⟩
= v₃⟨z⟩·v₁⟨z⟩ = 2·5 = 10, and [2⊕v₉]⟨z⟩ = 2⟨z⟩ + v₉⟨z⟩ = 2 + 3 = 5. Using
clause iii again we get t⟨z⟩ = [v₃⊙v₁]⟨z⟩ + [2⊕v₉]⟨z⟩ = 10 + 5 = 15.
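Definition 10.3b is a straightforward structural recursion, which can be sketched by representing a term as a nested tuple ('+', t1, t2) or ('*', t1, t2), a variable as a string, and a constant symbol as an int (an illustrative encoding, not the book's):

```python
def denote(t, z):
    """t<z>: the denotation of term t under assignment z (Definition 10.3b)."""
    if isinstance(t, int):        # clause ii: a constant symbol denotes itself
        return t
    if isinstance(t, str):        # clause i: a variable denotes z(v)
        return z[t]
    op, t1, t2 = t                # clause iii: [t1 + t2] or [t1 * t2]
    if op == '+':
        return denote(t1, z) + denote(t2, z)
    return denote(t1, z) * denote(t2, z)

# The example from the text: t = [[v3 . v1] + [2 + v9]].
t = ('+', ('*', 'v3', 'v1'), ('+', 2, 'v9'))
z = {'v1': 5, 'v3': 2, 'v9': 3}
print(denote(t, z))   # 2*5 + (2+3) = 15
```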

Definition 10.4a. An atomic formula is an expression of the form [t₁ ≈ t₂]
where t₁ and t₂ are terms.

Definition 10.4b. The set of all formulas (call it Fm) is the smallest set X of
expressions such that
i. every atomic formula belongs to X,
ii. if φ ∈ X then [¬φ] ∈ X,
iii. if φ ∈ X and ψ ∈ X, then both [φ∧ψ] and [φ∨ψ] belong to X,
iv. if φ ∈ X and v is a variable, then both [∀vφ] and [∃vφ] belong to X.

For example, the following expressions are formulas:

   [[v₃⊙3] ≈ [[v₃⊕v₁]⊕9]],  [∃v₃[[[1⊕1]⊙v₃] ≈ v₁]],
   [∀v₄[[v₄≈2] ∨ [∃v₁[[v₁⊕1] ≈ v₄]]]].

Only the first of these is an atomic formula. To see that the last expression
is a formula, we observe by clause i that [v₄≈2] and [[v₁⊕1]≈v₄] are
(atomic) formulas. Hence by clause iv, [∃v₁[[v₁⊕1]≈v₄]] is a formula. By
clause iii, [[v₄≈2] ∨ [∃v₁[[v₁⊕1]≈v₄]]] is a formula. Finally, by clause iv,
[∀v₄[[v₄≈2] ∨ [∃v₁[[v₁⊕1]≈v₄]]]] is a formula.

Theorem 10.5 (Unique Readability for Formulas). If φ is a formula, then
exactly one of the following conditions is true:
i. there are unique terms t₁, t₂ such that φ = [t₁≈t₂],
ii. there is a unique formula ψ such that φ = [¬ψ],
iii. there is a unique triple ψ₁, ψ₂, s where ψ₁ and ψ₂ are formulas, s is
either ∨ or ∧, and φ = [ψ₁ s ψ₂],
iv. there is a unique triple Q, v, ψ where Q is either ∀ or ∃, v is a variable,
ψ is a formula, and φ = [Qvψ].

A proof for Theorem 10.5 can be given along the lines of that for
Theorem 10.2 (see Exercise 3).
Some of our formulas can be interpreted in the obvious
way as assertions about arithmetic. For example,
[∀v₁[∃v₂[[[2⊙v₂]≈v₁] ∨ [[2⊙v₂]≈[v₁⊕1]]]]] is intended to
assert that every number is either divisible by 2 or its successor is. The
intended assertion of the formula [∃v₁[∀v₂[∃v₃[[v₂⊕v₃]≈v₁]]]] is
that there is a largest natural number. On the other hand,

[∃v₁[∃v₂[[[¬[v₁≈1]] ∧ [¬[v₂≈1]]] ∧ [[v₁⊙v₂]≈v₃]]]] makes no
assertion unless we assign a number to v₃, in which case the formula will
be true if the number assigned is not a prime and false otherwise. We now
make these notions more precise.
Let z be an assignment, v a variable, and n ∈ N⁺. Then by z(v/n) we
mean the assignment defined by

   z(v/n)(u) = z(u)  if u ≠ v,
   z(v/n)(u) = n     if u = v.

Definition 10.6. We say that the assignment z satisfies the formula φ, and
write N⁺ ⊨ φ⟨z⟩, if either
i. for some t₁, t₂ ∈ Trm, φ is [t₁≈t₂] and t₁⟨z⟩ = t₂⟨z⟩,
ii. for some ψ ∈ Fm, φ is [¬ψ] and it is not the case that N⁺ ⊨ ψ⟨z⟩,
iii. for some ψ₁, ψ₂ ∈ Fm, φ is [ψ₁∧ψ₂] and both N⁺ ⊨ ψ₁⟨z⟩ and N⁺ ⊨ ψ₂⟨z⟩,
iii′. for some ψ₁, ψ₂ ∈ Fm, φ is [ψ₁∨ψ₂] and either N⁺ ⊨ ψ₁⟨z⟩ or N⁺ ⊨ ψ₂⟨z⟩
(as usual, 'or' here is used in the inclusive sense),
iv. for some ψ ∈ Fm and some variable v, φ is [∀vψ], and for all n,
N⁺ ⊨ ψ⟨z(v/n)⟩,
or
iv′. for some ψ ∈ Fm and some variable v, φ is [∃vψ], and there is at least
one n such that N⁺ ⊨ ψ⟨z(v/n)⟩.
As an example, take φ = [∀v₁[∃v₂[[2⊙v₂]≈v₁]]] and let z be an
arbitrary assignment. By clause iv, N⁺ ⊨ φ⟨z⟩ only if for all n we have
   N⁺ ⊨ [∃v₂[[2⊙v₂]≈v₁]]⟨z(v₁/n)⟩.
By iv′, this is the case only if for all n there is an m such that
   N⁺ ⊨ [[2⊙v₂]≈v₁]⟨z(v₁/n)(v₂/m)⟩,
i.e., only if for all n there is an m such that 2·m = n. This is not the case if n
is odd, and so z does not satisfy φ. Of course, since z is arbitrary in this
example, we see that no z satisfies φ.
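Definition 10.6 cannot be executed outright, since clauses iv and iv′ quantify over all of N⁺; but truncating the quantifiers at a finite bound gives a checker that can, as in the example just given, refute φ by finding the counterexample n = 1. A hedged Python sketch (the tuple encoding of formulas is illustrative, and a universal claim that merely survives the bound is not thereby proved):

```python
def denote(t, z):
    """Denotation of a term: int = constant, str = variable, tuple = op."""
    if isinstance(t, int):
        return t
    if isinstance(t, str):
        return z[t]
    op, a, b = t
    return denote(a, z) + denote(b, z) if op == '+' else denote(a, z) * denote(b, z)

def sat(phi, z, bound):
    """Approximate Definition 10.6 with quantifiers ranging over 1..bound."""
    op = phi[0]
    if op == 'eq':                    # clause i
        return denote(phi[1], z) == denote(phi[2], z)
    if op == 'not':                   # clause ii
        return not sat(phi[1], z, bound)
    if op == 'and':                   # clause iii
        return sat(phi[1], z, bound) and sat(phi[2], z, bound)
    if op == 'or':                    # clause iii'
        return sat(phi[1], z, bound) or sat(phi[2], z, bound)
    _, v, body = phi                  # clauses iv and iv': z(v/n)
    ns = range(1, bound + 1)
    if op == 'all':
        return all(sat(body, {**z, v: n}, bound) for n in ns)
    return any(sat(body, {**z, v: n}, bound) for n in ns)

# phi = [forall v1 [exists v2 [[2 . v2] = v1]]]: refuted already at v1 = 1.
phi = ('all', 'v1', ('exists', 'v2', ('eq', ('*', 2, 'v2'), 'v1')))
print(sat(phi, {}, 10))   # False
```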

Definition 10.7.
i. If S is a sequence s₁, s₂, ..., s_n and if 1 ≤ i ≤ j ≤ n, then the i,j-sub-
sequence of S is sᵢ, s_{i+1}, ..., s_j.
ii. ψ is the i,j-subformula of φ if ψ is a formula and the i,j-subsequence of
φ. ψ is a subformula of φ if ψ is the i,j-subformula of φ for some i, j.
iii. The symbol s occurs at i in φ if s is the ith term of the sequence φ.

iv. The variable v is bound at k in φ if v occurs at k in φ and for some
i ≤ k ≤ j, the i,j-subsequence of φ is a subformula of the form [∀vψ] or
[∃vψ].
v. A variable v that occurs at k in φ but is not bound there is said to
occur free at k. v occurs free (bound) in φ if for some k, v occurs free
(bound) at k.
vi. A formula φ is an assertion if no variable occurs free in it.

For example, let φ = [∃v₃[[∀v₂[¬[v₄≈v₂]]] ∨ [v₃≈v₂]]]. Then
∀v₂[¬[v₄≈v₂]] is the 6,15-subsequence of φ, while [¬[v₄≈v₂]] is the
8,15-subsequence of φ. The first is not a subformula but the second is. v₂ is
bound at 7 and 13 but free at 21. v₃ is bound at both of its occurrences,
and v₄ is free at its single occurrence. φ is not an assertion, but
[∀v₂[∀v₄φ]] is.

Theorem 10.8.
i. If t ∈ Trm, and z and z′ are assignments such that z(v) = z′(v) for all
variables v occurring in t, then
   t⟨z⟩ = t⟨z′⟩.
ii. If φ ∈ Fm, and z and z′ are assignments such that z(v) = z′(v) for all
variables v occurring free in φ, then N⁺ ⊨ φ⟨z⟩ if and only if N⁺ ⊨ φ⟨z′⟩.

PROOF OF i. Let X be the set of all terms t for which Theorem 10.8i is true,
i.e., all terms t such that t⟨z⟩ = t⟨z′⟩ whenever z(v) = z′(v) for all variables
v occurring in t. Clearly, every constant symbol and every variable belongs
to X. Moreover, if t₁ and t₂ belong to X, then [t₁⊕t₂]⟨z⟩ = t₁⟨z⟩ + t₂⟨z⟩ =
t₁⟨z′⟩ + t₂⟨z′⟩ = [t₁⊕t₂]⟨z′⟩, and so [t₁⊕t₂] ∈ X. Similarly [t₁⊙t₂] belongs
to X if t₁, t₂ ∈ X. Hence, by Definition 10.1b, X = Trm as we needed to
show. ∎
PROOF OF ii. Let X be the set of all formulas for which Theorem 10.8ii
holds. Using part i, we see that the atomic formulas are in X.
Suppose that ψ ∈ X and that z and z′ are assignments that agree on the
free variables that occur in ψ. Then N⁺ ⊨ ψ⟨z⟩ if and only if N⁺ ⊨ ψ⟨z′⟩.
Noting that ψ and [¬ψ] have the same free variables, and using Definition
10.6ii, we see that the following statements are equivalent:
   N⁺ ⊨ [¬ψ]⟨z⟩.
   It is not the case that N⁺ ⊨ ψ⟨z⟩.
   It is not the case that N⁺ ⊨ ψ⟨z′⟩.
   N⁺ ⊨ [¬ψ]⟨z′⟩.
Hence, if ψ ∈ X, then [¬ψ] ∈ X.

Now suppose that ψ₁ ∈ X and ψ₂ ∈ X. Clearly, every variable that occurs
free in ψ₁ or occurs free in ψ₂ also occurs free in [ψ₁∨ψ₂]. Now suppose
that z and z′ are assignments that agree on the variables occurring free in
[ψ₁∨ψ₂]. Since ψ₁, ψ₂ ∈ X, we have that N⁺ ⊨ ψ₁⟨z⟩ if and only if N⁺ ⊨
ψ₁⟨z′⟩, and N⁺ ⊨ ψ₂⟨z⟩ if and only if N⁺ ⊨ ψ₂⟨z′⟩. Hence, using Definition
10.6iii′, we see that the following statements are equivalent:
   N⁺ ⊨ [ψ₁∨ψ₂]⟨z⟩,
   N⁺ ⊨ ψ₁⟨z⟩ or N⁺ ⊨ ψ₂⟨z⟩,
   N⁺ ⊨ ψ₁⟨z′⟩ or N⁺ ⊨ ψ₂⟨z′⟩,
   N⁺ ⊨ [ψ₁∨ψ₂]⟨z′⟩.
Hence, if ψ₁, ψ₂ ∈ X, then [ψ₁∨ψ₂] ∈ X. A similar argument shows that if
ψ₁, ψ₂ ∈ X, then [ψ₁∧ψ₂] ∈ X.
Now suppose ψ ∈ X, and let z and z′ be assignments that agree on all
variables occurring free in [∃vψ]. Suppose that N⁺ ⊨ [∃vψ]⟨z⟩. By Defini-
tion 10.6iv′ this means that for some assignment w agreeing with z on all
variables except possibly v, we have N⁺ ⊨ ψ⟨w⟩. Now let w′(v) = w(v) and
w′(u) = z′(u) for all u ≠ v. Since we are assuming that ψ ∈ X, and since a
variable that occurs free in ψ either is v or occurs free in [∃vψ], we see that
N⁺ ⊨ ψ⟨w′⟩. Hence, by Definition 10.6iv′ again, N⁺ ⊨ [∃vψ]⟨z′⟩. Thus if
ψ ∈ X, so is [∃vψ]. Similarly, one shows that if ψ ∈ X, then [∀vψ] ∈ X.
Hence, by Definition 10.4b, we see that X = Fm, as needed to prove the
theorem. ∎

Recall that an assertion is a formula without free variables. By Theorem
10.8ii, if φ is an assertion, then N⁺ ⊨ φ⟨z⟩ either for all z or for no z. Since
the assignment is immaterial, we shall write N⁺ ⊨ φ and say that φ is true in
the first case, and write N⁺ ⊭ φ and say that φ is false in the second case.
To facilitate the writing and reading of formulas of L, we shall omit
those occurrences of parentheses that are not needed to avoid ambiguity,
and we shall write + and · instead of ⊕ and ⊙. For example, we might
write

   ∀v₁∃v₂[[v₁≈v₂ ∧ v₂·v₃≈v₄] ∧ ∃v₅[v₃+v₅≈v₄]]   (1)

instead of

   [∀v₁[∃v₂[[[v₁≈v₂] ∧ [[v₂⊙v₃]≈v₄]] ∧ [∃v₅[[v₃⊕v₅]≈v₄]]]]].   (2)

[So according to the context, (1) might be an expression that is not a
formula, or an abbreviation for the formula (2).]
When in a mathematical discussion a statement of the form 'If A then
B' or of the form 'A implies B' is made, it is intended to mean that B is
true if A is true, but if A is false then B may be either true or false. So the
statement 'If A then B' has the same meaning as 'Either A is false or B is
true'. It is handy to have a counterpart of such statements in L, so we may
abbreviate formulas of the form '[[¬φ]∨ψ]' by 'φ→ψ' and read these as 'If
φ then ψ' or as 'φ implies ψ'. Similarly, a mathematical statement of the
form 'A if and only if B' means that either A and B are both true or both
false, i.e., if A then B, and if B then A. The formalized counterpart of this
in L is '[φ→ψ]∧[ψ→φ]', which we further abbreviate as 'φ↔ψ' and read
'φ if and only if ψ'.

Definition 10.9. The variable v is free at k for the term τ in the formula φ if
i. v occurs free at k in φ, and
ii. if u is any variable occurring in τ, and φ′ is the result of replacing v at
the kth place in φ by u, then u is free at k in φ′.

For example, if v is v₂, τ is [v₁+v₂]·v₁, and φ is
∃v₃[v₂≈v₃] ∧ ∀v₁[v₂+v₁≈v₁], then v₂ is free for τ at its first occurrence but
not at its second.
We say that v is free for τ in φ if every free occurrence of v in φ is free
for τ in φ.
If we let φ, τ₁, τ₂ be respectively [∃v₁[v₁+v₂≈3]] ∧ [∀v₃[v₁·v₂≈v₃]],
v₁·v₂, and v₃·v₄, then v₁ is free for τ₁ in φ but not for τ₂, v₂ is free for
neither τ₁ nor τ₂, and v₃ is free for both τ₁ and τ₂.
When v_{i₁}, v_{i₂}, ..., v_{i_k} is a list of the variables occurring free in φ and
i₁ < i₂ < … < i_k, we may indicate this by writing φ(v_{i₁}, v_{i₂}, ..., v_{i_k}). Then if
v_{i_j} is free for t_j in φ for each j ≤ k, we write φ(t₁, t₂, ..., t_k) for the result
of replacing each free occurrence of v_{i_j} by t_j. For example, if φ is
∀v₃[v₂+3≈v₃] ∧ ∃v₂[v₂≈v₃], we may write φ(v₂, v₃) to indicate that v₂, v₃
are the only variables occurring free in φ. Then φ(v₂+8, v₄) is the formula
   ∀v₃[[v₂+8]+3≈v₃] ∧ ∃v₂[v₂≈v₄].
If v_{i_j} is not free for t_j in φ(v_{i₁}, ..., v_{i_k}) for some j ≤ k, then by
φ(t₁, ..., t_k) we mean the formula that is obtained from φ as follows. Let
u₁, u₂, ..., u_l be a list of the bound variables occurring in φ, and let
u₁′, u₂′, ..., u_l′ be distinct variables not occurring in φ or in any of the t_j's.
Let φ′ be the result of replacing each occurrence of uᵢ by uᵢ′ for i ≤ l. (By
Theorem 10.8ii, we have that for all assignments z, N⁺ ⊨ φ⟨z⟩ if and only if
N⁺ ⊨ φ′⟨z⟩.) Notice that v_{i_j} is free for t_j in φ′. Then by φ(t₁, ..., t_k) we
mean φ′(t₁, ..., t_k). For example, if φ(v₂, v₄) is ∃v₁∀v₃[v₁+v₂≈v₃·v₄], then
φ(v₁, v₂+v₃) is
   ∃v₅∀v₆[v₅+v₁≈v₆·[v₂+v₃]].
This completes our description of L and attendant semantic notions.
Definitions 10.1 b and lO.4b can be regarded as giving the complete rules
of grammar for L, while Definition 10.6 provides a complete dictionary.
The set of true assertions is then a well-defined set. Is there a decision
procedure for determining membership in this set? I.e., is there a machine-
like way of determining which assertions of L are true? In the next section
we investigate the expressive power of L and §2.12 will be devoted to this
decision problem.

EXERCISES FOR §2.10

1. Show that [[[v₁+3]·v₄]+[v₁·2]] is a term.
2. Show that [∀v₁[∀v₂[∃v₃[∃v₄[[v₁+v₃]≈[v₂+v₄]]]]]] is a formula.
3. Prove Theorem 10.5.
4. Find an assertion of L that is true just in case there are finitely many twin prime
   pairs. [The pair (p, q) is a twin prime pair if p and q are primes and q = p + 2.]
5. It is not known whether there are infinitely many twin prime pairs. Why doesn't
   Definition 10.6 along with Exercise 4 yield a solution?
6. Find an assignment that makes the following formula true:
      ∃v₂∃v₃[[[v₁+v₁]+[v₂·v₂]]≈[v₃+v₃]].
7. Let φ be ∀v₂[v₁≈v₂] ∨ ∃v₃∀v₁[v₃≈[v₂+v₁]]. Let τ₁ be [v₁·v₃], and let τ₂ be
   [v₂+v₄]. What is φ(τ₁, τ₂)?

2.11 Definability in Arithmetic


In this section we show that the expressive power of L is adequate to
provide a definition of each computable function.

Definition 11.1.
i. Let R be a k-relation. Say that φ defines R if R = {(n₁, ..., n_k):
   N⁺ ⊨ φ(n₁, ..., n_k)}.
ii. The k-function f is defined by φ if the (k+1)-relation f(n̄) = y is
   defined by φ.
iii. A function or relation is arithmetical if there is some formula that
   defines it.

For example, let Div be the formula ∃v₃[v₁·v₃≈v₂]. Then clearly Div
defines the 2-relation x|y.
As another example, Prime is defined by the following formula:
   ∀v₂[Div(v₂, v₁) → [[v₂≈1] ∨ [v₂≈v₁]]] ∧ [¬[v₁≈1]].
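We can check that the condition expressed by the Prime formula, namely that every divisor of v₁ is 1 or v₁ itself and that v₁ is not 1, picks out exactly the primes. Since any divisor of n is at most n, the universal quantifier can be bounded here, giving an executable sketch (names illustrative):

```python
def div(a, b):
    """The relation defined by Div: some x has a*x = b."""
    return any(a * x == b for x in range(1, b + 1))

def prime_by_definition(n):
    """The condition expressed by the Prime formula."""
    return n != 1 and all(not div(d, n) or d == 1 or d == n
                          for d in range(1, n + 1))

def prime_naive(n):
    return n > 1 and all(n % d for d in range(2, n))

assert all(prime_by_definition(n) == prime_naive(n) for n in range(1, 50))
print([n for n in range(1, 30) if prime_by_definition(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```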

Theorem 11.2. Every computable function is arithmetical.

PROOF: Let Arth be the set of arithmetical functions. By Theorem 9.3 it is
enough to prove that Rec ⊆ Arth. Recalling the definition of Rec, the proof
breaks up into three cases as follows:
i. For all k, t the function P_{k,t} belongs to Arth; also +, ·, R_< ∈ Arth.
   Indeed, + is defined by v₁+v₂≈v₃, · is defined by v₁·v₂≈v₃, and R_< is
   defined by
      [[∃v₄[v₁+v₄≈v₂]] ∧ [v₃≈1]] ∨ [[¬∃v₄[v₁+v₄≈v₂]] ∧ [v₃≈2]],
   and P_{k,t} by [v₁≈v₁] ∧ … ∧ [v_k≈v_k] ∧ [v_{k+1}≈v_t].

ii. Arth is closed under composition. For suppose that the k-function f is
   defined by ψ and that the l-functions g₁, ..., g_k are defined by φ₁, ..., φ_k
   respectively. Then the composition of f with the g's is defined by
      ∃v_{l+2} ⋯ ∃v_{l+k+1}[φ₁(v₁, ..., v_l, v_{l+2}) ∧ ⋯ ∧ φ_k(v₁, ..., v_l, v_{l+k+1})
         ∧ ψ(v_{l+2}, ..., v_{l+k+1}, v_{l+1})].
iii. The set of arithmetical functions is closed under the restricted μ-opera-
   tor. For suppose that g is an arithmetical (k+1)-function with φ
   defining g. Suppose further that for all k-tuples n̄ there is an m such
   that g(n̄, m) = 1. Then the k-function f whose value at n̄ is μx(g(n̄, x) = 1)
   has as a defining formula
      φ(v₁, ..., v_k, v_{k+1}, 1) ∧ (∀v_{k+2})[v_{k+1} ≤ v_{k+2} ∨ ¬φ(v₁, ..., v_k, v_{k+2}, 1)],
   where vᵢ ≤ vⱼ abbreviates [vᵢ≈vⱼ] ∨ ∃v_l[vᵢ+v_l≈vⱼ].

In view of Definition 9.2, this completes the proof that Rec ⊆ Arth. ∎

Theorem 11.3. All machine enumerable relations (and hence all computable
relations) are arithmetical.
PROOF: We first show that the computable relations are arithmetical. To
say that the k-relation S is computable means that its representing func-
tion R_S is computable. By Theorem 11.2, R_S is arithmetical. Let φ define
R_S. Then clearly, S is defined by
   φ(v₁, v₂, ..., v_k, 1).
Hence all computable relations are arithmetical.
Now let S′ be a machine enumerable k-relation. By Theorem 8.9 there
is a computable (k+1)-relation S such that for all k-tuples n̄
   S′(n₁, ..., n_k) if and only if (∃m)S(n₁, ..., n_k, m).
As we have just observed, there is a formula χ that defines S. Hence S′ is
defined by
   ∃v_{k+1} χ(v₁, ..., v_k, v_{k+1}). ∎
The converse of Theorem 11.3 is false. There are sets that are arithmeti-
cal but not machine enumerable. In fact, if X is any machine enumerable
non-computable set (cf. Example 8.2), then N⁺ - X is not machine enu-
merable, by Corollary 8.7. However, N⁺ - X is arithmetical, since by
Theorem 11.3 there is a formula φ that defines X, and so clearly ¬φ
defines N⁺ - X.
Since there are only countably many formulas, it follows that there are
only countably many arithmetical sets. Hence the vast majority of sets are
not arithmetical. In the next section we shall exhibit an important example
of a set that is not arithmetical.

EXERCISES FOR §2.11

1. Find a formula φ that defines the function
      f(m, n) = the greatest common divisor of m and n.
2. Give an example of an arithmetical non-computable function.
3. Is there an arithmetical 2-function F such that for every arithmetical 1-function
   g there is some m for which
      F(m, n) = g(n)
   for all n?

2.12 The Decision Problem for Arithmetic


Let S(m) be the statement 'm is the number of a machine that halts on the
input m'. Is there an effective procedure for deciding for which m the
statement S(m) is true? If such a decision procedure existed, then there
would be an effective way of determining membership in the set
{m: S(m)}. By Turing's thesis, this set would then be computable, but as
we have seen in Example 6.4, it is not computable. Hence there is no
algorithm for deciding for which m the sentence S(m) is true.
What does this have to do with arithmetic? First of all, S(m) can be
viewed as a statement about arithmetic. Indeed, there is a formula φ(v₁) of
L such that the assertion φ(m) is true iff S(m) is true. To see this let
X = {m: S(m)}. In Example 8.2 we proved that X is machine enumerable.
Hence, by Theorem 11.3, X is arithmetical. This means that there is a
formula φ(v₁) such that
   N⁺ ⊨ φ(m) iff m ∈ X.
Thus we may consider φ(m) to be the formalization in L of the statement
S(m); it is in this sense that S(m) is a statement about arithmetic.
Thus we see that there is no effective procedure that will work for all m
for determining the truth of the assertion <p(m) where <p is as above.
From this we conclude that there is no decision procedure for determining
which assertions of L are true. In other words, if T is the set of true
assertions of L, then no algorithm exists for determining membership in T.
For such an algorithm, when applied to an assertion of the form <p(m), <p as
above, would enable us to decide whether or not <p(m) is true.
Moreover, this argument applies to any language whose set of assertions
contains the assertions <p(m). Thus we see that there is no decision procedure
for arithmetic.
Indeed, we can say more: There is no effective way of listing the true
assertions of L. In fact, if Z is any set of assertions in any language
extending L, and Z contains all assertions φ(m) as well as all assertions
¬φ(m), then no machinelike procedure exists for enumerating the true
assertions of Z. For if such a procedure existed, then one could check

whether or not φ(m) is true by enumerating the true assertions in Z until
φ(m) is reached or ¬φ(m) is reached.
Of course one fully expects that a reasonable numbering of the asser-
tions of L (or the assertions of some richer language) would enable us to
argue that the set of numbers of true sentences is not machine enumerable,
and indeed this is the case. However, instead of doing this directly, we
shall prove a much stronger result of Tarski, namely, that "truth is not
arithmetical." More precisely, we shall associate with each assertion σ of L
a number #σ, and then prove that the set {#σ: N⁺ ⊨ σ} is not arithmetical,
and so by Theorem 11.3 not machine enumerable. However, one should
notice that non-arithmetical sets are, in a sense, quite plentiful. For L has
countably many symbols, and so only countably many formulas, since
each formula is a finite sequence of symbols (see Theorem 4.12 in Part I).
Since each arithmetical set has a defining formula in L, there are at most
countably many arithmetical subsets of N⁺. Hence 2^ω subsets of N⁺ are
not arithmetical (cf. Theorem 5.3 in Part I).
Of course, there must be some restrictions on the numbering #, for
otherwise nothing can be said about {#σ: N⁺ ⊨ σ}. Indeed, if X is any
infinite subset of N⁺ such that N⁺ - X is also infinite, then there is a 1-1
correspondence between the assertions of L and N⁺ under which the true
assertions of L are exactly those that correspond to elements of X. Since X
can be computable, machine enumerable but not computable, arithmetical
but not machine enumerable, or not arithmetical, we see that arbitrary
numberings of the assertions of L are meaningless; such numberings will
tell us nothing about the set of true assertions.
To assure a meaningful relation between 0 and # 0, we should restrict
our attention to numberings that are effective in the sense that there is a
decision procedure by which given an assertion 0 we can find its number,
and given a number we can decide if it corresponds to an assertion, and if
so, to what assertion. Given such a numbering #, the computability of
{#o: NF-o} implies a decision procedure for membership in {o: NF-o},
and conversely; machine enumerability of {#o: NF-o} implies the ex-
istence of an algorithm for enumerating {o: NF-o}, and conversely.
Now suppose that # is a numbering of the assertions of L in the sense
just described. Let σ be an assertion, and let c be the largest n such that n̄
occurs in σ (if such an n exists). Let s(σ,n) be the result of replacing c̄
throughout σ by n̄. If no constant symbol occurs in σ, then s(σ,n) is just σ.
Now define the substitution function S as follows:

S(m,n) = #s(σ,n) if m = #σ for some σ.
Assuming that # is effective, it is clear that there is an algorithm for
computing S, and that S can be extended to an effectively computable
function whose domain consists of all pairs of positive integers. These
considerations are intended to show that the assumptions in the next
theorem are natural. In fact, in the hypotheses of the theorem we assume
only that S has an arithmetical extension, and this assumption is not as
strong as the assumption that S is computable (see Theorem 11.2 and the
last two paragraphs of §2.11).

122 II An Introduction to Computability Theory

Lemma. There is no arithmetical 2-relation R such that for each arithmetical
1-relation P there is an m for which

Rmn iff Pn

for all n ∈ N⁺.

PROOF: This is a straightforward diagonal argument. If such an R exists,
then R is represented by a formula, say φ(v₀,v₁). Then ¬φ(v₀,v₀) represents
¬Rnn, and so ¬Rnn is an arithmetical 1-relation. But then there is an m₀
such that Rm₀n iff ¬Rnn for all n, by our assumption on R. Taking n = m₀
gives Rm₀m₀ iff ¬Rm₀m₀, a contradiction. Hence no such R exists. □
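The diagonal argument can also be seen concretely: whatever 2-relation R one writes down, the 1-relation P(n) = not-Rnn can never agree with any row R(m,−), since at n = m the row would have to disagree with itself. A small illustrative sketch (the sample relation below is arbitrary, not tied to any particular arithmetical relation):

```python
# Diagonalization: for ANY 2-relation R on {1..N}, the 1-relation
# P(n) = not R(n, n) differs from every row R(m, -): at n = m the row
# would have to satisfy R(m, m) iff not R(m, m), which is impossible.

def diagonal_escapes(R, N):
    """Check that no m in 1..N has R(m, n) iff P(n) for all n in 1..N."""
    P = lambda n: not R(n, n)
    return not any(
        all(bool(R(m, n)) == P(n) for n in range(1, N + 1))
        for m in range(1, N + 1))

# a sample relation, chosen arbitrarily:
print(diagonal_escapes(lambda m, n: (m * n) % 3 == 1, 50))  # True
```

The check succeeds no matter which R is supplied, which is exactly the content of the lemma restricted to a finite portion of N⁺.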
Theorem 12.1 (Tarski). Let # be a 1–1 function from assertions to positive
integers such that the substitution function S (defined above) has an
arithmetical extension. Let 𝒯 = {#σ : N⁺ ⊨ σ}. Then 𝒯 is not arithmetical,
and so 𝒯 is neither computable nor machine enumerable.

PROOF: We argue by contradiction. Assume that the substitution function
has an arithmetical extension defined by the formula 𝒮. Assume also that 𝒯 is
arithmetical, say 𝒯 is defined by τ. Now define Rm,n to hold iff
N⁺ ⊨ ∃v₁[τ(v₁) ∧ 𝒮(m̄,n̄,v₁)]. R is an arithmetical 2-relation. Moreover, if P
is an arithmetical 1-relation (say P is defined by φ), then Pn iff R(#φ′,n),
where φ′ is φ(k̄), k being the first l such that l̄ does not occur in φ. But this is
impossible by the lemma. Hence 𝒯 is not arithmetical, as needed. The fact
that 𝒯 is not machine enumerable (and hence not computable) follows
from Theorem 11.3. □
Can the hypotheses of the theorem be satisfied; i.e., is there a function
# that numbers the assertions of L in such a way that S is arithmetical?
The answer is yes, as we shall now show, following a procedure analogous
to that in §2.7, where machines, tape positions, computations, etc., were
coded as numbers. In fact we display a # such that S is computable.
First assign to each symbol s of L a number #′(s) as indicated below:

s:      +, ·, =, ∧, ∨, ¬, ∀, ∃, [, ], vᵢ, n̄
#′(s):  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9+2i, 10+2n.
If s₁s₂⋯sₙ is a sequence of symbols of L, we let

#(s₁s₂⋯sₙ) = ∏_{i=1}^{n} Prm(i)^{#′(sᵢ)}.

The numbering # is defined for all expressions of L, and so #σ is defined
for all assertions σ. Clearly this numbering is effective in the sense
discussed above, and so we have every reason to suspect that S has a
computable extension and hence an arithmetical one.
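The numbering just defined is easy to carry out mechanically. The following sketch implements #′ and # for a toy rendering of L; the ASCII stand-ins for the logical symbols are our own choice, but the codes are those of the table above:

```python
# Goedel numbering sketch: symbol codes from the table above; a sequence
# s1...sn gets the number Prm(1)^code(s1) * ... * Prm(n)^code(sn).

SYMBOL_CODE = {'+': 1, '*': 2, '=': 3, '&': 4, 'v': 5, '~': 6,
               'A': 7, 'E': 8, '[': 9, ']': 10}   # ASCII stand-ins, ours

def code(sym):
    """('var', i) stands for the variable v_i, ('num', n) for the numeral n."""
    if isinstance(sym, tuple):
        kind, i = sym
        return 9 + 2 * i if kind == 'var' else 10 + 2 * i
    return SYMBOL_CODE[sym]

def nth_prime(n):
    """Prm(n): the n-th prime (Prm(1) = 2)."""
    count, p = 0, 1
    while count < n:
        p += 1
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            count += 1
    return p

def godel_number(symbols):
    g = 1
    for i, s in enumerate(symbols, start=1):
        g *= nth_prime(i) ** code(s)
    return g

# The assertion [ 1 = 1 ] (with numerals): codes 9, 12, 3, 12, 10, so its
# number is 2^9 * 3^12 * 5^3 * 7^12 * 11^10.
print(godel_number(['[', ('num', 1), '=', ('num', 1), ']']))
```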
2.12 The Decision Problem for Arithmetic 123

To prove this, we first consider the 3-function f defined as follows:
f(k,l,m) = 1 if m = 1; and

f(k,l,m) = p₁^{n₁′} p₂^{n₂′} ⋯ p_r^{n_r′} if m = p₁^{n₁} p₂^{n₂} ⋯ p_r^{n_r},

where the p's are distinct primes, nᵢ′ = nᵢ when nᵢ ≠ k, and nᵢ′ = l when
nᵢ = k.

f is computable, as we see by writing it as

f(k,l,m) = μx (∀y ≤ m)[(Exp(y,m) ≠ k ∧ Exp(y,x) = Exp(y,m))
                        ∨ (Exp(y,m) = k ∧ Exp(y,x) = l)].

Now we need a computable function that will pick out the constant with
the largest index occurring in a sequence:

q(n) = μx (∀y ≤ n)(∀z ≤ n)(Exp(y,n) ≠ 10 + 2(x+z)).

We can now define

S′(m,n) = f(10 + 2q(m), 10 + 2n, m).

Clearly S′ is computable, hence arithmetical, and S′ is an extension of S,
as was to be shown.
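The exponent-replacement function f and the substitution function S′ built from it can be sketched directly, computing with prime factorizations rather than with the μ-operator (the helper names here are ours):

```python
# Sketch of f(k, l, m): replace every exponent equal to k by l in the prime
# factorization of m. q(m) finds the largest constant index (constant codes
# 10 + 2n are exactly the even codes > 10), and S'(m, n) renumbers that
# constant to n.

def exponents(m):
    """Exponents of 2, 3, 5, ... in m, up to m's largest prime factor."""
    exps, p = [], 2
    while m > 1:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        exps.append(e)
        p += 1
        while any(p % d == 0 for d in range(2, int(p ** 0.5) + 1)):
            p += 1
    return exps

def rebuild(exps):
    """Inverse of exponents: multiply successive primes to the given powers."""
    result, p = 1, 2
    for e in exps:
        result *= p ** e
        p += 1
        while any(p % d == 0 for d in range(2, int(p ** 0.5) + 1)):
            p += 1
    return result

def f(k, l, m):
    return 1 if m == 1 else rebuild([l if e == k else e for e in exponents(m)])

def q(m):
    consts = [(e - 10) // 2 for e in exponents(m) if e > 10 and e % 2 == 0]
    return max(consts, default=0)

def S_prime(m, n):
    return f(10 + 2 * q(m), 10 + 2 * n, m)

# Number of [ 1 = 1 ] in the coding above; replacing the numeral 1 by the
# numeral 3 turns the exponents 12 into 16:
m = 2**9 * 3**12 * 5**3 * 7**12 * 11**10
print(S_prime(m, 3) == 2**9 * 3**16 * 5**3 * 7**16 * 11**10)  # True
```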

EXERCISES FOR §2.12

1. If we replace N⁺ by I, the set of integers {0, 1, −1, 2, −2, …}, in Definition 10.6,
and take our assignments z with values in I, then we obtain the notion I ⊨ φ(z).
Replacing N⁺ by I in Definition 11.1 gives us the definition of Arthᴵ.
(a) Is {#σ : I ⊨ σ} ∈ Arthᴵ?
(b) Is {#σ : I ⊨ σ} computable?
(c) Is {#σ : N⁺ ⊨ σ} ∈ Arthᴵ?
(Hint: If n is an integer, then n ∈ N⁺ iff I ⊨ (∃s,t,u,v)[s² + t² + u² + v² = n̄].)
2. Let sq(n) = n² for all n ∈ N⁺. Alter L by replacing · with sq. Let X be the set of
assertions true in N⁺ in the new language. Show that X is not computable. [Hint:
r = m·n iff r + r = sq(m+n) − sq(m) − sq(n). Now show that there is a computable
function taking assertions σ ∈ L into assertions σ′ in the new language such that σ
is true iff σ′ is true.]
3. Let Sq(n) be the relation on the integers 'n is a square', i.e., Sq(n) iff n = m² for
some m ∈ I. Replace · in L by Sq, getting a language L′, and define term,
formula, and satisfaction for L′-formulas accordingly. Show that the set of true
L′-assertions is not computable. [Hint: Refer to the hint in Exercise 1. Also
notice that since (x+1)² = x² + 2x + 1, we have y = x² iff
Sq(y) ∧ ∃z(Sq(z) ∧ y < z ∧ ¬(∃u)(Sq(u) ∧ y < u ∧ u < z) ∧ z = y + x + x + 1).
Since (x+y)² = x² + 2xy + y², we have z = x·y iff x² + z + z + y² = (x+y)².]
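The identities behind these hints, namely that multiplication is recoverable from squaring because (m+n)² − m² − n² = 2mn, can be checked numerically over a small range:

```python
# Checking the hint identities: since (m+n)^2 - m^2 - n^2 = 2mn, we have
# r = m*n iff r + r = sq(m+n) - sq(m) - sq(n), and likewise
# z = x*y iff x^2 + z + z + y^2 = (x+y)^2.

sq = lambda n: n * n

ok = all(
    ((r + r == sq(m + n) - sq(m) - sq(n)) == (r == m * n)) and
    ((sq(m) + r + r + sq(n) == sq(m + n)) == (r == m * n))
    for m in range(1, 25) for n in range(1, 25) for r in range(1, 700))
print(ok)  # True
```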

2.13 Axiomatizing Arithmetic


In the last section we saw that no decision procedure exists for determining
which assertions of L are true; indeed, we proved that the true assertions
cannot be effectively enumerated. Now we investigate the possibility of
summarizing the true statements of arithmetic by means of an axiom
system.
In the beginning of the usual high school geometry course one is given a
set of statements about points and lines called axioms. In this context a
proof is a finite sequence of statements, each being an axiom or a "logical
consequence" of preceding statements. A theorem is the last statement of a
proof. With this example in mind, we make the following definition.
Definition 13.1.
i. An axiom is an assertion of L.
ii. A k-premise rule of inference is a (k+1)-tuple (σ₁,…,σₖ,σₖ₊₁), where
each σᵢ is an L-assertion and σᵢ ≠ σⱼ for 1 ≤ i < j ≤ k. σₖ₊₁ is the conclusion
of the rule, and the σᵢ's for i ≤ k are the premises.
iii. An axiom system is a set 𝔖 = 𝔄 ∪ ℜ, where 𝔄 is a set of axioms and ℜ
is a set of rules of inference. If we want to make explicit mention of the
axioms and rules, we write 𝔖(𝔄,ℜ) instead of 𝔖.
iv. A finite sequence (p₁,p₂,…,pₙ) is an 𝔖(𝔄,ℜ)-proof if for each pᵢ either
(a) pᵢ ∈ 𝔄, or
(b) there is a k-premise rule (σ₁,…,σₖ,pᵢ) in ℜ such that for all l ≤ k,
σₗ ∈ {pⱼ : j < i}.
v. An 𝔖-theorem is the last line of an 𝔖-proof. We write ⊢_𝔖 σ if σ is an
𝔖-theorem.

What requirements should an axiom system meet in order that it may
provide a fully adequate summary of the assertions true in N⁺? Ideally,
every assertion true in N⁺ should be provable in 𝔖, and every 𝔖-theorem
should be true in N⁺, i.e., ⊢_𝔖 σ iff N⁺ ⊨ σ for every L-assertion σ. Of course,
these conditions are met by the axiom system 𝔖₁(𝔄₁,ℜ₁) where 𝔄₁ =
{σ : N⁺ ⊨ σ} and ℜ₁ is empty. Every theorem σ has a 1-term 𝔖₁-proof,
namely (σ). Clearly, there is no effective method for deciding whether a
given sequence of L-assertions is an 𝔖₁-proof, for such a method when
applied to 1-term sequences would provide a test for truth in N⁺.
Axiom systems for which there is no decision procedure for determining
which sequences are proofs are of little interest. For an axiom system 𝔖 to
be of some use, one should be able to decide if an alleged 𝔖-proof is in
fact an 𝔖-proof. As we show in Theorem 13.7, a system that is effectively
given in the sense of Definition 13.2 below has this property.

Definition 13.2. Let 𝔖(𝔄,ℜ) be an axiom system.
i. 𝔖 is effectively given if there is a decision procedure for determining
membership in the sets 𝔄 and ℜ.
ii. 𝔖 is correct for N⁺ if for all L-assertions σ, ⊢_𝔖 σ implies N⁺ ⊨ σ.
iii. 𝔖 is complete for N⁺ if for all L-assertions σ, N⁺ ⊨ σ implies ⊢_𝔖 σ.

EXAMPLE 13.3. 𝔖₁ (described above) is complete and correct for N⁺, but
not effectively given.

EXAMPLE 13.4. Let 𝔄₂ = {1 = 1} and ℜ₂ = {(1 = 1, σ) : σ an L-assertion}. Let
𝔖₂ = 𝔖₂(𝔄₂,ℜ₂). Then 𝔖₂ is effectively given and complete for N⁺, but not
correct for N⁺.

EXAMPLE 13.5. Let 𝔖₃ = 𝔖₃(𝔄₃,ℜ₃), where 𝔄₃ = {¬[1 = 1]} and
ℜ₃ = {(¬[1 = 1], σ) : σ is an L-assertion}. Then 𝔖₃ is effectively given and
complete for N⁺, but not correct for N⁺.

EXAMPLE 13.6. Let 𝔖₄ = 𝔖₄(𝔄₄,ℜ₄), where 𝔄₄ is the set of all atomic
assertions that are true in N⁺, and ℜ₄ is the set of all sequences
(σ₁,…,σₖ,[σ₁ ∧ ⋯ ∧ σₖ] ∨ ρ) where k ∈ N⁺ and σ₁,…,σₖ,ρ are L-asser-
tions. Then 𝔖₄ is effectively given and correct for N⁺, but is not complete
for N⁺. Indeed, ¬[1 = 2] is true in N⁺ but is not an 𝔖₄-theorem, for
¬[1 = 2] is clearly not an axiom or a conclusion of an 𝔖₄-rule.

Lemma. There is an effective procedure for enumerating the set of finite
sequences of L-assertions.

PROOF: Let # be the numbering described at the end of §2.12. If n is of
the form ∏_{i=1}^{k} Prm(i)^{#σᵢ} for some σ₁,…,σₖ, we let the nth member of
the enumeration be (σ₁,…,σₖ). Otherwise the nth member is (1 = 1). □

Theorem 13.7. Let 𝔖 be effectively given. Then

i. there is an effective way of deciding which finite sequences of L-assertions
are 𝔖-proofs;
ii. there is an effective procedure for enumerating the set of 𝔖-theorems.

PROOF OF i. Let X be the set of all finite sequences of the form
(σ₁,…,σₖ,σₖ₊₁) where σₖ₊₁ is the conclusion of an 𝔖-rule whose premises
belong to {σ₁,…,σₖ}. We claim that X is decidable. For given a sequence
(σ₁,σ₂,…,σₖ,σₖ₊₁), we can find the finite set Y of those sequences
(ρ₁,…,ρₗ,σₖ₊₁) having the property that l ≤ k and each ρᵢ is some σⱼ for
j ≤ k. Since 𝔖 is effectively given, we can decide if any member of Y
belongs to ℜ. Since (σ₁,…,σₖ,σₖ₊₁) ∈ X iff Y ∩ ℜ ≠ ∅, this gives an effec-
tive test for membership in X.
Now let (δ₁,…,δₘ) be a sequence of L-assertions. Clearly, this sequence
is an 𝔖-proof iff for each n ≤ m either
i. δₙ is an 𝔖-axiom, or
ii. (δ₁,…,δₙ) ∈ X (X as above).

We have just seen that there is an effective test for membership in X. Also
there is an effective test for membership in the set of axioms of 𝔖.
Applying these tests for n = 1,2,…,m enables us to decide if (δ₁,…,δₘ) is
an 𝔖-proof or not. □
PROOF OF ii. If there are no axioms in the system 𝔖, then the set of all
𝔖-theorems is empty and there is nothing to prove. So suppose that α is an
axiom of 𝔖. By the lemma, there is an effective enumeration of the finite
sequences of L-assertions. Let s₁, s₂, … be such an enumeration, with
sᵢ = (σ_{i,1},…,σ_{i,kᵢ},σ_{i,kᵢ+1}). We enumerate the theorems of 𝔖 by letting αₙ be α
if sₙ is not a proof, and letting αₙ = σ_{n,kₙ+1} otherwise. Clearly, this is an
effective enumeration. □
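The decision procedure of part i is simple enough to sketch. In the sketch below, assertions are modeled as strings and an axiom system by a pair of decision procedures; the miniature rule set is our own illustration, not one of the systems in the text:

```python
# Proof checking per Definition 13.1(iv): each line must be an axiom or the
# conclusion of a rule all of whose premises occur earlier in the sequence.
from itertools import chain, permutations

def is_proof(seq, is_axiom, is_rule):
    for i, p in enumerate(seq):
        if is_axiom(p):
            continue
        earlier = set(seq[:i])
        # the finite set Y of candidate premise tuples drawn from earlier lines
        candidates = chain.from_iterable(
            permutations(earlier, k) for k in range(1, len(earlier) + 1))
        if not any(is_rule(prem, p) for prem in candidates):
            return False
    return True

# Axioms: two true atomic facts; one 2-premise rule: from s1, s2 infer "s1 & s2".
is_axiom = lambda s: s in {"1=1", "2=2"}
is_rule = lambda prem, concl: len(prem) == 2 and concl == prem[0] + " & " + prem[1]

print(is_proof(["1=1", "2=2", "1=1 & 2=2"], is_axiom, is_rule))  # True
print(is_proof(["1=1 & 2=2"], is_axiom, is_rule))                # False
```

Because the rule test and axiom test are total decision procedures, the checker itself always halts, which is the content of part i.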
Theorem 13.8. No axiom system 𝔖 can satisfy all three of the following
conditions:
i. 𝔖 is effectively given,
ii. 𝔖 is correct for N⁺,
iii. 𝔖 is complete for N⁺.
PROOF: Suppose 𝔖 is effectively given, correct for N⁺, and complete for
N⁺. By the preceding theorems, there is an effective way of enumerating
{σ : ⊢_𝔖 σ}. Also {σ : ⊢_𝔖 σ} = {σ : N⁺ ⊨ σ}. But this contradicts the fact that no
effective enumeration exists for {σ : N⁺ ⊨ σ}, as shown in Theorem 12.1. □

As one expects, the last two theorems can be placed in the context of
machine enumerability. This presupposes a numbering # of the L-asser-
tions.

Definition 13.2′. Let # be a numbering of the assertions of L. Say that
𝔖(𝔄,ℜ) is #-effectively given if both of the following sets are computable:

i. {#σ : σ ∈ 𝔄},
ii. {∏_{i=1}^{k+1} Prm(i)^{#σᵢ} : (σ₁,…,σₖ,σₖ₊₁) ∈ ℜ}.

Theorem 13.7′. Let 𝔖 be #-effectively given. Then

i. {∏_{i=1}^{n} Prm(i)^{#σᵢ} : (σ₁,…,σₙ) is an 𝔖-proof} is computable;
ii. {#σ : ⊢_𝔖 σ} is machine enumerable.

PROOF OF i. Our argument is the obvious modification of that given
in Theorem 13.7. Let 𝔖 = 𝔖(𝔄,ℜ), 𝔄# = {#σ : σ ∈ 𝔄}, and
ℜ# = {∏_{i=1}^{k+1} Prm(i)^{#σᵢ} : (σ₁,…,σₖ,σₖ₊₁) ∈ ℜ}. Let X# be the set of all
numbers of the form ∏_{i=1}^{l+1} Prm(i)^{nᵢ}, where n_{l+1} is the number of the
conclusion of some 𝔖-rule (σ₁,…,σₖ,σₖ₊₁) such that {#σ₁,…,#σₖ} ⊆
{n₁,…,nₗ}. X# is computable, since n ∈ X# iff

Seq n ∧ (∃y ≤ nⁿ)(ℜ# y ∧ Exp(Ln y, y) = Exp(Ln n, n) ∧
(∀z < Ln y)(∃w < Ln n)(Exp(z,y) = Exp(w,n))).

It follows that n is the number of an 𝔖-proof iff

Seq n ∧ (∀x ≤ Ln n)(𝔄#(Exp(x,n)) ∨ X#(∏_{i=1}^{x} Prm(i)^{Exp(i,n)})).

This proves part i. □
PROOF OF ii. We have this immediately if {σ : ⊢_𝔖 σ} is empty. So suppose
⊢_𝔖 σ*. Define
f(n) = Exp(Ln n, n) if n is the number of an 𝔖-proof, and f(n) = #σ* otherwise.
From part i we see that f is computable, and clearly the range of f is {#σ :
⊢_𝔖 σ}. Hence {#σ : ⊢_𝔖 σ} is machine enumerable, as asserted in part ii. □

Theorem 13.8′. Let # be a 1–1 function from assertions to positive integers
such that the substitution function S (defined in §2.12) has an arithmetical
extension. Then no axiom system 𝔖 can satisfy all three of the following
conditions:
i. 𝔖 is #-effectively given,
ii. 𝔖 is correct for N⁺,
iii. 𝔖 is complete for N⁺.
PROOF: Suppose # is as specified and that 𝔖 satisfies i, ii, and iii.
Condition i allows us to use Theorem 13.7′ to conclude that {#σ : ⊢_𝔖 σ} is
machine enumerable. By ii and iii, {#σ : ⊢_𝔖 σ} = {#σ : N⁺ ⊨ σ}. But then
{#σ : N⁺ ⊨ σ} is machine enumerable, contradicting Theorem 12.1. □
Definition 13.9. σ is undecidable with respect to 𝔖 if neither ⊢_𝔖 σ nor ⊢_𝔖 ¬σ.

Given a function # from assertions to positive integers, we define the
diagonal function D by

D(#σ) = #¬s(σ, #σ),

where s is the substitution operation defined in §2.12. If we can effectively find #σ
from σ and σ from #σ, then surely there is an algorithm for computing D. For
example, the numbering described near the end of §2.12 has this property. In the
next theorem we assume that D has an arithmetical extension, an assump-
tion much weaker than the assumption that D is computable.
Notice that if # is as above and 𝔖 is #-effectively given and correct
for N⁺, then Theorem 13.8′ assures the existence of assertions that are
undecidable with respect to 𝔖. The next theorem displays such an asser-
tion.
Let φ(u) be a formula and n the least number k such that k̄ does not
occur in φ. By φ° we mean the result of replacing each free occurrence of u
in φ by n̄.

Recall that every machine enumerable set is definable in N⁺ (Theorem
11.3). We use this fact in the following stronger version of Theorem 13.8′.

Theorem 13.10. Let # be a 1–1 function from assertions to positive integers
such that the diagonal function has an arithmetical extension. Let the axiom
system 𝔖 be #-effectively given and N⁺-correct. Let ψ_𝔖 and D define
{#σ : ⊢_𝔖 σ} and the diagonal function, respectively. Let φ(v₂) be
∃v₁[D(v₂,v₁) ∧ ψ_𝔖(v₁)], and let σ = φ(#φ°). Then σ is not decidable with
respect to 𝔖.

PROOF: Assume the hypotheses of the theorem and suppose ⊢_𝔖 σ. Then by
the correctness of 𝔖, N⁺ ⊨ σ. Hence for some k, N⁺ ⊨ D(#φ°, k) and
N⁺ ⊨ ψ_𝔖(k). Since D defines the diagonal function, k must be #¬φ(#φ°),
i.e., k = #¬σ. Hence N⁺ ⊨ ψ_𝔖(#¬σ). Since ψ_𝔖 defines {#ρ : ⊢_𝔖 ρ}, ⊢_𝔖 ¬σ.
This and our assumption ⊢_𝔖 σ contradict the correctness of 𝔖. Thus ⊢_𝔖 σ is
impossible.
On the other hand, suppose that ⊢_𝔖 ¬σ. Since ψ_𝔖 defines {#ρ : ⊢_𝔖 ρ},
we have N⁺ ⊨ ψ_𝔖(#¬σ). We also have N⁺ ⊨ D(#φ°, #¬σ). Hence N⁺ ⊨
∃v₁[D(#φ°,v₁) ∧ ψ_𝔖(v₁)], i.e., N⁺ ⊨ σ. But 𝔖 is correct, so that our
assumption ⊢_𝔖 ¬σ implies N⁺ ⊨ ¬σ, a contradiction. Therefore, ¬σ is not
a theorem of 𝔖. Hence, with what we have above, σ is undecidable with
respect to 𝔖. □

Thus there is no axiomatization in L that provides a completely satisfac-
tory summary of the true assertions about N⁺.
Theorem 13.8 (or 13.10) is Gödel's incompleteness theorem, so called
because it says that every effectively given correct axiomatization is
incomplete. This is one of the most outstanding results of twentieth
century mathematics, for this theorem caused a profound alteration in
views held about mathematics and science in general. No longer can
mathematics be thought of as an idealized science that can be formalized
using self-evident axioms and rules of inference in such a way that all
things true are provable. Any correct formalization whose proofs can be
checked effectively must admit undecidable assertions.
In our chapter on set theory we mentioned that the axiomatic set theory
ZFC provides an axiomatic foundation for all of classical mathematics. In
view of the incompleteness theorem, we should qualify this statement,
since Gödel's result can be extended to cover axiomatic set theory (as well
as any other axiomatic system strong enough to code an adequate notion
of proof). Virtually all mathematicians believe that ZFC is correct. If this is
so, since ZFC is effectively given, the incompleteness theorem tells us that
ZFC is incomplete, and the analog of Theorem 13.10 will produce an
undecidable sentence.

EXERCISES FOR §2.13

1. Let 𝔖 be a correct, effectively given axiomatization for N⁺, and let ψ define the
relation {#M : M(#M) = 1}.
(a) Show that there is an M* such that for all machines M
M*(#M) = 2 if ⊢_𝔖 ψ(#M),
M*(#M) = 1 if ⊢_𝔖 ¬ψ(#M).
(See Theorem 3.23.) Convince yourself that given a machine that enumerates
the axioms and rules of 𝔖, a machine M* satisfying the above conditions
can be found explicitly.
(b) Let σ be ψ(#M*). Prove that σ is undecidable.
2. Prove that the σ of Exercise 1 is false.
3. Let 𝔖 and σ be as in Exercise 1. Let 𝔖′ be the result of adding σ to the set of
axioms of 𝔖.
(a) Is 𝔖′ effectively given?
(b) Is 𝔖′ correct?
(c) Is 𝔖′ complete?

2.14 Some Directions in Current Research


A closer look at the incompleteness theorem, especially a version like that
in Exercise 1 of the last section in which a specific example of an
undecidable assertion σ is presented, leads to Gödel's second incomplete-
ness theorem. Roughly speaking, this result states that a sufficiently strong
consistent axiomatization does not yield a proof of its own consistency. A
detailed proof of this is rather messy, and so we content ourselves with the
barest elements of an outline. Let 𝔖 be effectively given. Let Prf_𝔖(n) define
the relation 'n is the number of an assertion and there is an 𝔖-proof of that
assertion'. (Notice that this relation is machine enumerable.) Let f(n) be a
definable function that maps #ψ to #¬ψ for each formula ψ. As the formal
counterpart of our assertion that 𝔖 is consistent (i.e., free of contradiction)
we take Cons_𝔖:

¬∃v[Prf_𝔖(v) ∧ Prf_𝔖(f(v))].

Let σ be the undecidable assertion of Exercises 1
and 2 of the last section. Exercise 2 can be answered in outline as follows:
i. Suppose σ is true, i.e., suppose N⁺ ⊨ σ.
ii. Since σ is true, M*(#M*) = 1, so by the definition of M*, ⊢_𝔖 ¬σ. By the
correctness of 𝔖, N⁺ ⊨ ¬σ.
iii. But then N⁺ ⊨ σ ∧ ¬σ, a contradiction. Hence N⁺ ⊨ ¬σ.
Instead of assuming the correctness of 𝔖 in step ii, let us suppose that 𝔖 is
strong enough so that whenever N⁺ ⊨ ψ(n̄), then ⊢_𝔖 ψ(n̄), where ψ is the formula
described in Exercise 1 of §2.13. This is not an outlandish requirement, and
many known axiom systems have this feature. Now the above argument

can be modified as follows:

i. If σ is true, then ⊢_𝔖 σ.
ii. If σ is true, then ⊢_𝔖 ¬σ.
iii. But 𝔖 is consistent, so not both ⊢_𝔖 σ and ⊢_𝔖 ¬σ. Hence ¬σ is true.

Now suppose that the axioms and rules of 𝔖 are strong enough to
enable one to give the above argument inside of 𝔖. Again, this is not an
outrageous requirement. Many of the interesting axiomatizations of num-
ber theory have been made up to be as strong as possible and satisfy this
requirement. Suppose also that ⊢_𝔖 Cons_𝔖. Now we can internalize the above
argument:
i. ⊢_𝔖 σ → Prf_𝔖(#σ)
ii. ⊢_𝔖 σ → Prf_𝔖(#¬σ)
iii. ⊢_𝔖 Cons_𝔖, from which we get ⊢_𝔖 ¬[Prf_𝔖(#σ) ∧ Prf_𝔖(#¬σ)]. Hence
⊢_𝔖 ¬σ. But ⊬_𝔖 ¬σ, and so we cannot have ⊢_𝔖 Cons_𝔖.

A more dramatic way of stating the second incompleteness theorem
might be: Provided 𝔖 is strong enough, any proof of the consistency of 𝔖
must use methods not formalizable within 𝔖, and hence the proof of
consistency of 𝔖 is as much in doubt as the consistency of 𝔖 itself.
The undecidability of arithmetic is one of the most important and
profound discoveries of logic. However, there are many structures that are
of interest to mathematicians, and one can ask if their theories are
decidable. The general notion of a first order theory will be discussed in
the last chapter, but the intuitive idea is straightforward. For example, the
reals (with the usual addition, multiplication, <, and 0, 1) can be thought
of as a structure ℜ = (R, +, ·, <, 0, 1). The first order language appropriate
for this structure has binary function symbols +, ·, a binary relation
symbol <, and constant symbols 0, 1. The notions of formula, satisfaction,
and truth are defined in the obvious way by modifying Definitions 10.4
and 10.6 appropriately. Now one can ask about the decidability of the set
of assertions that are true in ℜ.
Surprisingly, the set of assertions true in ℜ is decidable. In particular
this means that the natural numbers cannot be defined by a formula of this
language. Similarly the theory of the complex numbers with plus and times
is decidable. On the other hand, the set of assertions true in the rationals
with + and · is undecidable. The theory of Abelian groups (i.e., the set of
assertions in the language of group theory that are true in every Abelian
group) is decidable, but the theory of groups is not. The theory of linear
orderings, the theory of well orderings, and the theory of Boolean algebras
are decidable. The theory of an equivalence relation is decidable, but the
theory of two equivalence relations is not. The theory of a symmetric
reflexive relation is not decidable either. The theory of a single unary
function is decidable, but the theory of two such or of one binary function
is not. On the other hand, the theory of two successor functions or even the

theory of countably many successor functions is decidable. Such results are
most surprising when a seemingly weak theory turns out to be undecidable
or a seemingly strong theory turns out to be decidable.
Of course it is not necessary to limit oneself to first order languages
when considering questions of decidability. For example, one can consider
a language extending L that not only has variables ranging over elements
but also has variables ranging over relations. This language is called the
second order calculus. Although the set of first order assertions true in
(N, <) is a decidable set, the set of second order assertions true in (N, <)
is not. A remarkable theorem along these lines proved by Rabin is that the
monadic second order theory of countably many successor functions is
decidable. (The second order variables of the monadic second order
calculus are restricted to range over sets of elements of the structure in
question and not over relations of arbitrary arity.)
Proceeding in another direction, one can restrict attention to assertions
that have some special form. For example, the set of Diophantine equa-
tions that are solvable in N is undecidable. A Diophantine equation has
the form

Q(x₀,…,xₖ) = 0,

where Q(x₀,…,xₖ) is a polynomial with integer coefficients in the variables
x₀,…,xₖ, and we ask for n₀,…,nₖ ∈ N such that Q(n₀,…,nₖ) = 0. This
problem was first raised by Hilbert in 1900 and finally solved by
Matijasevic in 1970. Thus the decision problem for N has a negative solu-
tion even if we restrict our attention to just the assertions
∃x₀,…,xₖ[Q(x₀,…,xₖ) = 0]. What Matijasevic shows is that there is a
formula E(u,v) of the form ∃x₀,…,xₖ[Q(x₀,…,xₖ,u,v) = 0], where
Q(x₀,…,xₖ,u,v) = 0 is Diophantine, and for every machine enumerable set
X there is an n for which X = {x : N ⊨ E(x̄,n̄)}. Hence taking X to be any
machine enumerable undecidable set gives us a set of assertions
{E(x̄,n̄) : x ∈ N} which is undecidable.
As a corollary to this result, we get a polynomial P(x₀,…,xₖ,u,v) with
coefficients from N and domain N^{k+2} such that for each machine enumer-
able X there is an n such that X is the set of positive integers in the range
of P. For if E(u,n̄) is the formula for X mentioned above, then we can take
P(x,u,n) to be u[1 − Q²(x,u,n)]. So in particular, there is a polynomial P
in several variables that enumerates the primes, in the sense that the
positive portion of the range of P is the set of prime numbers. This came as
a surprise to those working in number theory.
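The trick P = u[1 − Q²] is easy to check numerically: when Q(x,u) ≠ 0 the factor 1 − Q² is at most 0, so P can be positive only when Q(x,u) = 0, in which case P = u. A toy illustration with a Q of our own choosing (here the "enumerated" set is just the squares), searched over a small range:

```python
# The P = u * (1 - Q^2) trick: P > 0 forces Q(x, u) = 0, and then P = u.

def Q(x, u):          # toy choice: Q(x, u) = 0 iff u = x^2
    return u - x * x

def P(x, u):
    return u * (1 - Q(x, u) ** 2)

B = 30
positives = {P(x, u) for x in range(B) for u in range(1, B) if P(x, u) > 0}
solvable = {u for u in range(1, B) if any(Q(x, u) == 0 for x in range(B))}
print(sorted(positives), positives == solvable)  # [1, 4, 9, 16, 25] True
```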
Here is another surprising consequence. In the next section we shall
describe an effectively given axiomatization for set theory, called ZFC,
that is complete for classical mathematics in the sense that all mathemati-
cal notions can be defined and all theorems of classical mathematics can
be proved in this axiomatic framework. This axiomatization is thought to
be consistent by virtually all mathematicians, but cannot be proved so
within its own framework, by the second incompleteness theorem. Being
effectively given, the set 𝔗 of theorems of ZFC is effectively enumerable
(as in Theorem 13.7). So the theorems provable within the framework of
classical mathematics can be enumerated effectively. Assign numbers to
formulas of set theory in the way we assigned numbers to formulas of
arithmetic at the end of §2.12, except assign 0 to each assertion of the form
σ ∧ ¬σ. Let 𝔑 be the set of numbers of elements of 𝔗. 𝔑, being
effectively enumerable, is the non-negative part of the range of a poly-
nomial P. We believe that P does not have a solution for 0 in integers,
since we believe ZFC is consistent. On the other hand, we cannot prove
that P has no solution for 0 in integers within the classical framework of
mathematics, since this would be tantamount to a proof of the consistency
of ZFC within ZFC.
The problem of deciding whether an arbitrarily given Diophantine
equation has a solution in the rationals remains open.
One of the most active areas of investigation in the theory of computers
in recent years is the theory of computational complexity. The notion of
machine computability as an abstraction of our intuitive notion of algo-
rithm, or man computability, is very generous in that no restriction is placed
on the amount of tape or number of steps required in a given computation as
long as these quantities are finite. However, length of computation is a
crucial consideration in the operation of real computers. A problem may
be computable in the theoretical sense, but non-computable in the practi-
cal sense that given any program for the problem, unavoidable calculations
will require a totally unacceptable amount of time.
A remarkable example of this was discovered by Fischer and Rabin. Let
L⁻ be the set of L-assertions without occurrences of · or constant
symbols. Let T₁ be the set of L⁻-assertions true in N⁺ under addition, and
let T₂ be the set of L⁻-assertions true in the reals under addition. It is
known that T₁ and T₂ are decidable. However, Fischer and Rabin prove
that they are not decidable in the practical sense. For T₁ they show that
there is a constant c such that for any machine M there is an n₀ such that
for each n > n₀ there is an L⁻-assertion of length n which takes at least
2^(2^(cn)) steps by M to decide. For T₂ the situation is analogous except that
the decision takes at least 2^(cn) steps. Thus the number of steps required for a
decision grows much, much faster than the length of the question and
soon exceeds, say, the number of atoms in the universe. The methods used
to prove these results have had many applications to questions of computa-
tional complexity in mathematical structures other than the reals or natural
numbers under addition.
There is a beautiful theorem of P. Young which states, loosely speaking,
that there is a recursively enumerable set X such that given any recursive
enumeration (x₁, x₂, x₃, …) of X, there is another enumeration
(y₁, y₂, y₃, …) of X and a machine M that will enumerate X in the order
(y₁, y₂, y₃, …) significantly faster than any machine can enumerate X in the
order (x₁, x₂, x₃, …).

Another bizarre result of complexity theory is that there is no effectively
given sequence of computable functions f₁, f₂, f₃, … such that f_{i+1} is more
complicated than fᵢ in the sense that any program for the computation of
f_{i+1} is larger than the minimal program for fᵢ. In other words, if f is a
computable 2-function, and fᵢ(n) = f(i,n) for all i, n, and c_{fᵢ} is, say, the least
number of the form #M where M computes fᵢ, then c_{f_{i+1}} ≤ c_{fᵢ} for some i's.
So in particular, letting f_{i+1}(n) = n·fᵢ(n) with f₁(x) the identity function does
not yield a sequence of functions of increasing complexity.
As a consequence of this we have the following strange corollary. Given
any axiomatic framework for mathematics, there are only finitely many
machines M for which there is a proof that M computes a function that
cannot be computed by a simpler machine (simpler, say, in the sense of the
size of #M).
There is a particularly notorious open question in this domain. Consider
the class P of problems that can be done in polynomial bounded time. A
problem in this class can be thought of as a recursive set X for which there
is a machine M and a polynomial p such that M(n) = R_X(n) (where R_X is
the representing function of X) and the computation M(n) takes fewer
than p(n) steps. P is the class of deterministic, polynomial bounded
problems. The class NP of non-deterministic polynomial bounded prob-
lems consists of those recursive sets X for which there is a non-determinis-
tic machine M and a polynomial p such that M computes R_X and for each
n the computation for M(n) requires less than p(n) steps. A non-determin-
istic machine differs from our machines in that the next-state function s
maps elements of {0,1} × N⁺ into finite subsets of the set of states.
Definition 2.3 is modified so that a successor tape position M(t) can have
any marker of the form (j + p(a_j,k), u) where u ∈ s(a_j,k). No one has yet
been able to show that P ≠ NP.
Besides the attempts to classify the computable functions into a
hierarchy according to computational complexity, considerable success
has been achieved in classifying non-computable functions according to
their relative complexity. If we add the stipulation that g ∈ K to the
definition of Rec given in Definition 9.1, we obtain the set of functions
recursive with respect to g, Rec_g. Intuitively, f ∈ Rec_g means that there is a
canonical procedure for computing f that may require at any given step of
a computation some value of g. If f ∈ Rec_g, we write f ≤ g. Clearly if
f ∈ Rec, then f ≤ g for all g. If we restrict our attention to representing
functions of effectively enumerable sets, then it is not difficult to find an h
such that f ≤ h for all such f's. More work is required to show that each of
our effectively enumerable non-computable sets given in the examples of
§2.8 has a representing function that is such a maximal h. Post's problem
asks if there are any functions g that are neither maximal nor minimal with
respect to ≤. After many years the question was settled affirmatively by
Friedberg and Mucnik. If we identify f and g whenever f ≤ g and g ≤ f,
then the resulting equivalence classes are partially ordered by ≤ with a
least element and a maximal element. Using the methods developed by

Friedberg and Mucnik, much was learned about this ordering. For exam-
ple, any countable partially ordered structure can be embedded in it.
Notions of computability have been developed for objects of higher
type, such as functions that map functions to functions.
Analogs of computability have been proposed by Kreisel for functions
defined on some ordinal α with range in α. This generalization grew into a
rich theory developed by Barwise and others.
Formal language theory is another direction that has received considerable
attention. An alphabet is a set Σ of symbols. Σ* is the set of all finite
sequences of terms in Σ. Σ* is called the set of words on Σ. A language is a
subset of some Σ*. Languages can be specified by syntactic conditions or
by a process. As an example of the latter kind, suppose we fix a word
w ∈ Σ* and a function f : Σ → Σ*. Now define a function F : Σ* → Σ* as
follows. If σ ∈ Σ*, say σ = a₁a₂...a_n, then Fσ = f(a₁)f(a₂)...f(a_n), i.e., each
a_i is replaced by the word f(a_i), and the resulting concatenation of symbols
is Fσ. Let L be the orbit of F on w, that is, L = {w, Fw, F²w, ...}. Then L is
a language, and languages obtained in this way are simple examples of
Lindenmayer languages. Lindenmayer is a botanist who first proposed the
study of these languages, and the cause was subsequently championed by
Rozenberg. These languages have been used to model simple morphogenetic
processes in biology and are of interest to computer scientists as examples
of parallel programming.
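The orbit construction just described is easy to experiment with. The sketch below extends a symbol map f to a word map F and lists the first few members of L = {w, Fw, F²w, ...}; the particular substitution used is Lindenmayer's classic "algae" system, our illustrative choice rather than anything from the text.

```python
def extend(f):
    """Extend a symbol map f: Sigma -> Sigma* to a word map F: Sigma* -> Sigma*."""
    return lambda word: "".join(f[a] for a in word)

# Lindenmayer's algae system over the alphabet {A, B} (an illustrative choice)
f = {"A": "AB", "B": "A"}
F = extend(f)

w = "A"
L = []
for _ in range(6):      # collect the first six words of the orbit of F on w
    L.append(w)
    w = F(w)

print(L)   # ['A', 'AB', 'ABA', 'ABAAB', 'ABAABABA', 'ABAABABAABAAB']
```

The word lengths 1, 2, 3, 5, 8, 13 are Fibonacci numbers, a well-known feature of this particular system.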
Many other interesting classes of languages have been described, and
the relationships among them are an active current area of research.
PART III
An Introduction to Model Theory

3.1 Introduction
For the remainder of the text we turn our attention to that branch of logic
called model theory. Here we consider formal languages with enough
expressive power to formulate a large class of notions that arise in many
diverse areas of mathematics. Within our idealized language we shall be
able to describe different kinds of orderings, groups, rings, fields, and other
commonly studied mathematical notions.
If an assertion σ is a true statement about a mathematical structure 𝔄,
then 𝔄 is said to be a model of σ. The main concern of model theory is the
relation between assertions in some formal language and their models. For
example, we shall describe a language strong enough to capture many of
the important properties of the real number field but weak enough so that
the sentences true in this structure also have a non-standard model, a
model in which there are infinitesimally small numbers and infinitely large
numbers. Within such non-standard models one can justify a development
of the calculus along lines close to that of Newton's original conception.
For example, lim_{n→∞} a_n = a would mean that |a_n − a| is infinitesimal
whenever n is infinite.
Where model theory is developed within the context of algebra, the
latter subject undergoes considerable unification and generalization, while
model theory benefits from the examples, methods, and problems of
algebra. This interaction has been particularly fruitful in the areas of
Boolean algebra and the theory of groups.
The interaction of set theory and model theory has given tremendous
impetus to both, and each has contributed techniques and theorems to the
other that were used to solve famous problems of long standing.

In fact, over the past several decades, the connections between model
theory, computable function theory, set theory, and infinitary combinatorics
have become more and more closely knit. There are areas in which the
symbiosis is so strong that any division between them is bound to be
artificial. In the last section we shall briefly indicate some of the more
recent directions taken by model theory and hint at the growing interrela-
tion between the various branches of logic.

3.2 The First Order Predicate Calculus


We now expand the language L, introduced in §2.10, by adding new
function symbols, relation symbols, and constant symbols. In the expanded
language we shall be able to make assertions about groups, rings, fields,
orderings, and other objects of mathematical interest. Our new language
(which we also call L, or the first order predicate calculus) and the new
definitions of assignment and satisfaction are natural extensions of the
corresponding notions previously introduced in §2.10.
As before, the language L can be viewed as a formalized fragment of
mathematical English.

THE SYMBOLS OF L
Variables: v₀, v₁, v₂, ....
For each ordinal α, a constant symbol c_α.
Equality symbol: ≈.
Symbols for 'and', 'or', and 'not': ∧, ∨, ¬.
Existential and universal quantifier symbols: ∃, ∀.
Left and right parentheses: [, ].
For each n ∈ N⁺ and each ordinal α, an n-function symbol f_{n,α}.
For each n ∈ N⁺ and each ordinal α, an n-relation symbol R_{n,α}.

Definition 2.1a. A set whose elements are either constant symbols, function
symbols, or relation symbols is called a type. An expression is a finite
sequence of symbols. If φ is an expression then the type of φ, written τ(φ),
is the set of constant symbols, function symbols, and relation symbols
occurring in φ. If s is a type, then φ is of type s if τ(φ) ⊆ s.

EXAMPLE. ]]v₉∀c_ω R_{3,9} f_{84,2}[ is an expression φ such that τ(φ) =
{c_ω, R_{3,9}, f_{84,2}}. φ is of type τ(φ) and of any type containing τ(φ).

Definition 2.1b. Let s be a type. The set of terms of type s, Trm_s, is the
smallest set X of expressions such that
i. v_i ∈ X for all i ∈ N,
ii. c_α ∈ X for all c_α ∈ s,
iii. if t₁, ..., t_n ∈ X and f_{n,α} ∈ s, then f_{n,α} t₁ ⋯ t_n ∈ X.

EXAMPLE. Suppose that v is a variable, c a constant symbol, f a 1-function
symbol, and g a 2-function symbol. Then c and v are terms by conditions i
and ii; so fc is a term by iii, and so gfcv is a term (of type {g, f, c}) by iii
also.

As before (Theorem 10.2 in Part II), unique readability for terms is easy
but tedious to prove, so we leave both the statement and the proof as an
exercise.
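Definition 2.1b is an inductive definition, and it can be mirrored directly by a recursive membership test. The sketch below uses our own tuple encoding of terms (nothing from the text): a variable is ("v", i), a constant is ("c", name), and an application of an n-function symbol is ("f", name, n, t1, ..., tn). It then rebuilds the example above.

```python
def is_term(t, s):
    """Check the three clauses of Definition 2.1b relative to a type s of symbol names."""
    if t[0] == "v":                       # clause i: variables v_i
        return isinstance(t[1], int) and t[1] >= 0
    if t[0] == "c":                       # clause ii: constant symbols in s
        return t[1] in s
    if t[0] == "f":                       # clause iii: an n-function symbol applied to n terms
        name, n, args = t[1], t[2], t[3:]
        return name in s and len(args) == n and all(is_term(u, s) for u in args)
    return False

# The example following Definition 2.1b: c, v, fc, and gfcv
s = {"f", "g", "c"}
v = ("v", 0)
c = ("c", "c")
fc = ("f", "f", 1, c)
gfcv = ("f", "g", 2, fc, v)
print(all(is_term(t, s) for t in (v, c, fc, gfcv)))   # True
print(is_term(("f", "g", 2, fc), s))                  # False: g needs two arguments
```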

Definition 2.1c. An atomic formula is an expression of the form [t₁ ≈ t₂] or
of the form [R_{n,α} t₁, ..., t_n] where the t_i's are terms.

Definition 2.1d. Let s be a type. The set Fm_s, the formulas of type s, is the
smallest set X such that
i. every atomic formula of type s belongs to X,
ii. if φ ∈ X then [¬φ] ∈ X,
iii. if φ ∈ X and ψ ∈ X, then [φ ∨ ψ] ∈ X and [φ ∧ ψ] ∈ X,
iv. if φ ∈ X and v is a variable, then [∃vφ] ∈ X and [∀vφ] ∈ X.
If Σ is a set of formulas, then τΣ, the type of Σ, is ∪{τ(σ) : σ ∈ Σ}.

EXAMPLE. Let f be a 1-function symbol, R a 2-relation symbol, u and v
variables, and c a constant symbol. Then we can check that

[∀u[∃v[[Rufv] ∨ [fv ≈ c]]]]   (*)

is a formula. By Definition 2.1b, u, fv, and c are terms. Hence [Rufv] and
[fv ≈ c] are formulas by Definition 2.1di, and so by Definition 2.1diii we
get that [[Rufv] ∨ [fv ≈ c]] is a formula. By Definition 2.1div used twice
we see that (*) is a formula.

We shall make the usual notational abuses in the name of readability
when writing down formulas, such as omitting or inserting brackets or
commas (see discussion preceding Definition 10.9 in Part II). For example,
we may write
∀u∃v[R(u, f(v)) ∨ f(v) ≈ c]
in place of (*).
The obvious analog of Theorem 10.5 in Part II (unique readability for
formulas) holds. We leave the precise statement of the theorem and its
proof as an exercise.
The notions of free occurrence, bound occurrence, free for, and asser-
tion carry over from Definition 10.7 of Part II without change.
In Part I we remarked that all of mathematics can be developed within
an axiomatic framework such as the Zermelo-Fraenkel axiomatization or
some extension of it, and model theory is no exception. To do so, all
objects under discussion would have to be sets whose existence followed
from the axioms. For example, we could take v_n, c_α, f_{n,α}, R_{n,α} to be (1, n),
(2, α), (3, n, α), (4, n, α) respectively, and ≈, ∨, ∧, ¬, ∃, ∀, [, ] to be (0, 0),
(0, 1), ..., (0, 7) respectively. Expressions, terms, and formulas could then be
defined as sequences in an appropriate set theoretic way (see Part I), and
unique readability would then be proved from the axioms of set theory.
Indeed, everything that follows could be developed formally within the
axiom system ZFC. However, within the present context, this would be a
tedious exercise in needless rigor. On the other hand, it is important to
realize that this can be done, and in several problems that arise in logic
such a formalization of model theory is given explicitly or at least assumed.

EXERCISES FOR §3.2


1. Revise Theorem 10.2 of Part II to obtain a unique readability theorem for terms
as defined in Definition 2.1 above, and give a proof of the new theorem.
2. Obtain a unique readability theorem for formulas as defined in Definition 2.1d
by revising Theorem 10.5 of Part II, and give a proof of the theorem.

3.3 Structures
In Part I we defined a structure to be an ordered pair (A, e) where A ≠ ∅
and e is a binary relation on A. We now extend the notion of structure so
as to encompass a great variety of constructs that are of interest to
mathematicians.

Definition 3.1. Let s be a type. By a structure of type s we mean a function
𝔄 whose domain is s ∪ {∅} satisfying the following requirements:

i. 𝔄(∅) is a non-empty set.
ii. 𝔄(c_α) ∈ 𝔄(∅) for each c_α ∈ s.
iii. 𝔄(R_{n,α}) is an n-relation on 𝔄(∅) for each R_{n,α} ∈ s.
iv. 𝔄(f_{n,α}) is an n-function on 𝔄(∅) for each f_{n,α} ∈ s.

So 𝔄 maps constant symbols into constants in 𝔄(∅), n-ary relation
symbols into n-relations on 𝔄(∅), and n-ary function symbols into n-functions
on 𝔄(∅).
We shall write |𝔄| instead of 𝔄(∅) and call it the universe of the
structure 𝔄. We may also write c_α^𝔄, f_{n,α}^𝔄, and R_{n,α}^𝔄 in place of 𝔄(c_α), 𝔄(f_{n,α}),
and 𝔄(R_{n,α}) respectively. If S is a symbol, then S^𝔄 is called the denotation
of S in 𝔄.

With this notation, structures of the kind described in §1.12 can be
thought of as functions 𝔄 with domain {∅, R_{2,α}}, so we have 𝔄 = (|𝔄|, R_{2,α}^𝔄)
= (A, e^𝔄). Here α is fixed but arbitrary.
If 𝔄 is a structure with universe A and {H₀, H₁, ..., H_α, ...} = τ𝔄, then we
may write (A, H₀^𝔄, H₁^𝔄, ..., H_α^𝔄, ...) instead of 𝔄. Or it may be convenient
to write 𝔄 = (A, R_i^𝔄, f_j^𝔄, c_k^𝔄)_{i∈I, j∈J, k∈K} if the type of 𝔄 is {R_i : i ∈ I} ∪ {f_j : j ∈
J} ∪ {c_k : k ∈ K}, where the R_i's are relation symbols, the f_j's are function
symbols, and the c_k's are constant symbols.
We shall use capital German letters 𝔄, 𝔅, ℭ, etc. to denote structures.

EXAMPLE 3.2. We can consider (N⁺, +, ·, 1, 2, ...) a structure 𝔄 of type
{f_{2,0}, f_{2,1}, c₀, c₁, ...}, where |𝔄| = N⁺, f_{2,0}^𝔄 = +, f_{2,1}^𝔄 = ·, and c_i^𝔄 = i + 1.

EXAMPLE 3.3. A group can be thought of as a structure 𝔊 = (G, *) of type
{f_{2,α}}, where |𝔊| = G and f_{2,α}^𝔊 = *, * being a binary group operation.
Alternatively, it may be convenient to regard a group as a structure
𝔊 = (G, *, In, c) where τ𝔊 = {f_{2,0}, f_{1,0}, c₀} and f_{2,0}^𝔊 = * (* the binary group
operation), f_{1,0}^𝔊 = In (In the unary inverse operation), c₀^𝔊 = c (c the identity
element).

EXAMPLE 3.4. An ordered ring or field may be viewed as a structure
𝔄 = (A, ⊕, ⊙, <) with 𝔄 of the same type as (N⁺, +, ·, <), say τ𝔄 =
{f_{2,0}, f_{2,1}, R_{2,0}}. Or one might find it convenient to have the type be
{f_{2,α}, f_{2,β}, R_{2,γ}} for some (α, β, γ) different from (0, 1, 0) with α ≠ β.

EXAMPLE 3.5. A partial ordering < on a set A may be viewed as a
structure 𝔄 of type {R_{2,α}} where |𝔄| = A and R_{2,α}^𝔄 = <.

EXAMPLE 3.6. With a little hanky-panky, vector spaces can be regarded as
structures. Suppose we have a vector space with (V, ⊕) the underlying
group of vectors, (F, +, ·) the field of scalars, and ⊙ the multiplication of
a vector by a scalar. We can regard the vector space as the structure
𝔄 = (V ∪ F, V, F, R_⊕, R_+, R_·, R_⊙) where V, F are unary relations, R_⊕ =
{(x, y, z) : x ⊕ y = z}, R_+ = {(x, y, z) : x + y = z}, R_· = {(x, y, z) : x · y = z}, and
R_⊙ = {(x, y, z) : x ⊙ y = z}. Note that we have used relations rather than
functions for ⊕, +, ·, and ⊙ in 𝔄; this is because clause iv of Definition
3.1 demands that function symbols f_{n,α} denote functions in 𝔄 that are
defined for all n-tuples of |𝔄|.

EXAMPLE 3.7. Structures of the empty type are allowed, as for example (N).
We also allow two different symbols to have the same denotation, as in the
structure (A, c₀^𝔄, c₁^𝔄) where A = N, c₀^𝔄 = c₁^𝔄 = 3. Since structures are functions,
𝔄 = 𝔅 iff Dom 𝔄 = Dom 𝔅 and for all x in the common domain,
𝔄(x) = 𝔅(x). Thus if 𝔄 = (A, f_{1,0}^𝔄) and 𝔅 = (B, f_{1,1}^𝔅), then 𝔄 ≠ 𝔅 even if
A = B and f_{1,0}^𝔄 = f_{1,1}^𝔅.
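Definition 3.1 says a structure is literally a function with domain s ∪ {∅}, and a dictionary mirrors this directly. In the sketch below (our own encoding, with UNIV standing in for ∅, and a small finite structure of our own choosing) there is one constant symbol, one 2-relation symbol, and one 2-function symbol whose denotation is total, as clauses ii-iv require.

```python
UNIV = None   # stands in for the ∅ in dom(A) = s ∪ {∅}

# A finite structure: universe {1, 2, 3}, with addition capped at 3 so that
# the function is defined on every pair (clause iv of Definition 3.1).
A = {
    UNIV: {1, 2, 3},
    "c0": 1,                                            # a constant symbol
    "R_le": {(x, y) for x in (1, 2, 3)
                    for y in (1, 2, 3) if x <= y},      # a 2-relation symbol
    "f_plus": {(x, y): min(x + y, 3)                    # a total 2-function symbol
               for x in (1, 2, 3) for y in (1, 2, 3)},
}

U = A[UNIV]                                 # the universe |A| = A(∅)
print(A["c0"] in U)                         # clause ii holds: True
print(all(set(t) <= U for t in A["R_le"]))  # clause iii holds: True
print(A["f_plus"][(2, 2)])                  # the denotation of f at (2, 2): 3
```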

Definition 3.8. We say that 𝔄 is a substructure of 𝔅, or that 𝔅 is an
extension of 𝔄, and write 𝔄 ⊆ 𝔅, if

i. τ𝔄 = τ𝔅,
ii. |𝔄| ⊆ |𝔅|,
iii. c^𝔄 = c^𝔅 for all c ∈ τ𝔄,
iv. f_{n,α}^𝔄 ā = f_{n,α}^𝔅 ā for every f_{n,α} ∈ τ𝔄 and all ā ∈ ⁿ|𝔄|, and
v. R_{n,α}^𝔄 ā iff R_{n,α}^𝔅 ā for every R_{n,α} ∈ τ𝔄 and all ā ∈ ⁿ|𝔄|.
If 𝔅 has a substructure with universe X, then that substructure will be
called the restriction of 𝔅 to X and denoted by 𝔅|X.

Notice that if 𝔄 is a subgroup of 𝔅 then 𝔄 ⊆ 𝔅. A similar statement
holds for rings, fields, lattices, etc. The converse, however, is not always
true, but depends on the way in which the group (ring, field, etc.) is
realized as a structure. For example, (N⁺, +) is a substructure of the
integers under addition, but is not a subgroup. On the other hand, if
𝔅 = (G, ·, ⁻¹) is a group (⁻¹ being the inverse operation of the group), then
any substructure of 𝔅 is a subgroup of 𝔅 and conversely.
The notation '⊆' for substructure is a bit misleading in that 𝔄 ⊆ 𝔅 does
not mean that, as functions, 𝔅 extends 𝔄. Extension in the sense of
functions is considered in the following definition.

Definition 3.9. If s ⊆ τ𝔅, then 𝔄 is the reduct of 𝔅 to s, written 𝔄 = 𝔅↾s, if

i. τ𝔄 = s,
ii. |𝔄| = |𝔅|,
iii. S^𝔄 = S^𝔅 for all S ∈ s.
If 𝔄 = 𝔅↾s for some type s, then 𝔅 is called an expansion of 𝔄. When
convenient we will write 𝔅 = (𝔄, S^𝔅)_{S ∈ τ𝔅 − s}.

EXAMPLE. Let 𝔅 = (N⁺, +, ·, <, 1, 2) = (N⁺, f_{2,0}^𝔅, f_{2,1}^𝔅, R_{2,0}^𝔅, c₀^𝔅, c₁^𝔅), 𝔄 =
(N⁺, ·, <, 2) = (N⁺, f_{2,1}^𝔄, R_{2,0}^𝔄, c₁^𝔄), and s = {f_{2,1}, R_{2,0}, c₁}. Then 𝔅↾s = 𝔄 and
𝔅 = (𝔄, +, 1).

Definition 3.10. Let τ𝔄 ⊆ τ𝔅. A function g on |𝔄| into |𝔅| is an injection if
i. g is 1-1,
ii. g(c^𝔄) = c^𝔅 for all constant symbols c ∈ τ𝔄,
iii. R^𝔄 a₁, ..., a_n iff R^𝔅 g(a₁), ..., g(a_n) for all a₁, ..., a_n ∈ |𝔄| and all n-relation
symbols R ∈ τ𝔄,
iv. g(f^𝔄 a₁, ..., a_n) = f^𝔅 g(a₁), ..., g(a_n) for all a₁, ..., a_n ∈ |𝔄| and all n-function
symbols f ∈ τ𝔄.
If in addition g is onto |𝔅|, then g is an isomorphism of 𝔄 onto 𝔅, in which
case we write 𝔄 ≅_g 𝔅. Or we may write 𝔄 ≅ 𝔅 if explicit mention of the
isomorphism is unnecessary.

Notice that this definition coincides completely with the use of 'injection'
and 'isomorphism' in algebra. Several other notions from abstract
algebra will be generalized in the problems and in the sections that follow.
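For finite structures, Definition 3.10 can be checked by brute force: try every bijection g between the two universes and test the preservation clauses. The sketch below does this for a single binary function symbol; the two groups, (Z₄, +) and the powers of 3 under multiplication mod 10 (a cyclic group of order 4), are our own examples, not the text's.

```python
from itertools import permutations

def isomorphisms(U1, f1, U2, f2):
    """Yield every isomorphism g from (U1, f1) onto (U2, f2), per Definition 3.10."""
    U1 = list(U1)
    for image in permutations(U2):
        g = dict(zip(U1, image))                   # a bijection U1 -> U2
        if all(g[f1[(a, b)]] == f2[(g[a], g[b])]   # clause iv: g(f(a,b)) = f(g(a), g(b))
               for a in U1 for b in U1):
            yield g

Z4 = range(4)
add4 = {(a, b): (a + b) % 4 for a in Z4 for b in Z4}      # (Z_4, +)
P = [1, 3, 9, 7]                                          # powers of 3 mod 10
mul10 = {(a, b): (a * b) % 10 for a in P for b in P}      # a group under · mod 10

isos = list(isomorphisms(Z4, add4, P, mul10))
print(len(isos) > 0)     # True: the two structures are isomorphic
print(isos[0][0] == 1)   # every isomorphism sends the identity 0 to 1
```

Since an isomorphism must send the identity to the identity, every g found maps 0 to 1; there are exactly two isomorphisms here, one for each generator of Z₄.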

EXERCISES FOR §3.3

1. Show that if 𝔄 ⊆ 𝔅 ⊆ ℭ, then 𝔄 ⊆ ℭ.

2. Show that '≅' is an equivalence relation by showing that for all structures
𝔄, 𝔅, ℭ,
(a) 𝔄 ≅ 𝔄,
(b) 𝔄 ≅ 𝔅 implies 𝔅 ≅ 𝔄, and
(c) 𝔄 ≅ 𝔅 and 𝔅 ≅ ℭ implies 𝔄 ≅ ℭ.
3. Give examples of structures 𝔄, 𝔄′, 𝔅, and 𝔅′ such that
𝔄 ≅ 𝔅′ ⊆ 𝔅 ≅ 𝔄′ ⊆ 𝔄
but 𝔄 ≇ 𝔅.

4. Suppose that 𝔄_i ⊆ 𝔄_{i+1} for i = 0, 1, 2, .... Let 𝔄 be that structure of type τ𝔄₀ such
that
|𝔄| = ∪_{i∈N} |𝔄_i|,
R^𝔄 = ∪_{i∈N} R^{𝔄_i} for each R ∈ τ𝔄₀,
f^𝔄 = ∪_{i∈N} f^{𝔄_i} for each f ∈ τ𝔄₀,
c^𝔄 = c^{𝔄_i} for each c ∈ τ𝔄₀.
Clearly each 𝔄_i is a substructure of 𝔄. Find such a substructure chain 𝔄₁ ⊆ 𝔄₂ ⊆
𝔄₃ ⊆ ⋯ such that each 𝔄_i is isomorphic to (N⁺, <) but 𝔄 is not.

5. Let A be the set of all integral powers of 3, A = {3^j : j ∈ I}. Show that (A, ·) ≅
(I, +).
6. A function h mapping |𝔄| into |𝔅| is a homomorphism if τ𝔄 = τ𝔅 and whenever
c, R, f ∈ τ𝔄, then
i. hc^𝔄 = c^𝔅,
ii. hf^𝔄(a₁, ..., a_n) = f^𝔅(ha₁, ..., ha_n),
iii. R^𝔄 a₁, ..., a_n iff R^𝔅 ha₁, ..., ha_n.
Let k be a positive integer and let +_k and ·_k be addition and multiplication
modulo k. Let h(n) = n mod k. Show that h is a homomorphism from (I, +, ·)
onto ({0, 1, ..., k − 1}, +_k, ·_k).
7. An equivalence relation C on |𝔄| is a congruence relation on 𝔄 if whenever
a_i C b_i for i = 1, ..., n and R is an n-relation in τ𝔄 and f is an n-function in τ𝔄, then
R^𝔄 a₁, ..., a_n iff R^𝔄 b₁, ..., b_n
and
f^𝔄 a₁, ..., a_n C f^𝔄 b₁, ..., b_n.
Let h be a homomorphism on 𝔄 to 𝔅. Let C = {(a, b) : a, b ∈ |𝔄|, h(a) = h(b)}.
Show that C is a congruence relation.

8. Let C be a congruence relation on 𝔄. For each a ∈ |𝔄| we let ā = {b ∈ |𝔄| : b C a}.
The quotient structure 𝔄 modulo C is that structure 𝔅 of type τ𝔄 with
|𝔅| = {ā : a ∈ |𝔄|},
c^𝔅 = (c^𝔄)‾ for each c ∈ τ𝔄,
f^𝔅 ā₁, ..., ā_n = (f^𝔄 a₁, ..., a_n)‾ for each f ∈ τ𝔄,
R^𝔅 ā₁, ..., ā_n iff R^𝔄 a₁, ..., a_n for each R ∈ τ𝔄.
Show that if h(a) = ā for each a ∈ |𝔄|, then h is a homomorphism from 𝔄 onto 𝔅.

3.4 Satisfaction and Truth


The purpose of this section is to extend the notions of satisfaction and
truth as defined in Section 2.10 to our more general language. The
extension is made in the obvious way, in correspondence to our usual use
of the symbols in mathematics. After this is done, several examples are
given to illustrate the expressive power of L.

Definition 4.1. An assignment to 𝔄 is a function z with Dom z = {v_i : i ∈ N}
and Rng z ⊆ |𝔄|. If z is an assignment to 𝔄, u a variable, and a ∈ |𝔄|, then
z(u/a) is the assignment z′ defined as follows:
z′(v) = z(v) for all variables v ≠ u,
z′(u) = a.

Letting Vbl be the set of variables, we can use the notation of Part I and
write 'z ∈ ^Vbl|𝔄|' when z is an assignment to 𝔄.

Definition 4.2. Let t be a term of type ⊆ τ𝔄 and let z ∈ ^Vbl|𝔄|. We define
t^𝔄⟨z⟩ (by induction on the length of t) as follows:

i. v_n^𝔄⟨z⟩ = z(v_n),
ii. c^𝔄⟨z⟩ = c^𝔄,
iii. (f_{n,α} t₁, ..., t_n)^𝔄⟨z⟩ = f_{n,α}^𝔄(t₁^𝔄⟨z⟩, ..., t_n^𝔄⟨z⟩).

Definition 4.3. Let φ be a formula of type ⊆ τ𝔄 and let z ∈ ^Vbl|𝔄|. We say that
z satisfies the formula φ in 𝔄, written 𝔄 ⊨ φ⟨z⟩, if either
i. φ = [t₁ ≈ t₂] and t₁^𝔄⟨z⟩ = t₂^𝔄⟨z⟩,
i′. φ = [R_{n,α} t₁, ..., t_n] and R_{n,α}^𝔄 t₁^𝔄⟨z⟩, ..., t_n^𝔄⟨z⟩,
ii. φ = [¬ψ] and it is not the case that 𝔄 ⊨ ψ⟨z⟩,
iii. φ = [ψ₁ ∧ ψ₂] and both 𝔄 ⊨ ψ₁⟨z⟩ and 𝔄 ⊨ ψ₂⟨z⟩,
iii′. φ = [ψ₁ ∨ ψ₂] and either 𝔄 ⊨ ψ₁⟨z⟩ or 𝔄 ⊨ ψ₂⟨z⟩,
iv. φ = [∃v_n ψ] and for some a ∈ |𝔄|, 𝔄 ⊨ ψ⟨z(v_n/a)⟩, or
iv′. φ = [∀v_n ψ] and for all a ∈ |𝔄|, 𝔄 ⊨ ψ⟨z(v_n/a)⟩.
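For a finite structure, Definitions 4.2 and 4.3 translate clause by clause into a recursive evaluator. The encoding below is our own, not the book's: formulas are nested tuples, an assignment z is a dict from variable names to elements, and z(v/a) is realized as a dict update. The structure and formula at the end are illustrative choices.

```python
def tval(t, A, z):
    """Term denotation t^A<z> (Definition 4.2)."""
    if t[0] == "v":                         # clause i: a variable
        return z[t[1]]
    if t[0] == "c":                         # clause ii: a constant symbol
        return A[t[1]]
    _, f, *args = t                         # clause iii: ("f", name, t1, ..., tn)
    return A[f][tuple(tval(u, A, z) for u in args)]

def sat(phi, A, z):
    """A |= phi<z> (Definition 4.3), one branch per clause."""
    op = phi[0]
    if op == "eq":                          # clause i
        return tval(phi[1], A, z) == tval(phi[2], A, z)
    if op == "R":                           # clause i'
        _, R, *ts = phi
        return tuple(tval(t, A, z) for t in ts) in A[R]
    if op == "not":                         # clause ii
        return not sat(phi[1], A, z)
    if op == "and":                         # clause iii
        return sat(phi[1], A, z) and sat(phi[2], A, z)
    if op == "or":                          # clause iii'
        return sat(phi[1], A, z) or sat(phi[2], A, z)
    if op == "exists":                      # clause iv: z(v/a) as a dict update
        return any(sat(phi[2], A, {**z, phi[1]: a}) for a in A["UNIV"])
    if op == "forall":                      # clause iv'
        return all(sat(phi[2], A, {**z, phi[1]: a}) for a in A["UNIV"])
    raise ValueError(op)

# ({0,...,5}, <): is every element strictly below something?
A = {"UNIV": range(6),
     "lt": {(a, b) for a in range(6) for b in range(6) if a < b}}
phi = ("forall", "x", ("exists", "y", ("R", "lt", ("v", "x"), ("v", "y"))))
print(sat(phi, A, {}))    # False: 5 has nothing above it
```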

Theorem 4.4. For all 𝔄, z, φ:

i. 𝔄 ⊨ ∀v_n φ⟨z⟩ iff 𝔄 ⊨ ¬[∃v_n ¬φ]⟨z⟩, and 𝔄 ⊨ ∃v_n φ⟨z⟩ iff 𝔄 ⊨ ¬[∀v_n ¬φ]⟨z⟩;
ii. 𝔄 ⊨ [φ ∧ ψ]⟨z⟩ iff 𝔄 ⊨ ¬[¬φ ∨ ¬ψ]⟨z⟩, and 𝔄 ⊨ [φ ∨ ψ]⟨z⟩ iff 𝔄 ⊨ ¬[¬φ ∧
¬ψ]⟨z⟩.

The proof consists of an easy unwinding of the above definition, and is
left as an exercise (see Exercise 1).
One may view Theorem 4.4 as saying that each quantifier can be
defined in terms of the other and '¬', and that each of the symbols '∨'
and '∧' can be defined in terms of the other and '¬'. In other words, L
suffers no loss in expressive power if we delete from its list of symbols '∧'
and '∀' (or '∨' and '∃', or '∨' and '∀', or '∧' and '∃'), and if we delete
from Definition 4.3 clauses iii and iv′ (or, respectively, clauses iii′ and iv, or
iii′ and iv′, or iii and iv). In addition to the added succinctness in
describing L, this approach makes Definition 4.3 easier to use in that fewer
clauses need be checked (see the proof of Theorem 4.5). On the other
hand, the choice of symbols for L as given seems natural and is a bit more
convenient in the statement of several theorems. So from now on the
formulation of L that is used in a given discussion will be chosen on the
basis of convenience.
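On a finite universe, satisfaction of ∃ and ∀ reduces to Python's any and all, so the dualities of Theorem 4.4 can be checked mechanically. A small sketch, with a handful of sample properties P of our own choosing:

```python
U = range(10)
checks = []
for P in (lambda x: x < 10, lambda x: x % 2 == 0, lambda x: x > 4):
    # clause i of Theorem 4.4: ∀ agrees with ¬∃¬, and ∃ with ¬∀¬
    checks.append(all(P(a) for a in U) == (not any(not P(a) for a in U)))
    checks.append(any(P(a) for a in U) == (not all(not P(a) for a in U)))

# clause ii: ∧ from ∨ and ¬, and ∨ from ∧ and ¬ (de Morgan), on truth values
for p in (True, False):
    for q in (True, False):
        checks.append((p and q) == (not (not p or not q)))
        checks.append((p or q) == (not (not p and not q)))

print(all(checks))   # True
```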
The definitions of bound occurrence of a variable, free occurrence,
assertion, 'v is free for t in φ', etc., are exactly as before (see Part II,
Definitions 10.7 and 10.9i). The meanings of φ(v_{j₁}, ..., v_{j_k}) and φ(t₁, ..., t_k)
carry over to our extended language (see end of §2.10). We shall often find
it convenient to write φ(t₁^𝔄⟨z⟩, ..., t_k^𝔄⟨z⟩) instead of 𝔄 ⊨ φ(t₁, ..., t_k)⟨z⟩, even
though this is an ambiguous convention, since z might be an assignment to
two structures 𝔄 and 𝔅 with t_i^𝔄⟨z⟩ = t_i^𝔅⟨z⟩ for i = 1, 2, ..., k and 𝔄 ⊨
φ(t₁, ..., t_k)⟨z⟩ but 𝔅 ⊨ ¬φ(t₁, ..., t_k)⟨z⟩. However, we shall use this convention
only where such confusion is unlikely.

Theorem 4.5.
i. Let t be a term, and let z and z′ be assignments to 𝔄 such that z(v) = z′(v)
for all variables v occurring in t. Then t^𝔄⟨z⟩ = t^𝔄⟨z′⟩.
ii. Let φ be a formula, and z and z′ assignments to 𝔄 such that z(v) = z′(v)
for all variables v occurring free in φ. Then 𝔄 ⊨ φ⟨z⟩ iff 𝔄 ⊨ φ⟨z′⟩.
PROOF: Exactly like that for Theorem 10.8i in Part II. □
The theorem implies that if σ is an assertion, i.e., a formula without free
variables, then either 𝔄 ⊨ σ⟨z⟩ for all assignments z to 𝔄, or no assignment
satisfies σ in 𝔄. Thus we write 𝔄 ⊨ σ if there is an assignment that satisfies σ
in 𝔄, and we say that σ is true in 𝔄 or that 𝔄 satisfies σ. We say that σ is
valid and write ⊨ σ if for all 𝔄 of type ⊇ τσ, 𝔄 ⊨ σ. For example, ∀x[Rx ∨
¬Rx] is valid.

Definition 4.6.
i. The theory of 𝔄, abbreviated Th 𝔄, is {σ : 𝔄 ⊨ σ}. If 𝒦 is a class of
structures, then Th 𝒦 = {σ : σ ∈ Th 𝔄 for all 𝔄 ∈ 𝒦}. 𝔄 is elementarily
equivalent to 𝔅, in symbols 𝔄 ≡ 𝔅, if Th 𝔄 = Th 𝔅.
ii. If Σ is a set of sentences, then Mod Σ, the class of models of Σ, is the
class of all 𝔄 such that 𝔄 ⊨ σ for all σ ∈ Σ. We write 𝔄 ∈ Mod σ instead of
𝔄 ∈ Mod{σ}. A class of structures 𝒦 is an elementary class if 𝒦 =
Mod Σ for some set of sentences Σ.

Throughout the rest of the section we shall use ρ and σ to denote
assertions, φ and ψ to denote formulas, Π, Γ, and Σ to denote sets of
assertions, and z to denote assignments.
We now give several examples of elementary classes, first giving the
class 𝒦 and then a set of assertions Σ such that Mod Σ = 𝒦. In Examples
4.7 through 4.10, τ𝒦 = {≤}, where ≤ is some binary relation symbol. As
before, φ → ψ abbreviates ¬φ ∨ ψ, and φ ↔ ψ abbreviates [¬φ ∨ ψ] ∧ [¬ψ
∨ φ].

EXAMPLE 4.7. Partially ordered structures:

∀v₀[v₀ ≤ v₀],
∀v₀v₁[[v₀ ≤ v₁ ∧ v₁ ≤ v₀] → v₀ ≈ v₁],
∀v₀v₁v₂[[v₀ ≤ v₁ ∧ v₁ ≤ v₂] → v₀ ≤ v₂].

EXAMPLE 4.8. Linearly ordered structures: The assertions of Example 4.7
along with
∀v₀v₁[v₀ ≤ v₁ ∨ v₁ ≤ v₀].
EXAMPLE 4.9. Densely ordered structures: The assertions of Example 4.8
along with

∃v₀v₁[v₀ ≉ v₁] ∧ ∀v₀v₁[[v₀ < v₁] → ∃v₂[v₀ < v₂ ∧ v₂ < v₁]].

Of course, v₀ < v₁ abbreviates v₀ ≤ v₁ ∧ v₀ ≉ v₁. The rationals and the reals
along with the usual orderings are examples of densely ordered structures.

EXAMPLE 4.10. Discretely ordered structures: The assertions of Example
4.8 along with
∀v₀[∃v₁[v₁ < v₀] → ∃v₂[v₂ < v₀ ∧ ∀v₃[v₃ < v₀ → v₃ ≤ v₂]]],
∀v₀[∃v₁[v₀ < v₁] → ∃v₂[v₀ < v₂ ∧ ∀v₃[v₀ < v₃ → v₂ ≤ v₃]]].
The integers are discretely ordered by the usual 'less than' relation.
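On a finite structure, each of these axiom sets can be verified by direct enumeration. The sketch below checks the assertions of Examples 4.7 and 4.8 on the toy linear order ({0,...,5}, ≤), our own choice, and confirms that density (Example 4.9) fails there.

```python
U = range(6)
le = lambda a, b: a <= b

# Example 4.7: the partial-order axioms, one assert per assertion
assert all(le(a, a) for a in U)                                   # reflexivity
assert all(not (le(a, b) and le(b, a)) or a == b
           for a in U for b in U)                                 # antisymmetry
assert all(not (le(a, b) and le(b, c)) or le(a, c)
           for a in U for b in U for c in U)                      # transitivity

# Example 4.8: linearity
assert all(le(a, b) or le(b, a) for a in U for b in U)

# Example 4.9: density fails in a finite order, e.g. nothing lies between 0 and 1
dense = all(any(a < c and c < b for c in U)
            for a in U for b in U if a < b)
print(dense)   # False
```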

EXAMPLE 4.11. Groups are structures (A, *^𝔄) satisfying

∀v₀v₁v₂[v₀ * [v₁ * v₂] ≈ [v₀ * v₁] * v₂],
∃v₀∀v₁[v₀ * v₁ ≈ v₁ * v₀ ≈ v₁],
∀v₀∃v₁∀v₂[[v₀ * v₁] * v₂ ≈ [v₁ * v₀] * v₂ ≈ v₂].
Here, * is f_{2,α} for some convenient α, and we write u * v instead of
f_{2,α}uv. Similar notational devices will be used throughout the chapter
without mention.
Alternatively, we could consider a group to be a structure (A, *^𝔄, In^𝔄, c^𝔄)
satisfying
∀v₀v₁v₂[v₀ * [v₁ * v₂] ≈ [v₀ * v₁] * v₂],
∀v₀[v₀ * c ≈ c * v₀ ≈ v₀],
∀v₀[v₀ * In v₀ ≈ c ≈ In v₀ * v₀].
It is not difficult to show that the groups in the first sense are exactly the
reducts to the type {*} of the groups in the second sense.
A group is Abelian if
∀v₀v₁[v₀ * v₁ ≈ v₁ * v₀].
EXAMPLE 4.12. Rings are structures 𝔄 = (A, *^𝔄, ∘^𝔄) where (A, *^𝔄) is an
Abelian group, i.e., 𝔄 satisfies the assertions in Example 4.11, and also 𝔄
satisfies
∀v₀v₁v₂[v₀ ∘ [v₁ ∘ v₂] ≈ [v₀ ∘ v₁] ∘ v₂],
∀v₀v₁v₂[[v₀ ∘ [v₁ * v₂] ≈ [v₀ ∘ v₁] * [v₀ ∘ v₂]]
∧ [[v₁ * v₂] ∘ v₀ ≈ [v₁ ∘ v₀] * [v₂ ∘ v₀]]].
Of course each alternative formulation of a group as an elementary
class yields an alternative formulation of a ring. For example, we can view
rings as structures 𝔄 = (A, *^𝔄, In^𝔄, c^𝔄, ∘^𝔄) where (A, *^𝔄, In^𝔄, c^𝔄) is an
Abelian group according to our second formulation of a group, and 𝔄
satisfies the above two assertions.

EXAMPLE 4.13. Fields are rings satisfying the additional assertions

∃v₀∀v₁[v₀ ∘ v₁ ≈ v₁ ∘ v₀ ≈ v₁],
∀v₀[v₀ * v₀ ≈ v₀ ∨ ∃v₁∀v₂[[v₀ ∘ v₁] ∘ v₂ ≈ [v₁ ∘ v₀] ∘ v₂ ≈ v₂]].
Again, there are alternative formulations.

These examples by no means exhaust the elementary classes that are
mathematically interesting. On the other hand, as we show in §3.5, there
are many classes of interest that are not elementary classes.

In our next example, we use L to give a precise statement of the axioms
of Zermelo-Fraenkel set theory (with regularity). In particular, the
ambiguities in our statement of the axiom of replacement as given in Part I
are avoided. Here e is a binary relation symbol whose intended interpretation
is ∈.

EXAMPLE 4.14 (The Axioms of Zermelo-Fraenkel Set Theory).

i. Extensionality:
∀v₀v₁[∀v₂[v₂ e v₀ ↔ v₂ e v₁] → v₀ ≈ v₁].
ii. Null set:
∃v₀∀v₁[¬ v₁ e v₀].
iii. Pairing:
∀v₀v₁∃v₂∀v₃[v₃ e v₂ ↔ [v₃ ≈ v₀ ∨ v₃ ≈ v₁]].
iv. Union:
∀v₀∃v₁∀v₂[v₂ e v₁ ↔ ∃v₃[v₂ e v₃ ∧ v₃ e v₀]].
v. Power set:
∀v₀∃v₁∀v₂[v₂ e v₁ ↔ ∀v₃[v₃ e v₂ → v₃ e v₀]].
vi. Replacement schema: For each formula φ of L with free variables
v₀, v₁, ..., v_n the following is an axiom:
∀v₂ ⋯ v_n[[∀v₀∃v_{n+1}∀v₁[φ → v₁ ≈ v_{n+1}]]
→ ∀v_{n+1}∃v_{n+2}∀v₁[v₁ e v_{n+2} ↔ ∃v₀[v₀ e v_{n+1} ∧ φ]]].
vii. Infinity:
∃v₀[∅ e v₀ ∧ ∀v₁[v₁ e v₀ → v₁ ∪ {v₁} e v₀]].
Here ∅ e v₀ is an abbreviation for ∃v₁[∀v₂[¬ v₂ e v₁] ∧ v₁ e v₀], and v₁ ∪
{v₁} e v₀ is an abbreviation for ∃v₂[∀v₃[v₃ e v₂ ↔ v₃ ≈ v₁ ∨ v₃ e v₁] ∧
v₂ e v₀].
viii. Regularity:
∀v₀[∀v₁[¬ v₁ e v₀] ∨ ∃v₁[v₁ e v₀ ∧ ∀v₂[v₂ e v₀ → ¬ v₂ e v₁]]].

In addition to the Zermelo-Fraenkel axioms, the other axioms discussed
in §1.11, such as the axiom of choice and the generalized continuum
hypothesis, can also be formulated in L. This we leave as an
exercise.

Our last two examples are number theories. The second is an attempt to
realize the Peano axioms for arithmetic within L. In contrast to the second
example, the first involves only a single assertion, and it is this property
which we will need in §3.12 when considering the possibility of an
algorithmic test for validity.

EXAMPLE 4.15 (The Formalization Q). Our formulation is in the type
{+, ·, S, 0} (where S denotes the successor function):

∀v₀v₁[[Sv₀ ≈ Sv₁ → v₀ ≈ v₁]
∧ [0 ≉ Sv₀]
∧ [0 ≉ v₀ → ∃v₁[Sv₁ ≈ v₀]]
∧ [0 + v₀ ≈ v₀]
∧ [v₀ + Sv₁ ≈ S(v₀ + v₁)]
∧ [v₀ · 0 ≈ 0]
∧ [v₀ · Sv₁ ≈ [v₀ · v₁] + v₀]].

EXAMPLE (A Fragment of Peano's Arithmetic). Adjoin to the single assertion
of Q above the infinite list of assertions in the following induction
schema: all assertions of the form

∀u₀, ..., u_{n−1}[[φ(u₀, ..., u_{n−1}, 0)
∧ ∀u_n[φ(u₀, ..., u_{n−1}, u_n) → φ(u₀, ..., u_{n−1}, Su_n)]]
→ ∀u_n[φ(u₀, ..., u_{n−1}, u_n)]],

where u₀, ..., u_n is a list of the free variables in φ.


The induction axiom for Peano's axiomatization asserts: For all subsets
X of N, if OEX and if n+ 1 EX whenever nEX, then X=N. In particular,
if X={n:NFcp(n)}, we obtain a typical assertion of the induction schema
listed above. We cannot state the induction axiom itself in L since this
requires quantifying over arbitrary sets ("for all subsets of N, if ... ").

In the remainder of this section we discuss several theorems relating the
notions of substructure and isomorphism to the language L.

Definition 4.16. Say that an assertion is simple if it has one of the following
forms: d₁ ≈ d₂, d₁ ≉ d₂, Rd₁⋯d_n, ¬Rd₁⋯d_n, fd₁⋯d_n ≈ d, fd₁⋯d_n ≉ d,
where R is a relation symbol, f a function symbol, and the d's constant
symbols. If 𝔅 is an expansion of 𝔄 such that for all a ∈ |𝔄| there is a c ∈ τ𝔅
with c^𝔅 = a, then the set of all simple sentences true in 𝔅 is called a
diagram of 𝔄. Even though a structure has many diagrams, there will be no
harm in speaking of 'the diagram of 𝔄' and writing 𝒟𝔄.

For example, if 𝔄 = (N, +, ·, 0, 1), where c₀^𝔄 = 0 and c₁^𝔄 = 1, then we can
take 𝔅 = (𝔄, c_{n+2}^𝔅)_{n∈N}, where c_{n+2}^𝔅 = n + 2 (or alternatively c_{n+2}^𝔅 = n; the fact
that several symbols may denote the same element is of no consequence).
Then c₅ + c₃ ≈ c₈, ¬[c₈ ≈ c₅], and ¬[c₇ · c₃ ≈ c₁₀] are all members of 𝒟𝔄.
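For a finite structure, a diagram in the sense of Definition 4.16 can be generated exhaustively. The sketch below uses our own string encoding of simple sentences: it names each element of a three-element ordered structure by a constant c_a and collects the equalities, inequalities, and relational facts that hold.

```python
U = [0, 1, 2]
R = {(a, b) for a in U for b in U if a < b}     # one binary relation symbol, <

diagram = set()
for a in U:
    for b in U:
        # equality facts: c_a = c_b or c_a != c_b, whichever is true
        diagram.add(f"c{a} = c{b}" if a == b else f"c{a} != c{b}")
        # relational facts: R c_a c_b or its negation, whichever is true
        diagram.add(f"R c{a} c{b}" if (a, b) in R else f"not R c{a} c{b}")

print("R c0 c1" in diagram)        # True:  0 < 1 holds
print("not R c2 c0" in diagram)    # True:  2 < 0 fails
print(len(diagram))                # 18 simple sentences in all
```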

Theorem 4.17. 𝔄 is isomorphic to a substructure of 𝔅 if and only if some
expansion of 𝔅 is a model of 𝒟𝔄 and τ𝔄 = τ𝔅.
PROOF: Suppose that g is an isomorphism of 𝔄 onto 𝔄′ ⊆ 𝔅. Let 𝔄⁺ be an
expansion of 𝔄 such that each a ∈ |𝔄| is the denotation of a constant
symbol c_a in 𝔄⁺. Let 𝔅⁺ be (𝔅, c_a^𝔅⁺)_{a∈|𝔄|}, where c_a^𝔅⁺ = g(c_a^𝔄⁺). For each
R, f ∈ τ𝔄 we have R^𝔄 a₁, ..., a_n iff R^𝔅 g(a₁), ..., g(a_n) iff R^𝔅⁺ c_{a₁}^𝔅⁺, ..., c_{a_n}^𝔅⁺;
and f^𝔄 a₁, ..., a_n = a₀ iff f^𝔅 g(a₁), ..., g(a_n) = g(a₀) iff f^𝔅⁺ c_{a₁}^𝔅⁺, ..., c_{a_n}^𝔅⁺ =
c_{a₀}^𝔅⁺. Also, c_{a₁}^𝔄⁺ = c_{a₂}^𝔄⁺ iff g(c_{a₁}^𝔄⁺) = g(c_{a₂}^𝔄⁺) iff c_{a₁}^𝔅⁺ = c_{a₂}^𝔅⁺. Hence
𝔅⁺ ∈ Mod 𝒟𝔄.
For the converse, suppose that 𝔅⁺ is an expansion of 𝔅 and 𝔅⁺ ∈
Mod 𝒟𝔄. For each a ∈ |𝔄| let c_a be the symbol in τ𝒟𝔄 denoting a. We
claim that the function g, defined by g(a) = c_a^𝔅⁺, is an isomorphism of 𝔄
onto Rng g and that Rng g ⊆ |𝔅|.
i. Rng g is the universe of a substructure of 𝔅: We must show that Rng g
is closed under f^𝔅 for all f ∈ τ𝔄. Let b₁, ..., b_n ∈ Rng g, say b_i = c_{a_i}^𝔅⁺. Let
a = f^𝔄 a₁, ..., a_n. Then c_a ≈ f c_{a₁} ⋯ c_{a_n} ∈ 𝒟𝔄 and so is true in 𝔅⁺, i.e.,
c_a^𝔅⁺ = f^𝔅⁺ c_{a₁}^𝔅⁺, ..., c_{a_n}^𝔅⁺. Thus c_a^𝔅⁺ = f^𝔅 b₁, ..., b_n and c_a^𝔅⁺ ∈ Rng g, as
needed.
ii. g is 1-1: If a₁ ≠ a₂, then c_{a₁} ≉ c_{a₂} ∈ 𝒟𝔄, so 𝔅⁺ ⊨ c_{a₁} ≉ c_{a₂} and c_{a₁}^𝔅⁺ ≠ c_{a₂}^𝔅⁺,
i.e., g(a₁) ≠ g(a₂).
iii. g preserves relations: R^𝔄 a₁, ..., a_n iff R c_{a₁} ⋯ c_{a_n} ∈ 𝒟𝔄 iff 𝔅⁺ ⊨
R c_{a₁} ⋯ c_{a_n} iff R^𝔅⁺ c_{a₁}^𝔅⁺, ..., c_{a_n}^𝔅⁺ iff R^𝔅 g(a₁), ..., g(a_n).
iv. g preserves functions: f^𝔄 a₁, ..., a_n = a iff f c_{a₁} ⋯ c_{a_n} ≈ c_a ∈ 𝒟𝔄 iff 𝔅⁺ ⊨
f c_{a₁} ⋯ c_{a_n} ≈ c_a iff f^𝔅⁺ c_{a₁}^𝔅⁺, ..., c_{a_n}^𝔅⁺ = c_a^𝔅⁺ iff f^𝔅 g(a₁), ..., g(a_n) = g(a).
Hence g(f^𝔄 a₁, ..., a_n) = f^𝔅 g(a₁), ..., g(a_n).
Thus we see that g is an isomorphism onto a substructure of 𝔅. □

Theorem 4.18. If 𝔄 is isomorphic to a substructure of 𝔅, then there is a
ℭ ≅ 𝔅 such that 𝔄 ⊆ ℭ.

PROOF: Let 𝔄 ≅_g 𝔄′ ⊆ 𝔅. Let h be a 1-1 function on |𝔅| such that h(b) =
g⁻¹(b) if b ∈ Rng g, and h(b) ∉ |𝔄| otherwise. Now define ℭ as follows:
|ℭ| = Rng h,
c^ℭ = c^𝔄 for every c ∈ τ𝔅,
R^ℭ d₁, ..., d_n iff R^𝔅 h⁻¹(d₁), ..., h⁻¹(d_n) for all R ∈ τ𝔅,
f^ℭ d₁, ..., d_n = h(f^𝔅 h⁻¹(d₁), ..., h⁻¹(d_n)) for all f ∈ τ𝔅.

Clearly, 𝔅 ≅_h ℭ and 𝔄 ⊆ ℭ, as needed. □

We shall use this theorem, often without explicit mention, to excuse the
writing of 𝔄 ⊆ 𝔅 when we mean that 𝔄 is isomorphic to a substructure of
𝔅.

Theorem 4.19. If 𝔄 ≅ 𝔅, then 𝔄 ≡ 𝔅. In fact, if 𝔄 ≅_g 𝔅, then for all formulas
φ and all z ∈ ^Vbl|𝔄|, 𝔄 ⊨ φ⟨z⟩ iff 𝔅 ⊨ φ⟨g∘z⟩.
PROOF: Let g be an isomorphism from 𝔄 to 𝔅. We first show that for all
z ∈ ^Vbl|𝔄| and all terms t,
g(t^𝔄⟨z⟩) = t^𝔅⟨g∘z⟩.  (*)
For t a variable or a constant symbol we have
g(v_n^𝔄⟨z⟩) = g(z(v_n)) = v_n^𝔅⟨g∘z⟩,
g(c^𝔄⟨z⟩) = g(c^𝔄) = c^𝔅⟨g∘z⟩.
Now suppose (*) is true for all terms t₁, ..., t_n, and let t be ft₁, ..., t_n. Then
g((ft₁, ..., t_n)^𝔄⟨z⟩) = g(f^𝔄(t₁^𝔄⟨z⟩, ..., t_n^𝔄⟨z⟩))
= f^𝔅(g(t₁^𝔄⟨z⟩), ..., g(t_n^𝔄⟨z⟩))
[since g is an isomorphism]
= f^𝔅(t₁^𝔅⟨g∘z⟩, ..., t_n^𝔅⟨g∘z⟩)
[since (*) holds for t₁, ..., t_n]
= (ft₁, ..., t_n)^𝔅⟨g∘z⟩ [by Definition 4.2].
This proves (*).
We now show by induction on formulas that for all z ∈ ^Vbl|𝔄| and all
formulas φ,
𝔄 ⊨ φ⟨z⟩ iff 𝔅 ⊨ φ⟨g∘z⟩.  (**)
We first prove (**) for φ atomic:
i. 𝔄 ⊨ [t₁ ≈ t₂]⟨z⟩ iff t₁^𝔄⟨z⟩ = t₂^𝔄⟨z⟩
iff g(t₁^𝔄⟨z⟩) = g(t₂^𝔄⟨z⟩) (since g is an isomorphism)
iff t₁^𝔅⟨g∘z⟩ = t₂^𝔅⟨g∘z⟩ [by (*)] iff 𝔅 ⊨ [t₁ ≈ t₂]⟨g∘z⟩.
ii. 𝔄 ⊨ Rt₁, ..., t_n⟨z⟩ iff R^𝔄 t₁^𝔄⟨z⟩, ..., t_n^𝔄⟨z⟩
iff R^𝔅 g(t₁^𝔄⟨z⟩), ..., g(t_n^𝔄⟨z⟩) (since g is an isomorphism)
iff R^𝔅 t₁^𝔅⟨g∘z⟩, ..., t_n^𝔅⟨g∘z⟩ [by (*)]
iff 𝔅 ⊨ [Rt₁, ..., t_n]⟨g∘z⟩ (by Definition 4.3).
Next we show that if (**) is true for φ₁ and φ₂ and all z, then it holds for
¬φ₁, φ₁ ∧ φ₂, and ∃vφ₁:
iii. 𝔄 ⊨ [¬φ₁]⟨z⟩ iff not 𝔄 ⊨ φ₁⟨z⟩ iff not 𝔅 ⊨ φ₁⟨g∘z⟩ (by the induction
assumption) iff 𝔅 ⊨ [¬φ₁]⟨g∘z⟩.
iv. 𝔄 ⊨ [φ₁ ∧ φ₂]⟨z⟩ iff 𝔄 ⊨ φ₁⟨z⟩ and 𝔄 ⊨ φ₂⟨z⟩ iff 𝔅 ⊨ φ₁⟨g∘z⟩ and 𝔅 ⊨
φ₂⟨g∘z⟩ (by our induction assumption) iff 𝔅 ⊨ [φ₁ ∧ φ₂]⟨g∘z⟩.
v. 𝔄 ⊨ ∃vφ₁⟨z⟩ iff for some a ∈ |𝔄|,
𝔄 ⊨ φ₁⟨z(v/a)⟩
iff for some a ∈ |𝔄|,
𝔅 ⊨ φ₁⟨g∘(z(v/a))⟩ [since we assume (**) for φ₁]
iff for some a ∈ |𝔄|,
𝔅 ⊨ φ₁⟨(g∘z)(v/g(a))⟩
iff 𝔅 ⊨ ∃vφ₁⟨g∘z⟩.
In the last equivalence, the implication from left to right is immediate
from the definition of satisfaction, while the implication from right to
left uses both the definition of satisfaction and the fact that g is onto.
This completes the proof of the second clause of the theorem. The first
clause follows immediately from the second. □
The converse of this theorem is false, as we shall see in the next section.
The converse of this theorem is false, as we shall see in the next section.

EXERCISES FOR §3.4


1. Prove Theorem 4.4.
2. If 𝔄 is finite and τ𝔄 is finite, then there is a σ such that 𝔅 ⊨ σ and τ𝔅 = τ𝔄
   iff 𝔄 ≅ 𝔅.
3. Suppose that no variable other than v occurs free in φ or ψ. Show that
   ⊨ ∀v[φ→ψ] → [∀vφ → ∀vψ], but for some φ and ψ, ⊭ [∀vφ → ∀vψ] → ∀v[φ→ψ] and
   ⊭ ¬[[∀vφ → ∀vψ] → ∀v[φ→ψ]].
4. Suppose that ψ is an assertion and that v is not free in ψ. Show that ⊨ [∃vφ → ψ] ↔
   ∀v[φ→ψ].
5. Show that two structures are isomorphic iff they have identical diagrams.
6. Given a structure 𝔄, describe a set of simple assertions Σ ⊆ 𝔇𝔄 such that any
   structure 𝔅 contains a homomorphic image of 𝔄 iff Σ ⊆ 𝔇𝔅.

3.5 Normal Forms


Different assertions can have the same meaning. For example,
∀v₁∃v₂∀v₃[Rv₁v₃ ↔ v₃ ≈ v₂] and ∀v₁∃v₂[Rv₁v₂] ∧ ∀v₁v₂v₃[Rv₁v₂ ∧ Rv₁v₃ →
v₂ ≈ v₃] both assert that R is a function. When investigating the properties
of the class of models of σ it is often helpful to find some assertion
equivalent to σ that has some convenient form. As we shall see in §3.9, the
algebraic properties of Mod σ and the form of assertions equivalent to σ
are closely related.

Definition 5.1.
i. If φ is a formula with free variables u₀, …, u_{n−1}, then ∀u₀, …, u_{n−1} φ is a
   universal closure of φ.
ii. Say that φ is valid, and write ⊨ φ, if φ has a valid universal closure.
   (Recall that an assertion σ is valid if 𝔄 ⊨ σ for all 𝔄 of type ⊇ τσ.)
   Of course, if φ has a valid universal closure, then any universal closure
   of φ is valid.
iii. We say that φ and ψ are equivalent if ⊨ φ ↔ ψ.

Unwinding the definitions, we see that ⊨ φ ↔ ψ iff for all 𝔄 of type
⊇ τ(φ↔ψ) and for all z ∈ Vbl|𝔄|,

    𝔄 ⊨ φ⟨z⟩ iff 𝔄 ⊨ ψ⟨z⟩.

From this it is clear that logical equivalence is an equivalence relation on
the class of formulas; i.e., for all formulas φ, ψ, ξ:

    ⊨ φ ↔ φ,
    ⊨ φ ↔ ψ implies ⊨ ψ ↔ φ,
    ⊨ φ ↔ ψ and ⊨ ψ ↔ ξ implies ⊨ φ ↔ ξ.

Theorem 5.2. If ⊨ φ ↔ φ′ and ξ′ results from ξ by replacing an occurrence of φ
with φ′, then ⊨ ξ ↔ ξ′.

PROOF: Let ⊨ φ ↔ φ′, and say that ξ has the property (*) if whenever ξ′ is
obtained from ξ by replacing an occurrence of φ with φ′, then ⊨ ξ ↔ ξ′.

Case i. Every atomic formula has property (*): Indeed, if ξ is atomic
and ξ′ is obtained from ξ by replacing an occurrence of φ with φ′, then
ξ = φ and ξ′ = φ′.

Case ii. Suppose ξ = ¬ψ and (*) holds for ψ. Let ξ′ be obtained from ξ
by replacing an occurrence of φ with φ′. If ξ = φ, then ξ′ = φ′ and we are
done. If ξ ≠ φ, then ξ′ = ¬ψ′, where ψ′ is obtained from ψ by replacing φ by
φ′. By assumption, ⊨ ψ ↔ ψ′, from which it follows that ⊨ ξ ↔ ξ′.

Case iii. Suppose ξ = ψ₁ ∧ ψ₂, where both ψ₁ and ψ₂ satisfy (*). Again, we
need only consider the case where ξ ≠ φ. Then if ξ′ is obtained from ξ by
replacing an occurrence of φ with φ′, then ξ′ has the form ψ₁′ ∧ ψ₂′, where ψᵢ′
is ψᵢ or is obtained from ψᵢ by replacing an occurrence of φ with φ′. By
assumption ⊨ ψᵢ ↔ ψᵢ′ for i = 1, 2. Thus for all 𝔄 of the appropriate type and
all z ∈ Vbl|𝔄|, we have 𝔄 ⊨ ξ⟨z⟩ iff [𝔄 ⊨ ψ₁⟨z⟩ and 𝔄 ⊨ ψ₂⟨z⟩] iff
[𝔄 ⊨ ψ₁′⟨z⟩ and 𝔄 ⊨ ψ₂′⟨z⟩] iff 𝔄 ⊨ ξ′⟨z⟩. Hence ⊨ ξ ↔ ξ′.
Case iv. Suppose ξ = ∃v ψ and ψ satisfies (*). The case ξ = φ being
immediate, we suppose that ξ′ = ∃v ψ′, where ψ′ results from ψ by replacing
an occurrence of φ in ψ with φ′. Then ⊨ ψ ↔ ψ′ by assumption, so

    𝔄 ⊨ ξ⟨z⟩ iff for some a ∈ |𝔄|, 𝔄 ⊨ ψ⟨z(a/v)⟩
        iff for some a ∈ |𝔄|, 𝔄 ⊨ ψ′⟨z(a/v)⟩
        iff 𝔄 ⊨ ξ′⟨z⟩.

Thus all formulas have the property (*), which proves the theorem. □
Lemma 5.3. Let s be an expression and u a variable that does not occur in s.
Let s′ be the result of replacing every occurrence in s of the variable v by u.
Then for all 𝔄 of type ⊇ τs and all z ∈ Vbl|𝔄|:

a. If s is a term, then

    s^𝔄⟨z⟩ = s′^𝔄⟨z(z(v)/u)⟩.

b. If s is a formula, then

    𝔄 ⊨ s⟨z⟩ iff 𝔄 ⊨ s′⟨z(z(v)/u)⟩.

PROOF: In what follows, if r is an expression then r′ denotes the result of
replacing v by u throughout r. Also, we abbreviate z(z(v)/u) by z′.

PROOF OF a is by induction on terms:

Case i. s is a variable. If s ≠ v then s = s′ and s^𝔄⟨z⟩ = z(s) = z′(s) =
s′^𝔄⟨z′⟩. If s = v then s′ = u and s^𝔄⟨z⟩ = z(v) = z′(u) = s′^𝔄⟨z′⟩.

Case ii. s is a constant symbol. Then s = s′ and s^𝔄⟨z⟩ = s^𝔄 = s′^𝔄⟨z′⟩.

Case iii. s = f t₀⋯t_{n−1}, where the tᵢ's are terms such that tᵢ^𝔄⟨z⟩ =
tᵢ′^𝔄⟨z′⟩. Then s^𝔄⟨z⟩ = f^𝔄(t₀^𝔄⟨z⟩, …, t_{n−1}^𝔄⟨z⟩) =
f^𝔄(t₀′^𝔄⟨z′⟩, …, t′_{n−1}^𝔄⟨z′⟩) = s′^𝔄⟨z′⟩. □
PROOF OF b is by induction on formulas:

Case i. s is an atomic formula. If s is t₀ ≈ t₁, then s′ is t₀′ ≈ t₁′, and
𝔄 ⊨ s⟨z⟩ iff t₀^𝔄⟨z⟩ = t₁^𝔄⟨z⟩ iff (by clause a of the lemma) t₀′^𝔄⟨z′⟩ =
t₁′^𝔄⟨z′⟩ iff 𝔄 ⊨ s′⟨z′⟩. A similar argument applies when s is R t₀⋯t_{n−1}.

Case ii. s is ¬φ, and the lemma is true for φ. Thus s′ is ¬φ′ and
𝔄 ⊨ s⟨z⟩ iff 𝔄 ⊭ φ⟨z⟩ iff 𝔄 ⊭ φ′⟨z′⟩ iff 𝔄 ⊨ s′⟨z′⟩.

Case iii. s is φ₁ ∧ φ₂, and the lemma holds for φ₁ and φ₂. Then
s′ = φ₁′ ∧ φ₂′ and

    𝔄 ⊨ s⟨z⟩ iff 𝔄 ⊨ φ₁⟨z⟩ and 𝔄 ⊨ φ₂⟨z⟩ iff
    𝔄 ⊨ φ₁′⟨z′⟩ and 𝔄 ⊨ φ₂′⟨z′⟩ iff 𝔄 ⊨ s′⟨z′⟩.

Case iv. s is ∃vⱼ φ, and the lemma holds for φ. Then s′ is either ∃u φ′ or
∃vⱼ φ′ according as vⱼ = v or not; write w for the bound variable of s′.
Hence the following are equivalent:

    𝔄 ⊨ s⟨z⟩.
    There is some a ∈ |𝔄| such that 𝔄 ⊨ φ⟨z(a/vⱼ)⟩.
    There is some a ∈ |𝔄| such that 𝔄 ⊨ φ′⟨(z(z(v)/u))(a/w)⟩.

Noting that

    (z(z(v)/u))(a/w) = z′(a/w),

we see that the last line is equivalent to 𝔄 ⊨ s′⟨z′⟩. This concludes the proof
of the lemma. □

Lemma 5.4. Let u be a variable that does not occur in φ, and let φ′ be the
result of replacing each bound occurrence of v by u. Then ⊨ φ ↔ φ′.

PROOF: The proof is by induction on formulas. We consider only the case
where φ has the form ∃vⱼ ψ under the assumption that the lemma holds for
ψ, leaving the remaining easier cases as an exercise.

If vⱼ ≠ v, then φ′ is ∃vⱼ ψ′, where ψ′ is the result of replacing each bound
occurrence of v by u in ψ. By assumption, ⊨ ψ ↔ ψ′. Hence, by Theorem 5.2,
⊨ φ ↔ φ′.

If vⱼ = v, then φ′ is ∃u ψ″, where ψ″ is the result of replacing every
occurrence of v by u in ψ. By the preceding lemma,

    𝔄 ⊨ ψ⟨z(a/v)⟩ iff 𝔄 ⊨ ψ″⟨z(a/u)⟩.

Hence, 𝔄 ⊨ φ⟨z⟩ iff for some a ∈ |𝔄|,

    𝔄 ⊨ ψ⟨z(a/v)⟩

iff

    for some a ∈ |𝔄|, 𝔄 ⊨ ψ″⟨z(a/u)⟩

iff 𝔄 ⊨ φ′⟨z⟩. Thus ⊨ φ ↔ φ′. □

Lemma 5.5.
i. ⊨ ¬∃vφ ↔ ∀v¬φ.
ii. ⊨ ¬∀vφ ↔ ∃v¬φ.
iii. If u is not free in ψ, then ⊨ [ψ ∧ ∃uφ] ↔ ∃u[ψ ∧ φ] and ⊨ [ψ ∧ ∀uφ] ↔
   ∀u[ψ ∧ φ].
iv. If u is not free in ψ, then ⊨ [ψ ∨ ∃uφ] ↔ ∃u[ψ ∨ φ] and ⊨ [ψ ∨ ∀uφ] ↔
   ∀u[ψ ∨ φ].

PROOF: The first two clauses follow easily from Definition 4.3. To prove
clause iii suppose 𝔄 ⊨ [ψ ∧ ∃uφ]⟨z⟩. Then 𝔄 ⊨ ψ⟨z⟩, and for some a ∈ |𝔄|,

    𝔄 ⊨ φ⟨z(a/u)⟩.

By Theorem 4.5,

    𝔄 ⊨ ψ⟨z(a/u)⟩,

since u is not free in ψ. Hence

    𝔄 ⊨ [ψ ∧ φ]⟨z(a/u)⟩,

and so 𝔄 ⊨ ∃u[ψ ∧ φ]⟨z⟩. This shows that ⊨ [ψ ∧ ∃uφ] → ∃u[ψ ∧ φ]. Since
⊨ ∃u[ψ ∧ φ] → [ψ ∧ ∃uφ] is obvious by Definition 4.3, we have the first half
of iii. The remaining half of clause iii has a similar proof, as does clause iv. □

Definition 5.6.
i. A formula is open if no quantifier occurs in it.
ii. A prenex normal form formula is a formula Qφ where φ is open and Q is
   a sequence Q₀u₀ ⋯ Q_{n−1}u_{n−1} with each Qᵢ ∈ {∀, ∃} and each uᵢ a
   variable. Q is called the prefix of Qφ, and φ the matrix of Qφ.

EXAMPLE. ∀v₂∃v₀∀v₁[Rv₂v₀ ∧ ¬Sv₁v₂] is a prenex normal form
formula with prefix ∀v₂∃v₀∀v₁ and matrix Rv₂v₀ ∧ ¬Sv₁v₂. The formula
[Rv₂v₀ ∧ ¬Sv₁v₂] is open.

Theorem 5.7. Every formula is logically equivalent to a prenex normal form
formula having exactly the same free variables.

PROOF: We use induction on formulas.

Case i. If ψ is atomic, then ψ is a prenex normal form formula.

Now suppose that φ and φ′ are logically equivalent to prenex normal
form formulas Qρ and Q′ρ′ respectively, where Q is Q₀u₀⋯Q_{n−1}u_{n−1}, Q′
is Q₀′u₀′⋯Q′_{n′−1}u′_{n′−1}, and ρ and ρ′ are open.

Case ii. ψ is ¬φ. Then ⊨ ψ ↔ ¬Qρ by Theorem 5.2. Let Qᵢ# be ∀ if Qᵢ is
∃, and let Qᵢ# be ∃ otherwise. Then repeated use of Lemma 5.5 i and ii
gives ⊨ ¬Qρ ↔ Q₀#u₀ ⋯ Q_{n−1}#u_{n−1} ¬ρ.

Case iii. ψ = φ ∧ φ′. By Lemma 5.4, we can suppose that no uᵢ occurs in
ρ′ and no uᵢ′ occurs in ρ. Iterating Lemma 5.5 iii and iv, we have ⊨ [φ ∧ φ′] ↔
QQ′[ρ ∧ ρ′]. This gives ⊨ ψ ↔ QQ′[ρ ∧ ρ′].

Case iv. ψ is ∃vφ. By Theorem 5.2, ⊨ ψ ↔ ∃vQρ, as needed. This
completes the proof. □
In what follows we usually ignore the fact that a formula φ has many
prenex normal form formulas equivalent to it, and speak of "the" prenex
normal form of φ. On the other hand, it is easy to see that our proof of the
existence of a prenex normal form formula equivalent to φ can be
modified to give an algorithm that provides a unique such formula.
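Such an algorithm can be sketched in Python. This is an illustrative simplification, not the book's construction: formulas use only ¬, ∧, ∃, quantifiers are pulled outward by the rules of Lemmas 5.4 and 5.5 after renaming bound variables apart, and the tuple encoding of formulas is an assumption of the sketch.

```python
import itertools

fresh = itertools.count(100)   # pool of variable indices assumed unused

def substitute(phi, v, u):
    """Replace every occurrence of variable v by u."""
    op = phi[0]
    if op == 'atom':
        return ('atom', phi[1], tuple(u if x == v else x for x in phi[2]))
    if op == 'not':
        return ('not', substitute(phi[1], v, u))
    if op == 'and':
        return ('and', substitute(phi[1], v, u), substitute(phi[2], v, u))
    # op == 'exists'; rename_apart guarantees the bound variable differs from v
    return ('exists', phi[1], substitute(phi[2], v, u))

def rename_apart(phi):
    """Give each quantifier its own fresh variable (justified by Lemma 5.4)."""
    op = phi[0]
    if op == 'atom':
        return phi
    if op == 'not':
        return ('not', rename_apart(phi[1]))
    if op == 'and':
        return ('and', rename_apart(phi[1]), rename_apart(phi[2]))
    u = next(fresh)
    return ('exists', u, substitute(rename_apart(phi[2]), phi[1], u))

def prenex(phi):
    """Return (prefix, open matrix); prefix entries are ('E'|'A', var)."""
    op = phi[0]
    if op == 'atom':
        return [], phi
    if op == 'exists':
        pre, m = prenex(phi[2])
        return [('E', phi[1])] + pre, m
    if op == 'not':
        pre, m = prenex(phi[1])      # Lemma 5.5 i-ii: negation flips the prefix
        return [('A' if q == 'E' else 'E', v) for q, v in pre], ('not', m)
    pre1, m1 = prenex(phi[1])        # op == 'and'; sound once variables are
    pre2, m2 = prenex(phi[2])        # renamed apart (Lemma 5.5 iii-iv)
    return pre1 + pre2, ('and', m1, m2)

# not exists x R(x, y)   becomes   forall x' not R(x', y)
phi = ('not', ('exists', 'x', ('atom', 'R', ('x', 'y'))))
print(prenex(rename_apart(phi)))
# ([('A', 100)], ('not', ('atom', 'R', (100, 'y'))))
```

Choosing fresh variables deterministically from a counter is what makes the output unique, as the remark above requires.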
Definition 5.8. A formula is a disjunctive normal form formula if it has the
form

    (φ₁,₁ ∧ φ₁,₂ ∧ ⋯ ∧ φ₁,ₙ₁) ∨ (φ₂,₁ ∧ φ₂,₂ ∧ ⋯ ∧ φ₂,ₙ₂) ∨ ⋯
        ∨ (φₖ,₁ ∧ φₖ,₂ ∧ ⋯ ∧ φₖ,ₙₖ),

where each φᵢ,ⱼ is an atomic formula or the negation of an atomic formula.
A conjunctive normal form formula is defined analogously except that the
symbols ∧ and ∨ are interchanged.

Theorem 5.9. Let ψ₁, ψ₂, …, ψₙ be an enumeration of the atomic subformulas
occurring in the open formula ψ. Then ψ is equivalent to a disjunctive normal
form formula

    (φ₁,₁ ∧ φ₁,₂ ∧ ⋯ ∧ φ₁,ₙ) ∨ (φ₂,₁ ∧ φ₂,₂ ∧ ⋯ ∧ φ₂,ₙ) ∨ ⋯
        ∨ (φₖ,₁ ∧ φₖ,₂ ∧ ⋯ ∧ φₖ,ₙ),

where φᵢ,ⱼ ∈ {ψⱼ, ¬ψⱼ}. The analogous statement for conjunctive normal forms
is also true.

PROOF: Let φᵢ (i ≤ k) be an enumeration of all those formulas of the form

    φᵢ,₁ ∧ φᵢ,₂ ∧ ⋯ ∧ φᵢ,ₙ    with φᵢ,ⱼ ∈ {ψⱼ, ¬ψⱼ}

such that

    ⊨ φᵢ → ψ.

We claim that

    ⊨ φ₁ ∨ ⋯ ∨ φₖ ↔ ψ.

If not, then there is some 𝔄 and z ∈ Vbl|𝔄| such that

    𝔄 ⊨ ¬[φ₁ ∨ ⋯ ∨ φₖ]⟨z⟩ and 𝔄 ⊨ ψ⟨z⟩.

Let φ* = φ₁* ∧ φ₂* ∧ ⋯ ∧ φₙ*, where φⱼ* is ψⱼ if 𝔄 ⊨ ψⱼ⟨z⟩ and is ¬ψⱼ
otherwise. Then 𝔄 ⊨ φ*⟨z⟩. Moreover, since ψ is open, the truth value of ψ
in any model with any assignment depends only on the truth values of the
ψⱼ. Hence ⊨ φ* → ψ. But then φ* is one of the φᵢ's and 𝔄 ⊨ ¬[φ₁ ∨ ⋯ ∨ φₖ]⟨z⟩,
a contradiction. Hence ⊨ φ₁ ∨ ⋯ ∨ φₖ ↔ ψ. □
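The proof is in effect the truth-table method: each satisfying row of ψ contributes one disjunct. A minimal propositional sketch in Python (encoding ψ as a predicate on rows is an assumption of the example, not the book's formalism):

```python
from itertools import product

def dnf(psi, atoms):
    """Disjunctive normal form by the truth-table method of Theorem 5.9.
    psi: a function from a truth assignment (dict) to bool;
    atoms: names of the atomic subformulas psi_1, ..., psi_n."""
    disjuncts = []
    for values in product([True, False], repeat=len(atoms)):
        row = dict(zip(atoms, values))
        if psi(row):                  # this row yields one of the phi_i
            disjuncts.append([a if row[a] else ('not', a) for a in atoms])
    return disjuncts                  # [] means psi is unsatisfiable

# psi = [p and q] -> r, written as a Python predicate
rows = dnf(lambda v: (not (v['p'] and v['q'])) or v['r'], ['p', 'q', 'r'])
for conj in rows:
    print(conj)
```

Applying the same routine to ¬ψ and negating the result is exactly the hint for conjunctive normal forms in Exercise 5 below.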
The last normal form we shall discuss is one of the most useful. A
formula in this normal form is universal.

Definition 5.10. Say that ∀u₁, …, uₙ φ(f(u₁, …, uₙ, w₁, …, wₖ)/u) is a one step
Skolemization of ψ if ψ = ∀u₁, …, uₙ ∃u φ, and f is a function symbol not
occurring in φ, and w₁, …, wₖ is a list of the free variables of ψ. If no
variable is free in ψ, and ψ has the form ∃u φ, and c is a constant symbol
not occurring in ψ, then φ(c/u) is a one step Skolemization.
For example, ∀v₁[[v₁ < f(v₁,v₃)] ∧ [f(v₁,v₃) < v₃]] is a one step
Skolemization of ∀v₁∃v₂[[v₁ < v₂] ∧ [v₂ < v₃]].

Notice that iterating one step Skolemizations will lead to a universal
formula provided that we begin with a prenex normal form formula.

Definition 5.11. Let φ₁, φ₂, …, φₙ be a sequence of formulas such that
i. φ₁ is a prenex normal form of ψ,
ii. φₙ is universal,
iii. φᵢ₊₁ is a one step Skolemization of φᵢ for each i = 1, …, n−1.
Then φₙ is said to be a Skolem normal form of ψ, which we shall often
abbreviate φₙ ∈ Sk(ψ).

For example, if ψ is ∀v₁v₂[[v₁ < v₂] → ∃v₃[[v₁ < v₃] ∧ [v₃ < v₂]]], then
∀v₁v₂∃v₃[[v₁ < v₂] → [[v₁ < v₃] ∧ [v₃ < v₂]]] is a prenex normal form of
ψ and ∀v₁v₂[[v₁ < v₂] → [[v₁ < f(v₁,v₂)] ∧ [f(v₁,v₂) < v₂]]] is a Skolem
normal form of ψ.
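Iterated one step Skolemization over a prenex prefix can be sketched as follows. This is an illustrative toy, not the book's formalism: formulas are nested tuples, Skolem function symbols f0, f1, … are assumed fresh, and each Skolem term collects the free variables together with the universal variables seen so far (the argument order differs harmlessly from Definition 5.10).

```python
import itertools

def subst(m, v, term):
    """Replace variable v by term throughout a nested-tuple matrix."""
    if m == v:
        return term
    if isinstance(m, tuple):
        return tuple(subst(x, v, term) for x in m)
    return m

def skolemize(prefix, matrix, free_vars=()):
    """Iterate one step Skolemizations over a prenex formula.
    prefix: list of ('A'|'E', var).  Returns the surviving universal
    prefix and the transformed (still quantifier-free) matrix."""
    names = itertools.count()
    universals = list(free_vars)      # Skolem terms depend on these too
    out_prefix = []
    for q, v in prefix:
        if q == 'A':
            out_prefix.append(v)
            universals.append(v)
        else:                         # 'E': replace v by f_i(universals)
            term = ('f%d' % next(names), tuple(universals))
            matrix = subst(matrix, v, term)
    return out_prefix, matrix

# forall v1 exists v2 [v1 < v2 and v2 < v3], with v3 free
prefix, matrix = skolemize(
    [('A', 'v1'), ('E', 'v2')],
    ('and', ('<', 'v1', 'v2'), ('<', 'v2', 'v3')),
    free_vars=('v3',))
print(prefix)   # ['v1']
print(matrix)
```

The resulting formula is universal, as the remark preceding Definition 5.11 promises.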

Theorem 5.12. Let φ ∈ Sk(ψ). Then for every 𝔄 there is an expansion 𝔅 such
that for all z ∈ Vbl|𝔄|

    𝔄 ⊨ ψ⟨z⟩ iff 𝔅 ⊨ φ⟨z⟩.

Furthermore, if z ∈ Vbl|𝔄| and 𝔄 ⊨ φ⟨z⟩, then 𝔄 ⊨ ψ⟨z⟩.

PROOF: Clearly it is enough to show that the conclusion of the theorem
holds when φ is a one step Skolemization of ψ. Let w₁, w₂, …, wₖ be the
variables that occur free in ψ. Then φ has the form
∀u₁, …, uₙ ξ(f(u₁, …, uₙ, w₁, …, wₖ)/u), where ψ is ∀u₁, …, uₙ ∃u ξ and f does
not occur in ψ. For each ā = (a₁, …, aₙ₊ₖ) ∈ ⁿ⁺ᵏ|𝔄| let

    X_ā = { d ∈ |𝔄| : 𝔄 ⊨ ξ⟨z⟩ for the assignment z sending
           u₁, …, uₙ, w₁, …, wₖ, u to a₁, …, aₙ, aₙ₊₁, …, aₙ₊ₖ, d }.

Let a* be some fixed element of |𝔄|. Now we let 𝔅 = (𝔄, f^𝔅), where
f^𝔅(ā) ∈ X_ā if X_ā ≠ ∅, and f^𝔅(ā) = a* otherwise. (Notice that the existence
of such an f requires the axiom of choice; see Exercise 9.) Clearly, for any
assignment z to 𝔄

    𝔄 ⊨ ψ⟨z⟩ iff 𝔅 ⊨ φ⟨z⟩.

The second part of the theorem is obvious. □
Definition 5.13. Let ∀u₁, …, uₙ ξ be a Skolem normal form for satisfiability
of ¬ψ, where ξ is quantifier free. Then ∃u₁, …, uₙ ¬ξ is a Skolem normal
form for validity of ψ.

Notice that a Skolem normal form for validity is existential.


Theorem 5.14. Let ψ* be a Skolem normal form for the validity of ψ. Then ψ
is valid iff ψ* is valid.

PROOF: Let ψ* be ∃u₁, …, uₙ ¬ξ with ξ quantifier free. The following
statements are equivalent:

    ψ is not valid.
    𝔄 ⊨ ¬ψ⟨z⟩ for some 𝔄 and z ∈ Vbl|𝔄|.
    𝔅 ⊨ ∀u₁, …, uₙ ξ⟨z⟩ for some 𝔅 and z ∈ Vbl|𝔅|.
    ψ* is not valid.

The equivalence of the second and third statements follows from Theorem
5.12. □

EXERCISES FOR §3.5


1. Complete the proof of Lemma 5.4.
2. Show that the stipulation that u is not free in ψ is necessary in parts iii and iv of
   Lemma 5.5.
3. Complete the proof of Lemma 5.5.
4. Find a prenex normal form formula that is equivalent to ¬∀x[Rxy] ∨ [∃y Sxy].
5. Prove Theorem 5.9 for conjunctive normal forms. (Hint: Apply Theorem 5.9 to
   ¬ψ.)
6. Give the disjunctive normal form of [Ruv ∧ Sv] → [Su ∧ u ≈ v].
7. Give a Skolem normal form for satisfiability of ∀v₁∃v₂[Rv₁v₂v₃ ∧
   ∃v₄[f(v₂,v₃) ≈ v₄]].
8. Give a Skolem normal form for validity of the formula in Exercise 7.
9. Prove the axiom of choice from Theorem 5.12.
10. An atomic formula is simple if it has the form

        R u₀⋯u_{n−1}    or    f u₀⋯u_{n−1} ≈ uₙ,

    where each uᵢ is either a constant symbol or a variable. Show that every atomic
    formula is equivalent to an existential formula of the form

        ∃w₀, …, w_{m−1}[φ₀ ∧ ⋯ ∧ φ_{k−1}]

    of the same type and with the same free variables where each φᵢ is simple.

3.6 The Compactness Theorem


The most studied language of mathematical logic is the first order predi-
cate calculus. It has considerable expressive power combined with a highly
tractable theory. It is easy to find languages with greater expressive power,
but usually the theory of such languages is far less rich.
One of the properties that makes the first order predicate calculus so
amenable is the finitary character of satisfaction. An infinite set of
assertions has a model whenever each finite subset has a model. This fact is
known as the compactness theorem (for reasons spelled out in Exercise 9)
and is another one of Gödel's extraordinary achievements. The theorem is
frequently used to construct models that have a wealth of desirable
properties from the knowledge that finite subsets of the properties have
models. Sometimes such a set of properties may be so demanding that a
model for it may appear paradoxical. In this section we shall present
several of these examples. Here we state the compactness theorem, but the
proof will not be given until §3.7.

Theorem 6.1 (Compactness Theorem). Let Σ be a set of assertions of L. If
each finite subset of Σ has a model, then Σ has a model; in fact Σ has a
model of cardinality ≤ ω + cΣ.

Here we shall examine several consequences of this theorem. In §3.8


several of the examples discussed here will be sharpened and generalized.

EXAMPLE 6.2. A model of arithmetic with an infinite number: Let 𝔄 =
(N, +, ·, <, 0, 1, 2, …), where n = cₙ^𝔄 and +, ·, < are the symbols denoting
+, ·, < respectively. Let Σ = Th𝔄 ∪ {cₙ < c_ω : n ∈ ω}. We claim that every
finite subset Σ′ of Σ has a model. Indeed, if n* = max{n : cₙ occurs in Σ′},
then 𝔅 ∈ Mod Σ′, where 𝔅 = (𝔄, c_ω^𝔅) and n* + 1 = c_ω^𝔅. For clearly, 𝔅 satisfies
those assertions of Σ′ of the form cₙ < c_ω, and since the other members of
Σ′ belong to Th𝔄, 𝔅 satisfies them also.

Thus by compactness, there is a model ℭ of Σ (and in fact a countable
model). By Theorem 4.17 and Lemma 5.4 we can assume 𝔄 ⊆ ℭ↾τ𝔄 with
cₙ^ℭ = n. ℭ has an infinite element c_ω^ℭ, infinite in the sense that n <^ℭ c_ω^ℭ for all
n ∈ N. In fact ℭ has an infinity of infinite elements. Indeed, since ℭ ∈
Mod Th𝔄, we have ℭ ⊨ ∀v₀v₁v₂[v₀ < v₁ → v₀ < v₁ + v₂]. Thus n <^ℭ c_ω^ℭ + m for
all m, n ∈ N, and so all elements of ℭ of the form c_ω^ℭ + m are infinite.
Moreover, if l ≠ k, then c_ω^ℭ + l ≠ c_ω^ℭ + k, since ∀v₀v₁v₂[v₀ + v₁ ≈ v₀ + v₂ →
v₁ ≈ v₂] is in Th𝔄 and so in Th ℭ.

Since 𝔄 ≡ ℭ↾τ𝔄 but 𝔄 ≇ ℭ↾τ𝔄, we see that the class of structures that
are isomorphic to 𝔄 is not elementary; hence the converse of Theorem 4.19
fails.
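The finite-satisfiability step of this example is entirely concrete and can be simulated: given the indices n of the finitely many assertions cₙ < c_ω in Σ′, interpret c_ω as n* + 1 in the standard model. A sketch (encoding Σ′ by its set of indices is an assumption of the example):

```python
def witness_for(finite_subset):
    """Example 6.2's argument, made concrete for the assertions c_n < c_w.
    finite_subset: the indices n such that 'c_n < c_w' is in Sigma'.
    Returns an interpretation of c_w in the standard model (N, +, *, <)
    satisfying them all; the members of Sigma' drawn from Th(N, ...)
    hold automatically, since the expansion leaves (N, ...) untouched."""
    n_star = max(finite_subset, default=0)
    c_w = n_star + 1
    assert all(n < c_w for n in finite_subset)
    return c_w

print(witness_for({0, 3, 17}))   # 18
```

No single value of c_ω works for all of Σ at once; that is exactly why the full model ℭ must be non-standard.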

EXAMPLE 6.3. A model of the reals with infinite numbers and
infinitesimals: Let 𝔄 be the ordered field of real numbers, 𝔄 =
(R, +, ·, <, r)_{r∈R}, where every real is represented by a constant, say
r = c_r^𝔄, and +, ·, < denote +, ·, < respectively. Let Σ = Th𝔄 ∪ {c_r < c : r ∈ R},
where c is a constant symbol different from the c_r's. We claim that every
finite subset of Σ has a model. For if Σ′ ⊆ Σ and Σ′ is finite, then only
finitely many constant symbols c_r occur in Σ′, say c_{r₁}, c_{r₂}, …, c_{rₙ}. Let
r* = 1 + max{r₁, r₂, …, rₙ}. Take 𝔅 to be (𝔄, c^𝔅), where c^𝔅 = r*. Clearly
𝔅 ∈ Mod Σ′, thus proving the claim. Hence by compactness, Σ has a model ℭ.

By Theorems 4.17 and 4.18 we can assume that 𝔄 ⊆ ℭ↾τ𝔄, so that
c_r^ℭ = r. Also 𝔄 ≡ ℭ↾τ𝔄. However, 𝔄 ≠ ℭ↾τ𝔄, and in fact c^ℭ is an infinite
number in ℭ in the sense that c^ℭ > c_r^ℭ for all r ∈ R. Since ℭ is a field, every
element x in |ℭ| other than 0 has a multiplicative inverse, which we call
1/x. Since 0 < r < c^ℭ for every positive r ∈ R, it follows that

    0 < 1/c^ℭ < 1/(n+1)

for all n ∈ N; in this sense 1/c^ℭ is infinitesimal.

EXAMPLE 6.4. Let Γ be a set of sentences such that for every n ∈ ω there is
an 𝔄 ∈ Mod Γ whose universe has cardinality ≥ n. Then there is a 𝔅 ∈
Mod Γ whose universe is infinite. For let Σ = Γ ∪ {¬[dₘ ≈ dₙ] : m < n < ω},
where the d's are distinct constant symbols none of which occur in Γ. Let
Σ′ be a finite subset of Σ with n* = max{n : dₙ occurs in Σ′}. By assumption
there is an 𝔄 ∈ Mod Γ with c|𝔄| ≥ n*. Let a₀, …, a_{n*−1} be distinct elements in
|𝔄|. Expand 𝔄 to 𝔅 = (𝔄, d₀^𝔅, d₁^𝔅, …, d^𝔅_{n*−1}), where dᵢ^𝔅 = aᵢ. Then clearly
𝔅 ∈ Mod Σ′. Hence, every finite subset of Σ has a model, and so by
compactness Σ has a model. But every model ℭ of Σ is infinite, since
ℭ ⊨ ¬[dₘ ≈ dₙ] for all m < n < ω. Moreover, if ℭ ∈ Mod Σ, then ℭ↾τΓ ∈ Mod Γ,
as needed.

Hence we see that the class of all finite groups is not an elementary
class, and the same is true of the class of all finite rings and the class of all
finite fields.

EXAMPLE 6.5. The class of well-ordered structures is not an elementary
class. For let Γ be any set of assertions all of which are true in every
well-ordered structure. We must show that Mod Γ has a member that is not
well-ordered. Let Σ = Γ ∪ {d_{n+1} < dₙ : n ∈ ω}. Every finite subset Σ′ of Σ has
a model; indeed, any infinite model of Γ [such as (ω, <)] can obviously be
expanded to a model of Σ′. Hence by compactness Σ has a model. But no
model 𝔄 of Σ is well ordered, since {dₙ^𝔄 : n ∈ ω} has no least member (with
respect to <^𝔄) in 𝔄. Hence 𝔄↾{<} is a model of Γ which is not well
ordered.

EXAMPLE 6.6. Let ZF be the set of axioms for Zermelo-Fraenkel set theory
(given in Section 1.11). Included in ZF is the axiom of regularity. But in
spite of regularity, if ZF has a model, then ZF has a model 𝔄 = (A, e^𝔄) in
which there are elements c₀^𝔄, c₁^𝔄, … such that c^𝔄_{n+1} e^𝔄 cₙ^𝔄 for n = 0, 1, 2, ….
Indeed, given n ∈ ω and a model 𝔅 of ZF, 𝔅 can be expanded to a model
of ZF ∪ {cₙ e c_{n−1}, c_{n−1} e c_{n−2}, …, c₁ e c₀}. Hence every finite subset of
ZF ∪ {c_{n+1} e cₙ : n ∈ ω} has a model. By compactness, ZF ∪ {c_{n+1} e cₙ : n ∈ ω}
has a model, say 𝔄. The regularity axiom is true in 𝔄, yet 𝔄 ⊨ c_{n+1} e cₙ for all
n ∈ ω. Of course this means that {cₙ^𝔄 : n ∈ ω} is not a set in 𝔄, i.e., there is
no a ∈ |𝔄| such that for all b ∈ |𝔄|, b e^𝔄 a iff b = cₙ^𝔄 for some n ∈ ω.

EXAMPLE 6.7 (A countable model of the reals). Let 𝔄 be the field of reals,
𝔄 = (R, +, ·). Th𝔄 is countable and so has a countable model. Indeed, if 𝔅
is any expansion of 𝔄 with countable type, then Th𝔅 is countable and so
has a countable model.

EXAMPLE 6.8 (Skolem's paradox). If Zermelo-Fraenkel set theory has a
model, then it has a countable model. The reason that this fact at first
glance seems paradoxical is that one can prove from the Zermelo-Fraenkel
axioms the existence of uncountable sets, i.e., sets x such that there is no
function mapping x 1-1 into ω. This can be written as an assertion of L. Hence
this assertion is true in any countable model. Yet for any x in a countable
model 𝔄 the cardinality of {y : y e^𝔄 x} is countable. However, this only
means that in |𝔄| there are sets x such that for some y ∈ |𝔄|, y is a 1-1
function on x onto a proper subset of x, in the sense of 𝔄, but for no
z ∈ |𝔄| is z a function on x 1-1 into ω, in the sense of 𝔄.

EXERCISES FOR §3.6


1. 𝔅 is a non-standard model of arithmetic if 𝔅 ≡ (N, +, ·) but 𝔅 ≇ (N, +, ·).
   Show that every non-standard model of arithmetic has an infinite number.
2. Let 𝔅 be a non-standard model of arithmetic.
   (a) Show that 𝔅 has an infinite prime number.
   (b) Show that there is a k ∈ |𝔅| such that n divides k for all n ∈ ω.
3. A famous open question of number theory asks if there are infinitely many
   primes p such that p + 2 is also a prime. Show that the answer is yes iff every
   non-standard model of arithmetic has an infinite prime p such that p + 2 is a
   prime.
4. Let 𝔅 be a non-standard model of arithmetic. Show that there are b₀, b₁, b₂, …
   all in |𝔅| such that b_{i+1} < bᵢ for all i ∈ N. (x < y means there is a c ∈ |𝔅| such
   that y = x +^𝔅 c.) Hence 𝔅 is not well ordered.
5. If a, b ∈ R and a > 0, then there is an n ∈ N such that an > b. This is the
   Archimedean property for the real number field. Let 𝔅 be a non-standard
   model of the real number field, i.e., let 𝔅 ≡ (R, +, ·) and 𝔅 ≇ (R, +, ·). Show
   that 𝔅 is non-Archimedean.

6. Let 𝔅 be a non-standard model of 𝔄, where 𝔄 = (R, Q, I, N, +, ·), with Q the
   set of rationals and I the set of integers. So 𝔅 ≡ 𝔄 but 𝔅 ≇ 𝔄. Suppose also that
   𝔅 ⊇ 𝔄. Then Q^𝔅 is uncountable. [Hint: For every r, e ∈ |𝔄|, if e > 0, then there is
   a q ∈ Q such that q ∈ (r − e, r + e). Consider the meaning of this in 𝔅 when e is
   infinitesimal but r is standard, i.e., in |𝔄|. If r₁ ≠ r₂ and q₁ ∈ (r₁ − e, r₁ + e) and
   q₂ ∈ (r₂ − e, r₂ + e), then q₁ ≠ q₂.]

7. To the definition of L formulas (Definition 2.1d) add

   v. if φ is a formula and v is a variable, then Qvφ is a formula.

   To the definition of satisfaction (Definition 4.3) add

   v. φ = Qvψ, and {a ∈ |𝔄| : 𝔄 ⊨ ψ(a)} is infinite.

   Show that this new language is not compact.
   Remark: If we replace clause v by

   v*. φ = Qvψ, and c{a ∈ |𝔄| : 𝔄 ⊨ ψ(a)} > ω,

   then we get a language that is countably compact in the sense that any
   countable finitely satisfiable set of assertions is satisfiable.
8. In Definition 2.1d change clause iii to

   iii*. if α ≤ ω and {φᵢ : i < α} ⊆ X, then ⋀{φᵢ : i < α} ∈ X and ⋁{φᵢ : i < α} ∈ X.

   In Definition 4.3 change clause iii to

   iii*. φ = ⋀{φᵢ : i < α} and 𝔄 ⊨ φᵢ for each i < α, or φ = ⋁{φᵢ : i < α} and 𝔄 ⊨ φᵢ for
   some i < α.

   (a) Show that there is an assertion σ in this new language such that 𝔄 ⊨ σ and
   τ𝔄 = {<} iff 𝔄 ≅ (N, <).
   (b) Conclude that this new language is not compact.
9. This problem is for those who have some knowledge of point set topology. The
   problem is to show that compactness in the sense of Theorem 6.1 is equivalent
   to the compactness of some topological space.
   Let s be a similarity type, and let T be the set of all structures of type s and
   of cardinality ≤ cs + ω. For each 𝔄 ∈ T we let 𝔄* = {𝔅 ∈ T : 𝔄 ≡ 𝔅}. Now let
   T* = {𝔄* : 𝔄 ∈ T}. For each Σ of type s let F_Σ = {𝔄* : 𝔄 ∈ Mod Σ}.
   (a) Show that 𝔉 = {F_Σ : τΣ ⊆ s} is a base for closed sets for a topology on T*,
   i.e., show that ∅ ∈ 𝔉 and T* ∈ 𝔉, and if K ⊆ 𝔉, then ⋂K ∈ 𝔉.
   (b) 𝔉 gives a T₂ topology, i.e., if 𝔄* ≠ 𝔅*, then there are X, Y such that
   X, Y ∈ 𝔉 and 𝔄* ∈ X, 𝔅* ∈ Y, and X ∩ Y = ∅.
   (c) Derive the compactness of 𝔉 from Theorem 6.1 and conversely. (To say
   that 𝔉 is compact means that if 𝔉′ ⊆ 𝔉 and if ⋂K ≠ ∅ whenever K is a
   finite subset of 𝔉′, then ⋂𝔉′ ≠ ∅.)

3.7. Proof of the Compactness Theorem


Although this section is devoted to a proof of the compactness theorem,
several of the lemmas are of independent interest.

Definition 7.1. Say that Σ is finitely satisfiable if every finite subset of Σ has
a model.
Lemma 7.2. Let σ be of type τΣ. If Σ is finitely satisfiable, then either
Σ ∪ {σ} is finitely satisfiable or Σ ∪ {¬σ} is finitely satisfiable.

PROOF: Suppose that neither Σ ∪ {σ} nor Σ ∪ {¬σ} is finitely satisfiable.
Then there are finite subsets Σ₁ and Σ₂ of Σ such that Σ₁ ∪ {σ} and
Σ₂ ∪ {¬σ} have no models. But then Σ₁ ∪ Σ₂ is a finite subset of Σ having
no models, since any model of Σ₁ ∪ Σ₂ must be a model of σ or of ¬σ. □
Definition 7.3. We say that Σ is complete if for all σ of type τΣ, either σ ∈ Σ
or ¬σ ∈ Σ.

In the next lemma we need a well ordering of the set of all assertions of
some fixed type. At first glance this would seem to require the use of the
well ordering principle or some other version of the axiom of choice. But
this can be avoided by identifying the symbols with ordinals (as suggested
in Section 2.10) and then well-ordering the class of expressions by defining
r₀r₁⋯r_{n−1} < s₀⋯s_{m−1} if either n < m, or n = m and rⱼ ∈ sⱼ, where j is the
least k such that rₖ ≠ sₖ. Hence, using Theorem 10.9 of Part I, we see that
any set of expressions can be indexed by an ordinal. The axiom of choice
appears implicitly in Lemmas 7.6 and 7.7, but again can be avoided using
these devices.

Lemma 7.4. If Σ is finitely satisfiable, then there is a Γ such that

i. Γ ⊇ Σ,
ii. τΓ = τΣ,
iii. Γ is complete,
iv. Γ is finitely satisfiable.

PROOF: By the remark above, we can assume that the set of assertions of
type τΣ is well ordered; say {σ_α : α ∈ β} is this set, where β is some ordinal.
Now define

    Σ₀ = Σ.
    Σ_{α+1} = Σ_α ∪ {σ_α} if Σ_α ∪ {σ_α} is finitely satisfiable,
    Σ_{α+1} = Σ_α ∪ {¬σ_α} otherwise, for α < β.
    Σ_δ = ⋃_{γ∈δ} Σ_γ if δ = ⋃δ ≤ β.

By induction on α and Lemma 7.2, it is easy to see that each Σ_α satisfies
conditions i, ii, and iv of the theorem. Let Γ = Σ_β. We need only observe
that Γ is complete: Let τσ ⊆ τΣ. Then σ = σ_α for some α < β. By definition
of Σ_{α+1}, σ_α ∈ Σ_{α+1} or ¬σ_α ∈ Σ_{α+1}. Since Σ_{α+1} ⊆ Γ, we have σ ∈ Γ or
¬σ ∈ Γ. Hence Γ is complete as needed. □
Notice that this lemma is an immediate consequence of the compactness
theorem. For if every finite subset of Σ has a model, then so does Σ. Let
𝔄 ∈ Mod Σ. Then Th𝔄 is complete and Th𝔄 ⊇ Σ. However, we want to use
this lemma to prove the compactness theorem, and so we need a proof that
does not use the compactness theorem.
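The transfinite construction of Lemma 7.4 has a finite propositional analogue that can be run directly: decide each atom (rather than each assertion) in turn, keeping the set satisfiable; in the propositional case this already fixes the truth value of every formula. A sketch (the tuple encoding of formulas and the brute-force satisfiability check are assumptions of the example; Lemma 7.2 is what guarantees one of the two choices always works):

```python
from itertools import product

def holds(phi, v):
    """Evaluate a propositional formula: 'p', ('not', f), ('and', f, g)."""
    if isinstance(phi, str):
        return v[phi]
    if phi[0] == 'not':
        return not holds(phi[1], v)
    return holds(phi[1], v) and holds(phi[2], v)

def atoms_of(phi, acc):
    if isinstance(phi, str):
        acc.add(phi)
    else:
        for part in phi[1:]:
            atoms_of(part, acc)
    return acc

def satisfiable(sigma):
    """Brute force over all truth assignments to the atoms of sigma."""
    atoms = sorted(set().union(*(atoms_of(p, set()) for p in sigma))) if sigma else []
    return any(all(holds(p, dict(zip(atoms, vals))) for p in sigma)
               for vals in product([True, False], repeat=len(atoms)))

def complete(sigma, all_atoms):
    """Finite analogue of Lemma 7.4: extend sigma by p or ('not', p)
    for each atom p, preserving satisfiability at every step."""
    gamma = list(sigma)
    for p in all_atoms:
        if satisfiable(gamma + [p]):
            gamma.append(p)
        else:
            gamma.append(('not', p))
    return gamma

print(complete([('not', ('and', 'p', 'q'))], ['p', 'q']))
```

The first-order construction replaces the list of atoms by a well-ordered list of all assertions and takes unions at limit stages.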

Definition 7.5. The constant symbol c is a witness for ∃vφ in Σ if

    φ(c/v) ∈ Σ.

Lemma 7.6. If Σ is finitely satisfiable, then there is an Ω such that:

i. Σ ⊆ Ω.
ii. If ∃vφ ∈ Σ, then ∃vφ has a witness in Ω.
iii. cΩ ≤ cΣ + ω.
iv. Ω is finitely satisfiable.

PROOF: Suppose that Σ is finitely satisfiable. Let g be a 1-1 function on Σ
into the constant symbols not in τΣ. For each σ ∈ Σ let

    σ* = φ(g(σ)/v)

if σ = ∃vφ for some v and φ, and let σ* = σ otherwise. Let Ω = Σ ∪ {σ* : σ ∈
Σ}. Clearly Ω satisfies i, ii, and iii. Let Ω′ be a finite subset of Ω, say
Ω′ = Σ′ ∪ {σᵢ* : i < n}, where Σ′ ⊆ Σ and σᵢ = ∃vᵢφᵢ for each i < n. Since Σ is
finitely satisfiable, Σ′ ∪ {σᵢ : i < n} has a model 𝔄. For each i < n choose
aᵢ ∈ |𝔄| such that 𝔄 ⊨ φᵢ(aᵢ). Now let 𝔅 = (𝔄, a₀, …, a_{n−1}), where aᵢ = g(σᵢ)^𝔅.
Clearly 𝔅 is a model of Ω′. Hence Ω is finitely satisfiable. □
Lemma 7.7. Suppose Σ is complete and finitely satisfiable, and every assertion
∃vφ ∈ Σ has a witness in Σ. Then Σ has a model of cardinality ≤ cΣ + ω.

PROOF: Let A′ be the set of all terms of type τΣ in which no variable
occurs. Define t₁ ~ t₂ if t₁ ≈ t₂ ∈ Σ.

We first show that '~' is an equivalence relation on A′.

Case i. t ~ t: If not, then ¬[t ≈ t] ∈ Σ by completeness; but this
contradicts the finite satisfiability of Σ, since {¬[t ≈ t]} has no model.

Case ii. t₁ ~ t₂ implies t₂ ~ t₁: Suppose t₁ ~ t₂. If t₂ ≈ t₁ ∉ Σ, then
¬[t₂ ≈ t₁] ∈ Σ by completeness; but then {t₁ ≈ t₂, ¬[t₂ ≈ t₁]} is a finite subset
of Σ without a model, a contradiction. Hence t₂ ~ t₁.

Case iii. If t₁ ~ t₂ and t₂ ~ t₃, then t₁ ~ t₃: Suppose t₁ ~ t₂ and t₂ ~ t₃ but
not t₁ ~ t₃. Then ¬[t₁ ≈ t₃] ∈ Σ by completeness, but {t₁ ≈ t₂, t₂ ≈ t₃, ¬[t₁ ≈ t₃]}
is a finite subset of Σ without a model. This contradicts the finite
satisfiability of Σ. Hence if t₁ ~ t₂ and t₂ ~ t₃, then t₁ ~ t₃.

Hence '~' is an equivalence relation.

We next note that if tᵢ ~ tᵢ′ for i < n, then for every function symbol f of
type τΣ, f t₀⋯t_{n−1} ≈ f t₀′⋯t′_{n−1} ∈ Σ. For if not, then by completeness
¬[f t₀⋯t_{n−1} ≈ f t₀′⋯t′_{n−1}] ∈ Σ, contradicting finite satisfiability, since
{tᵢ ≈ tᵢ′ : i < n} ∪ {¬[f t₀⋯t_{n−1} ≈ f t₀′⋯t′_{n−1}]} has no model.

Similarly, one sees that if tᵢ ~ tᵢ′ for i < n, then for any relation symbol R
of type τΣ, R t₀⋯t_{n−1} ∈ Σ iff R t₀′⋯t′_{n−1} ∈ Σ.
We can now define a model 𝔄 of Σ as follows:

a. Let |𝔄| = A = {t̄ : t ∈ A′}, where t̄ = {t′ : t ~ t′}.
b. c^𝔄 = c̄ for all constant symbols c of type τΣ.
c. f^𝔄(t̄₀, …, t̄_{n−1}) = (f t₀⋯t_{n−1})¯ for all function symbols f of type τΣ.
d. R^𝔄(t̄₀, …, t̄_{n−1}) iff R t₀⋯t_{n−1} ∈ Σ for all relation symbols R of type τΣ.

As we have shown in the paragraphs preceding the definition, clauses c
and d are unambiguous in that they do not depend on which
representatives t₀, …, t_{n−1} are chosen from the equivalence classes
t̄₀, …, t̄_{n−1}.

Notice that t^𝔄 = t̄ for all terms t ∈ A′. Certainly this is true for constant
symbols by clause b of the definition of 𝔄, and if true for t₀, …, t_{n−1}, then
(f t₀⋯t_{n−1})^𝔄 = f^𝔄(t₀^𝔄, …, t^𝔄_{n−1}) = f^𝔄(t̄₀, …, t̄_{n−1}), which by clause c is
(f t₀⋯t_{n−1})¯. Hence by induction on terms, t^𝔄 = t̄ for all t ∈ A′.
We now show by induction on assertions that for all σ of type τΣ

    𝔄 ⊨ σ iff σ ∈ Σ.    (*)

We take advantage of the remark following Theorem 4.4 to reduce the
number of cases considered to the four that follow:

Case 1. σ is atomic: Then σ is either of the form R t₀⋯t_{n−1} or of the
form t₀ ≈ t₁. If σ = R t₀⋯t_{n−1}, then by clause d of the definition of 𝔄 and
the fact that t^𝔄 = t̄ for all t ∈ A′, we have 𝔄 ⊨ R t₀⋯t_{n−1} iff
R^𝔄(t₀^𝔄, …, t^𝔄_{n−1}) iff R^𝔄(t̄₀, …, t̄_{n−1}) iff R t₀⋯t_{n−1} ∈ Σ. The argument is
completely similar when σ is t₀ ≈ t₁.

Case 2. σ is ¬φ and φ satisfies (*): First notice that φ and ¬φ cannot
both be in Σ, since Σ is finitely satisfiable but {φ, ¬φ} has no models. By
completeness either φ or ¬φ is in Σ. Hence ¬φ ∈ Σ iff φ ∉ Σ iff not 𝔄 ⊨ φ
iff 𝔄 ⊨ ¬φ.

Case 3. σ is φ₁ ∧ φ₂, where both φ₁ and φ₂ satisfy (*): By completeness,
Σ intersects each of the pairs {φ₁, ¬φ₁}, {φ₂, ¬φ₂}, {φ₁ ∧ φ₂, ¬[φ₁ ∧ φ₂]}.
Since no finitely satisfiable set contains {¬φ₁, φ₁ ∧ φ₂}, {¬φ₂, φ₁ ∧ φ₂}, or
{φ₁, φ₂, ¬[φ₁ ∧ φ₂]}, we have σ ∈ Σ iff [φ₁ ∈ Σ and φ₂ ∈ Σ] iff [𝔄 ⊨ φ₁ and
𝔄 ⊨ φ₂] iff 𝔄 ⊨ σ.

Case 4. σ is ∃vφ and φ satisfies (*): Notice that if σ ∈ Σ, then σ has a
witness, say c, such that φ(c/v) ∈ Σ. On the other hand, if φ(c/v) ∈ Σ for some
c, then σ ∈ Σ. For if not, then ¬σ ∈ Σ by completeness, but then
{φ(c/v), ¬σ} has no model, contradicting the finite satisfiability of Σ. Hence
σ ∈ Σ iff [for some c, φ(c/v) ∈ Σ] iff [for some c, 𝔄 ⊨ φ(c/v)] iff 𝔄 ⊨ σ.

This completes the proof of (*), and so 𝔄 ∈ Mod Σ.

To finish the proof of the theorem we need only observe that c|𝔄| ≤ cA′
≤ cΣ + ω. The last inequality holds because any complete set of sentences is
infinite, and only finitely many terms occur in each sentence. □
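The equivalence classes t̄ of this proof can be computed for a finite fragment with a union-find structure. A toy sketch (the particular closed terms and the equations assumed to lie in Σ are hypothetical):

```python
class DSU:
    """Union-find over hashable items; merges the classes t-bar of Lemma 7.7."""
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x
    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

# Closed terms over constants a, b and a unary f, as strings; the listed
# equations t1 ≈ t2 are assumed to be members of the complete set Sigma.
terms = ['a', 'b', 'f(a)', 'f(b)', 'f(f(a))']
equations_in_sigma = [('a', 'b'), ('f(a)', 'f(b)')]

d = DSU()
for s, t in equations_in_sigma:
    d.union(s, t)

universe = {d.find(t) for t in terms}   # |A| = set of equivalence classes
print(len(universe))                    # 3: {a,b}, {f(a),f(b)}, {f(f(a))}
```

Note that the second equation must itself be supplied: completeness of Σ, not the data structure, is what forces f(a) ≈ f(b) to accompany a ≈ b, exactly as the congruence argument in the proof shows.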
PROOF OF THE COMPACTNESS THEOREM. For each finitely satisfiable Σ, let
G(Σ) be the set Γ of Lemma 7.4, and let H(Σ) be the set Ω of Lemma 7.6.
Now let Δ be a finitely satisfiable set, and define:

    Σ₀ = Δ,
    Σ₂ₙ₊₁ = G(Σ₂ₙ),
    Σ₂ₙ₊₂ = H(Σ₂ₙ₊₁).

A trivial induction shows that for all n ∈ ω:

i. Σₙ ⊆ Σₙ₊₁.
ii. Σₙ is finitely satisfiable.
iii. Σ₂ₙ₊₁ is complete.
iv. If ∃vφ ∈ Σ₂ₙ₊₁, then ∃vφ has a witness in Σ₂ₙ₊₂.
v. Σₙ has cardinality ≤ cΔ + ω.

Now let Δ* = ⋃_{n∈ω} Σₙ. We show that Δ* satisfies the hypotheses of
Lemma 7.7.

i. Δ* is finitely satisfiable: Suppose {σ₀, …, σ_{n−1}} ⊆ Δ*. Then for each
   i < n there is a j(i) ∈ ω such that σᵢ ∈ Σ_{j(i)}. Let j = max{j(i) : i < n}. By
   conclusion i, {σ₀, …, σ_{n−1}} ⊆ Σⱼ. By conclusion ii, Σⱼ is finitely
   satisfiable. Hence {σ₀, …, σ_{n−1}} has a model. Thus Δ* is finitely satisfiable.
ii. Δ* is complete: Let σ be of type τΔ*. Then for some n ∈ ω, σ is of type
   τΣₙ. By conclusions i and iii, either σ ∈ Σ₂ₙ₊₁ or ¬σ ∈ Σ₂ₙ₊₁. Hence
   σ ∈ Δ* or ¬σ ∈ Δ*, which shows that Δ* is complete.
iii. Each assertion ∃vφ ∈ Δ* has a witness in Δ*: If ∃vφ ∈ Δ*, then ∃vφ ∈ Σₙ
   for some n ∈ ω. By conclusion i, ∃vφ ∈ Σ₂ₙ₊₁. So by conclusion iv, ∃vφ
   has a witness in Σ₂ₙ₊₂ and so a witness in Δ*.

Thus we can apply Lemma 7.7 and conclude that there is a model 𝔄 of
Δ* such that c|𝔄| ≤ cΔ*. Since Δ ⊆ Δ*, we have 𝔄 ∈ Mod Δ. It remains only
to observe that cΔ* = ω · (cΔ + ω) = cΔ + ω. □
EXERCISES FOR §3.7

Here we outline our alternate proof of the compactness theorem. This proof has a more algebraic flavor and requires the axiom of choice.

Let J be a set, and let F ⊆ P(J), i.e., F is a set of subsets of J. Say that F is a filter base on J if

i. ∅ ∉ F,
ii. X ∈ F and Y ∈ F implies X ∩ Y ∈ F.
F is a filter on J if in addition

iii. J ⊇ Y ⊇ X and X ∈ F implies Y ∈ F.

A filter F is an ultrafilter if

iv. for each Y ⊆ J either Y ∈ F or J − Y ∈ F.

1. Let a ∈ J, and let F = {X : a ∈ X ⊆ J}. Show that F is an ultrafilter. An ultrafilter of this kind, i.e., containing a singleton, is called a principal ultrafilter.
2. Show that if F is a filter and X ⊆ J, then either F ∪ {X} or F ∪ {J − X} is contained in a filter. Compare with Lemma 7.2.
3. Every filter is contained in an ultrafilter. [Hint: Well-order P(J) and use Exercise 2 above, or use Theorem 9.3 in Part I.]
4. There are non-principal ultrafilters. [Hint: Let J be infinite, and let F be the set of all subsets X of J such that c(J − X) < cJ. Then F is a filter, and any ultrafilter containing F is non-principal.]
5. Show that there are 2^(2^κ) ultrafilters on J if cJ = κ.
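Exercise 1 can be checked mechanically on a small index set. The sketch below is our own illustration (J = {0, 1, 2} and a = 1 are arbitrary choices): it enumerates every subset of J and verifies clauses i–iv for the principal ultrafilter F = {X : a ∈ X ⊆ J}.

```python
from itertools import combinations

def powerset(J):
    """All subsets of J, as frozensets."""
    s = list(J)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

J = frozenset({0, 1, 2})
a = 1
F = {X for X in powerset(J) if a in X}   # the principal ultrafilter at a

assert frozenset() not in F                                  # clause i
assert all(X & Y in F for X in F for Y in F)                 # clause ii
assert all(Y in F                                            # clause iii
           for X in F for Y in powerset(J) if X <= Y)
assert all(Y in F or (J - Y) in F for Y in powerset(J))      # clause iv
print("F is an ultrafilter on J")
```

On a finite J every ultrafilter is principal, so the non-principal examples of Exercise 4 genuinely require an infinite index set.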
Let F be an ultrafilter on J, and let {𝔄ⱼ : j ∈ J} be a set of structures all of type s. Let A# be the set of all choice functions on this set, i.e., A# = {g : Dom g = J and g(j) ∈ |𝔄ⱼ| for each j ∈ J}. Assuming the axiom of choice, A# ≠ ∅. Write g ∼ h if {j : g(j) = h(j)} ∈ F.

6. Show that '∼' is an equivalence relation on A#.

Now let ḡ = {h : g ∼ h}, and let A = {ḡ : g ∈ A#}. We define a structure 𝔄 = Π_F 𝔄ⱼ as follows:

i. |𝔄| = A;

and for all c, f, R ∈ s,

ii. c^𝔄 = ḡ, where g(j) = c^𝔄ⱼ;
iii. f^𝔄 ḡ₁,…,ḡₙ = ḡ, where g(j) = f^𝔄ⱼ g₁(j),…,gₙ(j);
iv. R^𝔄 ḡ₁,…,ḡₙ if {j : R^𝔄ⱼ g₁(j),…,gₙ(j)} ∈ F.
7. Show that f^𝔄 is well defined, i.e., if hᵢ ∈ ḡᵢ for i = 1, 2,…, n, and if we define f^𝔄 ḡ₁,…,ḡₙ = h̄, where h(j) = f^𝔄ⱼ h₁(j),…,hₙ(j), then h̄ = ḡ.
8. Show that R^𝔄 is well defined, i.e., if hᵢ ∈ ḡᵢ for i = 1, 2,…, n, then {j : R^𝔄ⱼ h₁(j),…,hₙ(j)} ∈ F iff {j : R^𝔄ⱼ g₁(j),…,gₙ(j)} ∈ F.

The structure 𝔄 is the ultraproduct of {𝔄ⱼ : j ∈ J} with respect to F. The main theorem on ultraproducts is

Theorem. 𝔄 ⊨ φ(ḡ₁,…,ḡₙ) iff {j : 𝔄ⱼ ⊨ φ(g₁(j),…,gₙ(j))} ∈ F.

9. Prove this theorem by induction on formulas.

Principal ultrafilters are uninteresting, since:

10. If F = {X ⊆ J : j ∈ X}, then 𝔄 ≅ 𝔄ⱼ.

However, non-principal ultrafilters can be used to meld the properties of the various 𝔄ⱼ's.
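Exercise 10 can be made concrete when F is principal. The following sketch (a finite toy example of ours, with small finite universes standing in for the 𝔄ⱼ) checks that with F = {X ⊆ J : 1 ∈ X}, the relation g ∼ h reduces to agreement at the index 1, so the classes ḡ correspond exactly to the elements of |𝔄₁|.

```python
from itertools import product

J = [0, 1, 2]
universes = {0: ['a', 'b'], 1: ['x', 'y', 'z'], 2: ['p', 'q']}
j0 = 1                                   # F = {X : j0 in X} is principal

def in_F(X):
    return j0 in X

# A# : all choice functions g with g(j) in the universe of A_j
A_sharp = [dict(zip(J, vals))
           for vals in product(*(universes[j] for j in J))]

def equiv(g, h):
    """g ~ h iff {j : g(j) = h(j)} is in F."""
    return in_F({j for j in J if g[j] == h[j]})

# With a principal F, g ~ h holds exactly when g(j0) = h(j0) ...
assert all(equiv(g, h) == (g[j0] == h[j0])
           for g in A_sharp for h in A_sharp)

# ... so the classes are in 1-1 correspondence with |A_1|
assert {g[j0] for g in A_sharp} == set(universes[j0])
```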
11. Compactness theorem via ultraproducts: Let Σ be finitely satisfiable. We want to show that Σ is satisfiable. Without loss of generality we can suppose that σ₁ ∧ ⋯ ∧ σₙ ∈ Σ whenever σᵢ ∈ Σ for each i = 1, 2,…, n. For each σ ∈ Σ let 𝔄_σ ∈ Mod σ. Let F_σ = {ρ ∈ Σ : ρ implies σ}.
(a) Show that {F_σ : σ ∈ Σ} is a filter base and so is contained in an ultrafilter F.
(b) If F is any ultrafilter containing {F_σ : σ ∈ Σ}, then Π_F {𝔄_σ : σ ∈ Σ} is a model of Σ.

We end this section with a few miscellaneous problems on ultraproducts of a special kind. If each 𝔄ⱼ = 𝔄, then 𝔅 = Π_F 𝔄 is called the ultrapower of 𝔄 with respect to F.

12. For each b ∈ |𝔄| let h_b be the constant function on J with value b. Prove that the function H defined by H(b) = h̄_b is an elementary embedding of 𝔄 into 𝔅.
13. Let 𝔑 = ⟨N, +, ·⟩, and let F be non-principal on ω. Show that Π_F 𝔑 is a non-standard model of Th 𝔑, i.e., a model of Th 𝔑 that is not isomorphic to 𝔑. In fact, if h(n) = n, then h̄ is an infinite element in Π_F 𝔑.
14. Let ℜ = ⟨R, +, ·⟩, and let F be a non-principal ultrafilter on ω. Then Π_F ℜ is a non-standard model of Th ℜ.

3.8 The Löwenheim–Skolem Theorems

In this section we shall consider a more restrictive notion of substructure, namely the notion of elementary substructure. Many of the examples given in §3.6 can be sharpened by requiring the standard model to be an elementary substructure of the non-standard model. But the main interest in the new notion stems from its usefulness in constructing models having specified properties, as we shall see in §3.10.

Definition 8.1. We say that 𝔄 is an elementary substructure of 𝔅, or that 𝔅 is an elementary extension of 𝔄, and write 𝔄 ≺ 𝔅, if

i. 𝔄 ⊆ 𝔅 and
ii. for all z ∈ Vbl|𝔄| and all formulas φ, 𝔄 ⊨ φ⟨z⟩ iff 𝔅 ⊨ φ⟨z⟩.

EXAMPLE. Let 𝔄 = ⟨N⁺, ′⟩, where N⁺ is the set of positive integers, and 𝔅 = ⟨N, ′⟩, where ′ is the successor function n′ = n + 1. Clearly 𝔄 ⊆ 𝔅. Moreover, 𝔄 ≅ 𝔅, and so 𝔄 ≡ 𝔅 by Theorem 6.1. However, 𝔄 ⊀ 𝔅, for if z ∈ Vbl|𝔄| and z(v₀) = 1, then z satisfies ∃v₁[v₁′ ≈ v₀] in 𝔅 but not in 𝔄.
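The failure of 𝔄 ≺ 𝔅 here comes down to a search for a witness. Below is a small check of ours: the two infinite universes are cut down to finite samples, which is harmless for this particular formula, because the only possible witness for ∃v₁[v₁′ ≈ v₀] at v₀ = 1 is 0.

```python
def satisfies_exists_pred(universe, v0):
    """Does some v1 in the universe satisfy v1' = v1 + 1 = v0?"""
    return any(v1 + 1 == v0 for v1 in universe)

B = range(0, 100)    # finite sample of N,  the universe of B
A = range(1, 100)    # finite sample of N+, the universe of A

assert satisfies_exists_pred(B, 1)        # 0 is a witness in B
assert not satisfies_exists_pred(A, 1)    # no witness exists in A
```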

The next theorem provides necessary and sufficient conditions for 𝔄 to be an elementary substructure of 𝔅. What makes the test so useful is that the condition involves satisfaction in 𝔅 alone.
Theorem 8.2. Let 𝔄 ⊆ 𝔅, and suppose that for every formula φ and every z ∈ Vbl|𝔄| such that 𝔅 ⊨ ∃vφ⟨z⟩, there is an a ∈ |𝔄| for which

𝔅 ⊨ φ⟨z(a/v)⟩.

Then 𝔄 ≺ 𝔅.

PROOF: Suppose 𝔄 and 𝔅 are as in the hypotheses of the theorem. We show by induction on formulas that for all φ

(*) 𝔄 ⊨ φ⟨z⟩ iff 𝔅 ⊨ φ⟨z⟩, for all z ∈ Vbl|𝔄|.

By the remark following Theorem 4.4 it is sufficient to consider the following four cases:

i. For atomic formulas this is clear, since 𝔄 ⊆ 𝔅.
ii. Suppose ψ satisfies (*) and φ = ¬ψ. Then 𝔄 ⊨ φ⟨z⟩ iff not 𝔄 ⊨ ψ⟨z⟩ iff not 𝔅 ⊨ ψ⟨z⟩ iff 𝔅 ⊨ φ⟨z⟩. Hence φ satisfies (*).
iii. Suppose ψ₁ and ψ₂ satisfy (*) and φ = [ψ₁ ∨ ψ₂]. Then 𝔄 ⊨ φ⟨z⟩ iff [either 𝔄 ⊨ ψ₁⟨z⟩ or 𝔄 ⊨ ψ₂⟨z⟩] iff [either 𝔅 ⊨ ψ₁⟨z⟩ or 𝔅 ⊨ ψ₂⟨z⟩] iff 𝔅 ⊨ [ψ₁ ∨ ψ₂]⟨z⟩ iff 𝔅 ⊨ φ⟨z⟩.
iv. Suppose ψ satisfies (*) and φ = ∃vψ. If 𝔄 ⊨ φ⟨z⟩, then for some a ∈ |𝔄|, 𝔄 ⊨ ψ⟨z(a/v)⟩. Hence, since ψ satisfies (*), 𝔅 ⊨ ψ⟨z(a/v)⟩; so 𝔅 ⊨ φ⟨z⟩. Conversely, if 𝔅 ⊨ φ⟨z⟩ then for some a ∈ |𝔄|, 𝔅 ⊨ ψ⟨z(a/v)⟩ by the hypotheses of the theorem. Hence, since ψ satisfies (*), 𝔄 ⊨ ψ⟨z(a/v)⟩; so 𝔄 ⊨ φ⟨z⟩. Thus φ satisfies (*). This concludes the induction and the proof of the theorem. □

Corollary 8.3. Let 𝔄 ⊆ 𝔅, and suppose that for every finite subset {a₀,…,aₙ₋₁} of |𝔄| and for every b ∈ |𝔅|, there is an automorphism g of 𝔅 such that

i. g(b) ∈ |𝔄|,
ii. g(aᵢ) = aᵢ for all i < n.

Then 𝔄 ≺ 𝔅.

PROOF: Under the conditions of the corollary suppose that 𝔅 ⊨ ∃vφ(a₀,…,aₙ₋₁), where aᵢ ∈ |𝔄| for all i < n. Then for some b ∈ |𝔅|, 𝔅 ⊨ φ(b, a₀,…,aₙ₋₁). Let g be an automorphism satisfying conditions i and ii. By Theorem 4.19, 𝔅 ⊨ φ(g(b), g(a₀),…,g(aₙ₋₁)), i.e., 𝔅 ⊨ φ(g(b), a₀,…,aₙ₋₁). Hence by Theorem 8.2, 𝔄 ≺ 𝔅. □

EXAMPLE. If A ⊆ B and cA ≥ ω, then ⟨A⟩ ≺ ⟨B⟩. This follows trivially from the corollary. Hence, if C and D are infinite, then ⟨C⟩ ≡ ⟨D⟩, for by the axiom of choice, there is a 1–1 map of C onto some D′ ⊆ D, or a 1–1 map of D onto some C′ ⊆ C. Since such a map is an isomorphism of ⟨C⟩ onto ⟨D′⟩ or of ⟨D⟩ onto ⟨C′⟩, we have ⟨C⟩ ≅ ⟨D′⟩ ≺ ⟨D⟩ or ⟨D⟩ ≅ ⟨C′⟩ ≺ ⟨C⟩. In either case ⟨C⟩ ≡ ⟨D⟩.

EXAMPLE. Let (x, y) be the open interval on the real line between x and y, i.e., the set of all reals greater than x but less than y. Let 𝔄 = ⟨(0, ½), <⟩ and 𝔅 = ⟨(0, 1), <⟩, where < is the usual ordering. Then by Corollary 8.3, we see that 𝔄 ≺ 𝔅.

Definition 8.4. Let 𝔅 = ⟨𝔄, c_a⟩ₐ∈|𝔄|, where c_a^𝔅 = a. Then Th 𝔅 is a complete diagram of 𝔄.

No structure has a unique complete diagram, but since the complete diagram in question is usually clear or immaterial in most arguments, we shall speak of the complete diagram of 𝔄, abbreviated 𝒟ᶜ𝔄.

The relation between elementary substructures and complete diagrams is analogous to that between substructures and diagrams, as a comparison of Theorem 4.17 and the following shows:

Theorem 8.5. 𝔄 is isomorphic to an elementary substructure of 𝔅 iff there is an expansion 𝔅⁺ of 𝔅 such that 𝔅⁺ ∈ Mod 𝒟ᶜ𝔄.

PROOF: Let 𝔄 ≅_g 𝔅′ ≺ 𝔅. Let 𝔄⁺ = ⟨𝔄, c_a^𝔄⁺⟩ₐ∈|𝔄|, where c_a^𝔄⁺ = a for each a ∈ |𝔄|. Now take 𝔅⁺ = ⟨𝔅, c_a^𝔅⁺⟩ₐ∈|𝔄|, where c_a^𝔅⁺ = g(a). We show that 𝔅⁺ ∈ Mod 𝒟ᶜ𝔄. For suppose that φ(c_a₀,…,c_aₙ₋₁) ∈ 𝒟ᶜ𝔄 with τφ(c_a₀,…,c_aₙ₋₁) − τ𝔄 = {c_a₀,…,c_aₙ₋₁}. Then 𝔄 ⊨ φ(a₀,…,aₙ₋₁). Hence by Theorem 4.19, 𝔅′ ⊨ φ(g(a₀),…,g(aₙ₋₁)). Since 𝔅′ ≺ 𝔅, 𝔅 ⊨ φ(g(a₀),…,g(aₙ₋₁)). Hence 𝔅⁺ ⊨ φ(c_a₀,…,c_aₙ₋₁). Thus 𝔅⁺ ∈ Mod 𝒟ᶜ𝔄.

Conversely, let 𝔅⁺ be an expansion of 𝔅 such that 𝔅⁺ ∈ Mod Th 𝔄⁺, where 𝔄⁺ = ⟨𝔄, c_a^𝔄⁺⟩ₐ∈|𝔄| and c_a^𝔄⁺ = a. Define a function g on |𝔄| into |𝔅| by g(a) = c_a^𝔅⁺. Let B′ = Rng g. Since 𝒟ᶜ𝔄 ⊇ 𝒟𝔄, we have by Theorem 4.17 that 𝔄 ≅_g 𝔅′ ⊆ 𝔅, where 𝔅′ is the substructure of 𝔅 with universe B′. Now let φ be a formula whose free variables are u₀,…,uₙ₋₁, let z ∈ Vbl|𝔅′|, and let g⁻¹(z(uᵢ)) = aᵢ. Then 𝔅′ ⊨ φ⟨z⟩ iff 𝔄 ⊨ φ⟨g⁻¹∘z⟩ iff 𝔄⁺ ⊨ φ(c_a₀,…,c_aₙ₋₁) iff 𝔅⁺ ⊨ φ(c_a₀,…,c_aₙ₋₁) (since 𝔅⁺ ∈ Mod 𝒟ᶜ𝔄) iff 𝔅 ⊨ φ⟨z⟩. Hence 𝔅′ ≺ 𝔅. □

We may write 𝔄 ≺ 𝔅 when all we mean is that for some 𝔅′, 𝔄 ≅ 𝔅′ ≺ 𝔅.

Theorem 8.6 (Upward Löwenheim–Skolem Theorem). Let c|𝔄| ≥ ω, and let κ ≥ c|𝔄| + cτ𝔄. Then there is a 𝔅 such that c|𝔅| = κ and 𝔄 ≺ 𝔅.

PROOF: With 𝔄 and κ as in the hypotheses, let 𝔄⁺ = ⟨𝔄, c_a^𝔄⁺⟩ₐ∈|𝔄|, where c_a^𝔄⁺ = a. Let {d_α : α < κ} be a set of constant symbols disjoint from τ𝔄⁺, such that d_α ≠ d_β for α < β < κ. Let Γ = Th 𝔄⁺ ∪ {d_α ≉ d_β : α < β < κ}. Clearly, each finite subset of Γ is satisfied in some expansion of 𝔄⁺. Hence by compactness, Γ has a model 𝔅⁺ of power κ. By the preceding theorem, 𝔄 ≺ 𝔅⁺↾τ𝔄. □
EXAMPLE. For every infinite cardinal κ, there is a 𝔅 such that ⟨N, +, ·, 0, 1, 2, …⟩ ≺ 𝔅 and c|𝔅| = κ. Hence there are structures in every infinite cardinality that are elementarily equivalent to ⟨N, +, ·, 0, 1, 2, …⟩.

Theorem 8.7 (Downward Löwenheim–Skolem Theorem). Let X ⊆ |𝔅|. Then there is an 𝔄 such that X ⊆ |𝔄|, 𝔄 ≺ 𝔅, and c|𝔄| ≤ cX + cτ𝔅 + ω.

PROOF: Let < be a well-ordering of |𝔅| (here we use the axiom of choice). For each formula ∃vφ and each z ∈ Vbl|𝔅| such that 𝔅 ⊨ ∃vφ⟨z⟩, let g(φ, z) be the first b (with respect to <) such that

𝔅 ⊨ φ⟨z(b/v)⟩.

Now define

A₀ = X,
Aₙ₊₁ = Aₙ ∪ {g(φ, z) : z ∈ Vbl Aₙ and 𝔅 ⊨ ∃vφ⟨z⟩}.

Now let A = ⋃ₙ∈ω Aₙ. We first observe that if c ∈ τ𝔅, then c^𝔅 ∈ A; indeed, since 𝔅 ⊨ ∃v[v ≈ c], c^𝔅 ∈ A₁. Also, if a₀,…,aₙ₋₁ ∈ A and f_{n,m} ∈ τ𝔅, then f_{n,m}^𝔅 a₀,…,aₙ₋₁ ∈ A; for if aᵢ ∈ A_{jᵢ} and k = max{jᵢ : i < n}, then a₀,…,aₙ₋₁ ∈ A_k, since A_i ⊆ A_l for all i ≤ l < ω. Hence f_{n,m}^𝔅 a₀,…,aₙ₋₁ ∈ A_{k+1}, since 𝔅 ⊨ ∃v[f_{n,m}(a₀,…,aₙ₋₁) ≈ v]. Hence there is a substructure 𝔄 ⊆ 𝔅 with |𝔄| = A. Clearly X ⊆ |𝔄|. Moreover, for each n ∈ ω, cAₙ ≤ cX + cτ𝔅 + ω, and so cA ≤ cX + cτ𝔅 + ω. It remains to show that 𝔄 ≺ 𝔅.

For this we use Theorem 8.2. Suppose φ is a formula with only the variables v, u₀,…,uₙ₋₁ free. Let z ∈ Vbl|𝔄|, and suppose 𝔅 ⊨ ∃vφ⟨z⟩. Take m large enough so that z(u₀),…,z(uₙ₋₁) ∈ Aₘ. Then 𝔅 ⊨ φ(g(φ, z), z(u₀),…,z(uₙ₋₁)) and g(φ, z) ∈ Aₘ₊₁, and so g(φ, z) ∈ A. Hence by Theorem 8.2, 𝔄 ≺ 𝔅, as needed. □
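The heart of this proof is the closure construction A = ⋃ₙ Aₙ: start with X and repeatedly add the Skolem values g(φ, z) until nothing new appears. The same iterate-until-fixed pattern can be sketched with ordinary functions standing in for the g(φ, z) (the functions below are our own toy choices):

```python
def closure(X, functions):
    """A_0 = X; A_{n+1} = A_n plus all values f(a) for a in A_n.
    Stop when nothing new appears, giving A = union of the A_n."""
    A = set(X)
    while True:
        new = {f(a) for f in functions for a in A} - A
        if not new:
            return A
        A |= new

fs = [lambda x: (x + 5) % 11, lambda x: (x * 2) % 11]
A = closure({1}, fs)

assert all(f(a) in A for f in fs for a in A)   # A is closed under both maps
assert A == set(range(11))                      # here the closure is all of Z/11
```

In the theorem each step Aₙ₊₁ adds at most cX + cτ𝔅 + ω new elements, which is why the limit A keeps the stated cardinality bound.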

Definition 8.8. Let ⟨𝔄_α⟩_α<κ be a sequence of structures of type s such that 𝔄_β ⊆ 𝔄_γ whenever β < γ < κ. Let ⋃_α<κ 𝔄_α be that structure 𝔄 of type s such that

i. |𝔄| = ⋃_α<κ |𝔄_α|,
ii. c^𝔄 = c^𝔄₀ for each c ∈ s,
iii. f^𝔄 = ⋃_α<κ f^𝔄_α for each f ∈ s,
iv. R^𝔄 = ⋃_α<κ R^𝔄_α for each R ∈ s.

Notice that clause iii is equivalent to: f^𝔄 a₁,…,aₙ = b iff f^𝔄_α a₁,…,aₙ = b whenever a₁,…,aₙ, b ∈ |𝔄_α|. Similarly, clause iv is equivalent to: R^𝔄 a₁,…,aₙ iff R^𝔄_α a₁,…,aₙ whenever a₁,…,aₙ ∈ |𝔄_α|.

In model theory one often builds a structure 𝔄 satisfying certain first-order requirements by erecting a tower of approximations 𝔄₁, 𝔄₂, …, 𝔄_α, …, where 𝔄_β ⊆ 𝔄_γ for β < γ and 𝔄 = ⋃_α<κ 𝔄_α. But the requirement 𝔄_β ⊆ 𝔄_γ for β < γ is seldom strong enough to determine the first-order properties of 𝔄. For example, if 𝔄ᵢ = ⟨{−i, −i+1, −i+2, …}, <⟩ for each i = 1, 2, 3, …, then not only is 𝔄ⱼ ⊆ 𝔄_k for j < k, but 𝔄ⱼ ≅ 𝔄_k and so 𝔄ⱼ ≡ 𝔄_k. Nevertheless, if 𝔄 = ⋃ᵢ∈ω 𝔄ᵢ, then 𝔄 = ⟨I, <⟩ ≢ 𝔄ⱼ, where I is the set of integers: each 𝔄ⱼ has a least element, but 𝔄 does not. To assure 𝔄ⱼ ≺ 𝔄 for each j, we need the condition 𝔄_γ ≺ 𝔄_δ whenever γ < δ. Such a sequence ⟨𝔄_α⟩_α<κ is called an elementary chain.

Theorem 8.9. Let ⟨𝔄_α⟩_α<κ be an elementary chain, i.e., let 𝔄_β ≺ 𝔄_γ whenever β < γ < κ. Then 𝔄_β ≺ ⋃_α<κ 𝔄_α for each β < κ.

PROOF: Let 𝔄 = ⋃_α<κ 𝔄_α. Let Γ be the set of formulas φ such that whenever β < κ and z ∈ Vbl|𝔄_β|, then 𝔄_β ⊨ φ⟨z⟩ iff 𝔄 ⊨ φ⟨z⟩. We want to show that all formulas of type τ𝔄 belong to Γ.

Clearly the atomic formulas belong to Γ (see Definition 8.8, clauses iii and iv). It is easy to see that if φ₁ ∈ Γ and φ₂ ∈ Γ, then ¬φ₁ ∈ Γ and φ₁ ∨ φ₂ ∈ Γ.

Suppose that φ = ∃vψ, where ψ ∈ Γ. If 𝔄_β ⊨ φ⟨z⟩, then 𝔄_β ⊨ ψ⟨z(b/v)⟩ for some b ∈ |𝔄_β|. Since ψ ∈ Γ, 𝔄 ⊨ ψ⟨z(b/v)⟩, and so 𝔄 ⊨ φ⟨z⟩. Conversely, if 𝔄 ⊨ φ⟨z⟩, where z ∈ Vbl|𝔄_β|, then 𝔄 ⊨ ψ⟨z(b/v)⟩ for some b ∈ |𝔄|. Since |𝔄| = ⋃_α<κ |𝔄_α|, b ∈ |𝔄_γ| for some γ < κ. Let δ = max{β, γ}. Then since z(b/v) ∈ Vbl|𝔄_δ| and ψ ∈ Γ, we have 𝔄_δ ⊨ ψ⟨z(b/v)⟩, and so 𝔄_δ ⊨ φ⟨z⟩. But 𝔄_β ≺ 𝔄_δ, and so 𝔄_β ⊨ φ⟨z⟩. Hence φ ∈ Γ. This completes the proof that all formulas belong to Γ, which gives Theorem 8.9. □
Definition 8.10. Say that Σ implies σ if Mod Σ = Mod(Σ ∪ {σ}), i.e., if every model of Σ is a model of σ.

Next we generalize Definition 7.3 as follows:

Definition 8.11. Σ is complete if for all σ of type τΣ, either Σ implies σ or Σ implies ¬σ.

Clearly, if Σ is complete in the old sense, then it is complete in the new sense. Conversely, a set Σ complete in the new sense is contained in a unique set Γ that is complete in the old sense and of type τΣ, namely Γ = {σ : τσ ⊆ τΣ and Σ implies σ}.

Definition 8.12. Σ is κ-categorical if Σ has models of cardinality κ, and if any two models of cardinality κ are isomorphic.

The following useful criterion for completeness is known as Vaught's test.

Theorem 8.13. If Σ is κ-categorical, where κ ≥ cΣ + ω, and if every model of Σ is infinite, then Σ is complete.
PROOF: Suppose that Σ is as in the hypotheses but is not complete. Then for some σ of type τΣ, both Σ ∪ {σ} and Σ ∪ {¬σ} have models. By the compactness theorem, there are models 𝔄 of Σ ∪ {σ} and 𝔅 of Σ ∪ {¬σ} such that c|𝔄| = c|𝔅| ≤ cΣ + ω. By Theorem 8.6 and our assumption that all the models of Σ are infinite, there are 𝔄′ ≻ 𝔄 and 𝔅′ ≻ 𝔅 such that c|𝔄′| = c|𝔅′| = κ. Since Σ is κ-categorical, 𝔄′ ≅ 𝔅′. By Theorem 4.19, 𝔄′ ≡ 𝔅′. But this is impossible, since 𝔄′ ⊨ σ and 𝔅′ ⊨ ¬σ. Hence Σ must be complete. □

EXAMPLE. Let ⋀ᵢ<ⱼ<ₙ vᵢ ≉ vⱼ be the conjunction of all formulas vᵢ ≉ vⱼ where i < j < n, and let ∃≥n be the formula ∃v₀,…,vₙ₋₁ ⋀ᵢ<ⱼ<ₙ vᵢ ≉ vⱼ. Clearly 𝔄 ⊨ ∃≥n iff c|𝔄| ≥ n. Let Σ = {∃≥n : n ∈ ω}. Then Mod Σ is the class of all structures 𝔄 = ⟨A⟩ such that cA ≥ ω. Since any two such structures of cardinality ω are isomorphic, Σ is ω-categorical, and so by Vaught's test Σ is complete.
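The equivalence 𝔄 ⊨ ∃≥n iff c|𝔄| ≥ n is easy to test by brute force on finite universes (our own illustration; on a finite structure the quantifiers range over the whole universe):

```python
from itertools import product

def sat_exists_at_least(universe, n):
    """A |= the sentence 'there exist n pairwise distinct elements'."""
    return any(len(set(vals)) == n
               for vals in product(universe, repeat=n))

A = ['a', 'b', 'c']
assert all(sat_exists_at_least(A, n) == (len(A) >= n)
           for n in range(6))
```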

A linearly ordered structure 𝔄 (see §1.8) is densely ordered if c|𝔄| ≥ 2 and

𝔄 ⊨ ∀v₀∀v₁[v₀ < v₁ → ∃v₂[v₀ < v₂ ∧ v₂ < v₁]].

Our next example is based on the following theorem of Cantor.

Theorem 8.14. If 𝔄 and 𝔅 are countable densely ordered structures without least or greatest elements, then 𝔄 ≅ 𝔅.

PROOF: Let 𝔄 = ⟨A, <^𝔄⟩ and 𝔅 = ⟨B, <^𝔅⟩, where A = {a₀, a₁, …} and B = {b₀, b₁, …}. Suppose that c₀ <^𝔄 ⋯ <^𝔄 cₙ₋₁ and d₀ <^𝔅 ⋯ <^𝔅 dₙ₋₁. Then for every c ∈ A there is a d ∈ B such that for all i < n, cᵢ < c iff dᵢ < d, and c < cᵢ iff d < dᵢ. Indeed, if c < c₀ or if cₙ₋₁ < c, then such a d exists, since 𝔅 has no least or greatest element. If for some j < n−1, cⱼ < c < cⱼ₊₁, then such a d exists because the ordering is dense. Now let g({cᵢ : i < n}, c, {dᵢ : i < n}) be the first term in the sequence b₀, b₁, … that can serve as such a d. Reversing the roles of 𝔄 and 𝔅, we get a function h({dᵢ : i < n}, d, {cᵢ : i < n}).

Now define sequences ā₀, ā₁, … and b̄₀, b̄₁, … as follows:

ā₀ = a₀,
b̄₀ = b₀,
ā₂ₙ₊₁ = aₙ₊₁,
b̄₂ₙ₊₁ = g({ā₀,…,ā₂ₙ}, ā₂ₙ₊₁, {b̄₀,…,b̄₂ₙ}),
b̄₂ₙ₊₂ = bₙ₊₁,
ā₂ₙ₊₂ = h({b̄₀,…,b̄₂ₙ₊₁}, b̄₂ₙ₊₂, {ā₀,…,ā₂ₙ₊₁}).

In this way every aᵢ occurs among the ā's and every bᵢ occurs among the b̄'s. It is easy to see that the function f defined by f(āᵢ) = b̄ᵢ is an isomorphism of 𝔄 onto 𝔅, which proves the theorem. □
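The functions g and h of this proof can be executed. The sketch below is our own finite illustration of the back-and-forth construction: two countable dense orders, the dyadic rationals in (0, 1) and all rationals in (0, 1), are given fixed enumerations, a few rounds of the construction are run, and the resulting finite map is checked to be order-preserving.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def dyadics():
    """Enumerate the dyadic rationals in (0,1): 1/2, 1/4, 3/4, 1/8, ..."""
    for k in count(1):
        for odd in range(1, 2 ** k, 2):
            yield Fraction(odd, 2 ** k)

def rationals():
    """Enumerate all rationals in (0,1): 1/2, 1/3, 2/3, 1/4, 3/4, ..."""
    for q in count(2):
        for p in range(1, q):
            if gcd(p, q) == 1:
                yield Fraction(p, q)

def first_fit(cs, c, ds, enum):
    """g of the proof: the first d in the enumeration standing to the
    ds exactly as c stands to the cs."""
    for d in enum:
        if all((ci < c) == (di < d) and (c < ci) == (d < di)
               for ci, di in zip(cs, ds)):
            return d

A = list(islice(dyadics(), 300))      # finite slice of the enumeration of A
B = list(islice(rationals(), 300))    # finite slice of the enumeration of B

abar, bbar = [A[0]], [B[0]]
for n in range(1, 4):                 # three back-and-forth rounds
    abar.append(A[n]); bbar.append(first_fit(abar[:-1], A[n], bbar, B))
    bbar.append(B[n]); abar.append(first_fit(bbar[:-1], B[n], abar, A))

# f(abar[i]) = bbar[i] is order-preserving on the points built so far
assert all((abar[i] < abar[j]) == (bbar[i] < bbar[j])
           for i in range(len(abar)) for j in range(len(abar)))
```

Run with the full enumerations in place of the finite slices, the same procedure produces the isomorphism f of the theorem.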
EXAMPLE. Let 𝒦 be the class of densely ordered structures without endpoints. Clearly, 𝒦 = Mod Σ, where Σ is composed of the following assertions:

∀v₀∃v₁v₂[[v₁ < v₀] ∧ [v₀ < v₂]],
∀v₀v₁[v₀ < v₁ → ¬v₁ < v₀],
∀v₀v₁v₂[v₀ < v₁ ∧ v₁ < v₂ → v₀ < v₂],
∀v₀v₁[v₀ < v₁ ∨ v₁ < v₀ ∨ v₀ ≈ v₁],
∀v₀v₁∃v₂[v₀ < v₁ → [v₀ < v₂ ∧ v₂ < v₁]].

It is easy to see that all models of Σ are infinite. Moreover, the above theorem shows that Σ is ω-categorical. Hence Σ is complete by Theorem 8.13.

EXAMPLE. Let 𝔄 = ⟨I, f^𝔄⟩, where I is the set of integers and f^𝔄(x) = x + 1 for all x ∈ I. By recursion on n, define the term fⁿv as follows:

f¹v is fv,
fⁿ⁺¹v is f(fⁿv).

It is easy to see that the following assertions are true in 𝔄:

i. ∀v₀∃v₁[fv₁ ≈ v₀],
ii. ∀v₀v₁[fv₀ ≈ fv₁ → v₀ ≈ v₁],
iii. ∀v₀[fⁿv₀ ≉ v₀] for each n ∈ ω.

Let Σ be the set of the above sentences. We claim that the set of sentences implied by Σ is Th 𝔄. To prove this it is enough to show that Σ is complete. For this we shall use Theorem 8.13.

Notice, however, that Σ is not ω-categorical. Indeed, if B = I ∪ {x + π : x ∈ I} and if f^𝔅(z) = z + 1 for all z ∈ B, then 𝔅 = ⟨B, f^𝔅⟩ is clearly a countable model of Σ, but 𝔅 ≇ 𝔄.

But Σ is categorical in every uncountable cardinality: For let 𝔅, ℭ be models of Σ with c|𝔅| = c|ℭ| = κ > ω. Define b₀ ∼ b₁ if b₀ = b₁ or, for some n, fⁿ^𝔅 b₀ = b₁ or fⁿ^𝔅 b₁ = b₀. It is easy to check that '∼' is an equivalence relation on |𝔅| and each equivalence class is countable. Analogously, define an equivalence relation ∼_ℭ on |ℭ|. There are κ many ∼-equivalence classes, say P_α, α ∈ κ, and there are κ many ∼_ℭ-equivalence classes, say Q_α, α ∈ κ. Choose a point p_α ∈ P_α and a point q_α ∈ Q_α for each α ∈ κ. Now define an isomorphism F of 𝔅 onto ℭ by defining F(x) = y if for some α and some n ≥ 0 (reading f⁰v as v) either

i. fⁿ^𝔅(x) = p_α and fⁿ^ℭ(y) = q_α, or
ii. fⁿ^𝔅(p_α) = x and fⁿ^ℭ(q_α) = y.

(It is easy to check that F is an isomorphism, and the reader is asked to do so in Exercise 4.) Hence Σ is categorical in every uncountable power and so is complete.
EXERCISES FOR §3.8

1. Show that if 𝔄 ≺ ℭ and 𝔅 ≺ ℭ and 𝔄 ⊆ 𝔅, then 𝔄 ≺ 𝔅.
2. Suppose that τ𝔄_α ⊆ τ𝔄_β whenever α < β < κ. Also suppose that 𝔄_α ≺ 𝔄_β↾τ𝔄_α whenever α < β < κ. Let ⋃_α∈κ 𝔄_α be that structure 𝔄 of type s = ⋃_α∈κ τ𝔄_α where

|𝔄| = ⋃_α∈κ |𝔄_α|,
f^𝔄 = ⋃_α∈κ f^𝔄_α for any f ∈ s,
R^𝔄 = ⋃_α∈κ R^𝔄_α for any R ∈ s.

Show that 𝔄_α ≺ 𝔄↾τ𝔄_α for each α ∈ κ.
3. Let fₙ(v) be the term defined in the last example of this section. Let Σ be the following set:
(a) ∀v₀∃v₁[v₀ ≉ c → f(v₁) ≈ v₀],
(b) ∀v₀∀v₁[f(v₀) ≈ f(v₁) → v₀ ≈ v₁],
(c) ∀v₀[fₙ(v₀) ≉ v₀] for each n ∈ ω,
(d) ∀v₀[f(v₀) ≉ c].
Prove that Σ is complete and that Th⟨N, f^𝔑, 0⟩ = {σ : Σ implies σ}, where f^𝔑(n) = n + 1 and 0 interprets c.
4. Let F be the function from |𝔅| to |ℭ| described in the last example of this section. Show that F is an isomorphism.
5. Let ⟨D, <⟩ be a partially ordered structure such that for each d₁ and d₂ in D there is a d ∈ D for which d₁ < d and d₂ < d. Let {𝔄_d : d ∈ D} be a set of structures such that 𝔄_d ≺ 𝔄_d′ whenever d < d′. Define ⋃_d∈D 𝔄_d as in Exercise 2. Show that 𝔄_d′ ≺ ⋃_d∈D 𝔄_d for each d′ ∈ D.
6. Let Σ = Th⟨R, <⟩. Show that whenever 𝔄 ⊆ 𝔅 and 𝔄 and 𝔅 are models of Σ, then 𝔄 ≺ 𝔅.
7. Find 𝔄, 𝔅′, and 𝔅 such that 𝔄 ≺ 𝔅 and 𝔅 ≅ 𝔅′ ≺ 𝔄 but 𝔄 ≇ 𝔅.
8. Prove the analog of Theorem 4.18 for elementary substructures, i.e., show that if 𝔄 ≅ 𝔅′ ≺ 𝔅, then there is a ℭ ≅ 𝔅 such that 𝔄 ≺ ℭ.

3.9 The Prefix Problem

In this section we show that certain algebraic conditions are satisfied by an elementary class K just in case K = Mod Γ for some set Γ each member of which has a particularly simple form. For example, if each substructure of a member of K is a member of K, then Γ can be taken to be a set of assertions each of the form ∀u₀ ⋯ uₙ₋₁φ, where no quantifier occurs in φ (Theorem 9.3).

Lemma 9.1. Suppose that 𝔄 ⊆ 𝔅, that z ∈ Vbl|𝔄|, and that ψ is open. Then

𝔄 ⊨ ψ⟨z⟩ iff 𝔅 ⊨ ψ⟨z⟩.

PROOF: Let K be the class of formulas ψ for which this is true. Then every atomic formula belongs to K by the definition of substructure. Moreover, if ψ₁, ψ₂ ∈ K, then

i. 𝔄 ⊨ ¬ψ₁⟨z⟩ iff not 𝔄 ⊨ ψ₁⟨z⟩ iff not 𝔅 ⊨ ψ₁⟨z⟩ iff 𝔅 ⊨ ¬ψ₁⟨z⟩, and
ii. 𝔄 ⊨ [ψ₁ ∨ ψ₂]⟨z⟩ iff (𝔄 ⊨ ψ₁⟨z⟩ or 𝔄 ⊨ ψ₂⟨z⟩) iff (𝔅 ⊨ ψ₁⟨z⟩ or 𝔅 ⊨ ψ₂⟨z⟩) iff 𝔅 ⊨ [ψ₁ ∨ ψ₂]⟨z⟩.

Hence ψ₁, ψ₂ ∈ K implies ¬ψ₁, ψ₁ ∨ ψ₂ ∈ K. Thus K contains the open formulas, which proves the lemma. □
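Lemma 9.1 can be checked concretely. In the toy illustration below (the graph is our own choice), 𝔅 is a finite directed graph, 𝔄 is the substructure on a subset of its vertices, and the open formula R(v₀, v₁) ∨ ¬R(v₁, v₀) receives the same truth value in 𝔄 and 𝔅 at every pair of points from |𝔄|.

```python
B_univ = {0, 1, 2, 3}
R_B = {(0, 1), (1, 2), (3, 0)}

A_univ = {0, 1, 2}                       # substructure: restrict R to A
R_A = {(x, y) for (x, y) in R_B if x in A_univ and y in A_univ}

def psi(R, x, y):
    """The open formula R(v0, v1) or not R(v1, v0)."""
    return (x, y) in R or (y, x) not in R

# open formulas are absolute between a structure and its substructures
assert all(psi(R_A, x, y) == psi(R_B, x, y)
           for x in A_univ for y in A_univ)

def exists_in(R, univ, v0):
    """The existential formula: some v1 with R(v1, v0)."""
    return any((v1, v0) in R for v1 in univ)

# by contrast, a quantifier can see outside |A|: 3 witnesses R(v1, 0) in B
assert exists_in(R_B, B_univ, 0) and not exists_in(R_A, A_univ, 0)
```

The last pair of checks is exactly why the lemma is restricted to open formulas.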
Definition 9.2. A formula in prenex normal form is said to be universal (existential) if the only quantifier symbol occurring in its prefix is ∀ (∃). That is, a universal formula is one of the form ∀u₀ ⋯ uₙ₋₁φ, where φ is open. A set Γ of formulas is said to be universal if every φ ∈ Γ is universal.

Theorem 9.3. If K is an elementary class, then the following are equivalent:

i. Whenever 𝔄 ⊆ 𝔅 and 𝔅 ∈ K, then 𝔄 ∈ K.
ii. For some universal set Γ, K = Mod Γ.

PROOF: ii implies i: Let 𝔄 ⊆ 𝔅 ∈ Mod Γ, with Γ universal. If σ ∈ Γ, then σ has the form ∀u₀ ⋯ uₙ₋₁ψ, where ψ is open. If 𝔅 ∈ Mod Γ, then 𝔅 ⊨ σ and so 𝔅 ⊨ ψ⟨z⟩ for all z ∈ Vbl|𝔅|. In particular, 𝔅 ⊨ ψ⟨z⟩ for all z ∈ Vbl|𝔄|. Hence by Lemma 9.1, 𝔄 ⊨ ψ⟨z⟩ for all z ∈ Vbl|𝔄|. Hence 𝔄 ⊨ σ for all σ ∈ Γ, as was to be shown.

i implies ii: Let K be an elementary class satisfying condition i. Let Γ be the set of all universal sentences in Th K. We show that K = Mod Γ. Clearly, K ⊆ Mod Γ. Now let 𝔅 ∈ Mod Γ. We claim that every finite subset of 𝒟𝔅 has a model that is an expansion of some structure in K. For suppose {ψ₀,…,ψₙ₋₁} ⊆ 𝒟𝔅 but has no such model. Let d₀,…,d_{l−1} be the constant symbols in τ[ψ₀ ∧ ⋯ ∧ ψₙ₋₁] − τK. Let ψᵢ′ be the result of replacing each dⱼ in ψᵢ by vⱼ. Then no 𝔄 in K is a model of ∃v₀ ⋯ v_{l−1}[ψ₀′ ∧ ⋯ ∧ ψₙ₋₁′]. Hence ∀v₀ ⋯ v_{l−1}¬[ψ₀′ ∧ ⋯ ∧ ψₙ₋₁′] ∈ Th K, and so ∀v₀ ⋯ v_{l−1}¬[ψ₀′ ∧ ⋯ ∧ ψₙ₋₁′] ∈ Γ. But 𝔅 ∈ Mod Γ, so 𝔅 ⊨ ∀v₀ ⋯ v_{l−1}¬[ψ₀′ ∧ ⋯ ∧ ψₙ₋₁′], which is impossible, since ψ₀ ∧ ⋯ ∧ ψₙ₋₁ ∈ 𝒟𝔅. Hence for every finite subset Δ′ of 𝒟𝔅, the set Δ′ ∪ Th K has a model.

Now let Δ = 𝒟𝔅 ∪ Th K. We have just shown that the compactness theorem applies. Hence Δ has a model ℭ. By Theorems 4.17 and 4.18, we can suppose 𝔅 ⊆ ℭ↾τK. Since ℭ↾τK ∈ Mod Th K and Mod Th K = K (K is an elementary class), we have ℭ↾τK ∈ K. And since we are assuming that K satisfies condition i, 𝔅 ∈ K. Thus K ⊇ Mod Γ and so K = Mod Γ. □
Corollary 9.4. Suppose K = Mod ρ for some ρ. Then the following are equivalent:

i. 𝔄 ⊆ 𝔅 and 𝔅 ∈ K implies 𝔄 ∈ K.
ii. K = Mod σ for some universal sentence σ.

PROOF: We first show that i implies ii. Assuming condition i and applying the theorem, we have K = Mod Σ for some set Σ of universal sentences. Since Σ ∪ {¬ρ} has no models, the compactness theorem implies that there is a finite subset {σ₀,…,σₙ₋₁} of Σ such that {σ₀,…,σₙ₋₁} ∪ {¬ρ} has no models. Letting σ be σ₀ ∧ ⋯ ∧ σₙ₋₁, this means that ⊨ σ → ρ. On the other hand, since Mod ρ = Mod Σ, ⊨ ρ → σᵢ for each i < n, from which it follows that ⊨ ρ → σ. Hence ⊨ σ ↔ ρ, i.e., K = Mod σ, as needed. The proof that ii implies i is immediate from the theorem. □
Corollary 9.5. Let K = Mod ρ. Then the following are equivalent:

i. 𝔄 ⊆ 𝔅 and 𝔄 ∈ K implies 𝔅 ∈ K.
ii. K = Mod σ for some existential assertion σ.

PROOF: Suppose K = Mod ρ and condition i holds. Then the complementary class K′ = Mod ¬ρ is closed under substructure: if 𝔄 ⊆ 𝔅 and 𝔅 ∉ K, then 𝔄 ∉ K by condition i. Hence by the preceding corollary, K′ = Mod σ′ for some universal σ′; say σ′ is ∀v₀ ⋯ vₙ₋₁ψ, where ψ is open. Letting σ be ∃v₀ ⋯ vₙ₋₁¬ψ, we have ⊨ ¬σ′ ↔ σ. Hence K = Mod σ, and σ is existential, as needed. □

A completely algebraic characterization of elementary classes Mod Σ having the property that whenever 𝔅 ∈ Mod Σ and 𝔄 ⊆ 𝔅 then 𝔄 ∈ Mod Σ is considerably more difficult than Theorem 9.3, and will not be discussed. Instead we consider a version of Theorem 9.3 for τK without function symbols, in which the proviso requiring K to be an elementary class is dropped in favor of purely algebraic conditions. For this we need several lemmas.
Lemma 9.6. For every 𝔄 having a finite type and a finite universe, there is an existential assertion σ_𝔄 such that for all 𝔅, 𝔅 ⊨ σ_𝔄 iff 𝔄 is isomorphic to a substructure of 𝔅.

PROOF: The finiteness of |𝔄| and τ𝔄 implies that 𝒟𝔄 is finite; say 𝒟𝔄 = {ψ₀,…,ψₙ₋₁}. Let {k₀,…,k_{l−1}} = τ𝒟𝔄 − τ𝔄, and let ψ be the result of replacing kᵢ by vᵢ in ψ₀ ∧ ⋯ ∧ ψₙ₋₁. Clearly 𝔅 ⊨ ∃v₀ ⋯ v_{l−1}ψ iff some expansion of 𝔅 is a model of 𝒟𝔄. The lemma follows by an application of Theorem 4.17. □
Lemma 9.7. Let C be a set of substructures of 𝔅. Then ⋂{|𝔄| : 𝔄 ∈ C} is either empty or the universe of a substructure of 𝔅.

The proof follows easily from the definition of substructure.

Definition 9.8. Let X be a non-empty subset of |𝔅|. Let C = {𝔄 : X ⊆ |𝔄| and 𝔄 ⊆ 𝔅}. Define S(𝔅, X) to be that substructure of 𝔅 whose universe is ⋂{|𝔄| : 𝔄 ∈ C}.

Of course X ⊆ |S(𝔅, X)|, and if X ⊆ |ℭ|, where ℭ ⊆ 𝔅, then S(𝔅, X) ⊆ ℭ.

For the next theorem, we need to observe that if τ𝔅 is finite and contains no function symbols and X is a finite subset of |𝔅|, then S(𝔅, X) is finite. Indeed, |S(𝔅, X)| = X ∪ {c^𝔅 : c ∈ τ𝔅}.

Theorem 9.9. Let K be a class of structures of some fixed finite type not containing function symbols. Then K = Mod Σ for some set Σ of universal assertions iff the following conditions are met:

i. 𝔄 ≅ 𝔅 and 𝔅 ∈ K implies 𝔄 ∈ K.
ii. 𝔄 ⊆ 𝔅 and 𝔅 ∈ K implies 𝔄 ∈ K.
iii. {S(𝔅, X) : X ⊆ |𝔅|, 0 < cX < ω} ⊆ K implies 𝔅 ∈ K.

PROOF: Suppose K = Mod Σ, where Σ is a set of universal sentences. Theorem 4.19 tells us that condition i is satisfied. Condition ii is a consequence of the preceding theorem. Now suppose that {S(𝔅, X) : X ⊆ |𝔅|, 0 < cX < ω} ⊆ K. Let σ ∈ Σ; say σ = ∀u₀ ⋯ uₙ₋₁ψ, where ψ is open. Let z ∈ Vbl|𝔅|. Let 𝔄 = S(𝔅, X), where X = {z(u₀),…,z(uₙ₋₁)}. Then 𝔄 ∈ K, and so 𝔄 ⊨ σ. Hence 𝔄 ⊨ ψ⟨z⟩, and so by Lemma 9.1, 𝔅 ⊨ ψ⟨z⟩. This shows that 𝔅 ∈ K, and so condition iii holds.

For the converse, assume i, ii, and iii. Let Γ = Th K. We claim that K = Mod Γ. Clearly, K ⊆ Mod Γ. To prove the reverse inclusion we take 𝔅 ∉ K with τ𝔅 = τK. Since K satisfies condition iii, there must be a finite non-empty subset X of |𝔅| such that S(𝔅, X) ∉ K. Let 𝔄 = S(𝔅, X). Since τ𝔄 is finite and does not contain function symbols, 𝔄 is finite. By i and ii, no ℭ ∈ K contains a substructure isomorphic to 𝔄. So by Lemma 9.6, ℭ ⊨ ¬σ_𝔄 for all ℭ ∈ K. Hence ¬σ_𝔄 ∈ Γ. Since 𝔅 ⊨ σ_𝔄 (again using Lemma 9.6), we have 𝔅 ∉ Mod Γ. Hence Mod Γ ⊆ K, and so K = Mod Γ. Applying Theorem 9.3, we see that K = Mod Σ for some set Σ of universal assertions. □
Lemma 9.10. Let c ∉ τΣ ∪ τφ. Then

Σ implies ∀uφ iff Σ implies φ(c/u).

PROOF: Suppose Σ implies ∀uφ. Let 𝔄⁺ ∈ Mod Σ, where τ𝔄⁺ = τΣ ∪ τφ(c/u). Then 𝔄⁺ = ⟨𝔄, c^𝔄⁺⟩ for some 𝔄 of type τΣ ∪ τφ with 𝔄 ∈ Mod Σ. Hence 𝔄 ⊨ ∀uφ, and so 𝔄⁺ ⊨ φ(c/u). This shows that Σ implies φ(c/u).

For the converse, suppose Σ implies φ(c/u) and 𝔄 ∈ Mod Σ. Let a ∈ |𝔄|, and form 𝔄⁺ = ⟨𝔄↾(τΣ ∪ τφ), c^𝔄⁺⟩, where c^𝔄⁺ = a. Then 𝔄⁺ ∈ Mod Σ, so 𝔄⁺ ⊨ φ(c/u). Hence 𝔄 ⊨ φ⟨z(a/u)⟩ for all a, so 𝔄 ⊨ ∀uφ, as needed. □
Definition 9.11.

i. An ∀∃-formula is a formula of the form ∀u₀,…,uₙ₋₁∃w₀,…,wₘ₋₁φ, where φ is open.
ii. Let K be a class of structures, and define Th_∀∃ K = {σ : σ ∈ Th K and σ is an ∀∃-formula}.

Definition 9.12. We write 𝔄 ≺_Γ 𝔅 if

i. 𝔄 ⊆ 𝔅,
ii. for all φ ∈ Γ and all z ∈ Vbl|𝔄|, 𝔄 ⊨ φ⟨z⟩ iff 𝔅 ⊨ φ⟨z⟩.

Notice that 𝔄 ⊆ 𝔅 iff 𝔄 ≺_∅ 𝔅, and that 𝔄 ≺ 𝔅 iff 𝔄 ≺_Γ 𝔅 where Γ is the set of all formulas of type τ𝔄. We write 𝔄 ≺_∀ 𝔅 for 𝔄 ≺_Γ 𝔅 when Γ is the set of universal formulas of type τ𝔄.

Lemma 9.13. If 𝔄 ≺_∀ 𝔅, then there is a ℭ such that 𝔅 ⊆ ℭ and 𝔄 ≺ ℭ.

PROOF: Let 𝔅⁺ = ⟨𝔅, c_b^𝔅⁺⟩_b∈|𝔅|, where c_b ∉ τ𝔅 and c_b^𝔅⁺ = b. Let 𝔄⁺ = ⟨𝔄, c_b^𝔄⁺⟩_b∈|𝔄|, where c_b^𝔄⁺ = b for b ∈ |𝔄|. Let Σ = Th 𝔄⁺ ∪ Δ, where Δ is the diagram of 𝔅 determined by 𝔅⁺. By Theorems 4.17 and 8.5 it is enough to prove that Σ has a model. Suppose not. Then some finite subset has no model, say {σ₀,…,σ_{p−1}, ρ₀,…,ρ_{q−1}}, where σᵢ ∈ Th 𝔄⁺ and ρⱼ ∈ Δ for each i < p, j < q. Let σ be σ₀ ∧ ⋯ ∧ σ_{p−1} and ρ be ρ₀ ∧ ⋯ ∧ ρ_{q−1}. Then ⊨ σ → ¬ρ. We can write ρ as ρ(c_a₀,…,c_aₙ₋₁, c_b₀,…,c_bₘ₋₁), where a₀,…,aₙ₋₁ ∈ |𝔄|, b₀,…,bₘ₋₁ ∈ |𝔅| − |𝔄|, and every c_b occurring in ρ is some c_aᵢ or some c_bⱼ. Using Lemma 9.10, we see that ⊨ σ → ∀u₀,…,uₘ₋₁¬ρ(c_a₀,…,c_aₙ₋₁, u₀,…,uₘ₋₁), where u₀,…,uₘ₋₁ are distinct variables not occurring in ρ. Since 𝔄⁺ ⊨ σ and 𝔄 ≺_∀ 𝔅, we have 𝔅⁺ ⊨ ∀u₀,…,uₘ₋₁¬ρ(c_a₀,…,c_aₙ₋₁, u₀,…,uₘ₋₁), contradicting 𝔅⁺ ⊨ ρ(c_a₀,…,c_aₙ₋₁, c_b₀,…,c_bₘ₋₁). Hence Σ is finitely satisfiable and so has a model, as we needed to show. □

Lemma 9.14. Let K be an elementary class, and let 𝔄 ∈ Mod Th_∀∃ K. Then there is a 𝔅 ∈ K such that 𝔄 ≺_∀ 𝔅.

PROOF: Let 𝔄⁺ = ⟨𝔄, c_a^𝔄⁺⟩ₐ∈|𝔄|, where c_a ∉ τ𝔄 and c_a^𝔄⁺ = a for all a ∈ |𝔄|. Let 𝒟_∀𝔄 be the set of all universal sentences true in 𝔄⁺. Let Σ = Th K ∪ 𝒟_∀𝔄. We claim that every finite subset of Σ has a model. For if not, then some finite subset Σ′ of Σ has no model. Let Σ′ = {σ₀,…,σₙ₋₁, ρ₀,…,ρₘ₋₁}, where each σᵢ ∈ Th K and each ρⱼ ∈ 𝒟_∀𝔄. Let ρⱼ be ∀u_{j,0} ⋯ u_{j,nⱼ−1}φⱼ. Using Lemma 5.4, we can assume that u_{i,j} ≠ u_{k,l} when (i,j) ≠ (k,l). It is easy to see that ρ₀ ∧ ⋯ ∧ ρₘ₋₁ is equivalent to ∀u₀,…,u_tφ, where each u_{i,j} is some u_k and φ is φ₀ ∧ ⋯ ∧ φₘ₋₁. Letting σ be σ₀ ∧ ⋯ ∧ σₙ₋₁, we have ⊨ σ → ¬∀u₀,…,u_tφ, or equivalently ⊨ σ → ∃u₀,…,u_t¬φ. Let {c_a₀,…,c_a_{r−1}} = τφ − τK, and let w₀,…,w_{r−1} be variables not occurring in ∃u₀,…,u_t¬φ. Let φ′ be the result of replacing c_aᵢ by wᵢ throughout φ for each i < r. Applying Lemma 9.10, we have ⊨ σ → ∀w₀ ⋯ w_{r−1}∃u₀ ⋯ u_t¬φ′. Since σ ∈ Th K, ∀w₀ ⋯ w_{r−1}∃u₀ ⋯ u_t¬φ′ ∈ Th_∀∃ K and so is true in 𝔄. But this contradicts 𝔄⁺ ⊨ ∀u₀ ⋯ u_tφ. Hence Σ is finitely satisfiable and so has a model 𝔅⁺. Now let 𝔅 = 𝔅⁺↾τ𝔄. Then 𝔄 ≺_∀ 𝔅 and 𝔅 ∈ K. □

Theorem 9.15. Let K be an elementary class. Then the following two conditions are equivalent:

i. If for each i ∈ ω, 𝔄ᵢ ⊆ 𝔄ᵢ₊₁ and 𝔄ᵢ ∈ K, then ⋃ᵢ∈ω 𝔄ᵢ ∈ K.
ii. K = Mod Γ for some set Γ of ∀∃-assertions.

PROOF: ii implies i: Let 𝔄 = ⋃ᵢ∈ω 𝔄ᵢ, where for each i ∈ ω, 𝔄ᵢ ∈ K and 𝔄ᵢ ⊆ 𝔄ᵢ₊₁. We assume that K = Mod Γ, where Γ is a set of ∀∃-assertions. Let σ ∈ Γ; say σ is ∀u₀ ⋯ uₙ₋₁∃w₀ ⋯ wₘ₋₁φ, where φ is open. Choose a₀,…,aₙ₋₁ ∈ |𝔄|. Then there is an i* such that a₀,…,aₙ₋₁ ∈ |𝔄ᵢ*|. Since 𝔄ᵢ* ∈ Mod Γ, there are b₀,…,bₘ₋₁ ∈ |𝔄ᵢ*| such that 𝔄ᵢ* ⊨ φ⟨z⟩, where z(uⱼ) = aⱼ and z(w_l) = b_l for every j < n, l < m. Hence 𝔄 ⊨ φ⟨z⟩ by Lemma 9.1. Thus for every a₀,…,aₙ₋₁ ∈ |𝔄| there are b₀,…,bₘ₋₁ ∈ |𝔄| such that 𝔄 ⊨ φ(a₀,…,aₙ₋₁, b₀,…,bₘ₋₁), i.e., 𝔄 ⊨ σ for all σ ∈ Γ, as needed.

We now assume i and prove ii: Let Γ = Th_∀∃ K. Clearly, Mod Γ ⊇ K, and we need only show that Mod Γ ⊆ K. Let 𝔄₀ ∈ Mod Γ. Using Lemmas 9.13 and 9.14 alternately gives a sequence 𝔄₀, 𝔄₁, 𝔄₂, … such that for all n ∈ ω:

i. 𝔄₂ₙ ≺_∀ 𝔄₂ₙ₊₁ and 𝔄₂ₙ₊₁ ∈ K,
ii. 𝔄₂ₙ₊₁ ⊆ 𝔄₂ₙ₊₂ and 𝔄₂ₙ ≺ 𝔄₂ₙ₊₂.

Notice that ⋃ₙ∈ω 𝔄ₙ = ⋃ₙ∈ω 𝔄₂ₙ = ⋃ₙ∈ω 𝔄₂ₙ₊₁. Let 𝔄 = ⋃ₙ∈ω 𝔄ₙ. Since we are assuming condition i, we have 𝔄 ∈ K. Also, 𝔄₀ ≺ 𝔄 by Theorem 8.9, and so 𝔄₀ ≡ 𝔄. Since K is an elementary class, this means that 𝔄₀ ∈ K. Hence Mod Γ ⊆ K and so Mod Γ = K, as we needed to show. □

EXERCISES FOR §3.9

1. Let σ be the conjunction of the following assertions:
i. ∀uvw[(u·v)·w ≈ u·(v·w)],
ii. ∃u∀v[(u·v) ≈ v],
iii. ∀u∃v∀w[(u·v)·w ≈ w].
Then ⟨G, ·⟩ ⊨ σ iff ⟨G, ·⟩ is a group with operation '·'. Show that there is no universal assertion equivalent to σ.
2. Let ρ be the conjunction of the following:
i. ∀uvw[(u·v)·w ≈ u·(v·w)],
ii. ∀v[e·v ≈ v],
iii. ∀v[v·v⁻¹ ≈ e].
Then ⟨G, ·, ⁻¹, e⟩ is a group iff it is a model of ρ. Clearly ρ is equivalent to a universal assertion. Reconcile this with Exercise 1.
3. Is the class K of discrete orderings a universal class, i.e., is there a universal set of sentences Σ such that Mod Σ = K?
180 III An Introduction to Model Theory

4. Find two assertions σ and ρ such that
   i. 𝔄 ⊨ σ iff 𝔄↾τσ is a field,
   ii. 𝔄 ⊨ ρ iff 𝔄↾τρ is a field,
   iii. σ is universal, and
   iv. ρ is not equivalent to a universal assertion.

5. Is the class of discrete orderings an ∀∃-class? Is the class of dense orderings ∀∃?

6. Show that if K = Mod σ and if K is closed under the union of chains, then K = Mod ρ for some ∀∃-assertion ρ.

3.10 Interpolation and Definability


Two structures 𝔄 and 𝔅 such that 𝔄↾s = 𝔅↾s, where s = τ𝔄 ∩ τ𝔅, can be glued together in the obvious way. The result is a structure ℭ such that ℭ↾τ𝔄 = 𝔄 and ℭ↾τ𝔅 = 𝔅. In this section we will show that '=' can be weakened to '≡', i.e., if 𝔄↾s ≡ 𝔅↾s, where s = τ𝔄 ∩ τ𝔅, then there is a ℭ such that ℭ↾τ𝔄 ≡ 𝔄 and ℭ↾τ𝔅 ≡ 𝔅.

A closely related result states that if ⊨ σ → ρ and s = τσ ∩ τρ, then there is a θ such that τθ ⊆ s and ⊨ σ → θ and ⊨ θ → ρ. This is the interpolation theorem, which will be discussed here along with a corollary on explicit definitions.

Theorem 10.1 (Consistency Lemma). Let s = τ𝔄 ∩ τ𝔅, and suppose that 𝔄↾s ≡ 𝔅↾s. Then there is a ℭ such that

i. τℭ = τ𝔄 ∪ τ𝔅,
ii. 𝔄 ≺ ℭ↾τ𝔄,
iii. 𝔅 ≡ ℭ↾τ𝔅.

PROOF: Let 𝔄, 𝔅, and s satisfy the hypotheses of the theorem. We first prove the existence of a structure 𝔅₀ such that

𝔄↾s ≺ 𝔅₀↾s and 𝔅₀ ≡ 𝔅.  (1)

Let Σ be Th 𝔅 together with the complete diagram of 𝔄↾s. If Σ has no models, then compactness implies the existence of a sentence σ in the complete diagram of 𝔄↾s such that ⊨ Th 𝔅 → ¬σ. σ has the form

φ(u₀ ⋯ u_{n−1} / d₀ ⋯ d_{n−1}),

where {d₀, …, d_{n−1}} = τσ − s. By Lemma 9.10, ⊨ Th 𝔅 → ∀u₀ ⋯ u_{n−1} ¬φ(u₀ ⋯ u_{n−1}). Hence ∀u₀ ⋯ u_{n−1} ¬φ(u₀ ⋯ u_{n−1}) ∈ Th 𝔅↾s = Th 𝔄↾s, while 𝔄↾s has an expansion that is a model of σ, a contradiction. Hence Σ has a model 𝔅⁺. By Theorem 8.5 we can satisfy (1) by taking 𝔅₀ to be 𝔅⁺↾τ𝔅.

We next show that there is an 𝔄₀ such that

𝔄 ≺ 𝔄₀ and 𝔅₀↾s ≺ 𝔄₀↾s.  (2)

Choose expansions 𝔄⁺ = (𝔄, c_a^{𝔄⁺})_{a∈|𝔄|} and 𝔅₀⁺ = (𝔅₀↾s, c_b^{𝔅₀⁺})_{b∈|𝔅₀|} such that c_b^{𝔅₀⁺} = b for all b ∈ |𝔅₀| and c_a^{𝔄⁺} = a for all a ∈ |𝔄|. Let Σ = Th 𝔄⁺ ∪ Th 𝔅₀⁺. We claim that Σ has a model. If not, then by compactness there is a σ ∈ Th 𝔅₀⁺ such that ⊨ Th 𝔄⁺ → ¬σ. σ is of the form

φ(u₀ ⋯ u_{n−1} w₀ ⋯ w_{m−1} / c_{a₀} ⋯ c_{a_{n−1}} c_{b₀} ⋯ c_{b_{m−1}}),

where aᵢ ∈ |𝔄| and bⱼ ∈ |𝔅₀| − |𝔄| for i < n, j < m. Applying Lemma 9.10, we get that

⊨ Th 𝔄⁺ → ∀w₀ ⋯ w_{m−1} ¬φ(u₀ ⋯ u_{n−1} / c_{a₀} ⋯ c_{a_{n−1}}).

Then 𝔄↾s ⊨ ∀w₀ ⋯ w_{m−1} ¬φ⟨z⟩ for any z ∈ ^{Vbl}|𝔄| such that z(uᵢ) = aᵢ for i < n. Since 𝔄↾s ≺ 𝔅₀↾s, we have 𝔅₀↾s ⊨ ∀w₀ ⋯ w_{m−1} ¬φ⟨z⟩. But this contradicts the fact that σ ∈ Th 𝔅₀⁺. Hence Σ has a model ℭ₀, and by Theorem 8.5 we can take 𝔄₀ to be ℭ₀↾τ𝔄.
Using (2) but substituting 𝔅₀ for 𝔄 and 𝔄₀ for 𝔅₀, we get a structure 𝔅₁ such that

𝔅₀ ≺ 𝔅₁ and 𝔄₀↾s ≺ 𝔅₁↾s.

Continuing in this way, we generate two sequences {𝔄ᵢ : i ∈ ω}, {𝔅ᵢ : i ∈ ω} such that for all i < ω,

i. 𝔄ᵢ ≺ 𝔄ᵢ₊₁,
ii. 𝔅ᵢ ≺ 𝔅ᵢ₊₁,
iii. 𝔄ᵢ↾s ≺ 𝔅ᵢ↾s and 𝔅ᵢ↾s ≺ 𝔄ᵢ₊₁↾s.

Let 𝔄* = ⋃{𝔄ᵢ : i ∈ ω}, 𝔅* = ⋃{𝔅ᵢ : i ∈ ω}, and 𝔇 = ⋃{𝔄ᵢ↾s : i ∈ ω} = ⋃{𝔅ᵢ↾s : i ∈ ω}. By Theorem 8.9, 𝔄 ≺ 𝔄₀ ≺ 𝔄* and 𝔅 ≡ 𝔅₀ ≺ 𝔅*. Hence 𝔄 ≺ 𝔄* and 𝔅 ≡ 𝔅*. Moreover, 𝔄* and 𝔅* are both expansions of the same structure 𝔇. Hence we can expand 𝔄* to a structure ℭ of type τ𝔄 ∪ τ𝔅 such that ℭ↾τ𝔅 = 𝔅*. This ℭ obviously satisfies the conclusion of the theorem. □
Corollary 10.2. Let Σ₁ and Σ₂ be sets of assertions such that

i. Mod Σ₁ ≠ ∅,
ii. Mod Σ₂ ≠ ∅,
iii. {σ : Σ₁ ∪ Σ₂ ⊨ σ and τσ ⊆ τΣ₁ ∩ τΣ₂} is complete.

Then Mod(Σ₁ ∪ Σ₂) ≠ ∅.

PROOF: Let 𝔄 ∈ Mod Σ₁, and let 𝔅 ∈ Mod Σ₂. 𝔄 and 𝔅 satisfy the hypotheses of the theorem, and so there is a ℭ such that 𝔄 ≺ ℭ↾τ𝔄 and 𝔅 ≡ ℭ↾τ𝔅. Hence ℭ ∈ Mod(Σ₁ ∪ Σ₂). □

Definition 10.3. Say that σ is an interpolant for σ₁ → σ₂ if

i. τσ ⊆ τσ₁ ∩ τσ₂,
ii. ⊨ σ₁ → σ,
iii. ⊨ σ → σ₂.

Theorem 10.4 (Interpolation Theorem). If ⊨ σ₁ → σ₂, then σ₁ → σ₂ has an interpolant.

PROOF: Suppose that σ₁ → σ₂ has no interpolant. Let s = τσ₁ ∩ τσ₂. Let Γ = {ρ : τρ ⊆ s and ⊨ σ₁ → ρ}. We claim that Γ ∪ {¬σ₂} has a model. For if not, compactness implies the existence of a finite subset {ρ₀, …, ρ_{n−1}} of Γ such that ⊨ ρ₀ ∧ ⋯ ∧ ρ_{n−1} → σ₂. Since ⊨ σ₁ → ρ₀ ∧ ⋯ ∧ ρ_{n−1}, this means that σ₁ → σ₂ has an interpolant, contradicting our assumption. Hence Γ ∪ {¬σ₂} has a model, and so there is a complete set of assertions Γ′ ⊇ Γ ∪ {¬σ₂} such that Γ′ has a model. Let Δ = {γ : γ ∈ Γ′ and τγ ⊆ s}. We claim that

i. Δ is complete,
ii. Δ ∪ {¬σ₂} has a model,
iii. Δ ∪ {σ₁} has a model.

The first two clauses are obvious. If clause iii is false, then there are γ₀, …, γ_{n−1} ∈ Δ such that ⊨ σ₁ → ¬(γ₀ ∧ ⋯ ∧ γ_{n−1}). But then Γ′ has no model, since ¬(γ₀ ∧ ⋯ ∧ γ_{n−1}) ∈ Γ and hence {¬(γ₀ ∧ ⋯ ∧ γ_{n−1}), γ₀, …, γ_{n−1}} ⊆ Γ′, a contradiction. Hence clause iii holds also.

Taking Σ₁ to be Δ ∪ {σ₁} and Σ₂ to be Δ ∪ {¬σ₂}, we apply Corollary 10.2 and conclude that Σ₁ ∪ Σ₂ has a model. But this means that σ₁ ∧ ¬σ₂ has a model, and so σ₁ → σ₂ is not valid. Hence if ⊨ σ₁ → σ₂, then σ₁ → σ₂ must have an interpolant. □
In other proofs of the interpolation theorem the interpolant σ for σ₁ → σ₂ is given constructively in terms of σ₁ and σ₂. In other words, if one is explicitly given a valid implication σ₁ → σ₂, an algorithm may be followed that will explicitly yield an interpolant. Later we shall obtain such an algorithm, but one that is less expedient than that usually given.

Definition 10.5.
i. Let R be a relation symbol. Say that Σ implicitly defines R if whenever 𝔄 = (ℭ, R^𝔄) and 𝔅 = (ℭ, R^𝔅) and 𝔄, 𝔅 ∈ Mod Σ, then 𝔄 = 𝔅; i.e., any structure ℭ of type τΣ − {R} has at most one expansion in Mod Σ.
ii. If R is an n-ary relation symbol and φ is a formula of type τΣ − {R}, then we say that φ is an explicit definition of R with respect to Σ if whenever 𝔄 ∈ Mod Σ and 𝔄 = (ℭ, R^𝔄), then R^𝔄 = {(a₀, …, a_{n−1}) : ℭ ⊨ φ(a₀, …, a_{n−1})}.

EXAMPLE. Let Σ = Th K, where K is the class of all structures 𝔄 = (A, R^𝔄, S^𝔄) such that R^𝔄 is a discrete ordering of A and for all x, y ∈ A, S^𝔄xy iff y is the immediate successor of x. Clearly Σ implicitly defines S. Moreover, S is explicitly defined with respect to Σ by the formula v₀ < v₁ ∧ ∀v₂[v₀ < v₂ → v₁ ≈ v₂ ∨ v₁ < v₂].
Let R and S be n-placed relation symbols. By φ(S/R) we mean the formula we get from φ by replacing each atomic formula Rt₀ ⋯ t_{n−1} occurring in it with St₀ ⋯ t_{n−1}. For example, if σ is ∃v₀v₁[Rf(v₁)v₀ → ∀v₂[Rv₂v₂ ∨ v₀ ≈ v₂]], then σ(S/R) is ∃v₀v₁[Sf(v₁)v₀ → ∀v₂[Sv₂v₂ ∨ v₀ ≈ v₂]]. By Γ(S/R) we mean {φ(S/R) : φ ∈ Γ}.
Lemma 10.6.
i. σ implicitly defines the n-relation R iff whenever S is an n-relation symbol not occurring in σ, then

⊨ σ ∧ σ(S/R) → ∀v₀ ⋯ v_{n−1}[Rv₀ ⋯ v_{n−1} ↔ Sv₀ ⋯ v_{n−1}].

ii. Let φ be a formula of type τσ − {R}. Then φ defines R explicitly with respect to σ iff

⊨ σ → ∀v₀ ⋯ v_{n−1}[φ(v₀ ⋯ v_{n−1}) ↔ Rv₀ ⋯ v_{n−1}].

The proof of this lemma is straightforward and is left as an exercise.

Lemma 10.7.
i. If Σ implicitly defines R, then some finite subset of Σ implicitly defines R.
ii. If φ explicitly defines R with respect to Σ, then φ explicitly defines R with respect to some finite subset of Σ.

PROOF OF i. Suppose that no finite subset of Σ implicitly defines R. Then by Lemma 10.6, for each finite subset Σ′ of Σ there is a model of

Σ′ ∪ Σ′(S/R) ∪ {¬∀u₀ ⋯ u_{n−1}(Ru₀ ⋯ u_{n−1} ↔ Su₀ ⋯ u_{n−1})}.

So by compactness there is a model 𝔄 of

Σ ∪ Σ(S/R) ∪ {¬∀u₀ ⋯ u_{n−1}(Ru₀ ⋯ u_{n−1} ↔ Su₀ ⋯ u_{n−1})}.

Let 𝔄′ = 𝔄↾(τΣ − {R}). Then (𝔄′, R^𝔄), (𝔄′, S^𝔄) ∈ Mod Σ and R^𝔄 ≠ S^𝔄. Hence R is not implicitly defined by Σ. This proves part i. □
PROOF OF ii. If φ does not explicitly define R with respect to any finite subset of Σ, then by Lemma 10.6, every finite subset Σ′ of Σ has a model that is also a model of ¬∀u₀ ⋯ u_{n−1}(φ(u₀ ⋯ u_{n−1}) ↔ Ru₀ ⋯ u_{n−1}). Hence by compactness, Σ has a model of ¬∀u₀ ⋯ u_{n−1}(φ(u₀ ⋯ u_{n−1}) ↔ Ru₀ ⋯ u_{n−1}), which means that φ does not define R explicitly with respect to Σ, thus proving part ii. □
Theorem 10.8 (Definability Theorem). If Σ implicitly defines R, then R is explicitly definable with respect to Σ.

PROOF: By Lemma 10.7 it is enough to prove the theorem for Σ = {σ}. Suppose that R is implicitly defined by σ. Then by Lemma 10.6i,

⊨ σ ∧ σ(S/R) → ∀v₀ ⋯ v_{n−1}[Rv₀ ⋯ v_{n−1} ↔ Sv₀ ⋯ v_{n−1}].

Let c₀, …, c_{n−1} be constant symbols that do not occur in σ. Then

⊨ σ ∧ σ(S/R) → [Rc₀ ⋯ c_{n−1} ↔ Sc₀ ⋯ c_{n−1}],

and so

⊨ σ ∧ Rc₀ ⋯ c_{n−1} → [σ(S/R) → Sc₀ ⋯ c_{n−1}].

Let φ(c₀ ⋯ c_{n−1}) be an interpolant for this implication (an application of Theorem 10.4). Notice that τφ(v₀ ⋯ v_{n−1}) ⊆ τσ − {R, S}. We claim that φ(v₀ ⋯ v_{n−1}) is an explicit definition of R.

Since ⊨ σ ∧ Rc₀ ⋯ c_{n−1} → φ(c₀ ⋯ c_{n−1}), we have ⊨ σ → [Rc₀ ⋯ c_{n−1} → φ(c₀ ⋯ c_{n−1})]. From

⊨ φ(c₀ ⋯ c_{n−1}) → [σ(S/R) → Sc₀ ⋯ c_{n−1}]

we get ⊨ φ(c₀ ⋯ c_{n−1}) → [σ → Rc₀ ⋯ c_{n−1}] (neither R nor S occurs in φ). Hence ⊨ σ → [φ(c₀ ⋯ c_{n−1}) → Rc₀ ⋯ c_{n−1}]. With the above, this gives ⊨ σ → [Rc₀ ⋯ c_{n−1} ↔ φ(c₀ ⋯ c_{n−1})]. Applying Lemma 9.10, ⊨ σ → ∀v₀ ⋯ v_{n−1}[Rv₀ ⋯ v_{n−1} ↔ φ(v₀ ⋯ v_{n−1})]. So by clause ii of Lemma 10.6, we see that φ defines R explicitly in terms of σ. □
EXERCISES FOR §3.10
1. Prove Lemma 10.6.
2. It is not necessary to prove the interpolation theorem via the consistency lemma. In fact, there are direct proofs of the interpolation theorem that yield considerably more information about the form of the interpolant relative to the form of the valid implication. For this reason it is convenient to have a proof of the consistency lemma assuming the interpolation theorem. Give such a proof.
3. Let T = Th(N, +). Is there a formula φ of type τT such that (N, +) ⊨ φ(n, m) iff n < m?
4. Let P be the set of positive integers. Let T = Th(I, +, ·, P). Show that T defines P explicitly. (Hint: Every natural number is the sum of four squares.) Do the same for T = Th(R, +, ·, P), where P is the set of positive numbers.

5. Show that if Σ defines R explicitly, then Σ defines R implicitly.

6. Let T = Th(I, +, −, ·). Let σ be the conjunction of the following assertions:

P(v) ∨ P(−v),
¬[P(v) ∧ P(−v)],
P(u) ∧ P(v) → P(u·v) ∧ P(u+v).

Does T ∪ {σ} implicitly define P?

7. Answer the same question as in Exercise 6 except that T = Th(R, +, −, ·).

8. Say that Σ is existentially complete if whenever σ is an existential assertion, then either σ ∈ Σ or ¬σ ∈ Σ. Show that if Σ is existentially complete and if 𝔄 ⊆ 𝔅₁ and 𝔄 ⊆ 𝔅₂ and 𝔄, 𝔅₁, 𝔅₂ ∈ Mod Σ, then there is a ℭ ∈ Mod Σ and isomorphisms i and j such that

i : 𝔅₁ ≅ ℭ₁ ⊆ ℭ,
j : 𝔅₂ ≅ ℭ₂ ⊆ ℭ,

and

i↾𝔄 = j↾𝔄.

9. Suppose that 𝔄 = (A, B, C, R, S), where B and C are unary and (B, R) ≡ (C, S). Show that there is some 𝔄′ = (A′, B′, C′, R′, S′) such that 𝔄 ≺ 𝔄′ and (B′, R′) ≅ (C′, S′).

3.11 Herbrand's Theorem


Suppose that t₁, …, tₙ are constant terms (i.e., terms in which no variables occur) and that 𝔄 ⊨ φ(t₁) ∨ ⋯ ∨ φ(tₙ). Then surely 𝔄 ⊨ ∃vφv. Much more interesting is the fact that if ∃vφ is valid and φ is open and has a constant symbol in its type, then there are constant terms t₁, …, tₙ such that φ(t₁) ∨ ⋯ ∨ φ(tₙ) is valid. This is known as Herbrand's theorem and is the main topic of this section. Since there is an effective procedure for checking the validity of open sentences, this leads to an effective method of enumerating the valid sentences of L and, in the next section, to an axiom system that is complete and correct for the valid L-sentences.

Definition 11.1.
i. We call t a constant term if no variable occurs in it.
ii. Let φ(u₀ ⋯ u_{n−1}) be an open formula. Say that φ(t₀ ⋯ t_{n−1}) is a substitution instance of φ if each tᵢ is a constant term of type τφ.

Lemma 11.2. Suppose that some constant symbol belongs to τ𝔄. Let B = {t^𝔄 : t is a constant term of type τ𝔄}. Then B is the universe of a substructure of 𝔄.

PROOF: By Definition 3.8, it is enough to show that whenever b₀, …, b_{n−1} ∈ B and f ∈ τ𝔄, then f^𝔄 b₀ ⋯ b_{n−1} ∈ B. But each bᵢ is tᵢ^𝔄 for some constant term tᵢ. Hence f^𝔄 b₀ ⋯ b_{n−1} = f^𝔄 t₀^𝔄 ⋯ t_{n−1}^𝔄 = [ft₀ ⋯ t_{n−1}]^𝔄, where ft₀ ⋯ t_{n−1} is a constant term. Hence f^𝔄 b₀ ⋯ b_{n−1} ∈ B. □
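The universe B of Lemma 11.2 can be generated concretely by closing the constant symbols under the function symbols. The following sketch (in Python, with a hypothetical signature of one constant symbol c and one binary function symbol f; terms are encoded as nested tuples, a choice of ours rather than the text's) computes the constant terms buildable with at most a given number of function applications:

```python
from itertools import product

def herbrand_universe(constants, functions, depth):
    """All constant terms buildable with at most `depth` rounds of
    function application; `functions` is a list of (name, arity) pairs."""
    terms = set(constants)
    for _ in range(depth):
        new = set(terms)
        for f, arity in functions:
            for args in product(terms, repeat=arity):
                new.add((f,) + args)   # the term f(args...)
        terms = new
    return terms

B0 = herbrand_universe({'c'}, [('f', 2)], 0)   # just the constant: {'c'}
B1 = herbrand_universe({'c'}, [('f', 2)], 1)   # adds the term f(c, c)
print(sorted(map(str, B1)))
```

The full set B of Lemma 11.2 is the union of these stages over all depths; it is infinite as soon as some function symbol is present.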

Theorem 11.3 (Herbrand's Theorem). Let φ be an open formula with free variables u₀, …, u_{n−1}, and suppose that some constant symbol belongs to τφ. Then ∃u₀ ⋯ u_{n−1}φ is valid iff there is a finite sequence φ₀, …, φ_{k−1} of substitution instances of φ such that φ₀ ∨ ⋯ ∨ φ_{k−1} is valid.

PROOF: Let Σ = {¬φᵢ : φᵢ is a substitution instance of φ}. Suppose that no disjunction φ₀ ∨ ⋯ ∨ φ_{k−1} of substitution instances of φ is valid. Then every finite subset of Σ has a model. Let 𝔄 ∈ Mod Σ, and let 𝔅 be that substructure of 𝔄 whose universe is {t^𝔄 : t is a constant term of type τ𝔄} (such a 𝔅 exists by Lemma 11.2). Since 𝔄 ⊨ ¬φt₀ ⋯ t_{n−1} for all constant terms tᵢ of type τ𝔄, and since ¬φ is open, we have by Lemma 9.1 that 𝔅 ⊨ ¬φb₀ ⋯ b_{n−1} for each b₀, …, b_{n−1} ∈ B. Hence 𝔅 ⊨ ¬∃u₀ ⋯ u_{n−1}φ, so that ∃u₀ ⋯ u_{n−1}φ is not valid. So if ∃u₀ ⋯ u_{n−1}φ is valid, then there is a disjunction φ₀ ∨ ⋯ ∨ φ_{k−1} of substitution instances of φ that is valid. □

Our next theorem provides us with an algorithm for determining the satisfiability of open assertions. This algorithm also plays an important role in our presentation of an axiom system that is complete for validities.

Theorem 11.4. Let σ be an open assertion, and let X be the set of all terms occurring in σ. If σ has a model, then σ has a model of power ≤ c̄X.

PROOF: Suppose that σ has a model. Then σ has a model 𝔄 of type τσ. Let B = {t^𝔄 : t ∈ X}. Since B ≠ ∅, we can choose some b* ∈ B. Now let f be an n-function symbol in τσ, and let b₀, …, b_{n−1} ∈ B. Define

f^𝔅 b₀ ⋯ b_{n−1} = f^𝔄 b₀ ⋯ b_{n−1} if f^𝔄 b₀ ⋯ b_{n−1} ∈ B, and f^𝔅 b₀ ⋯ b_{n−1} = b* otherwise.

For each constant symbol c ∈ τσ define c^𝔅 = c^𝔄; for each n-relation symbol R and each b₀, …, b_{n−1} ∈ B define R^𝔅 b₀ ⋯ b_{n−1} iff R^𝔄 b₀ ⋯ b_{n−1}. Now let 𝔅 = (B, R^𝔅, f^𝔅, c^𝔅)_{R,f,c∈τσ}. It is easy to see that 𝔅 ⊨ σ; indeed, an easy induction on subformulas φ of σ shows that 𝔅 ⊨ φ iff 𝔄 ⊨ φ. □

How does Theorem 11.4 yield a test for satisfiability for open assertions? Suppose we are given an open assertion σ. We can then write down explicitly the set X of terms occurring in σ. By the theorem, we know that σ has a model iff it has a model of power ≤ c̄X.

Notice that if γ is a 1-1 function on |𝔅| onto B, then γ induces an isomorphism of 𝔅 onto ℭ, where ℭ is defined by

|ℭ| = B,
c^ℭ = γc^𝔅,
f^ℭ b₀ ⋯ b_{n−1} = γ(f^𝔅(γ⁻¹b₀) ⋯ (γ⁻¹b_{n−1})),
R^ℭ b₀ ⋯ b_{n−1} iff R^𝔅(γ⁻¹b₀) ⋯ (γ⁻¹b_{n−1})

for all c, f, R ∈ τ𝔅 and all b₀, …, b_{n−1} ∈ B. Hence if σ has a model, then σ has a model whose universe is either {0}, {0, 1}, …, or {0, 1, 2, …, m−1}, where m = c̄X. Next we make a list of all those structures of type τσ having such a universe. We then check to see if σ is satisfied in any of these (finitely many) structures. If not, then we know that σ is not satisfied in any structure.

These considerations also yield a test for validity of open assertions, because σ is valid iff ¬σ is not satisfiable. Hence the above test for satisfiability of ¬σ is tantamount to a test for validity of σ.
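The decision procedure just described can be carried out quite literally. The Python sketch below is illustrative only: the signature {c, f, R} (one constant, one unary function, one binary relation) and the particular open assertion are our own choices, not drawn from the text. It enumerates every structure with universe {0, …, m−1} and tests the assertion in each, exactly as in the discussion above:

```python
from itertools import product

def models(m):
    """Enumerate all structures of type {c, f, R} with universe {0..m-1}:
    c a point, f a tuple (f[i] = value at i), R given as a function."""
    universe = range(m)
    for c in universe:
        for f in product(universe, repeat=m):
            for R in product([False, True], repeat=m * m):   # R flattened row-major
                yield c, f, (lambda x, y, R=R: R[x * m + y])

def satisfiable(sigma, m):
    """sigma has a model iff it has one of power <= m (Theorem 11.4)."""
    return any(sigma(c, f, R) for c, f, R in models(m))

# Hypothetical open assertion: R(c, f(c)) and not R(f(c), c); its terms are
# X = {c, f(c)}, so universes of size at most 2 suffice.
sigma = lambda c, f, R: R(c, f[c]) and not R(f[c], c)
print(satisfiable(sigma, 2))            # satisfiable: e.g. c=0, f(0)=1, R={(0,1)}

# Validity test: sigma is valid iff (not sigma) is unsatisfiable.
not_sigma = lambda c, f, R: not sigma(c, f, R)
print(not satisfiable(not_sigma, 2))    # sigma is not valid
```

The exhaustive search over interpretations of R is of size 2^(m²), so this is a decision procedure rather than an efficient algorithm, which matches the text's purpose: decidability, not speed.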

EXERCISES FOR §3.11

1. Let A = {0, 1}. Let s = {c₀, f₁,₀, R₂,₀}. List all structures 𝔄 such that |𝔄| = A and τ𝔄 = s.

2. Let A = {0, 1, …, n}, and let s = {c₀, c₁, f₁,₀, f₂,₀, f₃,₀, R₁,₀, R₂,₀, R₃,₀}. How many structures 𝔄 are there such that |𝔄| = A and τ𝔄 = s?

3. Complete the proof of Theorem 11.4.

4. Suppose that ⊨ σ → ∃v₀ ⋯ vₙφ, where φ is open and σ is universal. Suppose that τφ contains a constant symbol. Show that ⊨ σ → ξ, where ξ is some disjunction of substitution instances of φ using constant terms of type τσ ∪ τφ.

3.12 Axiomatizing the Validities of L


The main purpose of this section is to show that the set of valid L-assertions is effectively enumerable, i.e., there is an algorithm for producing a list σ₀, σ₁, σ₂, … of the valid assertions of L.

In order to realize this goal we must restrict our attention to those validities of some fixed, effectively enumerable type s, since if {σ₀, σ₁, σ₂, …} is an effectively enumerable set, then so is ⋃{τσ₀, τσ₁, τσ₂, …}. Here we take s to be {cᵢ : i ∈ ω} ∪ {f_{i+1,j} : i, j ∈ ω} ∪ {R_{i+1,j} : i, j ∈ ω}.

Using the techniques of §2.12 it is easy to effectively assign a number #σ to each σ of type s. Hence our assertion that the validities of type s are

effectively enumerable has a more formal counterpart, namely, {#σ : τσ ⊆ s and ⊨ σ} is a machine enumerable set. Here we shall be content with the less formal version, although the passage from "effectively enumerable" to "machine enumerable" is routine (but tedious).

Let V = {σ : τσ ⊆ s and ⊨ σ}. We shall give an axiomatization 𝒮 that is effectively given, complete for V (i.e., every σ ∈ V is an 𝒮-theorem), and correct for V (i.e., every 𝒮-theorem belongs to V). As argued in §2.13, the existence of such an 𝒮 implies that V is effectively enumerable.

The Axiom System 𝒮.

i. If φ₀ ∨ ⋯ ∨ φ_{n−1} is a valid disjunction of substitution instances of some formula φ of type s, then φ₀ ∨ ⋯ ∨ φ_{n−1} is an axiom.
ii. If for each i < n, φᵢ is a substitution instance of φ and τφᵢ ⊆ s, and ∃u₀ ⋯ u_{m−1}φ is an assertion, then (φ₀ ∨ ⋯ ∨ φ_{n−1}, ∃u₀ ⋯ u_{m−1}φ) is a rule of inference.
iii. If σ* is a Skolem normal form for validity of σ, and τσ* ⊆ s, then (σ*, σ) is a rule of inference.

Theorem 12.1. 𝒮 is effectively given, correct, and complete for V.

PROOF: The discussion at the end of §3.11 shows that the axioms of 𝒮 are effectively given. Clearly, given an ordered pair (σ, σ′), we can decide whether or not (σ, σ′) is a rule of inference of either form ii or iii. Hence 𝒮 is effectively given.

Each axiom is valid. Moreover, if ⊨ φ₀ ∨ ⋯ ∨ φ_{n−1}, and each φᵢ is a substitution instance of φ, and ∃u₀ ⋯ u_{m−1}φ is an assertion, then surely ⊨ ∃u₀ ⋯ u_{m−1}φ. Hence the rules in ii preserve validity. By Theorem 5.14 each rule of the form iii also preserves validity. It follows by an easy induction on the length of proofs that 𝒮 is correct for V.

To prove completeness we must show that every valid σ of type s has an 𝒮-proof, i.e., that ⊨ σ implies ⊢𝒮 σ. Suppose ⊨ σ. Then ⊨ σ*, where σ* is any Skolem normal form for validity of σ of type s (by Theorem 5.14, σ* is of the form ∃u₀ ⋯ u_{m−1}φ, where φ is open). By Theorem 11.3, there is a valid disjunction of substitution instances of φ, say φ₀ ∨ ⋯ ∨ φ_{n−1}. Hence (φ₀ ∨ ⋯ ∨ φ_{n−1}, σ*, σ) is an 𝒮-proof of σ; so ⊢𝒮 σ. This gives the completeness of 𝒮 and concludes the proof of Theorem 12.1. □
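The passage from an effectively given, correct, and complete axiom system to an effective enumeration of V (the argument cited from §2.13) can be sketched as follows: dovetail through all finite strings, keep those that the decidable proof-checker accepts, and emit the conclusion of each accepted proof. In the Python sketch below, the "proofs", the "checker", and the "conclusion" function are toy stand-ins (palindromes over {a, b}, with the first character as conclusion), not the actual system 𝒮; only the shape of the enumeration is the point.

```python
from itertools import count, product

ALPHABET = 'ab'

def is_proof(s):
    """Decidable check, a stand-in for 'is an S-proof' (here: palindromes)."""
    return len(s) > 0 and s == s[::-1]

def conclusion(s):
    """Stand-in for 'the assertion this proof proves' (here: first character)."""
    return s[0]

def enumerate_theorems():
    """Dovetail over all string lengths; every theorem eventually appears."""
    for n in count(1):
        for chars in product(ALPHABET, repeat=n):
            s = ''.join(chars)
            if is_proof(s):
                yield conclusion(s)

gen = enumerate_theorems()
print([next(gen) for _ in range(4)])   # ['a', 'b', 'a', 'b']
```

Correctness of the system guarantees that everything emitted is valid, and completeness guarantees that every valid assertion is eventually emitted; decidability of proofhood is what makes the loop effective.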
There are many other axiom systems that are effectively given, complete, and correct for the validities of L. Some of these appeal because they seem to be natural deductive systems. Their rules of deduction seem more like those we use informally when reasoning in mathematics; so formal proofs in these systems are organized in a way that appears to mimic our usual informal mathematical arguments. Other axiom systems allow a fine analysis of formal proofs that can yield more information. This study of various axiom systems has grown into a branch of logic called proof theory. In our presentation, we contented ourselves with a not too natural, not too useful axiomatization, but one that gave us a quick proof of the completeness theorem: the validities of L are effectively enumerable. This is yet another of Gödel's achievements.

EXERCISES FOR §3.12


1. Suppose that L* is a language that extends L in the sense that every formula of L is also a formula of L*. Suppose that σ is an assertion of L* such that 𝔄 ≅ (N, +, ·) iff some expansion of 𝔄 is a model of σ. Show that L* is not axiomatizable, in the sense that there is no effectively given axiom system 𝒮* such that the 𝒮*-theorems are exactly the valid L*-assertions. (Hint: See §2.12.)
2. Suppose that L* is a language that extends L, and that γ is an L*-assertion such that A is finite iff (A) can be expanded to a model of γ. Show that L* is not axiomatizable. (Hint: Produce a σ satisfying the conditions of Exercise 1.)
3. Again assume L* extends L, but now suppose there is an L*-assertion δ such that 𝔄 ≡ (N, <) iff 𝔄 can be expanded to a model of δ. Show that L* is not axiomatizable.
4. Prove that any language L * satisfying the conditions of any of the above
exercises is incompact.
5. Extend L to a language L* (the monadic second order calculus) as follows: Add second order variables X₁, X₂, … (to range over sets). Extend the definition of atomic formula to include vᵢ ∈ Xⱼ whenever i, j ∈ N⁺, and the definition of formula to include ∃Xᵢφ whenever φ is a formula. An assignment z to 𝔄 is now a function z such that z(v) ∈ A if v is a (first order) variable, as before, and z(Xᵢ) ⊆ A for each i. The definition of satisfaction is extended in the obvious way by adding the clauses

𝔄 ⊨ v ∈ X⟨z⟩ iff z(v) ∈ z(X), and
𝔄 ⊨ ∃Xφ⟨z⟩ iff there is a subset U of |𝔄| such that 𝔄 ⊨ φ⟨z(X/U)⟩.

Show that there is an assertion σ of L* such that 𝔄 ≅ (N, +, ·) iff some expansion of 𝔄 is a model of σ. Hence L* is neither compact nor axiomatizable.
6. Let L* be an extension of L, and let Σ be a countable set of L*-assertions such that each finite subset of Σ has a model but Σ does not. Show that there is a set Γ of L*-assertions such that 𝔄 ≅ (N, <) iff 𝔄 has an expansion that is a model of Γ.
7. The propositional calculus is that language 𝒫 whose symbols are the following.

Propositional variables: P₁, P₂, P₃, …,
Sentential connectives: ¬, ∧,
Parentheses: [, ].

The set of formulas is the least set X that contains the propositional variables and is such that whenever φ, ψ ∈ X, then [¬φ] ∈ X and [φ ∧ ψ] ∈ X. An assignment z is a function from {Pᵢ : i = 1, 2, 3, …} into {t, f} (read t as true and f as false). φ⟨z⟩ is defined inductively as follows:

i. If φ = Pᵢ, then φ⟨z⟩ is z(Pᵢ).
ii. If φ = [¬ψ], then φ⟨z⟩ is t if ψ⟨z⟩ is f, and φ⟨z⟩ is f if ψ⟨z⟩ is t.
iii. If φ = [ψ₁ ∧ ψ₂], then φ⟨z⟩ is t if both ψ₁⟨z⟩ and ψ₂⟨z⟩ are t, and φ⟨z⟩ is f otherwise.

φ is valid (or a tautology) if φ⟨z⟩ = t for all z. Show that the question 'Is φ a tautology?' is decidable. [Hint: The values of φ⟨z⟩ and φ⟨z′⟩ are the same if z(P) = z′(P) for each P occurring in φ.]
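The decidability claimed in Exercise 7 follows the hint directly: only finitely many variables occur in φ, so it suffices to check all 2ⁿ assignments to those variables. A sketch of this truth-table procedure (Python; the tuple encoding of formulas, with tags 'P', 'not', and 'and', is a hypothetical choice of ours):

```python
from itertools import product

def variables(phi):
    """The set of propositional variable indices occurring in phi."""
    if phi[0] == 'P':
        return {phi[1]}
    return set().union(*(variables(sub) for sub in phi[1:]))

def value(phi, z):
    """phi<z>, computed by the inductive clauses i-iii of Exercise 7."""
    if phi[0] == 'P':
        return z[phi[1]]
    if phi[0] == 'not':
        return not value(phi[1], z)
    return value(phi[1], z) and value(phi[2], z)   # the 'and' case

def tautology(phi):
    """Check all assignments to the variables occurring in phi."""
    vs = sorted(variables(phi))
    return all(value(phi, dict(zip(vs, bits)))
               for bits in product([True, False], repeat=len(vs)))

p1 = ('P', 1)
print(tautology(('not', ('and', p1, ('not', p1)))))   # [¬[P1 ∧ ¬P1]]: True
print(tautology(p1))                                  # P1 alone: False
```

Checking 2ⁿ assignments is exponential in n but always terminates, which is all that decidability requires.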
8. Let V_L be the set of valid assertions of L. We have seen that V_L is effectively enumerable. However, V_L is not decidable. The proof of this is outlined below.
(a) Let σ be the universal closure of the conjunction of the following:
   i. Sv ≈ Su → v ≈ u,
   ii. ¬ 1 ≈ Sv,
   iii. ¬ v ≈ 1 → ∃u(v ≈ Su),
   iv. v + 1 ≈ Sv,
   v. v + Su ≈ S(v + u),
   vi. v·1 ≈ v,
   vii. v·Su ≈ v·u + v.

Let 𝔑 = (N⁺, +, ·, S, 1), where S is the successor function. Show that if 𝔅 ⊨ σ, then 𝔑 is embeddable in 𝔅.
(b) Define the constant term n• as follows:

1• = 1,
(n + 1)• = S(n•).

Let φ(v) be the formula at the beginning of §2.12 that says "v is the number of a Turing machine that halts on input v." φ is existential. Show that if ⊨ σ → φ(n•), then 𝔑 ⊨ φ(n•), and conversely.
(c) Show that V_L is not decidable.

3.13 Some Recent Trends in Model Theory


The first order predicate calculus has a marvelously rich model theory. The cornerstones of this theory are the compactness theorem and the downward Löwenheim-Skolem theorem. In fact, there is a remarkable theorem of Lindström which characterizes the first order predicate calculus L in terms of these properties. Suppose that L* is a language that extends L in the sense that every L-assertion is an L*-assertion. Suppose that L* is compact and that any countable set of L*-assertions that has a model has a countable model. Then L* is L. In other words, this extraordinary result shows that the first order predicate calculus is the richest language that is compact and also has the downward Löwenheim-Skolem theorem to ℵ₀.

Since so much of the model theory of L is built up from its compactness property and the Löwenheim-Skolem theorem, one might take Lindström's result as an indication that languages more expressive than L will not have significant model theories.

On the other hand, the expressive power of the first order predicate calculus is not sufficient to characterize many of the objects and concepts which mathematicians are interested in. For example, the notion of finiteness, or of cardinality κ for κ infinite, cannot be expressed in L. The natural numbers with < cannot be characterized up to isomorphism in L; nor can one characterize (N, +, ·) or (R, <) or (R, +, ·), and so on. So there is sufficient motivation to investigate languages more expressive than L with the hope that these languages will be relevant to areas of mathematics beyond the scope of L and yet have interesting model theories.

It is quite easy to invent languages with more expressive power than L. For example, the second order predicate calculus has, in addition to the first order variables, variables of second order intended to range over sets. Thus, adding ∀X[[X0 ∧ ∀y[Xy → X(y+1)]] → ∀z[Xz]] to Th(N, +, 0) gives a second order theory Σ which determines (N, +, 0) up to isomorphism, i.e., 𝔄 ∈ Mod Σ iff 𝔄 ≅ (N, +, 0). The second order calculus is expressive enough to state the order completeness property of (R, +, ·, <), and hence can describe this structure up to isomorphism. The notion of well ordering can be captured faithfully, and so on. Moreover, one need not stop with the second order. We can consider third order variables ranging over sets of sets, and fourth order, and so on. However, these languages, from the second order calculus on up, have not admitted a model theory as rich or as beautiful as that for L. The game is then to discover languages that have greater expressive power than L but still possess an amenable model theory. Several such languages have been discovered, and we briefly describe a couple of them.
Let κ and λ be infinite cardinals with κ ≥ λ ≥ ω. We describe a language L_{κλ} that differs from the first order predicate calculus in allowing conjunctions and disjunctions over sets of assertions of cardinality less than κ and simultaneous universalization or existentialization of fewer than λ variables. More precisely, the definition of formula, Definition 2.1d, is modified by replacing clause iii with

iii*. if Γ ⊆ X and c̄Γ < κ, then [⋀Γ] ∈ X and [⋁Γ] ∈ X,

and replacing iv with

iv*. if φ ∈ X and U is a set of variables of cardinality less than λ, then [∃Uφ] ∈ X and [∀Uφ] ∈ X.

We define 𝔄 ⊨ φ⟨z⟩ in the obvious way by replacing iii, iii′, iv, and iv′ of Definition 4.3 with

iii*. φ = [⋀Γ], and 𝔄 ⊨ γ⟨z⟩ for each γ ∈ Γ.
iii′*. φ = [⋁Γ], and 𝔄 ⊨ γ⟨z⟩ for some γ ∈ Γ.
iv*. φ = [∃Uψ], and for some z* ∈ ^{Vbl}|𝔄| such that z(v) = z*(v) whenever v ∉ U, we have 𝔄 ⊨ ψ⟨z*⟩.
iv′*. φ = [∀Uψ], and for all z* ∈ ^{Vbl}|𝔄| such that z(v) = z*(v) whenever v ∉ U, we have 𝔄 ⊨ ψ⟨z*⟩.

If κ is ω, and λ is arbitrary, then L_{κλ} is just the first order predicate calculus. But if κ > ω and λ ≥ ω, then the increase in expressive power over L is spectacular. For example, (N, <) can be characterized up to isomorphism. To see this, let σ be the first order assertion '< is a discrete ordering with first element but no last element', and let φₙ(u, v) be the first order formula expressing 'u and v are separated by fewer than n elements'. Then σ ∧ ∀uv⋁{φₙ : n ∈ N⁺} is an L_{ω₁ω}-sentence characterizing (N, <) up to isomorphism. Similarly (N, +, ·) can be characterized in L_{ω₁ω}, as can (α, ∈) when α is a countable ordinal. Scott has shown that if 𝔄 is countable and τ𝔄 is countable, then there is a single assertion σ ∈ L_{ω₁ω} such that for any countable 𝔅 we have 𝔅 ⊨ σ iff 𝔅 ≅ 𝔄. However, L_{ω₁ω} cannot characterize the class of well-ordered structures, or the class of structures of any given infinite cardinality.

Of course L_{ω₁ω} is incompact. On the other hand, there are several analogies between its model theory and that of L. For example, every σ ∈ L_{ω₁ω} that has a model has a countable model, and any sentence σ with models of cardinality ≥ ℶₙ for all n ∈ ω has models of arbitrarily high cardinality. The obvious analogs of the interpolation theorem and the definability theorem hold, and there are many more similarities.

When we pass to languages as rich as or richer than L_{ω₁ω₁}, we gain a great deal of expressive power, but the resulting model theory seems to be relatively meager. As an example of the power of L_{ω₁ω₁}, let σ be the first order assertion '< is a linear ordering', and let δ be the L_{ω₁ω₁}-assertion

¬∃{vᵢ : i ∈ ω}[⋀{vᵢ > vᵢ₊₁ : i ∈ ω}].

Then 𝔅 ∈ Mod(σ ∧ δ) iff <^𝔅 well-orders |𝔅|.
Another way of extending the first order predicate calculus is to add a new quantifier Qₙ (n = 1, 2, …) to the symbols of L and add the following clause to Definition 2.1d, the definition of L-formula:

v. if φ ∈ X and u₁, u₂, …, uₙ are variables, then Qₙu₁, …, uₙφ ∈ X.

For each infinite cardinal κ, we have a κ-interpretation of the quantifier Qₙ: 𝔄 ⊨_κ Qₙu₁, …, uₙφ⟨z⟩ iff there is a κ-powered subset Y of |𝔄| such that for all y₁, …, yₙ ∈ Y we have

𝔄 ⊨_κ φ⟨z(u₁, …, uₙ / y₁, …, yₙ)⟩.

Allowing different Qₙ's to appear in the same formula gives the language L^{<ω}.

Clearly, none of these languages is compact in the sense of L, for in the κ-interpretation the following set of assertions has no model, but each finite subset does:

{¬ cₐ ≈ c_β : α < β < κ} ∪ {¬ Qₙu₁, …, uₙ[u₁ ≈ u₁]}.

(Note that the Qₙ-assertion requires that any model have cardinality < κ.) However, if we restrict our attention to countable sets of sentences, then
L¹ is compact in all uncountable interpretations, and it is consistent to assume that L^{<ω} is compact in the ω₁-interpretation and all κ⁺⁺-interpretations. In the interpretations and under the set theoretic hypotheses in which the languages are known to be compact, they are also known to be axiomatizable. Löwenheim-Skolem theorems have not yet been fully investigated for these languages, but it is easy to prove that if σ ∈ L^{<ω} and σ has a model in the κ-interpretation, then σ has a model of power at most κ in the κ-interpretation. Research in this area has begun only recently, and many fundamental questions remain open.

Recently, Rubin and others have found languages that are properly more expressive than L^{<ω} and yet still retain the desirable model theoretic attributes of L^{<ω}, such as axiomatizability and countable compactness. The search for stronger languages is still continuing. The exploration of the model theory of these languages has made heavy use of infinitary combinatorics and set theory, and some of us believe that these languages will be of some use in the study of axiomatic set theory and infinitary combinatorics.
Meanwhile, work continues in many directions. Fragments of the first order predicate calculus may give a rich theory. One such example is equational logic, which deals with sentences of the form ∀u₀, …, uₙ[t ≈ t′], where t and t′ are terms. A great many interesting algebraic classes, such as groups, rings, and fields, can be presented in this form.

Non-standard analysis develops classical analysis on the basis of non-standard models of the reals. Probability theory, certain branches of topology and algebra, and other topics can be given a non-standard treatment as well.

It is impossible to say where the borders are between set theory, combinatorics, model theory, computable function theory, and the more classical branches of mathematics such as algebra and topology. The cross-fertilization between these areas has given enormous impetus to the development of mathematical logic, and we can look forward to many years of exciting progress.
Subject Index

Absolute 54
Addition of cardinals 37
Addition of ordinals 35
∀∃-formula 178
Algebraic number 20
Arithmetical 118
Assertion 115
Assignment 112, 142
At least as numerous 15
Atomic formula 113, 137
Axiom 124
  of choice 27
  of determinacy 49
  of extensionality 44
  of infinity 45
  of the null set 44
  of pairing 44
  of the power set 45
  of regularity 44
  of replacement 45
  of separation 45
  of union 45
  system 124
Barwise, J. 134
Binary relation 6
Bound variable 115
Bounded quantifiers 75
Branch 23
Cantor-Bernstein theorem 17
Cardinal number 36
Cardinality 37
Cartesian product 6
Categorical 171
Chain 23, 29
Chinese remainder theorem 109
Choice function 27
Church, Alonzo 90
Cohen, Paul 48, 57
Compactness theorem 158
Complement of A in B 4
Complete 125, 162, 171
  diagram 169
Composite of f and g 7
Composition of f with g₁, …, g_r 68
Computable function 63
Computable relation 65
Computable set 65
Congruence relation 141
Conjunctive normal form 155
Consistency lemma 180
Consistent 51
Constant term 185
Continuum hypothesis 48
Correct 125
Countable 10
  axiom of choice 49
Definability theorem 184
Defines 118
Definition by cases 76
Definition by transfinite recursion 41
De Morgan's rules 5
Dense ordering 22
Densely ordered structure 22, 172
Denumerable 10
Diagonal function 127
Diagram 147
Difference of A from B 4
Diophantine equation 131
Discrete 22
Disjoint 4
Disjunctive normal form 155
Domain 6
∈-transitive 31
Effectively given 124
Element 1
Elementarily equivalent 144
Elementary chain 171
Elementary class 144
Elementary extension 167
Elementary substructure 167
Equinumerous 9
Equivalent 151
Existential 175
Expansion 140
Explicit definition 182
Exponentiation of cardinals 39
Exponentiation of ordinals 35
Expression 111, 136
Extension of g 7, 140
False 116
Field 6
Filter 166
  base 165
Finite 10
Finitely satisfiable 162
Fischer, M. and Rabin, M. 132
Formula 113, 137
Free 115
Free for 117
Friedberg, R. 133
Function 7
Functional 41
Generalized continuum hypothesis 48
Gödel, Kurt 48, 51, 56, 90, 128, 129
Gödel's incompleteness theorem 128
Greatest element 23
Halting problem 94
Herbrand's theorem 186
Hilbert, David 131
Homomorphism 141
Huge cardinal 49
Immediate predecessor 22
Immediate successor 22
Implicit definition 182
Implies 51
Inaccessible cardinals 48, 55
Incompleteness theorem 128
Independent 51
Infinite 10
Initial pairing 24
Initial segment 23
Injection 140
Input 61, 62
Integral polynomial 20
Interpolant 182
Interpolation theorem 182
Intersection 4
  over X 4
Inverse function 7
Isomorphic 34
Isomorphism 34, 140
Kleene, S. C. 90
König's infinity lemma 29
König's lemma 39
Kreisel, G. 134
Least element 23
Lebesgue measurable 49
Less numerous 15
Limit cardinal 36
Limit ordinal 35
Lindenmayer, A. 134
Linear ordering 22
Linearly ordered structure 22
Löwenheim-Skolem Theorem 169
Machine enumerable relation 101
Machine enumerable set 100
Marker 60
Martin, T. 46
Matiasevic, Y. 131
Matrix 154
Maximal 23, 29
  element 23
Maximal principle 29
Measurable cardinal 49
Minimal element 23
Model 51, 144
More numerous 15
Muchnik, A. A. 133
Multiplication of cardinals 37
Multiplication of ordinals 35
Mycielski, J. 49
Occurs 114
On A 7
One to one 7
Onto B 7
Open 154
Ord α 33
Order preserving 24
Ordinal 33
Output 61, 62
Pairing 9
Pairwise disjoint sets 4
Partial computation 61
Partial ordering 21
Partially ordered structure 21
Post, E. L. 90, 133
Power set 5
Predecessor 22
Prefix 154
Prenex normal form 154
Prime Factorization Theorem 12
Primitive recursive functions 110
Principal ultrafilter 166
Product of cardinals 37
Product of ordinals 37
Proof 124
Quantifiers 103
Quotient structure 142
Ramsey's theorem 49, 50
Range 6
Recursive definitions 41, 42, 70
Recursive function 107
Reduct 140
Relation 6
Representing function 65
Restriction 140
Restriction of f 7
Rozenberg, G. 134
Rubin, M. 193
Rule of inference 124
Satisfaction 114, 142
Scanned term 60
Second incompleteness theorem 129
Self-halting problem 92
Set 1
Simple 147
Skolem normal form 156
  for validity 156
Skolem's paradox 160
Solovay, R. 49
Souslin's hypothesis 49
Standard 54
State 60
Strong inaccessible 48
Structure 51, 138
Subformula 114
Subsequence 114
Subset 2
Substitution instance 185
Substructure 140
Successor 22
  cardinal 36
  ordinal 35
  tape position 61
Super-huge cardinal 49
Symbols of L 136
Tape 60
Tape position 60
Tarski, Alfred 122
Term 111, 136
Theorem 124
Theory of W 144
Transcendental number 20
Transfinite induction 40
Tree 29
Truth 116
Turing machine 59, 60
Turing's thesis 89
Type 136
Ulam, S. 49
Ultrafilter 166
Ultrapower 167
Ultraproduct 166
Undecidable 127
Union 3
  over X 4
Unique factorization theorem 91
Universal 175
  closure 151
  machine 95
Valid 151
Vaught's test 171
Weakly compact cardinal 49
Well ordering 23
Witness 163
Young, P. 132
Zermelo-Fraenkel axiomatization 44, 146
Undergraduate Texts in Mathematics

Apostol: Introduction to Analytic Number Theory.
1976. xii, 334 pages. 24 illus.

Childs: A Concrete Introduction to Higher Algebra.
1979. Approx. 336 pages. Approx. 7 illus.

Chung: Elementary Probability Theory with Stochastic Processes.
1975. x, 325 pages. 36 illus.

Croom: Basic Concepts of Algebraic Topology.
1978. x, 177 pages. 46 illus.

Fleming: Functions of Several Variables. Second edition.
1977. xi, 411 pages. 96 illus.

Halmos: Finite-Dimensional Vector Spaces. Second edition.
1974. viii, 200 pages.

Halmos: Naive Set Theory.
1974. vii, 104 pages.

Hewitt: Numbers, Series, and Integrals.
1979. Approx. 450 pages.

Kemeny/Snell: Finite Markov Chains.
1976. ix, 210 pages.

Lax/Burstein/Lax: Calculus with Applications and Computing, Volume 1.
1976. xi, 513 pages. 170 illus.

LeCuyer: College Mathematics with A Programming Language.
1978. xii, 420 pages. 126 illus. 64 diagrams.

Malitz: Introduction to Mathematical Logic.
Set Theory - Computable Functions - Model Theory.
1979. Approx. 250 pages. Approx. 2 illus.

Prenowitz/Jantosciak: The Theory of Join Spaces.
A Contemporary Approach to Convex Sets and Linear Geometry.
1979. Approx. 350 pages. Approx. 400 illus.

Priestley: Calculus: An Historical Approach.
1979. Approx. 409 pages. Approx. 269 illus.

Protter/Morrey: A First Course in Real Analysis.
1977. xii, 507 pages. 135 illus.

Sigler: Algebra.
1976. xii, 419 pages. 32 illus.

Singer/Thorpe: Lecture Notes on Elementary Topology and Geometry.
1976. viii, 232 pages. 109 illus.

Smith: Linear Algebra.
1977. vii, 280 pages. 21 illus.

Thorpe: Elementary Topics in Differential Geometry.
1979. Approx. 250 pages. Approx. 111 illus.

Wilson: Much Ado About Calculus.
A Modern Treatment with Applications Prepared for Use with the Computer.
1979. Approx. 500 pages. Approx. 145 illus.

Wyburn/Duda: Dynamic Topology.
1979. Approx. 175 pages. Approx. 20 illus.