Editors
F. W. Gehring
P. R. Halmos
Advisory Board
C. DePrima
I. Herstein
J. Kiefer
Jerome Malitz
Introduction to
Mathematical Logic
Set Theory
Computable Functions
Model Theory
Springer-Verlag
New York Heidelberg Berlin
J. Malitz
Department of Mathematics
University of Colorado
Boulder, Colorado 80309
USA
Editorial Board
P.R. Halmos
Managing Editor
Indiana University
Department of Mathematics
Bloomington, Indiana 47401
USA
F.W. Gehring
University of Michigan
Department of Mathematics
Ann Arbor, Michigan 48104
USA
With 2 Figures
Library of Congress Cataloging in Publication Data
Malitz, J.
Introduction to mathematical logic.
Bibliography: p.
Includes index.
1. Logic, Symbolic and mathematical. I. Title.
QA9.M265 511'.3 78-13588
9 8 7 6 5 4 3 2 1
Contents

Preface
Glossary of Symbols
PART I
An Introduction to Set Theory
1.1 Introduction
Through the centuries mathematicians and philosophers have wondered if
size comparisons between infinite collections of objects can be made in a
meaningful way. Does it make sense to ask if there are as many even
numbers as odd numbers? What does it mean to say that one infinite
collection has greater magnitude than another? Can one speak of different
sizes of infinity?
Before the last three decades of the nineteenth century, mathematicians
and philosophers generally agreed that such notions are not meaningful.
But then in the early 1870s, a German mathematician, Georg Cantor
(1845-1918), in a remarkable series of papers, formulated a theory in
which size comparisons between infinite collections could be made. This
theory became known as set theory. As with many radical departures from
traditional approaches, his ideas were at first violently attacked but now
have come to be regarded as a useful and basic part of modern mathematics.
This chapter is an introduction to set theory.
1.2 Sets
We use the term set to refer to any collection of objects. The objects
composing a set will be referred to as the members or elements of the set.
There are various ways to denote sets. One approach is to list the elements
of the set in some way and enclose this list in braces. For example, using
this convention, the set consisting of the numbers 1,2, and 3 is denoted by
{1,2,3}. A set is completely determined by its members, and so the order
in which we list the elements is immaterial. Thus {1,2,3} = {2,3,1} =
{3,2,1} = {1,3,2} = {2,1,3} = {3,1,2}.
A set may have so many members belonging to it that it is impractical
or impossible to use the above method of notation, and so other notational
devices must be used. For example, instead of using the method described
above to denote the set of all positive integers less than or equal to 10^10, we
might use {1, 2, 3, ..., 10^10} to denote this set. The three dots indicate that
some members of the set being described have not been listed explicitly. Of
course, in using this notational device it is important to include enough
members of the list before and after the three dots so that the reader will
know which elements belong to the set and which do not. For example, the
set of even integers between -100 and 100 inclusive should not be
denoted by {-100, ..., 100} but by something like {-100, -98, ...,
-4, -2, 0, 2, 4, ..., 98, 100} or by {0, 2, -2, 4, -4, ..., 98, -98, 100, -100}.
Again, the order in which the elements are listed is arbitrary as long as the
reader understands which elements of the set have not been mentioned
explicitly.
The method for denoting sets using the three dots abbreviation can also
be used for infinite sets. For example, the set of even integers can be
denoted by {0,2, -2,4, -4, ... } or by { ... , -6, -4, -2,0,2,4,6, ... }. We
will use N to denote the set {0, 1, 2, 3, ...} of natural numbers, while N+
will denote {1, 2, 3, ...}. I will denote the set of integers
{0, 1, -1, 2, -2, 3, -3, ...}. Q will denote the set of rationals, Q+ the set of
positive rationals, R the set of real numbers, and R+ the set of positive
reals.
If a set consists of exactly those objects satisfying a certain condition,
say P, we may denote it by {x:P(x)}, which is read: "the set of all x such
that P(x) is true." For example, {x : 3 ≤ x ≤ 8 and x is a rational number} is
the set of rationals between 3 and 8 inclusive. Notice that x merely
represents a typical object in the set under consideration, and any letter
will serve just as well in place of x. Thus {1, 2, 3} = {x : x is an integer and
1 ≤ x ≤ 3} = {y : y is an integer and 1 ≤ y ≤ 3} = {x : x is an integer and
0 < x < 4}. Notice that the last two conditions are different but define the
same set.
We consider as a set the collection which has no members. We call this
set the null set and denote it by ∅, rather than { }.
A set may contain other sets as elements. For example, the set
{1, {2, 3} } is the set whose elements are the number 1 and the set {2, 3}. It
is important to understand that this set has only two elements, namely 1
and {2, 3}. 2 is an element of {2, 3}, but 2 is not an element of {1, {2, 3}}.
We write x ∈ A when x is an element of A, and x ∉ A otherwise.
Let A be a set. We say that a set B is a subset of A if each element of B
is an element of A. If B is a subset of A we write B ⊆ A or A ⊇ B. If B ⊆ A
Theorem 2.1.
i. A ⊆ B implies that A ∪ B = B.
ii. A ∪ B = B ∪ A.
iii. A ∪ (B ∪ C) = (A ∪ B) ∪ C.
The proof of the theorem is very easy, and we leave all parts but iii as
exercises.
To prove part iii, first suppose that x ∈ A ∪ (B ∪ C). Then either x ∈ A
or x ∈ B ∪ C. If x ∈ A, then x ∈ A ∪ B, and so x ∈ (A ∪ B) ∪ C. If x ∈ B ∪ C,
then x ∈ B or x ∈ C. If x ∈ B, then x ∈ A ∪ B, and so x ∈ (A ∪ B) ∪ C. If
x ∈ C, then x ∈ (A ∪ B) ∪ C. Hence we have shown that whenever x ∈ A ∪ (B ∪ C),
then x ∈ (A ∪ B) ∪ C; in other words, we have shown that A ∪ (B ∪ C) ⊆ (A ∪ B) ∪ C.
In the same way one proves that A ∪ (B ∪ C) ⊇ (A ∪ B) ∪ C (the reader should
check this). Hence A ∪ (B ∪ C) = (A ∪ B) ∪ C as claimed.
Because of part iii, no confusion can arise if parentheses are omitted
from (A ∪ B) ∪ C and we write A ∪ B ∪ C.
It should be clear what is meant by A_1 ∪ A_2 ∪ ... ∪ A_n, namely, {x : x ∈ A_1
or x ∈ A_2 or ... or x ∈ A_n}. An alternative notation for this set is
We prove vii, leaving the proof of the other clauses for the exercises.
Here and throughout the text we use 'iff' to abbreviate 'if and only if'.
x ∈ C - (⋃X)  iff
x ∈ C and x ∉ A for all A ∈ X  iff
x ∈ C - A for all A ∈ X  iff
x ∈ ⋂{C - A : A ∈ X}.
In other words, C - (⋃X) and ⋂{C - A : A ∈ X} have the same members
and so are identical, as claimed in vii.
Clauses vii and viii are called De Morgan's rules.
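For reference, clause vii as just proved, together with the dual form that clause viii presumably takes (its statement falls on a page not reproduced here), reads:

C - (⋃X) = ⋂{C - A : A ∈ X}   and   C - (⋂X) = ⋃{C - A : A ∈ X}   (for X ≠ ∅).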
We next define the power set, P(X), of a set X. This is the set of all
subsets of X, i.e., P(X) is defined as {Y : Y ⊆ X}.
For example, if X = {1,2,3}, then P(X) = {∅, {1}, {2}, {3}, {1,2},
{1,3}, {2,3}, {1,2,3}}. Clearly, we always have ∅ ∈ P(X) and X ∈ P(X).
Elementary properties of the power set operation will be found in Exercise
7 below.
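As a concrete illustration (not part of the text), here is a minimal Python sketch that computes the power set of a finite set; the function name power_set is ours:

```python
from itertools import chain, combinations

def power_set(xs):
    """Return P(xs) for a finite set xs, as a set of frozensets
    (frozensets are used so that the subsets can themselves be set members)."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))
    return {frozenset(s) for s in subsets}

P = power_set({1, 2, 3})
assert len(P) == 8                                   # 2^3 subsets, as in the example above
assert frozenset() in P and frozenset({1, 2, 3}) in P
```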
7. Show that
(a) A ⊆ B implies P(A) ⊆ P(B);
(b) P(A ∪ B) ⊇ P(A) ∪ P(B), and more generally
1.4 Pairings
Suppose that A and B are sets and f is a 1-1 function from A onto B. f can
be thought of as an association or pairing of the elements of A with those
of B such that each element x of A has a unique associate f(x) in B and,
conversely, each element y in B has a unique associate f⁻¹(y) in A.
Clearly, if A has ten elements, then so does B; if A has one million
elements, then so does B. Indeed, the existence of such a pairing f assures
us that if A has n elements, where n ∈ N, then B also has n elements. But
what if A is infinite? It seems natural to use the existence of such an f to
assert that A and B have equal magnitude or are equinumerous even when
both sets are infinite. It is this simple idea that underlies the theory of
infinite sets. We now discuss this idea in more detail and consider some of
the surprising consequences.
Let A and B be arbitrary sets. A pairing between A and B is a 1-1
function on A onto B.
EXAMPLE 4.2. Let A = {1,3,5,7,9,11,...}, B = {0,2,4,6,8,...}. One pairing
between A and B is defined by P(n) = n - 1 for all odd n ∈ N+, i.e.,
P = {(1,0), (3,2), (5,4), (7,6), ...}. Clearly P is 1-1 and onto.
EXAMPLE 4.3. Let A = N, B = {0, -1, -2, -3, ...}. One pairing of A and B
is given by P(n) = -n for all n ∈ N, i.e., P = {(0,0), (1,-1), (2,-2), ...}.
EXAMPLE 4.5. Let A = {1,2,3}, B = {1,2}. The reader can easily list all
functions from A into B and check that no pairing is possible between A
and B.
This definition fits well with our intuition, and yet some of its con-
sequences at first glance seem bizarre. For example, a set A may properly
contain a set B and still be equinumerous to B. Of course this can not
happen if A is finite, but it can happen when A is infinite.
P(n) = ½(n + 1) if n is odd.
when referring to finite sets having the same number of elements and the
technical meaning given to the word in the present context. This analogy
would be weak indeed without the following.
The arrows indicate the ordering of N × N imposed by the pairing with N:
0 = P(0,0), 1 = P(0,1), 2 = P(1,0),
3 = P(2,0), 4 = P(1,1), 5 = P(0,2), ....
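A small Python sketch of one such pairing, following the back-and-forth diagonal order indicated above (the function names are ours):

```python
from itertools import count, islice

def diagonals():
    """Enumerate N x N diagonal by diagonal, alternating direction,
    so the first pairs are (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ..."""
    for s in count(0):                       # s = i + j is constant on a diagonal
        rng = range(s + 1) if s % 2 == 1 else range(s, -1, -1)
        for i in rng:
            yield (i, s - i)

# P assigns to each pair (i, j) its position in the enumeration, so P is a
# pairing between (an initial piece of) N x N and N
P = {pair: n for n, pair in enumerate(islice(diagonals(), 10))}
assert P[(0, 0)] == 0 and P[(1, 0)] == 2 and P[(0, 2)] == 5
```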
Theorem 4.13. Suppose that A_n is countable for each n ∈ N. Then ⋃_{n∈N} A_n
is countable.
So far all the examples discussed in this section have involved sets that
are either finite or equinumerous to N. Are there any infinite sets that are
not equinumerous to N, or is there only one "size of infinity"? If all infinite
sets are equinumerous, then there is little else to say about this notion.
However, this is not the case, as we shall see in the next section.
EXERCISES FOR §4
1. Show that if A is countable and not empty, then there are infinitely many f's
such that f : A → N is 1-1.
2. Prove that Q+ ∼ N by making use of a table as in the second proof that
N × N ∼ N.
3. Let A ∼ B, and suppose that A has 20 elements. How many pairings are there
between A and B? Justify your answer.
In Exercises 4 through 6 let A, B, C, and D be sets such that A ∼ C and
B ∼ D.
a contradiction. On the other hand, if a ∉ B, then a ∉ f(a), so, again by the
definition of B, we conclude that a ∈ B, a contradiction. Since both assumptions
a ∈ B and a ∉ B lead to contradictions, our assumption that B ∈ Ran f
is erroneous, i.e., B ∉ Ran f, as we needed to show. □
Corollary 5.2. Let A be an arbitrary set. Then A ≁ P(A).
PROOF: Suppose for some set A we have A ∼ P(A). Then there is a pairing
P from A onto P(A), contradicting Theorem 5.1. □
[Table of the entries a_ij, with columns labeled 0, 1, 2, 3, 4, ....]
Here a_ij is 0 if j ∉ f(i), and a_ij is 1 if j ∈ f(i). For example, if f(3) is the set
of primes, then the fourth line of the table begins as follows:
0011010100010 .... We define B as before, that is, B = {x : x ∈ N, x ∉ f(x)}
= {m : m ∈ N and a_mm = 0}. Then for each n, B ≠ f(n), since n ∈ f(n) iff
a_nn = 1 and a_nn = 1 iff n ∉ B. Note how the elements of B are determined by
the diagonal of the table, namely the entries a_nn for n ∈ N.
Diagonal arguments play an important role in many proofs in this book.
As with the proof of Theorem 5.1, the "diagonal" considered may not be
immediately obvious but it is usually helpful to rewrite the proof in tabular
form.
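The diagonal construction is easy to imitate computationally. A hedged Python sketch (the particular f below is ours, chosen only for illustration):

```python
def diagonal_set(f, n):
    """A finite piece of B = {m : m not in f(m)}: its members below n.
    Here f maps each natural number to a set of natural numbers."""
    return {m for m in range(n) if m not in f(m)}

f = lambda i: {k * (i + 1) for k in range(1000)}   # an arbitrary sample f
B = diagonal_set(f, 10)
# B differs from every f(i) at the "diagonal" position i:
assert all((i in B) != (i in f(i)) for i in range(10))
```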
The power set of an infinite set is certainly infinite, since P(A) ⊇ {{a} : a ∈ A},
and so Corollary 5.2 implies that there are different sizes of infinity.
Is there a reasonable way to compare different sizes of infinity? Can one
speak of one infinite set having greater "magnitude" than another infinite
set? The following definition fits well with our intuition and provides a
basis for size comparisons of infinite sets.
We say that B is at least as numerous as A if A is equinumerous with a
subset of B. If this is the case, we write B ≽ A or A ≼ B. In other words,
B ≽ A if there is a 1-1 function from A into B. If B is at least as numerous
as A but not equinumerous to A, we say that B is more numerous than A
(or A is less numerous than B) and we write B ≻ A (or A ≺ B).
If A is an arbitrary set, then the function f defined by f(a) = {a} for all
a ∈ A yields a 1-1 function from A into P(A). Thus A ≼ P(A). By Corollary
5.2, A is not equinumerous with P(A). Hence A is less numerous than
P(A), and we have proved:
The proofs of parts i and ii are immediate from the definitions. Part iii
follows from Theorem 3.1i. We will prove iv in the next section. Part v is
considerably harder and will not be proved until § 1.9.
Note that Corollary 4.10 and Theorem 5.4v together imply that the
magnitude of N is minimal among the infinite sets. In other words, we
have
Here is a direct proof of Corollary 5.5 that does not make use of
Theorem 5.4. Let f be a function whose domain is P(A) - {∅} and such that
f(X) ∈ X for each X ∈ P(A) - {∅}. Now define a 1-1 function g : N → A as follows:
g(0) = f(A),
g(1) = f(A - {g(0)}),
g(2) = f(A - {g(0), g(1)}),
g(3) = f(A - {g(0), g(1), g(2)}),
and so on. In other words g(n+1) = f(A - {g(m) : m ≤ n}). Notice that for
each n, g(n+1) is defined, since A - {g(m) : m ≤ n} ≠ ∅. Since g(n+1) ∉
{g(m) : m ≤ n}, it is clear that g is 1-1. Hence g : N → A is 1-1.
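A finite-horizon Python sketch of this construction; here A is taken to be the set of even numbers and the choice function simply picks the least available element (both are illustrative assumptions, since any infinite A and any choice function would do):

```python
from itertools import count

def choice(excluded):
    # a sample choice function: the least even number not yet used
    return next(n for n in count(0, 2) if n not in excluded)

def g(n):
    # g(0) = choice(A), g(k+1) = choice(A - {g(0), ..., g(k)})
    picked = []
    for _ in range(n + 1):
        picked.append(choice(set(picked)))
    return picked[-1]

print([g(k) for k in range(5)])   # [0, 2, 4, 6, 8]; the values are distinct, so g is 1-1
```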
5. Prove that if A has n elements, where n ∈ N, then P(A) has 2^n elements. (Try a
proof by induction on n, using Exercise 10 of §1.4.)
6. Prove:
(a) A ≼ B ∩ C implies A ≼ B and A ≼ C.
(b) A ≺ B ∩ C implies A ≺ B and A ≺ C.
(c) B ∪ C ≼ A implies B ≼ A and C ≼ A.
(d) B ∪ C ≺ A implies B ≺ A and C ≺ A.
Figure 1.1
6. Show that ^NR ∼ R. [Hint: R ≼ ^NR is clear. Let A = {r : 0 < r < 1 and r ∈ R}. Let
f ∈ ^NA; say f(n) = 0.r_{n,1} r_{n,2} r_{n,3} .... Let F(f) = 0.s_1 s_2 ... where s_j = 0 if j is not of
the form 2^k 3^i, and s_j = r_{k,i} otherwise. Show that F : ^NA → R is 1-1.]
7. Show that
(a) ^NN ∼ R,
(b) ^NR ∼ R.
(Hint: Show R ≼ ^NN ≼ ^NR.)
8. Let C be the class of continuous functions. Then C ∼ R.
Theorem 7.1. Let T be the set of transcendental numbers and A the set of
algebraic numbers. Then T ∼ R and A ∼ N.
PROOF: We first show that there are countably many algebraic numbers.
Let I[x] be the set of all integral polynomials in the variable x. If
f(x) ∈ I[x], say f(x) = a_0 + a_1x + ... + a_nx^n with a_n ≠ 0, we call n the degree of
f(x). Recall that a polynomial of degree n with real coefficients can have
at most n real roots, a fact usually proved in courses in elementary algebra.
1.8 Orderings
In every branch of mathematics orderings of one sort or another are
encountered. We have no intention of giving a comprehensive classifica-
tion of the various orderings that arise, but instead we restrict our attention
to those that arise most frequently and that are particularly important in
logic.
A partial ordering is a binary relation R such that for every x, y, z
i. not xRx;
ii. xRy implies not yRx;
iii. xRy and yRz implies xRz.
[Recall that xRy means (x, y) ∈ R and that "not xRy" means (x, y) ∉ R.] We shall
often use the symbol < to denote a partial ordering.
An ordered pair (A, R) is a partially ordered structure if A ≠ ∅ and R is a
partial ordering with field ⊆ A.
EXAMPLE 8.2. For a, b ∈ N+ we write a|b if there is a c ∈ N such that c ≠ 1
and c ≠ a and a·c = b. Then (N, |) is a partially ordered structure.
EXAMPLE 8.3. Let A be the set of polynomials with real coefficients. Let
p, q ∈ A. Write p|q if for some r ∈ A, p·r = q and the degree of p and of r is
positive. Then | is a partial ordering with field A.
EXAMPLE 8.4. Let F = ^RR. Given f, g ∈ F, define f △ g if f(x) ≤ g(x) for
each x ∈ R and f(z) < g(z) for some z ∈ R. Then (F, △) is a partially
ordered structure.
predecessor has an immediate predecessor. (N, <) and (I, <) are discretely
ordered, but (Q, <) and (R, <) are not.
If B ≠ ∅ and (B × B) ∩ R is a linear ordering, then B is called a chain or
a branch. In Example 8.2 above, {2^n : n ∈ N+} is a chain. A minimal element
in a partially ordered structure (A, R) is an element a ∈ A such that for no
b ∈ A do we have bRa. If there is no b ∈ A such that aRb, then a is a
maximal element. If for each b ∈ A we have aRb or a = b, then a is the
least element, and if for each b ∈ A we have bRa or a = b, then a is the
greatest element.
In Example 8.1, ∅ is the least element and X is the greatest element. In
Example 8.5 there is no minimal element and no maximal element. If we
let A = P(N) - {∅, N}, then (A, ⊊) is a partially ordered structure with as
many minimal elements and as many maximal elements as there are elements
in A (see Exercise 9).
A well ordering is a linear ordering R having the additional property
that
v. Every non-empty subset of the field of R has a least element.
In other words, for every non-empty X ⊆ the field of R, there is an
element x ∈ X such that xRy for all y ∈ X - {x}.
With the usual "less than" relation the natural numbers are well
ordered. However, neither I nor R+ is well ordered. To see this let X be the
set of elements less than 1 (of I and R+ respectively). Clearly X does not
have a least element.
Next we describe a construction that provides many examples of well-
ordered structures.
Let (A, <_A) be a linearly ordered structure, and let (B_a, <_a) be a linearly
ordered structure for each a ∈ A. Define <_A ⊕ {(B_a, <_a) : a ∈ A} to be the
structure (B, <_B), where B = {(a, b) : a ∈ A and b ∈ B_a}, and (a, b) <_B (c, d)
iff a <_A c or (a = c and b <_a d).
Loosely speaking, (B, <_B) is obtained by replacing each a ∈ A with a
copy of (B_a, <_a). For example, if A = {0,1,2}, B_i = N for each i ∈ A, and
<_A and <_i are the usual orderings on A and on B_i, then <_A ⊕ {(B_i, <_i) : i ∈ A}
can be viewed as the ordering obtained by stacking three copies of
(N, <) one upon the other. It is easy to see that the resulting structure is
well ordered. More generally we have the following.
Theorem 8.6.
i. <_A ⊕ {(B_a, <_a) : a ∈ A} is linearly ordered.
ii. If (A, <_A) is well ordered and if (B_a, <_a) is well ordered for each a ∈ A,
then <_A ⊕ {(B_a, <_a) : a ∈ A} is well ordered.
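A small Python sketch of the comparison rule, with three stacked copies of (N, <) as in the example above (the helper names are ours):

```python
def less_B(p, q, less_A, less):
    """(a, b) <_B (c, d) iff a <_A c, or a = c and b <_a d;
    `less` maps each a in A to the ordering <_a on B_a."""
    (a, b), (c, d) = p, q
    return less_A(a, c) or (a == c and less[a](b, d))

usual = lambda x, y: x < y                  # the usual order on N
less_A = usual
less = {i: usual for i in range(3)}         # B_0 = B_1 = B_2 = N

assert less_B((0, 7), (1, 0), less_A, less)   # everything in copy 0 precedes copy 1
assert less_B((2, 3), (2, 5), less_A, less)   # within a copy, the usual order
```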
then c ≮_A a, and so we cannot have (c, d) <_B (a, b). If a = c and b <_a d,
then d ≮_a b, and so again (c, d) ≮_B (a, b). Now suppose that (a, b) <_B (c, d)
and (c, d) <_B (e, f). Then either
i. a <_A c <_A e,
ii. a <_A c = e and d <_c f,
iii. a = c <_A e and b <_a d, or
iv. a = c = e and b <_a d and d <_c f.
In each case a <_A e, or a = e and b <_a f; hence (a, b) <_B (e, f). □
PROOF: ii. Let X be a non-empty subset of B. Let X_1 = {a : (a, b) ∈ X}.
X_1 ≠ ∅, and X_1 ⊆ A. Since (A, <_A) is well ordered, X_1 has a least element,
say a*. Let X_2 = {b : (a*, b) ∈ X}. Then X_2 ≠ ∅ and X_2 ⊆ B_{a*}. Since
(B_{a*}, <_{a*}) is well ordered, X_2 has a least element, say b*. Clearly (a*, b*) is
the least element of X. □
An initial segment of a linearly ordered structure (A, <) is a subset
X ⊆ A such that for each x ∈ X and a ∈ A, if a < x then a ∈ X.
For example, {x : x ∈ Q and x < π} is an initial segment of Q but not of
R, and {x : x ∈ R and 0 < x ≤ 4} is an initial segment of R+ but not of R.
Our next theorem shows that given two well-ordered structures an initial
segment of one of them is a copy of the other. For this we need some
definitions and several easy lemmas.
Through the remainder of this section we let (A_1, <_1) and (A_2, <_2) be
well-ordered structures.
A binary relation S ⊆ A_1 × A_2 is order preserving [with respect to
(A_1, <_1) and (A_2, <_2)] if whenever (x_1, y_1) ∈ S and (x_2, y_2) ∈ S, then x_1 <_1 x_2
iff y_1 <_2 y_2.
Clearly, an order preserving relation is a 1-1 function.
An order preserving relation S is an initial pairing if Dom S is an initial
segment of (A_1, <_1) and Ran S is an initial segment of (A_2, <_2).
Lemma 8.7. If S and S' are initial pairings and Dom S = Dom S', then
S = S'.
PROOF: Assume that the hypothesis holds but that for some x ∈ Dom S we
have Sx ≠ S'x. Let x* be the least such x (in the sense of <_1), say
S'x* <_2 Sx*. Then Sz = S'z <_2 S'x* for each z <_1 x*, and Sx* ≤_2 Sz for
each z ≥_1 x*, z ∈ Dom S. Thus S'x* ∉ Ran S, and so Ran S is not an initial
segment of (A_2, <_2). This contradicts the assumption that S is an initial
pairing. □
Theorem 8.11. If (A_1, <_1) and (A_2, <_2) are well-ordered structures, then
there is a unique order preserving function S such that either Dom S = A_1 and
Ran S is an initial segment of (A_2, <_2), or Ran S = A_2 and Dom S is an
initial segment of (A_1, <_1).
PROOF: Let ℋ be the set of all initial pairings. Let S = ⋃ℋ.
We first show that S is order preserving. For suppose (x_1, y_1) ∈ S and
(x_2, y_2) ∈ S and x_1 <_1 x_2. Then (x_1, y_1) ∈ S_1 and (x_2, y_2) ∈ S_2 for some S_1, S_2 ∈ ℋ.
By Lemma 8.9, S_1 ⊆ S_2 or S_2 ⊆ S_1; let's say S_1 ⊆ S_2. Hence y_1 <_2 y_2,
since S_2 is order preserving. Thus S is order preserving.
Dom S = ⋃{Dom S' : S' ∈ ℋ}, and this is an initial segment of
(A_1, <_1) by Lemma 8.10. Similarly Ran S is an initial segment of (A_2, <_2).
Thus S is an initial pairing.
Suppose Dom S ⊊ A_1 and Ran S ⊊ A_2. Let a be the least element (in the
sense of <_1) of A_1 - Dom S, and let b be the least element (in the sense of
<_2) of A_2 - Ran S. Then clearly S ∪ {(a, b)} ∈ ℋ, and so (a, b) ∈ S. But
then a ∈ Dom S, a contradiction. Hence Dom S = A_1 or Ran S = A_2.
The unicity of S follows from Lemma 8.7. □
Let <_B be a linear ordering with field B, and let <_C be a linear
ordering with field C. We say that <_B is an initial segment of <_C if B is an
initial segment of <_C and <_B = <_C ∩ (B × B). The following theorem will
be useful in the next section.
Theorem 8.12. Suppose ℋ is a set of linear orderings (well orderings) such
that whenever <_B ∈ ℋ and <_C ∈ ℋ, one of them is an initial segment of the
other. Then ⋃ℋ is a linear ordering (well ordering) and each <_B ∈ ℋ is
an initial segment of ⋃ℋ.
PROOF: ⋃ℋ is a binary relation; call it <. Suppose that y < z and z < y.
Then y <_B z and z <_C y for some <_B, <_C ∈ ℋ. We may assume that <_B is
an initial segment of <_C. But then y <_C z and z <_C y, contrary to the
assumption that <_C is a linear ordering. Hence y < z implies z ≮ y. Similarly
one shows that x ≮ x, and that if x < y and y < z then x < z. Hence <
is a linear ordering.
Suppose that a < b where b ∈ B, and B is the field of <_B ∈ ℋ. Then
a <_C b for some <_C ∈ ℋ. Either <_C is an initial segment of <_B, or <_B is
an initial segment of <_C; and both alternatives imply that a ∈ B. Hence <_B
is an initial segment of <.
Finally, suppose each <_B ∈ ℋ is a well ordering and X is a non-empty
subset of the field of ⋃ℋ. Let a ∈ X. Then a ∈ C, where C is the field of
some <_C ∈ ℋ. Let d be the least element of X ∩ {x : x ≤_C a}. Then d is the
least element of X in the sense of <, since <_C is an initial segment of <.
Hence < is a well ordering. □
sits well with the intuitions. This is particularly true in set theory. For
example, we shall use it in proving that given any sets A and B either
A~B or B~A.
In the remainder of this section, and in the next, we shall present some
of the consequences of the axiom of choice.
Theorem 9.1. Every set can be well ordered, i.e., for every set A there is a
binary relation < such that < is a well ordering of A.
The intuitive idea behind the proof that follows is this. A choice
function f on P(A) - {∅} can be used to well-order parts of A as follows.
Take f(A) to be the least element a_0. The next element is f(A - {a_0}); call
it a_1. The immediate successor of a_1 is f(A - {a_0, a_1}); call it a_2, and so on.
We collect all orderings obtained in this way in a set X and show that any
two members of X fit together in the sense that one is an initial segment of
the other. From this it follows that ⋃X is a well ordering of A.
PROOF: Let f be a choice function on P(A) - {∅}. Let X be the set of all
well orderings <_B where <_B has field B for some B ⊆ A and b = f(A -
{x : x <_B b}) for all b ∈ B. (So given an initial segment Y of <_B, f picks the
next element in the <_B ordering from A - Y.) We show that X satisfies
the hypotheses of Theorem 8.12 and then that ⋃X well-orders A.
First notice that X ≠ ∅, since the empty relation well-orders {f(A)}.
Now suppose that <_B and <_C belong to X. By Theorem 8.11 we can
assume that we have an order preserving function S : B → C such that Ran S
is an initial segment of C. We claim that S(x) = x for all x ∈ B. If not,
there is a least b (in the sense of <_B) such that S(b) ≠ b, say S(b) = d. But
d = f(A - {x : x <_C d}), and x <_C d implies S⁻¹(x) <_B b, so S⁻¹(x) = x.
Hence {x : x <_C d} = {x : x <_B b}, and d = b, a contradiction. Therefore
S(x) = x for all x ∈ B, and <_B is an initial segment of <_C. We now know
that X satisfies the hypotheses of Theorem 8.12, and hence ⋃X is a
well ordering; call it <.
Now let b belong to the field of <. Then for some <_B ∈ X, b ∈ B.
Hence b = f(A - {x : x <_B b}), and so b = f(A - {x : x < b}). Therefore < ∈
X.
Let A* be the field of <. We claim that A* = A. For if not, we can add
d = f(A - A*) at the top to extend < to <#, i.e., we define x <# y if x < y
when both x, y ∈ A*, and x <# d when x ∈ A*. Clearly <# ∈ X, and so
d ∈ A*, a contradiction. Therefore < well-orders A. □
Recall that we postponed the proof of Theorem 5.4v, which states that
for any sets A and B, either A ≼ B or B ≼ A. Assuming the axiom of choice,
we can now supply the proof. By Theorem 9.1, there is a well ordering <_A
on A and a well ordering <_B on B. Hence by Theorem 8.11 there is an
order preserving function f on A into B or on B into A. Since f is 1-1, we
have A ≼ B or B ≼ A as required. □
Theorem 9.5 (König's Infinity Lemma). Suppose (A, <) is a tree such that
A is infinite but each a ∈ A has only finitely many immediate successors.
Then (A, <) has an infinite chain.
then X = N and we are done. So let x ∈ X. Suppose that y ∈ x ∪ {x} and
z ∈ y. We must show z ∈ x ∪ {x}. But y ∈ x ∪ {x} implies y ∈ x or y = x.
Since x is ∈-transitive, y ∈ x and z ∈ y gives z ∈ x. y = x yields z ∈ x
immediately. Hence x ∪ {x} is ∈-transitive. Hence each n ∈ N is ∈-transitive.
Now consider the set K of all x ∈ N such that y ∈ x implies y ∈ N.
Clearly 0 ∈ K. Suppose x ∈ K, and let y ∈ x ∪ {x}. Either y ∈ x or y = x; in
either case y ∈ N. Hence by the definition of N, K = N, and so N is
∈-transitive. □
Theorem 10.3. Each n ∈ N is well ordered by ∈, and so is N.
PROOF: The set X consisting of those members of N that are well ordered
by ∈ certainly contains 0. Suppose x ∈ X. We need to see that x ∪ {x} ∈ X.
x ∉ x, for otherwise {x} would be a subset of x having no ∈-minimal
member, contrary to the assumption that ∈ well-orders x. A similar
argument shows that there is no y ∈ x such that x ∈ y. Also, if z ∈ y and
y ∈ x, then z ∈ x by the preceding lemma. These observations together with
our assumption that ∈ well-orders x show that ∈ linearly orders x ∪ {x}.
Now let y be a non-empty subset of x ∪ {x}. If y ∩ x ≠ ∅, then y has an
∈-least member, since ∈ well-orders x. If y ∩ x = ∅, then y = {x}, and x is
the ∈-minimal member (we already noticed that x ∉ x). Thus every subset
of x ∪ {x} has an ∈-minimal member, and so x ∪ {x} is well ordered by
∈. Using this, similar considerations show that ∈ well-orders N also. □
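The finite numbers of this section can be built directly on a computer. A minimal Python sketch (using frozensets so that sets can be members of sets; the function name is ours):

```python
def von_neumann(n):
    """The finite number n = {0, 1, ..., n-1}, built as 0 = {} and k+1 = k U {k}."""
    x = frozenset()
    for _ in range(n):
        x = x | {x}
    return x

three = von_neumann(3)
assert von_neumann(2) in three and von_neumann(1) in von_neumann(2)
# epsilon-transitivity: every member of a member of 3 is a member of 3
assert all(z in three for y in three for z in y)
```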
What about extending the sequence 1, 2, 3, ... to include "infinite numbers"?
Since n = {0, 1, ..., n-1} for each n ∈ N, a reasonable candidate for
the first number greater than each "finite number" is N itself, {0, 1, 2, ...}.
Then why not go on and consider N ∪ {N} a number, the immediate
successor of N? Calling this number N + 1, we go on to the successor of
N + 1, namely N + 1 ∪ {N + 1}, which we call N + 2. Continuing, we get
N + 3, N + 4, and so on. Continuing, let the number immediately following
N, N+1, N+2, ... be {0, 1, 2, ..., N, N+1, N+2, ...}, which we can call N·2.
Then continue from there, letting N·2 + 1 = N·2 ∪ {N·2}, and so on. This
begins the sequence of ordinal numbers, but what we need is an explicit
way of defining them all.
It is tempting to define the ordinals as being the members of the least
set X such that
i. 0 ∈ X,
ii. x ∈ X implies x ∪ {x} ∈ X,
iii. x ⊆ X implies ⋃x ∈ X.
Unfortunately, such a set X is an impossibility, as we shall see in §1.11.
Although this approach fails, there is a satisfactory alternative. Rather
than attempting to define the collection of all ordinals, we define the
property of being an ordinal, taking as our guideline Theorems 10.2 and
10.3.
Theorem 10.5.
i. Ord N.
ii. Ord α implies Ord α ∪ {α}.
iii. If α ∈ β and Ord β, then Ord α.
PROOF OF i. This is immediate by Theorems 10.2 and 10.3. □
PROOF OF ii. See the proof of Theorem 10.3. □
PROOF OF iii. Suppose α ∈ β and Ord β. Let x ∈ y and y ∈ α. Since β is
∈-transitive, y ∈ β, and hence so is x. Since β is linearly ordered by ∈,
x ∈ α. Hence α is ∈-transitive. Since α ⊆ β by the ∈-transitivity of β, it
follows that α is well ordered by ∈. Hence Ord α. □
times do carry over (see Exercises 7 and 8). Although there are many
interesting theorems about ordinal arithmetic, they are outside our main
interest, and we go on to consider another notion of number.
The finite numbers have an attribute that some of the ordinals do not
have. No two members of N are equinumerous, and so each finite set has a
unique number associated with it, its magnitude. However, this is not a
property that all ordinals enjoy. In fact, as we shall see, if α is infinite and
β is no greater than α, then α ∼ α + β ∼ α·β. However, there is a more
restrictive notion of number that does not have this defect.
Theorem 10.11.
i. Each n ∈ N is a cardinal, and N is also.
ii. If κ is an infinite cardinal, then {α : Ord α and α ≼ κ} is a cardinal, and in
fact is the smallest cardinal larger than κ.
iii. If x is a set of cardinals, then ⋃x is a cardinal.
PROOF OF i is clear. □
PROOF OF ii. Since {α : Ord α and α ≼ κ} is ∈-transitive and well ordered, it
is an ordinal β. Since we cannot have β ∈ β, we must have κ ≺ β and β ≻ α
for any α ∈ β. Hence β is a cardinal greater than κ. If λ ∈ β, then λ ≼ κ, so
β is the least cardinal greater than κ. □
PROOF OF iii. Let x be a set of cardinals. Then ⋃x is an ordinal by
Corollary 10.8ii. Now suppose that α ∈ ⋃x and α ∼ ⋃x. But then α ∈ κ
for some κ ∈ x. Also, by the ∈-transitivity of κ, α ⊆ κ. Hence, by the
Cantor-Bernstein theorem (actually Theorem 6.1) we see that α ∼ κ, which is
impossible, since Card κ. Hence ⋃x is a cardinal. □
When viewed as a cardinal, N is usually denoted by ω or ω_0 or ℵ_0. With
each ordinal α associate a cardinal ω_α such that ω_α is the immediate
cardinal successor of ω_β if α = β + 1, and ω_α = ⋃_{β∈α} ω_β if α = ⋃α. ℵ_α is
often used in place of ω_α.
We use κ⁺ to denote the cardinal successor of κ. If κ = λ⁺ for some λ,
then κ is called a successor cardinal; otherwise κ is a limit cardinal.
As a consequence of the axiom of choice, every set has its magnitude.
Theorem 10.12. For every x there is a unique cardinal κ such that x ∼ κ.
Definition 10.13. If x ∼ κ and Card κ, we write c(x) = κ and say that the
cardinality of x is κ.
Definition 10.14.
i. The cardinal sum of κ and λ is c(κ ∪ {(0, α) : α ∈ λ}).
ii. The cardinal product of κ and λ is c(κ × λ).
We denote the cardinal sum of κ and λ and the cardinal product of κ
and λ by κ + λ and κ·λ.
Although we are using '+' and '·' to denote the ordinal addition and
multiplication as well as cardinal addition and multiplication, it should be
clear which is intended by the context and by our convention of using
κ, λ, μ, ν for cardinals and α, β, γ, δ for ordinals.
Much of cardinal arithmetic is extremely simple because of the follow-
ing fact.
Figure 10.1
Since X_α′ and X_α″ are well ordered by <*, so is X_α (by Theorem 8.6), and
hence so is X (by Theorem 8.6 again). We show that (X, <*) is isomorphic
to (κ, ∈↾κ).
First let S be a proper initial segment of X. Then X - S has a least
element x*. Let x* ∈ X_{α*}. Then S ⊆ (α* + 1) × (α* + 1) = {(γ, δ) : γ, δ ∈ α* + 1}.
Let Y = (α* + 1) × (α* + 1). Since Y ∼ c(α* + 1) × c(α* + 1), and
since c(α* + 1)·c(α* + 1) = c(α* + 1) by the induction hypotheses, we have
cS ∈ κ. Hence <* is a well ordering in which every initial segment has
cardinality less than κ.
Let β be that ordinal such that (β, ∈↾β) ≅ (X, <*) (by Theorem 10.9).
Clearly β ∉ κ (otherwise Card κ is false, since β ∼ X ≽ κ). From what we
have just shown, κ ∉ β. Hence κ = β, and so κ ∼ X as needed. □
Theorem 10.18 (König's Lemma). Suppose κ_β < λ_β for each β ∈ α. Then
⋃_{β∈α} κ_β ≺ ∏_{β∈α} λ_β.
PROOF: For each δ ∈ ⋃_{β∈α} κ_β let H(δ) be that function h in ∏_{β∈α} λ_β such
that for all β ∈ α, h(β) = δ if δ ∈ κ_β and h(β) = 0 otherwise. Clearly
H : ⋃_{β∈α} κ_β → ∏_{β∈α} λ_β is 1-1. We need only show that ⋃_{β∈α} κ_β is not
equinumerous to ∏_{β∈α} λ_β. For suppose G : ⋃_{β∈α} κ_β → ∏_{β∈α} λ_β. Let
X_β = {(G(δ))(β) : δ ∈ κ_β}. Then X_β ⊆ λ_β and λ_β - X_β ≠ ∅. Let J(β) be the
least element in λ_β - X_β. Then J ∈ ∏_{β∈α} λ_β, but clearly J ∉ Ran G. Hence G
is not onto. Therefore ⋃_{β∈α} κ_β ≺ ∏_{β∈α} λ_β. □
By analogy with finite arithmetic, one might conjecture that if 1 < κ_β ≤
λ_β for all β ∈ α, then ⋃κ_β ≺ ∏λ_β, but this is false (see Exercise 13).
The definition of cardinal exponentiation does not follow the pattern
used for addition and multiplication; κ^λ in the cardinal sense is not defined
as the cardinality of κ^λ in the ordinal sense. Instead, the definition is
motivated by the fact that for finite cardinals m and n, m^n = c(^nm).
Definition 10.19. If κ and λ are cardinals, then κ^λ = c(^λκ).
Again our notation involves some ambiguity, since κ^λ now has two
different meanings depending on whether ordinal exponentiation or cardinal
exponentiation is used. For example, 2^ω in the ordinal sense is countable,
but 2^ω in the cardinal sense is not. However, in the remainder of the
text, exponentiation will always mean cardinal exponentiation.
The proof of the next lemma is trivial and so is omitted.
Lemma 10.20.
i. If λ ≥ μ, then κ^λ ≥ κ^μ.
ii. If κ ≥ λ, then κ^μ ≥ λ^μ.
Theorem 10.21.
i. κ^λ · κ^μ = κ^{λ+μ}.
ii. (κ^λ)^μ = κ^{λ·μ}.
iii. κ^λ · μ^λ = (κ·μ)^λ.
PROOF OF i. We consider only the case where λ ≥ μ and λ ≥ ω and κ ≥ 2,
since the other cases are analogous or trivial. By the lemma, κ^λ ≥ κ^μ and
κ^λ ≥ ω. By Theorem 10.16, κ^λ · κ^μ = κ^λ. Also λ + μ = λ, so κ^{λ+μ} = κ^λ. □
PROOF OF ii. Let f ∈ ^μ(^λκ). Then for every α ∈ μ, β ∈ λ we have (f(α))(β) ∈
κ. Let H map f to the function g_f defined by g_f(α, β) = (f(α))(β). Then
H : ^μ(^λκ) → ^{λ×μ}κ is 1-1. It is also clear that each g ∈ ^{λ×μ}κ is g_f for some f ∈ ^μ(^λκ), and
so H is onto ^{λ×μ}κ. Hence (κ^λ)^μ = κ^{λ·μ}. □
PROOF OF iii. We consider the case where λ ≥ ω, since the other case is
trivial. We may assume that κ ≥ μ. Then κ^λ · μ^λ = κ^λ by the lemma and
Here are two calculations that make use of Lemma 10.20 and Theorem
10.21 and show that Lemma 10.20 does not hold if we replace '≤' by '<'
throughout.
i. (2^ω)^ω = 2^{ω·ω} = 2^ω = (2^ω)^1 (compare with Lemma 10.20i).
ii. 2^ω = 2^{ω·ω} = (2^ω)^ω ≥ ω^ω, which along with 2^ω ≤ ω^ω gives 2^ω = ω^ω (compare
with Lemma 10.20ii).
As we shall see in the next sections, the value of κ^λ in relation to κ and λ
is very mysterious. For example, while it is consistent to assume that
2^ω = ω_1, it is also consistent to assume that 2^ω = ω_2. In fact, it is known that
if κ > ω and there is no countable subset X of κ such that ⋃X = κ, then
one can consistently assume that 2^ω = κ. No principle we have used so far
determines the value of 2^ω.
We end this section by considering extensions of two of the most
commonly used principles in mathematics, definition by recursion and
proof by induction on ω. What makes these principles work is the fact that
ω is well ordered. Therefore it is not surprising that these principles have
useful extensions to transfinite ordinals. We shall consider several exten-
sions here and in the problems. Others, which are more complicated to
state, will not be mentioned. For the most part, they are easy enough to
devise when the need arises, following the form of the versions given here.
Theorem 10.22". Suppose that 1/1 is a property such that 1/1 (a) whenever 1/1 ({3)
for all {3 Ea. Then 1/1 (a) for every ordinal a.
PROOF: Like that of Theorem 10.22. o
The difference between Theorems 10.22 and 10.22″ is that α has been
replaced by Ord (which, as we will see in the next section, is not a set), and
X has been replaced by the property ψ, which does not correspond to a set
either. Theorem 10.22′ can be stated with analogous modifications.
Next we consider various transfinite generalizations of definition by
recursion. Again, we shall not try to give the most comprehensive versions
of the theorem, since these are more complicated to state, but follow the
same general outline as the simpler versions.
F(β) = G(β, s_β),
10. Suppose cX = cX', cY = cY', and X and Y are infinite. Show that c(X ∪ Y) =
c(X' ∪ Y'), c(X × Y) = c(X' × Y'), and c(^YX) = c(^{Y'}X').
11. Suppose κ, λ, and μ are cardinals. Show that (κ + λ)·μ = κ·μ + λ·μ.
12. Suppose that for each n ∈ ω, κ_n < 2^ω. Show that ⋃_{n∈ω} κ_n < 2^ω.
13. A relation R is well founded if for each x either {y : yRx} = ∅ or there is a y*
such that y*Rx and there is no z such that zRy* and zRx. (Note that every
well ordering is well founded.) Prove the following extension of Theorem
10.22: Suppose R is well founded and that X has the property that whenever
x ∈ Dom R and {y : yRx} ⊆ X, then x ∈ X. Then Dom R ⊆ X.
14. Extend Theorem 10.22″ to well-founded properties along the lines of Exercise
13.
What should one do with set theory in view of these paradoxical 'sets'?
Should we abandon its use as a notational and conceptual framework for
classical mathematics? And what about Cantor's proof that most numbers
are transcendental? Should that be abandoned because of the paradoxes
even though none of the paradoxical "sets" are mentioned in the proof? In
fact, the paradoxical "sets" of the examples never arise in any branch of
classical mathematics, and the sets that do arise seem to be completely
innocuous as far as giving rise to contradictions is concerned.
In the early part of this century, Russell and Whitehead in their
Principia, and Zermelo in a series of papers, attempted to axiomatize a
significant portion of set theory in a way that would avoid the paradoxes.
The axiomatization given by Zermelo and later modified by Fraenkel is
the one most frequently encountered today. This axiomatization appears to
be highly successful. The axioms have strong intuitive appeal, apparently
asserting simple truths about sets. The axiomatization seems to be free of
contradiction, and moreover, is strong enough to provide a base for all of
classical mathematics. One axiom, the axiom of extensionality, says that a
set is determined by its members. The remaining axioms either state that a
certain set exists or that a set is obtained from a given set by a specified
operation. In developing set theory within such an axiomatic framework,
the only sets that can be asserted to exist are those that can be proven to
exist by a valid argument whose only premises about sets are those given
by the axioms.
The remainder of this section is devoted to the Zermelo-Fraenkel
axiomatization (abbreviated ZF).
Axiom of Pairing. If x and y are sets, then there is a set z such that for all
w, w ∈ z iff w = x or w = y.
Axiom of Union. For every x there is a y such that z ∈ y iff there is a w ∈ x
with z ∈ w.
Axiom of the Power Set. For every x there is a y such that for all z, z ∈ y iff
z ⊆ x.
This axiom assures us that there is a set containing ω. The next axiom
will allow us to extract ω from such a set.
Recall from § 1.10 that a functional property P is one such that for every
x there is at most one y such that Pxy. In our examples of paradoxes the
pathological sets are inordinately large. If y is a set obtained by replace-
ment as the range of P restricted to x, then the magnitude of y is no
greater than that of x, which provides some intuitive justification of
replacement as an axiom.
Our statement of the axiom of replacement is somewhat sloppy in that
the notion of a property is undefined. At this point we shall take it to mean
any statement about sets mentioning only the E-relation. A precise
version of this axiom (and more elegant statements of the others) will be
given in §3.4.
We used the axiom of replacement in proving Theorem 10.9, although
an alternative proof can be given that avoids its use. In our proof we need
to know that there is a set F whose elements are the f_a's for a ∈ A. Let
P(u, v) hold iff u ∈ A and v = f_u, or u ∉ A and v = ∅. Then P is functional,
and the range of P restricted to A is the needed set F. The axiom of
replacement was used again in Theorem 10.11ii.
A very useful consequence of the axiom of replacement is the axiom of
separation, which says that a definable subset of a set is a set. In other
words, if x is a set and S a property then there is a set y whose elements
are exactly those of x which satisfy S. To deduce the axiom of separation
from replacement, first choose some a ∈ x such that a has property S (if no
such a exists, then the axiom of the null set gives {y : y ∈ x and Sy}). Now
let Puv be the property
Either u ∈ x and Su and v = u, or u ∉ x and v = a, or not Su and v = a.
Clearly, for every u there is a unique v such that Puv. Now the axiom of
replacement applied to x and this P gives the set {z : z ∈ x and Sz}
immediately.
On the other hand, we will see in Exercise 7 of § 1.12 that the axiom of
separation does not imply replacement and hence is a weaker axiom.
Since the axiom of separation is a theorem of ZF as we have just seen,
there is no need to include it as an additional axiom.
Zermelo's original axiomatization included separation but not regularity.
Fraenkel later modified the axiomatization by deleting separation and
adding replacement.
Separation has been used implicitly several times in previous sections.
As another example of its use, we can prove that if x is a set, then ⋂x is
a set. If x = ∅, then we are done, and if there is some y ∈ x, then separation
gives us the existence of the set {z : z ∈ y and z ∈ w for all w ∈ x}, which is
⋂x. Letting s be the set specified in the axiom of infinity, and letting x
be the set of all z ⊆ s such that ∅ ∈ z and whenever u ∈ z then u ∪ {u} ∈ z
(x exists by the axioms of power set and separation), we see that ⋂x is a
set, so the axioms give the existence of ω.
The axiom of separation is strong enough to develop most of classical
mathematics within our set theoretic framework. However, quite recently,
Martin found an assertion in analysis that can be proved from replacement
but not separation. The assertion is 'Every Borel set of reals is determined'.
(The set of Borel sets is the smallest set containing the open intervals that
is closed under complementation and countable unions. A set X is de-
termined if one of the two players in the following game has a winning
strategy: The players a and b move alternately beginning with the a player.
Each chooses an integer between 0 and 9 inclusive; say a chooses a_i on the
ith move and b chooses b_j on the jth. Then player a wins just in case
0.a_0 b_1 a_2 b_3 a_4 ... ∈ X.)
The axioms mentioned so far, including replacement, constitute the
Zermelo-Fraenkel axiomatization, and as we have said are sufficient to
form a foundation for classical mathematics. Moreover, the axiomatization
is extremely elegant in that it can be stated in terms of ∈ alone (see §3.4).
On the other hand, there are statements of relatively recent vintage that
have considerable mathematical interest but cannot be proved or dis-
proved in ZF. Some of these have been considered as additional axioms.
The one that has gained the broadest acceptance is the axiom of choice.
The axiomatization obtained by adjoining the axiom of choice to the
Zermelo-Fraenkel axioms will be denoted by ZFC.
At this point we want to stress that everything done in the preceding
sections can be justified within ZFC, and most of it within ZF. For
example, consider the notion of the ordered pair (x,y). The only property
of the ordered pair that has mathematical significance is the following: If
(x, y) = (u, v), then x = u and y = v. If we are to develop a suitable notion of
ordered pair within our axiomatic framework, then (x,y) has to be given a
definition in terms of x, y, and ∈, and then using ZFC the ordered pair
must be shown to exist and to have the requisite property. This we now do.
The axiom of pairing asserts that for every x and y there is a z such that
w ∈ z iff w = x or w = y. By extensionality, there is only one z such that
w ∈ z iff w = x or w = y. This z is, of course, the unordered pair of x and y,
which we denote by {x, y}. For x = y we write {x} instead of {x, x}. Now
define (x, y) to be {{x}, {x, y}}. Another application of pairing gives
ZFC. However, what we have just done should help convince the reader
that such a task is a straightforward, albeit tedious, exercise. At the end of
this section there are several problems that continue in this direction.
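As a quick sketch of the requisite property (the full verification belongs to the pages omitted here): suppose {{x},{x,y}} = {{u},{u,v}}. If x = y, the left side is {{x}}, so {u} = {u,v} = {x}, giving u = v = x and hence x = u, y = v. If x ≠ y, the right side must also contain a two-element set, so u ≠ v; then {x} = {u} and {x,y} = {u,v}, giving x = u and y = v.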
There are other statements that have been considered as axioms but
none of them has gained the wide acceptance of the axiom of choice.
Moreover, it has been shown by Gödel that if ZF is free of contradiction,
then so is ZFC (we shall say more about this in the next section). With
some of these other axioms, either no such proof of relative consistency
can be given, or else the axiom does not appeal to the intuitions as strongly
as the axiom of choice (at least at this time), or it is not as useful.
Perhaps the most famous of these axioms is the generalized continuum
hypothesis, abbreviated GCH. This states that the cardinal successor of the
cardinal κ is 2^κ. The special case for κ = ω, namely the assertion ω_1 = 2^ω, is
called the continuum hypothesis (CH). Since 2^ω = cP(ω) and P(ω) ∼ R (see
§1.5), the CH says that every subset of R either is equinumerous to R or is
countable. As concrete a statement as this seems, Gödel's work in 1938
and Cohen's work in 1963 show that CH and GCH cannot be proved or
refuted within ZFC.
A very active area of investigation in the past few years has centered
around axioms of infinity. Roughly speaking, these axioms assert the
existence of extremely large cardinals. The axiom of infinity of ZF is an
axiom of infinity and gives us the existence of w, a cardinal which has the
following two properties:
i. nEw implies 2n E w, and
ii. if nEw and mO,ml, ... ,mn _ 1 Ew, then UjEnmjEw.
It is natural to ask if there are any other cardinals /C that enjoy these
properties, i.e., cardinals /C such that
i. AE /C implies i' E /C, and
ii. A E /C and Va E /C for every 0: EA implies U aE"va E /C.
i ≠ j}. Say that κ is weakly compact if whenever Y ⊆ [κ]², then there is some
Z ⊆ κ such that cZ = κ and either [Z]² ⊆ Y or [Z]² ⊆ [κ]² - Y. ω is weakly
compact, by a famous theorem of Ramsey (see Exercise 3). However, the
axiom asserts the existence of a larger weakly compact cardinal. Moreover,
it can be shown that if ω ∈ κ and κ is weakly compact, then κ is strongly
inaccessible, but much larger than the first strongly inaccessible cardinal.
No proof of relative consistency of this axiom with ZF can be given, since
no such proof can be given for IC.
Far larger than the first uncountable weakly compact cardinal is the
first measurable cardinal, a notion introduced by Ulam. Say that a
function f : P(κ) → {0, 1} is a two valued measure on κ if f(κ) = 1, and
whenever λ < κ and {X_α : α ∈ λ} is a set of pairwise disjoint subsets of κ,
then f(⋃_{α∈λ} X_α) = Σ_{α∈λ} f(X_α). A cardinal κ is measurable if there is a
measure f whose domain is P(κ).
Other cardinals have been considered which dwarf the first measurable
cardinal (huge, super-huge, Vopenka, etc.), and the game of finding ever
larger axioms of infinity continues.
Before we close this section we want to mention several famous state-
ments that, like the axiom of choice, are independent of ZF, i.e., can be
neither proven nor refuted within ZF. The first concerns the reals with the
usual ordering (R, <). This ordering is dense, is complete (every set having
an upper bound has a least upper bound), and has the property that no
uncountable set of intervals can be pairwise disjoint. Souslin's hypothesis
asserts that any other ordering (A, <) having these three properties is
isomorphic to the reals. This statement, earthy as it seems, is independent
not only of ZF but also of ZFC plus the GCH.
Another question, even more relevant to analysis, is the following. Is
every set of reals measurable in the sense of Lebesgue? Lebesgue measura-
bility is a standard notion that is treated in most texts dealing with real
analysis, and we shall not discuss it here except to say that the Lebesgue
measure assigns to certain subsets of the reals a length, and this notion of
length generalizes the usual one as applied to, say, a union of disjoint
intervals. In courses of real analysis, it is proved that sets exist that are not
Lebesgue measurable, and these proofs invariably use the axiom of choice.
In fact, choice must be used in a strong form, for Solovay in 1964 proved
that the assertion 'all sets are Lebesgue measurable' is consistent with ZF
and a restricted version of choice. This restricted version of choice, called
the countable axiom of choice, asserts that every countable set of non-
empty sets has a choice function. Countable choice and the assumption
that all sets of reals have length in the sense of Lebesgue yields a very slick
development of the early portion of real analysis.
Finally, we want to mention an axiom that is not known to be con-
sistent with ZF. This is Mycielski's axiom of determinacy. Recall the
game mentioned in our discussion of Martin's theorem: A set X of reals is
fixed and two players a and b alternately choose integers between 0 and 9
inclusive; say a chooses a_1, then b chooses b_2, then a chooses a_3, etc. If the
number 0.a_1 b_2 a_3 b_4 ... belongs to X, then player a is declared the winner;
otherwise b wins. The axiom of determinacy states that the game is
determined regardless of X, i.e., one of the players has a winning strategy
(of course, which player has a winning strategy will depend on X). ZFC
implies the negation of the axiom of determinacy, although ZF plus
determinacy implies that every countable set of non-empty subsets of R
has a choice function. Moreover, it is conjectured that ZF plus de-
terminacy is consistent with the countable axiom of choice. Much of the
interest in determinacy stems from its implications for real analysis. For
example, determinacy implies that every set of reals is Lebesgue measur-
able. In addition determinacy implies that every set of reals has the
property of Baire and every uncountable set of reals contains a perfect set
(we leave the definition of these notions to a course in real analysis).
Determinacy is also known to imply the consistency of 'there is a measurable
cardinal', which, as we shall see in the next section, shows that
determinacy is not a consequence of ZF.
Lemma 12.2.
i. If x is ∈-transitive, then so is P_α(x).
ii. If β ∈ α, then P_β(x) ∈ P_α(x) and so P_β(x) ⊆ P_α(x).
iii. β ∈ P_α(x) iff β ∈ α.
PROOF: The proof of each clause is by induction on α. Clearly, parts i
through iii are true when α = 0. Suppose i through iii hold for all δ ∈ α.
There are two cases to consider: α = γ + 1 for some γ, and α = ⋃α. We
prove the first and leave the second case, which is similar, as an exercise.
Let z ∈ w and w ∈ P_α(x). By the definition of P_α(x), we have w ⊆ P_γ(x)
and so z ∈ P_γ(x). By the induction hypothesis P_γ(x) is ∈-transitive, and so
z ⊆ P_γ(x). Hence z ∈ P_α(x), giving part i. □
The definition of P_α(x) immediately gives P_γ(x) ∈ P_α(x), so by part i,
P_γ(x) ⊆ P_α(x). □
If β < α, then β ≤ γ, and so by the induction hypothesis β ⊆ P_γ(x). So
β ∈ P_α(x). Conversely, if β ∈ P_α(x), then β ⊆ P_γ(x), and so β ≤ γ < α. This
proves part iii. □
in the context of all sets. For example, ω + 1 is a cardinal in P_{ω+2}(0) but
not in the context of all sets (Exercise 9). In the work of Gödel and Cohen,
x may be the power set of y in the sense of one model but not in the sense
of another or in the context of all sets. However, there are many important
properties that behave more uniformly, at least with respect to structures
of the following kind.
Definition 12.6. The n-ary property ψ is absolute if for every standard
structure M and every x_1, ..., x_n ∈ M, we have ψx_1...x_n is a true statement
about sets iff M ⊨ ψx_1...x_n.
To get from the third statement to the second note that if x = (y, z) and
x ∈ M, then {y}, {y, z} ∈ M and so y, z ∈ M. □
PROOF OF vii follows from ii and vi. □
PROOF OF viii follows from vii. □
PROOF OF ix. Let f ∈ M. By part viii we may suppose that f is a function
and that M ⊨ f is a function. Notice that x ≠ y and (x, z), (y, z) ∈ f iff
(x, z), (y, z) ∈ M, x ≠ y, and (x, z), (y, z) ∈ f. Hence f is 1-1 just in case M ⊨ f is 1-1.
Also x ∈ Dom f iff for some y (x, y) ∈ f, iff (x, y) ∈ M and (x, y) ∈ f, iff
M ⊨ x ∈ Dom f. Hence z = Dom f iff M ⊨ z = Dom f. Similarly for Ran f. □
PROOF OF x. Let x ∈ M. Then z ∈ y and y ∈ x iff z, y ∈ M and z ∈ y and
y ∈ x. □
PROOF OF xi follows from x and ii. □
Now let's look at the other axioms of infinity discussed in § 1.11. All are
known to imply IC, the axiom that asserts the existence of a strongly
inaccessible cardinal. Since our next theorem states that IC is not implied
by ZFC, it follows that the other axioms of infinity are not implied by
ZFC either.
belong to P_α(0), and so {λ_β : β ∈ λ} ∈ P_{α+1}(0). Hence ⋃{λ_β : β ∈ λ} ∈ M,
so α is not strongly inaccessible in this case either. □
We shall now use the terms structure and model in a more general
context that involves an abuse of notation. We now think of a structure as
an ordered pair (M,e) where M is a unary property and e is a binary
property. The abuse of notation arises because M and e need not be sets,
in which case (M, e) no longer denotes an ordered pair of sets. However, we
think of M as the collection of all sets x having the property M. In general,
this collection, like the collection of all sets, or the collection of all
cardinals, will be "too large" to be a set.
The notion of truth in (M,e) and the notion of model are then extended
in the obvious way. For example, to say that pairing is true in (M,e), or
that (M, e) is a model of pairing, or that (M, e) ⊨ pairing, means that for
every x and y having the property M, there is a z having the property M
such that for all w having the property M, w ∈ z iff w = x or w = y. Clearly,
there is a need to further abuse notation, and we write 'x ∈ M' instead of
'x has the property M'; also, we will write 'x ∈ y' instead of '(x, y) has the
property e' (except for one exercise, e will be ∈, so this takes care of itself).
The definitions of '∈-transitive' and 'standard model' are generalized in
the obvious way.
Now let Σ be the axioms of ZFC other than regularity. Let x have the
property M just in case x ∈ P_α(0) for some α. Now consider (M, ∈↾M).
Trivial modifications of Lemma 12.3 and the proof of Theorem 12.4 show
that M is a model of ZFC. Hence if the axioms of ZF other than regularity
have a model, then ZF has a model. On the other hand, there is a model of
the axioms of ZF other than regularity in which regularity fails. Thus the
regularity axiom is not redundant, and in fact we have
described a model of ZFC plus GCH; then, in 1963, Cohen, assuming the
existence of a model for ZF, produced a model of ZF in which choice fails,
and a model of ZFC in which GCH fails. (One can show that choice is a
consequence of ZF plus GCH, and so no model of ZF plus GCH exists in
which choice is false.) Together these results prove the following.
Theorem 12.10.
i. The axiom of choice is independent of ZF.
ii. GCH is independent of ZFC.
The constructions of Gödel and Cohen are much too involved to give
here, although it might appear at first glance that part of Gödel's contribution
has already been dealt with in Theorem 12.8 and Theorem 12.9, where
we constructed a model of ZFC. However, our proofs that choice holds in
these models depended on choice being used in the universe of all sets.
Thus assuming the truth of ZFC in the universe of all sets, we produced
models of ZFC having additional properties. But suppose that some
doubter believes that ZF is true in the universe of all sets but that choice is
not; even more, he suspects that choice can be disproved from ZF. Gödel produced a model of ZFC assuming only that ZF has a model and hence
showed that ZFC is as consistent as ZF, and so choice cannot be disproved
from ZF.
Not only are these theorems of Gödel and Cohen milestones in the
foundations of mathematics in themselves, but the techniques used to
prove them have been extremely fruitful in the last decade, yielding
consistency results that answered longstanding problems in logic, topology,
analysis, algebra, and other branches of mathematics. Work in this direc-
tion is still continuing at an enormous rate.
6. Let x′ = x for all x other than x = 0 or x = 1. Let 0′ = 1 and 1′ = 0. Let x ∈′ y iff x ∈ y′. With M the collection of all sets, prove that (M, ∈′) is a model of ZF in which regularity fails. This along with the argument preceding Theorem 12.9 gives a proof of 12.9 and shows that regularity is independent of ZF.
7. The axiom of separation is weaker than the axiom of replacement. To show this let M = Pω+ω(0) and consider (M, ∈↾M). Verify that this is a model of separation and all the axioms of ZF except replacement. To see that replacement fails, consider the property ψxy which holds iff x ∈ ω and y = Pω+x(0). As we have seen in §1.11, separation is a consequence of the other axioms of ZF, and so this example shows that separation is weaker than replacement.
8. Show that the following are absolute:
   x = y × z,
   x = ∪y,
   x = ∩y,
   x = 2,
   x = ω.
2.1 Introduction
What are the capabilities and limitations of computers? Are they glorified
adding machines capable of superfast arithmetic computations and noth-
ing else? Can they outdo man in the variety of problems they can handle?
Let's narrow the question a bit. Consider the class of number theoretic
functions that a computer can be programmed to compute or that a man
can be instructed to compute. Are any of these functions computable by a
computer but not by a man, or by a man but not by a computer? Is there a
number theoretic function that is not computable by any computer, and if
so, can such functions be described? Is there a computer that can be
programmed to compute any function that any other computer can com-
pute? Is man such a computer?
In order to make these questions amenable to mathematical analysis, Alan Turing, in 1936, introduced a purely abstract mathematical notion of
computer and presented heuristic arguments in support of the view that his
"machines" have exactly the same computational powers as a "real com-
puter" or a man, at least if speed of computation is ignored.
The reason that his arguments are necessarily heuristic is that neither
the notion of "man computable" nor "real computer" is mathematically
defined. His mathematical machines are intended to be an abstraction of
"real computers" that allows precise mathematical analysis, just as the
integral is an attempt to make rigorous our intuitive notion of area.
We begin this section with a description of Turing machines, and later
take up some of the various intuitive arguments that support the thesis that
man, computers, and Turing machines are computationally equivalent.
Most of the work will be aimed at delimiting the capabilities of Turing machines.
Here we have substituted 0's and 1's for the blanks and checks of our
initial description. We now need three functions: one to tell us how to alter
the scanned term, one to tell us which term to scan next, and one to tell us
the next state of the machine.
The functions d,p,s are the "print-erase" function, the "next position"
function, and the "next state" function respectively. For example, if the
tape position t is given by
(3,2): 1111010 ...
and the machine M=(d,p,s) is such that
d(1,2) = 0,
p(1,2) = -1,
s(1,2) = 4,
then one application of M to t yields a new tape position
(2,4): 1101010 ....
Originally the third cell is scanned and 2 is the state of the machine. This is
denoted by the marker (3,2). d tells us to place a 0 in the scanned cell; p
tells us to move the scanner to the left, so that now the second cell is the
scanned cell; and s tells us that the new state is 4. More generally, we have
Definition 2.3. Let t be the tape position with marker (j,k) and tape a, and let M be the machine (d,p,s). Then M(t), the successor tape position, has marker (j + p(a_j,k), s(a_j,k)) and tape a_1 a_2 ... a_{j-1} d(a_j,k) a_{j+1} a_{j+2} ..., provided that (a_j,k) ∈ Dom M and j + p(a_j,k) > 0. A partial computation of M is a sequence t_1, t_2, ..., t_m of tape positions such that t_{i+1} = M(t_i) for each i < m. The sequence is a computation if M(t_m) is not defined. If t_1, t_2, ..., t_m is a computation, then t_1 is the input and t_m the output. If t_1, ..., t_r is a partial computation, we may write M^r(t_1) for t_r.
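Definition 2.3 is mechanical enough to simulate directly. The Python sketch below is mine, not the book's: it assumes a machine is encoded as a dictionary from (scanned symbol, state) to (print, move, new state), with move +1 (right), -1 (left), or 0 (stay), and a tape position as a pair ((j,k), tape). The machine of the worked example above, with d(1,2) = 0, p(1,2) = -1, s(1,2) = 4, would contain the entry {(1, 2): (0, -1, 4)}.

```python
# A minimal sketch of Definition 2.3 under the encoding described above.
def step(machine, position):
    """Return M(t), the successor tape position, or None if it is undefined."""
    (j, k), tape = position
    while len(tape) < j:            # the tape is 0 from some point on
        tape = tape + [0]
    a_j = tape[j - 1]
    if (a_j, k) not in machine:     # (a_j, k) not in Dom M
        return None
    d, p, s = machine[(a_j, k)]
    if j + p <= 0:                  # the scanner may not fall off the left end
        return None
    new_tape = tape[:j - 1] + [d] + tape[j:]
    return ((j + p, s), new_tape)

def compute(machine, position, max_steps=10_000):
    """Iterate step() until M(t) is undefined; the last position is the output."""
    for _ in range(max_steps):
        nxt = step(machine, position)
        if nxt is None:
            return position
        position = nxt
    raise RuntimeError("no computation within max_steps (M may not halt)")
```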
scanner shift to the left, we use R, 0, and L in the table instead of 1, 0, and -1.
As an example, take M to be the machine above and t the tape position
(2,1): 0100000 ....
Then M(t) is
(3,1): 0000000 ....
[Since the second term is the scanned term and its value is 1, and the state
is 1, d(1, 1)=0 is the value of the second term in M(t). p(1, 1)= 1, which
says that the term to the right of the second term, i.e., the third term, is the
new scanned term. Finally s(1, 1)= 1, which says that the new state is I.] It
should be clear that the following is a computation:
(2,1): 0100000 ...
(3,1): 0000000 ...
(2,2): 0010000 ...
(2,2): 0110000 ....
An easy induction on n shows that with M as above there is a computation that begins with
(2,1): 0 1...1 0 0 0 ...
         (n consecutive 1's)
Pred(m) = { m - 1   if m ≥ 2,
          { 1       if m = 1.
[Read 'Pred(m)' as 'the predecessor of m'.]
PROOF OF i. Consider the machine

        0      1
  1            0R2
  2     1L3    1R2
  3     0R4    1L3
PROOF OF ii. We give the machine for d = 2 only, but this can be easily modified to handle any given d:

        0      1
  1     0R2    0R1
  2     1L3    0R1
  3     103
PROOF OF iii. The general idea will be quite clear from a discussion of the special case k = 4, t = 3. For this case we use the following machine:

        0      1
  1     0R2    0R1
  2     0R3    0R2
  3     0R4    1R3
  4     0L5    0R4
  5     0L5    1L6
  6     0R7    1L6

If (n_1,n_2,n_3,n_4) is the input, then a computation results in which the first two blocks of 1's are erased, the third is passed over, the fourth is erased, and then the scanner returns to the first 1 of the third block. □
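Assuming the step/compute sketch given after Definition 2.3, the table above (as transcribed here) can be checked mechanically. The encoding below is a hypothetical illustration, not part of the text.

```python
# Hypothetical dictionary encoding of the k=4, t=3 projection machine above:
# key = (scanned symbol, state), value = (print, move, new state).
P43 = {
    (0, 1): (0, +1, 2), (1, 1): (0, +1, 1),
    (0, 2): (0, +1, 3), (1, 2): (0, +1, 2),
    (0, 3): (0, +1, 4), (1, 3): (1, +1, 3),
    (0, 4): (0, -1, 5), (1, 4): (0, +1, 4),
    (0, 5): (0, -1, 5), (1, 5): (1, -1, 6),
    (0, 6): (0, +1, 7), (1, 6): (1, -1, 6),
}

def input_tape(*blocks):
    """Tape 0 1^n1 0 1^n2 0 ... with the scanner on the first 1, in state 1."""
    tape = [0]
    for n in blocks:
        tape += [1] * n + [0]
    return ((2, 1), tape)

# Input (2, 1, 3, 2): the surviving block should be the third one (three 1's).
out = compute(P43, input_tape(2, 1, 3, 2))
print(out)   # the scanner rests on the first 1 of a block of three 1's
```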
PROOF OF iv. It is easy to verify that Pred(m) is computed by the following machine:

        0      1
  1            0R2
  2     103
        0      1
  1            102

namely, the identity function f(n) = n. In fact it is easy to see that any
computable function is computed by infinitely many different Turing
machines (Exercise 5).
Our machine for addition computes both a unary function and a binary
function but not a ternary function. Some machines do not compute any
function; others compute k-functions for all k.
EXAMPLE 2.7. Let X denote the set of all odd numbers. Then R_X(n) equals 1 if n is odd and equals 2 if n is even. It is easily seen that the following machine computes R_X:

        0      1
  1     1L2    0R2
  2     103    0R1

Hence, X is computable.
EXAMPLE 2.8. Let X denote the set of all positive integers ≥ 2, i.e., X = {m : m ≥ 2}. R_X(m) = 1 if m ≥ 2 and R_X(1) = 2. It is easy to see that the following machine computes R_X:

        0      1
  1            1R2
  2     1L5    0R3
  3     0L4    0R3
  4     0L4    101

Hence X is computable.
EXAMPLE 2.9. Let P denote the relation <. That is, P = {(m,n) : m < n}. The representing function R_P for this relation is defined by R_P(m,n) = 1 if m < n, and R_P(m,n) = 2 otherwise.
The Turing machine below computes R_P in the following way: Given the input (m,n), the machine erases the first check on the tape and then the last check on the tape, and then repeats the procedure until either the block of 1's on the left has been erased but not the entire block on the right (in which case all 1's are then erased and then a single 1 is printed), or the right block of 1's is erased first (in which case all 1's are then erased and two 1's are printed). The dotted line divides the table into halves: the top half dictates the erasing of the leftmost 1 and then movement to the right after checking that not all of the 1's in the left-hand block have been erased; the lower half of the table dictates a dual operation but from right to left.
        0       1
  1             0R2
  2     0R5     1R3
  3     0R4     1R3
  4     0L6     1R4
  5     1011    0R5
 ...........................
  6             0L7
  7     0L10    1L8
  8     0L9     1L8
  9     0R1     1L9
 10     1L5     0L10
Much more interesting examples of computable functions and relations
will be found in the sections that follow.
The notion of a Turing machine can be formulated in many equivalent
ways-equivalent in the sense that the resulting set of computable func-
tions will be the same as the set of functions that are computable accord-
ing to our present definitions. For example, two way infinite tapes of the
form ... a-2a-1aOa1a2'" can be used in place of one way infinite tapes.
The terms of the tape might be restricted to {O, t, ... ,n} instead of {O, I}.
There are many more alternative formulations, each being advantageous in
certain circumstances and disadvantageous in others. Our choice is moti-
vated by personal preference and expediency in the present development.
Theorem 3.1. If f is a computable r-function and g_1, ..., g_r are computable k-functions, then the composition of f with g_1, ..., g_r is a computable k-function.
This result states, in other words, that the composition of computable
functions is computable. The proof of this result is given in §2.4. In this
section we shall be primarily concerned with showing how results of this
type are used to demonstrate computability without recourse to machines.
EXAMPLE 3.2. Using Theorem 3.1, we see that the function f(n) = 2n is computable, since f(n) = Sum(P_{1,1}(n), P_{1,1}(n)) = Sum(n,n) = 2n and both the functions Sum and P_{1,1} are computable by Theorem 2.5.
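Viewed extensionally (ignoring the machines entirely), the composition used in Theorem 3.1 and Example 3.2 amounts to the following sketch; the helper names are mine, not the book's.

```python
# A sketch of Theorem 3.1 at the level of ordinary functions:
# compose f with g1, ..., gr to get h(n) = f(g1(n), ..., gr(n)).
def compose(f, *gs):
    return lambda *n: f(*(g(*n) for g in gs))

Sum = lambda m, n: m + n
P11 = lambda n: n                      # the projection P_{1,1}

double = compose(Sum, P11, P11)        # Example 3.2: f(n) = 2n
print(double(7))                       # 14
```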
EXAMPLE 3.4. We show that, for every n, the 1-function Prod_n, defined by Prod_n(m) = nm, is computable. (In Example 3.2 above we showed that Prod_2 is computable.) The proof is by induction on n. For n = 1, we have Prod_1 = P_{1,1}, which is computable by Theorem 2.5. Now suppose that Prod_k is computable. Then Prod_{k+1}(m) = Sum(Prod_k(m), P_{1,1}(m)) = km + m = (k+1)m. Since Sum and P_{1,1} are computable by Theorem 2.5 and Prod_k is computable by assumption, we see by Theorem 3.1 that Prod_{k+1} is computable. This completes the induction.
Corollary 3.5. Let f be a computable r-function, and let k ∈ N+, i_1 ≤ k, i_2 ≤ k, ..., i_r ≤ k. Then the k-function h defined by h(n̄) = f(P_{k,i_1}(n̄), P_{k,i_2}(n̄), ..., P_{k,i_r}(n̄)) is computable.
PROOF: The result follows immediately from Theorems 2.5 and 3.1. 0
EXAMPLE 3.6. Suppose f is a computable 2-function. Then the 3-function h_1 defined by h_1(a,b,c) = f(a,b) is computable, since h_1(a,b,c) = f(P_{3,1}(a,b,c), P_{3,2}(a,b,c)). Similarly the following 2- and 3-functions are computable:
i. the 3-function h_2 defined by h_2(a,b,c) = f(a,c),
ii. the 3-function h_3 defined by h_3(a,b,c) = f(a,a),
iii. the 2-function h_4 defined by h_4(a,b) = f(b,a).
We can now use Theorem 3.7 to get the computability of Mult. We first observe that the function f defined by f(a,b,c) = Sum(a,c) is computable by Corollary 3.5. Now apply Theorem 3.7, taking k = 1, g = P_{1,1}, and this f.
Diff'(m,n) = { n - m   if n - m ≥ 1,
             { 1       otherwise.
For m = 1, n arbitrary, we have h(1,n) = Pred(n) = Diff'(1,n). Suppose for some k and all n we have h(k,n) = Diff'(k,n). Then h(k+1,n) = Pred(h(k,n)) = Pred(Diff'(k,n)). To complete the induction we need only show that Diff'(k+1,n) = Pred(Diff'(k,n)). If Diff'(k+1,n) ≥ 2, then Diff'(k+1,n) = n - (k+1) = (n - k) - 1 = Pred(Diff'(k,n)). If Diff'(k+1,n) = 1, then n - (k+1) < 1, so n - k < 2. Thus Pred(Diff'(k,n)) = 1 and the induction is completed.
computable:
h(1) = d,                 (1)
h(n+1) = f(h(n), n).      (2)
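The recursion scheme (1)-(2) can also be read as an evaluation procedure; a minimal sketch, assuming d is a number and f a computable 2-function (the names are mine):

```python
# Evaluate h defined by h(1) = d and h(n+1) = f(h(n), n).
def by_recursion(d, f):
    def h(n):
        value = d
        for i in range(1, n):          # apply the recursion n-1 times
            value = f(value, i)
        return value
    return h

factorial = by_recursion(1, lambda prev, n: prev * (n + 1))   # h(n) = n!
print(factorial(5))                    # 120
```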
a·n_1^{k_1}·n_2^{k_2}·…·n_m^{k_m} = Prod(a·n_1^{k_1}·…·n_{m-1}^{k_{m-1}}, Pow(C_{1,k_m}(n_m), P_{1,1}(n_m))).
Thus any polynomial with positive integral coefficients having only one term is computable. Now we use induction on the number of terms. If h has r terms, r > 1, then h(n_1,...,n_m) = g(n_1,...,n_m) + f(n_1,...,n_m), where g is a polynomial with r - 1 terms and f is a polynomial with one term. So f is computable, as we have just shown, and g is computable by the induction hypothesis. Hence h(n_1,...,n_m) = Sum(g(n_1,...,n_m), f(n_1,...,n_m)) and is computable by Theorems 2.5 and 3.1.
EXAMPLE 3.14. The relations > and < are computable. We saw in Example 2.8 that the set {m : m ≥ 2} is computable. Let f be the representing function for this set, i.e., f(m) = 1 if m ≥ 2 and f(1) = 2. Then f is computable. It follows that the 2-function g defined by
g(m,n) = f((m+1) ∸ n)
is computable, since g is obtained by composition of computable functions [∸ was shown computable in Example 3.9, and m+1 = Sum(m, C_{1,1}(m))]. Note that g(m,n) = 1 if and only if (m+1) ∸ n ≥ 2, i.e., if and only if m - n ≥ 1, or m > n. This proves that g(m,n) = 1 if m > n and g(m,n) = 2 otherwise. Thus g is the representing function of the relation > (which is the set {(m,n) : m > n}). Since g is computable, > is computable. If we define h(m,n) = g(n,m), then h is the representing function of <, i.e., h(m,n) = 1 if m < n and h(m,n) = 2 otherwise. We see from Example 3.6 that h is computable and so the relation < is also computable.
EXAMPLE 3.15. The relations ≤ and ≥ are computable: Let f be the representing function for the relation >, i.e., f(m,n) = 1 if m > n, and f(m,n) = 2 otherwise. Let g be defined by g(m,n) = 3 ∸ f(m,n) = C_{2,3}(m,n) ∸ f(m,n). Then g is computable (by Example 3.9, Example 3.14, Theorem 2.5, and Theorem 3.1). Also, g is the representing function for ≤, for g(m,n) = 1 if and only if 3 ∸ f(m,n) = 1, and this occurs if and only if m ≯ n, i.e., if and only if m ≤ n. Setting h(m,n) = g(n,m) shows that the relation ≥ is computable.
Theorem 3.18. Let P be a computable r-relation, and let g_1, ..., g_r be computable k-functions; then the k-relation Q, defined by 'Q(n̄) if and only if P(g_1(n̄), ..., g_r(n̄))', is computable.
PROOF: Clearly, R_Q(n̄) = R_P(g_1(n̄), ..., g_r(n̄)). Now apply Theorem 3.1. □
Corollary 3.19. Let P be a computable r-relation, and let i_1 ≤ k, i_2 ≤ k, ..., i_r ≤ k. Then the k-relation Q defined by 'Q(n̄) if and only if P(P_{k,i_1}(n̄), P_{k,i_2}(n̄), ..., P_{k,i_r}(n̄))' is computable.
PROOF: Immediate from Theorems 2.5 and 3.18. □
such that P(n̄,x), then μxP(n̄,x) is defined for all n̄ ∈ k(N+). In this case the following equation defines a k-function [with domain k(N+)]:
f(n̄) = μxP(n̄,x).
As we shall see in the next section, if P is computable, there is a Turing machine that will search for the least x such that P(n̄,x) when the input is n̄. This is stated in the next theorem.
Theorem 3.23. Let P be a computable (k+1)-relation. Then there is a Turing machine M such that for every input n̄,
i. the output is μxP(n̄,x) if there is an x such that P(n̄,x), or
ii. there is no output and no x such that P(n̄,x).
Hence, if for every n̄ ∈ k(N+) there is an x such that P(n̄,x), then the function f defined by f(n̄) = μxP(n̄,x) is computable, and in fact is computed by M.
EXAMPLE 3.24. Let Prm(n) be the nth prime number in order of magnitude. So we have Prm(1) = 2, Prm(2) = 3, Prm(3) = 5, Prm(4) = 7, Prm(5) = 11, etc. [Do not confuse the 1-function Prm(n), which enumerates the primes, with the 1-relation Prime(n), 'n is a prime'.] Prm(n) can be defined as follows:
Prm(1) = 2,
Prm(n+1) = μx(Prime(x) ∧ (x > Prm(n))).
(Prime(x) ∧ (x > y)) is computable by Examples 3.15 and 3.22 and Theorem 3.17. Moreover, for every n there is an m such that P(n,m). Hence the function f defined by f(n) = μxP(n,x) is computable by Theorem 3.23.
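The least-number operator and the definition of Prm above are easy to mirror directly; a sketch (names mine), in which a naive primality test stands in for the relation Prime:

```python
def mu(pred):
    """Least x in N+ with pred(x); undefined (loops forever) if there is none."""
    x = 1
    while not pred(x):
        x += 1
    return x

def prime(x):
    return x >= 2 and all(x % d for d in range(2, x))

def Prm(n):
    """Prm(1) = 2, Prm(n+1) = mu x (Prime(x) and x > Prm(n))."""
    p = 2
    for _ in range(n - 1):
        p = mu(lambda x, p=p: prime(x) and x > p)
    return p

print([Prm(n) for n in range(1, 6)])   # [2, 3, 5, 7, 11]
```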
Theorem 3.26. Let g_1, ..., g_r be computable k-functions, and let P_1, ..., P_r be computable k-relations such that {P_1, ..., P_r} is a partition of k(N+). Let f be
Theorem 3.28. The functions and relations described below are computable:
i. Mult(m,n) = m·n.
ii. Pow(m,n) = n^m.
iii. All polynomials with positive integral coefficients.
iv. m ∸ n = { m - n   if m > n,
            { 1       otherwise.
v. n! = 1·2·…·n.
vi. Provided that g is computable, so is
Σ_{i=1}^{m} g(i,n̄) = g(1,n̄) + g(2,n̄) + … + g(m,n̄).
10. Prove that all polynomials in any number of variables are computable.
II. Let P={(a,b):a2 >b}. Prove that P is computable.
12. Let X be a subset of N+ such that N+ - X is finite. Prove that X is
computable.
13. Let X = {(a,b,c) : a² + b² = c²}. Prove that X is computable.
14. Let Y = {(a,b,c) : a² ∸ b² > c²}. Prove that Y is computable.
15. Let g.c.d.(m,n) be the greatest common divisor of m and n. Show that this
function is computable. Do the same for the least common multiple of m and
n, l.c.m.(m,n).
16. Let f(n) = 1 for all n if Fermat's last theorem is true. Otherwise let f(n) = 2 for
all n. Prove that f is computable.
17. Find a relation R(x,y) such that for some but not all n, there is an m such that R(n,m). Let f(n) = μxR(n,x). What is the domain of f? In what sense does the Turing machine mentioned in Theorem 3.23 compute this f?
18. Let
    f(n) = { n   if n and n+2 are both prime,
           { 1   otherwise.
Show that f is computable. Note that f is unbounded if and only if the
following unsolved conjecture is true: There are infinitely many twin prime
pairs, i.e., there are infinitely many primes p such that p + 2 is also prime.
19. Let X = {1} ∪ {n : 2n is the sum of two primes}. Show that X is computable. A
famous unsolved conjecture of Goldbach states that every even number greater
than 2 is the sum of 2 primes. In other words, the conjecture states that
X=N+.
the g's are computable k-functions. Our descriptions of this machine will
be given in terms of machines that compute f and the g's. The description
is detailed enough so that a machine for the composite function can be
written down explicitly whenever machines that compute f and the g's are
given explicitly. Careful proofs that our machines do what we claim they
do can be given by induction on some parameter of the tape (on m in
Lemma 4.2 for example), but this is messy business, and we shall not
torture the reader with these details. However, following the machines
explicitly through one or two computations should demonstrate the work-
ings of the particular machine under consideration.
Similarly, the proof of computability of a function h defined recursively
in terms of computable functions g and f by
h(l,n)=g(n),
h(m+ I,n)= f(h(m,n),m,n)
consists of a description of a machine that computes h, a description given
in terms of machines that compute g and f. As before, one obtains an
explicit description of a machine that computes h when provided with
machines that compute g and f.
To prove Theorem 3.23 we describe a machine that searches for the
least x such that P(n,x), provided that P is computable. The description is
given in terms of a machine that computes Rp.
Next we describe some "component" machines and several methods of
linking machines together. Using these methods with our component
machines will yield the machines for composition, recursion, and the
μ-operator.
|compress|,
such that for each m and n ∈ N+
              *                        *
|compress|: … 1 0^m 1^n 0 …  ↦  … 1 0 1^n 0^m ….
PROOF: Let |compress| be the following machine:

        0      1
  1            1L2
  2     1L3    0R7
  3     0R4    1R2
  4     0L5    1R4
  5     0L6
  6     0R1    1L6
This machine first checks to see if m ≥ 2. If not, then the output has the same tape as the input. If m ≥ 2, then a partial computation yields the tape position
                 *
(k+m,1): … 1 0^{m-1} 1^n 0 0 …,
and this process of printing a 1 to the left of 1^n and erasing a 1 on the right of 1^n is iterated until we get … 1 0 1^n 0^m …. □
Let M = (d,p,s). We may write M(i,k) for (d(i,k), p(i,k), s(i,k)). Let S(M) be the set of all k such that either (0,k) ∈ Dom M (the domain of M), or (1,k) ∈ Dom M, or k ∈ Ran s.
Clearly (j,k)a is an output for M arising from the input t iff (j,u)a is an output of M_u arising from the input t.
Definition 4.4. Let [M,l] be the machine with domain {(i, l+k) : (i,k) ∈ Dom M} such that
[M,l](i, l+k) = (d(i,k), p(i,k), s(i,k) + l)
for each (i,k) ∈ Dom M.
So the table for [M,l] is the result of replacing each state k in the table of M by l + k.
By M we mean MuUMou[MI'U].
It' !
MI
This machine takes all outputs (i,})a of M in which aj=O and uses
these as inputs to MI.
iii. Now let Mo be
UI \lOU+ 11.
Define M to be M u U Mo U [M I, u]. Here all outputs of M in which
~ !
MI
a 1 is scanned are converted to inputs for MI.
iv. Let ~M be M uU Mo. where
!
M o=uIOOll I.
Roughly speaking, an output U,k)a for M in which aj=O is converted
to an input by ~ M and looped back.
v. Take
1101 1,
and define M ~ to be M uU Mo. What would be an output U,k)a for
M, where aj = 1, is loop;d back to M~.
vi. Now define
ICOPyk!
such that
PROOF: We consider only the case k=2, which is enough to illustrate the
general argument. Let M I be the machine
o
I IR2
2 OL3 OR2
3 OL3 IR4
*
Clearly MII(nl,n2)~ lon, In2.
Next we need a machine M2 such that
*
M 2101j + 1 on,-jln20Ij~Olj+1 *
on,-jl n2 0lj+l.
For M2 we can take
o
I ORI IR2
2
3
ORJ
IL4
IR2
IRJ
} Go right and add a I to Ij.
4 OL5 IL4
5
6
OL6
OL6
IL5
IR7 } Go left.
We also need a machine M3 to check if the cell to the right of the scanned
cell in the output of M2 has a I:
o I
I OR2
2 OL3 103
o
rl
MO=M,
Mk+I=[
For example
Lemma 4.8.
1. I I
There is a machine shift right such that
Icompress I
~
Ishift left Ik
I copyk+r-l Ik
~
~
Icompress I
~
Ishift left Ik+,- I
PROOF OF THEOREM 3.1. Let g_1, g_2, ..., g_r be computable k-ary functions, and let f be a computable r-ary function. We want to show that the k-ary function h defined by
h(n̄) = f(g_1(n̄), g_2(n̄), ..., g_r(n̄))
is computable. Let M_f compute f. Then h is computed by
M_{g_1 ... g_r}
↓
|erase|_k
↓
M_f        □
PROOF OF THEOREM 3.7. Let g be a k-function computed by M_g, and f a computable (k+2)-function. Define h by the equations
h(1, n̄) = g(n̄),
h(m+1, n̄) = f(h(m,n̄), m, n̄).
By Corollary 3.5, the function f′ defined by
f′(m, n̄, r) = f(r, m, n̄)
is computable, say by M_{f′}. This gives an alternate definition for h:
h(1, n̄) = g(n̄),
h(m+1, n̄) = f′(m, n̄, h(m,n̄)),
and this definition will guide our construction of a machine that computes h.
Let
* *
MilO I n.-... 0 Ig(;o,
* *
MilO I mn.-IO I m-I n for m> 1.
For MI we can take
J. IR3
oJ.
OR4
IILlI
Icopyk+ I (+1
J. ~
IORII
Icompress I
J.
Ishift left k
I +I
~ IOL2/ILl I ~
I 0R21 ORII
t
[2]
t
Icompress I IILlI
t
r--.:..----, k + I
*
Next we describe a machine M3 such that given nO 1m, our machine
first yields a copy and then uses the copy to compute ~. If the answer is
*
no, then the output is nO 1m + I; if the answer is yes, then the output is
*
nO lmor- I 0 for some r.
I copy k+ 1 Ik+1
t
~
t
IORll
* *
We also need a machine M41 nOlm~otl m, where 1="i.I<;.j<;.knj+k+ 1.
OLl lL2
ORI lL2
Ishift left Ik
t
Ierase I k
•
2. Let
M = ,--I-----'-I_OR---'lI,
M-~
J-~'
There are several heuristic arguments that bolster Turing's stand. Perhaps the most simple-minded is that in over thirty years no function has been found that is man computable but not machine computable.
Another argument requires adequate faith in the genius of men like Post, Church, Kleene, Gödel, and Turing, along with a reluctance to
accept coincidence as an explanation. Each of these men defined a set of
number theoretic functions and proposed that their set be taken as the set
of man computable functions. Although their definitions seemed to differ
radically from each other, it later turned out that they all define the same
set of functions, i.e., they all define the set of machine computable
functions. Most of these alternative definitions were made before Turing
proposed his thesis.
A third argument, and to me the most persuasive, attempts to analyze
the intuitive notion of man computability into its simplest components.
Let's say we have an algorithm for computing the unary function f. The
algorithm is a set of directions given in some language, say English. By
restating the directions if necessary, we can assume that the number n is
written as n consecutive checks on a tape divided into cells. Thus the
directions tell us how to pass from n to f(n), each step of the computation
depending on at most the preceding steps, which, as we have seen in
previous sections, can be coded as the last step of the computation. It
seems quite plausible that the directions can be written in such detail that
the passage from one step in the computation to another is accomplished
by erasing or writing checks on a tape and moving one cell at a time. This
then gives us an algorithm that is very close to a Turing machine.
For these reasons, mathematicians regard the Turing machine as the
mathematical definition of algorithm, and Turing machine computability
as the precise analog of man computability.
With slight modifications, the above arguments offered in support of the
computational equivalence of men and Turing machines can also be given
in support of the computational equivalence of real computers and Turing
machines. The speed of computation does not enter into these considera-
tions; we require only that the computation end in a finite number of
steps.
take an arbitrary (but finite) amount of time and tape. Nor are these
functions computable by man; no finite set of directions will tell a man
how to find f(x) for arbitrary x if f is not computable by a Turing
machine.
The following definition and theorem will be used to prove that among
all the functions, relatively few are computable.
By #_k we mean the k-function with domain kN defined by
#_k(n_1, n_2, ..., n_k) = p_1^{n_1} · p_2^{n_2} · … · p_k^{n_k},
where p_i is the ith prime in order of magnitude. For example, #_3(4,1,3) = 2^4 · 3^1 · 5^3.
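A sketch of #_k itself; the helper names are mine, and the final line reproduces the example #_3(4,1,3):

```python
# #_k(n1, ..., nk) = p1**n1 * p2**n2 * ... * pk**nk, with pi the i-th prime.
def primes(k):
    ps, x = [], 2
    while len(ps) < k:
        if all(x % p for p in ps):
            ps.append(x)
        x += 1
    return ps

def number_of(*ns):
    code = 1
    for p, n in zip(primes(len(ns)), ns):
        code *= p ** n
    return code

print(number_of(4, 1, 3))              # 2**4 * 3**1 * 5**3 = 6000
```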
|u v w x y z|
we define #r to be #_6(#u, #v, #w, #x, #y, #z). If M is a k-row machine, the ith row being r_i, we define #M to be #_k(#r_1, #r_2, ..., #r_k).
For example, if M is the machine

        0R2    1R1
        1R3    0L2
        1R4

then
#r_1 = 2^2 3^1 5^2 7^1 11^1 13^1,
#r_2 = 2^1 3^1 5^3 7^2 11^2 13^2,
#r_3 = 2^1 3^1 5^4,
and
The fact that different machines are assigned different numbers is im-
mediate from Theorem 6.1.
From now on, we shall not mention Theorem 6.1 when making use of it.
We can now easily show that there are functions that are not comput-
able. For each computable k-function f, define G_k(f) to be the smallest number m such that m = #M for some machine M that computes f. Then G_k is a 1-1 function on the set of all computable k-functions into N+. As was seen in §1.5, there is no 1-1 function whose domain is the set of all k-functions on N+ and whose range is contained in N+. Therefore, some k-functions are not computable.
By Theorem 4.13 in Part I we see that there are ℵ_0 computable k-functions. (Hence, there are ℵ_0 computable functions, because a countable union of countable sets is countable, and the set of computable functions is ∪_{k∈N+} C_k, where C_k is the set of computable k-functions.) Since there are 2^{ℵ_0} k-functions, we see that there are 2^{ℵ_0} k-functions that are not computable; so most k-functions are not computable.
We shall now see that there are non-computable functions that are very
easy to describe.
EXAMPLE 6.2. First we make a list (with repetitions) of all the computable 1-functions as follows. If m is the number of a machine that computes a 1-function, we let f_m be that 1-function; if m is not the number of a machine that computes a 1-function, we let f_m(n) = 1 for all n. Then each computable 1-function occurs at least once in the sequence f_1, f_2, .... Now define a 1-function F as follows:
F(n) = f_n(n) + 1.
We claim that F is not computable. For suppose F is computable. Then F is f_l for some l. Hence F(l) = f_l(l), but at the same time F(l) = f_l(l) + 1, a contradiction. Therefore F is not computable. (This is another example of a diagonal argument; see the second proof of Theorem 5.1 in Part I.)
EXAMPLE 6.4 (The Self-Halting Problem). We say that M halts for the
input A if the complete sequence of tape positions determined by M that
begins with A is finite. Let K' be the set of all machines M such that M
halts for the input #M. Let K= {#M:M EK'}. The self-halting problem
Let
M = M_1
    ↓
    M_2 .
Clearly, if M_1 : n ↦ 1, then M does not halt when the input is n, and if M_1 : n ↦ 2, then M : n ↦ 1. Hence M halts on #M iff M_1 : #M ↦ 1 iff M does not halt on #M, a contradiction. Therefore the assumption that R_K is computable is untenable, and we must conclude that K is not computable.
EXAMPLE 6.5. Let K_1 be the set of all numbers m such that for some M, m = #M, and M yields the output 1 when the input is #M. A slight modification of the argument used in Example 6.4 proves the non-computability of K_1.
EXAMPLE 6.6. Let K_2 be the set of all numbers of the form #M, where M computes C_{1,1}, the constant 1-function whose value is always 1. We shall show that K_2 is not computable. Given a machine M, let M̄ be the machine
M_1
↓
M
where M_1 computes C_{1,#M}; say M_1 is as given in the proof of Theorem 2.5. It is easy to see that there is an effective procedure for getting M̄ from M; hence assuming Turing's thesis, the function g, defined as follows, is computable:
g(m) = { #M̄   if m = #M,
       { 1     if for no M is m = #M.
g can be shown computable by the techniques of §2.3 without recourse to Turing's thesis, but we shall not take the time to go through this tedious but straightforward bit of work. Notice that #M̄ ∈ K_2 if and only if #M ∈ K_1 (we defined K_1 in the preceding example). Hence, R_{K_2}(g(m)) = R_{K_1}(m) for all m. Since g is computable and R_{K_1} is not computable by Example 6.5, we
see by Theorem 3.1 that R_{K_2} is not computable either. Hence, K_2 is not computable.
h(n) = f(g(n))?
3. Prove: If A ⊆ N+ and A is infinite, then there are non-computable sets B and C such that A = B ∪ C and B ∩ C is empty. (Hint: Try a cardinality argument. How many ways are there of partitioning A into two subsets?)
4. Let K be the set of all numbers of machines M that yield some number k as an
output for the input # M. Prove that K is not computable.
5. Let E be the set of all numbers of machines that compute bounded 1-functions. Show that E is not computable.
Icompress I
t
.-------,2
Idecode I
that computes the function D(n) that subtracts the location of the first 1 on
the tape from 1 plus the location of the last 1:
I
Clearly, ,decode I yields the output B when the input is # lB.
(See ExerCise 1.)
(2)
m
when II Prmk,(i + m + I) > 1.
i= 1
(3)
'compress'
|compress|
↓
|shift left|
↓
|code|
↓
U′
↓
|decode|.
If M is any machine and (n_1, ..., n_k) any tuple, then
M : (n_1, ..., n_k) ↦ y   iff   U : (#M, n_1, ..., n_k) ↦ y.
We still have to show that STP, the successor tape function, is comput-
able. To do this we require several functions, each of which is easily seen
to be computable from its definition.
Recall that Exp′(m,n) is computable, where Exp′(m,n) = μx(¬(m^x | n) ∨ (m = 1)). Now let
Exp(m,n) = Exp′(Prm(m), n) ∸ 1.
Then if the mth prime divides n, Exp(m,n) is the largest x such that Prm(m)^x divides n.
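A sketch of Exp′ and Exp, reusing the mu and Prm sketches given after Example 3.24; the names are assumptions, not the book's notation:

```python
def Exp_prime(m, n):
    """Least x such that m**x does not divide n (or x = 1 when m = 1)."""
    return mu(lambda x: n % (m ** x) != 0 or m == 1)

def Exp(m, n):
    """Exponent of the m-th prime in n: the largest x with Prm(m)**x dividing n."""
    return Exp_prime(Prm(m), n) - 1

print(Exp(2, 72))                      # 72 = 2**3 * 3**2, so Exp(2, 72) = 2
```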
If m is the number of a machine and l is the number of a tape position, then the code number of the relevant row of the table is given by the function
RR(m,l) = Exp(Exp(2,l), m).
The code number of the relevant column is
RC(l) = Exp(Exp(1,l) + 2, l).
(Recall that the 0-column is coded by 2 and the 1-column by 1.)
i if (k=Exp(I,/»/\(Exp(S":"RC2(m,/),RR(m,/)=j)
T(k,m,/)= { /\(k=Ln/":"2)/\ -,(Exp(I,/)= I/\NP(m,/) =2
2 if k=Ln/":"l
Exp( k + 2, /) otherwise.
The examples above show that the converse of Theorem 8.5 is false.
Hence, using the examples at the beginning of this section and taking
complements, we obtain examples of non-machine enumerable relations.
We do not define machine enumerable functions. The reason is that
machine enumerable functions are computable, as the following theorem
shows.
PROOF: ¬R(m̄) if and only if (∃x)¬S_1(m̄,x). Hence by the theorem, both R and ¬R are machine enumerable. Therefore Theorem 8.6 applies, and so R is computable. □
We end this section with a result in the spirit of Theorem 3.l8-a result
that shows how machine enumerable relations can be combined to yield
other machine enumerable relations.
Theorem 8.11.
i. If R_1 and R_2 are machine enumerable k-relations, then so are R_1 ∪ R_2 and R_1 ∩ R_2.
ii. If R is a machine enumerable (k+1)-relation, then the following are machine enumerable k-relations:
(a) (∃x)R(ȳ,x),
(b) (∃x ≤ y_1)R(ȳ,x),
(c) (∀x ≤ y_1)R(ȳ,x).
PROOF OF i. By Theorem 8.9 there are computable relations S_1 and S_2 such that
R_1(m̄) if and only if (∃x)S_1(m̄,x)
and
R_2(m̄) if and only if (∃x)S_2(m̄,x).
Then R_1(m̄) ∨ R_2(m̄) if and only if (∃x)S_1(m̄,x) ∨ (∃x)S_2(m̄,x), if and only if (∃x)(S_1(m̄,x) ∨ S_2(m̄,x)), where S_1(m̄,n) ∨ S_2(m̄,n) is computable. Hence R_1 ∪ R_2 is machine enumerable by Theorem 8.9. Similarly, R_1(m̄) ∧ R_2(m̄) if and only if (∃x)(S_1(m̄, Exp(1,x)) ∧ S_2(m̄, Exp(2,x))), so R_1 ∩ R_2 is machine enumerable. □
PROOF OF ii. Let R be a machine enumerable (k+1)-relation. By Theorem 8.9 there is a computable (k+2)-relation S such that
R(ȳ,x) if and only if (∃z)S(ȳ,x,z).
Hence,
(∃x)R(ȳ,x) if and only if (∃w)S(ȳ, Exp(1,w), Exp(2,w)),
which (by Theorem 8.9) proves part ii(a). Also,
(∃x ≤ y_1)R(ȳ,x) if and only if (∃z)(∃x ≤ y_1)S(ȳ,x,z),
which proves ii(b) (by an application of Theorem 8.9 and Theorem 3.21). Finally,
(∀x ≤ y_1)R(ȳ,x) if and only if (∃w)(∀x ≤ y_1)S(ȳ,x,Exp(x,w)),
which proves ii(c) (by an application of Theorem 8.9 and Theorem 3.21). □
interesting sets that are machine enumerable but not computable, and
others that are not even machine enumerable.
8. Let R(m,n) hold if and only if there are machines M_1, M_2 such that m = #M_1, n = #M_2, and M_1 and M_2 compute the same 1-functions, or one of the two machines does not compute a function. Is R machine enumerable?
So if x = #(n_1, ..., n_k) and y = #(m_1, ..., m_l), then C(x,y) = #(n_1, ..., n_k, m_1, ..., m_l). Clearly C is computable. Now we have
nl
and
Next we will obtain a more elegant formulation that has only two closure conditions (closure under recursive definitions being omitted).
Definition 9.2. The set of recursive functions, Rec, is the smallest set ℋ such that:
i. +, ·, and R_> belong to ℋ, as does P_{k,t} for each k ∈ N+ and t ≤ k.
ii. If h(n̄) = f(g_1(n̄), ..., g_k(n̄)) and f, g_1, ..., g_k ∈ ℋ, then h ∈ ℋ.
iii. If g ∈ ℋ, and for each n̄ there is an m such that g(m,n̄) = 1, and if h(n̄) = μx(g(x,n̄) = 1), then h ∈ ℋ.
Lemma 9.5.
i. For every k, d ∈ N+, the constant function C_{k,d} is recursive.
ii. Pred ∈ Rec.
iii. The equality relation is recursive.
Finally, note that (∃x ≤ n_1)P(x, n_1, ..., n_k) iff ¬(∀x ≤ n_1)¬P(x, n_1, ..., n_k), and apply part iii. □
Recall that m and n are relatively prime if 1 is their greatest common
divisor. We need several number theoretic facts whose proofs are short
enough to be given here.
Theorem 9.6. If m and n are relatively prime, then there are integers s and t
such that 1 = sm + tn.
PROOF: Let u be the least positive integer that can be written in the form sm + tn where s and t are integers (positive, negative, or 0). Notice that u | m, for if not, then m = uk + l for some k and some l such that 0 < l < u; hence m = (sm + tn)k + l, from which we get l = (1 - sk)m - tkn, contradicting the minimality of u. Similarly u | n. Since m and n are relatively prime, u must be 1. □
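The s and t of Theorem 9.6 can also be found effectively by the extended Euclidean algorithm; the following sketch is mine and is not the book's (non-constructive) argument:

```python
# For relatively prime m and n, find integers s, t with 1 = s*m + t*n.
def bezout(m, n):
    """Return (g, s, t) with g = gcd(m, n) = s*m + t*n."""
    if n == 0:
        return m, 1, 0
    g, s, t = bezout(n, m % n)
    return g, t, s - (m // n) * t

g, s, t = bezout(35, 18)               # 35 and 18 are relatively prime
print(g, s, t, s * 35 + t * 18)        # 1 -1 2 1
```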
Now define the remainder function Rem as follows:
Rem(m,n) = { r   if m = nk + r for some k ≥ 1 and n > r ≥ 1,
           { 0   otherwise.
Theorem 9.7 (Chinese Remainder Theorem). If n_1, ..., n_k are relatively prime in pairs, and if a_i < n_i for all i ≤ k, then there is an m such that Rem(m, n_i) = a_i for all i ≤ k.
PROOF: Let z = n_1·n_2·…·n_k and let z_i = z/n_i. Since z_i and n_i are relatively prime, there are integers s_i and t_i such that 1 = s_i z_i + t_i n_i. Hence a_i = a_i s_i z_i + a_i t_i n_i, or a_i - a_i t_i n_i = a_i s_i z_i. Dividing both sides by n_i we get Rem(a_i s_i z_i, n_i) = a_i. Let m = a_1 s_1 z_1 + … + a_k s_k z_k + lz, where l is arbitrarily chosen so that m > 0. Then
Rem(m, n_i) = a_i for each i ≤ k. □
Theorem 9.8. For each finite sequence a_1, ..., a_k there is an m and an n such that
Rem(m, 1 + jn) = a_j
for each j ≤ k.
PROOF: By the preceding theorem it is enough to find an n such that the numbers 1 + jn for j ≤ k are relatively prime in pairs and 1 + jn > a_j. Let n = b!, where b = max{a_1, ..., a_k, k}. Surely 1 + jn > a_j for each j ≤ k. Suppose that some prime p divides both 1 + jn and 1 + in where i < j ≤ k. Then p divides (1 + jn) - (1 + in), i.e., p divides (j - i)n. Hence p | j - i or p | n. But if p | j - i, then p | n, since j - i < k ≤ b and n = b!. So in any case, p | n. Along with our assumption that p | 1 + jn, we get p | (1 + jn) - j·n, or p | 1, a contradiction. Hence the 1 + jn are relatively prime in pairs for j ≤ k. □
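Theorem 9.8 is the sequence-coding device (Gödel's β-function idea) used in the proof that follows. A sketch that takes n = b! exactly as in the proof but finds m by brute-force search, with the ordinary remainder standing in for the book's Rem:

```python
# Given a_1, ..., a_k, produce m and n with m mod (1 + j*n) = a_j for each j <= k.
from math import factorial

def code_sequence(a):
    k = len(a)
    n = factorial(max(a + [k]))
    m = 1
    while any(m % (1 + (j + 1) * n) != a[j] for j in range(k)):
        m += 1                         # a solution exists by Theorem 9.8
    return m, n

m, n = code_sequence([3, 1, 2])
print([m % (1 + (j + 1) * n) for j in range(3)])   # [3, 1, 2]
```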
PROOF: Using the lemma we see that the following relation H is recursive:
H(x,y,m,n̄,s) iff (g(n̄) = Rem(x, 1+y))
  ∧ (∀w < m)(Rem(x, 1+(w+1)y) = f(w, n̄, Rem(x, 1+wy)))
  ∧ (Rem(x, 1+my) = s).
Notice that h(m,n̄) = s iff there is an x and a y such that H(x,y,m,n̄,s). The lemma gives the recursiveness of the function
G(m,n̄) = μx(∃y < x)(∃s < x)H(x,y,m,n̄,s).
systems such as the rationals, the reals, the complex numbers, etc., can be
defined in terms of the arithmetic of the natural numbers in such a way
that their properties can be derived from those of the natural numbers.
Unfortunately, in 1931, Gödel proved that this goal was not obtainable and with his incompleteness theorem pointed out startling limitations inherent in the axiomatic method. With Gödel's remarkable theorem as
our goal, our first task is to define a language that is appropriate for an
axiomatic development of number theory. This language, which we call L,
can be viewed as a fragment of English, so formalized that the set of
meaningful expressions is precisely defined, and the meaning of each
assertion is unambiguous.
Since we shall be using the English language to discuss expressions of L,
we make the symbols of L distinct from those of English in order to avoid
confusion.
THE SYMBOLS OF L
Variables: v_1, v_2, v_3, ....
Constant symbols: 1, 2, 3, ....
Equality symbol: ≈.
Symbols for 'and', 'or', and 'not': ∧, ∨, ¬.
Existential and universal quantifier symbols: ∃, ∀.
Symbols for the functions of addition and multiplication: ⊕, ⊗.
Left and right parentheses: [, ].
For example, ], ], ∀, 9, 3, ∀ is an expression. From now on we shall omit the commas between successive terms of expressions, so that this will be written ]]∀93∀.
Definition 10.1b. The set of terms, Trm, is the smallest set X of expressions such that
i. v_i ∈ X for all i ∈ N+,
ii. i ∈ X for all i ∈ N+,
iii. if t_1, t_2 ∈ X, then so are [t_1 ⊕ t_2] and [t_1 ⊗ t_2].
Some examples of terms are v_28, 9, and [[1 ⊕ v_4] ⊗ [4 ⊗ 3]]. To see that the last expression is a term, we first observe by conditions i and ii that 1, v_4, 4, and 3 are terms. Hence by condition iii, [1 ⊕ v_4] and [4 ⊗ 3] are terms; call them t_1 and t_2 respectively. Hence by iii again [t_1 ⊗ t_2] is a term as needed.
The expressions 3⊕4 and 2 ⊗ v_11 are not terms.
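Definition 10.1b can be mirrored by a small data type; in the sketch below the tuple representation is mine (not the book's bracket notation), and the evaluation function anticipates the notion t⟨z⟩ introduced later in this section:

```python
# A term is ('v', i) for a variable, ('c', i) for a constant symbol, or
# ('+', t1, t2) / ('*', t1, t2) standing for [t1 (+) t2] and [t1 (x) t2].
def is_term(t):
    if t[0] in ('v', 'c'):
        return isinstance(t[1], int) and t[1] >= 1
    return t[0] in ('+', '*') and is_term(t[1]) and is_term(t[2])

def value(t, z):
    """The number denoted by t under the assignment z (a dict: variable index -> n)."""
    if t[0] == 'v':
        return z[t[1]]
    if t[0] == 'c':
        return t[1]                    # the constant symbol i denotes i
    left, right = value(t[1], z), value(t[2], z)
    return left + right if t[0] == '+' else left * right

# [[1 (+) v4] (x) [4 (x) 3]] with v4 assigned 5: (1 + 5) * (4 * 3) = 72
t = ('*', ('+', ('c', 1), ('v', 4)), ('*', ('c', 4), ('c', 3)))
print(is_term(t), value(t, {4: 5}))    # True 72
```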
Without the parentheses, we could not distinguish between [[2⊕4] ⊗ 3] and [2⊕[4⊗3]]; i.e., the expression 2⊕4⊗3 could be viewed both as t_1 ⊗ t_2 and as t_1′ ⊕ t_2′, where t_1, t_2, t_1′, t_2′ are respectively 2⊕4, 3, 2,
and 4⊗3. That this kind of ambiguity is avoided by the use of parentheses is the content of the following:
Theorem 10.2 (Unique Readability for Terms). For each term t, exactly one of the following conditions holds:
i. there is a unique i such that t = v_i,
ii. there is a unique i such that t = i,
iii. there is exactly one sequence t_1, t_2, s where t_1, t_2 ∈ Trm and s ∈ {⊕, ⊗} such that t = [t_1 s t_2].
PROOF: We first observe that a proper initial segment of a term has fewer right parentheses than left parentheses, and that a proper end segment of a term has fewer left parentheses than right parentheses. For let Y be the set of all terms for which this is true. Clearly all variables and constant symbols belong to Y (these have no proper segments), and it is easily seen that [t_1 ⊕ t_2] ∈ Y and [t_1 ⊗ t_2] ∈ Y if t_1, t_2 ∈ Y. Hence Y = Trm. It follows that a term has as many right parentheses as left parentheses.
Now suppose that t is a term that is neither a variable nor a constant symbol, say t = [t_1 * t_2], where '*' is either '⊗' or '⊕'. If t can also be written differently as [s_1 # s_2], then s_1 is a proper initial segment of t_1 or t_1 is a proper initial segment of s_1, which is impossible in view of our remarks about parentheses. □
Of course, our intention is to have the symbol 1 denote the number 1,
the symbol 2 denote the number 2, and so on. In the most natural way we
use terms other than the constant symbols to denote numbers also. For
example, [1 ⊕ 1] denotes the number 2, while [[[1 ⊕ 1] ⊕ 1] ⊕ 1], [2 ⊗ 2], and [1 ⊕ [2 ⊕ 1]] are three terms denoting the number 4.
On the other hand, terms having variables have no denotation. For example, [3 ⊕ v_9] has no denotation, since v_9 does not denote a specific number. However, if a particular number is assigned to the variable v_9, then we may think of [3 ⊕ v_9] as denoting a number. For example, if we assign the number 5 to v_9, then [3 ⊕ v_9] denotes 8. We now spell out these
ideas more precisely.
Definition 10.4b. The set of all formulas (call it Fm) is the smallest set X of expressions such that
i. every atomic formula belongs to X,
ii. if φ ∈ X then [¬φ] ∈ X,
iii. if φ ∈ X and ψ ∈ X, then both [φ∧ψ] and [φ∨ψ] belong to X,
iv. if φ ∈ X and v is a variable, then both [∀vφ] and [∃vφ] belong to X.
For example, the following expressions are formulas:
A proof for Theorem 10.5 can be given along the lines of that for Theorem 10.2 (see Exercise 3).
Some of our formulas can be interpreted in the obvious way as assertions about arithmetic. For example, [∀v_1[∃v_2[[[2 ⊗ v_2] ≈ v_1] ∨ [[2 ⊗ v_2] ≈ [v_1 ⊕ 1]]]]] is intended to assert that every number is either divisible by 2 or its successor is. The intended assertion of the formula [∃v_1[∀v_2[∃v_3[[v_2 ⊕ v_3] ≈ v_1]]]] is that there is a largest natural number. On the other hand,
Definition 10.6. We say that the assignment z satisfies the formula φ and write N+ ⊨ φ⟨z⟩ if either
i. for some t_1, t_2 ∈ Trm, φ is [t_1 ≈ t_2] and t_1⟨z⟩ = t_2⟨z⟩,
ii. for some ψ ∈ Fm, φ is [¬ψ] and it is not the case that N+ ⊨ ψ⟨z⟩,
iii. for some ψ_1, ψ_2 ∈ Fm, φ is [ψ_1 ∧ ψ_2] and both N+ ⊨ ψ_1⟨z⟩ and N+ ⊨ ψ_2⟨z⟩,
iii′. for some ψ_1, ψ_2 ∈ Fm, φ is [ψ_1 ∨ ψ_2] and either N+ ⊨ ψ_1⟨z⟩ or N+ ⊨ ψ_2⟨z⟩ (as usual, 'or' here is used in the inclusive sense),
iv. for some ψ ∈ Fm and some variable v, φ is [∀vψ], and for all n, N+ ⊨ ψ⟨z(n/v)⟩,
or
iv′. for some ψ ∈ Fm and some variable v, φ is [∃vψ], and there is at least one n such that
Definition 10.7.
i. If S is a sequence s_1, s_2, ..., s_n and if 1 ≤ i ≤ j ≤ n, then the i,j-subsequence of S is s_i, s_{i+1}, ..., s_j.
ii. ψ is the i,j-subformula of φ if ψ is a formula and the i,j-subsequence of φ. ψ is a subformula of φ if ψ is the i,j-subformula of φ for some i,j.
iii. The symbol s occurs at i in φ if s is the ith term of the sequence φ.
Theorem 10.8.
i. If t ∈ Trm, and z and z′ are assignments such that z(v) = z′(v) for all variables v occurring in t, then
t⟨z⟩ = t⟨z′⟩.
ii. If φ ∈ Fm and z and z′ are assignments such that z(v) = z′(v) for all variables v occurring free in φ, then
PROOF OF i. Let X be the set of all terms t for which Theorem 10.8i is true, i.e., all terms t such that t⟨z⟩ = t⟨z′⟩ whenever z(v) = z′(v) for all variables v occurring in t. Clearly, every constant symbol and every variable belongs to X. Moreover, if t_1 and t_2 belong to X, then [t_1 ⊕ t_2]⟨z⟩ = t_1⟨z⟩ + t_2⟨z⟩ = t_1⟨z′⟩ + t_2⟨z′⟩ = [t_1 ⊕ t_2]⟨z′⟩, and so [t_1 ⊕ t_2] ∈ X. Similarly [t_1 ⊗ t_2] belongs to X if t_1, t_2 ∈ X. Hence, by Definition 10.1b, X = Trm as we needed to show. □
PROOF OF ii. Let X be the set of all formulas for which Theorem 10.8ii holds. Using part i, we see that the atomic formulas are in X.
Suppose that ψ ∈ X and that z and z′ are assignments that agree on the free variables that occur in ψ. Then N+ ⊨ ψ⟨z⟩ if and only if N+ ⊨ ψ⟨z′⟩. Noting that ψ and [¬ψ] have the same free variables, and using Definition 10.6ii, we see that the following statements are equivalent:
N+ ⊨ [¬ψ]⟨z⟩.
It is not the case that N+ ⊨ ψ⟨z⟩.
It is not the case that N+ ⊨ ψ⟨z′⟩.
N+ ⊨ [¬ψ]⟨z′⟩.
Hence, if ψ ∈ X, then [¬ψ] ∈ X.
Now suppose that ψ_1 ∈ X and ψ_2 ∈ X. Clearly, every variable that occurs free in ψ_1 or occurs free in ψ_2 also occurs free in [ψ_1 ∨ ψ_2]. Now suppose that z and z′ are assignments that agree on the variables occurring free in [ψ_1 ∨ ψ_2]. Since ψ_1, ψ_2 ∈ X, we have that N+ ⊨ ψ_1⟨z⟩ if and only if N+ ⊨ ψ_1⟨z′⟩, and N+ ⊨ ψ_2⟨z⟩ if and only if N+ ⊨ ψ_2⟨z′⟩. Hence, using Definition 10.6iii′, we see that the following statements are equivalent:
N+ ⊨ [ψ_1 ∨ ψ_2]⟨z⟩,
N+ ⊨ ψ_1⟨z⟩ or N+ ⊨ ψ_2⟨z⟩,
N+ ⊨ ψ_1⟨z′⟩ or N+ ⊨ ψ_2⟨z′⟩,
N+ ⊨ [ψ_1 ∨ ψ_2]⟨z′⟩.
Hence, if ψ_1, ψ_2 ∈ X, then [ψ_1 ∨ ψ_2] ∈ X. A similar argument shows that if ψ_1, ψ_2 ∈ X, then [ψ_1 ∧ ψ_2] ∈ X.
Now suppose ψ ∈ X, and let z and z′ be assignments that agree on all variables occurring free in [∃vψ]. Suppose that N+ ⊨ [∃vψ]⟨z⟩. By Definition 10.6iv′ this means that for some assignment w agreeing with z on all variables except possibly v, we have N+ ⊨ ψ⟨w⟩. Now let w′(v) = w(v) and w′(u) = z′(u) for all u ≠ v. Since we are assuming that ψ ∈ X, and since a variable that occurs free in ψ either is v or occurs free in [∃vψ], we see that N+ ⊨ ψ⟨w′⟩. Hence, by Definition 10.6iv′ again, N+ ⊨ [∃vψ]⟨z′⟩. Thus if ψ ∈ X, so is [∃vψ]. Similarly, one shows that if ψ ∈ X, then [∀vψ] ∈ X. Hence, by Definition 10.4b, we see that X = Fm, as needed to prove the theorem. □
instead of
[∀v_1[∃v_2[[[v_1 ≈ v_2] ∧ [[v_2 ⊗ v_3] ≈ v_4]] ∧ [∃v_5[[v_3 ⊕ v_5] ≈ v_4]]]]]        (2)
[So according to the context, (1) might be an expression that is not a formula, or an abbreviation for the formula (2).]
When in a mathematical discussion a statement of the form 'If A then
B' or of the form 'A implies B' is made, it is intended to mean that B is
true if A is true, but if A is false then B may be either true or false. So the
statement 'If A then B' has the same meaning as 'Either A is false or B is
true'. It is handy to have a counterpart of such statements in L, so we may
abbreviate formulas of the form '[¬φ]∨ψ' by 'φ→ψ' and read these as 'if φ then ψ' or as 'φ implies ψ'. Similarly, a mathematical statement of the form 'A if and only if B' means that either A and B are both true or both false, i.e., if A then B, and if B then A. The formalized counterpart of this in L is '[φ→ψ]∧[ψ→φ]', which we further abbreviate as 'φ↔ψ' and read 'φ if and only if ψ'.
Definition 10.9. The variable v is free at k for the term τ in the formula φ if
i. v occurs free at k in φ, and
ii. if u is any variable occurring in τ, and φ′ is the result of replacing v at the kth place in φ by u, then u is free at k in φ′.
Definition 11.1.
i. Let R be a k-relation. Say that φ defines R if R = {(n_1, ..., n_k) : N+ ⊨ φ(n_1, ..., n_k)}.
ii. The k-function f is defined by φ if the (k+1)-relation f(n̄) = y is defined by φ.
iii. A function or relation is arithmetical if there is some formula that defines it.
For example, let Div be the formula ∃v_3[[v_1 ⊗ v_3] ≈ v_2]. Then clearly Div defines the 2-relation x | y.
As another example, Prime is defined by the following formula:
∀v_2[Div(v_2, v_1) → [[v_2 ≈ 1] ∨ [v_2 ≈ v_1]]] ∧ [¬[v_1 ≈ 1]].
ii. Arth is closed under composition. For suppose that the k-function f is defined by ψ and that the l-functions g_1, ..., g_k are defined by φ_1, ..., φ_k respectively. Then the composition of f with the g's is defined by
(∃v_{l+2} ⋯ ∃v_{l+k+1})[φ_1(v_1, ..., v_l, v_{l+2}) ∧ ⋯ ∧ φ_k(v_1, ..., v_l, v_{l+k+1}) ∧ ψ(v_{l+2}, ..., v_{l+k+1}, v_{l+1})].
iii. The set of arithmetical functions is closed under the restricted μ-operator. For suppose that g is an arithmetical (k+1)-function with φ defining g. Suppose further that for all k-tuples n̄ there is an m such that
g(n̄, m) = 1.
Then the k-function f whose value at n̄ is μx(g(n̄, x) = 1) has as a defining formula
φ(v_1, ..., v_k, v_{k+1}, 1) ∧ (∀v_{k+2})(v_{k+1} ≤ v_{k+2} ∨ ¬φ(v_1, ..., v_k, v_{k+2}, 1)),
where v_i ≤ v_j abbreviates [v_i ≈ v_j] ∨ ∃v_l[[v_i ⊕ v_l] ≈ v_j].
In view of Definition 9.1, this completes the proof that Arth ⊇ Rec. □
Theorem 11.3. All machine enumerable relations (and hence all computable
relations) are arithmetical.
PROOF: We first show that the computable relations are arithmetical. To
say that the k-relation S is computable means that its representing function R_S is computable. By Theorem 11.2, R_S is arithmetical. Let φ define R_S. Then clearly, S is defined by
φ(v_1, v_2, ..., v_k, 1).
Hence all computable relations are arithmetical. □
Now let S′ be a machine enumerable k-relation. By Theorem 8.9 there is a computable (k+1)-relation S such that for all k-tuples n̄
S′(n_1, ..., n_k) if and only if (∃n_{k+1})S(n_1, ..., n_k, n_{k+1}).
As we have just observed, there is a formula χ that defines S. Hence S′ is defined by
The converse of Theorem 11.3 is false. There are sets that are arithmeti-
cal but not machine enumerable. In fact, if X is any machine enumerable
non-computable set (cf. Example 8.2), then N+ - X is not machine enu-
merable, by Corollary 8.7. However, N+ - X is arithmetical, since by
Theorem 11.3 there is a formula φ that defines X, and so clearly ¬φ defines N+ - X.
Since there are only countably many formulas, it follows that there are
only countably many arithmetical sets. Hence the vast majority of sets are
not arithmetical. In the next section we shall exhibit an important example
of a set that is not arithmetical.
strong as the assumption that S is computable (see Theorem 11.2 and the
last two paragraphs of §2.11).
f(k,l,m) = { 1                                   if m = 1,
           { p_1^{n_1′}·p_2^{n_2′}·…·p_r^{n_r′}   if m = p_1^{n_1}·p_2^{n_2}·…·p_r^{n_r} and the p's are
                                                  distinct primes, where n_i′ = n_i when n_i ≠ k,
                                                  and n_i′ = l when n_i = k.
Now we need a computable function that will pick out the constant with
the largest index occurring in a sequence:
ii. S is correct for N+ if for all L-assertions σ, ⊢_S σ implies N+ ⊨ σ.
iii. S is complete for N+ if for all L-assertions σ, N+ ⊨ σ implies ⊢_S σ.
EXAMPLE 13.3. S_1 (described above) is complete and correct for N+, but not effectively given.
EXAMPLE 13.4. Let A_2 = {1 ≈ 1} and R_2 = {(1 ≈ 1, σ) : σ an L-assertion}. Let S_2 = S(A_2, R_2). Then S_2 is effectively given and complete for N+, but not correct for N+.
EXAMPLE 13.6. Let S_4 = S(A_4, R_4), where A_4 is the set of all atomic assertions that are true in N+, and R_4 is the set of all sequences (σ_1, ..., σ_k, [σ_1 ∧ ... ∧ σ_k] ∨ ρ) where k ∈ N+ and σ_1, ..., σ_k, ρ are L-assertions. Then S_4 is effectively given and correct for N+, but is not complete for N+. Indeed, ¬[1 ≈ 2] is true in N+ but is not an S_4-theorem, for ¬[1 ≈ 2] is clearly not an axiom or a conclusion of an S_4-rule.
We have just seen that there is an effective test for membership in X. Also there is an effective test for membership in the set of axioms of S. Applying these tests for m = 1, 2, ..., n enables us to decide if (θ_1, ..., θ_m) is an S-proof or not. □
PROOF OF ii. If there are no axioms in the system S, then the set of all S-theorems is empty and there is nothing to prove. So suppose that a is an axiom of S. By the lemma, there is an effective enumeration of the finite sequences of L-assertions. Let α_1, α_2, ... be such an enumeration with α_i = (a_{i,1}, ..., a_{i,k_i}, a_{i,k_i+1}). We enumerate the theorems of S by letting θ_n be a if α_n is not a proof, and letting θ_n = a_{n,k_n+1} otherwise. Clearly, this is an effective enumeration. □
Theorem 13.8. No axiom system S can satisfy all three of the following conditions:
i. S is effectively given,
ii. S is correct for N+,
iii. S is complete for N+.
PROOF: Suppose S is effectively given, correct for N+, and complete for N+. By the preceding theorems, there is an effective way of enumerating {σ : ⊢_S σ}. Also {σ : ⊢_S σ} = {σ : N+ ⊨ σ}. But this contradicts the fact that no effective enumeration exists for {σ : N+ ⊨ σ}, as shown in Theorem 12.1. □
As one expects, the last two theorems can be placed in the context of
machine enumerability. This presupposes a numbering # of the L-asser-
tions.
i. {#a: aEte},
ii. {m:l(Prm(i»#O"/: (al, ... ,ak,ak+I)E~}.
PROOF: Assume the hypotheses of the theorem and suppose ⊢_S σ. Then by the correctness of S, N+ ⊨ σ. Hence for some k, N+ ⊨ D(#φ_σ, k) and N+ ⊨ ψ_S(k). Since D defines the diagonal function, k must be #¬φ_σ(#φ_σ), i.e., k = #¬σ. Hence N+ ⊨ ψ_S(#¬σ). Since ψ_S defines {ρ : ⊢_S ρ}, ⊢_S ¬σ. This and our assumption ⊢_S σ contradict the correctness of S. Thus ⊢_S σ is impossible.
On the other hand, suppose that ⊢_S ¬σ. Since ψ_S defines {ρ : ⊢_S ρ}, we have N+ ⊨ ψ_S(#¬σ). We also have N+ ⊨ D(#φ_σ, #¬σ). Hence N+ ⊨ ∃v_1[D(#φ_σ, v_1) ∧ ψ_S(v_1)], i.e., N+ ⊨ σ. But S is correct, so that our assumption ⊢_S ¬σ implies N+ ⊨ ¬σ, a contradiction. Therefore, ¬σ is not a theorem of S. Hence, with what we have above, σ is undecidable with respect to S. □
Friedberg and Mucnik, much was learned about this ordering. For exam-
ple, any countable partially ordered structure can be embedded in it.
Notions of computability have been developed for objects of higher
type, such as functions that map functions to functions.
Analogs of computability have been proposed by Kreisel for functions
defined on some ordinal a with range in a. This generalization grew into a
rich theory developed by Barwise and others.
Formal language theory is another direction that has received consider-
able attention. An alphabet is a set Σ of symbols. Σ* is the set of all finite sequences of terms in Σ. Σ* is called the set of words on Σ. A language is a subset of some Σ*. Languages can be specified by syntactic conditions or by a process. As an example of the latter kind, suppose we fix a word w ∈ Σ* and a function f : Σ → Σ*. Now define a function F : Σ* → Σ* as follows. If σ ∈ Σ*, say σ = a_1 a_2 ... a_n, then Fσ = f(a_1) f(a_2) ... f(a_n), i.e., each a_i is replaced by the word f(a_i) and the resulting concatenation of symbols is Fσ. Let L be the orbit of F on w, that is, L = {w, Fw, F²w, ...}. Then L is a language, and languages obtained in this way are simple examples of Lindenmayer languages. Lindenmayer is a botanist who first proposed the study of these languages, and the cause was subsequently championed by Rozenberg. These languages have been used to model simple morphogenic processes in biology and are of interest to computer scientists as examples of parallel programming.
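The orbit construction just described is easy to carry out; a sketch in which the alphabet, w, and f are chosen only for illustration (Lindenmayer's algae example, an assumption and not from the text):

```python
# F replaces every letter a_i of a word by f(a_i); the language is the orbit
# L = {w, Fw, FFw, ...}.
def F(word, f):
    return ''.join(f[a] for a in word)

def orbit(w, f, steps):
    words = [w]
    for _ in range(steps):
        words.append(F(words[-1], f))
    return words

print(orbit('a', {'a': 'ab', 'b': 'a'}, 4))
# ['a', 'ab', 'aba', 'abaab', 'abaababa']
```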
Many other interesting classes of languages have been described, and
the relationships among them are an active current area of research.
PART III
An Introduction to Model Theory
3.1 Introduction
For the remainder of the text we turn our attention to that branch of logic
called model theory. Here we consider formal languages with enough
expressive power to formulate a large class of notions that arise in many
diverse areas of mathematics. Within our idealized language we shall be
able to describe different kinds of orderings, groups, rings, fields, and other
commonly studied mathematical notions.
If an assertion 0 is a true statement about a mathematical structure W,
then Wis said to be a model of o. The main concern of model theory is the
relation between assertions in some formal language and their models. For
example, we shall describe a language strong enough to capture many of
the important properties of the real number field but weak enough so that
the sentences true in this structure also have a non-standard model, a
model in which there are infinitesimally small numbers and infinitely large
numbers. Within such non-standard models one can justify a development
of the calculus along lines close to that of Newton's original conception.
For example, lim_{n→∞} a_n = a would mean that |a_n - a| is infinitesimal whenever n is infinite.
Where model theory is developed within the context of algebra, the
latter subject undergoes considerable unification and generalization, while
model theory benefits from the examples, methods, and problems of
algebra. This interaction has been particularly fruitful in the areas of
Boolean algebra and the theory of groups.
The interaction of set theory and model theory has given tremendous
impetus to both, and each has contributed techniques and theorems to the
other that were used to solve famous problems of long standing.
In fact, over the past several decades, the connections between model
theory, computable function theory, set theory, and infinitary combinator-
ics have become more and more closely knit. There are areas in which the
symbiosis is so strong that any division between them is bound to be
artificial. In the last section we shall briefly indicate some of the more
recent directions taken by model theory and hint at the growing interrela-
tion between the various branches of logic.
THE SYMBOLS OF L
Variables: VO,VI,V2'··· •
For each ordinal ex, a constant symbol Ca.
Equality symbol: RJ.
Symbols for 'and', 'or', and 'not': 1\, V, -,.
Existential and universal quantifier symbols: 3, 'tI.
Left and right parentheses: [, ].
For each nEN+ and each ordinal ex, an n-function symbolfn,a.
For each n EN+ and each ordinal ex, an n-relation symbol Rn a.
Definition 2.la. A set whose elements are either constant symbols, function
symbols, or relation symbols is called a type. An expression is a finite
sequence of symbols. If cP is an expression then the type of cp, written T( cp),
is the set of constant symbols, function symbols, and relation symbols
occurring in cpo If s is a type, then cp is of type s if T( cp) k s.
Definition 2.lb. Let s be a type. The set of lerms of type s, Trms ' is the
smallest set X of expressions such that
i. v;EX for all fEW,
ii. ca EX for all ca Es,
iii. if II' ... ,In EX and fn,a Es, thenfn,all' ... ' tn EX.
3.2 The First Order Predicate Calculus 137
As before (Theorem 10.2 in part II), unique readability for terms is easy
but tedious to prove, so we leave both the statement and the proof as an
exercise.
Definition 2.1d. Let s be a type. The set Fms ' the formulas of type s, is the
smallest set X such that
i. every atomic formula of type s belongs to X,
ii. if q>EX then [-,q>]EX,
iii. if q>EX and t/lEX, then [q>Vt/l] EX and [q>At/I] EX,
iv. if q>EX and v is a variable, then [3Vq>] EX and [VVq>] EX.
If l: is a set of formulas, then '7'l:, the type of l:, is U {'7'(0):0 El:}.
is a formula. By Definition 2.1b, u,fv, and c are terms. Hence [Rufv] and
[Jv~c] are formulas by Definition 2.1di, and so by Definition 2.1diii we
get that [[ Rufv] V [fv~c]] is a formula. By Definition 2.ldiii used twice
we see that (*) is a formula.
3.3 Structures
In Part I we defined a structure to be an ordered pair (A,e) where A *0
and e is a binary relation on A. We now extend the notion of structure so
as to encompass a great variety of constructs that are of interest to
mathematicians.
EXAMPLE 3.7. Structures of the empty type are allowed, as for example (N).
We also allow two different symbols to have the same denotation, as in the
structure (A, c~, c~) where A = N, c~ = c~ = 3. Since structures are func-
tions, ~=~ iff Dom~=Dom~ and for all x in the common domain,
~(x)=~(x). Thus if ~=(A,fro) and ~=(B.Jf\), then ~-=I=~ even if
• !8 ' ,
A=B,iI,o=iI,I'
140 III An Introduction to Model Theory
i. s2£-s~,
ii. 12£1 c I~I,
... l!{-!Bf all c.. EQ'(
111. C.. - c.. or Su,
iv. J,,~ .. ii-.t:: .. ii for every In, .. Es2£ and all iiE n l2£l, and
v. R,,~..ii iff R,,~.. ii for every R", .. Es2£ and all iiE n l2£l.
If ~ has a substructure with universe X, then that substructure will be
called the restriction of ~ to X and denoted by ~IX.
ExAMPLE Let ~ - (N+, +, ., < , 1,2) = (N+ ,I~o,J~ I' Ri!Ot c~, c~, 2£-
(N+,·, <,2)-(N+,J~I,R~Otc~), and s= {f2,I,R 2•0 ,cd. Then ~ts-2£ and
~=(2£, +, 1).
Definition 3.10. Let 72£ C 7~. A function g on 12£1 into I~I is an ilifection if
i. g is 1-1,
ii. g( c9l) - c!B for all constant symbols c E 72£,
iii. R~al'''''a" iff R!Bg(al), ... ,g(an ) for all a l ,. .. ,an EI2l1 and all n-relation
symbols R E 72£,
iv. gU'fI.al, ... ,a,,)-I!Bg(al), ... ,g(a,,) for all a l , ... ,an EI2£1 and all n-func-
tion symbols I E 72£.
If in addition g is onto I~I, then g is an isomorphism of 2£ onto ~, in which
case we write 2£~g~. Or we may write 2£~~ if explicit mention of the
isomorphism is unnecessary.
3.3 Structures 141
Notice that this definition coincides completely with the use of 'injec-
tion' and 'isomorphism' in algebra. Several other notions from abstruct
algebra will be generalized in the problems and in the sections that follow.
2. Show that •==' is an equivalence relation by showing that for all structures
~,!B,(i,
(a) ~==~,
(b) ~==!B implies !B~~, and
(c) ~:=!B and !B~~ implies ~:=~.
3. Give examples of structures ~, ~', !B, and !B' such that
~==!B' C;;;;!Ba;;~' C;;;;~
but ~a!B.
4. Suppose that ~I C;; ; ~i + \ for i =0, 1,2, .... Let ~ be that structure of type .,.~o such
that
I~I= U I~il,
lEN
c"=c"j foreachcE"'~.
Clearly each ~i is a substructure of~. Find such a substructure chain ~\ C;;;;~2C;;;;
~3 c;;;; ••• such that each ~I is isomorphic to (N+, <) but ~ is not.
5. LetA be the set of all integral powers of 3, A = {Jl:jEI}. Show that (A,·)==
(/, +).
6. A function h mapping I~I into I!BI is a homomorphism if .,.~ =.,.!B and whenever
c,R,jET~, then
i. hc" .. c'i8,
ii. hj"(a\, ... ,0,,)= j'i8(ha\,. .. ,ha,,),
iii. R"a\, ... ,a" iff R!8ha\, ... ,ha".
Let k be a positive integer and let + k and ·k be addition and multiplication
modulo k. Let h(n)-=nmoda. Show that h is a homomorphism from (J, +,.)
onto ({O,l, ... ,k}, +k, ·k).
7. An equivalence relation C on I~I is a congruence relation on ~ if whenever
al C bi for i < n and R is an n-relation in .,.~ and j is an n-function in .,.~, then
Ra\, ... ,a" iff Rb)o ... ,b"
and
ja\, ... ,a" Cjb)o ... ,b".
Let h be a homomorphism on ~ to !B. Let C={(a,b):a,bE~, h(a)-=h(b)}.
Show that C is a congruence relation.
142 III An Introduction to Model Theory
8. L-et C be a congruence relation on ~. For each a E I~I we let IJ= {b E 1~I:b C a}.
The quotient structure ~ modulo C is that structure lB of type 'T~ with
IlBl= {a :aEI~I},
c!B =?i' for each c E 'T~,
f !B""""
al , ... , an - f al, ... ,an for eachfE'T~,
r-. _ ' !II '
Letting Vbl be the set of variables, we can use the notation of Part I and
write' z E Vb\I~I' when z is an assignment to ~.
Definition 4.2. Let t be a term of type k 'T~ and let z E Vb\I~I. We define
t~<z> (by induction on the length of t) as follows:
i. v~<z> =z(vn),
ii. c~<z >= c~,
iii. Un,a(t\, ... , tn»~<z> = f~a(t~<z>, ... , t~<z».
Definition 4.3. Let cp be a formula of type ~ and let z E Vb\I~I. We say that
z satisfies the formula cp in ~, written ~FCP<Z>, if either
i. cp=[t\R:::t2] and t~<z>R:::t~<z>,
i'. cp= [Rn,at\, ... ,tn] and Rn~at\<z>, ... , tn<z>,
ii. cp=[ -lip] and it is not the case that ~FI/I<Z>,
iii. cp = [1/1\ A 1/12] and both ~FI/I\<z> and ~FI/I2<Z>,
iii'. CP=[IPI V 1/12] and either ~FI/I\<Z> or ~FI/I2<Z>,
iv. cp=[3vn l/l] and for some aEI~I, ~FI/I<Z vn or
a
»,
iv'. cP = ['Vvnl/l] and for all z E I~I, ~FI/I<Z( ~n ».
3.4 Satisfaction and Truth 143
1beorem 4.5.
i. Let t be a term, and let z and z' be assignments to 2£ such that z(v)-z'(v)
for all variables v not occurring in t. Then t'il(z) - t'-(z').
ii. Let cp be a formula, and z and z' assignments to 2£ such that z(v)-z'(v)
for all variables v occurring free in cpo Then 2£l=cp(z) iff 2£l=cp(z').
PROOF: Exactly like that for Theorem 1O.8i in Part II. o
The theorem implies that if 0 is an assertion, i.e., a formula without free
variables, then 2£l=o(z) for all assignments z to 2£, or no assignment
satisfies 0 in 2£. Thus we write 2£1=0 if there is an assignment that satisfies 0
in 2£, and we say that 0 is true in 2£ or that 2£ satisfies O. We say that 0 is
valid and write 1=0 if for all 2£ of type ~'To, 2£1=0. For example, Vx[RxV
-, Rx] is valid.
144 III An Introduction to Model Theory
Definition 4.6.
i. The theory of ~, abbreviated Th~, is {O:~FO}. If % is a class of
structures, then Th%={o:oETh~ for all ~E%}. ~ is elementarily
equivalent to~, in symbols ~=~, if Th~=Th~.
ii. If ~ is a set of sentences, then Mod~, the class of models of ~, is the
class of all ~ such that ~FO for all a E~. We write ~ EModo instead of
~ E Mod{ a}. A class of structures % is an elementary class if % =
Mod~ for some set of sentences~.
ExAMPLE 4.12. Rings are structures ~=(A, *~, oVl) where (A, *!ll) is an
Abelian group, i.e., ~ satisfies the assertions in Example 4.11, and also ~
satisfies
VVOV tV2[ VOO [v t oV2]R:![ voovt] ov2],
VVOVtV2[ [voo [Vt*V2]R:![ VOOVt] * [ VOov2]]
!\[ [V t *V2] OVoR:![ Vt OVo] * [V2 OVo]] J-
Of course each alternative formulation of a group as an elementary
class yields an alternative formulation of a ring. For example, we can view
rings as structures ~=(A, *Vl, -t~,iVl, o~) where (A, *Vl, -t!ll,i~) is an
Abelian group according to our second formulation of a group, and ~
satisfies the above two assertions.
iii. Pairing:
iv. Union:
v. Power set:
Vvo3vI VV2[ v2evI~Vv3[ v3ev2~v3evO]]·
vi. Replacement schema: For each formula cp of L with free variables
Va. VI' ... , Vn the following is an axiom:
Vv2··· vn[VV0 3Vn+1 Vv l [ CP~VI~Vn+I]]
~Vvn+13vn+2 Vv l [ VI EVn+2~3vo[ voevn+tI\CP]].
vii. Infinity:
3vo[0evo/\ VVI[ vlevo~vl U {vdevo]]·
Here 0evo is an abbreviation for 3v I VV2[ --, V2evI/\VI evo], and VI U
{VI} Evo is an abbreviation for 3v2[VV 3[V3 EV2~V3~VI VV3evI]/\
v 2evO]·
viii. Regularity:
VVO[VVI[ vI.t'vo]V3vl[ VI evo/\ VV2[ v2evo~v#vd]].
Our last two examples are number theories. The second is an attempt to
realize the Paeno axioms for arithmetic within L. In contrast to the second
example, the first involves only a single assertion, and it is this property
3.4 Satisfaction and Truth 147
I\[O~Svo]
1\'fIun[ cp(Uo> ... ' un-I, u,,)~cp( Uo> ... ,Un-I' SUn)]]
~'fIun[ cp( uo,···, Un-I' un)]],
DeflnidOD 4.16. Say that an assertion is simple if it has one of the following
forms: dl~d2' dl~d2' Rdl.·.dn, -,Rdl··.dn, jdl, ... ,dn-d, jdl ... dn~d,
where R is a relation symbol, j a function symbol, and the d's constant
symbols. If ~ is an expansion of ~ such that for all a E I~I there is acE 'T~
with c!B = a, then the set of all simple sentences true in ~ is called a
diagram of ~. Even though a structure has many diagrams, there will be no
harm in speaking of 'the diagram of ~' and writing 6j) ~.
148 III An Introduction to Model Theory
PROOF: Let ~~gm'km. Let h be a 1-1 function on I~I such that h(b)=
g-I(b) if bERngg, and h(b)EtI~1 otherwise. Now define ~ as follows:
1~I=Rngh,
c~ = c 2I
for everyc E 7~,
l
R~dl, ... ,dn iff R!8h- (d l ), ... ,h- l(dn) for all R E7~,
We shall use this theorem, often without explicit mention, to excuse the
writing of ~ C~ when we mean that ~ is isomorphic to a substructure of
~.
1heorem 4.19. If~o;;;;;.~, then ~=~. In fact, if~o;;;;;.g~, then for all formulas
q> and all ZEVbII~I, ~Fq>(Z) iff~Fq>(goZ).
PROOF: Let g be an isomorphism from ~ to ~. We first show that for all
z E VbII~1 and all terms t,
g(t'«(z» = l'\goZ). (*)
For t variable or a constant symbol we have
»
g( v':(z») =g(z(vn =v;'(g oz),
g(c~(z) )=gc~=c:(goz).
Now suppose (*) is true for all terms tl, ... ,tn, and let t beftl, ... ,tn. Then
g((jt l, ... ,tS11.(z) )=g(J'«tr(z), ... ,t:(z»)
= f!8g(tr(z) ), ... ,g(t:(z»)
[ since g is an isomorphism]
=f!8t~(g 0 z), ... ,t;'(g 0 z)
[ since (*) holds for t 1' ••. ' tn ]
= (jt I' ••• , tn)!8(g 0 z) [by Definition 4.2].
This proves (*).
We now show by induction on formulas that for all zEVbII~1 and all
formulas q>
~Fq>(Z) iff ~Fq>(goZ). (**)
We first prove (**) for q> atomic:
i. ~F[tl~t2Kz) iff tr(z) = t~(z)
iff g(tr(z»=g(t~(z» (since g is an isomorphism)
iff t~(goz) = t~(goz) [by (*)] iff ~F[tI~t2Kgoz).
ii. ~FRtl, ... ,tn(z) iff R'#!tr(z), ... ,t:(z)
iff R!8g(tr(z», ... ,g(t:(z» (since g is an isomorphism)
iff R!8t~(goz), ... ,t;'(goz) [by (*)]
iff ~F[ RtI' ... , tnK g 0 z) (by Definition 4.3).
Next we show that if (**) is true for q>1 and q>2 and all z, then it holds for
-, q>1' q>.I\ q>2' and 3Vq>1:
iii. ~F[ -,q>.1(z) iff not ~Fq>I(Z) iff not ~Fq>I(goZ) (by the induction
assumption) iff ~F[ -, q>.1( g 0 z).
150 III An Introduction to Model Theory
~FCPI (z(~))
iff for some a E I~I,
~FCPI ( go (z( ~))) [since we assume (**) for cpd
iff for some a E I~I,
iff ~F3vcpl(goZ>.
In the last equivalence, the implication from left to right is immediate
from the definition of satisfaction, while the implication from right to
left uses both the definition of satisfaction and the fact that g is onto.
This completes the proof of the second clause of the theorem. The first
clause follows immediately from the second. D
The converse of this theorem is false, as we shall see in the next section.
Definition S.l.
i. If cp is a formula with free variables UO,,,,,un- I' then Vuo, ... ,Un_ICP is a
universal closure of cpo
ii. Say that cp is valid, and write Fcp if it has a valid universal closure.
(Recall that an assertion a is valid if ~Fa for all ~ of type :l Ta.)
Of course, if cp has a valid universal closure, then any universal closure
of cp is valid.
iii. We say that cp and t/I are equivalent if FCP~t/I.
Unwinding the definitions, we see that FCP~t/I iff for all ~ of type
1'(cp~t/I) and for all z E VbII~I,
~FCP<Z>iff ~Ft/I<Z>.
From this it is clear that logical equivalence is an equivalence relation on
the class of formulas; i.e., for all formulas cp,t/I,~:
FCP~Cp,
FCP~t/I implies Ft/I~cp,
FCP~t/I and Ft/I~ implies Fcp~.
Thus all formulas have the property (*), which proves the theorem. 0
Lemma 5.3. Let s be an expression and u a variable that does not occur in s.
Let s' be the result of replacing every occurrence in s of the variable v by u.
Then for all ~ of type ~ '1'S and all z E Vb/I~I:
a. if s is a term, then
b. If s is a formula, then
2lFCP( z( ~)).
There is some a E 1211 such that
Lemma 5.4. Let u be a variable that does not occur in cP, and let cp' be the
result of replacing each bound occurrence of v by u. Then Fqx-+CP'.
PROOF: The proof is by induction on formulas. We consider only the case
where cP has the form 3vjt{l under the assumption that the lemma holds for
t{I, leaving the remaining easier cases as an exercise.
If Vj=FV, then cP' is 3vjt{l', where t{I' is the result of replacing each bound
occurrence of v by u in cpo By assumption, Ft{I~t{I'. Hence by Lemma 5.3,
Fcp~cp'·
If Vj =v, then cp' is 3 ut{I", where t{I" is the result of replacing every
occurrence of v by u in t{I. By the preceding lemma,
2lFt{I( z( ~) ) iff 2l =Ft{I" ( z( ~) ).
Hence, 2lFCP(Z) iff for some aEI2lI,
2lFt{I(Z(~))
iff
for some a E 12l1, 2lFt{I" ( z( ~) )
iff 2lFt{I'. Thus Fcp~cp'. []
Lemma 5.5.
i. F -,3vcp~Vv.., cpo
ii. F.., V vcp~3v .., cpo
iii. If u is not free in t{I, then F[t{I/\3ucp]~3u[t{I/\cp] and [t{I/\ Vucp]~Vu[t{I/\
cpl·
iv. If u is not free in t{I, then F[t{lV3ucp]~3u[t{lVcp] and F[t{lV3ucp]~3u[t{I
Vcp].
PROOF: The first two clauses follow easily from Definition 4.3. To prove
clause iii suppose 2lF[t{I/\3ucp](z). Then 2lFt{I(Z), and for some aEI2lI,
2lFCP( z(~)).
154 III An Introduction to Model Theory
By Theorem 4.5,
~F [ I/; /\ cp ] ( z( ~) ),
and so ~F3u[l/;/\cp1<z>. This shows that F[I/;/\3ucp]-+3u[l/;/\cp]. Since
F3u[l/;/\cp]-+[1/;/\3ucp] is obvious by Definition 4.3, we have iii. The
remaining half of clause iii has a similar proof, as does clause iv. D
Definition 5.6.
i. A formula is open if no quantifier occurs in it.
ii. A prenex normal form formula is a formula Qcp where cp is open and Q is
a sequence Qouo'" Qn-\un-\ with each QjE{V,3} and each U j a
variable. Q is called the prefix of Qcp, and cp the matrix of Qcp.
V('Pk,I/\'Pk,2/\" ·/\'Pk,n.),
where each 'P;J is an atomic formula or the negation of an atomic formula.
A conjunctive normal form formula is defined analogously except that the
symbols /\ and V are interchanged.
V('Pk,I/\'Pk,2/\" '/\'Pk,n),
where 'P;J E { I/;j> -, I/;j}' The analogous statement for conjunctive normal forms
is also true.
PROOF: Let 'P; (i <:'k) be an enumeration of all those formulas of the form
such that
We claim that
F'PI V··· V'Pk~l/;·
If not, then there is some ~ and z E VbII~1 such that
~F -, ['PI V .. · V'Pk]<Z> and ~FI/;<Z>.
>
Let 'P. = 'Pr /\ 'Pt /\ ... /\ 'P: , where 'P; is I/;j if ~F 1/;/ Z and is -, '"j
otherwise. Then ~F'P·<Z>. Moreover, since I/; is open, the truth value of '"
in any model with any assignment depends only on the truth values of the
1/;;. Hence F'P· ~I/;. But then 'P. is one of the 'P;'s and ~F -, ['PI V ... V'Pk]-
a contradiction. Hence F'PI V ... V 'Pk~l/;· D
The last normal form we shall discuss is one of the most useful. A
formula in this normal form is universal.
Definition 5.10. Say that 'v'UI, ... ,Un'P(J(ul, ... ,u,:',w;, ... ,wk» is a one step
Skolemization of", if 1/;='v'UI, ... ,Un3u'P, andf is a function symbol not
occurring in 'P, and WI"'" wk is a list of the free variables of 1/;. If no
variable is free in 1/;, and I/; has the form 3u'P, and c is a constant symbol
not occurring in 1/;, then 'P(~) is a one step Skolemization.
156 III An Introduction to Model Theory
For example, VVI [[ VI <f( VI' V3)] 1\ [f( VI' V3) <V3]] is a one step Skole-
mization of Vvl3vl[ VI <V2]1\[ V2 <V3]]·
Notice that iterating one step Skolemizations will lead to a universal
formula provided that we begin with a prenex normal form formula.
Definition 5.11. Let cP be a prenex normal form of 1/1, and let CPI,CP2' ... 'CPn
be a sequence of formulas such that
i. CPI = 1/1,
ii. CPn is universal,
iii. CPH I is a one step Skolemization of cPj for each i = 1, ... , n - 1.
Then CPn is said to be a Skolem normal form of 1/1, which we shall often
abbreviate CPn E Sk( 1/1 ).
Theorem 5.12. Let cP E Sk( 1/1). Then for every ~ there is an expansion msuch
that for all Z EVbII~1
iff mFcp<Z).
~F1/I<Z)
Furthermore, if Z EVbII~1 and ~FCP<Z), then ~F1/I<Z).
PROOF: Clearly it is enough to show that the conclusion of the theorem
holds when cP is a one step Skolemization of 1/1. Let WI' W2' ••• ' W k be the
variables that occur free in 1/1. Then cP has the form VUI, ... ,un~(f(ul, ... ,
u:, WI' ••• ' wk ), wheref does not occur in 1/1. For each aEn+kl~llet
uI,···,un,wI,···,wk'u )}
Xii= { d:~F~ ( .
a l ,··· ,an,an+I'··· ,an+k,d
Let a* be some fixed element of I~I. Now we let m=(~,j~, where
f~aEXii if Xii:;60, andf~a=a* otherwise. (Notice that the existence of
such an f requires the axiom of choice; see Exercise 10.) Oearly, for any
assignment Z to ~
~F1/I<Z) iff mFcp<Z).
The second part of the theorem is obvious. o
Definition 5.13. Let Vu l , ... , un~ be a Skolem normal form for satisfiability
of .., 1/1, where ~ is quantifier free. Then 3UI, ... ,Un (..,~ is a Skolem normal
form for validity of 1/1.
PROOF: Let 1/1. be 3u, ... , un ( -, 0 with ~ quantifier free. The following
statements are equivalent:
or
where each u; is either a constant symbol or a variable. Show that every atomic
formula is equivalent to an existential formula of the form
of the same type and with the same free variables where each cp; is simple.
158 III An Introduction to Model Theory
ExAMPLE 6.3. A model of the reals with infinite numbers and infinitesi-
mals: Let ~ be the ordered field of real numbers (R, +,., <,r)'ER where
every real is represented by a constant, say r = c~, and +,', < denote
+,., < respectively. Let l::=Th~U {c,<c:rER}, where c is a constant
symbol different from the c:s. We claim that every finite subset of l:: has a
model. For if l::' <;;;;l:: and l::' is finite, then only finitely many constant
symbols c, occur in l::', say c"c" ... ,c,. Let r*=I+max{r l,r2 , ••• ,rn).
Take ~ to be (~, c!ll), where c!ll':" r*2. Clearly ~ E Mod l::', thus proving the
claim. Hence by compactness, l:: has a model @:.
By Theorems 4.17 and 4.18 we can assume that ~ <;; ; @: t T~, so that
c':= r. Also ~=@:tT~. However, ~~@:, and in fact cl! is an infinite
number in @: in the sense that c >c, for all r ER. Since @: is a field, every
element x in I@:I other than 0 has a multiplicative inverse, which we call
llx. Since O<r<c for all rER, it follows that
1 1
0<-<--
c n+ 1
EXAMPLE 6.4. Let f be a set of sentences such that for every nEw there is
an ~ E Mod f whose universe has cardinality ~ n. Then there is a ~ E
Modf whose universe is infinite. For let l::=fu{dm~dn:m<n<w},
where the d's are distinct constant symbols none of which occur in f. Let
l::' be a finite subset of l:: with n* = max {n:dn occurs in l::'}. By assumption
there is an ~EModf with cl~1 ~n*. Let ao, ... ,an - I be distinct elements in
I~I. Expand ~ to ~=(~,diJ3,d~, ... ,dn!lt_I)' where dj!ll=aj. Then clearly
~EModl::'. Hence, every finite subset of l:: has a model, and so by
compactness l:: has a model. But every model @: of l:: is infinite, since
@:Fdm~dn for all m<n<w. Moreover, if@:EModl::, then @:tTfEModf as
needed.
Hence we see that the class of all finite groups is not an elementary
class, and the same is true of the class of all finite rings and the class of all
finite fields.
ExAMPLE 6.6. Let ZF be the set of axioms for Zermelo-Fraenkel set theory
(given in Section 1.11). Included in ZF is the axiom of regularity. But in
spite of regularity, if ZF has a model, then ZF has a model 2£ = (A, e lll) in
which there are elements c~,c~, ... such that c:+1elllc: for n=O,I,2, ....
Indeed, given n E", and a model mof ZF, mcan be expanded to a model
of ZF U {cn e cn _ I' cn-I e c 1 e co}. Hence every finite subset of ZF U
{cn+ 1ecn:n E",} has a model. By compactness, ZFu {cn+ 1ecn:nE",} has a
model, say 2£. The regularity axiom is true in 2£, yet 2£l=cn+ 1ecn for all
n E "'. Of course this means that {c:: n E "'} is not a set in 2£, i.e., there is
no aEI2£1 such that for all bEI2£I, bellla iff b=c: for some nE",.
ExAMPLE 6.7 (A countable model of the reals). Let 2£ be the field of reals,
2£ = (R, +, .). Th 2£ is countable and so has a countable model. Indeed, if m
is any expansion of 2£ with countable type, then Thm is countable and so
has a countable model.
then we get a language that is countably compact in the sense that any
countable finitely satisfiable set of assertions is satisfiable.
8. In Definition 2.ld change clause iii to
iii*. if a"; wand {qJi:i <a} ~X then 1\ {qJi:i <a} EX and V {qJi:i <a} EX.
In Definition 2.1d change clause iii to
iii*. qJ= 1\ {qJi:i<a} and 2ll=qJi for each i<a, orqJ= V {qJi:i<a} and 2ll=qJi for
some i<a.
(a) Show that there is an assertion (1 in this new language such that 2l1=(1 and
'T2l= {<) iff 2l=-(N, <).
(b) Conclude that this new language is not compact.
9. This problem is for those who have some knowledge of point set topology. The
problem is to show that compactness in the sense of Theorem 6.1 is equivalent
to the compactness of some topological space.
Let s be a similarity type, and let T be the set of all structures of type s and
of cardinality ..;cs+w. For each 2lET we let 2l*={~ET:2l=~}. Now let
T*= {2l*:2l E T}. For each ~ of type s let Fl;= {2l*:2l EMod~}.
(a) Show that '5={Fl;:'T~~S} is a base for closed sets for a topology on T*,
i.e., show that 0 and '!* E'5, and if K ~ T*, then n
K E T*.
(b) '5 is a T2 topology, i.e., if 2l*=fo~*, then there is some X, Y such that
X, f E '5 and 2l* E X, ~* E Y, and X n Y= 0.
(c) Derive the compactness of '5 from Theorem 6.1 and conversely. (To say
that '5 is compact means that if '5'~'5 and if n
K=fo0 whenever K is a
finite subset of K, then n
'5' =fo0.)
162 III An Introduction to Model Theory
Definition 7.1. Say that ~ is finitely satisfiable if every finite subset of ~ has
a model.
Lemma 7.2. Let ° be of type 'T~. If ~ is finitely satisfiable, then either
~ U { o} is finitely satisfiable or ~ U { -,o} is finitely satisfiable.
In the next lemma we need a well ordering of the set of all assertions of
some fixed type. At first glance this would seem to require the use of the
well ordering principle or some other version of the axiom of choice. But
this can be avoided by identifying the symbols with ordinals (as suggested
in Section 2.10) and then well-ordering the class of expressions by defining
rorl .. .rn_l<so ... sm_1 if either n<m or n=m and 'jEsj , where} is the
least k such that reFsk. Hence, using Theorem 10.9 of part I, we see that
any set of expressions can be indexed by an ordinal. The axiom of choice
appears implicitly in Lemmas 7.6 and 7.7, but again can be avoided using
these devices.
By induction on a and Lemma 7.2, it is easy to see that each l':.. satisfies
conditions i, ii, and iii of the theorem. Let f=l':,e. We need only observe
that f is complete: Let 'TO~'Tl':. Then 0=0.. for some a<{3. By definition
of l':.. +I' o El':.. + I or ...,oEl':.. +I' Since l':.. +1 ~f, we have oEf or ...,oEr.
Hence f is complete as needed. 0
Notice that this lemma is an immediate consequence of the compactness
theorem. For if every finite subset of l': has a model, then so does l':. Let
21 E Mod l':. Then Th 21 is complete and Th 21 ~ l':. However, we want to use
this lemma to prove the compactness theorem, and so we need a proof that
'.
does not use the compactness theorem.
0* = fJi( g(o»)
if 0 = 3vcp for some v and fJi, and let 0* = 0 otherwise. Let 0 = l': u { 0*: 0 E
l':}. Clearly 0 satisfies i, ii, and iii. Let 0' be a finite subset of 0, say
O'=l':'u{o;*:i<n}, where l':'~l': and oj=3ujfJij for each i<n. Since l': is
finitely satisfiable, l':'u{o;:i<n} has a model 21. For each i<n choose
~E1211 such that 21I=fJij(aj). Now let ~=(21,ao, ... ,an_I)' where aj=g(oj)'iI:l.
Clearly ~ is a model of 0'. Hence 0' is finitely satisfiable. 0
Lemma 7.7. Suppose l': is complete and finitely satisfiable, and every assertion
3vcp El': has a witness in l':. Then l': has a model of cardinality <cl': + w.
PROOF: Let A' be the set of all terms of type 'Tl': in which no variable
occurs. Define t l --/2 if tlR:!/2El':.
We first show that '--' is an equivalence relation on A'.
Case i. t--t: If not, then ...,[tR:!tJEl': by completeness; but this con-
tradicts the finite satisfiability of l':, since { ""[/R:!t]} has no model.
Case ii. 11--/2 implies t2--/ 1: Suppose t l --t2. If 'T2R:!'TI Ell':, then ...,[/2R:!
tdEl': by completeness; but then {t1R:!t2, ...,[t2R:!/1]} is a finite subset of l':
without a model-a contradiction. Hence, 'T2--'T1"
164 III An Introduction to Model Theory
Case iii. If 11"",,12 and 12"",,13, then 11"",,/3: Suppose 11"",,12 and 12"",,13 but
11~/3' Then "[11~/3]E~ by completeness, but {t1~12,/2~13' "[/1~13]}
is a finite subset of ~ without a model. This contradicts the finite
satisfiability of ~. Hence if 11"""/2 and 12"",,13, then 11"",,/3,
Hence' "",,' is an equivalence relation.
We next note that if 11"",,1; for i<n, then for all J",aE1'~, then
in,a/O, ... ,ln-l~in,al~, ... ,/~-1 E~. For if not, then by completeness
in,a/o,'''' In-I ~J",al~, ... , I~_I E~, contradicting finite satisfiability, since
{t;~/;:i <n} U {fn,alo>' .. ,In-I~lo.,,., I~_ d has no model.
Similarly, one sees that if 1;"""1; for i <n, then for any Rn,a E 1'~,
Rn,aIO, ... ,ln-1 E~ iff Rn,a/~, ... ,I~-1 E~.
We can now define a model & of ~ as follows:
a. Let I&I=A = {t-:/EA'} where 1-= {t':/"""I'}.
b.c: = ca for all Ca E 1'~.
III - -_
c. in 'wa 1o,- ... ,tn-I
-- ' J" a 1o, ... ,In-I for all in,a E1'~.
d. Rn,aIO, ... ,ln-1 iff Rn,a/O, ... ,ln-1 E~ for all Rn,a E1'~.
As we have shown in the paragraphs preceding the definition, clauses c
and d are unambiguous in that they do not depend on which representa-
tives 10"'" In _ I are ~hosen from the equivalence classes I~, .. ·,in _ I'
Notice that IIlI = I for all terms I EA'. Certainly this is true for constant
symbols by clause b of the definition of &, and if true for 1o,"" In_I' then
III _ III III III _ III - - . •
(J"g/Q, ... ,ln-l) -in,a/o ... In_1 -J",alo ... ln-I whIch by _clause b IS
in,a/O ... ln-I' Hence by induction on formulas, we have I III = I for all lEA'.
We now show by induction on assertions that for all (J of type 1'~
&F(J iff (JE~. (*)
We take advantage of the remark following Theorem 4.4 to reduce the
number of cases considered to the four that follow:
Case 1. (J is atomic: Then (J is either of the form RIO ... l n - 1 or of the
form lo~ I I' If (J = RIo, ... , In _ I' then by clause d of the definition of & and
the fact that IIlI = I for all I E A', we have &F RIO, ... ,1n _I iff R 1lI/~, ... , 1':_1
iff R 1lI/0 , ... , i,. _I iff RIo, ... , In _I E~. The argument is completely similar
when (J is 10~/1'
Case 2. (J is ., cP and cP satisfies (*): First notice that cP and ., cP cannot
both be in ~, since ~ is finitely satisfiable but {cp, ., cp} has no models. By
completeness either cP or ., cP is in ~. Hence ., cP E ~ iff cP El ~ iff not &Fcp
iff &F., cpo
Case 3. (J is CPI!\ CP2' where both CPI and CP2 satisfy (*): By completeness,
~ intersects each of the pairs {CPI' "CPI},{CP2' "CP2},{CPI!\CP2' "[CPI!\CP2]}'
Since no finitely satisfiable set contains { ., CPI' CPI!\ CP2}' {., CP2' CPI!\ CP2}' or
{CPI,CP2,"[CPI!\CP2]}' we have (JE~ iff [cpIE~ and cp2E~] iff [&FCPI and
&FCP2] iff &F(J.
Case 4. (J is 3vcp and cP satisfies (*): Notice that if (JE~, then (J has a
witness, say c, such that cp(C)E~. On the other hand, if cp(C)E~ for some
c, then (J E~. For if not, then ., (J E ~ by completeness, but then
3.7 Proof of the Compactness Theorem 165
{'1'(c), .., C1} has no model, contradicting the finite satisfiability of l:. Hence
C1El: iff [for some c, 'P(C)El:] iff [for some c, 2lF'P(C)] iff 2lFC1.
This completes the proof of (.), and so 21 EModl:.
To finish the proof of the theorem we need only observe that cl211 <cA'
<cl:. The last inequality holds because any complete set of sentences is
infinite, and only finitely many terms occur in each sentence. 0
PROOF OF THE COMPAClNESS THEOREM. For each finitely satisfiable l:, let
G(l:) be the set r of Lemma 7.4, and let H(l:) be the set n of Lemma 7.6.
Now let fl be a finitely satisfiable set, and define:
l:o=fl,
l:2" + 1 = G(l:2,,)'
l:2,,+2 = H(l:2" + I)'
A trivial induction shows that for all nEw:
i. l:,,!:l:,,+I'
ii. l:" is finitely satisfiable.
iii. l:2n + 1 is complete.
iv. If 3vcp E l:2" +2' then 3vcp has a witness in l:2n+2'
v. l:" has cardinality cl:+w.
Now let fl· = U jE",l:". We show that fl· satisfies the hypotheses of
Lemma 7.7.
i. fl· is finitely satisfiable: Suppose {C10, ... ,C1,,_.}!:fl·. Then for each
i<n there is aj(i)Ew such that C1j El:j(j)' Letj=max{j(i):i<n}. By
conclusion i, {C10."" C1,,_ I} !: l:j. By conclusion ii, l:j is finitely satisfi-
able. Hence {C10,'''' C1,,_.} has a model. Thus fl· is finitely satisfiable.
11. fl· is complete: Let C1 be of type1'fl·. Then for some nEw, C1 is of type
T l:". By conclusions i and iii, either C1 E l:2" + 1 or .., C1 E l:2n + I' Hence
C1 E fl· or .., C1 E fl·, which shows that fl· is complete.
iii. Each formula 3vcpEfl· has a witness in fl·: If 3vcpEfl·, then 3vcpEl:"
for some nEw. By conclusion i, 3vcpEl:2,,+2' So by conclusion iv, 3vcp
has a witness in l:2,,+2 and so a witness in fl·.
Thus we can apply Lemma 7.7 and conclude that there is a model 21 of
fl· such that cl211 <cfl·. Since fl!:fl·, we have 2lEModfl·.1t remains only
to observe that cfl·=w·(cfl+w)=cfl+w. 0
EXERCISES FOR §7
Here we outline our alternate proof of the compactness theorem. This
proof has a more algebraic flavor and requires the axiom of choice.
Let J be a set, and let F!: P(J), i.e., F is a set of subsets of J. Say that F
is a filter base on J if
i. 0ftF,
ii. X E F and Y E F implies X n Y E F.
166 III An Introduction to Model Theory
F is a filter on J if in addition
iii. J:2 Y:2 X and X E F implies Y E F.
A filter F is an ultrafilter if
iv. for each Yr;;;,J either YEF or J - YEF.
l. Let aEJ, and let F={X:aEX~J}. Show that F is an ultrafilter. An ultra-
filter of this kind, i.e., containing a singleton, is called a principal ultrafilter.
2. Show that F is a filter and X~J, then either FU{X} or FU{J-X} is
contained in a filter. Compare with Lemma 7.2.
3. Every filter is contained in an ultrafilter. [Hint: Well-order P(X) and use 2
above or use Theorem 9.3 in Part I.]
4. There are non-principal ultrafilters. [Hint: Let J be infinite, and let F be the set
of all subsets X of J such that c(J - X) <cJ. Then F is a filter, and any
ultrafilter containing F is non-principal.]
5. Show that there are 22< ultrafilters on J if cJ = /c.
Let F be an ultrafilter on J, and let {~j:jEJ} be a set of structures all
of type s. Let A # be the set of all choice functions on this set, i.e.,
A # = {g:Domg=J and gV)EI~jl for eachjEJ}. Assuming the axiom of
choice, A # ~0. Write g--h if U:gU)=hU)} EF.
6. Show that •- ' is an equivalence relation on A # .
Now let g={h:g--h}, and let A ={g:gEA#}. We define a structure
~ = IIF~jas follows:
i. I~I=A;
and for all c,j,R Es,
ii. cV1. =g, where gU) = cV1. j ;
iii. letjV1.g1, ... ,gn =g, where gU)= jV1.Jg1U), ... ,gnU);
iv. R'Ilg1,. .. ,gn if U:Rw'jglV), ... ,gnU)} EF.
7. Show thatr is well defined, i.e., if h;Eg; for i=I,2, ... ,n, and if we define
j·gb ... ,gn=h, where hV)=j9f.jh1V), ... ,hnV), then h=g.
8. Show that R'I{ is well defined, i.e., if h; E g; for i = 1,2, ... , n, then
U:R9f.JhIW, ... ,h"W} EF iff U:R'il.jgIW, ... ,gn(})} EF.
The structure ~ is the ultraproduct of {~j:jEJ} with respect to F. The
main theorem of ultraproducts is
Theorem. ~I=CP(gl, ... ,gn) iff U:~)=CP(gIU), ... ,gnU)} EF.
9. Prove this theorem by induction on formulas.
Principal ultrafilters are uninteresting, since:
10. If F={X ~J:jEX}, then ~~~j'
12. For each bEI~llet hbEJ{b}, the constant function on J with value b. Prove
that the function H defined by
H(b)=hb
is an elementary embedding of ~ into ~.
13. Let ~=<N, +,.), and let F be non-principal on w. Show that IIF~ is a
non-standard model of ThN, i.e., a model of ThN that is not isomorphic to N.
In fact, if h(n)=n, then ii is an infinite element in ~.
14. Let ~=<R, +,.), and let F be a non-principal ultrafilter on w.ThenIIF~ is a
non-standard model of R.
1beorem 8.2. Let W~!a, and suppose that for every formula cp and every
zE VbllWI such that !a1=3vcp(z), there is an aEIWlfor which
Then Wex:!a.
CoroUary 8.3. Let W ~!a, and suppose that for every finite subset
{ao, ... ,an-d of IWI, andfor every bEI!aI, there is an automorphism g on !a
such that
i. g(b)EIWI,
ii. g(aj)=a;!or all i<n.
axiom of choice, there is a 1-1 map on C onto some D'c;;;,D, or a 1-1 map
on D onto some C' c;;;, C. Since such a map is an isomorphism on C onto D'
or on D onto C', we have (C)=:(D')a:(D) or (D)=:(C')a:(C). In either
case C=:D.
EXAMPLE. Let (x,y) be the open interval on the real line between x andy,
i.e., the set of all reals greater than x but less than y. Let 2£ = ((0, t), < ),
m=«O, 1), <) where '<' is the usual ordering. Then by Corollary 8.3, we
see that 2£a:m.
We may write 2£a:m when all we mean is that for some m', 2£~m'a:m.
Theorem 8.6 (Upward LOwenheim-Skolem Theorem). Let cl2£1 >w, and let
K >cl2£1 + c'T2£. Then there is a m such that clml = K and 2£ a:m.
!Bl=qJ( z( ~)).
Now define
Ao=X,
An+1 =Anu {g(qJ,z):z E Vb~n and !B1=3txp<z)}.
Now let A = U ne..,An' We first observe that if cE.,.!B, then c!8 E A;
indeed, since !B1=3v [v~c], c!8 E AI' Also, if ao, ... ,an_1EA andjn,,,,E.,.!B,
thenjn!8",ao>
,
...• ,an_ 1EA; for if ajEA;.J, and m=max{jj:i<n}, then ao>""
an _ 1E Am' smce Ak ~ AI for all k <I < w. Hence jn,,,,ao,'''' an _I E Am+ I'
~
i. 12£1 = U ",<"12£,,,1,
ii. c·=c·o for each cEs,
iii. j91.= U ",<reI'#J.· for eachjEs,
iv. R'J1.= U ",<"R·· for each REs.
Notice that clause iii is equivalent to: j'#J.al , ... ,an = b iff j.·a l, ... ,an = b
whenever a l , ... ,an,bEI2£",I. Similarly, clause iv is equivalent to: R'J1.a, ... ,an
iff R'J1.·al , ... ,an whenever a 1, ... ,an EI2£",I.
In model theory one often builds a structure 2£ satisfying certain first
order requirements by erecting a tower of approximations 2£1,2£2'"''
2£"" ... , where 2£1i~2£y for /3<y and 2£= U ",<"2£,,,. But the requirement
2£1i~2£y for p<y is seldom strong enough to determine the first order
3.8 The LOwenheim-Skolem Theorems 171
properties of OC. For example, if OC; =({ - i, - i + I, - i + 2, ... }, <) for each
i= 1,2,3, ... , then not only is OCj~OCk forj<k, but OCj-=OCk and so OCj=OCk.
Nevertheless, if OC= U ;E"'OC;, then OC=([, <)iEOCj" To assure OCjO:OC for
each j, we need the condition OC yo:OC8 whenever y<8. Such a sequence
(OCa)aE" is called an elementary chain.
Theorem 8.9. Let (OCa)a<" be an elementary chain, i.e., let OCpo:OCy whenever
{3<y</(.. Then OCpo:Ua<"OCafor each {3<".
PROOF: Let OC = U a "OCa. Let f be the set of formulas cP such that
whenever {3<" and Z E~IIOCpl, then OCpFcp(Z) whenever OCFcp(Z). We want
to show that all formulas of type .,.OC belong to f.
Clearly the atomic formulas belong to f (see Definition 8.8, clauses iii
and iv). It is easy to see that if CPl Ef and CP2Ef, then ""CP2Ef and
CPl Vcp2 Ef.
Suppose that cp=3V\[1, where I/IEf. If OCpFcp(Z), then OCpFI/I(Z( ~) for
some bEIOCpl. Since I/IEf, OCFI/I(Z(~), and so OCFcp(Z). Conversely, if
OCFcp(Z), where Z EVb1IOCpl, then OCFI/I(Z( ~ ), for some b EIOCI. Since
IOCI= UaE"IOCal, bEIOCyl for some yE". Let 8=max{{3,y}. Then since
Z(~)EVblIOC81 and I/IEf, we have OC8FI/I(Z(~)' and so OC8Fcp(Z). But
OCp o:OC8, and so OCpFcp(Z). Hence cpEf. This completes the proof that all
formulas belong to f, which gives Theorem 8.9. 0
Definition 8.10. Say that l: implies 0 if Modl: = Mod(l: U {0 n, i.e., if every
model of l: is a model of o.
PROOF: Let ~=(A, <91) and ~=(B, <!l), where A = {ao,a l, ... } and B=
{bo,b l, ... }. Suppose that CO<91 ... <91 cn _ 1 and do<'iI:J··· <'il:Jdn_ l. Then
for every c EA there is a dEB such that for all i <n, cj <c iff ~ <d, and
c<cj iff d<~. Indeed, if c<co or if Cn- I <c, then such a d exists, since ~
has no least or greatest element. If for somej<n-I, <:;-1 <c<cj> then such
a d exists because the ordering is dense. Now let g({cj:i<n},c,{~:i<n})
be the first term in the sequence bo,b l, ... that can serve as such a d.
Reversing the roles of ~ and~, we get a function h({~:i <n},d, {cj:i <n}).
Now define sequences aO,a), ... and bOob), ... as follows:
ao=ao>
bo=bo>
a2n + 1 = a2n + I ,
b2n+I = g({ ao. ... ,a2n},a2n+I' {bOo ... ,b2n }),
b2n +2= b2n +2,
a2n+2= h( {bo, ... ,b2n +.},b2n +2, {ao. ... ,a2n+ I})'
It is easy to see that the function I defined by /( a;) = b; is an isomorphism
on ~ onto ~, which proves the theorem. []
3.8 The LOwenheim-Skolem Theorems 173
aEK
j.= u r«
aE",
foranyjEs,
aEK
Show that ~aex:~t'T~a for each aEIC.
3. Letj,,(v) be the term defined in the last example of this section. Let l: be the
following set:
(a) VV03Vl[ vo~c~[J( Vl)~VO]]'
(b) Vvo VVl[J( vo)~j( Vl)~VO~Vl]'
(c) Vvo[J,,(vo)~vo],
(d) Vvo[J(vo)~c].
Prove that l: is complete and that Th(N,jH) ={(JIl: implies (J), where jH(n) =
n+ 1.
4. Let F be the function from 18 to Ii described in the last example of this section.
Show that F is an isomorphism.
S. Let (D, <) be a partially ordered structure such that for each d, and d2
in D there is a dED for which d, <d and d2 <d. Let {~d:dED} be a set
of structures such that ~dex:~d' whenever d<d'. Define U dED~d as in
Exercise 2. Show that ~d'ex: U dED~d for each d'ED.
6. Let l:=Th(R, <). Show that whenever ~~!8 and ~ and 18 are models of l:,
then ~ex:!8.
7. Find~, 18', and 18 such that ~ex:!8 and !8~!8'ex:~ but ~~!8.
8. Prove the analog of Theorem 4.18 for elementary substructures, i.e., show that
if ~;;:!8' ex:!8, then there is a Ii ~!8 such that ~ ex: Ii.
assertions each of the form 'tI uo· .. un _ Itp where no quantifier occurs in tp
(Theorem 9.3).
Lemma 9.1. Suppose that ~ ~~, that z E VbII~I, and that t/I is open. Then
PROOF: Let K be the class of formulas t/I for which this is true. Then every
atomic formula belongs to K by the definition of substructure. Moreover,
if t/l1,t/l2EK, then
i. ~F'It/lI<Z> iff ~Ft/lI<Z> iff ~Ft/lI<Z> iff ~F'It/lI<Z>, and
ii. ~F[t/ll Vt/l2](Z> iff (~Ft/lI<Z> or ~Ft/l2<Z» iff (~Ft/lI<Z> or ~Ft/l2<z»
iff ~F[t/lIVt/l2](z>.
Hence t/l1' t/l2 E K iinplies 'I t/l1' t/ll Vt/l2 E K. Thus K contains the open
formulas, which proves the lemma. 0
Definition 9.2. A formula in prenex normal form is said to be universal
(existential) if the only quantifier symbol occurring in its prefix in 'tI (3).
That is, a universal formula is one of the form 'tIuo··· Un-I'll, where tp is
open. A set r of formulas is said to be universal if every tp E f is universal.
Theorem 9.9. Let K be a class of structures of some fixed finite type not
containing function symbols. Then K=Mod~ for some set ~ of universal
assertions iff the following conditions are met:
i. W;a;:~ and ~EK implies WEK.
ii. W~~ and ~EK implies WEK.
iii. {S(~,X):X ~I~I, O<cX <w} ~K implies ~EK.
PROOF: Suppose K=Mod~, where ~ is a set of universal sentences.
Theorem 4.19 tells us that condition i is satisfied. Condition ii is a
consequence of the preceding theorem. Now suppose that {S(~,X):X ~
I~I, O<cX <w} ~K. Let oE~; say o=Vuo'" un-II/!, where I/! is open. Let
ZEVbII~I. Let W=Sub(~,X), where X={z(uJ, ... ,z(un_ I)}. WEModK,
and so WFO. Hence WFI/!<Z), and so by Lemma 9.1, ~FI/!<Z). This shows
that ~ E K and so condition iii holds.
For the converse, assume i, ii, and iii. Let f=ThK. We claim that
K-Modf. Clearly, K~Modf. To prove the reverse inclusion we take
~flK with T~-TK. Since K satisfies condition iii, there must be a finite
subset X of I~I such that Sub(~,X)flK. Let W-Sub(~,X). Since TW is
finite and does not contain function symbols, W is finite. By i and ii, no
@: E K contains a substructure isomorphic to W. So by Lemma 9.6, @:F'" o~
for all @:EK. Hence ..,o.Ef. Since ~FOIJ[ (again using Lemma 9.6), we
have ~flModf. Hence K~Modf, and so K=Modf. Applying Theorem
9.3, we see that K-Mod~ for some set ~ of universal assertions. 0
Lemma 9.10. Let CflT~UTCp. Then
F~~VU'P iff F~~'P( ~).
PROOF: Suppose +F~~VUIP. Let W+EMod~, where TW+~T~UTCp(~).
Then W+-(W,c· ) for some W of type T~UTCp and WEMod~. Hence
WFVU'P, and so W+F'P( ~). This shows that F~~'P( ~).
For the converse, suppose F~~'P(~) and WEMod~. Let aEW, and
form W+ -(W~T~ U TCp,c·+), where c·+ -a. Then W+ EMod~, so
W+F'P( ~). Hence, WFCP<Z( ~) for all a, so WFVUIP, as needed. 0
178 III An Introduction to Model Theory
Definition 9.11.
i. An V3-formula is a formula of the form VUO,.",un_13wO, ... ,wn_ICP
where cP is open.
ii. Let K be a class of structures, and define ThY3 K = {0: 0 E ThK and 0 is
a V3-formula}.
Notice that ~!:;m iff ~a::0m and that ~a::m iff ~a::rm where r is the
set of all formulas of type 'T~. We write ~a::ym for ~a::rm when r is the
set of universal formulas of type 'T~.
Lemma 9.13. If ~a::ym, then there is a @; such that m!:;@; and ~a::@;.
PROOF: Let m+=(m,Cr)b~'!8I' where Cb~'Tm and b=c'r. Let ~+=
(~,Cr)bEI.I' where c:+ =c: for bEI~I. Let ~=Th~+ u~, where ~ is the
diagram. of m determined by m+. By Theorems 4.17 and 8.5 it is enough to
prove that ~ has a model. Suppose not. Then some finite subset has no
model, say {oo, ... ,op-I,Po. ... ,Pq-d, where oiETh~+ and PjE~ for each
i<.p,j<q. Let 0 be 001\·· ·I\Op_1 and P be Pol\·· ·I\Pq-l. Then I=o~
,p. We can write P as p(cao, ... ,Co,,_I,Cbo, ... ,Cbm_)' where ao. ... ,an- I EI~I,
bo, •.. , bm _ 1 E Iml-I~I, and every Cb occurring in P is some ca, or some cb.•
Using Lemma 9.10, we see that 1=0~Vuo, ... ,um-I,p(cao, ... ,co,,_I'uo. .. :,
um_ I)' where uo, ... , um_ 1 are distinct variables not occurring in p. Since
~+I=o and ~a::ym, we have m+I=Vuo. ... ,um-I,p(cao, ... ,ca",_I,uO, ... ,Um-I)'
contradicting m+l=p(cao, ... ,Co,,_I,Cbo, ... ,Cbm_.). Hence ~ is finitely satisfiable
and so has a model, as we needed to show. D
PROOF: Let ~+ =(~,c:+)aEI.I' where Ca~'T~ and c:+ =a for all aEI~I. Let
6DY~ be the set of all universal formulas true in ~+. Let ~=ThKU 6DY~.
We claim that every finite subset of ~ has a model. For if not, then some
finite subset~' of ~ has no model. Let ~'={oo, ... ,on-I'Po. ... 'Pm-d,
where each ojEThK, and each PjE6j)A~. Let Pj be Vuj,o··· ui,n;-ICPj. Using
Lemma 5.4, we can assume that UjJ=FUk,t when (iJ)=/=(k,I).1t is easy to see
that Pol\··· I\Pm-1 is equivalent to Vuo, ... ,Utcp where each uiJ is some Uk
and cP is CPol\·· ·I\CPm-l. Letting 0 be 001\··· I\on-I' we have I=o~
,Vuo, ... ,ujcp, or equivalently 1=0~3uo, ... ,Uj_I'CP. Let {cao, ... ,ca,_.}='Tcp
- 'TK, and let wo, ... , Wr_1 be variables not occurring in 3uo. ... , ut ' cpo Let
cP' be the result of replacing ca. by Wi throughout cP for each i <r. Applying
Lemma 9.10, we have 1=0~Vwo··· Wr_13uO··· ut-I,cp'. Since oEThK,
3.9 The Prefix Problem 179
Theorebl 10.1 (Consistency Lemma). Let s == 'T~ n T~, and suppose that
~ t s=~t s. Then there is a ~ such that
i. T~ ='T~ U 'T~,
ii. ~ ex: ~ t 'T~,
iii. ~= ~t'T~.
PROOF: Let ~EMod~" and let ~EMod~2' ~ and ~ satisfy the hypothe-
ses of the theorem, and so there is a @: such that ~ex:@:t'T~ and ~ex:@:t'T~.
Hence ~ E Mod(~, U ~0· D
182 III An Introduction to Model Theory
The first two clauses are obvious. If clause iii is false, then there are
YO>"',Yn-IE~ such that I=al--+-'(YO/\"'/\Yn-I)' But then ~ has no
model, since -,(y /\ ... /\ Yn_I)Er and hence { -'(Yo' .. Yn-I)' YO>···' Yn-d
~~, a contradiction. Hence clause iii holds also.
Taking ~I to be ~ U {ad and ~2 to be ~ U { -, a2}' we apply Corollary
10.2 and conclude that ~I U ~2 has a model. But this means that al/\ -, a2
has a model, and so al--+a2 is not valid. Hence if l=al--+a2' then al--+a2 must
have an interpolant. 0
In other proofs of the interpolation theorem the interpolant a for al--+a2
is given constructively in terms of a l and a2' In other words, if one is
explicitly given a valid implication al--+a2' an algorithm may be followed
that will explicitly yield an interpolant. Later we shall obtain such an
algorithm, but one that is less expedient than that usually given.
Definidon 10.5.
i. Let R be a relation symbol. Say that ~ implicitly defines R if whenever
~=(@:,Rw) and m=(@:,R~ and ~,mEMod~, then ~=~; i.e., any
structure @: of type Ta-{ R} has at most one expansion in Mod~.
ii. If R is an n-ary relation symbol and cp is a formula of type l' ~ - {R },
then we say that cp is an explicit definition of R with respect to ~ if
whenever ~ E Mod ~ and ~ = (@:,R w), then R ¥l = {(ao, ... , an -I):@:I=
cp(ao>'" ,an-I)}'
. {CP(~):cpEr}.
Lemma 10.6.
i. 0 implicitly defines the n-relation R iff whenever S is an n-relation symbol
not occurring in 0, then
FO!\O( ;n.a)
n.a
-+'1fvo··· Vn_ 1 [ Rvo··· Vn_I~SvO'" Vn-I J.
ii. Let cp be a formula of type 'TO - { R }. Then cp defines R explicitly with
respect to 0 iff
FO-+'1fvo'" vn- I [ cp( vo" .vn_I)~Rvo··· vn- I J.
The proof of this theorem is straightforward and is left as an exercise.
Lemma 10.7.
i. If l: implicitly defines R, then some finite subset of l: implicitly defines R.
ii. If cp explicitly defines R with respect to l:, then cp explicitly defines R with
respect to some finite subset of l:.
PROOF OF I. Suppose that no finite subset of l: implicitly defines R. Then
by Lemma 10.6, for each finite subset l:' of l: there is a model of
P(v)v P( -v),
...,[P(v)/\P( -v)],
P(u)/\P(v)~P(u·v)/\P(u+v).
and
it~=jt~·
Definition 11.1.
i. We call t a constant term if no variable occurs in it.
ii. Let fP(uo ... U,,-I) be an open formula. Say that fP(to ... t,,_I) is a
substitution instance of fP if each tj is a constant term of type TCp.
Lemma 11.2. Suppose that some constant symbol belongs to T21. Let B =
{t'if.: t is a constant term of type T21}. Then B is the universe of a substructure
of 21.
186 III An Introduction to Model Theory
Theorem 11.3 (Herbrand's Theorem). Let 'P be an open formula with free
variables uo, ... ,un_ l , and suppose that some constant symbol belongs to T'P.
Then 3uO"'un_ I'P is valid iff there is a finite sequence 'Po, ... ,'Pk-1 of
substitution instances of'P such that 'Po V ... V'Pk _ I is valid.
Theorem 11.4. Let 0 be an open assertion, and let X be the set of all terms
occurring in o. If 0 has a model, then 0 has a model of power .;;;;x.
PROOF: Suppose that 0 has a model. Then 0 has a model ~ of type TO. Let
B={t~:tEX}. Since B*0, we can choose some b*EB. Now letfbe an
n-function symbol in TO, and let bO ... bn- 1EB. Define
For each constant symbol c E TO define cj!J = c~; for each n-relation symbol
R and each bo ... bn- I EB define Rj!Jbo ... bn_ 1 iff R~bo ... bn_I' Now let
~=(B,Rj!J,jj!J,Cj!J)R,J,CETO' It is easy to see that ~FO; indeed, an easy
induction on subformulas 'P of 0 shows that ~F'P iff ~F'P. 0
How does Theorem 11.4 yield a test for satisfiability for open asser-
tions? Suppose we are given an open assertion o. We can then write down
explicitly the set X of terms occurring in o. By the theorem, we know that 0
has a model iff it has a model of power .;;;; eX.
3.12 Axiomatizing the Validities of L 187
Notice that if "I is a I-I function on I~I onto B, then "I induces an
isomorphism on ~ onto ~, where ~ is defined by
I~I=B,
c!ll=yc'il,
»,
j!llbo·· .bn - I = y(J~( y-Ibo)··· (y-Ibn_ I
R!lIbo ... bn _ 1 iff Rl}l(y-Ibo) ... (y-lbn_l)
for all c,j, R ET~ and all bO, ... ,bn _ 1 EB. Hence if a has a model, then a
has a model whose universe is either {O}, {O,l}, ... , or {O,I,2, ... ,m-l},
where m = cX. Next we make a list of all those sturctures of type TO having
such a universe. We then check to see if a is satisfied in any of these
(finitely many) structures. If not, then we know that a is not satisfied in
any structure.
These considerations also yield a test for validity of open assertions
because a is valid iff -, a is not satisfiable. Hence the above test for
satisfiabilityof -, a is tantamount to a test for validity of a.
iii. If Cp=[~JI\lh], then cp<z) is t if both ~I<Z) and ~2<Z) are t, and cp<z) is!
otherwise.
cp is valid (or a tautology) if cp<z) = t for all z. Show that the question 'Is cp a
tautology?' is decidable. [Hint: The values of cp<z) and cp<z') are the same if
z(P)=z'(P) for each P occurring in cp.]
8. Let VL be the set of valid assertions of L. We have seen that VL is effectively
enumerable. However, VL is not decidable. The proof of this is outlined below.
(a) Let a be the universal closure of the conjunction of the following:
i. Sv~Su....,u~::::w,
ii. I~Sv,
iii. v~I....,3u(v~Su),
iv. v+I~Sv,
v. v+Su~S(v+u),
vi. v·l~v,
vii. v·Su~v·u+v.
Let ~ = (N+, +, ., S, I), where S is the successor function. Show that if !BFa,
then ~k!B.
(b) Define the constant term n· as follows:
1·=1,
(n + I)· = S(n·).
Let cp(v) be the formula at the beginning of §2.l2 that says "v is the number
of a Turing machine that halts on input v." cp is existential. Show that if
Fa:...,.cp(n·), then ~Fcp(n·) and conversely.
(c) Show that VL is not decidable.
On the other hand, the expressive power of the first order predicate
calculus is not sufficient to characterize many of the objects and concepts
which mathematicians are interested in. For example, the notion of finite-
ness, or of cardinality Ie for Ie infinite, cannot be expressed in L. The
natural numbers with < cannot be characterized up to isomorphism in L;
nor can one characterize (N, +, .) or (R, <) or (R, +, .), and so on. So
there is sUfficient motivation to investigate languages more expressive than
L with the hope that these languages will be relevant to areas of mathemat-
ics beyond the scope of L and yet have interesting model theories.
It is quite easy to invent languages with more expressive power than L.
For example, the second order predicate calculus has, in addition to the
first order variables, variables of second order intended to range over sets.
Thus, adding YXr[XO;\Yy[Xy--+X(y+I)]]--+vz[Xz]l to Th(N,+,O)
gives a second order theory l: which determines (N, +,0) up to isomor-
phism, i.e., ~EModl: iff ~S!:(N, +, ',0). The second order calculus is
expressive enough to state the order completeness property of (R, +, " <),
and hence can describe this structure up to isomorphism. The notion of
well ordering can be captured faithfully, and so on. Moreover, one need
not stop with the second order. We can consider third order variables
ranging over sets of sets, and fourth order, and so on. However, these
languages, from the second order calculus on up, have not admitted a
model theory as rich or as beautiful as that for L. The game is then to
discover languages that have greater expressive power than L but still
possess an amenable model theory. Several such languages have been
discovered, and we briefly describe a couple of them.
Let Ie and ~ be infinite cardinals with Ie;> ~;> w. We describe a language
L",,. that differs from the first order predicate calculus in allowing conjunc-
tions and disjunctions over sets of assertions of cardinality less than Ie and
simultaneous universalization or existentialization of fewer than ~ vari-
ables. More precisely, the definition of formula, Definition 2.ld, is mod-
ified by replacing clause iii with
iii*. if f~X and cf<1e then [;\f] EX and [Vf] EX,
and replacing iv with
iv*. if cp E X and u is a set of variables of cardinality less than ~, then
[3ucp] and [Yucp] EX.
We define ~Fcp(Z) in the obvious way by replacing iii, iii', iv, and iv' of
Definition 4.3 with
iii*. cp=[ ;\f], and ~FY(Z) for each yEf.
iii'*. cp= [ Vf], and ~FY(Z) for some yEf.
iv*. cp=[3U1/1], and for some z*EVbll~1 such that z(v)=z*(v) whenever
vflu, we have ~Fo/(Z*).
iv'*. cp=[\fUI/I), and for all z*EVbll~1 such that z(v)=z*(v) whenever
vflu, we have ~Fo/(Z*).
192 III An Introduction to Model Theory
Absolute 54 Cardinality 37
Addition of cardinals 37 Cartesian product 6
Addition of ordinals 35 Categorical 171
V3-formula 178 Chain 23,29
Algebraic number 20 Chinese remainder theorem 109
Arithmetical 118 Choice function 27
Assertion 115 Church, Alonzo 90
Assignment 112, 142 Cohen, Paul 48, 57
At least as numerous 15 Compactness theorem 158
Atomic formula 113, 137 Complement of A in B 4
Axiom 124 Complete 125, 162, 171
of choice 27 diagram 169
of determinacy 49 Composite of f and g 7
of extensionality 44 Composition of f with g\, ... ,gr 68
of infinity 45 Computable function 63
of the null set 44 Computable relation 65
of pairing 44 Computable set 65
of the power set 45 Congruence relation 141
of regularity 44 Cot:Uunctive normal form 155
of replacement 45 Cons~tencylemma 180
of separation 45 Cons~tent 51
of union 45 Constant term 185
system 124 Continuum hypothesis 48
Correct 125
Countable 10
Barwise, J. 134
axiom of choice 49
Binary relation 6
Bound variable 115
Bounded quantifiers 75
Definability theorem 184
Branch 23
Defines 118
Definition by cases 76
Cantor-Bernstein theorem 17 Definition by transfinite recursion 41
Cardinal number 36 De Morgan's rules 5
195
196 Subject Index
Zermelo-Fraenkel axiomatization 44
Weakly compact cardinal 49 146 '
Well orderUlg 23
Witness 163
Undergraduate Texts in Mathematics
Lax/ Burstein/ Lax: Calculus with Wilson: Much Ado About Calculus.
Applications and Computing, A Modem Treatment with Applications
Volume 1. Prepared for Use with the Computer.
1976. xi, 513 pages. 170 illus. 1979. Approx. 500 pages. Approx. 145
illus.
LeCuyer: College Mathematics with Wybum/Duda: Dynamic Topology.
A Programming Language. 1979. Approx. 175 pages. Approx. 20
1978. xii, 420 pages.126illus. 64 diagrams. illus.