
Introduction To Mathematical Logic

These lecture notes provide an introduction to mathematical logic, including: 1) Defining structures and models as pairs consisting of a set and additional structure like functions, relations, etc. Examples of structures include graphs, partial orders, rings, and the natural numbers. 2) Introducing formal languages as sets of symbols and defining how structures interpret the symbols of a language. Homomorphisms, embeddings, and isomorphisms between structures are defined. 3) Defining substructures as subsets of a structure that preserve the interpretation of symbols through the identity mapping onto the larger structure.

LECTURE NOTES: INTRODUCTION TO MATHEMATICAL LOGIC

PHILIPP SCHLICHT

Contents
Overview
1. Formal languages and structures
1.1. Structures and formulas
1.2. Semantics
1.3. Elementary substructures
1.4. Theories and axioms
1.5. Universal truths and the Hilbert calculus
2. Sets and Axioms
2.1. Wellfounded relations and wellorders
2.2. Axioms of set theory
2.3. Cardinals
2.4. Tarski’s undefinability of truth
3. Completeness and compactness
3.1. Completeness of Hilbert’s calculus
3.2. The Compactness Theorem and applications
4. Incomplete theories
4.1. Finite set theory
4.2. The Levy hierarchy
4.3. Incompleteness of extensions of finite set theory
4.4. An analysis of Gödel’s sentence
4.5. Incompleteness of extensions of Peano arithmetic
5. Complete theories
5.1. Quantifier elimination
5.2. Categoricity
References

These are lecture notes from the University of Bonn in the summer of 2021. I would
like to thank the participants for comments and for finding errors and typos, in particular
Tom Stalljohann, Celina Teke, Lucas Valle Thiele, Josia Pietsch, Analena Kamm, Yuliya
Kryvitskaya and others. I would like to thank the tutors Thomas von Campenhausen,
Moritz Hartlieb and Sebastian Meyer for contributing to the exercises.

Overview
In the first chapter, we study structures, formulas and introduce the Hilbert calculus.
In the second chapter, we give an introduction to set theory. We begin informally with
ordinals and cardinals, and then study axiomatic set theory up to transfinite induction.
This can be seen as a foundation on which all results in this course are built.

Date: August 23, 2021.


In the third chapter, we present the completeness of Hilbert’s proof calculus. We then
study the compactness theorem and applications, deriving finitary analogues of infinitary
combinatorial statements such as the infinite Ramsey’s theorem.
In the fourth chapter, the main goal is Gödel’s first incompleteness theorem. It shows
that no matter how one extends the theory of the natural numbers, assuming there is a
reasonable listing of all axioms, some statements that can neither be proved nor disproved
will always remain.
In the final chapter, we give an introduction to model theory. We aim for some
applications in algebra, for instance the Lefschetz principle, which relates statements about
the complex numbers to other algebraically closed fields.

1. Formal languages and structures


Lecture 1 (12 April)
Mathematical logic studies formal languages and proofs (syntax), structures such as
groups, fields, graphs or linear orders, and the connection between languages and
structures (semantics). Expressions in a formal language are themselves considered as
mathematical objects. For instance, a word in a language is a finite sequence of symbols, i.e. a
function from {0, …, n − 1} into the set of symbols.

1.1. Structures and formulas. We begin by introducing structures and formulas in
first-order logic.¹ Many familiar mathematical structures consist of a set with additional
structure, for example:
(a) A graph is a pair (G, E), where G ≠ ∅ is the set of nodes and E ⊆ G² is the set of
edges, a symmetric set of ordered pairs in G. (A subset E of G² is called symmetric
if ∀x, y ∈ G ((x, y) ∈ E ↔ (y, x) ∈ E).)
(b) A partial order is a pair (P, ≤), where P ≠ ∅ is a set and ≤ is a binary relation on
P satisfying the following conditions:
(i) (Reflexivity) ∀x ∈ P x ≤ x
(ii) (Antisymmetry) ∀x, y ∈ P ((x ≤ y ∧ y ≤ x) → x = y)
(iii) (Transitivity) ∀x, y, z ∈ P ((x ≤ y ∧ y ≤ z) → x ≤ z)
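For a finite set, the three conditions in (b) can be checked mechanically. The following is a minimal sketch, not part of the notes: the name is_partial_order and the representation of ≤ as a set of ordered pairs are assumptions made for illustration.

```python
from itertools import product

def is_partial_order(P, leq):
    """Check (i)-(iii) for a finite set P and a set `leq` of ordered pairs over P."""
    reflexive = all((x, x) in leq for x in P)
    antisymmetric = all(
        x == y
        for x, y in product(P, repeat=2)
        if (x, y) in leq and (y, x) in leq
    )
    transitive = all(
        (x, z) in leq
        for x, y, z in product(P, repeat=3)
        if (x, y) in leq and (y, z) in leq
    )
    return reflexive and antisymmetric and transitive

# Divisibility on {1, 2, 3, 6}, encoded as the pairs (x, y) with x dividing y.
P = {1, 2, 3, 6}
divides = {(x, y) for x in P for y in P if y % x == 0}
# A relation that identifies two distinct points violates antisymmetry.
both_ways = {(0, 0), (1, 1), (0, 1), (1, 0)}
```

Here is_partial_order(P, divides) returns True, while is_partial_order({0, 1}, both_ways) returns False because antisymmetry fails.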
In general, a structure is defined as follows:
Definition 1.1.1. A structure or model is a pair M = (M, F), where M is a nonempty
set and F = ⟨F_i | i ∈ I⟩ is a family, indexed by a set I, of
(1) elements (constants) F_i ∈ M,
(2) functions F_i : M^{k_i} → M with k_i ∈ ℕ, and
(3) relations F_i ⊆ M^{k_i} with k_i ∈ ℕ.
Note that in most cases, I will be finite or countable.
When we write M and N , we will assume that M = (M, F) and N = (N, F) as above.
Here are some more examples. The superscript notation will be defined in Definition
1.1.5.
Example 1.1.2.
(1) A ring (R, 0^R, 1^R, +^R, ·^R).
(2) A group (G, 1^G, ·^G, (⁻¹)^G).
(3) The structure of the natural numbers (ℕ, 0^ℕ, S^ℕ, +^ℕ, ·^ℕ, <^ℕ), where S^ℕ denotes the
successor function.
(4) The field (ℚ, 0^ℚ, 1^ℚ, +^ℚ, ·^ℚ).
¹ Second-order logic, which we won’t study here, allows two kinds of objects, for instance natural
numbers and sets of natural numbers.

All groups have a binary operation (multiplication), a neutral element and an inverse
function. This is encoded in the language of groups.
Definition 1.1.3. A language or alphabet is a set of constant symbols, function symbols
and relation symbols. Each function symbol and each relation symbol has an arity, i.e. a number of
arguments, k ∈ ℕ with k ≥ 1. For example, a k-ary function on a set M is of the form
f : M^k → M. A k-ary relation R on a set M is of the form R ⊆ M^k.
Here are some examples of languages.
Example 1.1.4.
(1) The empty language L_∅ = ∅.
(2) The language L_R = {0, 1, +, ·} of rings and fields.
(3) The language L_G = {1, ·, ⁻¹} of groups.
(4) The language L_O = {<} of strict linear orders.
(5) The language L_OF = L_R ∪ L_O of linearly ordered fields.
(6) The language L_ℕ = {0, S, +, <} of the natural numbers.
(7) The language L_∈ = {∈} of set theory.
Throughout, c, d denote constant symbols, f, g function symbols and R, S relation
symbols.
Definition 1.1.5. Suppose that L is a language. An L-structure is a structure M =
(M, F), where M is a nonempty set and F = ⟨s^M | s ∈ L⟩ with
(1) c^M ∈ M if c ∈ L is a constant symbol,
(2) f^M : M^k → M if f ∈ L is a k-ary function symbol, and
(3) R^M ⊆ M^k if R ∈ L is a k-ary relation symbol.
So every symbol has an interpretation as an element, function or relation in the structure.
For example, let R = (ℝ, 0^ℝ, 1^ℝ, +^ℝ, ·^ℝ) denote the field of real numbers, a structure
in the language L_R = {0, 1, +, ·} of rings. Here and elsewhere, we will often not distinguish
between the structure and its underlying set, and simply write (ℝ, 0^ℝ, 1^ℝ, +^ℝ, ·^ℝ). One can
further simplify the notation to (ℝ, 0, 1, +, ·) when it is clear that one means constants and
functions rather than symbols.
The familiar notions of homomorphisms, embeddings and isomorphism of (e.g.) groups,
vector spaces etc. make sense in this general setting:
Definition 1.1.6. Suppose that M = (M, ⟨s^M | s ∈ L⟩) and N = (N, ⟨s^N | s ∈ L⟩)
are L-structures. By a function h : M → N between the structures, we mean a function
h : M → N on the underlying sets.
(1) h is a homomorphism if for all k ∈ ℕ and all a_0, …, a_{k−1} ∈ M:
(a) h(c^M) = c^N for all constant symbols c.
(b) h(f^M(a_0, …, a_{k−1})) = f^N(h(a_0), …, h(a_{k−1})) for all k-ary function symbols
f.
(c) R^M(a_0, …, a_{k−1}) ⟹ R^N(h(a_0), …, h(a_{k−1})) for all k-ary relation symbols
R.
(2) h is an embedding if it is an injective homomorphism and for all k-ary relation
symbols R and a_0, …, a_{k−1} ∈ M,
R^M(a_0, …, a_{k−1}) ⟺ R^N(h(a_0), …, h(a_{k−1})).
(3) h is an isomorphism if it is a surjective embedding.
(4) h is an automorphism if it is an isomorphism and M = N.
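The difference between an injective homomorphism and an embedding is easy to miss, so here is a small sketch for finite structures. The dictionary layout ("domain", "consts", "funcs", "rels") and the function names are hypothetical choices made for this illustration only.

```python
from itertools import product

def is_homomorphism(h, M, N):
    """h is a dict from M's domain to N's domain; M and N share one language."""
    if any(h[M["consts"][c]] != N["consts"][c] for c in M["consts"]):
        return False
    for f, (k, f_M) in M["funcs"].items():
        f_N = N["funcs"][f][1]
        for args in product(M["domain"], repeat=k):
            if h[f_M(*args)] != f_N(*(h[a] for a in args)):
                return False
    for R, (k, R_M) in M["rels"].items():
        R_N = N["rels"][R][1]
        if any(tuple(h[a] for a in args) not in R_N for args in R_M):
            return False
    return True

def is_embedding(h, M, N):
    """Injective homomorphism that also reflects every relation."""
    if not is_homomorphism(h, M, N) or len(set(h.values())) != len(h):
        return False
    for R, (k, R_M) in M["rels"].items():
        R_N = N["rels"][R][1]
        for args in product(M["domain"], repeat=k):
            if (args in R_M) != (tuple(h[a] for a in args) in R_N):
                return False
    return True

# Two graphs on {0, 1}: one without edges, one with a single symmetric edge.
no_edge = {"domain": {0, 1}, "consts": {}, "funcs": {}, "rels": {"E": (2, set())}}
one_edge = {"domain": {0, 1}, "consts": {}, "funcs": {},
            "rels": {"E": (2, {(0, 1), (1, 0)})}}
identity = {0: 0, 1: 1}
```

The identity map from no_edge to one_edge is an injective homomorphism, since condition (c) holds vacuously, but it is not an embedding: the edge in the target does not come from an edge in the source.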
The notions of subgroup, subfield etc. also make sense in this general setting.

Definition 1.1.7. Suppose that M = (M, ⟨s^M | s ∈ L⟩) and N = (N, ⟨s^N | s ∈ L⟩) are
L-structures.
(1) M is a substructure of N if M ⊆ N and the identity id : M → N is an embedding,
i.e. for all k ∈ ℕ and all a_0, …, a_{k−1} ∈ M:
(a) c^M = c^N for all constant symbols c ∈ L.
(b) f^M(a_0, …, a_{k−1}) = f^N(a_0, …, a_{k−1}) for all k-ary function symbols f ∈ L.
(c) R^M(a_0, …, a_{k−1}) ⟺ R^N(a_0, …, a_{k−1}) for all k-ary relation symbols R ∈ L.
(2) N is a superstructure of M if M is a substructure of N.
One can also change a structure by adding or removing constants, functions or relations.
Definition 1.1.8. Suppose that K ⊆ L are languages and M = (M, ⟨s^M | s ∈ L⟩) is an
L-structure.
(1) M↾K = (M, ⟨s^M | s ∈ K⟩) is called a reduct of M, more precisely the reduct of M
to the language K.
(2) M is called an expansion of M↾K. In other words, M is an expansion of a structure
N if N is a reduct of M.
Example 1.1.9.
(1) M = (ℝ, 0^ℝ, 1^ℝ, +^ℝ, ·^ℝ, <^ℝ) is an L_OF-structure (in fact it is an ordered field, i.e. it
satisfies the axioms of ordered fields), and M↾L_R = (ℝ, 0^ℝ, 1^ℝ, +^ℝ, ·^ℝ) is its reduct
to the language L_R of rings.
(2) Suppose that M = (M, ⟨s^M | s ∈ L⟩) is an L-structure and A ⊆ M. Then
M_A = (M, ⟨s^M | s ∈ L⟩ ∪ ⟨a | a ∈ A⟩) is an expansion of M to the language
L_A = L ∪ A, in which each a ∈ A serves as a new constant symbol interpreted by itself.
The structure M_A has the property that every homomorphism h : M_A → M_A fixes A
pointwise.
We now begin with building formulas from the language/alphabet. One first builds
terms, and from those, formulas. The notion of term generalises the notion of polynomials
over (Z, 0, 1, +, ·). The terms in the language of rings are the polynomials with coefficients
in Z in an arbitrary number of variables.
We fix a sequence ⟨v_n | n ∈ ℕ⟩ of variables once and for all. We will still use the
notation x, y, z for variables, but this will mean that they are among the v_n.
In the following, a word with n letters from a set S is formally a function f : {0, …, n − 1} →
S. If S is countable, one can assume that S is a set of natural numbers, so that words are
realised as partial functions on ℕ.
Definition 1.1.10. The following words are L-terms:
(1) Every variable v_n
(2) Every constant symbol in L
(3) f(t_0, …, t_{n−1}), if f is an n-ary function symbol and t_0, …, t_{n−1} are L-terms
The L-terms are those words generated by the rules (1)-(3).
We next list the logical symbols, which are available independently of the language.
Definition 1.1.11. The following symbols are called logical symbols:
(1) Variables v_n ∈ Var
(2) The equality symbol ≐
(3) The negation symbol ¬
(4) The disjunction symbol ∨
(5) The existential quantifier ∃
(6) The left bracket (, the right bracket ) and the comma ,²
² Some authors call these auxiliary symbols instead of logical symbols.

L-formulas are, informally, those words that make sense. They are built as follows.
Definition 1.1.12. The following words are L-formulas:
(1) s ≐ t, if s, t are L-terms
(2) R(t_0, …, t_{k−1}), if R is a k-ary relation symbol and t_0, …, t_{k−1} are L-terms
(3) (¬ϕ), if ϕ is an L-formula
(4) (ϕ ∨ ψ), if ϕ, ψ are L-formulas
(5) (∃x ϕ), if ϕ is an L-formula and x is a variable
L-formulas are those words generated by the rules (1)-(5). Moreover, a formula is called
quantifier-free if it is generated using only (1)-(4), and atomic if it is generated only from
(1) and (2).
While this is the formal definition of formulas, we will always allow the usual
abbreviations to simplify the notation. For example, we write x + y for +(x, y) or abbreviate
((x < y) ∧ (y < z)) by x < y < z. We also leave out brackets when there is no danger of
confusion.
Note that in the previous definition, the brackets around ϕ ∨ ψ are necessary, since one
could otherwise not distinguish between ∃x (ϕ ∨ ψ) and (∃x ϕ) ∨ ψ. The brackets around
¬ϕ and ∃x ϕ are not strictly necessary:³ one could still prove Lemma 1.1.17, but not
Lemma 1.1.15.
We will often do induction on terms. This means that we show a statement for variables
and constants at the start of the induction, and show that it holds for f(t_0, …, t_k)
assuming it holds for t_0, …, t_k in the induction step. This is a valid induction on n ∈ ℕ,
since the term f(t_0, …, t_k) has some length n, while the subterms t_0, …, t_k are strictly
shorter.
Induction on formulas works similarly.
The conjunction ∧ and the universal quantifier ∀ are still missing. It is convenient to
introduce them as notations, rather than as a part of the language itself, since this reduces
the number of cases in proofs. We will call this the extended language, and will always
use it from now on.
Notation 1.1.13. (Extended language)
(1) (ϕ ∧ ψ) := (¬(¬ϕ ∨ ¬ψ))
(2) (ϕ → ψ) := ((¬ϕ) ∨ ψ)
(3) (ϕ ↔ ψ) := ((ϕ → ψ) ∧ (ψ → ϕ))
(4) (∀x ϕ) := (¬(∃x ¬ϕ))
(5) (ϕ_0 ∧ ⋯ ∧ ϕ_n) := ((⋯((ϕ_0 ∧ ϕ_1) ∧ ϕ_2) ⋯) ∧ ϕ_n), with n opening brackets
(6) (ϕ_0 ∨ ⋯ ∨ ϕ_n) := ((⋯((ϕ_0 ∨ ϕ_1) ∨ ϕ_2) ⋯) ∨ ϕ_n), with n opening brackets
(7) (∃x_0 … x_n ϕ) := ∃x_0 (∃x_1 (… (∃x_n ϕ) …)), with n closing brackets
(8) (∀x_0 … x_n ϕ) := ∀x_0 (∀x_1 (… (∀x_n ϕ) …)), with n closing brackets
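Since the extended language is pure notation, every formula written with ∧, →, ↔ and ∀ can be rewritten mechanically into one using only ¬, ∨ and ∃. The following sketch shows this for items (1)-(4). The encoding of formulas as nested tuples with tags such as "and" or "all" is an assumption made for illustration and does not appear in the notes.

```python
def expand(phi):
    """Rewrite a formula of the extended language into the base language."""
    op = phi[0]
    if op in ("eq", "rel"):            # atomic formulas are left unchanged
        return phi
    if op == "not":
        return ("not", expand(phi[1]))
    if op == "or":
        return ("or", expand(phi[1]), expand(phi[2]))
    if op == "ex":                     # ("ex", x, phi)
        return ("ex", phi[1], expand(phi[2]))
    if op == "and":                    # (phi ∧ psi) := (¬(¬phi ∨ ¬psi))
        return ("not", ("or", ("not", expand(phi[1])), ("not", expand(phi[2]))))
    if op == "imp":                    # (phi → psi) := ((¬phi) ∨ psi)
        return ("or", ("not", expand(phi[1])), expand(phi[2]))
    if op == "iff":                    # (phi ↔ psi) := ((phi → psi) ∧ (psi → phi))
        return expand(("and", ("imp", phi[1], phi[2]), ("imp", phi[2], phi[1])))
    if op == "all":                    # (∀x phi) := (¬(∃x ¬phi))
        return ("not", ("ex", phi[1], ("not", expand(phi[2]))))
    raise ValueError(f"unknown connective: {op!r}")

# ∀x (P(x) → Q(x)) with hypothetical unary relation symbols P and Q.
P_x = ("rel", "P", "x")
Q_x = ("rel", "Q", "x")
expanded = expand(("all", "x", ("imp", P_x, Q_x)))
```

The result is the tuple encoding of (¬(∃x ¬((¬P(x)) ∨ Q(x)))). Proofs by induction on formulas then only need the cases for ¬, ∨ and ∃.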

The following is the usual formulation of the group axioms in the extended language,
using some abbreviations.
Example 1.1.14. The group axioms are the following formulas in the language L_G:
(1) ∀x, y, z ((x · y) · z ≐ x · (y · z))
(2) ∀x (x · 1 ≐ 1 · x ≐ x)
(3) ∀x (x · x⁻¹ ≐ x⁻¹ · x ≐ 1)
³ For example, there are no brackets there in Martin Ziegler’s book.

Formally, a word is a sequence of symbols in a set S, or in other words, a function
f : {0, …, n} → S. If f : {0, …, n} → S is a word, then an initial segment is a restriction
f↾{0, …, k} for some k ≤ n. An end segment is defined similarly.
Lemma 1.1.15.
(1) An L-term cannot be a proper initial segment or end segment of another L-term.
(2) An L-formula cannot be a proper initial segment or end segment of another
L-formula.
Proof. (1): This is seen a bit more easily than by the following argument by observing that the
left and right brackets in a term cancel out, so given the beginning of a term, one can
uniquely determine its end. So the next argument is not necessary, but I left it here.
Recall that in f(t_0, …, t_m), the letters f, (, ) and , are symbols and the t_i are themselves words. If
f(t_0, …, t_m) is an initial segment or end segment of g(u_0, …, u_n), then it is easy to see
from the inductive hypothesis that m = n, t_i = u_i for all i ≤ m and f = g. We only
consider one case in detail, since the other cases use similar steps.
Suppose that s = f(s_0, …, s_k) and t = g(t_0, …, t_l) are L-terms and s is an initial
segment of t. We will see that s = t. Since the first symbols of s and t agree, we have
f = g, and since f is a (k+1)-ary function symbol, so is g, and hence k = l. We now show by
induction that s_i = t_i for all i ≤ k. Write u ⊑ v if u is an initial segment of v, and u ⊏ v
if it is a proper initial segment. Either s_0 ⊏ t_0, t_0 ⊏ s_0, or s_0 = t_0. The first two cases
are impossible by the inductive hypothesis, so s_0 = t_0. Moving on to the next term, we
have either s_1 ⊏ t_1, t_1 ⊏ s_1, or s_1 = t_1, etc.
(2): Suppose that ψ is a proper initial segment or end segment of θ. When θ is atomic,
i.e. of the form s ≐ t or R(t_0, …, t_n), then the claim follows from (1). When θ equals (¬ϕ),
(ϕ ∨ χ) or (∃x ϕ), it is easy to see that the claim follows from the inductive hypothesis
for (2). □
Lecture 2 (14 April)
Note that we often do an induction on the length of formulas. A more interesting
way of measuring the size of terms and formulas is their depth, where, informally, each
step in the construction of a term or formula adds 1 to its depth. All proofs would work
for induction on the depth as well.
A segment of a word f : {0, . . . , n} → S is a connected subword g of f , i.e., there are
k, l ∈ N such that g : {0, . . . , k} → S, g(i) = f (l + i) for all i ≤ k.
Definition 1.1.16. A subformula ϕ of an L-formula ψ is a segment of ψ that is itself an
L-formula. It is a proper subformula if additionally ϕ ≠ ψ.
Lemma 1.1.17.⁴ All subformulas of a formula ϕ appear in its construction, i.e.
(1) Atomic formulas s ≐ t and R(t_0, …, t_k) do not have any proper subformulas.
(2) Any proper subformula of
(a) (¬ϕ) is a subformula of ϕ;
(b) (ϕ ∨ ψ) is a subformula of ϕ or a subformula of ψ;
(c) (∃x ϕ) is a subformula of ϕ.
Therefore, for each nonatomic formula ϕ, there is a unique way in which ϕ is built from
one or two other formulas.
Proof. This follows from Lemma 1.1.15. □
The previous lemma shows that one can recover the way in which the formula was built.
In particular, this shows that one has avoided ambiguous formulas such as ∃x ϕ ∧ ψ, which
could have meant either ∃x (ϕ ∧ ψ) or (∃x ϕ) ∧ ψ.
⁴ This also holds for the extended language, by the same argument.

The role of variables is relevant for formal derivations later on. It is important to
distinguish between free and bound variables. For example, the variable x is free in x < y,
but is bound by the quantifier ∀x in ∀x x < y.
Definition 1.1.18. An occurrence of a variable x in an L-formula θ is free if this occurrence
is not bound by a quantifier, i.e.:
(a) If θ is an atomic formula, then every occurrence of x is free.
(b) If θ is the formula (¬ϕ) or (ϕ ∨ ψ), then an occurrence of x in ϕ is free in θ if it is
free in ϕ; the same holds for ψ.
(c) If θ is the formula (∃y ϕ), then an occurrence of x in ϕ is free in θ if it is free in ϕ
and x ≠ y.
An occurrence of a variable x in an L-formula is bound if it is not free.
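The recursion in this definition translates directly into a program. The sketch below computes the set of variables that have at least one free occurrence, which is coarser than tracking individual occurrences. The tuple encoding is an assumption for illustration: variables are strings, other terms are tuples (symbol, t_0, …, t_{k−1}), and formulas are tagged tuples.

```python
def term_vars(t):
    """Variables of a term: a string is a variable, a tuple is (symbol, *subterms)."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(s) for s in t[1:]))

def free_vars(phi):
    """Variables with at least one free occurrence in a base-language formula."""
    op = phi[0]
    if op == "eq":                     # ("eq", s, t): every occurrence is free
        return term_vars(phi[1]) | term_vars(phi[2])
    if op == "rel":                    # ("rel", R, t_0, ..., t_{k-1})
        return set().union(*(term_vars(t) for t in phi[2:]))
    if op == "not":
        return free_vars(phi[1])
    if op == "or":
        return free_vars(phi[1]) | free_vars(phi[2])
    if op == "ex":                     # ("ex", y, phi) binds the variable y
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(f"unknown formula: {phi!r}")

x_less_y = ("rel", "<", "x", "y")
# ∀x (x < y), written in the base language as ¬∃x ¬(x < y).
forall_x = ("not", ("ex", "x", ("not", x_less_y)))
```

Here free_vars(x_less_y) is {"x", "y"} and free_vars(forall_x) is {"y"}, matching the example given before the definition. A formula with free_vars equal to the empty set is a sentence.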
1.2. Semantics. We now define when a formula is true in a structure, i.e. the semantics,
or meaning, of the formula in the structure. The definition takes as inputs two objects,
a structure M and a formula ϕ, and outputs whether the formula holds in the structure.
One writes M |= ϕ if ϕ holds in M, i.e. M is a model of ϕ.
Note that there is a difference between formal statements and their truth within a
structure (defined formally by semantics), and informal mathematical statements that
describe the structure from the outside. For example, the size of an infinite structure is a
property that can be seen in the mathematical universe. E.g. the field ℂ_alg of algebraic
complex numbers is countable, but the field ℂ of complex numbers is uncountable.
However, this cannot be expressed within the structures, since ℂ_alg and ℂ satisfy precisely the
same formulas (ℂ_alg ≺ ℂ is an elementary substructure, as we will see later).
The formula ∀x, y, z ((x · y) · z = x · (y · z)) holds in an L_G-structure (G, 1, ·, ⁻¹), if for
all a, b, c ∈ G, the formula (x · y) · z = x · (y · z) holds with the values a, b, c assigned to
x, y, z, respectively. We need such assignments for giving a recursive definition of validity
of a formula in a structure.
Let Var = {vn | n ∈ N} always denote our fixed set of variables.
Definition 1.2.1. An assignment (of variables) for a structure M = (M, F) is a function
ξ : Var → M .
Definition 1.2.2. Suppose M = (M, F) is an L-structure and ξ is an assignment for M.
We define t^{M,ξ} by induction on L-terms:
(1) c^{M,ξ} = c^M, if c ∈ L is a constant symbol
(2) v_i^{M,ξ} = ξ(v_i) for all variables v_i
(3) f(t_0, …, t_{k−1})^{M,ξ} = f^M(t_0^{M,ξ}, …, t_{k−1}^{M,ξ}) if f is a k-ary function symbol

Example 1.2.3. For L = {0, 1, +, ·}, M = (ℚ, 0^ℚ, 1^ℚ, +^ℚ, ·^ℚ), the polynomial
t = (v_0 · v_0) + (v_1 · v_2) and the assignment ξ(v_i) = i + 2, we have
t^{M,ξ} = 2 · 2 + 3 · 4 = 16.
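The inductive definition of t^{M,ξ} is a recursive evaluator, and the example can be recomputed with it. This is a sketch under assumed conventions that are not in the notes: variables are strings, other terms are tuples (symbol, t_0, …, t_{k−1}), the symbol · is spelled "*", and the interpretation is a dictionary from symbols to rationals or Python functions.

```python
from fractions import Fraction

def eval_term(t, interp, xi):
    """Compute t^{M,xi}; `interp` interprets the symbols, `xi` is the assignment."""
    if isinstance(t, str):             # rule (2): a variable takes its assigned value
        return xi[t]
    symbol, args = t[0], t[1:]
    if not args:                       # rule (1): a constant symbol
        return interp[symbol]
    # rule (3): evaluate the subterms, then apply the interpretation of the symbol
    return interp[symbol](*(eval_term(s, interp, xi) for s in args))

Q = {
    "0": Fraction(0),
    "1": Fraction(1),
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}
t = ("+", ("*", "v0", "v0"), ("*", "v1", "v2"))
xi = {f"v{i}": Fraction(i + 2) for i in range(3)}
print(eval_term(t, Q, xi))  # 16
```

Only the values of v0, v1 and v2 are ever looked up, so changing xi on other variables cannot change the result.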
Lemma 1.2.4. Suppose M = (M, F) is an L-structure, ξ is an assignment for M and t is an
L-term. Then t^{M,ξ} depends only on the values ξ(v_i) for variables v_i that appear in t.
Proof. This is immediate, since the value ξ(v_i) appears in the definition of t^{M,ξ} only if v_i
appears in t.
More formally, we show by induction on L-terms t that for assignments ξ and ζ for M
such that ξ(v_i) = ζ(v_i) for all variables v_i that appear in t, we have t^{M,ξ} = t^{M,ζ}:
(1) For variables v_n, v_n^{M,ξ} = ξ(v_n) = ζ(v_n) = v_n^{M,ζ}.
(2) For constants c, c^{M,ξ} = c^M = c^{M,ζ}.
(3) If f ∈ L is a k-ary function symbol and t_0, …, t_{k−1} are terms, then
f(t_0, …, t_{k−1})^{M,ξ} = f^M(t_0^{M,ξ}, …, t_{k−1}^{M,ξ}) = f^M(t_0^{M,ζ}, …, t_{k−1}^{M,ζ}) = f(t_0, …, t_{k−1})^{M,ζ}
by the inductive hypothesis. □


Notation 1.2.5.
(1) If t is an L-term, we write t = t(x_0, …, x_{n−1}) if x_0, …, x_{n−1} lists all variables in t
in the order of their first appearance in t.
(2) For an L-term t = t(x_0, …, x_{n−1}) and an assignment ξ for M with ξ(x_i) = a_i for
i < n, we write t^{M,a_0,…,a_{n−1}} for t^{M,ξ}.
To define when a formula is true in a structure, we will need to inductively add more
values to an assignment:
Definition 1.2.6. Suppose that ξ is an assignment for M = (M, F), x is a variable and
a ∈ M. The assignment ξ^a_x is defined by
ξ^a_x(y) = a if x = y, and
ξ^a_x(y) = ξ(y) if x ≠ y.
Here is the definition of truth in a structure, just as you would expect:
Definition 1.2.7. Suppose that ξ is an assignment for an L-structure M = (M, F). We
define the statement ϕ holds in M for ξ, written as M |= ϕ[ξ], by induction on L-formulas
ϕ:
(1) M |= (s ≐ t)[ξ] ⇐⇒ s^{M,ξ} = t^{M,ξ}.
(2) M |= R(t_0, …, t_k)[ξ] ⇐⇒ R^M(t_0^{M,ξ}, …, t_k^{M,ξ}).
(3) M |= (¬ψ)[ξ] ⇐⇒ M ⊭ ψ[ξ].
(4) M |= (ψ ∨ θ)[ξ] ⇐⇒ M |= ψ[ξ] or M |= θ[ξ].
(5) M |= (∃x ψ)[ξ] ⇐⇒ ∃a ∈ M M |= ψ[ξ^a_x], where ξ^a_x is the assignment from Definition 1.2.6.
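For a finite structure, the five clauses can be executed directly, with the existential clause becoming a search through the domain. The following sketch is one possible rendering. The tuple encoding of terms and formulas, the dictionary layout of the structure and the names eval_term and holds are assumptions made for illustration.

```python
def eval_term(t, M, xi):
    """t^{M,xi}: variables are strings, other terms are tuples (symbol, *subterms)."""
    if isinstance(t, str):
        return xi[t]
    symbol, args = t[0], t[1:]
    value = M["interp"][symbol]
    if not args:
        return value
    return value(*(eval_term(s, M, xi) for s in args))

def holds(phi, M, xi):
    """Decide M |= phi[xi] for a finite structure M, clause by clause."""
    op = phi[0]
    if op == "eq":                     # clause (1)
        return eval_term(phi[1], M, xi) == eval_term(phi[2], M, xi)
    if op == "rel":                    # clause (2): relations are sets of tuples
        values = tuple(eval_term(t, M, xi) for t in phi[2:])
        return values in M["interp"][phi[1]]
    if op == "not":                    # clause (3)
        return not holds(phi[1], M, xi)
    if op == "or":                     # clause (4)
        return holds(phi[1], M, xi) or holds(phi[2], M, xi)
    if op == "ex":                     # clause (5): try each a with xi changed at x
        x, psi = phi[1], phi[2]
        return any(holds(psi, M, {**xi, x: a}) for a in M["domain"])
    raise ValueError(f"unknown formula: {phi!r}")

# The strict linear order on {0, 1, 2}.
M = {"domain": {0, 1, 2}, "interp": {"<": {(0, 1), (0, 2), (1, 2)}}}
has_least = ("ex", "y", ("not", ("ex", "x", ("rel", "<", "x", "y"))))
has_loop = ("ex", "x", ("rel", "<", "x", "x"))
```

With these definitions, holds(has_least, M, {}) is True and holds(has_loop, M, {}) is False. The dictionary update {**xi, x: a} plays the role of the modified assignment from Definition 1.2.6. For infinite structures the search in clause (5) does not terminate in general, so this only illustrates the definition.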
1.3. Elementary substructures. The notion of substructure was introduced in
Definition 1.1.7 above. The next lemma shows that every structure has a smallest substructure.
Lemma 1.3.1. Suppose that N = (N, ⟨s^N | s ∈ L⟩) is an L-structure and A ⊆ N. The
following conditions are equivalent:
(a) There is a substructure A of N of the form A = (A, ⟨s^A | s ∈ L⟩).
(b) For any a_0, …, a_n ∈ A and all L-terms t = t(x_0, …, x_n), we have t^{N,a_0,…,a_n} ∈ A.
Assuming that L contains at least one constant symbol,⁵ it follows that there is a
(⊆-)least substructure of N, and its domain is {t^{N,a_0,…,a_n} | t(x_0, …, x_n) is an L-term and
a_0, …, a_n ∈ A}.
Proof. Exercise. □
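Condition (b) says that A is closed under the interpretations of all terms, which amounts to containing the constants and being closed under the basic functions. For a finite structure, the set of values of terms with parameters from a given set A can therefore be computed by closing A off step by step. The sketch below assumes a representation that is not in the notes: functions are given as pairs (arity, Python function), and generated_domain is a hypothetical name.

```python
from itertools import product

def generated_domain(A, constants, functions):
    """Smallest set containing A and the constants that is closed under the functions."""
    closed = set(A) | set(constants)
    changed = True
    while changed:                     # repeat until no new element appears
        changed = False
        for arity, f in functions:
            for args in product(list(closed), repeat=arity):
                value = f(*args)
                if value not in closed:
                    closed.add(value)
                    changed = True
    return closed

# Addition modulo 12 with the constant 0, starting from A = {4}.
add_mod_12 = lambda a, b: (a + b) % 12
print(sorted(generated_domain({4}, {0}, [(2, add_mod_12)])))  # [0, 4, 8]
```

Starting from A = {1} instead yields all of {0, …, 11}. The loop terminates because the structure is finite; relations play no role, since a substructure simply inherits them by restriction.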
A substructure can have very different properties from the original structure. For
instance (ℤ, 0^ℤ, 1^ℤ, +^ℤ, ·^ℤ) is a substructure of (ℝ, 0^ℝ, 1^ℝ, +^ℝ, ·^ℝ), so a substructure of a
field is not necessarily a field. The next definition describes a more useful notion.
Here we already use the notation M |= ϕ[a] that is only introduced after Lemma 1.3.4
below.
Definition 1.3.2. Suppose that M and N are L-structures.
(1) M is an elementary substructure of N , written as M ≺ N , if M ⊆ N and for all
a0 , . . . , an−1 ∈ M and all L-formulas ϕ with n free variables
M |= ϕ[a0 , . . . , an−1 ] ⇐⇒ N |= ϕ[a0 , . . . , an−1 ].
(2) An L-sentence is an L-formula without free variables.
⁵ Recall that structures are by definition nonempty. If L does not contain constant symbols, then the
following set is empty.

(3) M and N are called elementarily equivalent if for all L-sentences ϕ,
M |= ϕ ⇐⇒ N |= ϕ.
Example 1.3.3.
(1) Every substructure of a complete graph (i.e., there is an edge between any two
vertices) is itself a complete graph. If both are infinite, it is also an elementary
substructure. (We will prove this later in the lecture.)
(2) (ℤ, ≤^ℤ) is a substructure of (ℚ, ≤^ℚ), but not an elementary substructure.
(Consider the sentence ∀x ∀z [(x ≤ z ∧ x ≠ z) → ∃y (x ≤ y ∧ y ≤ z ∧ x ≠ y ∧ y ≠ z)],
which holds in ℚ but fails in ℤ.)
(3) (2ℕ, +^ℕ, 0^ℕ) is not an elementary substructure of (ℕ, +^ℕ, 0^ℕ). (We can look at the
notion of evenness, ∃y (x = y + y). In ℕ, 2 = 1 + 1 is even, but this fails in 2ℕ.)
(4) (ℚ, ≤^ℚ) is an elementary substructure of (ℝ, ≤^ℝ).
(We will prove this later in the lecture.)
Lecture 3 (19 April)
The next lemma was already used in Definition 1.3.2.

Lemma 1.3.4. If ξ, ζ are assignments for an L-structure M such that ξ(vi ) = ζ(vi ) for
all variables vi that are free in ϕ, then
M |= ϕ[ξ] ⇐⇒ M |= ϕ[ζ].
If x0 , . . . , xn are the free variables of ϕ and ξ(xi ) = ai for all i ≤ n, we can therefore
write M |= ϕ[a0 , . . . , an ] instead of M |= ϕ[ξ]. If ϕ has no free variables, we simply write
M |= ϕ.
Proof. This is immediate because only the values of variables occurring freely in ϕ appear in the
inductive definition of M |= ϕ[ξ].
In more detail, we do an induction on formulas ϕ. The induction hypothesis states that
the claim holds for the subformulas of ϕ for all assignments. If R ∈ L is an (n+1)-ary relation
symbol, t_0, …, t_n are L-terms and ϕ = R(t_0, …, t_n), then by Lemma 1.2.4
M |= ϕ[ξ] ⇐⇒ R^M(t_0^{M,ξ}, …, t_n^{M,ξ}) ⇐⇒ R^M(t_0^{M,ζ}, …, t_n^{M,ζ}) ⇐⇒ M |= ϕ[ζ].
The cases s ≐ t, (¬ψ) and (ψ ∨ θ) are similar. If ϕ = (∃x ψ), then for every a ∈ M the
assignments ξ^a_x and ζ^a_x of Definition 1.2.6 agree on the free variables of ψ, so
M |= ϕ[ξ] ⇐⇒ ∃a ∈ M M |= ψ[ξ^a_x] ⇐⇒ ∃a ∈ M M |= ψ[ζ^a_x] ⇐⇒ M |= ϕ[ζ]. □

Example 1.3.5. For each n ∈ ℕ, there is a sentence ϕ such that M = (M, F) |= ϕ if
and only if |M| = n. E.g. for n = 3, let ϕ be the sentence
∃x_0, x_1, x_2 (x_0 ≠ x_1 ∧ x_0 ≠ x_2 ∧ x_1 ≠ x_2 ∧ ∀y (y ≐ x_0 ∨ y ≐ x_1 ∨ y ≐ x_2)).
Hence a finite structure does not have proper elementary substructures.
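The sentence for general n follows the same pattern as the one for n = 3: n pairwise distinct witnesses that exhaust the domain. A brute-force check over a finite domain mirrors its quantifier structure. This is only an illustrative sketch, and the function name is hypothetical.

```python
from itertools import product

def satisfies_exactly(domain, n):
    """Mirror the sentence: some x_0..x_{n-1}, pairwise distinct, with every y among them."""
    return any(
        len(set(xs)) == n and all(y in xs for y in domain)
        for xs in product(domain, repeat=n)
    )

three = {"a", "b", "c"}
```

Here satisfies_exactly(three, 3) is True, while satisfies_exactly(three, 2) and satisfies_exactly(three, 4) are both False. In the first failure no two witnesses cover the domain, and in the second there are no four distinct witnesses.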
Example 1.3.6. If M = (M, <^M) is an elementary substructure of N = (ℕ, <^ℕ), then
M = ℕ.
Proof. We show n ∈ M for all n ∈ ℕ by induction on n.
To see that 0 ∈ M, note that the sentence ∃y ¬∃x (x < y) holds in N and thus it also
holds in M. So there is some a ∈ M with M |= ¬∃x (x < y)[a]. Since M ≺ N, we also
have N |= ¬∃x (x < y)[a]. Thus a is the <^ℕ-least element of ℕ, i.e. a = 0. So 0 = a ∈ M.
Now assume that n ∈ M. We have N |= (∃z > x ¬∃y (x < y < z))[n] (this formula is
an abbreviation for ∃z (z > x ∧ ¬∃y ((x < y) ∧ (y < z)))). A similar argument as for 0
shows that n + 1 ∈ M. □
Analogous to the notation for terms in Notation 1.2.5, we write ϕ(x0 , . . . , xn ) if x0 , . . . , xn
are precisely the free variables of ϕ, listed by the first appearance in ϕ.

Lemma 1.3.7 (Tarski’s test). Suppose that M and N are L-structures. The following
conditions are equivalent:
(1) M is an elementary substructure of N, i.e. M ≺ N.
(2) M is a substructure of N, and for all L-formulas ϕ(x, x_0, …, x_n) and all a_0, …, a_n ∈
M:
If there is some b ∈ N with N |= ϕ[b, a_0, …, a_n],
then there is some a ∈ M with N |= ϕ[a, a_0, …, a_n].
Proof. (1)⇒(2): If N |= ϕ[b, a_0, …, a_n], then N |= ∃x ϕ [a_0, …, a_n]. Since
M ≺ N, we have M |= ∃x ϕ [a_0, …, a_n], so there is some a ∈ M with M |= ϕ[a, a_0, …, a_n].
Since M ≺ N, we have N |= ϕ[a, a_0, …, a_n].
(2)⇒(1): By induction on formulas ϕ. The atomic case holds since M is a substructure
of N. The cases ∨ and ¬ are easy.
For the existential case, first suppose that ϕ = ϕ(x, x_0, …, x_n) and M |= ∃x ϕ [a_0, …, a_n].
Then there is some a ∈ M with M |= ϕ[a, a_0, …, a_n]. By the inductive hypothesis for ϕ,
N |= ϕ[a, a_0, …, a_n], and therefore N |= ∃x ϕ [a_0, …, a_n].
Now suppose that N |= ∃x ϕ [a_0, …, a_n]. By (2), there is some a ∈ M with N |=
ϕ[a, a_0, …, a_n]. By the inductive hypothesis for ϕ, we have M |= ϕ[a, a_0, …, a_n]. So
M |= ∃x ϕ [a_0, …, a_n]. □
Homomorphisms (see Definition 1.1.6) preserve interpretations of terms:
Lemma 1.3.8. Suppose that M = (M, F), N = (N, G) are L-structures and h : M → N
is a homomorphism. Then for any term t = t(x_0, …, x_n) and all a_0, …, a_n ∈ M,
h(t^{M,a_0,…,a_n}) = t^{N,h(a_0),…,h(a_n)}.
Proof. Exercise in the tutorials. □
Isomorphisms preserve the truth of formulas:
Lemma 1.3.9. Suppose that M = (M, F), N = (N, G) are L-structures and h : M → N
is an isomorphism. Then for any formula ϕ = ϕ(x_0, …, x_n) and all a_0, …, a_n ∈ M,
M |= ϕ[a_0, …, a_n] ⇐⇒ N |= ϕ[h(a_0), …, h(a_n)].
Proof. Homework exercise. □

1.4. Theories and axioms. One often wants to derive results about a structure from
axioms. A set of L-sentences (formulas with no free variables) is called a theory. The
sentences in a theory T are often called axioms and T is called an axiom system.
Definition 1.4.1. One says that an L-structure M satisfies an L-theory T , or M is a
model of T , in symbols M |= T , if M |= ϕ for all ϕ ∈ T .
Given an axiom system T , we can ask:
(1) (Syntactic) Which formulas are provable from T? This will be made precise using a
proof calculus in Section 1.5.
(2) (Semantic) Which formulas does T imply? (See Definition 1.4.7.) This is equivalent
to the previous question by Gödel’s proof of the completeness of the proof calculus
in chapter 3.
(3) Which models does T have? We study this question throughout the lecture. We
have already looked at the notion of elementary substructure, where one has two
structures with the same theory. The question how many models (of a given size) T
has is also connected with incompleteness of T, since an axiom system that implies
neither ϕ nor ¬ϕ has some models that satisfy ϕ and some that don’t.

We now look at some examples of theories. Peano arithmetic is an axiom system in the
language L_Arith = {0, S, +, ·} of arithmetic that holds in the structure (ℕ, 0^ℕ, S^ℕ, +^ℕ, ·^ℕ)
of the natural numbers, where S^ℕ denotes the successor function.
Example 1.4.2. (Peano Arithmetic) PA consists of the axioms:
(1) ∀x (S(x) ≠ 0)
(2) ∀x, y (S(x) ≐ S(y) → x ≐ y)
(3) ∀x, y (x + 0 ≐ x)
(4) ∀x, y (S(x + y) ≐ x + S(y))
(5) ∀x, y (x · 0 ≐ 0)
(6) ∀x, y (x · S(y) ≐ x · y + x)
(7) (Axiom scheme of induction) If ϕ(x, y⃗) is an L_Arith-formula, then
∀y⃗ (ϕ(0, y⃗) ∧ ∀x [ϕ(x, y⃗) → ϕ(S(x), y⃗)]) → ∀x ϕ(x, y⃗).
(The axioms (1)-(6) together with the axiom ∀y (y ≐ 0 ∨ ∃x (S(x) ≐ y)), which follows
from PA, are called Robinson Arithmetic.)
Note that PA consists of infinitely many axioms. (By a theorem of Ryll-Nardzewski,
one cannot axiomatise PA with only finitely many axioms.)
Every model of PA satisfies statements about + and · that can be proved by induction,
for example division with remainder.
Example 1.4.3. The following statements hold in every model of PA:
(1) ∀y (y = 0 ∨ ∃x (S(x) = y))
(2) ∀x, y, z ((x + y) + z = x + (y + z))
(3) ∀x, y (x + y = y + x)
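In the standard model, axioms (3)-(6) of PA can be read as recursive definitions of + and · from the successor function. The sketch below computes with them and spot-checks statements (2) and (3) on small numbers. This only illustrates the axioms in ℕ and is no substitute for the proof by induction, which is what covers all models of PA; the function names are chosen for this example.

```python
def S(n):
    """Successor function of the standard model."""
    return n + 1

def add(x, y):
    # Axiom (3): x + 0 = x.  Axiom (4): x + S(y) = S(x + y).
    if y == 0:
        return x
    return S(add(x, y - 1))

def mul(x, y):
    # Axiom (5): x · 0 = 0.  Axiom (6): x · S(y) = x · y + x.
    if y == 0:
        return 0
    return add(mul(x, y - 1), x)

small = range(6)
associative = all(
    add(add(x, y), z) == add(x, add(y, z))
    for x in small for y in small for z in small
)
commutative = all(add(x, y) == add(y, x) for x in small for y in small)
```

Both checks succeed, and add and mul agree with the built-in operations on these inputs, as expected in the standard model.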
In the case of PA, the axioms aim to describe a single structure. Other axioms, e.g.
the group axioms, aim to describe a class of structures.
By a class, we mean the collection of all objects with a certain property, for example
the class of all sets or the class of all groups. This will be studied in more detail in the
next chapter.
Definition 1.4.4. For a class C of L-structures, an L-theory T is called an axiomatisation
of C in L if C is the class of L-structures M with M |= T. If such an axiomatisation in
L (resp., finite axiomatisation in L) exists, C is called axiomatisable in L (resp., finitely
axiomatisable in L).
Note that some classes of L-structures are not axiomatisable in L, but axiomatisable
as reducts by a theory T′ in an extended language L′ ⊇ L in the following sense: The
class C consists of precisely those L-structures M such that there exists an expansion of
M to an L′-structure M′ with M′ |= T′. In other words, the structures in C are precisely
the reducts of those L′-structures that satisfy the theory T′.
(14 June) To clarify, we will always say:
(1) A class C of L-structures is axiomatisable in L, or
(2) A class C of L-structures is axiomatisable by language extension. This means that C is the class of
reducts of models of an L′-theory T′ in some L′ ⊇ L.

Here are examples of axiomatisations of various classes of structures.


Example 1.4.5.
(1) For any language L and any n ∈ N with n ≥ 1, the class C≤n of L-structures with
at most n elements is axiomatised by the axiom
\[ \varphi_{\leq n} = \exists x_0 \ldots \exists x_{n-1}\, \forall y \bigvee_{i=0}^{n-1} y = x_i, \]
where the disjunction is an abbreviation for $y = x_0 \vee \dots \vee y = x_{n-1}$.
Similarly, the class C≥n of L-structures with at least n elements can be axiomatised
in the empty language by the axiom
\[ \varphi_{\geq n} = \exists x_0 \ldots \exists x_{n-1} \bigwedge_{i<j\leq n-1} \neg(x_i = x_j). \]
(2) For any language L, the class C∞ of infinite L-structures is axiomatised by the theory

T∞ = {ϕ≥n | n ∈ N}.
We will see later that C∞ has no finite axiomatisation in the empty language.
Let Cfin denote the class of finite L-structures. We will see later that Cfin cannot
be axiomatised in any language.
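(3) The class of (symmetric) graphs G = (G, E^G) with no cycles is axiomatised in the
language L = {E} with a single binary relation symbol by
\[ T = \{\varphi\} \cup \{\varphi_n \mid n \in \mathbb{N},\ n \geq 3\}, \]
where $\varphi = \forall x, y\, (E(x, y) \to E(y, x))$ and
\[ \varphi_n = \forall x_0, \ldots, x_n \Big( \big(x_0 = x_n \wedge \bigwedge_{i<j<n} \neg(x_i = x_j)\big) \to \neg \bigwedge_{i<n} E(x_i, x_{i+1}) \Big). \]
Here a cycle consists of n ≥ 3 pairwise distinct vertices $x_0, \ldots, x_{n-1}$; if repeated
vertices or n = 2 were permitted, then in a symmetric graph every single edge would
already count as a cycle $x_0, x_1, x_0$.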
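The sentences ϕ≤n and ϕ≥n from item (1) only speak about the universe, so in a finite structure they can be evaluated by brute force over all tuples. A minimal sketch, where the function names at_most and at_least are hypothetical and the universe is assumed to be a finite nonempty collection:

```python
from itertools import combinations, product

def at_most(universe, n):
    """phi_{<=n}: there are x_0..x_{n-1} such that every y equals some x_i."""
    elems = list(universe)
    return any(all(y in xs for y in elems) for xs in product(elems, repeat=n))

def at_least(universe, n):
    """phi_{>=n}: there are x_0..x_{n-1} that are pairwise distinct."""
    elems = list(universe)
    return any(
        all(xs[i] != xs[j] for i, j in combinations(range(n), 2))
        for xs in product(elems, repeat=n)
    )

M = {"a", "b", "c"}  # a three-element universe
print([n for n in range(1, 6) if at_most(M, n)])   # the n with M in C_{<=n}
print([n for n in range(1, 6) if at_least(M, n)])  # the n with M in C_{>=n}
```

The quantifier prefix ∃x0 … ∃xn−1 becomes a search over n-tuples, which is why the running time grows exponentially in n.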

Lecture 4
21. April
Definition 1.4.6. The theory Th(M) of an L-structure M is defined as the set of L-
sentences ϕ with M |= ϕ.
We already introduced |= for truth of a formula in a model (with an assignment). One
also writes |= for (semantical) implication:
Definition 1.4.7. Suppose that T is an L-theory and ϕ is an L-formula.
T (semantically) implies ϕ, written as T |=L ϕ, if every model of T , with any assign-
ment, is a model of ϕ.
Moreover |=L ϕ means that ϕ is universally valid, or universally true, i.e. ϕ holds in
any L-structure with any assignment.
A first observation is that implication does not depend on the language.
Lemma 1.4.8. Suppose that K ⊆ L are languages, T is a K-theory and ϕ is a K-formula.
Then
T |=K ϕ ⇐⇒ T |=L ϕ.
Proof. For any L-structure M and any assignment ξ for M, we have M |= ϕ[ξ] ⇐⇒
M↾K |= ϕ[ξ], where M↾K denotes the reduct of M to K. This is because only symbols
in K are actually used. It is an easy induction on formulas, similar to Lemma 1.3.4.
We can assume that ϕ is a K-sentence by replacing a formula ϕ with free variables
x0 , . . . , xn by ∀x0 , . . . , xn ϕ.
First suppose that T |=K ϕ. If M is an L-structure with M |= T , then M↾K |= T by
the remark at the beginning of the proof. Hence M↾K |= ϕ and M |= ϕ.
Now suppose that T |=L ϕ. If M is a K-structure with M |= T , take any L-structure
N expanding M, i.e. with arbitrary interpretations of the new symbols. (Here we use
that M ≠ ∅, since all structures are nonempty by definition.) Then N |= T by the above
remark. Hence N |= ϕ and M = N↾K |= ϕ. □
Note that a formula ϕ(x0 , . . . , xn ) is universally valid if and only if its universal closure
∀x0 , . . . , xn ϕ(x0 , . . . , xn ) is universally valid.
To study e.g. the class of all groups, one wants to determine which LG -sentences are
implied by the group axioms. It is useful to study universal truths, since the implication
from ϕ to ψ is equivalent to universal truth of ϕ → ψ.
1.5. Universal truths and the Hilbert calculus. A universal truth is an L-formula
that is true in any L-structure for any assignment of variables. We will collect several
kinds of universal truths and will then build up a proof calculus from them.
It is easy to check that ϕ → ϕ and (ϕ ∧ ψ) ∨ (¬ϕ) ∨ (¬ψ) are universally valid. More
generally, for any Boolean combination of formulas ϕ0 , . . . , ϕn using ∨, ¬, ∧ and →, the
truth of a combination in a model M depends only on the truth values of ϕ0 , . . . , ϕn in
M. (This is easy to see from the definition of M |= ϕ.)
Propositional logic studies this. To define propositional formulas, we fix a countably
infinite set P, i.e. with one element for each natural number, whose elements we call
propositional variables. For example, p → p for p ∈ P is a propositional formula and
ϕ → ϕ is an L-formula obtained by replacing p by ϕ in p → p. A propositional formula
can be understood as a string of symbols, but this does not fit into the framework of
languages studied above, and propositional variables are not logical variables as studied
above.
Definition 1.5.1.
(1) Propositional formulas are formal Boolean combinations of propositional variables
p ∈ P, i.e. they are generated as follows:
(a) Each p ∈ P is a propositional formula.
(b) If p and q are propositional formulas, then (p ∨ q) is a propositional formula.
(c) If p is a propositional formula, then (¬p) is a propositional formula.
(2) A propositional assignment is an arbitrary function µ : P → {0, 1}, where 1 stands
for true and 0 for false. µ can be extended to a function on all propositional formulas
by letting
(a) µ(¬q) = 1 if µ(q) = 0, and µ(¬q) = 0 otherwise;
(b) µ(q ∨ r) = 1 if (µ(q) = 1 or µ(r) = 1), and µ(q ∨ r) = 0 otherwise.
(3) A propositional formula p is called a propositional tautology if µ(p) = 1 holds for all
propositional assignments µ for p.
We also use the abbreviations (p ∧ q) = ¬((¬p) ∨ (¬q)) and (p → q) = ((¬p) ∨ q). Using
(2), one obtains
(a) µ(p ∧ q) = 1 iff (µ(p) = 1 and µ(q) = 1).
(b) µ(p → q) = 1 iff (µ(p) = 0 or µ(q) = 1).
Example 1.5.2. For all propositional variables p and q, the propositional formulas (p →
p) and ((p ∧ (p → q)) → q) are propositional tautologies.
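Since a propositional formula contains only finitely many variables, being a propositional tautology can be decided by checking all assignments to those variables. The following sketch follows Definition 1.5.1; the encoding of formulas as nested tuples and all function names are assumptions made for illustration.

```python
from itertools import product

# Propositional variables are strings; compound formulas are tuples.
def Not(p): return ("not", p)
def Or(p, q): return ("or", p, q)
def And(p, q): return Not(Or(Not(p), Not(q)))   # abbreviation (p ∧ q)
def Implies(p, q): return Or(Not(p), q)         # abbreviation (p → q)

def variables(p):
    """Set of propositional variables occurring in p."""
    if isinstance(p, str):
        return {p}
    return set().union(*(variables(q) for q in p[1:]))

def value(p, mu):
    """Extension of the assignment mu (a dict into {0, 1}) to the formula p."""
    if isinstance(p, str):
        return mu[p]
    if p[0] == "not":
        return 1 - value(p[1], mu)
    return max(value(p[1], mu), value(p[2], mu))  # p = ("or", q, r)

def is_tautology(p):
    """True if p has value 1 under every assignment to its variables."""
    vs = sorted(variables(p))
    return all(value(p, dict(zip(vs, bits))) == 1
               for bits in product((0, 1), repeat=len(vs)))

print(is_tautology(Implies("p", "p")))
print(is_tautology(Implies(And("p", Implies("p", "q")), "q")))
print(is_tautology(Implies("p", "q")))
```

Only the variables occurring in p matter for µ(p), so it suffices to range over the 2^k assignments to those k variables rather than over all functions µ : P → {0, 1}.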
Definition 1.5.3. A tautology is an L-formula that is obtained from a propositional
tautology p by replacing each propositional variable pn in p by an L-formula ϕn .
Lemma 1.5.4. (Tautologies) All tautologies are universally valid.
Proof. Suppose that ϕ is a tautology that arises from a Boolean combination p of propo-
sitional variables p0 , . . . , pn by replacing pi by the L-formula ϕi for all i ≤ n. Suppose
further that M is an L-structure and ξ is an assignment for M.
We consider the truth values of subformulas of ϕ in M for ξ. If we choose the compo-
nents pi as true or false according to these truth values, the truth of subformulas of ϕ in
M for ξ will correspond to the values of the corresponding propositional subformulas of
p. This is because the inductive definition of µ corresponds to the inductive definition of
M |= ϕ.
In more detail, we define µ(pi ) = 1 ⇐⇒ M |= ϕi [ξ] for i ≤ n and let µ(q) be arbitrary
for all other propositional variables q. Using Definitions 1.2.7 and 1.5.1, we see by induc-
tion on Boolean combinations that M |= ϕ[ξ] ⇐⇒ µ(p) = 1. Since p is a propositional
tautology, M |= ϕ[ξ]. 
We will allow tautologies as basic steps in the proof calculus. Note that some authors
fix a finite list of tautologies and derive all other ones from them using a proof calculus,
see for instance [1, Page 11]. But then even proofs of simple statements such as ϕ → ϕ
can be quite complicated, see [1, Page 14].
We next consider universal truths about equality. The next lemma is immediate.
Lemma 1.5.5. (Axioms of equality) The following L-sentences are universally valid.
(1) (Reflexivity) ∀x (x = x)
(2) (Symmetry) ∀x, y (x = y → y = x)
(3) (Transitivity) ∀x, y, z ((x = y ∧ y = z) → x = z)
(4) (Congruence for functions) For all n-ary function symbols f ,
∀x0 , . . . , xn−1 , y0 , . . . , yn−1 ((x0 = y0 ∧ · · · ∧ xn−1 = yn−1 ) → f (x0 , . . . , xn−1 ) = f (y0 , . . . , yn−1 )).
(5) (Congruence for relations) For all n-ary relation symbols R,
∀x0 , . . . , xn−1 , y0 , . . . , yn−1 ((x0 = y0 ∧ · · · ∧ xn−1 = yn−1 ) → (R(x0 , . . . , xn−1 ) ↔ R(y0 , . . . , yn−1 ))).
The next three lemmas collect ways to generate universal truths.
Lemma 1.5.6. (Modus ponens) If ϕ and ϕ → ψ are universally valid formulas, then ψ
is a universally valid formula.
Proof. Suppose that M is an L-structure and ξ is an assignment for M. Then both ϕ and
ϕ → ψ hold in M for ξ. Recall that ϕ → ψ is defined as (¬ϕ) ∨ ψ, so we have M ⊭ ϕ[ξ]
or M |= ψ[ξ]. Since M |= ϕ[ξ], it follows that M |= ψ[ξ]. □
The modus tollens states that if ¬ψ and ϕ → ψ are universally valid, then ¬ϕ is
universally valid. This can be found by using the tautology (ϕ → ψ) −→ (¬ψ → ¬ϕ) and
applying the modus ponens twice.
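Lemma 1.5.7. (∃→ introduction) If ϕ → ψ is a universally valid formula and x is not
free in ψ, then (∃xϕ) → ψ is a universally valid formula.
Proof. If $\mathcal{M} \models (\exists x\, \varphi)[\xi]$, then there is some $a \in M$ with $\mathcal{M} \models \varphi[\xi\frac{a}{x}]$. Since $\varphi \to \psi$ is
universally valid, we have $\mathcal{M} \models \psi[\xi\frac{a}{x}]$, so $\mathcal{M} \models \psi[\xi]$ by Lemma 1.3.4, since $x$ is not free in $\psi$. □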
The next, and final, → ∃-axiom states that certain implications of the form
\[ \varphi\frac{t}{x} \to \exists x\, \varphi \]
are universally valid. Here $\varphi\frac{s}{x}$ means that all free occurrences of the variable $x$ in $\varphi$
are replaced by the term $s$. Although this definition is clear, we give the full recursive
definition in the following, since this definition is used in the next lemmas.
Definition 1.5.8. Suppose that $s, t$ are L-terms and $x$ is a variable. The term $t\frac{s}{x}$ is
defined by replacing all occurrences of $x$ by $s$. More formally, define by induction on $t$:
(1) For $y \in \mathrm{Var}$, $y\frac{s}{x} = s$ if $x = y$, and $y\frac{s}{x} = y$ otherwise.
(2) For constants $c \in L$, $c\frac{s}{x} = c$.
(3) If $f \in L$ is an n-ary function symbol and $t_0, \ldots, t_{n-1}$ are L-terms, then
$f(t_0, \ldots, t_{n-1})\frac{s}{x} = f(t_0\frac{s}{x}, \ldots, t_{n-1}\frac{s}{x})$.
Suppose that $\varphi$ is an L-formula, $x$ is a variable and $s$ is an L-term. The formula $\varphi\frac{s}{x}$ is
defined by replacing all free occurrences of $x$ by $s$. More formally, define by induction on
$\varphi$:
(1) $(u = v)\frac{s}{x} = (u\frac{s}{x} = v\frac{s}{x})$ and $R(t_0, \ldots, t_n)\frac{s}{x} = R(t_0\frac{s}{x}, \ldots, t_n\frac{s}{x})$ for terms $u, v, t_0, \ldots, t_n$.
(2) $(\neg\psi)\frac{s}{x} = (\neg(\psi\frac{s}{x}))$.
(3) $(\psi \wedge \theta)\frac{s}{x} = (\psi\frac{s}{x}) \wedge (\theta\frac{s}{x})$.
(4) $(\exists y\, \psi)\frac{s}{x} = \exists y\, (\psi\frac{s}{x})$ if $x \neq y$, and $(\exists y\, \psi)\frac{s}{x} = \exists y\, \psi$ if $x = y$.
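The recursive clauses of Definition 1.5.8 translate directly into a program. In the following sketch, the encoding of terms and formulas as nested tuples and the names subst_term and subst_formula are assumptions made for illustration; variables are identified by their names.

```python
# Terms:    ("var", name), ("const", name), ("fn", symbol, (t0, ..., tk))
# Formulas: ("eq", u, v), ("rel", symbol, (t0, ..., tk)),
#           ("not", psi), ("and", psi, theta), ("exists", name, psi)

def subst_term(t, x, s):
    """Replace every occurrence of the variable named x in the term t by s."""
    kind = t[0]
    if kind == "var":
        return s if t[1] == x else t
    if kind == "const":
        return t
    # kind == "fn": substitute in each argument
    return ("fn", t[1], tuple(subst_term(a, x, s) for a in t[2]))

def subst_formula(phi, x, s):
    """Replace every free occurrence of the variable named x in phi by s."""
    kind = phi[0]
    if kind == "eq":
        return ("eq", subst_term(phi[1], x, s), subst_term(phi[2], x, s))
    if kind == "rel":
        return ("rel", phi[1], tuple(subst_term(a, x, s) for a in phi[2]))
    if kind == "not":
        return ("not", subst_formula(phi[1], x, s))
    if kind == "and":
        return ("and", subst_formula(phi[1], x, s), subst_formula(phi[2], x, s))
    # kind == "exists": occurrences of x below a quantifier over x are bound
    y, psi = phi[1], phi[2]
    return phi if y == x else ("exists", y, subst_formula(psi, x, s))

x, y, c = ("var", "x"), ("var", "y"), ("const", "c")
# phi is (x = y) and (exists x: S(x) = y); only the first x is free.
phi = ("and", ("eq", x, y), ("exists", "x", ("eq", ("fn", "S", (x,)), y)))
print(subst_formula(phi, "x", c))
```

In the example, the occurrence of x under ∃x is left untouched, in line with clause (4) of the definition.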

The next lemma shows that the interpretation of $t\frac{s}{x}$ can be found by interpreting $t$,
but changing the value at $x$.
Lecture 5
26. April
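Lemma 1.5.9. (Substitution for terms) For any L-terms $s$ and $t$, any variable $x$ and any
assignment $\xi$ for an L-structure $\mathcal{M}$,
\[ \Big(t\frac{s}{x}\Big)^{\mathcal{M},\xi} = t^{\mathcal{M},\,\xi\frac{s^{\mathcal{M},\xi}}{x}}. \]
Proof. By induction on terms. We write $\xi' = \xi\frac{s^{\mathcal{M},\xi}}{x}$.
We have $(x\frac{s}{x})^{\mathcal{M},\xi} = s^{\mathcal{M},\xi} = x^{\mathcal{M},\xi'}$; if $y \neq x$ is a variable, then $(y\frac{s}{x})^{\mathcal{M},\xi} = \xi(y) = y^{\mathcal{M},\xi'}$;
and if $c \in L$ is a constant, then $(c\frac{s}{x})^{\mathcal{M},\xi} = c^{\mathcal{M}} = c^{\mathcal{M},\xi'}$.
Moreover, by the inductive hypothesis,
\[ f(t_0\tfrac{s}{x}, \ldots, t_n\tfrac{s}{x})^{\mathcal{M},\xi} = f^{\mathcal{M}}\big((t_0\tfrac{s}{x})^{\mathcal{M},\xi}, \ldots, (t_n\tfrac{s}{x})^{\mathcal{M},\xi}\big) = f^{\mathcal{M}}\big(t_0^{\mathcal{M},\xi'}, \ldots, t_n^{\mathcal{M},\xi'}\big) = f(t_0, \ldots, t_n)^{\mathcal{M},\xi'}. \]
□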

When we substitute a variable in a formula, in some cases the formula does not have
the intended meaning. This problem is prevented by the next condition.

Definition 1.5.10. The substitution $\varphi\frac{t}{x}$ is allowed if no variables of $t$ are bound in the
places where $x$ is replaced by $t$, i.e. at free occurrences of $x$.
(More formally, one should say that substituting $t$ for $x$ in $\varphi$ is allowed, since the
definition depends on the triple $(\varphi, t, x)$.)
Here is a more detailed, recursive definition: the substitution is allowed if
(1) $\varphi$ is atomic,
(2) $\varphi = (\neg\psi)$ and the substitution $\psi\frac{t}{x}$ is allowed,
(3) $\varphi = (\psi \vee \theta)$ and the substitutions $\psi\frac{t}{x}$ and $\theta\frac{t}{x}$ are allowed,
(4) $\varphi = (\exists y\, \psi)$ and (a) $x = y$ or (b) $x \neq y$, the substitution $\psi\frac{t}{x}$ is allowed and $y$ does
not occur in $t$.⁷

This always holds when t is a constant c. If t is a variable y, then the condition states
that y is not bound in the relevant places. The next lemma shows that in this case,
substitution works well.
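The recursive clauses (1)-(4) of Definition 1.5.10 can be checked mechanically. The sketch below assumes a hypothetical encoding of terms and formulas as nested tuples, with ∨ and ¬ as connectives as in the definition; the names term_vars and allowed are not from the notes.

```python
# Terms:    ("var", name), ("const", name), ("fn", symbol, (t0, ..., tk))
# Formulas: ("eq", u, v), ("rel", symbol, args), ("not", psi),
#           ("or", psi, theta), ("exists", name, psi)

def term_vars(t):
    """Set of names of the variables occurring in the term t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    return set().union(*(term_vars(a) for a in t[2]))  # t = ("fn", f, args)

def allowed(phi, x, t):
    """Decide whether substituting t for the variable named x in phi is allowed."""
    kind = phi[0]
    if kind in ("eq", "rel"):                 # clause (1): atomic formulas
        return True
    if kind == "not":                         # clause (2)
        return allowed(phi[1], x, t)
    if kind == "or":                          # clause (3)
        return allowed(phi[1], x, t) and allowed(phi[2], x, t)
    y, psi = phi[1], phi[2]                   # clause (4): phi = ("exists", y, psi)
    return y == x or (allowed(psi, x, t) and y not in term_vars(t))

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
phi = ("exists", "y", ("not", ("eq", x, y)))  # "some element differs from x"
print(allowed(phi, "x", y))  # y would be captured by the quantifier
print(allowed(phi, "x", z))  # z stays free
```

The example shows why the condition is needed: replacing x by y in ∃y ¬(x = y) would produce ∃y ¬(y = y), which no longer says that some element differs from the value of the substituted term.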

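Lemma 1.5.11. (Substitution for formulas) Suppose that $\varphi$ is an L-formula and $s$ is an
L-term. If the substitution $\varphi\frac{s}{x}$ is allowed, then
\[ \mathcal{M} \models \Big(\varphi\frac{s}{x}\Big)[\xi] \iff \mathcal{M} \models \varphi\Big[\xi\frac{s^{\mathcal{M},\xi}}{x}\Big]. \]
Proof. If $x$ does not occur freely in $\varphi$, then $\varphi\frac{s}{x} = \varphi$ and the claim holds by Lemma 1.3.4.
Suppose that $x$ occurs freely in $\varphi$. The atomic case follows from Lemma 1.5.9, and the
cases $\varphi = (\neg\psi)$ and $\varphi = (\psi \vee \theta)$ are easy.
Suppose that $\varphi = (\exists y\, \psi)$. Since $x$ appears freely in $\varphi$, this implies $x \neq y$. Since the
substitution is allowed, $y$ does not appear in $s$. Then:
\begin{align*}
\mathcal{M} \models (\exists y\, \psi)\tfrac{s}{x}[\xi]
&\iff \mathcal{M} \models \psi\tfrac{s}{x}\big[\xi\tfrac{a}{y}\big] \text{ for some } a \in M \\
&\iff \mathcal{M} \models \psi\Big[\big(\xi\tfrac{a}{y}\big)\tfrac{s^{\mathcal{M},\xi\frac{a}{y}}}{x}\Big] \text{ for some } a \in M && \text{[by the inductive hypothesis]} \\
&\iff \mathcal{M} \models \psi\Big[\big(\xi\tfrac{a}{y}\big)\tfrac{s^{\mathcal{M},\xi}}{x}\Big] \text{ for some } a \in M && \text{[since $y$ does not appear in $s$]} \\
&\iff \mathcal{M} \models \psi\Big[\big(\xi\tfrac{s^{\mathcal{M},\xi}}{x}\big)\tfrac{a}{y}\Big] \text{ for some } a \in M && \text{[since $x \neq y$]} \\
&\iff \mathcal{M} \models (\exists y\, \psi)\Big[\xi\tfrac{s^{\mathcal{M},\xi}}{x}\Big].
\end{align*}
Note that the functions $(\xi\frac{a}{y})\frac{b}{x} : \mathrm{Var} \to M$ and $(\xi\frac{b}{x})\frac{a}{y} : \mathrm{Var} \to M$, where $b = s^{\mathcal{M},\xi}$, are
identical since $x \neq y$. □

⁷ The previous version of this definition, taken from [3, page 14], did not distinguish the cases $x = y$
and $x \neq y$. Thanks to Tom Stalljohann for pointing out this mistake.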
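Lemma 1.5.12. (→ ∃-axiom) Suppose that $\varphi$ is an L-formula, $t$ is an L-term and $x$ is a
variable. If the substitution $\varphi\frac{t}{x}$ is allowed, then the formula
\[ \varphi\frac{t}{x} \to \exists x\, \varphi \]
is universally valid.
Proof. Suppose that $\mathcal{M}$ is an L-structure and $\xi$ is an assignment for $\mathcal{M}$. By Lemma
1.5.11,
\[ \mathcal{M} \models \varphi\frac{t}{x}[\xi] \iff \mathcal{M} \models \varphi\Big[\xi\frac{t^{\mathcal{M},\xi}}{x}\Big] \implies \mathcal{M} \models (\exists x\, \varphi)[\xi]. \]
□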

We now define the Hilbert calculus as the system of formal rules that consists of the
above rules to generate universal truths.8
Definition 1.5.13. An L-formula ϕ is called L-provable (in the Hilbert calculus) in each
of the following cases:
(1) ϕ is an equality axiom
(2) ϕ is a tautology
(3) ϕ is an → ∃-axiom
(4) ϕ is generated from two L-provable L-formulas using the modus ponens
(5) ϕ is generated from an L-provable L-formula using the ∃→ -rule.
A formal L-proof of ϕ is a list of L-formulas, each of which is L-provable from the
previous formulas in the list, ending with ϕ. We write ⊢L ϕ if such a proof exists.
Suppose that T is a set of L-formulas. A formal L-proof of ϕ from T is an L-proof of
(ψ0 ∧ · · · ∧ ψn ) → ϕ for some ψ0 , . . . , ψn ∈ T . We write T ⊢L ϕ if such a proof exists.
To clarify, an L-proof is always a list of formulas given by the rules of the calculus:
ψ0
ψ1
...
ψn
Here each ψj is either an axiom of the Hilbert calculus, or can be derived from the
previous ψi using the modus ponens or the ∃→ -rule.

Hilbert calculus:
Axioms of the calculus:
(1) Axioms of equality are L-provable:
(a) (Reflexivity) ∀x (x = x)
(b) (Symmetry) ∀x, y (x = y → y = x)
(c) (Transitivity) ∀x, y, z ((x = y ∧ y = z) → x = z)
(d) (Congruence for functions) For all n-ary function symbols f ,
∀x0 , . . . , xn−1 , y0 , . . . , yn−1 ((x0 = y0 ∧ · · · ∧ xn−1 = yn−1 ) → f (x0 , . . . , xn−1 ) = f (y0 , . . . , yn−1 )).
(e) (Congruence for relations) For all n-ary relation symbols R,
∀x0 , . . . , xn−1 , y0 , . . . , yn−1 ((x0 = y0 ∧ · · · ∧ xn−1 = yn−1 ) → (R(x0 , . . . , xn−1 ) ↔ R(y0 , . . . , yn−1 ))).
(2) All tautologies are L-provable.
(3) (→ ∃-axiom) Suppose that $\varphi$ is an L-formula, $t$ is an L-term and $x$ is a variable. If
the substitution $\varphi\frac{t}{x}$ is allowed, then the formula
\[ \varphi\frac{t}{x} \to \exists x\, \varphi \]
is L-provable.
Rules of the calculus:
(1) (Modus ponens) If ϕ and ϕ → ψ are L-provable formulas, then ψ is L-provable.
(2) (∃→ introduction) If ϕ → ψ is an L-provable formula and x is not free in ψ, then
(∃xϕ) → ψ is L-provable.

⁸ This calculus is used in [3] and many other books on mathematical logic.
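To illustrate what it means for a list of formulas to be a formal proof, here is a deliberately simplified checker. It is only a sketch under strong assumptions: it handles the modus ponens rule alone, leaves out the ∃→ rule, and does not recognise axioms itself but relies on a caller-supplied predicate is_axiom. The encoding of an implication as ("->", phi, psi) and the name check_proof are hypothetical.

```python
def check_proof(steps, is_axiom):
    """Check a list of proof steps and return the list of derived formulas.

    Each step is ("axiom", formula) or ("mp", i, j), where the earlier lines
    i and j hold phi and ("->", phi, psi), respectively.
    Raises ValueError at the first step that is not justified.
    """
    derived = []
    for n, step in enumerate(steps):
        if step[0] == "axiom":
            if not is_axiom(step[1]):
                raise ValueError(f"line {n}: not an axiom")
            derived.append(step[1])
        elif step[0] == "mp":
            _, i, j = step
            if not (0 <= i < n and 0 <= j < n):
                raise ValueError(f"line {n}: modus ponens must cite earlier lines")
            phi, imp = derived[i], derived[j]
            is_implication = isinstance(imp, tuple) and len(imp) == 3 and imp[0] == "->"
            if not (is_implication and imp[1] == phi):
                raise ValueError(f"line {n}: modus ponens does not apply")
            derived.append(imp[2])
        else:
            raise ValueError(f"line {n}: unknown rule {step[0]!r}")
    return derived

# Toy example: treat A, A -> B and B -> C as given axioms and derive C.
toy_axioms = {"A", ("->", "A", "B"), ("->", "B", "C")}
proof = [
    ("axiom", "A"),
    ("axiom", ("->", "A", "B")),
    ("mp", 0, 1),                    # yields B
    ("axiom", ("->", "B", "C")),
    ("mp", 2, 3),                    # yields C
]
print(check_proof(proof, toy_axioms.__contains__)[-1])
```

A checker for the full calculus would additionally have to recognise equality axioms, tautologies and → ∃-axioms, and to verify the side condition on free variables in the ∃→ rule.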

Note that a proof of ϕ from a set T of formulas is defined as a proof of (ψ0 ∧ · · · ∧ ψn ) → ϕ
with ψ0 , . . . , ψn ∈ T .
One could naively think that in a proof from T , one can use formulas in T within the
list of formulas, just like the axioms of the calculus. But this is not allowed, and in fact
the ∃→ -introduction would lead to problems.
The advantage of this calculus (and its variants) compared to e.g. the sequent calculus
is that the rules are short and simple. But in any formal proof calculus, writing down
actual formal proofs can be complicated and may involve many steps.
It will follow from the compactness theorem that ⊢L is equivalent for different
languages L (as long as the relevant formula ϕ is an L-formula), so we will later write ⊢
instead of ⊢L .
Let ⊤ = (ϕ0 ∨ (¬ϕ0 )) (true) for a fixed formula ϕ0 without free variables and ⊥ := (¬⊤)
(false).
Definition 1.5.14. An L-theory T is called
(1) (syntactically) L-consistent if T ⊬L ⊥, i.e. one cannot prove a contradiction from T .
(2) (syntactically) L-complete if for every L-sentence ϕ, T ⊢L ϕ or T ⊢L ¬ϕ.
Proposition 1.5.15. (Compactness for ⊢) An L-theory T is L-consistent if every finite
subset of T is L-consistent.
Proof. By definition of ⊢L . □

Syntactic-semantic duality

                             Syntactic (proof theoretic)        Semantic (model theoretic)
Implication                  T ⊢ ϕ                              T ⊨ ϕ
Consistency/Satisfiability   T ⊬ ⊥                              T ⊭ ⊥, i.e. T has a model
Completeness                 For all ϕ, T ⊢ ϕ or T ⊢ ¬ϕ         For all ϕ, T ⊨ ϕ or T ⊨ ¬ϕ
Compactness                  If T ⊢ ϕ, then there is a finite   If T ⊨ ϕ, then there is a finite
                             T0 ⊆ T with T0 ⊢ ϕ                 T0 ⊆ T with T0 ⊨ ϕ

We will see in chapter 3 that the Hilbert calculus is complete, i.e. it can prove anything
that can be proved by any other means. Moreover, ⊢ and ⊨ are equivalent. This will show
that the left and right side in each box are equivalent.
We next give some examples of how to construct formal proofs.
Example 1.5.16.
(1) (∀→ -axiom) Suppose that $\varphi$ is an L-formula, $t$ is an L-term and $x$ is a variable. If
the substitution $\varphi\frac{t}{x}$ is allowed, then the formula
\[ \forall x\, \varphi \to \varphi\frac{t}{x} \]
is provable.
Proof. $\neg\varphi\frac{t}{x} \to \exists x\, \neg\varphi$ is an → ∃-axiom. Note that
\[ \Big(\neg\varphi\frac{t}{x} \to \exists x\, (\neg\varphi)\Big) \longleftrightarrow \Big(\forall x\, \varphi \to \varphi\frac{t}{x}\Big) \]
is a tautology obtained from the propositional tautology (¬p → q) ↔ (¬q → p).
(Recall that ∀x ϕ is an abbreviation for ¬∃x (¬ϕ).) Modus Ponens yields the required
formula. □

(2) (→ ∀-introduction) If ϕ → ψ is provable and x is not free in ϕ, then ϕ → ∀xψ is
provable.
Note that in the special case ϕ = (θ → θ) (or any other tautology), ϕ → ψ is provable
if and only if ψ is provable. We thus obtain the following special case of → ∀-
introduction (for any variable x):
If ψ is provable, then ∀xψ is provable.
Proof. Note that ¬ψ → ¬ϕ is provable, since (ϕ → ψ) ↔ (¬ψ → ¬ϕ) is a tautology.
Then (∃x¬ψ) → ¬ϕ is provable by ∃→ introduction, since x is not free in ¬ϕ. Using
the tautology
((∃x¬ψ) → ¬ϕ) −→ (ϕ → ∀xψ),
obtained from the propositional tautology (p → ¬q) → (q → ¬p), Modus Ponens
yields ϕ → ∀xψ. □
Recall that L-provability of ψ from T means that ⊢L (ϕ0 ∧ · · · ∧ ϕn ) → ψ for some
ϕ0 , . . . , ϕn ∈ T . If T is a theory, then ϕ0 , . . . , ϕn do not contain free variables. Hence
by the ∀→ -axiom and → ∀-introduction, T ⊢L ψ is equivalent to T ⊢L ∀x0 , . . . , xk ψ,
if x0 , . . . , xk are the free variables of ψ.
Lecture 6
28. April
(3) Next is a simple example of the → ∃-axiom: to prove an existential formula, one
provides a term witnessing it.
Suppose that L contains a unary function symbol S. Then ∀x ∃y (S(x) = y) is
provable.
Proof. ∀x (x = x) is an equality axiom. By the ∀→ -axiom for the term S(x) and Modus
Ponens, S(x) = S(x) is provable. The → ∃-axiom yields S(x) = S(x) → ∃y (S(x) = y).
By Modus Ponens, ∃y (S(x) = y) is provable as well. By → ∀-introduction,
∀x ∃y (S(x) = y) is provable.
More precisely, we here use a consequence of → ∀-introduction (⊢L ϕ implies ⊢L
∀xϕ) that is shown on exercise sheet 3. □

(4) In the next example, one needs to remove the quantifier ∀ before applying tautologies.
The formula (∀x (ϕ ∧ ψ)) → (∀x ϕ) is provable.
Note that it follows by tautologies that (∃x ϕ) → (∃x (ϕ ∨ ψ)) is provable.
Proof. Note that (∀x (ϕ ∧ ψ)) → (ϕ ∧ ψ) is a ∀→ -axiom and (ϕ ∧ ψ) → ϕ is
a tautology. By tautologies, (∀x (ϕ ∧ ψ)) → ϕ is provable. By → ∀-introduction,
(∀x (ϕ ∧ ψ)) → ∀x ϕ is provable, since x is not free in ∀x (ϕ ∧ ψ). □
(5) In the next example, one has to work backwards to construct a proof. Again we
leave out several steps using tautologies.
The formula (∀x (ϕ → ψ)) → (∃x ϕ → ∃x ψ) is provable.
Proof. By tautologies and ∃→ -introduction, it suffices to show that
ϕ → (∀x (ϕ → ψ) → ∃x ψ)
is provable. Again by tautologies,
(ϕ ∧ ∀x (ϕ → ψ)) → ∃x ψ
suffices. Note that ∀x (ϕ → ψ) → (ϕ → ψ) is a ∀→ -axiom and ψ → ∃x ψ is
an → ∃-axiom. Tautologies yield the claim. □
The next lemma shows that the role of free variables in a provable formula is the same
as the role of new constants in an extended language. This will be used in chapter 3 in
the proof of the completeness of Hilbert’s calculus.
Lemma 1.5.17. Suppose that $\varphi$ is an L-formula, $x_0, \ldots, x_n$ are (among the) free variables
in $\varphi$, $C$ is a set of new constants and $c_0, \ldots, c_n \in C$ are distinct. Then
\[ \vdash_{L \cup C} \varphi\Big(\frac{c_0}{x_0}, \ldots, \frac{c_n}{x_n}\Big) \iff \vdash_L \varphi. \]
Proof. Suppose that $P$⁹ is an $L \cup C$-proof of $\varphi(\frac{c_0}{x_0}, \ldots, \frac{c_n}{x_n})$, and that $c_0, \ldots, c_k$ with $k \geq n$
are distinct constants in $C$ such that every constant from $C$ occurring in $P$ is among them.
We choose new variables $y_0, \ldots, y_k$ that do not appear in the proof. By
replacing $c_i$ by $y_i$ everywhere in $P$, we obtain an L-proof $P(\frac{y_0}{c_0}, \ldots, \frac{y_k}{c_k})$¹⁰ of $\varphi(\frac{y_0}{x_0}, \ldots, \frac{y_n}{x_n})$.
(One can easily check that each axiom and rule remains valid.) → ∀-introduction yields a
proof of $\forall y_0, \ldots, y_n\, \varphi(\frac{y_0}{x_0}, \ldots, \frac{y_n}{x_n})$. By the ∀→ -axiom, $\forall y_0, \ldots, y_n\, \varphi(\frac{y_0}{x_0}, \ldots, \frac{y_n}{x_n}) \to \varphi$ is provable (the
$x_i$ are not free in the formula on the left). By Modus Ponens, we obtain $\vdash_L \varphi$.
Conversely, suppose that $\vdash_L \varphi$ holds. By → ∀-introduction, we have $\vdash_L \forall x_0, \ldots, x_n\, \varphi$.
By the ∀→ -axiom and Modus Ponens, $\vdash_{L \cup C} \varphi(\frac{c_0}{x_0}, \ldots, \frac{c_n}{x_n})$. □

Note that the special case of the previous lemma where one chooses no variables at all
shows that ⊢L ϕ ⇐⇒ ⊢L∪C ϕ holds for any L-formula ϕ and any set C of new constants. Hence
the meaning of ⊢L does not change when L is enriched by constants. We will later see that
this is also true for relation and function symbols.
Ending this chapter, we have a look at Hilbert’s program, as paraphrased in [Kossak:
Mathematical Logic (2018), page 180]:
(1) “Define a system based on a formal language in which all mathematical statements
can be expressed, and in which proofs of theorems can be carried out according to
well-defined, strict rules of proof.
(2) Show that the system is complete, i.e. all true mathematical statements can be proved
in the formalism.
(3) Show that the system is consistent, i.e. it is not possible to derive a statement and
its negation. The consistency should be carried out using finitistic means without
appeal to the notion of actual infinity.
(4) Show that the system is conservative, i.e. if a statement about concrete objects of
mathematics, such as natural numbers or geometric figures, has a proof involving
infinitistic methods, then it also has an elementary proof in which those methods
are not used.
(5) Show that the system is decidable by finding an algorithm for deciding the truth or
falsity of any mathematical statement.”

⁹ A proof is a finite sequence of formulas.
¹⁰ Up to now, we only defined substitution of variables by terms in formulas. If we were more precise
here, we would define substitution of constants by variables in formulas, and thus in proofs, in precisely
the same way.
We completed (1) in this chapter, and (2) is the completeness of Hilbert’s calculus
proved in chapter 3.
But the other items cannot be realised: (5) is false by (the proof of) Gödel’s first
incompleteness theorem; (3) and (4) are false by Gödel’s second incompleteness theorem.
The failure of (4) also follows from the unprovability in PA of the convergence of Goodstein
sequences.
2. Sets and Axioms


Lecture 7
03. May
In this section, we study the framework of set theory.
We first study wellorders and ordinals informally. We then introduce the axioms of set
theory and see how they are used to introduce ordinals and cardinals. We prove transfinite
induction and recursion.
Up to now, we have worked informally in these settings:
• The world of (hereditarily) finite sets, i.e. finite sets whose elements, elements
of elements etc. are finite. A word, i.e. a finite sequence of symbols, is again
(hereditarily) finite, since we can assume that each symbol in our language is a
(hereditarily) finite set.
We have not yet said precisely which axioms we use in this setting. We will see
that finite set theory, i.e. a version of the axioms of set theory without the axiom
of infinity, suffices.
In particular, we have done proofs by induction on the length of terms, or on
the partial ordering s is a subterm of t. These are special cases of induction along
wellfounded relations, which we study in this section.
• Arithmetic, i.e. PA (in exercise problems). In the set theoretic definition of natural
numbers, any natural number is a (hereditarily) finite set, and PA can be proved
from finite set theory.
• Set theory. We worked with infinite structures and need set theory to formalise
those statements.
Set theory forms a basis for all of mathematics. Why do we need such a foundation? We
want to have a formal system in which all common mathematical proofs can be carried
out and in which one can construct all mathematical objects, for instance the real and
complex fields, function spaces, categories etc.
Some fields in mathematics have their own axiomatisations. For example, think of the
theories of groups or fields. But to study these structures, one often goes beyond them,
for example one studies fields with the help of their automorphism groups. Set theory is
a unified framework in which all mathematical constructions can be done.
Example 2.0.1. This example explains how to construct the real ordered field. We will
construct the structure (N, 0, 1, +, ·, <) of the natural numbers below. From this, one can
easily construct the ordered field (Q, 0, 1, +, ·, <).
A cut in (Q, 0, 1, +, ·, <) is a nonempty downwards closed subset with respect to < with
an upper bound, but no maximum. (R, 0, 1, +, ·, <) is the set of cuts with pointwise operations
induced by those of Q.¹¹ So the reals are constructed via the power set of Q.
It is not hard to show that (R, 0, 1, +, ·, <) is a complete ordered field, i.e. every nonempty
subset that is bounded above has a supremum. In analysis, one shows that there is a unique
complete ordered field up to isomorphism.
A central question in set theory is about size, the most basic property of mathematical
objects. Why is this useful? For example, an obvious way to show that a structure G
does not embed into a structure H is to show that G is strictly larger than H. Here is an
example where it is not immediately obvious how two sets compare in size:
Example 2.0.2. Consider the set L of linear orders on N up to bi-embeddability ∼,
where (N, <0 ) ∼ (N, <1 ) if (N, <0 ) embeds into (N, <1 ) and conversely.12 We want to
compare L/∼ with R. How do the sizes |L/∼| and |R| of these sets compare?

¹¹ Multiplication is first defined pointwise on R≥0 and then extended to R by cases for positive and
negative numbers.
¹² This is strictly weaker than isomorphism.
It is not hard to show that |L/∼| = ℵ1 , where ℵ1 denotes the first uncountable cardinal.
Thus |L/∼| ≤ |R|. The axiom of choice is relevant: without it, one cannot show that these
sets are comparable in size, i.e. that there exists an injection L/∼ → R or an injection
R → L/∼.
Whether |L/∼| = |R| cannot be decided on the basis of the axioms of set theory. Thus
it is in general a highly nontrivial question to determine the size of a set.
The aim of this chapter is an introduction to set theory up to cardinals. The basic
notion for comparing the cardinality of two sets is wellordering. We will therefore spend
most of our efforts to study wellorders and recursion along wellorderings.
2.1. Wellfounded relations and wellorders. This section is an introduction to wellorders.
They are used to count past the natural numbers.
We work informally as in usual mathematical proofs. This gives some of the flavour of
working with ordinals and cardinals.
As any mathematical proof, these arguments can be fully formalised within the axiom
system of set theory. We do this in the next section by developing basic results in set
theory directly from the axioms.
One could develop most of the basic results of set theory informally.
Suppose that A is a set (later, classes will be allowed) and (A, ≤) is a linear order. We
will write a < b for (a ≤ b ∧ a ≠ b). We call (A, <) a strict linear order.
Definition 2.1.1.
(1) A binary relation < on a set A is called wellfounded if every nonempty subset of A
has an <-minimal element.
(2) A wellorder (A, <) is a wellfounded strict linear order.
Only in this chapter, we also allow the empty set with a relation as a structure. In
particular, the empty set with the empty relation is a wellorder. It is called 0.
The usual order of the natural numbers is an example of a wellorder. Its order type¹³
is denoted by ω. Going beyond this, we obtain ω + 1 by adding an element on top of ω,
then ω + 2, ω + 3, . . . , ω + ω = ω · 2, . . . , ω², ω³, . . . , ω^ω etc.
Any wellfounded relation (A, <) satisfies the following induction principle for all prop-
erties P :
Suppose that for all b ∈ A, if P (a) holds for all a < b then P (b) holds.
Then P (a) holds for all a ∈ A.
To see this, note that if P (a) failed for some a ∈ A, then there would be some a ∈ A
that is <-minimal among the elements for which P fails. This is because we assumed that
(A, <) is wellfounded. Thus P (b) holds for all b < a. However, then P (a) holds by the
assumption on P , a contradiction. We will formally prove this below. P will be given by
a first-order formula in the language of set theory.
The axiom of choice is not needed to prove the below results about wellorders. However,
the next lemma clarifies the concept of wellorder with the use of the axiom of choice. It
is not needed below.
Lemma 2.1.2. Suppose that < is a binary relation on a set A. The following conditions
are equivalent:
(1) (A, <) is wellfounded.
(2) There is no strictly decreasing infinite sequence a0 > a1 > . . . in A.
Proof. (1)⇒(2): Towards a contradiction, suppose that $\langle a_n \mid n \in \mathbb{N} \rangle$ is a strictly decreasing
sequence in A. Then $\{a_n \mid n \in \mathbb{N}\}$ has no <-minimal element.
(2)⇒(1): Towards a contradiction, suppose that (A, <) is not wellfounded. Then there is
a nonempty subset B of A that contains no <-minimal element. We can therefore construct
a strictly decreasing sequence $\langle a_n \mid n \in \mathbb{N} \rangle$ in B. Let $a_0 \in B$ be arbitrary. Choose
an arbitrary $a_{n+1} \in B$ with $a_{n+1} < a_n$ in step n + 1; this is possible since $a_n$ is not
<-minimal in B. More precisely, this argument uses the axiom of
choice, which we will introduce below. One can find $a_{n+1}$ by using a choice function that
sends every nonempty subset of A to one of its elements. □

¹³ By the order type, we mean the isomorphism class. The right definition of ω can only be introduced
in the next section.
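For a finite set A, both conditions of Lemma 2.1.2 can be tested directly, and the axiom of choice plays no role. In the sketch below, a relation is assumed to be given as a set of pairs (a, b) meaning a < b, and the function names are hypothetical. On a finite set, an infinite strictly decreasing sequence must repeat an element, so it exists exactly when some element lies on a <-cycle.

```python
from itertools import combinations

def minimal_elements(subset, less):
    """Elements b of subset such that no a in subset satisfies a < b."""
    return [b for b in subset if not any((a, b) in less for a in subset)]

def wellfounded_by_subsets(A, less):
    """Condition (1): every nonempty subset of A has a <-minimal element."""
    elems = list(A)
    return all(
        minimal_elements(sub, less)
        for r in range(1, len(elems) + 1)
        for sub in combinations(elems, r)
    )

def has_descending_sequence(A, less):
    """Negation of condition (2) for finite A: some element lies on a <-cycle."""
    closure = set(less)          # becomes the transitive closure of less
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return any((a, a) in closure for a in A)

usual = {(a, b) for a in range(4) for b in range(4) if a < b}
cycle = {(0, 1), (1, 2), (2, 0)}
print(wellfounded_by_subsets(range(4), usual), has_descending_sequence(range(4), usual))
print(wellfounded_by_subsets(range(3), cycle), has_descending_sequence(range(3), cycle))
```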
We show next that any two wellorders are comparable. We define an initial segment
of a wellorder (A, <) to be either of the form (A, <) or $(A_{<b}, <{\upharpoonright}A_{<b})$, where $A_{<b} = \{a \in
A \mid a < b\}$ for some b ∈ A. We will always use the notation $\mathcal{A} = (A, <_A)$, $\mathcal{B} = (B, <_B)$
and $\mathcal{C} = (C, <_C)$ for wellorders. We will write $\mathcal{A} < \mathcal{B}$ if $\mathcal{A}$ is isomorphic to a proper initial
segment of $\mathcal{B}$.
The proof of the next lemma is an example of the following recursion principle for
wellorders $\mathcal{A} = (A, <_A)$. Suppose that G is a function such that G(f ) is defined for any
partial function f from A to a set B, and G(f ) ∈ B for all such functions f .
There is a unique function F : A → B such that for all a ∈ A, $F(a) =
G(F{\upharpoonright}A_{<a})$.
A proof of this recursion principle will be added here. In the proof of the next lemma,
the recursion is proved ad hoc.
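For a finite wellorder, the recursion principle amounts to a simple loop: F is built from the bottom up, and at each element a the function G is applied to the part of F defined so far, which is the restriction of F to the predecessors of a. In the following sketch, the wellorder is assumed to be given as a list in increasing order, partial functions are represented as dictionaries, and the name recursion is hypothetical.

```python
def recursion(A, G):
    """Return the unique F with F(a) = G(F restricted to the predecessors of a).

    A: the elements of a finite wellorder, listed in increasing order.
    G: a function taking a dict (a partial function on A) and returning a value.
    """
    F = {}
    for a in A:
        # At this point F is defined exactly on the predecessors of a.
        F[a] = G(dict(F))
    return F

# G(f) = 1 + (sum of the values of f) yields the powers of two.
print(recursion(range(6), lambda f: 1 + sum(f.values())))
# G(f) = number of predecessors yields the position of each element.
print(recursion(["a", "b", "c"], len))
```

Uniqueness is visible in the loop: the value F(a) is determined by the values at the predecessors of a, which is the inductive argument behind the general principle.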
Assuming the wellordering principle, i.e. the statement that every set can be wellordered,
the next lemma shows that any two sets can be compared in size. The wellordering
principle follows from the axioms of set theory.
Lemma 2.1.3. For wellorders $\mathcal{A}$ and $\mathcal{B}$, exactly one of the following holds:
(1) $\mathcal{A} \cong \mathcal{B}$.
(2) $\mathcal{A} < \mathcal{B}$.
(3) $\mathcal{B} < \mathcal{A}$.
Moreover, in each case the isomorphism is unique.¹⁴
Proof. By induction on a ∈ A (by the induction principle above), we can assume that
the statement of the lemma, including the uniqueness claim, already holds for all proper
initial segments of A.
We now prove that one of (1), (2) or (3) holds for A.
First assume that for every a ∈ A, there is an isomorphism fa from A<a to an initial
segment of B. By uniqueness of the fa for a ∈ A, we have fb A<a = fa for all a < b in
A, i.e. fb extends fa . Then the union of the functions fa for all a ∈ A is an isomorphism
from A to an initial segment of B. Hence (1) or (2) holds.
Now assume that for some a ∈ A, there is no isomorphism fa from A<a to an initial
segment of B. We can assume that a is the <-least such element of A.
First assume that A<a has a largest element a′. By our assumption, there is an
isomorphism f from A<a′ to an initial segment of B. If its range is B, then (3) holds. Otherwise,
its range is B<b′ for some b′ ∈ B. We can extend f to an isomorphism f′ from A<a to
an initial segment of B by defining f′(a′) = b′. But we assumed that such a map does not
exist.
Now assume that A<a has no largest element. As at the beginning of the proof that
one of (1), (2) or (3) holds, we can use the uniqueness of the isomorphisms fa′ from A<a′ to an
initial segment of B for a′ < a to see that their union is an isomorphism from A<a to an
initial segment of B, contradicting the assumption on a.
It remains to prove uniqueness. Towards a contradiction, suppose that f ≠ g are
isomorphisms as in (1) or (2); the proof for (3) is similar. It is easy to see that f cannot
extend g or conversely. Hence there is some a ∈ A with f(a) ≠ g(a). Suppose that a
14We mean the isomorphism from A to a proper initial segment of B in (2), and similarly in (3).
is <A-least with f(a) ≠ g(a). But f and g are both isomorphisms to initial segments
of B, so f(a) and g(a) both equal the <B-least element of B strictly above the range
ran(f↾A<a) = ran(g↾A<a). □
We will now see that arithmetic can be extended to the ordinals. The operations
are defined by glueing wellorders together.
When we work with wellorders A < B, we will assume that A is an initial segment of
B. We thus omit the isomorphisms in the following arguments. This can be justified by
checking that the arguments still work with the necessary isomorphisms, or alternatively
by noting that this property is actually true for ordinals introduced later in this chapter.
Now let (Ai, <Ai) be wellorders for i ∈ I, where I is a set. By our assumption, each
(Ai, <Ai) is an initial segment of (Aj, <Aj) or conversely, so we can form their union,
denoted sup_{i∈I} (Ai, <Ai) = (⋃_{i∈I} Ai, ⋃_{i∈I} <Ai). It is easy to check that the isomorphism
type of sup_{i∈I} (Ai, <Ai) depends only on the isomorphism types of (Ai, <Ai) for i ∈ I.
We say that a wellorder (A, <A ) has successor length if it has a largest element, and
limit length otherwise. If (A, <A ) has limit length, then it is the union of its proper initial
segments.
Definition 2.1.4. (Addition of wellorders) Suppose that A and B are wellorders. We
assume that A and B are disjoint by replacing B by an isomorphic copy, if necessary. Let
A + B = (A ∪ B, <), where a < b if
• a, b ∈ A and a <A b,
• a, b ∈ B and a <B b, or
• a ∈ A and b ∈ B.
Thus B is glued on top of A. Note that addition is not commutative. What is 3 + ω?
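The question can be explored with a minimal Python sketch. It assumes that we only look at wellorders consisting of m copies of ω followed by n further points, encoded as a pair (m, n); this encoding and the helper name add are illustrative assumptions and not part of the notes.

```python
# Illustrative encoding (an assumption): the pair (m, n) stands for the
# wellorder consisting of m copies of omega followed by n further points.

def add(a, b):
    """Order type of a + b, i.e. b glued on top of a."""
    (m1, n1), (m2, n2) = a, b
    if m2 == 0:
        # b is finite: its points simply extend the finite tail of a.
        return (m1, n1 + n2)
    # b starts with a copy of omega, which absorbs the finite tail of a.
    return (m1 + m2, n2)

THREE = (0, 3)
OMEGA = (1, 0)

print(add(THREE, OMEGA))  # (1, 0): 3 + omega is isomorphic to omega
print(add(OMEGA, THREE))  # (1, 3): omega + 3 has a largest element
```

In this encoding the two sums differ, which illustrates that addition is not commutative.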
Lemma 2.1.5. Let A, B and C denote wellorders.
(1) A + 0 ≅ A.
(2) A + [B + C] ≅ [A + B] + C.
(3) If C has limit length, then A + C ≅ sup{A + B | B < C}.
Proof. These claims are easy to check directly from the definitions. (3) holds since the
supremum is defined as the union. 
Lecture 8
05. May
Definition 2.1.6. (Multiplication of wellorders) Suppose that A and B are wellorders.
Let A · B = (A × B, <r-lex), where (a, b) <r-lex (a′, b′) is the right-lexicographical order
defined by
• b <B b′, or
• b = b′ and a <A a′.
Thus A · B glues B many copies of A together. For example, ω · 3 = ω + ω + ω. Note
that multiplication is not commutative. What is 3 · ω?
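For finite wellorders, the right-lexicographical order can be listed explicitly. The following Python sketch assumes that a finite wellorder is given as a list of distinct elements in increasing order; the function name times and the sample elements are hypothetical.

```python
from itertools import product

def times(A, B):
    """List the elements of A · B in increasing right-lexicographical order.

    A and B are lists that enumerate two finite wellorders in increasing
    order; their elements are assumed to be pairwise distinct.
    """
    # Compare the B-coordinate first, then the A-coordinate.
    def key(pair):
        a, b = pair
        return (B.index(b), A.index(a))
    return sorted(product(A, B), key=key)

A = ["a0", "a1"]
B = ["b0", "b1", "b2"]
for element in times(A, B):
    print(element)
```

The listing runs through one copy of A for each element of B, in the order of B.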
Let 1 denote a wellorder with a single element.
Lemma 2.1.7. Let A, B and C denote wellorders.
(1) A · 1 ≅ A.
(2) A · [B · C] ≅ [A · B] · C.
(3) A · [B + C] ≅ [A · B] + [A · C].
(4) If C has limit length, then A · C ≅ sup{A · B | B < C}.
Proof. These claims can be proved directly from the definitions and are left as exercises. □
One can further define exponentiation of wellorders as follows.
Definition 2.1.8. (Exponentiation of wellorders) Suppose that A and B are wellorders.
Let A^B = (A^(B), <r-lex), where A^(B) denotes the set of finite partial functions f : B ⇀ A
and f <r-lex g is the right-lexicographical order defined by f(b) < g(b) or f(b) is not
defined, where b ∈ B is <B-greatest such that either f(b) ≠ g(b) or one of them is defined
and the other one is not.
Exponentiation has the expected properties, see e.g. [2, Exercises 3.10 & 3.11].
We will get back to ordinal arithmetic later when we discuss ordinals.
2.2. Axioms of set theory. Before introducing the axioms, we begin with an
example to illustrate how any mathematical object can be constructed as a set.
Example 2.2.1.
(1) A natural number is of the form 0 := ∅, or n + 1 := n ∪ {n} for a natural number
n.¹⁵ Thus n is itself a set with n elements.
(2) A rational number q ∈ Q is an equivalence class of triples (m, n, k)¹⁶ of natural
numbers, where (m, n, k) represents (m − n)/(k + 1).
(3) A real number r ∈ R is a nonempty set of rational numbers that is bounded from
above, downwards closed (if q ∈ r and p ≤ q, then p ∈ r) and has no maximal
element.
(4) A function f : R → R is a subset f of R2 such that for each r ∈ R, there is a unique
s ∈ R with (r, s) ∈ f .
In (3), we used the power set of the rationals Q, i.e. the set of subsets of Q. In (4), we
used subsets of the reals.
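Item (1) can be tried out directly. The following is a minimal Python sketch, assuming that hereditarily finite sets are modelled by frozenset; the helper names successor and natural are illustrative.

```python
def successor(n):
    """Return n + 1 = n ∪ {n} for a set n given as a frozenset."""
    return n | frozenset([n])

def natural(k):
    """Build the set that represents the natural number k."""
    n = frozenset()  # 0 is the empty set
    for _ in range(k):
        n = successor(n)
    return n

three = natural(3)
print(len(three))           # 3: the set 3 has three elements
print(natural(2) in three)  # True: 2 is an element of 3
print(natural(2) < three)   # True: 2 is also a proper subset of 3
```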
Intuitively, set formation is described as follows:
Georg Cantor:
"By a 'set' we understand any collection M of definite, well-distinguished
objects of our intuition or our thought (which are called the 'elements'
of M) into a whole."
Felix Hausdorff:
"A set is a collection of things into a whole, i.e. into a new thing."
Can one form the set of objects with any given property (unrestricted separation)?
Several paradoxes related to this emerged in work of Cantor, Burali-Forti and Russell,
but it seems that they were not seen as paradoxes at that time. My impression from the
literature is that unrestricted separation was never considered as an axiom scheme, but
only emerged in the logical study of the foundations of set theory in Russell’s work. When
fixing the axioms in a logical setting, one has to make precise what kind of separation is
allowed.
It was known to Cantor that some collections of sets are themselves not sets, for example
Cantor’s paradox states that the collection of all cardinal numbers does not form a set.
Burali-Forti’s paradox states that the collection of ordinals does not form a set. The
difference between these and Russell’s observation that the unrestricted principle of set
15This is an informal definition that will be made precise later. The set of natural numbers will be
defined as the smallest set that is closed under the function n 7→ n ∪ {n}.
16Ordered pairs and tuples are defined below.
formation is contradictory is that Russell’s paradox is purely logical and does not assume
any other axioms.
Remark 2.2.2. (Russell’s Paradox) Assume that there exists a set x that contains
precisely those sets y with y ∉ y as elements. Then x ∈ x holds if and only if x ∉ x.
Therefore, no such set can exist.
Actually, we will see that the axiom of foundation prohibits the existence of sets y
with y ∈ y. Assuming the axiom of foundation, Russell’s paradox says: there is no set
of all sets, since it would contain itself as an element. However, Russell’s formulation is
more general and, as we will see, has counterparts in the proofs of Gödel’s incompleteness
theorems.
What is a possible solution to this paradox? As hinted, the collection of all sets is
simply too large. Such collections are called classes. Classes are, for us, syntactical
abbreviations of the form {x | ϕ(x)}, i.e. every class is given by a formula ϕ.
(An alternative is Bernays-Gödel class theory, where classes are objects by themselves.)
We now develop basic properties of sets and classes, introducing axioms when needed.
The language of set theory is {∈}, where ∈ is a binary relation symbol.
Axiom. (Existence) ∃x (∀y y ∉ x).
One could alternatively work with the Existence Axiom ∃x (x = x); the existence of
the empty set would follow from the Separation Axiom below.
Two sets are equal if and only if they have the same elements.
Axiom. (Extensionality) ∀x, x′ (∀y (y ∈ x ↔ y ∈ x′) → x = x′).
We next define classes. One could formulate everything that follows using sets alone,
but classes are quite convenient.
Definition 2.2.3. A class term
A = A(s0 , . . . , sn ) = {x | ϕ(x, s0 , . . . , sn )}
is given by an L∈ -formula ϕ and variables s0 , . . . , sn .
A class is a class term A(s0 , . . . , sn ) together with sets s0 , . . . , sn .17 So a class term is
variable, while a class is a fixed collection of sets. (Formally, this distinction means that
we are in a context where the si are bound variables.)
Definition 2.2.4. Suppose that A = {x | ϕ(x, s0 , . . . , sn )} is a class term and s is a
variable. We define
(1) s ∈ A to mean ϕ(s, s0, . . . , sn).
(2) s ≐ A¹⁸ to mean (∀x ∈ s ϕ(x, s0, . . . , sn)) ∧ (∀x (ϕ(x, s0, . . . , sn) → x ∈ s)).
One can thus regard classes as more general objects than sets. Some classes are equal
to sets; the remaining ones are too large to be sets – by the separation axiom below, they
are not a subset of any set – they are called proper classes.
A class is called a proper class if it is not equal to any set.
Definition 2.2.5. Suppose that A, B are classes.
(1) A ⊆ B if ∀x ∈ A (x ∈ B).
(2) A ≐ B if A ⊆ B and B ⊆ A.
(3) A ∈ B if ∃s (s ≐ A ∧ s ∈ B).
We will often simply write = for ≐.
17 I.e. s0, . . . , sn are elements of the set-theoretic universe V in Definition 2.2.7.
18 We write ≐ instead of = here to make clear that ≐ is used as a logical symbol in an
extended language that allows abbreviations. But one can always write = instead of ≐.
Lemma 2.2.6. If A, B are classes and s, t are sets with A ≐ s and B ≐ t, then s = t
holds if and only if A ≐ B.
Proof. By the Axiom of Extensionality. 
Definition 2.2.7.
(1) ∅ = {x | x ≠ x}.
(2) V = {x | x = x} (the set-theoretic universe).
(3) {x0, . . . , xn} = {x | x = x0 ∨ · · · ∨ x = xn}.
Lemma 2.2.8. ∅ ∈ V .
Definition 2.2.9. A class A is a proper class if there is no set s with A = s.
Definition 2.2.10. Suppose that A, B, A0 , . . . An are classes.
(1) A0 ∪ · · · ∪ An = {x | x ∈ A0 ∨ · · · ∨ x ∈ An}.
(2) A0 ∩ · · · ∩ An = {x | x ∈ A0 ∧ · · · ∧ x ∈ An}.
(3) A \ B = {x | x ∈ A ∧ x ∉ B}.
(4) ⋃A = ⋃_{x∈A} x = {y | ∃x ∈ A y ∈ x}.
(5) ⋂A = ⋂_{x∈A} x = {y | ∀x ∈ A y ∈ x}.
Lecture 9
10. May
Lemma 2.2.11. ⋃{x, y} = x ∪ y.
Proof. ⊆: Suppose that u ∈ ⋃{x, y}. Then u ∈ v for some v ∈ {x, y}. Assume v = x.
Then u ∈ x, so u ∈ x ∪ y.
⊇: Suppose that u ∈ x ∪ y. Assume u ∈ x. Then u ∈ ⋃{x, y}. □
Axiom. (Pairing) ∀x, y ∃z ∀u (u ∈ z ↔ (u = x ∨ u = y)).
Note that the pair is unique by the Axiom of Extensionality.
Definition 2.2.12. (Kuratowski pair) Suppose that s, t, s0 , . . . sn+1 are sets.
(1) (s, t) := {{s}, {s, t}} is the ordered pair of s, t.
(2) (s0 , . . . , sn+1 ) := ((s0 , . . . , sn ), sn+1 ).19
Lemma 2.2.13.
(1) ∀x, y ∃z (z = (x, y)).
(2) ∀x0 , . . . , xn ∃z (z = (x0 , . . . , xn )).
Proof. (1) By the pairing axiom, {s} = {s, s} and {s, t} are sets, so {{s}, {s, t}} is a set
as well.
(2) By induction on n. 
Lemma 2.2.14. ∀x, y, x′, y′ ((x, y) = (x′, y′) → (x = x′) ∧ (y = y′)), i.e. Definition
2.2.12 satisfies the fundamental property of ordered pairs.
Proof. Suppose that (x, y) = (x′, y′).
First suppose that x = y. Then {x} = {x, y} and (x, y) = {{x}}. We have (x′, y′) =
{{x′}, {x′, y′}} = {{x}}. So {x′} = {x′, y′} and thus x′ = y′. Then (x, y) = {{x}} =
{{y}} and (x′, y′) = {{x′}} = {{y′}}. So x = x′ = y = y′.
Now suppose that x ≠ y. Since {x} ∈ (x′, y′), we have {x} = {x′} or {x} = {x′, y′},
and in both cases x = x′ by Extensionality. Since {x, y} has two elements, {x, y} ≠ {x′},
so {x, y} = {x′, y′} and hence y = y′. □
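The fundamental property can also be checked mechanically on small examples. The sketch below assumes that sets are modelled by Python's frozenset and tests only an arbitrary sample of three hereditarily finite sets, so it illustrates the lemma rather than proving it; the names pair and sample are hypothetical.

```python
from itertools import product

def pair(s, t):
    """Kuratowski pair (s, t) = {{s}, {s, t}}."""
    return frozenset([frozenset([s]), frozenset([s, t])])

# A few hereditarily finite sets to test with (an arbitrary sample).
e = frozenset()
sample = [e, frozenset([e]), frozenset([e, frozenset([e])])]

# Equal pairs must have equal first and equal second coordinates.
for x, y, x2, y2 in product(sample, repeat=4):
    if pair(x, y) == pair(x2, y2):
        assert x == x2 and y == y2

# The degenerate case x = y collapses to {{x}}.
print(pair(e, e) == frozenset([frozenset([e])]))  # True
```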
19 Note that this is an induction in the metatheory. A weak fragment of PA would suffice to formalise
the metatheory. However, this is not necessary: we could restrict ourselves to pairs here. No metatheory
at all is needed to develop set theory, since every single theorem and proof consists of only finitely many
formulas. The metatheory only comes into play when we define the full ZFC as a scheme and study its
properties and its models.
Definition 2.2.15. If A0 , . . . , An are class terms or classes, let


A0 × · · · × An = {(x0 , . . . , xn ) | x0 ∈ A0 ∧ · · · ∧ xn ∈ An }.
Definition 2.2.16.
(1) A class R is called a binary relation on a class A if R ⊆ A × A.
(2) A relation F is called a function or map if
∀x, y, y′ (((x, y) ∈ F ∧ (x, y′) ∈ F) → y = y′).
Definition 2.2.17. Suppose that R is a relation.
(1) dom(R) = {x | ∃y (x, y) ∈ R}.
(2) ran(R) = {y | ∃x (x, y) ∈ R}.
(3) field(R) = dom(R) ∪ ran(R).
Definition 2.2.18. Suppose that F is a function and A, B are classes.
(1) F is a function from A to B (F : A → B) if dom(F ) = A and ran(F ) ⊆ B.
(2) F is a partial function from A to B (F : A ⇀ B) if dom(F) ⊆ A and ran(F) ⊆ B.
(3) If B is a set, let ᴬB = {f | f : A → B}.
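For finite relations, Definitions 2.2.16 to 2.2.18 translate almost literally into code. The following Python sketch assumes that a relation is a set of Python tuples (x, y) rather than a set of Kuratowski pairs; the function names follow the notation of the definitions, and is_function is a hypothetical helper.

```python
def dom(R):
    """dom(R) = {x | there is y with (x, y) in R}."""
    return {x for (x, y) in R}

def ran(R):
    """ran(R) = {y | there is x with (x, y) in R}."""
    return {y for (x, y) in R}

def field(R):
    return dom(R) | ran(R)

def is_function(R):
    """No x is related to two different values."""
    return all(y == y2 for (x, y) in R for (x2, y2) in R if x == x2)

F = {(1, "a"), (2, "b"), (3, "a")}
print(dom(F) == {1, 2, 3}, ran(F) == {"a", "b"})  # True True
print(is_function(F))                             # True
print(is_function(F | {(1, "b")}))                # False: 1 has two values
```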
Axiom. (Union) ∀x ∃y ∀z (z ∈ y ↔ ∃u (u ∈ x ∧ z ∈ u)).
Lemma 2.2.19. ∀x0, . . . , xn {x0, . . . , xn} ∈ V (i.e. ∃z ∀u (u ∈ z ↔ (u = x0 ∨ · · · ∨ u =
xn))).
Proof. For n = 0, 1 this holds by the Pairing Axiom.
Suppose this holds for n. Then {x0, . . . , xn+1} = ⋃{{x0, . . . , xn}, {xn+1}} is a set by
the Pairing, Union and Extensionality Axioms. □
We now aim to define the set of natural numbers. It will be defined as the smallest inductive
set. The reason for defining s + 1 as s ∪ {s} is that one wants to add a new element
to the set s. Note that we will therefore also need to know that s ∉ s. The Axiom of
Foundation will ensure this.
Definition 2.2.20.
(1) For any set s, we define s + 1 := s ∪ {s} = ⋃{s, {s}}.
(2) A set s is called inductive if ∅ ∈ s and ∀x ∈ s (x + 1) ∈ s.
The Axiom of Infinity states that there is an inductive set.20
Axiom. (Infinity) ∃y (∅ ∈ y ∧ ∀x (x ∈ y → x + 1 ∈ y)).
We need an axiom to ensure that an inductive set has infinitely many elements.
Axiom. (Foundation) ∀x ≠ ∅ ∃y (y ∈ x ∧ x ∩ y = ∅).
Lemma 2.2.21. There are no ∈-cycles x0 ∈ x1 ∈ · · · ∈ xn ∈ x0 .
Proof. y = {x0 , x1 , . . . , xn } is a set by Lemma 2.2.19. By the Foundation Axiom, y has
an ∈-minimal element xk . If k = 0, then x0 ∩ y = ∅. But xn ∈ x0 ∩ y, contradiction.
If k > 0, then xk ∩ y = ∅. But xk−1 ∈ xk ∩ y, contradiction. 
The Separation Scheme states that {z ∈ x | ϕ(z, x0 , . . . , xn )} is always a set. In other
words, a subclass of a set is itself a set.
Axiom Scheme. (Separation) For any formula ϕ(z, x0 , . . . , xn ),
∀x ∀x0 , . . . , xn ∃y ∀z (z ∈ y ↔ (z ∈ x ∧ ϕ(z, x0 , . . . , xn ))).
20Given all the other axioms and schemes of set theory, including the Axiom of Choice, one can show
that this follows from the existence of an infinite, i.e. not finite, set. Here by a finite set, we mean a set
x such that every injective function f : x → x is surjective.
Lemma 2.2.22. V is not a set, i.e. V ∉ V.
Proof. By Lemma 2.2.21. 
Lemma 2.2.23. There is a ⊆-least inductive set.
Proof. By the Axiom of Infinity, there is an inductive set x. The class
y = {u | u ∈ x and for all inductive sets z, u ∈ z},
i.e. the intersection of all inductive sets, is a set by the Separation Scheme. Since ∅ ∈ y,
and for all z ∈ y we have z + 1 ∈ y, y is in fact inductive. 
By the Axiom of Extensionality, the ⊆-least inductive set is unique. We will denote it
by ω and call it the set of natural numbers.
Axiom Scheme. (Replacement) If F is a function, then ∀x F[x] ∈ V, where F[x] = {z |
∃y ∈ x (y, z) ∈ F}.
The Replacement Scheme lists infinitely many formulas. However, for any proof of a
formula from the axioms of set theory, only finitely many instances of the Replacement
Scheme are used. Thus for any specific result in set theory, we do not need to worry about
issues such as: does one need to assume a certain theory such as PA, or the axioms of set
theory without the axiom of infinity, to work with formulas? Such problems only arise
when one studies the connection between structures and formulas as in the first chapter.
Definition 2.2.24. The axiom system ZF− of
Zermelo-Fraenkel Set Theory without the Power Set Axiom
consists of the Axioms of Existence, Extensionality, Pairing, Union, Foundation, Infinity,
and the Axiom Schemes of Separation and Replacement.
We work with ZF− because this suffices to develop basic notions of set theory and to
prove the recursion theorem. ZF consists of ZF− together with the Power Set Axiom. ZFC
consists of ZF together with the Axiom of Choice. These will be introduced later to prove
properties of cardinals. For instance, without them one cannot prove that uncountable
sets exist.

Lecture 10
12. May
Remark 2.2.25.
(1) The Axiom of Infinity implies the Axiom of Existence.
(2) The Axioms of Existence, Extensionality and the Replacement Scheme imply the
Separation Scheme.
(3) The Axiom of Infinity and the Replacement Scheme imply the Axiom of Pairing.
Proof. (1) This holds since an inductive set contains an empty set by definition. In more
detail, the Axiom of Infinity is logically equivalent to the sentence
∃x ∃y ((∀z z ∉ y) ∧ y ∈ x ∧ ∀z ∈ x (z + 1 ∈ x)). It therefore logically implies the existence of an empty set.
The previous version of the lecture notes additionally used the Separation Scheme: given any set x,
{y ∈ x | y ≠ y} is an empty set by the Separation Scheme.

(2) Suppose that s, s0 , . . . , sn are sets and ϕ(x, y0 , . . . , yn ) is a formula.


If there is no t ∈ s with ϕ(t, s0 , . . . , sn ), then u = {x ∈ s | ϕ(x, s0 , . . . , sn )} is empty.
By the Axiom of Existence, there is an empty set v and by the Axiom of Extensionality,
u = v.
Now assume that there is some t ∈ s with ϕ(t, s0, . . . , sn). Define a function F : s → s
by letting F(z) = z if ϕ(z, s0, . . . , sn) holds and F(z) = t otherwise. Then u = {x ∈ s |
ϕ(x, s0, . . . , sn)} = ran(F) by the Axiom of Extensionality, and ran(F) is a set by the
Replacement Scheme.
Extensionality can be avoided if one is careful about the statement of Separation and Replacement.

(3) This is left as an exercise. 
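The argument for (2) can be mirrored for finite sets. In the following Python sketch, the formula ϕ is replaced by an arbitrary Python predicate phi, which is an assumption made for illustration; the function name is hypothetical.

```python
def separate_via_replacement(s, phi):
    """Compute {x in s | phi(x)} as the range of a function F : s -> s."""
    witnesses = [z for z in s if phi(z)]
    if not witnesses:
        # No element satisfies phi: the result is the empty set.
        return frozenset()
    t = witnesses[0]  # some fixed element of s satisfying phi
    # F(z) = z if phi(z) holds, and F(z) = t otherwise.
    F = {z: (z if phi(z) else t) for z in s}
    # ran(F) is the image of s under F.
    return frozenset(F.values())

s = frozenset(range(10))
is_even = lambda n: n % 2 == 0
print(separate_via_replacement(s, is_even) == frozenset({0, 2, 4, 6, 8}))  # True
```

The case distinction in the code corresponds to the two cases in the proof: the empty case uses the Axiom of Existence, the other case uses the Replacement Scheme.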


Remark 2.2.26. We aim to show later in the lecture:
(1) ZF− is not finitely axiomatisable, i.e. there is no finite list of sentences which both
follow from ZF− and imply ZF− . This means that the Separation Scheme and Re-
placement Scheme cannot be (both) replaced by finitely many axioms.
(2) Any consistent and sufficiently simple extension of ZF− is incomplete by Gödel’s first
incompleteness theorem. (By sufficiently simple, we mean that the list of axioms can
be generated by an algorithm. For example, any theory that consists of finitely many
axioms and schemes.) Thus ZF− (and ZF, ZFC) only determine the general rules
how to work with sets, but do not provide a complete description of the set-theoretic
universe.
We now study induction, recursion and ordinals using the axioms of ZF− .
If < is a relation and y is a set, we write pred< (y) = {x | x < y}. (Note that in general,
this is a class.)
Definition 2.2.27. Suppose that < is a relation (on V ).
(1) < is called wellfounded if any nonempty set contains a <-minimal element.
(2) < is wellfounded for classes if every nonempty class contains a <-minimal element.
(3) < is called strongly wellfounded if < is wellfounded and for any y ∈ field(<), pred< (y)
is a set.
(4) A wellorder is a wellfounded linear order.
We will see below that strongly wellfounded implies wellfounded for classes.
The next theorems generalise induction and recursion on the natural numbers in two
ways:
(1) To the transfinite, i.e. to ordinals beyond the natural numbers.
(2) To wellfounded partial orders.
The first one has a number of applications. One that can be found in most textbooks
on set theory is to enumerate any given set, assuming the axiom of choice, and thus prove
the wellordering theorem.
The second one is much more general. As an example, one can think of a tree partial
order, i.e. a wellfounded partial order < such that for all x ∈ field(<), pred< (x) is linearly
ordered by <. In such an example, one assumes a property to hold at the root of the tree
and wants to prove it for all nodes.
Another example for induction along wellfounded partial orders is recursion for terms
and formulas.
While one usually uses wellfoundedness to prove induction, the next theorem states
that these two properties are in fact equivalent.
To formulate induction, we introduce the following notation. Suppose that ϕ(u, v) is
a formula and z is a set. We call ϕ <-inductive with parameter z if for all y ∈ field(<),
we have (∀x < y ϕ(x, z)) → ϕ(y, z). We further say that < satisfies induction if for any
<-inductive ϕ with parameter z, ϕ(x, z) holds for all x ∈ field(<).
Lemma 2.2.28. (Transfinite induction) The following conditions are equivalent for a
relation <:
(1) < is wellfounded for classes.
(2) < satisfies induction.
Proof. (1) ⇒ (2): Suppose that ϕ(x, z) fails for some x ∈ field(<). Since < is wellfounded
for classes, there is a <-minimal y for which ϕ(y, z) fails. Then ∀x < y ϕ(x, z) holds. But
since ϕ is <-inductive with parameter z, this would imply ϕ(y, z).
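(2) ⇒ (1): Assuming (1) fails, let A = {x | ϕ(x, z)} be a nonempty class with no
<-minimal element. Then ¬ϕ is <-inductive with parameter z: if ¬ϕ(x, z) holds for all
x < y but ϕ(y, z) holds, then y is a <-minimal element of A. Since A is nonempty,
(2) fails for ¬ϕ. □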
Induction as in (2) also holds for strongly wellfounded relations, since such relations are
wellfounded for classes by the proof of the next theorem. Note that Ord + Ord, Ord · ω,
Ord · Ord are wellfounded for classes, where Ord · α means Ord × α ordered by <r−lex . So
one can do induction far beyond Ord.
In the next theorem, we will use the following notation. Given a function G : A×V → V
and B ⊆ A, we say that a function f : B → V is guided by G if
(1) B is pred< -closed and
(2) f(x) = G(x, f↾pred<(x)) for all x ∈ B.
Condition (1) is necessary for the second condition to make sense. Condition (2) states
that f satisfies the recursion defined by G.
Theorem 2.2.29. (Transfinite recursion) Suppose that < is a strongly wellfounded rela-
tion with field(<) = A. (A is a class.) For any function G : A × V → V , there exists a
unique function F : A → V that is guided by G.
Proof. We first assume that < is wellfounded for classes. We will eliminate this assumption
in the end of the proof. An approximation is a set function guided by G.
Claim. Suppose that f, g are approximations. Then v = dom(f) ∩ dom(g) is pred<-closed
and f↾v = g↾v.
Proof. It is clear that v = dom(f) ∩ dom(g) is pred<-closed. We show by induction that
f(x) = g(x) for all x ∈ v. Suppose x ∈ v and for all y < x (in v), we have f(y) = g(y). So
f↾pred<(x) = g↾pred<(x). Then f(x) = G(x, f↾pred<(x)) = G(x, g↾pred<(x)) = g(x).
By Lemma 2.2.28, ∀x ∈ v f(x) = g(x), as required. □

Lecture 11
17. May
For any set x, we call an approximation f with x ∈ dom(f) an approximation at x.
The use of the next claim is to choose an approximation at x for each x ∈ A. Even the
axiom of choice would not be useful here, since we need to make class many choices.
Claim. Suppose that x ∈ A and there is an approximation f at x. Then there is a (unique)
⊆-least approximation fx at x.
Proof. It is easy to see that the intersection f of all approximations at x is a set by
the Separation Scheme. The previous claim implies that f satisfies (1). To see that (2)
holds for f, let g be any approximation at x. Then f(y) = g(y) = G(y, g↾pred<(y)) =
G(y, f↾pred<(y)) for all y ∈ dom(f). □
A straightforward induction completes the proof.
Claim. For any x ∈ A, there exists an approximation at x.
Proof. By induction for <, we can assume that there exists an approximation at each
y < x. By the Axiom of Union and the first claim, f = ⋃_{y<x} fy is a set function.
Moreover, it is easy to check that f is an approximation. If x ∈ dom(f), then we are
done. Otherwise
g = f ∪ {(x, G(x, f↾pred<(x)))}
is an approximation at x. 
By the first claim, F = ⋃_{y∈A} fy : A → V is a function, and one can easily check that it
is guided by G.
It remains to show that any strongly wellfounded relation (A, <) is wellfounded for
classes. To see this, take some x ∈ A. By recursion for ω, we construct g : ω → V with
g(0) = {x} and g(n + 1) = g(n) ∪ ⋃_{y∈g(n)} pred<(y). This is a set by the Replacement
Scheme and the Axiom of Union. Then ⋃_{n∈ω} g(n) is a pred<-closed set containing x.
Given a class that contains x, apply wellfoundedness to the intersection of this set with
the class to obtain a <-minimal element of the class. □
Exercise: Construct the least pred<-closed set by a direct recursion without the Axiom
of Infinity.
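For a finite wellfounded relation, the function guided by G can be computed by handling elements once all their predecessors have been handled. The following Python sketch makes several assumptions for illustration: the relation is a finite set of pairs (x, y) meaning x < y, restrictions f↾pred<(x) are passed to G as dictionaries, and the name recurse is hypothetical.

```python
def recurse(less, field, G):
    """Return the function F on `field` with F(x) = G(x, F restricted to pred(x)).

    `less` is a set of pairs (x, y) meaning x < y, with both entries in
    `field`. It is assumed to be wellfounded, which for a finite relation
    means that it has no cycles.
    """
    pred = {y: {x for (x, y2) in less if y2 == y} for y in field}
    F = {}
    remaining = set(field)
    while remaining:
        # Elements whose predecessors have all been handled; these are the
        # <-minimal elements of `remaining`.
        ready = {y for y in remaining if pred[y] <= set(F)}
        if not ready:
            raise ValueError("relation is not wellfounded")
        for y in ready:
            F[y] = G(y, {x: F[x] for x in pred[y]})
        remaining -= ready
    return F

# Example: the rank of x is the least number above the ranks of all y < x.
less = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
rank = recurse(less, {"a", "b", "c", "d"},
               lambda x, f: max((v + 1 for v in f.values()), default=0))
print(rank["a"], rank["b"], rank["c"], rank["d"])  # 0 1 1 2
```

If the relation has a cycle, no element of the cycle ever becomes ready, which is why the sketch raises an error in that case.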
There is no simple characterisation of those relations that allow recursion, as we have
for induction. First of all, the reason why wellfoundedness for classes does not suffice in
the proof is that the inductive hypothesis, i.e. the existence of class approximations,
cannot be formulated in the language of set theory for wellorders beyond Ord. With the
right formulation, one can prove recursion for Ord · n for all n ∈ ω.21 However, it is
independent of ZFC whether recursion of length Ord · ω holds.
This is no problem at all, since recursion for strongly wellfounded relations suffices for
all purposes.
We now turn to ordinals and their basic properties.
Definition 2.2.30.
(1) (Transitive sets) A class A is called transitive if ∀x, y (x ∈ y ∈ A → x ∈ A), i.e. A
is downwards closed with respect to ∈.
(2) (Ordinals) A set x is called an ordinal if x is transitive and (x, ∈) is a strict linear
order. Note that by the Axiom of Foundation, every ordinal is wellfounded. The
point of taking transitive sets here is to have a unique ordinal for each order type.
(3) (Comparison of ordinals) For ordinals α, β, we write α < β :⇐⇒ α ∈ β.
(4) (Successors and limits) An ordinal α is called a successor ordinal if α = β + 1 = β ∪ {β}
for some ordinal β. If α ≠ 0 is not a successor ordinal, then it is called a limit ordinal.
(5) Ord denotes the class of ordinals.
(6) ω denotes the (unique) ⊆-least inductive set. Its elements are called natural numbers.
Recall that we define 0 := ∅.
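For hereditarily finite sets, the definition of an ordinal can be tested directly. The following Python sketch assumes that sets are modelled by frozenset, so that membership is automatically irreflexive; the function names are illustrative.

```python
def is_transitive(x):
    """Every element of an element of x is an element of x."""
    return all(z in x for y in x for z in y)

def is_ordinal(x):
    """x is transitive and (x, ∈) is a strict linear order."""
    members = list(x)
    trichotomy = all(a in b or a == b or b in a
                     for a in members for b in members)
    transitive_order = all(a in c
                           for a in members for b in members for c in members
                           if a in b and b in c)
    return is_transitive(x) and trichotomy and transitive_order

zero = frozenset()
one = frozenset([zero])
two = frozenset([zero, one])
print(is_ordinal(two))                        # True

# Transitive, but 0 and {1} are incomparable with respect to membership.
odd = frozenset([zero, one, frozenset([one])])
print(is_transitive(odd), is_ordinal(odd))    # True False
```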
Lemma 2.2.31.
(1) Ord is inductive.
(2) Ord is transitive.
(3) The relation < is a strict linear order on Ord.
(4) Ord is a proper class.
Proof.
(1) Clearly 0 ∈ Ord. Suppose that α ∈ Ord. To see that α + 1 ∈ Ord, suppose that
γ ∈ β ∈ α + 1 = α ∪ {α}. We claim that γ ∈ α + 1. There are two cases. If β ∈ α,
then γ ∈ α ⊆ α + 1, since α is transitive. If β = α, then γ ∈ β = α ⊆ α + 1.
(2) Suppose that γ ∈ δ ∈ Ord. We claim that γ ∈ Ord. Note that (γ, ∈) is a strict linear
order, since γ ⊆ δ by transitivity of δ.
Call ordinals α, β comparable if the trichotomy (α < β) ∨ (α = β) ∨ (β < α) holds
and incomparable otherwise.
To see that γ is transitive, take α ∈ β ∈ γ. Since δ is transitive, α ∈ δ. Since (δ, ∈)
is a strict linear order, α and γ are comparable. Since α = γ and γ ∈ α contradict
Lemma 2.2.21, we have α ∈ γ as required.
(3) Consider the strict partial order (α, β) < (γ, δ) ⇐⇒ (α < γ) ∧ (β < δ) on Ord × Ord.
Clearly (Ord × Ord, <) is very strongly wellfounded and thus wellfounded for classes.
We show by induction along (Ord × Ord, <) that all pairs (α, β) are comparable.
Suppose that there is an incomparable pair (α, β). Since (Ord × Ord, <) is well-
founded for classes, we can assume that (α, β) is <-minimal. It suffices to show that
21We have not defined Ord · n formally. We mean the product Ord × n with the right-lexicographical
order.
α = β, since this would contradict the fact that α and β are incomparable. To see
that α ⊆ β, take any γ ∈ α. By minimality of (α, β), β and γ are comparable. But
β = γ and β ∈ γ would contradict the fact that α and β are incomparable. Thus
γ ∈ β. Similarly, β ⊆ α and hence α = β.
(4) Towards a contradiction, suppose that Ord is a set. Then by (2) and (3), Ord is an ordinal.
We would then have Ord ∈ Ord, contradicting the Axiom of Foundation.

The fact in (4) above that Ord is a proper class is called the Burali-Forti paradox
(1897). Burali-Forti showed that if there were a set Ω of all ordinals, then one could form
its successor Ω + 1, thus leading to the impossible inequality Ω < Ω + 1 ≤ Ω. This was
known to Russell when he published his paradox in 1903.
Lemma 2.2.32.
(1) (Induction principle) Any inductive subset x of ω equals ω.
(2) ω is the least limit ordinal.
Proof. (1) This is immediate, since ω is the ⊆-least inductive set.
(2) We first show that ω is an ordinal. (ω, ∈) is linearly ordered, since ω ⊆ Ord. It
remains to show that ω is transitive. Let ϕ(x) denote the formula x ⊆ ω. To show that
ϕ(n) holds for all n ∈ ω, it suffices by (1) to show that {n ∈ ω | ϕ(n)} is inductive.
Clearly ϕ(0) holds. If ϕ(n) holds, then n + 1 = n ∪ {n} ⊆ ω and hence ϕ(n + 1) holds.
We now show that ω is a limit ordinal. Otherwise, ω = α + 1 = α ∪ {α} for some
α ∈ Ord. Then α ∈ ω and since ω is inductive, α + 1 ∈ ω, contradicting the Axiom of
Foundation.
To see that all ordinals α < ω are either successors or equal to 0, note that the set of
these ordinals is inductive. By (1), this set equals ω. 

Lecture 12
31. May
Let α ≤ β denote α < β ∨ α = β for ordinals α, β.
Lemma 2.2.33. For ordinals α, β, we have α ≤ β if and only if α ⊆ β.
Proof. ⇒: If α ∈ β, then α ⊆ β by transitivity of β. If α = β, then the claim is obvious.
⇐: Suppose that α ⊆ β. Since α and β are comparable by Lemma 2.2.31 (3), it suffices
to show that β < α fails. But β ∈ α ⊆ β would imply β ∈ β, contradicting the
Axiom of Foundation. □
The next lemma is only included to show the existence of transitive closures. You can
forget about it without losing anything essential.
If < is a relation, we call a set x <-closed if for all y ∈ x, we have ∀z (z < y → z ∈ x).
This is the same as pred<-closed.
Lemma 2.2.34.
(1) If < is a strongly wellfounded relation, then any x ∈ field(<) is an element of some
<-closed set.
(2) For any set x, there is a ⊆-least transitive set that contains x as an element. This
is called the transitive closure tc(x) of x.
Proof. (1) This was already done at the end of the proof of Theorem 2.2.29. By recursion
for ω, we construct g : ω → V with g(0) = {x} and g(n + 1) = g(n) ∪ ⋃_{y∈g(n)} pred<(y).
Then ⋃_{n∈ω} g(n) is a <-closed set containing x.
(2) It is easy to see that the intersection of (arbitrarily many) transitive sets is again
transitive. So the claim follows from (1), applied to the relation ∈. □
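For hereditarily finite sets, the recursion from (1) for the relation ∈ stabilises after finitely many steps, so the transitive closure can be computed by a loop. The following Python sketch assumes that sets are modelled by frozenset; the function name is hypothetical.

```python
def transitive_closure(x):
    """Least transitive set containing x as an element (x a frozenset)."""
    layer = frozenset([x])  # g(0) = {x}
    while True:
        # g(n + 1) = g(n) ∪ ⋃ g(n), since the ∈-predecessors of y are
        # exactly the elements of y.
        nxt = layer | frozenset(z for y in layer for z in y)
        if nxt == layer:
            return layer
        layer = nxt

zero = frozenset()
one = frozenset([zero])
two = frozenset([zero, one])
three = frozenset([zero, one, two])

print(transitive_closure(two) == three)              # True: tc(2) = {0, 1, 2}
print(transitive_closure(frozenset([one])) ==
      frozenset([zero, one, frozenset([one])]))      # True
```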
The next lemma shows that wellorders and ordinals are the same up to isomorphism.
Every wellorder is represented by an ordinal.
It is easy to see that the following definitions can be done via the recursion theorem.
Definition 2.2.35.
(1) (a) α + 0 = α
(b) α + (β + 1) = (α + β) + 1
(c) α + γ = sup_{β<γ} (α + β) for limits γ.
(2) (a) α · 0 = 0
(b) α · (β + 1) = (α · β) + α
(c) α · γ = sup_{β<γ} (α · β) for limits γ.
(3) (a) α^0 = 1
(b) α^(β+1) = (α^β) · α
(c) α^γ = sup_{β<γ} (α^β) for limits γ.
We already argued above that these definitions agree with the more intuitive definitions
using sums and products of wellorders.
2.3. Cardinals. We will use the Power Set Axiom and the Axiom of Choice to prove
that there are cardinals beyond ω.22
Axiom. (Power Set) ∀x ∃y ∀z (z ∈ y ↔ z ⊆ x).
The set P(x) = {z | z ⊆ x} is called the power set of x.
A choice function for a set x is a function f : x → V with f (y) ∈ y for all nonempty
y ∈ x.
Axiom. (Axiom of Choice) Any set has a choice function.
Definition 2.3.1. ZF is the axiom system ZF− with the Power Set Axiom. ZFC is the
axiom system ZF with the Axiom of Choice.
From now on, we work in ZFC. The next lemma will allow us to associate a cardinal
to any set.
Lemma 2.3.2. For any set x, there is a bijection f : α → x for some α ∈ Ord. If (x, <)
is a wellorder, then we can choose f as an isomorphism between (x, <) and (α, <).23
Proof. We write ran(x) = {z | ∃y (y, z) ∈ x} for any set x. (Previously, we had only
defined this for relations.)
Let g : (P(x) \ {∅}) → x be a choice function. Let x∗ denote any set with x∗ ∉ x, for
instance x∗ = x.
Consider G : Ord × V → V, where G(α, y) = g(x \ ran(y↾α)) if x ⊈ ran(y↾α), i.e. if
x \ ran(y↾α) ≠ ∅, and G(α, y) = x∗ otherwise.
By the recursion theorem, there is a unique function F : Ord → V that is guided by G.
Claim. Suppose that γ ∈ Ord. If x 6⊆ ran(F α) for all α < γ, then (F γ) : γ → x is
injective.
Proof. This follows from the fact that F is guided by G and from the definition of G. 
Note that x ⊆ ran(F γ) for some γ. Otherwise, F : Ord → x is injective by the previous
claim, contradicting the fact that Ord is a proper class and the Replacement Scheme. Let
γ denote the least such ordinal.
By the previous claim, (F γ) : γ → x is injective. Thus F γ : γ → x is bijective.
For the second part of the lemma, let g be the choice function that picks the <-least
element of a subset of x. It is easy to check that the function F γ : γ → x in the previous
proof is an isomorphism between (γ, <) and (x, <). 
[22] The Power Set Axiom is necessary, while the Axiom of Choice could be avoided.
[23] It is easy to see that there is a unique isomorphism.
Recall that the ordinals are linearly ordered by inclusion. Hence by the previous lemma,
for any two sets x and y there exists an injective function f : x → y or an injective
function g : y → x.
Definition 2.3.3.
(1) A cardinal is an ordinal β such that for all α < β, there is no bijective function
f : α → β.
(2) Card denotes the class of cardinals.
(3) For any set x, the size (or cardinality) |x| of x is the least α ∈ Ord such that there
exists a bijection f : x → α.
The next lemma is left as an exercise.
Lemma 2.3.4. If there exists an injective function f : γ → β for some β < γ, then there
exists a bijective function g : γ → α for some α ≤ β. Hence γ ∉ Card.
For a relation R, let R−1 = {(x, y) | (y, x) ∈ R}.
Lemma 2.3.5. The following conditions are equivalent for sets x and y:
(1) |x| ≤ |y|.
(2) There is an injective function f : x → y.
(3) There is a surjective function g : y → x.
Proof. (1) ⇒ (2): This is clear.
(2) ⇒ (1): We can assume that y is a cardinal, i.e. y = |y|. By assumption, there is an
injective function f : |x| → |y|. If |x| ≤ |y|, then we are done. If |x| > |y|, then we have a
contradiction to the previous lemma.
(2) ⇒ (3): Suppose that f : x → y is an injective function. Fix any x∗ ∈ x. Then
g : y → x, defined by
g(z) = f⁻¹(z) if z ∈ ran(f),
g(z) = x∗ otherwise,
is surjective.
(3) ⇒ (2): Suppose that g : y → x is a surjective function. Let h be a choice function
for P(y). Then f : x → y, f(z) = h(g⁻¹[{z}]), where g⁻¹[{z}] = {w ∈ y | g(w) = z}, is injective. □
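The two constructions in the proof of (2) ⇒ (3) and (3) ⇒ (2) can be tried out on finite sets. The following is a sketch only: functions are encoded as Python dictionaries, and the names surjection_from_injection and injection_from_surjection are hypothetical. The choice function h is replaced by "take the least preimage", which assumes that the elements of y can be compared.

```python
def surjection_from_injection(f, x, y):
    """Given an injective f: x -> y (as a dict) with x nonempty, return a surjective g: y -> x."""
    default = next(iter(x))             # plays the role of the fixed element x* of x
    inverse = {f[a]: a for a in x}      # f^{-1}, defined on ran(f)
    return {z: inverse.get(z, default) for z in y}


def injection_from_surjection(g, x, y):
    """Given a surjective g: y -> x (as a dict), return an injective f: x -> y."""
    # The least preimage stands in for the choice function h on P(y).
    return {a: min(z for z in y if g[z] == a) for a in x}


x = {0, 1, 2}
y = {"a", "b", "c", "d"}
f = {0: "a", 1: "c", 2: "d"}
g = surjection_from_injection(f, x, y)
f2 = injection_from_surjection(g, x, y)
print(set(g.values()) == x)             # g is onto x
print(len(set(f2.values())) == len(x))  # f2 is injective
```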
Lemma 2.3.6. For every n ∈ ω, any injective function f : n → n is surjective. In
particular, n ∈ Card.
Proof. We show this by induction. It is clear for n = 0.
Now suppose that the claim holds for some n ∈ ω. Suppose that f : n + 1 → n + 1 is
injective, but not surjective. We can assume that f(n) = n by exchanging two values
of f. In more detail, suppose first that n ∈ ran(f). Then f(m) = n and f(n) = k for
some m, k. Then let f(m) = k and f(n) = n, but leave the remaining values. Otherwise
n ∉ ran(f). Suppose that f(n) = i. Then let f(n) = n and leave the remaining values.
Then (f↾n) : n → n is surjective by the induction hypothesis. It follows that f is
surjective. □
(Margin note, 13. June: A missing case has been added.)
The next lemma is left as an exercise:
Lemma 2.3.7. For any set x ⊆ Card, sup(x) ∈ Card. In particular, ω ∈ Card.
Definition 2.3.8. A set x is called
(1) finite if |x| < ω.
(2) infinite if x is not finite.
(3) countable if there is an injective function f : x → ω.
(4) uncountable if x is not countable.
Lemma 2.3.9. (Cantor) |P(x)| > |x| holds for all sets x.
Proof. We have |x| ≤ |P(x)|, since the function f : x → P(x), f(y) = {y}, is injective by
the Axiom of Extensionality.
Towards a contradiction, suppose that |P(x)| ≤ |x|. Note that there is a bijection
between P(x) and 2^x = {f | f : x → 2} by identifying sets with their characteristic
functions. Using Lemma 2.3.5, we obtain a surjective function g : x → 2^x. Let
f ∈ 2^x denote the unique function with f(i) ≠ g(i)(i) for all i ∈ x. Then f ≠ g(i) for all
i ∈ x. This contradicts the fact that g is onto. □
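The diagonal argument can be checked exhaustively on a small finite set. The sketch below works with subsets instead of characteristic functions, so the diagonal function f corresponds to the set D = {i ∈ x | i ∉ g(i)}; the name diagonal_set and the encoding of g as a dictionary are assumptions for this illustration.

```python
from itertools import product


def diagonal_set(g, x):
    """Return D = {i in x | i not in g(i)}, which differs from g(i) at the point i."""
    return frozenset(i for i in x if i not in g[i])


x = [0, 1, 2]
subsets = [frozenset(i for i, bit in zip(x, bits) if bit)
           for bits in product([0, 1], repeat=len(x))]

# Run through every function g : x -> P(x) and check that D is never a value of g.
diagonal_always_missed = all(
    diagonal_set(dict(zip(x, images)), x) not in images
    for images in product(subsets, repeat=len(x))
)
print(diagonal_always_missed)
```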
Definition 2.3.10.
(1) For any α ∈ Ord, α⁺ denotes the least cardinal λ > α. (This is well defined by the
previous lemma.)
(2) Cardinals of the form α⁺ are called successor cardinals. The remaining cardinals
λ ≠ 0 are called limit cardinals.
(3) (The Aleph-function) We define ℵ : Ord → Card by recursion on α ∈ Ord:
(a) ℵ_0 = ω
(b) ℵ_{α+1} = (ℵ_α)⁺
(c) ℵ_γ = sup_{α<γ} ℵ_α for limit ordinals γ.
The ℵ-function is strictly monotone, i.e. ∀α, β ∈ Ord (α < β → ℵ_α < ℵ_β). This can be
proved easily by induction from the definition. In particular, the ℵ-function is injective.
Since Ord is a proper class, by the Replacement Scheme, Card is a proper class as well.
ℵ_α is often denoted by ω_α.
Lecture 13
2. June
Lemma 2.3.11. Every infinite cardinal κ equals ℵ_α for some α ∈ Ord.
Proof. Let γ = sup{α ∈ Ord | ℵ_α ≤ κ}.
We claim that ℵ_γ ≤ κ. This is clear if γ = 0 or γ is a successor ordinal. If γ is a limit ordinal,
then ℵ_γ = sup_{α<γ} ℵ_α ≤ κ.
Hence ℵ_γ ≤ κ < ℵ_{γ+1} = (ℵ_γ)⁺, so κ = ℵ_γ. □
Definition 2.3.12.
(1) The continuum hypothesis CH states that |P(ℵ_0)| = ℵ_1.
(2) The generalised continuum hypothesis GCH states that |P(ℵ_α)| = ℵ_{α+1} for all α ∈
Ord.
Definition 2.3.13.
(1) A function f : α → β is called cofinal if ran(f) is unbounded in β, i.e. ∀β′ < β ∃α′ <
α (f(α′) > β′).
(2) The cofinality cof(κ) of a cardinal κ is the least cardinal λ ≤ κ such that there exists
a cofinal function f : λ → κ.
(3) An infinite cardinal κ is called regular if cof(κ) = κ and singular otherwise.
For example, it is easy to see that cof(ℵ_ω) = ω, so ℵ_ω is singular. On the other hand,
one can show that ℵ_n is regular for all n ∈ ω. In fact all infinite successor cardinals are regular.
Finally, we discuss Zorn’s Lemma. This is useful for many applications, for instance
to show that any vector space has a basis and that any ring with 1 ≠ 0 contains a maximal
ideal. Many other similar results follow fairly directly from Zorn’s Lemma.
Suppose that (A, ≤) is a partial order. An element p of a subset B of A is called
maximal in B if for any q ≥ p in B, p = q. An element q ∈ A is called a maximum of (A, ≤)
if p ≤ q holds for all p ∈ A. A subclass B of A is called a chain if it is linearly ordered by
≤. An element q ∈ A is called an upper bound for a subclass B of A if p ≤ q holds for all
p ∈ B. It is called strict if p < q holds for all p ∈ B.
Lemma 2.3.14 (Zorn’s Lemma). Suppose that (x, ≤) is a partially ordered set such
that every linearly ordered subset has an upper bound. Then (x, ≤) contains at least one
maximal element.
Proof. Towards a contradiction, suppose that (x, ≤) has no maximal elements.
We claim that every chain y in (x, ≤) has a strict upper bound. To see this, take any
upper bound p for y. If this is not strict, then p ∈ y. Since p is not maximal in (x, ≤),
there is some q > p in x, and q is a strict upper bound for y.
Let chain(x, ≤) denote the set of chains in (x, ≤). (This is a set by the Power Set
Axiom and the Replacement Scheme.) By the Axiom of Choice, there exists a function
f : chain(x, ≤) → x such that for all y ∈ chain(x, ≤), f (y) is a strict upper bound for y.
By recursion on α ∈ Ord, define
g(α) = f(g[α]) if g[α] is a chain in (x, ≤),
g(α) = ∅ otherwise.
One can easily show by induction that for all α ∈ Ord, g[α] is a chain in (x, ≤), so
the second case in the definition never occurs, and that g : Ord → x is injective. This
contradicts the Replacement Scheme. 
2.4. Tarski’s undefinability of truth. We now turn to metamathematical aspects of
ZFC.
Why did we define truth M |= ϕ only for set models M and not for classes? Clearly,
for any fixed formula ψ and any class model M, M |= ψ can be defined in the same way.
But it is a different definition for each formula ψ. In fact, there is no uniform definition
by the next result.
Added on 21 July: It suffices to read only the simplified argument after the following theorem. (This
also leads to a better understanding of the undefinability of truth.)

We extend the language L∈ by adding a constant symbol c_ϕ for each L∈-formula ϕ. This
is done in more detail in Section 4.1.
A truth definition is a formula T(w, x, y, z) and a set parameter t such that for all
formulas ψ(x, y) and all sets r, s,

ψ(r, s) ⟺ T(c_ψ, r, s, t).

We use c_ψ here so that it is clear how to formulate this statement formally. It should
mean that the set Φ of formulas
∀x, y (ψ(x, y) ⟷ T(c_ψ, x, y, z))
holds for t. One cannot actually formulate the existence of some t satisfying a scheme
of formulas as a sentence. (This problem is avoided in the simplified version after the
theorem.) But the point is: the next theorem already shows that Φ is inconsistent.
Theorem 2.4.1. (Tarski) There is no truth definition. More precisely, the set Φ above
is inconsistent.
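Proof. Suppose that T(w, x, y, z) is a truth definition. Let ψ(x, y) denote the formula
¬T(x, x, y, y). Then for all t,
ψ(c_ψ, t) ⟺ ¬T(c_ψ, c_ψ, t, t).
On the other hand, for the parameter t of the truth definition, applying the definition to
r = c_ψ and s = t yields ψ(c_ψ, t) ⟺ T(c_ψ, c_ψ, t, t). This contradicts the assumption
that T is a truth definition. □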
A simpler case (without parameter) of the previous result is to show that there is no
formula T(x, y) such that for all formulas ψ(x) and all words s,
ψ(s) ⟺ T(c_ψ, s).
Suppose that T(x, y) is such a formula. Let ψ(x) denote the formula ¬T(x, x). Then
ψ(c_ψ) ⟺ ¬T(c_ψ, c_ψ) by the definition of ψ, while the assumption on T yields
ψ(c_ψ) ⟺ T(c_ψ, c_ψ). This is a contradiction.
Informally, Tarski’s result shows that one cannot express properties of classes by first-
order formulas. One can also use the result to show that basic properties of classes such
as {x | ϕ(x, y)} = ∅ cannot be expressed by a formula in ϕ and x.
An explanation of Tarski’s undefinability of truth is given by the Levy hierarchy of
formulas in Section 4.2. Formulas with more quantifiers appear higher in the hierarchy.
One can show that formulas with a certain number of quantifiers cannot be expressed by
formulas with fewer quantifiers. This explains why a single formula such as T above does
not suffice to express all formulas.
Undefinability of truth is also one of the main steps in the proof of Gödel’s first
incompleteness theorem.

3. Completeness and compactness


3.1. Completeness of Hilbert’s calculus. In this section we prove that Hilbert’s
calculus is complete: if T is a set of L-formulas and ϕ is an L-formula with T |= ϕ, then
T ⊢_L ϕ. This theorem (in a slightly weaker variant) is originally due to Gödel. The
proof that we present is essentially due to Henkin. We shall show that every L-consistent
L-theory has a model. It is an easy exercise to show that this implies the completeness
theorem.
To this end, we start with a consistent L-theory T and aim to build a model of T. To
understand this task, think of the field axioms together with a set of first order conditions.
If the conditions state e.g. that the field has characteristic 0, then such structures are
already available. But in general, one does not know in advance what a model of the
theory will look like. Roughly, the problem is that even for theories T of the form Th(M)
for some structure M, T often carries less information than M. For example, one cannot
reconstruct the complex field from its theory in any sensible way.
However, such a reconstruction is possible if the structure is canonical. This means
there is a constant symbol for each element. The following syntactical conditions suffice
to ensure that a theory T equals the theory of a canonical model.
• T is complete (see Definition 1.5.14)
• T is L-deductively closed, i.e. for all L-sentences ϕ, T ⊢_L ϕ implies ϕ ∈ T.
• T is a Henkin theory, i.e. for any L-formula ϕ and every free variable x in ϕ, it
contains an L-formula of the form
(∃x ϕ) → ϕ(c/x)
for some constant symbol c, where ϕ(c/x) denotes the result of substituting c for x in ϕ.
Thus, we aim to extend T to a complete Henkin theory by recursion. We recursively
expand both the language and the theory and thereby add more information about the
structure that we aim to construct.
It is clear that any L-complete L-theory can be extended to an L-deductively closed
theory simply by adding all derivable sentences.
We first show how to extend any theory T to a complete theory.
Lemma 3.1.1. Suppose that T is an L-consistent L-theory and ϕ is an L-formula. Then
T ∪ {ϕ} or T ∪ {¬ϕ} is L-consistent.
Proof. Suppose that both theories are inconsistent. Then there are ψ_0, …, ψ_n ∈ T with
⊢_L (ψ_0 ∧ ··· ∧ ψ_n ∧ ¬ϕ) → ⊥
⊢_L (ψ_0 ∧ ··· ∧ ψ_n ∧ ϕ) → ⊥.
By tautologies and the Modus Ponens, we have
⊢_L (ψ_0 ∧ ··· ∧ ψ_n) → ⊥.
But then T would be L-inconsistent, contrary to the assumption. □
Lemma 3.1.2. Suppose that T is an L-consistent L-theory. Then there is an L-consistent,
L-complete L-theory T′ ⊇ T.
Proof. We assume that L is countable. The general case is proved similarly by transfinite
recursion.
It is easy to see that there are only countably many L-formulas. (This also follows from
Exercise 26 (2).)
Suppose that ⟨ϕ_n | n ∈ ω⟩ enumerates all L-formulas. We construct a sequence ⟨T_n |
n ∈ ω⟩ of L-consistent L-theories by recursion. Let T_0 = T. Let T_{n+1} = T_n ∪ {ϕ_n} if
T_n ∪ {ϕ_n} is consistent, and T_{n+1} = T_n ∪ {¬ϕ_n} otherwise. T_{n+1} is L-consistent by Lemma 3.1.1.
Then T′ = ⋃_{n∈ω} T_n is L-consistent, since any derivation of ⊥ uses only finitely many
formulas and these all lie in some T_n. Clearly, T′ is complete. □
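The recursion in the proof of Lemma 3.1.2 can be imitated in propositional logic, where everything is finite and checkable. This is only an analogy under stated assumptions: formulas are nested tuples, and a brute-force satisfiability test over truth assignments stands in for L-consistency. All names below (holds, satisfiable, complete_extension) are hypothetical.

```python
from itertools import product


def holds(formula, assignment):
    """Evaluate a formula built from ('var', name), ('not', f) and ('and', f, g)."""
    tag = formula[0]
    if tag == "var":
        return assignment[formula[1]]
    if tag == "not":
        return not holds(formula[1], assignment)
    if tag == "and":
        return holds(formula[1], assignment) and holds(formula[2], assignment)
    raise ValueError(f"unknown connective: {tag}")


def satisfiable(theory, variables):
    """Brute-force search for a satisfying assignment (stand-in for consistency)."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(holds(f, assignment) for f in theory):
            return True
    return False


def complete_extension(theory, enumeration, variables):
    """Decide each enumerated formula in turn: add it if possible, else add its negation."""
    current = list(theory)
    for phi in enumeration:
        if satisfiable(current + [phi], variables):
            current.append(phi)
        else:
            current.append(("not", phi))
    return current


p, q = ("var", "p"), ("var", "q")
T = [("not", ("and", p, q))]                     # "not both p and q"
T_prime = complete_extension(T, [p, q], ["p", "q"])
print(T_prime)
```

In this run p is added first, after which q can no longer be added, so its negation is added instead. The resulting theory decides both variables and is still satisfiable.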
The second step is to extend T to a Henkin theory. It can be necessary to extend the
language, since for example T might not contain any constant symbols.
Lemma 3.1.3. Suppose that T is a consistent L-theory, ϕ is an L-formula, x is a free
variable in ϕ [24] and c is a new constant. Then the L ∪ {c}-theory T′ := T ∪ {(∃x ϕ) → ϕ(c/x)}
is L ∪ {c}-consistent.
Proof. Suppose that T′ is inconsistent. Then there are ψ_0, …, ψ_n ∈ T such that for
ψ = ψ_0 ∧ ··· ∧ ψ_n,
⊢_{L∪{c}} ψ → ¬((∃x ϕ) → ϕ(c/x)).
By tautologies,
⊢_{L∪{c}} (¬∃x ϕ) → ¬ψ
and
⊢_{L∪{c}} ϕ(c/x) → ¬ψ.
By Lemma 1.5.17,
⊢_L (¬∃x ϕ) → ¬ψ
and
⊢_L ϕ → ¬ψ.
Note that ψ has no free variables, since T is a theory. By ∃→-introduction, we have
⊢_L (∃x ϕ) → ¬ψ.
By tautologies, ⊢_L ¬ψ. But then T would be inconsistent. □
Lecture 14
7. June
Lemma 3.1.4. Suppose that T is an L-consistent L-theory. Then there is a set C of new
constants and an L ∪ C-consistent Henkin L ∪ C-theory T⁺ ⊇ T.
Proof. We will define a sequence ⟨C_n | n ∈ ω⟩ of disjoint sets of constants and write
L_n = L ∪ C_0 ∪ ··· ∪ C_n. We will define a ⊆-increasing sequence ⟨T_n | n ∈ ω⟩ of
L_n-consistent L_n-theories.
Let T_0 = T and C_0 = ∅. Now suppose that ⟨C_i, T_i | i ≤ n⟩ are defined for some n ≥ 0.
Choose a set C_{n+1} of new constants which consists of a constant c_ϕ for each L_n-formula
ϕ(x) with a free variable x. By Lemmas 3.1.3 and 1.5.17,
T_{n+1} := T_n ∪ {(∃x ϕ) → ϕ(c_ϕ/x) | ϕ is an L_n-formula, x is a free variable in ϕ}
is L_{n+1}-consistent. More precisely, suppose that T_{n+1} ⊢_{L_{n+1}} ⊥. Then there are
ϕ_0, x_0, …, ϕ_k, x_k as above such that
T_n ∪ {(∃x_i ϕ_i) → ϕ_i(c_{ϕ_i}/x_i) | i ≤ k} ⊢_{L_{n+1}} ⊥.
By Lemma 1.5.17,
T_n ∪ {(∃x_i ϕ_i) → ϕ_i(c_{ϕ_i}/x_i) | i ≤ k} ⊢_{L_n ∪ {c_{ϕ_i} | i ≤ k}} ⊥,
but this contradicts Lemma 3.1.3.
Let C = ⋃_{n≥0} C_n. Thus T⁺ = ⋃_{n≥0} T_n is a consistent Henkin theory. □

[24] It suffices below to assume that x is the only free variable in ϕ. So one could also define the notion
of Henkin theory only for such formulas.
Suppose that T is an L-consistent L-theory. By Lemma 3.1.4, we can extend T to
an L_0-consistent L_0-Henkin theory T_0 for some L_0 ⊇ L. T_0 can be extended to an
L_0-complete L_0-theory by Lemma 3.1.2. The theory of all L_0-sentences L_0-provable from this
theory is L_0-deductively closed.
Only the next step remains to prove the completeness theorem.
Lemma 3.1.5. Suppose that T is an L-consistent L-theory. Then the following conditions
are equivalent:
(1) T = Th(M) for some canonical L-structure M.
(2) T is an L-complete Henkin theory.
Proof. (1) ⟹ (2): This is clear.
(2) ⟹ (1): Since T is L-complete, it is L-deductively closed. We will use this
throughout the proof.
One can construct the term model of T as follows. The underlying set is given by
L-constant symbols. Let c ∼ d :⇔ (c ≐ d) ∈ T for constant symbols c, d ∈ L and let [c]
denote the ∼-equivalence class of c. The next claim shows that this is well-defined. Let
M = {[c] | c ∈ L is a constant} be the underlying set of the term model M.
Claim. c ∼ d :⇔ (c ≐ d) ∈ T defines an equivalence relation on the set of constant
symbols in L.
Proof. ∼ is reflexive: (c ≐ c) ∈ T holds by the equality axioms and the ∀→-axiom.
∼ is symmetric: If (c ≐ d) ∈ T, then (d ≐ c) ∈ T holds by the equality axioms and the
∀→-axiom.
∼ is transitive: Suppose that (c ≐ d) ∈ T and (d ≐ e) ∈ T. Then (c ≐ e) ∈ T, since by
the equality axioms (c ≐ d ∧ d ≐ e → c ≐ e) ∈ T. □
Next, we define the interpretations of constant symbols, relation symbols and function
symbols. Let c^M = [c] for constant symbols c.
For relation symbols R, let R^M([c_0], …, [c_n]) :⟺ R(c_0, …, c_n) ∈ T. The next claim
shows that this is well-defined.
Claim. If (c_0 ≐ d_0), …, (c_n ≐ d_n), R(c_0, …, c_n) ∈ T, then R(d_0, …, d_n) ∈ T.
Proof. By the equality axioms for relation symbols. □
For function symbols f, let f^M([c_0], …, [c_n]) = [c] :⟺ (f(c_0, …, c_n) ≐ c) ∈ T.
This is well-defined by the next two claims.
Claim. If (c_0 ≐ d_0), …, (c_n ≐ d_n), (f(c_0, …, c_n) ≐ e), (f(d_0, …, d_n) ≐ e′) ∈ T, then
(e ≐ e′) ∈ T.
Proof. By the equality axioms for function symbols. □
Claim. For any (n+1)-ary function symbol f ∈ L and constant symbols c_0, …, c_n ∈ L,
there is a constant symbol c ∈ L with (f(c_0, …, c_n) ≐ c) ∈ T.
Proof. Since ⊢_L f(c_0, …, c_n) ≐ f(c_0, …, c_n), the →∃-axiom implies
⊢_L ∃x f(c_0, …, c_n) ≐ x. Since T is a Henkin theory, there is a constant symbol c ∈ L such that
((∃x f(c_0, …, c_n) ≐ x) → f(c_0, …, c_n) ≐ c) ∈ T.
Therefore (f(c_0, …, c_n) ≐ c) ∈ T. □
Claim. If t is an L-term with no free variables, then t^M = [c] ⟺ (t ≐ c) ∈ T.
Proof. By induction on terms.
Suppose that t = d is a constant symbol. If t^M = [c], then [c] = t^M = [d], so
(c ≐ d) ∈ T by the definition of ∼.
Suppose that t = f(t_0, …, t_n). By the previous claim, find constant symbols with
(t_i ≐ c_i) ∈ T for i ≤ n. Thus (f(t_0, …, t_n) ≐ f(c_0, …, c_n)) ∈ T. By the inductive
hypothesis, t_i^M = [c_i] for i ≤ n.
We have f(c_0, …, c_n)^M = [c] ⟺ (f(c_0, …, c_n) ≐ c) ∈ T by the definition of
f^M. Since T is deductively closed, the claim follows. □
Claim. Th(M) = T.
Proof. We show by induction that for all L-sentences ϕ,
M |= ϕ ⟺ ϕ ∈ T.
First suppose that ϕ = (s ≐ t). Suppose that s^M = [c] and t^M = [d]. Then
(s ≐ c), (t ≐ d) ∈ T by the previous claim. Then
M |= ϕ ⇔ [c] = [d] ⇔ (c ≐ d) ∈ T ⇔ (s ≐ t) ∈ T.
Suppose that ϕ = R(t_0, …, t_n). Suppose that t_i^M = [c_i] for i ≤ n. Then
M |= ϕ ⇔ R(c_0, …, c_n) ∈ T ⇔ R(t_0, …, t_n) ∈ T.
Suppose that ϕ = (¬ψ). Since T is complete,
M |= ϕ ⇔ M ⊭ ψ ⇔ ψ ∉ T ⇔ ϕ ∈ T.
Suppose that ϕ = (ψ ∧ θ). Then
M |= ϕ ⇔ (M |= ψ and M |= θ) ⇔ (ψ ∈ T and θ ∈ T) ⇔ (ψ ∧ θ) ∈ T.
Suppose that ϕ = (∃x ψ). Then
M |= (∃x ψ) ⟺ M |= ψ[[c]] for some constant symbol c ∈ L
⟺ M |= ψ(c/x) for some constant symbol c ∈ L
⟺ ψ(c/x) ∈ T for some constant symbol c ∈ L
⟺ (∃x ψ) ∈ T.
The second equivalence holds by Lemma 1.5.11 (substitution for formulas). The third
holds by the inductive hypothesis. The last holds from left to right by the →∃-axiom and
from right to left since T is a Henkin theory. □
This completes the proof of Lemma 3.1.5. □
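The underlying set of the term model consists of classes of constant symbols. As a small illustration, the following sketch computes these classes from a finite list of equalities. It assumes that the equalities c ≐ d in T are given as pairs of strings, and it closes them under reflexivity, symmetry and transitivity with a union-find structure; in the proof this closure is not needed, since T is deductively closed. The name term_model_universe is hypothetical.

```python
def term_model_universe(constants, equalities):
    """Partition constant symbols into classes [c], given pairs (c, d) for c ≐ d in T."""
    parent = {c: c for c in constants}

    def find(c):
        # Follow parent links to the representative, shortening the path on the way.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for c, d in equalities:
        parent[find(c)] = find(d)       # merge the classes of c and d

    classes = {}
    for c in constants:
        classes.setdefault(find(c), set()).add(c)
    return {frozenset(cls) for cls in classes.values()}


universe = term_model_universe(
    ["c0", "c1", "c2", "c3"],
    [("c0", "c1"), ("c1", "c2")],
)
print(universe)
```

Here c0, c1 and c2 are identified, so the resulting universe has two elements.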
Lecture 15
9. June
Theorem 3.1.6 (Completeness of Hilbert’s calculus).
(1) Any consistent L-theory has a model.
(2) For any L-theory T and any L-sentence ϕ:
(T ⊢_L ϕ) ⟺ (T |= ϕ).
Proof. (1): By Lemmas 3.1.2, 3.1.4 and 3.1.5.
(2): ⟹: Suppose that T ⊢_L ϕ. Then there are formulas ψ_0, …, ψ_n ∈ T with
⊢_L (ψ_0 ∧ ··· ∧ ψ_n) → ϕ. If M |= T, then M |= ψ_0 ∧ ··· ∧ ψ_n and hence M |= ϕ, as required.
⟸: Suppose that T ⊬_L ϕ. Then T ∪ {¬ϕ} is L-consistent. By (1), there is a model
M of T ∪ {¬ϕ}. Thus T ⊭ ϕ. □

3.2. The Compactness Theorem and applications.


Definition 3.2.1.
(1) A theory T is called satisfiable if there is a model M |= T .
(2) A theory T is called finitely satisfiable if every finite T0 ⊆ T has a model.
Theorem 3.2.2 (Compactness theorem). Every finitely satisfiable theory is satisfi-
able.
Proof. Suppose that T is not satisfiable. Then T |= θ for all L-sentences θ. Then T ⊢_L θ
for all L-sentences θ by Theorem 3.1.6, in particular T ⊢_L ⊥. By the definition of ⊢_L, there are
ψ_0, …, ψ_n ∈ T with ⊢_L (ψ_0 ∧ ··· ∧ ψ_n) → ⊥. Then T_0 := {ψ_0, …, ψ_n} is not satisfiable. □
3.2.1. From finite to infinite.
Lemma 3.2.3. If a theory T has models of size ≥ n for arbitrarily large n ∈ N, then it
has an infinite model.
Proof. Recall the sentence
ϕ_{≥n} = ∃x_0 … ∃x_{n−1} ⋀_{i<j≤n−1} ¬(x_i ≐ x_j)
from Example 1.4.5, which axiomatises the class of structures with ≥ n elements in the
empty language. Let
T_∞ = {ϕ_{≥n} | n ∈ N}.
By assumption T ∪ T_∞ is finitely satisfiable. By the compactness theorem, it has a model
M. Then M is infinite. □
Definition 3.2.4. Suppose that K = (K, 0, 1, +, ·) is a field. The characteristic char(K)
is the least n ∈ N such that 1 + 1 + · · · + 1 (n times) = 0, if such an n exists. (One then
says that K has positive characteristic.) Otherwise K has characteristic 0.
Let T_field denote the theory of fields.
Lemma 3.2.5. If a theory T ⊇ T_field has models of characteristic ≥ n for arbitrarily large
n ∈ N, then it has a model of characteristic 0.
Proof. For primes p and n ≥ 1, let
ϕ_{char≠p} = (1 + 1 + ··· + 1 (p times) ≠ 0),
ϕ_{char≥n} = ⋀_{1≤p<n, p prime} ϕ_{char≠p}
and T_{char=0} = {ϕ_{char≥n} | n ≥ 1}. Note that the fields satisfying T_{char=0} are precisely
the fields of characteristic 0.
T ∪ T_{char=0} is finitely satisfiable by the assumption and hence it has a model M by the
compactness theorem. Then M has characteristic 0. □
If a property holds for all finite subsets of a given structure, one can often conclude via
the Compactness Theorem that the property holds for the entire structure.
For example, a graph G = (G, E) consists of sets G and E ⊆ [G]^2. (Thus, by definition,
G has no loops, i.e. edges connecting a vertex with itself.) G is called k-colourable if there
is a function f : G → {0, …, k − 1} such that f(x) ≠ f(y) whenever x, y are connected by
an edge. By an (induced) subgraph of a graph G we mean the restriction of G to a subset
of G.
Lemma 3.2.6. If every finite subgraph of a graph G = (G, E) is k-colourable, then G is
k-colourable.
Proof. The language L of graphs consists of a single binary relation symbol R. Extend L
to L′ by adding a constant symbol d_i for each element i ∈ G and a unary relation symbol
C_i for each i < k. T is the L′-theory that consists of the formulas: [25]
(1) d_i ≠ d_j for all i ≠ j in G.
(2) R(d_i, d_j) if {i, j} ∈ E and ¬R(d_i, d_j) otherwise, for all i ≠ j in G.
(3) ∀x ⋁_{i<k} C_i(x).
(4) ⋀_{i<k} ∀x, y (R(x, y) → ¬(C_i(x) ∧ C_i(y))).
Since every finite subgraph of G is k-colourable, T is finitely satisfiable. By the
compactness theorem, T has a model H. Since H satisfies (1) and (2), it is easy to see that there
is an embedding of G into H (i.e. an injective homomorphism, see Definition 1.1.6). Since
H satisfies (3) and (4), there is a k-colouring of H. The restriction to G is a k-colouring
of G. □
[25] A variation of this proof still works if (3) and (4) are replaced with the relevant atomic formulas with
constant symbols d_i instead of quantifying over x. Then the compactness theorem for propositional logic
suffices.
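For a finite graph, k-colourability can be decided by brute force, which is the hypothesis that Lemma 3.2.6 requires for each finite subgraph. The following is a minimal sketch; the function name find_colouring and the encoding of edges as pairs are assumptions for this illustration.

```python
from itertools import product


def find_colouring(vertices, edges, k):
    """Return a k-colouring of a finite graph as a dict, or None if there is none."""
    vertices = list(vertices)
    for colours in product(range(k), repeat=len(vertices)):
        f = dict(zip(vertices, colours))
        # A colouring is valid if the endpoints of every edge get different colours.
        if all(f[x] != f[y] for x, y in edges):
            return f
    return None


cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(find_colouring(range(5), cycle5, 2))   # an odd cycle has no 2-colouring
print(find_colouring(range(5), cycle5, 3))
```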
Write [A]^n for the class of n-element subsets of A.
Lemma 3.2.7 (Infinite Ramsey’s theorem). Suppose that A is an infinite subset of
N, k ≥ 0, n ≥ 1 and c : [A]^n → {0, 1, …, k} is a function. (We call c a colouring.)
Then there is an infinite set B ⊆ A with |c[[B]^n]| = 1. (B is called homogeneous or
monochromatic for c.)
Proof. The case of arbitrary k ≥ 1 follows by iterated application of
the case k = 1. We thus assume k = 1.
The case n = 1 is clear: if we split an infinite set into two sets, at least one of them is infinite.
(This is called the pigeonhole principle.)
For the induction step, suppose that the claim holds for some n ≥ 1. Suppose that
c : [A]^{n+1} → {0, 1} is given.
We will inductively construct a ⊆-decreasing sequence B⃗ = ⟨B_j | j ∈ N⟩ of subsets of
A and a strictly increasing sequence a⃗ = ⟨a_j | j ∈ N⟩ in A with a_j ∈ B_j.
Let B_0 = A and pick any a_0 ∈ A (for instance, the least element of A). For the
successor step, write
c_{a_0}(x) = c({a_0} ∪ x)
for x ∈ [B_0]^n with min(x) > a_0. By the induction hypothesis for n, there is an infinite
subset B_1 ⊆ B_0 that is homogeneous for c_{a_0}. Then for some i_0 ∈ {0, 1}, c({a_0} ∪ x) = i_0
for all x ∈ [B_1]^n. Since this property remains true for all subsets of B_1, we can assume
that a_0 < min(B_1) by shrinking B_1. Pick a_1 ∈ B_1 and continue the construction of a_2, B_2
etc. similarly.
Once this has been completed, take i ≤ 1 such that J = {j ∈ N | i_j = i} is infinite.
Then {a_j | j ∈ J} is homogeneous for c by the construction. □
Lecture 16
14. June
Next is a short proof of the finite Ramsey’s theorem from the infinite Ramsey’s theorem
via the Compactness Theorem.
Lemma 3.2.8 (Finite Ramsey’s theorem). For all k, l ∈ N and n ≥ 1, there is some
N ∈ N [26] such that for any colouring c : [{0, …, N − 1}]^n → {0, …, k − 1}, there is a
homogeneous set H ⊆ {0, …, N − 1} for c of size l.
Proof. We can assume that l ≥ k. Let L = {f} ∪ {c_i | i ∈ N}, where f is an n-ary function
symbol and c_i is a constant symbol for i ∈ N.
The next formula ψ_N holds in an L-structure M if and only if there exists a
homogeneous subset of size l for f^M ↾ {c_i^M | i ≤ N}:
ψ_N := ⋁_{I ∈ [{0,…,N}]^l} ⋁_{m<k} ⋀_{{i_0,…,i_{n−1}} ∈ [I]^n} f(c_{i_0}, …, c_{i_{n−1}}) = c_m.
Towards a contradiction, suppose that no such N exists. Let T be the theory that consists
of the formulas ¬ψ_N for N ≥ l, the formulas c_i ≠ c_j for i ≠ j, a formula θ asserting that
f(x_0, …, x_{n−1}) does not depend on the order of its arguments:
θ := ∀x_0, …, x_{n−1} ⋀_{π ∈ Aut_n} f(x_0, …, x_{n−1}) = f(x_{π(0)}, …, x_{π(n−1)}),
where Aut_n denotes the set of bijections π : {0, …, n−1} → {0, …, n−1}, and a formula
χ which states that f is a colouring:
χ := ∀x_0, …, x_{n−1} ((⋀_{i<j<n} x_i ≠ x_j) → ⋁_{m<k} f(x_0, …, x_{n−1}) = c_m).
T is finitely satisfiable: for any N ∈ N, there exists a colouring g : [{0, …, N}]^n →
{0, …, k − 1} with no homogeneous set of size l. Take this colouring on the interpretations
of c_0, …, c_N and extend it in an arbitrary way.
By the Compactness Theorem, there exists a model M of T. We can assume that
c_i^M = i and N is the underlying set of M, since the restriction of M to {c_i^M | i ∈ N} is
also a model of T.
Then there is no homogeneous set of size l for f^M, and in particular no infinite one.
This contradicts the infinite Ramsey’s theorem. □
[26] We use an uppercase N to suggest that N is very large compared to k and l. See the remark after
the lemma.

It is known that N grows extremely fast (exponentially) relative to l. For n = 2 and k = 2, the
least such N is called the Ramsey number R(l, l). (See the Wikipedia article on Ramsey’s
theorem.)
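For the smallest interesting instance of Lemma 3.2.8, namely n = 2, k = 2 and l = 3, the least N can be found by exhaustive search. The sketch below checks all 2-colourings of pairs from {0, …, N − 1} for N = 5 and N = 6; the function names are hypothetical and the search is only feasible for very small N.

```python
from itertools import combinations, product


def has_mono_triangle(n, colour):
    """colour maps each pair (i, j) with i < j < n to 0 or 1."""
    return any(
        colour[(a, b)] == colour[(a, c)] == colour[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_colouring_has_mono_triangle(n):
    """Check all 2-colourings of the pairs from {0, ..., n-1}."""
    pairs = list(combinations(range(n), 2))
    return all(
        has_mono_triangle(n, dict(zip(pairs, bits)))
        for bits in product([0, 1], repeat=len(pairs))
    )


print(every_colouring_has_mono_triangle(5))  # False
print(every_colouring_has_mono_triangle(6))  # True
```

This confirms the well-known value R(3, 3) = 6: some colouring on 5 points avoids a homogeneous set of size 3, while no colouring on 6 points does.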

3.2.2. (Finitely) axiomatisable classes. The theory T_field ∪ T_{char=0} from Lemma 3.2.5 axiomatises
the class of fields with characteristic 0. Let L_ring denote the language of rings and fields.

Lemma 3.2.9. The class of fields of positive characteristic is not axiomatisable in Lring .

Proof. Towards a contradiction, suppose that T is an axiomatisation. For every prime p,
Z/pZ is a field with characteristic p. (This is not hard to check.) Thus by Lemma 3.2.5,
T has a model of characteristic 0, contradicting the choice of T. □

The class of fields of characteristic 0 is not finitely axiomatisable in L_ring. Otherwise,
by the next lemma its complement would be axiomatisable in L_ring, and hence so would the
class of fields of positive characteristic, using the obvious fact that the class of fields is
axiomatisable. This would contradict Lemma 3.2.9.
Note that it is easy to see the class of fields of characteristic 0 is axiomatisable by
language extension.

Lemma 3.2.10. The following conditions are equivalent for any class C of L-structures:
(1) C is finitely axiomatisable in L.
(2) C is axiomatisable in L and its complement (i.e., the class of L-structures not in
C) [27] is axiomatisable in L.

Proof. (1) ⇒ (2): Suppose that T = {ψ_0, …, ψ_n} axiomatises C. Let φ = (ψ_0 ∧ ··· ∧ ψ_n).
Then {φ} axiomatises C and {¬φ} axiomatises its complement.
(2) ⇒ (1): Suppose that T axiomatises C and T′ axiomatises its complement. Then
T ∪ T′ is not satisfiable. By the Compactness Theorem 3.2.2, there are finite sets S ⊆ T
and S′ ⊆ T′ such that S ∪ S′ is not satisfiable.
We claim that S axiomatises C. Every structure in C is a model of T and hence of S.
Conversely, suppose that M |= S. Since S ∪ S′ is not satisfiable, M ⊭ S′ and hence
M ⊭ T′. Thus M ∈ C. □

[27] For example, the complement of the class of fields with characteristic 0 in the class of L_ring-structures
consists of fields with positive characteristic and those L_ring-structures that are not fields.
3.2.3. The Löwenheim-Skolem theorems.


Lemma 3.2.11. For all infinite cardinals κ, we have |κ × κ| = κ and |[κ]^{<ω}| = κ.
Proof. For κ = ω, this was shown in Exercise 26. We leave out the general case for time
reasons (see Schimmerling: A course in Set Theory), although it is not difficult. □
The completeness theorem provides us with a model of any consistent theory. One can
ask which sizes these models can have. The next theorem gives a complete answer by
a direct application of the compactness theorem: an L-theory with infinite models has
models of all sizes ≥|L|.
Theorem 3.2.12. The following conditions are equivalent for any L-theory T :
(1) For all n ∈ N, T has a model of size ≥n
(2) T has an infinite model.
(3) T has models of size κ for any infinite cardinal κ ≥ |L|.
Proof. It suffices to show (2)⟹(3). Let L′ = L ∪ {c_α | α < κ}, where the c_α are distinct
new constant symbols. The theory T′ = T ∪ {c_α ≠ c_β | α ≠ β} is finitely satisfiable.
We constructed a model M of T′ in the proof of the completeness theorem. In fact, in
each step of the construction, only κ many new constant symbols were added, since there
are precisely |[κ]^{<ω}| = κ (by Lemma 3.2.11) many new formulas. Therefore, the whole
construction adds at most |κ × ω| = κ (by Lemma 3.2.11) many constant symbols. So M
has size at most κ, and at least κ since the c_α are interpreted by distinct elements. □
It remains to understand what are the sizes of elementary substructures and elementary
superstructures of a given structure M. (Recall the definition of elementary substructures
in Section 1.3.) The Löwenheim-Skolem theorems fully answer this question.
The theory Th_{L_A}(M_A) (see Example 1.1.9) in the next lemma is also called the
elementary diagram of M. It is essentially a list of all tuples in M and the formulas true
for these tuples in M.
Lemma 3.2.13. Suppose that M is an L-structure with underlying set A. If N is a
model of Th_{L_A}(M_A), then M ≺ N.
Proof. Suppose that N is a structure on a set B. Let f : A → B be the function c ↦ c^N.
Since N is a model of Th_{L_A}(M_A), f is an elementary embedding. □
Theorem 3.2.14 (Upward Löwenheim-Skolem Theorem). Any infinite L-structure
M on a set A has an elementary superstructure of any infinite size κ ≥ |A|, |L|.
Proof. |L_A| ≤ κ by Lemma 3.2.11. Thus Th_{L_A}(M_A) has a model of size κ by
Theorem 3.2.12. This is an elementary superstructure of M by Lemma 3.2.13. □
Lecture 17
16. June
The construction of elementary substructures in the next result is direct and does not
use the compactness theorem.
Suppose that N is an L-structure on a set B. It follows from Exercise 5 that a subset A
of B is the domain of a substructure of N if and only if A contains c^N and is closed under
f^N for all constant symbols c and function symbols f in L. To construct a substructure,
one can start with a subset A_0 of B containing all c^N and obtain A_{n+1} from A_n by adding
all f^N(a_0, …, a_k) for a_0, …, a_k ∈ A_n. Then ⋃_{n∈ω} A_n is as required.
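The stepwise closure described above can be run directly on a finite structure, where it stabilises after finitely many steps. In this sketch the operations are given as pairs (arity, function), and the example structure (Z/12Z with 0 and addition) is an assumption chosen for illustration; the name generated_substructure is hypothetical.

```python
from itertools import product


def generated_substructure(start, operations):
    """Close `start` under the operations; each operation is a pair (arity, function)."""
    current = set(start)
    while True:
        # One step A_n -> A_{n+1}: add all values of the operations on tuples from A_n.
        new = {
            f(*args)
            for arity, f in operations
            for args in product(current, repeat=arity)
        }
        if new <= current:
            return current
        current |= new


# The constant 0 is placed in the start set together with the generator 8.
add_mod_12 = (2, lambda a, b: (a + b) % 12)
print(sorted(generated_substructure({0, 8}, [add_mod_12])))   # [0, 4, 8]
```

For an infinite structure the loop need not terminate, which is why the notes take the union over all n ∈ ω instead.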
For an elementary substructure, we want to additionally satisfy the condition in Tarski’s
Test 1.3.7: if there is some b ∈ N with N |= ϕ[b, a0 , . . . , an ], then there is some a ∈ M
with N |= ϕ[a, a0 , . . . , an ]. So we want to add a witness to each such formula in every
step of the construction. Functions providing such witnesses are called Skolem functions:
Definition 3.2.15. Suppose that N is an L-structure on a set B and ϕ(x0 , . . . , xn , y) is
an L-formula. A Skolem function for ϕ is a partial function $f_\varphi^{N} : B^{n+1} \to B$ such that for
all $(b_0, \dots, b_n) \in B^{n+1}$: if $N \models (\exists y\, \varphi)[b_0, \dots, b_n]$, then $N \models \varphi[b_0, \dots, b_n, f_\varphi^{N}(b_0, \dots, b_n)]$.

Skolem functions exist by the Axiom of Choice. (In fact, they cannot be constructed
without it.)
Theorem 3.2.16 (Downward Löwenheim-Skolem Theorem). Suppose that N is
an L-structure on a set C and A ⊆ C. Then there is an elementary substructure M ≺ N
on an infinite set B with A ⊆ B of size at most λ := max(|A|, |L|).
Proof. By the Axiom of Choice, find a Skolem function fϕ for every L-formula ϕ(x0 , . . . , xk , y).
Let B0 = A and
$$B_{n+1} = B_n \cup \bigcup_{k \in \mathbb{N},\ \varphi(x_0, \dots, x_k, y) \text{ is an } L\text{-formula}} f_\varphi(B_n^{k+1}),$$
where $f_\varphi(B_n^{k+1})$ denotes the pointwise image. It is easy to check that $B := \bigcup_{n \in \omega} B_n$
passes Tarski’s test as required.
There are |L| ≤ λ (by Lemma 3.2.11) many L-formulas, so in each step, only $|\bigcup_{n \in \omega} (\lambda \times \lambda^n)| \le \lambda$
(by Lemma 3.2.11) many new elements are added. Hence |B| ≤ λ.
The downward Löwenheim-Skolem theorem provides us with many examples of ele-
mentary substructures. Before this, we did not have any examples besides Exercise 14.
Example 3.2.17. The field (C, 0, 1, +, ·) of complex numbers is uncountable (since the
set 2N of infinite binary sequences is uncountable by Cantor’s theorem). By the downward
Löwenheim-Skolem theorem, it has a countable elementary subfield.
We aim to show later, using a technique called quantifier elimination, that the set of
algebraic complex numbers forms an elementary subfield of the complex field.
By Theorem 3.2.12, or alternatively by the downward Löwenheim-Skolem theorem, we
have the following striking consequence, called the Skolem paradox. If ZFC is consistent,
then it has a countable model (M, ∈M ). This may seem contradictory at first sight.
One way to visualise such a model is to identify each element of M with a natural number
and thus obtain a model (N, ∈N ) of ZFC. Visualise (N, ∈N ) as a directed graph with arrows
that point to a set from its elements. Think of a statement true in (N, ∈N ), for instance
the existence of an uncountable set. This is just a statement about arrows.
The seeming paradox about uncountability is resolved by the difference between the internal
and external meaning of uncountable. Assume that ∈M equals the restriction of ∈ to M . [28]
We write P for P(N). Further assume:
• N ⊆ M and “N^M = N”:
let ϕ(x) denote the L∈ -formula stating that x is an element of ω. Then
for all sets n, ϕ(n) holds (i.e. n ∈ N) if and only if (M, ∈) |= ϕ[n]. [29]
• “P^M ⊆ P”:
let ψ(x) denote the L∈ -formula stating that x is a subset of ω. Then
for all x ∈ M with (M, ∈) |= ψ[x], x is really a set of natural numbers,
i.e. ψ(x) holds.
Since P^M ⊆ M and M is countable, P^M is countable, so there is a bijection f : P^M → N, i.e. a subset
of P^M × N with certain properties. Such a function cannot exist in M , since (M, ∈) |=
“P^M is uncountable”. In other words, (M, ∈) cannot “see” the countability of P^M .
[28] The downward Löwenheim-Skolem theorem implies that such a model exists if there is a set N of
any size with (N, ∈) |= ZFC. The existence of such a set N follows from the existence of an inaccessible
cardinal (which we do not study here).
[29] These properties of N and P can be realised by the Mostowski collapse (which we do not study here)
applied to M .
3.2.4. Nonstandard models for the natural numbers and for the reals. In the first chapter,
we asked:
• How can we axiomatise a given class of structures?
Conversely, one can ask:
• What is the class of models of a given theory?
Any countable theory with infinite models has models in many cardinalities by Theorem
3.2.12. It remains to classify the models of a fixed cardinality. Here we shall see that the
theory of the natural numbers, and thus PA, has more than one countable model up to
isomorphism. In fact, it is not hard to show that there are infinitely many isomorphism
types. Thus PA does not say much about the structure of its models.
We will return to these questions in the last chapter, where we consider other theories.
The language of arithmetic is LArith = {0, S, +, ·}. Recall the axiom system PA from
Definition 1.4.2.
The model (ℕ, 0^ℕ , S^ℕ , +^ℕ , ·^ℕ ) [30] is called the standard model of PA. It is abbreviated
by ℕ. All other models of PA, and of extensions of PA, are called nonstandard models.
Lemma 3.2.18. Suppose that 𝒩 = (N, 0^𝒩 , S^𝒩 , +^𝒩 , ·^𝒩 ) is a model of PA. Then there is
a unique homomorphism i = i_𝒩 : ℕ → N from ℕ to 𝒩.
Proof. We define by recursion i(0) = 0^𝒩 and i(S^ℕ (n)) = S^𝒩 (i(n)). Note that any homo-
morphism has to equal i.
We claim that i(m +^ℕ n) = i(m) +^𝒩 i(n). By induction on n, i(m +^ℕ 0) = i(m) = i(m) +^𝒩 0^𝒩
and
i(m +^ℕ (n +^ℕ 1)) = i(m +^ℕ S^ℕ (n)) = i(S^ℕ (m +^ℕ n)) = S^𝒩 (i(m +^ℕ n))
= S^𝒩 (i(m) +^𝒩 i(n)) = i(m) +^𝒩 S^𝒩 (i(n)) = i(m) +^𝒩 i(n +^ℕ 1).
Similarly, i(m ·^ℕ n) = i(m) ·^𝒩 i(n).
Suppose that 𝒩 = (N, . . . ) is a nonstandard model of PA. We identify ℕ with a subset
of N via i_𝒩 . Note that (ℕ, <^ℕ ) is an initial segment of (N, <^𝒩 ), since by induction on
n ∈ ℕ, if 𝒩 |= m < n then m ∈ ℕ. When 𝒩 is clear from the context, we call all n ∈ ℕ
standard and all n ∈ N \ ℕ nonstandard.
To construct a countable nonstandard model, we define terms ∆n by induction for
n ∈ ℕ: ∆0 := 0 and ∆n+1 := S(∆n ). Note that i_𝒩 maps each n ∈ ℕ to ∆n^𝒩 by definition.

Lemma 3.2.19. The theory Th(N), and hence also PA, has a countable nonstandard
model.
Proof. Let c be a new constant symbol and let L = LArith ∪ {c}. Let
T := Th(ℕ) ∪ {c ≠ ∆n | n ∈ ℕ}.
Since T is finitely satisfiable in the standard model ℕ, T has a countable model 𝒩 (by the
compactness theorem and Theorem 3.2.12). To see that 𝒩 is nonstandard, note that i_𝒩 is
not surjective, since c^𝒩 ≠ ∆n^𝒩 holds for all n ∈ ℕ.
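The finite satisfiability step can be made concrete: a finite part of T mentions only finitely many axioms c ≠ ∆n, so the standard model satisfies it once c is interpreted by a number not on that list. The following sketch uses a hypothetical helper name, interpretation_for_c, and represents a finite fragment simply by the set of indices n that occur in it.

```python
def interpretation_for_c(finite_part):
    """Pick a natural number for c that differs from every n in finite_part.

    finite_part is the finite set of indices n such that the axiom
    'c != Delta_n' belongs to the fragment under consideration.
    """
    return max(finite_part, default=-1) + 1


for fragment in [set(), {0}, {0, 1, 5}, set(range(100))]:
    value = interpretation_for_c(fragment)
    # Every axiom c != Delta_n of the fragment holds in the standard model.
    assert all(value != n for n in fragment)
```

No single natural number works for all axioms at once, which is why the resulting model has to contain a nonstandard element.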
Suppose that 𝒩 = (N, 0^𝒩 , S^𝒩 , +^𝒩 , ·^𝒩 ) is a nonstandard model of PA. We can define
a relation < from +^𝒩 . Note that each element x of the nonstandard part has a successor
S^𝒩 (x) and a predecessor y with S^𝒩 (y) = x. Thus the nonstandard part of (N, <) consists
of “intervals” isomorphic to (Z, <Z ). One can easily show that the order of these “copies”
of (Z, <Z ) is a dense linear order without end points.
Lecture 18 (21. June)
We now introduce the nonstandard reals.
[30] Formally, this is defined by restricting ordinal addition and multiplication to ω. Hence the full
induction principle holds for all subsets of N, and in particular PA holds.
An ordered field (R, 0, 1, +, ·, <) is a model of the field axioms, the axioms of strict linear
orders, and the axioms stating that addition of a fixed number, and multiplication by
a positive number, are strictly monotone functions:
• ∀x, y, z (x < y → x + z < y + z)
• ∀x, y, z (x < y ∧ z > 0 → x · z < y · z)
The language L consists of the language of ordered fields together with a set S consisting
of:
• A constant symbol cr for every r ∈ R.
• An n-ary function symbol Ff for every function f : Rn → R.
• An n-ary relation symbol RA for every set A ⊆ Rn .
We abbreviate the structure (R, 0R , 1R , +R , ·R , <R , S R ) of the reals with the obvious
interpretation of symbols in S as R.
A hyperreal field is an L-structure (R∗ , 0^{R∗} , 1^{R∗} , +^{R∗} , ·^{R∗} , <^{R∗} , S^{R∗} ), abbreviated as R∗ ,
with the following properties:
• (Transfer) R ≺ R∗ , i.e. R is an elementary substructure of R∗ .
• R∗ is not archimedean, i.e. R∗ contains a positive infinitesimal element ε with
0 < ε < 1/n for all n ∈ N.
Every function f : R^n → R has a “version” f∗ : (R∗ )^n → R∗ . Similarly, every relation
A ⊆ R^n has a “version” A∗ ⊆ (R∗ )^n . Sometimes f∗ and A∗ are denoted by f and A,
respectively.
Lemma 3.2.20. There exists a hyperreal field.
Proof. Let c be a new constant symbol and let L′ = L ∪ {c}. Let
T := ThL (R) ∪ {0 < c < c_{1/n} | n ∈ N}.
Since T is finitely satisfiable in R, it has a model R∗ . We have R ≺ R∗ , since this holds
for all models of ThL (R). Moreover, ε := c^{R∗} witnesses that R∗ is not archimedean.
An alternative construction of a hyperreal field R∗ goes essentially as follows. Consider
all sequences $\vec r = \langle r_n \mid n \in \omega \rangle$ in R. R∗ is defined as the set of such sequences modulo
an equivalence relation ∼. One defines $\vec r \sim \vec s$ if r(n) = s(n) for “many” n ∈ N, $\vec r < \vec s$
if r(n) < s(n) for “many” n ∈ N etc. Thus the sequences $\vec r = \langle \frac{1}{2^n} \mid n \in \mathbb{N} \rangle$ and
$\vec s = \langle \frac{1}{n+1} \mid n \in \mathbb{N} \rangle$ represent infinitesimals with $0 < \vec r < \vec s < \varepsilon$ for all ε ∈ R>0 . More
precisely, “many” n ∈ N means U-many n ∈ N with respect to a non-principal ultrafilter U on N.
Hyperreals allow elegant formulations of definitions and proofs from analysis. We shall
give a few examples.
From now on, we fix a hyperreal field R∗ . We call its elements hyperreals.
We first describe some basic facts about the structure of R∗ .
Definition 3.2.21.
(1) The set of finite hyperreals is
Rfin = {x ∈ R∗ | ∃n ∈ N |x| < n}.
(2) The set of infinite hyperreals is
Rinf = R∗ \ Rfin .
(3) The set of infinitesimals is
µ = {x ∈ R∗ | ∀n ∈ N |x| < 1/n}.
Lemma 3.2.22.
(1) Rfin is a subring of R∗ : for all x, y ∈ Rfin , we have x ± y, x · y ∈ Rfin .
(2) µ is an ideal in Rfin : it is a subring of Rfin and for all x ∈ Rfin and y ∈ µ, we have
x · y ∈ µ.
Proof. (1) Suppose that x, y ∈ Rfin . Find r, s ∈ R>0 with |x| < r and |y| < s. Then
|x ± y| ≤ |x| + |y| ≤ r + s ∈ R>0 and |x · y| = |x| · |y| ≤ r · s ∈ R>0 .
(2) Suppose that x, y ∈ µ. We aim to show x ± y ∈ µ. Thus we fix any r ∈ R>0 and
show |x ± y| < r. Since x, y ∈ µ, we have |x|, |y| < r/2, so |x ± y| ≤ |x| + |y| < r.
Suppose that x ∈ Rfin and y ∈ µ. We aim to show x · y ∈ µ. We fix any r ∈ R>0 and
show |x · y| < r. Find some s ∈ R>0 with |x| < s. Since y ∈ µ, we have |y| < r/s. Then
|x · y| = |x| · |y| < s · (r/s) = r.
Definition 3.2.23. For x, y ∈ R∗ , x ≈ y means that |x − y| ∈ µ. (x, y are infinitely
close.)
Lemma 3.2.24.
(1) ≈ is an equivalence relation on R∗ , i.e. for all x, y, z ∈ R∗ :
(a) x ≈ x.
(b) If x ≈ y, then y ≈ x.
(c) If x ≈ y and y ≈ z, then x ≈ z.
(2) ≈ is a congruence relation on Rfin with respect to ± and ·, i.e. for all x, y, u, v ∈ Rfin :
If x ≈ u and y ≈ v, then x ± y ≈ u ± v and x · y ≈ u · v.
Proof. (1) follows from the definition of ≈, using that µ is additively closed.
(2) can be checked by first replacing x with u and then y with v. This uses the definition
of ≈, the triangle inequality, and for the product, the fact that µ is an ideal in Rfin . (See
Lemma 3.2.22.) 
Note that R∗ has its “version” ℕ∗ of the natural numbers by definition of hyperreal fields.
This is a nonstandard model of PA. To see this, note that R∗ contains a positive infinitesimal
ε. Since R ≺ R∗ , ℕ∗ is unbounded in R∗ , so there is some N ∈ ℕ∗ with 1/ε <^{R∗} N .
Since 0 <^{R∗} ε <^{R∗} 1/n for all n ∈ ℕ, we have N > n for all n ∈ ℕ.
R∗ is not a complete ordered field, since it is easy to see that sup(ℕ) does not exist in R∗ .
If it did, then sup(ℕ) − 1 would not be an upper bound for ℕ, so sup(ℕ) − 1 ∈ Rfin , but then
sup(ℕ) ∈ Rfin , since Rfin is a subring of R∗ . However,
Rfin has no maximal element. Note that we already knew for abstract reasons that R∗ is not
a complete ordered field: since (ℕ∗ , <^{ℕ∗} ) ≇ (ℕ, <^ℕ ), we have (R∗ , <^{R∗} ) ≇ (R, <^R ), but it
follows from Problem 29 that there is a unique complete ordered field up to isomorphism.
Lemma 3.2.25. (Existence of standard part) For any r ∈ Rfin , there is a unique s ∈ R
with r ≈ s. s is called the standard part of r and is written as st(r) = s.
Proof. To see that s is unique, suppose that s0 , s1 ∈ R with r ≈ s0 and r ≈ s1 . Since ≈ is
an equivalence relation, we have s0 ≈ s1 . Thus |s0 − s1 | ∈ µ ∩ R = {0} and hence s0 = s1 .
To show that s exists, we can assume r > 0, since the case r < 0 is similar. Let
A := {x ∈ R | x < r}.
A is nonempty, since 0 ∈ A. A is bounded above, since it is bounded by r ∈ Rfin and
hence by some n ∈ N. By completeness of R, s := sup(A) exists.
Take any δ ∈ R>0 . It suffices to show that s − δ ≤ r ≤ s + δ. We first claim that
r ≤ s + δ. Since s is an upper bound for A, we have s + δ ∉ A. Hence r ≤ s + δ. We now
claim that r ≥ s − δ. If r < s − δ, then s − δ is an upper bound for A. This contradicts
the fact that s is the least upper bound.
Lemma 3.2.26. Suppose that x, y ∈ Rfin .
(1) x ≈ y ⇐⇒ st(x) = st(y).
(2) x ≤ y =⇒ st(x) ≤ st(y), but the converse fails.
(3) x ∈ R =⇒ st(x) = x.
Proof. This is easy to check and is left as an exercise. 


Lemma 3.2.27. The map st : Rfin → R is a surjective ring homomorphism: for all
x, y ∈ Rfin , st(x ± y) = st(x) ± st(y) and st(x · y) = st(x) · st(y).
Proof. This follows from the fact that ≈ is a congruence relation. 
Lemma 3.2.28. Rfin /µ ≅ R.
Proof. The kernel ker(st) = {x ∈ Rfin | st(x) = 0} of st equals µ. Thus the claim follows
from the isomorphism theorem for rings that can be found in any introductory textbook
on algebra. 
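The behaviour of infinitesimals and of the standard part map in Lemmas 3.2.22 and 3.2.27 can be tried out in a much smaller non-archimedean ordered ring. The sketch below is a toy model and not a hyperreal field: it uses polynomials in a formal positive infinitesimal eps with rational coefficients, there is no transfer principle, and every element is finite. The class name EpsPoly and its methods are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import zip_longest


class EpsPoly:
    """a0 + a1*eps + a2*eps^2 + ... with rational coefficients.

    Toy model only: eps is treated as a positive infinitesimal, so the sign
    of an element is the sign of its lowest-degree nonzero coefficient.
    """

    def __init__(self, *coeffs):
        cs = [Fraction(c) for c in coeffs]
        while cs and cs[-1] == 0:  # drop trailing zero coefficients
            cs.pop()
        self.coeffs = tuple(cs)

    def __add__(self, other):
        pairs = zip_longest(self.coeffs, other.coeffs, fillvalue=Fraction(0))
        return EpsPoly(*(a + b for a, b in pairs))

    def __neg__(self):
        return EpsPoly(*(-a for a in self.coeffs))

    def __sub__(self, other):
        return self + (-other)

    def __mul__(self, other):
        out = [Fraction(0)] * (len(self.coeffs) + len(other.coeffs))
        for i, a in enumerate(self.coeffs):
            for j, b in enumerate(other.coeffs):
                out[i + j] += a * b
        return EpsPoly(*out)

    def sign(self):
        for a in self.coeffs:
            if a != 0:
                return 1 if a > 0 else -1
        return 0

    def __lt__(self, other):
        return (other - self).sign() > 0

    def st(self):
        """Standard part: the constant coefficient."""
        return self.coeffs[0] if self.coeffs else Fraction(0)

    def is_infinitesimal(self):
        return self.st() == 0


eps = EpsPoly(0, 1)
x, y = EpsPoly(3, 2), EpsPoly(-1, 0, 5)   # 3 + 2*eps and -1 + 5*eps^2

assert EpsPoly(0) < eps < EpsPoly(Fraction(1, 1000))  # eps is a positive infinitesimal
assert (x * eps).is_infinitesimal()                   # finite times infinitesimal
assert (x + y).st() == x.st() + y.st()                # st respects sums
assert (x * y).st() == x.st() * y.st()                # st respects products
```

In this toy ring the infinitesimals are exactly the elements with constant coefficient 0, so the quotient by them is Q, in analogy with Lemma 3.2.28.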
Recall the formal definition of convergence of sequences:
Definition 3.2.29. Suppose that sn ∈ R∗ for n ∈ N and t ∈ R∗ . $\vec s = \langle s_n \mid n \in \mathbb{N} \rangle$
converges to t, written as limn→∞ sn = t, if
∀ε ∈ R>0 ∃m ∈ N ∀n ≥ m |sn − t| < ε.
We now give elegant nonstandard characterisations of convergence and continuity.
In the next lemma, we write N > ℕ if N > n for all n ∈ ℕ. Recall that R∗ associates
a sequence $\vec s^{\,*} : \mathbb{N}^* \to R^*$ to any sequence $\vec s : \mathbb{N} \to R$. For any N ∈ ℕ∗ , we write s∗N for the
N -th element $(\vec s^{\,*})_N$ of this sequence. Note that by R ≺ R∗ , we have F∗ ↾ R^n = F for any
function F : R^n → R.
Lemma 3.2.30. limn→∞ sn = t holds if and only if for all N > ℕ in ℕ∗ , s∗N ≈ t.
Proof. =⇒: Suppose that limn→∞ sn = t. We fix some N > ℕ in ℕ∗ and show s∗N ≈ t.
To see this, fix any ε ∈ R>0 . It suffices to show |s∗N − t| < ε.
Since limn→∞ sn = t, there is some m ∈ ℕ such that
R |= ∀n ≥ m |sn − t| < ε.
Since R ≺ R∗ ,
R∗ |= ∀n ≥ m |s∗n − t| < ε.
In particular, R∗ |= |s∗N − t| < ε as required.
⇐=: Suppose that for all N > ℕ in ℕ∗ , s∗N ≈ t. To show convergence, fix any ε ∈ R>0 .
We are looking for some m ∈ ℕ such that ∀n ≥ m |sn − t| < ε. We have
R∗ |= ∃m ∀n ≥ m |s∗n − t| < ε,
since any N > ℕ in ℕ∗ is a witness for m. Since R ≺ R∗ , this holds in R as required.
Lemma 3.2.31. Suppose f : A → R and c ∈ A. The following conditions are equivalent:
(1) f is continuous at c.
(2) If x ∈ A∗ and x ≈ c, then f (x) ≈ f (c).
(3) There is some δ ∈ µ>0 such that for all x ∈ A∗ with |x−c| < δ, we have f (x) ≈ f (c).
Proof. This is left as an exercise. 
Definition 3.2.32. A function f : A → R is called uniformly continuous if for any ε ∈
R>0 , there is some δ ∈ R>0 such that
∀x, y ∈ A (|x − y| < δ =⇒ |f (x) − f (y)| < ε).
Lemma 3.2.33. A function f : A → R is uniformly continuous if and only if for all
x, y ∈ A∗ with x ≈ y, we have f (x) ≈ f (y).
The proof is similar to the characterisation of continuity in Lemma 3.2.31. I might
add this proof later, but it is not relevant for the exam. I might also add an elegant
formulation of differentiability. The appeal of nonstandard analysis is that many basic
and advanced results in analysis have short proofs that avoid calculations with ε’s and δ’s.
4. Incomplete theories
Lecture 19 (23. June)
4.1. Finite set theory. Recall from Definition 1.5.14 that a theory T is called complete if
for every formula ϕ, T ⊢ ϕ or T ⊢ ¬ϕ. (It is easy to see that provability does not depend
on the language by the completeness theorem.)
Let Fin denote the set of hereditarily finite sets, i.e. those with only finitely many
elements, elements of elements etc. We assume L ⊆ Fin for all languages studied below.
Then any L-formula is an element of Fin, since L-formulas are functions {0, . . . , n} → L.
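Hereditarily finite sets are concrete, finite objects, which is what makes it possible to treat formulas and proofs as elements of Fin. A minimal sketch, assuming we model elements of Fin by nested Python frozensets: the standard Ackermann coding (a well-known bijection between Fin and the natural numbers, not introduced in these notes) sends s to the sum of 2^code(t) over t ∈ s, so that membership becomes a bit test. The function names are hypothetical.

```python
def ackermann_code(s):
    """Natural number coding the hereditarily finite set s."""
    return sum(2 ** ackermann_code(t) for t in s)


def decode(n):
    """Inverse of ackermann_code: the set whose elements are coded by the 1-bits of n."""
    return frozenset(decode(i) for i in range(n.bit_length()) if (n >> i) & 1)


def is_member(code_s, code_t):
    """s ∈ t holds iff bit number code(s) of code(t) is set."""
    return (code_t >> code_s) & 1 == 1


empty = frozenset()
one = frozenset({empty})        # the ordinal 1 = {0}
two = frozenset({empty, one})   # the ordinal 2 = {0, 1}

assert [ackermann_code(s) for s in (empty, one, two)] == [0, 1, 3]
assert decode(3) == two
assert is_member(ackermann_code(one), ackermann_code(two))
```

This coding is one way to see why statements about (Fin, ∈) and statements about the natural numbers can be translated into each other.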
We study circumstances in which an L-theory T is incomplete.
We consider axiom systems T with the following properties:
• T is self-referential: it can prove sufficiently many facts about formulas and about
provability in T .
• The axioms of T have an effective listing such as for PA or ZFC.
The second item ensures that provability in T can be expressed by a formula provT .
We will work with the following axiom system ZFFin . This is ZF except:
• The Axiom of Infinity is replaced by the Axiom of Finiteness.
• The Foundation Axiom is replaced with the Foundation Scheme.
(Margin note. Question while writing the lecture notes: does the Foundation Axiom imply the
Foundation Scheme?)
The proofs of transfinite induction in Lemma 2.2.28 and of transfinite recursion in
Theorem 2.2.29 work in ZFFin . Note that the Axiom of Finiteness is not important for
most proofs. However, with this axiom, there is a close relationship between models of
ZFFin and models of PA.
Here are the axioms:
Axiom. (Existence) ∃x (∀y y ∉ x).
Axiom. (Extensionality) ∀x, x′ (∀y (y ∈ x ↔ y ∈ x′ ) → x = x′ ).
Axiom. (Pairing) ∀x, y ∃z ∀u (u ∈ z ↔ (u = x ∨ u = y)).
Axiom. (Union) ∀x ∃y ∀z (z ∈ y ↔ ∃u (u ∈ x ∧ z ∈ u)).
The Axiom of Finiteness states that there is no inductive set and every set is in bijection
with an ordinal. (Margin note. Axioms marked in blue are different from ZF.)
Axiom. (Finiteness) ¬∃y (∅ ∈ y ∧ ∀x (x ∈ y → x + 1 ∈ y)) ∧
∀x ∃n ∈ Ord ∃f : n → x bijective.
Using the Foundation Scheme, it follows that every ordinal is 0 or a successor ordinal.
We thus call the ordinals natural numbers.
Axiom. (Power Set) ∀x ∃y ∀z (z ∈ y ↔ z ⊆ x).
We work with the Foundation Scheme instead of the Axiom of Foundation.
Axiom Scheme. (Foundation) For any formula ϕ(z, x0 , . . . , xn ),
∀x0 , . . . , xn (∃z ϕ(z, x0 , . . . , xn ) → ∃z (ϕ(z, x0 , . . . , xn ) ∧ ∀z′ ∈ z ¬ϕ(z′ , x0 , . . . , xn ))).
Axiom Scheme. (Separation) For any formula ϕ(z, x0 , . . . , xn ),
∀x ∀x0 , . . . , xn ∃y ∀z (z ∈ y ↔ (z ∈ x ∧ ϕ(z, x0 , . . . , xn ))).
Axiom Scheme. (Replacement) If F is a function, then ∀x F [x] ∈ V , where F [x] = {z |
∃y ∈ x (y, z) ∈ F }.
We want to expand ZFFin by adding a constant symbol cs for each s ∈ Fin and adding
their properties to the theory.
We first discuss how to extend an L-theory without proving more L-sentences. Sup-
pose that T is an L-theory and T′ is an L′ -theory with L ⊆ L′ and T ⊆ T′ . T′ is called
a conservative extension of T if for all L-sentences ϕ, T ⊢ ϕ holds if and only if T′ ⊢ ϕ
holds.
Language extensions, i.e. considering an L-theory T as an L′ -theory, are always con-
servative. This follows from the completeness theorem.
A remark about language extension by definition: Suppose that L is a language, T
is an L-theory and ϕ(x0 , . . . , xn−1 ) is an L-formula. Let R be a new n-ary relation symbol, L′ = L ∪ {R} and
T′ = T ∪ {∀x0 , . . . , xn−1 (R(x0 , . . . , xn−1 ) ←→ ϕ(x0 , . . . , xn−1 ))}.
Every model M = (M, F) of T can be expanded to a model of T′ by interpreting R as
{(x0 , . . . , xn−1 ) ∈ M^n | M |= ϕ[x0 , . . . , xn−1 ]}. Thus T |= ψ holds if and only if T′ |= ψ for
all L-sentences ψ. By the completeness theorem, the same holds for ⊢.
One can similarly add constant symbols and function symbols. We discuss only constant
symbols, since function symbols work similarly. Let ∃!x ϕ(x) state that a unique x with
ϕ(x) exists, i.e. it is an abbreviation for ∃x (ϕ(x) ∧ ∀y (ϕ(y) → x = y)). Suppose that
T ⊢ ∃!x ϕ(x). Let c be a new constant symbol, L′ = L ∪ {c} and T′ = T ∪ {ϕ(c)}. Then
T |= ψ holds if and only if T′ |= ψ for all L-sentences ψ.
We now expand ZFFin as follows. We add a new constant symbol cs to L∈ for every
set s ∈ Fin. For every t ∈ Fin, we add the sentence
$c_t = \{x \mid \bigvee_{s \in t} x = c_s\}$.
As noted, this is a conservative extension of ZFFin . From now on, we identify this extension
with ZFFin .
When we write (Fin, ∈) |= ψ for a sentence in the language extended by constants cs ,
we always mean that (Fin, ∈) is the structure for the extended language where cs has
value s.
Recall that a subset A of a structure M = (M, F) is definable over M (with parameters,
respectively) if A = {x ∈ M | M |= ϕ(x)} for some formula ϕ(x) without parameters
(with parameters, respectively).
Lemma 4.1.1. ZFFin is incomplete. In fact, any T ⊆ Th(Fin, ∈) that is definable over
(Fin, ∈) is incomplete.
Proof. Let provT (ϕ) denote an L∈ -formula stating that T ⊢ ϕ, i.e. there exists a finite
sequence of formulas ϕ0 , . . . , ϕn with ϕn = ϕ such that each one is obtained from the
previous ones by a rule of the Hilbert calculus. Recall that each formula ϕi is a function
{0, . . . , ni } → L for some natural number ni . Thus e.g. the statement “ϕj is derived from ϕi
and ϕk by Modus Ponens” can be expressed by an L∈ -formula, and similarly for the
other rules.
The following is a diagonalisation argument as we have seen in Cantor’s proof of
|P(x)| > |x| and in Tarski’s undefinability of truth. Imagine a list of all formulas. A
truth definition would allow us to write down a new formula by diagonalisation that is
different from all formulas.
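The diagonal pattern behind Cantor’s proof can be shown on a finite list. The sketch below is only an illustration of the pattern; the function name diagonal and the sample rows are made up for this example.

```python
def diagonal(rows):
    """Return a 0/1 sequence that differs from rows[i] at position i, for every i."""
    return [1 - rows[i][i] for i in range(len(rows))]


rows = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]
new_row = diagonal(rows)   # flips the entries on the diagonal

assert new_row == [1, 0, 1]
assert all(new_row[i] != rows[i][i] for i in range(len(rows)))
assert new_row not in rows
```

In the proof, the rows correspond to formulas ψ(x), the columns to elements s ∈ Fin, and the entry at (ψ, s) records whether ψ[s] holds. A truth definition would make the flipped diagonal itself definable.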
If T were complete, then for any formula ψ(x) and any s ∈ Fin, we would have
(Fin, ∈) |= ψ[s] ⇐⇒ (Fin, ∈) |= provT [ψ(cs )] ⇐⇒ (Fin, ∈) |= provT (sub(cψ , cs )), [31]
where sub(ψ, x) denotes the recursive definition of the formula obtained from ψ by sub-
stituting x for its free variable. So T(ψ, s) = provT (sub(cψ , cs )) is a truth definition for
(Fin, ∈). We now argue why this is contradictory.
[31] Here and elsewhere, ϕ(t) is an abbreviation for the formula obtained by replacing ϕ’s only free
variable by the term t.
As a reminder: [s] stands for an assignment of variables, the only relevant variable being mapped to the
element s of the structure.
Let ψ(x) denote the formula ¬T(x, x) = ¬provT (sub(x, cx )) (it is not relevant what
this formula means when x is not a formula). Then
(Fin, ∈) |= ψ[ψ] ⇐⇒ (Fin, ∈) |= ¬T[ψ, ψ] ⇐⇒ (Fin, ∈) ⊭ ψ[ψ].
The first equivalence holds by the definition of ψ. The second equivalence uses the fact
that T is a truth definition: (Fin, ∈) |= T[ψ, s] ⇐⇒ (Fin, ∈) |= ψ[s] as stated above.

Gödel’s first incompleteness theorem, in the form proved by Rosser, is a strengthening
of Lemma 4.1.1: it does not assume that T is true, only that it is consistent. In this and
the next sections, we work towards this goal.
We will need a syntactical version of the previous argument using sentences instead of
formulas. Instead of (Fin, ∈) |= χ[s] for a formula with one free variable and s ∈ Fin, we
want to write (Fin, ∈) |= χ(cs ).
The argument for the first equivalence can then be rewritten as follows. We assign to
θ(x) := provT (x) the formula ψ(x) := ¬θ(sub(x, cx )) and obtain (Fin, ∈) |= ψ(cψ ) ⇐⇒
(Fin, ∈) |= ¬θ(c_{ψ(cψ )} ), since clearly
(Fin, ∈) |= sub(cψ , cs ) = c_{ψ(cs )} .


In fact, one can show that for any formula ψ(x) and any s ∈ Fin, the equality
sub(cψ , cs ) = c_{ψ(cs )} is provable in ZFFin , where sub is defined as substitution of vari-
ables by recursion. This will follow either by checking it directly from the
formula defining sub, or from the fact that all Σ1 -sentences true in (Fin, ∈) are provable
in ZFFin and, building on this, that Σ1 -definable functions are representable in ZFFin
(see Lemma 4.2.12).
Thus the equivalence ψ(cψ ) ←→ ¬θ(c_{ψ(cψ )} ) is provable in ZFFin . The existence of a
sentence ϕ with ZFFin ⊢ ϕ ←→ ¬θ(cϕ ) is called the Fixed point lemma in the literature.
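The self-reference in the formula ψ(x) := ¬θ(sub(x, cx )) can be imitated with strings. In the sketch below, formulas are Python strings, the marker %x plays the role of the free variable, and Python’s repr plays the role of the constant symbol cs naming s. These conventions, and the uninterpreted word prov, are assumptions for the illustration and not the coding used in the notes.

```python
def sub(formula, s):
    """Replace the free-variable marker '%x' in formula by a quoted name of s."""
    return formula.replace("%x", repr(s))


# psi(x) says: the result of substituting (the name of) x into x is not provable.
psi = "not prov(sub(%x, %x))"

# G is psi applied to its own name, i.e. the analogue of psi(c_psi).
G = sub(psi, psi)

# The term inside prov(...) in G, when evaluated, denotes G itself.
inner_term = G[len("not prov("):-1]
assert eval(inner_term, {"sub": sub}) == G
```

So G is a sentence of the form “not prov(t)” where the term t evaluates to G, which is the string analogue of a fixed point ϕ ←→ ¬θ(cϕ ).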
The argument for the second equivalence in the previous lemma does not work without
the assumption T ⊆ Th(Fin, ∈). We only want to assume that T is consistent. Suppose
we work with a model M of T instead of (Fin, ∈). We still have the first equivalence
M |= ψ[ψ] ⇐⇒ M |= ¬prov[ψ(cψ )]
by the arguments above and the variant
(Fin, ∈) |= ¬prov[ψ(cψ )] ⇐⇒ M ⊭ ψ[ψ]
of the second equivalence, since T is complete. However, the argument breaks because
(Fin, ∈) and M might not satisfy the same sentences, i.e. they might not be elementarily
equivalent (see Definition 1.3.2).
A model M of ZFFin different from (Fin, ∈) is called nonstandard. By the compactness
theorem, such models exist. We note that in fact, some models of ZFFin do not even
satisfy the same Σ1 -sentences as (Fin, ∈) (see the next section on the Levy hierarchy for the
definition of Σ1 -formulas); this follows for instance from Gödel’s second incompleteness
theorem.
In the following, we will see how to modify the argument to also make the second
equivalence work. The point is that sufficiently simple statements (Σ0 -statements) hold
in any model of ZFFin if and only if they hold in (Fin, ∈). We will need that any
∆1 -definable set is representable by a Σ1 -formula (see Lemma 4.2.12).
Lecture 20 (28. June)
4.2. The Levy hierarchy. We now study the complexity of formulas with respect to quantifiers. The complexity
in the Levy hierarchy is determined by counting alternating blocks of quantifiers.
Can formulas with more quantifiers define more sets? For instance, can formulas of the
form ∀x ∃y ϕ(x, y, z) define more sets than formulas of the form ∃y ϕ(y, z) or ∀y ϕ(y, z),
where ϕ is quantifier-free?
This depends on the structure. For some interesting structures such as the complex
field, every definable subset can be defined without any quantifiers. We will come to this
in the next chapter.
Here we will see that in the structure (Fin, ∈) the number of quantifiers does matter.
As the number of quantifiers increases, one can define more sets.
Definition 4.2.1. Work with the extension of L∈ defined above.
(1) We write ∃x ∈ y ϕ for the formula ∃x (x ∈ y ∧ ϕ), and ∀x ∈ y ϕ for the formula
∀x (x ∈ y → ϕ). Any quantifier of this form is called bounded. Quantifiers of the
form ∃x and ∀x are called unbounded.
(2) A Σ0 -formula (also called Π0 -formula) is a formula with no unbounded quantifiers.
(3) A Σ1 -formula is a formula of the form ∃x0 . . . ∃xk ϕ, where ϕ is a Σ0 -formula.
(4) A Πn -formula is a formula of the form ¬ϕ, where ϕ is a Σn -formula.
(5) A Σn+1 -formula is a formula of the form ∃x0 . . . ∃xk ϕ, where ϕ is a Πn -formula.
(6) A set A ⊆ Fink with k ≥ 1 is called Σn -definable if there is a Σn -formula ϕ(x0 , . . . , xk−1 )
such that
A = {(x0 , . . . , xk−1 ) ∈ Fink | (Fin, ∈) |= ϕ[x0 , . . . , xk−1 ]}.
Πn -definable subsets of Fink are defined similarly.
(7) A set A ⊆ Fink with k ≥ 1 is called ∆n -definable if it is both Σn -definable and
Πn -definable.
(8) A function f : (Fin)k → Fin is called Σn -definable (Πn -definable, ∆n -definable, re-
spectively) if its graph is Σn -definable (Πn -definable, ∆n -definable, respectively).
Definition 4.2.2. Generalised Σ1 -formulas arise from Σ0 -formulas by closing under ∃x,
∀x ∈ y, ∧, and ∨.
For example, if ϕ is a Σ1 -formula, then ∀x ∈ y ∃z ϕ is a generalised Σ1 -formula.
There are various ways of bringing formulas into a simple form. One of them is prenex
normal form (see Wikipedia). We need the following normal form, a variant of negation
normal form for first order logic. It is used to avoid the negation case in inductive proofs.
Lemma 4.2.3 (Negation normal form). Every formula is logically equivalent (i.e.
provably equivalent in Hilbert’s calculus) to a formula in which negations only appear at
atomic formulas.
Proof. This is easily proved by induction on formulas. For instance, ¬∃x ϕ is logically
equivalent to ∀x ¬ϕ and ¬(ϕ ∧ ψ) is logically equivalent to (¬ϕ) ∨ (¬ψ). 
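The induction in this proof is effectively an algorithm that pushes negations inward. A minimal sketch, assuming formulas are represented as nested tuples such as ("not", f), ("and", f, g), ("or", f, g), ("exists", x, f), ("forall", x, f) and ("atom", name); this representation and the function name nnf are choices made for the example.

```python
def nnf(f):
    """Return an equivalent formula in which 'not' only occurs in front of atoms."""
    op = f[0]
    if op == "atom":
        return f
    if op in ("and", "or"):
        return (op, nnf(f[1]), nnf(f[2]))
    if op in ("exists", "forall"):
        return (op, f[1], nnf(f[2]))
    if op != "not":
        raise ValueError(f"unknown connective: {op}")
    g = f[1]                      # f is the negation of g
    if g[0] == "atom":
        return f                  # negated atom: nothing to do
    if g[0] == "not":
        return nnf(g[1])          # double negation
    if g[0] == "and":             # De Morgan
        return ("or", nnf(("not", g[1])), nnf(("not", g[2])))
    if g[0] == "or":
        return ("and", nnf(("not", g[1])), nnf(("not", g[2])))
    if g[0] == "exists":          # not exists x g' becomes forall x not g'
        return ("forall", g[1], nnf(("not", g[2])))
    if g[0] == "forall":
        return ("exists", g[1], nnf(("not", g[2])))
    raise ValueError(f"unknown connective: {g[0]}")


P, Q = ("atom", "P"), ("atom", "Q")
formula = ("not", ("exists", "x", ("and", P, ("not", Q))))
assert nnf(formula) == ("forall", "x", ("or", ("not", P), Q))
```

Each case of the function corresponds to one of the logical equivalences used in the induction.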
Lemma 4.2.4. Every generalised Σ1 -formula ϕ(x0 , . . . , xk−1 ) is equivalent in Fin to
a Σ1 -formula ψ, i.e. for all (s0 , . . . , sk−1 ) ∈ Fin^k ,
Fin |= ϕ[s0 , . . . , sk−1 ] ⇐⇒ Fin |= ψ[s0 , . . . , sk−1 ].


Proof. By induction on formulas. Suppose that ϕ is the formula ∀x ∈ y ψ(x, y, $\vec z$), where
ψ is a generalised Σ1 -formula. By the induction hypothesis, we can assume that ψ is
a Σ1 -formula ∃x0 , . . . , xm θ(x0 , . . . , xm , x, y, $\vec z$), where θ is a Σ0 -formula. Then ϕ is
equivalent in (Fin, ∈) to the Σ1 -formula ∃w ∀x ∈ y ∃x0 ∈ w . . . ∃xm ∈ w θ: since y is finite,
witnesses for all x ∈ y can be collected in a single set w ∈ Fin.
Lemma 4.2.5. Every formula that arises by an ∈-recursion guided by a Σ1 -formula
is itself Σ1 . Moreover, given a ∆1 -definable language L, the following statements are
expressible by Σ1 -formulas:
(1) The k-th symbol of a word is x.
(2) The length of a word is k ∈ N.
(3) v is a variable, constant symbol, function symbol, logical symbol etc.
(4) t is a term.
(5) ϕ = (∀xψ), ϕ = (¬ψ).
(6) ϕ is a formula.
(7) x is a variable that occurs free (bound) at place k ∈ N.
(8) ϕ is a sentence.
(9) $t = s\frac{r}{x}$, where r, s, t are terms and x is a variable.
(10) $\varphi = \psi\frac{t}{x}$, where ϕ, ψ are formulas, t is a term and x is a variable.
(11) ϕ is an axiom of Hilbert’s calculus.
(12) ϕ0 , ϕ1 , . . . , ϕn are formulas such that each ϕj arises from ϕ0 , . . . , ϕj−1 by rules of
Hilbert’s calculus.
Proof. This is left as an exercise. For instance, suppose that F : Fin → Fin is defined
by recursion guided by a Σ1 -definable function G : Fin × Fin → Fin. Then F (x) = y
if and only if there exist a transitive set z with x ∈ z and a function f : z → Fin such that
f (u) = G(u, f ↾ u) for every u ∈ z, and f (x) = y. Note that “for all u ∈ z” is a bounded quantifier.
L always denotes a language that is a subset of Fin.
Definition 4.2.6. Suppose that ϕ(x0 , . . . , xk−1 ) is an L-formula and T is an L-theory.
(1) ϕ defines a set R ⊆ Fink if for all a0 , . . . , ak−1 ∈ Fin:
(a0 , . . . , ak−1 ) ∈ R ⇐⇒ (Fin, ∈) |= ϕ[a0 , . . . , ak−1 ].
(2) T decides ϕ if for all a0 , . . . , ak−1 ∈ Fin, one of the following holds:
(a) $T \vdash \varphi\frac{c_{a_0}}{x_0} \cdots \frac{c_{a_{k-1}}}{x_{k-1}}$.
(b) $T \vdash \neg\varphi\frac{c_{a_0}}{x_0} \cdots \frac{c_{a_{k-1}}}{x_{k-1}}$.
(3) ϕ represents a relation R ⊆ Fin^k in T if for all a0 , . . . , ak−1 ∈ Fin:
(a) If (a0 , . . . , ak−1 ) ∈ R, then $T \vdash \varphi\frac{c_{a_0}}{x_0} \cdots \frac{c_{a_{k-1}}}{x_{k-1}}$.
(b) If (a0 , . . . , ak−1 ) ∉ R, then $T \vdash \neg\varphi\frac{c_{a_0}}{x_0} \cdots \frac{c_{a_{k-1}}}{x_{k-1}}$.
The relationship between these notions is stated in the next lemma. This lemma is not
actually needed below.
Lemma 4.2.7. The following conditions are equivalent for any formula ϕ(x0 , . . . , xk−1 )
and R ⊆ Fink :
(1) ϕ represents R in ZFFin .
(2) ϕ defines R and ZFFin decides ϕ.
Proof. (1) ⇒ (2): It is clear that ZFFin decides ϕ. Since ZFFin holds in Fin, ϕ defines R in
Fin.
(2) ⇒ (1): Suppose that (a0 , . . . , ak−1 ) ∈ R. Since ϕ defines R, ϕ[a0 , . . . , ak−1 ] holds in
(Fin, ∈) and hence $\mathrm{ZF}_{\mathrm{Fin}} \nvdash \neg\varphi\frac{c_{a_0}}{x_0} \cdots \frac{c_{a_{k-1}}}{x_{k-1}}$. Since ZFFin decides ϕ, $\mathrm{ZF}_{\mathrm{Fin}} \vdash \varphi\frac{c_{a_0}}{x_0} \cdots \frac{c_{a_{k-1}}}{x_{k-1}}$
as required. The case (a0 , . . . , ak−1 ) ∉ R is similar.
Lemma 4.2.8.
(1) Every Σ0 -sentence ϕ true in (Fin, ∈) is provable in ZFFin .
(2) Every Σ0 -definable set R ⊆ Fink is representable in ZFFin .
Proof. (1) By induction on Σ0 -sentences in negation normal form (see Lemma 4.2.3 ).
The claim holds for atomic sentences cs ∈ ct , cs = ct and their negations by the choice of
the extension of ZFFin to the language extended by constant symbols cs for s ∈ Fin.
If ϕ ∧ ψ holds, then both ϕ and ψ are provable by the induction hypothesis. Hence
ϕ ∧ ψ is provable.
The case ϕ ∨ ψ is similar.
Suppose that ∃x ∈ ct ϕ holds in (Fin, ∈). Then for some s ∈ t, $\varphi\frac{c_s}{x}$ holds in (Fin, ∈).
The latter is provable in ZFFin by the inductive hypothesis. Hence ∃x ∈ ct ϕ is provable
in ZFFin .
Suppose that ∀x ∈ ct ϕ holds in (Fin, ∈). Then for all s ∈ t, $\varphi\frac{c_s}{x}$ holds in (Fin, ∈).
Since $\varphi\frac{c_s}{x}$ is provable for each s ∈ t, their conjunction is provable. Hence ∀x ∈ ct ϕ is
provable in ZFFin .
(2) This follows from (1). 
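The induction in part (1) follows the shape of a finite evaluation procedure: over hereditarily finite sets, a bounded quantifier is a loop over the elements of a finite set. The sketch below evaluates Σ0 -formulas given as nested tuples, with Python frozensets standing in for elements of Fin and for the constants cs . The tuple encoding and the names holds and transitive are assumptions for this illustration.

```python
def value(term, env):
    """A term is either a variable name (str) or a frozenset used as a constant."""
    return env[term] if isinstance(term, str) else term


def holds(f, env=None):
    """Evaluate a Sigma_0 formula in (Fin, ∈) under the variable assignment env."""
    env = env or {}
    op = f[0]
    if op == "in":
        return value(f[1], env) in value(f[2], env)
    if op == "eq":
        return value(f[1], env) == value(f[2], env)
    if op == "not":
        return not holds(f[1], env)
    if op == "and":
        return holds(f[1], env) and holds(f[2], env)
    if op == "or":
        return holds(f[1], env) or holds(f[2], env)
    if op == "bex":    # ("bex", x, t, g) stands for: exists x in t such that g
        return any(holds(f[3], {**env, f[1]: s}) for s in value(f[2], env))
    if op == "ball":   # ("ball", x, t, g) stands for: for all x in t, g
        return all(holds(f[3], {**env, f[1]: s}) for s in value(f[2], env))
    raise ValueError(f"unknown connective: {op}")


def transitive(c):
    """The Sigma_0 sentence: for all y in c, for all z in y, z in c."""
    return ("ball", "y", c, ("ball", "z", "y", ("in", "z", c)))


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

assert holds(transitive(two))                    # the ordinal 2 is transitive
assert not holds(transitive(frozenset({one})))   # {1} is not: 0 ∈ 1 but 0 ∉ {1}
```

The two quantifier cases mirror the proof: a bounded ∃ is witnessed by one element of t, and a bounded ∀ is a finite conjunction over the elements of t.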
Lemma 4.2.9. Every Σ1 -sentence ϕ true in (Fin, ∈) is provable in ZFFin .
Proof. Suppose that ϕ is the Σ1 -sentence ∃x0 . . . ∃xn ψ(x0 , . . . , xn ), where ψ is a Σ0 -formula. If ϕ holds in
(Fin, ∈), there are a0 , . . . , an ∈ Fin with (Fin, ∈) |= ψ[a0 , . . . , an ]. By the previous lemma,
$\psi\frac{c_{a_0}}{x_0} \cdots \frac{c_{a_n}}{x_n}$ is provable in ZFFin . Hence ϕ is provable as well.

In fact, every Σ0 -definable set R ⊆ Fink is representable by the formula that defines it.
This is not the case for Σ1 -definable sets.
Lecture 21 (30. June)
To show that Σ1 -definable functions are representable, we first want to simplify Σ1 -formulas.
Lemma 4.2.10. For every Σn+1 -formula θ(y0 , . . . , yl ), there is a Σn+1 -formula of the
form ∃x ψ(x, y0 , . . . , yl ) that is equivalent to θ, provably in ZFFin , where ψ is Πn .
Proof. Consider the Σn+1 -formula ∃x0 . . . ∃xk ϕ(x0 , . . . , xk , y0 , . . . , yl ). Then ∃x∃x0 ∈
x . . . ∃xk ∈ x ϕ(x0 , . . . , xk , y0 , . . . , yl ) works. A similar argument as in the proof of Lemma
4.2.4 now shows that ∃x0 ∈ x . . . ∃xk ∈ x ϕ(x0 , . . . , xk , y0 , . . . , yl ) is a Πn -formula. 
Definition 4.2.11. A function f : Fink → Fin is called Σ1 -representable in T if its graph
is representable in T .
Lemma 4.2.12.
(1) Every Σ1 -definable function f : Fink → Fin is Σ1 -representable in ZFFin .
(2) Every ∆1 -definable set R ⊆ Fink is Σ1 -representable in ZFFin .
Moreover, we can choose the Σ1 -formula ψ(~x, y) representing f in (1) such that ZFFin proves
∀z (ψ(~x, z) → y = z) whenever f (~x) = y.
Proof. (1) Note that the original Σ1 -definition does not always work. The idea is to
consider a formula which asks for which value a witness appears first with respect to the
Vn -hierarchy.
If one has a definable wellorder of the universe, then one can pick the first witness in
this wellorder. ZFFin actually proves that there is such a definable wellorder. But we give
a direct proof of this lemma.
Let V~ = ⟨Vn | n ∈ ω⟩ denote the V -hierarchy, where V0 = ∅ and Vn+1 = P(Vn ) for
n ∈ ω. We first claim that the statement x ∈ Vn is definable in Fin by a ∆1 -definition
in x and n. To see this, note that in Fin, we can define P(x) = y by the Σ0 -formula
∅ ∈ y and ∀u ∈ x ∀v ∈ y (v ∪ {u} ∈ y ∧ v \ {u} ∈ y). (For infinite sets, P(x) = y is
only Π1 -definable.) Thus ZFFin proves that V~ is definable by a ∆1 -recursion and is hence
∆1 -definable.
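To make the V-hierarchy concrete, here is a minimal Python sketch that computes the first few levels. It models sets in Fin as nested frozensets; the function names powerset and v_hierarchy are hypothetical. The power set is built by adding one element of x at a time, which matches the closure condition v ∪ {u} ∈ y used in the formula above.

```python
def powerset(x: frozenset) -> frozenset:
    """Return P(x), built by adding the elements of x one at a time."""
    subsets = {frozenset()}
    for u in x:
        # Every subset found so far also occurs with u added.
        subsets |= {v | {u} for v in subsets}
    return frozenset(subsets)

def v_hierarchy(n: int) -> list:
    """Return the list [V_0, ..., V_n] with V_0 = ∅ and V_{k+1} = P(V_k)."""
    levels = [frozenset()]
    for _ in range(n):
        levels.append(powerset(levels[-1]))
    return levels

levels = v_hierarchy(4)
print([len(v) for v in levels])   # [0, 1, 2, 4, 16]
```

The sizes grow as iterated powers of 2, so V5 already has 65536 elements; the sketch is only meant for very small n.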
Suppose that the graph of f is defined by the Σ1 -formula ∃z ϕ(~x, y, z), where ϕ is a
Σ0 -formula.
Consider the Σ0 -formula θ(~x, y, v) stating that v is transitive, ~x ∈ v, and in v, y is
unique with ∃z ϕ(~x, y, z). Note that in the last part, the quantifiers are restricted to v.
Consider the Σ1 -statement ψ(~x, y) stating that there is some n such that θ(~x, y, Vn ) holds.
Claim. If ψ(x0 , . . . , xk , y) holds in Fin, then ψ(cx0 , . . . , cxk , cy ) is provable in ZFFin .
Proof. Recall that all true Σ1 -sentences are provable in ZFFin . 
The next claim implies that ψ represents f : if ¬ψ(x0 , . . . , xk , y) holds in Fin, then
¬ψ(cx0 , . . . , cxk , cy ) is provable in ZFFin .
Claim. Suppose that f (~x) = y holds in Fin. Then ZFFin proves ∀z (ψ(~x, z) → y = z).
Proof. Since ψ(~x, y) holds, θ(~x, y, Vn ) holds for some n. Since this is a true Σ0 -statement,
it is provable in ZFFin .
Therefore for any z ≠ y, θ(~x, z, v) fails for any transitive set v with Vn ⊆ v, since
uniqueness of z fails, as witnessed by y. This is provable in ZFFin ; more precisely,
∀z (ψ(~x, z) → y = z) is provable. 

(2) Let f : Fink → Fin denote the characteristic function of R, i.e. f (~x) = 1 if ~x ∈ R
and f (~x) = 0 otherwise. Since R is ∆1 -definable, f is Σ1 -definable. By (1), let ψ(~x, y)
be a Σ1 -formula representing f . Then ψ(~x, c1 ) represents R: if ~x ∈ R, then ψ(~x, c1 ) is
provable in ZFFin , since ψ represents f . If ~x ∉ R, then ψ(~x, c0 ) is provable in ZFFin and
hence, by the previous claim, ¬ψ(~x, c1 ) is provable. 
The rest of this section is not used later. Its intention is to better understand truth
definitions. In particular, we see that a truth definition for Σn -formulas exists.
We next want to show that not every Σn+1 -definable set is Σn -definable. This provides
another explanation of Tarski’s undefinability of truth: a Σn -formula T(x, y) cannot define
all Πn -definable sets, so T is not a truth definition.
Definition 4.2.13. A formula U(x, y) is called a universal Σ1 -formula if it is a Σ1 -formula
and for any Σ1 -formula ϕ(y), there is some s ∈ Fin such that
Fin |= ∀y (ϕ(y) ↔ U(s, y)).
If T(x, y) is a Σ1 -formula that is a truth definition for Σ1 -formulas, then in particular
T is a universal Σ1 -formula, since one can let s = ϕ above.
Lemma 4.2.14. There is a Σ1 -formula T1 (x, y) that is a truth definition for Σ1 -formulas.
Proof. It suffices to find a Σ1 -formula T1 (x, y) that is equivalent to the statement: there
is a set M such that (M, ∈) |= ϕ[y]. (It is not relevant how T1 (ϕ, y) is defined if ϕ is not
a formula.) Note that (M, ∈) |= ϕ[y] is defined by a Σ1 -recursion in Chapter 1, so it is
Σ1 -definable by Lemma 4.2.5. In a bit more detail, (M, ∈) |= ϕ[y] says that there exists a
function f that assigns truth values true/false to subformulas of ϕ and tuples in M such
that (a) f satisfies the recursive definition, and (b) f (ϕ, y) = true. 
If T1 (x, y) is a Σ1 truth definition, then ¬T1 (x, y) is a Π1 truth definition. Using
this and Lemma 4.2.10, one can inductively obtain a Σn -formula that is a truth definition
for Σn -formulas for n ≥ 2.
Lemma 4.2.15. For each n ≥ 1, there is a Σn -formula Tn (x, y) that is a truth definition
for Σn -formulas.
Proof. Suppose that T1 (ϕ, y, z) is a Σ1 truth definition for formulas with two variables.
This exists by an easy modification of the above argument. Consider Σ2 -formulas ψ of
the form ∃z ϕψ (y, z), where ϕψ is a Π1 -formula. One can use Lemma 4.2.10 to see that this
suffices. Then
∃z ¬T1 (ϕ, y, z),
where ϕ denotes a Σ1 -formula equivalent to ¬ϕψ ,
is a Σ2 truth definition. Similarly, we can construct a Σn+1 truth definition from a Σn
truth definition. 
The next lemma shows that some Πn -definable sets are not Σn -definable.
Lemma 4.2.16. Suppose that Un (x, y) is a universal Σn -formula. Then ¬Un (x, x) is a
Πn -formula that is not equivalent to any Σn -formula in (Fin, ∈).
Proof. Suppose that there is a Σn -formula ϕ(x) such that
∀x (¬Un (x, x) ↔ ϕ(x))
holds in (Fin, ∈). Since Un is universal, there is some s ∈ Fin such that
∀x (Un (s, x) ↔ ϕ(x))
holds in (Fin, ∈). But the case x = s yields Un (s, s) ↔ ¬Un (s, s), a contradiction. 
4.3. Incompleteness of extensions of finite set theory.
Lecture 22 (05. July)
The next lemma is a provable version of the fact that there is no truth definition T(ϕ, x).
To see this, let ψ(x) be the formula ¬T(x, x).
Lemma 4.3.1 (Fixed point lemma). For every formula ψ(x), there is a sentence ϕ
with ZFFin ⊢ ϕ ←→ ψ(cϕ ). (Moreover, if ψ(x) is a Σ1 -formula, then ϕ can be chosen as
a Σ1 -sentence.32)
Proof. Recall from the proof of Lemma 4.1.1 that sub(ψ, x) denotes the recursive definition
of the formula obtained from ψ by substituting x for its unique free variable. This is a
Σ1 -definition by Lemma 4.2.5.
There are two ways to do the proof. One can either notice that ZFFin proves that the
recursive definition of sub is a (partial) function. Or one can apply Lemma 4.2.12 and
obtain some other Σ1 -formula ν that defines sub in (Fin, ∈) such that ZFFin proves that
ν defines a (partial) function. In any case, write ν(ϕ, x, ϕ′ ) for such a Σ1 -formula, where
ϕ′ stands for the formula obtained by replacing the unique free variable of ϕ by x. Write
s(ϕ, x) = ϕ′ if ν(ϕ, x, ϕ′ ) holds.
Let θ(x) denote the formula ψ(s(x, x)), in more detail ∃z (ν(x, x, z) ∧ ψ(z)). Then
ZFFin ⊢ θ(cθ ) ←→ ψ(s(cθ , cθ )) ←→ ψ(c_{θ(cθ )} ).
In more detail,
ZFFin ⊢ θ(cθ ) ←→ ∃z (ν(cθ , cθ , z) ∧ ψ(z)) ←→ ψ(c_{θ(cθ )} ).
The first equivalence holds by the definition of θ and since ν represents sub. The second
equivalence holds since ZFFin proves s(cθ , cθ ) = c_{θ(cθ )} , in more detail ZFFin proves that
c_{θ(cθ )} is the unique z with ν(cθ , cθ , z). This is because ν represents sub and ZFFin proves
that ν defines a function, as in the additional property in Lemma 4.2.12.
Let ϕ denote the formula θ(cθ ). 
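The diagonal construction in this proof can be imitated with strings. The following Python sketch is only an analogy under stated assumptions: formulas are strings, the free variable is the letter X, the code cϕ of a formula is its quoted string, and sub is a hypothetical helper playing the role of s(ϕ, x). The symbol psi is an uninterpreted placeholder.

```python
def sub(formula: str, term: str) -> str:
    """Substitute the quotation of `term` for the free variable X in `formula`."""
    return formula.replace("X", repr(term))

# theta(x) is the formula psi(s(x, x)).
theta = "psi(sub(X, X))"

# phi is theta applied to its own code, i.e. theta(c_theta).
phi = sub(theta, theta)

# The argument of psi inside phi is a term; evaluating it yields phi itself.
inner_term = phi[len("psi("):-1]
denoted = eval(inner_term, {"sub": sub})
print(denoted == phi)   # True
```

So phi has the form psi(t) where the term t denotes phi, which is the string analogue of ZFFin ⊢ ϕ ←→ ψ(cϕ ).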
For the next theorem, note that for any Σ1 -definable theory T , the set
{ϕ | ϕ is an L-sentence with T ⊢ ϕ}
is Σ1 -definable. If T is additionally complete and consistent, then this set is in fact ∆1 -definable, since its complement
{ϕ | ϕ is an L-sentence with T ⊬ ϕ} = {ϕ | ϕ is an L-sentence with T ⊢ ¬ϕ}
32 However, we will apply this to the Π1 -formula ¬provT (x).
is also Σ1 -definable.
The following is Rosser’s stronger version of Gödel’s incompleteness theorem, here in
a version for ZFFin .
Theorem 4.3.2. (A strong version of Gödel’s first incompleteness theorem, in a version
for ZFFin ) Suppose that T is a consistent extension of ZFFin that is Σ1 -definable over
(Fin, ∈). Then
{ϕ | ϕ is an L-sentence with T ⊢ ϕ}
is not ∆1 -definable. In particular, T is incomplete.
Proof. The idea is to apply the fixed point lemma to the formula ¬provT (x). In the actual
proof, one works with a slightly modified formula.
Suppose that the set in the assumption is ∆1 -definable. Then it is represented in ZFFin
by some Σ1 -formula ψ(x) by Lemma 4.2.12. By the previous Lemma 4.3.1, there is a Σ1
sentence ϕ such that
ZFFin ⊢ ϕ ←→ ¬ψ(cϕ ).
Then
T ⊢ ϕ ⇐⇒ ZFFin ⊢ ψ(cϕ ) ⇐⇒ ZFFin ⊢ ¬ϕ,
T ⊬ ϕ ⇐⇒ ZFFin ⊢ ¬ψ(cϕ ) ⇐⇒ ZFFin ⊢ ϕ.
The first equivalences hold by the choice of ψ. Since T is a consistent extension of ZFFin ,
each of the two cases yields a contradiction. 
(Margin note: Added the assumption ω-consistent in the next corollary.)
Gödel’s original proof of a weaker form of the incompleteness theorem avoids the use
of Σ1 -representations and Lemma 4.2.12. (Note that the proof of the fixed point lemma
does not need Lemma 4.2.12.) One still needs that true Σ1 -statements are provable as in
Lemma 4.2.9.
Gödel used the notion of ω-consistency. It is a stronger form of consistency that is only
of historical interest, as far as I know. A theory T extending ZFFin is called ω-consistent
if whenever T ⊢ ∃x ϕ(x), there exists some t ∈ Fin such that T ⊬ ¬ϕ(ct ). Setting ϕ = ⊥
shows that any ω-consistent theory is consistent.
Let proof(x, cϕ ) denote a formula stating that x is a proof of cϕ . If one writes down
such a formula in a straightforward way, one can see that it is decided in ZFFin , i.e.
ZFFin ⊢ proof(ct , cϕ ) or ZFFin ⊢ ¬proof(ct , cϕ ) for every t ∈ Fin.
We claim that an ω-consistent theory T does not prove ¬con(T ), where con(T ) =
¬provT (⊥) = ¬∃x proof(x, ⊥). To see this, suppose towards a contradiction that T ⊢
∃x proof(x, ⊥). Since T is ω-consistent, there is some t ∈ Fin such that T ⊬ ¬proof(ct , ⊥).
As we remarked a few lines above, then ZFFin ⊢ proof(ct , ⊥). Hence t is actually a proof
of ⊥ and thus T is inconsistent. But every ω-consistent theory is consistent.
Also note that if T ⊬ ¬con(T ), then T is consistent.
Corollary 4.3.3. (Gödel’s original first incompleteness theorem, in a version for ZFFin )
Suppose that T is an ω-consistent extension of ZFFin that is Σ1 -definable over (Fin, ∈).
Then T is incomplete.
Proof. This follows from the previous theorem. Here is a shorter proof.
We do not actually use that T is ω-consistent, only that T ⊬ ¬con(T ).
By the fixed point Lemma 4.3.1, there is a sentence ϕ such that
ZFFin ⊢ ϕ ←→ ¬provT (cϕ ).
Suppose towards a contradiction that T is complete. Then one of the following two cases occurs:
Suppose that T ⊢ ϕ. Since provT (cϕ ) is a true Σ1 -sentence, ZFFin ⊢ provT (cϕ ). Thus
ZFFin ⊢ ¬ϕ by the choice of ϕ. Hence T ⊢ ¬ϕ, which contradicts the consistency of T .
Suppose that T ⊢ ¬ϕ. Thus T ⊢ provT (cϕ ) by the choice of ϕ. Since provT (c¬ϕ ) is
a true Σ1 -sentence, ZFFin ⊢ provT (c¬ϕ ) and thus T ⊢ provT (c¬ϕ ). But T ⊢ provT (cϕ )
and T ⊢ provT (c¬ϕ ) imply by tautologies that T ⊢ provT (cϕ∧¬ϕ ) and hence T ⊢ ¬con(T ). This
contradicts the assumption that T is ω-consistent. 
Given the previous theorem, the question arises: is it possible that T is consistent
and T ⊢ ¬con(T )? Gödel’s second incompleteness theorem implies that this is indeed
possible. An example of such a theory is T = ZFFin + ¬con(ZFFin ). What does a model M
of such a theory T look like? M will have non-standard natural numbers, and proofs of
non-standard length. From the viewpoint of M, there exists a finite proof of ⊥. But from
the outside, we can see that the proof is an infinite object and thus not really a proof of
⊥.

4.4. An analysis of Gödel’s sentence. Suppose that T is a consistent theory extending
ZFFin . In the arguments above, a Σ1 sentence ϕ with
ZFFin ⊢ ϕ ←→ ¬provT (cϕ )
was used to show that a given extension of ZFFin is incomplete. We will say Gödel’s
sentence when we mean any sentence with this property and write ϕT . The concrete
sentence above was ¬provT (sub(x, x)).
We now know that T is incomplete. We would like to analyse Gödel’s sentence, in
particular we ask:
• Is ϕT true?
• What else can we say about ϕT ?
We will see that ϕT is true, but not provable in T . (Incompleteness of T follows again,
using that if ϕ were true, then it would be provable in ZFFin .) Moreover, we will use the
analysis of ϕT to prove that T cannot prove its own consistency. This is Gödel’s second
incompleteness theorem.
Lemma 4.4.1. ϕT is not provable in T .
Proof. Write ϕ for ϕT . Suppose that T ⊢ ϕ. Then (Fin, ∈) |= provT (cϕ ). Since true
Σ1 -sentences are provable in ZFFin , we have that ZFFin ⊢ provT (cϕ ). Since ϕ is a Gödel
sentence for T , this implies ZFFin ⊢ ¬ϕ. This contradicts the assumption, since T is consistent. 
In the previous proof, we had the chain of implications:
T ⊢ ϕ ⇒ (Fin, ∈) |= provT (cϕ ) ⇒ ZFFin ⊢ provT (cϕ ) ⇒ ZFFin ⊢ ¬ϕ.
Thus T ⊢ ϕ yields a contradiction. We will show below that the formalised version
provT (cϕ ) → provT (⊥) (equivalent to provT (cϕ ) → ¬con(T )) of this implication is provable in ZFFin .
Lemma 4.4.2. For formulas ψ and θ, the following sentences are provable in ZFFin :
(1) (provT (cψ ) ∧ provT (cθ )) → provT (cψ∧θ ).
(2) (provT (cψ ) ∨ provT (cθ )) → provT (cψ∨θ ).
(3) (provT (cψ ) ∧ provT (cψ→θ )) → provT (cθ ).
Proof. (1) ZFFin proves the stronger statement ∀ψ, θ (provT (ψ) ∧ provT (θ) → provT (ψ ∧
θ)). This is because the concatenation of proofs of ψ and θ with the additional formula
ψ ∧ θ yields a proof of ψ ∧ θ. This is provable in any theory in which concatenations of
arbitrary finite functions exist, for instance in ZFFin . (2) and (3) are similar. 
Lemma 4.4.3. For every Σ1 -sentence ϕ, ZFFin proves ϕ → provT (cϕ ).
Proof. The proofs of Lemmas 4.2.8 and 4.2.9 are inductive and thus work in ZFFin . So
we have the stronger fact: ZFFin proves that for every Σ1 -sentence θ, θ → provT (cθ )
holds. 
Let con(T ) denote the sentence ¬provT (⊥) stating that T is consistent, i.e. that there
is no proof of a contradiction from T . Note that this sentence is Π1 . It is not necessarily
Σ1 . If it were Σ1 , then if true, it would be provable in ZFFin and thus in any extension
T . Hence T ⊢ con(T ) would hold for any such theory. This contradicts Gödel’s second
incompleteness theorem.
Theorem 4.4.4. (Gödel’s second incompleteness theorem) Suppose that T is a consistent
extension of ZFFin that is Σ1 -definable over (Fin, ∈). Then T ⊬ con(T ).
Proof. The statement of Lemma 4.4.1 is provable in ZFFin by Lemmas 4.4.2 and 4.4.3.
Thus
ZFFin ⊢ provT (cϕ ) → ¬con(T ).
Now suppose towards a contradiction that T ⊢ con(T ). Since ZFFin ⊢ con(T ) →
¬provT (cϕ ), we have T ⊢ ¬provT (cϕ ). Since ZFFin ⊢ ϕ ←→ ¬provT (cϕ ) by the choice of
ϕ, we have T ⊢ ϕ. But this contradicts Lemma 4.4.1. 
Lemma 4.4.5. Suppose that T is a consistent extension of ZFFin that is Σ1 -definable
over (Fin, ∈) and ϕT is a Gödel sentence for T . Then T ⊢ ϕT ←→ con(T ).
Proof. One direction was proved in the previous proof. The other direction is a simple
exercise. 
Remark 4.4.6. The proof of the second incompleteness Theorem 4.4.4 also works if
instead of provT we define ϕT by using a formula ψ representing provability as in Rosser’s
proof, provided that the conditions in Lemmas 4.4.2 and 4.4.3 are satisfied. Under these assumptions,
ZFFin ⊢ con(T ) → ϕ for Rosser’s sentence ϕ.
The proof of Lemma 4.4.5 fails for ϕ if T is a theory with T ⊢ ¬con(T ). (Note that for
any extension S of ZFFin as above, T = S ∪ {¬con(S)} is consistent by Theorem 4.4.4 and
T has this property.)
It has been a well known open question for decades whether Rosser’s sentences have a
simple characterisation, in the same way that Gödel’s sentence is provably equivalent to
con(T ). This would imply that all Rosser sentences are provably equivalent. Solovay and
Guaspari (1979) showed that for a certain modification of provT , all Rosser sentences are
provably equivalent, but this is open for the Rosser sentence for provT itself.
4.5. Incompleteness of extensions of Peano arithmetic.
Lecture 23 (07. July)
We already considered the notion of definable subsets of structures. We now look at
the situation that one can define a structure in another one and this is provable with
respect to a theory.
Definition 4.5.1.
(1) A structure M = (M, F) is called interpretable in a structure N = (N, G) if there
exists a structure M′ = (M′ , F′ ) isomorphic to M such that for some k ∈ N:
(a) M′ is a subset of N^k definable over N .
(b) Each R ∈ F′ is a subset of N^{k·l} definable over N for the appropriate l ∈ N.
If k = 1, then M is called definable in N .
(2) An L-theory S is called interpretable in a K-theory T if for some k ∈ N, there are
K-formulas ϕ(~x) and ψs (~x) for every s ∈ L such that for Mϕ = {~x ∈ M^k | ϕ(~x)},
Rψs = {~x | ψs (~x)} and G = ⟨Rψs | s ∈ L⟩:
T ⊢ (Mϕ , G) |= S.
More precisely, for each sentence θ ∈ S, T proves that θ is true in this structure.
If k = 1, then S is called definable in T .
An example for interpretability of structures is that the complex field (C, 0, 1, +, ·) is
interpretable in the real field (R, 0, 1, +, ·). Conversely, one can show that (R, 0, 1, +, ·) is
not interpretable in (C, 0, 1, +, ·).
Interpretability of theories in (2) above states that for all models N = (N, G) of T , one
can define a model of S as a subset of N k for some k ∈ N, and this works uniformly, i.e.
with fixed formulas.
It is clear that PA is definable in ZFFin . We will show that ZFFin is definable in PA. In
fact, the domain of the model of ZFFin will be the same as the model of PA. Write ∈PA
for the ∈-relation defined in PA that we will define. The set of ∈PA -formulas is defined by
replacing ∈ by ∈PA recursively in ∈-formulas, so this set is ∆1 -definable over (Fin, ∈).
We explain why the first incompleteness theorem for extensions of ZFFin that are ∆1 -definable
over (Fin, ∈) (Theorem 4.3.2) implies incompleteness of any consistent extension T of PA
that is ∆1 -definable over (Fin, ∈). Towards a contradiction, suppose that T is complete. For any
∈-formula ϕ, let ϕPA denote the ∈PA -formula obtained by replacing ∈ by ∈PA everywhere in
ϕ. Let
T∈ = {ϕ | ϕ is an ∈-formula with T ⊢ ϕPA }.
Since T is a complete LArith -theory, T∈ is a complete L∈ -theory. T∈ is Σ1 -definable. But
this contradicts the first incompleteness Theorem 4.3.2.
We now work towards the definition of ∈PA . It can be shown that one cannot define
multiplication · in (N, +). Our aim is to show Gödel’s result that one can define
exponentiation exp(m, n) = m^n in PA. An intuition why this should be possible is that one
can easily define very fast growing functions in PA, for example the product
f (n) = ∏_{p prime, p ≤ n} p
of all primes ≤ n.
Lemma 4.5.2.
(1) (Division with remainder) PA proves ∀m, n (0 < m < n → ∃k, l (n = k · m + l ∧ l < m)).
(2) PA proves: for all primes with p|m · n, we have p|m or p|n.
Proof. The proofs are by induction and are left as exercises. 

The idea for defining exponentiation is to code finite sequences by natural numbers.
This allows recursive definitions, thus we can define exponentiation from multiplication,
iterated exponentiation from exponentiation etc.
The following definition codes pairs of natural numbers by a single natural number.
Definition 4.5.3. Define p(a, b) = (a + b)^2 + a + 1.
Lemma 4.5.4. If p(a, b) = p(a′ , b′ ), then a = a′ and b = b′ .
Proof. Note that a + b < a′ + b′ implies that p(a, b) ≤ (a + b + 1)^2 < p(a′ , b′ ). Therefore,
p(a, b) = p(a′ , b′ ) implies that a + b = a′ + b′ . Then a = a′ and b = b′ follow. 
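The pairing function can be checked on small inputs. The following Python sketch uses the hypothetical names pair and unpair; the decoding step relies on the observation from the proof that (a + b)^2 < p(a, b) ≤ (a + b + 1)^2, so a + b is the integer square root of p(a, b) − 1.

```python
from math import isqrt

def pair(a: int, b: int) -> int:
    """The pairing function p(a, b) = (a + b)^2 + a + 1."""
    return (a + b) ** 2 + a + 1

def unpair(c: int):
    """Recover (a, b) from c = p(a, b); assumes c is in the range of p."""
    s = isqrt(c - 1)      # s = a + b, since s^2 <= c - 1 < (s + 1)^2
    a = c - 1 - s * s
    return a, s - a

print(pair(2, 3))         # 28
print(unpair(28))         # (2, 3)
```

Note that p is injective but not surjective; for instance 0 is not in its range.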
Lemma 4.5.5. There is a definable partial function ρ : N^2 → N such that for all m ≥ 1
and all 0 < i < j ≤ m, the numbers ρ(m, i) and ρ(m, j) are mutually prime.
Proof. Suppose that b ≥ 1 is least such that ∀0 < i ≤ m i|b. Define ρ(m, i) = b · i + 1.
Suppose that p is a prime with p | b · i + 1 and p | b · j + 1, where 0 < i < j ≤ m. Then p ∤ b.
Moreover, we have p | (b · j + 1) − (b · i + 1) = b · (j − i). By Lemma 4.5.2, p | b or p | j − i. If p | j − i, then
p | b by the definition of b, since 0 < j − i ≤ m. Hence p | b in either case. However, we showed that p ∤ b. 
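A short numerical check of this construction, as a Python sketch: the least b ≥ 1 divisible by all 0 < i ≤ m is the least common multiple of 1, . . . , m. The function names least_common_multiple and rho are hypothetical.

```python
from functools import reduce
from math import gcd

def least_common_multiple(m: int) -> int:
    """The least b >= 1 that is divisible by every i with 0 < i <= m."""
    return reduce(lambda x, y: x * y // gcd(x, y), range(1, m + 1), 1)

def rho(m: int, i: int) -> int:
    """rho(m, i) = b * i + 1 for b as above."""
    return least_common_multiple(m) * i + 1

# For m = 3 we get b = 6 and the pairwise coprime values 7, 13, 19.
print([rho(3, i) for i in range(1, 4)])   # [7, 13, 19]
```

The values need not be prime (for m = 6 one gets ρ(6, 2) = 121), only pairwise coprime.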

We use the previous lemma to code finite sets and finite sequences.
Someone remarked correctly during the lecture, the following lemma is not formulated in LArith . I
will correct this.
More precisely, the lemma should state that for any c, m and any k > m, one can find some c0 that
“outputs” the same values as c up to m, and outputs the only additional value k up to k.

Lemma 4.5.6.
(1) There is a definable set R ⊆ N^3 such that for all m and all S ⊆ {0, . . . , m}, there is
some c such that for all i, R(c, m, i) ⇔ i ∈ S.
(2) (A variant of Gödel’s β-function) There is a definable function β : N^3 → N such that
for all m and all c0 , . . . , cm , there is some c such that ∀i ≤ m β(c, m, i) = ci .
Proof. (1) We define c = ∏_{i∈S} ρ(m, i). Then ∀i ≤ m (i ∈ S ⇔ ρ(m, i)|c). We define
R(c, m, i) ⇔ ρ(m, i)|c.
(2) Apply (1) to {p(i, ci ) | i ≤ m}. 
One needs to formulate the previous lemmas appropriately in LArith . Then they are
provable in PA.
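The coding of finite sets by products can be tried out directly. The following Python sketch uses the hypothetical names rho, code_set and member. It restricts to sets S ⊆ {1, . . . , m}, which is an assumption of the sketch: for i = 0 the value ρ(m, 0) = 1 divides every number, so the index 0 would need separate treatment.

```python
from functools import reduce
from math import gcd

def rho(m: int, i: int) -> int:
    """rho(m, i) = b * i + 1, where b is the least common multiple of 1, ..., m."""
    b = reduce(lambda x, y: x * y // gcd(x, y), range(1, m + 1), 1)
    return b * i + 1

def code_set(m: int, S: set) -> int:
    """Code S as the product of rho(m, i) for i in S (assumes S ⊆ {1, ..., m})."""
    c = 1
    for i in S:
        c *= rho(m, i)
    return c

def member(c: int, m: int, i: int) -> bool:
    """The relation R(c, m, i): rho(m, i) divides c."""
    return c % rho(m, i) == 0

c = code_set(6, {1, 4, 6})
print(sorted(i for i in range(1, 7) if member(c, 6, i)))   # [1, 4, 6]
```

Decoding works because the numbers ρ(m, i) for 0 < i ≤ m are pairwise coprime and greater than 1, so ρ(m, i) divides the product exactly when i ∈ S.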
Lemma 4.5.7. Exponentiation exp(m, n) = m^n is definable in PA.
Proof. More precisely, the claim is that one can define a function exp with exp(m, 0) = 1
and exp(m, n + 1) = exp(m, n) · m for all m, n.
We want to define exp(m, n) = k informally by the statement: There are c1 , . . . , cn with
c1 = m, for all 1 ≤ i < n, we have ci+1 = m · ci and cn = k. Formally, ∃l, c (β(c, l, 1) =
m ∧ ∀1 ≤ i < n β(c, l, i + 1) = m · β(c, l, i) ∧ β(c, l, n) = k). 
Note that one can show that exp is ∆1 -definable.
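To see the definition of exp at work, the following Python sketch codes the sequence m, m^2, . . . , m^n by a single number and then checks the defining formula against that code. It is a sketch under assumptions: the names pair, rho, encode, beta, witness and satisfies are hypothetical, indices start at 1, and β is implemented by coding the set of pairs p(i, ci ) as a product as in Lemma 4.5.6.

```python
from functools import reduce
from math import gcd

def pair(a, b):
    return (a + b) ** 2 + a + 1

def rho(m, i):
    b = reduce(lambda x, y: x * y // gcd(x, y), range(1, m + 1), 1)
    return b * i + 1

def encode(seq):
    """Code c_1, ..., c_n as (c, l) such that beta(c, l, i) == seq[i - 1]."""
    codes = [pair(i, v) for i, v in enumerate(seq, start=1)]
    l = max(codes)
    c = 1
    for k in codes:
        c *= rho(l, k)
    return c, l

def beta(c, l, i):
    """Least v with pair(i, v) <= l and rho(l, pair(i, v)) | c, or None."""
    v = 0
    while pair(i, v) <= l:
        if c % rho(l, pair(i, v)) == 0:
            return v
        v += 1
    return None

def witness(m, n):
    """Build the code of m, m^2, ..., m^n using multiplication only (n >= 1)."""
    seq = [m]
    for _ in range(n - 1):
        seq.append(seq[-1] * m)
    return encode(seq)

def satisfies(c, l, m, n, k):
    """The matrix of the defining formula of exp(m, n) = k for given c and l."""
    return (beta(c, l, 1) == m
            and all(beta(c, l, i + 1) == m * beta(c, l, i) for i in range(1, n))
            and beta(c, l, n) == k)

c, l = witness(2, 3)
print(satisfies(c, l, 2, 3, 8))   # True
print(satisfies(c, l, 2, 3, 9))   # False
```

The codes grow very quickly (here c has well over a hundred decimal digits), so this is an illustration of the idea rather than a practical algorithm.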
Using exp, we now define
m ∈PA n ⇔ ∃k, r [k ≥ 0 ∧ r < exp(2, m) ∧ n = (2 · k + 1) · exp(2, m) + r].
Lemma 4.5.8. ∈PA satisfies the axioms and schemes of ZFFin .
Proof. See Problem 52 on sheet 12. 
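The relation ∈PA says that the m-th binary digit of n is 1. The following Python sketch compares this reading with a brute-force search for k and r as in the defining formula; the names in_pa, in_pa_by_formula and elements are hypothetical, and the search allows k = 0.

```python
def in_pa(m: int, n: int) -> bool:
    """m ∈PA n iff the m-th binary digit of n is 1."""
    return (n >> m) & 1 == 1

def in_pa_by_formula(m: int, n: int) -> bool:
    """Search for k, r with r < 2^m and n = (2k + 1) * 2^m + r."""
    return any(
        n == (2 * k + 1) * 2 ** m + r
        for k in range(n + 1)
        for r in range(min(2 ** m, n + 1))
    )

def elements(n: int) -> list:
    """List all m with m ∈PA n."""
    return [m for m in range(n.bit_length()) if in_pa(m, n)]

print(elements(11))   # [0, 1, 3], since 11 = 2^3 + 2^1 + 2^0
```

Under this reading 0 codes the empty set, 1 codes {∅}, 2 codes {{∅}} and 3 codes {∅, {∅}}.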
In arithmetic, one defines bounded quantifiers of the form ∀x ≤ y and ∃x ≤ y. This
leads to a hierarchy of Σn , Πn formulas and ∆n sets. Using the above interpretations,
one can show that the Σn -formulas of set theory are translated precisely to Σn -formulas
of arithmetic and vice versa.
The first incompleteness theorem applies to all axiom systems listed by an algorithm.
This is because one can show that all sets listed by an algorithm are Σ1 -definable.
The Σ1 -definition states that there exists a run of the algorithm that outputs the given
number n at some stage.
Conversely, any Σ1 -definable subset of (N, +, ·) can be listed by an algorithm simply by
searching for a witness to the Σ1 -statement and then verifying the formula by running finite
checks.
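The search for witnesses can be sketched as follows in Python. The generator enumerate_sigma1 is a hypothetical helper: it takes a decidable check(n, w), standing for the bounded part of a Σ1-definition ∃w ϕ(n, w), and dovetails over all pairs (n, w) so that every n with a witness is eventually listed.

```python
from itertools import count, islice

def enumerate_sigma1(check):
    """Yield every n such that check(n, w) holds for some witness w."""
    seen = set()
    for bound in count():
        # Visit all pairs (n, w) with n + w == bound, so no pair is skipped.
        for n in range(bound + 1):
            w = bound - n
            if n not in seen and check(n, w):
                seen.add(n)
                yield n

# Example: the set of perfect squares, defined by ∃w (w · w = n).
squares = enumerate_sigma1(lambda n, w: w * w == n)
print(list(islice(squares, 4)))   # [0, 1, 4, 9]
```

The generator never terminates on its own and gives no information about numbers that have not appeared yet, which reflects that Σ1-definable sets are listable but not necessarily decidable.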
This chapter completes the solution of Hilbert’s program at the end of Section 1: (5)
is false by (the proof of) Gödel’s first incompleteness theorem; (3) and (4) are false by
Gödel’s second incompleteness theorem.
5. Complete theories
Lecture 24 (12. July)
Contrary to the theories studied above, many interesting theories are actually complete.
For instance, the theory of algebraically closed fields of fixed characteristic is complete.
In this section, we study two techniques to prove completeness of a theory: quantifier
elimination and categoricity.
5.1. Quantifier elimination. Suppose that T is an L-theory and ϕ(~x), ψ(~x) are L-
formulas. We say that ϕ and ψ are equivalent modulo T and write ϕ ∼T ψ if T ` ϕ ↔ ψ.
Definition 5.1.1. An L-theory T has quantifier elimination if for every L-formula ϕ(~x),
there is a quantifier-free L-formula ψ(~x) ∼T ϕ(~x).
Note that quantifier elimination depends on the choice of the language. In fact, one
can extend any theory to a theory in an extended language with quantifier elimination.
We now want to see that it is sufficient to show absoluteness for a very restricted class
of existential formulas.
Definition 5.1.2. A formula ψ is called simple existential if it is of the form ∃xϕ for some
quantifier-free formula ϕ. If ϕ is moreover a conjunction of basic formulas (i.e. formulas
of the form ψ or ¬ψ, where ψ is atomic), then ψ is called primitive existential.
The next lemma shows that primitive existential formulas are sufficient.
Lemma 5.1.3. An L-theory T has quantifier elimination if and only if every primitive
existential formula is equivalent modulo T to a quantifier-free formula.
Proof. Suppose that this condition holds. We show quantifier elimination by induction on
formulas. The cases ∧, ∨ and ¬ are obvious. Suppose that ∃xϕ(x) is an L-formula. We can
assume by the inductive hypothesis that ϕ(x) is quantifier-free. Therefore, we can assume
that ϕ is in disjunctive normal form, i.e. ϕ = ⋁_{i≤n} ϕi , where each ϕi is a conjunction of basic
formulas (see Definition 5.1.2). Then ∃xϕ(x) is equivalent to the formula ⋁_{i≤n} ∃xϕi (x).
By the assumption, this is equivalent modulo T to a quantifier-free formula. 


Example 5.1.4. The theory DLO of dense linear orders without end points consists of
the axioms:
(1) ∀x ¬(x < x)
(2) ∀x, y, z (x < y ∧ y < z → x < z)
(3) ∀x, y, z (x < y ∨ y < x ∨ x = y)
(4) ∀x, z (x < z → ∃y x < y < z)
(5) ∀x ∃y, z (x < y ∧ z < x)
Definition 5.1.5. From now on, we will use the logical constants ⊤, ⊥. The definition
of |= is extended so that ⊤ is always true and ⊥ is always false. Moreover, the definition
of ⊢ is extended so that ⊤ is provable.
Lemma 5.1.6. DLO satisfies quantifier elimination.
Proof. We first show that negated atomic formulas are not needed. To see this, note that
in DLO, x ≠ y ↔ x < y ∨ y < x, and ¬(x < y) ↔ y < x ∨ x = y. So in DLO, all quantifier-free
formulas are equivalent to formulas in disjunctive normal form built only from atomic
formulas.
Suppose that ϕ(~x, y) is a conjunction of atomic formulas. We can write ϕ(~x, y) as ϕ0 (~x) ∧
ϕ1 (~x, y), where each of ϕ0 and ϕ1 is a conjunction of atomic formulas and y appears in
all atomic formulas in ϕ1 .
If y < y appears in ϕ1 , then ∃y ϕ(~x, y) ∼DLO ⊥. If y = xi appears in ϕ1 , then
∃y ϕ(~x, y) ∼DLO ϕ(~x, xi ). So assume otherwise.
Then only formulas of the form y = y, y < xi and xi < y appear in ϕ1 . We can
omit all formulas of the form y = y. If only formulas of the form xi < y appear, then
∃y ϕ1 (~x, y) ∼DLO ⊤, since there is no largest element, and hence ∃y ϕ(~x, y) ∼DLO ϕ0 (~x).
The case that only formulas of the form y < xi appear is symmetric.
So assume otherwise.
We define θ(~x) = ⋀{xi < xj | xi < y and y < xj appear in ϕ1 }. We claim that
∃y ϕ(~x, y) ∼DLO ϕ0 (~x) ∧ θ(~x). If ∃y ϕ(~x, y) holds, then ϕ0 (~x) holds and θ(~x) holds by
transitivity. Conversely, suppose that ϕ0 (~x) ∧ θ(~x) holds. Let xi be maximal such that
xi < y appears in ϕ1 and xj minimal such that y < xj appears in ϕ1 . Then xi < xj by θ(~x).
By density, there is some y with xi < y < xj , and any such y witnesses ∃y ϕ(~x, y). 
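The key case of this proof, in which y is bounded from below by some xi and from above by some xj , can be tested in (Q, <). The following Python sketch uses the hypothetical names eliminate, theta_holds and find_witness; variables are referred to by their indices, and only formulas of the forms xi < y and y < xj are considered.

```python
from fractions import Fraction
from itertools import product

def eliminate(lowers, uppers):
    """Pairs (i, j) standing for x_i < x_j; their conjunction is theta."""
    return [(i, j) for i in lowers for j in uppers]

def theta_holds(pairs, xs):
    """Evaluate the quantifier-free formula theta at the tuple xs."""
    return all(xs[i] < xs[j] for i, j in pairs)

def find_witness(lowers, uppers, xs):
    """Return a rational y with x_i < y < x_j for all given bounds, or None."""
    lo = max((xs[i] for i in lowers), default=None)
    hi = min((xs[j] for j in uppers), default=None)
    if lo is None and hi is None:
        return Fraction(0)
    if hi is None:
        return Fraction(lo) + 1          # no largest element
    if lo is None:
        return Fraction(hi) - 1          # no least element
    return Fraction(lo + hi, 2) if lo < hi else None   # density

# The formula ∃y (x0 < y ∧ x1 < y ∧ y < x2) versus x0 < x2 ∧ x1 < x2.
lowers, uppers = [0, 1], [2]
agree = all(
    theta_holds(eliminate(lowers, uppers), xs)
    == (find_witness(lowers, uppers, xs) is not None)
    for xs in product(range(3), repeat=3)
)
print(agree)   # True
```

The three branches of find_witness correspond to the axioms that there are no end points and that the order is dense.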

Quantifier elimination is a useful tool to prove completeness. For instance, a theory
with quantifier elimination is complete if it has a prime structure.
Definition 5.1.7. Suppose that T is an L-theory and M is a model of T .
(a) M is a prime structure for T if it can be embedded into every model of T .
(b) M is a prime model for T if it can be elementarily embedded into every model of T .
If a theory T has quantifier elimination and a prime structure M, then M is a prime
model by absoluteness of quantifier-free formulas. Then M ≺ N holds for any model N
of T , so T is complete.
Example 5.1.8. (Q, <) is a prime model of DLO. To see this, note that it is easy to
show that (Q, <) is embeddable into any dense linear order without end points. In fact,
a simple back-and-forth construction shows that (Q, <) is isomorphic to any countable
dense linear order without end points.
We now see a useful model theoretic criterion for quantifier elimination.
The atomic diagram Diag(M) of an L-structure M = (M, . . . ) is defined as the set
of basic LM -formulas that are true in M. In the following proof, we will further write
~x = (x0 , . . . , xn ), ~a = (a0 , . . . , an ).
The following is a useful test for quantifier elimination. Moreover, Lemma 5.1.3 shows
that it is sufficient to check condition (b) only for primitive existential formulas (see
Definition 5.1.2).
(Margin note: These two paragraphs are new. They explain how quantifier-free formulas behave with respect to substructures.)
Recall that for L-structures M = (M, F) and N = (N, G) with M ⊆ N , M is a
substructure of N if and only if the constant symbols have the same interpretations in
M and N , and the functions and relations of M are obtained by restricting those of N .
Thus by definition, M is a substructure of N if and only if the truth of atomic formulas
is the same in M and N .
Any quantifier-free statements are obtained by combining atomic formulas using ∧, ∨
and ¬. By induction on formulas, for any substructure M of a structure N , the truth of
any quantifier-free formula is the same in M and N .
Lemma 5.1.9. If M = (M, . . . ) is an L-structure, T is an L-theory and ϕ(x0 , . . . , xn )
is an L-formula, then the following statements are equivalent.
(a) There is a quantifier-free L-formula ψ(~x) with T |= ∀~x(ϕ(~x) ↔ ψ(~x)).
(b) If M = (M, . . . ) and N = (N, . . . ) are models of T and A = (A, . . . ) is a substructure
of both M and N , then M |= ϕ(a0 , . . . , an ) ⇔ N |= ϕ(a0 , . . . , an ) for all a0 , . . . , an ∈
A.
Proof. The first implication follows from the fact that quantifier-free statements about
elements of A are absolute between A, M and N . We now assume that (b) holds. If
T |= ∀~xϕ(~x), then let ψ = ⊤ (the true statement) and if T |= ∀~x¬ϕ(~x), then let ψ = ⊥
(the false statement). We can thus assume that both T ∪ {ϕ(~x)} and T ∪ {¬ϕ(~x)} are
consistent. We choose new constants d0 , . . . , dn and write d~ = (d0 , . . . , dn ). Let
Γ(~x) = {ψ(~x) | ψ(~x) ∈ FormL is quantifier-free and T ∪ {ϕ(~x)} |= ψ(~x)}
denote the set of quantifier-free consequences of T ∪ {ϕ(~x)}.
Claim 5.1.10. T ∪ Γ(d~) |= ϕ(d~).
Proof. Assuming otherwise, there is a model M of T ∪ Γ(d~) ∪ {¬ϕ(d~)}. Let A be the
substructure of M that is generated by d~^M = (d0^M , . . . , dn^M ). We now show that the
theory Σ = T ∪ Diag(A) ∪ {ϕ(d~)} is consistent. Assuming that it is inconsistent, there are
ψ0 (d~), . . . , ψm (d~) ∈ Diag(A) such that T |= ⋀_{i≤m} ψi (d~) → ¬ϕ(d~) and hence T |= ϕ(d~) →
⋁_{i≤m} ¬ψi (d~). So ⋁_{i≤m} ¬ψi (d~) ∈ Γ(d~). Since M |= Γ(d~) and Γ(d~) only contains quantifier-
free formulas, we have A |= Γ(d~). So A |= ⋁_{i≤m} ¬ψi (d~) and thus there is some i ≤ m
with A |= ¬ψi (d~). But this contradicts the fact that ψi (d~) ∈ Diag(A). Since we have now
shown that Σ is consistent, let N be a model of Σ. Then N is a model of T ∪ {ϕ(d~)}
and we can hence assume that it contains A as a substructure. Since M is a model of
T ∪ {¬ϕ(d~)} that contains A as a substructure and d~^M = d~^N ∈ A, this contradicts our
assumption (b). 
By the completeness theorem, there are finitely many sentences θ0 (d~), . . . , θk (d~) ∈ Γ(d~)
such that T ∪ {θ(d~)} |= ϕ(d~) for θ(d~) = ⋀_{i≤k} θi (d~). Then T |= θ(d~) ↔ ϕ(d~) and hence T |=
∀~x(θ(~x) ↔ ϕ(~x)). 
Lecture 25 (14. July)
We now show quantifier elimination for vector spaces and algebraically closed fields.
The language LV (K) of vector spaces over a field K consists of the language LAddGroup =
{0, +} of additive groups with an additional function symbol fλ for scalar multiplication
with each λ ∈ K.
Lemma 5.1.11. The theory of infinite-dimensional vector spaces over a fixed field K has
quantifier elimination.
Proof. Suppose that V0 and V1 are K-vector spaces of infinite dimension that both contain
a K-vector space V . Suppose that ψ = ∃xϕ(x, x0 , . . . , xn ) is a simple existential formula
that holds in V0 for some a0 , . . . , an ∈ V , witnessed by some a ∈ V0 . If a ∈ V then ψ
holds in V1 , so suppose that a ∈ V0 \ V .
First suppose that V ⊊ V1 and let b ∈ V1 \ V . Let W0 = ⟨V ∪ {a}⟩V0 and W1 =
⟨V ∪ {b}⟩V1 denote the subspaces generated by V ∪ {a} in V0 and by V ∪ {b} in V1 ,
respectively. Pick an isomorphism f : W0 → W1 with f↾V = id and f (a) = b. Since
W0 |= ϕ(a, a0 , . . . , an ) and isomorphisms preserve truth, we have W1 |= ϕ(b, a0 , . . . , an )
and hence V1 |= ∃xϕ(x, a0 , . . . , an ).
(Margin note: This case was missing in the lecture.)
If V = V1 , then V has infinite dimension. Let V′ ⊊ V be a subspace with a0 , . . . , an ∈
V′ . Applying the previous argument to V′ instead of V yields V1 |= ∃xϕ(x, a0 , . . . , an ). 
One can also show this for arbitrary vector spaces over a fixed infinite field with a
similar argument.
For both finite and infinite fields K, it is easy to see that the theory of infinite K-vector
spaces has a prime model. Therefore it is complete.
Let ACFp denote the theory of algebraically closed fields of characteristic p. The next
result uses some facts from algebra about the existence and uniqueness of algebraic clo-
sures.
Theorem 5.1.12. For any prime p or p = 0, the theory ACFp has quantifier elimination.
Proof. Suppose that L and M are algebraically closed fields of characteristic p and R
is a substructure of both, i.e. a subring. Then the quotient fields of R in L and M
are isomorphic and hence we can assume that they are equal and denote this field by K.
Moreover the proof of uniqueness of algebraic closures shows that there is an isomorphism
between the algebraic closures of K in L and M that is the identity on K. We can thus
assume that there is an algebraic closure K̄ of K that is contained in both L and M .
We now assume that some primitive existential formula $\exists x\, \varphi(x, x_0, \dots, x_n)$ holds in $L$ for some $a_0, \dots, a_n \in R$. Moreover assume that this is witnessed by some $a \in L$. If $a \in \bar K$, then $\varphi(a, a_0, \dots, a_n)$ holds in $M$, since $\varphi$ is quantifier-free and hence absolute. We can thus assume that $a \notin \bar K$. Suppose that $\varphi(x) = (\bigwedge_{i<k} f_i(x) = 0) \wedge (\bigwedge_{j<l} g_j(x) \neq 0)$, where $f_i$ and $g_j$ are polynomials over $R$. Then $f_i(a) = 0$ for all $i < k$. Since $a$ is not algebraic over $K$, each $f_i$ is the zero polynomial.
Since $g_j \neq 0$ in $R[X]$ for all $j < l$, the polynomial $x \cdot \prod_{j<l} g_j(x) + 1$ is not constant and hence has a root $b$ in $M$. Hence $g_j(b) \neq 0$ for all $j < l$ and thus $M \models \varphi(b, a_0, \dots, a_n)$.
Instead of arguing as in the last paragraph, one can also note that the set defined by $\varphi(x)$ is cofinite, i.e. its complement is finite. Note that any algebraically closed field $K$ is infinite, since for finite $K = \{a_0, \dots, a_n\}$ the polynomial $1 + \prod_{i \le n} (X - a_i)$ does not have roots in $K$. Therefore $M \models \exists x\, \varphi(x, a_0, \dots, a_n)$, as required. □
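The remark that a finite field cannot be algebraically closed can be checked by brute force in small cases. The following is a minimal sketch, assuming the finite field is the prime field Z/pZ; the function names `shifted_product_values` and `has_root` are hypothetical and chosen only for this illustration.

```python
def shifted_product_values(p):
    """Values of h(X) = 1 + prod_{a in F_p} (X - a) at every point of F_p = Z/pZ.

    Assumption: p is prime, so that Z/pZ is a field.
    """
    field = range(p)
    values = []
    for x in field:
        prod = 1
        for a in field:
            # One factor (x - a) vanishes, namely the one with a == x.
            prod = (prod * (x - a)) % p
        values.append((1 + prod) % p)
    return values


def has_root(p):
    """True if h has a root in F_p (this never happens for a prime p)."""
    return 0 in shifted_product_values(p)


if __name__ == "__main__":
    for p in (2, 3, 5, 7, 11):
        print(p, shifted_product_values(p), has_root(p))
```

Every value equals 1, since the product always contains the factor x - x = 0. So h is a non-constant polynomial without roots in the field.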


By the uniqueness of the algebraic closure up to isomorphism, ACFp has a prime struc-
ture. Since it has quantifier elimination, this is a prime model. So ACFp is complete.
Definition 5.1.13.
(1) A structure M = (M, F) is called minimal if every subset of M that is
definable over M with parameters is either finite or cofinite.
(2) A strongly minimal theory is a complete theory all models of which are minimal.
It is easy to see that quantifier elimination implies the following result.
Lemma 5.1.14. Every algebraically closed field is minimal.
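The step from quantifier elimination to minimality can be sketched as follows. This is only an outline for formulas in a single variable $x$ with parameters from $K$, and it is not taken from the lecture.

```latex
% Sketch: by quantifier elimination, a definable subset of K is defined by a
% Boolean combination of atomic formulas, and each atomic formula in x is
% equivalent to p(x) = 0 for some polynomial p in K[X].
\[
  \{ x \in K : p(x) = 0 \} \text{ is }
  \begin{cases}
    K & \text{if } p = 0, \\
    \text{finite, of size at most } \deg p & \text{if } p \neq 0 .
  \end{cases}
\]
% The finite and cofinite subsets of K are closed under complements, finite
% unions and finite intersections, so every definable subset of K is finite
% or cofinite.
```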
The ordered real field (R, 0, 1, +, ·, <) is not minimal, but it can be shown to satisfy
the following property.
Definition 5.1.15. Suppose that L is a language that contains a binary relation symbol
<.
(1) A structure M = (M, F) in which <^M is a strict linear order is called o-minimal [33]
if every subset of M that is definable over M with parameters is a finite union of
intervals and points. (An interval can be open or half-open, and can tend to ∞ or −∞.)
(2) An o-minimal theory is a theory all models of which are o-minimal.
o-minimality can be understood as a weak form of quantifier elimination. Moreover, in
the case of the ordered real field, the definable sets are the semialgebraic sets. So the
study of o-minimal structures generalises real algebraic geometry.
We now give some examples of using quantifier elimination for ACFp to obtain elegant
proofs of some results about polynomials.
Theorem 5.1.16. (Hilbert's Nullstellensatz) Suppose that $K$ is an algebraically closed field and $f_0, \dots, f_n \in K[X_0, \dots, X_k]$ are such that $I = (f_0, \dots, f_n)$ is a proper ideal in $K[X_0, \dots, X_k]$ (i.e. $1 \notin I$). Then there are $a_0, \dots, a_k \in K$ such that $f_i(a_0, \dots, a_k) = 0$ for all $i \le n$.
Proof. The trick is to construct an algebraically closed extension L̄ of K where such a
root (a0 , . . . , ak ) exists, and then use quantifier elimination to conclude that such a root
exists in K. Recall that K ≺ L̄ by quantifier elimination.
[33] o stands for order.
By Zorn's Lemma applied to the set of proper ideals of $K[X_0, \dots, X_k]$ which contain $I$, there is a maximal ideal $J$ of $K[X_0, \dots, X_k]$ containing $I$. Since $J$ is a maximal ideal, $L = K[X_0, \dots, X_k]/J$ is a field. Moreover we can identify $K$ with a subfield of $L$ by identifying $a \in K$ with $a + J$. For each $i \le n$ we have $f_i(X_0 + J, \dots, X_k + J) = f_i(X_0, \dots, X_k) + J = J$. The first equation holds by the definition of addition and multiplication in quotient rings and the second equation holds since $f_i(X_0, \dots, X_k) \in I \subseteq J$.
Therefore, the formula $\theta = \exists x_0, \dots, x_k \bigwedge_{i \le n} f_i(x_0, \dots, x_k) = 0$ is true in $L$ and hence also in its algebraic closure $\bar L$. The latter holds because the formula $\bigwedge_{i \le n} f_i(x_0, \dots, x_k) = 0$ remains true in all larger models, since it is quantifier-free. By quantifier elimination for ACF$_p$, $\theta$ is also true in $K \prec \bar L$. So $f_0, \dots, f_n$ have a common root $(a_0, \dots, a_k) \in K^{k+1}$, as required. □
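A small worked example may help to see what the theorem says and why the field has to be algebraically closed. The polynomials below are chosen only for illustration and do not appear in the lecture.

```latex
% Over K = \mathbb{C}: the ideal I = (f_0, f_1) below is proper, because every
% element of I vanishes at the point (1,1); accordingly there is a common root.
\[
  f_0 = X_0 X_1 - 1, \quad f_1 = X_0 - X_1, \qquad
  f_0(1,1) = f_1(1,1) = 0 .
\]
% Over the field \mathbb{R}, which is not algebraically closed, the conclusion
% can fail: (X_0^2 + 1) is a proper ideal of \mathbb{R}[X_0] without a root.
\[
  1 \notin (X_0^2 + 1) \subseteq \mathbb{R}[X_0], \qquad
  a^2 + 1 \geq 1 > 0 \ \text{ for all } a \in \mathbb{R} .
\]
```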
If K is an algebraically closed field, then a subset S of K^n is called constructible if it
is a finite Boolean combination of zero sets of polynomials in K[X0 , . . . , Xn−1 ] and their
complements.
By quantifier elimination, every definable subset of K^n is constructible (and conversely).
In more detail, any definable subset is definable by some formula generated by ∧ and ∨
from atomic formulas and negations of atomic formulas.
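Lemma 5.1.17. (Chevalley) If $K$ is an algebraically closed field, $S$ is a constructible subset of $K^m$ and $f \colon K^m \to K^n$ is a polynomial function, then $f(S)$ is constructible.
Proof. Suppose that $f$ is given by the polynomials $g_0, \dots, g_{n-1} \in K[X_0, \dots, X_{m-1}]$. Then $(a_0, \dots, a_{n-1}) \in f(S) \Leftrightarrow \exists x_0, \dots, x_{m-1} \in K\, \big( (x_0, \dots, x_{m-1}) \in S \wedge \bigwedge_{i<n} g_i(x_0, \dots, x_{m-1}) = a_i \big)$, where the condition $(x_0, \dots, x_{m-1}) \in S$ is expressed by a formula, since $S$ is constructible and hence definable. Thus $f(S)$ is definable and hence constructible. □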
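As an illustration of the lemma, consider the projection of a hyperbola. This example is not from the lecture; the set $S$ and the map $f$ are chosen here for concreteness.

```latex
% S = \{ (x,y) \in K^2 : xy = 1 \} is the zero set of XY - 1, and f(x,y) = x.
\[
  a \in f(S) \iff \exists x, y \, ( xy = 1 \wedge x = a ) \iff a \neq 0 .
\]
% So f(S) = K \setminus \{0\}. The quantifiers have been eliminated, and the
% image is the complement of a zero set. It is constructible, but it is not
% itself a zero set, since zero sets in K are finite or equal to K.
```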


We now define the model-theoretic version of algebraic closure and show that quantifier
elimination for ACF0 and ACFp implies that for these theories, the algebraic closures in
the sense of algebra and of model theory are equal.
Definition 5.1.18. Suppose that M = (M, F) is an L-structure, ϕ(x) is an L-formula
and A ⊆ M .
(a) Let ϕ(M) = {x ∈ M | M |= ϕ(x)}.
(b) ϕ is called algebraic over M if ϕ(M) is finite.
(c) An element x ∈ M is called algebraic over A if M |= ϕ(x) for some algebraic
LA-formula ϕ.
(d) The algebraic closure acl(A) = aclM (A) of A in M is the set of all algebraic elements
of M over A.
(e) A is algebraically closed in M if acl(A) = A.
If $A$ is a subset of an algebraically closed field $K$ of characteristic $p$, then $\mathrm{acl}_K(A)$ is equal to the standard algebraic closure of $A$ in $K$ (from algebra) by quantifier elimination for ACF$_p$. To see this, suppose that $a \in \mathrm{acl}(A)$ and $\varphi(x)$ is a formula with parameters in $A$ and only finitely many solutions including $a$. By quantifier elimination, we can assume that it is quantifier-free and replace it by a logically equivalent formula $\bigvee_{i \in I} \bigwedge_{j \in J_i} \varphi_{i,j}(x)$ with basic formulas $\varphi_{i,j}$. Then $a$ satisfies the formula $\psi(x) = \bigwedge_{j \in J_i} \varphi_{i,j}(x)$ for some $i \in I$, and the same is true when the basic inequalities $\varphi_{i,j}(x)$ are removed. This conjunction of polynomial equations can be rewritten as a single polynomial equation, showing that $a$ is in the algebraic closure in the usual sense (as defined in field theory).
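One standard way to rewrite a conjunction of polynomial equations in one variable as a single equation is the greatest common divisor: the common roots of f and g are exactly the roots of gcd(f, g). The lecture does not say which method is meant, so the following is only a sketch of this particular choice, over the rationals, with hypothetical helper names `trim`, `poly_mod` and `poly_gcd`. A polynomial is a list of coefficients, where entry i is the coefficient of X^i.

```python
from fractions import Fraction


def trim(poly):
    """Return a copy of poly without trailing zero coefficients."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly


def poly_mod(f, g):
    """Remainder of f on division by g (lists of Fractions, g nonzero)."""
    f, g = trim(f), trim(g)
    while len(f) >= len(g):
        factor = f[-1] / g[-1]
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[i + shift] -= factor * c   # cancels the leading term of f
        f = trim(f)
    return f


def poly_gcd(f, g):
    """Monic greatest common divisor over Q, by the Euclidean algorithm."""
    f = [Fraction(c) for c in trim(f)]
    g = [Fraction(c) for c in trim(g)]
    while g:
        f, g = g, poly_mod(f, g)
    return [c / f[-1] for c in f] if f else []


if __name__ == "__main__":
    # X^2 - 1 = 0 and X^2 - 3X + 2 = 0 hold together exactly when X - 1 = 0.
    print(poly_gcd([-1, 0, 1], [2, -3, 1]))
```

Exact arithmetic with `Fraction` is used so that the leading terms cancel exactly in each division step.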
Lecture 26
19. July
5.2. Categoricity.
Definition 5.2.1. If κ is an infinite cardinal, a theory T is called κ-categorical if T has
exactly one model of size κ up to isomorphism.
Lemma 5.2.2. (Vaught's test) Suppose that κ is an infinite cardinal, L is a language
with |L| ≤ κ and T is a consistent theory with no finite models. If T is κ-categorical, then
it is complete.
Proof. We show that any two models $M$ and $N$ of $T$ are elementarily equivalent. Since $M$ and $N$ are infinite and $|L| \le \kappa$, $\mathrm{Th}(M)$ and $\mathrm{Th}(N)$ have models $M'$ and $N'$ of size $\kappa$ by the Löwenheim-Skolem Theorem. Since $M'$ and $N'$ are models of $T$ of size $\kappa$, our assumption yields $M \equiv M' \cong N' \equiv N$. □
If a theory is ω-categorical, is it necessarily κ-categorical for uncountable cardinals κ?
The next theory is a simple counterexample.
Example 5.2.3. Consider the theory T of equivalence relations with exactly two classes,
both of which are infinite. The language is L = {E}, where E is a binary relation symbol.
The theory T consists of the axioms for equivalence relations together with the axioms
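$\exists x, y\ \neg E(x, y)$
$\forall x, y, z\ (E(x, y) \vee E(x, z) \vee E(y, z))$
$\varphi_n = \forall x_0, \dots, x_n\ \big( \bigwedge_{i \le n} E(x_0, x_i) \rightarrow \exists x \bigwedge_{i \le n} (x \neq x_i \wedge E(x, x_i)) \big)$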
for all n ∈ N. It is easy to see that any two countable models of T are isomorphic.
However T is not κ-categorical for any uncountable cardinal κ. To see this, let M be a
model of T of size κ where both equivalence classes have size κ and let N be a model of
size κ where one equivalence class has size κ and the other one has size ℵ0 .
The theory DLO of dense linear orders without end points in Example 5.1.4 is another
example of a complete theory that is ω-categorical, but not κ-categorical for uncountable
cardinals κ.
Theorem 5.2.4. DLO is ℵ0 -categorical.
Proof. The following is a typical example of a back-and-forth construction; this is an iterative construction that alternates between enumerations of two structures.
Suppose that $A = (A, <_A)$ and $B = (B, <_B)$ are countably infinite models of DLO and let $\langle a_n \mid n \in \mathbb{N} \rangle$ and $\langle b_n \mid n \in \mathbb{N} \rangle$ enumerate them without repetitions.
We construct sequences of finite subsets $A_n \subseteq A$, $B_n \subseteq B$ and isomorphisms $f_n \colon A_n \to B_n$ (with respect to the induced orders) by recursion. Let $A_0 = B_0 = f_0 = \emptyset$. Suppose that $A_n$, $B_n$ and $f_n$ are already defined.
We first extend the domain. If $a_n \in A_n$ (this will happen often), let $A'_n = A_n$, $B'_n = B_n$ and $f'_n = f_n$. If $a_n \notin A_n$, let $A'_n = A_n \cup \{a_n\}$. Since $B$ is a model of DLO, there is some $b'_n \in B$ such that the extension $f'_n$ of $f_n$ with $f'_n(a_n) = b'_n$ is an isomorphism from $A'_n$ to $B'_n = B_n \cup \{b'_n\}$.
We proceed similarly for the range. If $b_n \in B'_n$, let $A_{n+1} = A'_n$, $B_{n+1} = B'_n$ and $f_{n+1} = f'_n$. If however $b_n \notin B'_n$, we let $B_{n+1} = B'_n \cup \{b_n\}$ and choose some $a'_n \in A$ such that the extension $f_{n+1}$ of $f'_n$ with $f_{n+1}(a'_n) = b_n$ from $A_{n+1} = A'_n \cup \{a'_n\}$ to $B_{n+1}$ is an isomorphism.
Finally, let $f$ denote the union of the functions $f_n$ for all $n \in \mathbb{N}$. By the construction, $f \colon A \to B$ is bijective, since $a_n \in A_{n+1}$ and $b_n \in B_{n+1}$ for all $n$. It is an isomorphism, since each $f_n$ is an isomorphism. □
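The first steps of such a construction can be carried out by a program. The sketch below runs finitely many rounds of back-and-forth between the rationals and the dyadic rationals. Both the choice of these two orders and the particular enumerations are assumptions made for this illustration, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import count, islice


def rationals():
    """Enumerate Q without repetitions (one possible enumeration)."""
    seen = set()
    for s in count(1):                      # s = |numerator| + denominator
        for den in range(1, s + 1):
            num = s - den
            for q in (Fraction(num, den), Fraction(-num, den)):
                if q not in seen:
                    seen.add(q)
                    yield q


def dyadics():
    """Enumerate the dyadic rationals k / 2^m without repetitions."""
    seen = set()
    for s in count(0):                      # s = |k| + m
        for m in range(s + 1):
            k = s - m
            for q in (Fraction(k, 2 ** m), Fraction(-k, 2 ** m)):
                if q not in seen:
                    seen.add(q)
                    yield q


def find_match(x, pairs, side, candidates):
    """First unused candidate lying in the same cut on the other side as x.

    side = 0 means x is on the left of the pairs, side = 1 on the right.
    The search terminates because the order is dense without end points.
    """
    own = [pair[side] for pair in pairs]
    other = [pair[1 - side] for pair in pairs]
    for y in candidates:
        if y in other:
            continue
        if all((x < u) == (y < v) for u, v in zip(own, other)):
            return y


def back_and_forth(enum_a, enum_b, steps):
    """Finite partial isomorphism, as a list of pairs, after `steps` rounds."""
    pairs = []
    for a_n, b_n in islice(zip(enum_a(), enum_b()), steps):
        if all(a_n != a for a, _ in pairs):             # extend the domain
            pairs.append((a_n, find_match(a_n, pairs, 0, enum_b())))
        if all(b_n != b for _, b in pairs):             # extend the range
            pairs.append((find_match(b_n, pairs, 1, enum_a()), b_n))
    return pairs


if __name__ == "__main__":
    for a, b in back_and_forth(rationals, dyadics, 12):
        print(a, "->", b)
```

Each round first puts the next rational into the domain and then the next dyadic rational into the range, so after n rounds the first n elements of both enumerations are covered.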
If $A = (A, <_A)$ and $B = (B, <_B)$ are strict linear orders, their lexicographical order $<_{lex}$ on $A \times B$ is defined by letting $(a, b) <_{lex} (a', b')$ if $a <_A a'$ or ($a = a'$ and $b <_B b'$). It is easy to check that this is always a linear order.
Theorem 5.2.5. DLO is not κ-categorical for any uncountable cardinal κ.
Proof. Let <Q denote the usual linear order on Q.
We first claim that for any linear order (A, <A ), the lexicographical order <lex on A×Q
given by <A and <Q is dense and does not have end points. To show this, assume that
$(a, q) <_{lex} (b, r)$. If $a <_A b$, then we can pick any $q' \in \mathbb{Q}$ with $q <_Q q'$ and have that $(a, q) <_{lex} (a, q') <_{lex} (b, r)$. If otherwise $a = b$ and $q <_Q r$, then we choose some $q' \in \mathbb{Q}$ with $q <_Q q' <_Q r$ and have that $(a, q) <_{lex} (a, q') <_{lex} (b, r)$. Moreover $<_{lex}$ has no end points, since there are no end points in $<_Q$.
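The two cases of the density argument translate directly into a small function. This is a sketch under the assumption that elements of A are represented by comparable Python values (integers in the example) and rationals by `Fraction`; the names `lex_less` and `between` are hypothetical.

```python
from fractions import Fraction


def lex_less(x, y):
    """(a, q) <_lex (b, r): compare first coordinates, then second ones."""
    return x[0] < y[0] or (x[0] == y[0] and x[1] < y[1])


def between(x, y):
    """Given x <_lex y in A x Q, return a point strictly between x and y."""
    (a, q), (b, r) = x, y
    if a < b:
        return (a, q + 1)            # any q' > q works in this case
    return (a, (q + r) / 2)          # here a == b and q < r


if __name__ == "__main__":
    x, y = (3, Fraction(1, 2)), (7, Fraction(-5))
    z = between(x, y)
    print(z, lex_less(x, z), lex_less(z, y))
```

Python compares tuples lexicographically, so `x < z < y` gives the same answers as `lex_less` here.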
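As usual in set theory, $\kappa$ equals the set of ordinals $\alpha < \kappa$ and is wellordered by the usual order $<$ on ordinals. We further let $<^*$ denote the reverse linear order on $\kappa$ that is defined by $\alpha <^* \beta \iff \beta < \alpha$. Now let $<_{lex}$ and $<^*_{lex}$ denote the lexicographical orders on $\kappa \times \mathbb{Q}$ that are induced by $<$, $<_Q$ and $<^*$, $<_Q$, respectively. By the previous paragraph, $(\kappa \times \mathbb{Q}, <_{lex})$ and $(\kappa \times \mathbb{Q}, <^*_{lex})$ are dense linear orders without end points, and both have size $\kappa$.
We show that $(\kappa \times \mathbb{Q}, <_{lex})$ and $(\kappa \times \mathbb{Q}, <^*_{lex})$ are not isomorphic. Note that $(\kappa \times \mathbb{Q}, <_{lex})$ contains strictly increasing sequences of length $\kappa$, for instance $\langle (\alpha, 0) \mid \alpha < \kappa \rangle$.
However, we claim that $(\kappa \times \mathbb{Q}, <^*_{lex})$ does not contain strictly increasing sequences of length $\kappa$. Towards a contradiction, suppose that $\langle (\alpha_\beta, q_\beta) \mid \beta < \kappa \rangle$ is such a sequence. Then the sequence $\langle \alpha_\beta \mid \beta < \kappa \rangle$ is a non-increasing sequence in $\kappa$ by the definition of $<^*_{lex}$, i.e. $\alpha_\beta \ge \alpha_\gamma$ for all $\beta < \gamma < \kappa$. Since $(\kappa, <)$ is a well-order, there is some $\beta < \kappa$ such that $\alpha_\beta = \alpha_\gamma$ for all $\gamma$ with $\beta \le \gamma < \kappa$. By the definition of $<^*_{lex}$, it follows that the sequence $\langle q_\gamma \mid \beta \le \gamma < \kappa \rangle$ is an uncountable strictly increasing sequence in $\mathbb{Q}$. But $\mathbb{Q}$ is countable. Note that the same argument shows that there is no uncountable strictly increasing sequence in $(\kappa \times \mathbb{Q}, <^*_{lex})$. □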
Suppose that $K = (K, 0, 1, +, \cdot)$ is a field. As above, $L_{V(K)} = L_{AddGroup} \cup \{m_a \mid a \in K\}$ is the language of $K$-vector spaces, where $m_a$ is interpreted as scalar multiplication with $a$. Let $T$ denote the $L_{V(K)}$-theory of $K$-vector spaces.
Example 5.2.6. The $L_{V(K)}$-theory $T$ of $K$-vector spaces is $\kappa$-categorical for all infinite cardinals $\kappa > |K|$, since two $K$-vector spaces are isomorphic if and only if their dimensions are equal. Moreover, for any infinite $K$-vector space $V$ of size $|V| > |K|$, we have $|V| = \dim(V)$, by a counting argument.
Thus for finite fields $K$, $T$ is $\kappa$-categorical for all infinite cardinals $\kappa$. For all countably infinite fields $K$, $T$ is not $\aleph_0$-categorical, since for instance $K$ and $K^2$ are countable $K$-vector spaces of different dimensions, but it is $\kappa$-categorical for all uncountable cardinals $\kappa$.
Lecture 27
21. July
We now turn to algebraically closed fields. Recall that a field K is called algebraically
closed if every non-constant polynomial f (X) in K[X] has a root in K. If moreover K ⊆ L
are fields, then an element x ∈ L is called algebraic over K if it is a root of some
polynomial f (X) ≠ 0 in K[X] and transcendental otherwise. Moreover L is called algebraic
over K if each of its elements is algebraic over K.
Note that since for all a, b ≠ 0 in a field we have a · b ≠ 0, the characteristic n of a field
is necessarily a prime number if it is not 0; if n = m · k with m, k > 1, then
(m · 1) · (k · 1) = n · 1 = 0 in K and hence one of m · 1 and k · 1 is equal to 0,
contradicting the minimality of n.
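The argument can be observed in the rings Z/nZ, which are fields exactly when n is prime. The following is a minimal sketch with hypothetical function names; it only illustrates the zero divisors that appear for composite n.

```python
def characteristic(n):
    """Least k >= 1 with k * 1 = 0 in the ring Z/nZ (this is n itself)."""
    k, total = 1, 1 % n
    while total != 0:
        k, total = k + 1, (total + 1) % n
    return k


def zero_divisors(n):
    """Pairs (a, b) with 0 < a <= b < n and a * b = 0 in Z/nZ."""
    return [(a, b) for a in range(1, n) for b in range(a, n)
            if (a * b) % n == 0]


if __name__ == "__main__":
    print(characteristic(6), zero_divisors(6))   # composite: 2 * 3 = 0
    print(characteristic(7), zero_divisors(7))   # prime: no zero divisors
```

For n = 6 the factorisation 6 = 2 · 3 gives nonzero elements with product 0, so Z/6Z is not a field, while Z/7Z has no such pairs.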
We use the following results from algebra without proofs.
Theorem 5.2.7. Every field K has an algebraic closure K̄ that is unique up to isomor-
phism.
Lemma 5.2.8. For every $n > 0$, there is a field $\mathbb{F}_{p^n}$ of characteristic $p$ and size $p^n$ that is unique up to isomorphism.
Moreover, the algebraic closure $\bar{\mathbb{F}}_{p^n}$ of $\mathbb{F}_{p^n}$ equals $\bigcup_{m \ge 1} \mathbb{F}_{p^m}$ and can be written as an increasing union of finite fields.
We aim to define the transcendence degree of field extensions. To this end, we will assume that $K$, $L$ and $M$ are algebraically closed fields with $K \subseteq L, M$. A subset $A$ of $L$ is algebraically independent over $K$ if for all pairwise distinct $a_0, \dots, a_n \in A$ and $f \in K[X_0, \dots, X_n]$ with $f \neq 0$, we have $f(a_0, \dots, a_n) \neq 0$. Moreover, a transcendence base of $L$ over $K$ is a maximal algebraically independent subset of $L$ over $K$. If $A$ is such a base, it follows that $L$ is an algebraic extension of $K(A)$. So for every $x \in L$, there is a polynomial $f \neq 0$ with coefficients in $K[A]$ with $f(x) = 0$.
For example, {π} is a transcendence base of the algebraic closure of the subfield Q(π)
of C generated by π.
We can use the next lemma to show that the size of transcendence bases is unique.
Lemma 5.2.9. (Exchange property) If A and B are transcendence bases of
L over K and b ∈ B, then there is some a ∈ A such that A′ = (A \ {a}) ∪ {b} is a
transcendence base of L over K.
By successively replacing elements of A with elements of B in a transfinite induction,
we obtain that |A| = |B|. We can thus define the transcendence degree of L over K as the
unique size of a transcendence base. If K is the algebraic closure of Q or Fp (depending
on the characteristic), then this is simply called the transcendence degree of L.
Lemma 5.2.10. Any two algebraically closed fields of the same characteristic and tran-
scendence degree are isomorphic.
Proof sketch. Suppose that $A$ and $B$ are transcendence bases of the same size of algebraically closed fields $L$ and $M$ of characteristic $p$. Let $K$ denote the prime field of characteristic $p$ and assume $K \subseteq L, M$. Fix a bijection $F \colon A \to B$.
$F$ extends uniquely to a ring isomorphism $F \colon K[A] \to K[B]$, since $K[A]$ and $K[B]$ are isomorphic to the polynomial rings $K[\langle X_a \mid a \in A \rangle]$ and $K[\langle X_b \mid b \in B \rangle]$, which are in turn isomorphic via the map that sends $X_a$ to $X_{F(a)}$.
Thus $F$ extends uniquely to an isomorphism between the fields of fractions $K(A)$ and $K(B)$. Since $L$ is the algebraic closure of $K(A)$ and $M$ is the algebraic closure of $K(B)$, $L \cong M$ by the uniqueness of algebraic closures. □
A question to the reader: what is the transcendence degree of C?
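Example 5.2.11. ACF$_p$ is not $\aleph_0$-categorical, but $\kappa$-categorical for all uncountable cardinals $\kappa$. This follows from the fact that two algebraically closed fields of characteristic $p$ are isomorphic if and only if their transcendence degrees are equal.
In more detail, if $A$ is a transcendence base of an algebraically closed field $K$, then $|K|$ is bounded by the number of polynomials times the number of their roots. So $|K| \le |A^{<\omega} \times \mathbb{Z} \times \omega| = |A|$ if $A$ is infinite. Thus $|K| = |A|$. Hence every algebraically closed field of uncountable size $\kappa$ has transcendence degree $\kappa$, while the countable ones can have any of the transcendence degrees $0, 1, 2, \dots, \aleph_0$.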
By Vaught’s test (Lemma 5.2.2), ACFp is complete. We now derive some consequences
of this fact.
Lemma 5.2.12. (Lefschetz principle) The following conditions are equivalent for any
sentence ϕ in the language Lrings of rings and fields.
(a) ϕ holds in every algebraically closed field of characteristic 0.
(b) ϕ holds in the complex numbers.
(c) ϕ holds in some algebraically closed field of characteristic 0.
(d) There is some n ∈ N such that for all primes p > n, ϕ holds in all algebraically
closed fields of characteristic p.
(e) There are arbitrarily large primes p such that ϕ holds in some algebraically closed
field of characteristic p.
Proof. The implications from (a) to (b) and from (b) to (c) are clear. Assuming (c), we have ACF$_0 \models \varphi$, since ACF$_0$ is complete, and hence (a) holds. Assuming (a), there is a finite set $\Delta \subseteq$ ACF$_0$ with $\Delta \models \varphi$ by the compactness theorem. Since $\Delta$ contains only finitely many of the axioms stating that the characteristic is not $p$, we have ACF$_p \models \varphi$ if $p$ is sufficiently large, so (d) holds. The implication from (d) to (e) is again clear. If (e) holds, we assume towards a contradiction that ACF$_0 \not\models \varphi$. Since ACF$_0$ is complete, we have ACF$_0 \models \neg\varphi$. By the implication from (a) to (d) for $\neg\varphi$, we have that $\neg\varphi$ holds in all algebraically closed fields of sufficiently large characteristic, contradicting the assumption. □
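Theorem 5.2.13. Every injective polynomial map from $\mathbb{C}^n$ to $\mathbb{C}^n$ is surjective.
Proof. We first show this for the algebraic closure $\bar{\mathbb{F}}_p$ of $\mathbb{F}_p$ for all primes $p$. Suppose towards a contradiction that $f \colon (\bar{\mathbb{F}}_p)^n \to (\bar{\mathbb{F}}_p)^n$ is an injective map given by polynomials $g_0, \dots, g_{n-1}$ with coefficients $a_0, \dots, a_l \in \bar{\mathbb{F}}_p$, but some $b = (b_0, \dots, b_{n-1}) \in (\bar{\mathbb{F}}_p)^n$ is not in its range. The subfield $K \subseteq \bar{\mathbb{F}}_p$ generated by $a_0, \dots, a_l, b_0, \dots, b_{n-1}$ is finite by Lemma 5.2.8 and the polynomials $g_0, \dots, g_{n-1}$ define an injective map $f' \colon K^n \to K^n$. Since $K$ is finite, $f'$ is surjective, contradicting the assumption that $b \notin \mathrm{ran}\, f'$.
Now let $\varphi_{n,d}$ be the first-order statement that every injective polynomial map with $n$ inputs and outputs that is given by polynomials of degrees at most $d$ is surjective. By the previous paragraph, $\varphi_{n,d}$ holds in $\bar{\mathbb{F}}_p$ and hence ACF$_p \models \varphi_{n,d}$ for all primes $p$, since ACF$_p$ is complete. So $\varphi_{n,d}$ holds in $\mathbb{C}$ as well by Lemma 5.2.12. Since every polynomial map from $\mathbb{C}^n$ to $\mathbb{C}^n$ is given by polynomials of degrees at most $d$ for some $d$, the claim follows. □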
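The finite step of the proof, that an injective map from a finite set to itself is surjective, can be tested directly for polynomial maps over a small finite field. The sketch below assumes the prime field Z/pZ rather than a general finite field, and the helper names and the two sample maps are chosen only for illustration.

```python
from itertools import product


def values(polys, p, n):
    """List of images g(x) = (g_0(x), ..., g_{n-1}(x)) mod p for x in F_p^n."""
    return [tuple(g(*x) % p for g in polys)
            for x in product(range(p), repeat=n)]


def is_injective(polys, p, n):
    """True if no value is hit twice."""
    seen = set()
    for y in values(polys, p, n):
        if y in seen:
            return False
        seen.add(y)
    return True


def is_surjective(polys, p, n):
    """True if every point of F_p^n is a value."""
    return set(values(polys, p, n)) == set(product(range(p), repeat=n))


if __name__ == "__main__":
    shear = [lambda x, y: x + y * y, lambda x, y: y]    # injective
    squares = [lambda x, y: x * x, lambda x, y: y]      # not injective
    for name, polys in (("shear", shear), ("squares", squares)):
        print(name, is_injective(polys, 5, 2), is_surjective(polys, 5, 2))
```

Over a finite field the two properties always agree by a counting argument. The theorem transfers the implication from injective to surjective to the complex numbers, where counting is not available.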

