Axiomatic Set Theory
Peter Philip
Lecture Notes
Created for the Class of Summer Semester 2024 at LMU Munich
Contents
1 Motivation and Preliminaries
1.1 Cantor’s Definition, Russell’s Antinomy
1.2 Mathematical Logic
1.3 Set-Theoretic Formulas
References
1 MOTIVATION AND PRELIMINARIES 3
Note that the disjunction A ∨ B is true if, and only if, at least one of the statements
A, B is true. Here one already has to be a bit careful – A ∨ B defines the inclusive or,
whereas “or” in common English is often understood to mean the exclusive or (which is
false if both input statements are true). Instead of A implies B, one also says if A then
B, B is a consequence of A, B is concluded or inferred from A, A is sufficient for B, or
B is necessary for A.
The implication A ⇒ B is always true, except if A is true and B is false. At first
glance, it might be surprising that A ⇒ B is defined to be true for A false and B true;
however, this is precisely what distinguishes the implication from the equivalence. After
a moment’s contemplation, one will most likely notice that one is quite familiar with
examples of incorrect statements implying correct statements: For instance, squaring
the (false) equality of integers −1 = 1 implies the (true) equality of integers 1 = 1.
Of course, the implication A ⇒ B is not really useful in situations where the truth
values of both A and B are already known. Rather, in a typical application, one tries
to establish the truth of A to prove the truth of B (a strategy that will fail if A happens
to be false).
The equivalence A ⇔ B means A is true if, and only if, B is true. Analogous to the
situation of implications, A ⇔ B is not really useful if the truth values of both A and
B are known a priori, but can be a powerful tool to prove B to be true or false by
establishing the truth value of A.
Note that the expressions in the first row of the truth table (1.2) (e.g. A ∧ B) are not,
actually, statements, as they contain the statement variables (also known as proposi-
tional variables) A or B. However, the expressions become statements if all statement
variables are substituted with actual statements. We will call expressions of this form
propositional formulas. Moreover, if a truth value is assigned to each statement variable
of a propositional formula, then this uniquely determines the truth value of the formula.
In other words, the truth value of the propositional formula can be calculated from
the respective truth values of its statement variables – the presently discussed topic is,
therefore, known as propositional calculus.
Example 1.1. (a) Consider the propositional formula (A ∧ B) ∨ (¬B). Suppose A is
true and B is false. The truth value of the formula is obtained according to the
(b) The propositional formula A ∨ (¬A), also known as the law of the excluded middle,
has the remarkable property that its truth value is T for every possible choice of
truth values for A:
A ¬A A ∨ (¬A)
T F T (1.4)
F T T
Formulas with this property are of particular importance.
Definition 1.2. A propositional formula φ is called a tautology or universally true if,
and only if, its truth value is T for all possible assignments of truth values to all the
statement variables it contains. One writes ⊢ φ if, and only if, φ is a tautology.
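Since a propositional formula with n statement variables admits only 2^n assignments of truth values, Def. 1.2 can be checked mechanically by enumeration. The following Python lines are a minimal sketch and not part of the original notes; the helper name `is_tautology` and the encoding of formulas as Python functions are assumptions made for illustration.

```python
from itertools import product

def is_tautology(formula, num_vars):
    """Return True if formula is true for every assignment of truth values."""
    return all(formula(*values)
               for values in product([True, False], repeat=num_vars))

# Law of the excluded middle, A ∨ (¬A), from Example 1.1(b).
excluded_middle = lambda a: a or (not a)
# The formula (A ∧ B) ∨ (¬B) from Example 1.1(a).
example_1_1a = lambda a, b: (a and b) or (not b)

print(is_tautology(excluded_middle, 1))   # True
print(example_1_1a(True, False))          # True (A true, B false)
print(is_tautology(example_1_1a, 2))      # False (fails for A false, B true)
```

The enumeration takes 2^n steps, so this approach is only practical for formulas with few statement variables.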
Notation 1.3. We write φ(A1 , . . . , An ) if, and only if, the propositional formula φ
contains precisely the n statement variables A1 , . . . , An .
Definition 1.4. The propositional formulas φ(A1 , . . . , An ) and ψ(A1 , . . . , An ) are called
equivalent if, and only if, φ(A1 , . . . , An ) ⇔ ψ(A1 , . . . , An ) is a tautology.
—
For all logical purposes, two equivalent formulas are exactly the same – it does not
matter if one uses one or the other. The following Th. 1.6 provides some important
equivalences of propositional formulas. As too many parentheses tend to make formulas
less readable, we first introduce some precedence conventions for logical operators:
Convention 1.5. ¬ takes precedence over ∧, ∨, which take precedence over ⇒, ⇔. So,
for example,
(A ∨ ¬B ⇒ ¬B ∧ ¬A) ⇔ ¬C ∧ (A ∨ ¬D)
is the same as
((A ∨ (¬B)) ⇒ ((¬B) ∧ (¬A))) ⇔ ((¬C) ∧ (A ∨ (¬D))).
Theorem 1.6. (a) ⊢ (A ⇒ B) ⇔ ¬A ∨ B. This means one can actually define impli-
cation via negation and disjunction.
(b) ⊢ (A ⇔ B) ⇔ (A ⇒ B) ∧ (B ⇒ A) , i.e. A and B are equivalent if, and only if,
A is both necessary and sufficient for B. One also calls the implication B ⇒ A the
converse of the implication A ⇒ B. Thus, A and B are equivalent if, and only if,
both A ⇒ B and its converse hold true.
Proof. Each equivalence is proved by providing a suitable truth table, showing that the
respective equivalence τ is a tautology: In each case, the final column of the truth table
shows that, for all possible assignments of truth values to A, B, C (where applicable),
τ has truth value T:
(a):
A B ¬A A ⇒ B ¬A ∨ B (A ⇒ B) ⇔ ¬A ∨ B
T T F T T T
T F F F F T
F T T T T T
F F T T T T
A B ¬A ¬B A ∧ B ¬(A ∧ B) ¬A ∨ ¬B ¬(A ∧ B) ⇔ ¬A ∨ ¬B
T T F F T F F T
T F F T F T T T
F T T F F T T T
F F T T F T T T
(j): Exercise.
(k):
A ¬A ¬¬A ¬¬A ⇔ A
T F T T
F T F T
(l):
A B ¬A ¬B A ⇒ B ¬B ⇒ ¬A (A ⇒ B) ⇔ (¬B ⇒ ¬A)
T T F F T T T
T F F T F F T
F T T F T T T
F F T T T T T
Having checked all the equivalences completes the proof of the theorem.
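The truth tables shown in the proof can be reproduced by brute force. The sketch below is an illustration only and not part of the original notes: it enters the truth table of ⇒ row by row (so that checking (a) is not circular) and then tests the equivalences whose tables appear above. The dictionary name `EQUIVALENCES` and the helper names are assumptions.

```python
from itertools import product

# Truth table of A ⇒ B, entered row by row (false only for A true, B false).
IMPLIES = {(True, True): True, (True, False): False,
           (False, True): True, (False, False): True}

def imp(a, b):
    return IMPLIES[(a, b)]

def iff(a, b):
    return a == b

EQUIVALENCES = {
    "(a) (A ⇒ B) ⇔ ¬A ∨ B":
        lambda a, b: iff(imp(a, b), (not a) or b),
    "(b) (A ⇔ B) ⇔ (A ⇒ B) ∧ (B ⇒ A)":
        lambda a, b: iff(iff(a, b), imp(a, b) and imp(b, a)),
    "¬(A ∧ B) ⇔ ¬A ∨ ¬B":
        lambda a, b: iff(not (a and b), (not a) or (not b)),
    "(k) ¬¬A ⇔ A":
        lambda a, b: iff(not (not a), a),
    "(l) (A ⇒ B) ⇔ (¬B ⇒ ¬A)":
        lambda a, b: iff(imp(a, b), imp(not b, not a)),
}

# Each formula is evaluated for all four assignments of truth values to A, B.
results = {name: all(f(a, b) for a, b in product([True, False], repeat=2))
           for name, f in EQUIVALENCES.items()}
for name, ok in results.items():
    print(name, "is a tautology:", ok)
```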
The importance of the rules provided by Th. 1.6 lies in their providing proof techniques,
i.e. methods for establishing the truth of statements from statements known or assumed
to be true. The rules of Th. 1.6 will be used frequently in proofs throughout this class.
Remark 1.7. Another important proof technique is the so-called proof by contradic-
tion, also called indirect proof. It is based on the observation, called the principle of
contradiction, that A ∧ ¬A is always false:
A ¬A A ∧ ¬A
T F F (1.5)
F T F
Two more rules we will use regularly in subsequent proofs are the so-called transitivity
of implication and the transitivity of equivalence. In preparation for the transitivity
rules, we generalize implication to propositional formulas:
Definition 1.8. In generalization of the implication operator defined in (1.2), we say
the propositional formula φ(A1 , . . . , An ) implies the propositional formula ψ(A1 , . . . , An )
(denoted φ(A1 , . . . , An ) ⇒ ψ(A1 , . . . , An )) if, and only if, each assignment of truth values
to the A1 , . . . , An that makes φ(A1 , . . . , An ) true, makes ψ(A1 , . . . , An ) true as well, i.e.
if, and only if, ⊢ φ(A1 , . . . , An ) ⇒ ψ(A1 , . . . , An ).
Theorem 1.9. (a) Transitivity of Implication: ⊢ (A ⇒ B) ∧ (B ⇒ C) ⇒ (A ⇒ C).
Proof. Both implications are proved by providing a suitable truth table, showing that
the respective implication τ (A, B, C) is a tautology: In each case, the final column of
the truth table shows that, for all possible assignments of truth values to A, B, and C,
τ (A, B, C) has truth value T. We carry out (a) and leave (b) as an exercise.
(a):
A B C A⇒B B⇒C (A ⇒ B) ∧ (B ⇒ C) A⇒C (A ⇒ B) ∧ (B ⇒ C) ⇒ (A ⇒ C)
T T T T T T T T
T F T F T F T T
F T T T T T T T
F F T T T T T T
T T F T F F F T
T F F F T F F T
F T F T F F T T
F F F T T T T T
(b): Exercise.
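Def. 1.8 and Th. 1.9(a) can also be checked by enumeration. The following Python sketch is an illustration under assumptions, not part of the original notes: the helper `formula_implies` is a hypothetical name, and `imp` uses Th. 1.6(a) to express ⇒ via ¬ and ∨.

```python
from itertools import product

def formula_implies(phi, psi, num_vars):
    """Def. 1.8: each assignment that makes phi true makes psi true as well."""
    return all(psi(*v)
               for v in product([True, False], repeat=num_vars)
               if phi(*v))

imp = lambda a, b: (not a) or b          # A ⇒ B via Th. 1.6(a)

premise = lambda a, b, c: imp(a, b) and imp(b, c)   # (A ⇒ B) ∧ (B ⇒ C)
conclusion = lambda a, b, c: imp(a, c)              # A ⇒ C

print(formula_implies(premise, conclusion, 3))   # True: transitivity
print(formula_implies(conclusion, premise, 3))   # False: e.g. A, C true, B false
```

The second line shows that the converse direction fails, so the implication in Th. 1.9(a) cannot be strengthened to an equivalence.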
A := {x ∈ B : P (x)}
to be the subset of B, consisting of all elements of B such that P (x) is true. Now take
B := N := {1, 2, . . . } to be the set of the natural numbers and let
P (x) := “The number x can be defined by fifty English words or less”. (1.6)
Then A is a finite subset of N, since there are only finitely many English words (if you
think there might be infinitely many English words, just restrict yourself to the words
contained in some concrete dictionary). Then there is a smallest natural number n that
is not in A. But then n is the smallest natural number that cannot be defined by
fifty English words or less, which, actually, defines n by less than fifty English words, in
contradiction to n ∉ A.
Footnote 1: Actually, more generally, a proof of the statement B is given by a finite sequence of statements
A1 , A2 , . . . , An such that A1 is true; the logical conjunction A1 ∧ · · · ∧ Ai implies Ai+1 for 1 ≤ i < n;
and A1 ∧ · · · ∧ An implies B. It is then still correct that the existence of a proof of B guarantees B to
be true.
To avoid contradictions of this type, we require P (x) to be a so-called set-theoretic
formula.
Definition 1.11. (a) The language of set theory consists precisely of the following
symbols: ∧, ¬, ∃, (, ), ∈, =, vj , where j = 1, 2, . . . .
(b) A set-theoretic formula is a finite string of symbols from the above language of set
theory that can be built using the following recursive rules:
(φ) ∨ (ψ) is short for ¬((¬(φ)) ∧ (¬(ψ))) (cf. Th. 1.6(j)), (1.7a)
(φ) ⇒ (ψ) is short for (¬(φ)) ∨ (ψ) (cf. Th. 1.6(a)), (1.7b)
(φ) ⇔ (ψ) is short for ((φ) ⇒ (ψ)) ∧ ((ψ) ⇒ (φ)) (cf. Th. 1.6(b)). (1.7c)
such that ∀vj (φ) means “each set vj has the property φ”, which is equivalent to the
statement “there does not exist a set vj that does not have the property φ”. Further
abbreviations and transcriptions are obtained from omitting parentheses if it is clear
from the context and/or from Convention 1.5 where to put them in, by writing variables
bound by quantifiers under the respective quantifiers (to improve readability), and by
using other symbols than vj for set variables. For example,
Moreover,
call the variable vj free. Bound variables are sometimes also called dummy variables,
since, if the bound variable vj in, say, ∃vj (φ) is replaced by vk (and vk is not free
in φ, cf. Ex. 1.16(e) below), then ∃vj (φ) and the formula with vj replaced by vk are
equivalent. Thus, if one uses the transcriptions introduced in (1.7e) and (1.7f), then the
bound variables are precisely those, occurring under a quantifier. In principle, it is not
forbidden for the same variable (more precisely, the same variable symbol) to occur as
both a free variable and a bound variable in the same formula, and it could also occur in
the scope of several different quantifiers (cf. footnote 4). However, using the same variable symbol both
free and bound and/or within several scopes tends to make formulas less readable and
it can, actually, always be avoided, using additional variable symbols, see Ex. 1.16(c)-
(e) below. One might already have encountered the analogous situation when writing
integrals: For example, consider f : R −→ R, defined by the formula
\[ f(x) := x + \int_0^1 x \, dx + \int_x^1 \left( x + \int_0^x \exp(x) \, dx \right) dx . \tag{1.8a} \]
In (1.8a), the variable x occurs as a bound variable with three different scopes (within
the scope of each of the three integrals, x is used as the respective integrand’s dummy
variable) and also as a “free” variable (not bound by any integral), namely as the
function argument of f . Successively replacing each bound version of x, starting with
the innermost integral, one can write (1.8a) in the equivalent (and more readable) form
\[ f(x) := x + \int_0^1 u \, du + \int_x^1 \left( z + \int_0^z \exp(y) \, dy \right) dz . \tag{1.8b} \]
Example 1.16. (a) x ∈ y has x and y as free variables and no bound variables. It
states that the set x is an element of the set y.
(b) $\exists_{x}\,(x \in y)$ has x bound and y free. It states that there exists a set x that is an
element of the set y (i.e. that y is not the empty set).
(c) In the formula
\[ \forall_{y}\,\bigl( y \in x \Rightarrow \exists_{x}\,(x \in y) \bigr), \]
y is bound, whereas x occurs both free and bound. If one replaces the bound version
of x by z, then one obtains the equivalent formula
\[ \forall_{y}\,\bigl( y \in x \Rightarrow \exists_{z}\,(z \in y) \bigr). \]
The formulas state that, if the set y is an element of the set x, then y contains an
element z – in other words, the set x does not contain the empty set.
Footnote 4: Using the same variable symbol in such a way is similar to using the same variable name for different
local variables when coding computer programs.
2 ZERMELO-FRAENKEL SET THEORY (ZF) 13
(d) The formula
\[ \exists_{x}\,\forall_{x}\,(x \in x) \]
contains x as a bound variable within two different scopes. Replacing the version of
x in the scope of the universal quantifier by y yields the equivalent formula
\[ \exists_{x}\,\forall_{y}\,(y \in y). \]
It is somewhat peculiar, as it has the form $\exists_{x}\,\varphi$, where x does not occur as a free
variable in φ. According to the interpretation given by Rem. 1.13, the formula is
true if, and only if, there exists a set such that φ is true, i.e. if, and only if, the
considered universe of sets is nonempty and every set in the universe contains itself.
(e) As stated in Def. and Rem. 1.15, dummy (i.e. bound) variables may be replaced by
other symbols without changing the meaning of the formula. However, if replacing
x in, say, $\exists_{x}\,(\varphi)$, then one has to make sure that the replacement does not occur as
a free variable in φ: For instance, in $\exists_{x}\,(x \in y)$ of (b), one can replace x with every
variable symbol, except y: While $\exists_{x}\,(x \in y)$ states that the set y is not empty, the
formula $\exists_{y}\,(y \in y)$ states the existence of a set that contains itself.
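The distinction between free and bound variables in the examples above can be computed recursively over the structure of a formula. The following Python sketch is an illustration only and not part of the original notes: it assumes formulas are encoded as nested tuples over the primitive symbols of Def. 1.11(a), and the helper names `disj`, `implies`, `forall`, `free_vars` are hypothetical. The abbreviations follow (1.7a), (1.7b) and the reading of ∀ as "there does not exist a set that does not have the property".

```python
# Primitive formulas: ("in", x, y), ("eq", x, y), ("not", f), ("and", f, g),
# ("exists", v, f). Everything else is an abbreviation.

def disj(f, g):       # (1.7a): f ∨ g is short for ¬((¬f) ∧ (¬g))
    return ("not", ("and", ("not", f), ("not", g)))

def implies(f, g):    # (1.7b): f ⇒ g is short for (¬f) ∨ g
    return disj(("not", f), g)

def forall(v, f):     # ∀v f is short for ¬∃v ¬f
    return ("not", ("exists", v, ("not", f)))

def free_vars(f):
    """Set of variable symbols that occur free in the formula f."""
    tag = f[0]
    if tag in ("in", "eq"):
        return {f[1], f[2]}
    if tag == "not":
        return free_vars(f[1])
    if tag == "and":
        return free_vars(f[1]) | free_vars(f[2])
    if tag == "exists":
        return free_vars(f[2]) - {f[1]}   # the quantifier binds f[1]
    raise ValueError(f"unknown formula tag: {tag!r}")

ex_a = ("in", "x", "y")
ex_b = ("exists", "x", ("in", "x", "y"))
ex_c = forall("y", implies(("in", "y", "x"), ("exists", "x", ("in", "x", "y"))))
ex_e = ("exists", "y", ("in", "y", "y"))

for name, f in [("(a)", ex_a), ("(b)", ex_b), ("(c)", ex_c), ("(e)", ex_e)]:
    print(name, sorted(free_vars(f)))
# (a) ['x', 'y']
# (b) ['y']
# (c) ['x']
# (e) []
```

The last line illustrates the warning in (e): replacing x by y in (b) turns the free variable y into a bound one and thereby changes the meaning of the formula.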
Remark 1.17. In Def. and Rem. 1.10, we defined a proof of statement B from statement
A1 as a finite sequence of statements A1 , A2 , . . . , An such that, for 1 ≤ i < n, Ai implies
Ai+1 , and An implies B. Proof theory, like mathematical logic, is a large field in its own
right, and a detailed treatment is beyond the scope of this class. In proof theory,
proofs are formalized via a finite set of rules that can be applied to (set-
theoretic) formulas (see, e.g., [EFT07, Sec. IV], [Kun12, Sec. II]). Once proofs have been
formalized in this way, one can, in principle, mechanically check if a given sequence of
symbols does, indeed, constitute a valid proof (without even having to understand the
actual meaning of the statements). Indeed, several different computer programs have
been devised that can be used for automatic proof checking, for example Coq [Wik22b],
HOL Light [Wik21], Isabelle [Wik22c], and Lean [Wik22d], to name just a few.
We are now in a position to formulate and discuss the axioms of axiomatic set the-
ory. More precisely, we will present the axioms of Zermelo-Fraenkel set theory, usually
abbreviated as ZF, which are Axiom 0 – Axiom 8 below. While there exist various
set theories in the literature, each set theory defined by some collection of axioms, the
axioms of ZFC, consisting of the axioms of ZF plus the axiom of choice (Axiom 9, see
Sec. 3 below), are used as the foundation of mathematics currently accepted by most
mathematicians.
—
In naive set theory, based on Cantor’s definition as described in Sec. 1.1, sets X and
Y are defined to be equal if, and only if, they contain precisely the same elements. In
axiomatic set theory, this is guaranteed by the axiom of extensionality:
Axiom 1 Extensionality:
\[ \forall_{X}\,\forall_{Y}\,\Bigl( \forall_{z}\,(z \in X \Leftrightarrow z \in Y) \Rightarrow X = Y \Bigr). \]
—
Following [Kun12], we assume that the substitution property of equality is part of the
underlying logic, i.e. if X = Y , then X can be substituted for Y and vice versa without
changing the truth value of a (set-theoretic) formula. In particular, this yields the
converse to extensionality:
\[ \forall_{X}\,\forall_{Y}\,\Bigl( X = Y \Rightarrow \forall_{z}\,(z \in X \Leftrightarrow z \in Y) \Bigr). \]
Definition 2.2 (Toy Models). Let a, b, c, d, e be distinct elements. For each index i in
{1, 2, . . . , 10}, we define the model Mi := (Di , Ei ), where Mi is the pair consisting of
the “domain” Di and a relation Ei on Di (i.e. Ei ⊆ Di × Di ), where one thinks of Di as
modeling the universe of sets and of Ei as modeling the element relation ∈ (one might
be concerned that the construction of these models is not justified by the axioms that
have, thus far, been introduced – this is a fair concern and we will address it further in
M1 := (D1 , E1 ), D1 := {a}, E1 := ∅,
M2 := (D2 , E2 ), D2 := {a}, E2 := {(a, a)},
M3 := (D3 , E3 ), D3 := {a, b}, E3 := {(a, b), (b, a)},
M4 := (D4 , E4 ), D4 := {a, b, c}, E4 := {(a, b), (b, a), (a, c), (b, c)},
M5 := (D5 , E5 ), D5 := {a, b, c}, E5 := {(a, b), (a, c)},
M6 := (D6 , E6 ), D6 := {a, b, c, d}, E6 := {(a, b), (a, c), (a, d), (b, c), (b, d), (c, d)},
M7 := (D7 , E7 ), D7 := {a, b, c}, E7 := {(a, b), (b, c)},
M8 := (D8 , E8 ), D8 := {a, b, c}, E8 := {(b, c)},
M9 := (D9 , E9 ), D9 := {a, b, c, d, e}, E9 := {(a, b), (b, c), (c, d), (b, e), (c, e)},
M10 := (D10 , E10 ), D10 := {a, b}, E10 := {(a, b), (b, b)}.
Example 2.3. For each toy model Mi of Def. 2.2, we will check if it satisfies Axiom
0 (i.e. existence of a set), Axiom 1 (i.e. extensionality), and the (non-)existence of an
empty set (cf. (2.1) below). We will see that Axioms 0 and 1 are independent from
each other and that Axioms 0 and 1 together neither imply nor refute the existence of
an empty set.
(a) Axiom 0 holds in each of the above models Mi , i ∈ {1, . . . , 10}, since Di ≠ ∅ in
each case.
(b) Axiom 1 holds in each Mi , i ∈ {1, 2, 3, 4, 6, 7, 9, 10}, but is violated in M5 and M8 :
Axiom 1 holds in Mi , i ∈ {1, 2}, since Di contains only 1 element.
Axiom 1 holds in M3 , since E3 provides precisely the relations aE3 b and bE3 a, i.e.,
in this universe of sets, b has only a as an element and a has only b as an element
– in particular, there are no distinct sets that contain precisely the same elements.
Axiom 1 is violated in M5 , since, according to E5 , both b and c contain precisely a
as an element.
We leave M4 and M6 – M10 as an exercise.
(c) We check which of our toy models do not contain an “empty set”, i.e. satisfy the
“axiom”
\[ \neg\,\exists_{X}\,\forall_{x}\,(x \notin X): \tag{2.1} \]
From (a) – (c), we see, in particular, that M2 , M3 , M4 satisfy Axioms 0, 1, plus (2.1);
whereas M1 , M6 , M7 , M9 , M10 satisfy Axioms 0, 1, plus the existence of an “empty
set”.
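Since the toy models are finite, the checks of Ex. 2.3 can be carried out by enumeration. The following Python sketch is an illustration only and not part of the original notes: the encoding of each model as a pair of strings (domain, relation) and the helper names `ext`, `axiom0`, `axiom1`, `has_empty_set` are assumptions. A pair "xy" in the second string stands for (x, y) ∈ Ei, i.e. "x is an element of y".

```python
MODELS = {
    1: ("a", ""),
    2: ("a", "aa"),
    3: ("ab", "ab ba"),
    4: ("abc", "ab ba ac bc"),
    5: ("abc", "ab ac"),
    6: ("abcd", "ab ac ad bc bd cd"),
    7: ("abc", "ab bc"),
    8: ("abc", "bc"),
    9: ("abcde", "ab bc cd be ce"),
    10: ("ab", "ab bb"),
}

def ext(model, y):
    """All x with (x, y) in E, i.e. the 'elements' of y in the model."""
    return frozenset(p[0] for p in model[1].split() if p[1] == y)

def axiom0(model):
    """Existence: the domain is nonempty."""
    return len(model[0]) > 0

def axiom1(model):
    """Extensionality: distinct sets never have precisely the same elements."""
    domain = model[0]
    return all(x == y or ext(model, x) != ext(model, y)
               for x in domain for y in domain)

def has_empty_set(model):
    """Negation of (2.1): some set of the model has no elements."""
    return any(not ext(model, y) for y in model[0])

for i, m in MODELS.items():
    print(f"M{i}: Axiom 0 {axiom0(m)}, Axiom 1 {axiom1(m)}, "
          f"empty set {has_empty_set(m)}")
```

Running the sketch reports a violation of Axiom 1 precisely for M5 and M8 and the absence of an empty set precisely for M2, M3, M4.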
Remark 2.4. Using models of set theory to prove independence results, as we have
just done in Def. 2.2 and Ex. 2.3, is subject to some logical subtleties: The validity of
such arguments relies on the admissibility of constructing the respective models. For
example, one can obtain all the models of Def. 2.2 if one is allowed to form sets with
up to 5 distinct elements, to form ordered pairs from these elements, and
to form sets containing the obtained ordered pairs as elements (of
course, each individual model can be obtained with weaker construction rules).
2.3 Comprehension
To obtain, among many other things, the existence of the empty set, we introduce the
additional axiom of comprehension. More precisely, in the case of comprehension, we
do not have a single axiom, but a scheme of infinitely many axioms, one for each set-
theoretic formula that satisfies a certain condition. Its formulation makes use of the
following definition:
Definition 2.5. One obtains the universal closure of a set-theoretic formula φ by
writing $\forall_{v_j}$ in front of φ for each variable vj that occurs as a free variable in φ (recall from
Def. and Rem. 1.15 that vj is free in φ if, and only if, it is not bound by a quantifier
in φ). While, if φ contains more than one free variable, the universal closure of φ is
nonunique (as one can choose an arbitrary order of the $\forall_{v_j}$ in front of φ), this does not
cause a problem, since all universal closures of φ are equivalent.
is an axiom. Thus, the comprehension scheme states that, given the set
X, there exists (at least one) set Y , containing precisely the elements of X
that have the property φ (the importance of allowing φ in (2.2) to have free
variables will be illustrated in Ex. 2.11 below, where Ex. 2.11(e) will also
show, why Y must not be free in φ).
Lemma 2.6. Axioms 0 and 2 (i.e. the existence of a set together with the comprehension
scheme) imply the existence of (at least one) empty set, i.e. the validity of
\[ \exists_{Y}\,\forall_{x}\; x \notin Y. \tag{2.3} \]
Proof. According to Axiom 0, there exists a set X. Letting φ denote the set-theoretic
formula x ≠ x, Axiom 2 yields
\[ \exists_{Y}\,\forall_{x}\,\bigl( x \in Y \Leftrightarrow (x \in X \wedge x \neq x) \bigr). \]
Since, for each x, the statement x ∈ X ∧ x ≠ x is false, x ∈ Y must be false for each x
as well, thereby proving (2.3).
Example 2.7. We check which of our toy models M1 , . . . , M10 of Def. 2.2 satisfy Axiom
2 (i.e. the comprehension scheme):
We begin with some general considerations that will be useful for several of the models:
Claim 1: If X in (2.2) is an empty set, then (2.2) holds with Y := X: Indeed, both
x ∈ Y and x ∈ X ∧ φ are then false for each x and φ.
Claim 2: If the domain Di contains elements A, B, C (not necessarily distinct), where A
is empty and C contains precisely one element, namely B, then (2.2) holds for X := C:
Indeed, there are four possible cases to check: (i) φ does not contain x as a free variable
and φ is true (independently of x) – then (2.2) holds with Y := C (since x ∈ C ⇔
(x ∈ C ∧ φ)); (ii) φ does not contain x as a free variable and φ is false (independently
of x) – then (2.2) holds with Y := A; (iii) φ does contain x as a free variable and φ is
true for x = B – then (2.2) holds with Y := C (since x ∈ C ⇔ (x ∈ C ∧ φ), both sides
being true for x = B, both sides being false for x ≠ B); (iv) φ does contain x as a free
variable and φ is false for x = B – then (2.2) holds with Y := A (since both sides of
x ∈ A ⇔ (x ∈ C ∧ φ) are false for each x).
Axiom 2 holds in M1 : Since D1 contains only one set, namely a, which is empty (ac-
cording to E1 ), (2.2) is true for Y := a by Claim 1. In combination with Ex. 2.3, we see
that M1 satisfies all Axioms 0 – 2. As it also satisfies
\[ \forall_{x}\,\neg\,\exists_{y}\,(y \in x), \]
M1 shows that Axioms 0 – 2 do not suffice to prove the existence of nonempty sets.
Axiom 2 does not hold in M2 , M3 , M4 : We know from Ex. 2.3(a),(c) that these models
satisfy Axiom 0, but violate (2.3). Thus, Lem. 2.6 yields that Axiom 2 does not hold.
Axiom 2 holds in M5 : If X := a, then, since a is an empty set, (2.2) holds with Y := a
by Claim 1. If X := b, then, since b contains precisely a, (2.2) holds by Claim 2 (using
A := B := a, C := b). If X := c, then, since c contains precisely a, (2.2) holds again by
Claim 2 (using A := B := a, C := c).
It is an exercise to show that Axiom 2 holds in M7 , M8 , M9 , but fails in M6 and M10 .
We summarize the toy models’ properties we found so far in the following table:
M1 M2 M3 M4 M5 M6 M7 M8 M9 M10
Axiom 0 (Existence) T T T T T T T T T T
Axiom 1 (Extensionality) T T T T F T T F T T
¬(2.1) (has empty set) T F F F T T T T T T
Axiom 2 (Comprehension) T F F F T F T T T F
In particular, comparing the corresponding rows in the table above, we find Axioms 1
and 2 to be mutually independent.
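The comprehension row of the table can be reproduced by enumeration, using an observation that is not stated in the notes and should be read as an assumption of this sketch: in a finite model, every subset {s1, . . . , sk} of the domain is described by the formula x = p1 ∨ · · · ∨ x = pk with the parameters p1, . . . , pk assigned to s1, . . . , sk (and the empty subset by x ≠ x). Hence, in a finite model, the comprehension scheme holds if, and only if, for each set X, every subset of the elements of X is the collection of elements of some set Y. The Python encoding of the models and the helper names below are hypothetical.

```python
from itertools import combinations

# Toy models of Def. 2.2: (domain, relation), "xy" meaning (x, y) ∈ E.
MODELS = {
    1: ("a", ""), 2: ("a", "aa"), 3: ("ab", "ab ba"),
    4: ("abc", "ab ba ac bc"), 5: ("abc", "ab ac"),
    6: ("abcd", "ab ac ad bc bd cd"), 7: ("abc", "ab bc"),
    8: ("abc", "bc"), 9: ("abcde", "ab bc cd be ce"), 10: ("ab", "ab bb"),
}

def ext(model, y):
    """All x with (x, y) in E, i.e. the 'elements' of y in the model."""
    return frozenset(p[0] for p in model[1].split() if p[1] == y)

def subsets(s):
    """All subsets of the finite set s."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comprehension(model):
    """Finite-model criterion: each subset of ext(X) equals ext(Y) for some Y."""
    domain = model[0]
    extensions = {ext(model, y) for y in domain}
    return all(s in extensions
               for x in domain
               for s in subsets(ext(model, x)))

holds = [i for i, m in MODELS.items() if comprehension(m)]
print("Axiom 2 holds in:", holds)   # [1, 5, 7, 8, 9]
```

For instance, the criterion fails in M6 because d contains b, but no set of M6 has b as its only element.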
Remark 2.8. Comprehension alone does not provide uniqueness (for instance, we found
in Ex. 2.7 that model M8 satisfies comprehension, even though it has two distinct empty
sets). However, if one also assumes Axiom 1 (extensionality) and if both Y and Y ′ are
sets containing precisely the elements of X that have the property φ, then
\[ \forall_{x}\,\bigl( x \in Y \Leftrightarrow (x \in X \wedge \varphi) \Leftrightarrow x \in Y' \bigr), \]
{x : x ∈ X ∧ φ} := {x ∈ X : φ} := Y. (2.4)
Theorem 2.9. Assuming Axioms 0 – 2, there exists a unique empty set (which we
denote by ∅ or by 0 – it is common to identify the empty set with the number zero in
axiomatic set theory).
Remark 2.10. In Rem. 1.14 we said that every formula with additional symbols and
notation is to be regarded as an abbreviation or transcription of a set-theoretic formula
as defined in Def. 1.11(b). Thus, formulas containing symbols for defined sets (e.g. 0
or ∅ for the empty set) are to be regarded as abbreviations for formulas without such
symbols. Some logical subtleties arise from the fact that there is some ambiguity in the
way such abbreviations can be resolved: For example, 0 ∈ X might abbreviate
ψ : $\exists_{y}\,\bigl(\varphi(y) \wedge y \in X\bigr)$ or χ : $\forall_{y}\,\bigl(\varphi(y) \Rightarrow y \in X\bigr)$, where φ(y) stands for $\forall_{v}\,(v \notin y)$.
(e.g., if Axioms 0 – 2 hold), but they can be nonequivalent, otherwise: For example, in
model M8 of Def. 2.2, consider ψ and χ with X := c. In M8 , φ(y) is true for y := a and
y := b. Thus, ψ is true in M8 (since (b, c) ∈ E8 ), but χ is false in M8 (since (a, c) ∉ E8 ).
To avoid introducing logical ambiguities, we will only use formulas with symbols for
defined sets under the assumption of extensionality.
—
At first glance, the role played by the free variables in φ, which are allowed to occur
in Axiom 2, might seem a bit obscure. So let us consider examples to illustrate that
allowing free variables (i.e. set parameters) in comprehension is quite natural:
(a) If φ in (2.2) is the formula x ∈ Z (having x, Z as free variables), then the set given
by the resulting axiom yields precisely the intersection of X and Z:
X ∩ Z := {x ∈ X : φ} = {x ∈ X : x ∈ Z}.
(b) While (a) shows how Axiom 2 provides the intersection of two sets, with a modification,
Axiom 2 also yields the existence of intersections of more than two sets (of
both finitely and even infinitely many): If $\mathcal{M}$ is a nonempty set, $X \in \mathcal{M}$, and φ in
(2.2) is the formula $\forall_{M \in \mathcal{M}}\; x \in M$ (having x, $\mathcal{M}$ as free variables), then the set given
by the resulting axiom yields precisely the intersection of all sets that are elements
of $\mathcal{M}$:
\[ \bigcap \mathcal{M} := \bigcap_{M \in \mathcal{M}} M := \Bigl\{ x : \forall_{M \in \mathcal{M}}\; x \in M \Bigr\} := \Bigl\{ x \in X : \forall_{M \in \mathcal{M}}\; x \in M \Bigr\}. \tag{2.5} \]
$\mathcal{M} := \{M_i : i \in I\}$
(if I is a set and Mi is a set for each i ∈ I, then $\mathcal{M}$ as above will be a set by Axiom
5). It is emphasized that the sets $\mathcal{M}$ and I in (2.5) and (2.6), respectively, were
required to be nonempty. If one tries to form
\[ \bigcap \emptyset = \Bigl\{ x : \forall_{X \in \emptyset}\; x \in X \Bigr\} = \Bigl\{ x : \forall_{i \in \emptyset}\; x \in M_i \Bigr\} = \bigcap_{i \in \emptyset} M_i, \]
then one obtains the so-called universal class of all sets V, which is not a set (cf.
Sec. 2.4 below, in particular Ex. 2.14(b)).
X \ Z := {x ∈ X : φ} = {x ∈ X : x ∉ Z}.
(d) Note that it is even allowed for φ in (2.2) to have X as a free variable, so one can
let φ be the formula $\exists_{u}\,(x \in u \wedge u \in X)$ to define the set
\[ X^{*} := \Bigl\{ x \in X : \exists_{u}\,(x \in u \wedge u \in X) \Bigr\}. \]
2∗ = {0} = 1.
(e) It is essential that φ in (2.2) must not contain Y as a free variable. Otherwise, one
would have a contradiction as soon as there exists any nonempty set: Suppose φ
in (2.2) were allowed to be the formula x ∉ Y . Then, if X is nonempty, i.e. there
exists x ∈ X, (2.2) would require the existence of a set Y such that x ∈ Y ⇔ x ∉ Y .
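For finite sets, the set-builder constructions of (a) – (d) can be imitated directly. The following Python sketch is an illustration only and not part of the original notes: it assumes that hereditarily finite sets are modelled as nested `frozenset` objects and that the property φ is passed as a Python predicate; the helper name `comprehension` is hypothetical.

```python
def comprehension(X, phi):
    """{x ∈ X : phi(x)} for a finite set X modelled as a frozenset."""
    return frozenset(x for x in X if phi(x))

zero = frozenset()             # 0 := ∅
one = frozenset({zero})        # 1 := {0}
two = frozenset({zero, one})   # 2 := {0, 1}

X, Z = two, one
intersection = comprehension(X, lambda x: x in Z)             # (a): X ∩ Z
difference = comprehension(X, lambda x: x not in Z)           # X \ Z
star = comprehension(X, lambda x: any(x in u for u in X))     # (d): X*
empty = comprehension(X, lambda x: x != x)                    # as in Lemma 2.6

# (b): intersection of all elements of a nonempty family, taken inside X.
family = [two, one]
family_intersection = comprehension(X, lambda x: all(x in M for M in family))

print(intersection == one)                # True: 2 ∩ 1 = {0}
print(difference == frozenset({one}))     # True: 2 \ 1 = {1}
print(star == one)                        # True: 2* = {0} = 1
print(empty == zero)                      # True
print(family_intersection == one)         # True
```

Note that every construction filters an already given set X, mirroring the restriction x ∈ X in (2.2); the sketch has no way of forming {x : φ} without such a bounding set.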
to allowing a cow or a monkey (or any other object without elements, other than the
empty set) to be considered a set, which would mean that our set-theoretic variables vj
were allowed to be a cow or a monkey as well. However, extensionality then implies the
false statement C = M = ∅, thereby excluding cows and monkeys from the mathematical
universe. Similarly, {C} and {M } (or any other object that contains a non-set) cannot
be inside the mathematical universe. Indeed, otherwise we would have
\[ \forall_{x}\,\bigl( x \in \{C\} \Leftrightarrow x \in \{M\} \bigr) \]
(as C and M are non-sets) and, by extensionality, {C} = {M } would be true, in contradiction
to a set with a cow inside not being the same as a set with a monkey inside. Thus,
we see that all objects of the mathematical universe must be so-called hereditary sets,
i.e. sets all of whose elements (thinking of the elements as being the children of the sets)
are also sets.
2.4 Classes
As we need to avoid contradictions such as Russell’s antinomy, we must not require the
existence of a set {x : φ} for each set-theoretic formula φ. However, it can still be
useful to think of a “collection” of all sets having the property φ. Such collections are
commonly called classes:
(b) If φ is a set-theoretic formula, then we say the class {x : φ} exists (as a set) if, and
only if,
\[ \exists_{X}\,\forall_{x}\,\bigl( x \in X \Leftrightarrow \varphi \bigr) \tag{2.7} \]
Example 2.14. (a) Due to Russell’s antinomy of Sec. 1.1, we know that
R := {x : x ∉ x}
(b) The universal class of all sets, V := {x : x = x}, is a proper class. Once again,
this is related to Russell’s antinomy: If V were a set, then
R = {x : x ∉ x} = {x : x = x ∧ x ∉ x} = {x : x ∈ V ∧ x ∉ x}
Remark 2.15. From the perspective of formal logic, statements involving proper classes
are to be regarded as abbreviations for statements without proper classes. For example,
it turns out that the class $\mathcal{G}$ of all sets forming a group is a proper class. But we might
write $G \in \mathcal{G}$ as an abbreviation for the statement “The set G is a group.”
2.5 Pairing
As we saw from our investigation of model M1 in Ex. 2.7, Axioms 0 – 2 are still consistent
with the empty set being the only set in existence. The next axiom will provide the
existence of nonempty sets:
Axiom 3 Pairing:
\[ \forall_{x}\,\forall_{y}\,\exists_{Z}\,(x \in Z \wedge y \in Z). \tag{2.8} \]
Thus, the pairing axiom states that, for all sets x and y, there exists a set Z
that contains x and y as elements.
—
0 := ∅, (2.9a)
1 := {0}, (2.9b)
2 := {0, 1} (2.9c)
Definition 2.16. Assume Axioms 0 – 3. If x, y are sets and Z is given by the pairing
axiom, then we call
(c) (x, y) := {{x}, {x, y}} the ordered pair given by x and y.
where, by extensionality, {x} ≠ {x, y} ≠ {x′ }. Thus, using extensionality again, {x} =
{x′ } and x = x′ . Next, we conclude
(a) We now have the existence of the infinitely many different sets 0, {0}, {{0}}, . . . .
In particular, none of our finite toy models M1 , . . . , M10 from Def. 2.2 can satisfy
Axioms 0 – 3. While we will need the axiom of infinity of Sec. 2.8.1 below to
formally define the notions finite and infinite, in Ex. 2.19 below, we will see that
only M2 and M10 satisfy pairing (and we know from Ex. 2.7 that M2 and M10 do
not satisfy comprehension). However, Axioms 0 – 3 do not, yet, suffice to prove the
existence of sets with more than two elements.
(b) At this stage, it would already be possible to introduce the notion of a relation by
calling a set a relation if, and only if, all its elements are ordered pairs. However,
without further axioms, this becomes cumbersome, one can not, actually, construct
many interesting relations anyway, and certain definitions (such as domain, image,
function) would depend on the particular definition of (x, y) := {{x}, {x, y}} in Def.
2.16(c), rather than merely on the key property (2.10) of ordered pairs (cf. [Kun12,
Sec. I.7.1]). Thus, we postpone the definition and consideration of relations and
functions to Sec. 2.7.2, where we can use the axioms of union and replacement
to justify the existence of Cartesian products, then giving rise to relations and
functions in the usual way.
(c) Once one has ordered pairs, one can proceed to define more general ordered tuples
by letting
(v1 ) := v1 ,
(v1 , v2 ) := (v1 , (v2 )) := {{v1 }, {v1 , v2 }} (ordered pair, same as Def. 2.16(c)),
(v1 , v2 , v3 ) := (v1 , (v2 , v3 )) (ordered triple),
(v1 , v2 , v3 , v4 ) := (v1 , (v2 , v3 , v4 )) (ordered quadruple),
...
where v1 , v2 , . . . are arbitrary sets. While this is less elegant than the usual definition
of ordered n-tuples (v1 , . . . , vn ) as the function v : {1, . . . , n} −→ {v1 , . . . , vn },
vi := v(i), it has the advantage of not needing any further axioms. Once we have
sufficiently many axioms to justify definition via recursion and proof via induction,
we can show both definitions of ordered n-tuples to be equivalent.
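The ordered pair of Def. 2.16(c) and the recursive tuples of (c) can be tried out on small sets. The following Python sketch is an illustration only and not part of the original notes: it assumes hereditarily finite sets are modelled as nested `frozenset` objects, the helper names `pair` and `ordered_tuple` are hypothetical, and the property tested (equality of pairs holds exactly when both components agree) is assumed to be the key property of ordered pairs referred to in (b).

```python
from itertools import product

def pair(x, y):
    """Ordered pair (x, y) := {{x}, {x, y}} of Def. 2.16(c)."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def ordered_tuple(*vs):
    """(v1) := v1 and (v1, ..., vn) := (v1, (v2, ..., vn)) for n >= 2."""
    if len(vs) == 1:
        return vs[0]
    return pair(vs[0], ordered_tuple(*vs[1:]))

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
sets = [zero, one, two]

# (x, y) = (u, v) should hold if, and only if, x = u and y = v.
key_property = all(
    (pair(x, y) == pair(u, v)) == (x == u and y == v)
    for x, y, u, v in product(sets, repeat=4)
)
print(key_property)                                    # True
print(pair(zero, zero) == frozenset({one}))            # True: (0, 0) = {{0}} = {1}
print(ordered_tuple(zero, one, two) == pair(zero, pair(one, two)))   # True
```

The check over three sample sets is, of course, no proof; it merely illustrates the statement on a finite sample.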
Example 2.19. We check which of our toy models M1 , . . . , M10 of Def. 2.2 satisfy Axiom
3 (pairing): Axiom 3 holds only in M2 and M10 , and is violated in all the remaining
models: Axiom 3 holds in M2 , since a is the only set in the model and a is an element of
a. Axiom 3 does not hold in M1 , M3 , . . . , M9 : In M1 , there is no set containing a; in M3 ,
there is no set containing both a and b; in M4 , M5 , M7 , M8 , there is no set containing
c; and in M6 and M9 , there is no set containing d. Axiom 3 holds in M10 , since a and
b are the only sets in the model and b contains both a and b. We summarize the toy
models’ properties we found so far in the following table:
M1 M2 M3 M4 M5 M6 M7 M8 M9 M10
Axiom 0 (Existence) T T T T T T T T T T
Axiom 1 (Extensionality) T T T T F T T F T T
¬(2.1) (has empty set) T F F F T T T T T T
Axiom 2 (Comprehension) T F F F T F T T T F
Axiom 3 (Pairing) F T F F F F F F F T
2.6 Union
To be able to construct sets with more than two elements, we introduce the following
axiom:
Axiom 4 Union:
\[ \forall_{M}\, \exists_{Y}\, \forall_{x}\, \forall_{X}\, \bigl( (x \in X \wedge X \in M) \Rightarrow x \in Y \bigr). \tag{2.11} \]
Thus, the union axiom states that, for each set of sets M, there exists a set
Y containing all elements of elements of M.
Definition 2.20. (a) If M is a set and Y is given by the union axiom, then define
\[ \bigcup M := \bigcup_{X \in M} X := \Bigl\{ x \in Y : \exists_{X \in M}\, x \in X \Bigr\}. \tag{2.12} \]
Remark 2.21. (a) Analogous to (2.6) for intersections, once one has a family of sets
(M_i)_{i∈I}, it is also useful to define set-theoretic unions as
\[ \bigcup_{i \in I} M_i := \Bigl\{ x : \exists_{i \in I}\, x \in M_i \Bigr\}. \tag{2.13} \]
is the empty set and, in particular, a set (this is in contrast to the situation for
intersections, where ⋂∅ = V, which is a proper class and not a set, cf. Ex. 2.11(b)).
Definition 2.22. For each set x, we define its successor to be the set x ∪ {x}. While
we will define functions between sets in the usual way in Sec. 2.7.2 below, it can already
be useful to think of the successor function as a class function
S : V −→ V, S(x) := x ∪ {x}
(clearly, S will not be a function between sets, since it is defined for each set x, that
means it is defined on the proper class V – however each restriction to a set V will be
a set function in the usual sense). Recalling (2.9), we have 1 = S(0), 2 = S(1); and we
can define 3 := S(2), . . .
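For hereditarily finite sets, the successor function can be computed directly. The following is a minimal sketch, assuming that sets are modeled by Python frozensets; the function name successor is a hypothetical choice for this illustration.

```python
def successor(x):
    """S(x) := x ∪ {x} for a set x modeled as a frozenset."""
    return x | frozenset({x})


zero = frozenset()        # 0 is the empty set
one = successor(zero)     # 1 = S(0) = {0}
two = successor(one)      # 2 = S(1) = {0, 1}
three = successor(two)    # 3 = S(2) = {0, 1, 2}

assert one == frozenset({zero})
assert two == frozenset({zero, one})
assert three == frozenset({zero, one, two})
```

Each application of S adds exactly one new element, namely the argument itself, so the set obtained after n steps has n elements.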
Example 2.23. We check which of our toy models M1 , . . . , M10 of Def. 2.2 satisfy Axiom
4 (union): As it turns out, Axiom 4 holds in each Mi , except for i = 9:
Axiom 4 holds in M1 , since a is the only set in D1 and a is empty. Axiom 4 holds in M3 :
If M := a in (2.11), then the only possibility (due to E3 ) is X = b and, thus, x = a,
implying (2.11) to hold with Y = b (since (a, b) ∈ E3 ). Switching the roles of a and b
shows (2.11) to hold with Y = a for M := b. Axiom 4 holds in M5 : For M := a, (2.11)
is trivially true (with arbitrary Y ∈ D5 ), since a is empty; for M := b or M := c, (2.11)
still holds with arbitrary Y , since, in both cases, X = a and a is empty.
We leave it as an exercise to verify Axiom 4 also holds in M2 , M4 , M6 , M7 , M8 , M10 .
Axiom 4 does not hold in M9 : Consider (2.11) with M := e. Since e contains b and c,
b contains a, and c contains b, we would need an element Y of D9 that contains both a
and b. However, D9 does not contain such an element.
We summarize the toy models’ properties we found so far in the following table:
                             M1  M2  M3  M4  M5  M6  M7  M8  M9  M10
Axiom 0 (Existence)          T   T   T   T   T   T   T   T   T   T
Axiom 1 (Extensionality)     T   T   T   T   F   T   T   F   T   T
¬(2.1) (has empty set)       T   F   F   F   T   T   T   T   T   T
Axiom 2 (Comprehension)      T   F   F   F   T   F   T   T   T   F
Axiom 3 (Pairing)            F   T   F   F   F   F   F   F   F   T
Axiom 4 (Union)              T   T   T   T   T   T   T   T   F   T
2.7 Replacement
2.7.1 Replacement Scheme, Cartesian Products
As mentioned before, we desire to define relations and functions in the usual manner,
making use of the Cartesian product A × B of two sets A and B, where A × B consists
of all ordered pairs (x, y), where x ∈ A and y ∈ B. However, Axioms 0 – 4 are not suf-
ficient to justify the existence of Cartesian products. To obtain Cartesian products, we
employ the following axiom of replacement. Analogous to the axiom of comprehension,
the axiom of replacement actually consists of a scheme of infinitely many axioms, one
for each set-theoretic formula. For the formulation of replacement, it is convenient to
introduce another abbreviation:
Notation 2.24. If φ is a set-theoretic formula, then
\[ \exists!_{y}\, \varphi \quad \text{is short for} \quad \exists_{y} \bigl( \varphi(y) \wedge \forall_{z}\, ( \varphi(z) \Rightarrow y = z ) \bigr), \tag{2.14} \]
where the notation φ(y) and φ(z) is supposed to mean that, if y is free in φ, then this
free y is replaced by z to obtain φ(z) from φ(y). Thus, ∃!_y φ holds if, and only if, there
exists a unique set y with the property φ.
is an axiom. Thus, the replacement scheme states that if, for each x ∈ X,
there exists a unique y having the property φ (where, in general, φ will depend
on x), then there exists a set Y that, for each x ∈ X, contains this y with
property φ. One can view this as obtaining Y by replacing each x ∈ X by
the corresponding y = y(x).
Theorem 2.25. Assuming Axioms 0 – 5, the following holds true: If A and B are sets,
then the Cartesian product of A and B, i.e. the class
\[ A \times B := \bigl\{ x : \exists_{a \in A}\, \exists_{b \in B}\, x = (a, b) \bigr\} \tag{2.16} \]
exists as a set.
Proof. For each a ∈ A, we can use replacement with X := B and φ := φa being the
formula y = (a, x) to obtain the existence of the set
{a} × B := {(a, x) : x ∈ B} (2.17a)
(in the usual way, comprehension and extensionality were used as well). Analogously,
using replacement again with X := A and φ being the formula y = {x} × B, we obtain
the existence of the set
M := {{x} × B : x ∈ A}. (2.17b)
to be a set as well.
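For finite sets, the construction used in this proof can be imitated step by step. The following is a minimal sketch, assuming that sets are modeled by Python frozensets, that replacement is imitated by collecting the values of a function on a set, and that the last step forms the union of the set M from (2.17b); the helper names pair, replace, big_union, and cartesian_product are hypothetical.

```python
def pair(a, b):
    """Ordered pair (a, b) := {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def replace(X, f):
    """Finite stand-in for replacement: the set of all f(x) with x in X."""
    return frozenset(f(x) for x in X)


def big_union(M):
    """Union of a set of sets, cf. (2.12)."""
    return frozenset(x for X in M for x in X)


def cartesian_product(A, B):
    # Inner replacement, cf. (2.17a): {a} x B for a fixed a.
    # Outer replacement, cf. (2.17b): M := {{a} x B : a in A}.
    M = replace(A, lambda a: replace(B, lambda b: pair(a, b)))
    # Assumed final step: A x B is obtained as the union of M.
    return big_union(M)


A = frozenset({1, 2})
B = frozenset({3, 4})
product = cartesian_product(A, B)
assert product == frozenset({pair(1, 3), pair(1, 4), pair(2, 3), pair(2, 4)})
```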
Example 2.26. We check which of our toy models M1 , . . . , M10 of Def. 2.2 satisfy
Axiom 5 (replacement): We will see that Axiom 5 holds in M1 , M2 , M3 , M10 , but fails
in M4 , . . . , M9 :
Axiom 5 holds in M1 : Since the only set in D1 is the empty set a, x ∈ X = a in (2.15)
is false for each x, implying (2.15) to hold with Y = a.
Axiom 5 holds in M2 : Once again, X = a is the only possibility in (2.15). Since a is the
only element of D2 , each admissible φ must hold precisely for y := a, implying (2.15) to
hold with Y = a.
Axiom 5 holds in M3 : The only possibilities in (2.15) are X := a or X := b. In both
cases, since a and b each have precisely one element, each admissible φ must either hold
precisely for y := a (in which case (2.15) holds with Y = b) or precisely for y := b (in
which case (2.15) holds with Y = a).
Axiom 5 does not hold in M4 : Consider (2.15) with X := b and
\[ \varphi := \exists_{u,v}\, \bigl( u \neq v \wedge u \in y \wedge v \in y \bigr). \]
Then φ is admissible in (2.15) (since y = c is the unique set in D4 with precisely two
elements). However, there does not exist a set Y ∈ D4 such that (c, Y ) ∈ E4 .
Models M5 – M10 are left as an exercise.
We summarize the toy models’ properties we found so far in the following table⁵:
                             M1  M2  M3  M4  M5  M6  M7  M8  M9  M10
Axiom 0 (Existence)          T   T   T   T   T   T   T   T   T   T
Axiom 1 (Extensionality)     T   T   T   T   F   T   T   F   T   T
¬(2.1) (has empty set)       T   F   F   F   T   T   T   T   T   T
Axiom 2 (Comprehension)      T   F   F   F   T   F   T   T   T   F
Axiom 3 (Pairing)            F   T   F   F   F   F   F   F   F   T
Axiom 4 (Union)              T   T   T   T   T   T   T   T   F   T
Axiom 5 (Replacement)        T   T   T   F   F   F   F   F   F   T
⁵ In the literature, one sometimes finds the statement that the axiom of replacement plus the existence
of an empty set implies the axiom of comprehension. Models M3 and M10 show that with the axiom
of replacement in the form (2.15), which is the version found, e.g., in [Kun12, Sec. I.2] and [Hal17, Ch.
3.7], this is not the case! The situation is different if the axiom of replacement requires that the set Y
in (2.15) contains precisely those y with φ = φ(x, y) true for some x ∈ X.
Now that we have the existence of Cartesian products according to Th. 2.25, we proceed
to define relations in the usual way:
Definition 2.27. Assume Axioms 0 – 5. Given sets A and B, each subset R of A × B is
called a relation over A and B (if A = B, then we call R a relation on A). If one wants
to be completely precise, a relation is an ordered triple (A, B, R), where R ⊆ A × B (see
Rem. 2.18(c) above for the definition of ordered triples). The set A is called the domain
of R, denoted dom(R)⁶, B is called the codomain of R, denoted codom(R), and R is
the relation’s graph (here we commit the usual abuse of notation, referring to both the
relation triple and the relation’s graph as R). One says that a ∈ A and b ∈ B are related
according to the relation R if, and only if, (a, b) ∈ R. In this context, one often writes
a R b instead of (a, b) ∈ R.
Definition and Remark 2.28. Assume Axioms 0 – 5. Let A, B be sets and let
R ⊆ A × B be a relation over A and B. If T is a subset of A, then call
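\[ R(T) := \bigl\{ b \in B : \exists_{a \in T}\, (a, b) \in R \bigr\} \]
the image of T under R. If U is a subset of B, then call
\[ R^{-1}(U) := \bigl\{ a \in A : \exists_{b \in U}\, (a, b) \in R \bigr\} \]
the preimage or inverse image of U under R. Moreover, we call R(A) the image of R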
and we call R−1 (B) the preimage, inverse image, or active domain of R (cf. footnote
to the definition of domain in Def. 2.27 above). To prove the existence of R(T ) and
R−1 (U ) as sets, apply (2.15) with X := R and
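\[ \varphi := \exists_{a}\, x = (a, y) \]
to obtain Y to be a superset of R(T ) (and, then, R(T ) via comprehension), and with
X := R and
\[ \varphi := \exists_{b}\, x = (y, b) \]
to obtain, analogously, a superset of R−1 (U ) (and, then, R−1 (U ) via comprehension).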
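For finite relations, images and preimages can be computed directly from the graph. The following is a minimal sketch, assuming that a relation is modeled as a Python set of 2-tuples and that the preimage of U consists of all a that are related to some b in U; the function names image and preimage are hypothetical.

```python
def image(R, T):
    """R(T): all b such that (a, b) is in R for some a in T."""
    return {b for (a, b) in R if a in T}


def preimage(R, U):
    """R^{-1}(U): all a such that (a, b) is in R for some b in U."""
    return {a for (a, b) in R if b in U}


A = {1, 2, 3}
B = {"x", "y", "z"}
R = {(1, "x"), (1, "y"), (3, "y")}

assert image(R, {1}) == {"x", "y"}
assert image(R, A) == {"x", "y"}      # the image of R
assert preimage(R, {"y"}) == {1, 3}
assert preimage(R, B) == {1, 3}       # the active domain of R
```

In this example, the active domain {1, 3} is a proper subset of the domain A, and the image {"x", "y"} is a proper subset of the codomain B.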
(a) R is called univalent or right-unique or a partial function if, and only if,
\[ \forall_{x \in A}\, \forall_{y_1, y_2 \in B}\, \bigl( (x R y_1 \wedge x R y_2) \Rightarrow y_1 = y_2 \bigr), \]
i.e. if, and only if, every element of A is related to at most one element of B.
(b) R is called total if, and only if,
\[ \forall_{x \in A}\, \exists_{y \in B}\, (x R y), \]
i.e. if, and only if, in terms of Def. and Rem. 2.28, the active domain of R is all of
A (i.e. R−1 (B) = A).
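(c) R is called injective if, and only if,
\[ \forall_{y \in B}\, \forall_{x_1, x_2 \in A}\, \bigl( (x_1 R y \wedge x_2 R y) \Rightarrow x_1 = x_2 \bigr), \]
i.e. if, and only if, for every element y of B there exists at most one element of A
that is related to y.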
(d) R is called one-to-one if, and only if, it is an injective partial function.
(e) R is called surjective if, and only if,
\[ \forall_{y \in B}\, \exists_{x \in A}\, (x R y), \]
i.e. if, and only if, in terms of Def. and Rem. 2.28, the image of R is all of B (i.e.
R(A) = B).
(f ) R is called a function if, and only if, it is a total partial function. In this case, one
usually writes R : A −→ B and one introduces the usual notation
\[ \forall_{x \in A}\, \forall_{y \in B}\, \bigl( R(x) = y \;:\Leftrightarrow\; x \mapsto y \;:\Leftrightarrow\; x R y \bigr). \]
(g) If A = B, then R is called the identity on A if, and only if, R : A −→ A, R(x) = x.
For the identity on A, one writes Id_A (or simply Id, if A is understood). Actually,
the identity on A is the same as the equality relation “=” on A, sometimes also
called the diagonal on A, denoted ∆(A). Thus, one has
\[ \mathrm{Id}_A = \Delta(A) := \{ (x, x) \in A \times A : x \in A \} \]
and
\[ \forall_{x,y \in A}\, \bigl( x = y \Leftrightarrow \mathrm{Id}_A(x) = y \Leftrightarrow (x, y) \in \Delta(A) \bigr). \]
The preferred terms and notation depend on the emphasis being either on the
function perspective or the relation perspective.
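For finite relations, the properties named in this definition can be checked mechanically. The following is a minimal sketch, assuming that a relation is modeled as a Python set of 2-tuples with the domain A and the codomain B passed separately; all function names are hypothetical choices for this illustration.

```python
def is_univalent(R):
    """Partial function: every x is related to at most one y."""
    return all(y1 == y2 for (x1, y1) in R for (x2, y2) in R if x1 == x2)


def is_total(R, A):
    """Every x in A is related to at least one y."""
    return all(any(a == x for (x, _) in R) for a in A)


def is_injective(R):
    """Every y has at most one x related to it."""
    return all(x1 == x2 for (x1, y1) in R for (x2, y2) in R if y1 == y2)


def is_surjective(R, B):
    """Every y in B has at least one x related to it."""
    return all(any(b == y for (_, y) in R) for b in B)


def is_function(R, A):
    """A function is a total partial function."""
    return is_univalent(R) and is_total(R, A)


def identity(A):
    """Id_A = Δ(A), the diagonal on A."""
    return {(x, x) for x in A}


A = {1, 2, 3}
B = {"x", "y"}
R = {(1, "x"), (2, "x"), (3, "y")}

assert is_function(R, A) and is_surjective(R, B) and not is_injective(R)
assert is_function(identity(A), A) and is_injective(identity(A))
```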
Definition 2.30. Assume Axioms 0 – 5 and let R ⊆ A × B be a relation over sets A
and B.
(a) The relation
\[ R^{-1} := \{ (b, a) \in B \times A : (a, b) \in R \} \subseteq B \times A \]
is called the inverse or converse relation of R (note that the notation R−1 is
consistent with the notation introduced in Def. and Rem. 2.28).
(b) Given U ⊆ A, the relation S ⊆ U × B over U and B, defined by
\[ S := \{ (a, b) \in U \times B : a R b \} \]
is called the restriction of R to U ; R is called an extension of S to A. In this
situation, one also uses the notation R↾U for S (some authors prefer the notation
R|_U or R|U and often one is less precise and still writes R for the restriction). If R
is a relation on A (i.e. R ⊆ A × A), then we also define its strong restriction to U ,
denoted R↾↾U ⊆ U × U , to be the relation on U defined by
\[ R{\upharpoonright\upharpoonright}U := \{ (a, b) \in U \times U : a R b \} \]
(in general, one then has R↾↾U ⊊ R↾U ).
(c) Given a relation T ⊆ C × D over sets C and D the composition of R and T is the
relation over A and D defined by
\[ T \circ R := \bigl\{ (a, d) \in A \times D : \exists_{b \in B \cap C}\, ( a R b \wedge b T d ) \bigr\} \subseteq A \times D. \]
T ◦ (S ◦ R) = (T ◦ S) ◦ R. (2.18)
Moreover,
(d) One has the following law for forming images and preimages:
(e) If R and S are both partial functions (resp. both injective or both one-to-one), then
so is S ◦ R.
(f ) Assuming R(A) ⊆ C, the following holds true: If R and S are both total (resp. both
a function), then so is S ◦ R (but see Ex. 2.32(a)).
(g) Assuming S −1 (D) ⊆ B, the following holds true: If R and S are both surjective,
then so is S ◦ R (but see Ex. 2.32(a)).
(h) Assuming B = C, the following holds true⁷: If R and S are both bijective functions,
then so is S ◦ R (but see Ex. 2.32(a)).
(i) If R is a bijective function, then R−1 ◦ R = IdA (but see Ex. 2.32(b)).
Proof. (a): According to Def. 2.30(c), both T ◦ (S ◦ R) and (T ◦ S) ◦ R are relations over
A and F . So it just remains to prove
\[ \forall_{(a,f) \in A \times F}\, \bigl( (a, f) \in T \circ (S \circ R) \Leftrightarrow (a, f) \in (T \circ S) \circ R \bigr). \]
proving S ◦ R to be total. If R and S are both functions, then they are both total partial
functions, implying S ◦ R to be a total partial function (i.e. a function) by combining
what we have just proved with (e).
(g) – (i) are left as exercises.
Example 2.32. (a) To see that Prop. 2.31(f),(g),(h) are not correct without their re-
spective assumptions R(A) ⊆ C, S −1 (D) ⊆ B, B = C, consider A := B := {1, 2},
C := D := {2, 3}, R := IdA = {(1, 1), (2, 2)}, S := IdC = {(2, 2), (3, 3)}. Then R
and S are bijective functions, but S ◦ R = {(2, 2)} is neither total nor surjective.
An even simpler example is given by A := B := {1}, C := D := {2}, R := IdA ,
S := IdC , S ◦ R = ∅.
⁷ As one wants to apply (f) and (g), instead of B = C, one might be inclined to use the hypotheses
R(A) ⊆ C and S −1 (D) ⊆ B, since, at first glance, this might appear weaker. However, the also assumed
surjectivity of R then yields R(A) = B ⊆ C and the also assumed totality of S (i.e. surjectivity of S −1 )
then yields S −1 (D) = C ⊆ B, and we are back to B = C.
(b) To see that the converse of Prop. 2.31(i) does not hold, consider A := {1}, B :=
{1, 2, 3}, R := {(1, 1), (1, 2)}. Then R is not a function and not surjective, but still
R−1 = {(1, 1), (2, 1)} and R−1 ◦ R = IdA .
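Both parts of this example can be verified by direct computation. The following is a minimal sketch, assuming that relations are modeled as Python sets of 2-tuples; the function names compose, inverse, and identity are hypothetical.

```python
def compose(T, R):
    """T ∘ R: all (a, d) such that a R b and b T d for some b."""
    return {(a, d) for (a, b) in R for (c, d) in T if b == c}


def inverse(R):
    """Inverse relation: all (b, a) with (a, b) in R."""
    return {(b, a) for (a, b) in R}


def identity(A):
    """Id_A = {(x, x) : x in A}."""
    return {(x, x) for x in A}


# Part (a): S ∘ R for R = Id on {1, 2} and S = Id on {2, 3}.
assert compose(identity({2, 3}), identity({1, 2})) == {(2, 2)}
# Part (a), simpler variant: the composition is empty.
assert compose(identity({2}), identity({1})) == set()

# Part (b): R is not a function, but R^{-1} ∘ R is still the identity on A.
A = {1}
R = {(1, 1), (1, 2)}
assert inverse(R) == {(1, 1), (2, 1)}
assert compose(inverse(R), R) == identity(A)
```

The condition b ∈ B ∩ C from the definition of the composition is satisfied automatically here, since b occurs as a second component in R and as a first component in T.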
Definition 2.33. Assume Axioms 0 – 5 and let R be a relation on a set A, i.e. R ⊆ A×A.
\[ \forall_{x \in A}\, x R x, \]
i.e. if, and only if, each x is related to y if, and only if, y is related to x.
i.e. if, and only if, the only possibility for x to be related to y at the same time that
y is related to x is in the case x = y.
i.e. if, and only if, the relatedness of x and y together with the relatedness of y and
z implies the relatedness of x and z.
(f ) R is called an equivalence relation if, and only if, R is reflexive, symmetric, and
transitive.
i.e. if, and only if, if x and y are distinct, then x is related to y or y is related to x.
(h) R is called a partial order if, and only if, R is reflexive, antisymmetric, and transitive.
If R is a partial order, then one usually writes x ≤ y instead of x R y. A partial
order is called a total or linear order if, and only if, it also satisfies trichotomy.
(i) R is called a strict partial order if, and only if, R is asymmetric and transitive. If
R is a strict partial order, then one usually writes x < y instead of x R y. A strict partial
order is called a strict total or strict linear order if, and only if, it also satisfies
trichotomy.
Lemma 2.34. If ≤ is a partial order on a set A, then, using the notation of Def. 2.29(g),
< := ≤ \ ∆(A) is a strict partial order (called the strict partial order corresponding to
≤). Conversely, if < is a strict partial order on A, then ≤ := < ∪ ∆(A) is a partial
order (called the partial order corresponding to <).
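On a finite set, the correspondence between partial orders and strict partial orders can be checked by brute force. The following is a minimal sketch, assuming that relations are modeled as Python sets of 2-tuples and that asymmetry means that x R y excludes y R x (the usual meaning, assumed here); the divisibility order on {1, 2, 3, 4, 6, 12} is a hypothetical test case, and all function names are illustrative.

```python
def diagonal(A):
    """Δ(A) = {(x, x) : x in A}."""
    return {(x, x) for x in A}


def is_reflexive(R, A):
    return diagonal(A) <= R


def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)


def is_asymmetric(R):
    # Assumed definition: x R y excludes y R x.
    return all((y, x) not in R for (x, y) in R)


def is_transitive(R):
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)


def is_partial_order(R, A):
    return is_reflexive(R, A) and is_antisymmetric(R) and is_transitive(R)


def is_strict_partial_order(R):
    return is_asymmetric(R) and is_transitive(R)


A = {1, 2, 3, 4, 6, 12}
leq = {(x, y) for x in A for y in A if y % x == 0}  # x divides y
lt = leq - diagonal(A)                              # < := ≤ \ Δ(A)

assert is_partial_order(leq, A)
assert is_strict_partial_order(lt)
assert lt | diagonal(A) == leq                      # ≤ = < ∪ Δ(A)
```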
Proposition 2.35. Let R be a relation on a set A and R−1 its inverse relation as defined
Proof. Since R = (R−1 )−1 , for each equivalence, it suffices to prove just one implication
(the converse then follows by applying the first implication with R replaced by R−1 ).
Let x, y, z ∈ A. Then
x R x ⇒ x R−1 x,
proving (2.22a). If R is transitive, then
thereby proving (2.22j). We leave the remaining cases (all straightforward) as exercises.
Definition 2.36. Let ≤ be a partial order on A ≠ ∅, ∅ ≠ B ⊆ A.
(a) x ∈ A is called lower (resp. upper) bound for B if, and only if, x ≤ b (resp. b ≤ x)
for each b ∈ B. Moreover, B is called bounded from below (resp. from above) if, and
only if, there exists a lower (resp. upper) bound for B; B is called bounded if, and
only if, it is bounded from above and from below.
(b) x ∈ B is called minimum or just min (resp. maximum or max) of B if, and only if,
x is a lower (resp. upper) bound for B. One writes x = min B if x is minimum and
x = max B if x is maximum.
(c) A maximum of the set of lower bounds of B (i.e. a largest lower bound) is called
infimum of B, denoted inf B; a minimum of the set of upper bounds of B (i.e. a
smallest upper bound) is called supremum of B, denoted sup B.
We extend all the notions defined above to strict partial orders < by applying them
to the partial order corresponding to <, i.e. to ≤ := < ∪ ∆(A): For example, we call
x ∈ A a lower bound of B ⊆ A with respect to < if, and only if, x is a lower bound of
B with respect to ≤, and analogous for the other notions.
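For a finite partially ordered set, the notions of this definition can be computed by exhaustive search. The following is a minimal sketch, assuming that the partial order is given as a Python set of 2-tuples; the divisibility order is a hypothetical test case, and the functions return None where a minimum, maximum, infimum, or supremum does not exist.

```python
def lower_bounds(leq, A, B):
    return {x for x in A if all((x, b) in leq for b in B)}


def upper_bounds(leq, A, B):
    return {x for x in A if all((b, x) in leq for b in B)}


def minimum(leq, B):
    """An element of B that is a lower bound for B, or None."""
    found = [x for x in B if all((x, b) in leq for b in B)]
    return found[0] if found else None


def maximum(leq, B):
    """An element of B that is an upper bound for B, or None."""
    found = [x for x in B if all((b, x) in leq for b in B)]
    return found[0] if found else None


def infimum(leq, A, B):
    """Maximum of the set of lower bounds of B."""
    return maximum(leq, lower_bounds(leq, A, B))


def supremum(leq, A, B):
    """Minimum of the set of upper bounds of B."""
    return minimum(leq, upper_bounds(leq, A, B))


A = {1, 2, 3, 4, 6, 12}
leq = {(x, y) for x in A for y in A if y % x == 0}  # x divides y
B = {4, 6}

assert lower_bounds(leq, A, B) == {1, 2}
assert upper_bounds(leq, A, B) == {12}
assert minimum(leq, B) is None and maximum(leq, B) is None
assert infimum(leq, A, B) == 2 and supremum(leq, A, B) == 12
```

The example shows that a set can have an infimum and a supremum without having a minimum or a maximum: 4 and 6 are not comparable with respect to divisibility.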
Lemma 2.37. Let ≤ and < be relations on a set A, where ≤ is a partial order and < is
a strict partial order. Let ≥ := (≤)−1 and > := (<)−1 be the respective inverse relations
according to Def. 2.30(a), i.e.
\[ \forall_{x,y \in A}\, \bigl( (x \geq y \Leftrightarrow y \leq x) \wedge (x > y \Leftrightarrow y < x) \bigr). \tag{2.23} \]
According to (2.22h) and (2.22i), ≥ is also a partial order on A and > is also a strict
partial order on A, where ≤ (resp. <) being total on A, implies ≥ (resp. >) to be total
on A as well. If < is the strict order corresponding to ≤, then > is the strict order
corresponding to ≥. Moreover, for A ≠ ∅ and ∅ ≠ B ⊆ A, using obvious notation, we
have, for each x ∈ A,
x ≤-lower bound for B ⇔ x ≥-upper bound for B, (2.24a)
x ≤-upper bound for B ⇔ x ≥-lower bound for B, (2.24b)
x = min≤ B ⇔ x = max≥ B, (2.24c)
x = max≤ B ⇔ x = min≥ B, (2.24d)
x = inf ≤ B ⇔ x = sup≥ B, (2.24e)
x = sup≤ B ⇔ x = inf ≥ B. (2.24f)
All the equivalences in (2.24) also hold if ≤ is replaced by < and ≥ is replaced by >.
proving (2.24a). Analogously, we obtain (2.24b). Next, (2.24c) and (2.24d) are implied
by (2.24a) and (2.24b), respectively. Finally, (2.24e) is proved by
x = inf ≤ B ⇔ x = max≤ {y ∈ A : y ≤-lower bound for B}
⇔ x = min≥ {y ∈ A : y ≥-upper bound for B} ⇔ x = sup≥ B,
and (2.24f) follows analogously. That all the equivalences in (2.24) also hold if ≤ is
replaced by < and ≥ is replaced by > is now immediate from the last paragraph of Def.
2.36.
Proof. Exercise.
Definition 2.39. Let A, B be nonempty sets with partial orders, both denoted by ≤
(even though they might be different). A function f : A −→ B, is called (strictly)
isotone, order-preserving, or increasing if, and only if,
\[ \forall_{x,y \in A}\, \Bigl( x < y \Rightarrow f(x) \leq f(y) \quad (\text{resp. } f(x) < f(y)) \Bigr); \tag{2.25a} \]
Functions that are (strictly) isotone or antitone are called (strictly) monotone.
Proposition 2.40. Let A, B be nonempty sets with partial orders, both denoted by ≤.
(c) If the order ≤ on A is total and f : A −→ B is bijective and strictly isotone (resp.
antitone), then f −1 is also strictly isotone (resp. antitone).
Example 2.41. The following examples show that the assertions of Prop. 2.40(b),(c)
are no longer correct if one does not assume the order on A to be total. Let
\[ A := \{ (1, 1), (2, 1), (1, 2) \}. \]
Then
(m1 , m2 ) ≤ (n1 , n2 ) ⇔ m1 ≤ n1 ∧ m2 ≤ n2 , (2.26)
defines a partial order on A that is not a total order (for example, neither (1, 2) ≤ (2, 1)
nor (2, 1) ≤ (1, 2)).
is strictly isotone and bijective, however f −1 is not isotone (since 2 < 3, but
f −1 (2) = (1, 2) and f −1 (3) = (2, 1) are not comparable, i.e. f −1 (2) ≤ f −1 (3) is
not true).
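The failure of isotonicity for the inverse can be confirmed by a short computation. The following is a minimal sketch, assuming the map f : A −→ {1, 2, 3} with f((1, 1)) = 1, f((1, 2)) = 2, f((2, 1)) = 3; only the values f⁻¹(2) and f⁻¹(3) are stated in the text above, so the value at (1, 1) and the codomain {1, 2, 3} with its usual order are assumptions made for this illustration.

```python
A = {(1, 1), (2, 1), (1, 2)}


def leq_A(m, n):
    """Componentwise order on A, cf. (2.26)."""
    return m[0] <= n[0] and m[1] <= n[1]


def lt_A(m, n):
    """Strict order corresponding to leq_A."""
    return leq_A(m, n) and m != n


# Assumed map; f((1, 1)) = 1 is not stated in the text.
f = {(1, 1): 1, (1, 2): 2, (2, 1): 3}
f_inv = {value: key for key, value in f.items()}

not_total = not leq_A((1, 2), (2, 1)) and not leq_A((2, 1), (1, 2))
f_bijective = sorted(f.values()) == [1, 2, 3]
f_strictly_isotone = all(f[m] < f[n] for m in A for n in A if lt_A(m, n))
f_inv_isotone = all(
    leq_A(f_inv[i], f_inv[j]) for i in f_inv for j in f_inv if i < j
)

assert not_total and f_bijective and f_strictly_isotone
assert not f_inv_isotone
```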
Definition 2.42. A relation R on a set A is called a (strict) well-order if, and only if,
R is a (strict) total order and every nonempty subset of A has a min with respect to R
(for example, we will see that the usual ≤ constitutes a well-order on N; however, the
usual ≤ does not constitute a well-order on Z (e.g., Z does not have a min) or on R₀⁺
(e.g., R⁺ does not have a min)).
Definition 2.43. (a) Let R be a relation on a set A. We define R_≠ := R \ ∆(A), i.e.
R_≠ is the relation on A defined by
\[ x R_{\neq} y \;:\Leftrightarrow\; x R y \wedge x \neq y \]
(for example, if ≤ is a partial order, then < := (≤)_≠ is the corresponding strict
partial order, cf. Lem. 2.34).
(b) Let R be a relation on a set A and let S be a relation on a set B. We define a
relation P := R ⊙ S on A × B, called the lexicographic product of R and S, where
\[ (a_1, b_1)\, P\, (a_2, b_2) \;:\Leftrightarrow\; (a_1, b_1)\, (R \odot S)\, (a_2, b_2) \;:\Leftrightarrow\; a_1 R_{\neq} a_2 \vee (a_1 = a_2 \wedge b_1 S b_2). \]
(e) If R_≠ and S are transitive, then P is transitive (but see Ex. 2.45).
(g) If R and S are (strict) partial orders, then P is a (strict) partial order. In this
situation, one calls P the lexicographic order given by R and S. It is also common
to denote all three orders R, S, P by the same symbol ≤ (or all by < in the strict
case).
(h) If R and S are (strict) total orders, then P is a (strict) total order.
showing P to be antisymmetric.
(d),(e): Exercise.
(f): Assume R and S to satisfy trichotomy. Then
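\[
\begin{aligned}
\neg\bigl( (a_1, b_1)\, P\, (a_2, b_2) \bigr) \wedge \neg\bigl( (a_2, b_2)\, P\, (a_1, b_1) \bigr)
&\Rightarrow \neg(a_1 R_{\neq} a_2) \wedge \neg(a_2 R_{\neq} a_1) \\
&\Rightarrow a_1 = a_2 \Rightarrow \neg(b_1 S b_2) \wedge \neg(b_2 S b_1) \Rightarrow b_1 = b_2 \Rightarrow (a_1, b_1) = (a_2, b_2),
\end{aligned}
\]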
Example 2.45. To see that the lexicographic product of transitive relations need not
be transitive and that the lexicographic product of equivalence relations need not be
an equivalence relation, consider A := {1, 2} with R := {(1, 1), (2, 2), (1, 2), (2, 1)}, and
S := {(1, 1), (2, 2)}. It is an exercise to show R and S are both equivalence relations,
but R ⊙ S is not transitive (in particular, not an equivalence relation).
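The claim of this example can be verified by computing the lexicographic product explicitly. The following is a minimal sketch, assuming that relations are modeled as Python sets of 2-tuples; the function names strict_part, lex_product, and is_transitive are hypothetical.

```python
def strict_part(R):
    """R with the diagonal removed: all (x, y) in R with x != y."""
    return {(x, y) for (x, y) in R if x != y}


def lex_product(R, S, A, B):
    """Lexicographic product of R (on A) and S (on B), a relation on A x B."""
    R_ne = strict_part(R)
    return {
        ((a1, b1), (a2, b2))
        for a1 in A for b1 in B for a2 in A for b2 in B
        if (a1, a2) in R_ne or (a1 == a2 and (b1, b2) in S)
    }


def is_transitive(R):
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)


A = {1, 2}
R = {(1, 1), (2, 2), (1, 2), (2, 1)}
S = {(1, 1), (2, 2)}
P = lex_product(R, S, A, A)

assert is_transitive(R) and is_transitive(S)
assert not is_transitive(P)
# A concrete witness: (1,1) P (2,1) and (2,1) P (1,2), but not (1,1) P (1,2).
assert ((1, 1), (2, 1)) in P and ((2, 1), (1, 2)) in P
assert ((1, 1), (1, 2)) not in P
```

The witness shows where transitivity breaks: the strict part of R relates 1 to 2 and 2 to 1, so one can leave the first component and return to it, while S does not relate 1 to 2.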
Lemma 2.46. Let R be a relation on a set A, U ⊆ A, and let R↾↾U denote its strong
restriction to U as defined in Def. 2.30(b).
xRy ∨ yRx ∨ x = y
2.7.3 Ordinals
In preparation for our official definition of N in Def. 2.72 below, we will study so-called
ordinals, which are special sets also of further interest to the field of set theory (the
natural numbers will turn out to be precisely the finite ordinals).
Definition 2.47. A set X is called transitive if, and only if, every element of X is also
a subset of X:
\[ \forall_{x \in X}\, x \subseteq X. \tag{2.27a} \]
Definition 2.49. (a) A set α is called an ordinal number or just an ordinal if, and only
if, α is transitive and ∈ constitutes a strict well-order on α. An ordinal α is called a
successor ordinal if, and only if, there exists an ordinal β such that α = S(β), where
S is the successor function of Def. 2.22. An ordinal α ≠ 0 is called a limit ordinal
if, and only if, it is not a successor ordinal. We denote the class of all ordinals by
ON (it is a proper class by Cor. 2.57 below).
(b) We define
i.e.
\[ \forall_{x,y \in A}\, \bigl( (x, y) \in R_{\in} \Leftrightarrow x \in y \bigr). \tag{2.29b} \]
Example 2.51. (a) Using (2.9), 0 = ∅ is an ordinal, and 1 = S(0), 2 = S(1) are both
successor ordinals (in Prop. 2.74, we will identify N0 as the smallest limit ordinal).
Even though X := {1} and Y := {0, 2} are well-ordered by ∈, they are not ordinals,
since they are not transitive sets: 1 ∈ X, but 1 ⊈ X (since 0 ∈ 1, but 0 ∉ X);
similarly, 1 ∈ 2 ∈ Y , but 1 ∉ Y .
(b) As a caveat, we point out that, in general, saying that a set A is transitive is not
equivalent to saying that R∈ is transitive on A. Actually, in general, neither
implication is true: In (a) we saw that R∈ was a transitive relation on the nontransitive
sets X and Y . To see that the converse implication can fail, consider
\[ A := \bigl\{ 0, 1, 2, \{1\} \bigr\}. \]
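Both notions of transitivity can be tested on these examples. The following is a minimal sketch, assuming that sets are modeled by Python frozensets; the function names is_transitive_set and membership_is_transitive are hypothetical.

```python
def successor(x):
    """S(x) := x ∪ {x}."""
    return x | frozenset({x})


def is_transitive_set(X):
    """Every element of X is also a subset of X, cf. (2.27a)."""
    return all(x <= X for x in X)


def membership_is_transitive(A):
    """The membership relation on A is transitive: x ∈ y ∈ z implies x ∈ z."""
    return all(
        x in z for x in A for y in A for z in A if x in y and y in z
    )


zero = frozenset()
one = successor(zero)
two = successor(one)

X = frozenset({one})                              # X = {1}
Y = frozenset({zero, two})                        # Y = {0, 2}
A = frozenset({zero, one, two, frozenset({one})})  # A = {0, 1, 2, {1}}

# X and Y: membership is transitive, but the sets are not transitive.
assert membership_is_transitive(X) and not is_transitive_set(X)
assert membership_is_transitive(Y) and not is_transitive_set(Y)
# A: the set is transitive, but membership is not (0 ∈ 1 ∈ {1}, yet 0 ∉ {1}).
assert is_transitive_set(A) and not membership_is_transitive(A)
```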
\[ \forall_{\alpha \in \mathrm{ON}}\, \alpha \notin \alpha. \]
Proof. The set X is transitive by Lem. 2.48(a), and, since X ⊆ α, ∈ is a strict well-order
on X by Lem. 2.46(j).
Proposition 2.55. On the class ON, the relation ≤ (as defined in (2.28)) is the same
as the relation ⊆, i.e.
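\[ \forall_{\alpha,\beta \in \mathrm{ON}}\, \bigl( \alpha \leq \beta \Leftrightarrow \alpha \subseteq \beta \Leftrightarrow (\alpha \in \beta \vee \alpha = \beta) \bigr). \tag{2.30} \]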
Proof. Exercise.
Theorem 2.56. The class ON is strictly well-ordered by ∈, i.e.
Corollary 2.57. ON is a proper class (i.e. there is no set containing all the ordinals).
(a) X is well-ordered by ∈.
(b) X is an ordinal if, and only if, X is transitive. Note: A transitive set of ordinals
X is sometimes called an initial segment of ON, since, here, transitivity can be
restated in the form
\[ \forall_{\alpha \in \mathrm{ON}}\, \forall_{\beta \in X}\, \bigl( \alpha < \beta \Rightarrow \alpha \in X \bigr). \tag{2.32} \]
Next, we obtain some results regarding the successor function S of Def. 2.22 in the
context of ordinals.
Proof. (a): Due to Prop. 2.53, S(α) is a set of ordinals. Thus, by Cor. 2.58(b), it merely
remains to prove that S(α) is transitive. Let x ∈ S(α). If x = α, then x = α ⊆ α ∪ {α} =
S(α). If x ≠ α, then x ∈ α and, since α is transitive, this implies x ⊆ α ⊆ S(α), showing
S(α) to be transitive, thereby completing the proof of (a).
(b) holds, as α ∈ S(α) holds by the definition of S(α).
(c) is clear, since, for each ordinal β,
β < S(α) ⇔ β ∈ S(α) ⇔ β ∈ α ∨ β = α ⇔ β ≤ α.
In Th. 2.67 below, we will show that, up to isomorphism, ordinals are the only strictly
well-ordered sets. While we are mostly interested in order isomorphisms, it seems to
make sense to introduce homomorphism for relations in general:
Definition 2.62. Let A, B be sets, let R be a relation on A, and let S be a relation
on B. A function f : A −→ B is called a homomorphism between (A, R) and (B, S) if,
and only if,
\[ \forall_{x,y \in A}\, \bigl( x R y \Rightarrow f(x)\, S\, f(y) \bigr). \tag{2.33} \]
Proof. Exercise.
Definition 2.64. Let R be a relation on a set A. Using the notation of Def. and Rem.
2.28, we define
\[ \forall_{a \in A}\quad a{\downarrow} := \operatorname{pred}(A, a) := \operatorname{pred}(A, a, R) := R^{-1}(\{a\}) = \{ x \in A : x R a \}, \]
where we use the notation pred(A, a) and a↓ if R or both R and A are understood. One
can think of pred(A, a, R) as the set of predecessors of a in A with respect to the relation
R (which is especially useful, if R constitutes an order relation on A). If R well-orders
A, then one can also think of pred(A, a, R) as an initial segment of A with respect to
the well-order.
Lemma 2.65. Isomorphisms between well-ordered sets map initial segments to initial
segments: If A, B are sets with strict well-orders, both denoted by <, and f : (A, <) ≅ (B, <)
is an isomorphism, then
\[ f : (A, <) \cong (B, <) \;\Rightarrow\; \forall_{a \in A}\, f(a{\downarrow}) = (f(a)){\downarrow}. \]
Proof. If y ∈ f (a↓ ), then there exists x ∈ A with x < a such that y = f (x). Then, as f is
strictly isotone by Lem. 2.63, y = f (x) < f (a), i.e. y ∈ (f (a))↓ , proving f (a↓ ) ⊆ (f (a))↓ .
We can now apply what we just proved with a replaced by f (a) and f replaced by f −1
to obtain f⁻¹((f(a))↓) ⊆ (f⁻¹(f(a)))↓ = a↓. Applying f to both sides of this inclusion
In Th. 2.71 and Th. 2.77 below, we will justify the proof method of induction on the
set of natural numbers and, subsequently, we will generalize induction proofs such that
they can be applied on general well-ordered sets and even on well-ordered classes (like
ON) and still more general objects. The basic idea of induction proofs is as follows: To
prove that an assertion P (x) holds for all x ∈ C, C being a suitable class, one first establishes
that P (x) holds for all “small” x ∈ C, then assumes the existence of a smallest x ∈ C
with ¬P (x), showing this to provide a contradiction. We will see first examples of this
strategy in the proofs of Prop. 2.66 and Th. 2.67 below.
Proposition 2.66. If α, β ∈ ON and f : (α, <) ≅ (β, <), then α = β and f = Id_α (in
particular, the identity is the unique automorphism on an ordinal).
Theorem 2.67. If A is a set and < is a strict well-order on A, then there exists a
unique α ∈ ON such that (A, <) ≅ (α, ∈) (we then define type(A) := type(A, <) := α
and call α the order type of the strict well-order (A, <); we write type(A), if the strict
well-order < on A is understood). Moreover, the isomorphism f : (A, <) ≅ (α, ∈) is
unique.
Idα = f ◦ g −1 ⇒ g = f,
\[ \exists_{f(a) := \xi \in \mathrm{ON}}\quad (a{\downarrow}, <) \cong (\xi, \in). \]
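For a finite strictly well-ordered set, the isomorphism onto an ordinal can be computed explicitly. The following is a minimal sketch, assuming the recursion f(a) = {f(x) : x < a} (a standard way to obtain the isomorphism in the finite case, used here as an assumption and not taken from the proof above); the function name order_type_map and the example set are hypothetical.

```python
def order_type_map(A, lt):
    """Map each a in A to the ordinal f(a) = {f(x) : x < a}.

    A is a finite set and lt(x, y) decides x < y for a strict well-order on A.
    """
    f = {}
    # In a finite strict total order, the number of predecessors determines
    # the position, so sorting by it processes the elements in increasing order.
    for a in sorted(A, key=lambda a: sum(1 for x in A if lt(x, a))):
        f[a] = frozenset(f[x] for x in A if lt(x, a))
    return f


A = {"c", "a", "b"}


def lt(x, y):
    """Alphabetical order, a strict well-order on A."""
    return x < y


f = order_type_map(A, lt)

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})

assert f["a"] == zero and f["b"] == one and f["c"] == two
# The set of all values is the ordinal 3, the order type of (A, <).
assert frozenset(f.values()) == three
# f is an isomorphism between (A, <) and (3, ∈).
assert all(lt(x, y) == (f[x] in f[y]) for x in A for y in A)
```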
2.8 Infinity
2.8.1 Natural Numbers
The following axiom of infinity guarantees the existence of infinite sets (e.g., it will allow
us to define the set of natural numbers N, which is infinite by Th. 2.80 below).
Axiom 6 Infinity:
\[ \exists_{X}\, \Bigl( 0 \in X \wedge \forall_{x \in X}\, ( x \cup \{x\} \in X ) \Bigr). \tag{2.36} \]
Thus, the infinity axiom states the existence of a set X containing ∅ (iden-
tified with the number 0), and, for each of its elements x, its successor
S(x) = x ∪ {x}.
Example 2.68. We would like to check which of our toy models M1 , . . . , M10 of Def. 2.2
satisfy Axiom 6 (one might expect that the answer is “none”, since the models all are
finite, however M10 will show that Axiom 6 does not guarantee the existence of infinite
sets in the absence of comprehension). There is a slight complication arising from the
fact that the formulation of (2.36) already makes use of Axiom 1 (extensionality) and
Axiom 4 (union). Therefore, for the purpose of this example only, we replace (2.36) by
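\[ \exists_{X}\, \Bigl( \exists_{Y \in X}\, \forall_{y}\; y \notin Y \;\wedge\; \forall_{x \in X}\, \exists_{Z \in X}\, \bigl( x \in Z \wedge (u \in x \Rightarrow u \in Z) \bigr) \Bigr). \tag{2.37} \]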
Definition 2.69. An ordinal n is called a natural number if, and only if,
\[ n \neq 0 \;\wedge\; \forall_{m \in \mathrm{ON}}\, \bigl( m \leq n \Rightarrow ( m = 0 \vee m \text{ is successor ordinal} ) \bigr). \]
References
[Bla84] A. Blass. Existence of Bases Implies the Axiom of Choice. Contemporary Mathematics 31 (1984), 31–33.
[Hal17] Lorenz J. Halbeisen. Combinatorial Set Theory, 2nd ed. Springer Monographs in Mathematics, Springer, Cham, Switzerland, 2017.
[Kun80] Kenneth Kunen. Set Theory. Studies in Logic and the Foundations of Mathematics, Vol. 102, North-Holland, Amsterdam, 1980.
[Kun13] Kenneth Kunen. Set Theory. Studies in Logic, Vol. 34, College Publications, London, 2013, revised edition.