320 Lecture Notes
BURAK KAYA
Abstract. These are the lecture notes I used for a 14-week introductory set
theory class I taught at the Department of Mathematics of Middle East Tech-
nical University during Spring 2018. In order to determine the course content
and prepare the lecture notes, I mainly used the textbook by Hrbacek and
Jech [1], which I also listed as a supplementary resource for the course.
Contents
0. Prelude
0.1. Some historical remarks.
0.2. The language of set theory and well-formed formulas
0.3. What are sets anyway?
0.4. Classes vs. Sets
0.5. Notational remarks
1. Some axioms of ZFC and their elementary consequences
1.1. And G said, “Let there be sets”; and there were sets.
1.2. Constructing more sets
2. From Pairs to Products
2.1. Relations
2.2. Functions
2.3. Products and sequences
2.4. To choose or not to choose
3. Equivalence Relations and Order Relations
3.1. Equivalence relations, partitions and transversals.
3.2. A Game of Thrones, Prisoners and Hats.
3.3. Order relations
3.4. Well-orders.
3.5. Well-founded relations and the Axiom of Foundation
4. Natural Numbers
4.1. The construction of the set of natural numbers
4.2. Arithmetic on the set of natural numbers
5. Equinumerosity
5.1. Finite sets
5.2. To infinity and beyond
6. Construction of various number systems
6.1. Integers
6.2. Rational numbers
6.3. Real numbers
7. Ordinal numbers
7.1. How do the ordinals look like?
Week 1
0. Prelude
0.1. Some historical remarks. If one examines the history of mathematics, one
sees that towards the end of the 19th century, some mathematicians started to investigate
the “nature” of mathematical objects. For example, Dedekind gave a construction
of the real numbers, Peano axiomatized the natural numbers, and Cantor
established a rigorous way to deal with the notion of infinity. These works may be
considered the first steps towards understanding what mathematical objects are.
In the early 20th century, what is known as the foundational crisis of mathematics
arose. Mathematicians searched for a proper foundation of mathematics which
is free of contradictions and is sufficient to carry out all traditional mathematical
reasoning. There were several philosophical schools having different views on how
mathematics should be done and what mathematical objects are. Among these
philosophical schools, the leading one was Hilbert’s formalist approach, according
to which mathematics is simply an activity carried out in some formal system¹. On
the one hand, mathematics had already been done “axiomatically” since Euclid.
On the other hand, Hilbert wanted to provide a rigorous axiomatic foundation for
mathematics². With the work of Dedekind and Cantor, the idea that mathematics
can be founded on set theory became more common. This eventually led³ to
the development of Zermelo-Fraenkel set theory with the axiom of choice, by
Ernst Zermelo, with the later contributions of Abraham Fraenkel, Thoralf Skolem
and John von Neumann.
Today, some mathematicians consider ZFC as the foundation of mathematics, in
which one can formalize virtually all known mathematical reasoning. In this course,
we aim to study the axioms of ZFC and investigate their consequences. That said,
we should note there are many other set theories with different strengths introduced
for various purposes, such as von Neumann-Gödel-Bernays set theory, Morse-Kelley
set theory, New Foundations, Kripke-Platek set theory and the Elementary Theory
of the Category of Sets.
0.2. The language of set theory and well-formed formulas. We shall work
in first-order logic with equality, whose language consists of a single binary
relation symbol ∈. For those who are not familiar with first-order logic, we first
review how the well-formed formulas in the language of set theory are constructed.
Our basic symbols consist of the symbols
∈ = ∀ ∃ ¬ ∧ ∨ → ↔ ( )
together with an infinite supply of variable symbols
a b c d e ...
The (well-formed) formulas in the language of set theory are those strings that can
be obtained in finitely many steps by applying the following rules.
• Strings of the form x ∈ y and x = y, where x and y are variable symbols,
are formulas.
¹ To illustrate this point, we should perhaps recall Hilbert’s famous saying: “Mathematics is
a game played according to certain simple rules with meaningless marks on paper.”
² In fact, Hilbert wanted more than this. Those who wish to learn more should google the term
“Hilbert’s program”.
³ We refer the reader to the web page https://plato.stanford.edu/entries/settheory-early/ for a
detailed and more accurate historical description.
MATH 320 SET THEORY
• If ϕ and ψ are formulas and x is any variable symbol, then the following
strings are formulas:
¬ϕ ∃xϕ ∀xϕ (ϕ ∧ ψ) (ϕ ∨ ψ) (ϕ → ψ) (ϕ ↔ ψ)
For example, the string ∃x∀y¬y ∈ x is a well-formed formula in the language of set
theory, whereas, the string ∃x∀¬x → ∃∨ is not. A variable in a formula is said to be
bound if it is in the scope of a quantifier; otherwise, it is said to be free. A formula
with no free variables is called a sentence. For example, the string ∃x∀y¬y ∈ x is
a sentence, and the string ∃z∃t((¬z = t ∧ z ∈ x) ∧ t ∈ y) is a formula with two free
variables x and y and hence not a sentence.
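The formation rules and the notion of a free variable can be made concrete with a small sketch. The class names `Atom`, `Not`, `Bin` and `Quant` below are hypothetical and are not part of the course material; they merely mirror the two formation rules. The function `free_vars` computes the free variables by recursion on the way a formula is built.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """An atomic formula: x ∈ y (rel == "in") or x = y (rel == "eq")."""
    rel: str
    left: str
    right: str


@dataclass(frozen=True)
class Not:
    """The negation ¬ϕ."""
    body: "Formula"


@dataclass(frozen=True)
class Bin:
    """A binary connective; op is one of "and", "or", "imp", "iff"."""
    op: str
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Quant:
    """A quantified formula ∃xϕ (q == "exists") or ∀xϕ (q == "forall")."""
    q: str
    var: str
    body: "Formula"


Formula = Union[Atom, Not, Bin, Quant]


def free_vars(phi: Formula) -> set:
    """Return the set of variables that occur free in phi."""
    if isinstance(phi, Atom):
        return {phi.left, phi.right}
    if isinstance(phi, Not):
        return free_vars(phi.body)
    if isinstance(phi, Bin):
        return free_vars(phi.left) | free_vars(phi.right)
    if isinstance(phi, Quant):
        # The quantifier binds its own variable inside its scope.
        return free_vars(phi.body) - {phi.var}
    raise TypeError("not a formula")


def is_sentence(phi: Formula) -> bool:
    """A sentence is a formula with no free variables."""
    return not free_vars(phi)


# ∃x∀y¬y ∈ x
empty_set_exists = Quant("exists", "x", Quant("forall", "y", Not(Atom("in", "y", "x"))))

# ∃z∃t((¬z = t ∧ z ∈ x) ∧ t ∈ y)
two_free = Quant("exists", "z", Quant("exists", "t", Bin(
    "and",
    Bin("and", Not(Atom("eq", "z", "t")), Atom("in", "z", "x")),
    Atom("in", "t", "y"),
)))
```

On the two examples from the text, `is_sentence(empty_set_exists)` holds, while `free_vars(two_free)` consists of exactly `x` and `y`.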
“Officially”, we work in an axiomatic system that consists of the axioms of ZFC
and the standard logical axioms (in the language of set theory) together with a
sound and complete proof system⁴. “Unofficially”, we are going to work in natural
language and carry out our mathematical arguments informally, as is the case in
any other branch of mathematics. Nevertheless, if necessary, the reader should be
able to convert arguments in natural language to formal proofs in first-order logic
and vice versa.
0.3. What are sets anyway? Up to this point, we have not mentioned anything
related to the meaning of the formulas in the language of set theory. For example,
what does x ∈ y really mean?
On the one hand, we note that it is perfectly possible to take a purely formalist
approach and simply derive theorems in the aforementioned axiomatic system without
attaching any meaning to the symbols. On the other hand, we believe that this approach
is pedagogically inappropriate for students who are exposed to set theory for the first
time; and that it fails to acknowledge the role of mathematical intuition, which not
only manipulates symbols but also understands what they refer to. Consequently,
we shall adopt a Platonist point of view that we think is better suited for teaching
purposes⁵. Back to the question... What does x ∈ y really mean?
A long time ago in a galaxy far, far away.... existed the universe of mathematical
objects called sets, which is denoted by V. We shall not try to define what a set
is. You should think of sets as primitive objects, perhaps by comparing them to the points
of Euclid’s Elements. Sets are to us what points are to Euclid. Sets are simply the
objects in the universe of sets.
Between certain sets holds the membership relation which we denote by x ∈ y.
Our intuitive interpretation of the relation ∈ is that x ∈ y holds if the set y contains
the set x as its element. In this sense, sets are objects that contain certain other
sets as their members.
Quantifiers ranging over the universe of sets and logical connectives having their
usual intended meanings, a sentence in the language of set theory is simply an
assertion about the universe of sets that is either true or false, depending on how
the membership relation holds between sets.
⁴ Details of our proof system are not really relevant for this course, since most of our arguments
are going to be done informally. Moreover, there are many (essentially equivalent) proof systems
that are sufficient for our purposes. Those students who wish to learn how a sound and complete
proof system for first-order logic may be set up should google the term Hilbert(-style) proof system.
⁵ However, I personally do not consider myself a follower of mathematical Platonism.
We assume that the axioms of ZFC are true sentences about the universe of sets,
whose truth is self-evident and dictated by our mathematical intuition⁶. In this
course, we shall study the logical consequences of the axioms of ZFC and try to
understand the structure of the universe of sets V.
0.4. Classes vs. Sets. A class is simply a collection of sets and hence is a subcol-
lection of the universe of sets. We remark that classes are not (necessarily) objects
in the universe of sets according to this definition. Consequently, we cannot directly
talk about them in our axiomatic system by referring to them via variable
symbols⁷. However, there is a way to get around this problem and make assertions
about classes in a meaningful manner.
Let ϕ(x) be a property of sets, i.e. a formula in the language of set theory with
one free variable. The collection C of sets satisfying the formula ϕ(x) is a class and
is denoted by
{x : ϕ(x)}
In this case, the class C is said to be defined by the formula ϕ(x). We also allow
multiple free variables to appear in the defining formula, in which case the class
{x : ψ(x, p, q, . . . , t)}
is said to be defined by ψ with parameters p, q, . . . , t, where p, q, . . . , t are fixed sets.
For the rest of this course, we shall restrict our attention to those classes that are
defined by some formula in the language of set theory possibly via some parameters.
As such, we can meaningfully make assertions about classes in our axiomatic system
by identifying formulas with the corresponding classes. For example, if C and D
are classes that are defined by the formulas ϕ(x) and ψ(x) respectively, then the
assertion C = D can be stated by the sentence ∀x(ϕ(x) ↔ ψ(x)). We can also
“quantify” over a class C defined by the formula ϕ(x) using the formulas
∀x(ϕ(x) → ψ) and ∃x(ϕ(x) ∧ ψ)
which would intuitively correspond to ∀x ∈ C ψ and ∃x ∈ C ψ respectively if we
could have quantified over the classes in the first place. One can similarly define
quantification over classes defined via parameters.
It is clear that every set, being a collection of sets, is a class. More precisely,
given a set x, we can simply define it by the formula y ∈ x using the set x itself as
a parameter, i.e. x = {y : y ∈ x}. On the other hand, not every class is a set.
Theorem 1 (Russell’s paradox). The class R = {x : ¬x ∈ x} is not a set. More
precisely,
¬∃x∀y(y ∈ x ↔ ¬y ∈ y)
Proof. Assume to the contrary that there exists x such that ∀y(y ∈ x ↔ ¬y ∈ y).
Then, letting y be the set x, we have ¬x ∈ x ↔ x ∈ x, which is a contradiction.
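The proof uses nothing about sets beyond pure logic: no binary relation whatsoever admits an element x with ∀y(y ∈ x ↔ ¬y ∈ y). As a sanity check, the following sketch enumerates every binary relation on a small finite universe and confirms that none of them contains a "Russell element". The function names are hypothetical, and the finite universe is only an illustration, not a model of set theory.

```python
from itertools import product


def has_russell_element(universe, member):
    """Check whether some x satisfies: for all y, (y member-of x) iff not (y member-of y).

    `member` is a set of pairs (y, x) read as "y ∈ x".
    """
    return any(
        all(((y, x) in member) == ((y, y) not in member) for y in universe)
        for x in universe
    )


def no_relation_has_russell_element(n):
    """Try every binary relation on {0, ..., n-1}; return True if none has a Russell element."""
    universe = list(range(n))
    pairs = [(a, b) for a in universe for b in universe]
    for bits in product([False, True], repeat=len(pairs)):
        member = {p for p, bit in zip(pairs, bits) if bit}
        if has_russell_element(universe, member):
            return False
    return True
```

For n = 3 this checks all 512 relations; the instance y = x always fails, exactly as in the proof above.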
Classes that are not sets are called proper classes. For example, the class R
defined above is a proper class. As we shall see later, another example of a proper
class is the universe of sets V which can be defined by the formula x = x.
⁶ Those students with philosophical tendencies may read Penelope Maddy’s famous articles
Believing the Axioms, I, Believing the Axioms, II and her book Defending the Axioms after
completing this course.
⁷ We note that some of the set theories we mentioned earlier are capable of talking about classes
directly. For example, this can be done in NBG and MK.
0.5. Notational remarks. In what follows, our assertions about sets should ide-
ally be written in the language of set theory, having only ∈ as a non-logical symbol.
However, this approach is cumbersome and for convenience we will often expand
our language by introducing new non-logical symbols that are abbreviations for
certain formulas of set theory. For example, the formula ¬x ∈ y is abbreviated as
x ∉ y. The reader is expected to keep track of introductions of such abbreviations.
Another notational convenience we shall adopt is to write ∀z ∈ x ϕ instead of
∀z(z ∈ x → ϕ) and to write ∃z ∈ x ϕ instead of ∃z(z ∈ x ∧ ϕ) where ϕ is a formula
in the language of set theory. Finally, we note that parentheses are usually omitted
whenever there is no ambiguity.
In other words, for any sets x and y, the collection {x, y} is indeed a set. We
shall call this set the unordered pair of x and y. Here are two applications of the
axiom of pairing.
• By pairing ∅ with itself, we can now prove that the set {∅} exists.
• By pairing the set {∅} with ∅, we can also construct the set {∅, {∅}}.
Next follows an important application of the axiom of pairing. Let x and y be sets.
Then, by the axiom of pairing, the sets {x} and {x, y} both exist. By pairing these
sets, we obtain the set {{x}, {x, y}}.
Definition 1 (Kuratowski). The set {{x}, {x, y}} is called the ordered pair of x
and y and is denoted by (x, y).
The reason (x, y) is called the ordered pair is easily seen from the next lemma.
Lemma 1. Let x, y, x′, y′ be sets. Then (x, y) = (x′, y′) if and only if x = x′ and y = y′.
Proof. Left to the reader as an exercise.
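Hereditarily finite sets can be modelled in Python by nested `frozenset` values, which gives a quick way to experiment with the Kuratowski pair. The sketch below is an illustration under that modelling assumption; the names `pair` and `lemma_holds_on` are hypothetical, and checking a few samples is of course not a proof of Lemma 1.

```python
def pair(x, y):
    """The Kuratowski ordered pair (x, y) = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


EMPTY = frozenset()                # ∅
ONE = frozenset({EMPTY})           # {∅}
TWO = frozenset({EMPTY, ONE})      # {∅, {∅}}
SAMPLES = [EMPTY, ONE, TWO]


def lemma_holds_on(samples):
    """Check (x, y) = (u, v) iff x = u and y = v for all choices from samples."""
    return all(
        (pair(x, y) == pair(u, v)) == (x == u and y == v)
        for x in samples for y in samples
        for u in samples for v in samples
    )
```

Note that when x = y the pair collapses to {{x}}, which is the case that requires some care in the proof.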
We next introduce an axiom that allows us to collect the elements of elements
of a set into a single set.
Axiom 4 (The axiom of union). For any set x, there exists a set y which consists
of exactly the elements of elements of x.
∀x∃y∀z(z ∈ y ↔ ∃s(s ∈ x ∧ z ∈ s))
We are used to thinking of union as an operation applied to a collection of sets
instead of a single set. In the axiom above, you should think of the set x as the
collection of sets whose union is to be taken. In this case, the set y is the union of
the elements of x. We shall call y simply the union of x and denote it by ⋃x. In other
words,
⋃x = {z : ∃s ∈ x z ∈ s}
Next follows the definition of the union of two sets. Let x and y be sets. Then, by
pairing, the set {x, y} exists.
Definition 2. The set ⋃{x, y} is called the union of x and y and is denoted by
x ∪ y.
Exercise 1. Let x and y be sets. Prove that for all z, we have that z ∈ x ∪ y if
and only if z ∈ x or z ∈ y.
The dual notion of the union of a set is the intersection of a set x, which can be
defined as follows.
⋂x = {z : ∀s ∈ x z ∈ s}
Exercise 2. Show that every set belongs to the class ⋂∅. In other words, ⋂∅ = V.
Note that we do not know yet whether or not the class ⋂x is indeed a set for
every non-empty set x. In order to show this, we shall need the following axiom.
Axiom 5 (The axiom of separation). Let ϕ(z, p) be a formula in the language of
set theory with two variables z and p. For any p and for any x, there exists a set
y that consists of elements of x satisfying the property ϕ(·, p).
∀p∀x∃y∀z(z ∈ y ↔ (z ∈ x ∧ ϕ(z, p)))
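To see how separation settles the question for ⋂x, here is a minimal sketch on hereditarily finite sets modelled by `frozenset`. It assumes the standard observation that, for non-empty x, the class ⋂x is contained in any fixed element s0 of x, so it can be carved out of s0 by separation. The function names are hypothetical.

```python
def big_union(x):
    """⋃x: the set of all elements of elements of x."""
    return frozenset(z for s in x for z in s)


def separation(x, phi):
    """{z ∈ x : phi(z)}, the subset of x carved out by the property phi."""
    return frozenset(z for z in x if phi(z))


def big_intersection(x):
    """⋂x for a non-empty x, obtained by separation from one element of x."""
    if not x:
        # ⋂∅ is the whole universe V, which is not a set.
        raise ValueError("the intersection of the empty set is a proper class")
    s0 = next(iter(x))
    return separation(s0, lambda z: all(z in s for s in x))


EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})

a = frozenset({EMPTY, ONE})   # {∅, 1}
b = frozenset({ONE, TWO})     # {1, 2}
family = frozenset({a, b})
```

Here `big_union(family)` is {∅, 1, 2}, which is also a ∪ b by Definition 2, and `big_intersection(family)` is {1}.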
Before introducing the next axiom, we will need the notion of a subset of a set.
Let x and y be sets. The set x is said to be a subset of y if every element of x
belongs to y. More precisely, x is a subset of y if we have ∀z(z ∈ x → z ∈ y). We
shall write x ⊆ y if x is a subset of y; and write x ⊊ y if x ⊆ y and x ≠ y. In the
latter case, x is said to be a proper subset of y. The reader can easily verify that
for all x, y, z and non-empty w, we have that
• ∅ ⊆ x and x ⊆ x,
• {t : t ∈ x ∧ ϕ(t)} ⊆ x for any property ϕ,
• (x ⊆ y ∧ y ⊆ x) ↔ x = y,
• (x ⊆ y ∧ y ⊆ z) → x ⊆ z,
• ⋂w ⊆ ⋃w,
• y ∈ x → ⋂x ⊆ y ⊆ ⋃x.
The next axiom guarantees the existence of the set of all subsets of a set.
Axiom 6 (The axiom of power set). For any set x there exists a set y that consists
of all subsets of x.
∀x∃y∀z(z ⊆ x ↔ z ∈ y)
The set {z : z ⊆ x} is called the power set of x and is denoted by P(x). When
we introduce infinite sets, the power set of an infinite set will be a central object
to study, some fundamental properties of which cannot be decided² via the axioms
of ZFC.
Exercise 4. Prove that for any set x, the set P(x), together with the binary operation
△, forms an abelian group in which every non-identity element has order
2.
Exercise 5. Prove that for any non-empty set x, the set P(x) forms a commutative
ring in which every element equals its square, where the binary operations for
addition and multiplication are △ and ∩ respectively.
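Before attempting the two exercises above, it may help to check the claimed identities by brute force on a small example. The sketch below is only such a sanity check, not a proof. It assumes that Python integers stand in for the elements of x, and it uses Python's built-in `^` and `&` on `frozenset` for symmetric difference and intersection. The function names are hypothetical.

```python
from itertools import combinations


def power_set(x):
    """P(x): the set of all subsets of the finite set x."""
    elems = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


def satisfies_ring_identities(P):
    """Check the group and ring identities for (P, △, ∩) by exhaustive search."""
    empty = frozenset()
    for a in P:
        if a ^ a != empty:      # every element is its own inverse
            return False
        if a & a != a:          # every element equals its square
            return False
        for b in P:
            if (a ^ b) not in P or a ^ b != b ^ a:
                return False
            for c in P:
                if (a ^ b) ^ c != a ^ (b ^ c):              # associativity of △
                    return False
                if a & (b ^ c) != (a & b) ^ (a & c):        # distributivity
                    return False
    return True


P3 = power_set(frozenset({0, 1, 2}))
```

For a three-element set, `P3` has 8 elements and all of the identities above hold.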
Axioms 1-6 are far from sufficient to serve as a foundation of mathematics.
For one thing, we cannot prove the existence of an “infinite” set without further axioms.
Before introducing more axioms, in the next section, we are going to study how
various mathematical concepts can be represented by sets.
² The proper term for this phenomenon is independence. A sentence ϕ is said to be independent
of ZFC in the case that neither ϕ nor ¬ϕ can be proven from ZFC.
Week 2
Definition 10. Let A and B be sets. The cartesian product of A and B is the set
{(a, b) ∈ P(P(A ∪ B)) : a ∈ A ∧ b ∈ B}
and is denoted by A × B.
Definition 11. Let R be a relation and A, B be sets. The relation R is said to be
• a relation from A to B if R ⊆ A × B;
• a relation on A if R ⊆ A × A.
In particular, every relation R is a relation from dom(R) to ran(R). However,
notice that a relation R being from the set A to the set B does not necessarily
mean that A = dom(R) and B = ran(R).
Definition 12. Let R and S be relations. Then the composition of S and R is the
relation
{(a, b) : ∃c (a, c) ∈ R ∧ (c, b) ∈ S}
and is denoted by S ◦ R.
The notion of composition of two relations is most frequently used when both
relations are a special type of relations called functions. On the other hand, some
useful properties of the operation ◦ still hold for arbitrary relations.
Exercise 7. Let R and S be relations. Prove that (S ◦ R)⁻¹ = R⁻¹ ◦ S⁻¹.
Exercise 8. Let R, S and T be relations. Prove that T ◦ (S ◦ R) = (T ◦ S) ◦ R.
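The definition of composition translates directly into a set comprehension. In the sketch below, relations are finite Python sets of tuples; this is a modelling assumption, since tuples replace Kuratowski pairs here. The names `compose` and `inverse` are hypothetical. The example relations let one test the identities of the two exercises above on one instance, which is an illustration and not a proof.

```python
def compose(S, R):
    """S ◦ R = {(a, b) : there is c with (a, c) ∈ R and (c, b) ∈ S}."""
    return {(a, b) for (a, c) in R for (c2, b) in S if c == c2}


def inverse(R):
    """The inverse relation: {(b, a) : (a, b) ∈ R}."""
    return {(b, a) for (a, b) in R}


R = {(1, 2), (1, 3), (2, 3)}
S = {(2, 4), (3, 4), (3, 5)}
T = {(4, 1), (5, 5)}
```

Note the order of application: in S ◦ R the relation R is applied first, as with functions.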
Before introducing the notion of a function, we would like to mention two rela-
tions defined on an arbitrary set, which will be useful in later sections.
Definition 13. Let A be a set. The membership relation on A is the relation
{(a, b) ∈ A × A : a ∈ b}
and is denoted by ∈_A.
Definition 14. Let A be a set. The identity relation on A is the relation
{(a, b) ∈ A × A : a = b}
and is denoted by ∆_A.
The notion of a binary relation can be generalized to that of an n-ary relation,
which is a relation that holds or fails to hold among n sets. However, the
most convenient way to define n-ary relations requires the construction of natural
numbers and the n-fold cartesian product of sets. Consequently, we postpone the
definition of an n-ary relation until Section 3.
2.2. Functions. Recall that one can think of a relation R as a “rule” that relates
certain sets in dom(R) to certain sets in ran(R). If this “rule” happens to assign
to each set in dom(R) a unique set in ran(R), then the corresponding relation
is said to be a function. More precisely,
Definition 15. Let R be a relation. The relation R is said to be a function if
∀a∀b∀c (aRb ∧ aRc → b = c)¹
¹ Some authors call a relation satisfying this property well-defined. In this terminology, functions
are simply relations that are well-defined.
The simplest example of a function is the empty set ∅. Notice that the definition
of a function vacuously holds for the empty set, since it has no elements.
Definition 16. Let R be a relation and A, B be sets. The relation R is said to be
a function from A to B if R is a function, dom(R) = A and ran(R) ⊆ B. In this
case, R is said to have domain A and codomain B.
An important point to realize is that, according to this definition, the very same
set can be considered as a function from the same domain to different codomains.
For this reason, whenever it is necessary, we shall always specify the codomain of a
function.
Definition 17. Let R be a function and x ∈ dom(R). The (necessarily) unique
element y ∈ ran(R) for which (x, y) ∈ R is called the value of R at x.
Before we proceed, we introduce some notation regarding functions. From now
on, we shall write R : A → B whenever we need to denote a set R which is a
function from the set A to the set B. The value of R at a will be denoted by R(a).
We would also like to emphasize that functions are relations and hence all notions
introduced for relations so far are applicable to functions as well. We next introduce
the notion of a bijective function, which will be central to our study of infinite sets.
Definition 18. Let f : A → B be a function with domain A and codomain B.
Then f is said to be
• one-to-one (or injective) if for all x, y ∈ A we have f (x) = f (y) → x = y.
• onto (or surjective) if ran(f ) = f [A] = B.
• one-to-one correspondence (or bijection) if it is both one-to-one and onto.
Observe that surjectivity and bijectivity of a function both depend on the speci-
fied codomain, unlike injectivity. Consequently, the very same set can be surjective
for some codomain and not surjective for some other codomain. The following
exercise illustrates this fact.
Exercise 9. Prove that the empty set ∅ is a bijection as a function from ∅ to ∅ and
not a surjection as a function from ∅ to {∅}.
The notion of injectivity can be generalized to arbitrary relations. More specifi-
cally, a relation R is said to be injective if and only if ∀x∀y∀z (xRz ∧yRz → x = y).
It is easily seen that a relation being injective is equivalent to its inverse relation
being a function and vice versa. Consequently, we have the following fact.
Lemma 3. Let R be a relation. Then the relation R is an injective function if and
only if the inverse relation R⁻¹ is an injective function.
Proof. Let R be a relation that is an injective function. Since R is injective,
∀x∀y∀z (xRz ∧ yRz → x = y) and hence ∀x∀y∀z (zR⁻¹x ∧ zR⁻¹y → x = y),
which is exactly what it means for R⁻¹ to be a function. Since R is a function,
∀x∀y∀z (xRy ∧ xRz → y = z) and hence ∀x∀y∀z (yR⁻¹x ∧ zR⁻¹x → y = z),
which is exactly what it means for R⁻¹ to be injective. By exchanging the roles of R
and R⁻¹ and using (R⁻¹)⁻¹ = R, the right-to-left direction can be proven similarly.
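The definitions of "function" and "injective" for relations, and the statement of Lemma 3, can be tested exhaustively on a small universe. In the sketch below, relations are finite Python sets of tuples, which is a modelling assumption, and the function names are hypothetical. The check ranges over all 512 relations on a three-element set.

```python
from itertools import product


def inverse(R):
    """The inverse relation: {(b, a) : (a, b) ∈ R}."""
    return {(b, a) for (a, b) in R}


def is_function(R):
    """R is a function if aRb and aRc imply b = c."""
    values = {}
    for a, b in R:
        if a in values and values[a] != b:
            return False
        values[a] = b
    return True


def is_injective(R):
    """R is injective if xRz and yRz imply x = y, i.e. the inverse of R is a function."""
    return is_function(inverse(R))


def lemma_3_holds_on(universe):
    """Check Lemma 3 for every binary relation on the given finite universe."""
    pairs = [(a, b) for a in universe for b in universe]
    for bits in product([False, True], repeat=len(pairs)):
        R = {p for p, bit in zip(pairs, bits) if bit}
        lhs = is_function(R) and is_injective(R)
        rhs = is_function(inverse(R)) and is_injective(inverse(R))
        if lhs != rhs:
            return False
    return True
```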
One can easily verify that any subset of a function is itself a function. This
observation suggests the following definition.
The following exercise shows that the lemma above can be generalized to arbi-
trary collections of compatible functions.
Exercise 10. Let S be a set such that the elements of S are functions which are
pairwise compatible. Show that ⋃S is a function with domain ⋃{dom(f) : f ∈ S}.
The next lemma shows that the class of functions is closed under the operation
of composition.
Lemma 5. Let f and g be functions. Then the composition g ◦ f is a function.
Proof. Let x, y, z be sets such that (x, y) ∈ g ◦ f and (x, z) ∈ g ◦ f. We want to show
that y = z. By definition of composition, there exist y′ and z′ such that (x, y′) ∈ f
and (y′, y) ∈ g; and (x, z′) ∈ f and (z′, z) ∈ g. Since f is a function, (x, y′) ∈ f and
(x, z′) ∈ f imply that y′ = z′. Since g is a function and y′ = z′, (y′, y) ∈ g and
(z′, z) ∈ g imply that y = z.
Exercise 11. Let f and g be functions. Show that the domain of the function g ◦ f
is dom(f) ∩ f⁻¹[dom(g)] and that (g ◦ f)(x) = g(f(x)) for all x in this domain.
Given two sets x and y, a function f from x to y is an element of P(x × y) and
hence we can form the set of all functions from x to y
{f ∈ P(x × y) : ∀a∀b∀c (((a, b) ∈ f ∧ (a, c) ∈ f) → b = c) ∧ dom(f) = x}
using the axioms introduced so far. From now on, the set of all functions from
the set x to the set y will be denoted by ^x y (with x written as a left superscript). Some
authors use the notation y^x to denote this set; however, we reserve this notation for
exponentiation on ordinal and cardinal numbers in order to avoid ambiguities.
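For finite sets, the set of all functions from x to y can be listed explicitly. The following sketch assumes that the elements of x and y are sortable Python values such as integers, and it represents each function as a `frozenset` of (argument, value) tuples. The name `functions` is hypothetical.

```python
from itertools import product


def functions(x, y):
    """List all functions from the finite set x to the finite set y.

    Each function is returned as a frozenset of (argument, value) pairs
    whose domain is exactly x.
    """
    xs = sorted(x)
    return [
        frozenset(zip(xs, values))
        for values in product(sorted(y), repeat=len(xs))
    ]
```

Two boundary cases are worth noting: there is exactly one function from ∅ to any set, namely the empty function, and there is no function from a non-empty set to ∅.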
2.3. Products and sequences. Next, we discuss how to define the product
of an arbitrary collection of sets.
Recall that when we defined the cartesian product A × B of two sets, the order
of the sets A and B mattered. Even though the cartesian product B × A is in
a natural bijection with the cartesian product A × B, these are different objects
in the universe of sets. Therefore, in order to generalize the concept of cartesian
product to arbitrarily many sets, we first need to label the sets whose product is to
be taken. This labeling can be done through some function.
Let J be a set which contains the sets whose product is to be taken and possibly
other sets. Let F : I → J be an arbitrary function. We will refer to the function
F as an indexed system of sets with the index set I. Here we think of the set i ∈ I as
the label of the set F(i) for all i ∈ I. While talking about indexed systems of sets,
it is customary to write F_i instead of F(i) and write {F_i}_{i∈I} instead of F[I], which
we will also refer to as an indexed system of sets.
Definition 21. Let {F_i}_{i∈I} be an indexed system of sets with the index set I. The
product of the indexed system {F_i}_{i∈I} is the set
{f : I → ⋃{F_i}_{i∈I} | ∀i ∈ I f(i) ∈ F_i}
and is denoted by ∏_{i∈I} F_i.
In other words, the product ∏_{i∈I} F_i is the set of all functions f with domain I
such that f(i) ∈ F_i for all i ∈ I. One usually denotes a set f ∈ ∏_{i∈I} F_i using the
sequence notation (f(i))_{i∈I} since f can be considered as a sequence which takes
values in F_i at each component i.
Indeed, this is exactly how we define sequences over arbitrary sets. Let {S_i}_{i∈I}
be an indexed family of sets for some index set I such that S_i = S for all i ∈ I. An
element f of the product ∏_{i∈I} S is called a sequence over S with the index set I
and is denoted by (f(i))_{i∈I}.
We have not constructed the natural numbers yet. For the following exercises,
the reader should assume² that 0 = ∅, 1 = {0} and 2 = {0, 1}.
Exercise 12. Let {A_i}_{i∈2} be an indexed system of sets with the index set 2. Show
that the map f : ∏_{i∈2} A_i → A_0 × A_1 given by f(g) = (g(0), g(1)) is a bijection.
Consequently, the notion of cartesian product can be considered as a special
case of the product of an indexed system of sets. As the reader may guess, once
we define natural numbers, the cartesian product of sets A_1, . . . , A_n will simply be
defined as the product of the indexed family {A_i}_{i∈n}.
The next exercise shows that there is a natural bijection between the power set
of any set X and the product of an appropriately chosen system with index set X.
Exercise 13. Let X be any set. Show that ^X 2 = ∏_{i∈X} 2 and that the map f from
∏_{i∈X} 2 to P(X) given by f(g) = {x ∈ X : g(x) = 1} is a bijection.
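For a finite index set, the product of an indexed system can be enumerated directly, which makes the correspondence in the last exercise easy to observe. In the sketch below, an indexed system is modelled by a Python `dict` from labels to sets, and each element of the product by a `dict` from labels to chosen values. These modelling choices and the names `indexed_product` and `to_subset` are assumptions made for illustration.

```python
from itertools import product


def indexed_product(F):
    """Enumerate the product of the indexed system F (a dict: label -> finite set).

    Each element of the product is a dict g with g[i] in F[i] for every label i.
    """
    labels = sorted(F)
    return [
        dict(zip(labels, choice))
        for choice in product(*(sorted(F[i]) for i in labels))
    ]


def to_subset(g):
    """The map g -> {x : g(x) = 1} from the exercise above."""
    return frozenset(x for x, value in g.items() if value == 1)


X = {"a", "b", "c"}
system = {x: {0, 1} for x in X}        # the constant system with F_x = 2 = {0, 1}
elements = indexed_product(system)
subsets = {to_subset(g) for g in elements}
```

Here the product has 8 elements and they are sent to 8 distinct subsets of X, so the map is a bijection onto P(X) in this instance. Observe also that a single empty factor makes the whole product empty.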
² Since the notions introduced so far are enough to carry out the construction of natural numbers,
the curious reader may read the first subsection of Section 4 for a precise construction at
this point.
Week 3
2.4. To choose or not to choose. In this section, we shall introduce the axiom
of choice, one of the most famous axioms of ZFC. For historical reasons, the axiom
of choice became so famous that the letter C of ZFC stands for this axiom.
There are literally dozens of equivalent formulations of the axiom of choice.
Below, we introduce the formulation which states that the product of an indexed
system of non-empty sets is non-empty. Some equivalent formulations of this axiom
will be mentioned in later sections.
Axiom 7 (The axiom of choice). For all sets I and for all indexed systems of sets
{A_i}_{i∈I} with A_i ≠ ∅ for all i ∈ I, the product ∏_{i∈I} A_i is non-empty³.
Recall that an element f of the product ∏_{i∈I} A_i is a function with f(i) ∈ A_i
for all i ∈ I. Loosely speaking, the function f chooses one element from each A_i.
In this sense, the axiom of choice allows us to “simultaneously choose” an element
from each set in a set of non-empty sets. The reader who does not feel comfortable
with indexed systems may find the following lemma more intuitive.
Lemma 6. Let M be a set whose elements are non-empty sets. Then there exists
a function f : M → ⋃M such that f(x) ∈ x for all x ∈ M.
Proof. Notice that every set can be indexed by itself through the identity function.
More precisely, let I = M and {M_i}_{i∈I} be the indexed system of sets with M_i = i.
Then, since M_i ≠ ∅ for all i ∈ I, by the axiom of choice, there exists f : I → ⋃M
such that f(i) ∈ i for all i ∈ I, which is precisely what we wanted to prove.
It is easily seen that the axiom of choice is implied by the statement of the lemma
above together with Axioms 1-6.
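To illustrate what the function f in Lemma 6 looks like, here is a sketch for a finite family of non-empty finite sets of integers. In this special situation there is an explicit rule for choosing, namely taking the minimum, so the axiom of choice is not actually needed; the example only shows the shape of a choice function. The name `choice_function` is hypothetical.

```python
def choice_function(M):
    """Return a dict f with f[x] in x for every x in M.

    Assumes M is a finite collection of non-empty frozensets of integers,
    so that min(x) provides an explicit rule for choosing.
    """
    return {x: min(x) for x in M}


M = {frozenset({3, 1}), frozenset({2}), frozenset({5, 4})}
f = choice_function(M)
```

The axiom becomes indispensable precisely when no such uniform rule is available for an infinite family.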
3.2. A Game of Thrones, Prisoners and Hats. After the battle of the Blackwa-
ter, King Joffrey of Westeros captured countably infinitely many soldiers of Stannis
Baratheon as his prisoners and put the set of prisoners in a bijection with the set
of natural numbers. In other words, every prisoner is uniquely labeled by some
natural number.
King Joffrey, who has been known for his cruel games, explained to the prisoners
that they would be executed the next morning, unless they succeed in the following
game that will take place before the execution:
The prisoners will be standing in a straight line in such a way that every prisoner
will be able to see the infinitely many prisoners whose labels are greater than his
label, i.e. the prisoners are standing on the number line facing the positive direction.
Then each prisoner will be randomly given a hat that is either red or blue.
The prisoners can see all the hats in front of them but cannot see their own hats.
Moreover, they are not allowed to move or communicate in any way. After all the
hats are distributed, each prisoner will be asked to guess the color of his own hat
and write his guess on a piece of paper.
The rules of the game are as follows: If there are only finitely many prisoners who
guess wrong, then all the prisoners are set free. Otherwise, they all are executed.
Once the rules are explained to the prisoners, they immediately think that it is
impossible to succeed since they are in no position to obtain information about the
colors of their own hats by looking at the colors of other prisoners’ hats.
Tyrion Lannister, who is not fond of King Joffrey and who has studied set theory
in his youth, decides to help the prisoners. Soldiers of Stannis are so smart that
they have been known to memorize an infinite amount of information if necessary.
Knowing this fact, Tyrion realizes that he can set the prisoners free.
f E g ↔ ∃m ∈ N ∀n ∈ N (n ≥ m → f(n) = g(n))
In other words, two functions from N to 2 are E-equivalent if and only if they
take the same values at sufficiently large natural numbers. We skip the details of
checking that E is indeed an equivalence relation and leave this as an exercise to
the reader.
Week 4
Next, we introduce several notions of “bigness” and “smallness” for a partial
order relation.
Definition 31. Let ≤ be a partial order relation on a set X and Y ⊆ X. Then an
element y ∈ Y is said to be
• a least element of Y with respect to ≤ if ∀x ∈ Y y ≤ x.
• a minimal element of Y with respect to ≤ if ∀x ∈ Y (x ≤ y → x = y).
• a greatest element of Y with respect to ≤ if ∀x ∈ Y x ≤ y.
• a maximal element of Y with respect to ≤ if ∀x ∈ Y (y ≤ x → x = y).
It follows from the antisymmetry of ≤ that least and greatest elements, if they
exist, are unique. Hence, we can talk about the least element of Y and the greatest
element of Y with respect to ≤. Being least is clearly stronger than being minimal,
i.e. least elements are also minimal elements. However, the converse is not true, as
shown by the following exercise.
Exercise 19. Let ≼ be the relation on A = N − {0, 1} defined by
x ≼ y ↔ ∃k ∈ N⁺ y = k · x
for all x, y ∈ A. Show that there is no least element of A with respect to ≼ and
that the set of minimal elements of A with respect to ≼ is exactly the set of prime
numbers.
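The difference between least and minimal elements can be observed on a finite truncation of the divisibility order from the exercise above. The sketch below works with {2, ..., 30} instead of all of N − {0, 1}, which is an assumption made so that the search terminates; the function names are hypothetical and the computation is an illustration, not a solution of the exercise.

```python
def divides(x, y):
    """x ≼ y: y = k · x for some positive integer k."""
    return y % x == 0


def least_elements(Y, leq):
    """All y in Y with y ≤ x for every x in Y (there is at most one)."""
    return {y for y in Y if all(leq(y, x) for x in Y)}


def minimal_elements(Y, leq):
    """All y in Y such that x ≤ y implies x = y for every x in Y."""
    return {y for y in Y if all(not leq(x, y) or x == y for x in Y)}


A30 = set(range(2, 31))
```

On this truncation there is no least element, since for instance 2 does not divide 3, while the minimal elements are exactly the primes up to 30.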
On the other hand, being minimal implies being least whenever any two elements
are comparable.
Exercise 20. Let ≤ be a partial order relation on a set X and Y ⊆ X be a subset
such that x and y are comparable with respect to ≤ for all x, y ∈ Y . Show that if
y ∈ Y is a minimal element of Y with respect to ≤, then it is also the least element
of Y with respect to ≤.
As can be seen from the previous exercise, subsets in which any two elements
are comparable with respect to an order relation are of importance and deserve a
special name.
Definition 32. Let ≤ be a partial order relation on a set X. A subset Y ⊆ X is
said to be a chain (with respect to ≤) if x and y are comparable (with respect to ≤)
for all x, y ∈ Y .
From now on, while referring to properties of elements with respect to some
order relation ≤, we may sometimes omit the phrase “with respect to ≤” if the
order relation is understood from the context.
3.4. Well-orders. In this subsection, we shall learn the notion of a well-order
relation. Since the theory of ordinal numbers will be based on well-orders, the
reader is expected to get a solid grasp of this notion.
Definition 33. Let ≤ be a partial order relation on a set X. The relation ≤ is said
to be a well-order relation if ≤ is a linear order relation on X and every non-empty
subset of X has a least element.
Definition 34. Let < be a strict partial order relation on a set X. The relation <
is said to be a strict well-order relation if the induced partial order ≤ is a well-order
relation.
Week 5
Another consequence of the axiom of foundation is that there are no sets that
are members of themselves.
Theorem 11. For any set x, we have that x ∉ x.
Proof. Assume towards a contradiction that there exists a set x such that x ∈ x.
Let y be the set {x}. Since y is non-empty, by foundation, there exists z ∈ y such
that z ∩ y = ∅. However, the only element of y is the set x and y ∩ x = {x}, which
is a contradiction. Thus there cannot exist such a set x.
One can prove with a similar argument that there cannot be sets which
form a “loop” under the membership relation.
Exercise 24. Show that there are no sets x and y such that we have both x ∈ y and y ∈ x.
Next, we show that the successor operation, which will be central to the
following sections, is one-to-one.
Definition 42. Let x be a set. The successor of x is defined to be the set x ∪ {x}
and is denoted by S(x).
Lemma 13. Let x and y be sets. If S(x) = S(y), then x = y.
Proof. Assume that S(x) = S(y). Then, by definition, x ∪ {x} = y ∪ {y}. It follows
that (x ∈ y ∨ x = y) and (y ∈ x ∨ y = x). If x ≠ y, then we have both x ∈ y and
y ∈ x, which contradicts the previous exercise. Thus, x = y.
4. Natural Numbers
As we have discussed before, ZFC is supposed to be a foundation of mathematics.
Consequently, we should be able to “represent” numbers as sets. In this section,
we shall construct the set of natural numbers together with its usual arithmetic
operations.
4.1. The construction of the set of natural numbers. The idea is to define the
natural number n to be a set with n elements. However, since there are infinitely
many such sets, we should choose a “canonical” set with n elements to be the
natural number n.
We define the natural number 0 to be the empty set ∅. The natural number n
is the set obtained by applying the successor operation to the empty set n times.
In other words,
0 = ∅
1 = S(0) = {0}
2 = S(S(0)) = S(1) = {0, 1}
3 = S(S(S(0))) = S(2) = S(S(1)) = {0, 1, 2}
...
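The construction above can be mirrored on a computer for small n. The sketch below is only an illustration under the assumption that hereditarily finite sets are modelled by Python frozensets; the function names successor and natural are hypothetical.

```python
def successor(x):
    """S(x) = x ∪ {x}, with sets modelled as frozensets."""
    return x | frozenset([x])


def natural(n):
    """The natural number n as a set: apply S to the empty set n times."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


zero, one, two, three = (natural(k) for k in range(4))

print(three == frozenset([zero, one, two]))  # checks 3 = {0, 1, 2}
print(len(natural(5)))                        # number of elements of the set 5
```

In this model the set representing n has exactly n elements, which is the "canonical set with n elements" idea described in the text.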
Notice that each specific natural number can be constructed from the axioms in-
troduced so far. However, we cannot prove without additional axioms that there
exists a set which contains 0, S(0), S(S(0)), . . . . Before we introduce an axiom that
asserts the existence of such a set, we shall define the notion of an inductive set.
Definition 43. A set x is said to be inductive if
• ∅ ∈ x and
• ∀y (y ∈ x → S(y) ∈ x).
Observe that an inductive set is “infinite” and necessarily contains 0, S(0), . . . .
However, there may be other elements contained in an inductive set. We would like
the set of natural numbers to be the smallest inductive set. Of course, in order to
construct this set, we first have to assert that an inductive set exists.
Axiom 9 (The Axiom of Infinity). An inductive set exists, i.e.
∃x (∅ ∈ x ∧ (∀y(y ∈ x → S(y) ∈ x)))
By the axiom of infinity, we know that there exists an inductive set I. It follows
from the axiom of separation that the collection
{x ∈ I : ∀J (“J is inductive” → x ∈ J)}
is a set. This set is called the set of natural numbers and is denoted by N. Any
element of N is said to be a natural number.
A trivial but important observation is that N is inductive and that N ⊆ I for every
inductive set I. The following principle immediately follows from this observation.
Theorem 12 (The Principle of Induction). Let ϕ(x) be a property of sets. If
I = {n ∈ N : ϕ(n)} is inductive, then I = N.
Proof. Assume that I = {n ∈ N : ϕ(n)} is inductive. Then N ⊆ I by our previous
observation. On the other hand, by definition, I ⊆ N. Thus, I = N.
The principle of induction is a fundamental tool to prove statements about the
set of natural numbers. It is easily checked that the principle of induction holds
even if we allow parameters ϕ(x, y, z, . . . , w) in the property defining the set I in the
statement. We should perhaps summarize the induction principle as follows: Any
inductive subset of the set of natural numbers equals the set of natural numbers.
We next define the order relation on N. Define the relation < on N as follows
m < n ↔ m ∈ n
for all m, n ∈ N. At this point, we do not know that < is a strict order relation. In
order to prove this, we shall first establish some basic properties of <.
Lemma 14. For all n ∈ N, we have 0 ≤ n, where ≤ is the relation defined by
x ≤ y ↔ x < y ∨ x = y.
Proof. We shall prove this by the principle of induction. Let I be the set of natural
numbers for which the claim holds. We clearly have 0 ∈ I. Now, let n ∈ I. Then,
0 ∈ n or 0 = n. In both cases, 0 ∈ S(n) and hence 0 ≤ S(n). Thus, I is inductive
and hence, by induction I = N.
Exercise 25. Prove that for all k, n ∈ N, we have k < S(n) ↔ k ≤ n.
Exercise 26. Using the principle of induction, prove that for all n ∈ N, we have
that n = 0 or n = S(k) for some k ∈ n.
We shall next establish that the set of natural numbers together with the relation
defined above is a strictly well-ordered set.
Theorem 13. (N, <) is a strictly well-ordered set.
Proof. The proof of this theorem is a long induction argument.
• < is transitive.
Let I = {z ∈ N : ∀x, y ∈ N(x < y ∧ y < z → x < z)}. We shall prove
by induction that I = N. It is easily seen that 0 ∈ I since the property
defining elements of I vacuously holds for 0. To complete the proof that I is
inductive, we need to prove that n ∈ I → S(n) ∈ I. Let n ∈ I and x, y ∈ N
such that x < y and y < S(n). Since y < S(n), by Exercise 25, we
know that y ≤ n and hence y < n or y = n. If y < n, then x < n since
n ∈ I. If y = n, then x < n by assumption. In both cases x ∈ n ⊆ S(n),
so x < S(n) and hence S(n) ∈ I. Since I is inductive, it equals
N and hence < is transitive.
• < is asymmetric.
Let x, y ∈ N such that x < y. Then, by definition, x ∈ y. It follows from
an exercise in the previous section that y ∉ x. Thus ¬(y < x) and hence <
is asymmetric.¹
• Any two natural numbers are comparable with respect to <.
Let I = {n ∈ N : ∀m ∈ N (m < n ∨ m = n ∨ n < m)}. By Lemma 14, we
know that 0 ∈ I. Let n ∈ I. We wish to show that S(n) ∈ I. Let m ∈ N.
We know that m < n ∨ m = n ∨ n < m. If either one of the first two cases
holds, then m ∈ S(n) and so m < S(n). Thus, in order to complete the
proof that S(n) ∈ I, it is sufficient to prove that if n < m, then S(n) ≤ m.
We shall prove this via another induction argument.
Let
J = {j ∈ N : ∀k ∈ N (k < j → S(k) ≤ j)}
Clearly, 0 ∈ J. Let j ∈ J. We wish to prove that S(j) ∈ J. To prove this,
pick k ∈ N such that k < S(j). Then, either k ∈ j or k = j. If k ∈ j, then
we have S(k) ≤ j since j ∈ J. But this means that S(k) < S(j). If k = j,
then we have S(k) = S(j) and hence S(k) ≤ S(j). Thus, S(j) ∈ J and
hence J is inductive. Consequently, J = N.
Going back to the main proof, if n < m, then S(n) ≤ m by the argument
above. Thus, S(n) ∈ I, which completes the proof that I is inductive and hence
equals N.
• Every non-empty subset of N has a least element with respect to <.
This proof is left to the reader as an exercise.
Exercise 27. Using the principle of induction, prove that (n, ∈_n) is a strictly well-ordered
set for all n ∈ N, where ∈_n is the membership relation on n.
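The four properties established in the proof of Theorem 13 can be spot-checked on the first few natural numbers. This is a sketch under the assumption that sets are modelled by Python frozensets; it checks finitely many cases and is no substitute for the induction arguments. The names first_naturals, less, least_elements and NATS are hypothetical.

```python
from itertools import combinations


def successor(x):
    """S(x) = x ∪ {x}, with sets modelled as frozensets."""
    return x | frozenset([x])


def first_naturals(count):
    """Return [0, 1, ..., count - 1] as von Neumann natural numbers."""
    result, x = [], frozenset()
    for _ in range(count):
        result.append(x)
        x = successor(x)
    return result


def less(m, n):
    """m < n is defined as m ∈ n."""
    return m in n


NATS = first_naturals(7)

transitive = all(
    less(x, z)
    for x in NATS for y in NATS for z in NATS
    if less(x, y) and less(y, z)
)
asymmetric = all(not less(y, x) for x in NATS for y in NATS if less(x, y))
comparable = all(less(m, n) or m == n or less(n, m) for m in NATS for n in NATS)


def least_elements(subset):
    """All y in subset with y ≤ x for every x in subset."""
    return [y for y in subset if all(y == x or less(y, x) for x in subset)]


# Every non-empty subset of {0, ..., 6} should have exactly one least element.
well_ordered = all(
    len(least_elements(subset)) == 1
    for size in range(1, len(NATS) + 1)
    for subset in combinations(NATS, size)
)

print(transitive, asymmetric, comparable, well_ordered)
```

The same finite check also illustrates Exercise 27, since the elements of the natural number 7 are exactly the sets in NATS.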
Now that we have defined the set of natural numbers, we provide some definitions
that were promised to the reader earlier.
Definition 44. Let X be a set and n ∈ N. A sequence of length n over the set X
is a function f : n → X. A sequence (indexed by N) over the set X is a function
f : N → X.
¹ Notice that we used the axiom of foundation here. In truth, we do not need the axiom of
foundation to prove this statement. Proving this statement via the principle of induction is left
to the reader as an exercise.
As noted earlier, while working with sequences, it is convenient to use the notation
(f_i)_{i∈n} or (f_0, f_1, . . . , f_{n−1}) to denote the sequence f : n → X, where f_i = f(i).
Similarly, we shall write (f_i)_{i∈N} to denote a sequence f : N → X indexed by N.
Next, we give the definition of the cartesian product of finitely many sets,
which is simply a special case of the product of indexed systems.
Definition 45. Given an indexed system of sets {A_i}_{i∈n} indexed by some natural
number n ∈ N, we define the (n-fold) cartesian product A_0 × A_1 × · · · × A_{n−1} to be
the product ∏{A_i}_{i∈n}.
We note that all properties of products of arbitrary indexed systems hold for
n-fold cartesian products. If {A_i}_{i∈n} is an indexed system of sets such that A_i = X
for all i ∈ n for some fixed set X, then we use the notation X^n to denote the n-fold
cartesian product A_0 × A_1 × · · · × A_{n−1} since this set is identically the set
of functions from n to X, for which we use the notations X^n or ^nX. Before we
conclude this subsection, we finally define n-ary operations on a set.
Definition 46. Let X be a set and n ∈ N. An n-ary operation on the set X is a
function from X^n to the set X.
It is common practice to call 1-ary, 2-ary and 3-ary functions unary, binary and
ternary functions respectively. Notice that there exist natural bijections between
X and X^1, and X × X and X^2, and (X × X) × X and X^3, and so on.² For
this reason, we shall often not make a distinction between these sets and use them
interchangeably.
4.2. Arithmetic on the set of natural numbers. In this subsection, we shall
define the usual arithmetic operations on the set of natural numbers. It is conve-
nient to define these operations recursively. In order to be able to justify the usage
of recursive definitions, we shall next prove the following fundamental theorem.
Theorem 14 (The Recursion Theorem). Let X be a non-empty set, x ∈ X and
f : X → X be a function. Then there exists a unique function g : N → X such that
• g(0) = x and
• g(S(n)) = f (g(n)) for all n ∈ N.
The function g, considered as a sequence over the set X with the index set N,
is simply the sequence (x, f (x), f (f (x)), . . . ). Intuitively speaking, the function f
can be considered as instructions to compute the value of g at the successor of a
natural number using the value of g at that natural number.
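The intuition in the preceding paragraph can be sketched in code. The example below assumes ordinary Python integers as stand-ins for natural numbers; the names n_step_computation, g and double are hypothetical and chosen only for this illustration.

```python
def n_step_computation(x, f, n):
    """Return [t(0), ..., t(n)] with t(0) = x and t(k + 1) = f(t(k)) for k < n."""
    t = [x]
    for _ in range(n):
        t.append(f(t[-1]))
    return t


def g(x, f, n):
    """The value g(n) of the recursively defined function, read off an n-step computation."""
    return n_step_computation(x, f, n)[n]


def double(v):
    return 2 * v


print(n_step_computation(1, double, 5))  # [1, 2, 4, 8, 16, 32]
print(g(1, double, 10))                  # 1024
```

A shorter computation is always an initial segment of a longer one, which is the kind of agreement between computations that makes it possible to glue them into a single function.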
Proof. For a natural number n ∈ N, call a function t : S(n) → X an n-step
computation if t(0) = x and t(S(k)) = f(t(k)) for all k ∈ n. Define
G = {t ⊆ N × X : ∃n ∈ N “t is an n-step computation”}
Let g = ⋃G. We shall prove several claims.
• g is a function.
By Exercise 10, it is sufficient to prove that the elements of G are compatible
functions. Let t, u ∈ G and set n = dom(t) and m = dom(u). Without loss
of generality, assume that n ∈ m. Then it follows from an easy induction
argument that n ⊆ m. If t and u were not compatible, then there would
² For the case n = 2, see Exercise 13. This idea can easily be generalized to other values of n.
Week 6
We shall next define the binary operation of addition + on the set N of natural
numbers by recursion. For any m, n ∈ N, define
+(m, 0) = m
From now on, we shall use the usual notation m + n to denote the value +(m, n).
Next, we prove some basic properties of the addition operation.
Lemma 17. For all m, n ∈ N, m + S(n) = S(m) + n.
Proof. We prove this by induction on n. Let
I = {n ∈ N : ∀m ∈ N m + S(n) = S(m) + n}
We wish to prove that I is inductive.
• m + S(0) = S(m + 0) = S(m) = S(m) + 0 by the previous lemmas and
hence 0 ∈ I.
• Assume that n ∈ I. It follows from the previous lemmas and n ∈ I that
m + S(S(n)) = S(m + S(n)) = S(S(m) + n) = S(m) + S(n) and hence
S(n) ∈ I.
This shows that I is inductive and hence I = N by the principle of induction.
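Lemma 17 can be checked on small values with a direct transcription of the recursive definition. The sketch below assumes the two identities m + 0 = m and m + S(n) = S(m + n), which are the ones used in the proof above, and it uses Python integers with S(n) = n + 1 as a stand-in for the set-theoretic successor; the names S and add are hypothetical.

```python
def S(n):
    """Stand-in for the successor operation on Python integers."""
    return n + 1


def add(m, n):
    """m + n by recursion on n: m + 0 = m and m + S(k) = S(m + k)."""
    if n == 0:
        return m
    return S(add(m, n - 1))  # n - 1 plays the role of the k with n = S(k)


BOUND = 12
lemma_17 = all(add(m, S(n)) == add(S(m), n) for m in range(BOUND) for n in range(BOUND))
exercise_28 = all(add(0, n) == n for n in range(BOUND))
lemma_18 = all(add(m, n) == add(n, m) for m in range(BOUND) for n in range(BOUND))

print(lemma_17, exercise_28, lemma_18)
```

A finite check of this kind only confirms the identities below BOUND; the induction arguments are what establish them for all natural numbers.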
Exercise 28. Using the principle of induction, show that 0 + n = n for all n ∈ N.
Lemma 18. For all m, n ∈ N, m + n = n + m.
Proof. We prove this by induction on n. Let I = {n ∈ N : ∀m ∈ N m + n = n + m}.
We wish to prove that I is inductive.
• By the previous exercise, 0 + m = m = m + 0 for all m ∈ N and hence
0 ∈ I.
³ That this relation is indeed a function follows from the uniqueness of f_m for each m ∈ N.
5. Equinumerosity
In this section, we shall learn the notion of equinumerosity which allows us to
compare the “size” of sets. Historically speaking, this concept is what led to the
development of set theory.
Definition 47. Two sets A and B are said to be equinumerous if there exists a
bijection from A to B, in which case we write |A| = |B|.
⁴ Even though it is a minor point, we note that we are using the axiom of choice in order to
choose an element from S_i for every i ∈ N whenever it is possible.
If two sets are equinumerous, then we can use the elements of one of them
as labels to “count” the elements of the other set. For this reason, we consider
equinumerous sets as sets that have the same “size”.
5.1. Finite sets. Having constructed natural numbers, we can now give a precise
meaning to the notion of “finite”.
Definition 48. A set A is said to be finite if there exists n ∈ N such that A is
equinumerous with the natural number n, in which case we say A has n elements
and write |A| = n.
By definition, each natural number is a finite set since it is equinumerous with
itself via the identity function. The following lemma shows that a natural number
cannot be equinumerous with a proper subset of itself.
Lemma 20. For each n ∈ N, any injective function from n to n is surjective.
Proof. We prove this by induction on n. The claim vacuously holds for n = 0.
Assume that it holds for n ∈ N. We wish to show that it also holds for S(n).
Assume to the contrary that there exists a function f : S(n) → S(n) which is
injective but not surjective.
• If n ∉ ran(f), then f↾n is an injective function from n to n − {f(n)}, which
contradicts the induction assumption.
• If n ∈ ran(f), then n = f(k) for some (unique) k ∈ S(n). If k = n, then the
function f↾n is an injective function from n to n which is not surjective,
contradicting the induction assumption. If k ≠ n, then the relation
g = (f − {(k, n), (n, f(n))}) ∪ {(k, f(n))}
is an injective function from n to n such that ran(g) = ran(f) − {n}. Since
ran(f) ≠ S(n) and n ∈ ran(f), we have that ran(g) ≠ n. Thus, g is an
injective function from n to n which is not surjective, contradicting the
induction assumption.
Therefore, the claim holds for S(n) and hence holds for all n ∈ N by induction.
This lemma has some important corollaries.
Corollary 15. The set N is not finite.
Proof. Since there exists an injective function on N which is not surjective, if N were
equinumerous with some natural number n ∈ N, then there would exist an injective
function on n that is not surjective, which contradicts the previous lemma.
Corollary 16 (The Pigeonhole Principle). Let m, n ∈ N such that m < n. Then
there does not exist an injective function from n to m.
Proof. Assume to the contrary that f : n → m is an injective function. Since m ⊆ n,
the function f↾m is an injective function from m to m − {f(m)}, contradicting the
previous lemma.
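Both Lemma 20 and the Pigeonhole Principle can be confirmed by brute force for small n. The sketch below enumerates all functions between small natural numbers, represented as tuples of values; the names functions, is_injective and is_surjective are hypothetical, and the bound 6 is an arbitrary choice.

```python
from itertools import product


def functions(domain_size, codomain_size):
    """All functions from {0, ..., domain_size - 1} to {0, ..., codomain_size - 1}.

    A function f is represented by the tuple (f(0), f(1), ...).
    """
    return product(range(codomain_size), repeat=domain_size)


def is_injective(f):
    return len(set(f)) == len(f)


def is_surjective(f, codomain_size):
    return set(f) == set(range(codomain_size))


# Lemma 20: every injective function from n to n is surjective.
lemma_20 = all(
    is_surjective(f, n)
    for n in range(6)
    for f in functions(n, n)
    if is_injective(f)
)

# Pigeonhole Principle: for m < n there is no injective function from n to m.
pigeonhole = not any(
    is_injective(f)
    for n in range(6)
    for m in range(n)
    for f in functions(n, m)
)

print(lemma_20, pigeonhole)
```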
Exercise 29. Using the previous lemma, show that if S is finite and equinumerous
with both n ∈ N and m ∈ N, then m = n.
We will next prove some basic properties of finite sets.
Theorem 17. Any subset of a finite set is finite.
Week 7
5.2. To infinity and beyond. In this subsection, we shall investigate some basic
properties of infinite sets. The reader is expected to develop a good understanding
of infinite sets since modern set theory, at its core, is simply the study of the
behaviour of infinite sets.
Definition 49. A set X is said to be infinite if it is not finite, i.e. there does not
exist a natural number n ∈ N such that X and n are equinumerous.
Even though this is the “official” definition of being infinite, we will often use
alternative characterizations of being infinite in order to show that certain sets are
infinite.
Definition 50. A set X is said to be Dedekind-infinite if there exists an injection
from X onto a proper subset of X.
Next, we prove that being infinite is equivalent to being Dedekind-infinite.
Lemma 23. Let X be an infinite set. Then there exists an injection from N to X.
Proof. By Lemma 6, there exists a function g : P(X)−{∅} → X such that g(U ) ∈ U
for all U ∈ P(X) − {∅}. Let x ∈ X be a fixed element. Let G : P(X) → X be the
function given by G(U ) = g(U ) for non-empty U ⊆ X and G(∅) = x. By recursion,
define a function f : N → X as follows.
• f (0) = x
• f (S(i)) = G(X − {f (0), f (1), . . . , f (i)}) for all i ∈ N.
An easy induction argument shows that X − {f(0), . . . , f(i)} ≠ ∅ for all i ∈ N since
X is infinite. It follows that for all i ∈ N we have that f(S(i)) ≠ f(j) for each
0 ≤ j ≤ i. This implies that f is an injective function.
Theorem 18. A set is infinite if and only if it is Dedekind-infinite.
Proof. Let X be a set. Assume that X is infinite. Then, by the previous lemma,
there exists an injection f : N → X. Consider the function g : X → X given by
g(x) = x if x ∉ ran(f),
g(x) = f(k + 1) if x = f(k) for some k ∈ N.
It is easily checked that g is an injection and ran(g) = X − {f (0)} and hence X is
Dedekind-infinite.
For the converse direction, assume that X is Dedekind-infinite, say g : X → X is
an injection which is not a surjection. If X were equinumerous with some natural
number n ∈ N via some bijection h : X → n, then h ◦ g ◦ h⁻¹ : n → n would be an
injection which is not a surjection, contradicting Lemma 20. Thus X is infinite.
Having established another characterization of being infinite, we next define an
important class of infinite sets.
Definition 51. Let A be a set. Then A is said to be countably infinite if A and N
are equinumerous, and said to be at most countable (or simply, countable) if it is
either finite or countably infinite.
We next prove a series of lemmas that will eventually be used to prove that the
set of sequences of finite length over an at most countable set is at most countable.
Lemma 24. If A is at most countable, then there is a surjection from N onto A.
Proof. Trivial.
Lemma 25. If there exists a surjection from A to B, then there exists an injection
from B to A.
Proof. Let f : A → B be a surjection. Let U_b denote the set f⁻¹[{b}] for each
b ∈ B. Since U_b is non-empty for each b ∈ B, Lemma 6 implies that there exists
G : {U_b : b ∈ B} → A such that G(U_b) ∈ U_b. It is easily seen that the function
g : B → A defined by g(b) = G(U_b) is an injective function.
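For finite sets the construction in this proof can be carried out explicitly. In the sketch below the choice function G is replaced by Python's min, which is an assumption that only works because the fibres are finite sets of integers; the general proof needs Lemma 6. The names injection_from_surjection, A, B and f are hypothetical.

```python
def injection_from_surjection(f, A, B):
    """Return g : B -> A (as a dict) with g(b) in the fibre of b for every b in B.

    The choice of an element from each fibre is made with `min`; this is a
    stand-in for the choice function G used in the proof.
    """
    fibres = {b: [a for a in A if f(a) == b] for b in B}
    return {b: min(fibre) for b, fibre in fibres.items()}


A = range(10)
B = [0, 1, 4]


def f(a):
    """A surjection from A onto B: squares modulo 5."""
    return (a * a) % 5


g = injection_from_surjection(f, A, B)
print(g)                                # {0: 0, 1: 1, 4: 2}
print(len(set(g.values())) == len(B))   # g is injective
print(all(f(g[b]) == b for b in B))     # g(b) lies in the fibre of b
```

Injectivity of g comes from the fact that distinct fibres are disjoint, so elements chosen from different fibres are different.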
Lemma 26. The set N × N is countably infinite.
Proof. The function f : N × N → N given by f(m, n) = 2^m (2n + 1) − 1 is a bijection
since every positive natural number can be uniquely expressed as a product of a
power of two and an odd number.
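The bijection of Lemma 26 and its inverse are easy to compute. The following sketch uses Python integers for natural numbers; the names pair and unpair are hypothetical, and the inverse simply strips factors of two from k + 1.

```python
def pair(m, n):
    """f(m, n) = 2^m (2n + 1) - 1."""
    return 2 ** m * (2 * n + 1) - 1


def unpair(k):
    """Inverse of pair: write k + 1 as 2^m times an odd number 2n + 1."""
    k += 1
    m = 0
    while k % 2 == 0:
        k //= 2
        m += 1
    return m, (k - 1) // 2


print([pair(m, n) for m in range(3) for n in range(3)])
print(unpair(11))  # (2, 1), since 12 = 2^2 * 3

# Finite sanity checks that the two maps are mutually inverse.
print(all(pair(*unpair(k)) == k for k in range(1000)))
print(all(unpair(pair(m, n)) == (m, n) for m in range(20) for n in range(20)))
```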
An immediate corollary of this lemma is the following fact.
Corollary 19. If A and B are countably infinite sets, then A × B is countably
infinite.
Proof. Exercise.
Lemma 27. Any subset of a countably infinite set is at most countable.
Proof. Let A be a countably infinite set and B ⊆ A. Let f : N → A be a bijection.
If B is finite, then it is at most countable by definition. Assume that B is infinite.
We need to prove that B and N are equinumerous. Let b = f (k) where k is the least
natural number such that f (k) ∈ B, which exists since B is infinite. By recursion,
define the function g : N → B as follows.
• g(0) = b
• g(S(i)) = f (k) where k is the least natural number such that f (k) ∈ B and
f(k) ∉ {g(0), g(1), . . . , g(i)}. (Notice that such k always exists since B is
infinite.)
In this case, g is a bijection from N to B, which completes the proof that B is
at most countable. We leave the details of checking that g is a bijection to the
reader.
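The recursive definition of g in this proof amounts to scanning f(0), f(1), f(2), . . . and keeping the values that land in B. The sketch below implements this as a generator; the names enumerate_subset, swap_neighbours and is_multiple_of_three are hypothetical examples, and the generator never terminates on its own, so it must be consumed lazily.

```python
from itertools import islice


def enumerate_subset(f, in_B):
    """Yield g(0), g(1), ... where g(i) = f(k) for the least k with f(k) in B
    and f(k) not among the values already listed.

    Scanning k upwards finds these least indices in order. If B is finite,
    the generator loops forever after exhausting B.
    """
    seen = set()
    k = 0
    while True:
        value = f(k)
        if in_B(value) and value not in seen:
            seen.add(value)
            yield value
        k += 1


def swap_neighbours(k):
    """A bijection N -> N that swaps 0 with 1, 2 with 3, and so on."""
    return k + 1 if k % 2 == 0 else k - 1


def is_multiple_of_three(v):
    return v % 3 == 0


print(list(islice(enumerate_subset(swap_neighbours, is_multiple_of_three), 5)))
# [0, 3, 6, 9, 12]
```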
Lemma 28. Let {A_i}_{i∈N} be an indexed system of sets such that A_i is at most
countable for all i ∈ N. Then ⋃_{i∈N} A_i is at most countable.
Proof. Since each A_i is at most countable, the set B_i consisting of surjections from
N onto A_i is non-empty for all i ∈ N. Consequently, we can choose a surjection
g_i : N → A_i from each B_i for each i ∈ N. Let f : N × N → ⋃_{i∈N} A_i be the function
given by f(m, n) = g_m(n). Since each g_m is a surjection, we have that f is a
surjection. Thus, by Lemma 25, there exists an injection from ⋃_{i∈N} A_i to N × N.
However, the latter set is countably infinite and it follows from the previous lemma
that ⋃_{i∈N} A_i is at most countable.
Exercise 30. Using induction, show that if A is an at most countable set, then A^n
is at most countable for any n ∈ N.
Recall that, given a natural number n ∈ N, we defined a sequence of length n
over a set A to be a function from n to A. Thus the set of sequences of finite length
over a set A is simply ⋃_{n∈N} A^n.
Lemma 29. Let A be an at most countable set. Then the set ⋃_{n∈N} A^n of sequences
of finite length over A is at most countable.
Proof. By the previous exercise, A^n is at most countable for all n ∈ N. By Lemma 28, we
know that a countable union of at most countable sets is at most countable and hence
⋃_{n∈N} A^n is at most countable.
Are all sets countable? The answer to this question turns out to be negative,
as was proven by Georg Cantor in 1874 in his groundbreaking paper “Über eine
Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen”. Historically speak-
ing, this was the birth of set theory. Here we shall present not the original proof of
this observation but another proof which is also due to Cantor. Before we present
this proof, we introduce some terminology.
Definition 52. Let A and B be sets. Then we say that the cardinality of A is
• less than or equal to the cardinality of B, denoted by |A| ≤ |B|, if there exists
an injection from A to B.
• strictly less than the cardinality of B, denoted by |A| < |B|, if there exists
an injection from A to B but there is no bijection from A to B.
We shall next prove a remarkable theorem which is usually referred to as Cantor’s
theorem. Cantor’s theorem has an elegant and simple proof that involves the
technique called diagonalization.¹
Theorem 20 (Cantor’s Theorem). For any set X, we have |X| < |P(X)|, i.e. the
cardinality of X is strictly less than the cardinality of its power set P(X).
Proof. It is easily seen that the function g : X → P(X) given by g(x) = {x} for
all x ∈ X is an injection. Thus it is sufficient to prove that there cannot be any
bijection from X to P(X).
Let f : X → P(X) be any function. Consider the set W = {x ∈ X : x ∉ f(x)}.
Clearly we have W ∈ P(X). We claim that W ∉ ran(f). Assume to the contrary
that W ∈ ran(f). Then there exists w ∈ X such that f(w) = W. It follows from
the definition of W that
w ∈ W ↔ w ∉ W
which is a contradiction. Therefore W ∉ ran(f). This shows that no function from
X to P(X) is surjective, which completes the proof.
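The diagonal set W from this proof can be computed explicitly when X is a small finite set. The sketch below goes through every function f : X → P(X) for a three-element X and confirms that W is never a value of f. It is an illustration of the argument on one finite example, with hypothetical names power_set and diagonal_set.

```python
from itertools import combinations, product


def power_set(X):
    """All subsets of X, as frozensets."""
    items = sorted(X)
    return [
        frozenset(c)
        for size in range(len(items) + 1)
        for c in combinations(items, size)
    ]


def diagonal_set(X, f):
    """W = {x in X : x not in f(x)}, where f is a dict from X to subsets of X."""
    return frozenset(x for x in X if x not in f[x])


X = {0, 1, 2}
P = power_set(X)

# One concrete function f and its diagonal set.
example = {0: frozenset({0}), 1: frozenset(), 2: frozenset({0, 1})}
print(sorted(diagonal_set(X, example)))  # [1, 2]

# Every function f : X -> P(X) misses its own diagonal set.
all_missed = all(
    diagonal_set(X, dict(zip(sorted(X), values))) not in values
    for values in product(P, repeat=len(X))
)
print(all_missed)
```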
A set is said to be uncountable if it is not at most countable. Cantor’s theorem
shows that, for example, the set P(N) is uncountable.
Exercise 31. Prove that the set ^N N of functions from N to N is uncountable. (Hint: Use Exercise 13.)
Our next goal is to prove that P(N) and the set of real numbers R are equinu-
merous. However, we do not know at this point how to construct the set R and
represent real numbers as sets. In the next section, we shall provide the construc-
tion of some standard number systems such as integers, rational numbers and real
numbers.
Before that, we would like to conclude this section with another important the-
orem that is extremely useful for proving that various sets are equinumerous.
¹ The author personally considers the proof of Cantor’s theorem as one of the most beautiful
arguments in mathematics.
Week 8
for all i ∈ N. In other words, the map g sends the natural numbers 0, 1, 2, 3, 4, . . .
to 0, −1, 1, −2, 2, . . . respectively. It is left as an exercise to the reader to check that the
map g is a bijection.
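The enumeration 0, −1, 1, −2, 2, . . . can be reproduced with a short closed formula. The sketch below is an assumption: it uses one formula that is consistent with the listed values, with Python integers standing in for the integers constructed in the notes, and the names g and g_inverse are hypothetical.

```python
def g(i):
    """Send 0, 1, 2, 3, 4, ... to 0, -1, 1, -2, 2, ... (assumed closed form)."""
    if i % 2 == 0:
        return i // 2
    return -((i + 1) // 2)


def g_inverse(z):
    """Candidate inverse: non-negative integers go to even indices, negative ones to odd indices."""
    return 2 * z if z >= 0 else -2 * z - 1


print([g(i) for i in range(7)])  # [0, -1, 1, -2, 2, -3, 3]
print(all(g_inverse(g(i)) == i for i in range(1000)))
print(all(g(g_inverse(z)) == z for z in range(-500, 500)))
```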
6.2. Rational numbers. In this subsection, we will construct the set Q of rational
numbers together with its arithmetical operations and linear order relation, which
turn Q into an ordered field. Let Z∗ denote the set Z − {[0, 0]} and ∼ be the relation
on Z × Z∗ defined by
(p, q) ∼ (r, s) ↔ p ·Z s = q ·Z r
It is easily verified that ∼ is an equivalence relation. We define the set of rational
numbers to be the quotient set
Q = Z × Z∗ / ∼
Intuitively speaking, the equivalence class [(p, q)]∼ is supposed to represent the
fraction p/q. For this reason, the equivalence class [(p, q)]∼ will be denoted by p/q
throughout this section. It easily follows from Theorem 22 that p/q = (−p)/(−q) for all p ∈ Z
and q ∈ Z∗. Consequently, while referring to the rational number p/q, we may assume
without loss of generality that 0 <Z q. We next define two binary operations +Q
and ·Q on Q as follows.
p/q +Q r/s = ((p ·Z s) +Z (q ·Z r)) / (q ·Z s)
p/q ·Q r/s = (p ·Z r) / (q ·Z s)
for all p, r ∈ Z and q, s ∈ Z∗ . That the relations +Q and ·Q are well-defined can
be checked using Theorem 22, with a proof similar to that of Lemma 30. We now
define a relation ≤Q on Q as follows.
p/q ≤Q r/s ↔ p ·Z s ≤Z q ·Z r
for all p, r ∈ Z and q, s ∈ Z with q, s >Z 0. Together with +Q , ·Q and ≤Q , the set
of rational numbers satisfies its usual properties. More precisely, we have that
Theorem 24. The structure (Q, +Q , ·Q , ≤Q ) is an ordered field.
As was the case before, we shall not prove Theorem 24 since its proof is long,
tedious and does not contain any important ideas. However, the author thinks
that every mathematician should see such proofs at least once in his or her lifetime
to be completely convinced that such constructions indeed work the way they are
supposed to work.
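The definitions of ∼, +Q, ·Q and ≤Q given above can be tried out on concrete pairs. The sketch below is an illustration under the assumption that Python integers stand in for the elements of Z; a rational number is represented by any pair (p, q) with q ≠ 0 from its equivalence class, and the names equivalent, add_q, mul_q, normalize and leq_q are hypothetical.

```python
def equivalent(a, b):
    """(p, q) ~ (r, s)  iff  p * s == q * r."""
    (p, q), (r, s) = a, b
    return p * s == q * r


def add_q(a, b):
    """p/q + r/s = (p*s + q*r) / (q*s), computed on representatives."""
    (p, q), (r, s) = a, b
    return (p * s + q * r, q * s)


def mul_q(a, b):
    """p/q * r/s = (p*r) / (q*s), computed on representatives."""
    (p, q), (r, s) = a, b
    return (p * r, q * s)


def normalize(a):
    """Pick a representative with positive denominator, using p/q = (-p)/(-q)."""
    p, q = a
    return (p, q) if q > 0 else (-p, -q)


def leq_q(a, b):
    """p/q <= r/s  iff  p*s <= q*r, for representatives with q, s > 0."""
    (p, q), (r, s) = normalize(a), normalize(b)
    return p * s <= q * r


half, other_half = (1, 2), (2, 4)
third, other_third = (1, 3), (-1, -3)

# Well-definedness on one example: equivalent inputs give equivalent outputs.
print(equivalent(add_q(half, third), add_q(other_half, other_third)))
print(equivalent(mul_q(half, third), mul_q(other_half, other_third)))
print(leq_q(third, half), leq_q(half, third))
```

Checking a single pair of representatives does not prove that the operations are well-defined; it only shows what the general verification has to establish.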
Once again, the well-known secondary school fact Z ⊆ Q is not true with our
constructions of Z and Q. However, there exists a canonical copy of Z inside Q,
namely, the range of the function f : Z → Q given by
f(i) = i/1
for all i ∈ Z. Identifying Z and f[Z], we may consider Z as a subset of Q. Consequently,
we may use a numeral p to denote both the integer p and the rational
number p/1. It turns out that the set of rational numbers is also countable.
Theorem 25. Q is countable.
Proof. We shall use the Cantor-Schröder-Bernstein theorem. It is straightforward to
check that the map f : N → Q given by f(n) = n/1 for all n ∈ N is an injection.
Note that the map g : Z × Z∗ → Q given by g(m, n) = m/n for all m ∈ Z and n ∈ Z∗
is a surjection and hence, by Lemma 25, there exists an injection h : Q → Z × Z∗.
Our idea is to collect all Dedekind cuts of Q together so that they are going
to fill all the holes in the number line. But what about the points in the number
line that do not correspond to holes, namely, the rational numbers? Should we
construct them separately as in the previous subsection?
A moment’s thought shows that not every Dedekind cut is determined by a hole.
For example, for every rational number q ∈ Q, the set C = {x ∈ Q : x <Q q} is a
Dedekind cut; however, there is no hole between C and Q − C in the number line.
The reason is that the set Q − C has a least element, namely, the rational number
q. We wish to consider a Dedekind cut D as the real number which is supposed
to be “right after” all the elements of D. With this interpretation, such Dedekind
cuts as C whose complement has a least element will represent rational numbers
and other Dedekind cuts such as A will represent real numbers that fill the holes
between rational numbers. Having this intuitive picture in mind, we define the set
of real numbers to be
R = {A ⊆ Q : A is a Dedekind cut}
In other words, a real number is simply a Dedekind cut. We next define the linear
order relation ≤R on the set R as follows.
r ≤R s ↔ r ⊆ s
It should be clear to the reader that the relation ≤R agrees with our intuitive
understanding of the real number line.
Proposition 1. The relation ≤R is a linear order relation.
Proof. That ≤R is a partial order relation easily follows from the properties of ⊆
and is left to the reader to be proven as an exercise. We shall only show that any
two real numbers are comparable with respect to ≤R .
Let r, s ∈ R. If r = s, then we clearly have r ≤R s. Assume now that r ≠ s.
Then we have r ⊈ s or s ⊈ r. Without loss of generality, we may assume that
r ⊈ s. By definition, there exists x ∈ r such that x ∉ s. We claim that s ⊆ r.
Let y ∈ s. Since ≤Q is a linear order relation and x ∉ s, we have that y <Q x or
x <Q y. If it were the case that x <Q y, then, since s is a Dedekind cut, we would
have x ∈ s, which is a contradiction. Therefore, we have y <Q x. But then, since r
is a Dedekind cut and x ∈ r, we have that y ∈ r. Therefore s ⊆ r and hence s ≤R r
by definition.
² The curious reader may wish to google the term “Cantor set”, which is the range of f.
Week 9
7. Ordinal numbers
In this section, we shall learn the concept of an ordinal number, which arguably
is the most important concept in modern set theory. Historically, ordinal numbers
were first defined as isomorphism classes of strictly well-ordered sets. Later on,
John von Neumann presented a simpler definition, which has been the standard
definition since then.
Definition 54. A set x is said to be transitive if every element of x is also a subset
of x, i.e. ∀y(y ∈ x → y ⊆ x).
In other words, transitive sets are those sets that contain elements of elements
of themselves. Equivalently, a set x is transitive if and only if ⋃x ⊆ x.
Exercise 32. Using induction, prove that n is transitive for all n ∈ N.
Ordinal numbers are supposed to “represent” strictly well-ordered sets. This
is why ordinal numbers were first defined as isomorphism classes of strictly well-
ordered sets. However, this definition brings some technical difficulties since an
isomorphism class of a strictly well-ordered set is not a set but a proper class.
John von Neumann’s idea was to find “canonical” representatives in isomorphism
classes of strictly well-ordered sets. It turns out that every strictly well-ordered set
is isomorphic to some transitive set which is strictly well-ordered by the membership
relation ∈. This leads us to the following simple but extraordinarily useful
definition.
Definition 55. A set α is an ordinal number if
• α is transitive, and
• (α, ∈_α) is a strictly well-ordered set.
What are some examples of ordinal numbers? It follows from Exercises 27 and
32 that each natural number is an ordinal number. Moreover, the set of natural
numbers N itself is also an ordinal number. From now on, we shall use ω to denote
the set N. The next lemma shows that the successor of an ordinal number is also
an ordinal number.
Lemma 31. If α is an ordinal number, then so is S(α).
Proof. Let α be an ordinal number. Let γ ∈ S(α). Then either γ ∈ α or γ = α. In
both cases, we have γ ⊆ S(α) since α is transitive. Checking that ∈_{S(α)} is a strict
well-order relation is left to the reader as an exercise. Hint: notice that the pair
(S(α), ∈_{S(α)}) is obtained by joining α to the strictly well-ordered set (α, ∈_α) as the
greatest element.
Ordinal numbers have various equivalent characterizations. The reader is expected
to be able to prove the equivalence of some of these characterizations.
Exercise 33. Prove that α is an ordinal if and only if α is transitive and for all
γ, δ ∈ α either γ ∈ δ, γ = δ or δ ∈ γ.
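For hereditarily finite sets, the characterization in Exercise 33 gives a direct test for being an ordinal. The sketch below assumes sets are modelled by Python frozensets, so it can only examine finite ordinals and cannot represent ω; the names is_transitive and is_ordinal are hypothetical.

```python
def successor(x):
    """S(x) = x ∪ {x}, with sets modelled as frozensets."""
    return x | frozenset([x])


def is_transitive(x):
    """Every element of x is also a subset of x."""
    return all(y <= x for y in x)


def is_ordinal(x):
    """Exercise 33's characterization: x is transitive and any two of its
    elements are equal or related by membership."""
    return is_transitive(x) and all(
        a in b or a == b or b in a for a in x for b in x
    )


zero = frozenset()
one = successor(zero)
two = successor(one)

# Natural numbers are ordinals, and so are their successors (Lemma 31).
print(all(is_ordinal(n) and is_ordinal(successor(n)) for n in (zero, one, two)))

# {1} is not transitive: 0 is an element of 1 but not an element of {1}.
print(is_transitive(frozenset([one])))

# {0, 1, {1}} is transitive, but 0 and {1} are not comparable, so it is not an ordinal.
x = frozenset([zero, one, frozenset([one])])
print(is_transitive(x), is_ordinal(x))
```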
Next, we prove that the class of ordinal numbers together with the membership
relation ∈ is strictly well-ordered.¹
¹ We have not officially defined what it means for an arbitrary class to be strictly well-ordered.
The reader need not learn this and should simply know that the membership relation ∈, considered
on the class of ordinal numbers, has all the properties of a strict well-order relation.
iv. Let X be a non-empty set whose elements are ordinal numbers. Pick α ∈ X.
If α ∩ X = ∅, then α is the least element of X with respect to ∈. If α ∩ X ≠ ∅,
then α ∩ X has a least element γ with respect to ∈ since α ∩ X ⊆ α and α is
an ordinal. If there were δ ∈ X such that δ ∈ γ, then δ ∈ α by transitivity
and hence δ ∈ α ∩ X, which contradicts the minimality of γ in α ∩ X with
respect to ∈. Thus, γ is the least element of X with respect to ∈.
Lemma 35. For any set X of ordinals, there exists an ordinal γ such that γ ∉ X.
Proof. Let X be a set of ordinals and set γ = ⋃X. Since the elements of X are
transitive sets, so is γ. Moreover, it follows from Lemmas 33 and 29 that γ is strictly
well-ordered by ∈. Therefore, γ is an ordinal.
Consider the ordinal α = S(γ). If it were the case that α ∈ X, then α ⊆ ⋃X = γ
and hence α = γ or α ∈ γ by Lemma 34. In both cases, we have that α ∈ S(γ) = α,
which is a contradiction. Therefore, α ∉ X.
² It is an easy exercise for the reader to check that there can be no ordinals between α and S(α).
³ For example, the axiom of empty set can be proven from the axiom of infinity and the axiom
of separation. Similarly, the axiom of separation can be proven from the remaining axioms.
⁴ The curious reader may read the note “ZFC without parameters” by Ralf Schindler and Philipp
Schlicht via the link http://www.math.uni-bonn.de/people/schlicht/publications.html as of 30
April 2018.
for any set A the collection Fϕ [A] is a set. Indeed, some textbooks introduce the
axiom of replacement in this parametric form.
Back to ordinals... How does the axiom of replacement help us construct the
next limit ordinal after the first infinite ordinal?
Consider the class function Fϕ which maps each natural number n to the infinite
ordinal which contains exactly n infinite ordinals and maps other
sets to the empty set. Then Fϕ (0) = ω, Fϕ (1) = S(ω), Fϕ (2) = S(S(ω)) and so on.
By the axiom of replacement, the set Fϕ [ω] exists. Taking the union of this set, we
obtain
⋃Fϕ [ω] = {0, 1, 2, . . . , ω, S(ω), S(S(ω)), . . . }
For the reasons we shall not explain at this moment, let us name this ordinal ω + ω.
At the moment, we have the following picture of the ordinals.
0 1 2 3 . . . ω S(ω) S(S(ω)) . . . ω + ω
Having constructed the second limit ordinal ω + ω, we can keep applying the suc-
cessor operation and taking the unions of the ordinals constructed via the axiom
of replacement.
0 1 2 3 . . . ω S(ω) S(S(ω)) . . . ω + ω S(ω + ω) S(S(ω + ω)) . . .
Where does this process end? The reader may have realized that all the ordinals
that can be constructed after finitely many stages via this “bottom-up” process are
countable sets.
Week 10
Are there uncountable ordinals? Before we answer this question, we need to prove
the following fact which shows that ordinal numbers indeed “represent” strictly
well-ordered sets.
Theorem 30. Let (W, ≺) be a strictly well-ordered set. Then there exists a unique
ordinal α such that (W, ≺) and (α, ∈α ) are isomorphic.
Proof. Recall that for any w ∈ W , the set pred(w) of predecessors of w together
with the relation ≺w = ≺ ∩ (pred(w) × pred(w)) is a strictly well-ordered set. Let
A = {w ∈ W : ∃α (“α is an ordinal” ∧ (pred(w), ≺w ) ≅ (α, ∈α ))}
Given two distinct ordinals, regarded as strictly well-ordered sets, one of them is a proper
initial segment of the other one. Therefore, for any w ∈ W , if (pred(w), ≺w ) is
isomorphic to some ordinal, then this ordinal has to be unique and we shall denote
it by αw .
It follows from the axiom of replacement that the collection Ω = {αw : ∃w w ∈ A}
is a set. We claim that Ω is an ordinal. Since the elements of Ω are ordinals, it is
strictly well-ordered by ∈. Therefore, it suffices to prove that Ω is transitive. Let
αw ∈ Ω and γ ∈ αw . By definition, there exists an isomorphism g : pred(w) → αw .
Then, since γ is a proper initial segment of the ordinal αw , the set g⁻¹[γ] is a proper
initial segment of pred(w) and hence is of the form pred(w′) for some w′ ∈ pred(w).
This means that γ ∈ Ω since g ↾ pred(w′) : pred(w′) → γ is an order isomorphism.
This completes the proof that Ω is transitive.
Consider the function f : A → Ω given by f (w) = αw for all w ∈ A. We claim
that f is an order isomorphism. That f is surjective follows from the definition.
Checking that a ≺ b → f (a) ∈ f (b) is left to the reader as an exercise.
Finally, we show that A = W . If it were the case that W − A ≠ ∅, then there
would be a least element w ∈ W − A with respect to ≺ and A = pred(w). This
would imply w ∈ A, contradicting the choice of w. Thus (W, ≺) is isomorphic to
(Ω, ∈Ω ). As before, notice that, given two distinct ordinals, one of them is a proper
initial segment of the other one. Therefore, (W, ≺) cannot be isomorphic to two
distinct ordinals.
In the light of this theorem, we can associate an order type to each strictly
well-ordered set.
Definition 57. Let (W, ≺) be a strictly well-ordered set. The order type of (W, ≺)
is the unique ordinal α such that (W, ≺) ≅ (α, ∈α ) and is denoted by ot(W, ≺).
Exercise 35. Let 2N and 2N + 1 denote the sets of even natural numbers and odd
natural numbers respectively. Let ≺ be the relation defined on N as follows.
m ≺ n ↔ (m + n ∈ 2N ∧ m < n) ∨ (m ∈ 2N ∧ n ∈ 2N + 1)
where < is the usual order relation on N. You are given the fact that ≺ is a strict
well-order relation on N. Prove that the order type of (N, ≺) is ω + ω by explicitly
constructing an order isomorphism from N to ω + ω.
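To get a feeling for the relation ≺ before attempting the exercise, one may list a finite initial segment of N in ≺-increasing order. The following is a small sketch; the function names are ours, and a finite sample of course only illustrates the shape of the order and proves nothing about its order type.

```python
from functools import cmp_to_key


def precedes(m: int, n: int) -> bool:
    """The relation m ≺ n of Exercise 35 on natural numbers."""
    same_parity = (m + n) % 2 == 0
    return (same_parity and m < n) or (m % 2 == 0 and n % 2 == 1)


def sort_by_precedes(numbers):
    """Return the given natural numbers listed in ≺-increasing order."""
    def compare(m, n):
        if precedes(m, n):
            return -1
        if precedes(n, m):
            return 1
        return 0
    return sorted(numbers, key=cmp_to_key(compare))
```

Calling sort_by_precedes(range(8)) lists the even numbers in their usual order followed by the odd numbers in their usual order.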
7.2. Hartogs numbers. We are now ready to prove that uncountable ordinals
exist. This result trivially follows from the fact that every set can be well-ordered,
which we shall see later is a consequence of the axiom of choice. However, it is
possible to prove the existence of uncountable ordinals using only the axioms of
ZF⁵.
Theorem 31. Let X be a set. Then there exists an ordinal λ such that there is no
injection from λ to X.
Proof. Let
W = {R ∈ P(X × X) : ∃YR (YR ⊆ X ∧ (YR , R) is a strictly well-ordered set)}
By the previous theorem, for each R ∈ W , there exists a unique ordinal αR such that
(YR , R) ≅ (αR , ∈αR ). By the axiom of replacement, the set λ = {αR : R ∈ W }
exists. We claim that λ is an ordinal.
Since the elements of λ are ordinals, it is sufficient to prove that λ is transitive.
Let γ ∈ λ and β ∈ γ. Then, by definition, there exists an order isomorphism g from
some strictly well-ordered set (Y, R) to (γ, ∈γ ) where Y ⊆ X. Clearly g⁻¹[β] and
β, as strictly well-ordered sets, are order isomorphic via the restriction of g. Thus,
β ∈ λ and hence λ is transitive. This completes the proof that λ is an ordinal.
If there were an injection from λ to X, then some subset of X could be strictly
well-ordered so that the corresponding strictly well-ordered set is isomorphic to λ.
But then, by definition, we would have λ ∈ λ, which is a contradiction. Thus there
exists no injection from λ to X.
Given any set X, the least ordinal which does not inject into X is called the
Hartogs number of X and is denoted by ℵ(X). By definition, there can be no
bijection between a set X and its Hartogs number ℵ(X).
In particular, the Hartogs number of ω, which is the least ordinal that does not
inject into ω, is uncountable. This ordinal is known as the first uncountable ordinal
and is denoted by ω1 . Since ω1 is the least ordinal that does not inject into ω, the
elements of ω1 have to be countable ordinals. Conversely, any countable ordinal
injects into ω and hence is an element of ω1 . Therefore, ω1 is precisely the set of
countable ordinals.
One can similarly argue that the Hartogs number ℵ(λ) of an ordinal λ is the
ordinal which consists of precisely the ordinals that are equinumerous with some
subset of λ.
The existence of Hartogs numbers shows that our “bottom-up” approach of
constructing ordinal numbers is useless. There are ordinal numbers that cannot be
obtained from the empty set in finitely (even, countably) many stages by applying
the successor operation and taking unions. Ordinal numbers exist simply because
they do so.
Proof. Assume towards a contradiction that there exists an ordinal γ such that
¬ϕ(γ) holds. Then, the set {δ ∈ S(γ) : ¬ϕ(δ)} is non-empty and hence has a least
element θ with respect to <. By the choice of θ, we have that ϕ(δ) holds for all
δ < θ. By assumption, this implies that ϕ(θ) holds, which is a contradiction. Thus,
ϕ(γ) holds for all ordinals γ.
It can easily be seen that the principle of transfinite induction holds even if we
allow fixed parameters ϕ(x, p1 , . . . , pn ) in the statement of the theorem. Note that,
given a fixed ordinal δ, the principle of transfinite induction can be used to prove
that a property ϕ(x) holds for all ordinals γ ≤ δ by showing that the inductive
hypothesis holds for all ordinals up to δ, i.e. that for all γ ≤ δ, if ϕ(β) holds for all
β < γ, then ϕ(γ) holds.
The principle of transfinite induction is the generalization of the principle of
induction on N to the class of ordinal numbers. In practice, this principle is most
commonly used in the following form.
Theorem 33 (The principle of transfinite induction, alternative formulation). Let
ϕ(x) be a formula in the language of set theory with one free variable. Assume that
• ϕ(0) holds.
• For all ordinals γ, if ϕ(γ) holds, then so does ϕ(S(γ)).
• For all limit ordinals θ, if ϕ(γ) holds for all γ < θ, then ϕ(θ) holds.
Then ϕ(γ) holds for all ordinals γ.
Proof. Exercise. Hint. Imitate the proof of the original formulation of the principle
of transfinite induction.
As before, this form of the principle of transfinite induction can also be used to
show that a property ϕ(x) holds for all ordinals up to an ordinal δ by showing that
the inductive hypothesis holds for all ordinals up to δ.
We shall next prove the transfinite recursion theorem which allows us to define
class functions recursively on the class of ordinal numbers.
Theorem 34 (The transfinite recursion theorem). Let Fϕ be a class function.
Then there exists a class function Fψ such that Fψ (α) = Fϕ (Fψ ↾ α) for all ordinal
numbers α. Moreover, this class function is unique in the sense that if there exists
another class function Fφ satisfying the same property, then Fφ (α) = Fψ (α) for all
ordinal numbers α.
Before we proceed to the proof of this theorem, we would like to take a pause
and understand certain subtleties regarding the statement of this theorem. In the
statement, we are given a class function Fϕ and we assert the existence of another
class function Fψ . As we emphasized before, a class function Fψ is technically not
a function, but rather, is a formula ψ(x, y) that assigns a unique set y to each set
x. Naturally, one should ask the following question: How on earth can we quantify
over class functions in the statement of this theorem?!
What the transfinite recursion theorem really says is that, given a formula ϕ(x, y)
that assigns a unique set y to each set x, one can produce a formula ψ(x, y) that
assigns a unique set y to each set x such that Fψ (α) = Fϕ (Fψ ↾ α) for all ordinals α;
and that the values of Fψ on the class of ordinal numbers are uniquely determined.
Therefore, one should think of the transfinite recursion theorem not as a single
theorem but as a theorem schema. For each formula ϕ(x, y) defining a class function,
the corresponding instance of this schema is a theorem of ZFC. Having realized this
subtle point, we are now ready to prove the transfinite recursion theorem.
Proof of Theorem 34. Let Fϕ be a class function. Throughout this proof, we will
refer to a function f such that dom(f ) = S(α) and f (γ) = Fϕ (f ↾ γ) for all γ ≤ α
as a computation of length α.
First, we show that computations of length α, if they exist, are unique for all
ordinals α. Let α be an ordinal and let f and f′ be computations of length α. We
will prove that f (γ) = f′(γ) for all γ ≤ α via transfinite induction. Let γ ≤ α and
assume that the claim holds for all ordinals less than γ. Then
f (γ) = Fϕ (f ↾ γ) = Fϕ (f′ ↾ γ) = f′(γ)
Thus, f (γ) = f′(γ) for all γ ≤ α and hence computations of a fixed length are
unique if they exist.
Let ψ(x, y) be the following formula with two free variables⁶.
(“x is not an ordinal” ∧ y = 0)
∨
(“x is an ordinal” ∧ ∃f (y = f (x) ∧ “f is a computation of length x”))
In other words, ψ(x, y) is the formula which maps each non-ordinal x to the empty
set and each ordinal x to the value of some computation of length x at x. We claim
that this formula defines a class function and that Fψ satisfies the requirements in
the statement of the theorem.
At this point, it is not clear that ψ(x, y) indeed defines a class function, i.e. for
each x there exists a unique y such that ψ(x, y). Clearly, if x is not an ordinal, then
y = ∅. Thus it is sufficient to show that for each ordinal x there exists a unique
y such that ψ(x, y) holds. Since y is given as the value of some computation of
length x at the set x, it suffices to prove that for all ordinals x there exists a unique
computation of length x. We already know that computations of a fixed length are
unique if they exist. Thus we only need to show the existence of computations of
arbitrary length.
We proceed by transfinite induction. Let α be an ordinal and assume that there
exists a (necessarily unique) computation of length β for all β < α. We need to
prove that there exists a (necessarily unique) computation of length α. Let H be
the set
{g : ∃β ∈ α “g is a computation of length β”}
which exists by the assumption and the axiom of replacement. Define
f = ⋃H ∪ {(α, Fϕ (⋃H))}
We now prove that f is a computation of length α. Observe that, by construction,
we have that dom(f ) = ⋃_{g∈H} dom(g) ∪ {α} = S(α). To prove that f is indeed
a function, it is sufficient to prove that the collection H consists of compatible
functions. Let g, g′ ∈ H be functions and assume without loss of generality that
dom(g) = β1 ≤ β2 = dom(g′). We prove by transfinite induction that g(θ) = g′(θ)
for all θ < β1 . Let θ < β1 and assume that the claim holds for all ordinals less than
θ. Then
g(θ) = Fϕ (g ↾ θ) = Fϕ (g′ ↾ θ) = g′(θ)
⁶ Obviously, this formula needs to be written in the language of set theory. However, we shall
not carry out this tedious task since the reader should be able to convert the following informal
description to a formula in the language of set theory.
Thus, by transfinite induction, we have g(θ) = g′(θ) for all θ < β1 and hence
H is a collection of compatible functions. Consequently, the relation f is indeed
a function. Next, let β < α and pick g ∈ H such that dom(g) = S(β). Then
f (β) = g(β) = Fϕ (g ↾ β) = Fϕ (f ↾ β) since g ⊆ f and g is a computation. Finally,
f (α) = Fϕ (⋃H) = Fϕ (f ↾ α) and hence f is a computation of length α, which
completes the induction.
We have proven that the formula ψ(x, y) defines a class function Fψ . We next
show that this class function satisfies the conditions in the statement of the
theorem. Let α be any ordinal. Then, by definition, Fψ (α) = f (α) where f is
the unique computation of length α. The crucial observation is that the restriction
of a computation of length α to any ordinal S(β) ≤ α is the unique
computation of length β. Therefore, f (β) = Fψ (β) for any β < α. It follows that
Fψ (α) = f (α) = Fϕ (f ↾ α) = Fϕ (Fψ ↾ α). To show that such a class function Fψ is
unique, assume that there exists another formula φ(x, y) defining a class function
Fφ such that Fφ (α) = Fϕ (Fφ ↾ α) for all ordinals α. By transfinite induction, we
will prove that Fψ (α) = Fφ (α) for all ordinals α.
Let α be an ordinal and assume that the claim holds for all ordinals less than α.
Then Fψ (α) = Fϕ (Fψ ↾ α) = Fϕ (Fφ ↾ α) = Fφ (α). Thus, by transfinite induction,
we have that Fψ (α) = Fφ (α) for all ordinals α, which completes the proof.
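The notion of a computation can be made concrete for finite ordinals. The following is a minimal sketch, assuming that a function with domain S(n) = {0, . . . , n} is modelled by a Python dictionary and that the class function Fϕ is modelled by an ordinary Python function F acting on such dictionaries; the names computation and F_psi are ours. It only illustrates the finite stages of the construction and says nothing about limit stages.

```python
def computation(F, n: int) -> dict:
    """Return the computation of length n: the function f with domain
    {0, ..., n} such that f(k) = F(f restricted to k) for all k <= n."""
    f = {}
    for k in range(n + 1):
        # At this point f has domain {0, ..., k-1}, i.e. f is the
        # restriction of the final function to k. A copy is passed so
        # that F cannot modify the computation built so far.
        f[k] = F(dict(f))
    return f


def F_psi(F, n: int):
    """Value at n of the recursively defined function: the value at n
    of the computation of length n."""
    return computation(F, n)[n]


def example_F(g: dict) -> int:
    """Hypothetical example: one more than the sum of all earlier values."""
    return sum(g.values()) + 1
```

With example_F, the computation of length 3 is {0: 1, 1: 2, 2: 4, 3: 8}, and its restriction to S(1) = {0, 1} is the computation of length 1, in line with the crucial observation used in the proof.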
In practice, the following variant of the transfinite recursion theorem is frequently
used.
Theorem 35. Let Fϕ1 , Fϕ2 , Fϕ3 be class functions. Then there exists a unique
class function Fψ such that
• Fψ (0) = Fϕ1 (0),
• Fψ (S(α)) = Fϕ2 (Fψ (α)) for all ordinals α, and
• Fψ (α) = Fϕ3 (Fψ ↾ α) for all limit ordinals α.
The proof of this theorem easily follows from the proof of the transfinite recursion
theorem. Finally, we note that, as was the case with the recursion theorem, there
are variants of the transfinite recursion theorem that allow “parameters” in the
recursive definition of Fψ . We refer the reader to [1] for a precise statement and
the proof of such a variant.
Week 11
b. α < β ←→ γ + α < γ + β.
c. α ≤ β −→ α + γ ≤ β + γ.
d. γ + α = γ + β ←→ α = β.
e. (α · β) · γ = α · (β · γ).
f. α · (β + γ) = (α · β) + (α · γ).
g. α < β −→ α · γ ≤ β · γ.
h. If γ > 0, then α < β −→ γ · α < γ · β.
i. α^{β+γ} = α^β · α^γ.
j. (α^β)^γ = α^{β·γ}.
k. If γ > 1, then α < β −→ γ^α < γ^β.
l. α < β −→ α^γ ≤ β^γ.
Proof. [a.] We shall prove this by transfinite induction on γ. Let α and β be
ordinals. Clearly (α + β) + 0 = α + β = α + (β + 0) and hence the claim holds
for γ = 0. Let γ be an ordinal and assume that (α + β) + γ = α + (β + γ). Then
(α + β) + S(γ) = S((α + β) + γ) = S(α + (β + γ)) = α + S(β + γ) = α + (β + S(γ))
and hence the claim holds for the ordinal S(γ). Finally, let γ be a limit ordinal
and assume that (α + β) + θ = α + (β + θ) for all θ < γ. First, note that by the
inductive assumption
(α + β) + γ = sup{(α + β) + θ : θ < γ} = sup{α + (β + θ) : θ < γ}
On the other hand, it follows from the next lemma⁸ that
sup{α + (β + θ) : θ < γ} = α + sup{β + θ : θ < γ} = α + (β + γ)
It follows from the principle of transfinite induction that the claim holds for all
ordinals γ.
[b.] We shall prove this by transfinite induction on β. Let P (β) be the property
that β is an ordinal and α < β −→ γ + α < γ + β for all ordinals α, γ. Note that
P (0) trivially holds. Assume that P (β) holds for some ordinal β. Let γ and α be
ordinals such that α < S(β). Then either α < β or α = β. If α < β, then we have
γ + α < γ + β < S(γ + β) = γ + S(β)
by the inductive assumption. If α = β, then γ + α = γ + β < S(γ + β) = γ + S(β).
Hence P (S(β)) holds. Finally, let β be a limit ordinal and assume that P (θ) holds
for all ordinals θ < β. Then, for any α < β, we have α < S(α) < β and hence
γ + α < S(γ + α) = γ + S(α) ≤ sup{γ + θ : θ < β} = γ + β
which shows that P (β) holds. By the principle of transfinite induction, P (β) holds
for all ordinals β.
Before concluding this subsection, we would like to mention that addition, mul-
tiplication and exponentiation on ordinal numbers are “continuous” in the second
variable in the following sense.
Lemma 38. Let X be a non-empty set of ordinals and α be an ordinal. Then
sup{α + β : β ∈ X} = α + sup(X)
Proof. Let γ = sup(X). The proof splits into two cases.
⁸ The proof of the next lemma only uses Lemma 37.b whose proof does not use Lemma 37.a.
Consequently, we are justified in using this lemma.
Proof. It follows from Lemma 37 that α·(γ +1) ≥ (γ +1) > γ and hence there exists
a least ordinal δ such that α·δ > γ. If δ were limit, then α·δ = sup{α·θ : θ < δ} > γ
would imply that there exists an ordinal θ < δ such that α·θ > γ, which contradicts
the choice of δ. Thus δ is a successor ordinal, say δ = S(β). Then, by the choice of
δ, we have α · β ≤ γ and α · S(β) > γ. Hence, β is the greatest ordinal such that
α · β ≤ γ.
Using the same proof strategy as in the previous lemma, one can also prove
the following.
Lemma 41. Let α, γ be ordinals such that 2 ≤ α ≤ γ. Then there exists a greatest
ordinal β such that α^β ≤ γ.
Proof. Exercise.
We are now ready to prove the analogue of Euclidean division for ordinal
numbers.
Lemma 42. Let α, γ be ordinals such that γ ≠ 0. Then there exist unique ordinals
β and ρ with ρ < γ such that α = γ · β + ρ.
Proof. If α < γ, then clearly we can choose β = 0 and ρ = α since α = γ · 0 + α.
Assume that γ ≤ α. Then it follows from Lemma 40 that there exists β such that
β is the greatest ordinal for which we have γ · β ≤ α. By Lemma 39, there exists
an ordinal ρ such that α = γ · β + ρ. If it were the case that ρ ≥ γ, then we would
have α = γ · β + ρ ≥ γ · β + γ = γ · (β + 1), contradicting the choice of β. Hence
we have ρ < γ.
To prove the uniqueness of β and ρ, assume that α = γ · β + ρ = γ · β′ + ρ′ for
some ordinals β, β′, ρ, ρ′ with ρ, ρ′ < γ. If it were the case that β < β′, then we
would have β + 1 ≤ β′ and hence
α = γ · β + ρ < γ · β + γ = γ · (β + 1) ≤ γ · β′ ≤ γ · β′ + ρ′ = α
which is a contradiction. Similarly, we cannot have β′ < β. Hence β = β′. But then
Lemma 37.d implies that ρ = ρ′, which completes the proof of the lemma.
Lemma 43. Let α, β be ordinals such that α < β and k ∈ ω be a natural number.
Then ω^α · k < ω^β.
Proof. By Lemma 37, we have ω^α · k < ω^α · ω = ω^{α+1} ≤ ω^β.
Next, we prove the main theorem of this subsection.
Theorem 36. Let α > 0 be an ordinal number. Then there exist unique ordinals
β_1 > β_2 > · · · > β_n and positive natural numbers k_1 , k_2 , . . . , k_n such that
α = ω^{β_1} · k_1 + ω^{β_2} · k_2 + · · · + ω^{β_n} · k_n
The expression ω^{β_1} · k_1 + ω^{β_2} · k_2 + · · · + ω^{β_n} · k_n in the statement of the theorem is
said to be the Cantor normal form of the ordinal α. We shall see later that ordinal
arithmetic is substantially easier when one uses the Cantor normal form of ordinals.
Proof of Theorem 36. First, we prove the existence of Cantor normal forms of
ordinals by transfinite induction on α > 0. Let α > 0 be an ordinal and assume that
all ordinals 1 ≤ γ < α have Cantor normal forms. If α < ω, then clearly ω^0 · α is
a Cantor normal form of α. Assume that ω ≤ α. Then, by Lemma 41, there exists
a greatest ordinal β such that ω^β ≤ α. It follows from Lemma 42 that there exist
unique δ and ρ with ρ < ω^β such that α = ω^β · δ + ρ. If it were the case that δ ≥ ω,
then we would have
α = ω^β · δ + ρ ≥ ω^β · δ ≥ ω^β · ω = ω^{β+1}
which contradicts the choice of β. Hence δ < ω. If ρ = 0, then α has a Cantor
normal form α = ω^β · δ. Assume that ρ ≥ 1. Then, by the inductive hypothesis,
since ρ < ω^β ≤ α, the ordinal ρ has a Cantor normal form ω^{β_2} · k_2 + · · · + ω^{β_n} · k_n
for some ordinals β_2 > · · · > β_n and positive natural numbers k_2 , . . . , k_n . Since
ω^{β_2} ≤ ρ < ω^β , we have β_2 < β by Lemma 37.k. It follows that
ω^β · δ + ω^{β_2} · k_2 + · · · + ω^{β_n} · k_n
is a Cantor normal form for the ordinal α. Thus, by transfinite induction, each
ordinal α > 0 has a Cantor normal form.
Next, we prove the uniqueness of the Cantor normal form of ordinals by
transfinite induction on α > 0. Let α > 0 be an ordinal and assume that all ordinals
1 ≤ γ < α have unique Cantor normal forms. If α < ω, then clearly ω^0 · α is the
unique Cantor normal form of α. Assume that ω ≤ α. Let
α = ω^{β_1} · k_1 + ω^{β_2} · k_2 + · · · + ω^{β_n} · k_n = ω^{γ_1} · l_1 + ω^{γ_2} · l_2 + · · · + ω^{γ_m} · l_m
be two Cantor normal forms of α. If β_1 < γ_1 , then, by the previous lemma, we
would have α ≥ ω^{γ_1} > ω^{β_1} · (k_1 + k_2 + · · · + k_n ) ≥ α, which is a contradiction.
Similarly, we cannot have γ_1 < β_1 . Therefore we have γ_1 = β_1 . Let
δ = ω^{γ_1}
ρ_1 = ω^{β_2} · k_2 + · · · + ω^{β_n} · k_n
ρ_2 = ω^{γ_2} · l_2 + · · · + ω^{γ_m} · l_m
Observe that α = δ · k_1 + ρ_1 = δ · l_1 + ρ_2 and that ρ_1 < δ and ρ_2 < δ. Consequently,
the uniqueness of ordinals in the statement of Lemma 42 implies that k_1 = l_1 and
ρ_1 = ρ_2 . If ρ_1 = 0, then ω^{β_1} · k_1 is the unique Cantor normal form of α. If ρ_1 ≠ 0,
then ρ_2 ≠ 0 and ρ_1 has a unique Cantor normal form by the inductive assumption.
It follows that m = n and that β_i = γ_i and k_i = l_i for each 2 ≤ i ≤ m. This shows that α has a
unique Cantor normal form, which completes the proof.
Having established that all ordinals have unique Cantor normal forms, we shall
next prove several lemmas that allow us to easily compute sums and products of
ordinals using their Cantor normal forms.
Lemma 44. For all ordinals α, γ, if α < γ, then ω^α + ω^γ = ω^γ.
Proof. We shall prove this by transfinite induction⁹ on γ.
• For γ = 0, the claim is vacuously true since there are no ordinals less than
zero.
⁹ We would like to note that this lemma can be proven without transfinite induction by simply
observing that there exist unique ordinals β > 0 and θ such that γ = α + β and ω^β = ω + θ;
and hence ω^α + ω^γ = ω^α + ω^{α+β} = ω^α · (1 + ω^β) = ω^α · (1 + ω + θ) = ω^α · (ω + θ) = ω^α · ω^β = ω^γ.
However, the author thinks that it would be more beneficial for the reader to see as many examples
of transfinite induction as possible.
• Let γ be an ordinal and assume that the claim holds for γ. Then for any
ordinal α < γ we have
ω^α + ω^{S(γ)} = ω^α + ω^γ · ω = ω^α + ω^γ · (1 + ω)
= ω^α + ω^γ + ω^γ · ω
= ω^γ + ω^γ · ω = ω^γ · (1 + ω) = ω^{S(γ)}
and for α = γ we have
ω^γ + ω^{S(γ)} = ω^γ + ω^γ · ω = ω^γ · (1 + ω) = ω^{S(γ)}
Hence the claim holds for S(γ).
• Let γ be a limit ordinal and assume that the claim holds for all ordinals
strictly less than γ. Then, by Exercise 38, for any ordinal α < γ, we have
ω^α + ω^γ = ω^α + sup{ω^θ : θ < γ} = sup{ω^α + ω^θ : θ < γ}
Since α < γ and γ is a limit, there exist ordinals θ such that α < θ < γ. It
then follows from the inductive assumption that
ω^α + ω^γ = sup{ω^α + ω^θ : θ < γ} = sup{ω^θ : θ < γ} = ω^γ
Therefore the claim holds for γ. By transfinite induction, the claim holds
for all ordinals γ.
The next corollary immediately follows from the lemma above.
Corollary 37. Let α < γ be ordinals and m, n ∈ ω be such that n > 0. Then
ω^α · m + ω^γ · n = ω^γ · n
Proof. Exercise. (Hint. Use induction on m. For the base case, use the exercise¹⁰.)
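Corollary 37 says that, in a sum, terms with smaller exponents are absorbed by a later term with a larger exponent. The following is a minimal sketch of the resulting addition rule, under two assumptions of ours: ordinals are restricted to those below ω^ω, so that all exponents are natural numbers, and a Cantor normal form is modelled by a list of (exponent, coefficient) pairs with strictly decreasing exponents and positive coefficients, the empty list standing for 0. Besides Corollary 37, the rule uses the distributive law ω^β · k + ω^β · l = ω^β · (k + l) from Lemma 37.f.

```python
def add(a, b):
    """Return the Cantor normal form of a + b, where a and b are lists of
    (exponent, coefficient) pairs with strictly decreasing natural-number
    exponents and positive coefficients (ordinals below ω^ω only)."""
    if not b:
        return list(a)
    lead_exp, lead_coef = b[0]
    # Terms of a with a larger exponent than the leading term of b survive.
    kept = [(e, k) for (e, k) in a if e > lead_exp]
    # A term of a with the same exponent merges with the leading term of b.
    same = sum(k for (e, k) in a if e == lead_exp)
    # All terms of a with a smaller exponent are absorbed (Corollary 37).
    return kept + [(lead_exp, same + lead_coef)] + list(b[1:])
```

For example, add([(0, 1)], [(1, 1)]) returns [(1, 1)], which reflects 1 + ω = ω, whereas add([(1, 1)], [(0, 1)]) returns [(1, 1), (0, 1)], which is ω + 1.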
Lemma 45. Let ω^{β_1} · k_1 + ω^{β_2} · k_2 + · · · + ω^{β_n} · k_n be the Cantor normal form of
a non-zero ordinal α. Then, for any k ∈ ω with k > 0, we have
α · k = ω^{β_1} · (k_1 · k) + ω^{β_2} · k_2 + · · · + ω^{β_n} · k_n
Proof. Exercise. (Hint. Use induction on k and apply Corollary 37.)
Lemma 46. Let ω^{β_1} · k_1 + ω^{β_2} · k_2 + · · · + ω^{β_n} · k_n be the Cantor normal form of
a non-zero ordinal α. Then, for any ordinal γ > 0, we have α · ω^γ = ω^{β_1+γ}.
Proof. It follows from Lemma 37 that ω^{β_1} ≤ α ≤ ω^{β_1} · (k_1 + k_2 + · · · + k_n ) and
hence, multiplying each side by ω^γ on the right, we get
ω^{β_1} · ω^γ ≤ α · ω^γ ≤ (ω^{β_1} · (k_1 + k_2 + · · · + k_n )) · ω^γ
Since ordinal multiplication is associative and we have (k_1 + k_2 + · · · + k_n ) · ω^γ = ω^γ ,
it follows that ω^{β_1} · ω^γ ≤ α · ω^γ ≤ ω^{β_1} · ω^γ and hence α · ω^γ = ω^{β_1+γ}.
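Lemmas 45 and 46 translate directly into rules for manipulating Cantor normal forms. The following is a minimal sketch, again under assumptions of ours: ordinals are restricted to those below ω^ω, a Cantor normal form is a list of (exponent, coefficient) pairs with strictly decreasing natural-number exponents and positive coefficients, and the function names are hypothetical.

```python
def mul_nat(a, k: int):
    """Return the Cantor normal form of a · k for a natural number k > 0.
    By Lemma 45, only the leading coefficient is multiplied by k."""
    if k <= 0:
        raise ValueError("k must be a positive natural number")
    if not a:
        return []  # 0 · k = 0
    lead_exp, lead_coef = a[0]
    return [(lead_exp, lead_coef * k)] + list(a[1:])


def mul_omega_power(a, g: int):
    """Return the Cantor normal form of a · ω^g for a natural number g > 0.
    By Lemma 46, only the leading exponent of a matters."""
    if g <= 0:
        raise ValueError("g must be a positive natural number")
    if not a:
        return []  # 0 · ω^g = 0
    lead_exp, _ = a[0]
    return [(lead_exp + g, 1)]
```

For α = ω^2 · 2 + ω · 3 + 5, represented by [(2, 2), (1, 3), (0, 5)], this gives α · 3 = ω^2 · 6 + ω · 3 + 5 and α · ω = ω^3.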
We now use these results to compute some sums and products of ordinals in their
Cantor normal forms. One can prove similar results for ordinal exponentiation in
order to express the Cantor normal form of α^β in terms of the Cantor normal forms
of α and β. However, we shall not cover these identities and refer the curious reader
¹⁰ Exercise. Prove that if θ ≥ ω is an ordinal and n is a natural number, then n + θ = θ.
Week 12
8. Cardinal numbers
In this section, we shall define the class of cardinal numbers, which is a special
subclass of ordinal numbers. The motivation behind the concept of cardinal num-
bers comes from the following question: Can we find “natural representatives” for
the equinumerosity classes¹ of sets?
We have established the fact that every strictly well-ordered set is isomorphic to
a unique ordinal. If it is the case that every set can be well-ordered, then every set
is equinumerous with some ordinal number and hence the equinumerosity class of
a set can be represented by the least ordinal with which the set is equinumerous.
Thus we need to show that every set can be well-ordered.
8.1. Zorn’s lemma, the well-ordering theorem and the axiom of choice. In
this subsection, we shall prove two important consequences of the axiom of choice,
which will turn out to be equivalent to the axiom of choice assuming only the
axioms of ZF.
Lemma 47 (Zorn’s lemma). Let (P, ≼) be a partially ordered set such that every
chain has an upper bound in P, i.e. for every C ⊆ P, if C is a chain, then there
exists p ∈ P such that c ≼ p for all c ∈ C. Then there exists a maximal element in
P with respect to ≼.
Proof. Assume to the contrary that P does not have a maximal element. Then, for
every chain C ⊆ P, the set {u ∈ P : ∀c ∈ C c ≺ u} is non-empty since there are no
maximal elements and there exists an upper bound for C in P.
Let 𝒞 be the set of chains in P. It follows from Lemma 6 that there exists a function
h : P(P) − {∅} → P such that h(S) ∈ S for all non-empty S ⊆ P. Consequently,
there exists a function f : 𝒞 → P such that c ≺ f (C) for all C ∈ 𝒞 and c ∈ C,
namely the function f (C) = h({u ∈ P : ∀c ∈ C c ≺ u}).
P is clearly non-empty. Fix some arbitrary element a_0 ∈ P. By transfinite
recursion, define
a_α = f ({a_β : β < α})   if {a_β : β < α} is a chain,
a_α = a_0                  otherwise
for all ordinals α > 0. It follows from an easy transfinite induction argument that
a_β ≺ a_α = f ({a_γ : γ < α}) for all β < α. It follows that, for every ordinal α,
there exists an injective function from α to P, namely the function γ ↦ a_γ . This
contradicts the existence of the Hartogs number of P. Thus, P has a maximal
element.
Lemma 48 (The well-ordering theorem). Every set can be well-ordered, i.e. for
every set A there exists a well-order relation ≼_A on the set A.
Proof. Let A be a set. Consider the set
P = {(B, ≼_B ) : B ⊆ A ∧ ≼_B is a well-order relation on B}
and the relation ⊑ on P given by
(B, ≼_B ) ⊑ (C, ≼_C ) ←→
B ⊆ C ∧ ≼_B ⊆ ≼_C ∧ “B is an initial segment of C in (C, ≼_C )”
¹ Given a set A, the class of all sets that are equinumerous with A is called the equinumerosity
class of A.
Notice that the proof of Zorn’s lemma uses ZFC and the proof of the well-
ordering theorem uses only the axioms of ZF and Zorn’s lemma. Consequently, if
we can prove that the axioms of ZF together with the well-ordering theorem imply
the axiom of choice, then Zorn’s lemma, the well-ordering theorem and the axiom
of choice would be equivalent over ZF.
Exercise 39. Assuming only the axioms of ZF, prove that the axiom of choice holds
if and only if for every non-empty set X there exists a function f : P(X)−{∅} → X
such that f (S) ∈ S for all non-empty S ⊆ X.
Lemma 49 (ZF). If the well-ordering theorem holds, then so does the axiom of
choice.
Proof. Let X be a non-empty set. If the well-ordering theorem holds, then there
exists a well-order relation ≼ on the set X. It follows that the relation
f = {(S, a) : S ⊆ X ∧ S ≠ ∅ ∧ a ∈ S ∧ ∀s ∈ S a ≼ s}
is a function from P(X) − {∅} to X such that f (S) ∈ S for all non-empty S ⊆ X.
The existence of such a function is equivalent to the axiom of choice by the previous
exercise.
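The function f in this proof simply picks the least element of each non-empty subset. The following is a minimal sketch for a finite set, assuming that the well-order is given as a list enumerating the set in increasing order; the name choice_function is ours. For finite sets no form of choice is needed, so this only illustrates the shape of the construction.

```python
def choice_function(well_order):
    """Given a list enumerating a finite set X in increasing order of a
    well-order, return the function f with f(S) = least element of S."""
    rank = {x: i for i, x in enumerate(well_order)}

    def f(S):
        if not S:
            raise ValueError("f is only defined on non-empty subsets")
        # The least element of S is the one of smallest rank.
        return min(S, key=rank.__getitem__)

    return f
```

For example, with the well-order c, a, b on the set {a, b, c}, the resulting function maps {a, b} to a and {a, b, c} to c.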
8.2. Cardinal number of a set. Since every set can be well-ordered,
every set can be put in bijection with some ordinal. We would like to
declare the least ordinal with which a set is equinumerous to be the cardinality of
this set. For this reason, let us first give a special name to such ordinals.
Definition 58. An ordinal number α is said to be a cardinal number if α is not
equinumerous with any ordinal β < α.
Definition 59. Let X be a set. The cardinal number (or simply, the cardinality) of
X is the unique cardinal number which is equinumerous with X, which is denoted
by |X|.
Notice that our usage of the word cardinality and the notation |X| is consistent
with our earlier usage. More precisely, there exists an injection from X to Y if and
only if the cardinal number of X is less than or equal to the cardinal number of Y ,
for both of which we use the notation |X| ≤ |Y |.
Exercise 40. Let A and B be sets. Prove that |A × B| = | |A| × |B| | and
|^B A| = | ^{|B|} |A| | and | |A| | = |A|.
It is easily seen that the finite cardinal numbers are precisely the
natural numbers and the first infinite cardinal is ω. By transfinite recursion, define
the following class of ordinals.
• ℵ_0 = ω
• ℵ_{α+1} = ℵ(ℵ_α ), the Hartogs number of ℵ_α , for all ordinals α, and
• ℵ_γ = sup{ℵ_θ : θ < γ} for all limit ordinals γ.
It is easily seen that α < β implies that ℵ_α < ℵ_β . Consequently, the collection of ℵ
numbers is a proper class. We shall next prove that the proper class of ℵ numbers
is a complete (transfinite) list of infinite cardinal numbers.
Lemma 50. For all ordinals α, ℵα is a cardinal number.
Proof. We shall prove this by transfinite induction. The claim clearly holds for
α = 0. Assume that the claim holds for an ordinal α ≥ 0. Since the Hartogs
number of ℵα is the least ordinal which does not inject into ℵα , no ordinal less
than ℵα+1 can be equinumerous with ℵα+1 and hence ℵα+1 is a cardinal. Finally,
assume that the claim holds for all ordinals less than a limit ordinal γ. If ℵγ were
not a cardinal, then there would exist some ordinal δ < ℵγ such that δ and ℵγ
are equinumerous. In this case, by the definition of ℵγ , there would exist θ < γ
such that δ < ℵθ. Since δ ⊆ ℵθ ⊆ ℵγ and ℵγ is equinumerous with δ, the
Cantor-Schröder-Bernstein theorem would give that ℵθ is equinumerous with δ < ℵθ,
so that ℵθ is not a cardinal number, which is a contradiction. Thus ℵγ is
a cardinal, which completes the transfinite induction.
For cardinal numbers κ and λ, define
κ + λ = |(κ × {0}) ∪ (λ × {1})|
κ · λ = |κ × λ|
κ^λ = |^λ κ|
In other words, the sum of two cardinals is the cardinality of their “disjoint union”,
the product of two cardinals is the cardinality of their cartesian product and κ
raised to the power λ is the cardinality of the set of functions from λ to κ.
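For finite cardinals, these three definitions can be checked by brute force. The following is a minimal Python sketch, assuming that the natural number n is represented by `range(n)`; the function names `card_sum`, `card_prod` and `card_pow` are illustrative only.

```python
from itertools import product

def card_sum(k, l):
    # |k x {0}  U  l x {1}|: the tags 0 and 1 make the two copies disjoint.
    return len({(a, 0) for a in range(k)} | {(b, 1) for b in range(l)})

def card_prod(k, l):
    # |k x l|: the cardinality of the cartesian product.
    return len(set(product(range(k), range(l))))

def card_pow(k, l):
    # Number of functions from l to k; a function is encoded as the
    # tuple of its l values, each taken from k.
    return len(set(product(range(k), repeat=l)))
```

For instance, `card_pow(2, 3)` counts the 8 functions from 3 to 2, and `card_pow(k, 0)` is 1 for every k because the empty function is the only function with empty domain.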
Even though cardinal numbers are ordinal numbers of a special type, cardinal
arithmetic is very different from ordinal arithmetic. For example, ω^ω is a countable
ordinal if the exponentiation is computed in ordinal arithmetic and is
an uncountable cardinal if the exponentiation is computed in cardinal
arithmetic.
Below we shall list some properties of cardinal arithmetic and prove one of these
properties. The reader is expected to prove the rest. We note that, unlike properties
of ordinal arithmetic, it is not necessary to use transfinite induction to prove the
following.
Lemma 52. Let κ, λ, µ be cardinal numbers. Then
a. κ + 0 = 0 + κ = κ
b. κ + λ = λ + κ
c. (κ + λ) + µ = κ + (λ + µ)
d. κ ≤ µ −→ κ + λ ≤ µ + λ
e. κ · 0 = 0 · κ = 0
f. κ · 1 = 1 · κ = κ
g. κ · λ = λ · κ
h. (κ · λ) · µ = κ · (λ · µ)
i. κ ≤ µ −→ κ · λ ≤ µ · λ
j. κ^0 = 1
k. κ^1 = κ
l. 1^κ = 1
m. (κ^λ)^µ = κ^{λ·µ}
n. κ^λ · κ^µ = κ^{λ+µ}
o. κ^µ · λ^µ = (κ · λ)^µ
p. κ ≤ λ −→ κ^µ ≤ λ^µ
r. µ ≥ 1 ∧ κ ≤ λ −→ µ^κ ≤ µ^λ
Proof. [m.] Since κ^λ = |^λ κ| and λ · µ = |λ × µ|, by Exercise 40, it is sufficient to
prove that
|^µ(^λ κ)| = |^{λ×µ} κ|
Consider the function g : ^µ(^λ κ) → ^{λ×µ} κ given by
(g(f))(θ1, θ2) = (f(θ2))(θ1)
for all f ∈ ^µ(^λ κ) and (θ1, θ2) ∈ λ × µ. In other words, the function g takes a
function f from µ to ^λ κ to the function from λ × µ to κ whose value at (θ1, θ2) is
the value of the function f(θ2) at the point θ1. We claim that g is a bijection.
Let f1, f2 ∈ ^µ(^λ κ) and assume that g(f1) = g(f2). Let θ ∈ µ. Then, for any
θ′ ∈ λ, we have
(f2(θ))(θ′) = (g(f2))(θ′, θ) = (g(f1))(θ′, θ) = (f1(θ))(θ′)
Hence f1(θ) = f2(θ) for all θ ∈ µ, that is, f1 = f2, and so g is one-to-one. To see
that g is onto, let h ∈ ^{λ×µ} κ and define f ∈ ^µ(^λ κ) by (f(θ))(θ′) = h(θ′, θ) for
all θ ∈ µ and θ′ ∈ λ. Then g(f) = h.
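The map g is the familiar "uncurrying" operation, and for small finite κ, λ, µ one can verify by exhaustive enumeration that it is a bijection. The sketch below is only an illustration under that finiteness assumption; functions are encoded as frozensets of (input, output) pairs, and the helper name `functions` is hypothetical.

```python
from itertools import product

def functions(domain, codomain):
    """All functions domain -> codomain, each encoded as a frozenset of (input, output) pairs."""
    domain = list(domain)
    return [frozenset(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]

def g(f, lam, mu):
    """Send f : mu -> (lam -> kappa) to the map (t1, t2) |-> (f(t2))(t1) on lam x mu."""
    outer = dict(f)
    return frozenset(((t1, t2), dict(outer[t2])[t1]) for t1 in lam for t2 in mu)

kappa, lam, mu = range(2), range(2), range(3)
inner = functions(lam, kappa)                            # the set of functions from lam to kappa
outer = functions(mu, inner)                             # functions from mu into that set
target = set(functions(list(product(lam, mu)), kappa))   # functions from lam x mu to kappa
images = {g(f, lam, mu) for f in outer}
```

Here `images` has as many elements as `outer`, so g is one-to-one on this example, and `images` coincides with `target`, so g is onto.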
Week 13
It turns out that the sum or product of two non-zero cardinals, at least one of which
is infinite, is the maximum of these two cardinals. In order to prove this fact, we shall
need the following theorem.
Theorem 38. For all ordinals α, we have |ℵα × ℵα | = ℵα .
Proof. We shall prove this by transfinite induction on α. By Lemma 26, the claim
holds for α = 0. Let α > 0 be an ordinal number and assume that the claim holds
for all ordinals β < α. Consider the relation ≺ on ℵα × ℵα given by
(γ1 , δ1 ) ≺ (γ2 , δ2 ) ←→ max{γ1 , δ1 } < max{γ2 , δ2 } ∨
(max{γ1 , δ1 } = max{γ2 , δ2 } ∧ γ1 < γ2 ) ∨
(max{γ1 , δ1 } = max{γ2 , δ2 } ∧ γ1 = γ2 ∧ δ1 < δ2 )
We claim that ≺ is a strict well-order relation on ℵα × ℵα. We skip the details
of checking that ≺ is a strict linear order relation² and only show that every
non-empty subset of ℵα × ℵα has a minimal element with respect to ≺. Let X ⊆ ℵα × ℵα
be a non-empty set. The set {max{γ, δ} : (γ, δ) ∈ X} is non-empty and hence has
a least element, say the ordinal θ. Then the set
{γ : ∃δ ((γ, δ) ∈ X ∧ max{γ, δ} = θ)}
is non-empty and has a least element, say the ordinal ε. Similarly, the set
{δ : (ε, δ) ∈ X ∧ max{ε, δ} = θ}
is non-empty and has a least element, say the ordinal ξ. We claim that (ε, ξ) is the
least element of X. Given any (γ, δ) ∈ X,
• If max{γ, δ} > θ, then (ε, ξ) ≺ (γ, δ).
• If max{γ, δ} = θ, then, by construction, we have either ε = γ or ε < γ. In
the former case, by construction, we have ξ ≤ δ and hence (ε, ξ) ⪯ (γ, δ).
In the latter case, we have (ε, ξ) ≺ (γ, δ).
Therefore (ε, ξ) ⪯ (γ, δ), which shows that (ε, ξ) is the least element of X with
respect to ≺. Having shown that ≺ is a strict well-order relation, we shall next
bound the number of predecessors of an element (γ, δ). Let (γ, δ) ∈ ℵα × ℵα
and (γ′, δ′) ∈ pred(γ, δ). Set λ = max{γ, δ}. There are two possible cases.
• If max{γ′, δ′} < max{γ, δ}, then (γ′, δ′) ∈ λ × λ ⊆ (λ + 1) × (λ + 1).
• If max{γ′, δ′} = max{γ, δ}, then we have
(γ′, δ′) ∈ (λ + 1) × (λ + 1)
Therefore, the cardinality of pred(γ, δ) ⊆ (λ + 1) × (λ + 1) is at most
|(λ + 1) × (λ + 1)| = | |λ + 1| × |λ + 1| |
If λ is finite, then this cardinality is finite and hence less than ℵα. If λ is
infinite, then |λ + 1| = |λ| and, since |λ| < ℵα, there exists ν < α such that
|λ| = ℵν and hence, by the induction assumption, we have | |λ| × |λ| | = |λ| < ℵα.
Therefore, any element in ℵα × ℵα has fewer than ℵα many predecessors in the
strictly well-ordered set (ℵα × ℵα, ≺). It follows that we have
ot(ℵα × ℵα, ≺) ≤ ℵα. Consequently, |ℵα × ℵα| ≤ |ℵα|.
Finally, observe that we have |ℵα| ≤ |ℵα × ℵα| via the injection ξ ↦ (ξ, 0) and
hence |ℵα × ℵα| = ℵα by the Cantor-Schröder-Bernstein theorem. This completes the
transfinite induction.
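The order ≺ used in this proof compares pairs first by their maximum, then by the first coordinate, then by the second. On a finite square n × n it can be realized as a sort key, which makes the key property of the proof easy to observe: every predecessor of (γ, δ) lies in (λ + 1) × (λ + 1), where λ = max{γ, δ}. The following Python sketch is an illustration on finite ordinals only; the names `canonical_key` and `predecessors` are hypothetical.

```python
def canonical_key(pair):
    """Sort key realizing the order of the proof: max first, then first coordinate, then second."""
    gamma, delta = pair
    return (max(gamma, delta), gamma, delta)

n = 4
# All pairs of n x n, listed in increasing order with respect to the order of the proof.
pairs = sorted(((g, d) for g in range(n) for d in range(n)), key=canonical_key)

def predecessors(pair):
    """All pairs strictly below the given pair."""
    return [p for p in pairs if canonical_key(p) < canonical_key(pair)]
```

The list `pairs` begins (0, 0), (0, 1), (1, 0), (1, 1), (0, 2), so the squares 1 × 1, 2 × 2, 3 × 3 are exhausted one after another. This is what bounds the number of predecessors in the infinite case.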
Corollary 39. Let κ and λ be infinite cardinals. Then κ + λ = κ · λ = max{κ, λ}.
² The reader is expected to prove this as an exercise.
Proof. Let κ and λ be infinite cardinals. Without loss of generality, assume that
κ = ℵα and λ = ℵβ for some ordinals α ≤ β. It follows from the previous theorem
and Lemma 52 that
λ = ℵβ ≤ ℵβ + ℵα ≤ ℵβ + ℵβ = ℵβ · 2 ≤ ℵβ · ℵβ = ℵβ = λ
and that
λ = ℵβ ≤ ℵβ · ℵα ≤ ℵβ · ℵβ = ℵβ = λ
Hence κ + λ = κ · λ = λ = max{κ, λ}.
Exercise 42. Let κ be an infinite cardinal. Prove that κ^n = κ for all n ∈ ω − {0}.
Recall that a countable union of countable sets is countable. It turns out that,
in this fact, one can change the word “countable” to “of size κ” for any infinite
cardinal κ, using Corollary 39.
Lemma 53. Let κ be an infinite cardinal and {X_α}_{α∈κ} be an indexed system of
sets such that |X_α| ≤ κ for all α ∈ κ. Then |⋃_{α∈κ} X_α| ≤ κ.
Proof. Since empty sets do not contribute to the union, we may assume that each
X_α is non-empty. For each α ∈ κ, choose a surjection f_α : κ → X_α. Consider the map
g : κ × κ → ⋃_{α∈κ} X_α given by g(α, β) = f_α(β) for all α, β ∈ κ. It is straightforward
to check that g is a surjection. By Lemma 25, there exists an injection from
⋃_{α∈κ} X_α to κ × κ. Since |κ × κ| = κ, we have that |⋃_{α∈κ} X_α| ≤ κ.
Contrary to cardinal addition and multiplication, determining the result of an
exponentiation in cardinal arithmetic is “difficult” in a certain sense that will be
explained in the next subsection.
One of the most fascinating achievements of the last century in set theory is the
independence of these statements from the axioms of ZFC. It follows from the work
of Kurt Gödel in 1940 and the work of Paul Cohen in 1963 that if the axioms of ZFC
are consistent, then one cannot prove or disprove CH or GCH using the axioms
of ZFC.
One should perhaps mention that Paul Cohen was awarded the Fields Medal in
1966 for his groundbreaking work, which remains the only Fields Medal awarded
for work in mathematical logic.
It should be obvious at this point that the axioms of ZFC are insufficient to
provide an answer to some basic questions about cardinal exponentiation.
Nevertheless, there are some results that can be proven in ZFC regarding cardinal
exponentiation, some of which we are going to learn in the next subsection.
8.5. More on cardinal exponentiation. We begin by restating Cantor’s theo-
rem in the terminology of cardinal numbers.
Theorem 40 (Cantor’s theorem revisited). For all cardinals κ, κ < 2^κ.
An immediate consequence of the results we have proven so far is that the set of
functions from κ to λ has the same cardinality as the power set of κ provided that
λ is not “too big”.
Lemma 54. Let κ and λ be cardinal numbers such that ω ≤ κ and 2 ≤ λ ≤ κ.
Then 2^κ = λ^κ.
Proof. It follows from Cantor’s theorem, Exercise 40 and Corollary 39 that
2^κ ≤ λ^κ ≤ κ^κ ≤ (2^κ)^κ ≤ 2^{κ·κ} = 2^κ
and hence 2^κ = λ^κ.
Next we turn our attention to the following problem. We know that, given an
infinite cardinal κ, we have κ^n = κ for all n ∈ ω − {0} and κ^κ > κ. One can ask the
following question: What is the least cardinal λ such that κ^λ > κ?
This question is “difficult” in the sense that its answer is generally independent of
ZFC for arbitrary κ, as was the case with the continuum hypothesis. Nevertheless,
it happens to be the case that one can provide an upper bound for λ which is
sometimes better than κ. In order to learn this upper bound, we shall need the
concept of cofinality.
Week 14
Proof. Consider the function f : ⋃_{i∈I} λ_i × {i} → ∏_{i∈I} κ_i given by
(f(θ, j))(i) = 0 if i ≠ j
(f(θ, j))(i) = θ + 1 otherwise
for all i, j ∈ I and θ ∈ λ_j. Assume that f(θ, j) = f(θ′, j′) for some j, j′ ∈ I and
θ ∈ λ_j and θ′ ∈ λ_{j′}. Since the sequences f(θ, j) and f(θ′, j′) with the index set
I are non-zero only at the indices j and j′ respectively, we have that j = j′. On
the other hand, since θ + 1 = (f(θ, j))(j) = (f(θ′, j))(j) = θ′ + 1, we have θ = θ′.
Therefore f is one-to-one and hence ∑_{i∈I} λ_i ≤ ∏_{i∈I} κ_i. In order to finish the proof,
it suffices to show that there exists no surjection from ⋃_{i∈I} λ_i × {i} to the cartesian
product ∏_{i∈I} κ_i of the indexed system {κ_i}_{i∈I}.
Let h : ⋃_{i∈I} λ_i × {i} → ∏_{i∈I} κ_i be any function. For each i ∈ I, consider the
function h_i : λ_i → κ_i given by h_i(θ) = (h(θ, i))(i) for all θ ∈ λ_i. Since we have
λ_i < κ_i for all i ∈ I, the function h_i cannot be a surjection and hence, using the
axiom of choice, we can choose δ_i ∈ κ_i such that δ_i ∉ ran(h_i) for each i ∈ I.
Then the sequence (δ_i)_{i∈I} ∈ ∏_{i∈I} κ_i is not in the range of h and hence h is not a
surjection.
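The second half of this proof is a diagonal argument, and on finite cardinals it can be carried out mechanically. The Python sketch below is an illustration only: it assumes the index set I = {0, ..., len(lambdas) − 1}, represents h as a dictionary from pairs (θ, i) to tuples, and picks the least missing value in place of an arbitrary choice. The names `diagonal_witness`, `lambdas` and `kappas` are hypothetical.

```python
def diagonal_witness(h, lambdas, kappas):
    """Return a tuple in the product of the kappa_i that the function h misses.

    h       -- dict sending (theta, i), with theta < lambdas[i], to a tuple in the product
    lambdas -- list of finite cardinals lambda_i
    kappas  -- list of finite cardinals kappa_i with lambda_i < kappa_i
    """
    delta = []
    for i, (lam_i, kap_i) in enumerate(zip(lambdas, kappas)):
        # Range of h_i(theta) = (h(theta, i))(i); it has at most lam_i < kap_i elements.
        ran_hi = {h[(theta, i)][i] for theta in range(lam_i)}
        # Least element of kappa_i outside ran(h_i); it exists because lam_i < kap_i.
        delta.append(min(x for x in range(kap_i) if x not in ran_hi))
    return tuple(delta)

lambdas, kappas = [1, 2], [2, 3]
# An arbitrary, hypothetical h defined on the 3 elements of the disjoint union.
h = {(0, 0): (0, 1), (0, 1): (1, 0), (1, 1): (0, 2)}
delta = diagonal_witness(h, lambdas, kappas)
```

Whatever h is supplied, the returned tuple differs from h(θ, i) at coordinate i, so it cannot be in the range of h.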
Corollary 42. For any infinite cardinal κ, we have κ < κ^{cf(κ)}.
Proof. Let κ be an infinite cardinal and f : cf(κ) → κ be a strictly increasing
function whose range is cofinal. Then, since f(ξ) < κ for all ξ ∈ cf(κ), it follows
from König’s theorem that
κ = sup{f(ξ) : ξ ∈ cf(κ)} ≤ ∑_{ξ∈cf(κ)} |f(ξ)| < ∏_{ξ∈cf(κ)} κ = |^{cf(κ)} κ| = κ^{cf(κ)}
Before concluding this subsection, we shall give a special name to those infinite
cardinals that are equal to their own cofinality since they are of importance.
Definition 63. Let κ be an infinite cardinal. The cardinal κ is said to be a regular
cardinal if cf (κ) = κ and is said to be a singular cardinal if it is not a regular
cardinal.
Using Lemma 55, one can show that the cofinality of an infinite cardinal is always
a regular cardinal.
Exercise 45. Prove that if α is a limit ordinal, then cf (α) is regular.
An important consequence of Lemma 53 is that successor cardinals are regular.
Lemma 56. Successor cardinals are regular.
Proof. Let κ be a successor cardinal, say κ = λ^+ for some infinite cardinal λ. Let
f : cf(κ) → κ be a function whose range is cofinal. Then we have
κ = sup{f(θ) : θ ∈ cf(κ)} = ⋃{f(θ) : θ ∈ cf(κ)}
by the definition of a cofinal subset. Observe that |f(θ)| ≤ λ since κ is a cardinal
number. If it were the case that cf(κ) < κ = λ^+, then Lemma 53 would imply that
κ = |⋃{f(θ) : θ ∈ cf(κ)}| ≤ λ, which is a contradiction. Therefore cf(κ) = κ and
hence κ is regular.
8.6. Cardinal exponentiation under GCH. Even though some of the most basic
questions regarding cardinal exponentiation are independent of ZFC, as was pointed
out before, it turns out that cardinal exponentiation trivializes if one additionally
assumes the generalized continuum hypothesis.
Theorem 44 (ZFC+GCH). Let κ and λ be infinite cardinals. Then
κ^λ = λ^+ if κ ≤ λ,
κ^λ = κ^+ if cf(κ) ≤ λ < κ,
κ^λ = κ if λ < cf(κ).
Observe that, for the last two inequalities, we use GCH and Lemma 53
respectively.
For example, assuming GCH, we have
ℵ_ω^{ℵ_{ω_1}} = (ℵ_{ω_1})^+ = ℵ_{ω_1+1}
ℵ_ω^{ℵ_1} = (ℵ_ω)^+ = ℵ_{ω+1}
ℵ_2^{ℵ_0} = ℵ_2
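Restricted to the cardinals ℵ_m with m a natural number, the case distinction of Theorem 44 becomes a two-line computation on indices. The Python sketch below is an illustration under that restriction only: since ℵ_0 = ω and every ℵ_{m+1} is a successor cardinal, each such cardinal is regular, so the middle case cf(κ) ≤ λ < κ never occurs. The function name `gch_power_index` is hypothetical.

```python
def gch_power_index(m, n):
    """Return k such that aleph_m raised to aleph_n equals aleph_k under GCH.

    Only natural number indices m and n are handled; every such aleph_m is
    regular, so cf(aleph_m) = aleph_m.
    """
    if m <= n:
        # Case kappa <= lambda: the result is the successor of lambda.
        return n + 1
    # Case lambda < cf(kappa) = kappa: the result is kappa itself.
    return m
```

For instance, `gch_power_index(2, 0)` returns 2, in agreement with the last of the three examples. The first two examples involve the singular cardinal ℵ_ω and fall outside the scope of this sketch.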
As a concluding remark for this section, we would like to note that, while most
statements that are not trivial consequences of König’s theorem turn out to be
independent of ZFC, some non-trivial identities regarding cardinal exponentiation are
provable using the axioms of ZFC. For example, the following surprising inequality
was proven by Saharon Shelah.
ℵ_ω^{ℵ_0} ≤ 2^{ℵ_0} + ℵ_{ℵ_4}
² A class C is said to be transitive if it contains the elements of its elements. More formally, if
a class C is defined by the formula ϕ(x), then C is transitive if ∀x (ϕ(x) → ∀y (y ∈ x → ϕ(y))).
10. Coda
This course is intended to serve as an introductory course in axiomatic set theory.
The author sincerely hopes that the reader enjoyed the course and benefited as much
as possible. Those who are struck by the intrinsic beauty of the subject and who
wish to learn deeper and more beautiful set theoretic topics are strongly encouraged
to read the graduate level textbooks of Jech [2] and Kunen [3], both of whom the
author thinks are terrific expositors. The author also believes that the reader who
enjoyed what he or she has learned so far would undoubtedly find more of the
“supreme beauty” that Russell referred to in studying advanced set theory.
“Mathematics, rightly viewed, possesses not only truth, but supreme
beauty - a beauty cold and austere, like that of sculpture, without
appeal to any part of our weaker nature, without the gorgeous
trappings of painting or music, yet sublimely pure, and capable of
a stern perfection such as only the greatest art can show. The true
spirit of delight, the exaltation, the sense of being more than Man,
which is the touchstone of the highest excellence, is to be found in
mathematics as surely as in poetry.”
Bertrand Russell.
References
[1] Karel Hrbacek and Thomas Jech, Introduction to set theory, third ed., Monographs and Text-
books in Pure and Applied Mathematics, vol. 220, Marcel Dekker, Inc., New York, 1999.
MR 1697766
[2] Thomas Jech, Set theory, Springer Monographs in Mathematics, Springer-Verlag, Berlin, 2003,
The third millennium edition, revised and expanded. MR 1940513
[3] Kenneth Kunen, Set theory, Studies in Logic (London), vol. 34, College Publications, London,
2011. MR 2905394