Set Theory

An Introduction to Axiomatic Reasoning

Robert André
To
Jinxia, Camille and Isabelle
“Everything has beauty, but not everyone sees it.”
Confucius

Preface
A set theory textbook can cover a vast amount of material depending on the
mathematical background of the readers it was designed for. Selecting the
material for presentation in this book often came down to deciding how much
detail should be provided when explaining concepts and what constitutes a
reasonable logical gap which can be independently filled in by the reader.
Choice of topics and calibration of the level of communication is based on
the estimated mathematical fluency of the reader. Readers will find that the
initial chapters are presented in a form intended for
students who have little experience in proving mathematical statements.
But as the student progresses through the book, he or she will slowly be
eased into a progressively denser form of mathematical argument and
presentation. That said, the targeted reader is a student
registered in a one-semester, intermediate-level general mathematics course:
a course which would be part of a four-year university program in which the
study of mathematics is a strong component but which is not necessarily
designed to prepare a student for the study of mathematics at the graduate
level. The writer assumes that most students have not necessarily been
exposed to the type of mathematical rigor normally found in most textbooks
in set theory. In the beginning, the pace at which new concepts are
introduced may be considered “leisurely”. The
meaning of each mathematical statement is explained at length and its proof
presented in great detail. The purpose is to capture the interest of
the reader sufficiently to incite him or her to delve further into the subject matter.
As the student progresses through the course, he or she will develop a better
understanding of what constitutes a correct mathematical proof. To help at-
tain this objective, numerous examples of simple straightforward proofs are
presented as models throughout the text.
This writer understands that doing mathematics is a skill and so is something
that must be done and practiced under the supervision of an instructor
who can point out errors or weaknesses in certain mathematical arguments.
Most of us are not born with this skill; it is one which is studied and developed.
Having read the content, most mathematicians would say that the book is
self-contained. This is an accurate statement, mathematically speaking. But
it is not sufficiently nuanced. Students who have previously studied a course
in, say, linear algebra, elementary number theory or an elementary course in
real analysis will fare better than those who have not. This is not because the book
assumes knowledge from those courses. It is because those students will have
already developed some mathematical skills all of which will turn out to be
quite useful in solving certain types of problems. Much of the mathematical
content of this text is inspired by prepared lecture notes sourced from well-known
set theory textbooks such as Hrbacek and Jech’s.

The subject material is subdivided into ten major parts. The first few
are themselves subdivided into “bite-sized” chapters. Smaller sections allow
students to test their understanding on fewer notions at a time. This will allow
the instructor to better diagnose the understanding of those specific points
which challenge the students the most, thus helping to eliminate obstacles
which may slow down their progress later on.
Each chapter is followed by a list of Concept review questions. These
questions highlight for students the main ideas presented in that section and
help them deepen their understanding of these concepts before attempting the
exercises. The answers to all Concept review questions are in the main body of
the text. Attempting to answer these questions will help the student discover
essential notions which are often overlooked when first exposed to these ideas.
Textbook examples will serve as solution models to most of the exercise
questions at the end of each section. Exercise questions are divided into three
groups: A, B and C. The answers to the group A questions normally follow
immediately from definitions and theorem statements presented in the text.
The group B questions require a deeper understanding of the concepts, while
the group C questions allow the students to deduce by themselves a few con-
sequences of theorem statements presented in the text.
The course begins with an informal discussion of primitive concepts and
a presentation of the ZFC-axioms. We then discuss, in this order, operations
on classes and sets, relations on classes and sets, functions, construction of
numbers (beginning with the natural numbers followed by the rational num-
bers and real numbers), infinite sets, cardinal numbers and, finally, ordinal
numbers. It is hoped that the reader will eventually perceive the ordinal num-
bers as a natural logical extension of the natural numbers and as being what
constitutes the “spine of set theory” − as some authors have described them.
Towards the end of the book we present a brief discussion of a few more
advanced topics such as the Well-ordering theorem and Zorn’s lemma (both proven
to be equivalent forms of the Axiom of choice), as well as Martin’s axiom.
Finally, we briefly discuss the more abstract Axiom of regularity and a few
of its implications. A brief and very basic presentation of ordinal arithmetic
properties is then given.
Note that Chapters 1, 2, 13 to 21 together constitute the “meat” of the
book. Students who already possess a substantial amount of mathematical
background may feel they can comfortably skip many chapters without loss of
continuity, since these contain notions with which they have already developed
some familiarity. The following order sequence will allow readers with the
required background to advance more quickly to the meat of the textbook:
Chapter 1 on the topic of the ZFC-axioms can be immediately followed by
Chapters 13 and 14 on the topic of natural numbers, Chapters 18 to 22 on
the topic of infinite sets and cardinal numbers followed by Chapters 26 to 29,
32 and 33, on ordinals, and finally, Chapters 30 and 31 on the axiom of choice
and the axiom of regularity.
Many readers may notice that, in Chapter 27 on ordinals, we provide a
much more detailed introduction to the study of ordinals than what is normally
found in some set theory texts. Often, if too many details about ordinals are
left for the readers to discover for themselves in the form of exercises this
tends to leave some doubts in their mind about whether they understand this
topic adequately, thus leading them to avoid using them in certain examples
where they might prove to be useful.
The short chapter titled Martin’s Axiom and Appendix A towards the end
of the book are presented as a matter of interest and are intended for readers who
are well versed in the subjects of topological spaces and real analysis.

As we all know, any textbook, when initially published, will contain some
errors, some typographical, others in spelling or in formatting and, what is
even more worrisome, some mathematical in nature. Critical or alert readers
of the text can help eliminate the most glaring mistakes by communicating
suggestions and comments directly to the author. This writer carries an im-
mense debt of gratitude to the scores of students and readers whose numerous
questions and enquiries on a preliminary online version of this text have helped
to clarify my thoughts, weed out typos, awkward explanations and occasional
anacoluthons.

Robert André
University of Waterloo, Ontario
Contents

I Axioms and classes
1 Classes, sets and axioms
2 Constructing classes and sets

II Class operations
3 Operations on classes and sets
4 Cartesian products

III Relations
5 Relations on a class or set
6 Equivalence relations and order relations
7 Partitions induced by equivalence relations
8 Equivalence classes and quotient sets

IV Functions
9 Functions: A set-theoretic definition
10 Operations on functions
11 Images and preimages of sets
12 Equivalence relations induced by functions

V From sets to numbers
13 Natural numbers
14 The natural numbers as a well-ordered set
15 Arithmetic of the natural numbers
16 The integers Z and the rationals Q
17 Real numbers: “Dedekind cuts are us!”

VI Infinite sets
18 Infinite sets versus finite sets
19 Countable and uncountable sets
20 Equipotence as an equivalence relation
21 The Schröder-Bernstein theorem

VII Cardinal numbers
22 Introduction to cardinal numbers
23 Addition and multiplication in C
24 Exponentiation of cardinal numbers
25 On sets of cardinality c

VIII Ordinal numbers
26 More on well-ordered sets
27 Ordinals: definition and properties
28 Properties of the class of ordinals
29 Cardinal numbers: “Initial ordinals are us!”

IX Choice, regularity and Martin’s axiom
30 Axiom of choice
31 Axiom of regularity
32 Cumulative hierarchy
33 Martin’s axiom

X Ordinal arithmetic
34 Ordinal addition
35 Ordinal multiplication and exponentiation

XI Appendix
Appendix A: Boolean algebras and Martin’s axiom

Bibliography
Index
Part I

Axioms and classes

1 / Classes, sets and axioms

Abstract. In this section we discuss axiomatic systems in mathematics.
We explain the notions of “primitive concepts” and “axioms”. We
declare as primitive concepts of set theory the words “class”, “set” and
“belong to”. These will be the only primitive concepts in our system.
We then present and briefly discuss the fundamental Zermelo-Fraenkel
axioms of set theory.

1.1 Contradictory statements.

When expressed in a mathematical context, the word “statement” is
viewed in a specific way. A mathematical statement is a declaration which
can be characterized as being either true or false. By this we mean that
if a statement is not false, then it must be true, and vice-versa. There
exists a predetermined set of rigorous logical rules which can be used to
help determine the true or false value of such statements. Whether one
does mathematics as an expert or as a beginner, these elementary rules of
logic must always be respected. An argument which does not respect one
of these rules is said to be “illogical”. Then, by combining various mathe-
matical statements whose true or false values are known, we can logically
determine the true or false value of other mathematical statements. A rule
of logic looks something like this:
If Q is true whenever P is true, and T is true whenever Q is
true, then T is true whenever P is true.
Such rules can be symbolically represented in a way that avoids the use
of words. For example, the above statement is represented as:
[(P ⇒ Q) ∧ (Q ⇒ T )] ⇒ [P ⇒ T ]
In this way, we can construct an elaborate system of mathematical statements,
each of which has been determined to be true or false. The sequence of logical
steps which determines the true or false value of a statement is
called a “mathematical proof”. Most readers have previously been exposed
to this particular way of thinking in various courses such as calculus and
linear algebra. Basic rules of logic are normally not taught explicitly in
such courses. It is however expected that a student who has sufficiently
been exposed to rigorous mathematical arguments and has often enough
attempted to formulate correct mathematical proofs − sometimes more
successfully than others − progressively learns to distinguish valid logical
arguments from ones that are flawed. Like learning to speak any language,
formulating correct mathematical arguments is a skill that is developed
with practice.
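
The rule of logic displayed above can be checked by brute force, since P, Q and T can each take only two values. The following Python sketch is offered only as an illustration; the helper names `implies` and `syllogism` are ours and do not appear elsewhere in the text. It evaluates [(P ⇒ Q) ∧ (Q ⇒ T)] ⇒ [P ⇒ T] on all eight assignments of truth values.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: a => b is false only when a is true and b is false."""
    return (not a) or b


def syllogism(p: bool, q: bool, t: bool) -> bool:
    """Truth value of [(P => Q) and (Q => T)] => [P => T] for one assignment."""
    return implies(implies(p, q) and implies(q, t), implies(p, t))


# One row per assignment of truth values to (P, Q, T), with the rule's value.
rows = [(p, q, t, syllogism(p, q, t)) for p, q, t in product([True, False], repeat=3)]

# The rule is a valid rule of logic exactly when it is true in every row.
is_tautology = all(value for *_, value in rows)
```

A statement that is true for every assignment of truth values to its components is what logicians call a tautology; a truth table of this kind is one elementary way to confirm that a proposed rule of logic is sound.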
If the truth of a mathematical statement is logically deduced by combining
statements previously known to be true, then clearly there had to be, at
some point, a set of statements whose true-false values were not derived
from previous statements. That is, the process must start somewhere, with
some initial statements whose true-false values were unknown. Such initial
statements are not “deduced” but simply declared to be true based on
nothing more than “common sense”. For example, one may declare the
statement: “Distinct parallel lines cannot intersect” as being self-evident
or being so “elementary” that it cannot be proved. Once we give ourselves
a set 𝒜 of self-evident statements and a list of rules that can be used
to determine the true-false value of other statements, then the universe
U_𝒜 of all possible true statements derived from 𝒜 is determined. This
determined universe U_𝒜 of statements constitutes a mathematical theory
which is ours to explore, or discover, one statement at a time.
But what if the choice of our original set 𝒜 of statements was not a wise
one? “How can it not be a wise one if based on common sense?” one might
ask. Imagine this scenario:
Say that from a set 𝒜 of initial self-evident statements, a state-
ment A has been shown to be true; given that A is true,
it is deduced that statement B must be true, and from B we
deduce that P is true. On the other hand, it is shown that, given
A, statement D must be true, and from D we show that P
is false. Hence, from 𝒜 we have deduced that the statement P
is both true and false.
A statement which has been determined to be both true and false is re-
ferred to as a “contradictory statement” or a paradox. If a contradictory
statement logically flows from what was assumed to be a paradox-free sys-
tem, then the foundation of this system, as well as the methods used to de-
termine the true or false value of statements, must be carefully scrutinized
to determine the incorrect assumption(s) which allowed this “renegade”
statement to emerge. In this book we will explore a specific mathematical
system. It is hoped that in the process, the reader will be able to appreci-
ate the skill and ingenuity required for the construction of this impressive
mathematical structure. This system is called the “theory of sets” or more
simply “set theory”.

1.2 Sets.
Most people are familiar with the notion of a set and its elements. “Sets”
are viewed as collections of things, while “elements” are viewed as those
things which belong to sets. Normally, a set is defined in terms of certain
properties shared by its elements. These properties must be well described,
with no ambiguities, so that it is always clear whether a given element
belongs to a given set or not. Being a “set” can also be an element property;
so sets whose elements are sets exist, for example, the set S of all teams
in a particular hockey league. The elements of the set S are sets of hockey
players.
Let us consider a few examples of entities we may consider to be sets.
a) Let T denote the set of all straight lines in the Cartesian plane. For
example, the set A = {(x, y) : y = 2x + 3} belongs to T , while the set
B = {1, 2, 3} does not. We easily see that T is not an element of T
since T is not a line in the Cartesian plane.
b) Let U denote the set of all sets which contain infinitely many elements.
This set is well-defined since we can easily distinguish those elements
that belong to U from those that do not belong to U . For example, the
set {−2, 0, 100} is not an element of U since it contains only three
elements. We ask the question: Is the set U an element of U ? To help
answer this question, witness the sets

A0 = {0, 1, 2, 3, · · ·}
A1 = {−1, 0, 1, 2, 3, · · ·}
A2 = {−2, −1, 0, 1, 2, 3, · · ·}
⋮
An = {−n, −(n − 1), −(n − 2), · · · , −2, −1, 0, 1, 2, 3, · · ·}
⋮

Every element in the set {A0, A1, A2, A3, . . .} belongs to U. Hence, U
contains infinitely many elements. We conclude that U is an element of U.
c) Define S to be the set of all “sets that are not elements of themselves”.
For example, the set T described in example (a) is in S since T is not
an element of itself. The set U described in example (b) does not belong
to S since U is an element of itself.

We now look more closely at the three sets described above. Other than
the fact that it is an extremely large set, there is nothing extravagant
about the set T described in example (a). On the other hand, the set
U discussed in example (b) also appears to be well-defined, since a set
which is infinite can easily be distinguished from one that is not. But
the fact that this set is an element of itself makes one wonder whether
we should allow sets to satisfy this property. On the other hand, it is
difficult to express what could possibly go wrong with such sets. Let
us now look closely at the “set” described in example (c): a set
belongs to S if and only if it does not belong to itself. We wonder whether,
like the U in example (b), the set S is an element of itself. But S cannot
belong to S since no element in S can belong to itself. So S is not an
element of itself. Then, by definition of S, S would be an element of S.
This doesn’t make sense. There is obviously a problem with the “set”
described in example (c). Even if it was fairly easy to detect the con-
tradiction which follows from example (c), specifically determining the
source of this contradiction can be more difficult.
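
The reasoning about example (c) can be restated as a search for a truth value. The statement “S is an element of S” must be either true or false, and the definition of S demands that it be true exactly when it is false. The Python sketch below is only an illustration of that observation; the function name `consistent_answers` is ours.

```python
def consistent_answers() -> list[bool]:
    """Candidate truth values for "S is an element of S" that respect the definition of S.

    By definition of S, "S in S" holds exactly when "S not in S" holds, so a
    candidate value b is acceptable only if b equals (not b).
    """
    return [b for b in (True, False) if b == (not b)]
```

The function returns an empty list: neither “true” nor “false” can be assigned to the statement without violating the definition of S. The sketch confirms the contradiction but, like the discussion above, does not by itself locate the erroneous assumption.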
Example (c) nicely illustrates what is called a paradox. As we mentioned
earlier, a paradox consists of two contradictory statements both of which
logically flow from what was thought to be a well-understood and clearly
defined concept. In this case, to say that the statement “S is an element
of S” is true means that the statement “S is an element of S” is false, and
vice-versa. For many, this is just a play on words; it may seem harmless
enough. But for mathematicians, this was not a trivial matter. The discovery
of this particular paradox by Bertrand Russell¹ was equivalent to the
uncovering of a malicious virus lying dormant in the heart of the operating
system of a microcomputer. It cannot be ignored. The way sets are de-
fined along with their universally accepted properties form the foundation
of modern mathematics. Paradoxes such as this undermine the confidence
we have in mathematics, a discipline which prides itself on its clarity of
thought, a discipline which sees itself as a seeker of irrefutable truths. Once
this flaw was exposed, it was important to understand why we did not see
this before. Mathematicians also wondered whether any other cracks in
the foundation of mathematics existed, requiring immediate attention.
Paradoxes are usually the result of some “erroneous assumptions”. In ex-
ample (c), we are making an erroneous assumption of some kind. But it
is not obvious what the erroneous assumption is. Should we allow our-
selves to talk of a “set that contains itself as an element”? Or maybe the
erroneous assumption is to say that there exists a “set that contains all
sets”. How do we decide which assumptions are acceptable ones and which
are not? This problem motivated mathematicians to determine as clearly
as possible what are acceptable properties for “sets”. Since most of mod-
ern mathematics can be derived from the notion of “sets”, this question
was labeled “High priority” in the early 1900s. It is in this period that
particular attention was given to developing a reliable axiomatic system
which could serve as a foundation of modern mathematics. The modern
set-theoretic axiomatic system which evolved as a result of these efforts is
the main topic of this book.

1.3 Axiomatic systems.

An axiomatic system is normally set up by first declaring some primitive
concepts or undefined notions. These primitive concepts carry no intrinsic
meaning, although the symbols or words used to represent them often convey
some intuitive idea in the mind of the reader. That is, the words which
represent this undefined notion are such that the user will more easily
understand the properties which will be prescribed for this concept. Specific
rules and properties which declare how these concepts relate to each other
are then formulated; these rules and properties must allow mathematical
constructs which are viewed as being important in our mathematical system.
These properties and rules are called the “axioms”.

¹ Bertrand Russell (1872-1970) was a British mathematician, logician, philosopher, and
public intellectual. He had influence on mathematics, logic, set theory, and various areas of
analytic philosophy. (Wikipedia).

Euclid’s axiomatic system. Euclid provided us with a useful model for con-
structing an axiomatic system. He is the first person known to apply the
axiomatic method to study the field of geometry. In his axiomatic system,
the words “line” and “point” are primitive concepts. He instinctively rec-
ognized that some undefined terms would be required. He then described
properties of “lines” and “points”. These properties are his axioms. These
axioms are statements whose true-false values are not logically deduced
from statements previously shown to be true. They are simply assumed to
be true. The important point here is that he explicitly states what these
“assumed to be true properties” are. The proposed primitive concepts and
these axioms, when gathered together, constitute the foundation of the
“Euclidean axiomatic universe” more commonly referred to as Euclidean
geometry. Euclid justified his choice of axioms by saying that these point
and line properties were “self-evident”. Note that Euclid’s primitive con-
cepts and axioms differ entirely from the set-theoretic axioms we will be
studying. But the axiomatic method he used to study geometry has served
as a valuable model for others who wanted to develop different mathemat-
ical systems.
Euclid then used deductive reasoning to show that various geometric state-
ments were true. The assumptions made were limited to
1) the stated axioms, along with
2) other statements previously shown to be true.
In this way, Euclidean geometry came to be. In spite of Euclid’s best ef-
forts, careful scrutiny of his work revealed that Euclid erred in certain
ways. He unknowingly made assumptions which were neither stated as ax-
ioms nor previously proven to be true. In 1899, the mathematician David
Hilbert revised the Euclidean axiomatic system by proposing three prim-
itive concepts: point, straight line, plane. He also proposed 21 axioms. In
1902, one axiom was shown to be redundant and so was eliminated from
the list. These primitive concepts along with 20 axioms are now widely
accepted as forming a firm logical footing for Euclidean geometry.

1.4 Zermelo-Fraenkel axiomatic system.

The axiomatic system of set theory as we know it today was in large part
developed in the period of 1908 to 1922 by Ernst Zermelo and Abraham
Fraenkel. Mathematicians T. Skolem and John von Neumann made slight
modifications to it a few years later. Its axioms are now referred
to as the ZF-axioms, where “ZF” stands for the “Zermelo-Fraenkel axiomatic
system”.
In what follows, we will strive to develop an intuitive understanding of
what axiomatic set theory is all about. We will avoid logical formalism,
a treatment of axiomatic set theory based entirely on symbols normally
reserved for more advanced courses. Our approach to set theory is referred
to as “naive set theory” in the sense that we use ordinary language (words,
sentences) to better understand what the basic axioms actually mean. We
will see how these axioms are used to construct well-known sets such as
the natural numbers and the real numbers. Finally, we will see how and
why the chosen axioms serve as a widely accepted foundation of modern
mathematics.

Primitive concepts and notation. What makes a set of primitive concepts
and axioms suitable for a particular theory? Most will agree that these
must satisfy the following conditions:
1) There must be as few undefined terms and axioms as possible.
2) Normally, an axiom should not be logically deducible from other ax-
ioms. (If one is deducible from the others, this should be explicitly
expressed.)
3) We should be able to prove from these axioms and concepts most
of what we consider to be interesting or useful mathematics. Often,
parts of the logical universe which is determined from the axioms
are preconceived, in the sense that axioms are introduced so that
certain mathematical statements will turn out to be true. This, of
course, can turn out to be a “dangerous game”. But there has to be
some motivation for choosing one statement as an axiom rather than
choosing another.
4) These axioms must not lead to any paradoxes. An axiomatic system
which contains contradictions is either modified to one in which these
contradictions do not occur, or some axioms are simply discarded and
replaced with others if needed.

The primitive concepts in our theory. There will be three undefined notions
in our axiomatic system:
“class”
“set”
“belongs to”
The expression “belongs to” is often stated as “is a member of” or “is an
element of”; it is usually abbreviated by the symbol ∈.
All objects in our theory are classes. There is nothing else. We will soon
distinguish special kinds of classes. Once we have discussed a few axioms,
we will define a set as being a special kind of class. A class which is not
a set will be called a proper class. Some axioms will help us distinguish
between those classes which are sets and those which are proper classes.
Classes will be represented either by lower-case or upper-case letters. So
we can write “Let x and A be two classes”.

The expression
x∈A
is to be read as “the class x belongs to the class A”, or “the class x is in the
class A” or “x is an element of A”. However, no class will be representable
by a lower-case letter, x, unless it is known that x ∈ B for some class B.
Those classes which can be represented by a lower-case letter, say x, will
be given a special name:
If a class A is such that A ∈ B for some class B, then we will
refer to the class A as being an “element”.
Elements are still classes; but they are special classes, since they “belong
to” another class. So an element can be represented by either a lower-case
or an upper-case letter. For example, if we write x ∈ y or A ∈ B this means
that x, y and A are elements, while B may or may not be an element.
Why is “element” not an undefined notion? The reader may find it surprising
that the object “element” is not expressed as an undefined notion. After
all, we are accustomed to distinguishing elements from sets. Introducing
a fourth undefined term was eventually seen as being superfluous. This
became clear when it was realized that we often view sets as being “elements”
of other sets.² Witness:
· Points (a, b) in the Cartesian plane are actually two-element sets
{a, b} of real numbers stated in a particular order.
· Rational numbers a/b can be described as the set of all two-element
sets {a, b} of integers in a particular order where b is not 0.
· Irrational numbers can be viewed as infinite sequences of rational
numbers converging to a non-rational number, again a set.
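
The phrase “stated in a particular order” deserves a concrete illustration, since a set by itself carries no order: {a, b} and {b, a} are the same set. One standard way of encoding order using sets alone is the Kuratowski pair, (a, b) = {{a}, {a, b}}. The Python sketch below assumes this encoding purely for illustration (it is not claimed to be the definition adopted later in this text), and the names `pair`, `first` and `second` are ours. Python's `frozenset` stands in for a finite set.

```python
def pair(a, b) -> frozenset:
    """Kuratowski encoding of the ordered pair (a, b) as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def first(p: frozenset):
    """Recover the first coordinate: the one element common to every member of p."""
    (a,) = frozenset.intersection(*p)
    return a


def second(p: frozenset):
    """Recover the second coordinate of p."""
    a = first(p)
    rest = frozenset.union(*p) - {a}
    # When both coordinates coincide, p = {{a}} and nothing is left over.
    return next(iter(rest)) if rest else a
```

With this encoding, `pair(1, 2)` and `pair(2, 1)` are different sets even though `frozenset({1, 2})` and `frozenset({2, 1})` are equal, so the order of the coordinates is recorded without introducing any new undefined notion.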

1.5 The axioms of set theory.

We now give a “preview” of what the set-theoretic axioms are, keeping in
mind that a full understanding of what they mean will only be developed
when we actually invoke each of these in various situations where they are
required. These are called the ZF-axioms. The reader will see how
surprisingly few are required. At this point, much of this will look like gibberish,
but as we plod through the subject matter, we will step-by-step develop
a better understanding of what they mean.

² There exists a branch of set theory in which mathematical entities which are neither
sets nor classes are considered. These are referred to as “urelements”. We will not consider
these in this text.

Primitive concepts: “class”, “set” and “belongs to”.

Axiom A1 (Axiom of extent): For the classes x, A and B, [A = B] ⇔

[x ∈ A ⇔ x ∈ B].
Axiom A2 (Axiom of class construction): Let P (x) designate a state-
ment about x which can be expressed entirely in terms of the sym-
bols ∈, ∨, ∧, ¬, ⇒, ∀, brackets and variables x, y, z, . . . , A, B, . . . Then
there exists a class C which consists of all the elements x which satisfy
P (x).
Axiom A3 (Axiom of pair): If A and B are sets, then the doubleton {A,
B} is a set.
Axiom A4 (Axiom of subsets): If S is a set and φ is a formula describing
a particular property, then the class of all sets in S which satisfy this
property φ is a set. More succinctly, every subclass of a set of sets is
a set.3
Axiom A5 (Axiom of power set): If A is a set, then the power set P(A)
is a set.
S
Axiom A6 (Axiom of union): If A is a set of sets, then C∈A C is a set.
Axiom A7 (Axiom of replacement): Let A be a set. Let φ(x, y) be a
formula which associates to each element x of A an element y in such
a way that whenever both φ(x, y) and φ(x, z) hold true, y = z. Then
there exists a set B which contains all elements y such that φ(x, y)
holds true for some x ∈ A.4
Axiom A8 (Axiom of infinity): There exists a non-empty class A called
a set that satisfies the condition: “X ∈ A” ⇒ “X ∪ {X} ∈ A”. (A set
satisfying this condition is called a successor set or an inductive set.)
3 This axiom is more often expressed as the Axiom of comprehension, Axiom of Speci-

fication or Axiom of separation. It is in fact many axioms (which, when viewed together,
are referred to as a schema) each differing only by the formula φ it refers to. So to be more
precise, given a formula φ in set theory language, we would refer to it as axiom A4(φ) rather
than A4.
4 This axiom is more often expressed as the Replacement axiom schema since it is in fact

many axioms each differing only by the formula φ it refers to. So to be more precise, given
a formula φ in set theory language, we would refer to it as axiom A7(φ) rather than A7. It
essentially allows us to confirm that if the domain A of a function f is a set, then the image f [A]
is a set.

Axiom A9 (Axiom of regularity): Every non-empty set A contains an
element x whose intersection with A is empty.
Another “special” axiom is usually stated separately from the other nine
axioms above. It is viewed by many as being different in nature. It was
also, at least initially, quite controversial. It is called the Axiom of choice.

Axiom of choice: For every set 𝒜 of non-empty sets, there is a function
f which associates to every set A in 𝒜 an element a ∈ A.

In this text we will refer to this set of nine axioms viewed together with
the Axiom of choice as “ZF +Choice” or simply by ZFC.5
The Axiom of choice. The controversy surrounding the Axiom of choice
requires some explanation. The Axiom of choice is an axiom which was
added after most of the ZF-axioms were widely accepted as a foundation
for modern mathematics. It is so subtle a concept that many early math-
ematicians unknowingly invoked it in their proofs. That is, it was invoked
without stating it explicitly as an assumption. Some mathematicians pub-
licly questioned this assumption, asking openly whether the word “obvi-
ously” was sufficient justification for using it. These questions could not
be ignored. Numerous attempts at proving this axiom from the ZF-axioms
failed. In 1963, it was finally proven that neither the Axiom of choice, nor
its negation, can be proven from the ZF-axioms (provided the ZF-axioms
are themselves consistent). This implied that we are free to state it as an
axiom, along with the other ZF-axioms, without fear of producing a
contradiction that was not already present in ZF. A lengthy debate on whether this
statement should be included with the other “fundamental” ZF-axioms
followed. Some described it as “the most interesting and, in spite of its
late appearance, the most discussed axiom of mathematics, second only
to Euclid’s axiom of parallels which was introduced more than two thou-
sand years ago” (Fraenkel, Bar-Hillel and Levy 1973). Eventually, it was
felt that “not accepting” the Axiom of choice closes the door to many
fundamentally important results of modern mathematics. One could say
that the Axiom of choice had already been used so extensively that it was
deeply ingrained in the modern mathematical fabric; we were “addicted”
to the Axiom of choice, so to speak.
Even though proofs that invoke the Axiom of choice are widely viewed as
being acceptable, it is often felt that a correct proof that does not invoke
the Axiom of choice is preferable to a simpler proof which invokes it. This
5 Note that some of the ZF axioms listed have been shown to follow from the others. So

some set theory texts may omit one or more of these from their formal list of ZFC axioms.
Since most of these axioms are non-controversial we will adopt, for this text, this list of 10
axioms as the ZFC-axioms. The reader should simply be alerted to the fact that the list of
the ZFC axioms may vary from text to text.
is because it assumes the existence of something that can neither be seen
nor constructed. It is viewed somewhat pejoratively by some as the “magic
wand” that magically opens closed doors. For this reason, when proving
a statement, it is customary to point out explicitly the steps where the
Axiom of choice is invoked. Actually, there is a general consensus on one
point: The Axiom of choice should not be listed with the ZF -axioms; it
should be set apart in a category of its own. This is why we refer to this
group of axioms as “ZF +Choice”, or simply ZFC. One can view this as
some sort of compromise.

1.6 A few more words on these axioms.

Even though the words class and set are undefined, the axioms will allow
us to perceive them (once we can decode them) as “collections of objects”.
It is still too early to extract the full meaning of the axioms stated above.
But the reader will feel more at ease if we interpret at least certain aspects
of these immediately.
Axiom A1,
“For the classes x, A and B, [A = B] ⇔ [x ∈ A ⇔ x ∈ B].”
is an axiom which states that “a class is defined by its elements”. If two
classes are equal, then they have the same elements. Conversely, two classes
which have the same elements are the same class.
We now examine more closely Axioms A2, A8, A9 and the Axiom of choice,
in no particular order.
Axiom A2 (Class construction):
“Let P (x) designate a statement about x which can be expressed
entirely in terms of the symbols ∈, ∨, ∧, ¬, ⇒, ∀, brackets and
variables x, y, z, . . . , A, B, . . . Then there exists a class C which
consists of all the elements x which satisfy P (x).”
states that we can use well-defined properties which can be expressed by
the given symbols to construct classes. For example, A = {X : u ∈ X} and
B = {X : X = X} are classes constructed from two different properties.
For the class A, P (X) is the property “u ∈ X”, while for the class B, P (X)
is the property “X = X”. (By Axiom A1, it is the elements, not the wording
of the property, that determine whether two classes are equal.)
Axiom A8 (Infinity):
“There exists a non-empty class A called a set that satisfies the
condition:

“X ∈ A” ⇒ “X ∪ {X} ∈ A”.”

says that there exists a class called a set which is infinite in size. (This
axiom also guarantees that at least one class called a set exists.) It essentially
allows us to define the “natural numbers” 0, 1, 2, 3, . . .
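The step X ↦ X ∪ {X} in Axiom A8 can be tried out on finite sets. The sketch below is only an illustration under stated assumptions: it models sets as Python `frozenset` values, starts from the empty set (introduced formally in Section 2), and identifies 0 with ∅, which is the usual convention but is not stated in the axiom itself. The names `successor` and `first_naturals` are hypothetical helpers.

```python
EMPTY = frozenset()  # assumption: the empty set stands for the number 0


def successor(x: frozenset) -> frozenset:
    """Return x ∪ {x}, the operation appearing in the Axiom of infinity."""
    return x | frozenset([x])


def first_naturals(count: int) -> list:
    """Return the first `count` sets obtained by repeatedly taking successors."""
    numbers = []
    current = EMPTY
    for _ in range(count):
        numbers.append(current)
        current = successor(current)
    return numbers


naturals = first_naturals(4)
# In this model the set standing for n has exactly n elements.
sizes = [len(n) for n in naturals]
```

No finite computation reaches the infinite set promised by the axiom; the code only shows how each new set is produced from the previous one.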

Axiom of choice:
“For every set 𝒜 of non-empty sets there is a function f which
associates to every set A in 𝒜 an element a ∈ A.”
says that given a set of non-empty sets, there exists a certain type of func-
tion. But it does not show how to construct or find such a function.
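For contrast, here is a case where no axiom is needed. When the family is finite and its members are sets of integers, an explicit rule is available ("take the least element"), so the function can be written down directly. This is a minimal sketch under those assumptions; `choice_function` is a hypothetical name. The Axiom of choice matters precisely when no such explicit rule is available.

```python
def choice_function(family):
    """Map each non-empty set in `family` to one of its own elements."""
    chosen = {}
    for member in family:
        if not member:
            raise ValueError("every set in the family must be non-empty")
        # Explicit rule (an assumption of this sketch): pick the least integer.
        chosen[member] = min(member)
    return chosen


family = {frozenset({3, 1, 2}), frozenset({10}), frozenset({7, 5})}
f = choice_function(family)
```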
Note that Axioms A1 and A2 refer only to classes (we call these “class
axioms”) while all the other axioms (Axioms A3 to A9 and the Axiom of
choice) are “set axioms”. The set axioms determine what kind of objects
exist in the universe of all sets.
Axioms A2 to A7 are “constructive” axioms: A2 gives us a way to
construct a class by referring to a property, while Axioms A3
to A7 provide methods to construct new sets from ones that are known
to exist.
Axiom A9, the Axiom of regularity, is sometimes referred to as the “useless”
axiom. Others don’t consider it a basic axiom since most
of mathematics which is based on set theory does not require it. It will
be invoked only in the last chapter of this book. Although it is not obvi-
ous, just from reading it, this axiom actually states that “those non-empty
classes which don’t have a least element are not sets”. It is in fact an axiom
which does not allow certain types of sets to exist in the universe of sets.
It is of an exclusionary nature. The other axioms (except for Axiom A1)
increase the number of sets in the universe of sets.
Axioms A4 (Subsets) and A7 (Replacement) each represent many axioms.
We refer to such axioms as schema.6 Axiom A4 speaks of a set S and a
particular formula φ describing a property. For each property we have a
different Axiom. Given φ, we could say the “Axiom A4 for φ”. Axiom A7
speaks of a set A and a class B of sets along with a particular formula
φ(x, y) which plays the role of a function (normally referred to as a func-
tional). For each functional, φ(x, y), we have a different axiom.

1.7 Some things we may immediately wonder about.

As one may suspect, when formally expressed, axioms do not contain
words. The ZFC-axioms are expressed using symbolism of first-order logic.
For example, when formally stated, an axiom may look like this:
∀x∀y∃z(x ∈ z ∧ y ∈ z)
This is the Axiom of pair. We use the words and sentences to develop an
intuitive understanding of what this code means.
A second point one may wonder about: Are the ZF-axioms consistent?
That is, do we know for sure that the ZF-axioms, as stated, will never yield
some contradiction? If one day we actually encounter a contradiction that
flows from these axioms, then we can answer: “No, the ZF-axioms are not
consistent, since we have revealed a contradiction which flows from these
axioms!” If such a contradiction is discovered, we must tinker some more
with the set-theoretic axioms to correct the flaw.
6 A dictionary describes a schema as “an underlying organizational pattern or structure”.
But as long as we do not encounter such a paradox, the answer to this question
is: “We don’t know for sure whether the ZF-axioms are consistent.” It
has been shown (Gödel’s second incompleteness theorem) that if the ZF-axioms
are consistent, then their consistency cannot be proved using only the
ZF-axioms. It is the “nature of the beast”,
so to speak. Since new forms of mathematics are uncovered every day, it
is possible that next week, in 100 years or in 1000 years someone will
discover that ZF is inconsistent. “Set theory” is, as the words indicate,
just a theory. By their very nature, all theories evolve to explain newly
discovered previously unknown facts. The ZF -set-theoretic system is no
different. As a foundation of modern mathematics, the ZF -set-theoretic
system seems to serve its purpose well; it is the best theory we have today,
even though some day we may discover significant ways of improving it.

Concepts review:

1. What is Russell’s paradox?

2. Why do paradoxes occur?
3. What are three primitive concepts of set theory?
4. What is the difference between a class, a set and a proper class?
5. When is a class called an element?
6. Which classes can be represented by a lower-case letter?
7. What does ZFC stand for?
8. How many axioms belong to ZFC?

2 / Constructing classes and sets

Abstract. In this section we define the symbols “=” and “ ⊆” and discuss
Axioms A1 to A5. We show how Axioms A1 and A2 are used to construct
classes. Axioms A3 to A5 are used as tools to construct sets. We distin-
guish between classes, sets and elements by exhibiting a class which is not
an element and a class which is not a set. We also show that all sets are
elements. We introduce the concept of “power set of a set” as a set con-
structing tool.

2.1 Basic statements, definitions and notation.

To discuss the axioms and some of their immediate consequences, we first
define a few words and symbols that will allow us to communicate certain
ideas more efficiently. We first confirm that every class is equal to itself. This
is not an axiom, since it is an immediate consequence of axiom A1.

Theorem 2.1 For any class C, C = C.

This follows from axiom A1: Since x ∈ C ⇔ x ∈ C holds for every x, we have
C = C.

If the statement “A = B” is false, then we will write A ≠ B.

Definition 2.2 If A and B are classes or sets, we define A ⊆ B to mean that
every element of A is an element of B. That is,
A ⊆ B if and only if x ∈ A ⇒ x ∈ B
If A ⊆ B we will say that A is a subclass (subset) of B. If A ⊆ B and A ≠ B
we will say that A is a proper subclass (proper subset) of B; in this case we
write A ⊂ B. So “A ⊂ B” is a shorter way of saying “A ⊆ B but A ≠ B”.1
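On small finite sets, the definitions of ⊆ and ⊂ can be checked mechanically. The following is a sketch only: it models sets as Python `frozenset` values, and the function names `is_subclass` and `is_proper_subclass` are hypothetical. It also illustrates that ⊂ and ∈ are different relations.

```python
def is_subclass(a: frozenset, b: frozenset) -> bool:
    """A ⊆ B: every element of A is an element of B."""
    return all(x in b for x in a)


def is_proper_subclass(a: frozenset, b: frozenset) -> bool:
    """A ⊂ B: A ⊆ B but A ≠ B."""
    return is_subclass(a, b) and a != b


A = frozenset({1})
B = frozenset({1, 2})

a_inside_b = is_proper_subclass(A, B)   # A ⊂ B holds
a_element_of_b = A in B                 # but A ∈ B does not hold here
```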

We restate the Axiom of construction:

Axiom A2: If P (x) is a property of an element x which can be
expressed entirely in terms of the symbols ∈, ∨, ∧, ¬, ⇒, ∀, brackets
and variables x, y, z, ..., A, B, ..., then there exists a class C
which consists of all the elements x which satisfy P (x).
1 Do not confuse ⊂ with ∈. When we say that “the class A belongs to the class B”, we

mean that A ∈ B, not A ⊂ B.

This axiom refers to properties of elements which can be expressed in terms
of logical symbols. Many students may have used some or all of these symbols
before; for completeness, we explicitly state how these symbols should be
interpreted:
∈ is given the meaning “is an element of”
∨ is given the meaning “or”
∧ is given the meaning “and”
¬ is given the meaning “not”
⇒ is given the meaning “implies”
∃ is given the meaning “there exists”
∀ is given the meaning “for all”

The Axiom of construction allows us to construct a class by first defining
a property P and then gathering together all the elements possessing this
property to form the class

C = {A : A possesses the property P } = {x : P (x)}

This class is succinctly expressed as: {x : P (x)}. The brackets { } refer to a

“class”. The symbol, P (x), means “x possesses property P ”. The lower-case
symbol, x, refers to a “class which is an element of another class”. Elements
are the only classes that can be denoted by a lower-case letter.
A word of caution: Axiom A2 allows us to gather together all the “elements”
that possess a property P , not all “classes” that possess a property P . For if
the word “element” is replaced with the word “class”, then we easily obtain
a paradox. Witness the class, C, defined as follows:

C = {A : A is a class and A satisfies the property ¬(A ∈ A)} = {A : A ∉ A}

which leads to Russell’s paradox, since the statement C ∉ C ⇔ C ∈ C would
be both true and absurd.
Also, in expressions such as

“the class {C : C = C}”

it is understood that C must be an element. The expression {x : x = x},

means the same thing except it emphasizes that the classes it refers to are
elements.

2.2 Properties of classes.

Axiom A1 allows classes to have the properties normally attributed to those
things we call “collections of objects”. After all, this is what we would like
classes and sets to be. The axioms are developed with a preconceived idea
of what the only objects (classes and sets) in our set-theoretic universe are.
The statements in the following theorem are all logical consequences of these
axioms. They are easily seen to hold true, but it is good practice to explicitly
write out the proofs.

Theorem 2.3 If C, D, and E are classes (sets), then:

a) C = D ⇒ D = C.
b) C = D and D = E ⇒ C = E.
c) C ⊆ D and D ⊆ C ⇒ C = D.
d) C ⊆ D and D ⊆ E ⇒ C ⊆ E.

Proof:
a) Suppose C = D. Then by axiom A1, x ∈ C ⇒ x ∈ D and x ∈ D ⇒
x ∈ C. Reordering, x ∈ D ⇒ x ∈ C and x ∈ C ⇒ x ∈ D. Hence, by axiom A1,
D = C.
Proofs of (b) to (d) are left as an exercise.

We said that all objects in our set-theoretic universe are classes. Some of
those classes are elements provided these belong to another class. It is nor-
mal to ask whether there exists at least one class which is not an element.
We answer this question in the following theorem.

Theorem 2.4 There exists a class which is not an element.

Proof:
Let C be the class
C = {x : x ∉ x}2
By Axiom A2, the class C is well-defined. Suppose C is an element. If
C ∈ C, then C satisfies the defining property of C, so C ∉ C. If C ∉ C,
then C is an element satisfying the defining property of C, so C ∈ C.
Either case yields a contradiction. Then C is not an element, as required.

2 C = {A : A is an element and A satisfies ¬(A ∈ A)}.
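The logical core of this argument can also be written in a proof assistant. The Lean 4 sketch below is an illustration under stated assumptions, not a formalization of the book's axioms: `α` is an arbitrary type standing in for the elements, `mem x c` stands in for “x ∈ c”, and the theorem name is hypothetical. It states that no element c can satisfy x ∈ c ⇔ x ∉ x for all elements x.

```lean
-- Assumption: `α` models the elements and `mem x c` models "x ∈ c".
theorem no_russell_element {α : Type} (mem : α → α → Prop) :
    ¬ ∃ c : α, ∀ x : α, mem x c ↔ ¬ mem x x :=
  fun ⟨c, hc⟩ =>
    -- If c ∈ c, the defining property gives c ∉ c, which is absurd.
    have hn : ¬ mem c c := fun hcc => (hc c).mp hcc hcc
    -- But c ∉ c gives c ∈ c by the defining property, contradicting hn.
    hn ((hc c).mpr hn)
```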

2.3 The universal class and the empty class.

The class
U = {x : x = x}
is easily seen to be the class that contains precisely all elements. By Theorem
2.1, “Every element is equal to itself”. The class U is called
the universal class. In the proof of theorem 2.4 we constructed a class C
which does not belong to U . Hence, U does not contain all classes.
At this point, we have discussed only two of the primitive concepts: class
and “belongs to”. From these we have defined “element”. The reader may
have noticed that the word “set” has remained on the sidelines. We have
not yet discussed this primitive concept, other than witnessing the very large
set given to us “for free” by the Axiom of infinity. Axiom A2 will allow us
to construct a much smaller set. We start with the following definition.

Definition 2.5 Axiom A2 authorizes us to call C = {x : x ≠ x} a class.

Since we have proven above that every element is equal to itself, then this
class contains no elements. We will call the class with no elements the empty
class and denote it by ∅.

Theorem 2.6 For any class C, ∅ ⊆ C.

Proof:
Let C be a class. To show that ∅ ⊆ C it suffices to show, by definition of
“⊆”, that x ∈ ∅ ⇒ x ∈ C. Since ∅ contains no elements, the hypothesis
x ∈ ∅ is never satisfied, so the implication holds for every x;
then ∅ ⊆ C, as required.3

2.4 Sets which are derived from other sets.

Axiom A1, “[A = B] ⇔ [x ∈ A ⇔ x ∈ B]”, is a statement about the classes
x, A and B. Since every set is a class, axioms referring to classes
also refer to sets. Axiom A2 is a statement which shows how to construct
classes by referring to some property, P (x); it does not refer to properties
specific to sets only. On the other hand, Axioms A3 and A4 are set-specific:
Axiom A3 (Axiom of pair): If A and B are sets, then the dou-
bleton class C = {A, B} is a set.
Axiom A4 (Axiom of subsets): Every subclass of a set is a set.

3 Alternatively, the statement x ∉ C ⇒ x ∉ ∅ is the logical equivalent of x ∈ ∅ ⇒ x ∈ C.
If x is not an element of C, then x is not an element of ∅ since ∅ has no elements.

These two axioms alone will allow us to construct sets from those classes
known to be “sets”. Axiom A8 (Axiom of infinity) guarantees that at least
one class called “set” exists: It contains the words “...there exists a class A
called a set that...”. We need not search any further.
Axiom A3 refers to the set C = {A, B} as a “doubleton”. We will use the
word doubleton when referring to two sets A and B viewed together to form
a collection {A, B} of sets. For convenience, we will not put any restrictions
on how the set B relates to A. For example, we can refer to the set {A, A}
as a doubleton even though it contains only one element.
The statements in the following theorem follow immediately from the Ax-
ioms A3 and A4.

Theorem 2.7 Let S be a set. Then:

a) ∅ ⊆ S and so ∅ is a set.
b) The set S is an element. Hence, “All sets are elements”.

Proof:
a) We are given that S is a set. We are required to prove that ∅ is a set.
We can directly apply axiom A4: Since ∅ = {x ∈ S : x ≠ x} and, by
hypothesis, S is a set then, by A4, ∅ is a set as required.
b) We are given that S is a set. We are required to prove that S is an
element. By Axiom A3 (Axiom of pair), for any set S, {S, S} is a set.
Since S ∈ {S, S}, for all sets S, then, by definition, S is an element, as
required.

In the proof above, we discussed the set {S, S} which contains the set S as
an element. Since {S, S} = {S} (Prove this!) this is a one-element class. We
call such sets singleton sets. The reader should note that according to our
definition of doubleton above, every singleton set {A} can be expressed as a
doubleton {A, A}.
We can now verify that the universal class U = {x : x = x} is not a set:
Suppose U is a set. See that the class C = {x ∈ U : x ∉ x} is a subclass of
U . Since we assumed U to be a set, by the Axiom of subsets, C must also
be a set. But we showed in Theorem 2.4 that the class C is not an element
and so cannot be a set. We have a contradiction. Therefore the universal
class is a proper class, as claimed.

2.5 Examples of sets which are non-empty.

At this point we have only exhibited one set, the empty set ∅. In the fol-
lowing examples we use some axioms to construct other sets.
(a) The set ∅ contains no elements. By Axiom A3, the class C = {∅, ∅} =
{∅} is a set which contains exactly one element (the element ∅). Observe
that ∅ ≠ {∅} since {∅} contains one element, while ∅ does not.
(b) Let A = ∅ and B = {∅}. By Axiom A3, C = {∅, {∅}} is a set which
contains exactly two elements (the element ∅ and the one element set
{∅}).
(c) Let a, b, c be three sets. Then, by repeated applications of Axiom A3
together with the Axiom of union (A6), {a, {a}, {{a}}, {a, b, c}} is a set;
it has four elements whenever the four sets listed are distinct.
(d) Let c be an element (class). Then A = {c} is a class with only one element
since A = {x : x = c} and so, by Axiom A2, A = {c} is a class. If c
is known to be a set, A = {c} = {c, c}, so we can conclude that A is a set.

We see, from the above rules, that we can “theoretically”, construct all finite
sets so that they each appear in the form of various orders and combinations
of the symbols,
{, }, ∅
Of course, in practice, there would be no point in actually doing that.
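For very small cases the idea can still be illustrated. The sketch below models sets as Python `frozenset` values and writes each one out using only the symbols {, } and ∅. It is an illustration under assumptions: `pair` and `render` are hypothetical helper names, and the ordering of elements in the printed form is an arbitrary choice made for readability (shorter strings first).

```python
EMPTY = frozenset()


def pair(a: frozenset, b: frozenset) -> frozenset:
    """Doubleton {a, b} as in Axiom A3; pair(a, a) is the singleton {a}."""
    return frozenset([a, b])


def render(s: frozenset) -> str:
    """Write a finite set built from ∅ using only the symbols {, } and ∅."""
    if not s:
        return "∅"
    # Assumption: shorter strings are listed first, purely for readability.
    parts = sorted((render(e) for e in s), key=lambda t: (len(t), t))
    return "{" + ", ".join(parts) + "}"


one = pair(EMPTY, EMPTY)   # the set {∅} of example (a)
two = pair(EMPTY, one)     # the set {∅, {∅}} of example (b)
```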

We can use a symbol of our choice, say A, to represent an “infinite set” guar-
anteed to exist by the Axiom of infinity. We can then use the construction
rules to construct other sets with A. So nothing prevents us from considering
a set described as, say, D = {A, B, C}.

2.6 The class of all sets.

Recall that the word set is a primitive concept (along with the word class).
We define the property symbol “set(x)” to mean “x is a set”. Then by Axiom
A2, S = {x : set(x)} = “all elements which are sets” forms a class. Since
every set is an element (by Theorem 2.7), we can say S = {x : set(x)} is a
subclass of the universal class, U = {x : x = x}. We call S the class of all
sets. From this we observe that:
Given any property P ,

S_P = {x : set(x) ∧ P (x)} = {x : x is a set and x satisfies P }

is a class.
Axiom A2 said that {x : P (x)} = “all elements which satisfy property P ”
is a class. Now it makes sense to talk about the “class of sets satisfying a
property P ”.
Note that the class, S , of all sets is a proper class. To see this, suppose S
were a set. Let D = {x ∈ S : x ∉ x}. The class D cannot be a set, for if it
were, then as previously shown, we would quickly obtain the contradiction,
D ∈ D and D ∉ D. See that D is a subclass of S . Since S was assumed to
be a set, by the Axiom of subsets, D must be a set. We have a contradiction.
The source of the contradiction is our assumption that S is a set. So S is
a proper class.

2.7 Power sets.
Axiom A3 allows us to construct new sets from known ones by forming dou-
bleton sets, while Axiom A4 allows us to construct sets by taking subclasses
of sets and calling them subsets. Axiom A5 will allow us to construct, from
a known set A, what seems to be a larger set, P(A). It is called the power
set of A. We define “power set”.

Definition 2.8 If A is a set, then we define the power set of A as being the
class P(A) of all subsets of A. It can be described as follows:
P(A) = {X : X ⊆ A}

We verify the following facts:

– By Axiom A4, “X ⊆ A” ⇒ “X is a set” so all elements of P(A) are
sets.
– By Axiom A2, P(A) is a class. (The fact that “all sets are elements”
is proved above).
It is not possible to prove, from the axioms introduced so far, that P(A) is a
set. So if we want it to be a
set, we must postulate this fact. Axiom A5 does precisely that. We will later
see why this axiom plays a fundamental role in the mathematical universe
we are exploring today.
Axiom A5: If A is a set, then the power set, P(A), is a set.
In formal language, the Axiom of power set actually reads as follows:
∀A∃P ∀B[B ∈ P ⇐⇒ B ⊆ A]
where A is specified to be a set. This expression includes the definition of the
“power set P(A) of a set A”. The interior of the square brackets specifies
that “the elements B of the power set P(A) are precisely the subsets B of
A”. The sequence of symbols “∀A∃P ” instructs the reader that given any
set A, the class P(A) exists as a set. Declaring that the power set of any set
is a set adds many sets to our universe of sets. There is, of course, some risk
in doing this, since we may be allowing sets in our universe of sets which
are so strange that we will not be sure whether we want them there or not.
On the other hand, we will see that the Power set axiom is extremely useful
for constructing sets we need. For example, the Power set axiom allows us
to construct Cartesian products.
Axioms A6, A7, A8 and A9 have not yet been discussed. We will study these
only when we require them later on.
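For a finite set, the power set of Definition 2.8 can be listed exhaustively. The following sketch assumes sets are modeled as Python `frozenset` values; `power_set` is a hypothetical helper, and the strings "x" and "y" merely stand in for two distinct elements.

```python
from itertools import combinations


def power_set(a: frozenset) -> frozenset:
    """Return the set of all subsets of the finite set `a`."""
    elements = list(a)
    return frozenset(
        frozenset(combo)
        for size in range(len(elements) + 1)
        for combo in combinations(elements, size)
    )


EMPTY = frozenset()
p_empty = power_set(EMPTY)                   # the only subset of ∅ is ∅ itself
doubleton = frozenset({"x", "y"})            # "x", "y": placeholder elements
p_doubleton = power_set(doubleton)
```

Every member B of `p_doubleton` satisfies B ⊆ {x, y}, and every subset of {x, y} appears in it, which is the condition inside the square brackets of the formal statement.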

2.8 Examples.
We provide a few exercises which allow us to practice notions related to
power sets.
1) Power sets. List the elements of the power set of
a) The empty set, ∅.
b) A singleton set.
c) A doubleton set.
Solution:

a) The power set of the empty set: P(∅) = {X : X ⊆ ∅}. If X ∈ P(∅),

then X ⊆ ∅. Thus, X = ∅. So

P(∅) = {∅}

b) The power set of a singleton set {x}: For an element x, P({x}) =
{∅, {x}}. Note that, in general, x ⊈ {x}: the only element of {x} is x
itself, so the elements of x need not belong to {x}.
c) The power set of a doubleton set {x, y}: For the elements x and y,
P({x, y}) = {∅, {x}, {y}, {x, y}}.

2) Consider the three-element class C = {x, {x, y}, {z}}. Determine which
of the following statements are true and which are false.
a) We can write x ∈ C.
b) We can write x ⊆ C.
c) We can write {x} ∈ P(C).
d) We can write {{z}} ∈ P(C).
e) We can write z ∈ P(C).
f) We can write {z} ⊆ C.
Solution:
a) True. We can write x ∈ C since x is explicitly listed as an
element of C.
b) False. We cannot write x ⊆ C since this does not satisfy the definition
of ⊆. To write x ⊆ C is to say that every element in x is an element
in C. But the contents of x are unknown. So there is no basis to state
that x ⊆ C.
c) True. We can write {x} ∈ P(C) since {x} ⊆ C: every element in {x}
is also an element of C.
d) True. We can write {{z}} ∈ P(C) since {{z}} ⊆ C. The only element
in {{z}} is in C.
e) False. We cannot write z ∈ P(C) since this would mean z ⊆ C, and the
contents of z are unknown.
f) False. We cannot write {z} ⊆ C since {z} contains only one
element, z, and z is not listed as an element of C (it is {z}, not z,
that belongs to C).
It was shown above that “all sets satisfying a property P ” is a class. In
the following example, we say something similar. But there are subtle
differences in the statement. See if you can detect these differences.
3) Let A be a set and P denote some property. Show that the class

S = {x : (x ⊆ A) ∧ P (x)}

is a set.
Solution:
We are given that A is a set and S = {x : (x ⊆ A) ∧ P (x)}. We are
required to show that S is a set. Since A is a set, and, for every x ∈ S,
x ⊆ A then every x ∈ S is a set (by Axiom A4). By Axiom A5, P(A) is
a set. Since the class S ⊆ P(A), then S is a set (by Axiom A4). This is
what we were required to prove.
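The structure of this solution (S is carved out of P(A) by a property) can be mirrored on a finite example. The sketch below is an illustration only: sets are modeled as Python `frozenset` values, `power_set` and `subsets_satisfying` are hypothetical helpers, and the property "x has exactly two elements" is an arbitrary choice for P.

```python
from itertools import combinations


def power_set(a: frozenset) -> frozenset:
    """Return the set of all subsets of the finite set `a`."""
    elements = list(a)
    return frozenset(
        frozenset(combo)
        for size in range(len(elements) + 1)
        for combo in combinations(elements, size)
    )


def subsets_satisfying(a: frozenset, prop) -> frozenset:
    """Return {x : x ⊆ a and prop(x)}, obtained by filtering P(a)."""
    return frozenset(x for x in power_set(a) if prop(x))


A = frozenset({1, 2, 3})
# Assumption: P(x) is "x has exactly two elements", chosen only as an example.
S = subsets_satisfying(A, lambda x: len(x) == 2)
```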

2.9 Can a set be an element of itself?

If A and B are two classes, we have said that the expression, A ∈ B, is to
be interpreted as “A belongs to B” or “A is an element of B”. For example,
if A = {∅, {∅}} and B = { ∅, {∅}, {∅, {∅}} }, then A ∈ B.
We wonder if there exists a set a such that a ∈ a. Intuitively, one would
hope that no such set can exist.
To prove that no such set exists in the ZFC-universe of sets, we require the
Axiom of regularity. This axiom is rarely invoked. This may be why some
people refer to it as the “useless axiom”. It may be because they just don’t
care whether there are sets which are elements of themselves, or not (while
still managing to sleep well at night). It states:
“Every non-empty set A contains an element x whose intersection
with A is empty.”
In the chapter reserved for the Axiom of regularity (in Chapter 31 on page
351) we will prove that the set

T = {a : a is a set and a ∈ a}

is empty.

Until then, we will avoid dealing with sets, a, such that, a ∈ a.

Concepts review:
1. What does it mean to say “the class A is equal to the class B”,
A = B?
2. What does it mean to say “the class A is contained in the class B”,
A ⊆ B?
3. What does it mean to say the class A is a proper subclass of the
class B?
4. How should we read the expression C = {x : P (x)}?
5. Is it true that ∅ ∉ ∅? Why?
6. State a class that is not an element.
7. What is the universal class?
8. What is the empty class ∅?
9. Is a set an element?
10. Given a set A, what is the power set, P(A), of A? How do we know
that P(A) is a set?
11. If B is a set and A ⊆ B, what can we say about A? Why?
12. Why is ∅ a set?

EXERCISES

A. 1. Suppose A is a proper class and A ⊆ B. Show that B is a proper class.

2. Prove the following.
a) {c, d, e} = {c, d} if and only if e = c or e = d.
b) c = {d} ⇒ d ∈ c
3. If x = {u, v}, y = {v, {w}} and z = {x, y} write out explicitly the elements
of the following classes: P(x), P(y), P(P(x)) and P(z)
4. Prove parts (b), (c) and (d) of Theorem 2.3.
5. For sets S and T show that:
a) S ⊆ T if and only if P(S) ⊆ P(T )
b) S = T if and only if P(S) = P(T )
6. Show all the elements in the set P(P(∅)).
7. Show all the elements in the set P(P(P(∅))).
8. Show that [S ⊂ T ] ∧ [T ⊆ V ] ⇒ (S ⊂ V ).
9. Show that [S ⊆ T ] ∧ [T ⊂ V ] ⇒ (S ⊂ V ).
10. Show that c = d if and only if {c} = {d}.
11. Show that c ∈ d if and only if {c} ⊆ d.
B. 12. Show that (S ⊆ ∅) ⇒ (S = ∅).

13. Using the empty set, ∅, construct a set containing seven elements.
14. If A = {∅, ∅, ∅, {∅, ∅}} and B = {∅, {∅}}, show that A = B.

C. 15. Show that the statement “For any set S, P(S) ⊆ S” is a false statement.
16. Suppose U and V are sets. Determine whether the statement P(U ) ∪
P(V ) = P(U ∪ V ) is true or false. Justify your answer.
17. Suppose U and V are sets. Determine whether the statement P(U ) ∩
P(V ) = P(U ∩ V ) is true or false. Justify your answer.
Part II

Class operations

3 / Operations on classes and sets.

Abstract. In this section we define operations on classes that will allow
us to better see how classes relate to each other. The concepts of unions,
intersections and complements of classes and sets are defined. Axiom A6
(Axiom of union) is discussed. This axiom is used to determine when the
union of sets is a set. We show how Venn diagrams can serve as a guide
when interpreting operations on sets. Some basic laws for the union, in-
tersection and complements of classes and sets are presented in the form
of theorems. De Morgan’s laws are also stated.

3.1 Unions, intersections and complements of classes.

Forming unions and taking intersections of classes are methods for con-
structing new classes and sets from old ones. We begin by defining these
formally.

Definition 3.1 Let A and B be classes (sets). We define the union, A ∪ B,
of the class A and the class B as

A ∪ B = {x : (x ∈ A) ∨ (x ∈ B)}

That is, x ∈ A ∪ B if and only if x ∈ A or x ∈ B. If 𝒜 is a non-empty class
of classes then we define the union of all classes in 𝒜 as

⋃_{C∈𝒜} C = {x : x ∈ C for some C ∈ 𝒜}

That is, x ∈ ⋃_{C∈𝒜} C if and only if there exists C ∈ 𝒜 such that x ∈ C.

Definition 3.2 Let A and B be two classes (sets). We define the intersection,
A ∩ B, of the class A and the class B as

A ∩ B = {x : (x ∈ A) ∧ (x ∈ B)}

That is, x ∈ A ∩ B if and only if x ∈ A and x ∈ B. If 𝒜 is a non-empty class
of classes, then we define the intersection of all classes in 𝒜 as

⋂_{C∈𝒜} C = {x : x ∈ C ∀ C ∈ 𝒜}

That is, x ∈ ⋂_{C∈𝒜} C if and only if x ∈ C for every class C in 𝒜.
Observe that
a) if 𝒜 = {D, E} then D ∪ E = ⋃_{C∈𝒜} C.
b) if 𝒜 = {D, E} then D ∩ E = ⋂_{C∈𝒜} C.
Also see that the axiom A2, Axiom of construction, guarantees that both
⋃_{C∈𝒜} C and ⋂_{C∈𝒜} C are classes. If the class 𝒜 contains no elements, then
the union of all elements in 𝒜 is ∅, since no x belongs to some C ∈ 𝒜; in
this text the intersection of all elements in an empty class 𝒜 is also taken
to be ∅.

Definition 3.3 We will say that two classes (sets) C and D are disjoint if
the two classes have no elements in common. That is, the classes C and D are
disjoint if and only if C ∩ D = ∅.

Definition 3.4 The complement, C′, of a class (set) C consists of all elements
which do not belong to C. That is, if C is a class, then

C′ = {x : x ∉ C}

Hence, x ∈ C′ if and only if x ∉ C. Again, the axiom of construction (A2)
guarantees that C′ is a class. Given two classes (sets) C and D, the difference
C − D, of C and D, is defined as

C − D = C ∩ D′

This is also a class. The symmetric difference, C △ D, is defined as (the class)

C △ D = (C − D) ∪ (D − C)
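The operations defined so far can be tried on finite sets. The sketch below is only an illustration: sets are modeled as Python `frozenset` values and all function names are hypothetical. The complement is left out on purpose, since it is taken relative to the class of all elements and has no finite counterpart; the difference is therefore computed directly as "in C but not in D" rather than as an intersection with a complement.

```python
from functools import reduce


def big_union(family):
    """Union over a family: x belongs to it iff x ∈ C for some C in the family."""
    return frozenset(x for c in family for x in c)


def big_intersection(family):
    """Intersection over a non-empty family: x ∈ C for every C in the family."""
    members = list(family)
    if not members:
        raise ValueError("the family must be non-empty")
    return reduce(lambda acc, c: acc & c, members)


def difference(c, d):
    """C − D: the elements of C which do not belong to D."""
    return frozenset(x for x in c if x not in d)


def symmetric_difference(c, d):
    """(C − D) ∪ (D − C)."""
    return difference(c, d) | difference(d, c)


D = frozenset({1, 2, 3})
E = frozenset({3, 4})
family = {D, E}
```

With the family {D, E}, the general union and intersection reduce to D ∪ E and D ∩ E.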

3.2 Unions and intersections referring specifically to sets.

Observe that ⋂_{C∈𝒜} C ⊆ C for all C ∈ 𝒜 since, by definition, every element
in ⋂_{C∈𝒜} C belongs to every C ∈ 𝒜.
− If 𝒜 is a non-empty class of sets then ⋂_{C∈𝒜} C is a subclass of every set
C in 𝒜. So, by axiom A4, ⋂_{C∈𝒜} C is a set.
− Hence, we can say, in a general way, that the intersection of arbitrarily
large non-empty classes 𝒜 of sets is always a set.
S
If A is a non-empty class of sets, is C∈A C necessarily a set? What about
the special case where A is a set of sets? None of the Axioms A1 to A5,
nor any previously proven statement resulting from these, help to answer
this question. It will turn out to be useful if we can answer “yes” to the
second question. However, we have been unable to prove this. Then, we
need an axiom that will postulate this to be true. Axiom A6, Axiom of
union, declares when a union of sets is a set. We restate it here:
Axiom 6: If A is a non-empty set of sets then ⋃_{C∈A} C is a set.

Thus, Axiom A6 says “The union of all sets in a set of sets is a set”. We
should also be clear about what Axiom A6 does not say: “The union of all
sets in a class of sets is a set.” If we make the mistake of assuming this to
be true, it will lead to a contradiction, as the following example shows.
Example: Suppose A is the class A = {x : x is a set and x ∉ x}. Show
that D = ⋃_{x∈A} x is not a set.
Solution:
What we are given: A = {x : x is a set and x ∉ x}.
What we are required to show: D = ⋃_{x∈A} x is not a set.
Suppose D = ⋃_{x∈A} x is a set. Then by Axiom A5, P(D) is also a set.
We know that for every x ∈ A, x ⊆ D = ⋃_{x∈A} x. (Make sure you see
why. If not look at A ⊆ A ∪ B.) So, for every x ∈ A, x ∈ P(D). Hence,
A ⊆ P(D).
Since A is a subclass of a set, then, by Axiom A4, A is a set.
We now argue as in Theorem 2.4: Since A is a set, “A ∉ A” ⇒ “A ∈ A”
and “A ∈ A” ⇒ “A ∉ A”. This is a contradiction. So D = ⋃_{x∈A} x
cannot be a set. This is what we were required to show.

Then the statement “The union of all sets in a class of sets is a set” is not
a true statement, in general. Even though, in set theory, both class and set
intuitively represent a “collection of objects”, freely substituting the word
set with the word class may lead to some nasty consequences.

3.3 Venn diagrams.

We often use Venn diagrams as a tool to visualize how sets relate to one
another. Venn diagrams should not be substitutes for proofs of statements; but
they are helpful when used to guide our intuition. Figures 1 through 3 show
Venn diagrams representing some of the relations defined above.

FIGURE 1
Intersection and union of two sets

FIGURE 2
Difference and symmetric difference

FIGURE 3
Intersection distributing over a union.

3.4 Basic laws for operations on classes and sets.

We now prove a few fundamental properties of sets. The proofs of less com-
mon properties will be left as exercises.

Theorem 3.5 Let C and D be classes (sets). Then

a) C ⊆ C ∪ D
b) C ∩ D ⊆ C

Proof:
a) Let x ∈ C. It suffices to show that x ∈ C ∪ D.

x ∈ C ⇒ x ∈ {x : x ∈ C or x ∈ D}
⇒ x ∈ C ∪ D

b) The proof is left as an exercise.


Theorem 3.6 Let C and D be classes (sets). Then

a) C ∪ (C ∩ D) = C
b) C ∩ (C ∪ D) = C

Proof:
a) What we are given: C and D are classes (sets).
What we are required to show: C ∪ (C ∩ D) = C

C ⊆ C ∪ (C ∩ D) (By Theorem 3.5 a))

C ∩ D ⊆ (C ∩ D) ∪ C (By Theorem 3.5 a))

x ∈ (C ∩ D) ∪ C ⇒ x ∈ C ∩ D or x ∈ C
⇒ x ∈ C or x ∈ C (Since Theorem 3.5 b) says C ∩ D ⊆ C)
⇒ x ∈ C
Hence (C ∩ D) ∪ C ⊆ C.

So C ⊆ C ∪ (C ∩ D) and (C ∩ D) ∪ C ⊆ C imply C ∪ (C ∩ D) = C (Def. of
equal classes).

b) The proof is left as an exercise.

Theorem 3.7 Let C be a class (a set). Then (C′)′ = C.

Proof:
x ∈ (C′)′ ⇒ x ∉ C′ (By Definition 3.4)
⇒ x ∈ C (By Definition 3.4)
Hence (C′)′ ⊆ C.

x ∈ C ⇒ x ∉ C′ (By Definition 3.4)
⇒ x ∈ (C′)′ (By Definition 3.4)
Hence C ⊆ (C′)′.

(C′)′ ⊆ C and C ⊆ (C′)′ ⇒ (C′)′ = C (By Definition of =)

Theorem 3.8 (De Morgan’s laws) Let C and D be classes (sets). Then

a) (C ∪ D)′ = C′ ∩ D′
b) (C ∩ D)′ = C′ ∪ D′

Proof:
a) Given: C and D are classes (sets).

x ∈ (C ∪ D)′ ⇒ x ∉ C ∪ D (By Definition 3.4)
⇒ x ∉ C and x ∉ D (For if x ∈ C or x ∈ D, then x ∈ C ∪ D)
⇒ x ∈ C′ ∩ D′
Hence (C ∪ D)′ ⊆ C′ ∩ D′.

Next, x ∈ C′ ∩ D′ ⇒ x ∈ C′ and x ∈ D′
⇒ x ∉ C and x ∉ D (By Definition 3.4)
⇒ x ∉ C ∪ D (For if x ∈ C ∪ D, then x ∈ C or x ∈ D)
⇒ x ∈ (C ∪ D)′ (By Definition 3.4)
Hence C′ ∩ D′ ⊆ (C ∪ D)′.

(C ∪ D)′ ⊆ C′ ∩ D′ and C′ ∩ D′ ⊆ (C ∪ D)′ ⇒ (C ∪ D)′ = C′ ∩ D′

b) The proof is left as an exercise.

Theorem 3.9 Let C, D and E be classes (sets). Then

a) C ∪ D = D ∪ C and C ∩ D = D ∩ C (Commutative laws)

b) C ∪ C = C and C ∩ C = C (Idempotent laws)

c) C ∪ (D ∪ E) = (C ∪ D) ∪ E and C ∩ (D ∩ E) = (C ∩ D) ∩ E (Associative
laws)
d) C ∪ (D ∩ E) = (C ∪ D) ∩ (C ∪ E) and C ∩ (D ∪ E) = (C ∩ D) ∪ (C ∩ E)
(Distributive laws)

Proof: The proofs of (a) to (d) are left as exercises.

Theorem 3.10 Let A be a class and U denote the class of all elements.
a) U ∪ A = U
b) A ∩ U = A
c) U′ = ∅
d) ∅′ = U
e) A ∪ A′ = U

Proof:
a) By Theorem 3.5 a), U ⊆ U ∪ A. If x ∈ U ∪ A, then x ∈ A or x ∈ U.
In either case x is an element and so x ∈ U. Thus, U ∪ A ⊆ U. Hence,
U ∪ A = U.
Parts (b) to (e) are left as an exercise.

3.5 Generalized distributive laws and De Morgan’s laws.

The distributive law and De Morgan’s laws generalize to arbitrarily large
unions and intersections.

Theorem 3.11 Let A be a non-empty class (set).

a) (⋃_{C∈A} C)′ = ⋂_{C∈A} C′
b) (⋂_{C∈A} C)′ = ⋃_{C∈A} C′

Proof:
a) x ∈ (⋃_{C∈A} C)′ ⇔ x ∉ ⋃_{C∈A} C
⇔ x ∉ C for all C ∈ A
⇔ x ∈ C′ for all C ∈ A
⇔ x ∈ ⋂_{C∈A} C′

Part (b) is left as an exercise.

Theorem 3.12 Let D be a class and A be a non-empty class (set) of classes.

a) D ∩ (⋃_{C∈A} C) = ⋃_{C∈A} (D ∩ C)
b) D ∪ (⋂_{C∈A} C) = ⋂_{C∈A} (D ∪ C)

Proof:
a) x ∈ D ∩ (⋃_{C∈A} C) ⇔ x ∈ D and x ∈ ⋃_{C∈A} C
⇔ x ∈ D and x ∈ C for some C ∈ A
⇔ x ∈ D ∩ C for some C ∈ A
⇔ x ∈ ⋃_{C∈A} (D ∩ C)

Part (b) is left as an exercise.
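
The generalized laws of Theorems 3.11 and 3.12 can be checked on a small finite family of sets. The following Python sketch is only a sanity check on one example, not a proof. The finite universe U, the sample family, and the helper names comp, big_union and big_intersection are all assumptions chosen for illustration.

```python
from functools import reduce

U = frozenset(range(8))  # finite stand-in for the universal class
family = [frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({2, 5})]
D = frozenset({1, 5, 7})


def comp(X):
    """Complement relative to the finite universe U."""
    return U - X


def big_union(sets):
    """Union of all sets in an iterable of sets."""
    return reduce(lambda a, b: a | b, sets, frozenset())


def big_intersection(sets):
    """Intersection of all sets in a non-empty iterable of sets."""
    sets = list(sets)
    return reduce(lambda a, b: a & b, sets[1:], sets[0])


# Theorem 3.11 a) and b): generalized De Morgan's laws.
de_morgan_a = comp(big_union(family)) == big_intersection(comp(C) for C in family)
de_morgan_b = comp(big_intersection(family)) == big_union(comp(C) for C in family)

# Theorem 3.12 a) and b): generalized distributive laws.
distrib_a = (D & big_union(family)) == big_union(D & C for C in family)
distrib_b = (D | big_intersection(family)) == big_intersection(D | C for C in family)

print(de_morgan_a, de_morgan_b, distrib_a, distrib_b)  # True True True True
```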

Concepts review:
1. If A is a class of classes, how should we interpret the class ⋃_{C∈A} C?
2. If A is a class of classes, how should we interpret the class ⋂_{C∈A} C?
How do we know that this is indeed a class?
3. What does it mean to say that two classes A and B are disjoint?
4. What is the complement, C′, of a class C?
5. What is the difference, C − D, of the two classes C and D? What
is the symmetric difference C △ D?
6. If A is a non-empty set of sets, how do we know that the union
⋃_{C∈A} C is a set?
7. What do De Morgan’s laws say in reference to two classes C and
D?
8. Let A be a class of classes. Can we generalize De Morgan’s laws to
{C : C ∈ A}?
9. Is it true that the union of sets C in a class A is a set?
10. List the ZF-axioms that refer specifically to sets and were invoked
at least once up to now.
11. In algebra, we know about the distributive property of “multiplica-
tion over sums and differences”. Is there a similar property which
refers to “unions distributing over intersections” and “intersections
distributing over unions”?

EXERCISES
A. 1. Prove or disprove that if D ∈ A, then it is always true that D ⊆ ⋃_{C∈A} C.
2. Show that if A is a class of sets, then A ⊆ P(⋃_{x∈A} x).

B. 3. Show that P(A) ∩ P(B) = P(A ∩ B).

4. Show that P(A) ∪ P(B) ⊆ P(A ∪ B).
5. Show that C ∩ D = ∅ if and only if P(C) ∩ P(D) = {∅}.
6. If A is a set, show that ⋃_{C∈P(A)} C = A.
7. If A is a set, show that ⋂_{C∈P(A)} C = ∅.
8. If A is a class, show that A = ⋃_{x∈A} {x}.
9. Prove the following statements.
a) Part (b) of Theorem 3.5.
b) Part (b) of Theorem 3.6.
c) Part (b) of Theorem 3.8.
d) Parts (a) to (d) of Theorem 3.9.
e) Parts (b) to (e) of Theorem 3.10.
f) Part (b) of Theorem 3.11.
g) Part (b) of Theorem 3.12.

C. 10. If A and B are sets, show that P(A) ∈ P(B) implies P(A) ⊆ B and so
A ∈ B.

4 / Cartesian products.
Abstract. In this section we define the notion of “ordered pairs” in terms
of classes and sets. We then define the Cartesian product of two classes
(sets). We also present a few of the basic properties of Cartesian products.

4.1 Ordered pairs.

The notion of “ordered pairs” is an important one since it is involved in
most areas of mathematics. Most students are familiar with the idea of or-
dered pairs since they have learned early on that when given an ordered pair
of numbers, the order in which the numbers appear conveys a particular
meaning.
For example, say 120 desks in an exam room are arranged in a rectangular
grid of 10 rows with each row containing 12 desks. Suppose each of the 120
students writing an exam in this room is given an ordered pair, (a, b), to be
interpreted as follows: You will write your exam at the a-th desk in the b-th row
where the first row is the one in the front of the room, and the first desk
in this row is the one which is closest to the door as you enter the room.
The student understands that the desk labeled (2, 3) is not the same desk
as the one labeled (3, 2). The order in which the numbers are presented has
meaning.
We know that functions can also be represented by ordered pairs. For
example, the function f(x) = x² with domain R can be represented by the set
S = {(x, x²) : x ∈ R}. So we can say that the ordered pair (5, 25) is an
element of this function, while the pair (25, 5) is not, since the second entry
is not the square of the first entry.
We would now like to formally define “ordered pairs” in our set-theoretic
axiomatic system. Our first step will be to remind ourselves of the way
“ordered pairs” as we know them are defined. Someone may attempt to
define an ordered pair as follows:

Given two elements a and b, an ordered pair, (a, b), is a doubleton

{a, b} where one element a is labeled as the “first” while the other
element b is labeled as the “second”. The element labeled “first”
must be listed first. The round brackets “( )” are used to indicate
that (a, b) is not a simple doubleton but rather a doubleton where
the order in which the elements a and b appear has a particular
meaning.

This is a bit wordy. Also, it is not clear what the words “first” and “second”
mean. We have not defined these in our set-theoretic universe. Can we define
ordered pairs without using the words “first” and “second”? That is, can
we obtain an equivalent definition of “ordered pairs” by avoiding these two
words entirely? Let us consider the following definition and then see if it
works.

Definition 4.1 (Kuratowski definition¹) Given a pair of elements c and d, we
can construct the class
{{c}, {c, d}}
The doubleton {{c}, {c, d}} is called an ordered pair. The elements c and d
need not be distinct. Ordered pairs are denoted as (c, d) = {{c}, {c, d}}.

First, we should verify that there are no inherent ambiguities in this defi-
nition. Remember that in our set theory all objects are classes. Given this
definition we should first verify that if c and d are elements, then {{c}, {c, d}}
is a class:

c and d are elements ⇒ {c} and {d} are classes. (Axiom A2.)

⇒ {c} ∪ {d} = {c, d} is a class. (Axiom A2 where P(x) is x = c ∨ x = d)

⇒ {{c}, {c, d}} is a class. (Axiom A2.)

We verify immediately that if c and d are sets, then {{c}, {c, d}} is a set:

c and d are sets ⇒ {c} and {d} are sets. (Axiom of pair)
⇒ {c} ∪ {d} = {c, d} is a set. (Axiom of union)
⇒ {{c}, {c, d}} is a set. (Axiom of pair)

Now that this has been established, we should make sure that the double-
ton defined above satisfies the essential “ordered pairs property”, [(a, b) =
(c, d)] ⇔ [a = c and b = d].

Theorem 4.2 Let a, b, c and d be elements. Then (a, b) = (c, d) if and only
if a = c and b = d.

Proof:
(⇐) That a = c and b = d implies (a, b) = {{a}, {a, b}} = {{c}, {c, d}} =
(c, d) is immediate.
(⇒) What we are given: (a, b) = (c, d). What we are required to show: a = c
and b = d.

¹ Kazimierz Kuratowski (1896-1980) was a Polish mathematician and logician. He was
one of the leading representatives of the Warsaw School of Mathematics. He worked as
a professor at the University of Warsaw and at the Mathematical Institute of the Polish
Academy of Sciences.

(a, b) = (c, d) ⇒ {{a}, {a, b}} = {{c}, {c, d}}

Case 1: a ≠ b ⇒ {a, b} ≠ {c} hence {a, b} = {c, d}
⇒ {a} = {c}
⇒ a = c
{a, b} = {c, d} and a = c ⇒ b = d

Case 2: a = b ⇒ {{a}, {a, b}} = {{a}, {a, a}} = {{a}}
⇒ {a} = {c} and {a} = {c, d}
⇒ a = c
{a} = {c, d} and a = c ⇒ {a} = {a, d}
⇒ {a, d} = {a, a}
⇒ a = d
⇒ b = d (Since a = b)
Then (a, b) = (c, d) if and only if a = c and b = d.
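
The Kuratowski construction can be modeled directly with Python's immutable frozenset type. The following is a minimal sketch which checks the ordered pair property of Theorem 4.2 by brute force on a few sample elements. The function name kpair and the choice of the integers 0, 1, 2 as stand-ins for elements are assumptions made for illustration only.

```python
from itertools import product


def kpair(c, d):
    """Kuratowski pair (c, d) = {{c}, {c, d}}, built from frozensets."""
    return frozenset({frozenset({c}), frozenset({c, d})})


elements = [0, 1, 2]  # arbitrary hashable stand-ins for elements

# (a, b) = (c, d) exactly when a = c and b = d, checked over all choices.
property_holds = all(
    (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(elements, repeat=4)
)

print(property_holds)              # True
print(kpair(1, 2) == kpair(2, 1))  # False: order matters
print(len(kpair(1, 1)))            # 1, since {{1}, {1, 1}} = {{1}}
```

The last line corresponds to Case 2 of the proof: when both entries coincide, the pair collapses to a class with a single element.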

From this theorem we deduce that:

[a ≠ b] ⇒ [(a, b) ≠ (b, a)]
Given the distinct sets a and b, we see that the sets {a, b} and (a, b) are
indeed different mathematical “creatures”. The definition of ordered pair
simply shows that (a, b) is constructed from a and b in a way that guar-
antees that the “ordered pair” property holds true. It says that (a, b) is a
doubleton where one of its elements is a singleton, while the other is itself a
doubleton. Since the two elements of this set have different characteristics,
we can decide which is the first entry and which is the second: the first
entry is the element of the singleton, while the second entry is the other
element of the doubleton.
Having defined an ordered pair of elements, we can conveniently define, in
a similar way, an ordered triple (x, y, z) as an ordered pair where the “first”
entry is itself an ordered pair:
(x, y, z) = ((x, y), z)

Alternate definitions of ordered pairs. There are other possible definitions

of ordered pairs (c, d) in terms of sets. Many readers may find the following
definition intuitively preferable. This definition is a slight variation of the
one put forward by Felix Hausdorff, so we will label it as being Hausdorff’s.

Definition 4.3 (Hausdorff definition²) If c and d are elements, the expression
(c, d) is defined as follows:

(c, d) = { {c, ∅}, {d, {∅}} }

The main reason why this definition may be intuitively appealing to some
is that it looks more like we are indexing the two elements c and d with the
symbols ∅ and {∅}. It allows one to visualize the ordered pair as follows:

(c, d) = { {c, ∅}, {d, {∅}} } = {c_∅, d_{∅}} = {c_0, d_1}

This in fact resembles more the way we will be viewing ordered pairs once
we define “functions” and the “natural numbers”. We will defer the proof
which guarantees that this definition satisfies the essential property of or-
dered pairs to the end of this section. In this text, we will adopt the more
commonly used Kuratowski definition.

4.2 Cartesian products.

Now that we have defined ordered pairs of classes, we can construct new
classes with old ones. Recall how, from two known sets, say N and R, we can
construct a new set N × R = {(n, x) : n ∈ N, x ∈ R}. This is what we want
to do with classes. Any two classes (sets) C and D can be used to construct
another class (called a Cartesian product) whose elements are ordered pairs.

Definition 4.4 Let C and D be two classes (sets). We define the Cartesian
product, C × D, as follows:

C × D = {(c, d) : c ∈ C and d ∈ D}

We could of course also write C × D = { {{c}, {c, d}} : c ∈ C and d ∈ D }.

Since we are particularly interested in constructing new sets from old ones,
we should first make sure that as long as C and D are sets, then C × D is
a set. We will do this by first proving the following lemma.

² Felix Hausdorff (1868-1942) was a German mathematician, who is considered to be
one of the founders of modern topology and who contributed significantly to set theory,
descriptive set theory, measure theory, and functional analysis. Historical note: Life became
difficult for Hausdorff and his family after the Kristallnacht of 1938. The next year he
initiated efforts to emigrate to the United States, but was unable to make arrangements to
receive a research fellowship. On 26 January 1942, Hausdorff died by suicide rather than
comply with German orders to move to the Endenich camp (Wikipedia).

Lemma 4.5 Let C and D be two classes (sets). Then the Cartesian product,
C × D, of C and D satisfies the property C × D ⊆ P(P(C ∪ D)).
Proof:
What we are given: That C and D are two classes (sets).
What we are required to show: C × D ⊆ P(P(C ∪ D)).
Let c ∈ C and d ∈ D. It will suffice to show that (c, d) ∈ P(P(C ∪ D)).
{c} ∈ P({c, d}) and {c, d} ∈ P({c, d}) ⇒ {{c}, {c, d}} ⊆ P({c, d})
⇒ (c, d) ⊆ P({c, d})
⇒ (c, d) ∈ P(P({c, d}))
P(P({c, d})) ⊆ P(P(C ∪ D))³ ⇒ (c, d) ∈ P(P(C ∪ D))
Hence, C × D ⊆ P(P(C ∪ D)), as required.
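
The inclusion of Lemma 4.5 can be observed concretely on two small sets. The sketch below builds C × D from Kuratowski pairs and compares it with P(P(C ∪ D)). The helper names kpair and power_set and the sample sets are assumptions for illustration; the check covers one example only and does not replace the proof.

```python
from itertools import chain, combinations, product


def kpair(c, d):
    """Kuratowski pair (c, d) = {{c}, {c, d}}."""
    return frozenset({frozenset({c}), frozenset({c, d})})


def power_set(s):
    """Set of all subsets of the finite set s."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(sub) for sub in subsets)


C = frozenset({1, 2})
D = frozenset({2, 3})

cartesian = frozenset(kpair(c, d) for c, d in product(C, D))
bound = power_set(power_set(C | D))

print(len(cartesian))      # 4 ordered pairs
print(len(bound))          # 256 = 2^(2^3) elements
print(cartesian <= bound)  # True: C × D ⊆ P(P(C ∪ D))
```

The sizes show how generous the bound is: only 4 of the 256 elements of P(P(C ∪ D)) are ordered pairs with first entry in C and second entry in D.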

Theorem 4.6 If C and D are classes, then the Cartesian product, C × D, is

a class. If C and D are sets, then C × D is a set.

Proof:
To show that C × D is a class we can express C × D as

C × D = {x : x ∈ P(P(C ∪ D)) and x = (c, d) for some c ∈ C and some d ∈ D}

and invoke axiom of construction A2 which declares it to be a class.

Given that C and D are sets, then C ∪ D is a set (by the Axiom of union A6)
which implies that P(P(C ∪ D)) is a set (by the Axiom of power set A5).
Since C ×D ⊆ P(P(C ∪D)), then C ×D is a set (by the axiom of subset A4).

We can then write that if C and D are sets, C × D is the set of all those
specific elements u in P(P(C ∪ D)) which are of the form u = (c, d) for
some c in C and d in D.
Once we have defined the Cartesian product of two classes C and D, referring
to our definition of ordered triples (c, d, e) = ((c, d), e), we can define the
Cartesian product of three classes C, D and E as follows:

C ×D×E = {(c, d, e) : c ∈ C, d ∈ D, e ∈ E}
= {((c, d), e) : c ∈ C, d ∈ D, e ∈ E}
= (C × D) × E

³ To see this, verify that S ⊆ T ⇒ P(S) ⊆ P(T).

4.3 A few properties of Cartesian products.

The following theorem illustrates properties of Cartesian products involving
intersections, “∩”, and unions, “∪”.

Theorem 4.7 Let C, D, E and F be classes. Then

a) C × (D ∩ E) = (C × D) ∩ (C × E)
b) C × (D ∪ E) = (C × D) ∪ (C × E)
c) (C ∩ E) × D = (C × D) ∩ (E × D)
d) (C ∪ E) × D = (C × D) ∪ (E × D)
e) (C ∪ D) × (E ∪ F ) = (C × E) ∪ (D × E) ∪ (C × F ) ∪ (D × F )
f) (C ∩ D) × (E ∩ F ) = (C × E) ∩ (D × E) ∩ (C × F ) ∩ (D × F )

Proof:
a) (c, d) ∈ C × (D ∩ E) ⇔ c ∈ C and d ∈ (D ∩ E)
⇔ c ∈ C and d ∈ D and d ∈ E
⇔ (c, d) ∈ C × D and (c, d) ∈ C × E
⇔ (c, d) ∈ (C × D) ∩ (C × E)

Hence, C × (D ∩ E) = (C × D) ∩ (C × E) (by Axiom A1).

Proofs of parts (b) to (f) are left as an exercise.
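
Before attempting the proofs of parts (b) to (f), it can help to see all six identities of Theorem 4.7 hold on a concrete example. In the sketch below, Python tuples stand in for ordered pairs; the helper name cross and the four sample sets are assumptions for illustration. A check on one example is of course not a proof.

```python
from itertools import product


def cross(X, Y):
    """Cartesian product as a set of Python tuples (stand-ins for ordered pairs)."""
    return set(product(X, Y))


C, D, E, F = {1, 2}, {2, 3}, {3, 4}, {4, 5}

checks = {
    "a": cross(C, D & E) == (cross(C, D) & cross(C, E)),
    "b": cross(C, D | E) == (cross(C, D) | cross(C, E)),
    "c": cross(C & E, D) == (cross(C, D) & cross(E, D)),
    "d": cross(C | E, D) == (cross(C, D) | cross(E, D)),
    "e": cross(C | D, E | F)
    == (cross(C, E) | cross(D, E) | cross(C, F) | cross(D, F)),
    "f": cross(C & D, E & F)
    == (cross(C, E) & cross(D, E) & cross(C, F) & cross(D, F)),
}

print(all(checks.values()))  # True
```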

Theorem 4.8 If C ⊆ D and E ⊆ F , then C × E ⊆ D × F

Proof:
By definition C × E = {(c, e) : c ∈ C and e ∈ E} and
D × F = {(d, f) : d ∈ D and f ∈ F }.

(c, e) ∈ C × E ⇒ c ∈ C and e ∈ E
⇒ c ∈ D and e ∈ F (Since C ⊆ D and E ⊆ F )

⇒ (c, e) ∈ D × F
Hence, C × E ⊆ D × F .

The following theorem shows that there is a one-to-one correspondence
between the elements of S × (U × V) and the elements of (S × U) × V.

Theorem 4.9 Given three classes (sets) S, U and V there is a one-to-one

correspondence between the two classes (sets) S × (U × V ) and (S × U ) × V .
Proof:
Let φ : S × (U × V) → (S × U) × V be defined as: φ((s, (u, v))) = ((s, u), v).
We will prove that φ maps distinct elements in S × (U × V) to distinct
elements in (S × U) × V. We can prove this by invoking Theorem 4.2 as
follows:
(s, (u, v)) = (a, (b, c)) ⇔ s = a and (u, v) = (b, c)
⇔ s = a and u = b and v = c
⇔ (s, u) = (a, b) and v = c
⇔ ((s, u), v) = ((a, b), c)
⇔ φ((s, (u, v))) = φ((a, (b, c)))
Moreover, every element ((s, u), v) of (S × U) × V is the image under φ of the
element (s, (u, v)) of S × (U × V), so φ is also onto.

4.4 Proof for the Hausdorff definition of ordered pairs.

We end this section with a proof showing that the alternate form of
(c, d) = { {c, ∅}, {d, {∅}} }
satisfies the essential property of “ordered pairs” and so can also be used to
represent ordered pairs.

Theorem 4.10 For classes c, d, e and f, if (c, d) = {{c, ∅}, {d, {∅}}} and
(e, f) = {{e, ∅}, {f, {∅}}}, then (c, d) = (e, f) if and only if c = e and d = f.
Proof:
(⇐) That c = e and d = f implies (c, d) = (e, f) is immediate.

(⇒) What we are given:

· (c, d) = {{c, ∅}, {d, {∅}}}
· (e, f) = {{e, ∅}, {f, {∅}}}
· (c, d) = (e, f)
What we are required to show: c = e and d = f.
We first consider the case where c is the empty class.
(∅, d) = (e, f) ⇒ {{∅, ∅}, {d, {∅}}} = {{e, ∅}, {f, {∅}}}
⇒ {{∅}, {d, {∅}}} = {{e, ∅}, {f, {∅}}}
⇒ {{∅}, {d, {∅}}} = {{∅, ∅}, {f, {∅}}} (Since {f, {∅}} can never equal {∅}.)
⇒ {{∅}, {d, {∅}}} = {{∅}, {f, {∅}}}
⇒ {d, {∅}} = {f, {∅}}
⇒ d = f

Thus, (∅, d) = (e, f) ⇒ e = ∅ and d = f as required. We now consider the
case where c ≠ ∅.

(c, d) = (e, f) ⇒ {{c, ∅}, {d, {∅}}} = {{e, ∅}, {f, {∅}}}

If {c, ∅} = {e, ∅} ⇒ c = e, it quickly follows that d = f. (Check the details.)

If {c, ∅} ≠ {e, ∅} ⇒ {c, ∅} = {f, {∅}} and {d, {∅}} = {e, ∅}
{c, ∅} = {f, {∅}} ⇒ f = ∅ (Since ∅ ≠ {∅} this forces f = ∅.)
⇒ {∅} = c
{d, {∅}} = {e, ∅} ⇒ d = ∅ and e = {∅} (For the same reasons as above.)
c = {∅} = e and d = ∅ = f ⇒ c = e and d = f.

Note that the two different representations of ordered pairs (a, b),
{{a}, {a, b}} and {{a, ∅}, {b, {∅}} } do not form equal sets. These two
classes only share the fundamental property of ordered pairs.
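
Both observations, that the Hausdorff-style pair satisfies the ordered pair property and that it differs as a set from the Kuratowski pair, can be illustrated on a few small sets. The sketch below is a brute-force check over four sample sets, deliberately including ∅ and {∅} since these are the delicate cases in the proof. The names hpair, kpair, EMPTY and ONE are assumptions for illustration.

```python
from itertools import product

EMPTY = frozenset()       # the empty set ∅
ONE = frozenset({EMPTY})  # the set {∅}


def hpair(c, d):
    """Hausdorff-style pair (c, d) = {{c, ∅}, {d, {∅}}}."""
    return frozenset({frozenset({c, EMPTY}), frozenset({d, ONE})})


def kpair(c, d):
    """Kuratowski pair (c, d) = {{c}, {c, d}}."""
    return frozenset({frozenset({c}), frozenset({c, d})})


# A few small sets, including the awkward cases ∅ and {∅}.
elements = [EMPTY, ONE, frozenset({ONE}), frozenset({EMPTY, ONE})]

property_holds = all(
    (hpair(c, d) == hpair(e, f)) == (c == e and d == f)
    for c, d, e, f in product(elements, repeat=4)
)

print(property_holds)                          # True
print(hpair(EMPTY, ONE) == kpair(EMPTY, ONE))  # False: different sets
```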

Concepts review:
1. If c and d are elements, what is the (Kuratowski) definition of the
ordered pair (c, d)?
2. Given two classes C and D, what is the definition of C × D?
3. If C and D are sets, is it true that C × D ⊆ P(P(C ∪ D))? Why?
4. Is it generally true that C × D = D × C? If so, why? If not give a
counterexample.
5. Is it generally true that (C × D) ∪ (E × F ) = (C ∪ E) × (D ∪ F )?
If so, why? If not, give a counterexample.

EXERCISES

A. 1. Prove that C × D = ∅ if and only if C = ∅ or D = ∅.

2. Show that for classes C, D and E, (C × D) ∩ (C′ × E) = ∅.
3. Show that A ⊆ B ⇒ A × C ⊆ B × C.

B. 4. If (c, d) ∈ C × D, is it necessarily true that {c} ∈ C and {c, d} ∈ D? If so,
why? If not, give a counterexample.
5. Show that A × C ⊆ B × C ⇏ A ⊆ B, that is, the converse of the statement
in question (3) does not necessarily hold true.
6. Prove parts (b) and (c) of Theorem 4.7.
7. Let S = {x} be a set. Show that (S × S) × S ≠ S × (S × S).
8. Describe each of the following classes. But first explain why each of these
classes is a set.
a) ∅ × {∅}
b) {∅} × ∅
c) ∅ × ∅
d) {∅} × {∅}
e) {∅ × ∅}

C. 9. Show that C × (D − E) = (C × D) − (C × E).

10. Is the statement “C × D = E × F if and only if C = E and D = F ” always
true? If there are situations where it fails to be true, state which ones.
11. Show that, if a and b are sets, then {{a, ∅}, {b, {∅}}} is a set.
Part III

Relations

5 / Relations on a class or set.

Abstract. In this section we define a relation R on a class (a set) S.
For a relation R on a set S, we define the inverse R⁻¹ of the relation R.
We also define the domain and the image of R. The composition of two
relations R and T is defined, and some of their properties are given.

5.1 Introduction.
We have seen that studying sets by categorizing them according to some
clearly identifiable properties or some characteristics shared by some but
not others is enough of a valuable endeavor to sustain the interest of many.
But for others, restricting our study to this aspect of set theory would be
missing the point, if not the main reason why set theory constitutes a branch
of mathematics well worth studying.
Consider, for example, the following analogous situation. Suppose one wishes
to study the universe whose elements are “people living on this planet” by
regrouping individuals in this universe based on identifiable characteristics
shared by some but not by others. For example, one might identify
characteristics based on race, culture, religious beliefs and so on. Research would
identify that most individuals are either male or female, with much smaller
groups which identify as neither or even both. Although studying, in this
way, various sub-categories of individuals which populate this universe is
worth investigating, trying to understand how subgroups “relate” to one
another appears to be a much richer area of investigation. Certainly, it is more
complex. Also, it would be seen as being more useful, in practice. For similar
reasons, the time is ripe in our study to embark on the study of the notion
of “relations between sets”.
Recall that the symbol U denotes the “Universal class”, {x : x = x} (the
class of all elements). Since U is a class, we can then construct the Cartesian
product, U × U , itself a class (as we have seen). Recall that the elements
of U × U are ordered pairs.

Definition 5.1

a) We will call any subset R of ordered pairs in U × U a binary relation.¹

b) We will say that R is a binary relation on a class C if R is a subclass
(subset) of C × C. In such cases we will simply say that R is a relation
in C (or on C).

¹ The word binary refers to the fact that the elements of R are doubletons (pairs). We can also
speak of a ternary relation when considering subclasses of the Cartesian product U × U × U.
Unless we specify otherwise, all relations in this text are assumed to be “binary” and so the
word relation will be used to abbreviate the words binary relation.

c) If A and B are classes (sets) and R is a subclass (subset) of A × B, then

R can be viewed as a relation on A ∪ B.

From this definition we see that any Cartesian product C × D is a relation.

But a relation need not be a Cartesian product. For example, the smallest
Cartesian product which contains the set G = {(a, c), (b, d)} is A × B where
A = {a, b} and B = {c, d}. But G ≠ A × B since (a, d) ∈ (A × B) − G.
We will be referring to specific kinds of relations on a class or set. Given
a class C, a relation on C is usually expressed by the symbol, R, although
other capital letters are used whenever it is necessary to distinguish between
two relations on a same class. Suppose R is a relation on a class C and that
(x, y) ∈ R. Then, by definition, both x and y belong to C. Common ways
of expressing that (x, y) belongs to the relation R are:
· (x, y) ∈ R
· xRy holds true
· x is related to y under R. (respecting the order of x and y)

Remark: Recall from Lemma 4.5 that C × D ⊆ P(P(C ∪ D)); hence, if R
is a relation that is a subset of C × D, then R ⊆ P(P(C ∪ D)).

Also note that when we say “y is related to x under R”, we mean (y, x) ∈ R.

5.2 Examples of relations on a class.

1) We define a relation, R₁, in U as follows: (x, y) ∈ R₁ if and only if x ∈ y.
This says that “any x is related precisely to those classes (sets) y which
contain it.” We can also write

R₁ = {(x, y) : x ∈ y}

For example, we can write (a, {a, b}) ∈ R₁ or, if one prefers, aR₁{a, b}
“holds true”. On the other hand, we can write (b, {c, d}) ∉ R₁. Also,
(∅, ∅) ∉ R₁, but (∅, {∅}) ∈ R₁.
2) We define a relation, R₃, in U as follows: (x, y) ∈ R₃ if and only if x = y.
This says that “a class (a set) x is related only to itself and no other
class”. We can write R₃ = {(x, y) : x = y}. We see that (a, {a}) ∉ R₃
but that ({a}, {a}) ∈ R₃. The statement ∅R₃∅ is true.

Definition 5.2 Let C be a class (a set).

a) The relation
∈_C = {(x, y) : (x, y) ∈ C × C, x ∈ y}
is called the membership relation on C.
b) The relation

Id_C = {(x, y) : (x, y) ∈ C × C, x = y}

is called the identity relation on C.

We see that the only elements in the identity relation Id_U on U are those
of the form (x, x).

5.3 Domain and image of a relation.

The reader may see some similarities between the concept of a relation on
C and what is known as a “function from a set C to C”:
− Both relations and functions are collections of ordered pairs.
− In both cases, the first entry should not be confused with the second entry.
The second entry is often defined in terms of the first entry: That is, a
rule states why the second entry is related to the first. This rule is the
mechanism which allows us to determine which ordered pairs belong to
the relation or the function and which don’t.

Definition 5.3 Let R be a relation on a class (set) C. The domain of R is

the class, dom R = {x : x ∈ C and (x, y) ∈ R for some y ∈ C}. The image of
R is the class, im R = {y : y ∈ C and (x, y) ∈ R for some x ∈ C}. The word
range of R is often used instead of “the image of R”. If R ⊆ A × B is viewed
as a relation on A ∪ B, then dom R ⊆ A and im R ⊆ B.

Example: Suppose R is the membership relation on the set S where S is

defined as,
S = { a, b, c, {a}, {a, b}, {{c}}, ∅, {∅}}
That is, R = {(x, y) : x ∈ y}. Find dom R and im R.
Solution:
To find the domain and the image of R, we will write out the elements of R
explicitly:
R = { (a, {a}), (a, {a, b}), (b, {a, b}), (∅, {∅}) }

The domain, dom R, is

dom R = {a, b, ∅}
while the image, im R, is

im R = {{a}, {a, b}, {∅}}
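
The example above can be reproduced mechanically. In the following sketch, the strings "a", "b" and "c" stand in for the elements a, b and c, and frozensets stand in for the sets built from them; these modeling choices and the helper name is_member are assumptions for illustration. In this model the strings are treated as having no members of their own.

```python
EMPTY = frozenset()
A, B, C_ATOM = "a", "b", "c"  # strings stand in for the elements a, b, c

S = {
    A, B, C_ATOM,
    frozenset({A}),
    frozenset({A, B}),
    frozenset({frozenset({C_ATOM})}),
    EMPTY,
    frozenset({EMPTY}),
}


def is_member(x, y):
    """x ∈ y in this model; the string atoms have no members."""
    return isinstance(y, frozenset) and x in y


# Membership relation on S, then its domain and image (Definition 5.3).
R = {(x, y) for x in S for y in S if is_member(x, y)}
dom_R = {x for (x, y) in R}
im_R = {y for (x, y) in R}

print(len(R))                                   # 4
print(dom_R == {A, B, EMPTY})                   # True
print(im_R == {frozenset({A}), frozenset({A, B}), frozenset({EMPTY})})  # True
```

Note that c does not appear in dom R: the only set containing c is {c}, and {c} itself is not an element of S.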

5.4 Inverse of a relation R on a set S.

Just as for one-to-one functions, we can speak of the inverse of a relation R
on a set S. However, a relation need not be “one-to-one” to have an inverse.
“One-to-many” relations are quite common. We will begin by formally defin-
ing what we mean by the inverse of a relation.

Definition 5.4 Let C be a class (a set) and let R be a relation defined in C.
The inverse, R⁻¹, of the relation R is defined as follows:

R⁻¹ = {(x, y) : (y, x) ∈ R}

In the example above, we defined a relation on the set

S = { a, b, c, {a}, {a, b}, {{c}}, ∅, {∅}}

as R = {(x, y) : x ∈ y}.
Since R was found to be:

R = { (a, {a}), (a, {a, b}), (b, {a, b}), (∅, {∅}) }

then the inverse relation, R⁻¹, is

R⁻¹ = { ({a}, a), ({a, b}, a), ({a, b}, b), ({∅}, ∅) }

The inverse of R can also be expressed in the more succinct form

R⁻¹ = {(x, y) : y ∈ x}.

5.5 Composition of two relations R and T .

Just like pairs of functions f and g, a pair of relations R and T on a class
C can be combined to obtain a new relation. Other than the fact that the
first entries of a relation can be associated to many values in the image of
R, compositions of relations work exactly like the composition of functions.
We define the composition of two relations as follows.

Definition 5.5 Let C be a class (a set) and let R and T be two relations in
C. We define the relation T ∘ R as follows:

T ∘ R = {(x, y) : (x, z) ∈ R and (z, y) ∈ T for some z ∈ C}

Suppose, for example that the relations R and T on the set

S = { a, b, c, {a}, {a, b}, {{c}}, ∅, {∅}}

are defined as:

R = {(x, y) : x ∈ y}
T = {(x, y) : x = {y}}
Then the relation R can be described as:

R = { (a, {a}), (a, {a, b}), (b, {a, b}), (∅, {∅}) }

The relation T is:

T = { ({a}, a), ({∅}, ∅) }
For the relation T ∘ R we obtain:

T ∘ R = {(a, a), (∅, ∅)}

For the relation R ∘ T we obtain:

R ∘ T = {({a}, {a}), ({a}, {a, b}), ({∅}, {∅})}

(The pair ({a}, {a, b}) arises from ({a}, a) ∈ T and (a, {a, b}) ∈ R.)
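
The two compositions can be recomputed from the listed pairs. The sketch below reads composition in the same way as composition of functions: (x, y) belongs to T ∘ R when (x, z) ∈ R and (z, y) ∈ T for some intermediate z. The function name compose and the use of strings and frozensets to model the elements are assumptions for illustration.

```python
def compose(T, R):
    """T ∘ R: all (x, y) with (x, z) in R and (z, y) in T for some z."""
    return {(x, y) for (x, z) in R for (w, y) in T if z == w}


EMPTY = frozenset()
a, b = "a", "b"  # strings stand in for the elements a and b
set_a = frozenset({a})         # {a}
set_ab = frozenset({a, b})     # {a, b}
set_e = frozenset({EMPTY})     # {∅}

# R is the membership relation and T = {(x, y) : x = {y}} on the set S.
R = {(a, set_a), (a, set_ab), (b, set_ab), (EMPTY, set_e)}
T = {(set_a, a), (set_e, EMPTY)}

print(compose(T, R) == {(a, a), (EMPTY, EMPTY)})  # True
print(compose(R, T) == {(set_a, set_a), (set_a, set_ab), (set_e, set_e)})  # True
```

In R ∘ T, the pair ({a}, {a, b}) appears because T sends {a} to a, and a is then related by R to both {a} and {a, b}. This is the "one-to-many" behaviour that distinguishes relations from functions.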

Concepts review:
1. Given a class C, what is a relation on C?
2. Given a relation R on a class C, what does the expression xRy
mean?
3. Given a class C, what is the membership relation, ∈_C, on C?
4. Given a class C, what is the identity relation, Id_C, on C?
5. Given a relation R on a class C, what is the domain, dom R, of R
and the image, im R, of R?
6. Given a relation R on a class C, what is the inverse, R⁻¹, of the
relation R? Is R⁻¹ a relation on C?
7. Does a relation R on C have to be “one-to-one” for R⁻¹ to be a
relation?
8. If R and T are two relations on a class C, what does R ∘ T mean?

EXERCISES

A. 1. Suppose R, S and T are three relations on a set A. Prove that T ∘ (R ∘ S) =
(T ∘ R) ∘ S.
2. Suppose R is a relation on a set A. Prove that (R⁻¹)⁻¹ = R.
3. Let R = {(a, a), (a, c), (c, c), (c, d)} and T = {(a, b), (c, a), (d, c)}. Describe:
a) R⁻¹ and T⁻¹.
b) R ∘ T and T ∘ R.
c) R ∘ T⁻¹.
4. If R is a relation on a set S, show that dom R⁻¹ = im R.

B. 5. Let C = {∅, {∅}} and D = P(C).

a) Write out explicitly the elements of the membership relation, ∈_D, on
D.
b) Write out explicitly the elements of the identity relation, Id_D, on D.
c) Describe dom ∈_D and im ∈_D.
d) List all elements of D = P(C).
e) List all possible relations on C.

C. 6. Let R, S and T be three relations on a set A.

a) Prove that (R ∪ S) ∘ T = (R ∘ T) ∪ (S ∘ T).
b) Prove that (R ∩ S)⁻¹ = R⁻¹ ∩ S⁻¹.
c) Prove that R ⊆ T implies R⁻¹ ⊆ T⁻¹.
d) Prove that dom (S ∪ T) = dom S ∪ dom T.

6 / Equivalence relations and order relations.

Abstract. In this section we define special types of relations on a class or
set: reflexive, symmetric, antisymmetric, asymmetric and transitive rela-
tions. Equivalence relations on a class S will be defined as those relations
which are simultaneously reflexive, symmetric and transitive. We also de-
fine “partial order relations” and “strict order relations” on sets and pro-
vide examples of these. Equivalence relations and order relations are the
two most important types of relations in the study of Set theory.

6.1 A few special types of relations on a class S.

Although we would not normally think of it as a relation, the empty set is
a relation on any non-empty set S:
∅ = ∅ × S = {(x, y) : x ∈ ∅, y ∈ S} ⊆ S × S
Its main properties are:
i) dom ∅ = {x ∈ S : (x, y) ∈ ∅} = ∅,
ii) ∅−1 = {(x, y) : (y, x) ∈ ∅} = ∅,
iii) im ∅ = {y ∈ S : (x, y) ∈ ∅} = ∅.
We define other special types of relations below.

Definition 6.1 Let S be a class and R be a relation on S.

a) We say that R is a reflexive relation on S if, for every x ∈ S, (x, x) ∈ R.

b) We say that R is a symmetric relation on S if, whenever (x, y) ∈ R,
then (y, x) ∈ R.
c) We say that R is an antisymmetric relation on S if, whenever (x, y) ∈ R
and (y, x) ∈ R, then x = y.
d) We say that R is an asymmetric relation on S if, whenever (x, y) ∈ R,
then (y, x) ∉ R.
e) We say that R is a transitive relation on S if, whenever (x, y) ∈ R and
(y, z) ∈ R, then (x, z) ∈ R.
f) We say that R is an irreflexive relation on S if, for every x ∈ S,
(x, x) ∉ R.
g) If, for every x, y ∈ S where x ≠ y, either (x, y) ∈ R or (y, x) ∈ R,
then we say “any two elements a and b in S are comparable under the
relation R”.
54 Section 6: Equivalence relations and order relations

It follows from these definitions that a relation which is both antisymmetric
and irreflexive must be asymmetric. We illustrate these relation properties
with the following four examples.
1) Suppose G represents a set whose elements are the individuals who live
in the mythical Gotham City. Consider two relations on G:

R = {(f, b) : f is a female and b is a brother of f }

T = {(x, y) : x and y are distinct siblings}

· R is irreflexive since a female inhabitant of this city cannot be her own
brother. The relation T is also irreflexive since it relates only distinct
individuals.
· R is an asymmetric relation on G since if (f, b) ∈ R, then (b, f) ∉ R
since b is male. However, x and y are distinct siblings in whichever order
we consider them. So T is symmetric.
· Transitivity: Since (a, b) and (b, c) cannot both belong to R (since b is
male), we will say that R is “vacuously” transitive. The relation T
is not transitive in general: if x and y are distinct siblings, then (x, y)
and (y, x) both belong to T, yet (x, x) ∉ T. It is true, however, that if
x, y and z are three distinct individuals such that x and y are siblings
and y and z are siblings, then x and z are siblings.
· If we assume that Gotham City contains more than one family, then
there are pairs of individuals which are comparable under neither R
nor T.
2) Let S = {a, b, c, d}. Consider the relation R1 = {(a, a), (b, b), (c, c), (d, d), (a, b)}
on S.
· R1 is a reflexive relation on S since R1 contains (x, x) for each x ∈ S.
· R1 is not a symmetric relation on S since R1 contains (a, b) but not
(b, a).
· R1 is “vacuously” antisymmetric on S since R1 contains (a, b) while
(b, a) is not in R1.¹
· R1 is not asymmetric since R1 contains (a, a).
· R1 is transitive on S since R1 contains (a, b) and (b, b) and also contains
(a, b).

3) Let S = {a, b, c, d}. Consider the relation R2 = {(a, a), (b, b), (d, d), (a, b)}.
· R2 is not a reflexive relation on S since R2 does not contain (c, c). It is
not irreflexive since it contains (a, a).
· R2 is not a symmetric relation on S since R2 contains (a, b) but not
(b, a).
· R2 is vacuously antisymmetric since R2 contains (a, b) and (b, a) is not
in R2.
· R2 is not asymmetric since R2 contains (a, a).
· R2 is transitive since R2 contains (a, b) and (b, b) and also contains
(a, b).

¹ Note that the statement “whenever (a, b) and (b, a) are in R1, then a = b” holds true.

4) Let S = {a, b, c, d}. Consider the relation

R3 = {(a, a), (b, b), (c, c), (d, d), (a, b), (b, a), (b, c), (c, b)}
· We see that R3 is a reflexive relation on S.
· Since R3 contains both (a, b) and (b, a), and both (b, c) and (c, b), R3
is a symmetric relation on S.
· Since R3 contains both (a, b) and (b, a) while a ≠ b, R3 is not
antisymmetric.
· Since R3 contains (a, a), R3 is not asymmetric.
· Since R3 contains (a, b) and (b, c) but not (a, c), R3 is not transitive
on S.
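
When S is finite, the properties of Definition 6.1 can be tested mechanically. The following Python sketch is only an illustration and not part of the formal development: relations are modelled as Python sets of ordered pairs, and every function name is a hypothetical helper of our own choosing. It evaluates each property on the relations R1, R2 and R3 of examples 2) to 4).

```python
# Illustrative sketch: relations on a finite set, modelled as sets of pairs.
# All function names are hypothetical helpers, not notation from the text.

def is_reflexive(R, S):
    return all((x, x) in R for x in S)

def is_irreflexive(R, S):
    return all((x, x) not in R for x in S)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(R):
    # Whenever (x, y) and (y, x) are both in R, x and y must coincide.
    return all(x == y for (x, y) in R if (y, x) in R)

def is_asymmetric(R):
    return all((y, x) not in R for (x, y) in R)

def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

def profile(R, S):
    """Truth value of each property of Definition 6.1 for R on S."""
    return {
        "reflexive": is_reflexive(R, S),
        "irreflexive": is_irreflexive(R, S),
        "symmetric": is_symmetric(R),
        "antisymmetric": is_antisymmetric(R),
        "asymmetric": is_asymmetric(R),
        "transitive": is_transitive(R),
    }

S = {"a", "b", "c", "d"}
diagonal = {(x, x) for x in S}
R1 = diagonal | {("a", "b")}
R2 = (diagonal - {("c", "c")}) | {("a", "b")}
R3 = diagonal | {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")}

for name, R in [("R1", R1), ("R2", R2), ("R3", R3)]:
    print(name, profile(R, S))
```

In particular, the sketch reports that R3 is reflexive and symmetric but neither antisymmetric nor transitive, since it contains (a, b) and (b, c) but not (a, c).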
A word of caution: Some readers may conclude that a relation R which is
both symmetric and transitive on a class S is automatically reflexive based
on the following reasoning:
Symmetric R says “(a, b) ∈ R implies (b, a) ∈ R” while transitive
R says “(a, b) and (b, a) in R implies (a, a) ∈ R ”. So “symmetric
+ transitive ⇒ reflexive”.

This conclusion is however not correct.

Consider the relation R = {(a, a), (b, b), (d, d), (a, b), (b, a)} on the set S =
{a, b, c, d}. It is symmetric and transitive and yet (c, c) is not in R and so R
is not reflexive. We should remember that if a relation R is to be reflexive
on S we must have (x, x) ∈ R for all x ∈ S.
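
The flawed argument needs some pair (a, b) in R to get started; it says nothing about an element of S which is related to no element at all. The short Python sketch below (an illustration only, with variable names of our own choosing) checks the counterexample and lists the elements of S which appear in no pair of R.

```python
# Illustrative check of the counterexample; variable names are our own.
S = {"a", "b", "c", "d"}
R = {("a", "a"), ("b", "b"), ("d", "d"), ("a", "b"), ("b", "a")}

symmetric = all((y, x) in R for (x, y) in R)
transitive = all((x, w) in R for (x, y) in R for (z, w) in R if y == z)
reflexive = all((x, x) in R for x in S)

# Elements of S that occur in no ordered pair of R.
unrelated = {x for x in S if not any(x in pair for pair in R)}

print("symmetric:", symmetric)
print("transitive:", transitive)
print("reflexive:", reflexive)
print("elements related to nothing:", unrelated)
```

Here the only element related to nothing is c, which is precisely the element for which (c, c) is missing.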

6.2 Equivalence relations on a class S.

Equivalence relations are important types of relations on classes and sets.
Students who study other fields of mathematics will frequently encounter
sets equipped with this type of relation.

Definition 6.2 Let S be a class and R be a relation on S. We say that R is
an equivalence relation on S if R is simultaneously reflexive, symmetric and
transitive on S.

Examples: Let S and T be two non-empty classes.

a) Recall that IdS denotes the identity relation on S:

(x, y) ∈ IdS if and only if x = y

In this relation an element is only related to itself and no other. It is
easily seen that IdS is reflexive, symmetric and transitive on S. Thus,
the identity relation, IdS , is an equivalence relation on S.
b) Let D = S × T . We define a relation, R, on D as follows:

((a, b), (c, d)) ∈ R if and only if a = c

This means all ordered pairs in S × T with the same first entry are
related under R. Since
i. ((a, b), (a, b)) ∈ R for all (a, b) ∈ D so R is reflexive on D.
ii. ((a, b), (c, d)) ∈ R ⇒ a = c ⇒ ((c, d), (a, b)) ∈ R so R is symmetric.
iii. ((a, b), (c, d)) ∈ R and ((c, d), (e, f)) ∈ R ⇒ a = c = e ⇒
((a, b), (e, f)) ∈ R so R is transitive.
We conclude that R is an equivalence relation on D.
c) We will refer to the first example presented on page 54. If G denotes
all the inhabitants of Gotham City and H is the relation on G defined
as
H = {(x, y) : x and y are siblings or the same person}
then H is reflexive, symmetric and transitive and so forms an equiva-
lence relation on G.
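
Example b) can be checked on small finite sets. In the Python sketch below, the particular sets S and T are assumptions chosen only for illustration; the relation R on D = S × T relates two ordered pairs exactly when their first entries agree.

```python
from itertools import product

# Assumed sample sets; any two non-empty finite sets would do.
S = {1, 2}
T = {"p", "q", "r"}
D = set(product(S, T))          # D = S x T

# ((a, b), (c, d)) is in R if and only if a == c.
R = {(u, v) for u in D for v in D if u[0] == v[0]}

reflexive = all((u, u) in R for u in D)
symmetric = all((v, u) in R for (u, v) in R)
transitive = all((u, w) in R for (u, v) in R for (t, w) in R if v == t)

print(len(D), len(R))
print(reflexive, symmetric, transitive)
```

With these sample sets, D has 6 elements and each of the two possible first entries contributes 3 × 3 related pairs, so R has 18 elements.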

6.3 Order relations on a class S.

We now discuss another very important type of relation called “order rela-
tion”. An order relation will be either strict or non-strict. In each of those
two categories an order relation can be a partial ordering or a linear order-
ing. These terms are defined below.

Definition 6.3 Let S be a class.

a) Non-strict order relation. The relation R is a non-strict order relation on
S if it is simultaneously
· reflexive (aRa holds true for any a in S),
· antisymmetric (if aRb and bRa, then a = b),
· transitive (aRb and bRc implies aRc)
on S. A non-strict order relation, R, on S is said to be a
non-strict linear order relation
if, for every pair of elements a and b in S, either (a, b) ∈ R, (b, a) ∈ R or
a = b. That is, every pair of elements is “comparable” under R.¹ A non-strict
ordering, R, on S which is not linear is said to be a non-strict partial
ordering relation on S.²
b) Strict order relation. The relation R is a strict order relation on S if it
is simultaneously
· irreflexive ((a, a) ∉ R),
· asymmetric ((a, b) ∈ R ⇒ (b, a) ∉ R),
· transitive
on S. If every pair of distinct elements, a and b, in S is comparable under a
strict order relation, R, then R is a strict linear ordering on S. Those strict
orderings which are not linear are called strict non-linear orderings or, more
commonly, strict partial ordering relations.
A non-strict partial order R on S always induces a strict partial order R∗
by defining aR∗b ⇔ [aRb and a ≠ b]. Similarly, a strict partial order R on
S always induces a non-strict partial order R† by defining aR†b ⇔ [aRb or
a = b].
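
The passage between the two kinds of order can be illustrated on a small finite example. The Python sketch below is an illustration under our own assumptions: it takes “⊆” on the power set of {1, 2} as the non-strict order, and the names `power_set`, `leq`, `lt` and `leq_again` are hypothetical. Removing the identity pairs yields R∗, and adding them back to the strict order yields R†.

```python
from itertools import combinations

def power_set(base):
    """All subsets of a finite set, as frozensets (hypothetical helper)."""
    items = sorted(base)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

P = power_set({1, 2})
identity = {(a, a) for a in P}

# Non-strict order: (a, b) is in leq if and only if a is a subset of b.
leq = {(a, b) for a in P for b in P if a <= b}

# R*: a R* b  iff  a R b and a != b.
lt = leq - identity

# R-dagger built from the strict order: a R† b  iff  a R* b or a == b.
leq_again = lt | identity

print(len(leq), len(lt))
print(lt == {(a, b) for a in P for b in P if a < b})
print(leq_again == leq)
```

For this example the two constructions undo one another: starting from “⊆”, we obtain the proper subset relation, and from it we recover “⊆”.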

Note that order relations are, in general, not equivalence relations: a
non-strict order relation is normally not symmetric, while a strict order
relation on a non-empty class is never reflexive.
Notation: If R is a partial ordering relation, then, instead of writing (a, b) ∈
R or aRb, it is standard to write

a ≤ b

or a ≤R b if we want to be more specific about which relation we are
discussing. Similarly, if R is a strict ordering relation then, instead of writing
(a, b) ∈ R or aRb, it is standard to write

a < b

or a <R b if we want to be more specific about which relation we are
discussing. Given an order relation R on the set S, keep in mind that when we
write a ≤R b we are saying that (a, b) ∈ R ⊆ S × S.

¹ A class on which is defined a linear ordering R is also said to be fully ordered or totally
ordered by R. In certain branches of mathematics, “linearly ordered set” is abbreviated as
l.o.set or simply called a loset.
² Again in certain branches of mathematics, “partially ordered set” is abbreviated as
p.o.set or simply called a poset.

At first glance, this may all appear very abstract and so the reader may find
it difficult to distinguish one relation from the other or remember precisely
what they mean. Studying the following few examples carefully will help
construct a mental representation of the structure these relations provide to
sets.
Example 1.
Mortimer is constructing a chart in which he will list all of his ancestors.
He lets S denote a set whose elements represent his ancestors. He defines
an order relation R on S as follows: If a and b are two ancestors, (a, b) ∈ R
if and only if b is an ancestor of a; equivalently, a is a descendant of b. We
list some properties of the relation R:
− We see that R is transitive since, “a is a descendant of b” and “b is a
descendant of c” implies “a is a descendant of c”.
− Since an element a cannot be an ancestor of a, R is irreflexive.
− Finally, if a is a descendant of b, then clearly b cannot be a descendant
of a. So R is asymmetric.
We conclude that R is a strict order relation on S. Instead of writing (a, b) ∈
R, we will write a < b with the understanding that “<” is only to be
interpreted as “a is a descendant of b”. We list a few more properties of R:
− It is clear that R does not linearly order S since one parent of Mor-
timer cannot be an ancestor of the other parent (excluding cases where
something highly unnatural is going on). Hence, there exist pairs of
ancestors a and b such that a ≮ b and b ≮ a.
− Let’s assume that Mortimer has included himself in the set S and is
represented by the letter M . Then M < a for all a ∈ S other than M
itself. We will say that M is the minimum element of S with respect to
the ordering “<”.
− Beginning with M , Mortimer can trace different paths upwards forming
chains of inequalities each in the form M < a < b < c < · · · < · · · . Such
chains are linearly ordered subsets of S since for any two elements a, b
in such chains either a < b or b < a. So, not only is M the minimum
element of S, M is also the minimal element of each chain.
To allow us to illustrate in this example as many properties of order
relations as possible, let’s assume that Adam and Eve were “spontaneously
generated” and so were the most ancient of Mortimer’s ancestors (assuming
Adam and Eve are the only human beings who were spontaneously
generated). Say that in the set S, Adam is represented by A and Eve by E. We
add a few other properties of the set S when equipped with the given order
relation R:
− We see that there are numerous chains of elements (linearly ordered
subsets of S) each of which begins with the minimal element M and
finishes with either A or E.
− If a chain C linearly links M to A, then A is a maximal element of this
chain in the sense that all elements of S which are comparable to A
are “below” A. Similarly, E is the maximal element of all chains which
link M to E.
− However, S has no maximum element since A is not an ancestor of E
and E is not an ancestor of A. The elements A and E are simply not
comparable under R. In this sense, we can say that S has no maximum
element and only two maximal elements.
We now formalize some of the concepts illustrated in this example with the
following definitions.

Definition 6.4 Let S be a class and R be an order relation on S. If R is a
non-strict order relation, “(a, b) ∈ R” is represented as “a ≤ b”, and if R is
a strict order relation, “(a, b) ∈ R” is represented as “a < b”. If a ≤ b and
a ≠ b, we will simply write a < b.

a) A subset of S which is linearly ordered by R is called a chain in S. If R
linearly orders S, then S is a linearly ordered subset of itself and therefore
is a chain.
b) An element, M , of S is called a maximal element of S with respect to ≤
if there does not exist an element b in S such that M < b. An element
m of S is called a minimal element of S with respect to ≤ if there does
not exist an element b in S such that b < m.
c) Two elements a and b of S are said to be comparable with respect to the
order relation, ≤, if either a ≤ b or b ≤ a. Let M and m be elements
which are comparable (with respect to ≤) to all elements of the set S.
The element, M , in S is called the maximum element of S with respect
to ≤ if there does not exist an element a in S such that M < a. The
element m in S is called the minimum element of S if there does not
exist an element a in S such that a < m.
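
For a finite ordered set, the notions of Definition 6.4 can be computed directly. The Python sketch below is an illustration only: it uses a much reduced, hypothetical version of Mortimer's chart in which M has two parents p and q, with A the parent of p and E the parent of q. All function names are our own.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given pairs."""
    closure = set(pairs)
    while True:
        extra = {(x, w) for (x, y) in closure for (z, w) in closure if y == z}
        if extra <= closure:
            return closure
        closure |= extra

# Hypothetical toy chart: (a, b) means "a is a child of b".
S = {"M", "p", "q", "A", "E"}
child_of = {("M", "p"), ("M", "q"), ("p", "A"), ("q", "E")}
lt = transitive_closure(child_of)     # a < b: "a is a descendant of b"

def maximal(S, lt):
    return {m for m in S if not any((m, b) in lt for b in S)}

def minimal(S, lt):
    return {m for m in S if not any((b, m) in lt for b in S)}

def comparable_to_all(x, S, lt):
    return all(x == a or (x, a) in lt or (a, x) in lt for a in S)

def maximum(S, lt):
    # A maximal element which is comparable to every element of S.
    return {m for m in maximal(S, lt) if comparable_to_all(m, S, lt)}

def minimum(S, lt):
    return {m for m in minimal(S, lt) if comparable_to_all(m, S, lt)}

print("maximal:", maximal(S, lt))
print("minimal:", minimal(S, lt))
print("maximum:", maximum(S, lt))
print("minimum:", minimum(S, lt))
```

In this toy chart, A and E are both maximal but not comparable, so there is no maximum element, while M is the minimum element.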

Note that a maximum or minimum element of an ordered set S must be comparable
to all elements of S. This is not necessarily the case for a maximal or
minimal element of S. Referring to the example on “ancestors” of Mortimer
above, S has two maximal elements A and E; it has no maximum element,
since no single element in S is strictly larger than all other elements. If it
is the case that Adam was “spontaneously generated” and Eve was formed
from one of Adam’s ribs, then Adam would be Eve’s only ancestor and so
A would be the maximum element of S.

Example 2.
Let S denote the set of all molecules constructed from the atoms listed in
the periodic table of elements. In this case, molecules are viewed as sets
whose elements are atoms. (We exclude crystals.) The simplest molecules
are those that contain only one atom. We define the relation R on S as
follows: (a, b) ∈ R if a ≠ b and all atoms in molecule a are contained in
molecule b and any atom which appears n times in a appears at least n times
in b. If (a, b) ∈ R, we will write a ⊂ b (or say that a is a proper subset of
b). For example, (H2O, H2O2) ∈ R. We describe the structure of S when
equipped with this particular relation R.
− By definition, R is irreflexive.
− If molecule a is a proper subset of molecule b, then b cannot be a proper
subset of molecule a and so R is asymmetric.
− If a ⊂ b and b ⊂ c, then a ⊂ c and so R is transitive.
We conclude that the relation R strictly orders the set S. The set S is
however not linearly ordered since well-known molecules such as Cl2 (gaseous
chlorine) and H2 (hydrogen gas) are not comparable under “⊂”. We discuss
a few more properties of S when equipped with R.
− Since a molecule made of a single atom cannot properly contain any
other molecule, “single-atom molecules” are minimal elements
of S. So S has as many minimal elements as there are atoms. Clearly,
S does not contain a minimum element.
− There are atoms belonging to the family of noble gases (helium (He),
argon (Ar), krypton (Kr), etc.) that are non-reactive and so do not tend
to bond with other elements to form molecules. In S, these elements will
form one-element chains. For example, the helium atom is not properly
contained in any other molecule. It is both a maximal and minimal
element of S. These particular atoms (noble gases) form in S what is
called an antichain. An antichain is a subset of an ordered set in which
no two elements are comparable.
− Other molecules will join together to form new molecules. For example,
the carbon element, C, and hydrogen element, H, both belong to the
molecule, CH2 , which will join with some other molecules containing
carbon and oxygen atoms to form C2 H2 O4 .
− The order structure of S will then contain numerous chains of
molecules. One suspects that at some point each of these chains may
attain some extremely large molecule which is non-reactive or will be
too unstable to form other lasting links. If this is the case, then such a
molecule is a maximal element of S. However, S cannot have a maxi-
mum element since elements such as those belonging to the noble gases
are not contained in any molecule.

Example 3.
Consider the set Z of all integers. If m and n are non-zero integers, we will
say that “m divides n” if there exists a positive integer k such that mk = n.
We will define the relation R on Z as follows:
R = {(0, n) : n ∈ Z} ∪ {(m, n) : m divides n} ∪ {(m, n) : m < 0 and n > 0}
We first determine the most elementary properties of this relation.

− We confirm that the relation, R, is reflexive on Z: For every non-zero
integer m, m × 1 = m so (m, m) ∈ R. Also (0, 0) ∈ R. Then R is a
reflexive relation on Z.
− We confirm that the relation, R, is transitive on Z: Let m, n and k
be non-zero integers in Z. Case 1: If (0, m) ∈ R and (m, n) ∈ R, then,
trivially, (0, n) ∈ R. Case 2: By considering each of the possible positive-
negative cases for the values of m, n and k we see that if (m, n) ∈ R
and (n, k) ∈ R, then n = rm and k = sn = srm, then (m, k) ∈ R.
Then R is a transitive relation on Z.
− We confirm that the relation, R, is antisymmetric on Z: If m and n
are both negative or both positive, then if m divides n and n divides
m, then it must be the case that m = n. If m is negative and n is
positive, then (m, n) ∈ R but (n, m) ∉ R. Finally (0, m) ∈ R for all
m, but (m, 0) ∈ R only if m = 0. We conclude that R is antisymmetric.

We conclude that the relation R partially orders the set Z.

The expression “a ≤R b” will be another way of stating that (a, b) ∈ R. We
will write a <R b if a ≤R b and a ≠ b. We see that R does not linearly order
Z since neither “3 divides 5” nor “5 divides 3” holds true. So 3 and 5 are not
comparable under R. We list a few more properties of the order structure
imposed on Z by R.
− It is explicitly stated in the definition of the relation R that 0 is the
minimum element of Z. Given any positive integer n, n + 1 does not
divide n and so n cannot be a maximum element of Z. So no integer is
a maximum element of Z with respect to the relation R.
− We will say that a chain C is a maximal chain if no larger chain in Z
properly contains C. If p represents a positive prime number, maximal
chains must begin with 0 <R −1 <R −p, when listed in the order
dictated by R. For example,
0 <R −1 <R −3 <R −6 <R −12 <R · · · <R 1 <R 7 <R 14 <R 28 <R 56 <R · · ·
is an example of a chain with respect to the order relation R. But many
distinct chains may begin with 0 <R −1 <R −3. For example,
0 <R −1 <R −3 <R −9 <R −27 <R · · · <R 1 <R 2 <R 4 <R 12 <R 24 <R · · ·
In the order structure defined by R all maximal chains contain the
number one.
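
The claims of Example 3 can be tested on a finite portion of Z. The Python sketch below is an illustration under stated assumptions: the window −12, . . . , 12 is our own choice, and the function names `divides` and `related` are hypothetical. Since the restriction of a partial order to a subset is again a partial order, the three checks should succeed on the window.

```python
def divides(m, n):
    # "m divides n" as in the text: m and n are non-zero and
    # m * k == n for some positive integer k.
    return m != 0 and n != 0 and n % m == 0 and n // m > 0

def related(m, n):
    # Membership of (m, n) in R, one clause per set in the union.
    return m == 0 or divides(m, n) or (m < 0 and n > 0)

window = range(-12, 13)    # assumed finite window of Z
R = {(m, n) for m in window for n in window if related(m, n)}

reflexive = all((m, m) in R for m in window)
antisymmetric = all(m == n for (m, n) in R if (n, m) in R)
transitive = all((m, k) in R for (m, n) in R for (j, k) in R if n == j)

# 3 and 5 are not comparable under R.
incomparable_3_5 = (3, 5) not in R and (5, 3) not in R

# An initial portion of one of the chains listed above.
chain = [0, -1, -3, -6, -12, 1, 7]
is_chain = all((chain[i], chain[j]) in R
               for i in range(len(chain))
               for j in range(i + 1, len(chain)))

zero_is_minimum = all((0, n) in R for n in window)

print(reflexive, antisymmetric, transitive)
print(incomparable_3_5, is_chain, zero_is_minimum)
```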

6.4 If x is a set, can we prove that x ∉ x?

We wondered about this, early on (on page 21).
It turns out that to prove that a set cannot be an element of itself, we require
the Axiom of regularity. We have not yet invoked this Axiom. It states:
“Every non-empty set A contains an element x whose intersection
with A is empty.”
We will say more about this in Chapter 31 on page 350 where we will prove
the following equivalent form of the Axiom of regularity:
“The Axiom of regularity holds true if and only if every non-empty
set S contains a minimal element with respect to the membership
relation “∈”.”
We remind ourselves of what the “minimal element of a set” means:
“ The set m is a minimal element of T with respect to the mem-
bership relation, ∈, if there does not exist an element b in T such
that b ∈ m”.
Let’s consider the set

T = {a : a is a set and a ∈ a}

Suppose T is a non-empty set. We invoke the equivalent form of the Axiom
of regularity (stated above) which guarantees that T must have a minimal
element with respect to ∈, say y. Since y belongs to T then y ∈ y. So y
both belongs to T and is a minimal element of T with respect to ∈. Since
y is a minimal element of T , no element of T is ∈-less than y. Then y ∉ y.
We have a contradiction. The source of the contradiction is our assumption
that T is non-empty. Hence T is empty; that is, no set is an element of itself.

Concepts review:

1. What is the “empty relation” on a class S?

2. What does it mean to say that the relation R on S is reflexive?
3. What does it mean to say that the relation R on S is irreflexive?
4. What does it mean to say that the relation R on S is symmetric?
5. What does it mean to say that the relation R on S is asymmetric?
6. What does it mean to say that the relation R on S is antisymmetric?
7. What does it mean to say that the relation R on S is transitive?
8. What does it mean to say that any two elements a and b in S are
comparable under the relation R?
9. What is an equivalence relation R on a class S?
10. What is a partially ordered class?
11. Give an example of a partial ordering on P(S).
12. What is a partially ordered set?
13. What is a strictly ordered class?
14. Give an example of a strict ordering on P(S).
15. What is a poset?
16. If R is a partial order on the set S, what does it mean to say that
a is a maximal element of S? What does it mean to say that a is a
minimal element of S?
17. What is a chain in a partially ordered set?
18. What is the maximum element of a partially ordered set?
19. What is a linearly ordered set?

EXERCISES

A. 1. Suppose R is a reflexive relation on a class S. Show that IdS ⊆ R.

2. Show that if R is reflexive, then R−1 is reflexive.
3. Show that if R is symmetric, then R−1 is symmetric.
4. Show that if R is transitive, then R−1 is transitive.

B. 5. Show that if R is asymmetric, then R ∩ R−1 = ∅.

6. Show that if R is an equivalence relation, then R◦ R = R.
7. Suppose T is a reflexive relation on a class S. Show that for every relation
R on S, R ⊆ T ◦ R and R ⊆ R◦ T .

C. 8. Show that if R is a partial order relation on S, then R ∩ R−1 = IdS and
R ◦ R = R.
9. Show that if R is a partial order relation on S, then so is R−1 .
10. Suppose R is an equivalence relation on a class S. Show that if H and J
are relations on S, then R ⊆ (H ∩ J) ⇒ R ⊆ H ◦ J.
11. Show that if a is a maximum element of a partially ordered set S, then it
is the only maximal element of S.
12. Show that if a is a maximal element of a linearly ordered set S, then it is
a maximum element of S.
64 Section 7: Partitions induced by equivalence relations

7 / Partitions induced by equivalence relations

Abstract. In this section we show that an equivalence relation R on a set
S can be used to subdivide S into pairwise disjoint subsets. The subsets
when viewed together are called a partition of S. We illustrate how such
partitions are obtained.

7.1 Subdividing a set S using an equivalence relation R on S .

Suppose R is an equivalence relation on the set S. We will show that this
equivalence relation R, no matter how it is defined, can be used to subdivide
the set S into a collection of non-empty pairwise disjoint¹ subsets.
Furthermore, no element of S will be left out in this process (in the sense that
every element of S will belong to one of these subsets).
The process is similar to subdividing a school’s student population, S, into
smaller subgroups {S1 , S2 , S3 , . . .} based on some predefined student char-
acteristics; the characteristics are such that no student can belong to two
subgroups.
The technique of subdividing a set into a set of subsets by using an
equivalence relation is a fundamental way of creating new sets from a given set.
It is practiced in many branches of mathematics and is part of general
mathematical culture, so it is taught early on in most mathematics programs.
We will progress through this section slowly, so as to allow
readers to see how reflexivity, symmetry and transitivity of a relation are
required in this process.
Note: Even though most of the results proven from here on can apply to
classes, we will only prove these for the case of sets.

Notation 7.1 Let R be an equivalence relation on a set S and let x ∈ S. We
define the set Sx as follows:

Sx = {y : (x, y) ∈ R} = {y : xRy}
That is, Sx is the set² of all elements y in S such that x is related to y under
R.

¹ A set 𝒮 of sets whose elements are “pairwise disjoint” means that for any two distinct
sets A and B in 𝒮, A ∩ B = ∅.
² We justify that Sx is a “set” as follows: We have that Sx is a subclass of the class S;
given that S is declared to be a set, by axiom A4 (Axiom of subset) Sx is a set.
The following theorem statements show, step-by-step, how an equivalence
relation, R, on S partitions the set S into subsets, {Sx : x ∈ S}. The reader
should notice how all three relation properties which characterize equivalence
relations are required to partition the set S in this way.
Let
SR = {Sx : x ∈ S}
We first verify that no element of SR is the empty set. To do this, we invoke
the reflexive property of R.

Theorem 7.2 Let R be an equivalence relation on a set S. Let x ∈ S. Then
Sx is non-empty.

Proof:
Suppose R is an equivalence relation on the set S and x ∈ S.
Since R is reflexive, (x, x) ∈ R so Sx contains x. Then Sx is non-empty.

Since x ∈ Sx for each x ∈ S, we have S ⊆ ⋃_{x∈S} Sx . Since Sx ⊆ S for each
x ∈ S, we also have ⋃_{x∈S} Sx ⊆ S, and so S = ⋃_{x∈S} Sx .

Theorem 7.3 Let R be an equivalence relation on a set S. Let x and y be
two elements in S such that (x, y) ∈ R. Then Sx = Sy .
Proof:
What we are given:
· R is an equivalence relation
· x, y ∈ S
· (x, y) ∈ R
What we are required to show: Sx = Sy .
We claim that Sx ⊆ Sy . Let z ∈ Sx . Then (x, z) ∈ R. Since (x, y) ∈ R and
R is symmetric, (y, x) ∈ R. Then [(y, x) ∈ R and (x, z) ∈ R] ⇒
[(y, z) ∈ R] (since R is transitive). So z ∈ Sy . Thus, Sx ⊆ Sy as claimed.
We prove in a similar way that Sy ⊆ Sx . Thus, Sx = Sy as required.

Next we show Sx ∩ Sy = ∅ if and only if (x, y) ∉ R. Symmetry, reflexivity and
transitivity of R are invoked.
Clearly, if Sx ∩ Sy = ∅, then (x, y) ∉ R, for if (x, y) ∈ R then y ∈ Sx and,
since R is reflexive, y ∈ Sy , so Sx ∩ Sy ≠ ∅.

Theorem 7.4 Let R be an equivalence relation on a set S. Let x, y ∈ S such
that (x, y) ∉ R. Then Sx ∩ Sy = ∅.
Proof:
We are given that R is an equivalence relation and (x, y) ∉ R.
Suppose Sx ∩ Sy is non-empty. That is, suppose z ∈ Sx ∩ Sy . Then (x, z)
and (y, z) both belong to R. Since R is symmetric, (z, y) also belongs to
R. But (x, z) ∈ R and (z, y) ∈ R ⇒ (x, y) ∈ R (since R is an equivalence
relation and so is transitive). This contradicts our hypothesis: (x, y) ∉ R.
Then Sx ∩ Sy must be empty.

We have seen that an equivalence relation R on a set S can be used to
express S as a union, ⋃_{x∈S} Sx , of pairwise disjoint subsets, Sx , of S. Let

SR = {Sx : x ∈ S}

denote the set of all the sets formed from the equivalence relation R.
We verify that the class, SR , is indeed a set: Since S is declared to be a set,
then P(S) is a set (by the Axiom of power set); since SR is a subclass of
the set P(S), then, by the Axiom of subset, SR is a set.
We summarize three important properties of SR :
1) ⋃_{x∈S} Sx = S.
2) Sx ≠ Sy ⇒ Sx ∩ Sy = ∅.
3) Sx ≠ ∅ for all x ∈ S.
The three properties together describe what is called a partition of a set S.³
A proper understanding of the method used to partition a set S in this way
is important in our study of set theory.
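
The construction of SR can be carried out explicitly when S is finite. In the Python sketch below, the relation “x − y is a multiple of 3” on {0, 1, . . . , 9} is an assumption chosen only for illustration (it does not appear in the text above), and the name `equivalence_class` is a hypothetical helper. The sketch builds each Sx and checks the three properties just summarized.

```python
# Assumed example: congruence modulo 3 on a small set of integers.
S = set(range(10))
R = {(x, y) for x in S for y in S if (x - y) % 3 == 0}

def equivalence_class(x):
    """S_x = {y in S : (x, y) in R}, as a frozenset so it can be a set element."""
    return frozenset(y for y in S if (x, y) in R)

S_R = {equivalence_class(x) for x in S}

covers = set().union(*S_R) == S
pairwise_disjoint = all(A.isdisjoint(B) for A in S_R for B in S_R if A != B)
non_empty = all(len(A) > 0 for A in S_R)

# Theorem 7.3 on this example: related elements have the same class.
same_class = all(equivalence_class(x) == equivalence_class(y) for (x, y) in R)

for block in sorted(S_R, key=min):
    print(sorted(block))
print(covers, pairwise_disjoint, non_empty, same_class)
```

Although S has ten elements, SR has only three: many different x give the same set Sx.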

7.2 Examples of partitions induced by an equivalence relation on a set.
1) The identity relation, IdS , on S is defined as follows: (x, y) ∈ IdS if
and only if x = y. The identity relation on S is easily seen to be an
equivalence relation. Then

SIdS = {Sx : x ∈ S}

Since, for each x ∈ S,

Sx = {y ∈ S : (x, y) ∈ IdS }
= {y ∈ S : x = y}
= {x}

then
SIdS = {{x} : x ∈ S}⁴
The set, SIdS , represents the “finest” partition possible of the set S.

³ If a set SR = {Sx : x ∈ S} of subsets of S is such that ⋃_{x∈S} Sx = S, we often say “SR
covers S” to express this fact.
2) Consider the relation R on the set S defined as follows: (x, y) ∈ R if
x and y both belong to S. The relation R is easily verified to be an
equivalence relation on S. Observe that, for each x ∈ S,

Sx = {y ∈ S : (x, y) ∈ R}
= {y ∈ S : x and y belong to S}
= {y : y ∈ S}
= S

then
SR = {S}
Then the set SR contains only one element, S. Note that it would be
incorrect to write SR = S or SR = {{S}}. The set SR is referred to
as being the “coarsest” partition of S.
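
Both extreme partitions can be computed for a small set. The Python sketch below is an illustration only; the three-element set S and the helper name `quotient` are our own assumptions.

```python
def quotient(S, R):
    """The set {S_x : x in S}, where S_x = {y in S : (x, y) in R}."""
    return {frozenset(y for y in S if (x, y) in R) for x in S}

S = {"a", "b", "c"}                          # assumed sample set
identity = {(x, x) for x in S}               # the relation Id_S
everything = {(x, y) for x in S for y in S}  # all of S x S

finest = quotient(S, identity)
coarsest = quotient(S, everything)

print(sorted(sorted(block) for block in finest))
print(sorted(sorted(block) for block in coarsest))
print(len(finest), len(coarsest))
```

The finest partition has as many elements as S, each a singleton, whereas the coarsest partition has exactly one element, namely S itself. It is a set containing S and is not equal to S.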

Concepts review:

1. Given an equivalence relation R on a set S and an element x ∈ S,
what does the symbol Sx represent?
2. Given an equivalence relation R on a set S, what does the symbol
SR represent?
3. Suppose 𝒮 denotes a set of subsets of S. What does it mean to say
that 𝒮 partitions S?
4. Given an equivalence relation R on a set S, what are the essential
properties of SR ?
5. What is the “finest” partition of S obtainable by an equivalence
relation on S?
6. What is the “coarsest” partition of S obtainable by an equivalence
relation on S?

⁴ Note that it would be incorrect to write SIdS = {x : x ∈ S}.

EXERCISES

A. 1. Let S be a set and let R and T be two equivalence relations on S. Is the
relation V = R ∪ T necessarily an equivalence relation on S? Explain.
2. Suppose R is an equivalence relation on a set S and that A ⊆ S where A
is non-empty. We define the relation RA on A as follows:

RA = {(x, y) : (x, y) ∈ R ∩ (A × A)}

Show that RA is an equivalence relation on A.

3. Suppose R is a partial order relation on a set S and that A ⊆ S where A
is non-empty. We define the relation RA on A as follows:

RA = {(x, y) : (x, y) ∈ R ∩ (A × A)}

Show that RA is a partial order relation on A.

B. 4. Let S be a set and let R and T be two equivalence relations on S. Show
that V = R ∩ T is an equivalence relation on S.
5. Suppose R and T are two equivalence relations on a set S. For each x ∈ S,
let R Sx = {y : y ∈ S, (x, y) ∈ R} and T Sx = {y : y ∈ S, (x, y) ∈ T }. If, for
each x ∈ S, R Sx ⊆ T Sx show that R ⊆ T .
6. Suppose R and T are two equivalence relations on a set S. For each x ∈ S,
let R Sx = {y : y ∈ S, (x, y) ∈ R} and T Sx = {y : y ∈ S, (x, y) ∈ T }. If
R ⊆ T , show that for each x ∈ S, R Sx ⊆ T Sx .

C. 7. Let S be a set and let R and T be two equivalence relations on S. For each
x ∈ S let R Sx = {y : y ∈ S, (x, y) ∈ R} and T Sx = {y : y ∈ S, (x, y) ∈ T }.
Let SR = {R Sx : x ∈ S} and ST = {T Sx : x ∈ S}. We have seen
that SR and ST form sets of non-empty subsets of S which are pairwise
disjoint and cover all of S. For each x ∈ S, let Sx = R Sx ∩ T Sx . Show that
𝒮 = {Sx : x ∈ S} forms a set of subsets of S satisfying the properties:
a) Sx ≠ ∅ for each x.
b) Whenever Sx ≠ Sy then Sx ∩ Sy = ∅.
c) ⋃_{x∈S} Sx = S.

8. Let S and T be sets. It has been shown that “⊆” constitutes a partial order
relation on the set P(S). Consider the set L = P(S) × P(T ). We define
a relation R on L as follows: For (A, B) and (C, D) ∈ P(S) × P(T ),

((A, B), (C, D)) ∈ R ⇔ [A ⊂ C] or [A = C and B ⊆ D]

Show that R is a partial ordering relation on L.⁴

⁴ This relation is often called the lexicographic ordering of a Cartesian product.

70 Section 8: Equivalence classes and quotient sets

8 / Equivalence classes and quotient sets

Abstract. In this section we continue our discussion of partitions of a set
S by an equivalence relation. When a set S is partitioned by an equivalence
relation R, the subsets in this partition are called “equivalence classes in-
duced by R”. These equivalence classes when viewed together are called a
“quotient set of S induced by R”. We then show that any partition of S is
induced by some equivalence relation R.

8.1 More on partitions.

We have seen that an equivalence relation, R, on a set, S, subdivides S
into pairwise disjoint subsets that cover all of S. Tools that allow us to sys-
tematically partition sets into subsets are important in mathematics. In set
theory, this can be a method to construct a new set from a known set S
by partitioning S into smaller pieces and forming a new set whose elements
are those pieces. We have casually spoken of partitions with an intuitive
understanding of what they are. We should formally define them before we
go on.

Definition 8.1 Let S be a set. We say that a set of subsets C ⊆ P(S) forms
a partition of S if C satisfies the three properties:
1) ⋃_{A ∈ C} A = S.
2) If A and B ∈ C and A ≠ B, then A ∩ B = ∅.
3) A ≠ ∅ for all A ∈ C .
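The three properties of Definition 8.1 can be checked mechanically on small finite sets. The following is a minimal sketch; the function name is_partition and the sample sets are illustrative assumptions, not notation from the text.

```python
def is_partition(S, C):
    """Return True if the collection C of subsets is a partition of the set S."""
    blocks = [frozenset(A) for A in C]
    # Property 1: the union of all members of C is S.
    covers = set().union(*blocks) == set(S)
    # Property 2: distinct members of C are disjoint.
    disjoint = all(A == B or not (A & B) for A in blocks for B in blocks)
    # Property 3: no member of C is empty.
    non_empty = all(len(A) > 0 for A in blocks)
    return covers and disjoint and non_empty


S = {"a", "b", "c", "d", "e"}
good = [{"a", "b"}, {"c", "d"}, {"e"}]
overlapping = [{"a", "b"}, {"b", "c", "d"}, {"e"}]

print(is_partition(S, good))         # True
print(is_partition(S, overlapping))  # False: two members share b
```

Each property is tested separately so that a failing collection can be traced back to the clause of the definition it violates.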

Based on this definition, we can see that the set of subsets of S, SR = {Sx :
x ∈ S}, formed by the equivalence relation, R, is a partition of the set S.1
The elements of SR are given a particular name.

Definition 8.2 Suppose S is a set on which we have defined an equivalence

relation, R.
a) Each element Sx of SR = {Sx : x ∈ S} is called an equivalence class of
x under R or an equivalence class induced by the relation R.2
1 We can say SR partitions S.
2 The expression “equivalence class of x modulo R” is sometimes used.

b) The set, SR = {Sx : x ∈ S}, of all equivalence classes induced by the

relation, R, is called the quotient set of S induced by R. The set, SR , is
also commonly represented by the symbol, S/R. So
S/R = {Sx : x ∈ S}
From here on, we will use the more common notation, S/R.
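For a finite set, the equivalence classes and the quotient set of Definition 8.2 can be computed directly from the relation. The sketch below assumes the relation is stored as a set of ordered pairs; the names equivalence_class and quotient_set and the parity relation are hypothetical choices made for illustration.

```python
def equivalence_class(S, R, x):
    """Sx = {y in S : (x, y) in R}."""
    return frozenset(y for y in S if (x, y) in R)


def quotient_set(S, R):
    """S/R = {Sx : x in S}, stored as a set of frozensets."""
    return {equivalence_class(S, R, x) for x in S}


S = {1, 2, 3, 4}
# Illustrative relation: x is related to y when both have the same parity.
R = {(x, y) for x in S for y in S if x % 2 == y % 2}

print(quotient_set(S, R) == {frozenset({1, 3}), frozenset({2, 4})})  # True
```

Because S/R is a set, two elements with the same equivalence class contribute only one element to the quotient set.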

Note that if S is a set, then by the Axiom of subset, the equivalence classes in
S/R = SR = {Sx : x ∈ S} are in fact “sets”. But if S is a proper class, then
the elements of S/R = {Sx : x ∈ S} may themselves be proper classes.

8.2 Examples of quotient sets induced by an equivalence relation R.

In the following examples, S and T are non-empty sets.
a) Recall that IdS denotes the identity relation on S:
(x, y) ∈ IdS if and only if x = y
We have seen on page 55 that IdS is an equivalence relation on S. For
each x ∈ S, the equivalence class of x induced by IdS is
Sx = {y : (x, y) ∈ IdS }
= {y : x = y}
= {x}
So the quotient set of S induced by IdS is
S/IdS = SIdS = {{x} : x ∈ S}
The set SIdS is the largest possible quotient set on S induced by a relation.
We also say that this quotient set is the “finest” partition of S.
b) Let R be the relation defined as follows: (x, y) ∈ R if and only if x and
y belong to S. Then R = S × S was shown to be an equivalence relation
on S. For x ∈ S,
Sx = {y : y ∈ S, (x, y) ∈ R}
= {y : y ∈ S}
= S
Hence, for every x ∈ S, the equivalence class of x induced by R is Sx = S.
So the quotient set of S induced by R only contains the element Sx = S
and no other. Since
S/R = SR = {S}
it is the smallest possible quotient set of S induced by a relation. We also
say that this quotient set is the “coarsest” partition of S.

c) For non-empty sets S and T , let D = S × T . We define a relation, R, on

D as follows:
((a, b), (c, d)) ∈ R if and only if a = c
We have seen on page 55 that R is an equivalence relation on D. Let
(a, b) ∈ D. Then

S(a,b) = {(x, y) : (x, y) ∈ D, x = a}

= {(a, y) : y ∈ T }
= {a} × T

Thus, {a}×T is the equivalence class of (a, b) induced by R. The quotient

set of D induced by R is

D/R = DR = {{x} × T : x ∈ S}
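Examples a), b) and c) can be reproduced on small finite sets. This is only a sketch: the sets S = {1, 2, 3} and T = {p, q} and the helper quotient_set are assumptions chosen for illustration.

```python
def quotient_set(S, R):
    """Return {Sx : x in S} where Sx = {y in S : (x, y) in R}."""
    return {frozenset(y for y in S if (x, y) in R) for x in S}


S = {1, 2, 3}
T = {"p", "q"}

# a) The identity relation gives the finest partition: one singleton per element.
identity = {(x, x) for x in S}
finest = quotient_set(S, identity)

# b) The relation S x S gives the coarsest partition: the single class S.
everything = {(x, y) for x in S for y in S}
coarsest = quotient_set(S, everything)

# c) On D = S x T, relate (a, b) and (c, d) when a = c.
D = {(a, b) for a in S for b in T}
same_first = {(p, q) for p in D for q in D if p[0] == q[0]}
D_mod_R = quotient_set(D, same_first)

print(len(finest), len(coarsest), len(D_mod_R))  # 3 1 3
```

In example c) each class has the form {a} × T, so the quotient set has exactly as many elements as S.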

8.3 Equivalence relations defined from a partition.

We have seen how any equivalence relation R on a set S can be used to
partition S into pairwise disjoint subsets. We will now work the other way
around: If we are given a partition, C , of S, is there an equivalence relation,
R, such that SR = C ? We will see if we can construct the required equiva-
lence relation.
Let S be a set on which we have defined a partition C .
This means that C is a class of non-empty pairwise disjoint subsets of S
which covers all of S. Suppose we define a relation RC on S in the following
way:

(x, y) ∈ RC if and only if {x, y} ⊆ C for some element C in C

Thus, the only pairs of elements x and y of S which are related under RC
are those pairs that appear together in the same subset C ∈ C . Is RC an
equivalence relation?
− We verify that RC is reflexive: For every x ∈ S, x belongs to some C
and so {x} = {x, x} ⊆ C and so (x, x) ∈ RC .
− We verify symmetry of RC : If {x, y} ⊆ C ∈ C , then {y, x} ⊆ C. So
(x, y) ∈ RC ⇒ (y, x) ∈ RC .
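− We verify transitivity of RC : Suppose {x, y} ⊆ C and {y, z} ⊆ C′ for
some C and C′ in C . Then y ∈ C ∩ C′ and, since distinct elements of C
are disjoint, C = C′. Hence {x, z} ⊆ C. So (x, y) ∈ RC and (y, z) ∈ RC
⇒ (x, z) ∈ RC .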
The relation RC is indeed an equivalence relation on S. We conclude that
any partition C of a set S defines an equivalence relation RC on S. This
result deserves to be called a theorem.

Theorem 8.3 Let S be a set and C be a partition of S. Let RC be the relation

such that (x, y) ∈ RC if and only if {x, y} ⊆ C for some element, C, in C .
Then RC is an equivalence relation on S.
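The construction in Theorem 8.3 can be tried out on a finite partition. The sketch below builds RC from a partition and checks the three defining properties; the names relation_from_partition and is_equivalence are hypothetical. The second collection is deliberately not a partition, to show where pairwise disjointness is needed.

```python
def relation_from_partition(C):
    """R_C = {(x, y) : {x, y} is contained in some member of C}."""
    return {(x, y) for block in C for x in block for y in block}


def is_equivalence(S, R):
    reflexive = all((x, x) in R for x in S)
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((x, w) in R for (x, y) in R for (z, w) in R if y == z)
    return reflexive and symmetric and transitive


S = {"a", "b", "c", "d", "e"}
C = [{"a", "b"}, {"c", "d"}, {"e"}]
R_C = relation_from_partition(C)
print(is_equivalence(S, R_C))  # True

# A cover whose members overlap: b lies in two members, so transitivity fails.
not_a_partition = [{"a", "b"}, {"b", "c"}]
print(is_equivalence({"a", "b", "c"}, relation_from_partition(not_a_partition)))  # False
```

In the second case (a, b) and (b, c) are related but (a, c) is not, which is exactly the situation excluded by the disjointness of the members of a partition.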

8.4 Refining an equivalence relation3 .

Let S = {a, b, c, d, e}. Suppose R, T, K, IdS and M are equivalence relations
on S which are defined as follows:
R = {(a, a), (b, b), (c, c), (d, d), (e, e), (a, b), (b, a), (d, c), (c, d)}
T = {(a, a), (b, b), (c, c), (d, d), (e, e), (a, b), (b, a), (c, d), (d, c), (c, e), (e, c), (d, e), (e, d)}
K = {(a, a), (b, b), (c, c), (d, d), (e, e), (a, c), (c, a), (a, b), (b, a), (c, b), (b, c), (d, e), (e, d)}
IdS = {(x, y) : x ∈ S, y ∈ S and x = y} = {(a, a), (b, b), (c, c), (d, d), (e, e)}
M = {(x, y) : x ∈ S, y ∈ S} = S × S

Verify that R, T , K and M are indeed equivalence relations. (Note that it

suffices to show that R, T and K partition S.)
We describe explicitly the elements of SR , ST , SK , SIdS and SM . (Recall
that these can also be expressed in the form S/R, S/T , S/K, S/IdS , S/M ):4
SR = {R Sa , R Sb , R Sc , R Sd , R Se }
= {{a, b}, {c, d}, {e}}
ST = {T Sa , T Sb , T Sc , T Sd , T Se }
= {{a, b}, {c, d, e}}
SK = {K Sa , K Sb , K Sc , K Sd , K Se }
= {{a, b, c}, {d, e}}
SIdS = {IdS Sa , IdS Sb , IdS Sc , IdS Sd , IdS Se }
= {{a}, {b}, {c}, {d}, {e}}
SM = {M Sa , M Sb , M Sc , M Sd , M Se }
= {{a, b, c, d, e}} = {S}

We now make a few observations:

a) We see that each pair in R belongs to the relation T . So we can write
R ⊆ T . Similarly, we see that IdS is a subset of each of R, T and
K. But this relationship doesn’t hold true between R and K: The pair
(c, d) is an element of R but not of K. So R ⊈ K.
b) Notice how every equivalence class under R is contained in an equiva-
lence class under T :
{a, b} ⊆ {a, b}
{c, d} ⊆ {c, d, e}
{e} ⊆ {c, d, e}
Similarly, every equivalence class under IdS is a subset of some equiva-
lence class under R. The same can be said for the relationship between
3 This section can be omitted without loss of continuity.
4 Recall that R Sx = {y : (x, y) ∈ R} where R is an equivalence relation on S.

the equivalence classes of IdS and of K. But this relationship between

the equivalence classes under R and the equivalence classes under K
doesn’t hold true. Witness:
{c, d} ⊈ {a, b, c}
{c, d} ⊈ {d, e}

c) In cases such as R and T above, we say that the equivalence relation

R refines or is a refinement of the relation T . We can make the more
general statement:
An equivalence relation R refines the equivalence relation T
whenever R ⊆ T .
This is the same as saying “R refines T if every equivalence class under
R is a subset of some equivalence class under T ”. We see that IdS will
refine any equivalence relation R. We see that R does not refine K since
R ⊈ K. Similarly, we see that T does not refine K since T ⊈ K.
d) Also note that no matter which equivalence relation R on S we consider,
R ⊆ M , and so M is refined by any equivalence relation on S.
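Observations a) to d) can be verified on the relations R, T, K, IdS and M of this example. In the sketch below the relations are rebuilt from their equivalence classes for brevity; the helper names pairs_from, classes, refines and classes_nest are hypothetical.

```python
def pairs_from(blocks):
    """Build the relation whose equivalence classes are the given blocks."""
    return {(x, y) for b in blocks for x in b for y in b}


def classes(S, R):
    return {frozenset(y for y in S if (x, y) in R) for x in S}


def refines(R, T):
    """R refines T when R is a subset of T."""
    return R <= T


def classes_nest(S, R, T):
    """Every class under R is contained in some class under T."""
    return all(any(A <= B for B in classes(S, T)) for A in classes(S, R))


S = {"a", "b", "c", "d", "e"}
R = pairs_from([{"a", "b"}, {"c", "d"}, {"e"}])
T = pairs_from([{"a", "b"}, {"c", "d", "e"}])
K = pairs_from([{"a", "b", "c"}, {"d", "e"}])
Id = pairs_from([{x} for x in S])
M = pairs_from([S])

print(refines(R, T), refines(R, K), refines(T, K))  # True False False
print(refines(Id, R), refines(K, M))                # True True
```

On these five relations the subset test and the class-by-class test give the same answer for every pair, in line with observation c).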

Concepts review:
1. Given a set S, what does it mean to say that the class C of subsets
of S partitions S?
2. Given an equivalence relation R on a set S and an element x ∈ S,
what is an equivalence class of x under R? How is it denoted?
3. Given an equivalence relation R on a set S, what is the quotient set
of S induced by R? How is it denoted?
4. Given an equivalence relation R on a set S, what do the expressions
SR and S/R mean?
5. Given a partition C of a set S, can we define an equivalence relation
R on S such that S/R = C ?
6. What does it mean to say that an equivalence relation R refines the
equivalence relation T ?
7. Is there an equivalence relation on a set S that refines all other
equivalence relations?
8. Is there an equivalence relation on a set S that is refined by all
other equivalence relations?

EXERCISES

A. 1. Suppose R is an equivalence relation on a set S and that A ⊆ S where A

is non-empty. We define the relation RA on A as follows:
RA = {(x, y) : (x, y) ∈ R ∩ (A × A)}
a) Show that RA is an equivalence relation on A.
b) If S/R = SR = {Sx : x ∈ S} represents the quotient set of S induced
by R, describe the elements of the quotient set, SRA , of A induced by
RA .

B. 2. Let S be a set and let R and T be two equivalence relations on S.

a) Show that V = R ∩ T is an equivalence relation on S.
b) If S/R = SR = {R Sx : x ∈ S} and S/T = ST = {T Sx : x ∈ S} repre-
sent the quotient set of S induced by R and induced by T respectively,
describe the elements of the quotient set of S, SV , induced by V .
3. Suppose R and T are two equivalence relations on a set S. For each x ∈ S,
let S/R = SR = {R Sx : x ∈ S} and S/T = ST = {T Sx : x ∈ S} represent
the quotient set of S induced by R and induced by T respectively. If R ⊆ T ,
show that for each x ∈ S,
T Sx = ⋃_{y ∈ T Sx} R Sy

4. Let R and T be two equivalence relations on a set S. If R ⊆ T we say that

R is finer than T (or is a refinement of T ). The choice of these expres-
sions when comparing two equivalence relations is suggested by the result
described in problem 3: The equivalence classes of a finer equivalence relation
all seem to fit neatly inside the equivalence classes of a coarser equivalence
relation. Give the finest possible equivalence relation on S. Give the coarsest
possible equivalence relation on S.
5. Let S = {a, b, c, d, e, f} be a set. Suppose R and T are equivalence relations
on S defined as follows:
R = IdS ∪ {(a, b), (b, a), (b, c), (c, b), (a, c), (c, a), (d, c), (c, d)}
T = IdS ∪ {(b, c), (c, b)}
Write out explicitly the elements of the sets SR and ST .

C. 6. Let S be a set. Let R and T be two equivalence relations on S where R ⊆ T

(that is R is finer than T ). For each x ∈ S, let S/R = SR = {R Sx : x ∈ S}
and S/T = ST = {T Sx : x ∈ S} represent the quotient set of S induced
by R and induced by T respectively. We define the quotient of T by R,
denoted T /R, as follows:
T /R = {(R Sx , R Sy ) : (x, y) ∈ T }

From this definition we see that T /R is a relation on S/R = SR . That is
R Sx and R Sy are related under T /R if and only if x and y are related under
T.

a) Show that T /R is an equivalence relation on S/R = SR .

b) Let K be another equivalence relation on S where R ⊆ T ⊆ K. Show
that T /R ⊆ K/R.
c) Referring to the example on page 73, write out explicitly the elements
of the equivalence relation T /R on SR .
d) Referring to the set S described in question 5 above, write out explic-
itly the elements of the equivalence relation T /R on SR .
Part IV

Functions

9 / Functions: A set-theoretic definition

Abstract. In this section we formally define what we mean by a “func-
tion”. This is done using only the set-theoretic concepts developed up to
now. We introduce notation to simplify the discussion and definition of
these concepts. Examples of simple functions such as the identity function,
the characteristic function and constant functions are presented. Given a
function, f, on a set A, we define the restriction of this function on a sub-
set, D, of A. We state what we mean by “equal functions”. The expressions
“one-to-one”, “injective”, “onto”, “surjective” and “bijective functions”
are also defined.

9.1 A set-theoretic definition.

The concept of a function is not new to most readers. A standard definition
goes something like this: “Given two sets A and B, a function is a rule, f,
which associates to each element of A a single element of B”.
To construct a function, first a property involving two elements x, y is
defined. Say we represent this property by φ(x, y). The property φ is the
blueprint which is used to construct a subset

f = {(x, y) ∈ A × B : x, y satisfies the property φ(x, y)}

of the Cartesian product A × B. So φ is the tool used to distinguish those

ordered pairs (x, y) that belong to f from those that don’t. This defines a
relation, f. If it can be shown that “(x, y) and (x, z) belong to f implies y =
z”, then f is called a function. Since a function f is a subclass of a Cartesian
product of sets, then functions are, by definition, sets. In practice, users
often do not distinguish between the rule φ and the set f whose elements
are determined by it (even though φ is just a formula, while f is a well-
defined set and so is governed by axioms associated with sets). For example,
let A = {a, b} and B = {{a, {a}}, {b, {b}}, {c, {c}}}. The expression

f(a) = {a, {a}}

illustrates the rule for f : A → B. Then f(b) = {b, {b}}. The set which
follows from φ is,

f = { (a, {a, {a}}), (b, {b, {b}}) } ⊂ A × B

For practical reasons, we normally just use the symbol f to represent both
the rule and the set which flows from it. Opportunities to say more about
the notion of a function abound in the following chapters in this text.
Our objective will be to define the concept of a function within the ZFC-
universe, without adding any new primitive concepts to the three we already

have: “class”, “set”, “belongs to”. We must formulate this definition care-
fully so that it represents precisely what we want and understand it to be.
For most readers, the notion of a “function” is intrinsically linked to those
sets we call “numbers”, commonly studied in the form of polynomial,
trigonometric or exponential functions. We will see that functions hover,
in the abstract, well above those sets we will call numbers. We have not yet
shown how numbers can be constructed using our ZFC axioms. This is yet
to come. Studying functions in the absence of numbers will allow readers to
better see, in essence, what they truly are.

Definition 9.1 A function f mapping elements from a set A into a set B is
a triple ⟨f, A, B⟩ satisfying the following properties1 :

1) f ⊆ A × B.
2) For every a ∈ A, there exists b ∈ B such that (a, b) ∈ f.2
3) If (a, b) ∈ f and (a, c) ∈ f, then b = c. Equivalently, if (a, b) ∈ f and
(c, d) ∈ f, then (b ≠ d) ⇒ (a ≠ c). 3

From this definition we see that a function f ⊆ A × B is a special type of

relation on A ∪ B with dom f ⊆ A and im f ⊆ B. A function f can also be
viewed as a particular element of P(A × B).
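For finite sets, the three properties of Definition 9.1 can be tested directly on a set of ordered pairs. The following is a minimal sketch; the name is_function and the sample sets are assumptions made for illustration.

```python
def is_function(f, A, B):
    """Check the three properties of Definition 9.1 for a set of pairs f."""
    # 1) f is a subset of A x B.
    inside = all(a in A and b in B for (a, b) in f)
    # 2) Every element of A has at least one image.
    total = all(any(a == x for (x, _) in f) for a in A)
    # 3) No element of A has two different images.
    single = all(b == c for (a, b) in f for (x, c) in f if a == x)
    return inside and total and single


A = {"a", "b"}
B = {1, 2, 3}

print(is_function({("a", 1), ("b", 2)}, A, B))            # True
print(is_function({("a", 1), ("a", 2), ("b", 3)}, A, B))  # False: a has two images
print(is_function({("a", 1)}, A, B))                      # False: b has no image
```

The second and third calls fail properties 3 and 2 respectively, although both sets of pairs are still relations contained in A × B.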

9.2 Commonly used notation when discussing functions.

There is no reason why we should adopt a functional notation which is
different from the one we are accustomed to. We should however explain
carefully how this notation is to be interpreted in set theory.
− Rather than represent a function as ⟨f, A, B⟩, we will write

f :A→B

and say “f maps elements of A into B”. When we write, “f : A → B”, it

will always be understood that A and B are sets.
− If (x, y) ∈ f, we will write
f(x) = y
and say that y is the image of x under f.
1 By “a triple ⟨f, A, B⟩” we mean that a function is characterized by three sets f , A and
B with the described properties.

2 Using logical symbols: ∀a ∈ A ∃ b ∈ B | (a, b) ∈ f .
3 Using logical symbols: [(a, b) ∈ f ] ∧ [(a, c) ∈ f ] ⇒ (b = c).

− We will also say that x is a preimage or an inverse image of the element

y.
The definition of a function, f, states that f is a subset of A × B and
therefore is a relation. If A = B it is a relation on A with the extra condition:
“f(a) ≠ f(b) ⇒ a ≠ b”. Since a function is a relation, we can then speak of
its domain and its image.
− From the definition of “function”, we see that for every x ∈ A, there is
some y ∈ B such that (x, y) ∈ f. So, by definition of the domain of
a relation (see Definition 5.3),

A = dom f

− It may be that not every y ∈ B is such that (x, y) ∈ f for some x in A.

So we must be clear about what we mean by the image of f:

im f = {y ∈ B : (x, y) ∈ f for some x ∈ dom f}

If A is the domain of the function f ⊆ A × B, then we will express the

image, im f, of A under f as

im f = f[A]

− Note that im f is contained in B and need not be equal to B. To distinguish
between f[A] and B we will refer to B as being the codomain of
f, abbreviated as codom f. The words “range of f”, denoted as ran f, are
often used instead of “image of f”. In such cases you will read

ran f = f[A]
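The difference between the image and the codomain is easy to see on a small function stored as a set of ordered pairs. In this sketch the names dom and im mirror the notation of the text, while the sets A and B and the function f are illustrative assumptions.

```python
def dom(f):
    """Domain of f: all first coordinates."""
    return {x for (x, _) in f}


def im(f):
    """Image of f: all second coordinates."""
    return {y for (_, y) in f}


A = {1, 2, 3}
B = {"u", "v", "w"}               # the codomain
f = {(1, "u"), (2, "u"), (3, "v")}

print(dom(f) == A)         # True
print(im(f) == {"u", "v"})  # True
print(im(f) < B)           # True: the image is a proper subset of the codomain
```

Here w belongs to the codomain B but is the image of no element of A.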

9.3 Restricting a function to a subset of its domain.

Suppose we are given a function, f : A → B. Then f is mapping each ele-
ment in its domain A into B. If D ⊆ A, then we may restrict the domain of
f so that it only acts on the elements of D. We will show that f : D → B
is also a function:
Since f : A → B is a function, then by definition, for every x ∈ A, there
exists y ∈ B such that f(x) = y. Since D ⊆ A then for every x ∈ D, x ∈ A;
hence there exists y ∈ B such that f(x) = y. Since the image (under f) of
every x ∈ A is unique, then the same is true for every x ∈ D ⊆ A. Thus,
by definition, f : D → B is a function.
Notation to express the restriction of a function f to a subset of its domain
will be useful. We introduce this now.

Definition 9.2 If f : A → B is a function and D ⊆ A, then we say that the
function f : D → B is a restriction of f to D. In this case we will use the
symbol, f|D , to represent the restriction of f to D. Note that if D ⊆ A, then
we can write, f|D ⊆ f, since

f|D = {(x, y) : x ∈ D and (x, y) ∈ f} ⊆ f

We will now see that a function, f : A → B, can always be expressed as the

union of two functions, provided its domain contains more than one element.

Theorem 9.3 Let f : A → B be a function and suppose A = C ∪ D, where

neither C nor D is empty. Then f = f|C ∪ f|D .
Proof:
Given: A function f : A → B is defined and A = C ∪ D.

f = {(x, y) : (x, y) ∈ A × B and (x, y) ∈ f}
= {(x, y) : (x, y) ∈ (A × B) ∩ f}
= {(x, y) : (x, y) ∈ [(C ∪ D) × B] ∩ f}
= {(x, y) : (x, y) ∈ [ (C × B) ∪ (D × B) ] ∩ f} (Theorem 4.7)
= {(x, y) : (x, y) ∈ [ (C × B) ∩ f ] ∪ [ (D × B) ∩ f ]} (Theorem 3.9)
= {(x, y) : (x, y) ∈ (C × B) ∩ f}
∪ {(x, y) : (x, y) ∈ (D × B) ∩ f}
= {(x, y) : (x, y) ∈ (C × B) and (x, y) ∈ f}
∪ {(x, y) : (x, y) ∈ (D × B) and (x, y) ∈ f}
= f|C ∪ f|D
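Theorem 9.3 can be checked on a finite example. The sketch below represents a function as a set of ordered pairs; the name restrict and the squaring function are hypothetical choices. Note that C and D are allowed to overlap.

```python
def restrict(f, D):
    """f|D = {(x, y) : x in D and (x, y) in f}."""
    return {(x, y) for (x, y) in f if x in D}


A = {1, 2, 3, 4}
f = {(x, x * x) for x in A}   # illustrative function on A
C = {1, 2, 3}
D = {3, 4}                    # A = C ∪ D, with 3 in both

print(restrict(f, C) | restrict(f, D) == f)  # True
print(restrict(f, D) <= f)                   # True: a restriction is a subset of f
```

The pair (3, 9) belongs to both restrictions, which causes no difficulty since the union of two sets ignores repetitions.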

9.4 Equal functions.

We know that two sets are equal provided both sets contain the same ele-
ments. Since functions are defined as being sets of ordered pairs, then we
can establish equality of two functions by comparing the elements of the sets
they represent. If the functions f and g contain the same ordered pairs, then
we can write f = g.

Theorem 9.4 Two functions f : A → B and g : A → B are equal if and only

if f(x) = g(x) for all x ∈ A.

Proof:

f = g ⇔ For any x ∈ A and y ∈ B, (x, y) ∈ f if and only if (x, y) ∈ g
⇔ For any x ∈ A and y ∈ B, f(x) = y if and only if g(x) = y
⇔ For any x ∈ A, f(x) = g(x)

9.5 Some particular types of functions.

We present a few elementary functions with particular properties often en-
countered in various fields of mathematics.

Definition 9.5 Let f : A → B be a function.

a) We say that “f maps A onto B” if im f = B. We often use the expression
“f : A → B is surjective” instead of the word onto.
b) We say that “f maps A one-to-one into B” if, whenever f(x) = f(y),
then x = y. We often use the expression “f : A → B is injective” instead
of the words one-to-one into B.
c) If the function f : A → B is both one-to-one and onto B, we then say
that f is “one-to-one and onto”. Another way of conveying this is to
say that f is bijective, or f is a bijection. So “injective + surjective ⇔
bijective”.
d) Two classes (or sets) A and B for which there exists some bijective func-
tion f : A → B are said to be in one-to-one correspondence.
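The properties in Definition 9.5 can be tested on finite functions. This is a sketch under the assumption that f is already known to be a function stored as a set of ordered pairs; the names is_injective, is_surjective and is_bijective are hypothetical.

```python
def is_injective(f):
    """One-to-one: no two pairs of f share the same second coordinate.
    Assumes f is a function, so each x appears in exactly one pair."""
    images = [y for (_, y) in f]
    return len(images) == len(set(images))


def is_surjective(f, B):
    """Onto B: the image of f is all of B."""
    return {y for (_, y) in f} == set(B)


def is_bijective(f, B):
    return is_injective(f) and is_surjective(f, B)


A = {1, 2, 3}
B = {"u", "v", "w"}
f = {(1, "u"), (2, "v"), (3, "w")}
g = {(x, "u") for x in A}   # a constant function

print(is_bijective(f, B))                      # True
print(is_injective(g), is_surjective(g, B))    # False False
print(is_surjective(g, {"u"}))                 # True
```

The last line shows that whether a function is onto depends on the stated codomain, not only on its set of ordered pairs.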

9.6 A few examples of simple functions.

a) The constant function. Let A and B be two sets and suppose b ∈ B.
Define the function f : A → B as follows:

f(x) = b for all x ∈ A

This function maps all elements of A to the same element of B. We call

this a constant function.
− If B ≠ {b}, then f is not “onto B” or surjective; it is just “into” B.
− However, if we were to write, f : A → {b}, then of course we could
say that f is surjective.
− If A has only one element, say A = {a}, then f : {a} → {b} is a
constant function which is bijective.

b) The characteristic function. Let C be the two-element set C = {∅, {∅}}.4

Let A be a set and D be a non-empty subset of A, such that A−D is also
non-empty. We define a function denoted by χD : A → C as follows5:

χD (x) = ∅ if x ∉ D
χD (x) = {∅} if x ∈ D

This is called the characteristic function of D in A.

− We see that χD is “onto C”.
− The characteristic function is constant on D mapping all elements of
D to the single element {∅}. It is constant on A − D mapping all
elements of A − D to the single element ∅.
− We can write
χD = {(x, ∅) : x ∈ A − D} ∪ {(x, {∅}) : x ∈ D}
= (χD )|A−D ∪ (χD )|D
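The characteristic function can be built literally with the values ∅ and {∅} by using immutable sets. In the sketch below the names EMPTY, ONE and chi and the sample sets A and D are assumptions made for illustration.

```python
EMPTY = frozenset()          # the set ∅
ONE = frozenset({EMPTY})     # the set {∅}


def chi(D, A):
    """Characteristic function of D in A, as a set of ordered pairs."""
    return {(x, ONE if x in D else EMPTY) for x in A}


A = {1, 2, 3, 4}
D = {1, 2}
chi_D = chi(D, A)

# chi_D is the union of its two constant pieces.
on_complement = {(x, EMPTY) for x in A - D}
on_D = {(x, ONE) for x in D}
print(chi_D == on_complement | on_D)            # True
print({y for (_, y) in chi_D} == {EMPTY, ONE})  # True: chi_D is onto C
```

The second check relies on both D and A − D being non-empty, as required in the definition.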
c) Recall that in Theorem 4.9, we showed that the elements of the two
classes A × (B × C) could be matched one-to-one with the elements
of (A × B) × C. The proof of this theorem shows that the function
f : A × (B × C) → (A × B) × C defined as f((a, (b, c))) = ((a, b), c)
is a bijection between these two classes.

9.7 Class functions.

A “function” was formally defined as being a particular kind of subset of the
Cartesian product of two sets. Suppose that X and Y are classes (possibly
proper) and f is a subclass of X × Y which satisfies the conditions given
in Definition 9.1. In order to distinguish f from the notion of “function” as
presented in Definition 9.1, we will refer to f as a class function, keeping in
mind that f may be a proper class.

9.8 Example.
Suppose f : U → V is a function where U ⊆ V . Show that f ∈
P(P(P(V ))).
Solution: For each x ∈ U , let yx = f(x) ∈ V . Then
f = {(x, yx) : x ∈ U } ⊆ U × V
Recall from Kuratowski’s definition of ordered pair in 4.1 that (x, yx) =
{ {x}, {x, yx} }. So
f = { { {x}, {x, yx} } : x ∈ U }
4 We should justify that C is a set: Since ∅ is a subclass of any set, then by the Axiom

A4 (Axiom of subset), ∅ is a set. Also {∅} is a set since {∅, ∅} is a set (by the Axiom of
pair). Again by the Axiom of pair, {∅, {∅}} is a set.
5 The Greek letter χ is pronounced “kie” (like the word “pie”).

Since U ⊆ V , x ∈ V , so {x} ∈ P(V ). Since f[U ] ⊂ V then, for each

x ∈ U , yx ∈ V so x, yx ∈ V . Then {x, yx} ∈ P(V ). We then have
{x}, {x, yx} ∈ P(V ).
It follows that, for each x ∈ U , { {x}, {x, yx} } ⊂ P(V ). So, for each x ∈ U ,
{ {x}, {x, yx} } ∈ P(P(V )). Then

f = { { {x}, {x, yx} } : x ∈ U } ⊂ P(P(V ))

That is,
f = { { {x}, {x, yx} } : x ∈ U } ∈ P(P(P(V )))
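The conclusion of Example 9.8 can be confirmed on a very small case. Since P(P(P(V))) is far too large to list, the sketch below checks the equivalent statement f ⊆ P(P(V)). The sets V = {0, 1, 2} and U = {0, 1}, the rule f(x) = x + 1 modulo 3, and the helper names kpair and powerset are all illustrative assumptions.

```python
from itertools import chain, combinations


def kpair(x, y):
    """Kuratowski ordered pair (x, y) = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def powerset(X):
    """Set of all subsets of the finite set X, each as a frozenset."""
    items = list(X)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


V = {0, 1, 2}
U = {0, 1}
f = frozenset(kpair(x, (x + 1) % 3) for x in U)

PV = powerset(V)     # P(V), 8 elements
PPV = powerset(PV)   # P(P(V)), 256 elements

# f ∈ P(P(P(V))) means exactly that f ⊆ P(P(V)).
print(all(pair in PPV for pair in f))  # True
```

Each ordered pair is a two-element set of subsets of V, which is why it lands in P(P(V)).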

Concepts review:
1. What is the definition of a function f from a set A to a set B?
2. Is it acceptable to view a function f from a set A to a set B as a
set of ordered pairs?
3. Given a function f from a set A to a set B, what do each of the
sets dom f, codom f, im f and ran f represent?
4. Given a function f from a set A to a set B and y ∈ im f, what is
the preimage or inverse image of y?
5. Given a function f from a set A to a set B and a set D such that
D ⊆ A, what does the symbol f|D mean?
6. Given a function f from a set A to a set B, where A = C ∪ D is it
true that f = f|C ∪ f|D ?
7. Given the two functions f : A → B and g : A → B, what does it
mean to say that the functions f and g are equal? How can we show
that f = g?
8. Given a function f from a set A to a set B, what does it mean to
say that f is onto B?
9. Given a function f from a set A to a set B, what does it mean to
say that f is one-to-one into B?
10. Given a function f from a set A to a set B, what does it mean to
say that f is injective?
11. Given a function f from a set A to a set B, what does it mean to
say that f is surjective?
12. Given a function f from a set A to a set B, what does it mean to
say that f is one-to-one and onto B?

13. Given a function f from a set A to a set B, what does it mean to

say that f is bijective (or f is a bijection)?
14. Given a function f from a set A to a set B, is it safe to say that if
f is both injective and surjective, then it is bijective?
15. Given two sets A and B, what does it mean to say that there is a
one-to-one correspondence between A and B?
16. If D is a subset of A, what is the characteristic function χD of D
in A? Describe χD as a set of ordered pairs.

EXERCISES

A. 1. Suppose A is a set. Show that the set {(x, x) : x ∈ A} is a function.

2. Let f : A → B be a function. Show that if g ⊆ f and g is non-empty, then
g is a function.
3. Suppose f : A → B and g : A → B are two functions each of which has the
set A as domain. If f ⊆ g show that f = g.
4. If C is a set, let f : C → IC be defined as f(a) = (a, a) for all a ∈ C.
a) Show that f satisfies the definition of a function.
b) Show that f is one-to-one and onto and therefore is bijective.

B. 5. Let D and E be two sets such that D ∩ E = ∅. Let g : D → B and

h : E → B be two functions. Let f = g ∪ h.
a) Show that f : D ∪ E → B is a function.
b) Show that f|D = g and f|E = h.
6. Let f : A → B and g : C → D be two functions. We define (f × g) :
A × C → B × D as follows:

(f × g)((x, y)) = (f(x), g(y)) for all (x, y) ∈ A × C

a) Show that (f × g) : A × C → B × D is a function.

b) Show that if f : A → B and g : C → D are bijective, then (f × g) :
A × C → B × D is bijective.
7. Let f : A → B and g : C → D be two bijective functions where A ∩ C = ∅
and B ∩ D = ∅. Let h : A ∪ C → B ∪ D be defined as follows:

h(x) = f(x) for all x ∈ A
h(x) = g(x) for all x ∈ C

a) Show that h : A ∪ C → B ∪ D is a function.


b) Show that if f : A → B and g : C → D are bijective, then h : A ∪ C →

B ∪ D is bijective.
8. Let S and T be sets and f be a function on S × T defined as: f((x, y)) = x
for all (x, y) ∈ S × T .
a) Verify that f is indeed a function.
b) Describe the image of f.
c) Verify whether f is one-to-one or not. If it is, prove it; if it isn’t, show
why not.
9. Let A be a set and D ⊆ A. Recall that χD is the characteristic function
mapping x to {∅} if x ∈ D and x to ∅ if x ∉ D. Show that im χD is a set.

C. 10. Let f : A → B and g : C → D be two bijective functions. Let h : A ∪ C →

B ∪ D be defined as follows:

h(x) = f(x) for all x ∈ A
h(x) = g(x) for all x ∈ C

a) Is h necessarily a function? If it isn’t, give an example illustrating this.

b) If h is a function, is h necessarily bijective? If it isn’t, give an example
illustrating this.
11. Let f : A → B be a function. Show that if g ⊆ f, then there exists some
subset C of A such that f|C = g.
12. Is ∅ a one-to-one function? Explain.

10 / Operations on functions
Abstract. In this section we define the composition, g◦f, of two func-
tions f : A → B and g : B → C. We view “composition of functions”
as an operation “ ◦” on two functions f and g. From this perspective we
then discuss the main properties of composition of functions (such as non-
commutativity and associativity). It is in this particular context that we
describe the identity function and the inverse of a function. We also de-
fine the concept of “invertible function”.

10.1 Composition of functions: a set-theoretic definition.

Suppose f : A → B and g : B → C are two functions. A noticeable
fact about these two functions is that the domain of the function g is the
codomain of f. So for x ∈ A we have f(x) ∈ dom g. For such an element
x ∈ A, the expression g(f(x)) is well-defined. This allows us to construct
the set:
h = {(x, y) : x ∈ A, y = g(f(x)) ∈ im g} ⊆ A × C
By the Axiom of construction A2, h is a well-defined subset of A × C. With
these thoughts in mind, we formally define this notion of “composition of
two functions”.

Definition 10.1 Suppose f : A → B and g : B → C are two functions such

that the codomain of the function f is the domain of the function g. Let

h = {(x, z) ∈ A × C : y = f(x) and z = g(y) = g(f(x)) }

Thus, (x, z) ∈ h if and only if (x, z) = (x, g(f(x))). We will call h the
composition of g and f, and denote it by g◦f where

(g◦f)(x) = g(f(x))

Given the functions f : A → B and g : B → C, and seeing that g◦f ⊆ A×C,

we naturally suspect that g◦f : A → C is a function. We will, of course, have
to make sure that this is the case.

Theorem 10.2 Let f : A → B and g : B → C be two functions such that

the codomain of the function f is the domain of the function g. Then the
composition of g and f, (g◦f) : A → C, is a function.

Proof:
What we are given: f : A → B and g : B → C are two functions.
What we are required to show: That g◦f is a function.
1) By definition of h = g◦f,

h = g◦f ⊆ A × C

2) Let x ∈ A. We are required to show that h(x) ∈ C.

x∈A ⇒ f(x) ∈ B (Since im f ⊆ B)

⇒ f(x) ∈ dom g (Since im f ⊆ dom g)
⇒ g(f(x)) ∈ C (Since g : B → C is a function )
⇒ h(x) ∈ C (Since g(f(x)) = h(x))

Thus, x ∈ A ⇒ h(x) = (g◦f)(x) ∈ C.

3) Suppose h(a) = g(f(a)) ≠ g(f(b)) = h(b).
h(a) = g(f(a)) ≠ g(f(b)) = h(b)
⇒ f(a) ≠ f(b) (Since g is a function)
⇒ a ≠ b (Since f is a function)
The three conditions being satisfied, we conclude that g◦f is a function.
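The composition of Definition 10.1 can be computed on finite functions stored as sets of ordered pairs. This is a minimal sketch; the name compose and the sample functions are assumptions made for illustration.

```python
def compose(g, f):
    """g∘f = {(x, z) : (x, y) in f and (y, z) in g for some y}."""
    return {(x, z) for (x, y) in f for (u, z) in g if y == u}


A = {1, 2, 3}
B = {"p", "q"}
C = {"X", "Y"}
f = {(1, "p"), (2, "p"), (3, "q")}   # f : A → B
g = {("p", "X"), ("q", "Y")}         # g : B → C

h = compose(g, f)
print(h == {(1, "X"), (2, "X"), (3, "Y")})  # True

# h satisfies the conditions verified in the proof of Theorem 10.2.
print({x for (x, _) in h} == A)             # True: every x in A has an image
print(len(h) == len(A))                     # True: each x has exactly one image
```

Since every element of A appears as a first coordinate and h has exactly as many pairs as A has elements, no element of A can have two images.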

10.2 Composition of functions viewed as an operation on functions.

Given two functions f : A → B and g : B → C, we have shown that we can
associate with this pair of functions another function h = g◦f called “the
composition of g and f”.
This suggests that “composition”, denoted by the symbol, ◦ , can be viewed
as an operation on pairs of functions, just like ×, ∪, ∩. We wonder:
1) Can we compose any pair of functions?
The definition of composition of functions makes it quite clear that we
can’t compose certain pairs of functions. For the composition g◦f of
two functions f and g to be well-defined, the image of f must be in the
domain of g.

2) Is the composition of functions commutative?

Again, it is clear from the definition of composition of functions that we
can’t commute certain pairs of functions with respect to composition.
− Suppose for example that we can compose the functions f : A → B
and g : C → D in this order: f ◦g.

− Then, this means that im g ⊆ dom f. But if im f is not contained
in dom g, the expression, g◦f, is not meaningful; so we cannot
commute the pair f and g.
3) Is the composition of functions associative?
Yes. The following theorem shows that the composition of functions
satisfies the associative property.

Theorem 10.3 Let f : A → B, g : B → C and h : C → D be three functions.

Then h◦(g◦f) = (h◦g)◦f.
Proof: It suffices to show that h◦(g◦f) ⊆ (h◦g)◦f and (h◦g)◦f ⊆ h◦(g◦f).
Proof of h◦(g◦f) ⊆ (h◦g)◦f:

(x, y) ∈ h◦(g◦f) ⇒ [h◦(g◦f)](x) = y

⇒ h(g(f(x))) = y
⇒ h(z) = y for some z ∈ dom h ⊆ C
⇒ g(f(x)) = z ∈ C
⇒ g(u) = z for some u ∈ dom g ⊆ B
⇒ f(x) = u ∈ B

((h◦g)◦f)(x) = (h◦g)(f(x)) = (h◦g)(u)

= h(g(u))
= h(z)
= y
((h◦g)◦f)(x) = y ⇒ (x, y) ∈ (h◦g)◦f

We have shown that h◦(g◦f) ⊆ (h◦g)◦f.

To show (h◦g)◦f ⊆ h◦(g◦f) we proceed similarly. This is left as an exercise.
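Theorem 10.3 can be illustrated on three small functions. In this sketch the helper compose and the functions f, g and h are hypothetical examples; checking one case is of course not a proof.

```python
def compose(g, f):
    """g∘f = {(x, z) : (x, y) in f and (y, z) in g for some y}."""
    return {(x, z) for (x, y) in f for (u, z) in g if y == u}


f = {(1, 10), (2, 20), (3, 10)}      # f : A → B
g = {(10, "a"), (20, "b")}           # g : B → C
h = {("a", "yes"), ("b", "no")}      # h : C → D

left = compose(h, compose(g, f))     # h∘(g∘f)
right = compose(compose(h, g), f)    # (h∘g)∘f

print(left == right)                                 # True
print(left == {(1, "yes"), (2, "no"), (3, "yes")})   # True
```

Both groupings send x to h(g(f(x))), so the two sets of ordered pairs coincide.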

An identity element for the operation “◦”. Is there a function I such that
any function g composed with I is left unchanged, i.e.,
g◦I = I◦g = g?
An obvious candidate for an identity element with respect to composition is
the identity relation, I : U → U.¹ It is defined by I(x) = x for all x ∈ U.
If A is a set, then I_A will denote the restriction of I to A, and so
I_A : A → A is defined by I_A(x) = x for all x ∈ A. The following theorem
confirms that this function behaves as expected.

Theorem 10.4 Let f : A → B. Then I_B◦f = f and f◦I_A = f.

¹ Recall that U denotes the class of all elements and is called the Universal class.
Part IV: Functions 91

Proof:

Given: f : A → B and I_B(x) = x for all x ∈ B.

(x, y) ∈ I_B◦f ⇔ (x, z) ∈ f and (z, y) ∈ I_B for some z ∈ B
               ⇔ (x, z) ∈ f and z = y for some z ∈ B
               ⇔ (x, y) ∈ f

Thus, I_B◦f = f.
The proof of f◦I_A = f is similar. It is left to the reader.
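As a quick sanity check of Theorem 10.4, the sketch below composes a small function with the identity on each side. The names identity_on and compose, and the sample sets, are hypothetical choices for this illustration.

```python
def identity_on(s):
    """Return the identity function on the finite set s, as a dict."""
    return {x: x for x in s}

def compose(g, f):
    """Return g∘f for dict-encoded functions (assumes im f ⊆ dom g)."""
    return {x: g[f[x]] for x in f}

A = {1, 2, 3}
B = {"a", "b", "c"}
f = {1: "a", 2: "b", 3: "a"}                  # f : A → B, neither one-to-one nor onto

left_identity = compose(identity_on(B), f)    # I_B ∘ f
right_identity = compose(f, identity_on(A))   # f ∘ I_A
print(left_identity == f, right_identity == f)
```

Note that the identity used on the left lives on B while the one used on the right lives on A, exactly as in the statement of the theorem.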

Inverses with respect to “◦” and the identity I. Once an identity element I
has been identified, one naturally wonders whether a given function f has
an inverse g with respect to this identity, so that g◦f = I.
We will show that only certain functions have an “inverse” with respect to
“composition”.

Definition 10.5 Let f : A → B. If g : B → A is a function satisfying
g◦f = I_A, then we will call g an “inverse of f”; we represent g as f⁻¹.

Theorem 10.6 Let f : A → B be a one-to-one onto function.

a) An inverse function, f⁻¹ : B → A, of f exists.
b) The function f⁻¹ is one-to-one and onto.
c) The function f : A → B is the inverse of f⁻¹ : B → A. That is,
(f⁻¹)⁻¹ = f.
d) The inverse function, f⁻¹, of f is unique.

Proof:

What we are given for parts (a) to (d): That f : A → B is a one-to-one onto
function.
a) What we are required to show: That there exists a function g : B → A
such that g(f(x)) = x for all x ∈ A. This function g will be f⁻¹.
Define g : B → A as follows: g(x) = y if and only if f(y) = x. We claim
that g : B → A is a well-defined function:
− Let x ∈ B. Since f is onto B, there exists y ∈ A such that
f(y) = x. Thus, dom g = B. Suppose now that (x, y) and (x, z) are
in g. Then y and z are elements of A such that f(y) = x and f(z) = x.
Since f is one-to-one, y and z must be the same element. Thus,
g : B → A is a well-defined function.
By construction, g(f(x)) = x for all x ∈ A, so g satisfies the definition
of an inverse of f: g◦f = I_A. Thus, g = f⁻¹.

b) What we are required to show: That f⁻¹ is one-to-one and onto A:
Suppose (x, y) and (z, y) both belong to f⁻¹ : B → A. Then f(y) = x
and f(y) = z. Since f : A → B is a function, x = z. Thus, f⁻¹ is
one-to-one on its domain as claimed. To see that f⁻¹ is onto A, let
y ∈ A and put x = f(y) ∈ B. Then (x, y) ∈ f⁻¹, so y ∈ im f⁻¹.

c) What we are required to show: If f⁻¹ : B → A and f : A → B, then
f◦f⁻¹ = I_B:
We are assuming that A = im f⁻¹ and A = dom f. Suppose (f◦f⁻¹)(x) = z.
Then there is some y ∈ A such that (x, y) ∈ f⁻¹ and (y, z) ∈ f. But
since f⁻¹ is the inverse of f, (x, y) ∈ f⁻¹ implies (y, x) ∈ f. Since both
(y, x) and (y, z) belong to f, z = x. Then (f◦f⁻¹)(x) = x for all
x ∈ B. We conclude that f is an inverse of f⁻¹.

d) What we are required to show: If h : B → A is a function satisfying
h◦f = I_A, then h can only be f⁻¹.

h◦f = I_A = f⁻¹◦f ⇒ (h◦f)◦f⁻¹ = (f⁻¹◦f)◦f⁻¹
                  ⇒ h◦(f◦f⁻¹) = f⁻¹◦(f◦f⁻¹)   (Associativity)
                  ⇒ h◦I_B = f⁻¹◦I_B
                  ⇒ h = f⁻¹

This theorem confirms that:

– If f is one-to-one on its domain, then f has an inverse f⁻¹.
– This function f⁻¹ is unique and is one-to-one.
Conversely, if f : A → B has an inverse f⁻¹, then for a, b ∈ A,
f(a) = f(b) ⇒ f⁻¹(f(a)) = f⁻¹(f(b))
            ⇒ a = b   (since f⁻¹(f(a)) = a and f⁻¹(f(b)) = b)

so the function f must be one-to-one.

We have shown that “f has an inverse if and only if f is one-to-one”.

Definition 10.7 Invertible functions, f : A → B, on A are precisely the

one-to-one functions on A.
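The construction in the proof of Theorem 10.6 (swap the two coordinates of every pair, after checking that f is one-to-one and onto) can be sketched in Python. This is a minimal illustration on finite sets; the function names inverse and compose and the sample data are assumptions made for the example.

```python
def compose(g, f):
    """Return g∘f for dict-encoded functions (assumes im f ⊆ dom g)."""
    return {x: g[f[x]] for x in f}

def inverse(f, codomain):
    """Return f⁻¹ as a dict; raise ValueError unless f is one-to-one and onto codomain."""
    inv = {}
    for x, y in f.items():
        if y in inv:
            raise ValueError("f is not one-to-one")
        inv[y] = x
    if set(inv) != set(codomain):
        raise ValueError("f is not onto the codomain")
    return inv

A = {1, 2, 3}
B = {"a", "b", "c"}
f = {1: "b", 2: "c", 3: "a"}                    # a one-to-one function from A onto B
f_inv = inverse(f, B)

print(compose(f_inv, f) == {x: x for x in A})   # f⁻¹∘f = I_A
print(compose(f, f_inv) == {y: y for y in B})   # f∘f⁻¹ = I_B
print(inverse(f_inv, A) == f)                   # (f⁻¹)⁻¹ = f
```

Calling inverse on a function that is not one-to-one, such as {1: "a", 2: "a"}, raises an error instead of returning a dictionary, which mirrors the statement that only one-to-one functions are invertible.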

10.3 The inverse of the composition of functions.

Suppose we are given the two functions, f : A → B and g : B → C, both
of which are one-to-one and onto functions. The following theorem shows
us how to proceed when we wish to find the inverse of their composition,
(g◦f)−1 .

Theorem 10.8 Let f : A → B and g : B → C be two one-to-one and onto
functions.

a) Then the function (g◦f) : A → C is also one-to-one and onto C.

b) Then the inverse of g◦f is (g◦f)⁻¹ = f⁻¹◦g⁻¹.
Proof: The proofs of these statements are left as an exercise.
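Before attempting the proof, it can help to see part (b) of Theorem 10.8 on a concrete pair of bijections. The Python sketch below is only a check on one example, not a proof; the helper names and the sample functions are hypothetical.

```python
def compose(g, f):
    """Return g∘f for dict-encoded functions (assumes im f ⊆ dom g)."""
    return {x: g[f[x]] for x in f}

def inverse(f):
    """Inverse of a dict-encoded function that is assumed to be one-to-one."""
    return {y: x for x, y in f.items()}

f = {1: "a", 2: "b", 3: "c"}            # f : A → B, one-to-one and onto
g = {"a": 10, "b": 30, "c": 20}         # g : B → C, one-to-one and onto

lhs = inverse(compose(g, f))            # (g∘f)⁻¹
rhs = compose(inverse(f), inverse(g))   # f⁻¹∘g⁻¹
print(lhs == rhs)
```

The order reverses because undoing g◦f means undoing g first and f second.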

10.4 Comparing the inverse of a function to the inverse of a relation.

Recall that “inverses” were discussed before we defined the notion of a func-
tion (see Definition 5.4). We referred to inverses while studying relations
and some of their properties. We pause to compare the inverse of a relation
to the inverse of a function so that we can better see how they are similar
and how they differ.
− Relations. Given sets A and B, any subset of A × B is a relation. No
other conditions are specified. For any relation R ⊆ A × B
we can construct another relation, R⁻¹, called its inverse. This inverse
is defined as: R⁻¹ = {(y, x) : (x, y) ∈ R}.
− Functions. A function f : A → B is a set of ordered pairs {(a, b) :
a ∈ A, b = f(a) ∈ B} and so is a relation. But those relations we call
“functions” must satisfy the condition “[(a, b) ∈ f and (a, c) ∈ f] ⇒
[b = c]”. We have declared that a function f is an invertible function
only if f is one-to-one. But when viewing f as a relation (a subset of
A × B) we can speak of its “inverse”, as a relation, even when f is not
one-to-one. That is, if f = {(x, y) : y = f(x)}, the inverse, f⁻¹, of f is

f⁻¹ = {(y, x) : y = f(x), x ∈ dom f} = {(y, x) : (x, y) ∈ f}

There is no contradiction here. But we should be more specific by
saying: If f is not one-to-one we can speak of its inverse f⁻¹, with the
caveat that f⁻¹ cannot be referred to as a “function”. So, when we say
that “f is invertible if and only if it is one-to-one”, we actually mean
“the inverse f⁻¹ of the function f is a function if and only if this
function f is one-to-one”.

We consider the following example. Let A be a set, let {a, b} ⊆ A where
a ≠ b, and let U be a non-empty subset of A. Consider the function
f : A → A defined as follows:

f(x) = a   if x ∈ U
f(x) = b   if x ∉ U

Then f can be described as

f = [U × {a}] ∪ [(A − U) × {b}]

Its inverse

f⁻¹ = [{a} × U] ∪ [{b} × (A − U)]

can only be referred to as a relation on A, unless of course both U and
A − U are singleton sets.
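The distinction between the inverse of a relation and the inverse of a function can be made concrete with sets of ordered pairs. In the following Python sketch the particular sets A and U, the elements a and b, and the helper names inverse_relation and is_function are all assumptions chosen for illustration.

```python
def inverse_relation(r):
    """R⁻¹ = {(y, x) : (x, y) ∈ R} for a relation given as a set of pairs."""
    return {(y, x) for (x, y) in r}

def is_function(r):
    """True when no first coordinate is paired with two different values."""
    firsts = [x for (x, _) in r]
    return len(firsts) == len(set(firsts))

A = {1, 2, 3, 4}
a, b = 1, 2                 # two distinct elements of A
U = {1, 2, 3}               # a non-empty subset of A

# f = [U × {a}] ∪ [(A − U) × {b}]
f = {(x, a) for x in U} | {(x, b) for x in A - U}
f_inv = inverse_relation(f)

print(is_function(f))       # f is a function
print(is_function(f_inv))   # here the inverse is only a relation
```

With U = {1} and A = {1, 2}, both U and A − U are singletons and the same check reports that the inverse is a function.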

Concepts review:
1. Given two functions f : A → B and g : B → C, what does the
expression “g◦f” mean? Under what conditions does this expression
make sense?
2. Is the composition of functions commutative? Are there any pairs
of functions which always “commute” with each other?
3. Under what condition(s) is the composition of functions associative?
4. Which function plays the role of the identity with respect to “◦ ”?
5. What does it mean to say that a function f is “invertible”?
6. Under what condition(s) is a function invertible with respect to “◦ ”?
7. If a function h can be expressed as h = g◦f where both f and g are
invertible, is h invertible? If so, how can we express h−1 ?
8. If f is not one-to-one on its domain, what interpretation can we
give to the expression f⁻¹?

EXERCISES

A. 1. Suppose f : A → B and g : B → C are functions and D ⊆ A. Prove that

(g◦f)|D = g◦(f|D ).

B. 2. Let f : A → B and g : B → C be two functions.

a) Prove that if (g◦f) : A → C is one-to-one, then f : A → B is one-to-one.

b) Prove that if (g◦f) : A → C is onto C, then g is onto C.

3. Let g : B → C and h : B → C be two functions. Suppose g◦f = h◦f for
every function f : A → B. Prove that g = h.
4. Let g : A → B and h : A → B be two functions and let C be a set with more
than one element. Prove that if f ◦g = f ◦h for every function f : B → C,
then g = h.

C. 8. Prove the statements (a) and (b) of Theorem 10.8.

96 Section 11: Images and preimages of sets

11 / Images and preimages of sets

Abstract. Suppose we are given a function f : A → B. In this section we

“elevate” this function so that it acts on P(A), mapping its elements to
elements of P(B) according to the rule determined by f. This provides a
mechanism by which we can study the image of sets under the function f.
We also define the set-valued inverse of a function, f ← , and provide some
examples. We also show how such functions act on unions and intersec-
tions of sets.

11.1 Image of set under a function f .

The domains of all functions in this section are assumed to be sets. Given
a function, f : A → B, it is sometimes useful to see what effect the
function has on subsets of the domain rather than simply on its elements.
To study the action of functions on sets, we will introduce some special
notation.

Definition 11.1 Suppose f is a relation with the set A as domain and the set
B as range. If S is a subset of A, that is, if S ∈ P(A), then we will represent
the image of the set S under f as

f[S] = {y ∈ B : (x, y) ∈ f for some x ∈ S}

If U ⊆ B, that is, if U ∈ P(B), we will refer to the set

f←[U] = {x ∈ A : (x, y) ∈ f for some y ∈ U}

as the preimage of the set U under f.

Remarks. What is new in this definition?

− First observe that the expression f[S] − notice the square brackets −
is the image of S under f. So the square brackets are not there just as
a matter of style. They have meaning. The function, f[ ], associates
elements of P(A) to elements of P(B) where f is predefined either as
a relation or, more specifically, as a function.
− The symbol “f←” and the words preimage of a set are new. If f is a
function, then f←[U] is simply the image of U under the relation f⁻¹.
Again notice the square brackets, which mean we are associating sets
to sets, an association governed by f←.¹ (If f is a function and
x ∈ im f, in some branches of mathematics f←[{x}] is referred to as the
fiber of x under the function f.)

¹ Note that use of the notation f←[U] is not universal. It is introduced here to avoid
confusing the preimage of an element, f⁻¹(x), normally used with one-to-one functions,
with the preimage, f⁻¹[U], of a set U. A general topologist might refer to “B = f←[U]” by
saying that “f pulls back the set U to the set B”.

Examples.

1) Let B = {a, b, c}, A be a non-empty set and U be a non-empty proper
subset of A. Consider the function f : A → B defined as follows:

f(x) = a   if x ∈ U
f(x) = b   if x ∉ U

Then

f[U] = {a} and f←[{a}] = U
f[A − U] = {b} and f←[{b}] = A − U
f[∅] = ∅ and f←[{c}] = ∅

We see that the preimage of {c} is empty since f maps no elements of
A to c. This is another way of saying that c is not in the range of f,
or, c ∉ f[A].
2) Suppose f : A → P(A) is defined as f(x) = {x}. If {a} ≠ {b},
then a ≠ b; hence, in this case, f is a function. If B ⊆ A, then
f[B] = {{x} : x ∈ B} ⊆ P(A). In this case f maps the element
B ∈ P(A) to the element f[B] ∈ P(P(A)). For x, y ∈ A,
f←[{{x}, {y}}] = {x, y}.

3) Let A = {a, b, c, d, e, k, h} and B = {u, v, w, z, s} and let D = {e, k, h}

and E = {c, d} (both subsets of A).
We define f = {(a, u), (b, u), (c, u), (d, v), (e, v), (k, z), (h, s)}. We de-
scribe the function f via images and preimages.

f ← [{u}] = {a, b, c}
f ← [{v}] = {d, e}
f ← [{z}] = {k}
f|D [D] = f|D [{e, k, h}] = {v, z, s}
(f|D )← [{v}] = {e}
(f|E )[E] = (f|E )[{c, d}] = {u, v}
(f|E )← [{u}] = {c}
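The computations of Example 3 can be reproduced by treating f as a set of ordered pairs. The Python sketch below is illustrative only; the helper names image, preimage and restrict are assumptions introduced here, and the letters of A and B are encoded as strings.

```python
def image(f, s):
    """f[S] = {y : (x, y) ∈ f for some x ∈ S}."""
    return {y for (x, y) in f if x in s}

def preimage(f, u):
    """f←[U] = {x : (x, y) ∈ f for some y ∈ U}."""
    return {x for (x, y) in f if y in u}

def restrict(f, d):
    """The restriction f|D of f to the subset D of its domain."""
    return {(x, y) for (x, y) in f if x in d}

f = {("a", "u"), ("b", "u"), ("c", "u"), ("d", "v"),
     ("e", "v"), ("k", "z"), ("h", "s")}
D = {"e", "k", "h"}
E = {"c", "d"}

print(preimage(f, {"u"}))                 # the elements a, b, c
print(image(restrict(f, D), D))           # the elements v, z, s
print(preimage(restrict(f, D), {"v"}))    # the single element e
print(preimage(restrict(f, E), {"u"}))    # the single element c
```

Restricting f before taking a preimage matters: f←[{v}] contains both d and e, while (f|D)←[{v}] keeps only e.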

11.2 Images and preimages of unions and intersections of sets.²

Since we have defined how a function f : A → B can be elevated to
f : P(A) → P(B), we can determine how such functions behave when f
acts on unions and intersections of sets (or classes). We present these few
properties in the form of a theorem. While reading through these properties,
we will see that f← always “respects”³ unions and intersections. The
function f will be seen to “respect” unions; but f will “respect”
intersections only in certain circumstances.

Theorem 11.2 Let f : A → B be a function mapping the set A to the set
B. Let 𝒜 be a set of subsets of A and ℬ be a set of subsets of B. Let D ⊆ A
and E ⊆ B. Then:

a) f[⋃_{S∈𝒜} S] = ⋃_{S∈𝒜} f[S]
b) f[⋂_{S∈𝒜} S] ⊆ ⋂_{S∈𝒜} f[S], where equality holds if f is one-to-one.
c) f[A] − f[D] ⊆ f[A − D]. If f is one-to-one and onto B, then
f[A − D] = B − f[D].⁴
d) f←[⋃_{S∈ℬ} S] = ⋃_{S∈ℬ} f←[S]
e) f←[⋂_{S∈ℬ} S] = ⋂_{S∈ℬ} f←[S]
f) f←[B − E] = A − f←[E]

Proof:
a) x ∈ f[⋃_{S∈𝒜} S] ⇔ x = f(y) for some y ∈ ⋃_{S∈𝒜} S
                    ⇔ x = f(y) for some y in some S ∈ 𝒜
                    ⇔ x = f(y) ∈ f[S] for some S ∈ 𝒜
                    ⇔ x ∈ ⋃_{S∈𝒜} f[S]

b) It will be helpful to first prove this statement for the intersection of
only two sets U and V. The use of a Venn diagram will also help
visualize what is happening.
So we first prove the statement: f[U ∩ V] ⊆ f[U] ∩ f[V], with equality
if f is one-to-one on U ∪ V.
Case 1: We consider the case where U ∩ V = ∅.
Then f[U ∩ V] = ∅ ⊆ f[U] ∩ f[V]. So the statement holds true.
Case 2: We now consider the case where U ∩ V ≠ ∅.

x ∈ f[U ∩ V] ⇔ x = f(y) for some y ∈ U ∩ V
             ⇔ x = f(y) for some y contained in both U and V
             ⇒ x = f(y) ∈ f[U] and x = f(y) ∈ f[V]
             ⇔ x ∈ f[U] ∩ f[V]

We now show that if f is one-to-one on U ∪ V, then f[U] ∩ f[V] ⊆
f[U ∩ V] and so equality holds true.
− Suppose x ∈ f[U] ∩ f[V]. Then there exist u ∈ U and v ∈ V
such that f(u) = f(v) = x. Since f is one-to-one, u = v.
This implies u ∈ U ∩ V, and so x = f(u) ∈ f[U ∩ V]. Hence,
f[U ∩ V] = f[U] ∩ f[V].
The proof of the general statement is left as an exercise.
c) Proof is left as an exercise.
d) x ∈ f←[⋃_{S∈ℬ} S] ⇔ f(x) = y for some y ∈ ⋃_{S∈ℬ} S   (By definition of f←.)
                     ⇔ f(x) = y for some y in some S ∈ ℬ
                     ⇔ x ∈ f←[{y}] ⊆ f←[S] for some S ∈ ℬ
                     ⇔ x ∈ ⋃_{S∈ℬ} f←[S]

Thus, f←[⋃_{S∈ℬ} S] = ⋃_{S∈ℬ} f←[S].
e) Proof is left as an exercise.
f) Proof is left as an exercise.

² The main theorem in this subsection is not invoked in the remaining part of this text.
It is however considered standard subject matter in most set theory textbooks. So it was
included here.
³ We say that f respects unions if it is always true that f[A ∪ B] = f[A] ∪ f[B]. Similarly,
f respects intersections if it is always true that f[A ∩ B] = f[A] ∩ f[B].
⁴ Remember that A − B = A ∩ B′ equals A intersection the complement of B.
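A small numerical experiment can make the different behaviour of images and preimages visible. The Python sketch below checks a few instances of the properties on one function that is deliberately not one-to-one; the helper names and the sample sets U, V, P, Q are assumptions for the illustration, and a single example of course proves nothing in general.

```python
def image(f, s):
    """f[S] for a dict-encoded function f and a subset S of its domain."""
    return {f[x] for x in s}

def preimage(f, u):
    """f←[U] for a dict-encoded function f and a subset U of its codomain."""
    return {x for x in f if f[x] in u}

f = {1: "x", 2: "x", 3: "y"}          # not one-to-one: 1 and 2 share an image
B = {"x", "y", "z"}                   # codomain
U, V = {1, 3}, {2, 3}                 # subsets of the domain
P, Q = {"x"}, {"x", "y"}              # subsets of the codomain

union_ok = image(f, U | V) == image(f, U) | image(f, V)
inter_subset = image(f, U & V) <= image(f, U) & image(f, V)
inter_equal = image(f, U & V) == image(f, U) & image(f, V)
pre_union_ok = preimage(f, P | Q) == preimage(f, P) | preimage(f, Q)
pre_inter_ok = preimage(f, P & Q) == preimage(f, P) & preimage(f, Q)
pre_diff_ok = preimage(f, B - P) == set(f) - preimage(f, P)

print(union_ok, inter_subset, inter_equal)
print(pre_union_ok, pre_inter_ok, pre_diff_ok)
```

In this example the image of the intersection is a proper subset of the intersection of the images, because the two distinct points 1 and 2 are sent to the same value.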

Concepts review:
1. Given a function f : A → B and S ⊆ A, what does the expression
f[S] mean?
2. Given a function f : A → B and S ⊆ B, what does the expression
f ← [S] mean?
3. What is the preimage of a set S under a function f?
4. Given a function f : A → B and x ∈ B − im f, what is f←[{x}]?
5. Under what conditions does f respect unions?
6. Under what conditions does f respect intersections?

7. Under what conditions does f ← respect unions?

8. Under what conditions does f ← respect intersections?

EXERCISES

A. 1. Give an example of a function f : A → B where A contains two unequal

non-empty subsets D and E satisfying f[D] = f[E].
2. Let f : A → B be a function where A and B are sets.
a) Prove that if D and E are equal subsets of A, then f[D] = f[E].
b) Prove that if U and V are equal subsets of B, then f ← [U ] = f ← [V ].
3. Suppose f : A → B where A and B are sets.
a) Show that for any subset D of A, D ⊆ f←[f[D]].
b) Give an example where D ≠ f←[f[D]].
c) Show that for any subset E of B, f[f←[E]] ⊆ E.
d) Is it necessarily true that f[f←[E]] = E?
e) Prove that if f is one-to-one on A, then, for any subset D of A, D =
f ← [f[D]].
f) Prove that if f is onto B, then, for any subset E of B, f[f ← [E]] = E.

B. 4. Let S and T be sets and f : S × T → S be a function on S × T defined as:

f((x, y)) = x for all (x, y) ∈ S × T .
a) If u ∈ im f, what is f ← [{u}]?
b) If U is a non-empty subset of S, what is f ← [U ]?
5. Prove the general case of part b) of Theorem 11.2.
6. Prove part (c) of theorem 11.2.
7. Prove part (e) of theorem 11.2.
8. Prove part (f) of theorem 11.2.

C. 9. Let f : A → B be a function mapping the set A to the set B. Prove that

if D ⊆ A, then
f [f ← [f[D]]] = f[D]
10. Let f : A → B be a function which maps the set A onto the set B. Prove
that
f←[B] = ⋃_{x∈B} f←[{x}]

11. Let f : A → B be a function which maps the set A onto the set B. Prove
that if x and y are distinct elements of B, then

f ← [{x}] ∩ f ← [{y}] = ∅

12. Let f : A → B be a function which maps the set A onto the set B. Prove
that the set of sets
S = {f ← [{x}] : x ∈ B}
forms a partition of the set A.
102 Section 12: Equivalence relations induced by functions

12 / Equivalence relations induced by functions

Abstract. In this section we show how a function f : S → T partitions its

domain S. This partition induces an equivalence relation Rf on S which
in turn leads to the quotient set S/Rf . We then present a theorem which
shows how any function can be expressed as the composition of two func-
tions, neither of which is f itself or the identity function.

12.1 Partitioning the domain of a function f : A → B .

Suppose f : A → B is a function which maps a set A into a set B.
We claim that the set {f←[{x}] : x ∈ f[A]} ⊆ P(A) forms a partition of A.¹
– Since every x ∈ f[A] is in the image of f, f←[{x}] is non-empty for all
x ∈ f[A].
– If x ≠ y, then f←[{x}] ∩ f←[{y}] = ∅; otherwise an element
z ∈ f←[{x}] ∩ f←[{y}] would be mapped to two distinct points,
contradicting the fact that f is a function.
– Finally, the function f← sends the image, f[A], of A back to A, i.e.,
f←[f[A]] = A. So A = f←[f[A]] = f←[⋃_{x∈f[A]} {x}] = ⋃_{x∈f[A]} f←[{x}].
Hence, {f←[{x}] : x ∈ f[A]} covers all of A.
The set {f←[{x}] : x ∈ f[A]} forms a pairwise disjoint set of non-empty
sets which covers all of A. So this set partitions A, as claimed.
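The three conditions verified above (non-empty blocks, pairwise disjoint blocks, blocks covering A) can be tested directly on a finite example. In this Python sketch the function name fibers and the sample function f are hypothetical; a function is again modeled as a dictionary.

```python
def fibers(f):
    """Group the domain of a dict-encoded function by value: y ↦ f←[{y}]."""
    out = {}
    for x, y in f.items():
        out.setdefault(y, set()).add(x)
    return out

f = {1: "u", 2: "u", 3: "v", 4: "w", 5: "v"}
blocks = list(fibers(f).values())

non_empty = all(blocks)                                  # every block has an element
disjoint = all(b1.isdisjoint(b2)
               for i, b1 in enumerate(blocks)
               for b2 in blocks[i + 1:])                 # blocks are pairwise disjoint
covers = set().union(*blocks) == set(f)                  # blocks cover the domain
print(non_empty, disjoint, covers)
```

Only values that f actually takes appear as keys, which corresponds to indexing the partition by x ∈ f[A] rather than by all of B.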
We have seen that a partition of a set is the quotient set of some
equivalence relation, R (see Theorem 8.3). The equivalence relation, R,
induced by a partition of a set A was defined as follows:
Two elements of the set A are related under R if and only if they
belong to the same element of the partition.
Then, given a function, f : A → B, on A, we can declare that two elements a
and b in A are related under a relation Rf if and only if they appear together
in f←[{x}] for some x in the image, f[A], of A. So the set of subsets of A,
{f←[{x}] : x ∈ f[A]}, is the quotient set of A induced by Rf. We formalize
these thoughts in the following definition.

Definition 12.1 Let f : A → B be a function which maps a set A into a set
B. We define the equivalence relation, Rf, on A induced (or determined) by f
as follows:
Two elements a and b are related under Rf if and only if {a, b} ⊆ f←[{x}]
for some x in im f. The quotient set of A induced by Rf is then

A/Rf = A_{Rf} = {f←[{x}] : x ∈ f[A]}

We will refer to A/Rf (or A_{Rf}) as the quotient set of A induced (or
determined) by f.

¹ We remind the reader of the definition of a partition (also found at Definition 8.1). For
a set S, we say that a set of subsets C ⊆ P(S) forms a partition of S if (1) ⋃_{A∈C} A = S,
(2) if A, B ∈ C and A ≠ B, then A ∩ B = ∅, (3) A ≠ ∅ for all A ∈ C.

We illustrate this in a simple example. Let U = {a, {a}, {a, {a}}, {{a}}}
where a is a set. Consider the function f : U → U defined as follows:

f(x) = a       if a ∈ x
f(x) = {a}     if {a} ∈ x and a ∉ x
f(x) = {{a}}   if x = a

We see that f is a well-defined function on the set U. So the set

C = {f←[{a}], f←[{{a}}], f←[{{{a}}}]}

partitions U into three pairwise disjoint non-empty sets. From this, we can
define the equivalence relation Rf on U where U/Rf = C.
We will list the elements in each set:

f←[{a}] = {{a}, {a, {a}}}
f←[{{a}}] = {{{a}}}
f←[{{{a}}}] = {a}

Observe that all elements of U are represented in the three sets above.
Observe that all elements of U are represented in the three sets above.

12.2 The canonical decomposition of a function.

Let f : S → T be a function mapping a set S into a set T . We have seen
how this function, f, determines a new set: the quotient set, S/Rf , induced
by f. For each x ∈ S we let Sx = f ← [{f(x)}]. Then
S/Rf = {Sx : x ∈ S}
is the quotient set induced by f. Note that the elements of S/Rf are subsets
of S. We now show how the function f can be expressed as a composition
of two other functions neither of which is the identity function.

Figure 4 illustrates how we will express the function f as a composition of
two functions.

[FIGURE 4: Canonical decomposition of f : S → T]

A. We define the function gf : S → S/Rf as follows:

gf(x) = Sx

We first verify that gf : S → S/Rf is a well-defined function on S.

− We first verify that dom gf = S:
Let x ∈ S. Then y = f(x) ∈ f[S] = im f. So Sx = f←[{y}]. Thus,
gf(x) = Sx = f←[{y}] ∈ S/Rf.
− Next we show that gf is indeed a function:
Suppose Sx ≠ Sy. Since {Sx : x ∈ S} partitions S, then

Sx ∩ Sy = f←[{f(x)}] ∩ f←[{f(y)}] = ∅

So x ≠ y, otherwise we would have f←[{f(x)}] = f←[{f(y)}],
resulting in a contradiction. Then gf is a function as claimed.
We now verify that the function gf is onto S/Rf:
− Since every element of S/Rf has the form Sx = f←[{f(x)}] = gf(x)
for some x ∈ S, gf is onto S/Rf.

B. We define another function hf : S/Rf → T as follows (remember that
T contains the image of f):

hf(Sx) = f(x)

We verify that hf : S/Rf → T is a well-defined function on S/Rf:

− We first verify that dom hf = S/Rf:
Let Sx ∈ S/Rf. Since x ∈ S, then f(x) is defined and so hf(Sx) =
f(x) is defined.
− We now show hf is indeed a function:
Suppose f(x), f(y) ∈ hf[S/Rf] and f(x) ≠ f(y). Then ∅ =
f←[{f(x)}] ∩ f←[{f(y)}] = Sx ∩ Sy. So Sx ≠ Sy. Then hf is
indeed a function.
The function hf is one-to-one on S/Rf, since

Sx ≠ Sy ⇒ Sx ∩ Sy = ∅
        ⇒ f←[{f(x)}] ∩ f←[{f(y)}] = ∅
        ⇒ f(x) ≠ f(y)

C. Combining the two functions gf : S → S/Rf and hf : S/Rf → T we
obtain the composition (hf◦gf) : S → T where

(hf◦gf)(x) = f(x)

We verify that (hf◦gf) = f: For x ∈ S,

f(x) = hf(Sx)
     = hf(gf(x))
     = (hf◦gf)(x)

Thus, the two functions hf◦gf and f agree everywhere on the domain,
S, of f. We have just proven the following theorem.

Theorem 12.2 Let f : S → T be an onto function where S and T are
sets. There exists an onto function gf : S → S/Rf and a one-to-one function
hf : S/Rf → T such that
hf◦gf = f
The expression hf◦gf = f is called the canonical decomposition of f.

Example: Let U = {a, {a}, {a, {a}}, {{a}}} where a is a set. Let the
function f : U → U be defined as follows:

f(x) = a       if a ∈ x
f(x) = {a}     if {a} ∈ x and a ∉ x
f(x) = {{a}}   if x = a

From this we can define the equivalence relation Rf on U where

U/Rf = {f←[{a}], f←[{{a}}], f←[{{{a}}}]}
     = {{{a}, {a, {a}}}, {{{a}}}, {a}}

We can define gf : U → U/Rf and hf : U/Rf → U as below:

x           gf(x)                         hf(gf(x))
a           {a} = f←[{{{a}}}]             {{a}} = f(a)
{a}         {{a}, {a, {a}}} = f←[{a}]     a = f({a})
{a, {a}}    {{a}, {a, {a}}} = f←[{a}]     a = f({a, {a}})
{{a}}       {{{a}}} = f←[{{a}}]           {a} = f({{a}})

We see that
(hf◦gf)(a) = f(a)
(hf◦gf)({a}) = f({a})
(hf◦gf)({a, {a}}) = f({a, {a}})
(hf◦gf)({{a}}) = f({{a}})
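The canonical decomposition in this example can be recomputed with nested frozensets standing in for the sets a, {a}, {a, {a}} and {{a}}. The Python sketch below makes one explicit assumption that the text does not: it takes a = ∅. The names s1, s2, s3, fiber, g and h are hypothetical labels for this illustration.

```python
a = frozenset()                     # assumption: take a = ∅
s1 = frozenset({a})                 # {a}
s2 = frozenset({a, s1})             # {a, {a}}
s3 = frozenset({s1})                # {{a}}
U = [a, s1, s2, s3]

def f(x):
    """The three-case rule defining f on U."""
    if a in x:
        return a
    if s1 in x:
        return s1
    return s3                       # only x = a reaches this line

def fiber(x):
    """S_x = f←[{f(x)}], the block of the partition that contains x."""
    return frozenset(z for z in U if f(z) == f(x))

g = {x: fiber(x) for x in U}        # g_f : U → U/R_f
h = {g[x]: f(x) for x in U}         # h_f : U/R_f → U (f is constant on each block)

quotient = set(g.values())
print(len(quotient))                            # number of blocks in U/R_f
print(all(h[g[x]] == f(x) for x in U))          # h_f∘g_f agrees with f on U
print(len(set(h.values())) == len(h))           # h_f is one-to-one
```

Because f is constant on each block, building h from any representative x of a block gives the same value, which is the well-definedness argument of part B in computational form.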

Concepts review:
1. If f : A → B is a function mapping the set A into the set B, describe
a partition of the set A induced by the function f.
2. If f : A → B is a function mapping the set A into the set B, describe
an equivalence relation Rf on A induced by f.
3. If f : A → B is a function mapping the set A into the set B, describe
the elements of the quotient set A/Rf induced by f.
4. If f : A → B is a function mapping the set A into the set B, what
does “the canonical decomposition of f” mean?
5. If f : A → B is a function mapping the set A into the set B, is it
always possible to “decompose” f as a composition of two functions?
How?

EXERCISES

A. 1. Let S = {a, b, c}, be a set containing three distinct elements.

a) List all the elements of P(S).
b) We define a function f : P(S) × S → P(S) as follows:

f((A, x)) = A     if x ∉ A
f((A, x)) = {x}   if x ∈ A

List all the elements of the function f.

c) Is f onto P(S)? Explain.

d) Is f one-to-one on S? Explain.
e) For every D ∈ P(S), give f ← (D).
f) If the function f determines a partition of P(S) × S, list the subsets
which are members of this partition.

B. 3) Let A be a non-empty subset of the set S and let T = {∅, {∅}}. We define
the function f : S → T as follows: f(x) = ∅ if x ∉ A and f(x) = {∅} if
x ∈ A. Let Rf denote the equivalence relation determined by f.
a) List the elements of the quotient set S/Rf .
b) If hf ◦gf = f is the canonical decomposition of f, list the elements of
the functions gf and of hf .
Part V

From sets to numbers

Part V: From sets to numbers 111

13 / Natural numbers

Abstract. The main objective in this section is to discuss how the natural
numbers, N = {0, 1, 2, 3, . . .}, are constructed within the Zermelo-Fraenkel
axiomatic system. We begin by stating the definitions of “successor set”
and “inductive set”. The natural numbers, N, is then defined as the “small-
est” inductive set. A ZF -axiom will guarantee the existence of this “small-
est inductive set” called the “natural numbers”. We then show how the
Principle of mathematical induction is an immediate consequence of this
definition of N. We define “transitive sets” as sets, A, whose elements are
subsets of A. The elements of N are then shown to be “transitive sets”.
We then prove a few properties possessed by all natural numbers.

13.1 Preliminary discussion.

We now have enough background material to appropriately define the set
commonly known as the natural numbers. We have an intuitive understand-
ing of what the numbers 0, 1, 2, 3, . . . mean, and so we will let our intuition
guide us in our attempt to define N within our set-theoretic axiomatic sys-
tem.
When defining N, what are our options? If we want N = {0, 1, 2, 3, . . . , }
to be defined within the ZFC-axiomatic system, then we don’t have much
choice in the matter: The elements of N must be sets. But they are sets
whose elements have certain characteristics. These characteristics are such
that nobody would confuse N with the real or complex numbers, for
example. A way of approaching this question is to ask ourselves what
properties of the natural numbers allow us to say with confidence that
1/3 or √2 are not natural numbers. One obvious property of N is that it
is an infinite set.
We would necessarily have to define what an “infinite set” is. We then would
have to think deeply about the fundamental characteristics of N and its el-
ements. For example, any set which represents a non-zero natural number
must have an immediate predecessor and an immediate successor. We must
ask ourselves, “what kind of set can have an immediate predecessor and an
immediate successor?”. Furthermore, we will eventually have to determine
how arithmetic operations can be performed on such sets. Determining the
natural numbers’ intrinsic properties which allow us to distinguish them
from other types of numbers must thus be our starting point.

13.2 Constructing the natural numbers.

As we reflect on sets which would suitably represent natural numbers, we
come to realize how very few sets we have actually witnessed up to now in
our study of set theory. What kind of sets have we encountered?
112 Section 13: Natural numbers

1) First, we gave ourselves an axiom (Axiom of class construction) which
guarantees that {x : x ≠ x} is a well-defined class. We decided to
represent this class by “∅” and call it the “empty class”.
2) Then we gave ourselves an axiom (Axiom of subset) that says that “if S
is a set, then any subclass of S must also be a set”. We gave ourselves an
axiom that guaranteed that there exists at least one set (The Axiom of
infinity states that “there exists a non-empty class A called a set such
that...”). Once we showed that ∅ ⊆ S, for any set, then ∅ was our first
explicitly constructed set.
3) We then gave ourselves set constructing tools. The Axiom of pair allows
us to say, for example, that {∅}, {∅, {∅}}, {{∅, {∅}}}, . . . are sets. The
Axiom of union allows to gather together all the elements from a “set
of sets” to form a larger set. The Axiom of power set allows us to
construct a set whose elements are the subsets of a set.

From this we see that nearly all the sets we explicitly constructed up to now
have evolved from successive applications of the axioms of pair, union and
power set, with the empty set as a starting point. If we explicitly list the
elements of these sets, we will see repetitive sequences of the pairs of “curly
brackets”, { and }, and the symbol, “∅”. We then expect every natural
number to be a set of this nature.
If we are asked to define the set of all natural numbers as succinctly as
possible, we may consider the following definition as a reasonable one:
The set, N, of all natural numbers is the intersection of all sets S
which satisfy the two properties, 0 ∈ S and [n ∈ S] ⇒ [n + 1 ∈ S].
Given the knowledge and the experience we have with natural numbers, it
would be difficult to imagine a natural number which does not belong to
such a set. It also seems obvious that numbers such as 4/5 and √5 cannot
belong to such a set. This will be our model for formulating a set-theoretic
definition of the natural numbers. It seems natural to define, 0 = ∅, as being
the smallest of all natural numbers. The challenge is to define the operation
“+ 1” using the language of sets. We can view “+ 1” as an “immediate
successor constructing mechanism”. We begin with the following definition.

Definition 13.1 For any set x, we define the successor, x+ , of x as

x+ = x ∪ {x}

We see that this is an operation which adds a single element to a given set,
x. For example, if A = {a, b, c}, then the successor of A is
A+ = {a, b, c} ∪ {{a, b, c}} = {a, b, c, {a, b, c}}
This is a set constructing mechanism. We need only one set to initiate a
non-ending process. Given any set, we can construct a successor. By the
Axiom of union (A7) a successor is always a set, provided the class which
initiates the process is a set. Starting with the empty set ∅, we obtain

B0 = ∅
B1 = ∅+ = ∅ ∪ {∅} = {∅}
B2 = (∅+ )+ = {∅}+ = {∅} ∪ {{∅}} = {∅, {∅}}
Rather than use the symbols, {B0 , B1 , B2 , . . . , }, why not use conventional
natural number notation?
0 = ∅
1 = 0+ = ∅+ = ∅ ∪ {∅} = {∅} = {0}
2 = 1+ = {∅}+ = {∅} ∪ {{∅}} = {∅, {∅}} = {0, 1}
3 = 2+ = {∅, {∅}}+ = {∅, {∅}} ∪ {{∅, {∅}}} = {∅, {∅}, {∅, {∅}}} = {0, 1, 2}

We can thus define, one at a time, each symbol 0, 1, 2, 3, . . ., as a set.
Let’s continue for a bit and see what happens. We will define:
4 = 3+ = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}} = {0, 1, 2, 3}
5 = 4+ = {0, 1, 2, 3, 4}
6 = 5+ = {0, 1, 2, 3, 4, 5}
7 = 6+ = {0, 1, 2, 3, 4, 5, 6}

A clear pattern emerges. This method for constructing natural numbers is
worth exploring.
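The successor construction lends itself to a direct computation. The following Python sketch models sets as frozensets so that a set can be an element of another set; the function names successor and natural are assumptions made for this illustration.

```python
def successor(x):
    """Return x+ = x ∪ {x}, with sets modeled as frozensets."""
    return x | frozenset({x})

def natural(n):
    """Build the set representing n by applying the successor n times to ∅."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

three = natural(3)
print(three == frozenset({natural(0), natural(1), natural(2)}))  # 3 = {0, 1, 2}
print(len(natural(7)))                                           # 7 has seven elements
```

Each application of successor adds exactly one new element, namely the set it was applied to, so the set representing n has n elements.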
We further examine the properties of a set constructed in this way.
− As mentioned before, construction starts with the set 0 = ∅ moving
upwards.
− For each of the elements of the set, 7 = {0, 1, 2, 3, 4, 5, 6}, we see that the
property
n ⊂ n+
is satisfied (in the sense that every element of the set, 6, belongs to the
set, 7). And surprisingly enough,

n ∈ n+

(in the sense that the set, 6, is an element of the set, 7). That is,

“. . . each set, n, is both a subset and an element of its successor

n+ ”.

For convenience, we define the ordering relation “n < m” to mean “n ⊂

m” so that 0 < 1 < 2 < 3 < 4 < 5 < 6 < 7, for example. Witness
3 = 2+ = {0, 1, 2} = {0, 1, {0, 1}}, so 2 = {0, 1} ⊂ 3; hence, 2 < 3.
− For each element, n, of the set 7 = {0, 1, 2, 3, 4, 5, 6}, we see that n is
precisely the set of all “natural numbers” strictly less than itself.

− Also, if we set 0 = ∅, then

1 = {∅} = {0}, so 0 ∈ 1 and 0 ⊂ 1
2 = {0, 1} = {0, {0}}, so 1 ∈ 2 and 1 ⊂ 2
3 = {0, 1, 2} = {0, {0}, {0, {0}}}, so 2 ∈ 3 and 2 ⊂ 3,
and also 1 ∈ 3 and 1 ⊂ 3

− We have verified above that the set, 3 = {0, 1, 2}, satisfies the property:
Every pair of elements contained in the number 3 are comparable with
respect to <, and they satisfy the interesting property:

(n < m) ⇒ [(n ∈ m) ⇔ (n ⊂ m)]

It is easily verified that each of 4, 5, 6, and 7, thus constructed, satisfies
this property.
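The observations listed above can be checked mechanically for the first few natural numbers. This is only a sketch, assuming Python frozensets as finite sets; the function names are hypothetical and the check is a finite experiment, not a proof.

```python
def succ(x: frozenset) -> frozenset:
    """Return x+ = x ∪ {x}."""
    return x | frozenset([x])


def first_naturals(count: int) -> list:
    """Return the list [0, 1, ..., count - 1], each number built as a set."""
    nums, current = [], frozenset()
    for _ in range(count):
        nums.append(current)
        current = succ(current)
    return nums


def element_iff_proper_subset(nums: list) -> bool:
    """Check that n ∈ m exactly when n ⊂ m (frozenset '<' means proper subset)."""
    return all((n in m) == (n < m) for n in nums for m in nums)


def each_is_set_of_predecessors(nums: list) -> bool:
    """Check that each n equals the set of all numbers constructed before it."""
    return all(n == frozenset(nums[:i]) for i, n in enumerate(nums))
```

Calling both checks on first_naturals(8) covers the numbers 0 through 7 discussed above.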
We seem to have a mechanism to construct finitely many natural numbers.
But we can’t use it to construct them all since there are too many. We will
have to find a way to work around this obstacle.
Since the expression, n∪{n}, is at the core of everything we have seen above,
we formally provide some vocabulary to discuss such a concept.

Definition 13.2 If x is a set, then we define

x+ = x ∪ {x}
A set, A, is called an inductive set¹ if it satisfies the following two properties:

a) ∅ ∈ A.
b) x ∈ A ⇒ x+ ∈ A.

The above definition of “Inductive set” nicely describes what the set of all
natural numbers is like. But defining “Inductive set” doesn’t guarantee that
one exists in our set-theoretic universe. We will need some outside help for
this.² This is done with the Axiom of infinity (A8).

Axiom (The axiom of infinity): An inductive set exists.

1 The term successor set is also used instead of inductive set.

2 This is somewhat like in Ancient Greek theater: the literary technique of magically
extracting a character from an intractable situation was referred to as pulling
a “Deus ex machina” (pronounced “makina”).
Part V: From sets to numbers 115

With this guarantee that at least one inductive set exists, we will define the
natural numbers as being the smallest one.

Definition 13.3 We define the set, N, of all natural numbers as the intersec-
tion of all inductive sets. That is,

N = {x : x ∈ I for every inductive set I}

Is the set N itself inductive? We verify this: By definition, all inductive sets
contain the element ∅ and so ∅ belongs to their intersection, N. Condition
one is satisfied. We verify condition two: If x ∈ N, then x belongs to all
inductive sets and so x+ must belong to all inductive sets; so x+ ∈ N. So N
is an inductive set. It immediately follows that if n is any natural number,
then so is its successor, n+ .

13.3 Mathematical induction.

The cleverly chosen four words “An inductive set exists” will allow us to
prove that the smallest inductive set provides a precise set-theoretic repre-
sentation of the set of all natural numbers as we know it. This inductive set
possesses all the essential properties of the natural numbers, including its
linear ordering structure. As we shall soon see, it will allow us to define on it
the common arithmetic operations we normally perform on natural numbers.
Proving that this inductive set possesses all the essential properties of the
natural numbers will require the well-known mathematical tool called the
Principle of mathematical induction. This principle is “hardwired” within
the definition of “inductive set”.

Theorem 13.4 Let A be a subset of N. If A satisfies the two properties:

a) 0 ∈ A
b) m ∈ A ⇒ m+ ∈ A,
then A = N.

Proof:
By hypothesis, A is an inductive set since it satisfies the two required properties.
Since N is the intersection of all inductive sets, then N ⊆ A. By hypothesis,
A ⊆ N. Thus, A = N.

Corollary 13.5 (Principle of mathematical induction.) Let P denote a particular
set property. Suppose P (n) means “the property P is satisfied when
applied to the value of the natural number n”. If

a) P (0) holds true,

b) P (n) holds true ⇒ P (n+ ) holds true,
then P (n) holds true for all natural numbers n.

Proof: Let
A = {n ∈ N : P (n) holds true}
Part (a) of the hypothesis states that “0 ∈ A”, while part (b) states that
[n ∈ A] ⇒ [n+ ∈ A]. Then A is an inductive set and so A = N (by the theo-
rem). So P (n) is true for all natural numbers n.

A few remarks. The proofs above illustrate how the Principle of mathemati-
cal induction is intrinsically linked to the definition of the natural numbers.
The set of all natural numbers is the only (non-empty) set whose existence
is essentially postulated. The other explicitly defined set is the empty class
which was shown to be a set (as a consequence of the Axiom of construction
followed by the Axiom of subset).
Some readers may not be familiar with “proofs by mathematical induc-
tion”. For these readers we provide a few examples of proofs by induc-
tion in this section. We summarize the main steps to be followed when
proving a statement by induction. Induction is used when we are dealing
with some property P (n) which is a function of the natural numbers. Let
S = {n ∈ N : P (n) holds true}. Now this set, S, may possibly be empty,
may contain a few elements of N or may even contain all of its elements.
The objective is to show that if two specific conditions are satisfied, then
S = N. That is, we want to prove that P (n) holds true for all values of n.
For example, suppose P (n) is described as the property³

1 + 2 + 3 + · · · + n = n(n + 1)/2
We want to prove that this holds true no matter what natural number n we
use. We highlight the main steps.
Step 1: Write down explicitly the property which is a function of n as illus-
trated above.
Step 2: Prove the “Base case”. This means that we must prove that P (0) is
true. In our example, we are required to show that 0 = 0(0 + 1)/2 = 0. We see
that the base case holds true. If the property cannot be shown to be true
for the base case, then P (n) may not hold true for all n. It sometimes helps
us to understand what is going on if we prove that both P (0) and P (1) are
true (especially when the base case is “vacuously true”).

3 To allow us to present this particular example at this time we will assume that the
operations of addition and multiplication on the natural numbers are known to us. These
will be properly defined soon.

Step 3: State the “Inductive hypothesis”. In this step we suppose that
P (n) is true for some unspecified natural number n. That this property holds
true for a particular n is now considered to be “given”.
Step 4: With the help of the assumption that P (n) is true, prove that P (n+ )
(equivalently P (n + 1)) is true. In our example we would write something
like this:
1 + 2 + · · · + n = n(n + 1)/2 ⇒ 1 + 2 + · · · + n + (n + 1) = n(n + 1)/2 + (n + 1)
⇒ 1 + 2 + · · · + n + (n + 1) = (n + 1)(n + 2)/2

Step 5: Write down the conclusion: Since P (0) is true and “P (n) is true” implies that
“P (n + 1) is true”, then, by the principle of mathematical induction, P (n)
holds true for all n.
Difficulties encountered by students when first applying this procedure are
often due to skipped steps.
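The base case and the algebra of Step 4 can be spot-checked numerically. The sketch below assumes ordinary Python integer arithmetic in place of the set-theoretic operations (which are defined later in the text), and the function names are illustrative. A finite check of this kind does not replace the induction argument; it only guards against slips in the algebra.

```python
def left_side(n: int) -> int:
    """Compute 1 + 2 + ... + n directly (the empty sum is 0 when n = 0)."""
    return sum(range(1, n + 1))


def right_side(n: int) -> int:
    """Compute n(n + 1)/2; the product n(n + 1) is always even."""
    return n * (n + 1) // 2


def base_case_holds() -> bool:
    """Step 2: check P(0), that is 0 = 0(0 + 1)/2."""
    return left_side(0) == right_side(0)


def inductive_step_identity(n: int) -> bool:
    """Step 4: check n(n + 1)/2 + (n + 1) = (n + 1)(n + 2)/2 for one n."""
    return right_side(n) + (n + 1) == (n + 1) * (n + 2) // 2
```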

13.4 Transitive sets.

The few examples of natural numbers constructed above have illustrated an
interesting property: If n is a natural number (viewed as a set), each and
every element of n is also a subset of n. That is, if a ∈ n then a ⊂ n. We
formally define the words used to describe sets which satisfy this property.
We then provide in the form of a theorem a useful characterization of such
sets.

Definition 13.6 A set, S, which satisfies the property “x ∈ S ⇒ x ⊆ S” is
called a transitive set.

Theorem 13.7 The non-empty set, S, is a transitive set if and only if the
property
[x ∈ y and y ∈ S] ⇒ [x ∈ S]
holds true.

Proof:
( ⇒ ) What we are given: That S is a transitive set. Suppose y ∈ S and
x ∈ y.
What we are required to show: That x ∈ S.
Since S is a transitive set, y ∈ S implies y ⊆ S. Since x ∈ y ⊆ S then x ∈ S.
( ⇐ ) We are given that S satisfies the property: If x ∈ y and y ∈ S then
x ∈ S.
We are required to show that S is a transitive set.
Suppose “(x ∈ y and y ∈ S) ⇒ (x ∈ S)”. Let z ∈ S. It suffices to show that
“z ⊆ S”.
If z = ∅, then z ⊆ S and we are done. Suppose that z ≠ ∅. Let a ∈ z. By
hypothesis, a ∈ S. Since a ∈ z implies a ∈ S, then z ⊆ S.
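The two descriptions of a transitive set in Definition 13.6 and Theorem 13.7 can be compared on small examples. This is a sketch under the assumption that finite sets are modelled by Python frozensets whose elements are themselves frozensets; all names below are hypothetical.

```python
EMPTY = frozenset()                # ∅
ONE = frozenset([EMPTY])           # {∅}
TWO = frozenset([EMPTY, ONE])      # {∅, {∅}}
SINGLETON_ONE = frozenset([ONE])   # {{∅}}, which is not transitive


def is_transitive(s: frozenset) -> bool:
    """Definition 13.6: every element of s is a subset of s."""
    return all(x <= s for x in s)


def satisfies_membership_rule(s: frozenset) -> bool:
    """Theorem 13.7: x ∈ y and y ∈ s together imply x ∈ s."""
    return all(x in s for y in s for x in y)
```

The set {{∅}} fails both tests for the same reason: its element {∅} contains ∅, but ∅ is not an element of {{∅}}.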

The above characterization “(x ∈ y and y ∈ S) ⇒ x ∈ S” is reminiscent of
the previously defined idea of transitivity (that is, a < b and b < c ⇒ a < c).

The next theorem shows that N is a transitive set.

Theorem 13.8 The set N of natural numbers is a transitive set.

Proof:
By the characterization of transitive sets stated above, it suffices to show
that for each n ∈ N, x ∈ n ⇒ x ∈ N. We will prove this by mathematical
induction. Let P (n) denote the statement “(x ∈ n ∈ N) ⇒ x ∈ N”.
− Base case: The statement, “x ∈ 0 = ∅” ⇒ “x ∈ N”, is true since there
are no elements in 0 = ∅. So P (0) holds true.
− Inductive hypothesis: Suppose the statement P (n) holds true for the
natural number n. We are required to show that P (n+ ) holds true. Sup-
pose y ∈ n+ = n ∪ {n}. Then either y ∈ n or y ∈ {n}. If y ∈ n, then by
the inductive hypothesis, y ∈ N. If y ∈ {n}, then y = n ∈ N.
By mathematical induction, the statement holds true for all elements of N
and so, by definition, N is a transitive set.

The following theorem first establishes that no natural number is an element
of itself. It also shows that, for distinct natural numbers, one is an element
of the other if and only if it is a proper subset of the other. This of course
implies that every natural number is a transitive set.

Theorem 13.9

a) If n, m are natural numbers, m ∈ n ⇒ m ⊆ n. Hence, every natural
number is a transitive set.
b) For any natural number n, n ≠ n+ .
c) For any natural number n, n ∉ n.
d) Suppose n is a natural number. If m is a natural number such that
m ⊂ n then m ∈ n.⁴

Proof:

a) This is a proof by mathematical induction. Let P (n) be the property “m
and n are natural numbers and m ∈ n ⇒ m ⊆ n”. We are required to
prove that the set {n ∈ N : P (n) is true } = N.
− Base case: We claim that P (0) holds true. Recall that 0 = ∅. Suppose
P (∅) is false. Then there must be some x ∈ ∅ such that x ⊈ ∅. This
is absurd since ∅ does not contain any elements. So P (0) holds true.
− Inductive hypothesis: Suppose that for some natural number n, P (n)
holds true. We are required to show that P (n+ ) holds true.
* To show that P (n+ ) is true, suppose m ∈ n+ = n ∪ {n}. We are
required to show that m ⊆ n+ = n ∪ {n}.
Case 1: If m ∈ n then by the inductive hypothesis, P (n) is true, and
so m ⊆ n. Then m ⊆ n+ = n ∪ {n} and so P (n+ ) is true.
Case 2: Suppose m ∉ n.

m ∉ n and m ∈ n+ = n ∪ {n} ⇒ m ∈ {n}
⇒ m = n

Clearly m = n ⊆ n ∪ {n}. Then P (n+ ) is true.

We have shown that if P (n) is true, then P (n+ ) is true. By mathe-
matical induction, P (n) is true for all n ∈ N. We conclude that every
natural number is a transitive set.
b) We prove that n ≠ n+ by induction. Let P (n) denote the statement
“n ≠ n ∪ {n}”.
Base case: Since ∅ = { } ≠ {∅}, P (0) holds true.
Inductive hypothesis: Suppose n ≠ n ∪ {n} for some natural number n.
We are required to show that n ∪ {n} ≠ n ∪ {n} ∪ {n ∪ {n}}.

4 The reader is cautioned not to misread this statement. It does not say that any subset
of a natural number n is an element of n. It says that “any natural number which is a proper
subset of n is an element of n”.

Now
n ∪ {n} ∪ {n ∪ {n}} = n ∪ {n} ⇒ n ∪ {n} ∈ n ∪ {n}
⇒ n ∪ {n} ∈ n or n ∪ {n} = n
The inductive hypothesis does not allow “n ∪ {n} = n”. So n ∪ {n} ∈ n.
By part a), n ∪ {n} ⊆ n. Since n ⊆ n ∪ {n}, then n = n ∪ {n}
(Axiom of extent) again contradicting the inductive hypothesis. Then
n ∪ {n} ∪ {n ∪ {n}} ≠ n ∪ {n}. So “P (n) ⇒ P (n+ )” holds true.
By mathematical induction, n ≠ n ∪ {n} for all natural numbers.
c) Suppose n is a natural number such that n ∈ n. If m ∈ n∪{n} then m ∈ n
or m = n. In both cases m ∈ n. Then n ∪ {n} ⊆ n. Since n ⊆ n ∪ {n},
n = n ∪ {n} contradicting the statement of part (b). We must conclude,
if n is a natural number, then n ∉ n.
d) We are required to show that for all m, n ∈ N, m ⊂ n ⇒ m ∈ n. We will
prove this by mathematical induction on n. Let P (n) be the property “[m
is a natural number and m ⊂ n] ⇒ [m ∈ n]”.
− Base cases n = ∅ or 1: For n = ∅, the statement m ⊂ ∅ ⇒ m ∈ ∅ is
vacuously true. For n = 1, ∅ ⊂ 1 = {∅} and ∅ ∈ 1 = {∅} hold true.
So both base cases P (0) and P (1) hold true. (Actually showing P (0)
holds true is sufficient.)
− Inductive hypothesis: Suppose n is a natural number such that P (n)
holds true; that is, for any natural number m, “m ⊂ n ⇒ m ∈ n”. We
are required to show that for any natural number m, “m ⊂ n+ ⇒ m ∈
n+ ”.
· Let m be a natural number such that “m ⊂ n+ = n ∪ {n}”.
Case 1: If n ∉ m, then m ⊆ n. If m = n, then m ∈ {n} ⊆ n ∪ {n}.
Otherwise m ⊂ n and, by the inductive hypothesis,
m ∈ n ⊂ n ∪ {n}. Hence, m ∈ n ∪ {n}.
Case 2: Suppose n ∈ m. By part (c), m ≠ n. Since m and n are
distinct natural numbers, by part (a), n ⊂ m. Then n ∪ {n} ⊆ m.
Since m ⊂ n ∪ {n}, then n ∪ {n} ⊂ n ∪ {n}, a contradiction.
So only case 1 applies. So P (n+ ) holds true.
By the principle of mathematical induction, m ⊂ n ⇒ m ∈ n for all
natural numbers m and n.
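The four parts of Theorem 13.9 can be tested on the first few natural numbers, together with the caution about part (d) that only natural numbers which are proper subsets of n are claimed to be elements of n. The sketch assumes Python frozensets as finite sets, uses illustrative names, and is a finite experiment rather than a proof.

```python
def succ(x: frozenset) -> frozenset:
    """Return x+ = x ∪ {x}."""
    return x | frozenset([x])


def first_naturals(count: int) -> list:
    """Return the list [0, 1, ..., count - 1], each number built as a set."""
    nums, current = [], frozenset()
    for _ in range(count):
        nums.append(current)
        current = succ(current)
    return nums


def check_theorem_13_9(nums: list) -> bool:
    """Check parts (a) to (d) for every number in the list nums."""
    part_a = all(m <= n for n in nums for m in n)               # m ∈ n ⇒ m ⊆ n
    part_b = all(n != succ(n) for n in nums)                    # n ≠ n+
    part_c = all(n not in n for n in nums)                      # n ∉ n
    part_d = all(m in n for n in nums for m in nums if m < n)   # m ⊂ n ⇒ m ∈ n
    return part_a and part_b and part_c and part_d


def subset_that_is_not_an_element(nums: list) -> frozenset:
    """Return {1}: a proper subset of 3 which is not a natural number."""
    return frozenset([nums[1]])
```

The set {1} = {{∅}} is a proper subset of 3 = {0, 1, 2} but it is not an element of 3, which is consistent with part (d) because {1} is not itself a natural number.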

To illustrate how the elements of N satisfy the property “x ∈ n ⇒ x ⊂ n”
consider, for example, the natural number 4,
4 = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}
We see that
0 = ∅ ∈ 4 and 0 = ∅ ⊂ 4
1 = {∅} ∈ 4 and 1 = {∅} ⊂ 4, since every element of {∅} belongs to 4.
2 = {∅, {∅}} ∈ 4 and 2 = {∅, {∅}} ⊂ 4, since every element of {∅, {∅}} belongs to 4.
3 = {∅, {∅}, {∅, {∅}}} ∈ 4 and 3 = {∅, {∅}, {∅, {∅}}} ⊂ 4, since every element of
{∅, {∅}, {∅, {∅}}} belongs to 4.

13.5 Other basic properties of natural numbers.

We prove a few more properties of the set, N, as defined above.

Theorem 13.10 Let m and n be distinct natural numbers.

a) If m ⊂ n, then m+ ⊆ n.
b) Let m and n be any pair of distinct natural numbers. Then either m ⊂ n
or n ⊂ m. Equivalently, m ∈ n or n ∈ m. Hence, both “⊂” and “∈” are
strict linear orderings of N.
c) There is no natural number m such that n ⊂ m ⊂ n+ .

Proof:

a) What we are given: That m and n are distinct natural numbers where
m ⊂ n. What we are required to show: That m ∪ {m} ⊆ n.
Since m ⊂ n, then m ∈ n (by Theorem 13.9, part (d) ). By 13.9 part (a),
every natural number is a transitive set, so m ∈ n implies m ⊆ n. Then
m ∪ {m} ⊆ n ∪ {m}. Suppose y ∈ m ∪ {m}. Then y ∈ n or y ∈ {m}. If
y ∈ {m}, then y = m ∈ n. So {m} ⊆ n. Then n ∪ {m} ⊆ n. We conclude
that m ∪ {m} ⊆ n, as required.

b) What we are given: That m and n are distinct natural numbers. What
we are required to prove: That m ⊂ n or n ⊂ m.
We will prove this by mathematical induction on n. Let P (n) be the
statement “for every natural number m ≠ n, either m ⊂ n or n ⊂ m”.
− Base cases n = ∅ or 1: For n = ∅, the statement ∅ ⊂ m holds
true for all non-zero natural numbers m. For n = 1 and m = 0,
∅ ⊂ 1 = {∅}. Suppose m is a natural number other than 0 and 1.
Then ∅ ⊂ m ⇒ ∅+ = {∅} ⊆ m (by the Theorem 13.9 above). Since
m ≠ 1, then {∅} ⊂ m holds true for any such m (since ∅ ∈ m for
every non-zero natural number m).
− Inductive hypothesis: Suppose P (n) holds true for some natural num-
ber n. That is, suppose n is a natural number such that for any natural
number m not equal to n, either m ⊂ n or n ⊂ m. We are required to
show that P (n+ ) holds true.
Let m be a natural number such that m ≠ n+ . Case 1: If m = n,
then m ∈ n ∪ {n} and so m ⊂ n ∪ {n} (by Theorem 13.9) and we
are done. Case 2: Suppose m ≠ n. Then, by the inductive hypothesis,
either m ⊂ n or n ⊂ m. If m ⊂ n, then m ⊂ n+ = n ∪ {n}. Done.
If n ⊂ m, then n+ ⊆ m (by part a)). Since m ≠ n+ , n+ ⊂ m. Then
P (n+ ) holds true.

By the principle of mathematical induction, for any pair of distinct natural
numbers m and n, either m ⊂ n or n ⊂ m. Since m ⊂ n if and only
if m ∈ n, this property also holds true with respect to the relation “∈”.
c) What we are given: That n and m are distinct natural numbers. What
we are required to prove: That n ⊂ m ⊂ n+ is impossible.
Suppose n ⊂ m ⊂ n+ . Then m ∈ n ∪ {n}. Since m ≠ n, then m ∈ n
which means m ⊂ n. This contradicts our hypothesis, n ⊂ m. We have
shown that n ⊂ m ⊂ n+ is impossible, as required.
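The three parts of Theorem 13.10 lend themselves to a quick finite check. The sketch below again assumes Python frozensets as finite sets, where the operators < and <= test for proper subset and subset respectively; the function names are illustrative only.

```python
def succ(x: frozenset) -> frozenset:
    """Return x+ = x ∪ {x}."""
    return x | frozenset([x])


def first_naturals(count: int) -> list:
    """Return the list [0, 1, ..., count - 1], each number built as a set."""
    nums, current = [], frozenset()
    for _ in range(count):
        nums.append(current)
        current = succ(current)
    return nums


def check_theorem_13_10(nums: list) -> bool:
    """Check parts (a), (b) and (c) for all pairs taken from nums."""
    # a) m ⊂ n implies m+ ⊆ n
    part_a = all(succ(m) <= n for m in nums for n in nums if m < n)
    # b) distinct numbers are comparable under ⊂
    part_b = all(m < n or n < m for m in nums for n in nums if m != n)
    # c) no number m sits strictly between n and n+
    part_c = not any(n < m < succ(n) for n in nums for m in nums)
    return part_a and part_b and part_c
```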

13.6 The immediate predecessor of a natural number.

We have seen that the elements, n, of the natural numbers are equipped
with an “immediate successor” constructing algorithm, n+ = n ∪ {n}, where
n ⊂ n+ = n ∪ {n} and no other natural number m sits between n and n+ . It is
normal to ask if every non-zero natural number has an “immediate predecessor”.
That is, given an arbitrary natural number, n, are we guaranteed that
there exists a natural number, k, such that k ∪ {k} = k+ = n? Part c) of the
theorem above guarantees that there can be no natural number between k
and n, and so such a k would be the immediate predecessor of n. If such a k
exists, is there a way to construct this predecessor k of n just as we were able
to construct an immediate successor of a natural number? The next theorem
shows how we can construct the immediate predecessor of a natural number.

Theorem 13.11 If m and n are natural numbers such that m+ = n, then m
is called an immediate predecessor of n. For any non-zero natural number n,
k = ∪{m ∈ N : m ⊂ n} is a natural number which is an immediate predecessor
of n.

Proof:
What we are given: That n is a non-zero natural number.
What we are required to show: That k = ∪{m ∈ N : m ⊂ n} is a natural
number and k + = n.
Proof by induction. Let P (n) be the statement “k = ∪{m ∈ N : m ⊂ n} is
a natural number and k + = n”.
− Base cases n = 1 or 2: If n = 1, then k = ∪{m : m ∈ 1} = ∅, a natural
number such that k+ = ∅ ∪ {∅} = {∅} = 1 = n. If n = 2, then
k = ∪{m : m ∈ 2} = ∅ ∪ 1 = ∅ ∪ {∅} = {∅} = 1, a natural number such that
k+ = {∅} ∪ {{∅}} = {∅, {∅}} = 2 = n.
− Induction hypothesis: Suppose P (n) holds true. That is, suppose n
is a natural number for which ∪{m : m ∈ n} is a natural number satisfying
(∪{m : m ∈ n})+ = n. To show that P (n+ ) holds true, it suffices to
show that ∪{m : m ∈ n+ } is a natural number and (∪{m : m ∈ n+ })+ = n+ . Let
k = ∪{m : m ∈ n+ }. See that

k = ∪{m ∈ N : m ∈ n ∪ {n}}
= ∪{m ∈ N : m ∈ n or m = n}
= ∪{m ∈ N : m ∈ n} ∪ n

By the induction hypothesis, ∪{m ∈ N : m ∈ n} is a natural number
which is an immediate predecessor of n. Then it must be a proper subset
of n. It follows that ∪{m ∈ N : m ∈ n} ∪ n = n. Then k = n which implies
k + = n+ . So P (n+ ) holds true.
By mathematical induction, for every non-zero natural number n, ∪{m ∈ N : m ⊂ n}
is a natural number and (∪{m ∈ N : m ⊂ n})+ = n.
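The predecessor formula of Theorem 13.11 can be tried out directly. The sketch below assumes Python frozensets as finite sets and uses the fact, from Theorem 13.9, that the natural numbers m with m ⊂ n are exactly the elements of n, so the union is taken over the elements of n. The names succ, nat and union_of_elements are introduced here for illustration.

```python
def succ(x: frozenset) -> frozenset:
    """Return x+ = x ∪ {x}."""
    return x | frozenset([x])


def nat(n: int) -> frozenset:
    """Build the set representing n by applying the successor n times to ∅."""
    result = frozenset()
    for _ in range(n):
        result = succ(result)
    return result


def union_of_elements(n: frozenset) -> frozenset:
    """Return the union of all elements of n (the empty union is ∅)."""
    result = frozenset()
    for m in n:
        result = result | m
    return result
```

For n = 0 the union is ∅, whose successor is 1 rather than 0, which illustrates why the theorem is stated for non-zero natural numbers only.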

Theorem 13.12 Unique immediate predecessors. Any non-zero natural number
has a unique immediate predecessor.

Proof:
We prove this by induction. For non-zero natural numbers n, let P (n) be the
statement “the natural number n has a unique immediate predecessor”.
− Base case n = 1: By definition, 1 = {∅} = ∅ ∪ {∅} = ∅+ = 0+ . So P (1)
holds true.
− Induction hypothesis: Suppose n is a non-zero natural number such that
P (n) holds true. That is, there is only one natural number m such that
m+ = n. We are required to show that n+ has a unique predecessor.
Trivially, n is one immediate predecessor of n+ . Suppose k is another
natural number such that k + = n+ . Then n ∪ {n} = k ∪ {k}. We claim
that n = k. Suppose not. Then both n ∈ k and k ∈ n must hold true. We
have shown that every natural number is a transitive set (see Theorem
13.9 (a)). By Theorem 13.7, n ∈ k and k ∈ n implies n ∈ n. By 13.9 part
(c) this cannot be true for any natural number, and so we have a contradiction.
Then n = k as claimed. Hence, n+ has a unique immediate
predecessor, n. So P (n+ ) holds true.
By mathematical induction, every non-zero natural number has a unique im-
mediate predecessor.

13.7 The second version of the Principle of mathematical induction.

The following theorem is a variation of the Principle of mathematical induc-
tion. It may sometimes be more efficient to apply this version when proving
certain theorem statements. Although we will not present an application of
this version now, we will soon see some proofs in which this version is easier
to apply.

Theorem 13.13 (The Principle of mathematical induction: second version.)

Suppose P (n) is a property whose truth value depends on the natural number
n. Suppose that for any natural number n,

[P (k) is true for all k < n] ⇒ [P (n) is true]

Then P (n) holds true for all natural numbers n.⁵

Proof:
Given hypothesis: [P (k) is true for all k < n] ⇒ [P (n) is true]
What we are required to show: That P (n) holds true for all natural numbers n.
Let P ∗(n) denote the statement “P (k) is true for all k < n”. We will show
by induction (original version) that P ∗ (n) holds true for all n. From this we
will conclude that P (n) holds true for all n.
− Base case: Since P (k) vacuously holds true for all k ∈ ∅, then by the given
hypothesis, P (0) = P (∅) holds true. So the base case P ∗(0) is satisfied.
− Inductive hypothesis: Suppose P ∗ (n) holds true for some natural number
n. This means “P (k) is true for all k < n”. By the given hypothesis, P (n)
must hold true. Then “P (k) is true for all k < n+ ”. That is, P ∗ (n+ ) holds
true.
By mathematical induction, P ∗ (n) holds true for all natural numbers n.
Let m be any natural number. Then, by what we have just shown, P ∗ (m+ )
holds true. That is, “P (k) is true for all k < m+ ”. So P (m) is true. The
statement is thus proved.

5 Note that k < n and k ∈ n are equivalent expressions.

13.8 A few words about the Peano axioms.

There is a particular set of axioms which serves as a foundation for all
mathematical statements related to the natural numbers. These axioms are
not set-theoretic and slightly predate the ZFC-axioms. In 1889, the Italian
mathematician Giuseppe Peano proposed a set of nine mathematical statements
from which evolve all mathematical statements relating to the natural
numbers. Today these are referred to as the Peano axioms (the Italian name
“Peano” is pronounced as ‘pay-ah-no’). We will see that each of these axioms
belongs to the ZFC set-theoretic universe, and so as a group they play the role of
intermediary − a more easily understandable one − between the Set theory
axioms and the body of mathematics we refer to as number theory. We will
list the nine Peano axioms below. The symbol “0” is an undefined symbol.
The symbol “S” represents a single valued function we refer to as the “suc-
cessor function” on the natural numbers.

P1 The symbol 0 is a natural number.

Peano axioms on equality.

P2 Every natural number is equal to itself. That is, equality “=” is a
reflexive binary relation on N.
P3 If n and m are natural numbers such that n = m, then m = n. That
is, equality is a symmetric binary relation on N.
P4 If n, m and k are natural numbers such that n = m and m = k then
n = k. That is, equality is a transitive binary relation on N.
P5 If n is a natural number and “a = n”, then a is a natural number.
Properties involving the successor function, S.
P6 If n is a natural number, then so is the image, S(n), of n under S. We
refer to “S(n)” as a successor of n.
P7 The natural numbers n and m are equal if and only if S(n) = S(m).
Hence, every natural number has precisely one successor and a natural
number is the successor of, at most, one natural number.
P8 For any natural number n, S(n) ≠ 0.
Mathematical induction.
P9 If M is a set which contains the natural number 0 and, whenever n
belongs to M , S(n) also belongs to M , then M contains all natural numbers.
That is {0, S(0), S(S(0)), S(S(S(0))), . . . , } ⊆ M .
We verify that each of these nine statements follows from ZFC-axioms. The
empty set, ∅, is easily perceived as being the natural number 0. Equality
of sets is reflexive in ZFC (this follows just about immediately from Axiom
A1) and so this must hold true for those sets in ZFC we call the natural
numbers. Symmetry and transitivity of “=” on sets automatically applies to
those sets we call the natural numbers. So P3 and P4 also belong to ZFC.
Equal sets contain the same elements and so P5 holds true in ZFC (see The-
orem 2.3 (c)). By the Axiom of pair and union, for any natural number n,
n ∪ {n} = S(n) is a set. So P6, P7 and P8 easily follow from this definition
of the “successor of n”. The Mathematical induction statement, P9, follows

from the Axiom of infinity. Note that the Axiom of power set, the Axiom
of replacement, the Axiom of regularity and the Axiom of choice are not
required to do mathematics with the natural numbers.
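The successor axioms P7 and P8 can be spot-checked in the set-theoretic model with S(n) = n ∪ {n}. This is only a sketch, assuming Python frozensets as finite sets and illustrative function names; checking finitely many numbers is an experiment, not a derivation from the ZFC-axioms.

```python
def succ(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}."""
    return x | frozenset([x])


def first_naturals(count: int) -> list:
    """Return the list [0, 1, ..., count - 1], each number built as a set."""
    nums, current = [], frozenset()
    for _ in range(count):
        nums.append(current)
        current = succ(current)
    return nums


def check_successor_axioms(nums: list) -> bool:
    """Check P7 and P8 for the numbers in nums."""
    zero = frozenset()
    # P7: n = m if and only if S(n) = S(m)
    p7 = all((succ(n) == succ(m)) == (n == m) for n in nums for m in nums)
    # P8: S(n) is never 0, because n is always an element of S(n)
    p8 = all(succ(n) != zero for n in nums)
    return p7 and p8
```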

Concepts review:
1. If x is a set, then what is its successor?
2. What is an inductive set?
3. What does the Axiom of infinity state?
4. How is the set of natural numbers defined?
5. List the first four natural numbers using set notation.
6. What is the Principle of mathematical induction?
7. What is a transitive set?
8. If n is a natural number, is n a transitive set?
9. What is the difference between an inductive set and a transitive set?
10. Is N a transitive set?
11. Give a characterization of transitive sets.
12. Is it true that any element of a natural number is a natural number?
Why?
13. Can a natural number be an element of itself?
14. Is N a natural number? Why?
15. If n is a natural number, how many successors can n have?
16. What is a second version of the Principle of mathematical induc-
tion?
17. If n is a natural number, what does it mean to say that m is its
predecessor?
18. Give an expression which describes the predecessor of a natural
number n.
19. If m and n are natural numbers such that m ∈ n, can it happen
that m+ = n? Can it happen that n ∈ m+ ?
20. If m is a subset of the natural number n, is it possible that m ∈ n?
In which case?
21. Are there any natural numbers which are inductive sets?
22. For three natural numbers m, n and t satisfying m ∈ n and n ∈ t,
does it always follow that m ∈ t?

EXERCISES

A. 1. Let m and n be two natural numbers. Prove that if m ∈ n, then m ≠ n+ .
2. Show that if n is a natural number, then n+ ≠ 0.
3. Write down the natural number 5 using only left and right brackets, commas
and the symbol “∅”.
4. If n is a natural number, is n an inductive set? Justify your answer.
5. Is the set of all natural numbers a natural number? Justify your answer.

B. 6. Is N ∪ {N} a natural number? Explain why or why not.

7. Suppose n is a non-zero natural number.
a) Is P(n) a natural number? Why?
b) Does P(n) contain a natural number? Which one?
c) List all elements of P(3).
8. Consider the class P(N).
a) Is P(N) a set? Why?
b) Does P(N) contain any natural numbers? Explain.
c) Does P(N) contain elements which are not natural numbers? If so, list
at least three.
d) Is P(N) a natural number? Why?
9. Is N ∪ {N} a transitive set? If so, prove it. If not, say why.
10. Is N ∪ {N} an inductive set? If so, prove it. If not, say why.

C. 11. Show that finite unions and finite intersections of transitive sets are tran-
sitive sets.
12. Suppose S ⊂ N. Suppose that the union of all elements of S is S. Prove
that S cannot be a natural number.
13. Jo-Anne has defined the natural numbers in the ZFC-axiomatic system as
follows. She defined an inductive set as “S is inductive if, whenever x ∈ S,
then {x} ∈ S”. By first invoking the axiom of infinity she defines the
natural numbers N as the smallest inductive set linearly ordered by “∈”.
She defines 0 = ∅, 1 = {∅}, 2 = {{∅}}, 3 = {{{∅}}}, 4 = {{{{∅}}}} and
so on. We see that 0 ∈ 1 ∈ 2 ∈ 3 ∈ 4 · · · . Will this work as a definition of
the natural numbers? If so, say why. If not, explain why.
14. Show that N = ∪{n : n ∈ N}.
128 Section 14: Natural numbers as a well-ordered set

14 / The natural numbers as a well-ordered set

Abstract. In this section we introduce the notion of “well-ordered set”.
We show that the set of all natural numbers, when equipped with the
membership ordering relation “∈”, is a well-ordered set. We then define
“bounded set” and “the maximal element of a set”. Finally we show that
bounded subsets of N must contain a maximal element. We then use N to
construct various other sets, some of which are also well-ordered.

14.1 Order relations on N.

We have seen that the definition of the natural numbers within the ZFC-
axiomatic system leads to two equivalent order relations on N. Both “⊂”
and “∈” have been shown to be equivalent strict linear orderings of N, in
the sense that n ⊂ m if and only if n ∈ m.
We can naturally extend the strict order relation “⊂” to the non-strict order
relation “⊆” while maintaining the linearity property. That is, m ⊆ n
if either m ⊂ n or m = n. We can similarly extend the relation “∈” by
introducing the following notation.

Notation 14.1 We define the relation “∈= ” on N as follows:

m ∈= n if and only if m = n or m ∈ n

If m ∈= n and we want to state explicitly that m ≠ n, we write m ∈ n.

14.2 A well-ordering of N.
There is an important property that is not possessed by all linearly ordered
classes. It is called the well-ordering property. We formally define this prop-
erty. We will then prove that (N, ∈) is a well-ordered set.

Definition 14.2 Let (S, ≤) be a linearly ordered set. Suppose T ⊆ S.

a) We say that the element q is “a least element of T with respect to ≤” if
and only if q ∈ T and q ≤ m for all m ∈ T .
b) If S is equipped with a strict linear ordering “<”, we say that q is a least
element of T with respect to < if and only if q belongs to T and q < m,
for all m ∈ T where m ≠ q.

c) The set (S, ≤) is said to be well-ordered with respect to “≤” or that “≤
well-orders S” if every non-empty subset T of S contains its least element
with respect to ≤. Similarly, the set (S, <) is said to be well-ordered with
respect to “<” or that “< well-orders S” if every non-empty subset T of
S contains its least element with respect to <.

We show that ∈ well-orders the set N.

Theorem 14.3 The set N of all natural numbers is a strict ∈-well-ordered
set.

Proof:
What we are given: The relation “∈” strictly linearly orders N; the set A is
a non-empty subset of N.
What we are required to show: That A contains a least element with respect
to ∈.
Proof by contradiction: Suppose A does not contain a least element. We
claim that A must then be empty, thus contradicting our hypothesis.
− Proof of the claim: We invoke the second version of the Principle of
mathematical induction. For each natural number k, let P (k) denote
the statement “k ∉ A”.
Induction hypothesis: Let n be some natural number such that P (k) =
“k ∉ A” holds true for all k ∈ n.
Suppose n ∈ A. Then, for all a ∈ A with a ≠ n, n ∈ a (for if a ∈ n, then, by the
induction hypothesis, P (a) = “a ∉ A” holds true). This means that n is
a least element of A with respect to ∈. This contradicts “A contains no
least element with respect to ∈”. Then n ∉ A. Then, P (n) = “n ∉ A”
holds true. By the second version of the principle of mathematical induction,
P (k) = “k ∉ A” holds true for all k ∈ N. Then A contains no
elements, as claimed.
This contradicts the fact that A is non-empty. The source of this contradic-
tion is our assumption that A does not contain a least element. We must
conclude that every non-empty subset of N has a least element with respect to “∈”.
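For a finite non-empty collection of natural numbers, the least element with respect to ∈ described in Definition 14.2 b) can be found by direct search. The following is a minimal sketch, assuming Python frozensets as finite sets; the names succ, nat and least_element are hypothetical and the search is not the argument used in the proof above.

```python
def succ(x: frozenset) -> frozenset:
    """Return x+ = x ∪ {x}."""
    return x | frozenset([x])


def nat(n: int) -> frozenset:
    """Build the set representing n by applying the successor n times to ∅."""
    result = frozenset()
    for _ in range(n):
        result = succ(result)
    return result


def least_element(subset):
    """Return q in subset with q ∈ m for every other m in subset, else None."""
    for q in subset:
        if all(q == m or q in m for m in subset):
            return q
    return None  # reached only when the subset is empty (for sets of naturals)
```

For example, least_element({nat(5), nat(2), nat(7)}) returns the set representing 2, since 2 ∈ 5 and 2 ∈ 7.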

We have previously shown that the second version of the Principle of mathematical
induction follows from the first version of the Principle of mathematical
induction. We can show that if we only assume that N is ∈-well-ordered
and the second version of the induction principle, then the first version of
the induction principle holds true. The proof is as follows.
What we are given: That N is ∈-well-ordered and that the second version of

the Principle of mathematical induction holds true.

What we are required to show: That the first version of the Principle of
mathematical induction must hold true.
Let P (n) be a property whose truth value depends on the natural number
n. Suppose P (0) is known to be true. Also suppose that if P (n) holds true,
then so does P (n+ ). Let A = {k ∈ N : P (k) is false}. Then N − A contains
0 and, whenever n ∈ N − A, then n+ ∈ N − A.
We claim that A must be empty (hence, P (k) holds true for all k ∈ N).
Proof of claim: Suppose A is non-empty. Then, since N is ∈-well-ordered, A
has a least element, say s. Since P (0) is true, s ≠ 0, and so s = m+ for some
natural number m (Theorem 13.11). Then P (k) holds true for all k ∈ s = m+ .
Then m ∈ N − A. By hypothesis, m+ ∈ N − A. This contradicts the fact
that m+ is the least element of A. The source of the contradiction is the
assumption that A is non-empty. Then A must be empty, as claimed.
We conclude that the set A is empty and so P (k) holds true for all natural
numbers k. We have thus shown that the first version of the Principle of
mathematical induction on N holds true.

Corollary 14.4 Every natural number n is an ∈-well-ordered set.

Proof:
Let n = {0, 1, 2, . . ., n − 1} be a natural number. We already know that
the natural numbers are ∈-linearly ordered. Let U be a non-empty sub-
set of n. Then U can be seen as a non-empty subset of N. When viewed
as a subset of the ∈-well-ordered set N, the set U contains a least natural
number, say k. Then k ∈ U and k ∈ m for all other m ∈ U . So when U
is viewed as a subset of n, k is the least element of U . So n is ∈-well-ordered.

We have thus shown that not only is N a well-ordered set, but so is every
single natural number.
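
The corollary can be checked on small cases. The following is a minimal sketch, assuming we model each natural number as a Python frozenset built from the empty set by the successor operation n+ = n ∪ {n}. The function names (successor, von_neumann, least_element) are our own and are only illustrative.

```python
def successor(n: frozenset) -> frozenset:
    """Return n+ = n ∪ {n}."""
    return n | frozenset([n])


def von_neumann(k: int) -> frozenset:
    """Build the natural number k as the set {0, 1, ..., k - 1}."""
    n = frozenset()              # 0 is the empty set
    for _ in range(k):
        n = successor(n)
    return n


def least_element(subset):
    """Return the member of a non-empty finite set of naturals that is an
    element of every other member, i.e. the least member with respect to ∈."""
    for k in subset:
        if all(k in m for m in subset if m != k):
            return k
    raise ValueError("no least element with respect to ∈")


five = von_neumann(5)
U = {von_neumann(1), von_neumann(3), von_neumann(4)}  # a non-empty subset of 5
assert U <= five                 # every member of U is an element of 5
least = least_element(U)
assert least == von_neumann(1)   # 1 ∈ 3 and 1 ∈ 4
```

A finite check of this kind illustrates the statement but does not replace the proof, which covers every natural number at once.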

14.3 Bounded subsets of N

The reader may be familiar with the concept of bounded subsets. In the con-
text of a linearly ordered set (S, <), we say that a subset A of S is bounded
above, or has an upper bound if there exists some element M ∈ S we call
an “upper bound of A” such that x ≤ M for all x ∈ A. A subset can have
many upper bounds. Similarly the subset A is “bounded below” if there
exists an element m we call a “lower bound of A” such that m ≤ x for all
x ∈ A. For example, every non-empty subset of (N, ∈) is bounded below by 0.
Suppose A is a non-empty subset of a linearly ordered set (S, ≤) which con-
tains an upper bound M of A. Then, since A is linearly ordered, for every
element x ∈ A, x ≤ M. Furthermore, M is the only upper bound of A which
is contained in A, for if M∗ is another upper bound of A which is contained
in A, then M∗ ≤ M; if M∗ < M, then M∗ is not an upper bound of A;
so M∗ = M. In this case, we can refer to M as being the maximal element
of A or the maximum element of A. This corresponds to the definition we
have previously provided for the words “maximal element” and “maximum
element” of an ordered set. A maximal element of a partially ordered set may
not necessarily be the maximum element of a set. But for linearly ordered
sets, the words “maximal” and “maximum” are interchangeable.
The following theorem shows that any non-empty bounded subset of N must
contain a maximal element with respect to “∈”.

Theorem 14.5 Any bounded non-empty subset S of (N, ∈) has a maximal
element.

Proof:
Suppose S is a non-empty bounded subset of N with respect to the linear
ordering “∈”. Let Q be the set of all upper bounds of S. Since S is bounded,
by definition, it has at least one upper bound and so Q ≠ ∅. Since “∈”
well-orders N, Q must contain a “least element”, say k. We claim that k is
a maximal element of S.
To prove this claim, it suffices to show that k is both an upper bound of S
and belongs to S. Since k ∈ Q, then k is an upper bound of S.
We now show that k ∈ S:
Suppose k ∉ S. We claim this will lead to a contradiction.
We first note that k ≠ 0: if 0 were an upper bound of the non-empty set S,
then S = {0} and k = 0 would belong to S. So k has a unique immediate
predecessor t. That is, t+ = k. Then if x ∈ S,

k ∉ S ⇒ x ≠ k
⇒ x ∈ k (since k is an upper bound of S)
⇒ x ∈ t+

x ∈ t+ ⇒ x ∈ (t ∪ {t})
⇒ x ∈ t or x ∈ {t}
⇒ x ∈ t or x = t

We have shown that for every x ∈ S, either x ∈ t or x = t.

Hence, t is an upper bound of S. That is, t ∈ Q. But t ∈ t+ = k where k
was declared to be the least element in the set Q. This is a contradiction.
The source of the contradiction is our supposition that k ∉ S. Then k ∈ S.
Then k is both an upper bound of S and belongs to S. So S has a maximal
element, k.
14.4 Constructing other well-ordered sets from N

We have seen that the order relation “∈” is a strict well-ordering of the set
of all natural numbers, N. Now that we have given ourselves a large set to
work with, we will use the set N as a building block to construct other large
well-ordered sets. We will introduce an order relation on various Cartesian
products involving N. This particular order relation is defined in terms of
the ordering “∈” on N.

Lexicographic ordering of the elements of the Cartesian product {1, 2} × N.

Consider the subset

{1, 2} × N = {(i, n) : i = 1 or 2, n ∈ N}

of N × N. We will order the elements of {1, 2} × N by what is called a
lexicographic ordering¹, denoted by <lex. This means (a, b) <lex (c, d) if a ∈ c
or if a = c and b ∈ d. For example, (1, 34) <lex (2, 7) and (2, 5) <lex (2, 54). The
elements of ({1, 2} × N, <lex) can then be listed in a strictly increasing order
as follows:

{(1, 0), (1, 1), (1, 2), (1, 3), · · · , (2, 0), (2, 1), (2, 2), (2, 3), · · · }

The lexicographic ordering inherited from “∈” is easily seen to be a linear
ordering of {1, 2} × N.
We now investigate specific ordering properties of ({1, 2} × N, <lex ).
a) We verify that the order relation <lex on {1, 2} × N is a well-ordering.
Consider a non-empty subset A of {1, 2} × N.
Case 1. If there exist elements of the form (1, a) in A, then the least
element of A is (1, m) where m is the least element of {n ∈ N : (1, n) ∈ A}
(guaranteed to exist since ∈ well-orders N).
Case 2. If all elements of A are of the form (2, b), then the least element
of A is (2, m) where m is the least element of {n ∈ N : (2, n) ∈ A}, again
guaranteed to exist since ∈ well-orders N.
Then ({1, 2} × N, <lex ) is a well-ordered set.
b) We investigate upper bounds and lower bounds for particular subsets of
{1, 2} × N with respect to <lex
− If S ⊆ {1} × N, then any element which has the form (2, n) is an
upper bound of S with respect to <lex .
− Suppose T is a bounded subset of N, with an upper bound m. Then
(2, m) would be an upper bound of any subset of {1, 2} ×T ⊂ {1, 2} ×
N.
1 Some texts may refer to this as the “dictionary ordering”.
− If T is an unbounded subset of N, then {2} × T is unbounded in
{1, 2} × N with respect to <lex. But {1} × T is bounded above in
{1, 2} × N with respect to <lex since (1, m) <lex (2, 0) for all m ∈ T.
c) Note that bounded subsets in ({1, 2}×N, <lex) need not necessarily contain
a maximal element.
The subset T = {(1, n) : n ∈ N} of {1, 2} × N has as upper bound the
element (2, 0). But it does not contain a maximal element with respect
to <lex .
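
The ordering <lex and the two cases used in part a) can be sketched in a few lines. This is only an illustration under stated assumptions: Python integers with their usual “<” stand in for the natural numbers ordered by “∈”, the set A is finite, and the names lex_less and least_lex are ours.

```python
def lex_less(p, q):
    """(a, b) <lex (c, d) when a < c, or when a == c and b < d."""
    (a, b), (c, d) = p, q
    return a < c or (a == c and b < d)


def least_lex(A):
    """Least element of a non-empty finite subset A of {1, 2} × N."""
    ones = [n for (i, n) in A if i == 1]
    if ones:                                   # Case 1: some (1, a) is in A
        return (1, min(ones))
    return (2, min(n for (i, n) in A if i == 2))   # Case 2: only (2, b) in A


A = {(2, 0), (1, 34), (2, 7), (1, 5)}
least = least_lex(A)

# The two examples given with the definition of <lex.
assert lex_less((1, 34), (2, 7)) and lex_less((2, 5), (2, 54))
# The element found is below every other element of A.
assert all(p == least or lex_less(least, p) for p in A)
# (2, 0) is an upper bound of every (1, n) that we test.
assert all(lex_less((1, n), (2, 0)) for n in range(1000))
```

The last check samples only finitely many elements of {(1, n) : n ∈ N}; that the set has no maximal element is a statement about all n and is not something a finite test can establish.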

Lexicographic ordering can be used to well-order other sets such as

S = {0, 1, 2, 3} × N ⊂ N × N

14.5 Constructing “non-well-ordered” sets from N.

For most readers, in the example above, we have moved towards less famil-
iar territory. The set exhibited above refers to a relation {1, 2} × N whose
domain is {1, 2} and codomain N. We linearly ordered its elements with
the lexicographic ordering, <lex . We showed that <lex well-orders the set
{1, 2} × N.
We now turn our attention to the relation N × {1, 2} = {(n, 1) : n ∈
N} ∪ {(n, 2) : n ∈ N}. We specifically consider those subsets of N × {1, 2}
which are functions. For example the set g = {(0, 1), (1, 2), (2, 1), (3, 2), . . .}
is an element of P(N × {1, 2}) which represents a function. The set of all
functions mapping N into {1, 2} is normally denoted as

{1, 2}^N

Then any specific function f in {1, 2}^N can be expressed as an infinite set
of ordered pairs

{(0, a0), (1, a1), (2, a2), (3, a3), . . . }

where ai = f(i) = 1 or 2. Actually we could more succinctly express this
element f as a sequence

{a0, a1, a2, a3, . . . }

where each ai = f(i) is the image of i under f.¹

So if f ∈ {1, 2}^N, f can accurately be described as an infinite sequence of 1s
and 2s in an order dictated by the associated elements in the domain of f.
This is the particular way we will view the elements of {1, 2}^N. That is,

{1, 2}^N = { {ai}_{i=0}^∞ : ai = 1 or 2 }

¹ So ai is in fact shorthand for (i, f(i)).

So comparing two functions f and g in {1, 2}^N is essentially comparing two
infinite sequences of 1s and 2s.
Given the set S = {1, 2}^N we want to define an order relation on S. The
order relation that we will choose is inspired by the lexicographic ordering
defined on subsets of N × N above. We formally define it below.²

Definition 14.6 Consider the set {1, 2}^N of all functions mapping natural
numbers to 1 or 2. We define the lexicographic order “<lex” on
{1, 2}^N as follows: For any two elements f = {a0, a1, a2, a3, . . . } and g =
{b0, b1, b2, b3, . . . } in {1, 2}^N, f <lex g if and only if f ≠ g and, at the first
index i where the corresponding terms ai and bi are unequal, ai ∈ bi (that is,
ai < bi). Also, f = g if and only if ai = bi for all i ∈ N.³

For example, if f = {1, 2, 2, 1, 1, · · · } and g = {1, 2, 2, 2, 1, 2, · · · }, then
f <lex g, since the first unequal corresponding terms are a3 = 1 and b3 = 2.
This ordering is easily seen to be linear. We now investigate basic
ordering properties of ({1, 2}^N, <lex).
a) The relation <lex is not a well-ordering of {1, 2}^N.
At first glance, based on our experience with lexicographic orderings, we
may suspect that <lex well-orders {1, 2}^N. But we should be cautious.
Does {1, 2}^N have a least element? The lexicographic ordering rule shows
that no element in {1, 2}^N can be smaller than {1, 1, 1, 1, · · · }. So {1, 2}^N
at least has a smallest element. Let’s try to think about what its second
smallest element is. Below we list the elements of a subset of {1, 2}^N in
the form of a strictly decreasing sequence of elements where each element
is larger than {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, · · · }:

{2, 1, 1, 1, 1, 1, 1, 1, 1, 1, · · · }
{1, 2, 1, 1, 1, 1, 1, 1, 1, 1, · · · }
{1, 1, 2, 1, 1, 1, 1, 1, 1, 1, · · · }
{1, 1, 1, 2, 1, 1, 1, 1, 1, 1, · · · }
{1, 1, 1, 1, 2, 1, 1, 1, 1, 1, · · · }
{1, 1, 1, 1, 1, 2, 1, 1, 1, 1, · · · }
{1, 1, 1, 1, 1, 1, 2, 1, 1, 1, · · · }
⋮

² Even though the following definition of the ordering on the set {1, 2}^N is inspired by
the lexicographic ordering of sets of ordered pairs and adopts the notation <lex, it is good
to remember that we are not ordering ordered pairs but sets which represent functions.
³ A lexicographic ordering can similarly be defined on S^N where S is any subset of N.

We see that each element is strictly less than its immediate predecessor in
this list. We also see that we can never reach {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, · · · }
using such a decreasing sequence. This convinces us that the set

S = {f ∈ {1, 2}^N : f > {1, 1, 1, 1, 1, 1, 1, · · · }}

does not contain a least element: no matter where an element of S has its
first “2”, another element of S has its first “2” further down and is therefore
smaller. Since the non-empty subset S has no minimal element with respect
to the ordering “<lex”, {1, 2}^N is not a well-ordered set.
b) The set {1, 2}^N is bounded.
We easily see that {2, 2, 2, 2, · · · } is a maximal element of {1, 2}^N. So
every subset of {1, 2}^N has at least {2, 2, 2, 2, · · · } as upper bound.
c) On maximal elements of bounded sets.
Does every bounded subset of {1, 2}^N contain a maximal element? To help
answer this question, let’s try to find the element of {1, 2}^N which
immediately precedes {2, 2, 2, 2, · · · }. Another way of stating this is: What
is the maximal element of S = {f ∈ {1, 2}^N : f < {2, 2, 2, 2, · · · }}? This
maximal element must have at least one “1” in it, along with as many
2s as possible. The question is where shall we insert this “1”? No matter
where we insert this “1” we will be able to reconsider our choice and
reinsert it farther down. So S contains no maximal element.
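
The two chains used in a) and c) can be explored with a short sketch. It rests on assumptions that should be kept in mind: an infinite sequence is modelled as a Python function from indices to 1 or 2, and the comparison inspects only the first `limit` terms, so it is an approximation of <lex that happens to be exact for the sequences tested here. All names (seq_lex_less, single_two, single_one, and so on) are ours.

```python
def seq_lex_less(f, g, limit=1000):
    """Decide f <lex g from the first index i < limit where f(i) != g(i).
    Sequences that agree on every inspected term are reported as not less."""
    for i in range(limit):
        if f(i) != g(i):
            return f(i) < g(i)
    return False


def all_ones(i):
    return 1


def all_twos(i):
    return 2


def single_two(k):
    """The sequence with a 2 at index k and 1s everywhere else."""
    return lambda i: 2 if i == k else 1


def single_one(k):
    """The sequence with a 1 at index k and 2s everywhere else."""
    return lambda i: 1 if i == k else 2


for k in range(50):
    # Descending chain above {1, 1, 1, ...}: pushing the 2 down gives a smaller element.
    assert seq_lex_less(single_two(k + 1), single_two(k))
    assert seq_lex_less(all_ones, single_two(k))
    # Ascending chain below {2, 2, 2, ...}: pushing the 1 down gives a larger element.
    assert seq_lex_less(single_one(k), single_one(k + 1))
    assert seq_lex_less(single_one(k), all_twos)
```

The loop confirms, for the first fifty positions, that moving the lone “2” (respectively the lone “1”) one place further down always produces a smaller (respectively larger) sequence, which is the observation behind both arguments.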

14.6 The set N^{1,2}.²

Having studied the set {1, 2}^N of all functions whose domain is N and
codomain is {1, 2} we now consider the set of all functions with domain {1, 2}
and codomain N. This set is denoted as N^{1,2}. Of course, if f ∈ N^{1,2}, then
f = {(1, a1), (2, a2)} where a1 = f(1) ∈ N and a2 = f(2) ∈ N. See that we
have purposely lexicographically ordered the two elements (1, a1) and (2, a2)
of f. Just as for the functions in {1, 2}^N we can more succinctly represent
f ∈ N^{1,2} as f = {a1, a2}, ordered doubletons of natural numbers. That is,

N^{1,2} = {{a1, a2} : a1, a2 ∈ N}

is a set of ordered doubletons. Since these doubletons are ordered natural
numbers, we can lexicographically order the elements {a1, a2} of N^{1,2} as
if N^{1,2} were a set of ordered pairs (a1, a2) of natural numbers. That is, a
function φ : N × N → N^{1,2} defined as φ((a1, a2)) = {a1, a2} can be used
to lexicographically order the elements of N^{1,2} in such a way that the
elements of N^{1,2} are ordered in precisely the same way as the elements of
N × N. This ordering on N^{1,2} is also referred to as a lexicographic ordering
and is also denoted by the same symbol, “<lex”. So (N^{1,2}, <lex) has order
properties which are identical to those of (N × N, <lex).

² This section can be skipped without loss of continuity.

Concepts review:
1. What does it mean to say that “<” strictly well-orders a set S?
2. Describe two order relations which well-order N.
3. What does it mean to say a subset S of N ordered by “∈” is
bounded?
4. What does it mean to say that a non-empty subset S of N ordered
by “∈” has a maximal element?
5. Describe the set {1, 2} × N by providing three distinct elements of
this set.
6. Define the lexicographic ordering on {1, 2} × N.
7. Define the lexicographic ordering on {1, 2}^N.
8. List three elements of {1, 2}^N in increasing order.
9. Is the lexicographically ordered set {1, 2} × N well-ordered?
10. Is the lexicographically ordered set {1, 2}^N well-ordered?
11. Does every non-empty subset of the lexicographically ordered set
{1, 2} × N have a maximal element?
12. Does every non-empty subset of the lexicographically ordered set
{1, 2}^N have a maximal element?
13. Does {1, 2}^N have a maximal element?
14. Describe the elements of N^{1,2}. Propose an ordering for its elements.

EXERCISES

A. 1. Show that (N, ∈) does not contain a maximal element.

2. Can a non-empty bounded subset S of N ordered by “∈” be an inductive
set? Why or why not?
3. Construct three non-empty subsets of {1, 2}^N, each of which contains no
least element.

B. 4. Consider the lexicographically ordered set (N × N, <lex ).

a) Describe the first few elements of (N × N, <lex).
b) Does N × N have a maximal element? What is it?
c) Is (N × N, <lex) a well-ordered set? If so, show it. If not, produce a
non-empty subset which does not contain its least element.
d) Does every bounded subset of N × N have a maximal element? Why?
5. Consider the lexicographically ordered set S = {1, 2, 3, · · · , 9}^N of all
functions mapping N to the set {1, 2, 3, · · · , 9}.
a) Describe the first few elements of S.
b) Write the three elements f = {5, 5, 5, 5, 5, . . . }, g = {4, 9, 2, 2, 2, . . . }
and h = {4, 9, 1, 9, 9, 9, . . . } in increasing order.
c) Does S have a maximal element? What is it?
d) Is S a well-ordered set? If so, show it. If not, produce a non-empty
subset which does not contain its least element.
e) Does every bounded subset of S have a maximal element? Why?

C. 6. Suppose we represent the set of all functions mapping {0, 1, 2} to N as

N^{0,1,2} = {{a0, a1, a2} : ai ∈ N}

Suppose we order this set lexicographically.

a) Describe the first few elements of N^{0,1,2}.
b) Write the three elements f = {4, 0, 600}, g = {600, 9, 8} and h =
{6, 9, 53} in increasing order.
c) Does N^{0,1,2} have a maximal element? What is it?
d) Is N^{0,1,2} a well-ordered set? If so, show it. If not, produce a non-empty
subset which does not contain its least element.
e) Does every bounded subset of N^{0,1,2} have a maximal element? Why?
7. Consider the lexicographically ordered set N^N of all functions mapping N
into N.
a) Describe an element of N^N as a set.
b) Does N^N have a least element? What is it?
c) Does N^N have a second element with respect to the lexicographic
ordering? What is it?
d) If f ∈ N^N, can f be viewed as a subset of N × N? Explain.
e) If f ∈ N^N, can f be viewed as an element of P(N × N)? Explain.
f) Can N^N be viewed as a subset of P(N × N)? Explain.
15 / Arithmetic of the natural numbers

Abstract. In this section we define addition, subtraction and multiplica-
tion of the natural numbers in a set-theoretic context. We then show that
with these definitions, we obtain the expected results. Addition and multi-
plication on N are defined recursively.

15.1 Highlights of what we have learned in set theory up to now.

Now that we have defined the natural numbers within the framework of set
theory, it is a good time to look back on what we have learned and provide
some insight on what is to come.
In our theory, all objects are classes or sets. Since we are mainly interested
in those objects called “sets”, our attention is directed towards these specif-
ically. A few fundamental properties of classes and sets are stated without
proof in the form of axioms. Axioms are normally expected to be intuitively
obvious to users. This does not mean that choosing axioms is done without
debate, since what is obvious to one may not be obvious at all to another.
We listed ten set-theoretic axioms referred to as the “ZFC-axioms”. The
ZF -axioms refer to the first nine, while the “C ” refers to the tenth axiom
called “Axiom of choice”. The Axiom of infinity, which posits the existence
of a set which satisfies the essential properties of the natural numbers is
not an axiom that is “intuitively obvious”. It is however perceived as being
essential since it provides a logical basis on which rests most of the
mathematics we do today. The importance of the Axiom of infinity cannot be
overstated since it postulates the existence of a set which is at the core of
all of the mathematics that evolves from the ZFC-axioms. The most
controversial axiom is the Axiom of choice. We have not invoked this axiom yet.
We will soon see why it is needed if we want our set-theoretic universe to
unfold as we think it should.
Once we gave ourselves classes, sets and a few axioms to work with, we
gave ourselves the means to construct classes and sets from ones that exist
(union, intersection, Cartesian products). This was followed by definitions
of relations and functions, both of which exist in our set-theoretic universe
as sets. Finally, we defined the natural numbers so that they possessed the
required well-ordering property. Of course, giving life to the natural numbers
is just the beginning. Our next step is to appropriately define addition and
multiplication of the natural numbers. Definitions must be such that results
we obtain with these operations are what we expect them to be. Definitions
of the integers, the rational numbers and the real numbers will then follow.
Finally, we will study infinite sets and their various properties.
15.2 Defining addition recursively.

Since natural numbers are sets, we may instinctively attempt to define ad-
dition of the natural numbers as follows: 3 + 5 = 3 ∪ 5. But this doesn’t
produce the desired result since we know that 3 ⊂ 5 and so 3 + 5 = 3 ∪ 5
would equal 5, and this is not what we want at all. We see that viewing
addition of natural numbers by simply joining sets together will not work,
particularly if the sets have non-empty intersection.
It may be that we are attempting to define too much at the same time. Let’s
try a step-by-step definition of addition. We will experiment with addition
of natural numbers with the natural number “3”, specifically. Suppose we
define addition of 0 to 3 as follows: 3 + 0 = 3 ∪ ∅ = 3. Then we progressively
define addition with 3 of successively larger and larger numbers. In what
follows the symbol “:=” will serve as a succinct way of saying “is defined
as”.

3 + 0 = 3
3 + 1 = 3 + 0+ := (3 + 0)+ = 3+ = 4
3 + 2 = 3 + 1+ := (3 + 1)+ = 4+ = 5
⋮
3 + n = m
3 + n+ := (3 + n)+ = m+
⋮

In this sequence of sums only the initial sum 3 + 0 = 3 is specifically defined,
while a globally defined rule “3 + n+ = (3 + n)+” is applied to evaluate the
sum of all other natural numbers with 3. If we know the numerical value of
3 + n, then, by the general “rule”, 3 + n+ has numerical value (3 + n)+. So
if we want to determine the value of 3 + 4 we first have to find the values
of 3 + 1, 3 + 2 and 3 + 3.¹
3+4 = (3 + 3)+
= ((3 + 2)+ )+
= (((3 + 1)+ )+ )+
= ((((3 + 0)+ )+ )+ )+
= (((3+)+ )+ )+
= ((4+ )+ )+
= (5+ )+
= 6+
= 7

It is a sure (albeit tedious) method of addition that will consistently produce
the unique expected values for sums of two natural numbers. For example,
once we have computed 3 + 4 = 7, we can then compute 3 + 5 = (3 + 4)+ =
7+ = 8 by applying the algorithm 3 + n+ = (3 + n)+. Once all values of 3 + n
are obtained, we then obtain all values for 4 + n as n ranges over N, and
so on. This simply shows that it is possible to adequately define addition of
natural numbers in a universe of sets in such a way that sums correspond to
sums obtained by usual addition algorithms. This does not prevent us from
using the various algorithms which allow us to obtain more efficiently the
values of sums of natural numbers.

¹ This is reminiscent of the way we learned addition by using addition tables in elementary
school: Before learning that 3 + 2 = 5, we learned that 3 + 0 = 3, 3 + 1 = 4 and deduced
from this that 3 + 2 had to equal 5.
We, thus, propose the following formal definition of addition of pairs of nat-
ural numbers.

Definition 15.1 Let m be a fixed natural number. Addition of a natural
number n with m is defined as the function rm : N → N satisfying the two
conditions

rm(0) = m
rm(n+) = [rm(n)]+

The expression m + n is another way of writing rm(n). Thus,

rm(0) = m ⇔ m + 0 = m (1)
rm(n+) = [rm(n)]+ ⇔ m + n+ = (m + n)+ (2)

For example, once the value of 34 + 0 = 34 is declared, the value of the sum
34+123 is uniquely determined by applying the formula 34+n+ = (34+n)+
finitely many times to successively obtain 34 + 1 = 35, 34 + 2 = 36,
34 + 3 = 37, . . . , 34 + 123 = 157.
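
The two conditions of Definition 15.1 translate directly into a recursive procedure. The sketch below is an illustration only, assuming natural numbers are modelled as nested frozensets with n+ = n ∪ {n}. The helper names (succ, pred, nat, add) are ours, and pred relies on the fact that the immediate predecessor of a non-zero natural number is its element with the most members.

```python
ZERO = frozenset()                 # 0 is the empty set


def succ(n):
    """n+ = n ∪ {n}."""
    return n | frozenset([n])


def pred(n):
    """Immediate predecessor of a non-zero natural number."""
    return max(n, key=len)


def nat(k):
    """The natural number k, built by applying succ k times to 0."""
    n = ZERO
    for _ in range(k):
        n = succ(n)
    return n


def add(m, n):
    """r_m(n), computed from r_m(0) = m and r_m(n+) = [r_m(n)]+."""
    if n == ZERO:
        return m                       # condition (1)
    return succ(add(m, pred(n)))       # condition (2)


assert add(nat(3), nat(4)) == nat(7)
assert add(nat(34), nat(123)) == nat(157)
```

To evaluate add(nat(3), nat(4)) the procedure unwinds through 3 + 3, 3 + 2, 3 + 1 and 3 + 0 before any successor is applied, just as in the hand computation of 3 + 4.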
Readers have no doubt noticed that the function rm(n) is defined by
using a mechanism that we have not used or seen before in this text. We
are accustomed to defining a function f : A → A by declaring a rule which
associates to each element a in A some element f(a) in A without referring to
other ordered pairs in f. Here, in contrast, the only way we can confirm
that the ordered pair (2, 3 + 2) = (2, 5) belongs to the function r3 is by first
determining that (0, 3 + 0) = (0, 3) and (1, 3 + 1) = (1, 4) also belong to r3.
Most readers will intuitively feel that there is no ambiguity in the way we
have defined the function rm on N.
We refer to functions which are defined in this way as being recursively
defined functions. If rm is indeed a well-defined function, then we must be
able to prove that it satisfies the conditions stated in the formal definition of
a function. We remind ourselves how we defined a “function” (see Definition
9.1):
Given two sets A and B, a function is a subset f of A × B which
satisfies the property “(x, y) and (x, z) belong to f implies y = z”.

There is no reason to deviate from this understanding of functions. For the
most skeptical among us, we will now formally show that this recursively
defined function of addition satisfies the property described in the definition.
That is, we will show that if n = k, then rm(n) = m + n = m + k = rm(k).
Those readers already convinced that this method of proceeding will work
can bypass this slightly tedious proof and read on without loss of continuity.²

Theorem 15.2 Let m be a fixed natural number and let rm : N → N be a
relation satisfying the two properties

rm(0) = m
rm(n+) = [rm(n)]+

Then rm is a well-defined function on N.

Proof:
Let m be a fixed natural number. Let S be a class of relations on N defined
as follows:

S = {R ⊆ N × N : (0, m) ∈ R and (n, y) ∈ R ⇒ (n+ , y+ ) ∈ R}

Now S is non-empty since it contains N × N. Let r∗ = ⋂_{R ∈ S} R. This means
that r∗ is the smallest set of ordered pairs satisfying the conditions described
for S. The relation r∗ looks something like

r ∗ = {(0, m), (1, m+ ), (2, (m+ )+ ), · · · }

We will show two things: (1) r ∗ is a function mapping N into N, and, (2)
r∗ = rm .
1) We claim that r ∗ is a function mapping N into N.
Proof of claim:
We first establish (by induction) that dom r∗ = N.
Base case: We first note that since (0, m) ∈ r∗, then 0 ∈ dom r∗.
Inductive hypothesis: Suppose n ∈ dom r∗. Then (n, y) ∈ r∗ for some y ∈ N.
This implies (n+, y+) ∈ r∗ and so n+ ∈ dom r∗.
Hence, by induction, the domain of r∗ is all of N.
We now proceed to the proof of the claim. The proof of the claim invokes the
second version of the principle of mathematical induction. Let P(n) denote
the statement “[(n, x) ∈ r∗ ∧ (n, y) ∈ r∗] ⇒ [x = y]”.
Inductive hypothesis: Suppose P(j) holds true for all natural numbers j < n.
That is, (j, x) ∈ r∗ and (j, y) ∈ r∗ implies x = y. We will show that given
our hypothesis, P(n) must hold true.
² Should you decide to skip reading the proof for this time, don’t make a habit of it. It
may, at some point in time, get you into trouble.

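Suppose not. Suppose (n, x) ∈ r∗ and (n, y) ∈ r∗ where x ≠ y. Let v be
the value forced at n by the conditions describing S: v = m if n = 0
and, if n = j+, v = z+ where z is the element such that (j, z) ∈ r∗
(unique by the inductive hypothesis). Since x ≠ y, at least one of x
and y differs from v; say y ≠ v. Let
U = r∗ − {(n, y)} (the set r∗ with the element (n, y) removed). Then U
still contains (0, m) and, whenever (i, w) ∈ U, the pair (i+, w+) belongs
to r∗ and is different from (n, y), so it belongs to U. Hence U is
a relation belonging to S which is strictly smaller than r∗. But r∗
was previously declared to be the smallest of the relations in S. We
have a contradiction whose source is our assumption that x ≠ y. Then
x must be equal to y. We conclude that P(n) holds true as required.
By mathematical induction (version two) P(n) holds true for all n.
So r∗ is a well-defined function as claimed.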
2) We now claim that the function r ∗ is the relation rm as defined above.
Proof of claim:
The proof is by induction. Let P (n) denote the statement “r ∗ (n) = rm (n)”.
Base case: The statement P (0) holds true since r ∗ (0) = m = rm (0) = m+0.
Inductive hypothesis: Suppose P (n) holds true. That is, suppose (n, rm(n)) ∈
r ∗ . Then, by definition of r ∗ , (n+ , rm (n)+ ) ∈ r ∗ . Since rm (n+ ) = rm (n)+ ,
then (n+ , rm (n+ )) ∈ r ∗ . That is, r ∗ (n+ ) = rm (n+ ). So P (n+ ) holds true.
By mathematical induction r ∗(n) = rm (n) for all n.
Then r ∗ = rm as claimed.
We conclude that rm and r ∗ are indeed the same relation. Since r ∗ was
shown to be a function, the recursively defined relation rm : N → N is a
well-defined function.
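
The construction of r∗ can be imitated on a finite scale. The sketch below is an illustration under explicit assumptions: Python integers stand in for the natural numbers, n + 1 stands in for n+, and the relation is only built for first coordinates up to a chosen `bound`, whereas r∗ itself is infinite. The names smallest_relation and is_function are ours.

```python
def smallest_relation(m, bound):
    """Pairs (n, y) with n <= bound that are forced by the two conditions
    (0, m) ∈ R  and  (n, y) ∈ R ⇒ (n+, y+) ∈ R."""
    rel = {(0, m)}
    frontier = [(0, m)]
    while frontier:
        n, y = frontier.pop()
        if n < bound:
            forced = (n + 1, y + 1)
            if forced not in rel:
                rel.add(forced)
                frontier.append(forced)
    return rel


def is_function(rel):
    """True when no first coordinate is paired with two different values."""
    seen = {}
    for n, y in rel:
        if seen.setdefault(n, y) != y:
            return False
    return True


r_star = smallest_relation(3, 10)
assert is_function(r_star)
assert {n for n, _ in r_star} == set(range(11))   # the domain is {0, ..., 10}
assert dict(r_star)[4] == 7                       # (4, 3 + 4) belongs to r*
# Adding a pair that the conditions do not force destroys functionality.
assert not is_function(r_star | {(2, 99)})
```

Only forced pairs are ever added, which mirrors the role played by the intersection in the proof: any extra pair, such as (2, 99), can be discarded without violating the two conditions.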

15.3 Basic properties of addition.

We must now be sure that the addition operation we have defined on N
satisfies the basic properties of addition we are accustomed to.
1) For every natural number n, n+ and n + 1 are the same number. Writing 1
for 0+, rule (2) gives n + 1 = n + 0+ = (n + 0)+ = n+ at once. We also
show that, for any natural number n,

n+ = 1 + n

Proof: By induction. Let P(n) be the property “n+ = 1 + n” (where
1 = 0+).
Base case: We see that P(0) holds true since

1 + 0 = 0+ + 0 = 0+ (By notation and by (1) in the definition of addition above).

Inductive hypothesis: Suppose P(n) holds true for some n. Then

(n+)+ = (1 + n)+ = 1 + n+ (By (2) in the definition of addition above).

So P(n+) holds true. By mathematical induction n+ = 1 + n for all
natural numbers n.

2) For any natural number n,

0+n=n
Proof: By induction. Let P (n) be the property “0 + n = n”.
Base case: We see that P (0) holds true since r0 (0) = 0 + 0 = 0, by (1) in
the definition of addition above.
Inductive hypothesis: Suppose P (n) holds true. Then 0 + n+ = (0 + n)+ =
n+ , by (2) in the definition of addition. So P (n+ ) holds true. By mathe-
matical induction 0 + n = n for all natural numbers n.

3) Addition of the natural numbers is associative. That is, for any three nat-
ural numbers m, n and k

(m + k) + n = m + (k + n)
Proof: By induction. Let m and k be any two natural numbers. Let P (n)
be the property “(m + k) + n = m + (k + n)”.
Base case: We see that P (0) holds true since (m + k) + 0 = m + k =
m + (k + 0), (by (1) in the definition of addition above).
Inductive hypothesis: Suppose P (n) holds true. Then

(m + k) + n+ = [(m + k) + n]+ (By (2) in the definition of addition above.)
= [m + (k + n)]+ (Since P(n) holds true.)
= m + (k + n)+ (By (2).)
= m + (k + n+) (By (2).)

So P(n+) holds true. By mathematical induction (m + k) + n = m + (k + n)
for all natural numbers n. Since m and k were arbitrarily chosen, then
(m + k) + n = m + (k + n) holds true for any three natural numbers m, n
and k.

4) Addition of the natural numbers is commutative. That is, for any two
natural numbers m and n

m+n =n+m
Proof: By induction. Let m be any natural number. Let P (n) be the
property “m + n = n + m”.
Base case: We see that P (0) holds true since

m + 0 = m (By (1) in the definition above.)
= 0 + m (By property 2.)

Induction hypothesis: Suppose P(n) holds true. Then

m + n+ = (m + n)+ (By (2) in the definition above.)
= (n + m)+ (Since P(n) holds true.)
= 1 + (n + m) (By property 1.)
= (1 + n) + m (By property 3.)
= n+ + m (By property 1.)

So P(n+) holds true; by mathematical induction m + n = n + m for all
natural numbers n. Since m was arbitrarily chosen, then m + n = n + m
holds true for any pair of natural numbers m and n.
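
The four properties can be spot-checked on small numbers. The sketch below assumes Python integers as a stand-in for N, with n + 1 playing the role of n+ and n - 1 the role of the immediate predecessor. The names succ and add are ours. A finite check is evidence only; the inductive proofs above are what establish the properties for all natural numbers.

```python
def succ(n):
    return n + 1          # stand-in for n+ = n ∪ {n}


def add(m, n):
    """m + n from rule (1) m + 0 = m and rule (2) m + n+ = (m + n)+."""
    return m if n == 0 else succ(add(m, n - 1))


R = range(8)

# Property 1: n+ = n + 1 = 1 + n.
assert all(succ(n) == add(n, 1) == add(1, n) for n in R)
# Property 2: 0 + n = n.
assert all(add(0, n) == n for n in R)
# Property 3: associativity.
assert all(add(add(m, k), n) == add(m, add(k, n))
           for m in R for k in R for n in R)
# Property 4: commutativity.
assert all(add(m, n) == add(n, m) for m in R for n in R)
```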

15.4 Definition of multiplication on N.

As for addition, multiplication will be defined recursively.

Definition 15.3 For any natural number m, multiplication with the natural
number m is defined as the function sm : N → N satisfying the two conditions

sm (0) = 0
sm (n+ ) = sm (n) + m

We define the expressions mn and m × n as alternate ways of writing sm(n).
Thus

sm(0) = 0 ⇔ m0 = m × 0 = 0 (3)
sm(n+) = sm(n) + m ⇔ mn+ = mn + m = m × n + m (4)
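
As with addition, conditions (3) and (4) can be read as a recursive procedure. The following is a minimal sketch, assuming Python integers stand in for N and that n - 1 plays the role of the immediate predecessor of a non-zero n. The function names add and mul are ours.

```python
def add(m, n):
    """m + n from m + 0 = m and m + n+ = (m + n)+."""
    return m if n == 0 else add(m, n - 1) + 1


def mul(m, n):
    """s_m(n), computed from s_m(0) = 0 and s_m(n+) = s_m(n) + m."""
    if n == 0:
        return 0                       # condition (3)
    return add(mul(m, n - 1), m)       # condition (4)


# The first few values of s_3: 0, 3, 3 + 3, 3 + 3 + 3, ...
table = [mul(3, n) for n in range(5)]
assert table == [0, 3, 6, 9, 12]
```

Each value in the table is obtained from the previous one by adding m once more, so computing m × n requires all of m × 0, m × 1, up to m × (n − 1) first.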

Theorem 15.4 Let m be a fixed natural number and let sm : N → N be a
relation satisfying the two properties:

sm(0) = 0
sm(n+) = sm(n) + m

Then sm is a well-defined function on N.

Proof: (Outline only)

Let S be a class of relations on N defined as follows:

S = {R ⊆ N × N : (0, 0) ∈ R and (n, y) ∈ R ⇒ (n+ , y + m) ∈ R}

Now S is non-empty since it contains N × N. Let s∗ = ⋂_{R ∈ S} R. This means
that s∗ is the smallest set of ordered pairs satisfying the conditions described
for S. The relation s∗ looks something like

s∗ = {(0, 0), (1, m), (2, m + m), (3, m + m + m), · · · , (n, m × n), · · · }

Claim: s∗ is a function mapping N into N. Proof of the claim is left as an
exercise.
Claim: s∗ satisfies the properties which characterize sm . Proof of the claim
is left as an exercise.
We conclude that sm and s∗ are indeed the same relation. Since s∗ was shown
to be a function, the recursively defined function sm : N → N is well-defined.

We now verify that the expected properties of multiplication are satisfied.

1) For any natural number n,
0n = 0
Proof: By induction. It is left as an exercise.
2) For any natural number n,
1n = n
Proof: By induction. It is left as an exercise.
3) Multiplication of the natural numbers is distributive over addition. That
is, for any three natural numbers m, n and k

n(m + k) = nm + nk and (m + k)n = mn + kn

Proof: By induction. Outline of proof for left-hand distribution. Right-
hand distribution is left as an exercise. Let m and k be any two natural
numbers. Let P (n) be the property “k(m + n) = km + kn”.
Base case: We see that P (0) holds true since

k(m + 0) = km
= km + 0
= km + k0

Inductive hypothesis: If P(n) holds true, then

k(m + n+) = k(m + n)+ (By (2) in the definition of addition.)
= k(m + n) + k (By (4) in the definition of multiplication.)
= (km + kn) + k (Since P(n) holds true.)
= km + (kn + k) (By associativity of addition.)
= km + kn+ (By (4) in the definition of multiplication.)

So P(n+) holds true. By mathematical induction k(m + n) = km + kn
for all natural numbers n. Since m and k were arbitrarily chosen, then
k(m + n) = km + kn holds true for any three natural numbers m, k
and n.

4) Multiplication of the natural numbers is associative. That is, for any

three natural numbers m, n and k,

(mn)k = m(nk)
Proof: By induction. It is left as an exercise.
5) Multiplication of the natural numbers is commutative. That is, for any
two natural numbers m and n,

mn = nm
Proof: By induction. It is left as an exercise.
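
The recursive clauses that define sm, together with properties 1) to 5) above, can be spot-checked numerically. The following is only a minimal sketch and not part of the formal development: Python's built-in integers stand in for N, and the helper names succ, add and mul are chosen here purely for illustration. A finite check of this kind proves nothing; the induction proofs are still required.

```python
def succ(n):
    """Successor n+ of a natural number n."""
    return n + 1

def add(m, n):
    """m + n by recursion on n: m + 0 = m and m + n+ = (m + n)+."""
    result = m
    for _ in range(n):
        result = succ(result)
    return result

def mul(m, n):
    """s_m(n) by recursion on n: s_m(0) = 0 and s_m(n+) = s_m(n) + m."""
    result = 0
    for _ in range(n):
        result = add(result, m)
    return result

# Spot-check properties 1) to 5) on a small range of natural numbers.
small = range(6)
for m in small:
    assert mul(0, m) == 0 and mul(1, m) == m
    for n in small:
        assert mul(m, n) == mul(n, m)
        for k in small:
            assert mul(n, add(m, k)) == add(mul(n, m), mul(n, k))
            assert mul(mul(m, n), k) == mul(m, mul(n, k))
```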

15.5 Subtraction on the natural numbers.

Subtraction is easily defined in terms of addition. To define subtraction, we
must first establish the following fact.

Theorem 15.5 For any two natural numbers m and n, m ∈= n if and only
if there exists a unique natural number k such that n = m + k.

P roof:
By induction. Let P (n) be the statement: “For any m ∈= n there exists a
unique natural number k such that n = m + k.”
Base case: Suppose n = 0. Then, for any m ∈= 0, m = 0 and so there exists
only k = 0 such that 0 = n = m + k = 0 + 0. So P (0) holds true.
Inductive hypothesis: Suppose n is a natural number such that for any nat-
ural number m ∈= n, there exists a unique natural number k such that
n = m + k. Suppose m is a natural number such that m ∈= n+ . Then
m ∈= n+ = n ∪ {n} implies m ∈ n or m = n or m = n+ . The equality
m = n+ means we can only choose k = 0. If m ∈ n, then the existence of a
unique natural number k such that n = m+k is guaranteed by our inductive
hypothesis. So

n+ = n+1
= (m + k) + 1
= m + (k + 1)
= m + k+

In this case the required natural number is k+ . If m = n, then n+ = n + 1 =
m + 1. The unique required value is k = 1. So P (n+ ) holds true.
By mathematical induction, P (n) holds true for all values of n.

Definition 15.6 For any two natural numbers m and n such that m ≤ n, the
unique natural number k satisfying n = m + k is called the difference between
n and m and is denoted by n − m. The operation “−” is called subtraction.
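
Definition 15.6 can be read as a search: n − m is the one value of k for which n = m + k. The sketch below illustrates this reading under the assumption that Python integers stand in for N; the name difference is hypothetical and chosen only for this illustration.

```python
def difference(n, m):
    """Return the unique natural number k with n == m + k, defined only for m <= n."""
    if m > n:
        raise ValueError("n - m is not defined on the natural numbers when m > n")
    k = 0
    while m + k != n:
        k += 1
    return k
```

Note that difference(2, 5) raises an error: on N, the expression 2 − 5 has no meaning.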

With respect to the basic operations of addition, subtraction and multiplication
on the natural numbers, our work is done. These operations have been
shown to be properly defined in our ZFC -axiomatic universe. Again, the
intention is not to adopt the formal definitions as regular methods for doing
arithmetic; it is only to ensure that arithmetic is definable in a set-theoretic
context.

Concepts review:
1. How is addition on the natural numbers defined?
2. For any natural number n, give two ways of describing n+ .
3. How can we prove that 0 + n = n for all n from the definition of
addition?
4. How can we prove that addition is associative from the definition
of addition?
5. How can we prove that addition is commutative from the definition
of addition?
6. How is multiplication of natural numbers defined?
7. How is subtraction of natural numbers defined?

EXERCISES

A. 1. Use mathematical induction to prove the following multiplication properties.
a) For all natural numbers n, 0n = 0.
b) For all natural numbers n, 1n = n.
c) For all natural numbers m, n and k, n(m+k) = nm+nk and (m+k)n =
mn + kn.
d) For all natural numbers m, n and k, (mn)k = m(nk).
e) For all natural numbers m and n, mn = nm.

B. 2. Prove that if n < k, then m + n < m + k.


3. Prove that m + n = m + k implies n = k.

4. Prove that if m < n, then mk < nk.
5. Prove that if mk = nk and k ≠ 0, then m = n.

C. 6. Prove that m + k < n + k implies m < n.

7. Prove that mk < nk implies m < n.
8. Prove that for any two natural numbers m and n, m ≤ n if and only if
there exists a unique natural number k such that n = m + k.
9. Prove in detail Theorem 15.4.

16 / The integers Z and the rationals Q

Abstract. In this section we define both the integers, Z, and the rational
numbers, Q. The integers are presented as a quotient set of N × N, while
the rationals are presented as a quotient set of Z × (Z − {0}). Addition,
subtraction and multiplication on each of these are defined within the set-
theoretic context. Order relations are defined on each of Z and Q so that
they are linearly ordered in the way we are accustomed to.

16.1 Constructing the set of integers Z from N.

Most people easily recognize an integer when they see one. One might say,
given the natural numbers N = {0, 1, 2, 3, . . ., }, if we add to it all the “nega-
tive” natural numbers {−1, −2, −3, . . . , } we obtain all integers. We are then
only left with the sticky problem of explaining what a “negative” natural
number is within our set-theoretic framework.
Remember that the only mathematical objects in our set-theoretic universe
are sets. So the integer −3 must be a set of some sort. But the idea of a
“negative set” is not very intuitive. One might say that −3 is the difference
between the natural numbers 2 and 5. But, in the last section, the difference,
n − m, between natural numbers was only defined for pairs (n, m)
where the second natural number is less than or equal to the first one. So
the expression 2 − 5 is, as yet, not defined.
What do the ordered pairs (5, 2), (10, 7) and (3, 0) have in common? We
notice that the first entry minus the second entry is 3 for each of the pairs.
If we consider, on the other hand, the pairs (1, 5), (6, 10) and (0, 4), we see
that in each case, the first entry minus the second entry is −4. This suggests
that an equivalence relation of some sort on N × N may provide a useful
way of representing negative integers. We explore this avenue to see where
it leads us.

Theorem 16.1 Let Z = N × N. Let Rz be a relation on Z which is defined

as follows: (a, b)Rz (c, d) if and only if a + d = b + c. Then Rz is an equivalence
relation on Z.

P roof:
Reflexivity: Since a + b = b + a, (a, b)Rz (a, b).
Symmetry: (a, b)Rz (c, d) ⇒ a + d = b + c ⇒ c + b = d + a ⇒ (c, d)Rz (a, b).
Transitivity: Suppose (a, b)Rz (c, d) and (c, d)Rz (e, f). Then a + d = b + c
and c + f = d + e. This implies a + d + c + f = b + c + d + e. Subtracting
c+d from both sides of the equality gives a+f = b +e. Hence, (a, b)Rz (e, f).

Notation: If R is an equivalence relation on S and x ∈ S, then we will use

the notation
[x]R
to denote the equivalence class of all elements equivalent to x under R. When
the context indicates which relation we are referring to and there is no risk
of confusion, we will simply write [x] instead of [x]R.

Corollary 16.2 Let Z = N × N be equipped with the equivalence relation

Rz defined as:
(a, b)Rz (c, d) ⇔ a + d = b + c
For each n ∈ N, let [(0, n)] and [(n, 0)] denote the Rz -equivalence classes
containing the elements (0, n) and (n, 0), respectively. Then the quotient set
induced by Rz on Z can be expressed as

Z/Rz = {[(0, n)] : n ∈ N} ∪ {[(n, 0)] : n ∈ N}

P roof:
We will start by showing that the equivalence classes in Z/Rz cover all of
Z. That is, we will show that
Z ⊆ U = ⋃_{n∈N} [(0, n)] ∪ ⋃_{n∈N} [(n, 0)]

Let (a, b) ∈ Z. We will show that (a, b) ∈ U . Suppose (c, d) ∈ [(a, b)]. Then
(a, b)Rz (c, d) ⇒ a + d = b + c. We consider two cases, d ≤ c and c ≤ d:

d≤c ⇒ a+d−d = b+c−d

⇒ a + 0 = b + (c − d)
⇒ (a, b)Rz ((c − d), 0)
c≤d ⇒ a+d−c = b+c−c
⇒ a + (d − c) = b + 0
⇒ (a, b)Rz (0, (d − c))

So if (a, b) ∈ Z, then either (a, b) ∈ [((c−d), 0)] or (a, b) ∈ [(0, (d−c))]. Thus,
Z ⊆ U = ⋃_{n∈N} [(0, n)] ∪ ⋃_{n∈N} [(n, 0)]. So every element of Z = N × N is
an element of some equivalence class in Z/Rz .
Next we show that if m and n are distinct, then (0, n) and (0, m) are not
related with respect to Rz :

m ≠ n ⇒ 0 + n ≠ m + 0
⇒ (0, m) ∉ [(0, n)]

By the same argument, (m, 0) ∉ [(n, 0)] whenever m ≠ n. We show that the elements of
Z/Rz with distinct representatives do not overlap: For if m ≠ n and
(x, y) ∈ [(0, m)] ∩ [(0, n)], then (0, m)Rz (x, y) and (x, y)Rz (0, n) imply
(0, m)Rz (0, n) (by transitivity), a contradiction.
So the sets in {[(0, n)] : n ∈ N} ∪ {[(n, 0)] : n ∈ N} represent all equivalence
classes of Z induced by Rz .
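
The case analysis in the proof above amounts to a recipe for picking, from each class [(a, b)], the representative of the form (n, 0) or (0, n). Here is a minimal sketch of that recipe, assuming Python tuples of non-negative integers stand in for elements of N × N; the names related and canonical are hypothetical.

```python
def related(p, q):
    """(a, b) Rz (c, d) if and only if a + d == b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

def canonical(p):
    """Return the representative of [(a, b)] of the form (n, 0) or (0, n)."""
    a, b = p
    # Only natural-number subtraction is used: a - b when b <= a, b - a otherwise.
    return (a - b, 0) if b <= a else (0, b - a)

# Every pair in a small grid is related to its canonical representative,
# and that representative always has a zero entry.
for a in range(8):
    for b in range(8):
        rep = canonical((a, b))
        assert related((a, b), rep)
        assert 0 in rep
```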

We have set the stage for a set-theoretic definition of the “integers”. Some
readers may have some insight on where this is leading. It seems that the
plan is to have the equivalence class [(0, n)] represent the negative integers
−n = 0 − n and [(n, 0)] represent the positive integers n − 0 = n. We could
then equate −5 with [(0, 5)] and the integer 5 with [(5, 0)].
Some may immediately wonder: Why do the positive integers need defining?
Aren’t positive integers simply the natural numbers? How can the natural
number 5 = {0, 1, 2, 3, 4} be the same set as the integer 5 = [(5, 0)]? These
two sets are indeed different since they don’t contain the same elements. It
is true, the “natural number 5” and the “integer 5” have different set rep-
resentations. The question is: Is this a major problem or is it just a minor
annoyance? It may be possible to construct the integers with the specific
requirement that the sets which represent the positive integers and the sets
which represent the natural numbers be the same. But this constraint may
present some hurdles around which it may be difficult to maneuver. When
we think about it carefully, it is not the sets which represent the natural
numbers and the sets which represent the positive integers that are impor-
tant. What is however crucial is that the arithmetic operations on these sets
each produce the expected values. That is, both 5 + 3 and [(5, 0)] + [(3, 0)]
produce 8 “the natural number” and 8 = [(8, 0)] “the positive integer”, re-
spectively. With this in mind, we proceed with a formal definition of the
integers.

Definition 16.3 The set of integers, Z, is defined as:

Z = (N × N)/Rz = {[(a, b)] : a, b ∈ N} = {[(0, n)] : n ∈ N} ∪ {[(n, 0)] : n ∈ N}

a) Negative integers: The set of negative integers is defined as being the set

Z− = {[(0, n)] : n ∈ N}

Positive integers: The set of positive integers is defined as being the set

Z+ = {[(n, 0)] : n ∈ N}

If n is not 0, the elements of the form [(0, n)] can be represented by −n =

[(0, n)], while the elements of the form [(n, 0)] can be represented as n =
[(n, 0)].
b) Order relation on Z: We define a relation ≤z on Z as follows: [(a, b)] ≤z
[(c, d)] if and only if a + d ≤ b + c. It is a routine exercise to show that ≤z
is a linear ordering of Z.
c) Addition on Z: We must sometimes distinguish between addition of natural
numbers and addition of integers. Where there is a risk of confusion, we
will use the following notation: “+n ” means addition of natural numbers,
while “+z ” means addition of integers.
Addition +z on Z is defined as:

[(a, b)] +z [(c, d)] = [(a +n c, b +n d)]

d) Opposites of integers: The opposite −[(a, b)] of [(a, b)] is defined as:

−[(a, b)] = [(b, a)] ¹

e) Subtraction on integers: Subtraction “−z ” on Z is defined as:

[(a, b)] −z [(c, d)] = [(a, b)] +z (−[(c, d)]) ²

f) Multiplication of integers: Multiplication ×z on Z is defined as:

[(a, b)] ×z [(c, d)] = [(ac + bd, ad + bc)] ³

In particular, [(0, n)] ×z [(m, 0)] = [(0 + 0, 0 + nm)] = [(0, nm)] = −[(nm, 0)]
and [(n, 0)] ×z [(m, 0)] = [(nm, 0)].
g) Absolute value of an integer: The absolute value, |n|, of an integer n is
defined as ⁴
|n| = n if 0 ≤z n
|n| = −n if n <z 0
h) Equality of two integers: If (a, b) and (c, d) are ordered pairs which are
equivalent under the relation Rz , then the Rz -equivalence classes [(a, b)]
and [(c, d)] are equal sets. To emphasize that they are equal sets under the
relation Rz , we can write

[(a, b)] =z [(c, d)]

1 Note that −[(n, 0)] = [(0, n)] = −n.
2 When there is no risk of confusion with subtraction of other types of numbers, we will
simply use “−”.
3 Note that the “center dot” can be used instead of the “×z ” symbol. When there is no
risk of confusion with multiplication of other types of numbers, we will simply use “×”.
4 View “absolute value” as a function | | : Z → Z.

i) Distribution properties: If [(a, b)], [(c, d)] and [(e, f)] are integers, then

[(a, b)] ×z ([(c, d)] +z [(e, f)]) =z [(a, b)] ×z [(c, d)] +z [(a, b)] ×z [(e, f)]

and

([(c, d)] +z [(e, f)]) ×z [(a, b)] =z [(c, d)] ×z [(a, b)] +z [(e, f)] ×z [(a, b)]

It is good to remember that any integer can be written in the form [(n, 0)]
or [(0, n)] = −[(n, 0)]. These forms make it easier to add and multiply
them without memorizing intricate formulas. For example, the expression
[(2, 4)] ×z [(5, 2)] can be more easily computed as follows:

[(2, 4)] ×z [(5, 2)] = [(0, 2)] ×z [(3, 0)]

= −[(2, 0)] ×z [(3, 0)]
= −[(6, 0)] = [(0, 6)]

We verify that the product of two negative integers produces a positive
integer as it should.
[(2, 4)] ×z [(2, 5)] =z [(0, 2)] ×z [(0, 3)]
=z [(0 + 6, 0 + 0)]
=z [(6, 0)]
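
The operations of Definition 16.3 can be tried out mechanically. The sketch below is an illustration only: a pair of Python non-negative integers stands for a representative (a, b) of an integer [(a, b)], every result is reduced to its representative of the form (n, 0) or (0, n), and the names canonical, z_add, z_neg, z_sub, z_mul and z_le are hypothetical.

```python
def canonical(p):
    """Representative of [(a, b)] of the form (n, 0) or (0, n)."""
    a, b = p
    return (a - b, 0) if b <= a else (0, b - a)

def z_add(p, q):
    """[(a, b)] +z [(c, d)] = [(a + c, b + d)]."""
    (a, b), (c, d) = p, q
    return canonical((a + c, b + d))

def z_neg(p):
    """-[(a, b)] = [(b, a)]."""
    a, b = p
    return canonical((b, a))

def z_sub(p, q):
    """[(a, b)] -z [(c, d)] = [(a, b)] +z (-[(c, d)])."""
    return z_add(p, z_neg(q))

def z_mul(p, q):
    """[(a, b)] xz [(c, d)] = [(ac + bd, ad + bc)]."""
    (a, b), (c, d) = p, q
    return canonical((a * c + b * d, a * d + b * c))

def z_le(p, q):
    """[(a, b)] <=z [(c, d)] if and only if a + d <= b + c."""
    (a, b), (c, d) = p, q
    return a + d <= b + c

print(z_mul((2, 4), (5, 2)))   # (0, 6), that is, -6
print(z_mul((2, 4), (2, 5)))   # (6, 0), that is, 6
```

Multiplying the unreduced pairs (2, 4) and (5, 2) directly gives (18, 24), which is Rz-equivalent to (0, 6); the result does not depend on which representatives are used.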

Remark : We pause to deconstruct the elements of Z to better see the nature

of the sets that belong to it. Let u ∈ Z. Then u = [(a, b)] for some a, b ∈ N.
By lemma 4.5, (a, b) ∈ N × N ⊆ P(P(N)). Since [(a, b)] ⊂ P(P(N))
then [(a, b)] ∈ P(P(P(N))). For convenience we denote P(P(P(N))) as
P³(N). We conclude that

Z ⊆ P³(N)

16.2 Constructing the rational numbers, Q, from Z.

We have succeeded in “extracting” the integers Z from N × N by construct-
ing a quotient set induced by a particular equivalence relation on N × N. To
construct the rationals, we will proceed in a similar way.
When looking at a rational number, a/b, it may be useful to view it as an
ordered pair of integers (a, b) where the first entry plays the role
of the numerator, while the (non-zero) second entry plays the role of the
denominator.

But simply defining a/b as an ordered pair (a, b) in Z×Z would not do, since a
rational number, say −2/3, can have many equivalent forms: 2/(−3), (−4)/6, 20/(−30). So
the associated ordered pairs of integers (−2, 3), (−4, 6) and (20, −30) should
also be equivalent forms of the same number. To overcome this difficulty,
we will define an equivalence relation on Q = Z × Z∗ (where Z∗ = Z −
{0}) so that all equivalent forms of (−2, 3) belong to an equivalence class
[(−2, 3)] induced by this equivalence relation. We have chosen the Cartesian
product Z × Z∗ rather than Z × Z since the second entry cannot be zero.
The equivalence relation we will use to extract Q from Q = Z × Z∗ will
be represented by Rq . To define this equivalence relation Rq , we will ask
ourselves: What property makes the two rational numbers −2/3 and 8/(−12)
equivalent? We see that

−2/3 = 8/(−12) implies (−2)(−12) = (3)(8)

More generally we see that

a/b = c/d if and only if ad = bc

We want the ordered pairs (a, b) and (c, d) in Z × Z∗ to be related under Rq

provided they satisfy the property a ×z d = b ×z c. Proving that this is a
valid equivalence relation is routine. It is formally stated as a theorem.

Theorem 16.4 Let Q = Z × Z∗ where Z∗ = Z − {0}. Let Rq be a relation

on Q defined as follows: (a, b)Rq (c, d) if and only if a ×z d = b ×z c. Then Rq
is an equivalence relation on Q.

P roof: The proof is left as an exercise.

For example, consider the two elements (6, 10) and (15, 25) of Z × Z∗.
Since 6 ×z 25 = 150 = 10 ×z 15, they are equivalent rational numbers. So
[(6, 10)]q =q [(15, 25)]q. Remember that [(a, b)]q will represent the set of all
elements of Z × Z∗ which are Rq -equivalent to the element (a, b) ∈ Z × Z∗.
When there is no risk of confusion with the equivalence class of another
equivalence relation, we will simply use [(a, b)] rather than [(a, b)]q .
We now formally define the rational numbers within a set-theoretic context.

Definition 16.5 The set of rational numbers, Q, is defined as:

Q = (Z × Z∗)/Rq = {[(a, b)] : a ∈ Z, b ∈ Z∗} ⁵

The expression [(a, b)] is normally written in the form a/b.

a) We define a relation ≤q on Q as follows: If b and d are both positive,
[(a, b)] ≤q [(c, d)] if and only if a ×z d ≤z b ×z c.
b) Addition +q on Q is defined as:

[(a, b)] +q [(c, d)] = [(ad +z bc, b ×z d)]

c) Subtraction −q on Q is defined as:

[(a, b)] −q [(c, d)] = [(a, b)] +q [(−c, d)]

d) Multiplication ×q on Q is defined as:

[(a, b)] ×q [(c, d)] = [(a ×z c, b ×z d)]

e) Equality of two rational numbers: If (a, b) and (c, d) are ordered pairs of
integers (where neither b nor d is 0) which are equivalent under the relation
Rq , then the Rq -equivalence classes [(a, b)] and [(c, d)] are equal sets. To
emphasize that they are equal sets under the relation Rq , we can write

[(a, b)] =q [(c, d)]

f) Opposites of rational numbers. If (a, b) is an ordered pair of integers (b ≠ 0)
and [(a, b)] is its Rq -equivalence class, then the opposite of the rational
number [(a, b)] is defined as [(−a, b)] and is denoted as −[(a, b)] =q [(−a, b)].

Proofs that addition, subtraction, multiplication and linear ordering, thus
defined, reflect precisely what we normally obtain when performing the usual
algorithms on Q are left as an exercise. Once this is verified, the reader is, of
course, free to use the usual algorithms for computation involving rational
numbers. When convenient, we will interchangeably represent Rq -equivalence
classes [(a, b)] as a/b and vice-versa.
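
The operations of Definition 16.5 can be compared against the usual algorithms by machine. In the sketch below, which is an illustration and not a substitute for the proofs, Python integers stand in for Z, a pair (a, b) with b ≠ 0 stands for a representative of [(a, b)], and the standard library type fractions.Fraction plays the role of the “usual algorithms”. The function names are hypothetical.

```python
from fractions import Fraction

def q_related(p, q):
    """(a, b) Rq (c, d) if and only if a * d == b * c."""
    (a, b), (c, d) = p, q
    return a * d == b * c

def q_add(p, q):
    """[(a, b)] +q [(c, d)] = [(ad + bc, bd)]."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def q_sub(p, q):
    """[(a, b)] -q [(c, d)] = [(a, b)] +q [(-c, d)]."""
    c, d = q
    return q_add(p, (-c, d))

def q_mul(p, q):
    """[(a, b)] xq [(c, d)] = [(ac, bd)]."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)

def q_le(p, q):
    """[(a, b)] <=q [(c, d)] if and only if ad <= bc, for positive b and d."""
    (a, b), (c, d) = p, q
    return a * d <= b * c

def as_fraction(p):
    """Convert a representative (a, b) into the usual fraction a/b."""
    return Fraction(p[0], p[1])

# The pairs (6, 10) and (15, 25) represent the same rational number.
assert q_related((6, 10), (15, 25))
# The pair-based sum agrees with ordinary fraction addition.
assert as_fraction(q_add((5, 3), (3, 2))) == Fraction(5, 3) + Fraction(3, 2)
```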
Remark : We pause to deconstruct the elements of Q to better see the nature
of the sets that belong to it. Recall that Z ⊆ P³(N). Let u ∈ Q. Then
u = [(a, b)] for some a ∈ Z and b ∈ Z∗. By Lemma 4.5, (a, b) ∈ Z × Z∗ ⊆ P(P(Z)).
Since [(a, b)] ⊂ P(P(Z)) then [(a, b)] ∈ P(P(P(Z))) = P³(Z). We conclude
that Q ⊆ P³(Z) ⊆ P³(P³(N)) = P⁶(N).

5 Recall that a and −b are shorthand for expressions of the form [(a, 0)] and [(0, b)] = −[(b, 0)].

Theorem 16.6 Suppose a and b are positive integers and [(a, b)]
is an Rq -equivalence class. Then
a) [(−a, −b)] = (−a)/(−b) = a/b = [(a, b)].
b) −[(a, b)] =q [(−a, b)] = (−a)/b = a/(−b) = [(a, −b)].
P roof:
a) Since −a ×z b = −b ×z a, then (−a, −b)Rq (a, b). Then (−a, −b) ∈ [(a, b)]
and so we can write (−a)/(−b) = [(−a, −b)] =q [(a, b)] = a/b.

b) Note that if a and b are positive integers, then since −a ×z −b = b ×z a,

then (−a, b) is Rq -equivalent to (a, −b) and so
(−a)/b = [(−a, b)] =q −[(a, b)] =q [(a, −b)] = a/(−b)
Hence, moving negative signs from numerators to denominators and vice versa
in rational numbers is justified.

Is an integer equal to a rational number in this set-theoretic context? To help

answer this question, we will compare a particular integer to its equivalent
rational number form. In our set-theoretic axiomatic system, the integer −3
looks like [(0, 3)]z = −[(3, 0)]z ⊂ N × N. In the same axiomatic system, the
rational number −3/1 looks like [(−3, 1)]q ⊂ Z × Z∗ . So, they are not the
same set. However, it can be verified to one’s satisfaction that operations
made with integers n = ±[(n, 0)]z when viewed as rational numbers [(n, 1)]q
will provide results which consistently match those obtained by performing
parallel operations with integer numbers.
Just as a matter of interest, let’s see how our number systems are evolving
in our set-theoretic universe.
− The natural numbers, N, exist thanks to Axiom 8.
− The integers, Z, are defined as follows:

n ∈ Z ⇒ n = [(n, 0)] ⊂ N × N, −n = [(0, n)] ⊂ N × N

⇒ n = [(n, 0)] ∈ P(N × N), −n = [(0, n)] ∈ P(N × N)

Then Z ⊂ P(N × N).

− The rational numbers, Q, are defined as follows:

a/b ∈ Q ⇒ a/b = [(a, b)]q ⊂ Z × Z∗

⇒ a/b = [(a, b)]q ∈ P(Z × Z)
⇒ a/b ∈ P( P(N × N) × P(N × N) )

Then Q ⊂ P( P(N × N) × P(N × N) ).


Examples.
The following examples illustrate that doing arithmetic by referring to the
set-theoretic definitions and the listed properties is a bit awkward and re-
quires some thought. It is of course not an efficient way of doing arithmetic.
To see this, we compute the following expressions by representing these num-
bers as equivalence classes of ordered pairs and using the above definitions.
When useful, we indicate which of the above statements are invoked to jus-
tify various steps.
a) Compute −4(3 − 7)
b) Compute (−2/5)(6/7 − 3/11)

Solution :

a)
−4(3 − 7) = [(0, 4)] ×z ([(3, 0)] −z [(7, 0)])
=z [(0, 4)] ×z ([(3, 0)] +z (−[(7, 0)])) (By Definition 16.3, e) .)
=z [(0, 4)] ×z ([(3, 0)] +z [(0, 7)]) (By Definition 16.3, d) .)
=z ([(0, 4)] ×z [(3, 0)]) +z ([(0, 4)] ×z [(0, 7)])
=z [(0, 12)] +z [(28, 0)]
=z [(28, 12)]
=z [(16, 0)] (Since (a, b)Rz (c, d) ⇔ a + d = b + c.)

b)

(−2/5)(6/7 − 3/11) = [(−2, 5)] ×q ([(6, 7)] −q [(3, 11)])
=q [(−2, 5)] ×q ([(6, 7)] +q [(−3, 11)]) (By Theorem 16.6, b) .)

=q [(−2, 5)] ×q [(6 ×z 11 +z 7 ×z −3, 7 ×z 11)]

=q [(−2, 5)] ×q [(45, 77)]
=q [(−2 ×z 45, 5 ×z 77)]
=q [(−90, 385)]
=q [(−18, 77)]
=q −[(18, 77)]
= −18/77

Concepts review:
1. Describe the equivalence relation Rz on N × N used to define the
elements of the integers Z.
2. Describe the equivalence class induced by Rz on N × N which rep-
resents the integer −9. What about the integer 3?
3. Do the equivalence classes {[(0, n)] : n ∈ N} ∪ {[(n, 0)] : n ∈ N}
account for all the equivalence classes induced by Rz on N × N?
4. How is addition +z defined on Z in a set-theoretic context?
5. How is multiplication ×z defined on Z in a set-theoretic context?
6. Describe the equivalence relation Rq on the Cartesian product Z ×
Z∗ used to define the rational numbers Q.
7. How is addition +q defined on Q in a set-theoretic context?
8. How is multiplication ×q defined on Q in a set-theoretic context?

EXERCISES

A. 1. Use the definitions in 16.3 to show that the following statements are true:
a) −2 ≤ 10.
b) 0 ≤ 3.
c) −5 − 7 = −12.
d) −2 + 6 = 4.
e) 7 × −2 = −14.
f) −1 × −2 = 2.
2. Use the definitions in 16.5 to show that the following statements are true:
a) (−2)/(−3) = 2/3.
b) −2/3 ≤ 2/3.
c) 5/3 + 3/2 = 19/6.
d) −2 − 6/5 = −16/5.
e) 6/5 × 1/3 = 2/5.
f) 3 = 6/2.

B. 3. Let Z = N × N. Let Rz be a relation on Z that is defined as follows:

((a, b), (c, d)) ∈ Rz if and only if a+d = b+c. Prove that Rz is an equivalence
relation on Z.

4. The relation ≤z on Z is defined as follows: [(a, b)] ≤z [(c, d)] if and only if
a + d ≤ b + c. Show that this is a linear ordering.
5. Let Q = Z × Z∗ where Z∗ = Z − {0}. Let Rq be a relation on Q defined as
follows: ((a, b), (c, d)) ∈ Rq if and only if a ×z d = b ×z c. Show that Rq is
an equivalence relation on Q.

C. 6. We define a relation ≤q on Q as follows: When both b and d are positive,
[(a, b)] ≤q [(c, d)] if and only if a ×z d ≤z b ×z c. Show that this is
a linear ordering.

17 / Real numbers: “Dedekind cuts are us!”

Abstract. In this section we show how R is defined within the confines of

the ZFC-set-theoretic axiomatic system. With this objective in mind, we
begin by defining “initial segments” of rational numbers. These special ele-
ments of P(Q) are referred to as “Dedekind cuts”. A linear order relation
is defined on these as well as operations of addition and multiplication. We
then show that there is a function, f, which maps the real numbers R to the
Dedekind cuts one-to-one and onto, while respecting order, addition and
multiplication. Finally we show that the set of all Dedekind cuts satisfies
the essential Completeness property of the real numbers.

17.1 Definition of Dedekind cuts.

Numbers such as √2 and π have long been known to be “not rational”.
Solid proofs showing that such numbers cannot be expressed in the form of
a quotient of integers were produced long ago, and it was known that their values
could be approximated to any degree of accuracy by rational numbers. It was also known that
non-rational numbers were plentiful and that they could be ranked in size
along with the rationals on a line. Initially, those who used mathematics as
a tool in their field of study were satisfied with calling such numbers “ir-
rational numbers” and defining the set of “real numbers”, R, as being the
union of these two disjoint sets. Still, the irrational number has, for centuries,
remained somewhat of a mystery. At least until a branch of mathematics we
call “calculus” came to be. With its intricate methods, a real number was
expressed in a form that most mathematicians became comfortable with
(not something we want to discuss here).
We are now faced with the difficult task of defining the set, R, within the
confines of a mathematical universe governed only by the ZFC-axioms. Our
definition must be such that each real number is viewed as a set. At this
point in our study, we have invoked only Axioms 1 to 6 and Axiom 8. Axiom
7 (Axiom of replacement), Axiom 9 (Axiom of regularity) and Axiom
10 (Axiom of choice) have not yet been required in our study. We will see,
surprisingly enough, that Axioms 7, 9 and 10 will still not be required to
accurately define the set of real numbers.
Before we attempt to define the set, R, within the confines of the ZFC
mathematical universe, it is critical for us to understand what distinguishes
R from Q.
It is true that the real numbers are, in many respects, similar to the rational
numbers. The set Q is one which is linearly ordered, in such a way that, to
any pair of distinct rationals a and b satisfying a < b, we can associate a
third rational number c such that a < c < b. The set R also has a linear
ordering which satisfies this property. But there is a particular property

which distinguishes the reals, R, from the rationals, Q, in a fundamental

way. This property is called the

“completeness property”.

It is also often referred to as the “least upper bound property”. It states that

“Every non-empty bounded subset S of R has a least upper bound”. ¹

Does the set, Q, satisfy the “completeness property”? That is, is it true that
for every non-empty bounded subset S of Q, Q contains the least upper
bound of S? Well, let’s consider the subset
S = [−4, √2) ∩ Q

We see that S is indeed a bounded non-empty subset of Q. Does Q contain
the least upper bound of S? The set U of all rational upper bounds of S is

[√2, ∞) ∩ Q

But [√2, ∞) ∩ Q has no least rational number since for any rational number
r > √2 there exists a rational number s such that √2 < s < r.
So Q does not satisfy the “completeness property”. Some will refer to Q as
an “incomplete set”. In the case of S = [−4, √2) ∩ Q the least upper bound of S is
the non-rational number √2.
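
The claim that every rational upper bound r of S admits a strictly smaller rational upper bound s can be made concrete. One possible choice, which is an assumption of this sketch and not a construction given in the text, is s = (r + 2/r)/2. If r is a positive rational with r² > 2, then 2/r < r gives s < r, and s² − 2 = ((r − 2/r)/2)² > 0 gives s > √2. The function name smaller_upper_bound is hypothetical.

```python
from fractions import Fraction

def smaller_upper_bound(r):
    """Given a rational r > 0 with r*r > 2, return a rational s with s*s > 2 and s < r."""
    if r <= 0 or r * r <= 2:
        raise ValueError("r must be a positive rational with r*r > 2")
    return (r + 2 / r) / 2

# Starting from the upper bound 3/2, produce a strictly decreasing
# chain of rational upper bounds of S. The chain never has to stop.
r = Fraction(3, 2)
for _ in range(4):
    s = smaller_upper_bound(r)
    assert s < r and s * s > 2
    r = s
```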
In this section we will present a very elegant and clever set-theoretic def-
inition of the real numbers put forward by the mathematician Richard
Dedekind ². We begin by defining a family of subsets of Q which will set
the foundation for Dedekind’s set-theoretic definition of real numbers.

Definition 17.1 A Dedekind cut is a subset S of the rational numbers, Q,

which satisfies the following properties:
1. The set S ≠ ∅ and S ≠ Q.
2. For any two rational numbers a and b , if a ∈ S and b < a, then b ∈ S.
3. The set S contains no maximal element. That is, if a ∈ S, there exists
b ∈ S such that a < b.
1 We can also say: “If S is a non-empty bounded subset of R, the set R contains the least

element of the set U of all upper bounds of S”.

2 Richard Dedekind (1831-1916) was a German mathematician who made important con-

tributions to number theory, abstract algebra (particularly ring theory), and the axiomatic
foundations of arithmetic. His best-known contribution is the definition of real numbers
through the notion of Dedekind cut. He is also considered a pioneer in the development of
modern set theory and of the philosophy of mathematics known as Logicism.

Along with this slightly abstract definition, we consider the following subset
of Q. For r ∈ R, we define

(←Q r) = (−∞, r) ∩ Q

where the number, r, whether it is rational or not, is referred to as the

“leader of (←Q r)”

We will show that subsets of Q of the form (←Q r) offer another way of per-
ceiving the Dedekind cuts.
Claim: We claim that any Dedekind cut can be expressed in the form (←Q r).
Proof of claim: Suppose S is a Dedekind cut, a subset of Q satisfying the
three conditions stated above.
Condition one states that there exists a rational number k such that k ∉ S.
If u ∈ S and u > k then, by condition two k would belong to S. So S ⊆
(−∞, k)∩Q. Condition two also guarantees that if a ∈ S, then (−∞, a)∩Q ⊆
S. So, if a ∈ S, then

(−∞, a) ∩ Q = (−∞, a) ∩ S ⊂ (−∞, k)

Let q be the least upper bound of S (possibly an irrational number). Then,

for any a ∈ S,

(−∞, a) ∩ Q = (−∞, a) ∩ S ⊆ (−∞, q] ⊆ (−∞, k]

See that the number q cannot belong to S, for if it did, S would contain its
maximal element q, contradicting condition three. We conclude that

S = (−∞, q) ∩ Q = (←Q q)

where q is the least upper bound of S, as claimed.

Proving that any subset of Q of the form (←Q r) = (−∞, r) ∩ Q, for some
real number, r, satisfies the three conditions in the definition of a Dedekind
cut is left as an exercise.
Hence, for each r ∈ R, (←Q r) is a Dedekind cut.

We define,
D = {(←Q r) : r ∈ R}
We have argued above that the set, D, precisely represents the set of all
Dedekind cuts.
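
One way to get a feel for a Dedekind cut is to treat it as a membership test on rational numbers. The sketch below is an illustration under stated assumptions: rationals are modelled with the standard library type fractions.Fraction, the cut with leader √2 is described using rational arithmetic only (q < 0 or q² < 2), and the names cut_of_rational, cut_of_sqrt2 and downward_closed_on are hypothetical. Checking finitely many sample points illustrates condition 2 of Definition 17.1 but does not prove it.

```python
from fractions import Fraction

def cut_of_rational(r):
    """Membership test for the cut with rational leader r: all rationals q < r."""
    return lambda q: q < r

def cut_of_sqrt2(q):
    """Membership test for the cut with leader sqrt(2), using rational arithmetic only."""
    return q < 0 or q * q < 2

def downward_closed_on(cut, points):
    """Check condition 2 of Definition 17.1 on a finite list of sample points."""
    return all(cut(b) for a in points if cut(a) for b in points if b < a)

# Sample rationals -3.0, -2.9, ..., 3.0.
samples = [Fraction(n, 10) for n in range(-30, 31)]
assert downward_closed_on(cut_of_sqrt2, samples)
assert downward_closed_on(cut_of_rational(Fraction(2)), samples)
```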

17.2 Linearly ordering, adding and multiplying the elements of D .

The perspicacious reader may already have some insight on where we are
going with this, with the suspicion that we will eventually view the elements
in D as being the “real numbers” in ZFC. But for now, let’s just call this
speculation. We still have lots of work to do. We first show how the elements
of D can be linearly ordered; then we show how to add and multiply its el-
ements.
a) A linear ordering of D: Since the elements of D are subsets of Q, we will
order the elements of D by inclusion. That is, we define “<” on D as
follows:
(←Q r) < (←Q t) ⇔ (←Q r) ⊂ (←Q t)
Notice that this ordering of D is one which rigorously respects the order
of their leaders (whether they are rational or irrational). That is (←Q r) <
(←Q t) ⇔ r < t. For example, since −2 < π, then

(←Q −2) ⊂ (←Q π) and so (←Q −2) < (←Q π)

Furthermore, (←Q a) = (←Q b) if and only if a = b. If (←Q c) and (←Q d)

are distinct elements of D, then c and d are distinct real numbers and so
either c < d or d < c. Hence, either (←Q c) ⊂ (←Q d) or (←Q d) ⊂ (←Q c).
So “<” linearly orders the elements of D.
b) Addition on D: We define addition in D as follows:

(←Q r) + (←Q t) = {x + y ∈ Q : x ∈ (←Q r) and y ∈ (←Q t)}

Claim #1: We claim that the set

D = {x + y : x ∈ (←Q r) and y ∈ (←Q t)}

is a Dedekind cut.
Proof of claim #1: To see this, we will show that D satisfies the three
Dedekind cuts’ conditions. First note that, if k and q are rational numbers
with k > r and q > t, then x + y < k + q for any x + y ∈ D. Then k + q is a
strict upper bound of D. So D cannot be all of Q. Then condition one is satisfied.
Next, suppose s ∈ D. Then s = a + b where a ∈ (←Q r), b ∈ (←Q t).
Suppose d ∈ Q such that d < a + b < r + t. Since d − a < b and b ∈ (←Q t),
d − a ∈ (←Q t). Since a ∈ (←Q r), then

d = a + (d − a) ∈ D

So for any s ∈ D, if d < s then d ∈ D. So,

s ∈ D ⇒ (←Q s) ⊂ D

Condition two is satisfied.


Finally suppose s = a + b where a ∈ (←Q r), b ∈ (←Q t). Then there exist
ar ∈ (←Q r) and bt ∈ (←Q t) with a < ar and b < bt , so that s = a + b <
ar + bt ∈ D. Condition three is satisfied.
So D is indeed a Dedekind cut. This establishes claim #1.
Claim #2: We claim that D = (←Q (r + t)).
Proof of Claim #2: Suppose s = a+b ∈ D where a ∈ (←Q r) and b ∈ (←Q t).
Then s = a + b < r + t. So s ∈ (←Q (r + t)). So D ⊆ (←Q (r + t)).
We now show that (←Q (r + t)) \ D is empty. To do this, it suffices to show
that r + t is the least upper bound of D. If not, there exists a positive
number, say ε, such that r + t − ε is the least upper bound of D. But

r + t − ε = r − ε/2 + t − ε/2

There exists a rational a in (r − ε/2, r) and a rational b in (t − ε/2, t).

Then a + b > r + t − ε and a + b ∈ D contradicting the fact that r + t − ε
is the least upper bound of D. So the least upper bound of D must be r+t.
Then D = (←Q (r + t)), which establishes Claim #2. We have shown that

D = {x + y ∈ Q : x ∈ (←Q r) and y ∈ (←Q t)} = (←Q (r + t))

For example: (←Q −5) + (←Q 7) = (←Q (−5 + 7)) = (←Q 2).
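
The ε/2 argument of Claim #2 can be tried out when the leaders r and t are rational. The following Python sketch is only an illustration under that assumption; the helper names `cut` and `in_sum` are hypothetical and do not appear in the text. It models (←Q r) as a membership test and, for a rational q < r + t, exhibits a decomposition q = x + y with x ∈ (←Q r) and y ∈ (←Q t).

```python
from fractions import Fraction as F

def cut(r):
    """Membership test for the initial segment (←Q r); r is assumed rational."""
    return lambda q: q < r

def in_sum(q, r, t):
    """Decide whether q lies in (←Q r) + (←Q t) by exhibiting a witness q = x + y."""
    gap = r + t - q              # plays the role of ε in the proof of Claim #2
    if gap <= 0:
        return False             # q >= r + t, so no decomposition exists
    x = r - gap / 2              # x < r
    y = q - x                    # y = t - gap/2 < t
    return cut(r)(x) and cut(t)(y)

r, t = F(-5), F(7)
samples = [F(n, 10) for n in range(-50, 51)]
# On every sample, membership in the sum agrees with membership in (←Q (r + t)).
agrees = all(in_sum(q, r, t) == cut(r + t)(q) for q in samples)
```

A finite sample cannot replace the proof; it only shows how the witness x = r − ε/2, y = t − ε/2 is produced.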

c) Multiplication on D: In the case where both r and t are greater than zero
we define multiplication as:

(←Q r) (←Q t) = {xy : x ∈ (←Q r) and y ∈ (←Q t), x, y > 0} ∪ [(−∞, 0] ∩ Q]

The more general definition of multiplication on D, which includes products
of negative numbers, is a bit more complicated. In general:

(←Q r) (←Q t) = (←Q 0)               for the case where r or t is 0.
(←Q r) (←Q t) = (←Q |r|) (←Q |t|)    for the case where r and t are both positive or both negative.
(←Q r) (←Q t) = −(←Q |r|) (←Q |t|)   for the case where precisely one of r or t is negative.

Showing that
(←Q r) (←Q t) = (←Q rt)
is left as an exercise.

Note that multiplication of the elements of D rigorously respects the
multiplication of their leaders.
Part V: From sets to numbers 165

For example:

(←Q 5) (←Q 7) = (←Q 5 × 7) = (←Q 35)
(←Q √2) (←Q −√2) = −(←Q √2 × √2) = −(←Q 2)
(←Q −4) (←Q 0) = (←Q 0)
(←Q −2) (←Q −10) = (←Q 2 × 10) = (←Q 20)
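
The three cases of the general definition can be traced on rational leaders. The Python sketch below is an illustration only: the function name `product_leader` is hypothetical, the leaders are assumed rational, and −(←Q s) is read here as the cut with leader −s, which is an assumption about the notation.

```python
from fractions import Fraction as F

def product_leader(r, t):
    """Leader of (←Q r)(←Q t), following the three cases of the general definition."""
    if r == 0 or t == 0:
        return F(0)                  # the product is (←Q 0)
    if (r > 0) == (t > 0):
        return abs(r) * abs(t)       # both positive or both negative
    return -(abs(r) * abs(t))        # precisely one factor is negative

pairs = [(F(5), F(7)), (F(-4), F(0)), (F(-2), F(-10)), (F(2), F(-6))]
leaders = [product_leader(r, t) for r, t in pairs]
# In each case the leader produced by the case analysis equals the ordinary product.
matches = all(product_leader(r, t) == r * t for r, t in pairs)
```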

Basic addition and multiplication properties for Dedekind cuts.

The definitions of addition and multiplication are such that the following
fundamental properties are all satisfied:
(←Q 0) + (←Q a) = (←Q 0 + a) = (←Q a)
(←Q 0)(←Q a) = (←Q 0 × a) = (←Q 0)
(←Q −a) + (←Q a) = (←Q −a + a) = (←Q 0)
(←Q 1) (←Q b) = (←Q 1 × b) = (←Q b)
(←Q 1/a) (←Q a) = (←Q 1/a × a) = (←Q 1)
We see that:
– (←Q 0) plays the role of the additive identity in D
– (←Q 0) plays the role of the multiplicative zero element in D
– every element in D has an additive inverse
– (←Q 1) plays the role of the multiplicative identity in D
– every non-zero element of D has a multiplicative inverse

17.3 Defining a one-to-one onto function from R onto D.

We define the function f : R → D as

f(r) = (←Q r)

The function f is a natural one-to-one mapping which copies R into P(Q).

This function, f, respects the linear ordering, addition, multiplication in R:

r<t ⇔ f(r) = (←Q r) ⊂ (←Q t) = f(t)

f(r + t) = (←Q r + t) = (←Q r) + (←Q t) = f(r) + f(t)
f(r × t) = (←Q r × t) = (←Q r) × (←Q t) = f(r) × f(t)

17.4 Defining the real numbers R in ZFC as Dedekind cuts.

We now have the background and the ingredients needed to provide a set-
theoretic definition of the real numbers.

Definition 17.2 We define the real numbers as being the set of all Dedekind
cuts D, linearly ordered by inclusion with addition + and multiplication ×
(as described above). Those Dedekind cuts which have no least upper bound
in Q are called irrational numbers.

17.5 Completeness property of the real numbers R.

We will now verify that the set of all Dedekind cuts satisfies the completeness
property (or as it is often called, the least upper bound property). We remind
ourselves what this property states:
“Every non-empty bounded subset of R has a least upper bound which is a
real number.”3

The completeness property is one which distinguishes R from other infinite
linearly ordered sets. If the set D does not satisfy this property, then D is
disqualified from being called the “real numbers”. We will show that the set of
all Dedekind cuts, linearly ordered as described above, passes the test for
completeness. We first prove a lemma.

Lemma 17.3 The union of a non-empty set of Dedekind cuts is either itself
a Dedekind cut or is the set Q.

Proof:
What we are given: That U is a non-empty set of Dedekind cuts.
What we are required to show: That ∪{V : V ∈ U} is Q or is of the form
(←Q r) for some r ∈ R.
Case 1: Suppose U = {(←Q t) : t ∈ R}. Every Dedekind cut can be expressed
as (←Q t) for some real number t. Then Q = ⋃_{t∈R} (←Q t) contains all
Dedekind cuts.
Case 2: Suppose that ∪{V : V ∈ U} ≠ Q. Then there is a proper non-empty
subset M ⊂ R such that U = {(←Q t) : t ∈ M} and ⋃_{t∈M} (←Q t) ≠ Q. It
suffices to show that ⋃_{t∈M} (←Q t) is a Dedekind cut.
Since ⋃_{t∈M} (←Q t) ≠ Q, there exists some u ∈ Q such that u ∉ (←Q t)
for all t ∈ M. Then t ≤ u for all t ∈ M. This means that u is an upper
bound of M ⊂ R.
By the completeness principle for the real numbers, since M is bounded in
R, M has a least upper bound, say v ∈ R.
We claim that ⋃_{t∈M} (←Q t) = (←Q v).
− We first show that (←Q v) ⊆ ⋃_{t∈M} (←Q t):
Let z ∈ (←Q v). Then there exists t ∈ M such that z < t ≤ v (for if t ≤ z
for all t ∈ M, then z is an upper bound of M, a contradiction of the
definition of v). So z ∈ (←Q t) ∈ U. Then (←Q v) ⊆ ⋃_{t∈M} (←Q t) must hold
true.
− We now show that ⋃_{t∈M} (←Q t) ⊆ (←Q v):
Let u ∈ (←Q t) for some t ∈ M. Since t ∈ M, t ≤ v. Then u ∈ (←Q t) ⊆
(←Q v). So ⋃_{t∈M} (←Q t) ⊆ (←Q v) as claimed.
So ⋃_{t∈M} (←Q t) = (←Q v), a Dedekind cut as claimed.
So the union ⋃_{t∈M} (←Q t) of all elements of U = {(←Q t) : t ∈ M} is a Dedekind
cut.

3 Note that this property is often expressed in many different but equivalent forms. The
following properties are all equivalent to the Completeness property: (1) The limit of every
infinite decimal sequence is a real number, (2) Every bounded monotonic sequence is con-
vergent, (3) A sequence is convergent if and only if it is a Cauchy Sequence. Googling the
words “Completeness property” may direct the internet surfer to any one of these.

Theorem 17.4 Let D denote the set of all Dedekind cuts linearly ordered
by ⊂. Then if S is a non-empty bounded subset of D, S has a least upper
bound (with respect to the ordering ⊂).

Proof:

We are given that S is a non-empty bounded subset of D. Then S is of
the form
S = {(←Q t) : t ∈ M ⊂ R}
for some proper subset M of R. Since S is a bounded subset of D with
respect to ⊂, there exists k ∈ R such that (←Q t) ⊆ (←Q k) for all t ∈ M.
Then its union U = ⋃_{t∈M} (←Q t) cannot be all of Q. By the lemma, its union
U is a Dedekind cut, say, (←Q v).
We claim that U = (←Q v) is the least upper bound of S with respect to
⊂: Since ⋃_{t∈M} (←Q t) = (←Q v), then (←Q t) ⊆ (←Q v) for all t in M and so
the Dedekind cut (←Q v) is an upper bound of S. Suppose (←Q u) is another
upper bound of S. Then U = ⋃_{t∈M} (←Q t) ⊆ (←Q u). So (←Q v) ⊆ (←Q u). We
must conclude that (←Q v) is a least upper bound of S.
So D satisfies the completeness property.
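
For a finite family of cuts with rational leaders, the statement "the union of the cuts is the least upper bound" can be tested directly, since the least upper bound of a finite set M is its maximum. The Python sketch below makes exactly those simplifying assumptions (finite M, rational leaders); `cut` and `union_of_cuts` are hypothetical helper names.

```python
from fractions import Fraction as F

def cut(t):
    """Membership test for (←Q t); t is assumed rational."""
    return lambda q: q < t

def union_of_cuts(leaders):
    """Membership test for the union of the cuts (←Q t), t in leaders."""
    return lambda q: any(cut(t)(q) for t in leaders)

M = [F(1, 2), F(2, 3), F(-4), F(3, 5)]
v = max(M)      # for a finite family, the least upper bound of M is its maximum
samples = [F(n, 30) for n in range(-150, 31)]
# The union of the cuts and the single cut (←Q v) contain the same sample points.
same = all(union_of_cuts(M)(q) == cut(v)(q) for q in samples)
```

The interesting case of the theorem is an infinite M whose least upper bound is not in M, which a finite check cannot reach.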

From this, we conclude that the set of all Dedekind cuts, D, represents
the set of all real numbers in the ZFC-set-theoretic universe. Thus, from
the primitive concepts “class”, “set” and “belongs to” and Axioms 1 to 8
we have successfully defined the sets of natural numbers, integers, rational
numbers and real numbers. The elements of these sets are themselves sets. While
the existence of the natural numbers N is almost an immediate consequence
of the Axiom of infinity, the other axioms provided the necessary tools to
construct from N the integers, rationals and real numbers (as sets).
Remark: Having defined the real numbers as Dedekind cuts, we see that every
real number u in ZFC is viewed as a subset of Q and so u ∈ P(Q). Since
Q ⊆ P^6(N) (see page 155), u ∈ P(P^6(N)) = P^7(N) and so, R ⊆ P^7(N).
We can now see why the Axiom of power set plays an essential role in the
ZFC-universe. Had we not declared that “P(S) is a set whenever S is a set”,
what guarantee would we have that the real numbers exist in our universe?
We have not yet invoked the following axioms: Axiom 7, called the Axiom
of replacement, Axiom A9, called the Axiom of regularity, and the Axiom
of choice. These three axioms will help us handle certain difficulties encoun-
tered, while dealing with infinite sets, the main subject of our investigation
for the rest of this book.

Concepts review:
1. What is an initial segment? What is its leader?
2. How is addition of initial segments defined?
3. If r and t are positive real numbers, how is multiplication of (←Q r)
and (←Q t) defined?
4. Provide a definition of Dedekind cuts.
5. Define a function f which maps R one-to-one onto the set D of all
Dedekind cuts.
6. Give a set-theoretic definition of the real numbers.
7. What can we say about the union of a family of Dedekind cuts?
8. What does the Completeness property of the reals (equivalently,
Least upper bound principle of the real numbers) state?
9. Does the set of all Dedekind cuts satisfy the Completeness property?
10. How are the elements of all Dedekind cuts ordered?
11. How does a Dedekind cut representing a rational number differ from
one representing an irrational number?

EXERCISES

A. 1. Perform the following operations on the given Dedekind cuts:

a) (←Q −3) × (←Q 1/3)
b) (←Q 1/3) + (←Q 5)
c) (←Q 5) × (←Q 0)
d) (←Q −1) + (←Q 1)
2. Show why (←Q −1) + (←Q 5) < (←Q 1/3) + (←Q 10).
3. What is the least upper bound of ⋃_{t<0} (←Q t)?
4. Find a Dedekind cut strictly in between the two cuts (←Q 1/2) and (←Q 2/3).
5. Is there a Dedekind cut that contains all Dedekind cuts? If so, what is it?
If not, why not?
6. How does the set-theoretic definition of R guarantee that it is a set?
7. What link can we make between the set-theoretic definition of R and the
natural numbers N?

B. 8. Show that any set of the form (←Q r), where r is a real number, satisfies the
three conditions in the formal definition of a Dedekind cut.

C. 9. Let f : R → P(Q) be a function defined as follows: f(x) is the least upper

bound of {y ∈ Q : y < x}. What is the image of R under the function f?
Part VI

Infinite sets
Part VI: Infinite sets 173

18 / Infinite sets versus finite sets

Abstract. In this section we give a definition of “infinite set” as put
forward by Richard Dedekind. We then establish a few of the most basic
properties of infinite sets and compare them to those of finite sets. These
properties allow us to characterize finite sets as those sets which are in
one-to-one correspondence with some natural number, n. We also show
that for any finite set S, P(S) is finite. Finally, we prove a version of the
“Recursively defined function theorem”.

18.1 Infinite sets.

The notions of “finite set” and “infinite set” are often viewed as being op-
posites of each other, in the sense that if we define one of these, then the
other is its negation. We all have an intuitive idea of what a finite set is and
how it differs from an infinite set. Most would agree with the statement “A
finite set is a set whose elements you can count so as to determine how large
or how small it is”. We would of course first have to explain what it means
to “count” and what it means to determine “how many elements there are
in a set”. Our definition of “finite” would have to be such that we can say
“every natural number is finite”.
The concept of “infinite set” is abstract, purely an idealization of something
we find useful when discussing certain topics, even though it is impossible to
perceive. Yet, for anyone who studies or uses mathematics in any field, such
as engineering, physics, or social sciences, doing mathematics without refer-
ring to “infinity” or “infinite sets” would feel like trying to walk with both
shoelaces tied together. When the word “infinite” is used in a conversation,
only a mathematician might possibly ask “What do you mean when you
say the word infinite?”. Most individuals might embarrassingly respond “I
don’t really know, but anybody would recognize a set which is infinite if they
saw one.” Appropriately defining “infinite sets” is important, not because
it would be an interesting mathematical exercise to do so, but because if
we want to make sense of the mathematics we do today, we have no choice.
Experience shows that doing mathematics without clearly defining the con-
cepts we are referring to can lead to contradictions or erroneous results. In
this text, we have informally used the words “finite” and “infinite” before,
but never in a mathematical statement proven to be true or false. If we want
to refer to infinite and finite sets in theorem statements or definitions, these
must be precisely defined. We choose to provide a definition of “infinite set”
and then define “finite set” as being one which is not infinite.
174 Section 18: Infinite sets versus finite sets

18.2 Dedekind’s definition of infinite set.

We start with the definition of an infinite set as proposed by the mathematician
Richard Dedekind.

Definition 18.1 A set S is said to be an infinite set if there exists a one-to-one
function mapping S onto a proper subset of itself. If a set S is not infinite,
then we say that it is a finite set.1

Surprised? Well, it is better than saying that an infinite set is “a set with
lots of things in it” or “is a set which has more things in it than we can
count”. (A set containing one hundred billion atoms has a lot of things in
it and yet no one would perceive it as being an infinite set.) Dedekind’s
definition is succinct and without ambiguities since it is only expressed using
words that have been previously defined. An infinite set is a set which
properly contains a one-to-one image of itself. If we were to define finite sets
as being those sets that are not infinite, then we could say that “a set S
is finite if and only if S contains no proper subset T which is a one-to-one
image of S”. For example, since the natural number 6 = {0, 1, 2, 3, 4, 5}
cannot be in one-to-one correspondence with any of its proper subsets, it cannot
be infinite. When a function f : A → B maps a set A one-to-one into a set
B, we often say that f embeds A inside B in the sense that B contains a
“copy” of A. Using this vocabulary we can say that “S is infinite if and only
if it can be embedded into a proper subset of itself”.

18.3 Example: N is an infinite set.

The set, N, of all natural numbers was defined as being the smallest inductive
set. We test our definition of “infinite set” on N. To show that N is infinite
it suffices to produce a function f : N → N which embeds N into a proper subset
of itself. Consider the function f : N → N defined as f(n) = 2n. We see that
f is one-to-one and that the image, f[N] = {0, 2, 4, 6, . . .}, of N under f is
a proper subset of N. We have proven that:

“The set of all natural numbers N is an infinite set.”

One could also declare R to be an infinite set since (−π/2, π/2) ⊂ R and
(−π/2, π/2) is a one-to-one image of R under the function arctan : R → R.
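
The embedding f(n) = 2n can be inspected on an initial window of N. The Python sketch below is only an illustration on the finite window {0, 1, . . . , 999}, which is an arbitrary choice; it checks that f takes distinct values there and lists some natural numbers that f never reaches.

```python
def f(n):
    """The embedding of N into the even natural numbers."""
    return 2 * n

window = range(1000)
image = {f(n) for n in window}

# f is one-to-one on the window: no two inputs share a value.
is_injective_on_window = len(image) == len(window)

# Natural numbers below 10 that are not of the form 2n.
missed = [m for m in range(10) if m not in image]
```

The odd numbers collected in `missed` are what makes f[N] a proper subset of N.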

1 Actually records show that Bolzano suggested in 1847 (before Dedekind) that an infinite

set is a set that can be mapped one-to-one onto a proper subset of itself, a property which
cannot be satisfied by finite sets.

18.4 Properties of infinite sets and finite sets.

The following theorems confirm that infinite and finite sets satisfy the prop-
erties we expect from them. Ultimately, we want to show that the finite
sets are precisely those sets which are in one-to-one correspondence with
some natural number. Surprisingly, this does not follow immediately from
our definitions of infinite and finite sets.

Theorem 18.2 Basic properties of infinite and finite sets.

a) The empty set is a finite set.
b) Any singleton set is a finite set.
c) A set which has a subset which is infinite must itself be infinite.
d) A subset of a finite set must be finite.

Proof:
a) The empty set, ∅, has no proper subsets and so a function f cannot map
∅ into a proper subset of ∅. So ∅ is finite.

b) The singleton set, {x}, contains only one element x, so the only proper
subset of {x} is ∅. For any function f : {x} → {x}, the image of {x} under
f is {f(x)}, which is non-empty. So ∅ cannot be the one-to-one image of {x}
under f, and singleton sets are finite.

c) What we are given: That X is an infinite subset of a set S.

What we are required to show: That S is infinite.
If X = S, then S is infinite and we are done. Suppose X ≠ S. Since X
is infinite, there exists a one-to-one function f which maps X onto
f[X] ⊂ X. Since f[X] ⊂ X ⊂ S, the set S is the pairwise disjoint union
of the three sets S − X, X − f[X] and f[X]. Define the map g : S →
(S − X) ∪ f[X] as follows:

g(x) = x      if x ∈ S − X
g(x) = f(x)   if x ∈ X

Then g maps S one-to-one onto (S − X) ∪ f[X], which is a proper subset of S
since X − f[X] is non-empty. So, by definition, S is an infinite set.

d) Suppose F is a finite set and X ⊆ F . We are required to show that X is

finite. Suppose X is an infinite set. Then by the statement in part c) F
must be infinite, contradicting our hypothesis. So X must be finite.

Theorem 18.3 Let f : X → Y be a one-to-one function mapping X onto Y .

The set X is infinite if and only if the set Y is infinite.
Proof:
(⇒) What we are given: That X is an infinite set and that f : X → Y is a
one-to-one function mapping X onto a set Y.
What we are required to show: That Y is infinite.
Since X is infinite, by definition, there exists a function, g : X → g[X],
mapping X one-to-one onto a proper subset g[X] of X. Then f|g[X] : g[X] →
Y maps the proper subset g[X] of X one-to-one into Y. Since f maps X
one-to-one onto Y, it has an inverse f⁻¹ mapping Y one-to-one and onto
X. Then the function

(f|g[X]) ∘ g ∘ f⁻¹ : Y → Y

maps Y one-to-one onto f|g[X][g[X]] = f[g[X]], a proper subset of f[X] = Y.
So, by definition, Y is an infinite set.
(⇐) Suppose Y is infinite. Then, since f⁻¹ : Y → X is a one-to-one map
from Y onto X, by the first part of this proof, X is infinite.

Corollary 18.4 The one-to-one image of a finite set is finite.

Proof: The proof is left as an exercise.

Lemma 18.5 If S is an infinite set and a ∈ S, then S − {a} is an infinite set.

Proof:
What we are given: That S is an infinite set and a ∈ S.
What we are required to show: That S − {a} is an infinite set.
Since S is infinite, there exists a one-to-one function g : S → S such
that g[S] ⊂ S. We will show that S − {a} is infinite by exhibiting a one-to-one
function h on S − {a} such that h[S − {a}] is a proper subset of S − {a}.
Choose an arbitrary element k ∈ S − g[S].
− Case A: Suppose a ∈ g[S]. Then there is some u ∈ S such that
g(u) = a.
· Subcase A-1: Suppose u ≠ a. Define a function h on S − {a} as follows:

h(x) = g(x)   if x ∈ S − {a, u}
h(x) = k      if x = u

Since g is one-to-one on S − {a, u}, then so is h. Furthermore, h uniquely
maps u to k ∈ S − g[S]. So h is one-to-one on S − {a}. See that neither of
the elements g(a) and a belongs to h[S − {a}], and that g(a) ≠ a since
g(u) = a with u ≠ a. So h[S − {a}] is a proper subset of S − {a}.
· Subcase A-2: Suppose u = a. Choose an element v ≠ a in g[S]. Define a
function h on S − {a} as follows:

h(x) = g(x)   if x ∈ S − {a, v}
h(x) = k      if x = v

Since g is one-to-one on S − {a, v}, then so is h. Furthermore, h uniquely
maps v to k in S − g[S]. So h is one-to-one on S − {a}. Also see that
neither g(v) nor a belongs to h[S − {a}]. So h[S − {a}] is a proper subset
of S − {a}.
− Case B: Suppose a ∈ S − g[S]. Define a function h on S − {a} as follows:

h(x) = g|S−{a}(x) for all x ∈ S − {a}

Since no other element in S − {a} is mapped to g(a) by g, neither g(a) nor
a belongs to h[S − {a}]. So h[S − {a}] is a proper subset of S − {a}.
We conclude that S − {a} is infinite.
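
To see the construction of Subcase A-1 at work, one can take a concrete instance. The Python sketch below assumes S = N, g(n) = n + 1 and a = 3; these choices, the window size 200 and all variable names are illustrative assumptions, not part of the lemma. It builds h on S − {a} and checks, on a finite window, that h is one-to-one and that its values avoid both a and g(a).

```python
def g(n):
    """A one-to-one map of N onto the proper subset N - {0}."""
    return n + 1

a = 3          # the element removed from S = N
k = 0          # an element of S - g[S]
u = a - 1      # the unique u with g(u) = a; here u != a, so Subcase A-1 applies

def h(x):
    """The map h on S - {a}: send u to k and every other x to g(x)."""
    if x == a:
        raise ValueError("h is only defined on S - {a}")
    return k if x == u else g(x)

window = [x for x in range(200) if x != a]
values = [h(x) for x in window]

injective_on_window = len(set(values)) == len(values)
avoids_a = a not in values          # h maps into S - {a}
misses_g_of_a = g(a) not in values  # g(a) lies in S - {a} but is never reached
```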

Theorem 18.6 Every natural number n is a finite set.

Proof:
The proof is by induction. Let P(n) be the property “The natural number
n is finite”. Since 0 = ∅ is finite, P(0) holds true. Suppose
the natural number n = {0, 1, 2, 3, . . . , n − 1} is finite. We claim that
n + 1 = n⁺ = {0, 1, 2, 3, . . . , n} must be finite. Suppose not. That is, suppose
n⁺ is infinite. By Lemma 18.5, (n + 1) − {n} = n must also be infinite,
contradicting the fact that P(n) holds true. So P(n + 1) must hold true.
By the principle of mathematical induction, P(n) holds true for all natural
numbers n. Thus, every natural number is a finite set.

In the proof of the following corollary, we require the Axiom of choice to
justify a particular step. This is the first time we invoke this axiom. We will
discuss the Axiom of choice at length later on. For now we will simply state
it and point out the step where it is invoked:
Axiom of choice: For every set 𝒜 of non-empty sets there is a
function f which associates to every set A in 𝒜 an element a ∈ A.
At first, it seems harmless enough. It says that if we have a set of
non-empty sets, then we can choose from each set one element. If the set of
sets has only finitely many sets, then the Axiom of choice is not required.
The sticky point is the one encountered when the set contains infinitely
many sets. If the proof of a theorem invokes the Axiom of choice,
it is common practice to alert the reader to this fact by posting the acronym
[AC].2

Corollary 18.7 [AC] A set S is finite if and only if S is empty, or it is in

one-to-one correspondence with some natural number n.

Proof:
(⇐) Suppose S is empty or is the one-to-one image of a natural number n.
Since ∅ is finite and every natural number n is finite, then S must be finite
(by Corollary 18.4 and Theorem 18.6).
(⇒) Conversely, suppose S is a non-empty finite set.
We are required to show that there exists a natural number n which can be
mapped one-to-one onto S. Suppose not. That is, suppose there does not
exist a natural number n which maps one-to-one onto S.
Claim: That S must then be infinite, contradicting our hypothesis.
Proof of claim: We prove the claim by constructing a one-to-one function
f : N → S which maps N into S.
Choose an element s_0 in S to form the subset S_1 = {s_0} of S. Define the
function f : {0} → {s_0} as f(0) = s_0. Then S − S_1 is non-empty, for if it
were empty, then S = S_1 would be the one-to-one image of {0} under the
function f, contradicting the fact that S is not the one-to-one image of a
natural number. So we can choose an element s_1 from S − S_1 to construct
the subset S_2 = {s_0, s_1}. Define the one-to-one function f : {0, 1} → {s_0, s_1}
as f(i) = s_i for i = 0, 1.
Suppose we have inductively constructed the subset S_n = {s_0, s_1, s_2, . . . , s_{n−1}}
of S where f : {0, 1, . . . , n − 1} → S_n is the one-to-one function defined as
f(i) = s_i. Then to avoid a contradiction, S − S_n must be non-empty. The
Axiom of choice provides us with the choice function k : P(S) → S which
allows us to choose from each set S − S_n an element s_n from which we define
the one-to-one function f : {0, 1, . . . , n} → S_{n+1} defined as

f(i) = s_i               if i < n
f(n) = k(S − S_n) = s_n

We can, in this way, “inductively” construct a one-to-one function f :
N → S mapping N into S. Then S contains a one-to-one image f[N] =
{s_0, s_1, s_2, s_3, . . .} of N, as claimed.
By part (c) of Theorem 18.2, S must be infinite. This contradicts the part
of our hypothesis in which S was declared to be finite. The source of this
contradiction is the statement “there does not exist a natural number n
which maps one-to-one onto S.” So any non-empty finite set is the one-to-one
image of some natural number.

2 The reader may see the posting of this acronym as a challenge by the author asking:
“Is it possible to prove this statement without invoking the Axiom of choice?”.

A few words on the “inductively” constructed function in the above proof.

In the proof of the corollary, we have constructed a one-to-one function
f : N → S by defining f(i) for one number i at a time. For each n, the
value of f(n) depends on the values of f(i) for each i < n. This is because
the choice function k assigns to the set S − {s0 , s1 , . . . , sn−1 } an element sn .
The value of f(n) is then set to be equal to sn .
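
The rule f(n) = k(S − {s_0, . . . , s_{n−1}}) can be run as a loop when S is finite. The Python sketch below is an illustration only: the choice function `choice`, which picks the least remaining element, is a hypothetical stand-in for k, and for a finite S no appeal to the Axiom of choice is needed. The loop stops as soon as S − S_n is empty, at which point f is a one-to-one map from the natural number n onto S.

```python
def choice(subset):
    """A hypothetical choice function k: pick the least element of a non-empty set."""
    return min(subset)

def enumerate_set(S):
    """Build f(0), f(1), ... with f(n) = k(S - {f(0), ..., f(n-1)}) until S is used up."""
    f = {}
    chosen = set()                      # the set S_n of values already assigned
    n = 0
    while S - chosen:                   # S - S_n is non-empty
        f[n] = choice(S - chosen)       # f(n) depends on all earlier values
        chosen.add(f[n])
        n += 1
    return f

S = {"c", "a", "d", "b"}
f = enumerate_set(S)
```

Each value f(n) is computed from the values already assigned, which is the dependence the recursive function theorem has to justify when the loop never stops.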
We conveniently spoke of an “inductively constructed function f ” as if it
was clear that the induction process will automatically produce well-defined
functions. Even though it seems like a reasonably safe method for construct-
ing functions, we should not be blind to an element of uncertainty involved in
this process. It does not immediately follow from the definition of a function
that this method for constructing functions will always produce a function.
If one asserts that this is obvious, then why not produce a proof that shows
us how “obvious” it really is? We will immediately state the theorem which
guarantees that functions constructed in this way are valid but defer its
proof to the end of this section to avoid digressing from our discussion of
finite and infinite sets.

Theorem 18.8 The recursive function theorem. Let S be a set. Let k :
P(S) → S be a function on P(S) and f ⊆ N × S be a relation. We write
f(n) = a if and only if (n, a) ∈ f. Let m ∈ S. Suppose the relation, f, satisfies
the two properties

f(0) = m ⇒ (0, m) = (0, f(0)) ∈ f
(n, f(n)) ∈ f ⇒ (n + 1, k(S − {f(0), f(1), . . . , f(n)})) = (n + 1, f(n + 1)) ∈ f

Then f is a well-defined function on N.

Proof: The proof appears at the end of this section.

We have shown that “counting” the elements in a finite set S comes down
to determining which natural number n is mapped one-to-one onto S.
We are essentially assigning to each of the n elements of S the labels
0, 1, 2, 3, . . ., n − 1. The corollary above shows that we could have defined
finite sets as follows:
Definition: A set S is a finite set if and only if it can be mapped
one-to-one onto some natural number n. If we say that

“the finite set S contains n elements”

we mean that S is the one-to-one image of the natural number
n. So “S is a finite set” and “S contains n elements for some n”
are equivalent expressions. A set is an infinite set if it is not the
one-to-one image of some natural number.

All the definitions and theorems stated and proved above would logically
follow from this definition of finite sets. The following theorem provides an-
other characterization of infinite sets.

Theorem 18.9 [AC] A set S is an infinite set if and only if it contains a

one-to-one image of the set of natural numbers N.
Proof:
(⇐ ) If S contains a subset U which is a one-to-one image of N, then U is
infinite (by Theorem 18.3) and so S is infinite (by Theorem 18.2).
(⇒) The proof is left as an exercise. (The proof mimics the proof of Corol-
lary 18.7.)

Theorem 18.10 [AC] If the set S is a finite set and f : S → X is a function,

then f[S] is finite3 .
Proof:
Suppose the set S is a finite set and f : S → X is a function mapping S
into some set X. We are required to show that f[S] is a finite set. Suppose
f[S] is an infinite set. Then there exists a one-to-one function g : N → f[S]
mapping N into f[S] (Theorem 18.9). Say

g[N] = {g(0), g(1), g(2), g(3), . . .} ⊆ f[S]

Then for each i ∈ N, f←(g(i)) is a non-empty subset of S. For each i ∈ N we
can choose an element s_i from f←(g(i)) (the Axiom of choice allows us to
choose an infinite number of elements in this way). Since g is one-to-one, the
sets f←(g(i)) are pairwise disjoint, so S contains a subset
{s_0, s_1, s_2, . . .} of distinct elements. Let h : N → S be the function defined
as h(i) = s_i. This set is infinite since it is a one-to-one image of N under the
function h. Since S contains an infinite subset, it must be infinite (by
Theorem 18.2). A contradiction! So f[S] is a finite set.
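
The key step of the proof, picking one element from each preimage, can be illustrated on a small finite example. In the Python sketch below, the set S, the map f(x) = x² and the helper name `section` are all illustrative assumptions. The sketch picks one preimage for every value in f[S] and checks that the picked elements are distinct, so that f[S] is in one-to-one correspondence with a subset of S.

```python
def section(f, S):
    """For each value y in f[S], pick one element s of S with f(s) = y."""
    picked = {}
    for s in sorted(S):
        picked.setdefault(f(s), s)      # keep the first (least) preimage found
    return picked

S = set(range(-5, 6))
f = lambda x: x * x                     # not one-to-one on S
h = section(f, S)                       # maps each value y to its chosen preimage

image = {f(s) for s in S}
one_to_one = len(set(h.values())) == len(h)
```

Here the image has 6 elements while S has 11, and the chosen preimages form a 6-element subset of S.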

3 Note that f need not be a one-to-one function for this to hold true.

Theorem 18.11 If a set S contains n elements, then P(S) contains 2^n
elements4. Hence, if a set S is a finite set, then the set P(S) is finite.
Proof:
The proof is by induction.
For a natural number n, let S_n denote a subset of S which contains
n elements. That is, S_0 = ∅, S_1 = {s_0}, S_2 = {s_0, s_1}, . . . , S_n =
{s_0, s_1, . . . , s_{n−1}}. For each natural number n let P(n) denote the
statement
“The power set P(S_n) contains 2^n elements”
Base case: If n = 0, then S_0 = ∅ and so P(S_0) = {∅} contains 1 = 2^0
element so P(0) holds true.
Inductive hypothesis: Suppose P(n) holds true. That is, suppose that for
any set of n elements, the set P(S_n) contains 2^n elements. (For example,
if S_3 = {s_0, s_1, s_2} then P(S_3) = {U_0, U_1, U_2, . . . , U_{2^3−1}}.) Let S_{n+1} =
{s_0, s_1, s_2, . . . , s_n} be a set containing n + 1 distinct elements. Then if S_n =
{s_0, s_1, s_2, . . . , s_{n−1}}, by the inductive hypothesis, we can express P(S_n) as

P(S_n) = {U_0, U_1, U_2, . . . , U_{2^n−1}}

where the U_i’s represent all distinct subsets of S_n; we will suppose that
U_0 = ∅ and U_{2^n−1} = S_n. Since the elements of the set S_{n+1} are distinct, then
s_n ∉ S_n, and so {s_n} ∉ P(S_n). Then P(S_n) ⊂ P(S_{n+1}). For each i = 0
to 2^n − 1 define V_i = U_i ∪ {s_n}. We see that P(S_n) ∩ {V_0, V_1, . . . , V_{2^n−1}} =
∅ since every V_i contains the element s_n. Furthermore, every element in
P(S_{n+1}) is accounted for in P(S_n) ∪ {V_0, V_1, . . . , V_{2^n−1}}. Then

P(S_{n+1}) = {U_0, U_1, U_2, . . . , U_{2^n−1}, V_0, V_1, V_2, . . . , V_{2^n−1}}

We see that P(S_{n+1}) contains 2^n × 2 = 2^{n+1} elements. So P(n + 1) holds
true.
By the principle of mathematical induction, P(n) holds true for all natural
numbers n. We conclude that for any set S which contains n elements, the
set P(S) contains 2^n elements. Thus, if S is finite, then so is P(S).
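
The inductive step translates directly into a loop: each new element s doubles the list of subsets by adding, next to every subset U already found, the subset U ∪ {s}. The Python sketch below is a small illustration of that doubling; the function name `power_set` is a hypothetical helper.

```python
def power_set(elements):
    """Build the power set step by step: the old subsets U_i, then the new V_i = U_i ∪ {s}."""
    subsets = [frozenset()]                              # P(∅) = {∅}
    for s in elements:
        subsets = subsets + [u | {s} for u in subsets]   # size doubles at each step
    return subsets

# Number of subsets of an n-element set, for n = 0, 1, ..., 5.
sizes = [len(power_set(range(n))) for n in range(6)]
```

The list `sizes` can be compared with the powers of 2, and converting `power_set("abc")` to a set confirms that the subsets produced are pairwise distinct.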

4 If n = 0 we define 2^0 = 1, 2^1 = 2^0 × 2. If n is a natural number other than 0 we define
2^n = 2 × 2 × · · · × 2 (n times).

18.5 Proof of the recursive function theorem.5

A recursively defined function f : N → S (on N) is a function which is
defined one term at a time. The process begins by defining f(0) at 0. Then, for
each n ≥ 1, the value of f(n) is determined based on the values previously
assigned to each of f(0), f(1), f(2), . . ., f(n − 1). The theorem explicitly
states the conditions under which this method of defining a function is valid.
The recursive function theorem: Let S be a set. Let k : P(S) → S be a
function mapping subsets of S to elements of S and f ⊆ N × S be a relation.
We write “f(n) = a” if and only if (n, a) ∈ f. Let m ∈ S. Suppose the
relation f satisfies the two properties

f(0) = m ⇒ (0, m) = (0, f(0)) ∈ f
(n, f(n)) ∈ f ⇒ (n + 1, k(S − {f(0), f(1), . . . , f(n)})) = (n + 1, f(n + 1)) ∈ f

Then f is a well-defined function on N.

P roof:
Let S be a class of relations R in N × S which contain (0, f(0)) = (0, m) and
satisfy the condition:

{(n, f(i)) : i ≤ n} ⊆ R ⇒ (n + 1, k(S − {f(0), f(1), . . . , f(n)})

= (n + 1, f(n + 1)) ∈ R

Now S is non-empty since it contains N × S. Let f ∗ = R∈S R. This means

T
that f ∗ is the smallest set of ordered pairs satisfying the conditions described
for S . The relation f ∗ looks something like

f ∗ = {(0, f(0)), (1, k(S − {f(0)})), (2, k(S − {f(0), f(1)})), · · · }

We claim that f ∗ is a function mapping N into S.

Proof of claim: We first establish that dom f ∗ = N. We first note that 0 ∈
dom f ∗ . If n ∈ dom f ∗ , then (n, f(n)) ∈ f ∗ . This implies

(n + 1, k(S − {f(0), f(1), . . . , f(n)})) = (n + 1, f(n + 1)) ∈ f ∗

and so n + 1 ∈ dom f ∗ . Hence, by induction, the domain of f ∗ is all of N. We

now proceed to the proof of the claim.
The proof of the claim is by the second version of mathematical induction.
Let P (n) denote the statement “[(n, x) ∈ f ∗ and (n, y) ∈ f ∗ ] ⇒ [x = y]”.
− Inductive hypothesis: Suppose P (m) holds true for all natural numbers
m < n. That is, (m, x) ∈ f ∗ and (m, y) ∈ f ∗ implies x = y. We will show
that given our hypothesis, P (n) must hold true.
5 The proof of this theorem is a bit “steep” to be presented this early in the book. Should

the reader decide to skip there will be no loss of continuity in the subject matter.
Part VI: Infinite sets 183
∗
Suppose not. Suppose (n, x) ∈ r and (n, y) ∈ f ∗ where x 6= y. Let
U = f ∗ − {(n, y)} (the set f ∗ take away the element (n, y)). Then we
easily see that U is one of the relations in S . The fact that U is strictly
smaller than f ∗ , previously declared to be the smallest of the relations
in S , is a contradiction. Then x must be equal to y. We conclude that
P (n) holds true as required.
By mathematical induction, P (n) holds true for all n. So f ∗ is a well-defined
function as claimed.
We will now show that f ∗ is unique. Let g be another function satisfying the
conditions (0, m) = (0, g(0)) ∈ g and

{(n, g(i)) : i ≤ n} ⊆ f ∗ ⇒ (n+1, k(S−{g(0), g(1), . . . , g(n)}) = (n+1, g(n+1)) ∈ f ∗

We prove uniqueness by induction: Let P (n) denote the statement “f ∗ (n) = g(n)”.
− Inductive hypothesis: Suppose P (m) holds true for all m < n. That is,
f ∗ (m) = g(m) for all m < n.
Then by definition of f ∗ ,

(n, k(S − {f(0), f(1), . . . , f(n − 1)})) = (n, f(n))

= (n, k(S − {g(0), g(1), . . . , g(n − 1)}))
= (n, g(n))

Hence, P (n) holds true.

By mathematical induction f ∗ (n) = g(n) for all n, and so f ∗ is unique as
claimed.
We conclude that the function f = f ∗ is a function which is uniquely defined
by the given conditions.

Concepts review:
1. What is the definition of infinite set as put forward by Dedekind?
2. From Dedekind’s definition of infinite set, how can we show that N
is infinite?
3. How do we define a finite set?
4. Is the empty set a finite set?
5. Is a subset of a finite set finite?
6. If S is an infinite set and u ∈ S, must S − {u} be infinite?
7. If a set S has a subset which is infinite, is S necessarily infinite?
8. If a set S is infinite and f : S → Y is a one-to-one mapping onto a
set Y , what can we say about the set Y ?
9. What can we say about one-to-one images of finite sets?

10. If a set S is infinite and we remove two elements from this set is the
resulting set necessarily infinite?
11. We know that natural numbers are sets. Are all natural numbers
necessarily finite sets?
12. Statement: “Any finite set S is necessarily the one-to-one image of
some natural number.” Is this statement true or false?
13. Statement: “An infinite set is the one-to-one image of the natural
numbers.” Is this statement true or false?
14. Statement: “An infinite set necessarily contains a subset which is a
one-to-one image of the natural numbers.” Is this statement true or
false?
15. What can we say about the image (not necessarily one-to-one) of a
finite set?
16. Is saying “S is a finite set” equivalent to saying “S contains n ele-
ments” where n is a suitable natural number?
17. If S is a finite set, is it necessarily true that P(S) is a finite set?
18. If S contains six elements, how many elements does P(S) contain?

EXERCISES

A. 1. Prove that the one-to-one image of a finite set is finite.

2. Prove that:
a) Z is an infinite set.
b) Q is an infinite set.
c) R is an infinite set.

B. 3. Prove that if S is infinite, then S × S is infinite.

4. Prove that if S is finite, then S × S is finite.
5. Prove that if S and T are infinite, then S ∪ T is infinite.
6. Prove that the union of two finite sets is finite.
7. Prove that the union of a finite set of finite sets is finite. (You may use the
result in question 6 combined with a proof by induction on the number of
finite sets.)
8. Prove that if S ∪ T is infinite, then either S or T is infinite.
9. Prove that if the set A is infinite and B is any non-empty set, then the set
A × B must be infinite.

C. 10. Prove that if F is a finite subset of an infinite set S, then S − F is infinite.

11. Prove that if a set S is an infinite set, then it contains a one-to-one image
of the set of natural numbers N.

19 / Countable and uncountable sets

Abstract. In this section we define the words “equipotent sets”, “countable
sets” and “uncountable sets”. We show that Q and Z are countable since
we can produce a one-to-one correspondence between each of these and N.
We also show that N × Z and N × N are countable. Countable unions of
countable sets are shown to be countable. We also show that no such one-
to-one correspondence can exist between R and N, and so R is uncountable.

19.1 Can we compare infinite sets as we do finite sets?

“Finite sets are precisely those sets which are in one-to-one correspondence
with some natural number n” is one of the simplest and most intuitive
characterizations of finite sets. It essentially allows us to characterize sets
according to their size. For example, suppose the two sets Sn and Sm can
be mapped one-to-one and onto the natural numbers n and m, respectively.
We can declare Sm to be larger than Sn if and only if n ⊂ m. We can de-
clare them to be the same size if and only if n and m are the same natural
number. Note that declaring two sets S and T to be the same size does not
mean that they are equal. It simply states that they are both one-to-one
images of the same natural number.
Can the described method for comparing finite sets be used to compare in-
finite sets? One might ask why we would want to compare infinite sets in
this way and argue that “infinite sets need no comparing since, intuitively,
they are all as big as a set can be”. But our intuition is not always reli-
able, particularly when referring to infinite sets as they are defined in the
ZFC-universe of sets. Verifying whether the infinite sets N, Z, Q and R are
one-to-one images of each other is a question worth investigating.

19.2 Equipotent sets.

One of the set theory axioms states that two sets are equal provided they
contain the same elements. We also say that two finite sets A and B (not
necessarily equal) which are both one-to-one images of the same natural
number n are the same size, or contain the same number of elements. If two
sets A and B contain n elements, it of course follows that these two sets
are one-to-one images of each other. Two infinite sets S and T can also
be one-to-one images of each other. The set N and the set of even natural
numbers are an example. We introduce a term used to describe this relation
between two sets.

Definition 19.1 Two sets, A and B, are said to be equipotent sets if there
exists a one-to-one function, f : A → B, mapping A onto B. If A and B are
equipotent, we will say that “A is equipotent to B” or “A is equipotent with
B”.

So equipotent finite sets are precisely those finite sets which are equipotent
to the same natural number. We know that for finite sets A and B, “A is
equipotent to a proper subset of B”, and “A is smaller than B” are equiv-
alent statements; we cannot however say that an infinite set A which is
equipotent to a proper subset of a set B is necessarily “smaller” than B.
For example, the set N is easily seen to be equipotent to the set
{0, 4, 8, 12, . . .} (via the function f(n) = 4n), which is a proper subset of
both N and {0, 2, 4, 6, . . .}. Yet we instinctively hesitate to say that N is a
“smaller” set than {0, 2, 4, 6, . . .} or that {0, 2, 4, 6, . . .} is smaller than N.
The words “smaller than” seem to have a precise meaning only when
discussing finite sets.

19.3 Countable sets.

We will encounter many kinds of infinite sets. It will be helpful if we can
categorize infinite sets into subfamilies of equipotent sets. We will be par-
ticularly interested in those infinite sets which are equipotent with N. Such
sets are said to be countably infinite sets.

Definition 19.2 Countable sets are those sets that are either finite or equipo-
tent to N. Infinite countable sets are said to be countably infinite. Those infinite
sets which are not countable are called uncountable sets.

So the adjective “countable” means “to be a one-to-one image of some subset

of N”. Does it make sense to speak of a proper class of elements which is
“countable”? The Axiom of replacement (A7) guarantees that all “countable
classes” are sets. To see this, we recall the Axiom of replacement.

Axiom of replacement: Let A be a set. Let φ(x, y) be a formula

which associates to each element, x, of A an element y in such
a way that whenever both φ(x, y) and φ(x, z) hold true, y = z.
Then there exists a set B which contains all elements y such that
φ(x, y) holds true for some x ∈ A.1
1 This axiom is more often expressed as the Replacement axiom schema since it is in fact

many axioms each differing only by the formula φ it refers to. So to be more precise, given
a formula φ in set theory language, we would refer to it as Axiom A7(φ) rather than A7.

This axiom dictates that if A is a set and B is a non-empty class of ele-

ments and f ⊆ A × B is a relation which satisfies the property “(x, y) ∈ f
and (x, z) ∈ f, then y = z”, then there exists a set C which con-
tains the elements, f(x), for all x ∈ A. This axiom guarantees that if
A is a set and f[A] contains only elements, then the functional image,
f[A] = {x : f(a) = x for some a ∈ A}, is a set. So any class which is a
one-to-one image of a subset of N must be a set.
Our definition of countable sets does not guarantee that uncountable sets
exist. We simply stated that those sets which are not countable (if any ex-
ist) will be called “uncountable sets”. We will carefully study the sets we
have encountered to this point and determine which ones are countable and
which ones are not. Recall (from Theorem 18.2 (a)) that the empty set, ∅, is
finite and so, by definition, is countable. Of course, the set, N, of all natural
numbers is equipotent to itself and so is countable. Since no element of N
is infinite, every natural number is countable. We now investigate the set of
all integers.
Example: Show that the set Z of all integers is countable.
Solution: Define the function f : N → Z as follows:

f(n) = −(n + 1)/2 if n is odd
f(n) = n/2 if n is even

It is easily verified that f maps N one-to-one onto Z. For example,

f(0) = 0
f(1) = −1
f(2) = 1
f(3) = −2
f(4) = 2
. . .
So Z is an infinite countable set2 .
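
The correspondence in this example can be checked by machine on an initial segment of N. The following Python sketch is only illustrative: the names nat_to_int and int_to_nat are ours, and the inverse formula is one we derived from the definition of f, not one stated in the text.

```python
def nat_to_int(n: int) -> int:
    """The map f : N -> Z of the example."""
    if n < 0:
        raise ValueError("n must be a natural number")
    if n % 2 == 1:
        return -((n + 1) // 2)  # odd n: f(n) = -(n + 1)/2
    return n // 2               # even n: f(n) = n/2


def int_to_nat(z: int) -> int:
    """Assumed inverse of f: z >= 0 comes from an even n, z < 0 from an odd n."""
    return 2 * z if z >= 0 else -2 * z - 1


# First five values of f, in the order listed in the example.
print([nat_to_int(n) for n in range(5)])
```

Checking that the two functions undo each other on a large range of inputs is a finite sanity check, not a proof, that f is one-to-one and onto.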
There are quite a few general statements that we can state about countable
sets. We will find it very useful to know that subsets of countable sets and
images of countable sets are countable.

Theorem 19.3 A subset of a countable set is countable.

Proof:
2 The reader should note that this is just one of many ways of proving that Z is countable.

What we are given: That S is a countable set and T is a subset of S.

What we are required to show: That T is countable.
Case 1: If S is finite, then by Theorem 18.2 (d), every subset of S is finite
and so T is countable; we are done.
Case 2: Suppose S is infinite. If T is finite, then T is countable by definition,
so suppose that T is also infinite.

Since S is countable it is a one-to-one image, say h[N], of N. Then the func-

tion h : N → S can be expressed as h(i) = xi ∈ S. We can then express S
as a sequence S = {x0 , x1 , x2 , . . . , xn, . . .}.
We are required to show that T is countable. This is done by constructing
a function f : N → T mapping N one-to-one onto T as follows:
− Let g : P(N) − {∅} → N be the function which maps any non-empty
subset of N to its least element. Since N has been shown to be well-ordered
the function g is well-defined.
· We recursively define the function m : N → N as follows:

m(0) = g({i ∈ N : xi ∈ T })
m(k) = g({i ∈ N : xi ∈ T − {xm(0), xm(1), . . . , xm(k−1)}})

By Theorem 18.8, m : N → N is a well-defined function. Since T

is infinite, the domain of m is N. Furthermore, m is a one-to-one
strictly increasing function.
· Note that {xm(i) : i ∈ N} ⊆ T . We claim: That {xm(i) : i ∈ N} = T :
Suppose xk ∈ T . Let U = {m(i) ∈ N : m(i) < k}. Then

g({i ∈ N : xi ∈ T − {xm(i) : m(i) ∈ U }}) = k

So xk ∈ {xm(i) : i ∈ N}. We conclude that {xm(i) : i ∈ N} = T as

claimed.

− Define the function f : N → T as f(i) = xm(i) . Since f is one-to-one

and onto T , then T is countable.
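
The construction of m in this proof amounts to scanning the enumeration of S in order and keeping the elements that lie in T. Here is a minimal Python sketch under stated assumptions: S is taken to be Z with the enumeration from the earlier example, T is taken to be the multiples of 3, and the names x, enumerate_subset and in_T are ours.

```python
from itertools import count, islice
from typing import Callable, Iterator


def x(i: int) -> int:
    """Assumed enumeration x_0, x_1, x_2, ... of S = Z."""
    return -((i + 1) // 2) if i % 2 == 1 else i // 2


def enumerate_subset(in_T: Callable[[int], bool]) -> Iterator[int]:
    """Yield x_m(0), x_m(1), ... where m(k) is the least unused index i with x_i in T.

    Because m is strictly increasing, scanning the indices upward
    finds m(0), m(1), m(2), ... in that order.
    """
    for i in count():
        if in_T(x(i)):
            yield x(i)


# T = multiples of 3; only finitely many terms of the sequence are requested.
first_five = list(islice(enumerate_subset(lambda z: z % 3 == 0), 5))
print(first_five)
```

The generator never terminates on its own when T is infinite, which is why the example takes a finite slice.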

19.4 More examples of infinite countable sets.

Showing that an infinite set is countable can be a challenge since it requires
constructing a function mapping N one-to-one onto a set. The above result
stating that “subsets of countable sets are countable” is a useful tool for
showing that some infinite sets are countable without specifically exhibiting
a function which maps N onto it.
We provide examples of other sets which are countable.
Example 1. Show that the set N × Z is countable.

Solution: Define the function f : N × Z → Z − {0} as follows:

f(m, n) = 2^m (2n − 1)

We claim that the function f is onto Z − {0}:

− Let z be any non-zero integer. Then we can factor out at most a finite
number of 2s, say m of them for some natural number m (m possibly equal
to 0), leaving behind a single (either positive or negative) odd factor 2n − 1
for some integer n. So z = 2^m (2n − 1). Thus, there exists an ordered
pair (m, n) ∈ N × Z which is mapped to z. So f is onto as claimed.

We claim the function f is one-to-one:

− Suppose x = y in Z − {0}. Since x and y are non-zero integers there exist
natural numbers m and n and integers s and t such that3

x = f(m, s) = 2^m (2s − 1) = 2^n (2t − 1) = f(n, t) = y

Suppose, without loss of generality, that m ≥ n, then

2^m (2s − 1) = 2^n (2t − 1) ⇒ 2^(m−n) (2s − 1) = 2t − 1
⇒ m = n and 2s − 1 = 2t − 1 (RHS is odd ⇒ m − n = 0)
⇒ s = t
⇒ (m, s) = (n, t)

So f is one-to-one as claimed.
We have shown that f is both one-to-one and onto. Thus, the sets N × Z
and Z − {0} are equipotent. We have shown that Z is countable. Since
Z − {0} ⊂ Z, then, by the previous theorem, Z − {0} is also countable. So
there exists a one-to-one function g mapping Z − {0} onto N. So f⁻¹ ◦ g⁻¹
maps N one-to-one onto N × Z. So N × Z is countable, as required.
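
The factorisation argument of Example 1 translates directly into a short program. The sketch below is illustrative only: the names pair_to_int and int_to_pair are ours, and int_to_pair implements the "factor out as many 2s as possible" step described in the solution.

```python
from typing import Tuple


def pair_to_int(m: int, n: int) -> int:
    """f(m, n) = 2^m (2n - 1) for m in N and n in Z."""
    if m < 0:
        raise ValueError("m must be a natural number")
    return 2 ** m * (2 * n - 1)


def int_to_pair(z: int) -> Tuple[int, int]:
    """Recover (m, n) from a non-zero integer z = 2^m (2n - 1)."""
    if z == 0:
        raise ValueError("0 is not in the range of f")
    m = 0
    while z % 2 == 0:       # strip the factors of 2, counting them
        z //= 2
        m += 1
    return m, (z + 1) // 2  # the remaining odd factor equals 2n - 1


print(int_to_pair(1584))
```

Running the round trip over a range of non-zero integers gives a finite check that every such integer has exactly one preimage.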

Example 2. Show that the set of rational numbers, Q, is countable.

Solution: We will represent all rational numbers in the form

Q = {m/n : m ∈ N, n ∈ Z − {0}, m/n is irreducible.}

The rational numbers in this set are irreducible so a/b = c/d if and only if
a = c and b = d. We define the function f : N × (Z − {0}) → Q as follows:

f(m, n) = m/n
3 Note that any non-zero integer can be expressed as a product 2^m (2n − 1) for some
natural number m and some integer n. For example, suppose we are given the integer 1584.
If we factor out as many 2s as possible from 1584, we obtain 2^4 and we are left with an
odd number 2 · 50 − 1. See that 1584 = 2^4 (2 · 50 − 1).

Restricted to the pairs (m, n) for which m/n is irreducible (with 0 written
only as 0/1), this function is one-to-one and onto Q. The set of such pairs is
a subset of N × (Z − {0}) ⊂ N × Z, and N × Z was shown to be countable,
so this set of pairs is countable by Theorem 19.3. Thus, Q is countable, as
claimed.
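
The role played by irreducibility in Example 2 can be seen on a finite sample. The following Python sketch lists the irreducible pairs (m, n) within a bound and checks that no rational number is produced twice. The name irreducible_pairs, the bound, and the convention that 0 is represented only by the pair (0, 1) are assumptions made for this illustration.

```python
from fractions import Fraction
from math import gcd
from typing import Iterator, Tuple


def irreducible_pairs(bound: int) -> Iterator[Tuple[int, int]]:
    """Pairs (m, n) with 0 <= m <= bound, 0 < |n| <= bound and m/n irreducible."""
    for m in range(bound + 1):
        for n in range(-bound, bound + 1):
            if n == 0:
                continue                 # denominators are non-zero
            if m == 0 and n != 1:
                continue                 # assumed convention: 0 appears only as 0/1
            if gcd(m, abs(n)) == 1:
                yield (m, n)


pairs = list(irreducible_pairs(6))
values = [Fraction(m, n) for m, n in pairs]
print(len(pairs), len(set(values)))
```

By contrast, the unrestricted pairs (1, 2) and (2, 4) both give 1/2, so the map (m, n) ↦ m/n is not one-to-one on all of N × (Z − {0}).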

Example 3. Show that the set N × N is countable.

Solution: Since N×Z is countable and N×N ⊂ N×Z, then N×N is countable.

Theorem 19.4 Any finite product, N × N × · · · × N, of N is a countable set.

Proof:
The proof is by induction. It follows from the statement in Example 3. The
details are left as an exercise.

19.5 Countable unions of countable sets.

How many countable sets can we join together and still obtain a countable
set? The following lemma is the first step towards showing that joining to-
gether a countable number of countable sets will always result in a countable
set. The lemma guarantees that the image of a countable set is countable.

Lemma 19.5 Suppose f maps an infinite countable set A onto a set B = f[A].
Then B is countable.
Proof:
What we are given: The set A is countable, f : A → B maps A onto the set
B. What we are required to show: That B is countable.
Since A is countable, we can index the elements of A with the natural
numbers. Let A = {ai : i ∈ N}. For each b ∈ f[A] = B let a_{b∗} be the element
in f ← ({b}) whose index is

b∗ = min{i ∈ N : ai ∈ f ← ({b})}

This minimum element exists since N is well-ordered.4 Since {a_{b∗} : b ∈ B} ⊆
A, then {a_{b∗} : b ∈ B} is countable. (Subsets of countable sets are countable.)
Then the function g : {a_{b∗} : b ∈ B} → B defined as g(a_{b∗}) = b is one-to-one
and onto B. So B is countable as required.
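
The choice of the least index b∗ for each b can be illustrated on a finite initial segment of an enumeration. In this Python sketch the name least_index_representatives, the sample list a and the use of abs as the map f are all assumptions made for the illustration.

```python
from typing import Callable, Dict, Hashable, List


def least_index_representatives(a: List[int], f: Callable[[int], Hashable]) -> Dict[Hashable, int]:
    """Map each value b = f(a_i) to the least index i with f(a_i) = b."""
    rep: Dict[Hashable, int] = {}
    for i, a_i in enumerate(a):
        rep.setdefault(f(a_i), i)  # keep only the first (least) index seen for b
    return rep


a = [0, -1, 1, -2, 2, -3, 3]       # a_0, a_1, ..., a_6
reps = least_index_representatives(a, abs)
print(reps)
```

Each value b receives exactly one representative, so sending that representative back to b is one-to-one, which is the point of the lemma.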

4 Note that the axiom of choice is not required for this.


Theorem 19.6 Let {Ai : i ∈ S ⊆ N} be a countable set of non-empty countable
sets Ai . Then ⋃_{i∈S} Ai is countable.

Proof:
What we are given: That the sets in {Ai : i ∈ S ⊆ N} are all countable sets.
What we are required to show: That ⋃_{i∈S} Ai is countable.
Since each set Ai is countable, then we can index the elements of each Ai
with an initial segment Ti of natural numbers or with all elements of N. For
each i, let Ai = {a(i,j) : j ∈ Ti ⊆ N}.
We will define a function f : ⋃_{i∈S} Ai → S × N as follows: f(a(i,j) ) = (i, j).
If an element belongs to more than one of the sets Ai , we use the pair (i, j)
with the least index i, so that f is well-defined. We see that f maps ⋃_{i∈S} Ai
one-to-one into S × N. Since f[⋃_{i∈S} Ai ] ⊆ S × N ⊆ N × N, it is countable.
Then ⋃_{i∈S} Ai is the one-to-one image of the countable set f[⋃_{i∈S} Ai ] under
the inverse map f⁻¹.
We conclude that ⋃_{i∈S} Ai is countable.

19.6 The set R of all real numbers is uncountable.

It is only after numerous attempts to show that the set of all real numbers
is countably infinite, that mathematicians turned their attention towards
showing that R is not a countable set. We can of course not say that since
our very best mathematicians are unable to prove that R is countable, then
R must be uncountable and leave it at that. We present a clever proof
devised by Georg Cantor.5 Cantor successfully shows that no one-to-one image
of N in R can be comprised of all real numbers. That is, for every one-to-one
function f : N → R, the subset, R−f[N], of R will never be empty. This way
of proving that R is uncountable is referred to as Cantor’s diagonalization
method.6

Theorem 19.7 The set R of all real numbers is uncountable.

Proof:
5 Georg Cantor (1845-1918) was a mathematician who played a pivotal role in the creation

of set theory, which has become a fundamental theory in mathematics. Cantor established
the importance of one-to-one correspondence between the members of two sets, defined
infinite and well-ordered sets, and proved that the real numbers are more numerous than
the natural numbers. Cantor’s method of proof of this theorem implies the existence of
an infinity of infinities. He defined the cardinal and ordinal numbers and their arithmetic.
Cantor’s work is of great philosophical interest, a fact he was well aware of. (Wikipedia)
6 Readers who struggle a bit with the proof are encouraged to persist in their efforts to

grasp the general idea. Discuss aspects of the proof that seem to elude your understanding
with a co-reader. It is considered part of the standard mathematical culture. I occasionally
get messages from students who are convinced that they have a proof which shows that R
is countably infinite.

We will show that the open interval (0, 1) is not countable. As a consequence,
it will be impossible for R to be countable, since subsets of countable sets
have been shown to be countable.
Proof by contradiction.
Suppose f : N → (0, 1) is a one-to-one function mapping N onto (0, 1). This
means that we can index the elements of (0, 1) with the natural numbers
as follows: (0, 1) = {x0 , x1, x2 , x3 , . . . , } where xi = f(i). We claim that at
least one real number does not belong to f[N] and so f is not “onto” (0, 1):
− We write out each real number as an infinite decimal expansion:

x0 = 0.a11 a12 a13 a14 a15 . . .
x1 = 0.a21 a22 a23 a24 a25 . . .
x2 = 0.a31 a32 a33 a34 a35 . . .
. . .
xn−1 = 0.an1 an2 an3 an4 an5 . . .
. . .
We will construct a non-zero real number 0.b1b2 b3 b4 . . . between 0 and

1 which is not accounted for in this list.
· For each i, if aii ∈ {0, 1, 2, 3, 4}, let bi = 7. If aii ∈ {5, 6, 7, 8, 9} let
bi = 2. Note that there is nothing special about the numbers 7 and 2.
We could have chosen another pair of digits, as long as bi always
differs from aii . But not 9s since a number containing an infinite string
of 9s can produce a real number which has two infinite decimal
representations.7
· So the real number x = 0.b1 b2 b3 b4 . . . is a string of 2s and 7s. We
claim that x is not in the set {x0 , x1 , x2 , x3 , . . .} said to contain
all real numbers in (0, 1). Verify that |b1 − a11 | ≥ 1, |b2 − a22 | ≥ 1,
and more generally, |bi − aii | ≥ 1 for all i. So x differs from xi−1 , the
i-th number in the list, in its i-th decimal place. Since the expansion
of x contains neither 0s nor 9s, it is the only decimal expansion of x,
and so the real number x cannot be any one of the numbers in the
list as claimed.
− So the function f is not onto (0, 1) as claimed.
Note that this does not only prove that this particular function is not onto;
it also proves that all one-to-one functions f : N → (0, 1) which claim to be
onto cannot be so. Since (0, 1) is not countable, then R cannot be countable.
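
The digit rule used in the proof can be tried out on a finite list of decimal expansions. This Python sketch is illustrative only: the function name diagonal_escape and the sample rows are ours, and each row stands for the first few digits after the decimal point of one listed number.

```python
from typing import List


def diagonal_escape(rows: List[str]) -> str:
    """Return digits b_1 b_2 ... b_n, where b_i is 7 if the i-th digit of the
    i-th row lies in {0, 1, 2, 3, 4}, and 2 otherwise."""
    return "".join("7" if row[i] in "01234" else "2" for i, row in enumerate(rows))


rows = ["50000", "33333", "12945", "00071", "99990"]
b = diagonal_escape(rows)
print("0." + b)
```

By construction the i-th digit of the result differs from the i-th digit of the i-th row, so the resulting number cannot equal any row of the list, however long the list is made.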

Some readers, possibly for philosophical reasons, may find it difficult to ac-
cept that the infinite set R is not a one-to-one image of N, and hence is a
strictly larger infinite set than N, even though they can point to no obvious
7 It can be shown, for example, that the rational numbers 0.04999999 . . . and 0.05000 . . .

are different representations of the same rational number 5/100. But the decimal representa-
tion of a rational a/b is unique provided we do not allow a tail end of 9s in our representation
of this number.

errors in Cantor’s proof. Skeptical readers may find some comfort in learn-
ing that even very skilled mathematicians, when confronted by results which
appear counter-intuitive, may harbor some nagging doubts in spite of being
presented with an irrefutable proof. Georg Cantor once wrote to Richard
Dedekind “Je le vois, mais je ne le crois pas”8 (I see it, but I don’t believe
it) after determining that there is a one-to-one correspondence between all
points in the plane and the set of points on a line.
Having now convinced ourselves that the set of all real numbers is uncount-
able, we can subdivide the class of all infinite sets into two categories: the
subclass of all countably infinite sets and the subclass of all uncountably
infinite sets. We will see in the next section that the class of all uncountable
sets can itself be divided into other major subcategories of infinite sets.

Concepts review:
1. What does it mean to say that two sets are equipotent sets?
2. What does it mean to say that a set is countable?
3. Is the set ∅ countable?
4. What can we say about the image of a countable set under some
function f?
5. Which of the sets N, Q, Z, N × Z, R are countable sets?
6. Is it true that the subset of a countable set must be countable?
7. How is the procedure Cantor used to prove the uncountability of R
referred to?
8. What can we say about the countable union of countable sets?

EXERCISES

A. 1. Prove that N × N is an infinite countable set.

2. Prove that the sets N − {0} and N are equipotent.
3. Are there sets which are neither countable nor uncountable? Construct a
set other than R which is uncountable and explain what makes this set
uncountable.

B. 4. Prove that Q does not contain a one-to-one image of R.

8 Jean Cavaillès, Philosophie mathématique, p. 211.

5. If S is finite, show that P(S) is countable.

C. 6. Prove that N contains a subset which is equipotent with Q × Q.

7. Show that (N × N) × N is countable.
8. Show that N × R is uncountable.
9. Suppose f : R → Z is defined as follows: f(x) is the smallest integer y
such that y > x. Is the quotient set induced on R by f a countable or
uncountable set? Explain.
10. Suppose we tried to use the Cantor diagonalization method to prove that
Q is uncountable? Explain why this would not work.
11. Is the set P(Q) countable or uncountable? Explain why.
12. Prove that the one-to-one image of an uncountable set must be uncountable.

20 / Equipotence as an equivalence relation

Abstract. In this section we first show that the equipotence relation “∼e ”
is an equivalence relation on the class S of all sets. We use this equiv-
alence relation to partition S into equivalence classes. We show that for
any set S, S cannot be equipotent to its power set P(S). This fact allows
us to construct infinitely many distinct classes of mutually equipotent sets.
We also show that for any non-empty set S, the two sets P(S) and 2^S
are equipotent. Finally, we show that P(N) is embedded in R, and R is
embedded in P(N).

20.1 Viewing equipotence as a relation.

Let S = {S : S is a set} denote the class of all sets. The Axiom of class
construction guarantees this to be a well-defined class. Now “equipotence”
can be viewed as a relation, denoted here by Re , on S :
A pair (A, B) ∈ S × S belongs to Re if and only if A and B are
equipotent.
So, the word equipotence conveniently refers to the property possessed by
sets which are equipotent.1
Notation: If two sets, A and B, are equipotent, we will represent this prop-
erty by,
A ∼e B
For example, we have previously shown that N ∼e Q and N ∼e N × Z; so
the pairs (N, Q) and (N, N × Z) belong to the relation Re on S . We can also
say, for example, that the class

{S ∈ S : S ∼e N or S ∼e n for some n ∈ N}

is the class of all countable sets. Also, for all infinite subsets A of N, N ∼e A.
We have also seen that N ≁e R; hence, (N, R) ∉ Re . It is natural to wonder
whether Re is an equivalence relation on S . We immediately verify that
this is the case.

Theorem 20.1 Let S be a class of sets. The equipotence relation Re on S

is an equivalence relation on S .

1 The word “equinumerous” is also used to describe two sets which are equipotent. The

word “equinumerosity” is also used to describe the property of sets which are equipotent.
The word “equipotence” has the advantage of having only four syllables rather than the
tongue-twisting seven syllables in “equinumerosity”.

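Proof:
Reflexivity: For every set S, the identity function maps S one-to-one onto
S, so S is equipotent to itself.
Symmetry: If f maps S one-to-one onto T , then f⁻¹ maps T one-to-one onto
S. So if S is equipotent to T , then T is equipotent to S.
Transitivity: If f maps S one-to-one onto T and g maps T one-to-one onto
H, then g ◦f maps S one-to-one onto H. So if S is equipotent to T and T is
equipotent to H, then S is equipotent to H.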

The equivalence relation, Re , on the class S allows the construction of sub-

classes of S we call equivalence classes induced by Re . If S is a set, we will
represent the equivalence class which contains S by [S]e . We list infinitely
many distinct equivalence classes:
{[0]e, [1]e, [2]e, [3]e, [4]e, . . . , [n]e, . . . , [N]e, [R]e }

Just about every set we have discussed up to now belongs to one of these
equivalence classes. Each of these is a subclass of S . This gives rise to a
compelling question: Are there any equipotence-induced equivalence classes
other than the ones listed here? One of the main objectives of this section
is to show that there are.

20.2 A few fundamental properties of the equipotence relation on S .

There are far too many sets to determine, on a case-by-case basis, which
pairs belong to the same equipotence-induced equivalence class and which
don’t. There are, however, a few fundamental equipotence relation properties
we can derive immediately which will help us classify sets by equipotence.
We prove some of these now.

Theorem 20.2 Suppose A, B, C and D are sets such that A ∼e B and

C ∼e D where A ∩ C = ∅ = B ∩ D. Then (A ∪ C) ∼e (B ∪ D).
Proof:
What we are given: A ∼e B and C ∼e D, A ∩ C = ∅, B ∩ D = ∅
What we are required to show: (A ∪ C) ∼e (B ∪ D)
Since A ∼e B and C ∼e D, there exist one-to-one onto functions f : A → B
and g : C → D. Define the function h : A ∪ C → B ∪ D as follows: h|A = f
and h|C = g. Since A ∩ C = ∅ = B ∩ D, then h is well-defined, one-to-one
and onto B ∪ D. So (A ∪ C) ∼e (B ∪ D) as required.
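
For finite sets the function h of this proof can be built explicitly by merging two lookup tables. The Python sketch below represents one-to-one onto functions as dictionaries; the name glue and the sample data are assumptions made for the illustration.

```python
from typing import Dict, Hashable


def glue(f: Dict[Hashable, Hashable], g: Dict[Hashable, Hashable]) -> Dict[Hashable, Hashable]:
    """Return h with h|A = f and h|C = g, where A and C are the key sets.

    The disjointness hypotheses of the theorem are checked explicitly:
    the domains must not overlap, and neither must the ranges.
    """
    if set(f) & set(g):
        raise ValueError("domains A and C must be disjoint")
    if set(f.values()) & set(g.values()):
        raise ValueError("ranges B and D must be disjoint")
    return {**f, **g}


f = {"a": 1, "b": 2}   # one-to-one from A = {a, b} onto B = {1, 2}
g = {"c": 3}           # one-to-one from C = {c} onto D = {3}
h = glue(f, g)
print(h)
```

If the ranges B and D overlapped, two different inputs could be sent to the same output, which is why the theorem requires B ∩ D = ∅.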

Theorem 20.3 Suppose A, B, C and D are sets such that A ∼e B and

C ∼e D. Then A × C ∼e B × D.
Proof:

What we are given: A ∼e B and C ∼e D

What we are required to show: That A × C ∼e B × D.
Since A ∼e B and C ∼e D, there exist one-to-one onto functions f : A → B
and g : C → D. Define the function h : A × C → B × D as follows:
h(a, c) = (f(a), g(c)). It is left as an exercise to show that h is a one-to-one
onto function. So A × C ∼e B × D as required.
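
The map h(a, c) = (f(a), g(c)) can be tabulated for small finite sets. In this Python sketch the functions f and g are represented as dictionaries; the name product_map and the sample data are ours.

```python
from itertools import product
from typing import Dict, Hashable, Tuple


def product_map(f: Dict[Hashable, Hashable], g: Dict[Hashable, Hashable]) -> Dict[Tuple, Tuple]:
    """Tabulate h(a, c) = (f(a), g(c)) for every pair (a, c) in A x C."""
    return {(a, c): (f[a], g[c]) for a, c in product(f, g)}


f = {0: "x", 1: "y"}            # one-to-one from A = {0, 1} onto B = {x, y}
g = {"p": True, "q": False}     # one-to-one from C = {p, q} onto D = {True, False}
h = product_map(f, g)
print(h)
```

Comparing the set of values of h with B × D shows, for this finite case, that h is onto, and counting distinct values shows that it is one-to-one.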

Corollary 20.4 Suppose A and B are infinite sets.

1) If {A, B} ⊂ [N]e , then A × B ∈ [N]e .
2) If {A, B} ⊂ [R]e , then A × B ∈ [R]e . Hence, R × R ∼e R.

Proof:
1) Suppose {A, B} ⊂ [N]e . Then A ∼e N and B ∼e N. By Theorem 20.3,
A × B ∼e N × N where N × N is known to be countable (Theorem 19.4).
Hence, A × B ∈ [N]e .
2) Suppose {A, B} ⊂ [R]e . Then A ∼e R and B ∼e R and so, by The-
orem 20.3, A × B ∼e R × R. It is easily seen that R is equipotent with
{1} × R ⊂ R × R. Since {1} × R is uncountable, R × R is uncountable. Then
A × B ∉ [N]e . We must now show that A × B ∈ [R]e . This will be the case
if R × R ∈ [R]e .
We claim that R × R ∈ [R]e .
− It is easily seen that (0, 1) ∼e R.2 Then, by Theorem 20.3, (0, 1)×(0, 1) ∼e
R × R. To show that R × R ∈ [R]e , it then suffices to show that (0, 1) ×
(0, 1) ∼e (0, 1).
− For 0.x1 x2 x3 x4 x5 . . . ∈ (0, 1) (ignoring those decimal expansions with
infinite strings of 9s) define the function f : (0, 1) → (0, 1) × (0, 1) as
follows:

f(0.x1 x2 x3 x4 x5 . . .) = (0.x1 x3 x5 x7 x9 . . . , 0.x2 x4 x6 x8 x10 . . .)

The function f is onto:

· Let (0.a1 a2 a3 a4 a5 . . . , 0.b1b2 b3 b4 b5 . . .) ∈ (0, 1) × (0, 1). Then

f(0.a1 b1 a2 b2 a3 b3 . . .) = (0.a1 a2 a3 a4 a5 . . . , 0.b1b2 b3 b4 b5 . . .)

so f is onto (0, 1) × (0, 1).

The function f is one-to-one:
2 The function f(x) = tan(πx/2) maps the open interval (0, 1) one-to-one onto (0, ∞). If
g(x) = ln x, g ◦f maps (0, 1) one-to-one onto (−∞, ∞) = R.

· Let (0.a1 a2 a3 a4 . . . , 0.b1b2 b3 b4 . . .) ∈ (0, 1) × (0, 1). By definition of f,

(0.a1 b1 a2 b2 a3 b3 . . .) is the only element that can be mapped to
(0.a1 a2 a3 a4 . . . , 0.b1b2 b3 b4 . . .). Hence, f is one-to-one.
So (0, 1) ∼e (0, 1) × (0, 1) as required.
We have shown that not only is R × R uncountable, R × R ∈ [R]e . Since
A × B ∼e R × R, A × B ∈ [R]e .
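
The digit-splitting map used in this proof, and the interleaving that undoes it, can be sketched on finite strings of digits. The names split_digits and interleave are ours, and a finite string only stands in for the first few digits of an infinite expansion.

```python
from typing import Tuple


def split_digits(d: str) -> Tuple[str, str]:
    """Send x1 x2 x3 x4 ... to (x1 x3 x5 ..., x2 x4 x6 ...)."""
    return d[0::2], d[1::2]


def interleave(a: str, b: str) -> str:
    """Send (a1 a2 a3 ..., b1 b2 b3 ...) to a1 b1 a2 b2 a3 b3 ..."""
    return "".join(x + y for x, y in zip(a, b))


odd_places, even_places = split_digits("123456")
print(odd_places, even_places, interleave(odd_places, even_places))
```

On strings of even length the two functions are inverse to each other, which mirrors the onto and one-to-one arguments given for f.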

Example: Show that (−π/2, π/2) × Q ∼e R × N.

Solution:
Since the function f(x) = tan x maps the interval (−π/2, π/2) one-to-one
and onto R, while Q has been shown to be equipotent with N, then, by the
above theorem, (−π/2, π/2) × Q ∼e R × N.

20.3 Products of countable sets.

Given two infinite countable sets S and T , using Theorem 20.3, we can write

S × T ∼e N × N ∼e N

So the product of two infinite countable sets is countable. If {Ai : i =
0, 1, 2, 3, . . . , n} is a finite set of countable sets, then the expression
∏_{i=0}^{n} Ai is defined as:

∏_{i=0}^{n} Ai = A0 × A1 × A2 × · · · × An = {(x0 , x1 , x2 , . . . , xn ) : xi ∈ Ai }

We can prove the following slightly more general result.

Theorem 20.5 Let {Ai : i ∈ N} be a (countable) set of non-empty countable
sets. Then ∏_{i=0}^{n} Ai is countable for all n.

Proof:
What we are given: {Ai : i ∈ N} is a set of countable sets.
What we are required to show: A0 × A1 × A2 × · · · × An is countable.
We know that the product of any two non-empty countable sets is countable.
We prove the statement by induction. Let P (n) be the statement

“∏_{i=0}^{n} Ai is countable”

− Base case: Since A0 is countable, then P (0) holds true.

− Inductive hypothesis: Suppose P (n) holds true. Then ∏_{i=0}^{n} Ai is
countable. Now

∏_{i=0}^{n+1} Ai ∼e (∏_{i=0}^{n} Ai ) × An+1

is a product of two countable sets. We have initially shown that products
of pairs of countable sets are countable. Hence, the product ∏_{i=0}^{n+1} Ai is
countable. So P (n + 1) holds true.
By the principle of mathematical induction ∏_{i=0}^{n} Ai is countable for all n.

The reader should be careful not to generalize the above theorem when it
comes to Cartesian products. It does not say that “The Cartesian product
of countably many countable sets is countable”. This statement does not
hold true in general. We will soon witness infinite products of countable sets
which are not countable.

20.4 Adding countable sets to infinite sets.

We will now show that if a set, S, is infinite and another set, T , is countable
(finite or infinite), then S ∪ T ∈ [S]e . Adding countably many elements to
an infinite set S will always result in a set which is equipotent with S.

Theorem 20.6 Suppose S is an infinite set and T is a countable set such

that S ∩ T = ∅. Then
S ∼e S ∪ T
Proof:
What we are given: That T is countable, S is infinite and S ∩ T = ∅.
What we are required to show: That S and S ∪ T are equipotent.
It suffices to construct a one-to-one function f : S ∪ T → S mapping S ∪ T
onto S.
− Since S is infinite, then it must contain a countably infinite subset, say,
X. (By Theorem 18.9.)
− Since T is countable and X is countably infinite, then, by Theorem 19.6,
T ∪ X is countably infinite.
− So there exists a one-to-one onto function g : T ∪ X → X mapping
T ∪ X onto X ⊆ S (since both T ∪ X and X belong to [N]e .)
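− We will now construct a function f : S ∪ T → S as follows:
\[ f(x) = \begin{cases} x & \text{if } x \in S - X \\ g(x) & \text{if } x \in T \cup X \end{cases} \]
Since S ∩ T = ∅, the sets S − X and T ∪ X are disjoint and their union is S ∪ T . The function f maps S − X identically onto S − X and maps T ∪ X one-to-one onto X.
We see that f maps S ∪ T one-to-one and onto S, as required.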
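The construction of f can be sketched in Python for one concrete choice of sets. Everything specific here is an assumption made for illustration: S = N, X is the set of even naturals, and T is a hypothetical three-element set of letters disjoint from N.

```python
T = ["a", "b", "c"]                  # a finite (hence countable) set disjoint from S = N

def in_X(x):
    """X = the even naturals, a countably infinite subset of S = N."""
    return isinstance(x, int) and x >= 0 and x % 2 == 0

def g(x):
    """A one-to-one map of T ∪ X onto X."""
    if x in T:
        return 2 * T.index(x)        # the j-th element of T goes to 2j
    return x + 2 * len(T)            # 2k goes to 2(k + |T|)

def f(x):
    """Identity on S − X (the odd naturals), equal to g on T ∪ X."""
    if x in T or in_X(x):
        return g(x)
    return x
```

The elements of T are absorbed by shifting the even numbers, while the odd numbers are left untouched.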
Part VI: Infinite sets 201

Example: Let J denote the set of all irrational numbers. Show that J ∈ [R]e .
It was shown that the set of all rational numbers Q is countably infinite. If
J were countable, then, by Theorem 19.6, R = J ∪ Q would be countable,
a contradiction. So J is uncountable and, in particular, infinite. We claim that J ∈ [R]e .

− Since J is infinite, Q is countable and J ∩ Q = ∅, then, by Theorem 20.6, J ∼e J ∪ Q = R. We conclude
that J ∈ [R]e .

20.5 Equipotence classes of power sets.

Given any set S, the Axiom of power set allows us to construct a new set,
P(S), by gathering together all subsets of S and viewing these subsets as
the elements of P(S). We have also shown that, if a set S contains n elements,
then its power set, P(S), contains precisely $2^n$ elements (Theorem 18.11).
Since $n < 2^n$ for every natural number n, a finite set S is never equipotent with P(S). We
would like to see whether this rule generalizes to infinite sets.
We begin by showing that pairs of sets which are equipotent produce equipo-
tent power sets.

Theorem 20.7 If the sets A and B are equipotent, then so are their associ-
ated power sets P(A) and P(B).
Proof:
Given that A and B are equipotent, there exists a one-to-one function f :
A → B mapping A onto B. We define the function f* : P(A) → P(B) as
follows:
f*(T ) = M ⇔ f[T ] = M
Claim: The function f* is onto P(B).
Proof of claim: Let M ∈ P(B). If M = ∅, then f*(∅) = f[∅] = ∅.
Suppose M is a non-empty subset of B. Since f is onto B, M ⊆ f[A].
Then f[f←[M ]] = M . So f* maps the element f←[M ] in P(A) to the
element M in P(B). We conclude that f* maps P(A) onto P(B).

Claim: The function f* is well-defined.

For each T ∈ P(A), the image f[T ] is a uniquely determined subset of B, so f* assigns exactly one element of P(B) to each element of P(A). Moreover, distinct elements of P(B) have distinct preimages:
Suppose U and V are distinct elements of P(B) such that x ∈ U − V ≠ ∅.
Since f is one-to-one and onto, f←[{x}] is a singleton set {y} contained
in f←[U ] ⊆ A. Since x ∉ V , y ∉ f←[V ]. Then y ∈ f←[U ] − f←[V ]; this
implies that f* maps the distinct elements f←[U ] and f←[V ] to U and
V respectively.

Claim: The function f* is one-to-one.

Suppose S and T are distinct elements of P(A). Without loss of generality,
suppose that S − T ≠ ∅ and let x ∈ S − T . Since f is one-to-one and x ∉ T , f(x) ∉ f[T ].
Then f(x) ∈ f[S] − f[T ]; this implies f[S] ≠ f[T ]. We conclude that
f*(S) ≠ f*(T ). So f* is one-to-one.
So P(A) and P(B) are equipotent.
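
For finite sets the theorem can be checked directly. The sketch below is an illustration only: the bijection f is a hypothetical example stored as a dictionary, and power_set and induced are names introduced here. It builds the induced map T ↦ f[T ] and collects its images.

```python
from itertools import combinations

def power_set(s):
    """All subsets of s, each stored as a frozenset."""
    items = list(s)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def induced(f):
    """Return the map T |-> f[T] on subsets, for a bijection f given as a dict."""
    return lambda subset: frozenset(f[x] for x in subset)

f = {1: "a", 2: "b", 3: "c"}             # hypothetical bijection from A onto B
f_star = induced(f)
images = {f_star(t) for t in power_set(f.keys())}
```

The set images contains one element per subset of A and coincides with the power set of B, as the proof predicts.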

At this point, we know of only two equipotence-induced equivalence classes
of infinite sets. They are [N]e and [R]e . We wonder: Is P(N) a countable
set? That is, does P(N) belong to [N]e ? Let’s gather together a few proven
facts and possible deductions which can be made from these:

− We have already shown that Q ∼e N.

− It then follows from the theorem above that P(Q) ∼e P(N).
− Now R was defined as the set of all Dedekind cuts; Dedekind cuts were
seen to be elements of P(Q).
− Hence, R is equipotent to a subset of P(Q).
− Since R is uncountable, then P(Q) must also be uncountable.
− Since P(Q) ∼e P(N), P(N) is also uncountable; it must then follow that
P(N) ∉ [N]e .
− We would now have to verify whether P(N) ∈ [R]e .

So, just like P(n) ≁e n for all natural numbers n, P(N) ≁e N.
A general follow-up question might be: Is it possible for any infinite set, S,
to be equipotent with its power set P(S)?
We will show that the answer to this question is, no! That is, if S is infinite,
S ∉ [P(S)]e .

Theorem 20.8 Any non-empty set S is embedded in its power set P(S).
But no subset of S is equipotent with P(S).

Proof:
What we are given: That S is a non-empty set.
What we are required to prove:
1) That S is embedded in P(S).
2) That P(S) is not equipotent to K for any K ⊆ S.

Firstly, for any element x ∈ S, {x} ∈ P(S), so the function f : S → P(S)
defined as f(x) = {x} maps S one-to-one into P(S). So S is embedded in
P(S), as required.
For the second part we have a proof by contradiction. Suppose K ⊆ S such
that K ∼e P(S). Note that K ≠ ∅ since P(S) ≠ ∅. Then there exists a
function
g : K → P(S)
mapping K one-to-one onto P(S). We claim that this leads to a contradiction.
− If x ∈ K, g(x) is seen to be an element of P(S) and therefore is a subset
of S.
− Either x ∈ g(x) or x ∉ g(x). Let T be the subset of K defined as:

T = {x ∈ K : x ∉ g(x)}

Note that T ⊆ K ⊆ S and so T ∈ P(S).

− Since the function g : K → P(S) is onto P(S) there must be some
element in K, say y, such that g(y) = T . Let’s determine whether y is
in T or not:
· If y ∈ T , then, by definition of T , y ∉ g(y) = T . This makes no
sense, so y ∉ T .
· On the other hand, if y ∉ T , then, by definition of T , y ∈ g(y) = T . This contradicts the
pre-established fact that y ∉ T .
The source of this contradiction is the supposition that g : K → P(S) is
one-to-one and onto. So K ≁e P(S) for any K ⊆ S.
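
The diagonal set T used in this proof can be examined exhaustively on a small finite set. The Python sketch below is an illustration under stated assumptions: it takes K = S = {0, 1, 2}, represents each function g : S → P(S) as a dictionary, and uses the hypothetical names power_set and diagonal_set. It checks that, for every one of the 8³ = 512 possible functions g, the set T is missed by g.

```python
from itertools import combinations, product

def power_set(s):
    """All subsets of s as a list of frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(K, g):
    """T = {x in K : x not in g(x)} for a function g given as a dict."""
    return frozenset(x for x in K if x not in g[x])

S = [0, 1, 2]
subsets = power_set(S)

# Try every function g : S -> P(S); 'choice' lists g(0), g(1), g(2).
missed_every_time = all(
    diagonal_set(S, dict(zip(S, choice))) not in choice
    for choice in product(subsets, repeat=len(S))
)
```

No function on a finite set could be onto its power set for counting reasons alone; the point of the check is that the specific set T is always a witness, which is the part of the argument that carries over to infinite sets.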

The above theorem confirms that, for any set S, S and P(S) are not equipo-
tent so [S]e and [P(S)]e are distinct equivalence classes. It also suggests that
there are many more equipotence-induced equivalence classes than the ones
listed previously on page 197. For example,

[0]e ≠ [1]e ≠ [2]e ≠ · · · ≠ [N]e ≠ [R]e ≠ [P(R)]e ≠ [P(P(R))]e ≠ · · ·

20.6 A relation defined on equipotent equivalence classes.

We will show how a never-ending list of equipotence-induced equivalence
classes of infinite sets can be constructed. To ensure that there is no ambiguity
in what follows, we formally define the expression “properly embedded”.

Definition 20.9 If A and B are non-empty sets such that A is equipotent to
some subset C of B, we will say that A is embedded in B. In such a case we
represent the relationship “A is embedded in B” by writing

A ↪e∼ B

If A and B are non-empty sets, we will say that the set A is properly embedded
in the set B, if A is equipotent to some proper subset C of B where B is not
equipotent to C. To describe “A is properly embedded in B”, we will write

A ↪e B

The relation ↪e∼ is easily seen to be both reflexive and transitive on the
class, S , of all sets. The relation ↪e , on the other hand, is irreflexive: no set is properly embedded in itself.
Let S = {S : S is a set}. Remember that the axiom of power set guarantees
that if S ∈ S then P(S) ∈ S . We define the class, E , as

E = {[S]e : S ∈ S }

Both ↪e and ↪e∼ induce a relation on E , which we now define.

Notation 20.10 Let S = {S : S is a set} and E = {[S]e : S ∈ S }. Remember
that both S and E are classes. Let [A]e and [B]e be elements of E .
We write
[A]e <e [B]e
if and only if A ↪e B. We write

[A]e ≤e [B]e

if and only if A ↪e∼ B.

The relation ≤e is easily seen to be reflexive and transitive on E . But it is
not clear whether ≤e is antisymmetric on E . If so, then ≤e defines an order
relation on E . We see that [A]e ≤e [B]e only if A ↪e∼ B and [B]e ≤e [A]e
only if B ↪e∼ A. But does

([A]e ≤e [B]e ) ∧ ([B]e ≤e [A]e ) ⇒ ([A]e = [B]e )

necessarily hold true? That is, does “A is embedded in B and B is embedded
in A” imply that A and B are equipotent sets? We suspect that this is the
case. But proving this is not a trivial matter, so we cannot assume it to
be a fact at this time. The statement
which guarantees that this holds true is called the Schröder-Bernstein theorem.
This very important theorem will be the main topic of the next section.
In the meantime, keep in mind that we are working with two classes: the class S
of all sets and the class E of equivalence classes induced by ∼e .
We provide a few examples. We have previously shown that any infinite set
contains a subset which is equipotent with N (Theorem 18.9), and, since N and R are
known to be non-equipotent, N ↪e R. It then follows that
[N]e <e [R]e
Similarly, for any natural number n and any infinite set A, n ↪e N ↪e∼ A.
So [n]e <e [N]e ≤e [A]e implies [n]e <e [A]e .

Proposition 20.11 Let S be any set. Suppose

$P^0(S) = S$, $P^1(S) = P(S)$, $P^2(S) = P(P(S))$
and
$P^n(S) = P(P^{n-1}(S))$
for all n ≥ 1. The set
$\{[S]_e, [P(S)]_e, [P^2(S)]_e, \ldots, [P^n(S)]_e, \ldots\}$

forms an infinite asymmetric <e -ordered chain of distinct classes in E .

Proof:
Proof by induction. Let P(n) denote the statement “$\{[P^i(S)]_e : i = 0, 1, 2, \ldots, n\}$
forms a <e -ordered chain of distinct classes in E ”.²
Base case: $\{[P^0(S)]_e\} = \{[S]_e\}$ ⊆ E . Since {[S]e } contains only one element,
<e linearly orders {[S]e }, so P(0) holds true. Note also that, by Theorem 20.8,
[S]e <e [P(S)]e and [P(S)]e ≮e [S]e .
Inductive hypothesis: Suppose P(n) holds true. That is, suppose $\{[P^i(S)]_e : i = 0, 1, 2, \ldots, n\}$
forms a <e -ordered chain of distinct classes in E . We are
required to show that P(n + 1) holds true.
By Theorem 20.8, $P^n(S)$ ↪e $P^{n+1}(S)$, and so $[P^n(S)]_e <_e [P^{n+1}(S)]_e$.
Since <e is irreflexive, antisymmetric and transitive on $\{[P^i(S)]_e : i = 0, 1, 2, \ldots, n\}$,
then this must also be the case for $\{[P^i(S)]_e : i = 0, 1, 2, \ldots, n, n + 1\}$.
So P(n + 1) holds true.
By mathematical induction, $\{[P^i(S)]_e : i = 0, 1, 2, \ldots, n\}$ forms a <e -ordered
chain of distinct classes in E for each n ∈ N.
From this we conclude that the relation <e is a strict linear ordering of the
infinite set $\{[P^n(S)]_e : n \in N\}$.

2 Note that by the Axiom of power set, $P^n(S)$ is a set for all n ∈ N; hence
$\{[P^n(S)]_e : n \in N\} \subseteq E$.

The above proposition allows us to say that both

$\{[P^n(N)]_e : n = 0, 1, 2, \ldots\}$

and
$\{[P^n(R)]_e : n = 0, 1, 2, \ldots\}$
form infinite <e -ordered chains of distinct equivalence classes in E . It will
be interesting to determine whether these two chains have any elements in
common. We will have the tools required to answer this question only in the
next section.

20.7 Equipotence of $2^S$ and P(S).

In Example 2 on page 133, we introduced the set $\{1, 2\}^N$ equipped with the
lexicographic linear ordering. This set was defined as being the set of all
functions mapping N into the set {1, 2}. It can be equivalently described as

$\{1, 2\}^N = \{(a_0, a_1, a_2, a_3, \ldots) : a_i \text{ equals 1 or 2}\}$

which is the set of all possible countably infinite ordered strings of 1s and
2s. Whether N is mapped to {1, 2} or to {0, 1} makes no significant difference,
since the set of all countably infinite strings of 0s and 1s is equipotent to the
set of all countably infinite strings of 1s and 2s. Using 0s and 1s will allow
us to represent the set, $\{0, 1\}^N$, more succinctly as, $2^N$.
We can generalize the expression by replacing N with any set S. That is, if
S is any non-empty set, $2^S$ represents the set of all functions which map the set S to
{0, 1}. For example, given some finite set, say, S = {3, 4} we can actually
list the functions in this set as:

$2^{\{3,4\}} = \{\, \{(3, 0), (4, 0)\},\ \{(3, 1), (4, 1)\},\ \{(3, 0), (4, 1)\},\ \{(3, 1), (4, 0)\} \,\}$

a set equivalent to the set of four sequences

$\{0_3, 0_4\},\ \{1_3, 1_4\},\ \{0_3, 1_4\},\ \{1_3, 0_4\}$

This set has four, or $2^2$, elements. If S has three elements, say, S = {7, 8, 9}
and we list all elements of $2^S$ we would see that it contains precisely $2^3 = 8$
elements. Verify this. It can be shown by mathematical induction that, if
S has n elements, $2^S$ must contain $2^n$ elements (see the Exercise section).
Recall that in Theorem 18.11, we showed by induction that the power set,
P(S), of any n-element set, S, contains $2^n$ elements. From this fact, we
deduce that,

“If S is finite, $2^S$ ∼e P(S)”


Question: Can we generalize this statement so that it holds true for all sets
S, including infinite ones?
Answer : We will convince ourselves that we can. But this will require some
careful explaining. To help answer this question, let’s consider a third way
of viewing the elements of $2^S$ (whether S is finite or not). Suppose f ∈ $2^S$.
− Then f←[{1}] and f←[{0}] form disjoint subsets, say T and S − T , of S
respectively. Note that T may possibly be empty or possibly be all of S.
− So f is a function which maps x to 1 if and only if x ∈ T and all other
elements to 0.
− That is, f = χT ∈ $2^S = \{0, 1\}^S$.¹ In fact, for every K ⊆ S, equivalently
for every K ∈ P(S), χK ∈ $2^S$; conversely, for every f ∈ $2^S$, there is
precisely one T ⊆ S, equivalently T ∈ P(S), such that f = χT .
Consider the function, g : P(S) → $2^S$, defined as: g(T ) = χT . We have
just shown that g maps P(S) one-to-one onto $2^S$. Hence, $2^S$ ∼e P(S). It
is worth formally stating this important statement as a theorem.

Theorem 20.12 For any non-empty set S,

$2^S$ = {χT : T ∈ P(S)} ∼e P(S)

If S = N, then $2^N$ ∼e P(N) or equivalently $2^N$ ∈ [P(N)]e . Similarly,
$2^{P(N)}$ ∈ [P(P(N))]e and $2^R$ ∈ [P(R)]e .
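
The correspondence T ↦ χT can be written out for a finite set. The following Python sketch is an illustration only, using the three-element set S = {7, 8, 9} mentioned earlier; the names chi and support are hypothetical, and each function in 2^S is stored as a dictionary.

```python
from itertools import combinations

def chi(T, S):
    """Characteristic function of T on S, stored as a dict x -> 0 or 1."""
    return {x: 1 if x in T else 0 for x in S}

def support(f):
    """The subset of S that f sends to 1, recovered from a function f in 2^S."""
    return frozenset(x for x, bit in f.items() if bit == 1)

S = [7, 8, 9]
subsets = [frozenset(c) for r in range(len(S) + 1)
           for c in combinations(S, r)]
functions = [chi(T, S) for T in subsets]
```

Applying support after chi returns the original subset, so the two maps are inverse to each other, and both lists have 2³ = 8 entries.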

20.8 Comparing P(N) and R

We eagerly wonder: “Do the sets P(N) and R belong to the same equipotence
equivalence class or not?” We know that N is embedded in R. But it
is not clear how P(N) relates to R with respect to the equipotence relation.
The following theorems do not entirely answer this question, but they constitute
a first step towards answering it.

Theorem 20.13 The set, R, is embedded in P(N).

Proof:
Recall that Dedekind defined the real numbers R as being a particular
set of initial segments of Q and so R ⊆ P(Q). Since Q ∼e N, then
P(Q) ∼e P(N) (by Theorem 20.7). Since R ⊆ P(Q) ∼e P(N), R is
equipotent to a subset of P(N).

1 Recall that on page 84 we introduced a function called the “characteristic function of
T ” represented as, χT . The function χT on S was defined as a function mapping T to {1}
and the rest of S to {0}; it is precisely the same function as f .

We provide some background that will help follow the proof of the next
statement. Theorem 20.12 states that $2^N$ = {χT : T ∈ P(N)} ∼e P(N).
Recall that the function χT : N → {0, 1} is defined as

\[ \chi_T(n) = \begin{cases} 0 & \text{if } n \notin T \\ 1 & \text{if } n \in T \end{cases} \]

So, for a specific subset T of N, χT can be viewed as a sequence,
$\{i_0, i_1, i_2, i_3, \ldots\}$, of 0s and 1s, where $i_k = 0$ only if k ∉ T , otherwise $i_k = 1$.
We see that there is a one-to-one correspondence between the set {χT : T ⊆
N} and the set of sequences $\{\{i_0, i_1, i_2, i_3, \ldots\} : i_n \in \{0, 1\}\}$. Equivalently,
there is a one-to-one correspondence between the set, {χT : T ⊆ N}, and
the set of decimal expansions, $\{i_0.i_1 i_2 i_3 i_4 \cdots : i_n \in \{0, 1\}\}$. For example,
if E = {n ∈ N : n is even }, then χE can uniquely be represented
as:
χE = {(0, 1), (1, 0), (2, 1), (3, 0), (4, 1), (5, 0), . . . }
→ $\{1_0, 0_1, 1_2, 0_3, 1_4, 0_5, \ldots\}$
→ $1_0.0_1 1_2 0_3 1_4 0_5 \cdots$
→ 1.01010101 · · ·
If O = {n ∈ N : n is odd}, then we can write
χO = {(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1), . . . }
→ $\{0_0, 1_1, 0_2, 1_3, 0_4, 1_5, \ldots\}$
→ $0_0.1_1 0_2 1_3 0_4 1_5 \cdots$
→ 0.10101010 · · ·

Say, we define a function, f* : {χT : T ∈ P(N)} → $\{i_0.i_1 i_2 i_3 i_4 \cdots : i_n \in \{0, 1\}\}$
as
f*(χT ) = $i_0.i_1 i_2 i_3 i_4 \cdots$ if and only if $i_n = \chi_T(n)$ for n ∈ N
For example:
f*(χN ) = 1.111111111111111 · · · = 10/9
f*(χE ) = 1.010101010101010 · · · = 100/99
f*(χO ) = 0.101010101010101 · · · = 10/99
f*(χ∅ ) = 0.000000000000000 · · · = 0
f*(χ{0,3} ) = 1.001000000000000 · · · = 1001/1000

We see that the maximum value in the image of {χT : T ∈ P(N)} under
f* is 10/9, while the minimum value in the image is 0. Also, if U ≠ V , then
χU and χV differ at some n ∈ N; since a decimal expansion which uses only the digits 0 and 1 is the unique expansion of its value,
f*(χU ) ≠ f*(χV ), so f* is one-to-one on {χT : T ∈ P(N)}.
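
The listed values can be reproduced with exact rational arithmetic. The Python sketch below is an illustration under assumptions: the function names are hypothetical, finite subsets T are handled exactly, and infinite subsets such as E are only approximated by a partial sum over the first few digits.

```python
from fractions import Fraction

def f_star_finite(T):
    """Exact value of the expansion i0.i1 i2 ... for a finite subset T of N."""
    return sum((Fraction(1, 10 ** n) for n in T), Fraction(0))

def f_star_truncated(member, digits):
    """Partial sum over the first `digits` digits; member(n) decides whether n is in T."""
    return sum((Fraction(1, 10 ** n) for n in range(digits) if member(n)),
               Fraction(0))

def is_even(n):
    return n % 2 == 0

value_03 = f_star_finite({0, 3})              # the subset {0, 3}
approx_E = f_star_truncated(is_even, 20)      # first 20 digits for the even numbers
```

The partial sums for E increase towards 100/99, and the partial sums for N itself increase towards 10/9, in agreement with the table above.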
We are now set to prove the following theorem.

Theorem 20.14 The set P(N) is embedded in R.

Proof:

We define a function f* : {χT : T ∈ P(N)} → $\{i_0.i_1 i_2 i_3 i_4 \cdots : i_n \in \{0, 1\}\}$
as

f*(χT ) = $i_0.i_1 i_2 i_3 i_4 \cdots$ if and only if $i_n = \chi_T(n)$ for n ∈ N

We see that the function f* maps {χT : T ∈ P(N)} one-to-one into the
interval [0, 10/9].

It follows that
P(N) ∼e {χT : T ∈ P(N)} ↪e∼ [0, 10/9] ⊂ R
So P(N) is embedded in R, as required.

Concepts review:
1. Describe the equivalence relation on the class of all sets which was
discussed in this section.
2. What can we say about the finite union of disjoint countable sets?
3. What can we say about the Cartesian products of two countable
sets?
4. If we add a countable set to an infinite set S, what can we say about
the set that results from this union?
5. With which set is the set of all irrationals equipotent?
6. If two sets A and B are equipotent what can we say about their
respective power sets?
7. What is the meaning given to the expression “A is properly embed-
ded in B”?
8. From any non-empty set S, construct a set B such that S ↪e B.
9. Name a set which contains a copy of R but is not equipotent with
R.
10. If S is a set, what set of functions is equipotent with P(S) other
than P(S) itself?
11. If S is a set, what does the set {χT : T ⊆ S} represent? With which
set is it equipotent?

EXERCISES

A. 1. Show that the following pairs of sets are equipotent.

a) A = (0, 1) and B = (−1, 1). (These represent open intervals in R.)
b) A = (−1, 1) and R.
2. If S = {0, {1, 2}} and T = {{x}, y}, write out explicitly the elements in the
set P(S) and P(T ).
3. Show that P(N), P(Q) and P(Z) are equipotent uncountable sets.
4. Prove that the set of all non-negative irrational numbers and the set R of
all real numbers are equipotent.
5. Show that the set {0, 1} × N is countable.
6. Let S = {a, b, c}. Show that the three sets $2^S$, P(S) and {χT : T ∈ P(S)}
contain the same number of elements by listing all their elements.

B. 7. Is the set $N^N$ countable?

8. Is $N^N$ embeddable in R? If so, find a suitable mapping.
9. Is R embeddable in $N^N$? If so, find a suitable mapping.
10. Prove that $2^{(2^S)}$ and P(P(S)) are equipotent.
11. Prove, by mathematical induction, that if a non-empty set S contains n
elements, then the set $2^S$ contains $2^n$ elements.
12. Prove that there are infinitely many equipotence-induced equivalence
classes of uncountable sets.

C. 12. Prove that R × R × R × · · · × R (n times) is equipotent to R for all non-zero
natural numbers n.
13. Prove in detail Theorem 20.3.

21 / The Schröder-Bernstein theorem

Abstract. In this section we state and prove the Schröder-Bernstein theorem.
We then illustrate some of its consequences. In particular, we use it
as a tool to prove that R and P(N) are equipotent. We finally show that
$N^N$ and R are equipotent.

21.1 Reviewing some basic properties of infinite sets.

In the last chapter, we discussed certain properties possessed by infinite sets.
We will now build on those results to prove more general statements about
these. These results tend to be less intuitive since they relate to sets other
than those which represent the numbers we are accustomed to. Before we
begin, we list the results from the last section which will serve as the main
tools to prove the statements which follow.
Recall that the class of all sets is denoted by S = {x : x is a set}.
− The countable union of countable sets is countable.
− The finite product of countable sets is countable.
− (A ∼e B) ∧ (C ∼e D) ⇒ A × C ∼e B × D.
− [(S is infinite) ∧ (T ∼e N)] ⇒ (S ∼e S ∪ T ).
− (S ∈ S ) ⇒ (S ↪e P(S)) (Theorem 20.8)
− {χT : T ∈ P(S)} ∼e P(S)
− (S ∈ S ) ⇒ $2^S$ ∼e P(S)
− R ↪e∼ P(N) and P(N) ↪e∼ R

21.2 The Schröder-Bernstein theorem.

Even if two sets are known to be equipotent, it can be quite difficult to
construct a function which maps one set one-to-one onto the other. For ex-
ample, we may suspect that P(N) and R are equipotent sets, but proving
this by actually producing a function which maps R one-to-one onto P(N)
could be a challenging task, one that will no longer be necessary once we
have proved a statement called the Schröder-Bernstein theorem.¹

Theorem 21.1 (The Schröder-Bernstein theorem) If S and T are infinite
sets where S is embedded in T and T is embedded in S, then S and T are
equipotent.
1 Ernst Schröder (1841-1902) was a German mathematician mostly known for his work
in algebraic logic (he authored Lectures in Algebra of Logic). Felix Bernstein (1878-1956)
was a German Jewish mathematician. He studied in Munich, Berlin and Göttingen. He
emigrated to the United States in the early thirties during the rise of Nazism.
212 Section 21: The Schröder-Bernstein theorem

The proof is presented once we have proved the following lemma.

Lemma 21.2 Let T be a proper subset of the set, S, and f : S → T be a one-

to-one function mapping S into T . Then there exists a one-to-one function,
f * : S → T , mapping S onto T .

Proof:
What we are given: That T ⊂ S; that f : S → T maps S one-to-one into T .
What we are required to show: There exists a one-to-one function f* : S → T
which maps S onto T .
Since T is a proper subset of S, then S − T is non-empty.
We construct a sequence of sets {Si : i ∈ N} as follows:

\[
\begin{aligned}
S_0 &= S - T \\
S_1 &= f[S - T] = f[S_0] \\
S_2 &= f^2[S - T] = f[S_1] \\
S_3 &= f^3[S - T] = f[S_2] \\
S_4 &= f^4[S - T] = f[S_3] \\
&\;\;\vdots \\
S_n &= f^n[S - T] = f[S_{n-1}] \\
&\;\;\vdots
\end{aligned}
\]

Let $U = \bigcup_{i \in N} S_i$. Since f maps all of S into T , for all i > 0, Si ⊆ T . Remember
that S0 = S − T . The Si ’s can be shown to be pairwise disjoint. Verification
of this fact is left as an exercise. (This fact is important for the validity of
this proof. Try a proof by induction.)
We define the function f* : S → T as follows:

\[ f^*(x) = \begin{cases} f(x) & \text{if } x \in U \\ x & \text{if } x \notin U \end{cases} \]

− We verify that the image of S under f* is a subset of T :
Any element a in S is either in U or is not in U . If a ∈ U , then
f*(a) = f(a) ∈ Si for some i ≥ 1 and so f*(a) ∈ T . If a ∉ U , then it is
not in S0 = S − T and so is in T . So the image of S under f* is in T .
− We verify that f* is one-to-one on S:
Since f* = f|U on U (on which f is one-to-one) and is the identity
map on S − U , f* is one-to-one on each of U and S − U . Moreover, f maps U
into U , so no element of U shares its image with an element of S − U . Hence
f* is one-to-one on S. (Some details are left as an
exercise. The reader should see clearly why this is true.)
− We verify that f* is onto T :
Let t ∈ T . If t ∈ T − U , then f* maps t to t. Suppose t ∈ U . Then
t ∈ Si for some i > 0 (t cannot be in S0 since S0 = S − T ). Then t is
in the image of Si−1 under f and so, since Si−1 ⊆ U , is in the image of S under f* . So
every element of T is in the image of S under f* .
So f* maps S one-to-one onto T as required.
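
The lemma's construction can be traced on a concrete example. The Python sketch below rests entirely on assumptions chosen for illustration: S = N, T = N − {0}, and f(n) = 2n + 2, which maps S one-to-one into T . In this case S0 = {0}, S1 = {2}, S2 = {6}, S3 = {14}, and so on, so U = {0, 2, 6, 14, ...}.

```python
def f(n):
    """A one-to-one map of S = N into T = N − {0} (hypothetical example)."""
    return 2 * n + 2

def in_U(x):
    """x is in U exactly when x is reached from 0 by applying f repeatedly."""
    y = 0
    while y < x:                 # f is strictly increasing along 0, 2, 6, 14, ...
        y = f(y)
    return y == x

def f_star(x):
    """f* agrees with f on U and is the identity elsewhere."""
    return f(x) if in_U(x) else x
```

Here f* sends 0 to 2, 2 to 6, 6 to 14 and so on, and fixes every other natural number, so that 0 is the only element of S that is not an image.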

Proof of the Schröder-Bernstein theorem.

Proof:
What we are given: There exists a one-to-one function, f : S → T , mapping
S into T and a one-to-one function, g : T → S, mapping T into S.
What we are required to show: There is a one-to-one function which maps
T onto S.
Let h = g◦f. Then h is a function mapping S into S. Since both f and g
are one-to-one on their respective domains, then h is one-to-one on S. Then

h[S] = g[f[S]] ⊆ g[T ]

If g[T ] = S, then g itself maps T one-to-one onto S and we are done. Otherwise,
g[T ] is a proper subset of S and, by the lemma, there exists a one-to-one function
h* : S → g[T ] mapping S
onto g[T ] ⊆ S. Then $(h^*)^{-1}$ : g[T ] → S maps g[T ] one-to-one onto S. This
means that $((h^*)^{-1} \circ g)[T] = S$. That is, the function $(h^*)^{-1} \circ g$ maps T one-to-one
onto S.
Thus, S and T are equipotent, as required.
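
The proof is constructive, and the resulting one-to-one correspondence can be computed in simple cases. The following Python sketch is an illustration under assumptions that are not part of the text: S and T are both copies of N (so their equipotence is not in doubt), f(n) = 2n and g(n) = 3n + 1 are hypothetical one-to-one maps that are not onto, and the helper names are introduced here. Membership in U is decided by pulling an element back through h = g◦f until it either leaves g[T ] or has no preimage under h.

```python
def f(n):
    return 2 * n                 # one-to-one map of S into T (assumed example)

def g(n):
    return 3 * n + 1             # one-to-one map of T into S (assumed example)

def h(n):
    return g(f(n))               # h(n) = 6n + 1 maps S into g[T]

def in_gT(x):
    return x % 3 == 1            # g[T] = {1, 4, 7, ...}

def in_hS(x):
    return x % 6 == 1            # h[S] = {1, 7, 13, ...}

def h_inv(x):
    return (x - 1) // 6          # inverse of h on h[S]

def in_U(x):
    """U is the union of S0 = S − g[T], h[S0], h[h[S0]], ..."""
    while True:
        if not in_gT(x):
            return True          # x lies in S0
        if not in_hS(x):
            return False         # x is in g[T] but has no preimage under h
        x = h_inv(x)

def bijection(t):
    """The map (h*)^{-1} ∘ g from T onto S."""
    y = g(t)
    return h_inv(y) if in_U(y) else y
```

For instance, bijection sends 0 to 0, 1 to 4 and 2 to 1; every natural number is the image of exactly one element of T .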

21.3 Some consequences of the Schröder-Bernstein theorem.

Recall that if A ∈ S , [A]e denotes the class of all sets which are equipotent
to A. Also E = {[A]e : A ∈ S } denotes the class of all equivalence classes
induced by ∼e on which we have defined the relation ≤e . The important
Schröder-Bernstein theorem will allow us to prove statements we suspected
were true but for which we lacked the tools to confirm our suspicions. It
shows that the relation, ≤e , on E = {[A]e : A ∈ S } is antisymmetric; that
is, [A]e ≤e [B]e and [B]e ≤e [A]e implies [A]e = [B]e . Hence, ≤e is a non-strict
order relation on E .

Theorem 21.3 The set R of all real numbers is equipotent to P(N).

Proof:
We proved in Theorems 20.13 and 20.14 that R is embedded in P(N)
and P(N) is embedded in R. Then by the Schröder-Bernstein theorem
R ∼e P(N).

Since R ∼e P(N), then [R]e = [P(N)]e . By Theorem 20.7, P(R) ∼e
P(P(N)), and so $[P(R)]_e = [P(P(N))]_e = [P^2(N)]_e$. It easily follows,
by mathematical induction, that the two countably infinite <e -ordered sets

$\{[0]_e, [1]_e, [2]_e, \ldots, [N]_e, [R]_e, [P(R)]_e, [P^2(R)]_e, [P^3(R)]_e, \ldots\}$

$\{[0]_e, [1]_e, [2]_e, \ldots, [N]_e, [P(N)]_e, [P^2(N)]_e, [P^3(N)]_e, [P^4(N)]_e, \ldots\}$

have the same elements and so represent identical <e -chains in the class
E = {[S]e : S is a set}.

21.4 The set $N^N$.

In Theorem 20.12, we saw that for any set S,
$2^S$ ∼e {χT : T ⊆ S} ∼e P(S)

Hence, $2^N$ ∼e {χT : T ⊆ N} ∼e P(N), where $2^N$ is the set of all functions
mapping N into {0, 1}. The set $2^N$ was also seen to be equipotent to the set,
$\{\{a_0, a_1, a_2, a_3, \ldots\} : a_i \in \{0, 1\}\}$ ²

We will now investigate the “set of all functions mapping N into N”. That
is, we will consider a set whose elements are of the form, f = {(i, ai ) : i ∈
N, ai ∈ N}. To be consistent with our notation, we will express this set as
$N^N$

For example, g = {(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), . . . } represents a particular
element of the set $N^N$ where n is mapped to $n^2$. We could also represent this
element as, $(a_0, a_1, a_2, a_3, \ldots)$, where $a_i = i^2$. That is, each $a_i$ is associated
to the element $(i, i^2)$.³
The sets, $N^N$ and $2^N$, are both sets of functions with domain, N, except the
functions in $2^N$ have range, {0, 1}, while the functions in $N^N$ have range, N.
Not surprisingly, if g ∈ $2^N$, then g ∈ $N^N$; hence, $2^N \subset N^N$. Of course, $N^N$ contains
many elements which do not belong to $2^N$. For example, $\{(i, i^2) : i \in N\}$
belongs to $N^N$ but not to $2^N$. But it may still be possible for $N^N$ to be equipotent
to $2^N$. If we can show that $N^N$ is embedded in $2^N$, it will follow from
the Schröder-Bernstein theorem that $[N^N]_e = [2^N]_e$.

2 If $A_i = \{0, 1\}$ for i = 0, 1, 2, 3, . . . we define $\prod_{i \in N} A_i = \{(a_0, a_1, a_2, \ldots) : a_i \in \{0, 1\}\}$.
Then $\prod_{i \in N} A_i \sim_e 2^N$.
3 If $A_i = N$ for i = 0, 1, 2, 3, . . . , we define $\prod_{i \in N} A_i = \{(a_0, a_1, a_2, a_3, \ldots) : a_i \in N\}$. Or
if one prefers, $\prod_{i \in N} A_i$ can be viewed as the set of all possible countably infinite sequences
of natural numbers. The element g = {(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), . . . } can be viewed as
$(0, 1, 4, 9, 16, \ldots) \in \prod_{i \in N} A_i = \{(a_0, a_1, a_2, a_3, \ldots) : a_i \in N\}$. In fact, $\prod_{i \in N} A_i \sim_e N^N$.

Theorem 21.4 The sets, $N^N$ and R, are equipotent.

Proof:
What we are given: $N^N$ is the set of all functions mapping N into N.
What we are required to show: $N^N$ and R are equipotent.
Claim: R is embedded in $N^N$.
− We have shown that R ∼e $2^N$ ⊂ $N^N$; hence, R is embedded in $N^N$.
Claim: $N^N$ is embedded in R.
− Let f ∈ $N^N$. Then f can be expressed in the form f =
$\{(0, a_0), (1, a_1), (2, a_2), \ldots\}$, a subset of N × N. Since f ⊂ N × N, then
f ∈ P(N × N). Then

$N^N$ ⊂ P(N × N)
∼e P(N) (By 20.4, N × N ∼e N, followed by 20.7.)
∼e R

Then $N^N$ is embedded in R as claimed.

By the Schröder-Bernstein theorem $N^N$ ∼e P(N) ∼e R.
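
The chain of embeddings in the second claim can be illustrated on finite pieces of a function's graph. The Python sketch below is only an illustration under assumptions: it uses the Cantor pairing function as one possible one-to-one map of N × N onto N (the text does not say which map is meant), it works with a finite prefix (a0, ..., ak) rather than a whole element of the function set, and the names pair and graph_code are hypothetical.

```python
def pair(i, a):
    """Cantor pairing: a one-to-one map of N x N onto N."""
    return (i + a) * (i + a + 1) // 2 + a

def graph_code(prefix):
    """Code the finite piece {(i, a_i)} of a graph as a set of naturals."""
    return frozenset(pair(i, a) for i, a in enumerate(prefix))

squares = graph_code((0, 1, 4))      # the pairs (0, 0), (1, 1), (2, 4)
```

Two different prefixes of the same length differ at some index i, so their codes differ as sets; this is the finite shadow of the embedding of the function set into P(N × N) and then into P(N).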

21.5 The set $B^A$ of all functions mapping A into B.

Since we are discussing sets of functions such as, $N^N$ and $2^S$, we slightly
generalize these notions by considering ranges other than {0, 1} and N.

Definition 21.5 If A and B are two sets, then the symbol, $B^A$, refers to the
set of all functions mapping A into B.⁴

4 The following argument confirms that if A and B are sets, then $A^B$ is a set: Every
element f ∈ $A^B$ is a subset of the set B × A (finite products of sets are sets). So for every
f ∈ $A^B$, f ∈ P(B × A). Then $A^B$ ⊆ P(B × A). Since P(B × A) is a set (Axiom of power
set), then $A^B$ must be a set (Axiom of subset).

Examples:
a) The set $Q^N$ denotes the set of all functions f : N → Q. For example,

$\{x_i : x_i = 1/(i + 1),\ i \in N\} = \{1/1, 1/2, 1/3, \ldots\}$

is such a function. We can of course say that $Q^N$ is the set of all infinite
countable sequences of rational numbers.
b) If S contains three elements and T contains four elements, we can verify
that the set $S^T$ will contain $3^4$ elements.
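
The count in example b) can be verified by listing every function. The following Python sketch is an illustration only; the sets S and T below are hypothetical three-element and four-element sets, and all_functions is a name introduced here. Each function is stored as a dictionary.

```python
from itertools import product

def all_functions(domain, codomain):
    """List every function from domain into codomain, each as a dict."""
    domain = list(domain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]

S = ["a", "b", "c"]                  # three elements
T = [0, 1, 2, 3]                     # four elements
S_to_the_T = all_functions(T, S)     # functions mapping T into S
T_to_the_S = all_functions(S, T)     # functions mapping S into T
```

A function from T into S makes one of three choices for each of the four elements of T , which accounts for the exponent.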
We wonder how the sets, $Q^N$ and $N^N$, are related. It is clear that if

$x = \{(0, n_0), (1, n_1), (2, n_2), (3, n_3), \ldots\} \in N^N$

then x ∈ $Q^N$; hence, $N^N \subset Q^N$. On the other hand, we know that
Q and N are equipotent; so there exists a one-to-one function f :
Q → N mapping Q onto N. It then follows that for any element x =
$\{(0, q_0), (1, q_1), (2, q_2), (3, q_3), \ldots\}$ in $Q^N$, we can associate a unique element

$f^*(x) = y = \{(0, f(q_0)), (1, f(q_1)), (2, f(q_2)), (3, f(q_3)), \ldots\}$

in $N^N$. Since f is one-to-one, so is f*. So $Q^N$ is embedded in $N^N$. Hence, by the Schröder-Bernstein theorem,
$Q^N$ and $N^N$ are equipotent.

Concepts review:
1. What does the Schröder-Bernstein theorem say?
2. Name three sets which are equipotent to the power set P(N).
3. What do the symbols $2^N$ and $N^N$ mean?
4. Is the set $N^N$ equipotent with R?
5. What does the expression $B^A$ mean? If B has 3 elements and A
has 2 elements, how many elements does $B^A$ contain? How many
elements does $A^B$ contain?

EXERCISES

A. 1. Show that an infinite set S is countable if and only if S is equipotent with

every one of its infinite subsets.

B. 2. Prove that an infinite countable set S can be expressed as the union of two
disjoint infinite countable sets.

3. Prove that if S and T are sets and S − T and T − S are equipotent, then
S and T are equipotent.
4. Prove that for any m ∈ N, $N^m$ is countable.
5. If S = {0, 1, 2} and T = {x, y} write out explicitly the elements of the
following sets:
a) $S^T$
b) $T^S$
c) $2^S$
d) P(S)

C. 6. Show that $N^N$ is equipotent with a subset of P(N × N).

7. Show that P(N × N) is equipotent with a subset of R.
8. Show that the set {Si : i ∈ N} of sets constructed in the proof of Lemma
21.2 are pairwise disjoint.
9. Show that the function h constructed in the proof of Lemma 21.2 is one-
to-one on S.
10. Suppose R is an equivalence relation on an infinite countable set S. Show
that the set of all equivalence classes on S induced by R is countable.
11. Let S be the set of all infinite sequences of natural numbers. We will say
that a sequence s = {si : i ∈ N} ∈ S has a “constant tail-end” if there
exists a number k ∈ N such that i > k ⇒ si = sk . Let T = {s ∈ S :
s has a constant tail-end }. Show that T is countable.
12. In Theorem 20.5 it is proven that if the Ai ’s are countable, then, for any n,
$\prod_{i=0}^{n} A_i$ is countable. Give an example that shows that $\prod_{i \in N} A_i$ need not
be countable. Explain.
13. A subset T of R is said to be open if for every element x ∈ T , x is contained
in an open interval entirely contained in T . Let S be a set of pairwise
disjoint open subsets of R. Show that S is countable.⁵ (Hint: Let U =
$Q \cap (\bigcup_{S \in \mathcal{S}} S)$. A step invoking the Axiom of choice will follow.)
14. Show that $Q^Q$ ∈ [R]e .

5 The statement “Any infinite linearly ordered set V such that the set S of pairwise
disjoint open subsets is at most countable must be equipotent with R.” is referred to as
Suslin’s problem. It remained an open question until it was proved that it is impossible to
prove or disprove this statement from ZF plus the Axiom of choice.
Part VII

Cardinal numbers
Part VII: Cardinal numbers. 221

22 / Introduction to cardinal numbers

Abstract. In this section we state the Continuum hypothesis and the Gen-
eralized continuum hypothesis; we discuss their meaning and consequences.
We “informally” define the class of cardinal numbers, C , introduce the
“aleph” notation for cardinal numbers and define addition, multiplication
and exponentiation of cardinal numbers. Finally, we show that the class of
all cardinal numbers is a proper class.

22.1 Equipotence-based classification of sets.

In the last few sections, we have used the equipotence relation to subdivide
the class, S , of all sets into subclasses, [S]e , of mutually equipotent sets. As
was done previously (on page 204), we will continue to represent the class
of all ∼e -equivalence classes on S as
E = {[S]e : S is a set}

In Section 20.10, we defined the relation, <e , on E as,

([A]e <e [B]e ) ⇔ (A ,→e B)
The relation, <e , was seen to be a strict order relation on E .
We proved two fundamental results concerning the elements of E :
1) For all sets S, [S]e <e [P(S)]e , proven in Theorem 20.8,
2) [R]e = [P(N)]e , proven in Theorem 21.3.

Combining these two statements, along with Proposition 20.11, allowed us
to conclude that the set
A = {[0]e , [1]e , . . . , [n]e , . . . , [N]e , [P(N)]e , [P^2(N)]e , . . . , [P^n(N)]e , . . . }
of distinct equipotence-induced equivalence classes, linearly ordered by the
relation <e , not only contains [R]e but also the equivalence classes of all
power sets generated by R. That is, [P(N)]e = [R]e , [P^2(N)]e = [P(R)]e ;
more generally
[P^(n+1)(N)]e = [P^n(R)]e

A few words of caution: Even though we have shown that <e linearly orders
the set, A , described above, we have not proven that <e linearly orders the
class E , even though we suspect that it does. We will not assume this to be
the case until we formally prove it to be true.

22.2 Continuum hypothesis.

A particularly intriguing question concerning the strictly ordered set of
equivalence classes described above baffled mathematicians for decades:

“Does there exist an uncountable set S (that is, one which is not
equipotent with N) which is properly embedded in R ∼e P(N)?”
Equivalently,
“Does there exist an uncountable set S such that [N]e <e [S]e <e
[R]e = [P(N)]e ? ”
After numerous attempts to construct such a set S in vain, Georg Cantor
came to believe that no such set S exists. In 1878, he conjectured that:

“No uncountable set can be properly embedded¹ in R.”

This conjecture is referred to as the Continuum hypothesis (abbreviated by
CH).² In 1900, the mathematician David Hilbert declared that proving, or
disproving, the Continuum hypothesis was one of the 23 most important un-
solved mathematical questions of that time. In 1940, Kurt Gödel showed it is
impossible to disprove the Continuum hypothesis within ZFC. In 1963, Paul
Cohen showed that it is impossible to prove that the Continuum hypothesis
holds true within ZFC. This settled the question: Provided ZFC itself is
consistent, neither assuming “CH is true” nor assuming “CH is false” can be
the source of a contradiction. This means that we are free to work in a
universe governed by ZFC+CH or by ZFC+¬CH, as we prefer, without fear of
a contradiction arising from the annexation of the axiom CH or ¬CH to ZFC.³
There is a more general version of the Continuum hypothesis called the
Generalized Continuum Hypothesis (suitably abbreviated by GCH), which
states that:
“For any infinite set A there does not exist a set S such that
A ↪e S ↪e P(A).”
Or we can equivalently say, “For every infinite set A there does not exist a set S
such that [A]e <e [S]e <e [P(A)]e ”. The GCH only refers to infinite sets, not
finite ones. The Generalized continuum hypothesis implies the Continuum
hypothesis (CH). It is, however, known that GCH does not follow from CH.
Assuming ZFC+GCH, the linearly ordered set

A = {[0]e , [1]e , . . . , [n]e , . . . , [N]e , [P(N)]e , [P^2(N)]e , . . . , [P^n(N)]e , . . . }

for example, is an “initial segment” of equivalence classes in the sense that
for any n, {[S]e ∈ E : [S]e <e [P^n(N)]e } ⊂ A . This does not say that this set
represents all equipotence-induced equivalence classes, far from it. It simply
means that assuming GCH, such a set forms a string of countably many
equivalence classes, with none missing. In a ZFC+¬GCH universe, this
containment may fail: for some n > 0, {[S]e ∈ E : [S]e <e [P^n(N)]e } ⊄ A .
It is also known that neither GCH nor ¬GCH can be proved in ZFC.⁴

1 Recall that “A is properly embedded in B” means that A is equipotent with a subset
of B, but B is not equipotent with A nor any of its subsets.
2 The word continuum is simply another way of referring to the set R.
3 The negation of CH is represented as “¬CH”. The symbol “¬” is often read as “not”.
This leaves us with an intriguing question: When trying to determine
whether a mathematical statement holds true or not, should we assume
CH or “not CH”? If a statement can be proved without invoking either of
these axioms, most readers will prefer the proof which avoids these state-
ments (viewing them as being extraneous). However, the only known proof
of some mathematical statements is one which assumes CH (or GCH). In such
cases, the reader should be alerted to this fact, and should be informed at
which point in the proof this axiom is invoked.

22.3 The class, E , of equipotence induced equivalence classes.

The reader has no doubt noticed that we have been careful not to refer to
the class
E = {[S]e : S ∈ S }
of all ∼e -induced equivalence classes on S as a “set of sets”. It is not diffi-
cult to show that E cannot be a set of sets.

Theorem 22.1 The class, E = {[S]e : S ∈ S }, is not a set of sets.

Proof:
Proof by contradiction. Suppose E is a set of sets.
− Then, by the Axiom of choice (“Every set of sets has a choice function”),
there exists a choice function, f, which maps every set [S]e ∈ E to some
element s in [S]e ⊂ S . That is, f chooses from each set [S]e of E a set
representative, f([S]e ) = s where s ∼e S.
− Let B = f[E ] denote the image of E under f. By the Axiom of replace-
ment, the image of a set under a well-defined function is a set; hence, B
is a set of sets.
− Then T = ∪{s : s ∈ B} is the union of a set of sets. Hence, by the Axiom
of union, T is a set.
− By the Axiom of power set, “T is a set” implies that P(T ) is also a set.
Since P(T ) is a set, P(T ) ∈ [S]e for some [S]e ∈ E . Then P(T ) ∼e s
for some s ∈ B. But s ⊆ T . So P(T ) is equipotent with a subset of T ,
a contradiction of Theorem 20.8 (which states that no subset of T is
equipotent with P(T )).
So E is not a set of sets.

4 Interestingly, it was shown by Wacław Sierpiński that the Axiom of choice follows
from ZF+GCH. That is, the Axiom of choice holds in any universe governed by ZF+GCH.

Note that there is nothing to indicate that the equivalence classes, [S]e (in-
duced by equipotence), are “sets” and, even if they were, there are too many
of them to allow E to be a set.

22.4 Cardinal numbers.

We pause to examine more closely the equivalence class [2]e , the class of
all sets which are equipotent to the natural number 2. Examples of a few
elements which belong to [2]e are the sets

{9, 7}

{R, N}

{∅, {{∅}}}

{{{{∅}}}, {{∅}}}

each of which is equipotent to 2. We also see that [2]e = [{9, 7}]e. Of course,
being equipotent to itself, 2 = {∅, {∅}} also belongs to [2]e. In fact, it is the
only natural number which is an element of [2]e (noting that, for example,
{9, 7} is not a natural number). If we represent the class, [2]e , in this way,
rather than representing it as, say [{9, 7}]e, it is because we surreptitiously
selected the set 2 = {∅, {∅}} as being the “official” representative of this
class. In fact, we have chosen the natural numbers as the official representa-
tives of all equivalence classes whose elements are finite sets. On the other
hand, possible representatives of [R]e and [P(R)]e are R and P(R), respec-
tively. But we could of course have used P(N) and P^2(N), respectively.
It would be convenient to uniquely specify an “official” class representative
for each element of E . Determining how we can select a set from each and
every equivalence class in E is, however, not obvious. The Axiom of choice
states that there is a choice function that allows us to select an element from
each set in a “set of sets”. But E is not a “set of sets”. So the Axiom of
choice is not available to us as a tool for selecting an element in each set
in E .⁵ We need to identify a specific property possessed by a single set in
[S]e which clearly distinguishes it from all other sets in [S]e . Unfortunately,
at this time, we have not yet sufficiently explored our universe of sets to be
able to identify what this set property could be.
There are different ways we can go about solving this conundrum. We could
wait until we have established all the mathematical machinery necessary to
judiciously choose a suitable element in each class. Alternatively, I (like a
few others) have a slight preference for proceeding otherwise. For the time
being, we will postulate the existence of a class, C , containing the unique
representative of each and every single ∼e -equivalence class in E . We will
refer to these class representatives as “cardinal numbers”.⁶ We will of course
have to be cautious whenever we refer to the entity of “cardinal number” in
our mathematical discourse. We will carefully clear a logical path towards
an eventual formal definition.

5 We could use each equivalence class in E as a “self-representative” and call them cardinal
numbers. The problem with this is that these equivalence classes are not known to be sets.
We want a set which represents each equivalence class in E .

Postulate 22.2 There exists a class of sets, C , which satisfies the following
properties:
1. Every natural number n is an element of C .
2. Any set S ∈ S is equipotent to precisely one element in C .
The sets in C are called cardinal numbers. When we say that a set, S, has
cardinality κ, we mean that κ ∈ C and that S ∼e κ, or equivalently, S ∈ [κ]e .

If the set S has cardinality κ, we will write,

|S| = κ

Note that each cardinal number is a set. From here on, the symbol, C , is
strictly reserved to represent the class of all cardinal numbers. We emphasize
that we postulate the existence of the cardinal numbers, C , immediately, for
convenience only. We will eventually prove the existence of such a class, C .

22.5 Definitions and notation associated to the class C .

Given any set S, the expression, |S|, was said to denote the cardinality of
S.
When referring to a “generic” uncountably infinite cardinal number (that
is, a cardinal number which is not the cardinality of an explicitly specified
uncountable set S) we will represent it by a Greek letter such as κ or λ.⁷
For example, we would represent a cardinal number as a Greek letter in a
phrase such as “Suppose S is a set whose cardinality is κ, ...”. However, we
don’t normally represent a finite cardinal number by a Greek letter. When
referring to some finite set, F , containing, say, n elements, its cardinality is
the natural number n, so we write,
|F | = n
Depending on the context, we will say, “the natural number n” or the “car-
dinal number n”.
The cardinality of the empty set is defined as
|∅| = 0

6 For those readers who wish to read ahead, the sets that we will call “cardinal numbers”
are the initial ordinals. (See Definition 29.7.)
7 The Greek letter κ is read as “kappa”. The Greek letter λ is read as “lambda”.

The aleph notation. It is customary to express the cardinality of count-
ably infinite sets by using the “aleph” notation.⁸ For example, we write
|N| = ℵ0 ∈ C
The symbol “ℵ0 ” is pronounced “aleph-naught”. From here on, ℵ0 is strictly
reserved for representing the cardinality of N. For example, since we have
shown that N × N ∼e N and that N ↪e R, we can write |N × N| = ℵ0 , while
|R| ≠ ℵ0 . We read the former as saying “the cardinality of the set, N×N, is ℵ0 ”.
The cardinality of R. There is one particular exception to the rules of
cardinal number representation cited above. (Isn’t there always one? What’s
a rule without an exception?). The cardinality of R is often represented by
the symbol⁹
c = |R|
Since R, P(N) and 2^N were shown to be equipotent, we can write
|P(N)| = |2^N | = c
If we assume the Continuum hypothesis (CH), there can be no uncountable
cardinal number κ such that N ↪e κ ↪e c. That is, CH states that
“...there are no cardinal numbers between ℵ0 and c.”
If we assume ¬CH, then we are saying that there exists an uncountable car-
dinal κ such that N ↪e κ ↪e c.
Since we are referring to class representatives as “numbers”, we will replace
the symbols ↪e with < and ↪e∼ with ≤. That is, if κ = |S| and λ = |T |,
κ < λ ⇔ S ↪e T
κ ≤ λ ⇔ S ↪e∼ T
Transfinite cardinal numbers. Transfinite cardinal numbers are those
cardinal numbers which are infinite sets. The natural numbers are those
cardinal numbers which are not transfinite.

8 The aleph notation was introduced by Georg Cantor.

9 The symbol c abbreviates the word “continuum”, another way of referring to the set R.

22.6 Three operations on cardinal numbers.

Having explicitly provided symbols, 0, 1, 2, 3, . . ., ℵ0 , c, for the first few car-
dinal numbers, we now develop methods to construct from these, other car-
dinal numbers not found in this list. Methods used to construct new sets
from known sets will be applied to construct new cardinal numbers from
known ones.
For example, given two sets, A and B, we can define new sets such as
C = A ∪ B, D = A × B and A^B . (Recall that A^B denotes the set of all
functions mapping the set B into A.) From these we will define three car-
dinal number operations. It will be useful to see how the cardinality of the
sets A ∪ B, A × B and A^B compares with the cardinality of two sets, A and
B. Once we formally define operations on cardinal numbers, we will try to
determine if there are general principles that can be used to more efficiently
compute the value of the cardinal numbers which result from these opera-
tions.

Definition 22.3 If S and T are sets and κ = |S| and λ = |T |, then we define
addition “+”, multiplication “×” and exponentiation of two cardinal numbers
as follows:

a) If S ∩ T = ∅, κ + λ = |S ∪ T |
b) κ × λ = |S × T |
c) κ^λ = |S^T | where S^T represents the set of all functions mapping T into S
(as previously defined). That is, |S|^|T | = |S^T |. For convenience, we define

0^λ = 0 (for λ ≠ 0)
κ^0 = 1

Note that for any non-zero cardinal number κ = |S|,

κ × 1 = |S| × |{∅}| = |S × {∅}| = |S| = κ

Also
κ^1 = |S|^|{∅}| = |S^{∅} | = |{{(∅, a)} : a ∈ S}| = |S| = κ
We verify the following fact:

1^κ = |{∅}|^|S| = |{∅}^S | = 1

since there is only one function which maps all of S to {∅}.


If two sets, S and T , have non-empty intersection, it is still possible to

determine |S| + |T | by proceeding as follows:
(κ = |S|) ⇒ (κ = |S × {0}|)
(λ = |T |) ⇒ (λ = |T × {1}|)
(S × {0}) ∩ (T × {1}) = ∅ ⇒ κ + λ = |(S × {0}) ∪ (T × {1})|
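
The tagging construction above can be tried out on small finite sets. The following Python sketch is only an illustration under the assumption that the sets involved are finite; the helper name `cardinal_sum` is hypothetical and is not part of the text.

```python
def cardinal_sum(S, T):
    """Return |S| + |T| computed as |(S x {0}) ∪ (T x {1})|."""
    tagged_S = {(s, 0) for s in S}   # copy of S, equipotent with S
    tagged_T = {(t, 1) for t in T}   # copy of T, equipotent with T
    # The tags 0 and 1 guarantee that the two copies are disjoint.
    assert tagged_S.isdisjoint(tagged_T)
    return len(tagged_S | tagged_T)


S = {1, 2, 3}
T = {3, 4}                          # S and T share the element 3
overlap_union = len(S | T)          # the plain union counts 3 only once
tagged_total = cardinal_sum(S, T)   # the tagged union counts it twice
```

Here the plain union has four elements, while the tagged union has five, which agrees with 3 + 2.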

One should verify that the definitions of sums, products and exponents of
cardinal numbers agree with the operations we perform with finite cardinal
numbers (the natural numbers). Suppose, for example, that A contains four
elements and B contains two elements. Then there are 16 = 4^2 elements in
A^B . Verify this by listing all the elements in A^B . Also there are 4 × 2 = 8
elements in A × B and 4 + 2 = 6 elements in A ∪ B (assuming that A and
B have no elements in common). Verify this fact.
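
For readers who prefer to let a machine do the listing, the verification suggested above can be sketched in Python. The element labels and the helper name `functions` are assumptions chosen for illustration only; each function from B into A is stored as a dictionary.

```python
from itertools import product

A = {"a", "b", "c", "d"}   # four elements (hypothetical labels)
B = {0, 1}                 # two elements, disjoint from A


def functions(domain, codomain):
    """List every function domain -> codomain as a dict."""
    domain = sorted(domain)
    codomain = sorted(codomain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]


A_to_the_B = functions(B, A)   # the set A^B of functions from B into A
pairs = set(product(A, B))     # the Cartesian product A x B
union = A | B                  # A ∪ B (A and B are disjoint)

counts = (len(A_to_the_B), len(pairs), len(union))
```

The triple `counts` should be compared with the values 4^2, 4 × 2 and 4 + 2 stated above. The same helper returns exactly one function when the domain is empty and no function when only the codomain is empty.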

Examples:

a) if F = {∅, {∅}, {∅, {∅}}} the cardinality of F is |F | = 3.

b) The cardinality of the set {2, {3}} is |{2, {3}}| = 2.
c) The sets Q, Z, and N×N×N all have cardinality |Q| = |Z| = |N×N×N| =
|N| = ℵ0 .
d) If A = {1, 2, 3} and B = {13, 14} the cardinalities of A and B are 3 and
2 respectively.
· By definition: |A| + |B| = |A ∪ B| = |{1, 2, 3, 13, 14}| = 5.
· By definition:

|A|×|B| = |A×B| = |{(1, 13), (1, 14), (2, 13), (2, 14), (3, 13), (3, 14)}| = 6

· By definition: |A|^|B| = |A^B | is equal to the cardinality of the following
set of functions mapping {13, 14} into {1, 2, 3}:

{(13, 1), (14, 1)}

{(13, 2), (14, 2)}
{(13, 3), (14, 3)}
{(13, 1), (14, 2)}
{(13, 1), (14, 3)}
{(13, 2), (14, 1)}
{(13, 2), (14, 3)}
{(13, 3), (14, 1)}
{(13, 3), (14, 2)}
Since there are precisely nine functions in this set, 3^2 = 9.
e) The cardinality of the two sets 2^N and P(N) is

|2^N | = |P(N)| = |R| = c

f) The value of 2^ℵ0 is

2^ℵ0 = |2|^|N| = |2^N | = c

We will formally show that the class, C , of all cardinal numbers is not a set.
The proof mimics the one used to show that the class E is not a set of sets.

Theorem 22.4 The class C of all cardinal numbers is a proper class.

Proof:
Suppose C is a set. Let T = ∪_{κ∈C} κ.
Then T must be a set (by the Axiom of union). This implies P(T ) must
be a set (Axiom of power set). Since P(T ) is a set, it has a cardinality,
|P(T )| = 2^|T | = λ. So P(T ) ∼e λ. But λ ∈ C implies λ ⊆ T . Then P(T ) ∼e λ ⊆ T .
So P(T ) is equipotent to a subset of T , contradicting the previously estab-
lished fact that P(T ) is not equipotent to any subset of T (see Theorem 20.8).
So C cannot be a set.
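
Both this proof and the proof of Theorem 22.1 rest on the fact that no function from T into P(T ) can be onto. As an illustration only (a finite check, not a proof), the following Python sketch tries every function f : T → P(T ) on a three-element set and confirms that the "diagonal" set {t ∈ T : t ∉ f(t)} is never a value of f. The names `power_set`, `diagonal` and `missed` are hypothetical.

```python
from itertools import chain, combinations, product

T = [0, 1, 2]
power_set = [frozenset(c) for c in chain.from_iterable(
    combinations(T, r) for r in range(len(T) + 1))]


def diagonal(f):
    """The set {t in T : t not in f(t)}, which f can never hit."""
    return frozenset(t for t in T if t not in f[t])


# Try every one of the 8^3 = 512 functions f : T -> P(T).
missed = all(
    diagonal(dict(zip(T, values))) not in values
    for values in product(power_set, repeat=len(T))
)
```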

22.7 Previous theorem statements using cardinal number notation.

Many of the statements proven in the last few sections can now be restated
using cardinal number notation. The results are from Sections 19, 20 and
21. Let S and T be sets. Suppose that κ = |S|, λ = |T |.
a) If |S| = ℵ0 and T ⊆ S, then either |T | = n for some n ∈ N or |T | = ℵ0 .
(Theorem 19.3)
b) |S| = |T | if and only if S ∼e T . (Postulate 22.2)
c) If ℵ0 ≤ κ, then κ + ℵ0 = κ. (Theorem 20.6)
d) If κ = |S|, then |P(S)| = 2^κ . (Theorem 20.12)
e) For all cardinal numbers κ, κ < 2^κ (equivalently, κ ↪e 2^κ ).
(Theorem 20.8)
f) |R| = c = 2^ℵ0 . (Theorem 21.3)
g) Continuum hypothesis: There does not exist a cardinal number κ such
that ℵ0 < κ < c = |R|.
h) Generalized continuum hypothesis: For any infinite cardinal number κ, there
does not exist a cardinal number λ such that κ < λ < 2^κ .

Concepts review:
1. What does the Continuum hypothesis say? What does the negation
of the Continuum hypothesis say? Which one holds true in ZFC?
2. State the Generalized continuum hypothesis.
3. Define the class of all cardinal numbers.
4. Describe the finite cardinal numbers.
5. What symbol is used to represent the cardinality of the set R?
6. What symbol is used to represent the cardinality of N?
7. Which cardinal numbers are referred to as being transfinite cardinal
numbers?
8. How are the operations of addition, multiplication and exponentia-
tion of cardinal numbers defined?
9. Can the class of all cardinal numbers be referred to as a set? Why?

EXERCISES

A. 1. Show that there can be no largest cardinal number.

2. Show that for every finite cardinal number n, n ∈ ℵ0 .

B. 3. Prove that if S ↪e T and T ↪e M , then S ↪e M .

4. Find ∪{κ ∈ C : κ is a finite cardinal number}.
5. Determine the cardinal number equal to each of the following expressions:
a) c + ℵ0
b) ℵ0 × ℵ0
c) 2^ℵ0
d) ℵ0^2
6. What is the cardinality of the set 2^Z ?
7. What is the cardinality of the set N^N ?

C. 8. Prove that if S ⊆ T ⊆ M and S ∼e M , then S ∼e T .

9. Prove that ℵ0 + c = c + ℵ0 .
10. Prove that ℵ0 × c = c × ℵ0 .
11. Prove that ℵ0 < 2^ℵ0 .
12. Show that the class of all infinite sets is not a set.

23 / Addition and multiplication in C

Abstract. We have defined addition and multiplication of cardinal num-

bers in such a way that when adding or multiplying finite cardinal numbers,
we obtain precisely the same answers as the ones obtained when performing
these operations with natural numbers in the conventional way. In this sec-
tion we verify that these two operations are well-defined even when adding
and multiplying infinite cardinals. We then verify that addition and multi-
plication of cardinals satisfy, in many cases, the same properties as addi-
tion and multiplication of natural numbers. But not all of their properties
generalize from the natural numbers to infinite cardinal numbers.

23.1 Reviewing basic facts about C .

We have seen that every element, κ, of the class C is a set which repre-
sents all sets S such that S ∼e κ. Every ∼e -equivalence class in S contains
exactly one cardinal number. Those cardinal numbers which are finite sets
were declared to be the natural numbers. The elements of C are ordered by
the relation “<” where κ < λ if and only if κ ↪e λ, and κ ≤ λ if and only if
κ ↪e∼ λ. We cannot yet declare that ≤ linearly orders C since we have not
yet shown that ↪e∼ linearly orders S (although we certainly would like
this to be the case). We will now study the properties of the two operations,
+ and ×, previously defined on C .

23.2 Addition of cardinal numbers.

Addition of two cardinal numbers was defined in the last chapter as follows:
“If κ = |S| and λ = |T | where S ∩ T = ∅, then κ + λ is equal to the cardinal
number |S ∪ T |”. We can easily see that as long as S and T are disjoint finite
sets, addition of the cardinal numbers |S| and |T | is simply the number of
elements in the set obtained when we merge both sets into one.
Although we are sure that, under the given conditions, addition of finite
cardinals is well-defined, we should also verify that addition of any pair of
cardinal numbers is well-defined.

Theorem 23.1 Addition on C is well-defined. That is, if S1 , S2 , T1 and T2 are

sets such that κ = |S1 | = |S2 | and λ = |T1 | = |T2 | and S1 ∩ T1 = ∅ = S2 ∩ T2 ,
then |S1 ∪ T1 | = κ + λ = |S2 ∪ T2 |.

Proof:
What we are given: That S1 ∩ T1 = ∅ = S2 ∩ T2 , S1 and S2 are equipotent
and T1 and T2 are equipotent.
What we are required to show: That |S1 ∪ T1 | and |S2 ∪ T2 | are the same
cardinal number.
Since S1 , S2 and T1 , T2 are equipotent pairs, then there exist one-to-one
onto functions:

f : S1 → S2
g : T1 → T2
By definition of addition, we have

κ + λ = |S1 | + |T1 | = |S1 ∪ T1 |

κ + λ = |S2 | + |T2 | = |S2 ∪ T2 |

To prove that addition is well-defined, it suffices to show that |S1 ∪ T1 | =

|S2 ∪ T2 |, i.e., that S1 ∪ T1 and S2 ∪ T2 are equipotent:
− We define the function h : S1 ∪ T1 → S2 ∪ T2 as follows: h|S1 = f and
h|T1 = g.
− Since S1 and T1 are disjoint, h is well-defined. Since S2 and T2 are disjoint
and both f and g are one-to-one and onto, h is one-to-one and onto.
− So S1 ∪T1 and S2 ∪T2 are equipotent. Thus, |S1 ∪T1 | = |S2 ∪T2 | as required.
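
The gluing step in this proof can be sketched on finite sets. In the Python fragment below, functions are stored as dictionaries; the sets, the maps `f` and `g`, and the helpers `glue` and `is_bijection` are all assumptions made for the sake of illustration.

```python
def glue(f, g):
    """Return h with h|S1 = f and h|T1 = g (the domains must be disjoint)."""
    assert set(f).isdisjoint(g), "domains S1 and T1 must be disjoint"
    h = dict(f)
    h.update(g)
    return h


def is_bijection(h, domain, codomain):
    """Check that h maps domain one-to-one onto codomain."""
    return (set(h) == set(domain)
            and set(h.values()) == set(codomain)
            and len(set(h.values())) == len(h))


S1, T1 = {1, 2}, {3, 4, 5}
S2, T2 = {"a", "b"}, {"x", "y", "z"}
f = {1: "a", 2: "b"}             # one-to-one from S1 onto S2
g = {3: "x", 4: "y", 5: "z"}     # one-to-one from T1 onto T2

h = glue(f, g)
h_is_bijection = is_bijection(h, S1 | T1, S2 | T2)
```

If S2 and T2 were allowed to overlap, two distinct elements could be sent to the same value and `is_bijection` would fail, which is why the disjointness of both pairs is needed.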

We now verify that addition on C , thus defined, satisfies most of the basic
addition properties.

Theorem 23.2 Let κ, λ, φ and ψ be any four cardinal numbers. Then

a) κ + λ = λ + κ (Commutativity of addition)

b) (κ + λ) + φ = κ + (λ + φ) (Associativity of addition)

c) κ ≤ κ + λ
d) κ ≤ λ and φ ≤ ψ ⇒ κ + φ ≤ λ + ψ.

Proof:
a) Let S and T be disjoint sets such that κ = |S| and λ = |T |. To prove
that κ + λ = λ + κ, it suffices to prove that S ∪ T ∼e T ∪ S. This is left
as an exercise.
b) Let S, T and F be disjoint sets such that κ = |S|, λ = |T | and φ =
|F |. To prove that (κ + λ) + φ = κ + (λ + φ), it suffices to show that
(S ∪ T ) ∪ F ∼e S ∪ (T ∪ F ). This is left as an exercise.
c) Let S and T be disjoint sets such that κ = |S| and λ = |T |. Since S and
T are disjoint, we see that S can be mapped one-to-one into the subset
S of S ∪ T . Hence, κ ≤ κ + λ.

d) Let {S, F } and {T, P } be two pairs of disjoint sets such that κ = |S| ≤
λ = |T | and φ = |F | ≤ ψ = |P |. Then there exist one-to-one functions
f : S → T and g : F → P . Since S and F are disjoint, the function
h : S ∪ F → T ∪ P defined by h|S = f and h|F = g is well-defined.
Since T and P are disjoint, h is one-to-one. So S ∪ F ↪e∼ T ∪ P ; hence,
κ + φ ≤ λ + ψ.

On canceling out terms in addition. Not all addition properties which hold
true for finite cardinals extend to infinite cardinals. For example, for finite
cardinals m, n, k the statement
(m + n = m + k) ⇒ n = k

is always true. But for arbitrary cardinals κ, λ, ψ, if κ + λ = κ + ψ, it does

not necessarily follow that λ = ψ. Recall that the statement

“If κ ≥ ℵ0 and λ ≤ ℵ0 , then κ = κ + λ”

was shown to be true (Theorem 20.6). For example, c + 0 = c + ℵ0 does not
imply that ℵ0 is 0. Even though, for a non-zero finite cardinal n, the equality
n = n + n is false, we will soon show that for any infinite cardinal κ, it is
always true that κ = κ + κ.
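
The simplest instance of this failure of cancellation is 1 + ℵ0 = ℵ0 = 0 + ℵ0 , even though 1 ≠ 0. The Python sketch below writes down an explicit map from the tagged disjoint union of a one-element set and N into N, together with its inverse. A computer can only test finitely many values, so this is an illustration rather than a proof; the names `shift`, `inverse` and the label "new" are hypothetical.

```python
def shift(x):
    """Map the tagged disjoint union ({'new'} x {0}) ∪ (N x {1}) into N."""
    value, tag = x
    if tag == 0:          # the single extra element
        return 0
    return value + 1      # the natural number n goes to n + 1


def inverse(m):
    """Inverse of shift: recover the tagged element sent to m."""
    return ("new", 0) if m == 0 else (m - 1, 1)


sample = [("new", 0)] + [(n, 1) for n in range(1000)]
images = [shift(x) for x in sample]
```

Because `inverse` undoes `shift` in both directions, the map is one-to-one and onto, so adding one element to N does not change its cardinality.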

23.3 Multiplication of cardinal numbers.

As previously stated, multiplication of two cardinal numbers is defined as
being the cardinal number of the Cartesian product of the sets they repre-
sent. Just as we have done for addition, we confirm that multiplication of
cardinal numbers is well-defined.

Theorem 23.3 Multiplication on C is well-defined. That is, if S1 , S2 , T1 and

T2 are sets such that κ = |S1 | = |S2 | and λ = |T1 | = |T2 |, then

|S1 × T1 | = κ × λ = |S2 × T2 |
Proof:
What we are given: That S1 and S2 are equipotent and T1 and T2 are equipo-
tent.
What we are required to show: That |S1 × T1 | and |S2 × T2 | are the same
cardinal number.
Since S1 , S2 and T1 , T2 are equipotent pairs, then there exist one-to-one
onto functions:

f : S1 → S2
g : T1 → T2
By definition of multiplication, we have

κ × λ = |S1 | × |T1 | = |S1 × T1 |

κ × λ = |S2 | × |T2 | = |S2 × T2 |

To prove that multiplication is well-defined, it suffices to show that |S1 ×

T1 | = |S2 × T2 |, i.e., that S1 × T1 and S2 × T2 are equipotent:
· We define the function h : S1 ×T1 → S2 ×T2 as follows: h(s, t) = (f(s), g(t)).
· Since both f and g are one-to-one and onto, then h is a well-defined one-
to-one and onto function.¹
· So S1 × T1 and S2 × T2 are equipotent.
So multiplication is well-defined.
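
The map h(s, t) = (f(s), g(t)) used in this proof can be checked on a small example. The sets and the bijections `f` and `g` below are assumptions chosen for illustration, with functions stored as Python dictionaries.

```python
from itertools import product

S1, T1 = {0, 1}, {"p", "q", "r"}
f = {0: "zero", 1: "one"}            # one-to-one from S1 onto S2
g = {"p": 10, "q": 20, "r": 30}      # one-to-one from T1 onto T2
S2, T2 = set(f.values()), set(g.values())


def h(pair):
    """h(s, t) = (f(s), g(t))."""
    s, t = pair
    return (f[s], g[t])


domain = set(product(S1, T1))        # S1 x T1
image = {h(p) for p in domain}       # h[S1 x T1]
```

The image of h is all of S2 × T2 and has as many elements as S1 × T1, so h is one-to-one and onto in this instance.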

We now describe and prove a few of the most basic multiplication properties
on C . We will see that most (but not all) of the multiplication properties
which hold true for the natural numbers extend to infinite cardinal numbers.

Theorem 23.4 Let κ, λ, φ and ψ be any four cardinal numbers. Then

a) κ × λ = λ × κ. (Commutativity of multiplication)

b) (κ × λ) × φ = κ × (λ × φ). (Associativity of multiplication)

c) κ × (λ + φ) = (κ × λ) + (κ × φ). (Left-hand distributivity)

d) λ > 0 ⇒ κ ≤ (κ × λ).
e) κ ≤ λ and φ ≤ ψ ⇒ κ × φ ≤ λ × ψ.
f) κ + κ = 2 × κ.
g) κ + κ ≤ κ × κ when κ ≥ 2.

1 To see this, note that (f(s1 ), g(t1 )) = (f(s2 ), g(t2 )) implies that f(s1 ) = f(s2 ) and
g(t1 ) = g(t2 ) which implies that s1 = s2 and t1 = t2 ⇒ (s1 , t1 ) = (s2 , t2 ).

Proof:
a) Let S and T be sets such that κ = |S| and λ = |T |.
What we are required to show: That κ × λ = λ × κ.
To attain this result, it suffices to show that S × T ∼e T × S.
Let h : S × T → T × S be defined as h(s, t) = (t, s). Now

(s, t) = (a, b) ⇔ s = a and t = b

⇔ (t, s) = (b, a)
⇔ h(s, t) = h(a, b)

Then h is a one-to-one function.

Also, if (u, v) ∈ T × S, then (v, u) ∈ S × T and h(v, u) = (u, v). So h is
onto T × S.
We conclude that S × T ∼e T × S, as required.

b) Let S, T and F be sets such that κ = |S|, λ = |T | and φ = |F |.

What we are required to show: That (κ × λ) × φ = κ × (λ × φ).
To attain this result, it suffices to show that (S ×T )×F ∼e S ×(T ×F ).
This is proven in Theorem 4.9.
c) Let S, T and F be sets such that κ = |S|, λ = |T | and φ = |F |. Without
loss of generality, suppose T and F are disjoint.
What we are required to show: That multiplication of cardinal numbers
is left-hand distributive, i.e., κ × (λ + φ) = (κ × λ) + (κ × φ).
By definition, λ + φ = |T ∪ F |, κ × λ = |S × T | and κ × φ = |S × F |. So

κ × (λ + φ) = |S × (T ∪ F )|
= |(S × T ) ∪ (S × F )| (By Theorem 4.7 (b) ).
= |S × T | + |S × F | (Since T and F are disjoint ⇒ S × T and S × F are disjoint).
= (κ × λ) + (κ × φ)

d) Let S and T be sets such that κ = |S| and λ = |T |. Since λ > 0, T is non-empty; let a ∈ T .

The property κ ≤ (κ × λ) follows from the fact that the function
h : S → S × T defined as h(s) = (s, a) maps S one-to-one onto
S × {a} ⊆ S × T . The details are left as an exercise.
e) Let S, T , F and P be sets such that κ = |S|, λ = |T |, φ = |F | and
ψ = |P |. Suppose f : S → T maps S one-to-one into T and g : F → P
maps F one-to-one into P . That is, suppose that κ ≤ λ and φ ≤ ψ
What we are required to show: That κ × φ ≤ λ × ψ.
To show this, it suffices to exhibit a function h : S × F → T × P which
maps S × F one-to-one into T × P .
The function h : S × F → T × P defined as h(s, u) = (f(s), g(u)) can
be shown to be one-to-one. This is left as an exercise.
f) What we are given: That S is a set such that κ = |S| and 2 is the
cardinal number of the set {0, 1}.

What we are required to show: That κ + κ = 2 × κ.

Note that κ = |S| implies κ = |S ×{0}| = |S ×{1}|. Then, by definition,

κ+κ = |(S × {0}) ∪ (S × {1})| (Since S × {0} and S × {1} are disjoint).

= |S × ({0} ∪ {1})| (By Theorem 4.7 (b) ).

= |S × {0, 1}|
= |{0, 1} × S| (Since S × {0, 1} and {0, 1} × S are equipotent).
= 2×κ

g) What we are given: That S is a set such that κ = |S| ≥ 2.

What we are required to show: That κ + κ ≤ κ × κ.
By part (f), κ + κ = 2 × κ. So it suffices to show that 2 × κ ≤ κ × κ.
Since 2 ≤ κ = |S|, then 2 = {0, 1} is embedded in S. Let f : {0, 1} → S
be a one-to-one function mapping {0, 1} into S. Then the function
h : {0, 1} × S → S × S defined as h(i, s) = (f(i), s) can be seen as
being one-to-one. Showing this is left as an exercise. It follows that
2 × κ ≤ κ × κ. So κ + κ ≤ κ × κ, as required.
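
The set identities behind parts (c) and (f) can be checked on finite sets. The Python sketch below is an illustration only, with sets chosen arbitrarily; it compares S × (T ∪ F ) with (S × T ) ∪ (S × F ), and (S × {0}) ∪ (S × {1}) with S × {0, 1}.

```python
from itertools import product

S = {"a", "b", "c"}
T = {1, 2}
F = {3, 4, 5, 6}          # T and F are disjoint, as in part (c)

# Part (c): S x (T ∪ F) and (S x T) ∪ (S x F) are the same set.
left = set(product(S, T | F))
right = set(product(S, T)) | set(product(S, F))

# Part (f): (S x {0}) ∪ (S x {1}) is the same set as S x {0, 1}.
doubled = set(product(S, {0})) | set(product(S, {1}))
two_times = set(product(S, {0, 1}))
```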

Concepts review:
1. How do we go about showing that addition and multiplication of
cardinal numbers are “well-defined”?
2. Is addition of cardinal numbers commutative? Is it associative?
3. Is multiplication of cardinal numbers commutative? Is it associa-
tive?
4. Does λ + κ = λ + ψ imply κ = ψ? If so why? If not, give an example
showing why not.

EXERCISES

A. 1. Show that |S| ≤ |S^T | for any set S and non-empty set T .

B. 2. Let κ, λ, φ and ψ be any four cardinal numbers. Show the details of the
proofs of the following statements:
a) κ + λ = λ + κ.
b) (κ + λ) + φ = λ + (κ + φ).
c) κ ≤ λ and φ ≤ ψ ⇒ κ + φ ≤ λ + ψ.
3. Let κ, λ, φ and ψ be any four cardinal numbers. Show the details of the
proofs of the following statements:

a) κ × λ = λ × κ.
b) (κ × λ) × φ = λ × (κ × φ).
c) λ > 0 ⇒ κ ≤ (κ × λ).
d) κ ≤ λ and φ ≤ ψ ⇒ κ × φ ≤ λ × ψ.
e) κ + κ ≤ κ × κ when κ ≥ 2.
4. Show that 2^ℵ0 = |R − N|.
5. Show that for any cardinal number κ, κ + κ + κ + κ = 4 × κ.
6. Let n be a finite cardinal number. Prove that:
a) n + ℵ0 = ℵ0 .
b) n × ℵ0 = ℵ0 .
c) n + 2^ℵ0 = 2^ℵ0 .
d) n × 2^ℵ0 = 2^ℵ0 .
e) ℵ0 + 2^ℵ0 = 2^ℵ0 .
f) ℵ0 × 2^ℵ0 = 2^ℵ0 .

C. 7. Prove that if κ × λ = 0, then either κ = 0 or λ = 0.

8. Prove that if κ × λ = 1, then either κ = 1 or λ = 1.
9. Prove that if κ × λ = ℵ0 , then either κ = ℵ0 or λ = ℵ0 .

24 / Exponentiation of cardinal numbers

Abstract. In this section we show that cardinal exponentiation is well-
defined. We then prove three of the most basic properties of cardinal ex-
ponentiation as well as inequalities involving cardinal exponentiation. We
also show that |R^R | = c^c = 2^c .

24.1 Cardinal exponentiation.

Given two sets, A and B, we have seen that the expression A^B represents
the set of all functions mapping B into A. This means that every element,
f ∈ A^B , is a subset of B × A. Since f is a function, whenever two
ordered pairs of the form (x, u) and (x, y) both belong to f, then y = u. We see
that f ∈ P(B × A); hence, A^B ⊂ P(B × A). If both A and B are finite, we more
easily understand the use of the notation A^B to represent this set. Suppose
A = {a, b, c}, a three-element set, and B = {0, 1}, a two-element set. We
then list the functions in the set A^B :

f1 : {(0, a), (1, a)}

f2 : {(0, a), (1, b)}
f3 : {(0, a), (1, c)}
f4 : {(0, b), (1, b)}
f5 : {(0, b), (1, a)}
f6 : {(0, b), (1, c)}
f7 : {(0, c), (1, c)}
f8 : {(0, c), (1, a)}
f9 : {(0, c), (1, b)}

There are precisely nine elements in A^B . Or, we can say that the cardinality
|A^B | of A^B is |A|^|B| = 3^2 = 9. So the notation A^B is designed to remind
us of the number of elements contained in such sets when the sets A and B
are finite. For convenience, this notation is maintained for sets of all cardi-
nalities. In this section we try to develop a few rules that will help simplify
expressions involving exponentiation of infinite cardinals. We will soon see
that cardinal exponentiation is a considerably more complex operation than
the cardinal addition and multiplication operations.
We remind ourselves of the formal definition of cardinal number exponenti-
ation:
If κ and λ are the cardinal numbers of the non-empty sets A and
B we define κ^λ = |A|^|B| = |A^B |. For convenience we define
0^λ = 0 (for λ ≠ 0) and κ^0 = 1.

We will begin by showing that exponentiation of cardinal numbers is
well-defined.

Theorem 24.1 Exponentiation on C is well-defined. That is, if S, S*, T and
T* are sets such that |S| = |S*| and |T| = |T*|, then |S^T| = |(S*)^(T*)|.

Proof:
What we are given: The sets S and S* are equipotent, as are the sets T
and T*.
What we are required to prove: That S^T and (S*)^(T*) are equipotent.
Since S ∼e S* and T ∼e T* there exist one-to-one onto functions α : T → T*
and β : S → S*.

If g ∈ S^T define

φ(g) = {(α(t), β(g(t))) : t ∈ T} ∈ P(T* × S*)

We claim that for any g ∈ S^T, φ(g) ∈ (S*)^(T*):
First note that (α(t), β(g(t))) ∈ T* × S* for all t ∈ T and that the
domain of {(α(t), β(g(t))) : t ∈ T} is T* = α[T]. If (α(a), β(g(a))) and
(α(b), β(g(b))) are elements of φ(g) such that β(g(a)) ≠ β(g(b)), then
g(a) ≠ g(b) (since β is a one-to-one function on S). Since g ∈ S^T, a ≠ b.
Since α : T → T* is one-to-one, α(a) ≠ α(b). We have shown that
β(g(a)) ≠ β(g(b)) implies that α(a) ≠ α(b). So φ(g) is a function whose
domain is T* with range contained in S*. Then, for any g ∈ S^T,
φ(g) ∈ (S*)^(T*) as claimed.
We claim φ : S^T → (S*)^(T*) is a one-to-one function on S^T:
Since φ associates to any g ∈ S^T an element φ(g) in (S*)^(T*), φ has domain
S^T and range contained in (S*)^(T*). Suppose h, g ∈ S^T.

h ≠ g ⇔ ∃ u ∈ T such that h(u) ≠ g(u)
⇔ β(g(u)) ≠ β(h(u)) (Since β is one-to-one.)
⇔ (α(u), β(g(u))) ≠ (α(u), β(h(u)))
⇔ {(α(t), β(g(t))) : t ∈ T} ≠ {(α(t), β(h(t))) : t ∈ T}
⇔ φ(g) ≠ φ(h)

So φ : S^T → (S*)^(T*) is one-to-one on S^T as claimed.
Since S^T is embedded in (S*)^(T*), then |S^T| ≤ |(S*)^(T*)|.
Using the same arguments but replacing α : T → T* with α^(−1) : T* → T
and β : S → S* with β^(−1) : S* → S we can show that (S*)^(T*) is embedded
in S^T. Then |(S*)^(T*)| ≤ |S^T|.
By the Schröder-Bernstein theorem, |(S*)^(T*)| ≤ |S^T| and
|S^T| ≤ |(S*)^(T*)| implies that |S^T| = |(S*)^(T*)| as required.
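For finite sets, the map φ used in the proof can be checked directly. The sketch below is only an illustration under stated assumptions: the sets S, S*, T, T* and the bijections alpha and beta are small hypothetical examples, and functions are stored as Python dictionaries.

```python
from itertools import product

def all_functions(domain, codomain):
    """All functions domain -> codomain, each stored as a dict."""
    domain = list(domain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]

def transport(g, alpha, beta):
    """phi(g) = {(alpha(t), beta(g(t))) : t in T}, stored as a dict T* -> S*."""
    return {alpha[t]: beta[g[t]] for t in g}

# Hypothetical equipotent pairs: S ~ S_star and T ~ T_star.
S, S_star = [0, 1, 2], ["x", "y", "z"]
T, T_star = ["p", "q"], [10, 20]
beta = dict(zip(S, S_star))     # one-to-one from S onto S_star
alpha = dict(zip(T, T_star))    # one-to-one from T onto T_star

images = [transport(g, alpha, beta) for g in all_functions(T, S)]
targets = all_functions(T_star, S_star)

# Every phi(g) is a function from T_star into S_star ...
print(all(image in targets for image in images))
# ... and distinct g give distinct phi(g).
print(len({frozenset(image.items()) for image in images}) == len(images))
```

In the finite case the two sets have the same number of elements, so the embedding is automatically onto; in the general case the proof instead appeals to the Schröder-Bernstein theorem.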

24.2 Three basic identities involving cardinal exponentiation.

We now verify that exponentiation in C satisfies the usual three basic ex-
ponential properties.

Theorem 24.2 Let κ, λ and φ be any three cardinal numbers. Then

a) κ^(λ+φ) = κ^λ × κ^φ.
b) (κ^λ)^φ = κ^(λ×φ).
c) (κ × λ)^φ = κ^φ × λ^φ.
Proof:
What we are given: That κ, λ and φ are cardinal numbers where κ = |S|,
λ = |T| and φ = |U| for sets S, T and U.
a) κ^(λ+φ) = κ^λ × κ^φ (where it is assumed that T ∩ U = ∅):
What we are required to show: That |S^T × S^U| = |S^(T∪U)| = |S|^|T∪U|.
It suffices to show that S^T × S^U ∼e S^(T∪U).
Let (f, g) ∈ S^T × S^U. Let h_{f,g} : T ∪ U → S be the function defined
as:

h_{f,g}(x) = f(x) if x ∈ T
h_{f,g}(x) = g(x) if x ∈ U

We claim that h_{f,g} ∈ S^(T∪U): If h_{f,g}(a) ≠ h_{f,g}(b), then either
f(a) ≠ f(b), g(a) ≠ g(b) or f(a) ≠ g(b). Since f and g are functions and
T ∩ U = ∅, in each of these three cases, a ≠ b. Then h_{f,g} ∈ S^(T∪U).

Define the function φ : S^T × S^U → S^(T∪U) as

φ(f, g) = h_{f,g}

We claim that φ maps S^T × S^U one-to-one onto S^(T∪U):

− The function φ is well-defined: Since T ∩ U = ∅, each pair (f, g) in
S^T × S^U assigns exactly one value h_{f,g}(x) to every x ∈ T ∪ U.
So (f, g) = (k, r) implies h_{f,g} = h_{k,r}.
− The function φ is onto S^(T∪U): Suppose t ∈ S^(T∪U). Then t|T ∈ S^T and
t|U ∈ S^U and, since T and U are disjoint sets,
φ(t|T, t|U)(x) = h_{t|T, t|U}(x) = t(x) for all x ∈ T ∪ U. So
φ(t|T, t|U) = t.
− The function φ is one-to-one: Suppose (f, g) and (k, t) are distinct
elements of S^T × S^U. We are required to show that φ(f, g) ≠ φ(k, t).
Suppose
φ(f, g) = h_{f,g} = h_{k,t} = φ(k, t)
Then f = k on T and g = t on U. That is, f = k in S^T and g = t in S^U.
Thus, (f, g) = (k, t) in S^T × S^U, a contradiction. Then
φ(f, g) ≠ φ(k, t) as claimed.
We conclude that S^T × S^U and S^(T∪U) are equipotent. Hence,
κ^λ × κ^φ = |S|^|T| × |S|^|U| = |S^T × S^U| = |S^(T∪U)| = |S|^|T∪U| = κ^(λ+φ)

b) (κ^λ)^φ = κ^(λ×φ):
What we are required to show: That S^(T×U) and (S^T)^U are equipotent.
For each u ∈ U and f ∈ S^(T×U) we define the function f_u : T → S in S^T
as
f_u(t) = f|_{T×{u}}(t, u) ∈ S
Then for each u ∈ U, f_u maps T into S. That is,
{f_u : f ∈ S^(T×U), u ∈ U} ⊆ S^T

Given f ∈ S^(T×U) define the function φ_f : U → S^T as follows:

φ_f(u) = f_u

We claim that for each f, φ_f is a well-defined function mapping U into
S^T:
φ_f(u1) ≠ φ_f(u2) ⇒ f_{u1} ≠ f_{u2}
⇒ f|_{T×{u1}}(t, u1) ≠ f|_{T×{u2}}(t, u2), for some t ∈ T.
⇒ f(t, u1) ≠ f(t, u2), for some t ∈ T.
⇒ u1 ≠ u2
So φ_f : U → S^T is well-defined, as claimed.

We define the function ψ : S^(T×U) → (S^T)^U as

ψ(f) = φ_f ∈ (S^T)^U
We claim that ψ is a one-to-one function on S^(T×U):
f ≠ g ⇔ f(t0, u0) ≠ g(t0, u0) for some pair (t0, u0) ∈ T × U.
⇔ f|_{T×{u0}}(t0, u0) ≠ g|_{T×{u0}}(t0, u0)
⇔ f_{u0}(t0) ≠ g_{u0}(t0)
⇔ f_{u0} ≠ g_{u0}
⇔ φ_f ≠ φ_g
⇔ ψ(f) ≠ ψ(g)

So ψ is a one-to-one function on S^(T×U) as claimed.

We claim that the function ψ is onto (S^T)^U:
Let φ be a function in (S^T)^U. Then φ(u) ∈ S^T for all u in its domain
U.
We are required to exhibit a function f ∈ S^(T×U) such that ψ(f) = φ.
That is, we must find a function f ∈ S^(T×U) such that φ_f(u) = f_u =
φ(u) for all u ∈ U.
For each u ∈ U, φ(u) is an element in S^T. Define, for each u ∈ U,
the function g_u mapping T × {u} into S as

g_u(t, u) = [φ(u)](t), ∀t ∈ T

Note that T × U = ∪{T × {u} : u ∈ U}, the union of a collection of
pairwise disjoint sets. Define f : T × U → S as follows:

f = ∪{g_u : u ∈ U}

Since the respective domains of the g_u's are pairwise disjoint and
their union is all of T × U, then f : T × U → S is well-defined. Then
g_u = f|_{T×{u}} and so f_u = φ(u) for each u ∈ U. So φ_f(u) = f_u = φ(u)
for each u ∈ U. That is, φ = φ_f = ψ(f).
Hence, the function ψ is onto (S^T)^U, as claimed.
So the sets S^(T×U) and (S^T)^U are equipotent. We conclude that
(κ^λ)^φ = κ^(λ×φ).

c) (κ × λ)^φ = κ^φ × λ^φ:
What we are required to show: That S^U × T^U and (S × T)^U are
equipotent.
We define the function φ : S^U × T^U → (S × T)^U as follows:

φ(f, g) = h

where f ∈ S^U, g ∈ T^U and h : U → S × T is defined as
h(u) = (f(u), g(u)). That is, φ(f, g)(u) = h(u) = (f(u), g(u)) for all
u ∈ U.
We claim that φ maps S^U × T^U one-to-one onto (S × T)^U:
The function φ is onto (S × T)^U: Suppose h ∈ (S × T)^U. Then
h : U → S × T. For any u ∈ U, h(u) = (h1(u), h2(u)), where h1 ∈ S^U
and h2 ∈ T^U are the coordinate functions of h. Then
φ(h1, h2) = h; so φ maps S^U × T^U onto (S × T)^U.
The function φ is one-to-one: Let (f1, g1) and (f2, g2) be pairs of
functions in S^U × T^U and q and r be functions mapping U into
S × T defined as q(u) = (f1(u), g1(u)) and r(u) = (f2(u), g2(u)).
Then

φ(f1, g1) ≠ φ(f2, g2) ⇔ q ≠ r
⇔ q(u) ≠ r(u), for some u ∈ U
⇔ (f1(u), g1(u)) ≠ (f2(u), g2(u)), for some u ∈ U
⇔ f1(u) ≠ f2(u) or g1(u) ≠ g2(u), for some u ∈ U
⇔ f1 ≠ f2 or g1 ≠ g2
⇔ (f1, g1) ≠ (f2, g2)

Then the function φ is one-to-one.

We conclude that φ maps S^U × T^U one-to-one onto (S × T)^U and so
these two sets are equipotent.
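The three bijections constructed in this proof can be tested on small finite sets. The following sketch is an illustration only: the sets S, T and U are hypothetical examples (with T and U disjoint), and each function is stored as a sorted tuple of (input, output) pairs so that functions can themselves be elements of sets.

```python
from itertools import product

def funcs(domain, codomain):
    """All functions domain -> codomain, each a sorted tuple of (input, output) pairs."""
    dom = sorted(domain)
    return {tuple(zip(dom, values))
            for values in product(sorted(codomain), repeat=len(dom))}

S = {0, 1}
T = {"t1", "t2"}
U = {"u1", "u2", "u3"}   # disjoint from T

# a) Glue (f, g) in S^T x S^U into the single function h_{f,g} on T ∪ U.
glued = {tuple(sorted(f + g)) for f in funcs(T, S) for g in funcs(U, S)}
identity_a = glued == funcs(T | U, S)

# b) Send f in S^(T x U) to the function u -> f_u, where f_u(t) = f(t, u).
def curry(f):
    table = dict(f)
    return tuple((u, tuple((t, table[(t, u)]) for t in sorted(T)))
                 for u in sorted(U))

TxU = {(t, u) for t in T for u in U}
curried = {curry(f) for f in funcs(TxU, S)}
identity_b = curried == funcs(U, funcs(T, S))

# c) Send (f, g) in S^U x T^U to the function u -> (f(u), g(u)).
SxT = {(s, t) for s in S for t in T}
paired = {tuple((u, (s, t)) for (u, s), (_, t) in zip(f, g))
          for f in funcs(U, S) for g in funcs(U, T)}
identity_c = paired == funcs(U, SxT)

print(identity_a, identity_b, identity_c)
```

Each equality of sets shows that the corresponding map lands in, and is onto, the target set. Comparing the number of images with the number of inputs then confirms that the map is one-to-one in these finite cases.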

The following example shows how these identities can help simplify the com-
putation of cardinal exponentials.
Find the cardinality of R^R.

Solution:
|R^R| = c^c
= (2^ℵ0)^c
= 2^(ℵ0×c)

The function f(x) = (0, x) ∈ {0} × R embeds R in N × R; hence, c ≤ ℵ0 × c.

Since ℵ0 < c, then, by Theorem 23.4 (e), ℵ0 × c ≤ c × c. By Corollary 20.4,
c × c = c. We then have

c ≤ ℵ0 × c ≤ c × c = c

which implies ℵ0 × c = c. Then

|R^R| = 2^(ℵ0×c) = 2^c

24.3 A few basic inequalities involving cardinal exponentiation.

We verify a few more basic cardinal exponentiation properties.

Theorem 24.3 Let κ, λ, and α be infinite cardinal numbers. Then

a) κ ≤ κ^λ.
b) α ≤ κ ⇒ α^λ ≤ κ^λ.
c) α ≤ λ ⇒ κ^α ≤ κ^λ.

Proof:
What we are given: That κ = |K|, α = |A| and λ = |L|.
a) κ ≤ κ^λ:
What we are required to show: That K is embedded in K^L.
Define the function f : K → K^L as follows: f(k) is the unique function
in {k}^L ⊂ K^L. Note that {k}^L contains only one function; it maps all
elements of L to the single element k. Since k ≠ t implies that the
constant functions f(k) and f(t) are distinct, the function f is
one-to-one. Since f embeds K in K^L, then κ ≤ κ^λ.
b) α ≤ κ ⇒ α^λ ≤ κ^λ:
What we are also given: That A is embedded in K.
What we are required to show: That A^L is embedded in K^L.
Suppose the function f : A → K embeds A into K. For each g ∈ A^L define
φ(g) = {(l, f(g(l))) : l ∈ L} ⊆ L × K
We claim that φ(g) ∈ K^L:
If (a, f(g(a))) and (b, f(g(b))) are elements of φ(g) such that
f(g(a)) ≠ f(g(b)), then g(a) ≠ g(b) (since f is a function mapping A
to K). Since (a, g(a)) and (b, g(b)) both belong to g ∈ A^L, then a ≠ b
and so φ(g) is a function in K^L as claimed.
We claim φ : A^L → K^L is one-to-one:
Suppose h, g ∈ A^L.

h ≠ g ⇔ ∃ u ∈ L such that h(u) ≠ g(u)
⇔ f(g(u)) ≠ f(h(u)) since f : A → K is one-to-one.
⇔ (u, f(g(u))) ≠ (u, f(h(u)))
⇔ {(l, f(g(l))) : l ∈ L} ≠ {(l, f(h(l))) : l ∈ L}
⇔ φ(g) ≠ φ(h)

So φ : A^L → K^L is one-to-one as claimed. Since φ : A^L → K^L embeds
A^L into K^L, then α^λ ≤ κ^λ as required.
c) [AC] α ≤ λ ⇒ κ^α ≤ κ^λ:
What we are also given: That A is embedded in L.
What we are required to show: That K^A is embedded in K^L.

Suppose the function f : A → L embeds A into L. For each g ∈ K^A define

φ(g) = {(f(a), g(a)) : a ∈ A} ⊆ f[A] × K ⊆ L × K

We claim that φ(g) ∈ K^f[A]:
If (f(a), g(a)) and (f(b), g(b)) are elements of φ(g) such that
g(a) ≠ g(b), then a ≠ b (since g ∈ K^A). Since f is one-to-one mapping A
into L, a ≠ b implies f(a) ≠ f(b). Then φ(g) ∈ K^f[A] as claimed.
We claim φ : K^A → K^f[A] is one-to-one:
Suppose h, g ∈ K^A.

h ≠ g ⇔ ∃ u ∈ A such that h(u) ≠ g(u)
⇔ (f(u), g(u)) ≠ (f(u), h(u))
⇔ {(f(a), g(a)) : a ∈ A} ≠ {(f(a), h(a)) : a ∈ A}
⇔ φ(g) ≠ φ(h)

So φ : K^A → K^f[A] is one-to-one as claimed.

For each function h ∈ K^f[A] the Axiom of choice allows us to choose
a function h* ∈ K^L such that h*|f[A] = h. We define the function
φ* : K^A → K^L as φ*(g) = φ(g)*. Since φ(g) is recovered from φ*(g) by
restricting φ*(g) to f[A], and since φ : K^A → K^f[A] is one-to-one,
then φ* : K^A → K^L is one-to-one.
We conclude that K^A is embedded in K^L. This implies κ^α ≤ κ^λ as
required.

Concepts review:
1. If κ and λ are two cardinal numbers, how is the expression κ^λ
defined?
2. What are the three basic identities for cardinal exponentiation
stated and proved in this section?
3. What are the three basic inequalities for cardinal exponentiation
stated and proved in this section?

EXERCISES

A. 1. Show that for any cardinal number κ:

a) κ^1 = κ.
b) 1^κ = 1.
c) κ^0 = 1.
d) 0^κ = 0, if κ > 0.

B. 2. Show that for any finite cardinal number n and any cardinal number κ:
a) (2^ℵ0)^n = 2^ℵ0
b) ℵ0^n = ℵ0
c) ℵ0^ℵ0 = 2^ℵ0
d) n^ℵ0 = 2^ℵ0.
e) (2^ℵ0)^ℵ0 = 2^ℵ0
3. Let κ be an infinite cardinal number. Suppose |K| = κ and that {Ki : i ∈
K} is a set of pairwise disjoint sets Ki each of which has cardinality κ.
Show that | ∪ {Ki : i ∈ K}| = κ.

C. 4. Show that if the cardinal number κ ≠ 1, then κ^κ ≠ κ.

5. Show that 2^ℵ0 < 2^(2^ℵ0).
1
25 / On sets of cardinality c
Abstract. In this section we examine a few sets whose cardinality equals
the cardinality of R. In particular, we determine the cardinality of the set
of all one-to-one functions mapping N to R and the cardinality of the set of
all continuous real-valued functions. We also study the well-known Cantor
set, discuss its construction and prove that it has cardinality c.

25.1 Sets related to R.

It can sometimes be challenging to determine the cardinality of certain types
of sets, depending on how they are defined. Many of the cardinal arithmetic
principles presented in the last section will help us determine the cardinality
of a few sets associated to R.
The cardinality of finite products of the reals, of the complex numbers and
the irrational numbers is discussed in the following theorem.

Theorem 25.1 Let C denote the set of all complex numbers and J denote
the set of all irrational numbers. Let n denote the cardinality of a non-empty
finite set.
a) The cardinality of R^n is c.
b) The cardinality of C is c.
c) The cardinality of J is c.

Proof:
a) |R^n| = c:
To prove that |R^n| = c, it suffices to show that c^n = c.
We will prove this by mathematical induction.
What we are given: That n is a natural number greater than zero.
What we are required to show: That c^n = c.
Let P(n) be the statement “c^n = c”.
− Base case: Trivially, P(1) holds true (c^1 = c was previously proven).
− Inductive hypothesis: Suppose P(n) holds true. That is, suppose
c^n = c. Then

c^(n+1) = c^n × c^1 (Theorem 24.2 (a).)
= c × c (By the inductive hypothesis.)
= c (Corollary 20.4)
1 Understanding the subject matter in this section requires that the reader has a more

solid background in mathematics than the one required for the previous sections.
248 Section 25: On sets of cardinality c

So by mathematical induction, c^n = c for all finite non-zero cardinals
n. So |R^n| = c.

b) |C| = c:
Define the function f : R^2 → C as f(a, b) = a + bi. The function f is
easily shown to be one-to-one and onto C. So R^2 and C are equipotent. It
follows from part (a) that |C| = |R^2| = c.

c) |J| = c :
Suppose κ = |J|. Since J ∪ Q = R and J ∩ Q = ∅, then

|J ∪ Q| = |J| + |Q|
= |R|
= c

Hence κ + ℵ0 = c. If κ ≤ ℵ0, then κ + ℵ0 = ℵ0 ≠ c. So ℵ0 < κ. That is, κ
is an uncountable cardinal. By Theorem 20.6, κ + ℵ0 = κ. Hence, κ = c. That
is, |J| = c as required.

25.2 Cardinality of sets of sequences and functions.

Sets of sequences and functions are more abstract in nature. This sometimes
makes it more difficult to determine their cardinality. The following theorem
illustrates a few strategies that can be used to determine the cardinality of
such sets. In the proof of the following theorem we will invoke the statement,
N × N ∼ N, proven in Theorem 19.4.

Theorem 25.2

a) Let S_R denote the set of all countably infinite sequences of real numbers.
Then the cardinality of S_R is c.
b) Let S_N denote the set of all countably infinite sequences of natural
numbers. Then the cardinality of S_N is c.
c) Let N^N_(1−1) denote the set of all one-to-one functions mapping N to N.
Then the cardinality of N^N_(1−1) is c.
d) Let R^N_(1−1) denote the set of all one-to-one functions mapping N to R.
Then the cardinality of R^N_(1−1) is c.
Proof:
a) |S_R| = c:
A sequence of real numbers {a0, a1, a2, . . .} is a function s : N → R
mapping each natural number i ∈ N to ai ∈ R. So each infinite sequence
{a0, a1, a2, . . .} is associated to a unique function s : N → R. So the set of
all infinite sequences of real numbers can be represented by R^N. Then
|R^N| = |R|^|N| (By Definition 22.3)
= (2^ℵ0)^ℵ0 = 2^(ℵ0×ℵ0) (By Theorem 24.2)
= 2^|N×N|
= 2^|N| = 2^ℵ0 (By Theorem 19.4)
= |2^N| = c
Then |S_R| = |R^N| = c.
b) |S_N| = c:
A sequence of natural numbers {a0, a1, a2, . . .} is a function f : N → N
mapping each natural number i ∈ N to ai ∈ N. So the set of all infinite
sequences of natural numbers can be represented by N^N. The cardinality
of the set of all infinite sequences of natural numbers is then |N^N|. Note
that
f ∈ N^N ⇒ f ⊆ N × N
⇒ f ∈ P(N × N)
Hence N^N ⊆ P(N × N).
Then
c = 2^ℵ0
≤ ℵ0^ℵ0 (By Theorem 24.3 (b).)
= |N^N|
≤ |P(N × N)|
= |P(N)| (N × N ∼ N followed by Theorem 20.7.)
= |R| = c
We conclude that |N^N| = |S_N| = c.
c) |N^N_(1−1)| = c:

Let f ∈ N^N. Consider the set

S_f = {(n, (n, f(n))) : n ∈ N} ⊂ N × (N × N)
We claim that S_f is a one-to-one function mapping N into N × N:
If (n, (a, b)) and (n, (c, d)) belong to S_f, then a = n = c and, since f
is a function, b = f(n) = d; hence, (a, b) = (c, d). So S_f is a function
mapping N into N × N. If (a, (n, f(n))) and (b, (n, f(n))) belong to S_f,
then a = b = n. So S_f : N → N × N is a one-to-one function, as claimed.
We know that N × N ∼e N. Then there exists a function h mapping N × N
one-to-one onto N. Since both h and S_f are one-to-one, then, for each
f ∈ N^N, h∘S_f : N → N is a one-to-one function and so belongs to
N^N_(1−1). Let H : N^N → N^N_(1−1) be the function defined as
H(f) = h∘S_f. We claim that H is one-to-one:
f ≠ g ⇒ f(m) ≠ g(m) for some natural number m
⇒ S_f(m) = (m, f(m)) ≠ (m, g(m)) = S_g(m) for some natural number m
⇒ h(S_f(m)) ≠ h(S_g(m)) for some natural number m
⇒ h∘S_f ≠ h∘S_g
⇒ H(f) ≠ H(g)

So H is one-to-one as claimed.
Then the function H embeds the set N^N into N^N_(1−1). We conclude that
|N^N| ≤ |N^N_(1−1)|. Since N^N_(1−1) ⊆ N^N, we have |N^N_(1−1)| ≤ |N^N|.
Then, by the Schröder-Bernstein theorem, |N^N_(1−1)| = |N^N| = c.
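The construction H(f) = h∘S_f can be sketched in code. The proof only requires some one-to-one function h from N × N onto N; as an assumption, the sketch below uses the Cantor pairing function for h. The names pairing and make_injective are hypothetical.

```python
def pairing(m, n):
    """Cantor pairing function: one choice of a one-to-one map from N x N onto N."""
    return (m + n) * (m + n + 1) // 2 + n

def make_injective(f):
    """Return H(f) = h o S_f, where S_f(n) = (n, f(n)) and h is the pairing function."""
    return lambda n: pairing(n, f(n))

# A function that is far from one-to-one: the constant function 0.
constant_zero = lambda n: 0
parity = lambda n: n % 2

H_constant = make_injective(constant_zero)
H_parity = make_injective(parity)

values = [H_constant(n) for n in range(10)]
print(values)                          # no value is repeated
print(H_constant(1) != H_parity(1))    # distinct f and g give distinct H(f) and H(g)
```

Even though the constant function takes a single value, H sends it to a one-to-one function, because the first coordinate n of S_f(n) = (n, f(n)) already separates distinct inputs.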

d) |R^N_(1−1)| = c:

By part (c), |N^N_(1−1)| = c. Since N^N_(1−1) ⊂ R^N_(1−1), then
|R^N_(1−1)| ≥ c. Also, given that R^N_(1−1) ⊂ R^N,

|R^N_(1−1)| ≤ |R^N|
= |R|^|N|
= (2^ℵ0)^ℵ0
= 2^(ℵ0×ℵ0)
= 2^ℵ0
= c

So |R^N_(1−1)| = c.

25.3 The Cantor set.

We begin by a definition of the Cantor set which is an inductively con-
structed subset of the closed interval [0, 1]. Because of its interesting prop-
erties, it is often discussed in various branches of mathematics. We will
present the steps for its construction and discuss its cardinality.
In what follows, the expression (a, b) will mean the open interval in [0, 1],
with endpoints a and b. We will construct a countably infinite set of subsets
{Cn : n ∈ N} where each Cn is defined as follows:
C_0 = [0, 1]
C_1 = C_0 − (1/3, 2/3) = [0, 1/3] ∪ [2/3, 3/3]

C_2 = C_1 − [(1/3^2, 2/3^2) ∪ (7/3^2, 8/3^2)]
= [0, 1/3^2] ∪ [2/3^2, 3/3^2] ∪ [6/3^2, 7/3^2] ∪ [8/3^2, 3^2/3^2]

C_3 = C_2 − [(1/3^3, 2/3^3) ∪ (7/3^3, 8/3^3) ∪ (19/3^3, 20/3^3) ∪ (25/3^3, 26/3^3)]
= [0, 1/3^3] ∪ [2/3^3, 3/3^3] ∪ [6/3^3, 7/3^3] ∪ [8/3^3, 9/3^3]
∪ [18/3^3, 19/3^3] ∪ [20/3^3, 21/3^3] ∪ [24/3^3, 25/3^3] ∪ [26/3^3, 3^3/3^3]
...

The construction can be summarized as follows: Cn+1 is obtained by “punch-

ing out” open middle thirds from each closed subinterval in its predecessor
Cn . Actually constructing C0 to C3 will allow one to see the pattern of con-
struction and develop a mental picture of what each Cn looks like.
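The first few levels can also be generated mechanically. The following is a minimal sketch of the "punching out" rule using exact rational arithmetic; the function names next_level and cantor_level are hypothetical, and each closed interval [a, b] is stored as a pair (a, b).

```python
from fractions import Fraction

def next_level(intervals):
    """Remove the open middle third from each closed interval [a, b]."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))      # left closed third
        result.append((b - third, b))      # right closed third
    return result

def cantor_level(n):
    """Return the closed intervals whose union is the level C_n."""
    level = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        level = next_level(level)
    return level

# Print the intervals of C_3 with the common denominator 3^3 = 27.
for a, b in cantor_level(3):
    print(f"[{a * 27}/27, {b * 27}/27]")
```

The output can be compared with the intervals of C_3 displayed above; the number of intervals doubles at each level.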

Pursuing this process an infinite number of times (theoretically) will yield

a countably infinite set {Cn : n ∈ N} of subsets of the closed interval [0, 1].
For every natural number n, we see that C_{n+1} ⊂ C_n. Furthermore, for each
natural number n, the level, C_n, will be the union of 2^n closed intervals.
We will index these closed intervals with sequences of n zeroes and ones as
follows:²
2 Some readers may want to refer to the natural numbers expressed in base 2: 0, 1, 10,

11, 100, 101, 110, 111, . . ., to determine the order in which the finite zero-one sequences are
ordered.
C_0 = [0, 1]
C_1 = 1I_{0} ∪ 1I_{1} = [0, 1/3] ∪ [2/3, 1]
C_2 = 2I_{0,0} ∪ 2I_{0,1} ∪ 2I_{1,0} ∪ 2I_{1,1}
C_3 = 3I_{0,0,0} ∪ 3I_{0,0,1} ∪ 3I_{0,1,0} ∪ 3I_{0,1,1} ∪ 3I_{1,0,0} ∪ 3I_{1,0,1} ∪ 3I_{1,1,0} ∪ 3I_{1,1,1}
C_4 = 4I_{0,0,0,0} ∪ 4I_{0,0,0,1} ∪ · · · ∪ 4I_{1,1,1,1}
C_5 = 5I_{0,0,0,0,0} ∪ 5I_{0,0,0,0,1} ∪ · · · ∪ 5I_{1,1,1,1,1}
...

For each n let A_n = {A_j : j = 1 to 2^n} = {A_1, A_2, . . . , A_{2^n}} denote
a set of 2^n elements where the A_j's are the distinct finite sequences of
n zeroes and ones. For each n ∈ N, the level C_n can be described as,
C_n = ∪{nI_{A_j} : j ∈ {1, 2, . . ., 2^n}}
Since C_{n+1} ⊂ C_n for each n ∈ N, the sets {C_n : n ∈ N} are said to be
“nested”.
We call the resulting infinite intersection,

C = ∩_{n∈N} C_n

the Cantor set.

A few noteworthy facts which follow from this.

Given the above definition, we provide a few facts about the Cantor set, C,
and its construction.
1) Note that, if for each j, A_j is a sequence of n zeroes and ones and
nI_{A_j} is one of the intervals which forms C_n, then, by the construction
rules of C_{n+1},

(n+1)I_{A_j∪{0}} ∪ (n+1)I_{A_j∪{1}} ⊂ nI_{A_j}.

For example,
3I_{0,0,0} ∪ 3I_{0,0,1} ⊂ 2I_{0,0}
3I_{0,1,0} ∪ 3I_{0,1,1} ⊂ 2I_{0,1}
3I_{1,0,0} ∪ 3I_{1,0,1} ⊂ 2I_{1,0}
3I_{1,1,0} ∪ 3I_{1,1,1} ⊂ 2I_{1,1}

Then every level C_n = C_0 ∩ C_1 ∩ C_2 ∩ · · · ∩ C_n can also be obtained by
taking the finite union of the intersections of nested sets of closed
intervals

C_n = ∪_{j=1}^{2^n} ( ∩_{m=0}^{n} mI_{A_j} )

For example, one nested set of closed intervals in C_100 would be of the form,

∩_{m=0}^{100} mI_{A_1} = [0, 1] ∩ [0, 1/3] ∩ [0, 1/3^2] ∩ [0, 1/3^3] ∩ · · · ∩ [0, 1/3^100]

2) Since, for a fixed j, {nI_{A_j} : n ∈ N} is an infinite set of nested closed
intervals, then ∩_{n∈N} nI_{A_j} ≠ ∅.³
3) Also, the length of each closed interval which forms C_n is 1/3^n.
If {nI_A : n ∈ N} is an infinite set of nested closed intervals, then the length
of the interval ∩_{n∈N} nI_A must be lim_{n→∞} 1/3^n = 0.
Then the set,

∩_{n∈N} nI_{A_j}

must be a singleton set.

4) Let s be a specific (infinite) sequence in {0, 1}^N. For each n ∈ N, let s(n)
denote the finite sequence made of the first n terms (of zeroes and ones) of
the infinite sequence, s.⁴
Then, for each n ∈ N, (n+1)I_{s(n+1)} ⊂ nI_{s(n)} and so the set
{nI_{s(n)} : n ∈ N} forms a set of nested closed intervals (uniquely
determined by the sequence s). The Nested interval lemma guarantees that the
expression

∩_{n=0}^{∞} nI_{s(n)} = ∞I_s

is non-empty for the chosen s ∈ {0, 1}^N.

In the following proposition, we will show that the Cantor set, C is (quite
surprisingly!) an uncountably infinite set. This is in spite of the large amount
of points removed from [0, 1] to construct it. Different authors may provide
different ways of proving that C is uncountable. We provide a proof that
has a set-theoretic flavor to it.

Proposition 25.3 The Cantor set has cardinality c.

P roof:

Since the Cantor set C is a subset of [0, 1], then |C| ≤ c. The cardinality
of {0, 1}^N is known to be c (see Theorem 20.12). We will show that
|{0, 1}^N| ≤ |C|.

3 This statement is referred to as the Nested interval lemma. This lemma is
proven in most Calculus texts.

4 For example, suppose s = {1, 0, 1, 1, 1, . . .}. Then we choose
2I_{s(2)} = 2I_{1,0} in C_2, we choose 3I_{s(3)} = 3I_{1,0,1} in C_3,
4I_{s(4)} = 4I_{1,0,1,1} in C_4, and so on.

Let s be a specific sequence in {0, 1}^N. For each n ∈ N, let s(n) denote the
finite sequence made of the first n terms (of zeroes and ones) of the infinite
sequence, s.⁵
Then, for each n ∈ N, (n+1)I_{s(n+1)} ⊂ nI_{s(n)} and so the set
{nI_{s(n)} : n ∈ N} forms a set of nested closed intervals (uniquely
determined by the sequence s). The Nested interval lemma guarantees that the
expression

∩_{n=0}^{∞} nI_{s(n)}

is non-empty for the chosen s ∈ {0, 1}^N. For each s, we can then choose an
element x_s in ∩_{n=0}^{∞} nI_{s(n)}. See that, since x_s ∈ nI_{s(n)} ⊂ C_n
for all n, then x_s ∈ ∩_{n=0}^{∞} C_n = C. We define the function
f : {0, 1}^N → C mapping {0, 1}^N into C, as
f(s) = x_s
We claim that f is one-to-one: Suppose s and t are distinct elements of
{0, 1}^N. Let n be the least natural number such that s(n) ≠ t(n). Then the
two closed intervals
nI_{s(n)} and nI_{t(n)}

in C_n have empty intersection. Then the intersection of the two sets

{f(s)} = {x_s} ⊆ ∩{nI_{s(n)} : n ∈ N}

{f(t)} = {x_t} ⊆ ∩{nI_{t(n)} : n ∈ N}

must be empty. So x_s and x_t cannot be the same element. This shows that
f is one-to-one, as claimed. Then f embeds {0, 1}^N into C. Hence,

c = |{0, 1}^N| ≤ |C|

as claimed. Since C ⊂ R, |C| ≤ c. We conclude that |C| = c.
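The indexing of intervals by finite zero-one sequences, which drives this proof, can be sketched as follows. As an assumption read off from the identities 1I_{0} = [0, 1/3] and 1I_{1} = [2/3, 1], a term 0 selects the left third and a term 1 selects the right third; the function name interval is hypothetical.

```python
from fractions import Fraction

def interval(bits):
    """Closed interval of C_n indexed by a finite 0-1 sequence of length n.

    Assumption: a term 0 keeps the left third, a term 1 keeps the right third.
    """
    a, b = Fraction(0), Fraction(1)
    for bit in bits:
        third = (b - a) / 3
        a, b = (a, a + third) if bit == 0 else (b - third, b)
    return a, b

s = [1, 0, 1, 1, 1]   # first terms of a sequence s
t = [1, 0, 0, 1, 1]   # first terms of a sequence t; s and t first differ at the third term

# Nested intervals along s: each interval contains the next one.
for n in range(len(s)):
    outer, inner = interval(s[:n]), interval(s[:n + 1])
    print(outer[0] <= inner[0] and inner[1] <= outer[1])

# Once the prefixes differ, the two intervals are disjoint.
s_low, s_high = interval(s[:3])
t_low, t_high = interval(t[:3])
print(t_high < s_low or s_high < t_low)
```

Since the intervals chosen along s and along t are disjoint from the first level at which the sequences differ, the points x_s and x_t must be distinct, which is the one-to-one property used above.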

It is surprising to see that the cardinality of C is the same as the cardinality
of R since, to obtain C from [0, 1], we removed from [0, 1] a total length of
open intervals equal to

1/3 + 2/3^2 + 2^2/3^3 + · · · = (1/3)(1 + 2/3 + 2^2/3^2 + · · ·) = (1/3)/(1 − 2/3) = 1
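The partial sums of this series can be computed exactly. The sketch below assumes, as in the construction, that step k removes 2^(k−1) open intervals, each of length 1/3^k; the function name removed_length is hypothetical.

```python
from fractions import Fraction

def removed_length(n):
    """Total length of the open intervals removed in the first n steps."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

for n in (1, 2, 3, 10, 50):
    # The remaining length 1 - removed_length(n) equals (2/3)^n, which tends to 0.
    print(n, float(removed_length(n)), removed_length(n) == 1 - Fraction(2, 3) ** n)
```

The removed length after n steps is 1 − (2/3)^n, so the total removed length tends to 1 even though uncountably many points remain.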

There are still uncountably many points that are left behind. One may
expect that C is simply the set of all endpoints that appear in all C_n's. But
this can't be, since if we take the union ∪_{n∈N} E_n of all endpoints, where
E_n denotes the set of endpoints of the intervals which form C_n, we obtain
only a countably infinite set, while C has been proven to be uncountable.
The Cantor set must then contain uncountably many numbers which are not
endpoints! Skeptical readers may want to look at the proof again to see if
there is any sleight of hand. Even if one believes the given proof, it does not
mean that it will necessarily be what we might call “a satisfying proof”. We
cannot actually see what is going on at the very high levels of n. The proof
doesn't help us understand why the “non-endpoints” in C are not excluded
in the construction process.

5 For example, suppose s = {1, 0, 1, 1, 1, . . .}. Then we choose
2I_{s(2)} = 2I_{1,0} in C_2, we choose 3I_{s(3)} = 3I_{1,0,1} in C_3,
4I_{s(4)} = 4I_{1,0,1,1} in C_4, and so on.
Identifying numbers in C which are non-endpoints. The following arguments
show why some “non-endpoints” of C remain in the infinite intersection of
the sets, C_n, which are used to construct the Cantor set C. Consider the
sequence of numbers, {S_n : n ∈ N}, where

S_n = Σ_{k=0}^{n} (−1/3)^k

By carefully examining this sequence, we can deduce the following facts:

− We see that S_0 = 1, S_1 = 1 − 1/3, S_2 = 1 − 1/3 + 1/3^2,
S_3 = 1 − 1/3 + 1/3^2 − 1/3^3, and so on.
− The subsequence {S_{2n} : n ∈ N} is a strictly decreasing sequence, while
the subsequence {S_{2n+1} : n ∈ N} is a strictly increasing sequence.
− We also see that, for all n ≥ 1, [S_{2n+1}, S_{2n}] ⊂ [S_{2n−1}, S_{2n−2}], hence
the set {[S_{2n+1}, S_{2n}] : n ∈ N} forms a nested set of closed intervals.
− For each n, [S_{2n+1}, S_{2n}] ⊂ C_{2n+1}.
− Since the elements of the sequence {S_n} are partial sums of a geometric
series with common ratio r = −1/3, then
S_n = (1 − r^(n+1))/(1 − r) = (1 − (−1/3)^(n+1))/(1 − (−1/3)).
The limit of the sequence {S_n} is then computed to be 3/4 where, for all
n, S_{2n+1} < 3/4 < S_{2n}.
− We deduce that

{3/4} = ∩{[S_{2n+1}, S_{2n}] : n ∈ N} ⊆ ∩{C_n : n ∈ N} = C

So, even though 3/4 is not an endpoint of one of the subintervals which form
each level C_n it belongs to the Cantor set. Other such points can be found
in C in this way.
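These facts about the partial sums can be checked numerically for the first few levels. The sketch below is an illustration under the same conventions as the construction of the levels C_n; the function names cantor_level and partial_sum are hypothetical.

```python
from fractions import Fraction

def cantor_level(n):
    """Closed intervals (a, b) whose union is the level C_n."""
    level = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        level = [piece for a, b in level
                 for piece in ((a, a + (b - a) / 3), (b - (b - a) / 3, b))]
    return level

def partial_sum(n):
    """S_n = sum of (-1/3)^k for k = 0, ..., n."""
    return sum(Fraction(-1, 3) ** k for k in range(n + 1))

target = Fraction(3, 4)
for n in range(4):
    low, high = partial_sum(2 * n + 1), partial_sum(2 * n)
    level = cantor_level(2 * n + 1)
    # 3/4 lies strictly between S_{2n+1} and S_{2n}.
    assert low < target < high
    # [S_{2n+1}, S_{2n}] is contained in one interval of C_{2n+1}.
    assert any(a <= low and high <= b for a, b in level)
    # 3/4 is not an endpoint of any interval of C_{2n+1}.
    assert all(target != a and target != b for a, b in level)
    print(n, low, high)
```

The check only covers finitely many levels, so it illustrates the argument rather than replacing it. The endpoint claim holds at every level because each endpoint has a denominator that is a power of 3, while 3/4 does not.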

25.4 Counting those elements of P(R) which are of cardinality c.

In what follows, we will let A_c denote the set

A_c = {S ⊆ R : |S| = |R| = c}

Since {R − {x} : x ∈ R} ⊂ A_c and A_c ⊂ P(R), then c ≤ |A_c| ≤ 2^c. We
know that the cardinality of P(R) is 2^|R| = 2^c. We claim that |A_c| = 2^c.
Recall (from Corollary 20.4) that |R × R| = |R| = c. Then to each subset
S ⊆ R × R of cardinality c we can associate a unique subset S* ⊂ R of
cardinality c. Then

|A_c| = |{S ⊆ R × R : |S| = c}|

For any non-empty subset U of R and x ∈ U,

c = |{x} × R| ≤ |U × R| ⇒ |U × R| ≥ c

Since U × R ⊆ R × R, we also have |U × R| ≤ c; hence |U × R| = c for every
non-empty U ⊆ R. Since distinct sets U give distinct sets U × R,
|{U × R : U ∈ P(R) − {∅}}| = 2^c
then

|A_c| = |{S ⊆ R × R : |S| = c}| ≥ |{U × R : U ∈ P(R) − {∅}}| = 2^c

So |A_c| = 2^c, as required.

25.5 Counting all real-valued continuous functions on R.

Continuous real-valued functions on R can be characterized as being those
functions which satisfy the property

lim_{n→∞} f(x_n) = f(lim_{n→∞} x_n) = f(x)

for any sequence {x_n} of numbers which converges to a number x. Since
every irrational number is the limit of a sequence of rational numbers, the
value of a continuous function f at an irrational number x is uniquely
determined by the values of this function at the rational numbers. This
means that two continuous real-valued functions on R which agree on Q must
be equal. In set-theoretic language, this means that the function f ↦ f|Q
embeds the set B = {f ∈ R^R : f is continuous} into the set
D = {f ∈ R^Q : f is continuous}. So |B| ≤ |D|.
It is shown in Theorem 25.2 part (a) that the cardinality of the set R^N of
all functions mapping N into R is c. From the equipotence relation Q ∼e N
we deduce that R^Q ∼e R^N. Since D ⊂ R^Q, |B| ≤ |D| ≤ c. The uncountable set
{f ∈ R^R : f = r, r ∈ R} of constant functions is a subset of the set B. Then
c = |{f ∈ R^R : f = r, r ∈ R}| ≤ |B|. So the cardinality of B is c. We
conclude that the cardinality of the set of all real-valued continuous
functions on R is c.

Concepts review:
1. What is the cardinality of R^n for any natural number n?
2. Do the real numbers and the complex numbers have the same
cardinality?
3. How does the cardinality of the set of all countably infinite
sequences of real numbers compare with the cardinality of R^R?
4. What is the cardinality of the set of all irrational numbers?
5. What is the cardinality of the set of all countably infinite sequences
of natural numbers?
6. What is the cardinality of the set of all countably infinite sequences
of real numbers?
7. Let N^N denote the set of all functions mapping N into N and N^N_(1−1)
denote the set of one-to-one functions mapping N into N. Are the
sets N^N and N^N_(1−1) equipotent? What is their cardinality?
8. What is the Cantor set? How is it constructed? What is its cardi-
nality?
9. What is the cardinality of the set of all continuous real-valued func-
tions?

EXERCISES
A. 1. Show that for any non-zero finite cardinal n, n × 2^(2^ℵ0) = 2^(2^ℵ0).

B. 2. Let P(N)_F denote the set of all finite subsets of N.

a) Show that the set N is embedded in P(N)_F.
b) Express P(N)_F as the union of a countably infinite number of pairwise
disjoint subsets of N.
c) What is the cardinality of the set ∪_{k∈N} N^k?
d) Construct a one-to-one function which embeds P(N)_F in ∪_{k∈N} N^k.
e) What is the cardinality of P(N)_F?
3. Let P(R)_F denote the set of all finite subsets of R.
a) Show that the set R is embedded in P(R)_F.
b) Express P(R)_F as the union of a countably infinite number of pairwise
disjoint subsets of R.
c) What is the cardinality of the set ∪_{k∈N} R^k?
d) Construct a one-to-one function which embeds P(R)_F in ∪_{k∈N} R^k.
e) What is the cardinality of P(R)_F?

C. 4. Consider the strictly increasing sequence S = {2^1, 2^2, 2^3, 2^4, . . . , 2^n, . . .} of
cardinal numbers.
a) Does the cardinal number 2^ℵ0 belong to S? Why?
b) Consider the two infinite cardinal numbers ℵ0 and 2^ℵ0. Is one of these
a least upper bound of S?⁶ If not, say why. If so, which one?

6 We remind the reader of the definition of “a least upper bound of an ordered set S”:

The element m is a least upper bound of a set S if m is an upper bound of S and for any
other upper bound n, m ≤ n.
Part VIII

Ordinal numbers
Part VIII: Ordinal numbers 261

26 / More on well-ordered sets

Abstract. In this section we review a few basic notions about well-ordered
sets. We define special subsets of well-ordered sets called “initial seg-
ments”. “Order isomorphisms” are defined as strictly increasing one-to-
one functions between well-ordered sets. Initial segments of well-ordered
sets are themselves well-ordered sets. We prove basic properties of order
isomorphisms between initial segments and well-ordered sets. We then de-
fine the relation, ≤WO , on well-ordered sets as follows: “S ≤WO T if S and
T are order isomorphic or one is order isomorphic to an initial segment
of the other”. This section provides the fundamental background for the
study of the important set-theoretic topic of “ordinal numbers”.

26.1 Overview.
In the last few sections, we have familiarized ourselves with some of the main
properties of infinite sets. We have seen that the ZFC-axioms have cleared
a path into unfamiliar mathematical territory, populated by uncountably
many “infinite sets” in infinite varieties, leading us to reflect on numerous
counterintuitive notions. We have discovered, for example, that given any
infinite set A we can find another infinite set B = P(A), not equipotent
to A, which properly contains a one-to-one copy of A. To express this
relationship, we said that A is “properly embedded” in B and wrote A ↪e B.
We can thus construct infinite chains of sets linearly ordered by the proper
embedding ↪e-relation. For example:
0 ↪e 1 ↪e 2 ↪e · · · ↪e N ↪e P(N) ↪e P(P(N)) ↪e P(P(P(N))) ↪e · · ·

This chain of sets ordered by ,→e begins with the empty set 0 = { }. This
set is followed by an infinite number of finite sets called the “natural num-
bers”. Once we attain the first infinite set N, an endless sequence of infinite
sets can be constructed by successively taking power sets. Note that
no natural number is constructed by taking the power set of its immediate
predecessor. So the method for constructing each natural number from its
predecessor is different from the method used to construct each new infinite
set. In fact, it is an axiom that allows N to exist. Another axiom allows us
to say that if S is a set, then its power set is also a set. Of course, vari-
ous chains of sets can be constructed in this way, each depending on the
choice of the first set. When we started with the set of all real numbers, R, we
obtained what initially appeared to be a different chain of infinite sets,
R ↪e P(R) ↪e P(P(R)) ↪e · · · . It was then determined that R and P(N)
are in fact equipotent, and so the displayed chain containing power sets of
N contains copies of the R-related power sets.
262 Section 26: More on well-ordered sets

We thought it would be practical to partition the class of all sets into sub-
classes of mutually equipotent sets. These subclasses were seen to be equiva-
lence classes induced by the equipotence relation ∼e . We defined the notion
of ∼e -equivalence class representatives called cardinal numbers. A cardinal
number was declared to be a set which represents all sets which are equipo-
tent to it. We had to postulate the existence of the cardinal numbers with
the promise that once we have developed the required set-theoretic tools,
the cardinal numbers would be appropriately defined or constructed.
We have seen that the set of all natural numbers has been extremely useful
in determining various properties of countably infinite sets. A critically im-
portant tool in our study was the principle of mathematical induction over
N. Since any countably infinite set is a one-to-one image of N, this means
that the elements of such sets can be indexed by the elements of N. Indexing
countable sets in this way allows us to linearly order these sets. We can then
apply the principle of mathematical induction to determine some of their
properties. When working with uncountable sets, we do not yet have access
to uncountable well-ordered sets whose elements can be used to index such
sets. We will soon see that ZFC provides the necessary ingredients to con-
struct “universal indexing sets”. 1

26.2 Well-ordered sets revisited.

Recall that “ordered relations” on a set S are relations which fall into
two major categories: linearly ordered relations or partially ordered rela-
tions (also, non-linearly ordered relations). Each of these can be strictly or
non-strictly ordered relations. Non-strictly ordered relations are often rep-
resented by “≤”, while strictly ordered relations are often represented by
“<” although other symbols may be used. An ordered relation on a set, S,
is linear provided any two distinct elements of S are comparable under the
given ordered relation. That is, all elements of S can be lined up on a line,
the “larger” elements normally to the right of (or above) the “smaller” ones.
Partially ordered classes are often viewed as having many branches where
elements on different branches are not comparable by ≤ or <. It is often said
that non-linearly ordered relations have many “chains of elements” (subsets
which are linearly ordered), while a linearly ordered class has all its elements
lined up in a single chain.
In what follows, the hypothesized sets, S, will be linearly ordered by ≤ or <.
We will be investigating those linearly ordered sets which are “well-ordered”.
We remind ourselves of what “well-ordered” means:

1 These sets will be called ordinals (soon to be defined). Cardinal numbers will be defined

as being those ordinals which satisfy a specific property (at Definition 29.7). Until we for-
mally define “cardinal numbers”, we will refrain from referring to the notion of “cardinality
of a set” in the process that leads to this definition.

A well-ordered set is a set, S, which is linearly ordered by ≤ or <

in such a way that every non-empty subset, T , of S contains its
least element. If a relation ≤ well-orders a class or a set S, we will
sometimes, more succinctly, say that “S is ≤-well-ordered”.
Well-ordered classes. Note that in the above definition of a “well-ordered
set”, we can replace the word “set” with the word “class”, so that we can
speak of an ordered proper class, A, which is well-ordered by some relation,
≤. Proper well-ordered classes will be discussed further on in the text.
Before we start, we should recall that the set, N, as well as every natural
number, n, were shown to be ∈-well-ordered (see Theorem 14.3 and Corollary 14.4).
The set, N, of all natural numbers and any natural number, n, are also
⊂-well-ordered.
The reader may presently wonder: If N is the only well-ordered set we have
essentially seen up to now, then why all this fuss about the “well-ordered”
property on linear sets? It is because we will eventually want to discuss
uncountable sets which are well-ordered. For now we want to get a handle
on what the well-ordered property means, in the abstract, and then try to
develop some insight on what it would look like on uncountable sets. The
concept of an uncountable set is not an easy one to grasp, even for the most
fertile imagination, let alone the fact that uncountable sets come in
an infinite number of sizes. How do we compare them?
We will begin by showing that, given any non-empty countable set S, we
can define an ordered relation which well-orders S.

Theorem 26.1 Let f : T → S be a one-to-one function mapping T onto S.

If T is a well-ordered set, then T induces a well-ordering on S. Hence, every
countable set can be well-ordered.
Proof:
What we are given: There exists a function f : T → S which maps the
well-ordered set (T, <T ) one-to-one onto the set S.
What we are required to show: There exists an ordered relation which well-
orders the set S.
Since f : T → S maps T one-to-one onto the set S, we can then index the
elements of S as follows: If s = f(n), express s as sn . Then S = {sn : n ∈
T } = f[T ].
We define the relation “<S ” as

sn <S sm if and only if n <T m

We claim that <S well-orders S:

− The set S is <S -linearly ordered: It is clear that since f is one-to-one

onto, <S is irreflexive and asymmetric. For transitivity, we see that
sn <S sm and sm <S sr ⇒ n <T m and m <T r
⇒ n <T r
⇒ sn <S sr

We now verify that every pair of distinct elements in S is comparable under <S .
If sn , sm are distinct elements of S, then n and m are the unique corresponding
elements in T , and n ≠ m. Then n <T m or m <T n. Hence, either sn <S sm or
sm <S sn . Hence, all pairs of distinct elements of S are <S -comparable and so
S is <S -linearly ordered.
− The set S is <S -well ordered: Suppose A = {si : i ∈ U ⊆ T } is a
non-empty subset of S. Then U is a non-empty subset of T . Since T is
well-ordered, U has a least element, say k. Since k ≤T i for all i ∈ U , then
sk ≤S si for all si ∈ A. Thus, A contains a least element.
This proves that the relation, <S , induced on S by T is a well-ordering.
We now show that every non-empty countable set can be well-ordered. Let S
be a countably infinite set. Then there exists a function, f : N → S, mapping
N one-to-one onto S. Since N is well-ordered, then S has a well-ordering.
If S is finite and non-empty, then it is the one-to-one image of some natural
number n (18.7). Since every natural number n is ∈-well-ordered (14.4), S
inherits this well-ordering from n as described above.
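
The construction used in the proof can be sketched in Python on a small finite example. This is only an illustration under stated assumptions: the helper names `induced_less_than` and `least`, and the sample sets, are hypothetical and do not come from the text. Here T = {0, 1, 2, 3} with its usual order, S is a set of four strings, and the bijection f is given as a dictionary.

```python
def induced_less_than(f):
    """Given a bijection f (a dict) from a well-ordered set T of integers
    onto S, return the relation <_S defined by s_n <_S s_m iff n <_T m."""
    index = {s: n for n, s in f.items()}  # inverse of f: element of S -> its index in T
    if len(index) != len(f):
        raise ValueError("f must be one-to-one")
    return lambda a, b: index[a] < index[b]


def least(subset, less_than):
    """Least element of a non-empty finite subset under the linear order less_than."""
    items = list(subset)
    m = items[0]
    for x in items[1:]:
        if less_than(x, m):
            m = x
    return m


# Hypothetical example: T = {0, 1, 2, 3} and S = {"pear", "fig", "plum", "kiwi"}.
f = {0: "pear", 1: "fig", 2: "plum", 3: "kiwi"}
less_S = induced_less_than(f)
print(least({"kiwi", "plum", "fig"}, less_S))  # prints fig, the element with the least index
```

The order on S is read off entirely from the indices in T, which is why S inherits the least-element property from T.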

We provide a few examples of linearly ordered sets which are well-ordered

and some that are not (at least in the form in which they are presented).

a) The set of all even natural numbers, Ne , with the ordering inherited
from (N, ⊂) is a well-ordered set since every pair of even numbers are
comparable and every non-empty subset of even numbers contains a least even
number.
b) Every natural number, n, is a well-ordered set. For example, 5 =
{0, 1, 2, 3, 4} is ∈-linear (or ⊂-linear) and every non-empty subset of 5
contains a least element.
c) The set of all countably infinite sequences of natural numbers, N^N (an
uncountable set with cardinality c), equipped with the lexicographic
ordering2 has been shown to be a set which is linearly ordered, but
not well-ordered, since it contains non-empty subsets with no least element.
For example, suppose that for each i ∈ N, xi = (aj : j ∈ N) where aj = 1 if
j = i and aj = 0 otherwise. Then for each i ∈ N, xi ∈ N^N . The subset
S = {xi : i ∈ N} of N^N does not contain a least element: since xi+1 < xi
for every i ∈ N, every element of S has a strictly smaller element of S
below it. (The sequence (0, 0, 0, . . .), which lies below every xi , does
not belong to S.)
2 See the definition of lexicographic ordering on page 134.

d) The set, N × N, can also be equipped with the lexicographic ordering:

{(0, 0), (0, 1), (0, 2), . . . , (1, 0), (1, 1), . . . , (2, 0), (2, 1), (2, 2), (2, 3), . . .}

When ordered in this way, N × N can be seen as being the union of
a countably infinite number of copies of N lined up from end to end.
This is easily seen to be a linear ordering. Given any non-empty subset
M of N × N, let S = {s : (s, t) ∈ M for some t}. The least element of M
is (u, v) where u = least{S} and v = least{t : (u, t) ∈ M }. Both least
elements exist since S and {t : (u, t) ∈ M } are non-empty subsets of the
well-ordered set N, and (u, v) belongs to M by the choice of v.
We conclude that the lexicographically ordered N × N is well-ordered.
e) The set, R, of real numbers equipped with the usual real number or-
dering is linear but is not well-ordered since the set {x ∈ R : x > 1}
does not have a least element. Note that this does not mean that there
isn’t an order relation which well-orders the real numbers.3
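
The recipe in example d) for finding the least element of a subset of the lexicographically ordered N × N can be sketched in Python for a finite subset M. The function name `lex_least` and the sample set are hypothetical choices made for this illustration only.

```python
def lex_least(M):
    """Least element of a non-empty finite subset M of N x N under the
    lexicographic ordering, computed in two steps as in example d)."""
    u = min(s for (s, t) in M)            # least first coordinate occurring in M
    v = min(t for (s, t) in M if s == u)  # least second coordinate among pairs starting with u
    return (u, v)


M = {(3, 0), (1, 7), (1, 4), (2, 2)}
print(lex_least(M))            # prints (1, 4)
print(lex_least(M) == min(M))  # prints True: Python compares tuples lexicographically
```

The second line of output is a sanity check against Python's built-in tuple comparison, which is itself lexicographic.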
Our experience with well-ordered sets is quite limited. When we well-order
a set we are adding “structure” to the set. For example there is a difference
between the set N×N and (N×N, lexicographic ordering), even though both
sets are countably infinite. Even if we can use the lexicographic ordering tool
to construct long chains of copies of N, we have however not yet been able
to exhibit a single uncountably infinite set which is well-ordered. Anyone
who has attempted to find a well-ordering relation for R may wonder if an
uncountable well-ordered set exists at all.

26.3 Initial segments.

We have seen examples of “initial segments” in the chapter where we dis-
cussed Dedekind cuts, although we did not formally define them using this
term at that time. So these may seem familiar to the reader. Initial segments
discussed here are the same mathematical objects as the ones discussed in
the chapter whose purpose was to define the real numbers R. However, the
context is quite different. In this section the initial segments we will study
are subsets of abstract well-ordered sets. We will discuss the notion of “ini-
tial segments” as though we have never seen these before. We start with the
following formal definition.

Definition 26.2 Given a well-ordered set (S, ≤), a subset U of S satisfying

the property
U ≠ S and ∀u ∈ U, ∀x ∈ S, [x < u] ⇒ [x ∈ U ]
is called an initial segment of S. In this definition, the partial order relation ≤
3 We will see later on that the Axiom of choice guarantees that R can be well-ordered

without explicitly stating what such a well-ordering could be.


can be used instead of < without altering the meaning of “initial segment”.

Formal definitions of abstract concepts are often not expressed in a reader-

friendly form. This is because the reader-friendly form is not always the form
that is best adapted to the process of proving statements in which carefully
formulated definitions are required. The following theorem will allow the
reader to more easily visualize what initial segments in well-ordered sets
look like.

Theorem 26.3 If (S, ≤) is a well-ordered set, then every initial segment in

S is of the form
Sa = {x ∈ S : x < a}
for some a ∈ S.

Proof:
What we are given: That (S, ≤) is a well-ordered set; T is a proper subset of
S satisfying the property “∀t ∈ T, [x < t] ⇒ [x ∈ T ]”.
What we are required to show: That T = Sa = {x ∈ S : x < a} for some
a ∈ S.
Since T is a proper subset of S, then S − T is non-empty. So S − T must
contain its least element, say a (since S is well-ordered).
Claim Sa ⊆ T : Since a is the least element of S − T , x < a ⇒ x ∉ S − T ⇒
x ∈ T . So Sa ⊆ T , as claimed.
Claim T ⊆ Sa : If x ∉ Sa , then x ≥ a. Then the element x cannot belong to
T , for if x ∈ T , a ≤ x would imply that a ∈ T (by definition of the set T );
since a ∈ S − T , we would obtain a contradiction. So u ∈ T ⇒ u < a. That
is, T ⊆ Sa as claimed.
So the initial segment, T , of the well-ordered set, S, is the set Sa = {x ∈ S :
x < a} where a is the least element in S − T , as required.
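
Theorem 26.3 can be checked by brute force on a small finite well-ordered set. The sketch below is an illustration, not a proof; the function names and the sample set S are hypothetical. It tests every subset of S against Definition 26.2 and confirms that each subset that passes equals Sa, where a is the least element of S − U.

```python
from itertools import combinations


def is_initial_segment(U, S):
    """Definition 26.2 for a finite set S of integers with the usual order:
    U is a proper subset of S and contains every x in S below any of its elements."""
    U, S = set(U), set(S)
    return U < S and all(x in U for u in U for x in S if x < u)


def segment(S, a):
    """S_a = {x in S : x < a}."""
    return {x for x in S if x < a}


S = {2, 3, 5, 7, 11}
subsets = [set(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]
initial = [U for U in subsets if is_initial_segment(U, S)]

# Each initial segment U equals S_a, where a is the least element of S - U.
print(all(U == segment(S, min(S - U)) for U in initial))  # prints True
print(len(initial) == len(S))  # prints True: one segment S_a for each a in S
```

Note that, under Definition 26.2 as stated, the empty set passes the test; it is the segment Sa for the least element a of S.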

Given the initial segment Sa , we will refer to a as the leader of the initial
segment. The leader, a, of the initial segment, Sa , is not an element of the
initial segment. It is also important to remember that, by definition, a well-
ordered set S is not an initial segment of itself. We provide a few examples
of sets which are initial segments and sets which are not:
a) Note that every natural number n in N is an initial segment of N. For
example, 5 = {0, 1, 2, 3, 4} = {n ∈ N : n < 5} = S5 is an initial segment
of N.

− We can view 5 = S5 = {0, 1, 2, 3, 4} as being ⊂-well-ordered. The initial

segments of 5 are the following sets only:

4 = S4 = {0, 1, 2, 3}
3 = S3 = {0, 1, 2}
2 = S2 = {0, 1}
1 = S1 = {0}
0 = S0 = ∅

b) Even though the set Ne of all even natural numbers is a proper subset
of N, it is not an initial segment of N since 26 ∈ Ne and 17 < 26 but
17 ∉ Ne . However, S26 = {n ∈ Ne : n < 26} is an initial segment of the
well-ordered set (Ne , ⊂).
c) The subset S = {0, 2, 3, 4, 5, . . .} in N is not an initial segment of N since
3 ∈ S and 1 < 3, but 1 does not belong to S.
d) Consider the set, N^{0,1,2} = {(a0 , a1 , a2 ) : ai ∈ N}, of all functions
mapping {0, 1, 2} into N, ordered lexicographically.4 This set is easily
verified to be well-ordered.5 The set S(0,1,0) = {(0, 0, i) : i ∈ N} is an
initial segment of N^{0,1,2} . It is the set of all elements in N^{0,1,2} which
are strictly less than (0, 1, 0).
Initial segments of well-ordered sets are well-ordered. If Sa is an initial seg-
ment of a <-well-ordered set S, then Sa can inherit the ordered relation “<”
from S so that it can itself be viewed as a <-well-ordered set.

26.4 “Order isomorphisms” between well-ordered classes.

Given two sets A and B, there can be many functions mapping A into B.
We may want to classify these functions by “types”. For example, we may
want to consider only those functions f : A → B which are one-to-one, or
only those functions which are constant, or only those with finite range, and
so on. If we are given two linearly ordered sets (S, ≤S ) and (T, ≤T ), we may
be interested only in those functions f : S → T which

“respect the order”

of these sets. By “respecting the order” we mean that n ≤S m ⇒

f(n) ≤T f(m). For example, the function f : N → N defined as f(n) = 5n
respects the order of the elements of the set N (for example, 3 < 4 where
4 Note that this set is the set of all ordered triples of natural numbers and so is equivalent

to the Cartesian product N × N × N.

5 If S is a non-empty subset of N^{0,1,2} , let b0 be the least first coordinate of the
elements of S, let b1 be the least second coordinate of those elements of S whose first
coordinate is b0 , and let b2 be the least third coordinate of those elements of S whose
first two coordinates are b0 and b1 . Then (b0 , b1 , b2 ) ∈ S. Let (x, y, z) ∈ S. If b0 < x,
then (b0 , b1 , b2 ) < (x, y, z); if b0 = x and b1 < y, again, (b0 , b1 , b2 ) < (x, y, z); if
b0 = x and b1 = y, then b2 ≤ z and so (b0 , b1 , b2 ) ≤ (x, y, z).
So (b0 , b1 , b2 ) is the least element of S.

f(3) = 15 < 20 = f(4)), while the function, g : (0, 1] → R, defined as
g(x) = 1/x, does not (since 1/3 < 1/2 and yet g(1/3) = 3 ≮ 2 = g(1/2)).
One-to-one order-respecting functions between well-ordered sets will be
called order isomorphisms. We begin by formally defining this concept.

Definition 26.4 Let f : (S, ≤S ) → (T, ≤T ) be a function mapping a well-ordered
class, (S, ≤S ), into a well-ordered class, (T, ≤T ). Note that the symbols
≤S and ≤T will allow us to distinguish between the order relations applied
to the sets S and T , respectively.

a) We will say that the function, f, is increasing on (S, ≤S ) if

(x ≤S y) ⇒ (f(x) ≤T f(y))

b) We will say that the function f is strictly increasing on (S, ≤S ) if

(x <S y) ⇒ (f(x) <T f(y))

A strictly increasing function must be one-to-one.

c) If f : (S, ≤S ) → (T, ≤T ) is strictly increasing, then f is said to be an
order isomorphism mapping S into T .
If there exists an onto order isomorphism between the two well-ordered classes,
(S, ≤S ) and (T, ≤T ), we will say that the classes are order isomorphic, or that
a function maps S order isomorphically onto T .

Remark: It follows from this definition that:

If A and B are two well-ordered sets which are order-isomorphic,

then A and B are equipotent sets.

We provide a few examples of order isomorphisms between well-ordered sets

introduced in previous chapters.

− Let (Ne , ≤) denote the even natural numbers equipped with the standard
natural number ordering ≤. Since the function f : N → Ne defined as
f(n) = 2n is one-to-one and strictly increasing, then it maps N order
isomorphically onto Ne .
− On the other hand, the function g : (N, ≤) → (N, ≤) defined as g(n) =
n + (−1)^n is one-to-one and onto (N, ≤) but is not an order isomorphism.
If g(n) = an , witness a0 = 1, a1 = 0, a2 = 3, a3 = 2, . . .. We see that g
does not respect the order of the elements.

− Consider the set N × N × N = {(a0 , a1 , a2 ) : ai ∈ N} ordered lexicograph-

ically. We see that the set

S(0,1,0) = {(0, 0, i) : i ∈ N}

is an initial segment of N × N × N since (0, 0, i) < (0, 1, 0) for all natu-

ral numbers i. Verify that the function f : N → N × N × N defined as
f(n) = (0, 0, n) maps N order isomorphically onto S(0,1,0).
Does there exist some other order isomorphism which maps N onto
S(0,1,0)? (A statement proven in Proposition 26.5 below will confirm that
there can be no other.)
− Suppose <∗ orders the elements of N as follows:
· If n is even and m is odd, then n <∗ m.
· If n and m are both even or both odd, then n and m respect the
usual order of natural numbers. That is,

(N, <∗ ) = {0, 2, 4, 6, . . . , 1, 3, 5, 7, . . .}

Then the function f : N → (N, <∗ ) defined as f(n) = 2n, maps N order-isomorphically
onto the initial segment {0, 2, 4, 6, . . .} of (N, <∗ ).
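
The examples above can be spot-checked numerically. The sketch below tests the strictly increasing property of Definition 26.4 on a finite sample of N, so it can refute the property but cannot prove it. The names `is_strictly_increasing`, `double`, `swap` and `star_key` are hypothetical; `star_key` is one assumed way of encoding the ordering <∗ (evens first, then odds) as a sort key.

```python
def is_strictly_increasing(f, domain, target_key=lambda t: t):
    """Check x < y  =>  f(x) < f(y) on a finite sample of the domain.
    target_key encodes the order on the target set; comparing consecutive
    elements suffices because the orders involved are transitive."""
    xs = sorted(domain)
    return all(target_key(f(x)) < target_key(f(y)) for x, y in zip(xs, xs[1:]))


sample = range(20)                  # a finite piece of N
double = lambda n: 2 * n            # f(n) = 2n
swap = lambda n: n + (-1) ** n      # g(n) = n + (-1)^n
star_key = lambda n: (n % 2, n)     # n <* m  iff  star_key(n) < star_key(m)

print(is_strictly_increasing(double, sample))            # prints True
print(is_strictly_increasing(swap, sample))              # prints False
print([swap(n) for n in range(4)])                       # prints [1, 0, 3, 2]
print(is_strictly_increasing(double, sample, star_key))  # prints True
```

The last line treats f(n) = 2n as a map into (N, <∗), matching the final example in the list.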

26.5 Basic properties of order isomorphisms.

We now list and prove a few basic properties of order isomorphisms. We will
refer to these often in our study of those sets we will call ordinals.

Proposition 26.5 Let (S, ≤S ) and (T, ≤T ) be well-ordered sets.

a) The inverse of an order isomorphism is an order isomorphism.
b) If f : (S, ≤S ) → (S, ≤S ) is a strictly increasing function mapping S into
itself, then x ≤ f(x), for all x ∈ S.
c) No initial segment of S can be an order isomorphic image of S.
d) If f : (S, ≤S ) → (S, ≤S ) is an order isomorphism mapping S onto itself,
then f is the identity function.6
e) If f : (S, ≤S ) → (T, ≤T ) and g : (S, ≤S ) → (T, ≤T ) are two order
isomorphisms mapping S onto T , then f = g.
f) Suppose f : (S, ≤S ) → (T, ≤T ) is an order isomorphism mapping S onto
an initial segment of T . Then S and T cannot be order isomorphic.

6 An order isomorphism from an ordered set onto itself is called an order automorphism.

Here we are stating that the only automorphism is the identity function.

Proof:
a) What we are given: That f : S → T is an order isomorphism mapping
the well-ordered set (S, ≤S ) onto (T, ≤T ).
What we are required to show: That f⁻¹ : T → S must also be an order
isomorphism:
To see this, let u, v be elements in T such that u <T v. Since f is
one-to-one and onto, there exist distinct elements a = f⁻¹(u) and
b = f⁻¹(v) in S. Since S is well-ordered, it is linear and so all elements
in S are comparable. So either f⁻¹(u) = a <S b = f⁻¹(v) or
f⁻¹(v) = b <S a = f⁻¹(u). If b <S a, then, since f is order preserving,
f(b) = v <T u = f(a), a contradiction. So a = f⁻¹(u) <S f⁻¹(v) = b.
So f⁻¹ : T → S must also be an order isomorphism.
b) What we are given: That (S, ≤) is well-ordered, that f : S → S, and
that x < y implies f(x) < f(y) (that is, f is strictly increasing).
What we are required to prove: That x ≤ f(x) for all x. That is, f
cannot map an element x “below itself”.
Suppose there exists an element x of S such that f(x) < x. Then, the
set
T = {x ∈ S : f(x) < x}
is non-empty. We claim that this will lead to a contradiction:
− Since S is well-ordered, T must contain a least element, say a. Since
a ∈ T , f(a) < a.
− Since f is strictly increasing,

f(a) < a ⇒ f(f(a)) < f(a)

− By definition of T , f(f(a)) < f(a) implies f(a) ∈ T . Since a is

the least element of T , a ≤ f(a). But a ≤ f(a) and f(a) < a
are contradictory statements. The source of this contradiction is
supposing that T ≠ ∅.
We must conclude that T = ∅. That is, for all x ∈ S, x ≤ f(x).
c) What we are given: That (S, ≤) is a well-ordered set.
What we are required to show: That S cannot be order isomorphic to
an initial segment of itself.
An initial segment of S must be of the form Sa = {x ∈ S : x < a} where
a ∈ S. If f : (S, ≤) → Sa is an order isomorphism onto Sa , then f must
map a to some element f(a) in Sa ; this means f(a) < a. By definition of
order isomorphism, f is strictly increasing on S and so f(f(a)) < f(a).
But, by part b) above, x ≤ f(x) for all x ∈ S and so f(a) ≤ f(f(a)).
The statements f(f(a)) < f(a) and f(a) ≤ f(f(a)) are contradictory.
So no initial segment of S can be the order isomorphic image of S.
d) What we are given: That f : (S, ≤) → (S, ≤) is an onto order isomor-
phism.

What we are required to show: That f(x) = x for all x ∈ S.

If f is an order isomorphism from S onto itself, then both f and f⁻¹
must be strictly increasing functions. By part (b) above, x ≤ f(x) and
x ≤ f⁻¹(x), for all x ∈ S. Suppose s < f(s) for some s ∈ S. Then

f⁻¹(s) < f⁻¹(f(s)) = s

The statements f⁻¹(s) < s and s ≤ f⁻¹(s) are contradictory. So there
can be no element s ∈ S such that s < f(s). Then x ≤ f(x) and
x ≮ f(x) forces f(x) = x for all x ∈ S.
e) Suppose f : (S, ≤S ) → (T, ≤T ) and g : (S, ≤S ) → (T, ≤T ) are two
order isomorphisms mapping S onto T . Then f⁻¹ : T → S is an order
isomorphism and so the function

f⁻¹ ◦ g : (S, ≤S ) → (S, ≤S )

is an order isomorphism of S onto itself. By part (d) f⁻¹ ◦ g must be the
identity map. Then g = (f⁻¹)⁻¹ = f.
f) What we are given: f : (S, ≤S ) → (T, ≤T ) is an order isomorphism
mapping S onto an initial segment Tu of T .
What we are required to show: That S and T cannot be order isomor-
phic.
If S and Tu are order isomorphic and S and T are order isomorphic,
then T is order isomorphic to Tu contradicting the statement in part
(c). So S and T cannot be order isomorphic.
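
Parts b) and d) of Proposition 26.5 can be checked exhaustively on a small finite well-ordered set. The following sketch is only an illustration under the assumption S = {0, 1, 2, 3} with the usual order; the function name is hypothetical. It enumerates every function from S into S, keeps the strictly increasing ones, and tests the two claims.

```python
from itertools import product


def strictly_increasing_self_maps(n):
    """All strictly increasing maps f : {0,...,n-1} -> {0,...,n-1},
    each represented as the tuple (f(0), ..., f(n-1))."""
    return [f for f in product(range(n), repeat=n)
            if all(f[i] < f[i + 1] for i in range(n - 1))]


maps = strictly_increasing_self_maps(4)
print(all(x <= f[x] for f in maps for x in range(4)))  # part b): prints True
print(maps == [(0, 1, 2, 3)])                          # part d): prints True
```

On a finite set the only strictly increasing self-map is the identity, so the check of part b) is trivial here; on N the map n ↦ 2n gives a non-trivial instance of x ≤ f(x).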

We highlight some important points that are made in the above statements.
Firstly, if f : S → T is an order isomorphism mapping the well-ordered set S
onto the well-ordered set T , then there can be no other one. This is important, since it points to a
crucial difference between equipotent sets and order isomorphic sets. There
can be many different functions which map a set S one-to-one onto a set
T . But if (S, <S ) and (T, <T ) are known to be order isomorphic, then only
one order isomorphism can bear witness to this fact.7 We might say that
an order-isomorphism is “sensitive” to the structure of a well-ordered set,
while equipotence is not. For example, the equipotence relation perceives
({0, 1} × N, <lex ) simply as a countable set allowing for many ways of map-
ping N one-to-one and onto this set, while an order isomorphism is sensitive
to the fact that this set is made of two copies of N lined up one after the
7 Note that even if the order isomorphism between two initial segments is unique, it is

still entirely possible for an initial segment to be mapped order-isomorphically onto another
subset of a well-ordered set. For example, the initial segment {0, 1, 2, 3} can be mapped
order-isomorphically to the non-initial-segment A = {11, 12, 13, 14}. But note that A is not
an initial segment of N.

other and so cannot view this set as a single copy of N.

Secondly, two initial segments of the same well-ordered set are order isomor-
phic only if they are equal.
Thirdly, a well-ordered set can never be order isomorphic to an initial seg-
ment of itself. This again underlines an important difference with the equipo-
tence relation. By definition, an infinite set is precisely a set which is equipo-
tent with a proper subset of itself.

26.6 Ranking well-ordered sets with order isomorphisms.

We will show how order isomorphisms can be used to “rank” well-ordered
sets. We begin by introducing the following notation.

Notation 26.6 Let S and T be two well-ordered sets. Then the expression

S ∼WO T

means “S and T are order isomorphic”. The expression

S <WO T

means “S ∼WO Ta ” where Ta is some initial segment of T . The expression

S ≤WO T

means “S ∼WO T or S <WO T ”.

If
W = {S ∈ S : S is well-ordered}
denotes the class of all well-ordered sets, verify that the relation ∼WO is
reflexive, symmetric and transitive on W and so is an equivalence relation.
See that ≤WO is also reflexive and transitive on W . The relation ≤WO is
not antisymmetric on W in the usual sense, since S ≤WO T and T ≤WO S
implies S ∼WO T , not S = T .8 But ≤WO can always be used as a ranking
tool for the elements of W . We will now show that any two well-ordered
sets are ≤WO -comparable. That is, given any two well-ordered sets, S and
T , either S ≤WO T or T ≤WO S. The reader should carefully note how the
“well-ordered properties” are used in various parts of the proof.

8 However, if W ∗ = {[S]WO : S ∈ W } denotes the class of all equivalence classes induced
by the equivalence relation, ∼WO , on W , then the statement in the theorem will allow us
to conclude that ≤WO induces a linear ordering on W ∗ .

Theorem 26.7 Let (S, ≤S ) and (T, ≤T ) be two well-ordered sets. Then ei-
ther S ≤WO T or T ≤WO S.
Proof:
What we are given: Two well-ordered sets (S, ≤S ) and (T, ≤T ). The expres-
sion Sa represents the initial segment {x ∈ S : x < a} whose leader is a.
We are required to show that: S ≤WO T or T ≤WO S
The symbol Sa ∼WO Tb is to be interpreted as “the initial segment Sa of S
is order isomorphic to the initial segment Tb of T ”.
We define the function f : S → T as follows: f(a) = b if and only if
Sa ∼WO Tb . We will carefully examine this function and describe its proper-
ties.
− We verify that f is well-defined : If f(a) = b and f(a) = c, then Sa ∼WO Tb
and Sa ∼WO Tc . This implies Tb ∼WO Tc . Two initial segments of the same
well-ordered set are order isomorphic if and only if they are equal. Then
b = c.
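− We verify that the domain of f is non-empty: We may assume that S and T
are non-empty (if either is empty, the statement of the theorem is immediate).
Suppose 0S and 0T denote the least elements of S and T respectively. Then
S0S and T0T are both empty, so S0S ∼WO T0T and f(0S ) = 0T . Similarly, if 1S
and 1T denote the least element in S − {0S } and T − {0T }, respectively (when
these exist), then S1S ∼WO T1T and f(1S ) = 1T . So the domain of f contains
at least the element 0S . Let D denote the domain of f.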
− We verify that f is strictly increasing: Suppose a and b are in the domain
D of f such that a <S b. If f(a) = c and f(b) = d, then Sa ∼WO Tc and
Sb ∼WO Td . Then
Tc ∼WO Sa and Sa ⊂ Sb and Sb ∼WO Td
So Tc is order isomorphic to an initial segment of Td . This implies
Tc ⊂ Td ⇒ c < d. So f is strictly increasing on D.
If D = S, then f maps S order isomorphically into T (since f has been
shown to be strictly increasing on D); as shown below, f[D] is then either T
or an initial segment of T . It follows that D = S ≤WO T , and we
are done. So let us suppose that D ≠ S.
− We claim that the domain D is an initial segment of S: If u ∈ D, then
Su ∼WO Tk for some k ∈ T . That is, there exists an order isomorphism
g : Su → Tk . If x <S u, then Sx ⊂ Su and g|Sx : Sx → Tk maps Sx onto
an initial segment, say Tt , in Tk . Then Sx ∼WO Tt implies f(x) = t. Then
x ∈ D. So D is an initial segment of S as claimed.
− We claim that if f[D] ≠ T , then f[D] is an initial segment of T : Let
v ∈ f[D]. Then there exists an element a ∈ D such that f(a) = v. This
implies that Sa ∼WO Tv . Let u < v in T . Then Tu ⊂ Tv . Since Sa ∼WO Tv ,
then Tu is order isomorphic to an initial segment Sb ⊂ Sa , for some b ∈ D.
Then f(b) = u. So u ∈ f[D]. Since f[D] ≠ T , by definition, f[D] is an
initial segment of T as claimed.

− We claim that f[D] = T : Suppose not. Recall that D is an initial seg-

ment of S and so there exists q ∈ S such that D = Sq . Then, as we
just showed, f[D] = f[Sq ] must be equal to an initial segment, say Tr ,
for some r ∈ T . Then, since f is an order isomorphism, Sq ∼WO Tr .
This means that f(q) = r. So q ∈ D. But this contradicts the fact that
D = Sq = {x ∈ S : x < q}. We must conclude that f[D] = T as claimed.
We have thus shown that either f maps S order isomorphically into T or f
order isomorphically maps an initial segment D of S onto T . From this we
conclude that for any well-ordered sets (S, ≤S ) and (T, ≤T ), either S ≤WO T
or T ≤WO S.
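
For finite well-ordered sets, the function f built in the proof can be computed directly. The sketch below relies on a standard fact that is an assumption here, since the text has not stated it at this point: two finite well-ordered sets are order isomorphic exactly when they have the same number of elements. Under that assumption, Sa ∼WO Tb holds exactly when a and b occupy the same position in their respective orderings. The function name and the sample sets are hypothetical.

```python
def compare_well_ordered(S, T):
    """S and T are finite lists, each listed in increasing order of its own
    well-ordering. Returns the partial map f (with f(a) = b whenever
    S_a ~ T_b) together with the verdict given by Theorem 26.7."""
    f = {a: b for a, b in zip(S, T)}  # k-th element of S -> k-th element of T
    domain_is_S = len(f) == len(S)
    image_is_T = len(f) == len(T)
    if domain_is_S and image_is_T:
        return f, "S ~ T"
    if domain_is_S:
        return f, "S < T"  # f maps S onto a proper initial segment of T
    return f, "T < S"      # f maps a proper initial segment D of S onto T


f, verdict = compare_well_ordered(["a", "b", "c"], [10, 20, 30, 40, 50])
print(verdict)  # prints S < T
print(f)        # prints {'a': 10, 'b': 20, 'c': 30}
```

In the finite case the comparison reduces to comparing lengths; the content of Theorem 26.7 is that the same dichotomy holds for arbitrary well-ordered sets, where no such counting is available.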

The above theorem states that if we gather all well-ordered sets together to
form a class of sets, we can rank them with the relation ≤WO . Note that in
this class, distinct well-ordered sets may well be equal or equipotent sets. For
example, the set (N∗ , <∗ ) = {0, 2, 4, . . . , 1, 3, 5, 7, . . .} of all natural numbers
where the even numbers are first enumerated in the usual order followed
by all odd numbers enumerated in the usual way, is simply another way of
describing the set N. Nevertheless, N <WO N∗ . Even though N and N∗ are
the same set, they are not order isomorphic. On the other hand, we easily see
that N∗ and the lexicographically ordered set {0, 1}×N are order isomorphic.
It is also interesting to note that every well-ordered set is order isomorphic to an
initial segment of another well-ordered set. Indeed, if S is well-ordered by ≤, then S is order
isomorphic to the initial segment {1} × S of the lexicographically ordered
set {1, 2} × S. In relation to such lexicographically ordered sets, we present
the following more general result.

Proposition 26.8 For every natural number n, the lexicographically ordered

set S = {1, 2, . . ., n} × N is well-ordered.

Proof:
Let T be a non-empty subset of S. Let u be the least element of the
set {r ∈ {1, 2, . . . , n} : (r, t) ∈ T for some t ∈ N}. Since every natural
number is well-ordered (14.4), such a number u exists. Let v be the least
number in {t ∈ N : (u, t) ∈ T }. Since N is well-ordered (14.3), such a number
v exists. Then (u, v) ∈ T and (u, v) ≤ (i, j) for all (i, j) ∈ T . Hence, every
non-empty subset T of S has a least element. So S is <-well-ordered.
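
The remark preceding Proposition 26.8, that N∗ and the lexicographically ordered set {0, 1} × N are order isomorphic, can be made concrete. The sketch below proposes one candidate isomorphism and checks on a finite sample that it respects the two orders; the names `star_less` and `to_pair` are hypothetical, and a finite check is an illustration rather than a proof.

```python
def star_less(n, m):
    """n <* m: every even number precedes every odd number, and numbers of
    the same parity keep their usual order."""
    return (n % 2, n) < (m % 2, m)


def to_pair(n):
    """Candidate order isomorphism from (N, <*) onto {0, 1} x N:
    2k -> (0, k) and 2k + 1 -> (1, k)."""
    return (n % 2, n // 2)


sample = range(50)
# n <* m holds exactly when to_pair(n) precedes to_pair(m) lexicographically.
print(all(star_less(n, m) == (to_pair(n) < to_pair(m))
          for n in sample for m in sample))  # prints True
print(sorted(sample, key=to_pair)[:6])       # prints [0, 2, 4, 6, 8, 10]
```

Sorting by `to_pair` lists all the even numbers before the odd ones, which is exactly the enumeration of N∗ described above.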

Concepts review:
1. What is a well-ordered set?
2. What is an initial segment of a well-ordered set?
3. Is a well-ordered set an initial segment of itself?
4. Give three examples of well-ordered sets.
5. Is the lexicographic ordering of N × N a well-ordering? Why or why
not?
6. List all initial segments of the natural number 7.
7. Give an infinite initial segment of N^{0,1,2} .
8. What is an order isomorphism between two well-ordered sets?
9. Can a well-ordered set be order isomorphic to one of its initial seg-
ments?
10. If f : S → T where (S, ≤S ) and (T, ≤T ) are linearly ordered sets,
what does it mean to say that f is strictly increasing?
11. If a well-ordered set S is order isomorphic to an initial segment of
a well-ordered set T , can S and T be order isomorphic?
12. What can we say about two well-ordered sets S and T in reference
to order isomorphism?
13. How many order isomorphisms are there between an initial segment
and itself?
14. If S and T are order isomorphic sets and f and g are two order
isomorphisms mapping S onto T , what can we say about f and g?

EXERCISES

A. 1. Is the set of all prime numbers ordered in the usual way a well-ordered set?
2. List the first three initial segments of the set of all prime numbers ordered
in the usual way. Are initial segments of prime numbers initial segments of
N?
3. Let S = {0} ∪ {1/(n + 1) : n ∈ N} be ordered by < in the usual way. Is the set
S a well-ordered set? Justify.

B. 4. Let (S, ≤) be a well-ordered set. We say that an element b is an immediate
successor of a if a < b and there does not exist an element c such that a < c < b.
Show that if S is a well-ordered set, then every element of S that is not
the maximal element of the set must have an immediate successor.
5. Suppose (S, ≤) is a well-ordered set. Is there an order relation we can

define on the set {T : T is an initial segment of S} which will make it a
well-ordered set?
6. Let No denote the set of all odd natural numbers ordered in the usual way.
Are No and N order isomorphic? If so say why. If not explain why.
7. Are the sets of all prime numbers and N both ordered in the usual way
order isomorphic? If so, say why. If not, explain why.
8. Consider the set {1, 2} × N when ordered lexicographically.
a) List the first few elements of {1, 2} × N.
b) Show that {1, 2} × N ordered lexicographically is a well-ordered set.
c) List three finite initial segments of {1, 2} × N. List three infinite initial
segments of {1, 2} × N.
d) In how many ways (if any) can we map N order isomorphically onto an
initial segment of {1, 2} × N?
e) In how many ways (if any) can we map N order isomorphically onto
{1, 2} × N?

C. 9. Suppose S is a countably infinite set. Show that there exists a well-ordering

<S such that (S, <S ) and ({0, 1} × N, <lex ) are order isomorphic.
10. Let S = {1 − 1/(n + 1) : n ∈ N} ∪ {1} be ordered by < in the usual way.
a) Is the set S a well-ordered set?
b) Are the sets S and N order isomorphic? If so, show why. If not, explain
why not.

27 / Ordinals: definition and properties

Abstract. In this section we provide some motivation for the construc-
tion of what we will call the “ordinal numbers”. We will see that N, as
well as all of its elements, are ordinal numbers. When N is viewed as an
ordinal number, it is represented by the symbol, ω. We review the notions
of “transitive sets” and “∈-well-ordering”. We then define “the immedi-
ate successor” of an element of a linearly ordered set, and show that the
immediate successor of an ordinal number is an ordinal number. We also
exhibit an ordinal number successor formula, α+ = α ∪ {α}, and show how
it is used to recursively construct further ordinals. We then prove a few
basic properties of ordinal numbers. From this we deduce that all pairs of
distinct ordinals are ∈-comparable. We define “limit ordinals”, show how
these are constructed and provide methods to recognize them.

27.1 Introduction.
Our study of infinite sets began with a declaration of what it means for a set
to be infinite. We stated that only a set S “which can be mapped one-to-one
onto a proper subset of itself” is referred to as being “infinite”. All other
sets are said to be “finite”. Then we discovered that infinite sets could be
subdivided into two categories: Those that are one-to-one images of N −
referred to as “countably infinite” sets − and those that are not − referred
to as “uncountably infinite”. Then, we discovered that the class of uncount-
ably infinite sets actually has a more complicated structure. We found that
not all uncountable sets were pairwise equipotent. We were led to this con-
clusion when we proved that no infinite set S could be mapped one-to-one
onto its power set P(S). This implied that we could partition the class of all
sets into infinitely many subclasses, each containing sets which are
pairwise equipotent. Up to now, our attention has mainly been centered
on investigating the properties of those sets which belong to the class of all
countably infinite sets and the class of all sets which are equipotent to R
(since the sets N and R are the two sets we are the most familiar with).
Within the class of all sets, we investigated the subclass of sets known to
be equipped with a “well-ordering” binary relation. When equipped with a
well-ordering, these are called well-ordered sets. The ones we have exhibited
up to now are all countable. In this section we will show how to construct
new well-ordered sets from old ones. We saw that order isomorphisms allow
us to partition further the class of well-ordered sets. For example, the class
of all well-ordered sets can be partitioned into subclasses of pairwise order
isomorphic sets. Recall that an “order isomorphism” between two well-ordered
sets S and T is a one-to-one function f from S onto T which respects the order
of the elements: u < v in S if and only if f(u) < f(v) in T . For
convenience, we introduced the following notation:

S ∼WO T ⇔ “S and T are order isomorphic”

S <WO T ⇔ S ∼WO W for some initial segment W of T
S ≤WO T ⇔ S ∼WO T or S <WO T

We were able to show that all pairs of well-ordered sets are ≤WO -comparable.
This is in striking contrast with our first attempts at grasping the structure
of the class of all sets. The reader will recall that we were not clear on how
to prove that “↪e∼” linearly orders the class of all sets, even though we
strongly suspect that we will eventually be able to show that this is the case.
Our ultimate objective in this section (and the one that follows) will be to
construct a “well-ordered class of well-ordered sets” which contains an order
isomorphic copy of every well-ordered set. We will see that ZFC provides
us with the tools to construct a class of sets which serves this purpose. The
elements of this class of sets will be called ordinals.

27.2 Definition of “ordinal number”.

The reader will recall that every natural number is a “transitive set”. Tran-
sitive sets are those sets S that satisfy the rule:

(y ∈ S) ⇒ (y ⊂ S)

This property was shown (Theorem 13.7) to be equivalent to the property

x ∈ y and y ∈ S ⇒ x ∈ S

which is more suggestive of the notion of “transitivity” with respect to the

membership order relation ∈. A proper class which satisfies this transitive
property will be referred to as a transitive class. We showed (in 13.8) that
not only is N a transitive set, but each natural number is also transitive
(13.9). This is easy to see if we reexamine how the natural numbers are
constructed:

∅ = 0
∅+ = 0 ∪ {0} = {0} = 1
1+ = 1 ∪ {1} = {0, 1} = 2
2+ = 2 ∪ {2} = {0, 1, 2} = 3
3+ = 3 ∪ {3} = {0, 1, 2, 3} = 4
⋮
n+ = n ∪ {n} = {0, 1, 2, . . ., n} = n + 1

So n + 1 = {0, 1, 2, 3, . . . , n} ⊆ N is both a subset of N and an element of N
contained in the subset n + 2 = {0, 1, 2, 3, . . ., n + 1} ⊆ N.
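
The successor construction n+ = n ∪ {n} can be imitated with nested frozensets. This is a sketch only; the names successor and natural are ours, and Python's frozenset is used merely as a stand-in for hereditarily finite sets.

```python
def successor(x):
    """Return x+ = x ∪ {x} (hypothetical helper)."""
    return x | frozenset([x])


def natural(n):
    """Build the natural number n as a set, starting from 0 = ∅."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


three = natural(3)
# 3 = {0, 1, 2}
assert three == frozenset({natural(0), natural(1), natural(2)})
# 2 is both an element of 3 and a proper subset of 3
assert natural(2) in three and natural(2) < three
```

Here `<` on frozensets is Python's proper-subset test, so the last line checks the "element and subset" observation for n = 2.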
The “transitive set” property of a set S does not depend on any particular
order relation on S. But, if every element of a set S is transitive, it makes
it possible for ∈ to take on the role of an order relation on S. It is shown in
Theorem 14.4 and Theorem 14.3 that all natural numbers n, as well as the
set N, are strictly ∈-well-ordered. To say that “N is strictly ∈-well-ordered”
means that N is ∈-irreflexive, ∈-asymmetric and ∈-transitive, that any two
distinct elements are ∈-comparable, and that any non-empty set, S, of natural
numbers contains an element x such that x ∈ y for all other y ∈ S. We will
now discuss ∈-well-ordered sets other than natural numbers.

Definition 27.1 Let S be a set. If S satisfies the two properties,

1) S is a transitive set,
2) S is strictly ∈-well-ordered,
then S is called an ordinal number or simply an ordinal.
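
For finite sets, the two conditions of Definition 27.1 can be checked mechanically. The sketch below assumes sets are modelled as nested frozensets; the function names are our own and are not part of the text.

```python
from itertools import combinations


def natural(n):
    """Build the natural number n as a nested frozenset, with 0 = ∅."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset([x])
    return x


def is_transitive(s):
    """Property 1: every element of s is also a subset of s."""
    return all(y <= s for y in s)


def is_strictly_in_well_ordered(s):
    """Property 2, for a finite set s. A strict linear order on a finite
    set is automatically a well-order, so no least-element check is made."""
    irreflexive = all(x not in x for x in s)
    comparable = all(x in y or y in x for x, y in combinations(s, 2))
    transitive = all(
        x in z
        for x in s for y in s for z in s
        if x in y and y in z
    )
    return irreflexive and comparable and transitive


def is_ordinal(s):
    return is_transitive(s) and is_strictly_in_well_ordered(s)


# {0, 1, {1}} is a transitive set, but 0 and {1} are not ∈-comparable.
t = frozenset([natural(0), natural(1), frozenset([natural(1)])])
assert is_ordinal(natural(4))
assert is_transitive(t) and not is_ordinal(t)
```

The set t illustrates that transitivity alone does not make a set an ordinal; both properties are needed.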

The set N as well as each of its elements are ordinal numbers. Since the set
N of all natural numbers, as well as each natural number, have been shown
(Theorems 13.8, 13.9, 14.4 and 14.3) to be strictly ∈-well-ordered transitive
sets, the class of all ordinals contains infinitely many finite ordinals and at
least one infinite ordinal, namely N. We will continue to (generically) repre-
sent finite ordinal numbers by the usual lower case letters such as m or n,
but infinite ordinal numbers will be represented by lower-case Greek letters,
such as ω, α and β.

Notation 27.2 When viewed as an ordinal number, N will be represented by

the lower-case Greek letter ω.1 We then write,

ω = {0, 1, 2, 3, . . . , }

1 The letter ω is read “omega”. So N has at least three representations, depending on

the context: When simply viewed as a set, we commonly use N, when viewed as a cardinal
number, we commonly use ℵ0, when viewed as an ordinal number we commonly use ω.
Later, the symbol, ω0 , will be used instead of ω (to specify that it is the smallest of all
infinite ordinals).

The reader should note that, by definition, only “sets” can be ordinals. That
is, a strictly ∈-well-ordered proper class is not an ordinal.

27.3 Constructing new ordinals from known ordinals.

Remember that each natural number is constructed with the help of a “suc-
cessor constructing algorithm” n + 1 = n+ = n ∪ {n}. We will use the same
mechanism to construct those ordinal numbers beyond ω.
We first show that ω ∉ ω: If N ∈ N, then, by definition of N, N is a natural
number n. But N cannot be a natural number n for if it were, then n ∈ n,
contradicting n ∉ n proven in Theorem 13.9. So the ordinal number, ω, is
not an element of itself, as claimed.
Since ω is a set, the expression

ω+ = ω ∪ {ω}

is the union of two sets and so is itself a set. Note that ω+ ≠ ω for if it were,
then ω ∈ ω, a contradiction. So, from ω, we have generated a new set ω+ .
This set is represented as ω+ = ω + 1. If we repeat the procedure again
starting with ω + 1, we obtain

ω + 2 = (ω + 1)+ = (ω + 1) ∪ {ω + 1}

But, if α is an ordinal number, is α+ necessarily an ordinal number? We

confirm immediately, with the following theorem, that it is.

Theorem 27.3 If α is an ordinal number, then so is its successor, α+ =

α ∪ {α}.

Proof:
What we are given: That α is an ordinal number (i.e., α is a transitive set
and strictly ∈-well-ordered).
What we are required to prove: That α+ is an ordinal number.
The class α+ is a set : Note that since α is an ordinal α must be a set.
Hence, by Axiom 3 (Axiom of pair), {α} is a set. By Axiom 6 (Axiom of
union), α+ = α ∪ {α} is a set, as claimed.
The set α+ is transitive: Suppose x ∈ α+ = α ∪ {α}. By definition of
“transitive”, it suffices to show that x ⊂ α+ . If x = α, then x ⊂ α ∪ {α} = α+
and we are done. Suppose x ≠ α; then x ∈ α. Since α is transitive x ⊂ α
and so x ⊂ α+ . So α+ is transitive. It follows that when viewed as a relation
on α+ , ∈ is a transitive relation.
The elements of the set α+ are ∈-comparable : Let x and y be distinct
elements in α+ = α ∪ {α}.
Case 1: If x = α, then y ∈ α = x (since x ≠ y). Then x and y are ∈-comparable.
The case y = α is symmetric.
Case 2: If both x, y ∈ α, then, since α is known to be ∈-linearly ordered,
either x ∈ y or y ∈ x. So all pairs of distinct elements in α+ are ∈-comparable.
It follows that the relation “∈” linearly orders α+ .
The relation ∈ is a strict linear ordering of α+ : Since ∈ strictly orders α,
x ∉ x for all x ∈ α. Also α ∉ α, for if α = x ∈ α, then x ∈ x contradicting
the fact that ∈ strictly orders α.
The set α+ is ∈-well-ordered : Let S be a non-empty subset of α+ . Let
T = S ∩ α.
Case 1: If T = ∅, then S = {α}. Since α ∉ α, α must be the least (actually
the only) element of S.
Case 2: Suppose T ≠ ∅. Since α is ∈-well-ordered, there exists an m ∈ T
which is the ∈-least element of T . If S = T , then m is the ∈-least element of
S, as required. If, on the other hand, S = T ∪ {α}, then, since α is the maximal
element in α+ , m ∈ α. Again m is the ∈-least element of S.
We conclude that α+ is strictly ∈-well-ordered. So α+ is an ordinal number.

Definition 27.4 Suppose the set, S, is <-ordered. We say that an element y

in S is an immediate successor of the element x if x < y, and there does not
exist any element z in S such that x < z < y. We say that x is an immediate
predecessor of y if y is an immediate successor of x.

If α is an ordinal number, then we naturally expect α+ to be an immediate
successor of α. We verify that this is indeed the case. Suppose there exists
an element, β, which is “strictly in between α and α+ ” with respect to the
∈-ordering. That is, suppose α ∈ β ∈ α ∪ {α}. Since α ∈ β, we have β ≠ α
(because α ∉ α) and β ∉ α (otherwise β ⊂ α, since α is a transitive set, and
then α ∈ β would give α ∈ α). So “β ∈ α ∪ {α}” is impossible. Hence α+ is an
immediate successor of α with respect to the ∈-well-ordering.
The ordinal constructing mechanism, α+ = α ∪ {α}, can now be used to
construct infinitely many ordinals beyond ω.

ω = {0, 1, 2, 3, . . .}
ω + 1 = ω+ = ω ∪ {ω} = {0, 1, 2, . . .} ∪ {ω} = {0, 1, 2, . . . , ω}
ω + 2 = (ω + 1)+ = (ω + 1) ∪ {ω + 1} = {0, 1, 2, . . . , ω, ω + 1}
ω + 3 = (ω + 2)+ = {0, 1, 2, 3, . . . , ω, ω + 1, ω + 2}
ω + 4 = (ω + 3)+ = {0, 1, 2, 3, . . . , ω, ω + 1, ω + 2, ω + 3}
⋮
ω + n = (ω + (n − 1))+ = (ω + (n − 1)) ∪ {ω + (n − 1)}
      = {0, 1, 2, 3, . . . , ω, ω + 1, . . . , ω + (n − 1)}
⋮

We see that this method for constructing ordinals has a limited range. Are
there any other transitive “∈-well-ordered sets” beyond the set {ω, ω +
1, ω + 2, ω + 3, . . .} of ordinals? Our experience with ordinals tells us that
there can be. Recall that having defined all finite ordinals (natural numbers)
0, 1, 2, 3, . . ., we gathered together all natural numbers to form a new set,
N = ω = {0, 1, 2, 3, . . .}. We then explicitly proved that this new infinite
set, ω, is itself an ordinal. This illustrates that the “immediate successor
constructing algorithm” is not the only way to construct ordinals. Consider,
for example, the set

ω + ω = {0, 1, 2, 3, . . ., ω, ω + 1, ω + 2, ω + 3, . . . , ω + n, . . . , }

obtained by gathering together the ordinals 0, 1, 2, 3, . . ., ω, ω + 1, ω + 2,
ω + 3, . . . . Is ω + ω an ordinal? It is not of the form α+ = α ∪ {α} for some
ordinal α. We will soon show that ω + ω is indeed an ordinal. First, we
must show how the notions of “ordinal” and “initial segment of ordinals”
are different ways of describing the same object.

27.4 An ordinal viewed as an initial segment of another ordinal.

If the ordinal numbers look familiar to us, it is because their properties are
generalizations of properties possessed by the natural numbers. Parts of the
proofs of the statements that follow mimic the proofs of various properties
of the elements of N.
Before we begin, we remind ourselves of what an “initial segment of a set
S” is: A set, U , is an initial segment of an ordered set, (S, <), if and only if
U is a proper subset of S, and

∀u ∈ U, [v < u] ⇒ [v ∈ U ]

Theorem 27.5 Let α be an ordinal number greater than zero.

a) Every element of α is an initial segment of α.

b) Every ordinal α is an initial segment of α+ = α ∪ {α}.

c) Every initial segment of an ordinal α is itself an ordinal number.
d) Every element of the ordinal α is an ordinal number.

Proof:

a) What we are given: That x is an element of the ordinal α.

What we are required to show: That x is an initial segment of α.
Since x ∈ α, and α is strictly ∈-well-ordered, then x ≠ α. We confirm
that x is a proper subset of α : Since x ∈ α and α is a transitive set, then
x ⊂ α.
Let u ∈ x and suppose v ∈ u. We are required to show that v ∈ x. Given
that ∈ linearly orders α, then ∈ is transitive, and so v ∈ u ∈ x ⇒ v ∈ x.
We have shown that x is a proper subset of α which satisfies the “initial
segment” property with respect to ∈, as required.
b) We have shown that if α is an ordinal, then so is α+ = α ∪ {α}. Since
α ∈ α+ , then by part (a) α is an initial segment of α+ .
c) What we are given: x is an initial segment in α where α is an ordinal
number.
What we are required to show: x is an ordinal number.
We claim that x is a transitive set: Let z ∈ y ∈ x. It suffices to show that
z ∈ x. Now y ∈ x ⊂ α implies y ∈ α. Also z ∈ y ∈ α implies z ∈ α (since
α is transitive). So z is ∈-less than y, with respect to α’s order relation ∈.
Since x is an initial segment, z is ∈-less than y and y ∈ x implies z ∈ x.
So x is a transitive set as claimed.
The relation, ∈, is a strict linear ordering of x : All elements of x are
elements of α (since x ⊂ α) so x inherits from α all ∈-ordering properties,
including ∈-transitivity and ∈-linearity. So x is ∈-linearly ordered. Since
u ∉ u for all u in α, then this must be the case for all elements in x. So ∈
is a strict linear ordering of x.
The set x is ∈-well-ordered : Let T be a non-empty subset of x. We are
required to show that T contains an ∈-least element.
Case 1: If T = x, then, since x is an initial segment, T contains the ∈-least
ordinal, 0, and so we are done.
Case 2: Suppose T ⊂ x. Since α is ∈-well-ordered, T = T ∩ x contains its
∈-least element, say y. Since y ∈ T ∩ x, y ∈ x. So y is an element of x
which is the ∈-least element of T . So x is ∈-well-ordered.
So x is a transitive strictly ∈-well-ordered set. We conclude that x is an
ordinal number.
d) If γ is any element of the ordinal α, then by part (a) γ is an initial
segment of α. Having shown in part (c) that initial segments of ordinals are
ordinals, then γ is itself an ordinal.

Recall that ω represents the smallest infinite countable ordinal. We now

show that any infinite ordinal (other than ω itself) contains ω.

Proposition 27.6 Any infinite ordinal α not equal to ω contains ω, as an

element.

Proof:
What we are given: α is an infinite ordinal number not equal to ω.
What we are required to show: ω ∈ α.
We claim that {n : n ∈ ω} ⊂ α.
We prove the claim by induction. Let P (n) be the statement: “The natural
number n belongs to α”.
Base case: We are required to show that P (0) holds true. Let γ be the ∈-
least ordinal in α. If γ = 0, then we are done. Suppose γ ≠ 0. That is,
suppose γ is non-empty. Then, when viewed as a subset of α, it contains a
least element x. We then see that x ∈ γ ∈ α contradicting the fact that γ is
the ∈-least element of α. The source of the contradiction is our supposition
that γ is not zero. Hence, γ = 0 ∈ α. So P (0) holds true.
Inductive hypothesis: Suppose P (n) holds true for some natural number n.
That is suppose that n = {0, 1, 2, 3, . . ., n − 1} ∈ α. Since α is transitive,
n = {0, 1, 2, 3, . . . , n − 1} ⊂ α. Then n + 1 = {0, 1, 2, 3, . . ., n − 1, n} =
n ∪ {n} ⊆ α. Since α is infinite, we actually have n + 1 ⊂ α. Since
n + 1 = {0, 1, 2, 3, . . ., n − 1, n} is an initial segment of α it is an ordi-
nal in α. Then P (n+ ) holds true.
By the principle of mathematical induction, α contains every natural num-
ber, as claimed.
Since ω ≠ α, then ω is an initial segment of α. So ω ∈ α as required.

Proposition 27.7 Let α and β be distinct ordinal numbers. If α ⊂ β, then

α ∈ β.
Proof:
What we are given: That α and β are distinct ordinal numbers.
What we are required to show: That (α ⊂ β) ⇒ (α ∈ β).
Suppose α ⊂ β. The set β − α is non-empty, and so contains its least ele-
ment, say γ. We will show that γ = α. If so, then α ∈ β and we are done.
Since elements of ordinals are ordinals, γ is an ordinal number.
We claim that α ⊆ γ:
− Let x ∈ α ⊂ β. Then x is also an ordinal number. It suffices to show that
x ∈ γ. Suppose x ∉ γ. Since β is ∈-linearly ordered, x ∉ γ implies either
γ ∈ x or γ = x holds true. But γ ∈ x ⊂ α or γ = x ∈ α implies γ ∈ α
(since α is transitive). This contradicts γ ∈ β − α. So x ∈ γ. It follows
that α ⊆ γ as claimed.
We claim that α = γ:
− Suppose α ⊂ γ. Then there exists x ∈ γ − α ⊂ β − α. This means x is
an element in β − α which is strictly ∈-less than its least element γ. This
contradiction is caused by our supposition that α ≠ γ. We conclude that
α = γ as claimed.
We have shown that if α ⊂ β, then α is the least element of β − α and
so α ∈ β, as required.
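
The key step of this proof, that α is the least element of β − α, can be observed on finite ordinals. The sketch below is ours: naturals are modelled as nested frozensets, and we rely on the fact that for finite ordinals x ∈ y exactly when x has fewer elements than y, so the ∈-least element is the one of smallest size.

```python
def natural(n):
    """Build the natural number n as a nested frozenset, with 0 = ∅."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset([x])
    return x


def least_of_difference(alpha, beta):
    """Return the ∈-least element γ of β − α for finite ordinals α ⊂ β
    (hypothetical helper; 'least' is found by comparing sizes)."""
    return min(beta - alpha, key=len)


ordinals = [natural(n) for n in range(6)]
for alpha in ordinals:
    for beta in ordinals:
        if alpha < beta:  # α is a proper subset of β
            assert least_of_difference(alpha, beta) == alpha  # γ = α
            assert alpha in beta                              # hence α ∈ β
```

This only samples the first six finite ordinals; it illustrates the statement and is of course no substitute for the proof.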

The above results have an implication which is worth pointing out
immediately. We have shown in Theorem 27.5 that every ordinal γ is an
initial segment of β = γ ∪ {γ} with respect to the ∈ order relation. Then
we can write
γ = {α ∈ β : α ∈ γ}
where the ordinal γ is the leader of its initial segment. This is the case for
any ordinal γ. This is consistent with what we have observed up to now.
Witness the ordinal, 3 = {0, 1, 2} = {n : n ∈ 3}, where 3 is the leader of the
ordinal 3, and the infinite ordinal, ω = {0, 1, 2, 3, . . . , } = {n : n ∈ ω}, where
ω is the leader of the ordinal ω.2

27.5 The relation, ∈, linearly orders the class of all ordinals.

Similarities between the methods of construction of the ordinals and the
natural numbers strongly suggest that the relation “∈” linearly orders the
class of ordinal numbers. This remains to be proved. Before we do this, we
must prove the following lemma.

2 Unfortunately, we cannot deduce from this that the set {α : α is an ordinal, α ∈ ω + ω}

is an ordinal since we have not yet shown that ω + ω is an ordinal.

Lemma 27.8 If the ordinals α and β are order isomorphic, then α = β.

Proof:
What we are given: The sets α and β are ordinals for which there exists an
onto order isomorphic map f : α → β.
What we are required to show: That α = β.
Let S = {x ∈ α : f(x) ≠ x}. Recall that order isomorphisms are strictly
increasing. Since 0 is the least ordinal of both α and β, f(0) = 0 (if, for
example, f(0) = 1, then f must map some element x > 0 of α to 0 < 1, a
contradiction). Hence, S is not all of α.
If S = ∅, then f(x) = x for all x in α; then α = β and we are done.
Suppose S ≠ ∅. We claim that this will lead to a contradiction:
− Since α is ∈-well-ordered, S contains a smallest element, say d. Since
d ∈ S, then f(d) ≠ d.
We claim: f(d) ⊆ d.
If x ∈ f(d) in β, then there exists z ∈ α such that f(z) = x.
Since f respects ∈-ordering

x ∈ f(d) ⇒ f⁻¹(x) ∈ f⁻¹(f(d))
         ⇒ z ∈ d

Since d is the smallest element such that f(d) ≠ d, z ∈ d ⇒ f(z) = z.
But f(z) = x. So z ∈ d ⇒ x = z ∈ d.
This shows x ∈ f(d) ⇒ x ∈ d. So f(d) ⊆ d, as claimed.

− But we also see that d ⊆ f(d), since

u ∈ d ⇒ f(u) = u and f(u) ∈ f(d) ⇒ u ∈ f(d)
So d ⊆ f(d) and f(d) ⊆ d implies d = f(d), which contradicts the fact
that f(d) ≠ d.
So S must be empty. This means that f is the identity map. We can only
conclude that α = β.

Theorem 27.9 The relation “∈” linearly orders the class of all ordinals.

Proof:

We have shown in Theorem 26.7 that any two well-ordered sets S and T
are either order isomorphic or one is order isomorphic to an initial segment
of the other. Let α and β be distinct ordinals. By Lemma 27.8, α and β are
not order isomorphic, so one must be order isomorphic to an initial segment
of the other. Suppose, without loss
of generality, that α is order isomorphic to an initial segment γ of β. By part
(c) of Theorem 27.5, γ must be an ordinal number. Since α and γ are order
isomorphic ordinals, then, by Lemma 27.8, they must be the same ordinal
number. So α = γ is a proper subset of β and, by Proposition 27.7, α ∈ β.
We can conclude that any two distinct ordinal numbers are
∈-comparable; so “∈” linearly orders the class of all ordinals.
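
On finite sets, the comparability just proved can be tested directly: for two ordinals, exactly one of a ∈ b, a = b, b ∈ a should hold. The helper in_compare below is our own illustrative name, and sets are again modelled as nested frozensets.

```python
def natural(n):
    """Build the natural number n as a nested frozenset, with 0 = ∅."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset([x])
    return x


def in_compare(a, b):
    """Return which one of 'a ∈ b', 'a = b', 'b ∈ a' holds; raise
    ValueError if not exactly one does (hypothetical helper)."""
    holds = {"a ∈ b": a in b, "a = b": a == b, "b ∈ a": b in a}
    true_ones = [name for name, value in holds.items() if value]
    if len(true_ones) != 1:
        raise ValueError("the two sets are not ∈-comparable")
    return true_ones[0]


assert in_compare(natural(2), natural(5)) == "a ∈ b"
assert in_compare(natural(3), natural(3)) == "a = b"
assert in_compare(natural(4), natural(1)) == "b ∈ a"
```

For sets that are not ordinals the comparison can fail: for instance, {1} and 2 = {0, 1} are distinct and neither is an element of the other, so in_compare raises ValueError on that pair.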

The immediate successor of an ordinal is unique. Since “∈” linearly orders

the class of ordinals, then this restricts the number of immediate successors
an ordinal can have. For, suppose β1 and β2 are immediate successors of the
ordinal α. If β1 ≠ β2 , then either β1 ∈ β2 or β2 ∈ β1 . Suppose without loss
of generality that β1 ∈ β2 . Then α ∈ β1 ∈ β2 . But this implies that β2 is
not an immediate successor of α, a contradiction. We must conclude that
β1 = β2 . So the immediate successor of an ordinal is unique.

27.6 Limit ordinals.

We have seen that all ordinals are initial segments of ordinals. Now an initial
segment of a linearly ordered set may or may not contain a maximal element.
For example, the ordinal number

ω + 2 = {0, 1, 2, . . . , ω, ω + 1} = {0, 1, 2, . . ., ω} ∪ {ω + 1} = (ω + 1) ∪ {ω + 1}

contains the maximal element, ω + 1, since every element of ω + 2 is

either ω + 1 or is contained in ω + 1. On the other hand, the ordinal
ω = {0, 1, 2, 3, . . . , } is an initial segment {γ : γ ∈ ω} of the ordinal ω + 1
which has no maximal element. If an ordinal β has a maximal element, say
α, then β is the immediate successor of α. (Equivalently, the maximal ordi-
nal α is an immediate predecessor of β.) We can then divide the class of all
ordinals into two subclasses:
1) Ordinals that contain a maximal element with respect to ∈. These are
precisely the ordinals that have an immediate predecessor. That is,

β + = {0, 1, 2, . . ., β}

where β + is an ordinal which has a maximal element β.

2) Ordinals β that do not contain a maximal element with respect to ∈.

These are the ordinals that do not have an immediate predecessor. They
can be represented as,
β = {α : α ∈ β}
For example, ω = {0, 1, 2, . . .} = {α : α ∈ ω}.
Those ordinals which do not have a maximal element (equivalently, do not
have an immediate predecessor) are called “limit ordinals”. We define this
formally.

Definition 27.10 An ordinal α which does not contain a maximal element

is called a limit ordinal.

27.7 Constructing limit ordinals.

Suppose U is a non-empty set whose elements are ordinals. What can we say
about the union, ∪{α : α ∈ U }, of all ordinals in the set U ? In particular,
we ask the question: Is the union of all ordinals in U necessarily an ordinal?
We will show that it must be so. Whether this union of ordinals is a limit
ordinal, or a non-limit ordinal, will depend on whether the set U contains,
or does not contain, a maximal element with respect to ∈.

Proposition 27.11 Let U be a non-empty set of ordinals which contains a

maximal element β with respect to ∈. Then the union, ∪{α : α ∈ U }, is equal
to the maximal ordinal, β, of U .
Proof:
We are given that β is an ordinal which is the maximal element of a set of
ordinals U . Then α ∈ β, for all α in U which are distinct from β. Since β
is an ordinal, it is a transitive set, and so, for all α ∈ U such that α ≠ β,
α ⊂ β. Then ∪{α : α ∈ U, α ≠ β} ⊆ β. Since β ∈ U , β = ∪{α : α ∈ U }.

We have shown above that, if β is the maximal element of U , ∪{α : α ∈

U } = β. Hence, in such a case, ∪{α : α ∈ U } cannot be equal to U . For
example, if U = 3 = {0, 1, 2}

∪{α : α ∈ U } = ∪{0, 1, 2}
             = 0 ∪ 1 ∪ 2
             = ∅ ∪ {∅} ∪ {∅, {∅}}
             = {∅, {∅}} = 2 ≠ U
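
The same computation can be repeated mechanically for other finite sets of ordinals. The sketch below assumes naturals are modelled as nested frozensets; big_union is our own name for the operation ∪{α : α ∈ U }.

```python
def natural(n):
    """Build the natural number n as a nested frozenset, with 0 = ∅."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset([x])
    return x


def big_union(U):
    """Return ∪{α : α ∈ U} for a finite collection U of finite sets."""
    return frozenset().union(*U)


# U = 3 = {0, 1, 2} has maximal element 2, and its union is 2, not U.
U = natural(3)
assert big_union(U) == natural(2)
assert big_union(U) != U

# A set of ordinals that is not itself an ordinal: the union is still its maximum.
V = {natural(1), natural(4), natural(2)}
assert big_union(V) == natural(4)
```

Both cases agree with Proposition 27.11: when a maximal element exists, the union collapses onto it.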

We will now show that if U is a set of ordinals which contains no maximal

element, then ∪{α : α ∈ U } is a limit ordinal which is not contained in U .

Theorem 27.12 If U is a set of ordinals which does not contain a maximal

element with respect to “∈”, then γ = ∪{α : α ∈ U } is a limit ordinal which
is not contained in U . Furthermore, γ is the ∈-least ordinal which contains all
elements of U .
Proof:
What we are given: That U is a set of ordinals with no maximal element and
γ = ∪{α : α ∈ U }.
What we are required to prove: That γ is a limit ordinal which is not an ele-
ment of U . Also that γ is the least ordinal containing all elements of U .
First note that the class γ is a set. Recall that ordinals are sets (by definition)
and, since U is declared to be a set, γ is the union of a set of sets. Axiom 6
guarantees that γ is a set.
Fact #1 : γ = ∪{α : α ∈ U } is an ordinal.
Proof of fact #1. First see that every pair of elements in γ are ∈-comparable.
Every element of γ is an element of some ordinal in U and so is itself an
ordinal (Theorem 27.5). By Theorem 27.9, any two of them are ∈-comparable.
We verify that the set γ is a transitive set. Let β be an element of γ. Then
β ∈ α for some α ∈ U . Since α is transitive, β ⊂ α. Since α ⊆ γ, β ⊂ γ. Then
γ is a transitive set, as claimed.
We now confirm that the set γ is a well-ordered set. If A is a non-empty subset
of γ, A ∩ α ≠ ∅, for some α ∈ U . Since α is ∈-well-ordered, A ∩ α contains a
least element β. Let ψ be any element (equivalently, ordinal) in A not equal
to β. If ψ ∈ β, then ψ ∈ α. Then ψ is an element of A ∩ α which is strictly
∈-less than the least element β in A ∩ α. Since this cannot be, β ∈ ψ. So β is
the least element of A. Then γ is ∈-well-ordered, as required.
We have thus shown that γ = ∪{α : α ∈ U } is an ordinal. We have proven fact
#1.
Fact #2. The ordinal γ = ∪{α : α ∈ U } does not belong to U .
Proof of fact #2. Note that α ⊆ γ for all α ∈ U . Since γ has been shown to
be an ordinal, then for any α ∈ U not equal to γ, (α ⊂ γ) ⇒ (α ∈ γ) (by
Proposition 27.7). Suppose γ ∈ U . Then, for all α ∈ U such that α ≠ γ, α ∈ γ.
This implies that γ is a maximal element of U , contradicting our hypothesis
stating that U has no maximal element. Then γ ∉ U , as claimed in fact #2.
Fact #3. The ordinal γ = ∪{α : α ∈ U } is a limit ordinal.
Proof of fact #3. Suppose not. That is, suppose γ = β+ = β ∪ {β}. Then β ∈ γ.
This means that β ∈ φ for some ordinal φ ∈ U . Hence β ⊂ φ (since φ
is transitive, and β ≠ φ because β ∉ β). Now (φ ∈ U ) ⇒ (φ ⊂ γ) (we have
shown that γ ∉ U , so we can use strict containment “φ ⊂ γ”). By Proposition
27.7, φ ∈ γ = β ∪ {β}, so φ ∈ β or φ = β; in either case φ ⊆ β. But β ⊂ φ ⊆ β implies
β ⊂ β, a contradiction. The source of this contradiction is our supposition that
γ has an immediate predecessor β. Then γ has no immediate predecessor and
so is, by definition, a limit ordinal.
Fact #4. The ordinal γ = ∪{α : α ∈ U } is the least ordinal which contains all
elements of U .
Proof of fact #4. Suppose α ∈ U . Then α ⊂ γ (strictly, since γ ∉ U ). Since γ
has been shown to be an ordinal, then α ∈ γ (by Proposition 27.7). Then U ⊆ γ.
We claim that γ is the least ordinal satisfying this property. Suppose ψ is
some ordinal such that ψ ∈ γ. Then ψ ∈ α for some α ∈ U . Then α ∉ ψ. Then
U ⊈ ψ. So γ is the
least ordinal which contains all elements of U , as claimed.

We summarize. The above theorem now provides us with an alternate method for

constructing new ordinals from known ordinals. Taking the union of a set U
of ordinals where U contains no ∈-maximal element will always produce a
limit ordinal which does not belong to U .

Corollary 27.13 Let U be a non-empty set of ordinals which contains no

maximal element. The set U satisfies the “initial segment property” if ∀α ∈ U ,
[γ ∈ α] ⇒ [γ ∈ U ]. If U satisfies the “initial segment property”, then U is
the limit ordinal ∪{α : α ∈ U }.
Proof:
What we are given: That U is a set of ordinals which contains no maximal
element and satisfies the initial segment property.
What we are required to show: That U = ∪{α : α ∈ U }.
We have shown above that the set, ∪{α : α ∈ U }, is the least ordinal which
contains all the elements of U . So, certainly, U ⊆ ∪{α : α ∈ U }.
Suppose β ∈ ∪{α : α ∈ U }. Then β ∈ α for some α ∈ U . Since U satisfies
the “initial segment property”, β ∈ U . Then ∪{α : α ∈ U } ⊆ U . We con-
clude that U = ∪{α : α ∈ U } as required.

Examples.
a) We define

ω + ω = {0, 1, 2, 3, . . ., ω, ω + 1, ω + 2, ω + 3, . . .} = ω ∪ {ω + n : n ∈ N}

We conclude that ω + ω is a limit ordinal by arguing as follows:

− Since ω + ω is the union of two countable sets, it is a set.
− It is easily seen that ω + ω contains no maximal element and satisfies
the initial segment property.
− By the above corollary, ω + ω = ∪{α : α ∈ ω + ω} is the least ordinal
which contains all finite ordinals and all infinite ordinals of the form
ω + n where n ∈ N.
Then ω + ω is a limit ordinal number.
b) If we define

ω + ω + ω = (ω + ω) ∪ {ω + ω, ω + ω + 1, ω + ω + 2, ω + ω + 3, . . .}
we can similarly conclude that ω + ω + ω is the least ordinal which contains
all ordinals in ω + ω, and ordinals of the form ω + ω + n, where n ∈ N.
c) Ordinals such as ω + ω, ω + ω + ω and ω + ω + ω + ω are more succinctly
written as
ω2, ω3, ω4, . . .
and so on. We denote the set of ordinals ∪{ωn : n ∈ N} by

ωω

We continue constructing, in this way, larger and larger ordinals. Following
these general principles for representing ordinals, we list a few more of these:

0, 1, 2, 3, . . . , ω, ω2, ω3, . . . , ωn, . . . , ωω = ω^2, . . . , ω^2 + n, . . . , ω^2 + ω, . . . ,
ω^2 + ω + n, . . . , ω^2 + ω2, . . . , ω^2 + ω3, . . . , ω^2 + ωn, . . . ,
ω^2 + ω^2 = ω^2 2, . . . , ω^2 3, . . . , ω^2 n, . . . ,
ωωω = ω^3, . . . , ω^n, . . . , ω^ω, . . . , ω^(ω^2), . . . , ω^(ω^ω), . . .

Since every one of these is a countable union of countable ordinals,
each is countable; so each of these is a set.
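
The order in which the ordinals below ω^ω appear in the list above can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: it encodes an ordinal ω^k c_k + . . . + ω c_1 + c_0 by its tuple of natural-number coefficients (c_k, . . . , c_1, c_0), a standard "Cantor normal form" encoding that is not introduced in this text, and the names `normal` and `less` are hypothetical helpers.

```python
def normal(coeffs):
    """Drop leading zero coefficients (highest power of omega comes first)."""
    coeffs = tuple(coeffs)
    i = 0
    while i < len(coeffs) - 1 and coeffs[i] == 0:
        i += 1
    return coeffs[i:]


def less(a, b):
    """Return True if the ordinal encoded by a is strictly below the one encoded by b."""
    a, b = normal(a), normal(b)
    if len(a) != len(b):
        # A higher leading power of omega always dominates.
        return len(a) < len(b)
    # Same leading power: compare coefficients from the highest power down.
    return a < b


# 0, 1, 2, w, w+1, w2, w3, w^2, w^2+1, w^2+w, w^2+w+1, w^2+w2, w^2 2, w^2 3, w^3
chain = [
    (0,), (1,), (2,),
    (1, 0), (1, 1), (2, 0), (3, 0),
    (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1), (1, 2, 0),
    (2, 0, 0), (3, 0, 0),
    (1, 0, 0, 0),
]
is_increasing = all(less(x, y) for x, y in zip(chain, chain[1:]))
```

In this encoding a longer tuple always represents a larger ordinal, which mirrors the fact that ω^2 lies above every ordinal of the form ωn + m.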

27.8 Characterizations of limit ordinals.

We can also describe limit ordinals in terms of “least upper bounds”. We first
remind the reader of what is meant by “least upper bound” of an ordered set.
Suppose T is a non-empty subset of a strictly ordered set (S, <). If u is an
element of S such that t ≤ u for all t ∈ T , then we say that u is an upper
bound of the set T .
292 Section 27: Ordinals: definition and properties

Definition 27.14 Let T be a non-empty subset of an ordered set (S, <). If

u is an upper bound of the set T and, for any other upper bound v of T ,
u ≤ v, then we say that u is the least upper bound of T . We also abbreviate
the expression by writing u = lub T or u = lub(T ). 3

Note that if a set γ is a non-empty ordinal and for all α ∈ γ, either α ∈ β or

α = β then β is an upper bound of γ. The least upper bound of a “non-limit
(non-empty) ordinal” is always its maximal element. For example,

lub(ω + 2) = lub{0, 1, 2, 3, . . . , ω, ω + 1} = ω + 1

since every element of this set is “∈-less than or equal to” ω + 1. So

lub(ω + 2) ≠ ω + 2

But if β is a limit ordinal (such as ω = {0, 1, 2, 3, . . .}, for example), then
it has no maximal element (immediate predecessor) and so lub(β) ∉ β.

The following theorem provides various ways of recognizing limit ordinals.

Theorem 27.15 Let γ be a non-zero ordinal number. The following are

equivalent:

1) The ordinal γ is a limit ordinal.

2) The ordinal γ is such that γ = ∪{α : α ∈ γ}.
3) The ordinal γ is such that lub(γ) = γ.
P roof:
(1 ⇒ 2) Suppose γ is a limit ordinal. Then γ is an initial segment (27.5)
which has no maximal element. By Corollary 27.13, γ = ∪{α : α ∈ γ}.
(2 ⇒ 1) Suppose the ordinal γ is such that γ = ∪{α : α ∈ γ}. If γ contains a
maximal ordinal β, then α ∈ β or α = β for all α ∈ γ. Then ∪{α : α ∈ γ} = β ≠ γ,
a contradiction. Then γ contains no maximal element. By definition, γ is a
limit ordinal.
(1 ⇒ 3) If γ is a limit ordinal, then γ does not contain a maximal element.
By Theorem 27.12, γ = ∪{α : α ∈ γ} is the ∈-least ordinal containing all
elements of γ; hence, γ is the ∈-least upper bound of γ.
(3 ⇒ 1) Suppose the ordinal γ is such that lub(γ) = γ. If γ has a maximal
element β, then for all α ∈ γ either α ∈ β or α = β. Then β = lub(γ) ∈ γ,
a contradiction. Then γ cannot have a maximal element and so is a limit
ordinal.
3 Instead of “least upper bound of T ”, the word supremum of T is commonly used.

Here are a few examples of least upper bounds of sets of ordinals.

ω = lub{0, 1, 2, 3, . . .} = {0, 1, 2, 3, . . .} = ∪{n : n ∈ ω}

ω + ω = ω2 = lub{0, 1, 2, 3, . . . , ω, ω + 1, ω + 2, . . .} = lub{α : α ∈ ω2}
ω3 = lub{0, 1, 2, 3, . . . , ω, ω + 1, ω + 2, . . . , ω2, ω2 + 1, . . .} = lub{α : α ∈ ω3}
ω4 = lub{α : α ∈ ω4}
lub 4 = lub{0, 1, 2, 3} = 3 ≠ 4 (4 is not a limit ordinal)
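
The last example can be verified on actual sets. The Python sketch below is an illustration only: it models finite ordinals as nested `frozenset` objects, and the helper names `successor`, `ordinal`, `union_of_elements` and `is_transitive` are assumptions of the sketch rather than notation from this text. Limit ordinals such as ω are infinite and so cannot be built this way.

```python
def successor(a):
    """Immediate successor a+ = a ∪ {a}."""
    return a | frozenset([a])


def ordinal(n):
    """The finite ordinal n = {0, 1, ..., n-1}, built up from the empty set."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a


def union_of_elements(a):
    """The set ∪{x : x ∈ a}."""
    out = frozenset()
    for x in a:
        out |= x
    return out


def is_transitive(a):
    """Every element of a is also a subset of a."""
    return all(x <= a for x in a)


four = ordinal(4)
lub_four = union_of_elements(four)  # union of 0, 1, 2 and 3
```

Here `lub_four` coincides with the ordinal 3, which is an element of 4 and differs from 4. This is the behaviour Theorem 27.15 predicts for a non-limit ordinal.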

Concepts review:
1. What is an “ordinal number”?
2. What is a transitive set?
3. What does it mean to say that a transitive set is strictly ∈-well-
ordered?
4. Given two elements x and y of a well-ordered set, what does it mean
to say that y is an immediate successor of x?
5. Give an example of an infinite linearly ordered set which contains
elements with no immediate successor.
6. Describe a method for constructing immediate successors of ordi-
nals.
7. When viewed as an ordinal number, how do we represent N?
8. Given an ordinal number α, which of its elements are initial
segments of α?
9. Can an ordinal number be an initial segment of itself?
10. Which elements of an ordinal number α are themselves ordinal num-
bers?
11. Which elements of an ordinal number α are proper subsets of α?
12. Which subsets of an ordinal number α are elements of α?
13. What can be said about two ordinals which are order isomorphic?
14. How are limit ordinals different from non-limit ordinals?
15. How are the ordinals ω2, ω3 and ω4 described?
16. What kind of ordinals can be represented as γ = ∪{α : α ∈ γ}?

17. What is the least upper bound (supremum) of a limit ordinal α?

18. What is the least upper bound (supremum) of a non-limit ordinal
α?
19. What does the expression lub(α) = α say about the ordinal α?

EXERCISES

A. 1. Find a well-ordered set which is order isomorphic to the ordinal number

ω + 3 = {0, 1, 2, . . . , ω, ω + 1, ω + 2} (other than ω + 3 itself).
2. Let S be a ∈-well-ordered transitive set. Construct another ∈-well-ordered
transitive set which contains S.
3. What is the smallest ordinal that properly contains the ordinal number ω?
Find the smallest ordinal that properly contains all elements of the ordinal
number ω.
4. What is the intersection of all non-zero ordinals?
5. Does the ordinal ω + 2 contain an order isomorphic copy of N which is not
an initial segment of ω + 2?
6. True or False: “Every subset of an ordinal is an ordinal”.

B. 7. Find an ordinal number which is order isomorphic to the set

{1, 3, 5, . . . , 0, 2, 4} of natural numbers ordered in this particular way.
8. Is the union of two distinct ordinal numbers necessarily an ordinal number?
Explain.

C. 9. Show that there cannot exist a largest ordinal number.

10. Can there be an ordinal number that is not the immediate successor of
another ordinal number? Explain.
11. Find three ordinal numbers which are equipotent with the ordinal number
ω + 3.
12. Let U be a set of ordinals. Show that ∩{α : α ∈ U } is an ordinal.
13. Let A be a set of ordinals. Show that the intersection of all ordinals in A
is the least ordinal of the set A.
14. What is the union of all ordinals in ω + 1?
15. Prove that ∪{ωn : n ∈ N} is an ordinal number.
16. What is the smallest limit ordinal in ω3?
17. What is the union of all ordinal numbers in ω3 + 1?
18. Show that the ordinal ωω is countable.
19. Construct a well-ordered set which is order isomorphic to the ordinal ωω
(other than ωω itself).

28 / Properties of the class of ordinals.

Abstract. In this section we discuss the class, O, of all ordinal numbers.
It will be seen to be an ∈-well-ordered transitive proper class of sets. Initial
segments of O are shown to be ordinals. Once we prove the “Principle of
transfinite induction” we show that every well-ordered set is order isomor-
phic to some ordinal. We then show that for any set, S, there exists an
ordinal which cannot be embedded in S. We introduce what is known as
the “Hartogs number of a set” and use this concept to construct a strictly
increasing sequence of uncountable ordinals indexed with the ordinals.

28.1 The well-ordering of the class of ordinal numbers.

In the last chapter we studied those sets called ordinals. By definition, a set is
an ordinal if and only if it is an ∈-well-ordered transitive set. We investigated
the set structure of an ordinal. We also investigated how different ordinals
relate to each other and identified two types of ordinals, limit ordinals and
“non-limit” ordinals. Our investigation of ordinal numbers has revealed that:
1) Each ordinal has an immediate successor ordinal with respect to ∈
(27.3).
2) Each ordinal is both an element and subset of any ordinal that contains
it (27.7).
3) Given any two distinct ordinals, one is an element of the other (27.9).
4) Some ordinals have no immediate predecessor. The first three such
ordinals are

ω = {0, 1, 2, 3, . . .}
ω2 = {0, 1, 2, . . . , ω, ω + 1, ω + 2, . . . , ω + n, . . .}
ω3 = {0, 1, 2, . . . , ω, ω + 1, ω + 2, . . . , ω2, ω2 + 1, ω2 + 2, . . .}

These are called limit ordinals. Limit ordinals are seen to be those
ordinals which do not contain a maximal ordinal. Equivalently, they
are those ordinals α, such that α = lub(α) (27.15).
Such properties are not entirely new to us since the set, N, is seen to satisfy
these very same properties. When viewed as an ordinal we represent N as ω.
We saw that there are ordinals which can be much larger than the ordinal
ω. We exhibited methods to construct large sets of ordinal numbers, each
of which contains the natural numbers, themselves ordinals. In this chapter
we will gather all ordinals together and investigate the class which contains
“all” ordinals. The class of all ordinals is much too big to be called a “set”
(as we shall prove in Theorem 28.4). We can nevertheless study its structure.
296 Section 28: Properties of the class of ordinals.

Notation. The class of all ordinal numbers will be denoted by O.

In the next few pages, we will show that the class, O, itself satisfies most
of the properties possessed by its elements (the ordinal numbers).

We will begin by showing that O is ∈-linearly ordered. Then we will show

that O forms an ∈-well-ordered class.

Theorem 28.1 The relation, ∈, is an order relation on O. In fact, the class,

O, of ordinal numbers is a strict ∈-linearly ordered class.

P roof:
What we are given: O is the class of all ordinals.
What we are required to show: That “∈” is a strict linear order relation on O.
− Since α ∉ α for all α ∈ O, “∈” is irreflexive and asymmetric. We verify
transitivity of the order relation ∈: If α ∈ β and β ∈ γ, then α ⊂ β and
β ⊂ γ; so α ⊂ γ; this implies α ∈ γ. So ∈ is a strict order relation on O.
− That every pair of ordinals are ∈-comparable has been shown in Theorem
27.8.
We conclude that the class, O, of all ordinal numbers is ∈-linearly ordered.

Theorem 28.2 The class, O, of all ordinal numbers is ∈-well-ordered.

P roof:
What we are given: That S is a non-empty subset of ordinal numbers in O.
What we are required to show: That S contains an ordinal which is the
∈-least element of S. That is, S contains an ordinal, β, such that, for all
α ∈ S other than β, β ∈ α.
Suppose γ ∈ S. If γ ∈ α for all α ∈ S other than γ, then γ is the least
ordinal of S and we are done.
Suppose, on the other hand, there is some ordinal α ∈ S such that α ∈ γ.
Then γ ∩ S is a non-empty subset of the well-ordered set γ. This means γ ∩ S
contains a least element, say β.
We claim that β is the ∈-least element of S. Suppose φ is an ordinal in
S such that φ ∈ β. Since β is an element of γ and γ is transitive, φ ∈ γ.
Then φ is an element of S ∩ γ which is ∈-less than the least element, β, of
γ ∩ S. This is a contradiction. The source of the contradiction is our
supposition that φ ∈ β. There can be no such φ in S. So β is the least element of S.
This shows that O is ∈-well-ordered.
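
The search for a least element described in this proof can be traced on finite ordinals. The Python sketch below is an illustration only, not a substitute for the proof: finite ordinals are modelled as nested `frozenset` objects, the names `ordinal` and `least` are hypothetical, and the recursive call stands in for the appeal to the well-ordering of γ (it terminates here because S is finite and γ ∉ γ).

```python
def ordinal(n):
    """The finite ordinal n = {0, 1, ..., n-1} as a nested frozenset."""
    a = frozenset()
    for _ in range(n):
        a = a | frozenset([a])
    return a


def least(S):
    """∈-least element of a non-empty finite set S of ordinals."""
    S = frozenset(S)
    gamma = next(iter(S))                           # pick any γ in S
    below = frozenset(x for x in S if x in gamma)   # the set γ ∩ S
    if not below:
        # No element of S belongs to γ, so γ is the least element of S.
        return gamma
    # Otherwise the least element of γ ∩ S is the least element of S.
    return least(below)


S = {ordinal(5), ordinal(2), ordinal(7)}
beta = least(S)
```

Whichever γ is picked first, the result is the ordinal 2, and 2 is an element of every other member of S.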

Since O is ∈-well-ordered, we can speak of “initial segments” of O with
respect to ∈. Recall the definition of “initial segment” in Definition 26.2. We
know that each ordinal, α, is an initial segment of any ordinal that contains
it and that the initial segment of any ordinal is an ordinal (See Theorem
27.5). The following theorem confirms that initial segments of the class, O,
(with respect to ∈) are precisely the ordinal numbers.

Theorem 28.3 A set S is an initial segment of O if and only if S is an ordinal

number.
P roof:
If S is an ∈-initial segment of O, by Theorem 26.3, there exists an ordinal
γ such that S = {α ∈ O : α ∈ γ}. But {α : α ∈ γ} is an initial segment of
γ ∪ {γ} and so, by Theorem 27.5, is an ordinal.
Conversely, if S = γ for some γ ∈ O, then γ = {α : α ∈ γ}, an initial
segment of O.

Recall that a set or class S satisfies the transitive property if “x ∈ S ⇒
x ⊂ S”. We easily verify that O is a transitive class:

β ∈ O and α ∈ β ⇒ α ∈ O (since elements of ordinals are ordinals)
⇒ β ⊂ O
⇒ O is a transitive class.

28.2 The ∈-well-ordered class, O, is not an ordinal number.

Since O is a transitive ∈-well-ordered class of ordinals, it possesses all the
essential properties of ordinals. But, if O is to be an ordinal, it must also be
a set. The following theorem confirms that O cannot be a set.

Theorem 28.4 The class, O, of all ordinal numbers is not a set.1

1 If one states that the class of all ordinals is a set, it leads to what is referred to as the

Burali-Forti paradox.

P roof:
Suppose the class O of all ordinal numbers is a set. Then, since O is transi-
tive and ∈-well-ordered, it is an ordinal number. Then O ∈ O. Since O is a
transitive set, O ⊂ O. Since no ordinal number can be order isomorphic to a
proper subset of itself (Theorem 26.5), this is a contradiction. So O cannot
be a set.

So, even though O looks, feels and is, in many ways, similar to an ordinal,
it is an entirely different mathematical object. We cannot manipulate it
as if it were a set.

28.3 Principle of transfinite induction over the ordinals.

The principle of mathematical induction over the natural numbers was seen
to be an extremely useful tool to prove that certain properties hold true
for sets whose elements can be indexed by the natural numbers. We remind
ourselves of what it means to prove a statement by mathematical induction
on N (or on the ordinal ω as we can now call it).
Suppose P (n) is a property which holds true depending on the
value of the natural number (the finite ordinal) n. The principle
of induction on ω states that if P (0) holds true, and “P (n) holds
true” ⇒ “P (n + 1) holds true”, then P (n) holds true for all values
of n.
We will show that this principle generalizes to mathematical induction over
the ordinal numbers. We must, however, keep in mind that there is an im-
portant difference between O and ω. Some of the elements of an ordinal
α may be limit ordinals. A limit ordinal β has no immediate predecessor
and so applying the induction algorithm “P (α) ⇒ P (α+)”, alone, cannot
be used to prove that P (β) holds true. Mathematical induction generalizes
to ordinals provided we can verify that P (β) holds true for limit ordinals,
β, as well as for non-limit ordinals.
The principle of mathematical induction can be used to prove statements
about classes or sets whose elements can be indexed by ordinals. For
example, if γ ∈ O and S = {xα : α ∈ γ}, then, since γ is a set and S is
a one-to-one image of γ, by the Axiom of replacement, S is also a set.
Furthermore, the ordinal γ induces a well-ordering on the set S. The
following theorem shows how this is done. It is followed by a second version of
mathematical induction.

Theorem 28.5 Principle of transfinite induction. Let Bγ = {xα : α ∈ γ} be

a set whose elements are indexed by the ordinal γ. Let P denote a particular
element property. Suppose P (α) means “the element xα in Bγ satisfies the
property P ”. Suppose that for any β ∈ γ,

“P (α) is true ∀ α ∈ β” implies “P (β) is true”

Then P (α) holds true for every α ∈ γ. That is, every element of Bγ satisfies
the property P .

P roof:
We are given that for every ordinal β ∈ γ, “P (α) is true for all α ∈ β implies
P (β) is true”. We are required to show that P (α) holds true for every α ∈ γ.
Suppose not. Suppose there exists some ordinal δ ∈ γ such that P (δ) is false.
We claim this will lead to a contradiction: By our supposition, the class
A = {α : α ∈ γ and P (α) is false} is non-empty. Since γ is ∈-well-ordered,
A must have a least element, say λ. That is, λ is the least ordinal in γ such
that P (λ) is false. Since this is the least element of A, P (α) holds true for
all α ∈ λ. By hypothesis, P (λ) must be true. We obtain a contradiction, as
claimed.
Then the set, A, must be empty. So P (α) holds true for all ordinals α ∈ γ.
That is, every element of Bγ satisfies the property P .

The following corollary gives a second version of the Principle of transfinite
induction, preferred by many because of its similarity to the Principle of
mathematical induction. It is, in a way, more transparent than the statement
proven above.

Corollary 28.6 Transfinite induction: A second version. Let Bγ = {xα : α ∈

γ} be a set whose elements are indexed by the ordinal γ. Let P denote a par-
ticular element property. Suppose P (α) means “the element xα in Bγ satisfies
the property P ”. Suppose that the three following conditions,
1) P (0) holds true,
2) P (α) holds true implies P (α + 1) holds true,
3) If β is a limit ordinal in γ, “P (α) is true for all α ∈ β implies P (β) is
true”,
hold true. Then P (α) holds true for all ordinals α in γ. That is, every element
of Bγ satisfies the property P .

P roof:
We are given that P is a property for which conditions 1, 2 and 3 hold true.
We are required to show that P (α) holds true for all ordinals α in γ.
Let λ be an ordinal in γ.
− If λ = 0, then by condition (1), P (λ) holds true.
− Suppose λ has an immediate predecessor, δ (that is, δ+ = λ), such that
P (δ) holds true. By condition (2), P (λ) = P (δ+) holds true.
− Suppose λ is a limit ordinal such that “P (α) is true for all α ∈ λ”. Then
by condition (3), P (λ) holds true.
So, for every λ ∈ γ, “P (α) is true for all α ∈ λ” implies “P (λ) is true”. By
the preceding theorem, P (α) holds true for all ordinals α ∈ γ.
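
For comparison, the two versions of transfinite induction can be written side by side as formulas. The LaTeX block below is only a restatement of Theorem 28.5 and Corollary 28.6 in symbols (it assumes the `amsmath` package for the `aligned` environment). Note that in the first version the case β = 0 is not listed separately: the hypothesis “P(α) for all α ∈ 0” is vacuously true, so the implication already forces P(0).

```latex
% Theorem 28.5: a single hypothesis covering every beta in gamma.
\[
\Bigl[\,\forall \beta \in \gamma\;
   \bigl(\,(\forall \alpha \in \beta\; P(\alpha)) \Rightarrow P(\beta)\,\bigr)\Bigr]
\;\Longrightarrow\;
\forall \alpha \in \gamma\; P(\alpha)
\]

% Corollary 28.6: the same hypothesis split into three cases.
\[
\left.
\begin{aligned}
&(1)\quad P(0)\\
&(2)\quad P(\alpha) \Rightarrow P(\alpha^{+})\\
&(3)\quad \beta \text{ a limit ordinal in } \gamma:\;
          (\forall \alpha \in \beta\; P(\alpha)) \Rightarrow P(\beta)
\end{aligned}
\right\}
\;\Longrightarrow\;
\forall \alpha \in \gamma\; P(\alpha)
\]
```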

28.4 Ordinality of a well-ordered set.

In Theorem 26.7, it was shown that given any two distinct non-order-
isomorphic well-ordered sets, one is order isomorphic to an initial segment
of the other. Since ordinals are, by definition, well-ordered, given any well-
ordered set S and ordinal number α, precisely one of the following two
statements must hold true:
1) The set S is order isomorphic to some ordinal β ∈ α.
2) The ordinal α is order isomorphic to an initial segment of S.
If the well-ordered set S is order isomorphic to some ordinal β, then this is
equivalent to saying that “the elements of S can be indexed by the elements
of the ordinal β”. On the other hand, if the ordinal α is order isomorphic to
an initial segment of S, clearly the elements of α cannot be used to index
the elements of S since α is not “big enough” to be used as an indexing set
for S. It is not yet clear whether, given any well-ordered set S, there exists
some ordinal which can be used to index the elements of S. The following
important theorem guarantees that there are sufficiently many ordinals in
O so that every single well-ordered set S is order isomorphic to some ordi-
nal. That is, any well-ordered set S can be indexed by the elements of some
ordinal number α.

Theorem 28.7 Let S be a <-well-ordered set. Then S is order isomorphic to

some ordinal number α ∈ O. Furthermore, the order isomorphism mapping S
onto α is unique.

P roof:
What we are given: The set S is a <-well-ordered set.
What we are required to show: There exists a unique ordinal α which is or-
der isomorphic to S. The required order isomorphism, f : S → α, is unique.
For each element k ∈ S, let Sk = {x ∈ S : x < k} denote an initial segment
of S. Let

D = {u ∈ S : Su ∼WO αu for some ordinal number αu }

Let
f = {(0S , 0)} ∪ {(u, αu) : u ∈ D} ⊆ D × O 2
Now D is non-empty since 0S ∈ D (where 0S is the <-least element of S).
We claim that f is a function with domain D: Suppose (u, α) and (u, β)
both belong to f. Then Su ∼WO α and Su ∼WO β. So α ∼WO β. By Lemma
27.8, α = β. Since the domain of f is a set, there is a set in the class O which
contains the image of the set D under f (by the Axiom of replacement). So
the relation
f : D → O defined as f(u) = αu
is a function with domain D, as claimed.
The function f is strictly increasing and so is an order isomorphism: If
u, v ∈ D such that u < v, then Su ⊂ Sv . Now, αu ∼WO Su ⊂ Sv ∼WO αv .
Given that αu and αv are ordinals, αu ∈ αv . Then, u < v implies
f(u) = αu ∈ αv = f(v). So f is strictly increasing. We conclude that
f : D → f[D] ⊂ O is an order isomorphism.
We claim that D is a subset of S which satisfies the initial segment property :
If u ∈ D, then f(u) = αu for some αu ∈ O. That is, there exists an order
isomorphism g : Su → αu mapping Su onto αu . If x < u, then Sx ⊂ Su . The
function g|Sx is an order isomorphism mapping Sx onto an initial segment
(equivalently an ordinal), say αx, in αu . Then f(x) = αx. We have shown
that
∀u ∈ D, [x < u ∈ D] ⇒ [x ∈ D]
Hence, D satisfies the initial segment property, as claimed.
2 To better understand the set f , we describe the first few elements. Suppose 1S , 2S , 3S
represent the first few elements of the well-ordered set S.
S1S = {u ∈ S : u < 1S } = {0S } ∼WO {0} = 1 = α1S ⇒ (1S , 1) ∈ f
S2S = {u ∈ S : u < 2S } = {0S , 1S } ∼WO {0, 1} = 2 = α2S ⇒ (2S , 2) ∈ f
S3S = {u ∈ S : u < 3S } = {0S , 1S , 2S } ∼WO {0, 1, 2} = 3 = α3S ⇒ (3S , 3) ∈ f
We now claim that f[D] is an ordinal number: It suffices to show that f[D]
is an initial segment of O and invoke Theorem 28.3 (which states that any
initial segment of O is an ordinal). Since f[D] has been shown to be a set,

then f[D] cannot be equal to the proper class O. Let αv ∈ f[D]. Then there
exists an element v ∈ D such that Sv is order isomorphic to αv . Let β ∈ αv .
Then β ⊂ αv . Since Sv is order isomorphic to αv , then β is order isomorphic
to an initial segment Su ⊂ Sv . Then f(u) = β. So β ∈ f[D]. We have shown
that if β ∈ αv ∈ f[D], then β ∈ f[D], and so, f[D] is an initial segment of
O. Then, by Theorem 28.3, f[D] is an ordinal number, as claimed.
Finally, we claim that D = S: Suppose the domain D of f is not all of S.
We have shown that f[D] is an ordinal number. Say f[D] = γ ∈ O. Since
D ≠ S, there exists some q ∈ S such that Sq = D (having shown that D
satisfies the initial segment property). Then f : D → f[D] = f[Sq ] is an
order isomorphism mapping Sq onto the ordinal γ. So, by definition of f,
(q, γ) ∈ f. This means that q ∈ D. But D = Sq = {x ∈ S : x < q} implies
q ∉ D. This is a contradiction. The source of the contradiction is the
supposition that D ≠ S. Then D = S, as claimed.
Then f maps S order isomorphically onto the ordinal number γ. By Theo-
rem 26.5 part (e), this order isomorphism is unique, as required.
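
For a finite well-ordered set the function f of this proof can be computed directly. A finite ordinal n has exactly n elements, so the initial segment Su is order isomorphic to n precisely when Su has n elements. The Python sketch below is an illustration under that assumption; it identifies the finite ordinal n with the integer n, and the names `order_isomorphism` and `less` are hypothetical.

```python
def order_isomorphism(S, less):
    """Send each u in the finite well-ordered set S to the size of S_u = {x : x < u}."""
    return {u: sum(1 for x in S if less(x, u)) for u in S}


# A four-element set, well-ordered alphabetically.
S = ["pear", "fig", "apple", "kiwi"]
f = order_isomorphism(S, lambda x, y: x < y)

# f is strictly increasing: u < v exactly when f(u) < f(v).
strictly_increasing = all((u < v) == (f[u] < f[v]) for u in S for v in S)
```

The values of `f` run through 0, 1, 2, 3, that is, through the elements of the ordinal 4, so this set has ordinality 4.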

This means that the elements of any well-ordered set S can be indexed by
the elements of some ordinal. That is, if (S, <) is a well-ordered set which is
order isomorphic to some ordinal β, then S can be expressed as the indexed
set S = {sα : α ∈ β}. This makes the set S susceptible to proofs by math-
ematical induction over the ordinal β, an extremely useful tool for proving
various mathematical statements.
At this point, it will be useful to introduce some vocabulary that will allow
us to state which ordinal is order isomorphic to a given well-ordered set S.

Definition 28.8 Let S be a <-well-ordered set. If α is the unique ordinal

which is order isomorphic to S, then we will say that S is of order type α, or
is of ordinality α. If S is of ordinality α, we will write
ord S = α

Note that there can be many well-orderings of the same set. Different
well-orderings may give rise to different ordinalities. The ordinality of a set
describes a property of a well-ordered set only. It doesn’t provide
information on its cardinality.

Example. The set N, ordered in the usual way, can then be said to have
ordinality,
ord N = ω
On the other hand, the reader can verify that the same set, N, can be
well-ordered as
N∗ = {0, 2, 4, 6, . . . , 1, 3, 5, . . .}
When well-ordered in this way, it has ordinality
ord N∗ = ω + ω = ω2
The lexicographically ordered countably infinite set S = {1, 2, 3} × N has
ordinality,
ord S = ω + ω + ω = ω3
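
The claim about N∗ can be spot-checked on a finite sample. In the Python sketch below (an illustration only, with hypothetical names `precedes_star` and `to_ordinal`), the pair (0, b) stands for the finite ordinal b and the pair (1, b) stands for ω + b. Python compares tuples lexicographically, which agrees with the ∈-order on these elements of ω2.

```python
def precedes_star(n, m):
    """True if n comes before m in N* = {0, 2, 4, ..., 1, 3, 5, ...}."""
    if n % 2 != m % 2:
        return n % 2 == 0  # every even number precedes every odd number
    return n < m


def to_ordinal(n):
    """Candidate order isomorphism from N* into ω2, written as a pair (a, b)."""
    return (n % 2, n // 2)


sample = range(50)
order_preserved = all(
    precedes_star(n, m) == (to_ordinal(n) < to_ordinal(m))
    for n in sample
    for m in sample
)
image = {to_ordinal(n) for n in sample}
```

On the sample, the map preserves order and its image is an initial stretch of each of the two copies of ω. The same idea, sending (i, n) to the pair (i − 1, n), handles the set {1, 2, 3} × N.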

28.5 Ordinals viewed as ∼WO -equivalence class representatives.

Let
W = {S ∈ S : S is well-ordered}
denote the class of all well-ordered sets. Recall that S ∼WO T means that S
and T are order isomorphic. We easily see that ∼WO is reflexive, symmetric
and transitive on W , and so is an equivalence relation on this class of sets.
Let
W ∗ = {[S]WO : S ∈ W }
denote the class of all equivalence classes induced by ∼WO . We have shown
that every well-ordered set is order isomorphic to some ordinal. Then if
[T ]WO ∈ W ∗ , the equivalence class, [T ]WO , contains precisely one ordinal
which is order isomorphic to every well-ordered set it contains. This means
that we can adopt the ordinals in O as ∼WO -equivalence class representatives
of the elements in W ∗ .
For example, if ω3 ∈ [S]WO ∈ W ∗ , it means that [S]WO contains precisely all
well-ordered sets of ordinality ω3. If ord T = ω0 , then [T ]WO contains precisely
all well-ordered sets which are order isomorphic to N (when N is ordered in
the usual way). In this case, [T ]WO contains Neven = {0, 2, 4, 6, . . .} but not
the well-ordered set M = {0, 2, 4, 6, . . . , 1, 3, 5, 7, . . .} where ord M = ω + ω =
ω2 ≠ ω.

28.6 Proving the existence of uncountable ordinals.

The reader may have noticed that we have not yet explicitly exhibited an
uncountable ordinal (or even a single uncountable well-ordered set).3 A
potential candidate for the least uncountable ordinal may be the class:
ω1 = {α ∈ O : α is a countable ordinal}
Indeed, this is a well-defined class of ordinals whose elements are precisely all
countable ordinals. We show that it satisfies two fundamental characteristics:

3 If an uncountable well-ordered set is constructed, then, by Theorem 28.7, it must be

order isomorphic to some (uncountable) ordinal number.


a) The class ω1 satisfies the “initial segment” property : Suppose α ∈ ω1

and β ∈ α. Since α is a countable ordinal, then the ordinal β is countable
and so β ∈ ω1 . Then ω1 satisfies the “initial segment” property.
b) The class ω1 is uncountable: Suppose ω1 is countable. Since it is the one-
to-one image of the set N, ω1 is a set. Then ω1 is an initial segment of
ordinals which is not equal to O (since O is not a set). Then, by Theorem
28.3, ω1 is an ordinal. It follows that ω1 ∈ ω1 . Since no ordinal can be
an element of itself, we have a contradiction. The class ω1 cannot be a
countable set and so must be uncountable.

Can we conclude from this that ω1 is an ordinal? We are tempted to answer
“Of course”. But someone may object with the question: “What if ω1 = O?”
If so, then ω1 is a proper class. To show that ω1 ≠ O, it will suffice to show
that an uncountable ordinal β exists.
We prove a very clever result known as Hartogs’ lemma.4
If (S, <S ) is a well-ordered set, let RS = {(x, y) ∈ S × S : x <S y} describe
the corresponding order relation.

Lemma 28.9 Hartogs’ lemma.5 Let S be any set. Then there exists an ordi-
nal β which is not equipotent with S or any of its proper subsets.
P roof:
What we are given: That S is a set.
What we are required to prove: That there exists an ordinal β that cannot
be mapped one-to-one onto any subset of S.
Let
MS = {(T, RT ) ∈ P(S) × P(S × S) : T ⊆ S, <T well-orders T }
denote the set of all well-ordered subsets of S.6 By Theorem 28.7, every
well-ordered set, T , has a unique ordinality. So there is a well-defined function
f : MS → O defined as
f((T, <T )) = ord (T, <T ) ∈ O
4 Notice how the Axiom of replacement plays a fundamental role in the proof of Hartogs’
lemma, 28.9.
5 Friedrich Moritz “Fritz” Hartogs (1874-1943) was a German-Jewish mathematician,
known for his work on set theory and foundational results on several complex variables.
Historical note: As a Jew, he suffered greatly under the Nazi regime: he was fired in 1935,
was mistreated and briefly interned in Dachau concentration camp in 1938, and eventually
committed suicide in 1943. (Wikipedia)
6 Suppose (S, <S ) is a well-ordered set. Let RS = {(x, y) ∈ S × S : x <S y}. Then RS can
be viewed as an element of P (S × S), while S can be viewed as an element of P (S). So
(S, RS ) ∈ P (S) × P (S × S)
Then for any subset T of S, (T , RT ) ∈ P (S) × P (S × S).
Let
MS = {(T , RT ) ∈ P (S) × P (S × S) : T ⊆ S, <T well-orders T }
Since both P (S) and P (S × S) are sets, and MS ⊆ P (S) × P (S × S) then MS is also a
set.

Then f[MS ] is a subclass of O. Since MS is a set, the Axiom of replacement
guarantees that f[MS ] is a set (not a proper class) and so f[MS ] ≠ O. Since
f[MS ] is not all of O, there exists an ordinal β ∈ O − f[MS ]. That is, β is
not the ordinality of any well-ordered subset of S.
Claim: That the ordinal β cannot be equipotent to any subset of S.
Proof of claim. Suppose β is equipotent to a subset T of S. That is, suppose,
there exists a function,
h:β→T
which maps β one-to-one and onto the subset T ⊆ S. Then, by Theorem
26.1, T inherits a well-ordering <T from β (that is, h(u) <T h(v) in T if
and only if u ∈ v in β). So β is the ordinality of the well-ordered subset
(T, <T ). This contradicts the fact that β is not in the image of MS under f.
So β cannot be equipotent with any subset of S, as claimed.

Theorem 28.10 There exists an uncountable ordinal.

P roof:
Let the set S in the lemma be the set N of all natural numbers. By Hartogs’
lemma, there is an infinite ordinal β which cannot be mapped one-to-one
onto any subset of N. Since the ordinal β is not equipotent to any subset
of N, then it cannot be countable. So β is an uncountable ordinal. We have
shown that an uncountable ordinal exists.

Corollary 28.11 The class

ω1 = {α ∈ O : α is a countable ordinal }

is the ∈-least uncountable ordinal.

P roof:
Hartogs’ lemma states that there exists an ordinal number, say γ, which is
uncountable. Since every countable ordinal is an element of γ, then ω1 ⊆ γ.
Since γ is a set, then ω1 must also be a set. Then, ω1 cannot be equal to the
class O of all ordinals. It was shown (in paragraph (a) preceding the lemma
above) that ω1 is a subclass of O which satisfies the initial segment property.
Proper subsets of O which satisfy the initial segment property were shown
to be limit ordinals (see Corollary 27.13). Hence, ω1 is an ordinal. Since any
ordinal α ∈ ω1 is countable, then ω1 must be the ∈-least uncountable
ordinal.

We can now lay this problem to rest. Uncountable ordinals exist in the ZFC-
universe.7

28.7 The Hartogs number of a set.

Is there an even larger ordinal, β, such that ω1 ∈ β? To help answer the
question above, we introduce the notion of a Hartogs number of a set.

Definition 28.12 Let S be any set. Let

US = {α ∈ O : α not equipotent to any subset of S}

By Hartogs’ lemma the class, US , is non-empty. Since O is ∈-well-ordered,
US contains a unique least ordinal, which we will denote by h(S). We will call the
ordinal, h(S), the Hartogs number of the set S.8 Then h can be viewed as a
class function, h : S → O, which associates to each set S in the class of all
sets S , a unique ordinal number, α, in the class of all ordinals O.9

By Hartogs’ lemma (28.9), every set S in S is assigned a unique Hartogs
number, h(S). For example, if the set S is the ordinal, 10, the Hartogs
number, h(10), of 10 is the ordinal 11, since 11 is the least ordinal
which cannot be embedded in the ordinal 10. On the other hand, the Hartogs
number, h(ω + 7), of the countable ordinal, ω + 7, is the first uncountable ordinal
ω1 = {α : α is a countable ordinal}.10 We will use the above definition
of Hartogs numbers to show how we can recursively construct an endless,
strictly increasing, ↪e-chain of uncountable ordinals.
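
For finite sets the Hartogs number can be found by brute force, which reproduces the value h(10) = 11 given above. The Python sketch below is an illustration only: it identifies the finite ordinal n with the integer n, the names `embeds` and `hartogs_finite` are hypothetical, and the search stops only because the set is finite. An injection from n = {0, . . . , n − 1} into S is the same thing as a length-n arrangement of distinct elements of S, which is what `itertools.permutations` enumerates.

```python
from itertools import permutations


def embeds(n, S):
    """True if the finite ordinal n can be mapped one-to-one into the list S."""
    # permutations(S, n) yields nothing when n exceeds the size of S.
    return next(permutations(S, n), None) is not None


def hartogs_finite(S):
    """Least finite ordinal n which cannot be embedded in the finite set S."""
    S = list(S)
    n = 0
    while embeds(n, S):
        n += 1
    return n


h_of_ten = hartogs_finite(range(10))  # the ordinal 10 = {0, 1, ..., 9}
```

For an infinite set no such search is available, which is why the text relies on Hartogs’ lemma and the well-ordering of O instead.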
In the proof of the theorem below, we will use a function constructing
procedure called “transfinite recursion”. We have encountered recursively defined
functions before when defining addition (15.2) and multiplication (15.4) on
N. Using transfinite recursion to define a class function over the ordinal
numbers is analogous to recursively defining a function on N (as shown in
Theorem 18.8). It is in fact a generalization of this process. The theorem
which declares that recursively defined functions are well-defined functions
is called the Transfinite recursion theorem.
7 Some readers may be tempted at this point to say ord R = ω1 . But that would be
“jumping the gun”, so to speak. By definition, we can only speak of the ordinality of a set if
it is well-ordered. We have not yet discussed the topic of the well-ordering of R. Furthermore,
we will also have to first take care of a small (but non-trivial) detail called the Continuum
Hypothesis (see page 222).
8 In the literature, the expression “Hartogs number” seems to be more common than the
expression “Hartogs’ number”.
9 See page 84 for the definition of “class function”.
10 The Hartogs number, h(ω + 7), of ω + 7 is not ω + 8 since ω + 8 ∼e ω + 7 ∼e ω.

We will illustrate how the transfinite recursively defined functions are con-
structed in the proof of the following theorem. We will defer the proof of
the Transfinite recursion theorem to the end of this section.
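The finite analogue mentioned above can be made concrete. The sketch below is an illustration only (the names `iterate`, `add` and `mul` are hypothetical and are not the book's Definitions 15.2 and 15.4): it defines a function on N from a starting value and a successor rule, which is the pattern that transfinite recursion extends by adding a rule for limit ordinals.

```python
def iterate(f, u, n):
    """Return g(n), where g(0) = u and g(k + 1) = f(g(k))."""
    value = u
    for _ in range(n):
        value = f(value)
    return value

def add(m, n):
    # m + 0 = m  and  m + (n + 1) = (m + n) + 1
    return iterate(lambda x: x + 1, m, n)

def mul(m, n):
    # m * 0 = 0  and  m * (n + 1) = (m * n) + m
    return iterate(lambda x: add(x, m), 0, n)

print(add(3, 4))  # 7
print(mul(3, 4))  # 12
```

On N only the base case and the successor case occur; the limit case g(β) = lub{g(α) : α ∈ β} has no counterpart in this finite sketch.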

Theorem 28.13 There exists a strictly ↪e-increasing class

{ωα : α ∈ O}

of (pairwise non-equipotent) infinite ordinals, all of which are uncountable
except for ω0 = ω.
Proof:
Let h denote the class function which associates to each set S the unique
least ordinal number which cannot be embedded in S. The ordinal, h(S), is
called the Hartogs number of S. By transfinite recursion, we define the class
function
g : O → O
as follows:

g(0) = ω0 = ord N = ω
g(α+) = ωα+ = h(ωα), for all ordinals α
g(α) = ωα = lub{ωγ : γ ∈ α}, for all limit ordinals α

Note that the function h, restricted to the ordinals, maps the proper class of all ordinals O
into O. The Transfinite recursion theorem guarantees that the sequence

{g(α) : α ∈ O} = {ωα : α ∈ O}

indexed by the ordinals is well-defined.

We claim that the sequence {ωα}α∈O is (strictly) ↪e-increasing:
The proof of the claim is by transfinite induction. Let P(α) denote the
statement “β ∈ α implies ωβ ↪e ωα”.
Base case: The statement P(0) holds vacuously. The statement P(1) holds true
since ω0 = ω is properly embedded in all uncountable sets (by Theorem 18.9)
and so ω0 ↪e ω1.
Inductive step for successor ordinals: Suppose P(α) holds true for some
ordinal α. That is, “β ∈ α ⇒ ωβ ↪e ωα”.
By definition of the Hartogs number, h(ωα) = ωα+ = ωα+1 is not equipotent to
any subset B ⊆ ωα. Then ωα+1 ∉ ωα and ωα+1 ≠ ωα. So ωα ∈ ωα+. This means
that ωα is order isomorphic to some initial segment of ωα+ (by Theorem
26.7). Since ωα+ ≁e ωα, then ωα ↪e ωα+. If β ∈ α+ and β ≠ α, then
ωβ ↪e ωα ↪e ωα+. Hence, ωβ ↪e ωα+.
We have shown that if P (α) holds true, then P (α+ ) holds true.
Inductive step for limit ordinals: Suppose P(γ) holds true for all ordinals
γ ∈ α, where α is a limit ordinal. We claim that P(α) holds true. Let β ∈ α.
We are required to show that ωβ ↪e ωα. Since ωα = lub{ωγ : γ ∈ α} and β ∈ α,
either ωβ ∈ ωα or ωβ = ωα. (See that ωβ is not equal to ωα, for if it were,
then ωα = ωβ ∈ ωβ+ ∈ {ωγ : γ ∈ α}, contradicting the fact that ωα is an upper
bound of {ωγ : γ ∈ α}.) Then ωβ ∈ ωα. The same argument applied to β+ ∈ α
gives ωβ+ ∈ ωα. Since ωβ ↪e ωβ+ ∈ ωα, then ωβ ↪e ωα. So P(α) holds true.
By transfinite mathematical induction, for any ordinal α, β ∈ α ⇒ ωβ ↪e ωα.
Hence, the class,
{ωα : α ∈ O}
constructed above is a class of (strictly) ↪e-increasing ordinals, as claimed.

We write out more explicitly how we obtained the first terms of the class
{ωα : α ∈ O}.

ω0 = ord N
ω1 = h(ω0) = the least ordinal not equipotent to ω0 or its subsets.
ω2 = h(ω1) = the least ordinal not equipotent to ω1 or its subsets.
ω3 = h(ω2) = the least ordinal not equipotent to ω2 or its subsets.
⋮

Notation: From here on, the least infinite ordinal, ord N, previously
represented by ω, will be represented as ω0.
We immediately establish a few facts about the class of ordinals

{ωα : α ∈ O}

constructed above.

Proposition 28.14 Let {ωα : α ∈ O} be the class of ordinals as defined in
the previous theorem.

a) Every element of {ωα : α ∈ O} is a limit ordinal.

b) For every ordinal α ∈ O, either α ∈ ωα or α = ωα. (Equivalently, ∀α ∈ O,
ωα ∉ α.)
Proof:
a) Let ωγ ∈ {ωα : α ∈ O}. We consider two cases: (1) γ is a successor ordinal,
and (2) γ is a limit ordinal.
Case 1: Suppose γ = α+ , for some α. We claim that ωγ must be a limit
ordinal.
Suppose not. If ωα+ is not a limit ordinal, then ωα+ = β+ = β ∪ {β} for
some ordinal β. Since both ωα+ and β are infinite sets, by Theorem 20.6,
ωα+ = β ∪ {β} ∼e β, so ωα+ ∼e β. Furthermore, ωα+ = β ∪ {β} implies
that β ∈ ωα+. That is, β is ∈-less than ωα+. By definition, ωα+ = h(ωα) is
the least ordinal which is not equipotent to any subset of ωα. Since
β ∼e ωα+, then β is an ordinal strictly less than ωα+ which is not
equipotent to any subset of ωα, a contradiction. The source of
the contradiction is our supposition that ωα+ is not a limit ordinal. So
any element in {ωα : α ∈ O} of the form ωα+ must be a limit ordinal.
Case 2: Suppose γ is a limit ordinal. Then, by definition, ωγ = lub{ωα :
α ∈ γ}. We claim that ωγ must be a limit ordinal. Suppose not. That
is, suppose ωγ = β ∪ {β} = β+, for some ordinal β. Then β ∈ ωγ and
β ∼e ωγ.
Claim: That β ∈ ωψ or β = ωψ, for some ψ ∈ γ.
Suppose not. That is, suppose ωα ∈ β for all α ∈ γ. Then β is an upper
bound of {ωα : α ∈ γ} which is strictly less than ωγ, a contradiction.
So β ∈ ωψ or β = ωψ, for some ψ ∈ γ, as claimed.
Since γ is a limit ordinal, ψ+ ∈ γ; hence ωψ+ ∈ {ωα : α ∈ γ}. Then

ωγ ∼e β ⊆ ωψ ∈ ωψ+ ∈ {ωα : α ∈ γ}

Since ωγ is an upper bound of {ωα : α ∈ γ}, we have ωψ+ ⊆ ωγ. Then ωψ+
would be equipotent to a subset of ωψ, contradicting ωψ+ = h(ωψ). The
source of this contradiction is our assumption that ωγ is not a limit
ordinal. We can only conclude that for case 2, ωγ is a limit ordinal.

b) We are required to prove that: ∀α ∈ O, α ∈ ωα or α = ωα. (Equivalently,
∀α ∈ O, ωα ∉ α.)
The proof is by transfinite induction. Let P(α) denote the statement
“α ∈ ωα or α = ωα”.
Base case: The ordinal 0 belongs to ω0 since 0 belongs to all ordinals
except the ordinal 0. So P(0) holds true.
First inductive hypothesis: Suppose P(α) holds true for some α. That is,
suppose ωα ∉ α. We are required to show that ωα+ ∉ α+. Since ωα ∉ α,
then either α ∈ ωα or α = ωα.
Case 1: Suppose α ∈ ωα. Then, since ωα is a limit ordinal, α+ ∈ ωα ∈
ωα+. So P(α+) holds true.
Case 2: Suppose α = ωα. Then α+ = (ωα)+, where (ωα)+ denotes the successor
of the ordinal ωα (not the ordinal ωα+ indexed by α+). Since ωα+ is a
limit ordinal and ωα ∈ ωα+, α+ = (ωα)+ ∈ ωα+. So P(α+) holds true.
Second inductive hypothesis: Suppose γ is a limit ordinal, lub{ωα : α ∈
γ} = ωγ and P(α) holds true for all α ∈ γ. That is, “ωα ∉ α” for all
α ∈ γ.
We are required to show: That ωγ ∉ γ.
If ψ ∈ ∪{ωα : α ∈ γ}, then for some α ∈ γ, ψ ∈ ωα ∈ ωα+ ∈ ωγ; hence,

∪{ωα : α ∈ γ} ⊆ ωγ

Since γ is a limit ordinal, γ = ∪{α : α ∈ γ} (by Theorem 27.15).

Given that “α ∈ ωα or α = ωα” for all α ∈ γ, then α ⊆ ωα for all α ∈ γ
and so,
γ = ∪{α : α ∈ γ} ⊆ ∪{ωα : α ∈ γ} ⊆ ωγ
We conclude that ωγ ∉ γ.
By transfinite induction, α ∈ ωα or α = ωα for all ordinals α.

A chain of ordinals under two distinct relations. The class {ωα : α ∈ O} can
be viewed as a chain of infinite ordinals which is strictly ordered by “∈”:

ω0 ∈ ω1 ∈ ω2 ∈ · · · ∈ ω_{ω0} ∈ ω_{ω0+1} ∈ · · · ∈ ω_{ω0·2} ∈ · · · ∈ ω_{ω0·ω0} ∈ · · · ∈ ω_{ω1} ∈ · · ·

The class of ordinals {ωα : α ∈ O} is also inductively constructed in a way
that ωα is not equipotent to any of its predecessors. Then it is also strictly
ordered by the proper embedding relation “↪e”:

ω0 ↪e ω1 ↪e · · · ↪e ω_{ω0} ↪e ω_{ω0+1} ↪e · · · ↪e ω_{ω0·2} ↪e · · · ↪e ω_{ω1} ↪e · · ·

Amongst these, only ω0 is countable. This takes us considerably further
down into the realm of uncountable sets. We already knew that the set
{P^n(N) : n ∈ N} was a countably infinite set of pairwise non-equipotent
sets. What is new here is that the class, {ωα : α ∈ O}, is composed of as
many pairwise non-equipotent well-ordered uncountable sets as there are
ordinals!

After studying the construction of the strictly increasing class of ordinals,

{ωα : α ∈ O}

described in detail above (in which is involved the somewhat tricky definition
of the Hartogs number), the reader may wonder:
“Precisely, what is the point of this particular class?”
If only to relieve a bit of the suspense, I think we can let the reader in on a
little secret immediately. As we often say when we are about to share how
a lengthy, carefully developed, enigmatic story line will turn out, “Spoiler
alert!” The class of ordinals

{0, 1, 2, 3, . . .} ∪ {ωα : α ∈ O}

is destined to be called the class of all cardinal numbers (as will be described
in the next chapter in the formal Definition 29.7).

28.8 Proof of the transfinite recursion theorem.

We end this section by proving that recursively defined functions over the
ordinals O are indeed well-defined.

Theorem 28.15 The Transfinite recursion theorem. Let W be a well-ordered

class and f : W → W be a class function mapping W into W . Let u ∈ W .
Then there exists a unique class function g : O → W which satisfies the
following properties:
a) g(0) = u
b) g(α+ ) = f(g(α)), ∀α ∈ O
c) g(β) = lub{g(α) : α ∈ β}, ∀ limit ordinals β
Proof outline: Let H denote the class of all subclasses of O × W which satisfy
the three properties given in the theorem statement. That is, U ∈ H if and
only if
a) (0, u) ∈ U
b) [(α, x) ∈ U ] ⇒ [(α+ , f(x)) ∈ U ], ∀α ∈ O
c) For any limit ordinal β,

[(α, xα) ∈ U ∀α ∈ β] ⇒ (β, lub{xα : α ∈ β}) ∈ U

Now H is non-empty since O × W satisfies all three properties and so O × W
is an element of H . Let G = ∩{U : U ∈ H }. Then G is the smallest element of
H (with respect to “⊆”). The objective is to prove that the class G is the
uniquely defined class function g : O → W that we seek. The first step is to
show that G ∈ H . This is straightforward and so is left as an exercise.
The second step is to show that G is a class function, while the third step
is to show that G is unique.
We will show that G is a class function by transfinite induction. For each
ordinal γ let
G|γ = {(α, xα) ∈ G : γ ∉ α}
For example,
G|0 contains at least (0, u)
G|1 contains at least (0, u) and (1, f(u))
G|2 contains at least (0, u), (1, f(u)) and (2, f(f(u)))
G|3 contains at least (0, u), (1, f(u)), (2, f(f(u))) and (3, f(f(f(u))))
⋮
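The finite stages G|0, G|1, G|2, . . . listed above can be built explicitly. The sketch below is an illustration under assumptions: the names `graph_up_to` and `is_function` are hypothetical, W is taken to be the integers, and f is taken to be doubling with u = 1. It also shows why a stray pair such as (3, 99) destroys the function property, which is the situation the proof rules out by the minimality of G.

```python
def graph_up_to(u, f, n):
    """Return the pairs (k, x_k) for k = 0..n, with x_0 = u and x_{k+1} = f(x_k)."""
    pairs = {(0, u)}
    x = u
    for k in range(n):
        x = f(x)
        pairs.add((k + 1, x))
    return pairs

def is_function(pairs):
    """A set of pairs is a function if no first coordinate occurs twice."""
    firsts = [k for k, _ in pairs]
    return len(firsts) == len(set(firsts))

def double(x):
    return 2 * x

g3 = graph_up_to(1, double, 3)
print(sorted(g3))                   # [(0, 1), (1, 2), (2, 4), (3, 8)]
print(is_function(g3))              # True
print(is_function(g3 | {(3, 99)}))  # False: two values at index 3
```

Removing the extra pair (3, 99) gives back a smaller set that still satisfies the closure properties, mirroring the claim that G − {(φ+, y)} would belong to H.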
Let P (α) represent the statement “G|α is a function”.
− Inductive hypothesis: Case 1. Suppose P(α) holds true for all α ∈ φ+ for
some non-limit ordinal φ+. This means that G|φ is a function.
We are required to show that P(φ+) holds true. That is, we must show
that G|φ+ is also a function. Now G|φ+ = G|φ ∪ {(φ+, x) : (φ+, x) ∈ G}.
We know that (φ, xφ) ∈ G|φ so (φ+, f(xφ)) ∈ {(φ+, x) : (φ+, x) ∈ G}. To
show that G|φ+ is a function it suffices to show that {(φ+, x) : (φ+, x) ∈
G} is the singleton set {(φ+, f(xφ))}. Suppose not. That is, suppose there
exists in G an element (φ+, y) such that y ≠ f(xφ).
Claim: G − {(φ+, y)} ∈ H . If so, then this contradicts the fact that G is
the smallest element of H . The proof of the claim is left as an exercise.
Assuming the claim is proved, we conclude that P(φ+) holds true.
− Inductive hypothesis: Case 2. Suppose P(α) holds true for all ordinals
α ∈ γ where γ is a limit ordinal. This means that G|α is a function for all
ordinals α ∈ γ. Equivalently, {(α, xα) : α ∈ γ, (α, xα) ∈ G} is a function.
We are required to show that P(γ) holds true. That is, we must show
that G|γ is also a function. Now
G|γ = {(α, xα) : α ∈ γ, (α, xα) ∈ G} ∪ {(γ, xγ) : (γ, xγ) ∈ G}
Let sγ = lub{xα : α ∈ γ}. We know, by definition of G, that (γ, sγ) ∈
{(γ, xγ) : (γ, xγ) ∈ G}. To show that G|γ is a function, it suffices to show
that {(γ, xγ) : (γ, xγ) ∈ G} is the singleton set {(γ, sγ)}. Suppose not.
That is, suppose there exists in G an element (γ, y) such that y ≠ sγ.
Claim: G − {(γ, y)} ∈ H . If so, then this contradicts the fact that G is
the smallest element of H . The proof of the claim is left as an exercise.
Assuming the claim is proved, we conclude that P(γ) holds true.

Then by Transfinite induction “G|α is a function” for all α ∈ O.

We claim that G must then be a class function. Suppose not. Then there
exists α such that {(α, x), (α, y)} ⊂ G where x ≠ y. Then, {(α, x), (α, y)} ⊂
G|α, where G|α was shown to be a function. Since this is a contradiction,
G must then be a class function, as claimed.
The proof that G is unique is left as an exercise.

We then define the class function g in the statement as g = G.

28.9 Ordinals, for topologists’ eyes only.11

For readers well-versed in general topology we present a few topological re-
sults involving ordinal sets and numbers.12
A topology on a set of ordinals [0, ωα].
We will consider an initial segment of ordinals,
S = [0, ωα]
(either with an open or closed right end). It can be verified that the collection
B = {(α, β] : α, β ∈ S, α < β}
satisfies the “base property”. Hence B will generate a topology τ on S.
Then, (S, τ ) is generally referred to as an ordinal space.

A product of two sets of ordinals.

Let
S = [0, ω1] × [0, ω0]
be the product space of the two given ordinal spaces. Then the elements of S
can be viewed as ordered pairs (α, β). Since both sets are linearly ordered,
it doesn’t hurt to visualize the product space, S, as a Cartesian plane of
numbers. We would then have (0, 0) in the lower left corner and (ω1 , ω0 ) in
the top right corner. The topological space S = [0, ω1] × [0, ω0] equipped
with this topology is commonly referred to by topologists as the
“Tychonoff plank”
The subspace S ∗ of S defined as
S ∗ = S − {(ω1 , ω0 )}
11 Read with the caveat stated in the title in mind.
12 The detailed explanations and proofs of the statements are found in the book Point-Set
Topology with Topics, World Scientific Publishing, 2024, by R. André.

simply obtained by deleting the top right corner from the Tychonoff plank
is appropriately referred to as the

“Deleted Tychonoff plank”

As an open neighbourhood base of the point, (β, µ) ∈ S, we can use elements

of the form

B(β,µ) = {(α, β] × (γ, µ] : α < β and γ < µ}

The Tychonoff plank may appear to be a topological space that is, in many
ways, similar to the product space R × N. But, as we will eventually see,
it has quite different properties. Both the Tychonoff plank and the Deleted
Tychonoff plank are useful topological spaces to remember.

Subspaces of normal spaces need not be normal.

The ordinal sets [0, ω1] and [0, ω0] are easily verified to be both Hausdorff
and compact. Then the Tychonoff plank S = [0, ω1] × [0, ω0] is both compact
(by the Tychonoff theorem) and Hausdorff. Since every compact Hausdorff
space is normal, we can conclude that the Tychonoff plank is normal.

Fact: It can be shown that the deleted Tychonoff plank S ∗ = S − {(ω1 , ω0 )}

is not a normal space. Hence subspaces of normal spaces need not be normal.

A normal space which is not perfectly normal.

A normal topological space, (S, τ ), is said to be perfectly normal if and only
if every closed subset of S is a Gδ . The ordinal space [0, ω1] which is known
to be normal can be shown to be “not” perfectly normal.

A space which is not compact but is countably compact.

The ordinal space [0, ω1) is known to be a non-compact space. Nevertheless,
it can be verified to be countably compact.

A space which is not compact but is pseudocompact.

The ordinal space [0, ω1) is known to be a non-compact space. Since every
continuous real-valued function is bounded on a countably compact space,
[0, ω1) is pseudocompact.

A space whose Stone-Čech compactification is its one-point compactification.

If S = [0, ω1), it can be shown that βS = [0, ω1].

A space which is not realcompact.

It can be shown that if a pseudocompact space is realcompact then it must
be compact. Since S = [0, ω1) is known to be both pseudocompact and non-
compact, then it cannot be realcompact.

Concepts review:
1. Which ordering relation well-orders the class, O, of all ordinals?
2. If S is a subset of ordinals in O, what is one way of describing its
least element?
3. What can we say about initial segments of the well-ordered class
O?
4. Is O an ordinal number? Why or why not?
5. How do we define the immediate successor of an element of an
ordered set?
6. Give an example of a linearly ordered set where no element has an
immediate successor.
7. State the two versions of the principle of induction over the ordinals.
8. What does it mean to say that elements of every well-ordered set
can be indexed by the elements of some ordinal?
9. Which ZFC-axiom is invoked to prove that every well-ordered set
is order isomorphic to a single ordinal?
10. What does “ordinality of a well-ordered set” mean?
11. What does Hartogs’ lemma state?
12. How does the existence of an uncountable ordinal follow from Har-
togs’ lemma?
13. What is the least uncountable ordinal?
14. What is the Hartogs number of a set S?
15. How is the concept of Hartogs number combined with the Transfi-
nite recursion theorem to show that there exists an infinite sequence
of uncountable ordinal numbers no two of which are equipotent?

EXERCISES

A. 1. Show that a well-ordered set can only be isomorphic to a single ordinal

number.

2. We have seen that ω0 + ω0 is a limit ordinal. Describe the smallest limit

ordinal which is larger than ω0 + ω0 .
3. Is the non-limit ordinal ω0 + 2 equipotent with the limit ordinal ω2?
4. What is the smallest ordinal number which is equipotent with ω3?

B. 5. Let P(Q) denote the set of all subsets of the set of rational numbers Q.
a) Construct a countably infinite subset S of P(Q) which is well-ordered
by the relation ⊆ such that (S, ⊆) is order isomorphic to the ordinal
number ω0 . Prove that ⊆ both linearly orders and well-orders the set
S.
b) Construct a countably infinite subset T of P(Q) which is well-ordered
by the relation ⊆ such that (T, ⊆) is order isomorphic to the ordinal
number ω0 + ω0 .

C. 6. Theorem 26.7 states that “any two well-ordered sets S and T are either
order isomorphic or one is order isomorphic to an initial segment of the
other”. Can we replace the word “sets” with the word “classes” in this
statement? Justify your answer.
7. Construct a set which is not an ordinal number but whose elements can be
indexed by the elements of ω5.
8. Consider the lexicographically well-ordered set S = {1, 2, . . . , 100} × N.
State the ordinal number which is order isomorphic to the subset

{(1, 0), (1, 1), (1, 2), (1, 3), . . . , (2, 0)}

Which ordinal number is order isomorphic to S?

9. Let S = {α ∈ O : |α| ≤ |N|}. Does S have a maximal element? If so, what
is it? If not, state why.
10. Show that there is an ordinal number α which is not equipotent with R.
11. Let S = {α ∈ O : |α| ≤ |ω1 |}. Does S have a maximal element? If so, what
is it? If not, state why not.

29 / Cardinal numbers: “Initial ordinals are us!”

Abstract. In this section we formally define “initial ordinals”. We then
prove that the class of all initial ordinals is precisely the class I =
ω0 ∪ {ωα : α ∈ O} defined at the end of the previous section. We state
and prove (by invoking the Axiom of choice) the Well-ordering theorem.
Finally, we define the “cardinal numbers” as being the class of all initial
ordinals.

29.1 Initial ordinals.

Recall that Hartogs’ lemma states that, if S ∈ S , then the class,

US = {α ∈ O : α ↪̸e S and α ≁e S}

is non-empty. This class describes “the class of all ordinals which cannot be
mapped one-to-one into S”. Since the class, US , of ordinals is non-empty
and O is ∈-well-ordered, then US has an ∈-least element. We called this
∈-least element of US the “Hartogs number”, h(S), of the set S.
Since ordinals are sets, then every ordinal, α, can be assigned a Hartogs
number, h(α).
For example: Determine the Hartogs number, h(ω0 + ω0 ), of the ordinal
ω0 + ω0 .
Since ω0 + ω0 is a countable set and h(ω0 + ω0) represents the least element
of Uω0+ω0, then it is the least ordinal that cannot be mapped one-to-one
into the countable ordinal, ω0 + ω0. Then the Hartogs number, h(ω0 + ω0),
must be uncountable. Since ω1 is the smallest such ordinal, then h(ω0 + ω0) = ω1.

More generally, if α is an ordinal, then Uα = {γ ∈ O : γ ↪̸e α and γ ≁e α}.
The Hartogs number of the ordinal, α, can be viewed as being the least
ordinal, h(α), satisfying the following properties:
1. α ∈ h(α) (for, if h(α) ∈ α or h(α) = α, then h(α) ∉ Uα).
2. For any ordinal γ ∈ h(α), h(α) ≁e γ.
For example, ω + 7 ∈ h(ω + 7) = ω1 and since ω + 8 ∈ h(ω + 7) = ω1, then
h(ω + 7) ≁e ω + 8.

Also, the Hartogs number of any finite ordinal, n, is h(n) = n + 1 since

n + 1 is not equipotent to n nor to any of its elements; furthermore, it is the
smallest such ordinal.
By definition, the Hartogs number, h(α), of an ordinal, α, is never equipo-
tent to any of its elements. An ordinal which is not equipotent with any of

its elements is given a particular name.

Notation: For convenience and to render the phrasing less cumbersome, we
introduce at this time some rather unconventional notation. In what follows,
if α and β are ordinals, the expression α ∈= β is to be interpreted as “α ∈ β
or α = β”. Equivalently, β 6∈ α.
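On the finite ordinals the notation just introduced can be tested directly. The following is a small sketch, not part of the text (the names `successor`, `ordinal` and `in_or_equal` are hypothetical), which models finite von Neumann ordinals as nested frozensets and checks that “α ∈= β” agrees with “β ∉ α”.

```python
def successor(a):
    """Return a+ = a ∪ {a}."""
    return a | {a}

def ordinal(n):
    """Build the finite von Neumann ordinal n as a frozenset."""
    result = frozenset()
    for _ in range(n):
        result = successor(result)
    return result

def in_or_equal(a, b):
    """The relation written a ∈= b in the text: a ∈ b or a = b."""
    return a in b or a == b

for m in range(6):
    for n in range(6):
        a, b = ordinal(m), ordinal(n)
        assert (a in b) == (m < n)                # ∈ is the usual order on N
        assert in_or_equal(a, b) == (b not in a)  # a ∈= b is equivalent to b ∉ a
print("checked all pairs of ordinals below 6")
```

The check only covers finitely many finite ordinals; for arbitrary ordinals the equivalence rests on the fact that O is linearly ordered by ∈.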

Definition 29.1 We say that an ordinal, β, is an initial ordinal if β is the

least of all ordinals which are equipotent with β. That is, β is an initial ordinal
if α ∈ β ⇒ α 6∼e β.

For example, ω0 + 1 is not an initial ordinal since ω0 + 1 is not the least of

all ordinals which is equipotent to ω0 + 1; the ordinal, ω0 , is the least of all
ordinals which is equipotent to ω0 + 1.
We already know of many initial ordinals. Trivially, every finite ordinal, n,
is the least ordinal equipotent to n since the only ordinal which is equipo-
tent to the natural number n is n. Also, no ordinal α ∈ ω0 is equipotent to
ω0 , so ω0 is seen to be the least countably infinite initial ordinal. Next in
line is the ordinal, ω1 , shown to be the least ordinal not equipotent to any
countable ordinal. It is then, by definition, the second infinite initial ordinal.
Of course, ω0 + ω0 is not an initial ordinal since ω0 + 2 ∈ ω0 + ω0 where
ω0 + 2 ∼e ω0 ∼e ω0 + ω0 .
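The examples above can be tabulated mechanically for the ordinals below ω0 · ω0. The sketch below is an illustration under stated assumptions, not a derivation: an ordinal ω0 · a + b is encoded as the pair (a, b), pairs are compared lexicographically, and the function `equipotent` simply encodes the facts quoted in the text (a finite ordinal is equipotent only to itself, and all infinite ordinals in this range are countable). All names are hypothetical.

```python
def equipotent(x, y):
    """Equipotence for ordinals ω0·a + b encoded as pairs (a, b)."""
    if x[0] == 0 or y[0] == 0:
        # a finite ordinal is equipotent only to itself
        return x == y
    # both ordinals are countably infinite
    return True

def is_initial(x, search=3):
    """True if no strictly smaller ordinal (within the search box) is equipotent to x."""
    smaller = [(a, b) for a in range(search) for b in range(search) if (a, b) < x]
    return not any(equipotent(y, x) for y in smaller)

print(is_initial((0, 5)))  # True:  the finite ordinal 5
print(is_initial((1, 0)))  # True:  ω0
print(is_initial((1, 1)))  # False: ω0 + 1 is equipotent to ω0
print(is_initial((2, 0)))  # False: ω0 + ω0 is equipotent to ω0
```

The bounded search is enough here because the only candidate witness for an infinite ordinal in this range is ω0 = (1, 0), which lies inside the search box.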
We will investigate the elements of the class,

{ω0 , ω1 , ω2 , . . . , ωω , . . . , ωω1 , . . . } = {ωα : α ∈ O}

of ordinals which were recursively constructed in Theorem 28.13.

We recall how these ordinals were defined:

– In the case of a successor ordinal, say α = φ + 1, ωα was defined as
being h(ωφ).
– While in the case of a limit ordinal, γ, ωγ was defined as,
ωγ = lub{ωα : α ∈ γ}.

The above examples suggest that ordinals such as ωα are in fact initial
ordinals. We introduce the following notation.
I = {0, 1, 2, 3, . . ., } ∪ {ωα : α ∈ O}

We will show that the class, I , comprises the complete class of all initial
ordinals.
Lemma 29.2 The class, I = {0, 1, 2, 3, . . . } ∪ {ωα : α ∈ O}, contains all
initial ordinals.
Proof:
What we are given: ψ is an initial ordinal and I = {0, 1, 2, 3, . . ., } ∪ {ωα :
α ∈ O}.
What we are required to show: ψ ∈ I .
If the initial ordinal, ψ, is a finite ordinal, then ψ ∈ I , and we are done.
We then suppose that ψ is an infinite initial ordinal.
By part (b) of Proposition 28.14, for any ordinal ψ, ψ ∈= ωψ . If ψ = ωψ , then
ψ ∈ I and we are done. Suppose ψ ∈ ωψ .
Claim: That ψ = ωβ ∈ ωψ , for some ordinal β (and so ψ ∈ I ).
Proof of the claim (by transfinite induction).
Let P (α) denote the statement “If ψ is an infinite initial ordinal in ωα , then
ψ = ωβ ∈ ωα , for some ordinal β”.
Base case: If ψ is an infinite initial ordinal which belongs to ω1 , then
ψ = ω0 ∈ ω1 . So P (1) holds true.
First inductive hypothesis: Suppose P(α) holds true for some ordinal α. That
is, for any infinite initial ordinal ψ ∈ ωα, ψ = ωβ ∈ ωα, for some ordinal β.
We are required to prove that P(α+) holds true. Suppose ψ is an infinite
initial ordinal such that ψ ∈ ωα+ = h(ωα). Since ωα and ψ are both ordinals,
either ψ = ωα, ψ ∈ ωα or ωα ∈ ψ.
Case 1: If ψ = ωα, then we are done.
Case 2: If ψ ∈ ωα, then by the inductive hypothesis, there exists some
ordinal β such that ψ = ωβ ∈ ωα.
Case 3: If ωα ∈ ψ, then, since ψ is an initial ordinal, ψ is equipotent
neither to ωα nor to any other of its elements; hence ψ is not equipotent to
any subset of ωα. Then ωα+ = h(ωα) ∈= ψ. But this contradicts our hypothesis
ψ ∈ ωα+. So this case cannot occur.
In both remaining cases ψ ∈ I . So P(α+) holds true.
Second inductive hypothesis: Suppose γ is a limit ordinal and P (α) holds
true, for all α ∈ γ. We are required to show that P (γ) holds true. Let ψ be
an infinite initial ordinal such that ψ ∈ ωγ = lub{ωα : α ∈ γ}. Then ψ ∈ ωα ,
for some α ∈ γ. By the inductive hypothesis, there exists an ordinal β such
that ψ = ωβ ∈ ωα . So ψ ∈ I . Hence, P (γ) holds true.
This proves the claim that (ψ ∈ ωψ ) ⇒ (ψ = ωβ ∈ ωψ ) for some ordinal β.
Hence, ψ ∈ I , as required.

Theorem 29.3 The class,

I = {0, 1, 2, 3, . . . } ∪ {ωα : α ∈ O}

is precisely the class of all initial ordinals.

Proof:

We have shown in the lemma that {α ∈ O : α is an initial ordinal } ⊆ I .

To show that {α ∈ O : α is an initial ordinal} = I , it suffices to show
that I ⊆ {α ∈ O : α is an initial ordinal}. Since we have already
shown that the finite ordinals are initial ordinals, it suffices to show that
{ωα : α ∈ O} ⊆ {α ∈ O : α is an initial ordinal}. We prove the statement
by transfinite induction. Let P (α) denote the statement “ωα is an initial
ordinal”.

− Trivially, P (0) holds true.

− Suppose P(α) holds true. That is, suppose ωα is an initial ordinal. We
are required to show that ωα+ = h(ωα) is an initial ordinal. Let β ∈
h(ωα). It suffices to show that β ≁e h(ωα). Recall that h(ωα) is the
least ordinal which is not equipotent to any subset of ωα. If
β ∈ h(ωα), then the ordinal β must be equipotent to some subset of
ωα ∈ ωα+. Then β cannot be equipotent to h(ωα). This means that
ωα+ = h(ωα) is an initial ordinal. So P(α+) holds true.
− Suppose γ is a limit ordinal and P (α) holds true for all α ∈ γ. That is,
ωα is an initial ordinal for all α ∈ γ. We are required to show that ωγ
is an initial ordinal. Suppose not. Suppose ωγ ∼e β for some β ∈ ωγ =
lub{ωα : α ∈ γ}. Then β ∈= ωα for some α ∈ γ. Since ωγ ∼e β ∈= ωα ∈
ωα+ ∈ ωγ , we have a contradiction. So ωγ is an initial ordinal.
By transfinite induction, every element of {ωα : α ∈ O} is an initial ordinal.
We conclude that I = ω0 ∪ {ωα : α ∈ O} precisely represents the class of
all initial ordinals.

29.3 Well-ordering theorem.

We say that a set, S, is well-orderable if a well-ordering relation, <, is known
to exist on S. For example, we have shown that each and every countable set
is well-orderable, since every infinite countable set is equipotent to N and N
is well-ordered (See Theorem 26.1).
On the other hand, other than the uncountable ordinals such as ω1 , ω3 + 4, ω5 ,
we have not yet witnessed any uncountable well-orderable “non-ordinal” set.
Indeed, we have shown that R is uncountable. We agreed on the symbol “c” to
represent the class of all sets which are equipotent to R. But, we have not
yet determined the nature of this class. Since we have not yet been able to
determine a relation, <, on R which well-orders R, we have no information
on its ordinality. Hence, as yet, we have no information on how to relate c
to {ωα : α ∈ O}.
Is R even well-orderable? To show that a set, S, is well-orderable and to
actually produce an algorithm that well-orders S are two different things.
Of course, producing an algorithm that well-orders S is more useful than
simply proving that S is well-orderable. But sometimes, the best we can
hope for is to prove that a set S is well-orderable, even though we may be
convinced that no algorithm that well-orders S will ever be explicitly found.
It may come as a surprise to many readers to learn that, in the set-theoretic
universe governed by ZFC, all sets are well-orderable (including uncountable
ones such as R). The statement “All sets are well-orderable” proved below is
called the Well-ordering theorem or sometimes the Well-ordering principle.
It is a direct consequence of the Axiom of choice.
Since the Axiom of choice plays a fundamental role in the proof of the Well-
ordering theorem, it will be useful to remind ourselves of what the Axiom
of choice states.
Axiom of choice: Every set of non-empty sets has a choice function.

The Axiom of choice states that if S = {X : X ∈ P(S), X ≠ ∅} is an

infinite set of non-empty subsets of a set S, then there exists a function
f : S → ∪{X : X ∈ P(S)} which maps each non-empty subset X of S to
some element x in X. The key point is that there is no globally stated rule
on S which describes which particular element to choose from each set. A
convenient (often used) example is to imagine an infinite set S of “pairs
of socks”. The Axiom of choice states that there exists a function f which
associates to each pair of socks one sock, and this even if, in each pair, one
sock is indistinguishable from the other.
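For a finite set of numbers a choice function can be written down explicitly, so the axiom is not needed. The sketch below (hypothetical names `nonempty_subsets` and `choice`; the rule “take the minimum” is an assumption made for illustration) builds such a function on all non-empty subsets of S = {1, 2, 3}. The Axiom of choice is what replaces the explicit rule when, as with the socks, no rule is available.

```python
from itertools import combinations

def nonempty_subsets(s):
    """Return all non-empty subsets of the finite set s as frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def choice(x):
    """An explicit choice rule: pick the least element of x."""
    return min(x)

S = {3, 1, 2}
f = {X: choice(X) for X in nonempty_subsets(S)}

print(len(f))                # 7 non-empty subsets
print(f[frozenset({2, 3})])  # 2
```

Every value f[X] is an element of X, which is exactly the defining property of a choice function.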

Theorem 29.4 [AC] The Well-ordering theorem. Every set can be well-
ordered.

P roof:
What we are given: That S is a non-empty set.
What we are required to show: That S is well-orderable.
To do this, it suffices to show that S is the one-to-one image of some ordinal
number. Then, by invoking Theorem 26.1, we can conclude that S is well-
orderable.
Case 1: Suppose S is a countable set. If S is finite, then it is the one-to-one
image of some finite ordinal (natural number), and so S is well-orderable. If
S is infinite, then it is the one-to-one image of N (a well-ordered set). Again,
we must conclude that S is well-orderable.
Case 2: Suppose that S is uncountable. We will recursively construct a
function g : α → S which maps some ordinal α one-to-one onto S. If such
a function g is shown to exist, then S is the one-to-one image of a
well-ordered set and so can be declared to be well-orderable (26.1).
Let P(S)∗ = P(S) − {∅}. By the Axiom of choice, there exists a function
f : P(S)∗ → S which maps each element X ∈ P(S)∗ to some element
x ∈ X ⊆ S. We recursively construct a function g, from ordinals into S, as
follows (g(α) is defined only as long as the set to which f is applied is
non-empty):

g(0) = s0 ∈ S, for an arbitrarily chosen element s0 in S

g(1) = s1 = f(S − g[{0}]) = f(S − {s0 })
g(2) = s2 = f(S − g[{0, 1}]) = f(S − {s0 , s1 })
⋮
g(α+ ) = sα+ = f(S − g[{0, 1, . . ., α}]) = f(S − {s0 , s1 , . . . , sα }), ∀α ∈ O
g(β) = sβ = f(S − {g(α) : α ∈ β}) = f(S − {sα : α ∈ β}), ∀ lim. ord’s β
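For a finite set S the recursion above terminates after finitely many steps and can be run directly. The following is a minimal sketch under assumptions: the name `enumerate_by_choice` is hypothetical, the choice function is passed in as an ordinary Python function, and g(0) is taken to be f(S) rather than an arbitrarily chosen element. Only successor steps occur in the finite case.

```python
def enumerate_by_choice(s, choice):
    """Return [g(0), g(1), ...] where g(k) = choice(S − {g(0), ..., g(k−1)})."""
    remaining = set(s)
    order = []
    while remaining:                 # stop once S − g[{0, ..., k−1}] is empty
        picked = choice(frozenset(remaining))
        order.append(picked)
        remaining.remove(picked)
    return order

S = {"c", "a", "b"}
order = enumerate_by_choice(S, min)
print(order)                         # ['a', 'b', 'c']

# The induced well-ordering: x precedes y when x is chosen at an earlier stage.
rank = {x: k for k, x in enumerate(order)}
print(rank["a"] < rank["c"])         # True
```

Passing a different choice function, for instance `max`, yields a different enumeration of the same set and hence a different well-ordering.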

For each γ in the domain, dom g, of g define g|γ as:

g|γ = {(α, sα ) ∈ g : α ∈ γ}

Claim: For each ordinal, γ, such that γ ⊆ dom g, g|γ is a one-to-one function
on γ.
The proof of the claim is by transfinite induction. Let P (α) denote the
statement “g|α : α → S is a one-to-one function mapping α into S”.
Inductive hypothesis: Suppose γ ⊆ dom g. Suppose P (α) holds true for all
α ∈ γ, where γ belongs to the domain of g. We are required to show that
P (γ) holds true. That is, we are required to show that g|γ is one-to-one on
γ.
Suppose (β, sβ) and (µ, sµ) are two elements in g|γ such that β ∈ µ. Then
β and µ are elements of γ. It suffices to show that sβ ≠ sµ.
Case 1: If γ is a limit ordinal, then (β, sβ) and (µ, sµ) belong to
g|µ+ ⊂ g|γ. By the inductive hypothesis, g|µ+ is one-to-one on µ+; hence,
sβ ≠ sµ.
Case 2: Suppose γ is a successor ordinal. If µ+ ≠ γ, then by the inductive
hypothesis, g|µ+ is one-to-one on µ+ and, since (β, sβ) and (µ, sµ) belong
to g|µ+, then sβ ≠ sµ. Suppose µ+ = γ. By definition of g,
g(µ) = sµ = f(S − {sα : α ∈ µ}). Since β ∈ µ, sβ ∉ S − {sα : α ∈ µ}; hence,
sβ ≠ f(S − {sα : α ∈ µ}) = sµ.
Then g|γ is one-to-one on γ.
By transfinite induction, g|γ is one-to-one on γ, for all γ ⊆ dom g, as
claimed. We conclude that g is one-to-one on dom g.
Claim: The function g maps dom g onto S. That is, for every s ∈ S, (α, s) ∈ g
for some ordinal α. Let D denote the domain of g. To prove the claim, it
suffices to show that S − g[D] = ∅.
Part VIII: Ordinal numbers 323

− We first show that D satisfies the initial segment property: If γ ⊆ D,
then g|γ = {(α, sα) ∈ g : α ∈ γ} ⊆ g. Hence, α ∈ γ ⇒ (α, sα) ∈ g ⇒
α ∈ D. Then D satisfies the initial segment property in O.
− The domain D of g is a set: Since S is a set and g : D → S is a one-
to-one function on D ⊂ O, the image D = g⁻¹[g[D]] of the set g[D] ⊆ S
under the one-to-one function g⁻¹ : g[D] → O must be a set (by the
Axiom of replacement).
− The domain D of g is an ordinal: We have shown that the domain D of
g is a set in O which satisfies the initial segment property. Then D ≠ O
(since O is not a set) and so there exists an ordinal δ such that
D = Sδ = {α ∈ O : α ∈ δ} = δ (by Theorem 26.3, also see page 285). So
we can write D = δ.
− It now suffices to show that S − g[δ] is empty. Suppose the set S − g[δ] is
non-empty. Then the choice function f maps the non-empty set S − g[δ]
to some element sδ in S − g[δ]. Then g(δ) = sδ , which means that
δ ∈ D = δ, a contradiction. The source of the contradiction is the as-
sumption that S − g[δ] is non-empty. Then g[D] = g[δ] = S. So every
element of S is in the image of the function g, as claimed.

The function g : δ → S mapping δ one-to-one onto S = {sα : α ∈ δ} then
induces a well-ordering on S.
We conclude that any non-empty set S is well-orderable.
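
For readers who like to see a recursion run, here is a minimal Python sketch of the construction of g in the finite case. It is an illustration only and not part of the proof: the names well_order_by_choice and choose are hypothetical, and the built-in min merely stands in for the choice function f. For an arbitrary infinite set no such explicit rule need be available, which is precisely why the Axiom of choice is invoked above.

```python
def well_order_by_choice(S, choose):
    """Finite analogue of the recursive construction of g.

    `choose` plays the role of the choice function f: it receives a
    non-empty subset of S and returns one of its elements.
    """
    remaining = set(S)          # this is S - g[{0, ..., n-1}]
    g = []                      # g[n] is the element s_n
    while remaining:            # stop exactly when S - g[D] is empty
        s = choose(frozenset(remaining))
        g.append(s)             # g(n) = f(S - {s_0, ..., s_(n-1)})
        remaining.remove(s)
    return g


# Hypothetical example set; min is used only as a stand-in for f.
S = {"pear", "fig", "apple", "kiwi"}
g = well_order_by_choice(S, choose=min)
print(g)
```

The list g plays the role of the function g : δ → S. Position n holds sn, no element is listed twice (g is one-to-one), and the loop ends only when every element of S has been listed (g is onto).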

29.4 Formally defining the cardinal numbers.

Overview − We are now set to formally define the sets we call “cardinal
numbers”. First, we review some background material on how we came to
discuss the concept of “cardinal numbers”.
Given the class, S , of all sets, we defined a relation ∼e on S as follows:

S ∼e T if and only if S and T are equipotent

The relation, ∼e , was shown to be an equivalence relation on S and so

allows us to partition S into a class of equivalence classes,

E = {[S]e : S ∈ S }

where [S]e = {T ∈ S : T ∼e S}. For example, [R]e and [N]e are (dis-
tinct) equivalence classes containing all sets which are equipotent to R and
N respectively. For example, once we had verified that P(N) ∼e R (with
the help of the Schröder-Bernstein theorem), we could write that [R]e =
[P(N)]e where R and P(N) were simply different representatives of the
same equivalence class. When we first discussed the concept of “cardinal
numbers”, the tools available at that time were insufficient to construct a
class of sets whose elements could serve as representatives for each of the
equipotence-induced equivalence classes.1 So we postulated the existence of
the class of cardinal numbers as follows (reproduced from Postulate 22.2):

There exists a class C which contains N, such that every set S is

equipotent with exactly one element κ ∈ C .

We now have all the ingredients required to prove that a class of sets whose
properties characterize the cardinal numbers exists in ZFC.

Theorem 29.5 The class of all initial ordinals, I = ω0 ∪ {ωα : α ∈ O},

satisfies the following properties:
1. Every set S is equipotent to exactly one element in I .
2. Two sets, S and T , are equipotent if and only if they are equipotent to
the same element of I .
3. The class, I , is ∈-linearly ordered.

Proof: The Well-ordering theorem is invoked in this proof.

1) Let S be a set. By Theorem 29.4 (Well-ordering theorem), the set S has
a well-ordering “<”. By Theorem 28.7, S equipped with this well-ordering
is order isomorphic to some ordinal β, and so β and S are equipotent sets.
Amongst all ordinals which are equipotent to β there is a unique initial or-
dinal, ωα. Then ωα is the unique initial ordinal which is equipotent to the
set S, as required.
2) Suppose S and T are equipotent. Then by part (1) both S and T are
equipotent to initial ordinals ωγ and ωα respectively. Since distinct initial
ordinals cannot be equipotent, ωγ = ωα. Conversely, if S and T are both
equipotent to the same initial ordinal ωα, then S ∼e ωα ∼e T and so S and
T are equipotent.
3) Every pair of ordinals in O are ∈-comparable. So every pair of ordinals
in I ⊂ O must be ∈-comparable. So I is ∈-linearly ordered.

Remark. We pause to gather together a few facts related to the set R.

– We have defined ω1, the least uncountable ordinal, in two ways:
ω1 = {α : α is a countable ordinal} (page 303) and
ω1 = h(ω0) (by Theorem 28.13).
– By Theorems 20.12 and 21.3, the three sets 2^N, P(N) and R are equipo-
tent sets.
– If we assume the Continuum hypothesis, P(N) is the smallest uncount-
able set which contains N. That is, there is no uncountable set whose
cardinality lies strictly between those of N and P(N).
– Then, if we assume the Continuum hypothesis, R is the smallest un-
countable set which contains N.
– Then, if we assume the Continuum hypothesis, c and ω1 are equipotent
sets.

1 One may suggest that we could have defined [S]e as being the cardinal
number of S. The problem with this is that we want cardinal numbers to be
sets; the ∼e-equivalence class, [S]e, may be a proper class.

We are now set to formally define the sets we will call the “cardinal num-
bers”.
Recall that, for each equivalence class in the class

E = {[S]e : S ∈ S }

we postulated the existence of a set which could serve as a class represen-
tative (without being able to identify which set could serve that purpose).
Suppose [T]e ∈ E. By the above theorem, T is equipotent to exactly one
element in I, say ωα. So ωα ∈ [T]e. If [T]e ≠ [B]e, then T ≁e B, so ωα
cannot belong to [B]e. Also for any set, A in [T]e, A and T are equipotent
and so A ∼e ωα. So we can specifically identify the initial ordinal, ωα, as the
unique class representative of [T]e. So ωα can be chosen to be the
“cardinality” of every set in [T]e.

Definition 29.6 Cardinal numbers. An ordinal is called a cardinal number if
and only if this ordinal is an initial ordinal. When the elements of

I = ω0 ∪ {ωα : α ∈ O}

are viewed as cardinal numbers we represent I as C .


29.5 Aleph notation for cardinal numbers.

Some standard notation has been adopted to represent cardinal numbers.

Notation 29.7 Although we could use the “ωα ” notation to represent the
cardinal numbers it is customary to use the “aleph” notation, ℵα . We have
already used the aleph notation once with, ℵ0 , used to represent the cardinal-
ity, |N|, as introduced on page 226. We now generalize its use to represent all
elements of {ωα : α ∈ O}. That is, we set

ℵ0 = ω0
ℵ1 = ω1
ℵ2 = ω2
  ⋮
ℵα = ωα
  ⋮

for any ordinal α, ℵα = ωα where each ωα is an initial ordinal.

Note that the algorithms used to define the operations of addition, multipli-
cation and exponentiation, in the chapters where cardinal operations were
defined, did not involve the fact that cardinal numbers are initial ordinals.
So the algorithms can be freely used using the aleph notation for the cardi-
nals {ℵα : α ∈ O}.

For example, when we write the expression ℵ1 we are thinking “the cardinal
number ℵ1 ” rather than “the initial ordinal ω1 ” even though they represent
the same entity. Since the initial ordinal, ℵ1 = ω1 , is, by definition, “the least
ordinal number which is not countable”, it is the first uncountable cardinal
(ordinal).

If we assume the Continuum hypothesis, then there exists no uncountable
cardinal number, ℵα, other than c itself, such that

|N| = ℵ0 ↪e ℵα ↪e |P(N)| = 2^ℵ0 = |R| = c (Theorem 21.3)

Hence, if we assume the Continuum hypothesis,

ℵ1 = ω1 = c = |R|

is the least ordinal which is not countable.

The only countable cardinals are the natural numbers and ℵ0 itself.
Similarly, assuming the Generalized continuum hypothesis, the least un-
countable ordinal which is not equipotent to ω1 is the cardinality of the
set P(R), in which case we say “assuming GCH, the cardinality of P(R) is
ℵ2” where ℵ2 is the least ordinal greater than ℵ1 which is not equipotent
to ℵ1.
The class of all cardinal numbers, C, can now be represented as
C = ℵ0 ∪ {ℵα : α ∈ O}
where, for all ordinals α, ℵα+ is the least ordinal greater than ℵα which is
not equipotent with ℵα and, for limit ordinals γ, ℵγ = lub{ℵα : α ∈ γ}.
What is also new is that the class of all infinite cardinal numbers is indexed
by the ordinal numbers.
Even though every cardinal number is an initial ordinal, the context usually
allows us to determine whether we are referring to a set’s “cardinality” or
“ordinality”. We normally refer to the “ordinality of a set” only if we have a
specific (or hypothesized) well-ordering of that set in mind. In the following
tables, we describe how the cardinality, and ordinality of a set are perceived
when we are assuming the Generalized continuum hypothesis and when we
are not.

Table in which we assume neither CH nor GCH (in the presence of the
Axiom of choice):

Set S                      cardinality of S         initial ordinal of S   ord S

{}                         0                        0                      0
{a, b, c}                  3                        3                      3
N (standard order)         ℵ0                       ω0                     ω0
ω0 + 3 (∈-well-ordered)    ℵ0                       ω0                     ω0 + 3
{1, 2} × N (lexico)        ℵ0                       ω0                     ω0·2
N × N (lexico)             ℵ0                       ω0                     ω0·ω0
ω1                         ℵ1                       ω1                     ω1
⋮                          ⋮                        ⋮                      ⋮
R                          c = 2^ℵ0 = ℵα ≥ ℵ1       ωα
⋮                          ⋮                        ⋮                      ⋮
P(R)                       2^ℵα = ℵβ ≥ ℵα+1         ωβ
⋮                          ⋮                        ⋮                      ⋮
P(P(R))                    2^ℵβ = ℵγ ≥ ℵβ+1         ωγ
⋮                          ⋮                        ⋮                      ⋮
P(P(P(R)))                 2^ℵγ = ℵδ ≥ ℵγ+1         ωδ

Note that it is the Well-ordering theorem (itself a consequence of the Axiom
of choice) which guarantees that for every ordinal, β, 2^ℵβ = |2^ℵβ| = |P(ℵβ)|
is equal to some (initial) ordinal number ωγ = ℵγ ≥ ℵβ+1. That is, the Well-
ordering theorem states that P(ℵβ) ∼e ℵγ, for some cardinal ℵγ ≥ ℵβ+1.
Without the Axiom of choice, the set, P(ℵβ), may not be well-orderable in
which case it would not necessarily be equipotent to some ordinal number.

Assuming GCH, cardinal numbers are more clearly defined:

Set S                      cardinality of S    initial ordinal of S   ord S

{a, b, c}                  3                   3                      3
N (standard order)         ℵ0                  ω0                     ω0
ω0 + 3 (∈-well-ordered)    ℵ0                  ω0                     ω0 + 3
{1, 2} × N (lexico)        ℵ0                  ω0                     ω0·2
N × N (lexico)             ℵ0                  ω0                     ω0·ω0
R                          2^ℵ0 = ℵ1           ω1
R × R                      ℵ1                  ω1
P(R)                       2^ℵ1 = ℵ2           ω2
P(P(R))                    2^ℵ2 = ℵ3           ω3
P(P(P(R)))                 2^ℵ3 = ℵ4           ω4
⋮                          ⋮                   ⋮                      ⋮
⋮                          ℵα                  ωα
⋮                          ⋮                   ⋮                      ⋮

What does this say about GCH? The Axiom of choice guarantees that every
set can be well-ordered and so all sets can be ranked on an “equipotence
based scale C of sets” called the cardinal numbers. This means that every
set is associated to a uniquely specified ordinal number (cardinal number)
on this scale of ordinals. We make the following universe comparisons.
In the ZFC-universe: For any infinite set S such that |S| = ℵγ,
2^S ∼e P(S) ∼e ℵα for some α > γ. The value of α is guaranteed to
exist, but cannot be determined from the ZFC-axioms alone. The value
of α is simply assumed to be equal to some ordinal greater than or
equal to γ + 1.
In the (ZFC + GCH)-universe: If S is any infinite set such that |S| = ℵγ,
then 2^S ∼e P(S) ∼e ℵγ+1. The cardinality of the set 2^S ∼e P(S) is the
least cardinal number (on the equipotence-based scale) which is larger
than ℵγ. The axiom GCH limits the size of power sets P(S) relative to
the size of S.
In the (ZFC + CH)-universe: For any infinite set S such that |S| = ℵγ >
ℵ0, 2^S ∼e P(S) ∼e ℵα for some α > γ. The value of α is guaranteed to
exist, but cannot be determined. It is equal to some ordinal greater than
or equal to γ + 1. But, if |S| = ℵ0, then the cardinality of 2^S ∼e P(S) is
the immediate successor cardinal, ℵ1, of ℵ0. So CH only limits the size
of 2^ℵ0.
In the (ZFC + ¬CH)-universe: For any infinite set S such that
|S| = ℵγ > ℵ0, 2^S ∼e P(S) ∼e ℵα for some α > γ. The value of α
is guaranteed to exist, but cannot be determined. It is equal to some or-
dinal greater than or equal to γ + 1. But, if |S| = ℵ0, then the cardinality
of 2^S ∼e P(S) is not the immediate successor cardinal, ℵ1, of ℵ0. That
is, 2^ℵ0 > ℵ1.
Ranking the elements of the class S of all sets in ZFC with ↪e. Suppose
S and T are two infinite sets in S . If S and T are equipotent, then they
can be viewed as being the “same size” (just like a set of five squirrels and
a set of five submarines are viewed as being the same size, in spite of the
fact that squirrels and submarines are entirely different types of objects).
Suppose now that S and T are not equipotent. We wonder whether one of
these two infinite sets is necessarily embedded in the other.
We show that it must be the case. Say the relation <S well-orders S and
<T well-orders T . (The Well-ordering theorem guarantees that such well-
orderings exist for each of these two sets.) Suppose α = ord S and β = ord T .
Now α and β cannot be the same ordinal number, for if they were, then S
and T would be equipotent. Then, either α ∈ β or β ∈ α. Suppose, without
loss of generality, that α ∈ β. The set S is order isomorphic to an initial
segment of T. It follows that S ↪e T. We have shown that any pair of non-
equipotent sets S and T are ↪e-comparable. Note that comparing sizes of
sets in this way would not be possible without the Well-ordering theorem
(which follows from the Axiom of choice).
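
The comparison argument can be mimicked for finite sets, where a well-ordering is simply a listing of the elements. The Python sketch below is only an illustration under that assumption: the function name compare_by_order_type is hypothetical, and sorted is used to supply the well-orderings that, for arbitrary sets, only the Well-ordering theorem can guarantee.

```python
def compare_by_order_type(S, T):
    """Embed the set of smaller order type into an initial segment of the other.

    Both sets are well-ordered by sorting; the k-th element of the
    shorter listing is sent to the k-th element of the longer one.
    """
    s, t = sorted(S), sorted(T)
    if len(s) <= len(t):
        return "S embeds in T", dict(zip(s, t))   # zip stops at the shorter list
    return "T embeds in S", dict(zip(t, s))


# Hypothetical example sets.
verdict, embedding = compare_by_order_type({3, 1, 2}, {10, 20, 30, 40, 50})
print(verdict, embedding)
```

The dictionary returned is a one-to-one map from one set onto an initial segment of the other, which is exactly the kind of embedding obtained in the argument above.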
Finally, since the class, {ℵα : α ∈ O}, of all infinite cardinal numbers is
indexed by the elements of O, no two of which are equipotent, we can then
say that our universe of sets contains as many different infinite set sizes as
there are ordinals!

29.6 Successor cardinals and limit cardinals.

Some curious readers may have already detected subcategories of C and
wonder whether they deserve some more consideration. We provide an ex-
ample. We know that cardinal numbers, ℵα , are indexed by ordinals. There
are two types of ordinals: successor ordinals and limit ordinals. This means
we can subdivide cardinal numbers into two subfamilies: one subfamily whose
cardinal numbers are indexed by a successor ordinal, and a second sub-
family whose cardinal numbers are indexed by a limit ordinal. These two
subfamilies have already been recognized and are named successor cardi-
nals and limit cardinals, respectively. These are studied in depth in more
comprehensive or more advanced texts of Set theory (See Introduction to
Set Theory by Hrbacek and Jech in the Bibliography).

Concepts review:
1. What is a well-orderable set? How is it different from a well-ordered
set?
2. What can be said about those sets that are the one-to-one image of
an ordinal number?
3. Is it true that any well-ordered set, no matter how large, is order
isomorphic to some ordinal number?
4. What does it mean to say that the well-ordered set S has order type
(ordinality) α?
5. What does the Well-ordering theorem say?
6. The Well-ordering theorem is a consequence of which fundamental
ZFC-axiom?
7. What is an initial ordinal? Are natural numbers initial ordinals?
Why?
8. What is the least uncountable initial ordinal? How is it obtained?
9. Are all ordinals in ω0 ∪ {ωα : α ∈ O} initial ordinals?
10. Define the “cardinal numbers”.
11. If we assume the Continuum hypothesis, what is the cardinality of
R?
12. If we do not assume the Continuum hypothesis, what is the cardi-
nality of R?
13. What is a limit cardinal?
14. What is a successor cardinal?

EXERCISES

A. 1. Consider the set of ordinals defined as S = ∪{ωn : n ∈ N, n > 2}.

a) Is S = O? Why?
b) Is S an ordinal number?
c) If S is an ordinal number, is it a limit ordinal number?
d) What is the least upper bound of S?
2. List a few possible order types of a set of cardinality ℵ0 .
3. What is the cardinality of a set whose ordinality is ω3 + ω2 ?

4. Use the appropriate symbols to represent the ordinality, the associated
initial ordinal and the cardinality of the set S = N ∪ {N} (ordered in the
usual way).
5. Does the Cantor set have a well-ordering?
6. If we assume CH, what is the least ordinal that is not equipotent to R?
7. What is the least ordinal that does not belong to I ?
8. What is the least limit ordinal that does not belong to I ?

B. 9. Let S = {1, 2, 4, 8, 32, 64}. Define the order relation ≤ on S as follows: a ≤ b

if and only if a divides b.
a) Is ≤ a well-defined order relation on S?
b) Is ≤ a well-ordering of S?
c) Express (S, ≤) as an element of P(S)×P(S ×S) by explicitly exhibit-
ing all of its elements.
10. Let S = {aα : α ∈ ω0 + ω} be a set whose elements are defined as follows:

aα = 2 − 1/(α + 1)        when α ∈ ω0
aα = 2                    when α = ω0
aα = 2 + 1/(f(α) + 1)     when ω0 ∈ α

where f(ω0 + n) = n.
a) Are the elements of the set S well-defined?
b) Is the ordering induced on the elements of S by the index set ω0 + ω a
well-ordering?
c) If the elements of S are assumed to be well-ordered by the index set,
what is the least element of the set S with respect to this ordering?
d) What is the least upper bound (supremum) of the set {aα : α ∈ ω}
with respect to the ordering defined by the index set?

C. 11. Show that if S is a class of ordinals, then the least upper bound of S is
∪{α ∈ O : α ∈ S}.
12. Is it true that given any two sets S and T , either S is embeddable in T or
T is embeddable in S? Why?
13. Is the class I of all initial ordinals an initial segment of O? Why?
14. Is the class I of all initial ordinals a transitive class?
Part IX

Choice, regularity and

Martin’s axiom
Part IX: Choice, regularity and Martin’s axiom 335

30 / Axiom of choice
Abstract. In this section we prove that the Axiom of choice is equivalent
to the Well-ordering principle. We provide a few mathematical statements
whose proof requires the Axiom of choice. Finally we present Zorn’s lemma
and show that it is equivalent to the Axiom of choice. A proof of the fact
that every vector space has a basis is given by invoking Zorn’s lemma.

30.1 Introduction.
Amongst the eight set-axioms we have listed, there are only two that pos-
tulate the existence of a set. The other set-axioms are ones that provide us
with the necessary tools to construct new sets from ones that we already
have. The first such existence axiom that we have encountered is the Axiom
of Infinity. The Axiom of infinity postulates the existence of an inductive
set (X ∈ A ⇒ X ∪ {X} ∈ A). Most people have no complaints about this
axiom, since without it we have nothing “on the table” to work with, so
to speak. If we must postulate the existence of at least one set, why not
postulate existence of a set which characterizes the natural numbers?
The second existence set-axiom is the Axiom of choice. This existence axiom
is quite different in nature from the Axiom of infinity (which postulates the
existence of just a single set). The Axiom of choice states that, given any set
U = {Sx : x ∈ A} of non-empty sets, there exists a function f : U → ∪x∈A Sx
which maps each set Sx to an element yx ∈ Sx. We refer to f as a “choice
function for U” since, from each set, Sx, in U, f chooses a particular element
f(Sx) without specifying the rule according to which this choice is made.
Remember that the function, f, is itself a set. So the Axiom of choice
postulates, for each U, the existence of a “set”, f.
So, for each such set, U , we postulate the existence of at least one associated
set, f : U → ∪x∈A Sx . Some readers may wonder why we don’t just construct
the function f : U → ∪x∈A Sx . It is just that, in most cases, we don’t know
how or whether, in practice, such a function is even constructible. So we are
postulating the existence of a “rule” without ever being able to know what
this rule could possibly be. Some may wonder how we can permit ourselves
to postulate the existence of a set that can never be constructed, witnessed
or exhibited. The problem is that, to do most of the mathematics that is
essential for us today, we absolutely need it.
In this chapter, we will try to develop a deeper understanding of what this
axiom is about. The Axiom of choice will be seen to be equivalent to other
mathematical statements, some of which many find more palatable.
336 Section 30: Axiom of choice

30.2 Axiom of choice.

The Axiom of choice plays a fundamental role in the study of mathematics.
This principle was invoked in the proofs of numerous mathematical state-
ments long before mathematicians began acknowledging it as an axiom. It
is worth restating it here:

The Axiom of choice: Let S be a set of non-empty sets whose

union is the set, T , of elements. Then we can choose some ele-
ment s from each set S in S . That is, there exists a function,
f : S → T , with domain, S , such that for each set, S ∈ S ,
f(S) ∈ S.1
The function f : S → T described in this paragraph is referred to as a
“choice function”.
Suppose we are given a set of non-empty sets, U = {Sα : α ∈ γ ∈ O} and
we wish to select an element s from each set S ∈ U . To do this, it is im-
portant to be able to distinguish between situations where one must invoke
the Axiom of choice from those where this is not required. We consider the
following examples.
Example 1. Suppose we are given a non-empty set S for which no specific
element of S can be distinguished from the others. Suppose we wish to
choose a single element in the set S. To do this, is it necessary to invoke the
Axiom of choice? The answer is no. We can simply argue as follows:

S ≠ ∅ ⇒ P(S) ≠ ∅
      ⇒ P(S) × S ≠ ∅ (By definition of Cartesian product.)
      ⇒ {S} × S ≠ ∅ (Since S ∈ P(S).)
      ⇒ there exists an element (S, x) ∈ {S} × S

Observe that {(S, x)} is a function with domain {S} and range {x} which
associates to S some element x of S. We did not invoke the Axiom of choice
to postulate the existence of the function f(S) = x.
1 Formally expressing this axiom in first order logic:
∀X [∅ ∉ X ⇒ ∃f : X → ∪{A : A ∈ X} ∀A ∈ X (f(A) ∈ A)]

Example 2. Suppose we wish to select an element from each non-empty set
in P(N). To do this, do we need to invoke the Axiom of choice? Consider
the set P(N)∗ = P(N) − {∅} of all non-empty subsets of N. Suppose we
want to form a new set, S = {nA : A ∈ P(N)∗}, by selecting from each set,
A, in P(N)∗ a single element, nA. We can argue as follows: Since the set N
has been shown to be well-ordered, we can choose from each non-empty set
A ∈ P(N)∗ the unique smallest number, nA, in A. The Axiom of choice is
not required, since the element chosen from each set A in P(N)∗ is
specifically and unambiguously identified as being the unique least element
in A. We can express our choice function, f : P(N)∗ → N, as follows:

f = {(A, nA) ∈ P(N)∗ × N : nA = unique least element of A}

In this case, the Axiom of choice is not required.
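
The point of Example 2 is that the choice function is given by an explicit rule. A short Python sketch makes this concrete; the name least_element_choice and the three sample sets are hypothetical, and only finite members of P(N)∗ are used since a program cannot list an infinite set.

```python
def least_element_choice(A):
    """Explicit choice function on non-empty subsets of N: pick the least element."""
    if not A:
        raise ValueError("A must be a non-empty set of natural numbers")
    return min(A)


# A few finite members of P(N)*, chosen only for illustration.
family = [frozenset({5, 9, 2}), frozenset({7}), frozenset(range(10, 20))]

# f is the set of pairs (A, n_A), stored here as a dictionary.
f = {A: least_element_choice(A) for A in family}
print(f)
```

Because the rule "take the least element" applies uniformly to every non-empty subset of N, no appeal to the Axiom of choice is needed to know that f exists.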

Example 3. Suppose, on the other hand, that we are given an infinite set of
identical golf balls distributed in infinitely many boxes, A = {Ai : i ∈ I} in
such a way that each box, Ai , has at least one ball. We are asked to construct
a function, f : A → ∪{Ai : i ∈ I}, which chooses a single ball from each box.
Note that the golf balls in the boxes are neither labeled nor indexed, in the
sense that they are all identical in all respects. So there can be no formula
for a function f which globally states which specific ball is to be chosen in
each box. In this case, the best we can do is to invoke the Axiom of choice
which guarantees that at least one choice function, f : A → ∪{Ai : i ∈ I},
exists.
It was shown by Kurt Gödel in 1938 that no contradictions can result from
invoking the Axiom of choice, provided the ZF-axioms are themselves
consistent. In 1963, Paul Cohen showed that the Axiom of choice cannot be
proved from ZF. So the Axiom of choice adds new sets (since functions are
sets) to a ZF-universe of sets.
The following more general theorem shows that the existence of a choice
function for finitely many sets does not require the intervention of the Ax-
iom of choice.

Theorem 30.1 Suppose S = {Ui : i = 1 to k} is a finite set of non-empty
sets. Then there exists a function f : S → ∪{Ui : i = 1 to k} which maps
each set, Ui, to one of its elements f(Ui) = yi ∈ Ui.
Proof: We prove this by induction on n ∈ N.
Let P(n) be the statement: “A system of n non-empty sets has a choice
function”.
Base case: The statement P(1) holds true as shown in Example 1 above.
Inductive hypothesis: Suppose the statement P(n) holds true.
That is, if S = {Ui : i = 1 to n} there exists a choice function
f : S → ∪{Ui : i = 1 to n} which maps each set, Ui, to one of its elements
f(Ui) = yi ∈ Ui, i = 1 to n.
Suppose S = {Ui : i = 1 to n + 1}. By the inductive hypothesis, a choice
function
g : {Ui : i = 1 to n} → ∪{Ui : i = 1 to n}
exists with g(Ui) = zi ∈ Ui, i = 1 to n.
Since Un+1 is non-empty, there exists an element, say
(Un+1, u) ∈ {Un+1} × Un+1.
Define the choice function f : S → ∪{Ui : i = 1 to n + 1} as follows:

f(Ui) = g(Ui)   if i ≤ n
f(Ui) = u       if i = n + 1

So P(n + 1) holds true. By mathematical induction, every finite system of
non-empty sets has a choice function.
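
The induction in this proof translates directly into a recursive procedure. The following Python sketch is offered as an illustration, not as part of the proof; the function name choice_function is hypothetical, and next(iter(U)) is used only as a convenient way of exhibiting "some element" of a non-empty finite set U.

```python
def choice_function(sets):
    """Build a choice function for a finite list of distinct non-empty sets.

    Mirrors the induction: a choice function g for the first n sets is
    extended to the (n + 1)-th set by picking one element u from it.
    """
    if not sets:
        return {}                          # nothing to choose from
    *first_n, U_last = sets
    if not U_last:
        raise ValueError("every set in the family must be non-empty")
    g = choice_function(first_n)           # inductive hypothesis
    u = next(iter(U_last))                 # some element of the last set
    f = dict(g)                            # f(U_i) = g(U_i) for i <= n
    f[U_last] = u                          # f(U_(n+1)) = u
    return f


# Hypothetical family of three non-empty sets.
family = [frozenset({1, 2}), frozenset({"a", "b", "c"}), frozenset({3.5})]
f = choice_function(family)
print(f)
```

The recursion terminates only because the list of sets is finite, which is the same limitation as in the proof by induction.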

A word of caution. The proof above does not show that any countably in-
finite set of sets has a choice function. The statement P(n) states that “a
set of n non-empty sets has a choice function” for each natural number n.
The induction only proves that every finite set of non-empty sets has a
choice function, nothing more.

30.3 Axiom of countable choice.

The Axiom of countable choice is a weaker form of the Axiom of choice. We
state it formally:
The Axiom of countable choice. Let S = {Ui : i ∈ ω0} be a countable
collection of nonempty sets whose union is the set, T = ∪i∈ω0 Ui, of
elements. Then we can choose some element u from each set Ui in S.
That is, there exists a function f : S → T with domain S such that
for each set Ui ∈ S, f(Ui) ∈ Ui.
Since the Axiom of countable choice is a special case of the Axiom of choice,
it obviously follows from it. In the particular case where T = ∪i∈ω0 Ui is a
countable set, then it is well-orderable, and so the Axiom of countable choice
is not required to justify the existence of a choice function. The function, f,
can simply assign each Ui to the least element of Ui. If T = ∪i∈ω0 Ui is un-
countable, the Axiom of countable choice is, in general, required to justify
the existence of a choice function on S = {Ui : i ∈ ω0}.2
The reader may wonder why we cannot generalize the arguments used to
prove the existence of a choice function on finite sets of sets to justify the
existence of a choice function on an infinite number of subsets of an infinite
set S. The main reason is that if there are infinitely many non-empty subsets
F = {Sα : α ∈ γ} of an infinite set S, then, for each set Sα ∈ F , we would
have to invoke the existence principle,

{Sα} × Sα ≠ ∅ ⇒ there exists (Sα, s) ∈ {Sα} × Sα

This means that this existence principle would have to be invoked infinitely
many times. The ZF-axioms do not define a formula with an infinite chain
of existence symbols.

2 Paul Cohen has also proven that the Axiom of countable choice cannot be
proven from the ZF-axioms.

30.4 Equivalent forms of the Axiom of choice.

There are many mathematical statements which are equivalent to the Ax-
iom of choice. One of the simplest statements is the following one.

Theorem 30.2 Let [AC∗] denote the statement:
“For any set, S = {Sα : α ∈ γ}, of non-empty sets, Πα∈γ Sα is non-
empty”.
The Axiom of choice holds true if and only if [AC∗] holds true.
Proof:
(⇒) Suppose the Axiom of choice holds true. Let S = {Sα : α ∈ γ} be a
set of non-empty sets. Then there exists a choice function f : S → ∪α∈γ Sα
such that f(Sα ) = sα where sα is some element in Sα . Then (sα : α ∈ γ) ∈
Πα∈γ Sα . Hence, Πα∈γ Sα is non-empty. Then [AC∗ ] holds true.
(⇐) Suppose the statement [AC∗ ] holds true. Let S = {Sα : α ∈ γ} be a
set of non-empty sets. Since Πα∈γ Sα is non-empty, it contains some element
(sα : α ∈ γ) ∈ Πα∈γ Sα . Then, for each α ∈ γ, sα ∈ Sα . The function,
f : S → ∪α∈γ Sα , defined as, f(Sα ) = sα , is well-defined. Then the Axiom
of choice holds true.
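
The two directions of this proof amount to reading the same object in two ways: an element of the product is a choice function, and conversely. A minimal Python sketch for a finite family follows; the helper names choice_from_tuple and tuple_from_choice are hypothetical, and the sets in the family are assumed to be pairwise distinct so that they can serve as dictionary keys.

```python
from itertools import product

# Hypothetical finite family of pairwise distinct non-empty sets.
family = [frozenset({1, 2}), frozenset({"a"}), frozenset({3.5, 4.5})]


def choice_from_tuple(family, tup):
    """(<=) An element (s_alpha) of the product gives f with f(S_alpha) = s_alpha."""
    return dict(zip(family, tup))


def tuple_from_choice(family, f):
    """(=>) A choice function f gives the element (f(S_alpha)) of the product."""
    return tuple(f[S] for S in family)


# The product of non-empty finite sets is non-empty, so a first element exists.
first = next(product(*family))
f = choice_from_tuple(family, first)
print(f, tuple_from_choice(family, f))
```

For finite families the product can be listed outright. The content of [AC∗] is that the product remains non-empty for arbitrary families, where no such listing is available.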

We will now prove that the statement “Every set is well-orderable” implies
the Axiom of choice. We can then conclude that the Axiom of choice and
“Every set is well-orderable” are equivalent statements.

Theorem 30.3 The statement “Every set is well-orderable” holds true if and
only if the Axiom of choice holds true.

Proof:
(⇐) That the Axiom of choice implies “Every set is well-orderable” is proven
in the Well-ordering theorem (29.4).

(⇒) What we are given: That S = {Ui : i ∈ A} is a set of non-empty sets
and that T = ∪i∈A Ui is a well-orderable set.
What we are required to show: There exists a function g : S → T which
maps each set Ui in S to some element of Ui.
Since T is well-orderable, there exists a function f mapping some ordinal
α ∈ O one-to-one onto T. For any Ui ∈ S, f⁻¹[Ui] is a non-empty subset
of the ordinal α and so f⁻¹[Ui] must have a least element, say αi∗ (since
α is well-ordered). Then f(αi∗) is the least element of Ui with respect to
the well-ordering induced on T by α. The function g : S → T defined as
g(Ui) = f(αi∗) ∈ Ui is a well-defined function mapping each set Ui in S to
one of its elements, as required.
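
The construction of g from the well-ordering of T can be sketched in Python for a finite set T. In this illustration the list enumeration plays the role of the one-to-one function f from an ordinal onto T, and the names choice_from_well_ordering, enumeration and family are hypothetical.

```python
def choice_from_well_ordering(family, enumeration):
    """Choose from each set the element that appears first in the enumeration.

    `enumeration` lists the elements of T without repetition, so it acts
    as a one-to-one map f from {0, 1, ..., n-1} onto T.
    """
    index = {t: i for i, t in enumerate(enumeration)}    # the inverse of f
    # g(U) = f(least element of the preimage of U under f)
    return {U: min(U, key=index.__getitem__) for U in family}


# Hypothetical well-ordering of T = {a, b, c, d}: d < a < c < b.
enumeration = ["d", "a", "c", "b"]
family = [frozenset("ab"), frozenset("cd"), frozenset("abc")]
g = choice_from_well_ordering(family, enumeration)
print(g)
```

A single well-ordering of T settles the choice for every subset at once, which is why no further appeal to a choice principle is needed in this direction of the proof.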

30.5 Some consequences of the Axiom of choice.

The Axiom of choice has already been invoked a few times in the earlier
sections to prove important statements which are unprovable without it.
There are many such statements, intuitively felt to be true, which cannot
be proven from the ZF -axioms. That is, the Axiom of choice is the only key
we can use to unlock certain “doors” which would remain closed otherwise.
A simple example is the following statement:
If f : S → T is a function mapping S onto T , then there exists
a function g : T → S mapping T into S such that f ◦ g = IT , the
identity map on T .
This statement seems obviously true. It is trivially true if the function f is
one-to-one and onto. If f is not one-to-one, at first glance the proof appears
to be straightforward. It would go something like this:
· Since f is onto T , for each y ∈ T , f ← [{y}] is non-empty.
· For each y ∈ T , choose uy ∈ f ← [{y}].
· Define the function g : T → S as follows: g(y) = uy .
· Then (f ◦ g)(y) = f(g(y)) = f(uy ) = y. So f ◦ g is the identity function on T .

A minor flaw in this proof is that the statement “For each y ∈ T , choose
uy ∈ f ← [{y}]” is not appropriately justified. The Axiom of choice must be
invoked to justify the choice of an element in each set of an infinite number
of sets.
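
When S and T are finite, the "proof" above is perfectly sound and can be carried out mechanically. The Python sketch below does so under that assumption; the name right_inverse and the sample function are hypothetical. Picking the first listed element of each fibre is an explicit rule, so no choice principle is involved here.

```python
def right_inverse(f, S, T):
    """Return g : T -> S (as a dictionary) with f(g(y)) = y for every y in T."""
    g = {}
    for y in T:
        fibre = [x for x in S if f(x) == y]   # the preimage of {y}
        if not fibre:
            raise ValueError("f does not map S onto T")
        g[y] = fibre[0]                       # choose u_y in the fibre
    return g


# Hypothetical example: squaring maps S onto T but is not one-to-one.
S = range(-3, 4)
T = {0, 1, 4, 9}


def square(x):
    return x * x


g = right_inverse(square, S, T)
print(g)
```

For an infinite set T there may be no uniform rule such as "take the first listed element", and it is at exactly this step that the Axiom of choice is needed.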
We state a few other well-known statements whose proofs depend on the
Axiom of choice. That is, the following results hold true only if we accept
the Axiom of choice as an axiom along with the other ZF -axioms.

“Every infinite set contains a one-to-one image of the natural
numbers”. The proof is given in Theorem 18.9.

“Any infinite set can be expressed as the union of a pairwise dis-
joint set of infinite countable sets”. The proof is provided in the
theorem below.

Theorem 30.4 [AC] Any infinite set can be expressed as the union of a pair-
wise disjoint set of infinite countable sets.
Proof:
What we are given: That S is an infinite set.
What we are required to show: That there exists a family, F = {Uα : α ∈ φ},
of pairwise disjoint countably infinite sets such that S = ∪{Uα : α ∈ φ}.
We will recursively construct the set F :
− Let T = {F ∈ P(S) : F is countably infinite}. Since S is a set, then
P(S) is a set and so T is a set. Since S is infinite, T must contain at
least one countably infinite subset U0 of S (By Theorem 18.9).
− If S − U0 is finite, then S is countably infinite and we can let F = {S};
we are done.
− More generally: Let Kγ = {Uα : α ∈ γ} be a set of countably infinite
pairwise disjoint subsets of S indexed with the elements of the ordinal
γ. Either S − ∪{Uα : α ∈ γ} is finite or it is infinite.
· If S − ∪{Uα : α ∈ γ} is infinite, then choose an arbitrary element
Uγ of T which is entirely contained in S − ∪{Uα : α ∈ γ}. We
then obtain the set Kγ+1 = {Uα : α ∈ γ + 1} of countably infinite
pairwise disjoint subsets of S. The Axiom of choice will allow us to
make the selection of Uγ for all such sets Kγ of countably infinite
sets.
− Let F = {Uα : α ∈ φ} be the set of all countably infinite sets obtained
in this way. Then F is a set of pairwise disjoint countably infinite sub-
sets of S. Now either ∪{Uα : α ∈ φ} is equal to S or it is not. If it is equal
to S, then we are done. If it is not equal to S, then S − ∪{Uα : α ∈ φ}
is finite. In such a case, we can add those finitely many remaining elements to Uα for
some α ∈ φ; the enlarged Uα is still countably infinite and disjoint from the
other sets. We then obtain S = ∪{Uα : α ∈ φ} as required.

30.6 Zorn’s lemma3

Zorn’s lemma is one of the most commonly used statements that are equiv-
alent to the Axiom of choice. It refers to a specific property of a partially
ordered set. Recall that a chain of a partially ordered set X is a linearly
ordered subset of X. A maximal element of the partially ordered set is an
element m of X for which there does not exist an element x of X such that
m < x.
3 Max August Zorn (1906-1993) was a German mathematician. He was an algebraist,
group theorist, and numerical analyst. He is best known for Zorn’s lemma, a method used
in set theory that is applicable to a wide range of mathematical constructs such as vector
spaces, and ordered sets amongst others. Zorn’s lemma was first postulated by Kazimierz
Kuratowski in 1922, and then independently by Zorn in 1935. (Wikipedia)
342 Section 30: Axiom of choice

Zorn’s lemma states:
“If every chain of a partially ordered set (X, <) has an upper
bound, then X has a maximal element”.

We will first prove that the Axiom of choice implies Zorn’s lemma holds
true. This will be followed by the proof of its converse.

Theorem 30.5 [AC] Let (X, <) be a partially ordered set. If every chain of
X has an upper bound, then X has a maximal element.
P roof:
What we are given: That X is a partially ordered set. For every chain C in
X, there is an element kC in X such that c ≤ kC for all c ∈ C.
What we are required to show: That X contains an element m such that no
element x of X satisfies the property m < x.
Let φ be the Hartogs number of X. That is, φ is the least ordinal number
which is not equipotent with any subset of X.
Proof by contradiction. Suppose X has no maximal element. Then, for every
element s ∈ X, the set s∗ = {x ∈ X : x > s} is non-empty. The Axiom of choice
guarantees the existence of a choice function f : P(X) − {∅} → X which maps
each non-empty subset A of X to some element f(A) ∈ A; in particular, f(s∗ ) ∈ s∗ .
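We recursively define the function g : φ → X as follows:

\[
\begin{aligned}
g(0) &= x_0 = f(X) \\
g(1) &= x_1 = f(x_0^*) > x_0 \\
g(2) &= x_2 = f(x_1^*) > x_1 \\
&\;\;\vdots \\
g(\alpha^+) &= x_{\alpha^+} = f(x_\alpha^*) > x_\alpha \\
g(\lambda) &= x_\lambda = \text{upper bound of the chain } \{x_\alpha : \alpha \in \lambda\} \quad \text{if } \lambda \text{ is a limit ordinal} \\
&\;\;\vdots
\end{aligned}
\]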
In the case where λ is a limit ordinal, the hypothesis guarantees that the
upper bound of the chain {xα : α ∈ λ} exists in X. Since the function
g is strictly increasing, it is one-to-one. Since X has no maximal element,
the function g maps φ = {α : α ∈ φ} one-to-one into X, contradicting the
fact that X’s Hartogs number φ is the least ordinal which cannot be mapped
one-to-one into X. The source of the contradiction is our assumption that X
has no maximal element. We must conclude that X has a maximal element,
as required.

A theorem to which we have attached the label [ZL] indicates to
the reader that Zorn’s lemma is invoked in the proof of the statement.

Theorem 30.6 [ZL] Suppose that those partially ordered sets (X, <) in
which every chain has an upper bound must have a maximal element. Then,
given any set S and any subset 𝒮 ⊆ P(S) − {∅}, there exists a choice function
f : 𝒮 → S which maps each set in 𝒮 to one of its elements.
Proof:
What we are given: That partially ordered sets (X, <) in which every chain
has an upper bound have a maximal element and that 𝒮 ⊆ P(S) − {∅}.
What we are required to show: That there exists a choice function f : 𝒮 → S
which maps each set in 𝒮 to one of its elements.
Let

F = {f : D → S : f is a choice function with domain D ⊆ 𝒮 }

Note that F is non-empty since it has been shown that all finite sets of
sets have a choice function. We will partially order the functions in F by
inclusion “⊂”. That is,

[f ⊂ g] ⇔ [{(A, s) ∈ 𝒮 × S : f(A) = s} ⊂ {(A, s) ∈ 𝒮 × S : g(A) = s}]

Let C be a chain in (F , ⊂). Then ∪{f : f ∈ C} is an upper bound of
C which is contained in F (the union is itself a choice function because any
two members of a chain agree wherever both are defined). Then every chain in
the partially ordered set (F , ⊂) has an upper bound. By hypothesis, (F , ⊂)
has a maximal element, say h.
We claim that h : 𝒮 → S is a choice function on 𝒮 .
Suppose not. That is, suppose there exists a non-empty element S ∗ ∈ 𝒮
which does not belong to the domain of h. Since S ∗ is non-empty, there
exists an element s∗ in S ∗ . Let h∗ = h ∪ {(S ∗ , s∗ )}. Then h∗ ∈ F and h is
a proper subset of h∗ .
This is a contradiction, since h was declared to be a maximal element of F .
We conclude that h : 𝒮 → S is a choice function for 𝒮 .
Hence, the Axiom of choice follows from Zorn’s lemma.
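
The extension step h∗ = h ∪ {(S∗, s∗)} used in the proof can be illustrated on a finite family, where neither Zorn's lemma nor the Axiom of choice is needed. The Python sketch below is only an analogy under that finiteness assumption; the name extend_to_maximal and the rule "take the least element" are illustrative and do not come from the text.

```python
def extend_to_maximal(family):
    """Extend a partial choice function on `family` until no extension exists."""
    h = {}                       # the empty function is a partial choice function
    while True:
        missing = [A for A in family if A not in h]
        if not missing:
            return h             # h is maximal: its domain is the whole family
        A = missing[0]           # a set S* outside the domain of h
        s = min(A)               # an element s* of S*, picked by an explicit rule
        h = {**h, A: s}          # h* = h ∪ {(S*, s*)} properly extends h


family = [frozenset({1, 2}), frozenset({3}), frozenset({2, 5, 7})]
h = extend_to_maximal(family)
```

The loop stops only when the domain of h is the whole family, which mirrors the argument that a maximal element of F cannot omit any set in the family.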

30.7 A consequence of Zorn’s lemma.

We now provide a proof of a statement normally encountered in a linear algebra
course. In first-year courses, this statement is often stated without
proof. Postulating this statement is easier than postulating the more abstract
Zorn’s lemma. We begin by recalling a few facts.
A basis B of a vector space V is a non-empty subset of V with special prop-
erties. This set B need not be finite. But it must satisfy two properties:
1) Every finite linear combination of elements in B which equals zero must
have only zeroes as coefficients. This is called the linear independence prop-
erty.
2) Every vector in V is a linear combination of finitely many elements of B.

In such cases, we say that “B spans V ”.

There are algorithms we can use to find a basis for a finite dimensional
vector space. Some vector spaces, however, do not have a finite basis. For
example, the vector space of all countably infinite sequences of real numbers,
(a1 , a2 , a3 , . . . ), does not have a finite basis. If we assume Zorn’s lemma,
we can prove that it has a basis, but the proof does not show how to construct
one or produce an explicit basis for vector spaces which have no finite
spanning family. In fact, for this vector space, no explicit basis is known. We
prove below that a basis exists for any vector space. Again, the label
[ZL] next to the theorem statement informs the reader that the proof invokes
Zorn’s lemma.

Theorem 30.7 [ZL] Every vector space has a basis.

P roof:
Let V be a vector space and let (F , ⊆F ) be the set of all linearly independent
subsets of the vector space V ordered by inclusion ⊆F . The set F is
non-empty since, for any non-zero vector v, the singleton {v} is linearly
independent (we assume V ≠ {0}).
Let 𝒞 be a chain of linearly independent subsets in F .
We claim that the union ∪{C : C ∈ 𝒞} is also linearly independent:
- It suffices to show that every finite linear combination of elements of
∪{C : C ∈ 𝒞} which equals zero must have zeroes as coefficients. Let U =
{v1 , v2 , v3 , . . . , vn } be a finite set of vectors in ∪{C : C ∈ 𝒞}.
- Then U ⊆ C for some C ∈ 𝒞 (each vi belongs to some member of 𝒞 and, since
𝒞 is a chain, the largest of these finitely many members contains all of U ).
- Since C is linearly independent, then α1 v1 +α2 v2 +α3 v3 +· · ·+αn vn = 0
implies α1 = α2 = · · · = αn = 0.
- So ∪{C : C ∈ 𝒞} is linearly independent as claimed.
Since this union contains every member of 𝒞, every chain 𝒞 in (F , ⊆F ) has an
upper bound in F .
By Zorn’s lemma, (F , ⊆F ) has a maximal linearly independent set B ∗ . That
is, B ∗ is a linearly independent set that is not a proper subset of any other
linearly independent set. We now show that B ∗ spans V . If v ∈ V −B ∗ is not a linear
combination of vectors in B ∗ , then B ∗ ∪ {v} is a linearly independent subset
of V which properly contains B ∗ , contradicting the maximality of B ∗ . So,
B ∗ spans V . So, B ∗ is a basis of V . Thus, every vector space has a basis.
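
In a finite-dimensional setting the maximal linearly independent set can be found by a greedy search instead of Zorn's lemma. The Python sketch below works over the rationals and assumes the candidate vectors are given as a finite list; the helper names rank, is_independent and maximal_independent_subset are illustrative and not taken from the text.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors over the rationals, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                    # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_independent(vectors):
    return rank(vectors) == len(vectors)


def maximal_independent_subset(candidates):
    """Keep a vector only if it is not a combination of those already kept."""
    basis = []
    for v in candidates:
        if is_independent(basis + [v]):
            basis.append(v)
    return basis


candidates = [(1, 1, 0), (2, 2, 0), (0, 1, 1), (1, 2, 1), (0, 0, 1)]
B = maximal_independent_subset(candidates)
```

As in the proof, the set B is maximal: adding any rejected candidate destroys linear independence, so every candidate is a linear combination of vectors in B.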

When reading a mathematical statement which contains the words “. . . there
exists . . . ” it is natural for the reader to watch out in the proof for a step
in which some form of the Axiom of choice is invoked (even if this is not
pointed out explicitly to the reader).

The examples of the mathematical statements above whose proof requires
invoking the Axiom of choice are usually sufficient to convince most mathematicians
that this axiom plays an important role in modern mathematics.
Even mathematicians who feel queasy when trying to imagine what a well-ordering
of the real numbers looks like, or who fear its possible side effects,
eventually give in to invoking the Axiom of choice once they see that it is
the only way to arrive at some mathematical results they consider to be
“significant” in their field of study.
But in spite of this, most of us would agree that a correct proof which avoids
using the Axiom of choice is still preferable to one which unnecessarily invokes
it. When presenting a correct proof which invokes the Axiom of choice
one should expect, or at least not be surprised by, the follow-up question:
“Your proof is fine, but do you really need the Axiom of choice to prove this?”

30.8 Example: Axiom of choice on strong limit cardinals.

We consider another application of the Axiom of choice in a context that
involves limit cardinals.
Recall that if α and γ are ordinals, then α ∈ γ implies ℵα ∈ ℵγ . Also, if γ is
a limit ordinal then

ℵγ = lub{ℵα : α ∈ γ}
= lub{ωα : α ∈ γ}
= ωγ

In this case, we refer to ℵγ as a limit cardinal. The cardinal numbers ℵ_0 ,
ℵ_{ω0} , ℵ_{ω0+ω0} , ℵ_{ω0+ω0+ω0} are examples of limit cardinals. If γ is a
non-zero limit ordinal, then ℵγ is an uncountable cardinal number.
An uncountable limit cardinal number, ℵγ , is said to be a strong limit cardinal
if, for each α ∈ γ,
2^ℵα ∈ ℵγ

That is, the uncountable limit cardinal, ℵγ , is said to be a strong limit
cardinal if
{2^ℵα : α ∈ γ} ⊆ ℵγ
As we try to wrap our minds around this definition, it is normal to wonder
whether strong limit cardinal numbers even exist.
Before we present the following theorem, recall what the GCH states:

“If ℵα is an infinite cardinal, the next infinite cardinal in size is
ℵα+1 = 2^ℵα ”.

Assuming the GCH, we immediately prove the following straightforward result.

Theorem 30.8 [GCH] Every uncountable limit cardinal is a strong limit car-
dinal.

P roof:
Suppose ℵγ is an uncountable limit cardinal, and suppose α ∈ γ. Since γ
is a limit ordinal, α + 1 ∈ γ and so ℵα+1 ∈ ℵγ . Assuming GCH, we have
2^ℵα = ℵα+1 ∈ ℵγ . We have shown that α ∈ γ ⇒ 2^ℵα ∈ ℵγ . So, under
the GCH, ℵγ is a strong limit cardinal, as required.

But do we absolutely need GCH to prove this result? Are there any strong
limit cardinals in the ZFC-universe when uninfluenced by GCH? Surprisingly
enough, there are, but to prove it we will have to invoke the Axiom of choice.

Theorem 30.9 [AC] There exists a strong limit cardinal number.

P roof:
The class, C ∗ = {ℵα : α ∈ O, α ≠ 0}, denotes the class of all uncountable
infinite cardinals.
For each ordinal α, by the Well-ordering theorem (equivalent to the Axiom
of choice), the set P(ℵα ) is well-orderable.4 By Theorem 20.12, 2^ℵα and
P(ℵα ) are equipotent sets. Equipotent sets have the same cardinality. So
|P(ℵα )| = |2^ℵα |. Since |2^ℵα | = |2|^|ℵα| = 2^ℵα , then |P(ℵα )| = 2^ℵα . So, for
each α, the well-orderable set, P(ℵα ), is equipotent with a unique cardinal
number, namely 2^ℵα .
4 Without the Axiom of choice the set P(ℵα ) may not be well-orderable. This is the only
point where the Axiom of choice exercises its influence in the proof.

We define the function g : C ∗ → C ∗ as follows:

g(ℵα ) = |P(ℵα )| = 2^ℵα

See that the function g is well-defined. We recursively define the elements
of the set,
U = {κn : n ∈ ω0 }
in C ∗ , as follows:

\[
\begin{aligned}
\kappa_0 &= \aleph_0 \\
\kappa_1 &= g(\kappa_0) = 2^{\kappa_0} \\
\kappa_2 &= g(\kappa_1) = 2^{\kappa_1} \\
\kappa_3 &= g(\kappa_2) = 2^{\kappa_2} \\
&\;\;\vdots \\
\kappa_{n+1} &= g(\kappa_n) = 2^{\kappa_n} \\
&\;\;\vdots
\end{aligned}
\]
Then {κn : n ∈ ω0 } is a countable (strictly) increasing sequence of cardinals
which has no maximal element. It follows that

γ = ∪{κn : n ∈ ω0 }

is a limit ordinal which is the least upper bound of the set U (see Theorem
27.12).
Claim #1: The ordinal γ is a cardinal number.
Clearly γ is infinite. It suffices to show that γ is an initial ordinal (since
initial ordinals are cardinals). The ordinal γ is an initial ordinal if

[β ∈ γ] ⇒ [β ≁e γ]

(that is, |β| < |γ|).

Suppose β is an ordinal in γ. Since γ = ∪{κn : n ∈ ω0 } and the sequence is
increasing, there is some m ≥ 1 such that,
β ∈ κ_m = 2^κ_{m−1} ∈ 2^κ_m = κ_{m+1} ∈ γ
Then |β| < κ_m < κ_{m+1} ≤ |γ|. Since the cardinality of β is less than |γ|, then
γ is an (infinite) initial ordinal. So, by definition, the ordinal γ is a cardinal
number. This establishes Claim #1.
Since γ is an infinite cardinal number, there must exist some ordinal λ such
that γ can be expressed as
ℵλ = γ
Claim #2: The cardinal ℵλ is a strong limit cardinal. If so we are done.

Proof of Claim #2: Suppose α ∈ λ. It suffices to show that 2^ℵα ∈ ℵλ . We have

ℵα ∈ ℵλ = γ = ∪{κn : n ∈ ω0 }

Then ℵα ∈ κ_m for some m ∈ ω0 . It follows that

ℵα+1 ∈= 2^ℵα ∈ 2^κ_m = κ_{m+1} ∈ ℵλ (2^κ_m = κ_{m+1} by construction of {κn : n ∈ ω0 })

Since ℵα+1 ∈ ℵλ whenever α ∈ λ, we have α + 1 ∈ λ whenever α ∈ λ; so λ is a
limit ordinal and ℵλ is a limit cardinal. It is uncountable since
ℵ0 = κ0 ∈ κ1 ∈ γ = ℵλ . By definition, having 2^ℵα ∈ ℵλ for each α ∈ λ
implies that ℵλ is a strong limit cardinal number, as claimed.

This concludes the proof.

We have shown that strong limit cardinals exist in a ZFC-universe. Unfortunately
the proof cannot tell us how to construct one or what it looks like. In
ZFC strong limit cardinals are completely elusive. But, ethereal as they may
be in the absence of the powerful GCH, we can safely make the assumption
that at least one exists in ZFC. It is important to see that GCH plays no
role in the above proof. In a ZFC+GCH-universe strong limit cardinals are
everywhere and easily seen since they are the limit cardinals.

Concepts review:
1. What does the Well-ordering principle say?
2. What does the Axiom of choice say?
3. Is the Axiom of choice required to justify the existence of a choice
function for finite sets?
4. Is an axiom required to justify the existence of a choice function for
countably infinite subsets of P(S)?
5. What does the Axiom of countable choice say?
6. Provide an example of a statement whose proof requires the Axiom
of choice.
7. Does the Axiom of countable choice follow from the ZF -axioms?
8. State Zorn’s lemma.
9. What linear algebra statement can be proved by invoking Zorn’s
lemma?

EXERCISES

A. 1. For which of the following sets of sets is the Axiom of choice required to
guarantee the existence of a function which selects an element from each
set?
a) An infinite set of sets S where each set in S contains one element.
b) Three sets each containing all elements of R.
c) A countably infinite number of sets each containing all the rational
numbers.

d) Uncountably many sets each containing three identical golf balls.

e) A countably infinite number of sets each containing uncountably many
golf balls and one marble.
f) An uncountably infinite number of sets each containing a pair of socks.
g) An uncountably infinite number of sets each containing two blue mar-
bles and one red.

B. 2. Let U and V be non-empty sets. Suppose R is a relation in U × V with
domain T ⊆ U . Prove that there exists a function f : T → V such that
f ⊆ R.
3. Let U and V be non-empty sets. Prove that a function f : U → V maps U
onto V if and only if there exists some function h : V → U such that f ◦ h
is the identity function on V .
4. Let S be a non-empty set. Suppose U is a non-empty subset of the set
P(S) partially ordered by “⊆”. Prove that lub U = ∪x∈U x.

C. 5. If S is a set containing more than one element, show that there exists a
one-to-one function f : S → S such that f maps no point x in S to itself.
That is, f(x) ≠ x for all x in S.
6. Let S be a set of sets. Let M = {U ∈ P(S ) : X, Y ∈ U implies X ∩
Y ≠ ∅}. Show that M contains a maximal element T with respect to the
ordering “⊆”. That is, T ∈ M and, for any B ∈ S − T , T ∪ {B} ∉ M .
350 Section 31: Axiom of regularity

31 / Axiom of regularity
Abstract. In this section we state the Axiom of regularity and present
some of its equivalent forms. We prove that the Axiom of regularity is
equivalent to the statement “Every set has an ∈-minimal element”. We
also show that in the presence of the Axiom of regularity, no set can be
an element of itself. We define “well-founded sets ” and show that in the
presence of the Axiom of choice, the Axiom of regularity is equivalent to
the statement “Every set is well-founded”. The transitive closure of a set
is defined.

31.1 Equivalent forms of the Axiom of regularity.

The Axiom of regularity is the only ZFC-axiom we have not yet, explicitly,
invoked to justify any steps in the proofs of theorems seen up to now. We
restate this axiom:

Axiom of regularity: Every non-empty set S has an element x such
that x ∩ S = ∅.

To help us better understand the meaning of the Axiom of regularity (also
known as the Axiom of foundation) we investigate some of its equivalent
forms and consequences. In what follows, we will say that m is a minimal
element of the set S with respect to the order relation “<” if S is ordered by
“<” and S contains no element x such that x < m.

Theorem 31.1 The Axiom of regularity holds true if and only if every non-
empty set S contains a minimal element with respect to the membership
relation “∈”.
P roof:
(⇒)
What we are given: A non-empty set S. That the axiom of regularity holds
true in S. That is, there exists in S an element m such that m ∩ S = ∅.
What we are required to show: That S contains a minimal element with
respect to the membership relation “∈”.
Suppose the given m is not a minimal element of S. That is, suppose S
contains an element x such that x ∈ m. Then x ∈ m ∩ S, so m ∩ S ≠ ∅, contradicting
our hypothesis. Then this element m is minimal in S with respect to “∈”,
as claimed.
(⇐)
What we are given: A non-empty set S. That S contains a minimal element
m with respect to “∈”.
What we are required to show: That S contains some element m such that
m ∩ S = ∅.
Suppose that S does not contain an element m such that m ∩ S = ∅. That is,
for every m ∈ S, there exists x ∈ m ∩ S; then no element m in S is minimal
with respect to ∈. This contradicts our hypothesis. Hence, for every non-empty
set S there exists m ∈ S such that m ∩ S = ∅.
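
For hereditarily finite sets the condition m ∩ S = ∅ can be checked mechanically. The Python sketch below models such sets as nested frozensets, which is an assumption made only for illustration; the function name epsilon_minimal is likewise hypothetical.

```python
def epsilon_minimal(S):
    """Return the elements m of S with m ∩ S = ∅ (the ∈-minimal elements of S)."""
    return {m for m in S if not (m & S)}


# The first few von Neumann natural numbers as nested frozensets.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})

S = frozenset({one, two, three})
minimal = epsilon_minimal(S)      # one ∩ S = {zero} ∩ S = ∅, so one is ∈-minimal
```

A Python frozenset must be fully built before it can be placed inside another one, so it can never contain itself; these objects therefore behave like sets in a universe where regularity holds.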

Up to this point in our study of sets, we have not encountered a set x such
that x ∈ x. We thought it would be best to avoid such “creatures”, at least
until we better understand the difficulties that they may cause. We will now
see that the statement “No set can be an element of itself” is a consequence
of the Axiom of regularity.

Theorem 31.2 [Axiom of regularity] No set is an element of itself.

Proof:
Let x be a set and let S = {x}. Suppose x ∈ x. Then x ∈ x ∩ S, so
x ∩ S ≠ ∅. Since x is the only element of S, no element m of S satisfies
m ∩ S = ∅. This contradicts the Axiom of regularity. Then, for any set x, x ∉ x.

So the ZFC-set-theoretic universe contains no set which is an element of
itself. Recall that in the formal definition of “ordinal numbers”, we required
ordinal numbers to be strictly ∈-well-ordered. Now we see that defining ordi-
nals as being simply ∈-well-ordered would have been sufficient, in the sense
that the strictly ordered property would follow from “regularity”. Proceed-
ing as we did allows us to see that ordinals exist as sets even in the absence
of the Axiom of regularity.

Definition 31.3 We say that a class S is well-founded if S does not contain
an infinite descending chain of sets. That is, there does not exist an infinite
sequence {xn : n ∈ ω} such that · · · ∈ x4 ∈ x3 ∈ x2 ∈ x1 ∈ x0 .

The set-theoretic universe we have explored up to now was not assumed to
be a well-founded universe. Sets which are not well-founded were simply not
considered or raised as a subject for discussion. Attempting to prove that
non-well-founded sets exist, or do not exist, was more or less viewed as a
digression from the concepts we were studying at that time.

We will now show that in the presence of the Axiom of choice,1 the two state-
ments “Every set is well-founded ” and the Axiom of regularity are equiva-
lent. This means that in the absence of the Axiom of regularity, we would
have to accept that non-well-founded sets may exist and study what impact
the existence of such sets has in our set-theoretic universe.2

Theorem 31.4 [AC] The Axiom of regularity and the statement “Every set
is well-founded ” are equivalent statements.
P roof:
(⇒)
What we are given: The Axiom of regularity holds true.
What we are required to show: That all sets are well-founded.
Suppose there exists a set which contains an infinite descending chain of
sets S = {xn : n ∈ ω}. This means that an ∈-ordered chain such as
· · · ∈ x4 ∈ x3 ∈ x2 ∈ x1 ∈ x0 exists. By hypothesis, S must contain
some element m such that m ∩ S = ∅. This element m must be equal to xk
for some k ∈ ω. Since xk+1 ∈ m ∩ S, we have a contradiction. So non-well-
founded sets cannot exist in the presence of regularity.
(⇐)
What we are given: That every set is well-founded.
What we are required to show: That every non-empty set S contains an
element m such that m ∩ S = ∅.
Suppose there exists a non-empty set S such that for every m ∈ S, m ∩ S ≠
∅. Then there exists a relation R ⊂ S × S such that (m, x) ∈ R if and only
if x ∈ m ∩ S. Since m ∩ S ≠ ∅ for all m ∈ S, the domain of R is all of S.
Invoking the Axiom of choice, there exists a “choice function” f : S → S,
f ⊆ R, where, for each m ∈ S, f(m) ∈ m ∩ S. Fix an element x0 of S. We
recursively define a function g : ω0 → S as follows:

\[
\begin{aligned}
g(0) &= x_0 \\
g(1) &= x_1 = f(x_0) \in x_0 \cap S \\
g(2) &= x_2 = f(x_1) \in x_1 \cap S \\
&\;\;\vdots \\
g(n+1) &= x_{n+1} = f(x_n) \in x_n \cap S \\
&\;\;\vdots
\end{aligned}
\]

We have thus constructed a set {xn : n ∈ ω0 } where, for each n ∈ ω0 ,
xn+1 ∈ xn . The set {xn : n ∈ ω0 } is an infinite descending chain. Its existence
contradicts the hypothesis stating that S is a well-founded set, a set
in which no such chain can exist. The source of our contradiction is our
supposition that S does not contain a minimal element m. Hence, in presence
of the Axiom of choice, the Axiom of regularity follows from the statement
“Every set is well-founded”.

1 The theorem actually uses a weak form of the Axiom of choice, called the Axiom of
dependent choice.
2 It was shown in 1929 by von Neumann that if “ZF without regularity” is consistent,
then “ZF with regularity” is also consistent.

31.2 Transitive closure of a set

Before we discuss another characterization of the Axiom of regularity (in
Theorem 32.5 of the next chapter), we introduce what is known as the tran-
sitive closure of a set. Recall that a set S is a transitive set if whenever y ∈ S
and x ∈ y, then x ∈ S. Equivalently, if x ∈ S then x ⊂ S.

Definition 31.5 Let x be a set. The transitive closure of x is a set tx satisfying
the following three properties:

1) The set tx is a transitive set.
2) x ⊆ tx
3) tx is the ⊆-least transitive set satisfying properties 1 and 2.

Example − Consider the set S = {2}. We see that the set S is not transitive,
since 1 ∈ 2 = {0, 1}, 2 ∈ S, but 1 ∉ S. So S does not contain all the
elements required for it to be transitive. Starting with S, we will construct,
step-by-step, its transitive closure, tS . We have seen that element 1 is missing,
so let’s add it to S: Let S1 = {1, 2}. We see that S1 is not transitive
since 0 ∈ 1 = {0}, 1 ∈ S1 , but 0 ∉ S1 . We then add to S1 , the element 0:
Let S2 = {0, 1, 2}. We see that S2 is the natural number 3 known to be
transitive. Then, the transitive closure, tS , of S = {2} is the natural number
3 = {0, 1, 2}.
Completing a non-transitive set, S, to its transitive closure, tS , means to
add to S all the elements which belong to elements of the set.
The following theorem guarantees that every non-empty set, x, has a tran-
sitive closure, tx .

Theorem 31.6 Let x be a set. Then there exists a smallest transitive set, tx ,
which contains all elements of x. That is, every set, x, has a transitive closure,
tx .

P roof:
Let S denote the class of all sets. Let f : S → S be a function defined as:
f(u) = ∪{y ∈ S : y ∈ u}. Let x0 ∈ S . We recursively define the function
g : ω0 → S as follows:

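\[
\begin{aligned}
g(0) &= x_0 \in \mathcal{S} \\
g(1) &= x_1 = f(x_0) = \cup\{y \in \mathcal{S} : y \in x_0\} \\
g(2) &= x_2 = f(x_1) = \cup\{y \in \mathcal{S} : y \in x_1\} \\
&\;\;\vdots \\
g(n) &= x_n = f(x_{n-1}) = \cup\{y \in \mathcal{S} : y \in x_{n-1}\} \\
g(n+1) &= x_{n+1} = f(x_n) = \cup\{y \in \mathcal{S} : y \in x_n\} \\
&\;\;\vdots
\end{aligned}
\]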
Let
tx0 = ∪{xn : n ∈ ω0 } = x0 ∪ x1 ∪ x2 ∪ · · ·
Since x0 is a set, each xn is the union of a set of sets and so tx0 is itself a
set.
Claim #1 : That tx0 is a transitive set.
Proof of Claim #1: Suppose u ∈ tx0 and v ∈ u. We are required to
show that v ∈ tx0 . Since tx0 = ∪{xn : n ∈ ω0 }, u ∈ xk for some
k ∈ ω. Since xk+1 = f(xk ) = ∪{y ∈ S : y ∈ xk }, u ⊆ xk+1 . Hence,
v ∈ u ⊆ xk+1 ⊆ ∪{xn : n ∈ ω0 } = tx0 . So tx0 is a transitive set, as claimed.
Suppose now that s is some transitive set such that x0 ⊆ s.
Claim #2: That tx0 ⊆ s.
Proof of Claim #2: It suffices to show that xn ⊆ s for all n. We will show
this by induction.
Base case: That x0 ⊆ s is given.
Inductive hypothesis: Suppose xn ⊆ s. We are required to show that
xn+1 ⊆ s. If u ∈ xn+1 = f(xn ) = ∪{y ∈ S : y ∈ xn }, then u ∈ y, for
some y ∈ xn ⊆ s. Since s is transitive, u ∈ y ⊆ s implies u ∈ s. Then
xn+1 ⊆ s.
Hence, by mathematical induction, xn ⊆ s, for all n. Since tx0 = ∪{xn : n ∈
ω}, tx0 ⊆ s, as claimed.
We have thus constructed the smallest transitive set tx0 which contains x0 .
Then for any set x there exists a smallest transitive set tx which contains x.
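
The construction tx0 = x0 ∪ x1 ∪ x2 ∪ · · · can be carried out directly for hereditarily finite sets. The Python sketch below represents such sets as nested frozensets (an assumption for illustration only); the names big_union, transitive_closure and is_transitive are hypothetical. It recomputes the earlier example, in which the transitive closure of {2} is 3.

```python
def big_union(x):
    """f(u) = ∪{y : y ∈ u}: the set of all elements of elements of u."""
    return frozenset(z for y in x for z in y)


def transitive_closure(x):
    """t_x = x0 ∪ x1 ∪ x2 ∪ ..., where x0 = x and x_{n+1} = f(x_n)."""
    closure = frozenset(x)
    layer = frozenset(x)
    while layer:                   # for hereditarily finite x the layers become empty
        layer = big_union(layer)
        closure = closure | layer
    return closure


def is_transitive(s):
    """s is transitive when every element of s is also a subset of s."""
    return all(y <= s for y in s)


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})

S = frozenset({two})               # the set {2} from the example
tS = transitive_closure(S)
```

For infinite sets the union runs over all n ∈ ω0 and cannot be computed this way; the loop terminates here only because each layer consists of sets of strictly smaller depth.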

Concepts review:
1. State the Axiom of regularity.
2. What does it mean to say that a set S has a minimal element with
respect to ∈?
3. Which statement referring to a minimal element of sets is equivalent
to the Axiom of regularity?
4. What does it mean to say that a set is well-founded?
5. If we assume the Axiom of choice, which ZFC-axiom is equivalent
to the statement “Every set is well-founded”?
6. Given a non-empty set x, what is the transitive closure of x?
7. Given a non-empty set x, does x necessarily have a transitive clo-
sure?

EXERCISES

A. 1. Describe the transitive closure tx for each of the following sets x.

a) x = {1, 2}
b) x = {0, {2, 3}}
c) x = {0, 1, 2}
d) x = {{∅}}
2. Show that a set S is transitive if and only if for every x ∈ S, x ∩ S = x.
3. Show that x is transitive if and only if x is its own transitive closure tx.
356 Section 32: Cumulative hierarchy

32 / Cumulative hierarchy
Abstract. We show how to construct, incrementally, a class of sets whose
union, V , contains all sets. The class, V, is referred to as the “Von Neu-
mann’s universe of sets ”. The Axiom of regularity is used to show that
this class, V , indeed contains all sets. We then define the “cumulative hi-
erarchy” and the “rank of a set”.

32.1 Constructing the class of all sets in incremental stages.

The reader may be wondering whether the only purpose of the Axiom of
regularity is to exclude from the ZFC-universe those “sets which are ele-
ments of themselves”. That in itself constitutes a good enough reason, but
we will soon see that it is useful in other ways.
We will begin this section by constructing, in levels, a particular ordinal-
indexed class
{Vα : α ∈ O}
of sets, starting with the empty set V0 = ∅. This structured class, ordered
by inclusion “⊂”, is referred to as the Cumulative hierarchy of sets.1 The
union,
V = ∪{Vα : α ∈ O}
of all the Vα -sets in the cumulative hierarchy is referred to as the Von Neumann2
universe of sets or the Von Neumann hierarchy of sets.
The restrictions imposed by the Axiom of regularity on the type of sets
which belong to the ZFC-universe will guarantee that every set which be-
longs to the ZFC-universe of sets also belongs to V . That is, V accounts for
all sets whose existence is determined from the ZFC-axioms.
We have often used in this text the symbol, S , to denote the “class of all
sets” in the ZFC-universe. The class, V , of sets which we will now inves-
tigate will also denote the class of all sets; but the class V has a ∈-linear
structure, as we will soon see. It is constructed in stages, starting only with
the empty set, ∅ . The construction of the class, {Vα : α ∈ O}, is described
below.

1 The term “rank hierarchy of sets” is sometimes used.

2 John von Neumann (1903-1957) was a Hungarian and American mathematician, physi-
cist, computer scientist and engineer. He had perhaps the widest coverage of any math-
ematician of his time, integrating pure and applied sciences and making major contribu-
tions to many fields, including mathematics, physics, economics, computing, and statistics,
(Wikipedia).

Definition 32.1 Define the class function, f : S → S , as f(S) = P(S).

The elements of the class, {Vα : α ∈ O}, belong to the image of the class
function g(α) = Vα which is recursively defined as follows:

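\[
\begin{aligned}
g(0) &= V_0 = \varnothing = 0 \\
g(1) &= V_1 = f(V_0) = \mathcal{P}(0) = \{\varnothing\} = 1 \\
g(2) &= V_2 = f(V_1) = \mathcal{P}(1) = \{\{\varnothing\}, \varnothing\} = 2^1 = 2 \\
g(3) &= V_3 = f(V_2) = \mathcal{P}(2) = \{\{\{\varnothing\}, \varnothing\}, \{\{\varnothing\}\}, \{\varnothing\}, \varnothing\} \\
g(4) &= V_4 = f(V_3) = \mathcal{P}(V_3) \quad (2^4 = 16 \text{ elements}) \\
&\;\;\vdots \\
g(\alpha^+) &= V_{\alpha^+} = f(V_\alpha) = \mathcal{P}(V_\alpha) \\
&\;\;\vdots \\
g(\lambda) &= V_\lambda = \cup_{\alpha \in \lambda} V_\alpha \quad \text{if } \lambda \text{ is a limit ordinal} \\
g(\lambda + 1) &= V_{\lambda+1} = f(V_\lambda) = \mathcal{P}(V_\lambda) \\
&\;\;\vdots
\end{aligned}
\]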

The class {Vα : α ∈ O} is called the Cumulative hierarchy of sets. The union
of all the elements of the cumulative hierarchy of sets is denoted as:

V = ∪α∈O Vα

Note that, since O is a proper class, V need not be a set.

The idea behind the construction is quite simple: Beginning with V0 = ∅,
each Vα is the power set of its predecessor or, in the case where α is a limit
ordinal, is the union of all its predecessors. So every Vα is an element of its
immediate successor. For example, V3 ∈ P(V3 ) = V4 .
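
The finite stages of the hierarchy are small enough to compute. The Python sketch below builds V0 through V4 as nested frozensets, a representation assumed only for illustration; the function names power_set and V are hypothetical. It stops at V4 because V5 already has 2^16 = 65536 elements.

```python
from itertools import combinations


def power_set(s):
    """All subsets of the finite set s, each returned as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def V(n):
    """The n-th finite stage: V(0) = ∅ and V(n + 1) = P(V(n))."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage


sizes = [len(V(n)) for n in range(5)]      # cardinalities of V0, ..., V4
```

The limit stages, starting with Vω0 , are unions of infinitely many sets and lie outside what such a computation can reach.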
We verify that every single element, Vα of the cumulative hierarchy of sets
is indeed a set: Since each Vα is either the image of a set, the power set of
a set, or the union of a set of sets, then each Vα is a set (by the axioms of
replacement, union and power set). That is,

{Vα : α ∈ O}

is a class of sets. So every element of V = ∪α∈O Vα is a set. That is

V ⊆ S

For the following theorem, recall that a set S is transitive with respect to ∈
if x ∈ S ⇒ x ⊂ S, equivalently if, x ∈ y and y ∈ S, then x ∈ S.

Theorem 32.2 For every α ∈ O, Vα is a transitive set.


P roof:
It suffices to show that y ∈ x ∈ Vγ implies y ∈ Vγ , for all ordinals, γ. We
can prove this by a straightforward application of transfinite induction.
Let P (α) denote the statement “Vα is transitive”.
Since V0 = ∅, P (0) trivially holds true.
Suppose P (α) holds true. That is, suppose Vα is transitive.

y ∈ x ∈ Vα+ = P(Vα ) ⇒ y ∈ x ⊆ Vα
⇒ y ⊆ Vα (Since Vα is transitive.)
⇒ y ∈ P(Vα ) = Vα+
⇒ Vα+ is transitive
⇒ P (α+ ) holds true.

Suppose γ is a limit ordinal and that α ∈ γ implies P (α) holds true.

y ∈ x ∈ Vγ = ∪α∈γ Vα ⇒ y ∈ x ∈ Vβ for some transitive Vβ

⇒ y ∈ Vβ ⊆ Vγ
⇒ Vγ is transitive
⇒ P (γ) holds true.

By transfinite induction, every Vα is transitive.

Up to now, we have always represented the class of all sets which has evolved
from the ten ZFC-axioms by the symbol, S = {x : x is a set}. Since every
element of {Vα : α ∈ O} is a set, we can write, {Vα : α ∈ O} ⊆ S .
Furthermore,

x ∈ V = ∪α∈O Vα ⇒ x ∈ Vγ , for some γ ∈ O,

⇒ x ⊆ Vγ , Since Vγ is transitive.

⇒ x is a set.
⇒ x ∈ S.

Hence
V = ∪α∈O Vα ⊆ S

We now wonder whether S ⊆ V = ∪{Vα : α ∈ O}. That is, are all sets
accounted for in V ? We will prove that this is indeed the case. With this
objective in mind, we first establish the following lemma.

Lemma 32.3 The class {Vα : α ∈ O} is a strictly ∈-increasing (⊂-increasing)
chain of transitive sets. That is, if α ∈ β, then Vα ∈ Vβ . Hence, Vα ⊂ Vβ .

Proof:
We have proven above that Vα is transitive for every α ∈ O.
Let P (γ) denote the statement “α ∈ γ implies Vα ∈ Vγ ”.
Suppose P (β) holds true for all β ∈ γ. That is, if α ∈ β ∈ γ, then Vα ∈ Vβ .
Suppose φ ∈ γ. We are required to show that Vφ ∈ Vγ .
Case 1: If γ is a limit ordinal, then Vφ ∈ Vφ+ ⊂ ∪α∈γ Vα = Vγ .
Case 2: Suppose γ = ψ+ . If φ ∈ ψ, then Vφ ∈ Vψ , by the inductive hypothesis;
since Vψ is transitive, Vφ ⊆ Vψ and so Vφ ∈ P(Vψ ) = Vψ+ = Vγ .
If φ = ψ, then Vφ = Vψ ∈ P(Vψ ) = Vψ+ = Vγ ; hence, P (γ) holds true.
By transfinite induction, α ∈ γ implies Vα ∈ Vγ for all γ. Since Vγ is transitive,
Vα ∈ Vγ implies Vα ⊂ Vγ . This completes the proof of the lemma.
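
As a sanity check on the lemma, the chain property can be tested on the first few finite stages. This is a minimal sketch under the assumption that sets are modelled as nested Python frozensets; the names powerset and V are hypothetical helpers, and only the stages V0 to V4 are examined.

```python
from itertools import combinations

def powerset(s):
    """Power set of a frozenset, returned as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def V(n):
    """Finite stage V_n, built from V_0 = ∅ and V_{k+1} = P(V_k)."""
    stage = frozenset()
    for _ in range(n):
        stage = powerset(stage)
    return stage

stages = [V(n) for n in range(5)]

# For every m < n <= 4, test both V_m ∈ V_n and V_m ⊂ V_n (proper subset).
chain_ok = all(
    stages[m] in stages[n] and stages[m] < stages[n]
    for n in range(5)
    for m in range(n)
)
print(chain_ok)
```

Here the operator `<` on frozensets tests proper inclusion, so the check covers both the ∈-increasing and the ⊂-increasing readings of the lemma for these stages.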

Lemma 32.4 For any non-empty set B, if B ⊂ V = ∪{Vα : α ∈ O} then

B∈V.

Proof:
What we are given: That B is a non-empty set such that B ⊂ V = ∪{Vα :
α ∈ O}.
What we are required to show: That B ∈ V .
Let u ∈ B. Then, since B ⊂ V , u ∈ Vα for some α ∈ O.
Then the set {α ∈ O : u ∈ Vα } is non-empty. This ensures that the function
f : B → O defined as

f(u) = least{α ∈ O : u ∈ Vα }

is well-defined.
− Since B is a set, by the Axiom of replacement, f[B] is a set of ordinals.
Since f[B] is a set, β = ∪α∈f[B] α (a union of a set of sets) is a set. By
Theorems 27.11 and 27.12, β is an ordinal.
− Since α ⊆ β for all α ∈ f[B] and β is transitive, α ∈= β, for all α ∈ f[B].3
− By Lemma 32.3, for every α ∈ f[B], [α ∈= β] ⇒ [Vα ⊆ Vβ ].
Then, for every u ∈ B, u ∈ Vf(u) ⊆ Vβ . This implies that B ⊆ Vβ and so

B ∈ P(Vβ ) = Vβ+ ⊂ V

This concludes the proof of the lemma.

3 Recall that “∈= ” is read as “is an element of, or is equal to”.


Theorem 32.5 [Axiom of regularity] For every set x, x ∈ V = ∪α∈O Vα .

That is, S ⊆ V = ∪α∈O Vα .
Proof:
What we are given: That x is a set and that V = ∪α∈O Vα .
What we are required to show: That x ∈ V . To prove this, we will invoke
the Axiom of regularity.
If x = ∅, then x ∈ V , and we are done. We then suppose x ≠ ∅.
Claim: The transitive closure, tx , of the set x, is a subset of V .
Proof of claim: Suppose tx ⊄ V . That is, suppose there is an element u such
that u ∈ tx and u ∉ V . Then the set U = {u ∈ tx : u ∉ V } is non-empty.
Since U is non-empty, by the Axiom of regularity, U contains a minimal
element m ∈ U . That is, m ∈ U and m ∩ U = ∅. Note that m ∉ V . If m = ∅
then m ∈ V , a contradiction; so m is a non-empty set. Suppose y ∈ m. Since
tx is transitive and m ∈ tx , then y ∈ tx . But since m ∩ U = ∅, y cannot
belong to U . Then y ∈ V . We conclude that m ⊂ tx ∩ V . By the previous
lemma, [m ⊂ V ] ⇒ [m ∈ V ]. This contradicts the fact that m ∈ U . The
source of this contradiction is our supposition that U = {u ∈ tx : u ∉ V } is
non-empty. Then U = {u ∈ tx : u ∉ V } is empty and so tx ⊂ V , as claimed.
By the previous lemma, tx ⊂ V ⇒ tx ∈ V . Then tx ∈ Vα for some α.

x ⊆ tx ∈ Vα ⇒ x ⊆ tx ⊆ Vα (Since Vα is a transitive set)

⇒ x ⊆ Vα
⇒ x ∈ P(Vα ) = Vα+ ⊂ V

So x ∈ V , as required. So S ⊆ V .

Since S ⊆ V and every element of V is a set, then the class V = ∪α∈O Vα

equipped with its unique ∈-linear structure represents the class S that
evolves from the ten ZFC-axioms (nine ZF-axioms plus Choice) listed in
Chapter 1. It is interesting to note that the class, V = ∪α∈O Vα , contains
all elements of S only in the presence of the Axiom of regularity. That is,
without the restrictions that the Axiom of regularity imposes on sets, some
sets in S would not be accounted for in V .
We now show that if S ⊆ V , then the Axiom of regularity holds true on
S . Hence, if S = V , then the Axiom of regularity holds true.

Theorem 32.6 Let V = ∪α∈O Vα be the class of sets constructed as described

above and S denote the class of all sets. If S ⊆ V , then every set in S has
a ∈-minimal element. That is, the Axiom of regularity holds true.

Proof:
What we are given: That the class V contains all sets.
What we are required to show: That every non-empty set x contains a ∈-
minimal element.
Suppose x is a non-empty set. By hypothesis, x belongs to V = ∪α∈O Vα .
Then x ∈ Vα for some α. Since Vα is transitive, x ⊂ Vα ⊂ V . We can then
define a function f : x → O as f(u) = least{α ∈ O : u ⊂ Vα }. Since x is
a non-empty set, by the Axiom of replacement, f[x] is a non-empty set of
ordinals and so contains a least element, say φ. Since φ is in the image of x
under f, there exists an element m in the domain x of f such that f(m) = φ.
Then φ is the least ordinal such that m ⊂ Vφ ; equivalently, it is the least
ordinal such that m ∈ Vφ+ . Then m ∉ Vφ .
We claim that m is an ∈-minimal element of x: Suppose not. Suppose
z ∈ m ∩ x. Then since z ∈ m ⊂ Vφ implies z ∈ Vφ (so that z ⊂ Vδ for some
δ ∈ φ), we have f(z) ∈ f(m) = φ.
This contradicts the fact that φ is minimal in f[x]. Then m is an ∈-minimal
element of x as claimed.
Then every non-empty set has a minimal element with respect to “∈”.
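
The proof picks an element of x on which the function f takes its least value. For hereditarily finite sets this selection can be sketched in Python. The sketch assumes the recursive identity f(u) = sup{f(v) + 1 : v ∈ u}, which is not stated in the text but agrees with least{n : u ⊂ Vn} on finite sets; the names f_level and epsilon_minimal are hypothetical.

```python
def f_level(u):
    """Least n with u ⊆ V_n, computed as sup{f_level(v) + 1 : v in u}.

    Assumption: this recursive form is used in place of the definition
    f(u) = least{α : u ⊆ V_α}; the two agree for hereditarily finite sets.
    """
    return max((f_level(v) + 1 for v in u), default=0)

def epsilon_minimal(x):
    """Return an element m of the non-empty set x with f_level(m) least."""
    return min(x, key=f_level)

empty = frozenset()
one = frozenset({empty})            # {∅}
two = frozenset({empty, one})       # {∅, {∅}}
x = frozenset({one, two, frozenset({one})})

m = epsilon_minimal(x)
print(m == one, m & x == frozenset())
```

In this example m = {∅} is the unique element of least level, and m ∩ x is empty because ∅ is not an element of x, as the argument above predicts.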

Combining Theorems 32.5 and 32.6, we have that,

Given the axiom of regularity on S , S ⊆ V . On the other hand,

if we are given that S ⊆ V , then the axiom of regularity holds
true on S .

32.2 Rank of a set.

The theorem which associates the class of all sets, S , and the class,
V = ∪α∈O Vα , turns out to be an important result, as we shall soon see.
Suppose U is a set. Then U ∈ S . Since S = V = ∪α∈O Vα (by 32.5) then
U ∈ Vα for some α. Since O is ∈-well-ordered, there will be a smallest ordi-
nal α such that U ⊆ Vα . This provides a new way of characterizing the set U .

Definition 32.7 Given any set U , we will define the “rank of the set U ”,
denoted as rank(U ), as follows:

rank(U ) = least{α ∈ O : U ⊆ Vα }

If the rank, rank(U ), of a set U is known to be κ, then Vκ is the smallest

“container” in {Vα : α ∈ O} that can contain U . So the rank of a set may
provide some idea of its size, but also of its content.

First note that rank : V → O mapping a set, U , to a unique ordinal number,

rank(U ), is a well-defined function.
Example − We list the sets V0 to V4 to help us determine the rank of a few
simple sets.

g(0) = V0 = ∅ = 0
g(1) = V1 = f(V0 ) = P(0) = {∅} = 1
g(2) = V2 = f(V1 ) = P(1) = {{∅}, ∅} = 2^1 = 2
g(3) = V3 = f(V2 ) = P(2) = { {{∅}, ∅}, {{∅}}, {∅}, ∅ }
g(4) = V4 = f(V3 ) = P(V3 ) (2^4 = 16 elements)

− Suppose A = 3. Since 3 = {0, 1, 2} = {∅, {∅}, {{∅}, ∅}} ⊂ V3 and
3 ⊄ V2 , then rank(A) = 3. (Could it be that the rank of any ordinal,
γ, is γ? We will soon see!)
− Suppose B = {{{∅}}}. We see that {{∅}} ∈ V3 so B ⊆ V3 , and
{{∅}} ∉ V2 so B ⊄ V2 ; hence, rank(B) = 3.
− Suppose C = { {{{∅}}}, ∅}. We see that {{{∅}}} and ∅ are both
elements of P(V3 ) = V4 ; hence, C ⊂ V4 . Since {{{∅}}} ∉ V3 , C ⊄ V3 ;
hence, rank(C) = 4.
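
The three ranks found above can be recomputed mechanically from Definition 32.7. The sketch below is only an illustration: it models sets as nested frozensets, searches the finite stages V0, V1, ... in order, and stops at a hypothetical cut-off max_stage, so it applies only to hereditarily finite sets of small rank. The names powerset and rank are helpers chosen here.

```python
from itertools import combinations

def powerset(s):
    """Power set of a frozenset, returned as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def rank(u, max_stage=4):
    """Least n <= max_stage with u ⊆ V_n, following Definition 32.7."""
    stage = frozenset()                 # V_0
    for n in range(max_stage + 1):
        if u <= stage:
            return n
        stage = powerset(stage)         # V_{n+1}
    raise ValueError("rank exceeds max_stage")

empty = frozenset()
one = frozenset({empty})                # 1 = {∅}
two = frozenset({empty, one})           # 2 = {∅, {∅}}
three = frozenset({empty, one, two})    # 3 = {0, 1, 2}

A = three
B = frozenset({frozenset({one})})       # {{{∅}}}
C = frozenset({B, empty})               # {{{{∅}}}, ∅}

print(rank(A), rank(B), rank(C))
```

The printed values are 3, 3 and 4, in agreement with the hand computations for A, B and C.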

The above definition allows us to subdivide elements in S into sub-

categories based on their rank. But, as one quickly finds out, in practice,
determining the rank of a set can be tricky business. The following theorem
will provide a few helpful tools to do so.

Theorem 32.8 Let V = ∪{Vα : α ∈ O}.

a) The rank of the empty set ∅ is zero.

b) If U ∈ Vβ , then rank(U ) < β.
c) For all sets U , U ∉ Vrank(U ) .
d) If U is a set and rank(U ) < β, then U ∈ Vβ .
e) For any ordinal α, Vα is precisely the set of all sets U whose rank is less
than α.
f) If U and D are sets such that U ∈ D, then rank(U ) < rank(D).
g) If γ is an ordinal, then rank(γ) = γ.

Proof:
a) Note that if U = ∅, then rank(U ) is the least ordinal number α such
that ∅ ⊆ Vα , namely 0. So rank(∅) = 0.

b) What we are given: U ∈ Vβ .

What we are required to show: rank(U ) < β.
To show this, we consider the cases whether β is a successor ordinal or
β is a limit ordinal.
Case 1: Suppose β = α + 1 for some α. Then Vβ = Vα+1 = P(Vα ).
Suppose U ∈ Vβ = Vα+1 = P(Vα ). Then U ⊆ Vα . But rank(U ) is the
least ordinal κ such that U ⊆ Vκ . Then rank(U ) ≤ α < α + 1 = β.
Case 2: Suppose β is a limit ordinal. Since U ∈ Vβ = ∪α∈β Vα , then
U ∈ Vα for some α ∈ β. Since Vα is transitive U ⊆ Vα so again,
rank(U ) ≤ α < β, as required.
Then if U ∈ Vβ , then rank(U ) < β.
c) If U ∈ Vrank(U ), then, by part (b), rank(U ) < rank(U ), a contradiction.
So U ∉ Vrank(U ) .
d) What we are given: That U is a set and rank(U ) < β.
What we are required to show: U ∈ Vβ .
See that if rank(U ) < β then U ⊆ Vα , for some α where α + 1 ≤ β.
Then U ∈ Vα+1 ⊆ Vβ . Then U ∈ Vβ , as required.
e) We are required to show that for any α,

Vα = {U ∈ S : rank(U ) < α}
Suppose D ∈ Vα . Then, by part (b), rank(D) < α. So D ∈ {U ∈
S : rank(U ) < α}. Suppose on the other hand that D ∈ {U ∈ S :
rank(U ) < α}, then rank(D) < α. By part (d), D ∈ Vα . So for any
ordinal α, Vα is precisely the set of all sets U whose rank is less than α.
f) What we are given: U and D are sets such that U ∈ D.
What we are required to show: rank(U ) < rank(D).
Recall that rank(D) = least{α : D ⊆ Vα } (and if α < rank(D),
D ⊈ Vα ).
Then D ⊆ Vrank(D) . We are given that U ∈ D. Then U ∈ Vrank(D) .
By part b), rank(U ) < rank(D). As required.
g) What we are given: γ is an ordinal.
What we are required to show: That rank(γ) = γ.
Claim: rank(γ) ≤ γ for all ordinals γ.
Suppose γ < rank(γ).
γ < rank(γ) ⇒ γ < β ≤ rank(γ) For some ordinal β.
⇒ γ ∈ β ∈= rank(γ)
⇒ γ ∈ β ⊆ Vβ ⊆ Vrank(γ) By part (b) of Lemma 32.3.
⇒ γ ∈ Vrank(γ) Contradicting part (b).

The source of the contradiction is our supposition that γ < rank(γ).

Then rank(γ) ≤ γ, as claimed.

We now show that rank(γ) = γ for all ordinals γ. We prove this by
transfinite mathematical induction. Suppose rank(α) = α for all
ordinals α ∈ γ. It suffices to show that rank(γ) = γ. Since we have
shown that rank(γ) ≤ γ, it suffices to show that rank(γ) ≮ γ.
Suppose not. That is, suppose rank(γ) = β < γ. Then, by the inductive
hypothesis, rank(β) = β = rank(γ). Since, β < γ, β ∈ γ and so, by
part (f), rank(β) < rank(γ), contradicting rank(β) = rank(γ). Hence,
rank(γ) < γ is impossible. Then rank(γ) ≮ γ and rank(γ) ≤ γ implies
rank(γ) = γ. By transfinite induction, rank(γ) = γ for all ordinals γ.
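
Several parts of Theorem 32.8 can be spot-checked on the finite stage V4. The following Python sketch is an illustration under stated assumptions: sets are nested frozensets, rank is computed directly from Definition 32.7 but only against the stages V0 to V4, and the names STAGES, rank and ordinal are hypothetical helpers. It is a finite check, not a substitute for the proof.

```python
from itertools import combinations

def powerset(s):
    """Power set of a frozenset, returned as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

# STAGES[n] is V_n for n = 0, ..., 4.
STAGES = [frozenset()]
for _ in range(4):
    STAGES.append(powerset(STAGES[-1]))

def rank(u):
    """Least n <= 4 with u ⊆ V_n (Definition 32.7 on the finite stages)."""
    for n, stage in enumerate(STAGES):
        if u <= stage:
            return n
    raise ValueError("rank exceeds 4")

def ordinal(n):
    """Von Neumann ordinal n, built by iterating x -> x ∪ {x} from ∅."""
    o = frozenset()
    for _ in range(n):
        o = o | frozenset({o})
    return o

V4 = STAGES[4]
part_c = all(u not in STAGES[rank(u)] for u in V4)
part_e = all(
    STAGES[n] == frozenset(u for u in V4 if rank(u) < n) for n in range(5)
)
part_f = all(rank(u) < rank(d) for d in V4 for u in d)
part_g = all(rank(ordinal(n)) == n for n in range(5))
print(part_c, part_e, part_f, part_g)
```

Part (e) is tested inside V4 only, that is, each Vn with n ≤ 4 is compared with the elements of V4 whose rank is below n.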

Remark : If α is any ordinal then, by part (g) of Theorem 32.8, rank(α) = α.

Therefore, for any ordinal β ∈ α, α ⊆ Vα and α ⊄ Vβ . This means that the
set of all natural numbers, ω0 , is a subset of Vω0 .
Furthermore, we mentioned earlier that R ⊆ P^7 (ω0 ) ⊆ Vω0 +7 (see page
168); then we can safely say that R ⊆ Vω0 +ω0 . Hence, even for relatively
small ordinals, α, Vα can contain many of the sets we really care about in
mathematics. In fact, as we shall soon see, many of the ZFC-axioms hold
true in some Vα where α is relatively small.

32.3 How the ZFC-axioms hold true in V .

In Theorem 32.5, we have shown that S = V = ∪α∈O Vα (where S denotes
the class of all sets in the ZFC-universe). Since all of the ZFC-axioms hold
true on S , then they must hold true in V = ∪α∈O Vα . In particular, we
have seen that if we exclude the Axiom of regularity from the list of the
ZFC-axioms, then V may be a proper subclass of S .

In what follows, we will illustrate how eight of the ten ZFC-axioms are satisfied
either on V or simply on certain subsets, Vα , of V . It will be good
practice (even if it can be a bit tricky at times). We will see how
knowing the rank of a set can be useful in proving that the ZFC-set-axioms
hold true in V .

Axiom of regularity. Let x ∈ V = ∪α∈O Vα . Then x ∈ Vγ for some γ ∈ O.

Since Vγ is a transitive set, then x ⊂ Vγ . So x is a set. By the Axiom of
regularity, since x ∈ S then x contains a minimal element, say m. We then
have m ∈ x ⊂ Vγ . Since Vγ is a transitive set, m ⊂ Vγ which implies m is a
set. So the Axiom of regularity holds true on V .
Axiom of pair. Recall that the Axiom of pair states that if a and b are
sets, then {a, b} is also a set. To show that some set K satisfies this axiom,
we must prove that a, b ∈ K ⇒ {a, b} ∈ K. In the following proposition, we
show that, V satisfies the Axiom of pair.

Proposition 32.9 The class of sets V = ∪{Vα : α ∈ O} satisfies the property

described by the Axiom of pair.
Proof:
What we are given: The elements a, b ∈ V . There is an ordinal β so that a
and b are in Vβ .4
What we are required to show: That {a, b} ∈ V .
We consider two cases.
Case 1. Suppose β is a successor ordinal. Then there is an ordinal α such
that β = α + 1. Then

a, b ∈ Vβ = Vα+1 = P(Vα )

Then a ⊂ Vα and b ⊂ Vα . Since Vα ⊂ Vβ (since α ∈ β and by Lemma
32.3), then a ⊂ Vβ and b ⊂ Vβ . So a, b ∈ Vβ+1 . Then {a, b} ∈ Vβ+2 . Then
{a, b} ∈ V .
Case 2. Suppose β is a limit ordinal. Then a, b ∈ Vβ = ∪α∈β Vα .
Then there exists some γ ∈ β such that a, b ∈ Vγ . So {a, b} ⊂ Vγ . Then
{a, b} ∈ Vγ+1 ∈ {Vα : α ∈ O}. Then {a, b} ∈ V .

Axiom of union. The Axiom of union states that if A is a set of sets, then
∪{C : C ∈ A } is a set.

Proposition 32.10 If U ∈ V , then ∪{x : x ∈ U } ∈ V .

Proof:
What we are given: That U ∈ V . Then there is β such that U ∈ Vβ .
What we are required to show: That ∪{x : x ∈ U } ∈ V .
Since U ∈ Vβ , then, by property (b) in Theorem 32.8, rank(U ) < β.
Let y ∈ ∪{x : x ∈ U }. Then y ∈ x for some x ∈ U . Since y ∈ x ∈ U , then

rank(y) < rank(x) < rank(U ) (by part f) of Theorem 32.8 above)

By part (d) above,

[rank(y) < rank(U )] and [rank(U ) < β] ⇒ [y ∈ Vrank(U ) ⊂ Vβ ]

So y ∈ Vβ .

4 There exist ordinals γ and β such that a ∈ Vγ and b ∈ Vβ where, say, γ ∈= β. Since
γ ∈ β ⇒ Vγ ⊂ Vβ (by 32.3), a and b are both in Vβ .

We deduce that
∪{x : x ∈ U } ⊆ Vrank(U ) ⊂ Vβ
This means that ∪{x : x ∈ U } ∈ P(Vrank(U ) ) ⊆ Vβ .
It follows that ∪{x : x ∈ U } ∈ Vβ , as required.
So V satisfies the property described in the Axiom of union.

Axiom of power set. To show that V satisfies the property of power set,
we must show that if U ∈ V , then P(U ) ∈ V .

Proposition 32.11 If U is an element of V , then P(U ) ∈ V .

Proof:
What we are given: That U ∈ V .
What we are required to show: That P(U ) ∈ V .
By definition, U ⊆ Vrank(U ) . Then U ∈ P(Vrank(U ) ) = Vrank(U )+1 .
Suppose A ∈ P(U ). Then A ⊆ U ⊆ Vrank(U ) . So,

A ∈ P(Vrank(U )) = Vrank(U )+1

Then P(U ) ⊆ Vrank(U )+1 , and so P(U ) ∈ P(Vrank(U )+1 ) = Vrank(U )+2 .

So U ∈ V implies P(U ) ∈ V .
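
The closure properties in Propositions 32.9 to 32.11 can be tried out on the finite stages. The Python sketch below is an illustration only, with sets modelled as nested frozensets and with the hypothetical helper names powerset, V and big_union. It checks three instances: pairs of elements of V3 land in V4, unions of elements of V4 stay in V4, and power sets of elements of V3 land in V4.

```python
from itertools import combinations

def powerset(s):
    """Power set of a frozenset, returned as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def V(n):
    """Finite stage V_n, built from V_0 = ∅ and V_{k+1} = P(V_k)."""
    stage = frozenset()
    for _ in range(n):
        stage = powerset(stage)
    return stage

def big_union(u):
    """The union ∪{x : x ∈ u} of a set of sets."""
    return frozenset().union(*u)

V3, V4 = V(3), V(4)

# Pair: a, b ∈ V_3 gives {a, b} ⊆ V_3, hence {a, b} ∈ V_4.
pair_ok = all(frozenset({a, b}) in V4 for a in V3 for b in V3)
# Union: U ∈ V_4 gives ∪U ∈ V_4.
union_ok = all(big_union(u) in V4 for u in V4)
# Power set: U ∈ V_3 has rank at most 2, so P(U) ∈ V_{rank(U)+2} ⊆ V_4.
power_ok = all(powerset(u) in V4 for u in V3)
print(pair_ok, union_ok, power_ok)
```

The stage indices 3 and 4 are arbitrary small choices that keep the search space tiny; the propositions themselves cover every ordinal.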

Axiom of infinity. The Axiom of infinity states that there exists a non-
empty set A (called an inductive set) that satisfies the condition: (x ∈ A) ⇒
(x ∪ {x} ∈ A). The set, ω0 = {0, 1, 2, 3, . . .}, was defined to be the smallest
inductive set. To show that K satisfies the Axiom of infinity, we need to
show that ω0 ∈ K.

Proposition 32.12 Let α be an ordinal number strictly larger than ω0 . Then

ω0 ∈ Vα ⊂ V .

Proof:
What we are given: An ordinal number, α, strictly larger than ω0 .
What we are required to show: That ω0 ∈ Vα .
First note that, for each n ∈ ω0 , Vn is finite. This is easily verified by a proof
by mathematical induction on ω0 .
By definition, Vω0 = ∪{Vn : n ∈ ω0 }. Then ω0 ∉ Vω0 (otherwise,
ω0 ∈ Vn for some n and, since Vn is transitive, the infinite set ω0 would be a
subset of the finite set Vn ). Since ω0 is an ordinal then, by Theorem 32.8,
rank(ω0 ) = ω0 . This means that the least ordinal α such that ω0 ⊆ Vα is ω0 .
Then ω0 ⊆ Vω0 . This implies ω0 ∈ P(Vω0 ) = Vω0 +1 ⊆ Vγ , for all γ > ω0 .
We conclude that ω0 ∈ Vα ⊂ V for all α such that α > ω0 , as required.

Axiom of subsets. The Axiom of subsets states that if S is a set and φ is

a formula expressed in the language of set theory, then {x ∈ S : φ(x) holds
true} is a set.

Proposition 32.13 The Axiom of subsets holds true in V .

Proof:
What we are given: Let S ∈ V and α be an ordinal such that S ∈ Vα . Let
M = {x ∈ S : φ(x)} ⊆ S.
What we are required to show: That M ∈ V . It suffices to show that M ∈ Vα .
Since S ∈ Vα , then rank(S) < α (by Theorem 32.8(b)).
Since M = {x ∈ S : φ(x)} ⊆ S, then rank(M ) ≤ rank(S) < α.
Since rank(M ) < α, we have rank(M ) + 1 ∈= α and so, by Lemma 32.3,
Vrank(M ) + 1 ⊆ Vα .
Then M ⊆ Vrank(M ) ⇒ M ∈ P(Vrank(M ) ) = Vrank(M ) + 1 ⇒ M ∈ Vα ⊂ V , as required.

Axiom of replacement. The Axiom of replacement can be expressed as

follows: Let w be a set. If φ is a formula so that for every x ∈ w, [φ(x, z)
and φ(x, y)] ⇒ [y = z], then the class {u : φ(x, u) for some x ∈ w} is a set.
We will restrict ourselves to the simpler case of only showing that the Axiom
of replacement holds true in Vω0 .

Proposition 32.14 The set Vω0 satisfies the property described by the Ax-
iom of replacement.

Proof:
What we are given: That z ∈ Vω0 and f : z → Vω0 is a function mapping z
into Vω0 .
What we are required to show: That f[z] ∈ Vω0 .
Recall that Vω0 = ∪{Vn : n ∈ ω0 }. Since z ∈ Vω0 , then z ∈ Vm for some
natural number m. Since Vm is transitive, z ⊆ Vm ; since Vm is finite, z is a
finite set. Then f[z] is also a finite set.
Suppose f[z] = {a0 , a1 , a2 , . . . , ak }. We claim f[z] ∈ Vω0 .

Recall that rank(ai ) = least{α ∈ O : ai ⊆ Vα }. Since each ai belongs to Vω0 ,
each rank(ai ) is a natural number (by Theorem 32.8(b)).

Let
rank(aq ) + 1 = max{rank(ai ) + 1 : i = 0 to k}
Then, for i = 0 to k, rank(ai ) + 1 ∈= rank(aq ) + 1.

Since
ai ⊆ Vrank(ai ) ⊂ Vrank(ai ) + 1 ⊆ Vrank(aq ) + 1
so for i = 0 to k,

ai ∈ P(Vrank(ai ) + 1 ) ⊆ P(Vrank(aq ) + 1 ) = Vrank(aq ) + 2

Then f[z] = {a0 , a1 , a2 , . . . , ak } ⊆ Vrank(aq ) + 2 , and so
f[z] ∈ P(Vrank(aq ) + 2 ) = Vrank(aq ) + 3 ⊂ Vω0 , as required.

Axiom of choice. The Axiom of choice states that if A is a set of non-empty sets, then
there exists a function f : A → ∪{x : x ∈ A} which maps each element x of
A to an element yx ∈ x.
In the following proposition we show that, if the axiom of choice holds true
on V , then, whenever γ is a limit ordinal, the Axiom of choice also holds
true on Vγ .

Proposition 32.15 Let γ be a limit ordinal. Then, if the axiom of choice

holds true on V , the set Vγ satisfies the property described by the Axiom of
choice.
Proof:
What we are given: That γ is a limit ordinal and that the Axiom of choice
holds true on V .
What we are required to show: That the Axiom of choice holds true on Vγ .
Let U ∈ Vγ = ∪{Vα : α ∈ γ} where U ≠ ∅ and ∅ ∉ U . We are required to
show that there exists a function, f ∈ Vγ , f : U → Vγ , such that, for each

x ∈ U , f(x) ∈ x.
Since γ is a limit ordinal and U ∈ Vγ = ∪{Vα : α ∈ γ}, then U ∈ Vβ for
some ordinal β ∈ γ. Since Vβ is transitive,

U ∈ Vβ ⇒ U ⊆ Vβ
⇒ x ∈ Vβ for each x ∈ U
⇒ x ⊆ Vβ for each x ∈ U
⇒ ∪{x : x ∈ U } ⊆ Vβ

Now U and ∪{x : x ∈ U } ⊆ Vβ are also elements of V . Since the Ax-

iom of choice holds true on V , then there exists a function, f ∈ V ,
f : U → ∪{x : x ∈ U } mapping each element x of U to an element
yx ∈ x ∈ Vβ .
See that f[U ] = {yx : x ∈ U } ⊆ Vβ ; hence, f[U ] ∈ P(Vβ ) = Vβ+1 ∈ Vγ .
We are now required to show that f ∈ Vγ . 5
We have seen that if x ∈ U , then, since each Vα is transitive,

yx = f(x) ∈ Vβ ∈ Vβ+1 ∈ Vγ

Given (x, yx) ∈ f,

x, yx ∈ Vβ ⇒ {x, yx} ⊆ Vβ
⇒ {x, yx} ∈ P(Vβ )
⇒ {x, yx} ∈ Vβ+1

For precisely the same reason, x ∈ Vβ implies {x} ∈ Vβ+1 .

For the same reasons, for each x ∈ U ,

{x}, {x, yx} ∈ Vβ+1 ⇒ (x, yx) = {{x}, {x, yx}} ∈ P(Vβ+1 ) = Vβ+2

Then,

f = { {{x}, {x, yx}} : x ∈ U } ⊆ Vβ+2
⇒ f ∈ P(Vβ+2 ) = Vβ+3
⇒ f ∈ Vγ (Since γ is a limit ordinal, β + 3 ∈ γ.)

So f ∈ Vγ as claimed.
We have shown that there exists a function f ∈ Vγ mapping U onto a set
f[U ] ∈ Vγ such that f(x) ∈ x for each x ∈ U , as required.

5 To show that f ∈ Vγ , we must show that f = {(x, yx ) : x ∈ U } ⊆ Vγ . Recall that the
ordered pair (x, yx ) is defined as {{x}, {x, yx }} (See Kuratowski Definition 4.1 of ordered
pair). So, to show that each (x, yx ) belongs to Vγ , it suffices to show that, for each x ∈ U ,
(x, yx ) = {{x}, {x, yx }} ∈ Vγ .

32.4 Generalized continuum hypothesis in Von Neumann’s universe.

Note that neither the Continuum hypothesis (CH), nor the Generalized con-
tinuum hypothesis (GCH), plays a role in the construction of V . Similarly,
the proof showing that S = V does not invoke CH nor GCH. That is,
S = V neither assumes nor negates CH and GCH.
We pause to reflect a bit on this matter. Recall that the GCH declares that
for any infinite set S, there does not exist a set T such that |S| < |T | <
|P(S)|. Equivalently,

GCH ⇔ [ℵα+1 = 2^ℵα , ∀ α ∈ O]

Suppose we don’t assume GCH. Then it may be the case, for example, that

ℵ0 < ℵ1 < ℵ2 < ℵ3 = 2^ℵ0

where the cardinal number, ℵ1 , is the least ordinal larger than ℵ0 , such that
ℵ1 ≁e ℵ0 , the cardinal number, ℵ2 , is the least ordinal larger than ℵ1 such
that ℵ2 ≁e ℵ1 , and the cardinal number, ℵ3 , is the least ordinal larger than
ℵ2 such that ℵ3 ≁e ℵ2 . Will these “extra” sets, ℵ1 , ℵ2 , for example, be
present in some Vα ? We have

ω0 < ω1 < ω2 < ω3 = 2^ω0

where ω0 , ω1 , ω2 and ω3 are the corresponding initial ordinals. We have
shown above that rank(ω1 ) = ω1 . That is, the least α such that ω1 ⊆ Vα is
ω1 . In this case, ω1 ⊆ Vω1 , while ω1 ∉ Vω0 . This means ω1 ∈ P(Vω1 ) = Vω1 +1
and so ω1 appears in Vω1 +1 ⊂ V = ∪α∈O Vα . Similarly, ω2 ∈ Vω2 +1 ⊂ V . So
whether we assume GCH or not, all sets in S are accounted for in V . So
the equality, S = V , is not sensitive to assumptions made on CH or GCH.

Concepts review:
1. How is the class of sets V = ∪α∈O Vα constructed?
2. The class {Vα : α ∈ O} has a strict linear ordering with respect to
which order relation?
3. The statement “The class {Vα : α ∈ O} contains all sets” is equiv-
alent to which ZFC-axiom?
4. What is the rank of a set U ?

5. If α is an ordinal, what is its rank?

EXERCISES

A. 1. Show that the Vα ’s in the expression V = ∪α∈O Vα are sets.

B. 2. Show that if γ ∈ O, then ∪α∈γ α = sup {α : α ∈ γ}.
3. Write out all the elements of each of the sets V0 to V5 .
4. Describe Vω and Vω+ .
5. What is the least ordinal α such that 7 ⊂ Vα ?
C. 6. If x is a set, then we define the rank of x, rank(x), as follows:

rank(x) = least{α ∈ O : x ⊆ Vα }

a) What is the rank of the set N?

b) If the sets A and B have the same rank, what is the rank of A ∪ B?
c) Can you find a set S such that the rank of ∪A∈S A equals the rank of
S?
d) Can you find a set S such that the rank of ∪A∈S A is an element of the
rank of S?
7. Show that the Axiom of infinity cannot be proven from the other ZFC-
axioms.
372 Section 33: Martin’s axiom

33 / Martin’s axiom
Abstract. In this section we define the “countable chain condition” on a
partially ordered set (P, ≤). We then define those subsets of (P, ≤) called
“filters”. We introduce an axiom which is independent of ZFC called Mar-
tin’s axiom, of particular interest when ¬CH is assumed. We then list a
few consequences of this axiom. NOTE: This chapter is presented as a
matter of interest and is mostly intended for readers well versed in topological
spaces and more advanced topics in real analysis.

33.1 Introduction.
At this point we have discussed nine basic set theory axioms we called the
ZF-axioms. To these nine axioms we have adjoined a tenth axiom called
the Axiom of choice. When viewed together, these axioms are referred to
as the ZFC-axioms or ZF+Choice. Most mathematicians view these ten ax-
ioms as constituting a solid and reliable foundation of mathematics, at least
for the time being. A few mathematicians or logicians, as well as certain
philosophers of mathematics, continue to investigate these axioms in healthy
attempts to identify what they consider to be some shortcomings or weak
points of the theory, occasionally questioning the validity of some of these.
And this is fine, since no one can prove that the ZFC-axioms will not, at
some point in time, lead to some contradiction. Many mathematicians inves-
tigate other axioms which are independent of these, as being useful tools to
prove certain mathematical statements which push or cross the mathemat-
ical boundaries established by ZFC. Examples of these are the Continuum
hypothesis axioms: CH, GCH, ¬CH and ¬GCH. The Continuum hypothesis
(CH) declares that the smallest cardinal number which is larger than the
countable cardinal ℵ0 is ℵ1 = 2^ℵ0 . The negation, ¬CH, of CH declares
that there is at least one uncountable cardinal ℵ1 such that ℵ0 < ℵ1 < 2^ℵ0 .
The Generalized continuum hypothesis (GCH) states that, for any cardinal
number ℵα , the smallest cardinal number which is greater than ℵα is
ℵα+1 = 2^ℵα . Its negation declares that there are cardinal numbers ℵα such
that ℵα < ℵα+1 < 2^ℵα .
In this section we will discuss another axiom called Martin’s axiom.1 This is
a slightly more advanced topic of set theory, since the understanding of some
proofs assumes some basic knowledge of topology on the part of the reader.
Before we state and describe this axiom, we must introduce two particu-
lar notions associated to partially ordered sets, (P, ≤). One is the countable
chain condition in P and the other refers to special subsets of P called filters.

1 Introduced in 1970 by Donald A. Martin and Robert M. Solovay.


33.2 Countable chain condition ccc on a partially ordered set.

Recall that a partially ordered set is a set P on which we have defined a
relation ≤ which is reflexive, transitive and antisymmetric. As an example
consider the closed interval X = [0, 1]. Let B denote intervals in [0, 1] of the
form [0, a), (a, b), (a, 1] where b ≥ a. Let τ (X) be a subset of the power set,
P(X), of X defined as follows:

τ (X) = {U ∈ P(X) : U is a union of elements of B}

The elements of τ (X) are referred to as the open subsets of X. See that τ (X)
is essentially the collection of all open subsets of [0, 1]. If we equip τ (X) with
the relation ⊆, then (τ (X), ⊆) is an example of a partially ordered set. In
what follows, ∧P denotes the minimal element of P , (if P has one).
Recall that a chain in a partially ordered set is a subset which is linearly
ordered. Suppose (P, ≤) is a partially ordered set which may or may not
contain ∧P (the minimal element of P ). Let A be a subset of P . We say
that A is an antichain in P if A is a subset of P in which no two elements
are comparable. We say that “A is a strong antichain in the partially ordered
set P ” if A satisfies simultaneously these three properties,
1) ∧P ∉ A (if ∧P exists),
2) for any pair of distinct elements u, v ∈ A, u and v are not comparable under ≤,
3) for any pair of distinct elements u, v ∈ A, there does not exist an element r ∈ P ,
other than ∧P , such that r ≤ u and r ≤ v.
The subset A is a “strong” antichain in P if no element of P , other than ∧P , lies
below both members of a pair of distinct elements in A. With this in mind, we
introduce the following concept.

Definition 33.1 Let (P, ≤) be a partially ordered set (which may or may
not contain ∧P ). If P contains no uncountable strong antichain, then (P, ≤)
is said to satisfy the countable chain condition. In this case, we say that (P, ≤)
satisfies the ccc or that (P, ≤) is a ccc partial order.

So, if A is a strong antichain of (P, ≤) and P is known to be ccc, then A

must be countable. In what follows the word “antichain” will always mean
a “strong antichain”.
Example : Let X = [0, 1] and (τ (X), ⊆) be the partially ordered set defined
above. We verify that (τ (X), ⊆) is a ccc partially ordered set. Suppose A is
a strong antichain in τ (X). We claim that A cannot be uncountable and so
τ (X) is ccc. The strong antichain property means that the open sets in A are pairwise
disjoint non-empty open subsets of [0, 1]. We index the elements of A as follows:

A = {Uκ : κ < ℵα }

for some cardinal ℵα . From each Uκ ∈ A we choose precisely one rational
number qκ (we can do this since Uκ is a non-empty open subset of [0, 1] and so
contains at least one open interval). Since the sets Uκ are pairwise disjoint,
distinct sets receive distinct rationals, so |A| = |{qκ : κ < ℵα }|. Since Q is
countable, |Q| = ℵ0 , then |{qκ : κ < ℵα }| ≤ ℵ0 . It follows that |A| ≤ ℵ0 and so
A is countable. So A cannot be an uncountable family of open subsets of [0, 1].
We conclude that (τ (X), ⊆) satisfies the ccc.1
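
The counting argument rests on an injection from a family of pairwise disjoint open sets into Q. A minimal Python sketch of that idea follows. It makes several simplifying assumptions that are not in the text: each open set is a single open interval with rational endpoints, the family is the finite list of intervals (1/(n+1), 1/n) for n = 1 to 10, and the rational picked is the midpoint. The names pick_rational and pairwise_disjoint are hypothetical.

```python
from fractions import Fraction

def pick_rational(a, b):
    """Return a rational in the open interval (a, b), for rationals a < b."""
    return (a + b) / 2

def pairwise_disjoint(intervals):
    """True when no two of the open intervals (a, b) overlap."""
    ordered = sorted(intervals)
    return all(prev[1] <= nxt[0] for prev, nxt in zip(ordered, ordered[1:]))

# Hypothetical family of disjoint open intervals (1/(n+1), 1/n), n = 1..10.
A = [(Fraction(1, n + 1), Fraction(1, n)) for n in range(1, 11)]

# One rational per interval; disjointness forces the choices to be distinct.
chosen = {interval: pick_rational(*interval) for interval in A}
injective = len(set(chosen.values())) == len(A)
print(pairwise_disjoint(A), injective)
```

Because each chosen rational lies inside its own interval and the intervals do not overlap, two different intervals can never receive the same rational, which is the injectivity used in the example above.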

33.3 Dense subsets of a partially ordered set (P, ≤).

Given a partially ordered set (P, ≤) there are special subsets of P said to be
dense in P . We explain what this means in the following definition.

Definition 33.2 Let (P, ≤) be a partially ordered set. Let D be a subset of

P such that for every element p in P there exists an element d in D such that
d ≤ p. A subset D satisfying this property is said to be a dense subset of P
with respect to ≤.

Examples of dense subsets of a partially ordered set.

Example 1. Let X = [0, 1] and (τ (X), ⊆) be the partially ordered set defined
above. We will construct a dense subset D of (τ (X), ⊆) as follows. Let

E = {(x − ε, x + ε) ⊆ [0, 1] : x ∈ (0, 1), ε ∈ (0, 1)}

the set of all open intervals which do not contain their endpoints. We claim
that E is dense in (τ (X), ⊆). Let U be a non-empty element of τ (X). Then U is
a non-empty open set (read “a union of open intervals”). Since U is non-empty,
there is an x ∈ U such that x is not 0 or 1. Then there exists ε such
that (x − ε, x + ε) ⊆ U . (This is a fundamental property of the
real numbers.) Then, for every non-empty element U of τ (X), there is an element of
E which is a subset of U . So E is dense in τ (X) with respect to ⊆, as claimed.
Here is a trickier example (for good practice).
Example 2. Let (P, ≤) be a partially ordered set. If x ∈ P , we define x↓ as
follows:
x↓ = {y ∈ P : y ≤ x}
We say that a and b are compatible in P if a↓ ∩ b↓ contains some element of
P which is not the minimum of P .

1 More generally, in the case of a topological space (X, τ ), the ccc property of X can
be expressed as follows: If for any family F of pairwise disjoint open subsets of X, the
cardinality of F is less than or equal to ℵ0 , then (X, τ ) satisfies the ccc. So, for example, R
with the usual topology satisfies the ccc.

Let u and v be fixed elements of P . Let
V(u,v) = {x ∈ P : x is not compatible with u or v}
W(u,v) = u↓ ∩ v↓
D(u,v) = V(u,v) ∪ W(u,v)
Claim: That D(u,v) is dense in P .
Proof of claim: Let z ∈ P and y ≤ z. We are required to show that there is
d ∈ D(u,v) such that d ≤ z.
Case 1. If y↓ ∩ u↓ = ∅ or y↓ ∩ v↓ = ∅, then y ∈ V(u,v) so y ∈ D(u,v) . Then
there exists y ∈ D(u,v) where y ≤ z, and we are done.
Case 2. Suppose, on the other hand, that there is some q ∈ y↓ ∩ u↓ . Then
q ≤ y and since y ≤ z, q ∈ z↓ ∩ u↓ where q ≤ y ≤ z. Also note that q ∈ u↓ .
· Subcase 2.1. If q↓ ∩ v↓ = ∅, then q ∈ V(u,v). So q ∈ D(u,v) where q ≤ z,
and we are done.
· Subcase 2.2. Suppose q↓ ∩ v↓ ≠ ∅. That is, suppose k ∈ q↓ ∩ v↓ for some
k. Then k ∈ v↓ . Also k ≤ q and since q ∈ u↓ , then k ∈ u↓ ∩ v↓ . We then
have k ≤ z and k ∈ W(u,v) ⊆ D(u,v). So k ∈ D(u,v).
Then D(u,v) is dense in P , as claimed.
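
The construction of D(u,v) can be tested on a small finite example. The Python sketch below is illustrative and rests on assumptions made here: P is taken to be the family of non-empty subsets of {0, 1, 2} ordered by inclusion (so P has no minimum and compatibility reduces to a↓ ∩ b↓ ≠ ∅), and "x is not compatible with u or v" is read as "x is incompatible with u, or x is incompatible with v", as in Case 1 of the proof. The names down, compatible, D and is_dense are hypothetical.

```python
from itertools import combinations

GROUND = (0, 1, 2)
# P: the non-empty subsets of GROUND, ordered by inclusion (no minimum).
P = [frozenset(c) for r in range(1, 4) for c in combinations(GROUND, r)]

def down(x):
    """x↓ = {y in P : y ≤ x}."""
    return {y for y in P if y <= x}

def compatible(a, b):
    """a and b are compatible when a↓ ∩ b↓ is non-empty."""
    return bool(down(a) & down(b))

def D(u, v):
    """D(u,v) = V(u,v) ∪ W(u,v) as defined in Example 2."""
    v_part = {x for x in P if not (compatible(x, u) and compatible(x, v))}
    w_part = down(u) & down(v)
    return v_part | w_part

def is_dense(subset):
    """Every p in P has some d in subset with d ≤ p."""
    return all(any(d <= p for d in subset) for p in P)

all_dense = all(is_dense(D(u, v)) for u in P for v in P)
example = D(frozenset({0, 1}), frozenset({1, 2}))
print(all_dense, sorted(sorted(d) for d in example))
```

For u = {0, 1} and v = {1, 2}, the set D(u,v) consists of the three singletons: {1} lies below both u and v, while {0} and {2} are each incompatible with one of them.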

33.4 Filters in a partially ordered set (P, ≤).

The notion of a filter is often seen in the particular context of (P(X), ⊆),
the power set for some non-empty set X ordered by inclusion ⊆. In this case,
a filter ℱ in (P(X), ⊆) is seen as being a non-empty subset ℱ of P(X)
which satisfies:

1. ℱ is closed under supersets (i.e., if F ∈ ℱ and F ⊂ T , then T ∈ ℱ )

2. ℱ is closed under finite intersections. That is, if A, B ∈ ℱ then A ∩ B
belongs to ℱ . This generalizes to “the intersection of finitely many
elements of ℱ belongs to ℱ ” (and so, when ℱ is proper, is never empty).
When the second condition is satisfied, we say that ℱ satisfies the finite
intersection property. See that P(X) is itself a filter in P(X). If ℱ is a
filter in P(X) not equal to P(X), we say ℱ is a proper filter.
For example, if X is an infinite set, the subset ℱ = {U ∈ P(X) :
X − U is finite } ⊂ P(X) does not contain the empty set, is closed under
supersets and satisfies the finite intersection property and so is a proper
filter in (P(X), ⊆). When a filter is a subset of (P(X), ⊆), we will be using
the script font, ℱ , to represent it. Note that ℱ is proper if and only if
∅ ∉ ℱ .
In a more general context of a partially ordered set (P, ≤), filters are defined
as follows.

Definition 33.3 Let F be a subset of a partially ordered set (P, ≤).

The subset, F , is called a filter if it is non-empty and satisfies the two
properties:

1. If x and y belong to F , there exists z in F which is less than or equal to
both x and y (i.e., F is a filter base or downward directed).
2. If x belongs to F and x is less than or equal to an element y of P , then y
belongs to F (i.e., F is upward closed).
A filter in (P, ≤) is a proper filter if it is not all of P . If x ∈ P , then the set of
all elements above x is called a principal filter with principal element x. Such
a filter is the smallest filter which contains x.
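
The two conditions of Definition 33.3 can be checked mechanically when the partially ordered set is finite. The following is a minimal sketch, assuming a small hypothetical example (the power set of X = {0, 1, 2} ordered by inclusion); the names `is_filter` and `principal_filter` are illustrative and do not come from the text.

```python
from itertools import combinations

def is_filter(F, P, leq):
    """Check Definition 33.3 for a finite subset F of a finite poset (P, leq)."""
    if not F or not F <= P:
        return False
    # 1. Downward directed: any x, y in F have a common lower bound z in F.
    directed = all(any(leq(z, x) and leq(z, y) for z in F)
                   for x in F for y in F)
    # 2. Upward closed: every element of P above an element of F is in F.
    upward = all(y in F for x in F for y in P if leq(x, y))
    return directed and upward

def principal_filter(x, P, leq):
    """The set of all elements of P above x."""
    return {y for y in P if leq(x, y)}

# Hypothetical example: the power set of X = {0, 1, 2} ordered by inclusion.
X = (0, 1, 2)
P = {frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)}
leq = lambda s, t: s <= t  # subset relation on frozensets

F = principal_filter(frozenset({0}), P, leq)
print(is_filter(F, P, leq))   # both filter conditions hold
print(F != P)                 # F is proper: it omits the empty set
print(is_filter({frozenset({0}), frozenset({1})}, P, leq))  # not directed
```

The last call fails the first condition: {0} and {1} have no common lower bound inside the candidate set.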

33.5 Martin’s axiom.

Martin’s axiom² is a statement concerning those sets of cardinality κ < 2^ℵ0
which, when hypothesized, allows sets of cardinality κ to behave more like
those sets of cardinality ℵ0 than those sets which are of cardinality 2^ℵ0 .
When referring to a partially ordered set such as (P(X), ⊆) the Martin’s
κ-statement reads as follows:
MA(κ) statement for a power set : Let κ be an infinite cardinal.
Suppose (P(X), ⊆) contains no uncountable family of pairwise
disjoint subsets of X (i.e., P(X) satisfies ccc). Suppose

D* = {D ⊆ P(X) : D is dense in P(X)}

If |D*| ≤ κ then there is a proper filter ℱ ⊆ P(X) such that, for
every set D ∈ D*, ℱ ∩ D ≠ ∅.
In the more general context of a partially ordered set, (P, ≤) Martin’s κ-
statement, denoted by MA(κ) is defined as follows:
MA(κ): Let κ be an infinite cardinal and (P, ≤) be a non-empty
partially ordered set satisfying the countable chain condition. Let

D = {D ∈ P(P ) : D is dense in P }

such that |D| ≤ κ. Then there is a proper filter F ⊆ P such that
F ∩ D ≠ ∅ for every set D ∈ D.
² Donald Anthony Martin (1940- ), also known as Tony Martin, is an American set
theorist and philosopher of mathematics at UCLA, where he is an emeritus professor of
mathematics and philosophy. This axiom was introduced in collaboration with Robert M.
Solovay (1938 - ), an American mathematician who specializes in Set Theory. (Wikipedia)

We will first show that MA(ℵ0 ) holds true in ZFC.

Theorem 33.4 The statement MA(ℵ0 ) holds true in ZFC. (MA(κ) where
κ = ℵ0 .)
Proof:
Suppose (P, ≤) is a non-empty partially ordered set. In what follows, the reader
will notice that the ccc property on (P, ≤) is not required for the Martin’s
ℵ0 -statement to hold true.
Let D be a family of dense subsets of P such that |D| ≤ ℵ0 . That is, there
are at most countably many dense subsets in D. Let a ∈ P .
Case 1: We consider the case where D is empty.
We are required to find a proper filter F ⊂ P such that F ∩ D ≠ ∅ for every
set D ∈ D.
The set, F = {x ∈ P : x ≥ a}, of all elements above a is a principal filter.
But D doesn’t contain any sets. So F vacuously intersects every element of D.
Then MA(ℵ0 ) holds true.
Case 2: We consider the case where 0 < |D| ≤ ℵ0 . So D contains at least
one dense subset. We can then enumerate the sets in D as

D = {D_1 , D_2 , D_3 , . . .}
We are required to find a proper filter F ⊂ P such that F ∩ D_i ≠ ∅ for
every i ≥ 1.
Since, for each i, D_i is dense in P , we can choose some d_1 ∈ D_1 such that
d_1 ≤ a, d_2 ∈ D_2 such that d_2 ≤ d_1 ≤ a, and if d_n ≤ d_{n−1} ≤ · · · ≤ d_1 ≤ a
choose d_{n+1} ∈ D_{n+1} such that

d_{n+1} ≤ d_n ≤ d_{n−1} ≤ · · · ≤ d_2 ≤ d_1 ≤ a

We now let F ⊂ P be the set of all elements of P which are above some d_i .
That is, F = {q ∈ P : q ≥ d_i for some i ≥ 1}. In particular, F contains a
and every d_i .
We claim that F is a filter in P .
1) Clearly F is non-empty.
2) If q ∈ F and b is in P such that q ≤ b, then d_i ≤ q ≤ b for some i, so b ∈ F .
3) If b, c belong to F , then d_i ≤ b and d_j ≤ c for some i and j.
Without loss of generality, suppose i ≤ j. Then d_j ≤ d_i ≤ b and d_j ≤ c, where
d_j ∈ F . Then F is a filter, as claimed.
Furthermore, F intersects every D_i at d_i for each i.
So F is a filter which intersects every element of D. That is, if D is
countable, then (P, ≤) satisfies the condition for MA(ℵ0 ).
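
The construction in the proof (a descending chain, followed by the set of everything above it) can be imitated on a finite partially ordered set. The sketch below is only an illustration under stated assumptions: the poset of binary strings of length at most 3, with q ≤ p meaning "q extends p", and the dense sets of "strings of length at least n" are hypothetical choices, as are the function names.

```python
from itertools import product

def leq(q, p):
    """q <= p when the string q extends the string p (hypothetical ordering)."""
    return q.startswith(p)

# Hypothetical finite poset: all binary strings of length at most 3.
P = {"".join(bits) for n in range(4) for bits in product("01", repeat=n)}

def is_dense(D, P):
    """D is dense in P if every p in P has some d in D with d <= p."""
    return all(any(leq(d, p) for d in D) for p in P)

def descending_chain(a, dense_sets):
    """Pick d_1 >= d_2 >= ... below a, with d_i taken from the i-th dense set."""
    chain, current = [], a
    for D in dense_sets:
        # Density guarantees that a candidate below the current element exists.
        current = min(d for d in D if leq(d, current))
        chain.append(current)
    return chain

def generated_filter(chain, P):
    """All elements of P lying above some member of the chain."""
    return {q for q in P if any(leq(d, q) for d in chain)}

# Dense sets: strings of length at least n, for n = 1, 2, 3.
dense_sets = [{p for p in P if len(p) >= n} for n in (1, 2, 3)]
chain = descending_chain("", dense_sets)
F = generated_filter(chain, P)
print(chain)
print(sorted(F, key=len))
```

Each step only needs one new element below the previous one, which is why the argument works for countably many dense sets without any appeal to the countable chain condition.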

Note how countability of D plays a role in the above theorem and how, if
D is an uncountable family, difficulties may arise.

In the following theorem, we show that, in ZFC, the statement MA(2^ℵ0 )
(i.e., κ = 2^ℵ0 ) cannot hold true, in general. That is, we construct an
example of a partially ordered set (P, ≤) for which |D| = κ = 2^ℵ0 . We then
show that there can be no proper filter, F of (P, ≤), which intersects every
element of D.
A few preliminary words. Before we prove this, we state the following
well-known facts about the closed and bounded interval X = [0, 1]. If U ∈ τ (X)
(where τ (X) is defined above as the set of all non-empty open subsets of X)
then the closure of U , denoted by cl(U ), is the subset of X for which X − cl(U )
is the largest open set disjoint from U . A subset
is said to be closed if and only if its complement is an open subset. The
simplest example is: cl(a, b) = [a, b], a closed interval in X.
The following are well-known facts about closed and bounded subsets of R:
Fact #1 : If K is a closed and bounded subset of R and ℱ is a family of
closed subsets of K which satisfies the finite intersection
property (i.e., every finite subfamily of ℱ has non-empty
intersection), then ∩{F : F ∈ ℱ} ≠ ∅.
Fact #2 : If (a, b) is an open interval in X = [0, 1] and x ∈ (a, b) ⊂ [0, 1],
then there exist c, d ∈ X such that x ∈ (c, d) ⊂ cl(c, d) =
[c, d] ⊂ (a, b).

Theorem 33.5 The statement MA(2^ℵ0 ) fails in ZFC.

Proof: Let X = [0, 1]. Then |X| = |R| = 2^ℵ0 .
For each x ∈ X, let U_x = X − {x} (the complement of {x} in X). Then

U = {U_x : x ∈ X} ⊂ τ (X)

where |U | = |X| = 2^ℵ0 . Note that ∩{U_x : x ∈ X} = ∅. For each x ∈ X, let

𝒟_x = {D ∈ τ (X) : cl(D) ⊆ U_x }

We claim that, for each x ∈ X, 𝒟_x is dense in (τ (X), ⊆).

Proof of claim. Let x ∈ X. Suppose M is some element of τ (X). Since
M ∩ U_x = M ∩ [X − {x}] is a non-empty open subset of X, Fact #2 provides
some D ∈ 𝒟_x such that

D ⊂ cl(D) ⊆ M ∩ U_x = M ∩ [X − {x}] ∈ τ (X)

Then, for any M ∈ τ (X), there is D ∈ 𝒟_x such that D ⊆ M . Then 𝒟_x is
dense in the partially ordered set (τ (X), ⊆) as claimed.

Then D = {𝒟_x : x ∈ X} is a set of dense subsets of (τ (X), ⊆) of cardinality
|D| = 2^ℵ0 .
Suppose that MA(2^ℵ0 ) holds true. It states that there exists a proper filter
F ⊆ τ (X) such that, for each x ∈ X, F ∩ 𝒟_x is non-empty. For
each x ∈ X, choose D_x ∈ F ∩ 𝒟_x . Since F is a filter, finite intersections
of the chosen elements {D_x : x ∈ X} are non-empty. Then {cl(D_x ) : x ∈
X} must also satisfy the finite intersection property. Since D_x ∈ 𝒟_x then
cl(D_x ) ⊆ U_x . Since X is closed and bounded (compact) then (by Fact #1
above)
∩{cl(D_x ) : x ∈ X} ≠ ∅
This contradicts the fact that
∩{cl(D_x ) : x ∈ X} ⊆ ∩{U_x : x ∈ X} = ∅

The source of our contradiction is our supposition that MA(2^ℵ0 ) holds true.
We conclude that MA(2^ℵ0 ) does not hold true in ZFC.

The two theorems above show that the only Martin κ-statements which are
of interest are those where κ is such that ℵ0 ≤ κ < 2^ℵ0 .
We then state Martin’s axiom as follows.

Definition 33.6 Martin’s axiom, [MA], is the statement that MA(κ) holds for
every cardinal κ which satisfies ℵ0 ≤ κ < 2^ℵ0 .

Trivially CH ⇒ [MA] (since, under CH, the only such κ is ℵ0 , and MA(ℵ0 )
holds true in ZFC). On the other hand, it
has been shown (by Solovay and Martin) that Martin’s axiom is independent
of ZFC and is consistent with ZFC + ¬CH .

33.6 Consequences of Martin’s axiom in topology.

There are a few equivalent forms of Martin’s axiom. For those readers
familiar with point-set topology, we present a nicely formulated consequence
of Martin’s axiom which is of interest. The following statement is in fact
equivalent to MA. We prove that it is equivalent to MA in Appendix A.

Theorem 33.7 [MA] Suppose κ is an infinite cardinal such that κ < 2^ℵ0 . If
X is a Hausdorff compact space with ccc and {Uα : α ≤ κ} is a family of open
dense³ subsets of X then ∩{Uα : α ≤ κ} ≠ ∅.

³ The subset U is dense in X in the topological sense if and only if cl(U ) = X.


A few other well-known topological consequences of Martin’s axiom are:

− MA(ℵ1 ) implies that “If X is a topological space such that every closed
subset of X is a Gδ -set, then every subspace of X has a countable dense
subset”.
− MA(ℵ1 ) implies that “A product of ccc topological spaces is ccc”.
− When [MA] is assumed: “If κ is such that ℵ0 ≤ κ < 2^ℵ0 then 2^κ = 2^ℵ0 ”.
The proofs are beyond the scope of this book.

33.7 Martin’s axiom compared to the Baire category theorem.

Certain readers who are well versed in real analysis may find that this
topological statement is reminiscent of the Baire category theorem of which one
version states:
“If X is a locally compact Hausdorff space and D = {Dα : α < ℵ0 }
is a countable set of open and dense subsets in X, then ∩{Dα :
α < ℵ0 } is dense in X”.⁴
The following theorem compares Martin’s axiom to the Baire category
theorem. The proof of the statement of equivalence is found in Appendix
A at Theorem 1.14.

Theorem 33.8 Let κ be a cardinal such that ℵ0 ≤ κ < 2^ℵ0 . Let X
be a Hausdorff topological space satisfying ccc such that {x ∈ X :
x has a compact neighborhood} is dense in X. Suppose that D = {Dα :
α ≤ κ} is a family of dense open subsets of X. Then

∩{Dα : α ≤ κ} is dense in X ⇔ Martin’s axiom holds true

Readers with a background in point-set topology will find a more in-depth
study of the topics of Martin’s axiom and Boolean algebras in Appendix A.

⁴ In fact, its similarity to MA is such that some may want to refer to MA as an “Enhanced”
Baire category theorem. For the case of R with the usual topology, MA implies the Baire
category theorem.

Concepts review:
1. When does a partially ordered set satisfy the “countable chain con-
dition”?
2. What is an open subset of the closed interval X = [0, 1]?
3. What subset of P([0, 1]) does τ ([0, 1]) represent?
4. What is a strong antichain in a partially ordered set (P, ≤)?
5. What is a dense subset of a partially ordered set (P, ≤)?
6. What is a filter in a partially ordered set (P, ≤)? What is a proper
filter? What is a principal filter?
7. What does it mean to say that a family of subsets satisfies the “finite
intersection property”?
8. State the Martin’s κ-statement MA(κ).
9. State MA(κ) when it refers specifically to a power set (P(X), ⊆).
10. What can be said about MA(ℵ0 )?
11. What can be said about MA(2^ℵ0 )?
12. State Martin’s axiom, [MA].
13. State the Baire category theorem.
Part X

Ordinal arithmetic

34 / Ordinal Addition
Abstract. In this section we define the operation of addition on the or-
dinal numbers. We then show its most basic properties and provide a few
examples.

34.1 Definition of ordinal addition.

Just as for cardinal numbers, we can define addition of ordinal numbers.
Ordinal number addition will be very similar to cardinal number addition.
Recall how cardinal number addition was defined:
Let S and T be two sets such that S ∩ T = ∅ where S is of
cardinality κ and T is of cardinality λ. We define: κ + λ as the
cardinality of S ∪ T .
Now the ordinality of a well-ordered set S is the unique ordinal which is order
isomorphic to S. If we simply transpose the “cardinal addition definition”
onto the ordinal numbers, then we would obtain: “If (S, ≤S ) and (T, ≤T ) are
well-ordered sets of ordinality α and β, respectively, then α+β = ord(S ∪T )”.
There is, however, some critical information missing here. We can only speak
of the ordinality of a set in reference to a stated well-ordering of that set.
We should then begin by defining a well-ordering of S ∪ T .

Definition 34.1 Let (S, ≤S ) and (T, ≤T ) be two disjoint well-ordered sets.
We define the relation “≤S∪T ” on S ∪ T as follows:

a) u ≤S∪T v if {u, v} ⊆ S and u ≤S v.

b) u ≤S∪T v if {u, v} ⊆ T and u ≤T v.
c) u ≤S∪T v if u ∈ S, v ∈ T .
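
The three clauses of Definition 34.1 translate directly into a comparison function. The following sketch assumes finite sets of integers with their usual ordering as hypothetical representatives; `sum_order` and `listing` are illustrative names, not notation from the text.

```python
from functools import cmp_to_key

def sum_order(S, T, leq_S, leq_T):
    """Return the relation on S ∪ T given by clauses (a), (b), (c) of Definition 34.1."""
    assert not (set(S) & set(T)), "S and T must be disjoint"

    def leq(u, v):
        if u in S and v in S:
            return leq_S(u, v)        # clause (a)
        if u in T and v in T:
            return leq_T(u, v)        # clause (b)
        return u in S and v in T      # clause (c): all of S precedes all of T

    return leq

def listing(S, T, leq):
    """List the elements of S ∪ T in increasing order with respect to leq."""
    def cmp(u, v):
        if u == v:
            return 0
        return -1 if leq(u, v) else 1
    return sorted(set(S) | set(T), key=cmp_to_key(cmp))

usual = lambda u, v: u <= v
S, T = {7, 8, 9}, set(range(7))
print(listing(S, T, sum_order(S, T, usual, usual)))  # elements of S, then those of T
print(listing(T, S, sum_order(T, S, usual, usual)))  # elements of T, then those of S
```

Exchanging the roles of S and T changes the ordering of the union even though the underlying set is the same, which is why the well-ordering has to be stated explicitly.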

Our next step should be a verification that this newly defined order relation
actually well-orders the union S ∪ T . We express this in the form of a theo-
rem, and leave the straightforward proof as an exercise.

Theorem 34.2 Let (S, ≤S ) and (T, ≤T ) be two disjoint well-ordered sets.
Then the relation ≤S∪T well-orders the set S ∪ T .
Proof: The proof is left as an exercise.

We can now define addition of two ordinal numbers as the ordinality of the
union of disjoint well-ordered sets.

Definition 34.3 Let α and β be two ordinal numbers.

Let (S, ≤S ) and (T, ≤T ) be two disjoint well-ordered sets of order type α and
β respectively.¹ We define α + β as follows:

α + β = ord(S ∪ T, ≤S∪T )

The stated definition of addition of ordinal numbers α and β applies only to
order types of disjoint well-ordered sets (S, ≤S ) and (T, ≤T ). If we wish to
add the order types α and β of two sets (S, ≤S ) and (T, ≤T ) with non-empty
intersection, we can construct the disjoint sets (S × {0}, ≤0 ) and (T × {1}, ≤1 )
of the same order types, each equipped with the lexicographical ordering. That
is,

(u, 0) ≤0 (v, 0) if u ≤S v
(u, 1) ≤1 (v, 1) if u ≤T v

We then add the ordinals α and β as defined.

Theorem 34.4 Let (S, ≤S ), (T, ≤T ) and (U, ≤U ), (V, ≤V ) be two pairs of
disjoint well-ordered sets such that
ord S = α = ord U
ord T = β = ord V

Then ord (S ∪ T, ≤S∪T ) = α + β = ord (U ∪ V, ≤U∪V ). Hence, addition of ordinal
numbers is well-defined.

Proof: The proof is left as an exercise.

¹ Addition can also be defined inductively as follows: For all α and β, (a) β + 0 = β, (b)
β + (α + 1) = (β + α) + 1, (c) β + α = lub {β + γ : γ < α} whenever α is a limit ordinal.


34.2 Examples.
Addition of natural numbers, when viewed as ordinals, should agree with re-
sults obtained when adding natural numbers the usual way. We have already
verified that this is the case for cardinal numbers. The following example
illustrates that this is the case for addition of natural numbers if viewed as
ordinals. Note that in the examples and theorem below, “<” represents the
“ordinal inclusion” order relation ∈.
a) Example. Determine the sum 3 + 7 when these natural numbers are
viewed as ordinals. Also determine the sum 7 + 3.
Solution: We see that 3 = ord {7, 8, 9} and 7 = ord {0, 1, 2, 3, 4, 5, 6}. The
choice of the natural numbers used is arbitrary. The chosen well-ordered
set representatives are disjoint. See that ord{7, 8, 9, 0, 1, 2, . . . , 6} = ord
{0, 1, 2, . . ., 9} since {7, 8, 9, 0, 1, 2, . . ., 6} (with the ordering defined on
unions) and {0, 1, 2, . . . , 9} (with the usual natural number ordering)
are order isomorphic. By definition,
3 + 7 = ord{7, 8, 9, 0, 1, 2, . . . , 6} = ord{0, 1, 2, . . . , 9} = 10
7 + 3 = ord{0, 1, 2, . . . , 9} = 10
b) Example. Determine both sums ω0 + 7 and 7 + ω0 .
Solution: So that we obtain disjoint well-ordered set representatives,
we will use
7 = ord {0, 1, 2, 3, 4, 5, 6}
ω0 = ord {7, 8, 9, 10, . . .}

Then, by definition,
7 + ω0 = ord {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, . . .} = ω0
ω0 + 7 = ord {7, 8, 9, 10, . . . , 0, 1, 2, 3, 4, 5, 6} = ω0 + 7

It is worth noting that 7 + ω0 = ω0 < ω0 + 7 and so, even if addition on
finite ordinals is commutative, this is not always the case for addition
of transfinite ordinals.

c) Example. Show that ω0 + 1, when viewed as the sum of the ordinal ω0
and the ordinal 1 = {0}, is the immediate successor, ω0⁺ , of ω0 .
Solution:
ω0 + 1 = ord (ω0 × {0} ∪ {(0, 1)}, ≤) where ≤ is the ordering defined on the union ω0 × {0} ∪ {(0, 1)}
= ord ({0, 1, 2, 3, . . .} ∪ {ω0 })
= {0, 1, 2, 3, . . .} ∪ {ω0 }
= ω0 ∪ {ω0 }
= ω0⁺

34.3 Basic properties of ordinal addition.

Many of the addition properties generalize from finite ordinal addition to
transfinite ordinal addition. But we should watch out for those properties
that do not.

Theorem 34.5 Let α, β and γ be three (non-zero) ordinal numbers. Then:

a) (α + β) + γ = α + (β + γ) (Addition is associative.)

b) For any ordinal γ > 0, α < α + γ

c) For any ordinal γ, γ ≤ α + γ
d) α < β ⇒ α + γ ≤ β + γ
e) α < β ⇒ γ + α < γ + β
f) α + β = α + γ ⇒ β = γ (Left term cancellation is valid.)

g) α + 0 = α

Proof: Let A, B and C be three pairwise disjoint well-ordered set
representatives of the ordinals α, β and γ respectively.
a) We are required to show that (α + β) + γ = α + (β + γ). By definition
(α + β) + γ = ord (A ∪ B) + ord C
= ord [(A ∪ B) ∪ C] (By definition of ordinal addition.)
= ord [A ∪ (B ∪ C)] (This is justified below at ∗ .)
= ord A + ord (B ∪ C)
= α + (β + γ)
∗ Justification of ord [(A ∪ B) ∪ C] = ord [A ∪ (B ∪ C)]: The two well-
ordered sets (A ∪ B) ∪ C and A ∪ (B ∪ C) contain the same elements
and so are equal as sets. To conclude that they have the same or-
dinality, we must show that their elements are ordered in the same
way. We will proceed by cases:
1) If both u and v belong to one of A, B, or C, then u < v in
(A ∪ B) ∪ C if and only if u < v in A ∪ (B ∪ C).
2) If u ∈ A and v belongs either to B or C, then u < v in (A∪ B)∪ C
if and only if u < v in A ∪ (B ∪ C).
3) If u ∈ B and v ∈ C, then u < v in B ∪ C and so u < v in
(A ∪ B) ∪ C if and only if u < v in A ∪ (B ∪ C).
In all cases the order of u and v is respected in both sets (A∪ B)∪ C
and A ∪ (B ∪ C). So not only are they equal sets, they are order
isomorphic and so have the same ordinality.

b) We are given that α + γ = ord (A ∪ C) and γ > 0. We are required
to show that α < α + γ. Since C is well-ordered and non-empty, it
contains a least element, say x. Then for every y ∈ A, y < x. Then
A = Sx = {u ∈ A ∪ C : u < x}, an initial segment of A ∪ C equipped
with the ordering ≤A∪C . Then α is an initial segment of α + γ and so
α ∈ α + γ. Equivalently, α < α + γ.
c) We are given that α + γ = ord (A ∪ C). We are required to show that
γ ≤ α + γ. Since both C and A ∪ C are well-ordered they are either
order isomorphic or one is order isomorphic to an initial segment of the
other (by Theorem 26.7). If C and A ∪ C are order isomorphic, then
γ = α + γ. Suppose they are not order isomorphic. Then α ≠ 0 and so
C is a proper subset of A ∪ C. Suppose A ∪ C is order isomorphic to an
initial segment Sc = {u ∈ C : u < c} of C. Restricting this isomorphism
to C gives a strictly increasing function f : C → C such that f(c) < c.
This is impossible, since a strictly increasing function from a well-ordered
set into itself can never map an element strictly below itself.
Then A ∪ C ≮WO C; hence, α + γ ≮ γ, so γ < α + γ.
d) We are given that α < β. We are required to show that α + γ ≤ β + γ.
Since α < β there exists an order isomorphism f : A → B which maps A
to an initial segment B ∗ of B. Consider the function g : A ∪ C → B ∪ C
such that g|A = f and g|C is the identity map on C. Note that a < c
for all a ∈ A and c ∈ C and b < c for all b ∈ B and c ∈ C and so
g : A ∪ C → B ∗ ∪ C is an order isomorphism mapping A ∪ C onto
B ∗ ∪ C. Since ord (B ∗ ∪ C) ≤ ord (B ∪ C) = β + γ, α + γ ≤ β + γ.²
e) We are given that α < β. We are required to prove that γ + α < γ + β.
Since α < β there exists an order isomorphism f : A → B which maps A
to an initial segment B ∗ of B. Consider the function g : C ∪ A → C ∪ B ∗
such that g|C is the identity map and g|A = f. Note that c < a
for all c ∈ C and a ∈ A and c < b for all c ∈ C and b ∈ B ∗ and so
g[C ∪ A] = C ∪ B ∗ is an initial segment of C ∪ B. Since ord (C ∪ A) =
ord (C ∪ B ∗ ) < ord (C ∪ B) = γ + β, then γ + α < γ + β.
f) We are given that α + β = α + γ. We are required to show that β = γ.
Suppose β < γ. Then by part (e), α + β < α + γ, a contradiction.
Similarly, γ < β implies α + γ < α + β, again a contradiction. We
conclude that β = γ.
g) We are given that α is an ordinal number. We are required to show that
α + 0 = α. Simply see that α + 0 = ord(A ∪ { }) = ordA = α.

² Note that g[A ∪ C] = B ∗ ∪ C need not be an initial segment of B ∪ C. So even if B ∗ ∪ C ⊂
B ∪ C, the equality ord (B ∗ ∪ C) = ord (B ∪ C) is possible. For example, for U = {0, 1, 3, 4, 5, . . .}
and V = {0, 1, 2, 3, 4, 5, . . .} we have both U ⊂ V and ord U = ω0 = ord V .

Remark: In part (f) of the above theorem we show that “left cancellation” on
addition applies just like for natural numbers. However “right cancellation”
does not work. For example,

2 + ω0 = ord {0, 1} + ord {6, 7, 8, 9, . . .}
= ord ({0, 1} ∪ {6, 7, 8, 9, . . .})
= ord {0, 1, 6, 7, 8, . . .}
= ω0

Similarly, 3 + ω0 = ω0 . But 2 + ω0 = 3 + ω0 ⇏ 2 = 3.

34.4 Addition of limit ordinals.

When adding limit ordinals, determining the simplest form for the sum
requires some thought. For example,

(ω1 + 7) + ω1 = ω1 + (7 + ω1 )
= ω1 + ω1 = ω1 × 2

We provide another approach to addition, for cases where the second term
is a limit ordinal. Recall that given a non-empty subset A of a well-ordered
set W , an upper bound of A is any element u of W such that a ≤ u for all
a ∈ A. Suppose the element, s, is the least upper bound of A. That is, s is
an upper bound of A, and, for any upper bound u, s ≤ u. In this case we
write s = lub(A) (or sup A).
For example, the ordinal 5 = {0, 1, 2, 3, 4} has least upper bound, lub(5) = 4,
since 4 is greater than or equal to all of the elements of 5 and it is the least
of all upper bounds. The limit ordinal ω0 = {0, 1, 2, 3, . . ., } has as a least
upper bound, lub(ω0 ) = ω0 , itself. Note that in this case, lub(ω0 ) is not an
element of ω0 . In fact, α is a limit ordinal if and only if lub α = α ∉ α.
Another property characterizes limit ordinals. The ordinal γ is a limit ordinal
if and only if ∪α∈γ α = γ. In the case where γ has an immediate predecessor,
say, β, then γ = {0, 1, 2, 3, . . . , β} and so

∪α∈γ α = 0 ∪ 1 ∪ 2 ∪ · · · ∪ β = β

So even if it is always true that lub γ = ∪α∈γ α, we have lub γ = ∪α∈γ α = γ
only in the cases where γ is a limit ordinal.
For example, the least upper bound of ω1 is ω1 , while the least upper bound
of ω1 + 3 = {0, 1, 2, . . ., ω1 , ω1 + 1, ω1 + 2} is ω1 + 2. We now show a useful
property involving addition of limit ordinals.

Theorem 34.6 Let β be a limit ordinal. Then, for any ordinal, α,

α + β = lub {α + γ : γ < β}

Proof:
We are given that β is a limit ordinal and α is any ordinal.
We are required to show that α + β is the least upper bound of the set
{α + γ : γ < β}.
We claim that α + β is an upper bound of the set {α + γ : γ < β}:
− For δ < β, by Theorem 34.5 part (e), α + δ < α + β. So α + β is an
upper bound of the set {α + γ : γ < β} as claimed.
We claim that α + β is the least such upper bound:
− Suppose δ is any upper bound of the set {α + γ : γ < β}. Then α + γ ≤ δ
for all γ < β. Suppose δ < α + β. Then for all γ < β, α + γ ≤ δ < α + β.
Then there exists a least ordinal µ ∈ β such that δ ≤ α + µ < α + β.
Since β is a limit ordinal, µ⁺ < β, and so δ ≤ α + µ < α + µ⁺ < α + β.
This contradicts the fact that α + γ ≤ δ for all γ < β (take γ = µ⁺ ).
Then δ ≥ α + β. So α + β is the
least such upper bound of {α + γ : γ < β} as required.

Example: Compute the sum (ω0 + 7) + 100 + (2 + ω0 ) to its simplest form.

Solution:

(ω0 + 7) + 100 + (2 + ω0 ) = (ω0 + 7) + (100 + 2) + ω0

= (ω0 + 7) + (102 + ω0 )
= (ω0 + 7) + lub {102 + n : n ∈ ω0 }
= (ω0 + 7) + ω0
= ω0 + (7 + ω0 )
= ω0 + lub {7 + n : n ∈ ω0 }
= ω0 + ω0
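
Sums of this kind can be checked by machine. The sketch below uses a technique that is not introduced in this section: it assumes that every ordinal below ω0^ω0 is written in Cantor normal form, that is, as a finite sum ω0^e1 × c1 + ω0^e2 × c2 + · · · with natural-number exponents e1 > e2 > · · · and positive natural-number coefficients. The list-of-pairs representation and the names `add`, `nat` and `w` are illustrative choices only.

```python
def add(a, b):
    """Ordinal sum a + b for ordinals below w^w in Cantor normal form.

    An ordinal is a list of (exponent, coefficient) pairs with strictly
    decreasing natural-number exponents; the empty list stands for 0.
    """
    if not b:
        return list(a)
    lead_exp, lead_coef = b[0]
    # Terms of a with exponent below the leading exponent of b are absorbed.
    kept = [(e, c) for (e, c) in a if e > lead_exp]
    same = sum(c for (e, c) in a if e == lead_exp)
    return kept + [(lead_exp, same + lead_coef)] + list(b[1:])

def nat(n):
    """The finite ordinal n."""
    return [(0, n)] if n > 0 else []

w = [(1, 1)]  # stands for omega_0

# (w + 7) + 100 + (2 + w), grouped from left to right.
total = add(add(add(w, nat(7)), nat(100)), add(nat(2), w))
print(total)                 # a single term with exponent 1 and coefficient 2
print(add(nat(7), w) == w)   # 7 + w = w
print(add(w, nat(7)))        # w + 7 keeps its finite tail
```

The single term with exponent 1 and coefficient 2 stands for ω0 + ω0, in agreement with the computation above. The absorption step in `add` is what makes 2 + ω0 and 3 + ω0 equal, so right cancellation fails in this representation as well.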

Concepts review:
1. Given two disjoint well-ordered sets (S, ≤S ) and (T, ≤T ) define a
well-ordering on S ∪ T .
2. For any two ordinals α and β, how is α + β defined?
3. For which ordinals does the given property hold true?
a) (α + β) + γ = α + (β + γ)
b) α < α + γ
c) γ ≤ α + γ
d) α + γ ≤ β + γ

e) γ + α < γ + β
f) α + β = α + γ ⇒ β = γ
g) α + 0 = α
4. If β is a limit ordinal, simplify the expression sup {α + γ : γ < β}.

EXERCISES

A. 1. Suppose 4 + β = β. Prove that ω0 ≤ β.

2. Show that if α < γ, then α + 1 ≤ γ.

B. 3. Suppose that α and δ are ordinals such that α ≤ δ. Show that there can
only be one ordinal β such that α + β = δ.
4. Compute or simplify the sum (50 + ω0 ) + (ω0 + ω1 ).
5. Show that if α is a finite ordinal and γ is a limit ordinal, then the least
upper bound of α + γ is γ.
6. Show that for any ordinal α and limit ordinal γ, α + γ is a limit ordinal.
7. Provide a concrete example of ordinals such that α < β and α + γ = β + γ
simultaneously hold true.

C. 8. Let (S, ≤S ) and (T, ≤T ) be two disjoint well-ordered sets. Show that the
relation ≤S∪T well-orders the set S ∪ T .
9. Let (S, ≤S ), (T, ≤T ) and (U, ≤U ), (V, ≤V ) be two pairs of disjoint well-
ordered sets such that
ord S = α = ord U
ord T = β = ord V

Show that ord (S ∪ T, ≤S∪T ) = α + β = ord (U ∪ V, ≤U∪V ). Hence, addition
of ordinal numbers is well-defined.

35 / Ordinal multiplication and exponentiation.

Abstract. In this section we define the “lexicographic ordering” of the
Cartesian product of two well-ordered sets. We then define the multiplica-
tion of two ordinals α and β as α × β = ord B × A where A and B are
their respective set representatives. We then list a few basic properties of
ordinal multiplication. This is followed by a definition of exponentiation
and a presentation of a few exponentiation properties.

35.1 Well-ordering the Cartesian product of well-ordered sets.

The definition of multiplication for cardinal numbers will serve as a model
for the definition of multiplication of ordinal numbers. Recall how cardinal
number multiplication was defined:
Let S and T be two sets where S is of cardinality κ and T is of
cardinality λ. We define: κ × λ as the cardinality of S × T .
We cannot simply substitute the words “set” with “well-ordered set” and
“cardinality” with the word “ordinality” since the ordinality of a set is always
expressed in terms of a well-ordering on that set. We will order the elements
of S × T lexicographically. We have discussed this ordering before, but since
it is essential in the definition of ordinal multiplication, we define it formally.

Definition 35.1 Let (S, ≤S ) and (T, ≤T ) be two well-ordered sets. We define
the lexicographic ordering on the Cartesian product S × T as follows:

(s1 , t1 ) ≤S×T (s2 , t2 ) provided either s1 <S s2 , or s1 = s2 and t1 ≤T t2


Theorem 35.2 Let (S, ≤S ) and (T, ≤T ) be two well-ordered sets. The lexi-
cographic ordering of the Cartesian product S × T is a well-ordering.

Proof: The proof is left as an exercise.

Theorem 35.3 If the well-ordered sets, S1 and S2 , are order isomorphic and
the well-ordered sets, T1 and T2 , are order isomorphic, then the lexicographi-
cally ordered Cartesian products, S1 × T1 and S2 × T2 , are order isomorphic.

Proof:
We are given onto order isomorphisms f : S1 → S2 and g : T1 → T2 .
We are required to produce an onto order isomorphism h : S1 ×T1 → S2 ×T2 .
We define the function h : S1 × T1 → S2 × T2 as h(s, t) = (f(s), g(t)).
We show that h is a well-defined one-to-one function on S1 × T1 : Since
h(s, t) = h(a, b) ⇒ (f(s), g(t)) = (f(a), g(b))
⇒ f(s) = f(a) and g(t) = g(b)
⇒ s = a and t = b (Since both f and g are one-to-one.)
⇒ (s, t) = (a, b)
then h is one-to-one.
The function h is onto S2 × T2 : If (s, t) ∈ S2 × T2 , then, since f and g are
“onto” S2 and T2 respectively, s = f(a) and t = g(b) for some a ∈ S1 and
b ∈ T1 ; hence, h(a, b) = (s, t). Hence, h is onto S2 × T2 .
The function h respects the ordering of the sets:

(s1 , t1 ) ≤S1×T1 (s2 , t2 ) ⇔ s1 <S1 s2 , or s1 = s2 and t1 ≤T1 t2
⇔ f(s1 ) <S2 f(s2 ), or f(s1 ) = f(s2 ) and g(t1 ) ≤T2 g(t2 )
⇔ (f(s1 ), g(t1 )) ≤S2×T2 (f(s2 ), g(t2 ))
⇔ h(s1 , t1 ) ≤S2×T2 h(s2 , t2 )

(with equality if and only if (s1 , t1 ) = (s2 , t2 )).

So h is an order isomorphism.

35.2 Definition of ordinal multiplication.

We are now set to define ordinal multiplication. At least for finite products,
multiplication is closely linked to addition. For example, 3 × 2 = 3 + 3 =
2 + 2 + 2. We would expect multiplication of ordinals to be so that ω0 × 2
equals ω0 + ω0 , for example. We propose the following definition.

Definition 35.4 Let α and β be two ordinals with set representatives A and
B respectively. We define the multiplication, α × β, as:
α × β = ord(B × A)
The product, α × β, is equivalently written as, αβ, (respecting the order).
Note the order of the terms in the Cartesian product, B × A, is different from
the order, α × β, of their respective ordinalities.

a) Example. Compute both ω0 × 2 and 2 × ω0 .

ω0 × 2 = ord ({0, 1} × N)
= ord {(0, 0), (0, 1), (0, 2), . . . , (0, n), . . . , (1, 0), (1, 1), (1, 2), . . . , (1, n), . . .}
= ω0 + ω0

2 × ω0 = ord (N × {0, 1})
= ord {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), . . . , (n, 0), (n, 1), . . .}
= ω0

We see that this multiplication is not commutative. At least in the case of
ω0 × 2, it is linked to the notion of sums.
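
The two listings in the example can be reproduced for a finite stand-in of N. The sketch below assumes the hypothetical truncation {0, 1, 2} in place of N and relies on the fact that Python compares tuples lexicographically; `lex_listing` is an illustrative name.

```python
from itertools import product

def lex_listing(S, T):
    """List S × T in the lexicographic order of Definition 35.1.

    Python compares tuples lexicographically, so sorting the pairs with the
    usual ordering on the integers produces exactly this order.
    """
    return sorted(product(S, T))

n = 3  # hypothetical finite stand-in for N = {0, 1, 2, ...}
print(lex_listing(range(2), range(n)))  # {0, 1} × n: two blocks of length n
print(lex_listing(range(n), range(2)))  # n × {0, 1}: n blocks of length 2
```

For finite n both listings have 2n elements and so the same order type. The difference only appears in the limit: two consecutive copies of N give ω0 + ω0, while infinitely many consecutive blocks of length 2 give ω0.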

b) Example. Order the following pairs.

i) ((ω0 , ω0 × 2), 7) and ((ω0 , 2 × ω0 ), 2 + ω0 ),
ii) (ω0 , (ω0 × 2, 7)) and (ω0 , (2 × ω0 , 2 + ω0 )).
Solution of (i): We compare ((ω0 , ω0 × 2), 7) and ((ω0 , 2 × ω0 ), 2 + ω0 ):
We first compare the two first coordinates (ω0 , ω0 × 2) and (ω0 , 2 × ω0 ).
(ω0 , ω0 × 2) = (ω0 , ω0 + ω0 ) and (ω0 , 2 × ω0 ) = (ω0 , ω0 )
⇒
(ω0 , 2 × ω0 ) < (ω0 , ω0 × 2)
⇒
((ω0 , 2 × ω0 ), 2 + ω0 ) < ((ω0 , ω0 × 2), 7)

Solution of (ii): We compare (ω0 , (ω0 × 2, 7)) and (ω0 , (2 × ω0 , 2 + ω0 )):

(ω0 × 2, 7) = (ω0 + ω0 , 7) and (2 × ω0 , 2 + ω0 ) = (ω0 , ω0 )
⇒
(2 × ω0 , 2 + ω0 ) < (ω0 × 2, 7)
⇒
(ω0 , (2 × ω0 , 2 + ω0 )) < (ω0 , (ω0 × 2, 7))

35.3 Basic properties of ordinal multiplication.

We show that some of the multiplication properties generalize from finite
ordinal multiplication to transfinite ordinal multiplication.

Theorem 35.5 Let α, β and γ be three ordinal numbers. Then:


a) (γβ)α = γ(βα) (Multiplication is associative.)

b) For any γ > 0, α < β ⇒ γα < γβ

c) γ(α + β) = γα + γβ (Left-hand distribution is acceptable.)

d) For any γ > 0, γα = γβ ⇒ α = β (Left-hand cancellation is acceptable.)

e) γ0 = 0
f) For any limit ordinal β ≠ 0, αβ = lub {αγ : γ < β}

Proof: Let A, B and C be three pairwise disjoint well-ordered set
representatives of the ordinals α, β and γ respectively.
a) We are required to show that (A × B) × C is order isomorphic to
A × (B × C).
(α × β) × γ = ord (B × A) × ord C
= ord (C × (B × A))
= ord ((C × B) × A) (This is justified below at ∗ .)
= ord A × ord (C × B)
= α × (β × γ)
∗ That the sets C × (B × A) and (C × B) × A are equipotent follows
from Theorem 4.9. We now show that the one-to-one function
f(a, (b, c)) = ((a, b), c) which maps A × (B × C) onto (A × B) × C
respects the ordering (the same argument applies with the roles of A
and C exchanged). Let (a1 , (b1 , c1 )) and (a2 , (b2 , c2 )) be two elements
of A × (B × C).
· If a1 < a2 :
in A × (B × C) : (a1 , (b1 , c1 )) < (a2 , (b2 , c2 ))
in (A × B) × C : (a1 , b1 ) < (a2 , b2 ) ⇒ ((a1 , b1 ), c1 ) < ((a2 , b2 ), c2 )
· If a1 = a2 and b1 < b2 :
in A × (B × C) : (b1 , c1 ) < (b2 , c2 ) ⇒ (a1 , (b1 , c1 )) < (a2 , (b2 , c2 ))
in (A × B) × C : (a1 , b1 ) < (a2 , b2 ) ⇒ ((a1 , b1 ), c1 ) < ((a2 , b2 ), c2 )
· If a1 = a2 , b1 = b2 and c1 < c2 :
in A × (B × C) : (b1 , c1 ) < (b2 , c2 ) ⇒ (a1 , (b1 , c1 )) < (a2 , (b2 , c2 ))
in (A × B) × C : (a1 , b1 ) = (a2 , b2 ) ⇒ ((a1 , b1 ), c1 ) < ((a2 , b2 ), c2 )

We have considered all cases and see that the ordering is respected.

b) We are given that γ > 0 and α < β. We are required to show that
γα < γβ.
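Since α < β, there exists an order isomorphism f : A → B mapping A
onto a proper initial segment B ∗ of B. Since γα = ord (A × C) and
γβ = ord (B × C), it suffices to show that A × C is order isomorphic
to a proper initial segment of B × C. Define the function g : A × C → B × C
as follows: g((a, c)) = (f(a), c).
Since f is one-to-one into B, g is easily seen to be one-to-one.
We show that g respects the order:
– Suppose (u, v) ≤A×C (s, t).
– If u = s, then v ≤C t and so (f(u), v) ≤B×C (f(u), t) = (f(s), t).
– If u <A s, then f(u) <B f(s) and so (f(u), v) <B×C (f(s), t).
– So g respects the order.
Finally, the image g[A × C] = B ∗ × C is a proper initial segment of B × C:
if b is the least element of B − B ∗ and c0 is the least element of C (which
exists since γ > 0), then B ∗ × C is the set of all elements of B × C which
are strictly below (b, c0 ). Hence γα < γβ.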
c) The left-distributive property proof is left as an exercise.
d) We are given that γ > 0 and γα = γβ. We are required to show that
α = β.
Suppose not. Suppose, without loss of generality that α < β. Then by
part (b), γα < γβ, a contradiction. So α = β, as required.
e) We are required to show that γ × 0 = 0.
Simply note that γ × 0 = ord({ } × C) = ord{ } = 0.
f) We are given that β is a limit ordinal not equal to 0. We are required
to show that αβ = lub {αγ : γ < β}.
We claim that αβ is an upper bound of the set {αγ : γ < β}:
– Note that γ < β ⇒ αγ < αβ (by part (b)). So αβ is an upper bound
of the set {αγ : γ < β} as claimed.
We claim that if δ is an upper bound of the set {αγ : γ < β}, then
αβ ≤ δ:
– Let δ be an upper bound of the set {αγ : γ < β}.
– We claim that δ ≥ α lub {γ : γ < β}:
∗ Suppose δ < α lub {γ : γ < β}.
δ < α lub {γ : γ < β} ⇒ δ ∈ α ∪{γ : γ < β}
                      ⇒ δ ∈ αγ for some γ < β
                      ⇒ δ < αγ
∗ But δ < αγ contradicts the fact that δ is an upper bound of the
set {αγ : γ < β}.
∗ So δ ≥ α lub {γ : γ < β} as claimed.
– Since β is a limit ordinal, then β = lub {γ : γ < β}.
– So δ ≥ αβ as claimed.
Combining the two claim statements, we obtain αβ = lub {αγ : γ < β}.
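
The order check made in part (a) can be tried out on small finite sets. The following is only a sketch under stated assumptions: the sets A, B and C are hypothetical finite stand-ins, and Python's built-in tuple comparison (first coordinate dominant) plays the role of the lexicographic ordering used in the proof.

```python
from itertools import product

# Hypothetical small ordered sets standing in for A, B and C.
A, B, C = range(2), range(3), range(2)

def regroup(t):
    """The map f(a, (b, c)) = ((a, b), c)."""
    a, (b, c) = t
    return ((a, b), c)

# Python compares tuples lexicographically, first coordinate first,
# so nested tuples carry the two orderings compared in the proof.
left = [(a, (b, c)) for a, b, c in product(A, B, C)]

# f respects the ordering when s < t exactly when f(s) < f(t).
order_respected = all(
    (s < t) == (regroup(s) < regroup(t)) for s in left for t in left
)
print(order_respected)
```

Since regroup is one-to-one and onto, a value of True means that it is an order isomorphism between these two finite products.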

35.4 Examples.
When performing ordinal arithmetic, it is not always obvious how to simplify
expressions. In the following examples, we show how some of the properties
shown above can be used to simplify expressions.
a) By mathematical induction, show that for any ordinal α and finite ordinal
n > 0,
αn = α + α + · · · + α (n times)
Solution: Induction on the natural numbers. Let A be a set such that
ord A = α.
Let P (n) denote the statement “αn = α + α + · · · + α (n times)”.
Base case: Since α1 = ord({0} × A) = ord A = α, P (1) holds true.
Inductive step: Suppose P (n) holds true. Then, by left-hand distributivity
of ordinal multiplication, α(n + 1) = αn + α1 = αn + α = α + α + · · · + α (n + 1 times).
By mathematical induction, αn = α + α + · · · + α (n times) for all non-zero
finite ordinals n.
b) Define α + α + α + · · · (countably infinite times) = lub {α, α + α, α + α + α, . . .}.
Show that for any α,

αω0 = α + α + α + · · · (countably infinite times)

Solution:
αω0 = lub {αn : n < ω0 }
= lub {α, α2, α3, α4, . . .}
= lub {α, α + α, α + α + α, . . .}
= α+α+α+··· (Countably infinite times)

c) Show that mω0 = ω0 for any non-zero finite ordinal m.

Solution:
mω0 = lub {mn : n < ω0 }
= ω0

d) We define ω0^2 = ω0 ω0 . Express ω0^2 = ω0 ω0 as an infinite sum illustrating
what this means.
Solution:
ω0 ω0 = lub {ω0 n : n < ω0 }
= lub {ω0 , ω0 + ω0 , ω0 + ω0 + ω0 , . . .}
= ω0 + ω0 + ω0 + · · · (Countably infinite times)
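
The finite sums in example (a) can be experimented with for ordinals below ω0 ω0. The sketch below is not part of the original text. It assumes that every such ordinal can be written as ω0 a + b with a and b finite, and it represents that ordinal by the hypothetical pair (a, b); the names add, times, OMEGA and ONE are illustrative only.

```python
# An ordinal below omega_0 * omega_0 is written omega_0*a + b and stored as (a, b).
OMEGA = (1, 0)
ONE = (0, 1)

def add(x, y):
    """Ordinal sum x + y: a finite tail of x is absorbed by any omega_0 part of y."""
    (a, b), (c, d) = x, y
    return (a + c, d) if c > 0 else (a, b + d)

def times(x, n):
    """x * n computed as x + x + ... + x (n times), as in example (a)."""
    total = (0, 0)
    for _ in range(n):
        total = add(total, x)
    return total

print(add(ONE, OMEGA))      # 1 + omega_0
print(add(OMEGA, ONE))      # omega_0 + 1
print(times(OMEGA, 3))      # omega_0 * 3
print(times((1, 1), 2))     # (omega_0 + 1) * 2
```

Python's tuple comparison orders these pairs exactly as the corresponding ordinals are ordered, so add(OMEGA, ONE) > OMEGA while add(ONE, OMEGA) == OMEGA, which illustrates that ordinal addition is not commutative.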

35.5 Some ordinal comparisons.

The following table may be of some help in seeing how countably infinite
ordinals are ordered. The ordinals are increasing with respect to ∈ as we
go down the table. All the ordinals in the third column are limit ordinals.
There is no end to this process, so there is an unlimited source of ordinal
numbers.
0          ≤  0, 1, 2, 3, . . .                                              <  ω0
ω0         ≤  ω0 , ω0 + 1, ω0 + 2, ω0 + 3, . . .                             <  ω0 + ω0
ω0 2       ≤  ω0 2, ω0 2 + 1, ω0 2 + 2, ω0 2 + 3, . . .                      <  ω0 2 + ω0
ω0 3       ≤  ω0 3, ω0 3 + 1, . . . , ω0 4, . . . , ω0 5, . . .              <  ω0 ω0 = ω0^2
ω0^2       ≤  ω0^2 , ω0^2 + 1, . . . , ω0^2 + ω0 , . . .                     <  ω0^2 + ω0 ω0 = ω0^2 + ω0^2 = ω0^2 · 2
ω0^2 · 2   ≤  ω0^2 · 2, ω0^2 · 2 + 1, . . . , ω0^2 · 3, . . . , ω0^2 · 4, . . .   <  ω0^2 ω0 = ω0^3
ω0^3       ≤  ω0^3 , ω0^3 + 1, . . . , ω0^4 , . . . , ω0^5 , . . . , ω0^6 , . . .   <  ω0^ω0
ω0^ω0      ≤  ω0^ω0 , ω0^ω0 + 1, . . . , (ω0^ω0 )^ω0 , . . . , ((ω0^ω0 )^ω0 )^ω0 , . . .   <  ω1 , . . .

Many of the ordinals listed above may seem incredibly large. In spite of this,
note that every one of these ordinals is countable!
Observe that we go from one limit ordinal to the next by adding a countably
infinite set, remembering (from Theorem 19.6), that adding a countably in-
finite set to some infinite set S does not change the cardinality of S. We see
this in more detail in the following table.
ω0 → ω0 2 → ω0 3 → · · · → ω0^2
→ ω0^2 + 1 → ω0^2 + 2 → · · · → ω0^2 + ω0
→ ω0^2 + ω0 + 1 → ω0^2 + ω0 + 2 → · · · → ω0^2 + ω0 2
→ ω0^2 + ω0^2 = ω0^2 · 2 → · · · → ω0^2 ω0 = ω0^3
→ ω0^3 + 1 → ω0^3 + 2 → · · · → ω0^4

Recall that all of these are elements of the first uncountable ordinal, ω1
(defined as the least ordinal which cannot be embedded in ω0 ). The ordinal
ω1 is uncountable. It is not constructed (other than being the union of all
countable ordinals), but its existence is guaranteed by Hartogs’ lemma (see
Theorems 28.9, 28.10 and 28.11).

35.6 Ordinal Exponentiation.

We now define, by transfinite recursion, exponentiation of ordinals.

Definition 35.6 Let γ be any non-zero ordinal. We define the γ-based expo-
nentiation function gγ : O → O as follows:
1) gγ (0) = 1
2) gγ (α+ ) = gγ (α)γ
3) gγ (α) = lub{gγ (β) : β < α} whenever α is a limit ordinal.
Whenever γ ≠ 0 we represent gγ (α) as γ^α . Then γ^(α+1) = γ^α γ. If γ = 0 we
define γ^α = 0^α = 0.
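
As a small worked illustration of the definition (a sketch that is not part of the original text; it assumes the identity 1γ = γ, which appears as an exercise at the end of this section), the successor clause gives the first finite powers and the limit clause gives the power of 2 at ω0:

```latex
\begin{align*}
\gamma^{0} &= 1, \qquad
\gamma^{1} = \gamma^{0}\gamma = 1\gamma = \gamma, \qquad
\gamma^{2} = \gamma^{1}\gamma = \gamma\gamma, \\
2^{\omega_0} &= \operatorname{lub}\{2^{n} : n < \omega_0\}
             = \operatorname{lub}\{1, 2, 4, 8, \ldots\}
             = \omega_0 .
\end{align*}
```

In particular the ordinal power of 2 at ω0 is a countable ordinal, so ordinal exponentiation should not be confused with cardinal exponentiation.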
Many of the principles used in the computation of expressions involving
exponentiation of natural numbers extend to large ordinal numbers. We begin
with expressions involving inequalities.

Theorem 35.7 Let α, β and γ be three ordinal numbers. Then, assuming γ > 1,
α < β ⇔ γ^α < γ^β

Proof: In what follows, 1 < γ.

( ⇒ ) Proof by transfinite induction. Let P (β) denote the statement “α <
β ⇒ γ^α < γ^β ”. Then P (0) holds (vacuously) true. Suppose P (δ) holds true
for all δ < β. That is, suppose α < δ ⇒ γ^α < γ^δ for any δ < β.
Case 1 : β is a successor ordinal. That is, β = µ + 1 for some ordinal µ.
Suppose δ < β = µ + 1. It suffices to show that γ^δ < γ^β .

δ = µ ⇒ γ^δ = γ^µ
      ⇒ γ^δ (1) < γ^µ γ                              (By Theorem 35.5)
      ⇒ γ^δ < γ^(µ+1) = γ^β                          (By definition)
δ < µ ⇒ γ^δ < γ^µ                                    (By induction hypothesis)
      ⇒ γ^δ < γ^µ = γ^µ (1) < γ^µ γ = γ^(µ+1) = γ^β

In both cases, δ < β implies γ^δ < γ^β .

Case 2 : β is a limit ordinal. That is, β = lub{δ : δ < β}. We are given that
δ < µ < β ⇒ γ^δ < γ^µ . Suppose δ < β. It suffices to show that γ^δ < γ^β .

δ < β ⇒ δ < δ + 1 < β
      ⇒ γ^δ < γ^(δ+1) ≤ lub{γ^µ : µ < β} = γ^β       (By (3) in Definition 35.6)
      ⇒ γ^δ < γ^β

In both cases 1 and 2, δ < β implies γ^δ < γ^β .

( ⇐ ) Suppose γ^δ < γ^β . If β < δ, then by the first part above we have both
γ^β < γ^δ and γ^δ < γ^β , a contradiction. Then δ ≤ β. Since δ = β implies
γ^δ = γ^β , we must have δ < β, as required.

Theorem 35.8 Let α, β and γ be three ordinal numbers where α ≠ 0.

a) γ^β γ^α = γ^(β+α)
b) (γ^β )^α = γ^(βα)
Proof:
a) Proof by transfinite induction. Let P (δ) denote the statement “γ^β γ^δ =
γ^(β+δ) ”. Then P (0) holds true. Suppose P (δ) holds true for all δ < α.
That is, suppose γ^β γ^δ = γ^(β+δ) whenever δ < α.
Case 1 : α is a successor ordinal. That is, α = µ + 1 for some ordinal µ.
Then

γ^β γ^α = γ^β γ^(µ+1)
        = γ^β γ^µ γ
        = γ^(β+µ) γ              (By the inductive hypothesis)
        = γ^((β+µ)+1)
        = γ^(β+(µ+1))
        = γ^(β+α)

Case 2 : α is a limit ordinal. That is, α = lub{δ : δ < α}. We are
given that γ^β γ^δ = γ^(β+δ) whenever δ < α. We are required to show that
γ^β γ^α = γ^(β+α) . We know that γ^α = lub{γ^δ : δ < α}.

We claim that γ^α must be a limit ordinal: Suppose not. Then γ^α = µ + 1
for some ordinal µ. Then µ < γ^δ for some δ < α (for, if not,
µ < γ^α = lub{γ^δ : δ < α} ≤ µ, a contradiction). Then (by Theorem 35.7)
µ + 1 ≤ γ^δ < γ^α = µ + 1, a contradiction. So γ^α is a limit
ordinal as claimed.

We claim that γ^(β+α) ≤ γ^β γ^α :

By Definition 35.6 (3), γ^(β+α) = lub{γ^δ : δ < β + α}
(“α is a limit ordinal” ⇒ “β + α is a limit ordinal”).

δ < β + α ⇒ δ < β + µ < β + α for some µ < α
          ⇒ γ^δ < γ^(β+µ) = γ^β γ^µ < γ^β γ^α     (By induction hypothesis and Theorem 35.7)

Hence γ^(β+α) = lub{γ^δ : δ < β + α} ≤ γ^β γ^α , as claimed.

We claim that γ^β γ^α ≤ γ^(β+α) :

Since γ^α is a limit ordinal, γ^β γ^α = lub{γ^β δ : δ < γ^α }.     (Theorem 35.5 (f))

δ < γ^α ⇒ δ < γ^µ < γ^α for some µ < α
        ⇒ γ^β δ < γ^β γ^µ = γ^(β+µ) < γ^(β+α)     (By induction hypothesis)

Then γ^β γ^α = lub{γ^β δ : δ < γ^α } ≤ γ^(β+α) , as claimed.

We conclude that γ^β γ^α = γ^(β+α) , as required.

b) This can be proved by transfinite induction as in part (a). Let P (δ)
denote the statement “(γ^β )^δ = γ^(βδ) ”. Then P (0) holds true. Suppose
P (δ) holds true for all δ < α. That is, suppose (γ^β )^δ = γ^(βδ) whenever
δ < α. As in part (a) consider the two cases, (1) α has an immediate
predecessor, and (2) α is a limit ordinal, separately.
The details are left as an exercise.
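
The two rules of Theorem 35.8 can be seen at work in a short computation. This is a sketch that is not part of the original text; it assumes the value 2^{ω0} = ω0, which follows from the limit clause of Definition 35.6, and the identity ω0 2 = ω0 + ω0 from the examples in 35.4.

```latex
\begin{align*}
2^{\omega_0 + \omega_0} &= 2^{\omega_0}\, 2^{\omega_0}
   = \omega_0\, \omega_0 = \omega_0^{2}
   && \text{(Theorem 35.8 (a))} \\
(2^{\omega_0})^{2} &= 2^{\omega_0 2} = 2^{\omega_0 + \omega_0}
   && \text{(Theorem 35.8 (b))}
\end{align*}
```

Both lines describe the same ordinal, the square of ω0, reached once through the sum rule and once through the product rule.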

Concepts review:
1. When defining multiplication of ordinals, what kind of ordering is
used on the Cartesian product of the ordinals being multiplied?
2. Define the multiplication of two ordinals α and β.
3. How does the definition of ordinal multiplication compare with the
definition of cardinal multiplication?
4. Construct a set representation for the ordinal product ω0 2.
5. How does the ordinal ω0 3 compare with the ordinal 3ω0 ?
6. Is left-hand distribution acceptable in ordinal multiplication?
7. If we start with the countably infinite ordinal ω0 and gradually in-
crease its ordinality, one at a time, by an endless process of ordinal
addition and multiplication, can we reach ω1 with this process?

EXERCISES

A. 1. Simplify or describe the following expressions:

a) ω0 + ω0^2
b) (ω0 + 2) + ω0
c) ω0 2 + 1
2. Which ordinal is larger, 3ω0 or ω0 3? Explain.
3. Does the ordinal ω0^2 + 3ω0 have an immediate predecessor? If it does,
describe it.
4. Find two ordinals whose ordinal product is ω0 + ω0^2 .

B. 5. Prove that if δ < αβ, then δ = µγ for some µ ≤ α and some γ ≤ β.

6. Show that (ω0 + ω0 )ω0 = ω0 ω0 .

C. 7. Prove that α1 = 1α = α.
8. Prove that if α and β are both finite ordinals, then αβ = βα.
9. Prove that if αβ = 0, then either α or β is 0.
10. Prove that γ(α + β) = γα + γβ.
Part XI

Appendix
A / Boolean algebras and Martin’s axiom

Abstract. The content of this section is intended for readers with some
background in point-set topology. The material is presented in a denser
form: many proofs are left as exercises or are only outlined. We discuss
those partially ordered sets called lattices and Boolean algebras. Any
Boolean algebra B is then shown to have a topological representation
B(S (B)). Finally, characterizations of Martin’s axiom are given in terms
of these concepts.1

A.1 Lattices.
Given a partially ordered set (P, ≤) there may be pairs of elements a, b in P
such that a and b do not have a common upper bound or a common lower
bound in P . Those partially ordered sets in which every pair of elements
has a least upper bound and a greatest lower bound play an important role
in mathematics. We refer to these sets as lattices.

Definition 1.1 A partially ordered set (P, ≤) is called a lattice if a ∨ b =
lub{a, b} and a ∧ b = glb{a, b} both exist in P for all pairs a, b in P .

Definition 1.2 If B is a subset of a partially ordered set (P, ≤), ∨B denotes
the least upper bound of B and ∧B denotes the greatest lower bound of B
(both with respect to ≤). Note that ∨B and ∧B may or may not be elements
of B. A lattice (P, ≤) is said to be a complete lattice if, for any non-empty
subset B of P , both ∨B and ∧B exist and belong to P .

Examples of lattices.

1) Let X be a set and let (P(X), ⊆) be its power set, partially ordered by inclusion.
Let A and B be any pair of subsets of X (read, A, B ∈ P(X)). We define
A ∨ B = A ∪ B and A ∧ B = A ∩ B. For B ⊆ P(X), ∨B = ∪{A ∈
P(X) : A ∈ B} (read, “∨B is the union of all subsets, A, of X such that
A ∈ B ”). Also define ∧B = ∩{A ∈ P(X) : A ∈ B} (read, “∧B is the
intersection of all subsets, A, of X such that A ∈ B ”). In this case, with
∨ and ∧ defined as ∪ and ∩, respectively, (P(X), ⊆, ∪, ∩) is a complete
lattice.
1 The content is a rough outline of what appears in more detail in the book Point-Set
Topology with Topics, World Scientific Publishing, 2024, by R. André.

2) Let τ (X) be a topology on the set X. Let A and B be any pair of
open subsets of X (read, A, B ∈ τ (X)). We define A ∨ B = A ∪ B and
A ∧ B = A ∩ B. For B ⊆ τ (X), ∨B = ∪{A ∈ τ (X) : A ∈ B}. Also
define ∧B = intX (∩{A ∈ τ (X) : A ∈ B}). In this case, with ∨ and ∧
defined as ∪ and ∩, respectively, (τ (X), ⊆, ∪, ∩) is a complete lattice in
(P(X), ⊆).
3) Let X be a topological space. Let B(X) = {A ⊆ X : A is clopen in X}2 be
partially ordered by inclusion ⊆. If A and B are clopen subsets of X,
we define A ∨ B = A ∪ B and A ∧ B = A ∩ B. Then (B(X), ⊆, ∪, ∩)
is a lattice in (P(X), ⊆). But it is not necessarily complete. Verify that
(B(Q), ⊆, ∪, ∩) is not a complete lattice.
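
Example 1 can be checked by brute force on a small set. The sketch below is only an illustration under stated assumptions: the three-element set X is a hypothetical choice, and the helper names powerset, is_lub and is_glb are not taken from the text.

```python
from itertools import combinations

X = frozenset({1, 2, 3})  # hypothetical small set

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

P = powerset(X)

def is_lub(candidate, family):
    """True when candidate is the least upper bound of family under inclusion."""
    uppers = [u for u in P if all(a <= u for a in family)]
    return candidate in uppers and all(candidate <= u for u in uppers)

def is_glb(candidate, family):
    """True when candidate is the greatest lower bound of family under inclusion."""
    lowers = [low for low in P if all(low <= a for a in family)]
    return candidate in lowers and all(low <= candidate for low in lowers)

# Lattice: the union and intersection of a pair are its lub and glb.
pairs_ok = all(is_lub(a | b, [a, b]) and is_glb(a & b, [a, b])
               for a in P for b in P)

# Completeness: the same holds for every non-empty family of subsets.
families = [fam for r in range(1, len(P) + 1) for fam in combinations(P, r)]
complete_ok = all(is_lub(frozenset().union(*fam), fam) and
                  is_glb(X.intersection(*fam), fam) for fam in families)
print(pairs_ok, complete_ok)
```

On a finite set every lattice is complete, so the second check adds nothing new here; it is included only to show what the condition in Definition 1.2 asks for.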

Definition 1.3 Let X be a topological space. A subset B is said to be regular

open in X if B = intX (clX (B)). The set of all regular open subsets of X will
be denoted as Ro(X).

For example, (−5, 0) ∪ (0, 5) is open in R but not regular open in R.

A particular example of a complete lattice. Let X be a topological space with

a Hausdorff topology τ (X). By definition, Ro(X) ⊆ τ (X). For the partially
ordered set (Ro(X), ⊆) we define ∧ and ∨ as follows: For A, B ∈ Ro(X),

A∧B = A∩B
A∨B = intX (clX (A ∪ B))

If we want to form a lattice (Ro(X), ⊆, ∨, ∩) in the partially ordered set
(τ (X), ⊆), we cannot define A ∨ B as A ∪ B since it is not true, in general,
that
intX (clX (A)) ∪ intX (clX (B)) ∈ Ro(X)
To see this, consider, for example, the open intervals A = (a, b) and B =
(b, c), both elements of Ro(R). See that

intR (clR (A ∪ B)) = (a, c) ≠ (a, b) ∪ (b, c) = A ∪ B

Hence Ro(X) is not closed with respect to the union, ∪, of finitely many
sets. The following statement confirms that (Ro(X), ⊆, ∨, ∩) is a complete
lattice in the partially ordered set (τ (X), ⊆).
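
The failure of ∪ to act as a join on regular open sets can be reproduced in a finite topological space. The sketch below rests on assumptions that are not in the text: the three-point space and its topology are hypothetical, and the space is not Hausdorff, so it only mimics the behaviour of the intervals (a, b) and (b, c), with the point 1 playing the role of the shared endpoint b.

```python
X = frozenset({0, 1, 2})
# Hypothetical topology on X; it is closed under unions and intersections.
tau = [frozenset(), frozenset({0}), frozenset({2}), frozenset({0, 2}), X]

def interior(s):
    """Union of all open sets contained in s."""
    return frozenset().union(*[u for u in tau if u <= s])

def closure(s):
    """Complement of the interior of the complement."""
    return X - interior(X - s)

def is_regular_open(s):
    return s in tau and interior(closure(s)) == s

A, B = frozenset({0}), frozenset({2})
print(is_regular_open(A), is_regular_open(B))   # both regular open
print(is_regular_open(A | B))                   # the union is open but not regular open
print(interior(closure(A | B)) == X)            # the join A ∨ B is all of X
```

Here A ∪ B is open but its closure picks up the point 1, so the interior of the closure is strictly larger than A ∪ B, just as (a, c) is strictly larger than (a, b) ∪ (b, c).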

Theorem 1.4 Let X be a topological space. Then (Ro(X), ⊆, ∨, ∩) is a com-

plete lattice in (τ (X), ⊆).
2 A set is “clopen” if it is simultaneously open and closed in X.
Proof : It immediately follows from our definition of ∨ that, for A, B ∈ Ro(X),

A ∨ B ∈ Ro(X). We leave it as an exercise to show that

A ∩ B = intX (clX (A)) ∩ intX (clX (B)) = intX (clX (A ∩ B)) ∈ Ro(X)

Suppose D ⊆ Ro(X). Showing that

∩{B : B ∈ D} = intX (clX (∩{B : B ∈ D})) ∈ Ro(X)

∨{B : B ∈ D} = intX (clX (∪{B : B ∈ D})) ∈ Ro(X)

is also left as an exercise. Then (Ro(X), ⊆, ∨, ∩) is a complete lattice in

(τ (X), ⊆) as required.

The partially ordered set Ro(X) forms a base for a topology on X: since
∅, X ∈ Ro(X) and Ro(X) is closed under finite intersections, Ro(X)
generates some topology τ ∗ (X) ⊆ τ (X). If (X, τ (X)) is assumed to be
Hausdorff, τ (X) separates points of X; it then easily follows that Ro(X)
also separates points of X. Hence (X, τ ∗ (X)) is Hausdorff.3

A.2 Lattice filters.

We will consider a lattice (L, ⊆, ∨, ∧). A subset F of (L, ⊆, ∨, ∧) is called an
L-filter if
1) F is non-empty,
2) F is such that, for non-empty A, B ∈ F there exists D ≠ ∅ such that
D ≤ A ∧ B ∈ F.
3) F is such that, if A ∈ F and A ≤ C ∈ L, then C ∈ F .
When only conditions 1 and 2 are satisfied we say that F is an L-filter base.
We say that the L-filter base, F , generates the L-filter, F ↑ . If F ∈ L, then
{F }↑ is the L-filter generated by the singleton {F }. A filter F is said to be
a proper L-filter if F ≠ L. A filter F is proper if and only if ∅ ∉ F .

Definition 1.5 Let (L, ≤, ∨, ∧) be a lattice. An L-ultrafilter is a proper filter
F in L which is not properly contained in any other proper filter in L. If the
filter F is such that ∩{F : F ∈ F } ≠ ∅, then we say that the filter F is a
fixed ultrafilter. Ultrafilters which are not fixed are said to be free ultrafilters.

3 Given a topological space (X, τ (X)), the space (X, τ ∗ (X)) is referred to as the
semiregularization of (X, τ (X)).

Theorem 1.6 Let X be a topological space.

a) Suppose F is a proper L-filter where (L, ⊆, ∨, ∧) is a lattice in (P(X), ⊆).

Then F can be extended to an L-ultrafilter.
b) Suppose F is an L-filter in (L, ⊆, ∨, ∧) a lattice in (P(X), ⊆). Then F is
an L-ultrafilter if and only if for every A ⊆ X, either A ∈ F or X −A ∈ F .

Proof :
a) Let F be a proper L-filter.
Let H = {M : M is a proper L-filter such that F ⊆ M }. We partially order
H with ⊆. Let C be a chain in (H , ⊆). Then ∪{C : C ∈ C } is an upper
bound of C with respect to ⊆. So every chain in H has an upper bound.
By Zorn’s lemma, (H , ⊆) has a maximal element. That is, H contains a
filter, F ∗ , which is not properly contained in any other filter. Since F ∗ ∈ H ,
F ⊆ F ∗ . Then F can be extended to an L-ultrafilter, as required.

b) ( ⇒ ) We are given that F is an L-ultrafilter in L ⊆ P(X) and that

A ⊆ X. Suppose neither A nor X − A belongs to F . Let

H = {B ∈ L : A ∩ F ⊆ B for some F ∈ F }

Clearly, F ⊆ H and A ∈ H − F . Then F is a proper subset of H . Note
that ∅ ∉ H for, if it was, A ∩ F = ∅ for some F ∈ F and so F ⊆ X − A which
would imply X − A ∈ F , a contradiction. It is a straightforward exercise to
show that H is closed under finite applications of ∧. Hence H is an L-filter
base which will generate a filter H ∗ . Then F ⊂ H ∗ which contradicts the
fact that F is an L-ultrafilter.
( ⇐ ) Conversely, suppose that for every A ⊆ X, either A ∈ F or X − A ∈ F .
We are required to show that F is an ultrafilter. Suppose not. That is, sup-
pose that there exists a proper filter, H , such that F ⊂ H . Then there exists
some non-empty A such that A ∈ H − F . Then X − A cannot belong to F , for, if
it did, then A ∩ (X − A) = ∅ must belong to H , contradicting the fact that
H is a proper filter. So neither A nor X − A belongs to F , contrary to our
hypothesis. Hence F must be an L-ultrafilter.
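
Part (b) can be confirmed by exhaustive search when L = P(X) for a small X. The following is a sketch under stated assumptions: the three-element set X is hypothetical, every family of subsets is tested directly against the filter conditions, and the helper name is_filter is illustrative only.

```python
from itertools import combinations

X = frozenset({0, 1, 2})  # hypothetical small set
P = [frozenset(c) for r in range(4) for c in combinations(sorted(X), r)]

def is_filter(F):
    """Non-empty, closed under intersection, and closed upward in P(X)."""
    return (len(F) > 0
            and all(a & b in F for a in F for b in F)
            and all(c in F for a in F for c in P if a <= c))

candidates = [frozenset(fam) for r in range(1, len(P) + 1)
              for fam in combinations(P, r)]
proper_filters = [F for F in candidates
                  if is_filter(F) and frozenset() not in F]

# Ultrafilters: proper filters not properly contained in another proper filter.
ultrafilters = [F for F in proper_filters
                if not any(F < G for G in proper_filters)]

# Proper filters satisfying "A in F or X - A in F" for every subset A of X.
dichotomy = [F for F in proper_filters
             if all((A in F) or (X - A in F) for A in P)]
print(len(proper_filters), len(ultrafilters), ultrafilters == dichotomy)
```

For this X the search finds one proper filter for each non-empty subset of X, and the ultrafilters are exactly the three filters of the form {A ⊆ X : x ∈ A}, all of which are fixed.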

In the next theorem we investigate the particular case of a Ro(X)-ultrafilter,

of the (Ro(X), ⊆, ∨, ∩) lattice.

Theorem 1.7 Let X be a topological space. Then (Ro(X), ⊆, ∨, ∩) is a com-

plete lattice in (τ (X), ⊆). An Ro(X)-filter, F , is an Ro(X)-ultrafilter if and
only if, for any A ∈ Ro(X), either A or X − clX (A) belongs to F .
Proof :
( ⇒ ) Suppose F is an Ro(X)-ultrafilter and A ∈ Ro(X).
Case 1: Suppose F ∩ A = ∅ for some F ∈ F . Then (since F is open)
F ⊆ X − clX A. Since
X − clX A = intX (X − A)
          = intX (X − intX clX A)
          = intX clX (X − clX A),
F ⊆ X − clX A ∈ Ro(X), which implies X − clX A ∈ F .
Case 2: Suppose F ∩ A ≠ ∅ for all F ∈ F . Suppose A ∉ F . Let
H = {B ∈ Ro(X) : A ∩ F ⊆ B for some F ∈ F }. Then F ⊆ H and A ∈ H − F . As
shown in the theorem above, H is a filter base which generates a filter H ∗
in (Ro(X), ⊆). Since F ⊂ H ∗ this contradicts the fact that F is an Ro(X)-
ultrafilter. So A must belong to F .
( ⇐ ) Suppose that for any A ∈ Ro(X), either A or X − clX (A) belongs to F .
See that the filter F extends to an Ro(X)-ultrafilter F ∗ . Suppose A ∈ F ∗ . If
A ∉ F , then X − clX A ∈ F ⊆ F ∗ implying that A ∩ (X − clX A) = ∅ ∈ F ∗ ,
a contradiction. So A ∈ F . Then F ∗ ⊆ F which implies that F is an Ro(X)-
ultrafilter.

It can similarly be shown that a τ (X)-filter, F , is a τ (X)-ultrafilter if and

only if, for any A ∈ τ (X), either A or X − clX (A) belongs to F .

A.3 Boolean algebras.

We now define certain lattices with special properties.

Definition 1.8 A lattice (L, ∨, ∧) is said to be a distributive lattice if, for
any x, y, and z in L, x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) and x ∧ (y ∨ z) =
(x ∧ y) ∨ (x ∧ z). The lattice (L, ∨, ∧) is said to be a complemented lattice if
it has a maximum element, denoted by 1, and a minimum element, denoted
by 0, and for every x ∈ L there exists a unique x′ such that x ∨ x′ = 1 and
x ∧ x′ = 0. A complemented distributive lattice is referred to as a Boolean
algebra. A Boolean algebra is denoted as (B, ≤, ∨, ∧, 0, 1, ′ ) when we explicitly
want to express what the maximum and minimum elements are.
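
A small arithmetic example may help to fix the two conditions of this definition. The sketch below is not taken from the text: it assumes the hypothetical lattice of divisors of 30 ordered by divisibility, with the least common multiple as ∨, the greatest common divisor as ∧, 1 as minimum element and 30 as maximum element.

```python
from math import gcd

N = 30  # hypothetical choice; 30 is square-free
L = [d for d in range(1, N + 1) if N % d == 0]

def join(x, y):
    return x * y // gcd(x, y)  # least common multiple

def meet(x, y):
    return gcd(x, y)

def complement(x):
    return N // x

# Both distributive laws, checked over every triple of divisors.
distributive = all(
    join(x, meet(y, z)) == meet(join(x, y), join(x, z)) and
    meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
    for x in L for y in L for z in L
)

# Each x has exactly one c with x ∨ c = 30 and x ∧ c = 1, namely 30 / x.
complemented = all(
    [c for c in L if join(x, c) == N and meet(x, c) == 1] == [complement(x)]
    for x in L
)
print(distributive, complemented)
```

Both checks succeed, so under these assumptions the eight divisors of 30 form a Boolean algebra in the sense of the definition.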

Examples. The lattices (P(X), ⊆, ∪, ∩, ∅, X, ′ ) and (B(X), ⊆, ∪, ∩, ∅, X, ′ )
are the simplest examples of Boolean algebras.4 In both cases, A′ = X − A.
The lattice (Ro(X), ⊆, ∨, ∩, ∅, X, ′ ) is also a Boolean algebra, although this
is not at all obvious. To prove this, one must show that, for any A, B, C ∈
Ro(X),
4 Recall that B(X) is the set of elements of P(X) which are clopen.
· X − clX A = intX clX (X − clX A) ∈ Ro(X),

· A ∩ (X− clX A) = ∅
· A ∨ (X− clX A) = intX (clX (A ∪ (X− clX A))) = X
· A ∨ (B ∩ C) = (A ∨ B) ∩ (A ∨ C) and A ∩ (B ∨ C) = (A ∩ B) ∨ (A ∩ C)
Proving these is left as an exercise.

Note that in the case where A ∈ Ro(X), A′ = X − clX A. It is erroneous to
interpret A′ as meaning X − A.

Like lattices (L, ⊆, ∨, ∧) in P(X), a Boolean algebra (B, ≤, ∨, ∧, 0, 1, ′ )
contains B-filters and B-ultrafilters. The following concepts and properties are
slight generalizations of ones we have already seen, so they will seem familiar
to readers.
Boolean filters and ultrafilters. If (B, ≤, ∨, ∧, 0, 1, ′ ) is a Boolean algebra,
then a B-filter, F , is a non-empty subset of B which is closed under finite
applications of ∧ and, for any x ∈ F , (x ≤ y) ⇒ (y ∈ F ). Note that F ⊆ B
and F ∈ P(B). We say that the B-filter, F , is a B-ultrafilter if F is a
proper filter (hence a proper subset of B) and is not properly contained in any
other proper B-filter.
We point out three important B-ultrafilter properties. For any B-ultrafilter,
U ,
− whenever x ∨ y ∈ U , either x ∈ U or y ∈ U .
− if x ∈ B and x ∧ y ≠ 0 for all y ∈ U , then x ∈ U .
− for any x ∈ B, either x ∈ U or x′ ∈ U . To see this, suppose x ∉ U . By the
  previous property, x ∧ y = 0 for some y ∈ U ; then y = y ∧ (x ∨ x′ ) = y ∧ x′ ≤ x′ ,
  and so x′ ∈ U .

Proving these is left as an exercise.

Definition 1.9 Suppose we are given two Boolean algebras (B1 , ≤1 , ∨1 , ∧1 , 0, 1, ′ ) and
(B2 , ≤2 , ∨2 , ∧2 , 0, 1, ′ ) and a function f which maps elements of B1 to elements
of B2 . We say that f : B1 → B2 is a Boolean homomorphism if, for any
x, y ∈ B1 ,
1) f(x ∨1 y) = f(x) ∨2 f(y),
2) f(x ∧1 y) = f(x) ∧2 f(y),
3) f(x′ ) = f(x)′ .
The function f : B1 → B2 is a Boolean isomorphism if f is a bijection, and
both f and f ← are Boolean homomorphisms.
We say that f : B1 → B2 is an order homomorphism if (x ≤1 y) ⇒ (f(x) ≤2

f(y)). It can be shown that:
1) If f : B1 → B2 is a Boolean homomorphism, then f must be an order
homomorphism.
2) If f : B1 → f[B1 ] is a bijective Boolean homomorphism, then f must
be a Boolean isomorphism.
Proving these is left as an exercise.
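
One possible route to statement 1) is sketched below. It is not taken from the text and relies on an assumption stated here explicitly: in any lattice, x ≤ y holds exactly when x ∧ y = x.

```latex
\begin{align*}
x \le_1 y &\iff x \wedge_1 y = x \\
          &\implies f(x) = f(x \wedge_1 y) = f(x) \wedge_2 f(y) \\
          &\implies f(x) \le_2 f(y).
\end{align*}
```

Only the second condition in the definition of a Boolean homomorphism is used, so the same argument applies to any function which preserves ∧.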

Topological representations.
Suppose (B, ≤, ∨, ∧, 0, 1, ′ ) is any Boolean algebra. Let X be some topological
space and let B(X) be the Boolean algebra of all clopen subsets of X.
We will say that the Boolean algebra (B, ≤, ∨, ∧, 0, 1, ′ )
has a topological representation if:

“There exists a Boolean isomorphism f : B → B(X) mapping
B onto B(X) where (B(X), ⊆, ∪, ∩) is the Boolean algebra of all
clopen sets on some topological space X”.

Definition 1.10 Let (B, ≤, ∨, ∧, 0, 1, ′ ) be a Boolean algebra. Let

S (B) = {U ∈ P(B) : U is a B-ultrafilter on B}

That is, S (B) is the set of all B-ultrafilters on B.

We define the function fB : B → P(S (B)), a function which maps each
element of B to a particular set of B-ultrafilters, as follows:

fB (x) = {U ∈ S (B) : x ∈ U }

That is, fB (x) is the set M ∈ P(S (B)) for which x ∈ U ⇔ U ∈ M .

Theorem 1.11 Let (B, ≤, ∨, ∧, 0, 1, ′ ) be a Boolean algebra, let S (B) = {U :
U is a B-ultrafilter} denote the set of all B-ultrafilters on B, and let fB be
the function defined above. Then the set {fB (x) : x ∈ B} is a base for the
open sets of some topology, τ (S (B)), on the set S (B).

Proof :
Let BB = {fB (x) : x ∈ B}. To show that BB forms a base for a topology on
S (B) we must show two things: That BB covers all of S (B) and that BB
is closed under finite intersections.5
We first show that the sets in BB cover all of S (B). Since B-ultrafilters are
proper filters no ultrafilter can contain 0; then fB (0) = ∅ ∈ BB . Also, 1
belongs to all B-ultrafilters, hence fB (1) = S (B) ∈ BB .
5 See Theorem 5.4 of Point-set topology with topics by R. André.

Second, suppose fB (x) and fB (y) belong to BB . Any B-ultrafilter U contains
x ∧ y if and only if it contains both x and y. Then U ∈ fB (x ∧ y) if and only
if U ∈ fB (x) ∩ fB (y). We can then write fB (x ∧ y) = fB (x) ∩ fB (y).
Then the set BB = {fB (x) : x ∈ B} ⊆ P(S (B)) is a base for the open sets
of some topology on S (B).

For a given Boolean algebra (B, ≤, ∨, ∧, 0, 1, ′ ), we can now speak of a topological
space (S (B), τ (S (B))) which is associated to B. Its elements are
B-ultrafilters. When equipped with this topology, the set S (B) is referred
to as the Stone space of B.
We now describe a few properties of the function fB which associates elements of B to
subsets of the topological space S (B). We refer to this important theorem
as the Stone representation theorem.

Theorem 1.12 Let (B, ≤, ∨, ∧, 0, 1, ′ ) be a Boolean algebra.

1) The function fB : B → P(S (B)) is a Boolean homomorphism mapping
B into P(S (B)).
2) For every x ∈ B, fB (x) is clopen in S (B). Hence fB [B] ⊆ B(S (B)) (the
set of all clopen sets in S (B)).
3) The function fB : B → B(S (B)) is a Boolean isomorphism mapping B into
B(S (B)) (the set of all clopen sets in S (B)).
4) The topological space (S (B), τ (S (B))) where τ (S (B)) is the topology
generated by the open base {fB (x) : x ∈ B} is a compact zero-dimensional
Hausdorff topological space.
5) The Boolean isomorphism fB : B → B(S (B)) maps B onto B(S (B)) (the
set of all clopen sets in S (B)).

Proof :
We are given that (B, ≤, ∨, ∧, 0, 1, ′ ) is a Boolean algebra.
1) We have shown in the previous theorem that fB (x ∧ y) = fB (x) ∩ fB (y).
We now show that fB (x ∨ y) = fB (x) ∪ fB (y): If F ∈ fB (x) ∪ fB (y), then
either x or y belongs to F . By definition of a filter, x ∨ y ∈ F , hence
F ∈ fB (x ∨ y). Then fB (x) ∪ fB (y) ⊆ fB (x ∨ y). On the other hand, if
F ∈ fB (x ∨ y), then x ∨ y ∈ F . By a property of B-ultrafilters described
above, either x or y belongs to F . Hence F either belongs to fB (x) or to
fB (y). Then fB (x ∨ y) ⊆ fB (x) ∪ fB (y). So fB (x ∨ y) = fB (x) ∪ fB (y).
We show that fB (x′ ) = fB (x)′ = S (B) − fB (x): We know that fB (x) ∩
fB (x′ ) = fB (x ∧ x′ ) = fB (0) = ∅. We also know that fB (x) ∪ fB (x′ ) =
fB (x ∨ x′ ) = fB (1) = S (B). We conclude that fB (x′ ) = S (B) − fB (x).
From this, we conclude that fB : B → P(S (B)) is a Boolean homomorphism.
2) We are required to show that, for every x ∈ B, fB (x) is both an open
and closed subset of S (B) (with respect to the topology generated by
the base {fB (x) : x ∈ B}). For each x in B, fB (x) is open in S (B)
(since it is an open base element). Since fB (x′ ) = fB (x)′ is open and
fB (x) = S (B) − fB (x)′ (shown above), then fB (x) is also closed. So
every element of {fB (x) : x ∈ B} is clopen.
It follows that fB [B] = {fB (x) : x ∈ B} ⊆ B(S (B)) ⊆ P(S (B)).
3) We have already shown that fB : B → fB [B] ⊆ B(S (B)) is a
homomorphism. To show it is an isomorphism, it suffices to show that fB is
one-to-one.
The function fB is one-to-one: Suppose x and y are distinct points in
B. Since x ≠ y, then one of the two statements x ≤ y, y ≤ x must be
false. We will assume, without loss of generality, that x ≰ y. We define
x − y = x ∧ y′ . We claim that since x ≰ y, then x − y ≠ 0. For suppose
that x ≰ y and x ∧ y′ = 0. See that

x′ = x′ ∨ 0
   = x′ ∨ (x ∧ y′ )
   = (x′ ∨ x) ∧ (x′ ∨ y′ )        (B is distributive)
   = 1 ∧ (x′ ∨ y′ )
   = x′ ∨ y′
   ⇒ y′ ≤ x′
   ⇒ x ≤ y
We have a contradiction. The source of the contradiction is our
supposition that x ∧ y′ = 0. So x − y ≠ 0 as claimed. Let F be the B-filter
{x − y}↑ generated by x − y ∈ B. Then F extends to a B-ultrafilter F ∗ .
Then F ∗ ∈ fB (x − y). But x and y′ are both above x − y = x ∧ y′ and
so must both belong to F ∗ . Then F ∗ ∈ fB (x) ∩ fB (y′ ). Then F ∗ cannot
belong to fB (y) (for if it did, y and y′ would both belong to F ∗ ). Then
fB (x) ≠ fB (y). We conclude that fB is one-to-one on B.
4) Let (S (B), τ (S (B))) be the topological space where τ (S (B)) is the
topology generated by the open base {fB (x) : x ∈ B}. A topological
space is zero-dimensional if it has an open base of clopen sets. We have
shown above that {fB (x) : x ∈ B} is a set of clopen subsets of S (B). So
S (B) is zero-dimensional.
The topological space (S (B), τ (S (B))) is Hausdorff: Let F1 and F2 be
distinct B-ultrafilters in S (B). Then there exists some element x ∈ F2
which does not belong to F1 . Then x′ must belong to F1 . It follows that
F2 ∈ fB (x) and F1 ∈ fB (x′ ). Since x ∧ x′ = 0 implies fB (x) ∩ fB (x′ ) = ∅
it follows that S (B) is a Hausdorff topological space.
The topological space (S (B), τ (S (B))) is compact: To show that S (B)
is compact, it suffices to show that a set {Fi }i∈I of closed subsets which
satisfies the finite intersection property has non-empty intersection. Let
H = {Fi }i∈I be a collection of closed subsets of S (B) which satisfies the
finite intersection property. Then H is a filter base of closed sets. Let H ∗
be the filter of closed subsets which is generated by the filter base H . Let

T = {x ∈ B : F ⊆ fB (x) for some F ∈ H ∗ }

Since {fB (x) : x ∈ B} is a base of clopen sets in S (B), it is a base for
closed sets in S (B) and so every closed set in S (B) is the intersection of
elements in {fB (x) : x ∈ B}. Then ∩{F : F ∈ H ∗ } = ∩{fB (x) : x ∈ T }.
Suppose x, y ∈ T . We claim that x ∧ y ∈ T : There exist Fi and Fj in H ∗
such that Fi ⊆ fB (x) and Fj ⊆ fB (y). Then Fi ∩ Fj ⊆ fB (x) ∩ fB (y) =
fB (x ∧ y). Since Fi ∩ Fj must be non-empty and belong to H ∗ , then
x ∧ y ∈ T , as claimed. From this we deduce that T is a filter in B.
Now T extends to a B-ultrafilter T ∗ . Since x ∈ T ∗ for all x ∈ T , then
T ∗ ∈ ∩{fB (x) : x ∈ T } = ∩{F : F ∈ H ∗ }. Then ∩{F : F ∈ H } ≠ ∅. We
conclude that S (B) is compact, as required.
5) Let A ∈ B(S (B)), the Boolean algebra of all clopen sets in S (B). Since
A is open, then A = ∪{fB (x) : x ∈ K} for some K ⊆ B. Now A is also
closed and so is compact in S (B). The collection {fB (x) : x ∈ K} is
an open cover of A and so A has a finite cover. That is, there is a finite
subset M ⊆ K such that A = ∪{fB (x) : x ∈ M } = fB (∨M ). We
conclude that A ∈ fB [B]. That is, fB maps B onto B(S (B)), as required.
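
For a finite Boolean algebra the whole construction can be carried out explicitly. The sketch below is only an illustration under stated assumptions: B is taken to be the power set of a hypothetical three-element set, and it uses (without proof) the fact that every ultrafilter of a finite Boolean algebra has the form {x ∈ B : p ∈ x} for a single point p.

```python
from itertools import combinations

X = frozenset({0, 1, 2})  # hypothetical small set; B is its power set
B = [frozenset(c) for r in range(4) for c in combinations(sorted(X), r)]

# Assumed description of the Stone space: one ultrafilter U_p per point p of X.
S = [frozenset(x for x in B if p in x) for p in sorted(X)]

def f_B(x):
    """The set of ultrafilters which contain the element x of B."""
    return frozenset(U for U in S if x in U)

# f_B preserves meet, join and complement (the complement of x in B is X - x).
homomorphism = all(
    f_B(x & y) == f_B(x) & f_B(y) and
    f_B(x | y) == f_B(x) | f_B(y) and
    f_B(X - x) == frozenset(S) - f_B(x)
    for x in B for y in B
)

images = {f_B(x) for x in B}
one_to_one = len(images) == len(B)
onto_all_subsets = len(images) == 2 ** len(S)
print(homomorphism, one_to_one, onto_all_subsets)
```

The Stone space here has three points and, being finite and Hausdorff, carries the discrete topology, so every subset of it is clopen; the last check therefore says that f_B maps B onto the Boolean algebra of clopen sets, as in part 5).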

A.4 Characterizations of Martin’s axiom.

We can now present and prove a few characterizations of Martin’s axiom.

Theorem 1.13 Let κ be an infinite cardinal number such that κ < 2^ℵ0 . Then
the following are equivalent:

1) (Martin’s axiom, MA) If (P, ≤) is a partially ordered set satisfying ccc and
D = {Dα : α ≤ κ} is a family of dense subsets of P , then there exists a
filter F on P such that F ∩ Dα ≠ ∅ for each α ≤ κ.
2) If X is a compact Hausdorff topological space satisfying ccc and D = {Dα :
α ≤ κ} is a family of dense open subsets of X, then ∩{Dα : α ≤ κ} ≠ ∅.
3) If (B, ≤, ∨, ∧, ′ ) is a Boolean algebra with the ccc property and D = {Dα :
α ≤ κ} is a family of dense subsets of B, then there exists a filter F on
B such that F ∩ Dα ≠ ∅ for each α ≤ κ.
4) If (P, ≤) is a partially ordered set satisfying ccc and |P | ≤ κ and D =
{Dα : α ≤ κ} is a family of dense subsets of P , then there exists a filter
F on P such that F ∩ Dα ≠ ∅ for each α ≤ κ.

Proof :
We are given that κ is an infinite cardinal number such that κ < 2^ℵ0 .
(1 ⇒ 2): We begin with the trivial case. Suppose X is finite. If X =
{x1 , x2 , . . . , xn }, then every singleton subset of X is clopen and so the only
dense subset of X is X. Hence the intersection of all dense subsets of X is X ≠ ∅.
We are done.
What we are given: Suppose now that X is an infinite set which is com-
pact Hausdorff. Let τ (X) denote the set of all non-empty open sub-
sets of X. Then (τ (X), ⊆) is a partially ordered set of subsets of X.
Suppose X does not contain an uncountable family of pairwise disjoint
open subsets of X. That is, suppose (τ (X), ⊆) satisfies the ccc. Suppose
D = {Dα : α ≤ κ} ⊆ τ (X) where Dα is dense in X (i.e., cl(Dα ) = X).
For each α ≤ κ, we define Uα = {U ∈ τ (X) : cl(U ) ⊆ Dα }. (Since X
is compact Hausdorff and none of the Dα ’s are empty, none of the Uα ’s are
empty.) We are given that MA holds true.
We are required to prove that ∩{Dα : α ≤ κ} ≠ ∅.
Claim: That, for each α, Uα is a dense subset of the partially ordered set
(τ (X), ⊆).
Proof of claim: Suppose M ∈ τ (X). It suffices to show that there is an ele-
ment of Uα which is a subset of M . See that, for any α ≤ κ, M ∩Dα ∈ τ (X)
and so there exists an element x and open set U such that x ∈ U ⊆
cl(U ) ⊆ M ∩ Dα ⊆ Dα . Then U ∈ Uα . We have shown that an element of
Uα is a subset of M . So Uα is a dense subset of (τ (X), ⊆), as claimed.
The set E = {Uα : α ≤ κ} is then a family of dense subsets of (τ (X), ⊆)
satisfying ccc. By Martin’s axiom, (τ (X), ⊆) contains a filter F such that
F ∩ Uα ≠ ∅ for all α ≤ κ. For each α, choose Fα ∈ F ∩ Uα . Since
Fα ∈ Uα , then cl(Fα ) ⊆ Dα . Since F is a filter of non-empty open subsets
of X which satisfies the finite intersection property, then {cl(Fα ) : α ≤ κ}
satisfies the finite intersection property inside compact X. Then there must
be some a ∈ X such that a ∈ ∩{cl(Fα ) : α ≤ κ} ⊆ ∩{Dα : α ≤ κ}. Thus
∩{Dα : α ≤ κ} ≠ ∅. This is what we were required to prove.
(2 ⇒ 3): We are given that (B, ≤, ∨, ∧, ′) is a Boolean algebra with the
ccc property and D = {Dα : α ≤ κ} is a family of dense subsets of B.⁶ We
are required to find a B-filter, F, such that F ∩ Dα ≠ ∅ for all α ≤ κ.
By the Stone representation theorem, there exists an isomorphism fB which
maps B onto B(S(B)), the set of all clopen subsets of the Stone space
S(B), which was shown to be compact and zero-dimensional.
6 Recall that “Dα is dense in B” means “if 0 < x ∈ B there exists d ∈ Dα
such that 0 < d ≤ x”.
The Stone space S(B) satisfies the ccc: Suppose A = {Aα : α ≤ µ, Aα ∈
τ(S(B))} is a strong antichain of non-empty open subsets of S(B) of
cardinality µ. We claim that µ ≤ ℵ0. Since {fB(x) : x ∈ B} is an open
base for S(B), for each α we can choose aα ≠ 0 in B such that
fB(aα) ⊆ Aα. Since the elements of A are pairwise disjoint and fB is
one-to-one, the set {fB(aα) : α ≤ µ} is of cardinality µ. Since fB is an
isomorphism, {aα : α ≤ µ} is an antichain in B of cardinality µ. Since B
is ccc, µ ≤ ℵ0. This establishes the claim.
For each α ≤ κ, let Mα = ∪{fB(d) : d ∈ Dα}. We see that Mα is an open
subset of S(B) (since {fB(x) : x ∈ B} is an open base for the topology
τ(S(B)) on S(B)). Suppose u is an ultrafilter in S(B) − Mα and fB maps
w ∈ B to a basic open neighbourhood fB(w) of u. Then there exists d ∈ Dα
such that 0 < d ≤ w; this implies ∅ ≠ fB(d) ⊆ fB(w) ∩ Mα. So every basic
neighbourhood of u intersects Mα. Then Mα is dense in S(B). So
{Mα : α ≤ κ} is a collection of open dense subsets of the compact space
S(B).
Our hypothesis guarantees that there exists z ∈ ∩{Mα : α ≤ κ}. Let Bz
denote the set of all basic neighbourhoods of S(B) which contain z. Now
Bz satisfies the finite intersection property. Then the set
Fz = fB←[Bz] is a B-filter.
We claim that the subset Fz in B is the B-filter we seek: Choose an
arbitrary α1 ≤ κ. Since z ∈ Mα1 = ∪{fB(d) : d ∈ Dα1}, then z ∈ fB(d)
for some d ∈ Dα1. This implies d ∈ Fz ∩ Dα1. Then the B-filter,
Fz = fB←[Bz], intersects every dense set Dα in B. This concludes the
proof of (2 ⇒ 3).
(3 ⇒ 4): We are given that (P, ≤) is a partially ordered set such that |P | ≤ κ
which satisfies ccc. Also suppose that the elements of D = {Dα : α ≤ κ} are
known to be dense subsets of P . We are required to show that, given our
hypothesis, there exists a P-filter, H, such that H ∩ Dα ≠ ∅ for all α ≤ κ.
We will suppose, without loss of generality, that ∧P does not exist. (We
can do this since, if we prove the existence of a P-filter, H, in
P − {∧P}, then H is a P-filter in P.) If x ∈ P we define “x↓” as
x↓ = {y ∈ P : y ≤ x}
Consider the family BP = {x↓ : x ∈ P}. Note that if a ∈ x↓ ∩ y↓, then
a↓ ⊆ x↓ ∩ y↓, hence BP forms an open base for some topology τ(P) on
P. We have previously shown that the set of all regular open subsets of P,
(Ro(P), ⊆, ∨, ∩, P, ∅, ′), forms a Boolean algebra in
(τ(P), ⊆, ∪, ∩, P, ∅, ′).
We claim that since P satisfies ccc, then so does Ro(P):⁷ Let 𝒜 be an
antichain in Ro(P) and A ∈ 𝒜. Then A = intP clP(A). Since A is open
in P, it is the union of base elements of the form x↓. We can then
choose aA↓ ⊆ A such that intP clP(aA↓) ⊆ intP clP(A) = A. Then there is a
well-defined one-to-one function h : 𝒜 → P such that h(A) = aA (Choice!).
If A, B ∈ 𝒜 are distinct, then A ∩ B = ∅ hence h(A)↓ ∩ h(B)↓ = ∅. We
conclude that h[𝒜] = {aA ∈ P : A ∈ 𝒜} is an antichain in P. Since P
satisfies ccc, h[𝒜] must be countable. Since h is one-to-one, 𝒜 must be
countable. So Ro(P) satisfies ccc, as claimed.
7 The reader is left to verify the following facts: (1) If A ⊆ B then
intX clX(A) ⊆ intX clX(B). (2) intX clX(A) ∩ intX clX(B) = intX clX(A ∩ B).
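For a finite poset the operator x ↦ intP clP(x↓) can be computed directly, which may help in visualizing the construction. The following is only a sketch on a toy poset that does not appear in the text: the non-empty subsets of {0, 1, 2} ordered by inclusion. The names P, down, closure, interior and g are assumptions made for the illustration. It relies on the fact that, in the topology generated by the base {x↓ : x ∈ P}, the open sets are exactly the downward-closed subsets of P.

```python
from itertools import combinations

# Toy poset (an assumption, not from the text): the non-empty subsets
# of {0, 1, 2}, ordered by inclusion.
P = [frozenset(c) for r in range(1, 4) for c in combinations((0, 1, 2), r)]

def down(x):
    """The basic open set x↓ = {y in P : y <= x}."""
    return frozenset(y for y in P if y <= x)

def closure(a):
    """Closed sets are the upward-closed sets, so cl(A) is the
    upward closure of A."""
    return frozenset(p for p in P if any(q <= p for q in a))

def interior(b):
    """Open sets are the downward-closed sets, so p is interior to B
    exactly when p↓ is contained in B."""
    return frozenset(p for p in P if down(p) <= b)

def g(x):
    """The map g(x) = int cl(x↓) from P into Ro(P)."""
    return interior(closure(down(x)))

# Each g(x) is regular open, and g is order preserving.
is_regular_open = all(interior(closure(g(x))) == g(x) for x in P)
is_monotone = all(g(x) <= g(y) for x in P for y in P if x <= y)
print(is_regular_open, is_monotone)
```

In this particular poset g(x) happens to coincide with x↓; in general intP clP(x↓) may be strictly larger than x↓.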
We increase the size of the set D = {Dα : α ≤ κ} of dense subsets of P, as
follows. Let
D∗ = {Dα : α ≤ κ} ∪ {D(x,y) : x, y ∈ P}
where each D(x,y) is a dense subset of P previously defined after definition
33.2.⁸ So every element of D∗ is a dense subset of P. Since |D| ≤ κ, and
|P| ≤ κ implies |{D(x,y) : x, y ∈ P}| ≤ κ, then |D∗| ≤ κ (by Theorem
23.4). We define the function g : P → Ro(P) as g(x) = intP clP(x↓). For
α ≤ κ and x, y ∈ P we define

Aα = g[Dα] = {intP clP(z↓) : z ∈ Dα}

A(x,y) = g[D(x,y)] = {intP clP(z↓) : z ∈ D(x,y)}

Let 𝒜 = {Aα : α ≤ κ} ∪ {A(x,y) : x, y ∈ P}, a family of subsets of
Ro(P).⁹ Let A be an arbitrary element in 𝒜.
We claim that A is dense in Ro(P). We first consider the case where
A is of the form Aα. To see this, let H ∈ τ(P) − {∅} such that H =
intP clP H ∈ Ro(P). Then there exists t ∈ P and a basic element t↓ ∈ τ(P)
such that t ∈ t↓ ⊆ H. Since Dα is dense in P, there exists d ∈ Dα such
that d ≤ t. So intP clP(d↓) ⊆ intP clP(t↓) ⊆ intP clP H = H. We have found
an element intP clP(d↓) in A such that intP clP(d↓) ⊆ H. So A is dense
in Ro(P). If A is of the form A(x,y), the proof that A is dense proceeds
identically. So every element of 𝒜 is dense in Ro(P).
By hypothesis, Ro(P) contains a filter, F, such that F ∩ A ≠ ∅ for all
A ∈ 𝒜. Let g : P → Ro(P) be defined as above, that is,
g(x) = intP clP(x↓), and let
H = g←[F] = {x ∈ P : intP clP(x↓) ∈ F}

The set H intersects all elements of D∗: Consider Dβ ∈ D∗. Then
g[Dβ] = Aβ ∈ 𝒜. There must exist u ∈ F ∩ Aβ. Since u ∈ Aβ = g[Dβ],
u = g(d) for some d ∈ Dβ, and g(d) ∈ F means d ∈ H. So d ∈ Dβ ∩ H. We
now consider D(x,y) ∈ D∗ for some x, y in P. Then g[D(x,y)] = A(x,y) ∈ 𝒜.
There must exist u ∈ F ∩ A(x,y). As before, u = g(d) for some
d ∈ D(x,y) ∩ H. Then H meets all elements of D∗, as claimed.
8 The subset D(x,y) is previously defined in Example 2 following the
Definition ?? as: D(x,y) = V(x,y) ∪ W(x,y) where
V(x,y) = {z ∈ P : z is not compatible with x or with y}
and W(x,y) = x↓ ∩ y↓.
9 The reader is left to verify that, if A is a subset of X, then
intX clX A is regular open in X.

We claim that H is a proper P-filter: The set H is non-empty since
F ∩ Aα ≠ ∅. The set H does not contain a minimum element since P was
assumed not to have one, hence H ≠ P. Suppose x ∈ H and x ≤ y. Then
g(x) = intP clP(x↓) ∈ F and, since F is a Ro(P)-filter and g(x) ⊆ g(y) =
intP clP(y↓), then g(y) ∈ F, hence y ∈ H.
Let x, y ∈ H. We are now required to find q ∈ H such that q ≤ x and
q ≤ y. Since x, y ∈ H then g(x), g(y) ∈ F. Since D(x,y) ∩ H ≠ ∅ there
exists s ∈ D(x,y) ∩ H. Then g(s) ∈ F. It is not possible that
s↓ ∩ x↓ = ∅, for if it was, we would have
∅ = intP clP(s↓ ∩ x↓) = g(s) ∩ g(x) ∈ F, a contradiction. Also, it is
impossible that s↓ ∩ y↓ = ∅, for this would not allow g(s) and g(y) to
both belong to F. So s is compatible with both x and y, hence
s ∉ V(x,y) and so s ∈ W(x,y) = x↓ ∩ y↓. Then s ≤ x and s ≤ y.
We conclude that H is a P-filter.
This concludes the proof of (3 ⇒ 4).
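The role of the sets D(x,y) = V(x,y) ∪ W(x,y) in the argument above can be checked by brute force on a small example. The sketch below is illustrative only and assumes a toy poset that is not from the text (the non-empty subsets of {0, 1, 2} ordered by inclusion); the function names compatible, d_xy and is_dense are hypothetical. It verifies that every D(x,y) is dense, in the sense that each p ∈ P has an element of D(x,y) below it.

```python
from itertools import combinations

# Toy poset (an assumption): non-empty subsets of {0, 1, 2} under inclusion.
P = [frozenset(c) for r in range(1, 4) for c in combinations((0, 1, 2), r)]

def down(x):
    """x↓ = {y in P : y <= x}."""
    return {y for y in P if y <= x}

def compatible(x, y):
    """x and y are compatible when they have a common lower bound in P."""
    return bool(down(x) & down(y))

def d_xy(x, y):
    """D(x,y) = V(x,y) ∪ W(x,y), following footnote 8."""
    v = {z for z in P if not compatible(z, x) or not compatible(z, y)}
    w = down(x) & down(y)
    return v | w

def is_dense(d):
    """d is dense in P when every p in P has an element of d below it."""
    return all(down(p) & d for p in P)

all_dense = all(is_dense(d_xy(x, y)) for x in P for y in P)
print(all_dense)
```

A filter that meets D(x,y) and contains x and y cannot meet it in V(x,y), so it must meet it in W(x,y); this is how the common lower bound s is obtained in the proof.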
(4 ⇒ 1): Let (P, ≤) be a partially ordered set satisfying ccc. Let
D = {Dα : α ≤ κ} be a set of dense subsets of P. We will assume that P
does not have a minimum element. Since Dα is dense in P, for each x ∈ P
there exists q ∈ Dα such that q ≤ x; hence x↓ ∩ Dα ≠ ∅ for all x ∈ P. For
each α we define a function gα : P → Dα as

gα(x) = xα where xα ∈ x↓ ∩ Dα (Choice!)

We also define the function g : P × P → P as

g(x, y) = z where z ∈ x↓ ∩ y↓ if x↓ ∩ y↓ ≠ ∅, g(x, y) arbitrary if x↓ ∩ y↓ = ∅ (Choice!)

Let T0 = {k} for some k ∈ P. We inductively define the elements of
{Tn : n < ℵ0} as follows:

Tn+1 = Tn ∪ g[Tn × Tn] ∪ [∪{gα[Tn] : α ≤ κ}]

and let T = ∪{Tn : n < ℵ0}. Since T ⊆ P, T inherits “≤” from P to form
a partially ordered set (T, ≤). Also, note that since
|{Tn : n < ℵ0}| ≤ ℵ0 and |Tn| ≤ κ, then
|T| = |∪{Tn : n < ℵ0}| ≤ ℵ0 × κ = κ. Also see that
gα[Tn] ⊆ Tn+1 ⊆ T for all n and so gα[T] ⊆ T. Similarly, g[T × T] ⊆ T.
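The inductive construction of T is a closure computation, and on a finite poset it stabilizes after finitely many steps. The sketch below is a hedged illustration, not the construction in the text: the toy poset, the two dense sets in DENSE, and the deterministic helper pick (standing in for the choice functions) are all assumptions.

```python
from itertools import combinations

# Toy poset (an assumption): non-empty subsets of {0, 1, 2} under inclusion.
P = [frozenset(c) for r in range(1, 4) for c in combinations((0, 1, 2), r)]

def down(x):
    return [y for y in P if y <= x]

def pick(candidates):
    """Deterministic stand-in for a choice function."""
    return min(candidates, key=lambda s: (len(s), sorted(s)))

# Two hypothetical dense subsets of P.
DENSE = [
    {p for p in P if len(p) == 1},
    {p for p in P if len(p) <= 2},
]

def g_alpha(alpha, x):
    """g_alpha(x) is a chosen element of x↓ ∩ D_alpha."""
    return pick([y for y in down(x) if y in DENSE[alpha]])

def g(x, y):
    """g(x, y) is a chosen element of x↓ ∩ y↓, or arbitrary (here x)
    when that intersection is empty."""
    common = [z for z in down(x) if z <= y]
    return pick(common) if common else x

def close(start):
    """Iterate T_{n+1} = T_n ∪ g[T_n × T_n] ∪ ⋃ g_alpha[T_n] until stable."""
    t = {start}
    while True:
        nxt = set(t)
        nxt |= {g(x, y) for x in t for y in t}
        nxt |= {g_alpha(a, x) for a in range(len(DENSE)) for x in t}
        if nxt == t:
            return t
        t = nxt

T = close(frozenset({0, 1}))
print(sorted(sorted(s) for s in T))
```

By construction the result is closed under g and under every g_alpha, which is exactly what the later steps of the proof use.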
Let A = {aα : α ≤ λ} be an antichain in T ⊆ P. Then aα↓ ∩ aβ↓ = ∅ in T if
α ≠ β. If there were p ∈ P − T such that p ∈ aα↓ ∩ aβ↓, then
r = g(aα, aβ) ∈ aα↓ ∩ aβ↓. Since g[T × T] ⊆ T, r ∈ T, and so
aα↓ ∩ aβ↓ ≠ ∅ in T, a contradiction. Then aα↓ ∩ aβ↓ = ∅ in P. We conclude
that A is an antichain in P. Since P satisfies ccc, A cannot be
uncountable. Therefore T satisfies ccc.
We claim that, for any given α, T ∩ Dα is dense in T: Let t ∈ T and
α ≤ κ. Then gα(t) = tα ∈ t↓ ∩ Dα. Since gα[T] ⊆ T, tα ∈ T and tα ≤ t.
Since tα ∈ T ∩ Dα we can conclude that T ∩ Dα is dense in T, as claimed.

By hypothesis, there exists a T-filter, F, such that F ∩ (T ∩ Dα) ≠ ∅ for
all α ≤ κ. If q ∈ P let {q}↑ = {x ∈ P : x ≥ q} be the principal P-filter
generated by q. We let F∗ = ∪{{q}↑ : q ∈ F}. It is easily verified that
F∗ is a P-filter which contains F. Since F ⊆ F∗ and F intersects every
Dα, then so does F∗. Then F∗ is the P-filter we were required to find.
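The step from F to F∗ = ∪{{q}↑ : q ∈ F} can be illustrated on a finite example. The sketch below is an assumption-laden illustration: the toy poset and the particular set F are not from the text, and up, upward_closure and is_filter are hypothetical helper names. Here a filter is taken to be a non-empty, upward-closed subset in which any two elements have a common lower bound inside the subset.

```python
from itertools import combinations

# Toy poset (an assumption): non-empty subsets of {0, 1, 2} under inclusion.
P = [frozenset(c) for r in range(1, 4) for c in combinations((0, 1, 2), r)]

def up(q):
    """The principal filter {q}↑ = {x in P : x >= q}."""
    return {x for x in P if q <= x}

def upward_closure(f):
    """F* = union of {q}↑ over q in F."""
    out = set()
    for q in f:
        out |= up(q)
    return out

def is_filter(h):
    """Non-empty, upward closed in P, and any two members of h have
    a common lower bound that is itself in h."""
    if not h:
        return False
    upward = all(y in h for x in h for y in P if x <= y)
    directed = all(any(q <= x and q <= y for q in h) for x in h for y in h)
    return upward and directed

# A filter in the sub-poset T = {{0}, {0, 1}}; it need not be upward
# closed in all of P.
F = {frozenset({0}), frozenset({0, 1})}
F_star = upward_closure(F)
print(is_filter(F), is_filter(F_star))
```

In this example F fails to be a filter in P only because it is not upward closed there; taking the upward closure repairs this without losing any of the elements that witness F ∩ Dα ≠ ∅.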

We finally have the following topological statement which is equivalent to
Martin’s axiom.

Theorem 1.14 Let κ be a cardinal such that ℵ0 ≤ κ < 2^ℵ0. Let X
be a Hausdorff topological space satisfying ccc such that {x ∈ X :
x has a compact neighbourhood} is dense in X. Suppose that D = {Dα :
α ≤ κ} is a family of dense open subsets of X. Then ∩{Dα : α ≤ κ} is dense
in X if and only if Martin’s axiom holds true.

Proof :
( ⇒ ) Let X be a compact Hausdorff space which satisfies ccc where X
contains a family of dense open subsets of X, D = {Dα : α ≤ κ}. Since
X is compact, every point x ∈ X has a compact neighbourhood. By
hypothesis, ∩{Dα : α ≤ κ} is dense in X, so ∩{Dα : α ≤ κ} is not empty.
Then by (2 ⇔ 1) in the previous theorem, Martin’s axiom holds true.
( ⇐ ) Suppose Martin’s axiom holds true. Let X be a Hausdorff
topological space satisfying ccc such that T = {x ∈ X :
x has a compact neighbourhood} is dense in X. Suppose that D = {Dα :
α ≤ κ} is a family of dense open subsets of X (where ℵ0 ≤ κ < 2^ℵ0). We
are required to show that ∩{Dα : α ≤ κ} is dense in X. For any non-empty
open subset U , there exists a point x ∈ U ∩ T and some open neighbourhood
S of x with compact closure, clX S, such that x ∈ clX (S ∩ U ) ⊆ clX S. Since
X satisfies ccc its compact subset clX (S ∩ U ) must also satisfy ccc. For any
Dα ∈ D, Dα ∩ U ∩ S is open and dense in clX (S ∩ U ). By the topological
equivalent form of MA, there exists q ∈ ∩{Dα ∩ U ∩ S : α ≤ κ}. Since
∩{Dα ∩ U ∩ S : α ≤ κ} ⊆ U ∩ (∩{Dα : α ≤ κ}), then q ∈ ∩{Dα : α ≤ κ} ∩ U .
Not only is ∩{Dα : α ≤ κ} non-empty, but it also intersects every open
subset U of X. So ∩{Dα : α ≤ κ} is dense in X.

Bibliography

1. Andre, R. Point-Set Topology with Topics, World Scientific Publishing Co.
Pte. Ltd., 2024.
2. Enderton, H.B. Elements of set theory, Academic Press, Inc., 1977.
3. Fremlin, D.H. Consequences of Martin’s axiom, Cambridge University Press,
London, 1984.
4. Hrbacek, K., Jech, T. Introduction to set theory, Marcel Dekker, Inc., New
York and Basel, 1984.
5. Holz, M., Steffens, K., Weitz, E. Introduction to Cardinal Arithmetic,
Birkhäuser, Basel, 1999.
6. Jech, T. Set Theory: The Third Millennium Edition,
Springer/Sci-Tech/Trade, 2013.
7. Monk, J.D. Introduction to set theory, McGraw-Hill, New York, 1969.
8. Roitman, J. Introduction to modern set theory, John Wiley & Sons, New
York, 1990.
9. Pinter, C. Set theory, Addison-Wesley Publishing Company, 1971.
10. Potter, M. Set Theory and Its Philosophy: A Critical Introduction, Oxford
University Press, 2004.
11. Porter, J. R., Woods, R. G. Extensions and absolutes of Hausdorff spaces,
Springer-Verlag, 1988.
12. Willard, S. General Topology, Addison-Wesley Publishing Company, 1968.
13. Weese, M., Winfried, J. Discovering Modern set theory, American
Mathematical Society, 1998.
Index

<e , 204 N{1,2}, 135

<W O , 272 N{1,2}, 135
A ⊆ B, 13 ¬CH, 222
C = C, 13 ω0 , 308
Gδ -set, 379 ω1 , 303
Re , 196 ∈-least uncountable ordinal, 305
S finite ord
S, 302
⇒ P(S) finite, 181 Q, 154
S/R, 71 is countable, 190
Sx , 64 R is uncountable, 192
[S]e , 197 ∼, 196
ℵ0 , 226 ∼WO , 272
ℵ1 , 327 τ (X), 378, 414
∅, 16 Z, 151
∅ ⊆ C, 16 Z is ctble, 188
is a set, 17 {1, 2}N, 134
,→e , 204 a 6∈ a, 21
,→e∼, 204 ccc, 373
∈, 7 f ← , 96
∈= , 318 P(A), 19
∈= , 128 is a set, 19
κ × 1 = κ, 227 SR , 66
κ1 = κ, 227
≤e , 204 addition, 140
≤W O , 272 aleph notation, 226
≤lex , 132 for infinite cardinals, 325
set(x), 18 antichain, 60, 373
U , 16 antisymmetric relation, 53
E , 204 asymmetric relation, 53
I , 318 Axiom
O, 296 of choice, 9, 336, 340
Ro(X), 405 of class construction, 8
S , 18 of countable choice, 338
W ∗ , 303 of extent, 8
W , 303 of foundation, 350
N×N of infinity, 8
is countable, 190 of pair, 8


of power set, 8 cardinality of κ0 , 227

of regularity, 8, 350 Cartesian products, 39
of replacement, 8 ccc for topological spaces, 374
of subsets, 8 CH, 222
of union, 8 chain, 59, 373
Axiom of Choice characteristic function, 84
Every set is well-orderable, 339 choice function, 337
Axiom of choice class, 7
equivalent form, 339 proper, 7
Zorn’s lemma, 341 which is not element, 15
Axiom of infinity class O
an inductive set exists, 114 is ∈-well-ordered, 296
Axiom of regularity is not a set, 297
x 6∈ x, 351 class axioms, 10
equivalent form, 350 class functions, 84
every set well-founded, 352 class of all sets, S , 18
axiomatic system, 4 class of ordinals, 296
is ∈-linearly ordered, 296
Baire category theorem, 380 closure, 378
belong’s to, 7 coarsest partition, 67
bijective function, 83 codomain of a function, 81
binary relation, 47 comparable relation, 53
Boolean algebra, 408 complement of classes or sets, 28
Boolean homomorphism, 409 complemented lattice, 408
Boolean isomorphism, 409 complete lattice, 404
bounded above, 130 Completeness property, 166
bounded subsets of N composition of relations, 50
contain maxm element, 131 composition of two functions, 88
Burali-Forti paradox, 297 constant function, 83
continuum, 222
canonical decomposition of f, 105 Continuum Hypothesis, 326
Cantor set, 250 Continuum hypothesis, 222
cardinal number, 225 countable
formal definition, 325 image of ctble is ctble, 191
cardinal number addition, 227 countable chain condition, 373
cardinal number exponentiation, 227 countable set
cardinal number multiplication, 227 subset is countable, 188
cardinal number operations, 227 countable sets, 187
cardinality, 225 countable union is ctble, 191
of Rn is c, 247 finite products are countable,
of C is c, 247 199
of J is c, 247 Cumulative hierarchy of sets, 356
of Cantor set is c, 250 cumulative hierarchy of sets, 357
cardinality of 0λ , 227
cardinality of ∅, 226 De Morgan’s laws, 31, 33

Dedekind cut, 166 image, 80

dense in a poset, 374 preimage, 81
dense subset of X, 379 restriction to a set, 82
difference of two classes, 28
disjoint, 28 Generalized Continuum Hypothesis,
distributive lattice, 408 326
domain of a function, 81 Generalized continuum hypothesis,
domain of a relation, 49 222
doubleton, 17
doubleton set, 20 Hartogs number of a set, 306
Hartogs’ lemma, 304
element, 7 Hausdorff definition, 39
a 6∈ a, 21, 61 hierarchy of sets, 356
all sets are, 17
embedded, properly, 204 IdS , 56
embeds, 174 identity, 49
empty class, 16 identity function, 90
empty set identity relation, 56
is finite, 175 identity relation on, 49
equinumerosity, 196 image of a function, 80
equipotence, 196 image of a relation, 49
induces equivalence classes, 197 immediate predecessor, 281
is an equivalence relation, 196, of a natural number, 122
204 immediate successor, 281
equipotent increasing function, 268
2S and P(S) are, 206 induction, 116
P(N) and R are, 207 principle of, 116
equipotent sets, 186 induction over the ordinals, 298
equivalence class of x under R, 70 inductive set, 114
equivalence relation, 55 infinite ordinal
partitions S, 64 if 6= ω contains ω, 284
equivalence relation determined by infinite set, 174
f, 102 initial ordinal, 318
Euclid geometry, 5 initial ordinals
Hilbert’s 20 axioms, 5 properties, 324
initial segment
fiber, 97 of O is an ordinal, 297
filter, 406 of a linearly ordered set, 265
filter in a poset, 376 injective function, 83
finer, 75 integers, 151
finest partition, 67 intersection of classes or sets, 27
finite intersection property, 375 inverse of a function, 91
finite set, 174 inverse of a relation, 50
function, 80 invertible functions, 92
equal, 82 irreflexive relation, 53

Kuratowski definition, 37 ordinals are equal, 286

order isomorphism, 268, 410
lattice, 404 properties, 269
leader, of an initial segment, 266 order relation, 56
least uncountable ordinal, 305 linear, 56
least upper bound, 291 non-strict, 56
least upper bound property, 166 partial, 57
lexicographic order, 134 strict, 57
lexicographic ordering, 393 order type, 302
limit cardinal, 329 ordered pair, 37
limit ordinal, 288 ordered triple, 38
limit ordinals ordinal
characterizations, 292 an uncountable one exists, 305
linear ordering, 56 ordinal number, 279
⇒ successor an ordinal, 280
MA, 379 ordinal numbers
MA(κ), 376 and initial segments, 282
Martin’s axiom, 379 ordinality, 302
maximal, 59 ordinals
maximal element, 59 for topologists’ eyes, 313
membership relation on, 49
minimal, 59 partial ordering relation, 57
minimal element, 58, 59, 350 partition of a set, 70
minimum, 58 Peano axioms, 124
multiplication, 144 poset, 57
multiplication of ordinals, 394 power set, 19
preimage of a function, 81
natural number primitive concept, 4, 6
m ⊂ n ⇒ m ∈ n, 119 principal filter, 376
each is an ordinal, 279 proper filter, 376, 406
every is ∈-well-ordered, 130 properly embedded, 204
every one is a transitive set, 119
is finite, 177 quotient set, 71
natural numbers, 115
n 6∈ n, 119 ran f, 81
is ∈-well-ordered, 129 range, 49
is a transitive set, 118 range of a function, 81
is an infinite set, 174 rank of a set, 361
is an ordinal, 279 rational numbers, 154
real numbers, 166
omega (ω-ordinal), 279 recursion, 311
one-to-one correspondence, 83 recursion theorem, 141
one-to-one function, 83 recursively
one-to-one onto, 83 constructed functions theorem,
onto function, 83 182
order isomorphic, 268 refinement, 75

reflexive relation, 53 successor, 112

regular open, 405 successor cardinal, 329
relation, 47 supremum, 291
antisymmetric, 53 surjective function, 83
asymmetric, 53 symmetric difference of two classes,
comparable, 53 28
composition of, 50
domain, 49 ternary relation, 47
equivalence, 55 topological representation, 410
identity, 49, 56 transfinite induction, 298
image, 49 principle of, 299
inverse of, 50 second version, 299
irreflexive, 53 transfinite numbers, 226
membership, 49 transfinite recursion theorem, 311
reflexive, 53 transitive
strict ordering, 57 class, 278
transitive, 53 closure, 353
replacement axiom, 8, 187, 305 relation, 53
restriction of a function f, 82 set, 117
Russell’s paradox, 4 transitive set
characterization, 117
Schröder-Bernstein theorem, 211 two equipotent sets
set, 7 have equipotent power sets, 201
which are not ∅, 17 Tychonoff plank, 313
set axioms, 11 deleted, 314
sets
are not equipotent to power set, ultrafilter, 406
202 uncountable sets, 187
are embedded in their power set, union of classes or sets, 27
202 universal class, 16
singleton set, 17, 20 upper bound, 291
is finite, 175
Stone representation theorem, 411 Venn diagrams, 29
strict ordering relation, 57 Von Neumann universe, 356
strictly increasing function, 268
well-founded, 351
strong antichain, 373
well-orderable set, 320
strong limit cardinal, 345
well-ordered, 128
subclass, 13
Well-ordered class, 263
subclass, proper, 13
well-ordered set
subset, 13
is isomorphic to an ordinal, 300
subset of finite set
well-ordered sets, 262
is finite, 175
can induce a well-ordering, 263
subset of infinite set
form of an initial segment, 266
is infinite, 175
well-ordering, 128
subtraction, 147
Well-ordering principle, 321

Well-ordering theorem, 321

well-orders, 128

Zermelo-Fraenkel, 5
ZF-axioms, 8
ZFC, 9
Zorn’s lemma, 341, 343
