Axiombook
Set Theory
Robert André
To
Jinxia, Camille and Isabelle
“Everything has beauty, but not everyone sees it.”
Confucius
Preface
A set theory textbook can cover a vast amount of material depending on the
mathematical background of the readers it was designed for. Selecting the
material for presentation in this book often came down to deciding how much
detail should be provided when explaining concepts and what constitutes a
reasonable logical gap which can be independently filled in by the reader.
The choice of topics and the calibration of the level of communication are
based on the estimated mathematical fluency of the reader. Readers will
find that the initial chapters are presented in a form intended for
students who have little experience in proving mathematical statements.
But as the student progresses throughout the book he or she will slowly be
eased into a progressively denser form of mathematical argument and presentation.
That said, the targeted reader is a student
registered in a one-semester intermediate level general math course, a course
which would be part of a four-year university program in which the study of
mathematics is a strong component but which is not necessarily
designed to prepare a student for the study of mathematics at the graduate
level. The writer assumes that most students have not necessarily been
exposed to the type of mathematical rigor normally found in most textbooks
in set theory. In the beginning, the pace at which new concepts are introduced
is one that may be subjectively considered as being “leisurely”. The
meaning of each mathematical statement is explained at length and its proof
presented in great detail. The purpose is to capture the interest of
the reader sufficiently to incite him or her to delve further into the subject matter.
As the student progresses through the course, he or she will develop a better
understanding of what constitutes a correct mathematical proof. To help at-
tain this objective, numerous examples of simple straightforward proofs are
presented as models throughout the text.
This writer understands that doing mathematics is a skill, and so is something
that must be done and practiced under the supervision of an instructor
who can point out errors or weaknesses in certain mathematical arguments.
Most of us are not born with this skill; it is one which is studied and developed.
Having read the content, most mathematicians would say that the book is
self-contained. This is an accurate statement, mathematically speaking. But
it is not sufficiently nuanced. Students who have previously studied a course
in, say, linear algebra, elementary number theory or an elementary course in
real analysis will fare better than those who have not. This is not because the book
assumes knowledge from those courses. It is because those students will have
already developed some mathematical skills all of which will turn out to be
quite useful in solving certain types of problems. Much of the mathematical
content of this text is inspired from prepared lecture notes sourced from well-
known set theory textbooks such as Hrbacek and Jech’s.
The subject material is subdivided into ten major parts. The first few
are themselves subdivided into “bite-sized” chapters. Smaller sections allow
students to test their understanding on fewer notions at a time. This will allow
the instructor to better diagnose the understanding of those specific points
which challenge the students the most, thus helping to eliminate obstacles
which may slow down their progress later on.
Each chapter is followed by a list of Concepts review questions. These
questions highlight for students the main ideas presented in that section and
help them deepen their understanding of these concepts before attempting the
exercises. The answers to all Concepts review questions are in the main body of
the text. Attempting to answer these questions will help the student discover
essential notions which are often overlooked when first exposed to these ideas.
Textbook examples will serve as solution models to most of the exercise
questions at the end of each section. Exercise questions are divided into three
groups: A, B and C. The answers to the group A questions normally follow
immediately from definitions and theorem statements presented in the text.
The group B questions require a deeper understanding of the concepts, while
the group C questions allow the students to deduce by themselves a few con-
sequences of theorem statements presented in the text.
The course begins with an informal discussion of primitive concepts and
a presentation of the ZFC-axioms. We then discuss, in this order, operations
on classes and sets, relations on classes and sets, functions, construction of
numbers (beginning with the natural numbers followed by the rational num-
bers and real numbers), infinite sets, cardinal numbers and, finally, ordinal
numbers. It is hoped that the reader will eventually perceive the ordinal num-
bers as a natural logical extension of the natural numbers and as being what
constitutes the “spine of set theory” − as some authors have described them.
Towards the end of the book we present a brief discussion of a few more
advanced topics such as the Well-ordering theorem and Zorn’s lemma (both proven
to be equivalent forms of the Axiom of choice) as well as Martin’s axiom.
Finally, we briefly discuss the more abstract Axiom of regularity and a few
of its implications. A brief and very basic presentation of ordinal arithmetic
properties is then given.
Note that Chapters 1, 2, 13 to 21 together constitute the “meat” of the
book. Students who already possess a substantial amount of mathematical
background may feel they can comfortably skip many chapters without loss of
continuity, since these contain notions with which they have already developed
some familiarity. The following sequence will allow readers with the
required background to advance more quickly to the meat of the textbook:
Chapter 1 on the topic of the ZFC-axioms can be immediately followed by
Chapters 13 and 14 on the topic of natural numbers, Chapters 18 to 22 on
the topic of infinite sets and cardinal numbers followed by Chapters 26 to 29,
32 and 33, on ordinals, and finally, Chapters 30 and 31 on the axiom of choice
and the axiom of regularity.
Many readers may notice that, in Chapter 27 on ordinals, we provide a
much more detailed introduction to the study of ordinals than what is normally
found in some set theory texts. Often, if too many details about ordinals are
left for readers to discover for themselves in the form of exercises, this
tends to leave doubts in their minds about whether they understand the
topic adequately, leading them to avoid using ordinals in examples
where they might prove to be useful.
The short chapter titled Martin’s Axiom and Appendix A towards the end
of the book are presented as a matter of interest and are intended for readers who
are well-versed in the subjects of topological spaces and real analysis.
As we all know, any textbook, when initially published, will contain some
errors, some typographical, others in spelling or in formatting and, what is
even more worrisome, some mathematical in nature. Critical or alert readers
of the text can help eliminate the most glaring mistakes by communicating
suggestions and comments directly to the author. This writer carries an im-
mense debt of gratitude to the scores of students and readers whose numerous
questions and enquiries on a preliminary online version of this text have helped
to clarify my thoughts, weed out typos, awkward explanations and occasional
anacoluthons.
Robert André
University of Waterloo, Ontario
Contents
II Class operations
3 Operations on classes and sets
4 Cartesian products
III Relations
5 Relations on a class or set
6 Equivalence relations and order relations
7 Partitions induced by equivalence relations
8 Equivalence classes and quotient sets
IV Functions
9 Functions: A set-theoretic definition
10 Operations on functions
11 Images and preimages of sets
12 Equivalence relations induced by functions
VII Cardinal numbers
22 Introduction to cardinal numbers
23 Addition and multiplication in C
24 Exponentiation of cardinal numbers
25 On sets of cardinality c
XI Appendix
Index
Part I
some point, a set of statements whose true-false values were not derived
from previous statements. That is, the process must start somewhere, with
some initial statements whose true-false values were unknown. Such initial
statements are not “deduced” but simply declared to be true based on
nothing more than “common sense”. For example, one may declare the
statement: “Distinct parallel lines cannot intersect” as being self-evident
or being so “elementary” that it cannot be proved. Once we give ourselves
a set A of self-evident statements and a list of rules that can be used
to determine the true-false value of other statements then the universe
U_A of all possible true statements derived from A is determined. This
determined universe U_A of statements constitutes a mathematical theory
which is ours to explore, or discover, one statement at a time.
But what if the choice of our original set A of statements was not a wise
one? “How can it not be a wise one if based on common sense?” one might
ask. Imagine this scenario:
Say that from a set A of initial self-evident statements, a state-
ment A has been shown to be true, and given that A is true
it is deduced that statement B must be true, and from B we
deduce that P is true. On the other hand it is shown that given
A, statement D must be true and that from D we show that P
is false. Hence, from A we have deduced that the statement P
is both true and false.
A statement which has been determined to be both true and false is re-
ferred to as a “contradictory statement” or a paradox. If a contradictory
statement logically flows from what was assumed to be a paradox-free sys-
tem, then the foundation of this system, as well as the methods used to de-
termine the true or false value of statements, must be carefully scrutinized
to determine the incorrect assumption(s) which allowed this “renegade”
statement to emerge. In this book we will explore a specific mathematical
system. It is hoped that in the process, the reader will be able to appreci-
ate the skill and ingenuity required for the construction of this impressive
mathematical structure. This system is called the “theory of sets” or more
simply “set theory”.
1.2 Sets.
Most people are familiar with the notion of a set and its elements. “Sets”
are viewed as collections of things, while “elements” are viewed as those
things which belong to sets. Normally, a set is defined in terms of certain
properties shared by its elements. These properties must be well described,
with no ambiguities, so that it is always clear whether a given element
Part I: Axioms and classes 3
belongs to a given set or not. Being a “set” can also be an element property;
so sets whose elements are sets exist, for example, the set S of all teams
in a particular hockey league. The elements of the set S are sets of hockey
players.
Let us consider a few examples of entities we may consider to be sets.
a) Let T denote the set of all straight lines in the Cartesian plane. For
example, the set A = {(x, y) : y = 2x + 3} belongs to T , while the set
B = {1, 2, 3} does not. We easily see that T is not an element of T
since T is not a line in the Cartesian plane.
b) Let U denote the set of all sets which contain infinitely many elements.
This set is well-defined since we can easily distinguish those elements
that belong to U from those that do not belong to U . For example, the
set {−2, 0, 100} is not an element of U since it contains only three
elements. We ask the question: Is the set U an element of U ? To help
answer this question, witness the sets
A0 = {0, 1, 2, 3, · · · }
A1 = {−1, 0, 1, 2, 3, · · · }
A2 = {−2, −1, 0, 1, 2, 3, · · · }
⋮
An = {−n, −(n − 1), −(n − 2), · · · , −2, −1, 0, 1, 2, 3, · · · }
⋮
We now look more closely at the three sets described above. Other than
the fact that it is an extremely large set, there is nothing extravagant
about the set T described in example (a). On the other hand, the set
U discussed in example (b) also appears to be well-defined, since a set
which is infinite can easily be distinguished from one that is not. But
the fact that this set is an element of itself makes one wonder whether
we should allow sets to satisfy this property. On the other hand, it is
difficult to express what could possibly go wrong with such sets. Let
us now look closely at the “set” described in the example (c): A set
belongs to S only if it does not belong to itself. We wonder whether,
like the U in example (b), the set S is an element of itself. But S cannot
belong to S since no element in S can belong to itself. So S is not an
4 Section 1: Classes, sets and axioms
meaning, although the symbols or words used to represent them often con-
vey some intuitive idea in the mind of the reader. That is, the words which
represent this undefined notion are such that the user will more easily un-
derstand the properties which will be prescribed for this concept. Specific
rules and properties which declare how these concepts relate to each other
are then formulated; these rules and properties must allow mathematical
constructs which are viewed as being important in our mathematical sys-
tem. These properties and rules are called the “axioms”.
Euclid’s axiomatic system. Euclid provided us with a useful model for con-
structing an axiomatic system. He is the first person known to apply the
axiomatic method to study the field of geometry. In his axiomatic system,
the words “line” and “point” are primitive concepts. He instinctively rec-
ognized that some undefined terms would be required. He then described
properties of “lines” and “points”. These properties are his axioms. These
axioms are statements whose true-false values are not logically deduced
from statements previously shown to be true. They are simply assumed to
be true. The important point here is that he explicitly states what these
“assumed to be true properties” are. The proposed primitive concepts and
these axioms, when gathered together, constitute the foundation of the
“Euclidean axiomatic universe” more commonly referred to as Euclidean
geometry. Euclid justified his choice of axioms by saying that these point
and line properties were “self-evident”. Note that Euclid’s primitive con-
cepts and axioms differ entirely from the set-theoretic axioms we will be
studying. But the axiomatic method he used to study geometry has served
as a valuable model for others who wanted to develop different mathemat-
ical systems.
Euclid then used deductive reasoning to show that various geometric state-
ments were true. The assumptions made were limited to
1) the stated axioms, along with
2) other statements previously shown to be true.
In this way, Euclidean geometry came to be. In spite of Euclid’s best ef-
forts, careful scrutiny of his work revealed that Euclid erred in certain
ways. He unknowingly made assumptions which were neither stated as ax-
ioms nor previously proven to be true. In 1899, the mathematician David
Hilbert revised the Euclidean axiomatic system by proposing three prim-
itive concepts: point, straight line, plane. He also proposed 21 axioms. In
1902, one axiom was shown to be redundant and so was eliminated from
the list. These primitive concepts along with 20 axioms are now widely
accepted as forming a firm logical footing for Euclidean geometry.
The primitive concepts in our theory. There will be three undefined notions
in our axiomatic system:
“class”
“set”
“belongs to”
The expression
x∈A
is to be read as “the class x belongs to the class A”, or “the class x is in the
class A” or “x is an element of A”. However, no class will be representable
by a lower-case letter, x, unless it is known that x ∈ B for some class B.
Those classes which can be represented by a lower-case letter, say x, will
be given a special name:
If a class A is such that A ∈ B for some class B, then we will
refer to the class A as being an “element”.
Elements are still classes; but they are special classes, since they “belong
to” another class. So an element can be represented by both a lower-case
or an upper-case letter. For example, if we write x ∈ y or A ∈ B this means
that x, y and A are elements, while B may or may not be an element.
Why is “element” not an undefined notion? The reader may find it surprising
that the object “element” is not expressed as an undefined notion. After
all, we are accustomed to distinguishing elements from sets. Introducing
a fourth undefined term was eventually seen as being superfluous. This
became clear when we realized that we often view sets as being “elements”
of other sets.² Witness:
· Points (a, b) in the Cartesian plane are actually two-element sets
{a, b} of real numbers stated in a particular order.
· Rational numbers a/b can be described as the set of all two-element
sets {a, b} of integers in a particular order where b is not 0.
· Irrational numbers can be viewed as infinite sequences of rational
numbers converging to a non-rational number, again a set.
sets nor classes are considered. These are referred to as “urelements”. We will not consider
these in this text.
when we actually invoke each of these in various situations where they are
required. These are called the ZF-axioms. The reader will see how surpris-
ingly few are required. At this point, much of this will look like gibberish,
but as we plod through the subject matter, we will step-by-step develop
a better understanding of what they mean.
fication or Axiom of separation. It is in fact many axioms (which, when viewed together,
are referred to as a schema) each differing only by the formula φ it refers to. So to be more
precise, given a formula φ in set theory language, we would refer to it as axiom A4(φ) rather
than A4.
4 This axiom is more often expressed as the Replacement axiom schema since it is in fact
many axioms each differing only by the formula φ it refers to. So to be more precise, given
a formula φ in set theory language, we would refer to it as axiom A7(φ) rather than A7. It
essentially allows us to confirm that if the domain A of a function f is a set, then the image f[A]
is a set.
In this text we will refer to this set of nine axioms viewed together with
the Axiom of choice as “ZF + Choice” or simply by ZFC.⁵
The Axiom of choice. The controversy surrounding the Axiom of choice
requires some explanation. The Axiom of choice is an axiom which was
added after most of the ZF-axioms were widely accepted as a foundation
for modern mathematics. It is so subtle a concept that many early math-
ematicians unknowingly invoked it in their proofs. That is, it was invoked
without stating it explicitly as an assumption. Some mathematicians pub-
licly questioned this assumption, asking openly whether the word “obvi-
ously” was sufficient justification for using it. These questions could not
be ignored. Numerous attempts at proving this axiom from the ZF-axioms
failed. In 1963, it was finally proven that neither the Axiom of choice, nor
its negation, can be proven from the ZF-axioms. This implied that we
are free to state it as an axiom, along with the other ZF-axioms, with-
out fear of producing a contradiction. A lengthy debate on whether this
statement should be included with the other “fundamental” ZF-axioms
followed. Some described it as “the most interesting and, in spite of its
late appearance, the most discussed axiom of mathematics, second only
to Euclid’s axiom of parallels which was introduced more than two thou-
sand years ago” (Fraenkel, Bar-Hillel and Levy 1973). Eventually, it was
felt that “not accepting” the Axiom of choice closes the door to many
fundamentally important results of modern mathematics. One could say
that the Axiom of choice had already been used so extensively that it was
deeply ingrained in the modern mathematical fabric; we were “addicted”
to the Axiom of choice, so to speak.
Even though proofs that invoke the Axiom of choice are widely viewed as
being acceptable, it is often felt that a correct proof that does not invoke
the Axiom of choice is preferable to a simpler proof which invokes it. This
5 Note that some of the ZF axioms listed have been shown to follow from the others. So
some set theory texts may omit one or more of these from their formal list of ZFC axioms.
Since most of these axioms are non-controversial we will adopt, for this text, this list of 10
axioms as the ZFC-axioms. The reader should simply be alerted to the fact that the list of
the ZFC axioms may vary from text to text.
“X ∈ A” ⇒ “X ∪ {X} ∈ A”.”
says that there exists a class called a set which is infinite in size. (This
axiom also guarantees that at least one class called a set exists.) It essentially
allows us to define the “natural numbers” 0, 1, 2, 3, . . . .
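The role of the operation X ↦ X ∪ {X} can be sketched on a computer. The sketch below assumes the common encoding in which 0 is the empty set and each number is obtained from the previous one by this operation; that encoding is not stated in this passage, and the names successor and von_neumann are purely illustrative.

```python
def successor(x: frozenset) -> frozenset:
    """Return X ∪ {X}, the operation named in the Axiom of infinity."""
    return x | frozenset([x])


def von_neumann(n: int) -> frozenset:
    """Build a set standing for n, starting from 0 = ∅ (an assumed encoding)."""
    current = frozenset()
    for _ in range(n):
        current = successor(current)
    return current


two = von_neumann(2)
three = von_neumann(3)
print(len(three))     # number of elements of the set standing for 3
print(two in three)   # whether the set for 2 is an element of the set for 3
```

Only finitely many steps can be carried out this way; the axiom is what asserts that a single set closed under the operation exists.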
Axiom of choice:
“For every set 𝒜 of non-empty sets there is a rule f which
associates to every set A in 𝒜 an element a ∈ A.”
says that given a set of non-empty sets, there exists a certain type of func-
tion. But it does not show how to construct or find such a function.
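To see what kind of function the axiom speaks of, here is a minimal sketch for a finite family of finite sets of integers. In this special case an explicit rule (take the smallest member) is available, so no appeal to the axiom is needed; the axiom matters precisely when no such rule can be written down. The name choice_function and the sample family are assumptions made for illustration.

```python
def choice_function(family):
    """Return a dict f with f[A] ∈ A for every non-empty set A in the family."""
    f = {}
    for member in family:
        if not member:
            raise ValueError("every set in the family must be non-empty")
        # Any rule would do; taking the minimum is just one explicit choice.
        f[member] = min(member)
    return f


family = {frozenset({3, 1, 2}), frozenset({10}), frozenset({7, 5})}
f = choice_function(family)
print(all(f[A] in A for A in family))
```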
Note that Axioms A1 and A2 refer only to classes (we call these “class
axioms”) while all the other axioms (Axioms A3 to A9 and the Axiom of
choice) are “set axioms”. The set axioms determine what kind of objects
exist in the universe of all sets.
Axioms A2, A3, A4, A5, A6 and A7 are “constructive” axioms: A2
gives us a way to construct a class by referring to a property, while Axioms A3
to A7 provide a method to construct new sets from ones that are known
to exist.
Axiom A9, the Axiom of regularity, is referred to by some as the “useless”
axiom. Others don’t consider it a basic axiom since most
of mathematics which is based on set theory does not require it. It will
be invoked only in the last chapter of this book. Although it is not obvi-
ous, just from reading it, this axiom actually states that “those non-empty
classes which don’t have a least element are not sets”. It is in fact an axiom
which does not allow certain types of sets to exist in the universe of sets.
It is of an exclusionary nature. The other axioms (except for Axiom A1)
increase the number of sets in the universe of sets.
Axioms A4 (Subsets) and A7 (Replacement) each represent many axioms.
We refer to such axioms as schemas.⁶ Axiom A4 speaks of a set S and a
particular formula φ describing a property. For each property we have a
different Axiom. Given φ, we could say the “Axiom A4 for φ”. Axiom A7
speaks of a set A and a class B of sets along with a particular formula
φ(x, y) which plays the role of a function (normally referred to as a func-
tional). For each functional, φ(x, y), we have a different axiom.
flows from these axioms, then we can answer: “No, the ZF-axioms are not
consistent, since we have revealed a contradiction which flows from these
axioms!” If such a contradiction is discovered, we must tinker some more
with the set-theoretic axioms to correct the flaw.
But as long as we do not encounter such a paradox, the answer to this ques-
tion is: “We don’t know for sure whether the ZF-axioms are consistent.” It
has been shown that using only the ZF-axioms, it is impossible to prove or
disprove that the ZF-axioms are consistent. It is the “nature of the beast”,
so to speak. Since new forms of mathematics are uncovered every day, it
is possible that next week, in 100 years or in 1000 years someone will
discover that ZF is inconsistent. “Set theory” is, as the words indicate,
just a theory. By their very nature, all theories evolve to explain newly
discovered, previously unknown facts. The ZF-set-theoretic system is no
different. As a foundation of modern mathematics, the ZF-set-theoretic
system seems to serve its purpose well; it is the best theory we have today,
even though some day we may discover significant ways of improving it.
Concepts review:
of what the only objects (classes and sets) in our set-theoretic universe are.
The statements in the following theorem are all logical consequences of these
axioms. They are easily seen to hold true, but it is good practice to explicitly
write out the proofs.
a) C = D ⇒ D = C.
b) C = D and D = E ⇒ C = E.
c) C ⊆ D and D ⊆ C ⇒ C = D.
d) C ⊆ D and D ⊆ E ⇒ C ⊆ E.
Proof:
a) Suppose C = D. Then by axiom A1, x ∈ C ⇒ x ∈ D and x ∈ D ⇒
x ∈ C. Then x ∈ D ⇒ x ∈ C and x ∈ C ⇒ x ∈ D. Hence, by definition of
equality D = C.
Proofs of (b) to (d) are left as an exercise.
We said that all objects in our set-theoretic universe are classes. Some of
those classes are elements provided these belong to another class. It is nor-
mal to ask whether there exists at least one class which is not an element.
We answer this question in the following theorem.
Proof:
Let C be the class
C = {x : x ∉ x}²
By Axiom A2, the class C is well-defined. Suppose C is an element. If C ∈ C,
then C satisfies the defining property x ∉ x, so C ∉ C. If C ∉ C, then C is
an element satisfying x ∉ x, so C ∈ C. Either way we have a contradiction. Then
C is not an element, as required.
These two axioms alone will allow us to construct sets from those classes
known to be “sets”. Axiom A8 (Axiom of infinity) guarantees that at least
one class called “set” exists: It contains the words “...there exists a class A
called a set that...”. We need not search any further.
Axiom A3 refers to the set C = {A, B} as a “doubleton”. We will use the
word doubleton when referring to two sets A and B viewed together to form
a collection {A, B} of sets. For convenience, we will not put any restrictions
on how the set B relates to A. For example, we can refer to the set {A, A}
as a doubleton even though it contains only one element.
The statements in the following theorem follow immediately from the Ax-
ioms A3 and A4.
Proof:
a) We are given that S is a set. We are required to prove that ∅ is a set.
We can directly apply axiom A4: Since ∅ = {x ∈ S : x ≠ x} and, by
hypothesis, S is a set then, by A4, ∅ is a set as required.
b) We are given that S is a set. We are required to prove that S is an
element. By Axiom A3 (Axiom of pair), for any set S, {S, S} is a set.
Since S ∈ {S, S}, for all sets S, then, by definition, S is an element, as
required.
In the proof above, we discussed the set {S, S} which contains the set S as
an element. Since {S, S} = {S} (Prove this!) this is a one-element class. We
call such sets singleton sets. The reader should note that according to our
definition of doubleton above, every singleton set {A} can be expressed as a
doubleton {A, A}.
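The equality {S, S} = {S} can be checked on a small model. The sketch below uses Python's frozenset as a stand-in for a set; the particular S is an arbitrary placeholder, not something taken from the text.

```python
S = frozenset({1, 2})            # a stand-in for an arbitrary set S
doubleton = frozenset([S, S])    # the "doubleton" {S, S}
singleton = frozenset([S])       # the singleton {S}

print(doubleton == singleton)    # both have exactly the same elements
print(len(doubleton))            # the repeated entry is counted once
print(S in doubleton)            # S belongs to {S, S}, so S is an element
```

The comparison succeeds for the same reason as in the text: two classes are equal when they have the same elements, and listing an element twice adds nothing.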
We can now verify that the universal class U = {x : x = x} is not a set:
Suppose U is a set. See that the class C = {x ∈ U : x ∉ x} is a subclass of
U. Since we assumed U to be a set, by Axiom A4 (Subsets), C must also
be a set. But we showed in Theorem 2.4 that the class C is not an element
and so cannot be a set. We have a contradiction. Therefore the universal
class is a proper class, as claimed.
(a) The set ∅ contains no elements. By Axiom A3, the class C = {∅, ∅} =
{∅} is a set which contains exactly one element (the element ∅). Observe
that ∅ ≠ {∅} since {∅} contains one element, while ∅ does not.
(b) Let A = ∅ and B = {∅}. By Axiom A3, C = {∅, {∅}} is a set which
contains exactly two elements (the element ∅ and the one element set
{∅}).
(c) Let a, b, c be three sets. Then, by repeated applications of Axiom A3
(combined with Axiom A6 to merge the resulting sets),
{a, {a}, {{a}}, {a, b, c}} is a four-element set.
(d) Let c be an element (class). Then A = {c} is a class with only one element,
since A = {x : x = c} and so, by Axiom A2, A = {c} is a class. If c
is known to be a set, A = {c} = {c, c}, so by Axiom A3 we can conclude that A is a set.
We see, from the above rules, that we can “theoretically”, construct all finite
sets so that they each appear in the form of various orders and combinations
of the symbols,
{, }, ∅
Of course, in practice, there would be no point in actually doing that.
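For very small sets the construction can nevertheless be carried out mechanically. The sketch below is only an illustration: the helper names pair and render are hypothetical, frozenset stands in for a set, and the members are sorted purely to make the printed form reproducible.

```python
EMPTY = frozenset()


def pair(a, b):
    """Model Axiom A3: from the sets a and b, form the set {a, b}."""
    return frozenset([a, b])


def render(s):
    """Write a finite set built from ∅ using only '{', '}' and '∅'."""
    if not s:
        return "∅"
    # Sort the rendered members so the output does not depend on hash order.
    return "{" + ", ".join(sorted(render(m) for m in s)) + "}"


a = pair(EMPTY, EMPTY)   # {∅, ∅} = {∅}
b = pair(EMPTY, a)       # {∅, {∅}}
print(render(a))
print(render(b))
```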
We can use a symbol of our choice, say A, to represent an “infinite set” guar-
anteed to exist by the Axiom of infinity. We can then use the construction
rules to construct other sets with A. So nothing prevents us from considering
a set described as, say, D = {A, B, C}.
is a class.
Axiom A2 said that {x : P (x)} = “all elements which satisfy property P ”
is a class. Now it makes sense to talk about the “class of sets satisfying a
property P ”.
Note that the class, S, of all sets is a proper class. To see this, suppose S
were a set. Let D = {x ∈ S : x ∉ x}. The class D cannot be a set, for if it
were, then as previously shown, we would quickly obtain the contradiction,
Definition 2.8 If A is a set, then we define the power set of A as being the
class P(A) of all subsets of A. It can be described as follows:
P(A) = {X : X ⊆ A}
2.8 Examples.
We provide a few exercises which allow us to practice notions related to
power sets.
1) Power sets. List the elements of the power set of
a) The empty set, ∅.
b) A singleton set.
c) A doubleton set.
Solution:
P(∅) = {∅}
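Parts (b) and (c) can be explored with a short computation. This is a sketch for finite sets only, with frozenset standing in for a set; the elements "a" and "b" are placeholders assumed to be distinct, and power_set is an illustrative name.

```python
from itertools import combinations


def power_set(a):
    """Return P(a) = {X : X ⊆ a} for a finite set a."""
    elems = list(a)
    return frozenset(
        frozenset(combo)
        for r in range(len(elems) + 1)
        for combo in combinations(elems, r)
    )


empty = frozenset()
print(power_set(empty) == frozenset([empty]))    # compares P(∅) with {∅}
print(len(power_set(frozenset({"a"}))))          # size of P of a singleton
print(len(power_set(frozenset({"a", "b"}))))     # size of P of {a, b} with a ≠ b
```

Listing the outputs suggests that P({a}) = {∅, {a}} and, when a ≠ b, that P({a, b}) = {∅, {a}, {b}, {a, b}}.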
2) Consider the three-element class C = {x, {x, y}, {z}}. Determine which
of the following statements are true and which are false.
a) We can write x ∈ C.
b) We can write x ⊆ C.
c) We can write {x} ∈ P(C).
d) We can write {{z}} ∈ P(C).
e) We can write z ∈ P(C).
f) We can write {z} ⊆ P(C).
Solution:
a) True. We can write x ∈ C since x is explicitly listed as an
element of C.
b) False. We cannot write x ⊆ C since this does not satisfy the definition
of ⊆. To write x ⊆ C is to say that every element in x is an element
in C. But the contents of x are unknown. So there is no basis to state
that x ⊆ C.
c) True. We can write {x} ∈ P(C) since {x} ⊆ C: every element in {x} is also
an element of C.
d) True. We can write {{z}} ∈ P(C) since {{z}} ⊆ C. The only element
in {{z}} is in C.
e) False. We cannot write z ∈ P(C) since this would mean that z ⊆ C, and
the contents of z are unknown.
f) False. We cannot write {z} ⊆ P(C) since {z} contains only one
element z. There is no basis to state that this element is a subset of C.
It was shown above that “all sets satisfying a property P ” is a class. In
the following example, we say something similar. But there are subtle
differences in the statement. See if you can detect these differences.
3) Let A be a set and P denote some property. Show that the class
S = {x : (x ⊆ A) ∧ P (x)}
is a set.
Solution:
We are given that A is a set and S = {x : (x ⊆ A) ∧ P (x)}. We are
required to show that S is a set. Since A is a set, and, for every x ∈ S,
x ⊆ A then every x ∈ S is a set (by Axiom A4). By Axiom A5, P(A) is
a set. Since the class S ⊆ P(A), then S is a set (by Axiom A4). This is
what we were required to prove.
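The argument can be mirrored on a finite example. In the sketch below the set A, the property P ("x has exactly two elements") and the function names are all assumptions chosen for illustration; the point is only that S is obtained by filtering P(A), which is how Axiom A4 is applied in the solution.

```python
from itertools import combinations


def power_set(a):
    """Return P(a) = {X : X ⊆ a} for a finite set a."""
    elems = list(a)
    return frozenset(
        frozenset(combo)
        for r in range(len(elems) + 1)
        for combo in combinations(elems, r)
    )


def subsets_with(a, prop):
    """Return S = {x : (x ⊆ a) ∧ prop(x)}, selected from inside P(a)."""
    return frozenset(x for x in power_set(a) if prop(x))


A = frozenset({1, 2, 3})
S = subsets_with(A, lambda x: len(x) == 2)
print(S <= power_set(A))   # S is a subclass of P(A)
print(len(S))              # how many two-element subsets A has
```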
T = {a : a is a set and a ∈ a}
is empty.
Concepts review:
1. What does it mean to say “the class A is equal to the class B”,
A = B?
2. What does it mean to say “the class A is contained in the class B”,
A ⊆ B?
3. What does it mean to say the class A is a proper subclass of the
class B?
4. How should we read the expression C = {x : P (x)}?
5. Is it true that ∅ ∉ ∅? Why?
6. State a class that is not an element.
7. What is the universal class?
8. What is the empty class ∅?
9. Is a set an element?
10. Given a set A, what is the power set, P(A), of A? How do we know
that P(A) is a set?
11. If B is a set and A ⊆ B, what can we say about A? Why?
12. Why is ∅ a set?
EXERCISES
C. 15. Show that the statement “For any set S, P(S) ⊆ S” is a false statement.
16. Suppose U and V are sets. Determine whether the statement P(U ) ∪
P(V ) = P(U ∪ V ) is true or false. Justify your answer.
17. Suppose U and V are sets. Determine whether the statement P(U ) ∩
P(V ) = P(U ∩ V ) is true or false. Justify your answer.
Part II
Class operations
A ∪ B = {x : (x ∈ A) ∨ (x ∈ B)}
Definition 3.2 Let A and B be two classes (sets). We define the intersection,
A ∩ B, of the class A and the class B as
A ∩ B = {x : (x ∈ A) ∧ (x ∈ B)}
Observe that
a) if A = {D, E} then D ∪ E = ⋃_{C∈A} C.
b) if A = {D, E} then D ∩ E = ⋂_{C∈A} C.
Also see that the axiom A2, Axiom of construction, guarantees that both
⋃_{C∈A} C and ⋂_{C∈A} C are classes. If the class A contains no elements, then
by definition of “union” and “intersection” the union and intersection of all
elements in A is ∅.
Definition 3.3 We will say that two classes (sets) C and D are disjoint if
the two classes have no elements in common. That is, the classes C and D are
disjoint if and only if C ∩ D = ∅.
C′ = {x : x ∉ C}
C − D = C ∩ D′
This is also a class. The symmetric difference, C △ D, is defined as (the class)
C △ D = (C − D) ∪ (D − C)
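As a rough illustration of these three operations, the sketch below works inside a small finite universe. The universe U, the sets C and D and the helper complement are assumptions made for this example only; in the text the complement is taken relative to the class of all elements, which cannot be represented in a program.

```python
# Hypothetical finite universe standing in for the class of all elements.
U = set(range(10))
C = {1, 2, 3, 4}
D = {3, 4, 5}

def complement(x):
    """Complement of x relative to the finite universe U."""
    return U - x

# C − D computed as C ∩ D′, following the definition of the difference.
difference = C & complement(D)

# C △ D computed as (C − D) ∪ (D − C).
symmetric_difference = (C - D) | (D - C)
```

Python's built-in operators `-` and `^` compute the same difference and symmetric difference directly, which gives a quick cross-check.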
Thus, Axiom A6 says “The union of all sets in a set of sets is a set”. We
should also be clear about what Axiom A6 does not say: “The union of all
sets in a class of sets is a set.” If we make the mistake of assuming this to
be true, it will lead to a contradiction, as the following example shows.
Example: Suppose A is the class A = {x : x is a set and x ∉ x}. Show
that D = ⋃_{x∈A} x is not a set.
Solution:
What we are given: A = {x : x is a set and x ∉ x}.
What we are required to show: D = ⋃_{x∈A} x is not a set.
Suppose D = ⋃_{x∈A} x is a set. Then by Axiom A5, P(D) is also a set.
We know that for every x ∈ A, x ⊆ D = ⋃_{x∈A} x. (Make sure you see
why. If not look at A ⊆ A ∪ B.) So, for every x ∈ A, x ∈ P(D). Hence,
A ⊆ P(D).
Since A is a subclass of a set, then, by Axiom A4, A is a set.
We now argue as in Theorem 2.4: Since A is a set, “A ∉ A” ⇒ “A ∈ A”
and “A ∈ A” ⇒ “A ∉ A”. This is a contradiction. So D = ⋃_{x∈A} x
cannot be a set. This is what we were required to show.
Then the statement “The union of all sets in a class of sets is a set” is not
a true statement, in general. Even though, in set theory, both class and set
intuitively represent a “collection of objects”, freely substituting the word
set with the word class may lead to some nasty consequences.
FIGURE 1
Intersection and union of two sets
30 Section 3: Operations on classes and sets
FIGURE 2
Difference and symmetric difference
FIGURE 3
Intersection distributing over a union.
Proof:
a) Let x ∈ C. It suffices to show that x ∈ C ∪ D.
x ∈ C ⇒ x ∈ {x : x ∈ C or x ∈ D}
    ⇒ x ∈ C ∪ D
Proof:
a) What we are given: C and D are classes (sets).
What we are required to show: C ∪ (C ∩ D) = C
x ∈ (C ∩ D) ∪ C ⇒ x ∈ C ∩ D or x ∈ C
    ⇒ x ∈ C or x ∈ C (Since Theorem 3.5 b) says C ∩ D ⊆ C)
    ⇒ x ∈ C
    ⇒ (C ∩ D) ∪ C ⊆ C
equal classes).
Proof:
x ∈ (C′)′ ⇒ x ∉ C′ (By Definition 3.4)
    ⇒ x ∈ C (By Definition 3.4)
    ⇒ (C′)′ ⊆ C
Theorem 3.8 (De Morgan’s laws) Let C and D be classes (sets). Then
a) (C ∪ D)′ = C′ ∩ D′
b) (C ∩ D)′ = C′ ∪ D′
Proof:
a) Given: C and D are classes (sets).
Next x ∈ C′ ∩ D′ ⇒ x ∈ C′ and x ∈ D′
    ⇒ x ∉ C and x ∉ D (By Definition 3.4)
    ⇒ x ∉ C ∪ D (For if x ∈ C ∪ D, then x ∈ C or x ∈ D)
    ⇒ x ∈ (C ∪ D)′ (By Definition 3.4)
    ⇒ C′ ∩ D′ ⊆ (C ∪ D)′
(C ∪ D)′ ⊆ C′ ∩ D′ and C′ ∩ D′ ⊆ (C ∪ D)′ ⇒ (C ∪ D)′ = C′ ∩ D′
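De Morgan's laws can also be checked exhaustively on a small example. The following is a sanity check and not a proof: it assumes a hypothetical four-element universe U (complements are taken relative to it) and tests both laws for every pair of subsets of U. The names subsets and comp are introduced here for illustration.

```python
from itertools import combinations

# Hypothetical finite universe; complements are taken relative to it.
U = frozenset(range(4))

def subsets(s):
    """List every subset of the finite set s."""
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comp(x):
    """Complement of x relative to U."""
    return U - x

all_subsets = subsets(U)

# (C ∪ D)′ = C′ ∩ D′ for every pair of subsets of U.
law_a = all(comp(C | D) == comp(C) & comp(D)
            for C in all_subsets for D in all_subsets)

# (C ∩ D)′ = C′ ∪ D′ for every pair of subsets of U.
law_b = all(comp(C & D) == comp(C) | comp(D)
            for C in all_subsets for D in all_subsets)
```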
Theorem 3.10 Let A be a class and U denote the class of all elements.
a) U ∪ A = U
b) A ∩ U = A
c) U′ = ∅
d) ∅′ = U
e) A ∪ A′ = U
Proof:
a) By Theorem 3.5 a), U ⊆ U ∪ A. If x ∈ U ∪ A, then x ∈ A or x ∈ U .
In either case x is an element and so x ∈ U . Thus, U ∪ A ⊆ U . Hence,
U ∪ A = U.
Parts (b) to (e) are left as an exercise.
Proof:
a) x ∈ (⋃_{C∈A} C)′ ⇔ x ∉ ⋃_{C∈A} C
    ⇔ x ∉ C for all C ∈ A
    ⇔ x ∈ C′ for all C ∈ A
    ⇔ x ∈ ⋂_{C∈A} C′
a) D ∩ (⋃_{C∈A} C) = ⋃_{C∈A} (D ∩ C)
b) D ∪ (⋂_{C∈A} C) = ⋂_{C∈A} (D ∪ C)
Proof:
a) x ∈ D ∩ (⋃_{C∈A} C) ⇔ x ∈ D and x ∈ ⋃_{C∈A} C
    ⇔ x ∈ D and x ∈ C for some C ∈ A
    ⇔ x ∈ D ∩ C for some C ∈ A
    ⇔ x ∈ ⋃_{C∈A} (D ∩ C)
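The two distributive laws for a family of sets can be tried out on concrete data. The sketch below assumes a hypothetical three-member family (standing in for the class A) and a hypothetical set D; it only illustrates the statement for this one choice.

```python
from functools import reduce

# Hypothetical family of sets standing in for the class A, and a set D.
family = [{0, 1}, {1, 2, 3}, {3, 4}]
D = {1, 3, 5}

def big_union(sets):
    """Union of all members of a family of sets."""
    return set().union(*sets)

def big_intersection(sets):
    """Intersection of all members of a non-empty family of sets."""
    return reduce(lambda x, y: x & y, sets)

# a) D ∩ (union of the family) versus the union of the sets D ∩ C.
left_a = D & big_union(family)
right_a = big_union([D & C for C in family])

# b) D ∪ (intersection of the family) versus the intersection of the sets D ∪ C.
left_b = D | big_intersection(family)
right_b = big_intersection([D | C for C in family])
```

Note that big_intersection is only defined here for a non-empty family, which sidesteps the question of what the intersection of an empty family should be.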
Concepts review:
1. If A is a class of classes, how should we interpret the class ⋃_{C∈A} C?
2. If A is a class of classes, how should we interpret the class ⋂_{C∈A} C?
How do we know that this is indeed a class?
3. What does it mean to say that two classes A and B are disjoint?
4. What is the complement, C′, of a class C?
5. What is the difference, C − D, of the two classes C and D? What
is the symmetric difference C △ D?
6. If A is a non-empty set of sets, how do we know that the union
⋃_{C∈A} C is a set?
7. What do De Morgan’s laws say in reference to two classes C and
D?
8. Let A be a class of classes. Can we generalize De Morgan’s laws to
{C : C ∈ A }?
9. Is it true that the union of sets C in a class A is a set?
10. List the ZF-axioms that refer specifically to sets and were invoked
at least once up to now.
11. In algebra, we know about the distributive property of “multiplica-
tion over sums and differences”. Is there a similar property which
refers to “unions distributing over intersections” and “intersections
distributing over unions”?
EXERCISES
A. 1. Prove or disprove that if D ∈ A, then it is always true that D ⊆ ⋃_{C∈A} C.
2. Show that if A is a class of sets, then A ⊆ P(⋃_{x∈A} x).
C. 10. If A and B are sets, show that P(A) ∈ P(B) implies P(A) ⊆ B and so
A ∈ B.
4 / Cartesian products.
Abstract. In this section we define the notion of “ordered pairs” in terms
of classes and sets. We then define the Cartesian product of two classes
(sets). We also present a few of the basic properties of Cartesian products.
This is a bit wordy. Also, it is not clear what the words “first” and “second”
mean. We have not defined these in our set-theoretic universe. Can we define
ordered pairs without using the words “first” and “second”? That is, can
we obtain an equivalent definition of “ordered pairs” by avoiding these two
words entirely? Let us consider the following definition and then see if it
works.
First, we should verify that there are no inherent ambiguities in this defi-
nition. Remember that in our set theory all objects are classes. Given this
definition we should first verify that if c and d are elements, then {{c}, {c, d}}
is a class:
c and d are elements ⇒ {c} and {d} are classes. (Axiom A2.)
We verify immediately that if c and d are sets, then {{c}, {c, d}} is a set:
c and d are sets ⇒ {c} and {d} are sets. (Axiom of pair)
⇒ {c} ∪ {d} = {c, d} is a set. (Axiom of union)
⇒ {{c}, {c, d}} is a set. (Axiom of pair)
Now that this has been established, we should make sure that the double-
ton defined above satisfies the essential “ordered pairs property”, [(a, b) =
(c, d)] ⇔ [a = c and b = d].
Theorem 4.2 Let a, b, c and d be elements. Then (a, b) = (c, d) if and only
if a = c and b = d.
Proof:
(⇐) That a = c and b = d implies (a, b) = {{a}, {a, b}} = {{c}, {c, d}} =
(c, d) is immediate.
(⇒) What we are given: (a, b) = (c, d). What we are required to show: a = c
and b = d.
1 Kazimierz Kuratowski (1896-1980) was a Polish mathematician and logician. He was
Case 1: a ≠ b ⇒ {a, b} ≠ {c} hence {a, b} = {c, d}
    ⇒ {a} = {c}
    ⇒ a = c
{a, b} = {c, d} and a = c ⇒ b = d
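The ordered pairs property can be tested by brute force on a few elements. The sketch below is an illustration rather than a proof: it models the Kuratowski pair {{a}, {a, b}} with Python frozensets, and the helper name kpair and the three test elements are assumptions made for this example.

```python
from itertools import product

def kpair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}} built from frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# Hypothetical elements to test with.
elements = [0, 1, 2]

# (a, b) = (c, d) if and only if a = c and b = d, for every choice of a, b, c, d.
property_holds = all(
    (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(elements, repeat=4)
)
```

When a = b the pair collapses to {{a}}, since {a, a} = {a}; the check above covers that case as well.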
The main reason why this definition may be intuitively appealing to some
is that it looks more like we are indexing the two elements c and d with the
symbols ∅ and {∅}. It allows one to visualize the ordered pair as follows:
(c, d) = { {c, ∅}, {d, {∅}} } = {c_∅, d_{∅}} = {c_0, d_1}
This in fact resembles more the way we will be viewing ordered pairs once
we define “functions” and the “natural numbers”. We will defer the proof
which guarantees that this definition satisfies the essential property of or-
dered pairs to the end of this section. In this text, we will adopt the more
commonly used Kuratowski definition.
Definition 4.4 Let C and D be two classes (sets). We define the Cartesian
product, C × D, as follows:
C × D = {(c, d) : c ∈ C and d ∈ D}
one of the founders of modern topology and who contributed significantly to set theory,
descriptive set theory, measure theory, and functional analysis. Historical note: Life became
difficult for Hausdorff and his family after the Kristallnacht of 1938. The next year he
initiated efforts to emigrate to the United States, but was unable to make arrangements to
receive a research fellowship. On 26 January 1942, Hausdorff died by suicide rather than
comply with German orders to move to the Endenich camp (Wikipedia).
Lemma 4.5 Let C and D be two classes (sets). Then the Cartesian product,
C × D, of C and D satisfies the property C × D ⊆ P(P(C ∪ D)).
Proof:
What we are given: That C and D are two classes (sets).
What we are required to show: C × D ⊆ P(P(C ∪ D)).
Let c ∈ C and d ∈ D. It will suffice to show that (c, d) ∈ P(P(C ∪ D)).
{c} ∈ P({c, d}) and {c, d} ∈ P({c, d}) ⇒ {{c}, {c, d}} ⊆ P({c, d})
⇒ (c, d) ⊆ P({c, d})
⇒ (c, d) ∈ P(P({c, d}))
P(P({c, d})) ⊆ P(P(C ∪ D))† ⇒ (c, d) ∈ P(P(C ∪ D))
Hence, C × D ⊆ P(P(C ∪ D)), as required.
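The inclusion in Lemma 4.5 can be observed directly on small sets. In the sketch below the sets C and D and the helpers power_set and kpair are hypothetical choices for illustration; ordered pairs are built as Kuratowski pairs from frozensets so that they really are members of a power set of a power set.

```python
from itertools import combinations

def power_set(s):
    """Power set of a finite set, as a set of frozensets."""
    items = list(s)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}

def kpair(a, b):
    """Kuratowski ordered pair {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# Hypothetical small sets.
C = {1, 2}
D = {2, 3}

# C × D as a set of Kuratowski pairs.
product_CD = {kpair(c, d) for c in C for d in D}

# P(P(C ∪ D)): here C ∪ D has 3 elements, so this has 2^(2^3) = 256 members.
bound = power_set(power_set(C | D))

inclusion_holds = product_CD <= bound
```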
Proof:
To show that C × D is a class we can express C × D as
We can then write that if C and D are sets, C × D is the set of all those
specific elements u in P(P(C ∪ D)) which are of the form u = (c, d) for
some c in C and d in D.
Once we have defined the Cartesian product of two classes C and D, referring
to our definition of ordered triples (c, d, e) = ((c, d), e), we can define the
Cartesian product of three classes C, D and E as follows:
C × D × E = {(c, d, e) : c ∈ C, d ∈ D, e ∈ E}
= {((c, d), e) : c ∈ C, d ∈ D, e ∈ E}
= (C × D) × E
Proof:
a) (c, d) ∈ C × (D ∩ E) ⇔ c ∈ C and d ∈ (D ∩ E)
⇔ c ∈ C and d ∈ D and d ∈ E
⇔ (c, d) ∈ C × D and (c, d) ∈ C × E
⇔ (c, d) ∈ (C × D) ∩ (C × E)
(c, e) ∈ C × E ⇒ c ∈ C and e ∈ E
⇒ c ∈ D and e ∈ F (Since C ⊆ D and E ⊆ F )
⇒ (c, e) ∈ D × F
Hence, C × E ⊆ D × F .
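Both facts proved above can be checked on sample data. This sketch uses Python tuples in place of set-theoretic ordered pairs, which is an assumption of convenience, and the particular sets (including the name D2 for the larger first factor) are hypothetical.

```python
from itertools import product

# Hypothetical sample sets.
C = {1, 2}
D = {"a", "b", "c"}
E = {"b", "c", "d"}

# C × (D ∩ E) compared with (C × D) ∩ (C × E).
lhs = set(product(C, D & E))
rhs = set(product(C, D)) & set(product(C, E))

# Monotonicity: C ⊆ D2 and E ⊆ F should give C × E ⊆ D2 × F.
D2 = {1, 2, 3}
F = {"b", "c", "d", "e"}
monotone = set(product(C, E)) <= set(product(D2, F))
```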
Theorem 4.10 For classes c, d, e and f, if (c, d) = {{c, ∅}, {d, {∅}}} and
(e, f) = {{e, ∅}, {f, {∅}}}, then (c, d) = (e, f) if and only if c = e and d = f.
Proof:
(⇐) That c = e and d = f implies (c, d) = (e, f) is immediate.
(⇒) (c, d) = (e, f) ⇒ {{c, ∅}, {d, {∅}} } = {{e, ∅}, {f, {∅}} }
If {c, ∅} ≠ {e, ∅} ⇒ {c, ∅} = {f, {∅}} and {d, {∅}} = {e, ∅}
{c, ∅} = {f, {∅}} ⇒ f = ∅ (Since ∅ ≠ {∅} this forces f = ∅.)
    ⇒ {∅} = c
{d, {∅}} = {e, ∅} ⇒ d = ∅ and e = {∅} (For the same reasons as above.)
c = {∅} = e and d = ∅ = f ⇒ c = e and d = f.
Note that the two different representations of ordered pairs (a, b),
{{a}, {a, b}} and {{a, ∅}, {b, {∅}} } do not form equal sets. These two
classes only share the fundamental property of ordered pairs.
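The remark can be made concrete with frozensets. The sketch below is illustrative only: the function names kuratowski and alt_pair, and the three test elements ∅, {∅} and {{∅}}, are choices made here. It checks that both encodings satisfy the ordered pairs property on these elements, while producing different sets for the same pair.

```python
from itertools import product

EMPTY = frozenset()            # ∅
ONE = frozenset({EMPTY})       # {∅}
TWO = frozenset({ONE})         # {{∅}}

def kuratowski(a, b):
    """The pair {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def alt_pair(a, b):
    """The alternative pair {{a, ∅}, {b, {∅}}}."""
    return frozenset({frozenset({a, EMPTY}), frozenset({b, ONE})})

elements = [EMPTY, ONE, TWO]

def has_pair_property(pair):
    """Check (a, b) = (c, d) iff a = c and b = d over the test elements."""
    return all(
        (pair(a, b) == pair(c, d)) == (a == c and b == d)
        for a, b, c, d in product(elements, repeat=4)
    )

both_ok = has_pair_property(kuratowski) and has_pair_property(alt_pair)
same_set = kuratowski(EMPTY, ONE) == alt_pair(EMPTY, ONE)
```

The test elements deliberately include ∅ and {∅}, since these are the delicate cases for the alternative definition.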
Concepts review:
1. If c and d are elements, what is the (Kuratowski) definition of the
ordered pair (c, d)?
2. Given two classes C and D, what is the definition of C × D?
3. If C and D are sets, is it true that C × D ⊆ P(P(C ∪ D))? Why?
4. Is it generally true that C × D = D × C? If so, why? If not, give a
counterexample.
5. Is it generally true that (C × D) ∪ (E × F ) = (C ∪ E) × (D ∪ F )?
If so, why? If not, give a counterexample.
EXERCISES
Relations
Part III: Relations 47
5.1 Introduction.
We have seen that studying sets by categorizing them according to some
clearly identifiable properties or some characteristics shared by some but
not others is enough of a valuable endeavor to sustain the interest of many.
But for others, restricting our study to this aspect of set theory would be
missing the point, if not the main reason why set theory constitutes a branch
of mathematics well worth studying.
Consider, for example, the following analogous situation. Suppose one wishes
to study the universe whose elements are “people living on this planet” by
regrouping individuals in this universe based on identifiable characteristics
shared by some but not by others. For example, one might identify
characteristics based on race, culture, religious beliefs and so on. Research would
identify that most individuals are either male or female, with much smaller
groups which identify as neither or even both. Although studying, in this
way, various sub-categories of individuals which populate this universe is
worth investigating, trying to understand how subgroups “relate” to each
other appears to be a much richer area of study. Certainly, it is more
complex. Also, it would be seen as being more useful, in practice. For similar
reasons, the time is ripe in our study to embark on the study of the notion
of “relations between sets”.
Recall that the symbol U denotes the “Universal class”, {x : x = x} (the
class of all elements). Since U is a class, we can then construct the Cartesian
product, U × U , itself a class (as we have seen). Recall that the elements
of U × U are ordered pairs.
Definition 5.1
Also note that when we say “y is related to x under R”, we mean (y, x) ∈ R.
R1 = {(x, y) : x ∈ y}
For example, we can write (a, {a, b}) ∈ R1 or, if one prefers, aR1 {a, b}
“holds true”. On the other hand, we can write (b, {c, d}) ∉ R1. Also,
(∅, ∅) ∉ R1, but (∅, {∅}) ∈ R1.
2) We define a relation, R3, in U as follows: (x, y) ∈ R3 if and only if x = y.
This says that “a class (a set) x is related only to itself and no other
class”. We can write R3 = {(x, y) : x = y}. We see that (a, {a}) ∉ R3
but that ({a}, {a}) ∈ R3. The statement ∅R3 ∅ is true.
a) The relation
∈C = {(x, y) : (x, y) ∈ C × C, x ∈ y}
is called the membership relation on C.
b) The relation
We see that the only elements in the identity relation IdU on U are those
of the form (x, x).
as R = {(x, y) : x ∈ y}.
Since R was found to be:
R = { (a, {a}), (a, {a, b}), (b, {a, b}), (∅, {∅}) }
R−1 = { ({a}, a), ({a, b}, a), ({a, b}, b), ({∅}, ∅) }
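The listing of R and its inverse can be reproduced mechanically. In the sketch below the underlying class (named C here) is a hypothetical six-element collection chosen so that the membership relation on it yields the four pairs listed above; Python tuples stand in for ordered pairs and frozensets for sets.

```python
a, b = "a", "b"
EMPTY = frozenset()

# Hypothetical class on which the membership relation is taken.
C = [a, b, EMPTY, frozenset({a}), frozenset({a, b}), frozenset({EMPTY})]

# R = {(x, y) : x ∈ y}, restricted to C. Only frozensets can have members.
R = {(x, y) for x in C for y in C
     if isinstance(y, frozenset) and x in y}

# The inverse relation swaps the two entries of every pair.
R_inverse = {(y, x) for (x, y) in R}
```

Inverting twice returns the original relation, which is a quick way to confirm that no pair was lost.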
Definition 5.5 Let C be a class (a set) and let R and T be two relations in
C. We define the relation T ◦ R as follows:
R = { (a, {a}), (a, {a, b}), (b, {a, b}), (∅, {∅}) }
Concepts review:
1. Given a class C, what is a relation on C?
2. Given a relation R on a class C, what does the expression xRy
mean?
3. Given a class C, what is the membership relation, ∈C , on C?
4. Given a class C, what is the identity relation, IdC , on C?
5. Given a relation R on a class C, what is the domain, dom R, of R
and the image, im R, of R?
6. Given a relation R on a class C, what is the inverse, R−1 , of the
relation R? Is R−1 a relation on C?
7. Does a relation R on C have to be “one-to-one” for R−1 to be a
relation?
52 Section 5: Relations on a class or set
EXERCISES
3) Let S = {a, b, c, d}. Consider the relation R2 = {(a, a), (b, b), (d, d), (a, b)}.
· R2 is not a reflexive relation on S since R2 does not contain (c, c). It is
not irreflexive since it contains (a, a).
· R2 is not a symmetric relation on S since R2 contains (a, b) but not
(b, a).
1 Note that the statement “whenever (a, b) and (b, a) are in S, then a = b” holds true.
R3 = {(a, a), (b, b), (c, c), (d, d), (a, b), (b, a), (b, c), (c, b)}
· We see that R3 is a reflexive relation on S.
· Since R3 contains both pairs {(a, b), (b, a)} and {(b, c), (c, b)}, then R3
is a symmetric relation on S.
· Since R3 contains both (a, b) and (b, a), where a ≠ b, then R3
is not antisymmetric.
· Since R3 contains (a, a), then R3 is not asymmetric.
· Since R3 contains (a, b) and (b, c) but does not contain (a, c),
then R3 is not transitive on S.
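Properties such as these are easy to get wrong by inspection, so it can help to test them mechanically. The sketch below assumes the standard definitions of the five properties (taken here to agree with the definitions given earlier in this section) and applies them to R3, with the elements a, b, c, d represented by distinct strings. The function names are introduced only for this illustration.

```python
def is_reflexive(R, S):
    """Every x in S is related to itself."""
    return all((x, x) in R for x in S)

def is_symmetric(R):
    """(x, y) in R implies (y, x) in R."""
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(R):
    """(x, y) and (y, x) both in R implies x = y."""
    return all(x == y for (x, y) in R if (y, x) in R)

def is_asymmetric(R):
    """(x, y) in R implies (y, x) not in R."""
    return all((y, x) not in R for (x, y) in R)

def is_transitive(R):
    """(x, y) and (y, w) both in R implies (x, w) in R."""
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

S = {"a", "b", "c", "d"}
R3 = {("a", "a"), ("b", "b"), ("c", "c"), ("d", "d"),
      ("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")}

results = {
    "reflexive": is_reflexive(R3, S),
    "symmetric": is_symmetric(R3),
    "antisymmetric": is_antisymmetric(R3),
    "asymmetric": is_asymmetric(R3),
    "transitive": is_transitive(R3),
}
```

The transitivity test is the one to watch: the pairs (a, b) and (b, c) are present, so the test asks whether (a, c) is present as well.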
A word of caution: Some readers may conclude that a relation R which is
both symmetric and transitive on a class S is automatically reflexive based
on the following reasoning:
Symmetric R says “(a, b) ∈ R implies (b, a) ∈ R” while transitive
R says “(a, b) and (b, a) in R implies (a, a) ∈ R ”. So “symmetric
+ transitive ⇒ reflexive”.
This means all ordered pairs in S × T with the same first entry are
related under R. Observe that
i. ((a, b), (a, b)) ∈ R for all (a, b) ∈ D so R is reflexive on D.
ii. ((a, b), (c, d)) ∈ R ⇒ a = c ⇒ ((c, d), (a, b)) ∈ R so R is symmetric.
iii. ((a, b), (c, d)) ∈ R and ((c, d), (e, f)) ∈ R ⇒ a = c = e ⇒
((a, b), (e, f)) ∈ R so R is transitive.
We conclude that R is an equivalence relation on D.
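The three verifications above can be repeated on a concrete instance. The sketch below takes D = S × T for two hypothetical small sets S and T, relates two pairs exactly when their first entries agree, and then groups the pairs of D by this relation. Python tuples stand in for ordered pairs.

```python
from itertools import product

# Hypothetical small sets.
S = {1, 2}
T = {"u", "v", "w"}
D = set(product(S, T))

# Two pairs are related when they have the same first entry.
R = {(p, q) for p in D for q in D if p[0] == q[0]}

reflexive = all((p, p) in R for p in D)
symmetric = all((q, p) in R for (p, q) in R)
transitive = all((p, r) in R
                 for (p, q) in R for (q2, r) in R if q == q2)

# Group the elements of D: one class for each first entry x, namely {x} × T.
classes = {frozenset(q for q in D if (p, q) in R) for p in D}
```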
c) We will refer to the first example presented on page 54. If G denotes
all the inhabitants of Gotham City and H is the relation on G defined
as
H = {(x, y) : x and y are siblings or the same person}
then H is reflexive, symmetric and transitive and so forms an equiva-
lence relation on G.
Note that none of the relations defined above are equivalence relations since
a partial ordering relation is normally not symmetric, while the strict order-
ing relation is not reflexive.
Notation : If R is a partial ordering relation, then, instead of writing (a, b) ∈
R or aRb, it is standard to write
a≤b
a<b
what they mean. Studying the following few examples carefully will help
construct a mental representation of the structure these relations provide to
sets.
Example 1.
Mortimer is constructing a chart in which he will list all of his ancestors.
He lets S denote a set whose elements represent his ancestors. He defines
an order relation R on S as follows: If a and b are two ancestors, (a, b) ∈ R
only if b is an ancestor of a − equivalently, a is a descendant of b. We list
some properties of the relation R:
− We see that R is transitive since, “a is a descendant of b” and “b is a
descendant of c” implies “a is a descendant of c”.
− Since an element a cannot be an ancestor of a, R is irreflexive.
− Finally, if a is a descendant of b, then clearly b cannot be a descendant
of a. So R is asymmetric.
We conclude that R is a strict order relation on S. Instead of writing (a, b) ∈
R, we will write a < b with the understanding that “<” is only to be
interpreted as “a is a descendant of b”. We list a few more properties of R:
− It is clear that R does not linearly order S since one parent of Mor-
timer cannot be an ancestor of the other parent (excluding cases where
something highly unnatural is going on). Hence, there exist pairs of
ancestors a and b such that a ≮ b and b ≮ a.
− Let’s assume that Mortimer has included himself in the set S and is
represented by the letter M . Then M < a for all a ∈ S other than M
itself. We will say that M is the minimum element of S with respect to
the ordering “<”.
− Beginning with M , Mortimer can trace different paths upwards forming
chains of inequalities each in the form M < a < b < c < · · · < · · · . Such
chains are linearly ordered subsets of S since for any two elements a, b
in such chains either a < b or b < a. So, not only is M the minimum
element of S, M is also the minimal element of each chain.
To allow us to illustrate in this example as many properties of ordered re-
lations as possible, let’s assume that Adam and Eve were “spontaneously
generated” and so were the most ancient of Mortimer’s ancestors (assuming
Adam and Eve are the only human beings which were spontaneously gener-
ated). Say that in the set S, Adam is represented by A and Eve by E. We
add a few other properties of the set S when equipped with the given order
relation R:
− We see that there are numerous chains of elements (linearly ordered
subsets of S) each of which begins with the minimal element M and
finishes with either A or E.
− If a chain C linearly links M to A, then A is a maximal element of this
chain in the sense that all elements of S which are comparable to A
are “below” A. Similarly, E is the maximal element of all chains which
link M to E.
Example 2.
Let S denote the set of all molecules constructed from the atoms listed in
the periodic table of elements. In this case, molecules are viewed as sets
whose elements are atoms. (We exclude crystals.) The simplest molecules
are those that contain only one atom. We define the relation R on S as
follows: (a, b) ∈ R if a ≠ b and all atoms in molecule a are contained in
molecule b and any atom which appears n times in a appears at least n
times in b. If (a, b) ∈ R, we will write a ⊂ b (or say that a is a proper
subset of b). For example, (H2O, H2O2) ∈ R. We describe the structure of S
when equipped with this particular relation R.
− By definition, R is irreflexive.
− If molecule a is a proper subset of molecule b, then b cannot be a proper
subset of molecule a and so R is asymmetric.
− If a ⊂ b and b ⊂ c, then a ⊂ c and so R is transitive.
We conclude that the relation R strictly orders the set S. The set S is
however not linearly ordered since well-known molecules such as Cl2 (gaseous
chlorine) and H2 (hydrogen gas) are not comparable under “⊂”. We discuss
a few more properties of S when equipped with R.
− Since a molecule made of a single atom cannot properly contain
any other molecule, “single-atom molecules” are minimal elements
of S. So S has as many minimal elements as there are atoms. Clearly,
S does not contain a minimum element.
− There are atoms belonging to the family of noble gases (helium (He),
argon (Ar), krypton (Kr), etc.) that are non-reactive and so do not tend
to bond with other elements to form molecules. In S, these elements will
form one-element chains. For example, the helium atom is not properly
contained in any other molecule. It is both a maximal and minimal
element of S. These particular atoms (noble gases) form in S what is
called an antichain. An antichain is a subset of an ordered set in which
no two elements are comparable.
− Other molecules will join together to form new molecules. For example,
the carbon element, C, and hydrogen element, H, both belong to the
molecule, CH2, which will join with some other molecules containing
carbon and oxygen atoms to form C2H2O4.
− The order structure of S will then contain numerous chains of
molecules. One suspects that at some point each of these chains may
attain some extremely large molecule which is non-reactive or will be
too unstable to form other lasting links. If this is the case, then such a
molecule is a maximal element of S. However, S cannot have a maxi-
mum element since elements such as those belonging to the noble gases
are not contained in any molecule.
Example 3.
Consider the set Z of all integers. If m and n are non-zero integers, we will
say that “m divides n” if there exists a positive integer k such that mk = n.
We will define the relation R on Z as follows:
R = {(0, n) : n ∈ Z} ∪ {(m, n) : m divides n} ∪ {(m, n) : m < 0 and n > 0}
Concepts review:
EXERCISES
Sx = {y : (x, y) ∈ R} = {y : xRy}
That is, Sx is the set2 of all elements y in S such that x is related to y under
R.
1 A set S of sets whose elements are “pairwise disjoint” means that for any two sets A
and B in S , A ∩ B = ∅.
2 We justify that Sx is a “set” as follows: We have that Sx is a subclass of the class S;
given that S is declared to be a set, by axiom A4 (Axiom of subset) Sx is a set.
Proof:
Suppose R is an equivalence relation on the set S and x ∈ S.
Since R is reflexive, (x, x) ∈ R so Sx contains x. Then Sx is non-empty.
Then S ⊆ ⋃_{x∈S} Sx. Since, for each x ∈ S, x ∈ Sx ⊆ S, then ⋃_{x∈S} Sx ⊆ S
and so S = ⋃_{x∈S} Sx.
SR = {Sx : x ∈ S}
denote the set of all the sets formed from the equivalence relation R.
We verify that the class, SR , is indeed a set: Since S is declared to be a set,
then P(S) is a set (by the Axiom of power set); since SR is a subclass of
the set P(S), then, by the Axiom of subset, SR is a set.
We summarize three important properties of SR :
1) ⋃_{x∈S} Sx = S.
2) Sx ≠ Sy ⇒ Sx ∩ Sy = ∅.
3) Sx ≠ ∅ for all x ∈ S.
The three properties together describe what is called a partition of a set S.3
A proper understanding of the method used to partition a set S in this way
is important in our study of set theory.
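The construction of SR and its three properties can be traced on a small example. The sketch below is an illustration under assumptions: the set S = {0, …, 8} and the relation "x − y is divisible by 3" are hypothetical choices, and the helper name equivalence_class is introduced here.

```python
# Hypothetical set and equivalence relation (congruence modulo 3).
S = set(range(9))
R = {(x, y) for x in S for y in S if (x - y) % 3 == 0}

def equivalence_class(x):
    """Sx = {y in S : (x, y) in R}."""
    return frozenset(y for y in S if (x, y) in R)

# SR = {Sx : x in S}; equal classes collapse to a single member.
S_R = {equivalence_class(x) for x in S}

# Property 1: the classes cover S.
covers = set().union(*S_R) == S
# Property 2: two distinct classes have no element in common.
pairwise_disjoint = all(A == B or not (A & B) for A in S_R for B in S_R)
# Property 3: no class is empty.
non_empty = all(len(A) > 0 for A in S_R)
```

Although there are nine elements x, the set S_R has only three members, because Sx and Sy coincide whenever x and y are related.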
SIdS = {Sx : x ∈ S}
3 If a set SR = {Sx : x ∈ S} of subsets of S is such that ⋃_{x∈S} Sx = S, we often say “SR
covers S” to express this fact.
Sx = {y ∈ S : (x, y) ∈ IdS }
= {y ∈ S : x = y}
= {x}
then
SIdS = {{x} : x ∈ S}
The set, SIdS , represents the “finest” partition possible of the set S.
2) Consider the relation R on the set S defined as follows: (x, y) ∈ R if
x and y both belong to S. The relation R is easily verified to be an
equivalence relation on S. See that, for each x ∈ S,
Sx = {y ∈ S : (x, y) ∈ R}
= {y ∈ S : x and y belong to S}
= {y : y ∈ S}
= S
then
SR = {S}
Then the set SR contains only one element, S. Note that it would be
incorrect to write SR = S or SR = {{S}}. The set SR is referred to
as being the “coarsest” partition on S.
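The two extreme partitions can be computed side by side. In the sketch below the three-element set S and the helper quotient are hypothetical; the identity relation and the relation that relates every pair of elements of S are built explicitly as sets of tuples.

```python
# Hypothetical small set.
S = {"p", "q", "r"}

def quotient(S, R):
    """Return {Sx : x in S}, where Sx = {y in S : (x, y) in R}."""
    return {frozenset(y for y in S if (x, y) in R) for x in S}

identity = {(x, x) for x in S}                 # Id_S
everything = {(x, y) for x in S for y in S}    # every pair is related

finest = quotient(S, identity)       # one singleton class per element
coarsest = quotient(S, everything)   # a single class, S itself
```

The coarsest partition is the one-element set {S}, which is a different object from S.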
Concepts review:
EXERCISES
C. 7. Let S be a set and let R and T be two equivalence relations on S. For each
x ∈ S let R Sx = {y : y ∈ S, (x, y) ∈ R} and T Sx = {y : y ∈ S, (x, y) ∈ T }.
Let SR = {R Sx : x ∈ S} and ST = {T Sx : x ∈ S}. We have seen
that SR and ST form sets of non-empty subsets of S which are pairwise
disjoint and cover all of S. For each x ∈ S, let Sx = R Sx ∩ T Sx . Show that
S = {Sx : x ∈ S} forms a set of subsets of S satisfying the properties:
a) Sx ≠ ∅ for each x.
b) Whenever Sx ≠ Sy then Sx ∩ Sy = ∅.
c) ⋃_{x∈S} Sx = S.
8. Let S and T be sets. It has been shown that “⊆” constitutes a partial order
relation on the set P(S). Consider the set L = P(S) × P(T ). We define
Definition 8.1 Let S be a set. We say that a set of subsets C ⊆ P(S) forms
a partition of S if C satisfies the three properties:
1) ⋃_{A∈C} A = S.
2) If A and B ∈ C and A ≠ B, then A ∩ B = ∅.
3) A ≠ ∅ for all A ∈ C.
Based on this definition, we can see that the set of subsets of S, SR = {Sx :
x ∈ S}, formed by the equivalence relation, R, is a partition of the set S.1
The elements of SR are given a particular name.
Note that if S is a set, then by the Axiom of subset, the equivalence classes in
S/R = SR = {Sx : x ∈ S} are in fact “sets”. But if S is a proper class, then
the elements of S/R = {Sx : x ∈ S} may be proper classes.
D/R = DR = {{x} × T : x ∈ S}
Thus, the only pairs of elements x and y of S which are related under RC
are those pairs that appear together in the same subset C ∈ C . Is RC an
equivalence relation?
− We verify that RC is reflexive: For every x ∈ S, x belongs to some C
and so {x} = {x, x} ⊆ C and so (x, x) ∈ RC .
− We verify symmetry of RC : If {x, y} ⊆ C ∈ C , then {y, x} ⊆ C. So
(x, y) ∈ RC ⇒ (y, x) ∈ RC .
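− We verify transitivity of RC : If {x, y} ⊆ C and {y, z} ⊆ D, where C
and D both belong to the partition, then y ∈ C ∩ D. Since distinct
members of a partition are disjoint, C = D and so {x, z} ⊆ C. So
(x, y) ∈ RC and (y, z) ∈ RC ⇒ (x, z) ∈ RC .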
The relation RC is indeed an equivalence relation on S. We conclude that
any partition C of a set S defines an equivalence relation RC on S. This
result deserves to be called a theorem.
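The passage from a partition to an equivalence relation can be followed on a small example. In the sketch below the partition of {1, …, 6} is a hypothetical choice; R_C relates x and y exactly when they lie in a common block, and the last step rebuilds the equivalence classes to see that they are the blocks one started with.

```python
# Hypothetical partition of S = {1, ..., 6}.
partition = [{1, 2}, {3}, {4, 5, 6}]
S = set().union(*partition)

# (x, y) is in R_C exactly when x and y belong to the same block.
R_C = {(x, y) for block in partition for x in block for y in block}

reflexive = all((x, x) in R_C for x in S)
symmetric = all((y, x) in R_C for (x, y) in R_C)
transitive = all((x, w) in R_C
                 for (x, y) in R_C for (z, w) in R_C if y == z)

# The equivalence classes of R_C, to be compared with the original blocks.
recovered = {frozenset(y for y in S if (x, y) in R_C) for x in S}
```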
Concepts review:
1. Given a set S, what does it mean to say that the class C of subsets
of S partitions S?
2. Given an equivalence relation R on a set S and an element x ∈ S,
what is an equivalence class of x under R? How is it denoted?
3. Given an equivalence relation R on a set S, what is the quotient set
of S induced by R? How is it denoted?
4. Given an equivalence relation R on a set S, what do the expressions
SR and S/R mean?
5. Given a partition C of a set S, can we define an equivalence relation
R on S such that S/R = C ?
6. What does it mean to say that an equivalence relation R refines the
equivalence relation T ?
7. Is there an equivalence relation on a set S that refines all other
equivalence relations?
8. Is there an equivalence relation on a set S that is refined by all
other equivalence relations?
EXERCISES
Functions
Part IV: Functions 79
illustrates the rule for f : A → B. Then f(b) = {b, {b}}. The set which
follows from φ is,
For practical reasons, we normally just use the symbol f to represent both
the rule and the set which flows from it. Opportunities to say more about
the notion of a function abound in the following chapters in this text.
Our objective will be to define the concept of a function within the ZFC-
universe, without adding any new primitive concepts to the three we already
have: “class”, “set”, “belongs to”. We must formulate this definition care-
fully so that it represents precisely what we want and understand it to be.
For most readers, the notion of a “function” is intrinsically linked to those
sets we call “numbers”, commonly studied in the form of polynomial,
trigonometric or exponential functions. We will see that functions hover,
in the abstract, well above those sets we will call numbers. We have not yet
shown how numbers can be constructed using our ZFC axioms. This is yet
to come. Studying functions in the absence of numbers will allow readers to
better see, in essence, what they truly are.
1) f ⊆ A × B.
2) For every a ∈ A, there exists b ∈ B such that (a, b) ∈ f.
3) If (a, b) ∈ f and (a, c) ∈ f, then b = c. Equivalently, if (a, b) ∈ f and
(c, d) ∈ f, then (b ≠ d) ⇒ (a ≠ c).
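The three conditions translate directly into a test on a finite set of pairs. The sketch below is an illustration with hypothetical data: is_function and the sample relations f, g and h are names introduced here, and Python tuples stand in for ordered pairs.

```python
def is_function(f, A, B):
    """Check the three conditions for f to be a function from A to B."""
    # 1) f ⊆ A × B
    in_product = all(a in A and b in B for (a, b) in f)
    # 2) every a in A is paired with at least one b in B
    total = all(any((a, b) in f for b in B) for a in A)
    # 3) no a is paired with two different values
    single_valued = all(b == c for (a, b) in f for (a2, c) in f if a == a2)
    return in_product and total and single_valued

A = {1, 2, 3}
B = {"x", "y"}

f = {(1, "x"), (2, "x"), (3, "y")}             # satisfies all three conditions
g = {(1, "x"), (1, "y"), (2, "x"), (3, "y")}   # 1 is paired with two values
h = {(1, "x"), (2, "y")}                       # 3 is paired with nothing
```

Note that f is allowed to send 1 and 2 to the same value; condition 3 only forbids one input having two outputs.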
f :A→B
A = dom f
im f = f[A]
ran f = f[A]
Proof:
9.8 Example.
Suppose f : U → V is a function where U ⊆ V . Show that f ∈
P(P(P(V ))).
Solution: For each x ∈ U , let yx = f(x) ∈ V . Then
f = {(x, yx) : x ∈ U } ⊆ U × V
Recall from Kuratowski’s definition of ordered pair in 4.1 that (x, yx) =
{ {x}, {x, yx} }. So
f = { { {x}, {x, yx} } : x ∈ U }
4 We should justify that C is a set: Since ∅ is a subclass of any set, then by the Axiom
A4 (Axiom of subset), ∅ is a set. Also {∅} is a set since {∅, ∅} is a set (by the Axiom of
pair). Again by the Axiom of pair, {∅, {∅}} is a set.
5 The Greek letter χ is pronounced “kie” (like the word “pie”).
That is,
f = { { {x}, {x, yx} } : x ∈ U } ∈ P(P(P(V )))
Concepts review:
1. What is the definition of a function f from a set A to a set B?
2. Is it acceptable to view a function f from a set A to a set B as a
set of ordered pairs?
3. Given a function f from a set A to a set B, what do each of the
sets dom f, codom f, im f and ran f represent?
4. Given a function f from a set A to a set B and y ∈ im f, what is
the preimage or inverse of y?
5. Given a function f from a set A to a set B and a set D such that
D ⊆ A, what does the symbol f|D mean?
6. Given a function f from a set A to a set B, where A = C ∪ D is it
true that f = f|C ∪ f|D ?
7. Given the two functions f : A → B and g : A → B, what does it
mean to say that the functions f and g are equal? How can we show
that f = g?
8. Given a function f from a set A to a set B, what does it mean to
say that f is onto B?
9. Given a function f from a set A to a set B, what does it mean to
say that f is one-to-one into B?
10. Given a function f from a set A to a set B, what does it mean to
say that f is injective?
11. Given a function f from a set A to a set B, what does it mean to
say that f is surjective?
12. Given a function f from a set A to a set B, what does it mean to
say that f is one-to-one and onto B?
EXERCISES
10 / Operations on functions
Abstract. In this section we define the composition, g◦f, of two func-
tions f : A → B and g : B → C. We view “composition of functions”
as an operation “ ◦” on two functions f and g. From this perspective we
then discuss the main properties of composition of functions (such as non-
commutativity and associativity). It is in this particular context that we
describe the identity function and the inverse of a function. We also de-
fine the concept of “invertible function”.
Thus, (x, z) ∈ h if and only if (x, z) = (x, g(f(x))). We will call h the
composition of g and f, and denote it by g◦f where
(g◦f)(x) = g(f(x))
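Viewing functions as sets of ordered pairs, composition becomes an operation on those sets. The sketch below is one way to write it for finite functions; the name compose and the sample functions f and g are hypothetical, and tuples stand in for ordered pairs.

```python
def compose(g, f):
    """Return g∘f = {(x, z) : (x, y) in f and (y, z) in g for some y}."""
    return {(x, z) for (x, y) in f for (y2, z) in g if y == y2}

# Hypothetical functions f : A → B and g : B → C given as sets of pairs.
f = {(1, "a"), (2, "b"), (3, "a")}
g = {("a", True), ("b", False)}

h = compose(g, f)

# Look up values through dictionaries to compare h(x) with g(f(x)).
f_of, g_of, h_of = dict(f), dict(g), dict(h)
agrees = all(h_of[x] == g_of[f_of[x]] for x in f_of)
```

The argument order in compose(g, f) follows the notation g◦f: f is applied first and g second.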
Proof:
What we are given: f : A → B and g : B → C are two functions.
What we are required to show: That g◦f is a function.
1) By definition of h = g◦f,
h = g◦f ⊆ A × C
Proof:
Thus, IB ◦f = f.
The proof of f ◦IA = f is similar. It is left to the reader.
Proof:
What we are given for parts (a) to (d): That f : A → B is a one-to-one onto
function.
a) What we are required to show: That there exists a function g such that
g(f(x)) = x. This function g will be f⁻¹.
Define g : B → A as follows: g(x) = y if and only if f(y) = x. We claim that
g : B → A is a well-defined function:
− Let x ∈ B. Since f is onto B, then there exists y ∈ A such that
f(y) = x. Thus, dom g = B. Suppose now that (x, y) and (x, z) are
in g. Then y and z are in A such that f(y) = x and f(z) = x.
Since f is one-to-one, then y and z must be the same element. Thus,
g : B → A is a well-defined function.
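The construction of g can be mirrored on a finite example by swapping the coordinates of each ordered pair. This is a minimal Python sketch under stated assumptions: the helper names inverse_relation and is_function_on and the sample sets A, B, f and k are hypothetical and serve only to illustrate the argument.

```python
def inverse_relation(f):
    """Swap the coordinates of every ordered pair in f."""
    return {(y, x) for (x, y) in f}


def is_function_on(h, domain):
    """True when h pairs each element of `domain` with exactly one value."""
    firsts = [x for (x, _) in h]
    return set(firsts) == set(domain) and len(firsts) == len(set(firsts))


A, B = {1, 2, 3}, {"a", "b", "c"}

# Hypothetical one-to-one onto function f : A → B.
f = {(1, "a"), (2, "b"), (3, "c")}
g = inverse_relation(f)
g_is_function = is_function_on(g, B)
round_trip = all(dict(g)[dict(f)[x]] == x for x in A)  # g(f(x)) = x

# Hypothetical function that is neither one-to-one nor onto B.
k = {(1, "a"), (2, "a"), (3, "c")}
k_inverse_is_function = is_function_on(inverse_relation(k), B)
```

When f is one-to-one and onto, the swapped pairs form a function with domain B. For k the swapped pairs are only a relation, since "a" is paired with two values and "b" with none.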
⇒ h◦IB = f⁻¹◦IB
⇒ h = f⁻¹
Its inverse
f⁻¹ = [{a} × U] ∪ [{b} × (A − U)]
can only be referred to as a relation on A, unless of course both U and A−U
are singleton sets.
Concepts review:
1. Given two functions f : A → B and g : B → C, what does the
expression “g◦f” mean? Under what conditions does this expression
make sense?
2. Is the composition of functions commutative? Are there any pairs
of functions which always “commute” with each other?
3. Under what condition(s) is the composition of functions associative?
4. Which function plays the role of the identity with respect to “◦ ”?
5. What does it mean to say that a function f is “invertible”?
6. Under what condition(s) is a function invertible with respect to “◦ ”?
7. If a function h can be expressed as h = g◦f where both f and g are
invertible, is h invertible? If so, how can we express h⁻¹?
8. If f is not one-to-one on its domain, what interpretation can we
give to the expression f⁻¹?
EXERCISES
Definition 11.1 Suppose f is a relation with the set A as domain and the set
B as range. If S is a subset of A, that is, if S ∈ P(A), then we will represent
the image of the set S under f as
f ← [U ] = {x ∈ A : (x, y) ∈ f and y ∈ U }
confusing the preimage of an element, f⁻¹(x), normally used with one-to-one functions,
with the preimage, f⁻¹[U], of a set U. A general topologist might refer to “B = f ← [U ]” by
saying that “f pulls back the set U to the set B”.
Part IV: Functions 97
Examples.
f ← [{u}] = {a, b, c}
f ← [{v}] = {d, e}
f ← [{z}] = {k}
f|D [D] = f|D [{e, k, h}] = {v, z, s}
(f|D )← [{v}] = {e}
(f|E )[E] = (f|E )[{c, d}] = {u, v}
(f|E )← [{u}] = {c}
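The values listed above can be reproduced with a short Python sketch. The function f used here is an assumption: it is one assignment consistent with the listed images and preimages (a, b, c ↦ u; d, e ↦ v; k ↦ z; h ↦ s), and the helper names image, preimage and restrict are hypothetical.

```python
def image(f, S):
    """f[S]: the set of values f(x) for x in S (x must lie in the domain of f)."""
    return {f[x] for x in S if x in f}


def preimage(f, U):
    """f←[U]: the set of all x in the domain of f with f(x) in U."""
    return {x for x, y in f.items() if y in U}


def restrict(f, D):
    """f|D: the restriction of f to the subset D of its domain."""
    return {x: y for x, y in f.items() if x in D}


# Assumed function, chosen to agree with the values listed in the examples.
f = {"a": "u", "b": "u", "c": "u", "d": "v", "e": "v", "k": "z", "h": "s"}
D = {"e", "k", "h"}
E = {"c", "d"}

pre_u = preimage(f, {"u"})
img_D = image(restrict(f, D), D)
pre_v_in_D = preimage(restrict(f, D), {"v"})
```

Note that the preimage under a restriction only returns elements of the restricted domain, which is why (f|D)←[{v}] is {e} and not {d, e}.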
98 Section 11: Images and preimages of sets
f) f ← [B − E] = A − f ← [E]
Proof:
a) x ∈ f[⋃_{S∈A} S] ⇔ x = f(y) for some y ∈ ⋃_{S∈A} S
⇔ x = f(y) for some y in some S ∈ A
⇔ x = f(y) ∈ f[S] for some S ∈ A
⇔ x ∈ ⋃_{S∈A} f[S]
It is however considered standard subject matter in most set theory textbooks. So it was
included here.
3 We say that f respects unions if it is always true that f [A ∪ B] = f [A] ∪ f [B]. Similarly,
only if f is one-to-one on U ∪ V .
Case 1: We consider the case where U ∩ V = ∅.
Then f [U ∩ V ] = ∅ ⊆ f[U ] ∩ f[V ]. So the statement holds true.
Case 2: We now consider the case where U ∩ V ≠ ∅.
x ∈ f[U ∩ V] ⇔ x = f(y) for some y ∈ U ∩ V
⇔ x = f(y) for some y contained in both U and V
⇒ x = f(y) ∈ f[U] and x = f(y) ∈ f[V]
⇔ x ∈ f[U] ∩ f[V]
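The one-directional arrow in the chain above is essential: the inclusion f[U ∩ V] ⊆ f[U] ∩ f[V] can be strict when f is not one-to-one. A small Python sketch, with hypothetical sample functions f and f2 chosen only for illustration, shows both situations.

```python
def image(f, S):
    """f[S]: the set of values f(x) for x in S."""
    return {f[x] for x in S}


# Hypothetical function that is not one-to-one: 1 and 2 share a value.
f = {1: "p", 2: "p"}
U, V = {1}, {2}
lhs = image(f, U & V)               # f[U ∩ V], here the image of ∅
rhs = image(f, U) & image(f, V)     # f[U] ∩ f[V]

# Hypothetical one-to-one function: the two sides coincide.
f2 = {1: "p", 2: "q", 3: "r"}
U2, V2 = {1, 2}, {2, 3}
lhs2 = image(f2, U2 & V2)
rhs2 = image(f2, U2) & image(f2, V2)
```

In the first case the left side is empty while the right side contains p, so the inclusion is proper; in the second case both sides equal {q}.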
x ∈ f←[⋃_{S∈B} S] ⇔ f(x) ∈ ⋃_{S∈B} S
⇔ f(x) = y for some y in some S ∈ B
⇔ x ∈ f←[{y}] ⊆ f←[S] for some S ∈ B
⇔ x ∈ ⋃_{S∈B} f←[S]
Thus, f←[⋃_{S∈B} S] = ⋃_{S∈B} f←[S].
e) Proof is left as an exercise.
f) Proof is left as an exercise.
Concepts review:
1. Given a function f : A → B and S ⊆ A, what does the expression
f[S] mean?
2. Given a function f : A → B and S ⊆ B, what does the expression
f ← [S] mean?
3. What is the preimage of a set S under a function f?
4. Given a function f : A → B and x ∈ B − im f, what is f ← ({x})?
5. Under what conditions does f respect unions?
6. Under what conditions does f respect intersections?
EXERCISES
11. Let f : A → B be a function which maps the set A onto the set B. Prove
that if x and y are distinct elements of B, then
f ← [{x}] ∩ f ← [{y}] = ∅
12. Let f : A → B be a function which maps the set A onto the set B. Prove
that the set of sets
S = {f ← [{x}] : x ∈ B}
forms a partition of the set A.
102 Section 12: Equivalence relations induced by functions
Two elements a and b are related under Rf if and only if {a, b} ⊆ f ← [{x}]
for some x in im f. The quotient set of A induced by Rf is then
A/Rf = ARf = {f ← [{x}] : x ∈ f[A]}
We will refer to A/Rf (or ARf ) as the quotient set of A induced (or deter-
mined) by f.
We illustrate this in a simple example. Let U = {a, {a}, {a, {a}}, {{a}} }
where a is a set. Consider the function f : U → U defined as follows:
f(x) = a        if a ∈ x
f(x) = {a}      if {a} ∈ x and a ∉ x
f(x) = {{a}}    if x = a
FIGURE 4
Canonical decomposition of f : S → T
Sx ∩ Sy = f ← [{f(x)}] ∩ f ← [{f(y)}] = ∅
hf (Sx ) = f(x)
Sx ≠ Sy ⇒ Sx ∩ Sy = ∅
⇒ f←[{f(x)}] ∩ f←[{f(y)}] = ∅
⇒ f(x) ≠ f(y)
f(x) = hf (Sx )
= hf (gf (x))
= (hf ◦gf )(x)
Thus, the two functions hf ◦gf and f agree everywhere on the domain,
S, of f. We have just proven the following theorem.
Example: Let U = {a, {a}, {a, {a}}, {{a}} } where a is a set. Let the func-
tion f : U → U be defined as follows:
f(x) = a        if a ∈ x
f(x) = {a}      if {a} ∈ x and a ∉ x
f(x) = {{a}}    if x = a
We see that
(hf ◦gf )(a) = f(a)
(hf ◦gf )({a}) = f({a})
(hf ◦gf )({a, {a}}) = f({a, {a}})
(hf ◦gf )({{a}}) = f({{a}})
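The four equalities above can be checked mechanically once a concrete set is chosen for a. The Python sketch below assumes a = ∅ (the text only requires that a be a set) and represents sets as frozensets; the names g_f and h_f stand for the functions gf and hf of the canonical decomposition.

```python
a = frozenset()                 # assumption: the set a is taken to be ∅
sa = frozenset({a})             # {a}
a_sa = frozenset({a, sa})       # {a, {a}}
ssa = frozenset({sa})           # {{a}}
U = [a, sa, a_sa, ssa]


def f(x):
    """The function f : U → U of the example, for a = ∅."""
    if a in x:
        return a
    if sa in x and a not in x:
        return sa
    if x == a:
        return ssa
    raise ValueError("x is not an element of U")


def g_f(x):
    """Send x to its class Sx = f←[{f(x)}]."""
    return frozenset(y for y in U if f(y) == f(x))


def h_f(S):
    """Send a class Sx to the common value f(x) of its members."""
    (value,) = {f(y) for y in S}
    return value


quotient_set = {g_f(x) for x in U}
decomposition_agrees = all(h_f(g_f(x)) == f(x) for x in U)
```

With this choice of a, the quotient set has three classes: {{a}, {a, {a}}}, {{{a}}} and {a}.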
Concepts review:
1. If f : A → B is a function mapping the set A into the set B, describe
a partition of the set A induced by the function f.
2. If f : A → B is a function mapping the set A into the set B, describe
an equivalence relation Rf on A induced by f.
3. If f : A → B is a function mapping the set A into the set B, describe
the elements of the quotient set A/Rf induced by f.
4. If f : A → B is a function mapping the set A into the set B, what
does “the canonical decomposition of f” mean?
5. If f : A → B is a function mapping the set A into the set B, is it
always possible to “decompose” f as a composition of two functions?
How?
EXERCISES
d) Is f one-to-one on S? Explain.
e) For every D ∈ P(S), give f ← (D).
f) If the function f determines a partition of P(S) × S, list the subsets
which are members of this partition.
B. 3) Let A be a non-empty subset of the set S and let T = {∅, {∅}}. We define
the function f : S → T as follows: f(x) = ∅ if x ∉ A and f(x) = {∅} if
x ∈ A. Let Rf denote the equivalence relation determined by f.
a) List the elements of the quotient set S/Rf .
b) If hf ◦gf = f is the canonical decomposition of f, list the elements of
the functions gf and of hf .
Part V
13 / Natural numbers
Abstract. The main objective in this section is to discuss how the natural
numbers, N = {0, 1, 2, 3, . . .}, are constructed within the Zermelo-Fraenkel
axiomatic system. We begin by stating the definitions of “successor set”
and “inductive set”. The natural numbers, N, is then defined as the “small-
est” inductive set. A ZF -axiom will guarantee the existence of this “small-
est inductive set” called the “natural numbers”. We then show how the
Principle of mathematical induction is an immediate consequence of this
definition of N. We define “transitive sets” as sets, A, whose elements are
subsets of A. The elements of N are then shown to be “transitive sets”.
We then prove a few properties possessed by all natural numbers.
From this we see that nearly all the sets we explicitly constructed up to now
have evolved from successive applications of the axioms of pair, union and
power set, with the empty set as a starting point. If we explicitly list the
elements of these sets, we will see repetitive sequences of the pairs of “curly
brackets”, { and }, and the symbol, “∅”. We then expect every natural
number to be a set of this nature.
If we are asked to define the set of all natural numbers as succinctly as
possible, we may consider the following definition as a reasonable one:
The set, N, of all natural numbers is the intersection of all sets S
which satisfy the two properties, 0 ∈ S and [n ∈ S] ⇒ [n + 1 ∈ S].
Given the knowledge and the experience we have with natural numbers, it
would be difficult to imagine a natural number which does not belong to
such a set. It also seems obvious that numbers such as 4/5 and √5 cannot
belong to such a set. This will be our model for formulating a set-theoretic
definition of the natural numbers. It seems natural to define, 0 = ∅, as being
the smallest of all natural numbers. The challenge is to define the operation
“+ 1” using the language of sets. We can view “+ 1” as an “immediate
successor constructing mechanism”. We begin with the following definition.
x+ = x ∪ {x}
We see that this is an operation which adds a single element to a given set,
x. For example, if A = {a, b, c}, then the successor of A is
A+ = {a, b, c} ∪ {{a, b, c}} = {a, b, c, {a, b, c}}
This is a set constructing mechanism. We need only one set to initiate a
Part V: From sets to numbers 113
B0 = ∅
B1 = ∅+ = ∅ ∪ {∅} = {∅}
B2 = (∅+ )+ = {∅}+ = {∅} ∪ {{∅}} = {∅, {∅}}
Rather than use the symbols, {B0 , B1 , B2 , . . . , }, why not use conventional
natural number notation?
0 = ∅
1 = 0+ = ∅+ = ∅ ∪ {∅} = {∅} = {0}
2 = 1+ = {∅}+ = {∅} ∪ {{∅}} = {∅, {∅}} = {0, 1}
3 = 2+ = {∅, {∅}}+ = {∅, {∅}} ∪ {{∅, {∅}}} = {∅, {∅}, {∅, {∅}}} = {0, 1, 2}
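The successor construction can be tried out on a computer. The following Python sketch is an illustration only: it models sets as frozensets, and the names successor and natural are assumptions introduced for this example.

```python
def successor(x):
    """Return x+ = x ∪ {x} for a frozenset x."""
    return x | frozenset({x})


def natural(n):
    """Build the set representing n by applying the successor n times to ∅."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


zero, one, two, three = (natural(i) for i in range(4))
```

Here three is the set {0, 1, 2}, so the set representing n has exactly n elements, each of which is a smaller natural number.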
n ∈ n+
(in the sense that the set, 6, is an element of the set, 7). That is,
− Also, if we set 0 = ∅
− We have verified above that the set, 3 = {0, 1, 2}, satisfies the property:
Every pair of elements contained in the number 3 are comparable with
respect to <, and they satisfy the interesting property:
x+ = x ∪ {x}
1
A set, A, is called an inductive set if it satisfies the following two properties:
a) ∅ ∈ A.
b) x ∈ A ⇒ x+ ∈ A.
The above definition of “Inductive set” nicely describes what the set of all
natural numbers is like. But defining “Inductive set” doesn’t guarantee that
one exists in our set-theoretic universe. We will need some outside help for
this.2 This is done with the Axiom of infinity (A8).
With this guarantee that at least one inductive set exists, we will define the
natural numbers as being the smallest one.
Definition 13.3 We define the set, N, of all natural numbers as the intersec-
tion of all inductive sets. That is,
Is the set N itself inductive? We verify this: By definition, all inductive sets
contain the element ∅ and so ∅ belongs to their intersection, N. Condition
one is satisfied. We verify condition two: If x ∈ N, then x belongs to all
inductive sets and so x+ must belong to all inductive sets; so x+ ∈ N. So N
is an inductive set. It immediately follows that if n is any natural number,
then so is its successor, n+ .
Proof:
By hypothesis, A is an inductive set since it satisfies the two required properties.
Since N is the intersection of all inductive sets, then N ⊆ A. By hypothesis,
A ⊆ N. Thus, A = N.
Proof: Let
A = {n ∈ N : P (n) holds true}
Part (a) of the hypothesis states that “0 ∈ A”, while part (b) states that
[n ∈ A] ⇒ [n+ ∈ A]. Then A is an inductive set and so A = N (by the theo-
rem). So P (n) is true for all natural numbers n.
A few remarks. The proofs above illustrate how the Principle of mathemati-
cal induction is intrinsically linked to the definition of the natural numbers.
The set of all natural numbers is the only (non-empty) set whose existence
is essentially postulated. The other explicitly defined set is the empty class
which was shown to be a set (as a consequence of the Axiom of construction
followed by the Axiom of subset).
Some readers may not be familiar with “proofs by mathematical induc-
tion”. For these readers we provide a few examples of proofs by induc-
tion in this section. We summarize the main steps to be followed when
proving a statement by induction. Induction is used when we are dealing
with some property P (n) which is a function of the natural numbers. Let
S = {n ∈ N : P (n) holds true}. Now this set, S, may possibly be empty,
may contain a few elements of N or may even contain all of its elements.
The objective is to show that if two specific conditions are satisfied, then
S = N. That is, we want to prove that P (n) holds true for all values of n.
For example, suppose P (n) is described as the property
1 + 2 + 3 + ··· + n = n(n + 1)/2   (see footnote 3)
We want to prove that this holds true no matter what natural number n we
use. We highlight the main steps.
Step 1: Write down explicitly the property which is a function of n as illus-
trated above.
Step 2: Prove the “Base case”. This means that we must prove that P (0) is
3 To allow us to present this particular example at this time we will assume that the
operations of addition and multiplication on the natural numbers are known to us. These
will be properly defined soon.
Step 5: Write down the conclusion: Since “P (n) is true” implies that
“P (n + 1) is true”, then, by the principle of mathematical induction, P (n)
holds true for all n.
Difficulties encountered by students when first applying this procedure are
often due to skipped steps.
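Before attempting a proof by induction it can be reassuring to test the property on a few values. The Python sketch below checks the sample property P(n) for finitely many n and checks the arithmetic identity used in the inductive step. It is a sanity check under stated assumptions, not a proof, and the function names are hypothetical.

```python
def P(n):
    """The property P(n): 1 + 2 + ... + n equals n(n + 1)/2."""
    return sum(range(1, n + 1)) == n * (n + 1) // 2


def inductive_step_identity(n):
    """Arithmetic used in the inductive step: n(n+1)/2 + (n+1) = (n+1)(n+2)/2."""
    return n * (n + 1) // 2 + (n + 1) == (n + 1) * (n + 2) // 2


base_case = P(0)  # the empty sum equals 0
checked = all(P(n) for n in range(200))
step_ok = all(inductive_step_identity(n) for n in range(200))
```

A finite check of this kind can expose a wrong formula, but only the induction argument shows that P(n) holds for every natural number n.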
Theorem 13.7 The non-empty set, S, is a transitive set if and only if the
property
[x ∈ y and y ∈ S] ⇒ [x ∈ S]
holds true.
Proof:
( ⇒ ) What we are given: That S is a transitive set. Suppose y ∈ S and
x ∈ y.
What we are required to show: That x ∈ S.
Since S is a transitive set, y ∈ S implies y ⊆ S. Since x ∈ y ⊆ S then x ∈ S.
( ⇐ ) We are given that S satisfies the property: If x ∈ y and y ∈ S then
x ∈ S.
We are required to show that S is a transitive set.
Suppose “(x ∈ y and y ∈ S) ⇒ (x ∈ S)”. Let z ∈ S. It suffices to show that
“z ⊆ S”.
If z = ∅, then z ⊆ S and we are done. Suppose that z ≠ ∅. Let a ∈ z. By
hypothesis, a ∈ S. Since a ∈ z implies a ∈ S, then z ⊆ S.
Proof:
By the characterization of transitive sets stated above, it suffices to show
that for each n ∈ N, x ∈ n ⇒ x ∈ N. We will prove this by mathematical
induction. Let P (n) denote the statement “(x ∈ n ∈ N) ⇒ x ∈ N”.
− Base case: The statement, “x ∈ 0 = ∅” ⇒ “x ∈ N”, is true since there
are no elements in 0 = ∅. So P (0) holds true.
− Inductive hypothesis: Suppose the statement P (n) holds true for the
natural number n. We are required to show that P (n+ ) holds true. Sup-
pose y ∈ n+ = n ∪ {n}. Then either y ∈ n or y ∈ {n}. If y ∈ n, then by
the inductive hypothesis, y ∈ N. If y ∈ {n}, then y = n ∈ N.
By mathematical induction, the statement holds true for all elements of N
and so, by definition, N is a transitive set.
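Transitivity can be tested on the first few natural numbers, each of which is itself a transitive set. The Python sketch below models sets as frozensets and compares the definition of a transitive set with the characterization of Theorem 13.7; all function names are assumptions made for this illustration, and only finitely many cases are examined.

```python
def successor(x):
    """Return x+ = x ∪ {x} for a frozenset x."""
    return x | frozenset({x})


def natural(n):
    """Build the set representing n by applying the successor n times to ∅."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


def is_transitive(S):
    """Definition: every element of S is also a subset of S."""
    return all(y <= S for y in S)


def satisfies_characterization(S):
    """Theorem 13.7: x ∈ y and y ∈ S imply x ∈ S."""
    return all(x in S for y in S for x in y)


naturals_transitive = all(is_transitive(natural(n)) for n in range(6))
not_transitive = frozenset({frozenset({frozenset()})})  # the set {{∅}}
```

The set {{∅}} fails both tests, since ∅ belongs to its element {∅} but not to the set itself.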
Theorem 13.9
Proof:
4 The reader is cautioned not to misread this statement. It does not say that any subset
of a natural number n is an element of n. It says that “any natural number which is a proper
subset of n is an element of n”.
Now
n ∪ {n} ∪ {n ∪ {n}} = n ∪ {n} ⇒ n ∪ {n} ∈ n ∪ {n}
⇒ n ∪ {n} ∈ n or n ∪ {n} = n
The inductive hypothesis does not allow “n ∪ {n} = n”. So n ∪ {n} ∈ n.
By part a), n ∪ {n} ⊆ n. Since n ⊆ n ∪ {n}, then n = n ∪ {n}
(Axiom of extent) again contradicting the inductive hypothesis. Then
n ∪ {n} ∪ {n ∪ {n}} ≠ n ∪ {n}. So “P (n) ⇒ P (n+ )” holds true.
By mathematical induction, n ≠ n ∪ {n} for all natural numbers.
c) Suppose n is a natural number such that n ∈ n. If m ∈ n∪{n} then m ∈ n
or m = n. In both cases m ∈ n. Then n ∪ {n} ⊆ n. Since n ⊆ n ∪ {n},
n = n ∪ {n} contradicting the statement of part (b). We must conclude,
if n is a natural number, then n ∉ n.
d) We are required to show that for all m, n ∈ N, m ⊂ n ⇒ m ∈ n. We will
prove this by mathematical induction on n. Let P (n) be the property “[m
is a natural number and m ⊂ n] ⇒ [m ∈ n]”.
− Base cases n = ∅ or 1: For n = ∅, the statement m ⊂ ∅ ⇒ m ∈ ∅ is
vacuously true. For n = 1, ∅ ⊂ 1 = {∅} and ∅ ∈ 1 = {∅} hold true.
So both base cases P (0) and P (1) hold true. (Actually showing P (0)
holds true is sufficient.)
− Inductive hypothesis: Suppose n is a natural number such that P (n)
holds true; that is, for any natural number m, “m ⊂ n ⇒ m ∈ n”. We
are required to show that for any natural number m, “m ⊂ n+ ⇒ m ∈
n+ ”.
· Let m be a natural number such that “m ⊂ n+ = n ∪ {n}”.
Case 1: If n ∉ m, then m ⊂ n. By the inductive hypothesis,
m ∈ n ⊂ n ∪ {n}. Hence, m ∈ n ∪ {n}.
Case 2: Suppose n ∈ m. By part (b), m ≠ n. Since m and n are
distinct natural numbers, by part (a), n ⊂ m. Then n ∪ {n} ⊆ m.
Since m ⊂ n ∪ {n}, then n ∪ {n} ⊂ n ∪ {n}, a contradiction.
So only case 1 applies. So P (n+ ) holds true.
By the principle of mathematical induction, m ⊂ n ⇒ m ∈ n for all
natural numbers m and n.
2 = {∅, {∅}} ∈ 4 and 2 = {∅, {∅}} ⊂ 4, since every element of {∅, {∅}} belongs to 4.
a) If m ⊂ n, then m+ ⊆ n.
b) Let m and n be any pair of distinct natural numbers. Then either m ⊂ n
or n ⊂ m. Equivalently, m ∈ n or n ∈ m. Hence, both “⊂” and “∈” are
strict linear orderings of N.
c) There is no natural number m such that n ⊂ m ⊂ n+ .
Proof:
a) What we are given: That m and n are distinct natural numbers where
m ⊂ n. What we are required to show: That m ∪ {m} ⊆ n.
Since m ⊂ n, then m ∈ n (by Theorem 13.9, part (d) ). By 13.9 part (a),
every natural number is a transitive set, so m ∈ n implies m ⊆ n. Then
m ∪ {m} ⊆ n ∪ {m}. Suppose y ∈ m ∪ {m}. Then y ∈ n or y ∈ {m}. If
y ∈ {m}, then y = m ∈ n. So {m} ⊆ n. Then n ∪ {m} ⊆ n. We conclude
that m ∪ {m} ⊆ n, as required.
b) What we are given: That m and n are distinct natural numbers. What
we are required to prove: That m ⊂ n or n ⊂ m.
We will prove this by mathematical induction on n. Let P (n) be the
statement “for every natural number m ≠ n, either m ⊂ n or n ⊂ m”.
− Base cases n = ∅ or 1: For n = ∅, the statement ∅ ⊂ m holds
true for all non-zero natural numbers m. For n = 1 and m = 0,
∅ ⊂ 1 = {∅}. Suppose m is a natural number other than 0 and 1.
Then ∅ ⊂ m ⇒ ∅+ = {∅} ⊆ m (by the Theorem 13.9 above). Since
m ≠ 1, then {∅} ⊂ m holds true for any such m (since ∅ ∈ m for
every non-zero natural number m).
− Inductive hypothesis: Suppose P (n) holds true for some natural num-
ber n. That is, suppose n is a natural number such that for any natural
number m not equal to n, either m ⊂ n or n ⊂ m. We are required to
show that P (n+ ) holds true.
Let m be a natural number such that m ≠ n+. Case 1: If m = n,
then m ∈ n ∪ {n} and so m ⊂ n ∪ {n} (by Theorem 13.9) and we
are done. Case 2: Suppose m ≠ n. Then, by the inductive hypothesis,
either m ⊂ n or n ⊂ m. If m ⊂ n, then m ⊂ n+ = n ∪ {n}. Done.
If n ⊂ m, then n+ ⊆ m (by part a)). Since m ≠ n+, n+ ⊂ m. Then
P (n+) holds true.
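The statement being proved, together with the equivalence between "⊂" and "∈" on natural numbers, can be checked on small cases. The Python sketch below does this for the first six natural numbers, modelled as frozensets; the bound LIMIT and the function names are assumptions, and the check covers only finitely many pairs.

```python
def successor(x):
    """Return x+ = x ∪ {x} for a frozenset x."""
    return x | frozenset({x})


def natural(n):
    """Build the set representing n by applying the successor n times to ∅."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


LIMIT = 6  # assumed bound for this finite check
numbers = [natural(i) for i in range(LIMIT)]

# For frozensets, `<` means "is a proper subset of".
subset_iff_member = all(
    (numbers[m] < numbers[n]) == (numbers[m] in numbers[n])
    for m in range(LIMIT) for n in range(LIMIT)
)
comparable = all(
    numbers[m] < numbers[n] or numbers[n] < numbers[m]
    for m in range(LIMIT) for n in range(LIMIT) if m != n
)
```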
Proof:
What we are given: That n is a non-zero natural number.
What we are required to show: That k = ∪{m ∈ N : m ⊂ n} is a natural
number and k+ = n.
Proof by induction. Let P (n) be the statement “k = ∪{m ∈ N : m ⊂ n} is
a natural number and k+ = n”.
− Base cases n = 1 or 2: If n = 1, then k = ⋃_{m∈1} m = ∅, a natural
number such that k+ = ∅ ∪ {∅} = {∅} = 1 = n. If n = 2, then
k = ⋃_{m∈2} m = ∅ ∪ 1 = ∅ ∪ {∅} = {∅} = 1, a natural number such that
k+ = {∅} ∪ {{∅}} = {∅, {∅}} = 2 = n.
k = ∪{m ∈ N : m ∈ n ∪ {n}}
= ∪{m ∈ N : m ∈ n or m = n}
= ∪{m ∈ N : m ∈ n} ∪ n
Proof:
We prove this by induction. For non-zero natural numbers n, let P (n) be the
statement “ the natural number n has a unique immediate predecessor”.
− Base case n = 1: By definition, 1 = {∅} = ∅ ∪ {∅} = ∅+ = 0+ . So P (1)
holds true.
− Induction hypothesis: Suppose n is a non-zero natural number such that
P (n) holds true. That is, there is only one natural number m such that
m+ = n. We are required to show that n+ has a unique predecessor.
Trivially, n is one immediate predecessor of n+ . Suppose k is another
natural number such that k+ = n+. Then n ∪ {n} = k ∪ {k}. We claim
that n = k. Suppose not. Then both n ∈ k and k ∈ n must hold true. We
have shown that every natural number is a transitive set (see Theorem
13.9 (a)). By Theorem 13.7, n ∈ k and k ∈ n implies n ∈ n. By 13.9 part
(c) this cannot be true for any natural number, and so we have a con-
tradiction. Then n = k as claimed. Hence, n+ has as unique immediate
predecessor, n. So P (n+ ) holds true.
By mathematical induction, every non-zero natural number has a unique im-
mediate predecessor.
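The immediate predecessor of a non-zero natural number n can be computed as the union of the elements of n. The Python sketch below checks this for n = 1, ..., 6 with sets modelled as frozensets; the function names are assumptions and the check is a finite illustration rather than a proof.

```python
def successor(x):
    """Return x+ = x ∪ {x} for a frozenset x."""
    return x | frozenset({x})


def natural(n):
    """Build the set representing n by applying the successor n times to ∅."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


def union_of_elements(n):
    """Return ∪n, the union of all elements of the set n."""
    result = frozenset()
    for m in n:
        result = result | m
    return result


predecessor_ok = all(
    union_of_elements(natural(n)) == natural(n - 1)
    and successor(union_of_elements(natural(n))) == natural(n)
    for n in range(1, 7)
)
```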
numbers. Today these are referred to as the Peano axioms (the Italian name
“Peano” is pronounced as ‘pay-ah-no’). We will see that each of these axioms
belongs to the ZFC set-theoretic universe, and so as a group they play the role of
intermediary (a more easily understandable one) between the set theory
axioms and the body of mathematics we refer to as number theory. We will
list the nine Peano axioms below. The symbol “0” is an undefined symbol.
The symbol “S” represents a single valued function we refer to as the “suc-
cessor function” on the natural numbers.
from the Axiom of infinity. Note that the Axiom of power set, the Axiom
of replacement, the Axiom of regularity and the Axiom of choice are not
required to do mathematics with the natural numbers.
Concepts review:
1. If x is a set, then what is its successor?
2. What is an inductive set?
3. What does the Axiom of infinity state?
4. How is the set of natural numbers defined?
5. List the first four natural numbers using set notation.
6. What is the Principle of mathematical induction?
7. What is a transitive set?
8. If n is a natural number, is n a transitive set?
9. What is the difference between an inductive set and a transitive set?
10. Is N a transitive set?
11. Give a characterization of transitive sets.
12. Is it true that any element of a natural number is a natural number?
Why?
13. Can a natural number be an element of itself?
14. Is N a natural number? Why?
15. If n is a natural number, how many successors can n have?
16. What is a second version of the Principle of mathematical induc-
tion?
17. If n is a natural number, what does it mean to say that m is its
predecessor?
18. Give an expression which describes the predecessor of a natural
number n.
19. If m and n are natural numbers such that m ∈ n, can it happen
that m+ = n? Can it happen that n ∈ m+ ?
20. If m is a subset of the natural number n, is it possible that m ∈ n?
In which case?
21. Are there any natural numbers which are inductive sets?
22. For three natural numbers m, n and t satisfying m ∈ n and n ∈ t,
does it always follow that m ∈ t?
EXERCISES
C. 11. Show that finite unions and finite intersections of transitive sets are tran-
sitive sets.
12. Suppose S ⊂ N. Suppose that the union of all elements of S is S. Prove
that S cannot be a natural number.
13. Jo-Anne has defined the natural numbers in the ZFC-axiomatic system as
follows. She defined an inductive set as “S is inductive if, whenever x ∈ S,
then {x} ∈ S”. By first invoking the axiom of infinity she defines the
natural numbers N as the smallest inductive set linearly ordered by “∈”.
She defines 0 = ∅, 1 = {∅}, 2 = {{∅}}, 3 = {{{∅}}}, 4 = {{{{∅}}}} and
so on. We see that 0 ∈ 1 ∈ 2 ∈ 3 ∈ 4 · · · . Will this work as a definition of
the natural numbers? If so, say why. If not, explain why.
14. Show that N = ∪{n : n ∈ N}.
128 Section 14: Natural numbers as a well-ordered set
m ∈= n if and only if m = n or m ∈ n
14.2 A well-ordering of N.
There is an important property that is not possessed by all linearly ordered
classes. It is called the well-ordering property. We formally define this prop-
erty. We will then prove that (N, ∈) is a well-ordered set.
We have previously shown that the second version of the Principle of mathematical
induction follows from the first version of the Principle of mathematical
induction. We can show that if we assume only that N is ∈-well-ordered
and that the second version of the induction principle holds, then the first version of
the induction principle holds true. The proof is as follows.
What we are given: That N is ∈-well-ordered and that the second version of
We have thus shown that not only is N a well-ordered set, but so is every
single natural number.
k ∉ S ⇒ x ≠ k,
⇒ x∈k
⇒ x ∈ t+
x ∈ t+ ⇒ x ∈ (t ∪ {t})
⇒ x ∈ t or x ∈ {t}
⇒ x ∈ t or x = t
{1, 2} × N = {(i, n) : i = 1 or 2, n ∈ N}
{(1, 0), (1, 1), (1, 2), (1, 3), · · · , (2, 0), (2, 1), (2, 2), (2, 3), · · · , }
S = {0, 1, 2, 3} × N ⊂ N × N
{1, 2}^N
Then any specific function f in {1, 2}^N can be expressed as an infinite set
of ordered pairs
{a0, a1, a2, a3, . . . }
Definition 14.6 Consider the set {1, 2}^N of all functions mapping natural
numbers to 1 or 2. We define the lexicographic order “<lex” on
{1, 2}^N as follows: For any two elements f = {a0, a1, a2, a3, . . . } and g =
{b0, b1, b2, b3, . . . } in {1, 2}^N, f <lex g if and only if for the first two unequal
corresponding terms ai and bi, ai ∈ bi (ai < bi). Also, f = g if and only if
ai = bi for all i ∈ N (see footnote 3).
{2, 1, 1, 1, 1, 1, 1, 1, 1, 1, · · · }
{1, 2, 1, 1, 1, 1, 1, 1, 1, 1, · · · }
{1, 1, 2, 1, 1, 1, 1, 1, 1, 1, · · · }
{1, 1, 1, 2, 1, 1, 1, 1, 1, 1, · · · }
{1, 1, 1, 1, 2, 1, 1, 1, 1, 1, · · · }
{1, 1, 1, 1, 1, 2, 1, 1, 1, 1, · · · }
{1, 1, 1, 1, 1, 1, 2, 1, 1, 1, · · · }
⋮
2 Even though the following definition of the ordering on the set {1, 2}^N is inspired by
the lexicographic ordering of sets of ordered pairs and adopts the notation <lex, it is good
to remember that we are not ordering ordered pairs but sets which represent functions.
3 A lexicographic ordering can similarly be defined on S^N where S is any subset of N.
We see that each element is strictly less than its immediate predecessor in
this list. We also see that we can never reach {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, · · ·}
using such a decreasing sequence. This convinces us that the set
does not contain its least element since no matter where we insert our first
“2”, we will be able to insert a “2” further down. Since the non-empty
subset S has no minimal element with respect to the ordering “<lex”,
then {1, 2}^N is not a well-ordered set.
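The decreasing list above can be imitated with finite prefixes. The Python sketch below is an illustration under stated assumptions: each infinite sequence is truncated to its first LENGTH terms (a hypothetical cut-off), and Python's built-in comparison of tuples, which compares the first unequal entries, plays the role of <lex.

```python
def single_two(position, length):
    """Length-`length` prefix of the sequence of 1s with a single 2 at `position`."""
    return tuple(2 if i == position else 1 for i in range(length))


LENGTH = 10  # assumed truncation length; the sequences in the text are infinite
chain = [single_two(i, LENGTH) for i in range(LENGTH)]
all_ones = (1,) * LENGTH

# Python compares tuples by their first unequal entries, as <lex does.
strictly_decreasing = all(chain[i + 1] < chain[i] for i in range(LENGTH - 1))
never_reaches_all_ones = all(all_ones < s for s in chain)
```

Each step moves the single 2 one place further along, producing a smaller element that is still strictly above the constant sequence of 1s. For the infinite sequences this process never ends.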
b) The set {1, 2}^N is bounded.
We easily see that {2, 2, 2, 2, · · · } is a maximal element of {1, 2}^N. So
every subset of {1, 2}^N has at least {2, 2, 2, 2, · · · } as upper bound.
c) On maximal elements of bounded sets.
Does every bounded subset of {1, 2}^N contain a maximal element? To help
answer this question, let’s try to find the element of {1, 2}^N which immediately
precedes {2, 2, 2, 2, · · ·}. Another way of stating this is: What
is the maximal element of S = {f ∈ {1, 2}^N : f < {2, 2, 2, 2, · · · }}? This
maximal element must have at least one “1” in it, along with as many
2s as possible. The question is where shall we insert this “1”? No matter
where we insert this “1” we will be able to reconsider our choice and
reinsert it farther down. So S contains no maximal element.
N^{1,2} = {{a1, a2} : a1, a2 ∈ N}
Concepts review:
1. What does it mean to say that “<” strictly well-orders a set S?
2. Describe two order relations which well-order N?
3. What does it mean to say a subset S of N ordered by “∈” is
bounded?
4. What does it mean to say that a non-empty subset S of N ordered
by “∈” has a maximal element?
5. Describe the set {1, 2} × N by providing three distinct elements of
this set.
6. Define the lexicographic ordering on {1, 2} × N.
7. Define the lexicographic ordering on {1, 2}^N.
8. List three elements of {1, 2}^N in increasing order.
9. Is the lexicographically ordered set {1, 2} × N well-ordered?
10. Is the lexicographically ordered set {1, 2}^N well-ordered?
11. Does every non-empty subset of the lexicographically ordered set
{1, 2} × N have a maximal element?
12. Does every non-empty subset of the lexicographically ordered set
{1, 2}^N have a maximal element?
13. Does {1, 2}^N have a maximal element?
14. Describe the elements of N^{1,2}. Propose an ordering for its
elements.
EXERCISES
N^{0,1,2} = {{a0, a1, a2} : ai ∈ N}
3 + 0 = 3
3 + 1 = 3 + 0+ := (3 + 0)+ = 3+ = 4
3 + 2 = 3 + 1+ := (3 + 1)+ = 4+ = 5
⋮
3 + n = m
3 + n+ := (3 + n)+ = m+
⋮
rm(0) = m
rm(n+) = [rm(n)]+
For example, once the value of 34 + 0 = 34 is declared, the value of the sum
34+123 is uniquely determined by applying the formula 34+n+ = (34+n)+
finitely many times to successively obtain 34 + 1 = 35, 34 + 2 = 36,
34 + 3 = 37, . . . , 34 + 123 = 157.
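The way the two conditions determine every value of rm can be imitated in a few lines of Python. This is a minimal sketch, assuming that ordinary Python integers stand in for the set-theoretic natural numbers and that n + 1 and n − 1 stand in for the successor and the predecessor; the function names successor and r are hypothetical.

```python
def successor(n):
    """Stand-in for n+ = n ∪ {n}; Python integers replace the sets here."""
    return n + 1


def r(m, n):
    """Compute rm(n) = m + n from rm(0) = m and rm(k+) = [rm(k)]+."""
    if n == 0:
        return m
    return successor(r(m, n - 1))  # n - 1 plays the role of the predecessor of n


three_plus_two = r(3, 2)
big_sum = r(34, 123)
```

To evaluate r(34, 123) the function calls itself with 122, 121, and so on down to 0, which is the same chain of dependencies described in the text: the value at n is only available once the value at its predecessor is known.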
Readers may have no doubt noticed that the function rm (n) is defined by
using a mechanism that we have not used or seen before in this text. We
are accustomed to defining a function f : A → A by declaring a rule which
associates to each element in A some other element in A without referring to
other ordered pairs (a, f(a)) in f. For example, the only way we can confirm
that the ordered pair (2, 3 + 2) = (2, 5) belongs to the function r3 is by first
determining that (0, 3 + 0) = (0, 3) and (1, 3 + 1) = (1, 4) also belong to r3 .
Most readers will intuitively feel that there is no ambiguity in the way we
have defined the function rm on N.
We refer to functions which are defined in this way as being recursively
defined functions. If rm is indeed a well-defined function, then we must be
able to prove that it satisfies the conditions stated in the formal definition of
a function. We remind ourselves how we defined a “function” (see Definition
9.1):
Given two sets A and B, a function is a subset f of A × B which
satisfies the property “(x, y) and (x, z) belong to f implies y = z”.
We will show two things: (1) r* is a function mapping N into N, and, (2)
r* = rm.
1) We claim that r* is a function mapping N into N.
Proof of claim:
We first establish (by induction) that dom r* = N.
Base case: We first note that since (0, m) ∈ r*, then 0 ∈ dom r*.
Inductive hypothesis: Suppose n ∈ dom r*. Then (n, y) ∈ r* for some y ∈ N.
This implies (n+, y+) ∈ r* and so n+ ∈ dom r*.
Hence, by induction, the domain of r* is all of N.
We now proceed to the proof of the claim. The proof of the claim invokes the
second version of the principle of mathematical induction. Let P (n) denote
the statement “[(n, x) ∈ r* ∧ (n, y) ∈ r*] ⇒ [x = y]”.
Inductive hypothesis: Suppose P (m) holds true for all natural numbers m < n.
That is, (m, x) ∈ r* and (m, y) ∈ r* implies x = y. We will show that given
our hypothesis, P (n) must hold true.
2 Should you decide to skip reading the proof for this time, don’t make a habit of it. It
n+ = 1 + n
3) Addition of the natural numbers is associative. That is, for any three nat-
ural numbers m, n and k
(m + k) + n = m + (k + n)
Proof: By induction. Let m and k be any two natural numbers. Let P (n)
be the property “(m + k) + n = m + (k + n)”.
Base case: We see that P (0) holds true since (m + k) + 0 = m + k =
m + (k + 0), (by (1) in the definition of addition above).
Inductive hypothesis: Suppose P (n) holds true. Then
= m + (k + n)+
= m + (k + n+ )
4) Addition of the natural numbers is commutative. That is, for any two
natural numbers m and n
m + n = n + m
Proof: By induction. Let m be any natural number. Let P (n) be the
property “m + n = n + m”.
Base case: We see that P (0) holds true since
Definition 15.3 For any natural number m, multiplication with the natural
number m is defined as the function sm : N → N satisfying the two conditions
sm (0) = 0
sm (n+ ) = sm (n) + m
sm(0) = 0 ⇔ m0 = m × 0 = 0    (3)
sm(n+) = sm(n) + m ⇔ mn+ = mn + m = m × n + m    (4)
that s∗ is the smallest set of ordered pairs satisfying the conditions described
for S . The relation s∗ looks something like
s∗ = {(0, 0), (1, m), (2, m + m), (3, m + m + m), · · · , (n, m × n), · · · , }
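The first few ordered pairs of this relation can be generated from the two defining conditions. The Python sketch below is an illustration under stated assumptions: Python integers stand in for the natural numbers, n − 1 stands in for the predecessor of n, and the names add and s, as well as the choice m = 5, are hypothetical.

```python
def add(m, n):
    """m + n by recursion on n: m + 0 = m and m + k+ = (m + k)+."""
    return m if n == 0 else add(m, n - 1) + 1


def s(m, n):
    """Compute sm(n) = m × n from sm(0) = 0 and sm(k+) = sm(k) + m."""
    return 0 if n == 0 else add(s(m, n - 1), m)


m = 5  # hypothetical choice of the fixed factor m
first_pairs = [(n, s(m, n)) for n in range(4)]
```

For m = 5 the list first_pairs is (0, 0), (1, 5), (2, 10), (3, 15), matching the pattern (n, m × n) displayed above.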
k(m + 0) = km
= km + 0
= km + k0
(mn)k = m(nk)
Proof: By induction. It is left as an exercise.
5) Multiplication of the natural numbers is commutative. That is, for any
two natural numbers m and n,
mn = nm
Proof: By induction. It is left as an exercise.
Theorem 15.5 For any two natural numbers m and n, m ∈= n if and only
if there exists a unique natural number k such that n = m + k.
Proof:
By induction. Let P (n) be the statement: “For any m ∈= n there exists a
unique natural number k such that n = m + k.”
Base case: Suppose n = 0. Then, for any m ∈= 0, m = 0 and so there exists
only k = 0 such that 0 = n = m + k = 0 + 0. So P (0) holds true.
Inductive hypothesis: Suppose n is a natural number such that for any nat-
ural number m ∈= n, there exists a unique natural number k such that
n = m + k. Suppose m is a natural number such that m ∈= n+ . Then
m ∈= n+ = n ∪ {n} implies m ∈ n or m = n or m = n+ . The equality
m = n+ means we can only choose k = 0. If m ∈ n, then the existence of a
unique natural number k such that n = m+k is guaranteed by our inductive
hypothesis. So
n+ = n+1
= (m + k) + 1
= m + (k + 1)
= m + k+
Definition 15.6 For any two natural numbers m and n such that m ≤ n, the
unique natural number k satisfying n = m + k is called the difference between
n and m and is denoted by n − m. The operation “−” is called subtraction.
Concepts review:
1. How is addition on the natural numbers defined?
2. For any natural number n, give two ways of describing n+ .
3. How can we prove that 0 + n = n for all n from the definition of
addition?
4. How can we prove that addition is associative from the definition
of addition?
5. How can we prove that addition is commutative from the definition
of addition?
6. How is multiplication of natural numbers defined?
7. How is subtraction of natural numbers defined?
EXERCISES
Abstract. In this section we define both the integers, Z, and the rational
numbers, Q. The integers are presented as a quotient set of N × N, while
the rationals are presented as a quotient set of Z × (Z − {0}). Addition,
subtraction and multiplication on each of these are defined within the set-
theoretic context. Order relations are defined on each of Z and Q so that
they are linearly ordered in the way we are accustomed to.
Proof:
Reflexivity: Since a + b = b + a, (a, b)Rz (a, b).
Symmetry: (a, b)Rz (c, d) ⇒ a + d = b + c ⇒ c + b = d + a ⇒ (c, d)Rz (a, b).
Transitivity: Suppose (a, b)Rz (c, d) and (c, d)Rz (e, f). Then a + d = b + c
and c + f = d + e. This implies a + d + c + f = b + c + d + e. Subtracting
c + d from both sides of the equality gives a + f = b + e. Hence, (a, b)Rz (e, f).
150 Section 16: The integers Z and the rationals Q
Let (a, b) ∈ Z. We will show that (a, b) ∈ U . Suppose (c, d) ∈ [(a, b)]. Then
(a, b)Rz (c, d) ⇒ a + d = b + c. We consider two cases, d ≤ c and c ≤ d
m ≠ n ⇒ 0 + n ≠ m + 0
⇒ (0, m) ∉ [(0, n)]
Part V: From sets to numbers 151
We have set the stage for a set-theoretic definition of the “integers”. Some
readers may have some insight on where this is leading. It seems that the
plan is to have the equivalence class [(0, n)] represent the negative integers
−n = 0 − n and [(n, 0)] represent the positive integers n − 0 = n. We could
then equate −5 with [(0, 5)] and the integer 5 with [(5, 0)].
Some may immediately wonder: Why do the positive integers need defining?
Aren’t positive integers simply the natural numbers? How can the natural
number 5 = {0, 1, 2, 3, 4} be the same set as the integer 5 = [(5, 0)]? These
two sets are indeed different since they don’t contain the same elements. It
is true, the “natural number 5” and the “integer 5” have different set rep-
resentations. The question is: Is this a major problem or is it just a minor
annoyance? It may be possible to construct the integers with the specific
requirement that the sets which represent the positive integers and the sets
which represent the natural numbers be the same. But this constraint may
present some hurdles around which it may be difficult to maneuver. When
we think about it carefully, it is not the sets which represent the natural
numbers and the sets which represent the positive integers that are impor-
tant. What is however crucial is that the arithmetic operations on these sets
each produce the expected values. That is, both 5 + 3 and [(5, 0)] + [(3, 0)]
produce 8 “the natural number” and 8 = [(8, 0)] “the positive integer”, re-
spectively. With this in mind, we proceed with a formal definition of the
integers.
a) Negative integers: The set of negative integers is defined as being the set
Z− = {[(0, n)] : n ∈ N}
Positive integers: The set of positive integers is defined as being the set
Z+ = {[(n, 0)] : n ∈ N}
d) Opposites of integers: The opposite −[(a, b)] of [(a, b)] is defined as:
In particular, [(0, n)] ×z [(m, 0)] = [(0 + 0, 0 + nm)] = [(0, nm)] = −[(nm, 0)]
and [(n, 0)] ×z [(m, 0)] = [(nm, 0)].
g) Absolute value of an integer: The absolute value, |n|, of an integer n is
defined as
|n| = n if 0 ≤z n 4
|n| = −n if n <z 0
h) Equality of two integers: If (a, b) and (c, d) are ordered pairs which are
equivalent under the relation Rz , then the Rz -equivalence classes [(a, b)]
and [(c, d)] are equal sets. To emphasize that they are equal sets under the
relation Rz , we can write
i) Distribution properties: If [(a, b)], [(c, d)] and [(e, f)] are integers, then
[(a, b)] ×z ([(c, d)] +z [(e, f)]) =z [(a, b)] ×z [(c, d)] +z [(a, b)] ×z [(e, f)]
and
([(c, d)] +z [(e, f)]) ×z [(a, b)] =z [(c, d)] ×z [(a, b)] +z [(e, f)] ×z [(a, b)]
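The arithmetic of equivalence classes can be checked mechanically on representatives. The sketch below assumes the usual formulas [(a, b)] +z [(c, d)] = [(a + c, b + d)] and [(a, b)] ×z [(c, d)] = [(ac + bd, ad + bc)], which agree with the computations shown in this section. The function names are hypothetical, and each class is represented by its pair of the form (n, 0) or (0, n).

```python
def equivalent(p, q):
    """(a, b) Rz (c, d) if and only if a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

def canonical(p):
    """Representative of [(a, b)] of the form (n, 0) or (0, n)."""
    a, b = p
    return (a - b, 0) if a >= b else (0, b - a)

def add_z(p, q):
    """Assumed formula: [(a, b)] +z [(c, d)] = [(a + c, b + d)]."""
    (a, b), (c, d) = p, q
    return canonical((a + c, b + d))

def mul_z(p, q):
    """Assumed formula: [(a, b)] xz [(c, d)] = [(ac + bd, ad + bc)]."""
    (a, b), (c, d) = p, q
    return canonical((a * c + b * d, a * d + b * c))

def distributes(p, q, r):
    """Check p xz (q +z r) = (p xz q) +z (p xz r) on representatives."""
    return mul_z(p, add_z(q, r)) == add_z(mul_z(p, q), mul_z(p, r))
```

For instance, mul_z((2, 4), (5, 2)) returns (0, 6), the representative of −6, which is what multiplying −2 by 3 should give.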
It is good to remember that any integer can be written in the form [(n, 0)]
or [(0, n)] = −[(n, 0)]. These forms make it easier to add and multiply
them without memorizing intricate formulas. For example, the expression
[(2, 4)] ×z [(5, 2)] can be more easily computed as follows:
Z ⊆ P^3 (N)
But simply defining a/b as an ordered pair (a, b) in Z×Z would not do, since a
rational number, say −2/3, can have many equivalent forms: 2/(−3), −4/6, 20/(−30). So
the associated ordered pairs of integers (−2, 3), (−4, 6) and (20, −30) should
also be equivalent forms of the same number. To overcome this difficulty,
we will define an equivalence relation on Q = Z × Z∗ (where Z∗ = Z −
{0}) so that all equivalent forms of (−2, 3) belong to an equivalence class
[(−2, 3)] induced by this equivalence relation. We have chosen the Cartesian
product Z × Z∗ rather than Z × Z since the second entry cannot be zero.
The equivalence relation we will use to extract Q from Q = Z × Z∗ will
be represented by Rq . To define this equivalence relation Rq , we will ask
ourselves: What property makes the two rational numbers −2/3 and 8/(−12)
equivalent? We see that
−2/3 = 8/(−12) implies (−2)(−12) = (3)(8)
For example, consider the two elements (6, 10) and (15, 25) of Z × Z∗.
Since 6 ×z 25 = 150 = 10 ×z 15, they are equivalent rational numbers. So
[(6, 10)]q =q [(15, 25)]q. Remember that [(a, b)]q will represent the set of all
elements of Z × Z∗ which are Rq -equivalent to the element (a, b) ∈ Z × Z∗.
When there is no risk of confusion with the equivalence class of another
equivalence relation, we will simply use [(a, b)] rather than [(a, b)]q .
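The test for Rq-equivalence is easy to mechanize. The following sketch uses ordinary Python integers in place of set-theoretic integers, and the function name rq_equivalent is a hypothetical one chosen for illustration.

```python
def rq_equivalent(p, q):
    """(a, b) Rq (c, d) if and only if a * d = b * c, where b and d are non-zero."""
    (a, b), (c, d) = p, q
    if b == 0 or d == 0:
        raise ValueError("the second entry of each pair must be non-zero")
    return a * d == b * c
```

With this, rq_equivalent((6, 10), (15, 25)) holds because 6 × 25 = 150 = 10 × 15.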
We now formally define the rational numbers within a set-theoretic context.
Part V: From sets to numbers 155
Theorem 16.6 Suppose a and b are positive integers where b ≠ 0 and [(a, b)]
is an Rq equivalence class. Then
5 Recall that a and −b is shorthand for expressions of the form [(a, 0)] or −[(0, b)]
a) [(−a, −b)] = (−a)/(−b) = a/b = [(a, b)].
b) −[(a, b)] =q [(−a, b)] = (−a)/b = a/(−b) = [(a, −b)].
Proof:
a) Since −a ×z b = −b ×z a, then (−a, −b)Rq (a, b). Then (−a, −b) ∈ [(a, b)]
and so we can write
(−a)/(−b) = [(−a, −b)] =q [(a, b)] = a/b.
Examples.
The following examples illustrate that doing arithmetic by referring to the
set-theoretic definitions and the listed properties is a bit awkward and re-
quires some thought. It is of course not an efficient way of doing arithmetic.
To see this, we compute the following expressions by representing these num-
bers as equivalence classes of ordered pairs and using the above definitions.
When useful, we indicate which of the above statements are invoked to jus-
tify various steps.
a) Compute −4(3 − 7)
b) Compute (−2/5)(6/7 − 3/11)
Solution :
a)
−4(3 − 7) = [(0, 4)] ×z ([(3, 0)] −z [(7, 0)])
=z [(0, 4)] ×z ([(3, 0)] +z [(−7, 0)]) (By Theorem 16.6, b) .)
=z [(0, 4)] ×z ([(3, 0)] +z [(0, 7)])
=z ([(0, 4)] ×z [(3, 0)]) +z ([(0, 4)] ×z [(0, 7)])
=z [(0, 12)] +z [(28, 0)]
=z [(28, 12)]
=z [(16, 0)] (Since (a, b)Rz (c, d) ⇔ a + d = b + c.)
b)
(−2/5)(6/7 − 3/11) = [(−2, 5)] ×q ([(6, 7)] −q [(3, 11)])
=q [(−2, 5)] ×q ([(6, 7)] +q [(−3, 11)]) (By Theorem 16.6, b) .)
Concepts review:
1. Describe the equivalence relation Rz on N × N used to define the
elements of the integers Z.
2. Describe the equivalence class induced by Rz on N × N which rep-
resents the integer −9. What about the integer 3?
3. Do the equivalence classes {[(0, n)] : n ∈ N} ∪ {[(n, 0)] : n ∈ N}
account for all the equivalence classes induced by Rz on N × N?
4. How is addition +z defined on Z in a set-theoretic context?
5. How is multiplication ×z defined on Z in a set-theoretic context?
6. Describe the equivalence relation Rq on the Cartesian product Z ×
Z∗ used to define the rational numbers Q.
7. How is addition +q defined on Q in a set-theoretic context?
8. How is multiplication ×q defined on Q in a set-theoretic context?
EXERCISES
A. 1. Use the definitions in 16.3 to show that the following statements are true:
a) −2 ≤ 10.
b) 0 ≤ 3.
c) −5 − 7 = −12.
d) −2 + 6 = 4.
e) 7 × −2 = −14.
f) −1 × −2 = 2.
2. Use the definitions in 16.5 to show that the following statements are true:
a) (−2)/(−3) = 2/3.
b) −2/3 ≤ 2/3.
c) 5/3 + 3/2 = 19/6.
d) −2 − 6/5 = −16/5.
e) 6/5 × 1/3 = 2/5.
f) 3 = 6/2.
4. The relation ≤z on Z is defined as follows: [(a, b)] ≤z [(c, d)] if and only if
a + d ≤ b + c. Show that this is a linear ordering.
5. Let Q = Z × Z∗ where Z∗ = Z − {0}. Let Rq be a relation on Q defined as
follows: ((a, b), (c, d)) ∈ Rq if and only if a ×z d = b ×z c. Show that Rq is
an equivalence relation on Q.
“completeness property”.
It is also often referred to as the “least upper bound property”. It states that
Does the set, Q, satisfy the “completeness property”? That is, is it true that
for every non-empty bounded subset S of Q, Q contains the least upper
bound of S? Well, let’s consider the subset
S = [−4, √2) ∩ Q
tributions to number theory, abstract algebra (particularly ring theory), and the axiomatic
foundations of arithmetic. His best-known contribution is the definition of real numbers
through the notion of Dedekind cut. He is also considered a pioneer in the development of
modern set theory and of the philosophy of mathematics known as Logicism.
162 Section 17: Real numbers: “Dedekind cuts are us!”
Along with this slightly abstract definition, we consider the following subset
of Q. For r ∈ R, we define
(←Q r) = (−∞, r) ∩ Q
We will show that subsets of Q of the form (←Q r) offer another way of per-
ceiving the Dedekind cuts.
Claim: We claim that any Dedekind cut can be expressed in the form (←Q r).
Proof of claim: Suppose S is a Dedekind cut, a subset of Q satisfying the
three conditions stated above.
Condition one states that there exists a rational number k such that k ∉ S.
If u ∈ S and u > k then, by condition two k would belong to S. So S ⊆
(−∞, k)∩Q. Condition two also guarantees that if a ∈ S, then (−∞, a)∩Q ⊆
S. So, if a ∈ S, then
See that the number q cannot belong to S, for if it did, S would contain its
maximal element q, contradicting condition three. We conclude that
S = (−∞, q) ∩ Q = (←Q q)
We define,
D = {(←Q r) : r ∈ R}
We have argued above that the set, D, precisely represents the set of all
Dedekind cuts.
is a Dedekind cut.
Proof of claim #1: To see this, we will show that D satisfies the three
Dedekind cuts’ conditions. First note that, if k > r and q > t are rational
numbers, then any x + y ∈ D satisfies x + y < k + q. Then k + q is a strict
upper bound of D. So D cannot be all of Q. Then condition one is satisfied.
Next, suppose s ∈ D. Then s = a + b where a ∈ (←Q r), b ∈ (←Q t).
Suppose d ∈ Q is such that d < a + b < r + t. Then d − a < b ∈ (←Q t), so
d − a ∈ (←Q t). Since a ∈ (←Q r), then
d = a + (d − a) ∈ D
s ∈ D ⇒ (←Q s) ⊂ D
Finally suppose s = a + b where a ∈ (←Q r), b ∈ (←Q t). Then there exist
ar and bt in (←Q r) and (←Q t), respectively, with a < ar and b < bt, such
that s = a + b < ar + bt ∈ D. Condition three is satisfied.
So D is indeed a Dedekind cut. This establishes claim #1.
Claim #2: We claim that D = (←Q (r + t)).
Proof of Claim #2: Suppose s = a+b ∈ D where a ∈ (←Q r) and b ∈ (←Q t).
Then s = a + b < r + t. So s ∈ (←Q (r + t)). So D ⊆ (←Q (r + t)).
We now show that (←Q (r + t)) \ D is empty. To do this, it suffices to show
that r + t is the least upper bound of D. If not, there exists a positive
number, say ε, such that r + t − ε is the least upper bound of D. But
r + t − ε = r − ε/2 + t − ε/2
For example: (←Q −5) + (←Q 7) = (←Q (−5 + 7)) = (←Q 2).
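The "ε/2" idea behind the equality (←Q r) + (←Q t) = (←Q (r + t)) can be illustrated with exact rational arithmetic. The sketch below is restricted to rational leaders r and t, since irrational leaders cannot be represented by Python's Fraction type, and the names in_cut and split are hypothetical. Given any rational d < r + t, it produces a ∈ (←Q r) and b ∈ (←Q t) with a + b = d.

```python
from fractions import Fraction

def in_cut(x, r):
    """Membership of x in (<-Q r) = (-infinity, r) ∩ Q, for a rational leader r."""
    return x < r

def split(d, r, t):
    """Given d < r + t, return (a, b) with a < r, b < t and a + b = d."""
    if not d < r + t:
        raise ValueError("d must lie strictly below r + t")
    eps = (r + t) - d  # positive gap between d and r + t
    return r - eps / 2, t - eps / 2
```

For example, with r = −5, t = 7 and d = 19/10, the call split(d, r, t) returns a = −101/20 and b = 139/20, which lie below −5 and 7 respectively and add up to 19/10.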
c) Multiplication on D: In the case where both r and t are greater than zero
we define multiplication as:
(←Q r) (←Q t) = {xy : x ∈ (←Q r) and y ∈ (←Q t), x, y ≥ 0} ∪ [(−∞, 0) ∩ Q]
Showing that
(←Q r) (←Q t) = (←Q rt)
is left as an exercise.
For example:
(←Q 5) (←Q 7) = (←Q 5 × 7) = (←Q 35)
(←Q √2) (←Q −√2) = −(←Q √2 × √2) = −(←Q 2)
(←Q −4) (←Q 0) = (←Q 0)
(←Q −2) (←Q −10) = (←Q 2 × 10) = (←Q 20)
f(r) = (←Q r)
Definition 17.2 We define the real numbers as being the set of all Dedekind
cuts D, linearly ordered by inclusion with addition + and multiplication ×
(as described above). Those Dedekind cuts which have no least upper bound
in Q are called irrational numbers.
Lemma 17.3 The union of a non-empty set of Dedekind cuts is either itself
a Dedekind cut or is the set Q.
Proof:
What we are given: That U is a non-empty set of Dedekind cuts.
What we are required to show: That ∪{V : V ∈ U } is Q or is of the form
(←Q r) for some r ∈ R.
Case 1: Suppose U = {(←Q t) : t ∈ R}. Every Dedekind cut can be ex-
pressed as (←Q t) for some real number t. Then ∪t∈R (←Q t), the union of all
Dedekind cuts, is Q.
Case 2: Suppose that ∪{V : V ∈ U } ≠ Q. Then there is a proper non-empty
subset M ⊂ R such that U = {(←Q t) : t ∈ M } and ∪t∈M (←Q t) ≠ Q. It
suffices to show that ∪t∈M (←Q t) is a Dedekind cut.
Since ∪t∈M (←Q t) ≠ Q, then there exists some u ∈ Q such that u ∉ (←Q t)
for all t ∈ M . Then t ≤ u for all t ∈ M . This means that u is an upper
3 Note that this property is often expressed in many different but equivalent forms. The
following properties are all equivalent to the Completeness property: (1) The limit of every
infinite decimal sequence is a real number, (2) Every bounded monotonic sequence is con-
vergent, (3) A sequence is convergent if and only if it is a Cauchy Sequence. Googling the
words “Completeness property” may direct the internet surfer to any one of these.
bound of M ⊂ R.
By the completeness principle for the real numbers, since M is bounded in
R, M has a least upper bound, say v ∈ R.
We claim that ∪t∈M (←Q t) = (←Q v).
− We first show that (←Q v) ⊆ ∪t∈M (←Q t):
Let z ∈ (←Q v). Then there exists t ∈ M such that z < t ≤ v (for if t ≤ z
for all t ∈ M , then z is an upper bound of M smaller than v, a contradiction
of the definition of v). So z ∈ (←Q t) ∈ U . Then (←Q v) ⊆ ∪t∈M (←Q t) must hold
true.
− We now show that ∪t∈M (←Q t) ⊆ (←Q v):
Let u ∈ (←Q t) for some t ∈ M . Since t ∈ M , t ≤ v. Then u ∈ (←Q t) ⊆
(←Q v). So ∪t∈M (←Q t) ⊆ (←Q v) as claimed.
So ∪t∈M (←Q t) = (←Q v), a Dedekind cut as claimed.
So the union ∪t∈M (←Q t) of all elements of U = {(←Q t) : t ∈ M } is a Dedekind
cut.
Theorem 17.4 Let D denote the set of all Dedekind cuts linearly ordered
by ⊂. Then if S is a non-empty bounded subset of D, S has a least upper
bound (with respect to the ordering ⊂).
Proof:
From this, we conclude that the set of all Dedekind cuts, D, represents
the set of all real numbers in the ZFC-set-theoretic universe. Thus, from
the primitive concepts “class”, “set” and “belongs to” and Axioms 1 to 8
we have successfully defined the sets of natural numbers, integers, rational
numbers and real numbers. The elements of these sets are themselves sets. If
the existence of the natural numbers N is almost an immediate consequence
of the Axiom of infinity, the other axioms provided the necessary tools to
construct from N the integers, rationals and real numbers (as sets).
Remark : Having defined the real numbers as Dedekind cuts, we see that ev-
ery real number u in ZFC is viewed as a subset of Q and so u ∈ P(Q). Since
Q ⊆ P^6 (N) (see page 155), u ∈ P(P^6 (N)) = P^7 (N) and so, R ⊆ P^7 (N).
We can now see why the Axiom of power set plays an essential role in the
ZFC-universe. Had we not declared that “P(S) is a set whenever S is a set”,
what guarantee would we have that the real numbers exist in our universe?
We have not yet invoked the following axioms: Axiom A7, called the Axiom
of replacement, Axiom A9, called the Axiom of regularity, and the Axiom
of choice. These three axioms will help us handle certain difficulties encoun-
tered, while dealing with infinite sets, the main subject of our investigation
for the rest of this book.
Concepts review:
1. What is an initial segment? What is its leader?
2. How is addition of initial segments defined?
3. If r and t are positive real numbers, how is multiplication of (←Q r)
and (←Q t) defined?
4. Provide a definition of Dedekind cuts.
5. Define a function f which maps R one-to-one onto the set D of all
Dedekind cuts.
6. Give a set-theoretic definition of the real numbers.
7. What can we say about the union of a family of Dedekind cuts?
8. What does the Completeness property of the reals (equivalently,
Least upper bound principle of the real numbers) state?
9. Does the set of all Dedekind cuts satisfy the Completeness property?
10. How are the elements of all Dedekind cuts ordered?
11. How does a Dedekind cut representing a rational number differ from
one representing an irrational number?
EXERCISES
B. 8. Show that any set of the form (←Q r), where r is a real number, satisfies the
three conditions in the formal definition of a Dedekind cut.
Infinite sets
Part VI: Infinite sets 173
Surprised? Well, it is better than saying that an infinite set is “a set with
lots of things in it” or “is a set which has more things in it than we can
count”. (A set containing one hundred billion atoms has a lot of things in
it and yet no one would perceive it as being an infinite set.) Dedekind’s
definition is succinct and without ambiguities since it is only expressed us-
ing words that have been previously defined. An infinite set is a set which
properly contains a one-to-one image of itself. If we were to define finite sets
as being those sets that are not infinite, then we could say that “a set S
is finite if and only if S contains no proper subset T which is a one-to-one
image of S”. For example, since the natural number 6 = {0, 1, 2, 3, 4, 5}
cannot be in one-to-one correspondence with any of its proper subsets, it cannot
be infinite. When a function f : A → B maps a set A one-to-one into a set
B, we often say that f embeds A inside B in the sense that B contains a
“copy” of A. Using this vocabulary we can say that “S is infinite if and only
if it is embedded into a proper subset of itself ”.
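For a small natural number n, Dedekind's criterion can be tested by brute force: we list every function from n = {0, 1, . . . , n − 1} to itself and look for one that is one-to-one with image a proper subset of n. The sketch below is illustrative only. The function names are hypothetical and the search is feasible only for very small n. By contrast, the successor map on N is one-to-one and never takes the value 0, which is the kind of embedding a finite set cannot have.

```python
from itertools import product

def embeds_in_proper_subset(n):
    """True if some one-to-one f: n -> n has an image that is a proper subset of n."""
    domain = set(range(n))
    for images in product(range(n), repeat=n):  # images[i] plays the role of f(i)
        image = set(images)
        one_to_one = len(image) == n
        if one_to_one and image != domain:
            return True
    return False

def shift(n):
    """The successor map n -> n + 1 on N: one-to-one, and 0 is never a value."""
    return n + 1
```

The search comes back empty for every n it is tried on, in agreement with the statement that natural numbers are finite sets.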
One could also declare R to be an infinite set since (−π/2, π/2) ⊂ R and
(−π/2, π/2) is a one-to-one image of R under the function arctan : R → R.
1 Actually records show that Bolzano suggested in 1847 (before Dedekind) that an infinite
set is a set that can be mapped one-to-one onto a proper subset of itself, a property which
cannot be satisfied by finite sets.
Proof:
a) The empty set, ∅, has no proper subsets and so a function f cannot map
∅ into a proper subset of ∅. So ∅ is finite.
b) The singleton set, {x}, contains only one element x, so the only proper
subset of {x} is ∅. Then, for any well-defined
function f : {x} → {x}, ∅ cannot be the one-to-one image of {x} under
f. So singleton sets are finite.
Proof:
What we are given: That S is an infinite set and a ∈ S.
What we are required to show: That S − {a} is an infinite set.
Since S is infinite, then there exists a one-to-one function g : S → S such
that g[S] ⊂ S. We will show that S −{a} is infinite by exhibiting a one-to-one
function h on S − {a} such that h[S − {a}] is a proper subset of S − {a}.
Choose an arbitrary element k ∈ S − g[S].
− Case A: Suppose a ∈ g[S]. Then there is some u ∈ S such that
g(u) = a.
· Subcase A-1: Suppose u ≠ a. Define a function h on S − {a} as follows:
h(x) = g(x) if x ∈ S − {a, u}
h(x) = k if x = u
Since g is one-to-one on S − {a, u}, then so is h. Furthermore, h uniquely
maps u to k ∈ S − g[S]. So h is one-to-one on S − {a}. See that the element
g(a) belongs to S − {a} (since g(u) = a and u ≠ a) but not to h[S − {a}].
So h[S − {a}] is a proper subset of S − {a}.
Proof:
The proof is by induction. Let P (n) be the property “The natural num-
ber n is finite”. Since 0 = ∅ is finite, then P(0) holds true. Suppose
the natural number n = {0, 1, 2, 3, . . . , n − 1} is finite. We claim that
n+1 = n+ = {0, 1, 2, 3, . . . , n} must be finite. Suppose not. That is, suppose
n+ is infinite. By Lemma 18.5, (n + 1) − {n} = n must also be infinite
contradicting the fact that P (n) holds true. So P (n + 1) must hold true.
By the principle of mathematical induction, P (n) holds true for all natural
numbers n. Thus, every natural number is a finite set.
many sets. If a theorem statement invokes the Axiom of choice in its proof,
it is common practice to alert the reader to this fact by posting the acronym
[AC].2
Proof:
(⇐) Suppose S is empty or is the one-to-one image of a natural number n.
Since ∅ is finite and every natural number n is finite, then S must be finite
(by Corollary 18.4 and Theorem 18.6).
(⇒) Conversely, suppose S is a non-empty finite set.
We are required to show that there exists a natural number n which can be
mapped one-to-one onto S. Suppose not. That is, suppose there does not
exist a natural number n which maps one-to-one onto S.
Claim: That S must then be infinite, contradicting our hypothesis.
Proof of claim: We prove the claim by constructing a one-to-one function
f : N → S which maps N into S.
Choose an element s0 in S to form the subset S1 = {s0 } of S. Define the
function f : {0} → {s0 } as f(0) = s0 . Then S − S1 is non-empty, for if it
was empty, then S = S1 would be the one-to-one image of {0} under the
function f contradicting the fact that S is not the one-to-one image of a
natural number. So we can choose an element s1 from S − S1 to construct
the subset S2 = {s0 , s1 }. Define the one-to-one function f : {0, 1} → {s0 , s1 }
as f(i) = si for i = 0, 1.
Suppose we have inductively constructed the subset Sn = {s0 , s1 , s2 , . . . , sn−1 }
of S where f : {0, 1, . . . , n − 1} → Sn is the one-to-one function defined as
f(i) = si . Then to avoid a contradiction, S − Sn must be non-empty. The
Axiom of choice provides us with the choice function k : P(S) → S which
allows us to choose from each set S −Sn an element sn from which we define
the one-to-one function f : {0, 1, . . ., n} → Sn+1 defined as
f(i) = si if i < n
f(n) = k(S − Sn ) = sn
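For a finite set the step-by-step selection of s0, s1, s2, . . . terminates, and the result is a one-to-one labelling of S by the elements of a natural number. The sketch below illustrates this under stated assumptions: the built-in min stands in for the choice function k (so the elements are assumed to be comparable), and the name enumerate_with_choice is hypothetical. No appeal to the Axiom of choice is needed in this finite setting.

```python
def enumerate_with_choice(s, choose=min):
    """Build f: {0, ..., n-1} -> S by repeatedly choosing s_n from S - S_n."""
    remaining = set(s)          # plays the role of S - S_n
    f = {}
    n = 0
    while remaining:            # stops once S - S_n is empty
        s_n = choose(remaining)
        f[n] = s_n
        remaining.remove(s_n)
        n += 1
    return f
```

Applied to the set {"c", "a", "b"}, this yields the labelling 0 ↦ "a", 1 ↦ "b", 2 ↦ "c".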
“Is it possible to prove this statement without invoking the Axiom of choice?”.
We have shown that “counting” the elements in a finite set S comes down
to determining which natural number n is mapped one-to-one onto S.
We are essentially assigning to each of the n elements of S the labels
0, 1, 2, 3, . . ., n − 1. The corollary above shows that we could have defined
finite sets as follows:
Definition: A set S is a finite set if and only if it can be mapped
one-to-one onto some natural number n. If we say that
180 Section 18: Infinite sets versus finite sets
All the definitions and theorems stated and proved above would logically
follow from this definition of finite sets. The following theorem provides an-
other characterization of infinite sets.
3 Note that f need not be a one-to-one function for this to hold true.
4 If n = 0 we define 2^0 = 1, 2^1 = 2^0 × 2. If n is a natural number other than 0 we define
2^n = 2 × 2 × · · · × 2 (n times).
the reader decide to skip there will be no loss of continuity in the subject matter.
Suppose not. Suppose (n, x) ∈ f ∗ and (n, y) ∈ f ∗ where x ≠ y. Let
U = f ∗ − {(n, y)} (the set f ∗ take away the element (n, y)). Then we
easily see that U is one of the relations in S . The fact that U is strictly
smaller than f ∗ , previously declared to be the smallest of the relations
in S , is a contradiction. Then x must be equal to y. We conclude that
P (n) holds true as required.
By mathematical induction, P (n) holds true for all n. So f ∗ is a well-defined
function as claimed.
We will now show that f ∗ is unique. Let g be another function satisfying the
conditions (0, m) = (0, g(0)) ∈ g and
Concepts review:
1. What is the definition of infinite set as put forward by Dedekind?
2. From Dedekind’s definition of infinite set, how can we show that N
is infinite?
3. How do we define a finite set?
4. Is the empty set a finite set?
5. Is a subset of a finite set finite?
6. If S is an infinite set and u ∈ S, must S − {u} be infinite?
7. If a set S has a subset which is infinite, is S necessarily infinite?
8. If a set S is infinite and f : S → Y is a one-to-one mapping onto a
set Y , what can we say about the set Y ?
EXERCISES
Definition 19.1 Two sets, A and B, are said to be equipotent sets if there
exists a one-to-one function, f : A → B, mapping A onto B. If A and B are
equipotent, we will say that “A is equipotent to B” or “A is equipotent with
B”.
So equipotent finite sets are precisely those finite sets which are equipotent
to the same natural number. We know that for finite sets A and B, “A is
equipotent to a proper subset of B”, and “A is smaller than B” are equiv-
alent statements; we cannot however say that an infinite set A which is
equipotent to a proper subset of a set B is necessarily “smaller” than B.
For example, even if the set N is easily seen to be equipotent to the proper
subset, {0, 4, 8, 12, . . . }, of itself (via the function f(n) = 4n), we instinc-
tively hesitate to say that N is a “smaller” set than {0, 4, 8, 12, . . . } or that
{0, 4, 8, 12, . . . } is smaller than N. The words “smaller than” seem to have a
precise meaning only when discussing finite sets.
Definition 19.2 Countable sets are those sets that are either finite or equipo-
tent to N. Infinite countable sets are said to be countably infinite. Those infinite
sets which are not countable are called uncountable sets.
many axioms each differing only by the formula φ it refers to. So to be more precise, given
a formula φ in set theory language, we would refer to it as Axiom A7(φ) rather than A7.
188 Section 19: Countable and uncountable sets
f(n) = −(n + 1)/2 if n is odd
f(n) = n/2 if n is even
f(0) = 0
f(1) = −1
f(2) = 1
f(3) = −2
f(4) = 2
⋮
So Z is an infinite countable set2 .
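The one-to-one correspondence between N and Z that sends odd n to −(n + 1)/2 and even n to n/2 can be written out together with its inverse. This is a sketch on Python integers, and the names to_integer and to_natural are hypothetical.

```python
def to_integer(n):
    """f: N -> Z with f(n) = -(n + 1)/2 for odd n and f(n) = n/2 for even n."""
    if n < 0:
        raise ValueError("n must be a natural number")
    return -((n + 1) // 2) if n % 2 == 1 else n // 2

def to_natural(z):
    """Inverse map Z -> N: non-negative z goes to 2z, negative z goes to -2z - 1."""
    return 2 * z if z >= 0 else -2 * z - 1
```

Evaluating to_integer on 0, 1, 2, 3, 4 reproduces the values 0, −1, 1, −2, 2 listed above, and composing the two maps in either order returns the starting value on every sample tried.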
There are quite a few general statements that we can state about countable
sets. We will find it very useful to know that subsets of countable sets and
images of countable sets are countable.
m(0) = g({i ∈ N : xi ∈ T })
m(k) = g({i ∈ N : xi ∈ T − {xm(0), xm(1), . . . , xm(k−1)}})
f(m, n) = 2^m (2n − 1)
⇒ s=t
⇒ (m, s) = (n, t)
So f is one-to-one as claimed.
We have shown that f is both one-to-one and onto. Thus, the sets N × Z
and Z − {0} are equipotent. We have shown that Z is countable. Since
Z − {0} ⊂ Z, then, by the previous theorem, Z − {0} is also countable. So
there exists a one-to-one function g mapping Z − {0} onto N. So f^−1 ◦ g^−1
maps N one-to-one onto N × Z. So N × Z is countable, as required.
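The map f(m, n) = 2^m (2n − 1) from N × Z onto Z − {0} and its inverse can be sketched as follows. The inverse factors out as many 2s as possible and then reads n off the remaining odd factor. The function names are hypothetical.

```python
def pair_to_nonzero(m, n):
    """f(m, n) = 2^m (2n - 1), for a natural number m and an integer n."""
    if m < 0:
        raise ValueError("m must be a natural number")
    return 2 ** m * (2 * n - 1)

def nonzero_to_pair(z):
    """Inverse of f: write the non-zero integer z as 2^m (2n - 1) and return (m, n)."""
    if z == 0:
        raise ValueError("z must be non-zero")
    m = 0
    while z % 2 == 0:  # factor out as many 2s as possible
        z //= 2
        m += 1
    # z is now odd, so z = 2n - 1 for exactly one integer n
    return m, (z + 1) // 2
```

For the integer 1584, nonzero_to_pair returns (4, 50), matching 1584 = 2^4 (2 · 50 − 1).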
The rational numbers in this set are irreducible so a/b = c/d if and only if
a = c and b = d. We define the function f : N × (Z − {0}) → Q as follows:
f(m, n) = m/n
3 Note that any non-zero integer can be expressed as a product 2^m (2n − 1) for some
natural number m and some integer n. For example, suppose we are given the integer 1584. If we factor
out as many 2s as possible from 1584, we obtain 2^4 and we are left with an odd number
2 · 50 − 1. See that 1584 = 2^4 (2 · 50 − 1).
Lemma 19.5 Suppose f maps an infinite countable set A onto a set B = f[A].
Then B is countable.
Proof:
What we are given: The set A is countable, f : A → B maps A onto the set
B. What we are required to show: That B is countable.
Since A is countable, we can index the elements of A with the natural
numbers. Let A = {ai : i ∈ N}. For each b ∈ f[A] = B let ab∗ be the element
in f ← ({b}) such that
b∗ = min{i ∈ N : ai ∈ f ← ({b})}
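The selection of the least index b∗ for each b ∈ f[A] can be illustrated on a finite initial portion of an indexed set. In the sketch below the list a stands for a0, a1, a2, . . . and f is any function whose values can be used as dictionary keys. The name least_index_preimages is hypothetical.

```python
def least_index_preimages(a, f):
    """Map each b in f[A] to b* = min{i : f(a[i]) = b}, for A listed as a[0], a[1], ..."""
    first = {}
    for i, a_i in enumerate(a):
        b = f(a_i)
        if b not in first:  # keep only the smallest index reaching b
            first[b] = i
    return first
```

Distinct elements b receive distinct indices b∗, so the assignment b ↦ b∗ is one-to-one from f[A] into N.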
Proof:
What we are given: That the sets in {Ai : i ∈ S ⊆ N} are all countable sets.
What we are required to show: That ∪i∈S Ai is countable.
Since each set Ai is countable, then we can index the elements of each Ai
with an initial segment Ti of natural numbers or with all elements of N. For
each i, let Ai = {a(i,j) : j ∈ Ti ⊆ N}.
We will define a function f : ∪i∈S Ai → S × N as follows: f(a(i,j) ) = (i, j).
We see that f maps ∪i∈S Ai one-to-one into S × N. Since f [∪i∈S Ai ] ⊆
S × N ⊆ N × N, it is countable. Then ∪i∈S Ai is the one-to-one image of the
countable set f [∪i∈S Ai ] under the inverse map f −1 .
We conclude that ∪i∈S Ai is countable.
of set theory, which has become a fundamental theory in mathematics. Cantor established
the importance of one-to-one correspondence between the members of two sets, defined
infinite and well-ordered sets, and proved that the real numbers are more numerous than
the natural numbers. Cantor’s method of proof of this theorem implies the existence of
an infinity of infinities. He defined the cardinal and ordinal numbers and their arithmetic.
Cantor’s work is of great philosophical interest, a fact he was well aware of. (Wikipedia)
6 Readers who struggle a bit with the proof are encouraged to persist in their efforts to
grasp the general idea. Discuss aspects of the proof that seem to elude your understanding
with a co-reader. It is considered part of the standard mathematical culture. I occasionally
get messages from students who are convinced that they have a proof which shows that R
is countably infinite.
We will show that the open interval (0, 1) is not countable. As a consequence,
it will be impossible for R to be countable, since subsets of countable sets
have been shown to be countable.
Proof by contradiction.
Suppose f : N → (0, 1) is a one-to-one function mapping N onto (0, 1). This
means that we can index the elements of (0, 1) with the natural numbers
as follows: (0, 1) = {x0 , x1, x2 , x3 , . . . , } where xi = f(i). We claim that at
least one real number does not belong to f[N] and so f is not “onto” (0, 1):
− We write out each real number as an infinite decimal expansion:
Some readers, possibly for philosophical reasons, may find it difficult to ac-
cept that the infinite set R is not a one-to-one image of N, and hence is a
strictly larger infinite set than N, even though they can point to no obvious
7 It can be shown, for example, that the rational numbers 0.04999999 . . . and 0.05000 . . .
are different representations of the same rational number 5/100. But the decimal representa-
tion of a rational a/b is unique provided we do not allow a tail end of 9s in our representation
of this number.
errors in Cantor’s proof. Skeptical readers may find some comfort in learn-
ing that even very skilled mathematicians, when confronted by results which
appear counter-intuitive, may harbor some nagging doubts in spite of being
presented with an irrefutable proof. Georg Cantor once wrote to Richard
Dedekind “Je le vois, mais je ne le crois pas”8 (I see it, but I don’t believe
it) after determining that there is a one-to-one correspondence between all
points in the plane and the set of points on a line.
Having now convinced ourselves that the set of all real numbers is uncount-
able, we can subdivide the class of all infinite sets into two categories: the
subclass of all countably infinite sets and the subclass of all uncountably
infinite sets. We will see in the next section that the class of all uncountable
sets can itself be divided into other major subcategories of infinite sets.
Concepts review:
1. What does it mean to say that two sets are equipotent sets?
2. What does it mean to say that a set is countable?
3. Is the set ∅ countable?
4. What can we say about the image of a countable set under some
function f?
5. Which of the sets N, Q, Z, N × Z, R are countable sets?
6. Is it true that the subset of a countable set must be countable?
7. How is the procedure Cantor used to prove the uncountability of R
referred to?
8. What can we say about the countable union of countable sets?
EXERCISES
Abstract. In this section we first show that the equipotence relation “∼e ”
is an equivalence relation on the class S of all sets. We use this equiv-
alence relation to partition S into equivalence classes. We show that for
any set S, S cannot be equipotent to its power set P(S). This fact allows
us to construct infinitely many distinct classes of mutually equipotent sets.
We also show that for any non-empty set S, the two sets P(S) and 2^S
are equipotent. Finally, we show that P(N) is embedded in R, and R is
embedded in P(N).
{S ∈ S : S ∼e N or S ∼e n, n ∈ N}
is the class of all countable sets. Also, for all infinite subsets A of N, N ∼e A.
We have also seen that N ≁e R; hence, (N, R) ∉ Re . It is natural to wonder
whether Re is an equivalence relation on S . We immediately verify that
this is the case.
1 The word “equinumerous” is also used to describe two sets which are equipotent. The
word “equinumerosity” is also used to describe the property of sets which are equipotent.
The word “equipotence” has the advantage of having only four syllables rather than the
tongue-twisting seven syllables in “equinumerosity”.
Part VI: Infinite sets 197
Proof:
Reflexivity: For every set S, S is equipotent to itself.
Symmetry: If S is equipotent to T , then T is equipotent to S.
Transitivity: If S is equipotent to T and T is equipotent to H, then S is
equipotent to H.
Just about every set we have discussed up to now belongs to one of these
equivalence classes. Each of these is a subclass of S . This gives rise to a
compelling question: Are there any equipotence-induced equivalence classes
other than the ones listed here? One of the main objectives of this section
is to show that there are.
Proof:
1) Suppose {A, B} ⊂ [N]e . Then A ∼e N and B ∼e N. By Theorem 20.3,
A × B ∼e N × N where N × N is known to be countable (Theorem 19.4).
Hence, A × B ∈ [N]e .
2) Suppose {A, B} ⊂ [R]e . Then A ∼e R and B ∼e R and so, by
Theorem 20.3, A × B ∼e R × R. It is easily seen that R is equipotent with
{1} × R ⊂ R × R. Since {1} × R is uncountable, R × R is uncountable. Then
A × B ∉ [N]e . We must now show that A × B ∈ [R]e . This will be the case
if R × R ∈ [R]e .
We claim that R × R ∈ [R]e .
− It is easily seen that (0, 1) ∼e R.2 Then, by Theorem 20.3, (0, 1)×(0, 1) ∼e
R × R. To show that R × R ∈ [R]e , it then suffices to show that (0, 1) ×
(0, 1) ∼e (0, 1).
− For 0.x1 x2 x3 x4 x5 . . . ∈ (0, 1) (ignoring those decimal expansions with
infinite strings of 9s) define the function f : (0, 1) → (0, 1) × (0, 1) as
follows:
S × T ∼e N × N ∼e N
Proof:
What we are given: {Ai : i ∈ N} is a set of countable sets.
What we are required to show: A0 × A1 × A2 × · · · × An is countable.
We know that the product of any two non-empty countable sets is countable:
We prove the statement by induction. Let P(n) be the statement
“$\prod_{i=0}^{n} A_i$ is countable”
The reader should be careful not to generalize the above theorem when it
comes to Cartesian products. It does not say that “The Cartesian product
of countably many countable sets is countable”. This statement does not
hold true in general. We will soon witness infinite products of countable sets
which are not countable.
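The induction on finite products can be made concrete with an explicit one-to-one map from N × N onto N. The sketch below uses the Cantor pairing function, which is a standard choice and an assumption here (the text may use a different bijection); the names `pair`, `unpair` and `encode` are illustrative only.

```python
from itertools import product
from math import isqrt


def pair(a, b):
    """Cantor pairing: a one-to-one map from N x N onto N."""
    return (a + b) * (a + b + 1) // 2 + b


def unpair(z):
    """Inverse of pair: recover (a, b) from z."""
    w = (isqrt(8 * z + 1) - 1) // 2     # index of the anti-diagonal containing z
    b = z - w * (w + 1) // 2
    return (w - b, b)


def encode(t):
    """Fold pair over a non-empty tuple, mirroring the induction step
    (A0 x ... x An) x A(n+1)."""
    code = t[0]
    for x in t[1:]:
        code = pair(code, x)
    return code


triples = list(product(range(4), repeat=3))
codes = {encode(t) for t in triples}
print(len(triples), len(codes))
```

Each extra factor costs one more application of `pair`, which is why the argument covers every finite product but says nothing about infinite ones.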
Example: Let J denote the set of all irrational numbers. Show that J ∈ [R]e .
It was shown that the set of all rational numbers Q is countably infinite. If
J was countable, then by the theorem above, R = J ∪ Q would be countable,
a contradiction. So J is uncountable. We claim that J ∈ [R]e .
Theorem 20.7 If the sets A and B are equipotent, then so are their associ-
ated power sets P(A) and P(B).
Proof:
Given that A and B are equipotent, there exists a one-to-one function f :
A → B mapping A onto B. We define the function f ∗ : P(A) → P(B) as
follows:
f ∗ (T ) = M ⇔ f[T ] = M
Claim: The function f ∗ is onto P(B).
Proof of claim: Let M ∈ P(B). If M = ∅, then f ∗ (∅) = f[∅] = ∅.
Suppose M is a non-empty subset of B. Since f is onto B, M ⊆ f[A].
Then f[f ← [M ]] = M . So f ∗ maps the element f ← [M ] in P(A) to the
element M in P(B). We conclude that f ∗ maps P(A) onto P(B).
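For finite sets, the induced map f∗(T) = f[T] can be checked directly. The following is a minimal sketch, assuming a hypothetical three-element example; `power_set` and `induced` are helper names introduced only for this illustration.

```python
from itertools import combinations


def power_set(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def induced(f):
    """Return f*: T |-> f[T], the image of T under the dict f."""
    return lambda T: frozenset(f[x] for x in T)


A = {0, 1, 2}
B = {"a", "b", "c"}
f = {0: "a", 1: "b", 2: "c"}        # one-to-one from A onto B
f_star = induced(f)

images = [f_star(T) for T in power_set(A)]
print(len(set(images)), len(power_set(B)))
```

The eight images are pairwise distinct and exhaust P(B), so on this example f∗ is one-to-one and onto.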
So, just like P(n) ≁e n for all natural numbers n, P(N) ≁e N.
A general follow-up question might be: Is it possible for any infinite set, S,
to be equipotent with its power set P(S)?
We will show that the answer to this question is no. That is, if S is infinite,
S ∉ [P(S)]e .
Theorem 20.8 Any non-empty set S is embedded in its power set P(S).
But no subset of S is equipotent with P(S).
Proof:
What we are given: That S is a non-empty set.
What we are required to prove:
1) That S is embedded in P(S).
2) That P(S) is not equipotent to K for any K ⊆ S.
T = {x ∈ K : x ∉ g(x)}
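On a small finite set, the role of this diagonal set can be verified exhaustively. The sketch below takes K = S = {0, 1, 2} (an assumption made to keep the search small) and runs through every function g : S → P(S), checking that the diagonal set is never a value of g. The helper names are hypothetical.

```python
from itertools import combinations, product


def power_set(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(K, g):
    """T = {x in K : x not in g(x)}."""
    return frozenset(x for x in K if x not in g[x])


S = [0, 1, 2]
subsets = power_set(S)

missed_every_time = True
for values in product(subsets, repeat=len(S)):   # all 8**3 maps g : S -> P(S)
    g = dict(zip(S, values))
    if diagonal_set(S, g) in g.values():
        missed_every_time = False

print(missed_every_time)
```

The check succeeds for the same reason as the general argument: if T = g(x), then x ∈ T and x ∉ T would have to hold together.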
The above theorem confirms that, for any set S, S and P(S) are not equipo-
tent so [S]e and [P(S)]e are distinct equivalence classes. It also suggests that
there are many more equipotence-induced equivalence classes than the ones
listed previously on page 197. For example,
A ↪e∼ B
If A and B are non-empty sets, we will say that the set A is properly embedded
in the set B, if A is equipotent to some proper subset C of B where B is not
equipotent to C. To describe “A is properly embedded in B”, we will write
A ↪e B
The relations, ↪e and ↪e∼ are easily seen to be both reflexive and transitive
relations on the class, S , of all sets.
Let S = {S : S is a set}. Remember that the axiom of power set guarantees
that if S ∈ S then P(S) ∈ S . We define the class, E , as
E = {[S]e : S ∈ S }
[A]e ≤e [B]e
which guarantees that this holds true is called the Schröder-Bernstein theo-
rem. This very important theorem will be the main topic of the next section.
In the meantime, keep in mind that we are working with the two classes, S
of all sets and the class E of equivalence classes induced by ↪e , on the table.
We provide a few examples. We have previously shown that any infinite set
contains a subset which is equipotent with N (18.9), and, since N and R are
known to be non-equipotent, N ↪e R. It then follows that
[N]e <e [R]e
Similarly, for any natural number n and any infinite set A, n ↪e N ↪e∼ A.
So [n]e <e [N]e ≤e [A]e implies [n]e <e [A]e .
2 Note that by the Axiom of power set, P^n(S) is a set for all n ∈ N; hence
{[P^n(S)]e : n ∈ N} ⊆ E .
{[P^n(N)]e : n = 0, 1, 2, . . .}
and
{[P^n(R)]e : n = 0, 1, 2, . . .}
form infinite <e -ordered chains of distinct equivalence classes in S . It will
be interesting to determine whether these two chains have any elements in
common. We will have the tools required to answer this question only in the
next section.
which is the set of all possible countably infinite ordered strings of 1s and
2s. Whether N is mapped to {1, 2} or {0, 1} is not considered as being a
significantly different set since all possible countably infinite strings of 0s
and 1s will essentially produce a set which is equipotent to the set of all
possible countably infinite strings of 1s and 2s. Using 0s and 1s will allow
us to represent the set, {0, 1}^N, more succinctly as, 2^N.
We can generalize the expression by replacing N with any set S. That is, if
S is any non-empty set, 2^S represents all functions which map the set S to
{0, 1}. For example, given some finite set, say, S = {3, 4} we can actually
list the functions in this set as:
2^{3,4} = { {(3, 0), (4, 0)}, {(3, 1), (4, 1)}, {(3, 0), (4, 1)}, {(3, 1), (4, 0)} }
This set has four, or 2^2, elements. If S has three elements, say, S = {7, 8, 9}
and we list all elements of 2^S we would see that it contains precisely 2^3 = 8
elements. Verify this. It can be shown by mathematical induction that, if
S has n elements, 2^S must contain 2^n elements (see the Exercise section).
Recall that in Theorem 18.11, we showed by induction that the power set,
P(S), of any n-element set, S, contains 2^n elements. From this fact, we
deduce that,
Question: Can we generalize this statement so that it holds true for all sets
S, including infinite ones?
Answer : We will convince ourselves that we can. But this will require some
careful explaining. To help answer this question, let’s consider a third way
of viewing the elements of 2^S (whether S is finite or not). Suppose f ∈ 2^S.
− Then f←[{1}] and f←[{0}] form disjoint subsets, say T and S − T, of S
respectively. Note that T may possibly be empty or possibly be all of S.
− So f is a function which maps x to 1 if and only if x ∈ T and all other
elements to 0.
− That is, f = χ_T ∈ 2^S = {0, 1}^S.¹ In fact, for every K ⊆ S, equivalently
for every K ∈ P(S), χ_K ∈ 2^S; conversely, for every f ∈ 2^S, there is
precisely one T ⊆ S, equivalently T ∈ P(S), such that f = χ_T.
Consider the function, g : P(S) → 2^S, defined as: g(T) = χ_T. We have
just shown that g maps P(S) one-to-one onto 2^S. Hence, 2^S ∼e P(S). It
is worth formally stating this important statement as a theorem.
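For a finite set the correspondence T ↦ χ_T can be listed in full. The sketch below is a small check, assuming the three-element set S = {7, 8, 9} used earlier; functions are represented as Python dicts, and `power_set`, `chi` and `key` are names introduced only for this illustration.

```python
from itertools import combinations, product


def power_set(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def chi(T, S):
    """Characteristic function of T (a subset of S) as a dict S -> {0, 1}."""
    return {x: (1 if x in T else 0) for x in S}


S = [7, 8, 9]
# Every function S -> {0, 1}, built directly from all 0/1 value patterns.
all_functions = [dict(zip(S, bits)) for bits in product((0, 1), repeat=len(S))]
# The functions obtained as g(T) = chi_T for T in P(S).
from_subsets = [chi(T, S) for T in power_set(S)]


def key(d):
    """Tuple of values in the order of S, used only to compare functions."""
    return tuple(d[x] for x in S)


print(len(all_functions), len(from_subsets))
```

Both lists contain the same 2^3 = 8 functions, so every element of 2^S arises from exactly one subset of S.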
We provide some background that will help follow the proof of the next
statement. Theorem 20.12 states that 2^N = {χ_T : T ∈ P(N)} ∼e P(N).
Recall that the function χ_T : N → {0, 1} is defined as
\[ \chi_T(n) = \begin{cases} 0 & \text{if } n \notin T \\ 1 & \text{if } n \in T \end{cases} \]
We see that the maximum value in the image of {χ_T : T ∈ P(N)} under
f∗ is 10/9, while the minimum value in the image is 0. Also, if U ≠ V, then
f∗(χ_U) ≠ f∗(χ_V) so f∗ is one-to-one on {χ_T : T ∈ P(N)}.
We are now set to prove the following theorem.
It follows that
P(N) ∼e {χ_T : T ∈ P(N)} ↪e [0, 10/9] ⊂ R
So P(N) is embedded in R, as required.
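The definition of f∗ is not reproduced in this passage. Purely as an illustration, the sketch below assumes that f∗ sends χ_T to the real number whose decimal digits are χ_T(0).χ_T(1)χ_T(2). . ., that is, the sum of 10^(−n) over n ∈ T. This assumption is consistent with the stated minimum 0 and maximum 10/9 = 1.111. . ., but it may differ from the author's choice. Only finite subsets of {0, . . . , 5} are tested.

```python
from fractions import Fraction
from itertools import combinations


def decimal_value(T):
    """Sum of 10**(-n) over n in the finite set T (assumed form of f*(chi_T))."""
    return sum((Fraction(1, 10 ** n) for n in T), Fraction(0))


N6 = range(6)
subsets = [frozenset(c) for r in range(7) for c in combinations(N6, r)]
values = [decimal_value(T) for T in subsets]

print(len(subsets), len(set(values)), max(values))
```

Distinct subsets give distinct digit strings made only of 0s and 1s, so no two of them can name the same real number.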
Concepts review:
1. Describe the equivalence relation on the class of all sets which was
discussed in this section.
2. What can we say about the finite union of disjoint countable sets?
3. What can we say about the Cartesian products of two countable
sets?
4. If we add a countable set to an infinite set S, what can we say about
the set that results from this union?
5. With which set is the set of all irrationals equipotent?
6. If two sets A and B are equipotent what can we say about their
respective power sets?
7. What is the meaning given to the expression “A is properly embed-
ded in B”?
8. From any non-empty set S, construct a set B such that S ↪e B.
9. Name a set which contains a copy of R but is not equipotent with
R.
10. If S is a set, what set of functions is equipotent with P(S) other
than P(S) itself?
11. If S is a set, what does the set {χ_T : T ⊆ S} represent? With which
set is it equipotent?
EXERCISES
in algebraic logic (he authored Lectures in Algebra of Logic). Felix Bernstein (1878-1956)
was a German Jewish mathematician. He studied in Munich, Berlin and Göttingen. He
emigrated to the United States in the early thirties during the rise of Nazism.
Proof:
What we are given: That T ⊂ S; that f : S → T maps S one-to-one into T .
What we are required to show: There exists a one-to-one function f * : S → T
which maps S onto T .
Since T is a proper subset of S, then S − T is non-empty.
We construct a sequence of sets {Si : i ∈ N} as follows:
S0 = S − T
S1 = f[S − T] = f[S0]
S2 = f^2[S − T] = f[S1]
S3 = f^3[S − T] = f[S2]
S4 = f^4[S − T] = f[S3]
. . .
Sn = f^n[S − T] = f[S_{n−1}]
. . .
Let U = ⋃_{i∈N} Si. Since f maps all of S into T, for all i > 0, Si ⊆ T. Remember
that S0 = S − T. The Si’s can be shown to be pairwise disjoint. Verification
of this fact is left as an exercise. (This fact is important for the validity of
this proof. Try a proof by induction.)
We define the function f∗ : S → T as follows:
\[ f^{*}(x) = \begin{cases} f(x) & \text{if } x \in U \\ x & \text{if } x \notin U \end{cases} \]
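The construction can be traced on a concrete case. The sketch below assumes S = N, T = N − {0} and the one-to-one map f(x) = x + 2 (all three are hypothetical choices made for illustration; f misses 1, so it is not onto T). Then S0 = {0}, S1 = {2}, S2 = {4}, and so on, so U is the set of even numbers. The membership test `in_U` relies on this particular f being increasing.

```python
def f(x):
    """Assumed example: a one-to-one map from S = N into T = N - {0}."""
    return x + 2


def in_U(x):
    """True when x lies in U = S0 U S1 U ..., with S0 = S - T = {0}
    and S(n+1) = f[S(n)]."""
    y = 0
    while y < x:        # f is increasing, so the iterates reach x or pass it
        y = f(y)
    return y == x


def f_star(x):
    """f*(x) = f(x) if x is in U, and x otherwise."""
    return f(x) if in_U(x) else x


image = [f_star(x) for x in range(20)]
print(image)
```

On the window 0, . . . , 19 the values of f∗ are pairwise distinct and fill {1, . . . , 20}: the even numbers are shifted along U, while the odd numbers stay fixed, which is how f∗ covers the element 1 that f itself misses.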
Proof:
What we are given: There exists a one-to-one function, f : S → T , mapping
S into T and a one-to-one function, g : T → S, mapping T into S.
What we are required to show: There is a one-to-one function which maps
T onto S.
Let h = g◦f. Then h is a function mapping S into S. Since both f and g
are one-to-one on their respective domains, then h is one-to-one on S. Then
Proof:
We will now investigate the “set of all functions mapping N into N”. That
is, we will consider a set whose elements are of the form, f = {(i, ai) : i ∈
N, ai ∈ N}. To be consistent with our notation, we will express this set as
N^N
For example, g = {(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), . . .} represents a particular
element of the set N^N where n is mapped to n^2. We could also represent this
element as, (a0, a1, a2, a3, . . .), where ai = i^2. That is, each ai is associated
to the element (i, i^2).3
The sets, N^N and 2^N, are both sets of functions with domain, N, except the
functions in 2^N have range, {0, 1}, while the functions in N^N have range, N.
Not surprisingly, if g ∈ 2^N, then g ∈ N^N; hence, 2^N ⊂ N^N. Of course, N^N
contains many elements which do not belong to 2^N. For example, {(i, i^2) : i ∈ N}
belongs to N^N but not to 2^N. But it may still be possible for N^N to be
equipotent to 2^N. If we can show that N^N is embedded in 2^N, it will follow from
the Schröder-Bernstein theorem that [N^N]e = [2^N]e .
2 If Ai = {0, 1} for i = 0, 1, 2, 3, . . . we define ∏_{i∈N} Ai = {(a0, a1, a2, . . .) : ai ∈ {0, 1}}.
Then ∏_{i∈N} Ai ∼e 2^N.
3 If Ai = N for i = 0, 1, 2, 3, . . ., we define ∏_{i∈N} Ai = {(a0, a1, a2, a3, . . .) : ai ∈ N}. Or
if one prefers, ∏_{i∈N} Ai can be viewed as the set of all possible countably infinite sequences
of natural numbers. The element g = {(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), . . .} can be viewed as
(0, 1, 4, 9, 16, . . .) ∈ ∏_{i∈N} Ai = {(a0, a1, a2, a3, . . .) : ai ∈ N}. In fact, ∏_{i∈N} Ai ∼e N^N.
Proof:
What we are given: N^N is the set of all functions mapping N into N.
What we are required to show: N^N and R are equipotent.
Claim: R is embedded in N^N.
− We have shown that R ∼e 2^N ⊂ N^N; hence, R is embedded in N^N.
Claim: N^N is embedded in R.
− Let f ∈ N^N. Then f can be expressed in the form f =
{(0, a0), (1, a1), (2, a2), . . .} a subset of N × N. Since f ⊂ N × N, then
f ∈ P(N × N). Then
N^N ⊂ P(N × N)
∼e P(N) (By 20.4, N × N ∼e N, followed by 20.7.)
∼e R
Definition 21.5 If A and B are two sets, then the symbol, B^A, refers to the
set of all functions mapping A into B.4
4 The following argument confirms that if A and B are sets, then A^B is a set: Every
element f ∈ A^B is a subset of the set B × A (finite products of sets are sets). So for every
f ∈ A^B, f ∈ P(B × A). Then A^B ⊆ P(B × A). Since P(B × A) is a set (Axiom of power
set), then A^B must be a set (Axiom of subset).
Examples:
a) The set Q^N denotes the set of all functions f : N → Q. For example,
is such a function. We can of course say that Q^N is the set of all infinite
countable sequences of rational numbers.
b) If S contains three elements and T contains four elements, we can verify
that the set S^T will contain 3^4 elements.
We wonder how the sets, Q^N and N^N, are related. It is clear that if
f∗(x) = y = {(0, f(q0)), (1, f(q1)), (2, f(q2)), (3, f(q3)), . . .}
Concepts review:
1. What does the Schröder-Bernstein theorem say?
2. Name three sets which are equipotent to the power set P(N).
3. What do the symbols 2^N and N^N mean?
4. Is the set N^N equipotent with R?
5. What does the expression B^A mean? If B has 3 elements and A
has 2 elements, how many elements does B^A contain? How many
elements does A^B contain?
EXERCISES
B. 2. Prove that an infinite countable set S can be expressed as the union of two
disjoint infinite countable sets.
3. Prove that if S and T are sets and S − T and T − S are equipotent, then
S and T are equipotent.
4. Prove that for any m ∈ N, N^m is countable.
5. If S = {0, 1, 2} and T = {x, y} write out explicitly the elements of the
following sets:
a) S^T
b) T^S
c) 2^S
d) P(S)
5 The statement “Any infinite linearly ordered set V such that the set S of pairwise
disjoint open subsets is at most countable must be equipotent with R.” is referred to as
Suslin’s problem. It remained an open question until it was proved that it is impossible to
prove or disprove this statement from ZF plus the Axiom of choice.
Part VII
Cardinal numbers
A few words of caution: Even though we have shown that <e linearly orders
the set, A , described above, we have not proven that <e linearly orders the
class E , even though we suspect that it does. We will not assume this to be
the case until we formally prove it to be true.
“Does there exist an uncountable set S (that is, one which is not
equipotent with N) which is properly embedded in R ∼e P(N)?”
Equivalently,
“Does there exist an uncountable set S such that [N]e <e [S]e <e
[R]e = [P(N)]e ? ”
After numerous attempts to construct such a set S in vain, Georg Cantor
came to believe that no such set S exists. In 1878, he conjectured that:
from ZF+GCH. That is, the Axiom of choice exists in a universe governed by ZF+GCH.
Notice that there is nothing to indicate that the equivalence classes, [S]e
(induced by equipotence) are “sets” and, even if they were, there are too many
of them to allow E to be a set.
{9, 7}
{R, N}
{∅, {{∅}}}
{{{{∅}}}, {{∅}}}
each of which is equipotent to 2. We also see that [2]e = [{9, 7}]e. Of course,
being equipotent to itself, 2 = {∅, {∅}} also belongs to [2]e. In fact, it is the
only natural number which is an element of [2]e (noting that, for example,
{9, 7} is not a natural number). If we represent the class, [2]e , in this way,
rather than representing it as, say [{9, 7}]e, it is because we surreptitiously
selected the set 2 = {∅, {∅}} as being the “official” representative of this
class. In fact, we have chosen the natural numbers as the official representa-
tives of all equivalence classes whose elements are finite sets. On the other
hand, possible representatives of [R]e and [P(R)]e are R and P(R), respec-
tively. But we could of course have used P(N) and P 2 (N), respectively.
It would be convenient to uniquely specify an “official” class representative
for each element of E . Determining how we can select a set from each and
every equivalence class in E is, however, not obvious. The Axiom of choice
states that there is a choice function that allows us to select an element from
each set in a “set of sets”. But E is not a “set of sets”. So the Axiom of
choice is not available to us as a tool for selecting an element in each set
in E .5 We need to identify a specific property possessed by a single set in
[S]e which clearly distinguishes it from all other sets in [S]e . Unfortunately,
at this time, we have not yet sufficiently explored our universe of sets to be
able to identify what this set property could be.
There are different ways we can go about solving this conundrum. We could
5 We could use each equivalence class in E as “self-representatives” and call them cardinal
numbers. The problem with this is that these equivalence classes are not known to be sets.
We want a set which represents each equivalence class in E .
Postulate 22.2 There exists a class of sets, C , which satisfies the following
properties:
1. Every natural number n is an element of C .
2. Any set S ∈ S is equipotent to precisely one element in C .
The sets in C are called cardinal numbers. When we say that a set, S, has
cardinality κ, we mean that κ ∈ C and that S ∼e κ, or equivalently, S ∈ [κ]e .
|S| = κ
Note that each cardinal number is a set. From here on, the symbol, C , is
strictly reserved to represent the class of all cardinal numbers. We emphasize
that we postulate the existence of the cardinal numbers, C , immediately, for
convenience only. We will eventually prove the existence of such a class, C .
Definition 22.3 If S and T are sets and κ = |S| and λ = |T |, then we define
addition “+”, multiplication “×” and exponentiation of two cardinal numbers
as follows:
a) If S ∩ T = ∅, κ + λ = |S ∪ T |
b) κ × λ = |S × T |
c) κ^λ = |S^T| where S^T represents the set of all functions mapping T into S
(as previously defined). That is, |S|^|T| = |S^T|. For convenience, we define
0^λ = 0
κ^0 = 1
Also
\[ \kappa^{1} = |S|^{|\{\varnothing\}|} = |S^{\{\varnothing\}}| = |\{\{(\varnothing, a)\} : a \in S\}| = |S| = \kappa \]
We verify the following fact:
\[ 1^{\kappa} = |\{\varnothing\}|^{|S|} = |\{\varnothing\}^{S}| = 1 \]
One should verify that the definitions of sums, products and exponents of
cardinal numbers agree with the operations we perform with finite cardinal
numbers (the natural numbers). Suppose, for example, that A contains four
elements and B contains two elements. Then there are 16 = 4^2 elements in
A^B. Verify this by listing all the elements in A^B. Also there are 4 × 2 = 8
elements in A × B and 4 + 2 = 6 elements in A ∪ B (assuming that A and
B have no elements in common). Verify this fact.
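This verification can also be carried out mechanically. The sketch below assumes two hypothetical disjoint sets, A with four elements and B with two, and counts A ∪ B, A × B and the functions from B into A (each function is stored as a dict).

```python
from itertools import product

# Hypothetical disjoint sets with four and two elements.
A = ["a1", "a2", "a3", "a4"]
B = ["b1", "b2"]

union = set(A) | set(B)                              # A U B
cartesian = list(product(A, B))                      # A x B
# A^B: one function for each way of choosing a value in A for each element of B.
functions_B_to_A = [dict(zip(B, values)) for values in product(A, repeat=len(B))]

print(len(union), len(cartesian), len(functions_B_to_A))
```

The three counts are 4 + 2, 4 × 2 and 4^2, in agreement with Definition 22.3 restricted to finite cardinals.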
Examples:
|A|×|B| = |A×B| = |{(1, 13), (1, 14), (2, 13), (2, 14), (3, 13), (3, 14)}| = 6
We will formally show that the class, C , of all cardinal numbers is not a set.
The proof mimics the one used to show that the class E is not a set of sets.
Proof:
Suppose C is a set. Let T = ⋃_{κ∈C} κ.
Then T must be a set (by the Axiom of union). This implies P(T) must
be a set (Axiom of power set). Since P(T) is a set, it has a cardinality,
|P(T)| = 2^|T| = λ. So P(T) ∼e λ. But λ ⊂ T. Then P(T) ∼e λ ⊂ T.
So P(T) is equipotent to a subset of T, contradicting the previously
established fact that P(T) is not embedded in T (see Theorem 20.8).
So C cannot be a set.
Concepts review:
1. What does the Continuum hypothesis say? What does the negation
of the Continuum hypothesis say? Which one holds true in ZFC?
2. State the Generalized continuum hypothesis.
3. Define the class of all cardinal numbers.
4. Describe the finite cardinal numbers.
5. What symbol is used to represent the cardinality of the set R?
6. What symbol is used to represent the cardinality of N?
7. Which cardinal numbers are referred to as being transfinite cardinal
numbers?
8. How are the operations of addition, multiplication and exponentia-
tion of cardinal numbers defined?
9. Can the class of all cardinal numbers be referred to as a set? Why?
EXERCISES
Proof:
What we are given: That S1 ∩ T1 = ∅ = S2 ∩ T2 , S1 and S2 are equipotent
and T1 and T2 are equipotent.
What we are required to show: That |S1 ∪ T1| and |S2 ∪ T2| are the same
cardinal number.
Since S1 , S2 and T1 , T2 are equipotent pairs, then there exist one-to-one
onto functions:
f : S1 → S2
g : T1 → T2
By definition of addition, we have
We now verify that addition on C , thus defined, satisfies most of the basic
addition properties.
a) κ + λ = λ + κ (Commutativity of addition)
b) (κ + λ) + φ = κ + (λ + φ) (Associativity of addition)
c) κ ≤ κ + λ
d) κ ≤ λ and φ ≤ ψ ⇒ κ + φ ≤ λ + ψ.
Proof:
a) Let S and T be disjoint sets such that κ = |S| and λ = |T |. To prove
that κ + λ = λ + κ, it suffices to prove that S ∪ T ∼e T ∪ S. This is left
as an exercise.
b) Let S, T and F be disjoint sets such that κ = |S|, λ = |T | and φ =
|F |. To prove that (κ + λ) + φ = κ + (λ + φ), it suffices to show that
(S ∪ T ) ∪ F ∼e S ∪ (T ∪ F ). This is left as an exercise.
c) Let S and T be disjoint sets such that κ = |S| and λ = |T |. Since S and
T are disjoint, we see that S can be mapped one-to-one into the subset
S of S ∪ T . Hence, κ ≤ κ + λ.
d) Let {S, F } and {T, P } be two pairs of disjoint sets such that κ = |S| ≤
λ = |T | and φ = |F | ≤ ψ = |P |. The case where we have equality is
straightforward. We will only prove the case involving the strict inequal-
ity “<”. Assuming κ < λ and φ < ψ,
S ↪e T ↪e T ∪ P ⇒ S ↪e T ∪ P
F ↪e P ↪e T ∪ P ⇒ F ↪e T ∪ P
On canceling out terms in addition. Not all addition properties which hold
true for finite cardinals extend to infinite cardinals. For example, for finite
cardinals m, n, k the statement
(m + n = m + k) ⇒ n = k
|S1 × T1 | = κ × λ = |S2 × T2 |
Proof:
What we are given: That S1 and S2 are equipotent and T1 and T2 are equipo-
tent.
What we are required to show: That |S1 × T1 | and |S2 × T2 | are the same
cardinal number.
Since S1 , S2 and T1 , T2 are equipotent pairs, then there exist one-to-one
onto functions:
f : S1 → S2
g : T1 → T2
By definition of multiplication, we have
We now describe and prove a few of the most basic multiplication properties
on C . We will see that most (but not all) of the multiplication properties
which hold true for the natural numbers extend to infinite cardinal numbers.
a) κ × λ = λ × κ. (Commutativity of multiplication)
d) λ > 0 ⇒ κ ≤ (κ × λ).
e) κ ≤ λ and φ ≤ ψ ⇒ κ × φ ≤ λ × ψ.
f) κ + κ = 2 × κ.
g) κ + κ ≤ κ × κ when κ ≥ 2.
Proof:
a) Let S and T be sets such that κ = |S| and λ = |T |.
What we are required to show: That κ × λ = λ × κ.
To attain this result, it suffices to show that S × T ∼e T × S.
Let h : S × T → T × S be defined as h(s, t) = (t, s). Now
κ × (λ + φ) = |S × (T ∪ F)|
= |(S × T) ∪ (S × F)| (By Theorem 4.7 (b)).
= |S × T| + |S × F| (Since T and F are disjoint ⇒ S × T and S × F are disjoint).
= (κ × λ) + (κ × φ)
κ+κ = |(S × {0}) ∪ (S × {1})| (Since S × {0} and S × {1} are disjoint).
Concepts review:
1. How do we go about showing that addition and multiplication of
cardinal numbers are “well-defined”?
2. Is addition of cardinal numbers commutative? Is it associative?
3. Is multiplication of cardinal numbers commutative? Is it associa-
tive?
4. Does λ + κ = λ + ψ imply κ = ψ? If so why? If not, give an example
showing why not.
EXERCISES
B. 2. Let κ, λ, φ and ψ be any four cardinal numbers. Show the details of the
proofs of the following statements:
a) κ + λ = λ + κ.
b) (κ + λ) + φ = λ + (κ + φ).
c) κ ≤ λ and φ ≤ ψ ⇒ κ + φ ≤ λ + ψ.
3. Let κ, λ, φ and ψ be any four cardinal numbers. Show the details of the
proofs of the following statements:
a) κ × λ = λ × κ.
b) (κ × λ) × φ = λ × (κ × φ).
c) λ > 0 ⇒ κ ≤ (κ × λ).
d) κ ≤ λ and φ ≤ ψ ⇒ κ × φ ≤ λ × ψ.
e) κ + κ ≤ κ × κ when κ ≥ 2.
4. Show that 2^ℵ0 = |R − N|.
5. Show that for any cardinal number κ, κ + κ + κ + κ = 4 × κ.
6. Let n be a finite cardinal number. Prove that:
a) n + ℵ0 = ℵ0 .
b) n × ℵ0 = ℵ0 .
c) n + 2^ℵ0 = 2^ℵ0 .
d) n × 2^ℵ0 = 2^ℵ0 .
e) ℵ0 + 2^ℵ0 = 2^ℵ0 .
f) ℵ0 × 2^ℵ0 = 2^ℵ0 .
There are precisely nine elements in A^B. Or, we can say that the cardinality
|A^B| of A^B is |A|^|B| = 3^2 = 9. So the notation A^B is designed to remind
us of the number of elements contained in such sets when the sets A and B
are finite. For convenience, this notation is maintained for sets of all
cardinalities. In this section we try to develop a few rules that will help simplify
expressions involving exponentiation of infinite cardinals. We will soon see
that cardinal exponentiation is a considerably more complex operation than
the cardinal addition and multiplication operations.
We remind ourselves of the formal definition of cardinal number exponenti-
ation:
If κ and λ are the cardinal numbers of the non-empty sets A and
B we define κ^λ = |A|^|B| = |A^B|. For convenience we define 0^λ = 0
and κ^0 = 1.
Proof:
What we are given: The sets S and S∗ are equipotent as well as the pair T
and T∗.
What we are required to prove: That S^T and (S∗)^(T∗) are equipotent.
Since S ∼e S∗ and T ∼e T∗ there exist one-to-one onto functions α : T → T∗
and β : S → S∗.
If g ∈ S^T define
φ(f, g) = h{f,g}
b) (κ^λ)^φ = κ^(λ×φ):
What we are required to show: That S^(T×U) and (S^T)^U are equipotent.
For each u ∈ U and f ∈ S^(T×U) we define the function f_u : T → S in S^T
as
f_u(t) = f|_{T×{u}}(t, u) ∈ S
Then for each u ∈ U, f_u maps T into S. That is,
{f_u : f ∈ S^(T×U), u ∈ U} ⊆ S^T
g_u(t, u) = [φ(u)](t), ∀t ∈ T
f = ∪{g_u : u ∈ U}
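For finite sets, the correspondence f ↦ (u ↦ f_u) can be checked by enumeration. The sketch below assumes small hypothetical sets S, T and U and represents functions as dicts; `all_functions`, `curry` and `freeze` are names introduced only for this illustration.

```python
from itertools import product

# Hypothetical finite sets.
S, T, U = [0, 1], ["t0", "t1"], ["u0", "u1", "u2"]


def all_functions(dom, cod):
    """Every function dom -> cod, each stored as a dict."""
    return [dict(zip(dom, vals)) for vals in product(cod, repeat=len(dom))]


def curry(f):
    """Send f in S^(T x U) to the map u |-> f_u, where f_u(t) = f(t, u)."""
    return {u: {t: f[(t, u)] for t in T} for u in U}


def freeze(phi):
    """Hashable form of an element of (S^T)^U, used only to count distinct values."""
    return tuple(tuple(phi[u][t] for t in T) for u in U)


domain = all_functions(list(product(T, U)), S)       # S^(T x U)
images = {freeze(curry(f)) for f in domain}

print(len(domain), len(images))
```

Here |S^(T×U)| = 2^6 = 64 and |(S^T)^U| = 4^3 = 64, and the 64 images are pairwise distinct, so on this example the map is one-to-one and onto.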
c) (κ × λ)^φ = κ^φ × λ^φ:
What we are required to show: That S^U × T^U and (S × T)^U are equipotent.
We define the function φ : S^U × T^U → (S × T)^U as follows:
φ(f, g) = h
Then
φ(f1, g1) ≠ φ(f2, g2) ⇔ q ≠ r
⇔ q(u) ≠ r(u), for some u ∈ U
⇔ (f1(u), g1(u)) ≠ (f2(u), g2(u))
⇔ f1(u) ≠ f2(u) or g1(u) ≠ g2(u)
⇔ f1 ≠ f2 or g1 ≠ g2
⇔ (f1, g1) ≠ (f2, g2)
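The chain of equivalences above suggests that h is given by h(u) = (f(u), g(u)). Under that reading (an assumption, since the defining line is not reproduced here), the map φ can be tested on small hypothetical finite sets, with functions stored as dicts.

```python
from itertools import product

# Hypothetical finite sets.
S, T, U = [0, 1, 2], ["x", "y"], ["u0", "u1"]


def all_functions(dom, cod):
    """Every function dom -> cod, each stored as a dict."""
    return [dict(zip(dom, vals)) for vals in product(cod, repeat=len(dom))]


def phi(f, g):
    """phi(f, g) = h with h(u) = (f(u), g(u))."""
    return {u: (f[u], g[u]) for u in U}


def key(h):
    """Tuple of values in the order of U, used only to compare functions."""
    return tuple(h[u] for u in U)


pairs = list(product(all_functions(U, S), all_functions(U, T)))   # S^U x T^U
target = all_functions(U, list(product(S, T)))                    # (S x T)^U
images = {key(phi(f, g)) for f, g in pairs}

print(len(pairs), len(target), len(images))
```

With |S| = 3, |T| = 2 and |U| = 2, both sides have (3 × 2)^2 = 3^2 × 2^2 = 36 elements, and φ hits each element of (S × T)^U exactly once.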
The following example shows how these identities can help simplify the com-
putation of cardinal exponentials.
Find the cardinality of R^R.
Solution:
|R^R| = c^c
= (2^ℵ0)^c
= 2^(ℵ0×c)
c ≤ ℵ0 × c ≤ c × c = c
|R^R| = 2^(ℵ0×c) = 2^c
a) κ ≤ κ^λ.
b) α ≤ κ ⇒ α^λ ≤ κ^λ.
c) α ≤ λ ⇒ κ^α ≤ κ^λ.
Proof:
What we are given: That κ = |K|, α = |A| and λ = |L|.
a) κ ≤ κ^λ:
What we are required to show: That K is embedded in K^L.
Define the function f : K → K^L as follows: f(k) is the unique element of
{k}^L ⊂ K^L. Note that {k}^L contains only one function; it maps all elements
of L to the single element k. Since “k ≠ t implies {k}^L ≠ {t}^L”, the function f is
one-to-one. Since f embeds K in K^L, then κ ≤ κ^λ.
b) α ≤ κ ⇒ α^λ ≤ κ^λ:
What we are also given: That A is embedded in K.
What we are required to show: That A^L is embedded in K^L.
Suppose the function f : A → K embeds A into K. Define φ : AL →
K×L
φ(g) = {(l, f(g(l))) : l ∈ L} ⊆ L × K
We claim that φ(g) ∈ K^L:
If (a, f(g(a))) and (b, f(g(b))) are elements of φ(g) such that
f(g(a)) ≠ f(g(b)), then g(a) ≠ g(b) (since f is a function mapping A
to K). Since (a, g(a)) and (b, g(b)) both belong to g ∈ A^L, then a ≠ b
and so φ(g) is a function in K^L as claimed.
We claim φ : A^L → K^L is one-to-one:
Suppose h, g ∈ A^L.
Concepts review:
1. If κ and λ are two cardinal numbers, how is the expression κ^λ
defined?
2. What are the three basic identities for cardinal exponentiation
stated and proved in this section?
3. What are the three basic inequalities for cardinal exponentiation
stated and proved in this section?
EXERCISES
c) κ^0 = 1.
d) 0^κ = 0, if κ > 0.
B. 2. Show that for any finite cardinal number n and any cardinal number κ:
a) (2^ℵ0)^n = 2^ℵ0
b) ℵ0^n = ℵ0
c) ℵ0^ℵ0 = 2^ℵ0
d) n^ℵ0 = 2^ℵ0.
e) (2^ℵ0)^ℵ0 = 2^ℵ0
3. Let κ be an infinite cardinal number. Suppose |K| = κ and that {Ki : i ∈
K} is a set of pairwise disjoint sets Ki each of which has cardinality κ.
Show that | ∪ {Ki : i ∈ K}| = κ.
Theorem 25.1 Let C denote the set of all complex numbers and J denote
the set of all irrational numbers. Let n denote the cardinality of a non-empty
finite set.
a) The cardinality of R^n is c.
b) The cardinality of C is c.
c) The cardinality of J is c.
Proof:
a) |R^n| = c :
To prove that |R^n| = c, it suffices to show that c^n = c.
We will prove this by mathematical induction.
What we are given: That n is a natural number greater than zero.
What we are required to show: That c^n = c.
Let P(n) be the statement “c^n = c”.
− Base case: Trivially, P(1) holds true (c^1 = c was previously proven).
− Inductive hypothesis: Suppose P(n) holds true. That is, suppose c^n =
c. Then
solid background in mathematics than the one required for the previous sections.
b) |C| = c :
Define the function f : R^2 → C as f(a, b) = a + bi. The function f is
easily shown to be one-to-one and onto C. So R^2 and C are equipotent. It
follows that |R^2| = |C| = c.
c) |J| = c :
Suppose κ = |J|. Since J ∪ Q = R and J ∩ Q = ∅, then
|J ∪ Q| = |J| + |Q|
= |R|
= c
Theorem 25.2
a) Let SR denote the set of all countably infinite sequences of real numbers.
Then the cardinality of SR is c.
b) Let SN denote the set of all countably infinite sequences of natural numbers.
Then the cardinality of SN is c.
c) Let N^N_(1−1) denote the set of all one-to-one functions mapping N to N.
Then the cardinality of N^N_(1−1) is c.
d) Let R^N_(1−1) denote the set of all one-to-one functions mapping N to R.
Then the cardinality of R^N_(1−1) is c.
Part VII: Cardinal numbers 249
Proof:
a) |SR| = c:
A sequence of real numbers {a0 , a1 , a2 , . . .} is a function s : N → R mapping
each natural number i ∈ N to ai ∈ R. So each infinite sequence
{a0 , a1 , a2 , . . .} is associated to a unique function s : N → R. So the set of
all infinite sequences of real numbers can be represented by R^N. Then
|R^N| = |R|^|N| (By Definition 22.3)
= (2^|N|)^|N|
= 2^|N×N|
= 2^|N| = 2^ℵ0 (By Theorem 19.4)
= |2^N| = c
Then |SR| = |R^N| = c.
b) |SN| = c:
A sequence of natural numbers {a0 , a1 , a2 , . . .} is a function f : N → N
mapping each natural number i ∈ N to ai ∈ N. So the set of all infinite
sequences of natural numbers can be represented by N^N. The cardinality
of the set of all infinite sequences of natural numbers is then |N^N|. Note
that
f ∈ N^N ⇒ f ⊆ N × N
⇒ f ∈ P(N × N)
⇒ N^N ⊆ P(N × N)
Then
c = 2^ℵ0
≤ ℵ0^ℵ0 (By Theorem 24.3 (b).)
= |N^N|
≤ |P(N × N)|
= |P(N)| (N × N ∼ N followed by Theorem 20.7.)
= |R| = c
We conclude that |N^N| = |SN| = c.
c) |N^N_(1−1)| = c:
⇒ h ∘ Sf ≠ h ∘ Sg
⇒ H(f) ≠ H(g)
So H is one-to-one as claimed.
Then the function H embeds the set N^N into N^N_(1−1). We conclude that
|N^N| ≤ |N^N_(1−1)|. Since |N^N_(1−1)| ≤ |N^N|, then by the Schröder-Bernstein
theorem |N^N_(1−1)| = |N^N| = c.
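d) |R^N_(1−1)| = c:
|R^N_(1−1)| ≤ |R^N|
= |R|^|N|
= (2^ℵ0)^ℵ0
= 2^(ℵ0×ℵ0)
= 2^ℵ0
= c
Conversely, every one-to-one function mapping N to N is also a one-to-one
function mapping N to R, so c = |N^N_(1−1)| ≤ |R^N_(1−1)| by part c).
So |R^N_(1−1)| = c.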
C0 = [0, 1]
C1 = C0 − (1/3, 2/3) = [0, 1/3] ∪ [2/3, 3/3]
11, 100, 101, 110, 111, . . ., to determine the order in which the finite zero-one sequences are
ordered.
C0 = [0, 1]
C1 = 1 I{0} ∪ 1 I{1} = [0, 1/3] ∪ [2/3, 1]
C2 = 2 I{0,0} ∪ 2 I{0,1} ∪ 2 I{1,0} ∪ 2 I{1,1}
C3 = 3 I{0,0,0} ∪ 3 I{0,0,1} ∪ 3 I{0,1,0} ∪ 3 I{0,1,1} ∪ 3 I{1,0,0} ∪ 3 I{1,0,1} ∪ 3 I{1,1,0} ∪ 3 I{1,1,1}
C4 = 4 I{0,0,0,0} ∪ 4 I{0,0,0,1} ∪ · · · ∪ 4 I{1,1,1,1}
C5 = 5 I{0,0,0,0,0} ∪ 5 I{0,0,0,0,1} ∪ · · · ∪ 5 I{1,1,1,1,1}
. . .
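The levels C0, C1, C2, . . . listed above can be generated mechanically. The following is only a minimal sketch, not part of the original construction: the function name cantor_level and the use of 0-1 tuples as labels are our own illustrative choices, with 0 meaning "keep the left third" and 1 meaning "keep the right third", mirroring the labels {0}, {1,0}, {1,0,1}, . . . used in the text.

```python
from fractions import Fraction


def cantor_level(n):
    """Return the closed intervals making up C_n.

    The result is a dict mapping a label (a tuple of n zeros and ones)
    to a pair (left endpoint, right endpoint). The label records the
    successive left (0) / right (1) choices. Names are illustrative only.
    """
    level = {(): (Fraction(0), Fraction(1))}  # C_0 = [0, 1]
    for _ in range(n):
        next_level = {}
        for label, (a, b) in level.items():
            third = (b - a) / 3
            next_level[label + (0,)] = (a, a + third)  # left closed third
            next_level[label + (1,)] = (b - third, b)  # right closed third
        level = next_level
    return level


if __name__ == "__main__":
    for label, (a, b) in sorted(cantor_level(2).items()):
        print(label, a, b)
```

Exact rational arithmetic (Fraction) is used so that endpoints such as 1/3 and 7/9 are not blurred by floating-point rounding. Each level has twice as many intervals as the previous one, each one third as long.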
For example, one nested set of closed intervals in C100 would be of the form,
∩_{m=0}^{100} m IA1 = [0, 1] ∩ [0, 1/3] ∩ [0, 1/3^2] ∩ [0, 1/3^3] ∩ · · · ∩ [0, 1/3^100]
∩_{n=0}^{∞} n Is(n) = ∞ Is
In the following proposition, we will show that the Cantor set, C, is (quite
surprisingly!) an uncountably infinite set. This is in spite of the large number
of points removed from [0, 1] to construct it. Different authors may provide
different ways of proving that C is uncountable. We provide a proof that
has a set-theoretic flavor to it.
Proof:
Since the Cantor set C is a subset of [0, 1], then |C| ≤ c. The cardinality
of {0, 1}^N is known to be c (see Theorem 20.12). We will show that
3 This statement is referred to as the Nested interval lemma. This lemma is proven in
∩_{n=0}^{∞} n Is(n)
is non-empty for the chosen s ∈ {0, 1}^N. For each s, we can then choose an
element xs in ∩_{n=0}^{∞} n Is(n) . See that, since xs ∈ n Is(n) ⊂ Cn for all n, then
xs ∈ ∩_{n=0}^{∞} Cn = C. We define the function f : {0, 1}^N → C mapping {0, 1}^N
into C, as
f(s) = xs
We claim that f is one-to-one: Suppose s and t are distinct elements of
{0, 1}^N. Let n be the least natural number such that s(n) ≠ t(n). Then the
two closed intervals
n Is(n) and n It(n)
in Cn have empty intersection. Then the intersection of the two sets of nested
closed intervals,
(∩_{n=0}^{∞} n Is(n)) ∩ (∩_{n=0}^{∞} n It(n)),
must be empty. So xs and xt cannot be the same element. This shows that
f is one-to-one, as claimed. Then f embeds {0, 1}^N into C. Hence,
5 For example, suppose s = {1, 0, 1, 1, 1, . . .}. Then we choose
2 Is(2) = 2 I{1,0} in C2 , we choose 3 Is(3) = 3 I{1,0,1} in C3 , 4 Is(4) = 4 I{1,0,1,1} in C4 , and so on.
There are still uncountably many points that are left behind. One may
expect that C is simply the set of all endpoints that appear in all Cn ’s. But
this can’t be, since if we take the union of all endpoints ∪_{n∈N} En we obtain
only a countably infinite set, while C has been proven to be uncountable.
The Cantor set must then contain uncountably many numbers which are not
endpoints! Skeptical readers may want to look at the proof again to see if
there is any sleight of hand. Even if one believes the given proof, it does not
mean that it will necessarily be what we might call “a satisfying proof”. We
cannot actually see what is going on at the very high levels of n. The proof
doesn’t help us understand why the “non-endpoints” in C are not excluded
in the construction process.
Identifying numbers in C which are non-endpoints. The following arguments
show why some “non-endpoints” of C remain in the infinite intersection of
the sets, Cn , which are used to construct the Cantor set C. Consider the
sequence of numbers, {Sn : n ∈ N}, where
Sn = Σ_{k=0}^{n} (−1/3)^k
So, even though 3/4 is not an endpoint of one of the subintervals which form
each level Cn it belongs to the Cantor set. Other such points can be found
in C in this way.
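The claim about 3/4 can be checked level by level. The sketch below is illustrative only; the helper names in_cantor_level and partial_sum are our own. It relies on the observation that x belongs to Cn exactly when x lies in the left or right closed third of [0, 1] and the rescaled point (3x or 3x − 2) belongs to Cn−1.

```python
from fractions import Fraction


def in_cantor_level(x, n):
    """Return True if the rational x belongs to the level C_n.

    At each step x must lie in [0, 1/3] or [2/3, 1]; the surviving
    third is then stretched back onto [0, 1]. Illustrative helper only.
    """
    if not (0 <= x <= 1):
        return False
    for _ in range(n):
        if x <= Fraction(1, 3):
            x = 3 * x          # left third, rescaled to [0, 1]
        elif x >= Fraction(2, 3):
            x = 3 * x - 2      # right third, rescaled to [0, 1]
        else:
            return False       # x fell in a removed open middle third
    return True


def partial_sum(n):
    """S_n = sum of (-1/3)^k for k = 0, ..., n, as an exact fraction."""
    return sum(Fraction(-1, 3) ** k for k in range(n + 1))


if __name__ == "__main__":
    print(in_cantor_level(Fraction(3, 4), 200))
    print([partial_sum(n) for n in range(5)])
```

Every endpoint of a level Cn is a fraction whose denominator is a power of 3, so 3/4 cannot be an endpoint. Under the rescaling above, 3/4 is sent to 1/4 and then back to 3/4, which is why it is never removed.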
Ac = {S ⊆ R : |S| = |R| = c}
c = |{x} × R| ≤ |U × R| ⇒ |U × R| ≥ c ∀ U ⊆ R
Since
|{U × R : U ∈ P(R) − {∅}}| = 2c
then
Concepts review:
1. What is the cardinality of R^n for any natural number n?
2. Do the real numbers and the complex numbers have the same car-
dinality?
3. How does the cardinality of the set of all countably infinite sequences
of real numbers compare with the cardinality of R^R?
4. What is the cardinality of the set of all irrational numbers?
5. What is the cardinality of the set of all countably infinite sequences
of natural numbers?
6. What is the cardinality of the set of all countably infinite sequences
of real numbers?
7. Let N^N denote the set of all functions mapping N into N and N^N_(1−1)
denote the set of one-to-one functions mapping N into N. Are the
sets N^N and N^N_(1−1) equipotent? What is their cardinality?
8. What is the Cantor set? How is it constructed? What is its cardi-
nality?
9. What is the cardinality of the set of all continuous real-valued func-
tions?
EXERCISES
A. 1. Show that for any finite cardinal n, n × 2^(2^ℵ0) = 2^(2^ℵ0).
6 We remind the reader of the definition of “a least upper bound of an ordered set S”:
The element m is a least upper bound of a set S if m is an upper bound of S and for any
other upper bound n, m ≤ n.
Part VIII
Ordinal numbers
26.1 Overview.
In the last few sections, we have familiarized ourselves with some of the main
properties of infinite sets. We have seen that the ZFC-axioms have cleared
a path into unfamiliar mathematical territory, populated by uncountably
many “infinite sets” in infinite varieties, leading us to reflect on numerous
counterintuitive notions. We have discovered, for example, that given any
infinite set A we can find another infinite set B = P(A), not equipotent
to A, which properly contains a one-to-one copy of A. To express this relationship,
we said that A is “properly embedded” in B and wrote A ↪e B.
We can thus construct infinite chains of sets linearly ordered by the proper
embedding relation ↪e. For example:
0 ↪e 1 ↪e 2 ↪e · · · ↪e N ↪e P(N) ↪e P(P(N)) ↪e P(P(P(N))) ↪e · · ·
This chain of sets ordered by ↪e begins with the empty set 0 = { }. This
set is followed by an infinite number of finite sets called the “natural numbers”.
Once we attain the first infinite set N, an endless sequence of infinite
sets can be constructed by successively taking power sets. Note that,
apart from 1 = P(0) and 2 = P(1), no natural number is constructed by taking
the power set of its immediate
predecessor. So the method for constructing each natural number from its
predecessor is different from the method used to construct each new infinite
set. In fact, it is an axiom that allows N to exist. Another axiom allows us
to say that if S is a set, then its power set is also a set. Of course, various
chains of sets can be constructed in this way, each depending on the
choice of the first set. If we started with the set of all real numbers, R, we
then obtained what initially appeared to be a different chain of infinite sets,
R ↪e P(R) ↪e P(P(R)) ↪e · · · . It was then determined that R and P(N)
are in fact equipotent, and so the displayed chain containing power sets of
N contains copies of the R-related power sets.
262 Section 26: More on well-ordered sets
We thought it would be practical to partition the class of all sets into sub-
classes of mutually equipotent sets. These subclasses were seen to be equiva-
lence classes induced by the equipotence relation ∼e . We defined the notion
of ∼e -equivalence class representatives called cardinal numbers. A cardinal
number was declared to be a set which represents all sets which are equipo-
tent to it. We had to postulate the existence of the cardinal numbers with
the promise that once we have developed the required set-theoretic tools,
the cardinal numbers would be appropriately defined or constructed.
We have seen that the set of all natural numbers has been extremely useful
in determining various properties of countably infinite sets. A critically im-
portant tool in our study was the principle of mathematical induction over
N. Since any countably infinite set is a one-to-one image of N, this means
that the elements of such sets can be indexed by the elements of N. Indexing
countable sets in this way allows us to linearly order these sets. We can then
apply the principle of mathematical induction to determine some of their
properties. When working with uncountable sets, we do not yet have access
to uncountable well-ordered sets whose elements can be used to index such
sets. We will soon see that ZFC provides the necessary ingredients to con-
struct “universal indexing sets”. 1
1 These sets will be called ordinals (soon to be defined). Cardinal numbers will be defined
as being those ordinals which satisfy a specific property (at Definition 29.7). Until we for-
mally define “cardinal numbers”, we will refrain from referring to the notion of “cardinality
of a set” in the process that leads to this definition.
We now verify that every pair of elements in S are comparable under <S .
If sn , sm ∈ S, then n and m are the unique corresponding elements in T .
Then n <T m or m <T n. Hence, either sn <S sm or sm <S sn . Hence, all
pairs of elements of S are <S -comparable and so S is <S -linearly ordered.
− The set S is <S -well ordered: Suppose A = {si : i ∈ U ⊆ T } is a
non-empty subset of S. Then U is a non-empty subset of T . Since T is
well-ordered, U has a least element, say k. Since k ≤T i for all i ∈ U , then
sk ≤S si for all si ∈ A. Thus, A contains a least element.
This proves that the relation, <S , induced on S by T is a well-ordering.
We now show that every non-empty countable set can be well-ordered. Let S
be a countably infinite set. Then there exists a function, f : N → S, mapping
N one-to-one onto S. Since N is well-ordered, then S has a well-ordering.
If S is finite and non-empty, then it is the one-to-one image of some natural
number n (18.7). Since every natural number n is ∈-well-ordered (14.4), S
inherits this well-ordering from n as described above.
a) The set of all even natural numbers, Ne , with the ordering inherited
from (N, ⊂) is a well-ordered set since every pair of even numbers is
comparable and every non-empty subset of even numbers contains a least even
number.
b) Every natural number, n, is a well-ordered set. For example, 5 =
{0, 1, 2, 3, 4} is ∈-linear (or ⊂-linear) and every non-empty subset of 5 contains
a least element.
c) The set of all countably infinite sequences of natural numbers, N^N (an
uncountable set with cardinality c), equipped with the lexicographic
ordering2 has been shown to be a set which is linearly ordered, but
not well-ordered, since it contains non-empty subsets with no least element. For
example, suppose that for each i ∈ N, xi = {aj : j ∈ N} where aj = 1 if
j = i and aj = 0 otherwise. Then for each i ∈ N, xi ∈ N^N. The subset
S = {xi : i ∈ N} of N^N does not contain a least element, since xi+1 < xi
for every i ∈ N (and the element (0, 0, 0, . . .), which lies below every xi ,
does not belong to S).
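The behaviour of the sequences xi in example c) can be observed on finite truncations. This is only a sketch under a stated assumption: we keep the first few terms of each xi, which is enough to compare xi with xj whenever i and j are both smaller than the truncation length. Python compares tuples lexicographically, which is the ordering used here; the helper names are our own.

```python
def x(i, length):
    """First `length` terms of x_i: a 1 in position i and 0 elsewhere."""
    return tuple(1 if j == i else 0 for j in range(length))


def is_strictly_decreasing(seqs):
    """True if each sequence is lexicographically below the one before it."""
    return all(later < earlier for earlier, later in zip(seqs, seqs[1:]))


if __name__ == "__main__":
    window = [x(i, 6) for i in range(6)]
    for seq in window:
        print(seq)
    print(is_strictly_decreasing(window))
```

Within any such window the listed sequences decrease strictly, so none of them can serve as a least element of S; the all-zero sequence sits below all of them but is not of the form xi.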
2 See the definition of lexicographic ordering on page 134.
{(0, 0), (0, 1), (0, 2), . . . , (1, 0), (1, 1), . . . , (2, 0), (2, 1), (2, 2), (2, 3), . . .}
can be used instead of < without altering the meaning of “initial segment”.
Proof:
What we are given: That (S, ≤) is a well-ordered set; T is a proper subset of
S satisfying the property “∀t ∈ T, [x < t] ⇒ [x ∈ T ]”.
What we are required to show: That T = Sa = {x ∈ S : x < a} for some
a ∈ S.
Since T is a proper subset of S, then S − T is non-empty. So S − T must
contain its least element, say a (since S is well-ordered).
Claim Sa ⊆ T : Since a is the least element of S − T , x < a ⇒ x ∉ S − T ⇒
x ∈ T . So Sa ⊆ T , as claimed.
Claim T ⊆ Sa : If x ∉ Sa , then x ≥ a. Then the element x cannot belong to
T , for if x ∈ T , a ≤ x would imply that a ∈ T (by definition of the set T );
since a ∈ S − T , we would obtain a contradiction. So u ∈ T ⇒ u < a. That
is, T ⊆ Sa as claimed.
So the initial segment, T , of the well-ordered set, S, is the set Sa = {x ∈ S :
x < a} where a is the least element in S − T , as required.
Given the initial segment Sa , we will refer to a as the leader of the initial
segment. The leader, a, of the initial segment, Sa , is not an element of the
initial segment. It is also important to remember that, by definition, a well-
ordered set S is not an initial segment of itself. We provide a few examples
of sets which are initial segments and sets which are not:
a) Note that every natural number n in N is an initial segment of N. For
example, 5 = {0, 1, 2, 3, 4} = {n ∈ N : n < 5} = S5 is an initial segment
of N.
4 = S4 = {0, 1, 2, 3}
3 = S3 = {0, 1, 2}
2 = S2 = {0, 1}
1 = S1 = {0}
b) Even though the set Ne of all even natural numbers is a proper subset
of N, it is not an initial segment of N since 26 ∈ Ne and 17 < 26 but
17 ∉ Ne . However, S26 = {n ∈ Ne : n < 26} is an initial segment of the
well-ordered set (Ne , ⊂).
c) The subset S = {0, 2, 3, 4, 5, . . .} in N is not an initial segment of N since
3 ∈ S and 1 < 3, but 1 does not belong to S.
d) Consider the set, N^{0,1,2} = {{a0 , a1 , a2 } : ai ∈ N}, of all functions
mapping {0, 1, 2} into N, ordered lexicographically.4 This set is easily
verified to be well-ordered.5 The set S{0,1,0} = {{0, 0, i} : i ∈ N} is an
initial segment of N^{0,1,2}. It is the set of all elements in N^{0,1,2} which
are strictly less than {0, 1, 0}.
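The examples above can be tested on finite stand-ins. The following sketch makes two assumptions that are not in the text: a finite window such as {0, 1, . . . , 29} replaces N, and triples are written as Python tuples, which Python compares lexicographically. The function names is_initial_segment and leader are illustrative.

```python
from itertools import product


def is_initial_segment(T, S):
    """True if T is a proper subset of the finite linearly ordered set S
    and every element of S below some element of T also belongs to T."""
    T, S = set(T), set(S)
    if not T < S:  # T must be a proper subset of S
        return False
    return all(x in T for t in T for x in S if x < t)


def leader(T, S):
    """The least element of S - T, that is, the a with T = S_a."""
    return min(set(S) - set(T))


if __name__ == "__main__":
    window = set(range(30))                      # finite stand-in for N
    print(is_initial_segment(range(5), window), leader(range(5), window))
    triples = set(product(range(4), repeat=3))   # stand-in for N^{0,1,2}
    T = {(0, 0, i) for i in range(4)}
    print(is_initial_segment(T, triples), leader(T, triples))
```

The check for the even numbers fails inside the window for the same reason as in example b): 17 lies below 26 but is not even.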
Initial segments of well-ordered sets are well-ordered. If Sa is an initial segment
of a <-well-ordered set S, then Sa can inherit the order relation “<”
from S so that it can itself be viewed as a <-well-ordered set.
(x ≤S y) ⇒ (f(x) ≤T f(y))
− Let (Ne , ≤) denote the even natural numbers equipped with the standard
natural number ordering ≤. Since the function f : N → Ne defined as
f(n) = 2n is one-to-one, onto Ne and strictly increasing, it maps N order
isomorphically onto Ne .
− On the other hand, the function g : (N, ≤) → (N, ≤) defined as g(n) =
n + (−1)^n is one-to-one and onto (N, ≤) but is not an order isomorphism.
If g(n) = an , witness a0 = 1, a1 = 0, a2 = 3, a3 = 2, . . .. We see that g
does not respect the order of the elements.
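The two functions in these examples can be compared on a finite window of N. The sketch below is illustrative only (the name is_order_preserving is ours). On a linearly ordered list it is enough to check consecutive elements, since "x < y implies f(x) < f(y)" then follows by transitivity.

```python
def is_order_preserving(func, domain):
    """Check that x < y implies func(x) < func(y) on a finite domain."""
    points = sorted(domain)
    return all(func(a) < func(b) for a, b in zip(points, points[1:]))


def f(n):
    return 2 * n


def g(n):
    return n + (-1) ** n


if __name__ == "__main__":
    window = range(100)
    print(is_order_preserving(f, window))
    print(is_order_preserving(g, window))
    print([g(n) for n in range(6)])
```

On the window {0, . . . , 99} the function g simply swaps each even number with its successor, so it is one-to-one and onto the window while reversing the order inside each pair.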
S(0,1,0) = {(0, 0, i) : i ∈ N}
(N, <∗ ) = {0, 2, 4, 6, . . ., 1, 3, 5, 7, . . .}
Then the function f : N → (N, <∗ ) defined as f(n) = 2n, maps N order-
isomorphically onto the initial segment {0, 2, 4, 6, . . ., } of (N, <∗ ).
6 An order isomorphism from an ordered set onto itself is called an order automorphism.
Here we are stating that the only automorphism is the identity function.
Proof:
a) What we are given: That f : S → T is an order isomorphism mapping
the well-ordered set (S, ≤S ) onto (T, ≤T ).
What we are required to show: That f^−1 : T → S must also be an order
isomorphism:
To see this, let u, v be elements in T such that u <T v. Since f is
one-to-one and onto, there exist distinct elements a = f^−1(u) and
b = f^−1(v) in S. Since S is well-ordered, it is linear and so all elements
in S are comparable. So either f^−1(u) = a <S b = f^−1(v) or
f^−1(v) = b <S a = f^−1(u). If b <S a, then, since f is order preserving,
f(b) = v <T u = f(a), a contradiction. So a = f^−1(u) <S f^−1(v) = b.
So f^−1 : T → S must also be an order isomorphism.
b) What we are given: That (S, ≤) is well-ordered, that f : S → S, and
that x < y implies f(x) < f(y) (that is, f is strictly increasing).
What we are required to prove: That x ≤ f(x) for all x. That is, f
cannot map an element x “below itself”.
Suppose there exists an element x of S such that f(x) < x. Then, the
set
T = {x ∈ S : f(x) < x}
is non-empty. We claim that this will lead to a contradiction:
− Since S is well-ordered, T must contain a least element, say a. Since
a ∈ T , f(a) < a.
− Since f is strictly increasing,
f^−1 ∘ g : (S, ≤S ) → (S, ≤S )
We highlight some important points that are made in the above statements.
Firstly, if f : S → T is an order isomorphism between well-ordered sets S
and T , then there can be no other one. This is important, since it points to a
crucial difference between equipotent sets and order isomorphic sets. There
can be many different functions which map a set S one-to-one onto a set
T . But if (S, <S ) and (T, <T ) are known to be order isomorphic, then only
one order isomorphism can bear witness to this fact.7 We might say that
an order-isomorphism is “sensitive” to the structure of a well-ordered set,
while equipotence is not. For example, the equipotence relation perceives
({0, 1} × N, <lex ) simply as a countable set allowing for many ways of map-
ping N one-to-one and onto this set, while an order isomorphism is sensitive
to the fact that this set is made of two copies of N lined up one after the
7 Note that even if the order isomorphism between two initial segments is unique, it is
still entirely possible for an initial segment to be mapped order-isomorphically onto another
subset of a well-ordered set. For example, the initial segment {0, 1, 2, 3} can be mapped
order-isomorphically to the non-initial-segment A = {11, 12, 13, 14}. But note that A is not
an initial segment of N.
Notation 26.6 Let S and T be two well-ordered sets. Then the expression
S ∼WO T
S <WO T
S ≤WO T
If
W = {S ∈ S : S is well-ordered}
denotes the class of all well-ordered sets, verify that the relation ∼WO is
reflexive, symmetric and transitive on W and so is an equivalence relation.
See that ≤WO is also reflexive and transitive on W . The relation ≤WO is
not antisymmetric on W in the usual sense, since S ≤WO T and T ≤WO S
implies S ∼WO T , not S = T .8 But ≤WO can always be used as a ranking
tool for the elements of W . We will now show that any two well-ordered
sets are ≤WO -comparable. That is, given any two well-ordered sets, S and
T , either S ≤WO T or T ≤WO S. The reader should carefully note how the
“well-ordered properties” are used in various parts of the proof.
8 However, if W∗ = {[S]WO : S ∈ W } denotes the class of all equivalence classes induced
by the equivalence relation, ∼WO , on W , then the statement in the theorem will allow us
to conclude that ≤WO induces a linear ordering on W ∗ .
Theorem 26.7 Let (S, ≤S ) and (T, ≤T ) be two well-ordered sets. Then ei-
ther S ≤WO T or T ≤WO S.
Proof:
What we are given: Two well-ordered sets (S, ≤S ) and (T, ≤T ). The expres-
sion Sa represents the initial segment {x ∈ S : x < a} whose leader is a.
We are required to show that: S ≤WO T or T ≤WO S
The symbol Sa ∼WO Tb is to be interpreted as “the initial segment Sa of S
is order isomorphic to the initial segment Tb of T ”.
We define the function f : S → T as follows: f(a) = b if and only if
Sa ∼WO Tb . We will carefully examine this function and describe its proper-
ties.
− We verify that f is well-defined : If f(a) = b and f(a) = c, then Sa ∼WO Tb
and Sa ∼WO Tc . This implies Tb ∼WO Tc . Two initial segments of the same
well-ordered set are order isomorphic if and only if they are equal. Then
b = c.
− We verify that the domain of f is non-empty: Suppose 0S and 0T denote
the least elements of S and T respectively. If 1S and 1T denote the least
elements in S − {0S } and T − {0T }, respectively, then S_{1S} ∼WO T_{1T}. Then
f(1S ) = 1T so the domain of f contains at least the element 1S . Let D
denote the domain of f.
− We verify that f is strictly increasing: Suppose a and b are in the domain
D of f such that a <S b. If f(a) = c and f(b) = d, then Sa ∼WO Tc and
Sb ∼WO Td . Then
Tc ∼WO Sa and Sa ⊂ Sb and Sb ∼WO Td
So Tc is order isomorphic to an initial segment of Td . This implies
Tc ⊂ Td ⇒ c < d. So f is strictly increasing on D.
If D = S, then f maps S order isomorphically into T (since f has been
shown to be strictly increasing on D). It follows that D = S ≤WO T , and we
are done. So let us suppose that D ≠ S.
− We claim that the domain D is an initial segment of S: If u ∈ D, then
Su ∼WO Tk for some k ∈ T . That is, there exists an order isomorphism
g : Su → Tk . If x <S u, then Sx ⊂ Su and g|Sx : Sx → Tk maps Sx onto
an initial segment, say Tt , in Tk . Then Sx ∼WO Tt implies f(x) = t. Then
x ∈ D. So D is an initial segment of S as claimed.
− We claim that if f[D] ≠ T , then f[D] is an initial segment of T : Let
v ∈ f[D]. Then there exists an element a ∈ D such that f(a) = v. This
implies that Sa ∼WO Tv . Let u < v in T . Then Tu ⊂ Tv . Since Sa ∼WO Tv ,
then Tu is order isomorphic to an initial segment Sb ⊂ Sa , for some b ∈ D.
Then f(b) = u. So u ∈ f[D]. Since f[D] ≠ T , by definition, f[D] is an
initial segment of T as claimed.
The above theorem states that if we gather all well-ordered sets together to
form a class of sets, we can rank them with the relation ≤WO . Note that in
this class, distinct well-ordered sets may well be equal or equipotent sets. For
example, the set (N∗ , <∗) = {0, 2, 4, . . ., 1, 3, 5, 7, . . . , } of all natural num-
bers where the even numbers are first enumerated in the usual order followed
by all odd numbers enumerated in the usual way, is simply another way of
describing the set N. Nevertheless, N <WO N∗ . Even though N and N∗ are
the same set, they are not order isomorphic. On the other hand, we easily see
that N∗ and the lexicographically ordered set {0, 1}×N are order isomorphic.
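The ordering <∗ and its relationship to {0, 1} × N can be made concrete with a sort key. In this sketch (the names star_key and star_less are hypothetical), each natural number n is sent to the pair (n mod 2, n div 2), so that even numbers land in the first copy of N and odd numbers in the second. Python compares pairs lexicographically, which matches the ordering on {0, 1} × N.

```python
def star_key(n):
    """Send n to its position in {0, 1} x N: evens first, then odds."""
    return (n % 2, n // 2)


def star_less(m, n):
    """True if m precedes n in (N*, <*)."""
    return star_key(m) < star_key(n)


if __name__ == "__main__":
    print(sorted(range(10), key=star_key))
    print(star_less(1000, 1), star_less(1, 1000))
```

The map star_key is one-to-one and onto {0, 1} × N and, by construction, carries <∗ to the lexicographic ordering. It also shows why N and N∗ cannot be order isomorphic: in N∗ the element 1 has infinitely many predecessors (every even number), while each element of N has only finitely many.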
It is also interesting to note that every well-ordered set is an initial segment
of another well-ordered set. Indeed, if S is well-ordered by ≤, then S is order
isomorphic to the initial segment {1} × S of the lexicographically ordered
set {1, 2} × S. In relation to such lexicographically ordered sets, we present
the following more general result.
Proof:
Let A be a non-empty subset of S. Let u be the least element of the
set {r ∈ {1, 2, . . ., n} : (r, t) ∈ A for some t ∈ N}. Since every natural number is
well-ordered (14.4), such a number u exists. Let v be the least number in
{t ∈ N : (u, t) ∈ A}. Since N is well-ordered (14.3), such a number v exists.
Then (u, v) ∈ A and (u, v) ≤ (i, j) for all (i, j) ∈ A. Hence, every non-empty subset A
of S has a least element. So S is <-well-ordered.
Concepts review:
1. What is a well-ordered set?
2. What is an initial segment of a well-ordered set?
3. Is a well-ordered set an initial segment of itself?
4. Give three examples of well-ordered sets.
5. Is the lexicographic ordering of N × N a well-ordering? Why or why
not?
6. List all initial segments of the natural number 7.
7. Give an infinite initial segment of N^{0,1,2}.
8. What is an order isomorphism between two well-ordered sets?
9. Can a well-ordered set be order isomorphic to one of its initial seg-
ments?
10. If f : S → T where (S, ≤S ) and (T, ≤T ) are linearly ordered sets,
what does it mean to say that f is strictly increasing?
11. If a well-ordered set S is order isomorphic to an initial segment of
a well-ordered set T , can S and T be order isomorphic?
12. What can we say about two well-ordered sets S and T in reference
to order isomorphism?
13. How many order isomorphisms are there between an initial segment
and itself?
14. If S and T are order isomorphic sets and f and g are two order
isomorphisms mapping S onto T , what can we say about f and g?
EXERCISES
A. 1. Is the set of all prime numbers ordered in the usual way a well-ordered set?
2. List the first three initial segments of the set of all prime numbers ordered
in the usual way. Are initial segments of prime numbers initial segments of
N?
3. Let S = {0} ∪ {1/(n+1) : n ∈ N} be ordered by < in the usual way. Is the set
S a well-ordered set? Justify.
27.1 Introduction.
Our study of infinite sets began with a declaration of what it means for a set
to be infinite. We stated that only a set S “which can be mapped one-to-one
onto a proper subset of itself” is referred to as being “infinite”. All other
sets are said to be “finite”. Then we discovered that infinite sets could be
subdivided into two categories: Those that are one-to-one images of N −
referred to as “countably infinite” sets − and those that are not − referred
to as “uncountably infinite”. Then, we discovered that the class of uncount-
ably infinite sets actually has a more complicated structure. We found that
not all uncountable sets were pairwise equipotent. We were led to this con-
clusion when we proved that no infinite set S could be mapped one-to-one
onto its power set P(S). This implied that we could partition the class of all
sets into infinitely many subclasses of sets each containing sets which were
pairwise equipotent sets. Up to now, our attention has mainly been centered
on investigating the properties of those sets which belong to the class of all
countably infinite sets and the class of all sets which are equipotent to R
(since the sets N and R are the two sets we are the most familiar with).
Within the class of all sets, we investigated the subclass of sets known to
be equipped with a “well-ordering” binary relation. When equipped with a
well-ordering, these are called well-ordered sets. The ones we have exhibited
up to now are all countable. In this section we will show how to construct
new well-ordered sets from old ones. We saw that order isomorphisms allow
us to partition further the class of well-ordered sets. For example, the class of
all well-ordered sets can be partitioned into subclasses of pairwise order isomorphic
sets. Recall that an “order isomorphism” between two well-ordered
sets S and T is a one-to-one function from S onto T which respects the order of the elements
in the domain S and the image T of S. That is, the order of the elements
of the domain and the image is preserved by the one-to-one function. For
278 Section 27: Ordinals: definition and properties
We were able to show that all pairs of well-ordered sets are ≤WO -comparable.
This is in striking contrast with our first attempts at grasping the structure
of the class of all sets. The reader will recall that we were not clear on how
to prove that “↪e∼” linearly orders the class of all sets, even though we
strongly suspect that we will eventually be able to show that this is the case.
Our ultimate objective in this section (and the one that follows) will be to
construct a “well-ordered class of well-ordered sets” which contains an order
isomorphic copy of every well-ordered set. We will see that ZFC provides
us with the tools to construct a class of sets which serves this purpose. The
elements of this class of sets will be called ordinals.
(y ∈ S) ⇒ (y ⊂ S)
x ∈ y and y ∈ S ⇒ x ∈ S
∅ = 0
∅+ = 0 ∪ {0} = {0} = 1
1+ = 1 ∪ {1} = {0, 1} = 2
2+ = 2 ∪ {2} = {0, 1, 2} = 3
3+ = 3 ∪ {3} = {0, 1, 2, 3} = 4
..
.
n+ = n ∪ {n} = {0, 1, 2, . . ., n} = n + 1
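The successor construction displayed above can be carried out literally for small n. This is a minimal sketch, assuming Python frozensets as a stand-in for hereditarily finite sets (frozensets are used because ordinary Python sets cannot be elements of other sets); the names successor, natural and is_transitive are ours.

```python
def successor(n):
    """Return n+ = n ∪ {n}."""
    return n | frozenset([n])


def natural(k):
    """Build the natural number k by applying the successor k times to ∅."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n


def is_transitive(s):
    """True if every element of s is also a subset of s."""
    return all(y <= s for y in s)


if __name__ == "__main__":
    three = natural(3)
    print(len(three), natural(2) in three, is_transitive(three))
```

Each natural(k) has exactly k elements, namely natural(0), . . . , natural(k − 1), and passes the transitivity test. A set such as {1}, by contrast, fails it, since its only element 1 = {0} is not a subset of {1}.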
The set N as well as each of its elements are ordinal numbers. Since the set
N of all natural numbers, as well as each natural number, have been shown
(Theorems 13.8, 13.9, 14.4 and 14.3) to be strictly ∈-well-ordered transitive
sets, the class of all ordinals contains infinitely many finite ordinals and at
least one infinite ordinal, namely N. We will continue to (generically) repre-
sent finite ordinal numbers by the usual lower case letters such as m or n,
but infinite ordinal numbers will be represented by lower-case Greek letters,
such as ω, α and β.
ω = {0, 1, 2, 3, . . . , }
the context: When simply viewed as a set, we commonly use N, when viewed as a cardinal
number, we commonly use ℵ0, when viewed as an ordinal number we commonly use ω.
Later, the symbol, ω0 , will be used instead of ω (to specify that it is the smallest of all
infinite ordinals).
The reader should note that, by definition, only “sets” can be ordinals. That
is, a strictly ∈-well-ordered proper class is not an ordinal.
ω+ = ω ∪ {ω}
is the union of two sets and so is itself a set. Note that ω+ ≠ ω, for if it were,
then ω ∈ ω, a contradiction. So, from ω, we have generated a new set ω+ .
This set is represented as, ω+ = ω + 1. If we repeat the procedure again
starting with ω + 1, we obtain
ω + 2 = (ω + 1)+ = (ω + 1) ∪ {ω + 1}
Proof:
What we are given: That α is an ordinal number (i.e., α is a transitive set
and strictly ∈-well-ordered).
What we are required to prove: That α+ is an ordinal number.
The class α+ is a set : Note that since α is an ordinal α must be a set.
Hence, by Axiom 3 (Axiom of pair), {α} is a set. By Axiom 6 (Axiom of
union), α+ = α ∪ {α} is a set, as claimed.
The set α+ is transitive: Suppose x ∈ α+ = α ∪ {α}. By definition of “transitive”,
it suffices to show that x ⊂ α+ . If x = α, then x ⊂ α ∪ {α} = α+
and we are done. Suppose x ≠ α; then x ∈ α. Since α is transitive x ⊂ α
and so x ⊂ α+ . So α+ is transitive. It follows that when viewed as a relation
on α+ , ∈ is a transitive relation.
The elements of the set α+ are ∈-comparable : Let x and y be distinct ele-
ments in α+ = α ∪ {α}.
Case 1: If x = α, then y ∈ x (since x ≠ y). Then x and y are ∈-comparable.
Case 2: If both x, y ∈ α, then, since α is known to be ∈-linearly ordered,
either x ∈ y or y ∈ x. So all pairs of elements in α+ are ∈-comparable.
It follows that the relation “∈” linearly orders α+ .
The relation ∈ is a strict linear ordering of α+ : Since ∈ strictly orders α,
x ∉ x for all x ∈ α. Also α ∉ α, for if α = x ∈ α, then x ∈ x contradicting
the fact that ∈ strictly orders α.
The set α+ is ∈-well-ordered : Let S be a non-empty subset of α+ . Let
T = S ∩ α.
Case 1: If T = ∅, then S = {α}. Since α ∉ α, α must be the least (actually
the only) element of S.
Case 2: Suppose T ≠ ∅. Since α is ∈-well-ordered, there exists an m ∈ T
which is the ∈-least element of T . If S = T , then m is the ∈-least element of
S, as required. If, on the other hand, S = T ∪ {α}, since α is the maximal
element in α+ , m < α. Again m is the ∈-least element of S.
We conclude that α+ is strictly ∈-well-ordered. So α+ is an ordinal number.
ω = {0, 1, 2, 3, . . . , }
ω+1 = ω+ = ω ∪ {ω} = {0, 1, 2, . . . , } ∪ {ω} = {0, 1, 2, . . . , ω}
ω+2 = (ω + 1) ∪ {ω + 1} = {0, 1, 2, . . . , ω} ∪ {ω + 1} = {0, 1, 2, . . . , ω, ω + 1}
We see that this method for constructing ordinals has a limited range. Are
there any other transitive “∈-well-ordered sets” beyond the set {ω, ω +
1, ω + 2, ω + 3, . . .} of ordinals? Our experience with ordinals tells us that
there can be. Recall that having defined all finite ordinals (natural numbers)
0, 1, 2, 3, . . ., we gathered together all natural numbers to form a new set,
N = ω = {0, 1, 2, 3, . . .}. We then explicitly proved that this new infinite
set, ω, is itself an ordinal. This illustrates that the “immediate successor
constructing algorithm” is not the only way to construct ordinals. Consider,
for example, the set
ω + ω = {0, 1, 2, 3, . . ., ω, ω + 1, ω + 2, ω + 3, . . . , ω + n, . . . , }
∀u ∈ U, [v < u] ⇒ [v ∈ U ]
Proof:
What we are given: α is an infinite ordinal number.
What we are required to show: ω ∈ α.
We claim that {n : n ∈ ω} ⊂ α.
We prove the claim by induction. Let P (n) be the statement: “The natural
number n belongs to α”.
Base case: We are required to show that P(0) holds true. Let γ be the
∈-least ordinal in α. If γ = 0, then we are done. Suppose γ ≠ 0. That is,
suppose γ is non-empty. Then, when viewed as a subset of α, it contains a
least element x. We then see that x ∈ γ ∈ α, contradicting the fact that γ is
the ∈-least element of α. The source of the contradiction is our supposition
that γ is not zero. Hence, γ = 0 ∈ α. So P(0) holds true.
Inductive hypothesis: Suppose P(n) holds true for some natural number n.
That is, suppose that n = {0, 1, 2, 3, . . . , n − 1} ∈ α. Since α is transitive,
n = {0, 1, 2, 3, . . . , n − 1} ⊂ α. Then n + 1 = {0, 1, 2, 3, . . . , n − 1, n} =
n ∪ {n} ⊆ α. Since α is infinite, we actually have n + 1 ⊂ α. Since
n + 1 = {0, 1, 2, 3, . . . , n − 1, n} is an initial segment of α, it is an
ordinal in α. Then P(n+) holds true.
By the principle of mathematical induction, α contains every natural
number, as claimed.
Since ω ≠ α, then ω is an initial segment of α. So ω ∈ α as required.
Proof:
What we are given: That α and β are distinct ordinal numbers.
What we are required to show: That (α ⊂ β) ⇒ (α ∈ β).
Suppose α ⊂ β. The set β − α is non-empty, and so contains its least ele-
ment, say γ. We will show that γ = α. If so, then α ∈ β and we are done.
Since elements of ordinals are ordinals, γ is an ordinal number.
We claim that α ⊆ γ:
− Let x ∈ α ⊂ β. Then x is also an ordinal number. It suffices to show that
x ∈ γ. Suppose x ∉ γ. Since β is ∈-linearly ordered, x ∉ γ implies either
γ ∈ x or γ = x holds true. But γ ∈ x ⊂ α or γ = x ⊂ α implies γ ∈ α
(since α is transitive). This contradicts γ ∈ β − α. So x ∈ γ. It follows
that α ⊆ γ as claimed.
We claim that α = γ:
− Suppose α ⊂ γ. Then there exists x ∈ γ − α ⊂ β − α. This means x is
an element in β − α which is strictly ∈-less than its least element γ. This
contradiction is caused by our supposition x ∈ γ − α. We conclude that
α = γ as claimed.
We have shown that if α ⊂ β, then α is the least element of β − α and
so α ∈ β, as required.
The above results have an implication which is worth pointing out imme-
diately. We have shown in Theorem 27.5 that for every ordinal α, α is an
initial segment of β = α ∪ {α} with respect to the ∈ order relation. Then
we can write
γ = {α ∈ β : α ∈ γ}
where the ordinal γ is the leader of its initial segment. This is the case, for
any ordinal γ. This is consistent with what we have observed up to now.
Witness the ordinal, 3 = {0, 1, 2} = {n : n ∈ 3}, where 3 is the leader of the
ordinal 3, and the infinite ordinal, ω = {0, 1, 2, 3, . . . , } = {n : n ∈ ω}, where
ω is the leader of the ordinal ω.2
Proof:
What we are given: The sets α and β are ordinals for which there exists an
onto order isomorphic map f : α → β.
What we are required to show: That α = β.
Let S = {x ∈ α : f(x) ≠ x}. Recall that order isomorphisms are strictly
increasing. Since 0 is the least ordinal of both α and β, f(0) = 0 (if, for
example, f(0) = 1, then f must map some element x > 0 to 0 < 1, a
contradiction). Hence, S is not all of α.
If S = ∅, then f(x) = x for all x in α; then α = β and we are done.
Suppose S ≠ ∅. We claim that this will lead to a contradiction:
− Since α is ∈-well-ordered, S contains a smallest element, say d. Since
d ∈ S, then f(d) ≠ d.
We claim: f(d) ⊆ d.
If x ∈ f(d) in β, then there exists z ∈ α such that f(z) = x.
Since f respects ∈-ordering
u ∈ d ⇒ f(u) = u ∈ f(d)
So d ⊆ f(d) and f(d) ⊆ d implies d = f(d), which contradicts the fact
that d is the least ordinal such that f(d) ≠ d.
So S must be empty. This means that f is the identity map. We can only
conclude that α = β.
Part VIII: Ordinal numbers 287
Theorem 27.9 The relation “∈” linearly orders the class of all ordinals.
Proof:
We have shown in Theorem 26.7 that any two well-ordered sets S and T
are either order isomorphic or one is order isomorphic to an initial segment
of the other. If the ordinals α and β are not order isomorphic, then one
must contain an order isomorphic copy of the other. Suppose, without loss
of generality, that α is order isomorphic to an initial segment γ of β. By part
(b) of Theorem 27.5, γ must be an ordinal number. Since α and γ are order
isomorphic ordinals, then, by Lemma 27.8, they must be the same ordinal
number. Hence, α ∈ β. We can conclude that any two ordinal numbers are
∈-comparable; so “∈” linearly orders the class of all ordinals.
ω + 2 = {0, 1, 2, . . . , ω, ω + 1} = {0, 1, 2, . . . , ω} ∪ {ω + 1} = (ω + 1) ∪ {ω + 1}
β+ = {0, 1, 2, . . . , β}
∪{α : α ∈ U} = ∪{0, 1, 2}
             = 0 ∪ 1 ∪ 2
             = ∅ ∪ {∅} ∪ {∅, {∅}}
             = {∅, {∅}} = 2 ≠ U
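The computation above, with U = {0, 1, 2}, can be repeated mechanically for any finite ordinal. The following Python sketch is an illustration under stated assumptions: finite ordinals are modelled as nested frozensets and the function names are hypothetical. It suggests that the union of the elements of a finite successor ordinal n + 1 is n, and so differs from n + 1.

```python
def successor(alpha: frozenset) -> frozenset:
    """Return alpha+ = alpha ∪ {alpha}."""
    return alpha | frozenset([alpha])


def finite_ordinal(n: int) -> frozenset:
    """Build the ordinal n = {0, 1, ..., n - 1} from the empty set."""
    alpha = frozenset()
    for _ in range(n):
        alpha = successor(alpha)
    return alpha


def union_of_elements(alpha: frozenset) -> frozenset:
    """Return the union ∪{x : x ∈ alpha} of all elements of alpha."""
    return frozenset().union(*alpha)


U = finite_ordinal(3)             # U = {0, 1, 2}
union_U = union_of_elements(U)    # 0 ∪ 1 ∪ 2
```

In this model union_U equals the ordinal 2, in agreement with the hand computation.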
Examples.
a) We define
ω + ω = {0, 1, 2, 3, . . . , ω, ω + 1, ω + 2, ω + 3, . . .} = ω ∪ {ω + n : n = 0, 1, 2, . . .}
ω + ω + ω = (ω + ω) ∪ {ω + ω, ω + ω + 1, ω + ω + 2, ω + ω + 3, . . .}
we can similarly conclude that ω + ω + ω is the least ordinal which contains
all ordinals in ω + ω, and ordinals of the form ω + ω + n, where n ∈ N.
c) Ordinals such as, ω + ω, ω + ω + ω and ω + ω + ω + ω, are more succinctly
written as
ω2, ω3, ω4, ω5, . . .
and so on. We denote the set of ordinals ∪{ωn : n ∈ N} by
ωω
Since every one of these is a countable union of countable ordinals,
each is countable; so each of these is a set.
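The ordinals below ωω are exactly those of the form ωk + n with k and n natural numbers, so they can be handled concretely. The sketch below is a hedged illustration, not a construction from the text: it assumes that the ordinal ωk + n is encoded as the pair (k, n), and the names Ord, successor, is_limit and show are hypothetical. Under this encoding the ∈-order becomes the lexicographic order on pairs. The convention that 0 is not counted as a limit ordinal is also an assumption of the sketch.

```python
from typing import NamedTuple


class Ord(NamedTuple):
    """The ordinal ωk + n, encoded as the pair (k, n)."""
    k: int  # number of copies of ω
    n: int  # finite remainder


def successor(a: Ord) -> Ord:
    """The immediate successor of ωk + n is ωk + (n + 1)."""
    return Ord(a.k, a.n + 1)


def is_limit(a: Ord) -> bool:
    """ωk + n has no immediate predecessor exactly when n = 0 (0 itself is excluded here)."""
    return a.k > 0 and a.n == 0


def show(a: Ord) -> str:
    """Write the ordinal in the juxtaposition notation ω2, ω3, ... used above."""
    if a.k == 0:
        return str(a.n)
    head = "ω" if a.k == 1 else f"ω{a.k}"
    return head if a.n == 0 else f"{head} + {a.n}"


# Tuples compare lexicographically, which matches the order of the ordinals.
every_natural_is_below_omega = Ord(0, 10**6) < Ord(1, 0)
```

For instance, show(Ord(2, 3)) gives the string "ω2 + 3", and Ord(2, 0) is reported as a limit ordinal while its successor is not.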
lub(ω + 2) = lub{0, 1, 2, 3, . . . , ω, ω + 1} = ω + 1
lub(ω + 2) ≠ ω + 2
Concepts review:
1. What is an “ordinal number”?
2. What is a transitive set?
3. What does it mean to say that a transitive set is strictly ∈-well-
ordered?
4. Given two elements x and y of a well-ordered set, what does it mean
to say that y is an immediate successor of x?
5. Give an example of an infinite linearly ordered set which contains
elements with no immediate successor.
6. Describe a method for constructing immediate successors of ordi-
nals.
7. When viewed as an ordinal number, how do we represent N?
8. Given an ordinal number α, which one of its elements are initial
segments of α?
9. Can an ordinal number be an initial segment of itself?
10. Which elements of an ordinal number α are themselves ordinal num-
bers?
11. Which elements of an ordinal number α are proper subsets of α?
12. Which subsets of an ordinal number α are elements of α?
13. What can be said about two ordinals which are order isomorphic?
14. How are limit ordinals different from non-limit ordinals?
15. How are the ordinals ω2, ω3 and ω4 described?
16. What kind of ordinals can be represented as γ = ∪{α : α ∈ γ}?
EXERCISES
ω = {0, 1, 2, 3, . . ., }
ω2 = {0, 1, 2, . . . , ω, ω + 1, ω + 2, . . . , ω + n, . . .}
ω3 = {0, 1, 2, . . . , ω, ω + 1, ω + 2, . . . , ω2, ω2 + 1, ω2 + 2, . . .}
These are called limit ordinals. Limit ordinals are seen to be those
ordinals which do not contain a maximal ordinal. Equivalently, they
are those ordinals α, such that α = lub(α) (27.15).
Such properties are not entirely new to us since the set, N, is seen to satisfy
these very same properties. When viewed as an ordinal we represent N as ω.
We saw that there are ordinals which can be much larger than the ordinal
ω. We exhibited methods to construct large sets of ordinal numbers, each
of which contains the natural numbers, themselves ordinals. In this chapter
we will gather all ordinals together and investigate the class which contains
“all” ordinals. The class of all ordinals is much too big to be called a “set”
(as we shall prove in Theorem 28.4). We can nevertheless study its structure.
296 Section 28: Properties of the class of ordinals.
In the next few pages, we will show that the class, O, itself satisfies, just
like its elements (the ordinal numbers), most of the properties possessed by
its elements.
Proof:
What we are given: O is the class of all ordinals.
What we are required to show: That “∈” is a strict linear order relation on O.
− Since α ∉ α for all α ∈ O, “∈” is irreflexive and asymmetric. We verify
transitivity of the order relation ∈: If α ∈ β and β ∈ γ, then α ⊂ β and
β ⊂ γ; so α ⊂ γ; this implies α ∈ γ. So ∈ is a strict order relation on O.
− That every pair of ordinals is ∈-comparable has been shown in Theorem
27.9.
We conclude that the class, O, of all ordinal numbers is ∈-linearly ordered.
Proof:
What we are given: That S is a non-empty subset of ordinal numbers in O.
What we are required to show: That S contains an ordinal which is the
∈-least element of S. That is, S contains an ordinal, β, such that, for all
α ∈ S other than β, β ∈ α.
Suppose γ ∈ S. If γ ∈ α for all α ∈ S, then γ is the least ordinal of S and
we are done.
Suppose, on the other hand, there is some ordinal α ∈ S such that α ∈ γ.
Then γ ∩ S is a non-empty subset of the well-ordered set γ. This means γ ∩ S
contains a least element, say β.
We claim that β is the ∈-least element of S. Suppose φ is an ordinal in
S such that φ ∈ β. Since β is an element of γ and γ is transitive, φ ∈ γ.
Then φ is an element of S ∩ γ which is ∈-less than the least element, β, of
⇒ β ⊂ O
⇒ O is a transitive class.
Burali-Forti paradox.
Proof:
Suppose the class O of all ordinal numbers is a set. Then, since O is transi-
tive and ∈-well-ordered, it is an ordinal number. Then O ∈ O. Since O is a
transitive set, O ⊂ O. Since no ordinal number can be order isomorphic to a
proper subset of itself (Theorem 26.5), this is a contradiction. So O cannot
be a set.
So, even though O looks and feels like an ordinal and is, in many ways,
similar to one, it is an entirely different mathematical object. We cannot
manipulate it as if it were a set.
Then P (α) holds true for every α ∈ γ. That is, every element of Bγ satisfies
the property P .
Proof:
We are given that for every ordinal β ∈ γ, “P (α) is true for all α ∈ β implies
P (β) is true”. We are required to show that P (α) holds true for every α ∈ γ.
Suppose not. Suppose there exists some ordinal δ ∈ γ such that P (δ) is false.
We claim this will lead to a contradiction: By our supposition, the class
A = {α : α ∈ γ and P (α) is false} is non-empty. Since γ is ∈-well-ordered,
A must have a least element, say λ. That is, λ is the least ordinal in γ such
that P (λ) is false. Since this is the least element of A, P (α) holds true for
all α ∈ λ. By hypothesis, P (λ) must be true. We obtain a contradiction, as
claimed.
Then the set, A, must be empty. So P (α) holds true for all ordinals α ∈ γ.
That is, every element of Bγ satisfies the property P .
Proof:
We are given that P is a property for which conditions 1, 2 and 3 hold true.
We are required to show that P (α) holds true for all ordinals α in γ.
Let λ be an ordinal in γ.
− If λ = 0, then by condition (1), P (λ) holds true.
− Suppose λ has an immediate predecessor, δ, that is, δ+ = λ, such that
P(δ) holds true. By condition (2), P(λ) = P(δ+) holds true.
− Suppose λ is a limit ordinal such that “P(α) is true for all α ∈ λ”. Then
by condition (3), P(λ) holds true.
Then, for every λ ∈ γ, “P(α) is true for all α ∈ λ” implies that P(λ) is true.
By the preceding theorem, P(α) holds true for all ordinals α ∈ γ.
Proof:
What we are given: The set S is a <-well-ordered set.
What we are required to show: There exists a unique ordinal α which is or-
der isomorphic to S. The required order isomorphism, f : S → α, is unique.
For each element k ∈ S, let Sk = {x ∈ S : x < k} denote an initial segment
of S. Let
This means that the elements of any well-ordered set S can be indexed by
the elements of some ordinal. That is, if (S, <) is a well-ordered set which is
order isomorphic to some ordinal β, then S can be expressed as the indexed
set S = {sα : α ∈ β}. This makes the set S susceptible to proofs by math-
ematical induction over the ordinal β, an extremely useful tool for proving
various mathematical statements.
At this point, it will be useful to introduce some vocabulary that will allow
us to state which ordinal is order isomorphic to a given well-ordered set S.
Note that there can be many well-orderings of the same set. Different
well-orderings may give rise to different ordinalities. The ordinality of a
set describes a property of a well-ordered set only. It doesn’t provide
information on its cardinality.
Example. The set N, ordered in the usual way, can then be said to have
ordinality,
ord N = ω
On the other hand, the reader can verify that the same set, N, can be well-
ordered as
N∗ = {0, 2, 4, 6, . . . , 1, 3, 5, . . .}
When well-ordered in this way, it has ordinality
ord N∗ = ω + ω = ω2
The lexicographically ordered countably infinite set S = {1, 2, 3} × N has
ordinality,
ord S = ω + ω + ω = ω3
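The two re-orderings above can be made concrete by writing down where each element sits. The following Python sketch is illustrative only, and the names star_key and lex_key are hypothetical. Each key returns a pair (b, i), read here as the ordinal ωb + i, so that sorting by the key reproduces the well-ordering in question on any finite sample.

```python
def star_key(n):
    """Position of n in N* = {0, 2, 4, ..., 1, 3, 5, ...}.

    The i-th even number sits at position (0, i), that is, at the ordinal i;
    the i-th odd number sits at position (1, i), that is, at the ordinal ω + i.
    """
    return (n % 2, n // 2)


def lex_key(pair):
    """Position of (a, n) in {1, 2, 3} × N under the lexicographic order.

    The element (a, n) sits at position (a - 1, n), read as ω(a - 1) + n.
    """
    a, n = pair
    return (a - 1, n)


# Sorting a finite sample of N by star_key lists the evens before the odds.
sample = sorted(range(10), key=star_key)
```

The number 1, for example, receives the position (1, 0): it is preceded by every even number, which is why the ordinality of N∗ exceeds ω.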
Lemma 28.9 Hartogs’ lemma.5 Let S be any set. Then there exists an ordi-
nal β which is not equipotent with S or any of its proper subsets.
Proof:
What we are given: That S is a set.
What we are required to prove: That there exists an ordinal β that cannot
be mapped one-to-one onto any subset of S.
Let
MS = {(T, RT ) ∈ P(S) × P(S × S) : T ⊆ S, <T well-orders T }
denote the set of all well-ordered subsets of S.6 By Theorem 28.7, every well-
4 Notice how the Axiom of replacement plays a fundamental role in the proof of Hartogs’
lemma, 28.9.
5 Friedrich Moritz “Fritz” Hartogs (1874-1943) was a German-Jewish mathematician,
known for his work on set theory and foundational results on several complex variables.
Historical note: As a Jew, he suffered greatly under the Nazi regime: he was fired in 1935,
was mistreated and briefly interned in Dachau concentration camp in 1938, and eventually
committed suicide in 1943. (Wikipedia)
6 Suppose (S, <S) is a well-ordered set. Let RS = {(x, y) ∈ S × S : x < y}. Then RS can
be viewed as an element of P(S × S), while S can be viewed as an element of P(S). So
(S, RS) ∈ P(S) × P(S × S)
Then for any subset T of S, (T, RT) ∈ P(S) × P(S × S).
Let
MS = {(T, RT) ∈ P(S) × P(S × S) : T ⊆ S, <T well-orders T}
Since both P(S) and P(S × S) are sets, and MS ⊆ P(S) × P(S × S), then MS is also a
set.
ω1 = {α ∈ O : α is a countable ordinal }
Proof:
Hartogs’ lemma states that there exists an ordinal number, say γ, which is
uncountable. Since every countable ordinal is an element of γ, then ω1 ⊆ γ.
Since γ is a set, then ω1 must also be a set. Then, ω1 cannot be equal to the
class O of all ordinals. It was shown (in paragraph (a) preceding the lemma
above) that ω1 is a subclass of O which satisfies the initial segment property.
Proper subsets of O which satisfy the initial segment property were shown
We can now lay this problem to rest. Uncountable ordinals exist in the ZFC-
universe.7
{ωα : α ∈ O}
Note that the function, h : O → O, maps the proper class of all ordinals O
into O. The Transfinite recursion theorem guarantees that the sequence
{g(α) : α ∈ O} = {ωα : α ∈ O}
We write out more explicitly how we obtained the first terms of the class
{ωα : α ∈ O}.
ω0 = ord N
ω1 = h(ω0) = the least ordinal not equipotent to ω0 or its subsets.
ω2 = h(ω1) = the least ordinal not equipotent to ω1 or its subsets.
ω3 = h(ω2) = the least ordinal not equipotent to ω2 or its subsets.
⋮
{ωα : α ∈ O}
constructed above.
ωγ ∼e β = ωψ ∈ ωψ+ ∈ {ωα : α ∈ γ}
First inductive hypothesis: Suppose P(α) holds true for some α. That is,
suppose ωα ∉ α. We are required to show that ωα+ ∉ α+. Since ωα ∉ α,
then either α ∈ ωα or α = ωα.
Case 1: Suppose α ∈ ωα. Then, since ωα is a limit ordinal, α+ ∈ ωα ∈
ωα+. So P(α+) holds true.
Case 2: Suppose α = ωα. Then again, since ωα+ is a limit ordinal and
ωα ∈ ωα+, α+ = (ωα)+ ∈ ωα+. So P(α+) holds true.
Second inductive hypothesis: Suppose γ is a limit ordinal, lub{ωα : α ∈
γ} = ωγ and P(α) holds true for all α ∈ γ. That is, “ωα ∉ α” for all
α ∈ γ.
We are required to show: That ωγ ∉ γ.
If ψ ∈ ∪{ωα : α ∈ γ}, then for some α ∈ γ, ψ ∈ ωα ∈ ωα+ ∈ ωγ ; hence,
∪{ωα : α ∈ γ} ⊆ ωγ
A chain of ordinals under two distinct relations. The class {ωα : α ∈ O} can
be viewed as a chain of infinite ordinals which is strictly ordered by “∈”:
ω0 ↪e ω1 ↪e · · · ↪e ωω0 ↪e ωω0+1 ↪e · · · ↪e ωω0 2 ↪e · · · ↪e ωω1 ↪e · · ·
{ωα : α ∈ O}
described in detail above (in which is involved the somewhat tricky definition
of the Hartogs number), the reader may wonder:
“Precisely, what is the point of this particular class?”
If only to relieve a bit of the suspense, I think we can let the reader in on a
little secret immediately. As we often say when we are about to share how
a lengthy, carefully developed, enigmatic story line will turn out: “Spoiler
alert!” The class of ordinals
{0, 1, 2, 3, . . .} ∪ {ωα : α ∈ O}
is destined to be called the class of all cardinal numbers (as will be described
in the next chapter in the formal Definition 29.7).
The second step is to show that G is a class function, while the third step
is to show that G is unique.
We will show that G is a class function by transfinite induction. For each
ordinal γ let
G|γ = {(α, xα ) ∈ G : γ ∉ α}
For example,
G|0 contains at least (0, u)
G|1 contains at least (0, u) and (1, f(u))
G|2 contains at least (0, u), (1, f(u)) and (2, f(f(u)))
G|3 contains at least (0, u), (1, f(u)), (2, f(f(u))) and (3, f(f(f(u))))
⋮
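The listing of G|0, G|1, G|2, G|3 above can be imitated for finite indices. The Python sketch below is an illustration only: the names restriction and is_function are hypothetical, f and u stand for an arbitrary function and starting value, and only successor steps are covered (the limit step, which uses a least upper bound, has no finite analogue here). It builds the pairs (k, f applied k times to u) for k = 0, . . . , n and checks that no index is paired with two different values.

```python
def restriction(f, u, n):
    """Return the pairs (0, u), (1, f(u)), ..., (n, f^n(u)), a finite analogue of G|n."""
    pairs = []
    x = u
    for k in range(n + 1):
        pairs.append((k, x))
        x = f(x)          # the value attached to k + 1 is f of the value attached to k
    return pairs


def is_function(pairs):
    """A set of pairs is a function when each first coordinate occurs with one value only."""
    distinct_pairs = set(pairs)
    indices = {k for k, _ in distinct_pairs}
    return len(indices) == len(distinct_pairs)


# Example with the hypothetical choices f(x) = 2x + 1 and u = 0.
example = restriction(lambda x: 2 * x + 1, 0, 3)
```

Adding a second pair with the same index, such as (3, 8) next to (3, 7), makes is_function return False, which is the situation the inductive argument rules out.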
Let P (α) represent the statement “G|α is a function”.
− Inductive hypothesis: Case 1. Suppose P (α) holds true for all α ∈ φ+ for
some non-limit ordinal φ+ . This means that G|φ is a function.
We are required to show that P (φ+ ) holds true. That is, we must show
that G|φ+ is also a function. Now G|φ+ = G|φ ∪ {(φ+ , x) : (φ+ , x) ∈ G}.
We know that (φ, xφ ) ∈ G|φ so (φ+ , f(xφ )) ∈ {(φ+ , x) : (φ+ , x) ∈ G}. To
show that G|φ+ is a function it suffices to show that {(φ+ , x) : (φ+ , x) ∈
G} is the singleton set {(φ+ , f(xφ ))}. Suppose not. That is, suppose there
exists in G an element (φ+ , y) such that y ≠ f(xφ ).
Claim: G − {(φ+ , y)} ∈ H . If so, then this contradicts the fact that G is
the smallest element of H . The proof of the claim is left as an exercise.
Assuming the claim is proved, we conclude that P (φ+ ) holds true.
− Inductive hypothesis: Case 2. Suppose P (α) holds true for all ordinals
α ∈ γ where γ is a limit ordinal. This means that G|α is a function for all
ordinals α ∈ γ. Equivalently, {(α, xα ) : α ∈ γ, (α, xα ) ∈ G} is a function.
We are required to show that P (γ) holds true. That is, we must show
that G|γ is also a function. Now
G|γ = {(α, xα) : α ∈ γ, (α, xα ) ∈ G} ∪ {(γ, xγ ) : (γ, xγ ) ∈ G}
Let sγ = lub{xα : α ∈ γ}. We know, by definition of G, that (γ, sγ ) ∈
{(γ, xγ ) : (γ, xγ ) ∈ G}. To show that G|γ is a function, it suffices to show
that {(γ, xγ ) : (γ, xγ ) ∈ G} is the singleton set {(γ, sγ )}. Suppose not.
That is, suppose there exists (γ, y) such that y ≠ sγ .
Claim: G − {(γ, y)} ∈ H . If so, then this contradicts the fact that G is
the smallest element of H . The proof of the claim is left as an exercise.
Assuming the claim is proved, we conclude that P (γ) holds true.
simply obtained by deleting the top right corner from the Tychonoff plank
is appropriately referred to as the
The Tychonoff plank may appear to be a topological space that is, in many
ways, similar to the product space R × N. But, as we will eventually see,
it has quite different properties. Both the Tychonoff plank and the Deleted
Tychonoff plank are useful topological spaces to remember.
Concepts review:
1. Which ordering relation well-orders the class, O, of all ordinals?
2. If S is a subset of ordinals in O, what is one way of describing its
least element?
3. What can we say about initial segments of the well-ordered class
O?
4. Is O an ordinal number? Why or why not?
5. How do we define the immediate successor of an element of an
ordered set?
6. Give an example of a linearly ordered set where no element has an
immediate successor.
7. State the two versions of the principle of induction over the ordinals.
8. What does it mean to say that elements of every well-ordered set
can be indexed by the elements of some ordinal?
9. Which ZFC-axiom is invoked to prove that every well-ordered set
is order isomorphic to a single ordinal?
10. What does “ordinality of a well-ordered set” mean?
11. What does Hartogs’ lemma state?
12. How does the existence of an uncountable ordinal follow from Har-
togs’ lemma?
13. What is the least uncountable ordinal?
14. What is the Hartogs number of a set S?
15. How is the concept of Hartogs number combined with the Transfi-
nite recursion theorem to show that there exists an infinite sequence
of uncountable ordinal numbers no two of which are equipotent?
EXERCISES
B. 5. Let P(Q) denote the set of all subsets of the set of rational numbers Q.
a) Construct a countably infinite subset S of P(Q) which is well-ordered
by the relation ⊆ such that (S, ⊆) is order isomorphic to the ordinal
number ω0 . Prove that ⊆ both linearly orders and well-orders the set
S.
b) Construct a countably infinite subset T of P(Q) which is well-ordered
by the relation ⊆ such that (T, ⊆) is order isomorphic to the ordinal
number ω0 + ω0 .
C. 6. Theorem 26.7 states that “any two well-ordered sets S and T are either
order isomorphic or one is order isomorphic to an initial segment of the
other”. Can we replace the word “sets” with the word “classes” in this
statement? Justify your answer.
7. Construct a set which is not an ordinal number but whose elements can be
indexed by the elements of ω5.
8. Consider the lexicographically well-ordered set S = {1, 2, . . . , 100} × N.
State the ordinal number which is order isomorphic to the subset
{(1, 0), (1, 1), (1, 2), (1, 3), . . . , (2, 0)}
is non-empty. This class describes “the set of all ordinals which cannot be
mapped one-to-one into S”. Since the class, US , of ordinals is non-empty
and O is ∈-well-ordered, then US has an ∈-least element. We called this
∈-least element of US the “Hartogs number”, h(S), of the set S.
Since ordinals are sets, then every ordinal, α, can be assigned a Hartogs
number, h(α).
For example: Determine the Hartogs number, h(ω0 + ω0 ), of the ordinal
ω0 + ω0 .
Since ω0 + ω0 is a countable set and h(ω0 + ω0 ) represents the least ele-
ment of Uω0 +ω0 , then it is the least ordinal that cannot be mapped into the
countable ordinal, ω0 + ω0 . Then the Hartogs number, h(ω0 + ω0 ), must be
uncountable. Since ω1 is the smallest such ordinal, then h(ω0 + ω0 ) = ω1 .
The above examples suggest that ordinals such as ωα are in fact initial
ordinals. We introduce the following notation.
I = {0, 1, 2, 3, . . ., } ∪ {ωα : α ∈ O}
We will show that the class, I , comprises the complete class of all initial
ordinals.
I = {0, 1, 2, 3, . . ., } ∪ {ωα : α ∈ O}
320 Section 29: Cardinal numbers: “Initial ordinals are us!”
Proof:
to {ωα : α ∈ O}.
Is R even well-orderable? To show that a set, S, is well-orderable and to
actually produce an algorithm that well-orders S are two different things.
Of course, producing an algorithm that well-orders S is more useful than
simply proving that S is well-orderable. But sometimes, the best we can
hope for is to prove that a set S is well-orderable, even though we may be
convinced that no algorithm that well-orders S will ever be explicitly found.
It may come as a surprise to many readers to learn that, in the set-theoretic
universe governed by ZFC, all sets are well-orderable (including uncountable
ones such as R). The statement “All sets are well-orderable” proved below is
called the Well-ordering theorem or sometimes the Well-ordering principle.
It is a direct consequence of the Axiom of choice.
Since the Axiom of choice plays a fundamental role in the proof of the Well-
ordering theorem, it will be useful to remind ourselves of what the Axiom
of choice states.
Axiom of choice: Every set of sets has a choice function.
Theorem 29.4 [AC] The Well-ordering theorem. Every set can be well-
ordered.
Proof:
What we are given: That S is a non-empty set.
What we are required to show: That S is well-orderable.
To do this, it suffices to show that S is the one-to-one image of some ordinal
number. Then, by invoking Theorem 26.1, we can conclude that S is well-
orderable.
Case 1: Suppose S is a countable set. If S is finite, then it is the one-to-one
image of some finite ordinal (natural number), and so S is well-orderable. If
S is infinite, then it is the one-to-one image of N (a well-ordered set). Again,
we must conclude that S is well-orderable.
g|γ = {(α, sα ) ∈ g : α ∈ γ}
Claim: For each ordinal, γ, such that γ ⊆ dom g, g|γ is a one-to-one function
on γ.
The proof of the claim is by transfinite induction. Let P (α) denote the
statement “g|α : α → S is a one-to-one function mapping α into S”.
Inductive hypothesis: Suppose γ ⊆ dom g. Suppose P (α) holds true for all
α ∈ γ, where γ belongs to the domain of g. We are required to show that
P (γ) holds true. That is, we are required to show that g|γ is one-to-one on
γ.
Suppose (β, sβ ) and (µ, sµ ) are two elements in g|γ such that β ∈ µ. Then
β and µ are elements of γ. It suffices to show that sβ ≠ sµ . Case 1: If γ is a
limit ordinal, then (β, sβ ) and (µ, sµ ) belong to g|µ+ ⊂ g|γ . By the inductive
hypothesis, g|µ+ is one-to-one on µ+ ; hence, sβ ≠ sµ . Case 2: Suppose
γ is a successor ordinal. If µ+ ≠ γ, then by the inductive hypothesis, g|µ+ is
one-to-one on µ+ and, since (β, sβ ) and (µ, sµ ) belong to g|µ+ , then sβ ≠ sµ .
Suppose µ+ = γ. By definition of g, g(µ) = sµ = f(S − {sα : α ∈ µ}).
Since β ∈ µ, sβ ∉ S − {sα : α ∈ µ}; hence, sβ ≠ f(S − {sα : α ∈ µ}) = sµ .
Then g|γ is one-to-one on γ.
By transfinite induction, g|γ is one-to-one on γ, for all γ ⊆ dom g, as
claimed. We conclude that g is one-to-one on dom g.
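The recursion g(µ) = f(S − {sα : α ∈ µ}) used in this argument can be watched at work on a finite set. The Python sketch below is a finite illustration only, under stated assumptions: the names well_order and choose are hypothetical, choose plays the role of the choice function f, and for a finite set no appeal to the Axiom of choice is actually needed. At each step the next element is chosen from what has not yet been listed, so no element can be listed twice.

```python
def well_order(S, choose):
    """Enumerate the finite set S as s_0, s_1, ... with s_k = choose(S - {s_j : j < k})."""
    remaining = set(S)
    enumeration = []
    while remaining:
        s = choose(frozenset(remaining))
        if s not in remaining:
            raise ValueError("choose must return an element of its argument")
        enumeration.append(s)   # the index of s in the list plays the role of the ordinal
        remaining.remove(s)     # s is no longer available, so the map is one-to-one
    return enumeration


# Example: the built-in min serves as a (hypothetical) choice function on sets of strings.
S = {"pear", "apple", "fig"}
order = well_order(S, min)
```

The resulting list contains every element of S exactly once, and the position of an element in the list is the well-ordering induced on S. A different choice function, such as max, generally induces a different well-ordering of the same set.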
Claim: The function g maps dom g onto S. That is, for every s ∈ S, (α, s) ∈ g
for some ordinal α. Let D denote the domain of g. To prove the claim, it
suffices to show that S − g[D] = ∅.
E = {[S]e : S ∈ S }
where [S]e = {T ∈ S : T ∼e S}. For example, [R]e and [N]e are (dis-
tinct) equivalence classes containing all sets which are equipotent to R and
N respectively. For example, once we had verified that P(N) ∼e R (with
the help of the Schröder-Bernstein theorem), we could write that [R]e =
[P(N)]e where R and P(N) were simply different representatives of the
same equivalence class. When we first discussed the concept of “cardinal
numbers”, the tools available at that time were insufficient to construct a
class of sets whose elements could serve as representatives for each of the
equipotence-induced equivalence classes.1 So we postulated the existence of
the class of cardinal numbers as follows (reproduced from Postulate 22.2):
We now have all the ingredients required to prove that a class of sets whose
properties characterize the cardinal numbers exists in ZFC.
– By Theorems 20.12 and 21.3, the three sets 2^N, P(N) and R are
equipotent.
– If we assume the Continuum hypothesis, P(N) is the smallest uncount-
able set which contains N. That is, there are no uncountable sets strictly
in between N and P(N).
– Then, if we assume the Continuum hypothesis, R is the smallest un-
countable set which contains N.
– Then, if we assume the Continuum hypothesis, c and ω1 are equipotent
sets.
We are now set to formally define the sets we will call the “cardinal num-
bers”.
Recall that, for each equivalence class in the class
E = {[S]e : S ∈ S }
I = ω0 ∪ {ωα : α ∈ O}
Notation 29.7 Although we could use the “ωα ” notation to represent the
cardinal numbers it is customary to use the “aleph” notation, ℵα . We have
already used the aleph notation once with, ℵ0 , used to represent the cardinal-
ity, |N|, as introduced on page 226. We now generalize its use to represent all
elements of {ωα : α ∈ O}. That is, we set
ℵ0 = ω0
ℵ1 = ω1
ℵ2 = ω2
⋮
ℵα = ωα
⋮
Note that the algorithms used to define the operations of addition, multipli-
cation and exponentiation, in the chapters where cardinal operations were
defined, did not involve the fact that cardinal numbers are initial ordinals.
So the algorithms can be freely used using the aleph notation for the cardi-
nals {ℵα : α ∈ O}.
For example, when we write the expression ℵ1 we are thinking “the cardinal
number ℵ1 ” rather than “the initial ordinal ω1 ” even though they represent
the same entity. Since the initial ordinal, ℵ1 = ω1 , is, by definition, “the least
ordinal number which is not countable”, it is the first uncountable cardinal
(ordinal).
ℵ1 = ω1 = c = |R|
Table in which we are not assuming CH nor GCH (in the presence of the
Axiom of choice):
Set S                      cardinality of S          initial ordinal of S    ord S
{}                         0                         0                       0
{a, b, c}                  3                         3                       3
N (standard)               ℵ0                        ω0                      ω0
ω0 + 3 (∈-well-ordered)    ℵ0                        ω0                      ω0 + 3
{1, 2} × N (lexico)        ℵ0                        ω0                      ω0 2
N × N (lexico)             ℵ0                        ω0                      ω0 ω0
ω1                         ℵ1                        ω1                      ω1
⋮                          ⋮                         ⋮                       ⋮
R                          c = 2^ℵ0 = ℵα ≥ ℵ1        ωα
⋮                          ⋮                         ⋮                       ⋮
P(R)                       2^ℵα = ℵβ ≥ ℵα+1          ωβ
⋮                          ⋮                         ⋮                       ⋮
P(P(R))                    2^ℵβ = ℵγ ≥ ℵβ+1          ωγ
⋮                          ⋮                         ⋮                       ⋮
P(P(P(R)))                 2^ℵγ = ℵδ ≥ ℵγ+1          ωδ
{a, b, c}                  3                         3                       3
N (standard)               ℵ0                        ω0                      ω0
ω0 + 3 (∈-well-ordered)    ℵ0                        ω0                      ω0 + 3
{1, 2} × N (lexico)        ℵ0                        ω0                      ω0 2
N × N (lexico)             ℵ0                        ω0                      ω0 ω0
R                          2^ℵ0 = ℵ1                 ω1
R × R                      ℵ1                        ω1
P(R)                       2^ℵ1 = ℵ2                 ω2
P(P(R))                    2^ℵ2 = ℵ3                 ω3
P(P(P(R)))                 2^ℵ3 = ℵ4                 ω4
⋮                          ⋮                         ⋮                       ⋮
⋮                          ℵα                        ωα
⋮                          ⋮                         ⋮                       ⋮
What does this say about GCH? The Axiom of choice guarantees that every
set can be well-ordered and so all sets can be ranked on an “equipotence
based scale C of sets” called the cardinal numbers. This means that every
set is associated to a uniquely specified ordinal number (cardinal number)
on this scale of ordinals. We make the following universe comparisons.
In the ZFC-universe: For any infinite set S such that |S| = ℵγ ,
2^S ∼e P(S) ∼e ℵα for some α > γ. The value of α is guaranteed to
exist, but cannot be determined. The value of α is simply assumed to be
equal to some ordinal greater than or equal to γ + 1.
In the ZFC + GCH-universe: If S is any infinite set such that |S| = ℵγ ,
then 2^S ∼e P(S) ∼e ℵγ+1 . The cardinality of the set 2^S ∼e P(S) is the
least cardinal number (on the equipotence-based scale) which is larger
than ℵγ . The axiom GCH limits the size of power sets P(S) relative to
the size of S.
In the ZFC + CH-universe: For any infinite set S such that |S| = ℵγ >
ℵ0 , 2^S ∼e P(S) ∼e ℵα for some α > γ. The value of α is guaranteed to
Concepts review:
1. What is a well-orderable set? How is it different from a well-ordered
set?
2. What can be said about those sets that are the one-to-one image of
an ordinal number?
3. Is it true that any well-ordered set, no matter how large, is order
isomorphic to some ordinal number?
4. What does it mean to say that the well-ordered set S has order type
(ordinality) α?
5. What does the Well-ordering theorem say?
6. The Well-ordering theorem is a consequence of which fundamental
ZFC-axiom?
7. What is an initial ordinal? Are natural numbers initial ordinals?
Why?
8. What is the least uncountable initial ordinal? How is it obtained?
9. Are all ordinals in ω0 ∪ {ωα : α ∈ O} initial ordinals?
10. Define the “cardinal numbers”.
11. If we assume the Continuum hypothesis, what is the cardinality of
R?
12. If we do not assume the Continuum hypothesis, what is the cardi-
nality of R?
13. What is a limit cardinal?
14. What is a successor cardinal?
EXERCISES
aα = 2                     when α = ω0
aα = 2 + 1/(f(α) + 1)      when ω0 ∈ α
where f(ω0 + n) = n.
a) Are the elements of the set S well-defined?
b) Is the ordering induced on the elements of S by the index set ω0 + ω a
well-ordering?
c) If the elements of S are assumed to be well-ordered by the index set,
what is the least element of the set S with respect to this ordering?
d) What is the least upper bound (supremum) of the set {aα : α ∈ ω}
with respect to the ordering defined by the index set?
C. 11. Show that if S is a class of ordinals, then the least upper bound of S is
∪{α ∈ O : α ∈ S}.
12. Is it true that given any two sets S and T , either S is embeddable in T or
T is embeddable in S? Why?
13. Is the class I of all initial ordinals an initial segment of O? Why?
14. Is the class I of all initial ordinals a transitive class?
Part IX
30 / Axiom of choice
Abstract. In this section we prove that the Axiom of choice is equivalent
to the Well-ordering principle. We provide a few mathematical statements
whose proof requires the Axiom of choice. Finally we present Zorn’s lemma
and show that it is equivalent to the Axiom of choice. A proof of the fact
that every vector space has a basis is given by invoking Zorn’s lemma.
30.1 Introduction.
Amongst the eight set-axioms we have listed, there are only two that pos-
tulate the existence of a set. The other set-axioms are ones that provide us
with the necessary tools to construct new sets from ones that we already
have. The first such existence axiom that we have encountered is the Axiom
of Infinity. The Axiom of infinity postulates the existence of an inductive
set (X ∈ A ⇒ X ∪ {X} ∈ A). Most people have no complaints about this
axiom, since without it we have nothing “on the table” to work with, so
to speak. If we must postulate the existence of at least one set, why not
postulate existence of a set which characterizes the natural numbers?
The second existence set-axiom is the Axiom of choice. This existence axiom
is quite different in nature from the Axiom of infinity (which postulates the
existence of just a single set). The Axiom of Choice states that, given any set
U = {Sx : x ∈ A} of non-empty sets, there exists a function f : U → ∪x∈ASx
which maps each set Sx to a set yx ∈ Sx . We refer to f as a “choice function
for U ” since, from each set, Sx , in U , f chooses a particular set f(Sx ) ∈ Sx without
specifying the rule according to which this choice is made. Remember that
the function, f, is itself a set. So the Axiom of choice postulates, for each
U , the existence of a “set”, f.
So, for each such set, U , we postulate the existence of at least one associated
set, f : U → ∪x∈A Sx . Some readers may wonder why we don’t just construct
the function f : U → ∪x∈A Sx . It is just that, in most cases, we don’t know
how or whether, in practice, such a function is even constructible. So we are
postulating the existence of a “rule” without ever being able to know what
this rule could possibly be. Some may wonder how we can permit ourselves
to postulate the existence of a set that can never be constructed, witnessed
or exhibited. The difficulty is that most of the mathematics that is essential
for us today cannot be done without it.
In this chapter, we will try to develop a deeper understanding of what this
axiom is about. The Axiom of choice will be seen to be equivalent to other
mathematical statements, some of which many find more palatable.
336 Section 30: Axiom of choice
S ≠ ∅ ⇒ P(S) ≠ ∅
⇒ P(S) × S ≠ ∅ (By definition of Cartesian product.)
Observe that {(S, x)} is a function with domain {S} and range {x} which
associates to S some element x of S. We did not invoke the Axiom of choice
to postulate the existence of the function f(S) = x.
Example 2. Suppose we wish to select an element from each set of non-
empty sets from P(N). To do this, do we need to invoke the Axiom of
Choice? Consider the set P(N)∗ = P(N) − {∅} of all non-empty subsets
of N. Suppose we want to form a new set, S = {nA : A ∈ P(N)∗ }, by
selecting from each set, A, in P(N)∗ a single element, nA . We can argue as
follows: Since the set N has been shown to be well-ordered, we can choose
from each non-empty set A ∈ P(N)∗ the unique smallest number, nA , in
1 Formally expressing this axiom in first order logic,
∀X[∅ ∉ X ⇒ ∃f : X → ∪_{A∈X} A  ∀A ∈ X (f(A) ∈ A)]
Part IX: Choice, regularity and Martin’s axiom 337
A. The Axiom of choice is not required, since the element nA selected from each
set A in P(N)∗ is specifically and unambiguously identified as being the unique
least element in A. We can express our choice function, f : P(N)∗ → N as
follows: f(A) = nA , the least element of A, for each A ∈ P(N)∗ .
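The least-element rule of Example 2 is explicit enough to be written down as a short program. The following Python sketch is only an illustration under stated assumptions: the family is a finite list of finite non-empty sets of natural numbers (a program cannot store an arbitrary subset of N), and the name least_element_choice is ours, not the text's.

```python
def least_element_choice(family):
    """Return a choice function, as a dict, on a family of non-empty sets of naturals.

    Each set A is sent to its least element. The rule is stated explicitly,
    so nothing has to be postulated: no appeal to the Axiom of choice is made.
    """
    choice = {}
    for A in family:
        if not A:
            raise ValueError("every set in the family must be non-empty")
        choice[frozenset(A)] = min(A)
    return choice


# A small illustrative family of non-empty sets of natural numbers.
family = [{3, 5, 8}, {0, 7}, {42}]
f = least_element_choice(family)

# The defining property of a choice function: f(A) is an element of A.
for A in family:
    assert f[frozenset(A)] in A
```

The point of the sketch is that the rule "take the least element" is the whole definition of f; the well-ordering of N does the work that the Axiom of choice would otherwise have to do.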
Un+1 .
A word of caution. The proof above does not show that every countably infinite
set of non-empty sets has a choice function. The statement P (n), “a set of n
non-empty sets has a choice function”, was shown to hold for each natural number n.
This proves only that every finite set of non-empty sets has a choice function,
nothing more.
the ZF-axioms.
This means that this existence principle would have to be invoked infinitely
many times. The ZF -axioms do not define a formula with an infinite chain
of existence symbols.
We will now prove that the statement “Every set is well-orderable” implies the
Axiom of choice, and so conclude that the Axiom of choice and “Every set is
well-orderable” are equivalent statements.
Theorem 30.3 The statement “Every set is well-orderable” holds true if and
only if the Axiom of choice holds true.
Proof:
(⇐) That the Axiom of choice implies “Every set is well-orderable” is proven
in the Well-ordering theorem (29.4).
A minor flaw in this proof is that the statement “For each y ∈ T , choose
uy ∈ f ← [{y}]” is not appropriately justified. The Axiom of choice must be
invoked to justify the choice of an element in each set of an infinite number
of sets.
We state a few other well-known statements whose proofs depend on the
Axiom of choice. That is, the following results hold true only if we accept
the Axiom of choice as an axiom along with the other ZF -axioms.
Theorem 30.4 [AC] Any infinite set can be expressed as the union of a pair-
wise disjoint set of infinite countable sets.
Proof:
What we are given: That S is an infinite set.
What we are required to show: That there exists a family, F = {Uα : α ∈ φ},
of pairwise disjoint countably infinite sets such that S = ∪{Uα : α ∈ φ}.
We will recursively construct the set F :
− Let T = {F ∈ P(S) : F is countably infinite}. Since S is a set, then
P(S) is a set and so T is a set. Since S is infinite, T must contain at
least one countably infinite subset U0 of S (By Theorem 18.9).
− If S − U0 is finite, then S is countably infinite and we can let F = {S};
we are done.
− More generally: Let Kγ = {Uα : α ∈ γ} be a set of countably infinite
pairwise disjoint subsets of S indexed with the elements of the ordinal
γ. Either S − ∪{Uα : α ∈ γ} is finite or it is infinite.
· If S − ∪{Uα : α ∈ γ} is infinite, then choose an arbitrary element
Uγ of T which is entirely contained in S − ∪{Uα : α ∈ γ}. We
then obtain the set Kγ+1 = {Uα : α ∈ γ + 1} of countably infinite
pairwise disjoint subsets of S. The Axiom of choice will allow us to
make the selection of Uγ for all such sets Kγ of countably infinite
sets.
− Let F = {Uα : α ∈ φ} be the set of all countably infinite sets obtained
in this way. Then F is a set of pairwise disjoint countably infinite sub-
sets of S. Now either ∪{Uα : α ∈ φ} is equal to S or it is not. If it is equal
to S, then we are done. If it is not equal to S, then S − ∪{Uα : α ∈ φ}
is finite. In such a case, we can add those finitely many remaining elements to
Uα for some α ∈ φ; the enlarged Uα is still countably infinite. We then obtain
S = ∪{Uα : α ∈ φ} as required.
group theorist, and numerical analyst. He is best known for Zorn’s lemma, a method used
in set theory that is applicable to a wide range of mathematical constructs such as vector
spaces, and ordered sets amongst others. Zorn’s lemma was first postulated by Kazimierz
Kuratowski in 1922, and then independently by Zorn in 1935. (Wikipedia)
We will first prove that the Axiom of choice implies Zorn’s lemma holds
true. This will be followed by the proof of its converse.
Theorem 30.5 [AC] Let (X, <) be a partially ordered set. If every chain of
X has an upper bound, then X has a maximal element.
Proof:
What we are given: That X is a partially ordered set. For every chain C in
X, there is an element kC in X such that c ≤ kC for all c ∈ C.
What we are required to show: That X contains an element m such that no
element x of X satisfies the property m < x.
Let φ be the Hartogs number of X. That is, φ is the least ordinal number
which is not equipotent with any subset of X.
Proof by contradiction. Suppose X has no maximal element. Then, for every
element s ∈ X, the set s∗ = {x ∈ X : x > s} is non-empty. The Axiom of choice
guarantees the existence of a choice function f : P(X) − {∅} → X which maps
each non-empty subset of X, in particular each set s∗ , to one of its elements,
so that f(s∗ ) ∈ s∗ .
We recursively define the function g : φ → X as follows:
g(0) = x0 = f(X)
g(1) = x1 = f(x0∗ ) > x0
g(2) = x2 = f(x1∗ ) > x1
...
g(α+ ) = x_{α+} = f(xα∗ ) > xα
...
If λ is a limit ordinal, g(λ) = xλ = an upper bound of the chain {xα : α ∈ λ}
...
In the case where λ is a limit ordinal, the hypothesis guarantees that the
upper bound of the chain {xα : α ∈ λ} exists in X. Since the function
g is strictly increasing, it is one-to-one. Since X has no maximal element,
the function g maps φ = {α : α ∈ φ} one-to-one into X, contradicting the
fact that X’s Hartogs number φ is the least ordinal which cannot be mapped
one-to-one into X. The source of the contradiction is our assumption that X
has no maximal element. We must conclude that X has a maximal element,
as required. A theorem on which we have attached a label [ZL] indicates to
the reader that Zorn’s lemma is invoked in the proof of the statement.
Theorem 30.6 [ZL] Suppose that those partially ordered sets (X, <) in
which every chain has an upper bound must have a maximal element. Then,
given any subset 𝒮 ⊆ P(S) − {∅}, there exists a choice function f : 𝒮 → S
which maps each set in 𝒮 to one of its elements.
Proof:
What we are given: That partially ordered sets (X, <) in which every chain
has an upper bound have a maximal element and that 𝒮 ⊆ P(S) − {∅}.
What we are required to show: That there exists a choice function f : 𝒮 → S
which maps each set in 𝒮 to one of its elements.
Let
Note that F is non-empty since it has been shown that all finite sets of
sets have a choice function. We will partially order the functions in F by
inclusion “⊂”. That is,
Proof:
Let V be a vector space and let (F , ⊆F ) be the set of all linearly independent
subsets of the vector space V ordered by inclusion ⊆F . The set F is
non-empty since the empty set is linearly independent (as is every singleton
set {v} with v ≠ 0).
Let 𝒞 be a chain of linearly independent subsets in F .
We claim that the union ∪{C : C ∈ 𝒞} is also linearly independent:
- It suffices to show that every finite linear combination of elements of
∪{C : C ∈ 𝒞} which equals zero must have zeroes as coefficients. Let U =
{v1 , v2 , v3 , . . . , vn } be a finite set of vectors in ∪{C : C ∈ 𝒞}.
- Then U ⊆ C for some C ∈ 𝒞 (each vi belongs to some member of 𝒞 and, since
𝒞 is a chain, the largest of these finitely many members contains all of U ).
- Since C is linearly independent, then α1 v1 +α2 v2 +α3 v3 +· · ·+αn vn = 0
implies α1 = α2 = · · · = αn = 0.
- So ∪{C : C ∈ 𝒞} is linearly independent as claimed.
Then every chain 𝒞 in (F , ⊆F ) has an upper bound in F , namely ∪{C : C ∈ 𝒞}.
By Zorn’s lemma, (F , ⊆F ) has a maximal linearly independent set B ∗ . That
is, B ∗ is a linearly independent set that is not a proper subset of any other
linearly independent set. We now show that B ∗ spans V . If some v ∈ V − B ∗ were
not a linear combination of vectors in B ∗ , then B ∗ ∪ {v} would be a linearly
independent subset of V which properly contains B ∗ , contradicting the maximality
of B ∗ . So, B ∗ spans V . So, B ∗ is a basis of V . Thus, every vector space has a basis.
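In a finite setting the maximal linearly independent set of the proof above can be found by a simple greedy search, with no need for Zorn's lemma. The Python sketch below illustrates only this finite analogue; it is not the argument of the proof. As assumptions of ours, vectors over the two-element field GF(2) are encoded as non-negative integers (bit i is the i-th coordinate, addition is bitwise XOR), and the function names are hypothetical.

```python
def reduce_mod_span(v, pivots):
    """Reduce the bit-vector v against an echelon basis {leading_bit: vector} over GF(2).

    Returns 0 exactly when v is a linear combination of the echelon vectors.
    """
    while v:
        lead = v.bit_length() - 1
        if lead not in pivots:
            return v
        v ^= pivots[lead]  # cancels the leading bit, so the loop terminates
    return 0


def maximal_independent_subset(vectors):
    """Greedily build a maximal linearly independent subset of `vectors` over GF(2)."""
    pivots = {}  # echelon form of the span of the vectors kept so far
    basis = []   # the original vectors that were kept
    for v in vectors:
        r = reduce_mod_span(v, pivots)
        if r:  # v is not in the span of `basis`, so adding it keeps independence
            pivots[r.bit_length() - 1] = r
            basis.append(v)
    return basis


# All non-zero vectors of the 3-dimensional space over GF(2).
vectors = list(range(1, 8))
basis = maximal_independent_subset(vectors)
```

A vector is rejected only when it already lies in the span of the vectors kept so far, so the final set is maximal and therefore spans the space, exactly as B ∗ does in the proof. For an arbitrary (possibly infinite-dimensional) vector space no such step-by-step search is available, which is where Zorn's lemma is invoked.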
ℵγ = lub{ℵα : α ∈ γ}
= lub{ωα : α ∈ γ}
= ωγ
Theorem 30.8 [GCH] Every uncountable limit cardinal is a strong limit car-
dinal.
Proof:
Suppose ℵγ is an uncountable limit cardinal, and suppose α ∈ γ. Since ℵγ
is a limit cardinal, then, by definition, ℵ_{α+1} ∈ ℵγ . Assuming GCH, we have
2^ℵα = ℵ_{α+1} ∈ ℵγ . We have shown that α ∈ γ ⇒ 2^ℵα ∈ ℵγ . So, under
GCH, ℵγ is a strong limit cardinal, as required.
But do we absolutely need GCH to prove this result? Are there any strong
limit cardinals in the ZFC-universe when uninfluenced by GCH? Surprisingly
enough, there are, but to prove it we will have to invoke the Axiom of choice.
Proof:
The class, C ∗ = {ℵα : α ∈ O, α ≠ 0}, denotes the class of all uncountable
infinite cardinals.
For each ordinal α, by the Well-ordering theorem (equivalent to the Axiom
of choice), the set P(ℵα ) is well-orderable.4 By Theorem 20.12, 2^ℵα and
P(ℵα ) are equipotent sets. Equipotent sets have the same cardinality. So
|P(ℵα )| = |2^ℵα |. Since |2^ℵα | = |2|^|ℵα| = 2^ℵα , then |P(ℵα )| = 2^ℵα . So, for
each α, the well-ordered set, P(ℵα ), is order isomorphic to the unique cardinal
number 2^ℵα .
We define the function g : C ∗ → C ∗ as follows:
in C ∗ , as follows:
κ0 = ℵ0
κ1 = g(κ0 ) = 2^κ0
κ2 = g(κ1 ) = 2^κ1
κ3 = g(κ2 ) = 2^κ2
...
κ_{n+1} = g(κn ) = 2^κn
...
Then {κn : n ∈ ω0 } is a countable (strictly) increasing sequence of cardinals
which has no maximal element. It follows that
γ = ∪{κn : n ∈ ω0 }
is a limit ordinal which is the least upper bound of the set U (see Theorem
27.12).
Claim #1: The ordinal γ is a cardinal number.
Clearly γ is infinite. It suffices to show that γ is an initial ordinal (since
initial ordinals are cardinals). The ordinal γ is an initial ordinal if
[β ∈ γ] ⇒ [β ≁e γ]
ℵα ∈ ℵλ = γ = ∪{κn : n ∈ ω0 }
Concepts review:
1. What does the Well-ordering principle say?
2. What does the Axiom of choice say?
3. Is the Axiom of choice required to justify the existence of a choice
function for finite sets?
4. Is an axiom required to justify the existence of a choice function for
countably infinite subsets of P(S)?
5. What does the Axiom of countable choice say?
6. Provide an example of a statement whose proof requires the Axiom
of choice.
7. Does the Axiom of countable choice follow from the ZF -axioms?
8. State Zorn’s lemma.
9. What linear algebra statement can be proved by invoking Zorn’s
lemma?
EXERCISES
A. 1. For which of the following sets of sets is the Axiom of choice required to
guarantee the existence of a function which selects an element from each
set?
a) An infinite set of sets S where each set in S contains one element.
b) Three sets each containing all elements of R.
c) A countably infinite number of sets each containing all the rational
numbers.
C. 5. If S is a set containing more than one element, show that there exists a
one-to-one function f : S → S such that f maps no point x in S to itself.
That is, f(x) ≠ x for all x in S.
6. Let S be a set of sets. Let M = {U ∈ P(S ) : X, Y ∈ U implies X ∩
Y ≠ ∅}. Show that M contains a maximal element T with respect to the
ordering “⊆”. That is, T ∈ M and, for any B ∈ S − T , T ∪ {B} ∉ M .
350 Section 31: Axiom of regularity
31 / Axiom of regularity
Abstract. In this section we state the Axiom of regularity and present
some of its equivalent forms. We prove that the Axiom of regularity is
equivalent to the statement “Every set has an ∈-minimal element”. We
also show that in the presence of the Axiom of regularity, no set can be
an element of itself. We define “well-founded sets ” and show that in the
presence of the Axiom of choice, the Axiom of regularity is equivalent to
the statement “Every set is well-founded”. The transitive closure of a set
is defined.
Theorem 31.1 The Axiom of regularity holds true if and only if every non-
empty set S contains a minimal element with respect to the membership
relation “∈”.
Proof:
(⇒)
What we are given: A non-empty set S. That the axiom of regularity holds
true in S. That is, there exists in S an element m such that m ∩ S = ∅.
What we are required to show: That S contains a minimal element with
respect to the membership relation “∈”.
Suppose the given m is not a minimal element of S. That is, suppose S
contains an element x such that x ∈ m. Then x ∈ m ∩ S, so m ∩ S ≠ ∅, contradicting
our hypothesis. Hence this element m is minimal in S with respect to “∈”,
as claimed.
(⇐)
What we are given: A non-empty set S. That S contains a minimal element
m with respect to “∈”.
What we are required to show: That S contains some element m such that
m ∩ S = ∅.
Suppose that S does not contain an element m such that m∩S = ∅. That is,
for every m ∈ S, there exists x ∈ m ∩ S; then no element m in S is minimal
with respect to ∈. This contradicts our hypothesis. Hence, for every non-empty
set S there exists m ∈ S such that m ∩ S = ∅.
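For finite sets built up from ∅, the condition m ∩ S = ∅ of Theorem 31.1 can be checked mechanically. The Python sketch below is a small illustration, assuming sets are modelled as frozensets of frozensets; the function name epsilon_minimal is ours. Note that Python frozensets are automatically well-founded (a frozenset cannot contain itself), so the sketch cannot exhibit a failure of regularity.

```python
def epsilon_minimal(S):
    """Return some m in S with m ∩ S = ∅, or None if S has no such element."""
    for m in S:
        if not (m & S):  # m and S share no element, so m is ∈-minimal in S
            return m
    return None


# The von Neumann naturals 0, 1, 2 as nested frozensets.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

S = frozenset({one, two})
m = epsilon_minimal(S)
```

Here 2 ∩ S = {1} is non-empty because 1 ∈ 2, while 1 ∩ S = {0} ∩ {1, 2} = ∅, so the only ∈-minimal element of S is 1.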
Up to this point in our study of sets, we have not encountered a set x such
that x ∈ x. We thought it would be best to avoid such “creatures”, at least
until we better understand the difficulties that they may cause. We will now
see that the statement “No set can be an element of itself” is a consequence
of the Axiom of regularity.
We will now show that in the presence of the Axiom of choice,1 the two state-
ments “Every set is well-founded ” and the Axiom of regularity are equiva-
lent. This means that in the absence of the Axiom of regularity, we would
have to accept that non-well-founded sets may exist and study what impact
the existence of such sets has in our set-theoretic universe.2
Theorem 31.4 [AC] The Axiom of regularity and the statement “Every set
is well-founded ” are equivalent statements.
Proof:
(⇒)
What we are given: The Axiom of regularity holds true.
What we are required to show: That all sets are well-founded.
Suppose there exists a set which contains an infinite descending chain of
sets S = {xn : n ∈ ω}. This means that an ∈-ordered chain such as
· · · ∈ x4 ∈ x3 ∈ x2 ∈ x1 ∈ x0 exists. By hypothesis, S must contain
some element m such that m ∩ S = ∅. This element m must be equal to xk
for some k ∈ ω. Since x_{k+1} ∈ xk ∩ S = m ∩ S, we have a contradiction. So
non-well-founded sets cannot exist in the presence of regularity.
(⇐)
What we are given: That every set is well-founded.
What we are required to show: That every non-empty set S contains an
element m such that m ∩ S = ∅.
Suppose there exists a non-empty set S such that for every m ∈ S, m ∩ S ≠
∅. Then there exists a relation R ⊂ S × S such that (m, x) ∈ R if and only
if x ∈ m ∩ S. Since m ∩ S ≠ ∅ for all m ∈ S, the domain of R is all of S.
Invoking the Axiom of choice, there exists a “choice function” f : S → S,
f ⊆ R, where, for each m ∈ S, f(m) ∈ m ∩ S. Since S is non-empty, we can fix
an element x0 ∈ S. We recursively define a function g : ω0 → S as follows:
g(0) = x0
g(1) = x1 = f(x0 ) ∈ x0 ∩ S
g(2) = x2 = f(x1 ) ∈ x1 ∩ S
...
g(n + 1) = x_{n+1} = f(xn ) ∈ xn ∩ S
...
dependent choice
2 It was shown in 1929 by von Neumann that if “ZF without regularity” is consistent,
Example − Consider the set S = {2}. We see that the set S is not transitive,
since 1 ∈ 2 = {0, 1}, 2 ∈ S, but 1 ∉ S. So S does not contain all the
elements required for it to be transitive. Starting with S, we will construct,
step-by-step, its transitive closure, tS . We have seen that element 1 is missing,
so let’s add it to S: Let S1 = {1, 2}. We see that S1 is not transitive
since 0 ∈ 1 = {0}, 1 ∈ S1 , but 0 ∉ S1 . We then add to S1 , the element 0:
Let S2 = {0, 1, 2}. We see that S2 is the natural number 3 known to be
transitive. Then, the transitive closure, tS , of S = {2} is the natural number
3 = {0, 1, 2}.
Completing a non-transitive set, S, to its transitive closure, tS , means to
add to S the elements of its elements, then the elements of those elements,
and so on, until no new elements appear.
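The step-by-step completion carried out in the example can be sketched in Python for finite sets built up from ∅. This is only an illustration under our own assumptions: sets are modelled as frozensets of frozensets, and the function name transitive_closure is ours.

```python
def transitive_closure(x):
    """Smallest transitive set containing every element of x.

    x is a frozenset whose elements are themselves frozensets (hereditarily
    finite sets). Each round adds the elements of the elements found so far.
    """
    closure = set(x)
    frontier = set(x)
    while frontier:
        # Elements of the latest layer's elements that we have not seen yet.
        new = {y for u in frontier for y in u} - closure
        closure |= new
        frontier = new
    return frozenset(closure)


# The von Neumann naturals 0, 1, 2, 3 as nested frozensets.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})

t_S = transitive_closure(frozenset({two}))  # the example S = {2}
```

Starting from S = {2}, the first round adds 0 and 1 (the elements of 2) and the second round finds nothing new, so the loop stops with {0, 1, 2} = 3, in agreement with the example.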
The following theorem guarantees that every non-empty set, x, has a tran-
sitive closure, tx .
Theorem 31.6 Let x be a set. Then there exists a smallest transitive set, tx ,
which contains all elements of x. That is, every set, x, has a transitive closure,
tx .
Proof:
Let S denote the class of all sets. Let f : S → S be a function defined as:
f(u) = ∪{y ∈ S : y ∈ u}. Let x0 ∈ S . We recursively define the function
g : ω0 → S as follows:
g(0) = x0 ∈ S
g(1) = x1 = f(x0 ) = ∪{y ∈ S : y ∈ x0 }
g(2) = x2 = f(x1 ) = ∪{y ∈ S : y ∈ x1 }
...
g(n) = xn = f(x_{n−1} ) = ∪{y ∈ S : y ∈ x_{n−1} }
g(n + 1) = x_{n+1} = f(xn ) = ∪{y ∈ S : y ∈ xn }
...
Let
tx0 = ∪{xn : n ∈ ω0 } = x0 ∪ x1 ∪ x2 ∪ · · ·
Since x0 is a set, each xn is the union of a set of sets and so tx0 is itself a
set.
Claim #1 : That tx0 is a transitive set.
Proof of Claim #1: Suppose u ∈ tx0 and v ∈ u. We are required to
show that v ∈ tx0 . Since tx0 = ∪{xn : n ∈ ω0 }, u ∈ xk for some
k ∈ ω. Since xk+1 = f(xk ) = ∪{y ∈ S : y ∈ xk }, u ⊆ xk+1 . Hence,
v ∈ u ⊆ xk+1 ⊆ ∪{xn : n ∈ ω0 } = tx0 . So tx0 is a transitive set, as claimed.
Suppose now that s is some transitive set such that x0 ⊆ s.
Claim #2: That tx0 ⊆ s.
Proof of Claim #2: It suffices to show that xn ⊆ s for all n. We will show
this by induction.
Base case: That x0 ⊆ s is given.
Inductive hypothesis: Suppose xn ⊆ s. We are required to show that
xn+1 ⊆ s. If u ∈ xn+1 = f(xn ) = ∪{y ∈ S : y ∈ xn }, then u ∈ y, for
some y ∈ xn ⊆ s. Since s is transitive, u ∈ y ⊆ s implies u ∈ s. Then
xn+1 ⊆ s.
Hence, by mathematical induction, xn ⊆ s, for all n. Since tx0 = ∪{xn : n ∈
ω}, tx0 ⊆ s, as claimed.
We have thus constructed the smallest transitive set tx0 which contains x0 .
Then for any set x there exists a smallest transitive set tx which contains x.
Concepts review:
1. State the Axiom of regularity.
2. What does it mean to say that a set S has a minimal element with
respect to ∈?
EXERCISES
32 / Cumulative hierarchy
Abstract. We show how to construct, incrementally, a class of sets whose
union, V , contains all sets. The class, V , is referred to as “von Neumann’s
universe of sets”. The Axiom of regularity is used to show that
this class, V , indeed contains all sets. We then define the “cumulative
hierarchy” and the “rank of a set”.
g(0) = V0 = ∅ = 0
g(1) = V1 = f(V0 ) = P(0) = {∅} = 1
g(2) = V2 = f(V1 ) = P(1) = {{∅}, ∅} = 2 (2^1 = 2 elements)
g(3) = V3 = f(V2 ) = P(2) = {{{∅}, ∅}, {{∅}}, {∅}, ∅}
g(4) = V4 = f(V3 ) = P(V3 ) (2^4 = 16 elements)
...
g(α+ ) = V_{α+} = f(Vα ) = P(Vα )
...
If λ is a limit ordinal, g(λ) = Vλ = ∪_{α∈λ} Vα
g(λ + 1) = V_{λ+1} = f(Vλ ) = P(Vλ )
...
The class {Vα : α ∈ O} is called the Cumulative hierarchy of sets. The union
of all the elements of the cumulative hierarchy of sets is denoted as:
V = ∪α∈O Vα
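The first few levels of the cumulative hierarchy are small enough to build explicitly. The Python sketch below is an illustration only: it covers the finite levels V0 , . . . , V4 , models sets as frozensets, and uses helper names (power_set, cumulative_hierarchy) that are ours.

```python
from itertools import combinations


def power_set(s):
    """All subsets of the frozenset s, returned as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )


def cumulative_hierarchy(n):
    """Return the list [V_0, V_1, ..., V_n] with V_0 = ∅ and V_{k+1} = P(V_k)."""
    levels = [frozenset()]
    for _ in range(n):
        levels.append(power_set(levels[-1]))
    return levels


V = cumulative_hierarchy(4)
sizes = [len(level) for level in V]  # number of elements in V_0, ..., V_4
```

The list sizes comes out as 0, 1, 2, 4, 16. One can also check on these levels that each V_k is both an element and a subset of V_{k+1} . Going much further is not practical, since V5 already has 2^16 = 65536 elements and V6 has 2^65536.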
{Vα : α ∈ O}
V ⊆S
For the following theorem, recall that a set S is transitive with respect to ∈
if x ∈ S ⇒ x ⊂ S or, equivalently, if x ∈ y and y ∈ S imply x ∈ S.
Proof:
It suffices to show that y ∈ x ∈ Vγ implies y ∈ Vγ , for all ordinals, γ. We
can prove this by a straightforward application of transfinite induction.
Let P (α) denote the statement “Vα is transitive”.
Since V0 = ∅, P (0) trivially holds true.
Suppose P (α) holds true. That is, suppose Vα is transitive.
y ∈ x ∈ V_{α+} = P(Vα ) ⇒ y ∈ x ⊆ Vα
⇒ y ∈ Vα
⇒ y ⊆ Vα (Since Vα is transitive.)
⇒ y ∈ P(Vα ) = V_{α+}
⇒ V_{α+} is transitive
⇒ P (α+ ) holds true.
Up to now, we have always represented the class of all sets which has evolved
from the ten ZFC-axioms by the symbol, S = {x : x is a set}. Since every
element of {Vα : α ∈ O} is a set, we can write, {Vα : α ∈ O} ⊆ S .
Furthermore,
⇒ x ⊆ Vγ , Since Vγ is transitive.
⇒ x is a set.
⇒ x ∈ S.
Hence
V = ∪α∈O Vα ⊆ S
We now wonder whether S ⊆ V = ∪{Vα : α ∈ O}. That is, are all sets
accounted for in V ? We will prove that this is indeed the case. With this
objective in mind, we first establish the following lemma.
Proof:
We have proven above that Vα is transitive for every α ∈ O.
Let P (γ) denote the statement “α ∈ γ implies Vα ∈ Vγ ”.
Suppose P (β) holds true for all β ∈ γ. That is, if α ∈ β ∈ γ, then Vα ∈ Vβ .
Suppose φ ∈ γ. We are required to show that Vφ ∈ Vγ .
Case 1: If γ is a limit ordinal, then Vφ ∈ Vφ+ ⊂ ∪α∈γ Vα = Vγ .
Case 2: Suppose γ = ψ+ . If φ = ψ, then Vφ = Vψ ∈ P(Vψ ) = V_{ψ+} = Vγ .
If φ ∈ ψ, then Vφ ∈ Vψ , by the inductive hypothesis; since Vψ ∈ Vγ and Vγ is
transitive, Vψ ⊆ Vγ and so Vφ ∈ Vγ . In both cases Vφ ∈ Vγ ; hence, P (γ) holds true.
By transfinite induction, α ∈ γ implies Vα ∈ Vγ for all γ. Since Vγ is transi-
tive, Vα ⊂ Vγ for all γ. This completes the proof of the lemma.
Proof:
What we are given: That B is a non-empty set such that B ⊂ V = ∪{Vα :
α ∈ O}.
What we are required to show: That B ∈ V .
Let u ∈ B. Then, since B ⊂ V , u ∈ Vα for some α ∈ O.
Then the set {α ∈ O : u ∈ Vα } is non-empty. This ensures that the function
f : B → O defined as
f(u) = least{α ∈ O : u ∈ Vα }
is well-defined.
− Since B is a set, by the Axiom of replacement, f[B] is a set of ordinals.
Since f[B] is a set, β = ∪α∈f[B] α (a union of a set of sets) is a set. By
Theorems 27.11 and 27.12, β is an ordinal.
− Since α ⊆ β for all α ∈ f[B] and β is transitive, α ∈ β or α = β, for all α ∈ f[B].3
− By Lemma 32.3, for every α ∈ f[B], [α ∈ β or α = β] ⇒ [Vα ⊆ Vβ ].
Then, for every u ∈ B, u ∈ V_{f(u)} ⊆ Vβ . This implies that B ⊆ Vβ and so
B ∈ P(Vβ ) = V_{β+} ⊂ V
So B ∈ V , as required. So S ⊆ V .
Proof:
What we are given: That the class V contains all sets.
What we are required to show: That every non-empty set x contains a ∈-
minimal element.
Suppose x is a non-empty set. By hypothesis, x belongs to V = ∪α∈O Vα .
Then x ∈ Vα for some α. Since Vα is transitive, x ⊂ Vα ⊂ V . We can then
define a function f : x → O as f(u) = least{α ∈ O : u ⊂ Vα }. Since x is
a non-empty set, by the Axiom of replacement, f[x] is a non-empty set of
ordinals and so contains a least element, say φ. Since φ is in the image of x
under f, there exists an element m in the domain x of f such that f(m) = φ.
Then φ is the least ordinal such that m ⊂ Vφ ; equivalently, it is the least
ordinal such that m ∈ Vφ+ . Then m ∉ Vφ .
We claim that m is an ∈-minimal element of x: Suppose not. Suppose
z ∈ m ∩ x. Then since z ∈ m ⊂ Vφ implies z ∈ Vφ we have f(z) ∈ f(m) = φ.
This contradicts the fact that φ is minimal in f[x]. Then m is an ∈-minimal
element of x as claimed.
Then every set has a minimal element with respect to “∈”.
Definition 32.7 Given any set U , we will define the “rank of the set U ”,
denoted as rank(U ), as follows:
rank(U ) = least{α ∈ O : U ⊆ Vα }
g(0) = V0 = ∅ = 0
g(1) = V1 = f(V0 ) = P(0) = {∅} = 1
g(2) = V2 = f(V1 ) = P(1) = {{∅}, ∅} = 2 (2^1 = 2 elements)
g(3) = V3 = f(V2 ) = P(2) = {{{∅}, ∅}, {{∅}}, {∅}, ∅}
g(4) = V4 = f(V3 ) = P(V3 ) (2^4 = 16 elements)
− Suppose A = 3. Since 3 = {0, 1, 2} = {∅, {∅}, {{∅}, ∅}} ⊂ V3 and
3 ⊄ V2 , then rank(A) = 3. (Could it be that the rank of any ordinal,
γ, is γ? We will soon see!)
− Suppose B = {{{∅}}}. We see that {{∅}} ∈ V3 so B ⊆ V3 , and
{{∅}} ∉ V2 so B ⊄ V2 ; hence, rank(B) = 3.
− Suppose C = { {{{∅}}}, ∅}. We see that {{{∅}}} and ∅ are both
elements of P(V3 ) = V4 ; hence, C ⊂ V4 . Since {{{∅}}} ∉ V3 , C ⊄ V3 ;
hence, rank(C) = 4.
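The three rank computations above can be checked directly against Definition 32.7 by searching the first few levels Vn . The Python sketch below is an illustration under our own assumptions: sets are modelled as frozensets, only the levels V0 , . . . , V4 are built, and the names power_set, levels and rank are ours.

```python
from itertools import combinations


def power_set(s):
    """All subsets of the frozenset s, returned as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )


# levels[n] is V_n, for n = 0, ..., 4, with V_0 = ∅ and V_{n+1} = P(V_n).
levels = [frozenset()]
for _ in range(4):
    levels.append(power_set(levels[-1]))


def rank(U):
    """Least n with U ⊆ V_n, searching only the finite levels built above."""
    for n, V_n in enumerate(levels):
        if U <= V_n:
            return n
    raise ValueError("U is not a subset of V_4; build more levels")


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

A = frozenset({zero, one, two})           # A = 3
B = frozenset({frozenset({one})})         # B = {{{∅}}}
C = frozenset({B, zero})                  # C = {{{{∅}}}, ∅}
ranks = (rank(zero), rank(A), rank(B), rank(C))
```

The tuple ranks evaluates to (0, 3, 3, 4), matching rank(∅) = 0 and the three examples. The search is a literal reading of the definition and is only usable for very small sets.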
Proof:
a) Note that if U = ∅, then rank(U ) is the least ordinal number α such
that ∅ ⊆ Vα , namely 0. So rank(∅) = 0.
Vα = {U ∈ S : rank(U ) < α}
Suppose D ∈ Vα . Then, by part (b), rank(D) < α. So D ∈ {U ∈
S : rank(U ) < α}. Suppose on the other hand that D ∈ {U ∈ S :
rank(U ) < α}, then rank(D) < α. By part (d), D ∈ Vα . So for any
ordinal α, Vα is precisely the set of all sets U whose rank is less than α.
f) What we are given: U and D are sets such that U ∈ D.
What we are required to show: rank(U ) < rank(D).
Recall that rank(D) = least{α : D ⊆ Vα } (and if α < rank(D),
D ⊈ Vα ).
Then D ⊆ Vrank(D) . We are given that U ∈ D. Then U ∈ Vrank(D) .
By part b), rank(U ) < rank(D). As required.
g) What we are given: γ is an ordinal.
What we are required to show: That rank(γ) = γ.
Claim: rank(γ) ≤ γ for all ordinals γ.
Suppose γ < rank(γ).
γ < rank(γ) ⇒ γ < β ≤ rank(γ) (For some ordinal β.)
⇒ γ ∈ β and (β ∈ rank(γ) or β = rank(γ))
⇒ γ ∈ β ⊆ Vβ ⊆ V_{rank(γ)} (By part (b) of Lemma 32.3.)
⇒ γ ∈ V_{rank(γ)} (Contradicting part (b).)
In what follows, we will illustrate how eight of the ten ZFC-axioms are satisfied
either on V or simply on certain subsets, Vα , of V . Doing so is good
practice, even if it can be a bit tricky at times. We will see how
knowing the rank of a set can be useful in proving that the ZFC-set-axioms
hold true in V .
a, b ∈ Vβ = V_{α+1} = P(Vα )
Axiom of union. The Axiom of union states that if A is a set of sets, then
∪{C : C ∈ A } is a set.
Proof:
What we are given: That U ∈ V . Then there is β such that U ∈ Vβ .
What we are required to show: That ∪{x : x ∈ U } ∈ V .
Since U ∈ Vβ , then, by property (b) in Theorem 32.8, rank(U ) < β.
Let y ∈ ∪{x : x ∈ U }. Then y ∈ x for some x ∈ U . Since y ∈ x ∈ U , then
rank(y) < rank(x) < rank(U ) (by part f) of Theorem 32.8 above)
So y ∈ Vβ .
We deduce that
∪{x : x ∈ U } ⊆ Vrank(U ) ⊂ Vβ
This means that ∪{x : x ∈ U } ∈ P(Vrank(U ) ) ⊆ Vβ .
It follows that ∪{x : x ∈ U } ∈ Vβ , as required.
So V satisfies the property described in the Axiom of union.
Axiom of power set. To show that V satisfies the property of power set,
we must show that if U ∈ V , then P(U ) ∈ V .
Proof:
What we are given: That U ∈ V .
What we are required to show: That P(U ) ∈ V .
By definition, U ⊆ V_{rank(U)} . Then U ∈ P(V_{rank(U)} ) = V_{rank(U)+1} .
Suppose A ∈ P(U ). Then A ⊆ U ⊆ V_{rank(U)} . So, A ∈ P(V_{rank(U)} ) = V_{rank(U)+1} .
Then P(U ) ⊆ V_{rank(U)+1} , and so P(U ) ∈ P(V_{rank(U)+1} ) = V_{rank(U)+2} .
So U ∈ V implies P(U ) ∈ V .
Axiom of infinity. The Axiom of infinity states that there exists a non-
empty set A (called an inductive set) that satisfies the condition: (x ∈ A) ⇒
(x ∪ {x} ∈ A). The set, ω0 = {0, 1, 2, 3, . . . , }, was defined to be the smallest
inductive set. To show that K satisfies the Axiom of infinity, we need to
show that ω0 ∈ K.
Proof:
What we are given: An ordinal number, α, strictly larger than ω0 .
What we are required to show: That ω0 ∈ Vα .
First note that, for each n ∈ ω0 , Vn is finite. This is easily verified by a proof
by mathematical induction on ω0 .
By definition, Vω0 = ∪{Vn : n ∈ ω0 }. Then ω0 ∉ Vω0 (otherwise, ω0 ∈ Vn
for some n, which is impossible since Vn is finite and transitive while ω0 is
infinite). Since ω0 is an ordinal then, by Theorem 32.8,
rank(ω0 ) = ω0 . This means that the least ordinal α such that ω0 ⊆ Vα is ω0 .
Then ω0 ⊆ Vω0 . This implies ω0 ∈ P(Vω0 ) = V_{ω0+1} ⊆ Vγ , for all γ > ω0 .
We conclude that ω0 ∈ Vα ⊂ V for all α such that α > ω0 , as required.
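The finiteness of each Vn used in this proof comes from the recurrence |V0 | = 0 and |V_{n+1} | = |P(Vn )| = 2^|Vn| . The Python sketch below simply iterates this recurrence; it is an illustration, and the function name level_sizes is ours.

```python
def level_sizes(n):
    """Return [|V_0|, |V_1|, ..., |V_n|] using |V_0| = 0 and |V_{k+1}| = 2 ** |V_k|."""
    sizes = [0]
    for _ in range(n):
        sizes.append(2 ** sizes[-1])
    return sizes


sizes = level_sizes(5)
```

Each entry is a natural number, so every Vn with n ∈ ω0 is finite, although the sizes grow extremely fast: the next entry after 65536 is 2^65536. No finite level can therefore contain the infinite set ω0 as a subset, which is why ω0 first appears as an element at level ω0 + 1.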
Proof:
What we are given: Let S ∈ V and α be an ordinal such that S ∈ Vα . Let
M = {x ∈ S : φ(x)} ⊆ S.
What we are required to show: That M ∈ V . It suffices to show that M ∈ Vα .
Since S ∈ Vα , then rank(S) < α (by Theorem 32.8(b)).
Since M = {x ∈ S : φ(x)} ⊆ S, then rank(M ) ≤ rank(S) < α.
By part (d) of Theorem 32.8, Vrank(M ) ∈ Vα .
Since Vα is transitive, then Vrank(M ) ⊂ Vα .
Then V_{rank(M)+1} = P(V_{rank(M)} ) ⊂ Vα . (Since U ∈ P(V_{rank(M)} ) ⇒ U ⊂ V_{rank(M)} ⊂ Vα .)
Then M ⊆ Vrank(M ) ⇒ M ∈ P(Vrank(M ) ) ⇒ M ∈ Vα ⊂ V , as required.
Proposition 32.14 The set Vω0 satisfies the property described by the Ax-
iom of replacement.
368 Section 32: Cumulative hierarchy
Proof:
What we are given: That z ∈ Vω0 and f : z → V is a function mapping z
into V .
What we are required to show: That f[z] ∈ Vω0 .
Recall that Vω0 = ∪{Vn : n ∈ ω0 }. Since z ∈ Vω0 , then z ∈ Vm for some
natural number m. Since Vm is transitive, u ∈ z ⇒ u ∈ Vm ; so z ⊆ Vm and, since
Vm is finite, z is a finite set. Then f[z] is also a finite set.
Suppose f[z] = {a0 , a1 , a2 , . . . , ak }. We claim f[z] ⊂ Vω0 . (If so f[z] is a set.)
Since
ai ⊆ Vrank(ai ) ⊂ Vrank(ai ) + 1 ⊆ Vrank(aq ) + 1
so for i = 1 to k,
Axiom of choice. The Axiom of choice states that if A is a set of non-empty sets,
then there exists a function f : A → ∪{x : x ∈ A} which maps each element x of
A to an element yx ∈ x.
In the following proposition we show that, if the axiom of choice holds true
on V , then, whenever γ is a limit ordinal, the Axiom of choice also holds
true on Vγ .
x ∈ U , f(x) ∈ x.
Since γ is a limit ordinal and U ∈ Vγ = ∪{Vα : α ∈ γ}, then U ∈ Vβ for
some ordinal β ∈ γ. Since Vβ is transitive,
U ∈ Vβ ⇒ U ⊆ Vβ
x ∈ Vβ for each x ∈ U ⇒ x ⊆ Vβ for each x ∈ U
⇒
∪{x : x ∈ U } ⊆ Vβ
yx = f(x) ∈ Vβ ∈ Vβ+1 ∈ Vγ
x, yx ∈ Vβ ⇒ {x, yx} ⊆ Vβ
⇒ {x, yx} ∈ P(Vβ )
⇒ {x, yx} ∈ V_{β+1}
{x}, {x, yx} ∈ V_{β+1} ⇒ (x, yx) = {{x}, {x, yx}} ∈ P(V_{β+1} ) = V_{β+2}
Then,
So f ∈ Vγ as claimed.
We have shown that there exists a function f ∈ Vγ mapping U onto a set
f[U ] ∈ Vγ such that f(x) ∈ x for each x ∈ U , as required.
Suppose we don’t assume GCH. Then it may be the case, for example, that
where the cardinal number, ℵ1 , is the least ordinal larger than ℵ0 , such that
ℵ1 ≁e ℵ0 , the cardinal number, ℵ2 , is the least ordinal larger than ℵ1 such
that ℵ2 ≁e ℵ1 , and the cardinal number, ℵ3 , is the least ordinal larger than
ℵ2 such that ℵ3 ≁e ℵ2 . Will these “extra” sets, ℵ1 , ℵ2 , for example, be
present in some Vα ? We have
Concepts review:
1. How is the class of sets V = ∪α∈O Vα constructed?
2. The class {Vα : α ∈ O} has a strict linear ordering with respect to
which order relation?
3. The statement “The class {Vα : α ∈ O} contains all sets” is equiv-
alent to which ZFC-axiom?
4. What is the rank of a set U ?
Part IX: Choice, regularity and Martin’s axiom 371
EXERCISES
rank(x) = least{α ∈ O : x ⊆ Vα }
33 / Martin’s axiom
Abstract. In this section we define the “countable chain condition” on a
partially ordered set (P, ≤). We then define those subsets of (P, ≤) called
“filters”. We introduce an axiom which is independent of ZFC called Mar-
tin’s axiom, of particular interest when ¬CH is assumed. We then list a
few consequences of this axiom. NOTE: This chapter is presented as a
matter of interest and is intended mainly for readers well versed in
topological spaces and more advanced topics in real analysis.
33.1 Introduction.
At this point we have discussed nine basic set theory axioms we called the
ZF-axioms. To these nine axioms we have adjoined a tenth axiom called
the Axiom of choice. When viewed together, these axioms are referred to
as the ZFC-axioms or ZF+Choice. Most mathematicians view these ten ax-
ioms as constituting a solid and reliable foundation of mathematics, at least
for the time being. A few mathematicians or logicians, as well as certain
philosophers of mathematics, continue to investigate these axioms in healthy
attempts to identify what they consider to be some shortcomings or weak
points of the theory, occasionally questioning the validity of some of these.
And this is fine, since no one can prove that the ZFC-axioms will not, at
some point in time, lead to some contradiction. Many mathematicians inves-
tigate other axioms which are independent of these, as being useful tools to
prove certain mathematical statements which push or cross the mathemat-
ical boundaries established by ZFC. Examples of these are the Continuum
hypotheses axioms: CH, GCH, ¬CH and ¬GCH. The Continuum hypothesis
(CH) declares that the smallest cardinal number which is larger than the
countable cardinal ℵ0 is ℵ1 = 2^ℵ0 . The negation, ¬CH, of CH declares
that there is at least one uncountable cardinal ℵ1 such that ℵ0 < ℵ1 < 2^ℵ0 .
The Generalized continuum hypothesis (GCH) states that, for any cardinal
number ℵα , the smallest cardinal number which is greater than ℵα is
ℵ_{α+1} = 2^ℵα . Its negation declares that there are cardinal numbers ℵα such
that ℵα < ℵ_{α+1} < 2^ℵα .
In this section we will discuss another axiom called Martin’s axiom.1 This is
a slightly more advanced topic of set theory, since the understanding of some
proofs assumes some basic knowledge of topology on the part of the reader.
Before we state and describe this axiom, we must introduce two particu-
lar notions associated to partially ordered sets, (P, ≤). One is the countable
chain condition in P and the other refers to special subsets of P called filters.
The elements of τ (X) are referred to as the open subsets of X. See that τ (X)
is essentially the collection of all open subsets of [0, 1]. If we equip τ (X) with
the relation ⊆, then (τ (X), ⊆) is an example of a partially ordered set. In
what follows, ∧P denotes the minimal element of P , (if P has one).
Recall that a chain in a partially ordered set is a subset which is linearly
ordered. Suppose (P, ≤) is a partially ordered set which may or may not
contain ∧P (the minimal element of P ). Let A be a subset of P . We say
that A is an antichain in P if A is a subset of P in which no two elements
are comparable. We say that “A is a strong antichain in the partially ordered
set P ” if A simultaneously satisfies these three properties,
1) ∧P ∉ A (if ∧P exists),
2) for any pair of distinct elements u, v ∈ A, u and v are not comparable under ≤,
3) for any pair of distinct elements u, v ∈ A, there does not exist an element
r ∈ P (other than ∧P , if it exists) such that r ≤ u and r ≤ v.
That is, an antichain A is a “strong” antichain in P if no element of P lies
below both members of a pair of distinct elements of A. With this in mind, we
introduce the following concept.
Definition 33.1 Let (P, ≤) be a partially ordered set (which may or may
not contain ∧P ). If P contains no uncountable strong antichain, then (P, ≤)
is said to satisfy the countable chain condition. In this case, we say that (P, ≤)
satisfies the ccc or that (P, ≤) is a ccc partial order.
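The difference between an antichain and a strong antichain can be seen in a small finite example. The following sketch is an illustration only: the poset (the integers 2 to 30 ordered by divisibility, which has no minimal element ∧P) and the function names are assumptions chosen here.

```python
from itertools import combinations

def divides(a, b):
    """a <= b in the divisibility order: a divides b."""
    return b % a == 0

def is_antichain(A, leq):
    """No two distinct elements of A are comparable."""
    return all(not leq(u, v) and not leq(v, u)
               for u, v in combinations(sorted(A), 2))

def is_strong_antichain(A, P, leq):
    """An antichain in which no two distinct elements have a common lower bound in P."""
    if not is_antichain(A, leq):
        return False
    return all(not any(leq(r, u) and leq(r, v) for r in P)
               for u, v in combinations(sorted(A), 2))

P = range(2, 31)          # 1 is left out, so P has no least element
primes = {2, 3, 5, 7}
pair = {4, 6}
```

In this poset {4, 6} is an antichain which is not strong, since 2 lies below both 4 and 6, while {2, 3, 5, 7} is a strong antichain.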
A = {Uκ : κ < ℵα }
374 Section 33: Martin’s axiom
the set of all open intervals which do not contain their endpoints. We claim
that E is dense in (τ (X), ⊆). Let U be an element of τ (X). Then U is
a non-empty open set (read “a union of open intervals”). Since U is non-
empty, there is an x ∈ U such that x is not 0 or 1. Then there exists ε such
that (x − ε, x + ε) ⊂ (a, b) ⊆ U . (This is a fundamental property of the
real numbers.) Then, for every element U of τ (X), there is an element of
E which is a subset of U . So E is dense in τ (X) with respect to ⊆, as claimed.
Here is a trickier example (for good practice).
Example 2. Let (P, ≤) be a partially ordered set. If x ∈ P , we define x↓ as
follows:
x↓ = {y ∈ P : y ≤ x}
We say that a and b are compatible in P if a↓ ∩ b↓ contains some element of
1 More generally, in the case of a topological space (X, τ ), the ccc property of X can
be expressed as follows: If for any family F of pairwise disjoint open subsets of X, the
cardinality of F is less than or equal to ℵ0 , then (X, τ ) satisfies the ccc. So, for example, R
with the usual topology satisfies the ccc.
D = {D ∈ P(P ) : D is dense in P }
Theorem 33.4 The statement MA(ℵ0 ) holds true in ZFC. (MA(κ) where
κ = ℵ0 .)
Proof:
Suppose (P, ≤) is a non-empty partially ordered set. In what follows, the reader
will notice that the ccc property on (P, ≤) is not required for the Martin’s
ℵ0 -statement to hold true.
Let D be a family of dense subsets of P such that |D| ≤ ℵ0 . That is, there
are at most countably many dense subsets in D. Let a ∈ P .
Case 1: We consider the case where D is empty.
We are required to find a proper filter F ⊂ P such that F ∩ D ≠ ∅ for every
set D ∈ D.
The set, F = {x ∈ P : x ≥ a}, of all elements above a is a principal filter.
But D doesn’t contain any sets. So F intersects every element of D.
Then MA(ℵ0 ) holds true.
Case 2: We consider the case where 0 < |D| ≤ ℵ0 . So D contains at least
one dense subset. We can then enumerate the sets in D as
D = {D1 , D2 , D3 , . . . , }
We are required to find a proper filter F ⊂ P such that F ∩ Di ≠ ∅ for
every i ≥ 1.
Since, for each i, Di is dense in P , we can choose some d1 ∈ D1 such that
d1 ≤ a, d2 ∈ D2 such that d2 ≤ d1 ≤ a, and if dn ≤ dn−1 ≤ · · · ≤ d1 ≤ a
choose dn+1 ∈ Dn+1 such that
dn+1 ≤ dn ≤ dn−1 ≤ · · · ≤ d2 ≤ d1 ≤ a
We now let the set F ⊂ P be the one that contains every element of P which
lies above some di . That is, q ∈ F if and only if q ≥ di for some i ≥ 1. In
particular, F contains all of {di : i = 1, 2, 3, . . .}, as well as a.
We claim that F is a filter in P .
1) Clearly F is non-empty.
2) If b ∈ F and c is in P such that b ≤ c, then di ≤ b ≤ c for some i, and so c ∈ F .
3) If b, c belong to F , then di ≤ b and dj ≤ c for some i and j.
Without loss of generality, suppose i ≤ j. Then dj ≤ di ≤ b and dj ≤ c, where
dj ∈ F . Then F is a filter, as claimed.
Furthermore, F intersects every Di at di for each i.
So F is a filter which intersects every element of D. That is, if D is
countable, then F satisfies the condition for MA(ℵ0 ).
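The construction in Case 2 can be followed step by step in a concrete poset. The sketch below is an illustration under assumptions made here, not taken from the text: P is the set of finite binary strings, with p ≤ q when p extends q, and Di is the set of strings containing at least i ones (each Di is dense, since any string can be extended by appending ones). Only finitely many steps of the countable construction are carried out.

```python
def leq(p, q):
    """p <= q in the poset of finite binary strings: p extends q."""
    return p.startswith(q)

def in_D(i, p):
    """Membership in the dense set D_i: p contains at least i ones."""
    return p.count("1") >= i

def extend_into(i, q):
    """Return an element of D_i below q, by appending ones to q if needed."""
    missing = max(0, i - q.count("1"))
    return q + "1" * missing

def descending_sequence(a, n):
    """Choose d_1 >= d_2 >= ... >= d_n below a, with d_i in D_i."""
    ds, current = [], a
    for i in range(1, n + 1):
        current = extend_into(i, current)
        ds.append(current)
    return ds

def in_filter(q, ds):
    """q belongs to F when q lies above some d_i."""
    return any(leq(d, q) for d in ds)

ds = descending_sequence("010", 5)
```

With a = "010" the chosen elements are "010", "0101", "01011", "010111" and "0101111", and the elements above them are exactly their initial segments.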
Note how countability of D plays a role in the above theorem and how, if
D is an uncountable family, difficulties may arise.
U = {Ux : x ∈ X} ⊂ τ (X)
Dx = {D ∈ τ (X) : cl(D) ⊆ Ux }
The source of our contradiction is our supposition that MA(2^ℵ0 ) holds true.
We conclude that MA(2^ℵ0 ) does not hold true in ZFC.
The two theorems above show that the only Martin κ-statements which are
of interest are those where κ is such that ℵ0 ≤ κ < 2^ℵ0 .
We then state Martin’s axiom as follows.
Theorem 33.7 [MA] Suppose κ is an infinite cardinal such that κ < 2^ℵ0 . If
X is a Hausdorff compact space with ccc and {Uα : α ≤ κ} is a family of open
dense³ subsets of X then ∩{Uα : α ≤ κ} ≠ ∅.
4 In fact, its similarity to MA is such that some may want to refer to MA as an “Enhanced”
Baire category theorem. For the case, R with the usual topology, MA implies the Baire
category theorem.
Concepts review:
1. When does a partially ordered set satisfy the “countable chain con-
dition”?
2. What is an open subset of the closed interval X = [0, 1]?
3. What subset of P([0, 1]) does τ ([0, 1]) represent?
4. What is a strong antichain in a partially ordered set (P, ≤)?
5. What is a dense subset of a partially ordered set (P, ≤)?
6. What is a filter in a partially ordered set (P, ≤)? What is a proper
filter? What is a principal filter?
7. What does it mean to say that a family of subsets satisfies the “finite
intersection property”?
8. State the Martin’s κ-statement MA(κ).
9. State MA(κ) when it refers specifically to a power set (P, ⊆).
10. What can be said about MA(ℵ0 )?
11. What can be said about MA(2^ℵ0 )?
12. State Martin’s axiom, [MA].
13. State the Baire category theorem.
Part X
Ordinal arithmetic
Part X: Ordinal arithmetic 385
34 / Ordinal Addition
Abstract. In this section we define the operation of addition on the or-
dinal numbers. We then show its most basic properties and provide a few
examples.
Definition 34.1 Let (S, ≤S ) and (T, ≤T ) be two disjoint well-ordered sets.
We define the relation “≤S∪T ” on S ∪ T as follows:
Our next step should be a verification that this newly defined order relation
actually well-orders the union S ∪ T . We express this in the form of a theo-
rem, and leave the straightforward proof as an exercise.
Theorem 34.2 Let (S, ≤S ) and (T, ≤T ) be two disjoint well-ordered sets.
Then the relation ≤S∪T well-orders the set S ∪ T .
Proof: The proof is left as an exercise.
386 Section 34: Ordinal addition
We can now define addition of two ordinal numbers as the ordinality of the
union of disjoint well-ordered sets.
α + β = ord(S ∪ T, ≤S∪T )
(u, 0) ≤0 (v, 0) if u ≤S v
(u, 1) ≤1 (v, 1) if u ≤T v
Theorem 34.4 Let (S, ≤S ), (T, ≤T ) and (U, ≤U ), (V, ≤V ) be two pairs of
disjoint well-ordered sets such that
ord S = α = ord U
ord T = β = ord V
1 Addition can also be defined inductively as follows: For all α and β, (a) β + 0 = β, (b)
34.2 Examples.
Addition of natural numbers, when viewed as ordinals, should agree with re-
sults obtained when adding natural numbers the usual way. We have already
verified that this is the case for cardinal numbers. The following example
illustrates that this is the case for addition of natural numbers if viewed as
ordinals. Note that in the examples and theorem below, “<” represents the
“ordinal inclusion” order relation ∈.
a) Example. Determine the sum 3 + 7 when these natural numbers are
viewed as ordinals. Also determine the sum 7 + 3.
Solution: We see that 3 = ord {7, 8, 9} and 7 = ord {0, 1, 2, 3, 4, 5, 6}. The
choice of the natural numbers used is arbitrary. The chosen well-ordered
set representatives are disjoint. See that ord{7, 8, 9, 0, 1, 2, . . . , 6} = ord
{0, 1, 2, . . ., 9} since {7, 8, 9, 0, 1, 2, . . ., 6} (with the ordering defined on
unions) and {0, 1, 2, . . . , 9} (with the usual natural number ordering)
are order isomorphic. By definition,
3 + 7 = ord{7, 8, 9, 0, 1, 2, . . . , 6} = ord{0, 1, 2, . . . , 9} = 10
7 + 3 = ord{0, 1, 2, . . . , 9} = 10
b) Example. Determine both sums ω0 + 7 and 7 + ω0 .
Solution: So that we obtain disjoint well-ordered set representatives,
we will use
7 = ord {0, 1, 2, 3, 4, 5, 6}
ω0 = ord {7, 8, 9, 10, . . .}
Then, by definition,
7 + ω0 = ord {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, . . .} = ω0
ω0 + 7 = ord {7, 8, 9, 10, . . ., 0, 1, 2, 3, 4, 5, 6} = ω0 + 7
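The two examples above can be reproduced with a small computation. The sketch below is an illustration only and rests on an encoding chosen here: an ordinal of the form ω0·a + b, with a and b natural numbers, is stored as the pair (a, b). It therefore covers only ordinals below ω0·ω0.

```python
def add(x, y):
    """Ordinal sum of x = w*a + b and y = w*c + d, each encoded as a pair."""
    (a, b), (c, d) = x, y
    if c > 0:
        # The finite tail b of x is absorbed by the infinite part of y.
        return (a + c, d)
    # y is finite, so it simply lengthens the finite tail of x.
    return (a, b + d)

def nat(n):
    """The finite ordinal n."""
    return (0, n)

OMEGA = (1, 0)   # the ordinal w0

left = add(nat(7), OMEGA)    # 7 + w0
right = add(OMEGA, nat(7))   # w0 + 7
```

Here left is OMEGA itself, while right is (1, 7), so the two sums differ. For finite ordinals the function agrees with the usual addition of natural numbers.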
g) α + 0 = α
Remark: In part (f) of the above theorem we show that “left cancellation” on
addition applies just like for natural numbers. However “right cancellation”
2 Note that g[A∪ C] = B ∗ ∪ C need not be an initial segment of B ∪ C. So even if B ∗ ∪ C ⊂
Similarly, 3 + ω0 = ω0 . But 2 + ω0 = 3 + ω0 ⇏ 2 = 3.
(ω1 + 7) + ω1 = ω1 + (7 + ω1 )
= ω1 + ω1 = ω1 2
We provide another approach to addition, for cases where the second term
is a limit ordinal. Recall that given a non-empty subset A of a well-ordered
set W , an upper bound of A is any element u of W such that a ≤ u for all
a ∈ A. Suppose the element, s, is the least upper bound of A. That is, s is
an upper bound of A, and, for any upper bound u, s ≤ u. In this case we
write s = lub(A) (or sup A).
For example, the ordinal 5 = {0, 1, 2, 3, 4} has least upper bound, lub(5) = 4,
since 4 is greater than or equal to all of the elements of 5 and it is the least
of all upper bounds. The limit ordinal ω0 = {0, 1, 2, 3, . . ., } has as a least
upper bound, lub(ω0 ) = ω0 , itself. Note that in this case, lub(ω0 ) is not an
element of ω0 . In fact, α is a limit ordinal if and only if lub α = α ∉ α.
Another property characterizes limit ordinals. The ordinal γ is a limit ordinal
if and only if ∪α∈γ α = γ. In the case where γ has an immediate predecessor,
say, β, then γ = {0, 1, 2, 3, . . . , β} and so
∪α∈γ α = 0 ∪ 1 ∪ 2 ∪ · · · ∪ β = β
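For finite ordinals the last computation can be carried out directly. The following is a sketch only, using Python frozensets to model the ordinals 0 = ∅, 1 = {0}, 2 = {0, 1}, and so on; the names ordinal and union are chosen here for illustration.

```python
def ordinal(n):
    """The finite ordinal n = {0, 1, ..., n-1}, built by repeated successor."""
    result = frozenset()
    for _ in range(n):
        result = result | {result}   # successor: a+ = a ∪ {a}
    return result

def union(s):
    """The union of all the elements of s."""
    out = frozenset()
    for x in s:
        out = out | x
    return out

five = ordinal(5)
four = ordinal(4)
```

Since 5 has the immediate predecessor 4, the union of the elements of 5 is 4 and not 5, so 5 is not a limit ordinal.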
α + β = lub {α + γ : γ < β}
Proof:
We are given that β is a limit ordinal and α is any ordinal.
We are required to show that α + β is the least upper bound of the set
{α + γ : γ < β}.
We claim that α + β is an upper bound of the set {α + γ : γ < β}:
− For δ < β, by Theorem 34.5 part (e), α + δ < α + β. So α + β is an
upper bound of the set {α + γ : γ < β} as claimed.
We claim that α + β is the least such upper bound:
− Suppose δ is any upper bound of the set {α +γ : γ < β}. Then α +γ ≤ δ
for all γ < β. Suppose δ < α + β. Then for all γ, α + γ ≤ δ < α + β.
Then there exists a least ordinal µ ∈ β such that δ ≤ α + µ < α + β.
Since β is a limit ordinal, µ+ < β, and so δ < α + µ+ < α + β. This contradicts
the fact that α + γ ≤ δ for all γ < β. Then δ ≥ α + β. So α + β is the
least such upper bound of {α + γ : γ < β} as required.
Concepts review:
1. Given two disjoint well-ordered sets (S, ≤S ) and (T, ≤T ) define a
well-ordering on S ∪ T .
2. For any two ordinals α and β, how is α + β defined?
3. For which ordinals does the given property hold true?
a) (α + β) + γ = α + (β + γ)
b) α < α + γ
c) γ ≤ α + γ
d) α + γ ≤ β + γ
e) γ + α < γ + β
f) α + β = α + γ ⇒ β = γ
g) α + 0 = α
4. If β is a limit ordinal, simplify the expression sup {α + γ : γ < β}.
EXERCISES
B. 3. Suppose that α and δ are ordinals such that α ≤ δ. Show that there can
only be one ordinal β such that α + β = δ.
4. Compute or simplify the sum (50 + ω0 ) + (ω0 + ω1 ).
5. Show that if α is a finite ordinal and γ is a limit ordinal, then the least
upper bound of α + γ is γ.
6. Show that for any ordinal α and limit ordinal γ, α + γ is a limit ordinal.
7. Provide a concrete example of ordinals such that α < β and α + γ = β + γ
simultaneously hold true.
C. 8. Let (S, ≤S ) and (T, ≤T ) be two disjoint well-ordered sets. Show that the
relation ≤S∪T well-orders the set S ∪ T .
9. Let (S, ≤S ), (T, ≤T ) and (U, ≤U ), (V, ≤V ) be two pairs of disjoint well-
ordered sets such that
ord S = α = ord U
ord T = β = ord V
Definition 35.1 Let (S, ≤S ) and (T, ≤T ) be two well-ordered sets. We define
the lexicographic ordering on the Cartesian product S × T as follows:
(s1 , t1 ) ≤S×T (s2 , t2 ) provided either s1 <S s2 , or s1 = s2 and t1 ≤T t2 .
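The lexicographic ordering just defined translates directly into a comparison function. The sketch below is an illustration with S = {0, 1, 2} and T = {0, 1} under their usual orderings; all function names are chosen here and are not part of the text.

```python
from functools import cmp_to_key
from itertools import product

def lex_leq(p, q, lt_S, leq_T):
    """(s1, t1) <= (s2, t2) when s1 < s2, or s1 = s2 and t1 <= t2."""
    (s1, t1), (s2, t2) = p, q
    return lt_S(s1, s2) or (s1 == s2 and leq_T(t1, t2))

def leq(p, q):
    """Lexicographic ordering on pairs of integers."""
    return lex_leq(p, q, lambda a, b: a < b, lambda a, b: a <= b)

def least(subset):
    """Least element of a non-empty finite set of pairs."""
    return next(p for p in subset if all(leq(p, q) for q in subset))

def compare(p, q):
    """Three-way comparison built from leq, for use when sorting."""
    if p == q:
        return 0
    return -1 if leq(p, q) else 1

S, T = range(3), range(2)
ordered = sorted(product(S, T), key=cmp_to_key(compare))
```

The list ordered runs through (0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), and every non-empty subset of S × T has a least element, as a well-ordering requires.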
Theorem 35.2 Let (S, ≤S ) and (T, ≤T ) be two well-ordered sets. The lexi-
cographic ordering of the Cartesian product S × T is a well-ordering.
Theorem 35.3 If the well-ordered sets, S1 and S2 , are order isomorphic and
the well-ordered sets, T1 and T2 , are order isomorphic, then the lexicographi-
cally ordered Cartesian products, S1 × T1 and S2 × T2 , are order isomorphic.
394 Section 35: Ordinal multiplication and exponentiation
Proof:
We are given onto order isomorphisms f : S1 → S2 and g : T1 → T2 .
We are required to produce an onto order isomorphism h : S1 ×T1 → S2 ×T2 .
We define the function h : S1 × T1 → S2 × T2 as h(s, t) = (f(s), g(t)).
We show that h is a well-defined one-to-one function on S1 × T1 : Since
h(s, t) = h(a, b) ⇒ (f(s), g(t)) = (f(a), g(b))
⇒ f(s) = f(a) and g(t) = g(b)
⇒ s = a and t = b (Since both f and g are one-to-one.)
⇒ (s, t) = (a, b)
then h is one-to-one.
The function h is onto S2 × T2 : If (s, t) ∈ S2 × T2 , then, since f and g are
“onto” S2 and T2 respectively, s = f(a) and t = g(b) for some a ∈ S1 and
b ∈ T1 ; hence, h(a, b) = (s, t). Hence, h is onto S2 × T2 .
The function h respects the ordering of the sets:
(s1 , t1 ) ≤S×T (s2 , t2 ) ⇔ s1 <S s2 , or s1 = s2 and t1 ≤T t2
⇔ f(s1 ) <S f(s2 ), or
So h is an order isomorphism.
Definition 35.4 Let α and β be two ordinals with set representatives A and
B respectively. We define the multiplication, α × β, as:
α × β = ord(B × A)
The product, α × β, is equivalently written as, αβ, (respecting the order).
Note the order of the terms in the Cartesian product, B × A, is different from
the order, α × β, of their respective ordinalities.
2 × ω0 = ord(N × {0, 1})
= ord {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), . . ., (n, 0), (n, 1), . . .}
= ω0
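The equality with ω0 in this computation comes from an explicit order isomorphism between the lexicographically ordered set N × {0, 1} and N. A natural candidate is the map (n, i) ↦ 2n + i. The sketch below is an illustration only; it checks the claim on the first fifty values of n, and the name h is chosen here.

```python
from itertools import product

def h(pair):
    """Candidate order isomorphism from N x {0, 1} onto N."""
    n, i = pair
    return 2 * n + i

# Python compares tuples lexicographically, which matches the ordering on N x {0, 1}.
pairs = sorted(product(range(50), (0, 1)))
images = [h(p) for p in pairs]
```

The list images is 0, 1, 2, ..., 99 in increasing order, so on this initial segment h is one-to-one, onto, and respects the ordering.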
e) γ0 = 0
f) For any limit ordinal β ≠ 0, αβ = lub {αγ : γ < β}
We have considered all cases and see that the ordering is respected.
b) We are given that γ > 0 and α < β. We are required to show that
γα < γβ.
Since α < β, there exists an order isomorphism f : A → B mapping A
to an initial segment in B. It suffices to show that C × A is isomorphic
to an initial segment of C × B. Define the function g : C × A → C × B
as follows: g((c, a)) = (c, f(a)).
Since f is one-to-one into B, g is easily seen to be one-to-one.
We show that g respects the order:
⇒ δ < αγ
35.4 Examples.
When performing ordinal arithmetic, it is not always obvious how to simplify
expressions. In the following examples, we show how some of the properties
shown above can be used to simplify expressions.
a) By mathematical induction, show that for any ordinal α and finite ordinal
n > 0,
αn = α + α + · · · + α (n times)
Solution: Induction on the natural numbers. Let A be a set such that
ord A = α.
Let P (n) denote the statement “αn = α + α + · · · + α (n times)”
Base case: Since α1 = ord({0} × A) = ordA = α, P (1) holds true.
Inductive hypothesis: Suppose P (n) holds true. Then, by left-hand distribu-
tivity of ordinals, α(n + 1) = αn + α1 = αn + α = α + α + · · ·+ α (n + 1 times).
By mathematical induction αn = α + α + · · · + α (n times) for all non-zero
finite ordinals n.
b) Define α + α + α + · · ·+ (Countably infinite times) = lub {α, α + α, α + α + α, . . .}.
Show that for any α,
Solution:
αω0 = lub {αn : n < ω0 }
= lub {α, α2, α3, α4, . . .}
= lub {α, α + α, α + α + α, . . .}
= α+α+α+··· (Countably infinite times)
0 ≤ 0, 1, 2, 3, . . . < ω0
ω0 ≤ ω0 , ω0 + 1, ω0 + 2, ω0 + 3, . . . < ω0 + ω0
ω0·2 ≤ ω0·2, ω0·2 + 1, ω0·2 + 2, ω0·2 + 3, . . . < ω0·2 + ω0
ω0·3 ≤ ω0·3, ω0·3 + 1, . . . , ω0·4, . . . , ω0·5, . . . < ω0·ω0 = ω0^2
ω0·ω0 = ω0^2 ≤ ω0^2 , ω0^2 + 1, . . . , ω0^2 + ω0 , . . . , ω0^2 + ω0·ω0 = ω0^2 + ω0^2 = ω0^2·2
ω0^2·2 ≤ ω0^2·2, ω0^2·2 + 1, . . . , ω0^2·3, . . . , ω0^2·4, . . . , ω0^2·ω0 = ω0^3
ω0^3 ≤ ω0^3 , ω0^3 + 1, . . . , ω0^4 , . . . , ω0^5 , . . . , ω0^6 , . . . < ω0^ω0
ω0^ω0 ≤ ω0^ω0 , ω0^ω0 + 1, . . . , (ω0^ω0)^ω0 , . . . , ((ω0^ω0)^ω0)^ω0 , . . . < ω1 , . . .
Many of the ordinals listed above may seem incredibly large. In spite of this,
note that every one of these ordinals is countable!
Observe that we go from one limit ordinal to the next by adding a countably
infinite set, remembering (from Theorem 19.6), that adding a countably in-
finite set to some infinite set S does not change the cardinality of S. We see
this in more detail in the following table.
ω0 → ω0·2 → ω0·3 → · · · → ω0^2
→ ω0^2 + 1 → ω0^2 + 2 → · · · → ω0^2 + ω0
→ ω0^2 + ω0 + 1 → ω0^2 + ω0 + 2 → · · · → ω0^2 + ω0·2
→ ω0^2 + ω0^2 = ω0^2·2 → · · · → ω0^2·ω0 = ω0^3
→ ω0^3 + 1 → ω0^3 + 2 → · · · → ω0^4
Recall that all of these are elements of the first uncountable ordinal, ω1
(defined as the least ordinal which cannot be embedded in ω0 ). The ordinal
ω1 is uncountable. It is not constructed (other than being the union of all
countable ordinals), but its existence is guaranteed by Hartogs’ lemma (see
Theorems 28.9, 28.10 and 28.11).
Definition 35.6 Let γ be any non-zero ordinal. We define the γ-based
exponentiation function gγ : O → O as follows:
1) gγ (0) = 1
2) gγ (α+ ) = gγ (α)γ
3) gγ (α) = lub{gγ (β) : β < α} whenever α is a limit ordinal.
Whenever γ ≠ 0 we represent gγ (α) as γ^α . Then γ^(α+1) = γ^α γ. If γ = 0 we
define γ^α = 0^α = 0.
δ = µ ⇒ γ^δ = γ^µ
⇒ γ^δ (1) < γ^µ γ (By Theorem 35.5)
⇒ γ^δ < γ^(µ+1) = γ^β (By definition)
δ < µ ⇒ γ^δ < γ^µ (By induction hypothesis)
⇒ γ^δ < γ^µ (1) < γ^µ γ = γ^(µ+1) = γ^β
⇒ γ^δ < γ^β
( ⇐ ) Suppose γ^δ < γ^β . If β < δ, then by the first part above we have both
γ^β < γ^δ and γ^δ < γ^β , a contradiction. Then δ ≤ β. Since δ = β implies
γ^δ = γ^β , we must have δ < β, as required.
Proof:
a) Proof by transfinite induction. Let P (δ) denote the statement “γ^β γ^δ =
γ^(β+δ) ”. Then P (0) holds true. Suppose P (δ) holds true for all δ < α.
That is, suppose γ^β γ^δ = γ^(β+δ) whenever δ < α.
Case 1 : α is a successor ordinal. That is, α = µ + 1 for some ordinal µ.
Then
γ^β γ^α = γ^β γ^(µ+1)
= γ^β γ^µ γ
= γ^(β+µ) γ (By the inductive hypothesis)
= γ^((β+µ)+1)
= γ^(β+(µ+1))
= γ^(β+α)
P (δ) holds true for all δ < α. That is, suppose (γ^β)^δ = γ^(βδ) whenever
δ < α. As in part (a) consider the two cases, (1) α has an immediate
predecessor, and (2) α is a limit ordinal, separately.
The details are left as an exercise.
Concepts review:
1. When defining multiplication of ordinals, what kind of ordering is
used on the Cartesian product of the ordinals being multiplied?
2. Define the multiplication of two ordinals α and β.
3. How does the definition of ordinal multiplication compare with the
definition of cardinal multiplication?
4. Construct a set representation for the ordinal product ω0 2.
5. How does the ordinal ω0 3 compare with the ordinal 3ω0 ?
6. Is left-hand distribution acceptable in ordinal multiplication?
7. If we start with the countably infinite ordinal ω0 and gradually in-
crease its ordinality, one at a time, by an endless process of ordinal
addition and multiplication, can we reach ω1 with this process?
EXERCISES
C. 7. Prove that α1 = 1α = α.
8. Prove that if α and β are both finite ordinals, then αβ = βα.
9. Prove that if αβ = 0, then either α or β is 0.
10. Prove that γ(α + β) = γα + γβ.
Part XI
Appendix
Appendix A 405
A.1 Lattices.
Given a partially ordered set (P, ≤) there may be pairs of elements a, b in P
such that a and b do not have a common upper bound or a common lower
bound in P . Those partially ordered sets in which every pair of elements
has a greatest lower bound and a least upper bound play an important role in
mathematics. We refer to these sets as lattices.
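A finite example may help to separate posets which are lattices from those which are not. In the sketch below (an illustration only, with names chosen here) the divisors of 60, ordered by divisibility, form a lattice in which the least upper bound of a pair is its least common multiple and the greatest lower bound is its greatest common divisor. The poset {2, 3, 12, 18} under the same ordering is not a lattice, since 2 and 3 have the upper bounds 12 and 18 but no least one.

```python
from math import gcd

def divides(a, b):
    """a <= b in the divisibility order: a divides b."""
    return b % a == 0

def least_upper_bound(a, b, P, leq):
    """Least element of P above both a and b, or None if there is none."""
    uppers = [u for u in P if leq(a, u) and leq(b, u)]
    least = [u for u in uppers if all(leq(u, v) for v in uppers)]
    return least[0] if least else None

def greatest_lower_bound(a, b, P, leq):
    """Greatest element of P below both a and b, or None if there is none."""
    lowers = [l for l in P if leq(l, a) and leq(l, b)]
    greatest = [l for l in lowers if all(leq(m, l) for m in lowers)]
    return greatest[0] if greatest else None

def is_lattice(P, leq):
    """Every pair of elements has both a least upper and a greatest lower bound."""
    return all(least_upper_bound(a, b, P, leq) is not None
               and greatest_lower_bound(a, b, P, leq) is not None
               for a in P for b in P)

D60 = [d for d in range(1, 61) if 60 % d == 0]
Q = [2, 3, 12, 18]
```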
Examples of lattices.
A∧B = A∩B
A∨B = intX (clX (A ∪ B))
Hence Ro(X) is not closed with respect to the union, ∪, of finitely many
sets. The following statement confirms that (Ro(X), ⊆, ∨, ∩) is a complete
lattice in the partially ordered set (τ (X), ⊆).
A ∩ B = intX (cl X (A)) ∩ intX (clX (B)) = intX (clX (A ∩ B)) ∈ Ro(X)
The partially ordered set Ro(X) forms a base for a topology on X. Since
∅, X ∈ Ro(X) and Ro(X) is closed under intersections, ∩, then Ro(X)
forms a base for some topology. That is, Ro(X) generates some topology
τ ∗ (X) ⊆ τ (X). If (X, τ (X)) is assumed to be Hausdorff, τ (X) separates
points of X; it then easily follows that Ro(X) also separates points of X.
We have shown that (X, τ ∗ (X)) is Hausdorff on X.3
3 Given a topological space (X, τ (X)), the set (X, τ ∗ (X)) is referred to as the semiregu-
Proof :
a) Let F be a proper L-filter.
Let H = {M : M is a proper L-filter such that F ⊆ M }. We partially
order H with ⊆. Let C be a chain in (H , ⊆). Then ∪{C : C ∈ C } is an upper
bound of C with respect to ⊆. So every chain in H has an upper bound.
By Zorn’s lemma, (H , ⊆) has a maximal element. That is, H contains a
filter, F ∗ , which is not properly contained in any other filter. Since F ∗ ∈ H ,
F ⊆ F ∗ . Then F can be extended to an L-ultrafilter, as required.
H = {B ∈ L : A ∩ F ⊆ B for some F ∈ F }
Proof :
( ⇒ ) Suppose F is an Ro(X)-ultrafilter and A ∈ Ro(X). Let F ∈ F .
Case 1: If F ∩ A = ∅ then (since F is open) F ⊆ X − clX A. Since
X − clX A = intX (X − A)
= intX (X − intX clX A)
= intX clX (X − clX A)
F ⊆ X − clX A ∈ Ro(X) which implies X − clX A ∈ F .
Case 2: Suppose F ∩ A ≠ ∅ for all F ∈ F . Suppose A ∉ F . Let
H = {B ∈ Ro(X) : A ∩ F ⊆ B for some F ∈ F }. Then F ⊆ H and A ∈ H − F . As
shown in the theorem above, H is a filter base which generates a filter H ∗
in (Ro(X), ⊆). Since F ⊂ H ∗ this contradicts the fact that F is an
Ro(X)-ultrafilter. So A must belong to F .
( ⇐ ) Suppose that for any A ∈ Ro(X), either A or X − clX (A) belongs to F .
See that the filter F extends to an Ro(X)-ultrafilter F ∗ . Suppose A ∈ F ∗ . If
A ∉ F , then X − clX A ∈ F ⊆ F ∗ implying that A ∩ (X − clX A) = ∅ ∈ F ∗ ,
a contradiction. So A ∈ F . Then F ∗ ⊆ F which implies that F is an
Ro(X)-ultrafilter.
Definition 1.9 Suppose we are given two Boolean algebras (B1 , ≤1 , ∨1 , ∧1 , 0, 1, ′) and
(B2 , ≤2 , ∨2 , ∧2 , 0, 1, ′) and a function f which maps elements of B1 to elements
of B2 . We say that f : B1 → B2 is a Boolean homomorphism if, for any
x, y ∈ B1 ,
1) f(x ∨1 y) = f(x) ∨2 f(y),
2) f(x ∧1 y) = f(x) ∧2 f(y),
3) f(x′ ) = f(x)′ .
The function f : B1 → B2 is a Boolean isomorphism if f is a bijection, and
both f and f ← are Boolean homomorphisms.
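The three conditions can be checked mechanically when both Boolean algebras are power sets of finite sets, with ∪, ∩ and set complement as operations. The sketch below is an illustration under assumptions made here: X1, X2 and the map g are invented for the example, preimage sends a subset A of X1 to its inverse image under g, and constant_empty is a deliberately bad candidate.

```python
from itertools import chain, combinations

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = list(universe)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]

def is_boolean_homomorphism(f, X1, X2):
    """Check conditions 1), 2), 3) for f : P(X1) -> P(X2)."""
    B1 = powerset(X1)
    for x in B1:
        if f(X1 - x) != X2 - f(x):           # condition 3: complements
            return False
        for y in B1:
            if f(x | y) != f(x) | f(y):      # condition 1: joins
                return False
            if f(x & y) != f(x) & f(y):      # condition 2: meets
                return False
    return True

X1 = frozenset({0, 1})
X2 = frozenset({"a", "b", "c"})
g = {"a": 0, "b": 0, "c": 1}                 # a hypothetical function from X2 to X1

def preimage(A):
    """Inverse image of A under g."""
    return frozenset(p for p in X2 if g[p] in A)

def constant_empty(A):
    """Sends every subset to the empty set."""
    return frozenset()
```

The map preimage satisfies all three conditions. The map constant_empty respects ∨ and ∧ but fails condition 3, which shows that the condition on complements cannot be dropped.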
Topological representations.
Suppose (B, ≤, ∨, ∧, 0, 1, ′) is any Boolean algebra. Let X be some
topological space and B(X) be the Boolean algebra of all clopen subsets
of X. We will say that the Boolean algebra (B, ≤, ∨, ∧, 0, 1, ′)
has a topological representation if:
Proof :
Let BB = {fB (x) : x ∈ B}. To show that BB forms a base for a topology on
S (B) we must show two things: That BB covers all of S (B) and that BB
is closed under finite intersections.5
We first show that the sets in BB cover all of S (B). Since B-ultrafilters are
proper filters no ultrafilter can contain 0; then fB (0) = ∅ ∈ BB . Also, 1
5 See Theorem 5.4 of Point-set topology with topics by R. André.
412 Boolean algebras and Martin’s axiom
For a given Boolean algebra (B, ≤, ∨, ∧, 0, 1, ′), we can now speak of a
topological space (S (B), τ (S (B))) which is associated to B. Its elements are
B-ultrafilters. When equipped with this topology, the set S (B) is referred
to as the
Stone space
We now describe a few properties of the function fB which associates B to
subsets of the topological space S (B). We refer to this important theorem
as the Stone representation theorem.
Proof :
We are given that (B, ≤, ∨, ∧, 0, 1, ′) is a Boolean algebra.
1) We have shown in the previous theorem that fB (x ∧ y) = fB (x) ∩ fB (y).
We now show that fB (x ∨ y) = fB (x) ∪ fB (y): If F ∈ fB (x) ∪ fB (y), then
either x or y belongs to F . By definition of a filter, x ∨ y ∈ F , hence
F ∈ fB (x ∨ y). Then fB (x) ∪ fB (y) ⊆ fB (x ∨ y). On the other hand, if
F ∈ fB (x ∨ y), then x ∨ y ∈ F . By a property of B-ultrafilters described
above, either x or y belongs to F . Hence F either belongs to fB (x) or to
fB (y). Then fB (x ∨ y) ⊆ fB (x) ∪ fB (y). So fB (x ∨ y) = fB (x) ∪ fB (y).
x′ = x′ ∨ 0
= x′ ∨ (x ∧ y′ ) (since x ∧ y′ = 0)
= (x′ ∨ x) ∧ (x′ ∨ y′ ) (B is distributive)
= 1 ∧ (x′ ∨ y′ )
= x′ ∨ y′
⇒ y′ ≤ x′
⇒ x ≤ y
We have a contradiction. The source of the contradiction is our
supposition that x ∧ y′ = 0. So x − y ≠ 0 as claimed. Let F be the B-filter
{x − y}↑ generated by x − y ∈ B. Then F extends to a B-ultrafilter F ∗ .
Then F ∗ ∈ fB (x − y). But x and y′ are both above x − y = x ∧ y′ and
so must both belong to F ∗ . Then F ∗ ∈ fB (x) ∩ fB (y′ ). Then F ∗ cannot
belong to fB (y) (for if it did, y and y′ would both belong to F ∗ ). Then
fB (x) ≠ fB (y). We conclude that fB is one-to-one on B.
4) Let (S (B), τ (S (B))) be the topological space where τ (S (B)) is the
topology generated by the open base {fB (x) : x ∈ B}. A topological
space is zero-dimensional if it has an open base of clopen sets. We have
shown above that {fB (x) : x ∈ B} is a set of clopen subsets of S (B). So
S (B) is zero-dimensional.
The topological space (S (B), τ (S (B))) is Hausdorff: Let F1 and F2 be
Theorem 1.13 Let κ be an infinite cardinal number such that κ < 2^ℵ0 . Then
the following are equivalent:
1) (Martin’s axiom, MA) If (P, ≤) is a partially ordered set satisfying ccc and
D = {Dα : α ≤ κ} is a family of dense subsets of P , then there exists a
filter F on P such that F ∩ Dα ≠ ∅ for each α ≤ κ.
2) If X is a compact Hausdorff topological space satisfying ccc and D = {Dα :
α ≤ κ} is a family of dense open subsets of X, then ∩{Dα : α ≤ κ} ≠ ∅.
Proof :
We are given that κ is an infinite cardinal number such that κ < 2^ℵ0 .
(1 ⇒ 2): We begin with the trivial case. Suppose X is finite. If X =
{x1 , x2 , . . . , xn }, then, since X is Hausdorff, every singleton subset of X
is clopen and so the only dense subset of X is X. Hence the intersection of
all dense subsets of X is X ≠ ∅. We are done.
What we are given: Suppose now that X is an infinite set which is compact
Hausdorff. Let τ (X) denote the set of all non-empty open subsets
of X. Then (τ (X), ⊆) is a partially ordered set of subsets of X.
Suppose X does not contain an uncountable family of pairwise disjoint
open subsets of X. That is, suppose (τ (X), ⊆) satisfies the ccc. Suppose
D = {Dα : α ≤ κ} ⊆ τ (X) where each Dα is dense in X (i.e., cl(Dα ) = X).
For each α ≤ κ, we define Uα = {U ∈ τ (X) : cl(U ) ⊆ Dα }. (Since X
is compact Hausdorff and none of the Dα ’s are empty, none of the Uα ’s are
empty.) We are given that MA holds true.
We are required to prove that ∩{Dα : α ≤ κ} ≠ ∅.
Claim: That, for each α, Uα is a dense subset of the partially ordered set
(τ (X), ⊆).
Proof of claim: Suppose M ∈ τ (X). It suffices to show that there is an ele-
ment of Uα which is a subset of M . See that, for any α ≤ κ, M ∩Dα ∈ τ (X)
and so there exists an element x and open set U such that x ∈ U ⊆
cl(U ) ⊆ M ∩ Dα ⊆ Dα . Then U ∈ Uα . We have shown that an element of
Uα is a subset of M . So Uα is a dense subset of (τ (X), ⊆), as claimed.
The set E = {Uα : α ≤ κ} is then a family of dense subsets of (τ (X), ⊆)
satisfying ccc. By Martin’s axiom, (τ (X), ⊆) contains a filter F such that
F ∩ Uα ≠ ∅ for all α ≤ κ. For each α, choose Fα ∈ F ∩ Uα . Since
Fα ∈ Uα , then cl(Fα ) ⊆ Dα . Since F is a filter of non-empty open subsets
of X which satisfies the finite intersection property, then {cl(Fα ) : α ≤ κ}
satisfies the finite intersection property inside compact X. Then there must
be some a ∈ X such that a ∈ ∩{cl(Fα ) : α ≤ κ} ⊆ ∩{Dα : α ≤ κ}. Thus
∩{Dα : α ≤ κ} ≠ ∅. This is what we were required to prove.
(2 ⇒ 3): We are given that (B, ≤, ∨, ∧, 0 ) is a Boolean algebra with the ccc
property and D = {Dα : α ≤ κ} is a family of dense subsets of B.6 We are
6 Recall that “Dα is dense in B” means “if 0 < x ∈ B there exists d ∈ Dα such that
0 < d ≤ x”.
Proof :
( ⇒ ) Let X be a compact Hausdorff space which satisfies ccc where X
contains a family of dense open subsets of X, D = {Dα : α ≤ κ}. Since
X is compact, every point x ∈ X has a compact neighbourhood. By
hypothesis, ∩{Dα : α ≤ κ} is dense in X, so ∩{Dα : α ≤ κ} is not empty.
Then by (2 ⇔ 1) in the previous theorem, Martin’s axiom holds true.
( ⇐ ) Suppose Martin’s axiom holds true. Let X be a Haus-
dorff topological space satisfying ccc such that T = {x ∈ X :
x has a compact neighbourhood} is dense in X. Suppose that D = {Dα :
α ≤ κ} is a family of dense open subsets of X (where ℵ0 ≤ κ < 2^ℵ0 ). We
are required to show that ∩{Dα : α ≤ κ} is dense in X. For any non-empty
open subset U , there exists a point x ∈ U ∩ T and some open neighbourhood
S of x with compact closure, clX S, such that x ∈ clX (S ∩ U ) ⊆ clX S. Since
X satisfies ccc its compact subset clX (S ∩ U ) must also satisfy ccc. For any
Dα ∈ D, Dα ∩ U ∩ S is open and dense in clX (S ∩ U ). By the topological
equivalent form of MA, there exists q ∈ ∩{Dα ∩ U ∩ S : α ≤ κ}. Since
∩{Dα ∩ U ∩ S : α ≤ κ} ⊆ U ∩ (∩{Dα : α ≤ κ}), then q ∈ ∩{Dα : α ≤ κ} ∩ U .
Not only is ∩{Dα : α ≤ κ} non-empty, but it also intersects every open
subset U of X. So ∩{Dα : α ≤ κ} is dense in X.