Beauty in Mathematics
Version 2.0
Joseph Fields
Southern Connecticut State University
Acknowledgments
This is version 2.0 of A Gentle Introduction to the Art of Mathematics. Earlier versions were used and classroom tested by several
colleagues: Robert Vaden-Goad, John Kavanagh, Ross Gingrich.
I thank you all. A particular debt of gratitude is owed to Leon
Brin whose keen eyes caught a number of errors and inconsistencies, and who contributed many new exercises. Thanks, Len.
Contents
1 Introduction and notation
    1.1 Basic sets
        1.4.3 Divisibility
        1.4.6 Binomial coefficients
    1.5 Some algorithms
    1.7 Relations
    2.2 Implication
    2.3 Logical equivalences
    2.4 Two-column proofs
    2.5 Quantified statements
3 Proof techniques I
    3.4 Disproofs
4 Sets
    4.2 Containment
    6.1 Relations
    6.5 Functions
    7.1 Counting
8 Cardinality
    8.4 Dominance
References
Index
List of Figures
1.2 Pascal's triangle
3.4 A Z-module
To the student
You are at the right place in your mathematical career to be reading this
book if you liked Trigonometry and Calculus, were able to solve all the
problems, but felt mildly annoyed with the text when it put in these verbose,
incomprehensible things called "proofs." Those things probably bugged you
because a whole lot of verbiage (not to mention a sprinkling of epsilons and
deltas) was wasted on showing that a thing was true, which was obviously
true! Your physical intuition is sufficient to convince you that a statement
like the Intermediate Value Theorem just has to be true: how can a function
move from one value at a to a different value at b without passing through
all the values in between?
Mathematicians discovered something fundamental hundreds of years before
other scientists: physical intuition is worthless in certain extreme
situations. Probably you've heard of some of the odd behavior of particles
in Quantum Mechanics or General Relativity. Physicists have learned, the
hard way, not to trust their intuitions. At least, not until those intuitions
have been retrained to fit reality! Go back to your Calculus textbook and
look up the Intermediate Value Theorem. You'll probably be surprised to
find that it doesn't say anything about all functions, only those that are
continuous. So what, you say, aren't most functions continuous? Actually,
the number of functions that aren't continuous represents an infinity so huge
that it outweighs the infinity of the real numbers!
The point of this book is to help you with the transition from doing math [...]
[...] a creative, organic, visual, right-brain sort of process; however, in
communicating one's results one must find that linear, deductive,
step-by-step, left-brain argument. You must use your whole mind to master
advanced mathematics.
Also, there are amusing quotations at the start of every chapter.
FOR INSTRUCTORS
might be expected to learn the art of proof writing while actually writing
proofs in courses like algebra and analysis. Judging from the feedback I
receive from students who have completed our transitions course at Southern
Connecticut State University, I think such a return to the methods of the
past is unlikely. The benefits of these transitions courses are enormous, and
even though the curriculum for undergraduate Mathematics majors is an
extremely full one, the place of a transition course is, I think, assured.
What precisely are the benefits of these transitions courses? One of my
pet theories is that the process one goes through in learning to write and
understand proofs represents a fundamental reorganization of the brain. The
only evidence for this stance, albeit rather indirect, is the almost universal
reports of weird dreams from students in these courses. Our minds evolved
in a setting where inductive reasoning is not only acceptable, but advisable in
coping with the world. Imagine some Cro-Magnon child touching a burning
branch and being burned by it. S/he quite reasonably draws the conclusion
that s/he should not touch any burning branches, or indeed anything that is
on fire. A Mathematician has to train himself or herself to think strictly by
the rules of deductive reasoning: the above experience would only provide the
lesson that at that particular instant of time, that particular burning branch
caused a sensation of pain. Ideally, no further conclusions would be drawn.
Obviously this is an untenable method of reasoning for an animal driven
by the desire to survive to adulthood, but it is the only way to think in the
artificial world of Mathematics.
While a gentle introduction to the art of reading and writing proofs is the
primary focus of this text, there are other subsidiary goals for a transitions
course that we hope to address. Principal among these is the need for an
introduction to the culture of Mathematics. There is a shared mythos
and language common to all Mathematicians, although there are certainly
some distinct dialects! Another goal that is of extraordinary importance is
impressing the budding young Mathematics student with the importance of
play. My thesis adviser² used to be famous for saying "Well, I don't know!
Why don't you monkey around with it a little . . ." In the course of monkeying
around (doing small examples by hand, trying bigger examples with the
aid of a computer, changing some element of the problem to see how it
affected the answer, and various other activities that can best be described as
"play"), eventually patterns emerged, conjectures made themselves apparent,
and possible proof techniques suggested themselves. In this text there are
a great many open-ended problems, some with associated hints as to how
to proceed (which the wise student will avoid until hair-thinning becomes
evident), whose point is to introduce students to this process of mathematical
discovery.
To recap, the goals of this text are: an introduction to reading and writing
mathematical proofs, an introduction to mathematical culture, and an
introduction to the process of discovery in Mathematics. Two pedagogical
principles have been of foremost importance in determining how this material
is organized and presented. One is the so-called "rule of three," which is
probably familiar to most educators. Propounded by (among others)
Hughes-Hallett et al. in their reform Calculus, it states that, when possible,
information should be delivered via three distinct mechanisms: symbolically,
graphically and numerically. The other is also a rule of three of sorts; it
is captured by the old speechwriter's maxim "Tell 'em what you're gonna
tell 'em. Tell 'em. Then tell 'em what you told 'em." Important and/or
difficult topics are revisited at least three times in this book. In marked
contrast to the norm in Mathematics, the first treatment of a topic is not
rigorous; precise definitions are often withheld. The intent is to provide a
bit of intuition regarding the subject material. Another reason for providing
a crude introduction to a topic before giving rigorous detail revolves around
the way human memory works. Unlike computer memory, which (excluding
the effects of the occasional cosmic ray) is essentially perfect, animal memory
is usually imperfect and mechanisms have evolved to ensure that data that
are important to the individual are not lost. Repetition and rote learning
are often derided these days, but the importance of multiple exposures to a
concept in anchoring it in the mind should not be underestimated.

² Dr. Vera Pless, to whom I am indebted in more ways than I can express.
A theme that has recurred over and over in my own thinking about the
transitions course is that the transition is that from inductive to deductive
mental processes. Yet, often, we the instructors of these courses are ourselves so thoroughly ingrained with the deductive approach that the mode
of instruction presupposes the very transition we hope to facilitate! In this
book I have, to a certain extent, taken the approach of teaching deductive
methods using inductive ones. The first time a concept is encountered should
only be viewed as providing evidence that lends credence to some mathematical truth. Most concepts that are introduced in this intuitive fashion are
eventually exposited in a rigorous manner. There are exceptions, though:
ideas whose scope is beyond that of the present work which are nonetheless
presented here with very little concern for precision. It should not be forgotten that a good transition ought to blend seamlessly into whatever follows.
The courses that follow this material should be proof-intensive courses in
geometry, number theory, analysis and/or algebra. The introduction of some
material from these courses without the usual rigor is intentional.
Please resist the temptation to fill in the missing "proper" definitions and
terminology when some concept is introduced and is missing those, uhmm,
missing things. Give your students the chance to ruminate, to chew³ on
these new concepts for a while on their own! Later we'll make sure they get
the same standard definitions that we all know and cherish. As a practical
matter, if you spend more than 3 weeks in Chapter 1, you are probably filling
in too much of that missing detail, so stop it. It really won't hurt them to
think in an imprecise way (at first) about something so long as we get them
to be rigorous by the end of the day.

³ Why is it that most of the metaphorical ways to refer to thinking actually
seem to refer to eating?
Finally, it will probably be necessary to point out to your students that
they should actually read the text. I don't mean to be as snide as that
probably sounds . . . Their experiences with math texts up to this point have
probably impressed them with the futility of reading: just see what kind
of problems are assigned and skim 'til you find an example that shows you
how to do one like that. Clearly such an approach is far less fruitful in
advanced study than it is in courses which emphasize learning calculational
techniques. I find that giving explicit reading assignments and quizzing
them on the material that they are supposed to have read helps. There are
"exercises" given within most sections (as opposed to the "Exercises" that
appear at the end of the sections); these make good fodder for quizzes and/or
probing questions from the professor. The book is written in an expansive,
friendly style with whimsical touches here and there. Some students have
reported that they actually enjoyed reading it!⁴

⁴ Although it should be added that they were making that report to someone from [...]
Chapter 1
Introduction and notation
"Wisdom is the quality that keeps you from getting into situations where you
need it."
- Doug Larson
1.1 Basic sets
It has been said¹ that "God invented the integers, all else is the work of
Man." This is a mistranslation. The term "integers" should actually be
"whole numbers." The concepts of zero and negative values seem to many
to be unnatural constructs. Indeed, otherwise intelligent people are still
known to rail against the concept of a negative quantity: "How can you
have negative three apples?" The concept of zero is also incredibly profound.
Probably most will agree that the natural numbers are a natural construct.
We will take as given that you know what the natural numbers are:
the numbers we use to count things. Traditionally, the natural numbers are
denoted $\mathbb{N}$.

\[ \mathbb{N} = \{1, 2, 3, 4, \ldots\} \]

¹ Usually attributed to Kronecker: "Die ganze Zahl schuf der liebe Gott, alles Übrige
ist Menschenwerk."
Perhaps the best way of saying what a set is, is to do as we have above:
list all the elements. Of course, if a set has an infinite number of things in
it, this is a difficult task, so we satisfy ourselves by listing enough of the
elements that the pattern becomes clear.
Taking $\mathbb{N}$ for granted, what is meant by the "all else" that humankind
is responsible for? The basic sets of different types of numbers that every
mathematics student should know are: $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$.
Respectively: the naturals, the integers, the rationals, the reals, and the
complex numbers. The use of $\mathbb{N}$, $\mathbb{R}$ and $\mathbb{C}$ is probably clear to an
English speaker. The integers are denoted with a $\mathbb{Z}$ because of the German
word zählen which means "to count." The rational numbers are probably
denoted using $\mathbb{Q}$, for quotients.
Etymology aside, is it possible for us to provide precise descriptions of the
remaining sets?
The integers ($\mathbb{Z}$) are just the set of natural numbers together with the
negatives of naturals and zero. We can use a doubly infinite list to denote
this set.

\[ \mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\} \]
To describe the rational numbers precisely we'll have to wait until
Section 1.6. In the interim, we can use an intuitively appealing, but somewhat
imprecise definition for the set of rationals. A rational number is a fraction
built out of integers. This also provides us with a chance to give an example
of using the main other way of describing the contents of a set: so-called
set-builder notation.

\[ \mathbb{Q} = \left\{ \frac{a}{b} \;\middle|\; a \in \mathbb{Z} \text{ and } b \in \mathbb{Z} \text{ and } b \neq 0 \right\} \]
Let's parse the entire mathematical sentence we've been discussing with
an English translation in parallel.

    $\mathbb{Q}$            the rational numbers
    $=$                     are defined to be²
    $\{$                    the set of all
    $\frac{a}{b}$           fractions of the form a over b
    $\mid$                  such that
    $a \in \mathbb{Z}$      a is an element of the integers
    and                     and
    $b \in \mathbb{Z}$      b is an element of the integers
    and                     and
    $b \neq 0$              b is nonzero.

² Some Mathematicians contend that only the "equality test" meaning of the equals
sign is real, that by writing the mathematical sentence above we are asserting the truth
of the equality test. This may be technically correct but it isn't how most people think of
things.
³ Although it was not published until 1736, Newton's book (De Methodis Serierum et
Fluxionum) describing both differential and integral Calculus was written in 1671.
pure Mathematician, believed that every real quantity was in fact rational, a
belief that we now know to be false. The numbers $\pi$ and $\sqrt{2}$ mentioned above
are not rational numbers. For the moment it is useful to recall a practical
method for distinguishing between rational numbers and real quantities that
are not rational: consider their decimal expansions. If the reader is
unfamiliar with the result to which we are alluding, we urge you to experiment.
Use a calculator or (even better) a computer algebra package to find the decimal
expansions of various quantities. Try $\pi$, $\sqrt{2}$, 1/7, 2/5, 16/17, 1/2 and a few
other quantities of your own choice. Given that we have already said that
the first two of these are not rational, try to determine the pattern. What is
it about the decimal expansions that distinguishes rational quantities from
reals that aren't rational?
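
One way to carry out the experiment for the rational quantities is sketched below in Python. The function name `decimal_expansion` and its return format are illustrative assumptions, not something taken from the text. It performs ordinary long division on a positive fraction and stops as soon as a remainder repeats (or becomes zero). It cannot handle quantities such as $\pi$ or $\sqrt{2}$, which are not given as a ratio of integers.

```python
def decimal_expansion(numerator, denominator):
    """Long division for a positive fraction.

    Returns (integer_part, non_repeating_digits, repeating_digits),
    where the two digit strings follow the decimal point.
    """
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position in `digits` where it first appeared
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    if remainder == 0:
        # The division came out exactly: a terminating expansion.
        return integer_part, "".join(digits), ""
    start = seen[remainder]  # the digits from here on repeat forever
    return integer_part, "".join(digits[:start]), "".join(digits[start:])


print(decimal_expansion(1, 7))   # (0, '', '142857')
print(decimal_expansion(2, 5))   # (0, '4', '')
print(decimal_expansion(1, 6))   # (0, '1', '6')
```

Trying a few more fractions by hand alongside the function is a good way to see which behaviors are possible.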
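Given that we can't give a precise definition of a real number at this point
it is perhaps surprising that we can define the set $\mathbb{C}$ of complex numbers with
precision (modulo the fact that we define them in terms of $\mathbb{R}$).

\[ \mathbb{C} = \{a + bi \mid a \in \mathbb{R} \text{ and } b \in \mathbb{R} \text{ and } i^2 = -1\} \]

    $a + bi$                expressions of the form a plus b times i
    $\mid$                  such that
    $a \in \mathbb{R}$      a is an element of the reals
    and                     and
    $b \in \mathbb{R}$      b is an element of the reals
    and                     and
    $i^2 = -1$              i squared equals negative one.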
We sometimes denote a complex number using a single variable (by convention,
either late alphabet Roman letters or Greek letters). Suppose that
we've defined $z = a + bi$. The single letter $z$ denotes the entire complex
number. We can extract the individual components of this complex number
by talking about the real and imaginary parts of $z$. Specifically, $\mathrm{Re}(z) = a$
is called the real part of $z$, and $\mathrm{Im}(z) = b$ is called the imaginary part of $z$.
involving the variable coming first, but this is just a convention. The sum of
those binomials would be $4 - 4x$ and so the sum of the given complex numbers
is $4 - 4i$. [...] $3 - 2x$ and $4 + 3x$ is
$(3 \cdot 4) + (3 \cdot 3x) + (-2x \cdot 4) + (-2x \cdot 3x)$ [...] $-1$.
\[
\begin{aligned}
(3 - 2i) \cdot (4 + 3i) &= (3 \cdot 4) + (3 \cdot 3i) + (-2i \cdot 4) + (-2i \cdot 3i) \\
&= 12 + 9i - 8i - 6i^2 \\
&= 12 + i + 6 \\
&= 18 + i
\end{aligned}
\]
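
The computation above can be checked mechanically. The sketch below is only an illustration: `multiply_complex` is a hypothetical helper that expands the product of $a + bi$ and $c + di$ like a product of binomials and then uses $i^2 = -1$. Python's built-in complex type (which writes $i$ as `j`) is used as an independent check.

```python
def multiply_complex(a, b, c, d):
    """Return (real, imaginary) parts of (a + bi)(c + di).

    Expanding gives ac + adi + bci + bd*i^2, and i^2 = -1,
    so the real part is ac - bd and the imaginary part is ad + bc.
    """
    real = a * c - b * d
    imaginary = a * d + b * c
    return real, imaginary


print(multiply_complex(3, -2, 4, 3))  # (18, 1)
print((3 - 2j) * (4 + 3j))            # (18+1j)
```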
The real numbers have a natural ordering, and hence, so do the other
sets that are contained in $\mathbb{R}$. The complex numbers can't really be put into
a well-defined order: which should be bigger, 1 or $i$? But we do have a
way to, at least partially, accomplish this task. The modulus of a complex
number is a real number that gives the distance from the origin ($0 + 0i$) of
the complex plane, to a given complex number. We indicate the modulus
using absolute value bars, and you should note that if a complex number
happens to be purely real, the modulus and the usual notion of absolute
value coincide. If $z = a + bi$ is a complex number, then its modulus, $|a + bi|$,
is given by the formula $\sqrt{a^2 + b^2}$.
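
As a small illustration of the formula, the hypothetical helper `modulus` below computes $\sqrt{a^2 + b^2}$ directly. Python's built-in `abs` applied to a complex number computes the same quantity, and for a purely real number the result agrees with the ordinary absolute value.

```python
import math


def modulus(a, b):
    """Distance from the origin to a + bi in the complex plane."""
    return math.sqrt(a * a + b * b)


print(modulus(3, 4))         # 5.0
print(abs(3 + 4j))           # 5.0
print(modulus(-7, 0), abs(-7))  # 7.0 7
```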
Several of the sets of numbers we've been discussing can be split up based
on the so-called trichotomy property: every real number is either positive,
negative or zero. In particular, $\mathbb{Z}$, $\mathbb{Q}$ and $\mathbb{R}$ can have modifiers stuck on so
that we can discuss (for example) the negative real numbers, or the positive
rational numbers or the integers that aren't negative. To do this, we put
superscripts on the set symbols, either a $+$, a $-$, or "noneg."
So
\[ \mathbb{Z}^{+} = \{x \in \mathbb{Z} \mid x > 0\} \]
and
\[ \mathbb{Z}^{-} = \{x \in \mathbb{Z} \mid x < 0\} \]
and
\[ \mathbb{Z}^{\mathrm{noneg}} = \{x \in \mathbb{Z} \mid x \geq 0\}. \]
Presumably, we could also use "nonpos" as a superscript to indicate
nonpositive integers, but this never seems to come up in practice. Also, you
should note that $\mathbb{Z}^{+}$ is really the same thing as $\mathbb{N}$, but that
$\mathbb{Z}^{\mathrm{noneg}}$ is different because it contains 0.
We would be remiss in closing this section without discussing the way the
sets of numbers we've discussed fit together. Simply put, each is contained
in the next. $\mathbb{N}$ is contained in $\mathbb{Z}$, $\mathbb{Z}$ is contained in $\mathbb{Q}$,
$\mathbb{Q}$ is contained in $\mathbb{R}$, and $\mathbb{R}$ is contained in $\mathbb{C}$.
Geometrically the complex numbers are essentially
a two-dimensional plane. The real numbers sit inside this plane just as the
x-axis sits inside the usual Cartesian plane; in this context you may hear
people talk about "the real line within the complex plane." It is probably
clear how $\mathbb{N}$ lies within $\mathbb{Z}$, and every integer is certainly a real number. The
intermediate set $\mathbb{Q}$ (which contains the integers, and is contained by the reals)
has probably the most interesting relationship with the set that contains it.
Think of the real line as being solid, like a dark pencil stroke. The rationals
are like sand that has been sprinkled very evenly over that line. Every point
on the line has bits of sand nearby, but not (necessarily) on top of it.
Exercises 1.1
1. Each of the quantities indexing the rows of the following table is in one
or more of the sets which index the columns. Place a check mark in a
table entry if the quantity is in the set.

                  $\mathbb{N}$   $\mathbb{Z}$   $\mathbb{Q}$   $\mathbb{R}$   $\mathbb{C}$
    17
    22/7
    6
    $e^0$
    $1 + i$
    $\sqrt{3}$
    $i^2$

[...] 2, 2/5, 16/17, 3, 1/2 and 42/100. Classify each of these quantities'
decimal expansions as: terminating, having a repeating pattern, or showing
no discernible pattern.
7. Consider the process of long division. Does this algorithm give any insight as to why rational numbers have terminating or repeating decimal
expansions? Explain.
8. Give an argument as to why the product of two rational numbers is
again a rational.
9. Perform the following computations with complex numbers
(a) $(4 + 3i) - (3 + 2i)$
(b) $(1 + i) + (1 - i)$
(c) $(1 + i) \cdot (1 - i)$
(d) $(2 - 3i) \cdot (3 - 2i)$
1.2
You may have noticed that in Section 1.1 an awful lot of emphasis was placed
on whether we had good, precise definitions for things. Indeed, more than
once apologies were made for giving imprecise or intuitive definitions. This
is because, in Mathematics, definitions are our lifeblood. More than in any
other human endeavor, Mathematicians strive for precision. This precision
comes with a cost: Mathematics can deal with only the very simplest of
phenomena⁴. To laypeople who think of math as being a horribly difficult
subject, that last sentence will certainly sound odd, but most professional
Mathematicians will be nodding their heads at this point. Hard questions
are more properly dealt with by Philosophers than by Mathematicians. Does
a cat have a soul? Impossible to say, because neither of the nouns in that
question can be defined with any precision. Is the square root of 2 a rational
number? Absolutely not! The reason for the certainty we feel in answering
this second question is that we know precisely what is meant by the phrases
"square root of 2" and "rational number."
We often need to first approach a topic by thinking visually or intuitively,
but when it comes to proving our assertions, nothing beats the power of
having the "right" definitions around. It may be surprising to learn that the
"right" definition often evolves over the years. This happens for the simple
reason that some definitions lend themselves more easily to proving
assertions. In fact, it is often the case that definitions are inspired by
attempts to prove something that fail. In the midst of such a failure, it isn't
uncommon for a Mathematician to bemoan "If only the definition of (fill in the
blank) were . . . ", then to realize that it is possible to use that definition
or a modification of it. But! When there are several definitions for the same
idea they [...]

⁴ For an intriguing discussion of this point, read Gian-Carlo Rota's book Indiscrete
Thoughts [14].
We'll begin in the third century B.C. Eratosthenes of Cyrene was a Greek
Mathematician and Astronomer who is remembered to this day for his many
accomplishments. He was a librarian at the great library of Alexandria. He
made measurements of the Earth's circumference and the distances of the
Sun and Moon that were remarkably accurate, but probably his most
remembered achievement is the "sieve" method for finding primes. Indeed, the
sieve of Eratosthenes is still of importance in mathematical research.
Basically, the sieve method consists of creating a very long list of natural
numbers and then crossing off all the numbers that aren't primes (a positive
integer that isn't 1, and isn't a prime is called composite). This process is
carried out in stages. First we circle 2 and then cross off every number that
has 2 as a factor; thus we've identified 2 as the first prime number and
eliminated a whole bunch of numbers that aren't prime. The first number that
hasn't been eliminated at this stage is 3; we circle it (indicating that 3 is
the second prime number) and then cross off every number that has 3 as a factor.
Note that some numbers (for example, 6 and 12) will have been crossed off
more than once! In the third stage of the sieve process, we circle 5, which
is the smallest number that hasn't yet been crossed off, and then cross off
all multiples of 5. The first three stages in the sieve method are shown in
Figure 1.1.
It is interesting to note that the sieve gives us a means of finding all the
primes up to $p^2$ by using the primes up to (but not including) $p$. For example,
to find all the primes less than $13^2 = 169$, we need only use 2, 3, 5, 7 and 11
in the sieve.
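
The staged procedure described above translates almost directly into code. The following is a minimal sketch, not the only way to organize a sieve. The names `sieve_of_eratosthenes` and `survivors` are illustrative assumptions. The second function sieves with a fixed list of primes only, which lets us check the remark about using 2, 3, 5, 7 and 11 to find every prime below 169.

```python
def sieve_of_eratosthenes(limit):
    """Return the list of primes p with 2 <= p <= limit."""
    crossed_off = [False] * (limit + 1)
    primes = []
    for n in range(2, limit + 1):
        if crossed_off[n]:
            continue
        primes.append(n)  # "circle" n: it is the next prime
        for multiple in range(2 * n, limit + 1, n):
            crossed_off[multiple] = True  # cross off numbers with n as a factor
    return primes


def survivors(sieving_primes, limit):
    """Numbers from 2 to limit left standing after sieving with the given primes."""
    return [n for n in range(2, limit + 1)
            if n in sieving_primes or all(n % p != 0 for p in sieving_primes)]


print(sieve_of_eratosthenes(50))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(survivors([2, 3, 5, 7, 11], 168) == sieve_of_eratosthenes(168))  # True
```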
Despite the fact that one can find primes using this simple mechanical
method, the way that prime numbers are distributed amongst the integers
is very erratic. Nearly any statement that purports to show some regularity
in the distribution of the primes will turn out to be false. Here are two such
false conjectures regarding prime numbers.
Figure 1.1: The first three stages in the sieve of Eratosthenes. What is the
smallest composite number that hasn't been crossed off?
Conjecture 1. Whenever $p$ is a prime number, $2^p - 1$ is also a prime.
Conjecture 2. The polynomial $x^2$ [...]
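
Conjecture 1 is easy to put to the test by machine. The sketch below assumes a simple trial-division helper, here called `is_prime` (a hypothetical name). It evaluates $2^p - 1$ for the first few primes $p$ and reports whether each value is prime.

```python
def is_prime(n):
    """Trial division: n is prime if n >= 2 and no d with d*d <= n divides it."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


for p in [2, 3, 5, 7, 11, 13]:
    candidate = 2 ** p - 1
    print(p, candidate, is_prime(candidate))
```

The line for $p = 11$ is the interesting one: $2^{11} - 1 = 2047 = 23 \cdot 89$.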
[Table of prime numbers, with rows labeled 0 through 49 under the column
headings T and H; the individual entries are not recoverable.]
Exercises 1.2
1. Find the prime factorizations of the following integers.
(a) 105
(b) 414
(c) 168
(d) 1612
(e) 9177
2. Use the sieve of Eratosthenes to find all prime numbers up to 100.
  1   2   3   4   5   6   7   8   9  10
 11  12  13  14  15  16  17  18  19  20
 21  22  23  24  25  26  27  28  29  30
 31  32  33  34  35  36  37  38  39  40
 41  42  43  44  45  46  47  48  49  50
 51  52  53  54  55  56  57  58  59  60
 61  62  63  64  65  66  67  68  69  70
 71  72  73  74  75  76  77  78  79  80
 81  82  83  84  85  86  87  88  89  90
 91  92  93  94  95  96  97  98  99 100
3. What would be the largest prime one would sieve with in order to find
all primes up to 400?
    $p$     $2^p - 1$    prime?    factors
    2       3            yes       1 and 3
    3       7            yes       1 and 7
    5       31           yes
    7       127
    11
5. Characterize the prime factorizations of numbers that are perfect squares.
6. Find a counterexample for Conjecture 2.
7. Use the second definition of prime to see that 6 is not a prime.
In other words, find two numbers (the a and b that appear in the
definition) such that 6 is not a factor of either, but is a factor of their
product.
8. Use the second definition of prime to show that 35 is not a prime.
9. A famous conjecture that is thought to be true (but for which no proof
is known) is the Twin Prime conjecture. A pair of primes is said to be
twin if they differ by 2. For example, 11 and 13 are twin primes, as
are 431 and 433. The Twin Prime conjecture states that there are an
infinite number of such twins. Try to come up with an argument as to
why 3, 5 and 7 are the only prime triplets.
10. Another famous conjecture, also thought to be true but as yet unproved,
is Goldbach's conjecture. Goldbach's conjecture states that
every even number greater than 4 is the sum of two odd primes. There
is a function $g(n)$, known as the Goldbach function, defined on the positive
integers, that gives the number of different ways to write a given
number as the sum of two odd primes. For example $g(10) = 2$ since
$10 = 5 + 5 = 7 + 3$. Thus another version of Goldbach's conjecture is
that $g(n)$ is positive whenever $n$ is an even number greater than 4.
Graph $g(n)$ for $6 \leq n \leq 20$.
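
For readers who want to check their hand computations, here is one possible sketch of the Goldbach function in Python. The names `is_prime` and `goldbach` are assumptions made for this illustration, and the order of the two primes is ignored, so that $7 + 3$ and $3 + 7$ count once, in line with the example $g(10) = 2$.

```python
def is_prime(n):
    """Trial division test for primality."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def goldbach(n):
    """Number of ways to write n as p + q with p <= q both odd primes."""
    count = 0
    for p in range(3, n // 2 + 1, 2):      # odd candidates p with p <= n - p
        q = n - p
        if q % 2 == 1 and is_prime(p) and is_prime(q):
            count += 1
    return count


print(goldbach(10))  # 2, from 3 + 7 and 5 + 5
```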
1.3
It is often the case that we want to prove statements that assert something
is true for every element of a set. For example, "Every number has an additive
inverse." You should note that the truth of that statement is relative,
it depends on what is meant by "number." If we are talking about natural
numbers it is clearly false: 3's additive inverse isn't in the set under
consideration. If we are talking about integers or any of the other sets we've
considered, the statement is true. A statement that begins with the English
words "every" or "all" is called universally quantified. It is asserted that the
statement holds for everything within some universe. It is probably clear
that when we are making statements asserting that a thing has an additive
inverse, we are not discussing human beings or animals or articles of clothing;
we are talking about objects that it is reasonable to add together: numbers
of one sort or another. When being careful (and we should always strive to
be careful!) it is important to make explicit what universe (known as the
universe of discourse) the objects we are discussing come from. Furthermore,
we need to distinguish between statements that assert that everything in the
universe of discourse has some property, and statements that say something
about a few (or even just one) of the elements of our universe. Statements
of the latter sort are called existentially quantified.
Adding to the glossary or translation lexicon we started earlier, there
are symbols which describe both these types of quantification. The symbol
$\forall$, an upside-down A, is used for universal quantification, and is usually [...]

\[ \forall x \in \mathbb{Z}, \; \exists y \in \mathbb{Z}, \; x + y = 0. \]
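Parsing this as we have done before with an English translation in parallel,
we get:

    $\forall x$             for every number x
    $\in \mathbb{Z}$        in the set of integers
    $\exists y$             there is a number y
    $\in \mathbb{Z}$        in the set of integers
    $x + y = 0$             such that x plus y equals zero.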
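
Nested quantifiers correspond closely to nested loops. As a rough illustration only, Python's `all` plays the role of a universal quantifier and `any` the role of an existential one, provided the universe of discourse is finite. The finite "windows" below are stand-ins chosen for this sketch, since a program cannot loop over all of the integers.

```python
def every_element_has_additive_inverse(universe):
    """Check: for all x in universe, there exists y in universe with x + y = 0."""
    return all(any(x + y == 0 for y in universe) for x in universe)


integers_window = range(-5, 6)   # -5, ..., 5: symmetric about zero
naturals_window = range(1, 6)    # 1, ..., 5: no zero, no negatives

print(every_element_has_additive_inverse(integers_window))  # True
print(every_element_has_additive_inverse(naturals_window))  # False
```

The second result echoes the earlier remark that the statement is false when "number" means natural number.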
Exercises 1.3
1. How many quantifiers (and what sorts) are in the following sentence?
"Everybody has some friend that thinks they know everything about
a sport."
2. The sentence "Every metallic element is a solid at room temperature."
is false. Why?
3. The sentence "For every pair of (distinct) real numbers there is another
real number between them." is true. Why?
4. Write your own sentences containing four quantifiers. One sentence in
which the quantifiers appear ($\forall\exists\forall\exists$) and another in which they appear
($\exists\forall\exists\forall$).
1.4
1.4.1
If you divide a number by 2 and it comes out even (i.e. with no remainder)
the number is said to be even. So the word even is related to division. It
turns out that the concept even is better understood through thinking about
multiplication.
Definition. An integer $n$ is even exactly when there is an integer $m$ such
that $n = 2m$.
You should note that there is a "two-way street" sort of quality to this
definition; indeed with most, if not all, definitions. If a number is even, then
we are guaranteed the existence of another integer half as big. On the other
hand, if we can show that another integer half as big exists, then we know
the original number is even. This two-wayness means that the definition is
what is known as a biconditional, a concept which we'll revisit in Section 2.2.
A lot of people don't believe that 0 should be counted as an even number.
Now that we are armed with a precise definition, we can answer this question
easily. Is there an integer $x$ such that $0 = 2x$? Certainly! Let $x$ also be 0.
(Notice that in the definition, nothing was said about $m$ and $n$ being distinct
from one another.)
An integer is odd if it isn't even. That is, amongst integers, there are only
two possibilities: even or odd. We can also define oddness without reference
to "even."
Definition. An integer $n$ is odd exactly when there is an integer $m$ such that
$n = 2m + 1$.
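
Both definitions assert that a certain integer $m$ exists, so a program can do more than answer yes or no: it can exhibit the $m$. The helper below, `parity_witness`, is a hypothetical name used only for this sketch. It relies on Python's `divmod`, which for a divisor of 2 always returns a remainder of 0 or 1, even when $n$ is negative.

```python
def parity_witness(n):
    """Return ("even", m) with n == 2*m, or ("odd", m) with n == 2*m + 1."""
    m, remainder = divmod(n, 2)
    if remainder == 0:
        return ("even", m)
    return ("odd", m)


print(parity_witness(0))    # ('even', 0)
print(parity_witness(7))    # ('odd', 3)
print(parity_witness(-3))   # ('odd', -2), since -3 == 2*(-2) + 1
```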
1.4.2
You can also identify even numbers by considering their decimal representation. Recall that each digit in the decimal representation of a number has
a value that depends on its position. For example, the number 3482 really
means 3 103 + 4 102 + 8 101 + 2 100 . This is also known as place notation.
The fact that we use the powers of 10 in our place notation is probably due to
the fact that most humans have 10 fingers. It is possible to use any number
in place of 10. In Computer Science there are 3 other bases in common use:
2, 8 and 16 these are known (respectively) as binary, octal and hexadecimal notation. When denoting a number using some base other than 10, it is
customary to append a subscript indicating the base. So, for example, 10112
is binary notation meaning 1 23 + 0 22 + 1 21 + 1 20 or 8 + 2 + 1 = 11. No
matter what base we are using, the rightmost digit of the number multiplies
the base raised to the 0-th power. Any number raised to the 0-th power is
1, and the rightmost digit is consequently known as the units digit. We are
now prepared to give some statements that are equivalent to our definition of
even. These statements truly dont deserve the designation theorem, they
are immediate consequences of the definition.
Theorem 1.4.1. An integer is even if the units digit in its decimal representation is one of 0, 2, 4, 6 or 8.
Theorem 1.4.2. An integer is even if the units digit in its binary representation is 0.
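Place notation and the two units-digit tests are easy to experiment with. The Python sketch below is only an illustration under our own naming (from_digits and units_digit are hypothetical helpers, not notation from the text); it rebuilds a number from its digits in a given base and reads off the units digit.

```python
def from_digits(digits, base):
    """Value of a digit list (most significant digit first) in the given base."""
    total = 0
    power = len(digits) - 1
    for digit in digits:
        # Each digit multiplies the base raised to the power of its position.
        total += digit * base ** power
        power -= 1
    return total


def units_digit(n, base):
    """Rightmost digit of |n| when written in the given base."""
    return abs(n) % base
```

For instance, from_digits([3, 4, 8, 2], 10) gives back 3482 and from_digits([1, 0, 1, 1], 2) gives 11, the two examples worked above. Checking units_digit(n, 10) against 0, 2, 4, 6, 8 and units_digit(n, 2) against 0 agrees with the definition of even for every integer we have tried.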
For certain problems it is natural to use some particular notational system. For example, the last two theorems would tend to indicate that binary
numbers are useful in problems dealing with even and odd. Given that
there are many different notations that are available to us, it is obviously
are pronounced "dec" and "el") are necessary since we need single symbols
for the things we ordinarily denote using 10 and 11.
Converting from some other base to decimal is easy. You just use the
definition of place notation. For example, to find what $451663_7$ represents in
decimal, just write
1.4.3
Divisibility
As was the case in defining "even," it turns out that it is best to think of
multiplication, not division, when making a formal definition of this concept.
Given any two integers n and d we define the symbol d | n by
Definition. d | n exactly when ∃k ∈ ℤ such that n = kd.
In spoken language the symbol d | n is translated in a variety of ways:
d is a divisor of n.
d divides n evenly.
d is a factor of n.
n is an integer multiple of d.
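The definition of d | n translates almost word for word into code. Here is a small Python sketch (the function name divides is our own choice); the only case needing separate thought is d = 0, where n = k · 0 can hold only if n is itself 0.

```python
def divides(d: int, n: int) -> bool:
    """True when there is an integer k with n == k * d."""
    if d == 0:
        # n = k * 0 forces n = 0, and then any k works.
        return n == 0
    # For nonzero d, a suitable k exists exactly when the remainder is zero.
    return n % d == 0
```

So divides(3, 12) and divides(-4, 12) are both True, divides(5, 12) is False, and divides(7, 0) is True because 0 = 0 · 7.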
1.4.4
y − 1 < x ≤ y.
Basically, the definition of floor says that y is an integer that is less than
or equal to x, but y + 1 definitely exceeds x. The definition of ceiling can be
paraphrased similarly.
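Both paraphrases can be checked against Python's built-in math.floor and math.ceil. This is a sketch only: is_floor and is_ceiling are hypothetical helper names, and the ceiling condition y − 1 < x ≤ y is the usual characterization, which we assume is the one intended here.

```python
import math


def is_floor(y, x):
    """y is the floor of x: an integer with y <= x < y + 1."""
    return isinstance(y, int) and y <= x < y + 1


def is_ceiling(y, x):
    """y is the ceiling of x: an integer with y - 1 < x <= y."""
    return isinstance(y, int) and y - 1 < x <= y
```

For x = 2.5 the floor is 2 and the ceiling is 3; for x = −2.5 they are −3 and −2; and when x is already an integer the floor and ceiling coincide with x itself.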
1.4.5
In the next section we'll discuss the so-called division algorithm. This may
be overkill since you certainly already know how to do division! Indeed, in
the U.S., long division is usually first studied in the latter half of elementary
school, and division problems that don't involve a remainder may be found
as early as the first grade. Nevertheless, we're going to discuss this process
in sordid detail because it gives us a good setting in which to prove relatively
easy statements. Suppose you are setting up a long division problem in which
the integer n is being divided by a positive divisor d. (If you want to divide
by a negative number, just divide by the corresponding positive number and
then throw an extra minus sign on at the end.)
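[Long-division layout: the quotient q is written above the dividend n, the divisor d to its left, and the remainder r at the bottom.]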
Recall that the answer consists of two parts, a quotient q, and a remainder
r. Of course, r may be zero, but also, the largest r can be is d − 1. The
n ≡ m (mod d).
If one is in a context in which it is completely clear what d is, it's acceptable to just write n ≡ m.
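To experiment with this notation, here is a Python sketch. It assumes the usual meaning of n ≡ m (mod d), namely that n and m leave the same remainder on division by d, or equivalently that d | (n − m); both function names are our own.

```python
def is_congruent(n: int, m: int, d: int) -> bool:
    """True when d divides n - m (d is assumed to be a positive integer)."""
    if d <= 0:
        raise ValueError("the modulus d is assumed to be positive")
    return (n - m) % d == 0


def same_remainder(n: int, m: int, d: int) -> bool:
    """True when n and m leave the same remainder on division by d > 0."""
    return n % d == m % d
```

On a 12-hour clock, for example, is_congruent(17, 5, 12) is True: 17 o'clock and 5 o'clock sit at the same spot on the dial. The two tests agree for every triple we have tried.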
computations modulo some number d (this is known as modular arithmetic or, sometimes, "clock arithmetic"), some very nice properties of "mod"
come in handy:
1.4.6
Binomial coefficients
$(a + b)^0 = 1$
$(a + b)^1 = a + b$
$(a + b)^2 = a^2 + 2ab + b^2$
To go much further than the second power requires a bit of work, but try
the following.
Exercise. Multiply $(a + b)$ and $(a^2 + 2ab + b^2)$ in order to determine $(a + b)^3$.
If you feel up to it, multiply $(a^2 + 2ab + b^2)$ times itself in order to find $(a + b)^4$.
Since we're interested in the coefficients of these polynomials, it's important to point out that if no coefficient appears in front of a term that means
the coefficient is 1.
These binomial coefficients can be placed in an arrangement known as
Pascal's triangle⁵, which provides a convenient way to calculate small binomial coefficients.
        1
       1 1
      1 2 1
     1 3 3 1
    1 4 6 4 1
Figure 1.2: The first 5 rows of Pascal's triangle (which are numbered 0
through 4 . . . ).
5
This triangle was actually known well before Blaise Pascal began to study it, but it
and that the numbers on the inside of the triangle are the sum of the two
numbers above them. You can use these facts to extend the triangle.
Exercise. Add the next two rows to the Pascal triangle in Figure 1.2.
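The rule for extending the triangle (1's on the outside, each inside entry the sum of the two entries above it) is mechanical enough to hand to a computer. The following Python sketch, with a function name of our own choosing, builds as many rows as you ask for.

```python
def pascal_rows(count):
    """Return the first `count` rows of Pascal's triangle (rows 0 .. count - 1)."""
    rows = []
    row = [1]
    for _ in range(count):
        rows.append(row)
        # Interior entries are sums of adjacent pairs from the row above;
        # a 1 is placed at each end.
        interior = [row[i] + row[i + 1] for i in range(len(row) - 1)]
        row = [1] + interior + [1]
    return rows
```

Calling pascal_rows(5) reproduces the five rows of Figure 1.2. It is worth doing the exercise by hand first and only then using the function to check your work.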
Binomial coefficients are denoted using a somewhat strange looking symbol. The number
in the k-th position in row number n of the triangle is
denoted $\binom{n}{k}$. This looks a little like a fraction, but the fraction bar is missing.
Don't put one in! It's supposed to be missing.
In spoken English you
say "n choose k" when you encounter the symbol $\binom{n}{k}$.
There is a formula for the binomial coefficients, which is nice. Otherwise
we'd need to complete
a pretty huge Pascal triangle in order to compute
something like $\binom{52}{5}$. The formula involves factorial notation. Just to be
sure we are all on the same page, we'll define factorials before proceeding.
The symbol for factorials is an exclamation point following a number.
This is just a short-hand for expressing the product of all the numbers up
to a given one. For example 7! means $1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7$. Of course, there's
really no need to write the initial 1; also, for some reason, people usually
write the product in decreasing order ($7! = 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2$).
The formula for a binomial coefficient is
$$\binom{n}{k} = \frac{n!}{k! \cdot (n - k)!}.$$
For example
$$\binom{5}{3} = \frac{5!}{3! \cdot (5 - 3)!} = \frac{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5}{(1 \cdot 2 \cdot 3) \cdot (1 \cdot 2)} = 10.$$
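A slightly more complicated example (and one that gamblers are fond of)
is
$$\binom{52}{5} = \frac{52!}{5! \cdot (52 - 5)!} = \frac{1 \cdot 2 \cdot 3 \cdots 52}{(1 \cdot 2 \cdot 3 \cdot 4 \cdot 5) \cdot (1 \cdot 2 \cdot 3 \cdots 47)} = \frac{48 \cdot 49 \cdot 50 \cdot 51 \cdot 52}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5} = 2598960.$$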
The reason that a gambler might be interested in the number we just calculated is that binomial coefficients do more than just give us the coefficients
in the expansion of a binomial. They also can be used to compute how many
ways one can choose a subset of a given size from a set. Thus $\binom{52}{5}$ is the
number of ways that one can get a 5 card hand out of a deck of 52 cards.
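The factorial formula is simple to try out. In the Python sketch below, binomial is our own name for a direct transcription of the formula, while math.comb is the standard library's built-in version (available in Python 3.8 and later); the two should always agree.

```python
from math import comb, factorial


def binomial(n: int, k: int) -> int:
    """Binomial coefficient from the formula n! / (k! (n - k)!), for 0 <= k <= n."""
    # The quotient is always a whole number, so integer division is exact.
    return factorial(n) // (factorial(k) * factorial(n - k))
```

Here binomial(5, 3) returns 10 and binomial(52, 5) returns 2598960, matching the two hand computations above, and comb(52, 5) gives the same count of 5 card hands.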
Exercise. There are seven days in a week. In how many ways can one choose
a set of three days (per week)?
Exercises 1.4
1. An integer n is doubly-even if it is even, and the integer m guaranteed
to exist because n is even is itself even. Is 0 doubly-even? What are
the first 3 positive, doubly-even integers?
2. Dividing an integer by two has an interesting interpretation when using
binary notation: simply shift the digits to the right. Thus, $22 = 10110_2$
when divided by two gives $1011_2$ which is 8 + 2 + 1 = 11. How can you
recognize a doubly-even integer from its binary representation?
3. The octal representation of an integer uses powers of 8 in place notation.
The digits of an octal number run from 0 to 7; one never sees 8's or 9's.
How would you represent 8 and 9 as octal numbers? What octal number
comes immediately after $777_8$? What (decimal) number is $777_8$?
4. One method of converting from decimal to some other base is called
repeated division. One divides the number by the base and records
the remainder; one then divides the quotient obtained by the base
and records the remainder. Continue dividing the successive quotients
by the base until the quotient is smaller than the base. Convert 3267
to base-7 using repeated division. Check your answer by using the
meaning of base-7 place notation. (For example $54321_7$ means
$5 \cdot 7^4 + 4 \cdot 7^3 + 3 \cdot 7^2 + 2 \cdot 7^1 + 1 \cdot 7^0$.)
octal binary
0000
0001
0010
000
001
B
C
D
E
F
10. Suppose that 340 pounds of sand must be placed into bags having a
50 pound capacity. Write an expression using either floor or ceiling
notation for the number of bags required.
11. True or false? $\left\lfloor \frac{n}{d} \right\rfloor < \left\lceil \frac{n}{d} \right\rceil$
d 2 e?
13. Assume the symbols n, d, q and r have meanings as in the quotient-remainder theorem (Theorem 1.4.3). Write expressions for q and r, in
terms of n and d, using floor and/or ceiling notation.
14. Calculate the following quantities:
(a) 3 mod 5
(b) 37 mod 7
(c) 1000001 mod 100000
(d) 6 div 6
(e) 7 div 6
(f) 1000001 div 2
15. Calculate the following binomial coefficients:
(a) $\binom{3}{0}$
(b) $\binom{7}{7}$
(c) $\binom{13}{5}$
(d) $\binom{13}{8}$
(e) $\binom{52}{7}$
16. An ice cream shop sells the following flavors: chocolate, vanilla, strawberry, coffee, butter pecan, mint chocolate chip and raspberry. How
many different bowls of ice cream with three scoops can they make?
1.5
as the slow advance whereby computers have become able to utilize more and more abstracted descriptions of algorithms. Perhaps in the not-too-distant future machines will
be capable of understanding instruction sets that currently require human interpreters.
Assignment statements
If-then control statements
Goto statements
Return
We take the view that an algorithm is something like a function: it takes
for its input a list of parameters that describe a particular case of some general problem, and produces as its output a solution to that problem. (It
should be noted that there are other possibilities: some programs require
that the variable in which the output is to be placed be handed them as an
input parameter; others have no specific output, their purpose being achieved as
a side effect.) The intermediary between input and output is the algorithm
instructions themselves and a set of so-called local variables which are used
much the way scrap paper is used in a hand calculation: intermediate calculations are written on them, but they are tossed aside once the final answer
has been calculated.
Assignment statements allow us to do all kinds of arithmetic operations
(or rather to think of these types of operations as being atomic). In actuality
even a simple procedure like adding two numbers requires an algorithm of
sorts, but we'll avoid such a fine level of detail. Assignments consist of evaluating
some (possibly quite complicated) formula in the inputs and local variables
and assigning that value to some local variable. The two uses of the phrase
"local variable" in the previous sentence do not need to be distinct, thus
x = x + 1 is a perfectly legal assignment.
If-then control statements are decision makers. They first calculate a
Boolean expression (this is just a fancy way of saying something that is either
true or false), and send program flow to different locations depending on
that result. A small example will serve as an illustration. Suppose that in
[Flowchart: the decision "Is x equal to y?" has a Yes branch leading to the box "x = x + 1" and a No branch that bypasses it.]
If x = y then
    Let x = x + 1.
End If
⋮
Division
Inputs: integers n and d.
Local variables: q and r.
Let q = 0.
Let r = n.
Label 1.
If r < d then
    Return q and r.
End If
Let q = q + 1.
Let r = r − d.
Goto 1.
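For readers who would like to run the division algorithm, here is one way it might look in Python. This is a sketch rather than a definitive translation: Python has no Goto, so the jump back to Label 1 becomes a while loop, and we assume (as the pseudocode silently does) that n is nonnegative and d is positive.

```python
def division(n: int, d: int):
    """Return (q, r) with n == q * d + r and 0 <= r < d, by repeated subtraction."""
    if n < 0 or d <= 0:
        raise ValueError("this sketch assumes n >= 0 and d > 0")
    q = 0
    r = n
    # Loop while the pseudocode's test "r < d" fails.
    while r >= d:
        q = q + 1
        r = r - d
    return q, r
```

For example, division(17, 5) returns (3, 2), since 17 = 3 · 5 + 2, and division(4, 7) returns (0, 4) without ever entering the loop. Python's built-in divmod gives the same answers for such inputs.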
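[Flowchart of the division algorithm: start with "Let q = 0 and r = n."; the decision "Is r ≥ d?" leads on Yes to "Let r = r − d." and "Let q = q + 1." followed by a Goto back to the decision, and on No to "Return: q & r".]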
$$\mathrm{lcm}(a, b) = \frac{ab}{\gcd(a, b)},$$
Euclidean
Inputs: integers a and b.
Local variables: q and r.
Label 1.
Let (q, r) = Division(a, b).
If r = 0 then
    Return b.
End If
Let a = b.
Let b = r.
Goto 1.
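The Euclidean algorithm can be run the same way. In this Python sketch (function names our own) the call to Division is replaced by the built-in divmod, the Goto becomes a loop, and both inputs are assumed to be positive integers. The second function uses the relation lcm(a, b) = ab / gcd(a, b).

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two positive integers, by the Euclidean algorithm."""
    while True:
        q, r = divmod(a, b)  # plays the role of Division(a, b)
        if r == 0:
            return b
        a = b
        b = r


def lcm(a: int, b: int) -> int:
    """Least common multiple via lcm(a, b) = a * b / gcd(a, b)."""
    return a * b // gcd(a, b)
```

With inputs 30 and 42 the remainders produced are 30, 12, 6 and then 0, so gcd(30, 42) returns 6 and lcm(30, 42) returns 210. The order of the inputs does not matter: the first pass through the loop simply swaps them when a is the smaller one.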
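[Flowchart of the Euclidean algorithm: the decision "Is r = 0?" leads on Yes to "Return: b" and on No to "Let a = b." and "Let b = r." followed by a Goto back to the start.]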
quite easily by considering their factorizations into primes. For the moment
consider numbers that factor into primes but not into prime powers (that
is, their factorizations don't involve exponents). The gcd is the product of
the primes that are in common between these factorizations (if there are no
primes in common it is 1). The lcm is the product of all the distinct primes
that appear in the factorizations. As an example, consider 30 and 42. The
factorizations are $30 = 2 \cdot 3 \cdot 5$ and $42 = 2 \cdot 3 \cdot 7$. The primes that are
common to both factorizations are 2 and 3, so $\gcd(30, 42) = 2 \cdot 3 = 6$.
The set of all the primes that appear in either factorization is {2, 3, 5, 7} so
$\mathrm{lcm}(30, 42) = 2 \cdot 3 \cdot 5 \cdot 7 = 210$.
The technique just described is of little value for numbers having more
than about 50 decimal digits because it rests a priori on the ability to find
the prime factorizations of the numbers involved. Factoring numbers is easy
enough if they're reasonably small, especially if some of their prime factors
are small, but in general the problem is considered so difficult that many
cryptographic schemes are based on it.
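For small numbers the factorization technique is easy to carry out by machine. The Python sketch below uses helper names of our own and handles only the restricted case discussed above, in which no prime appears more than once in either factorization. It finds the primes by trial division, then multiplies the common primes for the gcd and all the distinct primes for the lcm.

```python
from math import prod


def prime_factors(n: int) -> set:
    """Set of primes dividing n (n >= 2), found by trial division."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)  # whatever remains is itself prime
    return factors


def gcd_squarefree(a: int, b: int) -> int:
    """gcd as the product of shared primes (assumes no repeated prime factors)."""
    return prod(prime_factors(a) & prime_factors(b))


def lcm_squarefree(a: int, b: int) -> int:
    """lcm as the product of all distinct primes (same assumption)."""
    return prod(prime_factors(a) | prime_factors(b))
```

For 30 and 42 the shared primes are {2, 3} and the combined set is {2, 3, 5, 7}, giving 6 and 210. When no primes are shared the product over the empty set is 1, as it should be. Trial division is exactly the step that becomes hopeless for very large numbers.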
Exercises 1.5
1. Trace through the division algorithm with inputs n = 27 and d = 5;
each time an assignment statement is encountered, write it out. How
many assignments are involved in this particular computation?
2. Find the gcd's and lcm's of the following pairs of numbers.
a     b     gcd(a, b)   lcm(a, b)
110   273
105   42
168   189
3. Formulate a description of the gcd of two numbers in terms of their
prime factorizations in the general case (when the factorizations may
include powers of the primes involved).
4. Trace through the Euclidean algorithm with inputs a = 3731 and
b = 2730; each time the assignment statement that calls the division
algorithm is encountered, write out the expression a = qb + r. (With
the actual values involved!)
1.6
When we first discussed the rational numbers in Section 1.1 we gave the
following definition, which is slightly flawed.
$$\mathbb{Q} = \left\{ \frac{a}{b} \mid a \in \mathbb{Z} \text{ and } b \in \mathbb{Z} \text{ and } b \neq 0 \right\}$$
and $\frac{14}{28}$ are distinct things that appear in the set defined above, but we
all know that they both represent the rational number $\frac{1}{2}$. To eliminate this
problem with our definition of the rationals we need to add an additional
condition that ensures that such duplicates don't arise. It turns out that
what we want is for the numerators and denominators of our fractions to
have no factors in common. Another way to say this is that the a and b
from the definition above should be chosen so that gcd(a, b) = 1. A pair of
numbers whose gcd is 1 are called relatively prime.
We're ready, at last, to give a good, precise definition of the set of rational
numbers. (Although it should be noted that we're not quite done fiddling
around; an even better definition will be given in Section 6.3.)
$$\mathbb{Q} = \left\{ \frac{a}{b} \mid a, b \in \mathbb{Z} \text{ and } b \neq 0 \text{ and } \gcd(a, b) = 1 \right\}.$$
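The effect of the extra condition gcd(a, b) = 1 can be seen concretely. In the Python sketch below, lowest_terms is a hypothetical helper of our own that divides out the common factor, and fractions.Fraction is the standard library's rational number type, which stores every fraction in lowest terms.

```python
from fractions import Fraction
from math import gcd


def lowest_terms(a: int, b: int):
    """Return the pair (a/g, b/g) where g = gcd(a, b); b must be nonzero."""
    if b == 0:
        raise ZeroDivisionError("the denominator must be nonzero")
    g = gcd(a, b)
    return a // g, b // g
```

Thus lowest_terms(14, 28) returns (1, 2), and Fraction(14, 28) compares equal to Fraction(1, 2): once common factors are removed, the duplicate representations collapse to a single one.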
a, b ∈ ℤ
b ≠ 0
and
gcd(a, b) = 1
result but we won't; we'll save this proof for the student to do later (heh,
heh, heh. . . ). These sorts of intermediate results, things that don't deserve to
be called theorems themselves, but that aren't entirely self-evident, are known
as lemmas. It is often the case that in an attempt at proving a statement we
find ourselves in need of some small fact. Perhaps it even seems to be true
but it's not clear. In such circumstances, good form dictates that we first
state and prove the lemma, then proceed on to our theorem and its proof.
So, here, without its proof, is the lemma we'll need.
Lemma 1.6.2. If the square of an integer is even, then the original integer
is even.
Given that thoroughness demands that we fill in this gap by actually
proving the lemma at a later date, we can now proceed with the proof of our
theorem.
Proof:
$\sqrt{2}$ is a rational number.
$$2 = \frac{a^2}{b^2}.$$
$$a^2 = 2b^2$$
Q.E.D.
Exercises 1.6
1. Rational Approximation is a field of mathematics that has received
much study. The main idea is to find rational numbers that are very
good approximations to given irrationals. For example, 22/7 is a well-known rational approximation to $\pi$. Find good rational approximations
to $\sqrt{2}$, $\sqrt{3}$, $\sqrt{5}$ and e.
2. The theory of base-n notation that we looked at in sub-section 1.4.2
can be extended to deal with real and rational numbers by introducing
a decimal point (which should probably be re-named in accordance
with the base) and adding digits to the right of it. For instance 1.1011
is binary notation for $1 \cdot 2^0 + 1 \cdot 2^{-1} + 0 \cdot 2^{-2} + 1 \cdot 2^{-3} + 1 \cdot 2^{-4}$
or $1 + \frac{1}{2} + \frac{1}{8} + \frac{1}{16} = 1\frac{11}{16}$.
$\sqrt{p}$ is
irrational for every prime number p. What statement would be equivalent to the lemma about the parity of x and $x^2$ in such a generalization?
5. Write a proof that $\sqrt{3}$ is irrational.
1.7. RELATIONS
1.7
Relations
One of the principal ways in which mathematical writing differs from ordinary
writing is in its incredible brevity. For instance, a Ph.D. thesis for someone
in the humanities would be very suspicious if its length were less than 300
pages, whereas it would be quite acceptable for a math doctoral student to
submit a thesis amounting to less than 100 pages. Indeed, the usual criterion
for a doctoral thesis (or indeed any scholarly work in mathematics) is that
it be "new, true and interesting." If one can prove a truly interesting, novel
result in a single page, they'll probably hand over the sheepskin.
How is this great brevity achieved? By inserting single symbols in place
of a whole paragraph's worth of words! One class of symbols in particular has
immense power: so-called relational symbols. When you place a relational
symbol between two expressions, you create a sentence that says the relation
holds. The period at the end of the last sentence should probably be pronounced! "The relation holds, period!" In other words when you write down
a mathematical sentence involving a relation, you are asserting the relation
is True (the capital T is intentional). This is why it's okay to write "2 < 3"
but it's not okay to write "3 < 2." The symbol < is a relation symbol and
you are only supposed to put it between two things when they actually bear
this relation to one another.
The situation becomes slightly more complicated when we have variables
in relational expressions, but before we proceed to consider that complication
let's make a list of the relations we've seen to date:
=, <, >, ≤, ≥, |, and ≡ (mod m).
the relation symbol (often by drawing a slash through it, but some of the
symbols above are negations of others).
So what about expressions involving variables and these relation symbols?
For example what does x < y really mean? Okay, I know that you know
what x < y means but, philosophically, a relation symbol involving variables
is doing something that you may have only been vaguely aware of in the past:
it is introducing a supposition. Watch out for relation symbols involving
variables! Whenever you encounter them it means the rules of the game are
being subtly altered. Up until the point where you see x < y, x and y are
just two random numbers, but after that point we must suppose that x is
the smaller of the two.
The relations we've discussed so far are binary relations, that is, they go
in between two numbers. There are also higher order relations. For example,
a famous ternary relation (a relationship between three things) is the notion
of "betweenness." If A, B and C are three points which all lie on a single
line, we write A ⋆ B ⋆ C if B falls somewhere on the line segment AC. So the
symbol A ⋆ B ⋆ C is shorthand for the sentence "Point B lies somewhere in
between points A and C on the line determined by them."
There is a slightly silly tendency these days to define functions as being
a special class of relations. (This is slightly silly not because it's wrong,
for indeed functions are a special type of relation, but because it's the least
intuitive approach possible, and it is usually foisted off on middle or high
school students.) When this approach is taken, we first define a relation to
be any set of ordered pairs and then state a restriction on the ordered pairs
that may be in a relation if it is to be a function. Clearly what these Algebra
textbook authors are talking about are binary relations; a ternary relation
would actually be a set of ordered triples, and higher order relations might
involve ordered 4-tuples or 5-tuples, etc. A couple of small examples should
help to clear up this connection between a relation symbol and some set of
tuples.
Consider the numbers from 1 to 5 and the less-than relation, <. As a set
of ordered pairs, this relation is the set
{(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)}.
The pairs that are in the relation are those such that the first is smaller
than the second.
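Thinking of a relation as a set of ordered pairs is very natural in a programming language that has sets built in. As a small illustration (a sketch, with a variable name of our own), Python can build the less-than relation on the numbers 1 through 5 straight from the description "the first is smaller than the second".

```python
numbers = range(1, 6)  # the numbers 1, 2, 3, 4, 5

# Keep exactly those ordered pairs (x, y) for which x < y holds.
less_than = {(x, y) for x in numbers for y in numbers if x < y}
```

The resulting set has 10 elements and is exactly the set displayed above; asking whether (2, 5) in less_than is True amounts to asking whether 2 < 5.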
An example involving the ternary relation betweenness can be had from
the following diagram.
[Diagram of seven labeled points, A through G, joined by line segments.]
{(A, B, C), (A, G, D), (A, F, E), (B, G, E), (C, B, A), (C, G, F ), (C, D, E),
(D, G, A), (E, D, C), (E, G, B), (E, F, A), (F, G, C)}.
Exercise. When thinking of a function as a special type of relation, the
pairs are of the form (x, f (x)). That is, they consist of an input and the
Exercises 1.7
1. Consider the numbers from 1 to 10. Give the set of pairs of these
numbers that corresponds to the divisibility relation.
2. The domain of a function (or binary relation) is the set of numbers
appearing in the first coordinate. The range of a function (or binary
relation) is the set of numbers appearing in the second coordinate.
Consider the set {0, 1, 2, 3, 4, 5, 6} and the function $f(x) = x^2 \pmod{7}$.
{(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 10),
(2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9), (2, 10),
(3, 3), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (3, 10),
(4, 4), (4, 5), (4, 6), (4, 7), (4, 8), (4, 9), (4, 10),
(5, 5), (5, 6), (5, 7), (5, 8), (5, 9), (5, 10),
(6, 6), (6, 7), (6, 8), (6, 9), (6, 10),
(7, 7), (7, 8), (7, 9), (7, 10),
(8, 8), (8, 9), (8, 10),
(9, 9), (9, 10),
(10, 10)}
4. Draw a five-pointed star, label all 10 points. There are 40 triples of
these labels that satisfy the betweenness relation. List them.
you as a relation that is, you are just given a set of pairs. How
can you distinguish whether the function represented by this list of
input/output pairs is invertible? How can you produce the inverse (as
a set of ordered pairs)?
7. There is a relation known as "has color" which goes from the set
F = {orange, cherry, pumpkin, banana}
to the set
C = {orange, red, green, yellow}.
What pairs are in "has color"?
Chapter 2
Logic and quantifiers
"If at first you don't succeed, try again. Then quit. There's no use being a
damn fool about it." (W. C. Fields)
2.1
In every branch of Mathematics there are special, atomic, notions that defy
precise definition. In Geometry, for example, the atomic notions are points,
lines and their incidence. Euclid defines a point as "that which has no part";
people can argue (and have argued) incessantly over what exactly is meant by
this. Is it essentially saying that anything without volume, area or length of
some sort is a point? In modern times it has been recognized that any formal
system of argumentation has to have such elemental, undefined, concepts
and that Euclid's apparent lapse in precision comes from an attempt to hide
this basic fact. The notion of "point" can't really be defined. All we can do is
point (no joke intended) at a variety of points and hope that our audience will
absorb the same concept of point that we hold via the process of induction¹.
ship. The atomic concepts in Logic are "true," "false," "sentence" and
"statement."
Regarding true and false, we hope there is no uncertainty as to their
meanings. Sentence also has a well-understood meaning that most will agree
on: a syntactically correct ordered collection of words such as "Johnny was
a football player." or "Red is a color." or "This is a sentence which does
not refer to itself." A statement is a sentence which is either true or false.
In other words, a statement is a sentence whose truth value is definite; in
yet other words, it is always possible to decide one way or the other
whether a statement is true or false.² The first example of a sentence given
above ("Johnny was a football player") is not a statement; the problem is
that it is ambiguous unless we know who Johnny is. If it had said "Johnny
Unitas was a football player." then it would have been a statement. If it
had said "Johnny Appleseed was a football player." it would also have been
a statement, just not a true one.
Ambiguity is only one reason that a sentence may not be a statement.
As we consider more complex sentences, it may be the case that the truth
value of a given sentence simply cannot be decided. One of the most celebrated mathematical results of the 20th century is Kurt Gödel's Incompleteness Theorem. An important aspect of this theory is the proof that in
any axiomatic system of mathematical thought there must be undecidable
sentences, that is, statements which can neither be proved nor disproved from the
axioms³. Simple sentences (e.g. those of the form subject-verb-object) have
2
instance it is certainly either true or false that I ate eggs for breakfast on my 21st birthday
but I don't remember, and short of building a time machine, I don't know how you could
find out.
3
There are trivial systems that are complete, but if a system is sufficiently complicated
that it contains "interesting" statements it can't be complete.
little chance of being undecidable for this reason, so we will next look at ways
of building more complex sentences from simple components.
Let's start with an example. Suppose I come up to you in some windowless room and make the statement: "The sun is shining but it's raining!" You
decide to investigate my claim and determine its veracity. Upon reaching a
room that has a view of the exterior there are four possible combinations
of sunniness and/or precipitation that you may find. That is, the atomic
predicates "The sun is shining" and "It is raining" can each be true or false
independently of one another. In the following table we introduce a convention used throughout the remainder of this book: true is indicated
with a capital letter T and false is indicated with the Greek letter φ (which
The sun is shining    It is raining
        T                   T
        T                   φ
        φ                   T
        φ                   φ
Each row of the above table represents a possible state of the outside
world. Suppose you observe the conditions given in the last row, namely
that it is neither sunny, nor is it raining; you would certainly conclude
that I am not to be trusted. That is, my statement, the compounding of "The
sun is shining" and "It is raining" (with the word "but" in between as a
connector) is false. If you think about it a bit, you'll agree that this so-called
compound sentence is true only in the case that both of its component pieces
are true. This underscores an amusing linguistic point: "but" and "and"
have exactly the same meaning! More precisely, they denote the same thing;
they have subtly different connotations, however: "but" indicates that both
of the statements it connects are true and that the speaker is surprised by
this state of affairs.
ple sentences into compound ones. The conjunction of two sentences is the
compound sentence made by sticking the word "and" between them. The
disjunction of two sentences is formed by placing an "or" between them.
Conjunctions are true only when both components are true. Disjunctions
are false only when both components are false.
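These two rules are exactly how the and and or operators of most programming languages behave, so a computer can enumerate all four states of the world for us. The Python sketch below is illustrative only: truth_table and show are helper names of our own, and show renders a truth value as T or φ to match the convention used for tables in this book.

```python
from itertools import product


def show(value: bool) -> str:
    """Render a truth value: T for true, φ for false."""
    return "T" if value else "φ"


def truth_table(connective):
    """List (a, b, connective(a, b)) for all four combinations of truth values."""
    return [(a, b, connective(a, b)) for a, b in product([True, False], repeat=2)]


conjunction = truth_table(lambda a, b: a and b)
disjunction = truth_table(lambda a, b: a or b)
```

In conjunction only the row with both components true ends in True, while in disjunction only the row with both components false ends in False, just as the two rules say.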
As usual, mathematicians have developed an incredibly terse, compact
notation for these ideas.⁴ First, we represent an entire sentence by a single
letter, traditionally a capital letter. This is called a predicate variable. For
example, following the example above, we could denote the sentence "The
sun is shining" by the letter S. Similarly, we could make the assignment
R = "It is raining." The conjunction and disjunction of these sentences can
then be represented using the symbols S ∧ R and S ∨ R, respectively. As a
mnemonic, note that the connective in S ∧ R looks very much like the capital
letter A (as in And).
To display, very succinctly, the effect of these two connectives we can use
so-called truth tables. In a truth table we list all possible truth values of
the predicate variables and then enumerate the truth values of some compound sentence. For the conjunction and disjunction connectors we have
(respectively):
A   B   A ∧ B
T   T     T
T   φ     φ
φ   T     φ
φ   φ     φ
and
A   B   A ∨ B
T   T     T
T   φ     T
φ   T     T
φ   φ     φ
humanity.
and its truth value is exactly the opposite of A's truth value. The negation
of a sentence is also known as the denial of a sentence. A truth table for the
negation operator is somewhat trivial but we include it here for completeness.
A   ¬A
T    φ
φ    T
These three simple tools (and, or & not) are sufficient to create extraordinarily complex sentences out of basic components. The way these pieces
interrelate is a bit reminiscent of algebra; in fact the study of these logical
operators (or any operators that act like them) is called Boolean Algebra⁵.
There are distinct differences between Boolean and ordinary algebra however.
In regular algebra we have the binary connectors + (plus) and · (times), and
the unary negation operator −,
but there are certain consequences of the fact that multiplication is effectively repeated addition that simply don't hold for the Boolean operators.
For example, there is a well-defined precedence between · and +. In parsing
the expression 4 · 5 + 3 we all know that the multiplication is to be done first.
In honor of George Boole, whose 1854 book An investigation into the Laws of Thought
garden-variety algebra, there are also many similarities. For instance, the
associative, commutative and distributive laws of Algebra all have versions
that work in the Boolean case.
A very handy way of visualizing Boolean expressions is given by digital
logic circuit diagrams. To discuss these diagrams we must make a brief
digression into Electronics. One of the most basic components inside an
electronic device is a transistor; this is a component that acts like a switch
for electricity, but the switch itself is controlled by electricity. In Figure 2.1
we see the usual schematic representation of a transistor. If voltage is applied
to the wire labeled z, the transistor becomes conductive, and current may
flow from x to y.
Figure 2.2: The connection of two transistors in series provides an implementation of the and operator.
for some: in common speech the use of the word "or" often has the sense
known as exclusive or (a.k.a. xor); when we say "X or Y" we mean "Either
X or Y, but not both." In Electronics and Mathematics, or always has the
non-exclusive (better known as inclusive) sense.
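The difference between the two senses of "or" shows up in just one of the four possible cases. Here is a brief Python sketch (the function names are ours): Python's own or is inclusive, and for truth values the exclusive sense can be obtained by asking whether the two inputs differ.

```python
def inclusive_or(x: bool, y: bool) -> bool:
    """True when at least one of x, y is true (the mathematical 'or')."""
    return x or y


def exclusive_or(x: bool, y: bool) -> bool:
    """True when exactly one of x, y is true (the everyday 'either ... or')."""
    return x != y
```

The two functions agree except when both inputs are true: inclusive_or(True, True) is True, whereas exclusive_or(True, True) is False.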
Figure 2.3: The connection of two transistors in parallel provides an implementation of the or operator.
And (∧)
Or (∨)
Not (¬)
((A ∧ B) ∧ (C ∧ D))
(((A ∧ B) ∧ C) ∧ D)
Figure 2.4: Two of the possible ways to parenthesize the conjunction of four
statement variables expressed as digital logic circuits.
input/output table.
x y z
out
0 0
0 1
1 0
1 1
0 0
0 1
1 0
1 1
Figure 2.5: A digital logic circuit built using disjunctive normal form. The
output of this circuit is (x ^ y ^ z) _ (x ^ y ^ z) _ (x ^ y ^ z).
Exercises 2.1
1. Design a digital logic circuit (using and, or & not gates) that implements an exclusive or.
2. Consider the sentence "This is a sentence which does not refer to itself."
which was given in the beginning of this chapter as an example. Is this
sentence a statement? If so, what is its truth value?
4. Complete truth tables for each of the sentences (A ∧ B) ∨ C and A ∧ (B ∨
C). Does it seem that these sentences have the same logical content?
A B A#B
T T
T
T
T
T
T
T T
and
T
T
T
2.2. IMPLICATION
2.2
Implication
Suppose a mother makes the following statement to her child: If you finish
your peas, youll get dessert.
This is a compound sentence made up of the two simpler sentences P =
You finish your peas and D = Youll get dessert. It is an example of
a type of compound sentence called a conditional. Conditionals are if-then
type statements. In ordinary language the word then is often elided (as is
the case with our example above). Another way of phrasing the If P then
D. relationship is to use the word implies although it would be a rather
uncommon mother who would say Finishing your peas implies that you will
receive dessert.
As was the case in the previous section, there are four possible situations
and we must consider each to decide the truth/falsity of this conditional
statement. The peas may or may not be finished, and independently, the
dessert may or may not be proered.
Suppose the child finishes the peas and the mother comes across with the
dessert. Clearly, in this situation the mother's statement was true. On the
other hand, if the child finishes the hated peas and yet does not receive a
treat, it is just as obvious that the mother has lied! What do we say about
the mother's veracity in the case that the peas go unfinished? Here, Mom
gets a break. She can either hold firm and deliver no dessert, or she can
be a softy and give out unearned sweets; in either case, we can't accuse
her of telling a falsehood. The statement she made had to do only with the
eventualities following total pea consumption; she said nothing about what
happens if the peas go uneaten.
A conditional statement's components are called the antecedent (this is
the "if" part, as in "finish your peas") and the consequent (this is the "then"
part, as in "get dessert"). The discussion in the last paragraph was intended
to make the point that when the antecedent is false, we should consider the
conditional to be true. Conditionals that are true because their antecedents
are false are said to be vacuously true. The conditional involving an antecedent A and a consequent B is expressed symbolically using an arrow:
$A \implies B$. Here is a truth table for this connective.
A    B    $A \implies B$
T    T    T
T    φ    φ
φ    T    T
φ    φ    T
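The four situations discussed above can be tabulated mechanically. The following is only a sketch: the helper name `implies` is our own, and Python's `True`/`False` stand in for the two truth values.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Truth value of the conditional with antecedent a and consequent b."""
    if a:
        # Peas finished: the promise is kept exactly when dessert appears.
        return b
    # Peas unfinished: nothing was promised, so the statement is vacuously true.
    return True

# Rows in the usual order: (T, T), (T, F), (F, T), (F, F).
rows = [(a, b, implies(a, b)) for a, b in product([True, False], repeat=2)]

for a, b, value in rows:
    print(a, b, value)
```

Only the row with a true antecedent and a false consequent comes out false.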
Exercise. Note that this truth table is similar to the truth table for A _ B
in that there is only a single row having a
B the
=)
reflection of what the mother intended. The problem really is that people are
incredibly sloppy with their conditional statements! A lot of people secretly
want the 3rd row of the truth table for $\implies$ to have a φ in it, and it
A    B    $A \iff B$
T    T    T
T    φ    φ
φ    T    φ
φ    φ    T
Please note that while we like to strive for precision, we do not necessarily
recommend the use of phrases such as "You will receive dessert if, and only
if, you finish your peas." with young children.
Since conditional sentences are often confused with the sentence that
has the roles of antecedent and consequent reversed, this switched-around
sentence has been given a name: it is the converse of the original statement.
Another conditional that is distinct from (but related to) a given conditional
is its inverse. This sort of sentence probably had to be named because of a
very common misconception: many people think that the way to negate an
if-then proposition is to negate its parts. Algebraically, this looks reasonable,
sort of a distributive law for logical negation over implications:
$\lnot(A \implies B) = \lnot A \implies \lnot B$. Sadly, this reasonable looking assertion can't possibly
be true. Since implications have just one φ in a truth table, the negation of
an implication must have three, but the statement with the $\lnot$'s on the parts
of the implication is going to only have a single φ.
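[Figure: the conditional $A \implies B$ and its related statements: the converse $B \implies A$, the inverse $\lnot A \implies \lnot B$, and the contrapositive $\lnot B \implies \lnot A$.]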
Exercises 2.2
1. The transitive property of equality says that if a = b and b = c then
a = c. Does the implication arrow satisfy a transitive property? If so,
state it.
2. Complete truth tables for the compound sentences $A \implies B$ and
$\lnot A \lor B$.
3. Complete a truth table for the compound sentence $A \implies (B \implies C)$
and for the sentence $(A \implies B) \implies C$. What can you conclude
about conditionals and the associative property?
4. Determine a sentence using the and connector ($\land$) that gives the negation of $A \implies B$.
5. Rewrite the sentence "Fix the toilet or I won't pay the rent!" as a
conditional.
6. Why is the sentence "If pigs can fly, I am the king of Mesopotamia."
true?
7. Express the statement $A \implies B$ using the Peirce arrow and/or the
Sheffer stroke. (See Exercise 5 in the previous section.)
8. Find the contrapositives of the following sentences.
(a) If you can't do the time, don't do the crime.
(b) If you do well in school, you'll get a good job.
(c) If you wish others to treat you in a certain way, you must treat
others in that fashion.
(d) If it's raining, there must be clouds.
9. What are the converse and inverse of "If you watch my back, I'll watch
your back."?
f (n) <
n=1
f (x).
2.3 Logical equivalences
Some logical statements are "the same." For example, in the last section,
we discussed the fact that a conditional and its contrapositive have the same
logical content. Wouldn't we be justified in writing something like the following?
$A \implies B = \lnot B \implies \lnot A$
Well, one pretty serious objection to doing that is that the equals sign
(=) has already got a job; it is used to indicate that two numerical quantities are the same. What we're doing here is really sort of a different thing!
Nevertheless, there is a concept of "sameness" between certain compound
statements, and we need a symbolic way of expressing it. There are two notations in common use. The notation that seems to be preferred by logicians
is the biconditional ($\iff$). The notation we'll use in the rest of this book
is an equals sign with a bit of extra decoration on it ($\cong$).
Thus we can either write
$(A \implies B) \iff (\lnot B \implies \lnot A)$
or
$A \implies B \cong \lnot B \implies \lnot A$.
I like the latter, but use whichever form you like; no one will have any
problem understanding either.
The formal definition of logical equivalence, which is what we've been
describing, is this: two compound sentences are logically equivalent if in a
truth table (that contains all possible combinations of the truth values of
the predicate variables in its rows) the truth values of the two sentences are
equal in every row.
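The definition translates directly into a brute-force check. This is a minimal sketch, assuming sentences are modeled as Python functions of Boolean arguments; the names `equivalent` and `implies` are ours, not part of the text.

```python
from itertools import product

def implies(a, b):
    # The conditional is false only for a true antecedent and false consequent.
    return b if a else True

def equivalent(f, g, num_vars):
    """True when f and g have equal truth values in every row of the table."""
    return all(f(*row) == g(*row)
               for row in product([True, False], repeat=num_vars))

conditional = lambda a, b: implies(a, b)
contrapositive = lambda a, b: implies(not b, not a)
converse = lambda a, b: implies(b, a)

cond_vs_contrapositive = equivalent(conditional, contrapositive, 2)
cond_vs_converse = equivalent(conditional, converse, 2)
print(cond_vs_contrapositive, cond_vs_converse)
```

The conditional agrees with its contrapositive in every row, while the converse differs in at least one row.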
rows will suffice. Fill out the missing entries in the truth table and determine
whether the statements are equivalent.
A
A_B
A _ (A ^ B)
T
T
One could, in principle, verify all logical equivalences by filling out truth
tables. Indeed, in the exercises for this section we will ask you to develop a
certain facility at this task. While this activity can be somewhat fun, and
many of my students want the filling-out of truth tables to be a significant
portion of their midterm exam, you will probably eventually come to find it
somewhat tedious. A slightly more mature approach to logical equivalences
is this: use a set of basic equivalences (which themselves may be verified via
truth tables) as the basic rules or laws of logical equivalence, and develop
a strategy for converting one sentence into another using these rules. This
process will feel very familiar; it is like doing algebra, but the rules one is
allowed to use are subtly different.
First we have the commutative laws, one each for conjunction and disjunction. It's worth noting that there isn't a commutative law for implication.
The commutative property of conjunction says that $A \land B \cong B \land A$. This
[Figure: circuit diagrams illustrating the commutative laws, $A \land B \cong B \land A$ and $A \lor B \cong B \lor A$.]
The associative laws also have something to do with what order operations are done. One could think of the difference in the following terms:
Commutative properties involve spatial or physical order and the associative
properties involve temporal order. The associative law of addition could be
used to say we'll get the same result if we add 2 and 3 first, then add 4, or if
we add 2 to the sum of 3 and 4 (i.e. that $(2+3)+4$ is the same as $2+(3+4)$).
Note that physically, the numbers are in the same order (2 then 3 then 4) in
both expressions but that the parentheses indicate a precedence in when the
plus signs are evaluated.
In visual terms, this means the following two circuit diagrams are equivalent.
[Figure: circuit diagrams for $(A \land B) \land C$ and $A \land (B \land C)$, and for $(A \lor B) \lor C$ and $A \lor (B \lor C)$.]
The next type of basic logical equivalences we'll consider are the so-called
distributive laws. Distributive laws involve the interaction of two operations:
when we distribute multiplication over a sum, we effectively replace one instance of an operand and the associated operator with two instances, as is
illustrated below.
$2 \cdot (3+4) = (2 \cdot 3) + (2 \cdot 4)$
The logical operators $\land$ and $\lor$ each distribute over the other. Thus we
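[Figure: circuit diagrams illustrating the distributive laws, pairing $A \land (B \lor C)$ with $(A \land B) \lor (A \land C)$, and $A \lor (B \land C)$ with $(A \lor B) \land (A \lor C)$.]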
x+
y) should
correspond to in Boolean algebra. At first blush one might assume the analogous thing in Boolean algebra would be something like
$\lnot(A \land B) \cong \lnot A \land \lnot B$,
but we can easily dismiss this by looking at a truth table.
A    B    $\lnot(A \land B)$    $\lnot A \land \lnot B$
T    T    φ                     φ
T    φ    T                     φ
φ    T    T                     φ
φ    φ    T                     T
$x + (-x) = 0$
and
$x \cdot \frac{1}{x} = 1$.
Boolean algebra only has one "inverse" concept, the denial of a predicate
(i.e. logical negation), but the equations above have analogues, as do the
symbols 0 and 1 that appear in them. First, consider the Boolean expression
$A \lor \lnot A$. This is the logical or of a statement and its exact opposite; when
one is true the other is false and vice versa. But the disjunction $A \lor \lnot A$ is
always true! We use the symbol t (which stands for tautology) to represent a
and
$A \land \lnot A \cong c$
If you "and" a statement with something that is always true, this new compound
has the exact same truth values as the original. If you "or" a statement
with something that is always false, the new compound statement is also
unchanged from the original. Thus performing a conjunction with a tautology
has no effect, sort of like multiplying by 1. Performing a disjunction with a
contradiction also has no effect; this is somewhat akin to adding 0.
The number 0 has a special property: $0 \cdot x = 0$ is an equation that holds
there isn't a dominance rule that involves 1. On the Boolean side, both the
symbols t and c have related domination rules.
$A \lor t \cong t$ and $A \land c \cong c$
$A \lor A \cong A$ and $A \land A \cong A$
and
$\lnot t \cong c$
and
$A \lor (A \land B) \cong A$
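Law                Conjunctive version                                          Disjunctive version                                          Algebraic analog
Commutative laws   $A \land B \cong B \land A$                                  $A \lor B \cong B \lor A$                                    $2 + 3 = 3 + 2$
Associative laws   $A \land (B \land C) \cong (A \land B) \land C$              $A \lor (B \lor C) \cong (A \lor B) \lor C$                  $2 + (3 + 4) = (2 + 3) + 4$
Distributive laws  $A \land (B \lor C) \cong (A \land B) \lor (A \land C)$      $A \lor (B \land C) \cong (A \lor B) \land (A \lor C)$       $2 \cdot (3 + 4) = (2 \cdot 3 + 2 \cdot 4)$
DeMorgan's laws    $\lnot(A \land B) \cong \lnot A \lor \lnot B$                $\lnot(A \lor B) \cong \lnot A \land \lnot B$                none
Complementarity    $A \land \lnot A \cong c$                                    $A \lor \lnot A \cong t$                                     $2 + (-2) = 0$
Identity laws      $A \land t \cong A$                                          $A \lor c \cong A$                                           $7 + 0 = 7$
Domination         $A \land c \cong c$                                          $A \lor t \cong t$                                           $7 \cdot 0 = 0$
Idempotence        $A \land A \cong A$                                          $A \lor A \cong A$                                           $1 \cdot 1 = 1$
Absorption         $A \land (A \lor B) \cong A$                                 $A \lor (A \land B) \cong A$                                 none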
Exercises 2.3
1. There are 3 operations used in basic algebra (addition, multiplication
and exponentiation) and thus there are potentially 6 different distributive laws. State all 6 "laws" and determine which 2 are actually valid.
(As an example, the distributive law of addition over multiplication
would look like $x + (y \cdot z) = (x + y) \cdot (x + z)$; this isn't one of the true
ones.)
(b) $A \land (B \lor \lnot A) \cong A \land B$
(c) (A ^ B) _ (A ^ B)
= (A _ B) ^ (A _ B)
7. You encounter two natives of the island of knights and knaves (see
page 72). Fill in an explanation for each line of the proofs of their
identities.
(a) Natasha says, "Boris is a knave."
Boris says, "Natasha and I are knights."
Claim: Natasha is a knight, and Boris is a knave.
knight.
If Wellington is a knight, then Bonaparte is a knave.
Therefore, if Bonaparte is a knight, then Bonaparte is a
knave.
Therefore, Bonaparte is a knave.
If Bonaparte is a knave, then Wellington is a knave.
Therefore, Wellington is a knave.
Q.E.D.
2.4 Two-column proofs
If you've ever spent much time trying to check someone else's work in solving
an algebraic problem, you'd probably agree that it would be a help to know
what they were trying to do in each step. Most people have this fairly vague
notion that they're allowed to "do the same thing on both sides" and they're
allowed to simplify the sides of the equation separately, but more often than
not, several different things get done on a given line, mistakes get made, and
it can be nearly impossible to figure out what went wrong and where.
Now, after all, the beauty of math is supposed to lie in its crystal clarity,
so this sort of situation is really unacceptable. It may be an impossible goal
to get "the average Joe" to perform algebraic manipulations with clarity,
but those of us who aspire to become mathematicians must certainly hold
ourselves to a higher standard. Two-column proofs are usually what is meant
by a "higher standard" when we are talking about relatively mechanical
manipulations like doing algebra, or more to the point, proving logical
equivalences. Now don't despair! You will not, in a mathematical career, be
expected to provide two-column proofs very often. In fact, in more advanced
work one tends to not give any sort of proof for a statement that lends itself
to a two-column approach. But, if you find yourself writing "As the reader
can easily verify, Equation 17 holds. . . " in a paper, or making some similar
remark to your students, you are morally obligated to be able to produce
a two-column proof.
So what, exactly, is a two-column proof? In the left column you show
your work, being careful to go one step at a time. In the right column you
provide a justification for each step.
We're going to go through a couple of examples of two-column proofs
in the context of proving logical equivalences. One thing to watch out for:
if you're trying to prove a given equivalence, and the first thing you write
down is that very equivalence, it's wrong! This would constitute the logical
error known as "begging the question", also known as "circular reasoning."
It's clearly not okay to try to demonstrate some fact by first asserting the
very same fact. Nevertheless, there is (for some unknown reason) a powerful
temptation to do this very thing. To avoid making this error, we will not
put any equivalences on a single line. Instead we will start with one side or
the other of the statement to be proved, and modify it using known rules of
equivalence, until we arrive at the other side.
Without further ado, let's provide a proof of the equivalence
$A \land (B \lor \lnot A) \cong A \land B$.[6]
$A \land (B \lor \lnot A)$
$\cong (A \land B) \lor (A \land \lnot A)$        distributive law
$\cong (A \land B) \lor c$                        complementarity
$\cong (A \land B)$                               identity law
We have assembled a nice, step-by-step sequence of equivalences, each
justified by a known law, that begins with the left-hand side of the statement
to be proved and ends with the right-hand side. That's an irrefutable proof!
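A two-column proof can be sanity-checked by confirming that every line of its left column has the same truth table. The sketch below does this for the proof that $A \land (B \lor \lnot A) \cong A \land B$; it is our own illustration, with Python's `False` standing in for the contradiction c.

```python
from itertools import product

# One function per line of the left column, in order.
steps = [
    lambda a, b: a and (b or not a),          # starting expression
    lambda a, b: (a and b) or (a and not a),  # distributive law
    lambda a, b: (a and b) or False,          # complementarity (c is False)
    lambda a, b: a and b,                     # identity law
]

def truth_column(f):
    """Truth values of f over the rows (T,T), (T,F), (F,T), (F,F)."""
    return [f(a, b) for a, b in product([True, False], repeat=2)]

columns = [truth_column(f) for f in steps]
all_steps_agree = all(column == columns[0] for column in columns)
print(all_steps_agree)
```

A check like this catches a mistaken step, but it does not replace the justifications in the right column.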
In the next example we'll highlight a slightly sloppy habit of thought that
tends to be problematic. People usually (at first) associate a direction with
the basic logical equivalences. This is reasonable for several of them because
one side is markedly simpler than the other. For example, the domination
rule would normally be used to replace a part of a statement that looked
like "$A \land c$" with the simpler expression "c". There is a certain amount of
[6] This equivalence should have been verified using truth tables in the exercises from the
previous section.
$\cong ((A \land B) \lor (A \land C)) \lor (A \land C)$        distributive law
$\cong (A \land B) \lor ((A \land C) \lor (A \land C))$        associative law
$\cong (A \land B) \lor (A \land C)$                           idempotence
$\cong A \land (B \lor C)$                                     distributive law
Note that in the example we've just done, the two applications of the
distributive law go in opposite directions as far as their influence on the
complexity of the expressions is concerned.
Exercises 2.4
Write two-column proofs that verify each of the following logical equivalences.
1. A _ (A ^ B)
= A ^ (A _ B)
2. (A ^ B) _ A
= A
3. A _ B
= A _ (A ^ B)
4. (A _ B) _ (A ^ B)
= A
5. A
= A ^ ((A _ B) _ (A _ B))
6. (A ^ B) ^ (A _ B
= c
7. A
= A ^ (A _ (A ^ (B _ C)))
8. (A ^ B) ^ (A ^ C)
= A _ (B ^ C)
2.5 Quantified statements
All of the statements discussed in the previous sections were of the "completely unambiguous" sort; that is, they didn't have any unknowns in them.
As a reader of this text, it's a sure bet that you've mastered Algebra and are
firmly convinced of the utility of x and y. Admittedly, we've used variables
to refer to sentences (or sentence fragments) themselves, but we've said that
sentences that had variables in them were ambiguous and didn't even deserve
to be called logical statements. The notion of quantification allows us to use
the power of variables within a sentence without introducing ambiguity.
Consider the sentence "There are exactly 7 odd primes less than 20."
This sentence has some kind of ambiguity in it (because it doesn't mention
the primes explicitly) and yet it certainly seems to have a definite truth
value! The reason its truth value is known (by the way, it is T) is that the
sentence is quantified. "X is an odd prime less than 20." is an ambiguous
sentence, but "There are exactly 7 distinct X's that are odd primes less than
20." is not. This example represents a fairly unusual form of quantification.
Usually, we take away the ambiguity of a sentence having a variable in it by
asserting one of two levels of quantification: "this is true at least once" or
"this is always true". We've actually seen the symbols ($\exists$ and $\forall$) for these
concepts already (in Section 1.3).
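i) P(x) = "$2^{2^x} + 1$ is a prime."
ii) Q(x, y) = "x is prime or y is a divisor of x."
iii) L(f, c, l) = "The function f has limit l at c, if and only if, for every
positive number $\epsilon$, there is a positive number $\delta$ such that whenever
$|x - c| < \delta$ it follows that $|f(x) - l| < \epsilon$."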
is easily seen to be
the phrase "for every positive number" appears before it. What is the status
of x? Is it really bound? The answers to such questions may not be clear at
first, but after some thought you should be able to decide that x is universally
quantified.
Exercise. What word in example iii) indicates that x is in the scope of a $\forall$
quantifier?
c| <
$E(f, x, l, \epsilon) = |f(x) - l| < \epsilon$
98
x!c
x!c
l| < ).
It would not be unfair to say that developing the facility to read, and
understand, this hieroglyph (and others like it) constitutes the first several
weeks of a course in Real Analysis.
Let us turn back to another of the examples (of an open sentence) from
x
$F_0 = 2^{2^0} + 1 = 3$
$F_1 = 2^{2^1} + 1 = 5$
$F_2 = 2^{2^2} + 1 = 17$
$F_3 = 2^{2^3} + 1 = 257$
$F_4 = 2^{2^4} + 1 = 65537$
Fermat probably computed that $F_5 = 4294967297$, and we can well imagine that he checked that this number was not divisible by any small primes.
Of course, this was well before the development of effective computing machinery, so we shouldn't blame Fermat for not noticing that
$4294967297 = 641 \cdot 6700417$. This remarkable feat of factoring can be replicated in seconds
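One way to replicate the factorization is plain trial division. The following is a sketch; the function names `fermat` and `smallest_factor` are our own, and nothing here is meant as the method the text has in mind.

```python
def fermat(n):
    """The n-th Fermat number, 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

def smallest_factor(m):
    """Smallest divisor of m greater than 1 (m itself when m is prime)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

for n in range(6):
    f = fermat(n)
    p = smallest_factor(f)
    if p == f:
        print(n, f, "is prime")
    else:
        print(n, f, "=", p, "*", f // p)
```

Trial division is the crudest possible approach, yet it disposes of $F_5$ after only a few hundred candidate divisors.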
assertion, "$\forall x \in U, P(x)$." is certainly true. You should note that the only
100
x > ln x
x2R+
$1000 \le abcd \le 9999$
$\exists k \in \mathbb{Z}, abcd = k \cdot dcba$.
avoid an unnecessarily complex problem statement. One should not necessarily avoid such
abuses if one's readers can be expected to easily understand what is meant, any more than
one should completely eschew the splitting of infinitives.
101
The Pep Boys Manny, Moe and Jack are hopefully known to some readers as the
102
and
$\lnot(\exists x \in U, P(x)) \cong \forall x \in U, \lnot P(x)$.
It's equally valid to think of these rules in a way that's divorced from
DeMorgan's laws. To show that a universal sentence is false, it suffices to
show that an existential sentence involving a negation of the original is true.[10]
[10] To show that it is not the case that every Pep boy's name starts with "M", one only
needs to demonstrate that there is a Pep boy (Jack) whose name doesn't start with "M".
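Over a finite universe, Python's built-in `all` and `any` behave like the universal and existential quantifiers, so the negation rules can be checked directly. This sketch uses the Pep Boys as the universe; the predicate name `starts_with_m` is our own.

```python
pep_boys = ["Manny", "Moe", "Jack"]

def starts_with_m(name):
    return name.startswith("M")

# "For all x, P(x)" and "there exists x such that not P(x)".
every_name_starts_with_m = all(starts_with_m(x) for x in pep_boys)
some_name_does_not = any(not starts_with_m(x) for x in pep_boys)

# "There exists x such that P(x)" and "for all x, not P(x)".
some_name_starts_with_m = any(starts_with_m(x) for x in pep_boys)
no_name_starts_with_m = all(not starts_with_m(x) for x in pep_boys)

# Witnesses that show the universal sentence is false.
witnesses = [x for x in pep_boys if not starts_with_m(x)]
print(every_name_starts_with_m, some_name_does_not, witnesses)
```

Negating the universal sentence gives the same truth value as the existential sentence about the negated predicate, and likewise with the roles swapped.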
Exercises 2.5
1. There is a common variant of the existential quantifier, $\exists!$; if you write
$\exists!\, x, P(x)$ you are asserting that there is a unique element in the universe that makes P(x) true. Determine how to negate the sentence
$\exists!\, x, P(x)$.
l| < ).
Marie-Sophie Germain (1776 - 1831) was a French mathematician who made major
104
5. Alvin, Betty, and Charlie enter a cafeteria which offers three different
entrees, turkey sandwich, veggie burger, and pizza; four different beverages, soda, water, coffee, and milk; and two types of desserts, pie and
pudding. Alvin takes a turkey sandwich, a soda, and a pie. Betty takes
a veggie burger, a soda, and a pie. Charlie takes a pizza and a soda.
Based on this information, determine whether the following statements
are true or false.
(a) $\forall$ people p, $\exists$ dessert d such that p took d.
(b) $\exists$ person p such that $\forall$ desserts d, p did not take d.
(c) $\forall$ entrees e, $\exists$ person p such that p took e.
(d) $\exists$ entree e such that $\forall$ people p, p took e.
(e) $\forall$ people p, p took a dessert $\iff$ p did not take a pizza.
(f) Change one word of statement 5d so that it becomes true.
(g) Write down the negation of 5a and compare it to statement 5b.
Hopefully you will see that they are the same! Does this make
you want to modify one or both of your answers to 5a and 5b?
2.6
probably won't actually have been written, hypotheses and all the deductions
made up to this point) by one of the so-called rules of inference.
Each of the rules of inference actually amounts to a logical tautology that
has been re-expressed as a sort of re-writing rule. Each rule of inference will
be expressed as a list of logical sentences that are assumed to be among the
premises of the argument, a horizontal bar, followed by the symbol $\therefore$ (which
is usually voiced as the word "therefore") and then a new statement that can
be placed among the deductions.
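To see what "amounts to a tautology" means, one can turn a rule into a single conditional (the conjunction of the premises implies the conclusion) and confirm that it is true in every row of its truth table. The sketch below does this for three standard rules; the helper names are our own.

```python
from itertools import product

def implies(a, b):
    return b if a else True

def is_tautology(f, num_vars):
    """True when f is true for every assignment of truth values."""
    return all(f(*row) for row in product([True, False], repeat=num_vars))

# Premises joined by "and", then "implies" the conclusion.
simplification = lambda a, b: implies(a and b, b)
modus_ponens = lambda a, b: implies(a and implies(a, b), b)
hypothetical_syllogism = lambda a, b, c: implies(
    implies(a, b) and implies(b, c), implies(a, c))

results = [
    is_tautology(simplification, 2),
    is_tautology(modus_ponens, 2),
    is_tautology(hypothetical_syllogism, 3),
]
print(results)
```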
For example, one (very obvious) rule of inference is
$A \land B$
$\therefore B$
A =) B
) A
[12] Latin for "method of affirming"; the related modus tollens rule means "method of
denying."
108
Modus ponens and modus tollens are also known as syllogisms. A syl-
) A
syllogism rule into an equivalent conditional. How is the new argument form
related to modus ponens and/or modus tollens?
The word "dilemma" usually refers to a situation in which an individual
is faced with an impossible choice. A cute example known as the Crocodile's
dilemma is as follows:
A crocodile captures a little boy who has strayed too near the
river. The child's father appears and the crocodile tells him
"Don't worry, I shall either release your son or I shall eat him.
If you can say, in advance, which I will do, then I shall release
him." The father responds, "You will eat my son." What should
the crocodile do?
$\therefore B \lor D$
Destructive dilemma is often not listed among the rules of inference because it can easily be obtained by using the constructive dilemma and replacing the implications with their contrapositives.
$A \implies B$
$C \implies D$
$\lnot B \lor \lnot D$
$\therefore \lnot A \lor \lnot C$
In Table 2.3, the ten most common rules of inference are listed. Note
that all of these are equivalent to tautologies that involve conditionals (as
opposed to biconditionals). Every one of the basic logical equivalences that
we established in Section 2.3 is really a tautology involving a biconditional;
collectively these are known as the "rules of replacement." In an argument,
any statement allows us to infer a logically equivalent statement. Or, put
differently, we could replace any premise with a different, but logically equivalent, premise. You might enjoy trying to determine a minimal set of rules
of inference that, together with the rules of replacement, would allow one to
form all of the same arguments as the ten rules in Table 2.3.
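Name                          Form
Modus ponens                  $A$;  $A \implies B$;  $\therefore B$
Modus tollens                 $\lnot B$;  $A \implies B$;  $\therefore \lnot A$
Hypothetical syllogism        $A \implies B$;  $B \implies C$;  $\therefore A \implies C$
Disjunctive syllogism         $A \lor B$;  $\lnot B$;  $\therefore A$
Constructive dilemma          $A \implies B$;  $C \implies D$;  $A \lor C$;  $\therefore B \lor D$
Destructive dilemma           $A \implies B$;  $C \implies D$;  $\lnot B \lor \lnot D$;  $\therefore \lnot A \lor \lnot C$
Conjunctive simplification    $A \land B$;  $\therefore A$
Conjunctive addition          $A$;  $B$;  $\therefore A \land B$
Disjunctive addition          $A$;  $\therefore A \lor B$
Absorption                    $A \implies B$;  $\therefore A \implies (A \land B)$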
Exercises 2.6
1. In the movie "Monty Python and the Holy Grail" we encounter a medieval villager who (with a bit of prompting) makes the following argument.
If she weighs the same as a duck, then she's made of wood.
If she's made of wood then she's a witch.
Therefore, if she weighs the same as a duck, she's a witch.
Which rule of inference is he using?
2. In constructive dilemma, the antecedents of the conditional sentences
are usually chosen to represent opposite alternatives. This allows us to
introduce their disjunction as a tautology. Consider the following proof
that there is never any reason to worry (found on the walls of an Irish
pub).
Either you are sick or you are well.
If you are well there's nothing to worry about.
If you are sick there are just two possibilities:
Either you will get better or you will die.
If you are going to get better there's nothing to worry about.
If you are going to die there are just two possibilities:
Either you will go to Heaven or to Hell.
If you go to Heaven there is nothing to worry about. If you go
to Hell, you'll be so busy shaking hands with all your friends
there won't be time to worry . . .
Identify the three tautologies that are introduced in this proof.
2.7
$a^2 = ab$
$a^2 - b^2 = ab - b^2$            subtracting $b^2$ from both sides
$(a + b)(a - b) = b(a - b)$
$a + b = b$                       canceling $(a - b)$ from both sides
Now let a and b both have a particular value, a = b = 1, and we
see that 1 + 1 = 1, i.e. 2 = 1.
This argument is not sound (thank goodness!) because one of the premises
(actually the bad premise appears as one of the justifications of a step) is
false. You can argue with perfect logic to achieve complete nonsense if you
include false premises.
Exercise. It is not true that you can always cancel the same thing from
both sides of an equation. Under what circumstances is such cancellation
disallowed?
So, how can you tell if an argument has a valid form? Use a truth table.
As an example, we'll verify that the rule of inference known as "destructive
dilemma" is valid using a truth table. This argument form contains 4 predicate variables so the truth table will have 16 rows. There is a column for
each of the variables, the premises of the argument and its conclusion.
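A    B    C    D    $A \implies B$    $C \implies D$    $\lnot B \lor \lnot D$    $\lnot A \lor \lnot C$
T    T    T    T    T                 T                 φ                         φ
T    T    T    φ    T                 φ                 T                         φ
T    T    φ    T    T                 T                 φ                         T
T    T    φ    φ    T                 T                 T                         T
T    φ    T    T    φ                 T                 T                         φ
T    φ    T    φ    φ                 φ                 T                         φ
T    φ    φ    T    φ                 T                 T                         T
T    φ    φ    φ    φ                 T                 T                         T
φ    T    T    T    T                 T                 φ                         T
φ    T    T    φ    T                 φ                 T                         T
φ    T    φ    T    T                 T                 φ                         T
φ    T    φ    φ    T                 T                 T                         T
φ    φ    T    T    T                 T                 T                         T
φ    φ    T    φ    T                 φ                 T                         T
φ    φ    φ    T    T                 T                 T                         T
φ    φ    φ    φ    T                 T                 T                         T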
are true. You should note that in every single situation in which all the
premises are true the conclusion is also true. That's what makes "destructive
dilemma" (and all of its friends) a rule of inference. Whenever all the
premises are true so is the conclusion. You should also notice that there are
several rows in which the conclusion is true but some one of the premises isn't.
That's okay too; isn't it reasonable that the conclusion of an argument can be
true, but at the same time the particulars of the argument are unconvincing?
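The 16-row check for destructive dilemma can be automated. The sketch below is our own illustration: `is_valid` looks for a row in which every premise is true and the conclusion is false, and declares the form valid when no such row exists.

```python
from itertools import product

def implies(a, b):
    return b if a else True

def is_valid(premises, conclusion, num_vars):
    """Valid means: no row has all premises true and the conclusion false."""
    for row in product([True, False], repeat=num_vars):
        if all(p(*row) for p in premises) and not conclusion(*row):
            return False
    return True

premises = [
    lambda a, b, c, d: implies(a, b),
    lambda a, b, c, d: implies(c, d),
    lambda a, b, c, d: (not b) or (not d),
]
conclusion = lambda a, b, c, d: (not a) or (not c)

destructive_dilemma_is_valid = is_valid(premises, conclusion, 4)

# Rows in which every premise holds; only these matter for validity.
critical_rows = [row for row in product([True, False], repeat=4)
                 if all(p(*row) for p in premises)]
print(destructive_dilemma_is_valid, len(critical_rows))
```

Rows in which some premise fails are skipped entirely, which matches the observation that the conclusion may be true or false there without affecting validity.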
As we've noted earlier, an argument by deductive reasoning can go wrong
in only certain well-understood ways. Basically, either the form of the argument is invalid, or at least one of the premises is false. Avoiding false
premises in your arguments can be trickier than it sounds; many statements that sound appealing or intuitively clear are actually counter-factual.
The other side of the coin, being sure that the form of your argument is valid,
seems easy enough: just be sure to only use the rules of inference as found
in Table 2.3. Unfortunately most arguments that you either read or write
will be in prose, rather than appearing as a formal list of deductions. When
dealing with that setting (using natural rather than formalized language)
making errors in form is quite common.
Two invalid forms are usually singled out for criticism, the converse error
and the inverse error. In some sense these two apparently different ways
to screw up are really the same thing. Just as a conditional statement and
its contrapositive are known to be equivalent, so too are the other related
statements (the converse and the inverse) equivalent. The converse error
consists of mistaking the implication in a modus ponens form for its converse.
The converse error:
$B$
$A \implies B$
$\therefore A$
$A \implies B$
$\therefore \lnot B$
If we replaced the conditional in this argument form by its inverse ($\lnot A \implies \lnot B$)
then the revised argument would be modus ponens. Similarly, if we replace the conditional in an argument that suffers from the converse error by
its converse, we'll have modus ponens.
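An invalid form is exposed by a single row of the truth table in which all premises are true and the conclusion is false. The sketch below searches for such rows, taking the converse error to be "B; A implies B; therefore A" and the inverse error to be "not A; A implies B; therefore not B". The function name `counterexamples` is our own.

```python
from itertools import product

def implies(a, b):
    return b if a else True

def counterexamples(premises, conclusion, num_vars):
    """Rows where every premise is true but the conclusion is false."""
    return [row for row in product([True, False], repeat=num_vars)
            if all(p(*row) for p in premises) and not conclusion(*row)]

converse_error = counterexamples(
    [lambda a, b: b, lambda a, b: implies(a, b)],
    lambda a, b: a,
    2,
)
inverse_error = counterexamples(
    [lambda a, b: not a, lambda a, b: implies(a, b)],
    lambda a, b: not b,
    2,
)
modus_ponens = counterexamples(
    [lambda a, b: a, lambda a, b: implies(a, b)],
    lambda a, b: b,
    2,
)
print(converse_error, inverse_error, modus_ponens)
```

Both errors fail on the same row (A false, B true), which is one concrete sense in which they are the same mistake; modus ponens has no failing row.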
Exercises 2.7
1. Determine the logical form of the following arguments. Use symbols to
express that form and determine whether the form is valid or invalid. If
the form is invalid, determine the type of error made. Comment on the
soundness of the argument as well, in particular, determine whether
any of the premises are questionable.
(a) All who are guilty are in prison.
George is not in prison.
Therefore, George is not guilty.
(b) If one eats oranges one will have high levels of vitamin C.
You do have high levels of vitamin C.
Therefore, you must eat oranges.
(c) All fish live in water.
The mackerel is a fish.
Therefore, the mackerel lives in water.
(d) If you're lazy, don't take math courses.
Everyone is lazy.
Therefore, no one should take math courses.
(e) All fish live in water.
The octopus lives in water.
Therefore, the octopus is a fish.
(f) If a person goes into politics, they are a scoundrel.
Harold has gone into politics.
Therefore, Harold is a scoundrel.
) C
) B(p)
is the particular form of modus ponens (here, p is not a variable; it
stands for some particular element of the universe of discourse) and
8x, A(x) =) B(x)
8x, B(x)
) 8x, A(x)
Chapter 3
Proof techniques I: Standard methods
Love is a snowmobile racing across the tundra and then suddenly it flips over,
pinning you underneath. At night, the ice weasels come. (Matt Groening)
3.1
If you form the product of 4 consecutive numbers, the result will be one less
than a perfect square. Try it!
$1 \cdot 2 \cdot 3 \cdot 4 = 24 = 5^2 - 1$
$2 \cdot 3 \cdot 4 \cdot 5 = 120 = 11^2 - 1$
$3 \cdot 4 \cdot 5 \cdot 6 = 360 = 19^2 - 1$
It always works!
tive argument in favor of the result. If you like we can try a bunch of further
examples,
$13 \cdot 14 \cdot 15 \cdot 16 = 43680 = 209^2 - 1$
$14 \cdot 15 \cdot 16 \cdot 17 = 57120 = 239^2 - 1$
but really, no matter how many examples we produce, we haven't proved the
statement; we've just given evidence.
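A computer can pile up evidence much faster than we can by hand. The sketch below checks the claim for the first thousand starting values; the helper name is our own. In keeping with the point just made, a run like this is still only evidence, not a proof.

```python
import math

def is_one_less_than_a_square(m):
    """True when m + 1 is a perfect square."""
    root = math.isqrt(m + 1)
    return root * root == m + 1

# Product of the four consecutive integers starting at n.
products = {n: n * (n + 1) * (n + 2) * (n + 3) for n in range(1, 1001)}

all_pass = all(is_one_less_than_a_square(p) for p in products.values())
print(all_pass)
print(products[13], math.isqrt(products[13] + 1))
print(products[14], math.isqrt(products[14] + 1))
```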
Generally, the first thing to do in proving a universal statement like this
is to rephrase it as a conditional. The resulting statement is a Universal
Conditional Statement or a UCS. The reason for taking this step is that the
hypotheses will then be clear: they form the antecedent of the UCS. So,
while you won't have really made any progress in the proof by taking this
advice, you will at least know what tools you have at hand. Taking the
example we started with, and rephrasing it as a UCS we get
1.
$k^2 - 1 = (a^2 + 3a + 1)^2 - 1$
Q.E.D.
Now, if you followed the algebra above (none of which was particularly
difficult), the proof stands as a completely valid argument showing the truth of
our proposition, but this is very unsatisfying! All the real work was concealed
in one stark little sentence: "Let k be $a^2 + 3a + 1$." Where on Earth did
that particular value of k come from? The answer to that question should
hopefully convince you that there is a huge difference between devising a
proof and writing one. A good proof can sometimes be somewhat akin to a
126
5a
a
Now, fill in the entries of the table by multiplying the corresponding row
and column headers.
a2
5a
6a2
5a2
6a
a3
5a
127
Finally add up all the entries of the table, combining any like terms.
You should note that the F.O.I.L rule is just a mnemonic for the case
when the table has 2 rows and 2 columns.
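The table method is easy to mechanize: build every row-by-column product, then add entries whose powers match. In the sketch below, polynomials are lists of coefficients with the constant term first. The particular polynomials, $a^2 + a$ and $a^2 + 5a + 6$, are our own choice of example (their product is $a(a+1)(a+2)(a+3)$), not necessarily the ones in the table above.

```python
def multiply(p, q):
    """Multiply polynomials given as coefficient lists, constant term first."""
    # The table: one row per term of p, one column per term of q.
    table = [[pc * qc for qc in q] for pc in p]
    # Add up the entries, combining like terms (entry [i][j] has degree i + j).
    result = [0] * (len(p) + len(q) - 1)
    for i, row in enumerate(table):
        for j, entry in enumerate(row):
            result[i + j] += entry
    return result

product_poly = multiply([0, 1, 1], [6, 5, 1])   # (a + a^2)(6 + 5a + a^2)
square = multiply([1, 3, 1], [1, 3, 1])         # (1 + 3a + a^2)^2
print(product_poly)
print(square)
```

The two results differ only in the constant term, which is the algebraic content of "one less than a perfect square."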
Okay, let's get back to doing proofs. We are going to do a lot of proofs
involving the concepts of elementary number theory so, as a convenience,
all of the definitions that were made in Chapter 1 are gathered together in
Table 3.1.
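Even
$\forall n \in \mathbb{Z}$,  n is even $\iff \exists k \in \mathbb{Z}, n = 2k$
Odd
$\forall n \in \mathbb{Z}$,  n is odd $\iff \exists k \in \mathbb{Z}, n = 2k + 1$
Divisibility
$\forall n \in \mathbb{Z}, \forall d > 0 \in \mathbb{Z}$,  $d \mid n \iff \exists k \in \mathbb{Z}, n = kd$
Floor
$\forall x \in \mathbb{R}$,  $y = \lfloor x \rfloor \iff y \in \mathbb{Z} \land y \le x < y + 1$
Ceiling
$\forall x \in \mathbb{R}$,  $y = \lceil x \rceil \iff y \in \mathbb{Z} \land y - 1 < x \le y$
Quotient-remainder theorem, div and mod
$\forall n \in \mathbb{Z}, \forall d > 0 \in \mathbb{Z}$,  $\exists! q, r \in \mathbb{Z}, n = qd + r \land 0 \le r < d$
n div d = q
n mod d = r
Prime
$\forall p \in \mathbb{Z}$,  p is prime $\iff (p > 1) \land (\forall x, y \in \mathbb{Z}^+, p = xy \implies x = 1 \lor y = 1)$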
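The definitions of even, odd, divisibility and prime, together with the quotient-remainder theorem, can be transcribed almost literally into code. The sketch below is deliberately naive: each existential quantifier becomes a search over a finite range that we chose to be large enough, and the function names are our own. Python's built-in `divmod` happens to return a remainder r with 0 <= r < d whenever d > 0, which matches the quotient-remainder convention.

```python
def is_even(n):
    # There exists an integer k with n = 2k.
    return any(n == 2 * k for k in range(-abs(n), abs(n) + 1))

def is_odd(n):
    # There exists an integer k with n = 2k + 1.
    return any(n == 2 * k + 1 for k in range(-abs(n) - 1, abs(n) + 1))

def divides(d, n):
    # For d > 0: there exists an integer k with n = kd.
    return any(n == k * d for k in range(-abs(n), abs(n) + 1))

def is_prime(p):
    # p > 1, and every factorization p = xy in positive integers is trivial.
    return p > 1 and all(x == 1 or y == 1
                         for x in range(1, p + 1)
                         for y in range(1, p + 1)
                         if x * y == p)

q, r = divmod(-7, 3)          # n div d and n mod d for n = -7, d = 3
primes_below_20 = [p for p in range(20) if is_prime(p)]
print(q, r, primes_below_20)
```

These functions are far too slow for serious use; their only purpose is to mirror the quantifiers in the definitions.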
In this section we are concerned with direct proofs of universal statements. Such statements come in two flavors: those that appear to involve
conditionals, and those that don't:
Every prime greater than two is odd.
versus
For all integers n, if n is a prime greater than two, then n is odd.
These two forms can readily be transformed one into the other, so we will
always concentrate on the latter. A direct proof of a UCS always follows a
form known as "generalizing from the generic particular." We are trying to
prove that $\forall x \in U, P(x) \implies Q(x)$. The argument (in skeletal outline) will
look like:
Proof: Suppose that a is a particular but arbitrary element of U such that P(a) holds.
...
Therefore Q(a) is true.
Thus we have shown that for all x in U, $P(x) \implies Q(x)$.
Q.E.D.
Okay, so this outline is pretty crappy. It tells you how to start and end a
direct proof, but those obnoxious dot-dot-dots in the middle are where all the
real work has to go. If I could tell you (even in outline) how to fill in those
dots, that would mean mathematical proof isn't really a very interesting activity to engage in. Filling in those dots will sometimes (rarely) be obvious;
more often it will be extremely challenging. It will require great creativity and
loads of concentration, you'll call on all your previous mathematical experiences, and you will most likely experience a certain degree of anguish. Just
remember that your sense of accomplishment is proportional to the difficulty
of the puzzles you attempt. So let's attempt another. . .
In Table 3.1 one of the very handy notions defined is that of the floor of
a real number.
$y = \lfloor x \rfloor \iff (y \in \mathbb{Z} \wedge y \le x < y + 1)$.
There is a sad tendency for people to apply old rules in new situations
just because of a chance similarity in the notation. The brackets used in
notating the floor function look very similar to ordinary parentheses, so the
following rule is often proposed:
$\lfloor x + y \rfloor = \lfloor x \rfloor + \lfloor y \rfloor$
Exercise. Find a counterexample to the previous rule.
What is (perhaps) surprising is that if one of the numbers involved is an
integer then the rule really works.
Theorem 3.1.2.
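$\forall x \in \mathbb{R},\ \forall n \in \mathbb{Z},\ \lfloor x + n \rfloor = \lfloor x \rfloor + \lfloor n \rfloor$
Since the floor of an integer is that integer, we could restate this as
$\lfloor x + n \rfloor = \lfloor x \rfloor + n$.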
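Before hunting for a proof it can be reassuring to test a statement on a
handful of values. The following is a minimal Python sketch of such an
experiment, not a proof: the function names and the particular grid of test
values are our own choices, and exact fractions are used so that rounding
plays no role. It checks the theorem's rule on every pair in the grid and
counts how often the naive rule for two arbitrary numbers fails.

```python
import math
from fractions import Fraction
from itertools import product

def floor_shift_holds(x: Fraction, n: int) -> bool:
    """Check floor(x + n) == floor(x) + n for one rational x and one integer n."""
    return math.floor(x + n) == math.floor(x) + n

def floor_sum_holds(x: Fraction, y: Fraction) -> bool:
    """Check the proposed rule floor(x + y) == floor(x) + floor(y)."""
    return math.floor(x + y) == math.floor(x) + math.floor(y)

# Hypothetical test grid: the fractions p/4 for -20 <= p <= 20, shifts -5..5.
samples = [Fraction(p, 4) for p in range(-20, 21)]
shifts = range(-5, 6)

shift_failures = [(x, n) for x, n in product(samples, shifts)
                  if not floor_shift_holds(x, n)]
sum_failures = [(x, y) for x, y in product(samples, samples)
                if not floor_sum_holds(x, y)]

print("failures of floor(x + n) = floor(x) + n:", len(shift_failures))
print("failures of floor(x + y) = floor(x) + floor(y):", len(sum_failures))
```

A finite check like this can only suggest that the theorem is true; the
universal statement still needs the argument developed in this section.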
The only hypotheses that we can use involve what kinds of numbers x and
n are, so our hypotheses aren't particularly potent. The next most useful
allies in constructing proofs are the definitions of the concepts involved. The
quantity $\lfloor x \rfloor$ appears in the theorem, so let's make use of the definition:
$a = \lfloor x \rfloor \iff a \in \mathbb{Z} \wedge a \le x < a + 1$.
The only other floor function that appears in the statement of the theorem
(perhaps even more prominently) is $\lfloor x + n \rfloor$; here, the definition gives us
$b = \lfloor x + n \rfloor \iff b \in \mathbb{Z} \wedge b \le x + n < b + 1$.
These definitions are our only available tools so we'll certainly have to
make use of them, and it's important to notice that that is a good thing; the
definitions allow us to work with something well-understood (the inequalities
that appear within them) rather than with something new and relatively
suspicious (the floor notation). Putting the proof of this statement together
is an exercise in staring at the two definitions above and noting how one can
be converted into the other. It is also a testament to the power of naming
things.
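Proof: Suppose that x is a particular but arbitrary real number
and that n is a particular but arbitrary integer. Let $a = \lfloor x \rfloor$.
By the definition of the floor, a is an integer and $a \le x < a + 1$.
Adding n to each part of this inequality gives
$a + n \le x + n < (a + n) + 1$. Since $a + n$ is an integer, the
definition of the floor (with $b = a + n$) tells us that
$\lfloor x + n \rfloor = a + n$. Therefore
$\lfloor x + n \rfloor = \lfloor x \rfloor + n$.
Q.E.D.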
The nastiest mistake you can make is to use the same variable for two
different things.
Please write a rough draft first. Write two drafts! Even if you can write
beautiful, lucid prose on the first go around, it won't fly when it comes
to organizing a proof.
The statements in a proof are supposed to be logical statements. That
means they should be Boolean (statements that are either true or false).
An algebraic expression all by itself doesn't count; an inequality or an
equality does.
Don't say "if" when you mean "since." Really! If you start a proof
about rational numbers like so:
Mark off the beginning and the end of your proofs as a hint to your
prefer placing a small rectangle at the end of their proofs, but "Q.E.D." seems more pompous.
Exercises 3.1
1. Every prime number greater than 3 is of one of the two forms 6k + 1
or 6k + 5. What statement(s) could be used as hypotheses in proving
this theorem?
2. Prove that 129 is odd.
3. Prove that the sum of two rational numbers is a rational number.
4. Prove that the sum of an odd number and an even number is odd.
5. Prove that if the sum of two integers is even, then so is their difference.
6. Prove that for every real number x,
$\frac{2}{3} < x < \frac{3}{4} \implies \lfloor 12x \rfloor = 8$.
1.
$\text{evenness}(n) = k \iff 2^k \mid n \wedge 2^{k+1} \nmid n$
State and prove a theorem concerning the evenness of products.
11. Suppose that a, b and c are integers such that a | b and b | c. Prove that
a | c.
3.2
There is a lovely result known as the "arithmetic-geometric mean inequality"
whose proof epitomizes this approach. Basically this inequality compares
two different ways of getting an "average" between two real numbers. The
arithmetic mean of two real numbers a and b is the one you're probably used
to, (a + b)/2. Many people just call this the "mean" of a and b without using
the modifier "arithmetic" but, as we'll see, our notion of what intermediate
value to use in between two numbers is dependent on context. Consider the
following two sequences of numbers (both of which have a missing entry):
2, 9, 16, 23, __, 37, 44
and
3, 6, 12, 24, __, 96, 192.
Some people refer to this as the "forwards-backwards" method, since you work
backwards from the conclusion, but also forwards from the premises, in the
hopes of meeting somewhere in the middle.
The blank in the first sequence should be filled with the arithmetic mean
of the surrounding entries: (23 + 37)/2 = 30. The blank in the second
sequence should be filled using the geometric mean of its surrounding entries:
$\sqrt{24 \cdot 96} = 48$.
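The two ways of filling a blank can be tried out directly. Here is a small
Python sketch (the function names are ours, chosen only for illustration)
that computes both kinds of mean for the entries surrounding each blank.

```python
import math

def arithmetic_mean(a: float, b: float) -> float:
    """The usual average: halfway between a and b on the number line."""
    return (a + b) / 2

def geometric_mean(a: float, b: float) -> float:
    """The multiplicative average; only meaningful here for a, b >= 0."""
    return math.sqrt(a * b)

# Neighbors of the blank in 2, 9, 16, 23, __, 37, 44 (constant difference).
print(arithmetic_mean(23, 37))

# Neighbors of the blank in 3, 6, 12, 24, __, 96, 192 (constant ratio).
print(geometric_mean(24, 96))
```

Swapping the two functions (a geometric mean in the first sequence, an
arithmetic mean in the second) gives values that break the pattern, which
is one way to see that the right notion of "average" depends on context.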
$a, b \ge 0 \implies \frac{a + b}{2} \ge \sqrt{ab}$
$\iff a + b \ge 2\sqrt{ab}$
$\iff (a + b)^2 \ge 4ab$
$\iff a^2 + 2ab + b^2 \ge 4ab$
$\iff a^2 - 2ab + b^2 \ge 0$
Whoa! We're done! Do you see why? If not, I'll give you one hint: the
square of any real number is greater than or equal to zero.
Exercise. Re-assemble all of the steps taken in the previous few paragraphs
into a proof of the arithmetic-geometric mean inequality.
Exercises 3.2
1. Suppose you have a savings account which bears interest compounded
monthly. The July statement shows a balance of \$2104.87 and the
September statement shows a balance of \$2125.97. What would be the
balance on the (missing) August statement?
2. Recall that a quadratic equation $ax^2 + bx + c = 0$ has two real solutions
if and only if the discriminant $b^2 - 4ac$ is positive. Prove that if a and c
have different signs then the quadratic equation has two real solutions.
3. Prove that if x3
4. Prove that for all integers a, b, and c, if a|b and a|(b + c), then a|c.
5. Show that if x is a positive real number, then $x + \frac{1}{x} \ge 2$.
6. Prove that for all real numbers a, b, and c, if ac < 0, then the quadratic
equation $ax^2 + bx + c = 0$ has two real solutions.
Hint: The quadratic equation $ax^2 + bx + c = 0$ has two real solutions
if and only if $b^2 - 4ac > 0$.
7. Show that
$\binom{n}{k} \binom{k}{r} = \binom{n}{r} \binom{n-r}{k-r}$ (for $r \le k \le n$).
3.3
Suppose we are trying to prove that all thrackles are polycyclic³. A direct
proof of this would involve looking up the definition of what it means to be
a thrackle, and of what it means to be polycyclic, and somehow discerning
a way to convert whatever thrackle's logical equivalent is into the logical
equivalent of polycyclic. As happens fairly often, there may be no obvious
way to accomplish this task. Indirect proof takes a completely different
tack. Suppose you had a thrackle that wasn't polycyclic, and then
show that this supposition leads to something truly impossible. Well, if it's
impossible for a thrackle to not be polycyclic, then it must be the case that
all of them are. Such an argument is known as proof by contradiction.
Quite possibly the sweetest indirect proof known is Euclid's proof that
there are infinitely many primes.
Theorem 3.3.1. (Euclid) The set of all prime numbers is infinite.
Proof:
$N = 1 + \prod_{k=1}^{n} p_k$
³ Both of these strange sounding words represent real mathematical concepts, however,
Q.E.D.
If you are working on proving a UCS and the direct approach seems to be
failing, you may find that another indirect approach, proof by contraposition,
will do the trick. In one sense this proof technique isn't really all that
indirect; what one does is determine the contrapositive of the original
conditional and then prove that directly. In another sense this method is
indirect because a proof by contraposition can usually be recast as a proof
by contradiction fairly easily.
The easiest proof I know of using the method of contraposition (and
possibly the nicest example of this technique) is the proof of the lemma we
stated in Section 1.6 in the course of proving that $\sqrt{2}$ wasn't rational.
In case you've forgotten, we needed the fact that whenever $x^2$ is an even
number, so is x.
Let's first phrase this as a UCS.
$\forall x \in \mathbb{Z},\ x^2 \text{ even} \implies x \text{ even}$
Perhaps you tried to prove this result earlier. If so you probably came
across the conceptual problem that all you have to work with is the evenness
Q.E.D.
The main problem in applying the method of proof by contradiction is
that it usually involves "cleverness." You have to come up with some reason
why the presumption that the theorem is false leads to a contradiction,
and this may or may not be obvious. More than any other proof technique,
proof by contradiction demands that we use drafts and rewriting. After
monkeying around enough that we find a way to reach a contradiction, we
need to go back to the beginning of the proof and highlight the feature that
we will eventually contradict! After all, we want it to look like our proofs are
completely clear, concise and reasonable even if their formulation caused us
some sort of Gordian-level mental anguish.
We'll end this section with an example from Geometry.
Theorem 3.3.3. Among all triangles inscribed in a fixed circle, the one with
maximum area is equilateral.
Proof: We'll proceed by contradiction. Suppose to the contrary
that there is a triangle, $\triangle ABC$, inscribed in a circle having maxi-
the formula bh/2 (where b is the base, and h is the altitude), this
triangle's area is evidently greater than that of $\triangle ABC$. This is a
contradiction since $\triangle ABC$ was presumed to have maximal area.
We leave the actual construction of $\triangle AB'C$ to the following exercise.
Q.E.D.
Exercise. Where should we place the point $B'$ in order to create a triangle
$\triangle AB'C$ having greater area than any triangle such as $\triangle ABC$ which is not
isosceles?
Exercises 3.3
1. Prove that if the cube of an integer is odd, then that integer is odd.
2. Prove that whenever a prime p does not divide the square of an integer,
it also doesn't divide the original integer. ($p \nmid x^2 \implies p \nmid x$)
3. Prove (by contradiction) that there is no largest integer.
4. Prove (by contradiction) that there is no smallest positive real number.
5. Prove (by contradiction) that the sum of a rational and an irrational
number is irrational.
6. Prove (by contraposition) that for all integers x and y, if x + y is odd,
then $x \neq y$.
7. Prove (by contraposition) that for all real numbers a and b, if ab is
irrational, then a is irrational or b is irrational.
8. A Pythagorean triple is a set of three natural numbers, a, b and c, such
that $a^2 + b^2 = c^2$. Prove that, in a Pythagorean triple, at least one
of a and b is even. Use either a proof by contradiction or a proof by
contraposition.
9. Suppose you have 2 pairs of real numbers whose products are 1. That
is, you have (a, b) and (c, d) in $\mathbb{R}^2$ satisfying ab = cd = 1. Prove that
a < c implies that b > d.
3.4
Disproofs
yes
2    4    5     yes
3    12   11    yes
3    5    15    yes
5    4    15    yes
5    10   3     yes
7    2    14    yes
$\prod_{k=1}^{n} p_k$.
Define a sequence by
$N_n = 1 + \prod_{k=1}^{n} p_k$,
where $\{p_1, p_2, \ldots, p_n\}$ are the actual first n primes. The first several values
n    $N_n$
1    $1 + (2) = 3$
2    $1 + (2 \cdot 3) = 7$
3    $1 + (2 \cdot 3 \cdot 5) = 31$
4    $1 + (2 \cdot 3 \cdot 5 \cdot 7) = 211$
5    $1 + (2 \cdot 3 \cdot 5 \cdot 7 \cdot 11) = 2311$
$\vdots$    $\vdots$
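A table like this is easy to extend by machine, and extending it is a
natural way to probe whether a pattern persists. The sketch below is our own
illustration, using simple trial division (which is adequate only for small
n); the helper names are hypothetical. It computes $N_n$ for the first few n
and reports whether each value is prime.

```python
from math import prod

def first_primes(count: int) -> list[int]:
    """Return the first `count` primes, found by trial division."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < count:
        # candidate is prime if no smaller prime up to its square root divides it
        if all(candidate % p != 0 for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def smallest_prime_factor(m: int) -> int:
    """Return the least divisor d >= 2 of m (m itself when m is prime)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

def euclid_number(n: int) -> int:
    """N_n = 1 + (product of the first n primes)."""
    return 1 + prod(first_primes(n))

for n in range(1, 8):
    N = euclid_number(n)
    verdict = "prime" if smallest_prime_factor(N) == N else "composite"
    print(n, N, verdict)
```

Running the loop a little past the end of the table is the quickest way to
form an opinion about any conjecture concerning the values $N_n$.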
$\prod_{k=1}^{n} p_k$ is itself
Exercises 3.4
1. Find a polynomial that assumes only prime values for a reasonably
large range of inputs.
2. Find a counterexample to Conjecture 3 using only powers of 2.
3. The alternating sum of factorials provides an interesting example of a
sequence of integers.
$1! = 1$
$2! - 1! = 1$
$3! - 2! + 1! = 5$
$4! - 3! + 2! - 1! = 19$
et cetera
Are they all prime? (After the first two 1's.)
4. It has been conjectured that whenever p is prime, $2^p - 1$ is also prime.
8. True or false: There are two irrational numbers whose product is rational. Prove your answer.
9. True or false: Whenever an integer n is a divisor of the square of an
integer, $m^2$, it follows that n is a divisor of m as well. (In symbols,
$\forall n \in \mathbb{Z},\ \forall m \in \mathbb{Z},\ n \mid m^2 \implies n \mid m$.) Prove your answer.
10. In an exercise in Section 3.2 we proved that the quadratic equation
$ax^2 + bx + c = 0$ has two solutions if ac < 0. Find a counterexample
which shows that this implication cannot be replaced with a biconditional.
3.5
It is necessary to provide an argument that this list of cases is complete! I.e. that
are always colored differently. Figure 3.1 shows one instance of an
arrangement of nations that requires at least four different colors; the
theorem says that four colors are always enough. It should be noted that real
cartographers usually reserve a fifth color for oceans (and other water) and
that it is possible to conceive of a map requiring five colors if one allows
the nations to be non-contiguous. In 1977, Kenneth Appel and Wolfgang Haken
proved the four color theorem by reducing the infinitude of possibilities to
1,936 separate cases and analyzing each of these with a computer. The
inelegance of a proof by cases is probably proportional to some power of the
number of cases, but in any case, this proof is generally considered somewhat
inelegant. Ever since the proof was announced there has been an ongoing effort
to reduce the number of cases (currently the record is 633 cases, still far
too many to be checked through without a computer) or to find a proof that
does not rely on cases. For a good introductory article on the four color
theorem see [6].
yet the statement has been checked for a large number of cases. Goldbach's
conjecture is one such statement. Christian Goldbach [4] was a mathematician
born in Königsberg, Prussia, who, curiously, did not make the conjecture⁶
which bears his name. In a letter to Leonhard Euler, Goldbach conjectured
that every odd number greater than 5 could be expressed as the sum of three
primes (nowadays this is known as the weak Goldbach conjecture). Euler
apparently liked the problem and replied to Goldbach stating what is now
known as Goldbach's conjecture: Every even number greater than 2 can be
expressed as the sum of two primes. This statement has been lying around
since 1742, and a great many of the world's best mathematicians have made
their attempts at proving it, to no avail! (Well, actually a lot of progress
has been made, but the result still hasn't been proved.) It's easy to verify
the Goldbach conjecture for relatively small even numbers, so what has been
done is a series of proofs by exhaustion of Goldbach's conjecture restricted
to finite universes. As of this writing, the conjecture has been verified to
be true of all even numbers less than $2 \times 10^{17}$.
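To make "proof by exhaustion over a finite universe" concrete, here is a
small Python sketch of our own that checks Goldbach's conjecture for every
even number up to a modest bound. The bound LIMIT and the function names
are arbitrary choices for illustration; the record-setting verifications
mentioned above use far more sophisticated methods.

```python
def sieve(limit: int) -> list[bool]:
    """is_prime[m] is True exactly when m is prime, for 0 <= m <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return is_prime

def goldbach_witness(n: int, is_prime: list[bool]):
    """Return primes (p, q) with p + q == n and p <= q, or None if none exist."""
    for p in range(2, n // 2 + 1):
        if is_prime[p] and is_prime[n - p]:
            return p, n - p
    return None

LIMIT = 10_000  # hypothetical bound; raise it as patience allows
is_prime = sieve(LIMIT)

# The finite universe: every even number from 4 up to LIMIT.
failures = [n for n in range(4, LIMIT + 1, 2)
            if goldbach_witness(n, is_prime) is None]
print("even numbers in [4, LIMIT] with no decomposition:", failures)
```

An empty list of failures proves the conjecture for this finite universe
and says nothing at all about even numbers beyond LIMIT.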
ment it is generally felt that a direct proof would be more esthetically
pleasing. If you are in a situation that doesn't admit such a direct proof, you
should at least seek a proof by cases using the minimum possible number of
cases. For example, consider the following theorem and proof.
Theorem 3.5.1. $\forall n \in \mathbb{Z}$, $n^2$ is of the form 4k or 4k + 1 for some $k \in \mathbb{Z}$.
Proof:
Q.E.D.
While the proof just stated is certainly valid, the argument is inelegant
since a smaller number of cases would suffice.
Exercise. The previous theorem can be proved using just two cases. Do so.
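A computer is happy to do a proof by cases in the least elegant way
possible. The sketch below (our own, with a hypothetical function name)
uses the fact that $(n + 4)^2$ and $n^2$ leave the same remainder on division
by 4, so looking at the four cases n = 0, 1, 2, 3 covers every integer.

```python
def square_residues(modulus: int) -> set[int]:
    """Remainders of n*n modulo `modulus` as n runs over one full period."""
    return {(n * n) % modulus for n in range(modulus)}

# Four cases (n = 0, 1, 2, 3) are enough to cover all integers modulo 4.
print(square_residues(4))

# Spot check on a wider range, including negative integers.
print({(n * n) % 4 for n in range(-50, 51)})
```

This is a four-case argument; the exercise above asks for something
better, and the code does not give the two-case proof away.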
We'll close this section by asking you to determine an exhaustive proof
where the complexity of the argument is challenging but not too impossible.
Graph pebbling is an interesting concept originated by the famous
combinatorialist Fan Chung. A "graph" (as the term is used here) is a
collection of places or locations which are known as nodes, some of which are
joined by paths or connections which are known as edges. Graphs have been
studied by mathematicians for nearly 300 years, and many interesting problems
can be put in this setting. Graph pebbling is a crude version of a broader
problem in resource management: often a resource actually gets used in the
process of transporting it. Think of the big tanker trucks that are used to
transport gasoline. What do they run on? Well, actually they probably burn
diesel, but the point is that in order to move the fuel around we have to
consume some of it. Graph pebbling takes this to an extreme: in order to
move one pebble we must consume one pebble.
Imagine that a bunch of pebbles are randomly distributed on the nodes
of a graph, and that we are allowed to do graph pebbling moves: we remove
two pebbles from some node and place a single pebble on a node that is
connected to it. See Figure 3.3.
Figure 3.3: A graph pebbling move takes two pebbles off of a node and puts
one of them on an adjacent node (the other is discarded). Notice how node
C, which formerly held 3 pebbles, now has only 1 and that a pebble is now
present on node D where previously there was none.
For example, consider the triangle graph: three nodes which are all
mutually connected. The pebbling number of this graph is 3. If we start
with one pebble on each node we are already done; if there is a node that has
two pebbles on it, we can use a pebbling move to reach either of the other
two nodes.
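An exhaustive argument of this kind can be mechanized. The following Python
sketch is our own illustration and rests on one assumption: that the
pebbling number is the smallest number of pebbles such that, however they
are distributed, any chosen node can be reached by some sequence of pebbling
moves. The graph representation (a dictionary of neighbor lists) and all
function names are hypothetical.

```python
from itertools import combinations_with_replacement

Graph = dict[int, list[int]]

def distributions(num_nodes: int, pebbles: int):
    """Yield every way to place `pebbles` identical pebbles on the nodes."""
    for choice in combinations_with_replacement(range(num_nodes), pebbles):
        config = [0] * num_nodes
        for node in choice:
            config[node] += 1
        yield tuple(config)

def pebbling_moves(config: tuple[int, ...], graph: Graph):
    """Yield each configuration reachable from `config` by one pebbling move."""
    for node, count in enumerate(config):
        if count >= 2:
            for neighbor in graph[node]:
                after = list(config)
                after[node] -= 2      # two pebbles leave the node ...
                after[neighbor] += 1  # ... but only one arrives next door
                yield tuple(after)

def can_reach(config: tuple[int, ...], target: int, graph: Graph) -> bool:
    """Search all move sequences; True if one puts a pebble on `target`."""
    seen = {config}
    stack = [config]
    while stack:
        current = stack.pop()
        if current[target] >= 1:
            return True
        for nxt in pebbling_moves(current, graph):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def always_solvable(graph: Graph, pebbles: int) -> bool:
    """True if every distribution of `pebbles` can reach every target node."""
    n = len(graph)
    return all(can_reach(c, t, graph)
               for c in distributions(n, pebbles) for t in range(n))

def pebbling_number(graph: Graph) -> int:
    """Smallest pebble count for which every distribution is solvable."""
    pebbles = 1
    while not always_solvable(graph, pebbles):
        pebbles += 1
    return pebbles

triangle: Graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(pebbling_number(triangle))
```

The search always terminates because each move lowers the total number of
pebbles by one. The same functions accept any small graph written as a
dictionary of neighbor lists.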
Exercise. There is a graph $C_5$ which consists of 5 nodes connected in a
circular fashion. Determine its pebbling number. Prove your answer exhaustively.
Hint: the pebbling number must be greater than 4 because if one pebble is
placed on each of 4 nodes the configuration is unmovable (we need to have
two pebbles on a node in order to be able to make a pebbling move at all) and
so the 5th node can never be reached.
Exercises 3.5
1. Prove that if n is an odd number then $n^4 \bmod 16 = 1$.
2. Prove that every prime number other than 2 and 3 has the form 6q + 1
or 6q + 5 for some integer q. (Hint: this problem involves thinking
about cases as well as contrapositives.)
3. Show that the sum of any three consecutive integers is divisible by 3.
4. Find the pebbling number of a graph whose nodes are the corners and
whose edges are the, uhmm, edges of a cube.
5. A vampire number is a 2n digit number v that factors as v = xy where
x and y are n digit numbers and the digits of v are the union of the
digits in x and y in some order. The numbers x and y are known as
the fangs of v. To eliminate trivial cases, pairs of trailing zeros are
disallowed.
Show that there are no 2-digit vampire numbers.
Show that there are seven 4-digit vampire numbers.
6. Lagrange's theorem on representation of integers as sums of squares
says that every positive integer can be expressed as the sum of at most
4 squares. For example, $79 = 7^2 + 5^2 + 2^2 + 1^2$. Show (exhaustively)
that 15 cannot be represented using fewer than 4 squares.
7. Show that there are exactly 15 numbers x in the range $1 \le x \le 100$
that can't be represented using fewer than 4 squares.
8. The trichotomy property of the real numbers simply states that every
real number is either positive or negative or zero. Trichotomy can be
used to prove many statements by looking at the three cases that it
guarantees. Develop a proof (by cases) that the square of any real
number is non-negative.
9. Consider the game called "binary determinant tic-tac-toe"⁷ which is
played by two players who alternately fill in the entries of a $3 \times 3$
array. Player One goes first, placing 1's in the array and player Zero
goes second, placing 0's. Player One's goal is that the final array have
determinant 1, and player Zero's goal is that the determinant be 0.
The determinant calculations are carried out mod 2.
Show that player Zero can always win a game of binary determinant
tic-tac-toe by the method of exhaustion.
⁷ This question was problem A4 in the 63rd annual William Lowell Putnam
Mathematics Competition (2002). There are three collections of questions and
answers from previous Putnam exams available from the MAA [1, 7, 9].
3.6
From a certain point of view, there is no need for the current section. If
we are proving an existential statement we are disproving some universal
statement. (Which has already been discussed.) Similarly, if we are trying
to disprove an existential statement, then we are actually proving a related
universal statement. Nevertheless, sometimes the way a theorem is stated
emphasizes the existence question over the corresponding universal and so
people talk about proving and disproving existential statements as a separate
issue from universal statements.
Proofs of existential questions come in two basic varieties: constructive
and non-constructive. Constructive proofs are conceptually the easier of the
two: you actually name an example that shows the existential question is
true. For example:
Exercise. The Fibonacci numbers are defined by the initial values F(0) = 1
and F(1) = 1 and the recursive formula $F(n + 1) = F(n) + F(n - 1)$ (to get
the next number in the series you add the last and the penultimate).
n F (n)
0
5
..
.
8
..
.
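such that $\alpha^\beta$ is rational.
Proof: If $\sqrt{2}^{\sqrt{2}}$ is rational then we are done. (Let
$\alpha = \beta = \sqrt{2}$.) Otherwise, let $\alpha = \sqrt{2}^{\sqrt{2}}$ and
$\beta = \sqrt{2}$. The result follows because
$\left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \sqrt{2}^{(\sqrt{2}\sqrt{2})} = \sqrt{2}^{2} = 2$,
which is clearly rational.
Q.E.D.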
Many existential proofs involve a property of the natural numbers known
as the well-ordering principle. The well-ordering principle is sometimes
abbreviated WOP. If a set has WOP it doesn't mean that the set is ordered
in a particularly good way, but rather that its subsets are like wells, the
kind one hoists water out of with a bucket on a rope. You needn't be
concerned with WOP in general at this point, but notice that the subsets of the
natural numbers have a particularly nice property: any non-empty set of
natural numbers must have a least element (much like every water well has
a bottom).
Because the natural numbers have the well-ordering principle we can
prove that there is a least natural number with property X by simply finding
any natural number with property X. By doing that we've shown that the
set of natural numbers with property X is non-empty, and that's the only
hypothesis the WOP needs.
For example, in the exercises in Section 3.5 we introduced vampire numbers. A vampire number is a 2n digit number v that factors as v = xy where
x and y are n digit numbers and the digits of v are the union of the digits in
x and y in some order. The numbers x and y are known as the fangs of v.
To eliminate trivial cases, pairs of trailing zeros are disallowed.
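The definition just recalled translates into a direct test. The sketch below
is our own reading of it in Python: the function names are hypothetical, and
"pairs of trailing zeros are disallowed" is interpreted as ruling out fang
pairs in which both x and y end in 0.

```python
def fangs(v: int) -> list[tuple[int, int]]:
    """Return all fang pairs (x, y) with x <= y for a candidate number v."""
    digits = str(v)
    if len(digits) % 2 != 0:
        return []                     # a vampire number has 2n digits
    half = len(digits) // 2
    low, high = 10 ** (half - 1), 10 ** half   # n-digit numbers lie in [low, high)
    pairs = []
    for x in range(low, high):
        if x * x > v:
            break                     # past the square root, x <= y is impossible
        if v % x != 0:
            continue
        y = v // x
        if not (low <= y < high):
            continue                  # y must also have exactly n digits
        if x % 10 == 0 and y % 10 == 0:
            continue                  # both fangs end in zero: disallowed
        if sorted(str(x) + str(y)) == sorted(digits):
            pairs.append((x, y))
    return pairs

def is_vampire(v: int) -> bool:
    return bool(fangs(v))

def first_vampire(num_digits: int):
    """Scan upward through the num_digits-digit numbers for a vampire number."""
    for v in range(10 ** (num_digits - 1), 10 ** num_digits):
        if is_vampire(v):
            return v
    return None

print(fangs(1260))
print([v for v in range(1000, 10000) if is_vampire(v)])
```

Notice that exhibiting a single number of the right size that passes
`is_vampire` is all that the well-ordering principle asks for; scanning
upward with `first_vampire` does the additional work of locating the least one.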
Theorem 3.6.3. There is a smallest 6-digit vampire number.
Proof:
There are quite a few occasions when we need to prove statements involving
the unique existence quantifier ($\exists!$). In such instances we need to do
just a little bit more work. We need to show existence, either constructively
or non-constructively, and we also need to show uniqueness. To give an
example of a unique existence proof we'll return to a concept first discussed
in Section 1.5 and finish up some business that was glossed over there.
Recall the Euclidean algorithm that was used to calculate the greatest
common divisor of two integers a and b (which we denote gcd(a, b)). There
is a rather important question concerning algorithms known as the "halting
problem": does the program eventually halt, or does it get stuck in an
infinite loop? We know that the Euclidean algorithm halts (and outputs the
correct result) because we know the following unique existence result.
$\forall a, b \in \mathbb{Z}^+,\ \exists!\, d \in \mathbb{Z}^+$ such that $d = \gcd(a, b)$
Now, before we can prove this result, we'll need a precise definition for
gcd(a, b). Firstly, a gcd must be a common divisor which means it needs to
divide both a and b. Secondly, among all the common divisors, it must be the
largest. This second point is usually addressed by requiring that every other
common divisor divides the gcd. Finally we should note that a gcd is always
positive, for whenever a number divides another number so does its negative,
and whichever of those two is positive will clearly be the greater! This allows
us to extend the definition of gcd to all integers, but things are conceptually
easier if we keep our attention restricted to the positive integers.
Definition. The greatest common divisor, or gcd, of two positive integers
a and b is a positive integer d such that d | a and d | b and if c is any other
positive integer such that c | a and c | b then c | d.
$\forall a, b, c, d \in \mathbb{Z}^+,\ d = \gcd(a, b) \iff d \mid a \wedge d \mid b \wedge (c \mid a \wedge c \mid b \implies c \mid d)$
unique existence of the gcd. The uniqueness part is easier so we'll do that
first. We argue by contradiction. Suppose that there were two different
numbers $d$ and $d'$ satisfying the definition of gcd(a, b). Put $d'$ in the
place of c in the definition to see that $d' \mid d$. Similarly, we can deduce
that $d \mid d'$, and if two positive numbers each divide into the other, they
must be equal. This is a contradiction since we assumed $d$ and $d'$ were
different.
For the existence part we'll need to define a set known as the Z-module
generated by a and b that consists of all numbers of the form xa + yb where
x and y range over the integers.
This set has a very nice geometric character that often doesn't receive
the attention it deserves. Every element of a Z-module generated by two
numbers (15 and 21 in the example) corresponds to a point in the Euclidean
plane. As indicated in Figure 3.4 there is a dividing line between the positive
and negative elements in a Z-module. It is also easy to see that there are
many repetitions of the same value at different points in the plane.
Exercise. The value 0 clearly occurs in a Z-module when both x and y are
themselves zero. Find another pair of (x, y) values such that 21x + 15y is
zero. What is the slope of the line which separates the positive values from
the negative in our Z-module?
In thinking about this Z-module, and perusing Figure 3.4, you may have
noticed that the smallest positive number in the Z-module is 3. If you hadn't
noticed that, look back and verify that fact now.
Exercise. How do we know that some smaller positive value (a 1 or a 2)
doesn't occur somewhere in the Euclidean plane?
What we've just observed is a particular instance of a general result.
Theorem 3.6.4. The smallest positive number in the Z-module generated by
a and b is d = gcd(a, b).
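The theorem can be explored numerically before it is proved. The following
Python sketch is our own: it looks at the values xa + yb for x and y in a
finite window (the parameter `window` is an arbitrary choice, so the search
only sees part of the Z-module) and compares the smallest positive value
found with the gcd computed by the standard library.

```python
from math import gcd

def smallest_positive_combination(a: int, b: int, window: int = 10) -> int:
    """Smallest positive value of x*a + y*b with |x| <= window and |y| <= window."""
    values = {
        x * a + y * b
        for x in range(-window, window + 1)
        for y in range(-window, window + 1)
    }
    return min(v for v in values if v > 0)

# The running example from Figure 3.4.
print(smallest_positive_combination(21, 15), gcd(21, 15))

# A few more pairs, with a window wide enough to contain suitable x and y.
for a, b in [(12, 18), (7, 5), (35, 14)]:
    print(a, b, smallest_positive_combination(a, b, window=max(a, b)), gcd(a, b))
```

Agreement on a handful of pairs is evidence, not proof; in particular a
finite window cannot by itself rule out smaller positive values farther
out in the plane.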
Figure 3.4: The Z-module generated by 21 and 15. The number 21x + 15y
is printed by the point (x, y).
$a - qd = a - q(xa + yb) = (1 - qx)a - (qy)b$
Q.E.D.
Exercises 3.6
1. Show that there is a perfect square that is the sum of two perfect
squares.
2. Show that there is a perfect cube that is the sum of three perfect cubes.
3. Show that the WOP doesn't hold in the integers. (This is an existence
proof, you show that there is a subset of Z that doesn't have a smallest
element.)
4. Show that the WOP doesn't hold in $\mathbb{Q}^+$.
5. In the proof of Theorem 3.6.4 we weaseled out of showing that d | b.
Fill in that part of the proof.
Chapter 4
Sets
No more turkey, but I'd like some more of the bread it ate.
- Hank Ketcham
4.1
In modern mathematics there is an area called Category theory¹ which
studies the relationships between different areas of mathematics. More precisely,
the founders of category theory noticed that essentially the same theorems
and proofs could be found in many different mathematical fields with only
the names of the structures involved changed. In this sort of situation one
can make what is known as a categorical argument in which one proves the
desired result in the abstract, without reference to the details of any
particular field. In effect this allows one to prove many theorems at once: all
you need to convert an abstract categorical proof into a concrete one relevant
to a particular area is a sort of key or lexicon to provide the correct names
for things. Now, category theory probably shouldn't really be studied until
you have a background that includes enough different fields that you can
make sense of their categorical correspondences. Also, there are a good many
mathematicians who deride category theory as "abstract nonsense." But, as
someone interested in developing a facility with proofs, you should be on
the lookout for categorical correspondences. If you ever hear yourself utter
something like "well, the proof of that goes just like the proof of the (insert
weird technical-sounding name here) theorem" you are probably noticing a
categorical correspondence.
¹ The classic text by Saunders Mac Lane [11] is still considered one of the best intro-
Okay, so category theory won't be of much use to you until much later in
your mathematical career (if at all), and one could argue that it doesn't really
save that much effort. Why not just do two or three different proofs instead
of learning a whole new field so we can combine them into one? Nevertheless,
category theory is being mentioned here at the beginning of the chapter on
sets. Why?
We are about to see our first example of a categorical correspondence.
Logic and Set theory are different aspects of the same thing. To describe a
set people often quote Georg Cantor: "A set is a Many that allows itself to be
thought of as a One." (Note how the attempt at defining what is really an
elemental, undefinable concept ends up sounding rather mystical.) A more
practical approach is to think of a set as the collection of things that make
some open sentence true.²
Recall that in Logic the atomic concepts were "true", "false", "sentence"
and "statement." In Set theory, they are "set", "element" and "membership."
These concepts (more or less) correspond to one another. In most
books, a set is denoted either using the letter M (which stands for the German
word "Menge") or early alphabet capital roman letters: A, B, C, et
cetera. Here, we will often emphasize the connection between sets and open
² This may sound less metaphysical, but this statement is also faulty because
it defines "set" in terms of "collection", which will of course be defined
elsewhere as "the sort of things of which sets are one example."
same set. Also, a set either contains, or doesn't contain, a given element. It
doesn't make sense to have an element in a set multiple times. By convention,
if an element is listed more than once when a set is listed we ignore
the repetitions. So, the sets {1, 1} and {1} are really the same thing. If the
multiset concept is useful when studying puzzles like "How many ways can
the letters of MISSISSIPPI be rearranged?" because the letters in MISSISSIPPI
can be expressed as the multiset {1 · M, 4 · I, 2 · P, 4 · S}. With the
If instead we compare them after they've been sorted, the job is much
easier.
S1 = {1, A, }, e, , h, , , , }
S2 = {1, A, }, e, , , , , s, }
This business about "ordered" versus "unordered" comes up fairly often so
it's worth investing a few moments to figure out how it works. If a collection
of things that is inherently unordered is handed to us we generally put them
in an order that is pleasing to us. Consider receiving five cards from the
dealer in a card game, or extracting seven letters from the bag in a game
of Scrabble. If, on the other hand, we receive a collection where order is
important we certainly may not rearrange them. Imagine someone receiving
the telephone number of an attractive other but writing it down with the
digits sorted in increasing order!
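Most programming languages draw the same distinctions. As a rough
illustration (our own, using Python's built-in `set`, `list` and
`collections.Counter` as stand-ins for sets, ordered lists and multisets;
the phone digits are made up):

```python
from collections import Counter

# A set ignores both repetition and order.
print({1, 1} == {1})
print({1, 2, 3} == {3, 1, 2})

# A list is ordered, so sorting it generally produces a different object.
phone_number = [5, 5, 5, 0, 1, 9, 9]   # made-up digits
print(sorted(phone_number) == phone_number)

# A multiset keeps multiplicities but not order; Counter is one way to model it.
letters = Counter("MISSISSIPPI")
print(letters)
print(Counter("MISSISSIPPI") == Counter("IIIIMPPSSSS"))
```

The last comparison treats two rearrangements of the same letters as the
same multiset, which is exactly why multisets are a convenient bookkeeping
device for the MISSISSIPPI puzzle.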
Exercise. Consider a universe consisting of just the first 5 natural numbers
U = {1, 2, 3, 4, 5}. How many different sets having 4 elements are there
empty set (note the definite article). There are as many singletons as there
are elements in your universe. They aren't the same though, for example
$1 \neq \{1\}$. There is only one empty set and it is denoted $\emptyset$ irrespective of
the universe we are working in.
a set, whose elements are all the possible sets in this universe. This set is
known as the power set of the universal set. Indeed, we can construct the
power set of any set A and we denote it with the symbol $\mathcal{P}(A)$. Returning
to our example we have
$\mathcal{P}(\{1, 2, 3\}) = \{\emptyset,$
Exercise.
Find the power sets $\mathcal{P}(\{1, 2\})$ and $\mathcal{P}(\{1, 2, 3, 4\})$.
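If you would like to check hand-computed power sets, they can be generated
mechanically. Here is a minimal Python sketch of our own (the function name
`power_set` is hypothetical) that builds the subsets size by size using the
standard `itertools` module, shown on the example set {1, 2, 3}.

```python
from itertools import chain, combinations

def power_set(elements):
    """Return every subset of `elements` as a frozenset, smallest subsets first."""
    items = list(elements)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

for subset in power_set({1, 2, 3}):
    print(set(subset) if subset else "∅")

print(len(power_set({1, 2, 3})))
```

Frozensets are used because Python's ordinary sets cannot themselves be
elements of a set, a small technical echo of the fact that the elements of
a power set are sets.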
Hint: If your conjectured formula is correct you should see why these sets
should count how many things are in A. If A isn't a set then we are talking
about the ordinary absolute value
Exercises 4.1
1. What is the power set of $\emptyset$? Hint: if you got the last exercise in the
chapter you'd know that this power set has $2^0 = 1$ element.
4.2
Containment
There are two notions of being "inside" a set. A thing may be an element
of a set, or may be contained as a subset. Distinguishing these two notions
of inclusion is essential. One difficulty that sometimes complicates things is
that a set may contain other sets as elements. For instance, as we saw in the
previous section, the elements of a power set are themselves sets.
A set A is a subset of another set B if all of A's elements are also in B.
The terminology superset is used to refer to B in this situation, as in "The set
of all real-valued functions in one real variable is a superset of the polynomial
functions." The subset/superset relationship is indicated with a symbol that
should be thought of as a stylized version of the less-than-or-equal sign: when
A is a subset of B we write $A \subseteq B$.
the fact that the sets are not equal we can write $A \subsetneq B$. By the way, if
you want to emphasize the superset relationship, all of these symbols can
be turned around. So for example $A \supseteq B$ means that A is a superset of B
although they could potentially be equal.
and the set that it's in. The following exercise is intended to clarify the
distinction between $\in$ and $\subseteq$.
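Exercise. Let $A = \{1, 2, \{1\}, \{a, b\}\}$. Which of the following are true?
i) $\{a, b\} \subseteq A$.
ii) $\{a, b\} \in A$.
iii) $a \in A$.
iv) $1 \in A$.
v) $1 \subseteq A$.
vi) $\{1\} \subseteq A$.
vii) $\{1\} \in A$.
viii) $\{2\} \in A$.
ix) $\{2\} \subseteq A$.
x) $\{\{1\}\} \subseteq A$.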
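The two notions of "being inside" can be told apart experimentally. The sketch below uses a made-up set $S = \{0, \{0\}\}$ (not the set from the exercise, so nothing is given away); the names `S` and `inner` are our own. In Python, `in` tests membership and `<=` tests containment.

```python
# A hypothetical example set: the number 0 together with the set {0}.
inner = frozenset({0})          # frozenset, so that it may be an element of S
S = {0, inner}

print(0 in S)                   # True: 0 is an element of S
print(inner in S)               # True: {0} is an element of S
print(inner <= S)               # True: {0} is a subset of S, because 0 is in S
print(frozenset({inner}) <= S)  # True: {{0}} is a subset of S, because {0} is in S
print(frozenset({1}) <= S)      # False: 1 is not an element of S
```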
Another perspective that may help clear up the distinction between $\in$ and $\subseteq$
should be something that can appropriately be inserted between two sentences. Let's run through a short example to figure out what that might be.
To keep things simple we'll work inside the universal set $U = \{1, 2, 3, \ldots, 50\}$.
so: $T \subseteq F$. On the other hand we can re-express the sets $T$ and $F$ using
\[ T = \{x \in U \;\big|\; 10 \mid x\} \]
\[ F = \{x \in U \;\big|\; 5 \mid x\} \]
it's the implication arrow. It's easy to verify that $10 \mid x \implies 5 \mid x$, and it's
equally easy to note that the other direction doesn't work, $5 \mid x \nRightarrow 10 \mid x$.
The general statement is: if $A$ and $B$ are sets, and $M_A(x)$ and $M_B(x)$ are
their respective membership questions, then $A \subseteq B$ corresponds precisely to
\[ \forall x \in U,\ M_A(x) \implies M_B(x). \]
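The correspondence between containment and implication can be checked directly for the sets of multiples of 10 and of 5 in $U$. This is a small sketch; the helper `implies` is our own name for the truth-functional conditional.

```python
U = range(1, 51)                       # the universe {1, 2, ..., 50}
T = {x for x in U if x % 10 == 0}      # multiples of 10
F = {x for x in U if x % 5 == 0}       # multiples of 5

def implies(p, q):
    """Truth value of the conditional p => q."""
    return (not p) or q

subset = T <= F
implication = all(implies(x % 10 == 0, x % 5 == 0) for x in U)
converse = all(implies(x % 5 == 0, x % 10 == 0) for x in U)

print(subset, implication)   # containment and implication agree
print(converse, F <= T)      # the converse fails, and so does the reverse containment
```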
Now to many people (me included!) this looks funny at first, in Set
\[ C = \{x \in \mathbb{Z} \;\big|\; \exists k \in \mathbb{Z},\ x = 3k\}. \]
The set $D$ is contained in $C$. Let's prove it!
Proof:
Exercises 4.2
1. Insert either $\in$ or $\subseteq$ in the blanks in the following sentences (in order
to produce true sentences).
i) 1
ii) {a}
iii) {a, b}
iv) {{a, b}}
=)
MB . What
6. Prove that the set of perfect fourth powers is contained in the set of
perfect squares.
4.3
Set operations
of union ($\cup$) and intersection ($\cap$). The symbols are designed to provide a
mnemonic for the correspondence; the Set theory symbols are just rounded
versions of those from Logic.
Explicitly, if $P(x)$ and $Q(x)$ are open sentences, then the union of the
corresponding truth sets $S_P$ and $S_Q$ is defined by
\[ S_P \cup S_Q = \{x \in U \;\big|\; P(x) \lor Q(x)\}. \]
Exercise. Suppose two sets $A$ and $B$ are given. Re-express the previous
definition of union using their membership criteria, $M_A(x)$ = "$x \in A$" and
$M_B(x)$ = "$x \in B$".
The union of more than two sets can be expressed using a big union
symbol. For example, consider the family of real intervals defined by $I_n = (n, n + 1]$.³ There's an interval for every integer $n$. Also, every real number
is in one of these intervals. The previous sentence can be expressed as
\[ \mathbb{R} = \bigcup_{n \in \mathbb{Z}} I_n. \]
The intersection of two sets is conceptualized as "what they have in common" but the precise definition is found by considering conjunctions,
\[ A \cap B = \{x \in U \;\big|\; x \in A \land x \in B\}. \]
³ The elements of $I_n$ can also be distinguished as the solution sets of the inequalities
$n < x \le n + 1$.
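To see the definitions of union and intersection at work, here is a small sketch with two made-up open sentences on a made-up universe: $P(x)$ = "$x$ is even" and $Q(x)$ = "$x$ is a multiple of 3" on $U = \{1, \ldots, 12\}$. These choices are assumptions for illustration only.

```python
U = set(range(1, 13))          # hypothetical universe {1, ..., 12}

def P(x):
    return x % 2 == 0          # "x is even"

def Q(x):
    return x % 3 == 0          # "x is a multiple of 3"

S_P = {x for x in U if P(x)}   # truth set of P
S_Q = {x for x in U if Q(x)}   # truth set of Q

# Build the union and intersection from the logical connectives ...
union_by_logic = {x for x in U if P(x) or Q(x)}
intersection_by_logic = {x for x in U if P(x) and Q(x)}

# ... and compare with Python's built-in set operations | and &.
print(union_by_logic == (S_P | S_Q))
print(intersection_by_logic == (S_P & S_Q))
print(sorted(intersection_by_logic))
```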
Exercise. With reference to two open sentences $P(x)$ and $Q(x)$, define the
intersection of their truth sets, $S_P \cap S_Q$.
There is also a big version of the intersection symbol. Using the same
family of intervals as before,
\[ \emptyset = \bigcap_{n \in \mathbb{Z}} I_n. \]
was writing the last paragraph, this text was nothing more than a very long
sequence of zeros and ones in the memory of my computer. . .
Every rule that we learned in Chapter 2 (see Table 2.2) has a set-theoretic
equivalent. These set-theoretic versions are expressed using equalities (i.e.
the symbol = in between two sets) which is actually a little bit funny if you
think about it. We normally use = to mean that two numbers or variables
have the same numerical magnitude, as in $12^2 = 144$; we are doing something altogether different when we use that symbol between two sets, as in
$\{1, 2, 3\} = \{\sqrt{1}, \sqrt{4}, \sqrt{9}\}$, but people seem to be used to this so there's no
sense in quibbling.
Exercise. Develop a useful definition for set equality. In other words, come
up with a (quantified) logical statement that means the same thing as "$A = B$" for two arbitrary sets $A$ and $B$.
Exercise. What symbol in Logic should go between the membership criteria
$M_A(x)$ and $M_B(x)$ if $A$ and $B$ are equal sets?
In Table 4.1 the rules governing the interactions between the set theoretic
operations are collected.
We are now in a position somewhat similar to when we jumped from
proving logical assertions with truth tables to doing two-column proofs. We
have two different approaches for showing that two sets are equal. We can
do a so-called "element chasing" proof (to show $A = B$, assume $x \in A$ and
prove $x \in B$ and then vice versa). Or, we can construct a proof using the
basic set equalities given in Table 4.1. Often the latter can take the form of
a two-column proof.
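Table 4.1

                    Intersection version                                   Union version
Commutative laws    $A \cap B = B \cap A$                                  $A \cup B = B \cup A$
Associative laws    $A \cap (B \cap C) = (A \cap B) \cap C$                $A \cup (B \cup C) = (A \cup B) \cup C$
Distributive laws   $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$       $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$
DeMorgan's laws     $\overline{A \cap B} = \overline{A} \cup \overline{B}$ $\overline{A \cup B} = \overline{A} \cap \overline{B}$
Complementarity     $A \cap \overline{A} = \emptyset$                      $A \cup \overline{A} = U$
Identity laws       $A \cap U = A$                                         $A \cup \emptyset = A$
Domination          $A \cap \emptyset = \emptyset$                         $A \cup U = U$
Idempotence         $A \cap A = A$                                         $A \cup A = A$
Absorption          $A \cap (A \cup B) = A$                                $A \cup (A \cap B) = A$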
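Just as a truth table checks a logical equivalence case by case, a set equality can be spot-checked by trying every combination of subsets of a small universe. The sketch below does this for a few of the laws above; the universe $\{1, 2, 3\}$ and all function names are our own assumptions, and a check of this kind is evidence, not a proof.

```python
from itertools import combinations, product

# Assumption for illustration: a small universe U = {1, 2, 3}.
U = frozenset({1, 2, 3})

def subsets(universe):
    """List every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(A):
    """Complement of A relative to the universe U."""
    return U - A

# Each law is a function of three sets returning True when both sides agree.
LAWS = {
    "distributive (intersection)": lambda A, B, C: A & (B | C) == (A & B) | (A & C),
    "distributive (union)": lambda A, B, C: A | (B & C) == (A | B) & (A | C),
    "DeMorgan (intersection)": lambda A, B, C: complement(A & B) == complement(A) | complement(B),
    "DeMorgan (union)": lambda A, B, C: complement(A | B) == complement(A) & complement(B),
    "complementarity": lambda A, B, C: A & complement(A) == frozenset() and A | complement(A) == U,
    "absorption": lambda A, B, C: A & (A | B) == A and A | (A & B) == A,
}

def check_all():
    """Test every law on every triple of subsets of U."""
    triples = list(product(subsets(U), repeat=3))
    return {name: all(law(A, B, C) for A, B, C in triples)
            for name, law in LAWS.items()}

for name, holds in check_all().items():
    print(f"{name}: {holds}")
```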
\[ x \in A \lor x \in B. \]
The conjunctive identity law and the fact that $x \in A \lor x \notin A$ is
a tautology give us an equivalent logical statement:
\[ (x \in A \lor x \notin A) \land (x \in A \lor x \in B). \]
Finally, this last statement is equivalent to
\[ x \in A \lor (x \notin A \land x \in B) \]
which is the definition of $x \in A \cup (\overline{A} \cap B)$.
On the other hand, if we assume that $x \in A \cup (\overline{A} \cap B)$, it follows
that
\[ x \in A \lor (x \notin A \land x \in B). \]
Applying the distributive law, disjunctive complementarity and
the identity law, in sequence we obtain
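\[ \begin{aligned}
x \in A \lor (x \notin A \land x \in B)
 &\cong (x \in A \lor x \notin A) \land (x \in A \lor x \in B) \\
 &\cong t \land (x \in A \lor x \in B) \\
 &\cong x \in A \lor x \in B
\end{aligned} \]

$A \cup B$                                     Given
$= U \cap (A \cup B)$                          Identity law
$= (A \cup \overline{A}) \cap (A \cup B)$      Complementarity
$= A \cup (\overline{A} \cap B)$               Distributive law
Q.E.D.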
There are some notions within Set theory that don't have any clear parallels in Logic. One of these is essentially a generalization of the concept of
"complements." If you think of the set $\overline{A}$ as being the difference between
the universal set $U$ and the set $A$ you are on the right track. The difference
between two sets is written $A \setminus B$ (sadly, sometimes this is denoted using the
ordinary subtraction symbol $A - B$) and is defined by
\[ A \setminus B = A \cap \overline{B}. \]
some developments of Set theory, the difference of sets is defined first and
then complementation is defined by $\overline{A} = U \setminus A$.
The difference of sets (like the difference of real numbers) is not a commutative
define an operation that acts somewhat like the difference, but that is commutative. The symmetric difference of two sets is denoted using a triangle
(really a capital Greek delta)
\[ A \triangle B = (A \setminus B) \cup (B \setminus A). \]
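The definition of the symmetric difference is easy to try out on concrete sets. In this sketch the sets `A` and `B` are made up for illustration; Python writes set difference as `-` and happens to provide the symmetric difference as the operator `^`.

```python
A = {1, 2, 3, 4}    # hypothetical example sets
B = {3, 4, 5}

by_definition = (A - B) | (B - A)   # (A \ B) union (B \ A)
builtin = A ^ B                     # Python's symmetric difference operator

print(sorted(by_definition))
print(by_definition == builtin)
print((A - B) == (B - A))           # ordinary difference is not commutative
print((A ^ B) == (B ^ A))           # symmetric difference is
```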
Exercise. Show that $A \triangle B = (A \cup B) \setminus (A \cap B)$.
Come on! You read right past that exercise without even pausing!
What? You say you did try it and it was too hard?
Okay, just for you (and this time only) I've prepared an aid to help you
through. . .
On the next page is a two-column proof of the result you need to prove,
but the lines of the proof are all scrambled. Make a copy and cut out all the
pieces and then glue them together into a valid proof.
So, no more excuses, just do it!
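$= (A \cap \overline{B}) \cup (B \cap \overline{A})$
identity law
$= (A \cup B) \cap \overline{(A \cap B)}$
$(A \cup B) \setminus (A \cap B)$
$= ((A \cap \overline{A}) \cup (A \cap \overline{B})) \cup ((B \cap \overline{A}) \cup (B \cap \overline{B}))$
$= (A \setminus B) \cup (B \setminus A)$
$= (A \cap \overline{(A \cap B)}) \cup (B \cap \overline{(A \cap B)})$
$= A \triangle B$
Given
distributive law
distributive law
$= (A \cap (\overline{A} \cup \overline{B})) \cup (B \cap (\overline{A} \cup \overline{B}))$
DeMorgan's law
$= (\emptyset \cup (A \cap \overline{B})) \cup ((B \cap \overline{A}) \cup \emptyset)$
complementarity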
Exercises 4.3
1. Let $A = \{1, 2, \{1, 2\}, b\}$ and let $B = \{a, b, \{1, 2\}\}$. Find the following:
(a) $A \cap B$
(b) $A \cup B$
(c) $A \setminus B$
(d) $B \setminus A$
(e) $A \triangle B$
2. In a standard deck of playing cards one can distinguish sets based on
face-value and/or suit. Let $A, 2, \ldots, 9, 10, J, Q$ and $K$ represent the sets
of cards having the various face-values. Also, let $\heartsuit$, $\spadesuit$, $\clubsuit$ and $\diamondsuit$ be the
sets of cards having the possible suits. Find the following
(a) $A \cap \heartsuit$
(b) $A \cup \heartsuit$
(c) $J \cap (\spadesuit \cap \heartsuit)$
(d) $K \cap \heartsuit$
(e) $A \cap K$
(f) $A \cup K$
3. Do element-chasing proofs (show that an element is in the left-hand
side if and only if it is in the right-hand side) to prove each of the
following set equalities.
(a) $\overline{A \cap B} = \overline{A} \cup \overline{B}$
(b) $A \cup B = A \cup (\overline{A} \cap B)$
(c) $A \triangle B = (A \cup B) \setminus (A \cap B)$
(d) $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$
In
In
n2N
n2N
5. There is a set $X$ such that, for all sets $A$, we have $X \triangle A = A$. What
is $X$?
6. There is a set $Y$ such that, for all sets $A$, we have $Y \triangle A = \overline{A}$. What is
$Y$?
A \ (B [ C).
4.4
Venn diagrams
Hopefully, you've seen Venn diagrams before, but possibly you haven't thought
deeply about them. Venn diagrams take advantage of an obvious but important property of closed curves drawn in the plane. They divide the points
in the plane into two sets, those that are inside the curve and those that
are outside! (Forget for a moment about the points that are on the curve.)
This seemingly obvious statement is known as the Jordan curve theorem,
and actually requires some details. A Jordan curve is the sort of curve you
might draw if you are required to end where you began and you are required
not to cross-over any portion of the curve that has already been drawn. In
technical terms such a curve is called continuous, simple and closed. The
Jordan curve theorem is one of those statements that hardly seems like it
needs a proof, but nevertheless, the proof of this statement is probably the
best-remembered work of the famous French mathematician Camille Jordan.
The prototypical Venn diagram is the picture that looks something like
the view through a set of binoculars.
[Figure: the prototypical two-set Venn diagram, drawn inside the universe U.]
In a Venn diagram the universe of discourse is normally drawn as a rectangular region inside of which all the action occurs. Each set in a Venn
diagram is depicted by drawing a simple closed curve, typically a circle, but
not necessarily! For instance, if you want to draw a Venn diagram that shows
all the possible intersections among four sets, you'll find it's impossible with
(only) circles.
[Figure: a Venn diagram for four sets, drawn with non-circular curves inside the universe U.]
Exercise. Verify that the diagram above has regions representing all 16 possible intersections of 4 sets.
There is a certain "zen" to Venn diagrams that must be internalized, but
once you have done so they can be used to think very effectively about the
relationships between sets. The main deal is that the points inside of one
of the simple closed curves are not necessarily in the set; only some of the
points inside a simple closed curve are in the set, and we don't know precisely
where they are! The various simple closed curves in a Venn diagram divide
the universe up into a bunch of regions. It might be best to think of these
regions as fenced-in areas in which the elements of a set mill about, much
like domesticated animals in their pens. One of our main tools in working
with Venn diagrams is to deduce that certain of these regions don't contain
any elements; we then mark that region with the empty set symbol ($\emptyset$).
[Figure: a universe whose elements are Mr. Ed, Black Beauty, Donald Duck, Snowball, Shadowfax, Ren, Heckle, Silver, Tweety Bird, Misty, Wile E. Coyote and Secretariat.]
And here is the same universe with some Jordan curves used to encircle two
subsets.
[Figure: the same universe with Jordan curves encircling two subsets; the label H is visible.]
This picture might lead us to think that the set of cartoon characters and
the set of horses are disjoint, so we thought it would be nice to add one more
element to our universe in order to dispel that notion.
[Figure: the same universe with the additional element Night Mare; the encircled subsets are labeled H and C.]
Suppose we have two sets $A$ and $B$ and we're interested in proving that
$B \subseteq A$. The job is done if we can show that all of $B$'s elements are actually in
if we can show that the region marked with $\emptyset$ in the following diagram is
actually empty.
[Figure: Venn diagram of the sets A and B.]
However, both of these situations can also be dealt with by working with
Venn diagrams in which the sets are in general position (which in this
situation means that every possible intersection is shown) and then marking
any empty regions with $\emptyset$.
Exercise. On a Venn diagram for two sets in general position, indicate the
empty regions when
b) A is contained in B.
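[Figure: a Venn diagram for three sets in general position inside the universe U, with each of its eight regions labeled by an intersection of $A$, $B$, $C$ or their complements, from $A \cap B \cap C$ to $\overline{A} \cap \overline{B} \cap \overline{C}$.]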
Exercises 4.4
1. Venn diagrams are usually made using simple closed curves with no
further restrictions. Try creating Venn diagrams for 3, 4 and 5 sets (in
general position) using rectangular simple closed curves.
2. We call a curve rectilinear if it is made of line segments that meet
at right angles. Use rectilinear simple closed curves to create a Venn
diagram for 5 sets.
3. Argue as to why rectilinear curves will suffice to build any Venn diagram.
4. Find the disjunctive normal form of $A \cap (B \cup C)$.
5. Find the disjunctive normal form of $(A \triangle B) \triangle C$.
6. The prototypes for the modus ponens and modus tollens argument
forms are the following:
All men are mortal.
Socrates is a man.
Therefore Socrates is
mortal.
man.
Illustrate these arguments using Venn diagrams.
7. Use Venn diagrams to convince yourself of the validity of the following
containment statement
\[ (A \cap B) \cup (C \cap D) \subseteq (A \cup C) \cap (B \cup D). \]
Now prove it!
4.5
Russell's Paradox
There are prizes considered equivalent to the Nobel in stature: the Fields Medal,
awarded every four years by the International Mathematical Union to up to four mathematical researchers under the age of forty, and the Abel Prize, awarded annually by the
King of Norway.
et cetera.
This obviously seems like a problem. Indeed, often paradoxes seem to be
caused by self-reference of this sort. Consider
The sentence in this box is false.
So a reasonable alternative is to "do math" among the sets that don't
exhibit this particular pathology.
Thus, inside the set of all sets we are singling out a particular subset that
consists of sets which don't contain themselves.
\[ S = \{A \;\big|\; A \text{ is a set} \land A \notin A\} \]
Now within the universal set we're working in (the set of all sets) there
are only two possibilities: a given set is either in $S$ or it is in its complement
Russell himself developed a workaround for the paradox which bears his
Isaac Newton also published a 3 volume work which is often cited by this same title,
Exercises 4.5
1. Verify that $(A \implies \overline{A}) \land (\overline{A} \implies A)$ is a logical contradiction
in two ways: by filling out a truth table and using the laws of logical
equivalence.
2. One way out of Russell's paradox is to declare that the collection of sets
that don't contain themselves as elements is not a set itself. Explain
how this circumvents the paradox.
Chapter 5
Proof techniques II
Induction
Who was the guy who first looked at a cow and said, "I think I'll drink
whatever comes out of these things when I squeeze 'em!"? (Bill Watterson)
5.1
v) If a subset of the natural numbers contains 0 and also has the property
that whenever $a \in S$ it follows that $s(a) \in S$, then the subset $S$ is
actually equal to $\mathbb{N}$.
The last axiom is the one that justifies PMI. Basically, if 0 is in a subset,
and the subset has this property about successors, then 1 must be in it. But
if 1 is in it, then 1's successor (2) must be in it. And so on . . .
The subset ends up having every natural number in it.
Exercise. Verify that the following symbolic formulation has the same content as the version of the 5th Peano axiom given above.
\[ \forall S \subseteq \mathbb{N},\ \bigl[(0 \in S) \land (\forall a \in \mathbb{N},\ a \in S \implies s(a) \in S)\bigr] \implies S = \mathbb{N} \]
On August 16th 2003, Ma Lihua of Beijing, China earned her place in
the record books by single-handedly setting up an arrangement of dominoes
standing on end (actually, the setup took 7 weeks and was almost ruined by
some cockroaches in the Singapore Expo Hall) and toppling them. After the
first domino was tipped over it took about six minutes before 303,621 out of
the 303,628 dominoes had fallen. (One has to wonder what kept those other
7 dominoes upright . . . )
This is the model one should keep in mind when thinking about PMI:
domino toppling. In setting up a line of dominoes, what do we need to do in
order to ensure that they will all fall when the toppling begins? Every domino
must be placed so that it will hit and topple its successor. This is exactly
analogous to $(a \in S \implies s(a) \in S)$. (Think of $S$ having the membership
criterion, $x \in S$ = "$x$ will have fallen when the toppling is over.") The other
thing that has to happen (barring the action of cockroaches) is for someone
to knock over the first domino. This is analogous to $0 \in S$.
convenient to recast our discussion in terms of infinite families of logical statements. If we have a sequence of statements (one for each natural number)
$P_0, P_1, P_2, P_3, \ldots$ we can prove them all to be true using PMI. We have to
do two things. First (and this is usually the easy part) we must show that
$P_0$ is true (i.e. the first domino will get knocked over). Second, we must
show, for every possible value of $k$, $P_k \implies P_{k+1}$ (i.e. each domino will
knock down its successor). These two parts of an inductive proof are known,
respectively, as the basis and the inductive step.
An outline for a proof using PMI:
Theorem $\forall n \in \mathbb{N},\ P_n$
Proof: (By induction)
Basis:
$\vdots$
Inductive step:
$\vdots$
Q.E.D.
Soon we'll do an actual example of an inductive proof, but first we have
to say something REALLY IMPORTANT about such proofs. Pay attention!
This is REALLY IMPORTANT! When doing the second part of an inductive
proof (the inductive step), you are proving a UCS, and if you recall how
that's done, you start by assuming the antecedent is true. But the particular
UCS we'll be dealing with is $\forall k,\ P_k \implies P_{k+1}$. That means that in the
course of proving $\forall n,\ P_n$ we have to assume $\forall k,\ P_k$. Now this sounds very
much like the error known as "circular reasoning," especially as many authors
don't even use different letters ($n$ versus $k$ in our outline) to distinguish the
two statements. (And, quite honestly, we only introduced the variable $k$ to
assuage a certain lingering guilt regarding circular reasoning.) The sentence
$\forall n,\ P_n$ is what we're trying to prove. The sentence $\forall k,\ P_k$ is known as the
inductive hypothesis, and once that proof was done, it would be okay to quote
that result in an inductive proof of $\forall n,\ P_n$. Thus we can compartmentalize
our way out of the difficulty!
the element $a$ from it. This shows that $|S_1| = |S_2|$. Putting this
all together we get that $|\mathcal{P}(A)| = 2^k + 2^k = 2(2^k) = 2^{k+1}$.
Q.E.D.
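The counting argument at the end of this proof (split the subsets according to whether they contain a chosen element, then pair them off) can be acted out on a small example. This is a sketch under our own assumptions: the set `A`, the chosen element `4`, and the names `with_a` and `without_a` are illustrative and need not match the proof's $S_1$ and $S_2$.

```python
from itertools import combinations

def power_set(elements):
    """Return the power set of `elements` as a set of frozensets."""
    items = list(elements)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

A = {1, 2, 3, 4}       # a hypothetical set of size k + 1 = 4
a = 4                  # the element singled out in the inductive step

with_a = {S for S in power_set(A) if a in S}         # subsets containing a
without_a = {S for S in power_set(A) if a not in S}  # subsets avoiding a

# Removing a from each subset in with_a pairs it with a subset in without_a.
paired = {S - {a} for S in with_a}

print(paired == without_a)
print(len(with_a), len(without_a), len(power_set(A)))
```

The subsets avoiding `a` are exactly the subsets of the smaller set $A \setminus \{a\}$, which is where the inductive hypothesis enters.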
We close this section with a few pieces of advice.
Statements that can be proved inductively don't always start out with
$P_0$. Sometimes $P_1$ is the first statement in an infinite family. Sometimes
be proved in the inductive step, just don't make it look like you're
assuming what needs to be shown. For instance in the proof above
it might have been nice to start the inductive step with a comment
along the following lines, "What we need to show is that under the
assumption that any set of size $k$ has a power set of size $2^k$, it follows
that a set of size $k + 1$ will have a power set of size $2^{k+1}$."
See exercise 2, the classic fallacious proof that all horses are the same color.
Exercises 5.1
1. Consider the sequence of numbers that are 1 greater than a multiple of
4. (Such numbers are of the form $4j + 1$.)
\[ \sum_{j=0}^{n} (4j + 1) = 2n^2 + 3n + 1 \]
n    $\sum_{j=0}^{n} (4j + 1)$    $2n^2 + 3n + 1$
0    1                            1
1    1 + 5 = 6                    $2 \cdot 1^2 + 3 \cdot 1 + 1 = 6$
2    1 + 5 + 9 =
3
4
2. What is wrong with the following inductive "proof" of "all horses are the
same color"?
Theorem Let $H$ be a set of $n$ horses; all horses in $H$ are the same
color.
Proof: We proceed by induction on $n$.
Basis: Suppose $H$ is a set containing 1 horse. Clearly this
horse is the same color as itself.
Inductive step: Given a set of $k + 1$ horses $H$ we can construct two sets of $k$ horses. Suppose $H = \{h_1, h_2, h_3, \ldots, h_{k+1}\}$.
Define $H_a = \{h_1, h_2, h_3, \ldots, h_k\}$ (i.e. $H_a$ contains just the first
$k$ horses) and $H_b = \{h_2, h_3, h_4, \ldots, h_{k+1}\}$ (i.e. $H_b$ contains the
3. For each of the following theorems, write the statement that must be
proved for the basis then prove it, if you can!
(a) The sum of the first $n$ positive integers is $(n^2 + n)/2$.
(b) The sum of the first $n$ (positive) odd numbers is $n^2$.
(c) If $n$ coins are flipped, the probability that all of them are "heads"
is $1/2^n$.
(d) Every $2^n \times 2^n$ chessboard with one square removed can be tiled
Here, "perfectly tiled" means that every tromino covers 3 squares of the chessboard
(nothing hangs over the edge) and that every square of the chessboard is covered by some
tromino.
4. Suppose that the rules of the game for PMI were changed so that one
did the following:
Basis. Prove that $P_0$ is true.
Inductive step. Prove that for all $k$, $P_k$ implies $P_{k+2}$.
Explain why this would not constitute a valid proof that $P_n$ holds for
all natural numbers $n$. How could we change the basis in this outline
to obtain a valid proof?
5.2
Gauss, when only a child, found a formula for summing the first 100 natural
numbers (or so the story goes. . . ). This formula, and his clever method for
justifying it, can be easily generalized to the sum of the first n naturals.
While learning calculus, notably during the study of Riemann sums, one
encounters other summation formulas. For example, in approximating the
integral of the function $f(x) = x^2$ from 0 to 100 one needs the sum of the
first 100 squares. For this reason, somewhere in almost every calculus book
one will find the following formulas collected:
\[ \sum_{j=1}^{n} j = \frac{n(n+1)}{2} \]
\[ \sum_{j=1}^{n} j^2 = \frac{n(n+1)(2n+1)}{6} \]
\[ \sum_{j=1}^{n} j^3 = \frac{n^2(n+1)^2}{4}. \]
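Before trying to prove formulas like these, it is worth checking them numerically. The following sketch compares each closed form with the sum it claims to equal; the function name `check_power_sums` is our own, and agreement for many values of $n$ is encouraging but is not a proof.

```python
def check_power_sums(n):
    """Compare the three closed forms with direct summation for one value of n."""
    s1 = sum(j for j in range(1, n + 1))
    s2 = sum(j**2 for j in range(1, n + 1))
    s3 = sum(j**3 for j in range(1, n + 1))
    return (
        s1 == n * (n + 1) // 2
        and s2 == n * (n + 1) * (2 * n + 1) // 6
        and s3 == n**2 * (n + 1)**2 // 4
    )

print(all(check_power_sums(n) for n in range(0, 200)))
```

Integer division `//` is safe here because each numerator is always divisible by its denominator.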
A really industrious author might also include the sum of the fourth powers. Jacob Bernoulli (a truly industrious individual) got excited enough to
find formulas for the sums of the first ten powers of the naturals. Actually,
Bernoulli went much further. His work on sums of powers led to the definition of what are now known as Bernoulli numbers and let him calculate
$\sum_{j=1}^{1000} j^{10}$ in about seven minutes, long before the advent of calculators! In
[16, p. 320], Bernoulli is quoted:
With the help of this table it took me less than half of a quarter
of an hour to find that the tenth powers of the first 1000 numbers
being added together will yield the sum
91,409,924,241,424,243,424,241,924,242,500.
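With exact integer arithmetic, Bernoulli's seven-minute calculation can be repeated by brute force. This sketch simply adds up the thousand tenth powers; the variable name `total` is our own.

```python
# Sum of the tenth powers of the first 1000 positive integers, computed directly.
total = sum(j**10 for j in range(1, 1001))

# Print with thousands separators for comparison with the quoted value.
print(f"{total:,}")
```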
To the beginning calculus student, the beauty of the above relationships
may be somewhat dimmed by the memorization challenge that they represent. It is fortunate then, that the right-hand side of the third formula is
just the square of the right-hand side of the first formula. And of course, the
right-hand side of the first formula is something that can be deduced by a six
year old child (provided that he is a super-genius!) This happy coincidence
leaves us to apply most of our rote memorization energy to formula number
two, because the first and third formulas are related by the following rather
bizarre-looking equation,
\[ \sum_{j=1}^{n} j^3 = \left( \sum_{j=1}^{n} j \right)^2 \]
The sum of the cubes of the first n numbers is the square of their sum.
For completeness we should include the following formula which should
be thought of as the sum of the zeroth powers of the first n naturals.
\[ \sum_{j=1}^{n} 1 = n \]
\[ \int_{x=0}^{10} x^3 - 2x + 3 \, dx \]
we need to show that the $(k+1)$-th version of the formula holds, assuming
that the $k$-th version does. Before proceeding on to read the proof do the
following
Exercise. Write down the $(k+1)$-th version of the formula for the sum of the
first $n$ naturals. (You have to replace every $n$ with a $k + 1$.)
Theorem 5.2.1.
\[ \forall n \in \mathbb{N},\ \sum_{j=1}^{n} j = \frac{n(n+1)}{2} \]
\[ \sum_{j=1}^{k+1} j = (k+1) + \sum_{j=1}^{k} j \]
Next, we can use the inductive hypothesis to replace the sum (the
part that goes from 1 to $k$) with a formula.
\[ \begin{aligned}
&= (k+1) + \frac{k(k+1)}{2} \\
&= \frac{2(k+1)}{2} + \frac{k(k+1)}{2} \\
&= \frac{2(k+1) + k(k+1)}{2} \\
&= \frac{(k+1)(k+2)}{2}.
\end{aligned} \]
Q.E.D.
⁴ If you'd prefer to avoid the "empty sum" argument, you can choose to use $n = 1$ as
the basis case. The theorem should be restated so the universe of discourse is positive
naturals.
Notice how the inductive step in this proof works. We start by writing
down the left-hand side of $P_{k+1}$, we pull out the last term so we've got the left-hand side of $P_k$ (plus something else), then we apply the inductive hypothesis
and do some algebra until we arrive at the right-hand side of $P_{k+1}$. Overall,
we've just transformed the left-hand side of the statement we wish to prove
into its right-hand side.
There is another way to organize the inductive steps in proofs like these
that works by manipulating entire equalities (rather than just one side or the
other of them).
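Inductive step (alternate): By the inductive hypothesis, we
can write
\[ \sum_{j=1}^{k} j = \frac{k(k+1)}{2}. \]
Adding $k + 1$ to both sides yields
\[ (k+1) + \sum_{j=1}^{k} j = (k+1) + \frac{k(k+1)}{2}. \]
Thus,
\[ \sum_{j=1}^{k+1} j = \frac{(k+1)(k+2)}{2}. \]
Q.E.D.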
Oftentimes one can save considerable effort in an inductive proof by creatively using the factored form during intermediate steps. On the other hand,
sometimes it is easier to just simplify everything completely, and also, completely simplify the expression on the right-hand side of $P_{k+1}$ and then
verify that the two things are equal. This is basically just another take on
the technique of working backwards from the conclusion. Just remember
that in writing-up your proof you need to make it look as if you reasoned directly from the premises to the conclusion. We'll illustrate what we've been
discussing in this paragraph while proving the formula for the sum of the
squares of the first $n$ naturals.
Theorem 5.2.2.
\[ \forall n \in \mathbb{N},\ \sum_{j=1}^{n} j^2 = \frac{n(n+1)(2n+1)}{6} \]
\[ \sum_{j=1}^{k} j^2 = \frac{k(k+1)(2k+1)}{6}. \]
\[ (k+1)^2 + \sum_{j=1}^{k} j^2 = \frac{k(k+1)(2k+1)}{6} + (k+1)^2. \]
Thus,
\[ \sum_{j=1}^{k+1} j^2 = \frac{(k^2+k)(2k+1)}{6} + \frac{6(k^2+2k+1)}{6}. \]
Therefore,
\[ \sum_{j=1}^{k+1} j^2 = \frac{(k+1)(k+2)(2k+3)}{6}. \]
Q.E.D.
Notice how the last four lines of the proof are the same as those in the
box above containing our scratch work? (Except in the reverse order.)
We'll end this section by demonstrating one more use of this technique.
This time we'll look at a formula for a product rather than a sum.
Theorem 5.2.3.
\[ \forall n \ge 2 \in \mathbb{Z},\ \prod_{j=2}^{n} \left(1 - \frac{1}{j^2}\right) = \frac{n+1}{2n}. \]
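Before proceeding with the proof let's look at an example (although this
has nothing to do with proving anything, it's really not a bad idea; it can
keep you from wasting a lot of time trying to prove something that isn't
actually true!) When $n = 4$ the product is
\[ \left(1 - \frac{1}{2^2}\right) \cdot \left(1 - \frac{1}{3^2}\right) \cdot \left(1 - \frac{1}{4^2}\right) = \left(1 - \frac{1}{4}\right) \cdot \left(1 - \frac{1}{9}\right) \cdot \left(1 - \frac{1}{16}\right) = \frac{3}{4} \cdot \frac{8}{9} \cdot \frac{15}{16} = \frac{360}{576}. \]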
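\[ \prod_{j=2}^{k} \left(1 - \frac{1}{j^2}\right) = \frac{k+1}{2k}. \]
\[ \left(1 - \frac{1}{(k+1)^2}\right) \cdot \prod_{j=2}^{k} \left(1 - \frac{1}{j^2}\right) = \frac{k+1}{2k} \cdot \left(1 - \frac{1}{(k+1)^2}\right) \]
Thus
\[ \begin{aligned}
\prod_{j=2}^{k+1} \left(1 - \frac{1}{j^2}\right) &= \frac{k+1}{2k} \cdot \left(1 - \frac{1}{(k+1)^2}\right) \\
&= \frac{k+1}{2k} - \frac{(k+1)}{2k(k+1)^2} \\
&= \frac{k+1}{2k} - \frac{(1)}{2k(k+1)} \\
&= \frac{(k+1)^2 - 1}{2k(k+1)} \\
&= \frac{k^2 + 2k}{2k(k+1)} \\
&= \frac{k(k+2)}{2k(k+1)} \\
&= \frac{k+2}{2(k+1)}.
\end{aligned} \]
Q.E.D.
Really, the only reason I'm doing this silly proof is to point out to you that when
you're doing the inductive step in a proof of a formula for a product, you don't add to
both sides anymore, you multiply. You see that, right? Well, consider yourself to have
been pointed out to or . . . oh, whatever.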
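The product formula can be checked with exact rational arithmetic, which avoids any worry about rounding. This is a sketch; the two function names are our own.

```python
from fractions import Fraction

def product_lhs(n):
    """Multiply out (1 - 1/j^2) for j = 2, ..., n using exact fractions."""
    result = Fraction(1)
    for j in range(2, n + 1):
        result *= 1 - Fraction(1, j * j)
    return result

def product_rhs(n):
    """The closed form (n + 1) / (2n)."""
    return Fraction(n + 1, 2 * n)

print(product_lhs(4), product_rhs(4))   # both reduce 360/576 to lowest terms
print(all(product_lhs(n) == product_rhs(n) for n in range(2, 100)))
```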
Exercises 5.2
1. Write an inductive proof of the formula for the sum of the first n cubes.
2. Find a formula for the sum of the first n fourth powers.
3. The sum of the first n natural numbers is sometimes called the n-th
triangular number Tn . Triangular numbers are so-named because one
can represent them with triangular shaped arrangements of dots.
\[ 1 - 4 = -3 \]
\[ 1 - 4 + 9 = 6 \]
\[ 1 - 4 + 9 - 16 = -10 \]
et cetera
Guess a general formula for $\sum_{i=1}^{n} (-1)^{i-1} i^2$
n
n
X
0.
j=0
7. Prove
\[ \sum_{i=1}^{n} \frac{1}{(2i - 1)(2i + 1)} = \frac{n}{2n + 1} \]
for all natural numbers $n$.
\[ F_{n+2} = F_n + F_{n+1} \]
The first two Fibonacci numbers (actually the zeroth and the first) are
both 1.
Thus, the first several Fibonacci numbers are
\[ \sum_{i=0}^{n} (F_i)^2 = F_n F_{n+1} \]
5.3
There is a very famous result known as Fermat's Little Theorem. This would
probably be abbreviated FLT except for two things. In science fiction FLT
means "faster than light travel" and there is another theorem due to Fermat
that goes by the initials FLT: Fermat's Last Theorem. Fermat's last theorem
states that equations of the form $a^n + b^n = c^n$, where $n$ is a positive natural
number, only have integer solutions that are trivial (like $0^3 + 1^3 = 1^3$) when
$n$ is greater than 2. When $n$ is 1, there are lots of integer solutions. When
$n$ is 2, there are still plenty of integer solutions; these are the so-called
Pythagorean triples, for example 3, 4 & 5 or 5, 12 & 13. It is somewhat unfair
that this statement is known as Fermat's last theorem since he didn't prove
it (or at least we can't be sure that he proved it). Five years after his death,
Fermat's son published a translated⁶ version of Diophantus's Arithmetica
containing his father's notations. One of those notations (near the place
where Diophantus was discussing the equation $x^2 + y^2 = z^2$ and its solution
in whole numbers) was the statement of what is now known as Fermat's last
theorem as well as the following claim:
Cuius rei demonstrationem mirabilem sane detexi hanc marginis
exiguitas non caperet.
In English:
I have discovered a truly remarkable proof of this that the margin
of this page is too small to contain.
Between 1670 and 1994 a lot of famous mathematicians worked on FLT
but never found the "demonstrationem mirabilem." Finally in 1994, Andrew
Wiles of Princeton announced a proof of FLT, but in Wiles's own words, his
is "a twentieth century proof"; it can't be the proof Fermat had in mind.
These days most people believe that Fermat was mistaken. Probably he
thought a proof technique that works for small values of $n$ could be generalized. It remains a tantalizing question: can a proof of FLT using only
methods available in the 17th century be accomplished?
⁶ The translation from Greek into Latin was done by Claude Bachet.
Part of the reason that so many people spent so much effort on FLT over
the centuries is that Fermat had an excellent record as regards being correct
about his theorems and proofs. The result known as Fermat's little theorem
is an example of a theorem and proof that Fermat got right. It is probably
known as his "little" theorem because its statement is very short, but it is
actually a fairly deep result.
Theorem 5.3.1 (Fermat's Little Theorem). For every prime number $p$, and
for all integers $x$, the $p$-th power of $x$ and $x$ itself are congruent mod $p$.
Symbolically:
\[ x^p \equiv x \pmod{p} \]
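The congruence is easy to test for small primes. The sketch below uses Python's three-argument `pow`, which computes a power modulo a number; the function name `fermat_holds` is our own. The final line shows that primality matters by trying the composite modulus 4.

```python
def fermat_holds(p, x):
    """Check whether x**p and x leave the same remainder mod p."""
    return pow(x, p, p) == x % p

primes = (2, 3, 5, 7, 11, 13)
print(all(fermat_holds(p, x) for p in primes for x in range(-20, 21)))

# With a composite modulus the congruence can fail: 2**4 = 16 is 0 mod 4, not 2.
print(fermat_holds(4, 2))
```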
\[ \begin{aligned}
(k+1)^3 + 2(k+1) + 6 &= (k^3 + 3k^2 + 3k + 1) + (2k + 2) + 6 \\
&= (k^3 + 2k + 6) + 3k^2 + 3k + 3 \\
&= (k^3 + 2k + 6) + 3(k^2 + k + 1).
\end{aligned} \]
By the inductive hypothesis, 3 is a divisor of $k^3 + 2k + 6$ so there
is an integer $m$ such that $k^3 + 2k + 6 = 3m$. Thus,
\[ \begin{aligned}
(k+1)^3 + 2(k+1) + 6 &= 3m + 3(k^2 + k + 1) \\
&= 3(m + k^2 + k + 1).
\end{aligned} \]
This equation shows that 3 is a divisor of $(k+1)^3 + 2(k+1) + 6$,
which is the desired conclusion.
Q.E.D.
Exercise. Devise an inductive proof of the statement, $\forall x \in \mathbb{N},\ 5 \mid x^5 + 4x - 10$.
There is one other subtle trick for devising statements to be proved by
PMI that you should know about. An example should suffice to make it
clear. Notice that 7 is equivalent to 1 (mod 6); it follows that any power of
7 is also 1 (mod 6). So, if we subtract 1 from some power of 7 we will have
a number that is divisible by 6.
The proof (by PMI) of a statement like this requires another subtle little
trick. Somewhere along the way in the proof you'll need the identity $7 = 6 + 1$.
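Theorem 5.3.3.
\[ \forall n \in \mathbb{N},\ 6 \mid 7^n - 1 \]
Inductive step:
(We need to show that if $6 \mid 7^k - 1$ then $6 \mid 7^{k+1} - 1$.)
Consider the quantity $7^{k+1} - 1$.
\[ \begin{aligned}
7^{k+1} - 1 &= 7 \cdot 7^k - 1 \\
&= (6 + 1) \cdot 7^k - 1 \\
&= 6 \cdot 7^k + 1 \cdot 7^k - 1 \\
&= 6(7^k) + (7^k - 1)
\end{aligned} \]
By the inductive hypothesis, $6 \mid 7^k - 1$ so there is an integer $m$
such that $7^k - 1 = 6m$. It follows that $7^{k+1} - 1 = 6(7^k) + 6m$,
which shows that $6 \mid 7^{k+1} - 1$.
Q.E.D.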
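Both the theorem and the algebraic identity that drives its inductive step can be spot-checked numerically. This is a sketch with our own function names; it confirms instances, which is not the same as proving the statement for all $n$.

```python
def divisible_by_six(n):
    """Does 6 divide 7**n - 1?"""
    return (7**n - 1) % 6 == 0

def inductive_identity_holds(k):
    """The rewriting used in the inductive step: 7^(k+1) - 1 = 6*7^k + (7^k - 1)."""
    return 7**(k + 1) - 1 == 6 * 7**k + (7**k - 1)

print(all(divisible_by_six(n) for n in range(0, 50)))
print(all(inductive_identity_holds(k) for k in range(0, 50)))
```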
[Table: values of $2^n$ and $n!$ for small values of $n$.]
As the table illustrates, for small values of $n$, $2^n > n!$. But from $n = 4$
onward the inequality is reversed.
Theorem 5.3.4.
\[ \forall n \ge 4 \in \mathbb{N},\ 2^n < n! \]
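A comparison table of this kind is easy to regenerate. The sketch below prints $2^n$ next to $n!$ for the first few naturals and then checks the inequality over a longer (but still finite) range; the range limits are arbitrary choices of ours.

```python
from math import factorial

# Tabulate 2^n and n! side by side for small n.
print(f"{'n':>3} {'2^n':>6} {'n!':>6}")
for n in range(0, 8):
    print(f"{n:>3} {2**n:>6} {factorial(n):>6}")

# From n = 4 onward, 2^n < n! (checked here only up to n = 59).
print(all(2**n < factorial(n) for n in range(4, 60)))
```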
It might be smoother to justify this step by first proving the lemma that $\forall a, b, c, d \in$
So
Q.E.D.
The observant Calculus student will certainly be aware of the fact that,
asymptotically, exponential functions grow faster than polynomial functions.
That is, if you have a base b which is greater than 1, the function b^x is eventually larger than any polynomial p(x). This may seem a bit hard to believe if
b = 1.001 and p(x) = 500x^10. The graph of y = 1.001^x is practically indistinguishable from the line y = 1 (at first), whereas the graph of y = 500x^10 has
already reached the astronomical value of five trillion (5,000,000,000,000)
when x is just 10. Nevertheless, the exponential will eventually outstrip the
polynomial. We can use the methods of this section to get started on proving
the fact mentioned above. Consider the two sequences n^2 and 2^n.
n     4   5   6
n^2   16  25  36
2^n   16  32  64
For all n ∈ N, if n ≥ 4 then n^2 ≤ 2^n.
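Both comparisons discussed here can be explored numerically. The sketch below is ours, not part of the proof: `square_at_most_power` tests n^2 ≤ 2^n directly, and `exponential_is_larger` compares 1.001^x with 500x^10 by comparing logarithms, which we assume is acceptable since it avoids computing astronomically large numbers.

```python
import math


def square_at_most_power(n: int) -> bool:
    """Return True when n**2 <= 2**n."""
    return n**2 <= 2**n


def exponential_is_larger(x: int, base: float = 1.001,
                          coeff: float = 500.0, degree: int = 10) -> bool:
    """Return True when base**x > coeff * x**degree (for x >= 1).

    Taking logarithms of both sides turns the comparison into
    x*log(base) > log(coeff) + degree*log(x).
    """
    return x * math.log(base) > math.log(coeff) + degree * math.log(x)


# The inequality n^2 <= 2^n fails at n = 3 but holds from 4 onward
# (checked here only up to 199).
holds_from_four = all(square_at_most_power(n) for n in range(4, 200))
```

At x = 10 the polynomial is far ahead, but for sufficiently large x (for instance x = 200000) the exponential has overtaken it.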
So the result remains in doubt unless you can complete the exercise that follows. . .
Q.E.D.???
Exercise. Prove the lemma: For all n ∈ N, if n ≥ 4 then 2n + 1 ≤ n^2.
Exercises 5.3
Give inductive proofs of the following
1. ∀x ∈ N, 3 | x^3 − x
2. ∀x ∈ N, 3 | x^3 + 5x
3. ∀x ∈ N, 11 | x^11 + 10x
4. ∀n ∈ N, 3 | 4^n − 1
5. ∀n ∈ N, 6 | (3n^2 + 3n − 12)
6. ∀n ∈ N, 5 | (n^5 − 5n^3 + 14n)
7. ∀n ∈ N, 4 | (13^n + 4n − 1)
8. ∀n ∈ N, 7 | 8^n + 6
9. 8n 2 N, 6 | 2n3
10. 8n
2n
14
3 2 N, n3 + 3 > n2 + 3n + 1
13. 8x
4 2 N, x2 2x 4x
5.4
∀k, (P_0 ∧ P_1 ∧ . . . ∧ P_(k−1)) ⟹ P_k.
An outline of a strong inductive proof is:
Theorem ∀n ∈ N, P_n
Proof: (By complete induction)
Basis:
(Technically,
..
.
PCI
We recommend
(Here we must show that ∀k, (⋀_(i=0)^(k−1) P_i) ⟹ P_k is true.)
Q.E.D.
It's fairly common that we won't truly need all of the statements from P_0
to P_(k−1) to be true, but just one of them (and we don't know a priori which
one). The following is a classic result: the proof that all numbers greater
than 1 have prime factors.
Theorem 5.4.1. For all natural numbers n, n > 1 implies n has a prime
factor.
Proof: (By strong induction) Consider an arbitrary natural number n > 1. If n is prime then n clearly has a prime factor (itself),
so suppose that n is not prime. By definition, a composite natural number can be factored, so n = a · b for some pair of natural
numbers a and b which are both greater than 1. Since a and b are
factors of n both greater than 1, it follows that a < n (it is also
true that b < n but we don't need that . . . ). The inductive hypothesis can now be applied to deduce that a has a prime factor
p. Since p | a and a | n, by transitivity p | n. Thus n has a prime
factor.
Q.E.D.
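The structure of this proof translates almost word for word into a recursive procedure: a call on n may rely on the answer for any smaller value, just as the strong inductive hypothesis does. Here is a minimal sketch in Python (the function name `prime_factor` is our own choice).

```python
def prime_factor(n: int) -> int:
    """Return some prime factor of n, for an integer n > 1."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    for a in range(2, n):
        if n % a == 0:
            # n = a * b with 1 < a < n, so the "inductive hypothesis"
            # (a recursive call on a smaller number) applies to a.
            return prime_factor(a)
    # No factor strictly between 1 and n: n itself is prime.
    return n
```

The recursion must terminate because each call is made on a strictly smaller number greater than 1, which is the same reason the induction works.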
Exercises 5.4
Give inductive proofs of the following
1. A postage stamp problem is a problem that (typically) asks us to
determine what total postage values can be produced using two sorts
of stamps. Suppose that you have 3¢ stamps and 7¢ stamps. Show
(using strong induction) that any postage value 12¢ or higher can be
achieved. That is,
∀n ∈ N, n ≥ 12 ⟹ ∃x, y ∈ N, n = 3x + 7y.
2. Show that any integer postage of 12¢ or more can be made using only
4¢ and 5¢ stamps.
3. The polynomial equation x2 = x + 1 has two solutions, =
=
n
p
1+ 5
2
and
for all n
0.
Chapter 6
Relations and functions
If evolution really works, how come mothers only have two hands? Milton
Berle
6.1
Relations
A relation in mathematics is a symbol that can be placed between two numbers (or variables) to create a logical statement (or open sentence). The main
point here is that the insertion of a relation symbol between two numbers
creates a statement whose value is either true or false. For example, we have
previously seen the divisibility symbol (|) and noted the common error of
mistaking it for the division symbol (/); one of these tells us to perform an
arithmetic operation, the other asks us whether if such an operation were
performed there would be a remainder. There are many other symbols that
we have seen which have this characteristic. The most important is probably
=, but there are lots: ≠, <, ≤, >, ≥ all work this way. When we place one of them
between two numbers we get a Boolean thing: it's either true or false. If, instead of numbers, we think of placing sets on either side of a relation symbol,
then =, ⊂ and ⊆ are valid relation symbols. If we think of placing logical
expressions on either side of a relation then, honestly, any of the logical symbols is a relation, but we normally think of ∧ and ∨ as operators and give
things like ≡, ⟹ and ⟺ the status of relations.
In the examples we've looked at, the things on either side of a relation are
of the same type. This is usually, but not always, the case. The prevalence
of relations with the same kind of things being compared has even led to
the aphorism “Don't compare apples and oranges.” Think about the symbol
∈ for a moment. As we've seen previously, it isn't usually appropriate to put
sets on either side of this; we might have numbers or other objects on the left
and sets on the right. Let's look at a small example. Let A = {1, 2, 3, a, b}
and let B = {{1, 2, a}, {1, 3, 5, 7, . . .}, {1}}. The “element of” relation, ∈, is
a relation from A to B.
[Figure: arrows drawn from each element of A to the sets in B that contain it.]
relations. But we should point out certain hidden assumptions. First, they'll
only work if we are dealing with finite sets, or sets like the odd numbers in
our example (sets that are infinite but could in principle be listed). Second,
by drawing the two sets separately, it seems that we are assuming they are
not only different, but disjoint. The sets not only need not be disjoint, but
often (most of the time!) we have relations that go from a set to itself so
the sets in a picture like this may be identical. In Figure 6.2 we illustrate
the divisibility relation on the set of all divisors of 6; this is an example in
which the sets on either side of the relation are the same. Notice the linguistic
distinction: we can talk about either “a relation from A to B” (when there
are really two different sets) or “a relation on A” (when there is only one).
Figure 6.2: The divides relation is an example of a relation that goes from
a set to itself. In this example we say that we have a relation on the set of
divisors of 6.
Purists will note that it is really inappropriate to represent the same set
in two different places in a Venn diagram. The diagram in Figure 6.2 should
A × B = {(a, b) | a ∈ A ∧ b ∈ B}
From here on out in your mathematical career you'll need to take note of
the context that the symbol × appears in. If it appears between numbers go
ahead and multiply, but if it appears between sets you're doing something
different: forming the Cartesian product.
The familiar xy-plane is often called the Cartesian plane. This is done
for two reasons. René Descartes, the famous mathematician and philosopher,
was the first to consider coordinatizing the plane and thus is responsible for
our current understanding of the relationship between geometry and algebra.
René Descartes' name is also memorialized in the definition of the Cartesian
product of sets, and the plane is nothing more than the product R × R.
Indeed, the plane provided the very first example of the concept that was
later generalized to the Cartesian product of sets.
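For finite sets the Cartesian product can be built mechanically. The sketch below uses Python's standard `itertools.product`; the sets X and Y are made up purely for illustration.

```python
from itertools import product


def cartesian_product(A, B):
    """Return A x B as a set of ordered pairs (a, b), a in A and b in B."""
    return {(a, b) for a, b in product(A, B)}


X = {0, 1}
Y = {"p", "q", "r"}
XY = cartesian_product(X, Y)  # six pairs, such as (0, "p") and (1, "r")
```

Notice that order matters in an ordered pair: (0, "p") belongs to X × Y, while ("p", 0) belongs to Y × X instead.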
Exercise. Suppose A = {1, 2, 3} and B = {a, b, c}. Is (a, 1) in the Cartesian
x ∈ R^+ }.
that a particular pair, (a, b), of things make the relation true we have to write
aRb.
This looks funny too.
Despite the strange appearances, these examples do express the correct
way to deal with relations.
Let's do a completely made-up example. Suppose A is the set {a, e, i, o, u} and R is the relation consisting of the following pairs:
R = {(a, s), (a, t), (a, n), (e, t), (e, l), (e, n), (i, s), (i, t), (o, r), (o, n), (u, s)}.
Then, for example, because (e, t) ∈ R we can write eRt. We indicate the
negation of the concept that two elements are related by drawing a slash
through the name of the relation; for example, the notation ≠ is certainly
familiar to you, as is ≮ (although in this latter case we would normally write
≥ instead). We can denote the fact that (a, l) is not a pair that makes the
relation true by writing a R̸ l.
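Viewing a relation as a set of ordered pairs is exactly how one would store it in a program. In this sketch (Python; the helper name `related` is ours) the statement xRy becomes a membership test.

```python
# The made-up relation R from the text, stored as a set of ordered pairs.
R = {("a", "s"), ("a", "t"), ("a", "n"), ("e", "t"), ("e", "l"), ("e", "n"),
     ("i", "s"), ("i", "t"), ("o", "r"), ("o", "n"), ("u", "s")}


def related(relation, x, y) -> bool:
    """Return True when x is related to y, i.e. when (x, y) is in the relation."""
    return (x, y) in relation
```

With these definitions `related(R, "e", "t")` is true, matching eRt, while `related(R, "a", "l")` is false.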
means we can view the relation as a subset of the xy plane. In other words,
we can graph it. The graph of the < relation is given in Figure 6.3.
A relation on any set that is a subset of R can likewise be graphed. The
graph of the | relation is given in Figure 6.4.
Figure 6.3: The “less than” relation can be viewed as a subset of R × R, i.e.
it can be graphed.
Figure 6.4: The divisibility relation can be graphed. Only those points (as
indicated) with integer coordinates are in the graph.
symbols f and g appear in the same order. But beware! There are atavists
out there who write their compositions the other way around.
You should probably have a diagram like the following in mind while
thinking about the composition of relations. Here, we have the set A =
{1, 2, 3, 4}, the set B is {a, b, c, d} and C = {w, x, y, z}. The relation R goes
from A to B and consists of the following set of pairs,
R = {(1, a), (1, c), (2, d), (3, c), (3, d)}.
And
S = {(a, y), (b, w), (b, x), (b, z)}.
[Figure: arrow diagram showing the relation R from A to B followed by the relation S from B to C.]
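The composite of these two relations can be computed by chaining pairs that share a middle element. The sketch below is in Python, with a helper name of our own; it assumes the convention that S ∘ R means “apply R first, then S”.

```python
def compose(S, R):
    """Return S o R: all pairs (a, c) for which some b has
    (a, b) in R and (b, c) in S."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


R = {(1, "a"), (1, "c"), (2, "d"), (3, "c"), (3, "d")}
S = {("a", "y"), ("b", "w"), ("b", "x"), ("b", "z")}
S_after_R = compose(S, R)
```

Only the element a of B is both hit by R and used by S, so the composite contains the single pair (1, y).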
Exercises 6.1
1. The lexicographic order, <_lex, is a relation on the set of all words,
where x <_lex y means that x would come before y in the dictionary.
Consider just the three letter words like “iff”, “fig”, “the”, et cetera.
Come up with a usable definition for x_1 x_2 x_3 <_lex y_1 y_2 y_3.
2. What is the graph of “=” in R × R?
3. The inverse of a relation R is denoted R^(−1). It contains exactly the
same ordered pairs as R but with the order switched. (So technically,
they aren't exactly the same ordered pairs . . . )
R^(−1) = {(b, a) | (a, b) ∈ R}
4. The “socks and shoes” rule is a very silly little mnemonic for remembering how to invert a composition. If we think of undoing the process
of putting on our socks and shoes (that's socks first, then shoes) we
have to first remove our shoes, then take off our socks.
The “socks and shoes” rule is valid for relations as well.
Prove that (S ∘ R)^(−1) = R^(−1) ∘ S^(−1).
6.2
Properties of relations
There are two special classes of relations that we will study in the next
two sections, equivalence relations and ordering relations. The prototype
for an equivalence relation is the ordinary notion of numerical equality, =.
The prototypical ordering relation is ≤. Each of these has certain salient
properties that are the root causes of their importance. In this section we
will study a compendium of properties that a relation may or may not have.
A relation that has three of the properties we'll discuss:
1. reflexivity
2. symmetry
3. transitivity
is said to be an equivalence relation; it will in some ways resemble =.
A relation that has another set of three properties:
1. reflexivity
2. anti-symmetry
3. transitivity
is called an ordering relation; it will resemble ≤.
tions have.
There are a total of 5 properties that we have named, and we will discuss
them all more thoroughly. But first, we'll state the formal definitions. Take
note that these properties are all stated for a relation that goes from a set
to itself; indeed, most of them wouldn't even make sense if we tried to define
them for a relation from a set to a different set.
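reflexive        ∀a ∈ S, aRa
irreflexive      ∀a ∈ S, ¬(aRa)
symmetric        ∀a, b ∈ S, aRb ⟹ bRa
anti-symmetric   ∀a, b ∈ S, aRb ∧ bRa ⟹ a = b
transitive       ∀a, b, c ∈ S, aRb ∧ bRc ⟹ aRc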
aRa).
How does this differ from the defining property for “irreflexive”?
∀a, b ∈ S, aRb ⟹ bRa.
∀a, b ∈ S, aRb ∧ bRa ⟹ a = b.
It may be hard at first to understand why the definition we use for anti-symmetry is the one above. If one wanted to ensure that there were never
two-way connections between elements of the set it might seem easier to
define anti-symmetry as follows:
(Alternate definition) A relation R on a set S is anti-symmetric
iff ∀a, b ∈ S, aRb ⟹ b R̸ a.
This definition may seem more straightforward, but it turns out the
original definition is easier to use in proofs. We need to convince ourselves
that the (first) definition really accomplishes what we want. Namely, if a
relation R satisfies the property that ∀a, b ∈ S, aRb ∧ bRa ⟹ a = b,
then there will not actually be any pair of distinct elements that are related in
both orders. One way to think about it is this: suppose that a and b are
distinct elements of S and that both aRb and bRa are true. The property now
guarantees that a = b which contradicts the notion that a and b are distinct.
This is a miniature proof by contradiction; if you assume there is a pair of
distinct elements that are related in both orders you get a contradiction, so
there aren't!
A funny thing about the anti-symmetry property is this: When it is true
of a relation it is, for distinct elements, always vacuously true! The property is engineered in such a
way that when it is true, it forces the statement in its antecedent never
to happen for a pair of distinct elements.
Transitivity is an extremely useful property as witnessed by the fact that
both equivalence relations and ordering relations must have this property.
When speaking of the transitive property of equality we say “Two things that
are equal to a third, are equal to each other.” When dealing with ordering
we may encounter statements like the following. “Since ‘Aardvark’ precedes
‘Bulwark’ in the dictionary, and since ‘Bulwark’ precedes ‘Catastrophe’, it is
plainly true that ‘Aardvark’ comes before ‘Catastrophe’ in the dictionary.”
Again, the definition of transitivity involves a conditional. Also, transitivity may be viewed as the most complicated of the properties we've been
studying; it takes three universally quantified variables to state the property.
A relation R on a set S is transitive iff
∀a, b, c ∈ S, aRb ∧ bRc ⟹ aRc
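For a relation on a finite set, each of the five defining properties can be tested by brute force, straight from its quantified definition. The following Python sketch uses function names of our own; the relation is assumed to be stored as a set of ordered pairs, and the example relation (divisibility on {1, 2, 3, 6}) is chosen only for illustration.

```python
def is_reflexive(R, S) -> bool:
    return all((a, a) in R for a in S)


def is_irreflexive(R, S) -> bool:
    return all((a, a) not in R for a in S)


def is_symmetric(R) -> bool:
    return all((b, a) in R for (a, b) in R)


def is_antisymmetric(R) -> bool:
    # Whenever both (a, b) and (b, a) are present, a and b must be equal.
    return all(a == b for (a, b) in R if (b, a) in R)


def is_transitive(R) -> bool:
    # Every chain a R b, b R d must be completed by a R d.
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


S = {1, 2, 3, 6}
divides = {(a, b) for a in S for b in S if b % a == 0}
```

On this small example, divisibility turns out to be reflexive, anti-symmetric and transitive, but neither symmetric nor irreflexive.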
Exercise. Find logical negations for the formal properties defining each of
the five properties.
If a relation R is reflexive we will never see a node that doesn't have a
loop.
If a relation R is irreflexive we will never see a node that does have a loop!
Exercises 6.2
1. Consider the relation S defined by S = {(x, y)
6.3
Equivalence relations
The main idea of an equivalence relation is that it is something like equality, but not quite. Usually there is some property that we can name, so
that equivalent things share that property. For example Albert Einstein and
Adolf Eichmann were two entirely different human beings; if you consider
all the different criteria that one can use to distinguish human beings there
is little they have in common. But, if the only thing one was interested
in was a person's initials, one would have to say that Einstein and Eichmann were equivalent. Future examples of equivalence relations will be less
frivolous. . . But first, the formal definition:
Definition. A relation R on a set S is an equivalence relation iff R is reflexive, symmetric and transitive.
Probably the most important equivalence relation we've seen to date is
“equivalence mod m” which we will denote using the symbol ≡_m. This
relation may even be more interesting than actual equality! The reason for
this seemingly odd statement is that “equivalence mod m” gives us nontrivial equivalence classes. Equivalence classes are one of the most potent
ideas in modern mathematics and it's essential that you understand them,
so we'll start with an example. Consider equivalence mod 5. What other
numbers is (say) 11 equivalent to? There are many! Any number that leaves
the same remainder as 11 when we divide it by 5. This collection is called
the equivalence class of 11 and is usually denoted using an overline, \overline{11};
another notation that is often seen for the set of things equivalent to 11 is
11/≡_5.
\overline{11} = {. . . , −9, −4, 1, 6, 11, 16, . . .}
It's easy to see that we will get the exact same set if we choose any other
element of the equivalence class (in place of 11), which leads us to an infinite
list of set equalities,
\overline{1} = \overline{6} = \overline{11} = . . .
And similarly,
\overline{2} = \overline{7} = \overline{12} = . . .
In fact, there are really just 5 different sets that form the equivalence classes
mod 5: \overline{0}, \overline{1}, \overline{2}, \overline{3}, and \overline{4}. (Note: we have followed the usual convention of
using the smallest possible non-negative integers as the representatives for
our equivalence classes.)
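An equivalence class mod 5 is infinite, but any finite window of it is easy to list. The Python sketch below (the function name and the window are our own choices) collects the integers in a given range that are equivalent to a given representative.

```python
def equivalence_class(a: int, m: int, window) -> list:
    """List the integers x in `window` with x equivalent to a (mod m)."""
    return [x for x in window if (x - a) % m == 0]


window = range(-9, 17)  # the integers from -9 through 16
class_of_11 = equivalence_class(11, 5, window)

# Each integer in the window falls into exactly one of five classes,
# identified here by its remainder on division by 5.
number_of_classes = len({x % 5 for x in window})
```

Starting from 1 or 6 instead of 11 produces the same list, which is the content of the set equalities above.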
What we've been discussing here is one of the first examples of a quotient structure. We start with the integers and “mod out” by an equivalence
relation. In doing so, we “move to the quotient” which means (in this instance) that we go from Z to a much simpler set having only five elements:
{\overline{0}, \overline{1}, \overline{2}, \overline{3}, \overline{4}}. In moving to the quotient we will generally lose a lot of in-
S = ⋃_(X ∈ P) X
and
∀X, Y ∈ P, X ≠ Y ⟹ X ∩ Y = ∅.
In words, if you take the union of all the pieces of the partition you'll
get the set S, and any pair of sets from the partition that aren't identical
are disjoint. Partitions are an inherently useful way of looking at things.
Although in the real world there are often problems (sets we thought were
disjoint turn out to have elements in common, or we discover something that
doesn't fit into any of the pieces of our partition), in mathematics we usually
find that partitions do just what we would want them to do. Partitions divide
some set up into a number of convenient pieces in such a way that we're
guaranteed that every element of the set is in one of the pieces and also so
that none of the pieces overlap. Partitions are a useful way of dissecting sets,
and equivalence relations (via their equivalence classes) give us an easy way
of creating partitions, usually with some additional structure to boot! The
properties that make a relation an equivalence relation (reflexivity, symmetry
and transitivity) are designed to ensure that equivalence classes exist and do
provide us with the desired partition. For the beginning proof writer this all
may seem very complicated, but take heart! Most of the work has already
been done for you by those who created the general theory of equivalence
relations and quotient structures. All you have to do (usually) is prove
that a given relation is an equivalence relation by verifying that it is indeed
reflexive, symmetric and transitive. Let's have a look at another example.
In Number Theory, the square-free part of an integer is what remains
after we divide out the largest perfect square that divides it. (This should not be
confused with the radical of an integer, which is the product of its distinct prime factors.) The following table gives the square-free
part, sf(n), for the first several values of n.
n:      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
sf(n):  1  2  3  1  5  6  7  2  1 10 11  3 13 14 15  1 17  2 19  5
It's easy to compute the square-free part of an integer if you know its
prime factorization: just reduce all the exponents mod 2. For example, consider
the size of the largest sporadic finite simple group, known as the Monster:
808017424794512875886459904961710757005754368000000000
= 2^46 · 3^20 · 5^9 · 7^6 · 11^2 · 13^3 · 17 · 19 · 23 · 29 · 31 · 41 · 47 · 59 · 71
the square-free part of this number is
5 · 13 · 17 · 19 · 23 · 29 · 31 · 41 · 47 · 59 · 71
= 3504253225343845
which, while it is still quite a large number, is certainly a good bit smaller
than the original!
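The “reduce the exponents mod 2” recipe is easy to mechanize. The following Python sketch (the name `squarefree_part` is ours) finds the prime factorization by trial division, which is only practical when the prime factors are small, and keeps each prime whose exponent is odd.

```python
def squarefree_part(n: int) -> int:
    """Return sf(n): the product of the primes that occur to an odd
    power in the factorization of n (with sf(1) = 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:      # strip out every factor of d
            n //= d
            exponent += 1
        if exponent % 2 == 1:  # exponent reduced mod 2 is 1: keep one copy
            result *= d
        d += 1
    if n > 1:                  # a single leftover prime, with exponent 1
        result *= n
    return result


first_twenty = [squarefree_part(n) for n in range(1, 21)]
```

The list `first_twenty` reproduces the row of the table above, and because all prime factors of the Monster's order are at most 71, the same function handles that 54-digit number without difficulty.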
We will define an equivalence relation S on the set of natural numbers by
using the square-free part:
∀x, y ∈ N, xSy ⟺ sf(x) = sf(y)
In other words, two natural numbers will be S-related if they have the
same square-free parts.
Exercise. What is 1/S?
Before we proceed to the proof that S is an equivalence relation we'd
like you to be cognizant of a bigger picture as you read. Each of the three
parts of the proof will have a similar structure. We will show that S has
one of the three properties by using the fact that = has that property. In
more advanced work this entire proof could be omitted or replaced by the
phrase “S inherits reflexivity, symmetry and transitivity from equality, and
is therefore an equivalence relation.” (Nice trick isn't it? But before you're
allowed to use it you have to show that you can do it the hard way . . . )
=)
and further suppose that both xSy and ySz. From the definition
of S we deduce that sf (x) = sf (y) and sf (y) = sf (z). Clearly,
sf (x) = sf (z) (this deduction comes from the transitive property
of =), so xSz.
Q.E.D.
We'll end this section with an example of an equivalence relation that
doesn't inherit the three properties from equality.
Two graphs are said to be isomorphic if they represent the same connections. There must first of all be a one-to-one correspondence between the
vertices of the two graphs, and further, a pair of vertices in one graph are
connected by some number of edges if and only if the corresponding vertices
in the other graph are connected by the same number of edges.
Exercise. The four examples of graphs above actually are two pairs of isomorphic graphs. Which pairs are isomorphic?
This word “isomorphic” has a nice etymology. It means “same shape”.
Two graphs are isomorphic if they have the same shape. We don't have
the tools right now to do a formal proof (in fact we need to look at some
further prerequisites before we can really precisely define isomorphism), but
isomorphism of graphs is an equivalence relation. Let's at least verify this
informally.
Reflexivity: Is a graph isomorphic to itself? That is, does a graph have
the same shape as itself? Clearly!
Symmetry: If graph A is isomorphic to graph B, is it also the case that
graph B is isomorphic to graph A? I.e. if A has the same shape as B,
doesn't B have the same shape as A? Of course!
Transitivity: Well . . . the answer here is going to be “Naturally!” but
let's wait to delve into this issue when we have a usable formal definition for
graph isomorphism. The question at this stage should be clear though: If A
is isomorphic to B and B is isomorphic to C, then isn't A isomorphic to C?
Exercises 6.3
() x2 = y 2 . Show
()
w1 is an anagram of w2 .
4. The two diagrams below both show a famous graph known as the Petersen graph. The picture on the left is the usual representation which
emphasizes its five-fold symmetry. The picture on the right highlights
the fact that the Petersen graph also has a three-fold symmetry. Label
the right-hand diagram using the same letters (A through J) in order
to show that these two representations are truly isomorphic.
[Figure: two drawings of the Petersen graph; the left-hand drawing has its vertices labeled A through J.]
5. We will use the symbol Z* to refer to the set of all integers except 0.
Define a relation Q on the set of all pairs in Z × Z* (pairs of integers
6. The relation Q defined in the previous problem partitions the set of all
pairs of integers into an interesting set of equivalence classes. Explain
why
Q
(Z × Z*)/Q.
6.4
Ordering relations
position we always root for the underdog, but one of our favorite ordering
relations (divisibility) is reflexive and it would be eliminated if we made the
other choice³. So. . .
Definition. A relation R on a set S is an ordering relation iff R is reflexive,
anti-symmetric and transitive.
Now, we've used ≤ to decide what properties an ordering relation should
have, but we should point out that most ordering relations don't do nearly
as good a job as ≤ does. The ≤ relation imposes what is known as a total
order on the sets that it acts on (you should note that it can't be used to
compare complex numbers, but it can be placed between reals or any of the
sets of numbers that are contained in R). Most ordering relations only create
what is known as a partial order on the sets they act on. In a total ordering
(a.k.a. a linear ordering) every pair of elements can be compared and we
can use the ordering relation to decide which order they go in. In a partial
ordering there may be elements that are incomparable.
Definition. If x and y are elements of a set S and R is an ordering relation
on S then we say x and y are comparable if xRy ∨ yRx.
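To see the definition of comparability at work, here is a small sketch in Python (helper names are our own) that tests it on the set {1, 2, 3, 4, 6, 12}, once under ≤ and once under divisibility.

```python
def comparable(x, y, leq) -> bool:
    """x and y are comparable under the ordering `leq` when
    leq(x, y) or leq(y, x) holds."""
    return leq(x, y) or leq(y, x)


def divides(a: int, b: int) -> bool:
    return b % a == 0


def at_most(a: int, b: int) -> bool:
    return a <= b


elements = [1, 2, 3, 4, 6, 12]
pairs = [(x, y) for i, x in enumerate(elements) for y in elements[i + 1:]]

incomparable_by_divisibility = [p for p in pairs if not comparable(*p, divides)]
incomparable_by_size = [p for p in pairs if not comparable(*p, at_most)]
```

Under ≤ every pair is comparable, so the second list is empty; under divisibility the pairs (2, 3), (3, 4) and (4, 6) are incomparable, which is what makes that ordering only partial.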
³ If you insist on making the other choice, you will have a strict ordering relation
On the other hand, perhaps you noticed these numbers are the divisors of
12. The divisibility relation will give us our first example of a partial order.
Figure 6.5: Hasse diagrams of the set {1, 2, 3, 4, 6, 12} totally ordered by
unless they happen to be equal! This allows us to draw the Hasse diagram
for this set with the nodes arranged in four rows. (See Figure 6.6.)
Exercise. Try drawing a Hasse diagram for the partially ordered set
(P({1, 2, 3, 4}), ⊆).
Posets like (P({1, 2, 3}), ⊆) that can be laid out in ranks are known as
graded posets. Things in a graded poset that have the same rank are always
incomparable.
Figure 6.6: Hasse diagram for the power set of {1, 2, 3} partially ordered by
set containment.
Figure 6.7: Hasse diagram for the divisors of 72, partially ordered by divisibility. This is a graded poset.
would be possible to add both 1 and 72 to this chain and still have a chain,
this chain is not maximal. (But, of course, {1, 2, 6, 12, 24, 72} is.) On the
other hand, {8, 12, 18} is an antichain (indeed, this is a maximal antichain).
This poset has both a top and a bottom: 1 is the least element and 72 is the
greatest element. Notice that the elements which cover 1 (the least element)
are the prime divisors of 72.
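The claims made about the divisors of 72 can all be confirmed by exhaustive checking. The following Python sketch uses helper names of our own choosing; `covers(upper, lower, poset)` is our rendering of “upper covers lower”, meaning that lower divides upper, the two differ, and no third element sits strictly between them.

```python
from itertools import combinations


def divides(a: int, b: int) -> bool:
    return b % a == 0


def is_chain(elements) -> bool:
    """Every two elements are comparable under divisibility."""
    return all(divides(x, y) or divides(y, x)
               for x, y in combinations(elements, 2))


def is_antichain(elements) -> bool:
    """No two distinct elements are comparable under divisibility."""
    return all(not divides(x, y) and not divides(y, x)
               for x, y in combinations(elements, 2))


def covers(upper: int, lower: int, poset) -> bool:
    if upper == lower or not divides(lower, upper):
        return False
    return not any(z != lower and z != upper
                   and divides(lower, z) and divides(z, upper)
                   for z in poset)


divisors_of_72 = [d for d in range(1, 73) if 72 % d == 0]
covers_of_1 = [d for d in divisors_of_72 if covers(d, 1, divisors_of_72)]
```

Here `covers_of_1` comes out as [2, 3], the prime divisors of 72, in agreement with the remark above.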
Exercises 6.4
1. In population ecology there is a partial order “predates” which basically
means that one organism feeds upon another. Strictly speaking this
relation is not transitive; however, if we take the point of view that
when a wolf eats a sheep, it is also eating some of the grass that the
sheep has fed upon, we see that in a certain sense it is transitive.
A chain in this partial order is called a “food chain” and so-called
apex predators are said to “sit atop the food chain”. Thus “apex
predator” is a term for a maximal element in this poset. When poisons
such as mercury and PCBs are introduced into an ecosystem, they
tend to collect disproportionately in the apex predators, which is why
pregnant women and young children should not eat sharks or tuna but
sardines are fine.
Below is a small example of an ecology partially ordered by “predates”.
[Figure: Hasse diagram with nodes Fox, Cow, Duck, Alligator, Robin, Goose, Grass and Worms.]
Find the largest antichain in this poset.
a. Grass
2. A maximal antichain
b. Goose
3. A maximal element
c. Fox
4. A (non-maximal) chain
d. {Grass, Duck}
5. A maximal chain
7. A least element
8. A minimal element
6.5
Functions
The concept of a function is one of the most useful abstractions in mathematics. In fact it is an abstraction that can be further abstracted! For instance
an operator is an entity which takes functions as inputs and produces functions as outputs, thus an operator is to functions as functions themselves
are to numbers. There are many operators that you have certainly encountered already, just not by that name. One of the most famous operators is
differentiation: when you take the derivative of some function, the answer
you obtain is another function. If two different people are given the same
differentiation problem and they come up with different answers, we know
that at least one of them has made a mistake! Similarly, if two calculations
of the value of a function are made for the same input, they must match.
The property we are discussing used to be captured by saying that a
function needs to be “well-defined”. The old school definition of a function
was:
Definition. A function f is a well-defined rule, that, given any input value
x produces a unique output⁴ value f(x).
A more modern definition of a function is the following.
Definition. A function is a binary relation which does not contain distinct
pairs having the same initial element.
When we think of a function as a special type of binary relation, the pairs
that are in the function have the form (x, f (x)), that is, they consist of an
input and the corresponding output.
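The modern definition can be tested directly on a finite relation: scan the pairs and make sure no first coordinate is ever paired with two different second coordinates. A minimal Python sketch (the function name `is_function` is ours):

```python
def is_function(relation) -> bool:
    """Return True when no two distinct pairs share the same first element."""
    outputs = {}
    for x, y in relation:
        if x in outputs and outputs[x] != y:
            return False   # x is paired with two different outputs
        outputs[x] = y
    return True


# The squaring relation on {-3, ..., 3}, and the same pairs reversed.
squares = {(x, x * x) for x in range(-3, 4)}
reversed_squares = {(y, x) for (x, y) in squares}
```

The relation `squares` passes the test, but `reversed_squares` does not: it contains both (9, 3) and (9, −3).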
We have gotten relatively used to relations on a set, but recall that the
more general situation is that a binary relation is a subset of A × B. In this
⁴ The use of the notation f(x) to indicate the output of function f associated with input
(of course) but also some sets that we'll denote by A′ and B′. The set A′
consists of those elements of A that actually appear as the first coordinate
of a pair in the relation f. The set B′ consists of those elements of B that
actually appear as the second coordinate of a pair in the relation f. A generic
example of how these four sets might look is given in Figure 6.8.
[Figure 6.8: the sets A, A′, B and B′ for a generic relation f.]
f. The set we have been calling A does not have a name. In fact, the formal
definition of the term “function” has been rigged so that there is no difference
between the sets A and A′. This seems a shame; if you think of range and
domain as being primary, doesn't it seem odd that we have a way to refer
to a superset of the range (i.e. the codomain) but no way of referring to a
superset of the domain?
Nevertheless, this is just the way it is . . . There is only one set on the
input side: the domain of our function.
The domain of any relation R is expressed by writing Dom(R), which is
defined as follows.
Definition. If R is a relation from A to B then Dom(R) is a subset of A
defined by
Dom(R) = {a ∈ A | ∃b ∈ B, (a, b) ∈ R}
We should point out that the notation just given for the domain of a
relation R, (Dom(R)), has analogs for the other sets that are involved with
a relation. We write Cod(R) to refer to the codomain of the relation, and
Rng(R) to refer to the range.
Since we are now thinking of functions as special classes of relations,
it follows that a function is just a set of ordered pairs. This means that
the identity of a function is tied up, not just with a formula that gives the
output for a given input, but also with what values can be used for those
inputs. Thus the function f(x) = 2x defined on R is a completely different
animal from the function f(x) = 2x defined on N. If you really want to
specify a function precisely you must give its domain as well as a formula for
it. Usually, one does this by writing a formula, then a semicolon, then the
domain. (E.g. f(x) = x^2; x ≥ 0.)
Okay, so, finally, we are prepared to give the real definition of a function.
Recapping, a function must have its domain equal to the set A where its
inputs come from. This is sometimes expressed by saying that a function is
defined on its domain. A function's range and codomain may be different
however. In the event that the range and codomain are the same (Cod(R) =
Rng(R)) we have a rather special situation and the function is graced by the
appellation “surjection”. The term “onto” is also commonly used to describe
a surjective function.
Exercise. There is an expression in mathematics, “Every function is onto
its range.” that really doesn't say very much. Why not?
If one has elements x and y, of the domain and codomain, (respectively)
and y = f(x)⁵ then one may say that “y is the image of x” or that “x is a
preimage of y”. Take careful note of the articles used in these phrases: we
say “y is the image of x” but “x is a preimage of y”. This is because y is
uniquely determined by x, but not vice versa. For example, since the squares
of 2 and
2.
thing. By writing f
and
In terms of graphs, the inverse and the original relation are related by
being reflections in the line y = x. It is possible for one, both, or neither of
these to be functions. The canonical example to keep in mind is probably
f (x) = x2 and its inverse.
−3)
and square it we get a positive number (9) and if we then come along and
take the square root we get another positive number (3). This is problematic
since we didn't end up where we started, which is what ought to happen if
we apply a function followed by its inverse.
We'll try to handle the general situation in a bit, but for the moment
let's consider the nice case: when the inverse of a function is also a function.
When exactly does this happen? Well, we have just seen that the inverse
of a function doesn't necessarily pass the vertical line test, and it turns out
that that is the predominant issue. So, under what circumstances does the
inverse pass the vertical line test? When the original function passes the so-called horizontal line test (every horizontal line intersects the graph at most
once). Thinking again about f(x) = x^2, there are some horizontal lines that
miss the graph entirely, but all horizontal lines of the form y = c where c
is positive will intersect the graph twice. There are many functions that do
pass the horizontal line test, for instance, consider f(x) = x^3. Such functions
are known as injections; this is the same thing as saying a function is “one-to-one”. Injective functions can be inverted: the domain of the inverse function
of f will only be the range, Rng(f), which as we have seen may fall short of
being the entire codomain, since Rng(f) ⊆ Cod(f).
Let's first define injections in a way that is divorced from thinking about
their graphs.
Definition. A function f(x) is an injection iff for all pairs of inputs x_1 and
x_2, if f(x_1) = f(x_2) then x_1 = x_2.
This is another of those defining properties that is designed so that when
it is true it is vacuously true. An injective function never takes two distinct
inputs to the same output. Perhaps the cleanest way to think about injective
functions is in terms of preimages: when a function is injective, preimages are
unique. Actually, this is a good time to mention something about surjective
functions and preimages: if a function is surjective, every element of the
codomain has a preimage. So, if a function has both of these properties it
means that every element of the codomain has one (and only one) preimage.
A function that is both injective and surjective (one-to-one and onto)
is known as a bijection. Bijections are tremendously important in mathematics since they provide a way of perfectly matching up the elements of
two sets. You will probably spend a good bit of time in the future devising
maps between sets and then proving that they are bijections, so we will start
practicing that skill now. . .
Ordinarily, we will show that a function is a bijection by proving separately that it is both a surjection and an injection.
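For functions on small finite sets, the two halves of a bijection proof can be checked by brute force. The following is only a sketch for experimenting: the helper names `is_injective`, `is_surjective` and `is_bijective` are our own inventions, and a finite check is of course no substitute for a proof about an infinite domain.

```python
def is_injective(f, domain):
    """True if no two distinct inputs in `domain` share an output."""
    outputs = [f(x) for x in domain]
    return len(outputs) == len(set(outputs))


def is_surjective(f, domain, codomain):
    """True if every element of `codomain` has at least one preimage."""
    return {f(x) for x in domain} >= set(codomain)


def is_bijective(f, domain, codomain):
    """Check the two properties separately, just as a proof would."""
    return is_injective(f, domain) and is_surjective(f, domain, codomain)


window = range(-3, 4)
print(is_injective(lambda x: x ** 2, window))   # False: -3 and 3 collide
print(is_injective(lambda x: x ** 3, window))   # True: all cubes are distinct
print(is_bijective(lambda x: x + 1, range(5), range(1, 6)))   # True
```

Note that squaring fails the injectivity check for exactly the reason it fails the horizontal line test: two different inputs land on the same output.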
To show that a function is surjective we need to show that it is possible
to find a preimage for every element of the codomain. If we happen to know
what the inverse function is, then it is easy to find a preimage for an arbitrary element. In terms of the taxonomy for proofs that was introduced in
Chapter 3, we are talking about a constructive proof of an existential statement. A function $f$ is surjective iff $\forall y \in \mathrm{Cod}(f), \exists x \in \mathrm{Dom}(f), y = f(x)$, so
1; x 2 N.
naturals.
1 is a
bijection from N to O.
Proof:
1 = 2k + 2
1 = 2k + 1 = y.
Next we show that f is injective. Suppose that there are two input
values, $x_1$ and $x_2$, such that $f(x_1) = f(x_2)$. Then $2x_1 - 1 = 2x_2 - 1$
and simple algebra leads to $x_1 = x_2$.
Q.E.D.
For a slightly more complicated example consider the function from N to
Z defined by
$$f(x) = \begin{cases} x/2 & \text{if } x \text{ is even} \\ -(x - 1)/2 & \text{if } x \text{ is odd} \end{cases}$$
This function does quite a handy little job: it matches up the natural
numbers and the integers in pairs. Every even natural gets matched with a
positive integer and every odd natural (except 1) gets matched with a negative
integer (1 gets paired with 0). This function is really doing something
remarkable: common sense would seem to indicate that the integers must
be a larger set than the naturals (after all N is completely contained inside
of Z), but the function f defined above serves to show that these two sets
are exactly the same size!
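It can help to watch this matching happen on the first few naturals. The sketch below simply transcribes the piecewise rule (halve the even inputs, send an odd input x to -(x - 1)/2); the Python name `f` mirrors the text, and we assume, as the text does, that the naturals start at 1.

```python
def f(x):
    """Send a natural number x >= 1 to an integer using the piecewise rule."""
    if x % 2 == 0:
        return x // 2            # even naturals go to positive integers
    return -((x - 1) // 2)       # odd naturals go to 0 and the negatives


print([f(x) for x in range(1, 8)])   # [0, 1, -1, 2, -2, 3, -3]

# The first 11 naturals cover every integer from -5 to 5 exactly once.
images = sorted(f(x) for x in range(1, 12))
print(images == list(range(-5, 6)))  # True
```

The finite window only illustrates the pattern; the theorem that follows is what establishes it for all of N and Z.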
Theorem 6.5.2. The function f defined above is bijective.
Proof: First we will show that f is surjective.
It suffices to find a preimage for an arbitrary element of Z. Suppose that y is a particular but arbitrarily chosen integer. There
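are two cases to consider: $y \le 0$ and $y > 0$.
If $y > 0$, then $x = 2y$ is an even natural number, so it falls into
the first case in the definition of f, and $f(x) = 2y/2 = y$.
If $y \le 0$, let $x = 1 - 2y$, which is a natural number since $-2y \ge 0$.
Note that $1 - 2y$ is
odd whenever y is an integer, thus this value for x will fall into
the second case in the definition of f. So,
$f(x) = f(1 - 2y) = -((1 - 2y) - 1)/2 = -(-2y)/2 = y$.
Since the cases $y > 0$ and $y \le 0$ are exhaustive (that is, every y
in Z falls into one or the other of these cases), and we have found
a preimage for y in both cases, it follows that f is surjective.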
Next, we will show that f is injective.
Suppose that x1 and x2 are elements of N and that f (x1 ) = f (x2 ).
Consider the following three cases: x1 and x2 are both even, both
odd, or have opposite parity.
If x1 and x2 are both even, then by the definition of f we have
f (x1 ) = x1 /2 and f (x2 ) = x2 /2 and since these functional values
are equal, we have x1 /2 = x2 /2. Doubling both sides of this leads
to x1 = x2 .
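If $x_1$ and $x_2$ are both odd, then by the definition of f we have
$f(x_1) = -(x_1 - 1)/2$ and $f(x_2) = -(x_2 - 1)/2$, and since these functional
values are equal, we have $-(x_1 - 1)/2 = -(x_2 - 1)/2$. Multiplying both sides
by $-2$ and then adding 1 leads to $x_1 = x_2$.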
(x2
1 so
2 so f (x1 ) = x1 /2
1.
$x_2 \ge 1$
$(x_2 - 1) \ge 0$
$-(x_2 - 1)/2 \le 0$
$f(x_2) \le 0$
therefore we have a contradiction since it is impossible for the two
values $f(x_1)$ and $f(x_2)$ to be equal while $f(x_1) \ge 1$ and $f(x_2) \le 0$.
(T ) and
$f^{-1}(T) = \{x \mid \exists y \in \mathrm{Cod}(f), y \in T \land y = f(x)\}$.
({y}) = {x}.
Exercises 6.5
1. For each of the following functions, give its domain, range and a possible
codomain.
(a) $f(x) = \sin(x)$
(b) $g(x) = e^x$
(c) $h(x) = x^2$
(d) $m(x) = \frac{x^2 + 1}{x^2 - 1}$
you just determined is both injective and surjective. Find the inverse
function of the bijection above.
3. The natural logarithm function ln(x) is defined by a definite integral
with the variable x in the upper limit.
$$\ln(x) = \int_{t=1}^{x} \frac{1}{t} \, dt.$$
4. Georg Cantor developed a systematic way of listing the rational numbers. By "listing" a set one is actually developing a bijection from N to
that set. The method known as "Cantor's Snake" creates a bijection
from the naturals to the non-negative rationals. First we create an
infinite table whose rows are indexed by positive integers and whose
columns are indexed by non-negative integers; the entries in this table
are rational numbers of the form "column index / row index." We
then follow a snake-like path that zig-zags across this table; whenever
we encounter a rational number that we haven't seen before (in lower
terms) we write it down. This is indicated in the diagram below by
circling the entries.
Effectively this gives us a function f which produces the rational number that would be found in a given position in this list. For example
$f(1) = 0/1$, $f(2) = 1/1$ and $f(5) = 1/3$.
(3/4)? What is f
(6/7)?
0/1  1/1  2/1  3/1  4/1  5/1  6/1  7/1  8/1
0/2  1/2  2/2  3/2  4/2  5/2  6/2  7/2  8/2
0/3  1/3  2/3  3/3  4/3  5/3  6/3  7/3  8/3
0/4  1/4  2/4  3/4  4/4  5/4  6/4  7/4  8/4
0/5  1/5  2/5  3/5  4/5  5/5  6/5  7/5  8/5
0/6  1/6  2/6  3/6  4/6  5/6  6/6  7/6  8/6
0/7  1/7  2/7  3/7  4/7  5/7  6/7  7/7  8/7
0/8  1/8  2/8  3/8  4/8  5/8  6/8  7/8  8/8
6.6 Special functions
There are a great many functions that fail the horizontal line test which we
nevertheless seem to have inverse functions for. For example, $x^2$ fails HLT
but $\sqrt{x}$ is a pretty reasonable inverse for it; one just needs to be careful
about the "plus or minus" issue. Also, $\sin x$ fails HLT pretty badly; any
horizontal line y = c with
and
$= \{(x, y) \mid x \in D \land (x, y) \in f\}$.
so that f
f . There can be problems in doing this, but if we are careful about how we
choose D, these problems are usually resolvable.
Exercise. Suppose f is a function that is not one-to-one, and D is a subset
of Dom(f ) such that f
has an
or
g(f (x)) = x?
It might be labeled asin instead. The old-style way to refer to the inverse of a trig.
function was arc-whatever. So the inverse of sine was arcsine, the inverse of tangent was
arctangent.
If we restrict the domain of the sine function to the closed interval $[-\pi/2, \pi/2]$,
we have an invertible function. The inverse of this restricted function is the
function we know as $\sin^{-1}(x)$ or $\arcsin(x)$. The domain and range of $\sin^{-1}(x)$
are (respectively) the intervals $[-1, 1]$ and $[-\pi/2, \pi/2]$.
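A calculator-style experiment makes the effect of the restriction visible. The sketch below uses Python's standard `math.sin` and `math.asin`; the two helper names are our own. Composing in one order always returns the number we started with, while composing in the other order only does so when the input lies in the restricted domain.

```python
import math


def sin_then_arcsin(x):
    """Apply sine first, then the inverse of the restricted sine."""
    return math.asin(math.sin(x))


def arcsin_then_sin(y):
    """Apply arcsine first (y must lie in [-1, 1]), then sine."""
    return math.sin(math.asin(y))


# 0.5 lies inside [-pi/2, pi/2], so we get it back; 2.0 lies outside,
# and what comes back is pi - 2.0 instead.
for x in (0.5, 2.0):
    print(x, sin_then_arcsin(x))

# Every y in [-1, 1] survives the round trip (up to rounding error).
for y in (-1.0, 0.25, 1.0):
    print(y, arcsin_then_sin(y))
```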
Notice that if we choose a number x in the range $-1 \le x \le 1$ and apply
$-\pi/2$ and $\pi/2$
({(0, 1)}) ?
$[0, 2\pi)$
it is possible to express (in terms of the inverse functions of sine and cosine)
if we consider the four cases determined by what quadrant a point on the
unit circle may lie in.
Exercise. Suppose (x, y) represents a point on the unit circle. If (x, y) happens to lie on one of the coordinate axes we have
$W^{-1}((1, 0)) = 0$
$W^{-1}((0, 1)) = \pi/2$
$W^{-1}((-1, 0)) = \pi$
((x, y)) using the cases (i) x > 0 ^ y > 0, (ii) x <
This last example that we have done (the winding map) was unusual in
that the outputs were ordered pairs. In thinking of this map as a relation
(that is, as a set of ordered pairs) we have an ordered pair in which the second
element is an ordered pair! Just for fun, here is another way of expressing
the winding map:
$W = \{(t, (\cos t, \sin t)) \mid t \in \mathbb{R}\}$
When dealing with very complicated expressions involving ordered pairs,
or more generally, ordered n-tuples, it is useful to have a way to refer succinctly to the pieces of a tuple.
Let's start by considering the set $P = \mathbb{R} \times \mathbb{R}$, i.e. P is the xy-plane.
There are two functions, whose domain is P, that pick out the x and/or
y coordinate. These functions are called $\pi_1$ and $\pi_2$; $\pi_1$ is the projection onto
the first coordinate and $\pi_2$ is the projection onto the second coordinate.7
7
Don't think of the usual $\pi \approx 3.14159$ when looking at $\pi_1$ and $\pi_2$. These functions
are named as they are because $\pi$ is the Greek letter corresponding to p, which stands for
projection.
$\pi_1((x, y)) = x$.
The definition of $\pi_2$ is entirely analogous.
You should note that these projection functions are very bad as far as
being one-to-one is concerned. For instance, the preimage of 1 under the
map $\pi_1$ consists of all the points on the vertical line x = 1. That's a lot of
preimages! These guys are so far from being one-to-one that it seems impossible to think of an appropriate restriction that would become invertible.
Nevertheless, there is a function that provides a right inverse for both $\pi_1$ and
$\pi_2$. Now, these projection maps go from $\mathbb{R} \times \mathbb{R}$ to $\mathbb{R}$, so an inverse needs to
be a map from $\mathbb{R}$ to $\mathbb{R} \times \mathbb{R}$. What is a reasonable way to produce a pair of
real numbers if we have a single real number in hand? There are actually
many ways one could proceed, but one reasonable choice is to create a pair
where the input number appears in both coordinates. This is the so-called
diagonal map, $d : \mathbb{R} \to \mathbb{R} \times \mathbb{R}$, defined by $d(a) = (a, a)$.
Exercise. Which of the following is always true,
$d(\pi_1((x, y))) = (x, y)$
or
$\pi_1(d(x)) = x$?
task if an input x is in the set S the function will indicate this by returning
1, otherwise it will return 0. The function which has this behavior is known
as $1_S$, and is called the characteristic function of the subset S (There are
$$1_S : D \to \{0, 1\}$$
$$1_S(x) = \begin{cases} 1 & \text{if } x \in S \\ 0 & \text{otherwise} \end{cases}$$
Exercise. If you have the characteristic function of a subset S, how can you
create the characteristic function of its complement, $\overline{S}$?
A characteristic function may be thought of as an embodiment of a membership criterion. The logical open sentence "$x \in S$" being true is the same
larity, that does the same thing for an arbitrary open sentence. The Iverson
bracket notation uses the shorthand [P (x)] to represent a function that sends
any x that makes P (x) true to 1, and any inputs that make P (x) false will
get sent to 0.
$$[P(x)] = \begin{cases} 1 & \text{if } P(x) \\ 0 & \text{otherwise} \end{cases}$$
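Both ideas translate almost word for word into a programming language, since a true/false test can be converted to 1 or 0. In the sketch below the names `iverson` and `characteristic` are our own choices, and the set {2, 4} inside D = {1, 2, 3, 4, 5} is just an illustrative example.

```python
def iverson(statement):
    """Return 1 if the (already evaluated) statement is true, 0 otherwise."""
    return 1 if statement else 0


def characteristic(S):
    """Build the characteristic function of the subset S."""
    return lambda x: iverson(x in S)


one_S = characteristic({2, 4})
print([one_S(x) for x in range(1, 6)])                  # [0, 1, 0, 1, 0]

# Summing an Iverson bracket counts how often the sentence is true:
# there are five even numbers between 1 and 10.
print(sum(iverson(i % 2 == 0) for i in range(1, 11)))   # 5
```

The second line of output shows why the bracket is handy inside sums: it switches terms on and off without changing the range of summation.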
[2 | i] + [3 | i]
[6 | i]
$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$
given a subset we get a unique binary expansion, and given binary expansion
we get (using
) a unique subset of N.
will have 1s in the first three positions after the decimal: ({1, 2, 3}) = .111,
which is the number written .875 in decimal. The infinite repeating binary
number $.\overline{10}$ (that is, .101010...) is the base-2 representation of 2/3; it is easy to see that $.\overline{10}$ is
the image of the set of odd naturals, {1, 3, 5, . . .}.
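For finite subsets the correspondence is easy to compute exactly. In this sketch `subset_to_real` and `binary_digits` are hypothetical names of our own, and we assume (as the {1, 2, 3} example suggests) that an element k of the subset puts a 1 in the k-th position after the point, contributing $1/2^k$ to the total.

```python
from fractions import Fraction


def subset_to_real(S):
    """Add up 1/2**k for each k in the finite subset S."""
    return sum(Fraction(1, 2 ** k) for k in S)


def binary_digits(S, places):
    """Write the first `places` binary digits determined by S."""
    return "." + "".join("1" if k in S else "0" for k in range(1, places + 1))


print(binary_digits({1, 2, 3}, 3), subset_to_real({1, 2, 3}))   # .111 7/8
print(float(subset_to_real({1, 2, 3})))                         # 0.875
print(binary_digits({1, 3}, 3), subset_to_real({1, 3}))         # .101 5/8
```

Using `Fraction` keeps the arithmetic exact, so 7/8 comes out as 7/8 rather than as a rounded floating point value.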
Exercise. Find the binary representation for the real number which is the
image of the set of even numbers under .
Exercise. Find the binary representation for the real number which is the
image of the set of triangular numbers under . (Recall that the triangular
numbers are T = {1, 3, 6, 10, 15, . . .}.)
Exercises 6.6
1. The n-th triangular number, denoted $T(n)$, is given by the formula
$T(n) = (n^2 + n)/2$. If we regard this formula as a function from R to
R, it fails the horizontal line test and so it is not invertible. Find a
suitable restriction so that T is invertible.
2. The usual algebraic procedure for inverting $T(x) = (x^2 + x)/2$ fails. Use
your knowledge of the geometry of functions and their inverses to find
a formula for the inverse. (Hint: it may be instructive to first invert
the simpler formula $S(x) = x^2/2$; this will get you the right vertical
scaling factor.)
3. What is $\pi_2(W(t))$?
4. Find a right inverse for $f(x) = |x|$.
5. In three-dimensional space we have projection functions that go onto
the three coordinate axes ($\pi_1$, $\pi_2$ and $\pi_3$) and we also have projections
onto coordinate planes. For example, $\pi_{12} : \mathbb{R} \times \mathbb{R} \times \mathbb{R} \to \mathbb{R} \times \mathbb{R}$,
defined by
helix. What is the set $\pi_{12}(H)$? What are the sets $\pi_{13}(H)$ and $\pi_{23}(H)$?
6. Consider the set {1, 2, 3, . . . , 10}. Express the characteristic function
of the subset S = {1, 2, 3} as a set of ordered pairs.
7. If S and T are subsets of a set D, what is the product of their characteristic functions $1_S \cdot 1_T$?
8. Evaluate the sum
$$\sum_{i=1}^{10} \frac{1}{i} \cdot [i \text{ is prime}].$$
Chapter 7
Proof techniques III
Combinatorics
Tragedy is when I cut my finger. Comedy is when you fall into an open sewer
and die. Mel Brooks
7.1 Counting
when we should multiply, and the addition rule which tells us when we
should add.
Before we describe these principles in detail, we'll have a look at a simpler
problem which is most easily described by an example: How many integers
are there in the list (7, 8, 9, . . . 44)? We could certainly write down all the
integers from 7 to 44 (inclusive) and then count them, although this wouldn't
be the best plan if the numbers 7 and 44 were replaced with (say) 7,045,356
and 22,355,201. A method that does lead to a generalized ability to count
the elements of a finite sequence arises if we think carefully about what
exactly a finite sequence is.
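For a list of consecutive integers the count comes down to one subtraction and one addition: shifting every entry down by the first entry turns the list into 0, 1, 2, . . . , and such a list has one more entry than its last value. The sketch below (the function name `count_consecutive` is ours) applies this to the two examples just mentioned and double-checks the small one by brute force.

```python
def count_consecutive(first, last):
    """Number of integers in the list (first, first + 1, ..., last)."""
    return last - first + 1


print(count_consecutive(7, 44))                   # 38
print(len(list(range(7, 45))))                    # 38, by actually listing them
print(count_consecutive(7045356, 22355201))       # 15309846
```

The brute-force check is fine for 38 entries; for the second list the formula is the only sensible route.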
Definition. A sequence from a set S is a function from N to S.
Definition. A finite sequence from a set S is a function from {0, 1, 2, . . . , n}
to S, where n is some particular (finite) integer.
Now it is easy to see that there are n+1 elements in the set {0, 1, 2, . . . , n}
Figure 7.1: In Yahtzee, a full house may consist of a pair and a larger three-of-a-kind, or vice versa.
The multiplication rule gives us a way of counting things by thinking
about how we might construct them. The numbers that are multiplied are
the number of choices we have in the construction process. Surprisingly
often, the number of choices we can make in a given stage of constructing
some configuration is independent of the choices that have gone before; if
this is not the case the multiplication rule may not apply.
If some object can be constructed in k stages, and if in the first stage we
have $n_1$ choices as to how to proceed, in the second stage we have $n_2$ choices,
et cetera, then the total number of such objects is the product $n_1 n_2 \cdots n_k$.
A permutation of an n-set (w.l.o.g. {1, 2, . . . , n}) is an ordered n-tuple
about building such a thing in three stages. First, we must select a number to
go in the first position; there are 3 choices. Having made that choice, there
will only be two possibilities for the number in the second position. Finally
there is just one number remaining to put in the third position1. Thus there
are $3 \cdot 2 \cdot 1 = 6$ permutations of a 3 element set.
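With a set this small we can let a computer carry out the three-stage construction and list everything. This sketch relies on `itertools.permutations` from the Python standard library.

```python
from itertools import permutations

# Every ordered arrangement of the 3-set {1, 2, 3}.
perms = list(permutations([1, 2, 3]))
for p in perms:
    print(p)

print(len(perms))   # 6, matching 3 * 2 * 1
```

The listing comes out grouped by first entry (two arrangements starting with 1, two with 2, two with 3), which is the multiplication rule made visible.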
There are times when configurations that are like permutations (in that
they are ordered and have no duplicates) but don't consist of all n numbers
are useful.
Definition. A k-permutation from an n-set is an ordered selection of k distinct elements from a set of size n.
There are certain natural limitations on the value of k; for instance, k can't
be negative. Although (arguably) k can be 0, it makes more sense to think
of k being at least 1. Also, if k exceeds n we won't be able to find any
k-permutations, since it will be impossible to meet the "distinct" requirement.
If k and n are equal, there is no difference between a k-permutation and an
ordinary permutation. Therefore, we ordinarily restrict k to lie in the range
0 < k < n.
The notation P(n, k) is used for the total number of k-permutations of
a set of size n. For example, P(4, 2) is 12, since there are twelve different
ordered pairs having distinct entries where the entries come from {1, 2, 3, 4}.
Exercise. Write down all twelve 2-permutations of the 4-set {1, 2, 3, 4}.
Counting k-permutations using the multiplication rule is easy. We build
a k-permutation in k stages. In stage 1, we pick the first element in the
permutation; there are n possible choices. In stage 2, we pick the second
element; there are now only $n - 1$ choices, since we can't reuse the first
entry. We keep going like this until we've picked k entries. The number
P(n, k) is the product of k numbers beginning with n and descending down
1
People may say you have no choice in this last situation, but what they mean is
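to $n - k + 1$. To verify that $n - k + 1$ is the right place to stop, note that the
sequence $(n, n - 1, n - 2, \ldots, n - k + 1)$ can also be written as
$(n - 0, n - 1, n - 2, \ldots, n - (k - 1))$, which clearly has k entries.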
Let's have a look at another small example, P(8, 4). There will be 8
choices for the first entry in a 4-tuple, 7 choices for the second entry, 6 choices
for the third entry and 5 choices for the last entry. (Note that $5 = 8 - 4 + 1$.)
$$P(n, k) = \frac{n!}{(n - k)!}.$$
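The "descending product" description and the factorial formula can be compared directly. In the sketch below `falling_product` is our own name for the product of k numbers starting at n and stepping down by one; `math.factorial` is from the Python standard library.

```python
import math


def falling_product(n, k):
    """Multiply k numbers: n, n - 1, ..., n - k + 1."""
    result = 1
    for factor in range(n, n - k, -1):
        result *= factor
    return result


print(falling_product(4, 2))   # 12
print(falling_product(8, 4))   # 1680, that is 8 * 7 * 6 * 5

# The same numbers from n! / (n - k)!
print(math.factorial(4) // math.factorial(2))   # 12
print(math.factorial(8) // math.factorial(4))   # 1680
```

The loop never forms the large factorials, which is worth remembering when n is big and k is small.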
If we were playing a card game in which we were dealt 5 cards from a deck
of 52, we would receive our cards in the form of $P(52, 5) = 52 \cdot 51 \cdot 50 \cdot 49 \cdot 48 =
311875200$ ordered 5-tuples. Normally, we don't really care about what order
the cards came to us in. In a card game one ordinarily begins sorting the
cards so as to see what hand one has; this is a sure sign that the order the
cards were dealt is actually immaterial. How many different orders can five
cards be put in? The answer to this question is 5! = 120 since what we are
discussing is nothing more than a permutation of a set of size 5. Thus, if we
say that there are 311,875,200 different possible hands in 5-card poker, we are
over-counting things by quite a bit! Any given hand will appear 120 times in
that tabulation, which means the right value is 311875200/120 = 2598960.
Okay, so there are around 2.6 million different hands in 5-card poker. Unless
you plan to become a gambler this isn't really that useful of a piece of information, but if you generalize what we've done in the paragraph above,
you'll have found a way to count unordered collections of a given size taken
from a set.
A k-combination from an n-set is an unordered selection, without repetitions, of k things out of n. This is the exact same thing as a subset of
size k of a set of size n, and the number of such things is denoted by several
different notations: $C(n, k)$, $nCk$ and $\binom{n}{k}$ among them2. We can come
up with a formula for C(n, k) by a slightly roundabout argument. Suppose
we think of counting the k-permutations of n things using the multiplication
rule in a different way than we have previously. We'll build a k-permutation
in two stages. First we'll choose k symbols to put into our permutation,
which can be done in C(n, k) ways. And second, we'll put those k symbols into a particular order, which can be accomplished in k! ways. Thus
$P(n, k) = C(n, k) \cdot k!$. Since we already know that $P(n, k) = \frac{n!}{(n - k)!}$,
we can divide both sides by k! to obtain
$$C(n, k) = \frac{n!}{k! \, (n - k)!}.$$
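The poker computation is a nice sanity check on this formula. The sketch below redoes the arithmetic both ways; `math.comb` and `math.factorial` are standard library functions (`math.comb` needs Python 3.8 or later), and the variable names are our own.

```python
import math

ordered_deals = 52 * 51 * 50 * 49 * 48        # P(52, 5)
orders_per_hand = math.factorial(5)           # 5! ways to order one hand

print(ordered_deals)                          # 311875200
print(ordered_deals // orders_per_hand)       # 2598960
print(math.comb(52, 5))                       # 2598960, straight from C(n, k)

# The two-stage count P(n, k) = C(n, k) * k! for this example.
print(math.comb(52, 5) * orders_per_hand == ordered_deals)   # True
```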
n
k
n
k
n
k
. They
No
No
Yes
Yes
The phrase "PIN number" is redundant. The N in PIN stands for "number." Anyway,
a PIN is a four digit (secret) number used to help ensure that automated banking
(such as withdrawing your life's savings) is only done by an authorized individual.
k)!
$\frac{n!}{k!(n - k)!}$
distinct subsets. Here, we'll give an example that doesn't sound like we're
talking about counting subsets of a particular size. (Although we really are!)
How many different sequences of 6 strictly increasing numbers can we
choose from {1, 2, 3, . . . 20}?
Obviously, listing all such sequences would be an arduous task. We
might start with (1, 2, 3, 4, 5, 6) and try to proceed in some orderly fashion
to (15, 16, 17, 18, 19, 20), but unfortunately there are 38,760 such sequences
so unless we enlist the aid of a computer we are unlikely to finish this job
in a reasonable time. The number we've just given (38,760) is C(20, 6) and
so it would seem that we're claiming that this problem is really "unordered
selection without repetition of 6 things out of 20." Well, actually, some parts
of this are clearly right: we are selecting 6 things from a set of size 20, and
because our sequences are supposed to be strictly increasing there will be no
repetitions. But a strictly increasing sequence is clearly ordered, and the
formula we are using is for unordered collections.
By specifying a particular ordering (strictly increasing) on the sequences
we are counting above, we are actually removing the importance of order.
Put another way: if order really mattered, the symbols 1 through 6 could
be put into 720 different orders, but we only want to count one of those
possibilities. Put yet another way: there is a one-to-one correspondence
between the 6-subsets of {1, 2, 3, . . . 20} and the strictly increasing sequences. Just
make sure the subset is written in increasing order!
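Since we do have the aid of a computer, we can test this correspondence. The sketch below first works with a small case (3 symbols chosen from {1, ..., 5}) where the strictly increasing sequences can be filtered out of all the ordered selections, and then counts the 6-subsets of {1, ..., 20} directly. It uses `itertools` and `math.comb` from the Python standard library; the variable names are ours.

```python
from itertools import combinations, permutations
import math

symbols = range(1, 6)

# Keep only the ordered selections that happen to be strictly increasing.
increasing = [s for s in permutations(symbols, 3) if s[0] < s[1] < s[2]]

# combinations() lists each 3-subset once, written in increasing order.
subsets = list(combinations(symbols, 3))

print(sorted(increasing) == sorted(subsets))   # True
print(len(subsets), math.comb(5, 3))           # 10 10

# The full-size problem: strictly increasing 6-sequences from {1, ..., 20}.
count_20_6 = sum(1 for _ in combinations(range(1, 21), 6))
print(count_20_6, math.comb(20, 6))            # 38760 38760
```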
Okay, at this point we have filled-in three out of the four cells in our table.
                                Does order matter?
                                Yes                             No
Are repetitions allowed?   No   $P(n, k) = \frac{n!}{(n-k)!}$   $C(n, k) = \frac{n!}{k!(n-k)!}$
                           Yes  $n^k$
What kinds of things are we counting in the lower right part of the table?
Unordered selections of k things out of n possibilities where there may (or
may not!) be repetitions. The game Yahtzee provides a nice example of
this type of configuration. When we roll 5 dice, we do not do so one-at-a-time;
rather, we roll them as a group. The dice are indistinguishable, so
there is no way to order our set of 5 outcomes. In fact, it would be quite
reasonable to, after one's roll, arrange the dice in (say) increasing order. We'll
repeat a bit of advice that was given previously: if one is free to rearrange a
configuration to suit one's needs, that is a clue that order is not important
in the configurations under consideration. Finally, are repetitions allowed?
The outcomes in Yahtzee are 5 numbers from the set {1, 2, 3, 4, 5, 6}, and
in general, the same number can appear on two, or several, or even all 5 of
the dice4.
So, how many different outcomes are there when one rolls five dice? To
answer this question it will be helpful to think about how we might express
such an outcome. Since order is unimportant, we can choose to put the
numbers that appear on the individual dice in whatever order we like. We
may as well place them in increasing order. There will be 5 numbers and
each number is between 1 and 6. We can list the outcomes systematically by
starting with an all-ones Yahtzee:
(1,1,1,1,1)
(1,1,1,1,2)
(1,1,1,1,3)
(1,1,1,1,4)
(1,1,1,1,5)
(1,1,1,1,6)
(1,1,1,2,2)
(1,1,1,2,3)
(1,1,1,2,4)
(1,1,1,2,5)
(1,1,1,2,6)
(1,1,1,3,3)
(1,1,1,3,4)
(1,1,1,3,5)
(1,1,1,3,6)
(1,1,1,4,4)
(1,1,1,4,5)
(1,1,1,4,6)
(1,1,1,5,5)
(1,1,1,5,6)
(1,1,1,6,6)
(1,1,2,2,2)
(1,1,2,2,3)
(1,1,2,2,4)
(1,1,2,2,5)
(1,1,2,2,6)
(1,1,2,3,3)
(1,1,2,3,4)
(1,1,2,3,5)
(1,1,2,3,6)
(1,1,2,4,4)
(1,1,2,4,5)
(1,1,2,4,6)
(1,1,2,5,5)
(1,1,2,5,6)
(1,1,2,6,6)
When this happens you are supposed to jump in the air and yell "Yahtzee!"
(1,1,3,3,3)
(1,1,3,3,4)
(1,1,3,3,5)
(1,1,3,3,6)
(1,1,3,4,4)
(1,1,3,4,5)
(1,1,3,4,6)
(1,1,3,5,5)
(1,1,3,5,6)
(1,1,3,6,6)
(1,1,4,4,4)
(1,1,4,4,5)
(1,1,4,4,6)
(1,1,4,5,5)
(1,1,4,5,6)
(1,1,4,6,6)
(1,1,5,5,5)
(1,1,5,5,6)
(1,1,5,6,6)
(1,1,6,6,6)
(1,2,2,2,2)
(1,2,2,2,3)
(1,2,2,2,4)
(1,2,2,2,5)
(1,2,2,2,6)
(1,2,2,3,3)
(1,2,2,3,4)
(1,2,2,3,5)
(1,2,2,3,6)
(1,2,2,4,4)
(1,2,2,4,5)
(1,2,2,4,6)
(1,2,2,5,5)
(1,2,2,5,6)
(1,2,2,6,6)
(1,2,3,3,3)
(1,2,3,3,4)
(1,2,3,3,5)
(1,2,3,3,6)
(1,2,3,4,4)
(1,2,3,4,5)
(1,2,3,4,6)
(1,2,3,5,5)
(1,2,3,5,6)
(1,2,3,6,6)
(1,2,4,4,4)
(1,2,4,4,5)
(1,2,4,4,6)
(1,2,4,5,5)
(1,2,4,5,6)
(1,2,4,6,6)
(1,2,5,5,5)
(1,2,5,5,6)
(1,2,5,6,6)
(1,2,6,6,6)
(1,3,3,3,3)
(1,3,3,3,4)
(1,3,3,3,5)
(1,3,3,3,6)
(1,3,3,4,4)
(1,3,3,4,5)
(1,3,3,4,6)
(1,3,3,5,5)
(1,3,3,5,6)
(1,3,3,6,6)
(1,3,4,4,4)
(1,3,4,4,5)
(1,3,4,4,6)
(1,3,4,5,5)
(1,3,4,5,6)
(1,3,4,6,6)
(1,3,5,5,5)
(1,3,5,5,6)
(1,3,5,6,6)
(1,3,6,6,6)
(1,4,4,4,4)
(1,4,4,4,5)
(1,4,4,4,6)
(1,4,4,5,5)
(1,4,4,5,6)
(1,4,4,6,6)
(1,4,5,5,5)
(1,4,5,5,6)
(1,4,5,6,6)
(1,4,6,6,6)
(1,5,5,5,5)
(1,5,5,5,6)
(1,5,5,6,6)
(1,5,6,6,6)
(1,6,6,6,6)
(2,2,2,2,2)
(2,2,2,2,3)
(2,2,2,2,4)
(2,2,2,2,5)
(2,2,2,2,6)
(2,2,2,3,3)
(2,2,2,3,4)
(2,2,2,3,5)
(2,2,2,3,6)
(2,2,2,4,4)
(2,2,2,4,5)
(2,2,2,4,6)
(2,2,2,5,5)
(2,2,2,5,6)
(2,2,2,6,6)
(2,2,3,3,3)
(2,2,3,3,4)
(2,2,3,3,5)
(2,2,3,3,6)
(2,2,3,4,4)
(2,2,3,4,5)
(2,2,3,4,6)
(2,2,3,5,5)
(2,2,3,5,6)
(2,2,3,6,6)
(2,2,4,4,4)
(2,2,4,4,5)
(2,2,4,4,6)
(2,2,4,5,5)
(2,2,4,5,6)
(2,2,4,6,6)
(2,2,5,5,5)
(2,2,5,5,6)
(2,2,5,6,6)
(2,2,6,6,6)
(2,3,3,3,3)
(2,3,3,3,4)
(2,3,3,3,5)
(2,3,3,3,6)
(2,3,3,4,4)
(2,3,3,4,5)
(2,3,3,4,6)
(2,3,3,5,5)
(2,3,3,5,6)
(2,3,3,6,6)
(2,3,4,4,4)
(2,3,4,4,5)
(2,3,4,4,6)
(2,3,4,5,5)
(2,3,4,5,6)
(2,3,4,6,6)
(2,3,5,5,5)
(2,3,5,5,6)
(2,3,5,6,6)
(2,3,6,6,6)
(2,4,4,4,4)
(2,4,4,4,5)
(2,4,4,4,6)
(2,4,4,5,5)
(2,4,4,5,6)
(2,4,4,6,6)
(2,4,5,5,5)
(2,4,5,5,6)
(2,4,5,6,6)
(2,4,6,6,6)
(2,5,5,5,5)
(2,5,5,5,6)
(2,5,5,6,6)
(2,5,6,6,6)
(2,6,6,6,6)
(3,3,3,3,3)
(3,3,3,3,4)
(3,3,3,3,5)
(3,3,3,3,6)
(3,3,3,4,4)
(3,3,3,4,5)
(3,3,3,4,6)
(3,3,3,5,5)
(3,3,3,5,6)
(3,3,3,6,6)
(3,3,4,4,4)
(3,3,4,4,5)
(3,3,4,4,6)
(3,3,4,5,5)
(3,3,4,5,6)
(3,3,4,6,6)
(3,3,5,5,5)
(3,3,5,5,6)
(3,3,5,6,6)
(3,3,6,6,6)
(3,4,4,4,4)
(3,4,4,4,5)
(3,4,4,4,6)
(3,4,4,5,5)
(3,4,4,5,6)
(3,4,4,6,6)
(3,4,5,5,5)
(3,4,5,5,6)
(3,4,5,6,6)
(3,4,6,6,6)
(3,5,5,5,5)
(3,5,5,5,6)
(3,5,5,6,6)
(3,5,6,6,6)
(3,6,6,6,6)
(4,4,4,4,4)
(4,4,4,4,5)
(4,4,4,4,6)
(4,4,4,5,5)
(4,4,4,5,6)
(4,4,4,6,6)
(4,4,5,5,5)
(4,4,5,5,6)
(4,4,5,6,6)
(4,4,6,6,6)
(4,5,5,5,5)
(4,5,5,5,6)
(4,5,5,6,6)
(4,5,6,6,6)
(4,6,6,6,6)
(5,5,5,5,5)
(5,5,5,5,6)
(5,5,5,6,6)
(5,5,6,6,6)
(5,6,6,6,6)
(6,6,6,6,6)
,,,
,,.
, ,, , , and
,,,,,
rangements?
, , , , ,
,,,,
,,,,,
{3, 3, 3, 3, 4}
{5, 5, 6, 6, 6}
It may seem at first that this blank-comma thing is okay, but that we're
still no closer to answering the question we started with. It may seem that
way until you realize how easy it is to count these blank-comma arrangements! You see, there are 10 symbols in one of these blank-comma arrangements and if we choose positions for (say) the commas, the blanks will have to
go into the other positions; thus every 5-subset of {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
1, k) or C(k + n
1, n
1 commas.
It turns out that these binomial coefficients are equal, so there's no problem
with the apparent ambiguity.
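The long listing of Yahtzee outcomes can be regenerated, and counted, in a few lines. This sketch uses `itertools.combinations_with_replacement` from the Python standard library, which produces exactly the non-decreasing 5-tuples with entries from 1 to 6; the comparison with C(10, 5) reflects the 10 symbols in a blank-comma arrangement.

```python
from itertools import combinations_with_replacement
import math

# Unordered rolls of 5 dice, each written in non-decreasing order.
outcomes = list(combinations_with_replacement(range(1, 7), 5))

print(outcomes[0])      # (1, 1, 1, 1, 1)
print(outcomes[-1])     # (6, 6, 6, 6, 6)
print(len(outcomes))    # 252

# Choosing which 5 of the 10 positions hold one kind of symbol.
print(math.comb(10, 5))                      # 252
print(math.comb(6 + 5 - 1, 5) == len(outcomes))   # True
```

Compare this with the $6^5 = 7776$ ordered rolls one would get by treating the dice as distinguishable.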
So, finally, our table of counting formulas is complete. We'll produce it
here one more time and, while we're at it, ditch the C(n, k) notation in favor
of the more usual "binomial coefficient" notation $\binom{n}{k}$.
                                Does order matter?
                                Yes                             No
Are repetitions allowed?   No   $P(n, k) = \frac{n!}{(n-k)!}$   $\binom{n}{k} = \frac{n!}{k!(n-k)!}$
                           Yes  $n^k$                           $\binom{n+k-1}{k}$
Exercises 7.1
1. Determine the number of entries in the following sequences.
(a) (999, 1000, 1001, . . . 2006)
(b) (13, 15, 17, . . . 199)
(c) (13, 19, 25, . . . 601)
(d) (5, 10, 17, 26, 37, . . . 122)
(e) (27, 64, 125, 216, . . . 8000)
(f) (7, 11, 19, 35, 67, . . . 131075)
2. How many full houses are there in Yahtzee? (A full house is a pair
together with a three-of-a-kind.)
3. In how many ways can you get two pairs in Yahtzee?
4. Prove that the binomial coefficients $\binom{n+k-1}{k}$ and $\binom{n+k-1}{n-1}$ are
equal.
5. The "Cryptographer's alphabet" is used to supply small examples in
coding and cryptography. It consists of the first 6 letters, {a, b, c, d, e, f}.
How many "words" of length up to 6 can be made with this alphabet?
(A word need not actually be a word in English; for example, both "fed"
and "dfe" would be words in the sense we are using the term.)
6. How many "words" are there of length 4, with distinct letters from the
Cryptographer's alphabet, in which the letters appear in increasing
order alphabetically? ("Acef" would be one such word, but "cafe"
would not.)
7.2
This section is concerned with two very powerful elements of the proof-making
arsenal: "Parity" is a way of referring to the result of an even/odd
calculation; "Counting arguments" most often take the form of counting some
collection in two different ways and then comparing those results. These
techniques have little to do with one another, but when they are applicable
they tend to produce really elegant little arguments.
In (very) early computers and business machines, paper cards were used
to store information. A so-called "punch card" or "Hollerith card" was used
to store binary information by means of holes punched into it. Paper tape
was also used in a similar fashion. A typical paper tape format would involve
8 positions in rows across the tape that might or might not be punched;
often a column of smaller holes would appear as well, which did not store
information but were used to drive the tape through the reading mechanism
on a sprocket. Tapes and cards could be "read" either by small sets of
electrical contacts which would touch through a punched hole or be kept
separate if the position wasn't punched, or by using a photo-detector to sense
whether light could pass through the hole or not. The mechanisms for reading
and writing on these paper media were amazingly accurate, and allowed early
data processing machines to use just a couple of large file cabinets to store
what now fits in a jump drive one can wear on a necklace. (About 10 or 12
cabinets could hold a gigabyte of data.)
Paper media was ideally suited to storing binary information, but of
course most of the real data people needed to store and process would be
alphanumeric5. There were several encoding schemes that served to translate
between the character sets that people commonly used and the binary
numerals that could be stored on paper.
5
both alphabetic characters and numeric characters as well as punctuation marks, etc.
One of these schemes still survives
today: ASCII. The American Standard Code for Information Interchange
uses 7-bit binary numerals to represent characters, so it contains 128 different
symbols. This is more than enough to represent both upper- and lower-case
letters, the 10 numerals, and the punctuation marks; many of the remaining
spots in the ASCII code were used to contain so-called "control characters"
that were associated with functionality that appeared on old-fashioned teletype
equipment: things like "ring the bell," "move the carriage backwards
one space," "move the carriage to the next line," etc. These control characters
are why modern keyboards still have a modifier key labeled "Ctrl" on
them. The following listing gives the decimal and binary numerals from 0 to
127 and the ASCII characters associated with them; the non-printing and
control characters have a 2 or 3 letter mnemonic designation.
Dec  Binary     Char        Dec  Binary     Char
0    0000 0000  NUL         64   0100 0000  @
1    0000 0001  SOH         65   0100 0001  A
2    0000 0010  STX         66   0100 0010  B
3    0000 0011  ETX         67   0100 0011  C
4    0000 0100  EOT         68   0100 0100  D
5    0000 0101  ENQ         69   0100 0101  E
6    0000 0110  ACK         70   0100 0110  F
7    0000 0111  BEL         71   0100 0111  G
8    0000 1000  BS          72   0100 1000  H
9    0000 1001  TAB         73   0100 1001  I
10   0000 1010  LF          74   0100 1010  J
11   0000 1011  VT          75   0100 1011  K
12   0000 1100  FF          76   0100 1100  L
13   0000 1101  CR          77   0100 1101  M
14   0000 1110  SO          78   0100 1110  N
15   0000 1111  SI          79   0100 1111  O
16   0001 0000  DLE         80   0101 0000  P
17   0001 0001  DC1         81   0101 0001  Q
18   0001 0010  DC2         82   0101 0010  R
19   0001 0011  DC3         83   0101 0011  S
20   0001 0100  DC4         84   0101 0100  T
21   0001 0101  NAK         85   0101 0101  U
22   0001 0110  SYN         86   0101 0110  V
23   0001 0111  ETB         87   0101 0111  W
24   0001 1000  CAN         88   0101 1000  X
25   0001 1001  EM          89   0101 1001  Y
26   0001 1010  SUB         90   0101 1010  Z
27   0001 1011  ESC         91   0101 1011  [
28   0001 1100  FS          92   0101 1100  \
29   0001 1101  GS          93   0101 1101  ]
30   0001 1110  RS          94   0101 1110  ^
31   0001 1111  US          95   0101 1111  _
32   0010 0000  (space)     96   0110 0000  `
33   0010 0001  !           97   0110 0001  a
34   0010 0010  "           98   0110 0010  b
35   0010 0011  #           99   0110 0011  c
36   0010 0100  $           100  0110 0100  d
37   0010 0101  %           101  0110 0101  e
38   0010 0110  &           102  0110 0110  f
39   0010 0111  '           103  0110 0111  g
40   0010 1000  (           104  0110 1000  h
41   0010 1001  )           105  0110 1001  i
42   0010 1010  *           106  0110 1010  j
43   0010 1011  +           107  0110 1011  k
44   0010 1100  ,           108  0110 1100  l
45   0010 1101  -           109  0110 1101  m
46   0010 1110  .           110  0110 1110  n
47   0010 1111  /           111  0110 1111  o
48   0011 0000  0           112  0111 0000  p
49   0011 0001  1           113  0111 0001  q
50   0011 0010  2           114  0111 0010  r
51   0011 0011  3           115  0111 0011  s
52   0011 0100  4           116  0111 0100  t
53   0011 0101  5           117  0111 0101  u
54   0011 0110  6           118  0111 0110  v
55   0011 0111  7           119  0111 0111  w
56   0011 1000  8           120  0111 1000  x
57   0011 1001  9           121  0111 1001  y
58   0011 1010  :           122  0111 1010  z
59   0011 1011  ;           123  0111 1011  {
60   0011 1100  <           124  0111 1100  |
61   0011 1101  =           125  0111 1101  }
62   0011 1110  >           126  0111 1110  ~
63   0011 1111  ?           127  0111 1111  DEL
Now it only takes 7 bits to encode the 128 possible values in the ASCII
system, which can easily be verified by noticing that the left-most bits in all
of the binary representations above are 0. Most computers use 8-bit words
or "bytes" as their basic units of information, and the fact that the ASCII
code only requires 7 bits led someone to think up a use for that additional
bit. It became a "parity check bit." If the seven bits of an ASCII encoding
have an odd number of 1's, the parity check bit is set to 1; otherwise, it is
set to 0. The result of this is that, subsequently, all of the 8-bit words that
encode ASCII data will have an even number of 1's. This is an example of a
so-called error detecting code known as the "even code" or the "parity check
code." If data is sent over a noisy telecommunications channel, or is stored
in fallible computer memory, there is some small but calculable probability
that there will be a "bit error." For instance, one computer might send
10000111 (which is the ASCII code that says "ring the bell") but another
machine across the network might receive 10100111 (the 3rd bit from the left
has been received in error). Now if we are only looking at the rightmost seven
bits we will think that the ASCII code for a single quote has been received,
but if we note that this piece of received data has an odd number of ones
we'll realize that something is amiss. There are other more advanced coding
schemes that will let us not only detect an error, but (within limits) correct
it as well! This rather amazing feat is what makes wireless telephony (not
to mention communications with deep space probes; whoops! I mentioned
it) work.
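The parity check described above can be tried out in a few lines of Python. This is only a sketch: the function names add_even_parity and looks_valid are made up for this illustration, and the parity bit is assumed to sit in the left-most position, as in the bell example.

```python
def add_even_parity(code7: int) -> int:
    """Return the 8-bit word for a 7-bit ASCII code, parity bit on the left."""
    if not 0 <= code7 < 128:
        raise ValueError("expected a 7-bit value")
    ones = bin(code7).count("1")
    parity_bit = ones % 2          # 1 exactly when the 7 data bits hold an odd number of 1's
    return (parity_bit << 7) | code7


def looks_valid(word8: int) -> bool:
    """A received word passes the check when its total number of 1's is even."""
    return bin(word8).count("1") % 2 == 0


sent = add_even_parity(7)            # 7 is BEL, "ring the bell"
received = sent ^ 0b00100000         # flip the 3rd bit from the left
print(format(sent, "08b"), looks_valid(sent))
print(format(received, "08b"), looks_valid(received))
```

The corrupted word fails the check even though its rightmost seven bits form a perfectly good ASCII code (the single quote).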
The concept of parity can be used in many settings to prove some fairly
remarkable results.
In Section 6.3 we introduced the idea of a graph. This notion was first
used by Leonhard Euler to solve a recreational math problem posed by the
citizens of Königsberg, Prussia (this is the city now known as Kaliningrad,
Russia). Königsberg was situated at a place where two branches of the
Pregel river come together; there is also a large island situated near this
confluence. By Euler's time, the city of Königsberg covered this island as well
as the north and south banks of the river and also the promontory where the
branches came together. A network of seven bridges had been constructed
to connect all these land masses. The townsfolk are alleged to have become
enthralled by the question of whether it was possible to leave one's home
and take a walk through town which crossed each of the bridges exactly once
and, finally, return to one's home.
it is usually easy to do. Of course, when it's impossible you struggle a bit...
To help get things rolling (just in case you haven't really done the exercise)
I'll give a hint: for the first list it is possible to draw a graph, for the second
it is not. Can you distinguish the pattern? What makes one list of vertex
degrees reasonable and another not?
Exercise. (If you didn't do the last exercise, stop being such a lame-o and
try it now. BTW, if you did do it, good for you! You can either join with
me now in sneering at all those people who are scurrying back to do the last
one, or try the following:)
Figure out a way to distinguish a sequence of numbers that can be the
degree sequence of some graph from the sequences that cannot be.
Okay, now if you're reading this sentence you should know that every
other list of vertex degrees above is impossible, you should have graphs drawn
in the margin here for the 1st, 3rd and 5th degree sequences, and you may
have discovered some version of the following
Theorem 7.2.1. In an undirected graph, the number of vertices having an
odd degree is even.
A slightly pithier statement is: All graphs have an even number of odd
nodes.
We'll leave the proof of this theorem to the exercises but most of the work
is done in proving the following equivalent result.
Theorem 7.2.2. In an undirected graph the sum of the degrees of the vertices
is even.
Proof: The sum of the degrees of all the vertices in a graph G,
\[ \sum_{v \in V(G)} \deg(v), \]
counts every edge exactly twice, once for each of its two endpoints. Thus
\[ \sum_{v \in V(G)} \deg(v) = 2\,|E(G)|, \]
which is even.
Q.E.D.
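The double count in this proof is easy to watch happen on a small example. The sketch below uses a made-up graph on the vertices a, b, c, d (it is not one of the graphs from the text) and assumes a graph is stored simply as a list of edges.

```python
from collections import Counter

def degrees(edges):
    """Map each vertex to its degree, given a list of undirected edges (u, v)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1   # a loop (u == v) adds 2 to the same vertex
    return deg

# A hypothetical example graph on the vertices a, b, c, d.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
deg = degrees(edges)
degree_sum = sum(deg.values())
odd_vertices = [v for v, d in deg.items() if d % 2 == 1]
print(degree_sum, 2 * len(edges), sorted(odd_vertices))
```

The degree sum comes out as twice the number of edges, and the vertices of odd degree come in an even number, as both theorems predict.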
The question of whether a graph having a given list of vertex degrees can
exist comes down to an elegant little argument using both of the techniques
in the title of this section. We count the edge set of the graph in two ways,
once in the usual fashion and once by summing the vertex degrees; we also
note that since this latter count is actually a double count we can bring in
the concept of parity.
Another perfectly lovely argument involving parity arises in questions
concerning whether or not it is possible to tile a pruned chessboard with
dominoes. We've seen dominoes before in Section 5.1 and we're just hoping
you've run across chessboards before. Usually a chessboard is $8 \times 8$, but we
dominoes?
First let's specify the question a bit more. By "perfectly tiling" a chessboard
we mean that every domino lies (fully) on the board, covering precisely
two squares, and that every square of the board is covered by a domino.
The answer is straightforward. If at least one of m or n is even it can
be done. A necessary condition is that the number of squares be even (since
every domino covers two squares) and so, if both m and n are odd we will
be out of luck.
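One way to convince yourself that an even side is enough is to actually lay the dominoes down. The sketch below is one possible construction, not necessarily the one the author has in mind: it runs the dominoes along whichever direction has even length. The name tile_board and the (row, column) labelling of squares are choices made for this illustration.

```python
def tile_board(m, n):
    """Return a list of dominoes (pairs of cells) perfectly tiling an m-by-n board.

    Cells are (row, column) pairs. Returns None when both m and n are odd.
    """
    if m % 2 == 1 and n % 2 == 1:
        return None                      # odd number of squares: no tiling
    dominoes = []
    if n % 2 == 0:                       # lay dominoes horizontally along each row
        for r in range(m):
            for c in range(0, n, 2):
                dominoes.append(((r, c), (r, c + 1)))
    else:                                # m is even: lay them vertically down each column
        for c in range(n):
            for r in range(0, m, 2):
                dominoes.append(((r, c), (r + 1, c)))
    return dominoes

tiling = tile_board(5, 8)
covered = sorted(cell for domino in tiling for cell in domino)
print(len(tiling), covered == [(r, c) for r in range(5) for c in range(8)])
```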
A "pruned board" is obtained by either literally removing some of the
squares or perhaps by marking them as being off limits in some way. When
we ask questions about perfect tilings of pruned chessboards things get more
interesting and the notion of parity can be used in several ways.
Here are two tiling problems regarding square chessboards:
1. An even-sided square board (e.g. an ordinary $8 \times 8$ board) with diagonally opposite corners pruned.
Exercise. Below are two five-by-five chessboards each having a single square
pruned. One can be tiled by dominoes and the other cannot. Which is which?
Nothing about chess requires this structure either, but it does let us do some error
checking. For instance, bishops always end up on the same color they left from and knights
always switch colors as they move.
The definition of a magic square requires that the rows and columns sum
to the same number but says nothing about what that number must be.
It is conceivable that we could produce magic squares (of the same order)
having different magic sums. This is conceivable, but in fact the magic sum
is determined completely by n.
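Theorem 7.2.3. A magic square of order n has a magic sum equal to
\[ \frac{n^3 + n}{2}. \]
Proof: Let M be the magic sum and let S be the sum of all the entries in
the square. The entries are the numbers from 1 to $n^2$, so
\[ S = 1 + 2 + 3 + \dots + n^2. \]
Using the formula for the sum of the first k naturals
($\sum_{i=1}^{k} i = \frac{k^2 + k}{2}$) with $k = n^2$, we get
\[ S = \frac{n^4 + n^2}{2}. \]
On the other hand, the square has n rows, each of which sums to M, so
\[ S = nM. \]
By equating these different expressions for S and solving for M,
we prove the desired result:
\[ nM = \frac{n^4 + n^2}{2}, \]
therefore
\[ M = \frac{n^3 + n}{2}. \]
Q.E.D.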
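As a sanity check on the theorem, here is a short Python sketch that compares the predicted magic sum with the row and column sums of a well-known order 3 magic square (the so-called Lo Shu square, which is brought in here only as an example; it does not appear in the text above).

```python
def magic_sum(n):
    """Magic sum predicted by the theorem for an order-n magic square."""
    return (n**3 + n) // 2

lo_shu = [[2, 7, 6],
          [9, 5, 1],
          [4, 3, 8]]
n = len(lo_shu)
row_sums = [sum(row) for row in lo_shu]
col_sums = [sum(row[j] for row in lo_shu) for j in range(n)]
print(row_sums, col_sums, magic_sum(n))
```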
Exercises 7.2
1. A walking tour of Königsberg such as is described in this section, or
more generally, a circuit through an arbitrary graph that crosses each
edge precisely once and begins and ends at the same node is known
as an Eulerian circuit. An Eulerian path also crosses every edge of a
graph exactly once but it begins and ends at distinct nodes. For each
of the following graphs determine whether an Eulerian circuit or path
is possible, and if so, draw it.
2. Complete the proof of the fact that "Every graph has an even number
of odd nodes."
3. Provide an argument as to why an $8 \times 8$ chessboard with two squares
5. The five tetrominoes (familiar to players of the video game Tetris) are
relatives of dominoes made up of four small squares.
All together these five tetrominoes contain 20 squares so it is conceivable
that they could be used to tile a $4 \times 5$ chessboard. Prove that this
is actually impossible.
7.3
The word "pigeonhole" can refer to a hole in which a pigeon roosts (i.e.
pretty much what it sounds like) or a series of roughly square recesses in a
desk in which one could sort correspondence (see Figure 7.4).
Whether you prefer to think of roosting birds or letters being sorted, the
first and easiest version of the pigeonhole principle is that if you have more
"things" than you have "containers" there must be a container holding at
least two things.
If we have 6 pigeons who are trying to roost in a coop with 5 pigeonholes,
two birds will have to share.
If we have 7 letters to sort and there are 6 pigeonholes in our desk, we
will have to put two letters in the same compartment.
The "things" and the "containers" don't necessarily have to be interpreted
in the strict sense that the "things" go into the "containers." For instance,
a nice application of the pigeonhole principle is that if there are at least 13
people present in a room, some pair of people will have been born in the
same month. In this example the things are the people and the containers
are the months of the year.
The abstract way to phrase the pigeonhole principle is:
Theorem 7.3.1. If f is a function such that |Dom(f)| > |Rng(f)| then f is
not injective.
bijection between Dom(f) and Rng(f). Therefore (since f provides a
one-to-one correspondence) |Dom(f)| = |Rng(f)|. This
clearly contradicts the statement that |Dom(f)| > |Rng(f)|.
Q.E.D.
forced to have chosen two numbers such that one is divisible by the other.
We can come up with stronger forms of the pigeonhole principle by considering
pigeonholes with capacities. Suppose we have 6 pigeonholes in a
desk, each of which can hold 10 letters. What number of letters will guarantee
that one of the pigeonholes is full? The largest number of letters we
could have without having 10 in some pigeonhole is $9 \cdot 6 = 54$, so if there are
55 letters we must have 10 letters in some pigeonhole.
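The arithmetic in the capacity example generalizes directly. The following Python sketch (the function names are invented for this illustration) computes the threshold for any number of holes and any capacity, and checks it against the most even way of spreading the letters out.

```python
def letters_needed(holes, capacity):
    """Smallest number of letters that forces some pigeonhole to hold `capacity` letters."""
    return (capacity - 1) * holes + 1

def fullest_after_even_spread(letters, holes):
    """Largest count in any hole when letters are spread as evenly as possible."""
    return -(-letters // holes)          # ceiling of letters / holes

print(letters_needed(6, 10))
print(fullest_after_even_spread(54, 6), fullest_after_even_spread(55, 6))
```

Spreading the letters evenly is the best one can do to avoid filling a hole, so if even that arrangement reaches the capacity, every arrangement does.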
Exercises 7.3
1. The statement that there are two non-bald New Yorkers with the same
number of hairs on their heads requires some careful estimates to justify
it. Please justify it.
2. A mathematician, who always rises earlier than her spouse, has developed
a scheme using the pigeonhole principle to ensure that
she always has a matching pair of socks. She keeps only blue socks,
green socks and black socks in her sock drawer, 10 of each. So as
not to wake her husband she must select some number of socks from
her drawer in the early morning dark and take them with her to the
adjacent bathroom where she dresses. What number of socks does she
choose?
3. If we select 1001 numbers from the set {1, 2, 3, ..., 2000} it is certain
that there will be two numbers selected such that one divides the other.
We can prove this fact by noting that every number in the given set
can be expressed in the form $2^k \cdot m$ where m is an odd number and
using the pigeonhole principle. Write up this proof.
4. Given any set of 53 integers, show that there are two of them having
the property that either their sum or their difference is evenly divisible
by 103.
5. Prove that if 10 points are placed inside a square of side length 3, there
will be 2 points within $\sqrt{2}$ of one another.
6. Prove that if 10 points are placed inside an equilateral triangle of side
length 3, there will be 2 points within 1 of one another.
7.4
\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \]
will have already seen the arrangement of these binomial coefficients into a
triangular array known as Pascal's triangle, but if not...
                  1
                1   1
              1   2   1
            1   3   3   1
          1   4   6   4   1
        1   5  10  10   5   1
      1   6  15  20  15   6   1
et cetera.
The thing that makes this triangle so nice and that leads to the strange
name "binomial coefficients" for the number of k-combinations of an n-set is
that you can use the triangle to (very quickly) compute powers of binomials.
A binomial is a polynomial with two terms. Things like $(x + y)$, $(x + 1)$
and $(x^7 + x^3)$ all count as binomials but to keep things simple just think
about $(x + y)$. If you need to compute a large power of $(x + y)$ you can just
multiply it out; for example, think of finding the 6th power of $(x + y)$.
We can use the F.O.I.L. rule to find $(x + y)^2 = x^2 + 2xy + y^2$. Then
\[ (x + y)^3 = (x + y) \cdot (x + y)^2 = (x + y) \cdot (x^2 + 2xy + y^2). \]
You can compute that last product either by using the distributive law
\[ x^3 + 3x^2 y + 3xy^2 + y^3 \]
In the end you should obtain
\[ x^6 + 6x^5 y + 15x^4 y^2 + 20x^3 y^3 + 15x^2 y^4 + 6xy^5 + y^6. \]
Now all of this is a lot of work and it's really much easier to notice the
form of the answer: The exponent on x starts at 6 and descends with each
successive term down to 0. The exponent on y starts at 0 and ascends to 6.
The coefficients in the answer are the numbers in the sixth row of Pascal's
triangle.
Finally, the form of Pascal's triangle makes it really easy to extend. A
number in the interior of the triangle is always the sum of the two above
it (on either side). Numbers that aren't in the interior of the triangle are
always 1.
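The extension rule just described is short enough to turn into code. The following Python sketch (with invented names next_row and pascal_rows) builds each row from the one before it and then reads off the terms of the 6th power of (x + y) as triples (coefficient, exponent on x, exponent on y).

```python
def next_row(row):
    """Build the next row of Pascal's triangle from the current one."""
    interior = [a + b for a, b in zip(row, row[1:])]   # each interior entry is the sum of the two above
    return [1] + interior + [1]

def pascal_rows(last):
    """Return rows 0 through `last` of Pascal's triangle."""
    rows = [[1]]
    for _ in range(last):
        rows.append(next_row(rows[-1]))
    return rows

rows = pascal_rows(8)
# Terms of (x + y)^6 as (coefficient, exponent on x, exponent on y).
terms = [(c, 6 - k, k) for k, c in enumerate(rows[6])]
print(rows[6])
print(terms)
```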
We showed rows 0 through 6 above. Rows 7 and 8 are
    1   7  21  35  35  21   7   1
  1   8  28  56  70  56  28   8   1.
\[ \binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}. \]
will partition these k-subsets into two disjoint cases: those that
contain the final number, n, and those that do not.
Let
\[ A = \{ S \subseteq N : |S| = k \wedge n \notin S \} \]
and, let
\[ B = \{ S \subseteq N : |S| = k \wedge n \in S \}. \]
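The sets in A are precisely the k-subsets of $\{1, 2, 3, \dots, n-1\}$, so
$|A| = \binom{n-1}{k}$. Each set in B consists of n together with a
$(k-1)$-subset of the $(n-1)$-set $\{1, 2, 3, \dots, n-1\}$, so
$|B| = \binom{n-1}{k-1}$. Since A and B are disjoint and together account
for every k-subset of N, the addition rule gives
\[ \binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}. \]
The same identity can be checked algebraically. By definition,
\[ \binom{n-1}{k} + \binom{n-1}{k-1}
   = \frac{(n-1)!}{k!\,(n-1-k)!} + \frac{(n-1)!}{(k-1)!\,((n-1)-(k-1))!}
   = \frac{(n-1)!}{k!\,(n-k-1)!} + \frac{(n-1)!}{(k-1)!\,(n-k)!}. \]
To add these fractions we need the common denominator $k!\,(n-k)!$. (We
will have to multiply the top and bottom of the first fraction by
$(n-k)$ and the top and bottom of the second by $k$.) This gives
\begin{align*}
\frac{(n-k)(n-1)!}{k!\,(n-k)(n-k-1)!} + \frac{k(n-1)!}{k(k-1)!\,(n-k)!}
  &= \frac{(n-k)(n-1)!}{k!\,(n-k)!} + \frac{k(n-1)!}{k!\,(n-k)!} \\
  &= \frac{(n-k+k)(n-1)!}{k!\,(n-k)!} \\
  &= \frac{(n)(n-1)!}{k!\,(n-k)!} \\
  &= \frac{n!}{k!\,(n-k)!}.
\end{align*}
The last expression is $\binom{n}{k}$, so we have shown that
\[ \binom{n-1}{k} + \binom{n-1}{k-1} = \binom{n}{k}. \]
Q.E.D.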
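The partition into the sets A and B can be carried out explicitly for small values. The sketch below is just one numerical check, with n = 6 and k = 3 chosen arbitrarily and N taken to be {1, 2, ..., n}; it uses Python's math.comb (available in Python 3.8 and later) for the binomial coefficients.

```python
from math import comb
from itertools import combinations

n, k = 6, 3
N = range(1, n + 1)
k_subsets = [set(S) for S in combinations(N, k)]
A = [S for S in k_subsets if n not in S]   # subsets that avoid n
B = [S for S in k_subsets if n in S]       # subsets that contain n
print(len(A), len(B), len(k_subsets))
print(comb(n - 1, k), comb(n - 1, k - 1), comb(n, k))
```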
that can also be proved in (at least) two ways. We will provide one or two
other examples and leave the rest to you in the exercises for this section.
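Theorem 7.4.2. For all natural numbers n and k with $0 < k \le n$,
\[ k \cdot \binom{n}{k} = n \cdot \binom{n-1}{k-1}. \]
Proof: Using the factorial formula,
\[ k \cdot \binom{n}{k} = \frac{k \cdot n!}{k!\,(n-k)!} = \frac{n!}{(k-1)!\,(n-k)!} \]
and
\[ n \cdot \binom{n-1}{k-1} = n \cdot \frac{(n-1)!}{(k-1)!\,((n-1)-(k-1))!}
   = \frac{n!}{(k-1)!\,((n-1)-(k-1))!}. \]
Since $(n-k) = (n-1)-(k-1)$ we have
\[ k \cdot \binom{n}{k} = n \cdot \binom{n-1}{k-1}. \]
Q.E.D.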
$\binom{n}{k}$ [...] $\binom{n-1}{k-1}$ [...] $k - 1$ elements of the [...]
\[ k \cdot \binom{n}{k} = n \cdot \binom{n-1}{k-1} \]
Q.E.D.
For example, a committee of k individuals one of whom has been chosen as chairper-
One of which suffers from the fault that it is like swatting a fly with a sledge
hammer.
The result concerns the sum of all the numbers in some row of Pascal's
triangle.
Theorem 7.4.3. For all natural numbers n,
\[ \sum_{k=0}^{n} \binom{n}{k} = 2^n. \]
The binomial theorem states that
\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k. \]
We won't be proving this result just now. But, the following proof is a
proof of the previous theorem using this more powerful result.
Proof: Substitute x = y = 1 in the binomial theorem.
Q.E.D.
Our second proof will be combinatorial. Let us re-iterate that a combinatorial
proof usually consists of counting some collection in two different
ways. The formula we have in this example contains a sum, so we should
search for a collection of things that can be counted using the addition rule.
Proof:
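The power set of an n-element set N can be split up according to the sizes
of the subsets:
\[ \mathcal{P}(N) = S_0 \cup S_1 \cup S_2 \cup \dots \cup S_n, \]
where $S_k = \{ S : S \subseteq N \wedge |S| = k \}$ for $0 \le k \le n$. Since no
subset of N lies in more than one of the sets $S_k$, the addition rule gives
\[ |\mathcal{P}(N)| = |S_0| + |S_1| + |S_2| + \dots + |S_n|. \]
We know that $|\mathcal{P}(N)| = 2^n$ and that $|S_k| = \binom{n}{k}$,
so it follows that
\[ 2^n = \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \dots + \binom{n}{n}. \]
Q.E.D.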
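The counting in this combinatorial proof can be mimicked directly: sort the subsets of a small set by size and add up the class sizes. In the Python sketch below the choice n = 5 and the name subsets_by_size are purely illustrative.

```python
from itertools import combinations
from math import comb

def subsets_by_size(n):
    """Return a list whose k-th entry holds all k-element subsets of {1, ..., n}."""
    elements = range(1, n + 1)
    return [list(combinations(elements, k)) for k in range(n + 1)]

n = 5
classes = subsets_by_size(n)
sizes = [len(S_k) for S_k in classes]
print(sizes, sum(sizes), 2**n)
```

The list of class sizes is a row of Pascal's triangle, and its total is the size of the power set.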
Exercises 7.4
1. Use the binomial theorem (with x = 1000 and y = 1) to calculate
$1001^6$.
2. Find $(2x + 3)^5$.
3. Find $(x^2 + y^2)^6$.
4. The following diagram contains a 3-dimensional analog of Pascal's triangle
that we might call "Pascal's tetrahedron." What would the next
layer look like?
      1

      1
     1 1

      1
     2 2
    1 2 1

      1
     3 3
    3 6 3
   1 3 3 1
5. The student government at Lagrange High consists of 24 members chosen from amongst the general student body of 210. Additionally, there
is a steering committee of 5 members chosen from amongst those in
student government. Use the multiplication rule to determine two different formulas for the total number of possible governance structures.
6. [...] combinatorially.
7. Prove the binomial theorem.
\[ \forall n \in \mathbb{N}, \; \forall x, y \in \mathbb{R}, \quad (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \]
Chapter 8
Cardinality
The very existence of flame-throwers proves that some time, somewhere,
someone said to themselves, "You know, I want to set those people over there
on fire, but I'm just not close enough to get the job done."
- George Carlin
8.1 Equivalent sets
Perversely, there are also those who use the term equipollent to indicate that sets are
the same size. This term actually applies to logical statements that are deducible from
one another.
CHAPTER 8. CARDINALITY
Consider the solmization syllables used for the notes of the major scale in
music; they form the set {do, re, mi, fa, so, la, ti}. What are we doing when
we count this set (and presumably come up with a total of 7 notes)? We first
point at "do" while saying "one," then point at "re" while saying "two," et cetera.
In a technical sense we are creating a one-to-one correspondence between the
set containing the seven syllables and the special set {1, 2, 3, 4, 5, 6, 7}. You
[...] may seem a bit weak since in fact there are 7! = 5040 correspondences, but
"there exists" is what we really want here. What exactly is a one-to-one
correspondence? Well, we've actually seen such things before: a one-to-one
correspondence is really just a bijective function between two sets. We're
finally ready to write a definition that Georg Cantor would approve of.
Definition. For all sets A and B, we say A and B are equivalent, and write
$A \equiv B$ iff there exists a one-to-one (and onto) function f, with Dom(f) = A
and Rng(f) = B.
Somewhat more succinctly, one can just say the sets are equivalent iff
there is a bijection between them.
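For finite sets the definition can be tested mechanically. The following Python sketch represents a function as a dictionary (an assumption of this illustration, as is the name is_bijection), checks that the usual way of counting the solmization syllables is a bijection, and counts how many bijections there are altogether.

```python
from itertools import permutations

def is_bijection(f, A, B):
    """Check that the dict f is a one-to-one and onto function from A to B."""
    defined_on_A = set(f) == set(A)
    onto_B = set(f.values()) == set(B)
    one_to_one = len(set(f.values())) == len(f)
    return defined_on_A and onto_B and one_to_one

syllables = ["do", "re", "mi", "fa", "so", "la", "ti"]
numbers = [1, 2, 3, 4, 5, 6, 7]

counting = dict(zip(syllables, numbers))            # point at "do" and say "one", and so on
all_pairings = [dict(zip(syllables, p)) for p in permutations(numbers)]
print(is_bijection(counting, syllables, numbers), len(all_pairings))
```

Any single one of these pairings is enough to show that the two sets are equivalent.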
We are going to ask you to prove that the above definition defines an
equivalence relation in the exercises for this section. In order to give you a
bit of a jump start on that proof we'll outline what the proof that the relation
is symmetric should look like.
To show that the relation is symmetric we must assume that A
and B are sets with $A \equiv B$ and show that this implies that
$B \equiv A$. According to the definition above this means that we'll
is a function
is a function is a
is one-to-
The Hebrew letter (capital) aleph with a subscript zero, usually pronounced "aleph
naught."
that are actually countable! After all it would literally take forever to count
the natural numbers! We have to presume that the people who instituted
this terminology meant for "countable" to mean "countable, in principle"
or "countable if you're willing to let me keep counting forever" or maybe
"countable if you can keep counting faster and faster and are capable of
ignoring the speed of light limitations on how fast your lips can move." Worse
yet, the term "countable" has come to be used for sets whose cardinalities are
either finite or the size of the naturals. If we want to refer specifically to the
infinite sort of countable set most mathematicians use the term denumerable
(although this is not universal) or countably infinite. Finally, there are sets
whose cardinalities are bigger than the naturals. In other words, there are
sets such that no one-to-one correspondence with N is possible. We don't
mean that people have looked for one-to-one correspondences between such
sets and N and haven't been able to find them; we literally mean that it
can't be done, and it has been proved that it can't be done! Sets having
cardinalities that are this ridiculously huge are known as uncountable.
Exercises 8.1
1. Name four sets in the equivalence class of {1, 2, 3}.
2. Prove that set equivalence is an equivalence relation.
3. Construct a Venn diagram showing the relationships between the sets of
sets which are finite, infinite, countable, denumerable and uncountable.
4. Place the sets N, R, Q, Z, $Z \times Z$, C, N2007 and $\emptyset$ somewhere on the
Venn diagram above. (Note to students (and graders): there are no
wrong answers to this question, the point is to see what your intuition
about these sets says at this point.)
8.2
Point three, "who cares?" is in some sense the toughest of all to deal with.
Hopefully you'll enjoy the clever arguments to come for their own intrinsic
beauty. But, if you can figure a way to make big piles of money using this
stuff that would be nice too.
Let's get started.
Which set is bigger: the natural numbers, N, or the set, $E^{noneg}$, of
non-negative even numbers? Both are clearly infinite, so the "infinity is infinity"
camp might be led to the correct conclusion through invalid reasoning. On
the other hand, the even numbers are contained in the natural numbers so
there's a pretty compelling case for saying the evens are somehow "smaller"
than the naturals. The mathematically rigorous way to show that these
sets have the same cardinality is by displaying a one-to-one correspondence.
Given an even number how can we produce a natural to pair it with? And,
given a natural how can we produce an even number to pair with it? The
map $f : N \to E^{noneg}$ defined by $f(x) = 2x$ is the obvious candidate. It is
injective, so the remaining question is: is
every non-negative even number the image of some natural under f? Given
some non-negative even number e we need to be able to come up with an
x such that f(x) = e. Well, since e is an even number, by the definition of
"even" we know that there is an integer k such that e = 2k and since e is
either zero or positive it follows that k must also be either 0 or positive. It
turns out that k is actually the x we are searching for. Put more succinctly,
every non-negative even number 2k has a preimage, k, under the map f. So
f maps N surjectively onto $E^{noneg}$. Now the sets we've just considered,
N = {0, 1, 2, 3, 4, 5, 6, . . .}
and
If x and y are different numbers that map to the same value, then f(x) = f(y), so 2x =
2y. But we can cancel the 2's and derive that x = y, which is a contradiction.
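The doubling map between the naturals and the non-negative evens can be explored on a finite window of values. The sketch below is only an illustration of the two properties discussed here (no two naturals share an image, and every non-negative even number has a preimage); a finite check is of course not a proof, and the name preimage is made up for this example.

```python
def f(x):
    """The doubling map from the naturals to the non-negative evens."""
    return 2 * x

def preimage(e):
    """Return the natural k with f(k) == e, for a non-negative even number e."""
    if e < 0 or e % 2 != 0:
        raise ValueError("expected a non-negative even number")
    return e // 2

naturals = range(50)                       # a finite window onto N
images = [f(x) for x in naturals]
print(images[:7])
print(len(set(images)) == len(images))     # no two naturals share an image
print(all(f(preimage(e)) == e for e in range(0, 100, 2)))
```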
listing, and if you think about what the symbol ± means you'll probably
come up with
\[ Z = \{0, 1, -1, 2, -2, 3, -3, \dots\}. \]
This singly infinite listing of the integers does the job we're after; in a sense
it displays a one-to-one correspondence with N. In fact any singly infinite
listing can be thought of as displaying a one-to-one correspondence with N:
the first entry (or should we say "zeroth" entry?) in the list is corresponded
with 0, the second entry is corresponded with 1, and so on.
  0    1    2    3    4    5    6   ...
  ↕    ↕    ↕    ↕    ↕    ↕    ↕
  0    1   -1    2   -2    3   -3   ...
To make all of this precise we need to be able to explicitly give the
one-to-one correspondence. It isn't enough to have a picture of it; we need a
formula. Notice that the negative integers are all paired with even naturals
and the positive integers are all paired with odd naturals. This observation
leads us to a piecewise definition for a function that gives the bijection we
seek:
\[ f(x) = \begin{cases} -x/2 & \text{if } x \text{ is even} \\ (x + 1)/2 & \text{if } x \text{ is odd.} \end{cases} \]
By the way, notice that since 0 is even it falls into the first case, and
fortunately that formula gives the right value.
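The piecewise formula is easy to experiment with. This Python sketch evaluates it on the first few naturals (reproducing the start of the listing of the integers) and then checks that the naturals 0 through 2000 hit each integer from -1000 to 1000 exactly once. It deliberately leaves the inverse function alone.

```python
def f(x):
    """Bijection from the naturals to the integers given by the piecewise formula."""
    if x % 2 == 0:
        return -(x // 2)       # even naturals go to 0 and the negative integers
    return (x + 1) // 2        # odd naturals go to the positive integers

listing = [f(x) for x in range(7)]
print(listing)
values = [f(x) for x in range(2001)]
print(sorted(values) == list(range(-1000, 1001)))
```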
Exercise. The inverse function, $f^{-1}$, is also defined piecewise,
based on whether the input is positive or negative. Define the inverse function.
The examples we've done so far have shown that the integers, the natural
numbers and the even naturals all have the same cardinality. This is the
first infinite cardinal number, known as $\aleph_0$. In a certain sense we could view
product of two finite sets (the set of all ordered pairs with entries from the
sets in question) has cardinality equal to the product of the cardinalities of
the sets. What do you suppose will happen if we let the sets be infinite?
For instance, what is the cardinality of $N \times N$? Consider this: the subset of
ordered pairs that start with a 0 can be thought of as a copy of N sitting
inside this Cartesian product. In fact the subset of ordered pairs starting
with any particular number gives another copy of N inside $N \times N$. There are
infinitely many copies of N sitting inside of $N \times N$! This just really ought to
Cantor's snake was originally created to show that $Q^{noneg}$ and N are equinumerous.
(0, 6)  (1, 6)  (2, 6)  (3, 6)  (4, 6)  (5, 6)  (6, 6)
(0, 5)  (1, 5)  (2, 5)  (3, 5)  (4, 5)  (5, 5)  (6, 5)
(0, 4)  (1, 4)  (2, 4)  (3, 4)  (4, 4)  (5, 4)  (6, 4)
(0, 3)  (1, 3)  (2, 3)  (3, 3)  (4, 3)  (5, 3)  (6, 3)
(0, 2)  (1, 2)  (2, 2)  (3, 2)  (4, 2)  (5, 2)  (6, 2)
(0, 1)  (1, 1)  (2, 1)  (3, 1)  (4, 1)  (5, 1)  (6, 1)
(0, 0)  (1, 0)  (2, 0)  (3, 0)  (4, 0)  (5, 0)  (6, 0)
Figure 8.1: Cantor's snake winds through the set $N \times N$ encountering its
elements one after the other.
the origin and the positive x and y axes). This set of points and the path
through them, known as Cantor's snake, is shown in Figure 8.1.
This function was introduced in the exercises for Section 6.5. The version we are presenting
here avoids certain complications.
The diagram in Figure 8.1 gives a visual form of the one-to-one correspondence
we seek. In tabular form we would have something like the following.
   0       1       2       3       4       5       6       7       8     ...
   ↕       ↕       ↕       ↕       ↕       ↕       ↕       ↕       ↕
(0, 0)  (0, 1)  (1, 0)  (0, 2)  (1, 1)  (2, 0)  (0, 3)  (1, 2)  (2, 1)   ...
We need to produce a formula. In truth, we should really produce two
formulas. One that takes an ordered pair (x, y) and produces a number n.
Another that takes a number n and produces an ordered pair (x, y). The
number n tells us where the pair (x, y) lies in our infinite listing. There is a
problem though: the second formula (that gives the map from N to $N \times N$)
is really hard to write down; it's easier to describe the map algorithmically.
A simple observation will help us to deduce the various formulas. The ordered
pairs along the y-axis (those of the form (0, something)) correspond
to triangular numbers. In fact the pair (0, n) will correspond to the n-th
triangular number, $T(n) = (n^2 + n)/2$. The ordered pairs along the descending
slanted line starting from (0, n) all have the feature that the sum of their
coordinates is n (because as the x-coordinate is increasing, the y-coordinate
is decreasing). So, given an ordered pair (x, y), the number corresponding
to the position at the upper end of the slanted line it is on (which will have
coordinates (0, x + y)) will be T(x + y), and the pair (x, y) occurs in the listing
exactly x positions after (0, x + y). Thus, the function $f : N \times N \to N$ is
given by
\[ f(x, y) = x + T(x + y) = x + \frac{(x + y)^2 + (x + y)}{2}. \]
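The formula for the position of a pair along Cantor's snake translates directly into code. The Python sketch below recomputes the first nine positions from the table and then checks, on the finite triangle of pairs with x + y at most 20 (an arbitrary cutoff chosen for the example), that every position from 0 up to T(21) - 1 is used exactly once.

```python
def T(n):
    """The n-th triangular number."""
    return (n * n + n) // 2

def f(x, y):
    """Position of the pair (x, y) along Cantor's snake."""
    return x + T(x + y)

pairs = [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1)]
print([f(x, y) for x, y in pairs])

# On the triangle x + y <= 20 the positions are exactly 0, 1, ..., T(21) - 1.
positions = sorted(f(x, y) for x in range(21) for y in range(21 - x))
print(positions == list(range(T(21))))
```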
To go the other direction (that is, to take a position in the listing and derive
an ordered pair) we need to figure out where a given number lies relative
to the triangular numbers. For instance, try to figure out what (x, y) pair
position number 13 will correspond with. Well, the next smaller triangular
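it follows that $f^{-1}(x) = \sqrt{2x + 1/4} - 1/2$.
$\lfloor \sqrt{2n + 1/4} - 1/2 \rfloor$.
It is not pretty, but the above discussion can be translated into a formula for
$f^{-1}$:
\[ f^{-1}(n) = \left(
   n - \frac{\lfloor \sqrt{2n + 1/4} - 1/2 \rfloor^2 + \lfloor \sqrt{2n + 1/4} - 1/2 \rfloor}{2}, \;\;
   \lfloor \sqrt{2n + 1/4} - 1/2 \rfloor - n + \frac{\lfloor \sqrt{2n + 1/4} - 1/2 \rfloor^2 + \lfloor \sqrt{2n + 1/4} - 1/2 \rfloor}{2}
   \right). \]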
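The algorithmic description (find the largest triangular number that does not exceed n, then count forward along the slanted line) is perhaps easier to follow in code than in the formula. In this Python sketch the floor of the square-root expression is computed with exact integer arithmetic via math.isqrt, which is a substitution made here to avoid floating-point rounding; the name f_inverse is likewise just a label for this illustration.

```python
from math import isqrt

def T(n):
    """The n-th triangular number."""
    return (n * n + n) // 2

def f_inverse(n):
    """Return the pair (x, y) sitting at position n along Cantor's snake."""
    # Index of the largest triangular number that does not exceed n.
    # This equals floor(sqrt(2n + 1/4) - 1/2), computed in exact integer arithmetic.
    a = (isqrt(8 * n + 1) - 1) // 2
    x = n - T(a)
    y = a - x
    return (x, y)

print(f_inverse(13))
print([f_inverse(n) for n in range(9)])
```

For position 13 the largest triangular number not exceeding it is T(4) = 10, so the pair lies 3 steps down the slanted line that starts at (0, 4).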
and f
far we have shown that the sets $E^{noneg}$, N, Z and $N \times N$ all have the same
\[ f(x) = c + \frac{(x - a)(d - c)}{(b - a)} \]
Figure 8.2: Projection from a point can be used to show that intervals of
different lengths contain the same number of points.
There are other geometric constructions which we can use to show that
there are the same number of points in a variety of entities. For example,
consider the upper half of the unit circle (Remember the unit circle from
Trig? All points (x, y) satisfying $x^2 + y^2 = 1$.) This is a semi-circle having a
radius of 1, so the arclength of said semi-circle is $\pi$. It isn't hard to imagine
that this semi-circular arc contains the same number of points as an interval
of length $\pi$, and we've already argued that all intervals contain the same
number of points... But, a nice example of geometric projection, vertical
projection (a.k.a. $\pi_1$), can be used to show that (for example) the interval
$(-1, 1)$ and the portion of the unit circle lying in the upper half-plane are
equinumerous.
Once the bijection is understood geometrically it is fairly simple to provide
formulas. To go from the semi-circle to the interval, we just forget about the
y-coordinate:
\[ f(x, y) = x. \]
To go in the other direction we need to recompute the missing y-value:
\[ f^{-1}(x) = (x, \sqrt{1 - x^2}). \]
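Here is a small Python sketch of the two formulas for vertical projection. It uses floating-point numbers, so the check that a lifted point lies on the unit circle is done with a tolerance (math.isclose); the sample values of x are arbitrary.

```python
from math import sqrt, isclose

def f(point):
    """Vertical projection: drop the y-coordinate of a point on the upper semi-circle."""
    x, y = point
    return x

def f_inverse(x):
    """Lift a number in (-1, 1) back up to the upper half of the unit circle."""
    return (x, sqrt(1 - x * x))

for x in (-0.9, -0.5, 0.0, 0.25, 0.8):
    px, py = f_inverse(x)
    on_circle = isclose(px * px + py * py, 1.0)
    print(x, (px, py), on_circle, f((px, py)) == x)
```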
or not you can finish that proof it should be evident what that transitivity
means to us in the current situation. Any pair of line segments are the same
size; a line segment (i.e. an interval) and a semi-circle are the same size;
the semi-circle and an infinite line are the same size. Transitivity tells us that
an infinitely extended line has the same number of points as (for example)
the interval (0, 1).
Exercises 8.2
1. Prove that positive numbers of the form 3k + 1 are equinumerous with
positive numbers of the form 4k + 2.
2. Prove that $f(x) = c + \frac{(x - a)(d - c)}{(b - a)}$ provides a bijection from the
interval [a, b] to the interval [c, d].
3. Prove that any two circles are equinumerous (as sets of points).
4. Determine a formula for the bijection from $(-1, 1)$ to the line y = 1
determined by vertical projection onto the upper half of the unit circle,
followed by projection from the point (0, 0).
5. It is possible to generalize the argument that shows a line segment is
equivalent to a line to higher dimensions. In two dimensions we would
show that the unit disk (the interior of the unit circle) is equinumerous
with the entire plane $R \times R$. In three dimensions we would show that
the unit ball (the interior of the unit sphere) is equinumerous with the
entire space $R^3 = R \times R \times R$. Here we would like you to prove the
two-dimensional case.
8.3 Cantor's theorem
Many people believe that the result known as Cantor's theorem says that
the real numbers, R, have a greater cardinality than the natural numbers, N.
That isn't quite right. In fact Cantor's theorem is a much broader statement,
one of whose consequences is that |R| > |N|. Before we go on to discuss
Cantor's theorem in full generality, we'll first explore it, essentially, in this
simplified form. Once we know that $|R| \neq |N|$, we'll be in a position to
we are making an infinite list of reals!). The problem is that we would need
to be sure that every real number is on the list somewhere. In fact, since
we've used a geometric argument to show that the interval (0, 1) and the set
R are equinumerous, it will be sufficient to presume that there is an infinite
list containing all the numbers in the interval (0, 1).
a list of 10 real numbers in the interval (0, 1). Make sure that at least 5 of
them are not rational.
In the previous exercise, you've started the job, but we need to presume
that it is truly possible to complete this job. That is, we must presume that
there really is an infinite list containing every real number in the interval
(0, 1).
Once we have an infinite list containing every real number in the interval
(0, 1) we have to face up to a second issue. What does it really mean
to list a particular real number? For instance if $e - 2$ is in the seventh
position on our list, is it OK to write "$e - 2$" [...]
but it isn't necessarily possible to do something of that kind for every real
number. On the other hand, writing down the decimal expansion is a problem
too; in a certain sense, "most" real numbers in (0, 1) have infinitely long
decimal expansions. There is also another problem with decimal expansions;
they aren't unique. For example, there is really no difference between the
finite expansion 0.5 and the infinitely long expansion $0.4\overline{9}$.
Rather than writing something like "$e - 2$" or "0.7182818284590452354...",
we are going to in fact write ".1011011111100001010100010110001010001010..."
In other words, we are going to write the base-2 expansions of the real
numbers in our list. Now, the issue of non-uniqueness is still there in binary,
and in fact if we were to stay in base-10 it would be possible to plug a certain
gap in our argument, but the binary version of this argument has some
especially nice features. Every binary (or for that matter decimal) expansion
corresponds to a unique real number, but it doesn't work out so well the
other way around: there are sometimes two different binary expansions
that correspond to the same real number. There is a lovely fact that we
are not going to prove (you may get to see this result proved in a course in
Real Analysis) that points up the problem. Whenever two different binary
expansions represent the same real number, one of them is a terminating
expansion (it ends in infinitely many 0's) and the other is an infinite expansion
(it ends in infinitely many 1's). We won't prove this fact, but the gist of the
argument is a proof by contradiction; you may be able to get the point by
studying Figure 8.4. (Try to see how it would be possible to find a number
in between two binary expansions that didn't end in all-zeros and all-ones.)
[Figure: the first levels of a binary tree, with nodes labeled .0 and .1, then .00, .01, .10 and .11]
Figure 8.4: The base-2 expansions of reals in the interval [0, 1] are the leaves
of an infinite tree.
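Base-2 expansions like the one quoted for e - 2 can be generated by repeated doubling: double the number, record the integer part as the next digit, and keep the fractional part. The Python sketch below does this with exact fractions. Note the assumption: math.e is only a floating-point approximation of e, so only the leading digits (well within the precision of a double) can be trusted; the function name binary_digits is made up for this illustration.

```python
from fractions import Fraction
from math import e

def binary_digits(x, count):
    """Return the first `count` base-2 digits of a number x with 0 <= x < 1."""
    x = Fraction(x)                 # exact value, so the doubling below introduces no rounding
    digits = []
    for _ in range(count):
        x *= 2
        bit = int(x >= 1)           # the integer part after doubling is the next digit
        digits.append(str(bit))
        x -= bit
    return "." + "".join(digits)

print(binary_digits(Fraction(1, 2), 8))
print(binary_digits(e - 2, 40))
```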
So, instead of showing that the set of reals in (0, 1) can't be put in one-to-one
correspondence with N, what we're really going to do is show that their
binary expansions can't be put in one-to-one correspondence with N. Since
there are an infinite number of reals that have two different binary expansions
this doesn't really do the job as advertised at the beginning of this section.
(Perhaps you are getting used to our wily ways by now; yes, this does mean
that we're going to ask you to do the "real" proof in the exercises.) The set of
divide (but not divide by 0) with them. We are only mentioning this fact so
that you'll understand why the set {0, 1} is often referred to as F_2. We're
only mentioning that fact so that you'll understand why we call the set of
that N and R are not equinumerous. Strangely, the argument can't be made
to work in binary, and since you're going to be asked to write it up in the
exercises, we want to point out one of the potential pitfalls. If we were to
use a diagonal argument to show that (0, 1) isn't countable, we would start
by assuming that every element of (0, 1) was written down in a list. For
most real numbers in (0, 1) we could write out their binary representation
uniquely, but for some we would have to make a choice: should we write down
the representation that terminates, or the one that ends in infinitely-many
1's? Suppose we choose to use the terminating representations; then none
of the infinite binary strings that end with all 1's will be on the list. It's
possible that the thing we get when we complement the diagonal is one of
these (unlisted) binary strings, so we don't necessarily have a contradiction.
If we make the other choice (use the infinite binary representation when we
have a choice) there is a similar problem. You may think that our use of
binary representations for real numbers was foolish in light of the failure of
the argument to go through in binary, especially since, as we've alluded to,
it can be made to work in decimal. The reason for our apparent stubbornness
is that these infinite binary strings do something else that's very nice. An
infinitely long binary sequence can be thought of as the indicator function of
a subset of N. For example, .001101010001 is the indicator of {2, 3, 5, 7, 11}.
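The indicator-function idea can be tried out on finite prefixes with a short Python sketch. The helper names subset_from_bits and bits_from_subset are our own (they do not come from the text), and the sketch assumes, as in the example above, that the first digit after the binary point corresponds to the natural number 0. Only finitely many digits can be handled this way, so this illustrates the correspondence rather than the infinite strings themselves.

```python
def subset_from_bits(bits):
    """Return the set of positions k whose digit is 1.

    `bits` is a finite prefix such as ".0011"; position 0 is the first
    digit after the binary point (an assumption matching the example).
    """
    digits = bits.lstrip(".")
    return {k for k, digit in enumerate(digits) if digit == "1"}


def bits_from_subset(subset, length):
    """Return the first `length` digits of the indicator string of `subset`."""
    return "." + "".join("1" if k in subset else "0" for k in range(length))


primes = subset_from_bits(".001101010001")
print(sorted(primes))
print(bits_from_subset(primes, 12))
```

The two helpers are inverse to one another on prefixes of a fixed length, which is the finite shadow of the one-to-one correspondence between infinite binary strings and subsets of N.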
Exercise. Complete the table.
binary expansion subset of N
.1
{0}
.0111
{2, 4, 6}
.01
{3k + 1 | k ∈ N}
The set, F_2^∞, we've been working with is in one-to-one correspondence
with the power set of the natural numbers, P(N). When viewed in this
light, the proof we did above showed that the power set of N has an infinite
cardinality strictly greater than that of N itself. In other words, P(N) is
uncountable.
What Cantor's theorem says is that this always works. If A is any set,
and P(A) is its power set, then |A| < |P(A)|. In a way, this more general
theorem is easier to prove than the specific case we just handled.
Let S = {x ∈ A | x ∉ f(x)}.
If S is in the range of f, there is a preimage y such that S = f(y).
But, if such a y exists then the membership question, y ∈ S, must
either be true or false. If y ∈ S, then because S = f(y), and S
Q.E.D.
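For a small finite set the construction of S can be checked exhaustively. The following Python sketch is only an illustration under our own assumptions: the set A = {1, 2, 3} and the helper names power_set and diagonal_set are hypothetical choices, not part of the proof. It runs through every function f from A to P(A) and confirms that the set S built from f is never one of the values of f.

```python
from itertools import combinations, product


def power_set(A):
    """All subsets of A, as frozensets."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(A, f):
    """Cantor's set S = {x in A : x not in f(x)}, with f given as a dict."""
    return frozenset(x for x in A if x not in f[x])


A = {1, 2, 3}
subsets = power_set(A)          # the 8 subsets of A
elements = sorted(A)

# Try all 8**3 = 512 functions f : A -> P(A).
missed_every_time = all(
    diagonal_set(A, dict(zip(elements, images))) not in images
    for images in product(subsets, repeat=len(elements))
)
print(missed_every_time)
```

A finite check like this proves nothing about infinite sets, of course; it only shows the mechanism of the argument at work.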
Cantor's theorem guarantees that there is an infinite hierarchy of infinite
cardinal numbers. Let's put it another way. People have sought a construction that, given an infinite set, could be used to create a strictly larger set.
For instance, the Cartesian product works like this if our sets are finite
seen, this is not necessarily so if A is infinite (remember the snake argument that N and N × N are equivalent). The real import of Cantor's theorem
is that taking the power set of a set does create a set of larger cardinality.
So we get an infinite tower of infinite cardinalities, starting with ℵ_0 = |N|,
Exercises 8.3
1. Determine a substitution rule (a consistent way of replacing one digit
with another along the diagonal) so that a diagonalization proof showing
that the interval (0, 1) is uncountable will work in decimal. Write up
the proof.
2. Can a diagonalization proof showing that the interval (0, 1) is uncountable be made workable in base-3 (ternary) notation?
3. In the proof of Cantor's theorem we construct a set S that cannot
be in the image of a presumed bijection from A to P(A). Suppose
A = {1, 2, 3} and f determines the following correspondences: 1 → ...,
2 → {1, 3} and 3 → ∅,
4. An argument very similar to the one embodied in the proof of Cantor's theorem is found in the Barber's paradox. This paradox was
originally introduced in the popular press in order to give laypeople an
understanding of Cantor's theorem and Russell's paradox. It sounds
somewhat sexist to modern ears. (For example, it is presumed without
comment that the Barber is male.)
In a small town there is a Barber who shaves those men (and
only those men) who do not shave themselves. Who shaves
the Barber?
Explain the similarity to the proof of Cantor's theorem.
5. Cantor's theorem, applied to the set of all sets, leads to an interesting
paradox. The power set of the set of all sets is a collection of sets, so
it must be contained in the set of all sets. Discuss the paradox and
determine a way of resolving it.
8.4 Dominance
We've said a lot about the equivalence relation determined by Cantor's definition of set equivalence. We've also, occasionally, written things like |A| < |B|,
without being particularly clear about what that means. It's now time to
come clean. There is actually a (perhaps) more fundamental notion used
for comparing set sizes than equivalence: dominance. Dominance is an
ordering relation on the class of all sets. One should probably really define
dominance first and then define set equivalence in terms of it. We haven't
followed that plan for (at least) two reasons. First, many people may want
to skip this section; the results of this section depend on the difficult
Cantor-Bernstein-Schröder theorem (see footnote 6). Second, we will later take the view
that dominance should really be considered to be an ordering relation on the
set of all cardinal numbers (i.e. the equivalence classes of the set equivalence relation), not on the collection of all sets. From that perspective, set
equivalence really needs to be defined before dominance.
One set is said to dominate another if there is an injective function from the latter
into the former. More formally, we have the following
Definition. If A and B are sets, we say "A dominates B" and write |A| ≥
|B| iff there is an injective function f with domain B and codomain A.
It is easy to see that this relation is reflexive and transitive. The Cantor-Bernstein-Schröder theorem proves that it is also anti-symmetric, which
means dominance is an ordering relation. Be advised that there is an abuse
of terminology here that one must be careful about: what are the domain
and range of the "dominance" relation? The definition would lead us to
think that sets are the things that go on either side of the "dominance"
relation, but the notation is a bit more honest: |A| ≥ |B| indicates that
the things really being compared are the cardinal numbers of sets (not the
sets themselves). Thus anti-symmetry for this relation is
(|A| ≥ |B|) ∧ (|B| ≥ |A|) ⟹ (|A| = |B|).
In other words, if A dominates B and vice versa, then A and B are
equivalent sets. A strict interpretation of anti-symmetry for this relation
might lead to the conclusion that A and B are actually the same set, which
is clearly an absurdity.
Footnote 6: This theorem has been known for many years as the Schröder-Bernstein theorem, but,
lately, has had Cantor's name added as well. Since Cantor proved the result before the
other gentlemen this is fitting. It is also known as the Cantor-Bernstein theorem (leaving
out Schröder), which doesn't seem very nice.
Naturally, we want to prove the Cantor-Bernstein-Schröder theorem (which
we're going to start calling the C-B-S theorem for brevity), but first it'll be
instructive to look at some of its consequences. Once we have the C-B-S
theorem we get a very useful shortcut for proving set equivalences. Given
sets A and B, if we can find injective functions going between them in both
directions, we'll know that they're equivalent. So, for example, we can use
C-B-S to prove that the set of all infinite binary strings and the set of reals in
(0, 1) really are equinumerous. (In case you had some remaining doubt...)
It is easy to dream up an injective function from (0, 1) to F_2^∞: just send a
real number to its binary expansion, and if there are two, make a consistent
choice. Let's say we'll take the non-terminating expansion.
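To see the "consistent choice" in action, here is a minimal Python sketch that produces the first few digits of the non-terminating binary expansion of a rational number in (0, 1). The function name nonterminating_bits is our own invention, and exact Fraction arithmetic is assumed so that ties are detected reliably. The trick is a strict comparison: when the doubled value is exactly 1 we still write the digit 0, which forces the expansion that ends in all 1's.

```python
from fractions import Fraction


def nonterminating_bits(x, n):
    """First n binary digits of x in (0, 1), choosing the expansion that
    does not end in all 0's whenever two expansions exist."""
    if not 0 < x < 1:
        raise ValueError("x must lie strictly between 0 and 1")
    digits = []
    for _ in range(n):
        x = 2 * x
        if x > 1:              # strict: a tie (x == 1) keeps the digit 0
            digits.append("1")
            x -= 1
        else:
            digits.append("0")
    return "." + "".join(digits)


print(nonterminating_bits(Fraction(1, 2), 6))   # one half, written without a terminating tail
print(nonterminating_bits(Fraction(1, 3), 6))
```

Throughout the loop the running value stays in the half-open interval (0, 1], so it never reaches 0 and the digits never settle into all 0's.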
There is a cute thought-experiment called Hilbert's Hotel that will lead
us to a technique for developing an injective function in the other direction.
Hilbert's Hotel has ℵ_0 rooms. If any countable collection of guests shows
up there will be enough rooms for everyone. Suppose you arrive at Hilbert's
hotel one dark and stormy evening and the "No Vacancy" light is on: there
are already a denumerable number of guests there, and every room is full. The
clerk sees you dejectedly considering your options, trying to think of another
hotel that might still have rooms when, clearly, a very large convention is
in town. He rushes out and says "My friend, have no fear! Even though we
have no vacancies, there is always room for one more at our establishment."
He goes into the office and makes the following announcement on the PA
system: "Ladies and Gentlemen, in order to accommodate an incoming guest,
please vacate your room and move to the room numbered one higher. Thank
you." There is an infinite amount of grumbling, but shortly you find yourself
occupying room number 1.
To develop an injection from F_2^∞ to (0, 1) we'll use "room number 1" to
separate the binary expansions that represent the same real number. Move
all the digits of a binary expansion down by one, and make the first digit
0 for (say) the terminating expansions and 1 for the non-terminating ones.
Now consider these expansions as real numbers: all the expansions that
previously coincided are now separated into the intervals (0, 1/2) and (1/2, 1).
Notice how funny this map is: there are now many, many (infinitely many)
real numbers with no preimages. For instance, only a subset of the rational
numbers in (0, 1/2) have preimages. Nevertheless, the map is injective, so C-B-S tells us that F_2^∞ and (0, 1) are equivalent.
There are quite a few different
proofs of the C-B-S theorem. The one Cantor himself wrote relies on the
axiom of choice. The axiom of choice was somewhat controversial when it was
introduced, but these days most mathematicians will use it without qualms.
What it says (essentially) is that it is possible to make an infinite number of
choices. More precisely, it says that if we have an infinite set consisting of
non-empty sets, it is possible to select an element out of each set. If there
is a definable rule for picking such an element (as is the case, for example,
when we selected the nonterminating binary expansion whenever there was
a choice in defining the injection from (0, 1) to F_2^∞) the axiom of choice isn't
needed. The usual axioms for set theory were developed by Zermelo and
Fraenkel, so you may hear people speak of the ZF axioms. If, in addition,
we want to specifically allow the axiom of choice, we are in the ZFC axiom
system. If it's possible to construct a proof for a given theorem without using
the axiom of choice, almost everyone would agree that that is preferable. On
the other hand, a proof of the C-B-S theorem, which necessarily must be able
to deal with uncountably infinite sets, will have to depend on some sort of
notion that will allow us to deal with huge infinities.
The proof we will present here is attributed to Julius König. König
was a contemporary of Cantor's who was (initially) very much respected by
him. Cantor came to dislike König after the latter presented a well-publicized
(and ultimately wrong) lecture claiming the continuum hypothesis was false.
Apparently the continuum hypothesis was one of Cantor's favorite ideas,
because he seems to have construed König's lecture as a personal attack.
Anyway...
König's proof of C-B-S doesn't use the axiom of choice, but it does have
its own strangeness: a function that is not necessarily computable, that is,
a function for which (for certain inputs) it may not be possible to compute
an output in a finite amount of time! Except for this oddity, König's proof
is probably the easiest to understand of all the proofs of C-B-S. Before we
get too far into the proof it is essential that we understand the basic setup.
The Cantor-Bernstein-Schröder theorem states that whenever A and B are
sets and there are injective functions f : A → B and g : B → A, then it
follows that A and B are equivalent. Saying A and B are equivalent means
that we can find a bijective function between them. So, to prove C-B-S, we
hypothesize the two injections and somehow we must construct the bijection.
Figure 8.5 has a presumption in it, that A and B are countable, which
need not be the case. Nevertheless, it gives us a good picture to work from.
The basic hypotheses, that A and B are sets and we have two functions,
one from A into B and another from B into A, are shown. We will have to
build our bijective function in a piecewise manner. If there is a non-empty
intersection between A and B, we can use the identity function for that part
of the domain of our bijection. So, without loss of generality, we can presume
that A and B are disjoint. We can use the functions f and g to create infinite
sequences, which alternate back and forth between A and B, containing any
particular element. Suppose a ∈ A is an arbitrary element. Since f is defined
on all of A, we can compute f(a). Now since f(a) is an element of B, and g
is defined on all of B, we can compute g(f(a)), and so on. Thus, we get the
infinite sequence
a, f(a), g(f(a)), f(g(f(a))), ...
fails to be defined.
..., g^{-1}(f^{-1}(g^{-1}(a))), f^{-1}(g^{-1}(a)), g^{-1}(a), a, f(a), g(f(a)), f(g(f(a))), ...
φ : A → B by deciding
what it must do on these sequences. There are four possibilities for how the
sequences we've just defined can play out. In extending them to the left, we
may or may not run into a place where one of the inverse functions needed isn't defined.
We say a sequence is an A-stopper if, in extending to the left, we
end up on an element of A that has no preimage under g (see Figure 8.6).
Similarly, we can define a B-stopper. If the inverse functions are always
defined within a given sequence there are also two possibilities; the sequence
may be finite (and so it must be cyclic in nature) or the sequence may be
truly infinite.
Finally, here is a definition for φ.
\[ \varphi(x) = \begin{cases} g^{-1}(x) & \text{if } x \text{ is in a B-stopper} \\ f(x) & \text{otherwise} \end{cases} \]
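The piecewise definition above can be played with on a toy example. In the Python sketch below everything concrete is our own assumption: A and B are taken to be two separate copies of the natural numbers, and the injections are f(n) = n + 1 and g(n) = n + 1 (neither is onto, since 0 has no preimage). In this particular example every sequence eventually stops when extended to the left, so the backward search always terminates; in general, as noted above, it need not.

```python
def f(a):
    """Hypothetical injection A -> B."""
    return a + 1


def g(b):
    """Hypothetical injection B -> A."""
    return b + 1


def f_inv(b):
    """Preimage of b under f, or None when there is none."""
    return b - 1 if b >= 1 else None


def g_inv(a):
    """Preimage of a under g, or None when there is none."""
    return a - 1 if a >= 1 else None


def in_b_stopper(a):
    """Extend the sequence through a (an element of A) to the left and
    report whether it stops on an element of B."""
    x, side = a, "A"
    while True:
        previous = g_inv(x) if side == "A" else f_inv(x)
        if previous is None:
            return side == "B"
        x, side = previous, ("B" if side == "A" else "A")


def phi(a):
    """The piecewise map: g inverse on B-stoppers, f everywhere else."""
    return g_inv(a) if in_b_stopper(a) else f(a)


print([phi(a) for a in range(6)])
```

In this example the even elements of A lie in A-stoppers and are sent forward by f, while the odd elements lie in B-stoppers and are pulled back by g inverse, so the resulting map simply swaps 2k and 2k + 1 and hits every element of B exactly once.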
Exercises 8.4
1. How could the clerk at the Hilbert Hotel accommodate a countable
number of new guests?
2. Let F be the collection of all real-valued functions defined on the real
line. Find an injection from R to F . Do you think it is possible to find
an injection going the other way? In other words, do you think that F
and R are equivalent? Explain.
3. Fill in the details of the proof that dominance is an ordering relation.
(You may simply cite the C-B-S theorem in proving anti-symmetry.)
a/b to 2^a 3^b. Use this and another
obvious injection to (in light of the C-B-S theorem) reaffirm the
8.5
The word "continuum" in the title of this section is used to indicate sets of
points that have a certain continuity property. For example, in a real interval
it is possible to move from one point to another, in a smooth fashion, without
ever leaving the interval. In a range of rational numbers this is not possible,
because there are irrational values in between every pair of rationals. There
are many sets that behave as a continuum: the intervals (a, b) or [a, b], the
entire real line R, the x-y plane R × R, a volume in 3-dimensional space (or
for that matter the entire space R^3). It turns out that all of these sets have
the same size.
The cardinality of the continuum, denoted c, is the cardinality of all of
the sets above.
In the previous section we mentioned the continuum hypothesis and how
angry Cantor became when someone (König) tried to prove it was false. In
this section we'll delve a little deeper into what the continuum hypothesis
says and even take a look at CH's big brother, GCH. Before doing so, it
seems like a good idea to look into the equivalences we've asserted about all
those sets above which (if you trust us) have the cardinality c.
We've already seen that an interval is equivalent to the entire real line,
but the notion that the entire infinite Cartesian plane has no more points
in it than an interval one inch long defies our intuition. Our conception
of dimensionality leads us to think that things of higher dimension must be
larger than those of lower dimension. This preconception is false, as we can see
by demonstrating that a 1 × 1 square can be put in one-to-one correspondence
with the unit interval. Let S = {(x, y) | 0 < x < 1 ∧ 0 < y < 1} and let I
be the open unit interval (0, 1). We can use the Cantor-Bernstein-Schröder
theorem to show that S and I are equinumerous; we just need to find
ever the next infinite cardinal is, is called ℵ_1. It's conceivable that there
actually isn't a next infinite cardinal after ℵ_0; it might be the case that
the collection of infinite cardinal numbers isn't well-ordered! In any case, if
there is a next infinite cardinal, what is it? Cantor's theorem shows that
there is a way to build some infinite cardinal bigger than ℵ_0: just apply
the power set construction. The continuum hypothesis just says that this
bigger cardinality that we get by applying the power set construction is that
next cardinality we've been talking about.
To re-iterate, we've shown that the power set of N is equivalent to the
interval (0, 1) which is one of the sets whose cardinality is c. So the continuum
hypothesis, the thing that got Georg Cantor so very heated up, comes down
to asserting that
ℵ_1 = c.
There really should be a big question mark over that. A really big question mark. It turns out that the continuum hypothesis lives in a really weird
world... To this day, no one has the least notion of whether it is true or false.
But wait! That's not all! The real weirdness is that it would appear to be
impossible to decide. Well, that's not so bad; after all, we talked about
undecidable sentences way back in the beginning of Chapter 2. Okay, so
here's the ultimate weirdness. It has been proved that one can't prove the
continuum hypothesis. It has also been proved that one can't disprove the
continuum hypothesis.
Having reached this stage in a book about proving things I hope that the
last two sentences in the previous paragraph caused some thought along the
lines of "well, ok, with respect to what axioms?" to run through your head.
So, if you did think something along those lines, pat yourself on the back.
And if you didn't, then recognize that you need to start thinking that way:
things are proved or disproved only in a relative way; it depends what
axioms you allow yourself to work with. The usual axioms for mathematics
are called ZFC: the Zermelo-Fraenkel set theory axioms together with the
axiom of choice. The ultimate weirdness we've been describing about the
continuum hypothesis is a result due to a gentleman named Paul Cohen that
says "CH is independent of ZFC." More pedantically, it is impossible to
either prove or disprove the continuum hypothesis within the framework of
the ZFC axiom system.
It would be really nice to end this chapter by mentioning Paul Cohen, but
there is one last thing we'd like to accomplish: explain what GCH means.
So here goes.
The generalized continuum hypothesis says that the power set construction is basically the only way to get from one infinite cardinality to the next.
In other words GCH says that not only does P(N) have the cardinality known
as ℵ_1, but every other aleph number can be realized by applying the power set
ℵ_{n+1} = 2^{ℵ_n}.
I'd really rather not bring this chapter to a close with that monstrosity
so instead I think I'll just say
Paul Cohen.
Hah! I did it! I ended the chapter by sayi... Hunh? Oh.
Paul Cohen.
Chapter 9
Proof techniques IV: Magic
"If you can keep your head when all about you are losing theirs, it's just
possible you haven't grasped the situation." (Jean Kerr)
The famous mathematician Paul Erdős is said to have believed that God
has a Book in which all the really elegant proofs are written. The greatest
praise that a collaborator (see footnote 1) could receive from Erdős was that they had discovered a "Book proof." It is not easy or straightforward for a mere mortal
to come up with a Book proof, but notice that, since the Book is inaccessible
to the living, all the Book proofs of which we are aware were constructed by
ordinary human beings. In other words, it's not impossible!
The title of this final chapter is intended to be whimsical; there is no real
magic involved in any of the arguments that we'll look at. Nevertheless, if you
reflect a bit on the mental processes that must have gone into the development
of these elegant proofs, perhaps you'll agree that there is something magical
there.
Footnote 1: ... and their collaborators, etc. are organized into a tree structure according to their so-called
Erdős number, see [5].
There is a lovely book entitled Proofs from the Book [2] that has a nice collection
of Book proofs.
9.1 Morley's miracle
Duplicating the cube is also known as the Delian problem. The problem comes from
a pronouncement by the oracle of Apollo at Delos that a plague afflicting the Athenians
would be lifted if they built an altar to Apollo that was twice as big as the existing altar.
The existing altar was a cube, one meter on a side, so they carefully built a two meter cube,
but the plague raged on. Apparently what Apollo wanted was a cube that had double
the volume of the present altar; its side length would have to be ∛2 ≈ 1.25992, and since
this was Greece and it was around 430 B.C. and there were no electronic calculators, they
were basically just screwed.
Figure 9.1: The setup for Morley's Miracle. Start with an arbitrary triangle
and trisect each of its angles.
The six angle trisectors that we've just drawn intersect one another in
quite a few points.
Exercise. You could literally count the number of intersection points between
the angle trisectors on the diagram, but you should also be able to count them
(perhaps we should say double-count them) combinatorially. Give it a try!
Among the points of intersection of the angle trisectors there are three
that we will single out: the intersections of adjacent trisectors. In Figure 9.2
the intersections of adjacent trisectors are indicated; additionally, we have
connected them together to form a small triangle in the center of our original
triangle.
Figure 9.2: A triangle is formed whose vertices are the intersections of the
adjacent trisectors of the angles of △ABC.
Are you ready for the miraculous part? Okay, here goes!
Theorem 9.1.1. The points of intersection of the adjacent trisectors in an
arbitrary triangle △ABC form the vertices of an equilateral triangle.
In other words, that little blue triangle in Figure 9.2 that kind of looks
like it might be equilateral actually does have all three sides equal to one
another. Furthermore, it doesn't matter what triangle we start with; if we
do the construction above we'll get a perfect 60°-60°-60° triangle in the
middle!
Sources differ, but it is not clear whether Morley ever proved his theorem.
The first valid proof (according to R. K. Guy in [8]) was published in 1909 by
M. Satyanarayana [15]. There are now many other proofs known; for instance
the "cut-the-knot" website (http://www.cut-the-knot.org/) exposits no less
than nine different proofs. The proof by Satyanarayana used trigonometry.
The proof we'll look at here is arguably the shortest ever produced and it is
show that the triangle whose vertices are the intersections of the adjacent trisectors is equilateral; this triangle will be referred to as the Morley triangle.
Let's also denote by A, B and C the measures of the angles of △ABC. (This
the angles at those vertices.) It turns out that it is fairly hard to reason
from our knowledge of what the angles A, B and C are to deduce that the
Morley triangle is equilateral. How does the following plan sound: suppose
we construct a triangle, that definitely does have an equilateral Morley triangle, whose angles also happen to be A, B and C. Such a triangle would be
similar (see footnote 4) to the original triangle △ABC. If we follow the similarity transform
from the constructed triangle back to △ABC we will see that their Morley
triangles must coincide; thus if one is equilateral so is the other!
One of the features of Conway's proof that leads to its great succinctness
and beauty is his introduction of some very nice notation. Since we are
dealing with angle trisectors, let a, b and c be angles such that 3a = A,
3b = B and 3c = C. Furthermore, let a superscript star denote the angle
that is π/3 (or 60° if you prefer) greater than a given angle. So, for example,
a* = a + π/3
and
a** = a + 2π/3.
Footnote 4: In Geometry, two objects are said to be similar if one can be made to exactly coincide
with the other after a series of rigid translations, rotations and scalings. In other words,
they have the same shape if you allow for differences in scale and are allowed to slide them
around and spin them about as needed.
(a, b*, c*)
(a, b**, c)
(a*, b*, c)
(a**, b, c)
(a*, b, c*)
(a, b, c**)
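Since A + B + C = π, we have a + b + c = π/3, and each triple in the list carries exactly two stars in total, so its angles add up to π and it really does describe a triangle. The Python sketch below checks this in degrees with exact fractions. The function names star and conway_triangles are our own, the sample triangle with angles 50°, 60°, 70° is an arbitrary choice, and the list includes the triple (a, b, c**) that is discussed further on in the text.

```python
from fractions import Fraction


def star(angle, times=1):
    """Add 60 degrees (that is, pi/3) to an angle once per star."""
    return angle + 60 * times


def conway_triangles(A, B, C):
    """Angle triples, in degrees, of the six starred triangles."""
    if A + B + C != 180:
        raise ValueError("the angles of a triangle must sum to 180 degrees")
    a, b, c = Fraction(A) / 3, Fraction(B) / 3, Fraction(C) / 3
    return [
        (a, star(b), star(c)),
        (star(a), b, star(c)),
        (star(a), star(b), c),
        (star(a, 2), b, c),
        (a, star(b, 2), c),
        (a, b, star(c, 2)),
    ]


for triple in conway_triangles(50, 60, 70):
    print([float(angle) for angle in triple], "sum:", sum(triple))
```

Working in degrees with Fraction avoids floating-point rounding, so the comparison of each sum with 180 is exact.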
Exercise. What would a triangle whose vertex angles are (0*, 0*, 0*) be?
In a nutshell, Conway's proof consists of starting with an equilateral triangle of unit side length, adding appropriately scaled versions of the six
triangles above and ending up with a figure (having an equilateral Morley
triangle) similar to △ABC. The generic picture is given in Figure 9.3. Before we can really count this argument as a proof, we need to say a bit more
about what the phrase "appropriately scaled" means. In order to appropriately scale the triangles (the small acute ones) that appear green in Figure 9.3
we have a relatively easy job: just scale them so that the side opposite the
trisected angle has length one; that way they will join perfectly with the
central equilateral triangle.
The triangles (these are the larger obtuse ones) that appear purple in Figure 9.3
are a bit more puzzling. Ostensibly, we have two different jobs to accomplish:
we must scale them so that both of the edges that they will share with green
triangles have the correct lengths. How do we know that this won't require
two different scaling factors? Conway also developed an elegant argument
that handles this question as well. Consider the purple triangle at the bottom
of the diagram in Figure 9.3; it has vertex angles (a, b, c**). It is possible
to construct triangles similar (via reflections) to the adjacent green triangles
(a, b*, c*) and (a*, b, c*) inside of triangle (a, b, c**). To do this just construct
two lines that go through the top vertex (where the angle c** is) that cut the
opposite edge at the angle c* in the two possible senses. These two lines will
coincide if it should happen that c* is precisely π/2, but generally there will
be two, and it is evident that the two line segments formed have the same
length. We scale the purple triangle so that this common length will be 1.
See Figure 9.4.
Figure 9.3: Conway's proof involves putting these pieces together to obtain
a triangle (with an equilateral Morley triangle) that is similar to △ABC.
Exercise. If it should happen that c* = π/2, what can we say about C?
Of course the other two obtuse triangles can be handled in a similar way.
Figure 9.4: The scaling factor for the obtuse triangles in Conway's puzzle
proof is determined so that the segments constructed in their midsts have
unit length.
Exercises 9.1
1. What value should we get if we sum all of the angles that appear around
one of the interior vertices in the finished diagram? Verify that all three
have the correct sum.
9.2
In this section we'll talk about another Book proof also due to John Conway.
This proof serves as an introduction to a really powerful general technique:
the idea of an invariant. An invariant is some sort of quantity that one can
calculate that itself doesn't change as other things are changed. Of course
different situations have different invariant quantities.
The setup here is simple and relatively intuitive. We have a bunch of
checkers on a checkerboard; in fact we have an infinite number of checkers,
but not filling up the whole board: they completely fill an infinite half-plane
which we could take to be the set
S = {(x, y) | x ∈ Z ∧ y ∈ Z ∧ y ≤ 0}.
See Figure 9.5.
Think of these checkers as an army and the upper half-plane is "enemy
territory." Our goal is to move one of our soldiers into enemy territory as far
as possible. The problem is that our soldiers move the way checkers do,
by jumping over another man (who is then removed from the board). It's
clear that we can get someone into enemy territory: just take someone in
the second row and jump a guy in the first row. It is also easy enough to
see that it is possible to get a man two steps into enemy territory: we could
bring two adjacent men a single step into enemy territory, have one of them
jump the other and then a man from the front rank can jump over him.
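The strategy just described can be replayed mechanically. The Python sketch below is a toy model under our own assumptions: the infinite army is cut down to a finite window of the lower half-plane, the helper name jump is hypothetical, and the particular coordinates chosen for the four moves are just one way of carrying out the description. A jump is legal when the jumping man and the man being jumped are both present and the landing square is empty.

```python
def jump(board, start, direction):
    """Return the board after the man at `start` jumps one man in `direction`.

    `board` is a frozenset of occupied (x, y) squares; the jumped man is
    removed and the jumper lands two squares away.
    """
    x, y = start
    dx, dy = direction
    over = (x + dx, y + dy)
    land = (x + 2 * dx, y + 2 * dy)
    if start not in board or over not in board or land in board:
        raise ValueError(f"illegal jump from {start} in direction {direction}")
    return (board - {start, over}) | {land}


# A finite window of the army: all lattice points with -4 <= x <= 4, -4 <= y <= 0.
army = frozenset((x, y) for x in range(-4, 5) for y in range(-4, 1))

UP, LEFT = (0, 1), (-1, 0)
moves = [
    ((0, -1), UP),    # a second-row man jumps the front rank, landing on (0, 1)
    ((1, -1), UP),    # his neighbour does the same, landing on (1, 1)
    ((1, 1), LEFT),   # one of them jumps the other, landing on (-1, 1)
    ((-1, 0), UP),    # a front-rank man jumps over him, landing on (-1, 2)
]

board = army
for start, direction in moves:
    board = jump(board, start, direction)

print(max(y for _, y in board), len(army) - len(board))
```

Each jump removes exactly one man from the board, so the difference in sizes counts the jumps; this gives a convenient way to test more efficient strategies of your own.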
Exercise. The strategy just stated uses 4 men (in the sense that they are
removed from the board; 5 if you count the one who ends up two steps into
enemy territory as well). Find a strategy for moving someone two steps into
enemy territory that is more efficient, that is, involves fewer jumps.
Exercise. Determine the most efficient way to get a man three steps into
enemy territory. An actual checkers board and pieces (or some coins, or
rocks) might come in handy.
Figure 9.5: An infinite number of checkers occupying the integer lattice points
such that y ≤ 0.
We'll count the man who ends up some number of steps above the x-axis,
as well as all the pieces who get jumped and removed from the board as a
measure of the efficiency of a strategy. If you did the last exercise correctly
you should have found that eight men are the minimum required to get 3
steps into enemy territory. So far, the number of men required to get a given
distance into enemy territory seems to always be a power of 2.
# of steps    # of men
1             2
2             4
3             8
As a picture is sometimes literally worth one thousand words, we include
here 3 figures illustrating the moves necessary to put a scout 1, 2 and 3 steps
into the void.
In order to show that 8 men are sufficient to get a scout 3 steps into
enemy territory, we show that it is possible to reproduce the configuration
that can place a man two steps in shifted up by one unit.
You may be surprised to learn that the pattern of 8 men which are needed
to get someone three steps into the void can be re-created (shifted up by
one unit) using just 12 men. This means that we can get a man 4 steps into
enemy territory using 12 + 8 = 20 men. You were expecting 16, weren't you?
(I know I was!)
Exercise. Determine how to get a marker 4 steps into the void.
The real surprise is that it is simply impossible to get a man five steps
into enemy territory. So the sequence we've been looking at actually goes
2, 4, 8, 20, ∞.
Figure 9.6: One man is sacrificed in order to move a scout one step into
enemy territory.
Figure 9.7: Three men are sacrificed in order to move a scout two steps into
enemy territory.
Figure 9.8: Eight men are needed to get a scout 3 steps into the void.
The proof of this surprising result works by using a fairly simple, but
clever, strategy. We assign a numerical value to a set of men that is dependent
on their positions; then we show that this value never increases when we
make "checker jumping" moves; finally we note that the value assigned to
a man in position (0, 5) is equal to the value of the entire original set of men
(that is, with all the positions in the lower half-plane occupied). This is a
pretty nice strategy, but how exactly are we going to assign these numerical
values?
A man's value is related to his distance from the point (0, 5) in what is
often called "the taxicab metric." We don't use the straight-line distance,
but rather determine the number of blocks we will have to drive in the
north-south direction and in the east-west direction and add them together. The
value of a set of men is the sum of their individual values. Since we need to
deal with the value of the set of men that completely fills the lower half-plane,
we are going to have to have most of these values be pretty tiny! To put it
in a more mature and dignified manner: the infinite sum of the values of the
men in our army must be convergent.
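If you would like to see the taxicab metric in action, here is a minimal Python sketch. The names `taxicab_distance` and `TARGET` are my own choices for illustration, not notation used elsewhere in the text.

```python
def taxicab_distance(p, q):
    """Taxicab distance: blocks driven east-west plus blocks driven north-south."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

TARGET = (0, 5)  # the square five steps into enemy territory

# A man on the x-axis directly below the target is 5 blocks away.
print(taxicab_distance((0, 0), TARGET))   # 5
# A man at (3, -2) is 3 blocks east-west plus 7 blocks north-south away.
print(taxicab_distance((3, -2), TARGET))  # 10
```

Notice that a straight-line distance would give a smaller answer for the second man; the taxicab distance is the one that matches how checkers actually travel when jumps are horizontal or vertical.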
We've previously seen geometric series which have convergent sums. Recall
that, when $|r| < 1$, the formula for such a sum is
\[ \sum_{k=0}^{\infty} a r^k = \frac{a}{1-r} \]
where $a$ is the initial term of the sum and $r$ is the common ratio between
terms.
Conway's big insight was to associate the powers of some number $r$ with
the positions on the board: $r^k$ goes on the squares that are distance $k$ from
the target location. If we have a man who is actually at the target location,
he will be worth $r^0$, or 1. We need to arrange for two things to happen: the
sum of all the powers of $r$ in the initial setup of the board must be less than
or equal to 1, and a checker-jumping move must never increase the sum.
Figure 9.10: In making a checker-jump move, two men valued $r^{k+1}$ and $r^{k+2}$
are replaced by a single man valued $r^k$.
If we choose $r$ so that $r^{k+2} + r^{k+1} \ge r^k$ then the checker-jumping move
will at worst leave the total sum fixed. Note that so long as $r < 1$ a
checker-jumping move that takes us away from the target position will certainly
decrease the total sum.
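To make the bookkeeping concrete, here is a small Python sketch of a single jump and its effect on the total value. It assumes that jumps are horizontal or vertical; the helper names `value` and `jump` are hypothetical, and the two trial values of $r$ (0.7 and 0.5) are just convenient test numbers.

```python
TARGET = (0, 5)

def value(men, r):
    """Sum of r^(taxicab distance to TARGET) over a finite set of men."""
    return sum(r ** (abs(x - TARGET[0]) + abs(y - TARGET[1])) for x, y in men)

def jump(men, start, over, land):
    """Return the men left after `start` jumps over `over` and lands on `land`."""
    assert start in men and over in men and land not in men
    step = (over[0] - start[0], over[1] - start[1])
    assert step in {(1, 0), (-1, 0), (0, 1), (0, -1)}  # jumped man is adjacent
    assert land == (over[0] + step[0], over[1] + step[1])  # land just beyond him
    return (men - {start, over}) | {land}

# Two men stacked on the y-axis; the lower one jumps the upper one.
men = {(0, 0), (0, -1)}
after = jump(men, (0, -1), (0, 0), (0, 1))

for r in (0.7, 0.5):
    print(r, value(men, r), value(after, r))
```

With $r = 0.7$ the total goes down after the jump, but with $r = 0.5$ it goes up, so not every $r < 1$ will do. That is exactly why the inequality above has to be examined.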
As is often the case, we'll analyze the inequality by looking instead at
the corresponding equality. What value of $r$ makes $r^{k+2} + r^{k+1} = r^k$?
Dividing through by $r^k$, we see that $r$ must be a root of $x^2 + x - 1$. The
roots are $-1.618033989\ldots$ and $0.618033989\ldots$; these decimal
approximations are actually $-\phi$ and $1/\phi$, where
\[ \phi = \frac{1+\sqrt{5}}{2} \]
is the famous golden ratio. If we are hoping for the sum over
all the occupied positions of $r^k$ to be convergent, we need $|r| < 1$, so the
negative solution is extraneous and so the inequality $r^{k+2} + r^{k+1} \ge r^k$ is
true in the interval $[1/\phi, 1)$.
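A quick numerical sanity check of these claims can be sketched in Python. Floating-point arithmetic is only approximate, so the comparisons below use a small tolerance; the function name `jump_change` is made up for this illustration.

```python
import math

phi = (1 + math.sqrt(5)) / 2   # the golden ratio
r = 1 / phi                    # the root of x^2 + x - 1 with |r| < 1

# Both -phi and 1/phi should (approximately) satisfy x^2 + x - 1 = 0.
for root in (-phi, r):
    print(root, root**2 + root - 1)

def jump_change(r, k):
    """Change in the total when men worth r^(k+1) and r^(k+2)
    are replaced by one man worth r^k (a jump toward the target)."""
    return r**k - (r**(k + 1) + r**(k + 2))

# At r = 1/phi a jump toward the target leaves the total fixed;
# for a larger r in the interval (here 0.8) it lowers the total.
print(jump_change(r, 3), jump_change(0.8, 3))
```

The first value printed on the last line is zero up to rounding error and the second is negative, in agreement with the interval found above.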
Next we want to look at the value of this invariant when men occupy
all of the positions with $y \le 0$. By looking at Figure 9.9 you can see that
there is a single square with value $r^5$, there are 3 squares with value $r^6$, there
are 5 squares with value $r^7$, et cetera. The sum, $S$, of the values of all the
initially occupied positions is
\[ S = r^5 \sum_{k=0}^{\infty} (2k+1) r^k . \]
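The counts 1, 3, 5, and so on can be double-checked by brute force. The sketch below tallies the squares of the lower half-plane by their taxicab distance from (0, 5); the cutoff `WIDTH` is an arbitrary assumption of mine, needed only because a computer cannot loop over infinitely many squares.

```python
from collections import Counter

TARGET = (0, 5)
WIDTH = 40  # arbitrary cutoff: examine only |x| <= WIDTH and -WIDTH <= y <= 0

counts = Counter()
for x in range(-WIDTH, WIDTH + 1):
    for y in range(-WIDTH, 1):
        d = abs(x - TARGET[0]) + abs(y - TARGET[1])
        counts[d] += 1

# Distances 5 through WIDTH + 5 lie entirely inside the window,
# so their counts are complete.
print([counts[d] for d in range(5, 10)])  # [1, 3, 5, 7, 9]
```

In general there are $2k + 1$ squares at distance $k + 5$, which is where the odd coefficients in the sum come from.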
We have previously seen how to solve for the value of an infinite sum
involving powers of r. In the expression above we have powers of r but also
multiplied by odd numbers. Can we solve something like this?
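Let's try the same trick that works for a geometric sum. Let
\[ T = \sum_{k=0}^{\infty} (2k+1) r^k = 1 + 3r + 5r^2 + 7r^3 + \cdots . \]
Note that
\[ rT = \sum_{k=0}^{\infty} (2k+1) r^{k+1} = r + 3r^2 + 5r^3 + \cdots \]
and so
\[ T - rT = 1 + 2 \sum_{k=1}^{\infty} r^k . \]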
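A bit more algebra (and the formula for the sum of a geometric series)
leads us to
\[ T = \frac{1}{1-r} \left( 1 + \frac{2r}{1-r} \right) \]
which simplifies to
\[ T = \frac{1+r}{(1-r)^2} . \]
Since $S = r^5 T$, we obtain
\[ S = \frac{r^5 + r^6}{(1-r)^2} . \]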
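If you are suspicious of the algebra, a numerical comparison is easy to sketch in Python. The function names below are hypothetical, and truncating the series after a few hundred terms is an approximation, not a proof.

```python
import math

def T_partial(r, terms):
    """Partial sum 1 + 3r + 5r^2 + ... with the given number of terms."""
    return sum((2 * k + 1) * r**k for k in range(terms))

def T_closed(r):
    """The closed form (1 + r) / (1 - r)^2, valid for |r| < 1."""
    return (1 + r) / (1 - r) ** 2

# Compare the truncated series with the closed form at a test value.
print(T_partial(0.5, 200), T_closed(0.5))

# Evaluate S = r^5 * T at r = 1/phi, the value of r singled out earlier.
r = 2 / (1 + math.sqrt(5))
S = r**5 * T_closed(r)
print(S)
```

The two numbers on the first line agree to many decimal places, and the value printed for S is numerically indistinguishable from 1. Of course a decimal printout is no substitute for the algebra requested in the exercises.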
It is interesting to proceed from this expression for $S$, using the fact that
$r$ satisfies $x^2 = 1 - x$.
Exercises 9.2
1. Do the algebra (and show all your work!) to prove that the invariant
defined in this section actually has the value 1 for the set of all the men
occupying the x-axis and the lower half-plane.
9.3
There's a nice sequence of matchstick puzzles that starts with "Use nine
non-overlapping matchsticks to form 4 triangles (all of the same size)." It's not
that hard, and after a while most people come up with an answer.
The kicker comes when you next ask them to use six matches to form 4
(equal sized) triangles. There's a picture of the solution to this new puzzle
at the back of this section. The answer involves thinking three-dimensionally,
so with that hint give it a try for a while before looking in the back.
Monge's circle theorem has nothing to do with matchsticks, but it is
a sweet example of a proof that works by moving to a higher dimension.
People often talk about "thinking outside of the box" when discussing critical
Figure 9.11: The setup for Monge's circle theorem: three randomly placed
circles; we are also showing the external tangents to one pair of circles.
Notice how the external tangents (footnote 5) to two of the circles meet in a point?
Unless the circles just happen to have exactly the same size (And what are
the odds of that?) this is going to be the case. Each pair of external tangents
is going to meet in a point. There are three such pairs of external tangents
and so they determine three points. I suppose, since these three points are
determined in a fairly complicated way from three randomly chosen circles,
that we would expect the three points to be pretty much random. Monge's
circle theorem says that that isn't so.
Theorem 9.3.1 (Monge's Circle Theorem). If three circles of different radii
in the Euclidean plane are chosen so that no circle lies in the interior of
another, the three pairs of external tangents to these circles meet in points
which are collinear.
In Figure 9.12 we see a complete example of Monge's circle theorem in
action. There are three random circles. There are three pairs of external
tangents. The three points determined by the intersection of the pairs of
external tangents lie on a line (shown dashed in the figure).
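For readers who like to experiment, here is a Python sketch that tests the theorem on one particular triple of circles. It relies on a standard fact that is not derived in this section: the external tangents of two circles with centers $c_1, c_2$ and unequal radii $r_1, r_2$ meet at the point $(r_1 c_2 - r_2 c_1)/(r_1 - r_2)$. The circles themselves and the function names are arbitrary choices of mine.

```python
def external_center(c1, r1, c2, r2):
    """Meeting point of the two external tangents of circles (c1, r1), (c2, r2).

    Uses the formula (r1*c2 - r2*c1) / (r1 - r2), so it requires r1 != r2.
    """
    (x1, y1), (x2, y2) = c1, c2
    return ((r1 * x2 - r2 * x1) / (r1 - r2), (r1 * y2 - r2 * y1) / (r1 - r2))

def collinear(p, q, s, tol=1e-9):
    """True when the triangle with vertices p, q, s has (nearly) zero area."""
    twice_area = (q[0] - p[0]) * (s[1] - p[1]) - (q[1] - p[1]) * (s[0] - p[0])
    return abs(twice_area) < tol

# Three circles of different radii, none lying inside another.
circles = [((0.0, 0.0), 1.0), ((5.0, 1.0), 2.0), ((2.0, 6.0), 3.0)]
(a, ra), (b, rb), (c, rc) = circles

points = [external_center(a, ra, b, rb),
          external_center(a, ra, c, rc),
          external_center(b, rb, c, rc)]
print(points)             # [(-5.0, -1.0), (-1.0, -3.0), (11.0, -9.0)]
print(collinear(*points))  # True
```

All three points lie on the line $x + 2y = -7$. One example proves nothing, naturally, but changing the circles and rerunning the check is a pleasant way to become convinced that the theorem deserves a proof.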
We won't even try to write up a formal proof of the circle theorem. Not
that it can't be done; it's just that you can probably get the point better
via an informal discussion.
The main idea is simply to move to 3-dimensional space. Imagine the
original flat plane containing our three random circles as being the plane
$z = 0$ in Euclidean 3-space. Replace the three circles by three spheres
having the same radii and the same centers as the circles; clearly the
intersections of these spheres with the plane $z = 0$ will be our original
circles. While pairs of circles are encompassed by two lines (the external
tangents that we've been discussing so much), when we have a pair of spheres
in 3-space, they are encompassed by a cone which lies tangent to both spheres
(footnote 6). Notice that the cones that lie tangent to a pair of spheres
intersect the plane precisely in those infamous external tangents.
Footnote 5: The reason I keep saying "external tangents" is that there are also internal tangents.
Well, okay, we've moved to 3-d. We've replaced our circles with spheres
and our external tangents with tangent cones. The points of intersection of
the external tangents are now the tips of the cones. But, what good has this
all done? Is there any reason to believe that the tips of those cones lie in a
line?
Actually, yes! There is a plane that touches all three spheres tangentially.
Actually, there are two such planes, one that touches them all on their upper
surfaces and one that touches them all on their lower surfaces. Oh damn!
There are actually lots of planes that are tangent to all three spheres but
only one that lies above the three of them. That plane intersects the plane
$z = 0$ in a line. Nothing fancy there: any pair of non-parallel planes will
intersect in a line (and the only way the planes we are discussing would be
parallel is if all three spheres just happened to be the same size). But that
plane also lies tangent to the cones that envelop our spheres, and so that
plane (as well as the plane z = 0) contains the tips of the cones!
Footnote 6: As before, when the spheres happen to have identical radii we get a degenerate case.
Figure 9.13: Six matchsticks (actually, pencils are a lot easier to hold) can be
arranged three-dimensionally to create four triangles.
Exercises 9.3
1. There is a scenario where the proof we have sketched for Monge's circle
theorem doesn't really work. Can you envision it? Hint: consider two
relatively large spheres and one that is quite small.
420
Bibliography
[1] R. E. Greenwood A. M. Gleason and L. M. Kelly. The William Lowell
Putnam Mathematical Competition Problems & Solutions: 1938-1964.
The Mathematical Association of America, Reissued 2003.
[2] Martin Aigner and Gunter M. Ziegler.
contributors.
Cantor-bernstein-schroeder
theorem.
422
BIBLIOGRAPHY
[8] Richard K. Guy. The lighthouse theorem, Morley & Malfatti a budget
of paradoxes. American Mathematical Monthly, 2007.
[9] Bjorn Poonen Kiran S. Kedlaya and Ravi Vakil. The William Lowell
Putnam Mathematical Competition 1985-2000: Problems Solutions, and
Commentary. The Mathematical Association of America, 2002.
[10] C. W. H. Lam. The search for a finite projective plane of order 10.
http://www.www.cecm.sfu.ca/organics/papers/lam/paper/html/paper.html.
[11] Saunders MacLane.
http://www-
Radziszowski.
Small
ramsey
numbers.
http://www.combinatorics.org/Surveys/ds1.pdf.
[14] Gian-Carlo Rota. Indiscrete Thoughts. Birkhauser, 1997.
[15] M. Satyanarayana. none given. Math. Quest. Educ. Times (New Series),
1909.
[16] D. J. Struik. A Source Book in Mathematics, 1200-1800. Princeton
University Press, 1986.
[17] Alfred North Whitehead and Bertrand Russell. Principia Mathematica.
Cambridge University Press, 1910.
Preamble
The purpose of this License is to make a manual, textbook, or other functional
and useful document "free" in the sense of freedom: to assure everyone
the effective freedom to copy and redistribute it, with or without modifying
it, either commercially or noncommercially. Secondarily, this License preserves
for the author and publisher a way to get credit for their work, while
not being considered responsible for modifications made by others.
This License is a kind of "copyleft", which means that derivative works
of the document must themselves be free in the same sense. It complements
the GNU General Public License, which is a copyleft license designed for free
software.
We have designed this License in order to use it for manuals for free
software, because free software needs free documentation: a free program
should come with manuals providing the same freedoms that the software
does. But this License is not limited to software manuals; it can be used
for any textual work, regardless of subject matter or whether it is published
as a printed book. We recommend this License principally for works whose
purpose is instruction or reference.
requires to appear in the title page. For works in formats which do not have
any title page as such, "Title Page" means the text near the most prominent
appearance of the work's title, preceding the beginning of the body of the
text.
The "publisher" means any person or entity that distributes copies of
the Document to the public.
A section "Entitled XYZ" means a named subunit of the Document
whose title either is precisely XYZ or contains XYZ in parentheses following
text that translates XYZ in another language. (Here XYZ stands for
a specific section name mentioned below, such as "Acknowledgements",
"Dedications", "Endorsements", or "History".) To "Preserve the Title" of
such a section when you modify the Document means that it remains
a section "Entitled XYZ" according to this definition.
The Document may include Warranty Disclaimers next to the notice
which states that this License applies to the Document. These Warranty
Disclaimers are considered to be included by reference in this License, but
only as regards disclaiming warranties: any other implication that these Warranty
Disclaimers may have is void and has no effect on the meaning of this
License.
2. VERBATIM COPYING
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are
reproduced in all copies, and that you add no other conditions whatsoever
to those of this License. You may not use technical measures to obstruct or
control the reading or further copying of the copies you make or distribute.
However, you may accept compensation in exchange for copies. If you distribute
a large enough number of copies you must also follow the conditions
in section 3.
You may also lend copies, under the same conditions stated above, and
you may publicly display copies.
3. COPYING IN QUANTITY
If you publish printed copies (or copies in media that commonly have
printed covers) of the Document, numbering more than 100, and the Document's
license notice requires Cover Texts, you must enclose the copies in
covers that carry, clearly and legibly, all these Cover Texts: Front-Cover
Texts on the front cover, and Back-Cover Texts on the back cover. Both
covers must also clearly and legibly identify you as the publisher of these
copies. The front cover must present the full title with all words of the title
equally prominent and visible. You may add other material on the covers in
addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated
as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly,
you should put the first ones listed (as many as fit reasonably) on the actual
cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent
copy along with each Opaque copy, or state in or with each Opaque copy
a computer-network location from which the general network-using public
has access to download using public-standard network protocols a complete
Transparent copy of the Document, free of added material. If you use the
latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy
will remain thus accessible at the stated location until at least one year after
the last time you distribute an Opaque copy (directly or through your agents
or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the
Document well before redistributing any large number of copies, to give them
a chance to provide you with an updated version of the Document.
4. MODIFICATIONS
You may copy and distribute a Modified Version of the Document under
the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling
the role of the Document, thus licensing distribution and modification of the
Modified Version to whoever possesses a copy of it. In addition, you must
do these things in the Modified Version:
A. Use in the Title Page (and on the covers, if any) a title distinct from that
of the Document, and from those of previous versions (which should, if
there were any, be listed in the History section of the Document). You
may use the same title as a previous version if the original publisher of
that version gives permission.
B. List on the Title Page, as authors, one or more persons or entities
responsible for authorship of the modifications in the Modified Version,
together with at least five of the principal authors of the Document (all
of its principal authors, if it has fewer than five), unless they release
you from this requirement.
C. State on the Title page the name of the publisher of the Modified
Version, as the publisher.
D. Preserve all the copyright notices of the Document.
5. COMBINING DOCUMENTS
You may combine the Document with other documents released under
this License, under the terms defined in section 4 above for modified versions,
provided that you include in the combination all of the Invariant Sections
of all of the original documents, unmodified, and list them all as Invariant
Sections of your combined work in its license notice, and that you preserve
all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there
are multiple Invariant Sections with the same name but different contents,
make the title of each such section unique by adding at the end of it, in
parentheses, the name of the original author or publisher of that section if
known, or else a unique number. Make the same adjustment to the section
titles in the list of Invariant Sections in the license notice of the combined
work.
In the combination, you must combine any sections Entitled "History"
in the various original documents, forming one section Entitled "History";
likewise combine any sections Entitled "Acknowledgements", and any sections
Entitled "Dedications". You must delete all sections Entitled "Endorsements".
6. COLLECTIONS OF DOCUMENTS
You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this
License in the various documents with a single copy that is included in the
collection, provided that you follow the rules of this License for verbatim
copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute
it individually under this License, provided you insert a copy of this License
into the extracted document, and follow this License in all other respects
regarding verbatim copying of that document.
7. AGGREGATION WITH INDEPENDENT WORKS
A compilation of the Document or its derivatives with other separate
and independent documents or works, in or on a volume of a storage or
distribution medium, is called an "aggregate" if the copyright resulting from
the compilation is not used to limit the legal rights of the compilation's users
beyond what the individual works permit. When the Document is included in
an aggregate, this License does not apply to the other works in the aggregate
which are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies
of the Document, then if the Document is less than one half of the entire
aggregate, the Document's Cover Texts may be placed on covers that bracket
the Document within the aggregate, or the electronic equivalent of covers if
the Document is in electronic form. Otherwise they must appear on printed
covers that bracket the whole aggregate.
8. TRANSLATION
Translation is considered a kind of modification, so you may distribute
translations of the Document under the terms of section 4. Replacing Invariant
Sections with translations requires special permission from their copyright
holders, but you may include translations of some or all Invariant Sections
in addition to the original versions of these Invariant Sections. You
may include a translation of this License, and all the license notices in the
Document, and any Warranty Disclaimers, provided that you also include
the original English version of this License and the original versions of those
notices and disclaimers. In case of a disagreement between the translation
and the original version of this License or a notice or disclaimer, the original
version will prevail.
If a section in the Document is Entitled "Acknowledgements", "Dedications",
or "History", the requirement (section 4) to Preserve its Title (section 1)
will typically require changing the actual title.
9. TERMINATION
You may not copy, modify, sublicense, or distribute the Document except
as expressly provided under this License. Any attempt otherwise to copy,
modify, sublicense, or distribute it is void, and will automatically terminate
your rights under this License.
However, if you cease all violation of this License, then your license from
a particular copyright holder is reinstated (a) provisionally, unless and until
the copyright holder explicitly and finally terminates your license, and (b)
permanently, if the copyright holder fails to notify you of the violation by
some reasonable means prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated
permanently if the copyright holder notifies you of the violation by some
reasonable means, this is the first time you have received notice of violation
of this License (for any work) from that copyright holder, and you cure the
violation prior to 30 days after your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under this
License. If your rights have been terminated and not permanently reinstated,
receipt of a copy of some or all of the same material does not give you any
rights to use it.
11. RELICENSING
"Massive Multiauthor Collaboration Site" (or "MMC Site") means any
World Wide Web server that publishes copyrightable works and also provides
prominent facilities for anybody to edit those works. A public wiki
that anybody can edit is an example of such a server. A "Massive Multiauthor
Collaboration" (or "MMC") contained in the site means any set of
copyrightable works thus published on the MMC site.
"CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0
license published by Creative Commons Corporation, a not-for-profit corporation
with a principal place of business in San Francisco, California, as well
as future copyleft versions of that license published by that same organization.
"Incorporate" means to publish or republish a Document, in whole or in
part, as part of another Document.
An MMC is "eligible for relicensing" if it is licensed under this License,
and if all works that were first published under this License somewhere other
than this MMC, and subsequently incorporated in whole or in part into
the MMC, (1) had no cover texts or invariant sections, and (2) were thus
incorporated prior to November 1, 2008.
The operator of an MMC Site may republish an MMC contained in the
site under CC-BY-SA on the same site at any time before August 1, 2009,
provided the MMC is eligible for relicensing.
version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled "GNU Free
Documentation License".
with the Invariant Sections being LIST THEIR TITLES, with the
Front-Cover Texts being LIST, and with the Back-Cover Texts
being LIST.
If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use
in free software.