Discrete Mathematics and Functional Programming
Thomas VanDrunen
I Set 1
II Logic 33
III Proof 71
IV Algorithm 91
V Relation 127
VI Function 155
Contents
Preface ix
I Set 1
1 Sets and elements 3
1.1 Your mathematical biography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Reasoning about items collectively . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Intuition about sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Set notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3 Set Operations 19
3.1 Axiomatic foundations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 Operations and visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.3 Powersets, cartesian products, and partitions . . . . . . . . . . . . . . . . . . . . . . 22
II Logic 33
5 Logical Propositions and Forms 35
5.1 Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.2 Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.3 Boolean values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.4 Truth tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.5 Logical equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
6 Conditionals 43
6.1 Conditional propositions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.2 Negation of a conditional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.3 Converse, inverse, and contrapositive . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.4 Writing conditionals in English . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.5 Conditional expressions in ML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
7 Argument forms 49
7.1 Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
7.2 Common syllogisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
7.3 Using argument forms for deduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
III Proof 71
10 Subset proofs 73
10.1 Introductory remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
10.2 Forms for proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
10.3 An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
10.4 Closing remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
12 Conditional proofs 83
12.1 Worlds of make believe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
12.2 Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
12.3 Biconditionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
12.4 Warnings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
IV Algorithm 91
14 Algorithms 93
14.1 Problem-solving steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
14.2 Repetition and change . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
14.3 Packaging and parameterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
14.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
15 Induction 101
15.1 Calculating a powerset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
15.2 Proof of powerset size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
15.3 Mathematical induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
15.4 Induction gone awry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
15.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
V Relation 127
19 Relations 129
19.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
19.2 Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
19.3 Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
21 Closures 141
21.1 Transitive failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
21.2 Transitive and other closures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
21.3 Computing the transitive closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
21.4 Relations as predicates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
VI Function 155
23 Functions 157
23.1 Intuition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
23.2 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
23.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
23.4 Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
24 Images 163
24.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
24.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
24.3 Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
31 Combinatorics 205
31.1 Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
31.2 Permutations and combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
31.3 Computing combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
36 Isomorphisms 229
36.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
36.2 Isomorphic invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
36.3 The isomorphic relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
36.4 Final bow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Preface
If you have discussed your schedule this semester with anyone, you have probably been asked what
discrete mathematics is—or perhaps someone has asked what could make math indiscreet. While
discrete mathematics is something few people outside of mathematical fields have heard of, it
comprises topics that are fundamental to mathematics; to gather these topics together into one
course is a more recent phenomenon in the mathematics curriculum. Because these topics are sometimes
treated separately or in various other places in an undergraduate course of study in mathematics,
discrete math texts and courses can appear like hodge-podges, and unifying themes are sometimes hard
to identify. Here we will attempt to shed some light on the matter.
Discrete mathematics topics include symbolic logic and proofs, including proof by induction;
number theory; set theory; functions and relations on sets; graph theory; algorithms, their analysis,
and their correctness; matrices; sequences and recurrence relations; counting and combinatorics;
discrete probability; and languages and automata. All of these would be appropriate in other courses
or in their own course. Why teach them together? For one thing, students in a field like computer
science need a basic knowledge of many of these topics but do not have time to take full courses in
all of them; one course that meanders through these topics is thus a practical compromise. (And
no one-semester course could possibly touch all of them; we will be completely skipping matrices,
probability, languages, and automata, while number theory, sequences, recurrence relations, counting,
and combinatorics will receive only passing attention.)
However, all these topics do have something in common which distinguishes them from much of
the rest of mathematics. Subjects like calculus, analysis, and differential equations, anything that
deals with the real or complex numbers, can be put under the heading of continuous mathematics,
where a continuum of values is always in view. In contrast to this, discrete mathematics always has
separable, indivisible, quantized (that is, discrete) objects in view—things like sets, integers, truth
values, or vertices in a graph. Thus discrete math stands towards continuous math in the same way
that digital devices stand toward analog. Imagine the difference between an electric stove and a gas
stove. A gas stove has a knob which in theory can be set to an infinite number of positions between
high and low, but the discrete states of an electric stove are a finite, numbered set.
This particular course, however, does something more. Here we also intertwine functional pro-
gramming with the discrete math topics. Functional programming is a different style or paradigm
from the procedural, imperative, and/or object-oriented approach that those of you who have pro-
grammed before have seen (which, incidentally, should place students who have programming ex-
perience and those who have not on a level playing field). Instead of viewing a computer program
as a collection of commands given to a computer, we see a program as a collection of interacting
functions, in the mathematical sense. Since functions are a major topic of discrete math anyway,
the interplay is natural. As we shall see, functional programming is a useful forum for illustrating
the other discrete math topics as well.
But like any course, especially at a liberal arts college, our main goal is to make you think better.
You should leave this course with a sharper understanding of categorical reasoning, the ability to
analyze logic formally, an appreciation for the precision of mathematical proofs, and the clarity of
thought necessary to arrange tasks into an algorithm. The slogan of this course is, “Math majors
should learn to write programs and computer science majors should learn to write proofs together.”
Math majors will spend most of their time as undergraduates proving things, and computer science
majors will do a lot of programming; yet they both need to do a little of the other. The fact is,
robust programs and correct proofs have a lot to do with each other, not the least of which is that
they both require clear, logical thinking. We will see how proof techniques will allow us to check
that an algorithm is correct, and that proofs can prompt algorithms. Moreover, the programming
component motivates the proof-based discrete math for the computer science majors and keeps it
relevant; the proof component should lighten the unfamiliarity that math majors often experience
in a programming course.
There are three major theme pairs that run throughout this course. The theme of proof and
program has already been explained. The next is symbol and representation. So much of
precise mathematics relies on accurate and informative notation, and it is important to distinguish
the difference between a symbol and the idea it represents. This is also a point where math and
computer science streams of thought swirl; much of our programming discussions will focus on the
best ways to represent mathematical concepts and structure on a computer. Finally, the theme of
analysis and synthesis will recur. Analysis is the taking of something apart; synthesis is putting
something together. This pattern occurs frequently in proofs. Take any proposition in the form “if
q then p.” q will involve some definition that will need to be analyzed straight away to determine
what is really being asserted. The proof will end by assembling the components according to the
definitions of the terms used in p. Likewise in functional programming, we will practice decomposing
a problem into its parts and synthesizing smaller solutions into a complete solution.
This course covers a lot of material, and much of it is challenging. However, with careful practice,
none of it is beyond the grasp of anyone with the mathematical maturity that is usually achieved
around the time a student takes calculus.
Part I
Set
Chapter 1
Sets and elements
The whole number operation of subtraction eventually forced you to face a dilemma: what
happens if you subtract a larger number from a smaller number? Since W is insufficient to answer
this, negative numbers were invented. We call all whole numbers with their opposites (that is, their
negative counterparts) integers, and we use Z (from Zahlen, the German word for “numbers”) to
symbolize the integers.
can split five apples. Physically, they could chop one of the apples into two equal parts and each get
one part, but how can you describe the resulting quantity that each caveman would get? Human
languages handle this with words like “half”; mathematics handles this with fractions, like 5/2 or the
equivalent 2 1/2, which is shorthand for 2 + 1/2. We call numbers that can be written as fractions (that
is, ratios of integers) rational numbers, symbolized by Q (for quotient). Since a number like 5 can
be written as 5/1, all integers are rational numbers.
All of these designations ought to be second nature to you. A lesser known distinction that
you may or may not remember is that real numbers can be split up into two camps: algebraic
numbers (A), each of which is a root of some polynomial function, like √2 and all the integers; and
transcendental numbers (T), which are not.
We first considered negative numbers when we invented the integers. However, as we expanded
to rationals and reals, we introduced both new negative numbers and new positive numbers. Thus
negative (real) numbers considered as a collection (R− ) cut across all of these other collections,
except W and N.
To finish off the picture, remember how N, Z, and Q each in turn proved to be inadequate because
of operations we wished to perform on them. Likewise R is inadequate for operations like √−1. To
handle that, we have complex numbers, C.
The set of car models produced by Ford (different from the set of cars produced by Ford).
The set of entrees served at Bon Appetit.
The set of the Fruits of the Spirit.
Since set is a noun, we can even have a set of sets; for example, the set of number sets included
in R, which would contain Q, Z, W, and N. In theory, a set can even contain itself—the set of things
mentioned on this page is itself mentioned on this page and thus includes itself—though that leads
to some paradoxes.
Hrbacek and Jech give an important clarification to our intuition of what a set is:
Sets are not objects of the real world, like tables or stars; they are created by our mind,
not by our hands. A heap of potatoes is not a set of potatoes, the set of all molecules in
a drop of water is not the same object as that drop of water[9].
It is legitimate, though, to speak of the set of molecules in the drop and of the set of potatoes in
the heap.
Red ≠ {Red}
Using this system, {} stands for a set with no elements, that is, the empty set, but we also have a
special symbol for that, ∅. The symbol ∈ stands for set membership and should be read “an element
of” or “is an element of”, depending on the grammatical context (sometimes just “in” works if you
are reading quickly).
N = {x|x ∈ Z, x > 0}
which reads “the set of natural numbers is the set of all x such that x is an integer and x is
greater than 0.” Recall from analysis that you can specify a range on the real number line, say all
real numbers from 1 exclusive to 5 inclusive, by the notation (1, 5]. Indeed, a range is a set; note
Those with Hebraic tendencies will appreciate the less-used ⊇, standing for superset, that is,
X ⊇ Y if Y is completely contained in X. Also, if you want to exclude the possibility that a subset
is equal to the larger set (say X is contained in Y, but Y has some elements not in X), what you
have in mind is called a proper subset, symbolized by X ⊂ Y. Compare ⊆ and ⊂ with ≤ and <.
Rarely, though, will we want to restrict ourselves to proper subsets.
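As a worked example, the number sets from Section 1.1 form a chain of proper subsets (the chain itself is standard; the particular memberships cited are the ones this chapter has already established):

```latex
\mathbb{N} \subset \mathbb{W} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}
```

Each containment is proper: 0 ∈ W but 0 ∉ N, and −5 ∈ Z but −5 ∉ W. Of course N ⊆ R is also true, since every proper subset is in particular a subset.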
Often it is useful to take all the elements in two or more sets and consider them together. The
resulting set is called the union, and its construction is symbolized by ∪. The union of two sets is
the set of elements in either set.
If any element occurs in both of the original sets, it still occurs only once in the resulting set.
There is no notion of something occurring twice in a set. On the other hand, sometimes we will want
to consider only the elements in both sets. We call that the intersection, and use ∩.
At this point it is very important to understand that X ∩ Y means “the set where X and Y
overlap.” It does not mean “X and Y overlap at some point.” It is a noun, not a sentence. This
will be reemphasized in the next chapter.
The fanciest operation on sets for this chapter is set difference, which expresses the set
resulting when we remove all the elements of one set from another. We use the subtraction symbol
for this, say X − Y. Y may or may not be a subset of X.
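For concreteness, here is a small worked example of all three operations (these particular sets are illustrative, not taken from the text). Let X = {1, 2, 3, 4} and Y = {3, 4, 5}. Then:

```latex
X \cup Y = \{1,2,3,4,5\} \qquad X \cap Y = \{3,4\} \qquad X - Y = \{1,2\} \qquad Y - X = \{5\}
```

Note that 3 and 4 occur only once in the union, and that X − Y ≠ Y − X in general.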
Finally, those circle diagrams have a name. Venn diagrams are so called after their inventor
John Venn. They help visualize how different sets relate to each other in terms of containment and
overlap. Note that the areas of the regions have no meaning—a large area might not contain more
elements than a small area, for example.
Exercises

Let T be the set of trees, D be the set of deciduous trees, and C be the set of coniferous trees. In
exercises 1–6, write the statement symbolically.

1. Oak is a deciduous tree.
2. Pine is not a deciduous tree.
3. All coniferous trees are trees.
4. Deciduous trees are those that are trees but are not coniferous.
5. Deciduous trees and coniferous trees together make all trees.
6. There is no tree that is both deciduous and coniferous.
7. Write [2.3, 9.5) in set notation.

In exercises 8–15, determine whether each statement is true or false.

8. −12 ∈ N.
9. A ⊆ C.
10. R ⊆ C ∩ R−.
11. 4 ∈ C.
12. Q ∩ T = ∅.
13. 1/63 ∈ Q − R.
14. Z − R− = W.
15. T ∪ Z ⊆ A.
16. All of the labeled sets we considered in Section 1.1 have an infinite number of elements, even
though some are completely contained in others. (We will later consider whether all infinities
should be considered equal.) However, two regions have a finite number of elements.
   a. Describe the first shaded region. How many elements does it have?
   b. Describe the second shaded region. How many elements does it have?

[Diagram: the number sets of Section 1.1 with two shaded regions; ticks at −5, 0, 3, 5, and 7.]
Chapter 2
Expressions and Types
2.1 Expressions
One of the most important themes in this course is the modeling or representation of mathematical
concepts in a computer system. Ultimately, the concepts are modeled in computer memory; the
arrangements of bits is information which we interpret as representing certain concepts, and we
program the computer to operate on that information in a way consistent with our interpretation.
An expression is a programming language construct that expresses something. ML’s interactive
mode works as a cycle: you enter an expression, which it will evaluate. A value is the result of the
evaluation of an expression. To relate this to the previous chapter, a value is like an element. An
expression is a way to describe that element. For example, 5 and 7 − 2 are two ways to express the
same element of N.
When you start ML, you will see a hyphen, which is ML’s prompt, indicating it is waiting for
you to enter an expression. The way you communicate to ML is to enter an expression followed by
a semicolon and pressing the “enter” key. (If you press “enter” before the expression is finished, you
will get a slightly different prompt, marked by an equals sign; this indicates ML assumes you have
more to say.)
Try entering 5 into the ML prompt. Text that the user types into the prompt will be in
typewriter font; ML’s response will be in slanted typewriter font.
- 5;
val it = 5 : int
<expression> ;
val is short for “value”, indicating this is the value the ML interpreter has found for the expression
you entered.
it is a variable. A variable is a symbol that represents a value in a given context. Note that this
means that a variable, too, is an expression; however, unlike the symbol 5, the value associated
with the variable changes as you declare it to. Variables in ML are like those you are familiar
with from mathematics (and other programming languages, if you have programmed before),
and you can think of a variable as a box that stores values. Unless directed otherwise, ML
automatically stores the value of the most recently evaluated expression in a variable called
it.
5 is the value of the expression (not surprisingly, the value of 5 is 5 ).
int is the type of the expression (in this case, short for “integer”), about which we will say more
soon.
We can make more interesting expressions using mathematical operators. We can enter
- 7 - 2;
val it = 5 : int
Note that this expression itself contains two other expressions, 7 and 2. Smaller expressions that
compose a larger expression are called subexpressions of that expression. - is an operator, and the
subexpressions are the operands of that operator. + means what you would expect, * stands for
multiplication, and ~ is used as a negative sign (having one operand, to distinguish it from -, which
has two); division we will discuss later. To express (and calculate) 67 + 4 × −13, type
- 67 + 4 * ~ 13;
val it = 15 : int
2.2 Types
So far, all these values have had the type int (we will use sans serif font for types). A type is a set
of values that are related by the operations that can be performed on them. This provides another
example of modeling concepts on a computer: a type models our concept of set.
Nevertheless, this also demonstrates the limitations of modeling because types are more restricted
than our general concept of a set. ML does not provide a way to use the concepts of subsets, unions,
or intersections on types. We will later study other ways to model sets to support these concepts.
Moreover, the type int, although it corresponds to the set Z in terms of how we interpret it, does
not equal the set Z. The values (elements) of int are computer representations of integers, not the
integers themselves, and since computer memory is limited, int comprises only a finite number of
values. On the computer used to write this book, the largest integer ML recognizes is 1073741823.
Although 1073741824 ∈ Z, it is not a valid ML int.
- 1073741824;
ML also has a type real corresponding to R. The operators you have already seen are also defined
for reals, plus / for division.
- ~4.73;
- 5.3 - 0.3;
Notice that 5.0 has type real, not type int. Again the set modeling breaks down. int is not a
subset (or subtype) of real, and 5.0 is a completely different value from 5 .
A consequence of int and real being unrelated is that you cannot mix them in arithmetic expres-
sions. English requires that the subject of a sentence have the same number (singular or plural) as
the main verb, which is why it does not allow a sentence like, “Two dogs walks down the street.”
This is called subject-verb agreement. In the same way, these ML operators require type agreement.
That is, +, for example, is defined for adding two reals and for adding two ints, but not one of each.
Attempting to mix them will generate an error.
- 7.3 + 5;
This rule guarantees that the result of an arithmetic operation will have the same type as the
operands. This complicates the division operation on ints. We expect that 5 ÷ 4 = 1.25—as we
noted in the previous chapter, division takes us out of the circle of integers. Actually, the / operator
is not defined for ints at all.
- 5/4;
Instead, another operator performs integer division, which computes the integer quotient (that is,
ignoring the remainder) resulting from dividing two integers. Such an operation is different enough
from real number division that it uses a different symbol: the word div. The remainder is calculated
by the modulus operator, mod.
- 5 div 3;
val it = 1 : int
- 5 mod 3;
val it = 2 : int
But would it not be useful to include both reals and ints in some computations? Yes, but to
preserve type purity and reduce the chance of error, ML requires that you convert such values
explicitly using one of the converters in the table below. Note the use of parentheses. These
“converters” are functions, as we will see in a later chapter.
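The table of converters is not reproduced in this excerpt; the following sketch shows the standard top-level conversion functions a Standard ML system such as SML/NJ provides (an assumption standing in for the lost table):

```sml
(* int -> real *)
real(6);       (* val it = 6.0 : real *)
(* real -> int, with different rounding behaviors *)
trunc(15.3);   (* toward zero:   val it = 15 : int *)
floor(15.7);   (* round down:    val it = 15 : int *)
ceil(15.2);    (* round up:      val it = 16 : int *)
round(15.7);   (* to nearest:    val it = 16 : int *)
```

Note that each converter is applied with parentheses around its operand, matching the examples that follow.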
For example,
- 15.3 / real(6);
- trunc(15.3) div 6;
val it = 2 : int
2.3 Variables
Since variables are expressions, they are fair game for entering into a prompt.
- it;
val it = 2 : int
We too little appreciate what a powerful thing it is to know a name. Having a name by which
to call something allows one to exercise a certain measure of control over it. As Elwood Dowd said
when meeting Harvey the rabbit, “You have the advantage on me. You know my name—and I don’t
know yours”[2]. More seriously, the Third Commandment shows how zealous God is for the right
use of his name. Geerhardus Vos comments:
It is not sufficient to think of swearing and blasphemy in the present-day common sense
of these terms. The word is one of the chief powers of pagan superstition, and the most
potent form of word-magic is name-magic. It was believed that through the pronouncing
of the name of some supernatural entity this can be compelled to do the bidding of
the magic-user. The commandment applies to the divine disapproval of such practices
specifically to the name “Jehovah.” [13].
Compare also Ex 6:2 and Rev 19:12. A name is a blessing, as in Gen 32:26-29 and Rev 2:17.
Even more so, to give a name to something is an act of dominion over it, as in Gen 2:19-20. Think of
how in the game of tag the player who is “it” has the power to place that name on someone else.
The name it in ML gives us the power to recall the previous value:
- it * 15;
val it = 30 : int
- it div 10;
val it = 3 : int
To name a value something other than it, imitate the interpreter’s response using val, the
desired variable, and equals, something in the form of
- val x = 5;
val x = 5 : int
You could also add a colon and a type after the expression, like the interpreter does in its response,
but there is no need to—the interpreter can figure that out on its own. However, we will see a few
occasions much later when complicated expressions need an explicit typing for disambiguation.
An identifier is a programmer-given name, such as a variable. ML has the following rules for
valid identifiers:
A valid alphanumeric identifier is a sequence of letters, digits, underscores, and primes ('),
beginning with a letter.
It is conventional to use mainly lowercase letters in variables. If you use several words joined
together to make a variable, capitalize the first letter of the subsequent words.
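As an example of these conventions, here is a sketch in the spirit of the seconds-in-a-year computation that Exercise 5 refers to (the variable names are illustrative, not necessarily those of Section 2.3):

```sml
(* multi-word variables use lowercase with subsequent words capitalized *)
val secondsInMinute = 60;
val minutesInHour = 60;
val hoursInDay = 24;
val daysInYear = 365;
(* combine the named values into the final count *)
val secondsInYear = secondsInMinute * minutesInHour * hoursInDay * daysInYear;
(* val secondsInYear = 31536000 : int *)
```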
When defining a datatype, separate the elements by vertical lines called “pipes,” a character
that appears with a break in the middle on some keyboards. The name of the type and the elements
must be valid identifiers. As demonstrated here, it is conventional for the names of types to be all
lower case, whereas the elements have their first letters capitalized. Notice that in its response, ML
alphabetizes the elements; as with any set, their order does not matter. Until we learn to define
functions, there is little of interest we can use datatypes for. Arithmetic operators, not surprisingly,
cannot be used.
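The datatype declaration itself falls outside this excerpt; a hypothetical declaration in the style described (the type name and elements here are made up for illustration) would look like:

```sml
(* type name all lowercase; element names capitalized; elements separated by pipes *)
datatype flavor = Vanilla | Chocolate | Strawberry;
(* per the description above, the interpreter's response lists the elements
   alphabetized: datatype flavor = Chocolate | Strawberry | Vanilla *)
```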
- cat + evergreen;
Exercises

1. Determine the type of each of the following.
   (a) 5.3 + 0.3
   (b) 5.3 - 0.3
   (c) 5.3 < 0.3
   (d) 24.0 / 6.0
   (e) 24.0 * 6.0
   (f) 24 * 6
2. Is ceil(15.2) an expression? If no, why not? If so, what is its value and what is its type?
Would ML accept ceil(15)? Why or why not?
3. Make two points, stored in variables point1 and point2, and calculate the distance between
the two points. (Recall that this can be done using the Pythagorean theorem.)
4. Which of the following are valid ML identifiers?
   (a) wheaton
   (b) wheaton college
   (c) wheaton'college
   (d) wheatonCollege
   (e) wHeAtoN
   (f) wheaton
   (g) wheaton12
   (h) 12wheaton
5. Redo the computation of the number of seconds in a year in Section 2.3, but take into
consideration that there are actually 365.25 days in a year, and so daysInYear should be of
type real. Your final answer should still be an int.
6. Mercury orbits the sun in 87.969 days. Calculate how old you will be in 30 Mercury-years.
Your answer should depend on your birthday (that is, don't simply add a number of years to
your age), but it should be an int.
7. Create a datatype of varieties of fish.
8. Use ML to compute the circumference and area of a circle and the volume and surface area
of a sphere, first each of radius 12, then of radius 12.75.
9. Store the values 4.5 and 6.7 in variables standing for base and height, respectively, of a
rectangle, and use the variables to calculate the area. Then do the same but assuming they
are the base and height of a triangle.
Chapter 3
Set Operations
3.1 Axiomatic foundations

Axiom 1 (Existence.) There is a set which has no elements.
Axiom 2 (Extensionality.) If every element of a set X is an element of a set Y and every element
of Y is an element of X, then X = Y .
We may not know what sets and elements are, but we know that it is possible for a set to have
no elements; Axiom 1 tells us that there is an empty set. Axiom 2 tells us what it means for sets to
be equal, and this implicit definition captures what we mean when we say that sets are unordered,
since if two different sets had all the same elements but in different orders, they in fact would not
be two different sets, but one and the same.
Moreover, putting these two axioms together confirms that we may speak meaningfully not only
of an empty set, but the empty set, since there is only one. Suppose for the sake of argument there
were two empty sets. Since they have all the same elements—that is, none at all—they are actually
the same set. This is what we would call a trivial application of the axiom, but it is still valid.
Hence the empty set is unique.
A complete axiomatic foundation for set theory is tedious and beyond the scope of our purposes.
We will touch on these axioms once more when we study proof in Chapter 10, but the important
lesson for now is the use of axioms to describe basic and undefinable terms. Axioms are good if they
correctly capture our intuition; notice that on page 8 we essentially derived Axiom 2 from our
informal definition of set.
is probably the set of animals, or the set { Cheetah, Sponge, Apple tree } would imply the set of
living things. Either way, there is some context of which all sets in the discussion are a subset. It
is unlikely one would ever speak of the set { Green, Sponge, Acid reflux, Annunciation } unless the
context is, say, the set of English words.
That background set relevant to the context is called the universal set, and designated U. In
Venn diagrams, it is often drawn as a rectangle framing the other sets. Further, shading is often
used to highlight particular sets or regions among sets. A simple diagram showing a single set might
look like this:
[Venn diagram: a single shaded circle labeled X inside the rectangle U.]
We also pause to define the cardinality of a finite set, which is the number of elements in the
set. This is symbolized by vertical bars, like absolute value. If U is the set of lowercase letters, then
|{a, b, c, d}| = 4. Note that this definition does not allow you to say, for example, that the cardinality
of Z is infinity; rather, cardinality is defined only for finite sets. Expanding the idea of cardinality
to infinite sets brings up interesting problems we will explore later.
We can visualize basic set operations by drawing two overlapping circles, shading one of them
(X, defined as above) and the other (Y = {c, d, e, f }) in contrasting patterns. The union X ∪ Y is
anything that is shaded at all, and the intersection X ∩ Y is the region shaded both ways, all
shown on the left; on the right is the intersection alone.
[Venn diagrams: two overlapping circles X and Y inside U. Left: both circles shaded, illustrating
X ∪ Y and X ∩ Y. Right: only the overlapping region shaded, illustrating X ∩ Y alone.]
Note that it is not true that |X ∪ Y | = |X| + |Y |, since the elements in the intersection, X ∩ Y =
{c, d}, would be counted twice. However, do not assume that just because sets are drawn so that
they overlap that they in fact share some elements. The overlap region may be empty. We say that
two sets X and Y are disjoint if they have no elements in common, that is, X ∩ Y = ∅. Note that
if X and Y are disjoint, then |X ∪ Y | = |X| + |Y |.
We can expand the notion of disjoint by considering a larger collection of sets. A set of sets
{A1 , A2 , . . . , An } is pairwise disjoint if no pair of sets have any elements in common, that is, if for
all i, j, 1 ≤ i, j ≤ n, i ≠ j, Ai ∩ Aj = ∅.
Remember that the difference between two sets X and Y is X − Y = {x | x ∈ X and x ∉ Y }.
From our earlier X and Y , X − Y = {a, b}. Having introduced the universal set, it now makes sense
also to talk about the complement of a set, X̄ = {x | x ∉ X}, everything that (is in the universal set
but) is not in the given set. In our example, X̄ = {e, f, g, h, . . . , z}. Difference is illustrated to the
left and complement to the right; Y does not come into play in set complement, but is drawn for
consistency.
[Venn diagrams: left, the region of X outside Y shaded, illustrating X − Y; right, everything in U
outside X shaded, illustrating X̄.]
Now we can use this drawing and shading method to verify propositions about set operations.
For example, suppose X, Y , and Z are sets, and consider the proposition
X ∪ (Y ∩ Z) = (X ∪ Y ) ∩ (X ∪ Z)
This is called the distributive law; compare with the law from algebra x · (y + z) = x · y + x · z
for x, y, z ∈ R. First we draw a Venn diagram with three circles.
[Venn diagram: three mutually overlapping circles X, Y, and Z inside U.]
Then we shade it according to the left side of the equation and, separately, according to the right,
and compare the two drawings. First, shade X with one pattern and Y ∩ Z with another. The
union operation indicates all the shaded regions together. (Note that X ∩ (Y ∩ Z) is shaded with
both patterns, but that is not important for our present task.) Thus the left side of the equation:
[Venn diagrams: in the first, X and Y ∩ Z are shaded, their combined shading showing
X ∪ (Y ∩ Z); in the second, X ∪ Y and X ∪ Z are each shaded, their doubly shaded region
showing (X ∪ Y ) ∩ (X ∪ Z).]
Since the total shaded region in the first picture is the same as the double-shaded region in the
second picture, we have verified the proposition. Notice how graphics, with a little narration to help,
can be used for an informal, intuitive proof.
3.3 Powersets, cartesian products, and partitions

The powerset of a set X, written P(X), is the set of all subsets of X:

P(X) = {Y | Y ⊆ X}
If X = {1, 2, 3}, then P(X) = {{1, 2, 3}, {1, 2}, {2, 3}, {1, 3}, {1}, {2}, {3}, ∅}. It is important to
notice that for any set X, X ∈ P(X) and ∅ ∈ P(X), since X is a subset of itself and ∅ is a subset
of everything. It so happens that for finite sets, |P(X)| = 2|X| .
An ordered pair is two elements (not necessarily of the same set) written in a specific order.
Suppose X and Y are sets, and say x ∈ X and y ∈ Y . Then we say that (x, y) is an ordered pair
over X and Y . We say two ordered pairs are equal, say (x, y) = (w, z), if x = w and y = z. An
ordered pair is different from a set of cardinality 2 in that it is ordered. Moreover, the Cartesian
product of two sets, X and Y , written X × Y , is the set of all ordered pairs over X and Y . Formally,

X × Y = {(x, y) | x ∈ X and y ∈ Y }
If X = {1, 2} and Y = {2, 3}, then X × Y = {(1, 2), (1, 3), (2, 2), (2, 3)}. The Cartesian product,
named after Descartes, is nothing new to you. The most famous Cartesian product is R × R, that
is, the Cartesian plane. Similarly, we can define ordered triples, quadruples, and n-tuples, and
corresponding higher-ordered products.
If X is a set, then a partition of X is a set of non-empty sets {X1 , X2 , . . . , Xn } such that
X1 , X2 , . . . , Xn are pairwise disjoint and X1 ∪ X2 ∪ . . . ∪ Xn = X. Intuitively, a partition of a set is
a bunch of non-overlapping subsets that constitute the entire set. From Chapter 1, T and A make
up a partition of R. Here is how we might draw a partition:
[Diagram: a region X divided into five non-overlapping pieces labeled X1 through X5.]
Exercises

1. Complete the on-line Venn diagram drills found at www.ship.edu/~deensl/DiscreteMath/flash/ch3/sec3 1/venntwoset.html and www.ship.edu/~deensl/DiscreteMath/flash/ch3/sec3 1/vennthreeset.html

Let A, B, and C be sets, subsets of the universal set U . Draw Venn diagrams to show the following (do not draw C in the cases where it is not used).

2. (A ∩ B) − A.
3. (A − B) ∪ (B − A).
4. (A ∪ B) ∩ (A ∪ C).
5. (A ∩ B) ∩ (A ∪ C).

Describe the powerset (by listing all the elements) of the following.

6. {1, 2}
7. {a, b, c, d}
8. ∅
9. P({1, 2}).

10. Describe three distinct partitions of the set Z. For example, one partition is the set of evens and the set of odds (remember that these two sets make one partition).
Chapter 4

Tuples and Lists

4.1 Tuples
One of the last things we considered in the previous chapter was the Cartesian product over sets. ML
has a ready-made way to represent ordered pairs—or their generalized counterparts, tuples, as they
are more frequently spoken of in the context of ML. In ML, a tuple is made by listing expressions,
separated by commas and enclosed in parentheses, following standard mathematical notation. The
expressions are evaluated, and the values are displayed, again in a standard way.
Note that the type is real * real, corresponding to R×R. We can think of this as modeling a point
in the real plane. (1.6, 8.4) is itself a value of that type. We can store this value in a variable;
we also can extract the components of this value using #1 and #2 for the first and second number
in the pair, respectively.
- #1(point);
- #2(point);
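A sketch of the whole exchange, assuming point was bound to the value (1.6, 8.4) discussed above:

```sml
- val point = (1.6, 8.4);
val point = (1.6,8.4) : real * real
- #1(point);
val it = 1.6 : real
- #2(point);
val it = 8.4 : real
```

Notice that #1 and #2 leave point itself unchanged; they merely read its components.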
- (newx, newy);
Note that although we mentioned “shifting the point,” we really are not effecting a
change on the variable point—it has stayed the same. (If you have programmed before, note well
that this is different from changing the state of an object or array.)
- point;
We can make tuples of any size and of non-uniform types. We can make tuples of any types,
even of tuple types.
- (4.3, 7.9, ~0.002);
datatype Bird
= Chicken
| Dove
...
- #2(#3(it));
val it = 5 : int
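The session above ends with #2(#3(it)) producing 5; a hypothetical nested tuple consistent with that session (the particular components are our own invention):

```sml
- (4.3, "bird", (1.0, 5, true));
val it = (4.3,"bird",(1.0,5,true)) : real * string * (real * int * bool)
- #2(#3(it));
val it = 5 : int
```

Here #3 extracts the inner triple, and #2 applied to that triple extracts the int.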
It is most important to observe the types of the various expressions we are considering. Type
correctness is how parts of a computer program are assembled in a meaningful way. For simplicity’s
sake, let us confine ourselves to thinking of pairs of reals. The operators #1 and #2 consume a
pair (a value of type real * real) and produce a real. The parentheses and comma consume two reals
and produce a pair of reals (real * real). Notice the difference between “two reals” (two distinct
values) and “a pair of reals” (one value). Consider again
- (#1(point) + 5.0, #2(point) + 0.3);
real * real
4.2 Lists
At first glance, it might seem that tuples are reasonable candidates for representing sets in ML. We
can group three values together and consider them a single entity, as in
- (Robin, Duck, Chicken);
This is a poor solution because even though a tuple’s length is arbitrary, it is still fixed. The
type Bird * Bird * Bird is more restricted than a “set of Birds.” A 4-tuple of Birds has a completely
different type.
- (Finch, Goose, Penguin, Dove);
An alternative which will enable us better to represent the mathematical concept of sets is a
list. Cosmetically, the difference between the two is to use square brackets instead of parentheses.
Observe how the interpreter responds to these various attempts at using lists.
- [Finch, Robin, Owl];
- [Vulture, Sparrow];
- [Eagle];
- [];
val it = [] : ’a list
Every list made up of birds has the same type, Bird list, regardless of how many Birds there are.
Likewise we can have a list of ints, but unlike tuples we cannot have a list of mixed types. The type
of which the list is made up is called its base type. The interpreter typed the expression [] as ’a list;
’a is a type variable, a symbol ML uses to stand for an unknown type. Two lone square braces are
obviously an empty list, but even an empty list must have a base type. This is the first case we have
seen of an expression whose type ML cannot infer from context. We can disambiguate this using
explicit typing, which is done by following an expression with a colon and the type. For example,
we can declare that we want an empty list to be considered a list of Birds.
- [] : Bird list;
It is perfectly logical to speak of lists of lists—that is, the base type of a list may be itself a list
type.
Make sure the difference between tuples and lists is understood. An n-tuple has n components,
and n is a fundamental aspect of the type. Lists (with base type ’a), on the other hand, are considered
to have exactly two components: the first element (called the head, of type ’a) and the rest of the
list (called the tail, of type ’a list)—all this, of course, unless the list is empty. Thus we defy the
grammar school rule that one cannot define something in terms of itself by saying an ’a list is
• an empty list, or
• an ’a followed by an ’a list.
Corresponding to the difference in definition, a programmer interacts with lists in a way different
from how tuples are used. A tuple has n components referenced by #1, #2, etc. A list has these two
components referenced by the accessors hd for head and tl for tail.
- hd(sundryBirds);
- tl(sundryBirds);
- tl(it);
- tl(it);
- tl(it);
- tl(it);
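A head consed back onto its tail rebuilds the original list; a sketch using a list of our own (the particular Birds are an assumption):

```sml
- val someBirds = [Owl, Sparrow, Finch];
- hd(someBirds) :: tl(someBirds);
val it = [Owl,Sparrow,Finch] : Bird list
```

This reflects the two-component view of lists: every non-empty list is exactly a head followed by a tail.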
hd and tl can be used to slice up a list, one element at a time. However, it is an error to try to
extract the tail (or head, for that matter) of an empty list.
We have seen how to make a list using square braces and how to take lists apart. We can take
two lists and concatenate them—that is, tack one on the back of the other to make a new list—using
the cat operator, @.
- [Owl, Finch] @ [Eagle, Vulture];
We also can take a value of the base type and a list and consider them to be the head and
tail, respectively, of a new list. This is done by the construct or cons operator, ::.
- Robin::it;
Cons must take an item and a list; cat must take two lists.
- it::Hawk;
- Hawk@[Loon];
- Sparrow::Robin::Turkey;
However, cons works from right to left, so these next two are fine:
- Sparrow::Robin::[Turkey];
- Sparrow::Robin::Turkey::[];
- [Loon]::[];
- [Robin]::it;
- [[Duck, Vulture]]@it;
4.3 Lists vs. tuples vs. arrays

- open Array;
One creates a new array by typing array(n, v), which will evaluate to an array of size n, with
each position of the array initialized to the value v. The value at position i in an array is produced
by sub(A, i), and the value at that position is modified to contain v by update(A, i, v).
- update(A, 2, 16);
val it = () : unit
- update(A, 3, 21);
val it = () : unit
- A;
- sub(A, 3);
val it = 21 : int
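The session above presupposes an array bound to A; a fuller sketch, where the size and initial value are our own assumptions:

```sml
- val A = array(5, 0);
- update(A, 2, 16);
val it = () : unit
- sub(A, 2);
val it = 16 : int
```

Note that update produces no interesting value; its whole purpose is the change it makes to A.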
The interpreter’s response val it = () : unit will be explained later. Note that arrays can
be changed—they are mutable. Although we can generate new tuples and lists, we cannot change
the value of a tuple or list. However, unlike lists, new arrays cannot be made by concatenating two
arrays together.
Much of the work of programming is weighing trade-offs among options. In this case, we are
considering the appropriateness of various data structures, each of which has its advantages and
liabilities. The following table summarizes the differences among tuples, lists, and arrays.
In this case, we want a data structure suitable for representing sets. Our choice to use lists comes
because the concept of “list of X” is so similar to “set of X”—and because ML is optimized to operate
on lists rather than arrays. However, there are downsides. A list, unlike a set, may contain multiple
copies of the same element. The cat operator is, for example, a poor union operation because it will
keep both copies if an element belongs to both subsets that are being unioned. Later we will learn
to write our own set operations to operate on lists.
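As a preview of those set operations, a duplicate-avoiding union might be sketched like this; member and union are our own names, written with function definitions and conditional expressions introduced later, not built-in ML operations:

```sml
(* Is x an element of the given list? *)
fun member (x, nil) = false
  | member (x, y::rest) = x = y orelse member (x, rest);

(* Keep an element of the first list only if the second
   list does not already contain it. *)
fun union (nil, ys) = ys
  | union (x::rest, ys) =
      if member (x, ys) then union (rest, ys)
      else x :: union (rest, ys);

- union ([1, 2, 3], [2, 3, 4]);
val it = [1,2,3,4] : int list
```

Unlike plain @, this keeps only one copy of an element that belongs to both lists.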
Exercises

2. [tl([Sparrow, Robin, Turkey])]@[Owl, Finch]
3. [Owl, Finch]@tl([Sparrow, Robin, Turkey])
4. [Owl, Finch]::tl([Sparrow, Robin, Turkey])
5. [Owl, Finch]::[tl([Sparrow, Robin, Turkey])]
6. hd([Sparrow, Robin, Turkey])::[Owl, Finch]
7. ([Finch, Robin], [2, 4])
8. [(Finch, Robin), (2, 4)]
9. [5, 12, ceil(7.3 * 2.1), #2(Owl, 17)]

In Exercises 10–12, assume collection is a list with sufficient length for the given problem. Write an ML expression to produce a list as indicated.

12. Remove the first two items of collection and tack them on the back.

In Exercises 13–15, state whether it would be best to use an array, a tuple, or a list.

13. You have a collection of numbers that you wish to sort. The sorting method you wish to use involves cutting the collection into smaller parts and then joining them back together.
14. The units of data you are using have components of different type, but the units are always the same length, with the same types of components in the same order.
15. You have a collection of numbers that you wish to sort. The sorting method you wish to use involves interchanging pairs of values from various places in the collection.
Part II
Logic
Chapter 5

Logical Propositions and Forms
In this chapter, we begin our study of formal logic. Logic is the set of rules for making deductions,
that is, for producing new bits of information by piecing together other known bits of information.
We study logic, in this course in particular, because logic is a foundational part of the language of
mathematics, and it is the basis for all computing. The circuits of microchips, after all, first
model logical operations; that logic is then used to emulate other work, such as arithmetic. Logic,
further, trains the mind and is a tool for any field, whether it be natural science, rhetoric, philosophy,
or theology. Two things should be clarified before we begin.
First, you must understand that logic studies the form of arguments, not their contents. The
argument
is perfectly logical. Its absurdity lies in the content of its premises. Similarly, the argument
5.1 Forms
Consider these two arguments:
What do these have in common? If we strip out everything except the words if, or,
then, so, not, and and (that is, if we strip out all the content but leave the logical connectors), we
are left in either case with
If p or q, then r.
So, if not r, then not p and not q.
Notice that we replaced the content with variables. This allowed us to abstract from the two
arguments to find a form common to both. These variables are something new because they do not
stand for numbers but for independent clauses, grammatical items that have a complete meaning
and can be true or false. In the terminology of logic, we say a proposition is a sentence that is true
or false, but not both. The words true, false, and sentence we leave undefined, like set and element.
Since a proposition can be true or false, the following qualify as propositions:
7−3=4
7−4=3
Bob is taking discrete mathematics.
By saying a proposition must be one or the other and not both, we disallow the following:
7−x=4
He is taking discrete mathematics.
In other words, a proposition must be complete enough (no variables) to be true or false.¹
5.2 Symbols
Logic uses a standard system of symbols, much of which was formulated by Alfred Tarski. We have
already seen that variables can be used to stand for propositions, and you may have noticed that
propositions can be joined together by connective words to make new propositions. These connective
words stand for logical operations, similar to the operations used on numbers to perform arithmetic.
The three basic operations are

Symbolization  Meaning
∼ p            not p; p is not true
p ∧ q          p and q; both are true
p ∨ q          p or q; at least one is true
¹which Bob we are talking about; similarly, He is taking discrete mathematics might be a proposition if the context
makes plain who the antecedent of he is. We use these only as examples of the need for a sentence to be well-defined
to be a proposition.
Notice that “and” and “but” have the same symbolic translation. This is because both conjunc-
tions have the same denotational, logical meaning. Their difference in English is in their connota-
tions. Whichever we choose, we are asserting that two things are both true; “a but b” merely spins
the statement to imply something like “a and b are both true, and b is surprising in light of a.”
If it is hard to swallow the idea that “and” and “but” mean the same thing, observe how another
language differentiates things more subtly. Greek has three words to cover the same semantic range
as our “and” and “but”: kai, meaning “and”; alla, meaning “but”; and de, meaning something
halfway between “and” and “but,” joining two things together in contrast but not as sharply as alla.
5.3 Boolean values

- true;
- false;
- it;
- val p = true;
- val q = false;
The basic boolean operators are named not, andalso, and orelse.
- p andalso q;
- p orelse q;
Comparison operators, testing for equality and related ideas, are different from others we have
seen in that their results have types different from their operands. If we compare two ints, we do
not get another int, but a bool. The ML comparison operators are =, <, >, <=, >=, and <>, the last
three for ≤, ≥, and ≠, respectively.
- 5 <> 4;
- 5 <= 4;
Round-off error makes testing reals for equality and inequality unreliable, so ML disallows it.
Instead, check for both < and >.
stdIn:24.1-24.11 Error: operator and operand don’t agree [equality type required]
operator domain: ’’Z * ’’Z
operand: real * real
in expression:
5.2 <> 4.3
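Following that advice, a sketch of the workaround for testing reals for inequality:

```sml
- 5.2 < 4.3 orelse 5.2 > 4.3;
val it = true : bool
```

Two reals are unequal exactly when one of the two strict comparisons holds.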
The two rightmost columns are identical. This is because the forms ∼ (p ∧ q) and ∼ p∨ ∼ q are
logically equivalent, that is, they have the same truth value for any assignments of their arguments. logically equivalent
(We also say that propositions are logically equivalent if they have logically equivalent forms.) We
use ≡ to indicate that two forms are logically equivalent. For example, similar to the equivalence
demonstrated above, it is true that ∼ (p ∨ q) ≡ ∼ p∧ ∼ q. These two equivalences are called
DeMorgan’s laws, after Augustus DeMorgan.
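DeMorgan’s laws can be spot-checked at the ML prompt; with p and q bound to true and false as above:

```sml
- not (p andalso q) = ((not p) orelse (not q));
val it = true : bool
```

Repeating this for the other three assignments of p and q (all producing true) checks the first law completely.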
It is important to remember that the operators ∨ and ∧ flip when they are negated. For example,
take the sentence
x is even and prime.
We do not call this a proposition, because x is an unknown, but by supplying a value for x (taking
N as the universal set) we would make it a proposition. The set of values that make this a true
proposition is {2}. The set of values that make this proposition false needs to be the complement
of that set—that is, the set of all natural numbers besides 2. It may be tempting to negate the
not-quite-a-proposition as
x is not even and not prime.
But this is wrong. The set of values that makes this a true proposition is the set of all numbers
except evens and primes—a much different set from what is required. The correct negation is
x is not even or not prime.
A form that is logically equivalent with the constant value T (something always true, no matter
what the assignments are to the variables) is called a tautology. A form that is logically equivalent
to F (something necessarily false) is called a contradiction. Obviously all tautologies are logically
equivalent to each other, and similarly for contradictions. The following truth table explores some
tautologies and contradictions.
p ∼p p∨ ∼ p p∧ ∼ p p∧T p∨T p∧F p∨F
T F T F T T F T
F T T F F T F F
Theorem 5.1 (Logical equivalences.) Given logical variables p, q, and r, the following equiva-
lences hold.
Commutative laws: p∧q ≡ q∧p p∨q ≡ q∨p
Associative laws: (p ∧ q) ∧ r ≡ p ∧ (q ∧ r) (p ∨ q) ∨ r ≡ p ∨ (q ∨ r)
Distributive laws: p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r) p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)
Absorption laws: p ∧ (p ∨ q) ≡ p p ∨ (p ∧ q) ≡ p
DeMorgan’s laws: ∼ (p ∧ q) ≡ ∼ p∨ ∼ q ∼ (p ∨ q) ≡ ∼ p∧ ∼ q
Negation laws: p∨ ∼ p ≡ T p∧ ∼ p ≡ F
Identity laws: p ∧ T ≡ p p ∨ F ≡ p
Universal bound laws: p ∨ T ≡ T p ∧ F ≡ F
Double negative law: ∼∼ p ≡ p
These can be verified using truth tables. They also can be used to prove other equivalences
without using truth tables by means of a step-by-step reduction to a simpler form. For example,
q ∧ (p ∨ T ) ∧ (p∨ ∼ (∼ p∨ ∼ q)) is equivalent to p ∧ q:
q ∧ (p ∨ T ) ∧ (p∨ ∼ (∼ p∨ ∼ q))
≡ q ∧ T ∧ (p∨ ∼ (∼ p∨ ∼ q)) by universal bounds
≡ q ∧ (p∨ ∼ (∼ p∨ ∼ q)) by identity
≡ q ∧ (p∨ ∼∼ (p ∧ q)) by DeMorgan’s
≡ q ∧ (p ∨ (p ∧ q)) by double negative
≡ q∧p by absorption
≡ p∧q by commutativity.
Exercises

Determine which of the following are propositions.

1. Spinach is on sale.
2. Spinach on sale.
3. 3 > 5.
4. If 3 > 5, then Spinach is on sale.
5. Why is Spinach on sale?

Let s stand for “spinach is on sale” and k stand for “kale is on sale.” Write the following using logical symbols.

6. Kale is on sale, but spinach is not on sale.
7. Either kale is on sale and spinach is not on sale or kale and spinach are both on sale.
8. Kale is on sale, but spinach and kale are not both on sale.

Verify the following equivalences using a truth table. Then verify them using ML (that is, type in the left and right sides for all four possible assignments to q and p, checking that they agree each time).

10. ∼ (p ∧ q) ≡ ∼ p∨ ∼ q.
11. p ∧ (p ∨ q) ≡ p.

Verify the following equivalences using a truth table.

12. (p ∧ q) ∧ r ≡ p ∧ (q ∧ r).
13. p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r).

Verify the following equivalences by applying known equivalences from Theorem 5.1.

14. ∼ (∼ p ∨ (∼ p∧ ∼ q))∨ ∼ p ≡ T.
15. p ∧ (∼ q ∨ (p∧ ∼ p)) ≡ p∧ ∼ q.
17. ((q ∧ (p ∧ (p ∨ q))) ∨ (q∧ ∼ p))∧ ∼ q ≡ F.
18. ∼ (∼ (p ∧ p) ∨ (∼ q ∧ T)).
Chapter 6
Conditionals
If p or q, then r.
has more than just an or indicating its logical form. We also have the words if and then, which
together knit “p or q” and “r” into a logical form. Let us simplify this a bit to
If p, then q.
A proposition in this form is called a conditional, and is symbolized by the operator →. The
symbolism p → q reads “if p then q” or “p implies q.” p is called the hypothesis and q is called the
conclusion.

To define this operator formally, consider the various scenarios for the truth of the hypothesis
and conclusion and how they affect the truth of the conditional proposition.
other three cases, the hypothesis and conclusion do not disprove the conditional, but they do not
prove it either. To clarify this, we say that a conditional proposition is true if the truth values of
the hypothesis and conclusion are consistent with the proposition being true. This means the cases
where the hypothesis is false are both true, by default. (We call this being vacuously true.) Thus
we have this truth table for →:
p q p→q
T T T
T F F
F T T
F F T
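ML has no built-in implication operator, but the table above can be captured by a small definition; implies is our own name, and the function form is a preview of later chapters:

```sml
(* p -> q is false only when p is true and q is false. *)
fun implies (p, q) = not p orelse q;

- implies (true, false);
val it = false : bool
- implies (false, true);
val it = true : bool
```

The second call illustrates vacuous truth: a conditional with a false hypothesis is true.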
We can further use a truth table to show, for example, that p → q ≡ q∨ ∼ (p ∨ q).
p q p∨q ∼ (p ∨ q) q∨ ∼ (p ∨ q) p→q
T T T F T T
T F T F F F
F T T F T T
F F F T T T
In other words, “If spinach is on sale, then I go to the store” is equivalent to “I go to the store
or it is not true that either spinach is on sale or I go to the store.”
Now consider some tempting but incorrect attempts at negating “If spinach is on sale, then I go to the store.”

If spinach is not on sale, then I go to the store. This is not right because it does not
adequately address the situation where one goes to the store every day, whether spinach
is on sale or not. In that case, both this and the original proposition would be true, so
this is not a negation.
If spinach is not on sale, I do not go to the store. Merely propagating the negation to
hypothesis and conclusion does not work at all. If spinach is on sale and I go, or spinach
is not on sale and I do not go, both this and the original proposition hold.
If spinach is on sale, I do not go to the store. This attempt is perhaps the most attractive,
because it does indeed contradict the original proposition. However, it can be considered
“too strong” and so not a negation—both it and the original proposition are vacuously
true if spinach is not on sale.
To find a true negation, we use a truth table to identify the truth values for ∼ (p → q); then we
try to construct a simple form equivalent to it.
p q p→q ∼ (p → q)
T T T F
T F F T
F T T F
F F T F
That is, we are looking for a proposition that is true only when both p is true and q is not true.
Thus we have ∼ (p → q) ≡ p∧ ∼ q.
This is a surprising result. The negation of a conditional is not itself a conditional. The negation
of “If spinach is on sale, then I go to the store” is “Spinach is on sale and I do not go to the store.”
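This equivalence can be spot-checked in ML, writing p → q as not p orelse q; here for the assignment p true, q false (the other three assignments agree similarly):

```sml
- val p = true;
- val q = false;
- not ((not p) orelse q) = (p andalso (not q));
val it = true : bool
```

Both sides come out true for this assignment, as the truth table predicts.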
6.3 Converse, inverse, and contrapositive

The converse of a conditional p → q is formed by interchanging the hypothesis and conclusion: q → p. As the truth table shows, a conditional and its converse are not logically equivalent.

p q p→q q→p
T T T T
T F F T
F T T F
F F T T
Many common errors in reasoning come down to a failure to recognize this. p being correlated
to q is not the same thing as q being correlated to p. I may go to the store every time spinach is on
sale, but that does not mean that I will never go to the store if spinach is not on sale, and so my
going to the store does not imply that spinach is on sale.
The inverse is formed by negating each of the hypothesis and conclusion separately (not negating
the entire conditional): ∼ p → ∼ q.
For the same reason as above, the inverse is not logically equivalent to the proposition either.
p q p→q ∼ p →∼ q
T T T T
T F F T
F T T F
F F T T
The contrapositive is formed by negating and switching the components of a conditional: ∼ q → ∼ p.
p q p→q ∼ q →∼ p
T T T T
T F F F
F T T T
F F T T
Compare the truth tables of the converse and the inverse. Notice that they are logically equiva-
lent. In fact, the converse and inverse are contrapositives of each other.
We also sometimes speak of necessary conditions and sufficient conditions, which refer to converse
conditional and conditional propositions, respectively.
An even degree is a necessary condition for a polynomial to have no real roots
means
If a polynomial function has no real roots, then it has an even degree.
A positive global minimum is a sufficient condition for a polynomial to have no real roots
means
If a polynomial function has a positive global minimum, then it has no real roots.
Values all of the same sign is a necessary and sufficient condition for a polynomial to
have no real roots.
means
A polynomial function has values all of the same sign if and only if the function has no
real roots.
6.5 Conditional expressions in ML

ML provides a conditional expression, written if e1 then e2 else e3, built from three smaller expressions.
The first expression is called the condition. The second two are called the then-clause and
else-clause, respectively. The condition must have type bool. The other expressions must have the same
type, but that type can be anything. If the condition is true, then the value of the entire expression is
the value of the then-clause; if it is false, then the entire expression’s value is that of the else-clause.
Thus the type of the second two expressions is also the type of the entire expression.
- val x = 3;
val x = 3 : int
val it = 0 : int
- val a = Hog;
At this point, the number of interesting things for which we can use this is limited, but it is
important to understand how types fit together in an expression like this for later use.
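One more sketch of how the types fit together; the bindings here are our own examples:

```sml
- val x = 3;
val x = 3 : int
- if x mod 2 = 0 then "even" else "odd";
val it = "odd" : string
```

The condition x mod 2 = 0 has type bool, both clauses have type string, and so the whole expression has type string.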
Exercises

19. Find a form that will always produce the same result (that is, is equivalent to)
Chapter 7
Argument forms
7.1 Arguments
So far we have considered symbolic logic on the proposition level. We have considered the logical
connectives that can be used to knit propositions together into new propositions, and how to evaluate
propositions based on the value of their component propositions. However, to put this to use—either
for writing mathematical proofs or engaging in any other sort of discourse—we need to work in units
larger than propositions. For example,
contains several propositions, and they are not connected to become a single proposition. This,
instead, is an argument, a sequence of propositions, with the last proposition beginning with the word
“therefore”—or “so” or “hence” or some other such word, and possibly in a postpositive position, as
in “Spinach, therefore, is on sale.” Similarly, an argument form is a sequence of proposition forms,
with the last prefixed by the symbol ∴. All except the last in the sequence are called premises; the
last proposition is called the conclusion.

Propositions are true or false. Arguments, however, are not spoken of as being true or false;
instead, they are valid or invalid. We say that an argument form is valid if, whenever all the
premises are true (depending on the truth values of their variables), the conclusion is also true.
Consider another argument:
The previous argument and this one have the following argument forms, respectively (rephrasing
“During the full moon. . . ” as “If the moon is full, then. . . ”):
p→q p→q
p q
∴q ∴p
It is to be hoped that you readily identify the left argument form as valid and the right as invalid.
The truth table verifies this.
For the left form (premises p → q and p; conclusion q):

p q  p→q  q
T T   T   T   ← critical row
T F   F   F
F T   T   T
F F   T   F

For the right form (premises p → q and q; conclusion p):

p q  p→q  p
T T   T   T   ← critical row
T F   F   T
F T   T   F   ← critical row
F F   T   F
The first argument has only one case where both premises are true, and we see there that the
conclusion is also true. The rest of the truth table does not matter—only the rows where all premises
are true count. We call these critical rows, and when you are evaluating large argument forms, it is
acceptable to leave the entries in non-critical rows blank. Moreover, once you have found a critical
row where the conclusion is false, nothing more needs to be done. The second argument has a critical
row where the conclusion is false; hence it is an invalid argument.
Since the contrapositive of a conditional is logically equivalent to the conditional itself, a truth
table from the previous chapter proves
p→q
∴∼ q →∼ p
∼ q →∼ p
∼q
∴∼ p
Putting these two together results in the second most famous syllogism, modus tollens, “lifting
[i.e., denying] method.”
p→q If the moon is full, then Lupin will be a werewolf.
∼q Lupin is not a werewolf.
∴∼ p Therefore, the moon is not full.
We can also prove this directly with a truth table, with only the critical row completed for the
conclusion column.
p  q    p → q    ∼q    ∼p
T  T      T      F
T  F      F      T
F  T      T      F
F  F      T      T     T    ← critical row
The form generalization may seem trivial and useless, but, in fact, it captures a reasoning technique
we often use subconsciously. It relies on the fact that a true proposition or’ed to any other
proposition is still a true proposition.
p We are in Pittsburgh.
∴ p ∨ q Therefore, we are in Pittsburgh or Mozart wrote The Nutcracker.
A similar (though symmetric) argument form using and is called specialization.
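Since generalization and specialization each have a single premise, their validity amounts to a single implication holding in every row of the truth table, something we can spot-check in ML (a sketch; the helper imp and the style of exhausting the cases are ours):

```sml
- fun imp(a, b) = not a orelse b;
val imp = fn : bool * bool -> bool
- val cases = [(true, true), (true, false), (false, true), (false, false)];
- List.all (fn (p, q) => imp(p, p orelse q)) cases;   (* generalization: p, therefore p ∨ q *)
val it = true : bool
- List.all (fn (p, q) => imp(p andalso q, p)) cases;  (* specialization: p ∧ q, therefore p *)
val it = true : bool
```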
Sherlock Holmes describes the process of elimination as, “. . . when you have eliminated the
impossible, whatever remains, however improbable, must be the truth.” Formally, elimination is
We will later prove that, given sets A, B, and C, if A ⊆ B and B ⊆ C, then A ⊆ C. This means
that the ⊆ relation is transitive. The analogous logical form, transitivity, is
Sometimes a phenomenon has two possible causes (or, at least, two things that are correlated to
it). Then all that needs to be shown is that at least one such cause is true. Or, if we know that at
least one of two possibilities is true and that they each imply a fact, that fact is true. We call this
form division into cases.
Because of certain paradoxes that have arisen in the study of the foundations of mathematics,
some logicians call into question the validity of proof by contradiction. If p leads to a contradiction,
it might not be that p is false; it could be that p is neither true nor false, that is, p might not be a
proposition at all. For our purposes, however, we can rely on proof by contradiction.
(a) ∼ p∧ ∼ r → s
(b) p →∼ q
(c) ∼ t
(d) t∨ ∼ s
(e) r →∼ q
(f) ∴∼ q
This does not follow immediately from the argument forms we have given. However, we can
deduce immediately that ∼s by applying elimination to (c) and (d). Our goal is to generate
new propositions from known argument forms until we have verified proposition (f).
(i) ∼s by (c), (d), and elimination.
(ii) ∼ (∼ p∧ ∼ r) by (a), (i), and modus tollens
(iii) p∨r by (ii) and DeMorgan’s laws
(iv) ∼q by (iii), (b), (e), and division into cases.
Notice that using logical equivalences from Theorem 5.1 is fair game.
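Such a deduction can also be double-checked by brute force: an argument is valid exactly when, in every assignment of truth values, the conjunction of the premises implies the conclusion. A sketch in ML (the helpers imp and valid are our own names):

```sml
- fun imp(a, b) = not a orelse b;
- fun valid f = List.all f [true, false];
- valid (fn p => valid (fn q => valid (fn r => valid (fn s => valid (fn t =>
=   imp(imp(not p andalso not r, s)      (* (a) *)
=       andalso imp(p, not q)            (* (b) *)
=       andalso not t                    (* (c) *)
=       andalso (t orelse not s)         (* (d) *)
=       andalso imp(r, not q),           (* (e) *)
=       not q))))));                     (* (f) *)
val it = true : bool
```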
Exercises
Verify the following syllogisms using truth tables.

1. Generalization.
2. Specialization.
3. Elimination.
4. Transitivity.
5. Division into cases.
6. Contradiction.

Use known syllogisms and logical equivalences to verify the following arguments.

7. (a) p → q
   (b) r ∨ s
   (c) r → t
   (d) ∼q
   (e) u → v
   (f) s → p
   (g) ∴ t

8. (a) ∼p ∨ q → r
   (b) s ∨ ∼q
   (c) ∼t
   (d) p → t
   (e) ∼p ∧ r → ∼s
   (f) ∴ ∼q

9. (a) p ∨ q
   (b) q → r
   (c) p ∧ s → t
   (d) ∼r
   (e) ∼q → u ∧ s
   (f) ∴ t

10. (a) ∼p → r ∧ ∼s
    (b) t → s
    (c) u → ∼p
    (d) ∼w
    (e) u ∨ w
    (f) ∴ ∼t

11. (a) p → q
    (b) r ∨ s
    (c) ∼s → ∼t
    (d) ∼q ∨ s
    (e) ∼s
    (f) ∼p ∧ r → u
    (g) w ∨ t
    (h) ∴ u ∧ w

Exercises 7–11 are taken from Epp [5].
Chapter 8

Predicates and quantifiers
It is unlikely that we would either presume or prove such a narrow premise as “If Socrates is a
man, then he is mortal.” What is so special about Socrates that this conditional mortality accrues
to him? Rather, we would be more likely to say
Our premise, “All men are mortal,” now addresses a wider scope, and our syllogism merely
applies this universal truth to a specific case. However, we have not yet discussed how to deal with
concepts like “all” in formal logic. Is it necessary to expand our notation (and reasoning rules), or
can this be captured by the logical forms we already know? For example, could we not express the
first premise using a conditional?
This is equivalent, but now we have introduced the pronouns “someone” and “he,” which is
English’s way of referring to the same but unknown value. Therefore we could simplistically replace
the pronouns with pseudo-mathematical notation to get
However, variables (and pronouns with uncertain antecedents) mean that a sentence is no longer
a proposition. In this chapter, we will see the use of unknowns in formal reasoning.
8.1 Predication
When we introduced conditionals, we moved from the specific case to the general case by replacing
parts of a conditional sentence with variables.
These variables are parameters, that is, independent variables that can be supplied with any
values from a certain set (in this case, the set of propositions) and that affect the value of the
entire expression (in this case, also a proposition, once the values have been supplied). When we
replace parts of a mathematical expression with independent variables, we are parameterizing that
expression. We see the same process here:

¹This is not quite right; we still need to say that this is true “for all x.”
This makes a proposition with a hole in it. A sentence that is a proposition but for an independent
variable is a predicate. This is the same term predicate that you remember from grammar school; a
predicate is the part of a clause that expresses something about the subject.
The boy | hit the ball.              Spinach | is green.
subject | predicate                  subject | predicate
          (transitive verb                     (linking verb
           + direct object)                     + predicate nominative)
If we want to represent a given predicate with a symbol, it would be useful to incorporate the
parameter. Hence we can define, for example,
P (x) = x is mortal
You should recognize this notation as being the same as that used for functions in algebra and
calculus. We will study functions formally in Part VI, but you can use what you remember from
prior math courses to recognize that a predicate is alternately defined as a function whose codomain
is the set of truth values, { true, false }. In fact, another term for predicate is propositional function.
The domain of a predicate is the set of values that may be substituted in place of the independent
variable.
To play with the two sentences above, let
And so we can note P (the boy), P (the bat), Q(spinach), Q(Kermit), and ∼ Q(ruby). It is
important to note that the domain of a predicate is not just those things that make it true, but
rather all things it would make sense to talk about in that context (here we can assume something
like “visible objects and substances”). It is not invalid to say Q(ruby) or “ruby is green.” It merely
happens to be false.
Here is a mathematical example. Let P (x) = x2 > x. What is P (x) for various values of x, if
we assume R as the domain?
x       5    2    1    1/2    0    −1/2    −1
P(x)    T    T    F     F     F     T       T
It is not too difficult to characterize the numbers that make this predicate true—numbers greater
than one, and all negative numbers. The truth set of a predicate P(x) with domain
D is the set of all elements in D that make P(x) true when substituted for x. We can denote this
using set notation as {x ∈ D | P(x)}. In this case, the truth set is {x ∈ R | x < 0 ∨ x > 1}.
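Looking ahead to the ML representation of predicates in the next chapter, this predicate can be sketched as a function from reals to booleans (the type annotation forces the real versions of * and >, which would otherwise default to int):

```sml
- fun P(x: real) = x * x > x;
val P = fn : real -> bool
- P(5.0);
val it = true : bool
- P(1.0);
val it = false : bool
- P(~0.5);
val it = true : bool
```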
because “All men are mortal” is a proposition, no mere predicate. It truly does make a claim that is
either true or false. In other words, the predicate does not capture the assertion being made about
all men. The problem is that the variable x is not truly free, but rather we want to remark on the
extent of x, the values for which we are asserting this predicate. Words that express this are called
quantifiers. Here is a rephrasing of “all men are mortal” that uses variables correctly:
The symbol ∀ stands for “for all.” Then, if we let M stand for the set of all men and recall our
predicate P(x) = x is mortal, we have

∀ x ∈ M, P(x)
∀ is called the universal quantifier. Likewise, a universal proposition is a proposition in the form
∀x ∈ D, P (x), where P (x) is a predicate and D is the domain (or a subset of the domain) of P (x).
Unfortunately, defining the meaning of a universal proposition cannot be done simply with a truth
table. Instead we say, almost banally, that the proposition is true if P (x) is true for every element
in the domain. For example, let D = {3, 54, 219, 318, 471}. Which of the following are true?
What we used on the last proposition was the method of exhaustion, that is, we tried all possible
values for x until we exhausted the domain, demonstrating each one made the predicate true.
Obviously this method of proof is possible only with finite sets, and it is impractical for any set
much larger than the one in this example. On the other hand, disproving a universal proposition
is quite easy, since it takes only one hole to sink the ship. If for any element of D, P (x) is false,
then the entire proposition is false. Having found 3, not an even number, we disproved the second
proposition, without even noting that the predicate fails also for 219 and 471. An element of the
domain for which the predicate is false is called a counterexample.
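On a finite domain represented as an ML list, the method of exhaustion is just a recursive sweep with andalso. A sketch (the function name forAll is ours, anticipating the ML material of the next chapter):

```sml
- fun forAll(P, []) = true
=   | forAll(P, x::rest) = P(x) andalso forAll(P, rest);
val forAll = fn : ('a -> bool) * 'a list -> bool
- forAll(fn x => x mod 3 = 0, [3, 54, 219, 318, 471]);
val it = true : bool
- forAll(fn x => x mod 2 = 0, [3, 54, 219, 318, 471]);
val it = false : bool
```

The second call fails on the counterexample 3, just as in the discussion above.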
x2 = 16 for all x ∈ R.
It is true, however, that
x2 = 16 for some x ∈ R.
namely, for 4 and −4. While it is true that
∼ (∀ x ∈ R, x2 = 16)
it is not true that
∀ x ∈ R, ∼ (x2 = 16)
The word “some” expresses the situation that falls between being true for all and being true
for none—in other words, the predicate is true for at least one, perhaps more. It is an existential
quantifier, because it asserts that at least one thing exists with the given predicate as a property.
We can rephrase the second proposition of this section to get
∃ x ∈ R | x2 = 16
Formally, an existential proposition is a proposition of the form ∃ x ∈ D | P (x) for some predicate
P (x) with domain (or domain subset) D.
Revisiting our earlier example with D = {3, 54, 219, 318, 471}, which of the following are true?
∃ x ∈ D | x is a multiple of 3        Yes: 3 = 3 · 1.
Notice that with existentially quantified propositions, we must use the method of exhaustion to
disprove it. Only one specimen is needed to show that it is true.
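Dually to the universal case, a finite existential check sweeps the list with orelse (the name thereExists is ours):

```sml
- fun thereExists(P, []) = false
=   | thereExists(P, x::rest) = P(x) orelse thereExists(P, rest);
val thereExists = fn : ('a -> bool) * 'a list -> bool
- thereExists(fn x => x mod 3 = 0, [3, 54, 219, 318, 471]);
val it = true : bool
```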
∃ x ∈ Z+ | x divides 121
which is true, letting x = 11. The existential quantification is implicit, hanging on the indefinite
article a. However, the indefinite article can also imply universal quantification, depending on the
context. Hence
If a number is a rational number, then it is a real number.
becomes
∀ x ∈ Q, x ∈ R
Notice also that this is equivalent to
∀ x ∈ U, x ∈ Q → x ∈ R
where we take the universal set to be all numbers. More generally,
∀ x ∈ D, Q(x) ≡ ∀ x ∈ U, x ∈ D → Q(x)
Quantification in natural language is subtle and a frequent source of ambiguity for the careless
(though the intended meaning is usually clear from context or voice inflection). Adverbs (besides
not) usually do not affect the logical meaning of a sentence, but notice how just turns
I wouldn’t give that talk to any audience.
into
I wouldn’t give that talk to just any audience.
What will help us here is to think about what the quantifiers are actually saying. If a predicate
is true for all elements in the set, then if we could order those elements, it would be true for the first
one, and the next one, and one after that. In other words, we can think of universal quantification as
an extension of conjunction. If there merely exists an element for which the predicate is true, then it
is true for the first, or the next one, or one of the elements after that. Hence if D = {d1, d2, d3, . . .},

∼(∀ x ∈ D, P(x)) ≡ ∼(P(d1) ∧ P(d2) ∧ P(d3) ∧ · · ·) ≡ ∼P(d1) ∨ ∼P(d2) ∨ ∼P(d3) ∨ · · · ≡ ∃ x ∈ D | ∼P(x)

and

∼(∃ x ∈ D | P(x)) ≡ ∼(P(d1) ∨ P(d2) ∨ P(d3) ∨ · · ·) ≡ ∼P(d1) ∧ ∼P(d2) ∧ ∼P(d3) ∧ · · · ≡ ∀ x ∈ D, ∼P(x)
Hence the negation of a universal proposition is an existential proposition, and the negation of
an existential proposition is a universal proposition. To negate our examples above, we would say
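As an aside, this duality can be spot-checked in ML on a finite domain using the library functions List.all and List.exists:

```sml
- val D = [3, 54, 219, 318, 471];
- fun P(x) = x mod 2 = 0;
- not (List.all P D) = List.exists (fn x => not (P x)) D;
val it = true : bool
```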
Exercises
Chapter 9
Multiple quantification;
representing predicates
In this chapter, we have two separate concerns, both of which extend the concepts of predication from
the previous chapter. First, we consider how to interpret propositions that are multiply quantified,
that is, have two (or more) quantifiers. Second, we consider how to represent predicates in ML.
Written symbolically, this is ∃ x ∈ Z | x > 5. A doubter would have to challenge this by saying
“So, you think there is a number greater than five, do you? Well then, show me one.” To satisfy
this doubter, your response would be to name any number that is both an integer and greater than
5— 6 will do fine. On the other hand, suppose you were asserting that
or, ∀ x ∈ Z, x ∈ Q. It would be unfair for a challenger to ask you to verify this proposition for every
element in Z. Even the most hardened skeptic would not have the patience to hear you enumerate an
infinite number of possibilities. However, this does demonstrate the futility of proving by example:
What makes proving by example unconvincing is that the prover is picking the test cases. What
if he is picking only the ones on which the proposition seems to be true, and ignoring the coun-
terexamples? It is like a huckster picking his own shills out of the crowd on whom to demonstrate
his snake oil. Instead, to make the game both fair and finite, it must be the doubter who picks the
sample from the domain and challenges the prover to demonstrate the predicate is true for it.
Thus an existentially quantified proposition is proven by providing an example, and a universally
quantified proposition is proven by providing a way to confirm any given example.
All this serves to build intuition for how to interpret propositions with nested levels of quantifi-
cation. How would we symbolically represent the proposition
To work this out, let us follow the kind of reasoning above in reverse. Suppose you were to
prove this. What kind of challenge would you expect? The idea here is that the integer 5 has for
its opposite -5, -10 has 10, 0 has 0, and so on. Since you want to show this pattern holds for every
integer, it makes sense that the challenger gets to pick the integer on which to argue. However, once
that integer is picked, how is the rest of the game played? You, the prover, must come up with
an opposite to match that integer. Hence the game has two steps: the doubter picks an item to
challenge you, and you counter that challenge with another item. The two steps correspond to two
levels of quantification. First, you are claiming that some predicate is true for all integers, so we
have something in the form
∀ x ∈ Z, P (x)
But what is P(x)? What are we claiming about all integers? We claim that something exists
corresponding to it, namely an opposite. P (x) = ∃ y ∈ Z | y = −x. All together,
∀ x ∈ Z, ∃ y ∈ Z | y = −x
This is a multiply quantified proposition, meaning that the predicate of the proposition is itself
quantified. This is more specific than that the proposition merely has more than one quantifier. The
proposition “Every frog is green, and there exists a brown toad” has more than one quantifier, but
this is not what we have in mind by multiple quantification, because one quantified subproposition
is not nested within the other.
Do not fail to notice that quantifiers are not commutative. We would have a very different (and
false) proposition if we were to say
∃ y ∈ Z | ∀ x ∈ Z, y = −x
or
Also notice that the innermost predicate (y = −x) has two independent variables. The general
form is
∀ x ∈ D, ∃ y ∈ E | P (x, y)
where P is a two-argument predicate, with arguments of domains D and E, respectively; or, equiv-
alently, P is a single-argument predicate with domain D × E.
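On finite domains, nested quantification translates directly into nested ML checks, which also makes the non-commutativity of the quantifiers concrete (a sketch; the domain is a small stand-in for Z):

```sml
- val D = [~2, ~1, 0, 1, 2];
- List.all (fn x => List.exists (fn y => y = ~x) D) D;   (* ∀x ∃y, y = −x *)
val it = true : bool
- List.exists (fn y => List.all (fn x => y = ~x) D) D;   (* ∃y ∀x, y = −x *)
val it = false : bool
```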
Let us try another example. How would you translate to symbols the proposition
Obviously the outer proposition is existential. Picking P to stand for the set of prime numbers and
writing half-symbolically we have
∃ x ∈ P | ∀ y ∈ P, y ≤ x
(Why did we say “≤” rather than “<”?) Now we negate this.
∼ ∃ x ∈ P | ∀ y ∈ P, y ≤ x
Evaluate the negation of the existential quantifier.
∀ x ∈ P, ∼ ∀ y ∈ P, y ≤ x
Evaluate the negation of the universal quantifier.
∀ x ∈ P, ∃ y ∈ P | y > x
or
As a final example, think back to the beginning of calculus when you first encountered the formal
definition of a limit. You may remember it as a less than pleasant experience. It is likely that one
of the main frustrations was that it is multiply quantified. lim x→a f(x) = L means
do we mean
9.3 Predicates in ML
We have already seen how to write boolean expressions in ML—expressions that represent proposi-
tions. However, ML understandably rejects an expression containing an undefined variable.
- x < 15.3;
However, we can capture the independent variable by giving a name to the expression, that is,
defining a predicate. In ML, we define predicates using the form

fun <identifier>(<identifier>) = <expression>;

The keyword fun is similar to val in that it assigns a meaning to an identifier (namely the first
identifier), which is the name of the predicate. The second identifier, enclosed in parentheses, is the
independent variable. Naming the expression above P, a session looks like this:

- fun P(x) = x < 15.3;
val P = fn : real -> bool
- P(1.5);
val it = true : bool
- P(15.3);
val it = false : bool
- P(27.4);
val it = false : bool
Let us compose predicates to test if a real number is within the range [−3.4, 47.3) and if an
integer is divisible by three.

- fun Q(x) = ~3.4 <= x andalso x < 47.3;
val Q = fn : real -> bool
- Q(16.5);
val it = true : bool
- Q(0.1);
val it = true : bool
- Q(57.9);
val it = false : bool
- fun R(n) = n mod 3 = 0;
val R = fn : int -> bool
- R(2);
val it = false : bool
- R(4);
val it = false : bool
- R(6);
val it = true : bool
Notice several things. First, the keyword fun is used because we are defining a function—
specifically, a function whose co-domain is bool. Next, note ML’s response to the definition of a
predicate, for example val Q = fn : real -> bool . This means that the variable Q is assigned a
certain value. Indeed, Q is a variable, but not one that stores an int or bool value; rather, it
stores a function value. The value is not printed, but fn (another keyword, based on an abbreviation
for “function”) stands in for it. The type of the value is real -> bool, which essentially means a
function whose domain is real and whose co-domain is bool.
Remember that * corresponds to × in mathematical notation for Cartesian products. In math-
ematical notation, we would say that a predicate like Q is a function R → {true, false}. You should
use your knowledge of functions from earlier study in mathematics to understand this for now, but
general functions and the concept of function types will be examined more carefully in a later
chapter. The function is quite the crucial concept in ML programming.
9.4 Pattern-matching
As a finale for this chapter, we consider writing predicates for non-numerical data. In an earlier
chapter, we learned how to create a new datatype. For example, we considered the set of trees (here
expanded):
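The datatype definition is elided here; reconstructed from the constructors used in the sessions below, it would be something like (the original's constructor list may differ):

```sml
- datatype tree = Pine | Spruce | Fir | Willow | Oak | Maple | Elm;
datatype tree = Pine | Spruce | Fir | Willow | Oak | Maple | Elm
```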
We know that equality is defined automatically. By chaining comparisons using orelse, we can
create more complicated statements and encapsulate them in predicates.
- fun isConiferous(tr) =
= tr = Pine orelse tr = Spruce orelse tr = Fir;
- isConiferous(Willow);
val it = false : bool
- isConiferous(Pine);
val it = true : bool
Notice that we never explicitly consider the cases for non-coniferous trees. An equivalent way of
writing this is
- fun isConiferous2(tr) =
= if (tr = Pine)
= then true
= else if (tr = Spruce)
= then true
= else if (tr = Fir)
= then true
= else false;
- isConiferous2(Oak);
val it = false : bool
- isConiferous2(Spruce);
val it = true : bool
Although this is much more verbose, a pattern like this would be necessary if such a predicate (or,
more generally, function) needed to perform more computation to determine its result (as opposed
to merely returning literals true and false). ML does, however, provide a cleaner way of using a
pattern conceptually the same as that used above. Instead of naming a variable for the predicate to
receive, we write a series of expressions for explicit input values:
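The definition itself is elided here; a sketch reconstructed from the surrounding prose (note the variable in the final clause, which serves as the default case):

```sml
- fun isConiferous3(Pine) = true
=   | isConiferous3(Spruce) = true
=   | isConiferous3(Fir) = true
=   | isConiferous3(other) = false;
val isConiferous3 = fn : tree -> bool
```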
- isConiferous3(Maple);
val it = false : bool
- isConiferous3(Fir);
val it = true : bool
This is referred to as pattern matching because the predicate is evaluated by finding the definition
that matches the input. Patterns can become more complicated than we have seen here. Notice also
that we still use a variable in the last line of the definition of isConiferous3 as a default case.
It is legal to define a predicate so that it leaves some cases undefined. However, that will generate
a warning when you define it and an error message if you try to use it on a value for which it is not
defined.
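A sketch of such a partial definition, with only the coniferous cases given (the exact warning text varies by ML implementation):

```sml
- fun isConiferous4(Pine) = true
=   | isConiferous4(Spruce) = true
=   | isConiferous4(Fir) = true;
stdIn: Warning: match nonexhaustive
val isConiferous4 = fn : tree -> bool
```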
- isConiferous4(Spruce);
val it = true : bool
- isConiferous4(Pine);
val it = true : bool
- isConiferous4(Elm);
uncaught exception Match [nonexhaustive match failure]
As an example of a more complicated pattern, consider a 2-place predicate. Suppose you are
having guests over and want to serve food that will not violate any of your guests’ dietary restrictions.
Arwen tells you that she is a vegetarian, Luca says he is on a low-sodium diet, and Jael tells you
that she eats kosher. Estella says she will eat anything, and Bogdan claims he eats nothing, but you
figure that everyone will eat ice cream. To test what combination of foods would be palatable, you
use the following ML system:
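The session is elided here; a rough sketch of what it might contain follows. The food names and the exact clauses are our invention, not the book's, but the clause ordering matters: the ice-cream case comes first so that it overrides even Bogdan's claim to eat nothing.

```sml
- datatype guest = Arwen | Luca | Jael | Estella | Bogdan;
- datatype food = Salad | SaltyPretzels | PorkChops | IceCream;
- fun willEat(g, IceCream) = true              (* everyone eats ice cream *)
=   | willEat(Estella, f) = true               (* Estella eats anything *)
=   | willEat(Bogdan, f) = false               (* Bogdan claims to eat nothing *)
=   | willEat(Arwen, f) = (f = Salad)          (* vegetarian *)
=   | willEat(Luca, f) = (f <> SaltyPretzels)  (* low-sodium *)
=   | willEat(Jael, f) = (f <> PorkChops);     (* kosher *)
val willEat = fn : guest * food -> bool
- willEat(Bogdan, IceCream);
val it = true : bool
```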
Exercises
Let M be the set of men, U be the set of unicorns, h(x, y) be the predicate that x hunts y, and
s(x, y) be the predicate that x is smarter than y. Write the following symbolically, then negate
them, and then write the negation in English.

1. A certain man hunts every unicorn.
2. Every man is smarter than every unicorn.
3. Any man is mortal if he hunts a unicorn.
4. There is a smartest unicorn.
5. No man hunts every unicorn.

Evaluate the negation

6. ∼ ∀ x ∈ D, ∃ y ∈ E | P(x, y).
7. ∼ ∃ x ∈ D | ∀ y ∈ E, P(x, y).

After your dinner party you and your guests (from the earlier example) will be watching a movie.
Nobody wants to watch Titanic. Luca, a Californian, wants to see the Governator in Terminator.
Jael is up for anything but a drama. Estella is in the mood for a good comedy. Bogdan has a crush
on Estella and wants to watch whatever she wants.

8. Create a movie genre datatype with elements Drama, Comedy, and Action. Create a movie
   datatype with elements Terminator, Shrek, Titanic, Alien, GoneWithTheWind, and Airplane.
   Copy the guest datatype from the example.
9. Write three predicates using pattern-matching to determine the genre of a movie (i.e. isDrama,
   isComedy, isAction).
10. Write a predicate wantsToWatch(guest, movie) that determines if a guest will watch a
    particular movie.
11. Will Bogdan watch Shrek? Will Jael watch Gone With The Wind?
Part III
Proof
Chapter 10
Subset proofs
[Figure: a square S of side a + b, tiled by four right triangles, each with legs a and b and
hypotenuse c, leaving an inner quadrilateral T whose sides have length c.]

△A ≅ △B                       SSS
∠1 + ∠2 = 90°                 △ angles sum to 180°
∠1 + ∠2′ = 90°                ∠2 ≅ ∠2′
∠3 = 90°                      Supplementary ∠s
T is a square                 Equal sides, 90° ∠s
Area of T = c²                Area of square
Area of S = (a + b)²          Area of square
Area of each △ = ab/2         Area of △
(a + b)² = c² + 4 · (ab/2)    Sum of areas
a² + 2ab + b² = c² + 2ab      Algebra (FOIL, simplification)
∴ c² = a² + b²                Subtract 2ab from both sides.
Proofs in real mathematics, however, require something more professional. Proofs should be
written as paragraphs of complete English sentences—though mathematical symbolism is often useful
for conciseness and precision.
the promotion of conjectures to theorems by writing proofs for them. However, we will often speak
of this as “to prove a theorem,” since all propositions you will be asked to prove will have been
proven before.
Basic theorems take on one of three General Forms:
1. Facts. p
2. Conditionals. If p then q.
3. Biconditionals. p iff q.
Of these, General Form 2 is the most important, since facts can often be restated as conditionals,
and biconditionals are actually just two separate conditionals. The theorems we shall prove in this
part of the book all come out of set theory, and basic facts in set theory take on one of three Set
Proposition Forms:
1. Subset. X ⊆ Y .
2. Set equality. X = Y .
3. Set empty. X = ∅.
In this chapter, we will work on proving the simplest kinds of theorems: those that conform to
General Form 1 and Set Form 1. In the next chapter, we will consider theorems of General Form 1
and Set Forms 2 and 3, and the chapter after that will cover General Forms 2 and 3.
Let A and B be sets, subsets of the universal set U . An example of a proposition in General
Form 1, Set Form 1 is
Theorem 10.1 A ∩ B ⊆ A
This is a simple fact (not modified by a conditional) expressing a subset relation, that one set (A∩B)
is a subset of another (A). Our task is to prove that this is always the case, no matter what A and
B are. To prove that, we need to ask ourselves What does it mean for one set to be a subset of
another? and Why is it the case that these two sets are in that relationship?
The first question appeals to the definition of subset. Formal, precise definitions are extremely
important for doing proofs. Chapter 1 gave an informal definition of the subset relation, but we
need a formal one in order to reason precisely. Our knowledge of quantified logic allows us to define
X ⊆ Y if ∀ x ∈ X, x ∈ Y
(Observe how this fact now breaks down into a conditional. “A ∩ B ⊆ A” is equivalent to “if
a ∈ A ∩ B then a ∈ A.” This observation will make proving conditional propositions more familiar
when the time comes. More importantly, you should notice that definitions, although expressed
merely as conditionals, really are biconditionals; the “if” is an implied “iff.”)
The burden of our proof to show A ⊆ B is, then, to show that
∀ a ∈ A, a ∈ B
Notice that this is a special case of the more general form
∀ a ∈ A, P (a)
letting P (a) = a ∈ B. Think back to the previous chapter. How would you persuade someone that
this is the case? You allow the doubter to pick an element of A and then show that that element
makes the predicate true. The way we express an invitation to the doubter to pick an element is
with the word suppose. “Suppose a ∈ A. . . ” is math-speak for “choose for yourself an element of A,
and I will tell you what to do with it, which will persuade you of my point.”
In our case, the set that is a subset is A ∩ B. The formal definition of intersect is
X ∩ Y = {z | z ∈ X ∧ z ∈ Y }
Now follow the proof:
Proof. Suppose a ∈ A ∩ B.
By definition of intersection, a ∈ A and a ∈ B.
a ∈ A by specialization.
Therefore, by definition of subset, A ∩ B ⊆ A. □
One line is italicized because it really could be omitted for the sake of brevity (not that this
particular proof needs to be more brief). The proofs in this text frequently will have italicized
sentences which will add clarity (but also clutter) to proofs; you may omit similar sentences when
you write proofs yourself. Specialization is the sort of logical step that it is fair to assume your
audience will perform automatically, as long as you recognize that a real logical operation is indeed
happening. Notice that
• Every other sentence is a proposition joined with a prepositional phrase governed by by.
• The last sentence begins with therefore and, except for the by part, is the proposition we are
proving.
• The proof is terminated by the symbol □, a widely used end-of-proof marker. You will some-
times see proofs terminated with QED, an older convention from the Latin quod erat demon-
strandum, which means “which was to be proven.”
Compare that with our arguments in Section 7.3. Our overall strategy in this case is what is
called the element argument for proving facts of set form 1:
To prove A⊆B
say Suppose a ∈ A.
Proofs at the beginning level are all about analysis and synthesis. Analysis is the taking apart
of something. Break down the assumed or proven propositions by applying definitions, going from
term to meaning. Synthesis is the putting together of something. Assemble the proposition to be
proven by applying the definition in the other direction, going from meaning to term. Here is a
summary of the formal definitions of set operations we have seen before.
X ∪ Y = {z | z ∈ X ∨ z ∈ Y}          X − Y = {z | z ∈ X ∧ z ∉ Y}
X ∩ Y = {z | z ∈ X ∧ z ∈ Y}          X × Y = {(x, y) | x ∈ X ∧ y ∈ Y}
X̄ = {z | z ∉ X}
10.3 An example
Now we consider a more complicated example. The same principles apply, only with more steps
along the way. Let A, B, and C be sets, subsets of U . Prove
A × (B ∪ C) ⊆ (A × B) ∪ (A × C)
Immediately we notice that this fits our form, so we know our proof will begin with
x ∈ (A × B) ∪ (A × C) by . . . . Therefore, A × (B ∪ C) ⊆ (A × B) ∪ (A × C). 2
Notice several things. First, instead of writing each sentence on its own line, we combined several
sentences into paragraphs. This manifests the general divisions in our proof technique. The first
paragraph did the analysis. The next two each dealt with one case, also beginning the synthesis.
The last paragraph completed the synthesis.
Second, division into cases was turned into prose and paragraph form by highlighting each case,
with each case coming from a clause of a disjunction (“d ∈ B or d ∈ C”), and each case requires
another supposition. We will see this structure again later, and take more careful note of it then.
Third, we have peppered this proof with little words like “then,” “moreover,” “finally,” and “so.”
These do not add meaning, but they make the proof more readable.
Finally, we have made use of one extra but very important mathematical tool, that of substitution.
If two expressions are assumed or shown to be equal, we may substitute one for the other in another
expression. In this case, we assumed x and (a, d) were equal, and so we substituted x for (a, d) in
(a, d) ∈ A × B. (Or, we replaced (a, d) with x. Do not say that we substituted (a, d) with x. See
Fowler’s Modern English Usage on the matter [7].)
you will write in grammatical sentences and fluent paragraphs. Two-column grids with non-parallel
phrases are for children. Sentences and paragraphs are for adults. However, you are by no means
expected to write good proofs immediately. Proof-writing is a fine skill that takes patience and
practice to acquire. Do not be discouraged when you write pieces of rubbish along the way. Being
able to write proofs is a goal of this course, not a prerequisite.
More specifically, writing proofs is persuasive writing. You have a proposition, and you want your
audience to believe that it is true. Yet mathematical proofs have a character of their own among
other kinds of persuasive writing. We are not about the business of amassing evidence or showing
that something is probable—mathematics stands out even among the other sciences in this regard.
One proof cannot merely be stronger than another; a proof either proves its theorem absolutely
or not at all. A mathematical proof of a theorem may in some ways be less weighty than, say, an
argument for a theological position. On the other hand, a mathematical proof has a level of precision
that no other discourse community can approach. This is the community standard to which you
must live up; when you write proofs, justify everything.
You may imagine the stereotypical drill sergeant telling recruits, “When you speak to me, the
first and last word out of your mouth better be ‘sir.’ ” You will notice an almost militaristic rigidity
in the proof-writing instruction in this book. Every proof must begin with suppose. Every other
sentence must contain by or since or because (with a few exceptions, which will generally contain
another suppose instead). Your last sentence must begin with therefore. Your proofs will conform to
a handful of patterns like the element argument we have seen here. However, this does not represent
the whole of mathematical proofs. If you go on in mathematical study, you will see that the structure
and phrasing of proofs can be quite varied, creative, and even colorful. Steps are skipped, mainly
for brevity. Phrasing can be less precise, as long as it is believable that it could be tightened
up. However, this is not yet the place for creativity, but for fundamental training. We teach you to
march now. Learn it well, and someday you will dance.
Exercises

Let A, B, and C be sets, subsets of the universal set. Prove.

1. A ⊆ A ∪ B.
2. A − B ⊆ B̅.
3. A ∩ B̅ ⊆ A − B.
4. (A ∪ B)̅ ⊆ A̅ ∩ B̅.
5. A ∪ (B ∩ C) ⊆ (A ∪ B) ∩ (A ∪ C).
6. (A × B) ∪ (A × C) ⊆ A × (B ∪ C).
7. A × (B − C) ⊆ (A × B) − (A × C).
Chapter 11

Set equality and empty proofs

X = Y if X ⊆ Y ∧ Y ⊆ X
This means you already know how to prove propositions of set equality—it is the same as proving
subsets, only it needs to be done twice (once in each direction of the equality). Observe this proof
of A − B = A ∩ B̅:

Proof. First, suppose x ∈ A − B. By the definition of set difference, x ∈ A and
x ∉ B. By the definition of complement, x ∈ B̅. Then, by the definition of intersection,
x ∈ A ∩ B̅. Hence, by the definition of subset, A − B ⊆ A ∩ B̅.

Next, suppose x ∈ A ∩ B̅. . . . Fill in your proof from Chapter 10, Exercise 3. . . Hence,
by the definition of subset, A ∩ B̅ ⊆ A − B.

Therefore, by the definition of set equality, A − B = A ∩ B̅. □
Notice that this proof required two parts, highlighted by “first” and “next,” each part with its
own supposition and each part arriving at its own conclusion. We used the word “hence” to mark
the conclusion of a part of the proof and “therefore” to mark the end of the entire proof, but they
mean the same thing. Notice also the general pattern: suppose an element in the left side and show
that it is in the right; suppose an element in the right side and show that it is in the left.
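This two-direction pattern can even be spot-checked mechanically. The following ML sketch is our own illustration, not code from this book: the helper names (member, subset, setEq, and the rest) are hypothetical, sets are represented as int lists, and the identity being checked is A − B = A ∩ B̅, with the complement taken relative to an explicit universal set.

```sml
(* Sets represented as int lists; all of these helper names are our
   own, not definitions from the text. *)
fun member (x, []) = false
  | member (x, y::ys) = x = y orelse member (x, ys);

fun subset ([], _) = true
  | subset (x::xs, b) = member (x, b) andalso subset (xs, b);

(* Set equality as two subset checks, one in each direction. *)
fun setEq (a, b) = subset (a, b) andalso subset (b, a);

fun diff (a, b) = List.filter (fn x => not (member (x, b))) a;
fun intersect (a, b) = List.filter (fn x => member (x, b)) a;

(* Complement taken relative to an explicit universal set u. *)
fun complement (u, b) = diff (u, b);

val u = [1,2,3,4,5,6];
val a = [1,2,3,4];
val b = [3,4,5];

(* The identity, checked on one example. *)
val check = setEq (diff (a, b), intersect (a, complement (u, b)));
```

Such a check confirms the identity only for the sample sets; it is the proof that establishes it for all sets.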
To avoid redoing work (and making proofs unreasonably long), you may use previously proven
propositions as justification in a proof. A theorem that is proven only to be used as justification in
a proof of another theorem is called a lemma. Lemmas and theorems are either identified by name
or by number. For our purposes, we can also refer to theorems by their exercise number. Here is a
re-writing of the proof:
Lemma 11.1 A − B ⊆ A ∩ B̅.

Proof. Suppose x ∈ A − B. By the definition of set difference, x ∈ A and x ∉ B. By
the definition of complement, x ∈ B̅. Then, by the definition of intersection, x ∈ A ∩ B̅.
Therefore, by the definition of subset, A − B ⊆ A ∩ B̅. □
Now the proof of our theorem becomes a one-liner (incomplete sentences may be countenanced
when things get this simple):
Proof. By Lemma 11.1, Exercise 10.3, and the definition of set equality. □
11.2 Set emptiness
A ∩ A̅ = ∅

To address Set Form 3, we must consider what it means for a set to be empty; though this may
seem obvious, a precise proof cannot be written if it does not have a precise definition to aim at.
A set X is empty if

∼ ∃ x ∈ U | x ∈ X
It is frequently useful to prove that a certain sort of thing does not exist. The doubter’s objection
that “just because you have not found one does not mean they do not exist” is quite a high hurdle for
the prover. We do, however, have a weapon for propositions of this sort—the proof by contradiction
syllogism we learned in Chapter 7. We suppose the opposite of what we are trying to prove
∃ x ∈ U | x ∈ X
show that this leads to a contradiction, and then conclude what we were trying to prove. This is
indeed one of the most profound techniques in mathematics. G.H. Hardy remarked, “It is a far finer
gambit than any chess gambit: a chess player may offer the sacrifice of a pawn or even a piece, but
the mathematician offers the game.”
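To make the shape of such an argument concrete, here is a sketch, in our own words rather than the book's, of how the contradiction pattern disposes of the proposition A ∩ A̅ = ∅ displayed above:

```latex
\textbf{Proof.} Suppose $A \cap \overline{A} \neq \emptyset$; that is, suppose
$\exists\, x \in U \mid x \in A \cap \overline{A}$.
By the definition of intersection, $x \in A$ and $x \in \overline{A}$.
By the definition of complement, $x \notin A$.
But then $x \in A$ and $x \notin A$, a contradiction.
Therefore $A \cap \overline{A} = \emptyset$. $\Box$
```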
In other words, there is no proof by tautology. The invalid argument form is

p → T
∴ p

The truth table below presents another way to see why not—in the second critical row, the
conclusion is false.

p    T    p → T    ∴ p
T    T      T       T     ←− critical row
F    T      T       F     ←− critical row
Exercises

Let A, B, and C be sets, subsets of the universal set U. Prove.
You may use exercises from the previous chapter in your proofs.

1. A ∪ ∅ = A.
2. A ∪ (A ∩ B) = A.
3. A × (B ∪ C) = (A × B) ∪ (A × C).
4. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
5. A ∪ A̅ = U.
6. A × (B − C) = (A × B) − (A × C).
7. A ∪ B = B ∪ A.
8. (A ∪ B) ∪ C = A ∪ (B ∪ C).
9. (A ∩ B) ∩ C = A ∩ (B ∩ C).
10. (A̅)̅ = A.
11. A ∪ U = U.
12. (A ∪ B)̅ = A̅ ∩ B̅.
13. (A ∩ B)̅ = A̅ ∪ B̅.
14. A ∪ (A ∩ B) = A.
15. A ∪ B = A ∪ (B − (A ∩ B)).
16. A ∩ ∅ = ∅.
17. A − ∅ = A.
18. A ∩ A̅ = ∅.
19. A × ∅ = ∅.
20. A − A = ∅.
21. (A − B) ∩ (A ∩ B) = ∅.
22. (A − B) ∩ B = ∅.
23. A ∪ (B − (A ∩ B)) = ∅.
Chapter 12
Conditional proofs
If A ⊆ B, then A ∩ B = A.

Proof. Suppose A ⊆ B.

First, suppose x ∈ A ∩ B. By the definition of intersection, x ∈ A. Hence, by the
definition of subset, A ∩ B ⊆ A.

Next, suppose x ∈ A. Since A ⊆ B, by the definition of subset, x ∈ B. Then, by the
definition of intersection, x ∈ A ∩ B. Hence, by the definition of subset, A ⊆ A ∩ B.

Therefore, by the definition of set equality, A ∩ B = A. □
We see from this that the proof of this slightly more sophisticated proposition is really composed
of smaller proofs with which we are already familiar. At the innermost levels, we have subset proofs,
two of them. Together, they constitute a proof of set equality, as we saw in the previous chapter. We
are now merely wrapping that proof in one more layer to get a proof of a conditional proposition.
That “one more layer” is another supposition. Take careful stock of how the word suppose is used
in the proof above.
The sub-proof of A ⊆ A ∩ B makes no sense out of context. Certainly a set A is not in general
a subset of its intersection with another set. What makes that true (as we say in the proof) is that
we have supposed a restriction, namely A ⊆ B. The wrapper provides a context that makes the
proposition A ⊆ A ∩ B true.
When we say suppose, we are boarding Mister Rogers’s trolley to the Neighborhood of Make-Believe.
We are creating a fantasy world in which our supposition is true, and then demonstrating
that something else happens to be true in the world we are imagining. The imaginary world must
obey all mathematical laws plus the laws we postulate in our supposition. Sometimes, it is useful to
make another supposition, in which case we enter a fantasy world within the first fantasy world—that
world must obey all the laws of the outer world, plus whatever is now supposed.
Trace our journey through the proof above. We begin in the real world (or at least the world of
mathematical set theory), World 0. We imagine a world in which A ⊆ B, World 1. From that world
we imagine yet another world, World 2, in which x ∈ A ∩ B, and show that in World 2, x ∈ A. This
proves that any world like World 2 within World 1 will behave that way, and this proves that within
World 1, A ∩ B ⊆ A. (It happens that this is true in World 0 as well, but we do not prove it.) We
exit World 2. We then imagine a World 3 within World 1 in which x ∈ A, and in a way similar to
what we had before, show that A ⊆ A ∩ B. Together, these things about World 1 also show that
A ∩ B = A in World 1, and our proof is complete. Notice that our return to World 0 is not explicit.
This is because the proposition we are proving does not ask us to prove anything directly about the
real world (as propositions in General Form 1 do). Rather, it asks us to prove something about all
worlds in which A ⊆ B.
12.2 Integers
For variety, let us try out these proof techniques in another realm of mathematics. Here we will
prove various propositions about integers, particularly about what facts depend upon their being
even or odd. These proofs will rely on formal definitions of even and odd: An integer x is even if

∃ k ∈ Z | x = 2k

and an integer x is odd if

∃ k ∈ Z | x = 2k + 1
We will take as axioms that integers are closed under addition and multiplication, and that all
integers are either even or odd and not both.
Axiom 3 If x, y ∈ Z, then x + y ∈ Z.
Axiom 4 If x, y ∈ Z, then x · y ∈ Z.
You may also use basic properties of arithmetic and algebra in your proofs. Cite them as “by rules
of arithmetic” or “by rules of algebra,” although if you recall the name of the specific rule or rules
being used, your proof will be better if you cite them. We begin with the proposition that the sum
of any two even integers is even.
Proof. Suppose x and y are even integers. By the definition of even, there exist j, k ∈ Z
such that x = 2j and y = 2k. Then
x + y = 2j + 2k by substitution
= 2(j + k) by distribution
Further, j + k ∈ Z because integers are closed under addition. Hence there is an integer,
namely j + k, such that x + y = 2(j + k). Therefore x + y is an even integer by the
definition of even. □
Notice how we brought in the variables j and k. By saying “. . . there exist j, k. . . ”, we have
made an implicit supposition about what j and k are. The definition of even establishes that this
supposition is legal. Notice also how we structured the steps from x + y to 2(j + k). This is a
convenient shorthand for dealing with long chains of equations. Were we to write this out fully, we
would say x + y = 2j + 2k by substitution, 2j + 2k = 2(j + k) by distribution, and x + y = 2(j + k)
by substitution again (or by the transitivity of equals). This is not so bad when we are juggling only
three expressions, but a larger number would become unreadable without chaining them together.
Finally, the second to last sentence is italicized because it merely rephrases what the previous two
sentences gave us. It is included here for explicitness, but you may omit such sentences.
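The proposition can also be tested empirically in ML. This sketch is ours (the function names are made up for the occasion), and a finite check like this is evidence rather than proof—the proof above is what covers all integers.

```sml
(* x is even iff x = 2k for some integer k; for machine integers this
   is equivalent to x mod 2 = 0. *)
fun isEven x = x mod 2 = 0;

(* The first n even naturals: 0, 2, 4, ... *)
fun evensUpTo n = List.tabulate (n, fn i => 2 * i);

(* Check that the sum of every pair drawn from a sample of even
   integers is itself even. *)
val evens = evensUpTo 10;
val allSumsEven =
    List.all (fn x => List.all (fn y => isEven (x + y)) evens) evens;
```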
12.3 Biconditionals
Biconditional propositions (those of General Form 3) stand to conditional propositions in the
same relationship as proofs of set equality stand to subset proofs. A biconditional is simply two
conditionals written as one proposition; one merely needs to prove both of them.
A − B = ∅ iff A ⊆ B

Proof. First, suppose A − B = ∅. Further suppose x ∈ A. Since A − B = ∅, x ∉ A − B.
By the definition of set difference, it is not the case that x ∈ A and x ∉ B. Since x ∈ A,
it must be that x ∈ B. Hence, by the definition of subset, A ⊆ B.

Conversely, suppose A ⊆ B. Further suppose A − B ≠ ∅, that is, there exists x ∈ A − B.
By the definition of set difference, x ∈ A and x ∉ B. But since A ⊆ B and x ∈ A, we
have x ∈ B, a contradiction. Hence A − B = ∅.

Therefore, A − B = ∅ iff A ⊆ B. □
Notice how many suppositions we have scattered all over the proof—and we are still proving fairly
simple propositions. To avoid confusion you should use paragraph structure and transition words
to guide the reader around the worlds you are moving in and out of. The word conversely, for example,
indicates that we are now proving the second direction of a biconditional proposition (which is the
converse of the first direction).
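A biconditional can likewise be spot-checked from both directions at once. In this ML sketch of ours (the helper names are hypothetical, and sets are represented as int lists), the two sides of A − B = ∅ iff A ⊆ B should agree on every pair of sample sets:

```sml
(* Sets as int lists; these helpers are ours, not the book's. *)
fun member (x, ys) = List.exists (fn y => y = x) ys;
fun subset (xs, ys) = List.all (fn x => member (x, ys)) xs;
fun diff (xs, ys) = List.filter (fn x => not (member (x, ys))) xs;

val samples = [[], [1], [1,2], [2,3], [1,2,3]];

(* For every pair (a, b), "a - b is empty" and "a is a subset of b"
   must be the same truth value. *)
val agree =
    List.all (fn a =>
        List.all (fn b => null (diff (a, b)) = subset (a, b)) samples)
      samples;
```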
12.4 Warnings
We conclude this chapter—and the part of this book explicitly about proofs—by exposing
logical errors particularly seductive to those learning to prove. Let not your feet go near their
houses.
If this reasoning were sound, then this would also prove that the sum of any two even integers is 8.
Reusing variables.
Suppose x and y are even integers. By the definition of even, there exists k ∈ Z such
that x = 2k and y = 2k.
Since x and y are even, each of them is twice some other integer—but those are different integers.
Otherwise, we would be proving that all even integers are equal to each other. What is confusing
about the correct way we wrote this earlier, “there exist j, k ∈ Z such that x = 2j and y = 2k,” is
that we were contracting the longer phrasing, “there exists j ∈ Z such that x = 2j and there exists
k ∈ Z such that y = 2k.” Had we reused the variable k in the longer version, it would be clear
that we were trying to reuse a variable we had already defined. This kind of mistake is extremely
common for beginners.
This nonsense proof tries to postulate a world in which the proposition to be proven is already true,
followed by irrelevant manipulation of symbols. You cannot make any progress by supposing what
you mean to prove—this is merely a subtle form of the “proof by tautology” we repudiated in the
previous chapter.
This is more a matter of style and readability than logic. Remember that each step in a proof should
yield a new known proposition, justified by previously known facts. “If A − B = ∅, then it cannot be
that x ∈ A − B” does no such thing, only informing us that x ∉ A − B, contingent upon A − B = ∅
being true. Instead, this part of the proof should assert that x ∉ A − B because A − B = ∅.
Exercises
Chapter 13

Special Topic: Russell's Paradox
The usefulness of the set concept lies in its simplicity and its flexibility. For this reason set theory
is a core component of the foundations of mathematics. An example of the concept's flexibility
is that we can talk sensibly about sets of sets—for example, powersets. Reasoning becomes more
subtle, however, if we speak of sets that contain themselves. For example, the set X of all sets
mentioned in this book is hereby mentioned in this book, and so X ∈ X. Bertrand Russell urged
caution when playing with such ideas with the following paradox.
Let X be the set of all sets that do not contain themselves, that is, X = {Y | Y ∉ Y }. Does X
contain itself?
First, suppose it does, that is, X ∈ X. However, then the definition of X states that only
those sets that do not contain themselves are elements of X, so X ∉ X, which is a contradiction.
Hence X ∉ X. But wait—the same definition of X now tells us that X ∈ X, and we have another
contradiction.
A well-known puzzle, also attributed to Russell, presents the same problem. Suppose a certain
town has only one man who is a barber. That barber shaves only every man in the town who does
not shave himself. Does the barber shave himself? If he does, then he doesn’t; if he doesn’t, then
he does.
We can conclude from this only that the setup of the puzzle is an impossibility. There could not
possibly be a man who shaves only every man who does not shave himself. Likewise, the set of all
sets that do not contain themselves must not exist. This is why rigorous set theory must be built on
axioms. Although we have not presented a complete axiomatic foundation for set theory here, we
have assumed that at least one set exists (namely, the empty set), and we have assumed a notion
of what it means for sets to be equal. We have not assumed that any set that can be described
necessarily exists.
For this reason, when we name sets in a proof (“let X be the set. . . ”), we really are doing more
than assigning a name to some concept; we are jumping to the conclusion that the concept exists.
Therefore if we are defining sets in terms of a property (“let X = {x | x < 3}”), it is more rigorous
to make that set a subset of a known (or postulated) set merely limited by that property (“let
X = {x ∈ Z | x < 3}”).
This clears away the paradox nicely. Since it makes sense only to speak about things in the
universal set, we will assume that X ⊆ U , that is, X = {Y ∈ U | Y ∉ Y }. Now the question, Does
X contain itself?, becomes, Is it true that X ∈ U and X ∉ X? First suppose X ∈ U . Then either
X ∉ X or X ∈ X. As we saw before, both of those lead to contradictions. Hence X ∉ U . In other
words, X does not exist.
Interestingly, this leaves powerset without a foundation, since we cannot define it as a subset of
something else. Since we do not want to abandon the idea altogether, we place it on firm ground
with its own axiom.
Axiom 6 (Powerset.) For any set X, there exists a set P(X) such that Y ∈ P(X) if and only
if Y ⊆ X.
We have declared it O.K. to speak of powersets. However, we can prove that no set contains its
own powerset.
Moreover, this shows that the idea of a “set of all sets” is downright impossible in our axiomatic
system.
Proof. Suppose A were the set of all sets. By Axiom 6, P(A) also exists. Since P(A) is
a set of sets and A is the set of all sets, P(A) ⊆ A. However, by the theorem, P(A) ⊈ A,
a contradiction. Hence A does not exist. □
This chapter draws heavily from Epp [5] and Hrbacek and Jech [9].
Part IV
Algorithm
Chapter 14
Algorithms
Notice how this algorithm takes two addends as its input, produces an answer for its output,
and uses the notions of sum and current column as temporary scratch space. It is particularly
important to notice that the sum and the current column keep changing.
A grammatical analysis of the algorithm is instructive. All the sentences are in the imperative
mood.¹ Contrast this with propositions, which were all indicative. The algorithm does contain
¹It is a convenient aspect of English, however, that the “to” at the end of the previous paragraph makes them
infinitives.
propositions, however: the independent clauses “the current column has a number for either of the
addends” and “the last round of the repetition has a carry” are either true or false. Moreover, those
propositions are used to guard steps 2 and 3; the words “if” and “while” guide decisions about
whether or how many times to execute a certain command. Finally, note that step 2 is actually the
repetition of four smaller steps, bound together as if they were one. This is similar to how we use
simple expressions to build more complex ones.
We already know how to do certain pieces of this in ML: Boolean expressions represent propositions
and if expressions make decisions. What we have not yet seen is how to repeat, how to change
the value of a variable, how to compound steps together to be treated as one, or generally how to do
anything explicitly imperative. ML's tools for doing these things will seem clunky, because they go
against the grain of the kind of programming for which ML was designed. It is still worth our time
to consider imperative algorithms; bear with the unusual syntax for the next few chapters, and after
that ML will appear elegant again.
14.2 Repetition and change
The results of 7 + 3, 15.6 / 34.7, and 4 < 17 are ignored, turning them into statements. A
statement list itself, however, has a value and thus is an expression and not a statement. Statements
are used for their side effects, changes they make to the system that will affect expressions later.
We do not yet know anything that has a side effect.
A while statement is a construct that will evaluate an expression repeatedly as long as a given
condition is true. Its form is
while <expression> do <expression>
Since it is a statement, it will always result in () so long as it does indeed finish. If we try
- val x = 4;
val x = 4 : int
- while x < 5 do x - 3;
ML gives no response, because we told it to keep subtracting 3 from x as long as x is less than
5. Since x is 4, it is and always will be less than 5, and so the execution of the statement goes
on forever. (Press control-C to make it stop.) The algorithm for adding two multi-digit numbers
has a concrete stopping point: when you run out of columns in the addends. The problem here is
that since variables do not change during the evaluation of an expression, there is no way that the
boolean expression guarding the loop will change either. Trying
- while x < 5 do x = x - 3;
will merely compare x with x - 3 forever. The attempt
- while x < 5 do val x = x - 3;

is rejected outright, because a declaration may not appear where an expression is expected. ML
does, however, allow an identifier to be redeclared:

- val y = 5;
val y = 5 : int
- val y = 16;
val y = 16 : int
- val y = "Owl";
val y = "Owl" : string
What is actually happening here is like when King Xerxes, realizing he could not revoke an
old decree, instead issued a new decree that counteracted the old one. Technically, when you reuse
an identifier in ML, you are not changing the value of the old variable but making a new variable
of the same name that shadows the old one. This is a technical detail that is transparent to most
ML programming (you need not understand or remember it), but it makes a difference in writing
imperative algorithms because you cannot redeclare a variable in the middle of an expression or
statement.
To deal with this, ML allows for a different kind of variable, called a reference variable, which
can change value. Three rules distinguish reference variables from ordinary ones.

• To declare a reference variable, precede the expression assigned with the keyword ref.

- val x = ref 5;
val x = ref 5 : int ref
• To set a reference variable, use an assignment statement in the form

<identifier> := <expression>
- x := !x + 1;
val it = () : unit
• To retrieve the current value of a reference variable, precede it with the dereferencing
operator !.
- !x;
val it = 6 : int
Notice how types work here. The type of the variable x is int ref, and the type of the expression !x
is int. The operator ! always takes something of type 'a ref for some type 'a and returns something
of type 'a.
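Putting the three rules together, a short session might look like this (counter and current are our own example names, not from the text):

```sml
val counter = ref 0;              (* counter : int ref, initially holding 0 *)
val _ = counter := !counter + 1;  (* assignment: a statement whose value is () *)
val _ = counter := !counter + 1;
val current = !counter;           (* current : int, the dereferenced value 2 *)
```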
As an example, consider computing a factorial. Using product notation, we define

n! = ∏_{i=1}^{n} i

For instance,

5! = ∏_{i=1}^{5} i = 1 × 2 × 3 × 4 × 5 = 120
Notice the algorithmic nature of this definition (or at least of the product notation). It says to keep a
running product while repeatedly multiplying by a multiplier, incrementing the multiplier by one at
each step, starting at one, until a limit for the multiplier is reached. With reference variables in our
arsenal, we can do this in ML:
- val i = ref 0;
val i = ref 0 : int ref
- val fact = ref 1;
val fact = ref 1 : int ref
- while !i < 5 do
=     (i := !i + 1;
=      fact := !fact * !i);
val it = () : unit
- !fact;
val it = 120 : int
The while statement produces only (), even though it computes 5!. It does not report that
answer (because it is not an expression), but instead changes the value of fact and i—a good
example of a side effect. Thus we need to evaluate the expression !fact to discover the answer.
- i := 0;
val it = () : unit
- fact := 1;
val it = () : unit
- (while !i < 5 do
=     (i := !i + 1;
=      fact := !fact * !i);
=  !fact);
val it = 120 : int
However, the variables i and fact will still exist after the computation finishes, even though
they no longer have a purpose. The duration of a variable's validity is called its scope. A variable
declared at the prompt has scope starting then and continuing until you exit ML. We would like
instead to have variables that are local to the expression, that is, variables that the expression alone
can use and that disappear when the expression finishes executing. We can make local variables by
using a let expression with form

let
   <declarations>
in
   <expression>
end
The value of the last expression is also the value of the entire let expression. Now we rewrite
factorial computation as a single, self-contained expression:
- let
= val i = ref 0;
= val fact = ref 1;
= in
= (while !i < 5 do
= (i := !i + 1;
= fact := !fact * !i);
= !fact)
= end;
val it = 120 : int
The only thing deficient in our program is that it is not reusable. Why would we bother, after
all, with a 9-line algorithm for computing 5! when the one line 1 * 2 * 3 * 4 * 5 would have
produced the same result? The value of an algorithm is its generality. This is the same algorithm
we would use to compute any other factorial n!, only that we would replace 5 in the fifth line with n.
In other words, we would like to parameterize the expression, or wrap it in a package that accepts
n as input and produces the factorial for any n. Just as we parameterized statements with fun to
make predicates, so we can here.
- fun factorial(n) =
= let
= val i = ref 0;
= val fact = ref 1;
= in
= (while !i < n do
= (i := !i + 1;
= fact := !fact * !i);
= !fact)
= end;
- factorial(5);
val it = 120 : int
- factorial(8);
val it = 40320 : int
- factorial(10);
val it = 3628800 : int
For a second time, you are informally introduced to the notion of a function. We see it here as
a way to package an algorithm so that it can be used as an expression when given an input value.
Later we will see how to knit functions together to eliminate the need for while loops and reference
variables in almost all cases.
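The same packaging works for any loop of this shape. As a further illustration of our own (sumTo is not a function from the text), here is 1 + 2 + · · · + n computed by the same pattern of local reference variables and a while loop:

```sml
fun sumTo (n) =
    let
        val i = ref 0;       (* counts up from 0 to n *)
        val total = ref 0;   (* the running sum *)
    in
        (while !i < n do
            (i := !i + 1;
             total := !total + !i);
         !total)
    end;
```

For instance, sumTo(5) yields 15 and sumTo(100) yields 5050.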
14.4 Example
Chapter 4 introduced arrays as a type for storing finite, uniform, and randomly accessible collections
of data. You may recall that when you performed array operations, the interpreter responded
with () : unit, which we now know indicates that these are statements. The following displays and
uses an algorithm for computing long division. The function takes a divisor (as an integer) and a
dividend (as an array of integers, each position representing one column), and it returns a tuple
standing for the quotient and the remainder. Study this carefully.
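The listing of longdiv itself did not survive in this copy of the text, so the following is only our own sketch of such a function—an assumption consistent with the call shown below (a divisor, a dividend array, and the number of columns) and with the familiar column-by-column rule: bring down a digit, divide, carry the remainder.

```sml
(* Our sketch, not the book's listing. longdiv (d, a, n) divides the
   n-digit number stored one digit per cell in a by d, returning the
   quotient digits in a new array together with the remainder. *)
fun longdiv (d, a, n) =
    let
        val q = Array.array (n, 0);   (* quotient digits *)
        val r = ref 0;                (* running remainder *)
        val i = ref 0;                (* current column *)
    in
        (while !i < n do
            (r := !r * 10 + Array.sub (a, !i);   (* bring down the next digit *)
             Array.update (q, !i, !r div d);     (* this column's quotient digit *)
             r := !r mod d;                      (* carry the remainder *)
             i := !i + 1);
         (q, !r))
    end;
```

On an array holding the digits 3, 5, 2, 1, this sketch gives quotient digits 0, 4, 4, 0 and remainder 1, since 3521 = 8 × 440 + 1.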
- update(A, 0, 3);
val it = () : unit
- update(A, 1, 5);
val it = () : unit
- update(A, 2, 2);
val it = () : unit
- update(A, 3, 1);
val it = () : unit
- A;

- longdiv(8,A,4);
Exercises
4. The Fibonacci sequence is defined by repeatedly adding the two previous numbers in the
sequence (starting with 0 and 1) to obtain the next number, i.e. … Then think about how you
can rewrite this so that the temporary value can be local to the while statement, and a regular
variable instead of a reference variable.)
Chapter 15
Induction
P(X) = {Y | Y ⊆ X}

For example, the powerset of {1, 2, 3} is {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}. Here is an
algorithm for computing the powerset of a given set (represented by a list), annotated.
- fun powerset(set) =
=   let
=     (* remainingSet is the part of set we have not yet processed. *)
=     val remainingSet = ref set;
=     (* powSet is the powerset as we have calculated it so far. *)
=     val powSet = ref ([[]] : int list list);
=   in
=     (* While there is still some of the set left to process... *)
=     (while not (!remainingSet = nil) do
=        let
=          (* powAddition is what we are adding to the powerset this
=             time around. *)
=          val powAddition = ref ([] : int list list);
=          (* remainingPowSet is the part of powSet we have not yet
=             processed. *)
=          val remainingPowSet = ref (!powSet);
=          (* currentElement is the element of set we are processing
=             this time around: ...pick the next element... *)
=          val currentElement = hd(!remainingSet);
=        in
=          (* ...and while there is still part of what we have made so
=             far left to process... *)
=          (while not (!remainingPowSet = nil) do
=             let
=               (* currentSubSet is the subset we are processing this
=                  time around: ...pick the next subset... *)
=               val currentSubSet = hd(!remainingPowSet);
=             in
=               (* ...add currentElement to that subset... *)
=               (powAddition := (currentElement :: currentSubSet)
=                               :: !powAddition;
=                remainingPowSet := tl(!remainingPowSet))
=             end;
=           (* ...and add all those new subsets to the powerset. *)
=           powSet := !powSet @ !powAddition;
=           remainingSet := tl(!remainingSet))
=        end;
=      !powSet)
=   end;
- powerset([1,2,3]);
val it = [[],[1],[2,1],[2],[3,2],[3,2,1],[3,1],[3]] : int list list
As we will see in a later chapter, this is far from the best way to compute this in ML. Dissecting
it, however, will exercise your understanding of algorithms from the previous chapter. Of immediate
interest to us is the relationship between the size of a set and the size of its powerset. Recall that
the cardinality of a finite set X, written |X|, is the number of elements in the set. (Defining the
cardinality of infinite sets is a more delicate problem, one we will pick up later.) With our algorithm
handy, we can generate the powerset of sets of various sizes and count the number of elements in
them.
- powerset([]);
val it = [[]] : int list list
- powerset([1]);
val it = [[],[1]] : int list list
- powerset([1,2]);
val it = [[],[1],[2,1],[2]] : int list list
- powerset([1,2,3,4]);
val it = [[],[1],[2,1],[2],[3,2],[3,2,1],[3,1],[3],
[4,3],[4,3,1],[4,3,2,1],[4,3,2], ...] : int list list
The ellipses on the last result indicate that there are more items in the list, but the list exceeds
the length that the ML interpreter normally displays. We summarize our findings in the following
table:

|A|    |P(A)|
 0        1
 1        2
 2        4
 3        8
 4      > 12

The pattern we recognize is that |P(A)| = 2^|A| (so the last number is actually 16). An
informal way to verify this hypothesis is to think about what the algorithm is doing. We start out
with the empty set—all sets will at least have that as a subset. Then, for each other element in the
original set, we add it to each element in our powerset so far. Thus, each time we process an element
from the original set, we double the size of the powerset. So if the set has cardinality n, we begin
with a set of cardinality 1 and double its size n times; the resulting set must have cardinality 2^n.
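In our own notation, this doubling argument amounts to a recurrence on the cardinality:

```latex
|\mathcal{P}(\emptyset)| = 1, \qquad
|\mathcal{P}(A \cup \{a\})| = 2\,|\mathcal{P}(A)| \text{ for } a \notin A,
\qquad\text{so}\qquad |\mathcal{P}(A)| = 2^{|A|}.
```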
This makes sense, but how do we prove it formally? For this we will use a new proving technique,
proof by mathematical induction.
15.2 Proof of powerset size
This is in General Form 1 (straight facts, without explicit condition). However, restating this
will make the proof easier. First, we define the predicate I(n) to be “for any finite set A with
|A| = n, we have |P(A)| = 2^n.” The proposition to prove is then

∀ n ∈ W, I(n)

What this gives us is that now we are predicating our work on whole numbers (“prove this for all
n”) rather than sets (“prove this for all A”).
If A and B are finite sets and A ⊆ B, then |B − A| = |B| − |A|. (Chapter 25, Exercise 10.)
If a ∈ A then P(A) = P(A − {a}) ∪ { {a} ∪ A′ | A′ ∈ P(A − {a})}. (Chapter 12, Exercise 5.)
If a ∈ A then P(A − {a}) ∩ { {a} ∪ A′ | A′ ∈ P(A − {a})} = ∅. (Chapter 12, Exercise 6.)
If A is a finite set and a ∈ A, then |{ {a} ∪ A′ | A′ ∈ P(A − {a})}| = |P(A − {a})|. (Chapter 25, Exercise 11.)
If A and B are finite, disjoint sets, then |A ∪ B| = |A| + |B|. (Theorem 25.2.)
Remember, we want to prove I(n) for all n. We have proven it only for n = 0, leaving infinitely
many possibilities to go. So far, this approach looks frighteningly like a proof by exhaustion with
an infinity of cases. However, this apparently paltry result allows us to say one thing more.
∃ N ≥ 0 such that I(N)
We know that this is true because it is at least true for n = 0. Possibly it is true for greater values
of n also. But this foot in the door now lets us take all other cases in a giant leap.
By Exercises 5 and 6 of Chapter 12, P(A − {a}) and { {a} ∪ A′ | A′ ∈ P(A − {a})} are
a partition of P(A).
Do not let the complicated notation intimidate you. All we are doing is splitting P(A) into the
subsets that contain a and those that do not. Review the relevant exercises of Chapter 12 if necessary.
Now we have proven (a) I(N ) for some N , and (b) if I(n) then I(n + 1). Stated differently, we have
shown that it is true for the first case, and that if it is true for one case it is true for the next case.
Thus,
By the principle of math induction, I(n) for all n ∈ W. Therefore, |P(A)| = 2^|A|. □
15.3 Mathematical induction
Axiom 7 (Mathematical Induction) If I(n) is a predicate with domain W, then if I(0) and if
for all N ≥ 0, I(N ) → I(N + 1), then I(n) for all n ∈ W.
(The principle can also be applied where I(n) has domain N, and we prove I(1) first; or, any
other integer may be used as a starting point. Assuming zero to be our starting point allows us to
state the principle more succinctly.)
Even though we take this as an axiom for our purposes, it can be proven based on other axioms
generally taken in number theory. Intuitively, an inductive proof is like climbing a ladder. The first
step is like getting on the first rung. Every other step is merely a movement from one rung to the
next. Thus you prove two things: you can get on the first rung, and, supposing that you reach the
nth rung at some point, you can move to the n + 1st rung. These two facts taken together prove
that you eventually can reach any rung.
Recall the dialogue between the doubter and the prover in Chapter 9. In this case, the prover is
claiming I(n) for all n. When the doubter challenges this, the prover says, “Very well, you pick an n,
and I will show it works.” When the doubter has picked n, the prover then proves I(0); then, using
I(0) as a premise, proves I(1); uses I(1) to prove I(2); and so forth, until I(n) is proven. However,
all of these steps except the first are identical. A proof using mathematical induction provides a
recipe for generating a proof up to any n, should the doubter be so stubborn.
Therefore, a proof by induction has two parts, a base case and an inductive case. For clarity, you should label these in your proof. Thus the format of a proof by induction should be:

Proof. Let I(n) be [the predicate]. We will prove I(n) for all n by induction on n.
Base case. [Prove I(0), or I of whatever value serves as the starting point.]
Inductive case. [Suppose I(N) for an arbitrary N; prove I(N + 1).]
By math induction, I(n) for all n. □
You may notice that with our formulation of the axiom, the proposition ∃ N ≥ 0 such that I(N )
is technically unnecessary. We will generally add it for clarity.
Consider the following bogus proof that the cows in any set all have the same color.¹

Proof. Let the predicate I(n) be "for any set of n cows, every cow in that set has the same color." We will prove I(n) for all n ≥ 1 by induction on n.
Base case. Suppose we have a set of one cow. Since that cow is the only cow in the set, it obviously has the same color as itself. Thus all the cows in that set have the same color. Hence I(1). Moreover, ∃N ≥ 1 such that I(N).
Inductive case. Now, suppose we have a set, C, of N + 1 cows. Pick any cow, c1 ∈ C. The set C − {c1} has N cows, by Exercise 10 of Chapter 12. Moreover, by I(N), all cows in the set C − {c1} have the same color. Now pick another cow c2 ∈ C, where c2 ≠ c1. We know c2 must exist because |C| = N + 1 ≥ 1 + 1 = 2. By reasoning similar to that above, all cows in the set C − {c2} must have the same color.
Now, c1 ∈ C − {c2}, and so c1 must have the same color as the rest of the set (that is, besides c2). Similarly, c2 ∈ C − {c1}, so c2 must have the same color as the rest of the set. Hence c1 and c2 also have the same color as each other, and so all cows of C have the same color.

Therefore, by math induction, I(n) for all n, and all cows of any sized set have the same color. □
The error lurking here is the conclusion that c1 and c2 have the same color just because they
each have the same color as “the rest of C.” Since we had previously proven only I(1), we cannot
assume that C has any more than two elements—so it could be that C = {c1 , c2 }. In that case, {c1 }
and {c2 } each have cows all of the same color, but that does not relate c1 to c2 or prove anything
about the color of the cows of C.
The problem is not induction at all, but a faulty implicit assumption that C has size at least
three and that I(2) is true. If we had proven I(2) (which of course is ridiculous), then we could
indeed prove I(n) for all n. Suppose C = {c1 , c2 , c3 }. Take c1 out, leaving {c2 , c3 }. By I(2), c2
and c3 have the same color. Put c1 back in, take c2 out, leaving {c1 , c3 }. By I(2) again, c1 and c3
have the same color. By “transitivity of color,” c1 and c2 have the same color. Hence I(3), and by
induction I(n) for all n. One false assumption can prove anything.
15.5 Example
In Exercises 12 and 13 of Chapter 11, you proved what are known as DeMorgan's laws for sets (compare them with DeMorgan's laws from Chapter 5). We can generalize those rules for unions and intersections of more than two sets. First, we define the iterated union and iterated intersection of the collection of sets A1, A2, . . . , An:

$$\bigcup_{i=1}^{n} A_i = A_1 \cup A_2 \cup \ldots \cup A_n$$

$$\bigcap_{i=1}^{n} A_i = A_1 \cap A_2 \cap \ldots \cap A_n$$

¹The original formulation, by George Pólya, was a proof that "any n girls have eyes of the same color" [11]. That was in 1954. Cows are not as interesting, but they are a more appropriate subject, given modern sensitivities.
We prove the generalized law: for all n ∈ N and for all collections of sets A1, A2, . . . , An, $\overline{\bigcup_{i=1}^{n} A_i} = \bigcap_{i=1}^{n} \overline{A_i}$.

Proof. By induction on n.

Base case. Suppose n = 1, and suppose A1 is a (collection of one) set. Then, by definition of iterated union and iterated intersection, $\overline{\bigcup_{i=1}^{1} A_i} = \overline{A_1} = \bigcap_{i=1}^{1} \overline{A_i}$. Hence there exists some N ≥ 1 such that $\overline{\bigcup_{i=1}^{N} A_i} = \bigcap_{i=1}^{N} \overline{A_i}$.

Inductive case. Now, suppose A1, A2, . . . , AN+1 is a collection of N + 1 sets. Then

$$\begin{aligned}
\overline{\bigcup_{i=1}^{N+1} A_i} &= \overline{A_1 \cup A_2 \cup \ldots \cup A_N \cup A_{N+1}} && \text{by definition of iterated union} \\
&= \overline{\left(\bigcup_{i=1}^{N} A_i\right) \cup A_{N+1}} && \text{also by definition of iterated union} \\
&= \overline{\bigcup_{i=1}^{N} A_i} \cap \overline{A_{N+1}} && \text{by Exercise 12 of Chapter 11} \\
&= \left(\bigcap_{i=1}^{N} \overline{A_i}\right) \cap \overline{A_{N+1}} && \text{by the inductive hypothesis} \\
&= (\overline{A_1} \cap \overline{A_2} \cap \ldots \cap \overline{A_N}) \cap \overline{A_{N+1}} && \text{by definition of iterated intersection} \\
&= \bigcap_{i=1}^{N+1} \overline{A_i} && \text{also by definition of iterated intersection.}
\end{aligned}$$

Therefore, by math induction, for all n ∈ N and for all collections of sets A1, A2, . . . , An, $\overline{\bigcup_{i=1}^{n} A_i} = \bigcap_{i=1}^{n} \overline{A_i}$. □
Notice that in this proof we never gave a name to the predicate (for example, I(n)). This means we could not say "since I(N)" to justify that $(\bigcap_{i=1}^{N} \overline{A_i}) \cap \overline{A_{N+1}}$ is equivalent to the previous expressions. Instead, we used the term inductive hypothesis, which refers to our supposition that the predicate is true for N. Notice also that we never make this supposition explicitly with the word "suppose." We make it implicitly when we say that it is true for some N (we have proven that some N exists, 1 at least), but we are still supposing an arbitrary number, and calling it N. For your first couple of tries at math induction, you will probably find it easier to get it right if you name and use the predicate explicitly. Once you get the hang of it, you will probably find the way we wrote the preceding proof more concise.
Exercises

Prove using mathematical induction, for all n ∈ N.

1. $\overline{\bigcap_{i=1}^{n} A_i} = \bigcup_{i=1}^{n} \overline{A_i}$.

2. $\bigcup_{i=1}^{n} (A \cap B_i) = A \cap \left(\bigcup_{i=1}^{n} B_i\right)$.

3. $\bigcup_{i=1}^{n} (A_i - B) = \left(\bigcup_{i=1}^{n} A_i\right) - B$.

4. $\bigcap_{i=1}^{n} (A_i - B) = \left(\bigcap_{i=1}^{n} A_i\right) - B$.

5. A summation is an iterated addition, defined by $\sum_{i=1}^{n} a_i = a_1 + a_2 + \ldots + a_n$ for some formula $a_i$, depending on i. Using math induction, prove $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ for all n ∈ N.

6. We say that an integer a is divisible by b ≠ 0 (or b divides a), written b|a, if there exists an integer c such that a = cb. Prove that for all x ∈ Z and for all n ∈ W, $x - 1 \mid x^n - 1$. Hint: first suppose x ∈ Z. Then use math induction on n. If you do not see how it works at first, pick a specific value for x, say 3, and try it for n = 0, n = 1, n = 2, and n = 3. Notice the pattern, and then use math induction to prove it for all n, but still assuming x = 3. Then rewrite the proof for an arbitrary x.
Chapter 16
Correctness of algorithms

Consider the following program, which is intended to compute the sum 1 + 2 + . . . + N:
- fun arithSum(N) =
= let
= val s = ref 0;
= val i = ref 1;
= in
= (while !i <= N do
= (s := !s + !i;
= i := !i + 1);
= !s)
= end;
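For example, a sample session computing 1 + 2 + 3 + 4:

```sml
- arithSum(4);
val it = 10 : int
```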
But how do we know if this is correct? Or what does it even mean for a program to be correct?
Intuitively, a program is correct if, given a certain input, it will produce a certain, desired output.
In this case, if the input N is an integer (something the ML interpreter will enforce), then it should
evaluate to the desired sum. When a programmer is testing software, he or she runs the program on
several inputs chosen to represent the full variety of the input range and compares the result of the
program to the expected result. Obviously this approach to correctness is based on experimentation;
it is inductive reasoning in the non-mathematical sense, and while it can increase confidence in a
program’s correctness, it cannot prove correctness absolutely (except in the unrealistic case that
every possible input is tested—a proof by exhaustion over the set of possible input). This process
also assumes the intended result of the program can be verified conveniently by hand (or by another program); if that were truly convenient, we might not have written the program in the first place.
Let us consider this notion of correctness more formally. The correctness of a given program is defined by two sets of propositions: pre-conditions, which we expect to be true before the program is evaluated, and post-conditions, which we expect to hold afterwards. If the post-conditions hold whenever the pre-conditions are met, then we say that the algorithm is correct. This approach is particularly useful in that it scales down to apply to smaller portions of an algorithm. Suppose we have the pre-condition "a is a nonnegative integer" for the statement

b := !a + 1

We can think of many plausible post-conditions for this, including "b is a positive integer," b > a, and b − a = 1. Whichever of these makes sense, they are things we can prove mathematically, as long as we implicitly take the efficacy of the assignment statement as axiomatic; propositions deduced that way are justified as by assignment.
Moreover, the post-conditions of a single statement are then the pre-conditions of the following
statement; the pre-conditions of the entire program are the pre-conditions of the first statement;
and the post-conditions of the final statement are the post-conditions of the entire program. Thus
by inspecting how each statement in turn affects the propositions in the pre- and post-conditions,
we can prove an algorithm is correct. Consider this program for computing a mod b (assuming we
can still use div).
- fun remainder(a, b) =
= let
= val q = a div b;
= val p = q * b;
= val r = a - p;
= in
= r
= end;
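For example, since 17 = 5 · 3 + 2:

```sml
- remainder(17, 5);
val it = 2 : int
```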
Sometimes we will consider val-declarations merely as setting up initial pre-conditions; since they do all the work of computation in this example, we will consider them statements to be inspected. The pre-condition for the entire program is that a, b ∈ Z⁺. The result of the program should be a mod b; equivalently, the post-condition of the declarations is r = a mod b. To analyze what a div b does, we rely on the following standard result from number theory, a proof for which can be found in any traditional discrete math text.

Theorem 16.1 (Quotient-Remainder Theorem) If n, d ∈ Z⁺, then there exist unique integers q and r such that n = d · q + r and 0 ≤ r < d.

This theorem is the basis for our definition of division and modulus. q in the above theorem is called the quotient, and a div b = q. r is called the remainder, and a mod b = r. As a first pass, we intersperse the algorithm with little proofs.
Suppose a, b ∈ Z⁺.

val q = a div b

By assignment, q = a div b. By the Quotient-Remainder Theorem and the definition of division, a = b · q + R for some R ∈ Z, where 0 ≤ R < b. By algebra, q = (a − R)/b.

val p = q * b

p = q · b by assignment. p = a − R by substitution and algebra.

val r = a - p

By assignment, r = a − p. By substitution and algebra, r = a − (a − R) = R. Therefore, by the definition of mod, r = a mod b. □
In arithSum, the guard is !i <= N. (An iteration is an execution of the body of a loop; the boolean expression which we test to see if the loop should continue is called the guard.) We want to prove some proposition to be true for an arbitrary number of iterations. The number of iterations is certainly a whole number. This suggests a proof by induction on the number of iterations.
We need a predicate, then, whose argument is the number N of iterations, and we need to show
it to be true for all N ≥ 0. This is asking for a lot of flexibility from this predicate: since it must be
true for 0 iterations, it is the pre-condition for the entire loop. Since it must be true for N iterations,
it is the post-condition for the entire loop. Since it must be true for every value between 0 and N , it
is the post-condition and pre-condition for every iteration along the way. Of course, if a statement
or code section has identical pre-conditions and post-conditions, that suggests the code does not do
anything. That is not what is in view here—these various pre- and post-conditions are not identical
to each other, but rather are parameterized. We must formulate this predicate in such a way that
the parameterization captures what the loop does. The predicate must state what the loop does not
change, with respect to the number of iterations.
A loop invariant I(n) is a predicate whose argument, n, is the number of iterations of the loop, chosen so that

• I(0) is true (that is, the proposition is true before the loop starts; this must be proven as the base case in the proof).

• I(n) implies I(n + 1) (that is, if the proposition is true before a given iteration, it will still be true after that iteration; this must be proven as the inductive case in the proof).

• If the loop terminates (after, say, N iterations), then I(N) is true (this follows from the two previous facts and the principle of math induction).

• Also, if the loop terminates, then I(N) implies the post-condition of the entire loop.

These four points correspond to four steps in the proof that a loop is correct: we prove that the loop is initialized so as to establish the loop invariant (initialization); that a given iteration maintains the loop invariant (maintenance); that the loop will eventually terminate (termination), that is, that the guard will be false eventually; and that the loop invariant implies the post-condition. The first two steps constitute a proof by induction, on which we focus. The last two complete the proof of algorithm correctness.
Now we prove

Theorem 16.2 For all N ∈ W, the program arithSum computes $\sum_{i=1}^{N} i$.

Proof. Suppose N ∈ W. Let I(n) be the loop invariant "after n iterations, i = n + 1 and s = $\sum_{k=1}^{n} k$."
Base case / initialization. After 0 iterations, i = 1 and s = 0 = $\sum_{k=1}^{0} k$ by assignment and the definition of summation. Hence I(0), and so there exists an n′ ≥ 0 such that I(n′).

Inductive case / maintenance. Suppose I(n′). Let i_old be the value of i after the n′th iteration and before the (n′ + 1)st iteration, and let i_new be the value of i after the (n′ + 1)st iteration. Similarly define s_old and s_new. By I(n′), i_old = n′ + 1 and s_old = $\sum_{k=1}^{n'} k$. By assignment and substitution, s_new = s_old + i_old = $\sum_{k=1}^{n'} k + (n' + 1) = \sum_{k=1}^{n'+1} k$. Similarly, i_new = i_old + 1 = (n′ + 1) + 1. Hence I(n′ + 1).

Hence by math induction, I(n) for all n ∈ W, and so I(N).
Before we continue, be attentive to the subtle matter of our choice of variables. What is with the N, the n, and the n′? N is the input to the program arithSum. n is the argument to (or independent variable of) the predicate I(n) used to analyze arithSum. One thing we are trying to prove is I(N) (I is true for n = N) in the imaginary world created by supposing N. n′ is an arbitrary whole number such that I(n′), used inside of our inductive proof that I(n) for all n. Properly disambiguating variables is essential for all proofs, and it becomes particularly hard in proofs of algorithm correctness, when you are juggling similar variables and proving a property of an object which itself has variables. Failure to do this will lead to equivocal nonsense.
Termination. By I(N), after N iterations, i = N + 1 > N, and so the guard will fail. Moreover, by I(N), after N iterations, s = $\sum_{k=1}^{N} k$. By change of variable, s = $\sum_{i=1}^{N} i$, which is our post-condition for the program. Hence the program is correct. □

Next, consider a program that finds the minimum element of an array:
- fun findMin(array) =
= let
= val min = ref (sub(array, 0));
= val i = ref 1;
= in
= (while !i < length(array) do
= (if sub(array, !i) < !min
= then min := sub(array, !i)
= else ();
= i := !i + 1);
= !min)
= end;
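A sample use, assuming the Array structure (which supplies the sub and length used above) is open:

```sml
- findMin(Array.fromList([5, 3, 8]));
val it = 3 : int
```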
As you will see by inspecting the code, it finds the minimum by considering each element in the
array in order. The variable min is used to store the “smallest seen so far.” Every time we see an
element smaller than the smallest so far (sub(array, !i) < !min), we make that element the new
smallest so far (then min := sub(array, !i)), and otherwise do nothing (else ()). At the end
of the loop, the “smallest so far” is also the “smallest overall.”
Thus our loop invariant must somehow capture the notion of "smallest so far." To make this more convenient, we introduce the notation A[i..j], which will stand for the set of values that are elements of the subarray of A in the range from i inclusive to j exclusive. That is, A[i..j] = { A[k] | i ≤ k < j }. Our invariant I(n), then, is that after n iterations, i = n + 1 and min is the minimum element of A[0..n + 1].

Notice how the conditional requires a division into cases. Whichever way the condition branches, we end up with min being the minimum of the range so far.
Exercises

In Exercises 1–3, prove the predicates to be loop invariants for the loops in the following programs.

1. I(n) = "x is even."

- fun aaa(n) =
= let
= val x = ref 0;
= val i = ref 0;
= in
= (while !i < n do
= (x := !x + 2 * !i;
= i := !i + 1);
= !x)
= end;

2. I(n) = "x + y = 100."

- fun bbb(n) =
= let
= val x = ref 50;
= val y = ref 50;
= val i = ref 0
= in
= (while !i < n do
= (x := !x + 1;
= y := !y - 1);
= !x + !y)
= end;

3. I(n) = "x + y is odd."

- fun ccc(n) =
= let
= val x = ref 0;
= val y = ref 101;
= in
= (while !x < n do
= (x := !x + 4;
= y := !y - 2);
= !x + !y)
= end;

4. Finish the proof of correctness for findMin. That is, show that the loop will terminate and that the loop invariant implies the post-condition.

5. Write a program which, given an array of integers, computes the sum of the elements in the array. Write a complete proof of correctness for your program: determine pre-conditions and post-conditions, determine a loop invariant, prove it to be a loop invariant, show that the loop will terminate, and show that the loop invariant implies the post-conditions.

Exercises 2 and 3 are taken from Epp [5].
Chapter 17

From theorems to algorithms

Recall the Quotient-Remainder Theorem from the previous chapter:

Theorem 16.1 If n, d ∈ Z⁺, then there exist unique integers q and r such that n = d · q + r and 0 ≤ r < d.
Notice that this result tells us of the existence of something. A proof for such a proposition must either show that it is impossible for the item not to exist (by a proof by contradiction) or give an algorithm for calculating the item. The latter is called a constructive proof, and it in fact gives more information than merely the truth of the proposition. The traditional proof of the QRT is non-constructive.
However, the theorem itself, if studied carefully, tells us much about how to build an algorithm for finding the quotient and the remainder, called the Division Algorithm. In the previous chapter, we proved results (specifically, correctness) about algorithms. In this chapter, we derive algorithms from results, in a process that is essentially the reverse of that of the previous chapter.
Consider what the QRT says about q and r. The two assertions (the propositions subordinate
to “such that”) can be thought of as restrictions that must be met as we try to find suitable values.
We have:
• n = d · q + r.
• r ≥ 0.
• r < d.
The restrictions n = d · q + r and r ≥ 0 are not difficult to satisfy; simply take q = 0 and
r = n—call this our initial guess. That guess, however, might conflict with the other restriction,
r < d. Our general strategy, then, is to alter our initial guess, making sure that any changes we
make do not violate the earlier restriction, and repeating until we satisfy the other restriction, too.
Note the elements of our algorithm sketch, which you should recognize from studying correctness proofs from the previous chapter. We have:

• A loop invariant: n = d · q + r.

• A termination condition: r < d.

In ML,
- fun divisionAlg(n, d) =
= let
= val r = ref n;
= val q = ref 0;
= in
= (while !r >= d do
= ();
= (!q, !r))
= end;
Notice that we return the quotient and the remainder as a tuple. The only thing missing is the
body of the loop—that is, how to mutate q and r so as to preserve the invariant and to make progress
towards the termination condition. It may seem backwards to determine the husk of a loop before
its body or to set up a proof of correctness before writing what is to be proven correct. However,
this way of thinking encourages programming that is amenable to correctness proof and enforces a
discipline of programming and proving in parallel.
Since the termination condition is r < d, progress is made by making r smaller. If the termination condition does not hold, then r ≥ d, and so there is some nonnegative integer, say y, such that r = d + y. Note that

n = d · q + r
  = d · q + d + y
  = d · (q + 1) + y

Compare the form of the expression at the bottom of this derivation with that at the top. As we did in the previous chapter, we will reckon the effect of the change to r by distinguishing r_old from r_new, and similarly for q. Then let

r_new = y = r_old − d
q_new = q_old + 1

In other words, decrease r by d and increase q by one. This preserves the loop invariant (by substitution, n = d · q_new + r_new) and reduces r.
- fun divisionAlg(n, d) =
= let
= val r = ref n;
= val q = ref 0;
= in
= (while !r >= d do
= (r := !r - d;
= q := !q + 1);
= (!q, !r))
= end;
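A sample run; since 17 = 5 · 3 + 2, the quotient is 3 and the remainder is 2:

```sml
- divisionAlg(17, 5);
val it = (3,2) : int * int
```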
We have a correctness proof ready-made. Furthermore, the correctness of the algorithm proves
the theorem (the existence part of it, anyway), since a way to compute the numbers proves that
they exist.
The greatest common divisor (GCD) of two positive integers a and b is the number d such that

• d|a and d|b (that is, d is a common divisor of a and b), and

• for all c such that c|a and c|b, c|d (that is, d is the greatest of all common divisors).
Recall from the definition of divides that the first item means that there exist integers p and q such that a = p · d and b = q · d. The second item says that any other divisor of a and b must also be a divisor of d. The GCD of a and b, commonly written gcd(a, b), is most recognized for its use in simplifying fractions. Clearly a simple and efficient way to compute the GCD would be useful for any application involving integers or rational numbers. Here we have two lemmas:

Lemma 17.1 For all a ∈ Z⁺, gcd(a, 0) = a.

Lemma 17.2 If a, b ∈ Z⁺ and r = a mod b, then gcd(a, b) = gcd(b, r).

The proof of Lemma 17.1 is left as an exercise. For the proof of Lemma 17.2, consult a traditional discrete math text. We now follow the same steps as in the previous section.
The fact that we have two lemmas of course makes this more complicated. However, we can simplify this by unifying their conclusions. They both provide equivalent expressions for gcd(a, b), except that Lemma 17.1 addresses the special case of b = 0. From this we have the intuition that the loop invariant will probably involve the GCD.

Now consider how the lemmas differ. The equivalent expression Lemma 17.2 gives is simply a new GCD problem; Lemma 17.1, on the other hand, gives a final answer. Thus we have our termination condition: b = 0. Lemma 17.2 then helps us distinguish between what does not change and what does: the parameters to the GCD change; the answer does not. If we distinguish the changing variables a and b from their original values a0 and b0, we have the loop invariant gcd(a, b) = gcd(a0, b0).

All that remains is how to mutate the variables a and b. a takes on the value of the old b; b takes on the value of r from Lemma 17.2:

a_new = b_old
b_new = r = a_old mod b_old
This is known as the Euclidean Algorithm, though it was probably known before Euclid.
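In ML, a direct transcription of these update rules might look like the following sketch (the reference variables and their names are our assumptions):

```sml
fun gcd(a0, b0) =
  let
    val a = ref a0;
    val b = ref b0;
  in
    (while !b <> 0 do
       (a := !b;            (* anew = bold *)
        b := !a mod !b);    (* intended: bnew = aold mod bold *)
     !a)
  end;
```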
- gcd(21,36);
val it = 36 : int
36 clearly is not the GCD of 21 and 36. What went wrong? Our proof-amenable approach has lulled us into a false sense of security, but it also will help us identify the problem speedily. The line b := !a mod !b computes not b_new = a_old mod b_old but rather b_new = a_new mod b_old. We must calculate r before we change a. This can be done with a let expression:
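For instance (a sketch with assumed names; the inner let binds r before a is overwritten):

```sml
fun gcd(a0, b0) =
  let
    val a = ref a0;
    val b = ref b0;
  in
    (while !b <> 0 do
       let
         val r = !a mod !b  (* compute the remainder first *)
       in
         (a := !b;
          b := r)
       end;
     !a)
  end;
```

With this version, gcd(21, 36) evaluates to 3, as desired.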
17.3 The Euclidean Algorithm, another way
- fun gcd(a, 0) = a
= | gcd(a, b) = gcd(b, a mod b);
The loop disappears completely. Instead of effecting repetition by a command to iterate (as with
a while statement), repetition happens implicitly by the repeated calling of the function. Instead of
mutating reference variables, we feed different parameters into gcd each time. Essentially we have
gcd(21, 36)
= gcd(36, 21)
= gcd(21, 15)
= gcd(15, 6)
= gcd(6, 3)
= gcd(3, 0)
= 3
This interaction of a function with itself is called recursion, which means self-reference. A simple recursive function has a conditional (or patterns) separating into two cases: a base case, a point at which the recursion stops; and a recursive case, which involves a recursive function call. Notice how the termination condition of the iterative version corresponds to the base case of the recursive version. Also notice the similarity of structure between inductive proofs and recursive functions.
What is most striking is how much more concise the recursive version is. Recursive solutions do tend to be compact and elegant. Something else, however, is also at work here: ML is intended for a programming style that makes heavy use of recursion. More generally, this style is called the functional or applicative style of programming (as opposed to the iterative style, which uses loops), where the central concept of building algorithms is the applying of functions, as opposed to the repeating of loop bodies or the stringing together of statements. This style will be taught for the remainder of this course.
Exercises
Chapter 18
Recursive algorithms
In the previous chapter we were introduced to recursive functions. We will explore this further here,
with a focus on processing sets represented in ML as lists, and also on list processing in general.
This focus is chosen not only because these applications are amenable to recursion, but also because
of the centrality of sets to discrete mathematics and of lists to functional programming. We divide
our discussion into two sections: functions that take lists (and possibly other things as well) as
arguments and compute something about those lists; and functions that compute lists themselves.
18.1 Analysis
Suppose we are modeling the courses that make up a typical mathematics or computer science
program, and we wish to analyze the requirements, overlap, etc. Datatypes are good for modeling
a universal set.
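For example, a declaration along the following lines can define the universe (the constructor list here is a sketch, limited to course names that appear elsewhere in this chapter):

```sml
(* A universal set of courses, modeled as a datatype *)
datatype course = Discrete | RealAnalysis | DiffEq | Compilers | OperSys;
```

A set of courses, such as csCourses used below, is then simply a value of type course list (or, in the array version, course array).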
The most basic set operation is ∈, set membership. We call it an "operation," though it can just as easily be thought of as a predicate, say isElementOf. (In Part V, we will see that it is also a relation.) Set membership is very difficult to describe formally, in part because sets are unordered. However, any structure we use to represent sets must impose some incidental order on the elements. Computing whether a set contains an element obviously requires searching the set to look for the element, and the incidental ordering must guide our search. If we represented a set using an array, we might write a loop which iterates over the array and updates a bool variable to keep track of whether we have found the desired item. Compare this with findMin from Chapter 16:
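Such a loop might look like the following sketch (the parameter names are our assumptions; sub and length are the same array operations findMin uses):

```sml
(* Array-based membership test: scan until found or out of bounds *)
fun isElementOf(item, set) =
  let
    val found = ref false;
    val i = ref 0;
  in
    (while not (!found) andalso !i < length(set) do
       (if sub(set, !i) = item
        then found := true
        else ();
        i := !i + 1);
     !found)
  end;
```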
- csCourses;
- isElementOf(OperSys,csCourses);
(This function would give the same result if we had left out not (!found) andalso. Why did we include that extra condition?)
Arrays are a cumbersome way to represent sets, and as we will see in the next section, completely
useless when we want to derive new sets from old. It is the fixed size and random access of arrays
that make them so inconvenient, and this is why lists have been and continue to be our preferred
representation of sets. We have a pattern for writing an array algorithm which asks, How does this
algorithm on the entire array break down to a step to be taken on each position? The answer to
this question becomes the body of the loop. We must develop a corresponding strategy for lists.
It is always easiest to start with the trivial. If a set is empty, no item is an element of it. The almost-trivial case is if by luck the element we are looking for is the head of the list. What if the element is somewhere else in the list? Or what if it is not in the list, but the list is not empty? Both of those questions are subsumed by asking, Is the element in rest or not?
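Putting the trivial case, the lucky case, and the recursive case together, one way to write the function is:

```sml
fun isElementOf(x, []) = false                (* trivial case: the empty set *)
  | isElementOf(x, y::rest) =
      x = y orelse isElementOf(x, rest);      (* lucky case, or search the rest *)
```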
Do not fail to marvel at the succinctness of this recursive algorithm on a list, contrasted with
the iterative algorithm on an array. The key insight is that after checking if the list is empty and
whether the first item is what we are looking for, we have reduced the problem (if we have not
solved it immediately) to a new problem, identical in structure to the original, but smaller. This
is the heart of thinking recursively. It works so well on lists because lists themselves are defined
recursively; a list is

• an empty list, or

• an item (the head) attached to a smaller list (the rest).
You will learn to recognize that the answer to the last sub-question is also the solution to the entire
problem.
Let us apply this now to a new problem: computing the cardinality of a set. The type of the
function cardinality will be ’a list -> int. If a set is empty, then its cardinality is zero. Otherwise,
if a set A contains at least one element, x, then we can note
A = {x} ∪ (A − {x})
and since {x} and A − {x} are disjoint, |A| = |{x}| + |A − {x}| = 1 + |A − {x}|.
- fun cardinality([]) = 0
= | cardinality(x::rest) = 1 + cardinality(rest);
- cardinality(csCourses);
val it = 5 : int
We can divide the recursive case into three parts. First, there is the work done before the
recursive call of the function; when working with lists, this work is usually the splitting of the list
into its head and tail, which can be done implicitly with pattern matching, as we have done here.
Second, there is the recursive call itself. Finally, we usually must do some work after the call, in this
case accounting for the first element by adding one to the result of the recursive call.
18.2 Synthesis
Now we consider functions that will construct lists. One drawback of using lists to represent sets is
that lists may have duplicate items. Our set operation functions operate under the assumption that
the list has been constructed so there happen to be no duplicates. We will get undesired responses
if that assumption breaks.
It would be useful to have a function that will strip duplicates out of a list, say makeNoRepeats.
Applying the same strategy as before, first consider how to handle an empty list. Since an empty
list cannot have any repeats, this is the trivial case, but keep in mind we are not computing an int
or bool based on this list. Instead we are computing another list, in this case, just the empty list.
- fun makeNoRepeats([]) = []
Now, what do we do with the case of x::rest? The subproblem that corresponds to the entire
problem is removing the duplicates from rest. It is a safe guess that our answer will include
makeNoRepeats(rest). That leaves just x to be dealt with. If we wish to include x in our resulting
list, we can use the cons operator to construct that new list, just as we use it to break down a list
in pattern matching: x::makeNoRepeats(rest). However, we should include x in the resulting list
only if it does not appear there already (otherwise it would be a duplicate); it will appear in the
result of the recursive call if and only if it appears in rest (why?). Thus we have
- fun makeNoRepeats([]) = []
= | makeNoRepeats(x::rest) =
= if isElementOf(x,rest)
= then makeNoRepeats(rest)
= else x::makeNoRepeats(rest);
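To see which occurrence survives, here is a sample session on an integer list with a duplicate (assuming the list version of isElementOf is in scope):

```sml
- makeNoRepeats([1, 2, 1, 3]);
val it = [2,1,3] : int list
```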
To test your understanding of how this works, explain why it was the first occurrence of OperSys
that disappeared, rather than the second.
We noted in Chapter 4 that the cat operator is a poor way to perform a union on sets represented as lists, because any elements in the intersection of the two lists will be included twice. Now we can write a simple union function, making use of makeNoRepeats:
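One possibility is a sketch like the following, which cats the two lists together and then strips the duplicates:

```sml
fun union(s1, s2) = makeNoRepeats(s1 @ s2);
```

Any element belonging to both s1 and s2 appears twice in s1 @ s2, and makeNoRepeats removes the extra copy.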
A final example will reemphasize the importance of types. Recall that one can make a list of any
type, including other list types. Now suppose we wanted to take a list, create one-element lists of
all its elements, and return a list of those lists, for example
- listify([RealAnalysis,Discrete,DiffEq,Compilers]);
An empty list is still listified to an empty list. But now the work that needs to be done to x is
to make a list out of it.
- fun listify([]) = []
= | listify(x::rest) = [x] :: listify(rest);
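For instance, on an integer list:

```sml
- listify([1, 2, 3]);
val it = [[1],[2],[3]] : int list list
```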
What is interesting here is an analysis of the types of the subexpressions of the recursive case, [x] :: listify(rest): since x has type 'a, the singleton [x] has type 'a list; listify has type 'a list -> 'a list list and rest has type 'a list, so listify(rest) has type 'a list list; and consing the 'a list onto the 'a list list gives the whole expression the type 'a list list.
Exercises
Part V
Relation
Chapter 19
Relations
19.1 Definition
Relations are not a new concept for you. Most likely, you learned about relations by how they
differed from functions. We, too, will use relations as a building block for studying functions in
Part VI, but we will also consider them for logical and computational interest in their own right.
If we consider curves in the real plane, the equation y = 4 − x² represents a function because its graph passes the vertical line test. A circle, like y² = 4 − x², fails the test and thus is not a function, but it is still a relation.
[Graphs of y = 4 − x² and y² = 4 − x².]
Notice that a “curve” in a graph is really a set of points; so is the equation the curve represents, for that matter. y² = 4 − x² can be thought of as defining the set that includes (0, 2), (2, 0), (0, −2), (−2, 0), (√2, −√2), (−√2, √2), etc. A point, in turn, is actually an ordered pair of real numbers.
This leads to what you may remember as the high school definition of a relation: a set of ordered
pairs. We should also recognize them as subsets of R × R.
Our notion of relation will be broader and more technical, allowing for subsets of any Cartesian product. Thus if X and Y are sets, then a relation R from X to Y is a subset of X × Y. If X = Y, we say that R is a relation on X. (Sometimes subsets of higher-order Cartesian products are also considered; in that context, our definition of a relation more specifically is of a binary relation.) If (x, y) ∈ R, we say that x is related to y. This is sometimes written xRy, especially if R is a special
symbol like |, ∈, or =, which, you should notice, are all relations. We can also consider R to be a graph, drawing an arrow from x to y whenever x is related to y.
Some examples, each written as a set of ordered pairs:

The relation | (divides) from {2, 3, 5, 7} to {14, 15, 16, 17, 18}:
{(2, 14), (2, 16), (2, 18), (3, 15), (3, 18), (5, 15), (7, 14)}

The relation ∈ from {2, 3, 5, 7} to {{2, 3, 5}, {3, 5, 7}, {2, 7}}:
{(2, {2, 3, 5}), (2, {2, 7}), (3, {2, 3, 5}), (3, {3, 5, 7}),
(5, {2, 3, 5}), (5, {3, 5, 7}), (7, {3, 5, 7}), (7, {2, 7})}

The relation eats on the set {hawk, coyote, rabbit, fox, clover}:
{(coyote, rabbit), (hawk, rabbit), (coyote, fox), (fox, rabbit), (rabbit, clover)}

[Each relation is also pictured as a graph, with an arrow drawn from x to y whenever (x, y) is in the relation.]
Notice that in some cases, an element may be related to itself. This is represented graphically by a self-loop.
19.2 Representation
A principal goal of this course is to train you to think generally about mathematical objects. Originally, the mathematics you knew was only about numbers. Gradually you learned that mathematics can be concerned with other things as well, such as points or matrices. Our primary extension of
this list has been sets. You should now be expanding your horizon to include relations as full-fledged
mathematical objects, since a relation is simply a special kind of set. In ML, our concept of value
corresponds to what we mean when we say mathematical object. We now consider how to represent
relations in ML.
Our running example through this part is the relations among geographic entities of Old Testament Israel. As raw material, we represent (as datatypes) the sets of tribes and bodies of water.
[Map of Old Testament Israel, showing the tribes (Dan, Asher, Zebulun, Naphtali, Issachar, Manasseh, Ephraim, Gad, Benjamin, Reuben, Judah, Simeon) and the bodies of water (Mediterranean, Chinnereth, Jordan, Jabbok, Dead, Arnon).]
- datatype Tribe = Asher | Naphtali | Zebulun
=   | Issachar | Dan | Gad
=   | Manasseh | Reuben | Ephraim
=   | Benjamin | Judah | Simeon;
- datatype WaterBody = Mediterranean | Dead
=   | Jordan | Chinnereth
=   | Arnon | Jabbok;
Now we consider the relation from Tribe to WaterBody defined so that tribe t is related to water
body w if t is bordered by w. We could represent this using a predicate, and we will later find some
circumstances where a predicate is more convenient to use. Using lists, however, it is much easier
to define a relation.
- val tribeBordersWater =
= [(Asher, Mediterranean), (Manasseh, Mediterranean), (Ephraim, Mediterranean),
= (Naphtali, Chinnereth), (Naphtali, Jordan), (Issachar, Jordan),
= (Manasseh, Jordan), (Gad, Jordan), (Benjamin, Jordan), (Judah, Jordan),
= (Reuben, Jordan), (Judah, Dead), (Reuben, Dead), (Simeon, Dead),
= (Gad, Jabbok), (Reuben, Jabbok), (Reuben, Arnon)];
val tribeBordersWater =
[(Asher,Mediterranean),(Manasseh,Mediterranean),(Ephraim,Mediterranean),
(Naphtali,Chinnereth),(Naphtali,Jordan),(Issachar,Jordan),
(Manasseh,Jordan),(Gad,Jordan),(Benjamin,Jordan),(Judah,Jordan),
(Reuben,Jordan),(Judah,Dead),...] : (Tribe * WaterBody) list
The important thing is the type reported by ML: (Tribe * WaterBody) list. Since (Tribe *
WaterBody) is the Cartesian product and list is how we represent sets, this fits perfectly with our
formal definition of a relation. Now we can use our predicate isElementOf from Chapter 18 as a
model for a predicate determining if two items are related.
- fun isRelatedTo(a, b, []) = false
= | isRelatedTo(a, b, (h1, h2)::rest) =
= (a = h1 andalso b = h2) orelse isRelatedTo(a, b, rest);
- isRelatedTo(Judah, Jordan, tribeBordersWater);
val it = true : bool
- isRelatedTo(Simeon, Chinnereth, tribeBordersWater);
val it = false : bool
19.3 Manipulation
We conclude this introduction to relations by defining a few objects that can be computed from relations. First, the image of an element a ∈ X under a relation R from X to Y is the set
IR(a) = {b ∈ Y | (a, b) ∈ R}
The image of 3 under | (using the sets of integers from the earlier example) is {15, 18}. The image
of fox under eats is { rabbit }. The image of Reuben under tribeBordersWater is [Jordan, Dead,
Jabbok, Arnon].
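In ML, the image under a list-represented relation can be collected with a simple recursion. This sketch (the definition is ours, named to match the mathematical notation) gathers the second component of every pair whose first component matches:

- fun image(a, []) = []
=   | image(a, (c, d)::rest) =
=     if a = c then d::image(a, rest) else image(a, rest);
- image(Reuben, tribeBordersWater);
val it = [Jordan,Dead,Jabbok,Arnon] : WaterBody list

The result lists the pairs in the order they appear in tribeBordersWater.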
The inverse of a relation R from X to Y is the relation R−1 from Y to X defined by R−1 = {(b, a) ∈ Y × X | (a, b) ∈ R}.
Exercises
6. Write a function addImage which takes an element and a set (represented as a list) and returns a relation under which the given element’s image is the given set.
7. Write a function compose which takes two relations and returns the composition of those two relations. (Hint: use your image and addImage functions.)
Chapter 20
Properties of relations
20.1 Definitions
Certain relations that are on a single set X (as opposed to being from X to a distinct Y) may have some of several interesting properties.
• A relation R on a set X is reflexive if ∀ x ∈ X, (x, x) ∈ R. You can tell a reflexive relation by its graph because every element will have a self-loop. Examples of reflexive relations are = on anything, ≡ on logical propositions, ≤ and ≥ on R and its subsets, ⊆ on sets, and “is acquainted with” on people.
• A relation R on a set X is symmetric if ∀ x, y ∈ X, (x, y) ∈ R implies (y, x) ∈ R.
• A relation R on a set X is transitive if ∀ x, y, z ∈ X, (x, y) ∈ R and (y, z) ∈ R imply (x, z) ∈ R.
Furthermore, a relation is an equivalence relation if it is reflexive, symmetric, and transitive.
20.2 Proofs
These properties give an exciting opportunity to revisit proving techniques because they require careful consideration of what burden of proof their definitions demand. For example, suppose we have the theorem that the relation | is reflexive on Z+.
If we unpack this using the definition of “reflexive,” we see that this is a “for all” proposition, which itself contains a set-membership proposition which requires the application of the definition of this particular relation.
Symmetry and transitivity require similar reasoning, except that their “for all” propositions
require two and three picks, respectively, and they contain General Form 2 propositions.
Proof. Suppose a, b, c ∈ Z, and suppose a|b and b|c. By the definition of divides, there
exist d, e ∈ Z such that a·d = b and b·e = c. By substitution and association, a(d·e) = c.
By the definition of divides, a|c. Hence | is transitive. 2
Notice that this proof involved two applications of the definition of the relation, the first to
analyze the fact that a|b and b|c, the second to synthesize the fact that a|c. Why did we restrict
ourselves to Z+ for reflexivity, but consider all of Z for transitivity?
20.3 Equivalence relations
Proving that a specific relation is an equivalence relation follows a fairly predictable pattern.
The parts of the proof are the proving of the three individual properties.
Theorem 20.3 Let R be the relation on Z defined so that (a, b) ∈ R if a + b is even. Then R is an equivalence relation.
Proof. Suppose a ∈ Z. Then by arithmetic, a + a = 2a, which is even by definition.
Hence (a, a) ∈ R and R is reflexive.
Now suppose a, b ∈ Z and (a, b) ∈ R. Then, by the definition of even, a + b = 2c for
some c ∈ Z. By the commutativity of addition, b + a = 2c, which is still even, and so
(b, a) ∈ R. Hence R is symmetric.
Finally suppose a, b, c ∈ Z, (a, b) ∈ R and (b, c) ∈ R. By the definition of even, there
exist d, e ∈ Z such that a+b = 2d and b+c = 2e. By algebra, a = 2d−b. By substitution
and algebra
a + c = 2d − b + c = 2d − 2b + b + c
= 2d − 2b + 2e = 2(d − b + e)
which is even by definition (since d − b + e ∈ Z). Hence (a, c) ∈ R, and so R is transitive.
Therefore, R is an equivalence relation by definition. 2
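The relation of Theorem 20.3 is easy to render in ML as a predicate (the name sameParity is ours, not from the theorem), and the symmetry argument is visible in the code, since a + b and b + a are the same sum:

- fun sameParity(a, b) = (a + b) mod 2 = 0;
- sameParity(3, 7);
val it = true : bool
- sameParity(3, 4);
val it = false : bool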
Once one knows that a relation is an equivalence relation, there are many other facts one can
conclude about it.
Theorem 20.4 If R is an equivalence relation, then R = R−1.
This result might be spiced up with a sophisticated subject, but the way to attack it is to
remember that a relation is a set, and so R = R−1 is in Set Proposition Form 2 and wrapped in a
General Form 2 proposition.
Proof. Suppose R is an equivalence relation.
First suppose (a, b) ∈ R. Since R is an equivalence relation, it is symmetric, so (b, a) ∈ R
by definition of symmetry. Then by the definition of inverse, (a, b) ∈ R−1, and so
R ⊆ R−1 by definition of subset.
Next suppose (a, b) ∈ R−1 . By definition of inverse, (b, a) ∈ R. Again by symmetry,
(a, b) ∈ R, and so R−1 ⊆ R.
Therefore, by definition of set equality, R = R−1 . 2
Equivalence relations have a natural connection to partitions, introduced back in Chapter 3. Let X be a set, and let P = {X1, X2, . . . , Xn} be a partition of X. Let R be the relation on X defined so that (x, y) ∈ R if there exists Xi ∈ P such that x, y ∈ Xi. We call R the relation induced by the partition.
Similarly, let R be an equivalence relation on X. Let [x] be the image of a given x ∈ X under R. We call [x] the equivalence class of x (under R). It turns out that any relation induced by a partition is an equivalence relation, and the collection of all equivalence classes under an equivalence relation is a partition.
20.4 Computing transitivity
How can we test whether a relation, represented as a list of pairs, is transitive? Consider first testing whether the list is transitive with respect to a single pair (x, y) known to be in the relation. Walking down the list, there are three cases:
1. The list is empty. Then true, the list is (vacuously) transitive with respect to (x, y).
2. The list begins with (y, z) for some z. Then the list is transitive with respect to (x, y) if (x, z) exists in the relation, and if the rest of the list is transitive with respect to (x, y).
3. The list begins with (w, z) for some w ≠ y. Then the list is transitive with respect to (x, y) if the rest of the list is.
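These three cases translate directly into an ML predicate. The following is a sketch (the helper names testOnePair and test are illustrative); it tests every pair of the relation against the whole list, using isRelatedTo from earlier:

- fun isTransitive(relation) =
=   let fun testOnePair((a, b), []) = true
=         | testOnePair((a, b), (c, d)::rest) =
=           ((not (b = c)) orelse isRelatedTo(a, d, relation))
=           andalso testOnePair((a, b), rest);
=       fun test([]) = true
=         | test((a, b)::rest) =
=           testOnePair((a, b), relation) andalso test(rest)
=   in
=     test(relation)
=   end;

Chapter 21 turns this same skeleton into counterTransitive, which reports the offending pairs instead of a mere boolean.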
- val waterWestOf =
= [(Mediterranean, Chinnereth), (Mediterranean, Jordan), (Mediterranean, Dead),
= (Mediterranean, Jabbok), (Mediterranean, Arnon), (Chinnereth, Jabbok),
= (Chinnereth, Arnon), (Jordan, Jabbok), (Jordan, Arnon), (Dead, Jabbok),
= (Dead, Arnon)];
val waterWestOf =
[(Mediterranean,Chinnereth), ...] : (WaterBody * WaterBody) list
- isTransitive(waterWestOf);
val it = true : bool
Exercises
7. Give a counterexample proving that R is not transitive.
Let R and S be relations on A.
8. Suppose that R is reflexive and that for all a, b, c ∈ A, if (a, b) ∈ R and (b, c) ∈ R, then (c, a) ∈ R. Prove that R is an equivalence relation.
9. Prove that if R is an equivalence relation, then R ◦ R ⊆ R.
15. Write a predicate isSymmetric which tests if a relation is symmetric, similar to isTransitive.
16. Write a predicate isAntisymmetric which tests if a relation is antisymmetric.
17. Why do we not ask you to write a predicate isReflexive? What would such a predicate require that you do not know how to do?
Chapter 21
Closures
21.1 Transitive failure
- val waterVerticalAlign =
=   [(Chinnereth, Jordan), (Chinnereth, Dead),
=    (Jordan, Chinnereth), (Jordan, Dead),
=    (Dead, Chinnereth), (Dead, Jordan),
=    (Jabbok, Arnon), (Arnon, Jabbok)];
• Instead of returning true on an empty list or when a is related to d, return [], that is, an
empty list, indicating no contradicting pairs are found.
• Instead of returning false when a is not related to d, return [(a, d)], that is, a list containing
the pair that was expected but not found.
• Instead of anding the (boolean) value of testOnePair on the rest of the list with what we
found for the current pair, concatenate the result for the current pair to the (list) value of
testOnePair.
- fun counterTransitive(relation) =
= let fun testOnePair((a, b), []) = []
= | testOnePair((a, b), (c,d)::rest) =
= (if ((not (b=c)) orelse isRelatedTo(a, d, relation))
= then [] else [(a, d)])
= @ testOnePair((a,b), rest);
= fun test([]) = []
= | test((a,b)::rest) = testOnePair((a,b), relation) @ test(rest)
= in
= test(relation)
= end;
Note the type. It is no trouble to write the same things again, and it is a safeguard for you.
- counterTransitive(waterVerticalAlign);
val it =
[(Chinnereth,Chinnereth),(Chinnereth,Chinnereth),(Jordan,Jordan),
(Jordan,Jordan),(Dead,Dead),(Dead,Dead),(Jabbok,Jabbok),(Arnon,Arnon)]
: (WaterBody * WaterBody) list
This reveals the problem: we forgot to add self-loops. Adding them (but eliminating repeats) makes the relation transitive.
- val correctedWaterVerticalAlign =
= waterVerticalAlign @ makeNoRepeats(counterTransitive(waterVerticalAlign));
val correctedWaterVerticalAlign =
[(Chinnereth,Jordan),(Chinnereth,Dead),(Jordan,Chinnereth),(Jordan,Dead),
(Dead,Chinnereth),(Dead,Jordan),(Jabbok,Arnon),(Arnon,Jabbok),
(Chinnereth,Chinnereth),(Jordan,Jordan),(Dead,Dead),(Jabbok,Jabbok),...]
: (WaterBody * WaterBody) list
- isTransitive(correctedWaterVerticalAlign);
val it = true : bool
Similarly, we could use this to derive the transitive relation waterWestOf from waterImmedWestOf:
- isTransitive(waterImmedWestOf);
val it = false : bool
- val waterWestOf =
= waterImmedWestOf @ makeNoRepeats(counterTransitive(waterImmedWestOf));
val waterWestOf =
[...] : (WaterBody * WaterBody) list
- isTransitive(waterWestOf);
val it = true : bool
21.2 Transitive and other closures
The transitive closure of a relation R on a set A is a relation RT satisfying three requirements:
1. RT is transitive.
2. R ⊆ RT.
3. For any transitive relation S on A such that R ⊆ S, we have RT ⊆ S.
These requirements determine RT uniquely.
Proof. Suppose S and T are relations fulfilling the requirements for being transitive
closures of R. By items 1 and 2, S is transitive and R ⊆ S, so by item 3, T ⊆ S. By
items 1 and 2, T is transitive and R ⊆ T , so by item 3, S ⊆ T . Therefore S = T by the
definition of set equality. 2
• The transitive closure of our relation “eats” on { hawk, coyote, rabbit, fox, clover } is “gets
nutrients from.” A coyote ultimately gets nutrients from clover.
• Let R be the relation on Z defined so that (a, b) ∈ R if a + 1 = b. Thus (−15, −14), (1, 2), (23, 24) ∈ R. The transitive closure of R is <.
The reflexive closure and the symmetric closure are defined similarly, though these are of less importance. The reflexive closure of < is ≤. “Is in love with” in an ideal world is the symmetric closure of “is in love with” in the real world.
21.3 Computing the transitive closure
It is tempting to compute the transitive closure simply by adding, in a single pass, the missing pairs that counterTransitive finds:
- fun transitiveClosure(relation) =
= relation @ makeNoRepeats(counterTransitive(relation));
However, this is wrong. Suppose we test it on a relation that relates the tribes descended from
Leah according to who immediately precedes whom in birth (since we are considering tribes as
geographic entities, we have ignored Levi, who received no land).
- val immediatelyPrecede =
= [(Reuben, Simeon), (Simeon, Judah), (Judah, Issachar),
= (Issachar, Zebulun)];
- val birthOrder = transitiveClosure(immediatelyPrecede);
val birthOrder =
[(Reuben,Simeon),(Simeon,Judah),(Judah,Issachar),(Issachar,Zebulun),
(Reuben,Judah),(Simeon,Issachar),(Judah,Zebulun)] : (Tribe * Tribe) list
- isTransitive(birthOrder);
val it = false : bool
- counterTransitive(birthOrder);
val it =
[(Reuben,Issachar),(Simeon,Zebulun),(Reuben,Issachar),(Reuben,Zebulun),
(Simeon,Zebulun)] : (Tribe * Tribe) list
The transitive closure should completely express who is older than whom, yet the answer is
missing, for example, (Reuben, Issachar). By adding the pair (Reuben, Judah), we have also
created a new “missing pair.” To compute the transitive closure correctly, we must add missing
pairs repeatedly until the relation is transitive. In other words, we must add not only the pairs of
R2 = R ◦ R, but also those of R3, R4, etc. The following theorem tells us how to calculate the transitive closure.
Theorem 21.2 If R is a relation on a set A, then
R∞ = R1 ∪ R2 ∪ R3 ∪ · · · = {(x, y) | ∃ i ∈ N such that (x, y) ∈ Ri}
is the transitive closure of R.
Proof. Suppose R is a relation on a set A.
Suppose a, b, c ∈ A, (a, b) ∈ R∞ , and (b, c) ∈ R∞ . By the definition of R∞ , there exist
i, j ∈ N such that (a, b) ∈ Ri and (b, c) ∈ Rj . By the definition of relation composition,
(a, c) ∈ Ri ◦ Rj = Ri+j ⊆ R∞ . By the definition of subset, (a, c) ∈ R∞ , and so R∞ is
transitive.
Suppose a, b ∈ A and (a, b) ∈ R. By the definition of R∞ (taking i = 1), (a, b) ∈ R∞ ,
and so R ⊆ R∞ .
Suppose S is a transitive relation on A and R ⊆ S. Further suppose (a, b) ∈ R∞. Then,
by definition of R∞ , there exists i ∈ N such that (a, b) ∈ Ri . We will prove that (a, b) ∈ S
by induction on i.
Suppose i = 1. Then (a, b) ∈ R ⊆ S, so (a, b) ∈ S. Hence there exists some I ≥ 1 such
that for all (e, f ) ∈ RI , (e, f ) ∈ S.
Next suppose that i = I + 1. Then by the definition of relation composition there exist
j, k ∈ N, j + k = i and c ∈ A such that (a, c) ∈ Rj and (c, b) ∈ Rk . Since j, k < i,
j, k ≤ I, both by arithmetic. By our induction hypothesis, (a, c), (c, b) ∈ S. Since S is
transitive, (a, b) ∈ S.
Hence, by math induction, (a, b) ∈ S for all i ∈ N.
Hence R∞ ⊆ S by definition of subset.
Therefore, R∞ is the transitive closure of R. 2
The potential need for making an infinity of compositions is depressing. However, on a finite set,
the number of possible pairs is also finite, so eventually these compositions will not have anything
more to add; we can stop when the relation we are constructing is finally transitive. Interpreting
this iteratively,
- fun transitiveClosure(relation) =
= let val closure = ref relation;
= in
= (while not (isTransitive(!closure)) do
= closure := !closure @ makeNoRepeats(counterTransitive(!closure));
= !closure)
= end;
Yuri Manin said, “A good proof is one which makes us wiser” [10]. The same can be said about
good programs. In this case, we can restate our definition of the transitive closure of a relation to be
• The relation itself, if it is transitive.
• The transitive closure of the union of the relation with its immediately missing pairs, otherwise.
- fun transitiveClosure(relation) =
= if isTransitive(relation)
= then relation
= else transitiveClosure(makeNoRepeats(counterTransitive(relation))
= @ relation);
21.4 Relations as predicates
This suggests that our situation could be remedied if we followed the intuition of the other way
to represent relations, as predicates. For example, eats back in Chapter 9 is a relation.
What we do not want to lose is the ability to treat relations as what we have been calling
“mathematical entities.” What we mean is that we should be able to pass a relation to a function
and have a function return a relation, as transitiveClosure does. A value in a programming environment that can be passed to and returned from a function is a first-class value. A pillar of functional programming is that functions are first-class values.
A relation represented by a predicate will have a type like (’a * ’a) → bool. This means “function
that maps from an ’a × ’a pair to a bool.” Our first task is to write a function that will convert
from list representation to predicate representation. To return a predicate from a function, simply
define the predicate (locally) within the function and return it by naming it without giving any
parameters.
- fun listToPredicate(oldRelation) =
= let fun newRelation(a, b) = isRelatedTo(a, b, oldRelation);
= in
= newRelation
= end;
val listToPredicate = fn : (’’a * ’’b) list -> ’’a * ’’b -> bool
In a similar way, a function can receive a predicate. Observe this function to compute the reflexive
closure:
- fun reflexiveClosure(relation) =
= let fun closure(a, b) = a = b orelse relation(a, b);
= in
= closure
= end;
val reflexiveClosure = fn : (’’a * ’’a -> bool) -> ’’a * ’’a -> bool
Computing the symmetric closure is an exercise. We cannot compute the transitive closure
directly, but if the relation is originally represented as a list, we could compute the transitive closure
before converting to predicate form.
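As a quick demonstration (the value names westOfPred and westOfOrEqual are ours), we can convert the list-represented waterWestOf to a predicate and then take its reflexive closure:

- val westOfPred = listToPredicate(waterWestOf);
- westOfPred(Mediterranean, Dead);
val it = true : bool
- val westOfOrEqual = reflexiveClosure(westOfPred);
- westOfOrEqual(Jordan, Jordan);
val it = true : bool

Note that reflexiveClosure requires a relation on a single set, so it applies to waterWestOf but not to tribeBordersWater.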
As in Chapter 4, we are faced with the dilemma of choosing between two representations, each of which has favorable features and drawbacks. The list representation is the more intuitive, at least in
terms of the formal definition of a relation; with the predicate representation, however, we can test
a pair for membership directly, as opposed to relying on a predicate like isRelatedTo. Because we
can iterate through all pairs, the list representation allows testing and computing of transitivity, but
with the predicate representation we can compute reflexive and symmetric closures. Conversion is
one-way, from lists to predicates. Neither representation allows us to test reflexivity. These aspects
are summarized below. Every “no” would become a “yes” if only we had the means to iterate
through all elements of a datatype.
                    List        Predicate
first-class value   yes         yes
membership test     indirectly  directly
isReflexive         no          no
isSymmetric         yes         no
isTransitive        yes         no
isAntisymmetric     yes         no
reflexiveClosure    no          yes
symmetricClosure    yes         yes
transitiveClosure   yes         no
convert to other    yes         no
Finally, a word of interest only to those who have programmed in an object-oriented language such
as Java. When you read examples like these, you should be thinking about the best way to represent
a concept in other programming languages. In an object-oriented language, objects are first-class
values; hence we would want to write a class to represent relations. The primary difference between
the list and predicate representations presented here is that the former represents the relation as
data, the latter as functionality. The primary characteristic of objects is that they encapsulate data
and functionality together in one package. Since the predicate representation can be built from a list
representation, one would expect that it would be strictly more powerful; however, in the conversion
we lost the ability to test for transitivity, and this is because we have lost access to the list. If instead
we made the list to be an instance variable of a class, the methods could do the functional work
of reflexiveClosure as well as the iterative work of isTransitive. Moreover, Java 5 has enum
types unavailable in earlier versions of Java, which provide the functionality of ML’s datatypes, as
well as a means of iterating over all elements of a set, something that hinders representing relations
as lists or predicates in ML (ML’s datatype is more powerful than Java’s enum types in other ways,
however). The following shows an implementation of relations using Java 5.
/**
 * Class Relation to model mathematical relations.
 * The relation is assumed to be over a set modeled
 * by a Java enum.
 *
 * @author Thomas VanDrunen
 * Wheaton College
 * June 30, 2005
 */

public class Relation<E extends Enum<E>> {

    private class List {
        public E first;
        public E second;
        public List tail;
        public List(E first, E second, List tail) {
            this.first = first;
            this.second = second;
            this.tail = tail;
        }

        /**
         * Concatenate a given list to the end of this list.
         * @param other The list to add.
         * POSTCONDITION: The other list is added to
         * the end of this list; the other list is not affected.
         */
        public void concatenate(List other) {
            if (tail == null) tail = other;
            else tail.concatenate(other);
        }
    }

    /**
     * The set of ordered pairs of the relation.
     */
    private List pairs;

    /**
     * Constructor to create a new relation of n pairs from an
     * n by 2 array.
     * @param input The array of pairs; essentially an array of
     * length-2 arrays of the base enum.
     */
    public Relation(E[][] input) {
        for (int i = 0; i < input.length; i++) {
            assert input[i].length == 2;
            pairs = new List(input[i][0], input[i][1], pairs);
        }
    }

    public boolean relatedTo(E a, E b) {
        for (List current = pairs; current != null;
                current = current.tail)
            if (current.first == a && current.second == b)
                return true;
        return false;
    }

    public boolean isReflexive() {
        // if there are no pairs, we can assume this is
        // not reflexive.
        if (pairs == null) return false;
        try {
            for (E t : (E[]) pairs.first.getClass()
                    .getMethod("values").invoke(null))
                if (! relatedTo(t, t)) return false;
        } catch (Exception e) { } // won’t happen
        return true;
    }

    public boolean isSymmetric() {
        for (List current = pairs; current != null;
                current = current.tail)
            if (! relatedTo(current.second, current.first))
                return false;
        return true;
    }

    private boolean isTransitiveWRTPair(E a, E b) {
        for (List current = pairs; current != null;
                current = current.tail)
            if (b == current.first
                    && ! relatedTo(a, current.second))
                return false;
        return true;
    }

    public boolean isTransitive() {
        for (List current = pairs; current != null;
                current = current.tail)
            if (! isTransitiveWRTPair(current.first,
                                      current.second))
                return false;
        return true;
    }

    public boolean isAntisymmetric() {
        if (isSymmetric()) return false;
        for (List current = pairs; current != null;
                current = current.tail)
            if (current.first != current.second &&
                    relatedTo(current.second, current.first))
                return false;
        return true;
    }

    public Relation reflexiveClosure() {
        if (isReflexive()) return this;
        else return new Relation<E>() {
            public boolean isReflexive() { return true; }
            public boolean relatedTo(E a, E b) {
                return a == b
                    || Relation.this.relatedTo(a, b);
            }
        };
    }

    public Relation symmetricClosure() {
        if (isSymmetric()) return this;
        else return new Relation<E>() {
            public boolean isSymmetric() { return true; }
            public boolean relatedTo(E a, E b) {
                return Relation.this.relatedTo(a, b) ||
                       Relation.this.relatedTo(b, a);
            }
        };
    }

    private List counterTransitiveWRTPair(E a, E b) {
        List toReturn = null;
        for (List current = pairs; current != null;
                current = current.tail)
            if (b == current.first
                    && ! relatedTo(a, current.second))
                toReturn = new List(a, current.second, toReturn);
        return toReturn;
    }

    private List counterTransitive() {
        List toReturn = null;
        for (List current = pairs; current != null;
                current = current.tail) {
            List currentCounter =
                counterTransitiveWRTPair(current.first,
                                         current.second);
            if (currentCounter != null) {
                currentCounter.concatenate(toReturn);
                toReturn = currentCounter;
            }
        }
        return toReturn;
    }

    /**
     * Default constructor used by transitiveClosure().
     */
    private Relation() {}

    public Relation transitiveClosure() {
        if (isTransitive()) return this;
        Relation toReturn = new Relation();
        toReturn.pairs = counterTransitive();
        toReturn.pairs.concatenate(pairs);
        return toReturn.transitiveClosure();
    }

    public boolean isEquivalenceRelation() {
        return isReflexive() && isSymmetric() && isTransitive();
    }

    public boolean isPartialOrder() {
        return isReflexive() && isAntisymmetric()
            && isTransitive();
    }
}
Exercises
Chapter 22
Partial orders
22.1 Definition
No one would mistake the ⊆ relation over a powerset for an equivalence relation by looking at the graph. So far from parcelling the set out into autonomous islands, it connects everything in an intricate, flowing, and (if drawn well) beautiful network. However, it does happen to have two of the three attributes of an equivalence relation: it is reflexive and it is transitive. Symmetry makes all the difference here. ⊆ in fact is the opposite of symmetric—it is antisymmetric, the property we mentioned only briefly in Chapter 19, but which you also have seen in a few exercises. Being antisymmetric, informally, means that no two distinct elements in the set are mutually related—though any single element may be (mutually) related to itself. Note carefully, though, that the definition of antisymmetric says “if two elements are mutually related, they must be equal,” not “if two elements are equal, they must be mutually related.” Relations like this are important because they give a sense of order to the set, in this case a hierarchy from the least inclusive subset to the most inclusive, with certain sets at more or less the same level.
[Graph of ⊆ on P({1, 2, 3}): the nodes ∅; {1}, {2}, {3}; {2, 3}, {1, 3}, {1, 2}; and {1, 2, 3}.]
A partial order relation is a relation R on a set X that is reflexive, transitive, and antisymmetric. A set X on which a partial order is defined is called a partially ordered set or a poset. The idea of the ordering being only partial is because not every pair of elements in the set is organized by it. In this case, for example, {1, 2} and {1, 3} are not comparable, which we will define formally in the next section.
Theorem 22.1 Let A be any set of sets over a universal set U . Then A is a poset with the relation
⊆.
The graphs of equivalence relations and of partial orders become very cluttered. For equivalence relations, it is more useful visually to illustrate regions representing the equivalence classes, like on page 141. For partial orders, we use a pared-down version of a graph called a Hasse diagram, after the German mathematician Helmut Hasse. It strips out redundant information. To transform the
graph of a partial order relation into a Hasse diagram, first draw it so that all the arrows (except for
self-loops) are pointing up. Antisymmetry makes this possible. Then, since the arrangement on the
page informs us what direction the arrows are going, the arrowheads themselves are redundant and
can be erased. Finally, since we know that the relation is transitive and reflexive, we can remove
self-loops and short-cuts. In the end we have something more readable, with the (visual) symmetry
apparent.
[The graph of ⊆ on P({1, 2, 3}) in three stages: first with all arrows pointing up, then with the arrowheads erased, and finally as a Hasse diagram with self-loops and short-cuts removed, with {1, 2, 3} at the top.]
Other examples of partial orders:
• ≤ on R.
• waterWestOf on WaterBody.
• Alphabetical ordering over the set of lexemes in a language.
[Hasse diagram of waterWestOf, with Mediterranean at the bottom and Chinnereth, Jordan, and Dead above it.]
Generic partial orders are often denoted by the symbol ⪯, with obvious reference to ≤ and ⊆.
22.2 Comparability
We have noted that the partial order relation ⊆ does not put in order, for example, {1, 3} and {2, 3}.
We say that for a partial order relation ⪯ on a set A, elements a, b ∈ A are comparable if a ⪯ b or b ⪯ a. 12 and 24 are comparable for | (12|24), but 12 and 15 are noncomparable. Mediterranean and Arnon are comparable for waterWestOf (the Mediterranean is west of the Arnon), but Arnon and Jabbok are noncomparable. For ⊆, ∅ is comparable to everything; in fact, it is a subset of everything. Arnon is not comparable to everything, but everything it is comparable with is west of it. These last observations lead us to say that if ⪯ is a partial order relation on A, then a ∈ A is
• maximal if ∀ b ∈ A, b ⪯ a or b and a are not comparable.
• minimal if ∀ b ∈ A, a ⪯ b or b and a are not comparable.
• greatest if ∀ b ∈ A, b ⪯ a.
• least if ∀ b ∈ A, a ⪯ b.
As you can see, a poset may have many maximal or minimal elements, but at most one greatest or least. An infinite poset, like R with ≤, may have none, but we can prove that a finite poset has at least one maximal element. First, a trivial result that will be useful: for any poset, we can remove one element and it will still be a poset.
Lemma 22.1 If A is a poset with partial order relation ⪯, and a ∈ A, then A − {a} is a poset with partial order relation ⪯ − {(b, c) ∈ ⪯ | b = a or c = a}.
Proof. Suppose A is a poset with partial order relation ⪯, and suppose a ∈ A. Let A′ = A − {a} and ⪯′ = ⪯ − {(b, c) ∈ ⪯ | b = a or c = a}.
Theorem 22.2 A finite, non-empty poset has at least one maximal element.
Proof. Suppose A is a poset with partial order relation ⪯. We will prove it has at least one maximal element by induction on |A|.
Base case. Suppose |A| = 1. Let a be the one element of A. Suppose b ∈ A. Since |A| = 1, b = a, and since ⪯ is reflexive, b ⪯ a. Hence a is a maximal element.
Inductive case. Suppose there exists an N ≥ 1 such that any poset of size N has at least one maximal element.
22.3 Topological sort
[Figure: prerequisite structure for courses 231, 232, 243, 245, 331, 333, 341, 351, 363, 441, 451, 463, and 494; for instance, 231 has no prerequisite, 232, 243, and 245 each require 231, 341 requires 245, 351 requires 232 and 245, 363 requires 232, 441 requires 341, and 451 requires 351.]
A topological sort would be a sequence of these courses taken in an order that does not conflict
with the prerequisite requirement, for example,
231—232—243—333—245—331—351—341—441—363—463—494—451
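Using the list representation of relations, the process of repeatedly pulling out a minimal element can be sketched in ML (the names hasPredecessor and topologicalSort are ours, not from the text):

```sml
(* hasPredecessor(order, items, x): some other item still in items precedes x *)
fun hasPredecessor(order, items, x) =
    List.exists (fn (a, b) => b = x andalso a <> x andalso
                              List.exists (fn y => y = a) items) order;

(* Repeatedly remove an item with no remaining predecessor; in a finite poset
   such an item always exists, by an argument dual to Theorem 22.2. *)
fun topologicalSort([], order) = []
  | topologicalSort(items, order) =
    let val m = valOf (List.find (fn x => not (hasPredecessor(order, items, x))) items)
    in m :: topologicalSort(List.filter (fn y => y <> m) items, order) end;
```

For example, topologicalSort([232, 243, 231], [(231, 232), (231, 243)]) evaluates to [231, 232, 243].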
Exercises
Part VI
Function
Chapter 23
Functions
23.1 Intuition
Of all the major topics in this course, the function is probably the one with which your prior
familiarity is the greatest. In the modern treatment of mathematics, functions pervade algebra,
analysis, the calculus, and even analytic geometry and trigonometry. It is hard for today’s student
even to comprehend the mathematics of earlier ages which did not have the modern understanding
and notation of functions. Recall the distinction we made at the beginning of this course, that
though you had previously studied the contents of various number sets, you would now study sets
themselves. Similarly, up till now you have used functions to talk about other mathematical objects.
Now we shall study functions as mathematical objects themselves.
Your acquaintance with functions has given you several models or metaphors by which you conceive of functions. One of the first you encountered was that of dependence—two phenomena or quantities are related to each other in such a way that one is dependent on the other. In high
school chemistry, you may have performed an experiment where, given a certain volume of water,
the change of temperature of the water was affected by how much heat was applied to the water.
The number of joules applied is considered the independent variable, a quantity you can control.
The temperature change, in Kelvins, is a dependent variable, which changes predictably based on
the independent variable. Let f be the temperature change in Kelvins and x be the heat in joules.
With a kilogram of water, you would discover that
f (x) = 4.183x
Next, one might think of a function as a kind of machine, one that has a slot into which you can
feed raw materials and a slot where the finished product comes out, like a Salad Shooter.
[Figure: a function as a machine: x, the input or argument, goes in one slot; f(x), the output, comes out the other.]
23.2 Definition
function A function f from a set X to a set Y is a relation from X to Y such that each x ∈ X is related to
domain exactly one y ∈ Y , which we denote f (x). We call X the domain of f and Y the codomain. We
codomain write f : X → Y to mean “f is a function from X to Y .”
Let A = {1, 2, 3} and B = {5, 6, 7}. Let f = {(1, 5), (2, 5), (1, 7)}. f is not a function because 1 is related to two different items, 5 and 7, and also because 3 is not related to any item. (It is not a problem that more than one item is related to 5, or that nothing is related to 6.) When a supposed function meets the requirements of the definition, we will sometimes say that it is well defined. Make sure you remember, however, that being well defined is the same thing as being a function. Do not say "well-defined function" unless the redundancy is truly warranted for emphasis.
The term codomain might sound like what you remember calling the range. However, we give a specific and slightly different definition for that: for a function f : X → Y , the range of f is the set {y ∈ Y | ∃ x ∈ X such that f (x) = y}. That is, a function may be broadly defined to a set but
may actually contain pairs for only some of the elements of the codomain. An arrow diagram will
illustrate.
[Figure: arrow diagram of a function f from A = {a1, a2, a3, a4, a5} to B = {b1, b2, b3, b4, b5}; nothing maps to b3.]
Since f is a function, each element of A is at the tail of exactly one arrow. Each element of
B may be at the head of any number (including zero) of arrows. Although the codomain of f is
B = {b1 , b2 , b3 , b4 , b5 }, since nothing maps to b3 , the range of f is {b1 , b2 , b4 , b5 }.
Two functions are equal if they map all domain elements to the same things. That is, for f : X → Y and g : X → Y , f = g if for all x ∈ X, f (x) = g(x). The following result is not surprising, but it demonstrates the structure of a proof of function equality.
Theorem 23.1 Let f : R → R as f (x) = x² − 4 and g : R → R as g(x) = (2x − 4)(x + 2)/2. Then f = g.
Proof. Suppose x′ ∈ R. Then
    f (x′) = x′² − 4
           = x′² − 2x′ + 2x′ − 4
           = (x′ − 2)(x′ + 2)
           = 2(x′ − 2)(x′ + 2)/2
           = (2x′ − 4)(x′ + 2)/2
           = g(x′)
Therefore, by definition of function equality, f = g. 2
Notice that we chose x′ as the variable to work with instead of x. This is to avoid equivocation by the reuse of variables. We used x in the rules given to define f and g; x′ is the symbol we used to stand for an arbitrary element of R which we were plugging into the rule.
23.3 Examples
Domain: R
f (x) = 3x + 4 Codomain: R
Range: R
Domain: R
f (x) = 4 − x2 Codomain: R
Range: (−∞, 4]
Domain: R
f (x) = bxc Codomain: R or Z
Range: Z
Domain: R or Z
f (x) = 5 Codomain: R or Z
Range: {5}
Domain: R or Z
P (x) = x > −5 ∧ x < 3 Codomain: { true, false }
Range: { true, false }
Notice that in some cases, the domain and codomain are open to interpretation. Accordingly, in exercises like the previous examples you will be asked to give a "reasonable" domain and codomain. A function like f (x) = 5, whose range is a set with a single element, is called a constant function. We
can now redefine the term predicate to mean a function whose codomain is { true, false }. Though
not displayed above, notice that the identity relation on a set is a function. Finally, you may recall
seeing some functions with more than one argument. Our simple definition of function still works
in this case if we consider the domain of the function to be a Cartesian product. That is, f (4, 12) is
simply f applied to the tuple (4, 12). This, in fact, is exactly how ML treats functions apparently
with more than one argument.
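For instance (a small illustrative sketch of our own), a "two-argument" ML function is really a one-argument function on pairs:

```sml
(* The domain of f is the Cartesian product int * int. *)
fun f(x, y) = x + 2 * y;

(* f(4, 12) is f applied to the single tuple (4, 12). *)
val result = f(4, 12);
```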
23.4 Representation
It is almost awkward to introduce functions in ML, since by now they are like a well-known friend
to you. However, you still have much to learn about each other. You have been told to think of an
ML function as a parameterized expression. Nevertheless, we are primarily interested in using ML
to represent the mathematical objects we discuss, and the best way to represent a function is with,
well, a function. Let us take the parabola example and a new, more subtle curve:
g(x) = 0 if x = 1, and g(x) = (x² − 1)/(x − 1) otherwise.
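In ML, this g might be rendered as follows (a sketch; since real is not an equality type in ML, we compare with Real.==):

```sml
fun g(x) =
    if Real.==(x, 1.0)
    then 0.0                          (* defined separately at x = 1 *)
    else (x * x - 1.0) / (x - 1.0);   (* (x^2 - 1) / (x - 1) elsewhere *)
```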
The type real -> real corresponds to what we write R → R. More importantly, the fact that a
function has a type emphasizes that it is a value (as we have seen, a first class value no less). An
identifier like f is simply a variable that happens to store a value that is a function. As we saw with the function reflexiveClosure, it can be used in an expression without applying it to any arguments.
- f;
- it(5.0);
Once the value of f has been saved to it, it can be used as a function. In fact, we need not
store a function in a variable; to write an anonymous function, use the form fn x => expression. Thus we have
- it(15);
val it = 3 : int
or even
val it = 3 : int
Note that
Exercises
Chapter 24
Images
24.1 Definition
We noticed in the previous chapter that a function’s range may be a proper subset of its codomain;
that is, a function may take its domain to a smaller set than the set it is theoretically defined to.
This leads us to consider the idea of a function mapping a set (rather than just a single element).
A function will map a subset of the domain to a subset of the codomain. Suppose f : X → Y and A′ ⊆ X. The image of A′ under f is

F (A′) = {y ∈ Y | ∃ x ∈ A′ such that f (x) = y}
[Figure: arrow diagram showing a subset A′ of the domain and its image F (A′) in the codomain.]
Similarly, for B′ ⊆ Y , the inverse image of B′ under f is

F −1 (B′) = {x ∈ X | f (x) ∈ B′}

The inverse image of a subset of the codomain is the set of domain elements that hit the subset. It is vital to
remember that although the image is defined by a set in the domain, it is a set in the codomain,
and although the inverse image is defined by a set in the codomain, it is a set in the domain. It is
also important to be able to distinguish an inverse image from an inverse (of a) function, which we
will meet in Chapter 25. Let B′ = {b2 , b3 }. Then F −1 (B′) = {a4 , a5 }. Notice that F −1 ({b3 }) = ∅.
[Figure: arrow diagram showing a subset B′ of the codomain and its inverse image F −1 (B′) in the domain.]
24.2 Examples
Proofs of propositions involving images and inverse images are a straightforward matter of applying
definitions. However, students frequently have trouble with them, possibly because they are used
to thinking about functions operating on elements rather than on entire sets of elements. The
important thing to remember is that images and inverse images are sets. Therefore, you must put
the techniques you learned for set proofs to work.
Do not be distracted by the new definitions. This is a proof of set equality, Set Proof Form 2.
Moreover, F (A ∪ B) ⊆ Y . Choose your variable names in a way that shows you understand this.
Inverse images also may look intimidating, but reasoning about them is still a matter of set manipulation. Just remember that an inverse image is a subset of the domain.
These are classic examples of the analysis/synthesis process. We take apart expressions by
definition, and by definition we reconstruct other expressions.
24.3 Map
Frequently it is useful to apply an operation over an entire collection of data, and therefore to get a
collection of results. For example, suppose we wanted to square every element in a list of integers.
- square([1,2,3,4,5]);
We can generalize this pattern using the fact that functions are first class values. A program
traditionally called map takes a function and a list and applies that function to the entire list.
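In the tupled style this book uses for arguments, map could be sketched as follows (note that this shadows the curried map of the ML Basis Library):

```sml
fun map(f, []) = []
  | map(f, x::rest) = f(x) :: map(f, rest);
```

Then map(fn x => x * x, [1, 2, 3, 4, 5]) evaluates to [1, 4, 9, 16, 25].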
Keep in mind that we are considering lists in general, not lists as representing sets. However,
map very naturally adapts to our notion of image, using the list representation of sets.
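For example, under the list representation of sets, an image operation might apply map and then discard duplicate results (contains and image here are our own sketch):

```sml
fun contains(x, []) = false
  | contains(x, y::rest) = x = y orelse contains(x, rest);

(* image(f, s): the set of results of applying f to the elements of set s *)
fun image(f, []) = []
  | image(f, x::rest) =
    let val rest' = image(f, rest)
    in if contains(f(x), rest') then rest' else f(x) :: rest' end;
```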
Exercises
In Exercises 1–11, assume f : X → Y . Prove, unless you are asked to give a counterexample.
1. If A, B ⊆ X, F (A ∩ B) ⊆ F (A) ∩ F (B).
2. If A, B ⊆ X, F (A ∩ B) = F (A) ∩ F (B). This is false; give a counterexample.
3. If A, B ⊆ X, F (A) − F (B) ⊆ F (A − B).
4. If A, B ⊆ X, F (A − B) ⊆ F (A) − F (B). This is false; give a counterexample.
5. If A ⊆ B ⊆ Y , then F −1 (A) ⊆ F −1 (B).
6. If A, B ⊆ Y , then F −1 (A ∪ B) = F −1 (A) ∪ F −1 (B).
7. If A, B ⊆ Y , then F −1 (A ∩ B) = F −1 (A) ∩ F −1 (B).
8. If A ⊆ X, A ⊆ F −1 (F (A)).
9. If A ⊆ X, A = F −1 (F (A)). This is false; give a counterexample.
10. If A ⊆ Y , F (F −1 (A)) ⊆ A.
11. If A ⊆ Y , A ⊆ F (F −1 (A)). This is false; give a counterexample.
12. Sometimes it is useful to write a program that operates on a list but also takes some extra arguments. For example, a program scale might take a list of integers and another integer and return a list of the old integers multiplied by the other integer. Write a program mapPlus that takes a function, a list, and an extra argument and applies the function to every element in the list and the extra argument. For example, mapPlus(fn (x, y) => x * y, [1, 2, 3, 4], 2) would return [2, 4, 6, 8].
13. Write a program mapPlusMaker that takes a function and returns a function that will take a list and an extra argument and apply the given function to all the elements in the list. For example, mapPlusMaker(fn (x, y) => x * y) would return the scale program described above.
Chapter 25
Function properties
25.1 Definitions
Some functions have certain properties that imply that they behave in predictable ways. These
properties will come in particularly handy in the next chapter when we consider the composition of
functions.
We have seen examples of functions where some elements in the codomain are hit more than once, and functions where some are hit not at all. Two properties give names for (the opposite of) these situations. A function f : X → Y is onto if for all y ∈ Y , there exists an x ∈ X such that f (x) = y. In other words, an onto function hits every element in the codomain (possibly more than once). If B ⊆ Y and for all y ∈ B there exists an x ∈ X such that f (x) = y, then we say that f is onto B. A function is one-to-one if for all x1 , x2 ∈ X, if f (x1 ) = f (x2 ), then x1 = x2 . If any two domain elements hit the same codomain element, they must be equal (compare the structure of this definition with the definition of antisymmetry); this means that no element of the codomain is hit more than once. A function that is both one-to-one and onto is called a one-to-one correspondence; in that case, every element of the codomain is hit exactly once.
[Figure: three arrow diagrams from A to B, illustrating a function that is onto but not one-to-one, one that is one-to-one but not onto, and a one-to-one correspondence.]
Sometimes onto functions, one-to-one functions, and one-to-one correspondences are called sur-
jections, injections, and bijections, respectively.
Last time you proved that F (A ∩ B) ⊆ F (A) ∩ F (B) but gave a counterexample against F (A) ∩
F (B) ⊆ F (A ∩ B). If you look back at that counterexample, you will see that the problem is that
your f is not one-to-one. Thus
Theorem 25.1 If f : X → Y , A, B ⊆ X, and f is one-to-one, then F (A) ∩ F (B) ⊆ F (A ∩ B).
Proof. Suppose f : X → Y , A, B ⊆ X, and f is one-to-one.
Now suppose y ∈ F (A) ∩ F (B). Then y ∈ F (A) and y ∈ F (B) by definition of intersection. By the definition of image, there exist x1 ∈ A such that f (x1 ) = y and x2 ∈ B such that f (x2 ) = y. Since f is one-to-one, x1 = x2 , so x1 ∈ A ∩ B by definition of intersection. Hence y ∈ F (A ∩ B) by definition of image, and therefore F (A) ∩ F (B) ⊆ F (A ∩ B). 2
25.2 Cardinality
If a function f : X → Y is onto, then every element in Y has at least one domain element seeking
it, but two elements in X could be rivals for the same codomain element. If it is one-to-one, then
every domain element has a codomain element all to itself, but some codomain elements may be
left out. If it is a one-to-one correspondence, then everyone has a date to the dance. When we are
considering finite sets, we can use this as intuition for comparing the cardinalities of the domain and
codomain. For example, f could be onto only if |X| ≥ |Y | and one-to-one only if |X| ≤ |Y |. If f is
a one-to-one correspondence, then it must be that |X| = |Y |.
How could we prove this, though? In fact, we have never given a formal definition of cardinality.
The one careful proof we did involving cardinality was Theorem 15.1 which relies in part on this
chapter. Rather than proving that the existence of a one-to-one correspondence implies sets of equal cardinality, we simply define cardinality so. Two finite sets X and Y have the same cardinality if
there exists a one-to-one correspondence from X to Y . We write |X| = n for some n ∈ N if there
exists a one-to-one correspondence from {1, 2, . . . , n} to X, and define |∅| = 0.
(There is actually one wrinkle in this system: it lacks a formal definition of the term finite. Tech-
nically, we should define a set X to be finite if there exists an n ∈ N and a one-to-one correspondence
from {1, 2, . . . , n} to X. Then, however, we would need to use this to define cardinality. We would
prefer to keep the definition of cardinality separate from the notion of finite subsets of N to make it
easier to extend the concepts to infinite sets later. We are also assuming that a set's cardinality
is unique, or that the cardinality operator | | is well-defined as a function. This can be proven, but
it is difficult.)
Now we can use cardinality formally.
Theorem 25.2 If A and B are finite, disjoint sets, then |A ∪ B| = |A| + |B|.
Proof. Suppose A and B are finite, disjoint sets. By the definition of finite, there exist
i, j ∈ N and one-to-one correspondences f : {1, 2, . . . , i} → A and g : {1, 2, . . . , j} → B.
Note that |A| = i and |B| = j. Define a function h : {1, 2, . . . , i + j} → A ∪ B as
h(x) = f (x) if x ≤ i, and h(x) = g(x − i) otherwise.
Now suppose y ∈ A ∪ B. Then either y ∈ A or y ∈ B by definition of union, and it is
not true that y ∈ A and y ∈ B by definition of disjoint. Hence we have two cases:
Case 1: Suppose y ∈ A and y ∉ B. Then, since f is onto, there exists a k ∈ {1, 2, . . . , i} such that f (k) = y. By our definition of h, h(k) = y. Further, suppose ℓ ∈ {1, 2, . . . , i + j} and h(ℓ) = y. Suppose ℓ > i; then y = h(ℓ) = g(ℓ − i) ∈ B, a contradiction; hence ℓ ≤ i. This implies h(ℓ) = f (ℓ), and since f is one-to-one, ℓ = k.
Case 2: Suppose y ∈ B and y ∉ A. Then, since g is onto, there exists a k ∈ {1, 2, . . . , j} such that g(k) = y. By our definition of h, h(k + i) = g(k) = y. Further, suppose ℓ ∈ {1, 2, . . . , i + j} and h(ℓ) = y. Suppose ℓ ≤ i; then y = h(ℓ) = f (ℓ) ∈ A, a contradiction; hence ℓ > i. This implies h(ℓ) = g(ℓ − i), and since g is one-to-one, ℓ − i = k, or ℓ = k + i.
In either case, there exists a unique element m ∈ {1, 2, . . . , i + j} (m = k and m = k + i, respectively) such that h(m) = y. Hence h is a one-to-one correspondence. Therefore,
|A ∪ B| = i + j = |A| + |B|. 2
25.3 Inverse functions

Given a function f : X → Y , we may define from it a relation from Y to X by turning each pair around:
f −1 = {(y, x) ∈ Y × X | f (x) = y}
It is more convenient to call this the inverse function of f , but the title "function" does not come for free.
Theorem 25.3 If f : X → Y is a one-to-one correspondence, then f −1 : Y → X is well-defined.
Proof. Suppose y ∈ Y . Since f is onto, there exists x ∈ X such that f (x) = y. Hence
(y, x) ∈ f −1 or f −1 (y) = x.
Next suppose (y, x1 ), (y, x2 ) ∈ f −1 or f −1 (y) = x1 and f −1 (y) = x2 . Then f (x1 ) = y
and f (x2 ) = y. Since f is one-to-one, x1 = x2 .
Therefore, by definition of function, f −1 is well-defined. 2
Do not confuse the inverse of a function and the inverse image of a function. Remember that the
inverse image is a set, a subset of the domain, applied to a subset of the codomain; the inverse image
always exists. The inverse function exists only if the function itself is a one-to-one correspondence;
it takes an element of the codomain and produces an element of the domain.
As an application of functions and the properties discussed here, we consider an important
concept in information security. A hash function is a function that takes a string (that is, a variable-length sequence of characters) and returns a fixed-length string. Since the output is smaller than the input, a hash function cannot be one-to-one. However, a good hash function (often called a
one-way hash function) should have the following properties:
• It should be very improbable for two arbitrary input strings to produce the same output. Obviously some strings will map to the same output, but such pairs should be very difficult to
find. In this way, the function should be “as one-to-one as possible” and any collisions should
happen without any predictable pattern.
• It should be impossible to approximate an inverse for it. Since it is not one-to-one, a true
inverse is impossible, but in view here is that given an output string, it should be very difficult
to produce any input string that could be mapped to it.
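As a toy illustration of the shape of such a function, though with none of the required cryptographic strength, consider this sketch of ours:

```sml
(* Map an arbitrary string to a number in [0, 65536): a fixed-size output. *)
fun toyHash(s) =
    foldl (fn (c, h) => (h * 31 + Char.ord(c)) mod 65536) 0 (String.explode(s));
```

Since infinitely many strings are squeezed into 65536 outputs, toyHash cannot be one-to-one; unlike a real one-way hash, though, its collisions are easy to find, and it is easy to work backwards from an output.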
The idea is that a hash function can be used to produce a fingerprint of a document or file
which proves the document’s existence without revealing its contents. Suppose you wanted to prove
to someone that you have a document that they also have, but you do not want to make that
document public. Instead, you could compute the hash of that document and make that public.
All those who have the document already can verify the hash, but no one who did not have the
document could invert the hash to reproduce it. Another use is time stamping. Suppose you have
a document that contains intellectual property (say, a novel or a blueprint) for which you want to
ensure that you get credit. You could compute the hash of the document and make the hash public
(for example, printing it in the classified ads of a newspaper); some time later, when you make the
document itself public, you will have proof that you knew its contents on or before the date the hash
was published.
Exercises
Chapter 26
Function composition
26.1 Definition
We have seen that two relations can be composed to form a new relation, say, given relations R from
X to Y and S from Y to Z:

S ◦ R = {(x, z) ∈ X × Z | ∃ y ∈ Y such that (x, y) ∈ R and (y, z) ∈ S}
[Figure: arrow diagram of the composition g ◦ f of functions f : A → B and g : B → C, taking each a ∈ A through B to an element of C.]
26.2. FUNCTIONS AS COMPONENTS CHAPTER 26. FUNCTION COMPOSITION
The most intuitive way to think of composition is to use the machine model of functions. We
simply attach two machines together, feeding the output slot of the one into the input slot of the
other. To make this work, the one output slot must fit into the other input slot, and the one machine’s
output material must be appropriate input material for the other machine. Mathematically, we
describe this by considering the domains and codomains. Given f : A → B and g : C → D, g ◦ f is
defined only for B = C, though it could easily be extended for the case where B ⊆ C.
Function composition happens quite frequently without our noticing it. For example, the real-valued function f (x) = √(x − 12) can be considered the composition of the functions g(x) = x − 12 and h(x) = √x; that is, f = h ◦ g.
growthRate is a function mapping tree genera to their rates of growth. We have overlooked something, though. There is a larger categorization of these trees that will affect how this growth rate is applied: coniferous trees grow all year round, but deciduous trees are inactive during the winter. The number of months used for growing, therefore, is a function of the taxonomic division.
Because of the hierarchy of taxonomy, we can determine a tree’s division based on its genus. To
avoid making the mapping longer than necessary, we pattern-match on coniferous trees and make
deciduous the default case.
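The datatypes and mappings might look like the following sketch (the genus names and growing-month figures here are hypothetical stand-ins for the text's actual values):

```sml
datatype treeDivision = Coniferous | Deciduous;
datatype treeGenus = Pinus | Picea | Quercus | Acer;   (* hypothetical genera *)

(* Pattern-match on the coniferous genera; deciduous is the default case. *)
fun division(Pinus) = Coniferous
  | division(Picea) = Coniferous
  | division(_) = Deciduous;

(* Months per year available for growing (illustrative numbers). *)
fun growingMonths(Coniferous) = 12.0
  | growingMonths(Deciduous) = 8.0;
```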
Since division maps treeGenus to treeDivision and growingMonths maps treeDivision to real,
their composition maps treeGenus to real.
The predicted height of the tree is of course the initial height plus the growth rate times growth
time. The growth rate is calculated from the genus, and the growth time is calculated by multiplying
years times the growing months. Thus we have
We can generalize this composition process by writing a function that takes two functions and
returns a composed function. Just for the sake of being fancy, we can use it to rewrite the expression
growingMonths(division(genus)).
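Such a compose function can be written in one line (a sketch of ours, consistent with the type reported below):

```sml
(* compose(f, g) is g o f: first apply f, then g. *)
fun compose(f, g) = fn x => g(f(x));
```

With it, growingMonths(division(genus)) can be rewritten as compose(division, growingMonths)(genus).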
val compose = fn : (’a -> ’b) * (’b -> ’c) -> ’a -> ’c
26.3 Proofs
Composition finally gives us enough raw material to prove some fun results on functions. Remember
in all of these to apply the definitions carefully and also to follow the outlines for proving propositions
in now-standard forms. For example, it is easy to verify visually that the composition of two one-to-one functions is also a one-to-one function. If no two f or g arrows collide, then no g ◦ f arrows have
a chance of colliding. Proving this result relies on the definitions of composition and one-to-one.
In the long run, we want to prove that something is one-to-one, so we need to gather the materials to synthesize the definition. It means we can pick any two elements from the domain of g ◦ f , and if they map to the same element, they themselves must be the same. Here is how the proof connects with a visual verification.
[Figure: a sequence of paired arrow diagrams from A through B to C, illustrating each step of the following proof.]

Suppose f : A → B and g : B → C are one-to-one. Now suppose a1 , a2 ∈ A and c ∈ C such that g ◦ f (a1 ) = c and g ◦ f (a2 ) = c. By definition of composition, g(f (a1 )) = c and g(f (a2 )) = c. Since g is one-to-one, f (a1 ) = f (a2 ). Since f is one-to-one, a1 = a2 . Therefore, by definition of one-to-one, g ◦ f is one-to-one. 2
Our intuition about inverses from the previous chapter conceived of functions as taking us from a spot in one set to a spot in another. Only if the function were a one-to-one correspondence could we assume a "round trip" (using an inverse function) existed. Since the net effect of a round trip is getting you nowhere, you may conjecture that if you compose a function with its inverse, you will get an identity function. The converse is true, too.
Exercises
Chapter 27
Special topic: Countability
Both our informal definition of cardinality in Chapter 3 and the more careful one in Chapter 25 were
restricted to finite sets. This was in deference to an unspoken assumption that the cardinality of a set
ought to be something, that is, a whole number. As has been mentioned already, we cannot merely
say that a set like Z has cardinality infinity. Infinity is not a whole number—or even a number at
all, if one has in mind the number sets N, W, Z, Q, R, and C. The definition of cardinality taken
at face value, however, does not guarantee that the cardinality is something; it merely inspired us to define what the operator | | means by comparing a set to a subset of N. Indeed, the definition
of cardinality merely gives us a way to say that two sets have the same cardinality as each other.
What would happen if we extended this bare notion to infinite sets—that is, drop the term finite,
which we have not defined anyway?
Two sets X and Y have the same cardinality if there exists a one-to-one correspondence from X to Y . We know from Exercise 13 of Chapter 26 that this relation partitions sets into equivalence classes. (Your proof did not depend on the sets being finite, did it?) We say that a set X is finite if it is the empty set or if there exists an n ∈ W such that X has the same cardinality as {1, 2, . . . , n}. Otherwise, we say the set is infinite.
It is worth remembering that the term infinite is defined negatively; this also makes sense ety-
mologically, since the word is simply finite with a negative prefix. Did you ever wonder what was
so infinite about the grammatical term infinitive? The term makes more sense in Latin, where the
pronoun subject of a verb is given by a suffix: ambulo, ambulas, ambulat is the conjugation “I walk,”
“you walk,” “he/she/it walks,” o, s, and t being the singulars for first, second, and third person,
respectively. An infinitive (in this case, ambulare) has no pronominal suffix. This does not mean
that it goes on forever, just that it, literally, has no proper ending. Since the caboose has fallen
out of use in American railways, can we now say that freight trains are therefore infinite? Perhaps
not, since even though there is no longer a formal ending car, they are at least terminated by a box
called a FRED (which stands for Flashing Rear-End Device; no kidding).
On a more serious note, this raises the question, are all infinities equal? More than merely raising
the question, it gives a rigorous way to phrase it: Are all infinite sets in the same equivalence class?
Our intuition could go either way. On one hand, one might assume that infinity is infinity without
qualification. Thus
CHAPTER 27. SPECIAL TOPIC: COUNTABILITY
This calls for a proof of existence. The definition of cardinality requires that a one-to-one
correspondence exists between the two sets. We must either prove that it is impossible for such a
function not to exist or propose a candidate function and demonstrate that it meets the requirements.
We use the latter strategy.
So it is possible for a proper subset to have as many (infinite) elements as the set that contains
it. Moreover, |N| = |Z|, so the rest of the equality chain seems plausible. But is it true? Before
asking again a slightly different question, we say that a set X is countably infinite if X has the same cardinality as N. A set X is countable if it is finite or countably infinite. The idea is that we could count every element in the set by assigning a number 1, 2, 3, . . . to each one of them, even if it took us forever. A set X is uncountable if it is not countable. This gives us a new question: Are all sets countable?
Let us try Q next. The jump from N to Z was not so shocking since Z has only about "twice" as many elements as N. Q, however, has an infinite number of elements between 0 and 1 alone, and an infinite number again between 0 and 1/10. Nevertheless,
A formal proof is delicate, but this ML program demonstrates the main idea:
- fun cantorDiag(n) =
= let
= fun gcd(a, 0) = a
= | gcd(a, b) = gcd(b, a mod b);
= fun reduce(a, b) =
= let val comDenom = gcd (a, b);
= in (a div comDenom, b div comDenom)
= end;
= fun nextRatio(a, b) =
= if a = 1 andalso b mod 2 = 1 then (1, b + 1)
= else if b = 1 andalso a mod 2 = 0 then (a + 1, 1)
= else if (a + b) mod 2 = 1 then (a + 1, b - 1)
= else (a - 1, b + 1);
= fun contains((a, b), []) = false
= | contains((a, b), (c, d)::rest) =
= if a = c andalso b = d then true else contains((a, b), rest);
= val i = ref 1;
= val currentRatio = ref (1, 1);
= val usedList = ref [];
= in
= (while !i < n do
= (usedList := !currentRatio :: !usedList;
= while contains(reduce(!currentRatio), !usedList) do
= currentRatio := nextRatio(!currentRatio);
= i := !i + 1);
= !currentRatio)
= end;
- map(cantorDiag, [1,2,3,4,5,6,7,8]);
What this function computes is a diagonal walk over the set of positive rationals, invented by Cantor, illustrated here. We lay out all ratios of positive integers as an infinite two-dimensional grid and weave our way around them, assigning a natural number to each in the order that we come to them, but skipping over ones that are equivalent to something we have seen before.
[Figure: the grid of ratios a/b, with rows a = 1, 2, 3, 4, . . . and columns b = 1, 2, 3, 4, 5, . . . , traversed diagonally.]
This function hits every positive rational exactly once, so it is a one-to-one correspondence. Thus
at least Q+ is countably infinite, and by a process similar to the proof of Theorem 27.1, we can
bring Q into the fold by showing its cardinal equivalence to Q+ . Thus
For this we will use a geometric argument. Imagine taking the line segment (0, 1) and rolling it
up into a ball with one point missing where 0 and 1 would meet. Then place the ball on the real
number line, so .5 on the ball is tangent with 0 on the line. Now define a function f : (0, 1) → R
so that to find f (x) we draw a line from the 0/1 point on the ball through x on the ball; the value
f (x) is the point where the drawn line hits the real number line. Proving that this function is a one-to-one correspondence is a matter of using analytic geometry to find a formula for f and then showing that every element of R is hit from exactly one value on the ball.
[Figure: the segment (0, 1) rolled into a ball, with the 0/1 point at the top and .25, .5, and .75 marked, resting tangent on the real number line.]
This appears to cut our task down immensely. To prove all infinities (that we know of) equal,
all we need to show is that any of the sets already proven countable can be mapped one-to-one and
onto the simple line segment (0, 1). However,
Since countability calls for an existence proof, uncountability requires a non-existence proof, for which we will need a proof by contradiction.
Since d ∈ (0, 1) and f is onto, there exists an n′ ∈ N such that f (n′) = d. Moreover, f (n′) = 0.a_{n′,1} a_{n′,2} a_{n′,3} . . . a_{n′,n′} . . . , so d = 0.a_{n′,1} a_{n′,2} a_{n′,3} . . . a_{n′,n′} . . . by substitution. However, by how we have defined d, d_{n′} ≠ a_{n′,n′}, and so d ≠ 0.a_{n′,1} a_{n′,2} a_{n′,3} . . . a_{n′,n′} . . . , a contradiction.
Therefore (0, 1) is not countable. 2
Anticlimactic? Perhaps. But also profound. There are just as many naturals as integers as
rationals. But there are many, many more reals.
This chapter draws heavily from Epp [5].
Part VII
Program
Chapter 28
Recursion Revisited
First, a word of introduction for this entire part on functional programming. We have already seen
many of the building blocks of programming in the functional paradigm, and in the next few chapters
we will put them together in various applications and also learn certain advanced techniques. The
chapters do not particularly depend on each other and could be resequenced. The order presented
here has the following rationale: This chapter will ease you into the sequence with a fair amount of
review, and will also look at recursion from its mathematical foundations. Chapter 29 will present
the datatype construct as a much more powerful tool than we have seen before, particularly in how
the idea of recursion can be extended to types. Chapter 30 is the climax of the part, illustrating
the use of functional programming in the fixed-point iteration algorithm strategy; it is also the only
chapter of the book that requires a basic familiarity with differential calculus, so if you have not
taken the first semester of calculus, ask a friend who has to give you the five-minute explanation of
what a derivative is. Chapter 31 is intended as a breather following Chapter 30, applying our skills
to computing combinations and permutations.
28.1 Scope
In Chapter 23, we implied that
However, this is not true; there is a subtle but important difference, and it boils down to scope.
Recall that a variable’s scope is the duration of its validity. A variable (including one that holds a
function value) is valid from the point of its declaration on. It is not valid, however, in its declaration
itself; in the val/fn form, the <identifier> being declared cannot appear in <expression>. The name of a function
defined using the fun form, however, has scope including its own definition. Recall that recursion is
self-reference; a function defined using fun can call itself—or return itself, for that matter.
Functional programming is a style where no variables are modified. We will demonstrate this
distinction and how recursive calls make this possible by transforming our iterative factorial function
from Chapter 14 into the functional style. We modify this slightly by counting from 1 to n instead
of from 0 to n − 1, and accordingly we update fact before i.
- fun factorial(n) =
= let val i = ref 1;
= val fact = ref 1;
= in
= (while !i <= n do
= (fact := !fact * !i;
= i := !i + 1);
= !fact)
= end;
Our first change is to encapsulate the body of the while loop into a function, which we will call
factBody. The body of the while loop does two things: It updates both fact and i. Our function,
then, must consume the old fact and i and produce new values for them. We can handle the need
to return two values by returning a tuple. Thus we also consolidate fact and i into one value, the
tuple current. This essentially represents the current state of the computation.
- fun factorial(n) =
= let fun factBody(fact, i) = (i * fact, i + 1);
= val current = ref (1, 1);
= in
= (while #2(!current) <= n do
= current := factBody(!current);
= #1(!current))
= end;
Next, we can be more ambitious about how much of the work we subsume into the function
factBody. At this point we also take advantage of the recursive use of function names. Notice that the
while loop is doing one thing: calling factBody repeatedly. Since factBody can call itself, it may as
well eat up the rest of the while loop. Notice that the while is effectively replaced with an if. Both
the old while and the new if are making the same decision—either stop (and do not change the state
of current or (fact, i)) or make the change and repeat.
- fun factorial(n) =
= let fun factBody(fact, i) =
= if i <= n then factBody(i * fact, i + 1)
= else (fact, i);
= val current = ref (1, 1);
= in
= (current := factBody(!current);
= #1(!current))
= end;
Now we notice that the second item in current is no longer used; current can be a single int.
We have come full circle, in a way—current is now equivalent to the old variable (now parameter)
fact. Accordingly, factBody should only return one thing, the new current fact value. The main
call to factBody needs to be given an initial value for the second item (the old variable, now
parameter i).
- fun factorial(n) =
= let fun factBody(fact, i) =
= if i <= n then factBody(i * fact, i + 1)
= else fact;
= val current = ref 1;
= in
= (current := factBody(!current, 1);
= !current)
= end;
Next, consider the statement list inside the let expression. It is rather silly to store the result
of the main call to factBody and immediately retrieve it. Why not replace the statement list with
just the call?
- fun factorial(n) =
= let fun factBody(fact, i) =
= if i <= n then factBody(i * fact, i + 1)
= else fact;
= val current = ref 1;
= in
= factBody(!current, 1)
= end;
Now that current is never updated, there is no need for it to be a reference variable—or a
variable at all, for that matter. We replace its one remaining use with its initial value, 1.
- fun factorial(n) =
= let fun factBody(fact, i) =
= if i <= n then factBody(i * fact, i + 1)
= else fact;
= in
= factBody(1, 1)
= end;
Next we remove factBody's reliance on n, the parameter of the enclosing function, by reversing
the direction of the counting. Our current version counts i up from 1 to n, so factBody must consult
n to know when to stop. If instead we start i at n and count down to 0, the stopping condition no
longer mentions n, and by the commutativity of multiplication the product, (. . . ((1 × n) × (n − 1)) ×
. . . × 1), is the same.
- fun factorial(n) =
= let fun factBody(fact, i) =
= if i = 0 then fact
= else factBody(i * fact, i - 1);
= in
= factBody(1, n)
= end;
We can also make use of associativity. Instead of performing multiplication first and then passing
the result to the call (so fact gets bigger on the way down, and the result is unchanged as it
comes back up from the series of recursive calls) we can do the multiplication after the call, so
that fact stays the same on the way down, but the result gets bigger as it comes up, essentially
(n × ((n − 1) × ((n − 2) × (. . . (1) . . .)))).
- fun factorial(n) =
= let fun factBody(fact, i) =
= if i = 0 then fact
= else i * factBody(fact, i - 1);
= in
= factBody(1, n)
= end;
An amazing thing has happened: The variable fact no longer varies with each call. This means
we can eliminate it from the parameter list and replace its use with its only value, 1.
- fun factorial(n) =
= let fun factBody(i) =
= if i = 0 then 1
= else i * factBody(i - 1);
= in
= factBody(n)
= end;
Now all that factorial does is make a local function and apply it to its input without modifi-
cation. We may as well replace its body with the body of factBody—but be careful to substitute
factorial for factBody and n for i.
- fun factorial(n) =
= if n = 0 then 1
= else n * factorial(n - 1);
- fun factorial(0) = 1
= | factorial(n) = n * factorial(n - 1);
28.2 Recurrence Relations

The second of these is the same as the number of pairs alive after n − 1 months, that is, f (n − 1).
What about the first point? How many new pairs will be born? Since the fertility rate is 100%,
every pair will give birth to a new pair—except for juvenile pairs, which are not fertile yet. Since
we assume all the pairs older than one month are fertile, this means that every pair alive after n − 2
months is ready, and so f (n − 2) new pairs are born.
f (n) = f (n − 1) + f (n − 2)
Yet this does not fully define the function, since f (n − 1) and f (n − 2) are unknown, unless we define
it for a set of base cases.
If you have not guessed, Leonardo of Pisa was better known by his nickname, Fibonacci.
- fun fibonacci(0) = 1
= | fibonacci(1) = 1
= | fibonacci(n) = fibonacci(n-1) + fibonacci(n-2);
Recall that a mathematical sequence is a function with domain W or N. For a sequence a, we often
write ak for a(k) and refer to this as the kth term in the sequence. A recurrence relation for a sequence
a is a formula which, for some N, relates each term ak (for all k ≥ N) to a finite number of predecessor
terms ak−N, ak−N+1, . . . , ak−1; moreover, terms a0, a1, . . . , aN−1 are explicitly defined, which
we call the initial conditions of the recurrence relation. (The term “relation” is misleading here,
since it is not directly related to a relation as a set of ordered pairs.) This demonstrates for us the
basic structure of recursion: A formula has base cases by which it is grounded explicitly, and in other
cases it is defined in terms of other values of the same formula.
For another example, consider the well-known Tower of Hanoi puzzle. Imagine you have three
pegs and k disks. Each disk has a different size and a hole in the middle big enough for the pegs to
fit through. Initially all disks are on the first peg from the largest disk on the bottom to the smallest
disk on the top. Your task is to move all the disks to the third peg under the following rules: only
one disk may be moved at a time, a move takes the top disk of one peg and places it on top of
another peg, and no disk may ever be placed on top of a smaller disk.
The strategy we use is itself recursive. Assume we call the pegs 0, 1, and 2, and generalize the
problem to moving k disks from peg i to peg j. Then
1. Move the top k − 1 disks from peg i to the peg other than i or j.
2. Move the kth disk from peg i to peg j.
3. Move the top k − 1 disks from the other peg to peg j.
How many moves will this take? If mk is the number of moves it takes to solve the puzzle for k
disks, then steps 1 and 3 each take mk−1 moves and step 2 takes one move, so

m1 = 1
mk = 2mk−1 + 1 for k > 1
In ML,
- fun hanoi(1) = 1
= | hanoi(n) = 2 * hanoi(n-1) + 1
Exercises
Chapter 29
Recursive Types
- add((1,3), (1,2));
However, this introduces a software engineering danger: What if there are other uses for tuples
of two integers in the system, say, points, pairs in a relation, or simply the result of a function
that needed to return two values? It would be very easy for a programmer to become careless
and introduce bugs that improperly treat an int tuple as a fraction or a fraction as some other
int tuple. In this course we have mostly considered an idealized mathematical world apart from
practical software engineering concerns. However, one purpose of types is to make mistakes like this
harder to make and easier to detect.
What is missing in the situation above is proper abstraction, in this case a way to encapsulate
a fraction value and designate it specifically as a fraction so, for example, the point (5, 3) is not
mistaken for the fraction 5/3. A better solution is to extend ML’s type system to include a rational number type.
29.1 Datatype Constructors

We know one way to create new types, and that is using the datatype construct. So far we have
seen it used to represent only finite sets, not (nearly) infinite sets like the rational type we have in
mind. We can expand on the old way of writing datatypes by tacking a little extra data onto the
options we give for a datatype.
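A declaration along these lines gives us the type (a sketch; the type name rational and the constructor Fraction match their uses below, and the add function shown is one plausible definition):

- datatype rational = Fraction of int * int;
- fun add(Fraction(a, b), Fraction(c, d)) =
=   (* a/b + c/d = (ad + cb)/(bd) *)
=   Fraction(a * d + c * b, b * d);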
- Fraction(3,5);
- add(Fraction(1,2), Fraction(1,3));
The construct Fraction of int * int is called a constructor expression. The idea is that it
explains one way to construct a value of the type rational. Another way to think of it is that
Fraction is a husk that contains an int tuple as its core. A datatype declaration, then, is a sequence
of constructor expressions each following the form
<identifier> of <type expression>
with the “of . . . ” part optional.
Suppose we were writing an application to manage payroll and other human resources operations
for a company. The company has both hourly and salaried employees. We wish to represent both
kinds by a single type (so that, for example, values of both can be stored in the same list or array),
but different data is associated with them: for hourly employees, their hourly wage and the number
of hours clocked since the last pay period; for salaried employees, their yearly salary. We use this
datatype:
- datatype employee = Hourly of real * int ref | Salaried of real;
The reason the second value of Hourly is a reference type is so that values can be updated as
the employee clocks more hours. Thus
- fun clockHours(Hourly(rate, hours), newHours) = hours := !hours + newHours
= | clockHours(Salaried(salary), newHours) = ();
Notice how pattern matching naturally extends to more complicated datatypes. This function
does nothing when clocking hours for salaried employees. (If this were a realistic example, it might
store those hours for the sake of assessing the employee’s work; we also likely would store information
such as the employee’s name and office location.) Similarly, we use pattern matching to determine
how to compute an employee’s wage (assume a two-week pay period):
- fun computePay(Hourly(rate, hours)) =
= let val hoursThisPeriod = !hours;
= in
= (hours := 0;
= rate * real(hoursThisPeriod))
= end
= | computePay(Salaried(salary)) = salary / 26.0;
Notice how the option for hourly employees both resets the hours to 0 and returns the computed
wage.
29.2 Peano Numbers
If we interpret successor to mean “one more than,” these axioms allow us to define whole numbers
recursively (called Peano numbers); a whole number is
• zero, or
• one more than another whole number.
Now we see just how flexible the datatype construct is: The scope of the name of the type being
defined includes the definition itself. This means types can be defined recursively.
The second definition is merely shorthand for “a piece of the calcareous or horny skeletal deposit
produced by anthozoan polyps”—that is, the use of the word coral internal to the second definition
refers only to the first definition, not back to the second definition itself. No reasonable person
would interpret this as a rule that would produce, as a replacement for the occurrence of coral in a
text, “a piece of a piece of a piece of a piece of a piece of the calcareous or horny skeletal deposit
produced by anthozoan polyps.”
That is, however, how we interpret our definition of whole numbers. The recursive part establishes
a pattern for generating every possible whole number. For example,
In ML, using a datatype whose constructors are Zero and OnePlus,
- datatype wholeNumber = Zero | OnePlus of wholeNumber;
val six = OnePlus (OnePlus (OnePlus (OnePlus (OnePlus (OnePlus Zero))))) : wholeNumber
Finding the successor of a number is just a matter of tacking “OnePlus” to the front, a process
easily automated.
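For instance (the name successor is our choice):

- fun successor(n) = OnePlus(n);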
Conversion from an int to a wholeNumber is a recursive process—the base case, 0, can be returned
immediately; for any other case, we add one to the wholeNumber representation of the int that comes
before the one we are converting. (Negative ints will get us into trouble.)
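A sketch of such a conversion (the name fromInt is our choice):

- fun fromInt(0) = Zero
=   | fromInt(n) = OnePlus(fromInt(n - 1));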
Notice how subtracting one from the int and adding one to the resulting wholeNumber balance
each other off. Opposite the successor, we define the predecessor of a natural number n, pred n, to
be the number of which n is the successor. From Axiom 11 we can prove that the predecessor of a
number, if it exists, is unique; Axiom 10 says that 0 has no predecessor. Pattern matching makes
stripping off a “OnePlus” easy:
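One way to write it (ML will warn that the match is nonexhaustive, since no clause handles Zero):

- fun pred(OnePlus(num)) = num;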
You may remember this warning from the first time we saw pattern-matching in Chapter 9—or
from more recent mistakes you have made. In this case it is not a mistake; we truly want to leave
the operation undefined for Zero. Using the function on Zero, rather than failing to define it for
Zero, would be the mistake.
- pred(three);
- pred(Zero);
Now we can start defining arithmetic recursively. Zero will always be our base case; anything
we add to Zero is just itself. For other numbers, picture an abacus. We have two wires, each with
a certain number of beads pushed up. At the end of the computation, we want one of the wires to
contain our answer. Thus we push down one bead from the other wire, bring up one bead on the
answer wire, and repeat until the other wire has no beads left. In other words, we define addition
similarly to our recursive gcd lemmas from Chapter 17.
0 + b = b
a + 0 = a
a + b = (a + 1) + (b − 1) if b ≠ 0
In ML,
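one possible rendering (the name plus is our choice):

- fun plus(a, Zero) = a
=   | plus(a, OnePlus(b)) =
=     (* a + b = (a + 1) + (b - 1) *)
=     plus(OnePlus(a), b);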
Examine for yourself the similarity of structure for isLessThanOrEqualTo. Keep in mind that
recursively-defined predicates have two base cases, one true and one false. Here the first and second
parameters are in a survival contest; they repeatedly shed a OnePlus, and the first one reduced to
Zero loses.
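The predicate in question might read:

- fun isLessThanOrEqualTo(Zero, b) = true
=   | isLessThanOrEqualTo(OnePlus(a), Zero) = false
=   | isLessThanOrEqualTo(OnePlus(a), OnePlus(b)) =
=     isLessThanOrEqualTo(a, b);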
a − 0 = a
a − b = (a − 1) − (b − 1) if a ≠ 0 and b ≠ 0
In ML,
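a sketch (ML will again warn that the match is nonexhaustive):

- fun minus(a, Zero) = a
=   | minus(OnePlus(a), OnePlus(b)) = minus(a, b);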
This rightly leaves the pattern minus(Zero, OnePlus(num)) undefined. Finally, conversion back
to int is just a literal interpretation of the identifiers we gave to the constructor expressions.
- fun asInt(Zero) = 0
= | asInt(OnePlus(num)) = 1 + asInt(num);
29.3 Parameterized Datatype Constructors

The data structure can grow and shrink as you add to and remove from it. You retrieve elements
in the reverse order of which you added them. A favorite real-world example is a PEZ dispenser,
which will help explain the intuition of the operation name depth.
One thing we have left unspecified is what sort of thing (that is, what type) these pieces of
data are. This is intentional; we would like to be able to use stacks of any type, just as we have
for lists and arrays. This, too, can be implemented using a datatype, because datatypes can be
parameterized by type. For example
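a one-constructor datatype might look like this (the type name container is our choice; the constructor Container matches its uses below):

- datatype 'a container = Container of 'a;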
- Container(5);
- Container(true);
Here 'a is a type parameter. Indeed, variables may be used to store types, in which case the
identifier should begin with an apostrophe. (This should explain some unusual typing judgments
ML has been giving you.) The form for declaring a parameterized datatype is
where the identifier is the name of the parameterized type. Notice that this second form is itself a
type expression, any construct that expresses a type.
With this in hand, we can define a stack recursively as being either
• empty, or
• a single item on top of another stack
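In ML, this recursive definition might become (constructor names matching the depth function below):

- datatype 'a stack = Empty | NonEmpty of 'a * 'a stack;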
Implementing the operations for the stack comes easily by applying the principles from the Peano
numbers example to the definitions of the operations.
- fun depth(Empty) = 0
= | depth(NonEmpty(x, rest)) = 1 + depth(rest);
Exercises
Chapter 30
Fixed-point iteration
30.1 Currying
Before we begin, we introduce a common functional programming technique for reducing the number
of arguments, or arity, of a function by partially evaluating it. Take for a simple example a function
that takes two arguments and multiplies them.
- fun mul(x, y) = x * y;
We could specialize this by making a function that specifically doubles any given input by calling
mul with one argument hardwired to be 2.
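For instance (the name double is our choice):

- fun double(y) = mul(2, y);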
This can be generalized and automated by a function that takes any first argument and returns a
function that requires only the second argument.
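A sketch of such a function:

- fun makeMultiplier(x) = fn y => x * y;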
- makeMultiplier(2);
- it(3);
val it = 6 : int
This process is called currying, after mathematician Haskell Curry, who studied this process
(though he did not invent it). We can generalize this with a function that transforms any two-
argument function into curried form.
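One way to write it:

- fun curry(f) = fn x => fn y => f(x, y);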
- curry(mul)(2)(3);
val it = 6 : int
30.2 Problem
In this chapter we apply our functional programming skills to a non-trivial problem: calculating
square roots. One approach is to adapt Newton’s method for finding roots in general, which you
may recall from calculus. Here is how Newton’s method works.
Suppose a curve of a function f crosses the x-axis at x0. Functionally, this means f (x0) = 0, and
we say that x0 is a root of f. Finding x0 may be difficult or even impossible; obviously x0 may not
be rational, and if f is not a polynomial, then x0 may not even be an algebraic number (do you
remember the difference between A and T?). Instead, we use Newton’s method to approximate the
root.
[Figure: Newton’s method. A tangent line g(x) touches the curve f (x) at A = (xi, f (xi)) and strikes
the x-axis at B = xi+1; the vertical line through B meets the curve at C = (xi+1, f (xi+1)), which lies
closer to the root at D.]
The approximation is done by making an initial guess and then improving the guess until it is
“close enough” (in technical terms, it is within a desired tolerance of the correct answer). Suppose
xi is a guess in this process. To improve the guess, we draw a tangent to the curve at the point A,
(xi, f (xi)), and then calculate the point at which the tangent strikes the x-axis. The slope of the
tangent can be calculated by evaluating the derivative at that point, f ′(xi). Recall from first-year
algebra that if you have a slope m and a point (x′, y′), a line through that point with that slope
satisfies the equation

y − y′ = m · (x − x′)
y = m · (x − x′) + y′

In our case, then, the tangent line is g(x) = f ′(xi) · (x − xi) + f (xi).
Let xi+1 be the x value where g strikes the x-axis. Thus we want g(xi+1 ) = 0, and solving this
equation for xi+1 lets us find point B, the intersection of the tangent and the axis.
xi+1 = (xi · f ′(xi) − f (xi)) / f ′(xi)
     = xi − f (xi) / f ′(xi)                    (30.1)
Drawing a vertical line through B leads us to C, the next point on the curve where we will draw
a tangent. Observe how this process brings us closer to x0 , the actual root, at point D. xi+1 is
thus our next guess. Equation 30.1 tells us how to generate each successive approximation from the
previous one. When the absolute value of f (xi ) is small enough (the function value of our guess is
within the tolerance of zero), then we declare xi to be our answer.
We can use this method to find √c by noting that the square root is simply the positive root of
the function f (x) = x² − c. In this case we find the derivative f ′(x) = 2x and plug this into
Equation 30.1 to produce a function for improving a given guess x:

I(x) = x − (x² − c) / (2x)

[Figure: the parabola f (x) = x² − c, with intercepts (−√c, 0), (√c, 0), and (0, −c).]

In ML,
- fun improve(x) =
= x - (x * x - c) / (2.0 * x);
Obviously this will work only if c has been given a valid definition already. Now that we have
our guess-improver in place, our concern is the repetition necessary to achieve a result. Stated
algorithmically, while our current approximation is not within the tolerance, simply improve the
approximation. By now we should have left iterative solutions behind, so we omit the ML code for
this approach. Instead, we set up a recursive solution based on the current guess, which is the data
that is being tested and updated. There are two cases, depending on whether the current guess is
within the tolerance or not.
• Base case: If the current guess is within the tolerance, return it as the answer.
• Recursive case: Otherwise, improve the guess and reapply this test, returning the result
as the answer.
In ML,
- fun sqrtBody(x) =
= if inTolerance(x)
= then x
= else sqrtBody(improve(x))
We called this function sqrtBody instead of sqrt because it is a function of the previous guess,
x, not a function of the radicand, c. Two things remain: a predicate to determine if a guess is within
the tolerance (say, .001; then we are in the tolerance when |x2 − c| < .001), and an initial guess (say,
1). If we package this together, we have
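One way the packaging might look (a sketch; the helper names match those used above):

- fun sqrt(c) =
=   let fun improve(x) = x - (x * x - c) / (2.0 * x);
=       fun inTolerance(x) = abs(x * x - c) < 0.001;
=       fun sqrtBody(x) =
=         if inTolerance(x)
=         then x
=         else sqrtBody(improve(x));
=   in
=     sqrtBody(1.0)
=   end;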
- sqrt(2.0);
- sqrt(16.0);
- sqrt(121.0);
- sqrt(121.75);
30.3 Analysis
Whenever you solve a problem in mathematics or computer science, the next question to ask is
whether the solution can be generalized so that it applies to a wider range of problems and thus can
be reused more readily. To generalize an idea means to reduce the number of assumptions and to
acknowledge more unknowns. In other words, we are replacing constants with variables.
Our square root algorithm was a specialization of Newton’s method. The natural next question
is how to program Newton’s method in general. What assumptions or restrictions did we make on
Newton’s method when we specialized it? Principally, we assumed that the function for which we
were finding a root was in the form x2 − c where c is a variable to the system. Let us examine how
this assumption affects the segment of the solution that tests for tolerance.
- fun isInTolerance(x) =
= abs((x * x) - c) < 0.001;
The assumed function shows itself in the expression (x * x) - c. By taking the absolute value
of that function for a supplied x and comparing with .001, we are checking if the function is within an
epsilon of zero. We know that functions can be passed as parameters to functions; here, as happens
frequently, generalization manifests itself as parameterization.
- fun isInTolerance(function, x) =
= abs(function(x)) < 0.001;
Take stock of the type. The function isInTolerance takes a function (in turn mapping from a
type ’a to real) and a value of type ’a. The given information does not allow ML to infer what type
function would accept; hence the type variable ’a. How function’s return type is inferred to be
real is more subtle: abs is a special kind of function that is defined so that it can accept either reals
or ints, but it must return the same type that it receives. Since we compare its result against 0.001,
its result must be real; thus its parameter must also be real, and finally we conclude that function
must return a real.
isInTolerance is now less easy to use because we must pass in the function whenever we want
to use it, unless function is in scope already and we can eliminate it as a parameter. However,
we know that functions can also return functions. To make this more general, instead of writing a
function to test the tolerance, we write a function that produces a tolerance tester, based on a given
function.
- fun toleranceTester(function) =
= fn x => abs(function(x)) < 0.001;
Notice that the -> operator is right associative, which means it groups items on the right side
unless parentheses force it to do otherwise. toleranceTester accepts something of type 'a -> real
and returns something of type 'a -> bool. Now we need call toleranceTester only once and call the
function it returns whenever we want to test for tolerance. To further generalize, let us no longer
assume the tolerance is .001, but instead parameterize it.
What happened to the type? Since 0.001 no longer appears, there is nothing to indicate that
we are dealing with reals. Yet ML cannot simply introduce a new type variable (for, say, ('a -> 'b)
* 'b -> 'a -> bool) because abs is not defined for all types, just int and real. Instead, ML has to
guess, and when it comes to choosing between real and int, it goes with int. We will force it to choose real.
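With the tolerance as a parameter (we call it epsilon), annotated to force real:

- fun toleranceTester(function, epsilon : real) =
=   fn x => abs(function(x)) < epsilon;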
Next, consider the function improve. We can generalize this by stepping back and considering
how we formulated it in the first place. It comes from applying Equation 30.1 to a specific function
f . Thus we can generalize it by making the function of the curve a parameter. Since we do not have
a means of differentiating f , we will need f 0 to be supplied as well.
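A generalized guess-improver in the spirit of Equation 30.1 might then be (the name nextGuess follows the discussion below):

- fun nextGuess(function, deriv, x) =
=   x - function(x) / deriv(x);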
However, just as with tolerance testing, we would prefer to think of our next-guesser as a function
only of the previous guess, not of the curve function and derivative. We can modify nextGuess easily
so that it produces a function like improve:
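for example:

- fun nextGuesser(function, deriv) =
=   fn x => x - function(x) / deriv(x);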
Notice that this process amounts to the partial application of a function, an example of currying.
nextGuess has three parameters; nextGuesser allows us to supply values for some of the parameters,
and the result is another function. sqrtBody also demonstrates a widely applicable technique. If we
generalize our function I(x) based on Equation 30.1 we have
G(x) = x − f (x) / f ′(x)
If x is an actual root, then f (x) = 0, and so G(x) = x. In other words, a root of f (x) is a solution
to the equation
x = G(x)
Problems in this form are called fixed point problems because they seek a value which does not
change when G(x) is applied to it (and so it is fixed). If the fixed point is a local minimum or
maximum and one starts with a good initial guess, one approach to solving (or approximating) a
fixed point problem is fixed point iteration, the repeated application of the function G(x), that is,
G(G(G(· · · G(x) · · ·))).
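In ML, a generic fixed point iterator might look like this (a sketch; we assume a stopping predicate is supplied alongside the function to iterate):

- fun fixedPoint(g, isGoodEnough, guess) =
=   if isGoodEnough(guess)
=   then guess
=   else fixedPoint(g, isGoodEnough, g(guess));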
30.4 Synthesis
We have decomposed our implementation of the square root function to uncover the elements in
Newton’s method (and more generally, a fixed point iteration). Now we assemble these to make
useful, applied functions. In the analysis, parameters proliferated; as we synthesize the components
into something more useful, we will reduce the parameters, or “fill in the blanks.” Simply hooking
up fixedPoint, nextGuesser, and toleranceTester, we have
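perhaps as follows (a sketch, assuming toleranceTester takes the tolerance as a second argument):

- fun newtonsMethod(function, deriv, guess) =
=   fixedPoint(nextGuesser(function, deriv),
=              toleranceTester(function, 0.001),
=              guess);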
val newtonsMethod = fn : (real -> real) * (real -> real) * real -> real
Given a function, its derivative, and a guess, we can approximate a root. However, one parameter
in particular impedes our use of newtonsMethod. We are required to supply the derivative of
function; in fact, many curves on which we wish to use Newton’s method may not be readily
differentiable. In those cases, we would be better off finding a numerical approximation to the
derivative. The easiest such approximation is the secant method, where we take a point on the curve
near the point at which we want to evaluate the derivative and calculate the slope of the line between
those points (which is a secant to the curve). Thus for small ε, f ′(x) ≈ (f (x + ε) − f (x)) / ε. In ML,
taking ε = .001,
- fun numDerivative(function) =
= fn x => (function(x + 0.001) - function(x)) / 0.001;
Now we make an improved version of our earlier function. Since any user of this new function is
concerned only about the results and not about how the results are obtained, our name for it shall
reflect what the function does rather than how it does it.
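For instance:

- fun findRoot(function, guess) =
=   newtonsMethod(function, numDerivative(function), guess);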
Coming full circle, we can apply these pre-packaged functions to a special case: finding the square
root. We can use newtonsMethod directly and provide an explicit derivative or we can use findRoot
and rely on a numerical derivative, with different levels of precision. Since x² − c is monotonically
increasing for positive x, 1 is a safe guess, which we provide.
- fun sqrt(c) =
= newtonsMethod(fn x => x * x - c, fn x => 2.0 * x, 1.0);
- sqrt(2.0);
- sqrt(16.0);
- sqrt(121.0);
- sqrt(121.75);
There are several lessons here. First, this has been a demonstration of the interaction between
mathematics and computer science. The example we used comes from an area of study called
numerical analysis which straddles the two fields. Numerical analysis is concerned with the numerical
approximation of calculus and other topics of mathematical analysis. More importantly, this also
demonstrates the interaction between discrete and continuous mathematics. We have throughout
assumed that f (x) is a real-valued function like you are accustomed to seeing in calculus or analysis.
However, the functions themselves are discrete objects. The most important lesson is how functions
can be parameterized to become more general, curried to reduce parameterization, and, as discrete
objects, passed and returned as values.
The running example in this chapter was developed from Abelson and Sussman [1].
Exercises
Chapter 31
Combinatorics
31.1 Counting
In Chapter 25, we proved that if A and B are finite, disjoint sets, then |A ∪ B| = |A| + |B|. We will
generalize this idea, but first we need the following lemma.
Lemma 31.1 If A and B are finite sets and B ⊆ A, then |A − B| = |A| − |B|.
Now,
Theorem 31.1 If A and B are finite sets, then |A ∪ B| = |A| + |B| − |A ∩ B|.
Proof. Suppose A and B are finite sets. Note that by Exercise 23 of Chapter 11, A and
B − (A ∩ B) are disjoint; and that by Exercise 10.1, also of Chapter 11, A ∩ B ⊆ B. Then,
since A ∪ B = A ∪ (B − (A ∩ B)),

|A ∪ B| = |A| + |B − (A ∩ B)|  by Theorem 25.2
        = |A| + (|B| − |A ∩ B|)  by Lemma 31.1
        = |A| + |B| − |A ∩ B|.  □
This result can be interpreted as directions for counting the elements of A and B. It is not simply
a matter of adding the number of elements in A to the number of elements in B, because A and B
might overlap. For example, if you counted the number of math majors and the number of computer
science majors in this course, you may end up with a number larger than the enrollment because
you counted the double math-computer science majors twice. To avoid counting the overlapping
elements twice, we subtract the cardinality of the intersection.
The area of mathematics that studies counting sets and the ways they combine and are ordered is
combinatorics. (Elementary combinatorics is often simply called “counting,” but that invites derision
from those unacquainted with higher mathematics.) It plays a part in many fields of mathematics,
probability and statistics especially. Lemma 31.1 is called the difference rule and Theorem 31.1 is
called the inclusion/exclusion rule. We can generalize Theorem 25.2 to define the addition rule:
Theorem 31.2 If A is a finite set with partition A1 , A2 , . . . , An , then |A| = Σ_{i=1}^{n} |Ai |.
Theorem 31.3 If A1 , A2 , . . . , An are finite sets, then |A1 × A2 × . . . × An | = |A1 | · |A2 | · . . . · |An |.
The inner sums are instances of the addition rule; multiplying the four factors together is an
instance of the multiplication rule:

( (|A| + |B| + |C|)
× (|A| + |B| + |C| + |D| + |E|)
× (|A| + |B| + |C| + |D| + |E|)
× (|A| + |B| + |C| + |D| + |E|) )
10 · 10 · 10 · 10 = 10000
four-digit suffixes, and
8 · 10 · 10 − 3 = 797
three-digit prefixes that do not start with 0 or 1 and do not include the restricted prefixes. Choosing
a phone number means choosing a prefix and a suffix, hence 797 · 10000 = 7970000 possible phone
numbers.
31.2 Permutations and combinations
P (n) = n · (n − 1) · (n − 2) · . . . · 1
= n!
An r-permutation of a set is a permutation of a subset of size r—in other words, we do not pull
out all the balls, only the first r we come to. The number of r-permutations of a set of size n is

P (n, r) = n · (n − 1) · (n − 2) · . . . · (n − r + 1) = n!/(n − r)!
On the other hand, what if we grabbed several balls from the urn at the same time? If you have
four balls in your hand at once, it is not clear what order they are in; instead, you have simply
chosen an unordered subset. An r-combination of a set of n elements is a subset of size r. How
many subsets of size r are there of a set of size n (written C(n, r), read “n choose r”)? First, we
know that the number of orderings of subsets of that size is n!/(n − r)!. Each of those subsets can
be ordered in r! ways. Hence

C(n, r) = n! / (r!(n − r)!)
31.3 Computing combinations
- fun combos([], r) = []
= | combos(x, 0) = []
The case where r is 1 is also straightforward: Every element in the set is, by itself, a combination
of size 1. Creating a set of all those little sets is accomplished by our listify function from
Chapter 18.
- fun combos([], r) = []
= | combos(x, 0) = []
= | combos(x, 1) = listify(x)
The strategy taking shape is odd compared to most of the recursive strategies we have seen
before. The variety of base cases seems to anticipate the problem being made smaller in both the
x argument and the r argument. What does this suggest? When you form a combination of size r
from a list, you first must decide whether that combination will contain the first element of the list
or not. Thus all the combinations are
• all the combinations of size r that do not contain the first element, plus
• all the combinations of size r − 1 that do not contain the first element, with the first element
then added to each of them.
- fun combos([], r) = []
= | combos(x, 0) = []
= | combos(x, 1) = listify(x)
= | combos(head::rest, r) =
= addToAll(head, combos(rest, r-1)) @ combos(rest, r);
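To try combos on its own, the helpers can be reconstructed from their descriptions; the versions of addToAll and listify below are our sketches, and the originals (listify is from Chapter 18) may differ in detail. With combos defined as above:

```
- fun addToAll(x, []) = []
=   | addToAll(x, head::rest) = (x::head)::addToAll(x, rest);
- fun listify([]) = []
=   | listify(head::rest) = [head]::listify(rest);
- combos([1, 2, 3, 4], 2);
val it = [[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]] : int list list
```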
Exercises
Chapter 32
Special Topic: Computability
In Chapter 27, we noted the amazing property that natural numbers, integers, and rationals all have
the same cardinality, being countably infinite, but that real numbers are more infinitely numerous.
That discussion only considered comparing sizes of number sets. What about infinite sets of other
things?
Let us take for example computer programs. The finite nature of any given computer necessitates
that the set of computer programs it can run is finite. However, suppose we even allow for a
computer with an arbitrary amount of memory, where more always could be added if need be. Since
computer programs are stored in memory, and memory is a series of bits, we can interpret the bit-
representation of a program as one large natural number in binary. Thus we have a function from
computer programs to natural numbers. This function is not necessarily a one-to-one correspondence
(it certainly is not, if we exclude from the domain any bit representations of invalid programs), but it
is one-to-one, since every natural number has only one binary representation, which means that there
are at least as many natural numbers as programs, perhaps more. The set of computer programs is
therefore countable.
In ML programming, we think of a program as something that represents and computes a func-
tion. How many functions are there? Since every real number can be considered to be a constant
function, it is easily argued that there are uncountably many possible functions. Nevertheless, let us
look at a narrower scope of functions, ones we have a chance of writing a program to compute (we
could not expect, for example, a program to compute an arbitrary real number in a finite amount of
time). For convenience, we choose the seemingly arbitrary set T of functions from natural numbers
to digits,
T = {f : N → {0, 1, 2, . . . , 9} }
Now we will define a function not in the set T but with T as its codomain (a function that
returns functions), h : (0, 1) → T . Suppose we represent a number in (0, 1) as 0.a1 a2 a3 . . . an . . ..
Then define h(0.a1 a2 a3 . . .) to be the function f where f (n) = an+1 .
Suppose we want a program that takes another program and an input, runs that
program/function, and returns true if the given program halts (that is, does not loop forever or
have infinite recursion) on that input, false otherwise. To clarify, the following program does half of
the intended work:
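The program itself is missing in this copy; it presumably resembled the following sketch (the name accepts is ours), which runs the given program p on the input x and reports true only if that call finishes:

```
- fun accepts(p, x) = (p(x); true);
```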
This program will indeed return true if the given program halts on the given input, but rather
than return false otherwise, it will go on forever itself. We cannot just let the program run for a
while and, if it does not end on its own, break the execution and conclude it loops because we will
never know whether or not we have waited long enough. A program that does decide this would be
useful indeed. One of the most frequent kinds of programming mistakes (especially for beginners) is
unterminated iteration or recursion. Such a program—at least a program that would differentiate all
programs precisely and not have any program for which it could not tell either way—is impossible.
To see that this is true, suppose we had such a program, halt.
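The code for this diagonal program is also missing here; following the description below, it would be a sketch like this (the name d matches the discussion, and, as the footnote explains, it would not actually type-check in ML):

```
- fun d(m) = if halt(m, m)
=            then (while true do (); false)
=            else true;
```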
Given a function m, this program feeds m into halt as both the function to run and the input to
run it on. If m does not halt on itself, this function returns true. If it does halt, then this program
loops forever (the false is present in the statement list just so that it will type correctly; it will
never be reached because the while loop will not end).
What would be the result of running d(d)? If d halts when it is applied to itself, then halt will
return true, so then d will loop forever. If d loops forever when it is applied to itself, then halt will
return false, so then d halts, returning true. If it will halt, then it will not; if it will not, then it will.
This contradiction means that the function halt cannot exist.
This, the unsolvability of the halting problem, is a fundamental result in the theory of computation
since it defines a boundary of what can and cannot be computed. We can write a program that
accepts this problem, that is, that will answer true if the program halts but does not answer at all if
it does not. What we cannot write is a program that decides this problem. Research has shown that
all problems that can be accepted but not decided are reducible to the halting problem. If we had
a model of computation—something much different from anything we have imagined so far—which
could decide the halting problem, then all problems we can currently accept would then also be
decidable.
1 This actually would not type in ML because the application halt(m, m) requires the equation
’a = (’a -> ’b) to hold. ML cannot handle recursive types unless datatypes are used.
Chapter 33
Special Topic: Comparison with Object-Oriented Programming
This chapter is for students who have taken a programming course using an object-oriented language
such as Java. If you plan to take such a course in the future, you are recommended to come back
and read this chapter after you have learned the fundamentals of class design, subtyping, and
polymorphism.
You have no doubt noticed the striking difference in flavor between functional programming
and object-oriented programming. A functional language views a program as a set of interacting
functions, whereas an object-oriented program is a set of interacting objects. A quick sampling of
their similarities will illuminate these differences. Here are two principles that cut across all styles
of programming and are relevant for our purposes:
• A program or system is composed of data structures and functionality.
• A well-designed language encourages writing code that is modular (it is made up of small, semi-
autonomous parts), reusable (those parts can be plugged into other systems), and extensible
(the system can be modified by adding new parts).
Notice how functional and object-oriented styles address the first point, at least in the way they
are taught to beginners. In an object-oriented language, data structures and the functionality defined
on them are packaged together; a class represents the organization of data (the instance variables)
and operations (the instance methods) in one unit. In a functional language, the data (defined, for
example, by a datatype) is less tightly coupled to the functionality (the functions written for that
datatype). This touches on the second point as well: In a functional language, the primary unit of
modularity is the function, and in an object-oriented language, the primary unit of modularity is
the class.
To see this illustrated, consider this system in ML to model animals and the noises they make.
- datatype Animal = Dog | Cat ;
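The operations originally defined on this datatype do not survive in this copy; judging from the Java version later in the chapter, they were happyNoise and excitedNoise, presumably something like the following (the particular strings are invented):

```
- fun happyNoise(Dog) = "woof"
= | happyNoise(Cat) = "purrrr";
- fun excitedNoise(Dog) = "arf arf"
= | excitedNoise(Cat) = "meow";
```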
It is easy to extend the functionality (that is, add an operation). We need only write a new function,
without any change to the datatype or other functions.
- fun angryNoise(Dog) = "grrrrr"
= | angryNoise(Cat) = "hisssss";
It is difficult, on the other hand, to extend the data. Adding a new kind of animal requires
changing the datatype and every function that operates on it. From the ML interpreter’s perspective,
this is rewriting the whole system from scratch.
In an object-oriented setting, we have the opposite situation. A Java system equivalent to our
original ML example would be
interface Animal {
String happyNoise();
String excitedNoise();
}
Although the interface demands that everything of type Animal will have methods happyNoise
and excitedNoise defined for it, the code for an operation like happyNoise is distributed among
the classes. The result is a system where it is very easy to extend the data; you simply write a new
class, without changing the other classes or the interface.
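For instance, one such class might look like the following sketch (the interface is repeated for self-containment, and the noise strings are invented):

```java
interface Animal {
    String happyNoise();
    String excitedNoise();
}

class Dog implements Animal {
    // the code for each operation on Dog lives inside this class
    public String happyNoise() { return "woof"; }
    public String excitedNoise() { return "arf arf"; }
}
```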
The price is that we have made extending the functionality difficult. Adding a new operation
now requires a change to the interface and to every class.
interface Animal {
String happyNoise();
String excitedNoise();
String angryNoise();
}
[Table: the operations (one per row) and the kinds of animal (one per column).]
Relative to this table, functional programming packages things by rows, and adding a row to
the table is convenient. Adding a column is easy in object-oriented programming, since a column is
encapsulated by a class.
It is worth noting that object-oriented programming’s most touted feature, inheritance, does not
touch this problem. Adding a new operation like angryNoise by subclassing may allow us to leave
the old classes untouched, but it does require writing three new classes and a new interface.
interface AnimalWithAngryNoise extends Animal {
String angryNoise();
}
Part VIII
Graph
Chapter 34
Graphs
34.1 Introduction
We commonly use the word graph to refer to a wide range of graphics and charts which provide
an illustration or visual representation of information, particularly quantitative information. In the
realm of mathematics, you probably most closely associate graphs with illustrations of functions in
the real-number plane. Graph theory, our topic in this part, is a field of mathematics that studies
a very specific yet abstract notion of a graph. It has many applications throughout mathematics,
computer science, and other fields, particularly for modeling systems and representing knowledge.
Unfortunately, the beginning student of graph theory will likely feel intimidated by the horde
of terminology required; the slight differences of terms among sources and textbooks aggravate the
situation. The student is encouraged to take careful stock of the definitions in these chapters, but
also to enjoy the beauty of graphs and their uses. This chapter will consider the basic vocabulary of
graph theory and a few results and applications. The following chapter will explore various kinds of
paths through graphs. Finally, we will use graph theory as a framework for discussing isomorphism,
a central concept throughout mathematics.
A graph G = (V, E) is a pair of finite sets, a set V of vertices (singular vertex ) and a set E of
pairs of vertices called edges. We will typically write V = {v1 , v2 , . . . , vn } and E = {e1 , e2 , . . . , em }
where each ek = (vi , vj ) for some vi , vj ; in that case, vi and vj are called end points of the edge ek .
Graphs are drawn so that vertices are dots and edges are line segments or curves connecting two
dots.
As an example of a mathematical graph and its relation to everyday visual displays, consider the
graph where the set of vertices is { Chicago, Gary, Grand Rapids, Indianapolis, Lafayette, Urbana,
Wheaton } and the edges are direct highway connections between these cities. We have the following
graph (with vertices and edges labeled). Notice how this resembles a map, simply more abstract
(for example, it contains no accurate information about distance or direction).
[Figure: a graph whose vertices are Chicago, Gary, Grand Rapids, Indianapolis, Lafayette, Urbana, and Wheaton, with edges labeled I57, I65n, I65s, I74, I88, I90, I196, and I294.]
We call the edges pairs of vertices for lack of a better term; a pair is generally considered a
two-tuple (in this case, it would be an element of V × V ); moreover, we write edges with parentheses
and a comma, just as we would with tuples. However, we mean something slightly different. First,
tuples are ordered. In our basic definition of graphs, we assume that the end points of an edge are
unordered: we could write I57 as (Chicago, Urbana) or (Urbana, Chicago). Second, an edge as a
pair of vertices is not unique. In the cities example, we have duplicate entries for (Chicago, Gary):
both I90 and I294. This is why it is necessary to have two ways to represent an edge, a unique name
as well as a descriptive one.
The kinship between graphs and relations should be readily apparent. Graphs, however, are more
flexible. As a second example, this graph represents the relationships (close friendship, siblinghood,
or romantic involvement—labeled on the drawing but not as names for the edges) among the main
characters of Anna Karenina. V = { Karenin, Anna, Vronsky, Oblonsky, Dolly, Kitty, Levin }.
E = { (Karenin, Anna), (Anna, Vronsky), (Vronsky, Kitty), (Anna, Oblonsky), (Oblonsky, Dolly),
(Dolly, Kitty), (Oblonsky, Levin), (Kitty, Levin) }.
[Figure: the relationship graph for Anna Karenina, with edges labeled spouse, sibling, friend, ex, and paramour.]
34.2 Definitions
An edge (vi , vj ) is incident on its end points vi and vj ; we also say that it connects them. If vertices
vi and vj are connected by an edge, they are adjacent to one another. If a vertex is adjacent to
itself, that connecting edge is called a self-loop. If two edges connect the same two vertices, then
those edges are parallel to each other. Below, e1 is incident on v1 and v4 . e10 connects v7 and v6 .
v9 and v6 are adjacent. e8 is a self-loop. e4 and e5 are parallel.
[Figure: a graph with vertices v1–v9 and edges e1–e14 illustrating these terms.]
The degree deg(v) of a vertex v is the number of edges incident on the vertex, with self-loops
counted twice. deg(v1 ) = 2, deg(v5 ) = 3, and deg(v2 ) = 4. A subgraph of a graph G = (V, E)
is a graph G′ = (V ′ , E ′ ) where V ′ ⊆ V and E ′ ⊆ E (and, by definition of graph, for any edge
(vi , vj ) ∈ E ′ , vi , vj ∈ V ′ ). A graph G = (V, E) is simple if it contains no parallel edges or self-
loops. The graph ({v1 , v2 , v3 , v4 , v5 }, {e1 , e2 , e3 , e4 , e6 }) is a simple subgraph of the graph shown.
A simple graph G = (V, E) is complete if for all distinct vi , vj ∈ V , the edge (vi , vj ) ∈ E. The
subgraph ({v7 , v8 , v9 }, {e11 , e12 , e13 }) is complete. The complement of a simple graph G = (V, E)
is a graph Ḡ = (V, E ′ ) where for vi , vj ∈ V , (vi , vj ) ∈ E ′ if (vi , vj ) ∉ E; in other words, the
complement has all the same vertices and all (and only) those possible edges that are not in the
original graph. The complement of the subgraph ({v3 , v4 , v6 , v7 }, {e6 , e7 , e10 }) is
({v3 , v4 , v6 , v7 }, {(v3 , v7 ), (v7 , v4 ), (v3 , v6 )}), as shown below.
[Figure: the subgraph ({v3 , v4 , v6 , v7 }, {e6 , e7 , e10 }) and its complement, side by side.]
[Figure: a directed graph with vertices v2–v5 and edges e2–e5.]
A directed graph is a graph where the edges are ordered pairs, that is, edges have directions.
Pictorially, the direction of the edge is shown with arrows. Notice that a directed graph with no
parallel edges is the same as a set together with a relation on that set. For this reason, we were able
to use directed graphs to visualize relations in Part V. In a directed graph, we must differentiate
between a vertex’s in-degree, the number of edges towards it, and its out-degree, the number of edges
away from it.
34.3 Proofs
By now you should have achieved a skill level for writing proofs at which it is appropriate to ease
up on the formality slightly. The definitions in graph theory do not lend themselves to proofs as
detailed as those we have written for sets, relations, and functions, and graph theory proofs tend to
be longer anyway. Do not be misled, nevertheless, into thinking that this is lowering the standards
for logic and rigor; we will merely be stepping over a few obvious details for the sake of notation,
length, and readability. The proof of the following proposition shows what sort of argumentation is
expected for these chapters.
Theorem 34.1 (Handshake.) If G = (V, E) is a graph with V = {v1 , v2 , . . . , vn }, then
Σ_{i=1}^{n} deg(vi ) = 2 · |E|.
Proof. By induction on the cardinality of E. First, suppose that G has no edges, that
is, |E| = 0. Then for any vertex v ∈ V , deg(v) = 0. Hence Σ_{i=1}^{n} deg(vi ) = 0 = 2 · 0 = 2 · |E|.
Hence there exists an N ≥ 0 such that for all m ≤ N , if |E| = m then Σ_{i=1}^{n} deg(vi ) = 2 · |E|.
Now suppose |E| = N + 1, and suppose e ∈ E. Consider the subgraph of G, G′ = (V, E −
{e}). We will write the degree of v ∈ V when it is being considered a vertex in G′ instead
of G as deg′ (v). |E − {e}| = N , so by our inductive hypothesis Σ_{i=1}^{n} deg′ (vi ) = 2 · |E − {e}|.
Suppose vi , vj are the end points of e. If vi = vj , then deg(vi ) = deg′ (vi ) + 2; otherwise,
deg(vi ) = deg′ (vi ) + 1 and deg(vj ) = deg′ (vj ) + 1; both by the definition of degree. For
any other vertex v ∈ V , where v ≠ vi and v ≠ vj , we have deg(v) = deg′ (v).
Hence Σ_{i=1}^{n} deg(vi ) = 2 + Σ_{i=1}^{n} deg′ (vi ) = 2 + 2 · |E − {e}| = 2 + 2(|E| − 1) = 2 · |E|. □
Observe a few places where this proof is less formal than our earlier ones:
• The third sentence makes the unjustified claim that having no edges implies every vertex has
a degree of zero. This follows immediately from the definition of degree, and it is the best we
can do without a formal notion of what “the number of edges” means. Keep in mind that the
only formal mechanism we have developed for reasoning about quantity is cardinality.
• In the fourth sentence, substitution and rules of arithmetic are used without citation.
• The claim |E − {e}| = N depends on Lemma 31.1 and the facts that {e} ⊆ E and
that |{e}| = 1.
• The sixth sentence of the second paragraph together with the last sentence claims that whether
or not e is a self-loop, it contributes two to the total sum of degrees. It is difficult to state this
more formally.
34.4 Game theory
You must transport a cabbage, a goat, and a wolf across a river using a boat. The boat
has only enough room for you and one of the other objects. You cannot leave the goat
and the cabbage together unsupervised, or the goat will eat the cabbage. Similarly, the
wolf will eat the goat if you are not there to prevent it. How can you safely transport all
of them to the other side?
We will solve this puzzle by analyzing the possible “states” of the situation, that is, the possible
places you, the goat, the wolf, and the cabbage can be, relative to the river; and the “moves” that
can be made between the states, that is, your rowing the boat across the river, possibly with one of
the objects. Let f stand for you, g for the goat, w for the wolf, and c for the cabbage. The symbol /
will show how the river separates all of these. For example, the initial state is f gwc/, indicating that you and
all the objects are on one side of the river. If you were to row across the river by yourself, this would
move the puzzle into the state gwc/f , which would be a failure. Our goal is to find a series of moves
that will result in the state /f gwc.
First, enumerate all the states.
Now, mark the starting state with a double circle, the winning state with a triple circle, each
losing state with a square, and every other state with a single circle. These will be the vertices in
our graph.
Finally, we draw edges between states to show what would happen if you cross the river carrying
one or zero objects.
[Figure: the graph of states and moves, including the states cgw/f , gw/f c, cw/f g, and cg/f w.]
The puzzle is solved by finding a route through this graph (in the next chapter we shall see
that the technical term for this is path) from the starting state to the finishing state, never passing
through a losing state. One possible route informs you to transport them all by first taking the
goat over, coming back (alone), transporting the cabbage, coming back with the goat, transporting
the wolf, coming back (alone), and transporting the goat again. (One could argue that f /cgw is an
unreachable state, since you would first need to win in order to lose in that way.)
Theoretically, this strategy could be used to write an unbeatable chess-playing program: let each
vertex represent a legal position (or state) in a chess game, and let the edges represent how making a
move changes the position. Then trace back from the positions representing checkmates against the
computer and mark the edges that lead to them, so the computer will not choose them. However,
the limitations of time and space once again hound us: There are estimated to be between 10^43 and
10^50 legal chess positions.
Exercises
Chapter 35
Paths and Cycles
[Figure: a graph with vertices v1–v15 and edges e1–e23, the running example for this chapter.]
A graph is connected if for all v, w ∈ V , there exists a walk in G from v to w. This graph is not
connected, since no walk exists from v5 or v15 to any of the other vertices. However, the subgraph
excluding v5 , v15 , and e14 is connected.
A path is a walk that does not contain a repeated edge. v1 e1 v2 e4 v6 e9 v8 e11 v7 e10 v6 e8 v9 is a path,
but v11 e21 v12 e17 v9 e18 v13 e22 v12 e17 v9 e18 v13 is not. If a walk contains no repeated vertices, except
possibly the initial and terminal, then the walk is simple. v1 e1 v2 e4 v6 e9 v8 e11 v7 e10 v6 e8 v9 is not simple,
since v6 occurs twice. Its subpath v8 e11 v7 e10 v6 e8 v9 is simple.
Propositions about walks and paths require careful use of the notation we use to denote a walk.
Observe the process used in this example.
Theorem 35.1 If G = (V, E) is a connected graph, then between any two distinct vertices of G
there exists a simple path in G.
The first thing you must do is understand what this theorem is claiming. A quick read might
mislead someone to think that this is merely stating the definition of connected. Be careful—being
connected only means that any two vertices are connected by a walk, but this theorem claims that
they are connected by a simple path, that is, without repeated edges or vertices. It turns out that
whenever a walk exists, a simple path must exist as well.
Lemma 35.1 If G = (V, E) is a graph, v, w ∈ V , and there exists a walk from v to w, then there
exists a simple path from v to w.
We will use a notation where we put subscripts on ellipses. The ellipses stand for subwalks which
we would like to splice in and out of walks we are constructing.
To meet the burden of this existence proof, we produced a walk that fulfills the requirements.
This is a constructive proof, and it gives an algorithm for deriving a simple path from any walk in
an undirected graph. It also makes a quick proof for the earlier theorem.
35.2 Circuits and cycles
Theorem 35.2 If G = (V, E) is a connected graph and for all v ∈ V , deg(v) = 2, then G is a
circuit.
This requires us to show a walk that is a circuit and comprises the entire graph.
This looks like the beginning of a proof by induction, but actually it is a division into cases.
We are merely getting a special case out of the way. We want to use the fact that there can be no
self-loops, but that is true only if there is more than one vertex.
We have constructed a walk. We must show that it meets the requirements we are looking for.
Only one vertex in c is repeated, since reaching a vertex for the second time stops the
building process. Hence c is simple.
Since we never repeat a vertex (until the last), each edge chosen leads to a new vertex,
hence no edge is repeated in c, so c is a path.
We are always choosing the edge other than the one we took into a vertex, so i ≠ x − 1.
Suppose i ≠ 1. Since no other vertex is repeated, vi−1 , vi+1 , and vx−1 are distinct.
Therefore, distinct edges (vi−1 , vi ), (vi , vi+1 ), and (vx−1 , vi ) all exist, and so deg(vi ) ≥ 3.
Since deg(vi ) = 2, this is a contradiction. Hence i = 1. Moreover, v1 = vx and c is
closed.
As a closed, simple path, c is a circuit.
Suppose that a vertex v ∈ V is not in c, and let v 0 be any vertex in c. Since G is
connected, there must be a walk, c0 from v to v 0 , and let edge e0 be the first edge in c0
(starting from v 0 ) that is not in c, and let v 00 be an endpoint in c0 in c. Since two edges
incident on v 00 occur in c, accounting for e0 means that deg(v 00 ) ≥ 3. Since deg(vi ) = 2,
this is a contradiction. Hence there is no vertex not in c.
Suppose that an edge e ∈ E is not in c, and let v be an endpoint of e. Since v is in
the circuit, there exist distinct edges e1 and e2 in c that are incident on v, implying
deg(v) ≥ 3. Since deg(v) = 2, this is a contradiction. Hence there is no edge not in c.
Therefore, c is a circuit that comprises the entire graph, and G is a circuit. 2
The reasoning becomes very informal, especially the last two points about the circuit comprising
the whole graph. However, make sure you see that the basic logic is still present: These are merely
proofs of set emptiness.
35.3 Euler circuits and Hamiltonian cycles
We can turn this into a graph problem by representing the information with a graph whose
vertices stand for the parts of town and whose edges stand for the bridges, as displayed above. Let
G = (V, E) be a graph. An Euler circuit of G is a circuit that contains every vertex and every edge.
(Since it is a circuit, this also means that an Euler circuit contains every edge exactly once. Vertices,
however, may be repeated.) The question now is whether or not this graph has an Euler circuit. We
can prove that it does not, and so such a stroll about town is impossible.
Theorem 35.3 If a graph G = (V, E) has an Euler circuit, then every vertex of G has an even
degree.
The northern, eastern, and southern parts of town each have odd degrees, so by the contrapositive
of this theorem, no Euler circuit around town exists.
Another interesting case is that of a Hamiltonian cycle, which for a graph G = (V, E) is a cycle
that includes every vertex in V . Since it is a cycle, this means that no vertex or edge is repeated;
however, not all the edges need to be included. We reserve one Hamiltonian cycle proof for the
exercises, but here is a Hamiltonian cycle in a graph similar to the one at the beginning of this
chapter (with the disconnected subgraph removed).
[Figure: the graph from the beginning of the chapter, with the disconnected subgraph removed and a Hamiltonian cycle highlighted.]
Exercises
Chapter 36
Isomorphisms
36.1 Definition
We have already seen that the printed shape of the graph—the placement of the dots, the resulting
angles of the lines, any curvature of the lines—is not of the essence of the graph. The only things that
count are the names of the vertices and edges and the abstract shape, that is, the connections that
the edges define. However, consider the two graph representations below, which illustrate the graphs
G = (V = {v1 , v2 , v3 , v4 }, E = {e1 = (v1 , v2 ), e2 = (v2 , v3 ), e3 = (v3 , v4 ), e4 = (v4 , v1 ), e5 = (v1 , v3 )})
and G0 = (W = {w1 , w2 , w3 , w4 , w5 }, F = {f1 = (w1 , w2 ), f2 = (w2 , w3 ), f3 = (w3 , w4 ), f4 =
(w3 , w1 ), f5 = (w4 , w2 )}).
[Figure: drawings of the graphs G and G′.]
These graphs have much in common. Both have four vertices and five edges. Both have
two vertices with degree two and two vertices with degree three. Both have a Hamiltonian cy-
cle (v1 e1 v2 e2 v3 e3 v4 e4 v1 and w1 f1 w2 f5 w4 f3 w3 f4 w1 , leaving out e5 and f2 , respectively) and two
other cycles (involving e5 and f2 ). In fact, if you imagine switching the positions of w1 and w2 , with
the edges sticking to the vertices as they move, and then doing a little stretching and squeezing, you
could transform the second graph until it appears identical to the first.
In other words, these two really are the same graph, in a certain sense of sameness. The only
difference is the arbitrary matter of names for the vertices and edges. We can formalize this by
writing renaming functions, g : V → W and h : E → F .
v     g(v)          e     h(e)
v1    w2            e1    f5
v2    w4            e2    f3
v3    w3            e3    f4
v4    w1            e4    f1
                    e5    f2
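The renaming can be checked mechanically. The following Python sketch (the graph data repeated from above) verifies that for every edge e, the endpoints of h(e) are exactly the images under g of the endpoints of e:

```python
E = {"e1": ("v1", "v2"), "e2": ("v2", "v3"), "e3": ("v3", "v4"),
     "e4": ("v4", "v1"), "e5": ("v1", "v3")}
F = {"f1": ("w1", "w2"), "f2": ("w2", "w3"), "f3": ("w3", "w4"),
     "f4": ("w3", "w1"), "f5": ("w4", "w2")}

# The renaming functions g and h from the table above.
g = {"v1": "w2", "v2": "w4", "v3": "w3", "v4": "w1"}
h = {"e1": "f5", "e2": "f3", "e3": "f4", "e4": "f1", "e5": "f2"}

def respects_incidence(g, h, E, F):
    # v is an endpoint of e iff g(v) is an endpoint of h(e).
    # Since g is one-to-one, it suffices to compare endpoint sets.
    return all({g[a], g[b]} == set(F[h[e]]) for e, (a, b) in E.items())
```

A wrong renaming — say, swapping the images of v1 and v4 — makes the check fail, since some edge's endpoints no longer line up.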
The term for this kind of equivalence is isomorphism, from the Greek roots iso meaning “same”
and morphë meaning “shape.” This is a way to recognize identical abstract shapes of graphs, that
two graphs are the same up to renaming. Let G = (V, E) and G′ = (W, F) be graphs. G is
isomorphic to G′ if there exist one-to-one correspondences g : V → W and h : E → F such that
for all v ∈ V and e ∈ E, v is an endpoint of e iff g(v) is an endpoint of h(e). The two functions
g and h, taken together, are usually referred to as the isomorphism itself; that is, there exists an
isomorphism, namely g and h, between G and G′.
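The definition suggests a (very inefficient) decision procedure. Here is a Python sketch, not from the text: try every bijection g : V → W; a suitable h exists exactly when renaming the endpoints of E by g yields the same multiset of endpoint pairs as F.

```python
from itertools import permutations

def are_isomorphic(V, E, W, F):
    # Brute force over all bijections g : V -> W (feasible only for tiny
    # graphs). For a given g, a matching edge bijection h exists iff
    # renaming E's endpoints by g gives the same multiset of endpoint
    # pairs as F: edges with equal endpoint pairs can then be matched up.
    if len(V) != len(W) or len(E) != len(F):
        return False
    Vs = sorted(V)
    target = sorted(sorted(pair) for pair in F.values())
    for image in permutations(sorted(W)):
        g = dict(zip(Vs, image))
        renamed = sorted(sorted((g[a], g[b])) for a, b in E.values())
        if renamed == target:
            return True
    return False
```

Comparing sorted pairs handles self-loops and parallel edges correctly, since it compares multisets of endpoint pairs rather than sets.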
36.2 Isomorphic invariants

As we shall see, what characterizes isomorphisms is that they preserve properties; that is, there
are many graph properties which, if they are true for one graph, are true for any other graph isomor-
phic to that graph. Graph theory is by no means the only area of mathematics where this concept
occurs. Group theory (a main component of modern algebra) involves isomorphisms as functions
that map from one group to another in such a way that operations are preserved. Isomorphisms
define equivalences between matrices in linear algebra. The main concept of isomorphism is the
independence of structure from data. If an isomorphism exists between two things, it means that
they exist as parallel but equivalent universes, and everything that happens in one universe has an
equivalent event in the other that keeps them in step.
Theorem 36.1 For any k ∈ N, the proposition P (G) = “G has a vertex of degree k” is an isomor-
phic invariant.
This is not a difficult result to prove as long as one can identify what burden of proof is required. Being
an isomorphic invariant has significance only when two pieces are already in place: we have two
graphs known to be isomorphic, and the proposition is true for one of those graphs.
Now we can assume and use the definition of isomorphic (those handy one-to-one correspondences
must exist), and we must prove that G′ has a vertex of degree k. The definition of degree also comes
into play, especially in distinguishing between self-loops and other edges.
By definition of degree, there exist edges e1, e2, . . . , en ∈ E, non-self-loops, and e′1, e′2, . . . , e′m ∈ E, self-loops, that are incident on v, such that k = n + 2m.

By the definition of isomorphism, there exist one-to-one correspondences g and h with the isomorphic property.

Saying “with the isomorphic property” spares the trouble of writing out “for all v′ ∈ V ” etc., and instead more directly we claim

For each ei, 1 ≤ i ≤ n, h(ei) has g(v) as one endpoint, and for each e′j, 1 ≤ j ≤ m, h(e′j) has g(v) as both endpoints, and no other edge has g(v) as an endpoint. Each h(ei) and h(e′j) is distinct since h is one-to-one. Hence

deg(g(v)) = n + 2m = k
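The proof's count k = n + 2m can be mirrored in code. In this Python sketch (the function names are mine, not the text's), each non-self-loop incident on v contributes 1 and each self-loop contributes 2, and isomorphic graphs then share the same multiset of degrees:

```python
from collections import Counter

def degree(v, edges):
    # (a == v) + (b == v) is 1 for a non-self-loop incident on v
    # and 2 for a self-loop at v, matching k = n + 2m in the proof.
    return sum((a == v) + (b == v) for a, b in edges.values())

def degree_multiset(vertices, edges):
    # The multiset of degrees, an isomorphic invariant.
    return Counter(degree(v, edges) for v in vertices)
```

On the running example, both G and G′ have two vertices of degree two and two of degree three, so their degree multisets agree.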
The proposition “G has a Hamiltonian cycle” is likewise an isomorphic invariant.

Proof. Suppose G = (V, E) and G′ = (W, F) are isomorphic graphs, and suppose that
G has a Hamiltonian cycle, say c = v1 e1 v2 . . . en−1 vn (where v1 = vn). By the definition
of isomorphism, there exist one-to-one correspondences g and h with the isomorphic
property.
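The rest of the argument is that the image of the cycle under g and h is itself a Hamiltonian cycle in G′. A quick Python check (using the renaming maps from the earlier table) applies the renamings to the alternating vertex/edge sequence:

```python
g = {"v1": "w2", "v2": "w4", "v3": "w3", "v4": "w1"}
h = {"e1": "f5", "e2": "f3", "e3": "f4", "e4": "f1", "e5": "f2"}
rename = {**g, **h}   # vertex and edge names never clash

# The Hamiltonian cycle of G from earlier in the chapter, written as
# an alternating vertex/edge sequence.
c = ["v1", "e1", "v2", "e2", "v3", "e3", "v4", "e4", "v1"]
image = [rename[x] for x in c]
# image is w2 f5 w4 f3 w3 f4 w1 f1 w2 -- the same Hamiltonian cycle
# w1 f1 w2 f5 w4 f3 w3 f4 w1 noted earlier, started at w2 instead of w1.
```

Because g and h are one-to-one, the image visits every vertex of G′ exactly once before returning to its start, so it is again a Hamiltonian cycle.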
36.3 The isomorphic relation

36.4 Final bow
Exercises